📝 Posted:
🚚 Summary of:
P0135, P0136
Commits:
a6eed55...252c13d, 252c13d...07bfcf2
💰 Funded by:
[Anonymous]
🏷 Tags:

Alright, no more big code maintenance tasks that absolutely need to be done right now. Time to really focus on parts 6 and 7 of repaying technical debt, right? Except that we don't get to speed up just yet, as TH05's barely decompilable PMD file loading function is rather… complicated.
Fun fact: Whenever I see an unusual sequence of x86 instructions in PC-98 Touhou, I first consult the disassembly of Wolfenstein 3D. That game was originally compiled with the quite similar Borland C++ 3.0, so it's quite helpful to compare its ASM to the officially released source code. If I find the instructions in question, they mostly come from that game's ASM code, leading to the amusing realization that "even John Carmack was unable to get these instructions out of this compiler". This time though, Wolfenstein 3D did point me to Borland's intrinsics for common C functions like memcpy() and strchr(), available via #pragma intrinsic. Bu~t those unfortunately still generate worse code than what ZUN micro-optimized here. Commenting how these sequences of instructions should look in C is unfortunately all I could do here.
The conditional branches in this function did compile quite nicely though, clarifying the control flow, and clearly exposing a ZUN bug: TH05's snd_load() will hang in an infinite loop when trying to load a nonexistent -86 BGM file (with a .M2 extension) if the corresponding -26 BGM file (with a .M extension) doesn't exist either.
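To make that failure mode a bit more tangible, here's a minimal, portable model of such a fallback loop. This is a sketch and not the decompiled function: snd_load_model(), FileSet, the "ST00" stem used in the test, and the max_attempts cap are all made up for illustration, and the exact shape of the retry loop is an assumption. The cap only exists so that the model can report the hang instead of actually hanging.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the files that exist on disk.
typedef std::set<std::string> FileSet;

// Models a "try the -86 file, fall back to the -26 file" loop. Returns the
// number of open attempts it took to find a file, or -1 once [max_attempts]
// is exceeded, which is where this model assumes the real game would keep
// looping forever.
int snd_load_model(
	const FileSet& files, const std::string& stem, bool use_86, int max_attempts
) {
	assert(max_attempts > 0);
	std::string fn = (stem + (use_86 ? ".M2" : ".M"));
	for(int attempt = 1; attempt <= max_attempts; attempt++) {
		if(files.count(fn) != 0) {
			return attempt; // "Opened" the file
		}
		// Fall back to the -26 file and retry. Note the missing exit
		// condition for the case where that file doesn't exist either.
		fn = (stem + ".M");
	}
	return -1;
}
```

With only the .M file present, the model succeeds on the second attempt. With neither file present, it runs into the cap. Note that this model also runs into the cap for a missing .M file in -26 mode, which is a case the paragraph above doesn't make any claim about.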

Unsurprisingly, the PMD channel monitoring code in TH05's Music Room remains undecompilable outside the two most "high-level" initialization and rendering functions. And it's not because there's data in the middle of the code segment – that would have actually been possible with some #pragmas to ensure that the data and code segments have the same name. As soon as the SI and DI registers are referenced anywhere, Turbo C++ insists on emitting prolog code to save these on the stack at the beginning of the function, and epilog code to restore them from there before returning. Found that out in September 2019, and confirmed that there's no way around it. All the small helper functions here are quite simply too optimized, throwing away any concern for such safety measures. 🤷
Oh well, the two functions that were decompilable at least indicate that I do try.


Within that same 6th push though, we've finally reached the one function in TH05 that was blocking further progress in TH04, allowing that game to finally catch up with the others in terms of separated translation units. Feels good to finally delete more of those .ASM files we've decompiled a while ago… finally!

But since that was just getting started, the most satisfying development in both of these pushes actually came from some more experiments with macros and inline functions for near-ASM code. By adding "unused" dummy parameters for all relevant registers, the exact input registers are made more explicit, which might help future port authors who then maybe wouldn't have to look them up in an x86 instruction reference quite as often. At its best, this even allows us to declare certain functions with the __fastcall convention and express their parameter lists as regular C, with no additional pseudo-registers or macros required.
As for output registers, Turbo C++'s code generation turns out to be even more amazing than previously thought when it comes to returning pseudo-registers from inline functions. A nice example for how this can improve readability can be found in this piece of TH02 code for polling the PC-98 keyboard state using a BIOS interrupt:

inline uint8_t keygroup_sense(uint8_t group) {
	_AL = group;
	_AH = 0x04;
	geninterrupt(0x18);
	// This turns the output register of this BIOS call into the return value
	// of this function. Surprisingly enough, this does *not* naively generate
	// the `MOV AL, AH` instruction you might expect here!
	return _AH;
}

void input_sense(void)
{
	// As a result, this assignment becomes `_AH = _AH`, which Turbo C++
	// never emits as such, giving us only the three instructions we need.
	_AH = keygroup_sense(8);

	// Whereas this one gives us the one additional `MOV BH, AH` instruction
	// we'd expect, and nothing more.
	_BH = keygroup_sense(7);

	// And now it's obvious what both of these registers contain, from just
	// the assignments above.
	if(_BH & K7_ARROW_UP || _AH & K8_NUM_8) {
		key_det |= INPUT_UP;
	}
	// […]
}

I love it. No inline assembly, as close to idiomatic C code as something like this is going to get, yet still compiling into the minimum possible number of x86 instructions on even a 1994 compiler. This is how I keep this project interesting for myself during chores like these. We might have even reached peak inline already?

And that's 65% of technical debt in the SHARED segment repaid so far. Next up: Two more of these, which might already complete that segment? Finally!

📝 Posted:
🚚 Summary of:
P0090, P0091
Commits:
90252cc...07dab29, 07dab29...29c5a73
💰 Funded by:
Yanga, Ember2528
🏷 Tags:

Back to TH01, and its high score menu… oh, wait, that one will eventually involve keyboard input. And thanks to the generous TH01 funding situation, there's really no reason not to cover that right now. After all, TH01 is the last game where input still hadn't been RE'd.
But first, let's also cover that one unused blitting function, together with REIIDEN.CFG loading and saving, which are in front of the input function in OP.EXE… (By now, we all know about the hidden start bomb configuration, right?)

Unsurprisingly, the earliest game also implements input in the messiest way, with a different function for each of the three executables. "Because they all react differently to keyboard inputs", apparently? OP.EXE even has two functions for it, one for the START / CONTINUE / OPTION / QUIT main menu, and one for both Option and Music Test menus, both of which directly perform the ring arithmetic on the menu cursor variable. A consistent separation of keyboard polling from input processing apparently wasn't all too obvious of a thought, since it's only truly done from TH02 on.
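For anyone unfamiliar with the term, "ring arithmetic" on a menu cursor just means wrapping around at both ends of the menu. A minimal sketch, assuming a hypothetical ring_step() helper that is not taken from the game's code:

```cpp
#include <cassert>

// Hypothetical helper: Moves a menu cursor by [delta] entries and wraps it
// around at both ends of a menu with [count] entries.
inline int ring_step(int cursor, int delta, int count)
{
	assert(count > 0);
	// The second modulo keeps the result non-negative when moving up from
	// the first entry.
	return ((((cursor + delta) % count) + count) % count);
}
```

For a 4-entry menu like START / CONTINUE / OPTION / QUIT, moving up from entry 0 lands on entry 3, and moving down from entry 3 lands on entry 0. Doing this inside a dedicated input processing function rather than inside the polling function is the kind of separation described above.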

This lack of proper architecture becomes actually hilarious once you notice that it did in fact facilitate a recursion bug! In case you've been living under a rock for the past 8 years, TH01 shipped with debugging features, which you can enter by running the game via game d from the DOS prompt. These features include a memory info screen, shown when pressing PgUp, implemented as one blocking function (test_mem()) called directly in response to the pressed key inside the polling function. test_mem() only returns once that screen is left by pressing PgDown. And in order to poll input… it directly calls back into the same polling function that called it in the first place, after a 3-frame delay.

Which means that this screen is actually re-entered for every 3 frames that the PgUp key is being held. And yes, you can, of course, also crash the system via a stack overflow this way by holding down PgUp for a few seconds, if that's your thing.
Edit (2020-09-17): Here's a video from spaztron64, showing off this exact stack overflow crash while running under the VEM486 memory manager, which displays additional information about these sorts of crashes:

What makes this even funnier is that the code actually tracks the last state of every polled key, to prevent exactly that sort of bug. But the copy-pasted assignment of the last input state is only done after test_mem() already returned, making it effectively pointless for PgUp. It does work as intended for PgDown… and that's why you have to actually press and release this key once for every call to test_mem() in order to actually get back into the game. Even though a single press of PgDown will already show the game screen again.
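The whole thing can be reproduced in a small portable simulation. This is a rough model rather than the original code: InputSim, Keys, and the scripted key states are hypothetical, one script entry stands for one poll after the 3-frame delay, and the model assumes that the current key state lives in shared state that nested polls overwrite. The exhausted() check is a simulation-only guard against running out of scripted input.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Keys {
	bool pgup;
	bool pgdown;
};

struct InputSim {
	std::vector<Keys> script; // Scripted key states, one entry per poll
	std::size_t pos = 0;
	Keys cur = { false, false }; // Shared state, overwritten by nested polls
	bool prev_pgup = false;
	bool prev_pgdown = false;
	int depth = 0;
	int max_depth = 0;

	bool exhausted() const { return (pos >= script.size()); }

	// Blocks until a poll reports a fresh PgDown press.
	void test_mem() {
		depth++;
		if(depth > max_depth) {
			max_depth = depth;
		}
		while(!exhausted() && !poll()) {
		}
		depth--;
	}

	// Returns true if PgDown was newly pressed during this poll.
	bool poll() {
		if(!exhausted()) {
			cur = script[pos++];
		}
		if(cur.pgup && !prev_pgup) {
			test_mem(); // Recurses while [prev_pgup] is still stale…
		}
		prev_pgup = cur.pgup; // …because it's only updated down here.

		const bool pressed = (cur.pgdown && !prev_pgdown);
		prev_pgdown = cur.pgdown;
		return pressed;
	}
};
```

Holding PgUp for three polls nests test_mem() three levels deep in this model, and each level then needs its own PgDown press and release before the outermost poll() returns.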

In maybe more relevant news though, this function also came with what can be considered the first piece of actual gameplay logic! Bombing via double-tapping the Z and X keys is also handled here, and now we know that both keys simply have to be tapped twice within a window of 20 frames. They are tracked independently from each other, so you don't necessarily have to press them simultaneously.
In debug mode, the bomb count tracks precisely this window of time. That's why it only resets back to 0 when pressing Z or X if it's ≥20.
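As a sketch of how such a double-tap window could be tracked, here's one possible portable model. The BombDoubleTap struct and its members are hypothetical, and the exact rules are assumptions inferred from the two paragraphs above: a single shared frame counter that a tap restarts once it has reached 20, and independent tap counts for both keys that both have to reach 2 before that counter runs out.

```cpp
#include <cassert>

struct BombDoubleTap {
	static const int WINDOW = 20;

	int window_frame = WINDOW; // ≥ WINDOW: No window active
	int taps_z = 0;
	int taps_x = 0;
	bool prev_z = false;
	bool prev_x = false;

	// Call once per frame with the current key states. Returns true on the
	// frame where a bomb would be triggered.
	bool update(bool z, bool x) {
		assert(window_frame >= 0);
		const bool tap_z = (z && !prev_z);
		const bool tap_x = (x && !prev_x);
		prev_z = z;
		prev_x = x;

		// A tap outside of an active window starts a new one.
		if((tap_z || tap_x) && (window_frame >= WINDOW)) {
			window_frame = 0;
			taps_z = 0;
			taps_x = 0;
		}
		if(tap_z) { taps_z++; }
		if(tap_x) { taps_x++; }

		if(window_frame < WINDOW) {
			if((taps_z >= 2) && (taps_x >= 2)) {
				window_frame = WINDOW; // Close the window
				return true;
			}
			window_frame++;
		}
		return false;
	}
};
```

In this model, the second taps of Z and X can land on different frames and still trigger the bomb, as long as both happen before window_frame reaches 20.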

Sure, TH01's code is expectedly terrible and messy. But compared to the micro-optimizations of TH04 and TH05, it's an absolute joy to work on, and opening all these ZUN bug loot boxes is just the icing on the cake. Looking forward to more of the high score menu in the next pushes!