📝 Posted:
🚚 Summary of:
P0278, P0279
Commits:
b6a7285...f0fbaf6, f0fbaf6...20bac82
💰 Funded by:
Yanga, Blue Bolt
🏷 Tags:

That was quick: In a surprising turn of events, Romantique Tp themselves came in just one day after the last blog post went up, updated me with their current and much more positive opinion on Sound Canvas VA, and confirmed that real SC-88Pro hardware clamps invalid Reverb Macro values to the specified range. I promised to release a new Sound Canvas VA BGM pack for free once I knew the exact behavior of real hardware, so let's go right back to Seihou and also integrate the necessary SysEx patches into the game's MIDI player behind a toggle. This would also be a great occasion to quickly incorporate some long overdue code maintenance and build system improvements, and a migration to C++ modules in particular. When I started the Shuusou Gyoku Linux port a year ago, the combination of modules and <windows.h> threw lots of weird errors and even crashed the Visual Studio compiler. But nowadays, Microsoft even uses modules in the Office code base. This must mean that these issues are fixed by now, right?
Well, there's still a bug that causes the modularized C++ standard library to be basically unusable in combination with the static analyzer, and somehow, I was the first one to report it. So it's 3½ years after C++20 was finalized, and somehow, modules are still a bleeding-edge feature and a second-class citizen in even the compiler that supports them the best. I want fast compile times already! 😕
Thankfully, Microsoft agrees that this is a bug, and will work on it at some point. While we're waiting, let's return to the original plan of decompiling the endings of the one PC-98 Touhou game that still needed them decompiled.

  1. TH02's endings
  2. TH02's Staff Roll
  3. TH02's verdict screen, and its hidden challenge
  4. TH02's end-of-stage bonus screens

After the textless slideshows of TH01, TH02 was the first Touhou game to feature lore text in its endings. Given that this game stores its 📝 in-game dialog text in fixed-size plaintext files, you wouldn't expect anything more fancy for the endings either, so it's not surprising to see that the END?.TXT files use the same concept, with 44 visible bytes per line followed by two bytes of padding for the CR/LF newline sequence. Each of these lines is typed to the screen in full, with all whitespace and a fixed time for each 2-byte chunk.
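As a minimal sketch of the fixed-size layout described above: if every line occupies 44 visible bytes plus the CR/LF pair, any line can be addressed with a single multiplication. The names below are hypothetical and not taken from the game or ReC98.

```cpp
#include <cstddef>

// Layout as described in the text: 44 visible bytes, followed by 2 bytes
// for the CR/LF sequence. All names here are made up for illustration.
constexpr std::size_t END_LINE_VISIBLE = 44;
constexpr std::size_t END_LINE_SIZE = (END_LINE_VISIBLE + 2);

// Byte offset of a 0-based line within a loaded END?.TXT buffer.
constexpr std::size_t end_line_offset(std::size_t line) {
	return (line * END_LINE_SIZE);
}

// Number of 2-byte chunks that get typed for every line.
constexpr std::size_t end_line_chunks() {
	return (END_LINE_VISIBLE / 2);
}
```

With this layout, the (0-based) line #13 would start at byte 598, and every line is typed as 22 chunks regardless of how much of it is whitespace.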
As a result, everything surrounding the text is just as hardcoded as TH01's endings were, which once again opens up the possibility of freely integrating all sorts of creative animations without the overhead of an interpreter. Sadly, TH02 only makes use of this freedom in a mere two cases: the picture scrolling effect from Reimu's head to Marisa's head in the Bad Endings, and a single hardware palette change in the Good Endings.

Powered by master.lib's egc_shift_down().
Screenshot of the (0-based) line #13 in TH02's Good Endings, together with its associated (and colored) picture
Screenshot of the (0-based) line #14 in TH02's Good Endings, showing off how it doesn't change the picture of the previous line and only applies a different grayscale palette
Same image, different palette. Note how the palette for 2️⃣ must still contain a green color for the VRAM-rendered bold text, which the image is not supposed to use.

Hardcoding also still made sense for this game because of how the ending text is structured. The Good and Bad Endings for the individual shot types respectively share 55% and 77% of their text, and both only diverge after the first 27 lines. In straight-line procedural code, this translates to one branch for each shot type at a single point, neatly matching the high-level structure of these endings.

But that's the end of the positive or neutral aspects I can find in these scripts. The worst part, by far, is ZUN's approach to displaying the text in alternating colors, and how it impacts the entire structure of the code.
The simplest solution would have involved a hardcoded array with the color of each line, just like how the in-game dialogs store the face IDs for each text box. But for whatever reason, ZUN did not apply this piece of wisdom to the endings and instead hardcoded these color changes by… mutating a global variable before calling the text typing function for every individual line.:zunpet: This approach ruins any possibility of compressing the script code into loops. While ZUN did use loops, all of them are very short because they can only last until the next color change. In the end, the code contains 90 explicitly spelled-out calls to the 5-parameter line typing function that only vary in the pointer to each line and in the slower speed used for the one or two final lines of each ending. As usual, I've deduplicated the code in the ReC98 repository down to a sensible level, but here's the full inlined and macro-expanded horror:

Raw decompilation of TH02's script function for its three Bad Endings, without inline function or macro trickery
Raw decompilation of TH02's script function for its three Good Endings, without inline function or macro trickery
It's highly likely that this is what ZUN hacked into his PC-98 and was staring at back in 1997. :godzun:
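For contrast, here is a rough sketch of the "hardcoded array with the color of each line" idea mentioned above. Everything in it is hypothetical: the color IDs, the table contents, and the `type_line` callback merely stand in for whatever the game would use.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical color IDs; the real palette indices are not taken from the game.
enum class TextColor : std::uint8_t { A, B };

// One entry per script line, in the spirit of the face ID arrays used for
// the in-game dialog. The contents are made up for illustration.
constexpr TextColor LINE_COLORS[] = {
	TextColor::A, TextColor::A, TextColor::B, TextColor::A,
	TextColor::B, TextColor::B, TextColor::A,
};
constexpr std::size_t LINE_COUNT = (sizeof(LINE_COLORS) / sizeof(LINE_COLORS[0]));

// Types every line in a single loop. `type_line` stands in for the game's
// line typing function and receives the line number and its color.
template <class TypeLineFunc> void type_all_lines(TypeLineFunc type_line) {
	for(std::size_t i = 0; i < LINE_COUNT; i++) {
		type_line(i, LINE_COLORS[i]);
	}
}

// Number of lines whose color differs from the previous line. With the
// global variable approach, each of these ends one loop and starts another.
constexpr std::size_t color_changes(std::size_t i = 1) {
	return (i >= LINE_COUNT) ? 0 : (
		((LINE_COLORS[i] != LINE_COLORS[i - 1]) ? 1 : 0) + color_changes(i + 1)
	);
}
```

The point of the table is that the number of color changes no longer affects the shape of the code: a single loop covers all lines, no matter how often the color alternates.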

All this redundancy bloats the two script functions for the 6 endings to a whopping 3,344 bytes inside TH02's MAINE.EXE. In particular, the single function that covers the three Good Endings ends up with a total of 631 x86 ASM instructions, making it the single largest function in TH02 and the 7th longest function in all of PC-98 Touhou. If the 📝 single-executable build for TH02's debloated and anniversary branches ends up needing a few more KB to reduce its size below the original MAIN.EXE, there are lots of opportunities to compress it all.

The ending text can also be fast-forwarded by holding any key. As we've come to expect for this sort of ZUN code, the text typing function runs its own rendering loop with VSync delays and input detection, which means that we 📝 once 📝 again have to talk about the infamous quirk of the PC-98 keyboard controller in relation to held keys. We've still got 54 not yet decompiled calls to input detection functions left in this codebase, are you excited yet?! :tannedcirno:
Holding any key speeds up the text of all ending lines before the last one by displaying two kana/kanji instead of one per rendered frame and reducing the delay between the rendered frames to 1/3 of its regular length. In pseudocode:

for(i = 0; i < number_of_2_byte_chunks_on_displayed_line; i++) {
	input = convert_current_pc98_bios_input_state_to_game_specific_bitflags();
	add_chunk_to_internal_text_buffer(i);
	blit_internal_text_buffer_from_the_beginning();
	if(input == INPUT_NONE) {
		// Basic case, no key pressed
		frame_delay(frames_per_chunk);
	} else if((i % 2) == 1) {
		// Key pressed, chunk number is odd.
		frame_delay(frames_per_chunk / 3);
	} else {
		// Key pressed, chunk number is even.
		// No delay; next iteration adds to the same frame.
	}
}

This is exactly the kind of code you would write if you wanted to deliberately maximize the impact of this hardware quirk. If the game happens to read the current input state right after a key up scancode for the last previously held and game-relevant key, it will then wrongly take the branch that uninterruptibly waits for the regular, non-divided amount of VSync interrupts. In my tests, this broke the rhythm of the fast-forwarded text about once per line. Note how this branch can also be taken on an even chunk: Rendering glyphs straight from font ROM to VRAM is not exactly cheap, and if each iteration (needlessly) blits one more full-width glyph than the last one, the probability of a key up scancode arriving in the middle of a frame only increases.
The fact that TH02 allows any of the supported input keys to be held points to another detail of this quirk I haven't mentioned so far. If you press multiple keys at once, the PC-98's keyboard controller only sends the periodic key up scancodes as long as you are holding the last key you pressed. Because the controller only remembers this last key, pressing and releasing any other key would get rid of these scancodes for all keys you are still holding.
As usual, this ZUN bug only occurs on real hardware and with DOSBox-X's correct emulation of the PC-98 keyboard controller.
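To get a feeling for how much a single wrongly read input state costs, the delays of the pseudocode loop above can be summed up for one line. This is only a model of that pseudocode, not game code; the function name and the `glitch_chunk` parameter are made up, and the 6 frames per chunk in the usage note are an arbitrary example value.

```cpp
// Sums up the frame delays of the text typing loop for a single line.
// `glitch_chunk` is the chunk index at which the input read wrongly reports
// no held key, or -1 if that never happens. (Hypothetical model.)
constexpr int line_delay_frames(
	int chunks, int frames_per_chunk, bool key_held, int glitch_chunk = -1
) {
	int total = 0;
	for(int i = 0; i < chunks; i++) {
		const bool input_seen = (key_held && (i != glitch_chunk));
		if(!input_seen) {
			// Regular, uninterruptible delay
			total += frames_per_chunk;
		} else if((i % 2) == 1) {
			// Fast-forwarded delay after every second chunk
			total += (frames_per_chunk / 3);
		}
	}
	return total;
}
```

For a 22-chunk line at an assumed 6 frames per chunk, this yields 132 frames without any input and 22 frames with a held key. A single glitched read on an even chunk raises the latter to 28 frames, which is the kind of hiccup that breaks the rhythm.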


After the ending, we get to witness the most seamless transition between ending and Staff Roll in any Touhou game as the BGM immediately changes to the Staff Roll theme, and the ending picture is shifted into the same place where the Staff Roll pictures will appear. Except that the code misses the exact position by four pixels, and cuts off another four pixels at the right edge of the picture:

Also, note the green 1-pixel line at the right edge of this specific picture. This is a bug in the .PI file where the picture is indeed shifted one pixel to the left. :zunpet:

What follows is a comparatively large amount of unused content for a single scene. It starts right at the end of this underappreciated 11-frame animation loaded from ENDFT.BFT:

TH02's ENDFT.BFT
Wastefully using the 4bpp BFNT format. The single ZUN frame at the end of the animation is unused; while it might look identical to the ZUN glyphs later on in the Staff Roll, that's only because both are independently rendered boldfaced versions of the same font ROM glyphs. Then again, it does prove that ZUN created this animation on a PC-98 model made by NEC, as the Epson clones used a font ROM with a distinctly different look.

TH02's Staff Roll is also unique for the pre-made screenshots of all 5 stages that get shown together with a fancy rotating rectangle animation while the Staff Roll progresses in sync with the BGM. The first interesting detail shows up immediately after the first image, where the code jumps over one of the 320×200 quarters in ED06.PI, leaving the screenshot of the Stage 2 midboss unused.
All of the cutscenes in PC-98 Touhou store their pictures as 320×200 quarters within a single 640×400 .PI file. Anywhere else, all four quarters are supposed to be displayed with the same palette specified in the .PI header, but TH02's Staff Roll screenshots are also unique in how all quarters beyond the top-left one require palettes loaded from external .RGB files to look right. Consequently, the game doesn't clearly specify the intended palette of this unused screenshot, and leaves two possibilities:

The unused second 320×200 quarter of TH02's ED06.PI, displayed in the Stage 2 color palette used in-game.
The unused second 320×200 quarter of TH02's ED06.PI, displayed in the palette specified in the .PI header. These are the colors you'd see when looking at the file in a .PI viewer, when converting it into another format with the usual tools, or in sprite rips that don't take TH02's hardcoded palette changes into account. These colors are only intended for the Stage 1 screenshot in the top-left quarter of the file.
The unused second 320×200 quarter of TH02's ED06.PI, displayed in the palette from ED06B.RGB, which the game uses for the following screenshot of the Meira fight. As it's from the same stage, it almost matches the in-game colors seen in 1️⃣, and only differs in the white color (#FFF) being slightly red-tinted (#FCC).

It might seem obvious that the Stage 2 palette in 1️⃣ is the correct one, but ZUN indeed uses ED06B.RGB with the red-tinted white color for the following screenshot of the Meira fight. Not only does this palette not match Meira's in-game appearance, but it also discolors the rectangle animation and the surrounding Staff Roll text:

Also, that tearing on frame #1 is not a recording artifact, but the expected result of yet another VSync-related landmine. 💣 This time, it's caused by the combination of 1) the entire sequence from the ending to the verdict screen being single-buffered, and 2) this animation always running immediately after an expensive operation (640×400 .PI image loading and blitting to VRAM, 320×200 VRAM inter-page copy, or hardware palette loading from a packed file), without waiting for the VSync interrupt. This makes it highly likely for the first frame of this animation to start rendering at a point where the (real or emulated) electron beam has already traveled over a significant portion of the screen.

But when I went into Stage 2 to compare these colors to the in-game palette, I found something even more curious. ZUN obviously made this screenshot with the Reimu-C shot type, but one of the shot sprites looks slightly different from how it does in-game. :thonk: These screenshots must have been made earlier in development when the sprite didn't yet feature the second ring at the top. The same applies to the Stage 4 screenshot later on:

Original version of the third 320×200 quarter from TH02's ED06.PI, representing the Meira boss fight and showing off an old version of the Reimu-C shot sprites
Original version of the first 320×200 quarter from TH02's ED07.PI, representing Stage 4 and showing off an old version of the Reimu-C shot sprites
Edited version of the third 320×200 quarter from TH02's ED06.PI, representing the Meira boss fight; Reimu-C's shot sprites were replaced with their final version
Edited version of the first 320×200 quarter from TH02's ED07.PI, representing Stage 4; Reimu-C's shot sprites were replaced with their final version

Finally, the rotating rectangle animation delivers one more minor rendering bug. Each of the 20 frames removes the largest and outermost rectangle from VRAM by redrawing it in the same black color of the background before drawing the remaining rectangles on top. The corners of these rectangles are placed on a shrinking circle that starts with a radius of 256 pixels and is centered at (192, 200), which results in a maximum possible X coordinate of 448 for the rightmost corner of the rectangle. However, the Staff Roll text starts at an X coordinate of 416, causing the first two full-width glyphs to still fall within the area of the circle. Each line of text is also only rendered once before the animation. So if any of the rectangles then happens to be placed at an angle that causes its edges to overlap the text, its removal will cut small holes of black pixels into the glyphs:

The green dotted circle corresponds to the newest/smallest rectangle. Note how ZUN only happened to avoid the holes for the two final animations by choosing an initial angle and angular velocity that causes the resulting rectangles to just barely avoid touching the TEST PLAYER glyphs.
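The overlap between the rectangle animation and the text follows directly from the numbers above. As a small worked example (constant names are hypothetical, and the 16-pixel width is the standard size of a full-width PC-98 glyph):

```cpp
// Numbers from the text; the names are made up for this example.
constexpr int CIRCLE_CENTER_X = 192;
constexpr int CIRCLE_RADIUS_MAX = 256;
constexpr int TEXT_LEFT = 416;
constexpr int GLYPH_W = 16; // width of a full-width glyph

// Largest X coordinate a rectangle corner can reach.
constexpr int rightmost_corner_x() {
	return (CIRCLE_CENTER_X + CIRCLE_RADIUS_MAX);
}

// Number of full-width glyphs at the start of a line that lie at least
// partially to the left of that coordinate.
constexpr int glyphs_within_reach() {
	return (((rightmost_corner_x() - TEXT_LEFT) + (GLYPH_W - 1)) / GLYPH_W);
}
```

The 32 pixels between X = 416 and X = 448 are exactly the two glyphs that can end up with holes.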

At least the following verdict screen manages to have no bugs aside from the slightly imperfect centering of its table values, and only comes with a small amount of additional bloat. Let's get right to the mapping from skill points to the 12 title strings from END3.TXT, because one of them is not like the others:

Skill     Title
≥100      神を超えた巫女!!
90 - 99   もはや神の領域!!
80 - 89   A級シューター!!
78 - 79   うきうきゲーマー!
77        バニラはーもにー!
70 - 76   うきうきゲーマー!
60 - 69   どきどきゲーマー!
50 - 59   要練習ゲーマー
40 - 49   非ゲーマー級
30 - 39   ちょっとだめ
20 - 29   非人間級
10 - 19   人間でない何か
≤9        死んでいいよ、いやいやまじで
Looks like I'm the first one to document the required skill points as well? Everyone else just copy-pastes END3.TXT without providing context.
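Expressed as a lookup, the table boils down to a list of lower bounds that is searched from the top. This is just one possible way of writing the mapping and makes no claim about how the game implements it; all names are hypothetical.

```cpp
// Lower bound of each skill range together with its title, ordered from
// highest to lowest. (Illustrative structure, not the game's own.)
struct TitleRange {
	int min_skill;
	const char* title;
};

constexpr TitleRange TITLES[] = {
	{ 100, "神を超えた巫女!!" },
	{  90, "もはや神の領域!!" },
	{  80, "A級シューター!!" },
	{  78, "うきうきゲーマー!" },
	{  77, "バニラはーもにー!" },
	{  70, "うきうきゲーマー!" },
	{  60, "どきどきゲーマー!" },
	{  50, "要練習ゲーマー" },
	{  40, "非ゲーマー級" },
	{  30, "ちょっとだめ" },
	{  20, "非人間級" },
	{  10, "人間でない何か" },
};
constexpr int TITLE_COUNT = static_cast<int>(sizeof(TITLES) / sizeof(TITLES[0]));
constexpr const char* LOWEST_TITLE = "死んでいいよ、いやいやまじで";

// Index of the first row whose lower bound is reached, or TITLE_COUNT if
// the skill value is below all of them.
constexpr int title_row(int skill, int row = 0) {
	return ((row >= TITLE_COUNT) || (skill >= TITLES[row].min_skill))
		? row
		: title_row(skill, (row + 1));
}

constexpr const char* title_for(int skill) {
	return (title_row(skill) < TITLE_COUNT)
		? TITLES[title_row(skill)].title
		: LOWEST_TITLE;
}
```

Note how the 77 row has to be wedged between two rows with the same title for the special case to work in a plain top-down search.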

So how would you get exactly 77 and achieve vanilla harmony? Here's the formula:

Difficulty level* × 20
+10 - (Continues used × 3)
+max((50 - (Lives lost × 3) - Bombs used), 0)
+min(max(📝 item_skill, 0), 25)
* Ranges from 0 (Easy) to 3 (Lunatic).
Across all 5 stages.
With Easy Mode capping out at 85, this is possible on every difficulty, although it requires increasingly perfect play the lower you go. Reaching 77 on purpose, however, pretty much demands a careful route through the entire game, as every collected and missed item will influence the item_skill in some way. This almost feels like the ultimate challenge that this game has to offer. Looking forward to the first Vanilla Harmony% run!
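The formula translates into a few lines of code. This is a sketch that follows the four terms exactly as listed above, with hypothetical function and parameter names; in particular, the continue term is left unclamped because the formula shows no clamping for it.

```cpp
constexpr int max_i(int a, int b) { return ((a > b) ? a : b); }
constexpr int min_i(int a, int b) { return ((a < b) ? a : b); }

// `difficulty` ranges from 0 (Easy) to 3 (Lunatic). `item_skill` is the raw
// value accumulated during the game, which is clamped to [0, 25] here.
constexpr int skill(
	int difficulty, int continues_used, int lives_lost, int bombs_used,
	int item_skill
) {
	return (
		(difficulty * 20) +
		(10 - (continues_used * 3)) +
		max_i((50 - (lives_lost * 3) - bombs_used), 0) +
		min_i(max_i(item_skill, 0), 25)
	);
}
```

A perfect Easy run comes out at `skill(0, 0, 0, 0, 25) == 85`. As one of many routes to vanilla harmony, a Normal run without continues or bombs, a single lost life, and an item_skill of 0 results in `skill(1, 0, 1, 0, 0) == 77`.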

And with that, TH02's MAINE.EXE is both fully position-independent and ready for translation. There's a tiny bit of undecompiled code left in the binary, but I'll leave that for rounding up a future TH02 decompilation push.


With one of the game's skill-based formulas decompiled, it's fitting to round out the second push with the other two. The in-game bonus tables at the end of a stage also have labels that we'd eventually like to translate, after all.
The bonus formula for the 4 regular stages is also the first place where we encounter TH02's rank value, as well as the only instance in PC-98 Touhou where the game actually displays a rank-derived value to the player. KirbyComment and Colin Douglas Howell accurately documented the rank mechanics over at Touhou Wiki two years ago, which helped quite a bit as rank would have been slightly out of scope for these two pushes. 📝 Similar to TH01, TH02's rank value only affects bullet speed, but the exact details of how rank is factored in will have to wait until RE progress arrives at this game's bullet system.
These bonuses are calculated by taking a sum of various gameplay metrics and multiplying it with the amount of point items collected during the stage. In the 4 regular stages, the sum consists of:

 難易度 Difficulty level* × 2,000
ステージ (Rank + 16) ×   200
ボム max((2,500 - (Bombs used* ×   500)), 0)
ミス max((3,000 - (Lives lost* × 1,000)), 0)
靈撃初期数 (4 - Starting bombs) ×   800
靈夢初期数 (5 - Starting lives) × 1,000
* Within this stage, across all continues.
Yup, 封魔録.TXT does indeed document this correctly.

As rank can range from -6 to +4 on Easy and +16 on the other difficulties, this sum can range between:

          Easy     Normal   Hard     Lunatic
Minimum   2,800    4,800    6,800    8,800
Maximum   16,700   21,100   23,100   25,100
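The minimum and maximum values can be reproduced by writing the sum out as code. This is a sketch with hypothetical names; the ranges of 0 to 3 starting bombs and 1 to 5 starting lives are an assumption inferred from the table values rather than something stated above.

```cpp
constexpr int max_i(int a, int b) { return ((a > b) ? a : b); }

// Sum for the 4 regular stages. `difficulty` ranges from 0 (Easy) to
// 3 (Lunatic); bombs and lives are counted within the stage, across all
// continues.
constexpr int stage_bonus_sum(
	int difficulty, int rank, int bombs_used, int lives_lost,
	int starting_bombs, int starting_lives
) {
	return (
		(difficulty * 2000) +
		((rank + 16) * 200) +
		max_i((2500 - (bombs_used * 500)), 0) +
		max_i((3000 - (lives_lost * 1000)), 0) +
		((4 - starting_bombs) * 800) +
		((5 - starting_lives) * 1000)
	);
}

// The final bonus multiplies the sum with the collected point items.
constexpr long stage_bonus(int sum, int point_items) {
	return (static_cast<long>(sum) * point_items);
}
```

The Easy minimum of 2,800 corresponds to `stage_bonus_sum(0, -6, 5, 3, 3, 5)`, and the Lunatic maximum of 25,100 to `stage_bonus_sum(3, 16, 0, 0, 0, 1)`.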

The sum for the Extra Stage is not documented in 封魔録.TXT:

クリア 10,000
ミス回数 max((20,000 - (Lives lost × 4,000)), 0)
ボム回数 max((20,000 - (Bombs used × 4,000)), 0)
クリアタイム ⌊max((20,000 - Boss fight frames*), 0) ÷ 10⌋ × 10
* Amount of frames spent fighting Evil Eye Σ, counted from the end of the pre-boss dialog until the start of the defeat animation.
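The same kind of sketch works for the Extra Stage sum, again with hypothetical names. The only noteworthy detail is the time term, which is rounded down to a multiple of 10 after clamping.

```cpp
constexpr long max_l(long a, long b) { return ((a > b) ? a : b); }

// Sum for the Extra Stage. `boss_fight_frames` is the number of frames
// spent fighting the boss, as defined in the footnote above.
constexpr long extra_bonus_sum(
	int lives_lost, int bombs_used, long boss_fight_frames
) {
	return (
		10000 +
		max_l((20000 - (lives_lost * 4000L)), 0) +
		max_l((20000 - (bombs_used * 4000L)), 0) +
		((max_l((20000 - boss_fight_frames), 0) / 10) * 10)
	);
}
```

As an example, a fight that takes 12,345 frames without any lost lives or used bombs leaves 7,655 for the time term, which is rounded down to 7,650 for a sum of 57,650.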

And that's two pushes packed full of the most bloated and copy-pasted code that's unique to TH02! So bloated, in fact, that TH02 RE as a whole jumped by almost 7%, which in turn finally pushed overall RE% over the 60% mark. 🎉 It's been a while since we hit a similar milestone; 50% overall RE happened almost 2 years ago during 📝 P0204, a month before I completed the TH01 decompilation.
Next up: Continuing to wait for Microsoft to fix the static analyzer bug until May at the latest, and working towards the newly popular dreams of TH03 netplay by looking at some of its foundational gameplay code.

📝 Posted:
🚚 Summary of:
P0264, P0265
Commits:
46cd6e7...78728f6, 78728f6...ff19bed
💰 Funded by:
Blue Bolt, [Anonymous], iruleatgames
🏷 Tags:

Oh, it's 2024 already and I didn't even have a delivery for December or January? Yeah… I can only repeat what I said at the end of November, although the finish line is actually in sight now. With 10 pushes across 4 repositories and a blog post that has already reached a word count of 9,240, the Shuusou Gyoku SC-88Pro BGM release is going to break 📝 both the push record set by TH01 Sariel two years ago, and 📝 the blog post length record set by the last Shuusou Gyoku delivery. Until that's done though, let's clear some more PC-98 Touhou pushes out of the backlog, and continue the preparation work for the non-ASCII translation project starting later this year.

But first, we got another free bugfix according to my policy! 📝 Back in April 2022 when I researched the Divide Error crash that can occur in TH04's Stage 4 Marisa fight, I proposed and implemented four possible workarounds and let the community pick one of them for the generally recommended small bugfix mod. I still pushed the others onto individual branches in case the gameplay community ever wants to look more closely into them and maybe pick a different one… except that I accidentally pushed the wrong code for the warp workaround, probably because I got confused with the second warp variant I developed later on.
Fortunately, I still had the intended code for both variants lying around, and used the occasion to merge the current master branch into all of these mod branches. Thanks to wyatt8740 for spotting and reporting this oversight!

  1. The Music Room background masking effect
  2. The GRCG's plane disabling flags
  3. Text color restrictions
  4. The entire messy rest of the Music Room code
  5. TH04's partially consistent congratulation picture on Easy Mode
  6. TH02's boss position and damage variables

As the final piece of code shared in largely identical form between 4 of the 5 games, the Music Rooms were the biggest remaining piece of low-hanging fruit that guaranteed big finalization% gains for comparatively little effort. They seemed to be especially easy because I already decompiled TH02's Music Room together with the rest of that game's OP.EXE back in early 2015, when this project focused on just raw decompilation with little to no research. 9 years of increased standards later though, it turns out that I missed a lot of details, and ended up renaming most variables and functions. Combined with larger-than-expected changes in later games and the usual quality level of ZUN's menu code, this ended up taking noticeably longer than the single push I expected.

The undoubtedly most interesting part about this screen is the animation in the background, with the spinning and falling polygons cutting into a single-color background to reveal a spacey image below. However, the only background image loaded in the Music Room is OP3.PI (TH02/TH03) or MUSIC3.PI (TH04/TH05), which looks like this in a .PI viewer or when converted into another image format with the usual tools:

TH02's Music Room background in its on-disk state TH03's Music Room background in its on-disk state TH04's Music Room background in its on-disk state TH05's Music Room background in its on-disk state
Let's call this "the blank image".

That is definitely the color that appears on top of the polygons, but where is the spacey background? If there is no other .PI file where it could come from, it has to be somewhere in that same file, right? :thonk:
And indeed: This effect is another bitplane/color palette trick, exactly like the 📝 three falling stars in the background of TH04's Stage 5. If we set every bit on the first bitplane and thus change any of the resulting even hardware palette color indices to odd ones, we reveal a full second 8-color sub-image hiding in the same .PI file:

TH02's Music Room background, with all bits in the first bitplane set to reveal the spacey background image, and the full color palette at the bottom TH03's Music Room background, with all bits in the first bitplane set to reveal the spacey background image, and the full color palette at the bottom TH04's Music Room background, with all bits in the first bitplane set to reveal the spacey background image, and the full color palette at the bottom TH05's Music Room background, with all bits in the first bitplane set to reveal the spacey background image, and the full color palette at the bottom
The spacey sub-image. Never before seen!1!! …OK, touhou-memories beat me by a month. Let's add each image's full 16-color palette to deliver some additional value.

On a high level, the first bitplane therefore acts as a stencil buffer that selects between the blank and spacey sub-image for every pixel. The important part here, however, is that the first bitplane of the blank sub-images does not consist entirely of 0 bits, but does have 1 bits at the pixels that represent the caption that's supposed to be overlaid on top of the animation. Since there now are some pixels that should always be taken from the spacey sub-image regardless of whether they're covered by a polygon, the game can no longer just clear the first bitplane at the start of every frame. Instead, it has to keep a separate copy of the first bitplane's original state (called nopoly_B in the code), captured right after it blitted the .PI image to VRAM. Turns out that this copy also comes in quite handy with the text, but more on that later.
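A simplified per-byte model of this stencil logic might look as follows. It assumes that the first bitplane provides the lowest bit of each pixel's 4-bit color index, which is what the even/odd relationship above implies. The function names are made up; only nopoly_B is a name from the actual code.

```cpp
#include <cstdint>

// One byte of a bitplane covers 8 horizontally adjacent pixels. Every
// frame starts from the original state of the first bitplane (nopoly_B)
// and then has the polygon pixels set on top.
constexpr std::uint8_t frame_plane_byte(
	std::uint8_t nopoly_b_byte, std::uint8_t polygon_byte
) {
	return static_cast<std::uint8_t>(nopoly_b_byte | polygon_byte);
}

// Combines the bits of the other three planes with the bit from the first
// plane into the final 4-bit color index of a pixel.
constexpr std::uint8_t color_index(
	std::uint8_t other_planes, bool first_plane_bit
) {
	return static_cast<std::uint8_t>(
		(other_planes & 0x0E) | (first_plane_bit ? 0x01 : 0x00)
	);
}

// Odd indices show the spacey sub-image, even ones the blank sub-image.
constexpr bool shows_spacey(std::uint8_t index) {
	return ((index & 0x01) != 0);
}
```

Caption pixels already have their bit set in `nopoly_b_byte`, so they stay on the spacey side of the palette regardless of whether a polygon covers them in any given frame.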


Then, the game simply draws polygons onto only the reblitted first bitplane to conditionally set the respective bits. ZUN used master.lib's grcg_polygon_c() function for this, which means that we can entirely thank the uncredited master.lib developers for this iconic animation – if they hadn't included such a function, the Music Rooms would most certainly look completely different.
This is where we get to complete the series on the PC-98 GRCG chip with the last remaining four bits of its mode register. So far, we only needed the highest bit (0x80) to either activate or deactivate it, and the bit below (0x40) to choose between the 📝 RMW and 📝 TCR/📝 TDW modes. But you can also use the lowest four bits to restrict the GRCG's operations to any subset of the four bitplanes, leaving the other ones untouched:

// Enable the GRCG (0x80) in regular RMW mode (0x40). All bitplanes are
// enabled and written according to the contents of the tile register.
outportb(0x7C, 0xC0);

// The same, but limiting writes to the first bitplane by disabling the
// second (0x02), third (0x04), and fourth (0x08) one, as done in the
// PC-98 Touhou Music Rooms.
outportb(0x7C, 0xCE);

// Regular GRCG blitting code to any VRAM segment…
pokeb(0xA800, offset, …);

// We're done, turn off the GRCG.
outportb(0x7C, 0x00);
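If you'd rather not memorize these magic numbers, the mode byte can be composed from named flags. This is a small sketch with hypothetical constant and function names; the 0x01 flag for the first bitplane is inferred from the pattern of the other three.

```cpp
#include <cstdint>

// Bits of the GRCG mode register as described above. Names are made up.
constexpr std::uint8_t GRCG_ENABLE = 0x80;
constexpr std::uint8_t GRCG_RMW = 0x40;

// Returns the plane disabling flags that leave only the given 0-based
// bitplane writable.
constexpr std::uint8_t disable_all_planes_except(int plane) {
	return static_cast<std::uint8_t>(0x0F & ~(1 << plane));
}

// Mode byte for RMW mode with the given planes disabled.
constexpr std::uint8_t grcg_rmw_mode(std::uint8_t disabled_planes = 0x00) {
	return static_cast<std::uint8_t>(
		GRCG_ENABLE | GRCG_RMW | (disabled_planes & 0x0F)
	);
}
```

`grcg_rmw_mode()` then evaluates to the 0xC0 from the first example, and `grcg_rmw_mode(disable_all_planes_except(0))` to the 0xCE used by the Music Rooms.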

This could be used for some unusual effects when writing to two or three of the four planes, but it seems rather pointless for this specific case at first. If we only want to write to a single plane, why not just do so directly, without the GRCG? Using that chip only involves more hardware and is therefore slower by definition, and the blitting code would be the same, right?
This is another one of these questions that would be interesting to benchmark one day, but in this case, the reason is purely practical: All of master.lib's polygon drawing functions expect the GRCG to be running in RMW mode. They write their pixels as bitmasks where 1 and 0 represent pixels that should or should not change, and leave it to the GRCG to combine these masks with its tile register and OR the result into the bitplanes instead of doing so themselves. Since GRCG writes are done via MOV instructions, not using the GRCG would turn these bitmasks into actual dot patterns, overwriting any previous contents of each VRAM byte that gets modified.
Technically, you'd only have to replace a few MOV instructions with OR to build a non-GRCG version of such a function, but why would you do that if you haven't measured polygon drawing to be an actual bottleneck?

Three overlapping Music Room polygons rendered using master.lib's grcg_polygon_c() function with a disabled GRCG
Three overlapping Music Room polygons rendered as in the original game, with the GRCG enabled
An example with three polygons drawn from top to bottom. Without the GRCG, edges of later polygons overwrite any previously drawn pixels within the same VRAM byte. Note how treating bitmasks as dot patterns corrupts even those areas where the background image had nonzero bits in its first bitplane.
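The difference between the two pictures can be modeled on a single VRAM byte. This is a deliberately simplified sketch of the behavior described above, covering one enabled bitplane and ignoring all other hardware details; the function names are hypothetical.

```cpp
#include <cstdint>

// Without the GRCG, a MOV places the written byte into VRAM as is, which
// replaces all 8 pixels of that byte with the bitmask.
constexpr std::uint8_t write_direct(
	std::uint8_t /* old_byte */, std::uint8_t mask
) {
	return mask;
}

// With the GRCG in RMW mode, only the pixels selected by the mask change,
// and they receive the corresponding bits of the plane's tile register.
constexpr std::uint8_t write_grcg_rmw(
	std::uint8_t old_byte, std::uint8_t mask, std::uint8_t tile
) {
	return static_cast<std::uint8_t>((old_byte & ~mask) | (tile & mask));
}
```

Given a VRAM byte of 0xF0 and a polygon edge that covers the mask 0x0C, the direct write leaves 0x0C and wipes the four leftmost pixels, while the RMW write with a tile of 0xFF results in 0xFC. In that all-ones case, the GRCG write is equivalent to the OR mentioned above.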

As far as complexity is concerned though, the worst part is the implicit logic that allows all this text to show up on top of the polygons in the first place. If every single piece of text is only rendered a single time, how can it appear on top of the polygons if those are drawn every frame?
Depending on the game (because of course it's game-specific), the answer involves either the individual bits of the text color index or the actual contents of the palette:

The contents of nopoly_B with each game's first track selected.

Finally, here's a list of all the smaller details that turn the Music Rooms into such a mess:

And that's all the Music Rooms! The OP.EXE binaries of TH04 and especially TH05 are now very close to being 100% RE'd, with only the respective High Score menus and TH04's title animation still missing. As for actual completion though, the finalization% metric is more relevant as it also includes the ZUN Soft logo, which I RE'd on paper but haven't decompiled. I'm 📝 still hoping that this will be the final piece of code I decompile for these two games, and that no one pays to get it done earlier… :onricdennat:


For the rest of the second push, there was a specific goal I wanted to reach for the remaining anything budget, which was blocked by a few functions at the beginning of TH04's and TH05's MAINE.EXE. In another anticlimactic development, this involved yet another way too early decompilation of a main() function…
Generally, this main() function just calls the top-level functions of all other ending-related screens in sequence, but it also handles the TH04-exclusive congratulating All Clear images within itself. After a 1CC, these are an additional reward on top of the Good Ending, showing the player character wearing a different outfit depending on the selected difficulty. On Easy Mode, however, the Good Ending is unattainable because the game always ends after Stage 5 with a Bad Ending, but ZUN still chose to show the EASY ALL CLEAR!! image in this case, regardless of how many continues you used.
While this might seem inconsistent with the other difficulties, it is consistent within Easy Mode itself, as the enforced Bad Ending after Stage 5 also doesn't distinguish between the number of continues. Also, Try to Normal Rank!! could very well be ZUN's roundabout way of implying "because this is how you avoid the Bad Ending".

With that out of the way, I was finally able to separate the VRAM text renderer of TH04 and TH05 into its own assembly unit, 📝 finishing the technical debt repayment project that I couldn't complete in 2021 due to assembly-time code segment label arithmetic in the data segment. This now allows me to translate this undecompilable self-modifying mess of ASM into C++ for the non-ASCII translation project, and thus unify the text renderers of all games and enhance them with support for Unicode characters loaded from a bitmap font. As the final finalized function in the SHARED segment, it also allowed me to remove 143 lines of particularly ugly segmentation workarounds 🙌


The remaining 1/6th of the second push provided the perfect occasion for some light TH02 PI work. The global boss position and damage variables represented some equally low-hanging fruit, being easily identified global variables that aren't part of a larger structure in this game. In an interesting twist, TH02 is the only game that uses an increasing damage value to track boss health rather than decreasing HP, and also doesn't internally distinguish between bosses and midbosses as far as these variables are concerned. Obviously, there's quite a bit of state left to be RE'd, not least because Marisa is doing her own thing with a bunch of redundant copies of her position, but that was too complex to figure out right now.

Also doing their own thing are the Five Magic Stones, which need five positions rather than a single one. Since they don't move, the game doesn't have to keep 📝 separate position variables for both VRAM pages, and can handle their positions in a much simpler way that made for a nice final commit.
And for the first time in a long while, I quite like what ZUN did there! Not only are their positions stored in an array that is indexed with a consistent ID for every stone, but these IDs also follow the order you fight the stones in: The two inner ones use 0 and 1, the two outer ones use 2 and 3, and the one in the center uses 4. This might look like an odd choice at first because it doesn't match their horizontal order on the playfield. But then you notice that ZUN uses this property in the respective phase control functions to iterate over only the subrange of active stones, and you realize how brilliant it actually is.

Screenshot of TH02's Five Magic Stones, with the first two (both internally and in the order you fight them in) alive and activated
Screenshot of TH02's Five Magic Stones, with the second two (both internally and in the order you fight them in) alive and activated
Screenshot of TH02's Five Magic Stones, with the last one (both internally and in the order you fight them in) alive and activated

This seems like a really basic thing to get excited about, especially since the rest of their data layout sure isn't perfect. Splitting each piece of state and even the individual X and Y coordinates into separate 5-element arrays is still counter-productive because the game ends up paying more memory and CPU cycles to recalculate the element offsets over and over again than this would have ever saved in cache misses on a 486. But that's a minor issue that could be fixed with a few regex replacements, not a misdesigned architecture that would require a full rewrite to clean it up. Compared to the hardcoded and bloated mess that was 📝 YuugenMagan's five eyes, this is definitely an improvement worthy of the good-code tag. The first actual one in two years, and a welcome change after the Music Room!

These three pieces of data alone yielded a whopping 5% of overall TH02 PI in just 1/6th of a push, bringing that game comfortably over the 60% PI mark. MAINE.EXE is guaranteed to reach 100% PI before I start working on the non-ASCII translations, but at this rate, it might even be realistic to go for 100% PI on MAIN.EXE as well? Or at least technical position independence, without the false positives.

Next up: Shuusou Gyoku SC-88Pro BGM. It's going to be wild.

📝 Posted:
🚚 Summary of:
P0258, P0259, P0260, P0261
Commits:
5876755...e8a0b3e, e8a0b3e...dfaa3c6, dfaa3c6...ed9ee93, ed9ee93...ae2fc28
💰 Funded by:
Blue Bolt, [Anonymous], Yanga, Splashman
🏷 Tags:

And we're back to PC-98 Touhou for a brief interruption of the ongoing Shuusou Gyoku Linux port. Let's clear some of the Touhou-related progress from the backlog, and use the unconstrained nature of these contributions to prepare the 📝 upcoming non-ASCII translations commissioned by Touhou Patch Center. The current budget won't cover all of my ambitions, but it would at least be nice if all text in these games was feasibly translatable by the time I officially start working on that project.

At a little over 3 pushes, it might be surprising to see that this took longer than the 📝 TH03/TH04/TH05 cutscene system. It's obvious that TH02 started out with a different system for in-game dialog, but while TH04 and TH05 look identical on the surface, they only actually share 30% of their dialog code. So this felt more like decompiling 2.4 distinct systems, as opposed to one identical base with tons of game-specific differences on top.

The table of contents was pretty popular last time around, so let's have another one:

  1. Overview of TH04's dialog system
  2. Changes introduced in TH05
  3. Command reference for the TH04 and TH05 systems
  4. Overview of TH02's dialog system
  5. TH02's face portrait images
  6. Bugs during TH02's dialog box slide-in animation
  7. Bugs and quirks in Mima's defeat dialog (might be lore-relevant)
  8. TH03 win messages

Let's start with the ones from TH04 and TH05, since they are not that broken. For TH04, ZUN started out by copy-pasting the cutscene system, causing the result to inherit many of the caveats I already described in the cutscene blog post:

Then, however, he greatly simplified the system. Mainly, this was done by moving text rendering from the PC-98 graphics chip to the text chip, which avoids the need for any text-related unblitting code, but ZUN also added a bunch of smaller changes:

While it would seem that TH05 has no issues with ASCII 0x20 spaces, the text as a whole is still blindly processed two bytes at a time, and any commands can only appear at even byte positions within a line. I dimmed the VRAM pixels to 25% of their original brightness to make the text easier to read.
The same text backported to TH04, additionally demonstrating how that game's dialog system inherited the whitespace skipping behavior of TH03's cutscene system. Just like there, ASCII 0x20 spaces only work at odd byte positions because the game treats them as the trailing byte of a full-width Shift-JIS codepoint. I don't know how large the budget for the upcoming non-ASCII translations will be, but I'm going to fix this even in the very basic fully static variant. I dimmed the VRAM pixels to 25% of their original brightness to make the text easier to read.
Demonstrating the lack of automatic line or box breaks in TH05's dialog system
Demonstrating the lack of automatic line or box breaks in TH04's dialog system, in addition to its lack of support for ASCII 0x20 spaces carried over from TH03's cutscene system

TH05 then moved from TH04's plaintext scripts to the binary .TX2 format while removing all the unused commands copy-pasted from the cutscene system. Except for a single additional command intended to clear a text box, TH05's dialog system only supports a strict subset of the features of TH04's system.
This change also introduced the following differences compared to TH04:

Writing the 0x02 byte to text RAM results in an SX character, which is simply the PC-98 font ROM's glyph for that Shift-JIS codepoint.
Also note how each face change is now preceded by two frames of delay.
No problem in TH04. Note how the dialog also runs a bit faster – TH04 only adds the aforementioned one frame of delay to each face change, and has fewer two-byte chunks of text to display overall.

For modding these files, you probably want to use TXDEF from -Tom-'s MysticTK. It decodes these files into a text representation, and its encoder then takes care of the character-specific byte offsets in the 10-byte header. This text representation simplifies the format a lot by avoiding all corner cases and landmines you'd experience during hex-editing – most notably by interpreting the box-starting 0x0D as a command to show text that takes a string parameter, avoiding the broken calls to script commands in the middle of text. However, you'd still have to manually ensure an even number of bytes on every line of text.

In the entry function of TH05's dialog loop, we also encounter the hack that is responsible for properly handling 📝 ZUN's hidden Extra Stage replay. Since the dialog loop doesn't access the replay inputs but still requires key presses to advance through the boxes, ZUN chose to just skip the dialog altogether in the specific case of the Extra Stage replay being active, and replicated all sprite management commands from the dialog script by just hardcoding them.
And you know what? Not only do I not mind this hack, but I would have preferred it over the actual dialog system! The aforementioned sprite management commands effectively boil down to manual memory management, deallocating all stage enemy and midboss sprites and thus ensuring that the boss sprites end up at specific master.lib sprite IDs (patnums). The hardcoded boss rendering function then expects these sprites to be available at these exact IDs… which means that the otherwise hardcoded bosses can't render properly without the dialog script running before them. :zunpet:
There is absolutely no excuse for the game to burden dialog scripts with this functionality. Sure, delayed deallocation would allow them to blit stage-specific sprites, but the original games don't do that; probably because neither of the two games features an unblitting command. And even if they did, it would have still been cleaner to expose the boss-specific sprite setup as a single script command that can then also be called from game code if the script didn't do so. Commands like these are just a recipe for crashes, especially with parsers that expect fullwidth Shift-JIS text and where misaligned ASCII text can easily cause these commands to be skipped.

But then again, it does make for funny screenshot material if you accidentally skip the deallocation and then see bosses being turned into stage enemies:

1️⃣ TH04's dialog before the Stage 4 Marisa fight without deallocating the stage sprites inside the script, causing Marisa to be turned into one of the stage enemies
2️⃣ TH04's dialog before the Stage 6 Yuuka fight without deallocating the stage sprites inside the script, causing Yuuka to be turned into two different cels of the same stage enemy
3️⃣ TH05's dialog before the Louise fight without deallocating the stage sprites inside the script, causing Louise to be turned into one of the ice enemies from TH05's Stage 2
4️⃣ TH05's dialog before the Mai and Yuki fight without deallocating the stage sprites inside the script, causing Mai and Yuki to be turned into a windmill and fairy/demon enemy, respectively
Some of the more amusing consequences of not calling the sprite-deallocating :th04: \c /  :th05: 0x04 command inside a dialog script.
In the case of 4️⃣, the game then even crashes on this frame at the end of the dialog, in a way that resembles the infamous 📝 TH04 crash before Stage 5 Yuuka if no EMS driver is loaded. Both the stage- and boss-specific BFNT sprites are loaded into memory at this point, leaving no room for the 256×256-pixel background image on the size-limited master.lib heap.

With all the general details out of the way, here's the command reference:

:th04: :th05:
0, 1 0x00, 0x01 Selects either the player character (0) or the boss (1) as the currently speaking character, and moves the cursor to the beginning of the text box. In TH04, this command also directly starts the new dialog box, which is probably why it's not prefixed with a \ as it only makes sense outside of text. TH05 requires a separate 0x0D command to do the same.
\=1 0x02 0x!! Replaces the face portrait of the currently active speaking character with image #1 within her .CD2 file.
\=255 0x02 0xFF Removes the face portrait from the currently active text box.
\l,filename 0x03 filename 0x00 Calls master.lib's super_entry_bfnt() function, which loads sprites from a BFNT file to consecutive IDs starting at the current patnum write cursor.
\c 0x04 Deallocates all stage-specific BFNT sprites (i.e., stage enemies and midbosses), freeing up conventional RAM for the boss sprites and ensuring that master.lib's patnum write cursor ends up at :th04: 128 / :th05: 180.
In TH05's Extra Stage, this command also replaces 📝 the sprites loaded from MIKO16.BFT with the ones from ST06_16.BFT.
\d Deallocates all face portrait images.
The game automatically does this at the end of each dialog sequence. However, ZUN wanted to load Stage 6 Yuuka's 76 KiB of additional animations inside the script via \l, and would have once again run up against the master.lib heap size limit without that extra free memory.
\m,filename 0x05 filename 0x00 Stops the currently playing BGM, loads a new one from the given file, and starts playback.
\m$ 0x05 $ 0x00 Stops the currently playing BGM.
Note that TH05 interprets $ as a null-terminated filename as well.
\m* Restarts playback of the currently loaded BGM from the beginning.
\b0,0,0 0x06 0x!!!! 0x!!!! 0x!! Blits the master.lib patnum with the ID indicated by the third parameter to the current VRAM page at the top-left screen position indicated by the first two parameters.
\e0 Plays the sound effect with the given ID.
\t100 Sets palette brightness via master.lib's palette_settone() to any value from 0 (fully black) to 200 (fully white). 100 corresponds to the palette's original colors.
\fo1, \fi1 Calls master.lib's palette_black_out() or palette_black_in() to play a hardware palette fade animation from or to black, spending roughly 1 frame on each of the 16 fade steps.
\wo1, \wi1 0x09 0x!!, 0x0A 0x!! Calls master.lib's palette_white_out() or palette_white_in() to play a hardware palette fade animation from or to white, spending roughly 1 frame on each of the 16 fade steps.
The TH05 version of 0x09 also clears the text in both boxes before the animation.
\n 0x0B Starts a new line by resetting the X coordinate of the TRAM cursor to the left edge of the text area and incrementing the Y coordinate.
The new line will always be the next one below the last one that was properly started, regardless of whether the text previously wrapped to the next TRAM row at the edge of the screen.
\g8 Plays a blocking 8-frame screen shake animation. Copy-pasted from the cutscene parser, but actually used right at the end of the dialog shown before TH04's Bad Ending.
\ga0 0x0C 0x!! Shows the gaiji with the given ID from 0 to 255 at the current cursor position, ignoring the per-glyph delay.
\k0 Waits 0 frames (0 = forever) for any key to be pressed before continuing script execution.
0x0D Starts a new dialog box with the previously selected speaker. All text until the next 0xFF command will appear on screen.
Inside dialogs, this is a no-op.
0x0E Takes the current dialog cursor as the top-left corner of a 240×48-pixel rectangle, and replaces all text RAM characters within that rectangle with whitespace.
This is only used to clear the player character's text box before Shinki's final いくよ‼ box. Shinki has two consecutive text boxes in all 4 scripts here, and ZUN probably wanted to clear the otherwise blue text to imply a dramatic pause before Shinki's final sentence. Nice touch.
(You could, however, also use it after a box-ending 0xFF command to mess with text RAM in general.)
\# Quits the currently running loop. This returns from either the text loop to the command loop, or it ends the dialog sequence by returning from the command loop back to gameplay. If this stage of the game later starts another dialog sequence, it will start at the next script byte.
\$ Like \#, but first waits for any key to be pressed.
0xFF Behaves like TH04's \$ in the text loop, and like \# in the command loop. Hence, it's not possible in TH05 to automatically end a text box and advance to the next one without waiting for a key press.
Unused commands are in gray.

At the end of the day, you might criticize the system for how its landmines make it annoying to mod in ASCII text, but it all works and does what it's supposed to. ZUN could have written the cleanest single and central Shift-JIS iterator that properly chunks a byte buffer into halfwidth and fullwidth codepoints, and I'd still be throwing it out for the upcoming non-ASCII translations in favor of something that either also supports UTF-8 or performs dictionary lookups with a full box of text.
The only actual bug can be found in the input detection, which once again doesn't correctly handle the infamous key up/key down scancode quirk of PC-98 keyboards. All it takes is one wrongly placed input polling call, and suddenly you have to think about how the update cycle behind the PC-98 keyboard state bytes might cause the game to run the regular 2-frame delay for a single 2-byte chunk of text before it shows the full text of a box after all… But even this bug is highly theoretical and could probably only be observed very, very rarely, and exclusively on real hardware.
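For reference, the kind of central Shift-JIS iterator mentioned above could look roughly like the following sketch. It is not code from the games or from ReC98, and the function names are made up for illustration. It relies only on the standard Shift-JIS rule that bytes in the ranges 0x81-0x9F and 0xE0-0xFC start a 2-byte codepoint:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Shift-JIS lead bytes of 2-byte codepoints: 0x81-0x9F and 0xE0-0xFC.
inline bool sjis_is_lead_byte(unsigned char c) {
    return (((c >= 0x81) && (c <= 0x9F)) || ((c >= 0xE0) && (c <= 0xFC)));
}

// Splits `buf` into 1-byte (halfwidth) and 2-byte (fullwidth) chunks.
// A stray lead byte at the very end of the buffer is returned as a
// 1-byte chunk instead of reading past the end.
inline std::vector<std::string> sjis_chunks(const std::string& buf) {
    std::vector<std::string> ret;
    std::size_t i = 0;
    while (i < buf.size()) {
        const auto c = static_cast<unsigned char>(buf[i]);
        const bool fullwidth = (
            sjis_is_lead_byte(c) && ((i + 1) < buf.size())
        );
        const std::size_t len = (fullwidth ? 2 : 1);
        ret.push_back(buf.substr(i, len));
        i += len;
    }
    return ret;
}
```

Unlike the blind two-bytes-at-a-time processing of the original parsers, chunking like this would let an ASCII space or a command byte appear at any byte position.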


The same can't be said about TH02 though, but more on that later. Let's first take a look at its data, which started out much simpler in that game. The STAGE?.TXT files contain just raw Shift-JIS text with no trace of commands or structure. Turning on the whitespace display feature in your editor reveals how the dialog system even assumes a fixed byte length for each box: 36 bytes per line which will appear on screen, followed by 4 bytes of padding, which the original files conveniently use to visually split the lines via a CR/LF newline sequence. Make sure to disable trimming of trailing whitespace in your editor to not ruin the file when modding the text… :onricdennat:

靈夢:あんた、まだ名前も聞いてないの··
······に覚えられないわよ。・・・・・··
里香:あたいは、里香よ。覚えときなさ··
・・・い。・・・・・・················
Two boxes from TH02's STAGE5.TXT with visualized whitespace. These also demonstrate how the CR/LF newlines only make up 2 of the 4 padding bytes, and require each line to be padded with two more bytes; you could not use these trailing spaces for actual text. Also note how the exquisite mixture of fullwidth and halfwidth spaces demands the text to be viewed with only the most metrically consistent monospace fonts to preserve the intended alignment. 🍷 It appears quite misaligned on my phone.
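Given this fixed layout, locating a line of text is pure arithmetic. The sketch below assumes two lines per box, which is only inferred from the excerpt above, and uses made-up names. It is not the game's code:

```cpp
#include <cstddef>
#include <string>

constexpr std::size_t LINE_VISIBLE = 36; // bytes that appear on screen
constexpr std::size_t LINE_PADDING = 4;  // 2 spaces + CR/LF in the original files
constexpr std::size_t LINE_SIZE = (LINE_VISIBLE + LINE_PADDING);
constexpr std::size_t LINES_PER_BOX = 2; // assumption, inferred from the excerpt
constexpr std::size_t BOX_SIZE = (LINE_SIZE * LINES_PER_BOX);

// Returns the 36 visible bytes of line `line` within box `box`, or an
// empty string if that line lies outside of the file.
inline std::string box_line(
    const std::string& file, std::size_t box, std::size_t line
) {
    const std::size_t offset = ((box * BOX_SIZE) + (line * LINE_SIZE));
    if ((line >= LINES_PER_BOX) || ((offset + LINE_VISIBLE) > file.size())) {
        return {};
    }
    return file.substr(offset, LINE_VISIBLE);
}
```

This also shows why trimming trailing whitespace ruins the file: every removed byte shifts all following lines away from the offsets that such a calculation expects.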

Consequently, everything else is hardcoded – every effect shown between text boxes, the face portrait shown for each box, and even how many boxes are part of each dialog sequence. Which means that the source code now contains a long hardcoded list of face IDs for most of the text boxes in the game, with the rest being part of the dedicated hardcoded dialog scripts for 2/3 of the game's stages.
Without the restriction to a fixed set of scripting commands, TH02 naturally gravitated to having the most varied dialog sequences of all PC-98 Touhou games. This flexibility certainly facilitated Mima's grand entrance animation in Stage 4, or the different lines in Stage 4 and 5 depending on whether you already used a continue or not. Marisa's post-boss dialog even inserts the number of continues into the text itself – by, you guessed it, writing to hardcoded byte offsets inside the dialog text before printing it to the screen. :godzun: But once again, I have nothing to criticize here – not even the fact that the alternate dialog scripts have to mutate the "box cursor" to jump to the intended boxes within the file. I know that some people in my audience like VMs, but I would have considered it more bloated if ZUN had implemented a full-blown scripting language just to handle all these special cases.


Another unique aspect of TH02 is the way it stores its face portraits, which are infamous for how hard they are to find in the original data files. These sprites are actually map tiles, stored in MIKO_K.MPN, and drawn using the same functions used to blit the regular map tiles to the 📝 tile source area in VRAM. We can only guess why ZUN chose this one out of the three graphics formats he used in TH02:

TH02's MIKO_K.MPN, arranged into a 16×16-tile layout that reveals how these tiles are combined into face portraits.
MPNDEF from -Tom-'s MysticTK conveniently uses this exact layout in its .BMP output. Earlier MPNDEF versions crashed when converting this file as its 256 tiles led to an 8-bit overflow bug, so make sure you've updated to the current version from the end of October 2023 if you want to convert this file yourself. The format stores the 4 bitplanes of each 16×16 tile in order, so good luck finding a different planar image viewer that would support both such a tiled layout and a custom palette. Sometimes, a weird internal format is the best type of obfuscation. :tannedcirno:
TH02's MIKO_K.MPN with the 16×16 tile grid overlaid

And since you're certainly wondering about all these black tiles at the edges: Yes, these are not only part of the file and pad it from the required 240×192 pixels to 256×256, but also kept in memory during a stage, wasting 9.5 KiB of conventional RAM. That's 172 seconds of potential input replay data, just for those people who might still think that we need EMS for replays.
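The 9.5 KiB figure can be checked with a few lines of compile-time arithmetic. The replay conversion at the end is an assumption that happens to match the number above: one byte of input per frame at the PC-98's refresh rate of roughly 56.4 Hz. Any file header is ignored here:

```cpp
#include <cstddef>

constexpr std::size_t BITS_PER_PIXEL = 4; // 4 bitplanes = 16 colors

constexpr std::size_t image_bytes(std::size_t w, std::size_t h) {
    return ((w * h * BITS_PER_PIXEL) / 8);
}

constexpr std::size_t STORED_BYTES = image_bytes(256, 256); // 32,768
constexpr std::size_t NEEDED_BYTES = image_bytes(240, 192); // 23,040
constexpr std::size_t WASTED_BYTES = (STORED_BYTES - NEEDED_BYTES);

// 9,728 bytes = 9.5 KiB
static_assert(WASTED_BYTES == 9728, "unexpected padding size");
static_assert((WASTED_BYTES * 2) == (19 * 1024), "not 9.5 KiB");

// Assumption: 1 input byte per frame, ~56.4 frames per second.
constexpr std::size_t REPLAY_SECONDS = ((WASTED_BYTES * 10) / 564);
static_assert(REPLAY_SECONDS == 172, "unexpected replay duration");
```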


Alright, we've got the text, we've got the faces, let's slide in the box and display it all on screen. Apparently though, we also have to blit the player and option sprites using raw, low-level master.lib function calls in the process? :thonk: This can't be right, especially because ZUN always blits the option sprite associated with the Reimu-A shot type, regardless of which one the player actually selected. And if you keep moving above the box area before the dialog starts, you get to see exactly how wrong this is:

Let's look closer at Reimu's sprite during the slide-in animation, and in the two frames before:

1️⃣ Zoomed-in area around Reimu's sprite from frame 35 of the video above
2️⃣ Zoomed-in area around Reimu's sprite from frame 36 of the video above
3️⃣ Zoomed-in area around Reimu's sprite from frame 37 of the video above

This one image shows off no less than 4 bugs:

  1. ZUN blits the stationary player sprite here, regardless of whether the player was previously moving left or right. This is a nice way of indicating that Reimu stops moving once the dialog starts, but maybe ZUN should have unblitted the old sprite so that the new one wouldn't have appeared on top. The game only unblits the 384×64 pixels covered by the dialog box on every frame of the slide-in animation, so Reimu would only appear correctly if her sprite happened to be entirely located within that area.
  2. All sprites are shifted up by 1 pixel in frame 2️⃣. This one is not a bug in the dialog system, but in the main game loop. The game runs the relevant actions in the following order:

    1. Invalidate any map tiles covered by entities
    2. Redraw invalidated tiles
    3. Decrement the Y coordinate at the top of VRAM according to the scroll speed
    4. Update and render all game entities
    5. Scroll in new tiles as necessary according to the scroll speed, and report whether the game has scrolled one pixel past the end of the map
    6. If that happened, pretend it didn't by incrementing the value calculated in #3 for all further frames and skipping to #8.
    7. Issue a GDC SCROLL command to reflect the line calculated in #3 on the display
    8. Wait for VSync
    9. Flip VRAM pages
    10. Start boss if we're past the end of the map

    The problem here: Once the dialog starts, the game has already rendered an entire new frame, with all sprites being offset by a new Y scroll offset, without adjusting the graphics GDC's scroll registers to compensate. Hence, the Y position in 3️⃣ is the correct one, and the whole existence of frame 2️⃣ is a bug in itself. (Well… OK, probably a quirk because speedrunning exists, and it would be pretty annoying to synchronize any video regression tests of the future TH02 Anniversary Edition if it renders one fewer frame in the middle of a stage.)

  3. ZUN blits the option sprites to their position from frame 1️⃣. This brings us back to 📝 TH02's special way of retaining the previous and current position in a two-element array, indexed with a VRAM page ID. Normally, this would be equivalent to using dedicated prev and cur structure fields and you'd just index it with the back page for every rendering call. But if you then decide to go single-buffered for dialogs and render them onto the front page instead… :zunpet:
    Note that fixing bug #2 would not cancel out this one – the sprites would then simply be rendered to their position in the frame before 1️⃣.

  4. And of course, the fixed option sprite ID also counts as a bug.
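Bug #3 is easier to follow with a toy model. The following sketch is one plausible reading of the description above, using made-up types rather than the game's actual structures. Positions are stored per VRAM page, regular frames write to the back page's slot and then flip, and the dialog code keeps indexing with the back page even though it draws onto the front page:

```cpp
// Hypothetical stand-in for a position that is stored once per VRAM page.
struct PagedY {
    int y[2];
};

struct Game {
    int back_page = 0;
    PagedY option = { { 0, 0 } };

    int front_page() const {
        return (back_page ^ 1);
    }

    // One regular frame: store the new position in the back page's slot,
    // "render" to the back page, then flip pages.
    void run_frame(int new_y) {
        option.y[back_page] = new_y;
        back_page ^= 1;
    }

    // Indexing with the back page while drawing onto the front page
    // yields the position from one frame earlier.
    int dialog_y_buggy() const {
        return option.y[back_page];
    }

    // Indexing with the front page matches what is currently visible.
    int dialog_y_fixed() const {
        return option.y[front_page()];
    }
};
```

After two frames with Y positions 100 and 99, the visible page shows the sprite at 99, but `dialog_y_buggy()` returns 100, which corresponds to the position from frame 1️⃣.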

As for the boxes themselves, it's yet another loop that prints 2-byte chunks of Shift-JIS text at an even slower fixed interval of 3 frames. In an interesting quirk though, ZUN assumes that every box starts with the name of the speaking character in its first two fullwidth Shift-JIS characters, followed by a fullwidth colon. These 6 bytes are displayed immediately at the start of every box, without the usual delay. The resulting alignment looks rather janky with Genjii, whose single right-padded kanji looks quite awkward with the fullwidth space between the name and the colon. Kind of makes you wonder why ZUN didn't just spell out his proper name, 玄爺, instead, but I get the stylistic difference.
In Stage 4, the two-kanji assumption then breaks with Marisa's three-kanji name, which causes the full-width colon to be printed as the first delayed character in each of her boxes:
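The timing rule described above can be modeled in a few lines. This is a sketch with hypothetical names. Whether the first delayed chunk shows up after exactly 3 frames is an assumption, and only the 6-byte prefix and the 3-frame interval come from the text:

```cpp
#include <cstddef>

constexpr std::size_t NAME_PREFIX_BYTES = 6; // 2 name characters + colon
constexpr std::size_t CHUNK_BYTES = 2;
constexpr std::size_t FRAMES_PER_CHUNK = 3;

// Returns the number of delay frames before the 2-byte chunk at
// `byte_offset` becomes visible. 0 means "shown together with the box".
constexpr std::size_t frames_until_visible(std::size_t byte_offset) {
    return (byte_offset < NAME_PREFIX_BYTES)
        ? 0
        : ((((byte_offset - NAME_PREFIX_BYTES) / CHUNK_BYTES) + 1)
            * FRAMES_PER_CHUNK);
}
```

With a two-kanji name, the colon sits at byte offset 4 and is part of the immediate prefix. With a three-kanji name, it moves to offset 6 and becomes the first delayed chunk.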


That's all the issues and quirks in the system itself. The scripts themselves don't leave much room for bugs as they basically just loop over the hardcoded face ID array at this level… until we reach the end of the game. Previously, the slide-in animation could simply use the tile invalidation and re-rendering system to unblit the box on each frame, which also explained why Reimu had to be separately rendered on top. But this no longer works with a custom-rendered boss background, and so the game just chooses to flood-fill the area with graphics chip color #0:

Then again, transferring pixels from the back page would be just as wrong as they lag one frame behind. No way around capturing these 384×64 pixels to main memory here… Oh well, this flood-fill at least adds even more legibility on top of the already half-transparent text box. A property that the following dialog sequence unfortunately lacks…

For Mima's final defeat dialog though, ZUN chose to not even show the box. He might have realized the issue by that point, or simply preferred the more dramatic effect this had on the lines. The resulting issues, however, might even have ramifications for such un-technical things as lore and character dynamics. :zunpet: As it turns out, the code for this dialog sequence does in fact render Mima's smiling face for all boxes?! You only don't see it in the original game because it's rendered to the other VRAM page that remains invisible during the dialog sequence:

Caution, flashing lights.

Here's how I interpret the situation:

So, the future TH02 Anniversary Edition will fix the bug by showing the back page, but retain the quirk by rewriting the dialog code to not blit the face.


And with that, we've secured all in-game dialog for the upcoming non-ASCII translations! The remaining 2/3 of the last push made for a good occasion to also decompile the small amount of code related to TH03's win messages, stored in the @0?TX.TXT files. Similar to TH02's dialog format, these files are also split into fixed-size blocks of 3×60 bytes. But this time, TH03 loads all 60 bytes of a line, including the CR/LF line breaking codepoints in the original files, into the statically allocated buffer that it renders from. These control characters are then only filtered to whitespace by ZUN's graph_putsa_fx() function. If you remove the line breaks, you get to use the full 60 bytes on every line.
The final commits went to the MIKO.CFG loading and saving functions used in TH04's and TH05's OP.EXE, as well as TH04's game startup code to finally catch up with 📝 TH05's counterpart from over 3 years ago. This brought us right in front of the main menu rendering code in both TH04 and TH05, which is identical in both games and will be tackled in the next PC-98 Touhou delivery.

Next up, though: Returning to Shuusou Gyoku, and adding support for SC-88Pro recordings as BGM. Which may or may not come with a slight controversy…

📝 Posted:
🚚 Summary of:
P0190, P0191, P0192
Commits:
5734815...293e16a, 293e16a...71cb7b5, 71cb7b5...e1f3f9f
💰 Funded by:
nrook, -Tom-, [Anonymous]
🏷 Tags:

The important things first:

So, Shinki! As far as final boss code is concerned, she's surprisingly economical, with 📝 her background animations making up more than ⅓ of her entire code. Going straight from TH01's 📝 final 📝 bosses to TH05's final boss definitely showed how much ZUN had streamlined danmaku pattern code by the end of PC-98 Touhou. Don't get me wrong, there is still room for improvement: TH05 not only 📝 reuses the same 16 bytes of generic boss state we saw in TH04 last month, but also uses them 4× as often, and even for midbosses. Most importantly though, defining danmaku patterns using a single global instance of the group template structure is just bad no matter how you look at it:

Declaring a separate structure instance with the static data for every pattern would be both safer and more space-efficient, and there's more than enough space left for that in the game's data segment.
But all in all, the pattern functions are short, sweet, and easy to follow. The "devil" pattern is significantly more complex than the others, but still far from TH01's final bosses at their worst. I especially like the clear architectural separation between "one-shot pattern" functions that return true once they're done, and "looping pattern" functions that run as long as they're being called from a boss's main function. For the most part, there isn't much of interest in these pattern functions, except for two pieces of evidence that Shinki was coded after Yumeko:


Speaking of that wing sprite: If you look at ST05.BB2 (or any other file with a large sprite, for that matter), you notice a rather weird file layout:

Raw file layout of TH05's ST05.BB2, demonstrating master.lib's supposed BFNT width limit of 64 pixels
A large sprite split into multiple smaller ones with a width of 64 pixels each? What's this, hardware sprite limitations? On my PC-98?!

And it's not a limitation of the sprite width field in the BFNT+ header either. Instead, it's master.lib's BFNT functions which are limited to sprite widths up to 64 pixels… or at least that's what MASTER.MAN claims. Whatever the restriction was, it seems to be completely nonexistent as of master.lib version 0.23, and none of the master.lib functions used by the games have any issues with larger sprites.
Since ZUN stuck to the supposed 64-pixel width limit though, it's now the game that expects Shinki's winged form to consist of 4 physical sprites, not just 1. Any conversion from another, more logical sprite sheet layout back into BFNT+ must therefore replicate the original number of sprites. Otherwise, the sequential IDs ("patnums") assigned to every newly loaded sprite no longer match ZUN's hardcoded IDs, causing the game to crash. This is exactly what used to happen with -Tom-'s MysticTK automation scripts, which combined these exact sprites into a single large one. This issue has now been fixed – just in case there are some underground modders out there who used these scripts and wonder why their game crashed as soon as the Shinki fight started.
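The effect of merging sprites on the sequential IDs can be illustrated with a small model. File names and sprite counts below are invented for the example, and the function is not part of master.lib or MysticTK:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical description of a sprite file: only its sprite count matters.
struct SpriteFile {
    std::string name;
    std::size_t sprite_count;
};

// Returns the first ID ("patnum") that each file's sprites end up at when
// the files are loaded in order, starting at the write cursor `cursor`.
inline std::vector<std::size_t> first_patnums(
    const std::vector<SpriteFile>& files, std::size_t cursor
) {
    std::vector<std::size_t> ret;
    for (const auto& file : files) {
        ret.push_back(cursor);
        cursor += file.sprite_count;
    }
    return ret;
}
```

If a file that the game expects to contain 4 sprites is converted back with only 1 merged sprite, everything loaded after it ends up 3 IDs lower than the hardcoded IDs assume.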


And then the code quality takes a nosedive with Shinki's main function. :onricdennat: Even in TH05, these boss and midboss update functions are still very imperative:

The biggest WTF in there, however, goes to using one of the 16 state bytes as a "relative phase" variable for differentiating between boss phases that share the same branch within the switch(boss.phase) statement. While it's commendable that ZUN tried to reduce code duplication for once, he could have just branched depending on the actual boss.phase variable? The same state byte is then reused in the "devil" pattern to track the activity state of the big jerky lasers in the second half of the pattern. If you somehow managed to end the phase after the first few bullets of the pattern, but before these lasers are up, Shinki's update function would think that you're still in the phase before the "devil" pattern. The main function then sequence-breaks right to the defeat phase, skipping the final pattern with the burning Makai background. Luckily, the HP boundaries are far enough apart to make this impossible in practice.
The takeaway here: If you want to use the state bytes for your custom boss script mods, alias them to your own 16-byte structure, and limit each of the bytes to a clearly defined meaning across your entire boss script.
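As a rough illustration of that advice, a mod could lay a named structure over the generic bytes. All type and field names below are hypothetical, and the `reinterpret_cast` is the kind of cast that works in practice for all-byte structures but sits in a gray area of the C++ standard:

```cpp
#include <cstdint>

// Stand-in for the game's 16 generic boss state bytes.
struct BossState {
    std::uint8_t bytes[16];
};

// Hypothetical per-boss view. Every byte has exactly one meaning for the
// entire boss script.
struct MyBossState {
    std::uint8_t pattern_frame;
    std::uint8_t lasers_active;
    std::uint8_t shots_fired;
    std::uint8_t unused[13];
};

static_assert(
    sizeof(MyBossState) == sizeof(BossState),
    "the view must cover exactly the 16 state bytes"
);

// Reinterprets the generic bytes as the boss-specific structure.
inline MyBossState& my_state(BossState& generic) {
    return reinterpret_cast<MyBossState&>(generic);
}
```

The `static_assert` catches the case where a newly added field silently grows the view beyond the 16 bytes that the game provides.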

One final discovery that doesn't seem to be documented anywhere yet: Shinki actually has a hidden bomb shield during her two purple-wing phases. uth05win got this part slightly wrong though: It's not a complete shield, and hitting Shinki will still deal 1 point of chip damage per frame. For comparison, the first phase lasts for 3,000 HP, and the "devil" pattern phase lasts for 5,800 HP.

And there we go, 3rd PC-98 Touhou boss script* decompiled, 28 to go! 🎉 In case you were expecting a fix for the Shinki death glitch: That one is more appropriately fixed as part of the Mai & Yuki script. It also requires new code, should ideally look a bit prettier than just removing cheetos between one frame and the next, and I'd still like it to fit within the original position-dependent code layout… Let's do that some other time.
Not much to say about the Stage 1 midboss, or midbosses in general even, except that their update functions have to imperatively handle even more subsystems, due to the relative lack of helper functions.


The remaining ¾ of the third push went to a bunch of smaller RE and finalization work that would have hardly got any attention otherwise, to help secure that 50% RE mark. The nicest piece of code in there shows off what looks like the optimal way of setting up the 📝 GRCG tile register for monochrome blitting in a variable color:

mov ah, palette_index ; Any other non-AL 8-bit register works too.
                      ; (x86 only supports AL as the source operand for OUTs.)

rept 4                ; For all 4 bitplanes…
    shr ah,  1        ; Shift the next color bit into the x86 carry flag
    sbb al,  al       ; Extend the carry flag to a full byte
                      ; (CF=0 → 0x00, CF=1 → 0xFF)
    out 7Eh, al       ; Write AL to the GRCG tile register
endm

Thanks to Turbo C++'s inlining capabilities, the loop body even decompiles into a surprisingly nice one-liner. What a beautiful micro-optimization, at a place where micro-optimization doesn't hurt and is almost expected.
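For anyone who doesn't read x86 ASM, here is a rough portable C++ model of what that loop computes. This is not the actual decompiled one-liner from the repo, and `grcg_tile_bytes()` is a made-up name. Since standard C++ has no port I/O, the sketch just returns the 4 bytes that would be written to port 7Eh, in order.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Returns the 4 bytes the loop would write to the GRCG tile register,
// starting with the byte for the first bitplane.
std::array<uint8_t, 4> grcg_tile_bytes(uint8_t palette_index) {
    assert(palette_index < 16); // 4 bitplanes = 16 colors

    std::array<uint8_t, 4> ret{};
    for(int plane = 0; plane < 4; plane++) {
        // SHR AH, 1: the lowest color bit ends up in the carry flag.
        const bool carry = ((palette_index & 1) != 0);
        palette_index >>= 1;

        // SBB AL, AL: extends the carry flag to a full byte.
        ret[plane] = (carry ? 0xFF : 0x00);
    }
    return ret;
}
```

Color 5 (binary 0101) would then come out as FFh, 00h, FFh, 00h.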
Unfortunately, the micro-optimizations all went downhill from there, becoming increasingly dumb and undecompilable. Was it really necessary to save 4 x86 instructions in the highly unlikely case of a new spark sprite being spawned outside the playfield? That one 2D polar→Cartesian conversion function then pointed out Turbo C++ 4.0J's woefully limited support for 32-bit micro-optimizations. The code generation for 32-bit 📝 pseudo-registers is so bad that they almost aren't worth using for arithmetic operations, and the inline assembler just flat out doesn't support anything 32-bit. No use in decompiling a function that you'd have to entirely spell out in machine code, especially if the same function already exists in multiple other, more idiomatic C++ variations.
Rounding out the third push, we got the TH04/TH05 DEMO?.REC replay file reading code, which should finally prove that nothing about the game's original replay system could serve as even just the foundation for community-usable replays. Just in case anyone was still thinking that.


Next up: Back to TH01, with the Elis fight! Got a bit of room left in the cap again, and there are a lot of things that would make a lot of sense now:

📝 Posted:
🚚 Summary of:
P0138
Commits:
8d953dc...864e864
💰 Funded by:
[Anonymous], Blue Bolt
🏷 Tags:

Technical debt, part 9… and as it turns out, it's highly impractical to repay 100% of it at this point in development. 😕

The reason: graph_putsa_fx(), ZUN's function for rendering optionally boldfaced text to VRAM using the font ROM glyphs, in its ridiculously micro-optimized TH04 and TH05 version. This one sets the "callback function" for applying the boldface effect by self-modifying the target of two CALL rel16 instructions… because there really wasn't any free register left for an indirect CALL, eh? The necessary distance, from the call site to the function itself, has to be calculated at assembly time, by subtracting the call site label from the target function label.
This usually wouldn't be a problem… if ZUN didn't store the resulting lookup tables in the .DATA segment. With code segments, we can easily split them at pretty much any point between functions because there are multiple of them. But there's only a single .DATA segment, with all ZUN and master.lib data sandwiched between Borland C++'s crt0 at the top, and Borland C++'s library functions at the bottom of the segment. Adding another split point would require all data after that point to be moved to its own translation unit, which in turn requires EXTERN references in the big .ASM file to all that moved data… in short, it would turn the codebase into an even greater mess.
Declaring the labels as EXTERN wouldn't work either, since the linker can't do fancy arithmetic and is limited to simply replacing address placeholders with one single address. So, we're now stuck with this function at the bottom of the SHARED segment, for the foreseeable future.
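To illustrate the distance calculation described above, here's a small C++ model of how the operand of a near CALL rel16 works on x86: The CPU adds the 16-bit displacement to the address of the instruction following the 3-byte CALL. Function names and the byte buffer are hypothetical and have nothing to do with ZUN's actual lookup tables; this only shows the arithmetic that the assembler has to do, and that the linker can't.

```cpp
#include <cassert>
#include <cstdint>

// A near `CALL rel16` is the opcode byte E8h, followed by a 16-bit
// little-endian displacement.
constexpr unsigned CALL_REL16_SIZE = 3;

// Displacement from the end of the CALL instruction to the target function,
// with 16-bit wraparound for backward calls.
uint16_t call_rel16_displacement(uint16_t call_offset, uint16_t target_offset) {
    return static_cast<uint16_t>(
        target_offset - (call_offset + CALL_REL16_SIZE)
    );
}

// Models the self-modification: overwrites the displacement of an existing
// CALL rel16 inside a (hypothetical) code buffer.
void patch_call_rel16(uint8_t *code, uint16_t call_offset, uint16_t target_offset) {
    assert(code[call_offset] == 0xE8); // must be a CALL rel16

    const uint16_t rel = call_rel16_displacement(call_offset, target_offset);
    code[call_offset + 1] = static_cast<uint8_t>(rel & 0xFF);
    code[call_offset + 2] = static_cast<uint8_t>(rel >> 8);
}
```

The result depends on two addresses at once, which is exactly the kind of expression that can't be expressed as a single address placeholder.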


We can still continue to separate functions off the top of that segment, though. Pretty much the only thing noteworthy there, so far: TH04's code for loading stage tile images from .MPN files, which we hadn't reverse-engineered so far, and which nicely fit into one of Blue Bolt's pending ⅓ RE contributions. Yup, we finally moved the RE% bars again! If only for a tiny bit. :tannedcirno:
Both TH02 and TH05 simply store one pointer to one dynamically allocated memory block for all tile images, as well as the number of images, in the data segment. TH04, on the other hand, reserves memory for 8 .MPN slots, complete with their color palettes, even though it only ever uses the first one of these. There goes another 458 bytes of conventional RAM… I should start summing up all the waste we've seen so far. Let's put the next website contribution towards a tagging system for these blog posts.

At 86% of technical debt in the SHARED segment repaid, we aren't quite done yet, but the rest is mostly just TH04 needing to catch up with functions we've already separated. Next up: Getting to that practical 98.5% point. Since this is very likely to not require a full push, I'll also decompile some more actual TH04 and TH05 game code I previously reverse-engineered – and after that, reopen the store!

📝 Posted:
🚚 Summary of:
P0132
Commits:
dc9e3ee...045450c
💰 Funded by:
[Anonymous]
🏷 Tags:

Now that's the amount of translation unit separation progress I was looking for! Too bad that RL is keeping me more and more occupied these days, and ended up delaying this push until 2021. Now that Touhou Patch Center is also commissioning me to update their infrastructure, it's going to take a while for ReC98 to return to full speed, and for the store to be reopened. Should happen by April at the latest, though!

With everything related to this separation of translation units explained earlier, we've really got a push with nothing to talk about, this time. Except, maybe, for the realization that 📝 this current approach might not be the best fit for TH02 after all: Not only did it force us to 📝 throw away the previous decompilation of the sound effect playback functions, but OP.EXE also contains obviously copy-pasted code in addition to the common, shared set of library functions. How was that game even built, originally??? No way around compiling that one instance of the "delay until given BGM measure" function separately then, if it insists on using its own instance of the VSync delay function…
Oh well, this separated layout still works better for the later games, and consistency is good. Smooth sailing with all of the other functions, at least.

Next up: One more of these, which might even end up completing the 📝 transition to our own master.lib header file. In terms of the total amount of ASM code left in the SHARED code segments, we're now 30% done after 3 dedicated pushes. It really shouldn't require 7 more pushes, though!

📝 Posted:
🚚 Summary of:
P0113, P0114
Commits:
150d2c6...6204fdd, 6204fdd...967bb8b
💰 Funded by:
Lmocinemod
🏷 Tags:

Alright, tooling and technical debt. Shouldn't be really much to talk about… oh, wait, this is still ReC98 :tannedcirno:

For the tooling part, I finished up the remaining ergonomics and error handling for the 📝 sprite converter that Jonathan Campbell contributed two months ago. While I familiarized myself with the tool, I actually ran into some unreported errors myself, so this was sort of important to me. Still got no command-line help in there, but the error messages can now do that job probably even better, since we would have had to write them anyway.

So, what's up with the technical debt then? Well, by now we've accumulated quite a number of 📝 ASM code slices that need to be either decompiled or clearly marked as undecompilable. Since we define those slices as "already reverse-engineered", that decision won't affect the numbers on the front page at all. But for a complete decompilation, we'd still have to do this someday. So, rather than incorporating this work into pushes that were purchased with the expectation of measurable progress in a certain area, let's take the "anything goes" pushes, and focus entirely on that during them.

The second code segment seemed like the best place to start with this, since it affects the largest number of games simultaneously. Starting with TH02, this segment contains a set of random "core" functions needed by the binary. Image formats, sounds, input, math, it's all there in some capacity. You could maybe call it all "libzun" or something like that? But for the time being, I simply went with the obvious name, seg2. Maybe I'll come up with something more convincing in the future.


Oh, but wait, why were we assembling all the previous undecompilable ASM translation units in the 16-bit build part? By moving those to the 32-bit part, we don't even need a 16-bit TASM in our list of dependencies, as long as our build process is not fully 16-bit.
And with that, ReC98 now also builds on Windows 95, and thus, every 32-bit Windows version. 🎉 Which is certainly the most user-visible improvement in all of these two pushes. :onricdennat:


Back in 2015, I already decompiled all of TH02's seg2 functions. As suggested by the Borland compiler, I tried to follow a "one translation unit per segment" layout, bundling the binary-specific contents via #include. In the end, it required two translation units – and that was even after manually inserting the original padding bytes via #pragma codestring… yuck. But it worked, compiled, and kept the linker's job (and, by extension, segmentation worries) to a minimum. And as long as it all matched the original binaries, it still counted as a valid reconstruction of ZUN's code. :zunpet:

However, that idea ultimately falls apart once TH03 starts mixing undecompilable ASM code in between C functions. Now, we officially have no choice but to use multiple C and ASM translation units, with maybe only just one or two #includes in them…

…or we finally start reconstructing the actual seg2 library, turning every sequence of related functions into its own translation unit. This way, we can simply reuse the once-compiled .OBJ files for all the binaries those functions appear in, without requiring that additional layer of translation units mirroring the original segmentation.
The best example for this is TH03's almost undecompilable function that generates a lookup table for horizontally flipping 8 1bpp pixels. It's part of every binary since TH03, but only used in that game. With the previous approach, we would have had to add 9 C translation units, which would all have just #included that one file. Now, we simply put the .OBJ file into the correct place on the linker command line, as soon as we can.

💡 And suddenly, the linker just inserts the correct padding bytes itself.
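As for what that flip lookup table contains: Horizontally flipping 8 1bpp pixels means reversing the bit order of a byte, so the table presumably maps each of the 256 possible byte values to its bit-reversed counterpart. A straightforward C++ sketch of that idea, with made-up function names, and nothing like ZUN's almost undecompilable original:

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Reverses the order of the 8 pixels (= bits) in a 1bpp byte.
uint8_t flip_1bpp_byte(uint8_t pixels) {
    uint8_t flipped = 0;
    for(int bit = 0; bit < 8; bit++) {
        // Move the lowest remaining source bit to the bottom of the result,
        // pushing the previously copied bits to the left.
        flipped = static_cast<uint8_t>((flipped << 1) | (pixels & 1));
        pixels >>= 1;
    }
    return flipped;
}

// Generates the flipped counterpart for every possible byte value.
std::array<uint8_t, 256> build_hflip_lut() {
    std::array<uint8_t, 256> lut{};
    for(int i = 0; i < 256; i++) {
        lut[i] = flip_1bpp_byte(static_cast<uint8_t>(i));
    }
    assert(lut[0x80] == 0x01); // leftmost pixel becomes the rightmost one
    return lut;
}
```

Flipping a sprite row would then be one table lookup per byte, plus reversing the order of the bytes within the row.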

The most immediate gains there also happened to come from TH03. Which is also where we did get some tiny RE% and PI% gains out of this after all, by reverse-engineering some of its sprite blitting setup code. Sure, I should have done even more RE here, to also cover those 5 functions at the end of code segment #2 in TH03's MAIN.EXE that were in front of a number of library functions I already covered in this push. But let's leave that to an actual RE push 😛


All in all though, I was just getting started with this; the real gains in terms of removed ASM files are still to come. But in the meantime, the funding situation has become even better in terms of allowing me to focus on things nobody asked for. 🙂 So here's a slightly better idea: Instead of spending two more pushes on this, let's shoot for TH05 MAINE.EXE position independence next. If I manage to get it done, we'll have a 100% position-independent TH05 by the time -Tom- finishes his MAIN.EXE PI demo, rather than the 94% we'd get from just MAIN.EXE. That's bound to make a much better impression on all the people who will then (re-)discover the project.

📝 Posted:
🚚 Summary of:
P0031, P0032, P0033
Commits:
dea40ad...9f764fa, 9f764fa...e6294c2, e6294c2...6cdd229
💰 Funded by:
zorg
🏷 Tags:

The glacial pace continues, with TH05's unnecessarily, inappropriately micro-optimized, and hence, un-decompilable code for rendering the current and high score, as well as the enemy health / dream / power bars. While the latter might still pass as well-written ASM, the former goes to such ridiculous levels that it ends up being technically buggy. If you enjoy quality ZUN code, it's definitely worth a read.

In TH05, this all still is at the end of code segment #1, but in TH04, the same code lies all over the same segment. And since I really wanted to move that code into its final form now, I finally did the research into decompiling from anywhere else in a segment.

Turns out we actually can! It's kinda annoying, though: After splitting the segment after the function we want to decompile, we then need to group the two new segments back together into one "virtual segment" matching the original one. But since all ASM in ReC98 heavily relies on being assembled in MASM mode, we then start to suffer from MASM's group addressing quirk. Which then forces us to manually prefix every single function call

with the group name. It's stupidly boring busywork, because of all the function calls you mustn't prefix. Special tooling might make this easier, but I don't have it, and I'm not getting crowdfunded for it.

So while you now definitely can request any specific thing in any of the 5 games to be decompiled right now, it will take slightly longer, and cost slightly more.
(Except for that one big segment in TH04, of course.)

Only one function away from the TH05 shot type control functions now!

📝 Posted:
🚚 Summary of:
P0025, P0026, P0027
Commits:
0cde4b7...261d503
💰 Funded by:
zorg
🏷 Tags:

… yeah, no, we won't get very far without figuring out these drawing routines.
Which process data that comes from the .STD files. Which has various arrays related to the background… including one to specify the scrolling speed. And wait, setting that to 0 actually is what starts a boss battle?

So, have a TH05 Boss Rush patch: 2018-12-26-TH05BossRush.zip Theoretically, this should have also worked for TH04, but for some reason, the Stage 3 boss gets stuck on the first phase if we do this?

Here's the diff for the Boss Rush. Turning it into a thcrap-style Skipgame patch is left as an exercise for the reader.