That was quick: In a surprising turn of events, Romantique Tp themselves came in just one day after the last blog post went up, updated me with their current and much more positive opinion on Sound Canvas VA, and confirmed that real SC-88Pro hardware clamps invalid Reverb Macro values to the specified range. I promised to release a new Sound Canvas VA BGM pack for free once I knew the exact behavior of real hardware, so let's go right back to Seihou and also integrate the necessary SysEx patches into the game's MIDI player behind a toggle. This would also be a great occasion to quickly incorporate some long overdue code maintenance and build system improvements, and a migration to C++ modules in particular. When I started the Shuusou Gyoku Linux port a year ago, the combination of modules and <windows.h> threw lots of weird errors and even crashed the Visual Studio compiler. But nowadays, Microsoft even uses modules in the Office code base. This must mean that these issues are fixed by now, right?
Well, there's still a bug that causes the modularized C++ standard library to be basically unusable in combination with the static analyzer, and somehow, I was the first one to report it. So it's 3½ years after C++20 was finalized, and somehow, modules are still a bleeding-edge feature and a second-class citizen in even the compiler that supports them the best. I want fast compile times already! 😕
Thankfully, Microsoft agrees that this is a bug, and will work on it at some point. While we're waiting, let's return to the original plan of decompiling the endings of the one PC-98 Touhou game that still needed them decompiled.
After the textless slideshows of TH01, TH02 was the first Touhou game to feature lore text in its endings. Given that this game stores its 📝 in-game dialog text in fixed-size plaintext files, you wouldn't expect anything more fancy for the endings either, so it's not surprising to see that the END?.TXT files use the same concept, with 44 visible bytes per line followed by two bytes of padding for the CR/LF newline sequence. Each of these lines is typed to the screen in full, with all whitespace and a fixed time for each 2-byte chunk.
As a result, everything surrounding the text is just as hardcoded as TH01's endings were, which once again opens up the possibility of freely integrating all sorts of creative animations without the overhead of an interpreter. Sadly, TH02 only makes use of this freedom in a mere two cases: the picture scrolling effect from Reimu's head to Marisa's head in the Bad Endings, and a single hardware palette change in the Good Endings.
Hardcoding also still made sense for this game because of how the ending text is structured. The Good and Bad Endings for the individual shot types respectively share 55% and 77% of their text, and both only diverge after the first 27 lines. In straight-line procedural code, this translates to one branch for each shot type at a single point, neatly matching the high-level structure of these endings.
But that's the end of the positive or neutral aspects I can find in these scripts. The worst part, by far, is ZUN's approach to displaying the text in alternating colors, and how it impacts the entire structure of the code.
The simplest solution would have involved a hardcoded array with the color of each line, just like how the in-game dialogs store the face IDs for each text box. But for whatever reason, ZUN did not apply this piece of wisdom to the endings and instead hardcoded these color changes by… mutating a global variable before calling the text typing function for every individual line. This approach ruins any possibility of compressing the script code into loops. While ZUN did use loops, all of them are very short because they can only last until the next color change. In the end, the code contains 90 explicitly spelled-out calls to the 5-parameter line typing function that only vary in the pointer to each line and in the slower speed used for the one or two final lines of each ending. As usual, I've deduplicated the code in the ReC98 repository down to a sensible level, but here's the full inlined and macro-expanded horror:
It's highly likely that this is what ZUN hacked into his PC-98 and was staring at back in 1997.
All this redundancy bloats the two script functions for the 6 endings to a whopping 3,344 bytes inside TH02's MAINE.EXE. In particular, the single function that covers the three Good Endings ends up with a total of 631 x86 ASM instructions, making it the single largest function in TH02 and the 7th longest function in all of PC-98 Touhou. If the 📝 single-executable build for TH02's debloated and anniversary branches ends up needing a few more KB to reduce its size below the original MAIN.EXE, there are lots of opportunities to compress it all.
The ending text can also be fast-forwarded by holding any key. As we've come to expect for this sort of ZUN code, the text typing function runs its own rendering loop with VSync delays and input detection, which means that we 📝 once📝 again have to talk about the infamous quirk of the PC-98 keyboard controller in relation to held keys. We've still got 54 not yet decompiled calls to input detection functions left in this codebase, are you excited yet?!
Holding any key speeds up the text of all ending lines before the last one by displaying two kana/kanji instead of one per rendered frame and reducing the delay between the rendered frames to 1/3 of its regular length. In pseudocode:
for(i = 0; i < number_of_2_byte_chunks_on_displayed_line; i++) {
input = convert_current_pc98_bios_input_state_to_game_specific_bitflags();
add_chunk_to_internal_text_buffer(i);
blit_internal_text_buffer_from_the_beginning();
if(input == INPUT_NONE) {
// Basic case, no key pressed
frame_delay(frames_per_chunk);
} else if((i % 2) == 1) {
// Key pressed, chunk number is odd.
frame_delay(frames_per_chunk / 3);
} else {
// Key pressed, chunk number is even.
// No delay; next iteration adds to the same frame.
}
}
This is exactly the kind of code you would write if you wanted to deliberately maximize the impact of this hardware quirk. If the game happens to read the current input state right after a key up scancode for the last previously held and game-relevant key, it will then wrongly take the branch that uninterruptibly waits for the regular, non-divided amount of VSync interrupts. In my tests, this broke the rhythm of the fast-forwarded text about once per line. Note how this branch can also be taken on an even chunk: Rendering glyphs straight from font ROM to VRAM is not exactly cheap, and if each iteration (needlessly) blits one more full-width glyph than the last one, the probability of a key up scancode arriving in the middle of a frame only increases.
The fact that TH02 allows any of the supported input keys to be held points to another detail of this quirk I haven't mentioned so far. If you press multiple keys at once, the PC-98's keyboard controller only sends the periodic key up scancodes as long as you are holding the last key you pressed. Because the controller only remembers this last key, pressing and releasing any other key would get rid of these scancodes for all keys you are still holding.
As usual, this ZUN bug only occurs on real hardware and with DOSBox-X's correct emulation of the PC-98 keyboard controller.
After the ending, we get to witness the most seamless transition between ending and Staff Roll in any Touhou game as the BGM immediately changes to the Staff Roll theme, and the ending picture is shifted into the same place where the Staff Roll pictures will appear. Except that the code misses the exact position by four pixels, and cuts off another four pixels at the right edge of the picture:
Also, note the green 1-pixel line at the right edge of this specific picture. This is a bug in the .PI file where the picture is indeed shifted one pixel to the left.
What follows is a comparatively large amount of unused content for a single scene. It starts right at the end of this underappreciated 11-frame animation loaded from ENDFT.BFT:
Wastefully using the 4bpp BFNT format. The single frame at the end of the animation is unused; while it might look identical to the ZUN glyphs later on in the Staff Roll, that's only because both are independently rendered boldfaced versions of the same font ROM glyphs. Then again, it does prove that ZUN created this animation on a PC-98 model made by NEC, as the Epson clones used a font ROM with a distinctly different look.
TH02's Staff Roll is also unique for the pre-made screenshots of all 5 stages that get shown together with a fancy rotating rectangle animation while the Staff Roll progresses in sync with the BGM. The first interesting detail shows up immediately after the first image, where the code jumps over one of the 320×200 quarters in ED06.PI, leaving the screenshot of the Stage 2 midboss unused.
All of the cutscenes in PC-98 Touhou store their pictures as 320×200 quarters within a single 640×400 .PI file. Anywhere else, all four quarters are supposed to be displayed with the same palette specified in the .PI header, but TH02's Staff Roll screenshots are also unique in how all quarters beyond the top-left one require palettes loaded from external .RGB files to look right. Consequently, the game doesn't clearly specify the intended palette of this unused screenshot, and leaves two possibilities:
The unused second 320×200 quarter of TH02's ED06.PI, displayed in the Stage 2 color palette used in-game.
The unused second 320×200 quarter of TH02's ED06.PI, displayed in the palette specified in the .PI header. These are the colors you'd see when looking at the file in a .PI viewer, when converting it into another format with the usual tools, or in sprite rips that don't take TH02's hardcoded palette changes into account. These colors are only intended for the Stage 1 screenshot in the top-left quarter of the file.
The unused second 320×200 quarter of TH02's ED06.PI, displayed in the palette from ED06B.RGB, which the game uses for the following screenshot of the Meira fight. As it's from the same stage, it almost matches the in-game colors seen in 1️⃣, and only differs in the white color (#FFF) being slightly red-tinted (#FCC).
It might seem obvious that the Stage 2 palette in 1️⃣ is the correct one, but ZUN indeed uses ED06B.RGB with the red-tinted white color for the following screenshot of the Meira fight. Not only does this palette not match Meira's in-game appearance, but it also discolors the rectangle animation and the surrounding Staff Roll text:
Also, that tearing on frame #1 is not a recording artifact, but the expected result of yet another VSync-related landmine. 💣 This time, it's caused by the combination of 1) the entire sequence from the ending to the verdict screen being single-buffered, and 2) this animation always running immediately after an expensive operation (640×400 .PI image loading and blitting to VRAM, 320×200 VRAM inter-page copy, or hardware palette loading from a packed file), without waiting for the VSync interrupt. This makes it highly likely for the first frame of this animation to start rendering at a point where the (real or emulated) electron beam has already traveled over a significant portion of the screen.
But when I went into Stage 2 to compare these colors to the in-game palette, I found something even more curious. ZUN obviously made this screenshot with the Reimu-C shot type, but one of the shot sprites looks slightly different from how it does in-game. These screenshots must have been made earlier in development when the sprite didn't yet feature the second ring at the top. The same applies to the Stage 4 screenshot later on:
Finally, the rotating rectangle animation delivers one more minor rendering bug. Each of the 20 frames removes the largest and outermost rectangle from VRAM by redrawing it in the same black color of the background before drawing the remaining rectangles on top. The corners of these rectangles are placed on a shrinking circle that starts with a radius of 256 pixels and is centered at (192, 200), which results in a maximum possible X coordinate of 448 for the rightmost corner of the rectangle. However, the Staff Roll text starts at an X coordinate of 416, causing the first two full-width glyphs to still fall within the area of the circle. Each line of text is also only rendered once before the animation. So if any of the rectangles then happens to be placed at an angle that causes its edges to overlap the text, its removal will cut small holes of black pixels into the glyphs:
The green dotted circle corresponds to the newest/smallest rectangle. Note how ZUN only happened to avoid the holes for the two final animations by choosing an initial angle and angular velocity that causes the resulting rectangles to just barely avoid touching the TEST PLAYER glyphs.
At least the following verdict screen manages to have no bugs aside from the slightly imperfect centering of its table values, and only comes with a small amount of additional bloat. Let's get right to the mapping from skill points to the 12 title strings from END3.TXT, because one of them is not like the others:
Skill
Title
≥100
神を超えた巫女!!
90 - 99
もはや神の領域!!
80 - 99
A級シューター!!
78 - 79
うきうきゲーマー!
77
バニラはーもにー!
70 - 76
うきうきゲーマー!
60 - 69
どきどきゲーマー!
50 - 59
要練習ゲーマー
40 - 49
非ゲーマー級
30 - 39
ちょっとだめ
20 - 29
非人間級
10 - 19
人間でない何か
≤9
死んでいいよ、いやいやまじで
Looks like I'm the first one to document the required skill points as well? Everyoneelse just copy-pastes END3.TXT without providing context.
So how would you get exactly 77 and achieve vanilla harmony? Here's the formula:
* Ranges from 0 (Easy) to 3 (Lunatic). † Across all 5 stages.
With Easy Mode capping out at 85, this is possible on every difficulty, although it requires increasingly perfect play the lower you go. Reaching 77 on purpose, however, pretty much demands a careful route through the entire game, as every collected and missed item will influence the item_skill in some way. This almost feels it's like the ultimate challenge that this game has to offer. Looking forward to the first Vanilla Harmony% run!
And with that, TH02's MAINE.EXE is both fully position-independent and ready for translation. There's a tiny bit of undecompiled bit of code left in the binary, but I'll leave that for rounding up a future TH02 decompilation push.
With one of the game's skill-based formulas decompiled, it's fitting to round out the second push with the other two. The in-game bonus tables at the end of a stage also have labels that we'd eventually like to translate, after all.
The bonus formula for the 4 regular stages is also the first place where we encounter TH02's rank value, as well as the only instance in PC-98 Touhou where the game actually displays a rank-derived value to the player. KirbyComment and Colin Douglas Howell accurately documented the rank mechanics over at Touhou Wiki two years ago, which helped quite a bit as rank would have been slightly out of scope for these two pushes. 📝 Similar to TH01, TH02's rank value only affects bullet speed, but the exact details of how rank is factored in will have to wait until RE progress arrives at this game's bullet system.
These bonuses are calculated by taking a sum of various gameplay metrics and multiplying it with the amount of point items collected during the stage. In the 4 regular stages, the sum consists of:
難易度
Difficulty level* × 2,000
ステージ
(Rank + 16) × 200
ボム
max((2,500 - (Bombs used* × 500)), 0)
ミス
max((3,000 - (Lives lost* × 1,000)), 0)
靈撃初期数
(4 - Starting bombs) × 800
靈夢初期数
(5 - Starting lives) × 1,000
* Within this stage, across all continues.
Yup, 封魔録.TXT does indeed document this correctly.
As rank can range from -6 to +4 on Easy and +16 on the other difficulties, this sum can range between:
Easy
Normal
Hard
Lunatic
Minimum
2,800
4,800
6,800
8,800
Maximum
16,700
21,100
23,100
25,100
The sum for the Extra Stage is not documented in 封魔録.TXT:
クリア
10,000
ミス回数
max((20,000 - (Lives lost × 4,000)), 0)
ボム回数
max((20,000 - (Bombs used × 4,000)), 0)
クリアタイム
⌊max((20,000 - Boss fight frames*), 0) ÷ 10⌋ × 10
* Amount of frames spent fighting Evil Eye Σ, counted from the end of the pre-boss dialog until the start of the defeat animation.
And that's two pushes packed full of the most bloated and copy-pasted code that's unique to TH02! So bloated, in fact, that TH02 RE as a whole jumped by almost 7%, which in turn finally pushed overall RE% over the 60% mark. 🎉 It's been a while since we hit a similar milestone; 50% overall RE happened almost 2 years ago during 📝 P0204, a month before I completed the TH01 decompilation.
Next up: Continuing to wait for Microsoft to fix the static analyzer bug until May at the latest, and working towards the newly popular dreams of TH03 netplay by looking at some of its foundational gameplay code.
And we're back to PC-98 Touhou for a brief interruption of the ongoing Shuusou Gyoku Linux port.
Let's clear some of the Touhou-related progress from the backlog, and use
the unconstrained nature of these contributions to prepare the
📝 upcoming non-ASCII translations commissioned by Touhou Patch Center.
The current budget won't cover all of my ambitions, but it would at least be
nice if all text in these games was feasibly translatable by the time I
officially start working on that project.
At a little over 3 pushes, it might be surprising to see that this took
longer than the
📝 TH03/TH04/TH05 cutscene system. It's
obvious that TH02 started out with a different system for in-game dialog,
but while TH04 and TH05 look identical on the surface, they only
actually share 30% of their dialog code. So this felt more like decompiling
2.4 distinct systems, as opposed to one identical base with tons of
game-specific differences on top.
The table of contents was pretty popular last time around, so let's have
another one:
Let's start with the ones from TH04 and TH05, since they are not that
broken. For TH04, ZUN started out by copy-pasting the cutscene system,
causing the result to inherit many of the caveats I already described in the
cutscene blog post:
It's still a plaintext format geared exclusively toward full-width
Japanese text.
The parser still ignores all whitespace, forcing ASCII text into hacks
with unassigned Shift-JIS lead bytes outside the second byte of a 2-byte
chunk.
Commands are still preceded by a 0x5C byte, which renders
as either a \ or a ¥ depending on your font and
interpretation of Shift-JIS.
Command parameters are parsed in exactly the same way, with all the same
limits.
A lot of the same script commands are identical, including 7 of them
that were not used in TH04's original dialog scripts.
Then, however, he greatly simplified the system. Mainly, this was done by
moving text rendering from the PC-98 graphics chip to the text chip, which
avoids the need for any text-related unblitting code, but ZUN also added a
bunch of smaller changes:
The player must advance through every dialog box by releasing any held
keys and then pressing any key mapped to a game action. There are no
timeouts.
The delay for every 2 bytes of text was doubled to 2 frames, and can't
be overridden.
Instead of holding ESC to fast-forward, pressing any key
will immediately print the entire rest of a text box.
Dialogs run in their own single-buffered frame loop, interrupting the
rest of the game. The other VRAM page keeps the background pixels required
for unblitting the face images.
All script commands that affect the graphics layer are preceded by a
1-frame delay. ZUN most likely did this because of the single-buffered
nature, as it prevents tearing on the first frame by waiting for the CRT
beam to return to the top-left corner before changing any pixels.
Both boxes are intended to contain up to 30 half-width characters on
each of their up to 3 lines, but nothing in the code enforces these limits.
There is no support for automatic line breaks or starting new boxes.
While it would seem that TH05 has no issues with ASCII 0x20
spaces, the text as a whole is still blindly processed two bytes at a
time, and any commands can only appear at even byte positions within a
line. I dimmed the VRAM pixels to 25% of their original brightness to make the
text easier to read.
The same text backported to TH04, additionally demonstrating how that
game's dialog system inherited the whitespace skipping behavior of
TH03's cutscene system. Just like there, ASCII 0x20 spaces
only work at odd byte positions because the game treats them as the
trailing byte of a full-width Shift-JIS codepoint. I don't know how
large the budget for the upcoming non-ASCII translations will be, but
I'm going to fix this even in the very basic fully static variant.
I dimmed the VRAM pixels to 25% of their original brightness to make the
text easier to read.
TH05 then moved from TH04's plaintext scripts to the binary
.TX2 format while removing all the unused commands copy-pasted
from the cutscene system. Except for a
single additional command intended to clear a text box, TH05's dialog
system only supports a strict subset of the features of TH04's system.
This change also introduced the following differences compared to TH04:
The game now stores the dialog of all 4 playable characters in the same
file, with a (4 + 1)-word header that indicates the byte offset
and length of each character's script. This way, it can load only the one
script for the currently played character.
Since there is no need for whitespace in a binary format, you can now
use ASCII 0x20 spaces even as the first byte of a 2-byte text
chunk! 🥳
All command parameters are now mandatory.
Filenames are now passed directly by pointer to the respective game
function. Therefore, they now need to be null-terminated, but can in turn be
as long as
📝 the number of remaining bytes in the allocated dialog segment.
In practice though, the game still runs on DOS and shares its restriction of
8.3 filenames…
When starting a new dialog box, any existing text in the other box is
now colored blue.
Thanks to ZUN messing up the return values of the command-interpreting
switch function, you can effectively use only line break and gaiji commands in the middle of text. All other
commands do execute, but the interpreter then also treats their command byte
as a Shift-JIS lead byte and places it in text RAM together with whatever
other byte follows in the script.
This is why TH04 can and does put its \= commandsinto the boxes
started with the 0 or 1 commands, but TH05 has to
put its 0x02 commands before the equivalent 0x0D.
Writing the 0x02 byte to text RAM results in an character, which is simply the PC-98 font ROM's glyph for that
Shift-JIS codepoint. Also note how each face change is now
preceded by two frames of delay.
No problem in TH04. Note how the dialog also runs a bit faster – TH04
only adds the aforementioned one frame of delay to each face change, and
has fewer two-byte chunks of text to display overall.
For modding these files, you probably want to use TXDEF from
-Tom-'s MysticTK. It decodes these
files into a text representation, and its encoder then takes care of the
character-specific byte offsets in the 10-byte header. This text
representation simplifies the format a lot by avoiding all corner cases and
landmines you'd experience during hex-editing – most notably by interpreting
the box-starting 0x0D as a
command to show text that takes a string parameter, avoiding the broken
calls to script commands in the middle of text. However, you'd still have to
manually ensure an even number of bytes on every line of text.
In the entry function of TH05's dialog loop, we also encounter the hack that
is responsible for properly handling
📝 ZUN's hidden Extra Stage replay. Since the
dialog loop doesn't access the replay inputs but still requires key presses
to advance through the boxes, ZUN chose to just skip the dialog altogether in the
specific case of the Extra Stage replay being active, and replicated all
sprite management commands from the dialog script by just hardcoding
them.
And you know what? Not only do I not mind this hack, but I would have
preferred it over the actual dialog system! The aforementioned sprite
management commands effectively boil down to manual memory management,
deallocating all stage enemy and midboss sprites and thus ensuring that the
boss sprites end up at specific master.lib sprite IDs (patnums). The
hardcoded boss rendering function then expects these sprites to be available
at these exact IDs… which means that the otherwise hardcoded bosses can't
render properly without the dialog script running before them.
There is absolutely no excuse for the game to burden dialog scripts with
this functionality. Sure, delayed deallocation would allow them to blit
stage-specific sprites, but the original games don't do that; probably
because none of the two games feature an unblitting command. And even if
they did, it would have still been cleaner to expose the boss-specific
sprite setup as a single script command that can then also be called from
game code if the script didn't do so. Commands like these just are a recipe
for crashes, especially with parsers that expect fullwidth Shift-JIS
text and where misaligned ASCII text can easily cause these commands to be
skipped.
But then again, it does make for funny screenshot material if you
accidentally the deallocation and then see bosses being turned into stage
enemies:
Some of the more amusing consequences of not calling the
sprite-deallocating
\c /
0x04 command inside a dialog
script.
In the case of 4️⃣, the game then even crashes on this frame at the end
of the dialog, in a way that resembles the infamous
📝 TH04 crash before Stage 5 Yuuka if no EMS driver is loaded.
Both the stage- and boss-specific BFNT sprites are loaded into memory at
this point, leaving no room for the 256×256-pixel background image on
the size-limited master.lib heap.
With all the general details out of the way, here's the command reference:
0 1
0x00 0x01
Selects either the player character (0) or the boss (1) as the
currently speaking character, and moves the cursor to the beginning of
the text box. In TH04, this command also directly starts the new dialog
box, which is probably why it's not prefixed with a \ as it
only makes sense outside of text. TH05 requires a separate 0x0D command to do the
same.
\=1
0x02 0x!!
Replaces the face portrait of the currently active speaking
character with image #1 within her .CD2
file.
\=255
0x02 0xFF
Removes the face portrait from the currently active text box.
\l,filename
0x03 filename 0x00
Calls master.lib's super_entry_bfnt() function, which
loads sprites from a BFNT file to consecutive IDs starting at the
current patnum write cursor.
\c
0x04
Deallocates all stage-specific BFNT sprites (i.e., stage enemies and
midbosses), freeing up conventional RAM for the boss sprites and
ensuring that master.lib's patnum write cursor ends up at
128 /
180.
In TH05's Extra Stage, this command also replaces
📝 the sprites loaded from MIKO16.BFT with the ones from ST06_16.BFT.
\d
Deallocates all face portrait images.
The game automatically does this at the end of each dialog sequence.
However, ZUN wanted to load Stage 6 Yuuka's 76 KiB of additional
animations inside the script via \l, and would have once again
run up against the master.lib heap size limit without that extra free
memory.
\m,filename
0x05 filename 0x00
Stops the currently playing BGM, loads a new one from the given
file, and starts playback.
\m$
0x05 $ 0x00
Stops the currently playing BGM.
Note that TH05 interprets $ as a null-terminated filename as
well.
\m*
Restarts playback of the currently loaded BGM from the
beginning.
\b0,0,0
0x06 0x!!!!0x!!!!0x!!
Blits the master.lib patnum with the ID indicated by the third
parameter to the current VRAM page at the top-left screen position
indicated by the first two parameters.
\e0
Plays the sound effect with the given ID.
\t100
Sets palette brightness via master.lib's
palette_settone() to any value from 0 (fully black) to 200
(fully white). 100 corresponds to the palette's original colors.
\fo1
\fi1
Calls master.lib's palette_black_out() or
palette_black_in() to play a hardware palette fade
animation from or to black, spending roughly 1 frame on each of the 16 fade steps.
\wo1
\wi1
0x09 0x!!
0x0A 0x!!
Calls master.lib's palette_white_out() or
palette_white_in() to play a hardware palette fade
animation from or to white, spending roughly 1 frame on each of the 16 fade steps. The
TH05 version of 0x09 also clears the text in both boxes
before the animation.
\n
0x0B
Starts a new line by resetting the X coordinate of the TRAM cursor
to the left edge of the text area and incrementing the Y coordinate.
The new line will always be the next one below the last one that was
properly started, regardless of whether the text previously wrapped to
the next TRAM row at the edge of the screen.
\g8
Plays a blocking 8-frame screen shake
animation. Copy-pasted from the cutscene parser, but actually used right
at the end of the dialog shown before TH04's Bad Ending.
\ga0
0x0C 0x!!
Shows the gaiji with the given ID from 0 to 255
at the current cursor position, ignoring the per-glyph delay.
\k0
Waits 0 frames (0 = forever) for any key
to be pressed before continuing script execution.
Takes the current dialog cursor as the top-left corner of a
240×48-pixel rectangle, and replaces all text RAM characters within that
rectangle with whitespace.
This is only used to clear the player character's text box before
Shinki's final いくよ‼ box. Shinki has two
consecutive text boxes in all 4 scripts here, and ZUN probably wanted to
clear the otherwise blue text to imply a dramatic pause before Shinki's
final sentence. Nice touch.
(You could, however, also use it after a
box-ending 0xFF command to mess with text RAM in
general.)
\#
Quits the currently running loop. This returns from either the text
loop to the command loop, or it ends the dialog sequence by returning
from the command loop back to gameplay. If this stage of the game later
starts another dialog sequence, it will start at the next script
byte.
\$
Like \#, but first waits for any key to be
pressed.
0xFF
Behaves like TH04's \$ in the text loop, and like
\# in the command loop. Hence, it's not possible in TH05 to
automatically end a text box and advance to the next one without waiting
for a key press.
Unused commands are in gray.
At the end of the day, you might criticize the system for how its landmines
make it annoying to mod in ASCII text, but it all works and does what it's
supposed to. ZUN could have written the cleanest single and central
Shift-JIS iterator that properly chunks a byte buffer into halfwidth and
fullwidth codepoints, and I'd still be throwing it out for the upcoming
non-ASCII translations in favor of something that either also supports UTF-8
or performs dictionary lookups with a full box of text.
The only actual bug can be found in the input detection, which once
again doesn't correctly handle the infamous key
up/key down scancode quirk of PC-98 keyboards. All it takes
is one wrongly placed input polling call, and suddenly you have to think
about how the update cycle behind the PC-98 keyboard state bytes
might cause the game to run the regular 2-frame delay for a single
2-byte chunk of text before it shows the full text of a box after
all… But even this bug is highly theoretical and could probably only be
observed very, very rarely, and exclusively on real hardware.
The same can't be said about TH02 though, but more on that later. Let's
first take a look at its data, which started out much simpler in that game.
The STAGE?.TXT files contain just raw Shift-JIS text with no
trace of commands or structure. Turning on the whitespace display feature in
your editor reveals how the dialog system even assumes a fixed byte
length for each box: 36 bytes per line which will appear on screen, followed
by 4 bytes of padding, which the original files conveniently use to visually
split the lines via a CR/LF newline sequence. Make sure to disable trimming
of trailing whitespace in your editor to not ruin the file when modding the
text…
Two boxes from TH02's STAGE5.TXT with visualized whitespace.
These also demonstrate how the CR/LF newlines only make up 2 of the 4
padding bytes, and require each line to be padded with two more bytes; you
could not use these trailing spaces for actual text. Also note how
the exquisite mixture of fullwidth and halfwidth spaces demands the text to
be viewed with only the most metrically consistent monospace fonts to
preserve the intended alignment. 🍷 It appears quite misaligned on my phone.
Consequently, everything else is hardcoded – every effect shown between text
boxes, the face portrait shown for each box, and even how many boxes are
part of each dialog sequence. Which means that the source code now contains
a
long hardcoded list of face IDs for most of the text boxes in the game,
with the rest being part of the
dedicated hardcoded dialog scripts for 2/3 of the
game's stages.
Without the restriction to a fixed set of scripting commands, TH02 naturally
gravitated to having the most varied dialog sequences of all PC-98 Touhou
games. This flexibility certainly facilitated Mima's grand entrance
animation in Stage 4, or the different lines in Stage 4 and 5 depending on
whether you already used a continue or not. Marisa's post-boss dialog even
inserts the number of continues into the text itself – by, you guessed it,
writing to hardcoded byte offsets inside the dialog text before printing it
to the screen. But once again, I have nothing to
criticize here – not even the fact that the alternate dialog scripts have to
mutate the "box cursor" to jump to the intended boxes within the file. I
know that some people in my audience like VMs, but I would have considered
it more bloated if ZUN had implemented a full-blown scripting
language just to handle all these special cases.
Another unique aspect of TH02 is the way it stores its face portraits, which
are infamous for how hard they are to find in the original data files. These
sprites are actually map tiles, stored in MIKO_K.MPN,
and drawn using the same functions used to blit the regular map tiles to the
📝 tile source area in VRAM. We can only guess
why ZUN chose this one out of the three graphics formats he used in TH02:
BFNT supports transparency, but sacrifices one of the 16 colors to do
so. ZUN only used 15 colors for the face portraits, but might have wanted to
keep open the option to use that 16th color. The detailed
backgrounds also suggest that these images were never supposed to be
transparent to begin with.
PI is used for all bigger and non-transparent images, but ZUN would have
had to write a separate small function to blit a 48×48 subsection of such an
image. That certainly wouldn't have stopped him in the TH01 days, but he
probably was already past that point by this game.
That only leaves .MPN. Sure, he did have to slice each face into 9
separate 16×16 "map" tiles to use this format, but that's a small price to
pay in exchange for not having to write any new low-level blitting code,
especially since he must have already had an asset pipeline to generate
these files.
TH02's MIKO_K.PTN, arranged into a 16×16-tile layout that
reveals how these tiles are combined into face portraits. MPNDEF from -Tom-'s MysticTK conveniently uses
this exact layout in its .BMP output. Earlier MPNDEF
versions crashed when converting this file as its 256 tiles led to an
8-bit overflow bug, so make sure you've updated to the current version
from the end of October 2023 if you want to convert this file yourself.
The format stores the 4 bitplanes of each 16×16 tile in order, so good
luck finding a different planar image viewer that would support both
such a tiled layout and a custom palette. Sometimes, a weird
internal format is the best type of obfuscation.
And since you're certainly wondering about all these black tiles at the
edges: Yes, these are not only part of the file and pad it from the required
240×192 pixels to 256×256, but also kept in memory during a stage, wasting
9.5 KiB of conventional RAM. That's 172 seconds of potential input
replay data, just for those people who might still think that we need EMS
for replays.
Alright, we've got the text, we've got the faces, let's slide in the box and
display it all on screen. Apparently though, we also have to blit the player
and option sprites using raw, low-level master.lib function calls in the
process? This can't be right, especially because ZUN
always blits the option sprite associated with the Reimu-A shot type,
regardless of which one the player actually selected. And if you keep moving
above the box area before the dialog starts, you get to see exactly how
wrong this is:
Let's look closer at Reimu's sprite during the slide-in animation, and in
the two frames before:
This one image shows off no less than 4 bugs:
ZUN blits the stationary player sprite here, regardless of whether the
player was previously moving left or right. This is a nice way of indicating
that Reimu stops moving once the dialog starts, but maybe ZUN should
have unblitted the old sprite so that the new one wouldn't have appeared on
top. The game only unblits the 384×64 pixels covered by the dialog box on
every frame of the slide-in animation, so Reimu would only appear correctly
if her sprite happened to be entirely located within that area.
All sprites are shifted up by 1 pixel in frame 2️⃣. This one is not a
bug in the dialog system, but in the main game loop. The game runs the
relevant actions in the following order:
Invalidate any map tiles covered by entities
Redraw invalidated tiles
Decrement the Y coordinate at the top of VRAM according to the
scroll speed
Update and render all game entities
Scroll in new tiles as necessary according to the scroll speed, and
report whether the game has scrolled one pixel past the end of the
map
If that happened, pretend it didn't by incrementing the value
calculated in #3 for all further frames and skipping to
#8.
Issue a GDC SCROLL command to reflect the line
calculated in #3 on the display
Wait for VSync
Flip VRAM pages
Start boss if we're past the end of the map
The problem here: Once the dialog starts, the game has already rendered
an entire new frame, with all sprites being offset by a new Y scroll
offset, without adjusting the graphics GDC's scroll registers to
compensate. Hence, the Y position in 3️⃣ is the correct one, and the
whole existence of frame 2️⃣ is a bug in itself. (Well… OK, probably a
quirk because speedrunning exists, and it would be pretty annoying to
synchronize any video regression tests of the future TH02 Anniversary
Edition if it renders one fewer frame in the middle of a stage.)
ZUN blits the option sprites to their position from frame 1️⃣. This
brings us back to
📝 TH02's special way of retaining the previous and current position in a two-element array, indexed with a VRAM page ID.
Normally, this would be equivalent to using dedicated prev and
cur structure fields and you'd just index it with the back page
for every rendering call. But if you then decide to go single-buffered for
dialogs and render them onto the front page instead…
Note that fixing bug #2 would not cancel out this one – the sprites would
then simply be rendered to their position in the frame before 1️⃣.
And of course, the fixed option sprite ID also counts as a bug.
As for the boxes themselves, it's yet another loop that prints 2-byte chunks
of Shift-JIS text at an even slower fixed interval of 3 frames. In an
interesting quirk though, ZUN assumes that every box starts with the name of
the speaking character in its first two fullwidth Shift-JIS characters,
followed by a fullwidth colon. These 6 bytes are displayed immediately at
the start of every box, without the usual delay. The resulting alignment
looks rather janky with Genjii, whose single right-padded 亀
kanji looks quite awkward with the fullwidth space between the name
and the colon. Kind of makes you wonder why ZUN just didn't spell out his
proper name, 玄爺, instead, but I get the stylistic
difference.
In Stage 4, the two-kanji assumption then breaks with Marisa's three-kanji
name, which causes the full-width colon to be printed as the first delayed
character in each of her boxes:
That's all the issues and quirks in the system itself. The scripts
themselves don't leave much room for bugs as they basically just loop over
the hardcoded face ID array at this level… until we reach the end of the
game. Previously, the slide-in animation could simply use the tile
invalidation and re-rendering system to unblit the box on each frame, which
also explained why Reimu had to be separately rendered on top. But this no
longer works with a custom-rendered boss background, and so the game just
chooses to flood-fill the area with graphics chip color #0:
Then again, transferring pixels from the back page would be just
as wrong as they lag one frame behind. No way around capturing these 384×64
pixels to main memory here… Oh well, this flood-fill at least adds even more
legibility on top of the already half-transparent text box. A property that
the following dialog sequence unfortunately lacks…
For Mima's final defeat dialog though, ZUN chose to not even show the box.
He might have realized the issue by that point, or simply preferred the more
dramatic effect this had on the lines. The resulting issues, however, might
even have ramifications for such un-technical things as lore and
character dynamics. As it turns out, the code
for this dialog sequence does in fact render Mima's smiling face for all
boxes?! You only don't see it in the original game because it's rendered to
the other VRAM page that remains invisible during the dialog sequence:
Caution, flashing lights.
Here's how I interpret the situation:
The function that launches into the final part of the dialog script
starts with dedicated
code to re-render Mima to the back page, on top of the previously
rendered planet background. Since the entire script runs on the front
page (and thus, on top of the previous frame) and the game launches into
the ending immediately after, you don't ever get to see this new partial
frame in the original game.
Showing this partial frame would also ensure that you can actually
read the dialog text without a surrounding box. Then, the white
letters won't ever be put on top of any white bullets – or, worse, be completely invisible if the
dialog is triggered in the middle of Reimu-B's bomb animation, which
fills VRAM with lots of white pixels.
Hence, we've got enough evidence to classify not showing the back page
as a ZUN
bug. 🐞
However, Mima's smiling face jars with the words she says here. Adding
the face would deviate more significantly from the original game than
removing the player shot, item, bullet, or spark sprites would. It's
imaginable that ZUN just forgot about the dedicated code that
re-rendered just Mima to the back page, but the faces add
something to the dialog, and ZUN would have clearly noticed and
fixed it if their absence wasn't intended. Heck, ZUN might have just put
something related to Mima into the code because TH02's dialog system has
no way of not drawing a face for a dialog box. Filling the face
area with graphics chip color #0, as seen in the first and third boxes
of the Extra Stage pre-boss dialog, would have been an alternative, but
that would have been equally wrong with regard to the background.
Hence, the invisible face portrait from the original game is a ZUN
quirk. 🎺
So, the future TH02 Anniversary Edition will fix the bug by showing
the back page, but retain the quirk by rewriting the dialog code to
not blit the face.
And with that, we've secured all in-game dialog for the upcoming non-ASCII
translations! The remaining 2/3 of the last push made
for a good occasion to also decompile the small amount of code related to
TH03's win messages, stored in the @0?TX.TXT files. Similar to
TH02's dialog format, these files are also split into fixed-size blocks of
3×60 bytes. But this time, TH03 loads all 60 bytes of a line, including the
CR/LF line breaking codepoints in the original files, into the statically
allocated buffer that it renders from. These control characters are then
only filtered to whitespace by ZUN's graph_putsa_fx() function.
If you remove the line breaks, you get to use the full 60 bytes on every
line.
The final commits went to the MIKO.CFG loading and saving
functions used in TH04's and TH05's OP.EXE, as well as TH04's
game startup code to finally catch up with
📝 TH05's counterpart from over 3 years ago.
This brought us right in front of the main menu rendering code in both TH04
and TH05, which is identical in both games and will be tackled in the next
PC-98 Touhou delivery.
Next up, though: Returning to Shuusou Gyoku, and adding support for SC-88Pro
recordings as BGM. Which may or may not come with a slight controversy…
Technical debt, part 9… and as it turns out, it's highly impractical to
repay 100% of it at this point in development. 😕
The reason: graph_putsa_fx(), ZUN's function for rendering
optionally boldfaced text to VRAM using the font ROM glyphs, in its
ridiculously micro-optimized TH04 and TH05 version. This one sets the
"callback function" for applying the boldface effect by self-modifying
the target of two CALL rel16 instructions… because
there really wasn't any free register left for an indirect
CALL, eh? The necessary distance, from the call site to the
function itself, has to be calculated at assembly time, by subtracting the
target function label from the call site label.
This usually wouldn't be a problem… if ZUN didn't store the resulting
lookup tables in the .DATA segment. With code segments, we
can easily split them at pretty much any point between functions because
there are multiple of them. But there's only a single .DATA
segment, with all ZUN and master.lib data sandwiched between Borland C++'s
crt0 at the
top, and Borland C++'s library functions at the bottom of the segment.
Adding another split point would require all data after that point to be
moved to its own translation unit, which in turn requires
EXTERN references in the big .ASM file to all that moved
data… in short, it would turn the codebase into an even greater
mess.
Declaring the labels as EXTERN wouldn't work either, since
the linker can't do fancy arithmetic and is limited to simply replacing
address placeholders with one single address. So, we're now stuck with
this function at the bottom of the SHARED segment, for the
foreseeable future.
We can still continue to separate functions off the top of that segment,
though. Pretty much the only thing noteworthy there, so far: TH04's code
for loading stage tile images from .MPN files, which we hadn't
reverse-engineered so far, and which nicely fit into one of
Blue Bolt's pending ⅓ RE contributions. Yup, we finally moved
the RE% bars again! If only for a tiny bit.
Both TH02 and TH05 simply store one pointer to one dynamically allocated
memory block for all tile images, as well as the number of images, in the
data segment. TH04, on the other hand, reserves memory for 8 .MPN slots,
complete with their color palettes, even though it only ever uses the
first one of these. There goes another 458 bytes of conventional RAM… I
should start summing up all the waste we've seen so far. Let's put the
next website contribution towards a tagging system for these blog posts.
At 86% of technical debt in the SHARED segment repaid, we
aren't quite done yet, but the rest is mostly just TH04 needing to catch
up with functions we've already separated. Next up: Getting to that
practical 98.5% point. Since this is very likely to not require a full
push, I'll also decompile some more actual TH04 and TH05 game code I
previously reverse-engineered – and after that, reopen the store!
Turns out that TH04's player selection menu is exactly three times as
complicated as TH05's. Two screens for character and shot type rather than
one, and a way more intricate implementation for saving and restoring the
background behind the raised top and left edges of a character picture
when moving the cursor between Reimu and Marisa. TH04 decides to backup
precisely only the two 256×8 (top) and 8×244 (left) strips behind the
edges, indicated in red in the picture
below.
These take up just 4 KB of heap memory… but require custom blitting
functions, and expanding this explicitly hardcoded approach to TH05's 4
characters would have been pretty annoying. So, rather than, uh, not
explicitly hardcoding it all, ZUN decided to just be lazy with the backup
area in TH05, saving the entire 640×400 screen, and thus spending 128 KB
of heap memory on this rather simple selection shadow effect.
So, this really wasn't something to quickly get done during the first half
of a push, even after already having done TH05's equivalent of this menu.
But since life is very busy right now, I also used the occasion to start
addressing another code organization annoyance: master.lib's single master.h header file.
Now that ReC98 is trying to develop (or at least mimic) a more
type-safe C++ foundation to model the PC-98 hardware, a pure C header
(with counter-productive C++ extensions) is becoming increasingly
unidiomatic. By moving some of the original assumptions about function
parameters into the type system, we can also reduce the reliance on its
Japanese-only documentation without having to translate it
It's quite bloated, with at least 2800 lines of code that
currently are #included into the vast majority of files, not
counting master.h's recursively included C standard library
headers. PC-98 Touhou only makes direct use of a rather small fraction of
its contents.
And finally, all the DOS/V compatibility definitions are especially
useless in the context of ReC98. As I've noted
📝 time and
📝 time again, porting PC-98 Touhou to
IBM-compatible DOS won't be easy, and MASTER_DOSV won't be
helping much. Therefore, my upstream version of ReC98 will never include
all of master.lib. There's no point in lengthening compile times for
everyone by default, and those will be getting quite noticeable
after moving to a full 16-bit build process.
(Actually, what retro system ports should rather be doing: Get rid
of master.lib's original ASM code, replace it with
readable, modern
C++, and then simply convert the optimized assembly output of modern
compilers to your ISA of choice. Improving the landscape of such
assembly or object file converters would benefit everyone!)
So, time to start a new master.hpp header that would contain
just the declarations from master.h that PC-98 Touhou
actually needs, plus some semantic (yes, semantic) sugar. Comparing just
the old master.h to just the new master.hpp
after roughly 60% of the transition has been completed, we get median
build times of 319 ms for master.h, and 144 ms for
master.hpp on my (admittedly rather slow) DOSBox setup.
Nice!
As of this push, ReC98 consists of 107 translation units that have to be
compiled with Turbo C++ 4.0J. Fully rebuilding all of these currently
takes roughly 37.5 seconds in DOSBox. After the transition to
master.hpp is done, we could therefore shave some 10 to 15
seconds off this time, simply by switching header files. And that's just
the beginning, as this will also pave the way for further
#include optimizations. Life in this codebase will be great!
Unfortunately, there wasn't enough time to repay some of the actual
technical debt I was looking forward to, after all of this. Oh well, at
least we now also have nice identifiers for the three different boldface
options that are used when rendering text to VRAM, after procrastinating
that issue for almost 11 months. Next up, assuming the existing
subscriptions: More ridiculous decompilations of things that definitely
weren't originally written in C, and a big blocker in TH03's
MAIN.EXE.
Wait, PI for FUUIN.EXE is mainly blocked by the high score
menu? That one should really be properly decompiled in a separate
RE push, since it's also present in largely identical form in
REIIDEN.EXE… but I currently lack the explicit funding to do
that.
And as it turns out, I shouldn't really capture any of the existing generic
RE contributions for it either. Back in 2018 when I ran the crowdfunding
on the Touhou Patch Center Discord server, I said that generic RE
contributions would never go towards TH01. No one was interested in that
game back then, and as it's significantly different from all the other
games, it made sense to only cover it if explicitly requested.
As Touhou Patch Center still remains one of the biggest supporters and
advertisers for ReC98, someone recently believed that this rule was still
in effect, despite not being mentioned anywhere on this website.
Fast forward to today, and TH01 has become the single most supported game
lately, with plenty of incomplete pushes still open to be completed.
Reverse-engineering it has proven to be quite efficient, yielding lots of
completion percentage points per push. This, I suppose, is exactly what
backers that don't give any specific priorities are mainly interested in.
Therefore, I will allocate future partial
contributions to TH01, whenever it makes sense.
So, instead of rushing TH01 PI, let's wait for Ember2528's
April subscription, and get the 25% total RE milestone with some TH05 PI
progress instead. This one primarily focused on the gather circles
(spirals…?), the third-last missing entity type in TH05. These are
rendered using the same 8×8 pellet sprite introduced in TH02… except that
the actual pellets received a darkened bottom part in TH04
.
Which, in turn, is actually rendered quite efficiently – the games first
render the top white part of all pellets, followed by the bottom gray part
of all pellets. The PC-98 GRCG is used throughout the process, doing its
typical job of accelerating monochrome blitting, and by arranging the
rendering like this, only two GRCG color changes are required to draw any
number of pellets. I guess that makes it quite a worthwhile
optimization? Don't ask me for specific performance numbers or even saved
cycles, though
So, where to start? Well, TH04 bullets are hard, so let's
procrastinate start with TH03 instead
The 📝 sprite display functions are the
obvious blocker for any structure describing a sprite, and therefore most
meaningful PI gains in that game… and I actually did manage to fit a
decompilation of those three functions into exactly the amount of time
that the Touhou Patch Center community votes alloted to TH03
reverse-engineering!
And a pretty amazing one at that. The original code was so obviously
written in ASM and was just barely decompilable by exclusively using
register pseudovariables and a bit of goto, but I was able to
abstract most of that away, not least thanks to a few helpful optimization
properties of Turbo C++… seriously, I can't stop marveling at this ancient
compiler. The end result is both readable, clear, and dare I say
portable?! To anyone interested in porting TH03,
take a look. How painful would it be to port that away from 16-bit
x86?
However, this push is also a typical example that the RE/PI priorities can
only control what I look at, and the outcome can actually differ
greatly. Even though the priorities were 65% RE and 35% PI, the progress
outcome was +0.13% RE and +1.35% PI. But hey, we've got one more push with
a focus on TH03 PI, so maybe that one will include more RE than
PI, and then everything will end up just as ordered?
So, let's continue with player shots! …eh, or maybe not directly, since they involve two other structure types in TH05, which we'd have to cover first. One of them is a different sort of sprite, and since I like me some context in my reverse-engineering, let's disable every other sprite type first to figure out what it is.
One of those other sprite types were the little sparks flying away from killed stage enemies, midbosses, and grazed bullets; easy enough to also RE right now. Turns out they use the same 8 hardcoded 8×8 sprites in TH02, TH04, and TH05. Except that it's actually 64 16×8 sprites, because ZUN wanted to pre-shift them for all 8 possible start pixels within a planar VRAM byte (rather than, like, just writing a few instructions to shift them programmatically), leading to them taking up 1,024 bytes rather than just 64.
Oh, and the thing I wanted to RE *actually* was the decay animation whenever a shot hits something. Not too complex either, especially since it's exclusive to TH05.
And since there was some time left and I actually have to pick some of the next RE places strategically to best prepare for the upcoming 17 decompilation pushes, here's two more function pointers for good measure.