TH03 finally passed 20% RE, and the newly decompiled code contains no
serious ZUN bugs! What a nice way to end the year.
There's only a single unlockable feature in TH03: Chiyuri and Yumemi as
playable characters, unlocked after a 1CC on any difficulty. Just like the
Extra Stages in TH04 and TH05, YUME.NEM contains a single
designated variable for this unlocked feature, making it trivial to craft a
fully unlocked score file without recording any high scores that others
would have to compete against. So, we can now put together a complete set
for all PC-98 Touhou games: 2021-12-27-Fully-unlocked-clean-score-files.zip
It would have been cool to set the randomly generated encryption keys in
these files to a fixed value so that they cancel out and end up not actually
encrypting the file. Too bad that TH03 also started feeding each encrypted
byte back into its stream cipher, which makes this impossible.
The main loading and saving code turned out to be the second-cleanest
implementation of a score file format in PC-98 Touhou, just behind TH02.
Only two of the YUME.NEM functions come with nonsensical
differences between OP.EXE and MAINL.EXE, rather
than 📝 all of them, as in TH01 or
📝 too many of them, as in TH04 and TH05. As
for the rest of the per-difficulty structure though… well, it quickly
becomes clear why this was the final score file format to be RE'd. The name,
score, and stage fields are directly stored in terms of the internal
REGI*.BFT sprite IDs used on the high score screen. TH03 also
stores 10 score digits for each place rather than the 9 possible ones, keeps
any leading 0 digits, and stores the letters of entered names in reverse
order… yeah, let's decompile the high score screen as well, for a full
understanding of why ZUN might have done all that. (Answer: For no reason at
And wow, what a breath of fresh air. It's surely not
good-code: The overlapping shadows resulting from using
a 24-pixel letterspacing with 32-pixel glyphs in the name column led ZUN to
do quite a lot of unnecessary and slightly confusing rendering work when
moving the cursor back and forth, and he even forgot about the EGC there.
But it's nowhere close to the level of jank we saw in
📝 TH01's high score menu last year. Good to
see that ZUN had learned a thing or two by his third game – especially when
it comes to storing the character map cursor in terms of a character ID,
and improving the layout of the character map:
That's almost a nicely regular grid there. With the question mark and the
double-wide SP, BS, and END options, the cursor
movement code only comes with a reasonable two exceptions, which are easily
handled. And while I didn't get this screen completely decompiled,
one additional push was enough to cover all important code there.
The only potential glitch on this screen is a result of ZUN's continued use
decimal digits without any bounds check or cap. Like the in-game HUD
score display in TH04 and TH05, TH03's high score screen simply uses the
next glyph in the character set for the most significant digit of any score
above 1,000,000,000 points – in this case, the period. Still, it only
really gets bad at 8,000,000,000 points: Once the glyphs are
exhausted, the blitting function ends up accessing garbage data and filling
the entire screen with garbage pixels. For comparison though, the current world record
is 133,650,710 points, so good luck getting 8 billion in the first
Next up: Starting 2022 with the long-awaited decompilation of TH01's Sariel
fight! Due to the 📝 recent price increase,
we now got a window in the cap that
is going to remain open until tomorrow, providing an early opportunity to
set a new priority after Sariel is done.
Of course, Sariel's potentially bloated and copy-pasted code is blocked by
even more definitely bloated and copy-pasted code. It's TH01, what did you
But even then, TH01's item code is on a new level of software architecture
ridiculousness. First, ZUN uses distinct arrays for both types of items,
with their own caps of 4 for bomb items, and 10 for point items. Since that
obviously makes any type-related switch statement redundant,
he also used distinct functions for both types, with copy-pasted
boilerplate code. The main per-item update and render function is
shared though… and takes every single accessed member of the item
structure as its own reference parameter. Like, why, you have a
structure, right there?! That's one way to really practice the C++ language
concept of passing arbitrary structure fields by mutable reference…
To complete the unwarranted grand generic design of this function, it calls
back into per-type collision detection, drop, and collect functions with
another three reference parameters. Yeah, why use C++ virtual methods when
you can also implement the effectively same polymorphism functionality by
hand? Oh, and the coordinate clamping code in one of these callbacks could
only possibly have come from nested min() and
max() preprocessor macros. And that's how you extend such
dead-simple functionality to 1¼ pushes…
Amidst all this jank, we've at least got a sensible item↔player hitbox this
time, with 24 pixels around Reimu's center point to the left and right, and
extending from 24 pixels above Reimu down to the bottom of the playfield.
It absolutely didn't look like that from the initial naive decompilation
though. Changing entity coordinates from left/top to center was one of the
better lessons from TH01 that ZUN implemented in later games, it really
makes collision detection code much more intuitive to grasp.
The card flip code is where we find out some slightly more interesting
aspects about item drops in this game, and how they're controlled by a
hidden cycle variable:
At the beginning of every 5-stage scene, this variable is set to a
random value in the [0..59] range
Point items are dropped at every multiple of 10
Every card flip adds 1 to its value after this mod 10
At a value of 140, the point item is replaced with a bomb item, but only
if no damaging bomb is active. In any case, its value is then reset to
Then again, score players largely ignore point items anyway, as card
combos simply have a much bigger effect on the score. With this, I should
have RE'd all information necessary to construct a tool-assisted score run,
though? Edit: Turns out that 1) point items are becoming
increasingly important in score runs, and 2) Pearl already did a TAS some
months ago. Thanks to
spaztron64 for the info!
The Orb↔card hitbox also makes perfect sense, with 24 pixels around
the center point of a card in every direction.
The rest of the code confirms the
flip score formula documented on Touhou Wiki, as well as the way cards
are flipped by bombs: During every of the 90 "damaging" frames of the
140-frame bomb animation, there is a 75% chance to flip the card at the
[bomb_frame % total_card_count_in_stage] array index. Since
stages can only have up to 50 cards
📝 thanks to a bug, even a 75% chance is high
enough to typically flip most cards during a bomb. Each of these flips
still only removes a single card HP, just like after a regular collision
with the Orb.
Also, why are the card score popups rendered before the cards
themselves? That's two needless frames of flicker during that 25-frame
animation. Not all too noticeable, but still.
And that's over 50% of REIIDEN.EXE decompiled as well! Next
up: More HUD update and rendering code… with a direct dependency on
rank pellet speed modifications?
…or maybe not that soon, as it would have only wasted time to
untangle the bullet update commits from the rest of the progress. So,
here's all the bullet spawning code in TH04 and TH05 instead. I hope
you're ready for this, there's a lot to talk about!
(For the sake of readability, "bullets" in this blog post refers to the
white 8×8 pellets
and all 16×16 bullets loaded from MIKO16.BFT, nothing else.)
But first, what was going on📝 in 2020? Spent 4 pushes on the basic types
and constants back then, still ended up confusing a couple of things, and
even getting some wrong. Like how TH05's "bullet slowdown" flag actually
always prevents slowdown and fires bullets at a constant speed
instead. Or how "random spread" is not the
best term to describe that unused bullet group type in TH04.
Or that there are two distinct ways of clearing all bullets on screen,
which deserve different names:
Bullets are zapped at the end of most midboss and boss phases, and
cleared everywhere else – most notably, during bombs, when losing a
life, or as rewards for extends or a maximized Dream bonus. The
Bonus!! points awarded for zapping bullets are calculated iteratively,
so it's not trivial to give an exact formula for these. For a small number
𝑛 of bullets, it would exactly be 5𝑛³ - 10𝑛² + 15𝑛
points – or, using uth05win's (correct) recursive definition,
Bonus(𝑛) = Bonus(𝑛-1) + 15𝑛² - 5𝑛 + 10.
However, one of the internal step variables is capped at a different number
of points for each difficulty (and game), after which the points only
increase linearly. Hence, "semi-exponential".
On to TH04's bullet spawn code then, because that one can at least be
decompiled. And immediately, we have to deal with a pointless distinction
between regular bullets, with either a decelerating or constant
velocity, and special bullets, with preset velocity changes during
their lifetime. That preset has to be set somewhere, so why have
separate functions? In TH04, this separation continues even down to the
lowest level of functions, where values are written into the global bullet
array. TH05 merges those two functions into one, but then goes too far and
uses self-modifying code to save a grand total of two local variables…
Luckily, the rest of its actual code is identical to TH04.
Most of the complexity in bullet spawning comes from the (thankfully
shared) helper function that calculates the velocities of the individual
bullets within a group. Both games handle each group type via a large
switch statement, which is where TH04 shows off another Turbo
C++ 4.0 optimization: If the range of case values is too
sparse to be meaningfully expressed in a jump table, it usually generates a
linear search through a second value table. But with the -G
command-line option, it instead generates branching code for a binary
search through the set of cases. 𝑂(log 𝑛) as the worst case for a
switch statement in a C++ compiler from 1994… that's so cool.
But still, why are the values in TH04's group type enum all
over the place to begin with?
Unfortunately, this optimization is pretty rare in PC-98 Touhou. It only
shows up here and in a few places in TH02, compared to at least 50
switch value tables.
In all of its micro-optimized pointlessness, TH05's undecompilable version
at least fixes some of TH04's redundancy. While it's still not even
optimal, it's at least a decently written piece of ASM…
if you take the time to understand what's going on there, because it
certainly took quite a bit of that to verify that all of the things which
looked like bugs or quirks were in fact correct. And that's how the code
for this function ended up with 35% comments and blank lines before I could
confidently call it "reverse-engineered"…
Oh well, at least it finally fixes a correctness issue from TH01 and TH04,
where an invalid bullet group type would fill all remaining slots in the
bullet array with identical versions of the first bullet.
Something that both games also share in these functions is an over-reliance
on globals for return values or other local state. The most ridiculous
example here: Tuning the speed of a bullet based on rank actually mutates
the global bullet template… which ZUN then works around by adding a wrapper
function around both regular and special bullet spawning, which saves the
base speed before executing that function, and restores it afterward.
Add another set of wrappers to bypass that exact
tuning, and you've expanded your nice 1-function interface to 4 functions.
Oh, and did I mention that TH04 pointlessly duplicates the first set of
wrapper functions for 3 of the 4 difficulties, which can't even be
explained with "debugging reasons"? That's 10 functions then… and probably
explains why I've procrastinated this feature for so long.
At this point, I also finally stopped decompiling ZUN's original ASM just
for the sake of it. All these small TH05 functions would look horribly
unidiomatic, are identical to their decompiled TH04 counterparts anyway,
except for some unique constant… and, in the case of TH05's rank-based
speed tuning function, actually become undecompilable as soon as we
want to return a C++ class to preserve the semantic meaning of the return
value. Mainly, this is because Turbo C++ does not allow register
pseudo-variables like _AX or _AL to be cast into
class types, even if their size matches. Decompiling that function would
have therefore lowered the quality of the rest of the decompiled code, in
exchange for the additional maintenance and compile-time cost of another
translation unit. Not worth it – and for a TH05 port, you'd already have to
decompile all the rest of the bullet spawning code anyway!
The only thing in there that was still somewhat worth being
decompiled was the pre-spawn clipping and collision detection function. Due
to what's probably a micro-optimization mistake, the TH05 version continues
to spawn a bullet even if it was spawned on top of the player. This might
sound like it has a different effect on gameplay… until you realize that
the player got hit in this case and will either lose a life or deathbomb,
both of which will cause all on-screen bullets to be cleared anyway.
So it's at most a visual glitch.
But while we're at it, can we please stop talking about hitboxes? At least
in the context of TH04 and TH05 bullets. The actual collision detection is
described way better as a kill delta of 8×8 pixels between the
center points of the player and a bullet. You can distribute these pixels
to any combination of bullet and player "hitboxes" that make up 8×8. 4×4
around both the player and bullets? 1×1 for bullets, and 8×8 for the
player? All equally valid… or perhaps none of them, once you keep in mind
that other entity types might have different kill deltas. With that in
mind, the concept of a "hitbox" turns into just a confusing abstraction.
The same is true for the 36×44 graze box delta. For some reason,
this one is not exactly around the center of a bullet, but shifted to the
right by 2 pixels. So, a bullet can be grazed up to 20 pixels right of the
player, but only up to 16 pixels left of the player. uth05win also spotted
this… and rotated the deltas clockwise by 90°?!
Which brings us to the bullet updates… for which I still had to
research a decompilation workaround, because
📝 P0148 turned out to not help at all?
Instead, the solution was to lie to the compiler about the true segment
distance of the popup function and declare its signature far
rather than near. This allowed ZUN to save that ridiculous overhead of 1 additional far function
call/return per frame, and those precious 2 bytes in the BSS segment
that he didn't have to spend on a segment value.
📝 Another function that didn't have just a
single declaration in a common header file… really,
📝 how were these games even built???
The function itself is among the longer ones in both games. It especially
stands out in the indentation department, with 7 levels at its most
indented point – and that's the minimum of what's possible without
goto. Only two more notable discoveries there:
Bullets are the only entity affected by Slow Mode. If the number of
bullets on screen is ≥ (24 + (difficulty * 8) + rank) in TH04,
or (42 + (difficulty * 8)) in TH05, Slow Mode reduces the frame
rate by 33%, by waiting for one additional VSync event every two frames.
The code also reveals a second tier, with 50% slowdown for a slightly
higher number of bullets, but that conditional branch can never be executed
Bullets must have been grazed in a previous frame before they can
be collided with. (Note how this does not apply to bullets that spawned
on top of the player, as explained earlier!)
Whew… When did ReC98 turn into a full-on code review?! 😅 And after all
this, we're still not done with TH04 and TH05 bullets, with all the
special movement types still missing. That should be less than one push
though, once we get to it. Next up: Back to TH01 and Konngara! Now have fun
rewriting the Touhou Wiki Gameplay pages 😛
Three pushes to decompile the TH01 high score menu… because it's
completely terrible, and needlessly complicated in pretty much every
Another, final set of differences between the REIIDEN.EXE
and FUUIN.EXE versions of the code. Which are so
insignificant that it must mean that ZUN kept this code in two
separate, manually and imperfectly synced files. The REIIDEN.EXE
version, only shown when game-overing, automatically jumps to the
enter/終 button after the 8th character was entered,
and also has a completely invisible timeout that force-enters a high score
name after 1000… key presses? Not frames? Why. Like, how do you
even realistically such a number. (Best guess: It's a hidden easter egg to
amuse players who place drinking glasses on cursor keys. Or beer bottles.)
That's all the differences that are maybe visible if you squint
hard enough. On top of that though, we got a bunch of further, minor code
organization differences that serve no purpose other than to waste
decompilation time, and certainly did their part in stretching this out to
3 pushes instead of 2.
Entered names are restricted to a set of 16-bit, full-width Shift-JIS
codepoints, yet are still accessed as 8-bit byte arrays everywhere. This
bloats both the C++ and generated ASM code with needless byte splits,
swaps, and bit shifts. Same for the route kanji. You have this 16-, heck,
even 32-bit CPU, why not use it?! (Fun fact: FUUIN.EXE is
explicitly compiled for a 80186, for the most part – unlike
REIIDEN.EXE, which does use Turbo C++'s 80386 mode.)
The sensible way of storing the current position of the alphabet
cursor would simply be two variables, indicating the logical row and
column inside the character map. When rendering, you'd then transform
these into screen space. This can keep the on-screen position constants in
a single place of code.
TH01 does the opposite: The selected character is stored directly in terms
of its on-screen position, which is then mapped back to a character
index for every processed input and the subsequent screen update. There's
no notion of a logical row or column anywhere, and consequently, the
position constants are vomited all over the code.
Which might not be as bad if the character map had a uniform
grid structure, with no gaps. But the one in TH01 looks like this:
And with no sense of abstraction anywhere, both input handling and
rendering end up with a separate if branch for at least 4 of
the 6 rows.
In the end, I just gave up with my usual redundancy reduction efforts for
this one. Anyone wanting to change TH01's high score name entering code
would be better off just rewriting the entire thing properly.
And that's all of the shared code in TH01! Both OP.EXE and
FUUIN.EXE are now only missing the actual main menu and
ending code, respectively. Next up, though: The long awaited TH01 PI push.
Which will not only deliver 100% PI for OP.EXE and
FUUIN.EXE, but also probably quite some gains in
REIIDEN.EXE. With now over 30% of the game decompiled, it's about
time we get to look at some gameplay code!
Final TH01 RE push for the time being, and as expected, we've got the
superficially final piece of shared code between the TH01 executables.
However, just having a single implementation for loading and recreating
the REYHI*.DAT score files would have been way above ZUN's
standards of consistency. So ZUN had the unique idea to mix up the file
I/O APIs, using master.lib functions in REIIDEN.EXE, and
POSIX functions (along with error messages and disabled interrupts) in
FUUIN.EXE… Could have been worse
though, as it was possible to abstract that away quite nicely.
That code wasn't quite in the natural way of decompilation either. As it
turns out though, 📝 segment splitting isn't
so painful after all if one of the new segments only has a few functions.
Definitely going to do that more often from now on, since it allows a much
larger number of functions to be immediately decompiled. Which is always
superior to somehow transforming a function's ASM into a form that I can
confidently call "reverse-engineered", only to revisit it again later for
And while I unfortunately missed 25% of total RE by a bit, this push
reached two other and perhaps even more significant milestones:
After (finally) compressing all unknown parts of the BSS segments
using arrays, the number of remaining lines in the
REIIDEN.EXE ASM dump has fallen below TASM's limit of 65,535. Which
means that we no longer need that annoying th01_reiiden_2.inc
file that everyone has forgotten about at least once.
As noted in 📝 P0061, TH03 gameplay RE is
indeed going to progress very slowly in the beginning. A lot of the
initial progress won't even be reflected in the RE% – there are just so
many features in this game that are intertwined into each other, and I
only consider functions to be "reverse-engineered" once we understand
every involved piece of code and data, and labeled every absolute
memory reference in it. (Yes, that means that the percentages on the front
page are actually underselling ReC98's progress quite a bit, and reflect a
pretty low bound of our actual understanding of the games.)
So, when I get asked to look directly at gameplay code right now,
it's quite the struggle to find a place that can be covered within a push
or two and that would immediately benefit
scoreplayers. The basics of score and combo handling themselves
managed to fit in pretty well, though:
Just like TH04 and TH05, TH03 stores the current score as 8
decimal digits. Since the last constant 0 is not included, the maximum
score displayable without glitches therefore is 999,999,990 points, but
the game will happily store up to 24,699,999,990 points before the score
wraps back to 0.
There are (surprisingly?) only 6 places where the game actually
adds points to the score. Not quite sure about all of them yet, but they
(of course) include ending a combo, killing enemies, and the bonus at the
end of a round.
Combos can be continued for 80 frames after a 2-hit. The hit counter
can only be increased in the first 48, and effectively resets to 0 for the
last 32, when the Spell Point value starts blinking.
TH03 can track a total of 16 independent "hit combo sources" per
player, simultaneously. These are not related to the number of
actual explosions; rather, each explosion is assigned to one of the 16
slots when it spawns, and all consecutive explosions spawned from that one
will then add to the hit combo in that slot. The hit number displayed in
the top left is simply the largest one among all these.
Oh well, at least we still got a bit of PI% out of this one. From this
point though, the next push (or two) should be enough to cover the big
128-byte player structure – which by itself might not be immediately
interesting to scoreplayers, but surely is quite a blocker for everything
Just like most of the time, it was more sensible to cover
GENSOU.SCR, the last structure missing in TH05's
everywhere it's used, rather than just rushing out OP.EXE
position independence. I did have to look into all of the functions to
fully RE it after all, and to find out whether the unused fields actually
are unused. The only thing that kept this push from yielding even
more above-average progress was the sheer inconsistency in how the games
implemented the operations on this PC-98 equivalent of score*.dat:
OP.EXE declares two structure instances, for simultaneous
access to both Reimu and Marisa scores. TH05 with its 4 playable
characters instead uses a single one, and overwrites it successively for
each character when drawing the high score menu – meaning, you'd only see
Yuuka's scores when looking at the structure inside the rendered high
score menu. However, it still declares the TH04 "Marisa" structure as a
leftover… and also decodes it and verifies its checksum, despite
nothing being ever loaded into it
MAIN.EXE uses a separate ASM implementation of the decoding
and encoding functions
TH05's MAIN.EXE also reimplements the basic loading
in ASM – without the code to regenerate GENSOU.SCR with
default data if the file is missing or corrupted. That actually makes
sense, since any regeneration is already done in OP.EXE, which
always has to load that file anyway to check how much has been cleared
However, there is a regeneration function in TH05's
MAINE.EXE… which actually generates different default
data: OP.EXE consistently sets Extra Stage records to Stage 1,
while MAINE.EXE uses the same place-based stage numbering that
both versions use for the regular ranks
Technically though, TH05's OP.EXEis
position-independent now, and the rest are (should be?
) merely false positives. However, TH04's is
still missing another structure, in addition to its false
positives. So, let's wait with the big announcement until the next push…
which will also come with a demo video of what will be possible then.
The glacial pace continues, with TH05's unnecessarily, inappropriately
micro-optimized, and hence, un-decompilable code for rendering the current
and high score, as well as the enemy health / dream / power bars. While
the latter might still pass as well-written ASM, the former goes to such
ridiculous levels that it ends up being technically buggy. If you
enjoy quality ZUN code, it's
definitely worth a read.
In TH05, this all still is at the end of code segment #1, but in TH04,
the same code lies all over the same segment. And since I really
wanted to move that code into its final form now, I finally did the
research into decompiling from anywhere else in a segment.
Turns out we actually can! It's kinda annoying, though: After splitting
the segment after the function we want to decompile, we then need to group
the two new segments back together into one "virtual segment" matching the
original one. But since all ASM in ReC98 heavily relies on being
assembled in MASM mode, we then start to suffer from MASM's group
addressing quirk. Which then forces us to manually prefix every single
from inside the group
to anywhere else within the newly created segment
with the group name. It's stupidly boring busywork, because of all the
function calls you mustn't prefix. Special tooling might make this
easier, but I don't have it, and I'm not getting crowdfunded for it.
So while you now definitely can request any specific thing in any
of the 5 games to be decompiled right now, it will take slightly
longer, and cost slightly more.
(Except for that one big segment in TH04, of course.)
Only one function away from the TH05 shot type control functions now!
What do you do if the TH06 text image feature for thcrap should have been done 3 days™ ago, but keeps getting more and more complex, and you have a ton of other pushes to deliver anyway? Get some distraction with some light ReC98 reverse-engineering work. This is where it becomes very obvious how much uth05win helps us with all the games, not just TH05.
5a5c347 is the most important one in there, this was the missing substructure that now makes every other sprite-like structure trivial to figure out.