OK, TH01 missile bullets. Can we maybe have a well-behaved entity type,
without any weirdness? Just once?
Ehh, kinda. Apart from another 150 bytes wasted on unused structure members,
this code is indeed more on the low end in terms of overall jank. It does
become very obvious why dodging these missiles in the YuugenMagan, Mima, and
Elis fights feels so awful though: An unfair 46×46 pixel hitbox around
Reimu's center pixel, combined with the comeback of
📝 interlaced rendering, this time in every
stage. ZUN probably did this because missiles are the only 16×16 sprite in
TH01 that is blitted to unaligned X positions, which effectively ends up
touching a 32×16 area of VRAM per sprite.
But even if we assume VRAM writes to be the bottleneck here, it would
have been totally possible to render every missile in every frame at roughly
the same amount of CPU time that the original game uses for interlaced
Note that all missile sprites only use two colors, white and green.
Instead of naively going with the usual four bitplanes, extract the
pixels drawn in each of the two used colors into their own bitplanes.
master.lib calls this the "tiny format".
Use the GRCG to draw these two bitplanes in the intended white and green
colors, halving the amount of VRAM writes compared to the original
(Not using the .PTN format would have also avoided the inconsistency of
storing the missile sprites in boss-specific sprite slots.)
That's an optimization that would have significantly benefitted the game, in
contrast to all of the fake ones
introduced in later games. Then again, this optimization is
actually something that the later games do, and it might have in fact been
necessary to achieve their higher bullet counts without significant
After some effectively unused Mima sprite effect code that is so broken that
it's impossible to make sense out of it, we get to the final feature I
wanted to cover for all bosses in parallel before returning to Sariel: The
separate sprite background storage for moving or animated boss sprites in
the Mima, Elis, and Sariel fights. But, uh… why is this necessary to begin
with? Doesn't TH01 already reserve the other VRAM page for backgrounds?
Well, these sprites are quite big, and ZUN didn't want to blit them from
main memory on every frame. After all, TH01 and TH02 had a minimum required
clock speed of 33 MHz, half of the speed required for the later three games.
So, he simply blitted these boss sprites to both VRAM pages, leading
the usual unblitting calls to only remove the other sprites on top of the
boss. However, these bosses themselves want to move across the screen…
and this makes it necessary to save the stage background behind them
in some other way.
Enter .PTN, and its functions to capture a 16×16 or 32×32 square from VRAM
into a sprite slot. No problem with that approach in theory, as the size of
all these bigger sprites is a multiple of 32×32; splitting a larger sprite
into these smaller 32×32 chunks makes the code look just a little bit clumsy
(and, of course, slower).
But somewhere during the development of Mima's fight, ZUN apparently forgot
that those sprite backgrounds existed. And once Mima's 🚫 casting sprite is
blitted on top of her regular sprite, using just regular sprite
transparency, she ends up with her infamous third arm:
Ironically, there's an unused code path in Mima's unblit function where ZUN
assumes a height of 48 pixels for Mima's animation sprites rather than the
actual 64. This leads to even clumsier .PTN function calls for the bottom
128×16 pixels… Failing to unblit the bottom 16 pixels would have also
yielded that third arm, although it wouldn't have looked as natural. Still
wouldn't say that it was intentional; maybe this casting sprite was just
added pretty late in the game's development?
So, mission accomplished, Sariel unblocked… at 2¼ pushes. That's quite some time left for some smaller stage initialization
code, which bundles a bunch of random function calls in places where they
logically really don't belong. The stage opening animation then adds a bunch
of VRAM inter-page copies that are not only redundant but can't even be
understood without knowing the hidden internal state of the last VRAM page
accessed by previous ZUN code…
In better news though: Turbo C++ 4.0 really doesn't seem to have any
complexity limit on inlining arithmetic expressions, as long as they only
operate on compile-time constants. That's how we get macro-free,
compile-time Shift-JIS to JIS X 0208 conversion of the individual code
points in the 東方★靈異伝 string, in a compiler from 1994. As long as you
don't store any intermediate results in variables, that is…
But wait, there's more! With still ¼ of a push left, I also went for the
boss defeat animation, which includes the route selection after the SinGyoku
As in all other instances, the 2× scaled font is accomplished by first
rendering the text at regular 1× resolution to the other, invisible VRAM
page, and then scaled from there to the visible one. However, the route
selection is unique in that its scaled text is both drawn transparently on
top of the stage background (not onto a black one), and can also change
colors depending on the selection. It would have been no problem to unblit
and reblit the text by rendering the 1× version to a position on the
invisible VRAM page that isn't covered by the 2× version on the visible one,
but ZUN (needlessly) clears the invisible page before rendering any text.
Instead, he assigned a separate VRAM color for both
the 魔界 and 地獄 options, and only changed the palette value for
these colors to white or gray, depending on the correct selection. This is
another one of the
📝 rare cases where TH01 demonstrates good use of PC-98 hardware,
as the 魔界へ and 地獄へ strings don't need to be reblitted during the selection process, only the Orb "cursor" does.
Then, why does this still not count as good-code? When
changing palette colors, you kinda need to be aware of everything
else that can possibly be on screen, which colors are used there, and which
aren't and can therefore be used for such an effect without affecting other
sprites. In this case, well… hover over the image below, and notice how
Reimu's hair and the bomb sprites in the HUD light up when Makai is
This push did end on a high note though, with the generic, non-SinGyoku
version of the defeat animation being an easily parametrizable copy. And
that's how you decompile another 2.58% of TH01 in just slightly over three
Now, we're not only ready to decompile Sariel, but also Kikuri, Elis, and
SinGyoku without needing any more detours into non-boss code. Thanks to the
current TH01 funding subscriptions, I can plan to cover most, if not all, of
Sariel in a single push series, but the currently 3 pending pushes probably
won't suffice for Sariel's 8.10% of all remaining code in TH01. We've got
quite a lot of not specifically TH01-related funds in the backlog to pass
the time though.
Due to recent developments, it actually makes quite a lot of sense to take a
break from TH01: spaztron64 has
managed what every Touhou download site so far has failed to do: Bundling
all 5 game onto a single .HDI together with pre-configured PC-98
emulators and a nice boot menu, and hosting the resulting package on a
proper website. While this first release is already quite good (and much
better than my attempt from 2014), there is still a bit of room for
improvement to be gained from specific ReC98 research. Next up,
Researching how TH04 and TH05 use EMS memory, together with the cause
behind TH04's crash in Stage 5 when playing as Reimu without an EMS driver
reverse-engineering TH03's score data file format
(YUME.NEM), which hopefully also comes with a way of building a
file that unlocks all characters without any high scores.
And indeed, I got to end my vacation with a lot of image format and
blitting code, covering the final two formats, .GRC and .BOS. .GRC was
nothing noteworthy – one function for loading, one function for
byte-aligned blitting, and one function for freeing memory. That's it –
not even a unblitting function for this one. .BOS, on the other hand…
…has no generic (read: single/sane) implementation, and is only
implemented as methods of some boss entity class. And then again for
Sariel's dress and wand animations, and then again for Reimu's
animations, both of which weren't even part of these 4 pushes. Looking
forward to decompiling essentially the same algorithms all over again… And
that's how TH01 became the largest and most bloated PC-98 Touhou game. So
yeah, still not done with image formats, even at 44% RE.
This means I also had to reverse-engineer that "boss entity" class… yeah,
what else to call something a boss can have multiple of, that may or may
not be part of a larger boss sprite, may or may not be animated, and that
may or may not have an orb hitbox?
All bosses except for Kikuri share the same 5 global instances of this
class. Since renaming all these variables in ASM land is tedious anyway, I
went the extra mile and directly defined separate, meaningful names for
the entities of all bosses. These also now document the natural order in
which the bosses will ultimately be decompiled. So, unless a backer
requests anything else, this order will be:
(code for regular card-flipping stages)
As everyone kind of expects from TH01 by now, this class reveals yet
another… um, unique and quirky piece of code architecture. In
addition to the position and hitbox members you'd expect from a class like
this, the game also stores the .BOS metadata – width, height, animation
frame count, and 📝 bitplane pointer slot
number – inside the same class. But if each of those still corresponds to
one individual on-screen sprite, how can YuugenMagan have 5 eye sprites,
or Kikuri have more than one soul and tear sprite? By duplicating that
metadata, of course! And copying it from one entity to another
At this point, I feel like I even have to congratulate the game for not
actually loading YuugenMagan's eye sprites 5 times. But then again, 53,760
bytes of waste would have definitely been noticeable in the DOS days.
Makes much more sense to waste that amount of space on an unused C++
exception handler, and a bunch of redundant, unoptimized blitting
(Thinking about it, YuugenMagan fits this entire system perfectly. And
together with its position in the game's code – last to be decompiled
means first on the linker command line – we might speculate that
YuugenMagan was the first boss to be programmed for TH01?)
So if a boss wants to use sprites with different sizes, there's no way
around using another entity. And that's why Girl-Elis and Bat-Elis are two
distinct entities internally, and have to manually sync their position.
Except that there's also a third one for Attacking-Girl-Elis,
because Girl-Elis has 9 frames of animation in total, and the global .BOS
bitplane pointers are divided into 4 slots of only 8 images each.
Same for SinGyoku, who is split into a sphere entity, a
person entity, and a… white flash entity for all three forms,
all at the same resolution. Or Konngara's facial expressions, which also
require two entities just for themselves.
And once you decompile all this code, you notice just how much of it the
game didn't even use. 13 of the 50 bytes of the boss entity class are
outright unused, and 10 bytes are used for a movement clamping and lock
system that would have been nice if ZUN also used it outside of
Kikuri's soul sprites. Instead, all other bosses ignore this system
completely, and just
the X/Y coordinates of the boss entities directly.
As for the rendering functions, 5 out of 10 are unused. And while those
definitely make up less than half of the code, I still must have
spent at least 1 of those 4 pushes on effectively unused functionality.
Only one of these functions lends itself to some speculation. For Elis'
entrance animation, the class provides functions for wavy blitting and
unblitting, which use a separate X coordinate for every line of the
sprite. But there's also an unused and sort of broken one for unblitting
two overlapping wavy sprites, located at the same Y coordinate. This might
indicate that Elis could originally split herself into two sprites,
similar to TH04 Stage 6 Yuuka? Or it might just have been some other kind
of animation effect, who knows.
After over 3 months of TH01 progress though, it's finally time to look at
other games, to cover the rest of the crowdfunding backlog. Next up: Going
back to TH05, and getting rid of those last PI false positives. And since
I can potentially spend the next 7 weeks on almost full-time ReC98 work,
I've also re-opened the store until October!
Well, make that three days. Trying to figure out all the details behind
the sprite flickering was absolutely dreadful…
It started out easy enough, though. Unsurprisingly, TH01 had a quite
limited pellet system compared to TH04 and TH05:
The cap is 100, rather than 240 in TH04 or 180 in TH05.
Only 6 special motion functions (with one of them broken and unused)
instead of 10. This is where you find the code that generates SinGyoku's
chase pellets, Kikuri's small spinning multi-pellet circles, and
Konngara's rain pellets that bounce down from the top of the playfield.
A tiny selection of preconfigured multi-pellet groups. Rather than
TH04's and TH05's freely configurable n-way spreads, stacks, and rings,
TH01 only provides abstractions for 2-, 3-, 4-, and 5- way spreads (yup,
no 6-way or beyond), with a fixed narrow or wide angle between the
individual pellets. The resulting pellets are also hardcoded to linear
motion, and can't use the special motion functions. Maybe not the best
code, but still kind of cute, since the generated groups do follow a
As expected from TH01, the code comes with its fair share of smaller,
insignificant ZUN bugs and oversights. As you would also expect
though, the sprite flickering points to the biggest and most consequential
flaw in all of this.
Apparently, it started with ZUN getting the impression that it's only
possible to use the PC-98 EGC for fast blitting of all 4 bitplanes in one
CPU instruction if you blit 16 horizontal pixels (= 2 bytes) at a time.
Consequently, he only wrote one function for EGC-accelerated sprite
unblitting, which can only operate on a "grid" of 16×1 tiles in VRAM. But
wait, pellets are not only just 8×8, but can also be placed at any
unaligned X position…
… yet the game still insists on using this 16-dot-aligned function to
unblit pellets, forcing itself into using a super sloppy 16×8 rectangle
for the job. 🤦 ZUN then tried to mitigate the resulting flickering in two
hilarious ways that just make it worse:
An… "interlaced rendering" mode? This one's activated for all Stage 15
and 20 fights, and separates pellets into two halves that are rendered on
alternating frames. Collision detection with the Yin-Yang Orb and the
player is only done for the visible half, but collision detection with
player shots is still done for all pellets every frame, as are
motion updates – so that pellets don't end up moving half as fast as they
So yeah, your eyes weren't deceiving you. The game does effectively
drop its perceived frame rate in the Elis, Kikuri, Sariel, and Konngara
fights, and it does so deliberately.
📝 Just like player shots, pellets
are also unblitted, moved, and rendered in a single function.
Thanks to the 16×8 rectangle, there's now the (completely unnecessary)
possibility of accidentally unblitting parts of a sprite that was
previously drawn into the 8 pixels right of a pellet. And this
is where ZUN went full and went "oh, I
know, let's test the entire 16 pixels, and in case we got an entity
there, we simply make the pellet invisible for this frame! Then
we don't even have to unblit it later!"
Except that this is only done for the first 3 elements of the player
shot array…?! Which don't even necessarily have to contain the 3 shots
fired last. It's not done for the player sprite, the Orb, or, heck,
other pellets that come earlier in the pellet array. (At least
we avoided going 𝑂(𝑛²) there?)
Actually, and I'm only realizing this now as I type this blog post:
This test is done even if the shots at those array elements aren't
active. So, pellets tend to be made invisible based on comparisons
with garbage data.
And then you notice that the player shot
unblit/move/render function is actually only ever called from the
pellet unblit/move/render function on the one global instance
of the player shot manager class, after pellets were unblitted. So, we
end up with a sequence of
which means that we can't ever unblit a previously rendered shot
with a pellet. Sure, as terrible as this one function call is from
a software architecture perspective, it was enough to fix this issue.
Yet we don't even get the intended positive effect, and walk away with
pellets that are made temporarily invisible for no reason at all. So,
uh, maybe it all just was an attempt at increasing the
ramerate on lower spec PC-98 models?
Yup, that's it, we've found the most stupid piece of code in this game,
period. It'll be hard to top this.
I'm confident that it's possible to turn TH01 into a well-written, fluid
PC-98 game, with no flickering, and no perceived lag, once it's
position-independent. With some more in-depth knowledge and documentation
on the EGC (remember, there's still
📝 this one TH03 push waiting to be funded),
you might even be able to continue using that piece of blitter hardware.
And no, you certainly won't need ASM micro-optimizations – just a bit of
knowledge about which optimizations Turbo C++ does on its own, and what
you'd have to improve in your own code. It'd be very hard to write
worse code than what you find in TH01 itself.
(Godbolt for Turbo C++ 4.0J when?
Seriously though, that would 📝 also be a
great project for outside contributors!)
Oh well. In contrast to TH04 and TH05, where 4 pushes only covered all the
involved data types, they were enough to completely cover all of
the pellet code in TH01. Everything's already decompiled, and we never
have to look at it again. 😌 And with that, TH01 has also gone from by far
the least RE'd to the most RE'd game within ReC98, in just half a year! 🎉
Still, that was enough TH01 game logic for a while.
Next up: Making up for the delay with some
more relaxing and easy pieces of TH01 code, that hopefully make just a
bit more sense than all this garbage. More image formats, mainly.