EMS memory! The
infamous stopgap measure between the 640 KiB ("ought to be enough for
everyone") of conventional
memory offered by DOS from the very beginning, and the later XMS standard for
accessing all the rest of memory up to 4 GiB in the x86 Protected Mode. With
an optionally active EMS driver, TH04 and TH05 will make use of EMS memory
to preload a bunch of situational .CDG images at the beginning of
The "eye catch" game title image, shown while stages are loaded
The character-specific background image, shown while bombing
The player character dialog portraits
TH05 additionally stores the boss portraits there, preloading them
at the beginning of each stage. (TH04 instead keeps them in conventional
memory during the entire stage.)
Once these images are needed, they can then be copied into conventional
memory and accessed as usual.
Uh… wait, copied? It certainly would have been possible to map EMS
memory to a regular 16-bit Real Mode segment for direct access,
bank-switching out rarely used system or peripheral memory in exchange for
the EMS data. However, master.lib doesn't expose this functionality, and
only provides functions for copying data from EMS to regular memory and vice
But even that still makes EMS an excellent fit for the large image files
it's used for, as it's possible to directly copy their pixel data from EMS
to VRAM. (Yes, I tried!) Well… would, because ZUN doesn't do
that either, and always naively copies the images to newly allocated
conventional memory first. In essence, this dumbs down EMS into just another
layer of the memory hierarchy, inserted between conventional memory and
disk: Not quite as slow as disk, but still requiring that
memcpy() to retrieve the data. Most importantly though: Using
EMS in this way does not increase the total amount of memory
simultaneously accessible to the game. After all, some other data will have
to be freed from conventional memory to make room for the newly loaded data.
The most idiomatic way to define the game-specific layout of the EMS area
would be either a struct or an enum.
Unfortunately, the total size of all these images exceeds the range of a
16-bit value, and Turbo C++ 4.0J supports neither 32-bit enums
(which are silently degraded to 16-bit) nor 32-bit structs
(which simply don't compile). That still leaves raw compile-time constants
though, you only have to manually define the offset to each image in terms
of the size of its predecessor. But instead of doing that, ZUN just placed
each image at a nice round decimal offset, each slightly larger than the
actual memory required by the previous image, just to make sure that
everything fits. This results not only in quite
a bit of unnecessary padding, but also in technically the single
biggest amount of "wasted" memory in PC-98 Touhou: Out of the 180,000 (TH04)
and 320,000 (TH05) EMS bytes requested, the game only uses 135,552 (TH04)
and 175,904 (TH05) bytes. But hey, it's EMS, so who cares, right? Out of all
the opportunities to take shortcuts during development, this is among the
most acceptable ones. Any actual PC-98 model that could run these two games
comes with plenty of memory for this to not turn into an actual issue.
On to the EMS-using functions themselves, which are the definition of
"cross-cutting concerns". Most of these have a fallback path for the non-EMS
case, and keep the loaded .CDG images in memory if they are immediately
needed. Which totally makes sense, but also makes it difficult to find names
that reflect all the global state changed by these functions. Every one of
these is also just called from a single place, so inlining
them would have saved me a lot of naming and documentation trouble
The TH04 version of the EMS allocation code was actually displayed on ZUN's monitor in the
2010 MAG・ネット documentary; WindowsTiger already transcribed the low-quality video image
in 2019. By 2015 ReC98 standards, I would have just run with that, but
the current project goal is to write better code than ZUN, so I didn't. 😛
We sure ain't going to use magic numbers for EMS offsets.
The dialog init and exit code then is completely different in both games,
yet equally cross-cutting. TH05 goes even further in saving conventional
memory, loading each individual player or boss portrait into a single .CDG
slot immediately before blitting it to VRAM and freeing the pixel data
again. People who play TH05 without an active EMS driver are surely going to
enjoy the hard drive access lag between each portrait change…
TH04, on the other hand, also abuses the dialog
exit function to preload the Mugetsu defeat / Gengetsu entrance and
Gengetsu defeat portraits, using a static variable to track how often the
function has been called during the Extra Stage… who needs function
parameters anyway, right?
This is also the function in which TH04 infamously crashes after the Stage 5
pre-boss dialog when playing with Reimu and without any active EMS driver.
That crash is what motivated this look into the games' EMS usage… but the
code looks perfectly fine? Oh well, guess the crash is not related to EMS
then. Next u–
OK, of course I can't leave it like that. Everyone is expecting a fix now,
and I still got half of a push left over after decompiling the regular EMS
code. Also, I've now RE'd every function that could possibly be involved in
the crash, and this is very likely to be the last time I'll be looking at
Turns out that the bug has little to do with EMS, and everything to do with
ZUN limiting the amount of conventional RAM that TH04's
MAIN.EXE is allowed to use, and then slightly miscalculating
this upper limit. Playing Stage 5 with Reimu is the most asset-intensive
configuration in this game, due to the combination of
6 player portraits (Marisa has only 5), at 128×128 pixels each
a 288×256 background for the boss fight, tied in size only with the
ones in the Extra Stage
the additional 96×80 image for the vertically scrolling stars during
the stage, wastefully stored as 4 bitplanes rather than a single one.
This image is never freed, not even at the end of the stage.
The star image used in TH04's Stage 5.
Remove any single one of the above points, and this crash would have never
occurred. But with all of them combined, the total amount of memory consumed
by TH04's MAIN.EXE just barely exceeds ZUN's limit of 320,000
bytes, by no more than 3,840 bytes, the size of the star image.
But wait: As we established earlier, EMS does nothing to reduce the amount
of conventional memory used by the game. In fact, if you disabled TH04's EMS
handling, you'd still get this crash even if you are running an EMS
driver and loaded DOS into the High Memory Area to free up as much
conventional RAM as possible. How can EMS then prevent this crash in the
The answer: It's only because ZUN's usage of EMS bypasses the need to load
the cached images back out of the XOR-encrypted 東方幻想.郷
packfile. Leaving aside the general
stupidity of any game data file encryption*, master.lib's decryption
implementation is also quite wasteful: It uses a separate buffer that
receives fixed-size chunks of the file, before decrypting every individual
byte and copying it to its intended destination buffer. That really
resembles the typical slowness of a C fread() implementation
more than it does the highly optimized ASM code that master.lib purports to
be… And how large is this well-hidden decryption buffer? 4 KiB.
So, looking back at the game, here is what happens once the Stage 5
pre-battle dialog ends:
Reimu's bomb background image, which was previously freed to make space
for her dialog portraits, has to be loaded back into conventional memory
BB0.CDG is found inside the 東方幻想.郷
file_ropen() ends up allocating a 4 KiB buffer for the
encrypted packfile data, getting us the decisive ~4 KiB closer to the memory
The .CDG loader tries to allocate 52 608 contiguous bytes for the
pixel data of Reimu's bomb image
This would exceed the memory limit, so hmem_allocbyte()
fails and returns a nullptr
ZUN doesn't check for this case (as usual)
The pixel data is loaded to address 0000:0000,
overwriting the Interrupt Vector Table and whatever comes after
The game crashes
The final frame rendered by a crashing TH04.
The 4 KiB encryption buffer would only be freed by the corresponding
file_close() call, which of course never happens because the
game crashes before it gets there. At one point, I really did suspect the
cause to be some kind of memory leak or fragmentation inside master.lib,
which would have been quite delightful to fix.
Instead, the most straightforward fix here is to bump up that memory limit
by at least 4 KiB. Certainly easier than squeezing in a
cdg_free() call for the star image before the pre-boss dialog
without breaking position dependence.
Or, even better, let's nuke all these memory limits from orbit
because they make little sense to begin with, and fix every other potential
out-of-memory crash that modders would encounter when adding enough data to
any of the 4 games that impose such limits on themselves. Unless you want to
launch other binaries (which need to do their own memory allocations) after
launching the game, there's really no reason to restrict the amount of
memory available to a DOS process. Heck, whenever DOS creates a new one, it
assigns all remaining free memory by default anyway.
Removing the memory limits also removes one of ZUN's few error checks, which
end up quitting the game if there isn't at least a given maximum amount of
conventional RAM available. While it might be tempting to reserve enough
memory at the beginning of execution and then never check any allocation for
a potential failure, that's exactly where something like TH04's crash
This game is also still running on DOS, where such an initial allocation
failure is very unlikely to happen – no one fills close to half of
conventional RAM with TSRs and then tries running one of these games. It
might have been useful to detect systems with less than 640 KiB of
actual, physical RAM, but none of the PC-98 models with that little amount
of memory are fast enough to run these games to begin with. How ironic… a
place where ZUN actually added an error check, and then it's mostly
Technical debt, part 5… and we only got TH05's stupidly optimized
.PI functions this time?
As far as actual progress is concerned, that is. In maintenance news
though, I was really hyped for the #include improvements I've
mentioned in 📝 the last post. The result: A
new x86real.h file, bundling all the declarations specific to
the 16-bit x86 Real Mode in a smaller file than Turbo C++'s own
DOS.H. After all, DOS is something else than the underlying
CPU. And while it didn't speed up build times quite as much as I had hoped,
it now clearly indicates the x86-specific parts of PC-98 Touhou code to
future port authors.
After another couple of improvements to parameter declaration in ASM land,
we get to TH05's .PI functions… and really, why did ZUN write all of
them in ASM? Why (re)declare all the necessary structures and data in
ASM land, when all these functions are merely one layer of abstraction
above master.lib, which does all the actual work?
I get that ZUN might have wanted masked blitting to be faster, which is
used for the fade-in effect seen during TH05's main menu animation and the
ending artwork. But, uh… he knew how to modify master.lib. In fact, he
did already modify the graph_pack_put_8() function
used for rendering a single .PI image row, to ignore master.lib's VRAM
clipping region. For this effect though, he first blits each row regularly
to the invisible 400th row of VRAM, and then does an EGC-accelerated
VRAM-to-VRAM blit of that row to its actual target position with the mask
enabled. It would have been way more efficient to add another version of
this function that takes a mask pattern. No amount of REP
MOVSW is going to change the fact that two VRAM writes per line are
slower than a single one. Not to mention that it doesn't justify writing
every other .PI function in ASM to go along with it…
This is where we also find the most hilarious aspect about this: For most
of ZUN's pointless micro-optimizations, you could have maybe made the
argument that they do save some CPU cycles here and there, and
therefore did something positive to the final, PC-98-exclusive result. But
some of the hand-written ASM here doesn't even constitute a
micro-optimization, because it's worse than what you would have got
out of even Turbo C++ 4.0J with its 80386 optimization flags!
At least it was possible to "decompile" 6 out of the 10 functions
here, making them easy to clean up for future modders and port authors.
Could have been 7 functions if I also decided to "decompile"
pi_free(), but all the C++ code is already surrounded by ASM,
resulting in 2 ASM translation units and 2 C++ translation units.
pi_free() would have needed a single translation unit by
itself, which wasn't worth it, given that I would have had to spell out
every single ASM instruction anyway.
There you go. What about this needed to be written in ASM?!?
The function calls between these small translation units even seemed to
glitch out TASM and the linker in the end, leading to one CALL
offset being weirdly shifted by 32 bytes. Usually, TLINK reports a fixup
overflow error when this happens, but this time it didn't, for some reason?
Mirroring the segment grouping in the affected translation unit did solve
the problem, and I already knew this, but only thought of it after spending
quite some RTFM time… during which I discovered the -lE
switch, which enables TLINK to use the expanded dictionaries in
Borland's .OBJ and .LIB files to speed up linking. That shaved off roughly
another second from the build time of the complete ReC98 repository. The
more you know… Binary blobs compiled with non-Borland tools would be the
only reason not to use this flag.
So, even more slowdown with this 5th dedicated push, since we've still only
repaid 41% of the technical debt in the SHARED segment so far.
Next up: Part 6, which hopefully manages to decompile the FM and SSG
channel animations in TH05's Music Room, and hopefully ends up being the
final one of the slow ones.
You can now freely add or remove both data and code anywhere in TH05, by
editing the ReC98 codebase, writing your mod in ASM or C/C++, and
recompiling the code. Since all absolute memory addresses have now been
converted to labels, this will work without causing any instability. See
the position independence section in the FAQ
for a more thorough explanation about why this was a problem.
By extension, this also means that it's now theoretically possible
to use a different compiler on the source code. But:
What does this not mean?
The original ZUN code hasn't been completely reverse-engineered yet, let
alone decompiled. As the final PC-98 Touhou game, TH05 also happens to
have the largest amount of actual ZUN-written ASM that can't ever
be decompiled within ReC98's constraints of a legit source code
reconstruction. But a lot of the originally-in-C code is also still in
ASM, which might make modding a bit inconvenient right now. And while I
have decompiled a bunch of functions, I selected them largely
because they would help with PI (as requested by the backers), and not
because they are particularly relevant to typical modding interests.
As a result, the code might also be a bit confusingly organized. There's
quite a conflict between various goals there: On the one hand, I'd like to
only have a single instance of every function shared with earlier games,
as well as reduce ZUN's code duplication within a single game. On the
other hand, this leads to quite a lot of code being scattered all over the
place and then #include-pasted back together, except for the
📝 this doesn't work, and you'd have to use multiple translation units anyway…
I'm only beginning to figure out the best structure here, and some more
reverse-engineering attention surely won't hurt.
Also, keep in mind that the code still targets x86 Real Mode. To work
effectively in this codebase, you'd need some familiarity with
segmentation, and how to express it all in code. This tends to make
even regular C++ development about an order of magnitude harder,
especially once you want to interface with the remaining ASM code. That
part made -Tom- struggle quite a bit with implementing his
custom scripting language for the demo above. For now, he built that demo
on quite a limited foundation – which is why he also chose to release
neither the build nor the source publically for the time being.
So yeah, you're definitely going to need the TASM and Borland C++ manuals
tl;dr: We now know everything about this game's data, but not quite
as much about this game's code.
So, how long until source ports become a realistic project?
You probably want to wait for 100% RE, which is when everything
that can be decompiled has been decompiled.
Unless your target system is 16-bit Windows, in which case you could
theoretically start right away. 📝 Again,
this would be the ideal first system to port PC-98 Touhou to: It would
require all the generic portability work to remove the dependency on PC-98
hardware, thus paving the way for a subsequent port to modern systems,
yet you could still just drop in any undecompiled ASM.
Porting to IBM-compatible DOS would only be a harder and less universally
useful version of that. You'd then simply exchange one architecture, with
its idiosyncrasies and limits, for another, with its own set of
idiosyncrasies and limits. (Unless, of course, you already happen to be
intimately familiar with that architecture.) The fact that master.lib
provides DOS/V support would have only mattered if ZUN consistently used
it to abstract away PC-98 hardware at every single place in the code,
which is definitely not the case.
The list of actually interesting findings in this push is,
📝 again, very short. Probably the most
notable discovery: The low-level part of the code that renders Marisa's
laser from her TH04 Illusion Laser shot type is still present in
TH05. Insert wild mass guessing about potential beta version shot types…
Oh, and did you know that the order of background images in the Extra
Stage staff roll differs by character?
Next up: Finally driving up the RE% bar again, by decompiling some TH05
main menu code.
Finally, after a long while, we've got two pushes with barely anything to
talk about! Continuing the road towards 100% PI for TH05, these were
exactly the two pushes that TH05 MAINE.EXE PI was estimated
to additionally cost, relative to TH04's. Consequently, they mostly went
to TH05's unique data structures in the ending cutscenes, the score name
registration menu, and the
A unique feature in there is TH05's support for automatic text color
changes in its ending scripts, based on the first full-width Shift-JIS
codepoint in a line. The \c=codepoint,color
commands at the top of the _ED??.TXT set up exactly this
codepoint→color mapping. As far as I can tell, TH05 is the only Touhou
game with a feature like this – even the Windows Touhou games went back to
manually spelling out each color change.
The orb particles in TH05's staff roll also try to be a bit unique by
using 32-bit X and Y subpixel variables for their current position. With
still just 4 fractional bits, I can't really tell yet whether the extended
range was actually necessary. Maybe due to how the "camera scrolling"
through "space" was implemented? All other entities were pretty much the
usual fare, though.
12.4, 4.4, and now a 28.4 fixed-point format… yup,
📝 C++ templates were
definitely the right choice.
At the end of its staff roll, TH05 not only displays
the usual performance
verdict, but then scrolls in the scores at the end of each stage
before switching to the high score menu. The simplest way to smoothly
scroll between two full screens on a PC-98 involves a separate bitmap…
which is exactly what TH05 does here, reserving 28,160 bytes of its global
data segment for just one overly large monochrome 320×704 bitmap where
both the screens are rendered to. That's… one benefit of splitting your
game into multiple executables, I guess?
Not sure if it's common knowledge that you can actually scroll back and
forth between the two screens with the Up and Down keys before moving to
the score menu. I surely didn't know that before. But it makes sense –
might as well get the most out of that memory.
The necessary groundwork for all of this may have actually made
TH04's (yes, TH04's) MAINE.EXE technically
position-independent. Didn't quite reach the same goal for TH05's – but
what we did reach is ⅔ of all PC-98 Touhou code now being
position-independent! Next up: Celebrating even more milestones, as
-Tom- is about to finish development on his TH05
MAIN.EXE PI demo…
Alright, the score popup numbers shown when collecting items or defeating
(mid)bosses. The second-to-last remaining big entity type in TH05… with
quite some PI false positives in the memory range occupied by its data.
Good thing I still got some outstanding generic RE pushes that haven't
been claimed for anything more specific in over a month! These
conveniently allowed me to RE most of these functions right away, the
Most of the false positives were boss HP values, passed to a "boss phase
end" function which sets the HP value at which the next phase should end.
Stage 6 Yuuka, Mugetsu, and EX-Alice have their own copies of this
function, in which they also reset certain boss-specific global variables.
Since I always like to cover all varieties of such duplicated functions at
once, it made sense to reverse-engineer all the involved variables while I
was at it… and that's why this was exactly the right time to cover the
implementation details of Stage 6 Yuuka's parasol and vanishing animations
With still a bit of time left in that RE push afterwards, I could also
start looking into some of the smaller functions that didn't quite fit
into other pushes. The most notable one there was a simple function that
aims from any point to the current player position. Which actually only
became a separate function in TH05, probably since it's called 27 times in
total. That's 27 places no longer being blocked from further RE progress.
did most of the work for the score popup numbers in January, which meant
that I only had to review it and bring it up to ReC98's current coding
styles and standards. This one turned out to be one of those rare features
whose TH05 implementation is significantly less insane than the
TH04 one. Both games lazily redraw only the tiles of the stage background
that were drawn over in the previous frame, and try their best to minimize
the amount of tiles to be redrawn in this way. For these popup numbers,
this involves calculating the on-screen width, based on the exact number
of digits in the point value. TH04 calculates this width every frame
during the rendering function, and even resorts to setting that field
through the digit iteration pointer via self-modifying code… yup. TH05, on
the other hand, simply calculates the width once when spawning a new popup
number, during the conversion of the point value to
decimal. The "×2" multiplier suffix being removed in TH05 certainly
also helped in simplifying that feature in this game.
And that's ⅓ of TH05 reverse-engineered! Next up, one more TH05 PI push,
in which the stage enemies hopefully finish all the big entity types.
Maybe it will also be accompanied by another RE push? In any case, that
will be the last piece of TH05 progress for quite some time. The next TH01
stretch will consist of 6 pushes at the very least, and I currently have
no idea of how much time I can spend on ReC98 a month from now…
Wait, PI for FUUIN.EXE is mainly blocked by the high score
menu? That one should really be properly decompiled in a separate
RE push, since it's also present in largely identical form in
REIIDEN.EXE… but I currently lack the explicit funding to do
And as it turns out, I shouldn't really capture any of the existing generic
RE contributions for it either. Back in 2018 when I ran the crowdfunding
on the Touhou Patch Center Discord server, I said that generic RE
contributions would never go towards TH01. No one was interested in that
game back then, and as it's significantly different from all the other
games, it made sense to only cover it if explicitly requested.
As Touhou Patch Center still remains one of the biggest supporters and
advertisers for ReC98, someone recently believed that this rule was still
in effect, despite not being mentioned anywhere on this website.
Fast forward to today, and TH01 has become the single most supported game
lately, with plenty of incomplete pushes still open to be completed.
Reverse-engineering it has proven to be quite efficient, yielding lots of
completion percentage points per push. This, I suppose, is exactly what
backers that don't give any specific priorities are mainly interested in.
Therefore, I will allocate future partial
contributions to TH01, whenever it makes sense.
So, instead of rushing TH01 PI, let's wait for Ember2528's
April subscription, and get the 25% total RE milestone with some TH05 PI
progress instead. This one primarily focused on the gather circles
(spirals…?), the third-last missing entity type in TH05. These are
rendered using the same 8×8 pellet sprite introduced in TH02… except that
the actual pellets received a darkened bottom part in TH04
Which, in turn, is actually rendered quite efficiently – the games first
render the top white part of all pellets, followed by the bottom gray part
of all pellets. The PC-98 GRCG is used throughout the process, doing its
typical job of accelerating monochrome blitting, and by arranging the
rendering like this, only two GRCG color changes are required to draw any
number of pellets. I guess that makes it quite a worthwhile
optimization? Don't ask me for specific performance numbers or even saved
Long time no see! And this is exactly why I've been procrastinating
bullets while there was still meaningful progress to be had in other parts
of TH04 and TH05: There was bound to be quite some complexity in this most
central piece of game logic, and so I couldn't possibly get to a
satisfying understanding in just one push.
Or in two, because their rendering involves another bunch of
micro-optimized functions adapted from master.lib.
Or in three, because we'd like to actually name all the bullet sprites,
since there are a number of sprite ID-related conditional branches. And
so, I was refining things I supposedly RE'd in the the commits from the
first push until the very end of the fourth.
When we talk about "bullets" in TH04 and TH05, we mean just two things:
the white 8×8 pellets, with a cap of 240 in TH04 and 180 in TH05, and any
16×16 sprites from MIKO16.BFT, with a cap of 200 in TH04 and
220 in TH05. These are by far the most common types of… err, "things the
player can collide with", and so ZUN provides a whole bunch of pre-made
motion, animation, and
n-way spread / ring / stack group options for those, which can be
selected by simply setting a few fields in the bullet template. All the
other "non-bullets" have to be fired and controlled individually.
Which is nothing new, since uth05win covered this part pretty accurately –
I don't think anyone could just make up these structure member
overloads. The interesting insights here all come from applying this
research to TH04, and figuring out its differences compared to TH05. The
most notable one there is in the default groups: TH05 allows you to add
to any single bullet, n-way spread or ring, but TH04 only lets you create
stacks separately from n-way spreads and rings, and thus gets by with
fewer fields in its bullet template structure. On the other hand, TH04 has
a separate "n-way spread with random angles, yet still aimed at the
player" group? Which seems to be unused, at least as far as
midbosses and bosses are concerned; can't say anything about stage enemies
In fact, TH05's larger bullet template structure illustrates that these
distinct group types actually are a rather redundant piece of
over-engineering. You can perfectly indicate any permutation of the basic
groups through just the stack bullet count (1 = no stack), spread bullet
count (1 = no spread), and spread delta angle (0 = ring instead of
spread). Add a 4-flag bitfield to cover the rest (aim to player, randomize
angle, randomize speed, force single bullet regardless of difficulty or
rank), and the result would be less redundant and even slightly
Even those 4 pushes didn't quite finish all of the bullet-related types,
stopping just shy of the most trivial and consistent enum that defines
special movement. This also left us in a
📝 TH03-like situation, in which we're still
a bit away from actually converting all this research into actual RE%. Oh
well, at least this got us way past 50% in overall position independence.
On to the second half! 🎉
For the next push though, we'll first have a quick detour to the remaining
C code of all the ZUN.COM binaries. Now that the
📝 TH04 and TH05 resident structures no
longer block those, -Tom- has requested TH05's
RES_KSO.COM to be covered in one of his outstanding pushes.
And since 32th System
recently RE'd TH03's resident structure, it makes sense to also review and
merge that, before decompiling all three remaining RES_*.COM
binaries in hopefully a single push. It might even get done faster than
that, in which case I'll then review and merge some more of
So, where to start? Well, TH04 bullets are hard, so let's
procrastinate start with TH03 instead
The 📝 sprite display functions are the
obvious blocker for any structure describing a sprite, and therefore most
meaningful PI gains in that game… and I actually did manage to fit a
decompilation of those three functions into exactly the amount of time
that the Touhou Patch Center community votes alloted to TH03
And a pretty amazing one at that. The original code was so obviously
written in ASM and was just barely decompilable by exclusively using
register pseudovariables and a bit of goto, but I was able to
abstract most of that away, not least thanks to a few helpful optimization
properties of Turbo C++… seriously, I can't stop marveling at this ancient
compiler. The end result is both readable, clear, and dare I say
portable?! To anyone interested in porting TH03,
take a look. How painful would it be to port that away from 16-bit
However, this push is also a typical example that the RE/PI priorities can
only control what I look at, and the outcome can actually differ
greatly. Even though the priorities were 65% RE and 35% PI, the progress
outcome was +0.13% RE and +1.35% PI. But hey, we've got one more push with
a focus on TH03 PI, so maybe that one will include more RE than
PI, and then everything will end up just as ordered?
Here we go, new C code! …eh, it will still take a bit to really get
decompilation going at the speeds I was hoping for. Especially with the
sheer amount of stuff that is set in the first few significant
functions we actually can decompile, which now all has to be
correctly declared in the C world. Turns out I spent the last 2 years
screwing up the case of exported functions, and even some of their names,
so that it didn't actually reflect their calling convention… yup. That's
just the stuff you tend to forget while it doesn't matter.
To make up for that, I decided to research whether we can make use of some
C++ features to improve code readability after all. Previously, it seemed
that TH01 was the only game that included any C++ code, whereas TH02 and
later seemed to be 100% C and ASM. However, during the development of the
soon to be released new build system, I noticed that even this old
compiler from the mid-90's, infamous for prioritizing compile speeds over
all but the most trivial optimizations, was capable of quite surprising
levels of automatic inlining with class methods…
…leading the research to culminate in the mindblow that is
9d121c7 – yes, we can use C++ class methods
and operator overloading to make the code more readable, while still
generating the same code than if we had just used C and preprocessor
Looks like there's now the potential for a few pull requests from outside
devs that apply C++ features to improve the legibility of previously
decompiled and terribly macro-ridden code. So, if anyone wants to help
without spending money…
Turns out I had only been about half done with the drawing routines. The rest was all related to redrawing the scrolling stage backgrounds after other sprites were drawn on top. Since the PC-98 does have hardware-accelerated scrolling, but no hardware-accelerated sprites, everything that draws animated sprites into a scrolling VRAM must then also make sure that the background tiles covered by the sprite are redrawn in the next frame, which required a bit of ZUN code. And that are the functions that have been in the way of the expected rapid reverse-engineering progress that uth05win was supposed to bring. So, looks like everything's going to go really fast now?
… yeah, no, we won't get very far without figuring out these drawing routines.
Which process data that comes from the .STD files.
Which has various arrays related to the background… including one to specify the scrolling speed. And wait, setting that to 0 actually is what starts a boss battle?
So, have a TH05 Boss Rush patch: 2018-12-26-TH05BossRush.zip
Theoretically, this should have also worked for TH04, but for some reason,
the Stage 3 boss gets stuck on the first phase if we do this?