🎉 TH05 is finally fully position-independent! 🎉 To celebrate this
milestone, -Tom- coded a little demo, which we recorded on
both an emulator and on real PC-98 hardware:
You can now freely add or remove both data and code anywhere in TH05, by
editing the ReC98 codebase, writing your mod in ASM or C/C++, and
recompiling the code. Since all absolute memory addresses have now been
converted to labels, this will work without causing any instability. See
the position independence section in the FAQ
for a more thorough explanation about why this was a problem.
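To make the problem a bit more concrete, here is a minimal C sketch of what goes wrong with a hard-coded address once data is inserted in front of it. The structures, field names, and the `BOMBS_ABSOLUTE` constant are all made up for illustration and have nothing to do with TH05's actual data layout:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical data layout before a mod. */
struct data_v1 {
    uint8_t lives;
    uint8_t bombs;
};

/* The same data after a mod inserted a new variable at the front. */
struct data_v2 {
    uint8_t mod_flag; /* newly added by the mod */
    uint8_t lives;
    uint8_t bombs;
};

/* A raw offset, as it would appear in code that is not yet
 * position-independent. It was only ever correct for data_v1. */
enum { BOMBS_ABSOLUTE = 1 };

/* Reads through the raw number, regardless of the actual layout. */
uint8_t bombs_via_absolute(const void *data)
{
    return ((const uint8_t *)data)[BOMBS_ABSOLUTE];
}

/* Reads through the symbolic name. The compiler recalculates the
 * offset on every build, so the inserted field does no harm. */
uint8_t bombs_via_label(const struct data_v2 *data)
{
    return data->bombs;
}
```

With the modded layout, `bombs_via_absolute()` silently returns the `lives` field, while `bombs_via_label()` keeps returning the right value.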
By extension, this also means that it's now theoretically possible
to use a different compiler on the source code. But:
What does this not mean?
The original ZUN code hasn't been completely reverse-engineered yet, let
alone decompiled. As the final PC-98 Touhou game, TH05 also happens to
have the largest amount of actual ZUN-written ASM that can't ever
be decompiled within ReC98's constraints of a legit source code
reconstruction. But a lot of the originally-in-C code is also still in
ASM, which might make modding a bit inconvenient right now. And while I
have decompiled a bunch of functions, I selected them largely
because they would help with PI (as requested by the backers), and not
because they are particularly relevant to typical modding interests.
As a result, the code might also be a bit confusingly organized. There's
quite a conflict between various goals there: On the one hand, I'd like to
only have a single instance of every function shared with earlier games,
as well as reduce ZUN's code duplication within a single game. On the
other hand, this leads to quite a lot of code being scattered all over the
place and then #include-pasted back together, except for the places where
📝 this doesn't work, and you'd have to use multiple translation units anyway…
I'm only beginning to figure out the best structure here, and some more
reverse-engineering attention surely won't hurt.
Also, keep in mind that the code still targets x86 Real Mode. To work
effectively in this codebase, you'd need some familiarity with memory
segmentation, and how to express it all in code. This tends to make
even regular C++ development about an order of magnitude harder,
especially once you want to interface with the remaining ASM code. That
part made -Tom- struggle quite a bit with implementing his
custom scripting language for the demo above. For now, he built that demo
on quite a limited foundation – which is why he also chose to release
neither the build nor the source publicly for the time being.
So yeah, you're definitely going to need the TASM and Borland C++ manuals
there.
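For those who haven't run into Real Mode segmentation before, the core rule fits into a few lines of C: a 16-bit segment and a 16-bit offset combine into a 20-bit linear address, and many different segment:offset pairs point to the same byte. The `far_ptr_t` type and both function names are just illustrative here, and not taken from ReC98 or Borland's headers:

```c
#include <stdint.h>

/* A Real Mode far pointer, split into its two 16-bit halves. */
typedef struct {
    uint16_t segment;
    uint16_t offset;
} far_ptr_t;

/* Real Mode address calculation: segment * 16 + offset. */
uint32_t linear_address(far_ptr_t p)
{
    return ((uint32_t)p.segment << 4) + p.offset;
}

/* Returns the equivalent pointer with the smallest possible offset
 * (0 to 15). Assumes that the linear address lies below 1 MiB, so
 * that the resulting segment still fits into 16 bits. */
far_ptr_t normalize(far_ptr_t p)
{
    uint32_t linear = linear_address(p);
    far_ptr_t ret;
    ret.segment = (uint16_t)(linear >> 4);
    ret.offset = (uint16_t)(linear & 0xF);
    return ret;
}
```

For example, both A800:0000 and A000:8000 resolve to linear address 0xA8000, the start of the PC-98's first graphics VRAM plane. Comparing two such pointers for equality is one of the things that gets awkward once segments are involved.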
tl;dr: We now know everything about this game's data, but not quite
as much about this game's code.
So, how long until source ports become a realistic project?
You probably want to wait for 100% RE, which is when everything
that can be decompiled has been decompiled.
Unless your target system is 16-bit Windows, in which case you could
theoretically start right away. 📝 Again,
this would be the ideal first system to port PC-98 Touhou to: It would
require all the generic portability work to remove the dependency on PC-98
hardware, thus paving the way for a subsequent port to modern systems,
yet you could still just drop in any undecompiled ASM.
Porting to IBM-compatible DOS would only be a harder and less universally
useful version of that. You'd then simply exchange one architecture, with
its idiosyncrasies and limits, for another, with its own set of
idiosyncrasies and limits. (Unless, of course, you already happen to be
intimately familiar with that architecture.) The fact that master.lib
provides DOS/V support would have only mattered if ZUN consistently used
it to abstract away PC-98 hardware at every single place in the code,
which is definitely not the case.
The list of actually interesting findings in this push is,
📝 again, very short. Probably the most
notable discovery: The low-level part of the code that renders Marisa's
laser from her TH04 Illusion Laser shot type is still present in
TH05. Insert wild mass guessing about potential beta version shot types…
Oh, and did you know that the order of background images in the Extra
Stage staff roll differs by character?
Next up: Finally driving up the RE% bar again, by decompiling some TH05
main menu code.
Wouldn't it be a bit disappointing to have TH05 completely
position-independent, but have it still require hex-editing of the
original ZUN.COM to mod its gaiji characters? As in, these
custom "text" glyphs, available to the PC-98 text RAM:
Especially since we now even have a sprite converter… the lack of which
was exactly 📝 what made rebuilding ZUN.COM not that worthwhile before.
So, before the big release, let's get all the remaining
ZUN.COM sub-binaries of TH04 and TH05 dumped into .ASM files,
and re-assembled and linked during the build process.
This is also the moment in which Egor's 2018
reimplementation of O. Morikawa's comcstm finally gets
to shine. Back then, I considered it too early to even bother with
ZUN.COM and reimplementing the .COM wrapper that ZUN
originally used to bundle multiple smaller executables into that single
binary. But now that the time is right, it is nice to have that
code, as it allowed me to get these rebuilds done in half a push.
Otherwise, it would have surely required one or two dedicated ones.
Since we like correctness here, newly dumped ZUN code means that it also
has to be included in the RE%
baseline calculation. This is why TH04's and TH05's overall RE% bars
have gone back a tiny bit… in case you remember what they previously looked
like. After all, I would like to figure
out where all that memory allocated during TH04's and TH05's memory check
is freed, if at all.
Alright, one half of a push left… Y'know, getting rid of those last few PI
false positives is actually one of the most annoying chores in this
project, and quite stressful as well: I have to convince myself that the
remaining false positives are, in fact, not memory references, but with
way too little time for in-depth RE to work out and denote what they are
instead. In that situation, everyone (including myself!)
is anticipating that PI goal, and no one is really interested in RE.
(Well… that is, until they actually get to developing their mod. But more
on that tomorrow.) Which means that it boils
down to quite some hasty, dumb, and superficial RE around those remaining
numbers.
So, in the hope of making it less annoying for the other 4 games in the
future, let's systematically cover the sources of those remaining false
positives from TH05, across all games: I/O port accesses with either the port
or the value in registers (and thus, no longer as an immediate argument to
the IN or OUT instructions, which the PI counter
can clearly ignore), palette color arithmetic, or heck, 0xFF constants that
obviously just mean "-1" and are not a reference to offset 0xFF in
the data segment. All of this, of course, once again had a way bigger
effect on everything but an almost position-independent TH05… but
hey, that's the sort of thing you reserve the "anything" pushes for. And
that's also how we get some of the single biggest PI% gains we have seen
so far, and will be seeing before the 100% PI mark. And yes, those will
continue in the next push.
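The 0xFF case is easy to demonstrate in C: the same bits read as 255 when treated as unsigned, and as -1 when treated as a signed byte, which is why such an immediate can look like a data segment offset in a disassembly. The two helper functions below are hypothetical and only exist for this illustration; they are not part of the PI counter:

```c
#include <stdint.h>

/* Interprets an 8-bit immediate as a two's complement signed value. */
int as_signed_byte(uint8_t imm)
{
    return (imm & 0x80) ? ((int)imm - 0x100) : (int)imm;
}

/* Same for a 16-bit immediate, e.g. 0xFFFF. */
int32_t as_signed_word(uint16_t imm)
{
    return (imm & 0x8000) ? ((int32_t)imm - 0x10000) : (int32_t)imm;
}
```

Whether a given 0xFF in the code actually means -1 still has to be decided by looking at how the value is used, which is exactly the superficial RE described above.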
🎉 TH01's OP.EXE and FUUIN.EXE are now fully
position-independent! 🎉
What does this mean?
You can now add any data or code to TH01's main menu or ending cutscenes,
by simply editing the ReC98 source, writing your mod in ASM or C++, and
recompiling the code. Since all absolute memory addresses in OP
and FUUIN have now been converted to labels, this
will work without causing any instability. See the
position independence section in the FAQ for a more thorough
explanation about why this was a problem.
As an example, the most popular TH01 mod idea, replacing MDRV2 with PMD,
could now at least be prototyped and tested in
OP.EXE, without having to worry about x86 instruction lengths.
📝 Check the video I made for the TH04/TH05 OP.EXE PI announcement for a basic overview of how to do that.
What does this not mean?
The original ZUN code hasn't been completely decompiled yet. The final
high-level parts of both the main menu and the cutscenes are still ASM,
which might make modding a bit inconvenient right now.
It's not that much more code though, and could quickly be covered in a few
pushes if requested. Due to the plentiful monthly subscriptions, the shop
will stay closed for regular orders until the end of June, but backers
with outstanding contributions could request that now if they want
to – simply drop me a mail. Otherwise, the "generic TH01 RE" money will
continue to go towards the main game. That way, we'll have more substance
to show once we do decide to decompile the rest of
OP.EXE and FUUIN.EXE, and likely get some press
coverage as a result.
Then again, we've been building up to this point over the last few pushes,
and it only really needed a quick look over the remaining false positives.
The majority of the time therefore went towards more PI in
REIIDEN.EXE, where the bitplane pointers for .BOS files yielded
some quite big gains. Couldn't really find any obvious reason why ZUN used
two slightly different variations on loading and blitting those files,
though…
As the final function in this rather random push, we got TH01's
hardware-powered scrolling function, used for screen shaking effects and
the scrolling backgrounds at the start of the Final Boss stages. And while
I tried to document all these I/O writes… it turned out that ZUN actually
copied the entire function straight from the PC-9801 Programmers'
Bible, with no changes. It's the
setgsta() example function on page 150. Which is terribly
suboptimal and bloated – all those integer divisions are really
not how you'd write such code for a 16-bit compiler from the 90's…
And that gives us 60% PI overall, and 50% PI over all of TH01! Next up:
More structures… and classes, even?
🎉 TH04's and TH05's OP.EXE are now fully
position-independent! 🎉
What does this mean?
You can now add any data or code to the main menus of the two games, by
simply editing the ReC98 source, writing your mod in ASM or C/C++, and
recompiling the code. Since all absolute memory addresses have now been
converted to labels, this will work without causing any instability. See
the position independence section in the FAQ
for a more thorough explanation about why this was a problem.
What does this not mean?
The original ZUN code hasn't been completely reverse-engineered yet, let
alone decompiled. Pretty much all of that is still ASM, which might make
modding a bit inconvenient right now.
Since this push was otherwise pretty unremarkable, I made a video
demonstrating a few basic things you can do with this:
Now, what to do for the last outstanding Touhou Patch Center push?
Bullets, or resident structures?
Big gains, as expected, but not much to say about this one. With TH05 Reimu
being way too easy to decompile after
📝 the shot control groundwork done in October,
there was enough time to give the comprehensive PI false-positive
treatment to two other sets of functions present in TH04's and TH05's
OP.EXE. One of them, master.lib's super_*()
functions, was used a lot in TH02, more than in any other game… I
wonder how much more that game will progress without even focusing on it
in particular.
Alright then! 100% PI for TH04's and TH05's OP.EXE upcoming…
(Edit: Already got funding to cover this!)
With no feedback to 📝 last week's blog post,
I assume you all are fine with how things are going? Alright then, another
one towards position independence, with the same approach as before…
Since -Tom- wanted to learn something about how the PC-98
EGC is used in TH04 and TH05, I took a look at master.lib's
egc_shift_*() functions. These simply do a hardware-accelerated
memmove() of any VRAM region, and are used for screen shaking
effects.
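As a rough software analogy of that memmove() idea, the following C sketch shifts a range of rows upwards within a single simulated bitplane. It assumes the PC-98's 640×400 resolution at 1 bit per pixel and plane, i.e. 80 bytes per row. `shift_rows_up()` is a made-up name, and this is neither master.lib's API nor how the EGC is actually programmed:

```c
#include <stdint.h>
#include <string.h>

enum {
    ROW_SIZE = 80, /* 640 pixels / 8 pixels per byte */
    ROWS = 400
};

/* Moves the rows [top, top + height) of a single bitplane upwards by
 * `lines` rows. The caller has to make sure that lines <= top and
 * top + height <= ROWS. Source and destination may overlap, which is
 * why this uses memmove() rather than memcpy(). */
void shift_rows_up(uint8_t *plane, int top, int height, int lines)
{
    uint8_t *dst = plane + ((size_t)(top - lines) * ROW_SIZE);
    const uint8_t *src = plane + ((size_t)top * ROW_SIZE);
    memmove(dst, src, (size_t)height * ROW_SIZE);
}
```

Like memmove(), this leaves the vacated rows at the bottom of the region with their old contents, so a real screen shake effect would still have to redraw or clear them.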
Then, I finally wanted to take a look at the bullet structures, but it
required way too much reverse-engineering to even start within ¾ of
a position independence push. Even with the help of uth05win –
bullet handling was changed quite a bit from TH04 to TH05.
What I ultimately settled on was more raw, "boring" PI work based around
an already known set of functions. For this one, I looked at vector
construction… and this time, that actually made the games a little
bit more position-independent, and wasn't just all about removing
false positives from the calculation. This was one of the few sets of
functions that would also apply to TH01, and it revealed just how
chaotically that game was coded. This one commit shows three ways in which ZUN
stored regular 2D points in TH01:
"regularly", like in master.lib's Point structure (X
first, Y second)
reversed, (Y first and X second), then obviously with two distinct
variables declared next to each other
… yeah. But in more productive news, this did actually lay the
groundwork for TH04 and TH05 bullet structures. Which might even be coming
up within the next big, 5-push order from Touhou Patch Center? These are
the priorities I got from them, let's see how close I can get!
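To illustrate why the point layouts mentioned above matter for this kind of work, here is a small C sketch of the structure-based variants. The type and function names are made up, and only the X-first layout is meant to resemble master.lib's Point structure:

```c
#include <stddef.h>

/* X first, Y second, as in master.lib's Point structure. */
typedef struct {
    int x, y;
} point_xy_t;

/* The reversed layout: Y first, X second. */
typedef struct {
    int y, x;
} point_yx_t;

/* Converts the reversed layout into the regular one. */
point_xy_t from_yx(point_yx_t p)
{
    point_xy_t ret;
    ret.x = p.x;
    ret.y = p.y;
    return ret;
}

/* Builds a regular point out of two separately declared variables. */
point_xy_t from_vars(int x, int y)
{
    point_xy_t ret;
    ret.x = x;
    ret.y = y;
    return ret;
}
```

In ASM, all of these show up as two adjacent 16-bit words, so the only way to tell them apart is to check which of the two words a function treats as the X coordinate.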