⮜ Blog

⮜ List of tags

Showing all posts tagged
and

📝 Posted:
🚚 Summary of:
P0174, P0175, P0176, P0177, P0178, P0179, P0180, P0181
Commits:
27f901c...a0fe812, a0fe812...40ac9a7, 40ac9a7...c5dc45b, c5dc45b...5f0cabc, 5f0cabc...60621f8, 60621f8...9e5b344, 9e5b344...091f19f, 091f19f...313450f
💰 Funded by:
Ember2528, Yanga
🏷 Tags:

Here we go, TH01 Sariel! This is the single biggest boss fight in all of PC-98 Touhou: If we include all custom effect code we previously decompiled, it amounts to a total of 10.31% of all code in TH01 (and 3.14% overall). These 8 pushes cover the final 8.10% (or 2.47% overall), and are likely to be the single biggest delivery this project will ever see. Considering that I only managed to decompile 6.00% across all games in 2021, 2022 is already off to a much better start!

So, how can Sariel's code be that large? Well, we've got:

In total, it's just under 3,000 lines of C++ code, containing a total of 8 definite ZUN bugs, 3 of them being subpixel/pixel confusions. That might not look all too bad if you compare it to the 📝 player control function's 8 bugs in 900 lines of code, but given that Konngara had 0… (Edit (2022-07-17): Konngara contains two bugs after all: A 📝 possible heap corruption in test or debug mode, and the infamous 📝 temporary green discoloration.) And no, the code doesn't make it obvious whether ZUN coded Konngara or Sariel first; there's just as much evidence for either.

Some terminology before we start: Sariel's first form is separated into four phases, indicated by different background images, that cycle until Sariel's HP reach 0 and the second, single-phase form starts. The danmaku patterns within each phase are also on a cycle, and the game picks a random but limited number of patterns per phase before transitioning to the next one. The fight always starts at pattern 1 of phase 1 (the random purple lasers), and each new phase also starts at its respective first pattern.


Sariel's bugs already start at the graphics asset level, before any code gets to run. Some of the patterns include a wand raise animation, which is stored in BOSS6_2.BOS:

TH01 BOSS6_2.BOS
Umm… OK? The same sprite twice, just with slightly different colors? So how is the wand lowered again?

The "lowered wand" sprite is missing in this file simply because it's captured from the regular background image in VRAM, at the beginning of the fight and after every background transition. What I previously thought to be 📝 background storage code has therefore a different meaning in Sariel's case. Since this captured sprite is fully opaque, it will reset the entire 128×128 wand area… wait, 128×128, rather than 96×96? Yup, this lowered sprite is larger than necessary, wasting 1,967 bytes of conventional memory.
That still doesn't quite explain the second sprite in BOSS6_2.BOS though. Turns out that the black part is indeed meant to unblit the purple reflection (?) in the first sprite. But… that's not how you would correctly unblit that?

VRAM after blitting the first sprite of TH01's BOSS6_2.BOS VRAM after blitting the second sprite of TH01's BOSS6_2.BOS

The first sprite already eats up part of the red HUD line, and the second one additionally fails to recover the seal pixels underneath, leaving a nice little black hole and some stray purple pixels until the next background transition. :tannedcirno: Quite ironic given that both sprites do include the right part of the seal, which isn't even part of the animation.


Just like Konngara, Sariel continues the approach of using a single function per danmaku pattern or custom entity. While I appreciate that this allows all pattern- and entity-specific state to be scoped locally to that one function, it quickly gets ugly as soon as such a function has to do more than one thing.
The "bird function" is particularly awful here: It's just one if(…) {…} else if(…) {…} else if(…) {…} chain with different branches for the subfunction parameter, with zero shared code between any of these branches. It also uses 64-bit floating-point double as its subpixel type… and since it also takes four of those as parameters (y'know, just in case the "spawn new bird" subfunction is called), every call site has to also push four double values onto the stack. Thanks to Turbo C++ even using the FPU for pushing a 0.0 constant, we have already reached maximum floating-point decadence before even having seen a single danmaku pattern. Why decadence? Every possible spawn position and velocity in both bird patterns just uses pixel resolution, with no fractional component in sight. And there goes another 720 bytes of conventional memory.

Speaking about bird patterns, the red-bird one is where we find the first code-level ZUN bug: The spawn cross circle sprite suddenly disappears after it finished spawning all the bird eggs. How can we tell it's a bug? Because there is code to smoothly fly this sprite off the playfield, that code just suddenly forgets that the sprite's position is stored in Q12.4 subpixels, and treats it as raw screen pixels instead. :zunpet: As a result, the well-intentioned 640×400 screen-space clipping rectangle effectively shrinks to 38×23 pixels in the top-left corner of the screen. Which the sprite is always outside of, and thus never rendered again.
The intended animation is easily restored though:

Sariel's third pattern, and the first to spawn birds, in its original and fixed versions. Note that I somewhat fixed the bird hatch animation as well: ZUN's code never unblits any frame of animation there, and simply blits every new one on top of the previous one.

Also, did you know that birds actually have a quite unfair 14×38-pixel hitbox? Not that you'd ever collide with them in any of the patterns…

Another 3 of the 8 bugs can be found in the symmetric, interlaced spawn rays used in three of the patterns, and the 32×32 debris "sprites" shown at their endpoint, at the edge of the screen. You kinda have to commend ZUN's attention to detail here, and how he wrote a lot of code for those few rapidly animated pixels that you most likely don't even notice, especially with all the other wrong pixels resulting from rendering glitches. One of the bugs in the very final pattern of phase 4 even turns them into the vortex sprites from the second pattern in phase 1 during the first 5 frames of the first time the pattern is active, and I had to single-step the blitting calls to verify it.
It certainly was annoying how much time I spent making sense of these bugs, and all weird blitting offsets, for just a few pixels… Let's look at something more wholesome, shall we?


So far, we've only seen the PC-98 GRCG being used in RMW (read-modify-write) mode, which I previously 📝 explained in the context of TH01's red-white HP pattern. The second of its three modes, TCR (Tile Compare Read), affects VRAM reads rather than writes, and performs "color extraction" across all 4 bitplanes: Instead of returning raw 1bpp data from one plane, a VRAM read will instead return a bitmask, with a 1 bit at every pixel whose full 4-bit color exactly matches the color at that offset in the GRCG's tile register, and 0 everywhere else. Sariel uses this mode to make sure that the 2×2 particles and the wind effect are only blitted on top of "air color" pixels, with other parts of the background behaving like a mask. The algorithm:

  1. Set the GRCG to TCR mode, and all 8 tile register dots to the air color
  2. Read N bits from the target VRAM position to obtain an N-bit mask where all 1 bits indicate air color pixels at the respective position
  3. AND that mask with the alpha plane of the sprite to be drawn, shifted to the correct start bit within the 8-pixel VRAM byte
  4. Set the GRCG to RMW mode, and all 8 tile register dots to the color that should be drawn
  5. Write the previously obtained bitmask to the same position in VRAM

Quite clever how the extracted colors double as a secondary alpha plane, making for another well-earned good-code tag. The wind effect really doesn't deserve it, though:

As far as I can tell, ZUN didn't use TCR mode anywhere else in PC-98 Touhou. Tune in again later during a TH04 or TH05 push to learn about TDW, the final GRCG mode!


Speaking about the 2×2 particle systems, why do we need three of them? Their only observable difference lies in the way they move their particles:

  1. Up or down in a straight line (used in phases 4 and 2, respectively)
  2. Left or right in a straight line (used in the second form)
  3. Left and right in a sinusoidal motion (used in phase 3, the "dark orange" one)

Out of all possible formats ZUN could have used for storing the positions and velocities of individual particles, he chose a) 64-bit / double-precision floating-point, and b) raw screen pixels. Want to take a guess at which data type is used for which particle system?

If you picked double for 1) and 2), and raw screen pixels for 3), you are of course correct! :godzun: Not that I'm implying that it should have been the other way round – screen pixels would have perfectly fit all three systems use cases, as all 16-bit coordinates are extended to 32 bits for trigonometric calculations anyway. That's what, another 1.080 bytes of wasted conventional memory? And that's even calculated while keeping the current architecture, which allocates space for 3×30 particles as part of the game's global data, although only one of the three particle systems is active at any given time.

That's it for the first form, time to put on "Civilization of Magic"! Or "死なばもろとも"? Or "Theme of 地獄めくり"? Or whatever SYUGEN is supposed to mean…


… and the code of these final patterns comes out roughly as exciting as their in-game impact. With the big exception of the very final "swaying leaves" pattern: After 📝 Q4.4, 📝 Q28.4, 📝 Q24.8, and double variables, this pattern uses… decimal subpixels? Like, multiplying the number by 10, and using the decimal one's digit to represent the fractional part? Well, sure, if you really insist on moving the leaves in cleanly represented integer multiples of ⅒, which is infamously impossible in IEEE 754. Aside from aesthetic reasons, it only really combines less precision (10 possible fractions rather than the usual 16) with the inferior performance of having to use integer divisions and multiplications rather than simple bit shifts. And it's surely not because the leaf sprites needed an extended integer value range of [-3276, +3276], compared to Q12.4's [-2047, +2048]: They are clipped to 640×400 screen space anyway, and are removed as soon as they leave this area.

This pattern also contains the second bug in the "subpixel/pixel confusion hiding an entire animation" category, causing all of BOSS6GR4.GRC to effectively become unused:

The "swaying leaves" pattern. ZUN intended a splash animation to be shown once each leaf "spark" reaches the top of the playfield, which is never displayed in the original game.

At least their hitboxes are what you would expect, exactly covering the 30×30 pixels of Reimu's sprite. Both animation fixes are available on the th01_sariel_fixes branch.

After all that, Sariel's main function turned out fairly unspectacular, just putting everything together and adding some shake, transition, and color pulse effects with a bunch of unnecessary hardware palette changes. There is one reference to a missing BOSS6.GRP file during the first→second form transition, suggesting that Sariel originally had a separate "first form defeat" graphic, before it was replaced with just the shaking effect in the final game.
Speaking about the transition code, it is kind of funny how the… um, imperative and concrete nature of TH01 leads to these 2×24 lines of straight-line code. They kind of look like ZUN rattling off a laundry list of subsystems and raw variables to be reinitialized, making damn sure to not forget anything.


Whew! Second PC-98 Touhou boss completely decompiled, 29 to go, and they'll only get easier from here! 🎉 The next one in line, Elis, is somewhere between Konngara and Sariel as far as x86 instruction count is concerned, so that'll need to wait for some additional funding. Next up, therefore: Looking at a thing in TH03's main game code – really, I have little idea what it will be!

Now that the store is open again, also check out the 📝 updated RE progress overview I've posted together with this one. In addition to more RE, you can now also directly order a variety of mods; all of these are further explained in the order form itself.

📝 Posted:
🚚 Summary of:
P0168, P0169
Commits:
c2de6ab...8b046da, 8b046da...479b766
💰 Funded by:
rosenrose, Blue Bolt
🏷 Tags:

EMS memory! The infamous stopgap measure between the 640 KiB ("ought to be enough for everyone") of conventional memory offered by DOS from the very beginning, and the later XMS standard for accessing all the rest of memory up to 4 GiB in the x86 Protected Mode. With an optionally active EMS driver, TH04 and TH05 will make use of EMS memory to preload a bunch of situational .CDG images at the beginning of MAIN.EXE:

  1. The "eye catch" game title image, shown while stages are loaded
  2. The character-specific background image, shown while bombing
  3. The player character dialog portraits
  4. TH05 additionally stores the boss portraits there, preloading them at the beginning of each stage. (TH04 instead keeps them in conventional memory during the entire stage.)

Once these images are needed, they can then be copied into conventional memory and accessed as usual.

Uh… wait, copied? It certainly would have been possible to map EMS memory to a regular 16-bit Real Mode segment for direct access, bank-switching out rarely used system or peripheral memory in exchange for the EMS data. However, master.lib doesn't expose this functionality, and only provides functions for copying data from EMS to regular memory and vice versa.
But even that still makes EMS an excellent fit for the large image files it's used for, as it's possible to directly copy their pixel data from EMS to VRAM. (Yes, I tried!) Well… would, because ZUN doesn't do that either, and always naively copies the images to newly allocated conventional memory first. In essence, this dumbs down EMS into just another layer of the memory hierarchy, inserted between conventional memory and disk: Not quite as slow as disk, but still requiring that memcpy() to retrieve the data. Most importantly though: Using EMS in this way does not increase the total amount of memory simultaneously accessible to the game. After all, some other data will have to be freed from conventional memory to make room for the newly loaded data.


The most idiomatic way to define the game-specific layout of the EMS area would be either a struct or an enum. Unfortunately, the total size of all these images exceeds the range of a 16-bit value, and Turbo C++ 4.0J supports neither 32-bit enums (which are silently degraded to 16-bit) nor 32-bit structs (which simply don't compile). That still leaves raw compile-time constants though, you only have to manually define the offset to each image in terms of the size of its predecessor. But instead of doing that, ZUN just placed each image at a nice round decimal offset, each slightly larger than the actual memory required by the previous image, just to make sure that everything fits. :tannedcirno: This results not only in quite a bit of unnecessary padding, but also in technically the single biggest amount of "wasted" memory in PC-98 Touhou: Out of the 180,000 (TH04) and 320,000 (TH05) EMS bytes requested, the game only uses 135,552 (TH04) and 175,904 (TH05) bytes. But hey, it's EMS, so who cares, right? Out of all the opportunities to take shortcuts during development, this is among the most acceptable ones. Any actual PC-98 model that could run these two games comes with plenty of memory for this to not turn into an actual issue.

On to the EMS-using functions themselves, which are the definition of "cross-cutting concerns". Most of these have a fallback path for the non-EMS case, and keep the loaded .CDG images in memory if they are immediately needed. Which totally makes sense, but also makes it difficult to find names that reflect all the global state changed by these functions. Every one of these is also just called from a single place, so inlining them would have saved me a lot of naming and documentation trouble there.
The TH04 version of the EMS allocation code was actually displayed on ZUN's monitor in the 2010 MAG・ネット documentary; WindowsTiger already transcribed the low-quality video image in 2019. By 2015 ReC98 standards, I would have just run with that, but the current project goal is to write better code than ZUN, so I didn't. 😛 We sure ain't going to use magic numbers for EMS offsets.

The dialog init and exit code then is completely different in both games, yet equally cross-cutting. TH05 goes even further in saving conventional memory, loading each individual player or boss portrait into a single .CDG slot immediately before blitting it to VRAM and freeing the pixel data again. People who play TH05 without an active EMS driver are surely going to enjoy the hard drive access lag between each portrait change… :godzun: TH04, on the other hand, also abuses the dialog exit function to preload the Mugetsu defeat / Gengetsu entrance and Gengetsu defeat portraits, using a static variable to track how often the function has been called during the Extra Stage… who needs function parameters anyway, right? :zunpet:

This is also the function in which TH04 infamously crashes after the Stage 5 pre-boss dialog when playing with Reimu and without any active EMS driver. That crash is what motivated this look into the games' EMS usage… but the code looks perfectly fine? Oh well, guess the crash is not related to EMS then. Next u–

OK, of course I can't leave it like that. Everyone is expecting a fix now, and I still got half of a push left over after decompiling the regular EMS code. Also, I've now RE'd every function that could possibly be involved in the crash, and this is very likely to be the last time I'll be looking at them.


Turns out that the bug has little to do with EMS, and everything to do with ZUN limiting the amount of conventional RAM that TH04's MAIN.EXE is allowed to use, and then slightly miscalculating this upper limit. Playing Stage 5 with Reimu is the most asset-intensive configuration in this game, due to the combination of

The star image used in TH04's Stage 5.
The star image used in TH04's Stage 5.

Remove any single one of the above points, and this crash would have never occurred. But with all of them combined, the total amount of memory consumed by TH04's MAIN.EXE just barely exceeds ZUN's limit of 320,000 bytes, by no more than 3,840 bytes, the size of the star image.

But wait: As we established earlier, EMS does nothing to reduce the amount of conventional memory used by the game. In fact, if you disabled TH04's EMS handling, you'd still get this crash even if you are running an EMS driver and loaded DOS into the High Memory Area to free up as much conventional RAM as possible. How can EMS then prevent this crash in the first place?

The answer: It's only because ZUN's usage of EMS bypasses the need to load the cached images back out of the XOR-encrypted 東方幻想.郷 packfile. Leaving aside the general stupidity of any game data file encryption*, master.lib's decryption implementation is also quite wasteful: It uses a separate buffer that receives fixed-size chunks of the file, before decrypting every individual byte and copying it to its intended destination buffer. That really resembles the typical slowness of a C fread() implementation more than it does the highly optimized ASM code that master.lib purports to be… And how large is this well-hidden decryption buffer? 4 KiB. :onricdennat:

So, looking back at the game, here is what happens once the Stage 5 pre-battle dialog ends:

  1. Reimu's bomb background image, which was previously freed to make space for her dialog portraits, has to be loaded back into conventional memory from disk
  2. BB0.CDG is found inside the 東方幻想.郷 packfile
  3. file_ropen() ends up allocating a 4 KiB buffer for the encrypted packfile data, getting us the decisive ~4 KiB closer to the memory limit
  4. The .CDG loader tries to allocate 52 608 contiguous bytes for the pixel data of Reimu's bomb image
  5. This would exceed the memory limit, so hmem_allocbyte() fails and returns a nullptr
  6. ZUN doesn't check for this case (as usual)
  7. The pixel data is loaded to address 0000:0000, overwriting the Interrupt Vector Table and whatever comes after
  8. The game crashes
The final frame rendered before the TH04 Stage 5 Reimu No-EMS crash
The final frame rendered by a crashing TH04.

The 4 KiB encryption buffer would only be freed by the corresponding file_close() call, which of course never happens because the game crashes before it gets there. At one point, I really did suspect the cause to be some kind of memory leak or fragmentation inside master.lib, which would have been quite delightful to fix.
Instead, the most straightforward fix here is to bump up that memory limit by at least 4 KiB. Certainly easier than squeezing in a cdg_free() call for the star image before the pre-boss dialog without breaking position dependence.

Or, even better, let's nuke all these memory limits from orbit because they make little sense to begin with, and fix every other potential out-of-memory crash that modders would encounter when adding enough data to any of the 4 games that impose such limits on themselves. Unless you want to launch other binaries (which need to do their own memory allocations) after launching the game, there's really no reason to restrict the amount of memory available to a DOS process. Heck, whenever DOS creates a new one, it assigns all remaining free memory by default anyway.
Removing the memory limits also removes one of ZUN's few error checks, which end up quitting the game if there isn't at least a given maximum amount of conventional RAM available. While it might be tempting to reserve enough memory at the beginning of execution and then never check any allocation for a potential failure, that's exactly where something like TH04's crash comes from.
This game is also still running on DOS, where such an initial allocation failure is very unlikely to happen – no one fills close to half of conventional RAM with TSRs and then tries running one of these games. It might have been useful to detect systems with less than 640 KiB of actual, physical RAM, but none of the PC-98 models with that little amount of memory are fast enough to run these games to begin with. How ironic… a place where ZUN actually added an error check, and then it's mostly pointless.

Here's an archive that contains both fix variants, just in case. These were compiled from the th04_noems_crash_fix and mem_assign_all branches, and contain as little code changes as possible.
Edit (2022-04-18): For TH04, you probably want to download the 📝 community choice fix package instead, which contains this fix along with other workarounds for the Divide error crashes. 2021-11-29-Memory-limit-fixes.zip

So yeah, quite a complex bug, leaving no time for the TH03 scorefile format research after all. Next up: Raising prices.

📝 Posted:
🚚 Summary of:
P0165, P0166, P0167
Commits:
7a0e5d8...f2bca01, f2bca01...e697907, e697907...c2de6ab
💰 Funded by:
Ember2528
🏷 Tags:

OK, TH01 missile bullets. Can we maybe have a well-behaved entity type, without any weirdness? Just once?

Ehh, kinda. Apart from another 150 bytes wasted on unused structure members, this code is indeed more on the low end in terms of overall jank. It does become very obvious why dodging these missiles in the YuugenMagan, Mima, and Elis fights feels so awful though: An unfair 46×46 pixel hitbox around Reimu's center pixel, combined with the comeback of 📝 interlaced rendering, this time in every stage. ZUN probably did this because missiles are the only 16×16 sprite in TH01 that is blitted to unaligned X positions, which effectively ends up touching a 32×16 area of VRAM per sprite.
But even if we assume VRAM writes to be the bottleneck here, it would have been totally possible to render every missile in every frame at roughly the same amount of CPU time that the original game uses for interlaced rendering:

That's an optimization that would have significantly benefitted the game, in contrast to all of the fake ones introduced in later games. Then again, this optimization is actually something that the later games do, and it might have in fact been necessary to achieve their higher bullet counts without significant slowdown.

Unfortunately, it was only worth decompiling half of the missile code right now, thanks to gratuitous FPU usage in the other half, where 📝 double variables are compared to float literals. That one will have to wait 📝 until after SinGyoku.


After some effectively unused Mima sprite effect code that is so broken that it's impossible to make sense out of it, we get to the final feature I wanted to cover for all bosses in parallel before returning to Sariel: The separate sprite background storage for moving or animated boss sprites in the Mima, Elis, and Sariel fights. But, uh… why is this necessary to begin with? Doesn't TH01 already reserve the other VRAM page for backgrounds?
Well, these sprites are quite big, and ZUN didn't want to blit them from main memory on every frame. After all, TH01 and TH02 had a minimum required clock speed of 33 MHz, half of the speed required for the later three games. So, he simply blitted these boss sprites to both VRAM pages, leading the usual unblitting calls to only remove the other sprites on top of the boss. However, these bosses themselves want to move across the screen… and this makes it necessary to save the stage background behind them in some other way.

Enter .PTN, and its functions to capture a 16×16 or 32×32 square from VRAM into a sprite slot. No problem with that approach in theory, as the size of all these bigger sprites is a multiple of 32×32; splitting a larger sprite into these smaller 32×32 chunks makes the code look just a little bit clumsy (and, of course, slower).
But somewhere during the development of Mima's fight, ZUN apparently forgot that those sprite backgrounds existed. And once Mima's 🚫 casting sprite is blitted on top of her regular sprite, using just regular sprite transparency, she ends up with her infamous third arm:

TH01 Mima's third arm

Ironically, there's an unused code path in Mima's unblit function where ZUN assumes a height of 48 pixels for Mima's animation sprites rather than the actual 64. This leads to even clumsier .PTN function calls for the bottom 128×16 pixels… Failing to unblit the bottom 16 pixels would have also yielded that third arm, although it wouldn't have looked as natural. Still wouldn't say that it was intentional; maybe this casting sprite was just added pretty late in the game's development?


So, mission accomplished, Sariel unblocked… at 2¼ pushes. :thonk: That's quite some time left for some smaller stage initialization code, which bundles a bunch of random function calls in places where they logically really don't belong. The stage opening animation then adds a bunch of VRAM inter-page copies that are not only redundant but can't even be understood without knowing the hidden internal state of the last VRAM page accessed by previous ZUN code…
In better news though: Turbo C++ 4.0 really doesn't seem to have any complexity limit on inlining arithmetic expressions, as long as they only operate on compile-time constants. That's how we get macro-free, compile-time Shift-JIS to JIS X 0208 conversion of the individual code points in the 東方★靈異伝 string, in a compiler from 1994. As long as you don't store any intermediate results in variables, that is… :tannedcirno:

But wait, there's more! With still ¼ of a push left, I also went for the boss defeat animation, which includes the route selection after the SinGyoku fight.
As in all other instances, the 2× scaled font is accomplished by first rendering the text at regular 1× resolution to the other, invisible VRAM page, and then scaled from there to the visible one. However, the route selection is unique in that its scaled text is both drawn transparently on top of the stage background (not onto a black one), and can also change colors depending on the selection. It would have been no problem to unblit and reblit the text by rendering the 1× version to a position on the invisible VRAM page that isn't covered by the 2× version on the visible one, but ZUN (needlessly) clears the invisible page before rendering any text. :zunpet: Instead, he assigned a separate VRAM color for both the 魔界 and 地獄 options, and only changed the palette value for these colors to white or gray, depending on the correct selection. This is another one of the 📝 rare cases where TH01 demonstrates good use of PC-98 hardware, as the 魔界へ and 地獄へ strings don't need to be reblitted during the selection process, only the Orb "cursor" does.

Then, why does this still not count as good-code? When changing palette colors, you kinda need to be aware of everything else that can possibly be on screen, which colors are used there, and which aren't and can therefore be used for such an effect without affecting other sprites. In this case, well… hover over the image below, and notice how Reimu's hair and the bomb sprites in the HUD light up when Makai is selected:

Demonstration of palette changes in TH01's route selection

This push did end on a high note though, with the generic, non-SinGyoku version of the defeat animation being an easily parametrizable copy. And that's how you decompile another 2.58% of TH01 in just slightly over three pushes.


Now, we're not only ready to decompile Sariel, but also Kikuri, Elis, and SinGyoku without needing any more detours into non-boss code. Thanks to the current TH01 funding subscriptions, I can plan to cover most, if not all, of Sariel in a single push series, but the currently 3 pending pushes probably won't suffice for Sariel's 8.10% of all remaining code in TH01. We've got quite a lot of not specifically TH01-related funds in the backlog to pass the time though.

Due to recent developments, it actually makes quite a lot of sense to take a break from TH01: spaztron64 has managed what every Touhou download site so far has failed to do: Bundling all 5 game onto a single .HDI together with pre-configured PC-98 emulators and a nice boot menu, and hosting the resulting package on a proper website. While this first release is already quite good (and much better than my attempt from 2014), there is still a bit of room for improvement to be gained from specific ReC98 research. Next up, therefore:

📝 Posted:
🚚 Summary of:
P0160, P0161
Commits:
e491cd7...42ba4a5, 42ba4a5...81dd96e
💰 Funded by:
Yanga, [Anonymous]
🏷 Tags:

Nothing really noteworthy in TH01's stage timer code, just yet another HUD element that is needlessly drawn into VRAM. Sure, ZUN applies his custom boldfacing effect on top of the glyphs retrieved from font ROM, but he could have easily installed those modified glyphs as gaiji.
Well, OK, halfwidth gaiji aren't exactly well documented, and sometimes not even correctly emulated 📝 due to the same PC-98 hardware oddity I was researching last month. I've reserved two of the pending anonymous "anything" pushes for the conclusion of this research, just in case you were wondering why the outstanding workload is now lower after the two delivered here.

And since it doesn't seem to be clearly documented elsewhere: Every 2 ticks on the stage timer correspond to 4 frames.


So, TH01 rank pellet speed. The resident pellet speed value is a factor ranging from a minimum of -0.375 up to a maximum of 0.5 (pixels per frame), multiplied with the difficulty-adjusted base speed for each pellet and added on top of that same speed. This multiplier is modified

Apparently, ZUN noted that these deltas couldn't be losslessly stored in an IEEE 754 floating-point variable, and therefore didn't store the pellet speed factor exactly in a way that would correspond to its gameplay effect. Instead, it's stored similar to Q12.4 subpixels: as a simple integer, pre-multiplied by 40. This results in a raw range of -15 to 20, which is what the undecompiled ASM calls still use. When spawning a new pellet, its base speed is first multiplied by that factor, and then divided by 40 again. This is actually quite smart: The calculation doesn't need to be aware of either Q12.4 or the 40× format, as ((Q12.4 * factor×40) / factor×40) still comes out as a Q12.4 subpixel even if all numbers are integers. The only limiting issue here would be the potential overflow of the 16-bit multiplication at unadjusted base speeds of more than 50 pixels per frame, but that'd be seriously unplayable.
So yeah, pellet speed modifications are indeed gradual, and don't just fall into the coarse three "high, normal, and low" categories.


That's ⅝ of P0160 done, and the continue and pause menus would make good candidates to fill up the remaining ⅜… except that it seemed impossible to figure out the correct compiler options for this code?
The issues centered around the two effects of Turbo C++ 4.0J's -O switch:

  1. Optimizing jump instructions: merging duplicate successive jumps into a single one, and merging duplicated instructions at the end of conditional branches into a single place under a single branch, which the other branches then jump to
  2. Compressing ADD SP and POP CX stack-clearing instructions after multiple successive CALLs to __cdecl functions into a single ADD SP with the combined parameter stack size of all function calls

But how can the ASM for these functions exhibit #1 but not #2? How can it be seemingly optimized and unoptimized at the same time? The only option that gets somewhat close would be -O- -y, which emits line number information into the .OBJ files for debugging. This combination provides its own kind of #1, but these functions clearly need the real deal.

The research into this issue ended up consuming a full push on its own. In the end, this solution turned out to be completely unrelated to compiler options, and instead came from the effects of a compiler bug in a totally different place. Initializing a local structure instance or array like

const uint4_t flash_colors[3] = { 3, 4, 5 };

always emits the { 3, 4, 5 } array into the program's data segment, and then generates a call to the internal SCOPY@ function which copies this data array to the local variable on the stack. And as soon as this SCOPY@ call is emitted, the -O optimization #1 is disabled for the entire rest of the translation unit?!
So, any code segment with an SCOPY@ call followed by __cdecl functions must strictly be decompiled from top to bottom, mirroring the original layout of translation units. That means no TH01 continue and pause menus before we haven't decompiled the bomb animation, which contains such an SCOPY@ call. 😕
Luckily, TH01 is the only game where this bug leads to significant restrictions in decompilation order, as later games predominantly use the pascal calling convention, in which each function itself clears its stack as part of its RET instruction.


What now, then? With 51% of REIIDEN.EXE decompiled, we're slowly running out of small features that can be decompiled within ⅜ of a push. Good that I haven't been looking a lot into OP.EXE and FUUIN.EXE, which pretty much only got easy pieces of code left to do. Maybe I'll end up finishing their decompilations entirely within these smaller gaps?
I still ended up finding one more small piece in REIIDEN.EXE though: The particle system, seen in the Mima fight.

I like how everything about this animation is contained within a single function that is called once per frame, but ZUN could have really consolidated the spawning code for new particles a bit. In Mima's fight, particles are only spawned from the top and right edges of the screen, but the function in fact contains unused code for all other 7 possible directions, written in quite a bloated manner. This wouldn't feel quite as unused if ZUN had used an angle parameter instead… :thonk: Also, why unnecessarily waste another 40 bytes of the BSS segment?

But wait, what's going on with the very first spawned particle that just stops near the bottom edge of the screen in the video above? Well, even in such a simple and self-contained function, ZUN managed to include an off-by-one error. This one then results in an out-of-bounds array access on the 80th frame, where the code attempts to spawn a 41st particle. If the first particle was unlucky to be both slow enough and spawned away far enough from the bottom and right edges, the spawning code will then kill it off before its unblitting code gets to run, leaving its pixel on the screen until something else overlaps it and causes it to be unblitted.
Which, during regular gameplay, will quickly happen with the Orb, all the pellets flying around, and your own player movement. Also, the RNG can easily spawn this particle at a position and velocity that causes it to leave the screen more quickly. Kind of impressive how ZUN laid out the structure of arrays in a way that ensured practically no effect of this bug on the game; this glitch could have easily happened every 80 frames instead. He almost got close to all bugs canceling out each other here! :godzun:

Next up: The player control functions, including the second-biggest function in all of PC-98 Touhou.

📝 Posted:
🚚 Summary of:
P0153, P0154, P0155, P0156
Commits:
624e0cb...d05c9ba, d05c9ba...031b526, 031b526...9ad578e, 9ad578e...4bc6405
💰 Funded by:
Ember2528
🏷 Tags:

📝 7 pushes to get Konngara done, according to my previous estimate? Well, how about being twice as fast, and getting the entire boss fight done in 3.5 pushes instead? So much copy-pasted code in there… without any flashy unused content, apart from four calculations with an unclear purpose. And the three strings "ANGEL", "OF", "DEATH", which were probably meant to be rendered using those giant upscaled font ROM glyphs that also display the STAGE # and HARRY UP strings? Those three strings are also part of Sariel's code, though.

On to the remaining 11 patterns then! Konngara's homing snakes, shown in the video above, are one of the more notorious parts of this battle. They occur in two patterns – one with two snakes and one with four – with all of the spawn, aim, update, and render code copy-pasted between the two. :zunpet: Three gameplay-related discoveries here:


This was followed by really weird aiming code for the "sprayed pellets from cup" pattern… which can only possibly have been done on purpose, but is sort of mitigated by the spraying motion anyway.
After a bunch of long if(…) {…} else if(…) {…} else if(…) {…} chains, which remain quite popular in certain corners of the game dev scene to this day, we've got the three sword slash patterns as the final notable ones. At first, it seemed as if ZUN just improvised those raw number constants involved in the pellet spawner's movement calculations to describe some sort of path that vaguely resembles the sword slash. But once I tried to express these numbers in terms of the slash animation's keyframes, it all worked out perfectly, and resulted in this:

Triangular path of the pellet spawner during Konngara's slash patterns

Yup, the spawner always takes an exact path along this triangle. Sometimes, I wonder whether I should just rush this project and don't bother about naming these repeated number literals. Then I gain insights like these, and it's all worth it.


Finally, we've got Konngara's main function, which coordinates the entire fight. Third-longest function in both TH01 and all of PC-98 Touhou, only behind some player-related stuff and YuugenMagan's gigantic main function… and it's even more of a copy-pasta, making it feel not nearly as long as it is. Key insights there:

Seriously, 📝 line drawing was much harder to decompile.


And that's it for Konngara! First boss with not a single piece of ASM left, 30 more to go! 🎉 But wait, what about the cause behind the temporary green discoloration after leaving the Pause menu? I expected to find something on that as well, but nope, it's nothing in Konngara's code segment. We'll probably only get to figure that out near the very end of TH01's decompilation, once we get to the one function that directly calls all of the boss-specific main functions in a switch statement.
Edit (2022-07-17): 📝 Only took until Mima.

So, Sariel next? With half of a push left, I did cover Sariel's first few initialization functions, but all the sprite unblitting and HUD manipulation will need some extra attention first. The first one of these functions is related to the HUD, the stage timer, and the HARRY UP mode, whose pellet pattern I've also decompiled now.

All of this brings us past 75% PI in all games, and TH01 to under 30,000 remaining ASM instructions, leaving TH03 as the now most expensive game to be completely decompiled. Looking forward to how much more TH01's code will fall apart if you just tap it lightly… Next up: The aforementioned helper functions related to HARRY UP, drawing the HUD, and unblitting the other bosses whose sprites are a bit more animated.

📝 Posted:
🚚 Summary of:
P0147
Commits:
456b621...c940059
💰 Funded by:
Ember2528, -Tom-
🏷 Tags:

Didn't quite get to cover background rendering for TH05's Stage 1-5 bosses in this one, as I had to reverse-engineer two more fundamental parts involved in boss background rendering before.

First, we got the those blocky transitions from stage tiles to bomb and boss backgrounds, loaded from BB*.BB and ST*.BB, respectively. These files store 16 frames of animation, with every bit corresponding to a 16×16 tile on the playfield. With 384×368 pixels to be covered, that would require 69 bytes per frame. But since that's a very odd number to work with in micro-optimized ASM, ZUN instead stores 512×512 pixels worth of bits, ending up with a frame size of 128 bytes, and a per-frame waste of 59 bytes. :tannedcirno: At least it was possible to decompile the core blitting function as __fastcall for once.
But wait, TH05 comes with, and loads, a bomb .BB file for every character, not just for the Reimu and Yuuka bomb transitions you see in-game… 🤔 Restoring those unused stage tile → bomb image transition animations for Mima and Marisa isn't that trivial without having decompiled their actual bomb animation functions before, so stay tuned!

Interestingly though, the code leaves out what would look like the most obvious optimization: All stage tiles are unconditionally redrawn each frame before they're erased again with the 16×16 blocks, no matter if they weren't covered by such a block in the previous frame, or are going to be covered by such a block in this frame. The same is true for the static bomb and boss background images, where ZUN simply didn't write a .CDG blitting function that takes the dirty tile array into account. If VRAM writes on PC-98 really were as slow as the games' README.TXT files claim them to be, shouldn't all the optimization work have gone towards minimizing them? :thonk: Oh well, it's not like I have any idea what I'm talking about here. I'd better stop talking about anything relating to VRAM performance on PC-98… :onricdennat:


Second, it finally was time to solve the long-standing confusion about all those callbacks that are supposed to render the playfield background. Given the aforementioned static bomb background images, ZUN chose to make this needlessly complicated. And so, we have two callback function pointers: One during bomb animations, one outside of bomb animations, and each boss update function is responsible for keeping the former in sync with the latter. :zunpet:

Other than that, this was one of the smoothest pushes we've had in a while; the hardest parts of boss background rendering all were part of 📝 the last push. Once you figured out that ZUN does indeed dynamically change hardware color #0 based on the current boss phase, the remaining one function for Shinki, and all of EX-Alice's background rendering becomes very straightforward and understandable.


Meanwhile, -Tom- told me about his plans to publicly release 📝 his TH05 scripting toolkit once TH05's MAIN.EXE would hit around 50% RE! That pretty much defines what the next bunch of generic TH05 pushes will go towards: bullets, shared boss code, and one full, concrete boss script to demonstrate how it's all combined. Next up, therefore: TH04's bullet firing code…? Yes, TH04's. I want to see what I'm doing before I tackle the undecompilable mess that is TH05's bullet firing code, and you all probably want readable code for that feature as well. Turns out it's also the perfect place for Blue Bolt's pending contributions.

📝 Posted:
🚚 Summary of:
P0146
Commits:
08bc188...456b621
💰 Funded by:
Ember2528, -Tom-
🏷 Tags:

Y'know, I kinda prefer the pending crowdfunded workload to stay more near the middle of the cap, rather than being sold out all the time. So to reach this point more quickly, let's do the most relaxing thing that can be easily done in TH05 right now: The boss backgrounds, starting with Shinki's, 📝 now that we've got the time to look at it in detail.

… Oh come on, more things that are borderline undecompilable, and require new workarounds to be developed? Yup, Borland C++ always optimizes any comparison of a register with a literal 0 to OR reg, reg, no matter how many calculations and inlined function calls you replace the 0 with. Shinki's background particle rendering function contains a CMP AX, 0 instruction though… so yeah, 📝 yet another piece of custom ASM that's worse than what Turbo C++ 4.0J would have generated if ZUN had just written readable C. This was probably motivated by ZUN insisting that his modified master.lib function for blitting particles takes its X and Y parameters as registers. If he had just used the __fastcall convention, he also would have got the sprite ID passed as a register. 🤷
So, we really don't want to be forced into inline assembly just because of the third comparison in the otherwise perfectly decompilable four-comparison if() expression that prevents invisible particles from being drawn. The workaround: Comparing to a pointer instead, which only the linker gets to resolve to the actual value of 0. :tannedcirno: This way, the compiler has to make room for any 16-bit literal, and can't optimize anything.


And then we go straight from micro-optimization to waste, with all the duplication in the code that animates all those particles together with the zooming and spinning lines. This push decompiled 1.31% of all code in TH05, and thanks to alignment, we're still missing Shinki's high-level background rendering function that calls all the subfunctions I decompiled here.
With all the manipulated state involved here, it's not at all trivial to see how this code produces what you see in-game. Like:

  1. If all lines have the same Y velocity, how do the other three lines in background type B get pushed down into this vertical formation while the top one stays still? (Answer: This velocity is only applied to the top line, the other lines are only pushed based on some delta.)
  2. How can this delta be calculated based on the distance of the top line with its supposed target point around Shinki's wings? (Answer: The velocity is never set to 0, so the top line overshoots this target point in every frame. After calculating the delta, the top line itself is pushed down as well, canceling out the movement. :zunpet:)
  3. Why don't they get pushed down infinitely, but stop eventually? (Answer: We only see four lines out of 20, at indices #0, #6, #12, and #18. In each frame, lines [0..17] are copied to lines [1..18], before anything gets moved. The invisible lines are pushed down based on the delta as well, which defines a distance between the visible lines of (velocity * array gap). And since the velocity is capped at -14 pixels per frame, this also means a maximum distance of 84 pixels between the midpoints of each line.)
  4. And why are the lines moving back up when switching to background type C, before moving down? (Answer: Because type C increases the velocity rather than decreasing it. Therefore, it relies on the previous velocity state from type B to show a gapless animation.)

So yeah, it's a nice-looking effect, just very hard to understand. 😵

With the amount of effort I'm putting into this project, I typically gravitate towards more descriptive function names. Here, however, uth05win's simple and seemingly tiny-brained "background type A/B/C/D" was quite a smart choice. It clearly defines the sequence in which these animations are intended to be shown, and as we've seen with point 4 from the list above, that does indeed matter.

Next up: At least EX-Alice's background animations, and probably also the high-level parts of the background rendering for all the other TH05 bosses.

📝 Posted:
🚚 Summary of:
P0120, P0121
Commits:
453dd3c...3c008b6, 3c008b6...5c42fcd
💰 Funded by:
Yanga
🏷 Tags:

Back to TH01, and its boss sprite format… with a separate class for storing animations that only differs minutely from the 📝 regular boss entity class I covered last time? Decompiling this class was almost free, and the main reason why the first of these pushes ended up looking pretty huge.

Next up were the remaining shape drawing functions from the code segment that started with the .GRC functions. P0105 already started these with the (surprisingly sanely implemented) 8×8 diamond, star, and… uh, snowflake (?) sprites , prominently seen in the Konngara, Elis, and Sariel fights, respectively. Now, we've also got:

The weirdness becomes obvious with just a single screenshot:

TH01 invincibility sprite weirdness

First, we've got the obvious issue of the sprites not being clipped at the right edge of VRAM, with the rightmost pixels in each row of the sprite extending to the beginning of the next row. Well, that's just what you get if you insist on writing unique low-level blitting code for the majority of the individual sprites in the game… 🤷
More importantly though, the sprite sheet looks like this: So how do we even get these fully filled red diamonds?

Well, turns out that the sprites are never consistently unblitted during their 8 frames of animation. There is a function that looks like it unblits the sprite… except that it starts with by enabling the GRCG and… reading from the first bitplane on the background page? If this was the EGC, such a read would fill some internal registers with the contents of all 4 bitplanes, which can then subsequently be blitted to all 4 bitplanes of any VRAM page with a single memory write. But with the GRCG in RMW mode, reads do nothing special, and simply copy the memory contents of one bitplane to the read destination. Maybe ZUN thought that setting the RMW color to red also sets some internal 4-plane mask register to match that color? :zunpet:
Instead, the rather random pixels read from the first bitplane are then used as a mask for a second blit of the same red sprite. Effectively, this only really "unblits" the invincibility pixels that are drawn on top of Reimu's sprite. Since Reimu is drawn first, the invincibility sprites are overwritten anyway. But due to the palette color layout of Reimu's sprite, its pixels end up fully masking away any invincibility sprite pixels in that second blit, leaving VRAM untouched as a result. Anywhere else though, this animation quickly turns into the union of all animation frames.

Then again, if that 16-dot-aligned rectangular unblitting function is all you know about the EGC, and you can't be bothered to write a perfect unblitter for 8×8 sprites, it becomes obvious why you wouldn't want to use it:

Because Reimu would barely be visible under all that flicker. In comparison, those fully filled diamonds actually look pretty good.


After all that, the remaining time wouldn't have been enough for the next few essential classes, so I closed out the push with three more VRAM effects instead:


And with that, ReC98, as a whole, is not only ⅓ done, but I've also fully caught up with the feature backlog for the first time in the history of this crowdfunding! Time to go into maintenance mode then, while we wait for the next pushes to be funded. Got a huge backlog of tiny maintenance issues to address at a leisurely pace, and of course there's also the 📝 16-bit build system waiting to be finished.

📝 Posted:
🚚 Summary of:
P0105, P0106, P0107, P0108
Commits:
3622eb6...11b776b, 11b776b...1f1829d, 1f1829d...1650241, 1650241...dcf4e2c
💰 Funded by:
Yanga
🏷 Tags:

And indeed, I got to end my vacation with a lot of image format and blitting code, covering the final two formats, .GRC and .BOS. .GRC was nothing noteworthy – one function for loading, one function for byte-aligned blitting, and one function for freeing memory. That's it – not even a unblitting function for this one. .BOS, on the other hand…

…has no generic (read: single/sane) implementation, and is only implemented as methods of some boss entity class. And then again for Sariel's dress and wand animations, and then again for Reimu's animations, both of which weren't even part of these 4 pushes. Looking forward to decompiling essentially the same algorithms all over again… And that's how TH01 became the largest and most bloated PC-98 Touhou game. So yeah, still not done with image formats, even at 44% RE.

This means I also had to reverse-engineer that "boss entity" class… yeah, what else to call something a boss can have multiple of, that may or may not be part of a larger boss sprite, may or may not be animated, and that may or may not have an orb hitbox?
All bosses except for Kikuri share the same 5 global instances of this class. Since renaming all these variables in ASM land is tedious anyway, I went the extra mile and directly defined separate, meaningful names for the entities of all bosses. These also now document the natural order in which the bosses will ultimately be decompiled. So, unless a backer requests anything else, this order will be:

  1. Konngara
  2. Sariel
  3. Elis
  4. Kikuri
  5. SinGyoku
  6. (code for regular card-flipping stages)
  7. Mima
  8. YuugenMagan

As everyone kind of expects from TH01 by now, this class reveals yet another… um, unique and quirky piece of code architecture. In addition to the position and hitbox members you'd expect from a class like this, the game also stores the .BOS metadata – width, height, animation frame count, and 📝 bitplane pointer slot number – inside the same class. But if each of those still corresponds to one individual on-screen sprite, how can YuugenMagan have 5 eye sprites, or Kikuri have more than one soul and tear sprite? By duplicating that metadata, of course! And copying it from one entity to another :onricdennat:
At this point, I feel like I even have to congratulate the game for not actually loading YuugenMagan's eye sprites 5 times. But then again, 53,760 bytes of waste would have definitely been noticeable in the DOS days. Makes much more sense to waste that amount of space on an unused C++ exception handler, and a bunch of redundant, unoptimized blitting functions :tannedcirno:

(Thinking about it, YuugenMagan fits this entire system perfectly. And together with its position in the game's code – last to be decompiled means first on the linker command line – we might speculate that YuugenMagan was the first boss to be programmed for TH01?)

So if a boss wants to use sprites with different sizes, there's no way around using another entity. And that's why Girl-Elis and Bat-Elis are two distinct entities internally, and have to manually sync their position. Except that there's also a third one for Attacking-Girl-Elis, because Girl-Elis has 9 frames of animation in total, and the global .BOS bitplane pointers are divided into 4 slots of only 8 images each. :zunpet:
Same for SinGyoku, who is split into a sphere entity, a person entity, and a… white flash entity for all three forms, all at the same resolution. Or Konngara's facial expressions, which also require two entities just for themselves.


And once you decompile all this code, you notice just how much of it the game didn't even use. 13 of the 50 bytes of the boss entity class are outright unused, and 10 bytes are used for a movement clamping and lock system that would have been nice if ZUN also used it outside of Kikuri's soul sprites. Instead, all other bosses ignore this system completely, and just party on the X/Y coordinates of the boss entities directly.

As for the rendering functions, 5 out of 10 are unused. And while those definitely make up less than half of the code, I still must have spent at least 1 of those 4 pushes on effectively unused functionality.
Only one of these functions lends itself to some speculation. For Elis' entrance animation, the class provides functions for wavy blitting and unblitting, which use a separate X coordinate for every line of the sprite. But there's also an unused and sort of broken one for unblitting two overlapping wavy sprites, located at the same Y coordinate. This might indicate that Elis could originally split herself into two sprites, similar to TH04 Stage 6 Yuuka? Or it might just have been some other kind of animation effect, who knows.


After over 3 months of TH01 progress though, it's finally time to look at other games, to cover the rest of the crowdfunding backlog. Next up: Going back to TH05, and getting rid of those last PI false positives. And since I can potentially spend the next 7 weeks on almost full-time ReC98 work, I've also re-opened the store until October!

📝 Posted:
🚚 Summary of:
P0081
Commits:
0252da2...5ac9b30
💰 Funded by:
Ember2528
🏷 Tags:

Sadly, we've already reached the end of fast triple-speed TH01 progress with 📝 the last push, which decompiled the last segment shared by all three of TH01's executables. There's still a bit of double-speed progress left though, with a small number of code segments that are shared between just two of the three executables.

At the end of the first one of these, we've got all the code for the .GRZ format – which is yet another run-length encoded image format, but this time storing up to 16 full 640×400 16-color images with an alpha bit. This one is exclusively used to wastefully store Konngara's sword slash and kuji-in kill animations. Due to… suboptimal code organization, the code for the format is also present in OP.EXE, despite not being used there. But hey, that brings TH01 to over 20% in RE!

Decoupling the RLE command stream from the pixel data sounds like a nice idea at first, allowing the format to efficiently encode a variety of animation frames displayed all over the screen… if ZUN actually made use of it. The RLE stream also has quite some ridiculous overhead, starting with 1 byte to store the 1-bit command (putting a single 8×1 pixel block, or entering a run of N such blocks). Run commands then store another 1-byte run length, which has to be followed by another command byte to identify the run as putting N blocks, or skipping N blocks. And the pixel data is just a sequence of these blocks for all 4 bitplanes, in uncompressed form…

Also, have some rips of all the images this format is used for:

<code>boss8.grz</code>, image 1/16<code>boss8.grz</code>, image 2/16<code>boss8.grz</code>, image 3/16<code>boss8.grz</code>, image 4/16<code>boss8.grz</code>, image 5/16<code>boss8.grz</code>, image 6/16<code>boss8.grz</code>, image 7/16<code>boss8.grz</code>, image 8/16<code>boss8.grz</code>, image 9/16<code>boss8.grz</code>, image 10/16<code>boss8.grz</code>, image 11/16<code>boss8.grz</code>, image 12/16<code>boss8.grz</code>, image 13/16<code>boss8.grz</code>, image 14/16<code>boss8.grz</code>, image 15/16<code>boss8.grz</code>, image 16/16

To make these, I just wrote a small viewer, calling the same decompiled TH01 code: 2020-03-07-grzview.zip Obviously, this means that it not only must to be run on a PC-98, but also discards the alpha information. If any backers are really interested in having a proper converter to and from PNG, I can implement that in an upcoming push… although that would be the perfect thing for outside contributors to do.

Next up, we got some code for the PI format… oh, wait, the actual files are called "GRP" in TH01.