
Unfortunately, I have to quickly interrupt the current PC-98 Touhou progress with breaking news of a replay desync bug in my Shuusou Gyoku build. Yup, it's free mod bugfix time again, this time featuring a bug with the most complex implications so far…

  1. The Extra Stage replay desync bug introduced in P0295
  2. How it happened
  3. Thinking about the best way to fix it
  4. Another filesize-related bug in replay saving

The bug in question dates back to the P0295 build from last October. While that giant release mostly focused on porting the game's rendering to SDL, it also included 📝 fixes for three pbg bugs in Shuusou Gyoku's handling of Extra Stage replays. Unfortunately, these fixes would introduce a bug of my own that was even worse. :tannedcirno:
Ever since that build, the replay header has consistently stored the difficulty and starting life count as shown in the Config → Difficulty menu. This looks fine on the surface until you consider the exact issue behind the three bugs: Shuusou Gyoku's Extra Stage is supposed to run on Hard difficulty and 2 starting lives, not on whatever you set for the regular 6 stages.
You can probably already imagine how invalid difficulty settings will cause desyncs shortly into a replay. Running a debug build at any commit between ≥P0295 and ≤P0310 quickly reveals the issue:

Screenshot of Shuusou Gyoku's debug mode at the beginning of the Extra Stage, showing off the intended internal difficulty (Hard) and initial rank value (Pr = 10240).
Screenshot of Shuusou Gyoku's debug mode at the beginning of an Extra Stage replay recorded on a ReC98 build between ≥P0295 and ≤P0310, showing off how the replay incorrectly uses the configured difficulty (Lunatic) along with the higher rank that comes along with it.
Different difficulties come with different initial rank values (Pr), which cause bullets and lasers to spawn at different speeds than what the player maneuvered around while recording the replay, which in turn will manifest as a desync.

The only way to protect a replay from this bug was to set the regular game to Hard difficulty and 2 starting lives in the Config → Difficulty menu before recording. This is probably one of the rarest configurations imaginable – most people will have set the difficulty to either Normal to get that survival clear that unlocks Extra in the first place, or to Lunatic because they're superplayers and that's the only difficulty that matters to them. :godzun:
Note how the bug only affects the saved replay file. You were still playing and recording Extra in its intended Hard difficulty and with 2 starting lives, and any clear you've achieved was still valid.


This is exactly the kind of bug that can easily fall through the cracks of the regular testing that my backers and I do for every new build. Replays are a key item on my testing checklist, but I primarily test whether existing ones still work. With only one replay slot per stage, recording a new replay is always a cumbersome process: Is my previous replay for that stage worth keeping? If yes, what made it special? After all, I now need to give a more descriptive name to the file. Do I remember, or do I have to watch the replay again?
Also, the primary concern of replays is compatibility with pbg's original 1.005 version. In that context, they can provide important evidence that I haven't accidentally forked the gameplay. Therefore, replays should hit as many gameplay aspects and potential failure/desync points as possible, which requires actual gameplay effort. :onricdennat: From that point of view, it makes more sense to just keep testing with existing replays, especially when it comes to the Extra Stage.


Since this was just a metadata issue, we can both easily fix this bug for future replays and repair any existing affected ones. We simply have to set the replay's difficulty and starting lives to the one and only official values for the Extra Stage, and they will play back correctly again.
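A minimal sketch of that header repair could look like the following. `ReplayHeaderSketch`, `Difficulty`, and `RepairExtraHeader()` are hypothetical names made up for illustration; the actual header layout in pbg's replay format is different and not reproduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the two relevant replay header fields. The real
// field names, types, and layout in the game are not shown in this post.
enum class Difficulty : std::uint8_t { Easy, Normal, Hard, Lunatic };

struct ReplayHeaderSketch {
	Difficulty difficulty;
	std::uint8_t starting_lives;
};

// The official Extra Stage settings named in the text: Hard, 2 lives.
constexpr Difficulty EXTRA_DIFFICULTY = Difficulty::Hard;
constexpr std::uint8_t EXTRA_STARTING_LIVES = 2;

// Forces the Extra Stage settings onto the header. Returns true if the
// header had to be changed, i.e., if the file on disk needs a rewrite.
bool RepairExtraHeader(ReplayHeaderSketch& header)
{
	if(
		(header.difficulty == EXTRA_DIFFICULTY) &&
		(header.starting_lives == EXTRA_STARTING_LIVES)
	) {
		return false;
	}
	header.difficulty = EXTRA_DIFFICULTY;
	header.starting_lives = EXTRA_STARTING_LIVES;
	return true;
}
```

Returning whether anything changed lets the caller skip the rewrite for replays that were recorded with matching settings.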

But doing that creates a potential problem. What if you actually modded the game before P0295, intentionally changed the difficulty and/or number of starting lives for the Extra Stage, and then recorded a replay? If that was the whole extent of your mod, such a replay would play back correctly on not just your modded build of Shuusou Gyoku, but on every single one of my builds and pbg's original 1.005 build. "Fixing" these settings in the replay header would then actually break such a replay. Since we're still using pbg's old replay format, there is no way we can distinguish valid modded replays from broken and desyncing ones by looking at just the replay header.
We could tell after we've run the replay – if the game ends before the replay has reached its last recorded frame, we know that something is wrong. However, we're not quite at the point where we can quickly simulate an entire round of gameplay logic on just the CPU, without rendering anything. The best we could do until then is to pop up a message at the end of a rendered replay, informing the player that they've just watched a desync and offering an automatic repair of known issues. But that would be a lot of work for a policy-bugfix, and even fall under the planned paid feature of improved replay-related error reporting. And if we zoom out, such a window won't be much of a help in the general case of people watching replays from incompatible builds. The game can't possibly know the specific mod a desyncing replay originated from, so what could it possibly do, other than to say "Sorry, that was, in fact, a desync 🤷"?
That's why it's so important to me that 📝 the new replay format stores the exact game binary and stage script versions a replay was recorded on. As well as any gameplay tweaking options, if we ever go that route: Properly fixing Shuusou Gyoku's fake deathbomb quirk is not just about the few lines of custom gameplay code you can find in Tasos500's fork, but mainly about the bureaucracy of cleanly establishing a separate competition tier, not breaking existing replays, and making sure that replay hosting sites deal with the distinction as well.

That said, that's a lot of thought for a very specific potential scenario. Any change to the Extra Stage settings would have required modding the game at either the C++ or machine code level. If you were able to pull that off and you're considering updating to the new build, you'll probably also read these lines and will have no problem adapting whatever fix I roll out for that issue.


So, let's go for that unconditional rewrite of every affected Extra Stage replay upon clicking the ExStage デモ再生 option… but wait, why are the rewritten files suddenly smaller than the old ones? :thonk:
Turns out that there was another replay-related bug that dates back to 📝 my very first Shuusou Gyoku release from September 2022. This one boiled down to the classic C/C++ footgun of confusing byte sizes with element counts, but pbg's misleading variable names certainly played their part as well.
This bug is mostly harmless within the unmodded game, which also explains why I didn't detect it for so long. The game doesn't care about compressing and uncompressing twice as many bytes, the loader still copied the correct number of bytes and wouldn't have overflowed the buffer, and at a few KB per replay, it doesn't really stick out if the files are roughly twice as large as they need to be. But this was still a landmine that would have exploded once modders crafted stages longer than half of the 20-minute buffer that pbg designated for replays.
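To illustrate the general shape of this footgun, here is a sketch under two loudly labeled assumptions: one 16-bit input word per recorded frame (which would match the "twice as large" symptom), and a frame rate of 60 FPS for the 20-minute buffer. All names are hypothetical, and this is not the game's actual saving code.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumption: one 16-bit input word per recorded frame.
using InputWord = std::uint16_t;

// Assumption: 20 minutes of replay data at 60 frames per second.
constexpr std::size_t BUFFER_FRAMES = (20 * 60 * 60);
constexpr std::size_t BUFFER_BYTES = (BUFFER_FRAMES * sizeof(InputWord));

// Converts an element count into a byte size.
constexpr std::size_t SizeInBytes(std::size_t element_count)
{
	return (element_count * sizeof(InputWord));
}

// The footgun: a value that already is a byte size gets treated as an
// element count and is scaled by the element size a second time.
constexpr std::size_t SizeInBytesDoubled(std::size_t frame_count)
{
	const std::size_t already_bytes = SizeInBytes(frame_count);
	return SizeInBytes(already_bytes);
}

constexpr bool FitsInBuffer(std::size_t bytes)
{
	return (bytes <= BUFFER_BYTES);
}

// Harmless for short replays, where the doubled size still fits…
static_assert(FitsInBuffer(SizeInBytesDoubled(BUFFER_FRAMES / 2)));
// …but out of bounds as soon as a replay exceeds half of the buffer.
static_assert(!FitsInBuffer(SizeInBytesDoubled((BUFFER_FRAMES / 2) + 1)));
```

Spelling out the unit in every variable and parameter name (`_frames`, `_bytes`) is a cheap way of making this kind of mix-up visible at the call site.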

Since I'm already implementing an automated fix here, I might as well also recompress every watched non-Extra Stage replay if its number of decompressed bytes doesn't match the replay's indicated frame count. Of course, recompression won't work if you've marked the replay files as read-only, which I often do as a means of protecting them from getting overwritten with accidentally recorded new replays of the same stage…
…but wait, how about restricting both fixes to writable replay files? This would create at least one possibility of protecting existing modded replays, and also make sense from a consistency point of view. If the game isn't allowed to fix the replay, it also shouldn't silently hotpatch its header and play it differently than any other build of the game would play it, even if that way was the correct one in the vast majority of cases. Sure, this is slightly annoying for people who use the same read-only trick, but those people will probably also read either these lines or the release notes.
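The two conditions from these paragraphs could be sketched roughly like this. Both function names are hypothetical, the 16-bit-per-frame input size is an assumption, and probing writability by opening the file for reading and writing is just one portable option, not necessarily what the build does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <filesystem>
#include <fstream>

// Assumption: one 16-bit input word per recorded frame.
using InputWord = std::uint16_t;

// True if the decompressed payload size doesn't match the frame count
// indicated by the replay.
bool NeedsRecompression(std::size_t decompressed_bytes, std::size_t frame_count)
{
	return (decompressed_bytes != (frame_count * sizeof(InputWord)));
}

// Probes whether the game would be allowed to rewrite the file. Opening
// with `in | out` neither creates a missing file nor truncates an existing
// one, and the file is closed again without writing anything to it.
bool IsWritable(const std::filesystem::path& path)
{
	std::fstream probe(path, (std::ios::in | std::ios::out | std::ios::binary));
	return probe.is_open();
}
```

With these, a repair would only be attempted, and the header only be hotpatched for playback, if `IsWritable()` returns true for the replay file.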

As a neat bonus, I also made sure to preserve the original timestamps of any repaired and/or recompressed replay file. This is the only other piece of meaningful identifying metadata we have with these files, and I don't want to throw it away just because I messed up the saving code at one point. Without that extra level of care, I probably wouldn't have gone for such an unconditional automatic fix in the first place. Instead, this little detail makes the whole fix as invisible as it could possibly be. If you only recorded an Extra Stage replay once, haven't watched it since, and haven't touched the file either, you won't even notice that there was a bug in the first place.
SDL 3's filesystem API does not cover file timestamp modification, so this required more OS-specific code of the kind 📝 I'd rather get rid of. SDL 3 does support timestamp retrieval though, and that's all we need for the new replay format, where I'll take the timestamp from the filesystem and properly write it into the file itself.
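For illustration only, the general idea of a timestamp-preserving rewrite can be sketched with C++17's `std::filesystem`. This is a swapped-in technique and not the OS-specific code from the build: `std::filesystem` only covers the last-write time, and `RewritePreservingMTime()` is a hypothetical name.

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <string_view>
#include <system_error>

namespace fs = std::filesystem;

// Overwrites `path` with `contents` and then restores the last-write time
// the file had before the rewrite. Returns false on any failure.
bool RewritePreservingMTime(const fs::path& path, std::string_view contents)
{
	std::error_code ec;

	// Remember the timestamp before touching the file.
	const fs::file_time_type mtime = fs::last_write_time(path, ec);
	if(ec) {
		return false;
	}

	{
		std::ofstream out(path, (std::ios::binary | std::ios::trunc));
		if(!out) {
			return false;
		}
		out.write(
			contents.data(), static_cast<std::streamsize>(contents.size())
		);
		out.flush();
		if(!out) {
			return false;
		}
	} // The file is closed here, before the timestamp gets restored.

	fs::last_write_time(path, mtime, ec);
	return !ec;
}
```

The order matters: the timestamp has to be restored after the stream is closed, since closing the file can still flush data and update the modification time.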

And there we go, no more replay bugs! Also, did I just write down all the justification anyone would ever need for the new replay format? That should shorten that future blog post by quite a bit, at least…

Thanks to >>49320040 on /jp/ for pointing out that desyncs exist. Please tell me this sort of thing! I'm not ZUN; desyncs are critical bugs that will always receive my immediate attention. If they turn out to be my fault, they definitely fall under my free bugfix policy, and if they don't, we at least get to document them as bugs in the original game that might get fixed in a later push.
Alright, back to writing blitters for the PC-98…