Another replay desync bug in Shuusou Gyoku's Extra Stage that I accidentally introduced in P0295 and that therefore falls under my free bugfix policy?! Last time, we were lucky and there was a general solution that would allow automatic repairs for affected replays inside the game, but not this time. That's right, the most dreaded catastrophic case has actually happened, and I accidentally forked gameplay. ⚠️ If you've recorded an Extra Stage replay on any build I released in the last 10 months, there's a rare chance that your replay might be invalid and would desync on the original game and future builds.

  1. Accidentally breaking Marisa's bit rotation
  2. Defense strategies

Accidentally breaking Marisa's bit rotation

So, what happened? 📝 Modernizing pbg's gameplay code for the sole sake of satisfying static analysis was a scary prospect from the start, despite the already reduced scope of warnings I went for. And sure enough, this bug had to be an issue with integer casting once again, specifically in the code that controls the rotation of Marisa's 📝 bits:

// Calculate the angular velocity (`dir`) of a bit.
// `bit->Angle` is the same kind of 📝 8-bit angle used in PC-98 Touhou.
const uint8_t d = ((BitData.BaseAngle >> 1) + (delta * bit->BitID));

// Original pbg code
int dir = (int)d - (int)(bit->Angle);

// Broken modern rewrite in P0295
int dir = (Cast::sign<int8_t>(d) - Cast::sign<int8_t>(bit->Angle));

// Correct modern rewrite in P0310-2
int dir = (Cast::up_sign<int>(d) - Cast::up_sign<int>(bit->Angle));
Why did I try to be smart here?

Let's plug in 44 for d and 172 for bit->Angle. With the original int casts, this results in an obvious dir of

(44 - 172) = -128

But if an int8_t cast would turn 172 to -84, shouldn't a result of

(44 - (-84)) = 128

then also overflow to -128 if the calculation is done in 8 bits? As of C++20, converting an out-of-range value like 128 back to int8_t is even well-defined and wraps around. Well, the calculation is never done in 8 bits to begin with: C still promotes both operands to int and only then checks whether they're the same type, as specified in Section 6.3.1.8/1 of the C standard. The subtraction therefore yields an int of 128 that nothing converts back to 8 bits. Thus, we lose the expected overflow, and the bits begin to rotate in the other direction. That's now the third time that C's integer promotion rules have 📝 ruined my day.
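The worked example above can be reproduced with a minimal sketch. It uses plain `static_cast` in place of the `Cast::` helpers, whose implementation is not shown in this post, so that substitution and the three function names are assumptions made purely for illustration. The `int8_t` conversions of values above 127 are guaranteed to wrap as of C++20 and do so on common compilers before that.

```cpp
#include <cassert>
#include <cstdint>

// Original behavior: both 8-bit angles are widened to `int` without
// changing their value, so the difference lies within [-255, 255].
constexpr int dir_original(uint8_t d, uint8_t angle) {
    return static_cast<int>(d) - static_cast<int>(angle);
}

// Stand-in for the broken rewrite: both operands are reinterpreted as
// signed 8-bit values first. The subtraction itself still happens in
// `int` because of integer promotion, so the result never wraps.
constexpr int dir_broken(uint8_t d, uint8_t angle) {
    return static_cast<int8_t>(d) - static_cast<int8_t>(angle);
}

// What the 8-bit reasoning expected: the promoted `int` result is
// explicitly converted back to 8 bits, which wraps 128 around to -128.
constexpr int dir_wrapped(uint8_t d, uint8_t angle) {
    return static_cast<int8_t>(
        static_cast<int8_t>(d) - static_cast<int8_t>(angle)
    );
}

// The example values from the text: d = 44, bit->Angle = 172.
static_assert(dir_original(44, 172) == -128, "value-preserving casts");
static_assert(dir_broken(44, 172) == 128, "promotion prevents the wrap");
static_assert(dir_wrapped(44, 172) == -128, "explicit 8-bit wrap");
```

The `static_assert`s only cover the single pair of values discussed above. They show that the sign flip comes from the missing conversion back to 8 bits rather than from the `int8_t` casts alone.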

It's easy to see how a difference in rotation can lead to different gameplay by changing the spawn points of bullets, but this bug actually tends to have far bigger consequences. At one point during the fight, the ECL script reads out the bit angle and not only applies it to Marisa's own movement angle, but also uses it to determine the next danmaku pattern. And with a large enough difference compared to pbg's original intentions, we get the reported desync:

Since a lot of Marisa's behavior and pattern selection comes down to RNG, this doesn't necessarily have to happen. While most fights will run into a case where this bug would change bit direction for a few frames, pattern differences like this one appear to be rather rare. None of my six Marisa-beating test replays desync in response to this bug, which explains how it remained undetected during all my testing.
Thankfully, the replay above was recorded on P0275. Thus, it relied on the original behavior, and the replay will play back correctly again on the new and future builds. So far, I haven't seen any replay that relies on the wrong behavior; even BWF6's crazy god-tier RNG run that got all the easy patterns remains legit. But the possibility definitely exists.
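To get a feel for when the bit direction changes at all, the original and broken expressions can be compared across every possible pair of 8-bit values. This is a rough sketch with a hypothetical `count_forks()` helper and plain `static_cast`s standing in for the `Cast::` helpers. It says nothing about which angle pairs actually occur during the fight.

```cpp
#include <cassert>
#include <cstdint>

struct ForkStats {
    int differing = 0;  // Pairs where the two expressions disagree
    int sign_flips = 0; // Differing pairs where `dir` changes its sign
};

// Compares the original `int` subtraction against the `int8_t`-cast
// variant for all 256 × 256 combinations of `d` and `bit->Angle`.
ForkStats count_forks() {
    ForkStats stats;
    for (int d = 0; d < 256; d++) {
        for (int a = 0; a < 256; a++) {
            const auto d8 = static_cast<uint8_t>(d);
            const auto a8 = static_cast<uint8_t>(a);
            const int original = (
                static_cast<int>(d8) - static_cast<int>(a8)
            );
            const int broken = (
                static_cast<int8_t>(d8) - static_cast<int8_t>(a8)
            );
            if (original != broken) {
                stats.differing++;
                if ((original < 0) != (broken < 0)) {
                    stats.sign_flips++;
                }
            }
        }
    }
    return stats;
}
```

The two expressions disagree exactly when one of the two values is 128 or larger and the other is not, which covers 32,768 of the 65,536 pairs, and `dir` changes its sign in every one of them. How often the game actually lands on such a pair depends on the RNG-driven angle values, which this sketch does not model.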

Defense strategies

The only true defense against this kind of RNG-dependent gameplay fork involves validation against tons and tons of replays. Unfortunately, Shuusou Gyoku's original single-replay system makes collecting them too impractical for everyone involved. And as long as the backers focus on flashier features over testability, we'll always run the risk of accidental gameplay forks, despite my best efforts and promises.
Still, the obvious and actionable lesson for myself is that I have to get better at not touching gameplay code. And as a first step toward that, I will keep the fix at this single line instead of looking for more of these potential issues by throwing more static analysis at pbg's code.

This bug could have been prevented by having at least a basic physical split of gameplay and rendering code into different source files, which would allow the former to be excluded from these massive modernizing refactors more easily. Sure, TH02 and especially TH01 intertwine gameplay and rendering so much that this would have been another subproject consuming multiple pushes. But for Shuusou Gyoku, it would have been easy, and I still didn't do it. Big mistake. I'm definitely going to take the time once we get to Kioh Gyoku.
In that light, it's a massive advantage that I'll have to implement TH03 netplay without MAIN.EXE being position-independent. This way, the binary diff against the original version remains small by necessity, and the risk of accidental gameplay forks is massively reduced.

Many, many thanks to Ripper Roo for reporting this bug and providing the one affected replay. Let's hope that this was the one and only bug hiding in these guideline rewrites…
We're lucky to have found this one before I implemented the upcoming better and more community-usable replay system. Imagine if we already had a replay site that hosted a bunch of replays recorded on the broken builds. We'd now have to mark all of them as forked, except that we probably want to manually validate and remove that fork marker for every replay that syncs on pbg's original build after all… That would have been even more bureaucracy waiting for me now.

At least the next regular PC-98 Touhou delivery doesn't even try to touch game code. It's been feature-complete for a while and is only missing the blog post, some in-depth testing, and the usual release preparations. As you might have guessed from the time it took, this first foray into portability escalated to another big 10-push monster… Really happy with the visualizations this time around, though.