
💰 Funded by:
Root, Arandui

Backporting my Shuusou Gyoku build to Windows 98 was one of my favorite commissions in recent history. If you remember 📝 last year's backport of the overhauled ReC98 build system to Windows 9x, it left me rather demoralized at the end of it all. Sure, it may be the technically fastest way of fully rebuilding the entire codebase, but it just doesn't matter to me personally – incremental rebuilds on modern systems are still faster and much better integrated with the editors I actually use. People might have appreciated the research that went into it, as usual, but it just feels so pointless if nobody actually uses the result. So why are we treating Windows 9x compatibility as this noble goal and ideal expectation again? Just because retro-computing communities exist and prefer to paint it that way? The length of this post should hopefully make it clear that this is nothing that should be demanded or taken for granted.
That's why seeing this goal in particular getting funded was such a refreshing change of perspective. Finally, retro-computing people have put their money where their mouth is, and invested in something other than hardware! 🙌

  1. The backporting strategy
  2. A compatibility layer for Microsoft's C runtime
  3. A brief look back at Zig
  4. Compiling SDL 2 for Windows 98
  5. Restoring compatibility with D3DWindower
  6. Handling Shift-JIS and Unicode

So, how do you backport a modern C++ project to Windows 98 in 2025? Visual Studio removed official support for such old systems a long time ago, and increasingly uses newfangled Win32 API functions in its C++ standard library implementations where they can't be trivially removed.
If your codebase of choice restricts itself to old C and C++ standards, compiling it with an old version of Visual Studio can get you most of the way there. But this is becoming increasingly unlikely as we only ever move further away from the mid-90s. After all, this restriction would not only have to apply to a project's own code, but to all of its dependencies as well, since a backport can't just fall back on precompiled libraries. And then, all bets are off – some projects like miniaudio might be committed to supporting Visual C++ 6, but others might just freely use whatever language features are available on the GCC version that is part of the oldest Linux image offered by their CI provider. Which is totally understandable: There is a reason behind new language versions, and at some point, developers just want to move on and stop taking productivity hits all the time. Or just prefer to try something new, because C89 in particular sure gets old after writing a 5-digit number of lines in it, at least as far as I'm concerned. I'm still hoping that I get to statically recompile Turbo C++ 4.0J one day and add at least a few more language features and code optimizations to it…
Also, having simple and accessible build processes has always been a guiding principle of mine. If people can't compile with widely available tools and have to acquire old proprietary compilers from legally dubious sources, I don't fully deliver on a key promise of free software, which is kind of important to me.

But as long as the Windows 98 users are willing to install KernelEx, we can get very far with even current Visual Studio versions. KernelEx covers most of those newfangled Win32 API functions, and even helpfully makes Windows ignore the *OperatingSystemVersion fields in the PE header. The only thing we should manually add to the build process is the /arch:IA32 flag: It removes any modern x86 instructions in newly-compiled code and thus ensures that the game still runs on period-correct CPUs. Of course, the modern build should use all modern instructions it possibly can, but it makes sense to limit Windows 98 support to the alternate build with pbg's original DirectDraw and Direct3D graphics and add the flag there.
And sure enough, 📝 this worked out beautifully for the first few releases of my Shuusou Gyoku fork. But once I added more features, running on Windows 98 became increasingly difficult:

So let's finally give this backport the dedicated attention it needs, and start the usual backporting loop:

  1. Encounter one of the classic DLL function errors at startup
  2. Look at the disassembly to figure out where that call came from
  3. Either rewrite the offending code to not use the function, or find some way of polyfilling it if the call originated from code that is not under your direct control
  4. Repeat until the game works
  5. Follow the same steps for any crashes or weird behavior introduced by the older Windows version

There is some room for creativity in this process, as well as non-zero hack value and enjoyment from seeing it all work out in the end. Heck, MattKC even made a blockbuster feature film out of it. But ultimately, it's dumb drudge work that wouldn't be worth doing if no actual person cares.
And I haven't even mentioned the worst part: Setting up a full-featured, bug-free, and performant VM that connects to your development system in a sort of comfortable way – and then repeating this process for different language versions of Windows 98, and even for Windows XP and maybe 7 when it comes to debugging DirectDraw issues. This only gets harder as the required dedicated VM code for these old systems starts to bit-rot, which has apparently left every piece of VM software out there with at least one deprecated or already removed feature…


So, let's start by looking at the Win32 API functions referenced by Microsoft's C runtime, and create a separate project for any required polyfills so that they would never annoy anyone. I only found out later that someone else had the same idea, but they quickly moved on to targeting Clang after hitting a roadblock with std::mutex, and only ever wanted to target NT kernels anyway. So there definitely was a point to me starting a new project there. It also gave me the chance to approach the whole problem of overwriting and redirecting DLL imports from multiple angles, which led me to arrive at two slightly different solutions. Check the project's README for more details on that.
After I previously 📝 migrated all C++ threading code to SDL as part of the Linux port, I only needed to polyfill a total of 8 functions to get the game back running:

Not only were these functions enough to cover Windows 98 with the one version of KernelEx I managed to snag from the MSFN forum before they disabled downloads, but they also made the game run on unmodified Windows XP again! To completely remove the need for KernelEx and unicows.dll, we'd still have to cover a slightly bigger number of Win32 API functions, though. But now that this push has put all the foundations into place, the chances are good that the next push might already get this done. And at that point, even Windows 95 support wouldn't be far away.
If it takes longer, it'll probably be due to these two other remaining issues:


The second issue in particular shows the limits of this approach. It's only a matter of time until Microsoft activates unconditional SSE on every single part of the precompiled CRT, forcing us to reimplement pretty much all of it for continued 9x compatibility.
This is exactly why I prefer the Zig approach of compiling the C standard library on demand against the chosen CPU model. Looking at Zig's recent progress, I'm very impressed to see that the community has addressed almost 📝 all of my pain points in the 1½ years since I last looked at it. The Zig compiler now has PDB basenames, compilation progress output, and improved UBSan error messages, and compilation speed is actively being worked on.
Unfortunately though, they still break the build system all the time. For a system-level dependency that people can and will use different versions of, that kind of instability is a non-starter. So I'm very likely not going to migrate anything to Zig before the compiler hits 1.0, unless they do a feature freeze or make some other kind of compatibility promise before that point. Oh well, I've put too much effort into my Tup building blocks to not continue using them for at least a few more years.

It might seem like compiling with MinGW would be the more reliable alternative here. Even its GCC 14 version still sets the *OperatingSystemVersion and *SubsystemVersion fields to 4.0, indicating Windows 95, when compiling a 32-bit binary. And if MinGW ever decides on a higher default, I'm sure that the --(major|minor)-(os|subsystem)-version linker flags will continue to allow that default to be freely overridden. Unlike Visual Studio 2022's LINK and EDITBIN tools, which refuse OS version 4.0 for no particular reason. 🙄
However, MinGW is hardcoded to link against the DLL version of Microsoft's C runtime and offers no option of statically linking the CRT, presumably due to legal reasons. This used to be no problem as its GCC ≤13 versions linked against the generic msvcrt.dll, which is available on Windows 98 as well. But this was bad for multiple reasons. And so, even they ultimately decided that Windows 7 was a reasonable minimum requirement these days and made MinGW's GCC 14 version link against the Universal CRT, with all its api-ms-win-crt-*-l1-1-0 DLLs. We can only avoid these DLL dependencies by going -nostdlib and rewriting all our code accordingly – but guess what, we could do the exact same thing on MSVC with /NODEFAULTLIB, without switching compilers.
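
For anyone who wants to check what a given toolchain actually wrote into these fields, here is a minimal sketch that reads the *OperatingSystemVersion and *SubsystemVersion fields straight out of a PE image in memory. The offsets follow the published PE format (`e_lfanew` at 0x3C, a 4-byte signature, a 20-byte COFF header, then the version fields at optional header offsets 40 to 50). `PEVersions`, `pe_versions()`, and `fake_image()` are hypothetical names made up for this illustration, and the fake image only contains the bytes this parser looks at.

```cpp
#include <cstddef>
#include <cstdint>

// The four version fields of the PE optional header discussed above.
struct PEVersions {
    bool valid;
    std::uint16_t os_major, os_minor;
    std::uint16_t subsystem_major, subsystem_minor;
};

// Little-endian reads from a byte buffer.
constexpr std::uint16_t read_u16(const unsigned char* p) {
    return static_cast<std::uint16_t>(p[0] | (p[1] << 8));
}
constexpr std::uint32_t read_u32(const unsigned char* p) {
    return (
        std::uint32_t{ p[0] } | (std::uint32_t{ p[1] } << 8) |
        (std::uint32_t{ p[2] } << 16) | (std::uint32_t{ p[3] } << 24)
    );
}

// Hypothetical helper: extracts the version fields from an in-memory image.
constexpr PEVersions pe_versions(const unsigned char* image, std::size_t size) {
    constexpr PEVersions invalid = { false, 0, 0, 0, 0 };
    if((size < 0x40) || (image[0] != 'M') || (image[1] != 'Z')) {
        return invalid;
    }
    const std::size_t pe = read_u32(image + 0x3C); // e_lfanew
    // 4-byte signature + 20-byte COFF header + optional header up to and
    // including MinorSubsystemVersion (offset 50, 2 bytes).
    if((pe > size) || ((size - pe) < (4 + 20 + 52))) {
        return invalid;
    }
    if((image[pe] != 'P') || (image[pe + 1] != 'E') || image[pe + 2] || image[pe + 3]) {
        return invalid;
    }
    const unsigned char* opt = (image + pe + 4 + 20);
    return PEVersions{
        true,
        read_u16(opt + 40), read_u16(opt + 42), // *OperatingSystemVersion
        read_u16(opt + 48), read_u16(opt + 50), // *SubsystemVersion
    };
}

// Hypothetical demo data: a fake image with just enough header bytes.
struct FakeImage { unsigned char bytes[0x100]; };
constexpr FakeImage fake_image(
    unsigned char os_major, unsigned char os_minor,
    unsigned char ss_major, unsigned char ss_minor
) {
    FakeImage img = {};
    img.bytes[0x00] = 'M';
    img.bytes[0x01] = 'Z';
    img.bytes[0x3C] = 0x80; // e_lfanew: PE signature at 0x80
    img.bytes[0x80] = 'P';
    img.bytes[0x81] = 'E';
    img.bytes[0x98 + 40] = os_major; // optional header starts at 0x80 + 24
    img.bytes[0x98 + 42] = os_minor;
    img.bytes[0x98 + 48] = ss_major;
    img.bytes[0x98 + 50] = ss_minor;
    return img;
}

// What the text describes for 32-bit MinGW output: 4.0 in both field pairs.
constexpr FakeImage WIN95_IMAGE = fake_image(4, 0, 4, 0);
static_assert(pe_versions(WIN95_IMAGE.bytes, sizeof(WIN95_IMAGE.bytes)).os_major == 4, "OS 4.x");
static_assert(pe_versions(WIN95_IMAGE.bytes, sizeof(WIN95_IMAGE.bytes)).subsystem_major == 4, "subsystem 4.x");
```

On a real binary, you would pass the bytes of the loaded .exe or .dll file instead of the fake image.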

Unfortunately, that's the same reason why Zig would make no difference, regardless of whether you use it as a compiler for C/C++ code or write pure Zig. If you build for Windows, you can merely choose between the GNU and MSVC ABI. Then, Zig behaves exactly like the respective C compiler: Select GNU and you get the UCRT dependencies, select MSVC and you get the statically linked Microsoft CRT with all of its aforementioned drawbacks. Supposedly, it's possible to bypass MSVC, but the GNU ABI was the answer to the question of compiling without Visual Studio back then. Establishing an easy-to-use third ABI without any dependencies sounds like much more of a research project than just staying with C++.
Not to mention that Zig's Windows version support policy follows Microsoft's extended support lifecycle. Zig 0.6.0 dropped Windows 7 support, and Zig 0.11.0 dropped Windows 8.1 support. While Andrew Kelley is open to non-invasive patches for greater OS support, these could break at any moment and would therefore need consistent maintenance as well.


Surprisingly, SDL 2 has been causing by far the least amount of problems in all of this. A small adjustment to its threading functions removed its only mandatory reliance on Microsoft CRT code, and KernelEx and unicows.dll then cover any remaining unconditional usage of newer Win32 API functions. Since we already needed a __WIN9X__ macro to opt into this change and retain SDL's default behavior on modern systems, I also took the opportunity to disable most of the subsystem backends that are unsupported on Windows 98, shaving a few hundred KB off the DLL's file size.

This made SDL look even better than 📝 the already good impression I got last time. Not only is SDL not a problem, but it's actually the biggest asset we have in a Windows 9x port. And with all the improved subsystems in SDL 3, it becomes so much of an asset that we should ideally just go all in on SDL 3 and make it a hard dependency of even the cross-platform logic code.
This would be quite a big deal, and it might not immediately be obvious why. Doesn't every one of our supported platforms already depend on SDL anyway? Internally though, my current architecture predates the plan of using SDL and is still designed for the hypothetical case of not using it. After all, retaining and expanding pbg's old backend code for a slim Windows 98 port without any big dependencies was a viable option that could have been funded. But now that the backers have voted against it, directly architecting all code against SDL 3 would have so many upsides:

In fact, this idea is so convincing that it makes me want to freeze all new feature or backport development for Shuusou Gyoku until it's done. However, we absolutely want to do this with SDL 3 rather than 2 to reap the full set of benefits. This would imply removing the SDL 2 code path for good, but our Flatpak still uses this code path because the Freedesktop SDK will only start shipping SDL 3 with the next update in August. We could compile SDL 3 from source in the meantime, but maybe we shouldn't? :thonk:
Given the funding situation and general hype, it'll probably be best if I just focus on TH03 until then.

But once that's done, it would only leave TrueType fonts, MIDI, and graphics rendering as the subsystems that our architecture supports system-specific APIs for – and even MIDI will only be on there until someone funds MIDI support for just a single non-Windows platform.
You might wonder why graphics rendering is on there, but we can unfortunately never get rid of pbg's original DirectDraw code. The 8-bit mode is just too crucial for getting the game to run decently on the old systems without 3D acceleration that a Windows 9x port is supposed to target. We could try going full GDI in the hope of maybe even being faster or more portable, but that would just be another custom backend.
We could, however, go the opposite route. Turning pbg's old code into an SDL_Renderer backend would facilitate all kinds of backports of pure SDL_Renderer games to that late-90s period of hardware. Those games will probably not run all that well 📝 if our benchmark results for software rendering are any indication, but the idea definitely has hack value.
And why stop there? Let's add a PC-98 backend! :tannedcirno: …yeah, I'm getting off-track.


Speaking of pbg's old rendering path though: Having to run it while debugging the Windows 98 backport comes with the practical problem that we still have no proper windowed mode for it. Multi-monitor support in VMs is sketchy at best, and even if it works, running OllyDbg on a separate virtual monitor next to exclusive fullscreen 8-bit DirectDraw still doesn't prevent these highly disruptive mode switches between the game and debugger windows.

Fortunately, D3DWindower is old enough to still work on Windows 98 and works well with pbg's original build of Shuusou Gyoku. Unfortunately, 📝 it stopped working as soon as I migrated window creation to SDL 2. But if we put two and two together, we immediately get a theory as to why: Because it works on Windows 98, D3DWindower might only hook the ANSI versions of all the Windows API functions that games can use to enter exclusive fullscreen mode, but SDL uses the Unicode variants. You wouldn't think that a mode-switching API uses potentially localizable strings as part of its parameters, but hey, maybe monitors are treated like files and addressed with names?
And indeed, SDL uses ChangeDisplaySettingsExW(), but D3DWindower only hooks ChangeDisplaySettingsExA(). Switching from W to A was all it took to get it working again… on modern Windows at least. :thonk: It wasn't enough for Windows 98, but what could we possibly be missing?

Turns out that KernelEx is the one and only issue there. D3DWindower (or rather, its internally used madCodeHook library) uses the Win32 GetVersion() function, but interchangeably calls it both via its import and via a proc address pointer retrieved directly from kernel32.dll. KernelEx only wraps one of the two, which causes the hooking algorithm to fail as it gets confused by contradictory Windows version numbers.
The problem with version numbers is that the number-returning function itself has no way of knowing the caller's intent. I can't think of a situation where it wouldn't make more sense to query the presence of a certain OS feature rather than the version number of the entire thing. And so, KernelEx's wrapper makes the understandable choice of returning exactly the version you've configured for the executable:

Screenshot of the Windows 98 file property dialog, showing the KernelEx tab for an executable that is configured to use the specific compatibility mode for Windows 2008 SP1
This will cause the hooked GetVersion() to return 0x17710006 rather than 0xC0000A04.
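
To see why these two numbers contradict each other, it helps to take them apart. The following sketch decodes a GetVersion() return value according to its documented layout: major version in the lowest byte, minor version in the byte above, and the highest bit set on anything that isn't an NT kernel. `WindowsVersion` and `decode_get_version()` are hypothetical names for this illustration and are not taken from madCodeHook or KernelEx.

```cpp
#include <cstdint>

struct WindowsVersion {
    unsigned major_version;
    unsigned minor_version;
    unsigned build; // only meaningful if `is_9x` is false
    bool is_9x;     // highest bit set: not an NT kernel
};

// Hypothetical helper: splits a GetVersion() return value into its parts.
constexpr WindowsVersion decode_get_version(std::uint32_t v) {
    return WindowsVersion{
        static_cast<unsigned>(v & 0xFF),
        static_cast<unsigned>((v >> 8) & 0xFF),
        static_cast<unsigned>((v >> 16) & 0x7FFF),
        ((v & 0x80000000u) != 0),
    };
}

// The real system: Windows 98, version 4.10, 9x kernel.
static_assert(decode_get_version(0xC0000A04).major_version == 4, "major");
static_assert(decode_get_version(0xC0000A04).minor_version == 10, "minor");
static_assert(decode_get_version(0xC0000A04).is_9x, "9x kernel");

// What KernelEx's wrapper reports: version 6.0, build 6001, NT kernel.
static_assert(decode_get_version(0x17710006).major_version == 6, "major");
static_assert(decode_get_version(0x17710006).build == 6001, "build");
static_assert(!decode_get_version(0x17710006).is_9x, "NT kernel");
```

Any code that branches on the highest bit will therefore take the 9x path for one of the two values and the NT path for the other.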

madCodeHook, however, uses the version number to pick between the completely different hooking strategies for 9x and NT kernels, and therefore always needs the actual version of the underlying system. Presumably, these different strategies are needed because 9x kernels didn't have the copy-on-write mechanism that allows a process to freely rewrite system DLL code without affecting other processes. Instead, 9x kernels only have a single global shared instance of all system DLLs, which gets mapped to the same address for every process. This is also why setting code breakpoints within system DLLs on 9x can break the entire system: Since 9x doesn't support hardware breakpoints, debuggers only have the option of writing the INT 3 instruction byte (0xCC) to the breakpoint address and then reverting it before resuming execution. But this instruction can only break back into the debugger for the one process that the debugger is attached to. In the meantime, every other process is left with a corrupted instruction stream, and OllyDbg's cryptic Unable to flush cache only describes a single symptom of the ensuing general instability.
Thankfully, KernelEx lets us disable its GetVersion() wrapper for any specific compatibility mode by editing the respective section of %WINDIR%\KernelEx\Core.ini. For Windows 2008 SP1, the change would be:

 [WIN2K8.names]
-KERNEL32.GetVersion=kexbases.8
+KERNEL32.GetVersion=std

After a reboot, D3DWindower then succeeds in hooking ChangeDisplaySettingsExA() through KernelEx even if the KernelEx-injected Microsoft Layer for Unicode previously redirected ChangeDisplaySettingsExW() to that function.
Since the ideal definition of a "Windows 98 backport" does not include KernelEx though, it made sense to already go ANSI right now and restore general compatibility with D3DWindower on NT kernels as well. And with one more crucial manual setting that prevents SDL from crashing itself in confusion…

Screenshot of the D3DWindower Foreground Control settings that are necessary to prevent SDL from trying to move back into fullscreen mode after the game window lost and regained focus, which will result in a weirdly stretched borderless fullscreen image followed by a crash
Yes, I would have preferred a nice GIAN07 (Windows 98).exe file name, but D3DWindower unfortunately glitches if binary names contain spaces. If the directory of a hooked executable contains another executable whose name matches the hooked one up to the first space, D3DWindower will run that other executable instead of the intended one. We sure don't want to run the regular GIAN07.exe by accident.

… we've got the game running in a provisional windowed mode on Windows 9x!

Screenshot of the ReC98 Shuusou Gyoku build running in a 640×480 window on a Windows 98 desktop, next to the System Properties window (as usual for backport screenshots) and the D3DWindower window (demonstrating how the game was windowed). The game window shows the screenshot submenu introduced in the ReC98 P0309 build, which was definitely not part of pbg's original release.
Screenshot of the ReC98 Shuusou Gyoku build running in a 640×480 window on a Windows 98 desktop, next to the System Properties window (as usual for backport screenshots) and the D3DWindower window (demonstrating how the game was windowed). The game window shows the BGM pack selection submenu introduced in the ReC98 P0275 build, which was definitely not part of pbg's original release.
Screenshot of the ReC98 Shuusou Gyoku build running in a 640×480 window on a Windows 98 desktop, next to the System Properties window (as usual for backport screenshots) and the D3DWindower window (demonstrating how the game was windowed). The game window shows the Stage 3 entrance animation in 8-bit mode, demonstrating how D3DWindower's inaccurate 8-bit color emulation replicates the infamous "golf course" bug.
I can't stress enough that debugging was the main intention behind this fix. Without scaling options, D3DWindower is not a replacement for a proper windowed mode, and it adds its own bugs on top. 📝 The P0251 blog post has more detail about how precise an 8-bit DirectDraw emulation has to be to avoid the infamous golf course in Stage 3.

But turning off Unicode in the Windows 98 build of SDL 2 also had one unfortunate drawback: The window title is now ??? rather than 秋霜玉, even when running on NT kernels or a Japanese version of Windows 98. That brings us right to the other big complication of this backport:

😩 Handling Shift-JIS and Unicode 😩

With SDL now using the *A() functions, you might have expected the same mojibake that you'd see in the windowed title bar of pbg's original build. But since we pass UTF-8 to SDL rather than Shift-JIS, the result would always be slightly different. The number of question marks does match the number of codepoints in the string though, which means that SDL does convert from UTF-8 into something before passing the string to Windows. Unfortunately, this target encoding is always pure 7-bit ASCII because SDL's hand-rolled iconv() function only supports that, Latin-1, and the Unicode Transformation Formats.
This looks like a very bad choice on the surface. Sure, this implementation is meant to be a minimal fallback for systems that don't have iconv(3), but if it uses that library when available, why doesn't it also use WideCharToMultiByte() on Windows? One reason might be right there in the name of that Win32 function: Windows treats UTF-16 as the base encoding from where all other encodings are converted, but SDL (and everyone else) prefers UTF-8 in that role. This allows SDL to directly convert to UTF-32 or Latin-1 without stopping at UTF-16 first.
But even if SDL offered Win32-powered conversion from UTF-8 into any Win32 codepage, there's still the issue that WideCharToMultiByte(932) will most likely just not work on non-Japanese editions of Windows 9x. Since there is no algorithmic mapping between JIS and Unicode, the conversion between these two encodings requires a lookup table. Windows stores this table in C_932.NLS, and there is no guarantee that this file will be installed on anything before Vista.
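
The question mark observation from above is easy to reproduce. Here is a minimal sketch of a per-codepoint conversion from UTF-8 to 7-bit ASCII, assuming well-formed input. It is not SDL's actual code; `ascii_fallback()` and its helpers are hypothetical names, and the sketch only shows why a 9-byte UTF-8 title ends up as exactly three question marks.

```cpp
#include <cstddef>
#include <string>

// UTF-8 continuation bytes have the form 0b10xxxxxx.
constexpr bool is_utf8_continuation(unsigned char b) {
    return ((b & 0xC0) == 0x80);
}

// Counts the code points outside of 7-bit ASCII, assuming well-formed UTF-8.
// Each of them turns into a single '?' in the fallback below.
constexpr std::size_t count_non_ascii_code_points(const char* utf8) {
    std::size_t ret = 0;
    for(; *utf8 != '\0'; utf8++) {
        const auto b = static_cast<unsigned char>(*utf8);
        if((b >= 0x80) && !is_utf8_continuation(b)) {
            ret++;
        }
    }
    return ret;
}

// Hypothetical fallback conversion: ASCII passes through, every other code
// point is replaced with a single '?'.
std::string ascii_fallback(const std::string& utf8) {
    std::string ret;
    for(const char c : utf8) {
        const auto b = static_cast<unsigned char>(c);
        if(b < 0x80) {
            ret += c;
        } else if(!is_utf8_continuation(b)) {
            ret += '?';
        }
    }
    return ret;
}

// 秋霜玉 as UTF-8 bytes: 3 code points in 9 bytes.
constexpr char TITLE_UTF8[] = "\xE7\xA7\x8B\xE9\x9C\x9C\xE7\x8E\x89";
static_assert((sizeof(TITLE_UTF8) - 1) == 9, "9 bytes");
static_assert(count_non_ascii_code_points(TITLE_UTF8) == 3, "3 question marks");
```

A byte-wise reinterpretation in a single-byte codepage would have produced nine characters instead, which is how you can tell the two failure modes apart.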

On the other hand, the second screenshot above clearly shows that…

Text rendering

…just works for Japanese text? On my Western Windows 98?! Things quickly take a turn once we enter the Music Room though, where we get working Japanese text next to mojibake:

Screenshot of Shuusou Gyoku's Music Room in 8-bit mode on Windows 98, playing the title screen theme loaded from a BGM pack. The theme name (「秋霜玉 ~ Clockworks」) and the version banner (「秋霜玉    Version 1.005     ★デモ対応版#★」) internally use UTF-8 strings and show up correctly, but the comment uses Shift-JIS and shows up as mojibake.
It currently also looks like this on Japanese Windows 98 due to, well, me not having tested this case before.

This disparity is quickly explained: Any text that is either hardcoded or pulled from the Vorbis comment tag of a BGM pack file is in UTF-8 and can be trivially converted to UTF-16. Every piece of mojibake, on the other hand, comes from the original .DAT files, is therefore encoded in Shift-JIS, and fails the conversion to UTF-16 for the aforementioned reasons.
But seriously, how can UTF-16 text rendering suddenly just work on Windows 9x? Well, contrary to popular (or certainly my) belief, Windows 9x did have functional Unicode variants for a small group of 15 API functions, which just happens to include GDI's TextOutW(), ExtTextOutW(), and GetTextExtentPoint32W(). Yup – these empty text areas we were getting for Japanese games on Windows 9x back in the day? All of them were at least partly preventable. The missing C_932.NLS on non-Japanese systems would have still meant empty text boxes if developers preferred storing text in Shift-JIS rather than UTF-8, which they might have wanted to do if their favorite editors were similarly limited. But that's about the only valid argument for using Shift-JIS on Windows 9x:

So even if devs absolutely wanted to use Shift-JIS as the on-disk format, converting to UTF-16 at runtime and calling the Unicode versions of the GDI text rendering functions would have been better than using their *A() versions. Then, Windows 9x users could have fixed empty text boxes by properly installing codepage 932, XP users would have only needed to check that one Install files for East Asian languages box, and it all would have just worked without requiring the unbearable cringe of locale emulation. The *A() versions had no reason to exist other than programmer convenience.
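
For comparison, this is roughly what the UTF-8 side of such a runtime conversion looks like. Unlike Shift-JIS, it is pure bit arithmetic and needs no lookup table or .NLS file. The sketch below is a simplified illustration with hypothetical names (`decode_utf8()`, `encode_utf16()`) that assumes mostly well-formed input and does not reject overlong encodings or surrogate code points.

```cpp
#include <cstddef>
#include <cstdint>

struct Decoded {
    char32_t cp;
    std::size_t len; // number of bytes consumed, 0 on error
};

// Decodes a single code point from the start of a UTF-8 byte sequence.
constexpr Decoded decode_utf8(const unsigned char* s, std::size_t avail) {
    if(avail == 0) {
        return { 0, 0 };
    }
    const unsigned char b = s[0];
    std::size_t len = 0;
    char32_t cp = 0;
    if(b < 0x80)                { len = 1; cp = b; }
    else if((b & 0xE0) == 0xC0) { len = 2; cp = (b & 0x1F); }
    else if((b & 0xF0) == 0xE0) { len = 3; cp = (b & 0x0F); }
    else if((b & 0xF8) == 0xF0) { len = 4; cp = (b & 0x07); }
    else {
        return { 0, 0 };
    }
    if(len > avail) {
        return { 0, 0 };
    }
    for(std::size_t i = 1; i < len; i++) {
        if((s[i] & 0xC0) != 0x80) {
            return { 0, 0 };
        }
        cp = ((cp << 6) | (s[i] & 0x3F));
    }
    return { cp, len };
}

struct Utf16 {
    char16_t units[2];
    std::size_t count;
};

// Encodes a code point as one UTF-16 unit or a surrogate pair.
constexpr Utf16 encode_utf16(char32_t cp) {
    if(cp < 0x10000) {
        return { { static_cast<char16_t>(cp), 0 }, 1 };
    }
    const char32_t v = (cp - 0x10000);
    return { {
        static_cast<char16_t>(0xD800 + (v >> 10)),
        static_cast<char16_t>(0xDC00 + (v & 0x3FF)),
    }, 2 };
}

// 秋 (U+79CB) as UTF-8 bytes.
constexpr unsigned char AKI_UTF8[] = { 0xE7, 0xA7, 0x8B };
constexpr Decoded AKI = decode_utf8(AKI_UTF8, sizeof(AKI_UTF8));
constexpr Utf16 AKI_UTF16 = encode_utf16(AKI.cp);
static_assert((AKI.cp == 0x79CB) && (AKI.len == 3), "3 bytes, 1 code point");
static_assert((AKI_UTF16.count == 1) && (AKI_UTF16.units[0] == 0x79CB), "1 unit");
```

The resulting UTF-16 units are what you would hand to a function like TextOutW(); only the Shift-JIS step before that depends on the installed codepage tables.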

Alright, so there's some theoretical way to get all rendered text to show up correctly on Windows 9x, regardless of locale. But what about…

The original Japanese filenames

Without a proper CreateFileW(), this is where we hit all the problems we were expecting. How should these behave on systems with non-Japanese codepages? Are we OK with turning old replay file names like 秋霜りぷEx.DAT into ????Ex.DAT, and consequently ____Ex.DAT due to question marks not being allowed in file names? This looks like the best choice we have: It's also what unicows.dll does right now, and it has the advantage of being easy to manually type.
It is also better than returning to the game's original behavior of blindly reinterpreting the bytes in the system codepage, which would turn the string into H‘š‚è‚ÕEx.DAT on codepage 1252. If you run pbg's original build on Western Windows 98, you'll see that it actually won't save any file whose name starts with the kanji. Apparently, 9x kernels are much stricter than NT kernels when it comes to filenames in the system codepage and will outright refuse to create a file if it contains unassigned codepoints? The Shift-JIS lead byte of 秋 is 0x8F, which is unused in CP1252.
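
If that theory is right, you could predict the affected file names without running the game. The sketch below scans a Shift-JIS byte string for the five byte values that the Windows-1252 table leaves unassigned. The byte values for the file name are derived from the mojibake quoted above, `first_cp1252_unassigned()` is a hypothetical helper, and whether 9x really rejects file names on exactly this basis remains the guess it is in the text.

```cpp
#include <cstddef>

// The five byte values without an assignment in the Windows-1252 table.
constexpr bool cp1252_unassigned(unsigned char b) {
    return ((b == 0x81) || (b == 0x8D) || (b == 0x8F) || (b == 0x90) || (b == 0x9D));
}

// Returns the index of the first byte that CP1252 leaves unassigned, or
// `len` if every byte maps to a character.
constexpr std::size_t first_cp1252_unassigned(const unsigned char* bytes, std::size_t len) {
    for(std::size_t i = 0; i < len; i++) {
        if(cp1252_unassigned(bytes[i])) {
            return i;
        }
    }
    return len;
}

// 秋霜りぷEx.DAT in Shift-JIS: 秋 = 8F 48, 霜 = 91 9A, り = 82 E8, ぷ = 82 D5.
constexpr unsigned char REPLAY_SJIS[] = {
    0x8F, 0x48, 0x91, 0x9A, 0x82, 0xE8, 0x82, 0xD5, 'E', 'x', '.', 'D', 'A', 'T',
};
static_assert(first_cp1252_unassigned(REPLAY_SJIS, sizeof(REPLAY_SJIS)) == 0, "秋 lead byte");

// Without the first kanji, every remaining byte has a CP1252 character.
static_assert(first_cp1252_unassigned(REPLAY_SJIS + 2, sizeof(REPLAY_SJIS) - 2) == (sizeof(REPLAY_SJIS) - 2), "all assigned");
```

This also matches the visible mojibake: the 0x8F byte has no glyph, and the H that follows is simply the trail byte 0x48 of 秋.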

Then again, if we had better replay-related error reporting, the specific file names probably wouldn't matter because we'd just display them on screen. Given that 📝 our forward-compatible configuration format only uses ASCII characters on purpose and the new replay format will do the same, this would only ever matter for the initial upgrade. There will be the possibility of converting future replays back into the original format for validation purposes, but that feature would ideally use exactly the names that the original game uses on the current system: Japanese names in Japanese locale, nothing on Western Windows 98, and mojibake everywhere else. Maybe we could even add a menu option to let players pick among all possible broken file names? :onricdennat:
But shouldn't we at least retain support for loading from original names when running the Windows 98 build on an NT kernel? Sure, the goal is a backport to Windows 98, but functional Unicode makes Windows XP a much more reasonable retro target. It sure makes a lot more sense than supporting XP with the regular, non-suffixed modern build: XP is almost identical to 98 in terms of SDL backends, being just as limited to Direct3D 9, WinMM, and DirectSound in the graphics and sound department. The only additional backend for XP would be Raw Input, and we could just enable that one conditionally.

So how about just…

Leaving it all to unicows.dll?

Yeah, why don't we just directly link both SDL and the game against the Microsoft Layer for Unicode, without relying on KernelEx injecting it for us? Then, SDL could just continue using Unicode APIs without us having to rewrite anything. And since MSLU disables itself on NT kernels, you'd still get the 秋霜玉 window title and support for the original Japanese filenames regardless of the non-Unicode codepage. Heck, unicows.lib is still shipped with current Visual Studio. I'd only have to add a single linker flag and be done with it!
But when I tried this, the game broke in every environment:

Maybe this could be considered a fixable bug that traces back to SDL 1, but the whole situation is just very silly. If this had worked, we would be running the Windows 9x build through up to three separate layers of dynamically patched code – KernelEx, MSLU, and D3DWindower – when we'd like to have at most one and ideally zero. Besides, we know which code we want to run, we know that we don't need to reach for subclassing to make it all work, and we know that SDL is the ideal place for it. Now we just have to write it all.

Until then, this is where we are right now:

:sh01: Shuusou Gyoku P0310 Windows build

Next up: The long-awaited return to TH03! With per month going explicitly toward that game now, we'll definitely stay there for a while. Ember2528 is generously funding short-term and long-term netplay options, so let's finish OP.EXE in preparation for nice and user-friendly menus. This is the last main menu to be decompiled across all of PC-98 Touhou and it's mostly text-based, so how hard can it be?