⮜ Blog

⮜ List of tags

Showing all posts tagged tcc-, rec98- and build-process-

📝 Posted:
🚚 Summary of:
P0137
Commits:
07bfcf2...8d953dc
💰 Funded by:
[Anonymous]
🏷 Tags:
rec98- th02+ th03+ th04+ th05+ build-process- meta+ contribution-ideas+ mod+ tasm+ tcc-

Whoops, the build was broken again? Since P0127 from mid-November 2020, on TASM32 version 5.3, which also happens to be the one in the DevKit… That version changed the alignment for the default segments of certain memory models when requesting .386 support. And since redefining segment alignment apparently is highly illegal and absolutely has to be a build error, some of the stand-alone .ASM translation units didn't assemble anymore on this version. I've only spotted this on my own because I casually compiled ReC98 somewhere else – on my development system, I happened to have TASM32 version 5.0 in the PATH during all this time.
At least this was a good occasion to get rid of some weird segment alignment workarounds from 2015, and replace them with the superior convention of using the USE16 modifier for the .MODEL directive.

ReC98 would highly benefit from a build server – both in order to immediately spot issues like this one, and as a service for modders. Even more so than the usual open-source project of its size, I would say. But that might be exactly because it doesn't seem like something you can trivially outsource to one of the big CI providers for open-source projects, and quickly set it up with a few lines of YAML.
That might still work in the beginning, and we might get by with a regular 64-bit Windows 10 and DOSBox running the exact build tools from the DevKit. Ideally, though, such a server should really run the optimal configuration of a 32-bit Windows 10, allowing both the 32-bit and the 16-bit build step to run natively, which already is something that no popular CI service out there offers. Then, we'd optimally expand to Linux, every other Windows version down to 95, emulated PC-98 systems, other TASM versions… yeah, it'd be a lot. An experimental project all on its own, with additional hosting costs and probably diminishing returns, the more it expands…
I've added it as a category to the order form, let's see how much interest there is once the store reopens (which will be at the beginning of May, at the latest). That aside, it would 📝 also be a great project for outside contributors!


So, technical debt, part 8… and right away, we're faced with TH03's low-level input function, which 📝 once 📝 again 📝 insists on being word-aligned in a way we can't fake without duplicating translation units. Being undecompilable isn't exactly the best property for a function that has been interesting to modders in the past: In 2018, spaztron64 created an ASM-level mod that hardcoded more ergonomic key bindings for human-vs-human multiplayer mode: 2021-04-04-TH03-WASD-2player.zip However, this remapping attempt remained quite limited, since we hadn't (and still haven't) reached full position independence for TH03 yet. There's quite some potential for size optimizations in this function, which would allow more BIOS key groups to already be used right now, but it's not all that obvious to modders who aren't intimately familiar with x86 ASM. Therefore, I really wouldn't want to keep such a long and important function in ASM if we don't absolutely have to…

… and apparently, that's all the motivation I needed? So I took the risk, and spent the first half of this push on reverse-engineering TCC.EXE, to hopefully find a way to get word-aligned code segments out of Turbo C++ after all.

And there is! The -WX option, used for creating DPMI applications, messes up all sorts of code generation aspects in weird ways, but does in fact mark the code segment as word-aligned. We can consider ourselves quite lucky that we get to use Turbo C++ 4.0, because this feature isn't available in any previous version of Borland's C++ compilers.
That allowed us to restore all the decompilations I previously threw away… well, two of the three, that lookup table generator was too much of a mess in C. :tannedcirno: But what an abuse this is. The subtly different code generation has basically required one creative workaround per usage of -WX. For example, enabling that option causes the regular PUSH BP and POP BP prolog and epilog instructions to be wrapped with INC BP and DEC BP, for some reason:

a_function_compiled_with_wx proc
	inc 	bp    	; ???
	push	bp
	mov 	bp, sp
	    	      	; [… function code …]
	pop 	bp
	dec 	bp    	; ???
	ret
a_function_compiled_with_wx endp

Luckily again, all the functions that currently require -WX don't set up a stack frame and don't take any parameters.
While this hasn't directly been an issue so far, it's been pretty close: snd_se_reset(void) is one of the functions that require word alignment. Previously, it shared a translation unit with the immediately following snd_se_play(int new_se), which does take a parameter, and therefore would have had its prolog and epilog code messed up by -WX. Since the latter function has a consistent (and thus, fakeable) alignment, I simply split that code segment into two, with a new -WX translation unit for just snd_se_reset(void). Problem solved – after all, two C++ translation units are still better than one ASM translation unit. :onricdennat: Especially with all the previous #include improvements.

The rest was more of the usual, getting us 74% done with repaying the technical debt in the SHARED segment. A lot of the remaining 26% is TH04 needing to catch up with TH03 and TH05, which takes comparatively little time. With some good luck, we might get this done within the next push… that is, if we aren't confronted with all too many more disgusting decompilations, like the two functions that ended this push. If we are, we might be needing 10 pushes to complete this after all, but that piece of research was definitely worth the delay. Next up: One more of these.

📝 Posted:
🚚 Summary of:
P0001
Commits:
e447a2d...150d2c6
💰 Funded by:
GhostPhanom
🏷 Tags:
rec98- build-process- tcc- tasm+

(tl;dr: ReC98 has switched to Tup for the 32-bit build. You probably want to get 💾 this build of Tup, and put it somewhere in your PATH. It's optional, and always will be, but highly recommended.)


P0001! Reserved for the delivery of the very first financial contribution I've ever received for ReC98, back in January 2018. GhostPhanom requested the exact opposite of immediate results, which motivated me to go on quite a passionate quest for the perfect ReC98 build system. A quest that went way beyond the crowdfunding…

Makefiles are a decent idea in theory: Specify the targets to generate, the source files these targets depend on and are generated from, and the rules to do the generating, with some helpful shorthand syntax. Then, you have a build dependency graph, and your make tool of choice can provide minimal rebuilds of only the targets whose sources changed since the last make call. But, uh… wait, this is C/C++ we're talking about, and doesn't pretty much every source file come with a second set of dependent source files, namely, every single #include in the source file itself? Do we really have to duplicate all these inside the Makefile, and keep it in sync with the source file? 🙄

This fact alone means that Makefiles are inherently unsuited for any language with an #include feature… that is, pretty much every language out there. Not to mention other aspects like changes to the compilation command lines, or the build rules themselves, all of which require metadata of the previous build to be persistently stored in some way. I have no idea why such a trash technology is even touted as a viable build tool for code.

But wait! Most make implementations, including Borland's, do support the notion of auto-dependency information, emitted by the compiler in a specific format, to provide make with the additional list of #includes. Sure, this should be a basic feature of any self-respecting build tool, and not something you have to add as an extension, but let's just set our idealism aside for a moment. Well, too bad that Borland's implementation only works if you spell out both the source➜object and the object➜binary rules, which loses the performance gained from compiling multiple translation units in a single BCC or TCC process. And even then, it tends to break in that DOS VM you're probably using. Not to mention, again, all the other aspects that still remain unsolved.

So, I decided to just write my own build system, tailor-made for the needs of ReC98's 16-bit build process, and combining a number of experimental ideas. Which is still not quite bug-free and ready for public use, given that the entire past year has kept me busy with actual tangible RE and PI progress. What did finally become ready, however, is the improvement for the 32-bit build part, and that's what we've got here.


💭 Now, if only there was a build system that would perfectly track dependencies of any compiler it calls, by injecting code and hooking file opening syscalls. It'd be completely unrealistic for it to also run on DOS (and we probably don't want to traverse a graph database in a cycle-limited DOSBox), but it would be perfect for our 32-bit build part, as long as that one still exists.

Turns out Tup is exactly that system. In practice, its low-level nature as a make replacement does limit its general usefulness, which is why you probably haven't seen it used in a lot of projects. But for something like ReC98 with its reliance on outdated compilers that aren't supported by any decent high-level tool, it's exactly the right tool for the job. Also, it's completely beyond me how Ninja, the most popular make replacement these days, was inspired by Tup, yet went a step back to parsing the specific dependency information produced by gcc, Clang, and Visual Studio, and only those

Sure, it might seem really minor to worry about not unconditionally rebuilding all 32-bit .asm files, which just takes a couple of seconds anyway. But minimal rebuilds in the 32-bit part also provide the foundation for minimal rebuilds in the 16-bit part – and those TLINK invocations do take quite some time after all.

Using Tup for ReC98 was an idea that dated back to January 2017. Back then, I already opened the pull request with a fix to allow Tup to work together with 32-bit TASM. As much as I love Tup though, the fact that it only worked on 64-bit Windows ≥Vista would have meant that we had to exchange perfect dependency tracking for the ability to build on 32-bit and older Windows versions at all. For a project that relies on DOS compilers, this would have been exactly the wrong trade-off to make.

What's worse though: TLINK fails to run on modern 32-bit Windows with Loader error (0000) : Unrecognized Error. Therefore, the set of systems that Tup runs on, and the set of systems that can actually compile ReC98's 16-bit build part natively, would have been exactly disjoint, with no OS getting to use both at the same time.
So I've kept using Tup for only my own development, but indefinitely shelved the idea of making it the official build system, due to those drawbacks. Recently though, it all came together:

  • The tup generate sub-command can generate a .bat file that does a full dumb rebuild of everything, which can serve as a fallback option for systems that can't run Tup. All we have to do is to commit that .bat file to the ReC98 Git repository as well, and tell build32b.bat to fall back on that if Tup can't be run. That alone would have given us the benefits of Tup without being worse than the current dumb build process.
  • In the meantime, other contributors improved Tup's own build process to the point where 32-bit builds were simple enough to accomplish from the comfort of a WSL terminal.
  • Two commits of mine later, and 32-bit Windows Tup was fully functional. Another one later, and 32-bit Windows Tup even gained one potential advantage over its 64-bit counterpart. Since it only has to support DLL injection into 32-bit programs, it doesn't need a separate 32-bit binary for retrieving function pointers to the 32-bit version of Windows' DLL loading syscalls. Weirdly enough, Windows Defender on current Windows 10 falsely flags that binary as malware, despite it doing nothing but printing those pointer values to stdout. 🤷
  • And that TLINK bug? Easily solved by a Google search, and by editing %WINDIR%\System32\autoexec.nt and rebooting afterwards:
     REM Install DPMI support
    -LH %SystemRoot%\system32\dosx
    +%SystemRoot%\system32\dosx

As I'm writing this post, the pull request has unfortunately not yet been merged. So, here's my own custom build instead:

💾 Download Tup for 32-bit Windows (optimized build at this commit)

I've also added it to the DevKit, for any newcomers to ReC98.


After the switch to Tup and the fallback option, I extensively tested building ReC98 on all operating systems I had lying around. And holy cow, so much in that build was broken beyond belief. In the end, the solution involved just fully rebuilding the entire 16-bit part by default. :tannedcirno: Which, of course, nullifies any of the advantages we might have gotten from a Makefile in the first place, due to just how unreliable they are. If you had problems building ReC98 in the past, try again now!

And sure, it would certainly be possible to also get Tup working on Windows ≤XP, or 9x even. But I leave that to all those tinkerers out there who are actually motivated to keep those OSes alive. My work here is done – we now have a build process that is optimal on 32-bit Windows ≧Vista, and still functional and reliable on 64-bit Windows, Linux, and everything down to Windows 98 SE, and therefore also real PC-98 hardware. Pretty good, I'd say.

(If it weren't for that weird crash of the 16-bit TASM.EXE in that Windows 95 command prompt I've tried it in, it would also work on that OS. Probably just a misconfiguration on my part?)

Now, it might look like a waste of time to improve a 32-bit build part that won't even exist anymore once this project is done. However, a fully 16-bit DOS build will only make sense after

  • master.lib has been turned into a proper library, linked in by TLINK rather than #included in the big .ASM files.
  • This affects all games. If master.lib's data was consistently placed at the beginning or end of each data segment, this would be no big deal, but it's placed somewhere else in every binary.
  • So, this will only make sense sometime around 90% overall PI, and maybe ~50% RE in each game. Which is something else than 50% overall – especially since it includes TH02, the objectively worst Touhou game, which hasn't received any dedicated funding ever.
  • Then, it will probably still require a couple of dedicated pushes to move all the remaining data to C land.
  • Oh, and my 16-bit build system project also needs to be done before, because, again, Makefiles are trash and we shouldn't rely on them even more.
And who knows whether this project will get funded for that long. So yeah, the 32-bit build part will stay with us for quite some more time, and for all upcoming PI milestones. And with the current build process, it's pretty much the most minor among all the minor issues I can think of. Let's all enjoy the performance of a 32-bit build while we can 🙂

Next up: Paying some technical debt while keeping the RE% and PI% in place.