dconf 2022 online schedule, some arsd.game work, template emission discussion with d index file proposal

Posted 2022-11-14

The big thing this week is explaining library version mismatches, a radical idea for a linker-driven compiler, and a more-likely-to-happen proposal to fix problems inherent in D's current build paradigm.

Core D Development Statistics

In the community

Community announcements

See more at the announce forum.

DConf Online 2022

The schedule was announced this week, and I'm on it! Saturday, December 17, 20:15 UTC (3:15 PM my time here in NYS), and I'll do another live thing.

See the rest of the schedule here http://dconf.org/2022/online/index.html.

What Adam is working on

I've started documenting and overhauling the API for my arsd.game module. It'll be a while until I'm done with this since I have a lot of other things to do, but I might be able to write more about it next week. Much of the change is refactoring what I've done with it over the last couple of years.

I'm also doing some fixes to simpledisplay's (default!) automatic resize handlers, and I want to bring another colorspace into color.d in the near future, probably by pulling in some of Manu Evans' code.

I just have four other things with work deadlines I need to do... they even ate up half my weekend this time, so those need to be done first.

Redesign for template emission woes

A common issue that has come up with D over many years now - indeed, cgi.d hit similar issues going back to prehistoric times (my initial git commit, from July 2011, includes the "Referencing this gigantic typeid seems to remind the compiler to actually put the symbol in the object file" block, and remember the file existed for years before I used git) - is that when building a module, the compiler will not put code in the object file that the linker later expects to be there when another module imports it.

The general problem

We usually call these "template emission" bugs, since templates are the most common place to see the problem, but they aren't the only place - you can trigger it yourself with version blocks among many other things (most compiler flags can create similar trouble), and those aren't bugs per se since the compiler is doing what you told it to do. It's just that what you told it to do doesn't work, and it has no way of realizing that. In fact, a linker error is among the least harmful possibilities that can arise - at least it prevented a runtime memory corruption problem! As Steven Schveighoffer said in the chat as I was writing this blog entry: "But compiling different libs with different version definitions is a recipe for disaster."

So, what is the cause and what do I propose to finally fix it? Generally speaking, the root cause is that what's in the compiled object file (or library, all the same thing fundamentally) doesn't match what the compiler thinks is there when you import it. When you build a module with different flags than you use when importing it, the two views can contain different things:

Library module:

module lib;

version(with_func)
void func() {}

Application module:

module app;

import lib;

void main() { func(); }

You can probably already imagine the problem. Compile like this and get a compile error:

$ dmd -c -version=with_func lib.d
$ dmd app.d lib.o
app.d(5): Error: undefined identifier func

In this case, the function was compiled into the object file, but the compiler didn't know it was available when building the application.

Or compile like this and get a linker error:

$ dmd -c lib.d
$ dmd -version=with_func app.d lib.o
/usr/bin/ld: app.o: in function `_Dmain':
app.d:(.text._Dmain[_Dmain]+0x5): undefined reference to `_D3lib4funcFZv'
collect2: error: ld returned 1 exit status
Error: linker exited with status 1

In this case, the compiler thinks it is there, but it isn't actually there, again because the library was built with a different set of flags and the compiler has no way to know this.

Various compiler flags can create similar situations, including, but not limited to, -version, -debug, -unittest, -dip1000, -checkaction, -check=contracts=x, -release, and more.

Moreover, since version specifiers in particular are global, applying to your modules and imported library modules alike, there is no way to even use the library without potentially passing conflicting flags! This most often comes up with -unittest, which sets a global version specifier that may cause the compiler to assume certain test helpers are compiled into the library when they aren't.
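For instance, here is a minimal sketch of how -unittest alone can set up the second failure mode shown above (testHelper is a made-up name for illustration):

// lib.d, built WITHOUT -unittest, so this block is skipped:
module lib;
version(unittest)
    void testHelper() {} // never makes it into lib.o

// app.d, built WITH -unittest: the compiler sees testHelper in the
// import and assumes it was compiled in, but the linker comes up empty
module app;
import lib;
void main() {}
unittest { testHelper(); } // links only if lib.o actually has testHelper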

Template emission bugs in particular

Template emission bugs are a special case of this same problem of the compiler not knowing what's precompiled vs what's imported. The key difference is that it happens because of the compiler's own actions rather than a user-defined flag, and thus there's the belief that the compiler can manage it itself. This, unfortunately, is false with the current build paradigm. You can make reasonably close guesses that work most of the time, which makes it feel possible, hence reinforcing the belief that it is just another bug to squash, but the truth is the build paradigm really needs to change.

I'd love to demonstrate this in an easily reproducible way, but the compiler's heuristics really do make good guesses, and it is as hard to trigger intentionally as it is to make it go away when it comes up unintentionally.

But take my word for it (or don't, and destroy my arguments with your own investigation!) that what happens is the compiler sees a template needs to exist, but thinks it already does in the compiled object file, and skips outputting it again. When it is correct, this saves some compile time. When it is wrong, you get frustrating linker errors (or worse, in the rare event that the instance's ABI is different than expected, but that's more often caused by compiler flags than template instances).

One way to solve the template emission problem - or at least, this special case of it, while leaving behind the general problems explained in the previous section of this article - is to simply always emit instances at the point of use. This is theoretically sound because a template doesn't really exist in a library build anyway; templates only exist in source code, and instances are built from that source when it is used, where it is used. This is what the compiler switch -allinst instructs dmd to do - output all instances. (That switch is sometimes buggy too, skipping some anyway, but that is just a bug when it happens.)
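If you suspect you've hit one of these cases yourself, the switch is cheap to try (the file names here are just placeholders):

$ dmd -allinst myapp.d mylib.a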

The problem with it is that it can get very slow. Note this doesn't mean it is always very slow - it might work for you when you get one of these linker errors! But it certainly can be slow, and it generates a lot of useless junk in temporary files on disk for the linker to sort through, adding to the slowness. So compilers try various solutions to the speed problem: dmd has its emission heuristics (and some linker section thing too), g++ uses a peculiar kind of linker section and sometimes separate object files, and more. To be honest, I don't know much about all the solutions everyone has; I just know none of them work 100% without extra support from the language and/or build system.

Some solutions

If we know the fundamental problem in all of these is a mismatch between the imported source file the compiler sees and the object file the linker sees, the path to a solution is to close this mismatch. I have a few ideas that might work, some of which I know have seen success in C++.

Radical - linker-driven builds

The most radical idea is to invert the current arrangement: instead of the compiler driving the linker, the linker calls the compiler. In this paradigm, the linker starts by asking the compiler for the exported functions (for an application, just the entry point, aka main, but for a shared library it might be more). The compiler finds each one, compiles it, and returns it to the linker. The linker then sees whatever that code references in its call tree and, for anything not already in the linker's index, asks the compiler to produce that too, and it does (or it issues an error when it can't, or it might even return a "not modified" reply, simply validating that the linker's cache is still usable), and so on in this loop until the linker has everything it needs to actually generate the finished file. The linker would need to know which compiler to invoke for any given symbol, which it might be able to guess from the mangled name, or otherwise it might go down a list of compilers to ask in the case of mixed-language builds.
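Here's a rough D-flavored sketch of the loop I have in mind. Nothing here is a real toolchain API; every type and method is invented purely for illustration:

// hypothetical sketch only; none of these types exist in any real toolchain
interface Linker {
    string[] requiredExports(); // e.g. just _Dmain for an application
    bool alreadyHas(string symbol); // already compiled or cached?
    void add(CompiledSymbol obj);
    void writeOutputFile();
}

interface Compiler {
    CompiledSymbol tryCompile(string symbol); // null if this symbol isn't ours
}

class CompiledSymbol {
    string[] undefinedReferences; // what the newly compiled code calls into
}

void linkerDrivenBuild(Linker linker, Compiler[] compilers) {
    // start from what the output must export, then pull in everything
    // reachable from there, one symbol at a time
    string[] worklist = linker.requiredExports();
    while (worklist.length) {
        string symbol = worklist[$ - 1];
        worklist = worklist[0 .. $ - 1];
        if (linker.alreadyHas(symbol))
            continue; // cache still valid, nothing to recompile
        foreach (compiler; compilers) { // try each for mixed-language builds
            if (auto obj = compiler.tryCompile(symbol)) {
                linker.add(obj);
                worklist ~= obj.undefinedReferences; // queue its dependencies
                break;
            }
        }
    }
    linker.writeOutputFile();
}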

In this paradigm, the compiler would never generate useless or duplicate code, since it is driven by actual needs from the beginning, and you'd never have confusion when the compiler thought it output something but skipped what the linker needed, since the linker knows what it needs and is communicating that directly back to the compiler! You might even compile individual functions in parallel and maintain compiler caches of types and such too.

I've never actually seen anyone do this in the real world. I think, anyway... and besides, just because I haven't personally seen it doesn't mean it doesn't exist. I wouldn't be surprised if there's some awesome product out there I've never heard of since I live under a D-shaped rock, or some research project that "dropped out of college" me doesn't follow, but idk. JIT systems kinda work this way, but that's too different from D's model to be directly comparable.

Anyway, I mention this not because I expect to see it, but just because I think it'd be cool if we did.

Closer to the status quo - D Index Files

OK, this is something I think we could realistically do with our toolchains as they exist today: reimagine the .di file from something akin to a "header" file into something more like an index file - something tightly coupled to the generated object file and the source code file, that tells the compiler exactly what is in both.

Here's how it works: any time you produce an object file (or a shared library or a static library - again, remember, these are all fundamentally different expressions of the same concept), you also produce a .di file. These are always to be distributed together. Indeed, they arguably should be in the same physical file, but since one of the design constraints here is compatibility with existing toolchain pieces, I don't want to push my luck. Maybe there is a section we could use in standard object file formats that won't be mangled when passing through debug tools, linkers, and programs like strip, etc., and won't add significant time for the compiler or linker in finding the info they need while ignoring what they don't, but I don't know.

I do know we can produce and distribute the two files together, so that's how I'm going to phrase this. But know the same things could happen if they were physically bundled too.

Anyway, this di file is stripped down in many ways compared to what we have now. Since it is tightly coupled to this particular build, it has no need for version blocks in its code and should NOT have them, in order to make the first example we saw in this article impossible. If with_func was built into the library, the di file should always expose it, regardless of what version flags were passed to the application build. And if it wasn't, it shouldn't.
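To make that concrete, here is what a generated index file for the earlier lib.d might look like when the library was built with -version=with_func (the exact output format is speculative; note the extern marking, explained next):

// lib.di, generated alongside lib.o
module lib;
extern void func(); // was compiled in; no version block survives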

The function prototypes it outputs ought to indicate that they are in the object file. The extern keyword can do this. The function bodies need not be present. As a D programmer, you might be thinking "but how will it be interpreted at compile time if there's no available source code?" I know, I'll come back to that, keep reading.

These D index files would not be intended for people to ever look at, so the compiler is free to use them as a CTFE result cache or similar as well, replacing CTFE calls with literal values. Prototypes for mixed-in functions should also be present here.

// source code
int foo() { return 55; }
enum bar = foo();
int baz = foo();
mixin("int m() { return bar; }");

compile it...

// index file
extern int foo();
enum bar = 55; // ctfe result cached in file
extern int baz; // initializer's ctfe result is in the object file so mark it extern
extern int m(); // mixed in but we know it is in the object file so just output the reference to it

Optionally, there might additionally be no top-level, non-public imports in the generated file. The compiler can spit out imported!"fully.qualified".names everywhere, since again, it isn't meant to be pretty. (Error messages, of course, should be pretty, but the file itself need not be. Perhaps the compiler can see that it is a .di file and apply different pretty-print rules when referencing it for errors.) Why would you do this? So importing the library avoids importing its internal dependencies too. But this is not required for the concept to work, I'm just sidetracking.
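For instance, if the library's source had used a std.datetime type in a signature, the generated file might reference it without importing it (lastModified here is a made-up function for illustration):

// no top-level import of std.datetime.systime needed:
extern imported!"std.datetime.systime".SysTime lastModified();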

Anyway, this is how it works for functions. What about templates? Well, remember that a template instance is just another function!

// source code
T foo(T)(T i) { return i; }

void useit() { int a = foo(5); }

Compiles into:

extern T foo(T:int)(T i) @safe pure nothrow; // inferred attributes need to be specified
extern void useit();

// or maybe, but these should be the same thing
template foo(T:int) { extern T foo(T i); }

Or something similar, since the :int specialization syntax doesn't exactly mean this, but I think it is good enough. In any case, you put out info in this index file so the compiler knows, with factual surety, that the template instance with this particular set of arguments is actually present in the object file. C++ does this kind of thing pretty successfully.

Please note that all the attributes that were inferred on the function are listed explicitly here too. This means attribute-dependent code will still work despite the lack of source code to analyze. They will always be the same for any particular instance, so this works fine, and it is needed to build the correct mangled name back up anyway.
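You can verify for yourself that the inferred attributes are part of the mangled name with a quick compile-time check (exact output varies by compiler version):

// prints the instance's mangled name at compile time; the codes for
// inferred attributes like pure/nothrow/@safe are embedded in it
T foo(T)(T i) { return i; }
pragma(msg, foo!int.mangleof);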

Of course, now you're really itching to ask: "but how will it instantiate the template on new types when there's no available source code!" Same with CTFE, sometimes the D compiler needs the source, even if it has already been compiled into the object file. So, what happens?

Well, there are two possibilities: 1) the source code is always* output in the .di file, even if things are marked extern - the compiler should be smart enough to skip processing source code until it is needed, and it should not error when there is both an extern and non-extern definition; or 2) the compiler knows where to find the source code if it needs it.

* Well, maybe not "always": if distributing a closed-source library, you might configure it to skip functions based on a list or the presence/absence of UDAs or something. But you'd default to having them so CTFE isn't broken by default. Of course, the source code could be obfuscated; the di file is not meant to be pretty regardless.
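So under possibility one, the index file for the earlier foo example might carry both forms side by side (speculative output again):

// hypothetical index file under possibility one
extern int foo(); // the compiled copy the linker will use
int foo() { return 55; } // full source kept alongside so CTFE still works;
                         // today this duplicate would be an error, so this
                         // needs the language change mentioned above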

With possibility two, the D index file is linked both to the object file and to a source file, so the compiler, instead of erroring out when it can't find something in the index file, loads the original source file for the module and pulls what it needs out of there instead. Please note that it cannot import the original module, because they're the same module! Just different aspects of it. So the implementation would need to understand this distinction to allow it to continue. The benefit of this model is that parsing the di file might be faster since it has less irrelevant info in it.

Type definitions would also likely always be output in the D index file anyway, so I think outputting all the source into the generated .di file is the way to do it. If the compiler implementation wants speed, it can perhaps just learn how to do semantic analysis more lazily.

But the most important things for the "template emission bugs" are:

  • Versions are stripped out of the index file, so you always see the same interface the library was actually built with, regardless of other global flags set.
  • Template instances that exist in the object file are explicitly listed in the index file.
  • You still need access to the source code to create new instances (which will be listed as an extern thing in the D index file associated with that object file - and the compiler needs to merge overload sets when it generates these too) and to run CTFE things (which is broken in today's .di files anyway).

Everything else is just performance improvements - which are why template emission has the potentially buggy heuristics in the first place... but this is a different kind of performance improvement. The compiler isn't guessing anymore; it is just deferring work it already has all the tools to do deterministically. And caching CTFE and mixin output can produce faster builds even if the files are bigger anyway.

We ought to investigate this and do whatever language changes are needed (the extern template declarations don't work today, but I think it'd be a fairly small change to make them work). I think it'd be a big win relative to the effort, putting away the minefield of version mismatches, stopping the template emission woes, and quite likely making builds faster, all at the same time.

Edits

Got a few comments after publication I want to preserve for later:

Steve: "For the "version" thing, it should be in there, but just set at the top. Otherwise, you can't deal with things such as static if(isVersionSet!"someversion")"

Me: "Yeah it could list all the ones set just it also would need to be insulated from command line switches so -version=wasnt_set_in_original_build wouldn't be added when imported later"