version considered harmful

Posted 2023-10-16

I wrote a chat post about some trouble with version. Cleaned up and reproduced here.

version's problems

version is one of D's features that works ok in some small cases, but falls apart as time goes on.

Among the reasons why:

  • It is global when specified from the command line.

    If one library does version(v2) and another library does version(v2), you cannot specify a -version=v2 that applies to one lib without the other, unless they never have any import overlap whatsoever.

    If there's no import overlap whatsoever, you can compile them separately and link them to work around the global switch. But as soon as it becomes impossible to fully isolate them - which, remember, happens when some third-party user just uses both libraries together - it can't be done anymore. You'd have to work outside the language to accomplish this, such as rewriting the D file to remove the version specifier, and the language gives you zero assistance with that: the .di generator preserves version decls. (It could perhaps be reimagined to help with this, but di is broken in a hundred ways; it rarely even works, and when it does, it still is of no help.)

    This is why experienced D programmers tend to put some kind of prefix on their version specifiers, e.g., version(BindSDL_Static) instead of just version(Static), or version(cgi_no_fork) instead of version(no_fork). Of course, these are still susceptible to the same problem; a clash is just less likely. (BTW the module namespace is also superglobal in D, so once again, experienced D programmers know to always, always use a top-level package name in all modules.)

  • On the other hand, code-defined versions are strictly module-local, meaning they cannot be used to provide clarifying information to the rest of the library. (The exception is when the same identifier is pre-defined on the command line, in which case the global one overrides it. Good luck handling that when it is a third-party user's outside dependency that requires it and it happens to clash with one of your "local" version specifiers; code cannot tell the difference between the two.)

    Thus, for example, want to provide a version(kqueue)? Good luck; it would need to be set if version(FreeBSD) or version(OpenBSD) or version(OSX) etc., etc., etc., so you don't want to redo that in each module. You might make it a build flag, but neither dmd nor dub has much of a way to express a conditional version on platforms like this either; if the compiler didn't already special-case it for you, you're stuck doing it in code.

    If you want it to be sharable, you end up defining enums for use with static if instead, since those can be namespaced and imported... which of course causes the curious to question: if version has so many pitfalls avoided by static if, why does version exist?

  • version assignments follow strange rules not seen elsewhere in the language, going through an earlier processing stage and requiring the compiler to examine it in lexical order, yet independently of other things. This makes it difficult to combine with other D features:

    enum a = true;

    static if(a)
    	version=foo;

    version(foo) {}

    The compiler rejects this:

    test.d(4): Error: version foo defined after use

  • version specifiers are ad-hoc and can appear almost anywhere... infamously, not inside an enum declaration, but almost anywhere else.

    A version block might only change an internal implementation detail and thus be fine for separate compilation, or it might change the binary interface, creating an undetectable runtime flaw in the program.

    There is no reliable way to tell what versions apply to what module. You can try grepping for them, but this is easier said than done: version dependencies can also be obfuscated by mixins, and a version decl inside a template depends on the global versions specified at the instantiation point, yet the local versions at the declaration point. On top of that, the instantiation that ends up in the binary may not be the same one specified in the currently compiled source code, and this is essentially random - dmd's template emission code is so notoriously buggy that there are codebases with functions that never get called that writeln some template just in an attempt to avoid linker errors that are there Tuesday, gone Wednesday, back Friday. So the worst case scenario is you get an ABI difference that comes and goes with seemingly innocuous build changes.

    Don't believe me? Try it yourself:

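    // insta.d
    module insta;

    // module-local version, set unless -version=no_foo is given globally
    version(no_foo)
    	{}
    else
    	version=foo;

    // a template: the versions it sees depend on which build instantiates it
    void lol()() {
    	import std.stdio;
    	version(foo)
    		writeln("foo was there");
    	else
    		writeln("nope on foo");

    	version(bar)
    		writeln("bar was there");
    	else
    		writeln("nope on bar");
    }

    void uhoh() {
    	lol();
    }

    // instac.d
    module instac;
    void lol() {
    	import insta;
    	insta.lol();
    }

    // instab.d
    module instab;
    import instac; // needed so the unqualified lol() below resolves to instac.lol
    void main() {
    	lol();

    	import insta;
    	uhoh();
    }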

    Compile with any combination of things and see different results.

    $ dmd -c insta.d
    $ dmd instac.d -c -version=no_foo
    $ dmd instab.d instac.o insta.o
    $ ./instab
    nope on foo
    nope on bar
    nope on foo
    nope on bar

    Now make the main function call insta.lol(); directly and it changes the result of the pre-compiled library functions too, including the module precompiled with -version=no_foo! This is because the template instance symbol from the object file was overridden by the instance created from the separate build, this time with a new local version influenced by the different set of global versions.

    This example is obviously contrived, but we had a unittest failure a few weeks ago at work that came up from this same cause.
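
For reference, the enum plus static if workaround mentioned in the second bullet can be sketched roughly like this. This is only an illustration, not how any particular library does it: the names haveKqueue and eventBackend are made up, the platform list just mirrors the one in the text, and in a real library the enum would live in its own package-qualified module that the others import.

```d
// Decide once, in one place, which platforms get the kqueue backend.
// (Hypothetical name; in practice this would sit in something like a
// shared config module so every other module can import it.)
version (FreeBSD)      enum haveKqueue = true;
else version (OpenBSD) enum haveKqueue = true;
else version (OSX)     enum haveKqueue = true;
else                   enum haveKqueue = false;

// Any module that can see the enum branches on it with static if,
// instead of repeating the version(...) chain locally.
string eventBackend()
{
    static if (haveKqueue)
        return "kqueue";
    else
        return "fallback";
}

void main()
{
    import std.stdio : writeln;
    writeln(eventBackend());
}
```

Unlike a version identifier, the enum is an ordinary symbol: it follows normal import and namespace rules, so it cannot be silently overridden from the command line and two libraries can use the same name without clashing.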

Remember, everything that applies to version also applies to debug; they're virtually identical, though debug has some saving graces version lacks, like being able to do debug writeln.
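
A small sketch of what "debug writeln" buys you, with a made-up function name (square) for illustration: statements under debug are exempt from the purity check, so you can trace a pure function without removing the attribute.

```d
import std.stdio : writeln;

// writeln is impure, so calling it from a pure function is normally an
// error. A statement prefixed with `debug` is exempt from that check.
int square(int x) pure
{
    debug writeln("square(", x, ")"); // only compiled in with -debug
    return x * x;
}

void main()
{
    assert(square(7) == 49);
}
```

Built with dmd -debug the trace line is compiled in; without the switch the debug statement is dropped entirely.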

General question for a language feature: what if there's a diamond dependency? Lib a is depended on by two separate things, lib b and lib c... then the application uses a combination of the three. If you can't answer that, the feature will appear to work in trivial toys, but prove to be broken eventually in real world deployment. This situation usually doesn't happen in demos, but inevitably happens in the wild. This same question, btw, also reveals importC's fundamental design flaw (http://dpldocs.info/this-week-in-d/Blog.Posted_2022_05_16.html), the dub version mangle prefix fundamental design flaw, dub's sourceLibrary build type's design flaw; the list goes on.

Analyzing diamond dependencies ought to be as common a question in DIPs as "what if you pass zero or int.max?" is for unit tests.

If I were in charge of simplifying the language, I'd fix debug and kill version. To some extent, these problems also apply to -unittest and several other switches too, in part because they can trigger a built-in version difference. This broke Phobos not that long ago.