DConf deadline passes, write-up on the -mv switch

Posted 2023-06-05

The DConf deadline has gone past with a good number of talk submissions for the Aug 29-Sep 1 event. The selected speakers will be announced in another week.

Meanwhile, I wrote a bit about dmd's -mv switch (called --mv on ldc2 and, under what I say is the best name, -fmodule-file on gdc) and some interesting quirks with it.

Using the module file switch

The module file switch to the D compilers is documented as:

-mv=<package.module>=<filespec>: use <filespec> as source file for <package.module>. This is used when the source file path and names are not the same as the package and module hierarchy. The rightmost components of the path/filename and package.module can be omitted if they are the same.

But this doesn't really describe what it does or what it is good for. This is part of why I prefer the gdc name, -fmodule-file: it is more descriptive than the -mv name on dmd and ldc. The switch doesn't actually move anything, it just tells the compiler where to find something.

However, that final sentence in the documentation implies it does a bit more than map a single module file, and trying it in practice shows it also does a bit less than specify the file.

Since the rightmost part can be excluded, this means you can specify a package directory mapping as well as a module file mapping. For example, create the following file layout in a root folder:

/// main.d
module main;

import app.foo;

void main() {
        bar();
}

and then

/// source/foo.d
module app.foo;

void bar() {}

So the end result is:

$ ls -R
.:
main.d  source

./source:
foo.d

First, if you compile without extra flags, you get an expected failure:

$ dmd -i main.d
main.d(3): Error: module foo is in file 'app/foo.d' which cannot be read

Next, let's tell it where to find the app.foo module explicitly:

$ dmd -i main.d -mv=app.foo=source/foo.d

Works! This is no different than a more traditional compile:

$ dmd main.d source/foo.d

So, why would you ever use the -mv switch for specifying files? The key difference is if we don't use the -i switch; that is, when doing separate compilation.

If you list the files on the command line, like dmd main.d source/foo.d, both main.d and foo.d are always compiled. This is actually often desirable; it is more likely to just work and it usually builds faster! But sometimes it isn't, so you want a way to specify where an import can be found without adding it to the full compile list.

This is where -mv in this form can help:

$ dmd -lib source/foo.d # compile the module separately
$ dmd main.d foo.a
main.d(3): Error: module foo is in file 'app/foo.d' which cannot be read
# that didn't work since it can't find the import file
$ dmd main.d foo.a -mv=app.foo=source/foo.d
# success!

When compiling the modules separately and the compiler can't guess the filename from just an import statement, you can use the -mv switch to tell it where to find it without making it compile again. This already has some use!

And the documentation says it can do more than this. We can also provide partial names and let the default guess fill in the rest. For example, back to our test rig, instead of telling it where to find the module app.foo specifically, we can tell it just where to find the app package in the filesystem and let it infer the modules inside:

$ dmd -i main.d -mv=app=source

Success! It used the user-provided mapping for the package, then did the default filename lookup for the rest of it. This can provide some of the benefits of specifying files while still enjoying the benefit of -i to compile the dependency modules automatically.

This makes me wonder how it interacts with a peculiar D feature that tries to import another language's code without necessarily compiling it and having to work with system include dirs.... can we use it with importC? The answer is yes! ... but there's a caveat in the current implementation.

Go back to our test rig and add some more files:

// include/test.c
int printf(const char*, ...);

(Please note that it is called .c, not .h. I know, I know, keep reading.)

And let's use it under the namespace cbindings in D:

/// main.d
module main;

import app.foo;

import cbindings.test;

void main() {
        bar();
        printf("Hello\n");
}

And let's compile it, adding the -mv mapping for the cbindings package namespace to the include folder, with -mv=cbindings=include/:

$ dmd -i main.d -mv=app.foo=source/foo.d -mv=cbindings=include/
$ ./main
Hello

It worked! It is a little strange though. If the switch worked as documented, providing a file path for the given module, it *should* have worked with .h, but it doesn't; we had to use .c. And check this out: make an empty file called include/test.d and recompile:

$ touch include/test.d # make the empty file
$ dmd -i main.d -mv=app.foo=source/foo.d -mv=cbindings=include/ # same build command ...
main.d(9): Error: printf is not defined, perhaps import core.stdc.stdio; is needed?

Same command, different result. What if you gave it the full name?

$ dmd -i main.d -mv=app.foo=source/foo.d -mv=cbindings.test=include/test.c
main.d(9): Error: printf is not defined, perhaps import core.stdc.stdio; is needed?

Note the full name in -mv=cbindings.test=include/test.c, explicitly specifying the .c extension, and yet we still get an error message. dmd -v hints as to why:

< snip >
import    cbindings.test        (include/test.d)

It ignored the file extension I specified! So, what actually happens here? dmd's implementation strips off the file extension you specify and treats it as just a prefix comparison for the module name. It then re-appends whatever is left in the actual import module name with its normal guess process - first, it tries to append .di and open that file. If it fails, it appends .d and opens that. If that fails, it appends either .c or .i (I forget the order but it tries them both), yada yada yada. It repeats this process through its -I import paths until it either finds something or exhausts the list, at which point it finally issues the error message.
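To make that concrete, here is a rough model of the lookup in D. It is a sketch of the behavior as described above, not dmd's actual code: the function name candidates and its parameters are made up for illustration, the -I import path loop is left out, and the relative order of .i and .c is a guess.

```d
import std.algorithm.searching : startsWith;
import std.array : replace;
import std.path : stripExtension;

/// File names that would be tried for `importName`, given one
/// -mv=<mvModule>=<mvFile> mapping (hypothetical helper, not dmd code).
string[] candidates(string importName, string mvModule, string mvFile)
{
    // Default guess: dots in the module name become directory separators.
    string base = importName.replace(".", "/");

    // The mapping applies when it names the module itself or a parent package.
    if (importName == mvModule || importName.startsWith(mvModule ~ "."))
    {
        // Any extension given on the command line is dropped...
        string prefix = mvFile.stripExtension;
        // ...a trailing slash is tolerated...
        while (prefix.length > 1 && prefix[$ - 1] == '/')
            prefix = prefix[0 .. $ - 1];
        // ...and the unmatched tail of the module name is appended again.
        base = prefix ~ importName[mvModule.length .. $].replace(".", "/");
    }

    string[] result;
    // Assumed order; the relative order of .i and .c is a guess.
    foreach (ext; [".di", ".d", ".i", ".c"])
        result ~= base ~ ext;
    return result;
}

unittest
{
    // -mv=app=source maps the whole package.
    assert(candidates("app.foo", "app", "source")[1] == "source/foo.d");
    // The .c on the command line is ignored, so include/test.d is tried first.
    assert(candidates("cbindings.test", "cbindings.test", "include/test.c")
        == ["include/test.di", "include/test.d", "include/test.i", "include/test.c"]);
}
```

Something like dmd -main -unittest -run on this file should exercise the checks. The second assert is the whole story of the empty include/test.d: it sits earlier in the list than include/test.c, so it wins.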

I believe this is an implementation bug and it shouldn't strip the extension if one is given. Indeed, the switch should probably be able to carry its own extension preference, meaning I could say use cbindings.*=include/*.h or something similar to specifically tell it to search for .h files instead of .d files.
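Here is a sketch of how a mapping like cbindings.*=include/*.h might expand. To be clear, no compiler accepts this syntax today; the pattern form, the function name expandWildcard, and the "return null when it doesn't apply" convention are all hypothetical.

```d
import std.algorithm.searching : endsWith, startsWith;
import std.array : replace;

/// Expand a hypothetical wildcard mapping such as
/// "cbindings.*" => "include/*.h" for one import.
/// Returns null when the import is outside the mapped namespace.
string expandWildcard(string importName, string modPattern, string filePattern)
{
    // This sketch only handles patterns of the form "pkg.*".
    assert(modPattern.endsWith(".*"));
    string prefix = modPattern[0 .. $ - 1]; // e.g. "cbindings."

    if (!importName.startsWith(prefix))
        return null; // not our namespace, fall back to the normal lookup

    // Whatever follows the package prefix replaces the * in the file pattern.
    string tail = importName[prefix.length .. $].replace(".", "/");
    return filePattern.replace("*", tail);
}

unittest
{
    assert(expandWildcard("cbindings.test", "cbindings.*", "include/*.h")
        == "include/test.h");
    assert(expandWildcard("app.foo", "cbindings.*", "include/*.h") is null);
}
```

The point of the design is that the extension comes from the mapping rather than from the compiler's guess list, so a stray include/test.d could no longer shadow the C header.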

If it did that, this might solve the problem that ended up in importC's check for .h files being reverted: the way that was implemented would lead to .h files sometimes shadowing .d files. If we could specify which D namespace applies to which files, we could say the cbindings.* are going to read .h files from a particular dir without affecting any D files. It can all be driven by the system configuration and/or command line to help clean things up a little.

Another thing that is arguably a bug applies here too. If dmd finds a file that appears to match a pattern, it doesn't necessarily require the module definition to match:

// color.d
module mismatch.color; // package name specified here

void foo() {}

and

// test.d
import color; // but no package specified here

void main() {
        foo(); // the declaration from color.d is visible
}

# specifying the file to build will correctly fail:
$ dmd test.d color.d
test.d(1): Error: module mismatch.color from file color.d must be imported with 'import mismatch.color;'

# but specifying it through an import path will not fail:
$ dmd -i test.d -mv=mismatch=.
# worked

Despite the fact that the D file specifies it belongs to a package, the import mechanism did not actually test that, unless both modules were explicitly compiled together! If you do separate compilation (dmd -lib color.d followed by the separate build) or automatically included imports (the -i switch), the compiler happily uses the file it found that seems to match the name without actually checking its explicit module declaration.

Once again, this is pretty clearly a bug in the compiler implementation. The spec says these names must match. This doesn't seem to lead to any major problem - if there is a conflict in namespaces, you will get an error using them together - but it sure can be a bit surprising.

....the problem is, I think the importC mechanism actually relies upon this bug! Since a .c file never has a D module definition, it cannot match it. And if you use -mv to namespace it, this bug being fixed might break the whole idea. (I think you should namespace it, since top-level names are problematic long-term. Well, more accurately, I think the compiler should provide a magic namespace for importC and perhaps drop its ties to D's import keyword entirely, as I discussed about a year ago here: http://dpldocs.info/this-week-in-d/Blog.Posted_2022_05_16.html )

But still, recognizing this use might justify fixing the bug with an exception for importC configurations. It might help make importC more useful in real world situations while aligning more with the spec.
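For illustration, a check with that exception might look roughly like this. It is only a sketch under my own assumptions, not a patch against dmd: the helper names declaredModule and importIsConsistent are invented, and the regex is a crude stand-in for the real parser.

```d
import std.path : extension;
import std.regex : matchFirst, regex;

/// Pull the name out of a `module x.y;` declaration, or return an
/// empty string if the source has none (crude: comments are not skipped).
string declaredModule(string source)
{
    auto m = matchFirst(source, regex(r"^\s*module\s+([\w.]+)\s*;", "m"));
    return m.empty ? "" : m[1];
}

/// Would `import importName;` be allowed to resolve to this file?
bool importIsConsistent(string importName, string fileName, string source)
{
    // Assumed exception: importC sources can never declare a D module name.
    immutable ext = fileName.extension;
    if (ext == ".c" || ext == ".i")
        return true;

    string declared = declaredModule(source);
    // No declaration at all: the name comes from the file, nothing to compare.
    if (declared.length == 0)
        return true;

    return declared == importName;
}

unittest
{
    enum colorSrc = "// color.d\nmodule mismatch.color;\n\nvoid foo() {}\n";
    assert(!importIsConsistent("color", "color.d", colorSrc));
    assert(importIsConsistent("mismatch.color", "color.d", colorSrc));
    assert(importIsConsistent("cbindings.test", "include/test.c",
        "int printf(const char*, ...);\n"));
}
```

With a rule like this, the color.d example would be rejected under -i and separate compilation too, while the namespaced include/test.c would keep working.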