ImportC's module namespace problem

Posted 2022-05-16

Following up from last week, I will write about another problem importC has that I haven't seen other people discuss.

ImportC and module namespaces

D and C have very different compilation and symbol namespace models. In C, you concatenate all the included files into one compilation unit and compile it, putting all declarations into a global namespace. In D, you import independent modules, each with its own separate namespace. If you've ever used two different sources of extern(C) bindings (including through dpp), you might have encountered this before, manifesting as type mismatches between two things that look the same to you.

As of this writing, these examples only work on dmd master, since they lean on the new support for shelling out to the preprocessor for convenience.

Consider the following. I have two C libraries that both use a FILE* that I want to combine from D:

// b.c
#include<stdio.h>

FILE* openAFile();

FILE* openAFile() {
        return stdout;
}

// b2.c
#include<stdio.h>

void saySomethingToAFile(FILE*);

void saySomethingToAFile(FILE* fp) {
        fprintf(fp, "hi!\n");
}

(the declaration before the definition is to simulate the result of a #include "b.h" in a real world thing after going through the preprocessor, but you don't really need that; the point is just that there's two separate C units that both use a FILE*)

In a C project, you could easily use these two library functions together.
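
As a rough sketch of what that looks like, here is a single-file version, kept in one file only so it is self-contained. The helper name useBothLibraries is made up for illustration; a real project would keep b.c and b2.c separate and let the linker join them.

```c
#include <assert.h>
#include <stdio.h>

/* What b.h and b2.h would contribute after preprocessing. */
FILE* openAFile(void);
void saySomethingToAFile(FILE*);

/* b.c */
FILE* openAFile(void) {
        return stdout;
}

/* b2.c */
void saySomethingToAFile(FILE* fp) {
        fprintf(fp, "hi!\n");
}

/* Hypothetical caller, not part of either library: the FILE* from one
   library goes straight into the other, because both saw the same
   global FILE type. */
void useBothLibraries(void) {
        FILE* fp = openAFile();
        assert(fp != NULL);
        saySomethingToAFile(fp);
}
```

Calling useBothLibraries() from main needs no casts and no qualified names; there is only one FILE as far as C is concerned.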

But try to use them with ImportC in the most straightforward way, like so:

import b;
import b2;

void main() {
        auto fp = openAFile();
        scope(exit) fclose(fp);
        saySomethingToAFile(fp);
}

What happens?

$ dmd d.d b.c b2.c
d.d(7): Error: function `b2.saySomethingToAFile(_IO_FILE* fp)` is not callable using argument types `(_IO_FILE*)`
d.d(7):        cannot pass argument `fp` of type `b._IO_FILE*` to parameter `b2._IO_FILE* fp`

Uh oh, it didn't work. Why can't I use a FILE* as a FILE*? What if I tried to call printf("hello!\n"); in my D function?

d.d(9): Error: function `b2.printf` at /usr/include/stdio.h(332) conflicts with function `b.printf` at /usr/include/stdio.h(332)

Gotta love it pointing at the same definition in the same file, but if you read the name carefully, you'll see one is referring to module b's namespace and one is referring to module b2's namespace. In C, these are the same thing; it is all in a global namespace that the linker would sort out, but in D they are different due to the module system.
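
To make the C side of that concrete, here is a small sketch using made-up names (struct handle, openHandle, and handleId are all hypothetical, not from stdio.h). Repeating a declaration in C, as happens when the same header text reaches the compiler more than once, just names the same entity again:

```c
#include <assert.h>
#include <stddef.h>

/* First copy of the declarations, as if from one #include. */
struct handle;
struct handle* openHandle(void);

/* Second copy, as if the same header text arrived again.
   C treats both as the same struct tag and the same function. */
struct handle;
struct handle* openHandle(void);

/* One definition satisfies every declaration above. */
struct handle { int id; };

static struct handle theHandle = { 42 };

struct handle* openHandle(void) {
        return &theHandle;
}

/* Accepts the result of openHandle with no cast: there is only one
   struct handle, however many times it was declared. */
int handleId(struct handle* h) {
        assert(h != NULL);
        return h->id;
}
```

ImportC, by contrast, gives each .c file its own copy of every name it sees, which is where b._IO_FILE and b2._IO_FILE come from.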

Make no mistake: the D module system is outstanding and much better than what C has to offer. But the promise of ImportC is that things just work, and as you can see here, things are not just working.

dstep would convert individual header files to independent D modules that import each other instead of including each other. This approach sometimes breaks C code that expects the concatenation behavior, but it actually works more often than not and works pretty well to bridge the C headers into the D module system. (My experience with using these various existing techniques in real world code is part of why I rapidly hypothesized this wouldn't work.)

ImportC doesn't do this though; it focused on writing a C compiler instead of thinking about how it would interact with the preprocessor. And note, this is using dmd master, with upstream's attempt at preprocessor integration already merged. It just pretends each C compilation unit, an amalgamation of everything it #includes, is a D module and applies an arbitrary namespace to it, leading to significant symbol duplication inside different D namespaces.

What are some solutions?

Well, one idea is to put everything together into one file. Have one .c file that #includes everything you want in one big shared namespace. But that's hard to do in reality since you might have code from different sources coming together; library A can do this and library B can do this, but application C using both A and B will have a hard time re-combining.

Another might be to do some of what dstep and deimos do: make a collection of D modules that just #include one file at a time. But even this can fail because one file might include another file and the diamond dependency in C results in incompatibility in D. dstep avoids this by having an entirely different approach to the preprocessor than importC; could importC replace #include directives with __import directives? But then there are still some macros and preprocessor defines that would break. And besides, if you're going to just clone dstep, might as well just use dstep!

We might not import c but instead mixin(c) inside the scaffolding of a D module. At least then you can give it a proper name and have some intuitive understanding that it is embedded. But this is still one layer deep; an include that includes another include will still get copied in the new namespace.

Nah, I don't think these are going to work. (Though I would prefer MixinC to ImportC since then at least you aren't changing the D import and module name rules!) But for a full solution, I think we have to rethink the interaction of module namespaces and C declarations.

This is one advantage of the ImportC framework - it doesn't have to play by D's rules. The compiler could put them all into a combined __global_from_c namespace, regardless of where they are actually #included. This would simulate the C global namespace across modules while still giving some disambiguation option leveraging D's excellent module system.

But then what happens to import c? Well, frankly, I'd be pretty happy just getting rid of that. I think it was a mistake to try to do that in the first place. Instead, we could use a fake preprocessor directive: a new #include <somefile> name; that triggers the importC behavior into the global C namespace and then imports that under the given name, similar to:

#include <somefile.c> name;
// that'd be rewritten like
import name = somefile;
// except somefile's things go into the
// special C namespace, you'd only use the
// name here to disambiguate as used

Similar to how template mixins work with an optional disambiguation name.

Or we can work this right into mixinC. Either way, import c is pretty problematic and I think we will need to do something about it to make this work sooner or later. The errors we get in the simple test case now are not productive.

EXTRA EXTRA READ ALL ABOUT IT

Edit after the fact: someone on the newsgroup pointed out they already opened a bug about this:

https://issues.dlang.org/show_bug.cgi?id=22674

Walter closed it "WONTFIX". Yes, there are ways to make it work with fully qualified names, structural casts, etc. But if the point of ImportC is for things to Just Work, we should be less dismissive of real world experience.