dub 2.0 design discussion

Posted 2022-03-28

In the community

What Adam is working on

I added a pragma to minigui itself to opt in to version 6 common controls on Windows, which enables visual themes and some other nice features if you are building with mscoff. This should work on anything since either XP or Vista, so I see no real reason not to enable it, but if you have one, use -version=minigui_no_manifest to undo it.... except....

...This pragma does NOT work with the version of lld-link bundled with dmd at this time due to bugs in that linker. It is fixed in the new version of lld-link and works in the Microsoft link.exe, so it cannot be merged until that is fixed upstream. Max started a PR to bump the version.

Until that is merged, this is actually under version=minigui_manifest. I will static if it when it is possible to do so and enable it by default. Hopefully, this will make it in for the next release of dmd and ldc. I'm not sure if gdc supports it already; I will test when its official release is made next month. (gdc on Windows works except for missing pthreads decls in druntime... I might fix that myself too.)
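For readers who have not used linker directives from D, here is a minimal sketch of what a version-gated opt-in of this kind might look like. It is not copied from minigui: the enum name is made up for illustration, and the dependency string is the usual identity of the version 6 common controls assembly as passed to the Microsoft linker's /manifestdependency option.

```d
// Hypothetical name; the embedded double quotes keep the linker from
// splitting the directive at the spaces.
enum commonControlsManifest =
    `"/manifestdependency:type='win32' ` ~
    `name='Microsoft.Windows.Common-Controls' version='6.0.0.0' ` ~
    `processorArchitecture='*' publicKeyToken='6595b64144ccf1df' ` ~
    `language='*'"`;

version (Windows) version (minigui_manifest)
{
    // Only takes effect when compiled with -version=minigui_manifest
    // and linked as mscoff by a linker that honors the directive.
    pragma(linkerDirective, commonControlsManifest);
}
```

On any other platform, or without the version flag, the pragma is skipped entirely and the build is unchanged.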

dub 2.0

I've made no secret of my dislike of dub. It is mediocre in all respects, except where it is outright bad. Its biggest crime is that its mediocrity crowds out innovation, but I think I have a solution to that: dub 2.0.

I think it is important to separate out the three things we bundle as "dub" and discuss its three elements independently:

  1. dub the repository
  2. dub the packager
  3. dub the builder

Each of these has pretty serious design and implementation problems that we'd want to fix in a 2.0 revamp.

dub the repository

Let's look at the repository first. Its main design flaw is not enforcing any kind of namespacing. Module names are global in D, and the dub 1.0 repo allows packages to add modules with any name, including even common single words like "module util", which can lead to conflicts. Even if it is a private module, the linker still sees all the names, and that can lead to problems. Fixing this may mean breaking packages, so it would have to be opt-in so people can migrate.

The repository also has a number of implementation problems. These could likely be fixed in dub 1.0, but people have attempted and failed to do so, so there are surely some problems pretty deep in the implementation (I find the code all quite hard to follow). Among these problems are the broken search, the iffy download count and scoring systems, the regular outages, slow updates, and the lack of a process to remove abandoned/broken packages or to transfer ownership of names (or, generally speaking, to allow group access). All of these would be worth fixing.

dub repository 2.0 would also want to make the breaking change of either enforcing module namespaces or, more likely, since it is useful to extend things like std or core in certain special cases, at minimum warning on names like this so users are aware of potential incompatibilities.
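A rough sketch of the kind of check a repository could run on each module a package ships. The function name and the choice of which namespaces count as reserved are assumptions for illustration; nothing like this exists in dub 1.0.

```d
import std.array : split;

/// Hypothetical repository-side check. Returns a warning message,
/// or null when the module name looks fine.
string checkModuleName(string moduleName)
{
    auto parts = moduleName.split(".");

    // A bare name like "util" lives in the global namespace and is
    // the case most likely to collide with another package.
    if (parts.length == 1)
        return "top-level module '" ~ moduleName ~ "' has no package namespace";

    // Extending these is sometimes legitimate, so warn, don't reject.
    switch (parts[0])
    {
        case "std", "core", "etc":
            return "module '" ~ moduleName ~ "' extends the '"
                ~ parts[0] ~ "' namespace";
        default:
            return null;
    }
}
```

With this, checkModuleName("util") and checkModuleName("std.extras") both produce a warning, while checkModuleName("arsd.minigui") returns null.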

dub the packager

Next is the package management aspect. This means resolving versions, downloading and installing packages, and defining the package format.

I'm extremely skeptical of the value of semver resolution. Having authors tag versions and document their policies is useful, so insofar as the package manager encourages people to do this, there is potential value. Users being informed about the changelog as they update is also somewhat useful; they can get recommendations of tested versions from upstream, and that recommendation can come out of version ranges and a resolution algorithm.

But other than the recommendation, it isn't that useful. An application would just provide its own locked version set. For libraries, the user will want to make their own decisions - and really, the best practical option is almost always the newest versions available at the time you are choosing versions; you might start with the newest then defer updates for a while, but if and when you do decide to update, you want to use the newest one unless you can't for some concrete reason.
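The policy described above is simple enough to sketch: use the locked version if the user has one, otherwise take the newest available. The function names are made up for illustration, and the parser assumes plain dotted numbers with no pre-release or build suffixes.

```d
import std.algorithm.iteration : map;
import std.array : array, split;
import std.conv : to;

/// Turns "1.10.0" into [1, 10, 0]. Suffixes such as "-beta" are
/// deliberately not handled in this sketch.
int[] parseVersion(string text)
{
    return text.split(".").map!(part => part.to!int).array;
}

/// Hypothetical selection policy: a locked version always wins,
/// otherwise pick the newest. Assumes `available` is not empty.
string pickVersion(string[] available, string locked = null)
{
    if (locked !is null)
        return locked;

    string best = available[0];
    foreach (candidate; available[1 .. $])
    {
        // D compares arrays lexicographically, so 1.10.0 beats 1.9.3.
        if (parseVersion(candidate) > parseVersion(best))
            best = candidate;
    }
    return best;
}
```

Note that comparing the version strings directly would get this wrong, since "1.9.3" sorts after "1.10.0" as text; that is the only reason the parsing step is there.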

In theory, you can use a long-term support branch and get benefits from updates without risk of breakage. In practice, such branches rarely even exist (if you look through the current repository, once a new minor or major version is released, there are only a handful of patch releases ever made again, mostly from libraries doing a special-case release upon specific request rather than as a general policy)... and when they do, they still sometimes break anyway, as two authors rarely agree on what exactly constitutes a "breaking change". So you're unlikely to find real world success trying to not update.

It is also worth noting that updating *anything* can mean updating everything; a library might have no breaking changes... if you use their version of the compiler and/or other dependencies. But if any one of those moves (a new compiler might deprecate something the library used, for example), you are likely forced to move everything or beg the author for a special release just for you. In all probability, you're gonna be forced to either fork your libs or get new things at some point anyway.

And let's talk about forks for a bit: I don't really like how dub hides the library source in a distant folder, since I like being able to make edits in place. Sometimes, you have to fix something and can't wait for upstream to do something about it. dub 1.0 has add-override to help with this, but then it doesn't get saved with the rest of your project.

dub 1.0's package format combines both package metadata and building information. Package metadata includes things like name, description, dependencies, etc. Building information includes preBuildCommands, copyFiles, and so on. targetType is something in the middle, influencing both the package info and the build.

I've ranted in the past (at least in chat rooms) about how dub.json is very redundant with information available in the source code and also limiting in comparison to what the compiler can express.

In dub 2.0, I'd want to reduce the redundancy and separate out the build commands from the package description. There are always some grey areas, like targetType, but my rule of thumb is the declarative ones, like "this project produces a library called thing.dll and a source file called thing.d", are fair to have as part of the package description (even though it might not even specify it to such a detail - it might just point at an output directory and say "whatever the build dumps in there"), but any imperative commands to actually get there ought to be a separate concern. The most the package description part would do about the build part is define what kind it prefers, so the package manager, when asked to build a dependency, can delegate to the appropriate build system.

For reducing redundancy, anything it can learn from analyzing the code should not have to be repeated in the package file; the analysis would at least provide defaults, which you can override later if needed.

I'd probably call the package file dub2.json, so you can provide both a dub.json and dub2.json for compatibility with both 2.0 and 1.0 dub users.

I'd like to bundle various well-known build systems, like gmake, reggae, perhaps ninja, something like that with dub 2.0, so you have at least some reliable systems to delegate to. Of course, you can also just call out to dmd -i, or let the user do their own build. But I imagine most things would like to provide an automatic method. Of course, one of the defaults can also be dub 1.0; simply delegating to dub 1.0 to build in some cases can ease migration without forcing dub 2.0 itself to copy all of dub 1.0's mistakes and bugs. A clean break while maintaining some degree of compatibility for the end user can be achieved through this delegation.
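To make the split between description and build concrete, here is a minimal sketch in which the package description only names a preferred build system and the package manager maps that to a command. Every type, field, and enum member here is hypothetical; only the command lines themselves (dmd -i, dub build, make, ninja) are real tools, and reggae is left out because its invocation depends on the backend chosen.

```d
/// Hypothetical set of build systems dub 2.0 might delegate to.
enum BuildSystem { dmdI, dub1, make, ninja, manual }

/// Hypothetical declarative part of a package: no imperative steps,
/// just what the package is and which builder it prefers.
struct PackageDescription
{
    string name;
    string mainSource;
    BuildSystem preferredBuild;
}

/// Maps the preference to the command the package manager would run.
/// Returns null when the user is expected to build by hand.
string[] buildCommand(PackageDescription pkg)
{
    final switch (pkg.preferredBuild)
    {
        case BuildSystem.dmdI:   return ["dmd", "-i", pkg.mainSource];
        case BuildSystem.dub1:   return ["dub", "build"];
        case BuildSystem.make:   return ["make"];
        case BuildSystem.ninja:  return ["ninja"];
        case BuildSystem.manual: return null;
    }
}
```

A package that still ships only a dub.json would simply be described with BuildSystem.dub1, which is the migration path discussed above.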

I'd also like to see dub 2.0 have the ability to package non-D projects, so you can list, for example, sqlite3 as a dependency, which might be installed by the user's operating system or built, depending on cases. The dub 2.0 package description would describe what the D program needs, then delegate "building" it to another system, even if that other system is actually pacman -S or apt-get install. I think this part is easier said than done, but would be a worthwhile stretch goal.

dub the builder

The final major aspect of dub is the building aspect. After the packages are prepared, you want to build and perhaps run the application. I talked a bit about this in the last section - the key idea for dub 2.0 as a builder would be to delegate the necessary information to some other build system.

My big idea though is that building a library actually provides both an optional library build and a list of D modules that are either imported or just added to the build. The list of D modules might just be the original source, or it might be specially made (e.g. with dmd -H) for the library import case. This would let you potentially optimize the library use and let the build system arrange them in such a way that the compiler can easily find them without additional flags. Note that D files can also declare their own link-library dependencies, meaning this allows for more delegation to existing build systems as well as the optimization potentials. You might also generate .d files on the fly for .h files, like with dstep run across the C headers you installed from the system package manager for a non-D project.

This also completely decouples the source layout from the library import layout and ensures private parts of the library, like test files, are free to be implemented however they like without influencing the user's application. The library build is now separate from the application build.

That said, there are some cases where you do want build information from a library to influence the application build. You might want resource files and application manifests to be built. You might need certain -preview or -dip flags.

I'd let the package description suggest these, but note the word "suggest" - since these are global settings, the end user must have the final decision. The system could aggregate these, warn on potential incompatibilities, but then make a build script/config file the user can try, edit, and commit to their own repo. If this user is a library author, these suggestions would then be sent down to the next user to do the same thing.
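A small sketch of the aggregation step, under one stated assumption about what counts as an incompatibility: one library suggesting -preview=X while another suggests -revert=X. The types and function name are hypothetical, and a real tool would need more rules than this one.

```d
import std.algorithm.searching : startsWith;

/// Hypothetical record of one flag suggested by one library.
struct Suggestion { string library; string flag; }

/// Hypothetical result: the merged flags plus warnings for the user.
struct BuildPlan { string[] flags; string[] warnings; }

BuildPlan aggregate(Suggestion[] suggestions)
{
    enum previewPrefix = "-preview=";
    BuildPlan plan;
    string[string] suggestedBy; // flag -> first library asking for it

    foreach (s; suggestions)
    {
        if (s.flag in suggestedBy)
            continue; // duplicate suggestion, already collected
        suggestedBy[s.flag] = s.library;
        plan.flags ~= s.flag;
    }

    foreach (flag; plan.flags)
    {
        if (!flag.startsWith(previewPrefix))
            continue;
        // Assumed conflict rule: the same feature both enabled and reverted.
        auto opposite = "-revert=" ~ flag[previewPrefix.length .. $];
        if (auto other = opposite in suggestedBy)
            plan.warnings ~= suggestedBy[flag] ~ " suggests " ~ flag
                ~ " but " ~ *other ~ " suggests " ~ opposite;
    }
    return plan;
}
```

The result is only a proposal: the flags would be written to a script or config file that the end user can edit and commit, and the warnings tell them where to look first.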

I'd probably want libraries requiring flags in the final build - in their private build it is ok, but the final build is different since that leaks out - to be flagged as problematic in the repo since this is likely to be incompatible with other libraries, just like reusing a shared name.

Conclusion

This is the outline for dub 2.0 that I have so far: break from dub 1.0, introduce curation to the repository, separate out declarative package definitions from build, and separate the build of libraries from the build of applications.

I imagine most current dub packages could be converted by automatic means, and since you can always just delegate to dub 1.0 to build, the migration should be doable.

A lot of people have talked about fixing dub 1.0. But it has design flaws on every level and its implementation is not easy to fix. If it were, it surely would have been done by now. It is worth noting that several of D's more prolific developers only use it when forced to...

It is time to seriously consider dub 2.0.