See more at the announce forum.
GC talk came up again and I got thinking a little bit about pointer barriers in the compiler. I don't think it is helpful to think of them as new types per se, but rather as something akin to range checks in slices - a facility in the normal type that adds safety and capability, but can be bypassed both locally and globally if necessary for special cases.
You still have the same pointers D has today. The only difference to the language itself is adding a way to bypass the GC write barrier, probably some kind of core.memory.__raw(T) which would look something like struct __raw(T) { T ptr; alias ptr this; auto opAssign(T rhs) { /* compiler magic to bypass the bounds check */ } }. Note the alias this - if you did actually have one of these, it'd implicitly convert right back to the normal pointer. On the other hand, bypassing the check is always an explicit act at construction time. Its only real job is to indicate that you want to bypass the normal check here, you might even use it similarly to slice.ptr[0] to locally bypass a bounds check. Important to realize that you never have to actually use this; it is strictly an optional performance enhancement for special cases, just like how you can bypass a bounds check.
In the compiler, again analogous to bounds checking, you can add -barrier=[none|writes] as a command line switch to control it globally. Certain GC implementations would not be available if the barriers are not compiled in.
The barrier itself would be delegated to a druntime function or a llvm intrinsic, or whatever is appropriate for the compiler and garbage collector. (This can make swapping out collectors at runtime an additional performance hit, since that'd imply the barrier is also a function pointer, but all these are reasonable decisions to look at.) The function would most likely look like core.memory.__ptr_write(void** where, void* what) so give the implementation maximum flexibility. With intrinsics and inlining, when the barriers are turned off, this would surely reduce to mov [where], what; anyway.
A few things to note:
The compiler frontend doesn't even need to do anything special for this case either btw, since the backend can detect these self-modify cases and elide the check.
I expect this would mean a great many D functions don't generate the barrier at all, even when it is enabled.... unless I'm missing something important. The barriers happen when you reassign pointers to another block.
char* a; a = x; // this gets a barrier *a = x; // this does not
It is only the mutation of the pointer value that matters. The GC doesn't care about the value of a random char, only about which memory blocks are referenced and unreferenced.
EDIT
I posted this but didn't explain some of the practical benefits of these different GC impls! So here's some very brief things:
Right now, the GC mark phase stops all threads so it doesn't get into race conditions, when it marks one thing then it changes while the GC is working elsewhere. With the write barriers, you can avoid stopping a thread until it tries to actually write a pointer value
So say you have an audio thread that is consuming GC resources but only writing to an existing ushort[] buffer... its activities will never invalidates the GC's work-in-progress so it can just keep going. This would just work without explicit user work, no need to unregister it.
Another work thread may be processing some data then will send the buffer to a function. This one MIGHT be able to keep running through the whole thing, depending on if the borrowed pointer thing actually worked, but a simpler and likely less buggy implementation will probably left it finish processing the data then only actually stop when it copies the buffer pointer to the function, which would trigger the write barrier. And by then, the GC might be almost done, meaning the actual time stopped is much smaller than the current implementation.
DONE WITH EDIT
Additional reading: http://www.infognition.com/blog/2014/the_real_problem_with_gc_in_d.html is an article from 2014 about how generational GCs can provide a major win and absolutely require write barriers to implement. I'd note that generational GCs are often also copying, since this lets them reduce memory fragmentation when promoting items to older generations, thus reducing wasted memory. But it isn't strictly necessary; you can still free the unreferenced ones and then promote the whole area to older generation, allocating a new pool for new young objects. Not ideal but definitely doable.
Also, a post about the GC implementation in particular 4 years ago: https://web.archive.org/web/20180622071748/https://olshansky.me//gc/runtime/dlang/2017/06/14/inside-d-gc.html I don't know how much of that has changed since then.
I actually find today's GC perfectly good enough, but improvements are generally good anyway and I think the write barriers would be a net again without being particularly invasive in user code.
Next week, I'll talk about how to optimize your objects to reduce GC pauses in today's D with a general understanding the NO_SCAN flag. In short, arrays without pointers are not scanned at all and you can use this to your advantage. I'll also touch upon strategic deferrals of collections when you know you are in a pure allocation loop.
I sketch out some thoughts on how D could easily and efficiently add write barriers to enable more GC strategies without affecting other code, analogizing them to bounds checks.