Standard event loop in OpenD

Posted 2024-04-22

Having a standardized event loop in a programming language brings many benefits: operating system interop, integration across libraries, async functionality, threaded parallelism, and more. Standardizing at the lowest level may bring additional benefits, like GC integration. Let's take a closer look.

Standardization benefits

Standard interface

The biggest benefit of a standard event loop is simply that it is standard. If you want to combine uses of different libraries right now, you've got to use separate threads for each library's loop. This complicates the code, burns additional resources, and, in some cases, is just plain difficult to realize because a library doesn't expose enough interop facilities at all!

A standard event loop offers a shared interface for users and implementers to target, giving a chance for better cohesiveness across the library ecosystem.
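To make that concrete, here is one rough sketch of what such a shared interface could look like. Everything here is invented for illustration - the module name, the types, and the functions are assumptions, not an actual OpenD API:

```d
// Hypothetical sketch only; none of these names exist in druntime today.
module core.eventloop; // assumed location

/// Something the loop can wait on: a socket, a timer, a pipe, etc.
interface EventSource {
    version(Posix) int handle();     // fd the loop hands to epoll/kqueue
    version(Windows) void* handle(); // HANDLE for the Windows wait APIs
    void onReady();                  // callback run when the handle fires
}

/// Libraries register their sources with the shared loop...
void registerSource(EventSource source);
void unregisterSource(EventSource source);

/// ...and the runtime (or the user, explicitly) drives it.
void runEventLoop();    // blocks until all registered jobs are complete
void runOneIteration(); // processes at most one batch of pending events
```

With something along these lines in druntime, a GUI library and a networking library could both register their handles with the same loop instead of each spinning up a thread for its own.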

Standard implementation

Implementing the standard event loop interface is likely non-trivial, but also not terribly innovative. There are just a lot of pieces that need to be done a certain way to work well (the real work is done by the operating system, which is a shared component); doing it once and reusing it is likely to yield better bang for the buck than a bunch of different implementations that converge anyway but are nevertheless incompatible.

GC benefits

The GC can also benefit from a shared interface for deferred jobs. You might be able to defer most collections to event loop idle time, batching them to avoid pointless intermediate collections (though it might not be reliable to do this automatically anyway...) while also staying compatible with existing old-d webassembly setups.
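As a sketch of the idea - `onIdle` and the `EventLoop` type are invented for illustration; only `GC.collect` and `GC.minimize` from core.memory are real:

```d
import core.memory : GC;

// Hypothetical: instead of collecting inline whenever an allocation
// fails, the GC could ask the standard loop to call it back when the
// program has nothing else to do.
void setupDeferredCollection(EventLoop loop) { // EventLoop is assumed
    loop.onIdle({
        // Batch the work: one collection at idle rather than several
        // pauses in the middle of handling events.
        GC.collect();
        GC.minimize(); // return freed pages to the OS while we're idle
    });
}
```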

Note that "event loop idle" is actually a fairly important state - the loop should be ready to respond to a new event as fast as possible - so you still want to minimize GC pause times here, or at least make the GC itself interruptible so an event can preempt it. Most likely, a preempted collection would be cancelled with all its work discarded, so it would have to start over from scratch later - and it's possible it would be interrupted again and simply never get a chance to complete, leaving the program responsive... but running at 100% CPU with ever-growing memory usage. Though perhaps you could do a concurrent run: fork, then have the child process send its report back to the parent, telling it what to free via its own event loop. I think it'd be tricky to get this right, but easier with a standard event loop than without one. There is a concurrent collector usable in some cases in D right now; it appears to work by checking its messages when a future collect is requested, which works, but I believe it would benefit from having a more immediate event.

Another useful case to consider is that you can have destructors dispatch cleanup messages back to specific threads.

In the current GC implementation, object destructors are run by random threads; a destructor just runs in whatever thread happened to invoke the GC this time. For destructors that must work with particular thread-local resources, this is problematic. To rectify this, the arsd simpledisplay library has a ring buffer that destructors queue commands to, which the main thread picks up. I have been planning to generalize this into arsd.core, but haven't finished the work yet. Once that's done, though, I can imagine it being useful in other cases too. However, an object will need to specify which thread gets its cleanup messages, and doing that generically means each object needs a tag in its memory or something... so maybe not useful in general, but again, at least there will be some way to represent it when necessary in a standard interface.
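The pattern itself is simple enough to sketch. This is not the actual arsd implementation - the names and sizes are illustrative - but it shows the shape: a fixed-size buffer (destructors run inside the GC, so they must not allocate) holding plain commands rather than closures (building a closure would also allocate):

```d
import core.sync.mutex : Mutex;

/// A plain-data command: no closure, so posting never allocates.
struct CleanupCommand {
    void function(void* resource) fn;
    void* resource;
}

final class CleanupQueue {
    private CleanupCommand[256] slots; // fixed size: can't grow in a dtor
    private size_t head, tail;         // monotonic counters, indexed mod 256
    private Mutex mtx;

    this() { mtx = new Mutex(); }

    /// Called from a destructor, potentially on any thread.
    bool post(CleanupCommand cmd) {
        mtx.lock(); scope(exit) mtx.unlock();
        if (tail - head == slots.length) return false; // full; caller must cope
        slots[tail++ % slots.length] = cmd;
        return true;
    }

    /// Drained by the owning thread, e.g. from its event loop.
    void drain() {
        for (;;) {
            CleanupCommand cmd;
            {
                mtx.lock(); scope(exit) mtx.unlock();
                if (head == tail) return;
                cmd = slots[head++ % slots.length];
            }
            cmd.fn(cmd.resource); // run outside the lock
        }
    }
}
```

A production version would likely want a lock-free ring buffer rather than a mutex, since taking a lock from inside a GC callback has hazards of its own; the sketch just shows the data flow.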

Async main

A lot of programming languages have some kind of async/await built into the language, where only async functions can await other async functions. D doesn't have that built in, but fibers give us something somewhat similar: not all functions are compatible with yielding. First, if you're not already in a fiber, you can't yield at all. Second, even if you can yield, you also need to know when to wake the fiber up, and only some functions are compatible with that.

The latter part, knowing when to wake up, is a tricky question, but not one that is helped by having an async main. (It is helped by a standard interface, though.) The former part, being able to yield at any time, is helped: if main is implicitly async (whether done via a fiber or some stackless coroutine scheme), it is compatible with all of this.

On the other hand, it isn't that hard to just spawn a waitable task from main yourself. Just like how it isn't that hard to run an explicit event loop at the end too...
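For comparison, the explicit version might look something like this - spawnTask, setTimeout, and runEventLoop are all placeholder names for whatever the standard interface would provide:

```d
import core.time : seconds;
import std.stdio : writeln;

void main() {
    // Explicitly spawn the async work...
    auto task = spawnTask({ // hypothetical: runs the body in a fiber
        writeln("running in a fiber; free to yield while waiting");
    });

    setTimeout(() => writeln("one second later"), 1.seconds); // hypothetical

    // ...and explicitly drain the loop before returning.
    runEventLoop(); // blocks until the task and the timer are both done
}
```

An implicitly async main would let you drop the spawnTask wrapper and the trailing runEventLoop call and just write the body directly.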

At the end of the day, I think the benefits of a standard event loop are mostly realized just through standardization - get all the libraries on the same page with a compatible interface, and no matter how you call into it, you benefit - but making things just work might be the last piece needed to realize the culture shift.

Is this actually smart to do? What about programs that just don't care about this? Well, there's still extern(C) int main() {}. Additionally, overriding _d_cmain is an option as well - something that could be done pretty easily with a selective import.

Platform support

The main objection I expect to see to this is that all this standardization risks being exclusionary: what if you're in an environment where none of this support exists? What if the support exists but you don't use it?

In both cases, the event loop function itself can be a no-op - we define an event loop as running until all jobs are done, and if there's no support, there can be no jobs; thus an event loop that does nothing is a conforming implementation.
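In code, the degenerate implementation really is that small (runEventLoop being the assumed name from the sketches above):

```d
// On a platform with no event support, no API can ever enqueue a job,
// so "run until all jobs are done" is satisfied by returning at once.
void runEventLoop() { /* zero jobs can exist; nothing to wait for */ }
```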

What about functions that are expected to create jobs? setTimeout, for example, will never fire without an event loop. Such a function might throw an exception or fail to compile if it knows support doesn't exist. I can imagine these being easy to handle with static if or ordinary runtime if checks; a standard interface should be implemented everywhere, but we can define it such that some features can be detected by the user. In some cases, async functions can even complete synchronously without breaking the interface; it depends on the specifics, but we need to stay open to these possibilities.
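Detection could look something like this - EventLoop.supportsTimers and setTimeout are invented names for whatever capability flags the standard interface might publish:

```d
import core.time : seconds;

void onTick() { /* periodic work goes here */ }

void scheduleTick() {
    // Hypothetical compile-time capability flag exposed by the runtime.
    static if (EventLoop.supportsTimers) {
        setTimeout(&onTick, 1.seconds);
    } else {
        // Option 1: refuse to compile on unsupported platforms...
        static assert(0, "this program needs event loop timer support");
        // Option 2 (alternative): ...or fall back to a blocking sleep.
    }
}
```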

A big question might be: what if you're running an incompatible event loop? Really, this is similar to any other blocking function, except that it is expected to block much longer... and the solution is similar: put it in its own thread if it can't be fully integrated. We'll try to minimize incompatibilities through standardization, though.

Similarly, if you're being called from another language, there's no D main anyway, but you might want to run a temporary loop to support these functions, or implement the interface in terms of the host's loop.

Also worth noting: the webassembly platform *always* has an async main. This design simply exposes that reality.

First step

The first step is to go ahead and do it. Let's put the pieces in place and start filling them in. If we're wrong, we can change our minds later; there's not that much OpenD user code yet anyway. Worst case, people call the explicit loop function, which clears the jobs, and then the druntime loop sees it has no work and returns anyway; we don't have much to lose by giving it a try.