my thoughts on std.socket design

Posted 2021-12-20

A few updates on cgi.d's hybrid server coming to Windows and other arsd things, but mostly I'll talk about std.socket and what I'd like to see from it in a hypothetical v2. Spoilers: it isn't what most of the proposals describe.

Core D Development Statistics

15 bugs fixed
14 bugs and enhancement requests opened
36 pull requests merged into the language: 23 into DMD, 4 into Phobos, and 9 into druntime.
8 pull requests merged into the website.

In the community

Community announcements

See more at the announce forum.

What Adam is working on

A bunch of new web server code: cgi.d's embedded_httpd_hybrid (the one that splits work between both threads and fibers) is now implemented on Windows as well as Linux. I did this using the i/o completion port functionality, and it was my first time actually using this myself, so I'd like to do more testing before tagging it.

There's also a new RequestServer.stop() function to stop the server when it is done with current requests. I want to add a few more config controls to that object but not sure when I will.

I also added some convenience functions for the dispatcher for refresh headers and 201 Created responses. I'll document these later.

I need to find a few days to write some cgi.d tutorials and I am thinking about doing an overhaul on the event interface, but I also have minigui examples to write and like a million other things. Ugh.

Thoughts on Phobos' std.socket

I have a complicated relationship with std.socket. On the one hand, I love how flexible it is, but on the other hand, I hate how limited it is out of the box.

This felt contradictory, but today I realized what I actually like about it is the virtual interface, and what I'm meh on is the implementation.

Well, I don't really like the interface either. It is ok, but it doesn't offer the nice async options Windows has and I'm not in love with the lastSocketError / wouldHaveBlocked functions. accept throwing when it would block bugs me a little too. But even those complaints are somewhat minor, and most importantly, I like that it IS an interface.
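For readers who haven't run into that pattern, here is a minimal sketch of what it looks like with the current std.socket. The helper name nothingToRead is made up for illustration; socketPair, blocking, receive, Socket.ERROR, wouldHaveBlocked and lastSocketError are the Phobos pieces. accept signals the same condition by throwing a SocketAcceptException instead of returning an error value.

```d
import std.socket;
import std.stdio;

// Hypothetical helper: reports whether a non-blocking receive found nothing to read.
bool nothingToRead(Socket sock) {
	ubyte[64] buffer;
	auto got = sock.receive(buffer[]);
	if (got != Socket.ERROR)
		return false; // data arrived, or the peer closed the connection (got == 0)

	// The reason for the failure is not in the return value;
	// you have to ask for it separately, right after the call.
	if (wouldHaveBlocked())
		return true;
	throw new Exception("receive failed: " ~ lastSocketError());
}

void main() {
	Socket[2] pair = socketPair();
	scope(exit) { pair[0].close(); pair[1].close(); }

	pair[0].blocking = false;
	if (nothingToRead(pair[0]))
		writeln("no data yet; try again later");
}
```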

I've done a few different Socket subclasses. There's my OpenSSL socket, my fiber socket, my stdin-pretending-to-be-socket, and recently, yes, a Windows async io Socket, which is why this is on my mind. These do varying amounts of actually supplementing or outright replacing the Phobos implementation, but thanks to the interface, code built on the Phobos Socket can be transparently reused with these new subclasses, even if it wasn't written with any such replacement in mind. (Well, maybe not the WHOLE interface, but plenty enough to make this approach useful anyway.) And that's proven pretty valuable to me.
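As a rough illustration of that kind of reuse (this is not one of the subclasses listed above), the sketch below defines a hypothetical MemorySocket that never touches the network and hands it to a function written only against Socket. It assumes the send(buf, flags) signature from recent Phobos releases; older releases may differ in the scope annotation.

```d
import std.socket;

// Hypothetical subclass: "sent" bytes are appended to an in-memory buffer.
class MemorySocket : Socket {
	ubyte[] sent;

	// Bring the other send overloads back into this class's overload set,
	// so they are not hidden by the override below.
	alias send = Socket.send;

	// The one overload that does the real work in the base class.
	override ptrdiff_t send(scope const(void)[] buf, SocketFlags flags) @trusted {
		sent ~= cast(const(ubyte)[]) buf;
		return cast(ptrdiff_t) buf.length;
	}
}

// Written with only the Phobos class in mind.
void greet(Socket s) {
	s.send("hello\n"); // forwards to the virtual send(buf, flags)
}

void main() {
	auto m = new MemorySocket();
	greet(m);
	assert(cast(string) m.sent == "hello\n");
}
```

No constructor is written here, so the subclass picks up Socket's protected default constructor, which leaves the OS handle unset.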

I see a lot of people hate on std.socket. They say it isn't up to modern standards and needs to be replaced. I partially agree - like I said, I often swap out parts of its implementation and some of its interface is certainly sub-optimal. I find it quite obnoxious that the destructor closes the socket and I can't find any way to override that.... But I still think they are underrating the module as a whole and overrating many of the proposals to replace it.

A lot of this comes down to struct vs class. What I like about it comes directly from the fact that it is a simple class. No final! This makes the interface flexible enough to adapt to new situations. That said, I probably wouldn't mind if the forwarding methods were final, since you really only want to override the actual implementation one anyway, and leave the others behind (well, you want to alias in the overload sets rather than leave them, but like you don't want to override them). Again, the interface isn't perfect right now; I'm not against improving it, I am just against throwing it out entirely.

Anyway, the complaints against the class tend to be:

It isn't @nogc. This has nothing to do with classes. Even with the current Socket class, you can fulfil the nogc static requirements by working on a subclass and allocating it with scope (of course a function taking a branching subclass hits some reuse issues with the status quo, my point is just that it can pretty easily be done). Its accept function even offers a virtual hook so you can change the allocation scheme it uses!
There's virtual function overhead. I find it hard to believe this is significant next to the i/o call itself, but even if it is, I say trust the compiler, and even if it fails, it is still worth the cost thanks to being able to reuse code that works on the interface.
The destructor can do silly things. You can fix this with various techniques, including scope guards, a release method on the interface perhaps, and of course, a struct wrapper, similar to File, could be provided. I'll come back to this.
Templates can provide similar polymorphism for the user. But this is, simply put, a hassle - how many functions that take a File are templated on it? And I always push back on anti-template bias, but some of the bloat and object linking complaints are real.
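The virtual hook mentioned in the @nogc item is Socket.accepting(), which accept() calls to create the object for each new connection. A small sketch, assuming a hypothetical TaggedSocket subclass and a loopback connection just to have something to accept; it only shows that the hook decides which class comes back, not a full @nogc setup:

```d
import std.socket;

// Hypothetical subclass; accepting() and the protected this() come from Phobos.
class TaggedSocket : Socket {
	this(AddressFamily af, SocketType type) { super(af, type); }

	// accept() wants an object whose handle is not set yet, which is
	// what Socket's protected default constructor gives us.
	protected this() pure nothrow { }

	// accept() calls this to allocate the object for each new connection.
	protected override Socket accepting() pure nothrow {
		return new TaggedSocket();
	}
}

// Accepts one loopback connection and returns the server-side socket.
Socket acceptOneLoopback() {
	auto listener = new TaggedSocket(AddressFamily.INET, SocketType.STREAM);
	scope(exit) listener.close();
	listener.bind(new InternetAddress("127.0.0.1", InternetAddress.PORT_ANY));
	listener.listen(1);

	// A blocking connect to a listening loopback socket completes
	// before accept() is called; the connection waits in the backlog.
	auto client = new TcpSocket(listener.localAddress);
	scope(exit) client.close();

	return listener.accept();
}

void main() {
	auto accepted = acceptOneLoopback();
	scope(exit) accepted.close();
	assert(cast(TaggedSocket) accepted !is null);
}
```

A different override could hand back objects from a pool or a custom allocator instead of calling new, which is the allocation point the list item refers to.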

Well, as you can see, I don't find these complaints convincing. I think a std.socket without a Socket class would be a step down from what we have today. It can kinda make up for it by providing more built-in functionality, but frankly, Phobos doesn't exactly have a stellar track record in pulling that off. Either that'd do the unlikely (but I don't think it is impossible!) task of actually doing everything right for everyone, or leave us craving some extensibility.

So I think what I'd do is actually have a kind of struct wrapper around a class. This would essentially be following the PIMPL pattern, just like std.stdio.File, but by using an injected class, you keep all the subclass benefits for users.

struct Socket {
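	public interface Impl {
		// reference counting hooks used by the postblit and destructor below
		void AddRef();
		void Release();

		// your things here
	}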

	private Impl impl;
	public Impl getImpl() { return impl; } // it is still available at user's own risk

	// user can provide their own from the outside
	this(Impl impl) {
		this.impl = impl;
	}

	// can refcount it for the more deterministic destruction
	// again, just like Phobos's File does, but with easier
	// replacement of the implementations.
	this(this) {
		if(impl) impl.AddRef();
	}
	~this() {
		if(impl) impl.Release();
	}

	void open(Class = PhobosSocket, Args...)(Args args) if(is(Class : Impl)) {
		// or you can pass an allocator, or even just malloc it or whatever
		// we could have some required CTFEable things for design-by-introspection
		// and whatnot too. Could get pretty complicated but doesn't have to.
		//
		// But the point is just that the user chooses the class impl it uses.
		if(impl) impl.Release(); // let go of any implementation we already hold
		this.impl = new Class(args);
	}

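	// hypothetical helper (not an existing Phobos template) that would
	// generate a forwarding method for each member of Impl
	mixin ForwardCallsTo!Impl;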
}

class PhobosSocket : Socket.Impl {
	// some reasonable default implementation so things just work for new users
}

User code functions would simply work with this Socket struct and get the reusability benefits. Only when constructing do they need to make some decisions and pick their implementations. Of course, the library should provide a good enough implementation - the current Phobos Socket with some tweaks can be that - but then you can swap it out, including at runtime, easily enough to account for things like other ssl libs, other event loop integrations, and so on.
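To make the moving parts concrete, here is a cut-down version of that design that compiles on its own. It is only a sketch under a few assumptions: the interface has a single send method, the forwarding is written by hand instead of generated, and LoggingImpl is a made-up implementation that records what it is given rather than doing any i/o.

```d
struct Socket {
	interface Impl {
		void AddRef();
		void Release();
		size_t send(const(void)[] data);
	}

	private Impl impl;

	// Takes over the reference the implementation starts out with.
	this(Impl impl) { this.impl = impl; }

	this(this) { if (impl) impl.AddRef(); }  // every copy holds a reference
	~this() { if (impl) impl.Release(); }    // and gives it back when it dies

	// Hand-written forwarder; a real version would generate these.
	size_t send(const(void)[] data) { return impl.send(data); }
}

// Hypothetical implementation that just records what it was asked to send.
class LoggingImpl : Socket.Impl {
	int refs = 1;
	bool closed;
	string[] log;

	void AddRef() { refs++; }
	void Release() { if (--refs == 0) closed = true; }
	size_t send(const(void)[] data) {
		log ~= (cast(const(char)[]) data).idup;
		return data.length;
	}
}

// User code only sees the struct.
void greet(Socket s) { s.send("hello"); }

void main() {
	auto impl = new LoggingImpl();
	{
		auto s = Socket(impl);
		greet(s);                 // copied in, released again on return
		assert(impl.refs == 1);
		assert(!impl.closed);
	}
	assert(impl.closed);          // last reference gone at end of scope
	assert(impl.log == ["hello"]);
}
```

The point of the exercise is that greet never names LoggingImpl; swapping in another class that implements Impl changes nothing in user code, and the close happens deterministically when the last struct copy goes away.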

Now, if you want to get into what I'd like to see in the interface itself, I'd say something more like Windows provides. It is easier to emulate the (frankly superior) Windows API on Posix than it is to do vice versa; you can block until a non-blocking operation completes, but you can't really not block until a blocking operation completes. I haven't looked closely at the other event libraries but I'd be surprised if they don't more-or-less do this. But stack vs heap buffers can be a legitimate thing to consider if you commit to either blocking or fiber pseudo-blocking operations. Still, whatever, that's in the weeds and I don't want to go far into that today. My main point here is just that Socket being a class is actually really useful, and if you do a struct/class hybrid, you can combine things in a useful way for a lot of people's cases.
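The "block until a non-blocking operation completes" direction can be sketched with what std.socket already has. This is not the Windows overlapped API, just the general idea using select; receiveBlocking is a hypothetical helper name, and the socket pair is only there to have something to read from.

```d
import std.socket;

// Hypothetical helper: gives blocking behaviour on top of a non-blocking socket.
ptrdiff_t receiveBlocking(Socket sock, void[] buffer) {
	auto readable = new SocketSet();
	for (;;) {
		auto got = sock.receive(buffer);
		// Done on success, on a closed connection, or on a real error.
		if (got != Socket.ERROR || !wouldHaveBlocked())
			return got;

		// Nothing there yet: sleep until the socket becomes readable, then retry.
		readable.reset();
		readable.add(sock);
		Socket.select(readable, null, null);
	}
}

void main() {
	Socket[2] pair = socketPair();
	scope(exit) { pair[0].close(); pair[1].close(); }

	pair[0].blocking = false;
	pair[1].send("ping");

	ubyte[16] buffer;
	auto got = receiveBlocking(pair[0], buffer[]);
	assert(got == 4);
	assert(cast(const(char)[]) buffer[0 .. got] == "ping");
}
```

Going the other way, turning a call that already blocks into one that doesn't, needs a helper thread or a different system call, which is the asymmetry described above.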
