Working on official blog 2018 retro, C++ new wrapped, dmd reading zips?

Posted 2019-01-21

I was approached to write "Last Year in D" for the official blog, and spent much of my free time this week working on that. Upstream, I noticed a little more obj-c code merged, some talk about wrapping C++'s new operator for D as yet another memory management option, and a forum post (again) about extending dmd to read source files directly out of zip archives.

Core D Development Statistics

In the community

Community announcements

See more at the announce forum.

What Adam is working on

dpldocs.info now has a clear-cache button in the page footers... unless your project's pages were cached before the button was added, in which case you won't see it yet. You can contact me to force-clear it once if you like; after that, you will be able to do it yourself via the web. (Or you can already issue a POST request to the /reset-cache endpoint on your project's dpldocs.info subdomain.)
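For example, here's a minimal sketch of how you could trigger that from D with std.net.curl; the "mylib" subdomain is just a hypothetical placeholder for your own project's subdomain.

    // Hypothetical example: replace "mylib" with your project's
    // actual dpldocs.info subdomain.
    import std.net.curl;
    import std.stdio : writeln;

    void main() {
        // Fire a POST at the reset-cache endpoint; the request body can be empty.
        auto response = post("http://mylib.dpldocs.info/reset-cache", "");
        writeln(response);
    }

A plain curl command would do the same job; the point is just that it's a single POST, nothing fancy.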

On reddit, people showed interest in my gtkd doc work: http://gtk-d.dpldocs.info/gtk.AboutDialog.AboutDialog.html so I might renew some effort on that soon, but I still haven't finished my cgi.d changes, so that will probably happen first.

My thoughts on forum discussions

The zip file thing has come up before and nothing came of it, but since it is Walter who keeps bringing it up (and did so again this time), there's always a chance it will happen. I find it interesting, but I am skeptical of the claimed benefits.

Walter said looking up fewer files will make importing Phobos faster, and proposed putting it all in a phobos.zip. I find it unlikely that this would actually make a significant difference, and I can't help but notice the irony of first being told we need to break up big files for compile speed (see the histories of std.algorithm, std.datetime, and std.range), and now being told we need to zip them all back together for compile speed. Both seem, to me anyway, to miss the actual point: no matter how many files you have, Phobos is a heavily templated, strongly coupled project. Whether it is many files or one, the import graph looks like a web, so importing one module tends to drag in much of the rest (granted, there has been some good work on improving this, and breaking up the files has actually helped in some cases). Whether it is many files or one, dmd is slow at working with deep templates.
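To see what I mean about the web-like import graph, here's a quick experiment you can run yourself. The exact module list will vary by compiler version, so treat the idea, not any particular number, as the point.

    // deps_test.d - a one-line program that imports a single Phobos module
    import std.algorithm;

    void main() {}

    // Compile with dependency output and no object file, e.g.:
    //   dmd -deps=deps.txt -o- deps_test.d
    // then look at how many distinct modules show up in deps.txt: a single
    // import of std.algorithm transitively pulls in a large chunk of Phobos,
    // whether those files live loose on disk or inside a zip.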

I used to be excited about the zip possibility. One of the reasons I keep my module import graph small is that I have historically distributed, and still usually do distribute, my libraries as single, individual files you just drop in. With zips, maybe I would do it differently. Though probably not, since I like many other advantages of my process. Regardless, I still kind of like the idea.

But I remember the triumphant posts about how much faster dmd got when it stopped freeing memory. That was only true in a few specific cases (the benefit went away with a different C library; it was a flaw in the malloc/free implementation used on the benchmark systems), and more importantly, it has led to a situation where dmd is outright unusable on low-memory machines: it allocates so much, and never even attempts to free, that it thrashes swap hard, or, if you're lucky, gets terminated by the kernel's OOM killer. Either way, that win on a flawed benchmark is (yes, present tense, because it still is!) a major loss in real-world usage.

I love using a profiler and removing unnecessary function calls too; it is frequently the right thing to do, but we should have carefully investigated and confirmed the root of the problem there instead of eagerly chasing the first easy solution that comes to mind and patting ourselves on the back for our cleverness.

Of course, a new zip feature is unlikely to actually hurt anyone. It shouldn't break anyone's existing processes, and like I said, I am moderately optimistic about the possibilities it opens for library distribution (though much less so than I used to be, now that dmd -i and even dub exist). Nevertheless, I fear we are rushing into a paper-over fix instead of addressing the root problem with compile speeds.
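For what it's worth, dmd -i already covers much of the distribution convenience I cared about. A sketch, with made-up paths and module names just for illustration:

    // app.d - your program; mylib.stuff is a hypothetical third-party module
    import mylib.stuff;

    void main() {
        doTheThing(); // assume mylib.stuff provides this
    }

    // Build with automatic inclusion of imported modules:
    //   dmd -i -I=path/to/downloaded/libs app.d
    // The -i switch compiles the imported (non-Phobos) modules along with
    // app.d, so dropping a library's source tree next to your project is
    // about all the "install" you need.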

It is possible that slow filesystem lookups really are a problem, and maybe that is worth fixing regardless, but I'm not convinced it is worth the effort relative to other ideas. Maybe Walter should sit down with Stefan and get that new ctfe engine in.

I don't know, and I'm out of time to write right now; I just implore everyone not to lose sight of the big picture while looking at small benchmarks and potentially cool features.