Adam's rant on benchmarks
Posted 2019-11-18
A DIP to remove the ~= operator from slices was shot down by the community, and there were forum posts about benchmarks. I write about why I don't really care for benchmarks.
Core D Development Statistics
In the community
Community announcements
See more at the announce forum.
Adam's rant
I don't really believe in benchmarks. Lots of websites write lots of numbers on them, but really, they don't have a lot of real world applicability.
Despite some benchmarks trying hard to be realistic, they rarely actually are, and they are very often applied far too generally. For example, how many times have you seen a comment on one of those vibe benchmarks that says "D is slow"? That's not what the benchmark actually said, though: all it really said is that this implementation using the vibe.d library performed more slowly than the competitors on this specific test.
That says very little about D itself. It doesn't even say a lot about vibe.d itself - perhaps this implementation was just not great, or exercised a weak part of an otherwise decent library. Or maybe the implementation is fine, but the competitors cheated! Well, "cheated" is a kinda strong word, but they could be optimized for the benchmark case, perhaps neutrally, or perhaps at the expense of the general case.
Benchmarks try to do an apples-to-apples comparison by using a particular piece of hardware for everyone. But that hardware may be absolutely nothing like what you actually use, and your code may perform radically differently on the hardware you actually use.
To draw a conclusion about your use case on your hardware, the online benchmarks are of little help. Instead you have to profile yourself. And then, unlike the benchmark which just says "this took X seconds", the profile actually gives you hints as to why it is slow.
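To make that contrast concrete, here is a minimal sketch (in Python, since the post itself includes no code; the function names are invented for illustration) of the difference: a benchmark produces a single number, while a profiler attributes time to specific functions and points at *why* something is slow.

```python
import cProfile
import io
import pstats
import time

def slow_serialize(data):
    # Deliberately wasteful: rebuilds the string on every append.
    out = ""
    for item in data:
        out += str(item) + ","
    return out

def handle_request(n):
    return slow_serialize(range(n))

# A benchmark just says "this took X seconds"...
start = time.perf_counter()
handle_request(20000)
print(f"benchmark: {time.perf_counter() - start:.3f}s")

# ...while a profile says where the time went.
profiler = cProfile.Profile()
profiler.enable()
handle_request(20000)
profiler.disable()

buf = io.StringIO()
pstats.Stats(profiler, stream=buf).sort_stats("cumulative").print_stats(5)
report = buf.getvalue()
print(report)
# The report names slow_serialize as a hot spot - a hint at the cause,
# which the single benchmark number above never gives you.
```

The same idea applies in D with `dmd -profile`, which instruments the build and writes per-function timings to a trace log.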
Let me expand on that last point: you might argue it is important to know where to look ahead of time so you don't hit a wall after getting invested. I somewhat agree, but the problem is that benchmarks don't really measure the holistic cost and benefit of a system.
If you are using a "slow" language, you might worry it is going to be too slow for you. That might be fair, but you should consider that the implementation could be improved (probably through profiling!), or there's a good chance you can rewrite bottlenecks as a component in another language, or just set up a memory cache, or something like that to turn it around.
The real question is: how difficult is it to set those things up? Will they actually work with your usage patterns? Benchmarks are rarely insightful on these other factors.
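On the question of how hard such workarounds are: a memory cache, for instance, can be a one-line addition. A sketch in Python (again, not from the post; the lookup function is a hypothetical stand-in for a slow query), though whether it actually helps still depends on your usage patterns, which is exactly the point above:

```python
import time
from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_lookup(key):
    # Stand-in for a slow database query or computation.
    time.sleep(0.05)
    return key.upper()

start = time.perf_counter()
first_result = expensive_lookup("user:42")   # slow path: does the work
first = time.perf_counter() - start

start = time.perf_counter()
second_result = expensive_lookup("user:42")  # fast path: served from the cache
second = time.perf_counter() - start

# The cached call skips the 50ms sleep entirely.
assert second < first
```

A cache like this only pays off if the same keys recur; a workload of all-unique keys gets the overhead with none of the benefit - information no benchmark website can supply for your traffic.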
Whereas once you actually start working and find your development speed is poor, or your site has poor latency, or you are hitting a global GC lock under the high concurrency you actually need, you can test that specific thing and try to change or work around it.
You can learn about those problems ahead of time by reading other people's reports. But you'll almost certainly need to look more closely at the circumstances than benchmark websites provide. And that's why I don't put a whole lot of stock in them.
(unless my code wins, then benchmarks are totally legit and 100% accurate to everything!!!!!!)