September 30, 2019

Posted 2019-09-30

Another LDC beta, and my thoughts on the PHP to D blog post ideas.

Core D Development Statistics

In the community

Community announcements

See more at the announce forum.

Adam's thoughts

HTML generation by objects

One of the controversies I saw about the blog post this week was using DOM functions to generate or populate HTML templates. I like this approach; in fact, I have been using it myself for a long time now! And even my new webtemplate.d module, while appearing to be string-based on the outside, is actually DOM-based on the inside.

HTML can be represented as a string, but it isn't actually a string. Forgetting this is what leads to cross-site scripting (also known as XSS) vulnerabilities, among other problems like simply broken websites. (I once had a form that would randomly fail to submit. The root cause? The submit button ended up outside the </form> tag. Whether it worked then depended on factors that made the failure appear quite random.) Using strings makes these mistakes very easy to make, whereas using a DOM representation makes them harder, if not impossible.
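A minimal sketch of the difference, written in Python with only the standard library rather than in D, so none of this is dom.d's API. The function names and the sample payload are made up for illustration.

```python
import xml.etree.ElementTree as ET


def comment_by_string(author, text):
    # Plain concatenation: whatever is in `text` becomes markup.
    return '<p class="comment"><b>' + author + "</b>: " + text + "</p>"


def comment_by_dom(author, text):
    # Build a tree; user data only ever lands in text nodes.
    p = ET.Element("p", {"class": "comment"})
    b = ET.SubElement(p, "b")
    b.text = author
    b.tail = ": " + text  # text that follows </b> inside <p>
    return ET.tostring(p, encoding="unicode", method="html")


payload = "<script>alert(1)</script>"
unsafe = comment_by_string("Mallory", payload)
safe = comment_by_dom("Mallory", payload)

print(unsafe)  # the script tag survives as a real tag
print(safe)    # the same input shows up as escaped text
```

The string version has no way to know that the payload was meant as text. The tree version never has to ask, because the data was placed in a text node to begin with.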

Template languages mitigate this somewhat by at least HTML encoding dynamic elements by default, and truth be told, this actually does a decent job! But DOM-based stuff can do even better:

  • A DOM insertion is context-aware and may be type safe. It is possible to throw an exception when trying to output a non-URL to an href attribute, for example (though I have never actually done that myself). It can also recognize what is an attribute name, value, text node, JavaScript code component, etc.
  • A DOM insertion can never create malformed HTML. (It can create invalid HTML - though again, it is possible to detect this immediately - but not malformed HTML.)
  • DOM manipulation is possible too: you can modify an existing element after it has been created.

Some of these things can be done with strings, too (a general pattern; there is rarely only one way to do something) - you can run validators on the output, check links, etc. - but by the time the data is a string, you have lost information. Some of it can be regained (ironically, often by parsing it back into a DOM, lol), but the definitive original data is now gone.
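The href check mentioned in the list above could look something like the following. Again this is a Python standard-library sketch, not dom.d; `set_href` and `ALLOWED_SCHEMES` are hypothetical names, and the scheme check is deliberately simple rather than a complete URL sanitizer.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlsplit

# Assumption: which schemes count as acceptable is an application decision.
ALLOWED_SCHEMES = {"http", "https", "mailto"}


def set_href(element, value):
    """Set href, rejecting values whose scheme is not on the allow list."""
    scheme = urlsplit(value).scheme
    if scheme and scheme not in ALLOWED_SCHEMES:
        raise ValueError("refusing href with scheme %r" % scheme)
    element.set("href", value)  # relative links (no scheme) pass through


link = ET.Element("a")
link.text = "search"
set_href(link, '/search?q="d lang"&page=2')
html = ET.tostring(link, encoding="unicode", method="html")
print(html)  # quotes and ampersands are escaped for the attribute context

try:
    set_href(ET.Element("a"), "javascript:alert(1)")
    rejected = False
except ValueError:
    rejected = True
print(rejected)  # True
```

The check can exist at all only because the code knows it is writing an href. A string template that sees nothing but characters at an offset has no equivalent place to hang it.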

What about separation of concerns?

The most common counterargument to DOM templates is separation of concerns: the fear that DOM manipulation puts too much of the view logic in the main application code. I have to confess, I've done quite a bit of that myself. dom.d makes it so easy it is tempting to put everything right there in the D code.

But it is also quite easy to separate things just like in any other template system, if not easier. My webtemplate.d is one attempt at merging the best of both worlds: it looks like a string system on the outside (similar to ASP, ERB, etc.), but on the inside it still uses a DOM-based system. When you write <a href="<%= link %>">, it isn't just inserting a string at a character position; it knows to set the value of the link's href attribute to the value of the variable link.

Even without that, a DOM system can work on id, class, or data-* attributes and achieve a high level of decoupling. You can define custom tags, too, and add as much or as little of a manipulation DSL as you like. Indeed, the JavaScript frameworks React and Angular work this way, in principle. Both are integrated into the DOM concept instead of treating it as strings mixed with code.
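As a rough illustration of the data-* approach, here is a Python standard-library sketch. The attribute names `data-bind` and `data-bind-href`, the `render` function, and the template itself are all invented for this example; they are not how webtemplate.d or dom.d work.

```python
import xml.etree.ElementTree as ET

# The template is plain markup a designer could edit; it contains no code.
TEMPLATE = (
    "<article>"
    '<h1 data-bind="title">Placeholder title</h1>'
    '<a data-bind="link_text" data-bind-href="link">placeholder</a>'
    "</article>"
)


def render(template, data):
    root = ET.fromstring(template)
    for el in root.iter():
        # data-bind fills the element's text node.
        key = el.attrib.pop("data-bind", None)
        if key is not None:
            el.text = str(data[key])
        # data-bind-href fills the href attribute.
        key = el.attrib.pop("data-bind-href", None)
        if key is not None:
            el.set("href", str(data[key]))
    return ET.tostring(root, encoding="unicode", method="html")


page = render(TEMPLATE, {
    "title": "Tom & Jerry <3",
    "link": "https://dlang.org/",
    "link_text": "D",
})
print(page)
```

The application code only supplies a dictionary of values, and the template only names where they go. Neither side needs to know how escaping works for text nodes versus attributes.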

Of course, you could also argue that the server side is just generating a semantic description of the data, then the template lays it out based on id attributes (or whatever), and then the CSS achieves the visual separation.

It depends on how specifically you use it.

Rewrites vs fixes

Another common criticism of reported wins in D is the argument that you could have had those same wins in other languages, so the rewrite was unnecessary.

First, I'd counter on empirical grounds: if the old code was so easy to fix, why didn't it happen there? Whatever the reason, this is an interesting observation. Of course, it also needs to be weighed against the other frequent empirical observation: rewrites (much like the code they are replacing probably did) often go over budget and behind schedule.

But second, I'd say rewriting generally makes it possible to correct big mistakes and cancel a lot of technical debt, while slimming down with lessons learned from use. (If you actually finish the rewrite, that is - a big temptation is to try to do too much, and that breaks you.)

And using D in particular makes experimentation easier. Walter Bright wrote about this back when he was writing his Warp preprocessor - in his case, using the pipeline programming paradigm helped him make larger algorithmic changes during his benchmarking iterations. I find D's static checks and metaprogramming facilities help give me the confidence to make changes with the right mix of customization and centralization.

So yes, in theory, you can usually get the same wins in other languages that you can get in D, but I'd argue that the process of getting there is easier in D.