arsd.dom

This is an HTML DOM implementation. It started as a clone of what the browser offers in JavaScript, but goes well beyond that in convenience.

If you can do it in JavaScript, you can probably do it with this module, and much more.

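import arsd.dom;
import std.stdio; // needed for writeln

void main() {
	// Parse a document from a string.
	auto document = new Document("<html><p>paragraph</p></html>");
	// Print the first element matching the CSS selector.
	writeln(document.querySelector("p"));
	// Replace the contents of the root element.
	document.root.innerHTML = "<p>hey</p>";
	writeln(document);
}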

Note: this file optionally depends on arsd.characterencodings, which helps it correctly read files from the internet. You should be able to get characterencodings.d from the same place you got this file.

If you want it to stand alone, just always use the Document.parseUtf8 function or the constructor that takes a string.

Members

Core functionality

These members provide core functionality. Most of your direct interaction will be with the members of these classes.

Core functionality Classes

Document
class Document

The main document interface, including an HTML or XML parser.

DocumentFragment
class DocumentFragment

.

Element
class Element

This represents almost everything in the DOM and offers a lot of inspection and manipulation functions. Element and its subclasses are what make up the DOM tree.

XmlDocument
class XmlDocument

Specializes Document for handling generic XML. It always uses strict mode, and uses the XML MIME type and file header.

Core functionality Functions

findComments
Element[] findComments(Document document, string txt)
Element[] findComments(Element element, string txt)

Finds comments that match the given txt. The match is case insensitive and strips whitespace.

htmlEntitiesDecode
string htmlEntitiesDecode(string data, bool strict)

This takes a string of raw HTML and decodes the entities into a nice D utf-8 string. By default, it uses loose mode - it will try to return a useful string from garbage input too. Set the second parameter to true if you'd prefer it to strictly throw exceptions on garbage input.
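As a minimal sketch of the two modes described above, the following decodes a few well-formed entities in the default loose mode and then tries a malformed input in strict mode. The sample strings are made up for illustration, and the sketch assumes the strict parameter defaults to false, as the description implies. Exactly which inputs count as garbage is not stated here, so the strict call is wrapped in a try/catch without assuming either outcome.

```d
import std.stdio;
import arsd.dom;

void main() {
	// Loose mode (assumed default): well-formed entities are decoded.
	string decoded = htmlEntitiesDecode("Fish &amp; chips &lt;b&gt;");
	writeln(decoded); // prints "Fish & chips <b>"

	// Strict mode: pass true as the second argument. Whether this
	// particular input is rejected is not specified above, so both
	// outcomes are handled.
	try {
		writeln(htmlEntitiesDecode("AT&T &bogus", true));
	} catch (Exception e) {
		writeln("strict mode rejected the input: ", e.msg);
	}
}
```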

htmlEntitiesEncode
string htmlEntitiesEncode(string data, Appender!string output, bool encodeNonAscii)

Given text, encode all HTML entities in it: &, <, >, and ". This function also encodes all non-ASCII characters as entities, ensuring the resulting text will work even if your charset isn't set right. You can suppress this by setting encodeNonAscii = false.
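A short sketch of encoding untrusted text before placing it in markup, then decoding it again. It assumes the output and encodeNonAscii parameters have default values, so the function can be called with just the string; if your copy of the module requires them, pass them explicitly. The sample string is invented for illustration.

```d
import std.stdio;
import arsd.dom;

void main() {
	string raw = `<a href="x">Tom & Jerry</a>`;

	// Encode &, <, > and " so the text is safe to embed in HTML.
	// Assumption: the remaining parameters have defaults.
	string encoded = htmlEntitiesEncode(raw);
	writeln(encoded);

	// Decoding the encoded text should give back the original string.
	assert(htmlEntitiesDecode(encoded) == raw);
}
```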

parseEntity
dchar parseEntity(dchar[] entity)

This helper function is used for decoding html entities. It has a hard-coded list of entities and characters.

require
T require(Element e)

You can use this to do an easy null check or a dynamic cast+null check on any element.
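The null check can be sketched as follows. This assumes require is called with the target type as an explicit template argument (require!Element here) and that it throws ElementNotFoundException when the element is null or the cast fails; the document markup is invented for illustration.

```d
import std.stdio;
import arsd.dom;

void main() {
	auto document = new Document("<html><body><p>paragraph</p></body></html>");

	// querySelector may return null; require turns that into an exception
	// instead of a null dereference later on.
	Element p = require!Element(document.querySelector("p"));
	writeln(p.innerText);

	try {
		// There is no <table> in the document, so this lookup yields null.
		require!Element(document.querySelector("table"));
	} catch (ElementNotFoundException e) {
		writeln("no <table> found");
	}
}
```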

xmlEntitiesEncode
string xmlEntitiesEncode(string data)

An alias for htmlEntitiesEncode; it works for XML too.

Core functionality Structs

Html
struct Html

The Html struct is used to differentiate between regular text nodes and HTML in certain functions.

Selector
struct Selector

Represents a parsed CSS selector. You never have to use this directly, but you can if you know it is going to be reused a lot to avoid a bit of repeat parsing.

Bonus functionality

These provide additional functionality for special use cases.

Bonus functionality Enums

NodeType
enum NodeType

.

Bonus functionality Interfaces

FileResource
interface FileResource

This might belong in another module, but it represents a file with a mime type and some data. Document implements this interface with type = text/html (see Document.contentType for more info) and data = document.toString, so you can return Documents anywhere web.d expects FileResources.

Implementations

These provide implementations of other functionality.

Implementations Classes

AspCode
class AspCode

.

BangInstruction
class BangInstruction

.

ElementNotFoundException
class ElementNotFoundException

This is used when you are using one of the require variants of navigation, and no matching element can be found in the tree.

Form
class Form

Represents an HTML form. This slightly specializes Element to add a few more convenience methods for adding and extracting form data.

HtmlComment
class HtmlComment

.

Link
class Link

Represents an HTML link. This provides some convenience methods for manipulating query strings, but otherwise is the same Element interface.

MarkupException
class MarkupException

This is thrown on parse errors.

PhpCode
class PhpCode

.

QuestionInstruction
class QuestionInstruction

.

RawSource
class RawSource

.

ServerSideCode
class ServerSideCode
SpecialElement
class SpecialElement
Table
class Table

Represents an HTML table. Has some convenience methods for working with tabular data.

TableCell
class TableCell

Represents anything that can be a table cell: an HTML <td> or <th>.

TableRow
class TableRow

Represents a table row element, a <tr>.

TextNode
class TextNode

.

Implementations Mixin templates

JavascriptStyleDispatch
mixin template JavascriptStyleDispatch()

This puts in operators and opDispatch to handle string indexes and properties, forwarding to get and set functions.

Implementations Structs

AttributeSet
struct AttributeSet

A proxy object for attributes, which will eventually replace the main opDispatch.

DataSet
struct DataSet

A proxy object that implements the Element class's dataset property. See Element.dataset for more info.

ElementCollection
struct ElementCollection

A collection of elements which forwards methods to the children.

ElementStyle
struct ElementStyle

Lets you set the style with a string, like a plain attribute, and also access individual properties JavaScript style.

MaybeNullElement
struct MaybeNullElement(SomeElementType)

An option type that propagates null. See: Element.optionSelector

Other

Other Aliases

EventHandler
alias EventHandler = void delegate(Element handlerAttachedTo, Event event)

Used for DOM events.

Other Classes

CssStyle
class CssStyle

This is probably not useful to you unless you're writing a browser or something like that. It represents a *computed* style, like what the browser gives you after applying stylesheets, inline styles, and html attributes. From here, you can start to make a layout engine for the box model and have a css aware browser.

ElementStream
class ElementStream

This is the lazy range that walks the tree for you. It tries to go in the lexical order of the source: node, then children from first to last, each recursively.

Event
class Event

This is a DOM event, like in JavaScript. Note that this library never fires events; it is only here for you to use if you want it.

Stack
class Stack(T)

This is kinda private; just a little utility container for use by the ElementStream class.

StyleSheet
class StyleSheet

This probably isn't useful, unless you're writing a browser or something like that. You might want to look at arsd.html for css macro, nesting, etc., or just use standard css as text.

Other Functions

camelCase
string camelCase(string a)

Translates a CSS-style dashed property-name to a camel-cased propertyName.

getElementsBySelectorParts
Element[] getElementsBySelectorParts(Element start, SelectorPart[] parts, Element scopeElementNow)

Parts of the CSS selector implementation

idToken
sizediff_t idToken(string str, sizediff_t position)

.

intFromHex
int intFromHex(string hex)

Helper function for decoding HTML entities.

lexSelector
string[] lexSelector(string selstr)

Parts of the CSS selector implementation

normalizeWhitespace
string normalizeWhitespace(string text)

Normalizes the whitespace in the given text according to HTML rules.

parseSelector
SelectorComponent parseSelector(string[] tokens, bool caseSensitiveTags)

.

parseSelectorString
SelectorComponent[] parseSelectorString(string selector, bool caseSensitiveTags)

.

removeDuplicates
Element[] removeDuplicates(Element[] input)

.

unCamelCase
string unCamelCase(string a)

Converts a camel-cased propertyName to a CSS-style dashed property-name.
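As a small illustration of this function and its counterpart camelCase, the sketch below converts a CSS property name in both directions. The property name is just an example; inputs with leading dashes or consecutive capitals are not covered here.

```d
import std.stdio;
import arsd.dom;

void main() {
	// Dashed CSS name to camel case.
	string camel = camelCase("background-color");
	writeln(camel); // prints "backgroundColor"

	// Camel case back to the dashed CSS name.
	string dashed = unCamelCase(camel);
	writeln(dashed); // prints "background-color"
}
```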

Other Static variables

selectorTokens
string[] selectorTokens;

.

Other Structs

FormFieldOptions
struct FormFieldOptions

Undocumented in source.

SelectorComponent
struct SelectorComponent

.

SelectorPart
struct SelectorPart

Parts of the CSS selector implementation
