The Only Two Markup Languages

75 points by heavyrain266


isuffix

It's nice to see more discussions of concrete syntax in markup languages! I've refactored most of Typst's current parser, so I have a lot of appreciation for the nuance of markup language syntax. Here are three thoughts I had while reading this:

First, to go along with the broad families, I would argue for including a third family based on Lisp or Tcl (Forth and shell are similar, but I'll only write out these two). For example, here's how I would render some nested markup in each. Lisp: (foo (attrib value) '(,(bar 'wrapped) text)) Tcl: [foo -attrib $value "[bar "wrapped"] text"]. What sets these syntaxes apart from the others is the more explicit quoting of text content and the more implicit use of nodes/variables. While this approaches the explicitness of a normal programming language, I think it is underrated as a markup language syntax.
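To make the quoting difference concrete, here is a rough Python sketch that renders one small tree two ways: in the Tcl-like form above, where text is quoted and nodes are bare, and in a loosely TeX-like form, where text is bare and nodes are marked. The names `Node`, `render_tcl`, and `render_tex` are hypothetical, the TeX-like output is only an approximation of that family, and no escaping is attempted.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Node:
    """Hypothetical markup node: a name, attributes, and mixed children."""
    name: str
    attrs: dict = field(default_factory=dict)
    children: list = field(default_factory=list)  # each item is a Node or a str


def _body(node: Node, render) -> str:
    # Concatenate children, rendering nested nodes with the given renderer.
    return "".join(render(c) if isinstance(c, Node) else c for c in node.children)


def render_tcl(node: Node) -> str:
    """Tcl-like form: nodes are bare commands, text content is quoted."""
    parts = [node.name]
    # Assumption: attribute values are variable references, as in the example.
    parts += [f"-{k} ${v}" for k, v in node.attrs.items()]
    if node.children:
        parts.append('"' + _body(node, render_tcl) + '"')
    return "[" + " ".join(parts) + "]"


def render_tex(node: Node) -> str:
    """Loosely TeX-like form: text is bare, nodes carry the marker."""
    head = "\\" + node.name
    if node.attrs:
        head += "[" + ",".join(f"{k}={v}" for k, v in node.attrs.items()) + "]"
    return head + "{" + _body(node, render_tex) + "}"


tree = Node("foo", {"attrib": "value"}, [Node("bar", {}, ["wrapped"]), " text"])

print(render_tcl(tree))
print(render_tex(tree))
```

In the Tcl-like output the quotation marks delimit text and the command names stand alone, while in the TeX-like output it is the other way around: the backslash marks every node and the text needs no delimiter.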

Second, a serious pitfall of the TeX and SGML syntaxes is the stringly-typed nature of attributes. Under this post's analysis, Typst's syntax would fall under the TeX family: #foo(attrib: value)[#bar[wrapped] text]. A notable difference is that in Typst, value is not just a string but an expression in Typst's code-mode syntax. This enables far greater programming possibilities in Typst and allows it to embed a lightweight markup language alongside its heavier markup syntax. Having this in HTML would be the equivalent of allowing JavaScript expressions in attributes, like <foo attrib=1 + object.key[3] />. Indeed, JSX deserves mention for already allowing this kind of code interpolation in HTML markup.
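A minimal sketch of the string-versus-expression distinction, in Python rather than Typst or JSX. The helpers `string_attr` and `expr_attr` and the contents of `scope` are hypothetical, and `eval` is used only to keep the example short; it is not safe for untrusted input and is not how Typst or JSX evaluate attributes.

```python
from types import SimpleNamespace


def string_attr(raw: str) -> str:
    """Stringly-typed attribute: the consumer only ever sees the raw text."""
    return raw


def expr_attr(source: str, scope: dict):
    """Expression attribute: the source is evaluated against a scope.

    eval() with empty builtins stands in for a real expression evaluator.
    """
    return eval(source, {"__builtins__": {}}, scope)


# Hypothetical scope, mirroring the `object.key[3]` example above.
scope = {"object": SimpleNamespace(key=[10, 20, 30, 40])}

as_string = string_attr("1 + object.key[3]")
as_value = expr_attr("1 + object.key[3]", scope)

print(repr(as_string))  # the text itself, left for every consumer to parse
print(repr(as_value))   # an integer computed from the scope
```

With string attributes, every element that wants a number has to parse it again and agree on the format; with expression attributes, the value arrives already typed.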

Finally, for non-programming or non-text-based use cases (e.g. data transfer, configuration), KDL would be my recommendation for a modern take on the SGML syntax. For example: foo attrib=value { bar { text "wrapped" }; text "text" }. KDL has actual non-string value types, a generic type-annotation syntax, and a stable standard definition. The example here suffers from being written inline and from marking up text specifically, but the explicitness of the language and its kinder syntax make it a worthy successor to XML.