For the love of Troff (2020)

15 points by mccd


dzwdz

The system developed by Bell Labs, troff, was as different as could be: a direct typesetter with no imposed structure and no explicit machinery to enforce separation of semantics from style. Instead of dictionaries of opaque tags, troff provided sets of macros that users were free to extend or ignore or abuse. The wonderful, programmable flexibility was a feature to user and a bug to management: freedom for one is to the other loss of control. It should be no surprise that large, top-down organizations such as IBM and the military preferred SGML to troff.

"Semantic documentation bad because that's organization, and organization is what the military does"? I don't understand this argument.

Yeah, troff does not enforce the usage of semantic tags, and gives authors complete control over typesetting if they so desire. That's not a feature. If people start doing typesetting manually, they will inevitably optimize only for the way in which they themselves are viewing the manpage.

Troff came from Bell Labs, so let's look at Plan 9's man(6); it should be a good example. The REQUESTS table looks great on the terminal, slightly less so in the PDF export, and in the HTML version... well, you can take a look at how fucked up it is.

See, rather than using the somewhat more "semantic" tbl preprocessor to lay out the table, whoever wrote that manpage used the "wonderful, programmable flexibility" to lay it out themselves. The header row is hard wrapped, because otherwise it would look bad on a narrow text terminal, but now it looks bad everywhere else. A manpage viewer can't even reliably figure out that this is a table in the first place, so the columns in the HTML render aren't even aligned.

Keep in mind this is from Bell Labs, the birthplace of roff.

The author acknowledges that "The quality of the rendered output is uneven [amongst formats]", but troff manpages are easily the worst offenders, as we've just seen. HTML exports from mdoc (e.g.), on the other hand, look fantastic!

Can we ask no more of our documentation system than that new pages come up when we click on the blue underlined text?

The author would love mdoc. Check out all the links on that page! You can use Sx to create links to sections, Xr to create links to other manpages, etc.

Plan 9 manpages just use IR instead. As in - they format the manpage name as italic, and the section as Roman. There's no semantic information there. man2html can still figure out the convention and create links between manpages, but that's obviously not ideal.
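To make the difference concrete, here is a rough Python sketch of what a viewer has to do in each case. The regexes and the find_xref helper are made up for illustration; this is not how man2html or mandoc are actually implemented, just the general shape of the problem, assuming cross-references are written as ".Xr name section" in mdoc and ".IR name (section)" in Plan 9 style pages.

```python
import re

# Hypothetical patterns, for illustration only. Real viewers such as
# man2html or mandoc use their own, far more thorough parsers.
XR = re.compile(r"^\.Xr\s+(\S+)\s+(\w+)")      # mdoc: .Xr name section
IR = re.compile(r"^\.IR\s+(\S+)\s+\((\w+)\)")  # man:  .IR name (section)


def find_xref(line):
    """Return (name, section, how) if the line looks like a cross-reference."""
    m = XR.match(line)
    if m:
        # The macro itself says "this is a manpage reference".
        return (m.group(1), m.group(2), "semantic")
    m = IR.match(line)
    if m:
        # Only italic + Roman text that happens to look like name(section).
        return (m.group(1), m.group(2), "guessed")
    return None


print(find_xref(".Xr ls 1"))
print(find_xref(".IR troff (1)"))
print(find_xref(".IR file (optional)"))  # also matches, but is not a manpage
```

The last call is the point: with IR the viewer can only guess from the formatting, so anything shaped like "word (word)" gets treated as a link candidate, while Xr states the intent outright.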

In general I'd argue that taking away some control from authors in favor of giving more control to manpage viewers is a worthwhile tradeoff. And if you just can't express something with semantic tags, it's much better to introduce new tags that can be rendered as appropriate in the given context than to hack your way around with manual formatting.

fanf

The quality of the rendered output is uneven. No system today is equally good at producing PDF and HTML output, and rendering usefully in a terminal window.

This problem is fixed by mandoc, which renders man pages to HTML beautifully. If you use semantic mdoc macros to write the documentation, then the HTML is nicely structured too.

Arguably Texinfo is OK at all three; I generally dislike Texinfo, except for the Emacs documentation, for some reason.

This article is making a similar argument to a previous one: https://lobste.rs/s/fwohal/man_pages_are_great_man_readers_are