The Unreasonable Effectiveness of ProseMirror Model in Rich Text Transformation

13 points by smoores


isuffix

I've been dealing with a very similar problem in the Typst parser recently. In our concrete syntax tree, we assign each node a unique number, called a span, which is more stable across edits than a plain textual range. This lets us use spans as part of the input to our incremental engine.

But this means that there's currently no way for us to address text that isn't uniquely contained in a node of the tree. We want that capability to improve the fidelity of our error and warning diagnostics, so they can target text that exists within some node or continues across multiple nodes instead of just targeting the largest containing node.

My solution is sub-ranges: a range that is relative to a given span and targets text underneath it. Sub-ranges can use smaller indices than normal ranges and, importantly, do not need updating when unrelated text is edited. They still have to be translated to absolute ranges eventually, but so do our normal spans, so that cost is shared.
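
A rough sketch of the sub-range idea, in Python for brevity. This is not Typst's actual implementation: `SubRange`, `resolve`, and the `span_starts` table are hypothetical names, and the sketch assumes some other part of the system can report each span's current absolute start offset.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class SubRange:
    span: int   # hypothetical stable id of a syntax node
    start: int  # offset relative to the node's first character
    end: int    # exclusive end, also relative to the node


def resolve(sub: SubRange, span_starts: Dict[int, int]) -> Tuple[int, int]:
    """Translate a span-relative range into absolute source offsets."""
    base = span_starts[sub.span]
    return (base + sub.start, base + sub.end)


# The node with span 7 is the expression "1 + foo", starting at offset 8.
source = "let x = 1 + foo"
span_starts = {7: 8}

# A diagnostic that targets only "foo" inside that node.
diag = SubRange(span=7, start=4, end=7)
lo, hi = resolve(diag, span_starts)
assert source[lo:hi] == "foo"

# Editing unrelated text before the node moves the node's absolute start,
# but the sub-range itself is reused unchanged.
edited = "let long_name = 1 + foo"
edited_starts = {7: 16}
lo2, hi2 = resolve(diag, edited_starts)
assert edited[lo2:hi2] == "foo"
```

The relative offsets only need to be as large as the node they point into, and only the span table has to change when text elsewhere is edited.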

dlants

I've spent a lot of time with ProseMirror, and both it and CodeMirror are brilliant! Marijn did a great job with both projects, and at every place I've worked that ended up using it, I have advocated sending contributions his way: https://marijnhaverbeke.nl/fund/

Both projects have some really nice design decisions, including the document format + unique positions you highlight here. Another is the flat, rather than nested, mark setup (a <b>text<i>more</i>text</b> becomes something like [{text, b}, {more, b, i}, {text, b}]). And of course the orientation around real-time, collaborative editing.
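
To make the flat-mark idea concrete, here is a minimal sketch in Python. It is not ProseMirror's API: the nested input format (a `(tag, children)` tuple) and the `flatten` helper are made up for illustration. It just shows how nested formatting collapses into a flat run of text pieces, each carrying its full set of marks.

```python
def flatten(node, active=()):
    """Turn a nested (tag, children) tree into a flat list of (text, marks)."""
    if isinstance(node, str):
        # A text leaf carries every mark that is active at this depth.
        return [(node, active)]
    tag, children = node
    out = []
    for child in children:
        out.extend(flatten(child, active + (tag,)))
    return out


# <b>text<i>more</i>text</b>
nested = ("b", ["text", ("i", ["more"]), "text"])
flat = flatten(nested)

assert flat == [
    ("text", ("b",)),
    ("more", ("b", "i")),
    ("text", ("b",)),
]
```

With this shape, finding out which marks apply to a piece of text is a lookup on that piece rather than a walk up a tree of wrappers.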