Writing that changed how I think about PL
55 points by surprisetalk
55 points by surprisetalk
Parsing expressions by precedence climbing
A good one! I feel like a lot of recursive descent parser authors end up writing something like it after deciding life is too short for 15 separate functions that all look the same. It’s used in TLA+, a weird language where operators have precedence ranges. The pattern handles that gracefully too, you just recurse with the precedence parameter set to the upper end of the currently-matched-operator’s precedence range.
TLA+, a weird language where operators have precedence ranges
Precedence ranges? That’s a new thing to me; a quick search yields:
The relative precedence of two operators is unspecified if their ranges overlap.
on the page 7 of https://lamport.azurewebsites.net/tla/summary-standalone.pdf along with a table of the ranges.
I wonder what’s the reason behind this choice.
Assuming it’s an error when it’s unspecified my guess would be the intention is to force the user to use parenthesis to clarify In circumstances where the intention isn’t obvious by the usual PEMDAS rules
You are correct about it being a parse error when the precedence ranges of two operators overlap.
Funnily enough talking about PEMDAS rules, TLA+ gives subtraction higher precedence than addition. This was discussed on the mailing list and while we couldn’t figure out why, it does not seem likely it would cause any problems: https://groups.google.com/g/tlaplus/c/Bt1fl8GvizE/m/ohYTfQWiCgAJ
Heh, the first thing I thought of when I read that description was Lamport’s happens-before relation, which has similar semantics (the total ordering of events in a distributed system is generally not defined, but there can be clear partial orderings)
I ended up using the Shunting Yard algorithm for precedence. It’s easy to implement, and it handles left and right associativity, making it easy to add new operators in my expression parser in my 6809 assembler (here’s the precedence table and the top level expression parsing function).
Is the shunting yard algorithm the same as recursive descent parsing except with the stack explicitly managed instead of using the runtime’s call stack?
Both. You can keep the recursive descent part, but Instead of embedding the precedence in call stack, like:
expr = term [ '*' term ]*
term = bool [ '+' bool ]*
bool = factor [ '&' factor ]*
factor = DIGITS
Instead, you can simplify it like:
expr = factor [ op factor ]*
op = '+' / '*' / '&'
factor = DIGITS
and in the function that handles expr
, you manage the precedence stack there (as you can see in the example I linked to).