The penultimate conditional syntax

34 points by fanf

matklad

I am huge fan of Kotlin’s low-tech solution, which is a combination of three things.

Multiway if:

when {
   x > 0 -> "postive"
   x == 0 -> "zero"
   x < 0 -> "negative"
}

(seriously, every language should have multiway if! if else chains are horribly unreadable, as they don’t visually separate conditions from actions, but complex multiway ifs is exactly where you want maximum readability)

is pattern-matching expression, which combines with flow sensitive typing:

if (animal is Cat && animal.meow().isLoud())

I think is can be straightforwardly extended to unpacking, rather than just narrowing types:

if (animal is Cat(cat) && cat.meow().isLoud())

Exhaustive matching which is syntactically similar to the combination of the two:

when (animal) {
  is Cat -> "meow"
  is Fox -> "???"
}

What is cool is that you can easily upgrade exhaustive match to an arbitrary complex condition:

when {
  animal is Cat && animal.isFerral -> "deadly silence"
  animal is Cat -> "meow"
  animal is Fox -> "???"
}

You have to bind scrutinee to a variable and then repeat it in conditions, but that’s wasn’t a huge pain in practice (though this part might depend on type narrowing semantics, where you generally need fewer names to being with). My only two complains about Kotlin’s conditionals were that default formatting eats two levels of indentation, and that often binary when managed to look better than an if!

if (condition) 
  trueBranch
else 
  falseBranch

vs

when {
  condition -> trueBranch
  else -> falseBranch
}

which really is more of a complaint about if :D

Though, I was surprised to learn that there’s a KEEP to add guards to Kotlin: https://github.com/Kotlin/KEEP/blob/guards/proposals/guards.md

Like, it is a feature that you don’t need guards because the desired semantics falls out naturally from the combination of other features! But I haven’t touched Kotlin for a long while! :0)

Not sure how I feel about UCS or PUCS! Definitely an important problem to solve, but somehow the solution doesn’t immediately appeal to me.

fanf

Thanks for those Kotlin examples, it looks fairly nice!

somehow the solution doesn’t immediately appeal to me.

Yeah, UCS erred on the side of too little clutter, but I think PUCS errs on the side of too much clutter. Your Kotlin examples make me wonder about omitting || entirely. I think it would end up a bit less uniform, but that’s probably good because complete homogeneity can hurt readability. It isn’t possible to get rid of || from general boolean expressions, and nested patterns still need some kind of | … hmm, that edges it closer to Rust, heh.
- matklad
  
  It isn’t possible to get rid of || from general boolean expressions,
  
  Another strong opinion, but and and or are sooo much better than || for && booleans — much easier to type, more distinct with syntax highlighting, and much clearer that there’s control flow in play.
  - fanf
    
    I tend to agree! (I liked spelling them out when perpetrating perl.) But for this blog post I wanted to stick closer to the curly-brace tradition.
    
    Be thankful I was not in a contrarian mood, I might have written /\ and/or \/ everywhere :-)
- jessicah
  
  C# surprisingly has a lot of this now too, with switch expressions and is patterns with optional binding. It really does simplify a lot of previous behaviours, especially with the optional binding. Besides some slight syntax differences, all of the above Kotlin examples work in C#.
  - osa1
    
    Yeah, I was surprised a few weeks ago when I discovered how expressive C#‘s pattern syntax has gotten. For those wondering, here’s a reference: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/patterns. Relational and property patterns are the interesting ones. It also has is expressions, which can bind variables: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/is.
- dogacel
  I also like Kotlin’s solution. I always thought it can be extended to have some syntax similar to UCS, such as
  
  when (age) { > 65 -> "elderly" < 18 -> "children" else -> "adult" }
  
  especially when you need to make some method calls
  
  when (user) { // Shorthand for `user.deleted` deleted -> ... age > 65 -> ... age < 18 -> ... // Or getAge() < 18 -> ... }
- xguerin
  I like that form as well. I implemented something akin to it in my toy lisp except that the when is scoped to the variable to test:
  
  (cond v ((\ X (and (isCat? X) (isFerral? X)) . "deadly silence") (isCat? . "meow") (isFox? . "???") (_ . "fallback") )
  - yogsototh
    
    I really like the Clojure different ways of doing this depending on the case:
    
    (case x v1 (code-for-x-equal-v1) v2 (code-for-x-equal-v2) (code-for-else-case)) (cond (predicate-1) (code-if-predicate-1-is-true) (predicate-2) (code-if-predicate-2-is-true) :else ;; could be anything not nil or false (code-for-else-case)) (condp > x 12 (code-if-x>12) 5 (code-if-x>5) (code-for-unexpected-case)) (cond-> x (predicate-1) (fn1) (predicate-2) (fn2) (predicate-3) (fn3)) ;; if predicate-1 and predicate-3 are true but not predicate-2 ;; this will return (fn3 (fn1 x))
    
    more example of condp here https://clojuredocs.org/clojure.core/condp
- osa1
  Interesting idea, but I’m not sure why you would want to unify if and match. At least the way I designed them in Fir (which is not that unique actually), they serve different use cases:
  
  match does exhaustiveness checking, and needs an “else” branch if exhaustiveness of the patterns can’t be proven by the compiler. Since match always returns a value, it can be used as an expression or statement.
  
  if doesn’t do exhaustiveness checking and does not require an “else” branch. When it doesn’t have an “else” branch, it cannot be used as an expression.
  
  These differences make match more useful for things like parsing user input and handling all outputs. If you change the parser to return more kinds of outputs, you want to make sure those are handled in the call sites.
  
  And if for things like handling one specific case specially.
  
  We also have is expressions (with Bool value) in Fir. The way they propagate bound variables to the enclosing expressions is simple:
  
  In && (boolean and), variables bound on the left are propagated to the right and up.
  
  In && (boolean and), variables bound on the right are propagated up.
  
  In if branches, variables propagated to the condition are bound in the branch body.
  
  In match branches, variables propagated to the guards are bound in the branch body.
  
  In principle we could also support || (boolean or) expressions: if left binds x : Ty and right binds x : Ty (i.e. same variable to the same type), propagate it up.
  
  However I haven’t found any cases where the existing rules aren’t enough to allow what I want to do.
  - fanf
    
    Yeah, exhaustiveness checking is important. I didn’t discuss it because the notes were already long enough, so I left out type-related discussion. I think a reasonable rule is that a multi-way pattern is { … } must be exhaustive unless it’s followed by an || alternative.
    
    Those Fir scoping rules are neat! That’s a great solution to the problem I was struggling with. (Not strictly lexical, though, but reasonably principled I think.)
- fanf
  
  previously
  
  I was reminded of “the ultimate conditional syntax” a couple of weeks ago, at which time I had forgotten some of what bothered me about it. I didn’t have a clear idea of what made it seem over-complicated. But it’s been floating around in my peripheral consciousness and eventually a bright idea breached the surface, that resolved much of the botheration and complication.
- withoutboats
  
  Personally, I don’t see the great appeal of combining pattern matching with if. What problem are we really solving with these new syntaxes? The syntax of match or switch in most languages with pattern languages works great, if is some syntactic sugar for a very common pattern: matching on the simplest sum, the boolean. If anything, the problem -such as is it is- is that the intermediary if let that Rust and Swift have falls into a weird spot between them so that it does both more and less than if.
  
  I really don’t like scoping that isn’t strictly lexical; I think it should be as easy as possible to understand what scope your variable will be available for when you declare it.
  
  There is one big difference between if and match that makes keeping them separate interesting in my opinion: with if, you know that the conditional must have the type bool, whereas with match you don’t know the type of the “scrutinee”. This difference is a space to play with in language design.
  
  Most languages attempt to use the patterns as input to infer the type, but there is another option that I suspect may be better: treat match as a projection operator (like the field access . operator) and require that the type of the scrutinee be known from other context. This would completely solve the problem of needing to scope the variants in patterns, because if you already know the type of the scrutinee you already know the acceptable patterns. I think Swift may do something like this, but I’m not clear on exactly how its case .variant feature works.
  - fanf
    
    The appeal for me is that it replaces a bunch of ad-hoc syntax and complicated rules with a few relatively simple orthogonal parts. It’s partly a reaction against the increasing complexity discussed by Yoshua Wuyts the other week, and the way that the various proposals to make Rust’s syntax more uniform turn out to conflict with each other because of the non-uniform starting point and ad-hoc growth.
    
    But as I wrote in my notes, it’s true that syntax can be too uniform. There are reasons Lisp and Tcl are less popular! I dunno where this idea lands, but it’s worth playing around to find out.
    
    Re scope, it is strictly lexical: in both UCS and PUCS the scope of a variable bound in a pattern is just the consequent of that pattern. The trick is to make the consequent usable as a pattern guard, or the rest of a boolean expression, or as a then clause, or all three.
    
    I maybe should have written more about types. I didn’t make it very clear that in PUCS the expression following the if keyword is still just a boolean expression. (The keyword just allows you to use then and else clauses in the expression.) The “is” expression does most of what match does but in a manner that scales down to smaller fragments. However, by itself “is” is just a test with a boolean result; to return a different type you put it in an if.
    
    I like your thought about type inference. Is that how Zig supports its unscoped .member names? Swift has a reputation for erring on the side of too much inference so I would be slightly surprised if its case variant relied on judiciously limited inference instead :-)
    
    matklad
    
    I like your thought about type inference. Is that how Zig supports its unscoped .member names?
    
    Yeah:
    
    https://ziglang.org/documentation/master/#Result-Types
    
    https://ziglang.org/download/0.14.0/release-notes.html#Decl-Literals
    
    Zig type inference in general worth studying (some notes here), it is interesting:
    
    The annotation burden is low (std.mem.eql(u8, xs, ys) is a common offender, but that’s just an artifact of library design, it could have been xs.eql(ys)).
    
    But is doesn’t do unification! It’s a simple, direct, single pass algorithm
    
    And function signatures are fully explicit, there’s isn’t anything analogous to Rust’s impl trait syntax leaking Send-ness
    
    I am not sure how much of that is a consequence of language not having closures though.
    
    osa1
    
    But is doesn’t do unification! It’s a simple, direct, single pass algorithm
    
    Does Zig have header files or similar? How does it collect data types and function types in single pass so that you’ll have all the types you need when checking a function body?
    
    Also, a nit, but unification doesn’t make a type checker multi-pass. If you look at the Damas-Hindley-Milner implementations in TaPL, or “Typing Haskell in Haskell”, or similar, you’ll see that they visit each expression once.
    
    matklad
    
    Let me try to make terminology more precise: Zig type inference is on-line, expression type is known before the compiler sees the rest of expressions of the function.
    
    icefox
    
    Maybe phrasing it as “greedy” vs “lazy” would be useful? I usually see those terms used when talking about search stuff…
    
    vimpostor
    
    It’s quite unconventional and unrelated to pattern matching, but the most elegant conditional syntax I ever came across was when writing some Unicon. Its expression system allows to directly write mathematical expressions like if 0 < x < 5, instead of having to rewrite it as the old cumbersome if 0 < x && x < 5 dance (or the worse code-golfing alternative).
    
    The way this works is because everything in Unicon is an expression. A Unicon expression either fails or returns a value. The comparison operator has been conveniently designed to return the right hand side if it doesn’t fail.
    
    I wish some modern languages would have played around with this concept more. It’s a bit weird to wrap your head around first, but if the returned value is chosen wisely this system allows for quite some elegant expressions: https://btiffin.users.sourceforge.net/up/expressions.html
    
    lojikil
    
    It’s interesting, I’ve been experimenting with combining case and cond forms in my little ML dialect, with the latter being guard clauses of case sans an initial condition. That’s one thing I’ve tried to experiment the most with: how do we not only combine case and cond, but also all the various conditional types you can have with pattern matching. coastML is heading towards (in a private branch):
    
    case x | (Foo.Bar (_ > 10)) { ... } | (Foo.Bar z) { ... } | (Foo.Baz y) { ... } | _ { ... };;
    
    but I need to read the original UCS paper now, since that could be interesting…
    
    gasche
    
    As someone that have dabbled in related questions, and also looked at UCS, I found your post well-done and interesting. Thanks!
    
    fanf
    
    Thank you for the kind words! and thanks for sharing that interesting discussion. Your proposal is very similar!
    
    This example is interesting,
    
    let foo () = match f () with | Some y -> (match g y with | A x -> (match h x with | Ok result -> result | Error _ -> raise Not_found ) | _ -> raise Not_found ) | None -> raise Not_found
    
    The nested matches have to be exhaustive at every level, whereas the UCS wanted to require exhaustiveness for the condition as a whole.
    
    There’s the shorter proposed OCaml version,
    
    let foo () = match f () with | Some y and (g y with A x and (h x with Ok result)) -> result | _ -> raise Not_found
    
    In my Rust-flavoured PUCS sketch that would be written
    
    fn foo() { if f() is Some(y) && g(y) is A(x) && h(x) is Ok(result) => result || => raise Not_found }
    
    Which makes me suspicious: it’s clear that this kind of chain of conditions gains terseness at the cost of less accurate exhaustiveness checking. In this case I think the brevity is a good result, but like the discussion in your link I think it also shows that programmers might have to keep in mind when a nested condition must be exhaustive independent of its outer condition.
    
    In PUCS you do that by changing && to => if (which might need {block} braces to disambiguate). But it’s hard to tell how error-prone lack of exhaustiveness for patterns after && might be.
    
    I think nested => if also addresses the question of irrefutable patterns, because if has to be exhaustive, so if scrutinee is pattern => consequent; will provoke an error when the pattern is refutable and there’s no else clause. (Perhaps this suggests there ought to be a when for single-legged refutable conditions? dunno.)
    
    Things like this example, when e2 might have side-effects:
    
    match (e1 : int option) with | None -> () | Some p and e2 with None -> () | Some p and e2 with Some _ -> ()
    
    In PUCS it would be,
    
    if e1 is { || None => () || Some(p) && e2 is None => () || Some(p) && e2 is Some(_) => () }
    
    which is supposed to be obviously different, in terms of the number of times and conditions in which e2 is evaluated, from
    
    if e1 is { || None => () || Some(p) && e2 is Some(_) => () || Some(p) && e2 is None => () }
    
    and
    
    if e1 is { || None => () || Some(p) && e2 is { || None => () || Some(_) => () } }
    
    hopefully in a way that makes sense to programmers familiar with short-cutting && and ||.
    
    unwind
    
    I just want to shout out SQL’s case stuff, for better or for worse
    
    pro: it’s an expression, not control flow, so you can sum(…) it or whatever
    
    con: it’s quite verbose next to a C-style ternary
    
    pro: it’s got both the simple value-match style & the arbitrary-booleans style
    
    con: you can’t really mix the two a la case val when between x and y then … AFAIK
    
    pro: it’s got a val between x and y to check the very common inclusive range
    
    con: it doesn’t help with inclusive-exclusive ranges, and so on
    
    pro: it’s word-based, no noisy characters (preference, sure)
    
    con: it doesn’t have a shorthand for is-a-subtype a la t.id in (select id from Foo)
    
    pro: it’s got val like ‘%wildcard%' for matching strings
    
    con: it doesn’t, by the standard, match regular expressions, just wildcard expressions
    
    pro: PostgreSQL supports regexp matching a la val ~ ‘regexp’ (other DBs, similar)
    
    con: no standard let … in … bindings, though MySQL variables are basically as good (?)
    
    Not optimal, sure. Next to the spaceship that is Kotlin, it’s like an old Datsun. Dare I say, DeLorean. But it’s not all bad!
    
    I apologize if any of the above is incorrect, there’s just so much I don’t know about so many dialects
    
    gasche
    
    Comment removed by author