The penultimate conditional syntax
34 points by fanf
34 points by fanf
I am huge fan of Kotlin’s low-tech solution, which is a combination of three things.
when {
x > 0 -> "postive"
x == 0 -> "zero"
x < 0 -> "negative"
}
(seriously, every language should have multiway if! if else chains are horribly unreadable, as they don’t visually separate conditions from actions, but complex multiway ifs is exactly where you want maximum readability)
is
pattern-matching expression, which combines with flow sensitive typing:if (animal is Cat && animal.meow().isLoud())
I think is
can be straightforwardly extended to unpacking, rather than just narrowing types:
if (animal is Cat(cat) && cat.meow().isLoud())
when (animal) {
is Cat -> "meow"
is Fox -> "???"
}
What is cool is that you can easily upgrade exhaustive match to an arbitrary complex condition:
when {
animal is Cat && animal.isFerral -> "deadly silence"
animal is Cat -> "meow"
animal is Fox -> "???"
}
You have to bind scrutinee to a variable and then repeat it in conditions, but that’s wasn’t a huge pain in practice (though this part might depend on type narrowing semantics, where you generally need fewer names to being with). My only two complains about Kotlin’s conditionals were that default formatting eats two levels of indentation, and that often binary when
managed to look better than an if
!
if (condition)
trueBranch
else
falseBranch
vs
when {
condition -> trueBranch
else -> falseBranch
}
which really is more of a complaint about if
:D
Though, I was surprised to learn that there’s a KEEP to add guards to Kotlin: https://github.com/Kotlin/KEEP/blob/guards/proposals/guards.md
Like, it is a feature that you don’t need guards because the desired semantics falls out naturally from the combination of other features! But I haven’t touched Kotlin for a long while! :0)
Not sure how I feel about UCS or PUCS! Definitely an important problem to solve, but somehow the solution doesn’t immediately appeal to me.
Thanks for those Kotlin examples, it looks fairly nice!
somehow the solution doesn’t immediately appeal to me.
Yeah, UCS erred on the side of too little clutter, but I think PUCS errs on the side of too much clutter. Your Kotlin examples make me wonder about omitting ||
entirely. I think it would end up a bit less uniform, but that’s probably good because complete homogeneity can hurt readability. It isn’t possible to get rid of ||
from general boolean expressions, and nested patterns still need some kind of |
… hmm, that edges it closer to Rust, heh.
It isn’t possible to get rid of || from general boolean expressions,
Another strong opinion, but and
and or
are sooo much better than || for && booleans — much easier to type, more distinct with syntax highlighting, and much clearer that there’s control flow in play.
I tend to agree! (I liked spelling them out when perpetrating perl.) But for this blog post I wanted to stick closer to the curly-brace tradition.
Be thankful I was not in a contrarian mood, I might have written /\
and/or \/
everywhere :-)
C# surprisingly has a lot of this now too, with switch expressions and is
patterns with optional binding. It really does simplify a lot of previous behaviours, especially with the optional binding. Besides some slight syntax differences, all of the above Kotlin examples work in C#.
Yeah, I was surprised a few weeks ago when I discovered how expressive C#‘s pattern syntax has gotten. For those wondering, here’s a reference: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/patterns. Relational and property patterns are the interesting ones. It also has is
expressions, which can bind variables: https://learn.microsoft.com/en-us/dotnet/csharp/language-reference/operators/is.
I also like Kotlin’s solution. I always thought it can be extended to have some syntax similar to UCS, such as
when (age) {
> 65 -> "elderly"
< 18 -> "children"
else -> "adult"
}
especially when you need to make some method calls
when (user) {
// Shorthand for `user.deleted`
deleted -> ...
age > 65 -> ...
age < 18 -> ...
// Or
getAge() < 18 -> ...
}
I like that form as well. I implemented something akin to it in my toy lisp except that the when
is scoped to the variable to test:
(cond v
((\ X (and (isCat? X) (isFerral? X)) . "deadly silence")
(isCat? . "meow")
(isFox? . "???")
(_ . "fallback")
)
I really like the Clojure different ways of doing this depending on the case:
(case x
v1 (code-for-x-equal-v1)
v2 (code-for-x-equal-v2)
(code-for-else-case))
(cond
(predicate-1) (code-if-predicate-1-is-true)
(predicate-2) (code-if-predicate-2-is-true)
:else ;; could be anything not nil or false
(code-for-else-case))
(condp > x
12 (code-if-x>12)
5 (code-if-x>5)
(code-for-unexpected-case))
(cond-> x
(predicate-1) (fn1)
(predicate-2) (fn2)
(predicate-3) (fn3))
;; if predicate-1 and predicate-3 are true but not predicate-2
;; this will return (fn3 (fn1 x))
more example of condp here https://clojuredocs.org/clojure.core/condp
Interesting idea, but I’m not sure why you would want to unify if
and match
. At least the way I designed them in Fir (which is not that unique actually), they serve different use cases:
match
does exhaustiveness checking, and needs an “else” branch if exhaustiveness of the patterns can’t be proven by the compiler. Since match
always returns a value, it can be used as an expression or statement.if
doesn’t do exhaustiveness checking and does not require an “else” branch. When it doesn’t have an “else” branch, it cannot be used as an expression.These differences make match
more useful for things like parsing user input and handling all outputs. If you change the parser to return more kinds of outputs, you want to make sure those are handled in the call sites.
And if
for things like handling one specific case specially.
We also have is
expressions (with Bool
value) in Fir. The way they propagate bound variables to the enclosing expressions is simple:
&&
(boolean and), variables bound on the left are propagated to the right and up.&&
(boolean and), variables bound on the right are propagated up.if
branches, variables propagated to the condition are bound in the branch body.match
branches, variables propagated to the guards are bound in the branch body.In principle we could also support ||
(boolean or) expressions: if left binds x : Ty
and right binds x : Ty
(i.e. same variable to the same type), propagate it up.
However I haven’t found any cases where the existing rules aren’t enough to allow what I want to do.
Yeah, exhaustiveness checking is important. I didn’t discuss it because the notes were already long enough, so I left out type-related discussion. I think a reasonable rule is that a multi-way pattern is { … }
must be exhaustive unless it’s followed by an ||
alternative.
Those Fir scoping rules are neat! That’s a great solution to the problem I was struggling with. (Not strictly lexical, though, but reasonably principled I think.)
I was reminded of “the ultimate conditional syntax” a couple of weeks ago, at which time I had forgotten some of what bothered me about it. I didn’t have a clear idea of what made it seem over-complicated. But it’s been floating around in my peripheral consciousness and eventually a bright idea breached the surface, that resolved much of the botheration and complication.
Personally, I don’t see the great appeal of combining pattern matching with if
. What problem are we really solving with these new syntaxes? The syntax of match
or switch
in most languages with pattern languages works great, if
is some syntactic sugar for a very common pattern: matching on the simplest sum, the boolean. If anything, the problem -such as is it is- is that the intermediary if let
that Rust and Swift have falls into a weird spot between them so that it does both more and less than if
.
I really don’t like scoping that isn’t strictly lexical; I think it should be as easy as possible to understand what scope your variable will be available for when you declare it.
There is one big difference between if
and match
that makes keeping them separate interesting in my opinion: with if
, you know that the conditional must have the type bool
, whereas with match
you don’t know the type of the “scrutinee”. This difference is a space to play with in language design.
Most languages attempt to use the patterns as input to infer the type, but there is another option that I suspect may be better: treat match
as a projection operator (like the field access .
operator) and require that the type of the scrutinee be known from other context. This would completely solve the problem of needing to scope the variants in patterns, because if you already know the type of the scrutinee you already know the acceptable patterns. I think Swift may do something like this, but I’m not clear on exactly how its case .variant
feature works.
The appeal for me is that it replaces a bunch of ad-hoc syntax and complicated rules with a few relatively simple orthogonal parts. It’s partly a reaction against the increasing complexity discussed by Yoshua Wuyts the other week, and the way that the various proposals to make Rust’s syntax more uniform turn out to conflict with each other because of the non-uniform starting point and ad-hoc growth.
But as I wrote in my notes, it’s true that syntax can be too uniform. There are reasons Lisp and Tcl are less popular! I dunno where this idea lands, but it’s worth playing around to find out.
Re scope, it is strictly lexical: in both UCS and PUCS the scope of a variable bound in a pattern is just the consequent of that pattern. The trick is to make the consequent usable as a pattern guard, or the rest of a boolean expression, or as a then
clause, or all three.
I maybe should have written more about types. I didn’t make it very clear that in PUCS the expression following the if
keyword is still just a boolean expression. (The keyword just allows you to use then
and else
clauses in the expression.) The “is
” expression does most of what match
does but in a manner that scales down to smaller fragments. However, by itself “is
” is just a test with a boolean result; to return a different type you put it in an if
.
I like your thought about type inference. Is that how Zig supports its unscoped .member
names? Swift has a reputation for erring on the side of too much inference so I would be slightly surprised if its case variant
relied on judiciously limited inference instead :-)
I like your thought about type inference. Is that how Zig supports its unscoped .member names?
Yeah:
Zig type inference in general worth studying (some notes here), it is interesting:
std.mem.eql(u8, xs, ys)
is a common offender, but that’s just an artifact of library design, it could have been xs.eql(ys)
).I am not sure how much of that is a consequence of language not having closures though.
But is doesn’t do unification! It’s a simple, direct, single pass algorithm
Does Zig have header files or similar? How does it collect data types and function types in single pass so that you’ll have all the types you need when checking a function body?
Also, a nit, but unification doesn’t make a type checker multi-pass. If you look at the Damas-Hindley-Milner implementations in TaPL, or “Typing Haskell in Haskell”, or similar, you’ll see that they visit each expression once.
Let me try to make terminology more precise: Zig type inference is on-line, expression type is known before the compiler sees the rest of expressions of the function.
Maybe phrasing it as “greedy” vs “lazy” would be useful? I usually see those terms used when talking about search stuff…
It’s quite unconventional and unrelated to pattern matching, but the most elegant conditional syntax I ever came across was when writing some Unicon. Its expression system allows to directly write mathematical expressions like if 0 < x < 5
, instead of having to rewrite it as the old cumbersome if 0 < x && x < 5
dance (or the worse code-golfing alternative).
The way this works is because everything in Unicon is an expression. A Unicon expression either fails or returns a value. The comparison operator has been conveniently designed to return the right hand side if it doesn’t fail.
I wish some modern languages would have played around with this concept more. It’s a bit weird to wrap your head around first, but if the returned value is chosen wisely this system allows for quite some elegant expressions: https://btiffin.users.sourceforge.net/up/expressions.html
It’s interesting, I’ve been experimenting with combining case
and cond
forms in my little ML dialect, with the latter being guard clauses of case sans an initial condition. That’s one thing I’ve tried to experiment the most with: how do we not only combine case and cond, but also all the various conditional types you can have with pattern matching. coastML is heading towards (in a private branch):
case x
| (Foo.Bar (_ > 10)) { ... }
| (Foo.Bar z) { ... }
| (Foo.Baz y) { ... }
| _ { ... };;
but I need to read the original UCS paper now, since that could be interesting…
As someone that have dabbled in related questions, and also looked at UCS, I found your post well-done and interesting. Thanks!
Thank you for the kind words! and thanks for sharing that interesting discussion. Your proposal is very similar!
This example is interesting,
let foo () = match f () with
| Some y -> (match g y with
| A x -> (match h x with
| Ok result -> result
| Error _ -> raise Not_found
)
| _ -> raise Not_found
)
| None -> raise Not_found
The nested matches have to be exhaustive at every level, whereas the UCS wanted to require exhaustiveness for the condition as a whole.
There’s the shorter proposed OCaml version,
let foo () = match f () with
| Some y
and (g y with A x
and (h x with Ok result)) -> result
| _ -> raise Not_found
In my Rust-flavoured PUCS sketch that would be written
fn foo() {
if f() is Some(y)
&& g(y) is A(x)
&& h(x) is Ok(result) => result
|| => raise Not_found
}
Which makes me suspicious: it’s clear that this kind of chain of conditions gains terseness at the cost of less accurate exhaustiveness checking. In this case I think the brevity is a good result, but like the discussion in your link I think it also shows that programmers might have to keep in mind when a nested condition must be exhaustive independent of its outer condition.
In PUCS you do that by changing &&
to => if
(which might need {block} braces to disambiguate). But it’s hard to tell how error-prone lack of exhaustiveness for patterns after && might be.
I think nested => if
also addresses the question of irrefutable patterns, because if
has to be exhaustive, so if scrutinee is pattern => consequent;
will provoke an error when the pattern is refutable and there’s no else clause. (Perhaps this suggests there ought to be a when
for single-legged refutable conditions? dunno.)
Things like this example, when e2 might have side-effects:
match (e1 : int option) with
| None -> ()
| Some p and e2 with None -> ()
| Some p and e2 with Some _ -> ()
In PUCS it would be,
if e1 is {
|| None => ()
|| Some(p) && e2 is None => ()
|| Some(p) && e2 is Some(_) => ()
}
which is supposed to be obviously different, in terms of the number of times and conditions in which e2 is evaluated, from
if e1 is {
|| None => ()
|| Some(p) && e2 is Some(_) => ()
|| Some(p) && e2 is None => ()
}
and
if e1 is {
|| None => ()
|| Some(p) && e2 is {
|| None => ()
|| Some(_) => ()
}
}
hopefully in a way that makes sense to programmers familiar with short-cutting && and ||.
I just want to shout out SQL’s case
stuff, for better or for worse
case val when between x and y then …
AFAIKval between x and y
to check the very common inclusive ranget.id in (select id from Foo)
val like ‘%wildcard%'
for matching stringsval ~ ‘regexp’
(other DBs, similar)let … in …
bindings, though MySQL variables are basically as good (?)Not optimal, sure. Next to the spaceship that is Kotlin, it’s like an old Datsun. Dare I say, DeLorean. But it’s not all bad!
I apologize if any of the above is incorrect, there’s just so much I don’t know about so many dialects