The Defer Technical Specification: It Is Time
85 points by bakaq
This would be such a boon to poor, deprived C. I hope it makes it.
Good luck using it before 2040, though.
Note that the call to action here is to implement the specification in your compilers and write up experience reports, and make sure those experience reports make it to the committee.
Oh, I wouldn’t use it myself! I use C++. But I accept that there are folks who won’t give up C, and at least this way they can write code that’s a bit safer.
I haven’t used Go enough to stumble on this, but from what this article says it hoists the defer execution to function scope. I have no idea why someone would want this, keeping things local seems so much cleaner, safer and more expressive.
It’s handy in situations where you need to conditionally defer, like inside an “if” block that acquires a resource under some conditions, that gets used by the rest of the function.
But it’s not hard to do this with block-scoped defer, you just move the defer to function scope and put a conditional in it.
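A minimal sketch of that pattern in Go, where defer is already function-scoped: the resource variable lives at function scope, and the single deferred closure checks whether anything was acquired. The `resource` type and the `run` function are hypothetical names made up for this illustration.

```go
package main

import "fmt"

// resource is a hypothetical stand-in for something that must be released.
type resource struct {
	name     string
	released *[]string // records release calls so they can be observed
}

func (r *resource) release() { *r.released = append(*r.released, r.name) }

// run acquires the resource only when needCache is true. The one defer sits
// at function scope and decides at exit whether there is anything to release.
func run(needCache bool, released *[]string) string {
	var cache *resource // stays nil unless the branch below acquires it
	defer func() {
		if cache != nil {
			cache.release()
		}
	}()

	if needCache {
		cache = &resource{name: "cache", released: released}
	}

	// The rest of the function can use cache if it was acquired.
	if cache != nil {
		return "used " + cache.name
	}
	return "no cache"
}

func main() {
	var released []string
	first := run(true, &released)
	fmt.Println(first, released)
	second := run(false, &released)
	fmt.Println(second, released)
}
```

The same shape should carry over to a block-scoped defer: the conditional moves into the deferred code instead of wrapping the defer statement.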
Yeah, I use Go a lot, and I don’t love defer for that reason. If I’m looping over a bunch of files, I don’t want to defer closing them until the end of the function–I want to close them right away to avoid keeping hundreds of file handles open. So this usually requires creating a new function for the loop body, which is tedious.
Honestly, I never thought about it until reading this article. I always looked at defer as a function cleanup rather than a scope cleanup. If I were asked to review the IIFE example in the article, my first thought would be to refactor it to something like this:
func updateX(m *sync.Mutex) {
    m.Lock()
    defer m.Unlock()
    x = x + 1
}
func work(wg *sync.WaitGroup, m *sync.Mutex) {
    defer wg.Done()
    for i := 0; i < 42; i++ {
        updateX(m)
    }
}
IMO, this behavior encourages you to write smaller, clearer functions, but I can definitely see how it’s a footgun.
In Go I often want to write something like this:
for _, path := range paths {
    f, err := os.Open(path)
    if err != nil { return err }
    defer f.Close()
    // do thing with file
}
This will leak file handles and error when len(paths) exceeds the system’s per-process open file-handle limit, so we really want the defer to apply to the loop body scope. So instead we have to write something like this:
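for _, path := range paths {
    // The closure must declare an error result for "return err" to compile.
    err := func(path string) error {
        f, err := os.Open(path)
        if err != nil { return err }
        defer f.Close() // runs when this closure returns, once per iteration
        // do thing with file
        return nil
    }(path)
    if err != nil { return err }
}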
Not a huge deal, but it’s tedious and somewhat error prone. I think I would prefer a block-scoped defer.
On the other hand, I can imagine there are also situations where you really do want the defer to happen after the loop is finished and not during each iteration, so maybe having it be function-scoped is a good compromise there? You have more control over the scope by inserting IIFEs when you explicitly want the defer to only be scoped to a specific block.
If you only had block-scoped defers, I don’t think it would be possible to write something that behaves the same as your first example does now, at least not without manually calling the defers yourself.
Honestly, no, I can’t imagine any situation where I would want every iteration of a loop to queue up a new defer and then they all execute in reverse order after the end of the scope. That doesn’t mean such a situation doesn’t exist, but I’ve certainly never encountered anything like it from what I can recall.
However I can think of a really easy way to implement it, should it become necessary: make an array of functions, defer looping over and calling every function in reverse order, then in the loop, add a function to the array.
And if you do it this way it is explicit! Which Go loves!
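A minimal sketch of that explicit approach in Go. The `collect` function and its event strings are hypothetical, chosen only to make the ordering visible.

```go
package main

import "fmt"

// collect queues one cleanup per item inside the loop and runs all of them in
// reverse order when the function returns, using a single defer.
func collect(items []string) (events []string) {
	var cleanups []func()
	defer func() {
		for i := len(cleanups) - 1; i >= 0; i-- {
			cleanups[i]()
		}
	}()

	for _, item := range items {
		item := item // per-iteration copy, needed before Go 1.22
		events = append(events, "open "+item)
		cleanups = append(cleanups, func() {
			events = append(events, "close "+item)
		})
	}
	return events
}

func main() {
	fmt.Println(collect([]string{"a", "b", "c"}))
	// prints: [open a open b open c close c close b close a]
}
```

Because `events` is a named result, the appends made by the deferred cleanups are visible to the caller.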
Yeah, I can’t really understand why Go made defer function scope. It really seems like an overly complex solution to a problem that almost no one had that is a real footgun.
I tend to write functions that focus on one task, so I just can’t really imagine not writing a doStuff(path string) error function and calling it in a loop. Plus that makes it easier to switch to a concurrent system if you end up needing that. I can see how it’s a concern, but it just feels like operating on one file = one function to me.
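A rough sketch of the "one file = one function" shape. To keep it self-contained and observable, it uses a hypothetical `pool`/`handle` pair instead of real files; the counters exist only to show how many handles are open at once, and the `doStuff` signature is adapted to take the pool.

```go
package main

import "fmt"

// pool is a hypothetical stand-in for the OS file table. It tracks how many
// handles are currently open and the highest number open at any one time.
type pool struct {
	open, maxOpen int
}

// handle is a hypothetical stand-in for an open file.
type handle struct{ p *pool }

func (p *pool) Open(path string) (*handle, error) {
	if path == "" {
		return nil, fmt.Errorf("empty path")
	}
	p.open++
	if p.open > p.maxOpen {
		p.maxOpen = p.open
	}
	return &handle{p}, nil
}

func (h *handle) Close() { h.p.open-- }

// doStuff handles exactly one path, so its defer runs before the caller's
// loop moves on to the next iteration.
func doStuff(p *pool, path string) error {
	h, err := p.Open(path)
	if err != nil {
		return err
	}
	defer h.Close()
	// ... do thing with h ...
	return nil
}

// processAll calls doStuff once per path and stops at the first error.
func processAll(p *pool, paths []string) error {
	for _, path := range paths {
		if err := doStuff(p, path); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	p := &pool{}
	err := processAll(p, []string{"a", "b", "c"})
	fmt.Println(err, p.maxOpen, p.open) // prints: <nil> 1 0
}
```

At most one handle is open at a time, which is the property the loop-with-defer version loses.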
I was surprised by function-scoped defers in Go too, but after some time I came to see that block-scoped defer is not straightforward: it introduces invisible control flow with goto/break/continue:
while (...) {
    ...
    defer cleanup();
    ...
    while (...) {
        ...
        goto out;
    }
}
out:
    ...
With block-scoped defer, the goto out statement has to do cleanup() before jumping to out.
This does not seem to be a big issue with break/continue (afaik, C does not have labelled breaks, unlike Java/Rust), but the example above illustrates a control flow complex enough to be puzzling, whereas a function-scoped defer is only triggered with an explicit return.
On the other hand, RAII in C++ does exactly this: invisible block-scoped destructor calls. So it’s doable and neat when a language embraces RAII, but I’m not sure this fits C well.
An issue with function-scoped defers in C is that, unlike Go (and C++), C does not have anonymous functions to force “local” cleanup.
part of it might be an implementation detail. i’m not sure how go does it, but you can implement defer by hacking the return address
you can implement defer by pushing the deferred code’s address to the stack, and then jumping over the deferred code. then, when the function returns, it ends up jumping to the defer block instead of to the caller. then you return again at the end of the defer block, and it’ll return to the caller (or the previous deferred block)
i’ve seen a macro that does this in C. it’s prone to error, though. the specifics depend on calling convention, it only works on some of them, and the optimizer can easily ruin your day
In Next Generation Shell, defer-like functionality works inside a “block”. A block is used for flow control (its main purpose is non-local exit). Example:
block b {
    if COND {
        ACQUIRE_SOME_RESOURCE
        b.finally({ RELEASE_THE_RESOURCE })
        ...
    }
}
keeping things local seems so much cleaner, safer and more expressive.
I don’t know about cleaner, almost all of the article is devoted to describing the semantics of block-level defer and its intricacies. A lot of things have to be considered by a programmer: capture by reference, behavior with gotos/break/continue, change or no change of return value, timing when using in blocks, not to mention what optimization does or doesn’t do to it.
Limiting defer to function blocks avoids most of that.
All that text is essentially to say that it follows normal scoping rules and normal variable-in-scope rules, and variables are resolved in the normal way. Seems pretty clean to me.
I favor C++ these days (I’d have a hard time now giving up either standard library containers, or operator overloading for vector and matrix types), but this looks pretty darn nice to me. I think it’s much better to let the compiler enforce that cleanup is handled at all possible exits.
Alef had a somewhat different set of mechanisms for this cleanup: its ‘rescue’ and ‘raise’ keywords.
One could scatter ‘rescue’ blocks through a function, similar to how one would apply ‘defer’ blocks with this proposal; however, they would not be automatically triggered. The triggering would be via a ‘raise’ statement, which would then effectively goto the lexically previous ‘rescue’ block. So it is a structured way of chaining gotos for cleanup.
Possibly it is closer to the ‘errdefer’ mechanism in Zig?
See 6.7.1 here: https://doc.cat-v.org/plan_9/2nd_edition/papers/alef/ref
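For comparison, here is a rough Go emulation of the errdefer idea (cleanup that runs only when the function exits with an error). It is not Alef’s rescue/raise, and the `build` function and its trace strings are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

// build runs two steps. The deferred closure undoes the first step only when
// the function is returning an error, which is the errdefer-like part.
func build(failSecond bool) (trace []string, err error) {
	trace = append(trace, "acquire")
	defer func() {
		if err != nil {
			trace = append(trace, "rollback")
		}
	}()

	if failSecond {
		return trace, errors.New("second step failed")
	}
	trace = append(trace, "commit")
	return trace, nil
}

func main() {
	ok, errOK := build(false)
	fmt.Println(ok, errOK)
	bad, errBad := build(true)
	fmt.Println(bad, errBad)
}
```

The closure can see the final error because `err` is a named result, which is set before deferred functions run.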
I think it’s a (very late) step in the right direction for C, but I also think it is insufficient to fix the entire class of vulnerabilities that emerges from refactors, in at least 2 situations:
names to things, because strings have the dual property of being essentially “values” but also “resources”. This would be fixed by RAII.
defer cleanup_value(v) right after the initialization of v now causes a use after free and a double free. This would be fixed by move semantics.
The incompatibility with longjmp is also something to worry about IMO: isn’t longjmp used a lot in C, precisely for exception handling? The article is long, and I skimmed a bit, but I didn’t see a discussion of this.
This would be cooler if it properly interacted with non-local longjmps, but I understand that that would be difficult to do in C.
IMHO that is a bad idea. longjmp is a very low-level tool. Making it do anything other than restore registers (including the program counter) would be a mistake. If C needs a higher-level unwinding tool then it should be added separately, not baked into the low-level toolbox.
For example longjmp is often used for implementing coroutines. But you wouldn’t want to run your defers in that case.
I also don’t think making defer work when bailing via longjmp would actually mitigate most of the footguns that longjmp creates (like expecting functions to return), so it seems a small win for a major loss.
I don’t think it would be that hard. Longjmp can’t jump down the stack. But you’d need to actually materialise a list of defers somewhere, all the time, just in case you’re between a setjmp and a longjmp you don’t know about, which would (I presume) be a bit slower than resolving them at compile time.
The WG and indeed the whole C community (from what I see of it, anyway) gave up on longjmp decades ago. It’s a bit sad because a longjmp-friendly defer might well be the thing that would make longjmp actually useful. Oh well.
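To make the "materialise a list of defers" idea concrete, here is a toy model of the bookkeeping in Go. Go has no setjmp/longjmp, so this only models the data structure: `deferStack`, `mark`, and `unwindTo` are hypothetical names, and nothing here reflects how a real C implementation works.

```go
package main

import "fmt"

// deferStack is a hypothetical runtime list of pending cleanups.
type deferStack struct{ pending []func() }

// push registers a cleanup, as executing a defer statement would.
func (s *deferStack) push(f func()) { s.pending = append(s.pending, f) }

// mark records the current depth, as a setjmp-like call would.
func (s *deferStack) mark() int { return len(s.pending) }

// unwindTo runs every cleanup registered after the mark, newest first, as a
// defer-aware longjmp-like call would before transferring control.
func (s *deferStack) unwindTo(mark int) {
	for len(s.pending) > mark {
		last := len(s.pending) - 1
		f := s.pending[last]
		s.pending = s.pending[:last]
		f()
	}
}

// demo registers one cleanup before the mark and two after it, then unwinds
// back to the mark. Only the two newer cleanups run.
func demo() []string {
	var s deferStack
	var trace []string
	s.push(func() { trace = append(trace, "outer") })
	m := s.mark()
	s.push(func() { trace = append(trace, "inner 1") })
	s.push(func() { trace = append(trace, "inner 2") })
	s.unwindTo(m)
	return trace
}

func main() {
	fmt.Println(demo()) // prints: [inner 2 inner 1]
}
```

The cost mentioned above shows up as the `push` on every defer, which has to happen whether or not any jump ever occurs.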
But you’d need to actually materialise a list of defers somewhere, all the time
Yes, and that sort of runtime support is exactly what the C language designers are allergic to, for better or worse. ;-)
I don’t think it’s any different from a call stack. Certainly less invasive than thread-local storage or VLAs or, like, malloc.
A tangent that nerd-sniped me:
Why has my Firefox started showing ⸺ as a box with a bar in the middle and “2M” overlaid above the bar? What could it be? It only happens on my Windows host, not on Linux. The CSS and font inspector tells me it’s just Arial.