The Defer Technical Specification: It Is Time

85 points by bakaq

snej

This would be such a boon to poor, deprived C. I hope it makes it.

Halkcyon

Good luck using it before 2040, though.
- riking
  
  Note that the call to action here is to implement the specification in your compilers and write up experience reports, and make sure those experience reports make it to the committee.
- snej
  
  Oh, I wouldn’t use it myself! I use C++. But I accept that there are folks who won’t give up C, and at least this way they can write code that’s a bit safer.
gianni

Zig has defer, and I like using it there. This seems great for C.
- bakaq
  
  Yes! defer is something I miss a lot when writing C. It’s probably the best Zig feature that would actually work in C, and also one of the best cost-benefit trade-offs in terms of complexity.
bakaq

I haven’t used Go enough to stumble on this, but from what this article says, it hoists the defer execution to function scope. I have no idea why someone would want this; keeping things local seems so much cleaner, safer and more expressive.
- snej
  
  It’s handy in situations where you need to conditionally defer, like inside an “if” block that acquires a resource under some conditions, that gets used by the rest of the function.
  
  But it’s not hard to do this with block-scoped defer: you just move the defer to function scope and put a conditional in it.
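  
  A minimal sketch of that shape, written in Go because the thread's other examples are in Go (its defer is already function-scoped, so only the "one defer plus a conditional" pattern carries over). The `resource` type and the `process` function are hypothetical names made up for illustration.
  
  ```go
  package main
  
  import "fmt"
  
  // resource is a hypothetical stand-in for anything that must be released.
  type resource struct {
  	name   string
  	events *[]string
  }
  
  func (r *resource) release() {
  	*r.events = append(*r.events, "release "+r.name)
  }
  
  // process acquires the resource only under some conditions, but registers
  // a single cleanup up front. The nil check makes the cleanup conditional.
  func process(needed bool, events *[]string) {
  	var r *resource
  	defer func() {
  		if r != nil {
  			r.release()
  		}
  	}()
  
  	if needed {
  		r = &resource{name: "conn", events: events}
  		*events = append(*events, "acquire conn")
  	}
  	// The rest of the function can use r when it is non-nil.
  	*events = append(*events, "work")
  }
  
  func main() {
  	var events []string
  	process(true, &events)
  	process(false, &events)
  	fmt.Println(events)
  }
  ```
  
  The deferred closure reads `r` when it runs, not when it is registered, which is why the later assignment inside the `if` is visible to it.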
- weberc2
  
  Yeah, I use Go a lot, and I don’t love defer for that reason. If I’m looping over a bunch of files, I don’t want to defer closing them until the end of the function; I want to close them right away to avoid keeping hundreds of file handles open. So this usually requires creating a new function for the loop body, which is tedious.
- ollien
  Honestly, I never thought about it until reading this article. I always looked at defer as a function cleanup rather than a scope cleanup. If I was asked to review the IIFE example in the article, my first thought would be to refactor it to something like this
  
      func updateX(m *sync.Mutex) {
          m.Lock()
          defer m.Unlock()
          x = x + 1
      }
  
      func work(wg *sync.WaitGroup, m *sync.Mutex) {
          defer wg.Done()
          for i := 0; i < 42; i++ {
              updateX(m)
          }
      }
  
  IMO, this behavior encourages you to write smaller, clearer functions, but I can definitely see how it’s a footgun.
  - weberc2
    
    In Go I often want to write something like this:
    
        for _, path := range paths {
            f, err := os.Open(path)
            if err != nil {
                return err
            }
            defer f.Close()
            // do thing with file
        }
    
    This will leak file handles and error when len(paths) exceeds the system’s per-process open file-handle limit, so we really want the defer to apply to the loop body scope. So instead we have to write something like this:
    
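        for _, path := range paths {
            // The closure returns the error so the enclosing function can propagate it.
            err := func(path string) error {
                f, err := os.Open(path)
                if err != nil {
                    return err
                }
                defer f.Close()
                // do thing with file
                return nil
            }(path)
            if err != nil {
                return err
            }
        }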
    
    Not a huge deal, but it’s tedious and somewhat error prone. I think I would prefer a block-scoped defer.
    
    riking
    
    this is in the article btw!
    
    Johz
    
    On the other hand, I can imagine there are also situations where you really do want the defer to happen after the loop is finished and not during each iteration, so maybe having it be function-scoped is a good compromise there? You have more control over the scope by inserting IIFEs when you explicitly want the defer to only be scoped to a specific block.
    
    If you only had block-scoped defers, I don’t think it would be possible to write something that behaves the same as your first example does now, at least not without manually calling the defers yourself.
    
    mort
    
    Honestly, no, I can’t imagine any situation where I would want every iteration of a loop to queue up a new defer and then they all execute in reverse order after the end of the scope. That doesn’t mean such a situation doesn’t exist, but I’ve certainly never encountered anything like it from what I can recall.
    
    However I can think of a really easy way to implement it, should it become necessary: make an array of functions, defer looping over and calling every function in reverse order, then in the loop, add a function to the array.
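    
    A rough sketch of that idea in Go, assuming nothing beyond the description above. The function name `closeInReverse` and the string "events" standing in for real open/close calls are hypothetical.
    
    ```go
    package main
    
    import "fmt"
    
    // closeInReverse registers ONE deferred loop, then pushes a cleanup
    // function per iteration. The named result lets the deferred cleanups
    // append to the slice that is actually returned.
    func closeInReverse(names []string) (events []string) {
    	var cleanups []func()
    	defer func() {
    		// Run the queued cleanups last-in, first-out.
    		for i := len(cleanups) - 1; i >= 0; i-- {
    			cleanups[i]()
    		}
    	}()
    
    	for _, name := range names {
    		name := name // per-iteration copy, needed before Go 1.22
    		events = append(events, "open "+name)
    		cleanups = append(cleanups, func() {
    			events = append(events, "close "+name)
    		})
    	}
    	return events
    }
    
    func main() {
    	fmt.Println(closeInReverse([]string{"a", "b"}))
    	// prints: [open a open b close b close a]
    }
    ```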
    
    kevincox
    
    And if you do it this way it is explicit! Which Go loves!
    
    Yeah, I can’t really understand why Go made defer function-scoped. It really seems like an overly complex solution to a problem that almost no one had, and it is a real footgun.
    
    carlana
    
    I tend to write functions that focus on one task, so I just can’t really imagine not writing a doStuff(path string) error function and calling it in a loop. Plus that makes it easier to switch to a concurrent system if you end up needing that. I can see how it’s a concern, but it just feels like operating on one file = one function to me.
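    
    Something along these lines, as a sketch only: `doStuff` is the name used above, while `processAll` and the byte-counting "work" are placeholders invented for the example.
    
    ```go
    package main
    
    import (
    	"fmt"
    	"io"
    	"os"
    )
    
    // doStuff handles exactly one file. Its defer runs when doStuff returns,
    // so the handle is closed before the caller moves on to the next path.
    func doStuff(path string) error {
    	f, err := os.Open(path)
    	if err != nil {
    		return err
    	}
    	defer f.Close()
    
    	// Placeholder work: count the bytes in the file.
    	n, err := io.Copy(io.Discard, f)
    	if err != nil {
    		return err
    	}
    	fmt.Printf("%s: %d bytes\n", path, n)
    	return nil
    }
    
    // processAll stops at the first error, like the loop versions above.
    func processAll(paths []string) error {
    	for _, path := range paths {
    		if err := doStuff(path); err != nil {
    			return err
    		}
    	}
    	return nil
    }
    
    func main() {
    	if err := processAll(os.Args[1:]); err != nil {
    		fmt.Fprintln(os.Stderr, err)
    		os.Exit(1)
    	}
    }
    ```
    
    At most one file is open at any time, regardless of how many paths are passed in.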
    
    dmytrish
    
    I was surprised by function-scoped defers in Go too, but after some time I came to see that block-scoped defer is not straightforward: it introduces invisible control flow with goto/break/continue:
    
        while (...) {
            ...
            defer cleanup();
            ...
            while (...) {
                ...
                goto out;
            }
        }
        out:
        ...
    
    – with block-scoped defer the goto out statement has to do cleanup() before jumping to out.
    
    This does not seem a big issue with break/continue (afaik, C does not have labelled breaks, unlike Java/Rust), but the example above illustrates a control flow complex enough to be puzzling. Whereas a function-scoped defer is only triggered with explicit return.
    
    On the other hand, RAII in C++ does exactly this: invisible block-scoped destructor calls. So it’s doable and neat when a language embraces RAII, but I’m not sure this fits in C well.
    
    An issue with function-scoped defers in C is that, unlike Go (and C++), C does not have anonymous functions to force “local” cleanup.
    
    hc
    
    part of it might be an implementation detail. i’m not sure how go does it, but you can implement defer by hacking the return address
    
    you can implement defer by pushing the deferred code’s address to the stack, and then jumping over the deferred code. then, when the function returns, it ends up jumping to the defer block instead of to the caller. then you return again at the end of the defer block, and it’ll return to the caller (or the previous deferred block)
    
    i’ve seen a macro that does this in C. it’s prone to error, though. the specifics depend on calling convention, it only works on some of them, and the optimizer can easily ruin your day
    
    ilyash
    
    In Next Generation Shell, defer-like functionality works within a “block”. A block is used for flow control (its main purpose is non-local exit). Example:
    
        block b {
            if COND {
                ACQUIRE_SOME_RESOURCE
                b.finally({ RELEASE_THE_RESOURCE })
                ...
            }
        }
    
    dlisboa
    
    keeping things local seems so much cleaner, safer and more expressive.
    
    I don’t know about cleaner, almost all of the article is devoted to describing the semantics of block-level defer and its intricacies. A lot of things have to be considered by a programmer: capture by reference, behavior with gotos/break/continue, change or no change of return value, timing when using in blocks, not to mention what optimization does or doesn’t do to it.
    
    Limiting defer to function scope avoids most of that.
    
    mort
    
    All that text is essentially to say that it follows normal scoping rules and normal variable-in-scope rules, and variables are resolved in the normal way. Seems pretty clean to me.
    
    Boojum
    
    I favor C++ these days (I’d have a hard time now giving up either standard library containers, or operator overloading for vector and matrix types), but this looks pretty darn nice to me. I think it’s much better to let the compiler enforce that cleanup is handled at all possible exits.
    
    dfawcus
    
    Alef had a somewhat different set of mechanisms for this cleanup: its ‘rescue’ and ‘raise’ keywords.
    
    One could scatter ‘rescue’ blocks through a function, similar to how one would apply ‘defer’ blocks with this proposal; however, they would not be automatically triggered. The triggering would be via a ‘raise’ statement, which would then effectively goto the lexically previous ‘rescue’ block. So it is a structured way of chaining gotos for cleanup.
    
    Possibly it is closer to the ‘errdefer’ mechanism in Zig?
    
    See 6.7.1 here: https://doc.cat-v.org/plan_9/2nd_edition/papers/alef/ref
    
    Or here: https://swtch.com/~rsc/thread/alef.pdf
    
    dureuill
    
    I think it’s a (very late) step in the right direction for C, but I also think it is insufficient to fix the entire class of vulnerabilities that emerges from refactors, in at least 2 situations:
    
    1. An object that did not previously require clean-up now does. You have to track all such objects and add the clean-up, or you get a leak as a result of your refactor. In my experience, this mostly happens when giving names to things, because strings have the dual property of being essentially “values” but also “resources”. This would be fixed by RAII.
    
    2. A function adds a path where the value is moved instead of destroyed in the function. The existing defer cleanup_value(v) right after the initialization of v now causes a use-after-free and a double free. This would be fixed by move semantics.
    
    The incompatibility with longjmp is also something to worry about IMO. Isn’t longjmp used a lot in C, precisely for exception handling? The article is long, and I skimmed a bit, but I didn’t see a discussion of this.
    
    manuel
    
    This would be cooler if it properly interacted with non-local longjmps, but I understand that that would be difficult to do in C.
    
    kevincox
    
    IMHO that is a bad idea. longjmp is a very low-level tool. Making it do anything other than restore registers (including the program counter) would be a mistake. If C needs a higher-level unwinding tool then it should be added separately, not baked into the low-level toolbox.
    
    For example longjmp is often used for implementing coroutines. But you wouldn’t want to run your defers in that case.
    
    I also don’t think making defer work when bailing via longjmp would actually mitigate most of the footguns that longjmp creates (like expecting functions to return), so it seems a small win for a major loss.
    
    manuel
    
    If C needs a higher-level unwinding tool then it should be added separately
    
    Good point
    
    edk-
    
    I don’t think it would be that hard. Longjmp can’t jump down the stack. But you’d need to actually materialise a list of defers somewhere, all the time, just in case you’re between a setjmp and a longjmp you don’t know about, which would (I presume) be a bit slower than resolving them at compile time.
    
    The WG and indeed the whole C community (from what I see of it, anyway) gave up on longjmp decades ago. It’s a bit sad because a longjmp-friendly defer might well be the thing that would make longjmp actually useful. Oh well.
    
    manuel
    
    But you’d need to actually materialise a list of defers somewhere, all the time
    
    Yes, and that sort of runtime support is exactly what the C language designers are allergic to, for better or worse. ;-)
    
    edk-
    
    I don’t think it’s any different from a call stack. Certainly less invasive than thread-local storage or VLAs or, like, malloc.
    
    kwas
    
    tangent that nerd sniped me:
    
    Why did my Firefox start showing ⸺ as a box with a bar in the middle and “2M” overlaid above the bar? What could it be? It only happens on my Windows host, not on Linux. The CSS and font inspector tell me it’s just Arial.