I don't care that it's X times faster
59 points by zmitchell
Regarding "Your benchmark isn't measuring what you think it's measuring":
- Charitable interpretation: you're accidentally measuring something that's been optimized away entirely
A good paper from The Literature, "Producing Wrong Data Without Doing Anything Obviously Wrong!", asks how many CS results fall within the variance you can get from, say, changing the aggregate size of your environment variables.
This reminded me of another interesting paper in the same vein, though on a different topic: "Confidence intervals uncovered: Are we ready for real-world medical imaging AI?"
For more than 60% of papers, the mean performance of the second-ranked method was within the CI of the first-ranked method
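As a back-of-the-envelope illustration of the quoted finding, checking whether the runner-up's mean lands inside the winner's 95% CI takes only a few lines. All scores here are invented, and this uses the normal approximation (z = 1.96) rather than a t-distribution:

```python
# Toy check: does method B's mean fall inside method A's 95% CI?
# Scores are invented for illustration; 1.96 is the normal-approximation z.
import math
import statistics

a_scores = [0.81, 0.84, 0.79, 0.83, 0.82]  # method A, ranked first
b_mean = 0.815                             # method B's reported mean

mean_a = statistics.mean(a_scores)
sem = statistics.stdev(a_scores) / math.sqrt(len(a_scores))
ci = (mean_a - 1.96 * sem, mean_a + 1.96 * sem)

overlap = ci[0] <= b_mean <= ci[1]  # True here: the "win" is within noise
```

When `overlap` is true, the ranking between the two methods is not statistically meaningful, which is exactly the situation the paper found in most of the surveyed work.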
If faster means it uses less memory and fewer CPU cycles, and the thing is deployed widely at scale, you are literally saving the planet by using less electricity.
I do care that it's X times faster.
I produced a lighter alternative distribution of google-cloud-cli https://github.com/tonymet/gcloud-lite using the approach the author laments. It's 85% less resource-intensive to deploy, which means you can use it on micro instances where the official CLI would hang (due to vCPU/IOPS budgets and resource constraints).
Google recognized this and the user complaints, and made some efforts to trim their official packages:
https://issuetracker.google.com/issues/324114897?pli=1
So benchmarking and competition are a good thing. Assuming the tool is correct, it's important that it's also efficient and elegant. Often developers are so focused on feature work that they forget to constrain resource utilization. A little attention can make a world of difference.
Sure, some benchmarks can be biased (nobody is perfect), but in general we should encourage people to reduce resource usage as much as possible.
The author doesn't link to which post is bothering them, but based on timing I'd guess it's this one?
[r/rust] I rewrote tmignore in Rust — 667 paths in 2.5s instead of 14 min
Maybe it's not the traditional sort of "I optimized an AV1 encoder's inner loop by 5% with clever SIMD" optimization post, but it still seems interesting to see someone investigate and solve a performance problem by disassembling a proprietary tool.
It’s not that one. I specifically didn’t link to a post because I didn’t want to target a specific person. Doing so would (1) be pretty mean, and (2) unfairly put the spotlight on them instead of the many other people making this kind of post.
Edit: and by “not that one” I mean that’s not the post I saw that triggered me. I haven’t read the post that you linked.
How about this one? https://lobste.rs/s/krdjnf/truffleruby_34_full_ruby_3_4
You don’t have to answer, I was amused that I could immediately think of another one, demonstrating how this is a frequent thing. 🍻
Edit: oh, I whooshed and hadn’t read that your post came from /r/rust. Guess I should read first.
The author’s post is about bad benchmark measurements which show huge improvements for the sake of clickbait titles. TruffleRuby AFAIK is showing reasonable speedups (in exchange for a delay in adding features). That just sounds like quality engineering.
💯
I made the cardinal sin of not reading before posting, and I was wayyy off. I tend not to delete my mistakes, because it feels dishonest, which is the only reason I left it up with the edit.
if you're improving the performance of code that isn't part of the bottleneck, this often makes no practical difference.
You can run this in reverse, too. If it makes development easier, you can sacrifice performance everywhere else and focus only on the bottleneck.
I remember switching from grep -r to tools like ack and ag. In terms of regex engines, grep does the fast thing while the tools using PCRE do the slow thing. But the newer search tools were so much faster—not for rocket science reasons, but because they scanned multiple files in parallel and didn't recurse into .git/ and friends. (And now with ripgrep we get the best of both worlds! But my point is we got faster tools a few years earlier by focusing on the actual bottleneck.)
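That "skip .git/ and scan files in parallel" trick is easy to sketch. This is a hypothetical toy to show where the win comes from, not how ack, ag, or ripgrep are actually implemented (the directory list and helper names are made up):

```python
# Toy recursive grep whose speedup comes not from a faster regex engine
# but from (1) pruning VCS/dependency directories and (2) scanning
# files in parallel.
import os
import re
from concurrent.futures import ThreadPoolExecutor

SKIP_DIRS = {".git", ".hg", ".svn", "node_modules"}  # never recurse here

def matching_lines(path, pattern):
    """Return (path, line number, line) for every match in one file."""
    hits = []
    try:
        with open(path, errors="ignore") as f:
            for lineno, line in enumerate(f, 1):
                if pattern.search(line):
                    hits.append((path, lineno, line.rstrip("\n")))
    except OSError:
        pass  # unreadable file: skip, like grep -s
    return hits

def search(root, regex):
    pattern = re.compile(regex)
    files = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Mutating dirnames in place prunes the walk itself.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        files.extend(os.path.join(dirpath, f) for f in filenames)
    with ThreadPoolExecutor() as pool:  # scan many files concurrently
        results = pool.map(lambda p: matching_lines(p, pattern), files)
    return [hit for hits in results for hit in hits]
```

The pruning alone is often the bigger win: a `.git/` directory can hold more bytes than the working tree it tracks.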
This isn't to say that you shouldn't care about performance, but there are a number of other axes you can optimize along.
This is one of my favorite thoughts whenever I feel like everyone else is a rockstar computer wizard while I'm puzzling over why my burnt sand no work so good. The curse of dimensionality turns into a blessing.
Maintainers of <existing tool> are naive/ignorant/bad/wasting their time/...
I can’t really speak for open source or popular stuff, but in my experience at work, this one is actually depressingly likely. And I do have that one example of Windows Terminal vs Casey Muratori’s Refterm. Not necessarily that the Windows Terminal folks are idiots, but rather that they simply didn’t know the relevant programming techniques.
I get that this is a rant, but anyone who isn't asking "Is it true?" and "Is it important?" about claims that a new thing is X times faster than Y is being silly. If the answer to both is a resounding yes, then I may care. I'd go so far as to say I probably care.
In my experience, the sticking point is that usually the answer to "Is it true?" is "Yes... in some narrow circumstances that may be contrived."
Rather than being silly, I think it’s probably a combination of charitably taking claims at face value and just a lack of familiarity with what the new project is “replacing”. I want to take claims at face value and believe an author/maintainer/etc, it’s only through experience that I’ve learned that you can’t.
Project A does some work inline, whereas your project sends the work to a background thread, and you're comparing the time it takes to send data over a channel as opposed to actually doing the rest of the work on the background thread.
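The pitfall quoted above can be sketched in a few lines (all names invented): "project B" stops its clock as soon as the work is enqueued, so its number looks dramatically better even though the same work still has to run somewhere:

```python
# Misleading benchmark sketch: timing the channel send, not the work.
import queue
import threading
import time

def do_work(n):
    return sum(i * i for i in range(n))  # the actual expensive part

def benchmark_inline(n):
    start = time.perf_counter()
    result = do_work(n)  # "project A": work happens inline, so it's timed
    return time.perf_counter() - start, result

def benchmark_offloaded(n, q):
    start = time.perf_counter()
    q.put(n)             # "project B": only the enqueue is timed!
    return time.perf_counter() - start

q = queue.Queue()
worker = threading.Thread(target=lambda: do_work(q.get()), daemon=True)
worker.start()

inline_time, _ = benchmark_inline(1_000_000)
offload_time = benchmark_offloaded(1_000_000, q)
worker.join()  # the real work still had to finish on the worker thread
```

`offload_time` will be a tiny fraction of `inline_time`, yet the total CPU spent is the same or slightly more once thread overhead is counted. A fair comparison has to include the join.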
I think I ran into this recently... Benchmarked my language against several others. It got destroyed by most of them, but it managed to outdo bash by a wide margin. I'm still not sure why; recursive Fibonacci isn't exactly one of bash's strengths, and it might be starting a process for each recursive call, for all I know.
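That guess is plausible: if the bash recursion returns values via command substitution, `$(fib ...)`, each call forks a subshell. A rough way to see why that would dominate, sketched here in Python rather than bash, is to compare the cost of a single process spawn against an entire in-process recursive run:

```python
# Rough probe of the "process per call" theory: one bare interpreter
# spawn vs. a complete in-process recursive Fibonacci computation.
import subprocess
import sys
import time

def fib(n):
    return n if n < 2 else fib(n - 1) + fib(n - 2)

t0 = time.perf_counter()
fib(15)  # 1973 recursive calls, all in-process
in_process = time.perf_counter() - t0

t0 = time.perf_counter()
subprocess.run([sys.executable, "-c", "pass"], check=True)  # one spawn
one_spawn = time.perf_counter() - t0
```

On typical machines a single interpreter spawn costs more than the whole in-process run, so paying that cost once per recursive call would easily explain bash's dismal showing.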
I do hope you'll forgive us if we get ahead of ourselves sometimes though. Seeing those numbers on the screen is pretty exciting. Pretty easy to get high on it and post something we'll regret later.