Which Programming Language Is Best for Claude Code?

17 points by MatheusRich


st3fan

It is a weird benchmark, focused on time and cost rather than code quality, code complexity, or maintainability.

rtfeldman

The outputs are 200-LoC programs.

I sympathize with the author that "designing a large-scale benchmark that's fair across 15 languages is quite challenging," but you have to be honest about the conclusions that can be drawn from the experiment you actually did. "At least for prototyping-scale tasks, Ruby, Python, and JavaScript appear to be the best fit" is not remotely supported by the data here. What percentage of useful prototypes are 200 lines of code?

Headings like "What causes the speed/cost differences?" "Doesn't lack of types mean more bugs?" "A 2× difference isn't that big, is it?" and "Isn't ecosystem and runtime performance more important for language choice?" could each have one sentence under them: "At the scale of this experiment, we can't draw any meaningful conclusions about this."

brucehoult

Maybe I missed it, but this doesn't seem to include the execution speed of the generated code.

I would expect the compact, cheap-to-write OCaml and Haskell programs to execute much faster than the Ruby and Python ones. Though JavaScript/TypeScript is going to be fast in Node.js as well.

And of course C and Rust should also produce fast programs, though where the extra lines of code implement library-style functionality, that code might not be as well written as the built-in libraries in Ruby and Python.

It warms my heart to see Ruby come in so well. I've always loved it as a general-purpose scripting language, well ahead of Python with its arbitrarily non-orthogonal and annoying syntax. Ruby is also a much more natural upgrade from Perl / shell / awk / sed while being an actually good language.

xq

I'm a bit sad that we're missing both Java and C#, which are a different class of statically typed languages than Rust, Go and C.

Apart from TypeScript, none of the static languages had exceptions, which means error handling works differently.
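To make the distinction concrete, here is a minimal sketch of the two error-handling styles, written in Python only so both fit in one snippet. It is not taken from the benchmark; `parse_port_raising`, `parse_port_result`, `Ok`, and `Err` are hypothetical names made up for illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Ok:
    value: int


@dataclass(frozen=True)
class Err:
    message: str


# Hypothetical result type: success or failure is carried in the return value.
Result = Union[Ok, Err]


def parse_port_raising(text: str) -> int:
    """Exception style: failure leaves the normal return path."""
    port = int(text)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    return port


def parse_port_result(text: str) -> Result:
    """Value style: failure is an ordinary value the caller has to inspect."""
    try:
        port = int(text)
    except ValueError:
        return Err(f"not a number: {text!r}")
    if not 0 < port < 65536:
        return Err(f"port out of range: {port}")
    return Ok(port)
```

With the first style a caller can ignore the failure case until it blows up at runtime; with the second, the failure case is visible in the function's return type, which is roughly how Go and Rust programs tend to be structured.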

I'd also love to see a comparison in code quality. My own vibecoding experience shows that nailing down strict rules and static analysis definitely improves the runtime crash behaviour of programs.

jonathannen

I'm surprised TypeScript sits where it does compared to JavaScript. There is often discussion about types and their impact on Claude, so it's a very interesting benchmark.

Types and TypeScript are far more useful for larger projects. If I was building a single index.js to process some JSON for me I'd reach for JavaScript. For everything else it's TypeScript. So I do wonder how these results shift as you scale.

"A 2× difference isn't that big, is it?"

You can argue a type system forces the coder/AI to build the code (at least conceptually) twice. And if you take this data at face value that looks true.

bediger4000

I'm a little surprised that this matters. If "AI" is as good as is claimed about interpreting meaning, then "AI" should be able to use any language, Brainfuck, even, with good results. Here, "good" would mean "accomplishing the assigned task swiftly".

cohix

Been using Rust almost exclusively, it’s been working great.

mrcruz

I'm mildly salty that Bash wasn't included :c

Student

The test suite is very bare-bones and happy-path focused. This validates that agents are good at coding relatively simple projects in scripting languages if you can write a test suite. It's a big win for test suites that don’t depend on API details, and we should probably all be investing more in these types of test harnesses.
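As a rough sketch of what such a harness can look like (an assumption about the general shape, not the benchmark's actual suite): it only feeds stdin to a command and compares stdout, so the same cases work no matter which language the implementation is written in. `run_case`, `failing_cases`, and `UPPERCASE_DEMO` are hypothetical names.

```python
import subprocess
import sys
from typing import List, Sequence, Tuple


def run_case(command: Sequence[str], stdin_text: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Run the program under test and return (exit code, stdout)."""
    proc = subprocess.run(
        list(command),
        input=stdin_text,
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return proc.returncode, proc.stdout


def failing_cases(command: Sequence[str], cases: List[Tuple[str, str]]) -> List[str]:
    """Return the inputs whose output did not match; an empty list means all passed."""
    failures = []
    for stdin_text, expected in cases:
        code, out = run_case(command, stdin_text)
        if code != 0 or out.strip() != expected.strip():
            failures.append(stdin_text)
    return failures


# Hypothetical program under test: a one-liner that upper-cases its stdin.
# Any executable in any language could be substituted for this command.
UPPERCASE_DEMO = [sys.executable, "-c", "import sys; print(sys.stdin.read().upper(), end='')"]
```

Calling `failing_cases(UPPERCASE_DEMO, [("hello\n", "HELLO\n")])` returns an empty list. Because nothing here touches the program's internal API, the cases survive a rewrite in another language unchanged.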

The big surprise for me here is that OCaml comes out looking very good: comparable to Go, with very little dispersion in the time taken to achieve the task.