Lines of Code Are Back (And It's Worse Than Before)
22 points by allanmacgregor
Old and busted: lines of code
New hotness: tokens used
I find the amount of code generated by agents to be a concern of its own. When a human authors a PR/MR with 5000 lines of code, reviewers will ask for shorter code, smaller commits, questions about the architecture, and will really wonder whether it has to be that long. There will always be a big pushback: it's too much, it can't be reviewed, it's surprising that it's so much code, and it will be difficult to maintain. The last resort may even be to ditch the feature responsible for so much code (we'd typically do that at an earlier stage, which is much better, obviously).
When a machine generates that much code to be merged, it somehow seems more acceptable? That doesn't make sense. The only concern that disappears is the pain for the author of writing so much code; all the other concerns should remain!
Some may say that the prompts become the actual source, but that would require storing the prompts, the contexts, and the full models (which you cannot copy and which change very often). Even then, these LLMs are probabilistic, so that doesn't work.
And now people accept thousands of lines of code routinely? Even if the code works and isn't ugly, where has the questioning gone? There is no reason not to have the same concerns as if it were human-authored. I've now seen several established projects merge huge PRs that are simply too long to have been properly reviewed, or even thought about. I can only fear for the longevity of these projects now. There is no way a human applies critical thinking when faced with >5k lines in a single PR, even at the architectural level.
I'm not a terseness advocate. I've never been a proponent of software that strives for minimality on whatever basis, and I believe some things really do require a given amount of code. LLMs and agents have broken every ceiling I could have imagined, however.
PS: I reckon, however, that not all projects strive for longevity and reliability. There's a clear difference between FOSS and proprietary ones. I don't think it's interesting to discuss the politics of proprietary software development here.
Very good read!
Fresh off the presses of your favorite LLM. It's almost certainly AI-generated.
Pangram claims it's 85% LLM-generated, so I only skimmed 15% of the article.
It reads that way to me, too. But the human who posted it has been here a little while, so let's hear what they say when asked directly.
especially with anthropic, and somewhat with google, their AI code percentage isn't a measure of the code, it's a measure of the AI. their goal is to be able to hit 100%, because that's a product feature of the AI
and, i think the result of gaming it is much less bad than gaming LoC: you just tell the AI to write the line of code you were going to write.
i'm guessing this is why anthropic's percentage is so high: they presumably have a mandate to dogfood it, so they're telling it "write a for loop" more often, where others would just type it and save the tokens. but that doesn't really make the code any worse
where this is actually useful (without getting into other metrics) is that if the tool can hit 100%, it ends up being pretty helpful if you want to use voice input to write your code