Lines of code are useful

12 points by kqr


Johz

Lines of code measure code complexity. That is well established. You don’t have to take my word for it:

The examples that follow here, though, all measure code complexity, at least indirectly, in terms of quantity of code. Essentially, this is another way of saying "if you write more code, you'll have written more code".

What isn't demonstrated (in this blog post, or in general; see e.g. A Critique of Software Defect Prediction Models) is whether these volume-based measures of code complexity map clearly to the real-world effects of complexity. That is, if you show me a 500-line implementation of a module and a 1000-line implementation of the same module, is it generally true that the 500-line implementation will have fewer bugs, be easier to read, be easier to change, etc.?

I mean, it might be the case, and it certainly feels like it should be the case, at least in general. But we're missing the evidence to demonstrate it.

Even the later discussion on essential vs accidental complexity seems flawed, because we don't really have any way of measuring the difference. Different — very intelligent, experienced, and technically capable — developers will disagree over whether, say, NextJS is a big bundle of accidental complexity, or whether it's moving essential complexity from application code into library code.

FWIW, I agree with the author on a lot of their points here, at least insofar as it all feels right. This is more a comment on the miserable state of software complexity analysis than on the rest of what the author's trying to say.

pointlessone

I find these statements strange, because they are not true.

They are true, if you care to mention that they were all said in the context of measuring productivity, not complexity.

I presume OP thinks it’s a clever hook, a subversion of a common narrative, but here it kinda works against the main premise of the text. Especially since OP points out the distinction in the last section.

I guess the confusion comes from the identical name of the metric while in practice it measures different things.

For complexity “lines of code” is an absolute measure. For productivity it’s “lines of code per day.” They are different units. They are not the same thing. And mixing them up is what OP does on purpose in the intro, which is disingenuous.
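
To make the unit difference concrete, here is a minimal sketch. The `Snapshot` type and both function names are made up for illustration; nothing here comes from the article.

```python
from dataclasses import dataclass


@dataclass
class Snapshot:
    """Hypothetical record of a codebase's size on a given day."""
    day: int
    total_lines: int


def size_in_lines(snapshot: Snapshot) -> int:
    # Absolute measure, unit: lines. This is the "complexity" reading.
    return snapshot.total_lines


def lines_per_day(first: Snapshot, last: Snapshot) -> float:
    # Rate measure, unit: lines per day. This is the "productivity" reading.
    return (last.total_lines - first.total_lines) / (last.day - first.day)


before = Snapshot(day=0, total_lines=5000)
after = Snapshot(day=10, total_lines=4500)

print(size_in_lines(after))          # size of the codebase, in lines
print(lines_per_day(before, after))  # change in size, in lines per day
```

A cleanup that deletes code makes the first number better and the second number negative, so the two cannot be read as the same metric.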

nrposner

I can see what the author was going for here, but it's not executed well. We start with a bunch of quotes about LoC being a bad measure of productivity, which are dismissed as incorrect, followed by a long tangent talking about how LoC are a good metric of complexity, before ultimately conceding that complexity and productivity are different things. But hey, maybe LoC could also be a good productivity measure under very specific circumstances which never seem to happen and get far less attention.

I think the idea 'LoC have a place as a metric in a culture that rigorously measures complexity to model things like future maintenance needs' could be made, but I've never encountered a software culture that is at this point, nor do I expect most of the readership have. It just feels like a tendentious hook.

Sirikon

Even when the essential/accidental complexity ratio is favorable, lines of code as a productivity metric does not take into account all the investigation time that a certain task took. Spending a whole day to produce a ten-line comment and a single line of functional code is a classic.

It also doesn't reflect all the non-technical work a worker needs to do like mentoring, bureaucracy or pointless meetings.

However, if we can ensure the relationship between value and essential complexity is stable, and further that the ratio of essential to accidental complexity is stable, then lines of code is usable as a productivity metric too.

That is technically correct but totally impractical. It is very hard and time-consuming to measure, and no two people will think exactly the same about the same codebase. Also, in theory, if a project's collaboration is based on pull requests, accidental complexity should be very low, yet we all know it doesn't work like that.

When lines of code as a metric of individual productivity is criticized, it usually means criticizing lazy managers looking for an excuse to fire people, not to actually improve anything.

Counting lines is easy and repeatable, but assessing complexity is not.

lcapaldo

I think when people say things like the quotes at the beginning they are largely objecting to it as a measure of the programmer not as a measure of the program.

noteflakes

Whenever I look at an unfamiliar code base, I always run cloc on different files and directories to get a feel of the complexity. So I totally agree - LOC is quite useful as an indicator of complexity.
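
For anyone without cloc installed, a rough stand-in can be sketched in a few lines of Python. This is not how cloc works: it only counts non-blank lines, makes no attempt to skip comments, and the function name and default extension list are arbitrary choices for illustration.

```python
import os
from collections import Counter


def count_nonblank_lines(root, extensions=(".py", ".js", ".go")):
    """Map each top-level entry under root to its number of non-blank lines.

    Only files whose names end in one of `extensions` are counted.
    """
    totals = Counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(extensions):
                continue
            path = os.path.join(dirpath, name)
            # Group by the first path component relative to root.
            top = os.path.relpath(path, root).split(os.sep)[0]
            with open(path, encoding="utf-8", errors="replace") as handle:
                totals[top] += sum(1 for line in handle if line.strip())
    return totals


if __name__ == "__main__":
    # Print the largest top-level entries first.
    for entry, lines in count_nonblank_lines(".").most_common():
        print(f"{lines:8d}  {entry}")
```

Sorting by count gives a quick view of where the bulk of the code lives, which is the same first impression the per-directory cloc runs are after.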

hyperpape

I'm not well enough versed in this literature to quote it, but one important caveat: last time I looked, the correlations between lines of code and complexity (iirc, I was thinking of defect rates) had been validated within a language, but not across multiple languages. That is, each language has a rough correlation between LoC and defects, but those correlations had not been demonstrated to be similar between languages.[0]

But even ignoring empirical research, just think about it. 5 lines of APL/J/K code and 5 lines of Java do not have the same complexity. I guess you could theoretically do some research of sufficient quality to convince me otherwise, but building an experiment that was robust enough to overcome the contrary intuition is like...the final boss of empirical software engineering research.

[0] If you have citations for or against this, I'd love to hear it.

alchemmist

Interesting take. Feels like this reframes LOC from a “productivity metric” into more of a liability metric — which aligns with the idea that every line adds maintenance cost and surface area for bugs.

What I found especially compelling is how this implicitly shifts focus toward outcomes instead of output. Counting lines (human or AI-generated) seems increasingly meaningless if what actually matters is time-to-value and system behavior.

Curious how people here think about this in practice: do you actively optimize for less code, or is it more of an emergent property of good design?

enobayram

So, I should remove type annotations and tests and write if(cond) stmt; to reduce the complexity. Got it.

apromixately

I don't have any links but afair loc has also not done well as a measure of complexity.