Can We Measure Software Slop? An Experiment

4 points by pscanf


xq

I like the idea but the algorithm utterly fails for my repositories.

Ashet OS got a slop score of 3.3/5, even though the first larger AI-engineered part was only merged last week. The rest of the codebase is just a lot of manual work, with huge commits (merge requests typically in the ±10 kloc range, and commits around +1 kloc).

Seems like I'm working like a robot 🤖

zig-args even gets a score of 4.7, but has never seen AI at all.

kristall got a 0.3, which is fairly accurate (no AI used at all there either).

For blade it works accurately, in the sense that I'm just supervising the test suite, not the code itself. I personally wouldn't count that as slop, though, since I make sure the test suite is tight and the code coverage is high.