The Codebase of a National Lab I Worked At
10 points by MiraWelner
Either I have Stockholm syndrome or this is not really too bad. The Java codebase I work on is 25 years old and also has large areas without test coverage (smaller, recently!), was only updated from Java 8 to 11 last year, and had portions generated with a JavaCC version from the 90s that could no longer be found anywhere. Tens of thousands of IDE warnings are not a big deal. At a certain point, once all the original developers leave, you start to be concerned about minimizing diffs to preserve the ability to bisect and find which change introduced a bug, so large changes such as reformatting or fixing IDE warnings become untenable. This is a sort of local maximum of development efficiency which can really only be escaped by investing in writing truly staggering quantities of unit tests, so that newly discovered bugs can be analyzed by writing another test instead of bisecting through changes. And we are doing that, but it just takes a while.
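For anyone who hasn't had to bisect before, here is a minimal sketch of the idea in Python. It is not our tooling, just an illustration: the commit names and the is_bad check are hypothetical, and it assumes the history is linear, the oldest commit is good, the newest is bad, and every commit after the culprit is also bad.

```python
from typing import Callable, Sequence


def first_bad_commit(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest commit for which is_bad() is true.

    Assumes commits are ordered oldest to newest, commits[0] is good,
    commits[-1] is bad, and once a commit is bad all later ones are too.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] is good, commits[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid  # the culprit is at mid or earlier
        else:
            lo = mid  # the culprit is after mid
    return commits[hi]


# Hypothetical history of 16 commits where the bug appears in "c11".
history = [f"c{i}" for i in range(16)]
checked = []


def is_bad(commit: str) -> bool:
    # Stand-in for "build this commit and see whether the bug reproduces".
    checked.append(commit)
    return int(commit[1:]) >= 11


culprit = first_bad_commit(history, is_bad)
print(culprit, "found after checking", len(checked), "commits")
```

The search only narrows things down to one commit; you still have to read that commit's diff. If the culprit turns out to be a reformat-everything change, the diff tells you almost nothing, which is why small diffs matter so much here.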
Me writing all that was just an indicator of getting old, though. I remember coming out of college and hating warnings. They are simple and easy to fix without understanding the codebase, which seems overwhelming by comparison. Fixing all -Wall compiler warnings in a large C++ project was my first task upon joining Azure out of undergrad. Some day all the code you write will, unchanged, become riddled with IDE warnings as language capabilities shift over time. And a new grad will ask how you can tolerate it.
I encountered C code at CERN so riddled with UB (in 100 LOC I immediately saw around 8 instances) that it was hard to trust, and the researchers were not willing to fix it since it produced “correct” results (more like the results they wanted to see). I’ve seen generated autotools files committed into git, with additional commits putting things like “My Laptop” into the committed files. The worst shell scripting I’ve ever seen. All the tooling required custom setup files per PC instead of using the package manager, the env, etc. And that’s just the first few things I remember.
This here is just pretty eh.
I agree that the codebase isn’t bad for like… a standard Java codebase. However I would have expected that a high-stakes nuclear research facility would have had a better codebase? I’m just surprised that bad code is this pervasive.
One thing they don’t really teach you in undergrad is that old code is usually good code, even if it doesn’t have tests and comes with loads of compiler warnings. Time, ultimately, is the greatest judge of quality and code that has been battle-tested and found/refined to withstand the rigor of real-world use is good code in almost every way that matters. For empirical support of this sensibility you can look at Google’s RIIR experience reports where they talk about targeting new C++ code for Rust rewrites, because the bug density in old code is much lower.
Google’s experience is based on their old code being maintained to a high standard. Google’s old code is not like the kind of old code that developers are scared to change.
Old code in C or C++ that has lots of warnings is probably also riddled with undefined behaviour that a compiler upgrade is likely to change from latent bugs into actual bugs.
I would expect old Java code to be much less problematic, but on the other hand there must be some reason why upgrading from Java 8 is so difficult. If old code can’t run on new systems it must be approaching the other end of the bathtub curve.
Upgrading from Java 8 to Java 11 took me (the intern) and my supervisor (a full time programmer) three months. It was HARD.
battle-tested and found/refined to withstand the rigor of real-world use
You are not describing research code. Scientists are generally concerned with publishing their next paper, not with maintaining software which often has no users at all outside the lab. Code is a means to an end, and usually seen as somewhat incidental to the business at hand. Even computer scientists tend to write sloppy, throwaway research prototype code, in my experience.
having grown up near such a facility, and knowing how fast and loose they tend to play with radiological materials, let alone perl scripts, i’m utterly unsurprised, but i guess for people without prior exposure it would be a shock, yeah
ime most processes in most organizations, code or otherwise, are held together with duct tape and prayer to varying degrees of literalism
You should try looking at the NumPy internals!
It’s better than it used to be though…
Why on Earth did I think bash was a good idea?
Mistakes like that are a natural part of learning - self-compassion can lead to a broader understanding. If the author made this “mistake,” surely others have too.
What’s encouraging is the direction of the codebase. The fact that management was investing in upgrading Java and encouraging testing shows they were trying to improve things. It’s not perfect, but progress matters. The bigger concern would be if the author’s efforts were met with indifference or resistance.
You might as well just give the name of the lab, because I would expect anybody who’s worked at one to figure out the answer anyway (I had made my guess early, and confirmed it by the halfway point). That, or obfuscate the post more (message me and I’ll tell you what to obfuscate). If you told me the facility had initially used Ada, I think I’d even know what building you were talking about. Edit: So there’s not really a lot of point being coy about the facility & project if you just list it explicitly on your homepage.
Anyway,
It is also possible that the lab’s focus is not programming, and that the programming only exists to support the amazing scientific research taking place at the facility, so code quality takes a back seat
This is the answer. The programmers at these labs were, for a long time anyway, physicists first and foremost. You were working at a science lab, not the engineering lab, so I might speculate that the focus on solid software development would be even less of a priority, but then I’ve always been biased toward the engineering lab.
My experience of public sector software development is that it’s significantly affected by the inability to pay developers market rates, and the inability to hire the number of developers needed to maintain their codebase. There are other pathologies, but these are the biggest.