Eight years of wanting, three months of building with AI
82 points by cgrinds
This is one of the best long-form pieces about serious, professional-quality agentic engineering I've seen so far. Plenty of great stuff on where AI falls short, too.
The takeaway for me is simple: AI is an incredible force multiplier for implementation, but it’s a dangerous substitute for design. It’s brilliant at giving you the right answer to a specific technical question, but it has no sense of history, taste, or how a human will actually feel using your API. If you rely on it for the “soul” of your software, you’ll just end up hitting a wall faster than you ever have before.
Can you maintain that sense of judgment & taste, once you have completely removed yourself from the process as the creator?
"once you have completely removed yourself from the process" does not line up with this bit:
More importantly, I completely changed my role in the project. I took ownership of all decisions and used it more as "autocomplete on steroids" inside a much tighter process: opinionated design upfront, reviewing every change thoroughly, fixing problems eagerly as I spotted them, and investing in scaffolding (like linting, validation, and non-trivial testing) to check AI output automatically.
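The "scaffolding" idea in that quote can be sketched as a tiny automated gate: run every configured check over an AI-produced change and reject it if any fail. A minimal sketch in Python (the `ruff`/`pytest` commands below are placeholder examples, not the article's actual tooling):

```python
import subprocess

def check(cmd: list[str]) -> bool:
    """Run one check command; True means it passed."""
    try:
        return subprocess.run(cmd, capture_output=True).returncode == 0
    except FileNotFoundError:  # tool not installed counts as a failure
        return False

def gate(checks: dict[str, list[str]]) -> list[str]:
    """Run every check; return the names of the ones that failed."""
    return [name for name, cmd in checks.items() if not check(cmd)]

# Example usage (hypothetical commands):
# failed = gate({"lint": ["ruff", "check", "."], "tests": ["pytest", "-q"]})
# if failed:
#     raise SystemExit(f"AI output rejected by: {', '.join(failed)}")
```

The point isn't the specific tools but the shape: a single pass/fail verdict you can run on every AI-generated change before it gets merged.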
True - it wasn't a critique of the article, but a general pondering on the limitations of these tools, since to control them you still need to understand it all (or most of it) :)
Um, I'm confused - I feel the article addresses your question more than once throughout, doesn't it? (To anyone reading this comment who didn't read the article yet, I strongly recommend you do - I consider it one of the best on the topic I've seen since I finally started educating myself on "LLM AI" recently, one of the few actually insightful, useful, and balanced ones. But if you still really don't want to read it, in the shortest terms, the answer as I understand it is: no, you can't.
Thus, as the author also concludes, he needed to consciously re-involve himself in the code in a deliberate way, e.g. through patiently and deeply reviewing it. Yes, that would change the character of our job to some extent. It makes me think back to a thought I had a year or more ago, that our job may go in a direction somewhat resembling airline pilots': they "fly" the airplane much less than in the pioneering days of aviation, and most of the time monitor it and gently direct. But they still do fly it sometimes, notably and sometimes infamously whenever conditions and circumstances get tricky. And thus they also need to train themselves deliberately outside regular flights. Which, I suspect, our management layers will have to realize sooner or later - maybe at least the smarter among them already have? What's been some relief to me recently is that I still find those tricky situations very fun to get involved in.)
Wow.. A blog post about AI-assisted coding that's not hyperbolic but reasonable, based on actual experiences of a person who's already a senior programmer (as opposed to, e.g., tech CEOs or other less-senior programmers who can't write fizzbuzz but are still so sure that coding is dead), with upsides and downsides of the process, and insights you can trust..
I really like the whole process written down. It has a lot of overlap with my experience reimplementing some proprietary macOS tools. Only one comment: if you think you may be running into the issues from the "No sense of time" section, there are frameworks which can preserve past decisions and observations, which should help with this. I like https://deciduous.dev/ but there are many others.
If we accept that, as some people argue, we should completely reject generative AI for ethical reasons, then I wonder if working with a junior developer, who brought fresh ideas and energy to the problem (particularly if the junior developer were working on it full-time), could have had a similar effect to using an LLM, or better. Of course, setting up a system where a good junior developer can be brought onto the project and fairly paid is another problem.
The biggest difference is that a junior is a commitment, not an experiment. Someone can try out an agent for a single project like this, then move on. A junior doesn't just go away when the project is over.
A junior doesn't just go away when the project is over
The junior may eventually grow to become a senior, a thing an LLM definitely will not be doing. So I would characterize that as an "investment" rather than just a one-sided "commitment".
But towards the end of 2025, the models seemed to make a significant step forward in quality.
In my circles we call it the "Opus 4.5 moment", where a model crossed over to being "good enough" for a meaningful surface area of your day-to-day. I noticed a real shift in the group of engineers I interact with. It feels like this moment hit large engineering teams early this year.
Another thing that shouldn't be discounted is that the harnesses are getting better. In November I had a bunch of terminals open - this was all pre-Claude Code Web. Now I manage numerous streams through CCW and tooling. The other shift was organizing our codebase and tools to be AI-first.
None of these are dramatic individually. But they stack and that's been a huge change for me.
December was the first month where I hand-typed less than 10% of the code I shipped, and by January I was consistently able to work on 3x as much while shipping output that was comparable in quality. It was genuinely a sea change. But then I quickly hit the next limit: it isn't shipping code, it's deciding what needs to be built and how.
Great article sharing how a professional programmer adapts to using the new tech. For myself, I seem to prefer AI (I use the free setup by opencode) for:
a) prototype
b) seed code, especially boilerplate stuff (adding a dependency, incorporating a library into new code, adding unit tests)
c) an implementation (most likely a synthesized copy from another programming language) of a system developed for a public protocol/specification
d) research
Basically, the rest I am still doing myself. So I have the added task of validating the AI-generated code, and the added task of learning the AI tooling specs/processes, but I save on dealing with boilerplate, and I have a better prototyping partner that does not ask for much. :-)
I do not ask AI how to solve a problem, but mostly how to implement something that I am pretty sure has already been implemented (maybe in another language, etc.), and then I write unit tests for that.
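As a concrete (hypothetical) instance of that workflow: ask the model to port a well-known algorithm - say, Levenshtein edit distance, which has reference implementations in many languages - and then validate the draft with hand-written unit tests against known answers rather than trusting it on sight:

```python
# AI-drafted port of a well-known algorithm (illustrative example).
def levenshtein(a: str, b: str) -> int:
    """Edit distance via the classic two-row dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]

# Hand-written checks against answers I already know:
assert levenshtein("kitten", "sitting") == 3
assert levenshtein("", "abc") == 3
assert levenshtein("same", "same") == 0
```

The tests are the part I write myself; they pin the behavior down regardless of how the generated implementation is structured.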
--
I would have a question for the author, though: why did they not consider enhancing/contributing a SQLite parser to https://github.com/tobymao/sqlglot ? Is that because of the Python-to-Rust / C runtime misfit?