Early Observations from Interviews with Engineering Teams Adopting AI
9 points by jonathannen
On one side, teams that treated AI as a catalyst for rethinking how they build software. They've restructured codebases, changed review processes, rebuilt deployment pipelines, and invested heavily in shared learning. These teams are getting genuine lift.
You can do this without AI! Just like writing decent documentation!
Sometimes I just want to scream at people.
You can, but the amount of effort may be the difference. There are some useful restructures that I can give to the LLM and verify in under an hour, but they'd take me a week or more to do. Things like that depend on the codebase of course, but some of those improvements are only theoretical without the help - nobody will pay for that amount of time.
I don't think people in general are very good judges of effort or efficiency when it comes to this stuff.
At work we piloted an AI rollout for a select few teams. One of the teams selected is a team I manage, so I gave them a refactoring task just like the one described above: things people had either put off or just not put much thought into before. The team had to record their usage of the AI and also assess their own productivity at the end (the people selected were already extremely familiar with LLM usage, as many of them had used it heavily previously or in side projects).
The team members spent around 2 business days (~16 hours) between prompting, tweaking, reviewing LLM output, repeating, etc. This does not include any time the agents were working alone, only the prompting and reviewing (i.e. active developer time). At the end they had a pretty decent-sized PR with the majority of the refactorings included. They were thrilled and assessed themselves as far more efficient than they would otherwise have been without the AI assistance; they called it a massive success.
However, when the pilot kicked off, I also completed the refactorings on the side without any AI usage (also with no prior inner knowledge of this particular codebase, which is in contrast to the team who works in it every single day). I completed the refactorings in about 1.5 hours.
Take that anecdote for what you will. Is this indicative of all AI usage? Absolutely not. But it's definitely not the first time I've seen people overestimate their gains (or, put differently, underestimate their human potential).
I think those are quite different situations though. You had a team communicating, likely making decisions together, reviewing proposed changes. Of course some tasks are going to take longer than a single person making their own decisions and just going with their idea of what the end result should look like. Also, it sounds like that 1.5h doesn't include someone reviewing the change and checking for the unknowns you weren't familiar with.
We'd need a much stronger test for this.
I think you're making a lot of presumptions. Just for completeness' sake: I didn't include PR review time by the team, and 12 of the hours I listed above were a single engineer prior to involving any team members at all. You're right, it's not a perfect 1:1, but it's close (and you'll have to trust me on those details and that I'm fully aware of the variables you listed), and I do think it aligns with many other anecdotes I've experienced of people overestimating their productivity with AI. Again, this isn't indicative of all LLM usage; I've seen and read about people whose judgement I really respect and trust say that it's been a boost for them (many on this very site!).
Time and time again, the technological fashion of the moment is used by some teams as an excuse to perform a large-scale refactoring and reorganisation they had planned out but could never get around to.
In such cases the nutritional content of an axe might end up irrelevant to the properties of the resulting porridge, though. ( https://www.longlongtimeago.com/once-upon-a-time/folktales/axe-porridge/ )
One team had spent weeks on an elaborate agent pipeline and was still shipping fewer PRs than before they started.
Yes, and...?
I'm not sure if there's just a whole lot to read between the lines here, but these types of statements don't make sense in a discussion about net productivity. You can talk about "shipped PRs" all you want but this is a useless metric.
Most of this post seems to be similar assertions that require some foundational beliefs that I just don't seem to share about what makes a project actually move forward and evolve as it needs to.
I'm not really making a claim about PRs as a metric. In that case it also happened to be one the team themselves were tracking.
Most of this post seems to be similar assertions that require some foundational beliefs...
I assume you mean the use of AI in general? The post is about teams that have already decided to adopt AI, so there are some assumptions baked in.
For these teams productivity gains were the driver. That said, your more general point about being more explicit and drilling in on goals/metrics is fair -- I’ll likely include that next time.
For these teams productivity gains were the driver.
I believe your article lacks a definition of productivity beyond the shallow metric of shipping more PRs, which I understand is what the parent comment is pointing at. The foundational belief here is that integrating AI successfully is enough for a team to be productive, but it may be more a case of Goodhart's Law if you can't tell me what value is derived from it.
To be frank, I can't help but feel that this article is a bit superficial and extremely high level. I am sure they are valid observations. But I am missing the practical link with AI usage here.
The teams pulling ahead invested in process before tooling, created structures for shared learning, and gave their engineers time to adapt.
This has always been the case. 5 years ago, 10 years ago, 50 years ago, you name it. If there is a new technology, you need to make sure that processes are in place, or at the very least that the foundation to implement new processes and adapt old ones is there.
It is also what is simply not there in a frightening number of organisations. What might be different this time around is that C-suite FOMO is at an all-time high, so teams that always wanted to implement better processes are now given the opportunity to do so.
As a practical example
Engineers are spending serious time turning Claude Code into their own customized hotrod. Custom CLI/MCP servers, elaborate prompt chains, multi-step agent orchestrations. I get the appeal -- but for established codebases you get most of the benefit from a (relatively) simple setup.
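For concreteness, a "(relatively) simple setup" along these lines can be little more than a project-memory file. This is an illustrative sketch only, assuming Claude Code's CLAUDE.md convention; the commands, paths, and rules below are hypothetical examples, not a recommended template:

```markdown
# CLAUDE.md — project context (illustrative sketch; commands and paths are hypothetical)

## Build & test
- `make build` compiles the service; `make test` runs the unit suite.
- Run `make lint` before proposing any diff.

## Conventions
- Follow the existing module layout; don't introduce new top-level packages.
- Prefer small, reviewable changes over sweeping refactors.

## Boundaries
- Never edit generated files or migration history.
```

The point is that a short, accurate statement of how the project builds, tests, and reviews code often captures most of the value, without custom MCP servers or multi-step orchestration.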
When pipelines first started to take off, as opposed to manual deployments, you could have written the very same sentence. You would just need to replace the MCP examples with examples of engineers who abused Jenkins' plethora of options to over-engineer complex, fragile solutions, Grunt files for JavaScript projects almost as complex as the project itself, etc.
If you don't stop to think about the process, any tool implementation is going to cause you issues. You are absolutely right, I wish company execs would realize this instead of jumping on the FOMO hype train. But I also have been wishing that for years now and mostly see the same patterns over and over again. So I am glad that some teams at least get to improve their process.
When pipelines first started to take off, as opposed to manual deployments, you could have written the very same sentence. You would just need to replace the MCP examples with examples of engineers who abused Jenkins' plethora of options to over-engineer complex, fragile solutions, Grunt files for JavaScript projects almost as complex as the project itself, etc.
This reminds me strongly of Kubernetes vs. just running plain Docker containers, or even cloud VMs. In every situation there's stuff in Kubernetes that you'll miss out on by using plain Docker, but if you can slash the complexity and get 80-90% of the benefit, the benefit you're leaving on the table is probably not worth the complexity of k8s. But the complexity is a hidden cost, and the lack of benefits with Docker is a visible cost.
To be frank, I can't help but feel that this article is a bit superficial and extremely high level. I am sure they are valid observations. But I am missing the practical link with AI usage here.
In my view the biggest differentiator is the pace. I've not seen a shift of this scale and speed in my career. Some teams have been grappling with this for a year or more, but most are trying to shift three gears at once, having started just a couple of months ago.
It has many of the characteristics of "any new tool", and I think your point is well made. And I hesitate to turn this into an "is AI hype or not" debate, because I think our points here are a layer down from that. But I would say, rightly or wrongly, it's hitting teams like a meteor.