AI Coding (II)
A lot of folks are asking if AI makes developers more efficient, but we'd have to know how to measure efficiency first.
After spending a weekend on my own coding with Claude, I've been using it a lot more at work. Sometimes I get good results out of it. Other times it gets tunnel vision and circles a problem over and over, making no progress while chewing through "tokens."
I've been trying to pay close attention to my own workflow and thinking while using Claude. I've noticed a tendency to slip into a mode where I ask Claude to do very basic things I could do much more quickly myself. Which leads us to questions about efficiency.
The widely discussed METR study says that developers took 19% longer to complete tasks when they were allowed to use AI than when they were not. Well, at least the top-line number is discussed a lot. Once you dig into the study you can see there are tons of caveats. The methodology and small sample size also make it difficult to draw any solid conclusions.
But is it even the right question? Any study that examines whether AI coding improves efficiency is going to run into other unresolved problems:
- We don't know what efficiency means in the context of software development.
- Developers are not good at estimating.
The question rests on a manufacturing / throughput mindset that is not well suited to software engineering. I think the METR study takes a throughput approach, calculating time savings by comparing against time estimates.
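To make the estimation problem concrete, here is a minimal sketch of what happens when savings are computed relative to an estimate. It is not METR's actual method or data; the function name `apparent_savings` and all of the numbers are made up for illustration. The task takes the same 60 minutes in every case, and only the estimate changes.

```python
def apparent_savings(estimate_min: float, actual_min: float) -> float:
    """Fraction of the estimated time that appears to have been saved.

    Hypothetical helper for illustration only, not taken from any study.
    """
    return (estimate_min - actual_min) / estimate_min


# Hypothetical task: it really takes 60 minutes, no matter who estimated it.
actual = 60.0

# Three made-up estimates for that same task.
for estimate in (45.0, 60.0, 90.0):
    savings = apparent_savings(estimate, actual)
    print(f"estimate {estimate:.0f} min -> apparent savings {savings:+.0%}")
```

The work is identical in all three rows, yet the apparent result swings from a loss of about a third to a gain of about a third. Under these assumptions, the measurement says more about the estimate than about the tool.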
I just don't think you can answer the question of developer efficiency. At least, not this way.
AI changes the work
I think a lot of people assume that we are going to do the same work more quickly, but AI is a material change in how we work. That change is happening in the context of corporate cultures and professional norms constructed around assumptions that may no longer be valid.
What I don't think has changed is the basic observation made by Frederick Brooks in No Silver Bullet: the hardest part of building software is deciding what to build.
The thing with AI is you can never really be sure where the gains or losses in speed come from.
If the work proceeds more quickly, maybe it is because the AI speeds things up. Or maybe it's because we choose to skip modularizing code, respecting existing architecture decisions, creating a thoughtful and accessible UI, or other professional practices.
If the work proceeds more slowly, maybe it's because the AI is an impediment. Or maybe it's because we choose to spend that time improving the architecture, exploring and comparing different solutions, writing more exhaustive tests, or creating a stronger, more accessible UI.
Speaking from my own experience, I could have shipped code much more quickly with AI, but instead I spent more time going over the code in multiple passes, encouraging the AI to better modularize the result, much as I would have done with my own code.
Could I have written any of that more quickly without AI? I am confident that I would not have written it at all. In this case, the feature was in an unfamiliar codebase, in a language I do not know and a problem domain I rarely visit. I am not a full-stack developer, but with AI assistance I can credibly play one on TV.
What we should look for
"Can we write code more quickly" is not the right question to ask. Instead, I think the questions to ask are:
- Can more team members contribute across a broader range of problems?
- Are we delivering more features that our customers actually use?
- Is our code better than it was before?
- Are our developers less susceptible to burnout?
To be clear, the answers to these questions probably have as much to do with business culture as they do with AI use. A company that prioritizes developer efficiency will doubtless produce more volume, but I'm not sure "writes bad or pointless code ten times faster" is an accomplishment to celebrate.