Claude Code Changed How My Team Writes Software
It started with three volunteers. In September 2025, I gave three engineers on my 15-person team access to Claude Code and asked them to use it for all their development work for 30 days. No restrictions on how they used it. The only requirement was that they track their time, log their workflows, and be honest about what worked and what didn't.
By day 45 the entire team was using it. By day 90 our development workflow had fundamentally changed in ways I didn't predict. Some of the changes were what you'd expect. People wrote code faster. But the second-order effects — on team dynamics, on the role of senior engineers, on how we onboard new hires — those surprised me.
What We Were Starting From
My team builds an enterprise data platform. The codebase is around 340,000 lines across Python, TypeScript, and Go. Fourteen microservices, a React frontend, some internal CLI tools. Fifteen engineers: four seniors, six mid-level, five juniors. Spread across Pune and Bangalore with a few remote people.
Before the experiment, our workflow was conventional. Engineers used VS Code or JetBrains. Some had GitHub Copilot for autocomplete. PRs went through standard review with at least one senior reviewer. Our median PR cycle time was a bit over three days. Median PR size was around 280 lines changed. New engineers typically took four to six weeks before making their first meaningful production contribution.
We weren't slow by industry standards. But I'd been watching the AI-assisted development space closely and suspected we were leaving significant productivity on the table.
The First Thirty Days
I picked the three volunteers deliberately: one senior, one mid-level, one junior. I wanted to see how the tool performed across experience levels.
The first week was messy in instructive ways. The senior engineer immediately tried to use Claude Code for complex architectural refactoring and hit context window limits. The junior engineer used it for everything and accepted code uncritically — I had to have a direct conversation with him about that around day ten. The mid-level engineer found the right groove fastest: she used it for scaffolding new features, writing test suites, and understanding unfamiliar parts of the codebase.
By week two, patterns settled. The senior stopped trying to feed the entire codebase in and started using Claude Code for targeted tasks — "refactor this specific function to use the new error handling pattern," "review this PR and identify potential issues," "write integration tests for this endpoint." More surgical. The junior engineer, after reading generated code more carefully and asking Claude Code to explain its decisions, got genuinely better at the tool.
At the end of month one, measured against their own baselines, the three volunteers looked like this: PR velocity was up about 40%, median PR size had dropped from 280 lines to around 190 (more atomic changes), and test coverage on new code went from the low 60s to over 80%. Time spent on boilerplate roughly halved.
The Skeptic
When I opened it up to the full team around day 30, one of the senior engineers pushed back. He'd been watching the volunteers from a distance and wasn't convinced. His concern wasn't the tool's capability — he'd seen the numbers — it was what he called "learned helplessness." His argument: when engineers stop struggling through hard problems manually, they stop building the intuitions that make them useful in a crisis. AI tooling might be optimizing for short-term velocity at the cost of long-term engineering judgment.
I didn't dismiss this. I still think about it. What we decided was to treat it as a real risk worth managing rather than a reason not to adopt. We added a rule: if you can't explain every line of your PR during review, it's not ready to merge. We also started having Claude Code add detailed inline comments to generated code, which helped engineers understand what they were accepting.
The senior engineer eventually started using it, though more reluctantly than the rest of the team. He's actually become one of the most thoughtful users of it, which I think proves his own point in a weird way — the engineers with the strongest foundations are also the ones who can most effectively direct the tool.
What Changed About How We Work
The most significant shift happened in month two. Engineers started using Claude Code not just for writing code, but for understanding code. Before, when someone needed to work on an unfamiliar service, they'd spend hours reading source files, tracing call paths, asking colleagues questions. Now they'd point Claude Code at the relevant files and ask: "Explain how the authentication flow works in this service" or "What happens when a webhook event is received by this handler?" The time-to-understanding dropped dramatically.
This had a cascade effect. Engineers started taking on tasks outside their usual area of the codebase. Cross-team contributions that had been rare because of ramp-up cost became common. One of our frontend engineers fixed a bug in a Go microservice because Claude Code helped her understand the Go code quickly enough that it was faster than waiting for a backend engineer to pick it up.
By day 60, senior engineers had started spending more time on design and review and less time writing production code. Not because they were less productive, but because their time was worth more elsewhere. One of them put it plainly: "I used to spend maybe 60% of my time writing code. Now it's closer to 20%, and the team ships more than it did before." The seniors had become force multipliers instead of individual contributors. That's the real story, not the raw velocity numbers.
A few workflow patterns became standard and stuck:
For any new feature, the engineer describes what they need to Claude Code and lets it generate the initial structure — file layout, function signatures, type definitions, basic implementation. Then they manually refine the business logic, edge cases, and integration points. Claude Code is excellent at structure and mediocre at domain-specific business logic. Playing to those strengths.
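To make that split concrete, here's a minimal sketch in the spirit of the workflow. Everything in it is hypothetical (the names, the types, the validation rule); the point is only that the structure is the kind of thing the tool generates well, and the domain rule is the part an engineer fills in and reviews by hand.

```python
from dataclasses import dataclass

# Hypothetical scaffold: the dataclass and the function signature are the
# sort of structure Claude Code generates first.
@dataclass
class ExportRequest:
    dataset_id: str
    fmt: str  # expected: "csv" or "parquet"

def validate_request(req: ExportRequest) -> bool:
    # Hand-written domain rule, filled in after the scaffold was generated.
    return bool(req.dataset_id) and req.fmt in ("csv", "parquet")
```

The boundary matters more than the code: generated structure gets accepted quickly, hand-written business rules get the careful review.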
Engineers started writing tests before implementation more frequently, because generating tests with Claude Code became nearly effortless. When that cost drops toward zero, TDD adoption happens almost on its own. Our test coverage on new code went from around 60% to about 80% over the 90-day period.
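As a sketch of what that looks like in practice (the function and cases here are hypothetical, not from our codebase): the pytest-style tests are the near-free-to-generate part, and in the test-first flow they exist before the implementation does.

```python
def normalize_event_type(raw: str) -> str:
    """Map a raw webhook event name like 'Order.Created' to 'order_created'."""
    return raw.strip().lower().replace(".", "_")

# Tests like these are cheap to generate; in the test-first flow they are
# written (and reviewed) before the implementation above.
def test_normalizes_case_and_separator():
    assert normalize_event_type("Order.Created") == "order_created"

def test_strips_surrounding_whitespace():
    assert normalize_event_type("  User.Deleted ") == "user_deleted"
```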
PR descriptions improved a lot. Engineers started running their diffs through Claude Code to generate first-draft descriptions. The quality went up — more context about what changed and why, potential risks, testing notes. Reviews got faster because reviewers had better context going in. Small change, outsized impact on review velocity.
The Numbers After Ninety Days
Comparing the 90-day experiment period to the 90 days before it, across the full team:
PR velocity went from around 24 PRs merged per sprint to roughly 38 — something like a 60% increase. Median PR cycle time dropped from just over three days to under two. Median PR size went down from 280 lines to around 165 — more focused, atomic changes. Bug rate on new code stayed roughly flat, which was the metric I watched most closely. Faster output without increased defects was the goal; we got that. Test coverage on new code went from about 58% to around 80%.
The onboarding number was the one that genuinely surprised me. We brought on two new engineers during the experiment. Both made production contributions in their second week. Previous expectation for a codebase of this complexity: four to six weeks. That alone probably justifies the tooling cost.
What Doesn't Work Well
Our 340,000-line codebase doesn't fit in any context window. For tasks where you need the model to understand cross-cutting concerns across many services simultaneously, Claude Code provides incomplete or incorrect suggestions because it literally can't see enough of the system. For those tasks, a senior engineer manually curates the relevant context and feeds it in structured form. It works, but it's manual and time-consuming. Better tooling for large codebase navigation is the biggest gap I see in the current AI-assisted development ecosystem.
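The curation step itself is nothing fancy; it amounts to something like this sketch, which concatenates a hand-picked set of files with path headers so the result can go into one structured prompt. The value is entirely in which paths the senior engineer picks, not in the script.

```python
from pathlib import Path

def build_context(paths: list[str]) -> str:
    """Concatenate hand-picked source files, each labeled with its path,
    into a single structured blob suitable for one prompt."""
    sections = []
    for p in paths:
        body = Path(p).read_text(encoding="utf-8")
        sections.append(f"### File: {p}\n{body}")
    return "\n\n".join(sections)
```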
Cost adds up. At scale — fifteen engineers using it actively throughout the day with large context windows for a big codebase — we were spending roughly $3K per month on API costs. That's about $200 per developer per month. The ROI math is overwhelmingly positive if you're comparing it to engineer productivity, but I had to make that argument explicitly to get the budget approved. A lot of engineering leaders are still treating AI tools as discretionary perks rather than essential infrastructure.
The same prompt doesn't always produce the same code. Two engineers working on similar features may get structurally different implementations. We mitigated this with a shared project context file with architectural patterns, naming conventions, and preferred approaches. But it requires maintenance — every time we establish a new pattern, someone needs to update the context file.
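Ours is a Markdown file in the repo that Claude Code picks up automatically (it reads a `CLAUDE.md` at the project root). The entries below are illustrative rather than our actual conventions; `AppError` and the `handle_<event_name>` pattern are made-up examples of the kind of thing that belongs in it.

```markdown
# Project conventions (excerpt)

## Error handling
- Wrap service-layer errors in `AppError` with an error code; never let
  bare exceptions cross service boundaries.

## Naming
- Event handler functions: `handle_<event_name>`, snake_case.

## Testing
- New endpoints require an integration test alongside unit tests.
```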
What Changed That I Didn't Expect
Documentation improved dramatically. Engineers who previously never wrote documentation started producing it because Claude Code made it easy. "Write documentation for this module" takes 30 seconds of prompting and produces a reasonable first draft. Our internal documentation coverage went from roughly 30% of modules to something like 70% over the course of the experiment.
We changed how we interview. We stopped asking candidates to write code on a whiteboard. If our own engineers use AI to write code daily, testing candidates without it is testing an irrelevant skill. We moved to system design and problem decomposition. Can this person break down a problem, evaluate AI-generated code critically, know when to trust the model and when to override it? Claude Code handles implementation; we need engineers who can direct it.
Sprint estimation got more accurate. Engineers started using Claude Code to prototype during estimation sessions — spend 20 minutes having it scaffold a feature, see how complex the implementation actually is, estimate from that concrete starting point instead of guessing.
Six Months In
Claude Code is now as fundamental to our workflow as Git. I can't imagine going back, and I don't think any of my engineers would want to.
But I want to be careful not to tell a simple success story. The tool changed what it means to be a software engineer on my team. Senior engineers are primarily designers and reviewers now. Mid-level engineers tackle work that previously required senior involvement. Junior engineers are productive faster but risk developing shallower understanding of fundamentals — the concern my skeptical senior raised hasn't gone away; we're just actively managing it.
My role changed too. I spend more time thinking about code quality frameworks, review processes, and developer growth plans. Less time worrying about velocity, because velocity is no longer the constraint. The constraint now is judgment: ensuring the team makes good decisions about what to build and how to build it, even as the cost of building drops.
If you want to try this: start with three volunteers. Give them 30 days. Measure everything. Be honest about the challenges. You'll be surprised by what you find — and not all of the surprises will be positive ones.
Updated January 10, 2026