Intuition to Evidence: How We at TATA 1mg Measured AI's Real Impact on Developer Productivity
The Reality Check We've Been Waiting For
Posted by the DeputyDev Team
7 minute read
Everyone's talking about AI coding tools. GitHub Copilot promises faster development. Cursor claims to revolutionize how we code. But here's the uncomfortable truth: most studies evaluate these tools in controlled environments with synthetic benchmarks like HumanEval or MBPP.
Real software development is messier. It involves legacy codebases, team dynamics, code reviews, deployment pipelines, and the human factors that make or break adoption.
Our Question: What happens when engineers use AI-assisted development tools in production for an entire year? Not in a lab. Not on toy problems. In real, complex, enterprise software development.
We conducted research to address these questions in our paper Intuition to Evidence: Measuring AI's True Impact on Developer Productivity.
Read the full paper: Intuition to Evidence — Measuring AI's True Impact on Developer Productivity
Duration: 12 months (September 2024 - August 2025)
Participants: 300 software engineers across multiple teams
Tool: DeputyDev — an in-house AI platform combining code generation and automated code review
Methodology: Quasi-experimental longitudinal design with within-subjects and between-subjects controls
Data Sources: Automated metrics, surveys (228 responses), and qualitative interviews (125 engineers)
We didn't just measure lines of code. We tracked PR review times, code acceptance rates, adoption patterns, developer satisfaction, and even calculated the ROI down to the dollar.
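To ground the methodology, here is a minimal sketch of the within-subjects comparison at the heart of the design: the same engineer's shipped output before vs. after rollout. The data shape and field names are illustrative assumptions, not DeputyDev's actual schema.

```python
# Minimal sketch (assumed schema): within-subjects change in shipped LOC
# per engineer, comparing PRs merged before vs. after the AI rollout.
from collections import defaultdict

def productivity_delta(prs, rollout_date):
    """prs: iterable of dicts like
    {"author": str, "merged_at": date, "loc_added": int}."""
    before, after = defaultdict(int), defaultdict(int)
    for pr in prs:
        bucket = before if pr["merged_at"] < rollout_date else after
        bucket[pr["author"]] += pr["loc_added"]
    # Percent change per engineer; skip anyone with no baseline output
    return {
        dev: 100.0 * (after[dev] - before[dev]) / before[dev]
        for dev in before
        if before[dev] > 0
    }
```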
The Adoption Curve That Actually Happened
Month 1: Only 4% of engineers actively used AI tools (the early adopters and the curious)
Months 2-3: Rapid acceleration as success stories spread
Month 6: Peak engagement at 83% (the tipping point)
This isn't just a line on a graph. It represents the real human journey from skepticism to trust, from experimentation to integration into daily workflows.
The Productivity Impact: What Actually Changed
Code Volume Growth Analysis: AI-generated code grew from 3,000 lines in March 2025 to 2.26M lines in August 2025. By August 2025, roughly 40% of code shipped to production was AI-generated, and overall production code volume had risen 28%.
Productivity Gains by Experience Level: Junior engineers (SDE1) achieved the highest productivity increase at 77%, while mid-level and senior engineers each improved by 45%.
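A quick bit of arithmetic on those growth figures, using only the numbers quoted above:

```python
# Quick arithmetic on the code-volume figures quoted above.
march_loc = 3_000           # AI-generated LOC, March 2025
august_loc = 2_260_000      # AI-generated LOC, August 2025
months = 5                  # March -> August

total_growth = august_loc / march_loc          # ~753x overall
monthly_growth = total_growth ** (1 / months)  # ~3.8x month-over-month
print(f"{total_growth:,.0f}x total, {monthly_growth:.1f}x per month")
```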
The Cohort Divide: Engagement Matters
We split engineers into high adopters (top 10%) and low adopters (bottom 10%). The contrast is striking:
High Adopters (n=30): 61% increase in shipped code, 150k AI-generated lines in production
Low Adopters (n=30): 11% decline in shipped code, <200 AI-generated lines in production
The takeaway: AI tools aren't magic. They require engagement, learning, and integration into workflows to deliver value.
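Here is a sketch of how such cohorts can be constructed. The engagement score (e.g. accepted AI suggestions per month) is an assumed placeholder; the paper's exact adoption metric isn't reproduced in this post.

```python
# Sketch of the cohort construction described above: rank engineers by an
# AI-engagement score and compare the top and bottom deciles.
def decile_cohorts(engagement):
    """engagement: dict of engineer -> engagement score (placeholder
    metric). Returns (high_adopters, low_adopters)."""
    ranked = sorted(engagement, key=engagement.get, reverse=True)
    k = max(1, len(ranked) // 10)  # 10% of 300 engineers -> n=30
    return ranked[:k], ranked[-k:]
```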
The Experience Level Effect
Does AI help junior developers more than senior ones? Our data says yes, but not in the way you might think.
| Level | Before AI | After AI | Improvement | AI Accepted | Accept Rate |
|---|---|---|---|---|---|
| SDE1 (Junior) | 80,492 LOC | 142,354 LOC | +77% | 45,849 LOC | 29% |
| SDE2 (Mid) | 79,065 LOC | 114,327 LOC | +45% | 92,127 LOC | 33% |
| SDE3 (Senior) | 7,490 LOC | 10,828 LOC | +45% | 3,897 LOC | 34% |
Junior engineers saw the biggest productivity boost (77%), likely because AI helps them learn patterns and overcome knowledge gaps. But interestingly, senior engineers had the highest acceptance rate (34%), suggesting they're better at curating and refining AI suggestions.
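The Improvement column follows directly from the Before/After columns, which is easy to verify:

```python
# Sanity check: Improvement = (after - before) / before, per level.
levels = {
    "SDE1": (80_492, 142_354),
    "SDE2": (79_065, 114_327),
    "SDE3": (7_490, 10_828),
}
for level, (before, after) in levels.items():
    print(f"{level}: {100 * (after - before) / before:+.0f}%")
# SDE1: +77%  SDE2: +45%  SDE3: +45%
```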
The ROI Reality Check
Total Cost: $46,833 over 5 months (~$112k annualized)
Per Engineer: $30–34 per month
Breakdown: 91.5% LLM API costs (mostly AWS Bedrock), 8.5% infrastructure
ROI: With 31.8% time savings on reviews alone, the payback is measured in hours, not months.
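Here is an illustrative break-even calculation combining the per-engineer cost above with the roughly 20 minutes/day of self-reported savings from the survey below. The $70/hour loaded engineering cost is an assumption; substitute your own.

```python
# Illustrative break-even, not the paper's ROI model.
monthly_cost = 34            # upper-bound per-engineer cost, USD/month
minutes_saved_per_day = 20   # average self-reported saving (see survey)
working_days = 21            # working days per month
hourly_rate = 70             # ASSUMPTION: loaded cost per engineer-hour

monthly_value = (minutes_saved_per_day / 60) * working_days * hourly_rate
payback_days = monthly_cost / (monthly_value / working_days)
print(f"Value ~${monthly_value:.0f}/month vs cost ${monthly_cost}/month; "
      f"payback in ~{payback_days:.1f} working days")
```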
What Developers Actually Think
Numbers tell one story; developer sentiment tells another. We surveyed 228 engineers after five months:
| KPI / Area | Insight |
|---|---|
| Helpfulness of PR reviews | 162 engineers (71%) agree or strongly agree |
| Time saved per developer | ≈20 minutes per day on average |
| Code suggestions accepted | 173 engineers (76%) sometimes or frequently accept code |
| Most-valued capability | "Identifying issues/bugs in code" (151 mentions) |
| AI code suggestion use | 192 engineers (84%) used it in the last 3 months |
| Perceived plug-in helpfulness | 57% say Yes, 30% Maybe |
| Preferred interaction mode | 76% favour Chat mode over Act mode |
| Desire to continue | 93% plan to keep DeputyDev in their workflow |
A few of those results deserve a closer look:
85% Satisfied with AI code review features. Developers want it to keep reviewing their PRs.
62% Approval for code generation features. Lower than review, likely due to early stability issues.
93% Retention — express desire to continue using DeputyDev. The ultimate vote of confidence.
NPS Score: 34 — In the "good" category with 44% promoters, 46% passives, and only 10% detractors. Room to grow, but a solid foundation.
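(For reference, NPS is simply % promoters minus % detractors: 44 − 10 = 34. Passives dilute the score but don't subtract from it.)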
Lessons Learned: What Actually Works
What Worked
Gradual Rollout: Phased deployment allowed iteration based on feedback
Champion Networks: Early adopters became internal advocates
Multi-Agent Reviews: Specialized agents (security, performance, bugs) caught diverse issues
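For context, here is a minimal sketch of the multi-agent fan-out pattern described in that last point. Agent names, prompts, and the llm_review stub are hypothetical placeholders; DeputyDev's actual agent interfaces aren't published in this post.

```python
# Minimal sketch of a multi-agent review fan-out (hypothetical agents).
from concurrent.futures import ThreadPoolExecutor

AGENTS = {
    "security": "Review this diff for security vulnerabilities.",
    "performance": "Review this diff for performance regressions.",
    "bugs": "Review this diff for logic errors and edge cases.",
}

def llm_review(prompt: str, diff: str) -> str:
    raise NotImplementedError  # call your model provider here

def review_pr(diff: str) -> dict:
    """Run each specialized agent over the same diff in parallel and
    collect findings keyed by agent name."""
    with ThreadPoolExecutor(max_workers=len(AGENTS)) as pool:
        futures = {
            name: pool.submit(llm_review, prompt, diff)
            for name, prompt in AGENTS.items()
        }
        return {name: f.result() for name, f in futures.items()}
```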
What Didn't Work
One-Size-Fits-All: Generic models struggled with specialized codebases
Auto-Acceptance: Automatic acceptance of suggestions led to quality issues (quickly disabled)
Over-Automation: Trying to automate too much created resistance
Initial Latency: 2–3s responses were too slow (optimized to <500ms)
What This Means for the Future
This study represents one of the first comprehensive, longitudinal evaluations of AI-assisted development in production. Here's what we learned:
The Bottom Line
AI coding tools work — but only when developers actually use them. The roughly 72-percentage-point gap between high and low adopters (+61% vs −11%) shows that engagement is everything.
Junior developers benefit most — 77% productivity gains suggest AI can accelerate learning and skill development.
Code review is the killer app — 85% satisfaction for review vs 62% for generation. Developers trust AI more to review code than write it (for now).
The cost is negligible — At $30–34/engineer/month, the ROI from time savings alone justifies the investment within days.