Models Have a Context Limit. So Do You. Meet METU.

The paradox nobody wants to talk about

By early 2026, AI assistance has become the default way developers work. Per JetBrains' State of Developer Ecosystem 2025, 85% of professional developers regularly use AI tools, and 51% use them every day. GitHub Copilot hit 4.7 million paid subscribers by January 2026, up 75% year over year. At Anthropic, 70–90% of company code is now AI-generated. At 25% of Y Combinator Winter 2025 startups, codebases are 95%+ AI-written.

And all of this comes with the same strange number attached.

DX, in its study of 121,000 developers across 450+ companies, found that individual developers got 25–55% faster and that AI-authored code now accounts for 26.9% of production code, yet team-level productivity gains have not budged past 10%. Faros AI calls this the AI productivity paradox: individual output is up, business metrics are flat.

Why? One answer seems obvious to me — yet almost nobody frames it this way.

The forgotten context window

When we talk about an LLM's context window, everyone understands it's a limit. Claude holds 200k tokens, Gemini 2M, and even they "get lost" past a certain density of information. It's an engineering constraint — we remember it.

But humans have a context window too. We've just forgotten about it.

Cognitive science has known for nearly seventy years that working memory holds 7±2 items (Miller, 1956). Attention is a resource that depletes. Decision fatigue is real and measurable. When you operate on code you didn't write yourself — which is exactly what happens when AI generates for you — the load of interpretation, verification, and integration falls entirely on your cortex.

And this is where it gets interesting.

A senior developer who used to write 200 lines a day and understood every one of them is now "reviewing" 2,000 lines generated by Claude or Cursor. One engineer in a recent SF Standard piece admitted that since he started writing code with AI, he only understands about half of what he ships. Academic literature already has a term for this: high-functioning burnout, a state where performance holds up while mental reserves quietly erode. BCG and UC Riverside coined another: brain fry, exhaustion from orchestrating AI agents. Faros AI documented that developers on high-AI-adoption teams touch 47% more PRs per day, while review time grew by 91% and bugs per developer by 9%.

This is the human context limit. And we ignore it because our interface with the computer doesn't show context-window fullness the way Claude Code does.

"Permanent underclass" — why it's the wrong term

For the past six months, Silicon Valley has been trading the term permanent underclass. SF Standard ran a feature on it in February 2026; NYT, earlier. One engineer told reporters: "Maybe we'll all just get fired and put into the permanent underclass. I'll fix cars if we end up in the AI apocalypse."

The term gets used in two senses:

Economic — those who can't keep up with AI will lose their jobs.
Cognitive — even an experienced person isn't of "sufficiently high class" to handle the new information flow.

Both senses are harmful, but the second is especially so, because it:

delivers a diagnosis-via-label ("you don't measure up"), not a measurement;
strips agency — it's about your identity, not your current state;
gives you no tool: neither the manager nor the developer themselves can do anything about it.

It's like telling an athlete "you're not top-tier" instead of measuring their VO₂ max.

The most absurd part is that this term fundamentally conflates two different things: experience and cognitive throughput. You can be a brilliant architect with 20 years of experience and still, at 3 AM, fail to handle the stream from ten parallel Claude Code sessions — not because you're "underclass," but because your context ran out.

We need a different metric.

METU — Maximum Effective Token Usage

I propose a term: METU — Maximum Effective Token Usage.

This is the maximum number of tokens one experienced developer can effectively process per unit of time (day, week, sprint).

The key word is effectively. METU doesn't count tokens sent to the API. That would be an idiot metric — spamming prompts is easy. METU counts the full round-trip:

prompt → understanding the response → verification → integration into the system → maintaining the mental model

If a developer generated 50k tokens of code but didn't understand half of it and spent two weeks debugging — their effective token usage is 25k, not 50k. The rest is technical debt accruing interest.
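To make that arithmetic concrete, here is a minimal sketch. It assumes each batch of generated output can be tagged with the fraction the developer actually understood and integrated. `GeneratedChunk`, `understood_fraction`, and `effective_tokens` are hypothetical names used for illustration, not part of any existing tool.

```python
from dataclasses import dataclass


@dataclass
class GeneratedChunk:
    """One batch of AI-generated output (hypothetical record, not a real tool's schema)."""
    tokens: int
    understood_fraction: float  # 0.0 to 1.0, self-assessed or estimated in review


def effective_tokens(chunks: list[GeneratedChunk]) -> float:
    """Count only the share of each chunk the developer understood and integrated."""
    total = 0.0
    for chunk in chunks:
        if not 0.0 <= chunk.understood_fraction <= 1.0:
            raise ValueError("understood_fraction must be between 0 and 1")
        total += chunk.tokens * chunk.understood_fraction
    return total


# The example from the text: 50k tokens generated, half of it understood.
chunks = [GeneratedChunk(tokens=50_000, understood_fraction=0.5)]
print(effective_tokens(chunks))  # 25000.0
```

How to estimate the understood fraction is the open question. The sketch only shows that the metric discounts raw output rather than rewarding it.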

METU as a dynamic metric — the APM analogy

Esports has long had a metric called APM — actions per minute. Top StarCraft players hit 300–400 APM. But this isn't "player class." It's current capacity — it depends on:

sleep and physical state,
stress,
task familiarity,
fatigue across the course of a tournament.

APM can be trained, tracked, discussed. Nobody says "that player's APM is low, he's underclass." They say "his APM dropped today, he got tired by game five."

METU should work the same way.

Then the following sentences — which don't exist in the industry today but are badly needed — become possible:

"My METU was high today — I rested well and closed two hard refactors."
"My METU is maxed out, we need another developer."
"I'm at 60% METU right now, give me reviews, not a new feature."
"This team's combined METU is 1.2M tokens/week, but the pipeline demands 2M. We're under-resourced."

This shifts the conversation from "who can hack it and who can't" to "how much capacity do we have today, and is it enough." That's a huge difference — both ethical and practical.
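The team-level sentence above (1.2M tokens/week against a demand of 2M) is simple arithmetic once per-developer numbers exist. A rough sketch, assuming each developer's weekly METU is already known; the developer names and individual figures are hypothetical, chosen only to reproduce that example.

```python
def team_shortfall(weekly_metu: dict[str, int], demand: int) -> int:
    """Tokens per week the pipeline demands beyond combined team capacity (0 if covered)."""
    capacity = sum(weekly_metu.values())
    return max(0, demand - capacity)


# Hypothetical per-developer METU in tokens/week, summing to 1.2M.
team = {"dev_a": 500_000, "dev_b": 400_000, "dev_c": 300_000}

shortfall = team_shortfall(team, demand=2_000_000)
print(shortfall)  # 800000
```

A nonzero shortfall is the "we're under-resourced" signal. It says nothing about which individual is falling short, which is the point.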

How do you measure METU?

This is the most interesting and still-unresolved part. A few approaches:

Indirect signals from tools:

AI-suggestion acceptance rate (~30% for Copilot on average — drops under cognitive overload).
Review time per PR.
Number of rollbacks and hot-fixes per week.
Bugs per developer relative to the share of AI-authored code.
PR size (up 154% in high-AI-adoption teams — a signal that review bandwidth is becoming the bottleneck).

Self-reporting:

A simple 1–10 scale at the start and end of the day.
A cognitive-load journal (like sleep trackers, but for the brain).

Biometrics (prospectively):

HRV (heart rate variability) — a known proxy for cognitive fatigue.
Pupillometry, consumer-grade EEG headbands.

The point isn't a precise unit of measurement — it's the trajectory. Is my METU dropping 40% from Monday to Friday? Is the team's METU sliding after two weeks of crunch? If so, the signal isn't "buckle down and work" — it's redistribute the load.
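Tracking the trajectory needs very little machinery. A minimal sketch, assuming end-of-day self-reports on the 1–10 scale mentioned above; the readings and the 30% alert threshold are invented for illustration, not calibrated values.

```python
def relative_drop(readings: list[float]) -> float:
    """Fractional decline from the first reading to the last (positive means decline)."""
    if len(readings) < 2:
        raise ValueError("need at least two readings")
    first, last = readings[0], readings[-1]
    if first <= 0:
        raise ValueError("first reading must be positive")
    return (first - last) / first


# Hypothetical end-of-day self-reports, Monday to Friday.
week = [10, 9, 8, 7, 6]
drop = relative_drop(week)

ALERT_THRESHOLD = 0.30  # assumed cutoff, not an established value
print(f"drop: {drop:.0%}")  # drop: 40%
if drop >= ALERT_THRESHOLD:
    print("redistribute the load")
```

The same function works on any of the indirect signals, as long as higher readings mean more capacity.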

The real question: when to grow the team

Right now hiring decisions are made on the old model: story points, velocity, backlog. That model isn't broken, but it's blind to the new constraint.

The bottleneck used to be writing code. AI removed that. Now the bottleneck is the team's ability to meaningfully consume what AI generates.

If your team's combined METU is below the AI throughput your product demands, you need to grow the team immediately — even if velocity formally looks fine. Because in 2–3 months you'll see:

rising bugs per developer,
growing review backlogs and stalled PR queues,
quiet burnout among seniors,
their departure, taking the mental models of your system with them.

This is already happening. At Microsoft, 20–30% of code is AI-generated, and the company itself describes "uneven adoption." At Anthropic, the figure is 70–90%, and one team lead says 100% of his personal code is now AI-written. The numbers are impressive, but without a metric for human context, you can't see the real cost behind them.

Closing

AI didn't make senior developers "underclass." AI just loudly exposed something we've known in cognitive science for decades but ignored in the industry: a human is a system with a finite context window. And when the generator starts producing 10× what the consumer can meaningfully process, the bottleneck isn't the generator.

METU isn't a panacea or a ready-made formula. It's an attempt to introduce language in which we can discuss cognitive throughput without judgment and without offense. So that saying "my METU is low today" becomes as normal as saying "I didn't sleep well." So that "team METU is exhausted" becomes a valid argument for a hiring budget — alongside velocity and lead time.

If we want AI to actually make us 10× more productive, instead of 10× faster at burning out, we need to learn to measure not just what the machine generates, but what the human can absorb.

Idea by Max Kudinov. Researched and written with Claude.

Data sources: JetBrains State of Developer Ecosystem 2025; DX "Measuring Developer Productivity & AI Impact" 2026; Faros AI "AI Productivity Paradox Report"; GitHub Copilot Statistics 2026; SF Standard "AI writes the code now" (Feb 2026); Microsoft Global AI Diffusion Report Q1 2026; BCG and UC Riverside research on "brain fry."