Blog · May 7, 2026 · AI

AI Is Still Cheap. But for How Long?

Heavy AI users can still get far more usage than they pay for. Anthropic's quota moves are a warning: once teams move to enterprise and API usage, the metered price shows up.


AI is still cheap if you are a serious individual user on a subscription plan, but that does not mean every account is massively subsidized.

My own example in AI Is Still Cheap points in a more boring direction: my Codex usage was roughly in line with the $200/month subscription price, and I was nowhere near filling the available quotas. The subsidy story really shows up at the edge: heavy users, long Claude Code sessions, and third-party agent workflows that keep the model busy for hours. Under those patterns, a $100 or $200 monthly plan can expose you to far more API-priced usage than the same work would cost through metered API or enterprise terms.

That is not the same as saying Anthropic's internal cost is $5,000. I made that distinction in the previous post because it matters. Still, the pricing signal is real. For power users, individual subscriptions can sit well below the customer-facing metered price while the labs build habit, developer mindshare, and enterprise pipeline. Prompt hygiene helps, but the bigger question is how long that gap lasts.

Anthropic is already protecting compute

Anthropic is the cleanest example because the changes are visible.

In March 2026, Anthropic doubled usage outside peak weekday hours for two weeks across Claude, Cowork, Claude Code, Excel, and PowerPoint [1]. That sounds generous, but the shape of the promotion matters: it was not random generosity, it steered users away from the busiest compute windows.

Shortly after, Anthropic also started adjusting five-hour session limits during peak hours. The official Max plan docs now describe usage as variable based on message length, attachments, model choice, feature, current capacity, and conversation context [2]. In plain English: the quota is no longer just "you paid, therefore you get the same amount all the time." Capacity matters.

Business Insider also reported that Anthropic tested restricting Claude Code on lower subscription tiers as part of a broader compute squeeze, while GitHub paused some Copilot signups [3]. OpenClaw made the same pressure visible from another angle. In April 2026, Anthropic moved third-party harnesses such as OpenClaw off normal Claude subscription limits and onto extra usage or API-key billing; The Register quoted Anthropic saying those tools put an "outsized strain" on its systems [4].

OpenAI is moving differently right now. OpenAI says Codex is included with ChatGPT plans and can be used through the CLI, IDE extension, web, app, SDK, and Slack [5]. For OpenClaw specifically, its OpenAI provider docs describe a Codex subscription auth route through OpenAI Codex OAuth [6]. So the contrast is not subtle: Anthropic is narrowing the flat-rate surface around Claude, while OpenAI is still widening Codex distribution.

The reported $5,000 number is the uncomfortable part. Both the source material for this post and Forbes point to Cursor-side analysis suggesting that a $200 Claude Code subscription, used hard enough, could represent roughly $5,000 of API-priced inference [7] [8]. I would not treat that as a precise public accounting number or as proof of Anthropic's internal cost. I would treat it as the right order-of-magnitude warning.

If a subscription can expose you to thousands of dollars of API-equivalent usage for $200, the gap has to be managed somewhere. That is the subsidy pressure, even if the underlying provider cost is lower than the public API price.

The subscription is not the real enterprise price

This is where many companies fool themselves.

A developer tries Claude Code, Codex, Gemini, or Copilot on a personal plan. The $200/month line item does not feel cheap; compared with most software subscriptions developers are used to, it is a lot. But the price can still be worth it, so the team starts to treat the subscription as the planning baseline. Then the company tries to roll the same workflow into real production work, with enterprise terms, API access, data controls, auditability, and support.

That is when the price changes.

Claude Code's legal documentation separates the consumer and commercial worlds clearly. Commercial terms apply to Team, Enterprise, and API users, while Free, Pro, and Max are covered by consumer terms. Anthropic also says advertised Pro and Max usage assumes ordinary individual use of Claude Code and the Agent SDK [9].

That distinction matters. The $200/month individual subscriber is not the enterprise baseline. The enterprise baseline is closer to the metered customer price.

Business Insider reported that Uber had already spent its 2026 Claude Code budget, according to its CTO, and that Uber increased AI investment after underestimating the impact of AI tools in its 2025 planning [10]. The same reporting said roughly 10% of Uber's code changes were produced by autonomous agents, with human review before merge.

That is what adoption looks like when it leaves the individual subscription bubble. The workflow may be worth it. It may even be extremely profitable. But the bill is not the same bill.

The quotas tell you where the subsidy is

The simplest way to understand the current AI market is to turn the quotas back into API-priced usage.

Anthropic's Max 20x plan costs $200/month. Anthropic has described it as 20x Pro usage per session, with five-hour resets, plus weekly limits. In the more concrete wording Anthropic used for Max, that could mean at least 900 short messages every five hours for the $200 plan, depending on message length, conversation length, model, feature, and current capacity [2].

Now price that same usage as Claude Opus 4.7 API usage. Opus 4.7 starts at $5 per million input tokens and $25 per million output tokens [11]. The formula is simple:

input tokens * $5 / 1,000,000 + output tokens * $25 / 1,000,000.

If a five-hour Max 20x window were filled with 900 relatively small Opus 4.7 requests at 10,000 input tokens and 1,000 output tokens each, the API-equivalent cost would be about $67.50 for that one window. At 50,000 input tokens and 2,000 output tokens, the same 900 requests would cost about $270. At 200,000 input tokens and 4,000 output tokens, it would cost about $990.
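The per-window arithmetic above is simple enough to sketch in a few lines. This uses only the published Opus 4.7 rates ($5 per million input tokens, $25 per million output tokens); the request counts and token sizes are the same illustrative scenarios as in the text, not measured quotas.

```python
# Sketch: API-equivalent cost of one hypothetical fully used five-hour
# window, priced at Claude Opus 4.7 rates:
# $5 per 1M input tokens, $25 per 1M output tokens.

INPUT_RATE = 5.00 / 1_000_000    # USD per input token
OUTPUT_RATE = 25.00 / 1_000_000  # USD per output token

def window_cost(requests, input_tokens, output_tokens):
    """API-priced cost of one window of identical requests, in USD."""
    return requests * (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE)

# The three scenarios from the text, 900 requests each:
for inp, out in [(10_000, 1_000), (50_000, 2_000), (200_000, 4_000)]:
    print(f"{inp:>7} in / {out:>5} out per request -> "
          f"${window_cost(900, inp, out):,.2f} per window")
```

Running this reproduces the $67.50, $270, and $990 figures above; the spread between the first and last scenario is entirely a function of how much context each request carries.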

Those are not exotic token counts for agentic coding. A long Claude Code session can keep pulling repository context, tool results, diffs, error logs, and previous conversation state back into the model. The expensive part is often not the short answer you read. It is the repeated context that made the answer possible.

The weekly cap makes the point even clearer. Anthropic does not publish a clean public token bucket for Max, so this is not an official quota conversion. But you do not need to max out every five-hour window all week to get near the reported $5,000 number. Five fully used heavy windows at the 200,000-input / 4,000-output example above would already be about $4,950 of Opus 4.7 API-priced usage. Six such windows would be about $5,940. The user still pays $200/month.
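The weekly extrapolation is the same arithmetic, one multiplication further. Again, this is an illustration built on the heaviest per-window scenario from the text, not an official quota conversion.

```python
# Sketch: how few heavy windows reach the reported ~$5,000 figure.
# One heavy window: 900 requests at 200k input / 4k output tokens,
# priced at Opus 4.7 API rates ($5 / $25 per million tokens).

HEAVY_WINDOW = 900 * (200_000 * 5 + 4_000 * 25) / 1_000_000  # USD per window

def weekly_cost(windows):
    """API-equivalent cost of a week with the given number of heavy windows."""
    return windows * HEAVY_WINDOW

for windows in (5, 6):
    print(f"{windows} heavy windows -> ${weekly_cost(windows):,.0f}")
# Against a flat $200/month subscription.
```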

That is why the quotas matter. Hourly limits, five-hour session windows, weekly caps, model restrictions, extra-usage billing, and API rates are not random product details. They are the shape of the subsidy. Anthropic even lets paid subscribers continue beyond included usage through extra usage at standard API rates [12]. Once you cross from the subsidized subscription bucket into metered usage, the customer-facing compute price appears.

The same logic applies across Claude, Gemini, Codex, and Copilot even if each product expresses the limits differently. Some use message windows. Some use weekly caps. Some use credits. Some hide the exact unit behind "usage limits." But the economic direction is the same. Personal plans are subsidized access to a constrained resource. Enterprise usage is where the customer-facing metered price is recovered.

Why token strategy content is everywhere

This also explains why LinkedIn and YouTube are suddenly full of "token strategy" posts.

Some of that advice is useful. Clean context, better file selection, smaller prompts, and not dumping the whole repository into every request all help. If you are using Claude Code on a subscription plan, these tricks are becoming part of normal survival. The compute restrictions are visible enough now that serious users have to think about context hygiene just to get the most out of the subscription they already paid for.

The marketing around Claude Code makes this more intense. Claude Code is everywhere in developer feeds, even though the practical edge over Codex is often less dramatic than the discourse suggests. Once a tool becomes the fashionable default, people start optimizing around its limits. That is why so much advice now sounds like: keep the context small, split work carefully, summarize aggressively, avoid unnecessary file reads, and do not let the agent wander.

For subscription users, this is partly about staying inside the quota. For enterprise customers, it is more serious. Every unnecessary token is a real cost, and agentic coding can burn through expensive context quickly if nobody designs the workflow. Token strategy is not just prompt cleverness there. It is cost control, architecture, evaluation discipline, and developer workflow design.

So I do not think token-saving content is useless. I think it is emerging quickly for a reason. Anthropic's quota pressure makes it necessary for heavy Claude subscription users, and enterprise pricing makes it unavoidable for companies. The mistake is treating token tricks as a way to preserve today's subsidized subscription economics forever. They can reduce waste. They cannot make scarce compute unlimited.

What companies should do now

Companies should stop using consumer subscription prices as their internal planning number.

If your team is experimenting with AI coding tools, the cheap subscription phase is useful. Use it. Learn from it. Let engineers discover where the leverage is real. But do not build the business case on the assumption that the same usage will remain available forever at $20, $100, or $200 per person.

The more honest planning model is:

  1. measure real usage, not just seats,
  2. estimate API-equivalent or credit-equivalent cost,
  3. separate personal experimentation from commercial work,
  4. decide which workflows deserve frontier models,
  5. build guardrails before the invoice forces the conversation.
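Step 2 of that list can be made concrete with a back-of-envelope model. Everything below is a hypothetical placeholder: the team size, session counts, and per-session token volumes are invented for illustration, and the rates are the Opus 4.7 list prices; substitute your own telemetry and your provider's current rates.

```python
# Minimal planning sketch: estimate API-equivalent monthly cost from
# measured usage instead of seat price. All inputs here are hypothetical.

def monthly_api_equivalent(sessions_per_dev, devs, input_tokens, output_tokens,
                           input_rate=5.00, output_rate=25.00):
    """API-equivalent monthly cost in USD; rates are USD per million tokens."""
    per_session = (input_tokens * input_rate
                   + output_tokens * output_rate) / 1_000_000
    return sessions_per_dev * devs * per_session

# Hypothetical team: 10 devs, 40 agent sessions each per month,
# averaging 2M input / 40k output tokens per session.
estimate = monthly_api_equivalent(40, 10, 2_000_000, 40_000)
seat_cost = 10 * 200  # ten Max-style seats at $200/month
print(f"API-equivalent: ${estimate:,.0f} vs seat cost: ${seat_cost:,.0f}")
```

With these made-up inputs the API-equivalent figure lands above the seat cost, which is exactly the gap the planning model is supposed to surface before the invoice does.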

This is especially important for agentic coding. A normal chat request and a long-running agent session are not the same economic object. One can answer a question. The other can spend hours reading, editing, running tools, retrying, and consuming context. If you price both mentally as "one AI request," the forecast will be wrong.
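The chat-versus-agent distinction can be sketched with a toy model. The key mechanic is that each agent turn resends the accumulated context as fresh input tokens, so input cost grows roughly quadratically with turn count. The starting context size and per-turn growth numbers below are illustrative assumptions, not measurements of any real tool.

```python
# Toy model of why an agent session is not "one AI request": every turn
# resends the growing conversation context as input tokens.
# All constants are illustrative assumptions.

SYSTEM_AND_REPO = 20_000  # assumed initial context: prompt + repo files
PER_TURN_GROWTH = 5_000   # assumed new context per turn: tool output, diffs
OUTPUT_PER_TURN = 1_000   # assumed model output per turn

def session_tokens(turns):
    """Total (input, output) tokens for a context-resending session."""
    total_in = 0
    context = SYSTEM_AND_REPO
    for _ in range(turns):
        total_in += context                           # whole context sent again
        context += PER_TURN_GROWTH + OUTPUT_PER_TURN  # and it keeps growing
    return total_in, turns * OUTPUT_PER_TURN

one_shot, _ = session_tokens(1)
agent, _ = session_tokens(40)
print(f"1 turn: {one_shot:,} input tokens; 40 turns: {agent:,} input tokens")
```

Under these assumptions, forty turns consume not forty times but several hundred times the input tokens of a single request, which is why pricing an agent session mentally as "one AI request" breaks the forecast.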

AI is still cheap, but the cheapness is conditional

I still believe the core point from my earlier post, AI Is Still Cheap. For serious individual users, these tools can easily justify their subscription price. The leverage is real.

But the next sentence matters now: AI is cheap because the current subscription economics are subsidized.

The labs are not doing that forever out of kindness. They are buying adoption, habit, developer mindshare, and enterprise pipeline. Once companies depend on the workflow, the price moves closer to the customer-facing metered price. Anthropic's quota changes are not a small UX annoyance. They are an early sign of the subsidy being managed down.

So yes, use the tools hard while they are cheap. But do not confuse a subsidized subscription with the permanent cost of AI-enabled delivery.

That is the real takeaway. AI is still cheap. The better question is: for how long?

References

  1. Claude Help Center - Claude March 2026 usage promotion
  2. Claude Help Center - About Claude's Max Plan Usage
  3. Business Insider - A looming crisis could limit some of your favorite AI tools
  4. The Register - Anthropic closes door on subscription use of OpenClaw
  5. OpenAI Help Center - Using Codex with your ChatGPT plan
  6. OpenClaw Docs - OpenAI provider
  7. Prime is (mostly) right about AI
  8. Forbes - Cursor Goes To War For AI Coding Dominance
  9. Claude Code Docs - Legal and compliance
  10. Business Insider - Uber slowing hiring to fund AI investment
  11. Anthropic - Claude Opus 4.7
  12. Claude Help Center - Extra Usage for Paid Claude Plans

Bruno Gardlo

Technical due diligence · Fractional CTO

Need a technical go/no-go or senior technical leadership?

Book a 30-min call. I will recommend the right first step.
