The Signal

Users on Claude's Pro Max plan (marketed as "5 x usage") are hitting quota exhaustion in roughly 90 minutes of active Claude Code sessions — despite what they describe as moderate usage. The issue is trending on GitHub (issue #45756) with 136 upvotes on HN and 65 comments. The core complaint: the "5x" multiplier is opaque. Nobody knows what the baseline is, how tokens are counted inside a gentic loops, or when the counter resets. You just get cut off mid-session.

This isn't a fringe edge case. It's happening to people doing normal coding work — not running bulk batch jobs or stress -testing the API.

Builder's Take

This is a leverage problem disguised as a billing problem.

Claude Code's power comes from agentic loops: it reads files, writes code, runs tests, reads output, iterates. Each loop iteration burns tokens you never explicitly "sent." A single "fix this bug" prompt might internally expand to 10,000+ tokens of context — file reads, tool calls, chain-of-thought. The 5x quota sounds generous until you realize the denominator is invisible.

Here's the leverage calculation that matters: if you're a solo dev and your Claude Code session goes dark mid-afternoon, you lose not just tokens — you lose flow state. That's the real cost. A $20 /month plan that cuts out at 1pm is worth less than a $5 plan you can predict and plan around.

What moat does this create or destroy?

  • Destroys: The "just leave Claude Code running" workflow. You can't use it as a persistent coding assistant throughout the day.
  • Creates: Opportunity for builders who instrument their own token usage and build quota-aware tooling around the API directly.
  • Reality check: API access ( paying per token) has no quota cliff. If you're a serious builder hitting these limits, the math might favor direct API + a token budget manager over the flat -rate plan.

DHH-style take: Anthropic is selling a "5x" number without defining what 1x is. That's a product failure, not a user failure. Op aque limits are a tax on power users.

Tools & Stack

The Affected Setup

  • Claude Code — Anthropic's agentic coding CLI . Runs inside your terminal, reads/writes your codebase. Requires Pro or Pro Max subscription for heavy use.
  • Pro Max plan — Described as "5x usage limits" vs standard Pro. Actual token ceiling: not publicly documented (check anthropic.com/pricing for current tiers).

Workarounds Being Discussed in the Issue

  • Session batching: Do all your Claude Code work in one focused morning block . Don't leave it idle (idle context still counts in some agentic frameworks).
  • Reduce context window: Use .claudeignore to exclude large files/dirs from Claude Code's file reads. Less context per loop = more loops before quota hit.
  • Switch to API: claude-3-5-sonnet-20241022 via direct API is pay-per-token. No quota cliff. You control the spend. Check anthropic.com/api for current token pricing.
  • Alternative : Cursor + API key — Cursor lets you bring your own Anthropic API key. Same Claude models, no subscription quota. You pay per token, predictably.
  • Alternative: Aider — Open source CLI coding assistant. pip install aider-chat. Plug in your own API key. No middleman quota layer.
    pip install aider-chat
    aider --model claude-3-5-sonnet-20241022 --api-key anthropic= YOUR_KEY

Monitor Your Own Usage

If you stay on the subscription plan, at minimum add visibility:

#  Claude Code doesn't expose token counts natively yet
# Proxy  workaround: log Claude Code's API calls via a local proxy 
# Tools like mitmproxy or proxyman can intercept and log token  usage
mitmproxy --mode regular --listen-port 8080

Crude but effective until Anthropic ships a usage dashboard inside Claude Code.

Ship It This Week

Build a Claude Code quota watchdog.

Here's the concrete idea : a small script that sits between you and Claude Code, tracks approximate token usage per session, and warns you before you hit the wall.

Rough implementation path:

  1. Use mitmproxy or a local HTTPS proxy to intercept Claude Code's API calls.
  2. Parse the usage field from each response (input_tokens, output_tokens) .
  3. Accumulate a rolling session total. Log to a local SQLite file.
  4. When you cross a threshold (e.g., 80% of your estimated daily budget), trigger a desktop notification via osascript on Mac or notify-send on Linux.
# pseudocode sketch
for each intercepted response:
  tokens_ used += response.usage.input_tokens + response.usage.output_tokens
  if tokens_used > DAILY_BUDGET * 0.8:
    notify ("Claude Code: 80% quota used — wrap  up or switch to API")

This is a weekend project. Ship it as an open source tool, post it on HN and the GitHub issue thread. There are 136+ frustrated developers who would install it today. That's your distribution.

Longer play: turn it into a quota-aware Claude Code wrapper with session scheduling ("use budget in the morning, reserve X% for afternoon"). Charge $5 /month. Anthropic's opacity is your market.