Claude Code usage limits explained: Pro, Max, and weekly caps

Shivam Malani
Claude Code runs on the same usage budget as the rest of Anthropic's Claude products, and hitting a limit mid-task is one of the fastest ways to derail a coding session. The caps are measured in rolling time windows rather than a visible token counter, which is why burn rates can feel unpredictable depending on the model, context size, and time of day.

Quick answer: Claude Code shares one usage budget across claude.ai, Claude Desktop, and the CLI. You get a rolling 5-hour session limit plus a weekly cap, with higher allowances on Pro, Max 5x, and Max 20x. Peak-hour throttling (weekdays 5–11 a.m. PT / 1–7 p.m. GMT) makes the 5-hour window drain faster.

How the two limits actually work

There are two separate caps running at the same time, and both can stop a session cold. The 5-hour session limit is a rolling window that starts with your first message and resets five hours later. The weekly limit is a longer cap that covers all models across a seven-day window.
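The session window described above can be sketched in a few lines. This is a minimal illustration that assumes only what the paragraph states (the window opens at the first message and resets five hours later); the function names are hypothetical and nothing here reads real account data.

```python
from datetime import datetime, timedelta, timezone

# From the article: the session window resets five hours after the first message.
SESSION_WINDOW = timedelta(hours=5)


def session_reset_time(first_message_at: datetime) -> datetime:
    """Return when the window that began at first_message_at resets."""
    return first_message_at + SESSION_WINDOW


def time_until_reset(first_message_at: datetime, now: datetime) -> timedelta:
    """Return the time left in the window, or zero if it has already reset."""
    remaining = session_reset_time(first_message_at) - now
    return max(remaining, timedelta(0))


# Example timestamps are made up for illustration.
first = datetime(2025, 1, 6, 9, 30, tzinfo=timezone.utc)
now = datetime(2025, 1, 6, 12, 0, tzinfo=timezone.utc)
print(session_reset_time(first).isoformat())  # 2025-01-06T14:30:00+00:00
print(time_until_reset(first, now))           # 2:30:00
```

The weekly cap is not modeled here because the article gives no start rule for it beyond a seven-day cycle tied to your account.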

Usage is not measured in raw messages. It's weighted by conversation length, model choice, features in use, and tool calls. A short prompt with heavy file reads, extended thinking, or multiple MCP connectors can burn far more than a longer prompt with none of that overhead. Everything you do across claude.ai, Claude Desktop, and the Claude Code CLI counts against the same pool.


Plans and where Claude Code fits

All paid tiers include Claude Code access, but the headroom differs sharply. Pro is the entry point, and it is the tier users most often report as tight for sustained agentic coding. Max 5x and Max 20x are aimed at developers running Claude Code for hours per day, and Enterprise uses a different consumption-based model.

| Plan | Price | Relative session allowance | Typical fit |
|---|---|---|---|
| Free | $0 | Baseline | Light chat, testing |
| Pro | $20/mo | ~5x Free | Casual Claude Code use, short tasks |
| Max 5x | $100/mo | ~5x Pro | Daily Claude Code work, single project |
| Max 20x | $200/mo | ~20x Pro | Heavy agentic workflows, parallel sessions |
| Team / Enterprise | Varies | Per-seat or usage-based | Orgs, custom billing |

Anthropic doesn't publish exact token numbers for session or weekly caps. The allowances shift with load, and the company has adjusted them more than once as demand grew.


Peak-hour throttling

Free, Pro, and Max subscribers move through the 5-hour session limit faster on weekdays between 5 a.m. and 11 a.m. Pacific Time (1 p.m. to 7 p.m. GMT). The weekly cap is unaffected by peak-hour throttling.

In practice, a task that consumed 20% of a session window off-peak can eat 40% or more during those hours. If you have flexibility, shifting heavy Claude Code runs outside that band is the single biggest lever for Pro and Max 5x users.
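To decide whether a heavy run would land in the peak band, a small check like the following may help. It is a sketch under two stated assumptions: Pacific Time is treated as a fixed UTC-8 offset (which matches the 1 p.m. to 7 p.m. GMT conversion above and ignores daylight saving time), and the 2.0 multiplier is a hypothetical value taken from the 20% to 40% example, not a published figure.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC-8 offset for Pacific Time; daylight saving is ignored.
PACIFIC = timezone(timedelta(hours=-8))
PEAK_START_HOUR = 5   # 5 a.m. PT
PEAK_END_HOUR = 11    # 11 a.m. PT, exclusive


def is_peak(moment: datetime) -> bool:
    """True if an aware datetime falls on a weekday between 5 and 11 a.m. PT."""
    local = moment.astimezone(PACIFIC)
    is_weekday = local.weekday() < 5  # Monday=0 ... Friday=4
    return is_weekday and PEAK_START_HOUR <= local.hour < PEAK_END_HOUR


def estimated_session_share(off_peak_share: float, moment: datetime,
                            peak_multiplier: float = 2.0) -> float:
    """Scale an off-peak share of the window by a hypothetical peak multiplier."""
    return off_peak_share * peak_multiplier if is_peak(moment) else off_peak_share


monday_peak = datetime(2025, 1, 6, 14, 0, tzinfo=timezone.utc)  # 6 a.m. PT, Monday
saturday = datetime(2025, 1, 11, 14, 0, tzinfo=timezone.utc)    # 6 a.m. PT, Saturday
print(is_peak(monday_peak), is_peak(saturday))     # True False
print(estimated_session_share(0.20, monday_peak))  # 0.4
```

Because the article says "40% or more", the multiplier is best read as a lower bound for planning purposes.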


What burns the session limit fastest

Claude Code is agentic, so each turn can trigger many tool calls, file reads, and subagent invocations. A few patterns drain the session meter disproportionately:

  • Opus versus Sonnet. Opus consumes roughly 5x more than Sonnet per equivalent task. Sonnet handles most implementation work.
  • Extended thinking. High reasoning budgets compound across every tool call in a loop.
  • The 1M-token context variant of Opus. Larger payloads per request mean more tokens in flight even when the actual task is small.
  • Repeated file reads. Claude Code often re-reads files already in context.
  • Heavy CLAUDE.md files and always-on MCP connectors. Both load into every message.

Length limits are separate

The usage cap controls how much you can do over time. The context window controls how much Claude can hold in a single conversation. All paid plans get a 200K-token context window, with 500K available on some Enterprise models.

When code execution is enabled, Claude automatically summarizes older turns as a chat approaches the context window, so long sessions continue rather than hitting a hard wall. The full chat history stays accessible even after summarization. The message "organizing its thoughts" is the visible signal that automatic context management is running.


Stretching the session budget in Claude Code

A handful of configuration changes have an outsized effect on burn rate inside the CLI.

Step 1: Open ~/.claude/settings.json and set Sonnet as the default model, cap thinking tokens, lower the autocompact threshold, and route subagents to Haiku. This single block cuts consumption meaningfully for most workflows.


{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "10000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}

~/.claude/settings.json
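If a settings file already exists, pasting the block over it would drop other keys. One way to avoid that is to merge the values in, as in this sketch. The helper names (`merge_settings`, `apply_to_file`) and the sample keys in `current` are hypothetical; only the recommended values come from the block above.

```python
import json
from pathlib import Path

# Values copied from the settings block in Step 1.
RECOMMENDED = {
    "model": "sonnet",
    "env": {
        "MAX_THINKING_TOKENS": "10000",
        "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
        "CLAUDE_CODE_SUBAGENT_MODEL": "haiku",
    },
}


def merge_settings(existing: dict, updates: dict) -> dict:
    """Return a copy of existing with updates applied, merging nested dicts."""
    merged = dict(existing)
    for key, value in updates.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            merged[key] = value
    return merged


def apply_to_file(path: Path) -> None:
    """Merge RECOMMENDED into the JSON file at path, creating it if missing."""
    existing = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_settings(existing, RECOMMENDED), indent=2) + "\n")


# Demonstrated on an in-memory example (made-up keys) rather than a real file.
current = {"env": {"MY_EXISTING_VAR": "1"}, "someOtherSetting": True}
print(json.dumps(merge_settings(current, RECOMMENDED), indent=2))
```

To apply it for real, call `apply_to_file(Path.home() / ".claude" / "settings.json")` after backing up the existing file.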

Step 2: Add a .claudeignore file at the project root to block directories like node_modules/, dist/, and lockfiles. It works like .gitignore and saves tokens on every prompt that would otherwise pull those files into context.

Step 3: Keep CLAUDE.md lean. A 60-line file is typically enough. Move everything else into docs/ and let Claude pull it on demand rather than loading it into every message.
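A quick check can flag when CLAUDE.md drifts past that size. This sketch assumes the 60-line rule of thumb counts every line, blank or not; the `report` helper is a hypothetical name and the threshold is the article's guideline rather than an enforced limit.

```python
from pathlib import Path

MAX_LINES = 60  # the article's rule of thumb for a lean CLAUDE.md


def report(text: str, max_lines: int = MAX_LINES) -> str:
    """Describe whether the given file contents fit the line target."""
    n = len(text.splitlines())
    if n <= max_lines:
        return f"OK: {n} lines (limit {max_lines})"
    return f"Too long: {n} lines, {n - max_lines} over the {max_lines}-line target"


# Checks CLAUDE.md in the current directory, if one exists.
path = Path("CLAUDE.md")
if path.exists():
    print(report(path.read_text(encoding="utf-8")))
else:
    print("No CLAUDE.md in the current directory")
```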

Step 4: Use /clear between unrelated tasks and /compact at logical breakpoints. Do not let a single conversation balloon past roughly 200K tokens even when a larger window is available.

Step 5: Plan in Opus, implement in Sonnet. Use Opus only where reasoning quality matters, then switch models for execution.
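The arithmetic behind Step 5 can be made concrete with the rough 5x Opus figure cited earlier in the article. The unit counts below are hypothetical, chosen only to show the shape of the saving; real consumption depends on context size and tool calls.

```python
# Article's rough figure: Opus consumes ~5x more than Sonnet per equivalent task.
OPUS_MULTIPLIER = 5.0


def relative_cost(planning_units: float, implementation_units: float,
                  plan_in_opus: bool, implement_in_opus: bool) -> float:
    """Return cost in Sonnet-equivalent units for a task split into two phases."""
    plan = planning_units * (OPUS_MULTIPLIER if plan_in_opus else 1.0)
    impl = implementation_units * (OPUS_MULTIPLIER if implement_in_opus else 1.0)
    return plan + impl


# Hypothetical task: 2 units of planning, 8 units of implementation.
all_opus = relative_cost(2, 8, plan_in_opus=True, implement_in_opus=True)
split = relative_cost(2, 8, plan_in_opus=True, implement_in_opus=False)
print(all_opus, split)  # 50.0 18.0
```

Under these assumptions the split workflow uses about a third of the all-Opus budget, because implementation dominates the task.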


Web and desktop app tactics

The CLI-only tricks don't apply to claude.ai or the desktop app, but a few habits still help:

  • Switch Opus to Sonnet unless you need the reasoning difference.
  • Start new conversations per task; context accumulates with every message.
  • Write specific prompts. "Fix the JWT validation in src/auth/validate.ts line 42" is far cheaper than "fix the auth bug."
  • Pre-process large PDFs into plain text before uploading.
  • Disable web search, Research, and MCP connectors when they aren't needed. Tools and connectors are token-intensive.
  • Use projects for documents you'll reference repeatedly. Projects use retrieval instead of loading everything into context.
  • Turn off extended thinking for tasks that don't need it.

What happens when you hit a limit

A session-limit message shows a reset time five hours from your first message in that window. Hitting the weekly cap locks all-model usage until the weekly reset, which is tied to your own seven-day cycle rather than a fixed calendar day.

Paid plans can purchase extra usage from the account settings, which adds a separate credit pool. Note that extra usage is billed against an API-style budget and resets monthly rather than being a permanent top-up. If the CLI continues to show the limit message after you purchase extra usage, log out and back in.

You can check current consumption and reset times in the Claude Code CLI with the /usage command, which shows session percentage, weekly percentage, and any extra-usage balance in a single view.


Monitoring burn rate

Anthropic's in-product usage display updates slowly and lags actual consumption. Community-built tools fill the gap by reading local Claude Code logs:

  • npx ccusage@latest for daily, session, and 5-hour window reports.
  • ccburn --compact for visual burn-up charts and projections.
  • Claude Code Usage Monitor for a real-time terminal dashboard.
  • ccstatusline or claude-powerline for status-bar integration.

These are useful because the official meter doesn't expose token counts directly, and they help you see whether you're on track to finish a task before the next reset.
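The "on track before the next reset" question reduces to a simple projection. The sketch below assumes usage grows linearly, which real sessions rarely do, and takes its inputs by hand (for example, the session percentage shown by /usage); it does not read Claude Code logs or reproduce how any of the tools above work.

```python
from datetime import timedelta

SESSION_WINDOW = timedelta(hours=5)  # the article's 5-hour session window


def projected_exhaustion(percent_used: float, elapsed: timedelta):
    """Linearly project the time until 100% usage; None if nothing is used yet."""
    if percent_used <= 0 or elapsed <= timedelta(0):
        return None
    if percent_used >= 100:
        return timedelta(0)
    # Remaining time scales with the ratio of unused to used budget.
    return elapsed * ((100.0 - percent_used) / percent_used)


def will_hit_limit_before_reset(percent_used: float, elapsed: timedelta) -> bool:
    """True if the projection reaches 100% before the window resets."""
    time_left_in_window = SESSION_WINDOW - elapsed
    to_exhaustion = projected_exhaustion(percent_used, elapsed)
    return to_exhaustion is not None and to_exhaustion < time_left_in_window


# 40% used after one hour: about 1.5 hours of budget left, 4 hours until reset.
print(projected_exhaustion(40, timedelta(hours=1)))         # 1:30:00
print(will_hit_limit_before_reset(40, timedelta(hours=1)))  # True
print(will_hit_limit_before_reset(10, timedelta(hours=2)))  # False
```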


When limits feel broken

Short, clean prompts occasionally consume 30% or more of a session window. Two factors usually explain it. First, peak-hour throttling multiplies session consumption silently. Second, the 1M-context Opus variant sends a much larger payload per request than the 200K model, so each turn costs more tokens even for trivial tasks. Switching away from the 1M variant and confirming your model choice in the CLI resolves a lot of these spikes.

If session usage jumps with no obvious cause, try updating the Claude Code CLI to the latest version, then run /compact or /clear to rebuild the context from a clean state.


Choosing the right plan for Claude Code

The practical breakpoints:

  • Pro at $20/month works for short, focused coding tasks and light planning. Sustained agentic loops will hit session limits quickly.
  • Max 5x at $100/month fits most individual developers doing daily Claude Code work on a single codebase.
  • Max 20x at $200/month is the practical tier for parallel sessions, long-running agent loops, and teams coordinating through one account.

If you're consistently hitting the weekly cap on Max 20x, upgrading further is not the sustainable path. Distribute the load instead: let Claude handle planning and review, and hand bulk implementation to a second tool such as Codex or a local model.