Claude Code runs on the same usage budget as the rest of Anthropic's Claude products, and hitting a limit mid-task is one of the fastest ways to derail a coding session. The caps are measured in rolling time windows rather than a visible token counter, which is why burn rates can feel unpredictable depending on the model, context size, and time of day.
How the two limits actually work
There are two separate caps running at the same time, and both can stop a session cold. The 5-hour session limit is a rolling window that starts with your first message and resets five hours later. The weekly limit is a longer cap that covers all models across a seven-day window.
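The 5-hour window arithmetic is simple to sketch. This is an illustrative model of the behavior described above, not Anthropic's actual bookkeeping:

```python
from datetime import datetime, timedelta

SESSION_WINDOW = timedelta(hours=5)

def session_reset_time(first_message_at: datetime) -> datetime:
    """The session window opens at your first message and resets
    exactly five hours later, regardless of activity in between."""
    return first_message_at + SESSION_WINDOW

def window_is_open(first_message_at: datetime, now: datetime) -> bool:
    """True while the current session window is still counting usage."""
    return now < session_reset_time(first_message_at)
```

The key consequence: the reset clock is anchored to your first message, so a burst of work at 9:00 a.m. doesn't reset until 2:00 p.m. no matter how idle the intervening hours were.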
Usage is not measured in raw messages. It's weighted by conversation length, model choice, features in use, and tool calls. A short prompt with heavy file reads, extended thinking, or multiple MCP connectors can burn far more than a longer prompt with none of that overhead. Everything you do across claude.ai, Claude Desktop, and the Claude Code CLI counts against the same pool.
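The weighting can be made concrete with a toy estimator. The multipliers and the per-tool-call round-trip assumption below are illustrative, loosely inspired by API pricing ratios; Anthropic does not publish its actual formula:

```python
# Illustrative relative weights only -- not Anthropic's published formula.
MODEL_WEIGHT = {"haiku": 1.0, "sonnet": 3.0, "opus": 15.0}

def estimated_burn(model: str, prompt_tokens: int, context_tokens: int,
                   tool_calls: int, thinking_tokens: int = 0) -> float:
    """Toy estimate of how heavily one turn counts against the cap.
    Context and tool calls dominate: in this hypothetical model, each
    tool call round-trips the accumulated context one more time."""
    tokens_in_flight = prompt_tokens + context_tokens + thinking_tokens
    total = tokens_in_flight * (1 + tool_calls)
    return total * MODEL_WEIGHT[model]
```

Under this model, a 50-token prompt on Opus sitting on 30K tokens of context with five tool calls costs orders of magnitude more than a 2,000-token prompt on Sonnet with no tools, which matches the pattern the paragraph above describes.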
Plans and where Claude Code fits
All paid tiers include Claude Code access, but the headroom differs sharply. Pro is the entry point and is the one most users report as tight for any sustained agentic coding. Max 5x and Max 20x are aimed at developers running Claude Code for hours per day, and Enterprise uses a different consumption-based model.
| Plan | Price | Relative session allowance | Typical fit |
|---|---|---|---|
| Free | $0 | Baseline | Light chat, testing |
| Pro | $20/mo | ~5x Free | Casual Claude Code use, short tasks |
| Max 5x | $100/mo | ~5x Pro | Daily Claude Code work, single project |
| Max 20x | $200/mo | ~20x Pro | Heavy agentic workflows, parallel sessions |
| Team / Enterprise | Varies | Per-seat or usage-based | Orgs, custom billing |
Anthropic doesn't publish exact token numbers for session or weekly caps. The allowances shift with load, and the company has adjusted them more than once as demand grew.
Peak-hour throttling
Free, Pro, and Max subscribers move through the 5-hour session limit faster on weekdays between 5 a.m. and 11 a.m. Pacific Time (1 p.m. to 7 p.m. GMT). The weekly cap is unaffected by this change.
In practice, a task that consumed 20% of a session window off-peak can eat 40% or more during those hours. If you have flexibility, shifting heavy Claude Code runs outside that band is the single biggest lever for Pro and Max 5x users.
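Checking whether you're inside the peak band is a one-liner worth automating before kicking off a long run. This sketch assumes you pass in a clock already converted to Pacific Time:

```python
from datetime import datetime

# Peak band from the text above: weekdays, 5 a.m. to 11 a.m. Pacific Time.
PEAK_START_HOUR, PEAK_END_HOUR = 5, 11

def in_peak_band(pacific_time: datetime) -> bool:
    """True when session usage is weighted more heavily per the stated band."""
    is_weekday = pacific_time.weekday() < 5  # Monday=0 .. Friday=4
    return is_weekday and PEAK_START_HOUR <= pacific_time.hour < PEAK_END_HOUR
```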
What burns the session limit fastest
Claude Code is agentic, so each turn can trigger many tool calls, file reads, and subagent invocations. A few patterns drain the session meter disproportionately:
- Opus versus Sonnet. Opus consumes roughly 5x more per equivalent task. Sonnet handles most implementation work.
- Extended thinking. High reasoning budgets compound across every tool call in a loop.
- The 1M-token context variant of Opus. Larger payloads per request mean more tokens in flight even when the actual task is small.
- Repeated file reads. Claude Code often re-reads files already in context.
- Heavy CLAUDE.md files and always-on MCP connectors. Both load into every message.
Length limits are separate
The usage cap controls how much you can do over time. The context window controls how much Claude can hold in a single conversation. All paid plans get a 200K-token context window, with 500K available on some Enterprise models.
When code execution is enabled, Claude automatically summarizes older turns as a chat approaches the context window, so long sessions continue rather than hitting a hard wall. The full chat history stays accessible even after summarization. The message "organizing its thoughts" is the visible signal that automatic context management is running.
Stretching the session budget in Claude Code
A handful of configuration changes have an outsized effect on burn rate inside the CLI.
Step 1: Open ~/.claude/settings.json and set Sonnet as the default model, cap thinking tokens, lower the autocompact threshold, and route subagents to Haiku. This single block cuts consumption meaningfully for most workflows.
```json
{
  "model": "sonnet",
  "env": {
    "MAX_THINKING_TOKENS": "10000",
    "CLAUDE_AUTOCOMPACT_PCT_OVERRIDE": "50",
    "CLAUDE_CODE_SUBAGENT_MODEL": "haiku"
  }
}
```
Step 2: Add a .claudeignore file at the project root to block directories like node_modules/, dist/, and lockfiles. It works like .gitignore and saves tokens on every prompt that would otherwise pull those files into context.
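A starter .claudeignore might look like the sketch below. It uses the .gitignore syntax mentioned above; the entries are examples, so adjust them to your stack:

```
node_modules/
dist/
build/
coverage/
.git/
*.lock
package-lock.json
*.min.js
```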
Step 3: Keep CLAUDE.md lean. A 60-line file is typically enough. Move everything else into docs/ and let Claude pull it on demand rather than loading it into every message.
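A lean CLAUDE.md can be little more than pointers. The section names and paths below are illustrative, not a required format:

```markdown
# Project notes for Claude
- Build: npm run build / Test: npm test
- Style: TypeScript strict mode, no default exports
- Architecture overview: docs/architecture.md (read on demand)
- Deployment details: docs/deploy.md (read on demand)
```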
Step 4: Use /clear between unrelated tasks and /compact at logical breakpoints. Do not let a single conversation balloon past roughly 200K tokens even when a larger window is available.
Step 5: Plan in Opus, implement in Sonnet. Use Opus only where reasoning quality matters, then switch models for execution.
Web and desktop app tactics
The CLI-only tricks don't apply to claude.ai or the desktop app, but a few habits still help:
- Switch Opus to Sonnet unless you need the reasoning difference.
- Start new conversations per task; context accumulates with every message.
- Write specific prompts. "Fix the JWT validation in src/auth/validate.ts line 42" is far cheaper than "fix the auth bug."
- Pre-process large PDFs into plain text before uploading.
- Disable web search, Research, and MCP connectors when they aren't needed. Tools and connectors are token-intensive.
- Use projects for documents you'll reference repeatedly. Projects use retrieval instead of loading everything into context.
- Turn off extended thinking for tasks that don't need it.
What happens when you hit a limit
A session-limit message shows a reset time five hours from your first message in that window. Hitting the weekly cap locks all-model usage until the weekly reset, which is tied to your own seven-day cycle rather than a fixed calendar day.
Paid plans can purchase extra usage from account settings, which adds a separate credit pool. Note that extra usage is billed against an API-style budget and resets monthly rather than being a permanent top-up. If the CLI still shows the limit message after you purchase and enable extra usage, log out and back in to refresh it.
You can check current consumption and reset times in the Claude Code CLI with the /usage command, which shows session percentage, weekly percentage, and any extra-usage balance in a single view.
Monitoring burn rate
Anthropic's in-product usage display updates slowly and lags actual consumption. Community-built tools fill the gap by reading local Claude Code logs:
- npx ccusage@latest for daily, session, and 5-hour window reports.
- ccburn --compact for visual burn-up charts and projections.
- Claude Code Usage Monitor for a real-time terminal dashboard.
- ccstatusline or claude-powerline for status-bar integration.
These are useful because the official meter doesn't expose token counts directly, and they help you see whether you're on track to finish a task before the next reset.
When limits feel broken
Short, clean prompts occasionally consume 30% or more of a session window. Two factors usually explain it. First, peak-hour throttling multiplies session consumption silently. Second, the 1M-context Opus variant sends a much larger payload per request than the 200K model, so each turn costs more tokens even for trivial tasks. Switching away from the 1M variant and confirming your model choice in the CLI resolves a lot of these spikes.
If session usage jumps with no obvious cause, try updating the Claude Code CLI to the latest version, then run /compact or /clear to rebuild the context from a clean state.
Choosing the right plan for Claude Code
The practical breakpoints:
- Pro at $20/month works for short, focused coding tasks and light planning. Sustained agentic loops will hit session limits quickly.
- Max 5x at $100/month fits most individual developers doing daily Claude Code work on a single codebase.
- Max 20x at $200/month is the practical tier for parallel sessions, long-running agent loops, and teams coordinating through one account.
If you're consistently hitting the weekly cap on Max 20x, the sustainable path is distributing load rather than upgrading further: let Claude handle planning and review while a second tool, such as Codex or a local model, handles bulk implementation.