1. The Four Effort Levels
Illustrative data: Claude Code's effort parameter controls how deeply the model reasons before responding. It's a behavioral signal, not a strict token ceiling. Higher effort grants the model more latitude to explore edge cases, self-verify, and spawn sub-agents.
Configurable via: /effort command, --effort flag, CLAUDE_CODE_EFFORT_LEVEL env var, or effortLevel in settings.json. Note: Max only persists via the environment variable.
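The four configuration surfaces can be sketched in shell form. This is an illustrative sketch: the /effort command, --effort flag, env var, and effortLevel key are named in this document, while the prompt text and the .claude/settings.json path are assumptions.

```shell
# 1. Per-session, inside the REPL:
#      /effort high

# 2. Per-invocation, via CLI flag:
claude --effort high "trace the flaky test in auth_test.go"

# 3. Persistent across sessions (the only way to persist Max):
export CLAUDE_CODE_EFFORT_LEVEL=max

# 4. Via settings.json (cannot reliably persist Max; see the UI bug in section 14):
cat > .claude/settings.json <<'EOF'
{ "effortLevel": "high" }
EOF
```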
| Level | VS Code Slider | Behavior | Best For |
|---|---|---|---|
| Low | ●○○○ | Minimal reasoning. Skips thinking phase for straightforward queries. | Boilerplate, syntax fixes, file grep, formatting |
| Medium | ●●○○ | Bounded adaptive reasoning. Default for Opus 4.6 and Sonnet 4.6 (since v2.1.68). | General feature work, standard bug fixes, daily coding |
| High | ●●●○ | Extensive chain-of-thought. Explores multiple approaches. Triggered per-turn by including "ultrathink" in prompt. | Refactoring, cross-file debugging, security analysis |
| Max (Opus only) | ●●●● | Unbounded adaptive reasoning. No constraints on thinking depth. Only available on Opus 4.6. | Root-cause analysis, algorithm design, mission-critical synthesis |
2. Model Capability Profiles
Illustrative: Opus 4.6 leads on deep architectural reasoning and self-debugging. Sonnet 4.6 has closed the gap on standard code generation. Haiku 4.5 is optimized as a high-speed parsing and retrieval sub-agent.
3. Verified Benchmark Comparison
Published data: scores from Anthropic and independent evaluators. The GPQA gap (17.2pp) is where Opus earns its premium. For standard coding, the delta is negligible.
| Benchmark | Opus 4.6 | Sonnet 4.6 | Delta |
|---|---|---|---|
| SWE-bench Verified | 80.8% | 79.6% | 1.2pp |
| OSWorld (Computer Use) | 72.7% | 72.5% | 0.2pp |
| Terminal-Bench 2.0 | 65.4% | ~60% | ~5pp |
| SRE-skills-bench | 94.7% | 90.4% | 4.3pp |
| GPQA Diamond | 91.3% | 74.1% | 17.2pp |
Sources: Anthropic model announcements, Rootly SRE-skills-bench report, NxCode benchmark synthesis. Sonnet Terminal-Bench score estimated (~60%) based on Opus 4.5 baseline of 59.8%.
4. Autonomous Sub-Agent Routing
Opus 4.6 is highly proficient at autonomous delegation. It spawns specialized sub-agents (Explore agents for file search, Haiku for parsing) without requiring user instruction. Manual orchestration typically results in context fragmentation and higher token costs.
Configure sub-agent model: CLAUDE_CODE_SUBAGENT_MODEL="claude-haiku-4-5". For massive parallel work, Agent Teams are available via CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1.
Two orchestration modes are available: autonomous sub-agents (the default, in which the primary model delegates on its own) and the opusplan hybrid alias (Opus plans, Sonnet executes).
5. Task Routing Matrix
Optimal model and effort pairing per task type. The goal: reserve Opus + High/Max for work that actually needs depth-first reasoning, and let Sonnet + Medium handle the volume.
| Task Category | Model | Effort | Orchestration | Rationale |
|---|---|---|---|---|
| Boilerplate, Syntax, Formatting | Sonnet / Haiku | Low | Single agent | Pure pattern matching. Reasoning overhead is wasted here. |
| Feature Implementation, Standard Bugs | Sonnet 4.6 | Medium | Single + auto sub-agents | 98% of Opus coding performance at 40-60% lower cost. |
| Complex Refactoring, System Design | opusplan | High | Hybrid (Opus plans, Sonnet executes) | Opus designs the blueprint, Sonnet does the volume work. |
| Security Audits, PR Review | Sonnet 4.6 | High | Agent Teams | Sonnet's breadth-first scanning catches diverse edge cases across modules. |
| Deep Root-Cause Analysis | Opus 4.6 | High / Max | Single agent | Complex stack traces and cross-file logic require depth-first reasoning. |
| Architecture Migration (cross-layer) | Opus 4.6 | High | Agent Teams | Parallelized specialists for frontend/backend/infra. High token cost, justified by scope. |
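The matrix reduces to a lookup. As a sketch, a hypothetical route helper: the function name and task labels are invented for illustration, and only the model IDs and effort levels come from the matrix above.

```shell
# Hypothetical helper: map a task category to "<model> <effort>" per the matrix.
route() {
  case "$1" in
    boilerplate) echo "claude-haiku-4-5 low" ;;
    feature)     echo "claude-sonnet-4-6 medium" ;;
    refactor)    echo "opusplan high" ;;
    audit)       echo "claude-sonnet-4-6 high" ;;
    rootcause)   echo "claude-opus-4-6 max" ;;
    migration)   echo "claude-opus-4-6 high" ;;
    *)           echo "claude-sonnet-4-6 medium" ;;  # default to the volume tier
  esac
}

# Usage sketch: launch Claude Code with the chosen pairing.
#   set -- $(route rootcause)
#   ANTHROPIC_MODEL=$1 CLAUDE_CODE_EFFORT_LEVEL=$2 claude
```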
6. Effort-Complexity Mismatch: Known Regression Patterns
Illustrative: Mismatching effort level to task complexity produces distinct failure modes. Running Opus 4.6 at Max effort on simple tasks triggers over-exploration loops (documented in #28469, #26761). Running Sonnet 4.6 at Low effort on complex tasks produces speculative paralysis and incomplete fixes.
The mitigation is the same in both directions: match effort to complexity. Medium effort bounds the reasoning engine and forces both models to act on their highest-probability plans.
Opus Over-Exploration Loop
At High/Max effort on simple tasks, Opus generates excessive counter-hypotheses, spawns redundant sub-agents, and burns thousands of reasoning tokens before arriving at the same initial conclusion. Documented in Issues #28469, #26761, #37023.
Sonnet Speculative Paralysis
When pushed beyond its reasoning depth on complex tasks at Low/Medium effort, Sonnet reaches a conclusion but immediately self-doubts, entering a loop of restating the problem without executing a fix.
7. Peak Hours & Session Distribution
Real data: On March 26, 2026, Anthropic confirmed that 5-hour session limits are adjusted during peak hours: weekdays 5:00-11:00 AM PT. During peak, each token costs more "usage units" against your rolling window. The result: you hit your session ceiling sooner, but weekly limits are unchanged and response quality is unaffected.
Source: Anthropic engineer Thariq Shihipar via X/Twitter, March 26, 2026. Estimated ~7% of users affected. Max 20x subscribers largely insulated (~2% seeing differences).
Session Distribution by Hour (Pacific Time)
2,094 sessions from a Max 20x subscriber, Jan-Apr 2026. The red zone marks Anthropic's declared peak window. This user's natural schedule (night owl, CT timezone) avoids peak almost entirely: only 7.9% of sessions fall in the 5-11 AM PT window.
8. Weekly Output Trends & Rate Limit Behavior
Real data: Weekly output tokens, Feb-Mar 2026, spanning three subscription tiers. The upgrade path is visible: Pro $20 (Feb 5), Max 5x $100 (Feb 12), Max 20x $200 (Mar 5). Peak week hit 7.5M output tokens (~$5,558 API-equivalent). After heavy days, next-day output consistently drops, by 57-91% in the sessions tabulated below.
Heavy Day Recovery Pattern
After days exceeding 1M output tokens, the next day's output drops significantly. This pattern is consistent and reflects the rate limiter throttling sustained peak usage.
| Heavy Day | Output Tokens | Next Day | Output Tokens | Next-Day Retention |
|---|---|---|---|---|
| Mar 9 | 1,517,757 | Mar 10 | 649,725 | 43% |
| Mar 13 | 1,591,558 | Mar 14 | 621,848 | 39% |
| Mar 16 | 816,648 | Mar 17 | 228,093 | 28% |
| Mar 23 | 1,086,790 | Mar 24 | 239,295 | 22% |
| Mar 27 | 1,629,977 | Mar 28 | 146,031 | 9% |
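Each percentage in the table is simply next-day output divided by heavy-day output. A quick sanity check over the raw token counts (all numbers from the table; no external data):

```shell
# pairs: <heavy-day tokens> <next-day tokens>, from the table above
for pair in "1517757 649725" "1591558 621848" "816648 228093" \
            "1086790 239295" "1629977 146031"; do
  set -- $pair
  # retained = next-day / heavy-day, as a rounded percentage
  awk -v heavy="$1" -v next_day="$2" \
    'BEGIN { printf "%.0f%%\n", next_day / heavy * 100 }'
done
```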
9. API Pricing: Who Got Cheaper, Who Didn't
Published data: Opus output pricing dropped 67% ($75 to $25/MTok) with the 4.5 release in Nov 2025. Sonnet has held at $15/MTok since March 2024: eight models, zero price changes. Haiku went the other direction: 4x more expensive ($1.25 to $5/MTok) as it got smarter.
| Model | Mar 2024 Launch | Output $/MTok Then | Output $/MTok Now | Change |
|---|---|---|---|---|
| Opus | Claude 3 Opus | $75.00 | $25.00 | -67% |
| Sonnet | Claude 3 Sonnet | $15.00 | $15.00 | 0% |
| Haiku | Claude 3 Haiku | $1.25 | $5.00 | +300% |
Sources: Anthropic pricing page, TechCrunch (Haiku price hike Nov 2024), InfoWorld (Opus 4.5 price drop Nov 2025). Haiku briefly dropped to $4/MTok in Dec 2024 before returning to $5 with Haiku 4.5.
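As a quick arithmetic check on the Change column, with prices taken from the table (percent change = (now - then) / then):

```shell
awk 'BEGIN {
  printf "Opus:   %+.0f%%\n", (25.00 - 75.00) / 75.00 * 100   # -67%
  printf "Sonnet: %+.0f%%\n", (15.00 - 15.00) / 15.00 * 100   # +0%
  printf "Haiku:  %+.0f%%\n", (5.00  - 1.25) / 1.25  * 100    # +300%
}'
```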
10. The Tightening: Rate Limit Layers
Documented: Pro launched with one rule. Two years later, subscribers navigate three layers of constraints. API tokens got cheaper; subscription access got more controlled.
The paradox: API prices dropped (Opus: -67%). Subscription prices held ($20 Pro, $200 Max). But effective per-dollar access got progressively more constrained through limit layering. Cheaper to run, more restricted to use.
11. Weekly Output by Model
Real data: Output token distribution across Opus 4.6, Sonnet 4.6, and Haiku 4.5, Feb-Mar 2026. Note the Feb 26 week where Sonnet outproduced Opus (2.3M vs 1.9M) on the Max 5x plan, followed by a hard pivot to Opus-dominant workflow after upgrading to Max 20x. Haiku maintains steady sub-agent output throughout.
12. The Three-Meter System: How Weekly Limits Actually Nest
Confirmed: The Claude dashboard shows three usage meters. The "Sonnet only" meter looks like a separate pool. It is not. Sonnet usage drains both "All models" AND "Sonnet only" simultaneously. When "All models" hits 100%, you are locked out of everything, including Sonnet, regardless of remaining Sonnet capacity.
Confirmed via user reports with screenshots: GitHub Issues #12487, #14362, #12795.
The Misleading Launch Message
When Opus 4.5 launched (Nov 2025), Anthropic's in-app notification said: "Sonnet now has its own limit, it's set to match your previous overall limit, so you can use just as much as before." This language implies independence, but the implementation is nested. Multiple users have flagged the confusion (#12487). Anthropic has not clarified.
What Happens at 100% "All Models"
Locked out of everything. Doesn't matter if "Sonnet only" shows 5% or 95%. The master bucket is empty. Wait for reset.
What Happens at 100% "Sonnet Only"
Sonnet is unavailable, but Opus and Haiku may still work if "All models" has remaining capacity. The sub-cap prevents you from burning all your budget on one model.
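To make the nesting concrete, a toy model in shell. The caps and unit counts are invented; only the drain semantics (Sonnet spends against both meters, and the master bucket gates every model) reflect the behavior described above.

```shell
# Hypothetical caps, in arbitrary "usage units"
ALL_CAP=100
SONNET_CAP=80
all_used=0
sonnet_used=0

spend() {  # spend <model> <units>
  all_used=$((all_used + $2))            # every model drains "All models"
  if [ "$1" = "sonnet" ]; then
    sonnet_used=$((sonnet_used + $2))    # Sonnet ALSO drains its own sub-meter
  fi
}

status() {  # status <model>
  if [ "$all_used" -ge "$ALL_CAP" ]; then
    echo "locked out: All models at 100%"             # master bucket gates everything
  elif [ "$1" = "sonnet" ] && [ "$sonnet_used" -ge "$SONNET_CAP" ]; then
    echo "Sonnet capped: Opus/Haiku still available"  # sub-cap only blocks Sonnet
  else
    echo "ok"
  fi
}

spend sonnet 100   # a Sonnet-heavy week drains BOTH meters
status opus        # locked out of Opus too, despite never touching it
```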
13. Max 20x Plan Economics ($200/month)
Anthropic does not publish hard token caps. The system uses internal "usage units" with a rolling 5-hour window and a separate weekly budget. When you hit the limit, you get model-downgraded (Opus to Sonnet), not cut off.
Known Issue: CLI Telemetry Desync (#24727)
The CLI's /status command can report 100% usage while the web dashboard shows significantly lower figures. If Extra Usage is enabled with an unlimited spend cap, the CLI may silently fall back to API billing. Recommendation: disable unlimited Extra Usage or cross-reference the web dashboard until this is patched.
April 4, 2026: Third-Party Framework Enforcement
Anthropic now enforces that Max subscription tokens cannot power third-party autonomous frameworks (OpenClaw, Cursor, custom harnesses). These must use pay-as-you-go API billing. The Max plan is ring-fenced for first-party interfaces: claude.ai, Claude Desktop, and the Claude Code CLI.
14. Configuration Quick Reference
Key environment variables and settings for tuning Claude Code behavior.
| Variable | Purpose | Example |
|---|---|---|
| ANTHROPIC_MODEL | Override primary model | opusplan |
| CLAUDE_CODE_EFFORT_LEVEL | Persist effort across sessions (only way to persist Max) | max |
| CLAUDE_CODE_SUBAGENT_MODEL | Model for background sub-agents | claude-haiku-4-5 |
| ANTHROPIC_DEFAULT_OPUS_MODEL | Pin Opus version for opusplan routing | claude-opus-4-6 |
| ANTHROPIC_DEFAULT_SONNET_MODEL | Pin Sonnet version for opusplan routing | claude-sonnet-4-6 |
| CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS | Enable Agent Teams (experimental) | 1 |
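Pulling the table together, an example shell profile block. Values are taken straight from the table; whether Max effort belongs in a global profile is a judgment call, given the over-exploration patterns in section 6.

```shell
# Claude Code routing profile, e.g. in ~/.bashrc
export ANTHROPIC_MODEL=opusplan                          # Opus plans, Sonnet executes
export ANTHROPIC_DEFAULT_OPUS_MODEL=claude-opus-4-6      # pin opusplan's planner
export ANTHROPIC_DEFAULT_SONNET_MODEL=claude-sonnet-4-6  # pin opusplan's executor
export CLAUDE_CODE_SUBAGENT_MODEL=claude-haiku-4-5       # cheap parsing sub-agents
export CLAUDE_CODE_EFFORT_LEVEL=max                      # env var is the only reliable Max persistence
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1            # opt into experimental Agent Teams
```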
Known UI Bug (#30726)
Setting effortLevel: "max" in settings.json works at session start, but interacting with the VS Code effort slider or model selector silently downgrades to Medium. Use the CLAUDE_CODE_EFFORT_LEVEL env var for reliable persistence.