The Real Reason You're Hitting Usage Limits
Claude Code is the most powerful coding tool on the planet right now. It's also the easiest one to burn through tokens on if you don't know what you're doing.
Most people hit their usage limits because they're using Claude wrong — not because Claude is expensive. Three small changes and you'll get 3-5x more out of the same plan.
Run /model opusplan
Claude has multiple models. Opus 4.6 is the smartest. Sonnet 4.6 is fast, cheap, and honestly more than enough for 90% of coding work.
If your Claude Code session runs Opus for everything, that's overkill, and it's why your tokens disappear so fast.
Run this command in your Claude Code session:
/model opusplan
Now Opus only handles planning — the big-picture thinking. Sonnet handles execution — the actual file edits and code writing. Same quality output. Roughly 5x cheaper on the heavy lifting. Do it once per session.
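To see what the split does to the bill, here is a minimal back-of-the-envelope sketch. The 5:1 price ratio comes from the claim above; the token volumes and the names blended_cost, OPUS_PRICE, and SONNET_PRICE are made up for illustration and are not published prices.

```python
# Relative cost of a session when planning and execution use different models.
# All numbers are illustrative assumptions, not published prices.

def blended_cost(plan_tokens: int, exec_tokens: int,
                 plan_price: float, exec_price: float) -> float:
    """Cost in arbitrary units: tokens times per-token price for each phase."""
    return plan_tokens * plan_price + exec_tokens * exec_price

OPUS_PRICE = 5.0    # assumed relative price per token (the ~5x figure above)
SONNET_PRICE = 1.0  # baseline

plan_tokens = 20_000   # hypothetical planning volume
exec_tokens = 180_000  # hypothetical execution volume

all_opus = blended_cost(plan_tokens, exec_tokens, OPUS_PRICE, OPUS_PRICE)
split = blended_cost(plan_tokens, exec_tokens, OPUS_PRICE, SONNET_PRICE)

print(f"all Opus: {all_opus:,.0f} units")
print(f"Opus plan + Sonnet execution: {split:,.0f} units")
print(f"overall saving: {all_opus / split:.1f}x")
```

With these made-up volumes the whole session comes out about 3.6x cheaper, not a full 5x, because the planning tokens still run on the pricier model. The more of your session is execution, the closer you get to 5x.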
Use Subagents For Research And Exploration
Every message you send, Claude re-reads your entire chat history. So when you tell Claude "go explore this codebase and find X," all those file reads bloat your context — and you pay for them on every future message.
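A rough way to picture the re-reading cost is the sketch below. It assumes every turn re-sends the full history and ignores prompt caching, which in practice makes re-reads cheaper; the function name and the token counts are hypothetical.

```python
def cumulative_input_tokens(turn_sizes: list[int]) -> int:
    """Total input tokens read when every turn re-sends all earlier turns.

    turn_sizes[i] is the number of tokens that turn i adds to the history.
    """
    total = 0
    context = 0
    for size in turn_sizes:
        context += size   # history grows by this turn's content
        total += context  # the whole history is read again on this turn
    return total

# Hypothetical session: 10 short messages of 500 tokens each.
small_chat = [500] * 10
# Same session, but turn 2 pulls 40,000 tokens of file reads into the chat.
bloated_chat = [500, 40_000] + [500] * 8

print(cumulative_input_tokens(small_chat))
print(cumulative_input_tokens(bloated_chat))
```

In this toy model one big exploration step early in the chat takes the session from 27,500 input tokens to 383,000, because those file reads get read again on every later turn.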
The fix is subagents.
A subagent runs in its own context window. You send it off to do the heavy reading — exploring the codebase, researching a library, scanning logs — and it sends back a clean 2-paragraph summary. Your main chat stays small and fast.
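How to use them: just ask. Say something like "use a subagent to explore this codebase and find X, then report back a short summary."
Claude will spin one up automatically. The subagent reads 50 files in its own context. Those reads still count once, but only the summary lands in your main chat, so you stop re-paying for them on every later message. This is the move most people don't even know exists.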
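Here is a minimal sketch of that trade-off, assuming the subagent's own file reads are counted once and the main chat re-reads its history on every turn (prompt caching ignored). All names and numbers are hypothetical.

```python
def main_chat_cost(exploration_tokens: int, later_turns: int,
                   turn_tokens: int) -> int:
    """Input tokens for the main chat after an exploration step.

    exploration_tokens is what the exploration leaves in the main context:
    the raw file reads (inline) or just the summary (subagent).
    """
    context = exploration_tokens
    total = context
    for _ in range(later_turns):
        context += turn_tokens  # each later message adds a little history
        total += context        # and the whole history is read again
    return total

FILE_READS = 40_000  # hypothetical tokens read while exploring
SUMMARY = 400        # hypothetical size of the subagent's report
LATER_TURNS = 9
TURN_TOKENS = 500

inline = main_chat_cost(FILE_READS, LATER_TURNS, TURN_TOKENS)
# The subagent still reads the files once in its own context window.
with_subagent = FILE_READS + main_chat_cost(SUMMARY, LATER_TURNS, TURN_TOKENS)

print(f"inline exploration: {inline:,} tokens")
print(f"with subagent:      {with_subagent:,} tokens")
```

Under these assumptions the inline version reads 422,500 tokens and the subagent version 66,500. The gap widens the longer you keep chatting after the exploration.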
Install The Caveman Plugin
This one sounds like a joke. It's not.
Caveman is a Claude Code plugin that makes Claude respond in caveman-speak — short, blunt, no filler words. Same technical accuracy, way fewer tokens.
The benchmarks are real:
- 65% average reduction in output tokens
- 100% technical accuracy preserved
- 3x faster responses
Instead of a padded, multi-sentence explanation, Claude writes a few blunt words.
Same answer. 75% less talking. You read faster, Claude burns fewer tokens, your usage limit stretches way further.
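If you want to sanity-check the savings on your own chats, a crude sketch like this one works. It counts whitespace-separated words as a stand-in for tokens, which is only an approximation, and the two sample strings are invented for this example, not real plugin output.

```python
def reduction_percent(verbose: str, terse: str) -> float:
    """Percent fewer whitespace-separated words in terse than in verbose."""
    before = len(verbose.split())
    after = len(terse.split())
    if before == 0:
        raise ValueError("verbose text is empty")
    return 100.0 * (before - after) / before

# Hypothetical pair written for this sketch (not actual plugin output).
verbose = ("Sure, I would be happy to help with that. The error happens "
           "because the variable is undefined when the function runs.")
terse = "Variable undefined when function runs. Define first."

print(f"{reduction_percent(verbose, terse):.1f}% fewer words")
```

Paste in a real verbose reply and its terse counterpart to get your own number. A proper tokenizer would give a more faithful figure than word counts.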
How to install: just tell Claude Code:
That's it. Once installed, type /caveman to activate. Pick your level: lite, full, or ultra depending on how much grunt you want.
Stack All Three
These three compound on each other:
- /model opusplan cuts your model cost by ~5x
- Subagents cut your context bloat
- Caveman cuts your output tokens by ~65%
Combined, you'll go from hitting usage limits daily to barely thinking about them. You don't need a bigger plan. You just need to use Claude Code like someone who actually knows what they're doing.
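Why do they compound? Each one hits a different part of the bill. The sketch below is a toy model under loud assumptions: cost is a relative per-token price times input plus output tokens (real pricing weights output more heavily), and every number except the 65% output cut is invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Session:
    input_tokens: float
    output_tokens: float
    price_factor: float = 1.0  # relative per-token price, 1.0 = baseline

    def cost(self) -> float:
        """Toy cost: price factor times all tokens, input and output alike."""
        return self.price_factor * (self.input_tokens + self.output_tokens)

def stack(base: Session, price_factor: float,
          input_kept: float, output_kept: float) -> Session:
    """Apply all three savings; each *_kept is the fraction that remains."""
    return Session(
        input_tokens=base.input_tokens * input_kept,     # subagents
        output_tokens=base.output_tokens * output_kept,  # caveman
        price_factor=base.price_factor * price_factor,   # cheaper model mix
    )

baseline = Session(input_tokens=400_000, output_tokens=50_000)
stacked = stack(baseline,
                price_factor=0.3,   # assumed blended model price
                input_kept=0.2,     # assumed context left after subagents
                output_kept=0.35)   # 65% output reduction claimed above

print(f"baseline: {baseline.cost():,.0f}")
print(f"stacked:  {stacked.cost():,.0f}")
print(f"ratio:    {baseline.cost() / stacked.cost():.1f}x")
```

With these made-up inputs the stacked session costs roughly 15x less than the baseline. Your real ratio depends on how much of your usage is context, output, and model price.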