--- name: caveman-token-optimizer description: Claude Code skill that makes AI agents respond in caveman-speak, cutting ~65-75% of output tokens while preserving full technical accuracy triggers: - "install caveman mode" - "make claude talk like caveman" - "reduce token usage caveman" - "enable caveman skill" - "less tokens please" - "caveman mode for claude code" - "add caveman plugin" - "token optimization caveman" --- # Caveman Token Optimizer > Skill by [ara.so](https://ara.so) — Daily 2026 Skills collection. A Claude Code skill and Codex plugin that makes AI agents respond in compressed caveman-speak — cutting **~65% of output tokens on average** (up to 87%) while keeping full technical accuracy. No pleasantries. No filler. Just answer. ## What It Does Caveman mode strips: - Pleasantries: "Sure, I'd be happy to help!" → gone - Hedging: "It might be worth considering" → gone - Articles (a, an, the) → gone - Verbose transitions → gone Caveman keeps: - All code blocks (written normally) - Technical terms (exact: `useMemo`, `polymorphism`, `middleware`) - Error messages (quoted exactly) - Git commits and PR descriptions (normal) **Same fix. 75% less word. Brain still big.** ## Install ### Claude Code (npx) ```bash npx skills add JuliusBrussee/caveman ``` ### Claude Code (plugin system) ```bash claude plugin marketplace add JuliusBrussee/caveman claude plugin install caveman@caveman ``` ### Codex 1. Clone the repo 2. Open Codex inside the repo 3. Run `/plugins` 4. Search `Caveman` 5. Install plugin Install once. Works in all sessions after that. ### Manual / Local ```bash git clone https://github.com/JuliusBrussee/caveman.git cd caveman pip install -e . ``` ## Usage — Trigger Commands ### Claude Code ``` /caveman # enable default (full) caveman mode /caveman lite # professional brevity, grammar intact /caveman full # default — drop articles, use fragments /caveman ultra # maximum compression, telegraphic ``` ### Codex ``` $caveman $caveman lite $caveman full $caveman ultra ``` ### Natural language triggers Any of these phrases activate caveman mode: - "talk like caveman" - "caveman mode" - "less tokens please" - "be concise" ### Disable ``` /caveman off # or say: "stop caveman" / "normal mode" ``` Level sticks until changed or session ends. ## Intensity Levels | Level | Trigger | Style | Example | |-------|---------|-------|---------| | **Lite** | `/caveman lite` | Drop filler, keep grammar | "Component re-renders because inline object prop creates new reference each cycle. Wrap in `useMemo`." | | **Full** | `/caveman full` | Drop articles, use fragments | "New object ref each render. Inline prop = new ref = re-render. Wrap in `useMemo`." | | **Ultra** | `/caveman ultra` | Telegraphic, abbreviate everything | "Inline obj prop → new ref → re-render. `useMemo`." | ## Benchmark Results Real token counts from Claude API (reproducible via `benchmarks/` directory): | Task | Normal | Caveman | Saved | |------|-------:|--------:|------:| | React re-render bug | 1180 | 159 | 87% | | Auth middleware fix | 704 | 121 | 83% | | PostgreSQL pool setup | 2347 | 380 | 84% | | Git rebase vs merge | 702 | 292 | 58% | | Async/await refactor | 387 | 301 | 22% | | Docker multi-stage build | 1042 | 290 | 72% | | **Average** | **1214** | **294** | **65%** | > **Important:** Caveman only affects output tokens. Thinking/reasoning tokens are untouched. Caveman make *mouth* smaller, not brain. ## Reproducing Benchmarks ```bash git clone https://github.com/JuliusBrussee/caveman.git cd caveman/benchmarks # Set your Anthropic API key export ANTHROPIC_API_KEY=your_key_here # Run benchmark suite python run_benchmarks.py # Compare normal vs caveman responses python compare.py --task react-rerender python compare.py --task auth-middleware python compare.py --all ``` ## Code Examples — What Caveman Mode Changes ### Before (normal, 69 tokens) ``` The reason your React component is re-rendering is likely because you're creating a new object reference on each render cycle. When you pass an inline object as a prop, React's shallow comparison sees it as a different object every time, which triggers a re-render. I'd recommend using useMemo to memoize the object. ``` ### After (caveman full, 19 tokens) ``` New object ref each render. Inline object prop = new ref = re-render. Wrap in `useMemo`. ``` ### Code blocks stay normal — caveman not stupid ```python # Caveman explains in grunt, but code stays clean: # "Token expiry check broken. Fix:" def verify_token(token: str) -> bool: payload = jwt.decode(token, SECRET_KEY, algorithms=["HS256"]) # Was: payload["exp"] < time.time() # Fix: return payload["exp"] >= time.time() ``` ## What Caveman Preserves vs. Removes ```python # Tokens caveman REMOVES (waste): filler_phrases = [ "I'd be happy to help you with that", # 8 tokens gone "The reason this is happening is because", # 7 tokens gone "I would recommend that you consider", # 7 tokens gone "Sure, let me take a look at that", # 8 tokens gone "Great question!", # 2 tokens gone "Certainly!", # 1 token gone ] # Things caveman KEEPS (substance): preserved = [ "code blocks", # always normal "technical_terms", # exact spelling preserved "error_messages", # quoted verbatim "variable_names", # exact "git_commits", # normal prose "pr_descriptions", # normal prose ] ``` ## Integration Pattern — Using in a Project If you want caveman-style compression in your own Claude API calls: ```python import anthropic client = anthropic.Anthropic() # uses ANTHROPIC_API_KEY env var # Load the caveman SKILL.md as a system prompt addition with open("path/to/caveman/SKILL.md", "r") as f: caveman_skill = f.read() response = client.messages.create( model="claude-opus-4-5", max_tokens=1024, system=f"{caveman_skill}\n\nRespond in caveman mode: full intensity.", messages=[ {"role": "user", "content": "Why is my React component re-rendering?"} ] ) print(response.content[0].text) # → "New object ref each render. Inline prop = new ref = re-render. useMemo fix." print(f"Tokens used: {response.usage.output_tokens}") # ~19 vs ~69 ``` ## Session Workflow ``` # Start session with caveman /caveman full # Ask technical questions normally — agent responds in caveman > Why does my Docker build take so long? → "Layer cache miss. COPY before RUN npm install. Fix order:" [code block shown normally] # Switch intensity mid-session /caveman lite # Turn off for PR description writing /caveman off > Write a PR description for this auth fix → [normal, professional prose] # Back to caveman /caveman ``` ## Troubleshooting **Caveman mode not activating:** ```bash # Verify plugin installed claude plugin list | grep caveman # Reinstall claude plugin remove caveman claude plugin install caveman@caveman ``` **Savings lower than expected:** - Caveman only compresses *output* tokens — input tokens unchanged - Tasks with heavy code output (like Docker setup) see less savings since code is preserved verbatim - Reasoning/thinking tokens not affected — savings show in visible response only - Ultra mode gets maximum compression; switch if full mode feels verbose **Need normal mode for specific output:** ``` /caveman off # for PR descriptions, user-facing docs, formal reports /caveman # re-enable after ``` **Benchmarking your own tasks:** ```bash cd benchmarks/ export ANTHROPIC_API_KEY=your_key_here python run_benchmarks.py --task "your custom task description" ``` ## Why It Works Backed by a March 2026 paper ["Brevity Constraints Reverse Performance Hierarchies in Language Models"](https://arxiv.org/abs/2604.00025): constraining large models to brief responses **improved accuracy by 26 percentage points** on certain benchmarks. Verbose not always better. ``` TOKENS SAVED ████████ 65% avg (up to 87%) TECHNICAL ACCURACY ████████ 100% RESPONSE SPEED ████████ faster (less to generate) READABILITY ████████ better (no wall of text) ``` ## Key Files ``` caveman/ ├── SKILL.md # the skill definition loaded by Claude Code ├── benchmarks/ │ ├── run_benchmarks.py # reproduce token count results │ └── compare.py # side-by-side comparison tool ├── plugin.json # Codex plugin manifest └── README.md ``` ## Links - **Repo:** https://github.com/JuliusBrussee/caveman - **Homepage:** https://juliusbrussee.github.io/caveman/ - **Also by Julius:** [Blueprint](https://github.com/JuliusBrussee/blueprint) — spec-driven dev for Claude Code --- *One rock. That it.* 🪨