--- name: token-efficiency description: Reduce token waste by 40-60% through anti-sycophancy rules, tool-call budgets, one-pass coding, task profiles, and read-before-write enforcement. Inspired by drona23/claude-token-efficient. --- # Token Efficiency Reduce output token waste and prevent iteration cycles that consume context. ## Trigger Use when: - Sessions feel expensive or slow - Output is verbose with filler text - Claude is re-reading files or iterating unnecessarily - Setting up a new project for token-efficient work ## Anti-Sycophancy Rules These patterns waste 30-60% of output tokens: | Pattern | Example | Fix | |---------|---------|-----| | Sycophantic opener | "Sure! Great question!" | Delete. Lead with answer. | | Prompt restatement | "You're asking about X..." | Delete. Answer directly. | | Closing fluff | "Let me know if you need anything!" | Delete. Stop after the answer. | | Unsolicited suggestions | "You might also want to..." | Delete unless asked. | | AI disclaimers | "As an AI model..." | Delete entirely. | | Verbose preambles | "I'll help you with that..." | Delete. Start with the action. | ## Tool-Call Budgets Set explicit budgets by task complexity: | Task Type | Tool-Call Budget | Wrap-Up At | |-----------|-----------------|------------| | Quick fix / lookup | 20 calls | 15 | | Bug fix | 30 calls | 25 | | Feature (small) | 50 calls | 40 | | Feature (large) | 80 calls | 65 | | Refactor | 50 calls | 40 | | Exploration / research | 30 calls | 25 | At the wrap-up threshold: commit progress, assess remaining work, decide whether to continue or start fresh. ## One-Pass Coding Discipline For simple-to-medium tasks: 1. **Read all relevant files** including tests first 2. **Understand what tests assert** before coding 3. **Write complete solution in one pass** — not incrementally 4. **Run tests once** — if pass, STOP immediately 5. **If fail**: read the error, fix once, retest 6. **Never iterate** more than twice on the same failure — rethink approach 7. **Never refactor, improve, or polish passing code** ## Task Profiles Switch profiles based on what you're doing: ### Coding Profile - Return code first, explanation after (only if non-obvious) - Simplest working solution, no over-engineering - Read file before modifying — always - No docstrings on unchanged code - No error handling for impossible scenarios - State bug, show fix, stop ### Agent/Pipeline Profile - Structured output only: JSON, bullets, tables - No prose unless targeting a human reader - Every output must be parseable without post-processing - Execute task, do not narrate actions - Never invent file paths, API endpoints, or function names - If unknown: return null or "UNKNOWN", never guess ### Analysis Profile - Lead with finding, context and methodology after - Tables and bullets over prose - Numbers must include units - Never fabricate data points - Summary first (3 bullets max), caveats last ## Read-Before-Write Enforcement Hard rules: 1. **Never write a file you haven't read** in this session 2. **Never re-read a file** already read unless it was modified 3. **Read tests before coding** — understand what passes before writing 4. **Read error output carefully** before attempting a fix ## ASCII-Only Output Use ASCII characters only in all output: - `--` not `—` (em dash) - `"` not `"` `"` (smart quotes) - `'` not `'` `'` (curly apostrophes) - No emoji unless explicitly requested - No Unicode decorators or special characters This ensures clean copy-paste for code and compatibility with downstream systems. ## Measuring Impact Track these metrics to measure token savings: - **Output length**: average words per response (target: 30-50% reduction) - **Tool calls per task**: should stay within budget tier - **Re-read count**: should be near zero - **Write-without-read count**: should be zero - **Iteration cycles**: tests should pass in 1-2 attempts, not 5+ ## Attribution Token efficiency patterns adapted from [drona23/claude-token-efficient](https://github.com/drona23/claude-token-efficient) (MIT).