---
id: ins_red-green-tdd-shorthand-for-agents
operator: Simon Willison
operator_role: Independent AI/engineering writer; co-creator of Django
source_url: https://www.lennysnewsletter.com/p/an-ai-state-of-the-union
source_type: podcast
source_title: Simon Willison on agentic engineering and the November 2025 inflection — Lenny's Podcast
source_date: 2026-04-02
captured_date: 2026-05-01
domain: [ai-native, engineering]
lifecycle: [ai-workflow]
maturity: applied
artifact_class: workflow
score: { originality: 3, specificity: 5, evidence: 3, transferability: 5, source: 5 }
tier: B
related: [ins_personal-pattern-hoarding]
raw_ref: raw/podcasts/simon-willison--agentic-engineering-november-inflection--2026-04-02.md
---

# Encode jargon shorthand once, save tokens forever

## Claim
Compress recurring multi-paragraph instructions into terms the agent already recognizes ("Red/Green TDD", "monorepo conventions", "ship as research preview") so each new task spends tokens on the work, not the meta-instruction.

## Mechanism
LLMs are pattern-matchers over training distribution. Standard jargon collapses a long natural-language spec into a short symbol the model has seen thousands of times in training. The agent already knows the implied steps, conventions, and edge cases. Token spend per task drops; consistency across tasks rises because the symbol carries shared assumptions.

## Conditions
Holds when:
- The shorthand is a real industry term, not a private acronym. Models recognize "Red/Green TDD"; they do not recognize "RG-TDD-v2".
- The team agrees on the shorthand. Mixed usage produces drift.

Fails when:
- The shorthand is too coarse for the actual task. "Use TDD" is fine for a CRUD endpoint and wrong for a complex algorithm where the test design is the hard part.
- The model's training cutoff predates the term. Then you pay full token cost anyway.

## Evidence
> "I hated TDD as a human; I love it for agents."

Simon notes agents don't get bored writing 1,000 lines of test boilerplate. Adjacent rule he flags: keep tests generously now that updating them is free, what used to be a code smell (verbose test suites) is now a feature.

· Simon Willison on Lenny's Podcast, 2026-04-02

## Signals
- Agent prompts shrink over time as the team adopts shorthand.
- Test suites grow faster than the codebase without slowing iteration.
- New team members adopt the shorthand quickly because the model reinforces it.

## Counter-evidence
Over-compressed prompts hide intent and produce wrong outputs when the symbol means something different in the agent's training data. Always sanity-check the agent's first output on a new shorthand before scaling it.

## Cross-references
- `ins_personal-pattern-hoarding`, the same idea applied at the artifact level instead of the language level