---
id: ins_dark-factory-pattern
operator: Simon Willison
operator_role: Independent AI/engineering writer; co-creator of Django
source_url: https://www.lennysnewsletter.com/p/an-ai-state-of-the-union
source_type: podcast
source_title: Simon Willison on agentic engineering and the November 2025 inflection — Lenny's Podcast
source_date: 2026-04-02
captured_date: 2026-05-01
domain: [ai-native, engineering]
lifecycle: [ai-workflow, process-cadence]
maturity: frontier
artifact_class: case-study
score: { originality: 5, specificity: 5, evidence: 4, transferability: 4, source: 5 }
tier: A
related: [ins_november-2025-coding-inflection, ins_simulated-qa-swarm]
raw_ref: raw/podcasts/simon-willison--agentic-engineering-november-inflection--2026-04-02.md
---

# The dark factory: nobody reads the code, gated by a simulated QA swarm

## Claim

StrongDM has been operating since August 2025 under a two-rule progression: (1) nobody types code, (2) nobody reads code. They make this safe by running a simulated swarm of thousands of agent-employees against a vibe-coded simulated stack (Slack, Jira, Okta, the works), 24/7, at roughly $10K/day in tokens.

## Mechanism

The cost of simulating dependencies has collapsed. Building a fake Slack, fake Jira, or fake Okta used to be a six-month project; now agents build them from API docs in days. Once the simulated dependencies exist, you can run continuous adversarial-style QA at a scale no human team could match. The simulator catches regressions earlier than any human review, so the human-in-the-loop on individual PRs becomes redundant. Read-the-code discipline shifts from line-level review to suite-level test coverage.

## Conditions

Holds when:

- The product has well-defined surfaces with API docs that can be simulated.
- The team has the operating discipline and budget to run agent swarms continuously.
- The token economics work: token spend < salary saved.
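The break-even condition in the last bullet can be sketched as a one-line comparison. A minimal sketch: the ~$10K/day token figure comes from this note; the function name, headcount, and loaded-cost numbers are hypothetical placeholders for illustration.

```python
# Hedged sketch of "token spend < salary saved". The $10K/day token
# figure is from the note; engineer counts and the $300K/year loaded
# cost are hypothetical assumptions, not StrongDM data.

def tokens_beat_salaries(token_spend_per_day: float,
                         engineers_freed: int,
                         loaded_cost_per_engineer_year: float) -> bool:
    """True when daily token spend is below the daily cost of the
    review labor it replaces."""
    salary_saved_per_day = engineers_freed * loaded_cost_per_engineer_year / 365
    return token_spend_per_day < salary_saved_per_day

# Example: $10K/day in tokens vs. 15 engineers at a (hypothetical)
# $300K/year loaded cost each.
print(tokens_beat_salaries(10_000, 15, 300_000))  # → True
```

At hypothetical numbers like these the swarm pays for itself only past a certain freed-headcount threshold, which is why the budget bullet above sits alongside the discipline bullet.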
Fails when:

- The product's correctness depends on physical-world or human-judgment outputs the simulator can't cover.
- The org has not invested in the simulation harness; without it, "nobody reads code" is reckless.

## Evidence

> "The cost of simulating those dependencies has crashed... They've built simulated employees that work in a simulated Slack, simulated Jira, simulated Okta — and these are running 24/7 testing their access management software." · Simon Willison on Lenny's Podcast, 2026-04-02

Token spend is reported as ~$10K/day. StrongDM ships security software: the case where you'd most expect line-level review to be irreducible.

## Signals

- Token cost runs higher than the salary cost it replaces, but is rising slower than throughput.
- New regressions are caught by the simulated swarm before reaching staging.
- Human time shifts from PR review to harness improvement, eval design, and exception triage.
- Engineers rarely open the code; they read the swarm's reports.

## Counter-evidence

For most companies, "nobody reads code" is premature. The simulator is the load-bearing piece, and most teams have neither the budget nor the discipline to maintain a live simulated environment. The pattern transfers earliest in security, where adversarial coverage is the existing review model anyway. The marketing-side analog requires a synthetic-buyer simulator rather than a synthetic-employee one: a different shape, partly built (Nooks-style SDR practice tools), and not yet at "nobody reads the copy" maturity.

## Cross-references

- `ins_november-2025-coding-inflection`: the model-quality threshold that made this safe.
- `ins_simulated-qa-swarm`: the harness pattern abstracted from the StrongDM specifics.
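To make the simulated-dependency idea in the Mechanism section concrete, here is a minimal sketch of one such fake surface: an in-memory stand-in for Slack's `chat.postMessage` endpoint that records traffic for the QA harness to assert against. Only the method name mirrors the real Slack Web API; the class, its fields, and the response shape are simplified assumptions, not StrongDM's implementation.

```python
# Hedged sketch: a tiny in-memory fake of one Slack API surface.
# The real pattern vibe-codes whole fake Slack/Jira/Okta services from
# API docs; this shows only the shape of the idea.

class FakeSlack:
    def __init__(self) -> None:
        self.messages = []  # call log the QA harness inspects

    def chat_postMessage(self, channel: str, text: str) -> dict:
        # Record the call and return a Slack-shaped response envelope.
        msg = {"channel": channel, "text": text}
        self.messages.append(msg)
        return {"ok": True, "channel": channel, "message": msg}

# An agent-employee exercising the product posts through the fake; the
# harness then asserts on recorded traffic instead of reading code.
slack = FakeSlack()
resp = slack.chat_postMessage("#alerts", "access request approved")
assert resp["ok"] and slack.messages[0]["text"] == "access request approved"
```

The design point is that assertions move to the boundary: the swarm's reports summarize this recorded traffic, which is what engineers read instead of the diff.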