---
title: How To Control Smarter Intelligences
date: "2025-07-01T06:37:40Z"
lastmod: "2025-07-01T06:37:42Z"
categories:
  - llms
wp_id: 4155
description: "The way to govern smarter systems is through mechanisms like checklists, sampling, red-teaming, gating, consensus, and outcome-based evaluation."
keywords: ["AI governance", "checklists", "red teaming", "sampling", "consensus", "evaluation methods"]
---

![How To Control Smarter Intelligences](/blog/assets/ChatGPT-Image-Jul-1-2025-11_27_23-AM.webp)

LLMs are smarter than us in many areas. How do we manage them?

This is not a new problem.

- **VC partners** evaluate deep-tech startups.
- **Science editors** review Nobel laureates.
- **Managers** manage specialist teams.
- **Judges** evaluate expert testimony.
- **Coaches** train Olympic athletes.

… and they manage and evaluate "smarter" outputs in **many** ways:

1. **Verify**. Check against an "answer sheet".
2. **Checklist**. Evaluate against pre-defined criteria.
3. **Sampling**. Randomly review a subset.
4. **Gating**. Accept low-risk work. Evaluate critical work.
5. **Benchmark**. Compare against others.
6. **Red-team**. Probe to expose hidden flaws.
7. **Double-blind review**. Mask identity to curb bias.
8. **Reproduce**. Does re-running give the same output?
9. **Consensus**. Aggregate multiple responses. Wisdom of crowds.
10. **Outcome**. Did it work in the real world?

For example:

- **Vibe coding**: Non-programmers might glance at lint checks (**Checklist**) and see if it works (**Outcome**).
- **LLM image designs**: Developers might check if a few images look good (**Sampling**) and poll a few marketers (**Consensus**).
- **LLM news articles**: A journalist might run a **Checklist**, a **Double-blind review** with experts, and **Verify** critical facts (**Gating**).

You **already** know many of these. You learnt them in Auditing. Statistics. Law. System controls. Policy analysis. Quality engineering. Clinical epidemiology. Investigative journalism. Design critique.

Worth brushing up on these skills. They're **more** important in the AI era.
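
Three of these mechanisms (**Sampling**, **Gating**, **Consensus**) are simple enough to sketch in code. The Python below is a minimal illustration, not a recommended implementation: the function names, the `risk` score between 0 and 1, and the thresholds are all assumptions made up for this example.

```python
import random
from collections import Counter


def sample_for_review(outputs, rate, seed=0):
    """Sampling: pick a random subset of outputs for a human to review."""
    rng = random.Random(seed)  # fixed seed so the audit sample is repeatable
    k = min(len(outputs), max(1, round(len(outputs) * rate)))
    return rng.sample(outputs, k)


def gate(risk, threshold=0.5):
    """Gating: accept low-risk work, send everything else for review.

    `risk` is a hypothetical score in [0, 1]; how you compute it is up to you.
    """
    return "accept" if risk < threshold else "review"


def consensus(answers, min_share=0.6):
    """Consensus: return the majority answer, or None if agreement is weak."""
    if not answers:
        return None
    answer, count = Counter(answers).most_common(1)[0]
    return answer if count / len(answers) >= min_share else None
```

For instance, you might ask the same question of three models and call `consensus(["A", "A", "B"])`, review a quarter of a batch with `sample_for_review(batch, 0.25)`, and use `gate(risk)` to decide which items skip review entirely. The thresholds are the real design decision: set them by how costly a missed error is, not by convenience.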