---
title: '2026 Watchlist'
description: 'The dated signals, scenario trajectories, and falsifiers that will tell you whether the framework holds.'
---

A framework that does not specify how it could be wrong is not a framework. It is a marketing claim.

This page lists the concrete signals over the next twelve to eighteen months that test the analysis on this site. Each scenario carries an observable. Each observable resolves one way or the other. Read it as a stress test, not as a forecast you can outsource. The framework is directional. Timing is uncertain. The point of this page is to make the uncertainty visible.

**Three-minute read.** Five scenario trajectories. Two levers that move the system this quarter. One decision window (July 1, 2026). One thesis falsifier (AI volume rises, unanchored-claim rate does not).

## The decision window: July 1, 2026

There is one date worth treating as a deadline.

```mermaid
timeline
    title The 2026 decision window
    section Now (May 2026)
        Two levers active : Release gate (0 to 60 days)
                          : Procurement artifacts (0 to 90 days)
    section Jul 1 2026
        The door : Structural validity gate in production
                 : Or faster cadence becomes culture
    section 12-18 months
        Trajectory locks in : Polished Flood, Thin Spine (base case)
                            : Or one of four alternatives
```

By July 1, 2026, any pipeline that labels output "decision-grade" should have a structural-validity release gate in place. Miss the date and the team will normalize the faster cadence. Later gates feel like sabotage rather than quality control. The retrofit cost rises sharply once the new cadence is calendared, staffed, and expected by clients.

**The no-regret action.** Require every decision-grade brief to include:

1. A numbered list of core claims.
2. A source or computation for each claim.
3. Explicit assumptions named, not assumed.
4. At least one observation that would make the conclusion wrong.

Start with one product line. Scale once the gate proves it can hold under cycle-time pressure.
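A minimal sketch of what such a gate can look like in code follows. The `Claim` and `Brief` shapes, field names, and defect messages are illustrative assumptions, not part of the framework; the only load-bearing idea is that the decision-grade label is withheld until the defect list is empty.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    text: str
    source: str | None = None       # citation, dataset, or named primary witness
    computation: str | None = None  # checkable calculation, if the claim is numeric

@dataclass
class Brief:
    claims: list[Claim] = field(default_factory=list)
    assumptions: list[str] = field(default_factory=list)
    falsifiers: list[str] = field(default_factory=list)  # observations that would overturn the conclusion

def release_gate(brief: Brief) -> list[str]:
    """Return structural defects; an empty list means the brief may carry the decision-grade label."""
    defects = []
    if not brief.claims:
        defects.append("no numbered core claims")
    for i, claim in enumerate(brief.claims, start=1):
        if not (claim.source or claim.computation):
            defects.append(f"claim {i} is unanchored: no source or computation")
    if not brief.assumptions:
        defects.append("no explicit assumptions named")
    if not brief.falsifiers:
        defects.append("no observation that would make the conclusion wrong")
    return defects
```

Wire the check into whatever already blocks a release (an editorial checklist, a template validator, CI on the document repository). The data model matters far less than the refusal to ship while the defect list is non-empty.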
## The two levers active this quarter

The system has two enforcement points with enough friction to alter the trajectory inside ninety days. Everything else is downstream of these.

**Lever 1: the release gate (0 to 60 days).** Editorial and production teams formally require assumption registers, boundary conditions, and claim-source maps before any output carries a decision-grade label.

- **Confirms:** Procurement of Proof trajectory.
- **Disconfirms if:** The deadline passes without a gate in production.

**Lever 2: procurement artifacts (0 to 90 days).** At least one procurement cycle includes structural transparency artifacts as acceptance criteria, not just deliverable templates.

- **Confirms:** Procurement of Proof trajectory.
- **Disconfirms if:** Procurement accepts narrative-only deliverables without objection.

## Five scenario trajectories

The framework predicts five plausible futures. They are not mutually exclusive; the market can split across them. Each has a named mechanism, a named observable, and a named falsifier.

**Trajectory 1 (base case): Polished Flood, Thin Spine.** The market drifts into endless glossy deliverables whose core claims lack traceable support.

- **Mechanism:** Cheap drafting pushes volume up. Fixed reviewer capacity skims rather than tests. Throughput incentives treat "reads well" as "is solid."
- **Observable:** Review cycles lengthening, style checks substituting for evidence checks, release gates optimized for speed rather than structural validity.
- **Falsifier:** Independent sampling shows stable anchoring of core claims despite higher AI volume. If unanchored-claim rates stay at pre-AI baselines while output rises, the drift is not happening.

**Trajectory 2:** A minority of producers build verification artifacts. They form islands of trust. The rest of the market becomes a narrative ocean.

- **Mechanism:** Some producers sell to high-downside buyers willing to pay for traceability. Everyone else competes on speed and polish.
- **Observable:** A visible split in deliverable standards. Some firms ship assumption registers and claim-source maps. Others ship slides only.
- **Falsifier:** Verification artifacts become table stakes across most major producers within twelve months. Fragmentation gives way to a new market norm.

**Trajectory 3: Procurement of Proof.** Purchasing departments start buying traceability, not just slides. The market shifts from brand trust to auditability.

- **Mechanism:** Buyers require claim-source maps and assumption registers as acceptance criteria. Vendors fund verification capacity to protect revenue.
- **Observable:** At least one major procurement cycle includes structural transparency artifacts as acceptance criteria within ninety days.
- **Falsifier:** Buyers keep selecting vendors primarily on reputation and turnaround. Procurement does not pull the system toward proof.

**Trajectory 4:** Badges and disclosures stack on top of the same thin verification layer. Compliance gets read as truth-testing.

- **Mechanism:** Regulators demand visible action on AI-generated content. Rules focus on labeling AI use and content provenance (who made this, with what tool), not on whether the claim is true. Organizations optimize for the check-the-box audit.
- **Observable:** New review boards, policies, or labels reference quality or governance but do not specify enforceable artifacts (no assumption registers, no claim-source maps, no discriminating tests).
- **Falsifier:** Regulators bind certification to measurable accuracy claims, or impose real liability for false analytical claims. The trajectory changes character.

**Trajectory 5:** Verification gets cheap enough that the bottleneck flips. Teams scale output without hollowing it out.

- **Mechanism:** Validator tools cut reviewer-minutes per claim by automatically tracing assertions, surfacing missing premises, and flagging conflicts. The cost curve for verification finally bends.
- **Observable:** At least one buyer pays a premium or extends a timeline specifically for validated outputs within ninety days. Organizations that build or buy genuine validation capacity gain a measurable competitive edge.
- **Falsifier:** Validator tools miss real defects (false negatives) or add so much noise that review time rises. Organizations do not trust them for decision-grade work.

## The thesis falsifier

One observation, if true, breaks the entire analysis on this site.

**What would prove the framework wrong:** Independent audits show AI-assisted volume rose, but the rate of unanchored core claims in released work did not rise.

Define "unanchored" narrowly: a core claim lacks a traceable source, a checkable computation, or a named primary witness.

Three observable proxies:

- **Stable rejection rates** for missing sourcing
- **Stable correction or retraction rates** after publication
- **Stable client-reported accuracy scores** over time

If those metrics stay at pre-AI baselines while volume climbs, the verification-gap story is wrong. Existing review gates are absorbing the shock. The framework should be revised, not defended.
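A minimal sketch of how that measurement can run, assuming you can independently sample released documents and code each core claim by hand. The sample data, field names, and sample sizes below are illustrative; the framework does not prescribe a schema.

```python
def unanchored_rate(claims: list[dict]) -> float:
    """Share of sampled core claims with no traceable source, checkable computation, or named primary witness."""
    if not claims:
        return 0.0
    unanchored = sum(
        1 for c in claims
        if not (c.get("source") or c.get("computation") or c.get("witness"))
    )
    return unanchored / len(claims)

# Illustrative samples only: in practice these come from independently
# sampled released documents, coded claim by claim under fixed rules.
pre_ai_sample = [
    {"text": "Churn rose 4% quarter over quarter", "computation": "cohort query, Q1 warehouse snapshot"},
    {"text": "Vendor X leads the segment on uptime", "source": "status-page archive, 12-month window"},
    {"text": "Segment B is price-insensitive"},  # unanchored
]
current_sample = [
    {"text": "The market doubles by 2028"},                   # unanchored
    {"text": "Churn rose 4% quarter over quarter", "source": "cohort query, Q1 warehouse snapshot"},
    {"text": "Adoption is accelerating across all regions"},  # unanchored
]

baseline = unanchored_rate(pre_ai_sample)   # ~0.33
current = unanchored_rate(current_sample)   # ~0.67

# The framework predicts `current` drifts above `baseline` as AI-assisted
# volume climbs. If it stays at the baseline, the verification-gap story
# is wrong for this pipeline.
```

The comparison only means something if the coding rules for "anchored" stay fixed across both samples and the sampling is independent of the teams being measured.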
## Regulatory signals (in force, monitor for enforcement)

These signals are already live as of May 2026. Track them for implementation severity, waiver applications, and the first material enforcement actions. The pattern: when these regimes show teeth, the buyer-side correction this framework predicts accelerates.

| Signal | Date in force | What to watch for | What it tells you |
|---|---|---|---|
| **[SR 26-2](https://www.federalreserve.gov/supervisionreg/srletters/SR2602.pdf)**, Revised Guidance on Model Risk Management | April 17, 2026 | First material enforcement action against a major bank for inadequate AI/model validation | The procurement-of-proof dynamic is real and propagating from banking outward |
| **GENIUS Act** implementation | July 18, 2025 onward | First OCC enforcement against a stablecoin issuer for inadequate attestation; whether BDO attestations for federally supervised stablecoins are substantive or perfunctory | The federal coalition is enforcing the perimeter, not rubber-stamping it |
| **SOC 2 2026 criteria** | 2026 | Enterprise procurement teams pushing the 2026 SOC 2 criteria into AI-vendor RFPs | The crossover from security procurement to AI-verification procurement is happening |
| **[EU AI Act](https://eur-lex.europa.eu/eli/reg/2024/1689/oj)** high-risk provisions | Staggered, 2026-2027 | First regulator-level fine against an AI verification provider for inadequate transparency or oversight | European enforcement typically leads U.S. enforcement by 12-18 months in adjacent domains |

## Indicators worth instrumenting now

If you want to run the framework's tests on your own pipeline, these are the measures to collect. They distinguish a real verification deficit from a phantom one. A minimal instrumentation sketch follows the list.

- Documents per reviewer per period
- Average review lag from draft to signoff
- Output volume change since AI-assisted drafting was adopted
- Structural-defect rate per document
- Share of documents with at least one coded defect
- Factual-error rate (separate from structural)
- Attribution time from defect flag to evidence path
- Correction latency from detection to documented fix
- Post-publication reversal rate
- What gets rewarded in performance reviews (speed, satisfaction, accuracy)
- Whether analysts are scored on forecast or claim accuracy
- Whether postmortems are run on released analysis
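The sketch below assumes your review tooling can export one record per released document; the field names and the subset of indicators computed here are illustrative, not a required schema.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class ReviewRecord:
    # One row per released document; field names are illustrative.
    reviewer: str
    drafted: date
    signed_off: date
    structural_defects: int        # coded defects: unanchored claims, missing assumptions, no falsifier
    factual_errors: int
    corrected_after_release: bool

def pipeline_indicators(records: list[ReviewRecord]) -> dict[str, float]:
    """A few of the watchlist indicators, computed from a plain review log."""
    n = len(records)
    if n == 0:
        return {}
    return {
        "docs_per_reviewer": n / len({r.reviewer for r in records}),
        "avg_review_lag_days": sum((r.signed_off - r.drafted).days for r in records) / n,
        "structural_defect_rate": sum(r.structural_defects for r in records) / n,
        "share_with_defect": sum(1 for r in records if r.structural_defects > 0) / n,
        "factual_error_rate": sum(r.factual_errors for r in records) / n,
        "post_release_correction_rate": sum(1 for r in records if r.corrected_after_release) / n,
    }
```

Trend these per quarter; the framework's prediction is about direction over time, not any single snapshot.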
## How to use this page

If your organization labels output decision-grade, the structural-validity release gate should be in production by July 1, 2026. Treat the date as a planning deadline tied to adoption lock-in, not as a hard event.

Release gate activation (60 days) and procurement artifact mandates (90 days) are the only enforcement points with enough friction to alter the trajectory in the short term.

Each of the five scenarios has a named observable. As one or more resolve, the market trajectory becomes clearer. Multiple scenarios can be true simultaneously; the market can fragment.

If you can measure unanchored-claim rates over time inside your own pipeline, you have a direct test of the framework's core claim. The framework predicts the rate rises. If yours does not, the framework is wrong for your context.

## An epistemic note

This framework, including this Watchlist, is directional rather than precise. The core mechanism (drafting costs collapse faster than verification capacity scales) is well supported by available evidence. The timing, the actor sequencing, and the relative probability of each scenario are less well grounded. Treat the scenarios as a way to stress-test your process, not as a timer with an alarm you can set.

The framework's own grounding rate sits below where you would want it for a definitive forecast. The point of publishing it openly is that it can be contested, refined, and corrected; the doctrine improves under that pressure. Substantive disagreements through the [repository](https://github.com/DavidVALIS/decision-grade) are welcome.

## Where this goes next

- **The diagnosis:** why current AI controls miss the real problem.
- **The posture:** Zero Trust applied to AI verification.
- **The action:** seven questions to put to AI vendors.