---
name: trust-but-verify
description: Verify system claims and test results through independent validation rather than trusting assumptions
license: MIT
compatibility: opencode
metadata:
  audience: developers
  category: testing
---

# Trust But Verify

Apply skeptical verification to all system claims, test results, and assumptions through independent validation rather than blind trust, preventing false confidence and accelerating issue detection.

## When to use me

Use this skill when:
- Tests pass but you suspect something might still be wrong
- Documentation claims features work but you want to verify
- System memory/brain/progress tracking says something is built
- Stakeholders assume functionality exists based on reports
- You need to validate assumptions before critical decisions
- Building resilience against false positives and blind spots
- Preparing for production releases or high-risk changes
- Onboarding to a system with uncertain quality signals

## What I do

### 1. Claim Identification
- **Extract claims** from:
  - Test results and coverage reports
  - System documentation and specifications
  - Progress tracking and memory systems
  - Stakeholder expectations and assumptions
  - Deployment logs and monitoring dashboards
  - Team communications and status updates

- **Categorize claims** by:
  - Criticality (mission-critical vs nice-to-have)
  - Verifiability (easily testable vs ambiguous)
  - Source credibility (trusted source vs unknown)
  - Time since last verification (fresh vs stale)

### 2. Verification Strategy Design
- **Select appropriate verification methods**:
  - Independent test execution (different from original tests)
  - System probing and health checks
  - User scenario simulation
  - Data validation and integrity checks
  - Performance benchmarking
  - Security penetration testing
  - Documentation vs implementation comparison

- **Coordinate with other test types**:
  - Use unit tests but run them differently
  - Run integration tests with different data
  - Execute E2E tests with edge cases
  - Perform chaos testing to verify resilience claims
  - Conduct usability testing to verify user experience claims

### 3. Skeptical Verification Execution
- **Challenge assumptions deliberately**:
  - What if the test is testing the wrong thing?
  - What if the test passes for the wrong reason?
  - What if the feature works but not as users expect?
  - What if the system works now but won't under load?
  - What if documentation diverges from implementation?

- **Execute verification with different contexts**:
  - Different environments (not just test environment)
  - Different data sets (not just test data)
  - Different user personas (not just happy path)
  - Different time periods (not just immediate)
  - Different failure conditions (not just success paths)

### 4. Discrepancy Detection & Reporting
- **Compare claims vs verification results**:
  - Identify false positives (claims true but verification fails)
  - Identify false negatives (claims false but verification passes)
  - Measure divergence magnitude (minor vs critical differences)
  - Track verification confidence levels

- **Generate actionable insights**:
  - Specific discrepancies found
  - Root cause hypotheses
  - Impact assessment
  - Priority recommendations
  - Verification method effectiveness

## Verification Strategies by Claim Type

### For "Tests Pass" Claims:
- **Verify test quality**: Are tests actually testing the right thing?
- **Check test coverage**: Do tests cover critical paths and edge cases?
- **Review test data**: Is test data realistic and comprehensive?
- **Execute alternative tests**: Run similar but different verification tests
- **Check test environment**: Does test environment match production?

### For "Feature Built" Claims:
- **Verify functionality**: Does feature actually work as described?
- **Check user experience**: Is feature usable and intuitive?
- **Validate integration**: Does feature work with other components?
- **Test edge cases**: How does feature handle unusual situations?
- **Verify documentation**: Does documentation match implementation?

### For "System Operational" Claims:
- **Health checks**: Is system actually running and responsive?
- **Load testing**: Does system perform under expected load?
- **Failure testing**: How does system handle failures?
- **Monitoring verification**: Are monitoring systems actually catching issues?
- **Backup validation**: Are backups actually restorable?

### For "Memory/Progress" Claims:
- **Verify completion**: Is claimed work actually complete?
- **Check quality**: Is completed work production-ready?
- **Validate dependencies**: Do dependencies actually exist and work?
- **Review implementation**: Does implementation match design?
- **Test deliverables**: Do deliverables actually solve the problem?

## Examples

```bash
# Verify test results claims
npm run verify:test-results -- --test-suite "user-authentication"
npm run verify:test-coverage -- --module "payment-processing"

# Verify feature claims  
npm run verify:feature -- --feature "checkout-flow" --claim "handles 1000 concurrent users"
npm run verify:feature -- --feature "report-generation" --claim "exports to PDF format"

# Verify system operational claims
npm run verify:system-health -- --component "database" --claim "redundant and fault-tolerant"
npm run verify:system-performance -- --endpoint "/api/orders" --claim "response < 200ms"

# Verify progress/memory claims
npm run verify:progress -- --task "implement-payment-webhook" --claim "completed and tested"
npm run verify:documentation -- --section "api-reference" --claim "accurately describes endpoints"

# Comprehensive verification
npm run verify:all-claims           # Verify all identified claims
npm run verify:critical-claims      # Verify only critical claims
npm run verify:stale-claims         # Verify claims not recently checked

# Integration with other testing
npm run verify:with -- --test-type chaos --claim "system-resilient"
npm run verify:with -- --test-type security --claim "no-vulnerabilities"
npm run verify:with -- --test-type usability --claim "user-friendly"
```

## Output format

```
Trust But Verify Report
──────────────────────────────
Verification Context: Pre-production release validation
Total Claims Identified: 47
Claims Verified: 23 (priority order)
Verification Duration: 2 hours 15 minutes

Critical Claim Verification Results:

1. Claim: "Payment processing tests pass with 100% coverage"
   Source: CI/CD pipeline report
   Verification Strategy: Independent test execution + coverage analysis
   Result: ❌ DISCREPANCY FOUND
   - Tests pass but don't validate currency conversion rates
   - Coverage shows 100% but misses error handling paths
   - Test data uses only USD, missing other currencies
   Recommendation: Add currency conversion tests, expand test data

2. Claim: "System handles 5000 concurrent users"
   Source: Performance test report from 2 weeks ago
   Verification Strategy: Fresh load test with different patterns
   Result: ⚠️ PARTIALLY VERIFIED
   - System handles 5000 users but response time degrades by 300%
   - Database connection pool exhausted at 4500 users
   - CPU usage reaches 95% at target load
   Recommendation: Optimize database connections, add autoscaling

3. Claim: "User registration feature complete"
   Source: Project management system
   Verification Strategy: End-to-end testing + security review
   Result: ✅ VERIFIED
   - Registration flow works correctly
   - Email verification functional
   - Password security requirements enforced
   - No security vulnerabilities found

4. Claim: "Monitoring alerts configured for all critical errors"
   Source: DevOps runbook
   Verification Strategy: Error injection + alert monitoring
   Result: ❌ DISCREPANCY FOUND
   - Database connection errors not alerting
   - Payment gateway timeouts not monitored
   - Alert thresholds too high for business impact
   Recommendation: Review and update alert configuration

5. Claim: "Backup system tested and functional"
   Source: System documentation
   Verification Strategy: Actual backup restore test
   Result: ⚠️ PARTIALLY VERIFIED
   - Backup creation works
   - Restore process documented but untested
   - Restore time exceeds RTO (Recovery Time Objective)
   Recommendation: Test full restore, optimize restore process

Verification Confidence Assessment:
  - High Confidence: 8 claims (thoroughly verified)
  - Medium Confidence: 10 claims (partially verified)
  - Low Confidence: 5 claims (insufficient verification)
  - Failed Verification: 5 claims (discrepancies found)

Critical Issues Requiring Attention:
  1. Payment currency conversion untested (business risk: high)
  2. Database connection pool limits scalability (performance risk: high)
  3. Missing critical error alerts (operational risk: medium)
  4. Backup restore untested (recovery risk: medium)

Verification Effectiveness:
  - False positives prevented: 3 (would have caused production issues)
  - Assumptions challenged: 12 (revealed hidden risks)
  - Verification time vs value: High ROI (2 hours prevented days of issues)
  - Recommendations generated: 7 actionable improvements

Next Steps:
  1. Address critical discrepancies before release
  2. Improve test coverage for payment processing
  3. Optimize database connection management
  4. Update monitoring and alert configuration
  5. Schedule regular verification for high-risk claims
```

## Notes

- Trust but verify is a mindset, not just a technical process
- Balance verification effort with risk and criticality
- Document verification methods and results for audit trails
- Use verification findings to improve original testing and claims
- Consider verification as ongoing process, not one-time event
- Involve different perspectives in verification (fresh eyes see different things)
- Measure verification effectiveness over time
- Share verification findings transparently with stakeholders
- Use verification to build system understanding, not just find faults
- Adapt verification strategies based on what you learn
- Remember: absence of evidence is not evidence of absence