---
name: mechinterp-validation-suite
description: Run credibility checks on feature interpretations including split-half stability and shuffle null tests
---

# MechInterp Validation Suite

Run comprehensive credibility checks on feature interpretations to ensure findings are robust and not artifacts.

## Purpose

The validation suite skill:

- Tests stability of analysis results across data splits
- Creates null distributions to assess significance
- Generates validation reports with pass/fail criteria
- Helps identify unreliable interpretations early

## When to Use

Use this skill when:

- A hypothesis reaches high confidence (>0.7) and needs validation
- You want to verify that a pattern is real, not noise
- Before finalizing feature labels or interpretations
- As part of a standard research checkpoint

## Validation Tests

### 1. Split-Half Stability

Tests if analysis results are consistent across random splits of the data.

**What it measures**: Correlation of token frequency rankings between two halves

**Pass criterion**: Mean correlation > 0.7

```python
from splatnlp.mechinterp.schemas import ExperimentSpec, ExperimentType
from splatnlp.mechinterp.experiments import get_runner_for_type
from splatnlp.mechinterp.skill_helpers import load_context

# Create split-half spec
spec = ExperimentSpec(
    type=ExperimentType.SPLIT_HALF,
    feature_id=18712,
    model_type="ultra",
    variables={
        "n_splits": 10,
        "metric": "token_frequency_correlation",
    },
)

# Run validation
ctx = load_context("ultra")
runner = get_runner_for_type(spec.type)
result = runner.run(spec, ctx)

# Check results
mean_corr = result.aggregates.custom["mean_correlation"]
passed = result.aggregates.custom["stability_passed"]
print(f"Split-half correlation: {mean_corr:.3f} ({'PASS' if passed else 'FAIL'})")
```
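For intuition about what the runner computes, here is a minimal standalone sketch of the split-half statistic. It is not the library implementation (how tokens are extracted and which correlation is used are assumptions); it only shows the shape of the test: split the top-activating tokens in half at random, count token frequencies in each half, and correlate the rankings.

```python
import numpy as np
from scipy.stats import spearmanr


def split_half_correlation(tokens: list[str], n_splits: int = 10, seed: int = 0) -> float:
    """Mean Spearman correlation of token-frequency rankings across random half-splits.

    `tokens` is assumed to be the tokens from a feature's top-activating examples.
    """
    rng = np.random.default_rng(seed)
    arr = np.asarray(tokens)
    vocab = np.unique(arr)
    corrs = []
    for _ in range(n_splits):
        # Randomly partition the examples into two halves
        idx = rng.permutation(len(arr))
        half_a = arr[idx[: len(idx) // 2]]
        half_b = arr[idx[len(idx) // 2:]]
        # Frequency of each vocabulary token within each half
        freq_a = np.array([(half_a == t).sum() for t in vocab])
        freq_b = np.array([(half_b == t).sum() for t in vocab])
        # Spearman = correlation of the frequency *rankings*
        corrs.append(spearmanr(freq_a, freq_b)[0])
    return float(np.mean(corrs))
```

A stable feature should yield a mean correlation well above 0.7: both halves surface roughly the same tokens in roughly the same order.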
### 2. Shuffle Null Test

Creates a null distribution by shuffling activations to test if observed patterns are significant.

**What it measures**: Whether top-token concentration exceeds null expectation

**Pass criterion**: p-value < 0.05

```python
spec = ExperimentSpec(
    type=ExperimentType.SHUFFLE_NULL,
    feature_id=18712,
    model_type="ultra",
    variables={"n_shuffles": 100},
)

result = runner.run(spec, ctx)

p_value = result.aggregates.custom["p_value"]
significant = result.aggregates.custom["significant"]
print(f"Shuffle null p-value: {p_value:.4f} ({'SIGNIFICANT' if significant else 'NOT SIGNIFICANT'})")
```

## Running Full Validation Suite

```python
from datetime import datetime

from splatnlp.mechinterp.schemas import ExperimentSpec, ExperimentType
from splatnlp.mechinterp.experiments import get_runner_for_type
from splatnlp.mechinterp.skill_helpers import load_context


def run_validation_suite(feature_id: int, model_type: str = "ultra") -> dict:
    """Run all validation tests for a feature."""
    ctx = load_context(model_type)
    results = {}

    # Test 1: Split-half stability
    split_spec = ExperimentSpec(
        type=ExperimentType.SPLIT_HALF,
        feature_id=feature_id,
        model_type=model_type,
        variables={"n_splits": 10},
    )
    runner = get_runner_for_type(split_spec.type)
    split_result = runner.run(split_spec, ctx)
    results["split_half"] = {
        "mean_correlation": split_result.aggregates.custom.get("mean_correlation"),
        "passed": split_result.aggregates.custom.get("stability_passed", 0) == 1,
    }

    # Test 2: Shuffle null
    null_spec = ExperimentSpec(
        type=ExperimentType.SHUFFLE_NULL,
        feature_id=feature_id,
        model_type=model_type,
        variables={"n_shuffles": 100},
    )
    runner = get_runner_for_type(null_spec.type)
    null_result = runner.run(null_spec, ctx)
    results["shuffle_null"] = {
        "p_value": null_result.aggregates.custom.get("p_value"),
        "passed": null_result.aggregates.custom.get("significant", 0) == 1,
    }

    # Overall pass/fail
    all_passed = all(r["passed"] for r in results.values())

    return {
        "feature_id": feature_id,
        "model_type": model_type,
        "tests": results,
        "overall_passed": all_passed,
        "timestamp": datetime.now().isoformat(),
    }


# Run suite
validation = run_validation_suite(18712, "ultra")

print(f"\nValidation Suite for Feature {validation['feature_id']}:")
print(f"  Split-half: {validation['tests']['split_half']['mean_correlation']:.3f} "
      f"({'PASS' if validation['tests']['split_half']['passed'] else 'FAIL'})")
print(f"  Shuffle null: p={validation['tests']['shuffle_null']['p_value']:.4f} "
      f"({'PASS' if validation['tests']['shuffle_null']['passed'] else 'FAIL'})")
print(f"\nOVERALL: {'PASS' if validation['overall_passed'] else 'FAIL'}")
```

## Interpretation Guide

### Split-Half Results

| Correlation | Interpretation |
|-------------|----------------|
| > 0.8 | Excellent stability - results are highly reproducible |
| 0.7 - 0.8 | Good stability - results are reliable |
| 0.5 - 0.7 | Moderate stability - some patterns may be noisy |
| < 0.5 | Poor stability - interpret with caution |

### Shuffle Null Results

| p-value | Interpretation |
|---------|----------------|
| < 0.01 | Highly significant - pattern very unlikely by chance |
| 0.01 - 0.05 | Significant - pattern unlikely by chance |
| 0.05 - 0.10 | Marginally significant - borderline |
| > 0.10 | Not significant - pattern may be noise |
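If you want these qualitative labels programmatically, a small helper can apply the thresholds above to the dict returned by `run_validation_suite`. This is a hypothetical convenience function, not part of the library, and it assumes both metrics are present:

```python
def interpret_validation(validation: dict) -> dict[str, str]:
    """Map suite metrics to the qualitative labels from the tables above."""
    corr = validation["tests"]["split_half"]["mean_correlation"]
    p = validation["tests"]["shuffle_null"]["p_value"]

    if corr > 0.8:
        stability = "excellent"
    elif corr > 0.7:
        stability = "good"
    elif corr > 0.5:
        stability = "moderate"
    else:
        stability = "poor"

    if p < 0.01:
        significance = "highly significant"
    elif p < 0.05:
        significance = "significant"
    elif p < 0.10:
        significance = "marginal"
    else:
        significance = "not significant"

    return {"stability": stability, "significance": significance}
```

For example, `interpret_validation(run_validation_suite(18712))` might return `{"stability": "good", "significance": "highly significant"}`.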
## Workflow Integration

1. **Conduct research**: Build hypotheses, gather evidence
2. **Reach confidence threshold**: When hypothesis confidence > 0.7
3. **Run validation suite**: Execute this skill
4. **Update state**: Mark hypothesis as validated (or not)
5. **Document**: Add validation results to evidence

```python
from splatnlp.mechinterp.state import ResearchStateManager
# EvidenceStrength is assumed to live alongside HypothesisStatus
from splatnlp.mechinterp.schemas.research_state import (
    EvidenceStrength,
    HypothesisStatus,
)

# After validation passes
manager = ResearchStateManager(18712, "ultra")

manager.update_hypothesis(
    "h001",
    status=HypothesisStatus.SUPPORTED,
    confidence_absolute=0.9,
)

manager.add_evidence(
    experiment_id="validation_suite",
    result_path="/mnt/e/mechinterp_runs/results/validation.json",
    summary="Passed split-half (r=0.85) and shuffle null (p<0.01)",
    strength=EvidenceStrength.STRONG,
    supports=["h001"],
)
```

## CLI Usage

```bash
cd /root/dev/SplatNLP

# Run split-half validation
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path specs/split_half_spec.json

# Run shuffle null validation
poetry run python -m splatnlp.mechinterp.cli.runner_cli \
    --spec-path specs/shuffle_null_spec.json
```

## See Also

- **mechinterp-state**: Update hypotheses after validation
- **mechinterp-summarizer**: Document validation results
- **mechinterp-runner**: Execute validation experiments
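## Appendix: Spec File Sketch

The CLI reads JSON spec files such as `specs/split_half_spec.json`. The canonical on-disk schema is not documented here; assuming it mirrors the `ExperimentSpec` fields used above, a spec file could be generated like this (a sketch, not the canonical format; `ExperimentSpec` may provide its own serializer):

```python
import json
from pathlib import Path

# Field names mirror the ExperimentSpec constructor arguments used above;
# the string value of "type" is an assumed rendering of ExperimentType.SPLIT_HALF.
spec_dict = {
    "type": "split_half",
    "feature_id": 18712,
    "model_type": "ultra",
    "variables": {"n_splits": 10, "metric": "token_frequency_correlation"},
}

Path("specs").mkdir(exist_ok=True)
with open("specs/split_half_spec.json", "w") as f:
    json.dump(spec_dict, f, indent=2)
```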