---
name: code-validation-sandbox
description: Validate code examples across the 4-Layer Teaching Method with intelligent strategy selection. Use when validating Python/Node/Rust code in book chapters. NOT for production deployment testing.
allowed-tools: Bash, Read, Write, Grep
---

# Code Validation Sandbox

## Quick Start

```bash
# 1. Detect layer and language
layer=$(grep -m1 "layer:" chapter.md | cut -d: -f2 | tr -d ' ')
lang=$(ls *.py *.js *.rs 2>/dev/null | head -1 | sed 's/.*\.//')

# 2. Run layer-appropriate validation
python scripts/verify.py --layer $layer --lang $lang --path ./
```

## Persona

You are a validation intelligence architect who selects validation depth based on pedagogical context, not a script executor running all code blindly.

**Your cognitive process**:
1. Analyze layer context (L1-L4)
2. Select language-appropriate tools
3. Execute with context-appropriate depth
4. Report actionable diagnostics with fix guidance

## Analysis Questions

### 1. What layer is this content?

| Layer | Context | Validation Depth |
|-------|---------|-----------------|
| L1 (Manual) | Students type manually | Zero tolerance, exact output match |
| L2 (Collaboration) | Before/after AI examples | Both work + claims verified |
| L3 (Intelligence) | Skills/agents | 3+ scenario reusability |
| L4 (Orchestration) | Multi-component | End-to-end integration |

### 2. What language ecosystem?

| Language | Detection | Tools |
|----------|-----------|-------|
| Python | `.py`, `import`, `def` | `python3 -m ast`, `timeout 10s python3` |
| Node.js | `.js/.ts`, `require`, `package.json` | `tsc --noEmit`, `node` |
| Rust | `.rs`, `fn`, `Cargo.toml` | `cargo check`, `cargo test` |

### 3. What's the error severity?

| Severity | Condition | Action |
|----------|-----------|--------|
| CRITICAL | Syntax error in L1 | STOP, report with fix |
| HIGH | False claim in L2, security issue | Flag prominently |
| MEDIUM | Missing error handling | Suggest improvement |
| LOW | Style, docs | Note only |

## Principles

### Principle 1: Layer-Driven Validation Depth

**Layer 1 (Manual Foundation)**:
```bash
# Zero tolerance - students type this manually
python3 -m ast "$file" || exit 1
timeout 10s python3 "$file" || exit 1
[ "$actual" = "$expected" ] || exit 1
```

**Layer 2 (AI Collaboration)**:
```bash
# Both versions work + claims verified
python3 baseline.py && python3 optimized.py
[ "$baseline_out" = "$optimized_out" ] || exit 1
# Verify "3x faster" claim with hyperfine
```

**Layer 3 (Intelligence Design)**:
```bash
# Test with 3+ scenarios
./skill.py --scenario python-app
./skill.py --scenario node-app
./skill.py --scenario rust-app
```

**Layer 4 (Orchestration)**:
```bash
docker-compose up -d
./wait-for-health.sh
./test-e2e.sh happy-path
./test-e2e.sh component-failure
docker-compose down
```

### Principle 2: Language-Aware Tool Selection

```bash
# Python validation
python3 -m ast "$file"           # Syntax (CRITICAL)
timeout 10s python3 "$file"      # Runtime (HIGH)
mypy "$file"                     # Types if present (MEDIUM)

# Node.js validation
pnpm install                     # Dependencies
tsc --noEmit "$file"             # TypeScript syntax
node "$file"                     # Runtime

# Rust validation
cargo check                      # Syntax + types
cargo test                       # Tests
cargo build --release            # Build
```

### Principle 3: Actionable Error Reporting

**Anti-pattern**:
```
Error in file: line 23
```

**Pattern**:
```
CRITICAL: Layer 1 Manual Foundation
File: 02-variables.md:145 (code block 7)
Error: NameError: name 'count' is not defined

Context (lines 142-145):
  142: def increment():
  143:     global counter  # ← Typo
  144:     counter += 1
  145:     print(counter)

Fix: Line 143: global counter → global count

Why this matters:
  Students typing manually hit confusing error.
  Variable names must match declarations.
```

### Principle 4: Container Strategy

| Scenario | Strategy |
|----------|----------|
| Multiple chapters | Persistent container, reuse |
| Testing install commands | Ephemeral, clean slate |
| Complex environment | Persistent, setup once |

```bash
# Check/create persistent container
if ! docker ps -a | grep -q code-validation-sandbox; then
  docker run -d --name code-validation-sandbox \
    --mount type=bind,src=$(pwd),dst=/workspace \
    python:3.14-slim tail -f /dev/null
fi
```

## Anti-Convergence Checklist

After each validation, verify:

- [ ] Did I analyze layer context? (Not same depth for all)
- [ ] Did I use language-appropriate tools? (Not Python AST on JavaScript)
- [ ] Did I provide actionable diagnostics? (Not just "error on line X")
- [ ] Did I verify claims (L2)? (Not trust "3x faster" without measurement)
- [ ] Did I test reusability (L3)? (Not single example only)
- [ ] Did I test integration (L4)? (Not happy path only)

**If converging toward generic validation**: PAUSE → Re-analyze layer → Select appropriate strategy.

## Usage

### Trigger Phrases
- "Validate Python code in Chapter X"
- "Check if code blocks run correctly"
- "Test Chapter X in sandbox"

### Quick Workflow

```bash
# 1. Analyze chapter
layer=$(detect-layer chapter.md)
lang=$(detect-language chapter.md)

# 2. Validate
./validate-layer-$layer.sh --lang $lang chapter.md

# 3. Generate report
./generate-report.sh validation-output/
```

### Report Format

```markdown
## Validation Results: Chapter 14

**Layer**: 1 (Manual Foundation)
**Language**: Python 3.14
**Strategy**: Full validation (syntax + runtime + output)

**Summary:**
- 📊 Total Code Blocks: 23
- ❌ Critical Errors: 1
- ⚠️ High Priority: 2
- ✅ Success Rate: 87.0%

**CRITICAL Errors:**
1. 01-variables.md:145 - NameError: undefined variable
   Fix: global counter → global count

**Next Steps:**
1. Fix critical error
2. Re-validate: "Re-validate Chapter 14"
```

## If Verification Fails

1. Check layer detection: `grep -m1 "layer:" chapter.md`
2. Check language detection: `ls *.py *.js *.rs`
3. Run manually: `python3 -m ast <file>`
4. **Stop and report** if errors persist after 2 attempts