---
name: ml-antipattern-validator
description: Prevents 30+ critical AI/ML mistakes including data leakage, evaluation errors, training pitfalls, and deployment issues. Use when working with ML training, testing, model evaluation, or deployment.
---

# ML Antipattern Validator

## Overview

AI/ML 개발에서 30+ 안티패턴을 감지하고 방지하는 스킬입니다.

**Key Principle**: Honest evaluation > Impressive metrics.

## When to Activate

**Automatic Triggers**:
- ML training code (`train*.py`, model training)
- Dataset preparation or splitting
- Model evaluation or testing
- Production deployment planning

**Manual Triggers**:
- `@validate-ml` - Full validation
- `@check-leakage` - Data leakage detection
- `@verify-eval` - Evaluation methodology

---

## Pre-Implementation Checklist

```python
✅ Requirements:
□ Problem clearly defined with success metrics
□ Train/test split strategy defined
□ Evaluation methodology matches business objective

✅ Data Integrity:
□ No temporal leakage (future → past)
□ No target leakage (answer in features)
□ No preprocessing leakage (fit on all data)
□ No group leakage (related samples split)

✅ Evaluation Setup:
□ Test set completely held out
□ Metrics aligned with business objective
□ Baseline models defined
```

---

## Critical Antipatterns

### Category 1: Data Leakage 🚨

#### 1.1 Target Leakage
```python
❌ WRONG: Using "refund_issued" to predict "purchase_fraud"
✅ CORRECT: Only use features available at purchase time
```

#### 1.2 Temporal Leakage
```python
❌ WRONG: train = df[df['date'] > '2024-06-01']  # Future data
✅ CORRECT: train = df[df['date'] < '2024-06-01']  # Past for training
```

#### 1.3 Preprocessing Leakage
```python
❌ WRONG: X_scaled = scaler.fit_transform(X); train_test_split(X_scaled)
✅ CORRECT: Split first, then scaler.fit(X_train)
```

#### 1.4 Group Leakage
```python
❌ WRONG: train_test_split(df)  # Same user in both sets
✅ CORRECT: GroupShuffleSplit(groups=df['user_id'])
```

#### 1.5 Data Augmentation Leakage
```python
❌ WRONG: augment(X) → train_test_split()
✅ CORRECT: train_test_split() → augment(X_train)
```

---

### Category 2: Evaluation Mistakes ⚠️

#### 2.1 Testing on Training Data
```python
❌ WRONG: evaluate(model, training_data)
✅ CORRECT: evaluate(model, unseen_test_data)
```

#### 2.2 Metric Misalignment
```python
Business Objective → Appropriate Metric:
- Ranking → NDCG, MRR, MAP
- Imbalanced → F1, Precision@K, AUC-PR
- Balanced → Accuracy, AUC-ROC
```

#### 2.3 Accuracy Paradox
```python
❌ WRONG: 99% accuracy on 99:1 imbalanced data
✅ CORRECT: Check per-class metrics with classification_report()
```

#### 2.4 Invalid Time Series CV
```python
❌ WRONG: cross_val_score(model, X, y, cv=5)  # Shuffles time!
✅ CORRECT: TimeSeriesSplit(n_splits=5)
```

#### 2.5 Hyperparameter Tuning on Test Set
```python
❌ WRONG: grid_search(model, X_test, y_test)
✅ CORRECT: train/validation/test three-way split
```

---

### Category 3: Training Pitfalls 🔧

#### 3.1 Batch Norm Inference Error
```python
❌ WRONG: predictions = model(X_test)  # Still in train mode
✅ CORRECT: model.eval(); with torch.no_grad(): predictions = model(X_test)
```

#### 3.2 Early Stopping Overfitting
```python
❌ WRONG: EarlyStopping(patience=50)
✅ CORRECT: EarlyStopping(patience=5, min_delta=0.001, restore_best_weights=True)
```

#### 3.3 Learning Rate Warmup
```python
✅ CORRECT: get_linear_schedule_with_warmup(num_warmup_steps=1000)
```

#### 3.4 Class Imbalance
```python
❌ WRONG: CrossEntropyLoss()  # Biased toward majority
✅ CORRECT: CrossEntropyLoss(weight=class_weights)
```

---

## Detection Patterns

### Leakage Detection
```python
# Check feature-target correlation
correlation = df[features].corrwith(df['target'])
if (correlation.abs() > 0.95).any():
    raise DataLeakageError("Suspiciously high correlation")

# Check temporal ordering
if train['date'].min() > test['date'].max():
    raise TemporalLeakageError("Training on future, testing on past")

# Check group overlap
if train_groups & test_groups:
    raise GroupLeakageError("Overlapping groups")
```

### Mode Check
```python
if model.training:
    raise InferenceModeError("Model in training mode during evaluation")
```

---

## Validation Checklist

Before deployment:

- [ ] No data leakage detected
- [ ] Test set never seen during training
- [ ] Metrics aligned with business objective
- [ ] model.eval() called for inference
- [ ] Class imbalance handled
- [ ] Covariate shift monitoring planned

---

## References

상세 예시 및 시나리오는 [references/REFERENCE.md](references/REFERENCE.md) 참조.