---
name: media-toolkit-production
description: Voice generation and audio mixing using ElevenLabs API v3 with intelligent timing optimization. Trigger when the user mentions voice generation, TTS, ElevenLabs, audio production, or the script format CHARACTER (emotion) dialogue.
---

# Media Toolkit Production

## Instructions

### Critical Requirements (ALL MUST BE ENFORCED)

1. **Accent Persistence** - Every ElevenLabs API call MUST include an accent descriptor in the text:
   ```
   text: "[Irish Midlands accent] Stay close, Jess."
   ```

2. **Clip Reuse from History** - Before generating any clip:
   - Query the ElevenLabs History API
   - If an item with the same text and voice exists, reuse it instead of generating a new clip
   - Log: "Reusing from history: [id]"

3. **Re-Timing Workflow** - Timing is adjusted locally, with NO regeneration:
   - Shift the start times of downstream clips based on the actual durations of the generated clips
   - Timing fixes make zero API calls

4. **Zero Voice Overlaps** (PRIORITY 1) - Voices NEVER overlap unless the script explicitly marks the overlap as intentional

5. **Line Refinement** - User approval loop:
   - User: "Line 2 more panicked"
   - Adjust the voice settings and inline direction, e.g. stability=0.3, style=0.8, text="[panicked, breathless]..."
   - Archive both the original and the reworked versions
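The re-timing and zero-overlap requirements can be sketched as a purely local pass over the timeline. This is a minimal illustration, not the toolkit's actual implementation: the `Clip` type, the `allow_overlap` flag, and the `gap` parameter are hypothetical names introduced here.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Clip:
    line_id: int
    start: float      # planned start time, in seconds
    duration: float   # actual duration of the generated clip, in seconds
    allow_overlap: bool = False  # hypothetical marker for intentional overlaps


def retime(clips, gap=0.25):
    """Push clips later where needed so that voices never overlap.

    Works only on local metadata, so it makes no API calls.
    """
    result = []
    cursor = 0.0  # earliest time the next non-overlapping clip may start
    for clip in sorted(clips, key=lambda c: c.start):
        # Intentionally overlapping clips keep their planned start time.
        start = clip.start if clip.allow_overlap else max(clip.start, cursor)
        result.append(replace(clip, start=start))
        cursor = max(cursor, start + clip.duration + gap)
    return result
```

For example, if line 1 was planned for 2.0 s but the generated clip runs 2.5 s, line 2 is pushed back to start after line 1 ends plus the gap, while later lines that already have enough room keep their planned start.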

### Script Format
```
CHARACTER (emotion): dialogue text
[SFX: description, duration: 5s, volume: 0.8]
```
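One way to parse this format is with two regular expressions, one per line type. The sketch below assumes each dialogue or SFX entry sits on its own line and that SFX fields appear in the order shown above; the `Dialogue` and `Sfx` types and the `parse_script` function are hypothetical names.

```python
import re
from dataclasses import dataclass

DIALOGUE_RE = re.compile(
    r"^(?P<character>[^()\[\]:]+?)\s*\((?P<emotion>[^)]*)\)\s*:\s*(?P<text>.+)$"
)
SFX_RE = re.compile(
    r"^\[SFX:\s*(?P<description>[^,\]]+),\s*duration:\s*(?P<duration>[\d.]+)s,"
    r"\s*volume:\s*(?P<volume>[\d.]+)\]$"
)


@dataclass
class Dialogue:
    character: str
    emotion: str
    text: str


@dataclass
class Sfx:
    description: str
    duration: float  # seconds
    volume: float


def parse_script(script):
    """Return the script's dialogue and SFX events in source order."""
    events = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if m := SFX_RE.match(line):
            events.append(
                Sfx(m["description"].strip(), float(m["duration"]), float(m["volume"]))
            )
        elif m := DIALOGUE_RE.match(line):
            events.append(
                Dialogue(m["character"].strip(), m["emotion"].strip(), m["text"].strip())
            )
        else:
            # Fail loudly rather than silently dropping a line.
            raise ValueError(f"Unrecognized script line: {line!r}")
    return events
```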

### Production Workflow
1. Load the character database from Notion
2. Parse script (dialogue + SFX)
3. Generate timeline
4. Detect overlaps
5. Generate audio (v3, reusing clips from history where possible)
6. Optimize timing locally from actual clip durations
7. Mix audio (FFmpeg)
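As an illustration of the final mixing step, the sketch below builds an FFmpeg command that delays each clip to its start time with `adelay`, applies a per-clip `volume`, and combines everything with `amix`. The `build_mix_command` function and its tuple layout are hypothetical, and the `normalize=0` option assumes FFmpeg 4.4 or newer.

```python
def build_mix_command(clips, output_path):
    """Build an ffmpeg argument list.

    clips: list of (path, start_seconds, volume) tuples.
    """
    cmd = ["ffmpeg", "-y"]
    filters = []
    labels = []
    for i, (path, start, volume) in enumerate(clips):
        cmd += ["-i", path]
        delay_ms = int(round(start * 1000))
        # adelay takes one delay per channel; two values cover stereo input.
        filters.append(f"[{i}:a]adelay={delay_ms}|{delay_ms},volume={volume}[a{i}]")
        labels.append(f"[a{i}]")
    # normalize=0 stops amix from scaling the inputs down (assumes FFmpeg 4.4+).
    filters.append(f"{''.join(labels)}amix=inputs={len(clips)}:normalize=0[out]")
    cmd += ["-filter_complex", ";".join(filters), "-map", "[out]", output_path]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes the timeline-to-filter mapping easy to inspect before any audio is rendered.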

## DO NOT
- Generate audio without checking history first
- Omit accent descriptors from any API call
- Allow voice overlaps without explicit marking
- Regenerate clips for timing adjustments

## DO
- Always include the accent descriptor in the text prompt
- Follow the order: check history, archive, then generate only if needed
- Preserve both the original and the reworked versions
- Track and report token usage efficiency
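The history-first rule and the accent rule can be combined in one small helper. The sketch below assumes the history items have already been fetched into a local list and that generation is passed in as a callable; `HistoryItem`, `with_accent`, and `get_or_generate` are hypothetical names, and no real ElevenLabs calls are shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoryItem:
    item_id: str
    voice_id: str
    text: str


def with_accent(text, accent):
    """Prefix the accent descriptor unless the text already carries it."""
    tag = f"[{accent}]"
    return text if text.startswith(tag) else f"{tag} {text}"


def get_or_generate(text, voice_id, accent, history, generate):
    """Return (clip_id, was_generated), reusing a history match when one exists.

    history: already-fetched list of HistoryItem (fetching is not shown here).
    generate: callable(prompt, voice_id) -> clip_id, called only on a miss.
    """
    prompt = with_accent(text, accent)
    for item in history:
        if item.voice_id == voice_id and item.text == prompt:
            print(f"Reusing from history: {item.item_id}")
            return item.item_id, False
    return generate(prompt, voice_id), True
```

Counting how often `was_generated` is `False` gives a simple reuse figure to include in the efficiency report.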