---
name: audio-voice-recovery
description: Audio forensics and voice recovery guidelines for CSI-level audio analysis. This skill should be used when recovering voice from low-quality or low-volume audio, enhancing degraded recordings, performing forensic audio analysis, or transcribing difficult audio. Triggers on tasks involving audio enhancement, noise reduction, voice isolation, forensic authentication, or audio transcription.
---

# Forensic Audio Research Audio Voice Recovery Best Practices

Audio forensics and voice recovery guide for recovering speech from low-quality, low-volume, or damaged recordings. Contains 45 rules across 8 categories, prioritized by impact, to guide audio enhancement, forensic analysis, and transcription workflows.

## When to Apply

Reference these guidelines when:
- Recovering voice from noisy or low-quality recordings
- Enhancing audio for transcription or legal evidence
- Performing forensic audio authentication
- Analyzing recordings for tampering or splices
- Building automated audio processing pipelines
- Transcribing difficult or degraded speech

## Rule Categories by Priority

| Priority | Category | Impact | Prefix | Rules |
|----------|----------|--------|--------|-------|
| 1 | Signal Preservation & Analysis | CRITICAL | `signal-` | 5 |
| 2 | Noise Profiling & Estimation | CRITICAL | `noise-` | 5 |
| 3 | Spectral Processing | HIGH | `spectral-` | 6 |
| 4 | Voice Isolation & Enhancement | HIGH | `voice-` | 7 |
| 5 | Temporal Processing | MEDIUM-HIGH | `temporal-` | 5 |
| 6 | Transcription & Recognition | MEDIUM | `transcribe-` | 5 |
| 7 | Forensic Authentication | MEDIUM | `forensic-` | 5 |
| 8 | Tool Integration & Automation | LOW-MEDIUM | `tool-` | 7 |

## Quick Reference

### 1. Signal Preservation & Analysis (CRITICAL)

- [`signal-preserve-original`](references/signal-preserve-original.md) - Never modify original recording
- [`signal-lossless-format`](references/signal-lossless-format.md) - Use lossless formats for processing
- [`signal-sample-rate`](references/signal-sample-rate.md) - Preserve native sample rate
- [`signal-bit-depth`](references/signal-bit-depth.md) - Use maximum bit depth for processing
- [`signal-analyze-first`](references/signal-analyze-first.md) - Analyze before processing

### 2. Noise Profiling & Estimation (CRITICAL)

- [`noise-profile-silence`](references/noise-profile-silence.md) - Extract noise profile from silent segments
- [`noise-identify-type`](references/noise-identify-type.md) - Identify noise type before reduction
- [`noise-adaptive-estimation`](references/noise-adaptive-estimation.md) - Use adaptive estimation for non-stationary noise
- [`noise-snr-assessment`](references/noise-snr-assessment.md) - Measure SNR before and after
- [`noise-avoid-overprocessing`](references/noise-avoid-overprocessing.md) - Avoid over-processing and musical artifacts

### 3. Spectral Processing (HIGH)

- [`spectral-subtraction`](references/spectral-subtraction.md) - Apply spectral subtraction for stationary noise
- [`spectral-wiener-filter`](references/spectral-wiener-filter.md) - Use Wiener filter for statistically optimal noise suppression
- [`spectral-notch-filter`](references/spectral-notch-filter.md) - Apply notch filters for tonal interference
- [`spectral-band-limiting`](references/spectral-band-limiting.md) - Apply frequency band limiting for speech
- [`spectral-equalization`](references/spectral-equalization.md) - Use forensic equalization to restore intelligibility
- [`spectral-declip`](references/spectral-declip.md) - Repair clipped audio before other processing

### 4. Voice Isolation & Enhancement (HIGH)

- [`voice-rnnoise`](references/voice-rnnoise.md) - Use RNNoise for real-time ML denoising
- [`voice-dialogue-isolate`](references/voice-dialogue-isolate.md) - Use source separation for complex backgrounds
- [`voice-formant-preserve`](references/voice-formant-preserve.md) - Preserve formants during pitch manipulation
- [`voice-dereverb`](references/voice-dereverb.md) - Apply dereverberation for room echo
- [`voice-enhance-speech`](references/voice-enhance-speech.md) - Use AI speech enhancement services for quick results
- [`voice-vad-segment`](references/voice-vad-segment.md) - Use VAD for targeted processing
- [`voice-frequency-boost`](references/voice-frequency-boost.md) - Boost frequency regions for specific phonemes

### 5. Temporal Processing (MEDIUM-HIGH)

- [`temporal-dynamic-range`](references/temporal-dynamic-range.md) - Use dynamic range compression for level consistency
- [`temporal-noise-gate`](references/temporal-noise-gate.md) - Apply noise gate to silence non-speech segments
- [`temporal-time-stretch`](references/temporal-time-stretch.md) - Use time stretching for intelligibility
- [`temporal-transient-repair`](references/temporal-transient-repair.md) - Repair transient damage (clicks, pops, dropouts)
- [`temporal-silence-trim`](references/temporal-silence-trim.md) - Trim silence and normalize before export

### 6. Transcription & Recognition (MEDIUM)

- [`transcribe-whisper`](references/transcribe-whisper.md) - Use Whisper for noise-robust transcription
- [`transcribe-multipass`](references/transcribe-multipass.md) - Use multi-pass transcription for difficult audio
- [`transcribe-segment`](references/transcribe-segment.md) - Segment audio for targeted transcription
- [`transcribe-confidence`](references/transcribe-confidence.md) - Track confidence scores for uncertain words
- [`transcribe-hallucination`](references/transcribe-hallucination.md) - Detect and filter ASR hallucinations

### 7. Forensic Authentication (MEDIUM)

- [`forensic-enf-analysis`](references/forensic-enf-analysis.md) - Use ENF analysis for timestamp verification
- [`forensic-metadata`](references/forensic-metadata.md) - Extract and verify audio metadata
- [`forensic-tampering`](references/forensic-tampering.md) - Detect audio tampering and splices
- [`forensic-chain-custody`](references/forensic-chain-custody.md) - Document chain of custody for evidence
- [`forensic-speaker-id`](references/forensic-speaker-id.md) - Extract speaker characteristics for identification

### 8. Tool Integration & Automation (LOW-MEDIUM)

- [`tool-ffmpeg-essentials`](references/tool-ffmpeg-essentials.md) - Master essential FFmpeg audio commands
- [`tool-sox-commands`](references/tool-sox-commands.md) - Use SoX for advanced audio manipulation
- [`tool-python-pipeline`](references/tool-python-pipeline.md) - Build Python audio processing pipelines
- [`tool-audacity-workflow`](references/tool-audacity-workflow.md) - Use Audacity for visual analysis and manual editing
- [`tool-install-guide`](references/tool-install-guide.md) - Install audio forensic toolchain
- [`tool-batch-automation`](references/tool-batch-automation.md) - Automate batch processing workflows
- [`tool-quality-assessment`](references/tool-quality-assessment.md) - Measure audio quality metrics

## Essential Tools

| Tool | Purpose | Install |
|------|---------|---------|
| FFmpeg | Format conversion, filtering | `brew install ffmpeg` |
| SoX | Noise profiling, effects | `brew install sox` |
| Whisper | Speech transcription | `pip install openai-whisper` |
| librosa | Python audio analysis | `pip install librosa` |
| noisereduce | Spectral-gating noise reduction | `pip install noisereduce` |
| Audacity | Visual editing | `brew install --cask audacity` |

## Workflow Scripts (Recommended)

Use the bundled scripts to generate objective baselines, create a workflow plan, and verify results.

- `scripts/preflight_audio.py` - Generate a forensic preflight report (JSON or Markdown).
- `scripts/plan_from_preflight.py` - Create a workflow plan template from the preflight report.
- `scripts/compare_audio.py` - Compare objective metrics between baseline and processed audio.

Example usage:

```bash
# 1) Analyze and capture baseline metrics
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

# 2) Generate a workflow plan template
python3 skills/.experimental/audio-voice-recovery/scripts/plan_from_preflight.py --preflight preflight.json --out plan.md

# 3) Compare baseline vs processed metrics
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```

## Forensic Preflight Workflow (Do This Before Any Changes)

Align preflight with SWGDE Best Practices for the Enhancement of Digital Audio (20-a-001) and SWGDE Best Practices for Forensic Audio (08-a-001).
Establish an objective baseline and plan the workflow so that processing does not introduce clipping or artifacts and completion is never declared prematurely.
Use `scripts/preflight_audio.py` to capture baseline metrics and preserve the report with the case file.

Capture and record before processing:
- Record evidence identity and integrity: path, filename, file size, SHA-256 checksum, source, format/container, codec
- Record signal integrity: sample rate, bit depth, channels, duration
- Measure baseline loudness and levels: integrated loudness (LUFS/LKFS), true peak, sample peak, RMS, dynamic range, DC offset
- Detect clipping and document clipped-sample percentage, peak headroom, and exact time ranges
- Identify noise profile: stationary vs non-stationary, dominant noise bands, SNR estimate
- Locate the region of interest (ROI) and document time ranges and changes over time
- Inspect spectral content and estimate speech-band energy and intelligibility risk
- Scan for temporal defects: dropouts, discontinuities, splices, drift
- Evaluate channel correlation and phase anomalies (if stereo)
- Extract and preserve metadata: timestamps, device/model tags, embedded notes
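
The identity and signal-integrity fields above can be captured with the Python standard library alone. The following is a minimal sketch, not the implementation of `scripts/preflight_audio.py`: the function names and dictionary keys are hypothetical, and it assumes the evidence is an uncompressed PCM WAV file that the `wave` module can open.

```python
import hashlib
import os
import wave


def sha256_of_file(path, chunk_size=1 << 20):
    """Stream the file through SHA-256 so large recordings are not loaded into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_wav(path):
    """Collect identity and signal-integrity fields for a PCM WAV file.

    The key names are illustrative only; they are not taken from the
    bundled preflight report format.
    """
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "path": os.path.abspath(path),
            "filename": os.path.basename(path),
            "file_size_bytes": os.path.getsize(path),
            "sha256": sha256_of_file(path),
            "sample_rate_hz": rate,
            "bit_depth": wav.getsampwidth() * 8,
            "channels": wav.getnchannels(),
            "duration_s": frames / rate if rate else 0.0,
        }
```

Calling `describe_wav("evidence.wav")` returns a dictionary that can be serialized with `json.dumps` and stored with the case file. Compressed or non-PCM containers need a different reader (for example `ffprobe`), which this sketch does not cover.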

Procedure:
1. Prepare a forensic working copy, verify hashes, and preserve the original untouched.
2. Locate ROI and target signal; document exact time ranges and changes across the recording.
3. Assess challenges to intelligibility and signal quality; map challenges to mitigation strategies.
4. Identify required processing and plan a workflow order that avoids unwanted artifacts.
   Generate a plan draft with `scripts/plan_from_preflight.py` and complete it with case-specific decisions.
5. Measure baseline loudness and true peak per ITU-R BS.1770 / EBU R 128 and record peak/RMS/DC offset.
6. Detect clipping and dropouts; if clipping is present, declip first or pause and document limitations.
7. Inspect spectral content and noise type; collect representative noise profile segments and estimate SNR.
8. If stereo, evaluate channel correlation and phase; document anomalies.
9. Create a baseline listening log (multiple devices) and define success criteria for intelligibility and listenability.
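
Steps 5 to 7 call for peak, RMS, DC offset, clipping, and SNR figures. As a rough illustration of what those numbers mean, the sketch below computes them from 16-bit PCM samples using only the standard library. It is an assumption-laden approximation: it reports sample peak rather than BS.1770 true peak (which requires oversampling), does not compute LUFS, treats any sample at or near full scale as clipped, and estimates SNR from two hand-picked segments. All function names and the `clip_threshold` default are hypothetical.

```python
import array
import math
import wave


def read_pcm16(path):
    """Return interleaved 16-bit PCM samples scaled to the range [-1.0, 1.0)."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        raw = wav.readframes(wav.getnframes())
    samples = array.array("h")
    samples.frombytes(raw)
    return [s / 32768.0 for s in samples]


def to_db(value):
    """Convert a linear amplitude to decibels; silence maps to -inf."""
    return 20.0 * math.log10(value) if value > 0 else float("-inf")


def level_metrics(samples, clip_threshold=0.999):
    """Sample peak, RMS, DC offset, and a simple clipped-sample percentage."""
    if not samples:
        raise ValueError("no samples to analyze")
    n = len(samples)
    peak = max(abs(s) for s in samples)
    rms = math.sqrt(sum(s * s for s in samples) / n)
    # Heuristic: count samples at or near full scale as clipped.
    clipped = sum(1 for s in samples if abs(s) >= clip_threshold)
    return {
        "dc_offset": sum(samples) / n,
        "peak_dbfs": to_db(peak),
        "rms_dbfs": to_db(rms),
        "peak_headroom_db": -to_db(peak),
        "clipped_pct": 100.0 * clipped / n,
    }


def estimate_snr_db(speech_plus_noise, noise_only):
    """Crude SNR estimate from a speech segment and a noise-only segment.

    Assumes the noise is stationary, so its power in the noise-only
    segment matches its power underneath the speech.
    """
    if not speech_plus_noise or not noise_only:
        raise ValueError("both segments must contain samples")
    p_mixed = sum(s * s for s in speech_plus_noise) / len(speech_plus_noise)
    p_noise = sum(s * s for s in noise_only) / len(noise_only)
    p_signal = p_mixed - p_noise
    if p_signal <= 0 or p_noise <= 0:
        return float("-inf") if p_signal <= 0 else float("inf")
    return 10.0 * math.log10(p_signal / p_noise)
```

For case work, prefer the measurements from `scripts/preflight_audio.py` or a BS.1770-compliant meter; this sketch is only meant to make the listed quantities concrete.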

Failure-pattern guardrails:
- Do not process until every preflight field is captured.
- Document every process, setting, software version, and time segment to enable repeatability.
- Compare each processed output to the unprocessed input and assess progress toward intelligibility and listenability.
- Avoid over-processing; review removed signal (filter residue) to avoid removing target signal components.
- Keep intermediate files uncompressed and preserve sample rate/bit depth when moving between tools.
- Perform a final review against the original; if unsatisfactory, revise or stop and report limitations.
- If the request is not achievable, communicate limitations and do not declare completion.
- Require objective metrics and A/B listening before declaring completion.
- Do not rely solely on objective metrics; corroborate with critical listening.
- Take listening breaks to avoid ear fatigue during extended reviews.
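
One way to satisfy the documentation guardrail is an append-only processing log that records each step with its tool, version, settings, time segment, and the hashes of its input and output files. The sketch below shows one possible shape for such a log as JSON Lines. The function names, field names, and file layout are hypothetical and are not part of the bundled scripts.

```python
import datetime
import hashlib
import json


def file_sha256(path):
    """Hash a file so each log entry pins the exact input and output bytes."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def log_step(log_path, tool, version, settings, input_path, output_path, segment=None):
    """Append one processing step to a JSON Lines log and return the entry.

    `segment` is an optional [start_s, end_s] pair; None means the whole file.
    """
    entry = {
        "timestamp_utc": datetime.datetime.now(datetime.timezone.utc).isoformat(),
        "tool": tool,
        "version": version,
        "settings": settings,
        "segment_s": segment,
        "input": {"path": input_path, "sha256": file_sha256(input_path)},
        "output": {"path": output_path, "sha256": file_sha256(output_path)},
    }
    # Append-only: earlier entries are never rewritten.
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(entry, sort_keys=True) + "\n")
    return entry
```

A call such as `log_step("processing.jsonl", "ffmpeg", "<installed version>", {"af": "highpass=f=80"}, "working.wav", "enhanced.wav")` would be made after each processing stage, with the version string filled in from the tool actually used.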

## Quick Enhancement Pipeline

```bash
# 1. Analyze original (run preflight and capture baseline metrics)
python3 skills/.experimental/audio-voice-recovery/scripts/preflight_audio.py evidence.wav --out preflight.json

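# 2. Hash the original, then create a working copy
sha256sum evidence.wav > evidence.sha256
cp evidence.wav working.wav

# 3. Apply enhancement
# Note: loudnorm upsamples to 192 kHz internally; add "-ar <native rate>"
# before the output file to keep the original sample rate.
ffmpeg -i working.wav -af "\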
  highpass=f=80,\
  adeclick=w=55:o=75,\
  afftdn=nr=12:nf=-30:nt=w,\
  equalizer=f=2500:t=q:w=1:g=3,\
  loudnorm=I=-16:TP=-1.5:LRA=11\
" enhanced.wav

# 4. Transcribe
whisper enhanced.wav --model large-v3 --language en

# 5. Verify original unchanged
sha256sum -c evidence.sha256

# 6. Verify improvement (objective comparison + A/B listening)
python3 skills/.experimental/audio-voice-recovery/scripts/compare_audio.py \
  --before evidence.wav \
  --after enhanced.wav \
  --format md \
  --out comparison.md
```

## How to Use

Read individual reference files for detailed explanations and code examples:

- [Section definitions](references/_sections.md) - Category structure and impact levels
- [Rule template](assets/templates/_template.md) - Template for adding new rules

## Reference Files

| File | Description |
|------|-------------|
| [AGENTS.md](AGENTS.md) | Complete compiled guide with all rules |
| [references/_sections.md](references/_sections.md) | Category definitions and ordering |
| [assets/templates/_template.md](assets/templates/_template.md) | Template for new rules |
| [metadata.json](metadata.json) | Version and reference information |