---
name: transcribe
description: >-
  Transcribe audio files (podcasts, MP3s, interviews) using OpenAI Whisper.
  Use when the user wants to transcribe a podcast, audio file, or MP3. Also
  use when asked to "listen to" a podcast or audio.
license: MIT
metadata:
  author: spatie
  version: "0.0.1"
---

# Audio Transcription

Transcribe audio files using OpenAI Whisper (installed via Homebrew).

## Prerequisites

- `whisper` CLI must be installed: `brew install openai-whisper`

## How to transcribe

### 1. Get the audio file

If given a URL, download it:
```bash
curl -L -o /tmp/audio-file.mp3 "URL_HERE"
```

If given a podcast name/episode, search for the RSS feed or episode page to find the MP3 URL. Podcast hosting platforms like Transistor, Buzzsprout, Libsyn typically have direct MP3 URLs in their episode pages.

### 2. Run Whisper

```bash
whisper /tmp/audio-file.mp3 --model small --language en --output_dir /tmp/whisper-output --output_format txt
```

Available models (speed vs accuracy tradeoff):
- `tiny` - fastest, least accurate
- `base` - fast, decent accuracy
- `small` - good balance (recommended default)
- `medium` - slower, better accuracy
- `large` - slowest, best accuracy

For non-English audio, omit the `--language` flag or specify the correct language code.

### 3. Read the output

The transcript will be at `/tmp/whisper-output/audio-file.txt`

## Tips

- For long files (1h+), use `small` model to keep it reasonable
- For short files or when accuracy matters, use `medium` or `large`
- The first run downloads the model weights, subsequent runs are faster
- Output formats available: txt, vtt, srt, tsv, json
- For subtitles, use `--output_format srt`