# Qwen3 TTS Skill — Installation Guide

Follow these steps to install the Qwen3 TTS skill, which provides text-to-speech and voice cloning capabilities.

## Prerequisites

- `curl` (for downloading)
- `unzip` (for extraction)
- `bash` (shell)
- `git` (for cloning)
- `pip` with `huggingface_hub` and `transformers` (for model download and tokenizer generation)
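
Before installing, it can help to confirm that the command-line tools listed above are on your `PATH`. The following is a minimal sketch; the helper name `missing_tools` is hypothetical and not part of the skill, and it does not check for the `huggingface_hub` and `transformers` Python packages.

```bash
# Print the names of any tools (given as arguments) that are not on PATH.
missing_tools() {
  local tool missing=""
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || missing="${missing}${missing:+ }${tool}"
  done
  printf '%s\n' "$missing"
}

# Check the tools named in the prerequisites list.
missing="$(missing_tools curl unzip bash git pip)"
if [ -n "$missing" ]; then
  echo "Missing prerequisites: $missing" >&2
else
  echo "All prerequisites found."
fi
```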

## Quick Install (Recommended)

```bash
SKILL_DIR="${HOME}/.openclaw/skills/audio_tts"
mkdir -p "$SKILL_DIR"

# Clone the repo
git clone --depth 1 https://github.com/second-state/qwen3_tts_rs.git /tmp/qwen3-tts-repo
cp -r /tmp/qwen3-tts-repo/skills/* "$SKILL_DIR"
mkdir -p "$SKILL_DIR/scripts"
cp -r /tmp/qwen3-tts-repo/reference_audio "$SKILL_DIR/scripts/reference_audio"
rm -rf /tmp/qwen3-tts-repo

# Download platform-specific binaries, libtorch, and models
"${SKILL_DIR}/bootstrap.sh"
```

After installation, verify it works:

```bash
# Test TTS (requires model download to complete)
~/.openclaw/skills/audio_tts/scripts/tts \
  ~/.openclaw/skills/audio_tts/scripts/models/Qwen3-TTS-12Hz-0.6B-CustomVoice \
  "Hello, this is a test." \
  Vivian \
  english
ls -la output.wav
```
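
Listing the file only shows that it exists. As a slightly stronger check, the sketch below tests for a RIFF/WAVE header. It assumes the `tts` binary writes a standard WAV file named `output.wav` into the current directory; the helper `is_wav` is hypothetical.

```bash
# Return success if the file is non-empty and starts with a RIFF/WAVE header.
is_wav() {
  local file="$1"
  [ -s "$file" ] || return 1
  # Bytes 0-3 of a WAV file are "RIFF"; bytes 8-11 are "WAVE".
  [ "$(head -c 4 "$file")" = "RIFF" ] || return 1
  [ "$(dd if="$file" bs=1 skip=8 count=4 2>/dev/null)" = "WAVE" ]
}

if is_wav output.wav; then
  echo "output.wav looks like a valid WAV file."
else
  echo "output.wav is missing, empty, or not a WAV file." >&2
fi
```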

## Manual Installation

If the automatic download fails, download the components manually:

1. Go to https://github.com/second-state/qwen3_tts_rs/releases/latest
2. Download the zip for your platform:
   - `qwen3-tts-linux-x86_64.zip` (Linux x86_64 CPU, includes libtorch)
   - `qwen3-tts-linux-x86_64-cuda.zip` (Linux x86_64 CUDA, includes libtorch)
   - `qwen3-tts-linux-aarch64.zip` (Linux ARM64 CPU, includes libtorch)
   - `qwen3-tts-linux-aarch64-cuda.zip` (Linux ARM64 CUDA / Jetson, includes libtorch)
   - `qwen3-tts-macos-aarch64.zip` (macOS Apple Silicon, includes mlx.metallib)
3. Extract to `~/.openclaw/skills/audio_tts/scripts/`:
   ```bash
   unzip qwen3-tts-*.zip -d ~/.openclaw/skills/audio_tts/scripts/
   ```
4. Make the binaries executable:
   ```bash
   chmod +x ~/.openclaw/skills/audio_tts/scripts/tts
   chmod +x ~/.openclaw/skills/audio_tts/scripts/voice_clone
   ```
5. Copy reference audio files from the repo into the scripts directory:
   ```bash
   git clone --depth 1 https://github.com/second-state/qwen3_tts_rs.git /tmp/qwen3-tts-repo
   cp -r /tmp/qwen3-tts-repo/reference_audio ~/.openclaw/skills/audio_tts/scripts/reference_audio
   rm -rf /tmp/qwen3-tts-repo
   ```
6. Download the TTS models:
   ```bash
   pip install huggingface_hub transformers
   MODELS_DIR=~/.openclaw/skills/audio_tts/scripts/models
   mkdir -p "$MODELS_DIR"
   huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-CustomVoice --local-dir "$MODELS_DIR/Qwen3-TTS-12Hz-0.6B-CustomVoice"
   huggingface-cli download Qwen/Qwen3-TTS-12Hz-0.6B-Base --local-dir "$MODELS_DIR/Qwen3-TTS-12Hz-0.6B-Base"
   ```
7. Generate tokenizer.json files for each model:
   ```bash
   python3 -c "
   from transformers import AutoTokenizer
   for model in ['Qwen3-TTS-12Hz-0.6B-CustomVoice', 'Qwen3-TTS-12Hz-0.6B-Base']:
       path = '$MODELS_DIR/' + model
       tok = AutoTokenizer.from_pretrained(path, trust_remote_code=True)
       tok.backend_tokenizer.save(path + '/tokenizer.json')
       print(f'Saved {path}/tokenizer.json')
   "
   ```
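
Once the manual steps are done, a quick check of the resulting layout can catch a skipped step. The sketch below only looks for the paths named in the steps above (the two binaries, the reference audio directory, and the per-model `tokenizer.json` files); the function name `check_install` is hypothetical, and the release zip may contain other files that this does not verify.

```bash
# Report any expected path that is missing under the given scripts directory.
check_install() {
  local scripts_dir="$1" status=0 path
  for path in \
      tts \
      voice_clone \
      reference_audio \
      models/Qwen3-TTS-12Hz-0.6B-CustomVoice/tokenizer.json \
      models/Qwen3-TTS-12Hz-0.6B-Base/tokenizer.json; do
    if [ ! -e "${scripts_dir}/${path}" ]; then
      echo "missing: ${scripts_dir}/${path}" >&2
      status=1
    fi
  done
  return "$status"
}

if check_install "${HOME}/.openclaw/skills/audio_tts/scripts"; then
  echo "Manual installation looks complete."
else
  echo "Manual installation is incomplete; see the paths listed above." >&2
fi
```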

## Troubleshooting

### Download Failed
Check network connectivity:
```bash
curl -I "https://github.com/second-state/qwen3_tts_rs/releases/latest"
```
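
If the connection is merely flaky, fetching the zip directly with retries may be enough. This sketch relies on GitHub's generic `releases/latest/download/<asset>` URL pattern; the helper names `release_asset_url` and `download_asset` are hypothetical and are not part of `bootstrap.sh`.

```bash
# Build the direct download URL for an asset attached to the latest release.
release_asset_url() {
  printf 'https://github.com/second-state/qwen3_tts_rs/releases/latest/download/%s\n' "$1"
}

# Download an asset into a directory (default: current directory).
# -f fails on HTTP errors, -L follows redirects, --retry retries transient failures.
download_asset() {
  local asset="$1" dest_dir="${2:-.}"
  curl -fL --retry 3 -o "${dest_dir}/${asset}" "$(release_asset_url "$asset")"
}
```

For example, `download_asset qwen3-tts-linux-x86_64.zip /tmp` would save the Linux x86_64 CPU build to `/tmp`.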

### Unsupported Platform
Check your platform:
```bash
echo "OS: $(uname -s), Arch: $(uname -m)"
```

Supported: Linux (x86_64, aarch64) and macOS (Apple Silicon arm64).
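
To see which release zip matches your machine, the mapping from `uname` output to the asset names listed under Manual Installation can be sketched as follows. The function name `asset_name` and its CUDA flag argument are hypothetical; whether to pick a CUDA build is left to you, since this sketch does not try to detect a GPU.

```bash
# Map OS, architecture, and a CUDA flag (1 = CUDA build) to a release zip name.
asset_name() {
  local os="$1" arch="$2" cuda="${3:-0}" suffix=""
  [ "$cuda" = 1 ] && suffix="-cuda"
  case "${os}/${arch}" in
    Linux/x86_64)  echo "qwen3-tts-linux-x86_64${suffix}.zip" ;;
    Linux/aarch64) echo "qwen3-tts-linux-aarch64${suffix}.zip" ;;
    # The macOS release has no CUDA variant, so the flag is ignored here.
    Darwin/arm64)  echo "qwen3-tts-macos-aarch64.zip" ;;
    *)
      echo "unsupported platform: ${os}/${arch}" >&2
      return 1
      ;;
  esac
}

# Print the CPU build name for the current machine, if it is supported.
asset_name "$(uname -s)" "$(uname -m)" 0 || true
```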

### Missing libtorch (Linux only)
Ensure `LD_LIBRARY_PATH` includes the libtorch lib directory. The SKILL.md instructions set this automatically.
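
If you run the binaries outside the SKILL.md flow, you may need to set the variable yourself. The sketch below assumes the libtorch shared libraries were extracted somewhere under the scripts directory and locates them with `find`, since the exact subdirectory is not stated here; the helper `prepend_path` is hypothetical.

```bash
# Prepend a directory to a colon-separated path list unless it is already present.
prepend_path() {
  local dir="$1" list="$2"
  case ":${list}:" in
    *":${dir}:"*) printf '%s\n' "$list" ;;
    *)            printf '%s\n' "${dir}${list:+:}${list}" ;;
  esac
}

# Assumption: libtorch lives somewhere below the scripts directory.
SCRIPTS_DIR="${HOME}/.openclaw/skills/audio_tts/scripts"
LIBTORCH_SO="$(find "$SCRIPTS_DIR" -name 'libtorch*.so*' 2>/dev/null | head -n 1 || true)"

if [ -n "$LIBTORCH_SO" ]; then
  LD_LIBRARY_PATH="$(prepend_path "$(dirname "$LIBTORCH_SO")" "${LD_LIBRARY_PATH:-}")"
  export LD_LIBRARY_PATH
  echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
else
  echo "No libtorch shared library found under $SCRIPTS_DIR" >&2
fi
```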