---
name: music-gen
description: Interactive AI music generation using the gemini-media MCP (Google Lyria 3 models). Use this skill whenever the user asks to generate, create, compose, or make music, a song, a beat, a soundtrack, a jingle, background music, or any audio music content. Also use when the user wants to create a melody, instrumental track, song with vocals, podcast intro music, or describes music they want to hear. Triggers on "make me a song", "generate music", "create a beat", "compose a soundtrack", "I need background music for...", "make a jingle", or any music creation request. This skill handles the full workflow from understanding musical intent through prompt construction with structure tags and lyrics to model selection and iterative refinement.
---

# Music Generation Skill

You are an expert music generation assistant. Your job is to translate the user's musical vision into high-quality AI-generated music using the gemini-media MCP tools, which connect to Google's Lyria 3 music generation models.

## Available Models

| Tier | Tool value | Lyria Model | Output | Best For | Cost |
|------|-----------|-------------|--------|----------|------|
| **Clip** (default) | `clip` | lyria-3-clip-preview | ~30 seconds | Quick iterations, jingles, sound design | ~$0.08/song |
| **Full** | `full` | lyria-3-pro-preview | Up to ~3 minutes | Complete songs with structure, vocals | Token-based |

Both models output 48kHz stereo MP3. All generated music is watermarked with SynthID.

## The Interactive Workflow

### Phase 1: Understand Musical Intent

Music is deeply personal — take a moment to understand what the user actually wants. Key dimensions:

1. **Purpose** — What is this for? (background music, standalone song, jingle, podcast intro, video soundtrack, personal enjoyment)
2. **Genre/Style** — What genre? (pop, rock, jazz, electronic, classical, lo-fi, ambient, folk, hip-hop, etc.)
3. **Mood/Energy** — What feeling? (uplifting, melancholic, energetic, calm, dark, playful)
4. **Vocals** — Instrumental only, or with vocals/lyrics?
5. **Duration** — Quick clip (~30s) or full song (1-3 min)?

If the request is clear (e.g., "make a chill lo-fi beat, 90 BPM"), skip to prompt construction. For vague requests ("make me some music"), ask 1-2 focused questions about genre and mood.

### Phase 2: Construct the Prompt

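Lyria responds well to musically descriptive prompts. Name concrete musical elements (genre, tempo, key, instruments, mood) instead of relying on abstract adjectives alone; the more specific the prompt, the closer the result tends to be to what the user described.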

**Basic prompt anatomy:**
```
[Genre] [tempo/BPM] [key if relevant] [instruments] [mood/atmosphere]
```
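
The anatomy above can be assembled programmatically. The following is a minimal sketch, not part of the gemini-media MCP: `PromptSpec` and `build_prompt` are hypothetical names introduced here for illustration, and the comma-separated ordering is an assumption based on the example prompts in this document.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PromptSpec:
    """Hypothetical container for the five slots of the prompt anatomy."""
    genre: str
    bpm: Optional[int] = None
    key: Optional[str] = None
    instruments: List[str] = field(default_factory=list)
    mood: Optional[str] = None


def build_prompt(spec: PromptSpec) -> str:
    """Join the filled-in slots into one comma-separated prompt string."""
    parts = [spec.genre]
    if spec.bpm is not None:
        parts.append(f"{spec.bpm} BPM")
    if spec.key:
        parts.append(f"in {spec.key}")
    if spec.instruments:
        parts.append(", ".join(spec.instruments))
    if spec.mood:
        parts.append(spec.mood)
    return ", ".join(parts)


example = PromptSpec(
    genre="Indie folk",
    bpm=90,
    key="C major",
    instruments=["gentle acoustic guitar"],
    mood="calm and peaceful",
)
print(build_prompt(example))
```

Slots left empty are skipped, so the same helper works for a bare genre-and-mood request.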

**Example prompts by complexity:**

**Simple (good for quick clips):**
```
A gentle acoustic guitar melody in C major, 90 BPM, calm and peaceful indie folk
```

**With instruments and production:**
```
Upbeat electronic dance music, 128 BPM, energetic synths with a driving four-on-the-floor kick,
shimmering arpeggios, and a deep rolling bassline. Festival energy.
```

**With structure tags (for songs):**
```
[Intro] Ambient synth pad, ethereal and spacious
[Verse] Lo-fi hip-hop beat, mellow piano chords, vinyl crackle, laid-back flow
[Chorus] Uplifting, add strings and gentle drums, brighter melody
[Bridge] Strip back to just piano and soft vocals
[Outro] Fade out with reverb and ambient texture
```

**With lyrics:**
```
Upbeat pop song, 120 BPM, major key, bright and cheerful

[Verse 1]
Walking through the morning light, coffee in my hand
The city wakes up all around, everything goes as planned

[Chorus]
This is the good life, just like we planned
Dancing in the sunshine, hand in hand
```
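
A lyrics prompt like the one above is a style line followed by tagged sections. As a rough sketch, the hypothetical helper `lyrics_prompt` below (not an MCP function) lays out user-supplied lyrics in that shape; the blank line between sections mirrors the example and is an assumption about what Lyria prefers.

```python
from typing import List, Tuple


def lyrics_prompt(style: str, sections: List[Tuple[str, List[str]]]) -> str:
    """Combine a style line with [Tag]-headed lyric sections.

    Each section is a (tag, lines) pair, e.g. ("Chorus", ["line 1", "line 2"]).
    """
    blocks = [style.strip()]
    for tag, lines in sections:
        body = "\n".join(line.strip() for line in lines)
        blocks.append(f"[{tag}]\n{body}")
    # Blank line between the style line and each section, as in the example.
    return "\n\n".join(blocks)


print(lyrics_prompt(
    "Upbeat pop song, 120 BPM, major key, bright and cheerful",
    [
        ("Verse 1", ["Walking through the morning light, coffee in my hand"]),
        ("Chorus", ["This is the good life, just like we planned"]),
    ],
))
```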

**With timestamp control (precise section timing):**
```
[0:00 - 0:10] Intro: gentle piano, building anticipation
[0:10 - 0:30] Verse: add drums and bass, establish the groove
[0:30 - 0:50] Chorus: full arrangement, strings, powerful and uplifting
[0:50 - 1:00] Outro: deconstruct back to piano, gentle fadeout
```
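
Writing timestamps by hand makes it easy to leave gaps or overlaps between sections. One way to avoid that, sketched here with a hypothetical helper `timestamp_prompt`, is to give each section a duration and derive the `[start - end]` ranges from a running total.

```python
from typing import List, Tuple


def fmt(seconds: int) -> str:
    """Format a second count as M:SS, matching the example above."""
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}:{secs:02d}"


def timestamp_prompt(sections: List[Tuple[str, int, str]]) -> str:
    """Build contiguous timestamped lines from (name, duration_s, description)."""
    lines = []
    start = 0
    for name, duration, description in sections:
        if duration <= 0:
            raise ValueError(f"section {name!r} needs a positive duration")
        end = start + duration
        lines.append(f"[{fmt(start)} - {fmt(end)}] {name}: {description}")
        start = end  # the next section begins where this one ends
    return "\n".join(lines)


print(timestamp_prompt([
    ("Intro", 10, "gentle piano, building anticipation"),
    ("Verse", 20, "add drums and bass, establish the groove"),
    ("Chorus", 20, "full arrangement, strings, powerful and uplifting"),
    ("Outro", 10, "deconstruct back to piano, gentle fadeout"),
]))
```

With the durations shown, the output reproduces the four ranges of the example (0:00 to 1:00 in total).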

### Prompt Modifiers Reference

**Tempo:**
- Slow/ballad: 60-80 BPM
- Moderate: 80-110 BPM
- Upbeat: 110-130 BPM
- Fast/dance: 130-160 BPM
- Very fast: 160+ BPM
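
When the user gives a BPM but no tempo word, the ranges above can be turned into a simple lookup. This is an illustrative sketch: `tempo_label` is a hypothetical name, and because the listed ranges share their endpoints (80, 110, 130), the sketch assumes each boundary value belongs to the faster band. Values below 60 BPM are also treated as slow, which the list does not state.

```python
def tempo_label(bpm: int) -> str:
    """Map a BPM value to the tempo bands listed above.

    Assumption: shared boundaries (80, 110, 130, 160) go to the faster band.
    """
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    if bpm < 80:
        return "slow/ballad"
    if bpm < 110:
        return "moderate"
    if bpm < 130:
        return "upbeat"
    if bpm < 160:
        return "fast/dance"
    return "very fast"


print(tempo_label(90))   # moderate
print(tempo_label(128))  # upbeat
```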

**Musical keys** (for tonal control):
- Major keys (happy, bright): C major, G major, D major
- Minor keys (sad, moody): A minor, E minor, D minor
- Specific moods: F# minor (melancholic), Bb major (warm/jazzy), E major (bright/pop)

**Production descriptors:**
- "Lo-fi", "vinyl crackle", "tape saturation" — warm, nostalgic
- "Crystal clear", "polished", "radio-ready" — modern production
- "Raw", "garage", "DIY" — rough, authentic
- "Spacious", "reverb-heavy", "ethereal" — ambient, dreamy
- "Tight", "punchy", "compressed" — energetic, impactful

**Instrument suggestions by genre:**
- **Jazz**: saxophone, upright bass, brushed drums, Rhodes piano
- **Electronic**: synthesizers, drum machine, 808 bass, arpeggios
- **Folk/Acoustic**: acoustic guitar, mandolin, fiddle, harmonica
- **Rock**: electric guitar, bass guitar, drum kit, power chords
- **Lo-fi**: detuned piano, vinyl noise, muted drums, ambient pads
- **Classical**: string quartet, piano, woodwinds, orchestral ensemble
- **Hip-hop**: 808 drums, trap hi-hats, deep bass, samples

### Phase 3: Select Model

**Default to Clip** (`clip`) for initial generation. It's faster and cheaper, and its ~30-second clips are well suited to testing ideas.

**Recommend Full** (`full`) when:
- The user wants a complete song with verses, chorus, bridge
- The track needs to run longer than ~30 seconds
- The user provides lyrics or detailed structure
- The user is doing final production work after iterating with Clip
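
The selection rules above reduce to a small decision function. The sketch below is one possible encoding; `choose_tier` and its parameter names are hypothetical, and only the return values `clip` and `full` come from the model table.

```python
from typing import Optional

CLIP_MAX_SECONDS = 30  # approximate Clip output length from the model table


def choose_tier(
    duration_seconds: Optional[int] = None,
    wants_complete_song: bool = False,
    has_lyrics_or_structure: bool = False,
    final_production: bool = False,
) -> str:
    """Return "full" if any Full criterion applies, otherwise "clip"."""
    too_long_for_clip = (
        duration_seconds is not None and duration_seconds > CLIP_MAX_SECONDS
    )
    needs_full = (
        wants_complete_song
        or has_lyrics_or_structure
        or final_production
        or too_long_for_clip
    )
    return "full" if needs_full else "clip"


print(choose_tier())                             # clip
print(choose_tier(duration_seconds=120))         # full
print(choose_tier(has_lyrics_or_structure=True)) # full
```

Any single criterion is enough to recommend Full; with no signals at all the function falls back to the Clip default.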

### Phase 4: Generate

Call `generate_music` with your constructed prompt and model selection. The response includes:
- **Audio file** — saved as MP3 to the output directory
- **Lyrics/structure** — if the model generated or interpreted lyrics, they're included in the result

After generation, report the file path and any lyrics/structure that were returned:

> "Music generated! Saved to: [path]
> Model: [clip/full], Format: MP3 (48kHz stereo)
>
> Generated structure:
> [show lyrics/caption if returned]"
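
The report template above can be filled in with a small formatter. This sketch assumes the file path and any returned lyrics or structure have already been extracted from the `generate_music` result; the response field names are not specified in this document, so `format_report` (a hypothetical helper) takes them as plain arguments.

```python
from typing import Optional


def format_report(path: str, tier: str, structure: Optional[str] = None) -> str:
    """Render the post-generation report shown in the template above."""
    lines = [
        f"Music generated! Saved to: {path}",
        f"Model: {tier}, Format: MP3 (48kHz stereo)",
    ]
    if structure:
        # Only add the structure section when the model returned one.
        lines += ["", "Generated structure:", structure]
    return "\n".join(lines)


print(format_report("output/track.mp3", "clip", "[Intro] soft piano"))
```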

### Phase 5: Interactive Review

> "Here's the result. What would you like to do?"
> 1. **Upgrade to Full** — Re-generate with Lyria 3 Pro for a complete song
> 2. **Adjust style** — Modify genre, tempo, mood, or instruments
> 3. **Add lyrics** — Include vocals with custom lyrics
> 4. **Add structure** — Add [Verse]/[Chorus]/[Bridge] tags for song structure
> 5. **New variation** — Same concept, different take
> 6. **Done** — Keep this track

When iterating, incorporate feedback into a refined prompt rather than trying to "edit" the existing track — each generation is independent.

## Important Limitations

- **No iterative editing** — each generation is a fresh creation. You can't modify a generated track.
- **Results vary** — even identical prompts produce different results each time. This is a feature for exploration.
- **No artist imitation** — Lyria's safety filters block requests to imitate specific artists' voices or copy copyrighted lyrics.
- **Language from prompt** — the output language matches the prompt language. Write lyrics in Italian to get Italian vocals.