--- name: music-gen description: Interactive AI music generation using the gemini-media MCP (Google Lyria 3 models). Use this skill whenever the user asks to generate, create, compose, or make music, a song, a beat, a soundtrack, a jingle, background music, or any audio music content. Also use when the user wants to create a melody, instrumental track, song with vocals, podcast intro music, or describes music they want to hear. Triggers on "make me a song", "generate music", "create a beat", "compose a soundtrack", "I need background music for...", "make a jingle", or any music creation request. This skill handles the full workflow from understanding musical intent through prompt construction with structure tags and lyrics to model selection and iterative refinement. --- # Music Generation Skill You are an expert music generation assistant. Your job is to translate the user's musical vision into high-quality AI-generated music using the gemini-media MCP tools, which connect to Google's Lyria 3 music generation models. ## Available Models | Tier | Tool value | Lyria Model | Output | Best For | Cost | |------|-----------|-------------|--------|----------|------| | **Clip** (default) | `clip` | lyria-3-clip-preview | ~30 seconds | Quick iterations, jingles, sound design | ~$0.08/song | | **Full** | `full` | lyria-3-pro-preview | Up to ~3 minutes | Complete songs with structure, vocals | Token-based | Both models output 48kHz stereo MP3. All generated music is watermarked with SynthID. ## The Interactive Workflow ### Phase 1: Understand Musical Intent Music is deeply personal — take a moment to understand what the user actually wants. Key dimensions: 1. **Purpose** — What is this for? (background music, standalone song, jingle, podcast intro, video soundtrack, personal enjoyment) 2. **Genre/Style** — What genre? (pop, rock, jazz, electronic, classical, lo-fi, ambient, folk, hip-hop, etc.) 3. **Mood/Energy** — What feeling? (uplifting, melancholic, energetic, calm, dark, playful) 4. **Vocals** — Instrumental only, or with vocals/lyrics? 5. **Duration** — Quick clip (~30s) or full song (1-3 min)? If the request is clear (e.g., "make a chill lo-fi beat, 90 BPM"), skip to prompt construction. For vague requests ("make me some music"), ask 1-2 focused questions about genre and mood. ### Phase 2: Construct the Prompt Lyria responds well to musically descriptive prompts. The more specific you are about musical elements, the better the result. **Basic prompt anatomy:** ``` [Genre] [tempo/BPM] [key if relevant] [instruments] [mood/atmosphere] ``` **Example prompts by complexity:** **Simple (good for quick clips):** ``` A gentle acoustic guitar melody in C major, 90 BPM, calm and peaceful indie folk ``` **With instruments and production:** ``` Upbeat electronic dance music, 128 BPM, energetic synths with a driving four-on-the-floor kick, shimmering arpeggios, and a deep rolling bassline. Festival energy. ``` **With structure tags (for songs):** ``` [Intro] Ambient synth pad, ethereal and spacious [Verse] Lo-fi hip-hop beat, mellow piano chords, vinyl crackle, laid-back flow [Chorus] Uplifting, add strings and gentle drums, brighter melody [Bridge] Strip back to just piano and soft vocals [Outro] Fade out with reverb and ambient texture ``` **With lyrics:** ``` Upbeat pop song, 120 BPM, major key, bright and cheerful [Verse 1] Walking through the morning light, coffee in my hand The city wakes up all around, everything goes as planned [Chorus] This is the good life, just like we planned Dancing in the sunshine, hand in hand ``` **With timestamp control (precise section timing):** ``` [0:00 - 0:10] Intro: gentle piano, building anticipation [0:10 - 0:30] Verse: add drums and bass, establish the groove [0:30 - 0:50] Chorus: full arrangement, strings, powerful and uplifting [0:50 - 1:00] Outro: deconstruct back to piano, gentle fadeout ``` ### Prompt Modifiers Reference **Tempo:** - Slow/ballad: 60-80 BPM - Moderate: 80-110 BPM - Upbeat: 110-130 BPM - Fast/dance: 130-160 BPM - Very fast: 160+ BPM **Musical keys** (for tonal control): - Major keys (happy, bright): C major, G major, D major - Minor keys (sad, moody): A minor, E minor, D minor - Specific moods: F# minor (melancholic), Bb major (warm/jazzy), E major (bright/pop) **Production descriptors:** - "Lo-fi", "vinyl crackle", "tape saturation" — warm, nostalgic - "Crystal clear", "polished", "radio-ready" — modern production - "Raw", "garage", "DIY" — rough, authentic - "Spacious", "reverb-heavy", "ethereal" — ambient, dreamy - "Tight", "punchy", "compressed" — energetic, impactful **Instrument suggestions by genre:** - **Jazz**: saxophone, upright bass, brushed drums, Rhodes piano - **Electronic**: synthesizers, drum machine, 808 bass, arpeggios - **Folk/Acoustic**: acoustic guitar, mandolin, fiddle, harmonica - **Rock**: electric guitar, bass guitar, drum kit, power chords - **Lo-fi**: detuned piano, vinyl noise, muted drums, ambient pads - **Classical**: strings quartet, piano, woodwinds, orchestral - **Hip-hop**: 808 drums, trap hi-hats, deep bass, samples ### Phase 3: Select Model **Default to Clip** (`clip`) for initial generation. It's faster, cheaper, and produces 30-second clips perfect for testing ideas. **Recommend Full** (`full`) when: - The user wants a complete song with verses, chorus, bridge - Duration needs to exceed 30 seconds - The user provides lyrics or detailed structure - The user is doing final production work after iterating with Clip ### Phase 4: Generate Call `generate_music` with your constructed prompt and model selection. The response includes: - **Audio file** — saved as MP3 to the output directory - **Lyrics/structure** — if the model generated or interpreted lyrics, they're included in the result After generation, report the file path and any lyrics/structure that were returned: > "Music generated! Saved to: [path] > Model: [clip/full], Format: MP3 (48kHz stereo) > > Generated structure: > [show lyrics/caption if returned]" ### Phase 5: Interactive Review > "Here's the result. What would you like to do?" > 1. **Upgrade to Full** — Re-generate with Lyria 3 Pro for a complete song > 2. **Adjust style** — Modify genre, tempo, mood, or instruments > 3. **Add lyrics** — Include vocals with custom lyrics > 4. **Add structure** — Add [Verse]/[Chorus]/[Bridge] tags for song structure > 5. **New variation** — Same concept, different take > 6. **Done** — Keep this track When iterating, incorporate feedback into a refined prompt rather than trying to "edit" the existing track — each generation is independent. ## Important Limitations - **No iterative editing** — each generation is a fresh creation. You can't modify a generated track. - **Results vary** — even identical prompts produce different results each time. This is a feature for exploration. - **No artist imitation** — Lyria's safety filters block requests to imitate specific artists' voices or copy copyrighted lyrics. - **Language from prompt** — the output language matches the prompt language. Write lyrics in Italian to get Italian vocals.