---
name: media-factory
description: AI-powered media production pipeline using Nano Banana Pro (images), KLING AI (video/transitions), and ElevenLabs (voiceover). Use when creating video content, product demos, social media assets, or any multimedia production.
version: 1.0.0
mcps: [firecrawl, perplexity]
subagents: [nana-image-generator]
skills: [terminal-ui-design]
---

# ID8 MEDIA FACTORY - AI Production Pipeline

## Purpose

Orchestrate AI-powered multimedia production using three specialized tools:

- **Nano Banana Pro** (fal.ai) → Image generation
- **KLING AI** → Video generation & transitions
- **ElevenLabs** → Voiceover & audio

**Philosophy:** Assemble, don't animate. Generate high-quality assets, then compose them into polished content.

---

## When to Use

- Creating product demos or explainer videos
- Generating social media video content
- Building marketing assets (ads, promos)
- Producing educational content
- Creating podcast/video intros and outros
- Generating b-roll or background footage
- Building visual storytelling content
- Product launch videos
- Any multimedia content requiring images + video + audio

---

## The Three Pillars

### 🖼️ Nano Banana Pro (Images)

**Provider:** fal.ai (`fal-ai/nano-banana-pro`)
**Purpose:** Generate high-quality still images from text prompts

| Feature | Value |
|---------|-------|
| Model | Gemini 3 Pro Image (Nano Banana 2) |
| Resolutions | 1K, 2K, 4K |
| Aspect Ratios | 21:9, 16:9, 3:2, 4:3, 5:4, 1:1, 4:5, 3:4, 2:3, 9:16 |
| Formats | PNG, JPEG, WebP |
| Web Search | Can use live web data for current topics |

**Best For:**

- Hero images, thumbnails
- Character/product shots
- Background scenes
- Storyboard frames
- Social media graphics

### 🎬 KLING AI (Video)

**Provider:** KLING AI / AI/ML API
**Purpose:** Generate video from text or images, create transitions

| Feature | Value |
|---------|-------|
| Text-to-Video | v1, v1.6, v2, v2.1 (standard/pro/master) |
| Image-to-Video | v1, v1.6, v2, v2.1 (standard/pro/master) |
| Effects | v1.6-standard/effects, v1.6-pro/effects |
| Resolution | Up to 1080p |
| Frame Rate | 30 fps |
| Duration | 5-10 seconds per generation |

**Best For:**

- Animating still images
- Creating transitions between scenes
- Generating b-roll footage
- Motion graphics
- Product animations

### 🎙️ ElevenLabs (Voice)

**Provider:** ElevenLabs API
**Purpose:** Generate natural voiceovers and audio

| Feature | Value |
|---------|-------|
| Models | eleven_multilingual_v2 (default), eleven_turbo_v2_5 |
| Languages | 32+ supported |
| Voices | 1000s of pre-made + custom voice cloning |
| Formats | mp3_44100_128, pcm_44100, etc. |
| Features | Pronunciation dictionaries, voice settings |

**Best For:**

- Narration and voiceovers
- Character voices
- Podcast intros
- Product demo audio
- Multilingual content

---

## Production Workflows

### Workflow 1: Image → Video → Audio (Standard)

```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────────┐
│  NANO BANANA    │────▶│  KLING AI       │────▶│  ELEVENLABS     │
│  Generate       │     │  Animate        │     │  Narrate        │
│  Still Images   │     │  Images to      │     │  Final          │
│                 │     │  Video          │     │  Video          │
└─────────────────┘     └─────────────────┘     └─────────────────┘
```

**Steps:**

1. Write prompts for each scene/shot
2. Generate images with Nano Banana Pro
3. Feed images to KLING for animation
4. Write script for voiceover
5. Generate audio with ElevenLabs
6. Composite in video editor (CapCut, DaVinci, Premiere)

### Workflow 2: Script-First (Narrative)

```
┌─────────────────┐
│     SCRIPT      │
│   Write story   │
└────────┬────────┘
         │
    ┌────┴────┐
    ▼         ▼
┌───────┐ ┌───────────┐
│ELEVEN │ │NANO BANANA│
│LABS   │ │Scene imgs │
└───┬───┘ └─────┬─────┘
    │           │
    │     ┌─────┴─────┐
    │     ▼           │
    │ ┌───────┐       │
    │ │KLING  │       │
    │ │Animate│       │
    │ └───┬───┘       │
    │     │           │
    └─────┴───────────┘
          │
          ▼
   ┌─────────────┐
   │  COMPOSITE  │
   │ Final Edit  │
   └─────────────┘
```

### Workflow 3: Product Demo

```
Product Photos → KLING (animate) → KLING (transitions) → ElevenLabs (VO)
```

---

## Commands

### `/media-factory plan `

Create a production plan for multimedia content.

**Output:**

- Scene breakdown
- Image prompts (for Nano Banana)
- Video direction (for KLING)
- Script draft (for ElevenLabs)
- Estimated assets and timeline

### `/media-factory image `

Generate an image using Nano Banana Pro.

**Parameters:**

- `--aspect` - Aspect ratio (default: 16:9)
- `--resolution` - 1K, 2K, or 4K (default: 1K)
- `--count` - Number of variations (default: 1)
- `--format` - png, jpeg, webp (default: png)

### `/media-factory video `

Generate video using KLING AI.

**Parameters:**

- `--model` - v2.1-master, v1.6-pro, etc.
- `--mode` - text-to-video or image-to-video
- `--duration` - 5 or 10 seconds

### `/media-factory voice