---
name: IMA Studio Music Generation
description: "Top-tier AI music generation with models: Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong). One-stop text-to-music with custom mode, lyrics, vocal control, and style tags. Requires IMA API key. Optional: ima-knowledge-ai for workflow and model guidance when installed. Use for: music generation, text-to-music, background music, songs with lyrics, soundtrack, jingles, ambient music, vocal and instrumental tracks. Output: MP3/WAV."
---

# IMA Voice AI Creation

## Scope & Dependencies (Declared for Transparency)

- **Credentials:** This skill requires an **IMA API key** at runtime (`IMA_API_KEY` or `--api-key`). The key is sent only to **api.imastudio.com**. Obtain keys at https://imastudio.com. Declared in registry as required.
- **Optional dependency:** When **ima-knowledge-ai** is installed, this skill may instruct the agent to read that skill's reference files (`~/.openclaw/skills/ima-knowledge-ai/references/*`) for workflow and model-selection guidance. **This skill is self-contained** — it works fully without ima-knowledge-ai. Reading another skill's files is optional and only for complex or multi-step tasks; users who do not have or trust ima-knowledge-ai can ignore those steps and use this skill's built-in defaults and **📥 User Input Parsing** tables.
- **Local paths:** This skill reads/writes `~/.openclaw/memory/ima_prefs.json` (preferences) and `~/.openclaw/logs/ima_skills/` (logs; auto-deleted after 7 days). User can delete these anytime.

---

## Optional: Read Knowledge Base (When ima-knowledge-ai Is Installed)

**If ima-knowledge-ai is not installed:** Skip this section. Use only this SKILL's default models and the **📥 User Input Parsing** tables for model_id and parameters.

**When ima-knowledge-ai is installed and the task is complex**, you may optionally read its reference files for better workflow and model choice:

1. **Workflow complexity** — Read `ima-knowledge-ai/references/workflow-design.md` if:
   - User mentions: "MV"、"配乐"、"完整作品"、"多步骤"、"soundtrack"
   - Task involves: video + music coordination, multi-track production, integrated workflows
   - Complex requirements that need task decomposition

2. **Model selection** — Read `ima-knowledge-ai/references/model-selection.md` if:
   - Unsure which model to use (Suno vs DouBao BGM vs DouBao Song)
   - Need cost/quality trade-off guidance
   - User specifies budget or quality requirements

**Why this is optional:**
- Music generation is often part of a larger workflow (video + music, story + soundtrack)
- For simple single-track requests, proceed directly with this skill's defaults
- For complex workflows, reading the knowledge base can improve task decomposition and model choice

**Example workflow case (when using optional knowledge base):**
```
User: "帮我做个产品宣传MV，有背景音乐"

❌ Wrong: Generate the music directly (music alone, no coordination with video)

✅ Right (if ima-knowledge-ai available): 
  1. Read workflow-design.md
  2. Decompose: Script → Video shots → Background music (matching video duration/mood)
  3. Generate video first (get duration)
  4. Generate BGM with matching duration and style
```

**How to check (optional):**
```python
# Only if ima-knowledge-ai is installed and task is complex
if ima_knowledge_ai_installed and (complex_workflow or multi_step):
    read("~/.openclaw/skills/ima-knowledge-ai/references/workflow-design.md")

if ima_knowledge_ai_installed and unsure_model_choice:
    read("~/.openclaw/skills/ima-knowledge-ai/references/model-selection.md")

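# Choose model (this skill's logic works with or without knowledge base).
# `request` is the user's message, lowercased. A bare `"a" or "b"` test is
# always true, so each keyword is checked against the request explicitly.
if any(k in request for k in ("background music", "bgm", "instrumental")):
    use_doubao_bgm()  # 30pts, pure instrumental
elif any(k in request for k in ("song", "lyrics", "vocals")):
    use_suno_sonic()  # 25pts, full-featured with lyrics
else:
    use_suno_sonic()  # Default: most versatile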
```

**For simple requests:** Proceed directly with this skill's defaults. No need to read other skills' files.

---

## 📥 User Input Parsing (Model & Parameter Recognition)

**Purpose:** So that any agent (Claude or other models) parses user intent consistently, follow these rules when deriving **model_id** and **task type** from natural language. Normalize first, then map.

### 1. User phrasing → model selection (model_id)

| User intent / phrasing | model_id | Notes |
|------------------------|----------|--------|
| BGM / 背景音乐 / 纯音乐 / 无人声 / instrumental / 配乐 | `GenBGM` | DouBao BGM, 30 pts, ~30s |
| 歌 / 歌曲 / 带歌词 / 人声 / song / lyrics / 有唱 | `sonic` or `GenSong` | Suno (25 pts, ~2min) or DouBao Song (30 pts, ~30s) |
| Suno / 苏诺 / sonic | `sonic` | Full-featured, lyrics, vocal_gender, 25 pts |
| 豆包 BGM / DouBao BGM / BGM | `GenBGM` | 30 pts |
| 豆包 歌曲 / DouBao Song / 豆包歌 | `GenSong` | 30 pts |
| 最便宜 / 最省钱 / cheapest / budget | `GenBGM` or `GenSong` (6 pts tier, if available) | Only if user explicitly asks for cheapest |
| 最好 / 最全功能 / best / 带歌词可调 | `sonic` | Suno default |

If the user does not specify, default to **Suno (`sonic`)** for versatility. For "背景音乐"/"BGM"/"配乐" only → use **DouBao BGM (`GenBGM`)**.

### 2. Music-specific parameters (Suno)

| User says (examples) | Parameter | Action |
|----------------------|-----------|--------|
| 无人声 / 纯音乐 / no vocals / instrumental | make_instrumental | true |
| 女声 / 女声演唱 / female vocals | vocal_gender | "female" (custom_mode true) |
| 男声 / 男声演唱 / male vocals | vocal_gender | "male" (custom_mode true) |
| 我写歌词 / 自定义歌词 / custom lyrics | custom_mode + lyrics | Provide lyrics in request |

When using Suno with lyrics or vocal control, set `custom_mode: true` and pass `lyrics` / `vocal_gender` per API docs.
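
The two tables above can be sketched as a plain keyword lookup. This is a minimal illustration, not the bundled script's logic: the function names `pick_model_id` and `parse_suno_params`, the substring matching, and the precedence (an explicitly named model wins over an intent keyword) are all assumptions made for the example.

```python
# Hypothetical helpers: names and matching order are illustrative assumptions.
SUNO_KEYWORDS = ("suno", "苏诺", "sonic")
DOUBAO_SONG_KEYWORDS = ("豆包 歌曲", "豆包歌", "doubao song")
BGM_KEYWORDS = ("bgm", "背景音乐", "纯音乐", "无人声", "instrumental", "配乐")


def pick_model_id(user_text: str) -> str:
    """Map free-form user text to a model_id, defaulting to Suno."""
    text = user_text.lower()
    if any(k in text for k in SUNO_KEYWORDS):
        return "sonic"
    if any(k in text for k in DOUBAO_SONG_KEYWORDS):
        return "GenSong"
    if any(k in text for k in BGM_KEYWORDS):
        return "GenBGM"
    return "sonic"  # nothing specified: Suno is the default


def parse_suno_params(user_text: str) -> dict:
    """Derive Suno-specific parameters from the phrases in table 2."""
    text = user_text.lower()
    params = {}
    if any(k in text for k in ("无人声", "纯音乐", "no vocals", "instrumental")):
        params["make_instrumental"] = True
    # Check "female" first: the string "female vocals" contains "male vocals".
    if any(k in text for k in ("女声", "female vocals")):
        params.update(vocal_gender="female", custom_mode=True)
    elif any(k in text for k in ("男声", "male vocals")):
        params.update(vocal_gender="male", custom_mode=True)
    return params
```

Checking the female phrases before the male ones matters because naive substring matching would otherwise classify "female vocals" as male.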

---

## ⚙️ How This Skill Works

**For transparency:** This skill uses a bundled Python script (`scripts/ima_voice_create.py`) to call the IMA Open API. The script:
- Sends your prompt to `https://api.imastudio.com` (IMA's servers)
- Uses `--user-id` **only locally** as a key for storing your model preferences
- Returns a music URL when generation is complete
- **NEW (v1.1.0): Automatic reflection mechanism** — if generation fails, the script automatically retries up to 3 times with smart parameter adjustments

### 🧠 Reflection Mechanism (Automatic Error Recovery)

This skill now includes an **intelligent reflection system** that automatically recovers from common errors:

**3-Layer Retry Strategy:**

1. **Attempt 1: Original Parameters**
   - Uses your provided parameters with smart credit_rule selection
   - Most tasks succeed on first try

2. **Attempt 2: Strict Match (Error 6009 Fix)**
   - Automatically removes unsupported parameters
   - Only keeps parameters in `credit_rules.attributes`
   - Example: Removes unsupported Suno parameters if not in model config

3. **Attempt 3: Fallback to Default (Error 6010 Fix)**
   - Uses model's default configuration
   - Uses `credit_rules[0]` (first rule = safest default)
   - Guarantees maximum compatibility

**Common Errors Fixed Automatically:**
- **Error 6009**: "No exact rule match found for parameters" → removes unsupported params
- **Error 6010**: "Attribute ID does not match" → corrects attribute_id to match params
- **Invalid product attribute**: Uses default rule configuration
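
The three-layer strategy above can be sketched roughly as follows. This is not the code in `scripts/ima_voice_create.py`: the function name `create_with_reflection`, the `create_task(params, attribute_id)` callable, the use of `RuntimeError` for API failures, and the shape of each rule (a dict with `attribute_id` and `attributes`) are assumptions. As a simplification, every attempt here uses `credit_rules[0]`, whereas the real first attempt is described as selecting a rule more intelligently.

```python
def create_with_reflection(create_task, params, credit_rules):
    """Try up to three parameter sets; return (result, reflection_log).

    `create_task` is a hypothetical callable that raises RuntimeError
    when the API rejects the request.
    """
    default_rule = credit_rules[0]
    allowed = set(default_rule.get("attributes", {}))
    candidates = [
        ("original parameters", dict(params)),
        ("strict match", {k: v for k, v in params.items() if k in allowed}),
        ("model defaults", {}),
    ]
    log = []
    for number, (label, candidate) in enumerate(candidates, start=1):
        try:
            result = create_task(candidate, default_rule["attribute_id"])
        except RuntimeError as exc:
            log.append(f"attempt {number} ({label}) failed: {exc}")
            continue
        log.append(f"attempt {number} ({label}) succeeded")
        return result, log
    raise RuntimeError("all attempts failed: " + "; ".join(log))
```

Returning the log alongside the result is what lets the caller show a reflection log only when a retry actually happened.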

**User Experience:**
- ✅ **Transparent**: Shows reflection log when retries happen
- ✅ **Fast**: Most tasks succeed on first attempt (no delay)
- ✅ **Smart**: Learns from errors and adjusts automatically
- ✅ **User-friendly**: Provides helpful suggestions if all 3 attempts fail

**Example Output (with reflection):**
```
🚀 Creating music generation task…

🧠 反省日志 (2 次尝试):
   ❌ [尝试 1] Invalid product attribute → 移除不支持的参数: ['unsupported_param']
   ✅ [尝试 2] ✅ 成功（尝试 2）

✅ Task created: task_abc123xyz
```

**What gets sent to IMA servers:**
- ✅ Your music prompt/description
- ✅ Model selection (Suno/DouBao)
- ❌ NO API key in prompts (key is used for authentication only)
- ❌ NO user_id (it's only used locally)

**What's stored locally:**
- `~/.openclaw/memory/ima_prefs.json` - Your model preferences (< 1 KB)
- `~/.openclaw/logs/ima_skills/` - Generation logs (auto-deleted after 7 days)

### Agent Execution (Internal Reference)

> **Note for users:** You can review the script source at `scripts/ima_voice_create.py` anytime.  
> **This skill uses only `api.imastudio.com`** (no image upload needed for music generation, unlike image/video skills which also use `imapi.liveme.com`).

Use the bundled script internally:

```bash
# Generate music — Suno sonic-v5
python3 {baseDir}/scripts/ima_voice_create.py \
  --api-key  $IMA_API_KEY \
  --task-type text_to_music \
  --model-id  sonic \
  --prompt   "upbeat lo-fi hip hop, 90 BPM, no vocals" \
  --user-id  {user_id} \
  --output-json

# DouBao BGM
python3 {baseDir}/scripts/ima_voice_create.py \
  --api-key  $IMA_API_KEY \
  --model-id  GenBGM \
  --prompt   "calm ambient piano background" \
  --user-id  {user_id} \
  --output-json
```

The script outputs JSON — parse it to get the result URL and pass it to the user via the UX protocol messages below.

---

## Overview

Call IMA Open API to create AI-generated music/audio. All endpoints require an `ima_*` API key. The core flow is: **query products → create task → poll until done**.

---

## 🔒 Security & Transparency Policy

> **This skill is community-maintained and open for inspection.**

### 🌐 Network Architecture

**This skill uses a simpler network architecture than image/video skills:**

| Skill Type | Domains Used | Why |
|------------|--------------|-----|
| **ima-voice-ai** (this skill) | ✅ `api.imastudio.com` only | Music generation doesn't require image uploads |
| ima-image-ai, ima-video-ai | `api.imastudio.com` + `imapi.liveme.com` | Image/video tasks need image upload service |

**Why the difference?**
- **Music generation** (text_to_music) only needs text prompts → single API endpoint
- **Image/video generation** (i2i, i2v tasks) needs image file uploads → requires separate upload service

**Security verification:**
```bash
# Verify this skill only uses api.imastudio.com:
grep -n "https://" scripts/ima_voice_create.py

# Expected output:
# Only https://api.imastudio.com (no imapi.liveme.com)
```

---

### ✅ What Users CAN Do

**Full transparency:**
- ✅ **Review all source code**: Check `scripts/ima_voice_create.py` and `ima_logger.py` anytime
- ✅ **Verify network calls**: **This skill uses only `api.imastudio.com`** (music generation doesn't require image uploads). Verify by running: `grep -n "https://" scripts/ima_voice_create.py`
- ✅ **Inspect local data**: View `~/.openclaw/memory/ima_prefs.json` and log files
- ✅ **Control privacy**: Delete preferences/logs anytime, or disable file writes (see below)

**Configuration allowed:**
- ✅ **Set API key** in environment or agent config:
  - Environment variable: `export IMA_API_KEY=ima_your_key_here`
  - OpenClaw/MCP config: Add `IMA_API_KEY` to agent's environment configuration
  - Get your key at: https://imastudio.com
- ✅ **Use scoped/test keys**: Test with limited API keys, rotate after testing
- ✅ **Disable file writes**: Make prefs/logs read-only or symlink to `/dev/null`

**Data control:**
- ✅ **View stored data**: `cat ~/.openclaw/memory/ima_prefs.json`
- ✅ **Delete preferences**: `rm ~/.openclaw/memory/ima_prefs.json` (resets to defaults)
- ✅ **Delete logs**: `rm -rf ~/.openclaw/logs/ima_skills/` (auto-cleanup after 7 days anyway)

### ⚠️ Advanced Users: Fork & Modify

If you need to modify this skill for your use case:
1. **Fork the repository** (don't modify the original)
2. **Update your fork** with your changes
3. **Test thoroughly** with limited API keys
4. **Document your changes** for troubleshooting

**Note:** Modified skills may break API compatibility or introduce security issues. Official support only covers the unmodified version.

### ❌ What to AVOID (Security Risks)

**Actions that could compromise security:**
- ❌ Sharing API keys publicly or in skill files
- ❌ Modifying API endpoints to unknown servers
- ❌ Disabling SSL/TLS certificate verification
- ❌ Logging sensitive user data (prompts, IDs, etc.)
- ❌ Bypassing authentication or billing mechanisms

**Why this matters:**
1. **API Compatibility**: Skill logic aligns with IMA Open API schema
2. **Security**: Malicious modifications could leak credentials or bypass billing
3. **Support**: Modified skills may not be supported
4. **Community**: Breaking changes affect all users

### 📁 File System Access (Declared)

This skill reads/writes the following files:

| Path | Purpose | Size | Auto-cleanup | User Control |
|------|---------|------|--------------|--------------|
| `~/.openclaw/memory/ima_prefs.json` | User model preferences | < 1 KB | No | Delete anytime |
| `~/.openclaw/logs/ima_skills/` | Generation logs | ~10-50 KB/day | 7 days | Delete anytime |

**What's stored:**
- ✅ Model preferences (e.g., "last used: Suno sonic-v5")
- ✅ Timestamps (e.g., "2026-02-27 12:34:56")
- ✅ Task IDs and HTTP status codes
- ❌ NO API keys
- ❌ NO personal data
- ❌ NO prompts or generated content

**Full transparency:** See the complete data flow and privacy policy in the skill documentation above.

### 📋 Privacy & Data Handling Summary

**What this skill does with your data:**

| Data Type | Sent to IMA? | Stored Locally? | User Control |
|-----------|-------------|-----------------|--------------|
| Music prompts | ✅ Yes (required for generation) | ❌ No | None (required) |
| API key | ✅ Yes (authentication header) | ❌ No | Set via env var |
| user_id (optional CLI arg) | ❌ **Never** (local preference key only) | ✅ Yes (as prefs file key) | Change `--user-id` value |
| Model preferences | ❌ No | ✅ Yes (~/.openclaw) | Delete anytime |
| Generation logs | ❌ No | ✅ Yes (~/.openclaw) | Auto-cleanup 7 days |

**Privacy recommendations:**
1. **Use test/scoped API keys** for initial testing
2. **Note**: `--user-id` is **never sent to IMA servers** - it's only used locally as a key for storing preferences in `~/.openclaw/memory/ima_prefs.json`
3. **Review source code** at `scripts/ima_voice_create.py` to verify network calls (search for `create_task` function)
4. **Rotate API keys** after testing or if compromised

**Get your IMA API key:** Visit https://imastudio.com to register and get started.

### 🔧 For Skill Maintainers Only

**Version control:**
- All changes must go through Git with proper version bumps (semver)
- CHANGELOG.md must document all changes
- Production deployments require code review

**File checksums (optional):**
```bash
# Verify skill integrity
sha256sum SKILL.md scripts/ima_voice_create.py
```

If users report issues, verify file integrity first.

---

## 🧠 User Preference Memory

> User preferences **override** recommended defaults. If a user has generated before, use their preferred model — not the system default.

### Storage: `~/.openclaw/memory/ima_prefs.json`

```json
{
  "user_{user_id}": {
    "text_to_music": { "model_id": "sonic", "model_name": "Suno", "credit": 25, "last_used": "..." }
  }
}
```

If the file or key doesn't exist, fall back to the ⭐ Recommended Defaults below.

### When to Read (Before Every Generation)

1. Load `~/.openclaw/memory/ima_prefs.json` (silently, no error if missing)
2. Look up `user_{user_id}.text_to_music`
3. **If found** → use that model; mention it:
   ```
   🎵 根据你的使用习惯，将用 [Model Name] 帮你生成音乐…
   • 模型：[Model Name]（你的常用模型）
   • 预计耗时：[X ~ Y 秒]
   • 消耗积分：[N pts]
   ```
4. **If not found** → use the ⭐ Recommended Default (Suno sonic-v5)

### When to Write (After Every Successful Generation)

Save the used model to `~/.openclaw/memory/ima_prefs.json` under `user_{user_id}.text_to_music`.  
See `ima-image-ai/SKILL.md` → "User Preference Memory" for the full Python write snippet.
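
For readers without ima-image-ai installed, the read and write steps might look like the sketch below. It is not the snippet referenced above: the function names `load_pref` and `save_pref` are hypothetical, and the ISO 8601 UTC timestamp for `last_used` is an assumption, since the storage example elides that value.

```python
import json
from datetime import datetime, timezone
from pathlib import Path

PREFS_PATH = Path.home() / ".openclaw" / "memory" / "ima_prefs.json"


def _read_all(path):
    """Return the whole prefs dict, or {} if the file is missing or invalid."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}
    return data if isinstance(data, dict) else {}


def load_pref(user_id, task_type="text_to_music", path=PREFS_PATH):
    """Return the saved preference dict for this user, or None."""
    return _read_all(path).get(f"user_{user_id}", {}).get(task_type)


def save_pref(user_id, model_id, model_name, credit,
              task_type="text_to_music", path=PREFS_PATH):
    """Record the model used by a successful generation."""
    prefs = _read_all(path)
    prefs.setdefault(f"user_{user_id}", {})[task_type] = {
        "model_id": model_id,
        "model_name": model_name,
        "credit": credit,
        # Timestamp format is an assumption; the source shows only "...".
        "last_used": datetime.now(timezone.utc).isoformat(),
    }
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(prefs, ensure_ascii=False, indent=2),
                    encoding="utf-8")
```

A missing or unreadable file is treated as "no preference", which matches the rule to load silently and fall back to the recommended defaults.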

### When to Update (User Explicitly Changes Model)

| Trigger | Action |
|---------|--------|
| `用XXX` / `换成XXX` | Switch + save as new preference |
| `以后都用XXX` / `always use XXX` | Save + confirm: `✅ 已记住！以后音乐生成默认用 [XXX]` |
| `用便宜的` / `cheapest` | Use DouBao BGM/Song; do NOT save unless user says "以后都用" |

---

## ⭐ Recommended Defaults

> **These are fallback defaults — only used when no user preference exists.**  
> **Always default to the newest and most popular model. Do NOT default to the cheapest.**

| Task | Default Model | model_id | model_version | Cost | Why |
|------|--------------|----------|---------------|------|-----|
| text_to_music | **Suno (sonic-v5)** | `sonic` | `sonic` | 25 pts | Latest Suno engine, best quality |
| text_to_music (BGM only) | **DouBao BGM** | `GenBGM` | `GenBGM` | 30 pts | Background music |
| text_to_music (song) | **DouBao Song** | `GenSong` | `GenSong` | 30 pts | Song generation |

**Selection guide by use case:**
- Custom song with lyrics, vocals, style → **Suno sonic-v5** (default)
- Background music / ambient loop → **DouBao BGM**
- Simple song generation → **DouBao Song**
- User explicitly asks for cheapest → DouBao BGM/Song at the 6 pts tier, if the product list offers one (the standard tier is 30 pts)

> ⚠️ For Suno: `model_version` inside `parameters` (e.g. `sonic-v5`) is different from the outer `model_version` field (which is `sonic`). Always set both.

---

## 💬 User Experience Protocol (IM / Feishu / Discord) v1.1 🆕

> **v1.1 Update:** Added Step 0 to ensure correct message ordering in group chats (learned from ima-image-ai v1.2).
>
> Music generation completes in 10~45 seconds. **Never let users wait in silence.**  
> Always follow Steps 0–5 below, every single time (Step 3 on success, Step 4 on failure).

### 🚫 Never Say to Users

| ❌ Never say | ✅ What users care about |
|-------------|--------------------------|
| `ima_voice_create.py` / 脚本 / script | — |
| 自动化脚本 / automation | — |
| 自动处理产品列表 / 查询接口 | — |
| 自动解析参数 / 智能轮询 | — |
| attribute_id / model_version / form_config | — |
| API 调用 / HTTP 请求 / 任何技术参数名 | — |

Only tell users: **model name · estimated time · credits · result (audio file/player) · plain-language status**.

---

### Estimated Generation Time per Model

| Model | Estimated Time | Poll Every | Send Progress Every |
|-------|---------------|------------|---------------------|
| DouBao BGM | 10~25s | 5s | 10s |
| DouBao Song | 10~25s | 5s | 10s |
| Suno (sonic-v5) | 20~45s | 5s | 15s |

`estimated_max_seconds` = upper bound (e.g. 45 for Suno, 25 for DouBao BGM/Song).

---

### Step 0 — Initial Acknowledgment Reply (Normal Reply) 🆕

**⚠️ CRITICAL:** This step ensures correct message ordering in group chats.

**Before doing anything else**, reply to the user with a friendly acknowledgment message using your **normal reply** (not `message` tool). This reply will automatically appear FIRST in the conversation.

**Example acknowledgment messages:**
```
好的！马上帮你生成轻松的背景音乐 🎵
```
```
收到！用 Suno 为你创作一首歌 🎶
```
```
OK! Starting music generation with DouBao BGM 🎵
```

**Rules:**
- Keep it short and warm (< 15 words)
- Match the user's language (Chinese/English)
- Include relevant emoji (🎵/🎶/🎸/🎹)
- This is your ONLY normal reply — all subsequent updates use `message` tool

**Why this matters:**
- Normal replies automatically appear FIRST in the conversation thread
- `message` tool pushes appear in chronological order AFTER your initial reply
- This ensures users see: "好的！" → "🎵 开始生成..." → "✅ 成功!" (correct order)

---

### Step 1 — Pre-Generation Notification (Push via message tool)

**After Step 0 reply**, use the `message` tool to push a notification immediately to the group/channel:

```python
# Agent implementation
message(
    action="send",
    target=group_id,  # or channel_id for Discord
    message="""🎵 开始生成音乐，请稍候…
• 模型：[Model Name]
• 预计耗时：[X ~ Y 秒]
• 消耗积分：[N pts]"""
)
```

**User-facing message template:**
```
🎵 开始生成音乐，请稍候…
• 模型：[Model Name]
• 预计耗时：[X ~ Y 秒]
• 消耗积分：[N pts]
```

**Cost transparency:**
- Balanced (Suno 25 pts): "使用 Suno（25 积分，功能最全）"
- DouBao alternatives (30 pts each): "使用 DouBao BGM（30 积分）" — only if user explicitly requests DouBao or background music type

> Adapt language to match the user. English → `🎵 Starting music generation, please wait [X~Y] seconds…`

---

### Step 2 — Progress Updates

Poll the task detail API every **5s**.  
Send a progress update every `[Send Progress Every]` seconds per the table above.

```
⏳ 音乐生成中… [P]%
已等待 [elapsed]s，预计最长 [max]s
```

**Progress formula:**
```
P = min(95, floor(elapsed_seconds / estimated_max_seconds * 100))
```

- **Cap at 95%** — never show 100% until the API returns `success`
- If `elapsed > estimated_max`: keep P at 95% and append `「快好了，稍等…」`
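
The progress formula above translates directly into a small helper. The function name `progress_percent` is hypothetical; multiplying before dividing is a deliberate choice to avoid floating-point results such as 28.999… being floored one point too low.

```python
def progress_percent(elapsed_seconds: float, estimated_max_seconds: float) -> int:
    """Return the percentage to display, capped at 95 until the API reports success."""
    if estimated_max_seconds <= 0:
        raise ValueError("estimated_max_seconds must be positive")
    # Multiply first, then floor-divide, so whole-second inputs stay exact.
    raw = int(elapsed_seconds * 100 // estimated_max_seconds)
    return min(95, raw)
```

For Suno (`estimated_max_seconds = 45`), 20 seconds elapsed displays 44%, and anything from 43 seconds onward stays pinned at 95%.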

---

### Step 3 — Success Notification (Push audio via message tool)

When task status = `success`, use the `message` tool to **send the generated audio directly** (not as a text URL):

**Agent implementation:**
```python
# Get result URL from script output or task detail API
result = get_task_result(task_id)
audio_url = result["medias"][0]["url"]

# Push audio + caption to group/channel
message(
    action="send",
    target=group_id,
    media=audio_url,  # Feishu/Discord will render the audio
    caption=f"""✅ 音乐生成成功！
• 模型：[Model Name]
• 耗时：预计 [X~Y]s，实际 [actual]s
• 消耗积分：[N pts]

🔗 原始链接：{audio_url}"""
)
```

**User-facing message:**
```
✅ 音乐生成成功！
• 模型：[Model Name]
• 耗时：预计 [X~Y]s，实际 [actual]s
• 消耗积分：[N pts]

🔗 原始链接：https://ws.esxscloud.com/.../audio.wav

[音频直接显示为文件卡片，可点击播放]
```

**Platform-specific notes:**
- **Feishu**: `message(action=send, media=url, caption="...")` — caption appears with audio file card
- **Discord**: Audio embeds automatically from URL; caption can be in message text
- **Telegram**: Use `message(action=send, media=url, caption="...")`

**⚠️ Important**: 
- Always send audio via `media` parameter (file card/player) + include URL in caption text
- Do NOT use local file paths like `/tmp/audio.wav` — use HTTP URL from API
- Users expect: (1) clickable audio file card + (2) raw URL link for sharing/downloading
- Format: `media=audio_url` + `caption="...🔗 原始链接：{audio_url}"`

---

### Step 4 — Failure Notification (Push via message tool)

When task status = `failed` or any API/network error, push a failure message with alternative suggestions:

**Agent implementation:**
```python
message(
    action="send",
    target=group_id,
    message="""❌ 音乐生成失败
• 原因：[natural_language_error_message]
• 建议改用：
  - [Alt Model 1]（[特点]，[N pts]）
  - [Alt Model 2]（[特点]，[N pts]）

需要我帮你用其他模型重试吗？"""
)
```

**⚠️ CRITICAL: Error Message Translation**

**NEVER show technical error messages to users.** Always translate API errors into natural language.  
**API key & credits:** Key and credit management lives at imaclaw.ai (the same IMA platform as imastudio.com).

| Technical Error | ❌ Never Say | ✅ Say Instead (Chinese) | ✅ Say Instead (English) |
|----------------|-------------|------------------------|------------------------|
| `401 Unauthorized` 🆕 | Invalid API key / 401 Unauthorized | ❌ API密钥无效或未授权<br>💡 **生成新密钥**: https://www.imaclaw.ai/imaclaw/apikey | ❌ API key is invalid or unauthorized<br>💡 **Generate API Key**: https://www.imaclaw.ai/imaclaw/apikey |
| `4008 Insufficient points` 🆕 | Insufficient points / Error 4008 | ❌ 积分不足，无法创建任务<br>💡 **购买积分**: https://www.imaclaw.ai/imaclaw/subscription | ❌ Insufficient points to create this task<br>💡 **Buy Credits**: https://www.imaclaw.ai/imaclaw/subscription |
| `"Invalid product attribute"` / `"Insufficient points"` | Invalid product attribute | 生成参数配置异常，请稍后重试 | Configuration error, please try again later |
| `Error 6006` (credit mismatch) | Error 6006 | 积分计算异常，系统正在修复 | Points calculation error; a fix is in progress |
| `Error 6010` (attribute_id mismatch) | Attribute ID does not match | 模型参数不匹配，请尝试其他模型 | Model parameters incompatible, try another model |
| `error 400` (bad request) | error 400 / Bad request | 音乐参数设置有误，请调整描述后重试 | Music parameter error, adjust description and retry |
| `resource_status == 2` | Resource status 2 / Failed | 音乐生成遇到问题，建议换个模型试试 | Music generation failed, try another model |
| `status == "failed"` (no details) | Task failed | 这次生成没成功，要不换个模型试试？ | Generation unsuccessful, try a different model? |
| `timeout` | Task timed out / Timeout error | 音乐生成时间过长已超时，建议用更快的模型 | Music generation took too long, try a faster model |
| Network error / Connection refused | Connection refused / Network error | 网络连接不稳定，请检查网络后重试 | Network connection unstable, check network and retry |
| Rate limit exceeded | 429 Too Many Requests / Rate limit | 请求过于频繁，请稍等片刻再试 | Too many requests, please wait a moment |
| Model unavailable | Model not available / 503 Service Unavailable | 当前模型暂时不可用，建议换个模型 | Model temporarily unavailable, try another model |
| Lyrics format error (Suno only) | Invalid lyrics format | 歌词格式有误，请调整后重试 | Lyrics format error, adjust and retry |
| Prompt too short/long | Prompt length invalid | 音乐描述过短或过长，请调整到合适长度 | Music description too short or long, adjust length |

**Generic fallback (when error is unknown):**
- Chinese: `音乐生成遇到问题，请稍后重试或换个模型试试`
- English: `Music generation encountered an issue, please try again or use another model`
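
One way to apply the translation table is a lookup with the generic fallback as the default. The sketch below copies a few messages verbatim from the table; the dictionary keys (`"timeout"`, `"rate_limit"`, and so on) and the function name `friendly_error` are hypothetical labels, not identifiers returned by the API.

```python
# Keys are illustrative labels; message pairs are (Chinese, English) from the table.
FRIENDLY_ERRORS = {
    "6010": ("模型参数不匹配，请尝试其他模型",
             "Model parameters incompatible, try another model"),
    "timeout": ("音乐生成时间过长已超时，建议用更快的模型",
                "Music generation took too long, try a faster model"),
    "network": ("网络连接不稳定，请检查网络后重试",
                "Network connection unstable, check network and retry"),
    "rate_limit": ("请求过于频繁，请稍等片刻再试",
                   "Too many requests, please wait a moment"),
}

GENERIC_ERROR = ("音乐生成遇到问题，请稍后重试或换个模型试试",
                 "Music generation encountered an issue, please try again or use another model")


def friendly_error(kind: str, lang: str = "zh") -> str:
    """Return a user-facing message; unknown kinds get the generic fallback."""
    zh, en = FRIENDLY_ERRORS.get(kind, GENERIC_ERROR)
    return en if lang == "en" else zh
```

Because unknown kinds fall through to the generic message, a raw technical error string never reaches the user even when the table has no matching row.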

**Best Practices:**
1. **Focus on user action**: Tell users what to do next, not what went wrong technically
2. **Be reassuring**: Use phrases like "建议换个模型试试" instead of "生成失败了"
3. **Avoid blame**: Never say "你的描述有问题" → say "描述需要调整一下"
4. **Provide alternatives**: Always suggest 1-2 alternative models in the failure message
5. **Music-specific**: 
   - For Suno lyrics errors, suggest simplifying lyrics or using auto-generated lyrics
   - For prompt length errors, give example length (e.g., "建议20-100字")
   - For BGM requests, recommend DouBao BGM over Suno
6. **🆕 Include actionable links (v1.0.8+)**: For 401/4008 errors, provide clickable links to API key generation or credit purchase pages

**🆕 Enhanced Error Handling (v1.0.8):**

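As of v1.0.8, music generation used **direct error handling** with no Reflection mechanism. From v1.1.0, the Reflection mechanism described under "How This Skill Works" retries first, and the handling below covers errors that remain after those retries: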

- **401 Unauthorized**: System provides clickable link to API key generation page
- **4008 Insufficient Points**: System provides clickable link to credit purchase page
- Other errors: Clear natural language explanations with alternative model suggestions

Error messages are **user-friendly and actionable** — users receive clear next steps for resolution.

**Failure fallback table:**

| Failed Model | First Alt | Second Alt |
|-------------|-----------|------------|
| Suno | DouBao BGM（30pts，背景音乐） | DouBao Song（30pts，歌曲生成） |
| DouBao BGM | DouBao Song（30pts） | Suno（25pts，功能最强） |
| DouBao Song | DouBao BGM（30pts） | Suno（25pts，功能最强） |

---

### Step 5 — Done (No Further Action Needed) 🆕

**v1.1 Note:** After completing Steps 0-4:
- ✅ **Step 0** already sent your normal reply (appears FIRST in chat)
- ✅ **Steps 1-4** pushed all updates via `message` tool (appear in order)
- ✅ **No further action needed** — conversation is complete

**Do NOT:**
- ❌ Reply again with `NO_REPLY` (you already replied in Step 0)
- ❌ Send duplicate confirmation messages
- ❌ Use `message` tool to send the same content twice

**Why this works:**
```
User: "帮我生成一段轻松的背景音乐"
  ↓
[Step 0] Your normal reply:  "好的！马上帮你生成轻松的背景音乐 🎵"  ← Appears FIRST
  ↓
[Step 1] message tool push:  "🎵 开始生成音乐..."  ← Appears SECOND
  ↓
[Step 2] message tool push:  "⏳ 正在生成中… 45%"  ← (if task takes >15s)
  ↓
[Step 3] message tool push:  "✅ 音乐生成成功! [Audio File]"  ← Appears LAST
  ↓
[Step 5] Done. No further replies.
```

---

## Supported Models

### text_to_music (3 models)

| Name | model_id | version_id | Cost | Key form_config |
|------|----------|------------|------|-----------------|
| **Suno** | `sonic` | `sonic` | 25 pts | `model_version=sonic-v5` (latest), `custom_mode=true`, `make_instrumental`, `auto_lyrics`, `tags`, `negative_tags`, `vocal_gender`, `title` |
| DouBao BGM | `GenBGM` | `GenBGM` | 30 pts | — |
| DouBao Song | `GenSong` | `GenSong` | 30 pts | — |

**Model guidance:**
- **Suno**: Most powerful option. Supports full custom mode with genre tags, explicit instrumental toggle, vocal gender selection, and negative tags to exclude unwanted styles.
- **DouBao BGM**: Lightweight background music generation. Ideal for ambient / background tracks.
- **DouBao Song**: Song generation. Good for structured vocal compositions.

**What you can generate:**
- Background music (lo-fi, ambient, cinematic, electronic, jazz, classical…)
- Custom jingles or theme songs with specific BPM and key
- Vocal or instrumental tracks with mood direction
- Short loops or full-length compositions

**Prompt writing tips (for Suno `gpt_description_prompt`):**
- Genre: `"lo-fi hip hop"`, `"orchestral cinematic"`, `"upbeat pop"`, `"dark ambient"`
- Tempo: `"80 BPM"`, `"fast tempo"`, `"slow ballad"`
- Vocals: `"no vocals"` → set `make_instrumental=true`; `"female vocals"` → `vocal_gender="female"`
- Mood: `"happy and energetic"`, `"melancholic"`, `"tense and dramatic"`
- Negative: `negative_tags="heavy metal, distortion"` to exclude styles
- Duration hint: `"60 seconds"`, `"30 second loop"`

## Environment

Base URL: `https://api.imastudio.com`

Required/recommended headers for all `/open/v1/` endpoints:

| Header | Required | Value | Notes |
|--------|----------|-------|-------|
| `Authorization` | ✅ | `Bearer ima_your_api_key_here` | API key authentication |
| `x-app-source` | ✅ | `ima_skills` | Fixed value — identifies skill-originated requests |
| `x_app_language` | recommended | `en` / `zh` | Product label language; defaults to `en` if omitted |

```
Authorization: Bearer ima_your_api_key_here
x-app-source: ima_skills
x_app_language: en
```
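
As a small illustration, the header table above can be assembled in one place so every request sends the same set. The function name `build_headers` is hypothetical; the header names and the fixed `ima_skills` value come from the table.

```python
def build_headers(api_key: str, language: str = "en") -> dict:
    """Build the headers for /open/v1/ endpoints."""
    return {
        "Authorization": f"Bearer {api_key}",
        "x-app-source": "ima_skills",  # fixed value
        "x_app_language": language,    # "en" or "zh"
    }
```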

---

## ⚠️ MANDATORY: Always Query Product List First

> **CRITICAL**: You MUST call `/open/v1/product/list` BEFORE creating any task.  
> The `attribute_id` field is REQUIRED in the create request. If it is `0` or missing, you get:  
> `"Invalid product attribute"` → `"Insufficient points"` → task fails completely.  
> **NEVER construct a create request from the model table alone. Always fetch the product first.**

### How to get attribute_id

```python
# Step 1: Query the product list and parse the JSON body into `response`:
#   GET /open/v1/product/list?app=ima&platform=web&category=text_to_music

# Step 2: Walk the tree to find your model (`target_model_id`, e.g. "sonic")
for group in response["data"]:
    for version in group.get("children", []):
        # type 3 marks a version leaf; str() accepts either "3" or 3
        if str(version["type"]) == "3" and version["model_id"] == target_model_id:
            attribute_id  = version["credit_rules"][0]["attribute_id"]
            credit        = version["credit_rules"][0]["points"]
            model_version = version["id"]
            model_name    = version["name"]
```

### Quick Reference: Known attribute_ids

⚠️ **Production warning**: `attribute_id` and `credit` values change frequently. Always call `/open/v1/product/list` at runtime; the table below is a pre-queried reference snapshot (2026-02-27).

| Model | model_id | attribute_id | credit | Notes |
|-------|----------|-------------|--------|-------|
| Suno (sonic-v4) | `sonic` | **2370** | 25 pts | Default |
| DouBao BGM | `GenBGM` | **4399** | 30 pts | BGM only |
| DouBao Song | `GenSong` | **4398** | 30 pts | Songs only |
| All others | — | → query `/open/v1/product/list` | — | Always runtime query |

### Common Mistakes (and resulting errors)

| Mistake | Error |
|---------|-------|
| `attribute_id` is 0 or missing | `"Invalid product attribute"` → Insufficient points |
| `attribute_id` outdated (production changed) | Same errors; always query product list first |
| `prompt` at outer level | Prompt ignored |
| `cast` missing from inner `parameters` | Billing failure |
| Suno: `model_version` in `parameters` not set to `sonic-v5` | Wrong engine used |

---

## Core Flow

```
1. GET /open/v1/product/list?app=ima&platform=web&category=text_to_music
   → REQUIRED: Get attribute_id, credit, model_version, form_config defaults

2. POST /open/v1/tasks/create
   → Must include: attribute_id, model_name, model_version, credit, cast, prompt (nested!)

3. POST /open/v1/tasks/detail  {task_id: "..."}
   → Poll every 3–5s until medias[].resource_status == 1
   → Extract url from completed media (mp3)
```

---

## Supported Task Types

| category | Capability | Input |
|----------|------------|-------|
| `text_to_music` | Text → Music | prompt |

---

## Detail API status values

| Field | Type | Values |
|-------|------|--------|
| **`resource_status`** | int or `null` | `0`=processing, `1`=available, `2`=failed, `3`=deleted; treat `null` as 0 |
| **`status`** | string | `"pending"`, `"processing"`, `"success"`, `"failed"` |

| `resource_status` | `status` | Action |
|-------------------|----------|--------|
| `0` or `null` | `pending` / `processing` | Keep polling |
| `1` | `success` (or `completed`) | Stop when **all** medias are 1; read `url` |
| `1` | `failed` | Stop, handle error |
| `2` / `3` | any | Stop, handle error |

> **Important**: Treat `resource_status: null` as 0. Stop only when **all** medias have `resource_status == 1`. Even when `resource_status` is 1, confirm that `status != "failed"` before reading `url`.
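
The decision table above can be sketched as a single function that maps one Detail API response to `"poll"`, `"done"`, or `"error"`. This is a minimal sketch, not the service's own logic: the name `classify_task` is hypothetical, and because the tables do not say whether `status` sits on the task or on each media entry, the sketch assumes it may appear at either level and checks both.

```python
from typing import Any, Dict, List


def classify_task(task: Dict[str, Any]) -> str:
    """Return "poll", "done", or "error" for one Detail API response (hypothetical helper)."""
    medias: List[Dict[str, Any]] = task.get("medias") or []
    if not medias:
        return "poll"  # nothing produced yet, keep polling

    def rs(media: Dict[str, Any]) -> int:
        value = media.get("resource_status")
        return 0 if value is None else value  # null counts as 0

    # Assumption: "status" may live on the task or on each media entry.
    # A "failed" status wins even when resource_status is 1.
    if task.get("status") == "failed" or any(m.get("status") == "failed" for m in medias):
        return "error"
    if any(rs(m) in (2, 3) for m in medias):  # 2 = failed, 3 = deleted
        return "error"
    if all(rs(m) == 1 for m in medias):  # every media must be available
        return "done"
    return "poll"
```

A polling loop can call this once per response and only read `url` when the result is `"done"`.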

---

## API 1: Product List

```
GET /open/v1/product/list?app=ima&platform=web&category=text_to_music
```

Returns a **V2 tree structure**: `type=2` nodes are model groups, `type=3` nodes are versions (leaves). Only `type=3` nodes contain `credit_rules` and `form_config`.

**How to pick a version:**
1. Traverse nodes to find `type=3` leaves
2. Use `model_id` and `id` (= `model_version`) from the leaf
3. Pick `credit_rules[].attribute_id`
4. Use `form_config[].value` as default `parameters` values
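
The four steps above can be sketched as a recursive walk that works at any tree depth. This is an illustration under assumptions: `iter_versions` and `pick_version` are hypothetical names, the `field` key inside `form_config` entries is taken from this document's Python example, and the sketch simply uses the first entry of `credit_rules`, which may not be the right rule for every model.

```python
from typing import Any, Dict, Iterator, List, Optional


def iter_versions(nodes: List[Dict[str, Any]]) -> Iterator[Dict[str, Any]]:
    """Yield every type=3 leaf, however deeply it is nested."""
    for node in nodes or []:
        if str(node.get("type")) == "3":  # tolerate "3" or 3
            yield node
        yield from iter_versions(node.get("children") or [])


def pick_version(nodes: List[Dict[str, Any]], model_id: str) -> Optional[Dict[str, Any]]:
    """Collect the create-task fields for the first leaf matching model_id."""
    for leaf in iter_versions(nodes):
        rules = leaf.get("credit_rules") or []
        if leaf.get("model_id") != model_id or not rules:
            continue
        return {
            "model_id": leaf["model_id"],
            "model_name": leaf.get("name"),
            "model_version": leaf["id"],
            # Assumption: the first credit rule is the one to use.
            "attribute_id": rules[0]["attribute_id"],
            "credit": rules[0]["points"],
            "defaults": {
                f["field"]: f["value"]
                for f in leaf.get("form_config") or []
                if f.get("value") is not None
            },
        }
    return None  # model_id not present in this category
```

Returning `None` for an unknown `model_id` lets the caller fall back to another model instead of sending a stale `attribute_id`.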

---

## API 2: Create Task

```
POST /open/v1/tasks/create
```

### text_to_music

No image input: send `src_img_url: []` and `input_images: []`.

```json
{
  "task_type": "text_to_music",
  "enable_multi_model": false,
  "src_img_url": [],
  "parameters": [{
    "attribute_id":  "<from credit_rules>",
    "model_id":      "<model_id>",
    "model_name":    "<model_name>",
    "model_version": "<version_id>",
    "app":           "ima",
    "platform":      "web",
    "category":      "text_to_music",
    "credit":        "<points>",
    "parameters": {
      "prompt":       "upbeat electronic, 120 BPM, no vocals",
      "n":            1,
      "input_images": [],
      "cast":         {"points": "<points>", "attribute_id": "<attribute_id>"}
    }
  }]
}
```

**Prompt tips for music generation:**
- Genre: `"upbeat electronic"`, `"classical piano"`, `"ambient chill"`
- Tempo: `"120 BPM"`, `"slow tempo"`
- Vocals: `"no vocals"`, `"male vocals"`, `"female vocals"`
- Mood: `"happy"`, `"melancholic"`, `"energetic"`
- Duration hint: `"60 seconds"`, `"short loop"`
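
One way to combine these tips is a small helper that joins whichever hints are present into a comma-separated prompt, matching the style of the example prompt in the request body. The helper name `build_prompt` and the ordering of the parts are assumptions made for illustration; nothing in this document says the models require a particular order.

```python
from typing import Optional


def build_prompt(
    genre: str,
    tempo: Optional[str] = None,
    vocals: Optional[str] = None,
    mood: Optional[str] = None,
    duration: Optional[str] = None,
) -> str:
    """Join the provided hints into one comma-separated prompt (hypothetical helper)."""
    parts = [genre, mood, tempo, vocals, duration]
    # Skip hints that were not given or are only whitespace.
    return ", ".join(p.strip() for p in parts if p and p.strip())


prompt = build_prompt("upbeat electronic", tempo="120 BPM", vocals="no vocals")
print(prompt)  # upbeat electronic, 120 BPM, no vocals
```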

**Key fields**:

| Field | Required | Description |
|-------|----------|-------------|
| `parameters[].credit` | ✅ | Must equal `credit_rules[].points`. Error 6006 if wrong. |
| `parameters[].parameters.prompt` | ✅ | Prompt must be nested here, NOT at top level. |
| `parameters[].parameters.cast` | ✅ | `{"points": N, "attribute_id": N}` — mirror of credit. |
| `parameters[].parameters.n` | ✅ | Number of outputs (usually `1`). |

Response: `data.id` = task ID for polling.

---

## API 3: Task Detail (Poll)

```
POST /open/v1/tasks/detail
{"task_id": "<id from create response>"}
```

Poll every 3–5s. Completed response:

```json
{
  "id": "task_abc",
  "medias": [{
    "resource_status": 1,
    "url":          "https://cdn.../output.mp3",
    "duration_str": "60s",
    "format":       "mp3"
  }]
}
```

Output fields: `url` (mp3), `duration_str`, `format`.

---

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| Placing `prompt` at param top-level | `prompt` must be inside `parameters[].parameters` |
| Wrong `credit` value | Must exactly match `credit_rules[].points` (error 6006) |
| Missing `app` / `platform` in parameters | Required — use `ima` / `web` |
| Single-poll instead of loop | Poll until `resource_status == 1` for ALL medias |
| Not checking `status != "failed"` | `resource_status=1` + `status="failed"` = actual failure |
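
Several of these mistakes can be caught before the request is sent. The sketch below is a hypothetical pre-flight check (`check_create_body` is not part of the API) that inspects a create-task body for the structural problems listed above. It only verifies internal consistency; it cannot confirm that `credit` matches the live `credit_rules[].points`, which still requires the product list.

```python
from typing import Any, Dict, List


def check_create_body(body: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in a tasks/create body."""
    problems: List[str] = []
    if "prompt" in body:
        problems.append("prompt is at the top level; nest it in parameters[].parameters")
    entries = body.get("parameters") or []
    if not entries:
        problems.append("parameters list is empty")
    for i, entry in enumerate(entries):
        where = f"parameters[{i}]"
        if "prompt" in entry:
            problems.append(f"{where}.prompt must move into {where}.parameters")
        for key in ("app", "platform"):
            if not entry.get(key):
                problems.append(f"{where}.{key} is missing")
        if not entry.get("attribute_id"):
            problems.append(f"{where}.attribute_id is 0 or missing")
        inner = entry.get("parameters") or {}
        if not inner.get("prompt"):
            problems.append(f"{where}.parameters.prompt is missing")
        cast = inner.get("cast")
        if not cast:
            problems.append(f"{where}.parameters.cast is missing")
        else:
            # cast should mirror the outer credit and attribute_id.
            if cast.get("points") != entry.get("credit"):
                problems.append(f"{where}: cast.points differs from credit")
            if cast.get("attribute_id") != entry.get("attribute_id"):
                problems.append(f"{where}: cast.attribute_id differs from attribute_id")
    return problems
```

An empty list means the body passed these checks; any returned strings can be logged or raised before calling `/open/v1/tasks/create`.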

---

## Python Example

```python
import time
import requests

BASE_URL = "https://api.imastudio.com"
API_KEY  = "ima_your_key_here"
HEADERS  = {
    "Authorization":  f"Bearer {API_KEY}",
    "Content-Type":   "application/json",
    "x-app-source":   "ima_skills",
    "x_app_language": "en",
}


def get_products(category: str) -> list:
    """Returns flat list of type=3 version nodes from V2 tree."""
    r = requests.get(
        f"{BASE_URL}/open/v1/product/list",
        headers=HEADERS,
        params={"app": "ima", "platform": "web", "category": category},
    )
    r.raise_for_status()
    nodes = r.json()["data"]
    versions = []
    for node in nodes:
        for child in node.get("children") or []:
            if child.get("type") == "3":
                versions.append(child)
            for gc in child.get("children") or []:
                if gc.get("type") == "3":
                    versions.append(gc)
    return versions


def create_music_task(prompt: str, product: dict) -> str:
    """Returns task_id."""
    rule = product["credit_rules"][0]
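    form_defaults = {f["field"]: f["value"] for f in product.get("form_config") or [] if f.get("value") is not None}

    # Spread form defaults first so the explicit values below always win
    # (otherwise a default "prompt" or "cast" field would overwrite them).
    nested_params = {
        **form_defaults,
        "prompt": prompt,
        "n":      1,
        "input_images": [],
        "cast":   {"points": rule["points"], "attribute_id": rule["attribute_id"]},
    }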

    body = {
        "task_type":          "text_to_music",
        "enable_multi_model": False,
        "src_img_url":        [],
        "parameters": [{
            "attribute_id":  rule["attribute_id"],
            "model_id":      product["model_id"],
            "model_name":    product["name"],
            "model_version": product["id"],
            "app":           "ima",
            "platform":      "web",
            "category":      "text_to_music",
            "credit":        rule["points"],
            "parameters":    nested_params,
        }],
    }
    r = requests.post(f"{BASE_URL}/open/v1/tasks/create", headers=HEADERS, json=body)
    r.raise_for_status()
    return r.json()["data"]["id"]


def poll(task_id: str, interval: int = 3, timeout: int = 300) -> dict:
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.post(f"{BASE_URL}/open/v1/tasks/detail", headers=HEADERS, json={"task_id": task_id})
        r.raise_for_status()
        task   = r.json()["data"]
        medias = task.get("medias", [])
        if medias:
            if any(m.get("status") == "failed" for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            # Treat a null resource_status as 0 (still processing).
            statuses = [m.get("resource_status") or 0 for m in medias]
            # 2 = failed, 3 = deleted: both are terminal errors.
            if any(s in (2, 3) for s in statuses):
                raise RuntimeError(f"Task failed: {task_id}")
            # Done only when every media is available.
            if all(s == 1 for s in statuses):
                return task
        time.sleep(interval)
    raise TimeoutError(f"Task timed out: {task_id}")


# text_to_music
products = get_products("text_to_music")
task_id  = create_music_task("upbeat electronic, 120 BPM, no vocals", products[0])
result   = poll(task_id)
print(result["medias"][0]["url"])          # mp3 URL
print(result["medias"][0]["duration_str"]) # e.g. "60s"
```

---

## Supported Models & Search Terms

**Models:** Suno sonic v4, Suno sonic v5, DouBao BGM (GenBGM), DouBao Song (GenSong)

**Capabilities:** music generation, text-to-music, AI music, background music, BGM, soundtrack, jingle, song with lyrics, vocal, instrumental, ambient music, audio generation