---
name: IMA Studio Video Generation
description: "Premier AI video generation with models: Wan 2.6, Kling O1, Kling 2.6, Google Veo 3.1, Sora 2 Pro, Pixverse V5.5, Hailuo 2.0, Hailuo 2.3, SeeDance 1.5 Pro, Vidu Q2. One-stop access across text-to-video, image-to-video, first-last-frame, reference-image modes with knowledge base guidance. BEFORE using: READ ima-knowledge-ai skill for workflow design and visual consistency. Use for: video generation, text-to-video, image-to-video, character animation, product demos, social media clips, storytelling, explainer videos, multi-shot production, character consistency via reference images. Requires IMA API key."
---

# IMA Video AI Creation

## 📋 Before you install

- **Credentials:** This skill requires an **IMA API key** at runtime (`IMA_API_KEY` or `--api-key`). The key is sent to **api.imastudio.com** (main API) and **imapi.liveme.com** (image uploads). Obtain keys at https://imastudio.com. Use a scoped or test key if you want to limit exposure.
- **Local files:** The skill reads **local image files** you provide (for image-to-video); it also writes logs under `~/.openclaw/logs/ima_skills/` and preferences to `~/.openclaw/memory/ima_prefs.json`. Do not point it at sensitive paths.
- **Cross-skill reads:** If **ima-knowledge-ai** is installed, this skill instructs the agent to read that skill's reference files (`~/.openclaw/skills/ima-knowledge-ai/references/*`) for workflow and visual-consistency guidance. If you do not have or trust that skill, skip those steps and use this skill's built-in defaults and tables.

---

## ⚠️ 重要：模型 ID 参考

**CRITICAL:** When calling the script, you MUST use the exact **model_id** (second/third column), NOT the friendly model name. Do NOT infer model_id from the friendly name.

**Quick Reference Table:**

| 友好名称 (Friendly Name) | model_id (t2v) | model_id (i2v) | 说明 (Notes) |
|-------------------------|---------------|----------------|-------------|
| Wan 2.6 | `wan2.6-t2v` | `wan2.6-i2v` | ⚠️ Note -t2v/-i2v suffix |
| Kling O1 | `kling-video-o1` | `kling-video-o1` | ⚠️ Note video- prefix |
| Kling 2.6 | `kling-v2-6` | `kling-v2-6` | ⚠️ Note v prefix |
| Hailuo 2.3 | `MiniMax-Hailuo-2.3` | `MiniMax-Hailuo-2.3` | ⚠️ Note MiniMax- prefix |
| Hailuo 2.0 | `MiniMax-Hailuo-02` | `MiniMax-Hailuo-02` | ⚠️ Note 02 not 2.0 |
| Vidu Q2 | `viduq2` | `viduq2-pro` | ⚠️ Different for t2v/i2v |
| Google Veo 3.1 | `veo-3.1-generate-preview` | `veo-3.1-generate-preview` | ⚠️ Note -generate-preview suffix |
| Sora 2 Pro | `sora-2-pro` | `sora-2-pro` | ✅ Straightforward |
| Pixverse | `pixverse` | `pixverse` | ✅ Same as friendly name |
| SeeDance 1.5 Pro | `doubao-seedance-1.5-pro` | `doubao-seedance-1.5-pro` | ⚠️ Note doubao- prefix |

**User Input Variations Handled by Agent:**
- "万" / "万2.6" / "Wan" → Wan 2.6 → `wan2.6-t2v` / `wan2.6-i2v`
- "可灵" / "可灵O1" / "Kling O1" → `kling-video-o1`
- "可灵2.6" / "Kling 2.6" → `kling-v2-6`
- "海螺" / "海螺2.3" / "Hailuo" → `MiniMax-Hailuo-2.3`
- "Veo" / "Google Veo" → `veo-3.1-generate-preview`

**How to get the correct model_id:**
1. Check this table first
2. Use `--list-models --task-type text_to_video` (or `image_to_video`)
3. Refer to command examples below

**Example:**
```bash
# ❌ WRONG: Inferring from friendly name
--model-id kling-o1

# ✅ CORRECT: Using exact model_id from table
--model-id kling-video-o1
```

---

## ⚠️ MANDATORY PRE-CHECK: Read Knowledge Base First!

**If ima-knowledge-ai is not installed:** Skip all "Read …" steps below; use only this SKILL's default models and the **📥 User Input Parsing** tables for task_type, model_id, and parameters.

**BEFORE executing ANY video generation task, you MUST:**

1. **CRITICAL: Understand video modes** — Read `ima-knowledge-ai/references/video-modes.md`:
   - **image_to_video** = first frame to video (输入图**成为第1帧**)
   - **reference_image_to_video** = reference appearance to video (输入图是**视觉参考**，不是第1帧)
   - These are COMPLETELY DIFFERENT concepts!
   - Wrong mode choice = wrong result

2. **Check for visual consistency needs** — Read `ima-knowledge-ai/references/visual-consistency.md` if:
   - User mentions: "系列"、"分镜"、"同一个"、"角色"、"续"、"多个镜头"
   - Task involves: multi-shot videos, character continuity, scene consistency
   - Second+ request about same subject (e.g., "旺财在游泳" after "生成旺财照片")

3. **Check workflow/model/parameters** — Read relevant `ima-knowledge-ai/references/` sections if:
   - Complex multi-step video production
   - Unsure which model to use
   - Need parameter guidance (duration, resolution, reference strength)

**Why this matters:**
- AI video generation defaults to **独立生成** (independent generation) each time
- Without reference images, "same character/scene" will look completely different
- **Text-to-video CANNOT maintain visual consistency** — must use image-based modes

**Example failure case:**
```
User: "生成一只小狗，叫旺财" 
  → You: generate dog image A

User: "生成旺财在游泳的视频"
  → ❌ Wrong: text_to_video "狗在游泳" (new dog, different from A)
  → ✅ Right: read visual-consistency.md + video-modes.md → 
             use image_to_video with image A as first frame
```

**How to check:**
```python
# Step 1: Read knowledge base
read("~/.openclaw/skills/ima-knowledge-ai/references/video-modes.md")
read("~/.openclaw/skills/ima-knowledge-ai/references/visual-consistency.md")

# Step 2: Identify if reference image needed
if "same subject" or "series" or "character continuity":
    # Use image-based mode with previous result as reference
    reference_image = previous_generation_result
    
    # Choose mode based on requirement
    if "reference becomes first frame":
        use_image_to_video(prompt, reference_image)
    else:
        use_reference_image_to_video(prompt, reference_image, reference_strength=0.8)
else:
    # OK to use text-to-video
    use_text_to_video(prompt)
```

**No exceptions** — if you skip this check and generate visually inconsistent results, that's a bug.

---

## 📥 User Input Parsing (Model & Parameter Recognition)

**Purpose:** So that any agent (Claude or other models) parses user intent consistently, follow these rules when deriving **task_type**, **model_id**, and **parameters** from natural language. Do not guess — normalize first, then map.

### 1. User phrasing → task_type

| User intent / phrasing | task_type | Notes |
|------------------------|-----------|--------|
| Only text, no image | `text_to_video` | "生成一段…视频" / "text to video" |
| One image as **first frame** (图成为第1帧) | `image_to_video` | "把这张图动起来" / "用这张图做视频" / "图生视频" |
| One image as **reference** (视觉参考，非第1帧) | `reference_image_to_video` | "参考这张图生成" / "像这张风格/角色" |
| Two images (start + end) | `first_last_frame_to_video` | "首帧+尾帧" / "从A过渡到B" |

When in doubt: "把图动起来" / "图动" → `image_to_video`; "参考这张图" / "按这张风格" → `reference_image_to_video`.

### 2. Model name / alias → model_id (normalize then lookup)

Normalize user wording (case-insensitive, ignore spaces), then map to **model_id**:

| User says (examples) | For t2v → model_id | For i2v → model_id |
|----------------------|--------------------|---------------------|
| 万 / Wan / 万2.6 / wan2.6 | `wan2.6-t2v` | `wan2.6-i2v` |
| 可灵 / Kling / Kling O1 / 可灵O1 | `kling-video-o1` | `kling-video-o1` |
| Kling 2.6 / 可灵2.6 | `kling-v2-6` | `kling-v2-6` |
| 海螺 / Hailuo / 海螺2.3 | `MiniMax-Hailuo-2.3` | `MiniMax-Hailuo-2.3` |
| Hailuo 2.0 / 海螺2.0 | `MiniMax-Hailuo-02` | `MiniMax-Hailuo-02` |
| Vidu / Vidu Q2 | `viduq2` | (i2v: `viduq2-pro` for "Vidu Q2 Pro") |
| Veo / Google Veo / Veo 3.1 | `veo-3.1-generate-preview` | `veo-3.1-generate-preview` |
| Sora / Sora 2 Pro | `sora-2-pro` | `sora-2-pro` |
| Pixverse / Pixverse V5.5 | `pixverse` | `pixverse` |
| 最便宜 / 最省钱 / cheapest / budget | `viduq2` (5 pts) | `wan2.6-i2v` or per product list |
| 最好 / 最高质量 / best / premium | Prefer Kling O1 / Veo 3.1 | Same |

If the user names a model not in the table, match by **Name** in the "Supported Models" tables below and use its **model_id** for the chosen task_type.

### 3. User phrasing → duration / resolution / aspect_ratio

| User says (examples) | Parameter | Normalized value | Fallback if unsupported |
|----------------------|-----------|------------------|--------------------------|
| 5秒 / 5s / 5 second | duration | 5 | — |
| 10秒 / 10s | duration | 10 | — |
| 15秒 / 15s | duration | 15 | — |
| 1分钟 / 1 min | duration | — | Use 15 if model max is 15s; tell user "当前最长15秒" |
| 横屏 / 16:9 / 横向 | aspect_ratio | 16:9 | — |
| 竖屏 / 9:16 / 竖向 | aspect_ratio | 9:16 | — |
| 1:1 / 方形 | aspect_ratio | 1:1 | — |
| 720P / 720p | resolution | 720P | — |
| 1080P / 1080p / 高清 | resolution | 1080P | — |
| 4K / 4k | resolution | 4K | Only if model supports (e.g. Veo 3.1) |

If the user does not specify duration/resolution/aspect_ratio, use **form_config defaults** from the product list for the chosen model (e.g. 5s, 720P or 1080P, 16:9).

---

## ⚙️ How This Skill Works

**For transparency:** This skill uses a bundled Python script (`scripts/ima_video_create.py`) to call the IMA Open API. The script:
- Sends your prompt to IMA's servers (two domains, see below)
- Uses `--user-id` **only locally** as a key for storing your model preferences
- Returns a video URL when generation is complete

### 🌐 Network Endpoints Used

This skill connects to **two domains** owned by IMA Studio for complete functionality:

| Domain | Purpose | What's Sent | Authentication |
|--------|---------|-------------|----------------|
| `api.imastudio.com` | Main API (task creation, status polling) | Prompts, model params, task IDs | Bearer token (IMA API key) |
| `imapi.liveme.com` | Image upload service (OSS token generation) | Image files (for i2v/ref tasks), IMA API key | IMA API key + APP_KEY signature |

**Why two domains?**
- `api.imastudio.com`: IMA's video generation API (handles task orchestration)
- `imapi.liveme.com`: IMA's media storage infrastructure (handles large file uploads)
- Both services are **owned and operated by IMA Studio**

**Privacy implications:**
- Your IMA API key is sent to **both domains** for authentication
- Image files are uploaded to `imapi.liveme.com` to obtain CDN URLs (for image_to_video, first_last_frame_to_video, reference_image_to_video tasks)
- Video generation happens on `api.imastudio.com` using the CDN URLs
- For text_to_video tasks (no image input), only `api.imastudio.com` is contacted

**Security verification:**
```bash
# List all network endpoints in the code:
grep -n "https://" scripts/ima_video_create.py

# Expected output:
# 57: DEFAULT_BASE_URL = "https://api.imastudio.com"
# 58: DEFAULT_IM_BASE_URL = "https://imapi.liveme.com"
```

**If you're concerned about the two-domain architecture:**
1. Review IMA Studio's privacy policy at https://imastudio.com/privacy
2. Contact IMA technical support to confirm domain ownership: support@imastudio.com
3. Use a test/scoped API key first (see security notice below)

### ⚠️ Credential Security Notice

**Your IMA API key is sent to TWO domains:**
1. `api.imastudio.com` — Main video generation API
2. `imapi.liveme.com` — Image upload service (only when using image-to-video tasks)

**Both domains are owned by IMA Studio**, but if you're concerned about credential exposure:

✅ **Best practices:**
- Use a **test/scoped API key** for initial testing (create at https://imastudio.com/api-keys)
- Set a low quota (e.g., 100 credits) for the test key
- Rotate your key after testing if needed
- Contact IMA support to confirm domain ownership: support@imastudio.com

❌ **Do NOT:**
- Use a production key if you're uncomfortable with the two-domain architecture
- Share your API key with others
- Commit your API key to version control

**What gets sent to IMA servers:**
- ✅ Your video prompt/description
- ✅ Model selection (Wan/Hailuo/Kling/etc.)
- ✅ Video parameters (duration, resolution, etc.)
- ✅ Image files (for image-to-video tasks, uploaded to `imapi.liveme.com`)
- ✅ IMA API key (for authentication to both domains)
- ❌ NO user_id (it's only used locally)

**What's stored locally:**
- `~/.openclaw/memory/ima_prefs.json` - Your model preferences (< 1 KB)
- `~/.openclaw/logs/ima_skills/` - Generation logs (auto-deleted after 7 days)

### Agent Execution (Internal Reference)

> **Note for users:** You can review the script source at `scripts/ima_video_create.py` anytime.  
> The agent uses this script to simplify API calls. Network requests go to two IMA Studio domains: `api.imastudio.com` (API) and `imapi.liveme.com` (image uploads).

Use the bundled script internally to ensure correct parameter construction:

```bash
# Text to video
python3 {baseDir}/scripts/ima_video_create.py \
  --api-key  $IMA_API_KEY \
  --task-type text_to_video \
  --model-id  wan2.6-t2v \
  --prompt   "a puppy runs across a sunny meadow, cinematic" \
  --user-id  {user_id} \
  --output-json

# Image to video
python3 {baseDir}/scripts/ima_video_create.py \
  --api-key      $IMA_API_KEY \
  --task-type    image_to_video \
  --model-id     wan2.6-i2v \
  --prompt       "camera slowly zooms in" \
  --input-images https://example.com/photo.jpg \
  --user-id      {user_id} \
  --output-json
```

**✅ Local images:** `--input-images` accepts both HTTPS URLs and **local file paths**. Local files are automatically uploaded to IMA CDN by the script (no need to host them first).

```bash
# First-last frame to video
python3 {baseDir}/scripts/ima_video_create.py \
  --api-key      $IMA_API_KEY \
  --task-type    first_last_frame_to_video \
  --model-id     kling-video-o1 \
  --prompt       "smooth transition" \
  --input-images https://example.com/first.jpg https://example.com/last.jpg \
  --user-id      {user_id} \
  --output-json
```

The script outputs JSON — parse it to get the result URL and pass it to the user via the UX protocol messages below.

**🚨 CRITICAL: How to send the video to user (Feishu/Discord/IM)**

```python
# ✅ CORRECT: Use the remote URL directly
video_url = json_output["url"]
message(
    action="send",
    media=video_url,  # Direct HTTPS URL → renders inline video player
    caption="✅ 视频生成成功！\n• 模型：[Model Name]\n• 耗时：[X]s\n• 消耗积分：[N pts]"
)

# ❌ WRONG: Download to local file first
# curl -o /tmp/video.mp4 {video_url}
# message(media="/tmp/video.mp4")  # Shows as file attachment (📎 path), NOT playable
```

**Why this matters:**
- ✅ Remote URL → Feishu renders inline video player with ▶ button
- ❌ Local file path → Feishu shows file attachment (📎 /tmp/...), not playable

**Always use the remote URL directly. Never download the video to local storage.**

---

## Overview


---

## 🛡️ Model-Specific Notes

### Sora 2 Pro — Content Safety Policy

**⚠️ Important**: Sora 2 Pro has **strict content safety policies** (OpenAI policy).

**Content Restrictions**:
- ❌ Cannot generate: people, celebrities, IP assets (e.g., Mickey Mouse)
- ❌ Strict prompt moderation
- ✅ Safe themes: landscapes, abstract patterns, animals, nature scenes

**Recommended Prompts**:
- ✅ "A sunset over mountains"
- ✅ "Abstract colorful flowing patterns"
- ✅ "A bird flying through clouds"

**Avoid**:
- ❌ "A person walking" (people)
- ❌ "Mickey Mouse dancing" (IP asset)
- ❌ Celebrity names or recognizable figures

If your prompt is rejected, try using more abstract or nature-focused descriptions.

---

Call IMA Open API to create AI-generated videos. All endpoints require an `ima_*` API key. The core flow is: **query products → create task → poll until done**.

---

## 🔒 Security & Transparency Policy

> **This skill is community-maintained and open for inspection.**

### ✅ What Users CAN Do

**Full transparency:**
- ✅ **Review all source code**: Check `scripts/ima_video_create.py` and `ima_logger.py` anytime
- ✅ **Verify network calls**: Network requests go to two IMA Studio domains: `api.imastudio.com` (API) and `imapi.liveme.com` (image uploads). See "🌐 Network Endpoints Used" section above for full details.
- ✅ **Inspect local data**: View `~/.openclaw/memory/ima_prefs.json` and log files
- ✅ **Control privacy**: Delete preferences/logs anytime, or disable file writes (see below)

**Configuration allowed:**
- ✅ **Set API key** in environment or agent config:
  - Environment variable: `export IMA_API_KEY=ima_your_key_here`
  - OpenClaw/MCP config: Add `IMA_API_KEY` to agent's environment configuration
  - Get your key at: https://imastudio.com
- ✅ **Use scoped/test keys**: Test with limited API keys, rotate after testing
- ✅ **Disable file writes**: Make prefs/logs read-only or symlink to `/dev/null`

**Data control:**
- ✅ **View stored data**: `cat ~/.openclaw/memory/ima_prefs.json`
- ✅ **Delete preferences**: `rm ~/.openclaw/memory/ima_prefs.json` (resets to defaults)
- ✅ **Delete logs**: `rm -rf ~/.openclaw/logs/ima_skills/` (auto-cleanup after 7 days anyway)

### ⚠️ Advanced Users: Fork & Modify

If you need to modify this skill for your use case:
1. **Fork the repository** (don't modify the original)
2. **Update your fork** with your changes
3. **Test thoroughly** with limited API keys
4. **Document your changes** for troubleshooting

**Note:** Modified skills may break API compatibility or introduce security issues. Official support only covers the unmodified version.

### ❌ What to AVOID (Security Risks)

**Actions that could compromise security:**
- ❌ Sharing API keys publicly or in skill files
- ❌ Modifying API endpoints to unknown servers
- ❌ Disabling SSL/TLS certificate verification
- ❌ Logging sensitive user data (prompts, IDs, etc.)
- ❌ Bypassing authentication or billing mechanisms

**Why this matters:**
1. **API Compatibility**: Skill logic aligns with IMA Open API schema
2. **Security**: Malicious modifications could leak credentials or bypass billing
3. **Support**: Modified skills may not be supported
4. **Community**: Breaking changes affect all users

### 📋 Privacy & Data Handling Summary

**What this skill does with your data:**

| Data Type | Sent to IMA? | Stored Locally? | User Control |
|-----------|-------------|-----------------|--------------|
| Video prompts | ✅ Yes (required for generation) | ❌ No | None (required) |
| API key | ✅ Yes (authentication header) | ❌ No | Set via env var |
| user_id (optional CLI arg) | ❌ **Never** (local preference key only) | ✅ Yes (as prefs file key) | Change `--user-id` value |
| Model preferences | ❌ No | ✅ Yes (~/.openclaw) | Delete anytime |
| Generation logs | ❌ No | ✅ Yes (~/.openclaw) | Auto-cleanup 7 days |

**Privacy recommendations:**
1. **Use test/scoped API keys** for initial testing
2. **Note**: `--user-id` is **never sent to IMA servers** - it's only used locally as a key for storing preferences in `~/.openclaw/memory/ima_prefs.json`
3. **Review source code** at `scripts/ima_video_create.py` to verify network calls (search for `create_task` function)
4. **Rotate API keys** after testing or if compromised

**Get your IMA API key:** Visit https://imastudio.com to register and get started.

### 🔧 For Skill Maintainers Only

**Version control:**
- All changes must go through Git with proper version bumps (semver)
- CHANGELOG.md must document all changes
- Production deployments require code review

**File checksums (optional):**
```bash
# Verify skill integrity
sha256sum SKILL.md scripts/ima_video_create.py
```

If users report issues, verify file integrity first.

---

## 🧠 User Preference Memory

> User preferences have **highest priority** when they exist. But preferences are only saved when users **explicitly express** model preferences — not from automatic model selection.

### Storage: `~/.openclaw/memory/ima_prefs.json`

```json
{
  "user_{user_id}": {
    "text_to_video":              { "model_id": "wan2.6-t2v",        "model_name": "Wan 2.6",          "credit": 25, "last_used": "..." },
    "image_to_video":             { "model_id": "wan2.6-i2v",        "model_name": "Wan 2.6",          "credit": 25, "last_used": "..." },
    "first_last_frame_to_video":  { "model_id": "kling-video-o1",    "model_name": "Kling O1",        "credit": 48, "last_used": "..." },
    "reference_image_to_video":   { "model_id": "kling-video-o1",    "model_name": "Kling O1",        "credit": 48, "last_used": "..." }
  }
}
```

### Model Selection Flow (Every Generation)

**Step 1: Get knowledge-ai recommendation** (if installed)
```python
knowledge_recommended_model = read_ima_knowledge_ai()  # e.g., "Wan 2.6"
```

**Step 2: Check user preference**
```python
user_pref = load_prefs().get(f"user_{user_id}", {}).get(task_type)  # e.g., {"model_id": "kling-video-o1", ...}
```

**Step 3: Decide which model to use**
```python
if user_pref exists:
    use_model = user_pref["model_id"]  # Highest priority
else:
    use_model = knowledge_recommended_model or fallback_default
```

**Step 4: Check for mismatch (for later hint)**
```python
if user_pref exists and knowledge_recommended_model != user_pref["model_id"]:
    mismatch = True  # Will add hint in success message
```

### When to Write (User Explicit Preference ONLY)

**✅ Save preference when user explicitly specifies a model:**

| User says | Action |
|-----------|--------|
| `用XXX` / `换成XXX` / `改用XXX` | Switch to model XXX + save as preference |
| `以后都用XXX` / `默认用XXX` / `always use XXX` | Save + confirm: `✅ 已记住！以后视频生成默认用 [XXX]` |
| `我喜欢XXX` / `我更喜欢XXX` | Save as preference |

**❌ Do NOT save when:**
- Agent auto-selects from knowledge-ai → not user preference
- Agent uses fallback default → not user preference
- User says generic quality requests (see "Clear Preference" below) → clear preference instead

### When to Clear (User Abandons Preference)

**🗑️ Clear preference when user wants automatic selection:**

| User says | Action |
|-----------|--------|
| `用最好的` / `用最合适的` / `best` / `recommended` | Clear pref + use knowledge-ai recommendation |
| `推荐一个` / `你选一个` / `自动选择` | Clear pref + use knowledge-ai recommendation |
| `用默认的` / `用新的` | Clear pref + use knowledge-ai recommendation |
| `试试别的` / `换个试试` (without specific model) | Clear pref + use knowledge-ai recommendation |
| `重新推荐` | Clear pref + use knowledge-ai recommendation |

**Implementation:**
```python
del prefs[f"user_{user_id}"][task_type]
save_prefs(prefs)
```

---

## ⭐ Model Selection Priority

**Selection flow:**

1. **User preference** (if exists) → Highest priority, always respect
2. **ima-knowledge-ai skill** (if installed) → Professional recommendation based on task
3. **Fallback defaults** → Use table below (only if neither 1 nor 2 exists)

**Important notes:**
- User preference is only saved when user **explicitly specifies** a model (see "When to Write" above)
- Knowledge-ai is **always consulted** (even when user pref exists) to detect mismatches
- When mismatch detected → add gentle hint in success message (does NOT interrupt generation)

> The defaults below are FALLBACK only. User preferences have highest priority, then knowledge-ai recommendations.  
> **Always default to the newest and most popular model. Do NOT default to the cheapest.**

| Task | Default Model | model_id | version_id | Cost | Why |
|------|--------------|----------|------------|------|-----|
| text_to_video | **Wan 2.6** | `wan2.6-t2v` | `wan2.6-t2v` | 25 pts | 🔥 Most popular, balanced cost |
| text_to_video (premium) | **Hailuo 2.3** | `MiniMax-Hailuo-2.3` | `MiniMax-Hailuo-2.3` | 38 pts | Higher quality |
| text_to_video (budget) | **Vidu Q2** | `viduq2` | `viduq2` | 5 pts | Lowest cost t2v |
| image_to_video | **Wan 2.6** | `wan2.6-i2v` | `wan2.6-i2v` | 25 pts | 🔥 Most popular i2v, 1080P |
| image_to_video (premium) | **Kling 2.6** | `kling-v2-6` | `kling-v2-6` | 40-160 pts | Premium Kling i2v |
| first_last_frame_to_video | **Kling O1** | `kling-video-o1` | `kling-video-o1` | 48 pts | Newest Kling reasoning model |
| reference_image_to_video | **Kling O1** | `kling-video-o1` | `kling-video-o1` | 48 pts | Best reference fidelity |

**Selection guide (production credits, sorted by popularity):**
- **🔥 Most popular text-to-video** → **Wan 2.6** (25 pts, balanced cost & quality)
- Premium text-to-video → **Hailuo 2.3** (38 pts, higher quality)
- Budget text-to-video → **Vidu Q2** (5 pts) or **Hailuo 2.0** (12 pts)
- **🔥 Most popular image_to_video** → **Wan 2.6** (25 pts)
- first_last_frame / reference → **Kling O1** (48 pts)
- User specifies cheapest → **Vidu Q2** (5 pts) — only if explicitly requested

---

## 🆕 Special Case: Pixverse Model Parameter (v1.0.7+)

**Auto-Inference Logic for Pixverse V5.5/V5/V4:**

- **Problem**: Pixverse V5.5, V5, V4 lack `model` field in `form_config` from Product List API
- **Backend Requirement**: Backend requires `model` parameter (e.g., `"v5.5"`, `"v5"`, `"v4"`)
- **Auto-Fix**: System automatically extracts version from `model_name` and injects it
  - Example: `model_name: "Pixverse V5.5"` → auto-inject `model: "v5.5"`
  - Example: `model_name: "Pixverse V4"` → auto-inject `model: "v4"`
- **Note**: V4.5 and V3.5 include `model` in `form_config` (no auto-inference needed)
- **Relevant Task Types**: All video modes (text_to_video, image_to_video, first_last_frame_to_video, reference_image_to_video)

**Error Prevention:**
- Without auto-inference: `err_code=400017 err_msg=Invalid value for model`
- With auto-inference (v1.0.7+): Pixverse V5.5/V5/V4 work seamlessly ✅

**Why This Matters:**
Some Pixverse models (V5.5/V5/V4) have inconsistent `form_config` in the Product List API response. The auto-inference ensures all Pixverse versions work correctly without requiring users to manually specify the `model` parameter.

---

## 💬 User Experience Protocol (IM / Feishu / Discord)

> Video generation takes 1~6 minutes. **Never let users wait in silence.**  
> Always follow all 4 steps below, every single time.

### 🚫 Never Say to Users

| ❌ Never say | ✅ What users care about |
|-------------|--------------------------|
| `ima_video_create.py` / 脚本 / script | — |
| 自动化脚本 / automation | — |
| 自动处理产品列表 / 查询接口 | — |
| 自动解析参数 / 智能轮询 | — |
| attribute_id / model_version / form_config | — |
| API 调用 / HTTP 请求 / 任何技术参数名 | — |

Only tell users: **model name · estimated time · credits · result URL · plain-language status**.

---

### Estimated Generation Time per Model

| Model | Estimated Time | Poll Every | Send Progress Every |
|-------|---------------|------------|---------------------|
| Wan 2.6 (t2v / i2v) | 60~120s | 8s | 30s |
| Hailuo 2.0 | 60~120s | 8s | 30s |
| Hailuo 2.3 | 60~120s | 8s | 30s |
| Vidu Q1 / Q2 | 60~120s | 8s | 30s |
| Pixverse V3.5~V5.5 | 60~120s | 8s | 30s |
| Kling 1.6 | 60~120s | 8s | 30s |
| Kling 2.1 Master | 90~180s | 8s | 40s |
| SeeDance 1.0 / 1.5 Pro | 90~180s | 8s | 40s |
| Google Veo 3.1 Fast | 90~180s | 8s | 40s |
| Kling 2.5 Turbo | 120~240s | 8s | 45s |
| Sora 2 | 120~240s | 8s | 45s |
| Wan 2.5 | 90~180s | 8s | 40s |
| Kling 2.6 | 120~240s | 8s | 45s |
| Kling O1 | 180~360s | 8s | 60s |
| Sora 2 Pro | 180~360s | 8s | 60s |
| Google Veo 3.1 | 120~300s | 8s | 50s |
| Google Veo 3.0 | 180~360s | 8s | 60s |

`estimated_max_seconds` = upper bound of the range (e.g. 180 for Kling 2.1 Master, 360 for Kling O1).

---

### Step 1 — Pre-Generation Notification (with Cost Transparency)

**Before calling the create API**, send this message immediately:

```
🎬 开始生成视频，请稍候…
• 模型：[Model Name]
• 预计耗时：[X ~ Y 秒]（约 [X/60 ~ Y/60] 分钟）
• 消耗积分：[N pts]

视频生成需要一定时间，我会每隔一段时间汇报进度 🙏
```

**Cost transparency (critical for video):**
- For balanced/default models (25 pts): "使用 Wan 2.6（25 积分，最新 Wan）"
- For premium models (>50 pts):
  - If auto-selected: "使用 Wan 2.6（25 积分）。若需更高质量可选 Kling 2.1 Master（150 积分）"
  - If user explicit: "使用高端模型 Kling 2.1 Master（150 积分），质量最佳"
- For budget (user explicit): "使用 Vidu Q2（5 积分，最省钱选项）"

> Adapt language to match the user. For expensive models (>50 pts), always mention cheaper alternatives unless user explicitly requested premium quality.

> Adapt language to match the user. English → `🎬 Starting video generation, this may take [X~Y] seconds. I'll update you on progress…`

---

### Step 2 — Progress Updates

Poll the task detail API every **8s**.  
Send a progress update message every `[Send Progress Every]` seconds per the table above.

```
⏳ 视频生成中… [P]%
已等待 [elapsed]s，预计最长 [max]s
```

**Progress formula:**
```
P = min(95, floor(elapsed_seconds / estimated_max_seconds * 100))
```

- **Cap at 95%** — never show 100% until the API returns `success`
- If `elapsed > estimated_max`: keep P at 95% and append `「快了，稍等一下…」`
- Example: elapsed=120s, max=180s → P = min(95, floor(120/180*100)) = min(95, 66) = **66%**
- Example: elapsed=200s, max=180s → P = **95%**（冻结 + 「快了，稍等一下…」）

---

### Step 3 — Success Notification (Push video via message tool)

When task status = `success`:

**3.1 Send video player first** (Feishu will render inline player):
```python
# Get result URL from script output or task detail API
result = get_task_result(task_id)
video_url = result["medias"][0]["url"]

# Build caption
caption = f"""✅ 视频生成成功！
• 模型：[Model Name]
• 耗时：预计 [X~Y]s，实际 [actual]s
• 消耗积分：[N pts]

[视频描述]"""

# Add mismatch hint if user pref conflicts with knowledge-ai recommendation
if user_pref_exists and knowledge_recommended_model != used_model:
    caption += f"""

💡 提示：当前任务也许用 {knowledge_recommended_model} 也会不错（{reason}，{cost} pts）"""

# Send video with caption
message(
    action="send",
    media=video_url,  # ⚠️ Use HTTPS URL directly, NOT local file path
    caption=caption
)
```

**Mismatch hint example:**
```
✅ 视频生成成功！
• 模型：Midjourney（你的偏好模型）
• 耗时：45s
• 消耗积分：8 pts

💡 提示：当前任务也许用 Wan 2.6 也会不错（写实风格更合适，25 pts）

[视频]
```

**Important:**
- Hint is **non-intrusive** — does NOT interrupt generation
- Only shown when user pref conflicts with knowledge-ai recommendation
- User can ignore the hint; video is already delivered

**3.2 Then send link as text** (for copying/sharing):
```python
# Send link message immediately after
message(
    action="send",
    message=f"""🔗 视频链接（方便复制分享）：
{video_url}"""
)
```

**Critical:** 
- Use the **remote HTTPS URL** directly as `media` parameter. Do NOT download to local file first.
- Send video first (for inline playback), then send link text (for copying/sharing).

> For Feishu: Direct video URL → inline video player with play button. Local file path → file attachment (📎 path).

---

### Step 4 — Failure Notification

When task status = `failed` or any API/network error, send:

```
❌ 视频生成失败
• 原因：[natural_language_error_message]
• 建议改用：
  - [Alt Model 1]（[特点]，[N pts]）
  - [Alt Model 2]（[特点]，[N pts]）

需要我帮你用其他模型重试吗？
```

**⚠️ CRITICAL: Error Message Translation**

**NEVER show technical error messages to users.** Always translate API errors into natural language.  
**API key & credits:** 密钥与积分管理入口为 imaclaw.ai（与 imastudio.com 同属 IMA 平台）。Key and subscription management: imaclaw.ai (same IMA platform as imastudio.com).

| Technical Error | ❌ Never Say | ✅ Say Instead (Chinese) | ✅ Say Instead (English) |
|----------------|-------------|------------------------|------------------------|
| `401 Unauthorized` 🆕 | Invalid API key / 401 Unauthorized | ❌ API密钥无效或未授权<br>💡 **生成新密钥**: https://www.imaclaw.ai/imaclaw/apikey | ❌ API key is invalid or unauthorized<br>💡 **Generate API Key**: https://www.imaclaw.ai/imaclaw/apikey |
| `4008 Insufficient points` 🆕 | Insufficient points / Error 4008 | ❌ 积分不足，无法创建任务<br>💡 **购买积分**: https://www.imaclaw.ai/imaclaw/subscription | ❌ Insufficient points to create this task<br>💡 **Buy Credits**: https://www.imaclaw.ai/imaclaw/subscription |
| `"Invalid product attribute"` / `"Insufficient points"` | Invalid product attribute | 生成参数配置异常，请稍后重试 | Configuration error, please try again later |
| `Error 6006` (credit mismatch) | Error 6006 | 积分计算异常，系统正在修复 | Points calculation error, system is fixing |
| `Error 6010` (attribute_id mismatch) | Attribute ID does not match | 模型参数不匹配，请尝试其他模型 | Model parameters incompatible, try another model |
| `error 400` (bad request) | error 400 / Bad request | 视频参数设置有误，请调整时长或分辨率 | Video parameter error, adjust duration or resolution |
| `resource_status == 2` | Resource status 2 / Failed | 视频生成遇到问题，建议换个模型试试 | Video generation failed, try another model |
| `status == "failed"` (no details) | Task failed | 这次生成没成功，要不换个模型试试？ | Generation unsuccessful, try a different model? |
| `timeout` | Task timed out / Timeout error | 视频生成时间过长已超时，建议用更快的模型 | Video generation took too long, try a faster model |
| Network error / Connection refused | Connection refused / Network error | 网络连接不稳定，请检查网络后重试 | Network connection unstable, check network and retry |
| Rate limit exceeded | 429 Too Many Requests / Rate limit | 请求过于频繁，请稍等片刻再试 | Too many requests, please wait a moment |
| Prompt moderation (Sora 2 Pro only) | Content policy violation | 提示词包含敏感内容（如人物），Sora 不支持，请换其他模型 | Prompt contains restricted content (e.g. people), Sora doesn't support it, try another model |
| Model unavailable | Model not available / 503 Service Unavailable | 当前模型暂时不可用，建议换个模型 | Model temporarily unavailable, try another model |
| Image upload failed (image_to_video only) | Image upload error | 输入图片处理失败，请检查图片格式或换张图 | Input image processing failed, check format or try another image |
| Duration/resolution not supported | Parameter not supported | 该模型不支持此时长或分辨率，请调整参数 | Model doesn't support this duration or resolution, adjust parameters |

**Generic fallback (when error is unknown):**
- Chinese: `视频生成遇到问题，请稍后重试或换个模型试试`
- English: `Video generation encountered an issue, please try again or use another model`

**Best Practices:**
1. **Focus on user action**: Tell users what to do next, not what went wrong technically
2. **Be reassuring**: Use phrases like "建议换个模型试试" instead of "生成失败了"
3. **Avoid blame**: Never say "你的提示词有问题" → say "提示词需要调整一下"
4. **Provide alternatives**: Always suggest 1-2 alternative models in the failure message
5. **Video-specific**: 
   - For Sora content policy errors, recommend Wan 2.6 or Kling O1 (more permissive)
   - For timeout errors, recommend faster models (Vidu Q2, Hailuo 2.0)
   - For image input errors, suggest checking image format (HTTPS URL, valid JPEG/PNG)
6. **🆕 Include actionable links (v1.0.8+)**: For 401/4008 errors, provide clickable links to API key generation or credit purchase pages

**🆕 Enhanced Error Handling (v1.0.8):**

The Reflection mechanism (3 automatic retries) now provides **specific, actionable suggestions** for common errors:

- **401 Unauthorized**: System suggests generating a new API key with clickable link
- **4008 Insufficient Points**: System suggests purchasing credits with clickable link
- **500 Internal Server Error**: Automatic parameter degradation (resolution: 1080P → 720P → 540P, duration: 15 → 10 → 5)
- **6009 No Rule Match**: Automatic parameter completion from credit_rules
- **6010 Attribute Mismatch**: Automatic credit_rule reselection
- **Timeout**: Helpful info with dashboard link for background task status
- **🆕 Pixverse Model Parameter (v1.0.7+)**: Auto-inference for missing `model` parameter (V5.5/V5/V4)

All error handling is **automatic and transparent** — users receive natural language explanations with next steps.

**Failure fallback table:**

| Failed Model | First Alt | Second Alt |
|-------------|-----------|------------|
| Kling 2.1 Master | Wan 2.6（3pts，速度快） | Hailuo 2.0（5pts） |
| Google Veo 3.1 | Kling 2.1 Master（10pts） | Sora 2（42pts） |
| Kling O1 | Kling 2.1 Master（10pts） | Kling 2.5 Turbo（37pts） |
| Wan 2.6 | Hailuo 2.0（5pts） | Kling 1.6（10pts） |
| Sora 2 / Pro | Kling 2.1 Master（10pts） | Google Veo 3.1（162pts） |
| SeeDance | Kling 2.1 Master（10pts） | Wan 2.6（3pts） |
| Any / Unknown | Wan 2.6（3pts，最稳定） | Hailuo 2.0（5pts） |

---


## Supported Models

⚠️ **Production Environment**: Model availability validated against production API on 2026-02-27.

### text_to_video (14 models)

| Name | model_id | Cost Range | Resolution | Duration | Notes |
|------|----------|-----------|------------|----------|-------|
| **Wan 2.6** 🌟 | `wan2.6-t2v` | 25-120 pts | 720P/1080P | 5-15s | Balanced, most popular |
| **Hailuo 2.3** | `MiniMax-Hailuo-2.3` | 32+ pts | 768P | 6s | Latest Hailuo |
| Hailuo 2.0 | `MiniMax-Hailuo-02` | 5+ pts | 768P | 6s | Budget friendly |
| Vidu Q2 | `viduq2` | 5-70 pts | 540P-1080P | 5-10s | Fast generation |
| SeeDance 1.5 Pro | `doubao-seedance-1.5-pro` | 20+ pts | 720P | 4s | Latest SeeDance |
| Sora 2 Pro | `sora-2-pro` | 122+ pts | 720P+ | 4s+ | Premium OpenAI |
| **Kling O1** | `kling-video-o1` | 48-120 pts | — | 5-10s | Latest Kling, with audio |
| Kling 2.6 | `kling-v2-6` | 80+ pts | — | 5-10s | Previous Kling gen |
| **Google Veo 3.1** | `veo-3.1-generate-preview` | 70-330 pts | 720P-4K | 4-8s | SOTA cinematic |
| Pixverse V5.5 | `pixverse` | 30+ pts | 540P-1080P | 5-8s | Latest Pixverse |
| Pixverse V5 | `pixverse` | 25+ pts | 540P-1080P | 5-8s | — |
| Pixverse V4.5 | `pixverse` | 20+ pts | 540P-1080P | 5-8s | — |
| Pixverse V4 | `pixverse` | 12+ pts | 540P-1080P | 5-8s | — |
| Pixverse V3.5 | `pixverse` | 12+ pts | 540P-1080P | 5-8s | — |

### image_to_video (14 models)

| Name | model_id | Cost Range | Resolution | Duration | Notes |
|------|----------|-----------|------------|----------|-------|
| **Wan 2.6** 🔥 | `wan2.6-i2v` | 25-120 pts | 720P/1080P | 5-15s | Most popular i2v |
| **Hailuo 2.3** | `MiniMax-Hailuo-2.3` | 32+ pts | 768P | 6s | Latest Hailuo |
| Hailuo 2.0 | `MiniMax-Hailuo-02` | 25+ pts | 768P | 6s | — |
| Vidu Q2 Pro | `viduq2-pro` | 20-70 pts | 540P-1080P | 5-10s | Fast i2v |
| SeeDance 1.5 Pro | `doubao-seedance-1.5-pro` | 47+ pts | 720P | 4s | Latest SeeDance |
| Sora 2 Pro | `sora-2-pro` | 122+ pts | 720P+ | 4s+ | Premium OpenAI |
| **Kling O1** | `kling-video-o1` | 48-120 pts | — | 5-10s | Latest Kling, with audio |
| Kling 2.6 | `kling-v2-6` | 80+ pts | — | 5-10s | Previous Kling gen |
| **Google Veo 3.1** | `veo-3.1-generate-preview` | 70-330 pts | 720P-4K | 4-8s | SOTA cinematic |
| Pixverse V5.5 | `pixverse` | 24-48 pts | 540P-1080P | 5-8s | Latest Pixverse |
| Pixverse V5 | `pixverse` | 24-48 pts | 540P-1080P | 5-8s | — |
| Pixverse V4.5 | `pixverse` | 12-48 pts | 540P-1080P | 5-8s | — |
| Pixverse V4 | `pixverse` | 12-48 pts | 540P-1080P | 5-8s | — |
| Pixverse V3.5 | `pixverse` | 12-48 pts | 540P-1080P | 5-8s | — |

### first_last_frame_to_video (10 models)

| Name | model_id | Cost Range | Duration | Notes |
|------|----------|-----------|----------|-------|
| Hailuo 2.0 | `MiniMax-Hailuo-02` | 5+ pts | 6s | Budget option |
| Vidu Q2 Pro | `viduq2-pro` | 20-70 pts | 5-10s | Fast generation |
| **Kling O1** 🌟 | `kling-video-o1` | 48-120 pts | 5-10s | Recommended default |
| Kling 2.6 | `kling-v2-6` | 80+ pts | 5-10s | — |
| **Google Veo 3.1** | `veo-3.1-generate-preview` | 70-330 pts | 4-8s | SOTA quality |
| Pixverse V5.5 | `pixverse` | 24-48 pts | 5-8s | Latest Pixverse |
| Pixverse V5 | `pixverse` | 24-48 pts | 5-8s | — |
| Pixverse V4.5 | `pixverse` | 12-48 pts | 5-8s | — |
| Pixverse V4 | `pixverse` | 12-48 pts | 5-8s | — |
| Pixverse V3.5 | `pixverse` | 12-48 pts | 5-8s | — |

### reference_image_to_video (9 models)

| Name | model_id | Cost Range | Duration | Notes |
|------|----------|-----------|----------|-------|
| Vidu Q2 | `viduq2` | 10-70 pts | 5-10s | Fast, cost-effective |
| **Kling O1** 🌟 | `kling-video-o1` | 48-120 pts | 5-10s | Recommended, strong reference |
| **Google Veo 3.1** | `veo-3.1-generate-preview` | 70-330 pts | 4-8s | SOTA cinematic |
| Pixverse (generic) | `pixverse` | 12-48 pts | 5-8s | Pixverse base |
| Pixverse V5.5 | `pixverse` | 12-48 pts | 5-8s | Latest Pixverse |
| Pixverse V5 | `pixverse` | 12-48 pts | 5-8s | — |
| Pixverse V4.5 | `pixverse` | 12-48 pts | 5-8s | — |
| Pixverse V4 | `pixverse` | 12-48 pts | 5-8s | — |
| Pixverse V3.5 | `pixverse` | 12-48 pts | 5-8s | — |

**Production Notes (2026-02-27)**:
- ✅ **Active models**: 14 t2v, 14 i2v, 10 first_last_frame, 9 reference_image
- 🔥 **Most popular**: Wan 2.6 (both t2v and i2v)
- 🌟 **Recommended defaults**: Wan 2.6 (balanced), Kling O1 (premium with audio)

## Environment

Base URL: `https://api.imastudio.com`

Required/recommended headers for all `/open/v1/` endpoints:

| Header | Required | Value | Notes |
|--------|----------|-------|-------|
| `Authorization` | ✅ | `Bearer ima_your_api_key_here` | API key authentication |
| `x-app-source` | ✅ | `ima_skills` | Fixed value — identifies skill-originated requests |
| `x_app_language` | recommended | `en` / `zh` | Product label language; defaults to `en` if omitted |

```
Authorization: Bearer ima_your_api_key_here
x-app-source: ima_skills
x_app_language: en
```

---

## ⚠️ MANDATORY: Always Query Product List First

> **CRITICAL**: You MUST call `/open/v1/product/list` BEFORE creating any task.  
> The `attribute_id` field is REQUIRED in the create request. If it is `0` or missing, you get:  
> `"Invalid product attribute"` → `"Insufficient points"` → task fails completely.  
> **NEVER construct a create request from the model table alone. Always fetch the product first.**

### How to get attribute_id

```python
# Step 1: Query product list for the target category
GET /open/v1/product/list?app=ima&platform=web&category=text_to_video
# (or image_to_video / first_last_frame_to_video / reference_image_to_video)

# Step 2: Walk the V2 tree to find your model (type=3 leaf nodes only)
for group in response["data"]:
    for version in group.get("children", []):
        if version["type"] == "3" and version["model_id"] == target_model_id:
            attribute_id  = version["credit_rules"][0]["attribute_id"]
            credit        = version["credit_rules"][0]["points"]
            model_version = version["id"]    # = version_id
            model_name    = version["name"]
            form_defaults = {f["field"]: f["value"] for f in version["form_config"]}
```

### Quick Reference: Known attribute_ids

⚠️ **Production warning**: `attribute_id` and `credit` values change frequently. Always call `/open/v1/product/list` at runtime; table below is pre-queried reference (2026-02-27).

| Model | Task | model_id | attribute_id | credit | Notes |
|-------|------|----------|-------------|--------|-------|
| Wan 2.6 (720P, 5s) | text_to_video | `wan2.6-t2v` | **2057** | 25 pts | Default, balanced |
| Wan 2.6 (1080P, 5s) | text_to_video | `wan2.6-t2v` | **2058** | 40 pts | — |
| Wan 2.6 (720P, 10s) | text_to_video | `wan2.6-t2v` | **2059** | 50 pts | — |
| Wan 2.6 (1080P, 10s) | text_to_video | `wan2.6-t2v` | **2060** | 80 pts | — |
| Wan 2.6 (720P, 15s) | text_to_video | `wan2.6-t2v` | **2061** | 75 pts | — |
| Wan 2.6 (1080P, 15s) | text_to_video | `wan2.6-t2v` | **2062** | 120 pts | — |
| Kling O1 (5s, std) | text_to_video | `kling-video-o1` | **2313** | 48 pts | Latest Kling |
| Kling O1 (5s, pro) | text_to_video | `kling-video-o1` | **2314** | 60 pts | — |
| Kling O1 (10s, std) | text_to_video | `kling-video-o1` | **2315** | 96 pts | — |
| Kling O1 (10s, pro) | text_to_video | `kling-video-o1` | **2316** | 120 pts | — |
| All others | any | — | → query `/open/v1/product/list` | — | Always runtime query |

### Common Mistakes (and resulting errors)

| Mistake | Error |
|---------|-------|
| `attribute_id` is 0 or missing | `"Invalid product attribute"` → Insufficient points |
| `attribute_id` outdated (production changed) | Same errors; always query product list first |
| **`attribute_id` doesn't match parameter combination** | **Error 6010: "Attribute ID does not match the calculated rule"** |
| `prompt` at outer level instead of `parameters.parameters.prompt` | Prompt ignored |
| `cast` missing from inner `parameters` | Billing validation failure |
| `credit` wrong / missing | Error 6006 |
| `model_name` or `model_version` missing | Wrong model routing |

**⚠️ Critical for Google Veo 3.1 and multi-rule models:**

Models like Google Veo 3.1 have **multiple `credit_rules`**, each with a different `attribute_id` for different parameter combinations:
- `720p + 4s + optimized` → attribute_id A
- `720p + 8s + optimized` → attribute_id B  
- `4K + 4s + high` → attribute_id C

The script automatically selects the correct `attribute_id` by matching your parameters (`duration`, `resolution`, `compression_quality`, `generate_audio`) against each rule's `attributes`. If the match fails, you get error 6010.

**Fix**: The bundled script now checks these video-specific parameters for smart credit_rule selection. Always use the script, not manual API construction.

---

## Core Flow

```
1. GET /open/v1/product/list?app=ima&platform=web&category=<type>
   → REQUIRED: Get attribute_id, credit, model_version, form_config defaults

[image_to_video / first_last_frame / reference_image tasks only]
2. Upload input image(s) → get public HTTPS URL(s)
   → See "Image Upload" section below

3. POST /open/v1/tasks/create
   → Must include: attribute_id, model_name, model_version, credit, cast, prompt (nested!)

4. POST /open/v1/tasks/detail  {task_id: "..."}
   → Poll every 8s until medias[].resource_status == 1
   → Extract url (mp4) and cover (thumbnail) from completed media
```

> Video generation is slower than image — poll every **8s** and set timeout to **600s**.

---

## Image Upload (Required for Video Tasks with Image Input)

**The IMA Open API does NOT accept raw bytes or base64 images. All input images must be public HTTPS URLs.**

**Script behavior:** `--input-images` accepts **both URLs and local file paths**. Local files are automatically uploaded to IMA CDN by the script — no separate upload step needed when calling the script.

For `image_to_video`, `first_last_frame_to_video`, `reference_image_to_video`: when a user provides an image (local file, base64, or non-public URL), you can pass a local path to the script (it will upload), or upload first in code to get a URL.

```python
def prepare_image_url(source) -> str:
    """Convert any image source to a public HTTPS URL.
    
    - If source is already a public HTTPS URL: return as-is
    - If source is a local file path or bytes: upload to hosting first
    """
    if isinstance(source, str) and source.startswith("https://"):
        return source  # already public, use directly

    # Option 1: IMA OSS (requires OSS credentials)
    #   objectName = f"aiagent/src/d/{date}/in/{uuid}.jpg"
    #   bucket.put_object(objectName, image_bytes)
    #   return f"https://ima.esxscloud.com/{objectName}"

    # Option 2: Any public image hosting (imgbb example)
    import base64, requests
    if isinstance(source, str):
        with open(source, "rb") as f:
            b64 = base64.b64encode(f.read()).decode()
    else:
        b64 = base64.b64encode(source).decode()
    r = requests.post("https://api.imgbb.com/1/upload",
                      data={"key": IMGBB_API_KEY, "image": b64})
    r.raise_for_status()
    return r.json()["data"]["url"]

# For first_last_frame: prepare both frames
first_url = prepare_image_url("/path/to/first.jpg")
last_url  = prepare_image_url("/path/to/last.jpg")
src_img_url = [first_url, last_url]  # index 0 = first, index 1 = last
```

> **Note**: URLs must be publicly accessible — not localhost, private network, or auth-gated endpoints.

---

## Supported Task Types

| category | Capability | Input |
|----------|------------|-------|
| `text_to_video` | Text → Video | prompt |
| `image_to_video` | Image → Video | prompt + upload_img_src |
| `first_last_frame_to_video` | First+Last Frame → Video | prompt + src_img_url[2] |
| `reference_image_to_video` | Reference Image → Video | prompt + src_img_url[1+] |

---

## Detail API status values

| Field | Type | Values |
|-------|------|--------|
| **`resource_status`** | int or `null` | `0`=处理中, `1`=可用, `2`=失败, `3`=已删除；`null` 当作 0 |
| **`status`** | string | `"pending"`, `"processing"`, `"success"`, `"failed"` |

| `resource_status` | `status` | Action |
|-------------------|----------|--------|
| `0` or `null` | `pending` / `processing` | Keep polling |
| `1` | `success` (or `completed`) | Stop when **all** medias are 1; read `url` / `cover` |
| `1` | `failed` | Stop, handle error |
| `2` / `3` | any | Stop, handle error |

> **Important**: Treat `resource_status: null` as 0. Stop only when **all** medias have `resource_status == 1`. Check `status != "failed"` when rs=1.

---

## API 1: Product List

```
GET /open/v1/product/list?app=ima&platform=web&category=text_to_video
```

Returns a **V2 tree structure**: `type=2` nodes are model groups, `type=3` nodes are versions (leaves). Only `type=3` nodes contain `credit_rules` and `form_config`.

**How to pick a version:**
1. Traverse nodes to find `type=3` leaves
2. Use `model_id` and `id` (= `model_version`) from the leaf
3. Pick `credit_rules[].attribute_id` matching desired quality
4. Use `form_config[].value` as default `parameters` values (duration, resolution, aspect_ratio, etc.)

---

## API 2: Create Task

```
POST /open/v1/tasks/create
```

### text_to_video — Verified ✅

No image input. `src_img_url: []`, `input_images: []`.

```json
{
  "task_type": "text_to_video",
  "enable_multi_model": false,
  "src_img_url": [],
  "parameters": [{
    "attribute_id":  4838,
    "model_id":      "wan2.6-t2v",
    "model_name":    "Wan 2.6",
    "model_version": "wan2.6-t2v",
    "app":           "ima",
    "platform":      "web",
    "category":      "text_to_video",
    "credit":        25,
    "parameters": {
      "prompt":          "a puppy dancing happily, sunny meadow",
      "negative_prompt": "",
      "prompt_extend":   false,
      "duration":        5,
      "resolution":      "1080P",
      "aspect_ratio":    "16:9",
      "shot_type":       "single",
      "seed":            -1,
      "n":               1,
      "input_images":    [],
      "cast":            {"points": 3, "attribute_id": 4838}
    }
  }]
}
```

> Video-specific fields from `form_config`: `duration` (seconds), `resolution`, `aspect_ratio`, `shot_type`, `negative_prompt`, `prompt_extend`.
> Response `medias[].cover` = first-frame thumbnail JPEG.

### image_to_video

Input image goes in top-level `src_img_url` and `parameters.input_images`:

```json
{
  "task_type": "image_to_video",
  "enable_multi_model": false,
  "src_img_url": ["https://example.com/scene.jpg"],
  "parameters": [{
    "attribute_id":  "<from credit_rules>",
    "model_id":      "<model_id>",
    "model_name":    "<model_name>",
    "model_version": "<version_id>",
    "app":           "ima",
    "platform":      "web",
    "category":      "image_to_video",
    "credit":        "<points>",
    "parameters": {
      "prompt":       "bring this landscape alive",
      "n":            1,
      "input_images": ["https://example.com/scene.jpg"],
      "cast":         {"points": "<points>", "attribute_id": "<attribute_id>"}
    }
  }]
}
```

### first_last_frame_to_video

Provide exactly 2 images: index 0 = first frame, index 1 = last frame:

```json
{
  "task_type": "first_last_frame_to_video",
  "src_img_url": ["https://example.com/first.jpg", "https://example.com/last.jpg"],
  "parameters": [{
    "category": "first_last_frame_to_video",
    "parameters": {
      "prompt": "smooth transition",
      "n": 1,
      "input_images": ["https://example.com/first.jpg", "https://example.com/last.jpg"],
      "cast": {"points": "<points>", "attribute_id": "<attribute_id>"}
    }
  }]
}
```

### reference_image_to_video

Provide 1 or more reference images in `src_img_url`:

```json
{
  "task_type": "reference_image_to_video",
  "src_img_url": ["https://example.com/ref.jpg"],
  "parameters": [{
    "category": "reference_image_to_video",
    "parameters": {
      "prompt": "dynamic video based on reference",
      "n": 1,
      "input_images": ["https://example.com/ref.jpg"],
      "cast": {"points": "<points>", "attribute_id": "<attribute_id>"}
    }
  }]
}
```

**Key fields**:

| Field | Required | Description |
|-------|----------|-------------|
| `parameters[].credit` | ✅ | Must equal `credit_rules[].points`. Error 6006 if wrong. |
| `parameters[].parameters.prompt` | ✅ | Prompt must be nested here, NOT at top level. |
| `parameters[].parameters.cast` | ✅ | `{"points": N, "attribute_id": N}` — mirror of credit. |
| `parameters[].parameters.n` | ✅ | Number of outputs (usually `1`). |
| top-level `src_img_url` | image tasks | Image URL(s); 2 images for first_last_frame. |
| `parameters[].parameters.input_images` | image tasks | Must mirror `src_img_url`. |
| `parameters[].parameters.duration` | text_to_video | Video duration in seconds (from form_config). |
| `parameters[].parameters.resolution` | text_to_video | e.g. `"1080P"` (from form_config). |
| `parameters[].parameters.aspect_ratio` | text_to_video | e.g. `"16:9"` (from form_config). |

Response: `data.id` = task ID for polling.

---

## API 3: Task Detail (Poll)

```
POST /open/v1/tasks/detail
{"task_id": "<id from create response>"}
```

Poll every **8s** for video tasks. Completed response:

```json
{
  "id": "task_abc",
  "medias": [{
    "resource_status": 1,
    "url":   "https://cdn.../output.mp4",
    "cover": "https://cdn.../cover.jpg",
    "duration_str": "5s",
    "format": "mp4"
  }]
}
```

Output fields: `url` (mp4), `cover` (first-frame thumbnail JPEG), `duration_str`, `format`.

---

## Common Mistakes

| Mistake | Fix |
|---------|-----|
| Polling too fast for video | Use 8s interval, not 2–3s |
| Missing `duration`/`resolution`/`aspect_ratio` | Read defaults from `form_config` |
| Wrong `credit` value | Must exactly match `credit_rules[].points` (error 6006) |
| `src_img_url` and `input_images` mismatch | Both must contain the same image URL(s) |
| Only 1 image for first_last_frame | Requires exactly 2 images (first + last) |
| Placing `prompt` at param top-level | `prompt` must be inside `parameters[].parameters` |

---

## Python Example

```python
import time
import requests

BASE_URL = "https://api.imastudio.com"
API_KEY  = "ima_your_key_here"
HEADERS  = {
    "Authorization":  f"Bearer {API_KEY}",
    "Content-Type":   "application/json",
    "x-app-source":   "ima_skills",
    "x_app_language": "en",
}


def get_products(category: str) -> list:
    """Returns flat list of type=3 version nodes from V2 tree."""
    r = requests.get(
        f"{BASE_URL}/open/v1/product/list",
        headers=HEADERS,
        params={"app": "ima", "platform": "web", "category": category},
    )
    r.raise_for_status()
    nodes = r.json()["data"]
    versions = []
    for node in nodes:
        for child in node.get("children") or []:
            if child.get("type") == "3":
                versions.append(child)
            for gc in child.get("children") or []:
                if gc.get("type") == "3":
                    versions.append(gc)
    return versions


def create_video_task(task_type: str, prompt: str, product: dict, src_img_url: list = None, **extra) -> str:
    """Returns task_id. src_img_url: list of image URLs (1+ for image tasks, 2 for first_last_frame)."""
    src_img_url = src_img_url or []
    rule = product["credit_rules"][0]
    form_defaults = {f["field"]: f["value"] for f in product.get("form_config", []) if f.get("value") is not None}

    nested_params = {
        "prompt": prompt,
        "n":      1,
        "input_images": src_img_url,
        "cast":   {"points": rule["points"], "attribute_id": rule["attribute_id"]},
        **form_defaults,
    }
    nested_params.update({k: v for k, v in extra.items()
                          if k in ("duration", "resolution", "aspect_ratio", "shot_type",
                                   "negative_prompt", "prompt_extend", "seed")})

    body = {
        "task_type":          task_type,
        "enable_multi_model": False,
        "src_img_url":        src_img_url,
        "parameters": [{
            "attribute_id":  rule["attribute_id"],
            "model_id":      product["model_id"],
            "model_name":    product["name"],
            "model_version": product["id"],
            "app":           "ima",
            "platform":      "web",
            "category":      task_type,
            "credit":        rule["points"],
            "parameters":    nested_params,
        }],
    }
    r = requests.post(f"{BASE_URL}/open/v1/tasks/create", headers=HEADERS, json=body)
    r.raise_for_status()
    return r.json()["data"]["id"]


def poll(task_id: str, interval: int = 8, timeout: int = 600) -> dict:
    deadline = time.time() + timeout
    while time.time() < deadline:
        r = requests.post(f"{BASE_URL}/open/v1/tasks/detail", headers=HEADERS, json={"task_id": task_id})
        r.raise_for_status()
        task   = r.json()["data"]
        medias = task.get("medias", [])
        if medias:
            if any(m.get("status") == "failed" for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            rs = lambda m: m.get("resource_status") if m.get("resource_status") is not None else 0
            if any(rs(m) == 2 for m in medias):
                raise RuntimeError(f"Task failed: {task_id}")
            if all(rs(m) == 1 for m in medias):
                return task
        time.sleep(interval)
    raise TimeoutError(f"Task timed out: {task_id}")


# text_to_video (Verified: Wan 2.6, response includes cover thumbnail)
products = get_products("text_to_video")
wan26    = next(p for p in products if p["model_id"] == "wan2.6-t2v")
task_id  = create_video_task(
    "text_to_video", "a puppy dancing happily, sunny meadow", wan26,
    duration=5, resolution="1080P", aspect_ratio="16:9",
    shot_type="single", negative_prompt="", prompt_extend=False, seed=-1,
)
result = poll(task_id)
print(result["medias"][0]["url"])    # mp4 URL
print(result["medias"][0]["cover"])  # first-frame thumbnail JPEG

# image_to_video
products = get_products("image_to_video")
task_id  = create_video_task("image_to_video", "bring this landscape alive", products[0],
                             src_img_url=["https://example.com/scene.jpg"])
result   = poll(task_id)
print(result["medias"][0]["url"])

# first_last_frame_to_video (exactly 2 images required)
products = get_products("first_last_frame_to_video")
frames   = ["https://example.com/first.jpg", "https://example.com/last.jpg"]
task_id  = create_video_task("first_last_frame_to_video", "smooth transition", products[0],
                             src_img_url=frames)
result   = poll(task_id)
print(result["medias"][0]["url"])

# reference_image_to_video
products = get_products("reference_image_to_video")
task_id  = create_video_task("reference_image_to_video", "dynamic video", products[0],
                             src_img_url=["https://example.com/ref.jpg"])
result   = poll(task_id)
print(result["medias"][0]["url"])
```

---

## Supported Models & Search Terms

**Models:** Wan 2.6, Kling O1, Kling 2.6, Google Veo 3.1, Sora 2 Pro, Pixverse V5.5, Hailuo 2.0, Hailuo 2.3, MiniMax Hailuo, SeeDance 1.5 Pro, Vidu Q2

**Capabilities:** video generation, text-to-video, image-to-video, AI video, character animation, product demo, social media clips, storytelling, explainer video