---
name: creator
description: |
Creator workflow — generate platform-ready content packages. Triggers on:
"创作", "写公众号", "小红书", "口播", "creator", "content workflow",
"帮我写一篇", "生成内容", "write an article", "create content".
metadata:
openclaw:
emoji: "✍️"
---
## When to Use
- User wants a full content package for a specific platform (WeChat article, Xiaohongshu post, narration script)
- User says "帮我写篇公众号", "小红书图文", "口播稿", "create content"
- User provides a URL/text/topic and wants it turned into platform-ready content with images
## When NOT to Use
- User wants a single image without a content workflow → use image-gen directly
- User wants a single TTS audio → use tts directly
- User wants to transcribe audio → use asr directly
- User wants a podcast episode → use podcast directly
- User wants to extract content from a URL without further processing → use content-parser directly
Creator is for **multi-step content production** that combines writing + media generation into a platform-ready package.
## Purpose
Generate platform-specific content packages by orchestrating existing skills. Input: topic, URL, text, or audio/video file. Output: a folder with article/script, images, and metadata — ready to publish.
## Hard Constraints
- Use `listenhub` CLI commands for image-gen and TTS. Use curl for content-parser (see `content-parser/SKILL.md` § API Reference).
- Always read config following `shared/config-pattern.md` before any interaction
- Follow `shared/cli-patterns.md` for polling, errors, and interaction patterns
- Never save files to `~/Downloads/` or `.listenhub/` — save content packages to the current working directory
- JSON parsing: use `jq` only (no python3, awk)
Language Adaptation: All UI text follows the user's input language. Chinese input → Chinese output. English input → English output. Mixed → follow dominant language.
Use AskUserQuestion for every multiple-choice step. One question at a time. Wait for the answer. After template is selected and input is understood, show a confirmation summary and wait for explicit approval before executing the pipeline.
API Key Check at Confirmation Gate: If the pipeline includes any remote API call (image-gen, content-parser, tts), check authentication before proceeding. For CLI-based calls (image-gen, TTS), run `listenhub auth login` if not authenticated. For content-parser calls, configure `LISTENHUB_API_KEY` (see `content-parser/SKILL.md` § Authentication). Pure text-only pipelines (e.g., topic → narration script without TTS) can proceed without authentication.
## Step -1: API Key Check
Deferred. API key is checked at the confirmation gate (Step 4) only when the pipeline requires remote API calls. See Hard Constraints above.
## Step 0: Config Setup
Follow `shared/config-pattern.md` Step 0 (Zero-Question Boot).
**If file doesn't exist** — silently create with defaults and proceed:
```bash
mkdir -p ".listenhub/creator" ".listenhub/creator/styles"
cat > ".listenhub/creator/config.json" << 'EOF'
{"outputMode":"download","language":null,"preferences":{"wechat":{"history":[]},"xiaohongshu":{"mode":"both","history":[]},"narration":{"defaultSpeaker":null,"history":[]}}}
EOF
CONFIG_PATH=".listenhub/creator/config.json"
CONFIG=$(cat "$CONFIG_PATH")
```
User style preferences are stored as markdown files in `.listenhub/creator/styles/`:
- `.listenhub/creator/styles/wechat.md`
- `.listenhub/creator/styles/xiaohongshu.md`
- `.listenhub/creator/styles/narration.md`
These files are plain markdown — one directive per line. If the file does not exist, no custom style is applied. Users can edit these files directly.
Note: `outputMode` defaults to `"download"` (not the usual `"inline"`) because creator always produces multi-file output folders that must be saved to disk.
**If file exists** — read config silently and proceed:
```bash
CONFIG_PATH=".listenhub/creator/config.json"
[ ! -f "$CONFIG_PATH" ] && CONFIG_PATH="$HOME/.listenhub/creator/config.json"
CONFIG=$(cat "$CONFIG_PATH")
```
### Setup Flow (user-initiated reconfigure only)
Only when user explicitly asks to reconfigure. Display current settings:
```
当前配置 (creator):
输出方式:{outputMode}
小红书模式:{both / cards / long-text}
```
Ask:
1. **outputMode**: Follow `shared/output-mode.md` § Setup Flow Question.
2. **xiaohongshu.mode**: "小红书默认模式?"
- "图文 + 长文(both)"
- "仅图文卡片(cards)"
- "仅长文(long-text)"
## Interaction Flow
### Step 1: Understand Input
The user provides input along with their request. Classify the input:
| Input Type | Detection | Auto Action |
|-----------|-----------|-------------|
| URL (web/article) | `http(s)://` prefix, not an audio/video URL | Will call content-parser (requires API key) |
| URL (audio/video) | Extension `.mp3/.mp4/.wav/.m4a/.webm` or domain is youtube.com/bilibili.com/douyin.com | Will download + call `coli asr` to transcribe |
| Local audio file | File path exists, extension is audio/video | Will call `coli asr` directly |
| Local text file | File path exists, extension is `.txt/.md/.json` | Read file content |
| Raw text | Multi-line or >50 chars, not a URL/path | Use directly as material |
| Topic/keywords | Short text (<50 chars), no URL/path pattern | AI writes from scratch |
**Style reference detection:** If the user's prompt contains keywords like "参考", "风格", "照着…写", "style", "reference", the associated input (file path / URL / pasted text) should be classified as a **style reference** rather than content material. A single request may contain both material and a style reference — classify them separately. If only a style reference is provided with no material or topic, this is a **standalone style learning** request (see Step 2.5).
**For URL (audio/video) inputs:**
1. Download to `/tmp/creator-{slug}.{ext}` using `curl -L -o`
2. Check `coli` is available: `which coli 2>/dev/null && echo yes || echo no`
3. If `coli` missing: inform user to install (`npm install -g @marswave/coli`), ask them to paste text instead
4. Transcribe: `coli asr -j --model sensevoice "/tmp/creator-{slug}.{ext}"`
5. Extract text from JSON result
6. Cleanup: `rm "/tmp/creator-{slug}.{ext}"`
**For URL (web/article) inputs:**
Content-parser will be called during pipeline execution (after confirmation).
### Step 2: Template Matching
If the user specified a platform in their prompt, match directly:
- "公众号", "wechat", "微信" → wechat
- "小红书", "xiaohongshu", "xhs" → xiaohongshu
- "口播", "narration", "脚本" → narration
If no platform was specified, ask via AskUserQuestion:
Question: "Which content template?" / "用哪个创作模板?"
Options (adapt language to user's input):
- "WeChat article (公众号长文)" — Long-form article with AI illustrations
- "Xiaohongshu (小红书)" — Image cards + long text post
- "Narration script (口播稿)" — Spoken script with optional audio
### Step 2.5: Topic Assistance
This step runs only when the user's input is a topic or keywords (short text <50 chars, no URL/path). Skip if user provided a URL, file, or substantial text.
1. Read the selected platform's `methodology.md`:
- WeChat: `creator/templates/wechat/methodology.md`
- Xiaohongshu: `creator/templates/xiaohongshu/methodology.md`
- Narration: `creator/templates/narration/methodology.md`
2. Evaluate the topic using the three-circle Venn model:
- 用户的专业领域 (creator's expertise)
- 读者的普遍兴趣 (reader interest)
- 当下的时间节点 (current timing/relevance)
3. Run HKR quality filter:
- **H (Happy)**: 足够有趣、有悬念?
- **K (Knowledge)**: 有信息量?看完能学到新东西?
- **R (Resonance)**: 能戳中情绪?让人"对对对我也这么想"?
4. If topic scores ≥2 of 3 HKR criteria: proceed with the topic.
5. If topic scores <2: proactively suggest 2-3 alternative angles to the user via AskUserQuestion.
6. If topic is vague: ask for more specifics — key points, personal experiences, what excites or frustrates them.
### Step 3: Style Extraction (if style reference provided)
This step runs only when the user provided a style reference in Step 1. If no style reference was detected, skip to Step 3b.
**Read the reference content:**
- Local file → Read tool
- URL → content-parser API (requires API key)
- Pasted text → use directly
**Analyze and extract style directives:**
AI reads the reference content and extracts 3-5 concrete style directives. Focus on observable patterns:
- Sentence length and paragraph structure
- Tone and register (formal/casual, first/third person)
- Use of rhetorical devices (questions, lists, bold, quotes)
- Vocabulary level and domain jargon
- Formatting habits (heading style, emoji usage, whitespace)
**Present to user for confirmation:**
```
从参考文章中提炼了以下风格特征:
1. {directive 1}
2. {directive 2}
3. {directive 3}
...
你可以修改或删除其中的条目。确认后本次生成会应用这些规则。
```
Wait for user confirmation. The confirmed directives become `sessionStyle` — applied to this generation only.
After user confirms the style directives, proactively ask whether to persist:
```
要将这些风格规则保存吗?(保存后每次生成{platform}内容都会应用)
```
If yes → write to `.listenhub/creator/styles/{platform}.md`. If no → only apply to this generation.
**Standalone style learning:** If the user only provided a style reference without material/topic (e.g., "学习一下这篇文章的风格"), run the extraction above, then **persist directly** to `.listenhub/creator/styles/{platform}.md` without asking — the user's intent to save is already explicit. Confirm with a brief message: "已保存到 styles/{platform}.md". Do not proceed to content generation.
### Step 3a: Prototype Classification
Read the selected platform's prototype file:
- WeChat: `creator/templates/wechat/article-prototypes.md`
- Xiaohongshu: `creator/templates/xiaohongshu/content-prototypes.md`
- Narration: `creator/templates/narration/script-prototypes.md`
Based on the user's material/topic, auto-match the best-fit prototype using the matching heuristics table in the prototype file.
Present the recommendation to the user via AskUserQuestion:
Question: "这篇内容最适合哪种写法?" / "Which content prototype fits best?"
Options: [list all prototypes for the platform, recommended one first with "(Recommended)" suffix]
The selected prototype determines the narrative structure and L3-5 review criteria for writing.
### Step 3b: Preset Selection (if applicable)
If the selected template uses illustration or card presets **and** the mode requires images, the preset MUST be chosen **before** the confirmation gate so it can be displayed in the summary.
**Skip this step entirely** for:
- Narration template (no visual presets)
- Xiaohongshu with `preferences.xiaohongshu.mode` = `"long-text"` (no cards or images generated)
Otherwise:
1. Read the template's preset section to get available presets and the topic-matching table.
2. **If the user already specified a preset** in their prompt (e.g., "用水彩风格"): use that preset directly.
3. **If not specified**: ask the user via AskUserQuestion. Output a one-line hint first: "配图风格可以随时换,先选一个开始吧". List all available presets with their Chinese labels (from frontmatter `label` field). Use the topic-matching table to put the most relevant option first (marked "Recommended"), but always let the user choose.
### Step 4: Confirmation Gate
**Check API key** if the pipeline needs remote APIs:
- WeChat template always needs image-gen → requires API key
- Xiaohongshu cards mode needs image-gen → requires API key
- Xiaohongshu long-text only → no API key needed
- Narration without TTS → no API key needed
- Web/article URL input → needs content-parser → requires API key (audio/video URLs use local `coli asr`, no API key needed)
If API key required and missing: for CLI-based calls, run `listenhub auth login`. For content-parser calls, configure `LISTENHUB_API_KEY` (see `content-parser/SKILL.md` § Authentication).
**Show confirmation summary:**
```
准备生成内容:
模板:{WeChat article / Xiaohongshu / Narration}
输入:{topic description / URL / text excerpt...}
输出目录:{slug}-{platform}/
需要 API 调用:{content-parser, image-gen, ...}
风格偏好:{styles/{platform}.md 已配置 / 使用默认风格}
配图/卡片预设:{preset label / 不适用}
文章/内容原型:{selected prototype name}
本次风格参考:{M条来自参考文章 / 无}
确认开始?
```
Wait for explicit "yes" / confirmation before proceeding.
### Step 5: Execute Pipeline
Read the selected template file and execute:
```bash
# The template file path
TEMPLATE="creator/templates/$PLATFORM/template.md"
STYLE="creator/templates/$PLATFORM/style.md"
```
**For URL inputs — extract content first:**
```bash
# Submit content extraction
RESPONSE=$(curl -sS -X POST "https://api.marswave.ai/openapi/v1/content/extract" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "Content-Type: application/json" \
-H "X-Source: skills" \
-d "{\"source\":{\"type\":\"url\",\"uri\":\"$INPUT_URL\"}}")
TASK_ID=$(echo "$RESPONSE" | jq -r '.data.taskId')
```
Then poll in background. Run this as a **separate Bash call** with `run_in_background: true` and `timeout: 600000` (per `shared/cli-patterns.md`). The polling loop itself runs up to 300s (60 polls × 5s); `timeout: 600000` is set higher at the tool level to give the Bash process headroom beyond the poll budget:
```bash
# Run with: run_in_background: true, timeout: 600000
TASK_ID=""
for i in $(seq 1 60); do
RESULT=$(curl -sS "https://api.marswave.ai/openapi/v1/content/extract/$TASK_ID" \
-H "Authorization: Bearer $LISTENHUB_API_KEY" \
-H "X-Source: skills" 2>/dev/null)
STATUS=$(echo "$RESULT" | tr -d '\000-\037\177' | jq -r '.data.status // "processing"')
case "$STATUS" in
completed) echo "$RESULT"; exit 0 ;;
failed) echo "FAILED: $RESULT" >&2; exit 1 ;;
*) sleep 5 ;;
esac
done
echo "TIMEOUT" >&2; exit 2
```
Extract content: `MATERIAL=$(echo "$RESULT" | jq -r '.data.data.content')`
If extraction fails: tell user "URL 解析失败,你可以直接粘贴文字内容给我" and stop.
**Then follow the platform template** — read `template.md` and execute each step. The template specifies the exact writing instructions and API calls. See `creator/templates/{platform}/template.md` for template contents.
**Writing engine integration:** Each platform's `template.md` now includes writing-engine references and a self-review loop. The template handles loading `writing-engine/` files, applying the selected prototype's narrative structure, and running L1-L4 quality review after writing. See each platform's `template.md` for details.
**Style application:** When writing content, apply style directives in this priority order (higher overrides lower):
1. `sessionStyle` — directives from the current style reference (Step 3), if any
2. `.listenhub/creator/styles/{platform}.md` — persisted user style directives (if file exists)
3. `templates/{platform}/style.md` — baseline platform style
**For image generation** (called by wechat and xiaohongshu templates):
```bash
RESPONSE=$(listenhub image create \
--prompt "" \
--aspect-ratio "" \
--json)
BASE64_DATA=$(echo "$RESPONSE" | jq -r '.candidates[0].content.parts[0].inlineData.data // .data')
# macOS uses -D, Linux uses -d (detect platform)
if [[ "$(uname)" == "Darwin" ]]; then
echo "$BASE64_DATA" | base64 -D > "{output-path}/{filename}.jpg"
else
echo "$BASE64_DATA" | base64 -d > "{output-path}/{filename}.jpg"
fi
```
On 429: exponential backoff (wait 15s → 30s → 60s), retry up to 3 times. On failure after retries: skip this image, annotate in output summary.
Generate images **sequentially** (not parallel) to respect rate limits.
**For TTS** (called by narration template when user wants audio):
```bash
listenhub tts create --text "$(cat /tmp/lh-content.txt)" --speaker "$SPEAKER_ID" --json \
| jq -r '.data' | base64 -D > "{slug}-narration/audio.mp3"
```
### Step 6: Assemble Output
Create the output folder and write all files:
```bash
SLUG="{topic-slug}"
OUTPUT_DIR="${SLUG}-{platform}"
# Dedup folder name
i=2; while [ -d "$OUTPUT_DIR" ]; do OUTPUT_DIR="${SLUG}-{platform}-${i}"; i=$((i+1)); done
mkdir -p "$OUTPUT_DIR"
```
Write content files per template spec. Then write `meta.json`:
```json
{
"title": "...",
"slug": "...",
"platform": "wechat|xiaohongshu|narration",
"date": "YYYY-MM-DD",
"tags": ["...", "..."],
"summary": "..."
}
```
### Step 7: Present Result
```
✅ 内容已生成!保存在 {OUTPUT_DIR}/
📄 {main files list}
🖼️ images/ — N 张配图(如有)
📋 meta.json — 标题、标签、摘要
```
(Adapt language to user's input language per Hard Constraints.)
### Step 8: Update Preferences
Record this generation in history:
```bash
NEW_CONFIG=$(echo "$CONFIG" | jq \
--arg platform "$PLATFORM" \
--arg date "$(date +%Y-%m-%d)" \
--arg topic "$TOPIC" \
'.preferences[$platform].history = (.preferences[$platform].history + [{"date": $date, "topic": $topic}])[-5:]')
echo "$NEW_CONFIG" > "$CONFIG_PATH"
```
Keep only the last 5 history entries per platform.
Note: `cardStyle` from the spec is deferred — not implemented in V1 config. Can be added later when card style customization is needed.
### Manual Style Tuning
**Adding style directives:**
If the user says "记住:{style directive}" or "remember: {style directive}":
1. Detect which platform it applies to (from context or ask)
2. Append the directive as a new line to `.listenhub/creator/styles/{platform}.md` (create the file if it doesn't exist)
This also applies after Step 3 (Style Extraction): if the user says "记住这个风格" after reviewing extracted directives, write all confirmed directives to `.listenhub/creator/styles/{platform}.md`.
**Resetting style:**
If the user says "重置风格偏好" or "reset style":
1. Ask which platform (or all)
2. Delete `.listenhub/creator/styles/{platform}.md`
## API Reference
- Authentication: `shared/cli-authentication.md`
- Image generation: CLI: `listenhub image create` (see `shared/cli-patterns.md`)
- Content extraction: `content-parser/SKILL.md` § API Reference (Inlined)
- TTS (text-to-speech): CLI: `listenhub tts create` (see `shared/cli-patterns.md`)
- Speaker selection: `shared/speaker-selection.md`
- Config pattern: `shared/config-pattern.md`
- Common patterns (polling, errors): `shared/cli-patterns.md`
- Output mode: `shared/output-mode.md`
## Composability
- **Invokes**: content-parser (URL extraction), image-gen (illustrations/cards), tts (narration audio), asr (audio/video transcription via `coli`)
- **Invoked by**: standalone — user triggers directly
- **Templates**: `creator/templates/{wechat,xiaohongshu,narration}/template.md` define per-platform pipelines
- **Style guides**: `creator/templates/{wechat,xiaohongshu,narration}/style.md` define per-platform writing tone