---
name: genre-skill-builder
description: Meta-skill for creating genre-analysis-based writing skills. Analyzes a corpus of article sections, discovers clusters, and generates complete skills with phases, cluster guides, and techniques.
---

# Genre Skill Builder

You help researchers create **writing skills** based on systematic genre analysis. Given a corpus of article sections (introductions, conclusions, methods, discussions, etc.), you guide users through analyzing genre patterns, discovering clusters, and generating a complete skill that can guide future writing.

## What This Skill Does

This is a **meta-skill**: it creates other skills. The output is a fully functional writing skill like `lit-writeup` or `interview-bookends`, with:
- A main `SKILL.md` with genre-based guidance
- Phase files for a structured writing workflow
- Cluster profiles based on discovered patterns
- Technique guides for sentence-level craft

## When to Use This Skill

Use this skill when you want to:
- Create a writing guide for a **specific article section** (e.g., Discussion, Abstract, or Methods sections)
- Base guidance on **empirical analysis** of a corpus rather than intuition
- Generate a skill that follows the **repository's phased architecture**
- Produce **cluster-based guidance** that recognizes different writing styles

## What You Need

1. **A corpus of article sections** (30+ recommended)
   - Text files, PDFs, or markdown
   - All from the same section type (all introductions, all conclusions, etc.)
   - Ideally from target venues (e.g., *Social Problems*, *Social Forces*)

2. **A model skill to learn from**
   - An existing skill like `lit-writeup` or `interview-bookends`
   - Provides structural template for the generated skill

## Connection to Other Skills

This skill adapts the methodology from:

| Skill | What We Borrow |
|-------|----------------|
| **interview-analyst** | Systematic coding approach (Phases 1-3) |
| **lit-writeup** | Cluster-based writing guidance structure |
| **interview-bookends** | Benchmarks and coherence checking |

## Core Principles

1. **Empirical grounding**: All guidance derives from corpus analysis, not intuition.

2. **Cluster discovery**: Different articles do the same job in different ways; identify the styles.

3. **Quantitative + qualitative**: Count features AND interpret patterns.

4. **Template-based generation**: Use parameterized templates, not free-form writing.

5. **Pauses for judgment**: Human decisions shape cluster boundaries and naming.

6. **The user is the expert**: They know the genre; we provide methodological support.

## Workflow Phases

### Phase 0: Scope Definition & Model Selection
**Goal**: Define what we're building and what to learn from.

**Process**:
- Identify the target article section (introduction, conclusion, methods, discussion, etc.)
- Select an existing skill as a structural model
- Review model skill to identify elements to extract
- Confirm corpus location and article count

**Output**: Scope definition memo with target section, model skill, corpus path.

> **Pause**: User confirms scope and model selection.

---

### Phase 1: Corpus Immersion
**Goal**: Build quantitative profile of the corpus.

**Process**:
- Count articles, calculate word counts, paragraph counts
- Identify structural patterns (headings, subsections)
- Generate descriptive statistics (median, IQR, range)
- Flag outliers and notable examples
- Create initial observations about variation

**Output**: Immersion report with corpus statistics.

> **Pause**: User reviews quantitative profile.
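
The profiling steps above can be sketched in a few lines of Python. This is a minimal illustration, not the skill's actual implementation: the function name `profile_word_counts`, the whitespace-based word count, and the 1.5 × IQR outlier rule are all assumptions chosen for the example.

```python
import statistics


def profile_word_counts(sections):
    """Summarize word counts for a mapping of article id -> section text.

    Assumes whitespace tokenization and Tukey's 1.5 * IQR rule for outliers;
    both are illustrative choices, not requirements of the skill.
    """
    counts = {name: len(text.split()) for name, text in sections.items()}
    values = sorted(counts.values())
    # Quartiles computed with the "inclusive" method (data treated as the full population).
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = sorted(name for name, n in counts.items() if n < low or n > high)
    return {
        "n_articles": len(values),
        "median": median,
        "iqr": iqr,
        "range": (values[0], values[-1]),
        "outliers": outliers,
    }
```

Calling it on a dictionary of section texts returns the figures an immersion report would typically tabulate, plus a list of articles worth a closer look.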

---

### Phase 2: Systematic Genre Coding
**Goal**: Code each article for genre features.

**Process**:
- Develop codebook based on model skill's categories
- Code opening moves, structural elements, rhetorical strategies
- Track frequency and co-occurrence of features
- Build article-by-article coding database
- Identify preliminary cluster candidates

**Output**: Codebook, article codes, preliminary clusters.

> **Pause**: User reviews codebook and sample codes.
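
Frequency and co-occurrence tracking can be sketched as follows. The data shape (article id mapped to a list of code labels) and the function name `tally_codes` are hypothetical; the actual coding database may be structured differently.

```python
from collections import Counter
from itertools import combinations


def tally_codes(article_codes):
    """Count how often each code appears and how often pairs of codes co-occur.

    `article_codes` maps an article id to the codes assigned to it
    (an assumed structure for this sketch).
    """
    feature_counts = Counter()
    pair_counts = Counter()
    for codes in article_codes.values():
        # Deduplicate and sort so each pair has one canonical ordering.
        unique = sorted(set(codes))
        feature_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return feature_counts, pair_counts
```

Pairs that co-occur far more often than their individual frequencies would suggest are natural starting points for preliminary cluster candidates.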

---

### Phase 3: Pattern Interpretation & Cluster Discovery
**Goal**: Identify stable patterns and define cluster profiles.

**Process**:
- Analyze code co-occurrence patterns
- Define 3-6 clusters and their characteristics
- Calculate benchmarks for each cluster
- Identify signature moves and prohibited moves
- Extract exemplar quotes/passages
- Name clusters meaningfully

**Output**: Cluster profiles with benchmarks and exemplars.

> **Pause**: User confirms cluster definitions.
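
Per-cluster benchmarks can be computed once each article has a cluster label. The sketch below assumes a hypothetical record format with `cluster` and `word_count` keys and reports only size, share of corpus, and median word count; a real analysis would likely track more features.

```python
import statistics
from collections import defaultdict


def cluster_benchmarks(records):
    """Compute size, corpus share, and median word count per cluster.

    `records` is a list of dicts with 'cluster' and 'word_count' keys
    (an assumed format for illustration).
    """
    by_cluster = defaultdict(list)
    for record in records:
        by_cluster[record["cluster"]].append(record["word_count"])
    total = len(records)
    return {
        name: {
            "n": len(values),
            "share": len(values) / total,
            "median_words": statistics.median(values),
        }
        for name, values in by_cluster.items()
    }
```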

---

### Phase 4: Skill Generation
**Goal**: Generate the complete skill file structure.

**Process**:
- Generate `SKILL.md` using template + findings
- Generate phase files (typically 3-4 for writing skills)
- Generate cluster guide files (one per cluster)
- Generate technique guide files
- Generate `plugin.json`
- Prepare `marketplace.json` entry

**Output**: Complete skill directory structure.

> **Pause**: User reviews generated skill files.

---

### Phase 5: Validation & Testing
**Goal**: Verify skill quality and test with sample input.

**Process**:
- Check that all files are syntactically correct
- Verify benchmarks match analysis data
- Ensure cluster coverage is complete
- Identify any gaps or inconsistencies
- Optionally test with sample input

**Output**: Validation report with quality assessment.
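
Two of the checks above lend themselves to automation: confirming that JSON files parse and that no template placeholder was left unfilled. The following is a rough sketch under the assumption that generated files are available as a mapping of filename to text; `validate_files` is a hypothetical helper, not part of the skill.

```python
import json
import re

# Matches leftover template parameters such as {{skill_name}}.
PLACEHOLDER = re.compile(r"\{\{\s*\w+\s*\}\}")


def validate_files(files):
    """Return a list of problems found in a mapping of filename -> file text."""
    problems = []
    for name, text in files.items():
        for match in PLACEHOLDER.findall(text):
            problems.append(f"{name}: unfilled placeholder {match}")
        if name.endswith(".json"):
            try:
                json.loads(text)
            except json.JSONDecodeError as exc:
                problems.append(f"{name}: invalid JSON ({exc.msg})")
    return problems
```

An empty list means these mechanical checks passed; benchmark accuracy and cluster coverage still need a manual comparison against the Phase 3 outputs.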

---

## Folder Structure for Analysis

```
project/
├── corpus/                 # Article sections to analyze
│   ├── article-01.md
│   ├── article-02.md
│   └── ...
├── analysis/
│   ├── phase0-scope/       # Scope definition
│   ├── phase1-immersion/   # Quantitative profiling
│   ├── phase2-coding/      # Genre coding
│   ├── phase3-clusters/    # Pattern analysis
│   ├── phase4-generation/  # Generated skill files
│   └── phase5-validation/  # Quality assessment
└── output/                 # Final skill plugin
    └── plugins/[skill-name]/
```
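
If it helps to create this layout up front, a small script along these lines would do it. The directory names are copied from the tree above; the function name `scaffold` and its arguments are assumptions for the example.

```python
from pathlib import Path

# Analysis subdirectories, one per workflow phase (names taken from the tree above).
PHASE_DIRS = [
    "phase0-scope",
    "phase1-immersion",
    "phase2-coding",
    "phase3-clusters",
    "phase4-generation",
    "phase5-validation",
]


def scaffold(root, skill_name):
    """Create the analysis folder structure and return the output plugin directory."""
    root = Path(root)
    (root / "corpus").mkdir(parents=True, exist_ok=True)
    for phase in PHASE_DIRS:
        (root / "analysis" / phase).mkdir(parents=True, exist_ok=True)
    output_dir = root / "output" / "plugins" / skill_name
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir
```

Because every `mkdir` call uses `exist_ok=True`, the script is safe to re-run on a project that already has some of these folders.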

## Code Categories to Track

Based on model skills, these are typical genre features to code:

### Structural Features
- Word count, paragraph count
- Presence of subsections
- Heading structure
- Position of key elements

### Opening Moves
- Phenomenon-led, stakes-led, theory-led, case-led, question-led
- First sentence type
- Hook strategy

### Rhetorical Moves
- Gap identification
- Contribution claims
- Limitations
- Future directions
- Callbacks (for conclusions)

### Citation Patterns
- Citation density
- Integration style (parenthetical, author-subject, quote-then-cite)
- Anchor sources vs. supporting citations

### Linguistic Features
- Hedging level
- Temporal markers
- Transition patterns
- Key phrases

## Cluster Discovery Guidelines

### Minimum Clusters: 3
If fewer than 3 patterns emerge, the corpus may be too homogeneous or the coding scheme too coarse.

### Maximum Clusters: 6
More than 6 typically indicates over-differentiation; look for higher-level groupings.

### Cluster Naming
Name clusters by their **dominant strategy**, not their prevalence:
- "Gap-Filler" not "Cluster 1"
- "Theory-Extension" not "Common Type"
- "Problem-Driven" not "Applied Approach"

### Cluster Validation
Each cluster should have:
- At least 10% of the corpus (or at least 3 articles when the corpus has fewer than 30)
- Distinctive benchmark values
- Clear signature moves
- At least one exemplar article
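
The count and size rules can be checked mechanically. The sketch below takes one reading of the size rule, namely a floor of 10% of the corpus rounded up, never lower than 3 articles; the function names and the `cluster_sizes` mapping are hypothetical. The remaining criteria (distinctive benchmarks, signature moves, exemplars) still call for human judgment.

```python
def min_cluster_size(corpus_size):
    """Smallest acceptable cluster: 10% of the corpus (rounded up), but at least 3."""
    ten_percent = (corpus_size + 9) // 10  # integer ceiling of corpus_size / 10
    return max(3, ten_percent)


def check_clusters(cluster_sizes, corpus_size):
    """Return a list of guideline violations for a mapping of cluster name -> article count."""
    problems = []
    if not 3 <= len(cluster_sizes) <= 6:
        problems.append(f"expected 3-6 clusters, found {len(cluster_sizes)}")
    floor = min_cluster_size(corpus_size)
    for name, size in cluster_sizes.items():
        if size < floor:
            problems.append(f"{name}: {size} articles is below the minimum of {floor}")
    return problems
```

For example, with a 40-article corpus the floor is 4 articles, so a cluster of 2 would be flagged for merging or re-examination.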

## Template System

Phase 4 uses parameterized templates. Key parameters:

| Parameter | Source |
|-----------|--------|
| `{{skill_name}}` | Phase 0 user input |
| `{{target_section}}` | Phase 0 user input |
| `{{cluster_names}}` | Phase 3 cluster discovery |
| `{{benchmarks}}` | Phase 1-3 statistics (corpus-wide and per-cluster) |
| `{{opening_moves}}` | Phase 2 coding |
| `{{signature_phrases}}` | Phase 2-3 analysis |
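
As an illustration of how `{{parameter}}` substitution might work, here is a minimal renderer. It is a sketch only: the skill does not specify a template engine, and the `render` function and its fail-on-missing behavior are assumptions.

```python
import re

# Matches {{name}} placeholders, allowing optional inner whitespace.
PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, params):
    """Fill {{name}} placeholders from `params`, failing loudly on missing names."""
    missing = sorted({name for name in PLACEHOLDER.findall(template) if name not in params})
    if missing:
        raise KeyError(f"missing template parameters: {', '.join(missing)}")
    return PLACEHOLDER.sub(lambda match: str(params[match.group(1)]), template)
```

Raising on a missing parameter, rather than leaving the placeholder in place, keeps an incomplete Phase 3 analysis from silently producing a half-filled skill file.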

## Technique Guides

Each phase has its own guide. Consult the relevant file for phase-specific instructions:

| Guide | Purpose |
|-------|---------|
| `phases/phase0-scope.md` | Scope definition, model selection |
| `phases/phase1-immersion.md` | Quantitative profiling |
| `phases/phase2-coding.md` | Genre coding methodology |
| `phases/phase3-interpretation.md` | Cluster discovery |
| `phases/phase4-generation.md` | Skill file generation |
| `phases/phase5-validation.md` | Quality verification |

## Templates

| Template | Purpose |
|----------|---------|
| `templates/skill-template.md` | Main SKILL.md structure |
| `templates/phase-template.md` | Phase file structure |
| `templates/cluster-template.md` | Cluster profile structure |
| `templates/technique-template.md` | Technique guide structure |

## Invoking Phase Agents

Use the Task tool to launch an agent for each phase, filling in the bracketed placeholders:

```
Task: Phase 2 Genre Coding
subagent_type: general-purpose
model: sonnet
prompt: Read phases/phase2-coding.md and execute for [user's project]. Corpus is in [location]. Model skill is [skill name].
```

## Model Recommendations

| Phase | Model | Rationale |
|-------|-------|-----------|
| **Phase 0**: Scope | **Sonnet** | Planning, structural decisions |
| **Phase 1**: Immersion | **Sonnet** | Counting, statistics |
| **Phase 2**: Coding | **Sonnet** | Systematic processing |
| **Phase 3**: Interpretation | **Opus** | Pattern recognition, cluster naming |
| **Phase 4**: Generation | **Opus** | Template adaptation, prose quality |
| **Phase 5**: Validation | **Sonnet** | Verification, checking |

## Starting the Process

When the user is ready to begin:

1. **Ask about the target**:
   > "What article section do you want to create a writing skill for? (e.g., introduction, conclusion, discussion, methods)"

2. **Ask about the corpus**:
   > "Where is your corpus of articles? How many articles do you have?"

3. **Ask about the model skill**:
   > "Which existing skill should I use as a structural model? Options include `lit-writeup` (Theory sections) and `interview-bookends` (intro/conclusion). I can also review other skills if you prefer."

4. **Ask about output**:
   > "What should the new skill be named? (e.g., `discussion-writer`, `methods-guide`)"

5. **Proceed with Phase 0** to formalize scope.

## Key Reminders

- **Corpus size matters**: 30+ articles recommended for stable clusters.
- **Variation is the goal**: A homogeneous corpus won't reveal clusters.
- **Human judgment required**: Cluster boundaries and names need user input.
- **Templates constrain**: Generated skills follow established patterns, not novel structures.
- **Test the output**: The best validation is using the generated skill.
- **Iteration expected**: First-pass clusters often need refinement.