---
name: feature-results
description: Post-launch analysis and results documentation. Document what shipped and what we learned.
disable-model-invocation: false
user-invocable: true
---

# /feature-results - Post-Launch Analysis

Document feature results and capture learnings.

## Quick Start

```
/feature-results
```

Then provide:

1. **Feature name** and ship date
2. **Original hypothesis** (or I'll pull it from your PRD)
3. **Actual results** (metrics, or I'll query your analytics MCP)

I'll generate a results doc with: executive summary, results vs targets,
root cause analysis (if missed), segment breakdown, estimate calibration,
stakeholder communication guidance, and concrete next steps.

**Output:** Saved to `thoughts/shared/pm/analyses/feature-results-[name]-[date].md`
**Time:** ~10 min with data ready, ~20 min if we need to dig

---

## Pre-Launch Detection

Before generating a results analysis, verify the feature has actually shipped.

**Check for launch signals:**

- Does the PRD in `thoughts/shared/pm/prds/` show status "Shipped" or "Launched"?
- Does the PM mention actual results data (not projections)?

**If the feature hasn't launched yet:**

```
It looks like [feature] hasn't shipped yet. Post-launch analysis works best with real data.

For pre-launch work, you might want:
- `/feature-metrics` - Define what you'll measure before launch
- `/impact-sizing` - Estimate expected impact
- `/experiment-decision` - Plan your A/B test or rollout
- `/launch-checklist` - Make sure you're ready to ship

Want to proceed anyway (e.g., analyzing a soft launch or pilot), or switch to one of these?
```

---

## When to Use

- After A/B test completes
- After feature reaches full rollout
- During quarterly reviews
- For portfolio analysis

---

## Quick Start Prompt

When PM types `/feature-results`, respond:

```
Let's document the results of your feature launch.

Tell me:
1. What feature shipped?
2. What were the original goals/metrics?
3. What actually happened?

I'll help you create a results doc that captures outcomes and learnings.
```

---

## Results Doc Structure

### 1. Executive Summary

- One paragraph: What shipped, did it work, what's next

### 2. Original Hypothesis

- What we expected to happen
- Why we believed it

### 3. Results

- Primary metric: Target vs. Actual
- Guardrail metrics: Any issues?
- Unexpected findings

### 4. Analysis

- Why did results occur?
- What surprised us?
- What didn't we anticipate?

### 5. Learnings

- What would we do differently?
- What should we apply elsewhere?

### 6. Next Steps

- Keep / Iterate / Kill?
- Follow-up experiments
- Related opportunities

---

## Output Template

```markdown
# Feature Results: [Feature Name]

**Ship Date:** [Date]
**Analysis Date:** [Date]
**Owner:** [PM Name]

---

## Executive Summary

[1-2 sentences: What shipped, key result, recommendation]

**Verdict:** [Ship to 100% / Iterate / Kill]

---

## Original Hypothesis

**If we** [built X],
**then** [Y would happen],
**because** [Z assumption].

**Target:** [Primary metric] would improve by [X%]

---

## Results

### Primary Metric

| Metric | Baseline | Target | Actual | Verdict    |
| ------ | -------- | ------ | ------ | ---------- |
| [Name] | [X]      | [Y]    | [Z]    | [Hit/Miss] |

### Statistical Significance

- Sample size: [N]
- Confidence level: [X%]
- P-value: [X]

### Guardrail Metrics

| Metric   | Acceptable Range | Actual   | Status       |
| -------- | ---------------- | -------- | ------------ |
| [Metric] | [range]          | [actual] | [OK/Warning] |

### Unexpected Findings

- [Finding 1]
- [Finding 2]

---

## Analysis

### Why These Results?

[Explain the underlying reasons for the results]

### What Surprised Us?

[Things we didn't expect]

### Segment Breakdown

| Segment     | Result  | Notes     |
| ----------- | ------- | --------- |
| [Segment 1] | [+/-X%] | [insight] |
| [Segment 2] | [+/-X%] | [insight] |

---

## Learnings

### What Worked

- [Learning 1]
- [Learning 2]

### What Didn't Work

- [Learning 1]
- [Learning 2]

### Apply Elsewhere

- [Insight that applies to other features]

---

## Next Steps

**Recommendation:** [Ship / Iterate / Kill]

**If shipping:**

- [ ] Roll out to 100%
- [ ] Monitor for [X] weeks
- [ ] Update documentation

**If iterating:**

- [ ] [Specific change 1]
- [ ] [Specific change 2]
- [ ] Re-run experiment

**If killing:**

- [ ] Document why
- [ ] Rollback plan
- [ ] Communicate to stakeholders

---

## Appendix

- [Link to experiment dashboard]
- [Link to original PRD]
- [Link to detailed data]
```

---

## Root Cause Analysis

When results miss targets, diagnose using this framework:

### Discovery Problem (Users don't know it exists)

- **Check:** Feature awareness rate, in-app visibility, announcement reach
- **Common causes:** Buried in UI, no onboarding tooltip, weak announcement
- **Fix:** Improve discovery through tooltips, email campaign, in-app prompts

### Quality Problem (Users try it but it doesn't work well)

- **Check:** Error rates, completion rates, time-on-task, feedback sentiment
- **Common causes:** Bugs, poor UX, confusing workflow, slow performance
- **Fix:** Fix bugs, simplify UX, improve performance

### Relevance Problem (Users try it but it doesn't solve their problem)

- **Check:** Use case analysis, segment breakdown, feature-problem fit
- **Common causes:** Wrong target audience, edge cases not handled, misaligned with actual needs
- **Fix:** Narrow targeting, add missing use cases, revisit problem definition

### Frequency Problem (Users use it once but don't come back)

- **Check:** Repeat usage rates, habit formation metrics, triggers for return
- **Common causes:** Not integrated into daily workflow, one-time value, no triggers
- **Fix:** Add recurring value hooks, notifications, integration with other workflows

---

## Estimate vs Actual Calibration

| Metric             | Original Estimate    | Actual Result | Accuracy               |
| ------------------ | -------------------- | ------------- | ---------------------- |
| [Primary metric]   | [From impact-sizing] | [Actual]      | [Over/Under by X%]     |
| [Secondary metric] | [From impact-sizing] | [Actual]      | [Over/Under by X%]     |
| [Guardrail metric] | [Expected range]     | [Actual]      | [Within/Outside range] |

**Calibration insight:** [Are we consistently over/under-estimating? By how much? What should we adjust for next time?]

**Historical calibration pattern:**

- Look at past 3-5 feature results. If we consistently over-estimate by 30-50%, apply a discount factor to future impact-sizing.
- If we consistently under-estimate, we may be too conservative or missing amplification effects.
- Track this over time to improve estimation accuracy.

---

## Segment Breakdown Guidance

Always analyze results across these default segments. Segment analysis often reveals that an overall miss is actually a win for one segment and a miss for another.

| Segment Dimension                             | Why It Matters                                                       |
| --------------------------------------------- | -------------------------------------------------------------------- |
| **New vs existing users**                     | New users may not find the feature; existing users may resist change |
| **Plan tier (Free/Team/Business/Enterprise)** | Feature value often varies dramatically by tier                      |
| **Company size**                              | Small teams adopt differently than large orgs                        |
| **Power users vs casual users**               | Power users may love it; casual users may be confused                |
| **Geographic region (if applicable)**         | Cultural, language, and connectivity differences                     |

**Segment analysis template:**

| Segment        | N (sample) | Primary Metric | vs Target | Insight |
| -------------- | ---------- | -------------- | --------- | ------- |
| New users      | [N]        | [result]       | [+/-X%]   | [why]   |
| Existing users | [N]        | [result]       | [+/-X%]   | [why]   |
| Free tier      | [N]        | [result]       | [+/-X%]   | [why]   |
| Paid tier      | [N]        | [result]       | [+/-X%]   | [why]   |
| Power users    | [N]        | [result]       | [+/-X%]   | [why]   |
| Casual users   | [N]        | [result]       | [+/-X%]   | [why]   |

---

## How to Communicate Results

**If results beat targets:** Lead with the win, credit the team, connect to strategy. Share in #product-team, include in weekly status update, highlight in next board deck.

**If results miss targets:** Lead with what you learned, not what went wrong. Frame as: "Here's what we hypothesized, here's what actually happened, here's what we're doing about it." Never hide bad results -- own them fast.

**If results are ambiguous:** Be honest about uncertainty. "We saw X but can't conclusively attribute it to Y because Z. Here's our plan to get clearer signal."

### Audience-Specific Framing

| Audience             | Lead With                                               | Include                                             | Avoid                              |
| -------------------- | ------------------------------------------------------- | --------------------------------------------------- | ---------------------------------- |
| **CEO/Board**        | Impact on key business metrics + strategic implications | Revenue impact, user growth, competitive position   | Technical details, minor metrics   |
| **CPO/Manager**      | Learnings + next steps + resource ask                   | What worked, what didn't, proposed plan             | Blame, excuses                     |
| **Engineering team** | What worked technically + what to improve               | Performance data, technical wins, tech debt notes   | Business jargon                    |
| **Sales/CS**         | Customer-facing impact + talking points                 | User quotes, feature highlights, objection handlers | Internal debates, negative framing |
| **Design**           | User behavior changes + UX insights                     | Usage patterns, qualitative feedback, heatmaps      | Pure metric tables without context |

---

## Example Patterns

### Pattern 1: Clear Win

**Scenario:** Feature shipped, primary metric exceeded target by 20%. Guardrails held.
**Root cause:** N/A (success)
**Action:** Scale to 100%, plan v2 based on segment insights, draft announcement via `/catalyst-pm-ops:slack-message`, update `/catalyst-pm-ops:status-update` with win.
**Communication:** Lead with the number, credit the team, connect to quarterly goals.

### Pattern 2: Mixed Results

**Scenario:** Primary metric hit target but guardrail metric degraded (e.g., page load time increased 15%).
**Root cause:** Quality problem -- feature added weight to core flow.
**Action:** Investigate quality regression, hold at current rollout %, fix guardrail before scaling. Create follow-up ticket via `/catalyst-pm-ops:create-tickets`.
**Communication:** "Feature is working as intended for users, but we identified a performance regression we need to fix before full rollout."

### Pattern 3: Clear Miss

**Scenario:** Primary metric at 60% of target after 4 weeks.
**Root cause:** Discovery problem -- only 12% of eligible users found the feature. Among those who found it, conversion was above target.
**Action:** Don't kill the feature -- improve discovery (add tooltip, email campaign, onboarding mention), re-measure in 4 weeks.
**Communication:** "The feature works for users who find it, but we have a distribution problem. Plan: improve discovery and re-measure."

### Pattern 4: Kill Decision

**Scenario:** Primary metric at 30% of target after 6 weeks. Root cause is relevance -- users tried it but it doesn't solve their actual problem.
**Root cause:** Relevance problem -- hypothesis was wrong.
**Action:** Document the learning, roll back, redirect engineering effort. Draft decision doc via `/decision-doc`.
**Communication:** "We tested our hypothesis that X would solve Y. Data shows it doesn't. Here's what we learned and where we're redirecting effort."

---

## Tips

- **Be honest about failures** - Negative results are still valuable
- **Capture learnings immediately** - Don't wait; context fades
- **Share widely** - Others can learn from your experiments
- **Connect to strategy** - How does this inform our roadmap?
- **Run root cause analysis** - Don't just report the miss; diagnose why
- **Check segments** - An overall miss may hide a segment win
- **Calibrate estimates** - Track accuracy over time to improve future sizing

---

## Context Routing Strategy

When the PM uses `/feature-results`, I automatically:

### 1. Pull Original PRD & Hypothesis

**Source:** `thoughts/shared/pm/prds/`, conversation history

- **What I look for:** Original hypothesis, target metrics, success criteria
- **How I use it:** Compare actual results against original targets
- **Example:** Auto-populate "Original Hypothesis" section from the PRD you shared

### 2. Query Analytics MCPs

**Source:** PostHog, PostHog, Posthog, PostHog (if connected)

- **What I look for:** Real-time metrics for the feature being analyzed
- **How I use it:** Populate results tables with actual data
- **Fallback:** If no MCP connected, I'll ask you to provide the metrics

### 3. Check Related Experiments

**Source:** `thoughts/shared/pm/metrics/`, past feature results

- **What I look for:** Similar features' results, patterns in what works
- **How I use it:** Add context like "This compares to our pricing redesign which saw 12% improvement"

### 4. Extract Guardrail Metrics

**Source:** `thoughts/shared/pm/metrics/`, PRD success metrics section

- **What I look for:** Metrics that must not decline (retention, NPS, etc.)
- **How I use it:** Pre-populate guardrail metrics table with relevant metrics

### 5. Identify Learnings for Portfolio

**Source:** Cross-reference with other feature results

- **What I look for:** Patterns across multiple experiments
- **How I use it:** Suggest "Apply Elsewhere" section based on similar features

### 6. Route Results for Sharing

**Routing logic:**

- **Ship to 100%:** Draft announcement for `/catalyst-pm-ops:slack-message`
- **Iterate:** Create follow-up experiment plan with `/experiment-decision`
- **Kill:** Draft decision doc for `/decision-doc` explaining sunset
- **Executive update:** Format for `/catalyst-pm-ops:status-update` with business impact

---

## Output Quality Self-Check

Before delivering the feature results doc, verify:

- [ ] **Executive summary** is 1-2 sentences max and includes a clear verdict (Ship/Iterate/Kill)
- [ ] **Original hypothesis** is stated in If/Then/Because format
- [ ] **Primary metric** has baseline, target, and actual with statistical significance noted
- [ ] **Guardrail metrics** are checked and status is clear (OK/Warning/Breach)
- [ ] **Root cause analysis** is included if results missed targets (Discovery/Quality/Relevance/Frequency)
- [ ] **Segment breakdown** covers at least 3 dimensions (new vs existing, tier, user type)
- [ ] **Estimate vs actual calibration** table is populated with accuracy percentages
- [ ] **Stakeholder communication** section has audience-appropriate framing
- [ ] **Next steps** are specific, actionable, and have owners
- [ ] **Learnings** include at least one "apply elsewhere" insight
- [ ] **No corporate jargon** -- results are written in plain, direct language
- [ ] **Linked to original PRD** and any related experiments