---
name: web-to-prd
description: "Scan a live web app with Playwright, extract all features, generate PRD/epics/stories with priorities and dependencies, export to Notion. Checks required MCP servers before starting."
allowed_tools:
  - Read
  - Write
  - Bash
  - Glob
  - Grep
  - WebSearch
  - mcp__playwright__browser_navigate
  - mcp__playwright__browser_click
  - mcp__playwright__browser_snapshot
  - mcp__playwright__browser_take_screenshot
  - mcp__playwright__browser_type
  - mcp__playwright__browser_tabs
  - mcp__playwright__browser_close
  - mcp__playwright__browser_hover
  - mcp__playwright__browser_fill_form
  - mcp__playwright__browser_select_option
  - mcp__playwright__browser_press_key
  - mcp__playwright__browser_navigate_back
  - mcp__playwright__browser_wait_for
  - mcp__playwright__browser_evaluate
  - mcp__playwright__browser_resize
  - mcp__playwright__browser_console_messages
  - mcp__playwright__browser_network_requests
  - mcp__playwright__browser_drag
  - mcp__playwright__browser_file_upload
  - mcp__playwright__browser_handle_dialog
  - mcp__playwright__browser_run_code
---

# Web-to-PRD Skill

Scan a live web app. Extract every feature. Turn it into a structured PRD with epics, stories, and tasks. Push it all to Notion.

## When to Use

- Reverse-engineer a competitor's product
- Document an existing app you're taking over
- Create a PRD from a live product (yours or someone else's)
- Build a feature backlog from scratch by looking at what's already built

## What This Skill Does

1. **Checks prerequisites** — makes sure Playwright MCP and Notion MCP are connected
2. **Crawls the web app** — navigates page by page, reads UI elements
3. **Extracts features** — groups what it finds into feature areas
4. **Generates PM artifacts** — PRD, epics, stories, tasks with priorities and dependencies
5. **Exports to Notion** — creates linked databases, populates everything

## Prerequisites

This skill uses two MCP servers: Playwright (required) and Notion (optional). The command checks both before starting.

### 1. Playwright MCP (browser control)

Claude uses Playwright to open a real browser, navigate pages, and read content.

#### Browser modes

| Mode | Install command | What it does |
|------|----------------|--------------|
| **Persistent profile (default)** | See setup below | Lightweight profile at `~/.playwright-profile`. Login once, remembered. Chrome stays open. No extension bloat. |
| CDP (advanced) | `--cdp-endpoint='http://localhost:9222'` | Connects to running Chrome. Has your logins but also loads all extensions (can be slow). |
| Chrome profile (heavy) | `--user-data-dir="[Chrome path]" --browser=chrome` | Uses the real Chrome profile. Has logins but loads ALL extensions — often causes timeouts. Not recommended. |
| Clean session | No extra flags | Fresh browser each time. No saved state. Public sites only. |

#### Default setup: Persistent profile (auto-installed by the command)

The `/spartan:web-to-prd` command handles installation itself. It uses a lightweight separate profile at `~/.playwright-profile` — no extensions, no bloat, fast startup.

**What the command does internally:**

```bash
claude mcp remove playwright 2>/dev/null || true
claude mcp add playwright -- npx @playwright/mcp@latest --user-data-dir="$HOME/.playwright-profile" --browser=chrome
```

**First run on a login-protected site:** Playwright opens Chrome with a clean profile. The user logs in manually. Cookies are saved to `~/.playwright-profile`. Later runs are already logged in.

**Why not the real Chrome profile?** Real Chrome profiles load ALL extensions (AdBlock, LastPass, password managers, etc.). These add latency, block requests, and often cause Playwright to time out or hang. A separate profile is faster and more reliable.

**Chrome can stay open.** Since we use a separate profile, there's no conflict.

#### Switching modes

To change mode, remove and re-add:

```bash
claude mcp remove playwright
claude mcp add playwright -- npx @playwright/mcp@latest [flags]
```

#### All Playwright MCP flags

| Flag | What it does |
|------|-------------|
| `--cdp-endpoint="http://localhost:9222"` | Connect to running Chrome via CDP |
| `--user-data-dir="/path"` | Persistent browser profile (keeps cookies) |
| `--storage-state="/path/to/state.json"` | Load saved cookies from file |
| `--isolated` | Fresh session, no persistent data |
| `--browser=chrome` | Use real Chrome instead of Chromium |
| `--headless` | No visible browser window |

All flags also work as env vars with the `PLAYWRIGHT_MCP_` prefix (e.g., `PLAYWRIGHT_MCP_CDP_ENDPOINT`).
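For example, passing the headless option as an env var instead of a flag might look like this (a minimal sketch; the `--env` option on `claude mcp add` and the exact `PLAYWRIGHT_MCP_HEADLESS` name are assumptions derived from the prefix convention above, so verify both against your versions):

```bash
# Hedged sketch: same effect as --headless, supplied via the env-var form.
# Assumes `claude mcp add --env` and the PLAYWRIGHT_MCP_HEADLESS variable name.
claude mcp remove playwright 2>/dev/null || true
claude mcp add playwright \
  --env PLAYWRIGHT_MCP_HEADLESS=true \
  -- npx @playwright/mcp@latest --user-data-dir="$HOME/.playwright-profile" --browser=chrome
```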
**How to verify Playwright MCP is installed:**

```bash
claude mcp list | grep -i playwright
```

**What it gives you:** `browser_navigate`, `browser_click`, `browser_snapshot`, `browser_type`, `browser_tabs`, and more.

### 2. Notion MCP (export destination)

Claude uses Notion MCP to create databases, pages, and views in your workspace.

**How to install:** The Notion MCP is available as a Claude.ai integration. Enable it from:

- Claude Code settings > MCP servers
- Or Claude Desktop > Settings > Integrations > Notion

**How to verify:**

```bash
claude mcp list | grep -i notion
```

**What it gives you:** `notion-create-database`, `notion-create-pages`, `notion-create-view`, `notion-search`, `notion-update-page`.

### Optional: Firecrawl MCP (faster crawling)

If the user has Firecrawl, use it instead of Playwright for the initial crawl. It's faster but costs money.

```bash
claude mcp add firecrawl -- npx firecrawl-mcp
```

Firecrawl is optional. Playwright alone handles everything.

---

## Prerequisite Check Logic

Run this check at the start.

**IMPORTANT: `claude mcp add/remove` does NOT make tools available mid-session.** MCP tools only load when Claude Code starts. Never try to install or reconfigure MCP servers during a running session — it won't work and wastes time.

```
CHECK 1: Playwright MCP
  A) Try calling any Playwright tool (e.g., browser_snapshot or browser_navigate)

     If tool works → check the config:
       Read .claude.json for playwright args
       If --user-data-dir points to ~/.playwright-profile → good, proceed
       If --user-data-dir points to real Chrome profile → warn user (extensions cause timeouts)
       If no --user-data-dir (clean mode) → OK for public sites
       If --cdp-endpoint → good, proceed

     If tool NOT found → Playwright MCP is not loaded. Show this message and STOP:

       "Playwright MCP is not available. I need it to open a browser.
        Run this in your terminal (outside Claude Code):

        claude mcp add playwright -- npx @playwright/mcp@latest --user-data-dir=$HOME/.playwright-profile --browser=chrome

        Then restart Claude Code and run /spartan:web-to-prd again."

     NEVER run `claude mcp add` or `claude mcp remove` yourself during the session.
     It changes the config file but won't load the tools until restart.

CHECK 2: Notion MCP (OPTIONAL — not a blocker)
  Try calling notion-search with a simple query
  If found → great, will export to Notion at the end
  If not found → note it, will save PRD locally instead. Continue with crawl.

Playwright OK → proceed to crawl
```
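The "Read `.claude.json` for playwright args" step in CHECK 1 can be approximated with a quick shell probe. A minimal sketch, assuming `jq` is installed and the entry sits under a top-level `mcpServers` key; the real location varies by scope (user vs. project), so treat it as a starting point:

```bash
# Look for --user-data-dir / --cdp-endpoint in the playwright server's args.
# Assumption: the entry lives under "mcpServers" in one of these files.
for cfg in ".claude.json" "$HOME/.claude.json"; do
  [ -f "$cfg" ] || continue
  echo "== $cfg =="
  jq -r '.mcpServers.playwright.args[]?' "$cfg" 2>/dev/null \
    | grep -E -- '--(user-data-dir|cdp-endpoint)' \
    || echo "  no --user-data-dir or --cdp-endpoint flag found (clean mode?)"
done
```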
**Notion is optional.** The PRD is always saved locally. Notion export is a bonus step at the end.

---

## Crawl Strategy

### Step 0: Clean up stale lock files (before every run)

Stale lock files from previous browser sessions can cause "Opening in existing browser session" errors.

**Only remove lock files — never kill processes:**

```bash
rm -f "$HOME/.playwright-profile/SingletonLock" \
      "$HOME/.playwright-profile/SingletonCookie" \
      "$HOME/.playwright-profile/SingletonSocket" 2>/dev/null
echo "Browser cleanup done"
```

**WARNING:** Do NOT run `pkill -f "playwright-profile"` — it kills the Playwright MCP server process too, disconnecting all browser tools mid-session.

If navigate still fails after cleanup → retry once after 2 seconds. If still fails → user needs to restart Claude Code.

### Step 1: Login FIRST (mandatory before crawling)

**Never start crawling without confirming access. Login is Step 1, not an afterthought.**

1. Navigate to the target URL
2. Take a snapshot — check for login signals (form fields, "Sign in" text, `/login` URL)
3. **If login page:**
   - STOP. Tell user: "Login page detected. Please log in in the browser window. Tell me when done."
   - Wait for user confirmation
   - Take snapshot to verify — still login page? Ask again. See dashboard? Proceed.
   - **Repeat until logged in.** Do NOT start crawling from a login page.
4. **If already logged in** (or public site):
   - Show the user what sections are visible
   - Ask: "Does this look like full access? Any sections I'm missing?"
   - Wait for confirmation before crawling

**Session expiry during crawl:** If redirected to login mid-crawl → STOP, tell user to re-login in the browser, wait for confirmation, then continue where you left off.

**Security rules:**

- Never use `browser_type` to enter passwords — user types directly in the browser
- Never ask for credentials in chat
- Never screenshot login pages
- SSO/OAuth popups work normally — just wait for user to complete

**Cookies:** With the persistent profile (`~/.playwright-profile`), logins are saved. Next run on the same site = already logged in.

### Step 2: Two-pass crawl

**Pass 1 — Map all pages (breadth-first):**
Visit every nav link, take a screenshot, note the page type, go back. Build a complete sitemap. Don't explore features deeply yet. Go back to home between sections. Show the sitemap to user and ask if anything is missing.

**Pass 2 — Deep exploration (exhaust every feature):**
Go through each page from the sitemap. On each page: try EVERY interactive element until there's nothing left to try. Click a button → opens a modal? → what's in the modal? → has a form? → what fields? → has a submit button? → what happens after submit? → follow every path until you hit a dead end or a page you already explored. Only move to the next page when you've exhausted all interactions on this page. The goal is to discover features that are 2-3 levels deep — hidden behind tabs, modals, sub-pages, or conditional UI.
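Pass 1's sitemap can be kept as a plain working file so Pass 2 and the coverage check have something concrete to diff against. A hypothetical bookkeeping sketch; the `sitemap.md` filename and the `record_page` helper are this example's inventions, not something the skill mandates:

```bash
# Hypothetical Pass 1 bookkeeping: one line per visited page, stored next to the
# screenshots folder this skill already uses. "sitemap.md" is an assumed filename.
mkdir -p .planning/web-to-prd/screenshots
record_page() {   # usage: record_page <path> <title> <page-type>
  printf -- '- %s | %s | type: %s\n' "$1" "$2" "$3" >> .planning/web-to-prd/sitemap.md
}
record_page "/"          "Homepage"  "landing"
record_page "/dashboard" "Dashboard" "dashboard"
cat .planning/web-to-prd/sitemap.md
```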
### Screenshots (mandatory)

Take a screenshot of every page and every important UI state. Save to `.planning/web-to-prd/screenshots/` with names like `01-homepage.png`, `02-dashboard.png`, `07-create-modal.png`. Include screenshot references in each Epic. Never screenshot login pages.

### For SPAs (single page apps)

SPAs don't have traditional page URLs. Use this approach:

1. Start at the root URL
2. Read the navigation/sidebar for all sections
3. Click each section, wait for content to load
4. Take snapshot after each navigation
5. Track visited states by URL hash or path changes

### Crawl depth limits

| App size | Max pages | Estimated time |
|----------|-----------|----------------|
| Small (< 10 pages) | All pages | 2-5 min |
| Medium (10-50 pages) | All pages | 5-15 min |
| Large (50+ pages) | Top 50, then ask user | 15+ min |

After every 10 pages, show progress:

> "Scanned 10/~25 pages. Found 3 feature areas so far. Continue?"

### Coverage Check (mandatory before generating PRD)

After crawling, show a coverage report: pages visited, screenshots taken, buttons clicked, modals found, forms found, tabs explored, filters tested. List all nav sections and mark which were explored vs skipped.

**Fail if:** any nav section not explored, fewer screenshots than pages, zero modals on a page with buttons (means you didn't click them), any section with only 1 interaction (you only looked, didn't try).

Ask user to confirm coverage before proceeding to PRD generation.
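Part of this report can be mechanized. A hedged helper sketch: it relies on the hypothetical `sitemap.md` bookkeeping file from the Pass 1 sketch above and only covers the screenshots-vs-pages check, not interaction counts:

```bash
# Compare screenshots captured against pages recorded during Pass 1.
# Assumes the hypothetical sitemap.md file from the earlier bookkeeping sketch.
PAGES=$(grep -c '^- ' .planning/web-to-prd/sitemap.md 2>/dev/null || true)
SHOTS=$(ls .planning/web-to-prd/screenshots/*.png 2>/dev/null | wc -l | tr -d ' ')
echo "pages recorded: ${PAGES:-0}, screenshots: $SHOTS"
if [ "$SHOTS" -lt "${PAGES:-0}" ]; then
  echo "FAIL: fewer screenshots than pages; revisit the gaps before generating the PRD"
fi
```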
---

## Feature Extraction

### What to extract from each page

For every page visited, capture:

```yaml
page:
  url: "/dashboard"
  title: "Dashboard"
  type: dashboard | list | detail | form | settings | landing | auth | empty
  features:
    - name: "Revenue Chart"
      type: data-display | form | action | navigation | filter | notification
      description: "Line chart showing monthly revenue with date range picker"
      ui_elements:
        - chart (line, with tooltips)
        - date range picker
        - export button
      interactions:
        - hover shows tooltip with exact value
        - date range filters the data
        - export downloads CSV
    - name: "Quick Actions Bar"
      type: action
      description: "Row of shortcut buttons: New Invoice, New Client, Reports"
      interactions:
        - each button navigates to respective page
```

### Feature grouping rules

After crawling, group features into **feature areas** (these become Epics):

1. **By navigation section** — sidebar/navbar sections are natural groupings
2. **By user goal** — what is the user trying to do?
3. **By data domain** — features that touch the same data belong together

Example groupings:

```
Epic: User Management
  - User list with search/filter
  - User profile page
  - User invite flow
  - Role assignment
  - User deactivation

Epic: Billing & Payments
  - Invoice list
  - Create invoice form
  - Payment tracking
  - Subscription management
  - Billing settings
```

### Priority assignment

Assign priority based on visibility and complexity:

| Priority | Criteria |
|----------|----------|
| P0 - Must have | Core user flow, app doesn't work without it |
| P1 - Should have | Important but app is usable without it |
| P2 - Nice to have | Enhancement, polish, edge case handling |
| P3 - Future | Advanced feature, nice but not needed now |

**Heuristics:**

- Main navigation items → P0 or P1
- Settings/config pages → P1 or P2
- Empty states, onboarding → P2
- Social features, sharing → P2 or P3

### Dependency mapping

Map dependencies between features:

```
Epic: Authentication (must build first)
  → Epic: User Management (needs auth)
    → Epic: Team Management (needs users)
      → Epic: Permissions (needs teams)

Epic: Product Catalog (independent)
  → Epic: Shopping Cart (needs products)
    → Epic: Checkout (needs cart)
      → Epic: Order Management (needs checkout)
```

Rules for dependencies:

- CRUD operations: Create before Read/List before Update before Delete
- Auth is always first
- Data display depends on data input
- Settings depend on the feature they configure

---

## PRD Generation

### Structure

Generate a PRD with 8 sections. **Each Epic is a mini-PRD** — a developer reads one Epic and knows exactly what to build.

```
1. TL;DR — 2 sentences, what this app does
2. Goals — Business goals, user goals, non-goals
3. User Stories — Grouped by persona/role
4. Epics (mini-PRDs) — THE MAIN SECTION. Each epic has full detail (see below)
5. User Flows — End-to-end flows connecting stories across epics
6. Narrative — 200-word story from user's POV
7. Build Roadmap — Phased plan with dependency graph
8. Open Questions — Things that need human input
```

**Section 4 (Epics) is the core.** Epics are ordered by build priority (Epic 1 = build first).

**Each Epic is a FULL PRD with 6 sections:**

```
Epic N: [Name]
├── 1. TL;DR — what this epic solves, who it's for
├── 2. Goals — business goals, user goals, non-goals
├── 3. User Stories — As a [user], I want...
├── 4. Functional Requirements — every feature with screenshots, UI detail, priority
├── 5. User Experience — entry point, flow (step by step), edge cases, design notes
└── 6. Narrative — 100-word user story for this epic
```

**No epic can skip any section.** Every epic gets all 6 sections. This is what makes the PRD actionable — a developer reads one Epic page and knows exactly what to build.

**Screenshots are embedded in Section 4** next to the features they show. Not at the end, not as links — inline with the content.

**Be as detailed as possible.** Describe every button, every form field, every table column, every filter option.

**Try EVERY feature.** Click every menu, open every modal, test every filter. Missing features = useless PRD.

See the command file (`web-to-prd.md`) for the full template with examples.

---

## Notion Export

### Database Structure

Create a parent page with one sub-page per Epic:

```
Parent page: "[App Name] — PRD"
├── PRD overview (sections 1-3, 5-8)
├── Epic 1: [Name] (full page with screenshots, features, acceptance criteria)
├── Epic 2: [Name] (full page with screenshots, features, acceptance criteria)
├── ...
└── Optional: Epics overview database (for filtering/sorting, links to pages)
```

**Each Epic = a full Notion page**, not a database row with a Description field. Include:

- Full content from the PRD Section 4 format
- Screenshots embedded directly in the page (visible, not links)
- Every feature detail: user story, UI description, step-by-step, acceptance criteria

**Screenshots must be uploaded to Notion**, not just saved locally. Place them next to the features they document.

### Export steps

1. **Ask where to put it:**
   > "Where should I create the backlog in Notion?
   > A) Create a new page in your workspace root
   > B) Add it under an existing page (I'll search for it)
   > C) Just generate the PRD locally, don't push to Notion"
2. **Create parent page** with the PRD content and one sub-page per Epic (per the structure above)
3. **Create Epics database** with all epics
4. **Create Stories database** linked to Epics
5. **Create Tasks database** linked to Stories
6. **Create views:**
   - Kanban view (by Status) for Stories
   - Timeline view (by Phase) for Epics
   - Table view (default) for Tasks

### If Notion MCP is not available

Save everything locally:

```
.planning/web-to-prd/
├── prd.md                # Full PRD document
├── epics.md              # All epics with stories
├── dependency-graph.md   # Visual dependency map
└── screenshots/          # Page screenshots (if taken)
```

User can import to Notion manually later.
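If you want to create that local structure up front, a quick scaffold using the same paths:

```bash
# Create the local fallback layout up front (same paths as the tree above).
mkdir -p .planning/web-to-prd/screenshots
touch .planning/web-to-prd/{prd.md,epics.md,dependency-graph.md}
```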
---

## Rules

1. **Always check prerequisites first.** Don't start crawling before running the prerequisite check (Playwright MCP is required; Notion MCP is optional).
2. **Login before crawling.** Never generate a PRD from a login page or public-only view. If the app has login, handle it first. Verify you see the full app before starting.
3. **Confirm access level.** After login, show the user what sections are visible and ask if anything is missing. A PRD from a limited view is useless.
4. **Handle session expiry.** If redirected to login mid-crawl, STOP and ask user to re-login. Don't crawl from a login page.
5. **Show progress during crawl.** Every 10 pages or every major section, update the user.
6. **Don't guess features you can't see.** Only document what's visible in the UI. Mark assumptions clearly.
7. **Ask before clicking destructive actions.** If you see "Delete" or "Remove" buttons, don't click them during crawl.
8. **Handle errors gracefully.** If a page fails to load, note it and move on. Don't stop the whole crawl.
9. **Respect rate limits.** Add 1-2 second delays between page navigations to avoid being blocked.
10. **Screenshots are mandatory.** Take them for every page and embed them in the markdown PRD using `![name](path)` syntax.
11. **Login is the user's job.** Never store or ask for production credentials. Use headed mode for manual login.
12. **Local save is always available.** Even if Notion export fails, the PRD is saved locally.
13. **One app per run.** Don't crawl multiple domains in a single session.
14. **NEVER point `--user-data-dir` to the real Chrome profile directory** (e.g., `~/Library/Application Support/Google/Chrome` on Mac, `~/.config/google-chrome` on Linux). This can corrupt Chrome profiles, delete saved logins, and break the user's browser. Always use a separate directory like `~/.playwright-profile`.
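A hedged self-check for that last rule; it assumes `claude mcp get playwright` prints the configured command and args (output format varies by Claude Code version), so adapt before relying on it:

```bash
# Warn if the Playwright MCP config references a real Chrome profile directory.
# Assumption: `claude mcp get playwright` prints the configured command and args.
CFG_OUTPUT=$(claude mcp get playwright 2>/dev/null)
for real in "Application Support/Google/Chrome" ".config/google-chrome"; do
  if printf '%s' "$CFG_OUTPUT" | grep -qF -- "$real"; then
    echo "WARNING: --user-data-dir appears to point at a real Chrome profile ($real)"
  fi
done
echo "If no warning printed above, the configured profile dir looks safe."
```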