---
name: risk-register
description: Document risks for changes touching auth, data, or migrations. Lists top risks, how to test/monitor them, and rollback strategy.
tools: Read, Grep, Glob
user-invocable: true
---

# Risk Register

For changes that touch sensitive areas (authentication, data, migrations, infrastructure), document the risks explicitly. This is what senior developers do naturally - making it explicit ensures nothing is overlooked.

## When to Use

Use this skill when your change involves:

- **Authentication/Authorization** - Login, sessions, permissions, tokens
- **User Data** - PII, passwords, payment info, user content
- **Data Migrations** - Schema changes, data transformations, backfills
- **External Integrations** - Third-party APIs, webhooks, OAuth
- **Infrastructure** - Deployment, scaling, configuration changes
- **Breaking Changes** - API changes, behavioral changes, deprecations

## Quick Start

```
/risk-register
```

Or specify the change:

```
/risk-register "Adding password reset functionality"
```

## The Risk Register Template

For each risky change, complete this register:

```markdown
# Risk Register: [Feature/Change Name]

## Summary
Brief description of what's changing and why it's sensitive.

## Top 3 Risks

### Risk 1: [Name]
**Likelihood:** Low / Medium / High
**Impact:** Low / Medium / High / Critical
**Description:** What could go wrong?
**Mitigation:** How are we preventing this?
**Detection:** How would we know if this happened?
**Response:** What do we do if it happens?

---

### Risk 2: [Name]
...

### Risk 3: [Name]
...

## Testing Strategy

### Pre-deployment
- [ ] Unit tests cover the change
- [ ] Integration tests for critical paths
- [ ] Manual testing of edge cases
- [ ] Security review completed

### Post-deployment
- [ ] Smoke test in production
- [ ] Monitor error rates
- [ ] Watch for anomalies in [specific metrics]
- [ ] Verify [specific functionality] works

## Monitoring & Alerting

What should we watch after deployment?

| Metric | Normal Range | Alert Threshold | Response |
|--------|--------------|-----------------|----------|
| Login failure rate | < 5% | > 10% | Check auth service |
| API error rate | < 1% | > 5% | Investigate errors |
| ... | ... | ... | ... |

## Rollback Strategy

### Can we rollback?
Yes / Partial / No (explain why)

### Rollback steps
1. [Step 1]
2. [Step 2]
3. [Step 3]

### Rollback time estimate
[X minutes/hours]

### Data implications
What happens to data created after deployment if we rollback?

## Approval

- [ ] Engineer reviewed risks
- [ ] Security reviewed (if auth/data)
- [ ] Stakeholder aware of risks
```

## Risk Categories

### Authentication Risks

| Risk | Impact | Common Mitigations |
|------|--------|-------------------|
| Session hijacking | Critical | Secure cookies, HTTPS, token rotation |
| Credential stuffing | High | Rate limiting, MFA, breach detection |
| Token leakage | Critical | Short expiry, secure storage, no logging |
| Privilege escalation | Critical | Strict authz checks, principle of least privilege |
| Account takeover | Critical | Email verification, suspicious activity alerts |

### Data Risks

| Risk | Impact | Common Mitigations |
|------|--------|-------------------|
| Data loss | Critical | Backups, soft deletes, transaction safety |
| Data corruption | Critical | Validation, constraints, idempotency |
| Data leakage | Critical | Access controls, encryption, audit logs |
| Privacy violation | High | PII handling, consent, data minimization |
| Compliance breach | High | Audit trails, retention policies |

### Migration Risks

| Risk | Impact | Common Mitigations |
|------|--------|-------------------|
| Failed migration | High | Dry runs, backups, reversible migrations |
| Data inconsistency | High | Validation checks, reconciliation |
| Downtime | Medium | Rolling deploys, feature flags |
| Performance degradation | Medium | Index analysis, query optimization |

## Example Risk Register

```markdown
# Risk Register: Password Reset Feature

## Summary
Adding password reset via email. Touches auth system, sends emails with tokens, allows password changes without current password.

## Top 3 Risks

### Risk 1: Token Theft/Replay
**Likelihood:** Medium
**Impact:** Critical
**Description:** Reset tokens could be intercepted or reused to take over accounts.
**Mitigation:**
- Tokens expire in 1 hour
- Single use (invalidated after use)
- Tokens are cryptographically random (32 bytes)
- HTTPS only

**Detection:**
- Alert on multiple reset attempts for same user
- Log all password resets with IP

**Response:**
- Invalidate all tokens for affected user
- Force password change
- Notify user of suspicious activity

---

### Risk 2: Email Enumeration
**Likelihood:** High
**Impact:** Medium
**Description:** Attackers could use the reset form to discover which emails have accounts.
**Mitigation:**
- Same response for valid/invalid emails
- Rate limiting on reset endpoint
- CAPTCHA after 3 attempts

**Detection:**
- Monitor for high volume of reset requests
- Alert on requests from same IP for many emails

**Response:**
- Block IP temporarily
- Enable additional rate limiting

---

### Risk 3: Token Logged/Exposed
**Likelihood:** Low
**Impact:** Critical
**Description:** Reset token appears in logs, error messages, or URLs shared externally.
**Mitigation:**
- Token in POST body, not URL
- Logging excludes token field
- Error messages are generic

**Detection:**
- Grep logs for token patterns
- Review error handling

**Response:**
- Purge affected logs
- Rotate any exposed tokens
- Notify affected users

## Testing Strategy

### Pre-deployment
- [x] Unit tests for token generation, validation, expiry
- [x] Integration test for full reset flow
- [x] Test expired token rejection
- [x] Test reused token rejection
- [x] Security review of token handling

### Post-deployment
- [ ] Smoke test: Complete reset flow in production
- [ ] Monitor email delivery rate
- [ ] Watch for spike in reset requests

## Monitoring & Alerting

| Metric | Normal | Alert | Response |
|--------|--------|-------|----------|
| Reset requests/hour | < 100 | > 500 | Check for abuse |
| Reset completion rate | > 80% | < 50% | Check email delivery |
| Failed reset attempts | < 10% | > 30% | Check token generation |

## Rollback Strategy

### Can we rollback?
Yes - feature flag controls access to reset endpoint.

### Rollback steps
1. Disable `PASSWORD_RESET_ENABLED` feature flag
2. Invalidate all outstanding reset tokens
3. Communicate to support team

### Rollback time estimate
~5 minutes (feature flag toggle)

### Data implications
Outstanding reset tokens will be invalidated. Users mid-reset will need to retry.
```

## Integration with Wiggum

When wiggum detects changes to auth, data, or migrations, it should prompt:

```
This change touches [auth/data/migrations]. Should we create a risk register? (y/n)
```

If yes, use this skill to document risks before proceeding.

## Remember

- **Be specific** - "data loss" is too vague; "orphaned records if parent deleted" is actionable
- **Be honest** - If you can't roll back, say so
- **Think like an attacker** - What would you try if you wanted to break this?
- **Think like ops** - How would you know something is wrong at 3am?

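The "think like ops" point gets easier to act on once a register's Monitoring & Alerting table is written down as data that a check can evaluate. The sketch below is one possible encoding of the alert thresholds from the password reset example. The metric keys (`reset_requests_per_hour` and so on), the `AlertRule` class, and the `evaluate` function are illustrative assumptions, not part of this skill or of any monitoring tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AlertRule:
    """One row of a Monitoring & Alerting table (hypothetical structure)."""
    metric: str
    threshold: float
    breach_when_above: bool  # False means the alert fires when the value drops below
    response: str

    def breached(self, value: float) -> bool:
        if self.breach_when_above:
            return value > self.threshold
        return value < self.threshold


# Alert thresholds taken from the example register; the metric keys are assumed names.
RULES = [
    AlertRule("reset_requests_per_hour", 500, True, "Check for abuse"),
    AlertRule("reset_completion_rate", 0.50, False, "Check email delivery"),
    AlertRule("failed_reset_rate", 0.30, True, "Check token generation"),
]


def evaluate(readings: dict[str, float]) -> list[str]:
    """Return one message per rule whose current reading breaches its threshold."""
    alerts = []
    for rule in RULES:
        value = readings.get(rule.metric)
        if value is not None and rule.breached(value):
            alerts.append(f"{rule.metric}={value}: {rule.response}")
    return alerts
```

Calling `evaluate` with a dictionary of current readings returns only the rows that need attention, each paired with the response the register already agreed on, so whoever is on call does not have to reconstruct the plan.
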
The goal isn't to prevent all risks - it's to **know what the risks are** and have a plan.
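
A register is easier to review when each mitigation maps to something checkable in code. As an illustration, the mitigations listed under Risk 1 of the password reset example (32 random bytes, one-hour expiry, single use) could be sketched as follows. `ResetTokenStore` and its methods are hypothetical names, the in-memory dictionary stands in for a real token table, and storing only a SHA-256 digest of the token is an added assumption that the example register does not mention.

```python
import hashlib
import secrets
import time

TOKEN_TTL_SECONDS = 3600  # "Tokens expire in 1 hour"


class ResetTokenStore:
    """In-memory stand-in for a real token table (hypothetical)."""

    def __init__(self, clock=time.time):
        self._clock = clock  # injectable so tests can control time
        self._tokens = {}    # sha256(token) -> (user_id, expires_at)

    @staticmethod
    def _digest(token: str) -> str:
        # Assumption: keep only a digest so a leaked table does not expose usable tokens.
        return hashlib.sha256(token.encode()).hexdigest()

    def issue(self, user_id: str) -> str:
        token = secrets.token_urlsafe(32)  # 32 cryptographically random bytes
        self._tokens[self._digest(token)] = (user_id, self._clock() + TOKEN_TTL_SECONDS)
        return token

    def redeem(self, token: str):
        """Return the user id for a valid token, or None. A token works at most once."""
        entry = self._tokens.pop(self._digest(token), None)  # pop enforces single use
        if entry is None:
            return None
        user_id, expires_at = entry
        if self._clock() >= expires_at:
            return None  # expired tokens are rejected (and already removed)
        return user_id

    def invalidate_user(self, user_id: str) -> None:
        """Response step: invalidate all outstanding tokens for an affected user."""
        self._tokens = {d: e for d, e in self._tokens.items() if e[0] != user_id}
```

The pre-deployment checks in the example ("Test expired token rejection", "Test reused token rejection") then correspond directly to calls on `redeem`, which is the kind of traceability between register and tests that the template is meant to encourage.
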