--- name: feature-flag-management description: Feature flag lifecycle management -- safe feature toggling, gradual rollouts, A/B testing patterns, flag cleanup strategies, and technical debt prevention. Covers LaunchDarkly, Unleash, OpenFeature, and custom implementations. version: 2.0.0 category: DevOps agents: [developer, devops, qa] tags: [ feature-flags, rollouts, a-b-testing, toggles, launchdarkly, openfeature, trunk-based-development, ] model: sonnet invoked_by: both user_invocable: true tools: [Read, Write, Edit, Bash, Glob, Grep, WebFetch] best_practices: - Define a clear lifecycle for every flag (create, enable, rollout, permanent, cleanup) - Use typed flag values with defaults that match the safe/existing behavior - Clean up flags within 30 days of full rollout to prevent technical debt - Never nest feature flags more than 2 levels deep - Always include kill-switch capability for new features behind flags error_handling: graceful streaming: supported verified: true lastVerifiedAt: '2026-03-01' --- # Feature Flag Management Skill Feature flag lifecycle specialist covering safe feature toggling, gradual rollouts, A/B testing patterns, and flag cleanup to prevent technical debt. Enforces disciplined flag hygiene across the full lifecycle from creation through retirement. - Design feature flag architecture with proper categorization (release, experiment, ops, permission) - Implement gradual rollout strategies (percentage, user-segment, canary, ring-based) - Configure A/B testing with feature flags and metrics collection - Plan flag cleanup workflows to prevent stale flag accumulation - Integrate with flag platforms (LaunchDarkly, Unleash, Flipt, OpenFeature SDK) - Implement custom feature flag systems for projects without external platforms - Set up flag-aware testing strategies (all flag combinations) - Monitor flag evaluation performance and stale flag detection ## Overview Feature flags decouple deployment from release, enabling trunk-based development, safe rollouts, and instant rollbacks. However, undisciplined flag usage creates exponential code path complexity, stale flags, and untested combinations. This skill enforces a lifecycle-driven approach: every flag has a type, an owner, a target date, and a cleanup plan from day one. ## When to Use - When implementing trunk-based development with continuous deployment - When rolling out features gradually to reduce risk - When setting up A/B testing infrastructure - When auditing existing codebases for stale or orphaned feature flags - When choosing between feature flag platforms - When implementing kill-switches for critical features ## Iron Laws 1. **ALWAYS** assign an owner and expiration date to every feature flag -- orphaned flags without owners accumulate indefinitely and become permanent tech debt. 2. **NEVER** nest feature flags more than 2 levels deep -- combinatorial explosion makes testing impossible (2 flags = 4 states, 5 flags = 32 states, 10 flags = 1024 states). 3. **ALWAYS** default flag values to the existing/safe behavior -- if the flag system fails, the application should behave as it did before the flag was added. 4. **NEVER** use feature flags as a substitute for configuration management -- flags are temporary toggles for release control, not permanent application settings. 5. **ALWAYS** clean up flags within 30 days of full rollout -- stale flags in code increase cognitive load, slow onboarding, and hide dead code paths. ## Anti-Patterns | Anti-Pattern | Why It Fails | Correct Approach | | -------------------------------------------------- | ---------------------------------------------------------------------------- | ------------------------------------------------------------------------------------- | | Creating flags without expiration dates or owners | Flags become permanent; nobody knows if they can be removed | Require owner and target-date fields at creation time; alert on overdue flags | | Nesting 3+ flags in conditional logic | Testing requires covering all combinations; bugs hide in untested paths | Limit nesting to 2 levels; combine related flags into a single multi-valued flag | | Defaulting new features to ON when flag is missing | Flag system outage enables untested features for all users | Default to OFF (existing behavior); explicitly enable after validation | | Using flags for permanent configuration | Config changes require code deploys to remove; defeats the purpose of config | Use environment variables or config files for permanent settings; flags are temporary | | Testing only with flags ON or only with flags OFF | Misses interaction bugs between flag states | Test both states; add flag-combination matrix to CI for critical flags | ## Workflow ### Step 1: Flag Classification Classify every flag before creation: | Type | Purpose | Lifetime | Example | | -------------- | ----------------------------------------- | ----------------------- | ---------------------------- | | **Release** | Control feature visibility during rollout | Days to weeks | `enable_new_checkout` | | **Experiment** | A/B test with metrics collection | Weeks to months | `experiment_pricing_page_v2` | | **Ops** | Kill-switch for operational control | Permanent (with review) | `circuit_breaker_payments` | | **Permission** | User/role-based access control | Permanent | `enable_admin_dashboard` | ### Step 2: Implementation Pattern ```typescript // OpenFeature SDK pattern (vendor-neutral) import { OpenFeature } from '@openfeature/server-sdk'; const client = OpenFeature.getClient(); // Typed flag evaluation with safe default const showNewUI = await client.getBooleanValue( 'enable_new_checkout_ui', false, // safe default: existing behavior { targetingKey: user.id, attributes: { plan: user.plan } } ); if (showNewUI) { renderNewCheckout(); } else { renderLegacyCheckout(); } ``` ### Step 3: Gradual Rollout Strategy ``` Phase 1: Internal (0-1 day) - Enable for development team - Verify in production environment Phase 2: Canary (1-3 days) - Enable for 1% of users - Monitor error rates, latency, business metrics Phase 3: Controlled Rollout (3-7 days) - Ramp: 5% -> 10% -> 25% -> 50% -> 100% - Hold at each stage for minimum 24 hours - Define rollback criteria before advancing Phase 4: Cleanup (within 30 days of 100%) - Remove flag checks from code - Remove flag from platform - Update documentation ``` ### Step 4: Flag-Aware Testing ```typescript // Test both flag states in CI describe('Checkout Flow', () => { describe('with new_checkout_ui enabled', () => { beforeEach(() => { flagProvider.setOverride('enable_new_checkout_ui', true); }); it('should render new checkout components', () => { // test new path }); }); describe('with new_checkout_ui disabled', () => { beforeEach(() => { flagProvider.setOverride('enable_new_checkout_ui', false); }); it('should render legacy checkout components', () => { // test legacy path }); }); }); ``` ### Step 5: Stale Flag Detection ```bash # Find flags older than 30 days that are fully rolled out # Custom script pattern for codebase scanning grep -rn 'isEnabled\|getBooleanValue\|getFlag' src/ | \ awk -F"'" '{print $2}' | \ sort -u > active_flags.txt # Compare against flag platform inventory # Flag any that are 100% enabled for > 30 days ``` ### Step 6: Cleanup Checklist For each flag being retired: - [ ] Remove all flag evaluation calls from code - [ ] Remove unused code path (the one not selected) - [ ] Remove flag from platform/configuration - [ ] Remove flag from test overrides - [ ] Update documentation referencing the flag - [ ] Verify no other flags depend on this flag - [ ] Deploy and verify behavior matches full-rollout state ## Complementary Skills | Skill | Relationship | | --------------------------- | -------------------------------------------------- | | `tdd` | Test-driven development for flag-guarded features | | `ci-cd-implementation-rule` | CI/CD pipeline integration with flag-aware deploys | | `qa-workflow` | QA validation across flag combinations | | `proactive-audit` | Audit for stale or orphaned flags in codebase | ## Memory Protocol (MANDATORY) **Before starting:** Read `.claude/context/memory/learnings.md` for prior feature flag patterns and platform-specific decisions. **After completing:** - New pattern -> `.claude/context/memory/learnings.md` - Issue found -> `.claude/context/memory/issues.md` - Decision made -> `.claude/context/memory/decisions.md` > ASSUME INTERRUPTION: Your context may reset. If it's not in memory, it didn't happen.