--- name: beta-testing description: Beta testing strategy for iOS/macOS apps. Covers TestFlight program setup, beta tester recruitment, feedback collection methodology, user interviews, signal-vs-noise interpretation, and go/no-go launch readiness decisions. Use when planning a beta, setting up TestFlight, collecting user feedback, or deciding if ready to launch. allowed-tools: [Read, Write, Glob, Grep, AskUserQuestion] --- # Beta Testing Strategy End-to-end beta testing workflow for Apple platform apps — from TestFlight setup through feedback collection to launch readiness decision. ## When This Skill Activates Use this skill when the user: - Wants to run a beta test or plan a beta program - Needs to set up TestFlight for internal or external testing - Asks how to collect user feedback during beta - Wants to know if their app is ready to launch - Needs help designing a beta feedback survey - Asks "how many beta testers do I need?" - Mentions "TestFlight", "beta testers", "soft launch", or "launch readiness" - Wants to plan a structured beta rollout - Needs a go/no-go framework for shipping ## Process ### Phase 1: TestFlight Program Strategy Before recruiting testers, understand Apple's two-tier beta system. #### Internal Testing | Attribute | Details | |-----------|---------| | Max testers | 100 | | App Review required | No | | Build availability | Immediate after upload | | Tester requirement | Must be App Store Connect users | | Best for | Team, close collaborators, developer friends | | Expiration | 90 days from build upload | **Use internal testing for:** - Catching crashes and obvious bugs before anyone else sees the app - Validating core flows work end-to-end - Getting feedback from people who will be honest (friends, fellow devs) #### External Testing | Attribute | Details | |-----------|---------| | Max testers | 10,000 | | App Review required | Yes (Beta App Review — usually 24-48 hours) | | Build availability | After Beta App Review approval | | Tester requirement | Anyone with an email address and iOS/macOS device | | Best for | Real users, broader audience, validating product-market fit | | Expiration | 90 days from build upload | **Use external testing for:** - Validating with real users who have no context about your app - Stress-testing with diverse devices, OS versions, and usage patterns - Gathering signal on whether the value proposition resonates #### Group Management Create tester cohorts for targeted feedback: | Cohort | Size | Purpose | What to Ask | |--------|------|---------|-------------| | Power Users | 10-20 | Deep feature testing, edge cases | Does this handle your advanced workflows? | | Casual Users | 20-50 | First-impression and onboarding quality | Was anything confusing in the first 5 minutes? | | Accessibility Testers | 5-10 | VoiceOver, Dynamic Type, color contrast | Can you complete core tasks with accessibility features? | | Domain Experts | 5-10 | Validate domain-specific correctness | Is the [domain] logic accurate and trustworthy? | **TestFlight group tips:** - Name groups clearly (e.g., "Wave 1 - Power Users", "Wave 2 - General") - Use different "What to Test" descriptions per group - Stagger group invitations — don't invite everyone at once ### Phase 2: Beta Tester Recruitment #### How Many Testers Per Stage | Stage | Testers | Duration | Goal | |-------|---------|----------|------| | Internal alpha | 10-20 | 1-2 weeks | Crash-free, core flows work | | External wave 1 | 50-200 | 2 weeks | Validate UX, find confusion points | | External wave 2 | 200-1,000 | 2 weeks | Stress-test, validate at scale | | Open beta (optional) | 1,000+ | 1-2 weeks | Final validation, build buzz | **Rule of thumb:** You need at least 50 external testers to get meaningful signal. Below 50, individual preferences dominate. #### Where to Find Beta Testers **High-quality sources (engaged, will give feedback):** - **Indie dev communities** — Indie Dev Monday, iOS Dev Weekly Slack, Mastodon #iosdev - **Reddit communities** — r/iOSBeta, r/apple, domain-specific subreddits (r/productivity, r/fitness, etc.) - **Twitter/X** — Post "Looking for beta testers for [one-line pitch]" with a screenshot - **ProductHunt Upcoming** — List your app, collect emails before launch - **Friends and family** — Honest if you tell them honesty matters more than kindness **Medium-quality sources (volume, less feedback):** - **BetaList** — Submit for free listing - **Beta testing communities** — BetaFamily, ErliBird - **Discord servers** — App development and domain-specific servers **Low-quality sources (avoid for signal):** - Random giveaway sites (testers who just want free stuff) - Paid beta testers (financially motivated, not genuine users) #### Incentive Strategies | Incentive | Best For | Notes | |-----------|----------|-------| | Early access | All testers | The default — being first is often enough | | Lifetime free / pro unlock | Power users | Strong motivator, limited cost to you | | Credit in app (About screen) | Engaged testers | Recognition matters to some users | | Direct access to developer | Power users | They feel heard, you get deep feedback | | Discount at launch | Wave 2+ testers | Good for larger cohorts | **What NOT to offer:** Cash payment for testing. It attracts the wrong people and biases feedback. ### Phase 3: Feedback Collection Methodology Collect feedback through multiple channels — different methods catch different signals. #### Channel 1: In-App Feedback Form Build a simple feedback mechanism directly in the app. Three fields are enough: ``` 1. What's broken? (bugs, crashes, errors) 2. What's confusing? (UX that doesn't make sense) 3. What's missing? (features you expected but didn't find) ``` **Implementation tips:** - Add a "Send Feedback" button in Settings (always visible during beta) - Include device info, OS version, and app version automatically - Attach a screenshot option (but don't require it) - Send to a dedicated email or use a simple form service - Timestamp and tag by tester cohort #### Channel 2: TestFlight Built-In Feedback TestFlight's native feedback is surprisingly useful: - Users take a screenshot, annotate it, and add text - Feedback arrives in App Store Connect with device/OS metadata - Crash reports are automatic - No extra infrastructure needed **Tip:** In your TestFlight "What to Test" field, be specific: ``` This week, please test: 1. Creating a new [item] from scratch 2. Editing an existing [item] 3. Sharing [item] with someone Report anything confusing or broken via TestFlight feedback (screenshot + description). ``` #### Channel 3: Short Surveys (Max 5 Questions) Send a survey at the end of each beta wave. Use AskUserQuestion to help design survey questions tailored to the app. **Template survey (adapt per app):** 1. **How would you rate the overall experience?** (1-5 stars) 2. **What was the most confusing part?** (free text) 3. **What feature would make you use this daily?** (free text) 4. **Would you pay for this app?** (Yes / No / Maybe — if Yes, how much?) 5. **How likely are you to recommend this to a friend?** (0-10 NPS scale) **Survey rules:** - Maximum 5 questions — completion rate drops 20% per additional question - Always include one open-ended question (the best insights come from free text) - Send via email, not in-app (don't interrupt usage) - Send 5-7 days after they start testing (enough time to form opinions) #### Channel 4: 1-on-1 User Interviews The highest-signal feedback channel. Do 5-8 interviews per beta wave. **Who to interview:** - 2-3 testers who used the app heavily (understand power user needs) - 2-3 testers who tried it once and stopped (understand drop-off reasons) - 1-2 testers from the accessibility cohort **Logistics:** - 20-30 minutes via video call or phone - Record with permission (for your notes, not public) - Take written notes even if recording ### Phase 4: User Interview Guide #### The 5 Key Questions Ask these in order. Each builds on the previous: 1. **"What were you trying to do when you opened the app?"** - Reveals their mental model and expectations - Listen for: Does their goal match your intended use case? 2. **"Walk me through what happened."** - Have them narrate their experience step by step - Listen for: Where did they pause, backtrack, or get stuck? 3. **"What did you expect to happen at [specific moment]?"** - Reveals UX gaps between expectation and reality - Listen for: Mismatches between your design and their mental model 4. **"What was the most confusing part?"** - Direct question about pain points - Listen for: Recurring themes across multiple interviews 5. **"If this app cost $X/month, what would make it worth paying for?"** - Reveals perceived value and priority features - Listen for: What they mention first (that's the hero feature) #### How to Listen (Interview Discipline) **Do:** - Nod and say "Tell me more about that" - Ask "Why?" at least twice to get past surface answers - Take notes on exact phrases they use (their words, not your interpretation) - Let silence sit — people fill silence with valuable thoughts - Ask "What else?" after they finish answering **Don't:** - Defend your design decisions ("Well, the reason it works that way is...") - Explain how to use the feature ("Oh, you just need to swipe left") - Lead the witness ("Don't you think the onboarding was clear?") - Dismiss feedback ("That's an edge case" or "Nobody else reported that") - Promise to fix things in the interview ("We'll definitely add that") **The golden rule:** Your job is to understand their experience, not to educate them about your app. If they're confused, the app is confusing — full stop. ### Phase 5: Interpreting Beta Feedback #### Signal vs. Noise Framework Not all feedback is equal. Use frequency to determine priority: | Frequency | Classification | Action | |-----------|---------------|--------| | 1 person mentions it | Anecdote | Note it, don't act yet | | 3 people mention it | Pattern | Investigate, consider fixing | | 5+ people mention it | Must-fix | Fix before launch | | 10+ people mention it | Showstopper | Fix immediately, send new build | **Important:** Weight feedback by tester quality. One thoughtful power user's detailed report is worth more than ten casual testers saying "looks good." #### Categorization Framework Assign every piece of feedback to a priority level: | Priority | Category | Examples | Action | |----------|----------|----------|--------| | P0 — Critical | Crash / data loss | App crashes on launch, saved data disappears, sync destroys content | Fix immediately, push new build within 24 hours | | P1 — High | Broken flow | Cannot complete core task, flow dead-ends, save doesn't work | Fix before next beta wave | | P2 — Medium | UX confusion | Users don't find feature, misunderstand UI, take wrong path | Fix before launch | | P3 — Low | Nice-to-have | Feature requests, polish suggestions, "it would be cool if..." | Add to backlog, consider for v1.1 | #### Distinguishing Problem Types Two types of negative feedback require different solutions: **"I don't understand how to do X"** = UX problem - The feature exists but is hard to find or use - Solution: Redesign the flow, add onboarding hints, improve labels - Usually cheaper and faster to fix **"I don't need X"** = Feature/product problem - The feature exists but doesn't match user needs - Solution: Rethink whether this feature belongs in v1, or pivot the approach - More expensive to fix — may require cutting or rebuilding **How to tell the difference:** Ask "If this feature were easier to use, would you use it?" If yes, it's UX. If no, it's a product problem. ### Phase 6: Go/No-Go Launch Decision #### The Decision Framework After completing beta testing, evaluate across three categories: #### Green Light (Ship It) All of these must be true: - [ ] Crash-free rate > 99.5% (measured over at least 1,000 sessions) - [ ] Core flow completion rate > 80% (users can accomplish the primary task) - [ ] NPS > 30 (more promoters than detractors) - [ ] Zero P0 bugs open - [ ] Zero P1 bugs open - [ ] No data loss or privacy issues - [ ] App Store Review Guidelines compliance verified - [ ] Performance acceptable on oldest supported device #### Yellow Light (Ship with Caution) Acceptable to launch, but address soon: - Minor UX confusion that doesn't block core flow - Missing nice-to-have features that testers requested - P2 bugs that affect < 10% of users - NPS between 20-30 (decent but not enthusiastic) - Polish issues (animations, visual alignment, copy) - Crash-free rate 99.0-99.5% **Decision:** Launch, but fix yellow items in v1.0.1 within 1-2 weeks. #### Red Light (Do Not Ship) If any of these are true, delay launch: - Crash-free rate < 99% (app is unstable) - Core flow completion rate < 60% (users can't do the main thing) - Any P0 bugs open (crashes, data loss) - Data loss or corruption bugs exist - Privacy issues (data leaking, missing privacy labels, no consent flow) - NPS < 0 (more detractors than promoters) - App rejected in Beta App Review (signals App Store Review will also reject) - Core flow broken on common device/OS combinations **Decision:** Go back to development. Fix all red items. Run another beta wave. Re-evaluate. #### Making the Call Use this checklist format to present the decision: ```markdown # Go/No-Go Decision: [App Name] v[X.Y] ## Date: [Date] ## Beta Duration: [X weeks] ## Total Testers: [N] ## Sessions Analyzed: [N] ### Green Criteria - [ ] Crash-free rate: [XX.X%] (target: >99.5%) - [ ] Core flow completion: [XX%] (target: >80%) - [ ] NPS score: [XX] (target: >30) - [ ] P0 bugs: [0/N] open - [ ] P1 bugs: [0/N] open - [ ] Data loss issues: [None/Describe] - [ ] Privacy compliance: [Pass/Fail] - [ ] Oldest device performance: [Pass/Fail] ### Yellow Items (Ship but fix in v1.0.1) - [Item 1] - [Item 2] ### Red Items (Must fix before launch) - [Item 1 — or "None"] ### Decision: [GO / NO-GO / CONDITIONAL GO] ### Reasoning: [1-2 sentences] ### Next Steps: 1. [Action item] 2. [Action item] 3. [Action item] ``` ### Phase 7: Beta Timeline #### Recommended 7-Week Schedule | Week | Phase | Activities | Deliverables | |------|-------|------------|-------------| | 1 | Setup | Configure TestFlight groups, write "What to Test" descriptions, prepare feedback form | TestFlight ready, feedback channels set up | | 2 | Internal Alpha | Invite 10-20 internal testers, fix crash-level bugs daily | Crash-free build, core flows validated | | 3 | External Wave 1 | Invite 50-200 external testers, monitor crash reports | First external feedback collected | | 4 | Wave 1 Analysis | Send survey, conduct 5-8 user interviews, categorize feedback | Feedback report, prioritized bug/UX list | | 5 | External Wave 2 | Fix P0/P1 issues, push new build, invite 200-1,000 testers | Improved build validated at scale | | 6 | Wave 2 Analysis | Send final survey, conduct 3-5 follow-up interviews | Final feedback report, NPS score | | 7 | Go/No-Go | Evaluate all data against decision framework, make launch call | Go/No-Go decision document | **Compressed schedule (4 weeks):** Combine weeks 1-2, skip wave 2, use wave 1 data for go/no-go. Only recommended if the app is simple (< 5 screens) and developer has shipped before. **Extended schedule (10 weeks):** Add a third external wave for large or complex apps (health apps, financial apps, apps with sync/collaboration). Extra time catches rare bugs and edge cases. ## Output Format Present the beta testing plan as: ```markdown # Beta Testing Plan: [App Name] ## TestFlight Configuration ### Internal Group - **Testers**: [List or count] - **Focus**: [What they're testing] - **Duration**: [X days] ### External Group 1: [Name] - **Size**: [N testers] - **Cohort**: [Power users / casual / accessibility / domain] - **What to Test**: [Specific tasks and flows] ### External Group 2: [Name] - **Size**: [N testers] - **Cohort**: [...] - **What to Test**: [...] ## Recruitment Plan - **Sources**: [Where to find testers] - **Incentive**: [What to offer] - **Outreach message**: [Draft] ## Feedback Channels 1. In-app feedback form: [Yes/No, what fields] 2. TestFlight feedback: [What to Test description] 3. Survey: [Questions] 4. User interviews: [How many, who to target] ## Timeline | Week | Phase | Key Activities | |------|-------|----------------| | ... | ... | ... | ## Go/No-Go Criteria - Crash-free rate target: >99.5% - Core flow completion target: >80% - NPS target: >30 - P0/P1 bugs: Must be zero ## Next Steps 1. [First action item] 2. [Second action item] 3. [Third action item] ``` ## Integration with Other Skills This skill fits in the product development pipeline after implementation and before App Store submission: ``` 1. product-agent --> Validate the idea 2. prd-generator --> Define features 3. architecture-spec --> Technical design 4. implementation-guide --> Build it 5. test-spec --> Automated tests 6. beta-testing (THIS SKILL) --> Validate with real users 7. release-review --> Pre-submission audit 8. app-store --> App Store listing and submission ``` **Inputs from other skills:** - `test-spec` provides automated test coverage (should be solid before beta) - `implementation-guide` provides the feature list to test against - `prd-generator` provides the user stories to validate **Outputs to other skills:** - Feedback data informs `release-review` checklist - NPS and user quotes feed into `app-store` description and marketing - Go/no-go decision determines if `release-review` proceeds ## Common Mistakes to Avoid ### Inviting too many testers too early ``` Bad: Invite 500 external testers on day 1 Good: Start with 15 internal testers, fix crashes, THEN go external ``` ### Not giving testers direction ``` Bad: "Please test the app and let me know what you think" Good: "Please try creating a new project, adding 3 tasks, and marking one complete. Report anything confusing or broken." ``` ### Ignoring negative feedback ``` Bad: "That tester just doesn't get it" Good: "If 3 testers don't get it, my onboarding doesn't get it" ``` ### Running beta too short ``` Bad: 3 days of testing, then ship Good: Minimum 2 weeks external testing with at least 50 testers ``` ### Not acting on feedback ``` Bad: Collect feedback, file it away, ship unchanged Good: Fix P0/P1 between waves, push new build, re-test ``` --- **A beta test isn't a checkbox — it's your last chance to learn before the whole world sees your app. Treat tester feedback as a gift, even when it hurts.**