---
name: agent-ops-critical-review
description: "Deep, excruciating code review. Use anytime to analyze code for correctness, edge cases, security, performance, and design issues. Not tied to baseline—this is pure code analysis."
license: MIT
compatibility: [opencode, claude, cursor]
metadata:
  category: core
  related: [agent-ops-state, agent-ops-tasks, agent-ops-api-review]
---

# Critical Code Review

## Purpose

Perform a deep, thorough, excruciating review of code. This skill is about analyzing code quality, correctness, and design—NOT about running tests or comparing to baseline. Use it anytime you need to scrutinize code.

## API Detection

**Before starting review, check if project contains APIs:**

```yaml
api_indicators:
  - OpenAPI/Swagger spec (openapi.yaml, swagger.json, openapi.json)
  - API framework patterns (FastAPI, Flask, Express, ASP.NET controllers)
  - Route decorators (@app.route, @router.get, [HttpGet], etc.)
  - API test files (test_api_*, *_api_test.*, api.spec.*)
```

**If API detected:**

1. Note in review scope: "API endpoints detected"
2. After general review, invoke `agent-ops-api-review` for API-specific audit
3. Merge API review findings into final report

## When to Use

- **During planning**: Review existing code before modifying
- **During implementation**: Review your own changes
- **Before completion**: Final quality check
- **On demand**: User requests a code review
- **After recovery**: Verify recovery didn't introduce issues

## Review Dimensions

### 1) Correctness

**Logic Analysis**:
- Does the code do what it claims to do?
- Are all code paths reachable and correct?
- Are conditionals exhaustive (all cases handled)?
- Are loop invariants maintained?
- Are recursive functions guaranteed to terminate?

**Boundary Conditions**:
- Empty inputs (null, undefined, empty string, empty array)
- Single element cases
- Maximum values / overflow potential
- Off-by-one errors
- Unicode / special characters

**State Management**:
- Is state modified predictably?
- Are there race conditions?
- Is cleanup always performed (finally blocks, defer, etc.)?

### 2) Error Handling

**Questions to Ask**:
- What can fail here?
- Is every failure case handled?
- Are errors propagated appropriately?
- Are error messages informative?
- Is there silent failure hiding bugs?

**Checklist**:
- [ ] All external calls wrapped in try/catch or equivalent
- [ ] Errors logged with context
- [ ] User-facing errors are actionable
- [ ] No swallowed exceptions
- [ ] Cleanup happens even on error

### 3) Security

**Input Validation**:
- Are all external inputs validated?
- Is there SQL injection potential?
- Is there XSS potential?
- Is there command injection potential?
- Are paths sanitized (no traversal)?

**Authentication/Authorization**:
- Is auth checked before sensitive operations?
- Are tokens/secrets handled securely?
- Is sensitive data logged? (it shouldn't be)

**Data Handling**:
- Is PII protected?
- Are passwords hashed, not encrypted?
- Is data sanitized before display?

### 4) Performance

**Algorithmic**:
- What's the time complexity?
- What's the space complexity?
- Are there N+1 query problems?
- Are there unnecessary iterations?

**Resource Management**:
- Are connections/handles closed?
- Are large objects released?
- Is there potential for memory leaks?
- Are there blocking calls that should be async?

**Caching/Optimization**:
- Is expensive computation cached when appropriate?
- Are there premature optimizations obscuring clarity?
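The N+1 query problem called out above is easiest to recognize with a concrete shape in front of you. The sketch below is a minimal illustration using `sqlite3` from the standard library; the `orders`/`line_items` schema and the function names are hypothetical, not part of any project under review.

```python
import sqlite3


def order_totals_n_plus_one(conn: sqlite3.Connection) -> dict[int, float]:
    """Pattern a review should flag: one query for the orders, then one more per order."""
    totals = {}
    for (order_id,) in conn.execute("SELECT id FROM orders"):
        # Parameterized placeholder (?) also avoids the SQL injection risk from section 3.
        row = conn.execute(
            "SELECT COALESCE(SUM(price), 0) FROM line_items WHERE order_id = ?",
            (order_id,),
        ).fetchone()
        totals[order_id] = row[0]
    return totals


def order_totals_single_query(conn: sqlite3.Connection) -> dict[int, float]:
    """Same result from a single query; LEFT JOIN keeps orders with no line items."""
    rows = conn.execute(
        "SELECT o.id, COALESCE(SUM(li.price), 0)"
        " FROM orders o LEFT JOIN line_items li ON li.order_id = o.id"
        " GROUP BY o.id"
    )
    return {order_id: total for order_id, total in rows}
```

The first function issues one query per order; the second produces identical totals in a single `GROUP BY`, and the `LEFT JOIN` preserves orders with no line items, the kind of boundary case the Correctness checklist asks about.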
### 5) Design & Maintainability

**Single Responsibility**:
- Does each function do one thing?
- Are there functions that are too long?
- Is there duplicated logic that should be extracted?

**Naming & Clarity**:
- Do names accurately describe purpose?
- Are abbreviations clear or confusing?
- Would a new team member understand this?

**Dependencies**:
- Are dependencies appropriate?
- Is there tight coupling that could be loosened?
- Are there circular dependencies?

**Testability**:
- Can this code be unit tested?
- Are dependencies injectable?
- Is there global state that complicates testing?

### 6) Consistency

- Does it follow project conventions?
- Does it match surrounding code style?
- Are patterns used consistently?

### 7) SOLID Principles (for OO/class-based code)

> **When to apply:** Review SOLID for class-based code, services, and complex modules. Skip for simple scripts, utilities, or purely functional code.

**Single Responsibility (SRP)**:
- Does each class have one clear purpose?
- Would changes to one feature require modifying this class for unrelated reasons?
- Could this class be split into smaller, focused classes?

**Open/Closed (OCP)**:
- Can new features be added by extension rather than modification?
- Are there long if/elif/switch chains that should use polymorphism?
- Is the Strategy, Factory, or Template pattern appropriate?

**Liskov Substitution (LSP)**:
- Do subclasses honor the contracts of their parent classes?
- Are there overridden methods that change expected behavior?
- Does substituting a subclass for its parent break anything?

**Interface Segregation (ISP)**:
- Are interfaces/ABCs focused and cohesive?
- Do implementations need all methods they're required to have?
- Should large interfaces be split into smaller, role-specific ones?

**Dependency Inversion (DIP)**:
- Do high-level modules depend on abstractions, not concretions?
- Are dependencies injected rather than created internally?
- Can this code be tested without real implementations of dependencies?

**SOLID Checklist** (for class-based code):
- [ ] Classes have single, clear responsibilities
- [ ] New behavior can be added via extension
- [ ] Subclasses are substitutable for parents
- [ ] Interfaces are focused and minimal
- [ ] Dependencies are on abstractions, not concretions

**Common Python SOLID Violations**:

```python
from abc import ABC, abstractmethod

# (Repository / SqliteRepository below are illustrative stand-ins, not real classes.)

# ❌ SRP Violation — class does too much
class UserService:
    def create_user(self): ...
    def send_email(self): ...
    def generate_report(self): ...

# ✅ SRP Fixed — separate concerns
class UserRepository: ...
class EmailService: ...
class ReportGenerator: ...

# ❌ OCP Violation — must modify to add new types
def process(item):
    if item.type == "A": ...
    elif item.type == "B": ...
    # Must add elif for each new type

# ✅ OCP Fixed — extend via new classes
class Processor(ABC):
    @abstractmethod
    def process(self): ...

# ❌ DIP Violation — depends on concrete class
class OrderService:
    def __init__(self):
        self.repo = SqliteRepository()  # Concrete!

# ✅ DIP Fixed — depends on abstraction
class OrderService:
    def __init__(self, repo: Repository):  # Abstract!
        self.repo = repo
```

## Review Procedure

### For Changed Code

1. **Read the diff thoroughly** — understand every line changed
2. **Trace data flow** — follow inputs through to outputs
3. **Question assumptions** — what if assumptions are wrong?
4. **Check edges** — what happens at boundaries?
5. **Verify error paths** — what happens when things fail? (see the sketch after this list)
6. **Consider security** — can this be exploited?
7. **Assess performance** — will this scale?
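Step 5 is where silent failures tend to surface. The sketch below is a hypothetical example of what that step should catch; `load_config` and the config-file scenario are illustrative names and assumptions, not part of any real codebase.

```python
import json
import logging

logger = logging.getLogger(__name__)

# ❌ Silent failure: a corrupt file and a missing file both come back as {},
# so the bug is hidden from callers and from the logs.
def load_config_silent(path: str) -> dict:
    try:
        with open(path) as f:
            return json.load(f)
    except Exception:
        return {}

# ✅ Failure is visible: logged with context, then re-raised so the caller
# decides whether a default is acceptable.
def load_config(path: str) -> dict:
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, json.JSONDecodeError):
        logger.exception("Failed to load config from %s", path)
        raise
```

The second version satisfies the Error Handling checklist from section 2: errors are logged with context and nothing is swallowed.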
### For New Code

All of the above, plus:
- Is this the right abstraction?
- Is there existing code that does this?
- Does this integrate cleanly?

### For Existing Code (Understanding)

Focus on:
- What does this code actually do?
- What are the invariants?
- What are the failure modes?
- Where are the risks?

## Review Output Format

```markdown
## Critical Review: [component/file]

### Summary
- Scope: [what was reviewed]
- Severity: [critical/high/medium/low findings]
- Verdict: [APPROVE / CHANGES NEEDED / BLOCK]

### Critical Issues (must fix)
1. **[Issue title]**
   - Location: file.ts#L10-20
   - Problem: [description]
   - Risk: [what could go wrong]
   - Suggestion: [how to fix]

### High Priority (should fix)
1. ...

### Medium Priority (consider fixing)
1. ...

### Low Priority (optional improvements)
1. ...

### Positive Observations
- [What's done well — reinforce good patterns]
```

## Review Mindset

### Be Thorough
- Don't skim — read every line
- Don't assume — verify
- Don't trust happy path — test failure cases mentally

### Be Skeptical
- What if this input is malicious?
- What if this external service fails?
- What if this runs concurrently?

### Be Constructive
- Explain WHY something is a problem
- Suggest alternatives, don't just criticize
- Acknowledge good work

### Be Honest
- If you don't understand something, say so
- If you're uncertain about an issue, rate your confidence
- Don't rubber-stamp — that defeats the purpose

## Integration with Workflow

This skill is **independent** of baseline comparison. It's pure code analysis.

- **Use with baseline**: After critical review, separately run validation/baseline comparison
- **Use alone**: Review code quality without running any tests
- **Use during planning**: Analyze existing code before changes
- **Use for learning**: Understand unfamiliar codebase

## Output

Update `.agent/focus.md`:

```markdown
## Just did
- Critical review of [component]
- Findings: X critical, Y high, Z medium
- Verdict: [APPROVE / CHANGES NEEDED / BLOCK]
```

## Issue Discovery After Review

**After review, invoke `agent-ops-tasks` discovery procedure:**

1) **Collect all findings** from review output:
   - Critical issues → `BUG` or `SEC` (critical/high)
   - High priority → `BUG`, `SEC`, or `PERF` (high)
   - Medium priority → various types (medium)
   - Low priority → `CHORE`, `REFAC`, `DOCS` (low)

2) **Present to user:**

   ```
   📋 Review found {N} issues:

   Critical:
   - [SEC] SQL injection vulnerability in UserController#search
   - [BUG] Race condition in cache invalidation

   High:
   - [PERF] N+1 query in OrderService#list
   - [BUG] Unhandled null in payment callback

   Medium:
   - [CHORE] Error messages not user-friendly
   - [TEST] Missing edge case coverage

   Low:
   - [REFAC] Extract duplicate validation logic
   - [DOCS] Add JSDoc to public methods

   Create issues for these? [A]ll / [S]elect / [N]one
   ```

3) **After creating issues:**

   ```
   Created {N} issues. What's next?
   1. Start fixing highest priority (SEC-0001@abc123 - SQL injection)
   2. Create more issues
   3. Continue with original task
   4. Review issue list
   ```

4) **If user declines:**
   - Log findings summary in focus.md
   - Note "Review findings not converted to issues (user declined)"