---
name: pentest-whitebox-code-review
description: Source code security audit using backward taint analysis, slot type classification, render context verification, and 3-phase parallel review producing an exploitation queue.
---

# Pentest Whitebox Code Review

## Purpose

Perform a systematic white-box source code security audit using Shannon's backward taint analysis methodology. The audit traces from dangerous sinks back to user-controlled sources, classifies injection contexts by slot type, verifies XSS render contexts, and produces a prioritized exploitation queue for downstream proof-driven exploitation.

## Prerequisites

### Authorization Requirements

- **Written authorization** with explicit scope for source code review
- **Source code access**: full repository with version control history
- **Architecture documentation** if available (data flow diagrams, API specs)
- **Deployment configuration** access (environment variables, secrets management)

### Environment Setup

- semgrep with custom rules for taint analysis
- CodeQL database built for target language
- ripgrep for fast pattern searching
- jadx for Android APK decompilation (if applicable)
- Source map extraction tools for minified JavaScript
- AST parsing tools for target language (tree-sitter, babel, etc.)

## Core Workflow

### Phase 1: Discovery

1. **Architecture Mapping**: Identify application layers (routing, controllers, services, data access, templates). Map data flow from HTTP entry points through business logic to database/file/external sinks.
2. **Entry Point Enumeration**: Catalog all user-controlled input sources: HTTP parameters, headers, cookies, file uploads, WebSocket messages, environment variables, database reads of user-stored data.
3. **Security Pattern Inventory**: Identify existing security controls: input validation functions, output encoding helpers, parameterized query patterns, CSRF protections, authentication middleware, rate limiters.

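
As a rough illustration of Entry Point Enumeration (step 2), the sketch below catalogs input sources in a Python source string using regular expressions. It assumes a Flask-style codebase. `SOURCE_PATTERNS`, `enumerate_entry_points`, and the sample view are hypothetical names introduced here, not part of the methodology. A real audit would more likely rely on semgrep or CodeQL than on line-level regexes.

```python
import re
from collections import defaultdict

# Assumed patterns for a Flask-style codebase; adjust per framework and language.
SOURCE_PATTERNS = {
    "http_parameter": re.compile(r"request\.(args|form|values|json)\b"),
    "header": re.compile(r"request\.headers\b"),
    "cookie": re.compile(r"request\.cookies\b"),
    "file_upload": re.compile(r"request\.files\b"),
    "environment": re.compile(r"os\.environ\b"),
}


def enumerate_entry_points(source: str, filename: str = "<memory>") -> dict[str, list[tuple[str, int]]]:
    """Return {category: [(filename, line_number), ...]} for every matching line."""
    catalog = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        for category, pattern in SOURCE_PATTERNS.items():
            if pattern.search(line):
                catalog[category].append((filename, lineno))
    return dict(catalog)


# Hypothetical view used only to exercise the scanner.
SAMPLE = '''\
from flask import request
def view():
    user_id = request.args.get("id")
    token = request.cookies.get("session")
    upload = request.files["doc"]
    return user_id
'''

if __name__ == "__main__":
    for category, hits in enumerate_entry_points(SAMPLE, "views.py").items():
        print(category, hits)
```

The resulting catalog gives each Phase 2 track a list of candidate sources to match against when tracing backward from a sink.
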

### Phase 2: Vulnerability Analysis (5 Parallel Tracks)

4. **Injection Sink Hunting**: Backward taint from SQL/command/file/template sinks to sources. Classify each sink by slot type: SQL-val, SQL-ident, CMD-argument, FILE-path, TEMPLATE-expr. Verify whether parameterization or sanitization breaks the taint chain.
5. **XSS Render Context Analysis**: Identify all dynamic output points in templates/responses. Classify each by render context: HTML_BODY, HTML_ATTRIBUTE, JAVASCRIPT_STRING, URL_PARAM, CSS_VALUE. Verify context-appropriate encoding is applied at each output point.
6. **Authentication Checklist (9-point)**: Transport security, rate limiting, session management, token properties, session fixation resistance, password policy enforcement, login response uniformity, account recovery security, SSO/OAuth implementation.
7. **Authorization Model Review (3-type)**: Horizontal (same-role cross-user access), vertical (privilege escalation across roles), context-workflow (state-dependent authorization bypass).
8. **SSRF Sink Hunting**: Identify all outbound request sinks. Classify by type: classic (direct URL), blind (no response), semi-blind (partial response), stored (deferred execution). Trace URL construction from user input to request dispatch.

### Phase 3: Synthesis

9. **Confidence Scoring & Exploitation Queue**: Score each finding by taint chain completeness, sanitization bypass likelihood, and impact severity. Generate exploitation queue JSON for downstream exploit validation.

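
The scoring and queue generation in step 9 might look something like the sketch below. The three factors come from the step itself. Everything else is an assumption made for illustration: the 0.0 to 1.0 scale, the weights, the field names, the example findings, and the JSON layout. The methodology does not fix a schema here.

```python
import json
from dataclasses import asdict, dataclass

# Assumed weights: the workflow names the three factors but not how to combine them.
WEIGHTS = {"taint_chain": 0.4, "bypass": 0.3, "impact": 0.3}


@dataclass
class Finding:
    finding_id: str      # hypothetical identifier format
    vuln_class: str      # e.g. a slot type or SSRF type from Phase 2
    sink: str            # where the tainted data lands
    taint_chain: float   # 0.0-1.0, completeness of the source-to-sink chain
    bypass: float        # 0.0-1.0, likelihood that sanitization can be bypassed
    impact: float        # 0.0-1.0, impact severity


def confidence(finding: Finding) -> float:
    """Weighted sum of the three factors, rounded for stable JSON output."""
    score = (
        WEIGHTS["taint_chain"] * finding.taint_chain
        + WEIGHTS["bypass"] * finding.bypass
        + WEIGHTS["impact"] * finding.impact
    )
    return round(score, 3)


def build_queue(findings: list[Finding]) -> str:
    """Serialize findings as a JSON array, highest confidence first."""
    entries = [{**asdict(f), "confidence": confidence(f)} for f in findings]
    entries.sort(key=lambda e: e["confidence"], reverse=True)
    return json.dumps(entries, indent=2)


# Made-up findings used only to exercise the functions.
EXAMPLE_FINDINGS = [
    Finding("F-002", "SSRF-blind", "webhooks.py:41", 0.5, 0.5, 0.6),
    Finding("F-001", "SQL-ident", "reports.py:88", 1.0, 0.8, 0.9),
]

if __name__ == "__main__":
    print(build_queue(EXAMPLE_FINDINGS))
```

A linear weighted sum keeps the ranking easy to audit by hand. A team that wants incomplete taint chains to sink to the bottom regardless of impact could multiply the factors instead.
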

## Slot Type Classification

| Slot Type | Sink Pattern | Sanitization Required |
|-----------|-------------|----------------------|
| SQL-val | Query parameter value position | Parameterized query / prepared statement |
| SQL-ident | Table name, column name, ORDER BY | Allowlist validation |
| CMD-argument | Shell command argument | Argument escaping + allowlist |
| FILE-path | File read/write path construction | Path canonicalization + allowlist |
| TEMPLATE-expr | Template engine expression | Context-aware auto-escaping |

## Render Context Classification

| Context | Output Location | Encoding Required |
|---------|----------------|-------------------|
| HTML_BODY | Between HTML tags | HTML entity encoding |
| HTML_ATTRIBUTE | Inside attribute values | Attribute encoding + quoting |
| JAVASCRIPT_STRING | Inside JS string literals | JavaScript Unicode escaping |
| URL_PARAM | URL query parameter values | URL percent encoding |
| CSS_VALUE | Inside CSS property values | CSS hex encoding |

## Tool Categories

| Category | Tools | Purpose |
|----------|-------|---------|
| Taint Analysis | semgrep, CodeQL | Automated sink-to-source taint tracing |
| Pattern Search | ripgrep, ast-grep | Fast code pattern matching |
| Decompilation | jadx, sourcemap-extract | Recover source from compiled artifacts |
| AST Parsing | tree-sitter, babel | Language-aware code structure analysis |
| Dependency Audit | npm audit, pip-audit, snyk | Known vulnerability detection |

## References

- `references/tools.md` - Tool function signatures and parameters
- `references/workflows.md` - Taint analysis workflows and vulnerability patterns
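
To make the SQL-val and SQL-ident rows of the Slot Type Classification table concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `users` table, the `ALLOWED_SORT_COLUMNS` allowlist, and the `find_users` helper are hypothetical and exist only for illustration. The point is that a value slot can be bound with a placeholder, while an identifier slot cannot and therefore needs allowlist validation.

```python
import sqlite3

# Hypothetical allowlist for the SQL-ident slot (the ORDER BY column).
ALLOWED_SORT_COLUMNS = {"name", "created_at"}


def find_users(conn: sqlite3.Connection, name_filter: str, sort_column: str) -> list[tuple]:
    """Query users with one SQL-val slot and one SQL-ident slot."""
    # SQL-ident: placeholders cannot bind identifiers, so validate against the allowlist.
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"sort column not allowed: {sort_column!r}")
    # SQL-val: the filter value is bound with a placeholder, never concatenated.
    query = f"SELECT id, name FROM users WHERE name LIKE ? ORDER BY {sort_column}"
    return conn.execute(query, (name_filter,)).fetchall()


def demo_connection() -> sqlite3.Connection:
    """Build an in-memory database with two made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT, created_at TEXT)")
    conn.executemany(
        "INSERT INTO users (name, created_at) VALUES (?, ?)",
        [("bob", "2024-02-01"), ("alice", "2024-01-01")],
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(find_users(conn, "%", "name"))
    try:
        find_users(conn, "%", "name; DROP TABLE users")
    except ValueError as exc:
        print("rejected:", exc)
```

During review, a sink that interpolates an identifier without a check like this keeps its taint chain intact even if the surrounding query uses placeholders for its values.
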