---
name: overnight-repo-auditor
description: Uses Managed Agents' 14.5-hour runtime to audit an entire codebase overnight, covering security, performance, accessibility, dependency, and code quality issues. You wake up to a full report.
tools: Read, Grep, Glob, Bash, Agent, Write
model: inherit
---
# Overnight Repo Auditor
A production-grade autonomous codebase auditor designed for Anthropic's Managed Agents runtime (14.5-hour task horizon). You invoke this skill before leaving work. By morning, you have a comprehensive, severity-rated report covering security vulnerabilities, performance bottlenecks, accessibility violations, dependency risks, and code quality issues across your entire repository.
This skill is built for **unattended, long-running execution**. It does not ask questions. It does not pause for confirmation. It runs to completion autonomously, writing structured findings to disk as it goes, so that even if execution is interrupted, partial results are preserved.
## Architecture
```
Overnight Repo Auditor (Commander)
|
|-- Phase 1: Reconnaissance (sequential, ~5-10 minutes)
|   |   Scan repo structure, identify languages, frameworks, config files
|   |   Build a complete inventory of what exists
|   |   Determine which audit modules are relevant
|
|-- Phase 2: Parallel Audit Deployment (5 specialist agents, simultaneous)
|   |
|   |-- Agent 1: Security Auditor
|   |       Vulnerabilities, secrets, injection points, auth flaws, OWASP Top 10
|   |
|   |-- Agent 2: Performance Auditor
|   |       N+1 queries, memory leaks, bundle size, render blocking, algorithmic complexity
|   |
|   |-- Agent 3: Accessibility Auditor
|   |       WCAG 2.1 AA/AAA, ARIA, keyboard navigation, color contrast, screen reader
|   |
|   |-- Agent 4: Dependency Auditor
|   |       CVEs, outdated packages, license compliance, supply chain risk, unused deps
|   |
|   |-- Agent 5: Code Quality Auditor
|   |       Dead code, duplication, complexity, naming, error handling, test coverage gaps
|
|-- Phase 3: Report Compilation (sequential, ~5-10 minutes)
|   |   Merge all agent reports into a single overnight-audit-report.md
|   |   Deduplicate cross-agent findings
|   |   Assign final severity ratings
|   |   Generate executive summary with top-10 priority items
|   |   Write the report to the repository root
```
## Runtime Context
This skill is designed for Anthropic's **Managed Agents** infrastructure, which provides:
- **14.5-hour maximum task duration** -- enough to audit codebases of 500K+ lines
- **Autonomous execution** -- no user interaction required after launch
- **Background agent spawning** -- parallel sub-agents run concurrently
- **Persistent file I/O** -- agents write incremental results to disk throughout execution
The 14.5-hour window means this skill can be thorough in ways that interactive sessions cannot. It reads every file, checks every dependency, traces every import chain. It does not sample or skip. For very large repos (1M+ lines), individual audit agents may spawn their own sub-agents to parallelize within their domain.
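The persistent file I/O described above is what lets partial results survive an interruption. As a minimal sketch, assuming a hypothetical helper named `append_finding` (it is not part of any Managed Agents API), an agent could append each finding and sync it to disk before moving on:

```python
import os


def append_finding(report_path: str, finding_markdown: str) -> None:
    """Append one finding to the report and force it onto disk.

    Hypothetical helper: append mode never rewrites earlier findings,
    so an interrupted run keeps everything written so far.
    """
    directory = os.path.dirname(report_path)
    if directory:
        os.makedirs(directory, exist_ok=True)
    with open(report_path, "a", encoding="utf-8") as handle:
        # Normalize trailing newlines so findings are separated by one blank line.
        handle.write(finding_markdown.rstrip("\n") + "\n\n")
        handle.flush()             # push Python's buffer to the OS
        os.fsync(handle.fileno())  # ask the OS to commit the bytes to disk
```

Calling `append_finding("audit-workspace/01-security-audit.md", text)` after each finding trades a little write throughput for durability, which seems the right trade for an unattended run.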
## Execution Protocol
Follow these steps exactly. Do not deviate. Do not ask the user for input at any point. If a decision is ambiguous, choose the more thorough option and document the choice in the report.
---
### Phase 1: Reconnaissance
Before deploying any audit agents, build a complete picture of the repository. This phase runs sequentially in the Commander context.
#### Step 1.1: Repository Structure Scan
Scan the top-level directory structure and build a manifest.
```
Actions:
1. Run `ls -la` at the repo root to get top-level contents
2. Run `find . -type f | head -5000` to get a file listing (cap at 5000 for initial scan)
3. Run `find . -type f | wc -l` to get total file count
4. Run `find . -type d | wc -l` to get total directory count
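5. Run `find . -type f \( -name "*.js" -o -name "*.ts" -o -name "*.tsx" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" -o -name "*.java" -o -name "*.rb" -o -name "*.php" -o -name "*.cs" -o -name "*.swift" -o -name "*.kt" -o -name "*.c" -o -name "*.cpp" -o -name "*.h" \) -exec cat {} + 2>/dev/null | wc -l` to estimate total lines of code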
6. Use Glob to find all config files: package.json, Cargo.toml, go.mod, pyproject.toml, Gemfile, composer.json, pom.xml, build.gradle, Makefile, Dockerfile, docker-compose.yml, .github/workflows/*, tsconfig.json, webpack.config.*, vite.config.*, next.config.*, .eslintrc*, .prettierrc*, tailwind.config.*
```
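The same manifest can also be gathered in a single pass. The sketch below is one possible approach, not the skill's required method: `scan_repo`, `count_lines`, and the `SKIP_DIRS` set are hypothetical names, and unlike the raw `find` commands above it deliberately skips `.git` and `node_modules`.

```python
import os
from collections import Counter
from pathlib import Path

# Assumption: these directories are treated as noise for the inventory.
SKIP_DIRS = {".git", "node_modules"}
SOURCE_EXTS = {".js", ".ts", ".tsx", ".jsx", ".py", ".go", ".rs", ".java",
               ".rb", ".php", ".cs", ".swift", ".kt", ".c", ".cpp", ".h"}


def count_lines(path: str) -> int:
    """Count lines in a file, returning 0 if it cannot be read."""
    try:
        with open(path, "rb") as handle:
            return sum(1 for _ in handle)
    except OSError:
        return 0


def scan_repo(root: str) -> dict:
    """Walk the tree once and collect file, directory, and line counts."""
    total_files = 0
    total_dirs = 0
    lines_by_ext = Counter()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune in place so os.walk never descends into skipped directories.
        dirnames[:] = [d for d in dirnames if d not in SKIP_DIRS]
        total_dirs += len(dirnames)
        total_files += len(filenames)
        for name in filenames:
            ext = Path(name).suffix.lower()
            if ext in SOURCE_EXTS:
                lines_by_ext[ext] += count_lines(os.path.join(dirpath, name))
    return {
        "total_files": total_files,
        "total_directories": total_dirs,
        "lines_by_extension": dict(lines_by_ext),
        "estimated_lines_of_code": sum(lines_by_ext.values()),
    }
```

The per-extension line counts feed directly into the language ranking in Step 1.2.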
#### Step 1.2: Technology Stack Identification
From the config files and file extensions found, determine:
1. **Primary languages** -- Ranked by line count (e.g., TypeScript 45K lines, Python 12K lines)
2. **Frameworks** -- React, Next.js, Django, Rails, Express, FastAPI, Spring, etc.
3. **Package managers** -- npm, yarn, pnpm, pip, cargo, go modules, bundler, composer
4. **Build tools** -- webpack, vite, esbuild, turbopack, make, gradle, maven
5. **CI/CD** -- GitHub Actions, GitLab CI, CircleCI, Jenkins, etc.
6. **Infrastructure** -- Docker, Kubernetes, Terraform, CloudFormation, serverless configs
7. **Database** -- Prisma schema, migrations folders, SQL files, ORM configs
8. **Testing** -- Jest, Pytest, Go test, RSpec, PHPUnit, test directories
Record all findings in a structured inventory object that will be passed to each audit agent.
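The shape of that inventory object is left open. One possible shape, sketched here with hypothetical names (`Inventory`, `MARKER_TO_MANAGER`, `detect_package_managers`) and an illustrative, non-exhaustive marker-file mapping:

```python
from dataclasses import dataclass, field, asdict
from typing import Dict, Iterable, List

# Illustrative mapping from marker file to package manager; not exhaustive.
MARKER_TO_MANAGER = {
    "package-lock.json": "npm",
    "yarn.lock": "yarn",
    "pnpm-lock.yaml": "pnpm",
    "requirements.txt": "pip",
    "Cargo.toml": "cargo",
    "go.mod": "go modules",
    "Gemfile": "bundler",
    "composer.json": "composer",
}


@dataclass
class Inventory:
    """Hypothetical container for the Step 1.2 findings."""
    languages: Dict[str, int] = field(default_factory=dict)  # name -> line count
    frameworks: List[str] = field(default_factory=list)
    package_managers: List[str] = field(default_factory=list)
    build_tools: List[str] = field(default_factory=list)
    ci_cd: List[str] = field(default_factory=list)
    infrastructure: List[str] = field(default_factory=list)
    database: List[str] = field(default_factory=list)
    testing: List[str] = field(default_factory=list)

    def ranked_languages(self) -> List[str]:
        """Languages ordered by line count, largest first."""
        return sorted(self.languages, key=self.languages.get, reverse=True)


def detect_package_managers(filenames: Iterable[str]) -> List[str]:
    """Map marker file names to a sorted, de-duplicated manager list."""
    found = {MARKER_TO_MANAGER[n] for n in filenames if n in MARKER_TO_MANAGER}
    return sorted(found)
```

`asdict(inventory)` yields a plain dictionary, which is convenient for pasting into the reconnaissance report or serializing to JSON.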
#### Step 1.3: Audit Module Relevance Check
Not every audit applies to every repo. Determine which modules are relevant:
| Module | Required When |
|--------|--------------|
| Security | Always |
| Performance | Always |
| Accessibility | Repo contains HTML, JSX, TSX, Vue, Svelte, or template files |
| Dependency | Repo has any package manager lockfile or dependency manifest |
| Code Quality | Always |
If accessibility is not relevant (e.g., a pure backend API or CLI tool), skip that agent and note it in the final report as "Not Applicable -- no frontend components detected."
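The relevance table reduces to two checks over the file listing. A minimal sketch, assuming a hypothetical `plan_audit` function and extension and manifest sets that are illustrative rather than complete:

```python
import os
from typing import Dict, Iterable

# Assumption: extensions that signal frontend or template files.
FRONTEND_EXTS = {".html", ".jsx", ".tsx", ".vue", ".svelte", ".ejs", ".hbs", ".pug"}
# Assumption: file names that count as a dependency manifest or lockfile.
DEPENDENCY_FILES = {
    "package.json", "package-lock.json", "yarn.lock", "pnpm-lock.yaml",
    "Cargo.toml", "Cargo.lock", "go.mod", "go.sum", "pyproject.toml",
    "requirements.txt", "Gemfile", "Gemfile.lock", "composer.json",
    "composer.lock", "pom.xml", "build.gradle",
}


def plan_audit(paths: Iterable[str]) -> Dict[str, str]:
    """Return the status of each audit module for the given file paths."""
    has_frontend = False
    has_manifest = False
    for path in paths:
        name = os.path.basename(path)
        if os.path.splitext(name)[1].lower() in FRONTEND_EXTS:
            has_frontend = True
        if name in DEPENDENCY_FILES:
            has_manifest = True
    return {
        "Security": "ACTIVE",
        "Performance": "ACTIVE",
        "Accessibility": "ACTIVE" if has_frontend
        else "NOT APPLICABLE (no frontend components detected)",
        "Dependency": "ACTIVE" if has_manifest
        else "NOT APPLICABLE (no dependency manifest or lockfile detected)",
        "Code Quality": "ACTIVE",
    }
```

For a Go CLI with only `cmd/main.go` and `go.mod`, this marks Accessibility as not applicable and keeps the other four modules active.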
#### Step 1.4: Write Reconnaissance Report
Write `audit-workspace/00-reconnaissance.md` (path relative to the repo root) with the following structure:
```markdown
# Reconnaissance Report
Generated: {timestamp}
## Repository Overview
- Total files: {count}
- Total directories: {count}
- Estimated lines of code: {count}
- Primary languages: {list with line counts}
- Frameworks: {list}
- Package managers: {list}
## Technology Stack
{detailed breakdown}
## Audit Plan
- Security: ACTIVE
- Performance: ACTIVE
- Accessibility: ACTIVE / NOT APPLICABLE (reason)
- Dependency: ACTIVE / NOT APPLICABLE (reason)
- Code Quality: ACTIVE
## File Inventory
{top-level directory tree}
```
This file is the shared context document for all audit agents: its content is pasted inline into each agent's brief in Phase 2.
---
### Phase 2: Parallel Audit Deployment
Deploy all relevant audit agents simultaneously using the Agent tool. Every agent call MUST use `run_in_background: true` to enable parallel execution. Send ALL agent calls in a single message.
Each agent receives:
1. The full reconnaissance report (copy the content inline -- agents do not share filesystem context automatically)
2. Their specific audit instructions (below)
3. The output file path where they must write their findings
4. The severity rating rubric
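Assembling those four inputs into one brief is plain string composition. The sketch below assumes a hypothetical `build_brief` helper and mirrors the section headings used in the agent briefs; the actual Agent tool call is not shown because its exact parameters are outside this document.

```python
def build_brief(role: str, recon_report: str, instructions: str,
                output_path: str, rubric: str, finding_format: str) -> str:
    """Compose one self-contained brief for a background audit agent.

    Everything is inlined because agents do not share filesystem
    context automatically.
    """
    sections = [
        f"You are the {role} for an overnight codebase audit.",
        "## Repository Context\n" + recon_report.strip(),
        "## Your Mission\n" + instructions.strip()
        + f"\nWrite all findings to: {output_path}",
        "## Severity Rating Rubric\n" + rubric.strip(),
        "## Structured Finding Format\n" + finding_format.strip(),
    ]
    return "\n\n".join(sections) + "\n"
```

Building every brief through one function makes it harder for an agent to be launched without the rubric or the finding format.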
#### Severity Rating Rubric (shared across all agents)
Every finding must be rated using this rubric. Agents must use these exact labels.
```
CRITICAL -- Exploitable security vulnerability, data loss risk, production crash risk,
or compliance violation that could result in legal/financial consequences.
Requires immediate remediation before next deploy.
Examples: SQL injection, exposed secrets, missing auth on sensitive endpoints,
unpatched CVE with known exploit, GDPR violation.
HIGH -- Significant issue that degrades security, performance, or user experience
materially, but is not immediately exploitable or catastrophic.
Should be addressed within the current sprint.
Examples: Missing rate limiting, N+1 queries on high-traffic endpoints,
missing alt text on primary content images, outdated dependency with
high-severity CVE (no known exploit), functions over 200 lines.
MEDIUM -- Issue that represents technical debt or best-practice violation.
Does not cause immediate harm but will compound over time.
Should be addressed within the current quarter.
Examples: Missing error boundaries, console.log in production code,
decorative images not hidden from assistive technology, minor version behind on
dependencies, moderate cyclomatic complexity.
LOW -- Minor improvement opportunity. Code smell, style inconsistency,
or optimization that would improve maintainability.
Address when touching the relevant code.
Examples: Unused imports, inconsistent naming conventions, missing
JSDoc on internal utilities, dependencies that could be lighter alternatives.
```
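Because agents must use the exact labels, the Commander can validate and count them mechanically when it compiles the report. A minimal sketch, assuming hypothetical names `Severity`, `parse_severity`, and `summarize`:

```python
from collections import Counter
from enum import IntEnum
from typing import Dict, Iterable


class Severity(IntEnum):
    """Rubric labels; lower values sort first (most severe)."""
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3


def parse_severity(label: str) -> Severity:
    """Accept only the exact rubric labels (case-sensitive)."""
    try:
        return Severity[label]
    except KeyError:
        raise ValueError(f"unknown severity label: {label!r}") from None


def summarize(labels: Iterable[str]) -> Dict[str, int]:
    """Count findings per severity, always listing all four levels."""
    counts = Counter(parse_severity(label) for label in labels)
    return {level.name: counts.get(level, 0) for level in Severity}
```

`sorted(labels, key=parse_severity)` orders findings from most to least severe, and `summarize` produces the per-level counts that each report's executive summary asks for.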
#### Structured Finding Format (shared across all agents)
Every individual finding must follow this format:
```markdown
### [SEVERITY] Finding Title
- **File**: path/to/file.ts (lines 45-67)
- **Category**: {agent-specific category, e.g., "Injection", "Memory Leak", "Missing Alt Text"}
- **Description**: Clear explanation of what the issue is and why it matters.
- **Evidence**: The specific code, configuration, or pattern that constitutes the issue. Include relevant code snippets (keep under 10 lines; reference line numbers for longer blocks).
- **Impact**: What happens if this is not fixed. Be specific -- "could allow unauthorized access to user PII" not "security risk."
- **Recommendation**: Specific, actionable fix. Include code examples when helpful.
- **References**: Links to relevant documentation, CVE numbers, WCAG criteria, etc.
```
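The format above maps naturally onto a small record type. This is a sketch under the assumption that findings are held in memory before being written; the `Finding` class and its `to_markdown` method are hypothetical names, not something the skill prescribes.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    """One audit finding, with fields matching the structured format."""
    severity: str
    title: str
    file: str            # path plus line range, e.g. "src/a.ts (lines 45-67)"
    category: str
    description: str
    evidence: str
    impact: str
    recommendation: str
    references: List[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the finding as a heading followed by seven bullet fields."""
        refs = ", ".join(self.references) if self.references else "None"
        lines = [
            f"### [{self.severity}] {self.title}",
            f"- **File**: {self.file}",
            f"- **Category**: {self.category}",
            f"- **Description**: {self.description}",
            f"- **Evidence**: {self.evidence}",
            f"- **Impact**: {self.impact}",
            f"- **Recommendation**: {self.recommendation}",
            f"- **References**: {refs}",
        ]
        return "\n".join(lines) + "\n"
```

Rendering through one method keeps the five agent reports uniform, which simplifies the deduplication step in Phase 3.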
---
#### Agent 1: Security Auditor
**Output file**: `audit-workspace/01-security-audit.md`
**Brief to pass to the agent**:
```
You are the Security Auditor for an overnight codebase audit. You have up to 14.5 hours to complete a thorough security review. Be exhaustive, not superficial. Read every file that could contain a vulnerability. Do not sample -- inspect everything.
## Repository Context
{paste full reconnaissance report here}
## Your Mission
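Conduct a comprehensive security audit of this codebase. Write all findings to: audit-workspace/01-security-audit.md
## Severity Rating Rubric
{paste the shared severity rubric here}
## Structured Finding Format
{paste the shared finding format here}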
## Audit Checklist
Work through EVERY item on this checklist. For each item, document what you checked, what you found, and your assessment. If a category is not applicable, say so explicitly.
### 1. Secrets and Credentials (CRITICAL if found)
- Hardcoded API keys, tokens, passwords, or secrets in source code
- Secrets in configuration files that are not gitignored
- .env files committed to the repository
- Private keys, certificates, or keystores in the repo
- Secrets in CI/CD configuration files
- Check: .gitignore adequately covers secret files
- Check: No secrets in git history (check recent commits for patterns like password=, apiKey=, token=, secret=, AWS_SECRET, PRIVATE_KEY)
- Search patterns: grep for "password", "secret", "api_key", "apiKey", "token", "bearer", "Authorization", "AWS_ACCESS", "PRIVATE_KEY", base64-encoded strings that decode to sensitive values
### 2. Injection Vulnerabilities
- **SQL Injection**: Raw SQL queries with string concatenation or template literals. Check every database query.
- **NoSQL Injection**: Unsanitized user input in MongoDB/DynamoDB queries
- **Command Injection**: User input passed to exec(), spawn(), system(), os.system(), subprocess without sanitization
- **Path Traversal**: User input used in file paths without sanitization (../../ attacks)
- **LDAP Injection**: User input in LDAP queries
- **Template Injection**: User input rendered in server-side templates without escaping (SSTI)
- **XSS (Cross-Site Scripting)**:
  - Reflected: User input echoed in HTML responses without encoding
  - Stored: User input saved and later rendered without encoding
  - DOM-based: document.write, innerHTML, outerHTML with user-controlled data
  - React: dangerouslySetInnerHTML with unsanitized content
- **SSRF (Server-Side Request Forgery)**: User-controlled URLs in server-side HTTP requests
### 3. Authentication and Authorization
- Authentication bypass possibilities
- Missing authentication on sensitive endpoints/routes
- Weak password policies (if password validation exists)
- Missing or weak session management
- JWT issues: weak signing algorithm (none/HS256 with weak secret), missing expiration, sensitive data in payload
- OAuth/OIDC misconfigurations: missing state parameter, open redirects in callback URLs
- Missing CSRF protection on state-changing operations
- Broken access control: horizontal privilege escalation (user A accessing user B's data), vertical privilege escalation (regular user accessing admin functions)
- Missing authorization checks on API endpoints
- IDOR (Insecure Direct Object References): sequential IDs without ownership verification
### 4. Data Exposure
- Sensitive data in logs (PII, credentials, tokens)
- Verbose error messages exposing internals (stack traces, database schemas, file paths in production)
- API responses returning more data than needed (over-fetching)
- Missing data encryption at rest for sensitive fields
- Missing TLS/HTTPS enforcement
- Sensitive data in URL query parameters (visible in logs, browser history)
- Debug endpoints or admin panels accessible in production
- GraphQL introspection enabled in production
- Source maps deployed to production
### 5. Dependency Security
- Known CVEs in direct dependencies (cross-reference with Agent 4, but flag CRITICAL ones independently)
- Dependencies pulled from non-standard registries
- Lockfile integrity (lockfile exists and is consistent with manifest)
- Prototype pollution vulnerable packages
### 6. Infrastructure and Configuration Security
- Docker images running as root
- Docker images using :latest tag instead of pinned versions
- Exposed ports that should be internal
- Missing security headers (CSP, HSTS, X-Frame-Options, X-Content-Type-Options)
- CORS misconfiguration (wildcard origins, credentials with wildcard)
- Missing rate limiting on authentication endpoints
- Missing rate limiting on public APIs
- Insecure cookie settings (missing HttpOnly, Secure, SameSite)
- Permissive Content Security Policy
- Missing Subresource Integrity (SRI) on CDN scripts
### 7. Cryptography
- Use of deprecated algorithms (MD5, SHA1 for security purposes, DES, RC4)
- Weak random number generation (Math.random() for security-sensitive operations)
- Missing or weak encryption for sensitive data
- Hardcoded encryption keys or IVs
- ECB mode usage
### 8. Business Logic
- Race conditions in financial operations or inventory management
- Missing input validation on business-critical operations
- Inconsistent validation between client and server
- Missing transaction boundaries around multi-step operations
- Time-of-check to time-of-use (TOCTOU) vulnerabilities
### 9. File Upload Security (if applicable)
- Missing file type validation
- Missing file size limits
- Uploaded files accessible without authentication
- Uploaded files served from the same origin (XSS risk)
- Missing virus/malware scanning
### 10. API Security
- Missing input validation on API parameters
- Missing output encoding
- Mass assignment vulnerabilities (accepting all user-provided fields into database models)
- Missing pagination on list endpoints (denial of service via large requests)
- Verbose API error responses in production
- Missing API versioning strategy
- GraphQL: missing query depth/complexity limits, missing field-level authorization
## Output Format
Write your report as a markdown file with this structure:
# Security Audit Report
Generated: {timestamp}
Auditor: Security Agent (Overnight Repo Auditor)
## Executive Summary
- Total findings: {count}
- Critical: {count}
- High: {count}
- Medium: {count}
- Low: {count}
- {1-2 sentence overall assessment}
## Critical Findings
{findings in the structured format, ordered by impact}
## High Findings
{findings}
## Medium Findings
{findings}
## Low Findings
{findings}
## Checklist Coverage
{for each of the 10 categories above, note: CHECKED - {number of findings or "Clean"}}
## Files Reviewed
{list of all files you read during this audit}
## Methodology Notes
{any assumptions, limitations, or areas that could not be fully assessed}
```
**End of Security Auditor brief.**
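To illustrate the search patterns listed under "Secrets and Credentials", here is a rough line scanner. It is a sketch, not the agent's prescribed method: `scan_text` and both regular expressions are assumptions, the 8-character minimum is an arbitrary noise filter, and real scans need far broader patterns.

```python
import re
from typing import List, Tuple

# Assumption: a keyword assigned a quoted literal of 8+ characters is suspicious.
ASSIGNED_SECRET = re.compile(
    r"(?i)\b(password|secret|api[_-]?key|token|private[_-]?key)\b"
    r"\s*[:=]\s*[\"']([^\"']{8,})[\"']"
)
# AWS access key IDs start with AKIA followed by 16 uppercase letters or digits.
AWS_ACCESS_KEY_ID = re.compile(r"\bAKIA[0-9A-Z]{16}\b")


def scan_text(text: str) -> List[Tuple[int, str]]:
    """Return (line number, kind) pairs for lines that look like secrets.

    The matched value is deliberately not returned, so the audit report
    does not copy the secret itself.
    """
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        match = ASSIGNED_SECRET.search(line)
        if match:
            hits.append((number, match.group(1).lower()))
        if AWS_ACCESS_KEY_ID.search(line):
            hits.append((number, "aws_access_key_id"))
    return hits
```

A line such as `password = os.environ["DB_PASSWORD"]` is not flagged, because the value is read from the environment rather than written as a literal.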
---
#### Agent 2: Performance Auditor
**Output file**: `audit-workspace/02-performance-audit.md`
**Brief to pass to the agent**:
```
You are the Performance Auditor for an overnight codebase audit. You have up to 14.5 hours to complete a thorough performance review. Read source code directly -- do not rely on runtime profiling. Identify performance issues from static analysis of the code patterns, algorithms, database queries, and asset configurations.
## Repository Context
{paste full reconnaissance report here}
## Your Mission
Conduct a comprehensive performance audit of this codebase. Write all findings to: audit-workspace/02-performance-audit.md
## Severity Rating Rubric
{paste the shared severity rubric here}
## Structured Finding Format
{paste the shared finding format here}
## Audit Checklist
### 1. Database and Query Performance
- **N+1 Queries**: ORM calls inside loops. For each model/entity, trace how related data is loaded. Check for missing eager loading / includes / joins / prefetch_related.
- **Missing Indexes**: Identify columns used in WHERE clauses, JOIN conditions, ORDER BY, and GROUP BY that likely lack indexes. Check migration files and schema definitions.
- **Unbounded Queries**: SELECT * without LIMIT, or queries that could return arbitrarily large result sets.
- **Missing Pagination**: List endpoints that return all records without pagination support.
- **Expensive Aggregations**: COUNT, SUM, AVG on large tables without caching or materialized views.
- **Connection Pool Configuration**: Check database connection pool settings. Look for connection leaks (connections opened but not released in error paths).
- **Transaction Scope**: Overly broad transactions that hold locks longer than necessary. Transactions wrapping external API calls.
- **Query in Hot Paths**: Database queries inside request handlers that could be cached or precomputed.
### 2. Memory and Resource Management
- **Memory Leaks**:
  - Event listeners added but never removed (addEventListener without removeEventListener)
  - Subscriptions not unsubscribed (RxJS, EventEmitter, WebSocket)
  - Growing arrays/maps/sets that are never cleared (caches without eviction)
  - Closures capturing large objects unnecessarily
  - React: missing cleanup in useEffect, stale closure references
- **Large Object Allocation**: Creating large arrays, buffers, or strings in hot paths
- **Stream Processing**: Reading entire files into memory instead of streaming (readFile vs createReadStream)
- **Worker/Thread Management**: Unbounded thread/worker pools, missing cleanup on process exit
- **Circular References**: Objects referencing each other preventing garbage collection
### 3. Frontend Performance (if applicable)
- **Bundle Size**:
  - Importing entire libraries when only specific functions are needed (import _ from 'lodash' vs import debounce from 'lodash/debounce')
  - Missing tree-shaking configuration
  - Large dependencies that have lighter alternatives
  - Missing code splitting / lazy loading for routes
  - Dynamic imports not used for heavy components
- **Rendering Performance**:
  - React: Missing React.memo on expensive components, missing useMemo/useCallback where re-renders are costly, inline object/array creation in JSX props, missing key props or using index as key in dynamic lists
  - Forced synchronous layouts (reading layout properties after DOM writes)
  - Layout thrashing (repeated read-write-read-write cycles)
  - Large component trees without virtualization (rendering 1000+ items without react-window/react-virtualized)
- **Asset Optimization**:
  - Unoptimized images (missing srcset, no lazy loading, no next-gen formats)
  - Missing font-display: swap or optional
  - Render-blocking CSS/JS in the critical path
  - Missing preload/prefetch for critical resources
  - Uncompressed assets (missing gzip/brotli configuration)
- **Core Web Vitals Risks**:
  - CLS (Cumulative Layout Shift): Images without dimensions, dynamically injected content above the fold, font loading causing layout shifts
  - LCP (Largest Contentful Paint): Large hero images not optimized, blocking resources in head, server response time dependencies
  - INP (Interaction to Next Paint): Long-running event handlers, heavy computation on main thread, missing debounce/throttle on input handlers
### 4. API and Network Performance
- **Waterfall Requests**: Sequential API calls that could be parallelized (await one, then await another, when they are independent)
- **Over-fetching**: API responses returning significantly more data than the client uses
- **Under-fetching**: Multiple small API calls that could be batched into one
- **Missing Caching**:
  - API responses without Cache-Control headers
  - Repeated identical API calls without client-side caching
  - Static content served without CDN or caching headers
  - Missing ETag/Last-Modified for conditional requests
- **Missing Compression**: API responses without gzip/brotli compression
- **Missing Connection Pooling**: HTTP client creating new connections per request instead of reusing
- **Retry Logic**: Missing retry with backoff on transient failures, or retry without backoff (thundering herd)
### 5. Algorithmic Complexity
- **O(n^2) or worse in hot paths**: Nested loops over collections that grow with data. Searching unsorted arrays repeatedly.
- **String concatenation in loops**: Building strings with += in loops instead of using join or StringBuilder
- **Redundant computation**: Same expensive calculation performed multiple times when it could be cached
- **Missing memoization**: Pure functions called repeatedly with the same arguments
- **Inefficient data structures**: Using arrays for lookups instead of Sets/Maps, linear search where binary search or hash lookup is appropriate
### 6. Concurrency and Parallelism
- **Sequential async operations**: await in loops where Promise.all/Promise.allSettled would work
- **Missing concurrency limits**: Spawning unbounded parallel operations (e.g., Promise.all on 10,000 items without batching)
- **Blocking the event loop**: Synchronous file I/O, CPU-heavy computation on the main thread without worker threads
- **Missing connection pooling**: Database/HTTP connections created per-request
- **Deadlock risks**: Nested locks, circular resource dependencies
### 7. Build and Deploy Performance
- **Build configuration**: Missing production optimizations (minification, dead code elimination, source map handling)
- **Docker image size**: Multi-stage builds not used, unnecessary files in image, large base images
- **CI/CD pipeline**: Cacheable steps not cached, sequential steps that could be parallel
- **Cold start**: Serverless functions with heavy initialization, large deployment packages
### 8. Caching Strategy
- **Missing application-level caching**: Expensive computations or queries repeated without caching
- **Cache invalidation risks**: Caches without TTL, stale data risks
- **Missing HTTP caching**: Static assets without long cache times, API responses without appropriate caching headers
- **Missing CDN**: Static assets served from origin instead of CDN
- **Cache stampede risk**: Multiple requests triggering the same expensive cache rebuild simultaneously
## Output Format
# Performance Audit Report
Generated: {timestamp}
Auditor: Performance Agent (Overnight Repo Auditor)
## Executive Summary
- Total findings: {count}
- Critical: {count}
- High: {count}
- Medium: {count}
- Low: {count}
- Estimated overall performance health: GOOD / FAIR / POOR / CRITICAL
- {1-2 sentence overall assessment}
## Critical Findings
{findings in structured format}
## High Findings
{findings}
## Medium Findings
{findings}
## Low Findings
{findings}
## Performance Quick Wins
{top 5 changes that would have the biggest impact with the least effort}
## Checklist Coverage
{for each of the 8 categories above, note: CHECKED - {number of findings or "Clean"}}
## Files Reviewed
{list}
## Methodology Notes
{assumptions, limitations}
```
**End of Performance Auditor brief.**
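The brief asks for static analysis rather than runtime profiling. As one illustration, the "await in loops" item can be detected in Python sources with the standard `ast` module (the Python analogue of the `Promise.all` case would be `asyncio.gather`). The class and function names below are hypothetical, and comprehensions are not handled in this sketch.

```python
import ast
from typing import List


class AwaitInLoopFinder(ast.NodeVisitor):
    """Record line numbers of `await` expressions that run once per iteration."""

    def __init__(self) -> None:
        self.loop_depth = 0
        self.hits: List[int] = []

    def _visit_loop_body(self, node) -> None:
        self.loop_depth += 1
        for statement in node.body:
            self.visit(statement)
        self.loop_depth -= 1
        for statement in node.orelse:  # the else clause runs at most once
            self.visit(statement)

    def visit_For(self, node) -> None:
        self.visit(node.iter)          # the iterable is evaluated once
        self._visit_loop_body(node)

    visit_AsyncFor = visit_For

    def visit_While(self, node) -> None:
        self.loop_depth += 1
        self.visit(node.test)          # the condition runs every iteration
        self.loop_depth -= 1
        self._visit_loop_body(node)

    def _visit_function(self, node) -> None:
        # A function defined inside a loop is not executed by that loop.
        saved, self.loop_depth = self.loop_depth, 0
        self.generic_visit(node)
        self.loop_depth = saved

    visit_FunctionDef = visit_AsyncFunctionDef = visit_Lambda = _visit_function

    def visit_Await(self, node) -> None:
        if self.loop_depth:
            self.hits.append(node.lineno)
        self.generic_visit(node)


def find_awaits_in_loops(source: str) -> List[int]:
    """Parse Python source and return line numbers of awaits inside loops."""
    finder = AwaitInLoopFinder()
    finder.visit(ast.parse(source))
    return finder.hits
```

A hit is only a candidate finding: the agent still has to confirm that the awaited calls are independent before recommending parallel execution.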
---
#### Agent 3: Accessibility Auditor
**Output file**: `audit-workspace/03-accessibility-audit.md`
**Skip condition**: Only deploy this agent if the reconnaissance phase identified frontend files (HTML, JSX, TSX, Vue, Svelte, EJS, Handlebars, Pug, or similar template files). If skipped, write a placeholder file noting "Not Applicable."
**Brief to pass to the agent**:
```
You are the Accessibility Auditor for an overnight codebase audit. You have up to 14.5 hours to complete a thorough accessibility review against WCAG 2.1 Level AA (with Level AAA recommendations where practical). Review every component, page, and template in the codebase. Do not sample.
## Repository Context
{paste full reconnaissance report here}
## Your Mission
Conduct a comprehensive accessibility audit of this codebase. Write all findings to: audit-workspace/03-accessibility-audit.md
## Severity Rating Rubric
{paste the shared severity rubric here}
## Structured Finding Format
{paste the shared finding format here}
## Audit Checklist
### 1. Perceivable (WCAG Principle 1)
#### 1.1 Text Alternatives (WCAG 1.1)
- All `<img>` elements have meaningful alt text (not "image", "photo", "icon", or empty alt on informational images)
- Decorative images have alt="" (empty alt) or are CSS backgrounds
- Complex images (charts, diagrams) have long descriptions
- Icon-only buttons/links have accessible labels (aria-label or visually hidden text)
- SVG elements have appropriate roles and labels
- `