# QAgent - AGENTS.md

> **AI Agent Guide**: This file is the primary reference for AI coding agents working on QAgent. Read this before starting any work.

---

## Project Overview

**QAgent** is a self-healing QA agent that automatically tests web applications, identifies bugs, applies fixes, and verifies the fixes – all without human intervention. It creates a closed-loop system that iterates until all tests pass.

### The QAgent Loop

```
┌──────────┐    ┌──────────┐    ┌──────────┐    ┌──────────┐
│  TESTER  │───▶│  TRIAGE  │───▶│  FIXER   │───▶│ VERIFIER │
│  Agent   │    │  Agent   │    │  Agent   │    │  Agent   │
└──────────┘    └──────────┘    └──────────┘    └──────────┘
     │                                               │
     │           ┌──────────────┐                    │
     │           │    Redis     │◀───────────────────┘
     │           │  (Knowledge  │
     │           │    Base)     │
     │           └──────────────┘
     │                  │
     ▼                  ▼
┌─────────────────────────────────────────────────────────┐
│                W&B Weave (Observability)                │
└─────────────────────────────────────────────────────────┘
```

1. **Tester Agent** runs E2E tests using Browserbase + Stagehand
2. **Triage Agent** diagnoses failures and queries the knowledge base
3. **Fixer Agent** generates code patches using LLM + past fix patterns
4. **Verifier Agent** applies patches, deploys via Vercel, and re-runs tests
5. **Knowledge Base** (Redis) stores successful fixes for future reference

---

## Technology Stack

| Layer | Technology | Purpose |
|-------|------------|---------|
| **Frontend** | Next.js 14 (App Router), React 18, TypeScript | Demo app and dashboard UI |
| **Styling** | Tailwind CSS, Radix UI | Component styling and UI primitives |
| **Browser Automation** | Browserbase + Stagehand | AI-powered E2E testing |
| **Deployment** | Vercel | Instant deployment after fixes |
| **Vector Memory** | Redis Stack (with vector search) | Store failure traces and enable semantic lookup |
| **Observability** | W&B Weave | Trace agent runs, log metrics, evaluate improvements |
| **Dashboard** | Marimo | Interactive analytics and live visualization |
| **LLM** | OpenAI / Google Gemini / Anthropic | Patch generation and diagnosis |
| **Authentication** | GitHub OAuth | Dashboard access control |
| **Mobile** | React Native (Expo) | Mobile companion app |

---

## Project Structure

```
QAgent/
├── .claude/
│   └── skills/                     # Domain-specific knowledge modules
│       ├── browserbase-stagehand/  # Browser automation patterns
│       ├── redis-vectorstore/      # Vector embeddings, semantic search
│       ├── vercel-deployment/      # Programmatic deployments
│       ├── wandb-weave/            # Tracing and evaluation
│       ├── google-adk/             # ADK/A2A integration patterns
│       ├── marimo-dashboards/      # Reactive notebooks
│       └── qagent-agents/          # Agent implementation patterns
├── agents/                         # Agent implementations
│   ├── analyzer/                   # Run analysis and summarization
│   ├── crawler/                    # Autonomous crawl and discovery flows
│   ├── tester/                     # E2E test execution with Stagehand
│   ├── triage/                     # Failure diagnosis and root cause analysis
│   ├── fixer/                      # LLM-powered patch generation
│   ├── verifier/                   # Patch application and deployment
│   ├── orchestrator/               # Workflow coordination (main entry point)
│   └── adk/                        # ADK workflow & agents (planned integration)
├── app/                            # Next.js App Router
│   ├── api/                        # API routes (auth, runs, patches, tests, webhooks)
│   ├── dashboard/                  # Dashboard UI pages
│   └── layout.tsx                  # App shell and metadata
├── components/                     # React components
│   ├── dashboard/                  # Dashboard-specific components
│   ├── diagnostics/                # Diagnostic views
│   ├── monitoring/                 # Monitoring components
│   ├── onboarding/                 # First-run guidance and setup
│   ├── patches/                    # Patch management UI
│   ├── runs/                       # Run tracking components
│   ├── ui/                         # Shared UI components (shadcn/ui style)
│   └── voice/                      # Voice interface components
├── lib/                            # Shared libraries
│   ├── auth/                       # Authentication utilities (GitHub OAuth)
│   ├── browserbase/                # Browser automation utilities
│   ├── dashboard/                  # Dashboard data helpers
│   ├── git/                        # Local git workflow helpers
│   ├── github/                     # GitHub API integration
│   ├── hooks/                      # React hooks
│   ├── notifications/              # Toasts and notification helpers
│   ├── providers/                  # React providers
│   ├── queue/                      # Job queue processing
│   ├── redis/                      # Redis vector store client
│   ├── redteam/                    # Adversarial testing suite
│   ├── tracetriage/                # Trace analysis and self-improvement
│   ├── utils/                      # Shared utilities
│   └── weave/                      # W&B Weave logging and tracing
├── mobile/                         # React Native mobile app
├── dashboard/                      # Marimo analytics dashboard (app.py)
├── docs/                           # Documentation
│   ├── PRD.md                      # Product Requirements Document
│   ├── DESIGN.md                   # System design and data structures
│   ├── ARCHITECTURE.md             # Architecture Decision Records
│   ├── DEMO_SCRIPT.md              # 3-minute demo script
│   └── SPONSOR_INTEGRATIONS.md     # Sponsor integration details
├── prompts/                        # Agent prompts
├── scripts/                        # Build/deploy helper scripts
├── tests/
│   ├── e2e/                        # E2E test specs and runner
│   └── unit/                       # Vitest unit tests
├── middleware.ts                   # Next.js auth middleware
└── .env.example                    # Environment variable template
```

---

## Build, Test, and Development Commands

```bash
# Install dependencies
pnpm install

# Development server (demo app)
pnpm dev              # Starts Next.js dev server on localhost:3000

# Agent workflow
pnpm run agent        # Start the QAgent orchestrator

# Testing
pnpm test             # Run unit tests with Vitest
pnpm run test:e2e     # Execute E2E flows via tests/e2e/runner.ts

# Code quality
pnpm lint             # Run ESLint + TypeScript type-check
pnpm format           # Format with Prettier
pnpm format:check     # Check formatting without modifying files

# Production
pnpm build            # Build for production
pnpm start            # Start production server

# Dashboard
pnpm dashboard        # Launch Marimo dashboard

# Redis
pnpm redis:init       # Initialize Redis schema
```

---

## Configuration Files

| File | Purpose |
|------|---------|
| `package.json` | pnpm workspace configuration, scripts, dependencies |
| `tsconfig.json` | TypeScript compiler options (strict mode, path aliases) |
| `next.config.js` | Next.js configuration (React StrictMode) |
| `tailwind.config.js` | Tailwind CSS theme, colors, animations |
| `vitest.config.ts` | Vitest test configuration |
| `.eslintrc.json` | ESLint rules (extends next/core-web-vitals) |
| `.prettierrc` | Prettier formatting rules |
| `middleware.ts` | Next.js auth middleware (GitHub OAuth session validation) |

---

## Coding Style & Naming Conventions

- **Formatter**: Prettier is the source of truth
  - `tabWidth: 2`
  - `singleQuote: true`
  - `semi: true`
  - `trailingComma: es5`
  - `printWidth: 100`
- **TypeScript**: Strict mode enabled
  - Avoid `any` unless absolutely justified
  - Use explicit return types for public methods
  - Prefer interfaces over types for object shapes
- **Naming**:
  - PascalCase for components, classes, interfaces
  - camelCase for variables, functions, methods
  - UPPER_SNAKE_CASE for constants
  - kebab-case for file names
- **File Organization**:
  - One class per file for agents
  - Co-locate related types in `lib/types.ts`
  - Use path aliases (`@/`) for imports

---

## Testing Guidelines

### Unit Tests

- Location: `tests/unit/`
- Framework: Vitest
- Pattern: `*.test.ts`
- Run: `pnpm test`
- Coverage: Configured for `agents/**/*.ts` and `lib/**/*.ts`

### E2E Tests

- Location: `tests/e2e/`
- Test specs: `tests/e2e/specs.ts`
- Runner: `tests/e2e/runner.ts`
- Run: `pnpm run test:e2e`
- Framework: Stagehand (AI-powered browser automation)

## Environment Variables

Copy `.env.example` to `.env.local` and fill in required values:

### Required for Core Functionality

| Variable | Description |
|----------|-------------|
| `BROWSERBASE_API_KEY` | Browserbase API key for browser automation |
| `BROWSERBASE_PROJECT_ID` | Browserbase project identifier |
| `OPENAI_API_KEY` | OpenAI API key for LLM patch generation |
| `REDIS_URL` | Redis connection string (local or Redis Cloud) |
| `VERCEL_TOKEN` | Vercel API token for deployments |
| `VERCEL_PROJECT_ID` | Vercel project identifier |
| `WANDB_API_KEY` | Weights & Biases API key for Weave |

### Required for Dashboard

| Variable | Description |
|----------|-------------|
| `GITHUB_CLIENT_ID` | GitHub OAuth App client ID |
| `GITHUB_CLIENT_SECRET` | GitHub OAuth App client secret |
| `SESSION_SECRET` | Session encryption key (generate with `openssl rand -hex 32`) |

### Optional

| Variable | Description |
|----------|-------------|
| `ANTHROPIC_API_KEY` | Anthropic API key (backup LLM) |
| `GOOGLE_API_KEY` | Google API key for Gemini models |
| `GITHUB_TOKEN` | GitHub token for code operations |
| `DATABASE_URL` | PostgreSQL connection string |
| `SLACK_BOT_TOKEN` | Slack notifications |
| `LINEAR_API_KEY` | Linear issue tracking |

**Security Note**: Never commit `.env.local` to version control.

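
A missing variable from the tables above tends to surface late, for example as a failed Stagehand session or Vercel deploy. One way to catch it earlier is a startup check. The sketch below is illustrative only: `findMissingEnvVars`, `assertEnv`, and the two constant names are hypothetical and not part of the QAgent codebase; only the variable names come from the tables above.

```typescript
// Hypothetical startup check; these helpers are NOT part of the QAgent codebase.
// The variable names mirror the "Required" tables in this section.
const REQUIRED_CORE_VARS = [
  'BROWSERBASE_API_KEY',
  'BROWSERBASE_PROJECT_ID',
  'OPENAI_API_KEY',
  'REDIS_URL',
  'VERCEL_TOKEN',
  'VERCEL_PROJECT_ID',
  'WANDB_API_KEY',
] as const;

const REQUIRED_DASHBOARD_VARS = [
  'GITHUB_CLIENT_ID',
  'GITHUB_CLIENT_SECRET',
  'SESSION_SECRET',
] as const;

// Structural stand-in for process.env so the sketch has no Node typings dependency.
type Env = Record<string, string | undefined>;

// Returns the names that are unset or contain only whitespace.
function findMissingEnvVars(env: Env, names: readonly string[]): string[] {
  return names.filter((name) => {
    const value = env[name];
    return value === undefined || value.trim() === '';
  });
}

// Throws a single error listing every missing variable, so one run reports all gaps.
function assertEnv(env: Env, names: readonly string[]): void {
  const missing = findMissingEnvVars(env, names);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
}
```

In a Node entry point this could be called as `assertEnv(process.env, REQUIRED_CORE_VARS)` before any agent starts, with the dashboard list checked only when the dashboard is served. Reporting names rather than values keeps secrets out of logs.
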

---

## Agent Architecture

### Tester Agent (`agents/tester/`)

- Executes E2E tests using Stagehand + Browserbase
- Captures screenshots, DOM snapshots, console logs on failure
- Generates structured `FailureReport` objects
- Instrumented with W&B Weave for observability

### Triage Agent (`agents/triage/`)

- Classifies failures: `UI_BUG`, `BACKEND_ERROR`, `DATA_ERROR`, `TEST_FLAKY`, `UNKNOWN`
- Localizes bugs to file/line using error patterns + LLM
- Queries Redis for similar past issues
- Generates `DiagnosisReport` with root cause analysis

### Fixer Agent (`agents/fixer/`)

- Generates minimal, targeted code patches
- Uses LLM with few-shot examples from knowledge base
- Validates patches for safety and syntax
- Produces unified diff format

### Verifier Agent (`agents/verifier/`)

- Applies patches to filesystem
- Creates backups and handles rollback
- Validates TypeScript/JSX syntax
- Deploys to Vercel and re-runs tests
- Records successful fixes in Redis

### Orchestrator (`agents/orchestrator/`)

- Coordinates the full QAgent loop
- Handles iteration limits and failure recovery
- Logs metrics to Weave
- Entry point: `pnpm run agent`

---

## Development Workflow (Ralph Loop)

Follow this iterative workflow for development:

1. **Read** - Load `AGENTS.md`, `CLAUDE.md`, `GEMINI.md`, and relevant skills
2. **Analyze** - Understand current phase requirements
3. **Plan** - Break down into small, testable increments
4. **Execute** - Implement one increment at a time
5. **Validate** - Test, lint, verify acceptance criteria
6. **Loop** - Update documentation as needed, commit, and return to step 1

---

## Security & Safety Guidelines

### Always

- Keep secrets out of version control
- Validate all patches for dangerous patterns (`eval`, `exec`, `rm -rf`)
- Use parameterized queries for database access
- Sanitize user inputs in RedTeam tests
- Verify GitHub webhook signatures

### Never

- Hardcode secrets or credentials
- Deploy untested patches to production
- Skip Redis lookup results when available
- Ignore Weave logging for agent runs
- Commit broken code

---

## Key Files for AI Agents

| File | Purpose |
|------|---------|
| `AGENTS.md` | Primary repo guide for coding agents |
| `CLAUDE.md` | Detailed tech stack, phase roadmap, always/never rules |
| `GEMINI.md` | Compact project context for Gemini CLI |
| `lib/types.ts` | All TypeScript interfaces and types |
| `prompts/ralph-loop.md` | Development workflow prompts |
| `.claude/skills/` | Domain-specific implementation guides |

---

## Dependencies

### Production

- `next` - Next.js framework
- `@browserbasehq/stagehand` - AI browser automation
- `redis` - Redis client with vector search
- `weave` - W&B Weave observability
- `openai` - OpenAI SDK
- `@radix-ui/*` - Headless UI components
- `framer-motion` - Animations
- `recharts` - Charts for dashboard
- `lucide-react` - Icons

### Development

- `vitest` - Unit testing
- `typescript` - Type checking
- `eslint` - Linting
- `prettier` - Formatting
- `tsx` - TypeScript execution

---

## Troubleshooting

### Common Issues

**Stagehand initialization fails**

- Verify `BROWSERBASE_API_KEY` and `BROWSERBASE_PROJECT_ID`
- Check Browserbase dashboard for session limits

**Redis connection errors**

- For local: ensure Redis Stack is running (`redis-server`)
- For cloud: verify `REDIS_URL` format

**TypeScript errors after patch**

- Fixer Agent may generate type-incorrect code
- Type errors are allowed; syntax errors are blocked
- Check `pnpm lint` output

**Vercel deployment fails**

- Verify `VERCEL_TOKEN` and `VERCEL_PROJECT_ID`
- Check git working directory is clean

---

## References

- [QAgent Paper](https://arxiv.org/html/2502.02747v1) - Five-step agentic patching framework
- [Stagehand Docs](https://www.stagehand.dev/) - AI-powered browser automation
- [Browserbase Docs](https://docs.browserbase.com/) - Cloud browser infrastructure
- [Redis Vector Search](https://redis.io/docs/stack/search/reference/vectors/) - Semantic similarity
- [W&B Weave](https://wandb.ai/site/weave) - LLM observability
- [Marimo](https://marimo.io/) - Reactive Python notebooks

---

*Last updated: March 2026*