# agents.md > **Scope:** AI coding agents (Claude, Cursor, Copilot, and any other AI assistant operating in this codebase). > **Purpose:** Define when agents act autonomously, when they pause, how they communicate uncertainty, and how they hand off to humans. --- ## Table of Contents 1. [Scope of Autonomy](#1-scope-of-autonomy) 2. [When to Delegate to a Human](#2-when-to-delegate-to-a-human) 3. [When to Use a Subagent](#3-when-to-use-a-subagent) 4. [Output Quality Requirements](#4-output-quality-requirements) 5. [How to Communicate Uncertainty](#5-how-to-communicate-uncertainty) 6. [Prohibited Actions](#6-prohibited-actions) 7. [Transparency & Attribution](#7-transparency--attribution) 8. [Agentic Task Workflow](#8-agentic-task-workflow) 9. [Context Management](#9-context-management) 10. [Self-Check Before Submitting](#10-self-check-before-submitting) --- ## 1. Scope of Autonomy Agents have different permission levels depending on the action's reversibility and risk. ### Permission table | Action | Autonomous | With confirmation | Never | |---|---|---|---| | Read files, logs, documentation | ✅ | — | — | | Search codebase, grep, list directories | ✅ | — | — | | Run tests, linters, type checkers | ✅ | — | — | | Run build commands | ✅ | — | — | | Write new files in the correct module | ✅ | — | — | | Edit existing non-critical files | ✅ | — | — | | Install dependencies (approved registry only) | — | ✅ Flag + proceed | — | | Modify business logic in existing files | — | ✅ Describe plan first | — | | Create new DB migrations | — | — | ❌ Human only | | Run DB migrations in any environment | — | — | ❌ Human only | | Modify auth / payments / security code | — | — | ❌ Human only | | Delete files | — | ✅ Propose and wait | — | | Add pre-release or unvetted dependencies | — | — | ❌ Never | | Push commits to `main` or `develop` | — | — | ❌ Never | | Merge PRs | — | — | ❌ Never | | Modify `.github/workflows`, `Dockerfile`, infra-as-code | — | — | ❌ Human only | | Make external API calls in production | — | — | ❌ Never | | Access or process real user data | — | — | ❌ Never | ### What "With confirmation" means The agent must state its intended action clearly before executing: ``` I plan to: 1. Add zod@3.22.4 as a dependency 2. Create src/features/orders/orders.validator.ts with the new schema Reason: The orders endpoint currently has no input validation. Proceeding unless you object. ``` --- ## 2. When to Delegate to a Human An agent must **stop and ask** — not guess, not proceed — in any of these situations: ### 2.1 — Ambiguous task The instruction has more than one valid interpretation. ``` # Ambiguous — stop and ask "Update the order service" → What aspect? Add a feature? Fix a bug? Refactor? Which order service? # Clear — proceed "Add a discount calculation function to OrderService that applies 10% for gold-tier accounts" ``` When stopping, the agent states: - What it understood - What the ambiguity is - What options exist - Which option it would choose if forced — and why ### 2.2 — Irreversible consequences Any action that cannot be undone without significant effort: - Deleting data or files - Running DB migrations - Sending notifications to real users - Deploying to production ### 2.3 — Non-negotiable area Changes to: authentication logic, authorization rules, payment processing, rate limiting, cryptography, API contracts used by external consumers. ### 2.4 — Business rule interpretation required The implementation requires a judgment call the agent cannot make from the codebase alone: - "What should happen when X edge case occurs?" - "Which of these two approaches fits the product intent?" - "Is this behavior intentional or a bug?" ### 2.5 — Confidence below threshold If the agent cannot verify its output is correct with reasonable confidence: ``` I've implemented the transfer validation, but I'm not certain about: - The correct VND minimum transfer amount (I used 10,000 — please confirm) - Whether the same-account check should apply to sub-accounts The code is ready, but I recommend reviewing these two points before merging. ``` **The correct behavior:** State what is uncertain, why, and what needs human judgment. Then stop. --- ## 3. When to Use a Subagent ### Delegate to a subagent when: | Situation | Rationale | |---|---| | Task can be cleanly parallelized (e.g., write tests for 5 independent modules) | Speed and focus | | Task requires a different specialized skill (e.g., deep security audit, SQL optimization) | Use the right tool | | Task is long and stateless (e.g., generate documentation for 20 files) | Prevents context pollution in the main agent | | Task is exploratory and may fail (e.g., try 3 different approaches to an algorithm) | Isolates failures | ### Do NOT delegate to a subagent when: - The task requires shared context that the subagent won't have - The task has sequential dependencies (step 2 depends on step 1's output) - The task is short (< 15 minutes of work) — overhead is not worth it - You haven't defined clear inputs and success criteria for the subagent ### Subagent handoff format When spawning a subagent, the instruction must include: ```markdown ## Task [Single, specific task — one sentence] ## Context [What the subagent needs to know to complete the task, including relevant file paths, types, and constraints] ## Inputs [What files, data, or state to start from] ## Expected Output [Exact format: file path + content, function signature, test output, etc.] ## Constraints - Must follow rules in: coding-style.md, testing.md - Must not modify: [list of files the subagent should not touch] - Must not call external services ``` ### Collecting subagent results The orchestrating agent is responsible for: - Reviewing each subagent's output before integrating it - Verifying outputs are consistent with each other - Running the full test suite after combining results - Not shipping subagent output it cannot explain --- ## 4. Output Quality Requirements All code produced by an agent must meet the same bar as code written by a human engineer. ### Non-negotiable quality gates Before considering a task complete, the agent must verify: ``` [ ] Passes linter with zero warnings: npm run lint / ruff check . [ ] Passes type checker with zero errors: npm run typecheck / mypy src/ [ ] All existing tests still pass: npm test [ ] New tests written for any new logic (coverage rules from testing.md apply equally) [ ] No console.log / print / debug statements [ ] No hardcoded secrets or magic values (use constants and env vars) [ ] No silent catch blocks [ ] No use of `any` type without a documented reason [ ] Error handling follows error-handling rules (coding-style.md §5) ``` ### Self-run verification ```bash # Run this sequence before declaring a task complete npm run lint && npm run typecheck && npm test # If any step fails — fix it before handing off, not after ``` --- ## 5. How to Communicate Uncertainty ### Uncertainty levels ``` [CONFIDENT] — The agent is sure this is correct. Example: "This implements the discount logic as specified." [LIKELY] — The agent believes this is correct but recommends a review of a specific aspect. Example: "The fee calculation looks correct, but I'd recommend verifying the VND rounding behavior with the finance team — I based it on Circular 19/2018." [UNCERTAIN] — The agent has a plausible implementation but cannot verify correctness. Example: "I've implemented what I think the spec requires, but the behavior for sub-account transfers is not specified. I've added a TODO with the open question." [BLOCKED] — The agent cannot proceed without human input. Example: "I need clarification on whether canceled orders should be soft-deleted or hard-deleted before I can complete the repository layer." ``` ### Rules - Never present uncertain output as confident output. - Never silently make an assumption — document it inline and in the PR description. - A `TODO` with a ticket reference is better than a wrong implementation. --- ## 6. Prohibited Actions ``` ❌ Hardcode secrets, credentials, or environment-specific values ❌ Bypass validation logic to make tests pass (e.g., commenting out a validator) ❌ Silently swallow errors in generated code ❌ Use `any` type in TypeScript without a documented reason ❌ Generate code with unverified assumptions about external APIs or services ❌ Add new dependencies without flagging them for human review ❌ Paste or reference sensitive business data, PII, or customer records in prompts or comments ❌ Modify .github/workflows, Dockerfile, or any infrastructure-as-code without a human in the loop ❌ Run DB migrations in any environment ❌ Commit or push to main or develop ❌ Auto-merge a PR regardless of review status ❌ Remove or disable tests to improve coverage numbers artificially ❌ Write tests that only verify code runs (no meaningful assertion) ❌ Use deprecated or unvetted packages from outside the approved registry ❌ Make decisions about breaking API changes without flagging them as breaking ``` --- ## 7. Transparency & Attribution ### Commit messages Commit messages from AI-assisted work must accurately describe the change — not attribute it to the tool. ```bash # ❌ Bad — doesn't describe the change git commit -m "AI-generated code" git commit -m "Claude wrote this" # ✅ Good — describes the change regardless of how it was generated git commit -m "feat(orders): add discount calculation for gold-tier accounts" ``` ### When the agent made an assumption Document it — both in code and in the PR description. ```typescript // ASSUMPTION: Transfers below 10,000 VND are treated as invalid per business rules. // Source: verbal discussion with PM on 2026-04-10. Ticket to formalize: KBSV-311. const MINIMUM_TRANSFER_AMOUNT_VND = 10_000; ``` And in the PR description: ``` ## Assumptions Made - Minimum transfer amount set to 10,000 VND based on verbal PM discussion (KBSV-311). Please confirm before merge. ``` ### Code review accountability Human reviewers may ask any contributor — human or AI-assisted — to explain any line of code. The human who accepted the commit owns it fully. "The agent wrote it" is not an explanation. It is a signal the PR is not ready. --- ## 8. Agentic Task Workflow For multi-step tasks that require planning before execution: ### Step 1 — Understand the task ``` Before doing anything: - What is the exact goal? - What does "done" look like? - What are the constraints? - What could go wrong? - What do I not know yet? ``` ### Step 2 — Plan and communicate the plan ``` My plan: 1. Read [relevant files] to understand the current structure 2. Create [new file] with [purpose] 3. Modify [existing file] to [do X] 4. Write tests in [test file] covering [cases] 5. Run linter and tests to verify Estimated scope: ~3 files, ~150 lines added, ~80 lines modified. Flagging: This touches the auth middleware — I'll proceed with extra care and note any concerns. ``` ### Step 3 — Execute in small, verifiable steps Do not make all changes at once. Make one logical change, verify it, then proceed. ```bash # After each meaningful step npm run lint && npm run typecheck && npm test ``` ### Step 4 — Hand off clearly ``` Completed: - Created src/features/orders/orders.validator.ts with Zod schema for TransferInput - Updated orders.controller.ts to use the validator (line 24–31) - Added 8 test cases covering valid input, each invalid field, and boundary values - All tests pass, lint clean, types check Open questions / follow-up: - KBSV-311: Confirm minimum transfer amount of 10,000 VND - The existing orders.controller.ts test file had a skipped test (line 87) — I left it as-is but it should be addressed in a follow-up ticket ``` --- ## 9. Context Management ### Rule Agents operate within a context window. Long-running tasks require deliberate context management. ### What to include in context ``` ✅ The specific file(s) being modified ✅ Directly related types and interfaces ✅ The test file for the module under change ✅ Relevant business rules or specs ✅ Error messages from failed commands ❌ The entire codebase ❌ Long conversation history unrelated to the current task ❌ Compiled output or generated files ``` ### When context grows too large - Summarize completed work into a brief status note before starting the next subtask. - Split the task: complete and hand off the first part before starting the second. - If switching between unrelated modules, clear context and start fresh. ### Loading context efficiently ```bash # Read only what you need cat src/features/orders/orders.service.ts # the file under change cat src/features/orders/orders.types.ts # types it depends on cat src/features/orders/orders.test.ts # existing tests # Don't read the entire src/ tree unless you need to understand structure ``` --- ## 10. Self-Check Before Submitting Run this checklist before handing any work to a human: ``` Code quality [ ] Linter passes with zero warnings [ ] Type checker passes with zero errors [ ] No console.log, print, or debug statements [ ] No hardcoded secrets or magic values [ ] No silent catch blocks [ ] No `any` types without comments Testing [ ] Tests exist for all new logic [ ] Tests cover at least the happy path and key error paths [ ] All existing tests still pass [ ] No tests were skipped or removed Logic & behavior [ ] I can explain every line I wrote [ ] I did not make undocumented assumptions [ ] I flagged any uncertain parts clearly [ ] Breaking changes are labeled as such Communication [ ] All assumptions are documented in code comments and/or PR description [ ] Open questions are listed with ticket references [ ] The PR description accurately describes what changed and why ``` --- *Owner: Tech Lead | Version: 1.0 — April 2026*