# AGENTS.md — CXecute

## What is this project?

CXecute is a self-coordinating swarm of Codex agents that automates contact center agentification. Raw customer support transcripts + enterprise API specs go in → a deployment-ready Google Gemini Enterprise Customer Experience (GECX) package comes out.

---

## Read these first

| File | Purpose |
|---|---|
| `REQUIREMENTS.md` | **What to build** — full product specification, UI spec, backend spec, agent contracts, data requirements |
| `PLANS.md` | **How to build** — GitHub issues first, test-driven development, branch strategy, code standards, definition of done |
| `.codex/config.toml` | Project-level Codex settings (model, sandbox, approval mode) |
| `.codex/skills/*.md` | Reusable skill definitions for each specialist agent |

**Always follow PLANS.md.** Create a GitHub issue before writing code. Write tests first. Follow the code standards. No exceptions.

---

## Architecture

```
[Next.js UI] ←REST + SSE→ [FastAPI Control Plane] ←dispatch→ [Codex App Server]
                                     ↕                                ↕
                           [In-memory run state]          [Codex MCP Server tools]
                                     ↕                                ↕
                            [/output artifacts]               [Repo Filesystem]
```

| Layer | Tech | Location |
|---|---|---|
| Frontend | Next.js (App Router) + Tailwind | `/ui` |
| Control plane | Python + FastAPI | `/server` |
| Execution engine | Codex App Server | dispatched from `/server/orchestrator.py` |
| Tool layer | Codex MCP Server | `.codex/skills/` + `/server/agents/` + repo resources |
| State | In-memory + filesystem | `/server/state.py`, `/output/` |

---

## Control plane and state model

- The frontend talks only to FastAPI at `/api/*`.
- FastAPI is the source of truth for the active swarm run, stage state, logs, comms, and repair counters.
- FastAPI dispatches real specialist-stage work to Codex App Server.
- Codex App Server agents use MCP-exposed tools and resources to inspect `/data`, `/specs`, `/output`, and failure context.
- For GECX, the recommended pattern is hybrid: generate the local package first, then optionally apply direct live changes to the GECX app through MCP.
- Hackathon v1 intentionally uses **no database**. Durable artifacts live in `/output`, and optional run manifests may be stored under `/output/runs/`.
- Codex CLI via subprocess is fallback guidance only if App Server or MCP integration is blocked on demo day.

## Repository structure

```
CXecute/
├── AGENTS.md                         ← You are here. Start point for Codex.
├── REQUIREMENTS.md                   ← Full product spec (read before building anything)
├── PLANS.md                          ← Dev process, TDD, standards (read before writing code)
├── Readme.md                         ← Project overview for humans
├── requirements.txt                  ← Python dependencies
│
├── .codex/
│   ├── config.toml                   ← Codex project settings
│   └── skills/                       ← Reusable agent skill definitions
│       ├── intent-discoverer.md      — Skill: analyze transcripts → intent taxonomy
│       ├── api-crawler.md            — Skill: parse Swagger → intent-to-API mapping
│       ├── data-flattener.md         — Skill: nested schemas → flat agent payloads
│       ├── agent-codegen.md          — Skill: generate a GECX package
│       ├── test-writer.md            — Skill: generate + validate GECX evaluations
│       └── gateway-configurator.md   — Skill: generate Apigee proxy bundles
│
├── /ui                               ← Next.js frontend (App Router, client-side only)
│   ├── app/
│   │   ├── layout.tsx                — Root layout (dark theme, fonts)
│   │   ├── page.tsx                  — Single dashboard page ('use client')
│   │   └── globals.css               — Tailwind base + dark theme
│   ├── components/                   — Header, StageCard, LogFeed, AgentComms, Controls
│   ├── hooks/                        — useSSE, useSwarmState
│   ├── lib/                          — FastAPI client helpers
│   └── __tests__/                    — Vitest component tests
│
├── /server                           ← Python FastAPI backend
│   ├── main.py                       — FastAPI app (REST + SSE endpoints, control plane)
│   ├── orchestrator.py               — Swarm orchestration logic + App Server dispatch
│   ├── agents/                       — Prompt templates per agent (read by orchestrator)
│   │   ├── intent_discoverer.md
│   │   ├── api_crawler.md
│   │   ├── data_flattener.md
│   │   ├── agent_codegen.md
│   │   ├── gateway_config.md
│   │   └── test_writer.md
│   ├── mock/                         — Pre-scripted demo data (logs, comms, outputs)
│   └── tests/                        — pytest tests
│
├── /data                             ← Input datasets and source notes
│   ├── bitext/                       — Primary retail demo subset + provenance
│   └── raw/                          — Full source downloads (gitignored, fetched locally)
│
├── /specs                            ← Curated retail OpenAPI specifications
│   ├── orders-api.yaml
│   ├── shipping-api.yaml
│   ├── customers-api.yaml
│   └── payments-api.yaml
│
├── /scripts
│   └── fetch_raw_sources.py          ← Pull full Bitext + vendor spec repos locally
│
└── /output                           ← Agent outputs (generated at runtime)
    ├── intents/                      — taxonomy.json
    ├── mappings/                     — api-mapping.json
    ├── flat/                         — payloads.json
    ├── gecx/                         — app package, evaluations, and apply results
    ├── gateway/                      — apigee-bundle.json
    ├── tests/                        — results.json, failures.json
    └── runs/                         — optional JSON run manifests
```

---

## The 6 agents

Each agent is a specialist in the swarm. They run sequentially, with feedback loops between them.

### Execution order

```
S1 Intent Discovery → S2 API Crawler → S3 Data Flattener → S4 Agent Codegen → S5 Test Writer → S6 Gateway Config
                                                                  ↑                 │
                                                                  └── self-repair ──┘
```

### Agent contracts

| # | Agent | Reads | Writes | Skill file |
|---|---|---|---|---|
| S1 | Intent Discoverer | `/data/bitext/retail_primary_sample.csv` | `/output/intents/taxonomy.json` | `.codex/skills/intent-discoverer.md` |
| S2 | API Crawler | `/specs/*.yaml` + `/output/intents/taxonomy.json` | `/output/mappings/api-mapping.json` | `.codex/skills/api-crawler.md` |
| S3 | Data Flattener | `/specs/*.yaml` + `/output/mappings/api-mapping.json` | `/output/flat/payloads.json` | `.codex/skills/data-flattener.md` |
| S4 | Agent Codegen | `/output/intents/` + `/output/mappings/` + `/output/flat/` | `/output/gecx/` package + `package-summary.json` | `.codex/skills/agent-codegen.md` |
| S5 | Test Writer | `/output/gecx/*` + `/output/mappings/` + `/specs/` | `/output/tests/results.json` + GECX evaluation summaries + optional `apply-result.json` | `.codex/skills/test-writer.md` |
| S6 | Gateway Config | `/output/mappings/api-mapping.json` | `/output/gateway/apigee-bundle.json` | `.codex/skills/gateway-configurator.md` |

### Inter-agent feedback loops

1. **API Crawler → Intent Discoverer**: When intents have no matching API endpoint, flag them for deprioritization
2. **Test Writer → Agent Codegen**: When tests fail, send failure details for automatic repair
3. **Agent Codegen → Test Writer**: After patching, trigger re-validation

These feedback loops make CXecute a self-coordinating swarm, not a static pipeline.

### Self-repair protocol

1. Test Writer runs tests → finds failures → writes `/output/tests/failures.json`
2. FastAPI/orchestrator detects failures → re-dispatches Agent Codegen through Codex App Server with repair context
3. Agent Codegen patches the generated GECX package → writes updated files
4. FastAPI/orchestrator re-dispatches Test Writer → validates fix
5. Max 2 repair cycles before flagging for human review

---

## Agent output format

All agents MUST:

- Write valid JSON or YAML to their designated `/output/` path
- Print structured progress that FastAPI can translate into the UI log feed:

  ```json
  {"stage": "intent", "type": "info", "timestamp": "00:03", "message": "Tokenizing conversations..."}
  ```

- Use these log types: `info`, `success`, `warn`, `error`, `repair`

For hackathon v1, stage 4 should generate a real GECX package structure, not a single platform config file. The local cancel-order template package can be used as a structural reference during development.
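
The structured progress line required in this section could be produced by a small helper like the one below. This is a minimal sketch, not the project's actual implementation: the function name `format_log_line` is hypothetical, and it assumes the `timestamp` field is elapsed run time formatted as `MM:SS`, which the example value `"00:03"` suggests but the spec does not state.

```python
import json
from typing import Literal, get_args

# The five log types listed in this section.
LogType = Literal["info", "success", "warn", "error", "repair"]


def format_log_line(
    stage: str, log_type: LogType, elapsed_seconds: int, message: str
) -> str:
    """Return one JSON progress line with stage, type, timestamp, and message.

    Assumption: the timestamp is elapsed time rendered as MM:SS.
    """
    if log_type not in get_args(LogType):
        raise ValueError(f"unknown log type: {log_type!r}")
    minutes, seconds = divmod(elapsed_seconds, 60)
    record = {
        "stage": stage,
        "type": log_type,
        "timestamp": f"{minutes:02d}:{seconds:02d}",
        "message": message,
    }
    return json.dumps(record)


if __name__ == "__main__":
    # One line per event on stdout, flushed so a reader sees it immediately.
    print(format_log_line("intent", "info", 3, "Tokenizing conversations..."), flush=True)
```

Rejecting unknown log types at the source keeps malformed entries out of the UI log feed instead of leaving the check to the consumer.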
Run modes for the hackathon build:

- `generate-only`: create the package locally and validate it without live GECX mutation
- `generate-and-apply`: create the package locally, then push direct changes to the live GECX app through MCP and record the result in `apply-result.json`

---

## Coding conventions

- **Python**: PEP 8, type hints, docstrings, `pathlib.Path`, async by default
- **TypeScript**: strict mode, no `any`, functional components, `'use client'` on all pages
- **Testing**: TDD always — write the test first (see PLANS.md)
- **Issues**: GitHub issue before any code (see PLANS.md)
- **Commits**: `type(scope): description` — e.g., `feat(ui): add stage cards component`
- **Files**: Max 300 lines per file — split if longer

---

## Quick start for Codex

When starting a task:

1. Read this file (AGENTS.md) for context
2. Read `REQUIREMENTS.md` for what to build
3. Read `PLANS.md` for how to build it
4. Create a GitHub issue for the task
5. Write failing tests
6. Implement to make tests pass
7. Open a PR referencing the issue

## Data Notes

- Primary transcript path: Bitext retail-focused subset in `/data/bitext/retail_primary_sample.csv`
- Primary spec path: curated Northstar Retail OpenAPI files in `/specs/`
- Full raw sources are available locally via `python3 scripts/fetch_raw_sources.py`
- Use `/data/raw/` and `/specs/vendor/` for provenance or deeper demo walkthroughs, not as the default runtime input
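
One way to honor the default-input rule in the data notes is to resolve inputs through a single helper. The sketch below is illustrative only: `default_inputs` and its `repo_root` parameter are hypothetical names, and it assumes the curated specs sit directly under `specs/` as `*.yaml` files.

```python
from pathlib import Path

# Relative paths taken from the data notes; the helper itself is hypothetical.
PRIMARY_TRANSCRIPTS = Path("data/bitext/retail_primary_sample.csv")
SPECS_DIR = Path("specs")


def default_inputs(repo_root: Path) -> tuple[Path, list[Path]]:
    """Return the primary transcript CSV and the curated top-level spec files."""
    transcripts = repo_root / PRIMARY_TRANSCRIPTS
    if not transcripts.is_file():
        raise FileNotFoundError(f"missing primary transcripts: {transcripts}")
    # glob("*.yaml") does not recurse, so specs/vendor/ is left out by default.
    specs = sorted((repo_root / SPECS_DIR).glob("*.yaml"))
    return transcripts, specs
```

Because the glob is non-recursive, anything fetched into `specs/vendor/` stays available for walkthroughs without leaking into a default run.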