# AGENTS.md — CXecute

## What is this project?

CXecute is a self-coordinating swarm of Codex agents that automates contact center agentification. Raw customer support transcripts + enterprise API specs go in → a deployment-ready Google Gemini Enterprise Customer Experience (GECX) package comes out.

---

## Read these first

| File | Purpose |
|---|---|
| `REQUIREMENTS.md` | **What to build** — full product specification, UI spec, backend spec, agent contracts, data requirements |
| `PLANS.md` | **How to build** — GitHub issues first, test-driven development, branch strategy, code standards, definition of done |
| `.codex/config.toml` | Project-level Codex settings (model, sandbox, approval mode) |
| `.codex/skills/*.md` | Reusable skill definitions for each specialist agent |

**Always follow PLANS.md.** Create a GitHub issue before writing code. Write tests first. Follow the code standards. No exceptions.

---

## Architecture

```
[Next.js UI] ←REST + SSE→ [FastAPI Control Plane] ←dispatch→ [Codex App Server]
                               ↕                                 ↕
                      [In-memory run state]          [Codex MCP Server tools]
                               ↕                                 ↕
                        [/output artifacts]               [Repo Filesystem]
```

| Layer | Tech | Location |
|---|---|---|
| Frontend | Next.js (App Router) + Tailwind | `/ui` |
| Control plane | Python + FastAPI | `/server` |
| Execution engine | Codex App Server | dispatched from `/server/orchestrator.py` |
| Tool layer | Codex MCP Server | `.codex/skills/` + `/server/agents/` + repo resources |
| State | In-memory + filesystem | `/server/state.py`, `/output/` |
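
The `REST + SSE` link between the UI and FastAPI implies that log events reach the browser as Server-Sent Events frames. The following is a minimal sketch of that framing using only the standard library. The function name `format_sse` and the event name `log` are assumptions for illustration, not part of the project's actual API.

```python
import json


def format_sse(payload: dict, event: str = "log") -> str:
    """Serialize one payload as a Server-Sent Events frame.

    An SSE frame is a group of `field: value` lines ended by a blank line.
    """
    # json.dumps emits no raw newlines, so the payload fits on one data line.
    data = json.dumps(payload, separators=(",", ":"))
    return f"event: {event}\ndata: {data}\n\n"


# Hypothetical usage: frame a single log event for the UI log feed.
frame = format_sse(
    {"stage": "intent", "type": "info", "message": "Tokenizing conversations..."}
)
```

A streaming endpoint in `/server/main.py` could yield frames like this one, but the real endpoint shape is defined in `REQUIREMENTS.md`, not here.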

---

## Control plane and state model

- The frontend talks only to FastAPI at `/api/*`.
- FastAPI is the source of truth for the active swarm run, stage state, logs, comms, and repair counters.
- FastAPI dispatches real specialist-stage work to Codex App Server.
- Codex App Server agents use MCP-exposed tools and resources to inspect `/data`, `/specs`, `/output`, and failure context.
- For GECX, the recommended pattern is hybrid: generate the local package first, then optionally apply direct live changes to the GECX app through MCP.
- Hackathon v1 intentionally uses **no database**. Durable artifacts live in `/output`, and optional run manifests may be stored under `/output/runs/`.
- Invoking the Codex CLI via subprocess is a fallback only, for use if App Server or MCP integration is blocked on demo day.
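
Because v1 has no database, the run state described above can live in a single in-process object. The sketch below is one possible shape. The class names, field names, and status values are illustrative assumptions and do not describe the actual contents of `/server/state.py`.

```python
from dataclasses import dataclass, field


@dataclass
class StageState:
    """Status and log lines for one stage (field names are hypothetical)."""

    status: str = "pending"  # assumed values: pending, running, done, failed
    logs: list[dict] = field(default_factory=list)


@dataclass
class RunState:
    """In-memory record of the active swarm run held by the control plane."""

    run_id: str
    stages: dict[str, StageState] = field(default_factory=dict)
    comms: list[dict] = field(default_factory=list)
    repair_cycles: int = 0

    def stage(self, name: str) -> StageState:
        # Create the stage entry lazily the first time it is referenced.
        return self.stages.setdefault(name, StageState())


# Hypothetical usage: mark a stage as running and record one log line.
run = RunState(run_id="demo")
run.stage("intent").status = "running"
run.stage("intent").logs.append({"type": "info", "message": "started"})
```

Keeping everything in one object makes it straightforward to dump an optional run manifest under `/output/runs/` with `dataclasses.asdict`.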

## Repository structure

```
CXecute/
├── AGENTS.md                ← You are here. Start point for Codex.
├── REQUIREMENTS.md          ← Full product spec (read before building anything)
├── PLANS.md                 ← Dev process, TDD, standards (read before writing code)
├── Readme.md                ← Project overview for humans
├── requirements.txt         ← Python dependencies
│
├── .codex/
│   ├── config.toml          ← Codex project settings
│   └── skills/              ← Reusable agent skill definitions
│       ├── intent-discoverer.md   — Skill: analyze transcripts → intent taxonomy
│       ├── api-crawler.md         — Skill: parse Swagger → intent-to-API mapping
│       ├── data-flattener.md      — Skill: nested schemas → flat agent payloads
│       ├── agent-codegen.md       — Skill: generate a GECX package
│       ├── test-writer.md         — Skill: generate + validate GECX evaluations
│       └── gateway-configurator.md — Skill: generate Apigee proxy bundles
│
├── /ui                      ← Next.js frontend (App Router, client-side only)
│   ├── app/
│   │   ├── layout.tsx       — Root layout (dark theme, fonts)
│   │   ├── page.tsx         — Single dashboard page ('use client')
│   │   └── globals.css      — Tailwind base + dark theme
│   ├── components/          — Header, StageCard, LogFeed, AgentComms, Controls
│   ├── hooks/               — useSSE, useSwarmState
│   ├── lib/                 — FastAPI client helpers
│   └── __tests__/           — Vitest component tests
│
├── /server                  ← Python FastAPI backend
│   ├── main.py              — FastAPI app (REST + SSE endpoints, control plane)
│   ├── orchestrator.py      — Swarm orchestration logic + App Server dispatch
│   ├── agents/              — Prompt templates per agent (read by orchestrator)
│   │   ├── intent_discoverer.md
│   │   ├── api_crawler.md
│   │   ├── data_flattener.md
│   │   ├── agent_codegen.md
│   │   ├── gateway_config.md
│   │   └── test_writer.md
│   ├── mock/                — Pre-scripted demo data (logs, comms, outputs)
│   └── tests/               — pytest tests
│
├── /data                    ← Input datasets and source notes
│   ├── bitext/              — Primary retail demo subset + provenance
│   └── raw/                 — Full source downloads (gitignored, fetched locally)
│
├── /specs                   ← Curated retail OpenAPI specifications
│   ├── orders-api.yaml
│   ├── shipping-api.yaml
│   ├── customers-api.yaml
│   └── payments-api.yaml
│
├── /scripts
│   └── fetch_raw_sources.py ← Pull full Bitext + vendor spec repos locally
│
└── /output                  ← Agent outputs (generated at runtime)
    ├── intents/             — taxonomy.json
    ├── mappings/            — api-mapping.json
    ├── flat/                — payloads.json
    ├── gecx/                — app package, evaluations, and apply results
    ├── gateway/             — apigee-bundle.json
    ├── tests/               — results.json, failures.json
    └── runs/                — optional JSON run manifests
```

---

## The 6 agents

Each agent is a specialist in the swarm. They run sequentially, with feedback loops between them.

### Execution order
```
S1 Intent Discovery → S2 API Crawler → S3 Data Flattener → S4 Agent Codegen → S5 Test Writer → S6 Gateway Config
                                                                    ↑                │
                                                                    └── self-repair ──┘
```

### Agent contracts

| # | Agent | Reads | Writes | Skill file |
|---|---|---|---|---|
| S1 | Intent Discoverer | `/data/bitext/retail_primary_sample.csv` | `/output/intents/taxonomy.json` | `.codex/skills/intent-discoverer.md` |
| S2 | API Crawler | `/specs/*.yaml` + `/output/intents/taxonomy.json` | `/output/mappings/api-mapping.json` | `.codex/skills/api-crawler.md` |
| S3 | Data Flattener | `/specs/*.yaml` + `/output/mappings/api-mapping.json` | `/output/flat/payloads.json` | `.codex/skills/data-flattener.md` |
| S4 | Agent Codegen | `/output/intents/` + `/output/mappings/` + `/output/flat/` | `/output/gecx/` package + `package-summary.json` | `.codex/skills/agent-codegen.md` |
| S5 | Test Writer | `/output/gecx/*` + `/output/mappings/` + `/specs/` | `/output/tests/results.json` + GECX evaluation summaries + optional `apply-result.json` | `.codex/skills/test-writer.md` |
| S6 | Gateway Config | `/output/mappings/api-mapping.json` | `/output/gateway/apigee-bundle.json` | `.codex/skills/gateway-configurator.md` |
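
The contracts in the table can also be written down as data and checked for ordering mistakes. The sketch below coarsens every path to its directory and verifies that each `/output` directory a stage reads was written by an earlier stage. `CONTRACTS` and `unmet_inputs` are hypothetical names, not existing project code.

```python
# (stage, reads, writes), with paths coarsened to directories for illustration.
CONTRACTS = [
    ("S1", ["/data/bitext/"], ["/output/intents/"]),
    ("S2", ["/specs/", "/output/intents/"], ["/output/mappings/"]),
    ("S3", ["/specs/", "/output/mappings/"], ["/output/flat/"]),
    ("S4", ["/output/intents/", "/output/mappings/", "/output/flat/"], ["/output/gecx/"]),
    ("S5", ["/output/gecx/", "/output/mappings/", "/specs/"], ["/output/tests/"]),
    ("S6", ["/output/mappings/"], ["/output/gateway/"]),
]


def unmet_inputs(contracts):
    """Return (stage, path) pairs where a stage reads an /output path
    that no earlier stage has written."""
    written = set()
    problems = []
    for stage, reads, writes in contracts:
        for path in reads:
            # Only /output paths are produced by the swarm itself.
            if path.startswith("/output/") and path not in written:
                problems.append((stage, path))
        written.update(writes)
    return problems


print(unmet_inputs(CONTRACTS))
```

An empty result means the sequential order satisfies every declared dependency. It does not validate the contents of any artifact.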

### Inter-agent feedback loops

1. **API Crawler → Intent Discoverer**: When intents have no matching API endpoint, flag them for deprioritization
2. **Test Writer → Agent Codegen**: When tests fail, send failure details for automatic repair
3. **Agent Codegen → Test Writer**: After patching, trigger re-validation

These feedback loops make CXecute a self-coordinating swarm, not a static pipeline.
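
As a rough illustration of the first loop, the check below finds intents that have no matching endpoint. The schemas of `taxonomy.json` and `api-mapping.json` are not specified in this file, so the shapes used here (a list of intent names, and a dict from intent name to endpoint list) are assumptions, as are the sample intent names and endpoint.

```python
def unmapped_intents(intents: list[str], mapping: dict[str, list[str]]) -> list[str]:
    """Return intents with no mapped endpoint, preserving taxonomy order."""
    # A missing key and an empty endpoint list both count as unmapped.
    return [name for name in intents if not mapping.get(name)]


# Hypothetical data: intent names and the endpoint are made up for illustration.
intents = ["cancel_order", "track_order", "complain_about_weather"]
mapping = {"cancel_order": ["POST /orders/{id}/cancel"], "track_order": []}

flagged = unmapped_intents(intents, mapping)
```

The flagged list is what the API Crawler would hand back to the Intent Discoverer for deprioritization.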

### Self-repair protocol
1. Test Writer runs tests → finds failures → writes `/output/tests/failures.json`
2. FastAPI/orchestrator detects failures → re-dispatches Agent Codegen through Codex App Server with repair context
3. Agent Codegen patches the generated GECX package → writes updated files
4. FastAPI/orchestrator re-dispatches Test Writer → validates fix
5. Max 2 repair cycles before flagging for human review
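
The protocol above amounts to a bounded retry loop. In the sketch below, `run_tests` and `repair` are placeholder callables standing in for dispatching Test Writer and Agent Codegen through Codex App Server. Their names, the function name `run_with_repair`, and the outcome strings are assumptions.

```python
from typing import Callable

MAX_REPAIR_CYCLES = 2


def run_with_repair(
    run_tests: Callable[[], list[dict]],
    repair: Callable[[list[dict]], None],
) -> tuple[str, int]:
    """Run tests, repairing on failure, and return (outcome, cycles used)."""
    failures = run_tests()
    cycles = 0
    while failures and cycles < MAX_REPAIR_CYCLES:
        repair(failures)        # Agent Codegen patches the package
        cycles += 1
        failures = run_tests()  # Test Writer re-validates the fix
    outcome = "passed" if not failures else "needs_human_review"
    return outcome, cycles


# Stubbed example: the tests fail once, then pass after one repair.
results = iter([[{"test": "cancel_order"}], []])
outcome = run_with_repair(lambda: next(results), lambda failures: None)
print(outcome)
```

Counting cycles in the orchestrator rather than in the agents keeps the repair limit in one place, consistent with FastAPI being the source of truth for repair counters.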

---

## Agent output format

All agents MUST:
- Write valid JSON or YAML to their designated `/output/` path
- Print structured progress that FastAPI can translate into the UI log feed:
  ```json
  {"stage": "intent", "type": "info", "timestamp": "00:03", "message": "Tokenizing conversations..."}
  ```
- Use these log types: `info`, `success`, `warn`, `error`, `repair`
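
A small helper can keep progress lines in this format. This sketch assumes the `timestamp` field is elapsed time as `MM:SS`, which the example above suggests but does not state. The name `log_line` is hypothetical.

```python
import json

LOG_TYPES = {"info", "success", "warn", "error", "repair"}


def log_line(stage: str, log_type: str, elapsed_seconds: int, message: str) -> str:
    """Build one structured progress line as a JSON string."""
    if log_type not in LOG_TYPES:
        raise ValueError(f"unknown log type: {log_type!r}")
    # Assumption: timestamps are elapsed minutes and seconds since run start.
    minutes, seconds = divmod(elapsed_seconds, 60)
    return json.dumps(
        {
            "stage": stage,
            "type": log_type,
            "timestamp": f"{minutes:02d}:{seconds:02d}",
            "message": message,
        }
    )


print(log_line("intent", "info", 3, "Tokenizing conversations..."))
```

Rejecting unknown log types at the source means the UI log feed never has to handle an unexpected `type` value.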

For hackathon v1, S4 (Agent Codegen) should generate a real GECX package structure, not a single platform config file. The local cancel-order template package can be used as a structural reference during development.

Run modes for the hackathon build:

- `generate-only`: create the package locally and validate it without live GECX mutation
- `generate-and-apply`: create the package locally, then push direct changes to the live GECX app through MCP and record the result in `apply-result.json`
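
One way to represent the two run modes in the control plane is a string-valued enum, so the mode can be parsed directly from a request field. The names `RunMode` and `should_apply` are assumptions for illustration.

```python
from enum import Enum


class RunMode(str, Enum):
    """Run modes for the hackathon build, keyed by their documented names."""

    GENERATE_ONLY = "generate-only"
    GENERATE_AND_APPLY = "generate-and-apply"


def should_apply(mode: RunMode) -> bool:
    """Return True only when the run may mutate the live GECX app."""
    return mode is RunMode.GENERATE_AND_APPLY


# Parsing a raw string raises ValueError for anything but the two modes.
mode = RunMode("generate-only")
```

Both modes generate the local package first. Only `generate-and-apply` goes on to push changes through MCP and write `apply-result.json`.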

---

## Coding conventions

- **Python**: PEP 8, type hints, docstrings, `pathlib.Path`, async by default
- **TypeScript**: strict mode, no `any`, functional components, `'use client'` on all pages
- **Testing**: TDD always — write the test first (see PLANS.md)
- **Issues**: GitHub issue before any code (see PLANS.md)
- **Commits**: `type(scope): description` — e.g., `feat(ui): add stage cards component`
- **Files**: Max 300 lines per file — split if longer
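
The commit format can be checked mechanically. The pattern below is a sketch: it accepts any lowercase word as `type`, because the permitted types are not listed in this file, and the names `COMMIT_PATTERN` and `is_valid_commit` are hypothetical.

```python
import re

# type(scope): description, where the description starts with a non-space character.
COMMIT_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)\((?P<scope>[a-z0-9_-]+)\): (?P<description>\S.*)$"
)


def is_valid_commit(subject: str) -> bool:
    """Return True if the commit subject follows `type(scope): description`."""
    return COMMIT_PATTERN.match(subject) is not None


print(is_valid_commit("feat(ui): add stage cards component"))
```

A check like this could run in a commit hook or in CI. If `PLANS.md` restricts the allowed types, replace `[a-z]+` with that explicit list.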

---

## Quick start for Codex

When starting a task:
1. Read this file (AGENTS.md) for context
2. Read `REQUIREMENTS.md` for what to build
3. Read `PLANS.md` for how to build it
4. Create a GitHub issue for the task
5. Write failing tests
6. Implement to make tests pass
7. Open a PR referencing the issue

## Data notes

- Primary transcript path: Bitext retail-focused subset in `/data/bitext/retail_primary_sample.csv`
- Primary spec path: curated Northstar Retail OpenAPI files in `/specs/`
- Full raw sources are available locally via `python3 scripts/fetch_raw_sources.py`
- Use `/data/raw/` and `/specs/vendor/` for provenance or deeper demo walkthroughs, not as the default runtime input