---
name: pipeline-context
description: Load context for pipeline, cron, Lambda, OCR, and translation work. Use when starting any pipeline monitoring, debugging, or processing task.
---

# Pipeline Context

Read these files before proceeding with pipeline work:

1. `memory/pipeline-ops.md` — Emergency controls, worker architecture, operational details
2. `memory/lessons-learned.md` — Operational postmortems and patterns
3. `.claude/docs/pipeline.md` — Full processing pipeline (states, crons, prompts, costs)
4. `.claude/docs/worker-architecture.md` — Lambda worker details
5. `.claude/docs/page-lifecycle.md` — Page processing states

## Critical Rules

- Model selection: prefer `getModelForBook(book)` from `src/lib/types/ai-models.ts` over hardcoding a model name.
  - BPH books and non-Latin-script languages go to `gemini-3-flash-preview` (full quality).
  - Everything else goes to `gemini-3.1-flash-lite-preview` (50% cheaper, comparable quality on Latin-script text).
  - Enrich-worker uses `gemini-3.1-flash-lite-preview` for all phases (summary+index, chapters, quality scoring, collection assignment).
  - Never use anything below Gemini v3; `gemini-2.x` is deprecated.
- NEVER use Gemini Batch API for translation — use Lambda workers (SQS FIFO)
- Any script overwriting `ocr.data` or `translation.data` MUST call `createRevision()` first
- MongoDB Atlas saturates at ~40 concurrent Lambda jobs; treat that as the global backpressure limit
- Emergency stop: set `paused: true` on the `system_config` document with `_id: 'processing_control'`
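
A minimal sketch of how the model-routing, backpressure, and emergency-stop rules above might look as pure functions. The `BookLike` shape, its `isBph` and `nonLatinScript` fields, and the names `pickModelSketch`, `canDispatch`, and `activeJobs` are hypothetical and exist only for illustration. The real routing lives in `getModelForBook(book)`, whose signature and internals are not shown here, and the cap of 40 takes the approximate saturation point above as a hard limit.

```typescript
// Hypothetical book shape for illustration; the project's real type may differ.
interface BookLike {
  isBph: boolean;          // assumption: a flag marking BPH books
  nonLatinScript: boolean; // assumption: a precomputed script classification
}

const FULL_QUALITY_MODEL = "gemini-3-flash-preview";
const LITE_MODEL = "gemini-3.1-flash-lite-preview";

// Mirrors the routing rule described above; this is NOT the actual getModelForBook().
function pickModelSketch(book: BookLike): string {
  return book.isBph || book.nonLatinScript ? FULL_QUALITY_MODEL : LITE_MODEL;
}

// Only the fields named in the emergency-stop rule; any other fields are unknown.
interface ProcessingControl {
  _id: "processing_control";
  paused: boolean;
}

// Assumption: the "~40 concurrent Lambda jobs" figure is used as a hard cap.
const MAX_CONCURRENT_JOBS = 40;

// Returns true only when processing is not paused and there is headroom left.
function canDispatch(control: ProcessingControl | null, activeJobs: number): boolean {
  if (control !== null && control.paused) {
    return false; // emergency stop wins over everything else
  }
  return activeJobs < MAX_CONCURRENT_JOBS;
}
```

Checking the pause flag before the job count means an emergency stop takes effect even when the queue is nearly empty.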

## Audit Trail

All AI calls are logged to the `gemini_usage` collection via `logGeminiCall()` in `src/lib/gemini-logger.ts`.
- Book history timeline: `GET /api/books/[id]/history` (assembles from 6 collections)
- Dashboard: `GET /api/admin/processing-dashboard?provider=ia`
- Error classification: `src/lib/errors.ts` → `classifyError(error)`
- `cost_tracking` collection is DEPRECATED — use `gemini_usage` for all cost queries
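
A small sketch of building the two audit endpoints listed above for a given deployment. `baseUrl`, `trimTrailingSlash`, `bookHistoryUrl`, and `dashboardUrl` are hypothetical helper names. The response shapes are not documented here, so the sketch stops at constructing the request URLs.

```typescript
// Removes trailing slashes so joined paths do not contain "//".
function trimTrailingSlash(url: string): string {
  return url.replace(/\/+$/, "");
}

// Book history timeline: GET /api/books/[id]/history
function bookHistoryUrl(baseUrl: string, bookId: string): string {
  return `${trimTrailingSlash(baseUrl)}/api/books/${encodeURIComponent(bookId)}/history`;
}

// Processing dashboard: GET /api/admin/processing-dashboard?provider=...
function dashboardUrl(baseUrl: string, provider: string): string {
  const query = `provider=${encodeURIComponent(provider)}`;
  return `${trimTrailingSlash(baseUrl)}/api/admin/processing-dashboard?${query}`;
}
```

Either URL can then be passed to whatever HTTP client the project already uses.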

## Staleness Check

After reading the memory files above, flag anything that contradicts what you observe in the codebase:
- File paths or function names that no longer exist
- Behavioral claims that don't match the current code
- Stats or counts with dates older than 14 days — note as potentially stale

If you find contradictions, update the memory file immediately and tell the user what changed.
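
The 14-day rule above can be sketched as a simple date comparison. `isPotentiallyStale`, `statDate`, and `maxAgeDays` are hypothetical names. How dates are actually recorded in the memory files is not specified here, so parsing them is left out.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Returns true when statDate is strictly more than maxAgeDays before now.
function isPotentiallyStale(statDate: Date, now: Date, maxAgeDays: number = 14): boolean {
  const ageMs = now.getTime() - statDate.getTime();
  return ageMs > maxAgeDays * MS_PER_DAY;
}
```

A stat dated exactly 14 days ago is not flagged, which matches the "older than 14 days" wording.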

## Also Relevant

- Batch processing (Gemini Batch API): `.claude/docs/batch-processing.md`
- Observability & audit trail: `.claude/docs/observability.md`
- First translation identification: `.claude/docs/first-translation-system.md`