---
name: backend-design
description: Elite Tier Backend standards, including Vertical Slice Architecture, Zero Trust Security, and High-Performance API protocols.
allowed-tools: Read, Write, Edit, Glob, Grep, Bash
---
# Backend Design System
> **Philosophy:** The Backend is the Fortress. Logic is Law. Latency is the Enemy.
> **Core Principle:** ISOLATE features. TRUST no one. SCALE linearly.
**ANTI-HAPPY PATH MANDATE (CRITICAL):** Never assume the ideal scenario. AI-generated code often fails by ignoring edge cases and failure modes. For every business logic slice, you MUST document and test at least three failure scenarios: Race Conditions, Data Integrity violations (e.g., unique constraint overlaps), and Boundary failures. Reject any implementation that only covers the 'Happy Path'. Engineering is the art of handling what shouldn't happen.
## 🚀 ELITE TIER KNOWLEDGE (ARCHITECTURAL PROTOCOLS)
### 0. The "Vertical Slice" Law (The Anti-Layer Mandate)
> **CRITICAL:** You are FORBIDDEN from creating "Horizontal Layers" (Controllers, Services, Repositories) as primary folders.
**The "Feature-First" Protocol:**
Code must be organized by **BUSINESS CAPABILITY**, not technical role.
1. **The Slice:** A single directory (e.g., `features/create-order/`) contains EVERYTHING needed for that feature:
* `handler.ts` (Controller)
* `logic.ts` (Domain/Service)
* `schema.ts` (DTO/Validation)
* `db.ts` (Data Access)
2. **The Benefit:** Changing a feature requires touching only ONE folder. No "Shotgun Surgery" across 5 layers.
3. **Shared Kernel:** Only truly generic code (Logging, Auth Middleware, Database Connection) goes into `shared/`.
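A minimal sketch of what one slice might contain, shown as a single listing for brevity. The file boundaries are marked in comments; the `create-order` names, the `OrderStore` interface, and the integer-cents pricing are illustrative assumptions, not part of this standard.

```typescript
// features/create-order/schema.ts (hypothetical contents)
export interface CreateOrderRequest { sku: string; quantity: number }
export interface CreateOrderResponse { orderId: string; totalCents: number }

// features/create-order/db.ts: data access behind a narrow, slice-local interface
export interface OrderStore {
  priceCentsOf(sku: string): number | undefined;
  insert(order: { sku: string; quantity: number; totalCents: number }): string;
}

// features/create-order/logic.ts: pure domain rule, no I/O
export function orderTotalCents(unitPriceCents: number, quantity: number): number {
  if (!Number.isInteger(quantity) || quantity <= 0) {
    throw new Error("quantity must be a positive integer");
  }
  return unitPriceCents * quantity;
}

// features/create-order/handler.ts: wires schema, logic, and db together
export function createOrder(req: CreateOrderRequest, store: OrderStore): CreateOrderResponse {
  const price = store.priceCentsOf(req.sku);
  if (price === undefined) throw new Error(`unknown sku: ${req.sku}`);
  const totalCents = orderTotalCents(price, req.quantity);
  const orderId = store.insert({ sku: req.sku, quantity: req.quantity, totalCents });
  return { orderId, totalCents };
}
```

Everything the feature needs lives side by side, so a change to order creation stays inside `features/create-order/`.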
### 1. The "Modular Monolith" Mandate
* **Microservices Ban:** Do NOT start with microservices. Start with a **Modular Monolith**.
* **Modulith Rules:**
* Modules must be isolated (like internal microservices).
* Modules communicate via **Events** (in-process or via a Message Bus), NEVER by importing another module's code directly.
* **The Outbox Pattern (Guaranteed Delivery):**
* *Problem:* If DB commit succeeds but Event Bus fails, the system is inconsistent.
* *Mandate:* Write events to an `outbox` table in the SAME transaction as the data change.
* *Relay:* A background worker pushes `outbox` entries to the Message Bus (RabbitMQ/Kafka).
* **Data Sovereignty:** Module A cannot query Module B's tables. It must ask Module B via API/Event.
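The Outbox rules above can be sketched with an in-memory stand-in for the database. The `Db` shape, the `transaction` helper (copy, mutate, commit), and the `order.placed` event name are assumptions made for illustration; a real implementation would use the database's own transactions and a separate relay worker.

```typescript
interface OutboxRow { id: number; type: string; payload: string; publishedAt: number | null }
interface Db { orders: { id: string; total: number }[]; outbox: OutboxRow[] }

// Hypothetical stand-in for a DB transaction: mutate a copy, commit only if fn succeeds.
function transaction(db: Db, fn: (tx: Db) => void): void {
  const tx: Db = JSON.parse(JSON.stringify(db));
  fn(tx); // a throw here leaves `db` untouched (rollback)
  db.orders = tx.orders;
  db.outbox = tx.outbox;
}

// Data change and event are written in the SAME transaction.
function placeOrder(db: Db, id: string, total: number): void {
  transaction(db, (tx) => {
    tx.orders.push({ id, total });
    tx.outbox.push({
      id: tx.outbox.length + 1,
      type: "order.placed",
      payload: JSON.stringify({ id, total }),
      publishedAt: null,
    });
  });
}

// Relay: publish pending rows and mark each one only after the bus accepted it.
// A crash between publish and mark re-sends the row, so delivery is at-least-once.
function relayOnce(db: Db, publish: (type: string, payload: string) => void, now: number): number {
  let sent = 0;
  for (const row of db.outbox) {
    if (row.publishedAt !== null) continue;
    publish(row.type, row.payload); // a throw leaves the row pending for the next run
    row.publishedAt = now;
    sent++;
  }
  return sent;
}
```

Because the relay can re-send a row, consumers should treat events as at-least-once and deduplicate on the event id.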
### 2. The "Zero Trust" Security Protocol
> **Detailed protocols:** See [security-protocols.md](security-protocols.md)
**Quick Rules:**
1. **Strict Serialization:** NEVER return raw DB entities → Use ResponseDTO
2. **Validation at Gate:** Schema validation (Zod/Pydantic) BEFORE logic
3. **Token Sovereignty:** Prefer PASETO v4 over JWT. If JWT is forced, sign with EdDSA (Ed25519).
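Rules 1 and 2 can be sketched as follows. The hand-rolled `parseUpdateEmail` is only a dependency-free stand-in for a schema library such as Zod, and the `UserRow` / `UserResponse` shapes are hypothetical.

```typescript
// Hypothetical DB entity: contains fields that must never leave the server.
interface UserRow { id: string; email: string; passwordHash: string; isAdmin: boolean }
// Hypothetical ResponseDTO: an explicit allow-list of what the client may see.
interface UserResponse { id: string; email: string }
interface UpdateEmailRequest { email: string }

// Validation at the gate: accept `unknown`, return a typed value or throw.
function parseUpdateEmail(input: unknown): UpdateEmailRequest {
  if (typeof input !== "object" || input === null) {
    throw new Error("body must be an object");
  }
  const email = (input as Record<string, unknown>).email;
  if (typeof email !== "string" || !/^[^@\s]+@[^@\s]+$/.test(email)) {
    throw new Error("email is invalid");
  }
  return { email }; // unknown extra fields are dropped, not passed through
}

// Strict serialization: copy fields one by one instead of returning the row.
function toUserResponse(row: UserRow): UserResponse {
  return { id: row.id, email: row.email };
}
```

Mapping field by field means a column added to the table later is not exposed until someone adds it to the DTO on purpose.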
## 🏗️ Reliability & Performance Contracts
### 3. The "Sub-100ms" Performance Mandate
* **The Latency Budget:** P50 < 100ms. P99 < 500ms.
* **UUIDv7 (The Time-Lord Rule):**
* *Ban:* Never use `UUIDv4` (Random) for Primary Keys. It fragments B-Tree indexes.
* *Mandate:* Use **UUIDv7** (Time-ordered). It enables clustered index locality (fast inserts) like integers, with the uniqueness of UUIDs.
* **N+1 Assassin:**
* *Check:* Always inspect ORM queries. Loops triggering DB calls are a "Level 0" error.
* *Fix:* Use `DataLoader` pattern or explicit `JOIN` loading.
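To illustrate why UUIDv7 keys sort by creation time, here is a minimal generator sketch that follows the RFC 9562 field layout (48-bit millisecond timestamp, version nibble, variant bits, random remainder). The function name `uuidv7` and the `nowMs` parameter are assumptions for this example. It makes no ordering guarantee for IDs created within the same millisecond, so a maintained library or the database's native generator is likely the better choice in production.

```typescript
import { randomBytes } from "node:crypto";

function uuidv7(nowMs: number = Date.now()): string {
  const bytes = randomBytes(16); // start fully random, then overwrite the fixed fields

  // Bytes 0-5: 48-bit big-endian Unix timestamp in milliseconds.
  let ts = nowMs;
  for (let i = 5; i >= 0; i--) {
    bytes[i] = ts % 256;
    ts = Math.floor(ts / 256);
  }

  bytes[6] = (bytes[6] & 0x0f) | 0x70; // high nibble = version 7
  bytes[8] = (bytes[8] & 0x3f) | 0x80; // top two bits = variant 10

  const hex = bytes.toString("hex");
  return [
    hex.slice(0, 8),
    hex.slice(8, 12),
    hex.slice(12, 16),
    hex.slice(16, 20),
    hex.slice(20),
  ].join("-");
}
```

Because the timestamp occupies the most significant bytes, later IDs compare greater as strings and as bytes, which keeps new rows at the right-hand edge of a B-Tree index.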
### 4. API Reliability Contracts
* **RFC 9457 (Problem Details, formerly RFC 7807):**
* *Ban:* Returning `{ "error": "Something went wrong" }`.
* *Mandate:* Return standard Problem JSON:
```json
{
  "type": "https://api.myapp.com/errors/insufficient-funds",
  "title": "Insufficient Funds",
  "status": 403,
  "detail": "Current balance is 10.00, required is 15.00",
  "instance": "/transactions/12345"
}
```
* **Idempotency Keys:**
* *Rule:* All critical `POST/PATCH` (Money, State Change) must accept an `Idempotency-Key` header.
* *Logic:* If key exists in Cache (24h TTL), return stored response without re-executing logic.
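The idempotency logic above might look like the following sketch. It uses an in-memory `Map` in place of the cache, and the `IdempotencyStore` class, its `run` method, and the injectable clock are assumptions for illustration. A shared cache would also need an atomic reserve step (for example Redis `SET ... NX`) so that two concurrent requests with the same key cannot both execute.

```typescript
interface StoredResponse { status: number; body: string }

class IdempotencyStore {
  private readonly entries = new Map<string, { response: StoredResponse; expiresAt: number }>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = 24 * 60 * 60 * 1000, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Runs `execute` at most once per key within the TTL; replays the stored response otherwise.
  run(key: string, execute: () => StoredResponse): StoredResponse {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > this.now()) {
      return hit.response; // replay: business logic is NOT re-executed
    }
    const response = execute();
    this.entries.set(key, { response, expiresAt: this.now() + this.ttlMs });
    return response;
  }
}
```

A handler would pass the value of the `Idempotency-Key` header as `key` and wrap the state-changing logic in `execute`.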
## 🗄️ Database Integrity & Design
### 5. Database Integrity & Design
* **Hard Constraints:** Application-level checks are "Suggestions". Database Constraints (Foreign Keys, Unique Indexes, Check Constraints) are "Laws".
* **Cursor Pagination:**
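* *Ban:* `OFFSET / LIMIT` on large tables. The database still reads and discards every skipped row, so cost grows with page depth (O(N)).
* *Mandate:* Cursor-based (keyset) pagination over an indexed sort key with a unique tie-breaker (`WHERE (created_at, id) < (:cursor_created_at, :cursor_id) ORDER BY created_at DESC, id DESC LIMIT 20`).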
* **Migration Discipline:**
* Never alter a column in a way that locks the table for >1s.
* Use "Expand and Contract" pattern for breaking changes.
* **Concurrency Control:**
* *Problem:* Two users update the same record. The last one wipes the first.
* *Mandate:* Use Optimistic Locking. Add a `version` (int) column.
* *Logic:* `UPDATE ... SET ..., version = version + 1 WHERE id = X AND version = Y`. If 0 rows are affected, another writer got there first: throw `StaleObjectException`.
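The version check can be sketched with an in-memory table. `updateWhereVersion` imitates the conditional `UPDATE` and returns the affected row count; the `AccountRow` shape and the function names are assumptions for illustration.

```typescript
class StaleObjectException extends Error {
  constructor(message: string) {
    super(message);
    this.name = "StaleObjectException";
  }
}

interface AccountRow { id: string; balance: number; version: number }

// In-memory stand-in for:
//   UPDATE accounts SET balance = $1, version = version + 1
//   WHERE id = $2 AND version = $3
// Returns the number of rows affected (0 or 1).
function updateWhereVersion(
  table: Map<string, AccountRow>,
  id: string,
  expectedVersion: number,
  balance: number,
): number {
  const row = table.get(id);
  if (row === undefined || row.version !== expectedVersion) return 0;
  table.set(id, { id, balance, version: expectedVersion + 1 });
  return 1;
}

// `read` is the row as the caller last saw it; a mismatch means someone else wrote first.
function saveBalance(table: Map<string, AccountRow>, read: AccountRow, balance: number): void {
  if (updateWhereVersion(table, read.id, read.version, balance) === 0) {
    throw new StaleObjectException(`account ${read.id} changed since version ${read.version}`);
  }
}
```

On `StaleObjectException` the caller typically re-reads the row and either retries or reports a 409 conflict.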
### 6. AI & Vector Readiness
* **Semantic Storage:** Backend must be ready to store embeddings (Vector Types).
* **Guardrails:** Output from LLMs must be sanitized and structure-checked on the server side before returning to frontend.
## 👁️ Observability & Monitoring (The "Glass Box" Protocol)
### 7. Structured Logging Only
* **Ban:** `console.log("User updated")`. String logs are useless for machines.
* **Mandate:** JSON logs with correlation IDs, e.g. `{ "level": "info", "event": "user_updated", "user_id": "u7-...", "trace_id": "..." }`.
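A minimal sketch of a structured log line builder, assuming the field names from the example above plus a `ts` timestamp. The `logLine` helper and its parameters are hypothetical; a production service would more likely use an established JSON logger.

```typescript
type Level = "debug" | "info" | "warn" | "error";

// Builds one JSON log line. Reserved keys are written last so that
// caller-supplied fields cannot overwrite level, event, trace_id, or ts.
function logLine(
  level: Level,
  event: string,
  traceId: string,
  fields: Record<string, unknown> = {},
  now: Date = new Date(),
): string {
  return JSON.stringify({
    ...fields,
    level,
    event,
    trace_id: traceId,
    ts: now.toISOString(),
  });
}
```

Usage: `console.log(logLine("info", "user_updated", traceId, { user_id: userId }))` emits a single machine-parseable line per event.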
### 8. Distributed Tracing (OpenTelemetry)
* Every request MUST carry a `traceparent` header.
* Spans must cover: DB Queries, External API Calls, and Redis operations.
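For reference, a `traceparent` value in the W3C Trace Context format is `version-traceid-parentid-flags`, all lowercase hex. The parser below is a sketch that accepts only the four-field layout and rejects the all-zero IDs the specification marks as invalid; `parseTraceparent` and `TraceContext` are hypothetical names. In practice the OpenTelemetry SDK handles propagation and this is rarely written by hand.

```typescript
interface TraceContext { traceId: string; parentId: string; sampled: boolean }

// Returns null when the header is missing or malformed, so the caller can start a new trace.
function parseTraceparent(header: string | undefined): TraceContext | null {
  const match = /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/.exec(header ?? "");
  if (match === null) return null;

  const version = match[1];
  const traceId = match[2];
  const parentId = match[3];
  const flags = match[4];

  // Version ff is forbidden; all-zero trace or parent IDs are invalid.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentId)) return null;

  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```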
### 9. Health Checks
* Liveness (`/health/live`): "Am I running?" (Instant, no checks).
* Readiness (`/health/ready`): "Can I take traffic?" (Check DB/Redis connection).
## 🛡️ Resilience Patterns (The "Anti-Fragile" Mandate)
### 10. Circuit Breakers
* Wrap ALL external calls (Payment Gateways, 3rd Party APIs) in a Circuit Breaker.
* *Logic:* After 5 consecutive failures, open the circuit and fail fast for 30s, then let a trial call through. Don't drown the downstream service.
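A minimal circuit breaker sketch using the thresholds above (5 failures, 30s). It is synchronous to keep the example short, whereas real external calls would be `async`. The class names, the injectable clock, and the choice to count consecutive failures and allow one trial call after the cooldown are assumptions of this sketch.

```typescript
class CircuitOpenError extends Error {
  constructor() {
    super("circuit open: failing fast");
    this.name = "CircuitOpenError";
  }
}

class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly threshold: number;
  private readonly cooldownMs: number;
  private readonly now: () => number;

  constructor(threshold: number = 5, cooldownMs: number = 30000, now: () => number = Date.now) {
    this.threshold = threshold;
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  call<T>(fn: () => T): T {
    // Open and still cooling down: reject without touching the downstream service.
    if (this.openedAt !== null && this.now() - this.openedAt < this.cooldownMs) {
      throw new CircuitOpenError();
    }
    // Closed, or cooldown elapsed (half-open): let the call through.
    try {
      const result = fn();
      this.failures = 0;
      this.openedAt = null; // success closes the circuit
      return result;
    } catch (err) {
      this.failures++;
      if (this.failures >= this.threshold) this.openedAt = this.now(); // (re)open
      throw err;
    }
  }
}
```

One breaker instance per downstream dependency keeps a failing payment gateway from tripping calls to unrelated services.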
### 11. Rate Limiting
* Protect *every* public endpoint with a Token Bucket rate limiter (Redis-backed).
* Differentiate limits by User Role (Anon: 60/min, Pro: 1000/min).
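The token bucket algorithm itself can be sketched in a few lines. This version is in-memory and single-process, so it illustrates only the refill arithmetic; the Redis-backed variant required above would need the same read-refill-take step to run atomically on the server. The `TokenBucket` class and its parameters are assumptions for this example.

```typescript
class TokenBucket {
  private tokens: number;
  private last: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;
  private readonly now: () => number;

  constructor(capacity: number, refillPerSec: number, now: () => number = Date.now) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.now = now;
    this.tokens = capacity; // start full so a new client can burst up to `capacity`
    this.last = now();
  }

  // Returns true and consumes one token if the request is allowed.
  tryTake(): boolean {
    const t = this.now();
    const refill = ((t - this.last) / 1000) * this.refillPerSec;
    this.tokens = Math.min(this.capacity, this.tokens + refill);
    this.last = t;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}
```

Under these assumptions the anonymous tier (60/min) maps to `new TokenBucket(60, 1)`, with one bucket kept per client key.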
## 🔧 Workflow Rules
### 1. The Pre-Flight Checklist
0. **Environment Hardening:**
* Verify all `process.env` variables at startup using a schema (e.g., `t3-env` or `envalid`). If a key is missing, crash immediately. Do not start the server in an undefined state.
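A dependency-free sketch of the fail-fast startup check. The hand-written `loadEnv` stands in for a schema library such as `envalid`, and the `DATABASE_URL` and `PORT` variables are hypothetical examples of required keys.

```typescript
interface AppEnv { DATABASE_URL: string; PORT: number }

// Validates the raw environment once at startup and returns a typed config.
// Throwing here is intentional: the process should crash before serving traffic.
function loadEnv(env: Record<string, string | undefined>): AppEnv {
  const required = ["DATABASE_URL", "PORT"];
  const missing = required.filter((key) => !env[key]);
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }

  const port = Number(env.PORT);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error("PORT must be an integer in 1..65535");
  }

  return { DATABASE_URL: env.DATABASE_URL as string, PORT: port };
}
```

Calling `loadEnv(process.env)` as the first statement of the entry point, and passing the returned object around instead of reading `process.env` elsewhere, keeps every later read typed and validated.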
Before writing a single handler:
1. **Define the DTOs:** Request Schema (Zod) and Response Schema.
2. **Define the Error States:** What can go wrong? (404, 409, 429).
3. **Define the Data Access:** What is the most efficient SQL query?
### 2. The "No Magic" Rule
* Avoid "Magical" ORM features (Lazy Loading, Auto-Saving context).
* Prefer Explicit over Implicit. "Write the SQL (or Query Builder) if the ORM hides expensive logic."
### 3. Testing Pyramid
1. **Unit:** Test Domain Logic in isolation (mock DB).
2. **Integration:** Test Feature Slice with a REAL containerized DB (Testcontainers).
3. **E2E:** Test critical flows from the "Outside".
## 📂 Cognitive Audit Cycle
Before committing code:
1. **Is the endpoint under a feature slice?** (Not in a generic controller folder).
2. **Is Input Validated with a Schema?** (Zero Trust).
3. **Are DB Indexes used?** (Run `EXPLAIN ANALYZE`).
4. **Is the Primary Key UUIDv7?** (Index Perf).
5. **Are secrets managed properly?** (No hardcoded strings).
---
## 🔗 CROSS-SKILL INTEGRATION
| Skill | Backend Adds... |
|-------|-----------------|
| `@frontend-design` | API contracts, CORS config, error responses |
| `@clean-code` | Input validation, parameterized queries only (no string-built SQL), dependency security |
| `@tdd-mastery` | Integration tests with Testcontainers |
| `@planning-mastery` | API endpoint task breakdown |
| `@debug-mastery` | Structured logging, distributed tracing |
> **Command:** Use these skills to architect "Fortress-Level" backend systems.