---
name: system-type-web-service
description: "Domain patterns for web service architecture: API design (REST/GraphQL/gRPC), scaling, data layer, observability, failure modes, and anti-patterns. Use when designing or evaluating a web service, API, or request/response system."
---

# System Type: Web Service

Patterns, failure modes, and anti-patterns for request/response web services.

---

## API Patterns

### REST

**When to use.** Public APIs, browser-facing services, CRUD-heavy domains, when discoverability and cacheability matter.

**When to avoid.** Highly relational data with many nested queries (N+1 fetches). Real-time bidirectional communication. High-throughput internal service-to-service calls where payload efficiency matters.

**Key decisions.** Resource naming, versioning strategy (URL vs header), pagination approach, error format.

### GraphQL

**When to use.** Multiple client types needing different data shapes from the same backend. Complex, nested data relationships. When frontend teams need to iterate independently from backend.

**When to avoid.** Simple CRUD APIs. Server-to-server communication. When caching at the HTTP layer is important (GraphQL queries are typically sent as POSTs to a single endpoint, which standard HTTP caches do not cache). When the team doesn't have GraphQL operational expertise.

**Key decisions.** Schema-first vs code-first, query complexity limits, N+1 resolution strategy (DataLoader pattern), authorization model.

### gRPC

**When to use.** Internal service-to-service communication. When payload size and serialization speed matter. When you want strongly-typed contracts with code generation. Streaming use cases.

**When to avoid.** Browser clients (requires a gRPC-Web proxy). When human readability of requests matters for debugging. When the team lacks protobuf experience. Public APIs (tooling ecosystem is smaller).

**Key decisions.** Proto file organization, backward compatibility discipline, deadline propagation, load balancing (long-lived HTTP/2 connections multiplex many requests, so spreading load needs L7 or client-side balancing rather than per-connection L4 balancing).

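
The N+1 resolution strategy listed under GraphQL's key decisions can be illustrated with a small batching loader. The following is a minimal sketch of the DataLoader idea in plain `asyncio`, not the API of any particular GraphQL library; `BatchLoader`, `fetch_users`, and `demo` are hypothetical names introduced for illustration, and `fetch_users` stands in for a single batched database query.

```python
import asyncio


class BatchLoader:
    """Hypothetical DataLoader-style helper: coalesces load() calls into one batch."""

    def __init__(self, batch_fn):
        # batch_fn: async function that takes a list of keys and returns
        # a list of values in the same order.
        self._batch_fn = batch_fn
        self._pending = {}          # key -> Future waiting for that key's value
        self._dispatch_task = None  # set while a batch is scheduled but not yet started

    def load(self, key):
        """Return a future for `key`; duplicate keys share one future."""
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if self._dispatch_task is None:
            # The task only starts once the current caller yields to the event
            # loop, so every load() issued before then joins the same batch.
            self._dispatch_task = loop.create_task(self._dispatch())
        return self._pending[key]

    async def _dispatch(self):
        # Take ownership of the current batch; later load() calls start a new one.
        pending, self._pending = self._pending, {}
        self._dispatch_task = None
        keys = list(pending)
        try:
            values = await self._batch_fn(keys)
        except Exception as exc:
            for future in pending.values():
                if not future.done():
                    future.set_exception(exc)
            return
        for key, value in zip(keys, values):
            if not pending[key].done():
                pending[key].set_result(value)


async def demo():
    calls = []  # records each batched fetch so the batching is observable

    async def fetch_users(ids):
        # Assumption: stands in for one `WHERE id IN (...)` style query.
        calls.append(list(ids))
        return [{"id": i} for i in ids]

    loader = BatchLoader(fetch_users)
    # Three resolver-style lookups, one of them a duplicate key.
    users = await asyncio.gather(loader.load(1), loader.load(2), loader.load(1))
    return users, calls
```

Calling `asyncio.run(demo())` issues three lookups but only one call to `fetch_users`, which is the property that removes the N+1 fetch pattern. A production loader would usually also add per-request scoping so cached values never leak between users.
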
## Scaling Patterns

**Horizontal scaling.** Add more instances behind a load balancer. Requires stateless services (or externalized state). The default approach for web services. Watch for: session affinity requirements, connection pool exhaustion at the database, cache consistency across instances.

**Vertical scaling.** Bigger machines. Simpler than horizontal but has a ceiling. Right for: databases, in-memory workloads, and when horizontal scaling's coordination cost exceeds the performance benefit.

**Autoscaling.** Scale instance count based on metrics (CPU, request rate, queue depth). Essential for variable load. Watch for: cold start latency, scaling lag, minimum instance counts for availability, cost runaway from misconfigured scaling policies.

**CDN and edge caching.** Serve static and cacheable dynamic content from edge locations. Reduces latency and origin load for any traffic that can be served from cache. Watch for: cache invalidation complexity, cache poisoning, TTL tuning, varying content by headers (`Accept-Language`, `Authorization`).

**Read replicas.** Offload read traffic from the primary database. Watch for: replication lag causing stale reads, read-after-write consistency requirements, connection routing complexity.

## Data Layer Patterns

**RDBMS (PostgreSQL, MySQL).** Default choice for structured, relational data. Strong consistency, ACID transactions, mature tooling. Scales vertically well; horizontal scaling requires sharding (hard) or read replicas (easier, but they scale reads only).

**Document stores (MongoDB, DynamoDB).** When data is naturally document-shaped, schema varies per record, or you need horizontal scaling without managing sharding yourself. Watch for: lack of joins, transaction limitations across documents, query patterns that don't match the data model.

**Key-value stores (Redis, Memcached).** Caching, session storage, rate limiting, leaderboards. Extremely fast for simple access patterns.
Watch for: data loss on restart (unless configured for persistence), memory limits, using it as a primary datastore when it's a cache.

**Search engines (Elasticsearch, OpenSearch).** Full-text search, log aggregation, analytics on semi-structured data. Watch for: operational complexity, eventual consistency, write amplification, cluster sizing that's hard to change later.

## Observability

**Structured logging.** JSON logs with consistent fields (request_id, user_id, service, timestamp, level). Enable correlation across services. Avoid: unstructured log lines, logging sensitive data, excessive log volume without sampling.

**Distributed tracing.** Propagate trace IDs across service boundaries to reconstruct request paths. Essential when requests span multiple services. Use OpenTelemetry for vendor-neutral instrumentation.

**Metrics.** RED method for services (Rate, Errors, Duration). USE method for resources (Utilization, Saturation, Errors). Define SLOs before choosing what to measure.

**SLOs and SLIs.** Define service level objectives in terms of measurable indicators (latency P99 < 200ms, error rate < 0.1%). SLOs drive alerting, capacity planning, and error budgets. Without SLOs, you're guessing about reliability.

## Common Failure Modes

- **Cascading failures.** One slow service causes callers to queue up, exhausting their resources. Mitigation: timeouts, circuit breakers, bulkheads.
- **Connection pool exhaustion.** Database or HTTP connection pools fill up under load. Mitigation: pool sizing, connection timeouts, backpressure.
- **Thundering herd.** Cache expiry causes all instances to hit the backend simultaneously. Mitigation: jittered TTLs, request coalescing, cache warming.
- **Retry storms.** Clients retry failed requests, multiplying load on an already-stressed system. Mitigation: exponential backoff with jitter, retry budgets, circuit breakers.
- **Memory leaks.** Gradual memory growth leading to OOM kills.
  Mitigation: memory limits, health checks, regular restarts (if you can't find the leak).
- **Dependency failures.** External services go down. Mitigation: timeouts, fallbacks, graceful degradation, feature flags.

## Anti-Patterns

- **Distributed monolith.** Microservices that must deploy together, share databases, or make synchronous calls in long chains. You got the complexity of distribution without the benefits.
- **God service.** One service that does everything. Split by domain boundary, not by arbitrary size targets.
- **Chatty interfaces.** Many small API calls where one well-designed call would do. Increases latency, error surface, and complexity.
- **Shared mutable state.** Multiple services writing to the same database tables. Define ownership or accept the coupling.
- **Premature microservices.** Splitting into services before understanding domain boundaries. Start with a well-structured monolith; extract services when you have evidence they need independent scaling or deployment.
- **Ignoring cold starts.** Assuming services are always warm. New deployments, autoscaling events, and restarts all serve cold traffic.
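
As a concrete reference for the retry-storm mitigation listed under Common Failure Modes, the sketch below shows one common form of exponential backoff with jitter ("full jitter", where each delay is drawn uniformly between zero and an exponentially growing cap). The function names, default values, and the choice of retryable exceptions are all assumptions made for illustration; retry budgets and circuit breakers are not shown.

```python
import random
import time


def backoff_delay(attempt, base=0.1, cap=5.0, rng=random.random):
    """Full-jitter delay for a zero-based attempt number.

    The upper bound doubles each attempt (base, 2*base, 4*base, ...) up to
    `cap`; the actual delay is a random fraction of that bound so clients
    that failed together do not retry together.
    """
    return rng() * min(cap, base * 2 ** attempt)


def call_with_retries(
    operation,
    max_attempts=4,
    retryable=(ConnectionError, TimeoutError),  # assumption: what counts as transient
    sleep=time.sleep,
    rng=random.random,
):
    """Call `operation` up to `max_attempts` times, backing off between failures."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except retryable:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            sleep(backoff_delay(attempt, rng=rng))
```

The `sleep` and `rng` parameters exist only so the behavior can be exercised deterministically in tests. Capping `max_attempts` matters as much as the jitter: unbounded retries are what turn a brief outage into sustained extra load.
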