---
name: system-type-web-service
description: "Domain patterns for web service architecture — API design (REST/GraphQL/gRPC), scaling, data layer, observability, failure modes, and anti-patterns. Use when designing or evaluating a web service, API, or request/response system."
---

# System Type: Web Service

Patterns, failure modes, and anti-patterns for request/response web services.

---

## API Patterns

### REST
**When to use.** Public APIs, browser-facing services, CRUD-heavy domains, when discoverability and cacheability matter.
**When to avoid.** Highly relational data with many nested queries (N+1 fetches). Real-time bidirectional communication. High-throughput internal service-to-service calls where payload efficiency matters.
**Key decisions.** Resource naming, versioning strategy (URL vs header), pagination approach, error format.
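
For the pagination decision, an opaque cursor is one common option. The sketch below assumes an in-memory list already sorted by ascending `id`; `encode_cursor`, `decode_cursor`, and `list_page` are hypothetical names chosen for illustration, not part of any framework.

```python
import base64
import json
from typing import Optional


def encode_cursor(last_id: int) -> str:
    """Pack the last-seen id into an opaque, URL-safe token."""
    raw = json.dumps({"after": last_id}).encode("utf-8")
    return base64.urlsafe_b64encode(raw).decode("ascii")


def decode_cursor(cursor: str) -> int:
    """Recover the last-seen id from a token produced by encode_cursor."""
    raw = base64.urlsafe_b64decode(cursor.encode("ascii"))
    return int(json.loads(raw)["after"])


def list_page(items: list, limit: int, cursor: Optional[str] = None) -> dict:
    """Return up to `limit` items whose id is greater than the cursor position.

    Assumes `items` is sorted by ascending "id".
    """
    after = decode_cursor(cursor) if cursor else None
    remaining = [i for i in items if after is None or i["id"] > after]
    page = remaining[:limit]
    has_more = len(remaining) > limit
    # Only hand out a cursor when another page actually exists.
    next_cursor = encode_cursor(page[-1]["id"]) if page and has_more else None
    return {"data": page, "next_cursor": next_cursor}
```

Keeping the cursor opaque lets the server change the underlying sort key later without breaking clients, which offset-based pagination does not allow.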

### GraphQL
**When to use.** Multiple client types needing different data shapes from the same backend. Complex, nested data relationships. When frontend teams need to iterate independently from backend.
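**When to avoid.** Simple CRUD APIs. Server-to-server communication. When caching at the HTTP layer is important (GraphQL queries are usually sent as POST requests to a single endpoint, which bypasses standard HTTP caching unless you add something like persisted queries over GET). When the team doesn't have GraphQL operational expertise.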
**Key decisions.** Schema-first vs code-first, query complexity limits, N+1 resolution strategy (DataLoader pattern), authorization model.
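
The DataLoader pattern named above can be sketched in a few lines of asyncio. This is a simplified illustration, not the reference DataLoader implementation: `BatchLoader` and its `batch_fn` parameter are hypothetical, and `batch_fn` is assumed to return one value per key, in the same order as the keys it receives.

```python
import asyncio
from typing import Awaitable, Callable


class BatchLoader:
    """Coalesce load(key) calls made in the same event-loop tick into one batch."""

    def __init__(self, batch_fn: Callable[[list], Awaitable[list]]):
        self._batch_fn = batch_fn
        self._pending: dict = {}  # key -> Future for the batch being collected
        self._scheduled = False
        self._task = None

    def load(self, key) -> "asyncio.Future":
        loop = asyncio.get_running_loop()
        if key not in self._pending:
            self._pending[key] = loop.create_future()
        if not self._scheduled:
            # Defer the fetch so other resolvers in this tick can add keys.
            self._scheduled = True
            loop.call_soon(self._start_dispatch)
        return self._pending[key]

    def _start_dispatch(self) -> None:
        self._task = asyncio.get_running_loop().create_task(self._dispatch())

    async def _dispatch(self) -> None:
        batch, self._pending = self._pending, {}
        self._scheduled = False
        keys = list(batch)
        try:
            values = await self._batch_fn(keys)  # one round trip for all keys
            for key, value in zip(keys, values):
                batch[key].set_result(value)
        except Exception as exc:
            for fut in batch.values():
                if not fut.done():
                    fut.set_exception(exc)
```

Each resolver awaits `loader.load(user_id)` as if it were a single fetch; the loader turns N such calls into one batched query. A loader like this would typically be created per request so that cached futures do not leak across users.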

### gRPC
**When to use.** Internal service-to-service communication. When payload size and serialization speed matter. When you want strongly-typed contracts with code generation. Streaming use cases.
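**When to avoid.** Browser clients (browsers cannot speak native gRPC, so you need gRPC-Web and a translating proxy). When human readability of requests matters for debugging. When the team lacks protobuf experience. Public APIs (third-party consumers are less likely to have gRPC tooling than REST/JSON tooling).
**Key decisions.** Proto file organization, backward compatibility discipline, deadline propagation, load balancing (HTTP/2 multiplexes many requests over one long-lived connection, so an L4 balancer pins all of a client's traffic to one backend; per-request balancing needs an L7 proxy or client-side load balancing).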
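
Deadline propagation means every hop works against the same absolute deadline instead of starting a fresh timeout. The sketch below is library-agnostic and illustrative only: `Deadline`, `timeout_for_call`, and `reserve_s` are hypothetical names, and the returned value is what you would pass as the per-call timeout of a downstream RPC.

```python
import time


class DeadlineExceeded(Exception):
    """Raised when no time budget is left for another downstream call."""


class Deadline:
    """Absolute deadline shared by every hop of one request."""

    def __init__(self, timeout_s: float, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + timeout_s

    def remaining(self) -> float:
        """Seconds left before the deadline, never negative."""
        return max(0.0, self._expires_at - self._clock())

    def timeout_for_call(self, reserve_s: float = 0.0) -> float:
        """Timeout to hand to a downstream call, keeping reserve_s for local work."""
        budget = self.remaining() - reserve_s
        if budget <= 0:
            raise DeadlineExceeded("no time budget left for downstream call")
        return budget
```

Failing fast once the budget is gone avoids doing work whose result the original caller has already stopped waiting for.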

## Scaling Patterns

**Horizontal scaling.** Add more instances behind a load balancer. Requires stateless services (or externalized state). The default approach for web services. Watch for: session affinity requirements, connection pool exhaustion at the database, cache consistency across instances.

**Vertical scaling.** Bigger machines. Simpler than horizontal but has a ceiling. Right for: databases, in-memory workloads, and when horizontal scaling's coordination cost exceeds the performance benefit.

**Autoscaling.** Scale instance count based on metrics (CPU, request rate, queue depth). Essential for variable load. Watch for: cold start latency, scaling lag, minimum instance counts for availability, cost runaway from misconfigured scaling policies.
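
A proportional scaling rule, similar in spirit to what target-tracking autoscalers apply, can be sketched as follows. The function name, parameters, and the 10% tolerance default are assumptions for illustration; real autoscalers add stabilization windows and cooldowns on top.

```python
import math


def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int, max_replicas: int,
                     tolerance: float = 0.1) -> int:
    """Scale instance count in proportion to how far the metric is from target."""
    ratio = metric / target
    if abs(ratio - 1.0) <= tolerance:
        wanted = current  # close enough to target: avoid flapping
    else:
        wanted = math.ceil(current * ratio)
    # The floor protects availability; the ceiling protects against cost runaway.
    return max(min_replicas, min(max_replicas, wanted))
```

With 4 instances at 90% CPU against a 60% target, `desired_replicas(4, 90, 60, 2, 10)` asks for 6. The `max_replicas` bound is what stops a misconfigured policy or a bad metric from scaling without limit.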

**CDN and edge caching.** Serve static and cacheable dynamic content from edge locations. Dramatically reduces latency and origin load. Watch for: cache invalidation complexity, cache poisoning, TTL tuning, and responses that vary by request header (`Accept-Language`, `Authorization`), which must be keyed correctly or excluded from shared caches.
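
As a rough illustration of the header concerns above, the helper below picks response headers for edge caching. `cache_headers` and its parameters are hypothetical; the `Cache-Control` directives and the `Vary` header are standard HTTP, but the specific policy (never sharing personalized responses) is an assumption you would adapt.

```python
def cache_headers(*, personalized: bool, max_age_s: int = 0,
                  edge_max_age_s: int = 0, vary_on: tuple = ()) -> dict:
    """Choose caching headers for a response served through a CDN."""
    if personalized:
        # Per-user responses must never be stored in a shared (edge) cache.
        return {"Cache-Control": "private, no-store"}
    headers = {
        # max-age applies to browsers; s-maxage applies to shared caches.
        "Cache-Control": f"public, max-age={max_age_s}, s-maxage={edge_max_age_s}",
    }
    if vary_on:
        # Tell caches which request headers change the response body.
        headers["Vary"] = ", ".join(vary_on)
    return headers
```

Keeping the browser TTL shorter than the edge TTL is one way to limit how long clients hold stale content after you purge the CDN.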

**Read replicas.** Offload read traffic from the primary database. Watch for: replication lag causing stale reads, read-after-write consistency requirements, connection routing complexity.
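
One simple way to get read-after-write consistency is to pin a session to the primary for a short window after it writes. The sketch below assumes replication lag stays below `max_lag_s`; `ReplicaRouter` and its methods are hypothetical names, and a production system might compare replication positions instead of wall-clock time.

```python
import time


class ReplicaRouter:
    """Route reads to a replica unless the session wrote recently."""

    def __init__(self, max_lag_s: float, clock=time.monotonic):
        self._max_lag_s = max_lag_s
        self._clock = clock
        self._last_write: dict = {}  # session_id -> time of last write

    def record_write(self, session_id: str) -> None:
        self._last_write[session_id] = self._clock()

    def target_for_read(self, session_id: str) -> str:
        last = self._last_write.get(session_id)
        if last is not None and self._clock() - last < self._max_lag_s:
            return "primary"  # the replica may not have this session's write yet
        return "replica"
```

Sessions that have not written recently still read from replicas, so the primary only absorbs the reads that actually need fresh data.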

## Data Layer Patterns

**RDBMS (PostgreSQL, MySQL).** Default choice for structured, relational data. Strong consistency, ACID transactions, mature tooling. Scales vertically well; horizontal scaling means read replicas for read traffic (easier) or sharding for write traffic (hard).

**Document stores (MongoDB, DynamoDB).** When data is naturally document-shaped, schema varies per record, or you want horizontal scaling where the store handles partitioning for you instead of application-managed sharding. Watch for: lack of joins, transaction limitations across documents, poorly chosen partition keys causing hot spots, query patterns that don't match the data model.

**Key-value stores (Redis, Memcached).** Caching, session storage, rate limiting, leaderboards. Extremely fast for simple access patterns. Watch for: data loss on restart (Memcached is purely in-memory; Redis persists only if configured to), memory limits and eviction policy, treating a cache as the primary datastore.
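
The cache-versus-datastore distinction is easiest to see in a cache-aside read path, where the cache can be emptied at any time without losing data. In this sketch a plain dict stands in for Redis or Memcached, and `CacheAside`, `load`, and the jitter parameter are illustrative assumptions.

```python
import random
import time
from typing import Callable


class CacheAside:
    """Read-through helper: the cache is disposable, `load` is the source of truth."""

    def __init__(self, base_ttl_s: float, jitter: float = 0.1,
                 clock=time.monotonic, rng=random.random):
        self._store: dict = {}  # key -> (expires_at, value); stand-in for a real cache
        self._base_ttl_s = base_ttl_s
        self._jitter = jitter
        self._clock = clock
        self._rng = rng

    def get(self, key, load: Callable[[], object]):
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit
        value = load()  # miss or expired: go to the primary datastore
        # Spread expiries by +/- jitter so keys written together do not expire together.
        ttl = self._base_ttl_s * (1 + self._jitter * (2 * self._rng() - 1))
        self._store[key] = (now + ttl, value)
        return value
```

If wiping `_store` would lose information you cannot rebuild from `load`, the cache has become a primary datastore and needs persistence and backup to match.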

**Search engines (Elasticsearch, OpenSearch).** Full-text search, log aggregation, analytics on semi-structured data. Watch for: operational complexity, eventual consistency, write amplification, cluster sizing that's hard to change later.

## Observability

**Structured logging.** JSON logs with consistent fields (request_id, user_id, service, timestamp, level). Enable correlation across services. Avoid: unstructured log lines, logging sensitive data, excessive log volume without sampling.
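
A minimal sketch of JSON logging with the standard library, using the field names listed above. `JsonFormatter` is a hypothetical class, and `request_id` and `user_id` are assumed to arrive through the `extra` argument of the logging call.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object with a fixed set of fields."""

    def __init__(self, service: str):
        super().__init__()
        self._service = service

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "service": self._service,
            "message": record.getMessage(),
            # Set via logger.info(..., extra={"request_id": ...}); None if absent.
            "request_id": getattr(record, "request_id", None),
            "user_id": getattr(record, "user_id", None),
        }
        return json.dumps(payload)
```

Attach it with `handler.setFormatter(JsonFormatter("orders"))` and log with `logger.info("order created", extra={"request_id": rid})`. Emitting the same keys on every line, even when null, keeps downstream queries simple.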

**Distributed tracing.** Propagate trace IDs across service boundaries to reconstruct request paths. Essential when requests span multiple services. Use OpenTelemetry for vendor-neutral instrumentation.

**Metrics.** RED method for services (Rate, Errors, Duration). USE method for resources (Utilization, Saturation, Errors). Define SLOs before choosing what to measure.

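**SLOs and SLIs.** Define service level objectives (SLOs) in terms of measurable service level indicators (SLIs), each with a threshold and a measurement window, for example p99 latency under 200 ms and an error rate under 0.1% over a rolling 30-day window. SLOs drive alerting, capacity planning, and error budgets. Without SLOs, you're guessing about reliability.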
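
The error budget that an SLO implies is simple arithmetic. The sketch below assumes a request-based availability SLO; `error_budget` and its return keys are hypothetical names.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Compute how much of an availability error budget has been used.

    slo_target is a fraction, e.g. 0.999 for "99.9% of requests succeed".
    """
    allowed = (1.0 - slo_target) * total_requests
    consumed = failed_requests / allowed if allowed > 0 else float("inf")
    return {
        "allowed_failures": allowed,
        "remaining_failures": allowed - failed_requests,
        "consumed_fraction": consumed,
    }
```

For a 99.9% target over 1,000,000 requests the budget is about 1,000 failures, so 250 failures consume roughly a quarter of it. A consumed fraction approaching 1.0 is a signal to slow releases and spend time on reliability.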

## Common Failure Modes

- **Cascading failures.** One slow service causes callers to queue up, exhausting their resources. Mitigation: timeouts, circuit breakers, bulkheads.
- **Connection pool exhaustion.** Database or HTTP connection pools fill up under load. Mitigation: pool sizing, connection timeouts, backpressure.
- **Thundering herd.** Cache expiry causes all instances to hit the backend simultaneously. Mitigation: jittered TTLs, request coalescing, cache warming.
- **Retry storms.** Clients retry failed requests, multiplying load on an already-stressed system. Mitigation: exponential backoff with jitter, retry budgets, circuit breakers.
- **Memory leaks.** Gradual memory growth leading to OOM kills. Mitigation: memory limits, health checks, regular restarts (if you can't find the leak).
- **Dependency failures.** External services go down. Mitigation: timeouts, fallbacks, graceful degradation, feature flags.
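
Circuit breakers appear in several of the mitigations above. The following is a minimal single-threaded sketch, assuming consecutive failures as the trip condition; `CircuitBreaker`, `failure_threshold`, and `reset_after_s` are illustrative names, and a production breaker would add locking and per-dependency metrics.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without trying it."""


class CircuitBreaker:
    """Fail fast after repeated failures, then probe again after a cool-down."""

    def __init__(self, failure_threshold: int, reset_after_s: float,
                 clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_after_s = reset_after_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_after_s:
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: half-open, let this one trial call through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._opened_at is not None or self._failures >= self._threshold:
                # Trip the breaker, or re-open it after a failed trial call.
                self._opened_at = self._clock()
            raise
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

Wrapping each downstream dependency in its own breaker keeps one slow service from tying up the caller's threads and connections, which is what turns a local slowdown into a cascading failure.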

## Anti-Patterns

- **Distributed monolith.** Microservices that must deploy together, share databases, or make synchronous calls in long chains. You get the complexity of distribution without the benefits.
- **God service.** One service that does everything. Split by domain boundary, not by arbitrary size targets.
- **Chatty interfaces.** Many small API calls where one well-designed call would do. Increases latency, error surface, and complexity.
- **Shared mutable state.** Multiple services writing to the same database tables. Define ownership or accept the coupling.
- **Premature microservices.** Splitting into services before understanding domain boundaries. Start with a well-structured monolith; extract services when you have evidence they need independent scaling or deployment.
- **Ignoring cold starts.** Assuming services are always warm. New deployments, autoscaling events, and restarts all serve cold traffic.