# Roadmap

What's next, in order, with the reasoning. Background and citations live in
[RESEARCH-NOTES.md](RESEARCH-NOTES.md) (a survey of Litestream/LTX, LiteFS,
SlateDB, Neon, Cloudflare D1, wal-g/pgBackRest, and 2024–26 object-store
primitives). Status of what's already built: [../STATUS.md](../STATUS.md).

## Shipped in v1 (battle-tested)

Incremental WAL shipping (LSN ranges, ~134ms strict commits), snapshots as
compaction + rolling backup, durability modes (`strict`/`interval`/`sleep`),
group-commit pacing + 429 retry from a per-provider cost model, fence-stamped
lease takeovers, fencing-token object keys, generation-per-writer-life,
full-segment-size restore padding, read replicas (`ZeroPGReplica`), branching,
GC. Eleven-plus live-fire bugs found and regression-tested (STATUS.md).

## v2 — robustness + cost, no architecture change

1. **Numbered immutable manifests** (`manifest/00000000000042.json`,
   create-if-absent) replacing the single CAS-swapped `manifest.json`.
   Motivation is hard: GCS caps mutations per object NAME at ~1/s (measured:
   52% rejections beyond it), so one manifest name caps commit rate. Numbered
   manifests turn every commit into a create-if-absent on a fresh name —
   no per-name cap, free commit history, and point-in-time restore falls out.
   A small `current.json` hint (or list-last) locates the head. This is the
   SlateDB/Delta-Lake commit-log shape.
2. **Writer-epoch in object names + halt-on-first-fence** (SlateDB's formally
   verified recipe — we have 90% of it via fencing tokens; make a fenced
   writer permanently halt rather than serve 423s until restart).
3. **LTX-style checksum chain**: each segment records `preChecksum` /
   `postChecksum` of the database state; `post[i] == pre[i+1]` proves the
   chain end-to-end. Litestream v0.5 dropped generations for exactly this —
   it would let us re-promote cross-life chaining (the full-segment padding
   fix likely already made it sound; needs E4-grade proof).
4. **GCS `compose` segment folding** (32:1 server-side, one Class A op, zero
   instance CPU) + **restore-budget-driven compaction** replacing fixed
   thresholds. Constants are measured (E0/E2c/E3).
5. **Deferred deletion window** (keep superseded snapshots/segments N days):
   30-day PITR + zero-copy branches via manifest pinning — D1 "Time Travel"
   for the price of storage.
6. **S3 + R2 transports** with a CAS conformance suite (R2 has shipped real
   conditional-write bugs; test the primitive, not the docs).

## v3 — latency + scale levers

7. **Output gates** (Cloudflare DO trick): commit locally, keep executing,
   hold client responses until the bucket confirms — hides the ~100-200ms
   commit behind concurrency with zero durability loss. Pairs with
   `await_durable: false` per-query opt-out.
8. **Appendable WAL-tail tier**: GCS Rapid Storage (zonal, sub-ms appends) or
   S3 Express One Zone append for sub-10ms strict commits, with the regional
   manifest staying the commit point. Driver variant behind `CostModel`.
9. **Lazy page-faulting restore** (Neon GetPage / turbopuffer style): cold
   start O(working set) instead of O(database size) via a real PGlite VFS
   that faults pages from the bucket. The biggest cold-start lever for
   multi-GB databases; a research project, not a feature ticket.
10. **Replica WAL tailing**: replicas currently re-materialize on refresh;
    teach them to apply new segments by restarting recovery on the existing
    scratch dir (cheap for small deltas) before reaching for the VFS.

## Non-goals (still)

Multi-writer, cross-region active-active, databases larger than instance
memory (until #9), running your checkout path on this.