# Materialized views

Pre-computed views stored as tables — refresh on a schedule, query as if free. This page covers the per-dialect support (Postgres native + CONCURRENTLY, MSSQL indexed views, Mongo `$merge`/`$out`, emulation for MySQL/SQLite), refresh strategies, the freshness window trade-off, and the patterns for dashboards and pre-aggregated reports.

The [`Materialised views`](./VIEWS.md#materialised-views) section in
`VIEWS.md` is the surface — the one-line `.asView({ materialised: true
})` API and the per-dialect refresh table. This file goes one level
down: when a matview earns the storage versus a plain view, a cache, or
a read replica; per-dialect emit and refresh semantics; the staleness
window trade-off; and the patterns for dashboards, leaderboards, and
analytical pre-aggregations.

Related deep-dives:

* [VIEWS.md](./VIEWS.md) — plain views, the `.asView()` API, and the read-only wrapper.
* [AGGREGATIONS.md](./AGGREGATIONS.md) — `groupBy`, `aggregate`, and the live-query equivalents matviews replace.
* [MIGRATIONS.md](./MIGRATIONS.md) — how matview DDL flows through `push` and `diff`.
* [CACHING.md](./CACHING.md) — query-result caching, the in-process alternative when freshness must be tighter.
* [POSTGRES.md](./POSTGRES.md) — `REFRESH MATERIALIZED VIEW CONCURRENTLY`, `pg_cron`, and `pg_ivm`.
* [MONGO.md](./MONGO.md) — `$out` and `$merge` aggregation outputs.

---

## Contents

* [What a materialized view is](#what-a-materialized-view-is)
* [When a matview earns the storage](#when-a-matview-earns-the-storage)
* [Per-dialect support](#per-dialect-support)
  * [Postgres](#postgres)
  * [MySQL](#mysql)
  * [SQLite](#sqlite)
  * [DuckDB](#duckdb)
  * [MSSQL](#mssql)
  * [Mongo](#mongo)
* [forge declaration — `materialised: true`](#forge-declaration--materialised-true)
* [Refresh strategies](#refresh-strategies)
* [Refresh scheduling](#refresh-scheduling)
* [Incremental matviews](#incremental-matviews)
* [Indexing a matview](#indexing-a-matview)
* [Permissions](#permissions)
* [Cost analysis — matview vs live view vs cache vs replica](#cost-analysis--matview-vs-live-view-vs-cache-vs-replica)
* [Staleness window — bound it in the product spec](#staleness-window--bound-it-in-the-product-spec)
* [Drift detection — what `forge diff` sees](#drift-detection--what-forge-diff-sees)
* [Use cases that earn a matview](#use-cases-that-earn-a-matview)
* [Worked examples](#worked-examples)
* [See also](#see-also)

---

## What a materialized view is

A materialised view stores the result of its SELECT physically on disk
and serves reads from the stored copy. It looks like a table at read
time — `forge` issues `SELECT * FROM post_stats` and the engine reads
rows from the matview's heap, not from `posts`. The catch is the stored
rows go stale the moment the source tables change; refreshing the
matview is an explicit operation (manual call, scheduled job, or
engine-maintained on the dialects that support it).

The shape that distinguishes it from a plain view:

* **Plain view** — a saved SELECT. Every query re-executes the body
  against the base tables. Storage cost: zero. Read cost: cost of the
  underlying join/aggregation.
* **Materialised view** — a saved result. Reads are a table scan plus
  whatever indexes you put on the matview. Storage cost: size of the
  result set. Write cost: zero at base-table write time on most
  dialects (the matview lags), or proportional to the change on
  engine-maintained matviews (MSSQL indexed views, `pg_ivm`).

The trade-off is freshness for read speed. A live aggregation that
takes 1.2s per query becomes a 12ms scan against a matview that
refreshes every 5 minutes — at the price of up to 5 minutes of
staleness.

What a matview is **not**:

* Not a cache. A cache evicts on TTL; a matview is authoritative until
  the next refresh.
* Not a snapshot. The data inside a matview is whatever the SELECT
  returned at refresh time, not a point-in-time copy of the source
  tables.
* Not a replication target. The matview lives in the same database as
  its source; cross-database replication is a different problem (see
  [DEPLOYMENT.md](./DEPLOYMENT.md)).

---

## When a matview earns the storage

Reach for a matview when all three are true:

1. **The query is expensive to compute live.** A multi-table aggregate
   that scans millions of rows; a recursive CTE traversing a tree; a
   window function over a year of events. If `EXPLAIN ANALYZE` shows
   the live query in tens or hundreds of milliseconds and the page hits
   it on every load, the storage trade is worth weighing.
2. **The result tolerates staleness.** Dashboards refreshed every five
   minutes. Leaderboards that update on the hour. Daily rollups that
   need yesterday's data, not the last second's. If the product spec
   demands sub-second freshness, a matview is the wrong primitive —
   reach for a cache layer or read-side denormalisation on write.
3. **The read rate beats the refresh rate.** A matview refreshed every
   minute that's queried twice an hour is wasted compute; the live
   query was cheaper. The matview earns when reads per refresh
   interval times their individual cost exceeds the cost of one
   refresh.

If only (1) is true, look at indexes on the base tables first; the
matview is treating the symptom. If only (2) is true, you don't need
the matview — just write the query. If only (3) is true, you don't
have a cost problem worth solving.

---

## Per-dialect support

`forge push` emits matview DDL in the same view pass as plain views,
after table DDL. Each adapter has its own shape.

### Postgres

The native one. `CREATE MATERIALIZED VIEW` stores the SELECT result as
a table; `REFRESH MATERIALIZED VIEW` recomputes it.

```sql
CREATE MATERIALIZED VIEW "post_stats" AS
  SELECT author_id,
         COUNT(*) AS post_count,
         COALESCE(SUM(view_count), 0) AS total_views
  FROM posts
  GROUP BY author_id
WITH DATA;
```

`WITH DATA` (the default) runs the SELECT immediately and populates
the matview on creation. `WITH NO DATA` creates an empty matview that
must be refreshed before it can be queried; useful when you want the
matview to exist but defer the first compute to a cron job.

**Standard refresh:**

```sql
REFRESH MATERIALIZED VIEW "post_stats";
```

Acquires an `ACCESS EXCLUSIVE` lock on the matview for the duration of
the refresh. Reads block. For a small matview this is fine; for a
matview a 30-second refresh takes, reads stall for 30 seconds.

**Concurrent refresh:**

```sql
REFRESH MATERIALIZED VIEW CONCURRENTLY "post_stats";
```

Builds a new version alongside the old one and swaps. Reads keep
hitting the old version throughout. **Requires a unique index** on the
matview — Postgres uses it to diff old against new and emit row-level
changes. Without a unique index, the statement errors:

```
ERROR: cannot refresh materialized view "post_stats" concurrently
HINT:  Create a unique index with no WHERE clause on one or more columns of the materialized view.
```

The unique index has to cover columns that uniquely identify a result
row — typically the grouping keys (`author_id` in the example above,
or `(author_id, day)` for a daily rollup). Forge does not auto-emit
this index; declare it via `.indexes` on the model:

```ts
const PostStats = model('post_stats', {
  author_id:   f.objectId(),
  post_count:  f.bigint(),
  total_views: f.bigint(),
})
  .asView({
    materialised: true,
    sql: `SELECT author_id, COUNT(*) AS post_count, …`,
  })
  .indexes(t => [
    t.unique(['author_id']),    // required for REFRESH CONCURRENTLY
  ]);
```

`forge push` emits a `CREATE UNIQUE INDEX` on the matview alongside
the matview itself.

Concurrent refresh is slower than non-concurrent (it has to build the
new copy plus diff plus apply), and writes locks during the swap. The
trade is read availability for refresh time.

Source: `src/adapters/postgres/ddl.ts` — `CREATE MATERIALIZED VIEW` emit
and the `refreshView` implementation that dispatches on
`opts.concurrently`.

### MySQL

No native matviews. Forge emulates with a regular table populated from
the SELECT body:

```sql
CREATE TABLE IF NOT EXISTS post_stats AS
  SELECT author_id, COUNT(*) AS post_count, …
  FROM posts
  GROUP BY author_id;
```

`refresh()` runs as a transaction:

```sql
START TRANSACTION;
DELETE FROM post_stats;
INSERT INTO post_stats SELECT …;
COMMIT;
```

Single transaction so readers either see the old copy or the new copy,
not a half-truncated one. The transaction holds row locks for the
duration of the delete + insert; long refreshes serialise against
writes to the matview backing table itself, but reads are not blocked
(InnoDB MVCC).

No CONCURRENTLY equivalent. The pattern when read availability has to
hold through refresh is **double-buffered tables**: write the new copy
into `post_stats_next`, then `RENAME TABLE post_stats TO
post_stats_old, post_stats_next TO post_stats` (atomic across both),
drop `post_stats_old`. Forge does not emit this pattern itself; write
it as a custom refresh function and override the model's `refresh`.

### SQLite

Same approach as MySQL — a regular table populated from the SELECT,
refreshed inside a transaction:

```sql
BEGIN;
DELETE FROM post_stats;
INSERT INTO post_stats SELECT …;
COMMIT;
```

SQLite serialises writers but readers can proceed against the matview
during the refresh (in WAL mode). The matview is registered as a table
in `sqlite_master`; `forge diff` accepts it under either `tables` or
`views`.

No CONCURRENTLY equivalent. The single-writer model means a long
refresh blocks other writes to the database for the duration; size
matviews accordingly.

### DuckDB

DuckDB views are read-only at the engine level, and there is no native
`MATERIALIZED VIEW`. Forge emulates with `CREATE TABLE AS SELECT`, the
same shape as the MySQL/SQLite emulation:

```sql
CREATE TABLE IF NOT EXISTS post_stats AS
  SELECT author_id, COUNT(*) AS post_count, …
  FROM posts
  GROUP BY author_id;
```

`refresh()` does a `DROP TABLE` + `CREATE TABLE AS SELECT` rather than
DELETE + INSERT — DuckDB's bulk-load path is faster than transactional
row writes for analytical workloads, which is the whole point of using
DuckDB. The trade is a brief window where the matview doesn't exist
between the drop and the create; the operation runs inside a
transaction so concurrent reads either see the old version or the new,
never an absent one.

### MSSQL

The distinguishing feature: **indexed views**. A view declared with
`WITH SCHEMABINDING` and a unique clustered index becomes engine-
maintained — the storage layer keeps the materialised state in sync
transactionally as the base tables change. No refresh call. No
staleness window. Writes to the source tables pay the cost of
maintaining the index.

```sql
CREATE VIEW [dbo].[post_stats] WITH SCHEMABINDING AS
  SELECT author_id,
         COUNT_BIG(*)         AS post_count,
         SUM(view_count)      AS total_views,
         COUNT_BIG(view_count) AS _vc_count
  FROM dbo.posts
  GROUP BY author_id;

CREATE UNIQUE CLUSTERED INDEX IX_post_stats_author
  ON [dbo].[post_stats] (author_id);
```

Restrictions on what can go in an indexed view are strict:

* `WITH SCHEMABINDING` required — the view body must use two-part names
  (`dbo.posts`, not `posts`), and the engine prevents schema changes to
  the base tables that would invalidate the view.
* No outer joins, no `UNION`, no `DISTINCT`, no `TOP`, no subqueries, no
  CTEs, no window functions.
* Aggregates allowed only as `SUM`, `COUNT_BIG`. `AVG` is not allowed;
  compute it as `SUM / COUNT_BIG` in a wrapping view.
* Every aggregate column needs a `COUNT_BIG` companion so the engine
  can maintain the count incrementally.

Forge does not auto-emit indexed views — the SCHEMABINDING constraints
make the DDL bespoke. The pattern is to declare the indexed view in a
raw migration SQL file, then point a regular `model('post_stats', …)`
at the view name (no `.asView()`). The model reads from the indexed
view as if it were a table; writes are blocked at the engine level
because indexed views are read-only.

The matview's `refresh()` is a no-op on MSSQL (and forge raises if you
call it) because the engine handles the maintenance.

### Mongo

Two output stages from the aggregation framework, both useful for
matview-shaped problems:

* `$out` — overwrite a target collection with the pipeline result.
  Total replace, no diff.
* `$merge` — upsert pipeline rows into a target collection, keyed on a
  `on` field. Supports `whenMatched: 'merge' | 'replace' | 'keepExisting'
  | 'fail' | 'pipeline'` and `whenNotMatched: 'insert' | 'discard' |
  'fail'`.

```ts
await db.posts.aggregate([
  { $match: { status: 'PUBLISHED' } },
  { $group: {
      _id: '$author_id',
      post_count: { $sum: 1 },
      total_views: { $sum: '$view_count' },
  }},
  { $merge: {
      into: 'post_stats',
      on: '_id',
      whenMatched: 'replace',
      whenNotMatched: 'insert',
  }},
]);
```

`$merge` is the better default — `$out` drops every read against the
matview during the swap, while `$merge` updates row-by-row and leaves
unchanged rows alone. For partial refreshes (only this hour's
authors), `$merge` lets the pipeline filter on the source side and
only touch the matching keys.

Forge appends `{ $out: collection }` automatically when you declare
`materialised: true` on a Mongo view and don't supply your own
terminal stage. To use `$merge` instead, end your pipeline with it
explicitly:

```ts
const PostStats = model('post_stats', { … }).asView({
  materialised: true,
  sourceCollection: 'posts',
  pipeline: [
    { $match: { status: 'PUBLISHED' } },
    { $group: { _id: '$author_id', post_count: { $sum: 1 } } },
    { $merge: { into: 'post_stats', on: '_id', whenMatched: 'replace', whenNotMatched: 'insert' } },
  ],
});
```

Forge sees the terminal `$merge` and skips its own `$out` append.

---

## forge declaration — `materialised: true`

The same `.asView()` surface as plain views, with `materialised: true`
flipping forge into matview mode:

```ts
import { f, model } from 'forge-orm';

const PostStats = model('post_stats', {
  author_id:   f.objectId(),
  post_count:  f.bigint(),
  total_views: f.bigint(),
})
  .asView({
    materialised: true,
    sql: `
      SELECT author_id,
             COUNT(*)                       AS post_count,
             COALESCE(SUM(view_count), 0)   AS total_views
      FROM posts
      GROUP BY author_id
    `,
    refreshEvery: '5m',
  })
  .indexes(t => [
    t.unique(['author_id']),    // Postgres CONCURRENTLY refresh requirement
  ]);
```

What changes versus a plain view:

* `forge push` emits `CREATE MATERIALIZED VIEW` (Postgres) or `CREATE
  TABLE AS …` (MySQL/SQLite/DuckDB) or a one-shot pipeline with `$out`
  (Mongo) instead of `CREATE VIEW`.
* The model wrapper exposes `db.postStats.refresh(opts?)` —
  synchronous refresh on demand.
* If `refreshEvery` is set, `db.postStats.scheduleRefresh(interval)` is
  called automatically when the db is opened. Returns a `stop()`
  function. Timers are `.unref()`'d so they don't keep the process
  alive.
* The wrapper is still read-only: `create`, `update`, `delete`, etc.
  throw the same `ForgeViewWriteError`. Writes go through the source
  models.

There is no separate `f.materializedView` constructor. It's
`.asView({ materialised: true })` on a regular `model()`.

---

## Refresh strategies

Three shapes, picked by access pattern.

**Full refresh.** Wipe and recompute the entire matview. Simplest, and
the only option on most dialects. Postgres calls this the
non-concurrent refresh; MySQL/SQLite/DuckDB call it the only refresh.

```ts
await db.postStats.refresh();
```

Cost: proportional to the size of the result set. Lock window:
proportional to the refresh time on Postgres non-concurrent and
MySQL/SQLite (transaction holds the matview).

**Concurrent refresh (Postgres).** Build the new version alongside,
swap atomically. Reads keep working throughout.

```ts
await db.postStats.refresh({ concurrently: true });
```

Cost: roughly 2x the non-concurrent refresh (build new + diff + apply
delta). Requires a unique index on the matview. Lock window: brief
exclusive lock at swap time, measured in milliseconds.

**Incremental refresh (manual).** Recompute only the rows whose source
data changed since the last refresh. Forge does not have a built-in
incremental refresh primitive; the pattern is application-level:

1. Track a `_last_refreshed_at` column on the matview, or store a
   last-refresh timestamp in a side table.
2. Refresh by `INSERT … ON CONFLICT DO UPDATE` (Postgres) or `INSERT
   … ON DUPLICATE KEY UPDATE` (MySQL) keyed on the matview's unique
   columns, with a `WHERE source.updated_at > $1` clause filtering the
   source SELECT.
3. Run the partial refresh on a tighter cadence than the full refresh;
   schedule a full refresh nightly to catch deletes (which the
   `updated_at` predicate misses).

Worth it when the matview is large (millions of rows) but only a small
slice changes between refreshes. Not worth it when the result set is
small enough that a full refresh runs in under a second.

---

## Refresh scheduling

Four shapes, smallest blast radius first.

**`refreshEvery` (in-process timer).** Forge schedules a `setInterval`
that calls `refresh()`. Good for single-process services and dev. Bad
for multi-process deployments — every process refreshes, you get N
concurrent refreshes instead of one.

```ts
.asView({ materialised: true, refreshEvery: '5m', sql: … })
```

Suffixes: `s` / `m` / `h`. The timer is `.unref()`'d. Errors are
swallowed into the `$on('error')` channel so a transient DB blip
doesn't kill the process.

**External cron (BullMQ, cron, systemd).** A single scheduled job
calls `db.postStats.refresh()`. The cron orchestrator handles
single-execution semantics (BullMQ's `removeOnComplete`, `cron`'s
`startNow: false`).

```ts
queue.add('refresh-post-stats', null, {
  repeat: { every: 5 * 60 * 1000 },
  removeOnComplete: true,
});

new Worker('refresh-post-stats', async () => {
  await db.postStats.refresh({ concurrently: true });
});
```

The right choice for production. One refresh per interval across the
fleet; failure surfaces in the queue's failure handler.

**Database-side cron (`pg_cron`).** A Postgres extension that runs SQL
on a cron schedule, inside the database. No application process
needed.

```sql
SELECT cron.schedule(
  'refresh-post-stats',
  '*/5 * * * *',
  $$REFRESH MATERIALIZED VIEW CONCURRENTLY post_stats$$
);
```

Runs even when no application is connected. Status visible via
`cron.job_run_details`. The right choice when the matview's refresh
shouldn't depend on application liveness — e.g. analytics matviews
that have to refresh through deployments.

**Trigger-based.** A trigger on the source table fires the refresh on
every write. Almost always wrong — a 30-second refresh blocking
every INSERT is a denial-of-service against your own application.
The right shape if you genuinely need write-time matview maintenance
is MSSQL indexed views (engine-maintained, no triggers), Postgres
`pg_ivm` (extension, below), or a Mongo `$merge` on each write
(application-level, costed per write).

---

## Incremental matviews

`pg_ivm` (an extension for Postgres) implements **incremental view
maintenance**: matviews that the engine updates row-by-row as the
source tables change, the same way MSSQL indexed views work.

```sql
CREATE EXTENSION pg_ivm;

SELECT pgivm.create_immv(
  'post_stats',
  $$SELECT author_id, COUNT(*) AS post_count
    FROM posts
    GROUP BY author_id$$
);
```

`pgivm.create_immv` declares the matview and installs triggers on the
source tables. Every INSERT/UPDATE/DELETE on `posts` fires a delta
into `post_stats`. No `REFRESH` needed; no staleness window.

Cost: each source write pays a small overhead to update the matview.
Restrictions: a subset of SELECT is supported — most aggregations,
inner joins, outer joins (with caveats), `DISTINCT`, but not window
functions, recursive CTEs, or `OUTER JOIN` on the inner side. Read the
extension's docs before adopting.

Forge does not currently auto-emit `pg_ivm` matviews; declare them in
raw migration SQL and point a regular `model()` at the matview name.
Add the matview to `forge diff`'s ignore list so it doesn't try to
recreate it as a table.

For dialects without `pg_ivm`: incremental maintenance is application-
level. The pattern is the same as for partial refresh above — track an
`updated_at` watermark, refresh only the slice that changed. Cheaper
than a full refresh, but not as crisp as engine-maintained.

---

## Indexing a matview

A matview is a table — index it like one. Two reasons:

**1. `REFRESH CONCURRENTLY` requirement.** Postgres needs a unique
index covering the row identity columns. Without it, only blocking
refresh is available.

**2. Read performance.** A matview that wraps a `GROUP BY author_id`
aggregation will be queried by `author_id` most of the time; the
unique index doubles as the lookup index. Add additional indexes for
secondary access patterns (e.g. `ORDER BY total_views DESC` for a
top-N query needs an index on `total_views`).

Indexes on a matview are independent of indexes on the source tables.
The source indexes speed up the refresh; the matview indexes speed up
reads off the matview. Both matter.

```ts
.asView({ materialised: true, sql: … })
.indexes(t => [
  t.unique(['author_id']),                     // REFRESH CONCURRENTLY + lookup
  t.index(['total_views'], { sort: 'desc' }),  // ORDER BY total_views DESC
])
```

`forge push` emits the index DDL alongside the matview DDL. On
Postgres, `REINDEX MATERIALIZED VIEW` rebuilds the indexes (useful
after a bloat-heavy period); forge does not call it automatically.

---

## Permissions

A matview is a separate object from its source tables, with its own
GRANTs. The standard pattern:

```sql
-- The role that runs the application reads from the matview, not the
-- source tables. Reduces the blast radius if the connection leaks.
GRANT SELECT ON post_stats TO app_role;

-- The role that runs the refresh job has write on the matview only.
GRANT SELECT, INSERT, UPDATE, DELETE ON post_stats TO refresh_role;
-- Plus REFRESH MATERIALIZED VIEW privilege:
GRANT MAINTAIN ON post_stats TO refresh_role;   -- Postgres 17+
-- On earlier Postgres, MAINTAIN doesn't exist; OWNERSHIP of the
-- matview is required to call REFRESH. Set the refresh role as the
-- owner: ALTER MATERIALIZED VIEW post_stats OWNER TO refresh_role.
```

Two roles avoids the trap of granting the application role permission
to issue refreshes — a runaway request loop that triggers refreshes
will lock the matview and break read availability.

On Mongo, role separation is the same — give the application read on
the matview collection and give the refresh job write. `$merge` and
`$out` require write permission on the target collection.

See [SECURITY.md](./SECURITY.md) for the broader role pattern.

---

## Cost analysis — matview vs live view vs cache vs replica

Four primitives, four sets of trade-offs.

| Primitive | Where it lives | Freshness | Read cost | Write cost | Storage | Best for |
|---|---|---|---|---|---|---|
| Live view | Same DB | Real-time | Cost of query | None | None | Saved query / security boundary |
| Materialised view | Same DB | Refresh window | Table scan | Refresh time | Result size | Pre-aggregated dashboard |
| Engine-maintained matview | Same DB | Real-time | Table scan | Per-row overhead | Result size | Hot aggregate that must be fresh |
| Cache (Redis, in-process) | Separate process | TTL | Network round-trip | Cache write | Result size, per cache | Read-amplified key lookups |
| Read replica | Separate DB | Replication lag | Cost of query, replica load | Replication overhead | Full DB copy | OLAP queries against OLTP shape |

Decision sketch:

* If the query is hot but the staleness window is in minutes/hours,
  matview wins. The storage is small relative to the source tables;
  the refresh cost is paid once and spread across all reads.
* If the query needs sub-second freshness and you can afford the
  per-write cost, engine-maintained matview (MSSQL indexed view or
  `pg_ivm`) wins. Pay at write time, read for free.
* If the same key is read many times with a TTL of seconds to minutes,
  a cache wins. The matview's refresh window is too coarse; a cache's
  TTL matches the access pattern.
* If the workload is "run arbitrary BI queries against the same
  schema as the OLTP database", a read replica wins. Matviews
  pre-aggregate a known query; a replica supports any query the BI
  tool happens to write.

The mistake is reaching for a matview when a covering index would do.
If the live query is slow because of a missing index, the matview
ships the same slow query into the refresh and adds a freshness
window. Diagnose with `EXPLAIN` first.

---

## Staleness window — bound it in the product spec

A matview's freshness window is part of the product contract. Bake it
into the API and the UI:

* **API response.** Every matview-backed endpoint returns
  `last_refreshed_at`. Clients display "Updated 4 minutes ago" so the
  user knows they're not looking at live data.
* **Cache headers.** Set `Cache-Control: max-age=N` on the response,
  where N is the matview's refresh interval. The HTTP cache and the
  matview share a freshness contract.
* **Health check.** If the matview's `last_refreshed_at` falls behind
  by more than 2× the refresh interval, page the on-call. A stuck
  refresh job is the most common silent failure mode — the matview
  keeps returning stale data, reads succeed, no error fires.

Storing `last_refreshed_at` on the matview itself:

```ts
const PostStats = model('post_stats', {
  author_id:          f.objectId(),
  post_count:         f.bigint(),
  total_views:        f.bigint(),
  last_refreshed_at:  f.dateTime(),
}).asView({
  materialised: true,
  sql: `
    SELECT author_id,
           COUNT(*)                AS post_count,
           SUM(view_count)         AS total_views,
           NOW()                   AS last_refreshed_at
    FROM posts
    GROUP BY author_id
  `,
});
```

`NOW()` evaluates at refresh time; every row gets the same timestamp.
Useful for the cache-header pattern: read any row, take its
`last_refreshed_at`, set `Cache-Control` accordingly.

Alternative: store the last-refresh time in a side table, keyed by
matview name. Cleaner if you have many matviews and don't want to add
a column to each.

---

## Drift detection — what `forge diff` sees

Forge introspects materialised views per dialect:

* **Postgres** — `pg_matviews` returns matviews with their definition.
  Forge feeds them into `introspection.views` with `materialised: true`
  set (see `src/adapters/postgres/introspect.ts:74`). `diff` compares
  the set of matview names; a missing one is flagged.
* **MySQL** / **SQLite** / **DuckDB** — the matview backing table
  appears in the table list. Forge's diff accepts a table whose name
  matches a declared matview under either `tables` or `views`.
* **MSSQL** — view introspection is currently best-effort (`views: []`
  in the introspect output). Indexed views are the recommended shape
  on MSSQL; declare them in raw migration SQL and ignore in `diff`.
* **Mongo** — `db.listCollections()` reports the matview collection as
  a regular collection (it's a real collection populated by the
  pipeline). Forge's diff treats it as a table-like collection; the
  pipeline body is not compared.

What `diff` does **not** currently report:

* Whether the matview's SELECT body matches the schema. Forge re-emits
  the body on every `push`, so push always converges; diff doesn't
  flag "the body changed since last push".
* Whether the matview is up to date. Staleness is a runtime concern,
  not a schema concern.
* Whether the matview's indexes match what `.indexes()` declared.
  Index drift on the matview is tracked under the same code path as
  index drift on a table; `diff` reports it normally.

What `push` does on a matview body change:

* **Postgres** — `CREATE OR REPLACE MATERIALIZED VIEW` does not exist;
  body changes require `DROP MATERIALIZED VIEW` + `CREATE MATERIALIZED
  VIEW`. Forge does not auto-drop on push; it surfaces the
  `ERROR: relation "post_stats" already exists` from Postgres. You
  drop manually and re-push.
* **MySQL/SQLite/DuckDB** — same story. The matview is a backing
  table; body changes need a manual drop.
* **Mongo** — drop + recreate is automatic, the same as for plain
  views.

The right pattern for matview body changes is the same as for table
schema changes: explicit migration SQL that drops the matview, runs
the new DDL, and triggers an initial refresh.

---

## Use cases that earn a matview

* **Dashboard tiles.** A homepage that shows "Total revenue", "Active
  users this week", "Orders shipped today". The numbers come from
  aggregations over millions of rows; the page loads N times a second;
  the user accepts a 5-minute lag. Classic matview shape.
* **Leaderboards.** Top-10 contributors by score, top-100 most-played
  songs this week. The aggregation scans the full event log; the
  result is tiny (10 or 100 rows); the freshness window is hourly.
  Index the matview by score for the `ORDER BY score DESC LIMIT 10`
  query.
* **Top-N queries.** Per-tenant top-5 customers by lifetime value.
  Compute once per tenant in the matview, query by `(tenant_id, rank)`
  at read time.
* **Daily/hourly rollups.** Hourly buckets of event counts for a
  time-series chart. The matview holds one row per bucket; the live
  query would scan every event. Rebuild the last few hours on a tight
  cadence, the older buckets nightly.
* **Search facets.** Counts of products by category, brand, price
  band. The faceted-search UI hits these on every page load; the live
  query is multiple `GROUP BY`s over the product catalog.
* **Reports/exports.** A nightly CSV export pulls from a matview that
  pre-joins the underlying tables. The export is a fast table scan;
  the matview refreshes once per day before the export window.

Use cases that do **not** earn a matview:

* **The live result fits in cache.** A single-row lookup keyed on
  `(tenant_id, day)` belongs in Redis, not a matview.
* **Freshness is in seconds.** A matview's refresh window is its
  promise. Sub-second freshness means write-time denormalisation or
  engine-maintained matviews, not refresh-on-cron.
* **The query is run once per page-load and there's only one
  page-load per minute.** The live cost is fine; the matview is
  overhead.

---

## Worked examples

### (a) Postgres dashboard refreshed every 5 minutes via pg_cron

```ts
// schema.ts
import { f, model } from 'forge-orm';

export const DashboardStats = model('dashboard_stats', {
  day:                f.date(),
  org_id:             f.objectId(),
  orders_placed:      f.int(),
  revenue_cents:      f.bigint(),
  unique_customers:   f.int(),
  last_refreshed_at:  f.dateTime(),
})
  .asView({
    materialised: true,
    sql: `
      SELECT date_trunc('day', placed_at)::date AS day,
             org_id,
             COUNT(*)                            AS orders_placed,
             COALESCE(SUM(total_cents), 0)       AS revenue_cents,
             COUNT(DISTINCT customer_id)         AS unique_customers,
             NOW()                               AS last_refreshed_at
      FROM orders
      WHERE status IN ('PAID', 'SHIPPED')
        AND placed_at >= NOW() - INTERVAL '90 days'
      GROUP BY 1, 2
    `,
  })
  .indexes(t => [
    t.unique(['day', 'org_id']),                 // CONCURRENTLY refresh
    t.index(['org_id', 'day'], { sort: 'desc' }),
  ]);
```

Schedule via `pg_cron` (one-time setup, runs in the database):

```sql
CREATE EXTENSION IF NOT EXISTS pg_cron;

SELECT cron.schedule(
  'refresh-dashboard-stats',
  '*/5 * * * *',
  $$REFRESH MATERIALIZED VIEW CONCURRENTLY dashboard_stats$$
);
```

Read at request time:

```ts
const rows = await db.dashboardStats.findMany({
  where: { org_id: ctx.org, day: { gte: thirtyDaysAgo } },
  orderBy: { day: 'desc' },
});

const refreshedAt = rows[0]?.last_refreshed_at;
res.setHeader('Cache-Control', 'max-age=300');     // matches the 5-minute refresh
res.setHeader('Last-Modified', refreshedAt.toUTCString());
res.json({ rows, refreshedAt });
```

### (b) MySQL emulation via INSERT INTO … SELECT

```ts
export const PostStats = model('post_stats', {
  author_id:    f.objectId(),
  post_count:   f.bigint(),
  total_views:  f.bigint(),
}).asView({
  materialised: true,
  sql: `
    SELECT author_id,
           COUNT(*) AS post_count,
           COALESCE(SUM(view_count), 0) AS total_views
    FROM posts
    GROUP BY author_id
  `,
}).indexes(t => [
  t.unique(['author_id']),
]);
```

`forge push` emits:

```sql
CREATE TABLE IF NOT EXISTS post_stats (
  author_id   BINARY(12) NOT NULL,
  post_count  BIGINT NOT NULL,
  total_views BIGINT NOT NULL,
  UNIQUE KEY uq_post_stats_author (author_id)
);
```

The first call to `refresh()` populates the table; subsequent calls
reload it:

```ts
await db.postStats.refresh();
// Runs:
//   START TRANSACTION;
//   DELETE FROM post_stats;
//   INSERT INTO post_stats (author_id, post_count, total_views)
//     SELECT author_id, COUNT(*), COALESCE(SUM(view_count), 0)
//     FROM posts GROUP BY author_id;
//   COMMIT;
```

Schedule from a BullMQ worker:

```ts
new Worker('matview-refresh', async (job) => {
  if (job.name === 'post-stats') {
    await db.postStats.refresh();
  }
});

await queue.add('post-stats', null, {
  repeat: { every: 5 * 60 * 1000 },
  removeOnComplete: { count: 100 },
});
```

For read-availability through long refreshes, switch to the
double-buffer pattern (atomic `RENAME TABLE` swap of two tables). Out
of scope for forge's default emit; write as a custom refresh function.

### (c) Mongo `$merge` for daily rollups

```ts
export const DailySales = model('daily_sales', {
  _id:                f.string(),               // composite "YYYY-MM-DD:orgId"
  day:                f.date(),
  org_id:             f.objectId(),
  orders_placed:      f.int(),
  revenue_cents:      f.bigint(),
  last_refreshed_at:  f.dateTime(),
}).asView({
  materialised: true,
  sourceCollection: 'orders',
  pipeline: [
    { $match: { status: { $in: ['PAID', 'SHIPPED'] } } },
    { $group: {
        _id: {
          day: { $dateTrunc: { date: '$placed_at', unit: 'day' } },
          org_id: '$org_id',
        },
        orders_placed: { $sum: 1 },
        revenue_cents: { $sum: '$total_cents' },
    }},
    { $project: {
        _id: { $concat: [
          { $dateToString: { date: '$_id.day', format: '%Y-%m-%d' } },
          ':',
          { $toString: '$_id.org_id' },
        ]},
        day: '$_id.day',
        org_id: '$_id.org_id',
        orders_placed: 1,
        revenue_cents: 1,
        last_refreshed_at: '$$NOW',
    }},
    { $merge: {
        into: 'daily_sales',
        on: '_id',
        whenMatched: 'replace',
        whenNotMatched: 'insert',
    }},
  ],
});
```

The terminal `$merge` is explicit, so forge does not append its
default `$out`. The matview upserts row-by-row; reads against
`daily_sales` are never disrupted by a refresh.

Refresh by running the pipeline:

```ts
await db.dailySales.refresh();   // runs the stored pipeline against the source collection
```

For partial refresh — only today's rows — write a separate refresh
job that re-runs the pipeline with a `$match` on `placed_at >= startOfToday`:

```ts
await db.orders.aggregate([
  { $match: {
      status: { $in: ['PAID', 'SHIPPED'] },
      placed_at: { $gte: startOfToday },
  }},
  // …rest of the pipeline, ending with $merge…
]);
```

`$merge` only touches the rows the pipeline emits, so partial refresh
costs proportional to the changed slice.

### (d) MSSQL indexed view with SCHEMABINDING

The pattern: declare the indexed view in raw migration SQL, point a
regular forge `model()` at it, exclude it from `forge diff`.

**Migration SQL** (`migrations/2026-06-24_post_stats_indexed.sql`):

```sql
CREATE VIEW [dbo].[post_stats] WITH SCHEMABINDING AS
  SELECT author_id,
         COUNT_BIG(*)         AS post_count,
         SUM(view_count)      AS total_views,
         COUNT_BIG(view_count) AS _vc_count    -- required so SUM is maintainable
  FROM dbo.posts
  GROUP BY author_id;

CREATE UNIQUE CLUSTERED INDEX IX_post_stats_author
  ON [dbo].[post_stats] (author_id);
```

**Forge model**:

```ts
export const PostStats = model('post_stats', {
  author_id:   f.objectId(),
  post_count:  f.bigint(),
  total_views: f.bigint(),
});
```

No `.asView()` — the engine is maintaining the view; forge reads from
it as if it were a table. The `_vc_count` companion column doesn't
appear in the model because reads don't need it.

**Exclude from diff**:

```bash
forge diff --ignore=post_stats
```

Otherwise `diff` reports `post_stats` as a missing table (the model
declares it, but introspection won't see it as a table — it's a view
with a clustered index).

**Reads**:

```ts
const stats = await db.postStats.findMany({
  where: { author_id: { in: authorIds } },
});
```

The query planner uses the indexed view automatically when it covers
the query. No `refresh()` call — the engine maintained the index
transactionally as `posts` was written.

The trade: every INSERT/UPDATE/DELETE on `posts` pays a small overhead
to maintain the indexed view. For a hot OLTP table, profile under
write load before adopting; for a cold-write/hot-read table, it's the
right primitive.

---

## See also

* [VIEWS.md](./VIEWS.md) — plain views, the `.asView()` API, read-only wrappers, and the per-dialect view DDL.
* [AGGREGATIONS.md](./AGGREGATIONS.md) — `groupBy`, `aggregate`, and the live-query shapes a matview pre-computes.
* [MIGRATIONS.md](./MIGRATIONS.md) — matview DDL through `push` and `diff`, ignore patterns for engine-maintained matviews.
* [CACHING.md](./CACHING.md) — query-result cache, the tighter-freshness alternative when minutes of staleness is too much.
* [POSTGRES.md](./POSTGRES.md) — `REFRESH MATERIALIZED VIEW CONCURRENTLY`, `pg_cron`, `pg_ivm`, and matview-specific GRANTs.
* [MONGO.md](./MONGO.md) — `$out` vs `$merge`, pipeline-as-matview, and the `last_refreshed_at` field pattern.
* [DEPLOYMENT.md](./DEPLOYMENT.md) — running matview refresh jobs through deploys, single-execution patterns under multi-process fleets.