# SQL Server forge-orm targets SQL Server 2017+, Azure SQL Database, and Azure SQL Managed Instance. This page documents the MSSQL-specific behavior — the MERGE-based upsert, the snapshot-isolation model, geography vs geometry spatial, OPENJSON-based JSON queries — and the operational concerns when running forge against MSSQL. ## Contents * [Supported versions and editions](#supported-versions-and-editions) * [Driver and connection string](#driver-and-connection-string) * [DDL emit table](#ddl-emit-table) * [IDENTITY columns and sequences](#identity-columns-and-sequences) * [Upsert via MERGE](#upsert-via-merge) * [Transactions and isolation](#transactions-and-isolation) * [Pagination — OFFSET / FETCH](#pagination--offset--fetch) * [CTE and window functions](#cte-and-window-functions) * [Full-text search](#full-text-search) * [Spatial — geography and geometry](#spatial--geography-and-geometry) * [JSON path queries](#json-path-queries) * [Vector — JSON fallback and SQL Server 2025](#vector--json-fallback-and-sql-server-2025) * [Azure SQL specifics](#azure-sql-specifics) * [Connection pool sizing](#connection-pool-sizing) * [Errors and codes](#errors-and-codes) * [Worked examples](#worked-examples) * [Cross-links](#cross-links) --- ## Supported versions and editions | Edition | Tested | Notes | |---|---|---| | SQL Server 2017 | yes | Floor — `OPENJSON`, `STRING_AGG`, `OFFSET … FETCH` all present. JSON_PATH index workaround needed (computed column + index). | | SQL Server 2019 | yes | Adds `STRING_SPLIT(ordinal)`, intelligent query processing, accelerated DB recovery. Doctor flags this as the recommended floor for production. | | SQL Server 2022 | yes | Native JSON aggregates (`JSON_OBJECT`, `JSON_ARRAY`), system-versioned ledger tables. forge uses none of the new keywords by default — emit stays portable to 2017. | | SQL Server 2025 (preview) | partial | Native `VECTOR(N)` type and `VECTOR_DISTANCE(...)`. The dialect emits the new syntax when `f.vector()` is declared; the JSON fallback is the documented path until 2025 GA. | | Azure SQL Database | yes | Always-current. Behaves as 2022 + extras (Always Encrypted with secure enclaves, Hyperscale, serverless tier auto-pause). | | Azure SQL Managed Instance | yes | Full SQL Server surface inside a VNet. CLR + SQL Agent + cross-DB queries work; no compatibility differences forge cares about. | | Azure SQL Edge | partial | ARM64-friendly local testing image (M-series macs). Missing: full-text catalogs, Polybase, ML Services. Doctor probes for FTS and reports a degraded `notes` line. | `SELECT @@VERSION` lands in the doctor report under `notes` so you can see which edition forge actually connected to. The minimum compatibility level forge requires is 140 (SQL 2017); earlier levels lack `OPENJSON` and the `OFFSET … FETCH` pagination shape that the IR compiler emits. --- ## Driver and connection string The shipped wrapper is `mssqlDriver` — a thin port over the [`mssql`](https://www.npmjs.com/package/mssql) package, which itself wraps [`tedious`](https://www.npmjs.com/package/tedious) on the wire. Install one, you get both: ```sh npm i mssql ``` `mssql` is a peer-dep. `tedious` ships as a transitive dependency. forge's `loadDriver('mssql', url)` does the `require` and surfaces a useful install hint when the package is missing. ```ts import sql from 'mssql'; import { createDb, mssqlDriver } from 'forge-orm'; const pool = await sql.connect({ user: process.env.MSSQL_USER, password: process.env.MSSQL_PASS, server: 'localhost', database: 'app', port: 1433, options: { encrypt: true, // mandatory on Azure SQL trustServerCertificate: false, // flip to true ONLY for self-signed dev enableArithAbort: true, }, pool: { max: 20, min: 0, idleTimeoutMillis: 30_000 }, }); export const db = await createDb({ schema, driver: mssqlDriver(pool) }); ``` Or with a URL: ```ts // mssql:// and sqlserver:// both detect to the mssql adapter. await createDb({ url: 'mssql://user:pass@localhost:1433/app?encrypt=true&trustServerCertificate=false', schema, }); ``` URL query parameters map directly to the `mssql` package's config keys (`encrypt`, `trustServerCertificate`, `connectTimeout`, `requestTimeout`). The detection table in [DRIVERS](./DRIVERS.md#the-two-call-shape-url-vs-driver) lists every prefix forge recognizes. ### Encrypt and trustServerCertificate * **`encrypt: true`** — mandatory on Azure SQL. Without it the wire is refused with `ELOGIN`. Local SQL Server developer instances accept either, but production should always encrypt. * **`trustServerCertificate: true`** — only for development against a self-signed cert (the `mcr.microsoft.com/mssql/server:2022-latest` image ships one). In production this should always be `false` so the CA chain is verified. * **Azure SQL Edge** — ARM-Mac local testing. Set `encrypt: true, trustServerCertificate: true` because the bundled cert isn't signed. ### Port 1433 is the default. Named instances on Windows use SQL Browser to resolve `server\instance` to a dynamic port — forge has no opinion here; pass `server: 'host\\\\instance'` and the `mssql` package figures it out. For Azure SQL the host is `.database.windows.net` on port 1433. ### Param binding — `@p1, @p2, …` The dialect emits T-SQL named-parameter placeholders, never `?` or `$1`. The driver wrapper binds them positionally via `request.input('p${i+1}', value)`. A custom driver must mirror that — see [DRIVERS](./DRIVERS.md#mssqldriver) for the contract and the "`Must declare the scalar variable "@p1"`" symptom when it's wrong. --- ## DDL emit table The MssqlDialect column types — what the schema kinds in `f.*` become when `forge push` runs against MSSQL: | forge kind | T-SQL type | Notes | |---|---|---| | `f.id()` (string PK) | `NVARCHAR(255)` | App-generated cuid/ulid/uuid string. No `IDENTITY`. | | `f.id(\| 'bigserial')` | `BIGINT IDENTITY(1,1)` | Auto-increment 64-bit. `IDENTITY_INSERT` needed to override. | | `f.id(\| 'uuid')` | `UNIQUEIDENTIFIER` | Native 16-byte GUID. Default is **not** set; supply with `NEWID()` or `NEWSEQUENTIALID()` on the column default if you want server-side gen. | | `f.string()` | `NVARCHAR(255)` | UTF-16, fixed cap. For UTF-8-only payloads with the `_UTF8` collation you can cut storage by half — set the column collation at DDL time outside forge. | | `f.text()` | `NVARCHAR(MAX)` | `TEXT` is deprecated; `NVARCHAR(MAX)` is the modern equivalent. | | `f.int()` | `INT` | 32-bit. | | `f.bigint()` | `BIGINT` | 64-bit. | | `f.float()` | `FLOAT` | 53-bit double-precision (IEEE 754). | | `f.decimal(p, s)` | `DECIMAL(p,s)` | Up to 38 digits. | | `f.bool()` | `BIT` | T-SQL has no real boolean. The IR compiler coerces JS `true`/`false` → `1`/`0` at param-bind time. | | `f.dateTime()` | `DATETIMEOFFSET` | Preserves the timezone offset. `DATETIME2` is the no-tz alternative — drop down via `$queryRaw` if you need it. `DATETIME` (the legacy 1900-9999 type) is intentionally not emitted; its 3.33ms precision is a footgun. | | `f.uuid()` | `UNIQUEIDENTIFIER` | Distinct from `id(\|'uuid')` only in that this is a regular column. | | `f.json()` | `NVARCHAR(MAX)` | SQL Server 2025 adds a real `JSON` type — forge stays on `NVARCHAR(MAX)` for portability. `ISJSON()` check constraint not added; declare it via `$executeRaw` if you want write-time validation. | | `f.embed(...)` / `embedMany(...)` | `NVARCHAR(MAX)` | Adapter `coerceInbound` does `JSON.stringify` on write. Reads come back as text — see [EMBED](./EMBED.md#mssql) for the auto-parse hook. | | `f.stringArray()` / `intArray()` | `NVARCHAR(MAX)` | T-SQL has no array type. Stored as JSON, read with `OPENJSON` or `JSON_VALUE`. The IR's array operators (`in`, `contains`) compile to `OPENJSON`-backed subqueries. | | `f.enum(...)` | `NVARCHAR(255)` + CHECK | `CHECK ([col] IN ('a', 'b', …))` added at table-create time. Adding a value later means dropping + recreating the CHECK — see [MIGRATIONS](./MIGRATIONS.md#enum-drift). | | `f.geoPoint()` | `GEOGRAPHY` | Default — WGS84. See [Spatial](#spatial--geography-and-geometry). | | `f.geoPoint({ fallback: true })` | `NVARCHAR(MAX)` | JSON `{lat, lng}` fallback for editions without spatial (Azure SQL Edge). | | `f.vector(N)` | `VECTOR(N)` | SQL Server 2025 only. Earlier targets need a JSON-stored fallback; see [Vector](#vector--json-fallback-and-sql-server-2025). | Types that forge does **not** emit because they're either legacy or specialised — drop into `$queryRaw` if you need them: * `MONEY` / `SMALLMONEY` — fixed 4dp; use `DECIMAL(19,4)` via `f.decimal()` instead. * `HIERARCHYID` — built-in materialised path. Useful for trees but not modelled by forge's relations layer. * `XML` — pre-JSON era. Use `NVARCHAR(MAX)` + `OPENXML` if you really need it. * `SQL_VARIANT` — typed-as-anything. Use `f.json()`. * `ROWVERSION` / `TIMESTAMP` — auto-incrementing concurrency token. Add it via `$executeRaw('ALTER TABLE … ADD rv ROWVERSION')` and read through introspection. ### `NVARCHAR` vs `VARCHAR` forge emits `NVARCHAR` everywhere by default. The N-prefix is UTF-16 (every char 2-4 bytes). `VARCHAR` is single-byte / collation-dependent. Rationale: * JS strings are UTF-16 internally — `NVARCHAR` is the lossless round trip. * SQL Server 2019+ has the `_UTF8` collations, which let `VARCHAR` store UTF-8. That can halve storage on Latin-script text. To opt in, set the column / database collation outside forge: ```sql ALTER DATABASE app SET COLLATION Latin1_General_100_CI_AS_SC_UTF8; ``` The dialect still emits `NVARCHAR` — drop down to `$executeRaw` if you want `VARCHAR` columns; forge's introspect reader normalizes both back to `string` so reads work either way. --- ## IDENTITY columns and sequences T-SQL has two paths to monotonic IDs: **IDENTITY** — the historical approach. `BIGINT IDENTITY(1,1)` attaches an auto-increment counter to the column. `INSERT` without that column gets the next value; `SCOPE_IDENTITY()` reads it back. forge uses IDENTITY for `f.id('bigserial')`: ```sql CREATE TABLE [orders] ( [id] BIGINT IDENTITY(1,1) NOT NULL, [total] DECIMAL(19,4) NOT NULL, CONSTRAINT [pk_orders] PRIMARY KEY ([id]) ); ``` When you need to override the counter (data import, replay), wrap the load: ```ts await db.$executeRaw(sql`SET IDENTITY_INSERT [orders] ON`); await db.Orders.createMany({ data: rows }); await db.$executeRaw(sql`SET IDENTITY_INSERT [orders] OFF`); ``` Only one table per session can have `IDENTITY_INSERT ON`. Forgetting to flip it back leaks across requests in a pool — keep the toggle inside `db.$transaction(async (tx) => …)` to scope the lifetime to the call. **SEQUENCE** — SQL Server 2012+. A separate object that hands out values; multiple columns / tables can draw from one sequence. forge doesn't emit them automatically, but introspect reads them back so you can mix-and-match. Declare via `$executeRaw` and reference in the column default: ```ts await db.$executeRaw(sql` CREATE SEQUENCE [order_seq] AS BIGINT START WITH 1000 INCREMENT BY 1 `); await db.$executeRaw(sql` ALTER TABLE [orders] ADD CONSTRAINT [df_orders_id] DEFAULT NEXT VALUE FOR [order_seq] FOR [id] `); ``` Sequences over IDENTITY when: * Several tables share an ID space (multi-tenant ledger). * You need to pre-allocate a block (`SELECT NEXT VALUE FOR seq` n times before the insert). * You want a custom cache size to amortise the IDENTITY round-trip (`CACHE 1000`). For app-generated IDs (cuid/ulid), stick with `f.id()` (string) — the client mints the ID, no round-trip is needed at insert time, and the ID is sortable + globally unique. --- ## Upsert via MERGE PG has `INSERT … ON CONFLICT DO UPDATE`. MySQL has `INSERT … ON DUPLICATE KEY UPDATE`. T-SQL has no equivalent single-statement form except `MERGE`. As of forge 2.5.0 the MSSQL adapter rewrites `Model.upsert(...)` into a MERGE statement with the conflict target derived from the `where` tree. ### The call shape ```ts await db.Products.upsert({ where: { sku: 'WIDGET-A' }, create: { sku: 'WIDGET-A', name: 'Widget A', stock: 10 }, update: { stock: { increment: 1 } }, }); ``` ### What forge emits ```sql MERGE INTO [products] AS tgt USING (VALUES (@p1, @p2, @p3)) AS src ([stock], [name], [sku]) ON tgt.[sku] = src.[sku] WHEN MATCHED THEN UPDATE SET [stock] = COALESCE(tgt.[stock], 0) + @p4 WHEN NOT MATCHED THEN INSERT ([stock], [name], [sku]) VALUES (src.[stock], src.[name], src.[sku]) OUTPUT INSERTED.*; ``` The pipeline: 1. **Conflict target** — extracted from the `where` tree. The compiler walks `eq` leaves (single column or an AND of eq leaves); anything else throws at compile time with the explicit message `upsert requires a conflict target. Use { where: { uniqueCol: value } }`. 2. **Source row** — every key in `create`, plus any conflict column the caller forgot to mirror into `create` (pulled from the `where` leaf value). Missing-from-both throws `upsert conflict column '' is in the where clause but not in create`. 3. **UPDATE branch** — `set`, `increment`, `multiply`, `unset` are compiled in that order. `increment` and `multiply` wrap the existing value in `COALESCE(tgt., 0)` so they're safe on NULL columns. When the caller passes only `create` and no `update` payload the branch falls back to a no-op self-assignment of the conflict column so the MERGE statement is still legal T-SQL. 4. **OUTPUT** — `INSERTED.*` returns the row that landed (either the freshly inserted one or the updated one). Forge's `executeUpdate` reads it back, so `upsert()` returns the row like every other adapter. ### MERGE caveats The community has historically cautioned against MERGE — the bugs caught users in multi-source-row or trigger-laden configurations. forge's rewrite is the single-source-row form (`USING (VALUES (…)) AS src`) against a single target, with no triggers in the path — that variant has been stable since 2014. What you should still know: * **Trailing semicolon is mandatory** — the dialect emits one. * **HOLDLOCK** is **not** added by default. Under `READ COMMITTED` with high-concurrency upsert, two sessions can race and both hit the WHEN NOT MATCHED branch, leading to a duplicate-key violation. The conflicting INSERT throws `2627` (mapped to `UniqueViolation`) — catch and retry the upsert, which now hits WHEN MATCHED. Run under SNAPSHOT isolation or `$queryRaw` a `HOLDLOCK` hint into the MERGE manually if you want to avoid the retry. * **No `skipDuplicates`** — MERGE always either updates or inserts. Rewrite as a conditional INSERT (`INSERT … WHERE NOT EXISTS (…)`) via `$queryRaw` if you need ignore-on-conflict semantics. ### Composite conflict keys ```ts await db.OrgMembers.upsert({ where: { orgId: 'o1', userId: 'u1' }, create: { orgId: 'o1', userId: 'u1', role: 'member', joinedAt: new Date() }, update: { role: 'admin' }, }); ``` Compiles to a MERGE with `ON tgt.[orgId] = src.[orgId] AND tgt.[userId] = src.[userId]`. The AND-of-eq detection in `whereEqLeafColumns()` handles arbitrary depth. ### Why this is the only safe path The naïve "SELECT then INSERT or UPDATE" alternative races between the read and the write. Even inside a transaction, `READ COMMITTED` lets another session insert the same key after your SELECT and before your INSERT. The MERGE form is atomic — one statement, one lock window — and combined with the retry-on-`2627` strategy is the documented MSSQL upsert pattern. For high-write-rate counters, prefer a plain `db.Counters.update({ where, increment: { n: 1 } })` — that's a single `UPDATE … SET n = COALESCE(n,0)+1` with no conflict path. Reserve `upsert` for first-write-wins semantics. --- ## Transactions and isolation T-SQL ships four named isolation levels plus snapshot, plus read committed snapshot. forge's `db.$transaction(fn)` defaults to the session's current isolation (the default is `READ COMMITTED`). Raise or lower per-transaction with `SET TRANSACTION ISOLATION LEVEL` inside the callback. ### The five levels (T-SQL) | Level | Reads see | Writes block | Use | |---|---|---|---| | `READ UNCOMMITTED` | Dirty reads of in-flight writes | No locks taken on read | Avoid. Used to be the "fast reporting" hack; SNAPSHOT replaces it. | | `READ COMMITTED` (default) | Only committed rows | Shared locks released after each read | OLTP default. Non-repeatable reads possible — re-querying the same row mid-tx can return a different value. | | `REPEATABLE READ` | Same row, same value for the tx | Holds S-locks for the tx duration | Higher contention than READ COMMITTED; rarely the right answer. | | `SERIALIZABLE` | Phantom-free | Range locks | Strictest. Use for invariant-protecting workflows (allocate-from-pool). | | `SNAPSHOT` | Row version at tx start | No reader-vs-writer blocks | Optimistic concurrency control. Update conflicts surface as `3960` at COMMIT. Requires `ALLOW_SNAPSHOT_ISOLATION ON` at the DB level. | ### `READ COMMITTED SNAPSHOT` (RCSI) A database-level flag that turns `READ COMMITTED` into a row-versioned read (no shared locks taken). Enable at the database, every transaction gets snapshot-like read semantics without needing to set `SNAPSHOT` per-call. **Azure SQL Database has RCSI on by default.** For self-managed SQL Server, enable explicitly: ```sql ALTER DATABASE app SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE; ``` `ROLLBACK IMMEDIATE` kills active sessions on the DB — run during a maintenance window. After RCSI is on, no app code changes; the default isolation behaves more like Postgres's REPEATABLE READ. ### Raising to SNAPSHOT inside a tx ```ts await db.$transaction(async (tx) => { await tx.query(`SET TRANSACTION ISOLATION LEVEL SNAPSHOT`); // every subsequent read in this tx sees the snapshot at tx start const balance = await tx.scope(db).Accounts.findFirst({ where: { id } }); // … check & write }); ``` The driver wrapper's `transaction()` opens a `BEGIN TRAN`; the `ISOLATION LEVEL SNAPSHOT` statement is per-session. On COMMIT the update-conflict check fires; conflicts come back as error `3960` (forge maps it to `DbKnownError` with the original error text). ### DDL implicitly commits T-SQL allows `CREATE TABLE` / `ALTER TABLE` inside a `BEGIN TRAN` block, but a uncommitted DDL holds Sch-M locks on the metadata for the duration. `forge push` and `db.$migrate()` run each statement outside of any user transaction; if you wrap a DDL call yourself, expect every other session to block on `sys.objects` until you commit. ### Online DDL `ALTER INDEX … REBUILD WITH (ONLINE = ON)` lets readers and writers continue while the index rebuilds. Enterprise / Azure SQL only. forge doesn't emit `ONLINE=ON` automatically; pass it via `$executeRaw` when you're rebuilding a large index in production. --- ## Pagination — OFFSET / FETCH The IR compiler emits `LIMIT n OFFSET m` (PG shape). The MSSQL compiler's `rewriteLimitOffset()` rewrites that tail into the T-SQL `OFFSET m ROWS FETCH NEXT n ROWS ONLY` form, which requires an `ORDER BY` clause to be present. The SELECT builder always emits one when there's a limit, so the rewrite is safe. ```ts const page = await db.Posts.findMany({ orderBy: { createdAt: 'desc' }, take: 25, skip: 75, }); ``` ```sql SELECT [id], [title], [createdAt] FROM [posts] ORDER BY [createdAt] DESC OFFSET 75 ROWS FETCH NEXT 25 ROWS ONLY; ``` OFFSET cost grows linearly — page 1000 with `OFFSET 25_000` reads and discards 25,000 rows. For deep pagination, switch to keyset (cursor) pagination: ```ts // First page const first = await db.Posts.findMany({ orderBy: { id: 'desc' }, take: 25, }); const cursor = first.at(-1)?.id; // Next page const next = await db.Posts.findMany({ where: { id: { lt: cursor } }, orderBy: { id: 'desc' }, take: 25, }); ``` Keyset pagination compiles to a clean `WHERE [id] < @p1 ORDER BY [id] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY` — index-perfect. Use it for infinite-scroll surfaces. OFFSET is fine for "page 1 to 10" finite UIs. ### `TOP` vs OFFSET / FETCH `SELECT TOP 25 …` is the older, simpler form — forge uses it only in the ctid-rewrite path (`WHERE [pk] IN (SELECT TOP 1 [pk] …)`) where there's no ORDER BY constraint. The IR doesn't compile to `TOP` for user-facing pagination because `TOP` doesn't support OFFSET. --- ## CTE and window functions T-SQL has had CTEs since SQL Server 2005 and window functions since 2005 (basic) / 2012 (full LAG/LEAD/percentile). forge's typed query layer doesn't model CTEs directly — drop to `$queryRaw` for the recursive cases. The dialect does emit window functions when the IR groupBy node uses `agg()` projections. ```ts import { sql } from 'forge-orm'; // Recursive: org hierarchy under a root. const tree = await db.$queryRaw(sql` WITH org_tree AS ( SELECT [id], [parentId], [name], 0 AS depth FROM [organizations] WHERE [id] = ${rootId} UNION ALL SELECT o.[id], o.[parentId], o.[name], t.depth + 1 FROM [organizations] o INNER JOIN org_tree t ON o.[parentId] = t.[id] ) SELECT * FROM org_tree OPTION (MAXRECURSION 100); `); ``` `OPTION (MAXRECURSION 100)` caps the recursion depth — 0 means unbounded, default is 100. Set it high for deep hierarchies, low for defensive coding. Window functions for ranking, running totals: ```ts const ranked = await db.$queryRaw(sql` SELECT [id], [score], RANK() OVER (PARTITION BY [orgId] ORDER BY [score] DESC) AS rank FROM [users] `); ``` The typed `groupBy().agg()` form already emits these for the documented aggregates (`count`, `sum`, `avg`, `min`, `max`); for `RANK`, `PERCENT_RANK`, `LAG`, `LEAD`, drop to `$queryRaw`. --- ## Full-text search SQL Server has a fully-featured FTS engine (`CONTAINS`, `FREETEXT`, ranked `CONTAINSTABLE` / `FREETEXTTABLE`), but it needs a catalog created out of band — `CREATE FULLTEXT CATALOG`, `CREATE FULLTEXT INDEX ON … KEY INDEX `. forge **does not** emit catalog DDL on `push`. The `.searchable()` marker is preserved on the schema (doctor and introspect can see it), but the runtime `searchClause` falls back to `LIKE '%q%'` so portability holds. ```ts const Post = model('posts', { id: f.id(), title: f.string().searchable(), body: f.text().searchable(), }); await db.Posts.findMany({ where: { title: { search: 'forge orm' } } }); // LIKE-mode SQL: // SELECT … FROM [posts] WHERE [title] LIKE '%' + @p1 + '%' ``` LIKE-mode is fine for low-volume search and gives portable behavior across the six dialects, but it never uses an index — every query table-scans. For production, install the FTS catalog and rewrite via `$queryRaw`: ```sql CREATE FULLTEXT CATALOG ftCatalog WITH ACCENT_SENSITIVITY = OFF; CREATE UNIQUE INDEX ux_posts_id ON [posts]([id]); CREATE FULLTEXT INDEX ON [posts]([title] LANGUAGE 1033, [body] LANGUAGE 1033) KEY INDEX ux_posts_id ON ftCatalog WITH CHANGE_TRACKING AUTO; ``` Then query with `CONTAINS` or `FREETEXT`: ```ts import { sql } from 'forge-orm'; // CONTAINS: phrase / AND / OR / proximity, strict syntax. const phrase = await db.$queryRaw(sql` SELECT [id], [title] FROM [posts] WHERE CONTAINS([title], ${'"forge orm" OR fts'}); `); // FREETEXT: looser; the engine breaks the query into terms and matches inflections. const freetext = await db.$queryRaw(sql` SELECT [id], [title] FROM [posts] WHERE FREETEXT(([title], [body]), ${'database search'}); `); // CONTAINSTABLE / FREETEXTTABLE: same predicates, but return RANK column. const ranked = await db.$queryRaw(sql` SELECT p.[id], p.[title], ft.[RANK] FROM [posts] p INNER JOIN FREETEXTTABLE([posts], *, ${q}) AS ft ON p.[id] = ft.[KEY] ORDER BY ft.[RANK] DESC OFFSET 0 ROWS FETCH NEXT 25 ROWS ONLY; `); ``` The `CHANGE_TRACKING AUTO` mode keeps the index live as the base table changes. With `OFF`, you populate manually with `ALTER FULLTEXT INDEX … START FULL POPULATION`. Azure SQL Database does **not** support full-text search — Azure SQL Managed Instance does. If you need FTS on Azure SQL, switch to MI or fall back to `LIKE` / a sidecar (Elasticsearch / Meilisearch). See [FTS](./FTS.md#mssql) for the long-form companion. --- ## Spatial — geography and geometry SQL Server has two spatial types: * **`GEOGRAPHY`** — WGS84-style ellipsoidal coordinates (lat/long on a sphere). Distances come back in **meters**. Use this for "real world" map data. * **`GEOMETRY`** — flat-plane Cartesian. Distances in whatever units the SRID dictates. Use for floor plans, game maps, anything where the Earth's curvature doesn't matter. forge emits `GEOGRAPHY` by default for `f.geoPoint()`. The dialect constructs values via `geography::STGeomFromText('POINT(lng lat)', 4326)` — note the **lng lat** order (WKT convention), not lat lng. ```ts const Store = model('stores', { id: f.id(), name: f.string(), location: f.geoPoint(), // → GEOGRAPHY }); await db.Stores.create({ data: { name: 'Main St', location: { lat: 6.5244, lng: 3.3792 }, // Lagos }, }); // near: returns the 10 closest within 5km, sorted by distance ascending. const near = await db.Stores.findMany({ where: { location: { near: { lat: 6.5244, lng: 3.3792, withinMeters: 5000 } } }, orderBy: { location: { distanceFrom: { lat: 6.5244, lng: 3.3792 } } }, take: 10, }); ``` Compiled: ```sql SELECT [id], [name], [location], [location].STDistance(geography::STGeomFromText(@p1, 4326)) AS _dist FROM [stores] WHERE [location].STDistance(geography::STGeomFromText(@p2, 4326)) < @p3 ORDER BY _dist ASC OFFSET 0 ROWS FETCH NEXT 10 ROWS ONLY; ``` The `STDistance` call on `geography` returns meters. On `geometry` it returns SRID-units (typically the same number, but interpreted flat). Mixing types in one expression throws — forge always emits `geography` so this only bites you if you mix in a `$queryRaw`. ### Spatial indexes `CREATE SPATIAL INDEX` over a geography column needs a bounding-box spec (geometry) or a default grid (geography). forge's DDL emitter rewrites `CREATE INDEX … USING gist` (the PG dialect's marker for spatial method) into `CREATE SPATIAL INDEX [name] ON [table]([col])`. The default grid (`GRID = (LOW, LOW, LOW, LOW)`) is fine for most workloads; tune via `$executeRaw` for hot tables. ### MultiPolygon / withinPolygon The dialect's `geoWithinPolygonClause` builds a `geography::STGeomFromText(, srid).STContains([col]) = 1`. Note the inversion — SQL Server's `STContains` is "does polygon contain point?", which is what `withinPolygon` semantically asks. ```ts await db.Stores.findMany({ where: { location: { withinPolygon: { type: 'Polygon', coordinates: [[[3.3, 6.4], [3.5, 6.4], [3.5, 6.6], [3.3, 6.6], [3.3, 6.4]]], }, }, }, }); ``` The polygon is converted to WKT with `lng lat` ordering (the `'lng-lat'` axis hint in `toGeoWKT`) to match SQL Server's expectation. MultiPolygon and GeometryCollection work the same way — the WKT helper handles both, and SRID defaults to 4326. See [GEO](./GEO.md#mssql) for the cross-dialect deep dive. ### Azure SQL Edge Spatial is **not** supported on Azure SQL Edge (the ARM64 testing image). Use `f.geoPoint({ fallback: true })` for those tests — fallback stores JSON `{lat, lng}` and runs an in-memory haversine in forge's JS layer. Doctor reports the missing spatial subsystem. --- ## JSON path queries JSON support landed in SQL Server 2016 and is stable across every supported edition. forge stores `f.json()` / `f.embed()` columns as `NVARCHAR(MAX)` and reaches into them with the three JSON functions: | T-SQL | Returns | Use | |---|---|---| | `JSON_VALUE(col, '$.path')` | scalar (NVARCHAR(4000)) | atoms (string, number, bool). Returns NULL on type mismatch by default; `WITH (LAX)` mode is the default; switch to `STRICT` for error-on-miss. | | `JSON_QUERY(col, '$.path')` | nested JSON (NVARCHAR(MAX)) | objects / arrays. Returns the JSON-stringified subtree. | | `OPENJSON(col, '$.path') WITH (...)` | row set | the FROM-clause form; pulls JSON arrays out into derived rows. | The dialect's `jsonPathExpr` emits `JSON_VALUE`: ```ts await db.Events.findMany({ where: { meta: { path: ['source'], eq: 'web' } }, }); ``` ```sql SELECT … FROM [events] WHERE JSON_VALUE([meta], '$.source') = @p1; ``` `JSON_VALUE` returns NVARCHAR(4000). For ints/floats wrap with `CAST`; forge does that automatically for typed `f.json()` access via the typed-path API (see [JSON-PATH](./JSON-PATH.md#mssql)). Untyped `{path:[…], eq: 42}` compares against the string `'42'` — equal under T-SQL implicit conversion but not index-eligible. ### Indexing JSON paths SQL Server can't index inside a JSON document directly (until 2022's optimized JSON storage). The standard workaround is a computed + persisted column, then index that: ```sql ALTER TABLE [events] ADD source_x AS CAST(JSON_VALUE([meta], '$.source') AS NVARCHAR(64)) PERSISTED; CREATE INDEX ix_events_source_x ON [events](source_x); ``` The IR compiler doesn't auto-generate this — declare via `$executeRaw` once and forge's introspect reads it back. Without the index every `JSON_VALUE` predicate is a full scan. ### OPENJSON for array access When `stringArray` / `intArray` columns are queried with `contains`, forge emits an `EXISTS (SELECT 1 FROM OPENJSON(col) WHERE value = ?)`: ```ts await db.Posts.findMany({ where: { tags: { contains: 'forge' } } }); ``` ```sql SELECT … FROM [posts] WHERE EXISTS (SELECT 1 FROM OPENJSON([tags]) WHERE [value] = @p1); ``` `OPENJSON` without a `WITH` clause emits one row per top-level array element with `[key], [value], [type]` columns. Filter by `[value]` for the contains check. For typed array reads (json that's actually `[{name:'…'}, {name:'…'}]`), use the `WITH` form: ```sql SELECT j.name, j.qty FROM [orders] CROSS APPLY OPENJSON([items]) WITH (name NVARCHAR(255), qty INT) j WHERE [orders].[id] = @p1; ``` See [JSON-PATH](./JSON-PATH.md#mssql) for the typed JSON access pattern. --- ## Vector — JSON fallback and SQL Server 2025 SQL Server 2025 (currently in preview) adds a native `VECTOR(N)` type and `VECTOR_DISTANCE('cosine' | 'l2' | 'ip', a, b)`. The dialect emits the new syntax when `f.vector(N)` is declared, which is right for Azure SQL Database (which tracks the preview) but **does not** work on SQL Server 2017 / 2019 / 2022. For pre-2025 targets, use the JSON-stored fallback: ```ts const Doc = model('docs', { id: f.id(), text: f.text(), embedding: f.json(), // store as JSON array }); // On write — pass the array, forge JSON.stringifies it. await db.Docs.create({ data: { text: '…', embedding: [0.1, 0.2, /* … */] } }); // On read — read back as JSON; compute cosine in the app layer. const rows = await db.Docs.findMany({ select: { id: true, embedding: true } }); const ranked = rows .map((r) => ({ id: r.id, sim: cosine(r.embedding, queryVec) })) .sort((a, b) => b.sim - a.sim) .slice(0, 10); ``` This is fine for 10K-row corpora; beyond that the round-trip cost dominates. Options: * **Hybrid query** — pre-filter with FTS or a metadata index, then cosine-rank the small result set in the app. * **Sidecar** — push the embedding store to a vector DB (Qdrant, pgvector via a sidecar Postgres, Pinecone) and keep MSSQL for OLTP. * **SQL Server 2025 / Azure SQL** — switch to `f.vector(N)` and the native `VECTOR_DISTANCE` predicate. The dialect already emits it; doctor probes for the version and reports availability. See [VECTOR](./VECTOR.md#mssql) for the full strategy doc. --- ## Azure SQL specifics Azure SQL Database and Azure SQL Managed Instance behave like SQL Server 2022+ at the wire, but the operational model is different: ### DTU vs vCore * **DTU** — bundled compute + storage + IO units. Simple for small apps. `S0` = 10 DTU, `S3` = 100 DTU, `P15` = 4000 DTU. The DTU model is a soft cap on everything; the cap doesn't tell you which resource is the bottleneck. * **vCore** — pay per vCore and storage independently. Hyperscale tier (vCore-only) supports up to 100TB and instant restore. **Recommended for new deployments** — the metrics map directly to what you reason about (CPU, memory, IO, log throughput). Doctor reads `sys.dm_db_resource_stats` and reports the last 60s average. Sustained CPU > 80% is a sign to scale; sustained log writes near the limit means batch your writes or switch tier. ### Geo-replication * **Active geo-replication** — readable secondary in a different region; manual or auto failover. The secondary is read-only and lags by seconds. * **Auto-failover groups** — at the server / database level, single failover DNS endpoint. The connection string stays stable across failover. * **Geo-restore** — restore from geo-redundant backup to any region. Forge's read replica detection isn't automatic; pass two `db` instances (one for writes, one for reads) and route at the app layer. The MSSQL adapter has no opinion on read/write split. ### Always Encrypted Column-level encryption with keys held client-side (the SQL Server never sees plaintext). With Always Encrypted enabled: * The `mssql` driver needs `columnEncryptionSetting: 'Enabled'` in the pool config plus a key store config (Azure Key Vault is the typical choice). * Encrypted columns can only be compared with `=` (deterministic encryption) — no LIKE, range, ordering, or join. * `IDENTITY` columns can't be encrypted; use a string PK + app-side ulid. forge doesn't know about Always Encrypted at the type level — declare the columns as `f.string()`, set the encryption metadata via `$executeRaw` ALTER, and queries with `where: { col: value }` still work. Range queries (`gt`, `lt`) on encrypted columns will fail at runtime — the driver throws `Operand type clash`. ### Connection limits | Tier | Concurrent worker threads | Max session count | |---|---|---| | Basic | 30 | 300 | | S0 / S1 | 60 | 600 | | S2 / S3 | 90 / 200 | 900 / 2400 | | P1 / P15 | 200 / 6400 | 2400 / 30,000 | | GP_Gen5_2 / GP_Gen5_16 | 200 / 1600 | 2400 / 19,200 | The worker count is the hard ceiling on concurrent queries. Hitting it returns error 10928 (`The resource pool is busy`). Pool the connection on the app side at well under that — a webserver with 50 concurrent requests should target pool max ~20, not 200. --- ## Connection pool sizing The `mssql` package owns the pool. forge calls `pool.request()` per query and `pool.transaction()` per `$transaction`. Defaults are `min: 0, max: 10, idleTimeoutMillis: 30_000`. | Workload | Pool max | Why | |---|---|---| | Edge (per-request) | 1 | No persistence; one connection per invocation | | Single-process webserver, 50 RPS | 10–20 | Default usually fine; raise if you see `idleTimeoutMillis` reconnect storms | | Single-process webserver, 500 RPS | 50–100 | Account for tail latency under load | | Background worker, batch writes | 5–10 | Long-running tx — too many starves OLTP | | Long-running analytics | 2–4, separate pool | Don't share a pool with the OLTP path | ```ts const pool = await sql.connect({ /* … */ pool: { max: 50, min: 2, idleTimeoutMillis: 30_000, acquireTimeoutMillis: 15_000, // wait up to 15s for an idle conn before failing }, }); ``` **Max worker threads** is a server-side knob — `sp_configure 'max worker threads'`. The default scales with CPU; touch only if you see SQLSERVER_WAITING_FOR_WORKER waits in `sys.dm_os_wait_stats`. Register `db.$disconnect()` from `SIGTERM` / `SIGINT` so the pool drains in flight. `close()` is idempotent — calling it twice on the same pool is safe. --- ## Errors and codes The errors module maps SQL Server numeric error codes to forge's Prisma-style `P` codes via `DbKnownError`. The full table: | MSSQL number | Forge code | Meaning | |---|---|---| | 2627 | `P2002` | PK / UNIQUE violation, named constraint | | 2601 | `P2002` | UNIQUE index violation (the duplicate-key form, often after MERGE retry) | | 547 | `P2003` | FOREIGN KEY violation (also fires on CHECK violations historically) | | 515 | `P2011` | NOT NULL violation on insert/update | | 208 | `P2021` | Invalid object name — table doesn't exist | | 207 | `P2022` | Invalid column name | | 1205 | `P2034` | Deadlock victim — transaction was rolled back, please retry | | 18456| `P1010` | Login failed for user | Things that fall through (`P` code not assigned) but you'll meet: * **3960** — SNAPSHOT isolation update conflict at COMMIT. Catch and retry the tx. * **10928 / 10929** — Resource governance — out of worker threads or rate limit. Backoff and retry. * **40197** — Azure SQL service event (failover, software update). Idempotent retry. * **41302 / 41305 / 41325** — In-Memory OLTP write conflicts. Retry. * **8645** — Memory grant timeout. Lower the query's parallelism or split it. ```ts try { await db.Products.upsert({ where: { sku }, create, update }); } catch (err) { if (err instanceof DbKnownError && err.code === 'P2002') { // Race — MERGE saw "no match" twice and the second INSERT lost. Retry. return db.Products.upsert({ where: { sku }, create, update }); } throw err; } ``` The retry-on-deadlock-or-conflict pattern is one place where `db.$transaction(fn, { retries: 3 })` saves boilerplate — though the adapter doesn't enforce a max-retry, so build a sane cap in your wrapper. --- ## Worked examples ### MERGE upsert with composite key ```ts // Idempotent membership grant: same (orgId, userId) won't double-insert. await db.OrgMembers.upsert({ where: { orgId, userId }, create: { orgId, userId, role: 'member', joinedAt: new Date() }, update: { role: 'admin' }, // promote if exists }); ``` ```sql MERGE INTO [org_members] AS tgt USING (VALUES (@p1, @p2, @p3, @p4)) AS src ([role], [joinedAt], [orgId], [userId]) ON tgt.[orgId] = src.[orgId] AND tgt.[userId] = src.[userId] WHEN MATCHED THEN UPDATE SET [role] = @p5 WHEN NOT MATCHED THEN INSERT ([role], [joinedAt], [orgId], [userId]) VALUES (src.[role], src.[joinedAt], src.[orgId], src.[userId]) OUTPUT INSERTED.*; ``` ### OFFSET pagination with stable ORDER BY ```ts const page = await db.Audit.findMany({ where: { orgId }, orderBy: [{ createdAt: 'desc' }, { id: 'desc' }], // tiebreak so paging is stable take: 50, skip: (pageNum - 1) * 50, }); ``` ```sql SELECT [id], [orgId], [actor], [event], [createdAt] FROM [audit] WHERE [orgId] = @p1 ORDER BY [createdAt] DESC, [id] DESC OFFSET 0 ROWS FETCH NEXT 50 ROWS ONLY; ``` The trailing `[id] DESC` is critical — without it, two rows with the same `createdAt` ms can flip between pages. Always tiebreak on a unique column. ### FREETEXT search (CHANGE_TRACKING AUTO) After provisioning the FTS catalog out-of-band: ```ts const results = await db.$queryRaw(sql` SELECT TOP 25 p.[id], p.[title], ft.[RANK] FROM [posts] p INNER JOIN FREETEXTTABLE([posts], ([title], [body]), ${query}) AS ft ON p.[id] = ft.[KEY] WHERE p.[orgId] = ${orgId} ORDER BY ft.[RANK] DESC; `); ``` `FREETEXTTABLE` returns a virtual `{KEY, RANK}` table; `KEY` is the PK of the matched row. The join binds rank to the base row. `TOP 25` caps; for paging use a CTE with `ROW_NUMBER() OVER (ORDER BY RANK DESC)`. ### Geography distance query ```ts const here = { lat: 6.5244, lng: 3.3792 }; const near = await db.Stores.findMany({ where: { location: { near: { ...here, withinMeters: 10_000 } } }, orderBy: { location: { distanceFrom: here } }, take: 20, }); ``` Each returned row has the original columns. To get the computed distance alongside, drop to `$queryRaw`: ```ts const ranked = await db.$queryRaw(sql` SELECT [id], [name], [location].STDistance(geography::STGeomFromText(${`POINT(${here.lng} ${here.lat})`}, 4326)) AS meters FROM [stores] WHERE [location].STDistance(geography::STGeomFromText(${`POINT(${here.lng} ${here.lat})`}, 4326)) < 10000 ORDER BY meters ASC OFFSET 0 ROWS FETCH NEXT 20 ROWS ONLY; `); ``` `STDistance` on `geography` returns meters; on `geometry` it returns SRID-units. Mixing the two in one expression throws — the dialect always emits `geography::STGeomFromText` so this only bites you in `$queryRaw`. ### Transactional debit-credit with SNAPSHOT ```ts await db.$transaction(async (tx) => { await tx.query(`SET TRANSACTION ISOLATION LEVEL SNAPSHOT`); const source = await tx.scope(db).Accounts.findFirst({ where: { id: from } }); if (!source || source.balance < amount) { throw new Error('insufficient funds'); } await tx.scope(db).Accounts.update({ where: { id: from }, set: { balance: { decrement: amount } }, }); await tx.scope(db).Accounts.update({ where: { id: to }, set: { balance: { increment: amount } }, }); }); ``` Under SNAPSHOT, the read inside the tx sees the row-version at tx start. If another tx commits a balance change before this one commits, the COMMIT throws `3960` and the wrapping retry runs the whole thing again. The alternative is `SERIALIZABLE` — same safety, but takes range locks and serializes contending writers, which throttles throughput. --- ## Cross-links * [DRIVERS](./DRIVERS.md#mssqldriver) — the `MssqlDriver` port contract, common bugs (`@p1` unbound), wire-compatible swaps (Azure SQL Edge). * [QUERIES](./QUERIES.md) — `findMany`, `findFirst`, `where`, `orderBy`, `groupBy`, cursor pagination — adapter-agnostic. * [INDEXES](./INDEXES.md) — `f.index()` / `f.unique()` shape and what the MSSQL DDL writes (partial-filter via `WHERE`, computed-column tricks). * [FTS](./FTS.md#mssql) — full-text deep dive across all six dialects; MSSQL catalog provisioning recipes. * [GEO](./GEO.md#mssql) — spatial type / WKT axis conventions / withinPolygon / MultiPolygon. * [JSON-PATH](./JSON-PATH.md#mssql) — typed JSON path access, OPENJSON projections, computed-column indexing. * [MUTATIONS](./MUTATIONS.md) — `create`, `update`, `delete`, `upsert` shapes; the MERGE rewrite is the MSSQL specialisation. * [TRANSACTIONS](./TRANSACTIONS.md) — `db.$transaction(fn)` semantics; per-adapter session shape including `MssqlQueryable`. * [MIGRATIONS](./MIGRATIONS.md) — `forge push`, plan/apply, idempotent DDL wrappers (`IF NOT EXISTS (sys.tables …) BEGIN … END`). * [DOCTOR](./DOCTOR.md) — what the MSSQL doctor probe reports — driver, RCSI status, FTS catalog presence, vector availability.