# Pessimistic locking via `KeyedLockProvider`

Most of Ekbatan's write path uses **optimistic** locking: every update carries `WHERE version = ?`, and conflicting writes surface as `StaleRecordException` for the executor to retry. That works well at low-to-medium contention. Some operations don't fit that model:

- **At-most-once side effects** around external systems, such as a webhook handler that calls a payment API where retrying would charge twice.
- **Single-flight execution**, such as a reconciliation job that should run on exactly one node at a time.
- **Hot-key write paths**, such as a wallet that takes 1000 deposits/sec, where retry-on-conflict thrashes more than it succeeds.

For these, `KeyedLockProvider` is a key-scoped mutex with a uniform contract across five backends. Two acquirers using the same key are mutually exclusive (the second waits up to `maxWait` for the first to release); acquirers using different keys don't block each other.

## The contract

```java
import java.time.Duration;
import java.util.Optional;

public interface KeyedLockProvider {

    // Blocks until the lock is acquired or the thread is interrupted.
    Lease acquire(String key, Duration maxHold) throws InterruptedException;

    // Waits at most maxWait; returns an empty Optional if the wait times out.
    Optional<Lease> tryAcquire(String key, Duration maxWait, Duration maxHold) throws InterruptedException;

    interface Lease extends AutoCloseable {
        boolean isHeld();

        @Override
        void close();
    }
}
```

- `acquire(key, maxHold)`: blocking. Waits indefinitely until the lock is acquired (or the thread is interrupted). The lease auto-releases when `maxHold` elapses, regardless of whether the holder closed it.
- `tryAcquire(key, maxWait, maxHold)`: bounded wait. Returns `Optional.empty()` if the wait times out.
- `key` is a `String`. **Namespace it per type** when locking on entity IDs (e.g. `"wallet:" + walletId`) so two unrelated ID spaces never collide.
- Closing the lease releases the lock. Use `try-with-resources`.
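The per-type namespacing advice can be captured in a small helper so that call sites cannot forget the prefix. This is only a sketch: `LockKeys` and `forEntity` are hypothetical names, not part of Ekbatan, and rejecting `:` in the type name is an assumption made here to keep keys unambiguous.

```java
import java.util.UUID;

/** Hypothetical helper (not part of Ekbatan): builds "type:id" lock keys. */
final class LockKeys {

    private LockKeys() {}

    /** Returns a key such as "wallet:42"; the type prefix keeps unrelated ID spaces apart. */
    static String forEntity(String type, Object id) {
        // Assumption for this sketch: the type must be non-empty and must not
        // contain the separator, so "a:b" + id can never alias "a" + "b:..." keys.
        if (type.isEmpty() || type.indexOf(':') >= 0) {
            throw new IllegalArgumentException("invalid lock key type: " + type);
        }
        return type + ":" + id;
    }

    public static void main(String[] args) {
        UUID walletId = UUID.randomUUID();
        System.out.println(forEntity("wallet", walletId));
    }
}
```

A caller would then pass `LockKeys.forEntity("wallet", walletId)` as the `key` argument of `acquire` or `tryAcquire`.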

A wallet deposit, serialized per-wallet, with the lock acquired in the **caller** (the controller / job / handler that invokes the action), not inside `Action.perform()`. See [Where to acquire the lock](#where-to-acquire-the-lock) for the rationale:

```java
@RestController
@RequestMapping("/wallets")
public class WalletController {

    private static final Duration MAX_HOLD = Duration.ofSeconds(10);

    private final ActionExecutor executor;
    private final KeyedLockProvider lockProvider;

    public WalletController(ActionExecutor executor, KeyedLockProvider lockProvider) {
        this.executor = executor;
        this.lockProvider = lockProvider;
    }

    @PostMapping("/{id}/deposits")
    public Wallet deposit(@PathVariable UUID id, @RequestBody DepositRequest body) throws Exception {
        // Wait at most 2s for the per-wallet lease; the lease is released when the try block exits.
        try (var lease = lockProvider.tryAcquire("wallet:" + id, Duration.ofSeconds(2), MAX_HOLD)
                .orElseThrow(() -> new IllegalStateException("Wallet " + id + " is busy; try again later"))) {
            return executor.execute(
                    () -> "rest-user",
                    WalletDepositAction.class,
                    new WalletDepositAction.Params(Id.of(Wallet.class, id), body.amount()));
        }
    }
}
```

The `WalletDepositAction` itself is plain: no `KeyedLockProvider` injection, no try-with-resources in `perform()`. The locking is a *coordination policy* applied at the boundary of the action invocation, not a property of the action.

Three things worth highlighting:

- **`maxWait` (2s) bounds how long the caller will block** before giving up and surfacing a domain error. Without it (e.g. with the unbounded-wait `acquire(key, maxHold)` variant), a caller would queue indefinitely behind any prior holders.
- **`maxHold` (10s) is a safety net, not a hint.** If the locked region overruns, the lock auto-releases, capping the blast radius of a hung holder regardless of what the calling thread is doing.
- **Each acquire borrows its own JDBC connection** (for the SQL-backed providers) for the lifetime of the lease and for the wait.
  For lock-heavy workloads, point the provider at a dedicated pool; see [the dedicated pool recipe below](#dedicated-pool-via-the-lockconfig-slot).

## Where to acquire the lock

A keyed lock can be acquired in two places. Both compile and run; only one of them actually serializes the underlying DB writes.

### ✅ At the caller, around `executor.execute(...)` (the default)

```
caller (controller / job / event handler)
  ├─ acquire lock            ←─┐
  ├─ executor.execute()        │
  │   ├─ Action.perform()      │  lock held
  │   │   ├─ read state        │
  │   │   └─ plan.update       │
  │   └─ commit txn            │
  └─ release lock            ←─┘
```

The lease wraps both `perform()` AND the framework's transaction commit. Concurrent acquirers can't enter until the previous commit is fully visible. No race window, no optimistic-locking conflicts triggered by lock-coordinated writes.

**This is what most apps want.** When the goal is "concurrent writes on the same key must produce a consistent result", the lock must span the commit phase.

### ⚠️ Inside `Action.perform()` (narrower use case only)

```
caller
  └─ executor.execute()
      ├─ Action.perform()
      │   ├─ acquire lock    ←─┐
      │   ├─ read state        │  lock held
      │   ├─ plan.update       │
      │   └─ release lock    ←─┘  (try-with-resources exits)
      └─ commit txn          ← race window: concurrent acquirer can read pre-commit state
```

The lease closes *before* the framework's commit runs. Another acquirer can grab the lock immediately, read the still-pre-commit state, compute a stale-version update, and collide at its own commit. Ekbatan's optimistic-locking version check rejects the loser (`StaleRecordException` → the executor surfaces the failure), so no data is lost, but some calls fail under contention. With 10 concurrent deposits, only ~6 commit; the other 4 surface a version-conflict error.

This placement is appropriate for **side-effect coordination, not commit serialization**:

- An outbound webhook call where you want at-most-one in-flight per resource.
  The lock is held across the external call; the DB write is a thin record-keeping concern.
- A long external compute (PDF generation, ML inference) where you want to dedupe in-flight calls per key.
- Anywhere the DB write is idempotent (e.g. an upsert) and serialization isn't the goal.

For **write-path serialization** (the most common use case), acquire at the caller.

### Why not just always lock inside `perform()`?

`Action.perform()` is the body of the action: the change plan is built here, but the framework hasn't yet opened the transaction or committed the writes. The transaction lifecycle is owned by `ActionExecutor`, not by the action; the executor opens the transaction *after* `perform()` returns the plan, and commits it before `execute(...)` returns to the caller. The action body has no facility to "extend the lock until the executor commits": the lease closes when `perform()` exits, no exceptions.

The caller-side pattern works because the caller owns the boundary that *contains* the executor's transaction lifecycle. The lease closes after `execute(...)` returns, which is after the framework's commit.

## Reentrancy contract: uniform across all five backends

Same thread + same key acquires re-enter without blocking. The underlying backend lock is released only when the **outermost** lease is closed (or the `maxHold` watchdog fires).

The first acquire's `maxHold` governs the watchdog; re-entries' `maxHold` arguments are ignored. This is **stricter than Redisson/Hazelcast's "last-call-wins"** convention and prevents an inner re-entry from shortening the outer holder's commitment.

Reentrancy is **per-thread, not per-call-stack**. A child thread spawned inside a held region is a different identity and will block.

This contract is enforced by a shared internal helper, `KeyedReentrantHolder`, which owns the per-`(thread, key)` counter, a virtual-thread watchdog, and the release-arbitration CAS. Each backend only has to implement low-level `acquire`/`release`.
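
The counting rules above can be illustrated with a small bookkeeping class. This is a minimal sketch, not the real `KeyedReentrantHolder`: the class name `ReentrancySketch` and its methods are hypothetical, and the watchdog and release-arbitration CAS are left out. It only shows the per-`(thread, key)` depth counter and the first-call-wins `maxHold`.

```java
import java.time.Duration;
import java.util.HashMap;
import java.util.Map;

/** Illustrative only: per-(thread, key) reentrancy bookkeeping, without watchdog or CAS. */
class ReentrancySketch {

    /** Nesting depth plus the maxHold recorded by the first (outermost) acquire. */
    private static final class Hold {
        int depth;
        final Duration maxHold;

        Hold(Duration maxHold) {
            this.maxHold = maxHold;
        }
    }

    // One map per thread, so a child thread never sees its parent's holds.
    private final ThreadLocal<Map<String, Hold>> holds = ThreadLocal.withInitial(HashMap::new);

    /** Returns true for the outermost acquire, i.e. when the backend lock must be taken. */
    boolean enter(String key, Duration maxHold) {
        // computeIfAbsent keeps the existing Hold on re-entry, so the first maxHold wins.
        Hold hold = holds.get().computeIfAbsent(key, k -> new Hold(maxHold));
        hold.depth++;
        return hold.depth == 1;
    }

    /** Returns true when the outermost lease closes, i.e. when the backend lock must be released. */
    boolean exit(String key) {
        Map<String, Hold> mine = holds.get();
        Hold hold = mine.get(key);
        if (hold == null) {
            throw new IllegalStateException("not held by this thread: " + key);
        }
        hold.depth--;
        if (hold.depth > 0) {
            return false; // an outer lease is still open
        }
        mine.remove(key);
        return true;
    }

    /** The maxHold governing the current thread's hold on this key. */
    Duration effectiveMaxHold(String key) {
        Hold hold = holds.get().get(key);
        if (hold == null) {
            throw new IllegalStateException("not held by this thread: " + key);
        }
        return hold.maxHold;
    }
}
```

In this sketch a nested `enter("wallet:1", ...)` on the same thread returns `false` and leaves the recorded `maxHold` untouched, while a different thread starts from an empty map and would have to go to the backend lock, where it blocks.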

## Five backends

| Provider | Scope | Backend | Hold | Notes |
|---|---|---|---|---|
| `InProcessKeyedLockProvider` | single JVM | per-key fair `Semaphore` | watchdog | FIFO. No JDBC connection consumed. |
| `PostgresKeyedLockProvider` | cross-JVM | `pg_advisory_lock` (session-scoped) | session + watchdog | Auto-released if the session terminates (process crash, network drop). Borrows a connection per lease. |
| `MariaDBKeyedLockProvider` | cross-JVM | `GET_LOCK(...)` | session + watchdog | MariaDB 12.x rejects a negative timeout, so `Integer.MAX_VALUE` seconds (≈68 years) is the wait-forever sentinel. |
| `MySQLKeyedLockProvider` | cross-JVM | `GET_LOCK(...)` | session + watchdog | Sub-second `maxWait` rounds up to whole seconds (MySQL precision limit). |
| `RedisKeyedLockProvider` | cross-JVM | Redisson `RLock` | TTL + watchdog | Sub-millisecond hand-off. Lives in `ekbatan-keyed-lock-redis`. Single-master only; **not** Redlock-based. |

All five honor the same reentrancy contract via `KeyedReentrantHolder`. The Redis variant explicitly disables Redisson's own watchdog (passes `maxHold` as Redisson `leaseTime`) and uses a local virtual-thread watchdog so the framework's first-call-wins semantics win over Redisson's last-call-wins default.
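
The two `GET_LOCK` notes in the table come down to one small conversion from a `Duration` to a whole-second timeout. The sketch below shows one way that conversion could look; `GetLockTimeouts` and its members are hypothetical names, and the providers' actual conversion code is not shown in this guide.

```java
import java.time.Duration;

/** Hypothetical helper: maps a maxWait Duration to a GET_LOCK timeout in whole seconds. */
final class GetLockTimeouts {

    /** Wait-forever sentinel: Integer.MAX_VALUE seconds, roughly 68 years. */
    static final int WAIT_FOREVER_SECONDS = Integer.MAX_VALUE;

    private GetLockTimeouts() {}

    /** Rounds a bounded wait up to whole seconds, e.g. 250 ms becomes 1 s. */
    static int toTimeoutSeconds(Duration maxWait) {
        if (maxWait.isNegative()) {
            throw new IllegalArgumentException("maxWait must not be negative: " + maxWait);
        }
        long wholeSeconds = maxWait.getSeconds();
        if (wholeSeconds >= WAIT_FOREVER_SECONDS) {
            return WAIT_FOREVER_SECONDS; // clamp very long waits to the sentinel
        }
        // Any leftover nanoseconds push the timeout up to the next whole second.
        return (int) (maxWait.getNano() > 0 ? wholeSeconds + 1 : wholeSeconds);
    }
}
```

Rounding up rather than down means a caller asking for a 250 ms `maxWait` may wait up to a full second on these backends, but never gives up earlier than requested.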

### Wiring up a backend

The SQL-backed providers all take a `ConnectionProvider`; Redis takes a `RedissonClient`:

```java
import static io.ekbatan.core.concurrent.PostgresKeyedLockProvider.Builder.postgresKeyedLockProvider;
import static io.ekbatan.core.persistence.ConnectionProvider.hikariConnectionProvider;

var lockProvider = postgresKeyedLockProvider()
        .connectionProvider(hikariConnectionProvider(lockDataSourceConfig))
        .build();
```

```java
import static io.ekbatan.keyedlock.redis.RedisKeyedLockProvider.Builder.redisKeyedLockProvider;

var lockProvider = redisKeyedLockProvider()
        .redissonClient(redisson)
        .namespace("my-app-locks") // optional, default "ekbatan-lock"
        .build();
```

The same builder shape works for `MariaDBKeyedLockProvider`, `MySQLKeyedLockProvider`, and `InProcessKeyedLockProvider`.

## Dedicated pool via the `lock-config` / `lockConfig` slot

Each held lease (and each thread blocked waiting for one) pins a JDBC connection for its entire lifetime. For lock-heavy workloads, **point the provider at a dedicated pool** so locks don't starve normal queries.

Add a user-defined `lock-config:` entry to the relevant member in your `ShardingConfig`. External config may use `lock-config` or `lockConfig`; datasource leaves may also use kebab-case or camelCase (`jdbc-url` / `jdbcUrl`, `maximum-pool-size` / `maximumPoolSize`, `leak-detection-threshold` / `leakDetectionThreshold`). After binding the Java key is always `lockConfig`, so `configFor(...)` must use camelCase.

```yaml
sharding:
  groups:
    - group: 0
      members:
        - member: 0
          configs:
            primary-config:
              jdbc-url: jdbc:postgresql://primary-eu-1:5432/db
              username: app
              password: ${APP_PASSWORD}
              maximum-pool-size: 20
            secondary-config:
              jdbc-url: jdbc:postgresql://replica-eu-1:5432/db
              username: app
              password: ${APP_PASSWORD}
              maximum-pool-size: 10
            # Dedicated pool for KeyedLockProvider: keeps lock acquisitions
            # from competing for connections with normal queries.
            lock-config:
              jdbc-url: jdbc:postgresql://primary-eu-1:5432/db   # same DB; locks must coordinate on the same instance
              username: app
              password: ${APP_PASSWORD}
              maximum-pool-size: 40                # per-instance; sized for the concurrent leases this instance holds
              minimum-idle: 5
              leak-detection-threshold: 120000     # set comfortably above your largest expected maxHold
```

Pull it out and wire the provider:

```java
var lockDataSourceConfig = shardingConfig.groups.get(0).members.get(0)
        .configFor("lockConfig")
        .orElseThrow();

var lockProvider = postgresKeyedLockProvider()
        .connectionProvider(hikariConnectionProvider(lockDataSourceConfig))
        .build();
```

## Optimistic vs pessimistic: picking one

The same wallet deposit can be implemented either way:

- **Optimistic**: `walletRepo.getById(...)` → `wallet.deposit(amount)` → `plan().update(...)`. On a hot wallet, every concurrent caller hits `StaleRecordException` and the executor retries, which adds tail latency proportional to the contention.
- **Pessimistic**: take a `KeyedLockProvider` lease keyed on the wallet ID, then read-modify-write inside it. Callers wait briefly for the lease and then succeed on their first attempt. The retry loop disappears.

On hot keys the pessimistic version is usually the better trade. On low-contention paths the optimistic version is simpler and avoids the connection-per-holder cost.

## Caveat: not the right primitive at very high concurrency

`KeyedLockProvider` fits coarse-grained coordination (single-flight cron jobs, admin actions, per-shard locks) and low-to-medium contention on the write path. At higher concurrency, every held lease (and every thread blocked waiting for one) pins a JDBC connection for its entire lifetime, which can quickly demand more connections than your database is sized for.

If you're approaching that regime, consider:

- **Falling back to optimistic locking** with shorter retry delays, which is usually cheaper at high concurrency.
- **Application-level batching** to coalesce contending writes.
- **The Redis-backed provider**, which doesn't pin JDBC connections.

## See also

- [Models and Entities](../concepts/models-and-entities.md#optimistic-locking-always): the optimistic locking baseline
- [Sharding](sharding.md): where the `lock-config` / `lockConfig` slot lives
- [Distributed background jobs](../jobs/distributed-jobs.md): uses `KeyedLockProvider`-style cluster exclusivity through db-scheduler instead