# Lithos Rust - AI Agent Reference

## Critical Files - READ FIRST

**MUST** review these files before starting any work:

- **Project Context**: [Core rules and patterns](_bmad-output/project-context.md)
- **PRD**: [Product requirements](_bmad-output/planning-artifacts/prd.md)
- **Architecture**:
  - [Core Architectural Decisions](_bmad-output/planning-artifacts/architecture/03-core-architectural-decisions.md)
  - [Implementation Patterns & Consistency Rules](_bmad-output/planning-artifacts/architecture/04-implementation-patterns-consistency-rules.md)
  - [Project Structure & Boundaries](_bmad-output/planning-artifacts/architecture/05-project-structure-boundaries.md)

## BMAD Agent Activation

To activate specialized agents, use: `"As [agent-name], ..."` (e.g., `"As dev, implement the cache service"`)

**Available agents**: See [_bmad/_config/agent-manifest.csv](_bmad/_config/agent-manifest.csv) for full list

- **dev** - Implementation, debugging, refactoring
- **architect** - System design, ADRs, tech selection
- **tea** - Test strategy, quality gates
- **quick-flow-solo-dev** - Rapid prototyping
- **bmad-master** - General orchestration

**Available workflows**: See [_bmad/_config/workflow-manifest.csv](_bmad/_config/workflow-manifest.csv)

## Project-Specific Context

### Technology Stack

- **Language**: Rust (latest stable)
- **Architecture**: Module-based separation with unified Repository traits - business contexts isolated, file-driven state management
- **Key Libraries**: redb (zero-copy DB), rkyv (serialization), pulldown-cmark (markdown parsing)
- **Testing**: nextest, criterion benchmarks, tarpaulin coverage
- **Build**: cargo workspace with mise task orchestration

### Critical Coding Standards

- **Zero-copy patterns** for performance-critical paths via closure-based `with_archived()` methods
- **Unified Repository traits**: Single trait per context combining read and write operations
  - `schema::Repository` provides both reads (`get`, `list`) and writes (`save`,
`delete`)
  - Concrete implementations: `RedbStorage`, `InMemoryStorage`, `FakeStorage`
  - Closure-based access for zero-copy: `with_archived(&self, id, f: F)`
- **Optional View types**: `*View` types only when domain shape is inefficient
  - `Archived*` types (generated by rkyv) provide free query optimization
  - Introduce `*View` only when profiling reveals performance issues
  - Use `Loader` for orchestration, not domain methods
- **Context isolation**: Business contexts (note, schema, template) don't import each other
- **Type-driven validation**:
  - Raw layer validates syntax only (regex, type system)
  - Resolution layer validates semantics (refs exist, no cycles, depth limits)
- **Test-first development**: Red-green-refactor cycle required
- **ADR documentation**: All architectural decisions documented in [docs/adr/](docs/adr/)

### Project Structure

- `lithos-core/` - Core library with all business logic and infrastructure
  - `src/config/` - Configuration management context
  - `src/note/` - Note domain context
  - `src/schema/` - Schema domain context
  - `src/template/` - Template domain context
  - `src/db/` - Database infrastructure
  - `src/fs/` - Filesystem utilities
  - `benches/` - Performance benchmarks
- `lithos-cli/` - Command-line interface binary

For complete rules, see [_bmad-output/project-context.md](_bmad-output/project-context.md)

## Key Architectural Constraints

⚠️ **NON-NEGOTIABLE RULES**:

1. **Context isolation**: Business contexts (note, schema, template) MUST NOT import each other
   - Config is cross-cutting infrastructure (available to all contexts)
   - Only infrastructure (db, fs, config) may be imported by business contexts
2. **Unified Repository traits**: Single trait per context combining read and write operations
   - `schema::Repository` provides both reads (`get`, `list`) and writes (`save`, `delete`)
   - Concrete implementations: `RedbStorage`, `InMemoryStorage`, `FakeStorage`
   - Closure-based access for zero-copy: `with_archived(&self, id, f: F)`
3. **Type safety**: Private fields by default, validation at construction, newtype wrappers for domain constraints
4. **Zero-copy patterns**: Domain types have rkyv derives; use `*View` types only when domain shape is inefficient for storage
5. **Test-first**: Red-green-refactor cycle required - tests before implementation
6. **ADRs required**: Document all architectural decisions in [docs/adr/](docs/adr/)
7. **Dependency flow**: Infrastructure (db, fs, config) → Business Contexts (note, schema, template) → CLI
8. **File Ingestion Rules**:
   - **Repository traits MUST NOT have file I/O methods**: No `load_from_file`, `scan_directory`, etc.
   - **File ingestion MUST use `FsReader`**: Abstract over filesystem for testability
   - **Loader orchestrates pipelines**: Loader coordinates File → Raw → Domain → Storage
   - **Parsing and validation are distinct phases**: File → Raw (parsing) → Domain (validation) → Storage
9. **Optional View Pattern**:
   - **`*View` types are optional**: Only introduce when domain shape is inefficient for storage/queries
   - **`Archived*` provides free optimization**: rkyv-generated types offer zero-copy reads without custom views
   - **Loader handles orchestration**: All pipeline logic in context-specific loaders (e.g., `schema::loader`)
   - **Functional composition**: Direct function calls with `Result` for error propagation (no events required)

## Where Does This Code Go?
- **Pure business logic (no I/O)?** → `lithos-core/src/{context}/` (note, schema, template)
- **Cross-cutting configuration?** → `lithos-core/src/config/`
- **File ingestion orchestration?** → `lithos-core/src/schema/loader.rs`
- **File source abstraction?** → `lithos-core/src/fs/source.rs`
- **File parsing logic?** → `lithos-core/src/schema/ingestor.rs`
- **Database operations?** → `lithos-core/src/db/`
- **File system utilities?** → `lithos-core/src/fs/`
- **CLI interface?** → `lithos-cli/src/`
- **Tests for domain logic?** → Same file as impl with `#[cfg(test)]`
- **Integration tests?** → `lithos-core/tests/`
- **Benchmarks?** → `lithos-core/benches/`

## Critical Rust Patterns & Anti-Patterns

For deeper guidance on Rust style/module organization/tooling and crate-specific usage, start at [docs/refs/rust/README.md](docs/refs/rust/README.md).

### Naming Conventions (CRITICAL - Read First)

**All method and function names MUST follow our standardized taxonomy**: [docs/refs/rust/naming-taxonomy.md](docs/refs/rust/naming-taxonomy.md)

**Quick Reference for Repository Traits**:

- **Read methods**: `find_*` (optional), `get_*` (singleton), `list_*` (multiple), `count_*` (aggregates), `with_*` (zero-copy closure-based), `is_*` (boolean)
- **Write methods**: `create`, `save`, `update`, `delete`, `*_many` (bulk operations)
- **Conversions**: `as_*` (free), `to_*` (expensive), `into_*` (consumes)
- **Constructors**: `new()` (infallible), `try_new()` (fallible), `from_*` (conversions), `with_*` (builders)
- **NO `get_` prefix** on simple getters - use field name directly: `name()` not `get_name()`

See the full taxonomy document for comprehensive guidelines, examples, and anti-patterns.

### Repo-Specific Rust Notes (High Signal)

- **Clippy suppressions**: Prefer local `#[expect(clippy::lint_name, reason = "...")]` over `#[allow(...)]`; avoid crate/module-wide suppressions unless it’s a deliberate policy.
- **Doc tests**: `nextest` does not run doctests; when changing public docs/examples, also run `cargo test --doc`.
- **Module layout**: Contexts use the `{context}/mod.rs` pattern with submodules for organization.
- **Rustdoc hygiene**: For fallible/unsafe/panicking public APIs, document `# Errors`, `# Safety`, and/or `# Panics` (see [docs/refs/rust/style.md](docs/refs/rust/style.md)).

### Zero-Copy Library Footguns (Read Before Editing Hot Paths)

- **rkyv format control**: Treat endianness/alignment/pointer-width feature choices as a persisted-format contract; changing them is a breaking change for on-disk bytes (see [docs/refs/crates/rkyv.md](docs/refs/crates/rkyv.md)).
- **rkyv validation**: Use `rkyv::access` at trust boundaries (files/network/user input); reserve `access_unchecked` for trusted, internally produced bytes.
- **redb guards**: `AccessGuard` values borrow the transaction/table; do not return or store them beyond the transaction scope.
- **redb custom Value**: Due to orphan rules, implement `redb::Value` via local newtypes/wrappers when you need custom encoding.
- **moka determinism**: Cache stats are eventually consistent; in tests that assert stats/entry counts, call `run_pending_tasks()`.
- **moka callbacks**: Eviction listeners must not panic and should be fast (they run on user threads).

## Rust Idioms (Rules)

These rules operationalize common Rust idioms for day-to-day Lithos development. For deeper rationale and examples, see [docs/refs/rust/idioms.md](docs/refs/rust/idioms.md).

### API & Ownership

- Prefer borrowed arguments in APIs: take `&str`, `&Path`, slices, and `&T` (or `impl AsRef<T>` / `impl Borrow<T>`) instead of `String`/`PathBuf`/owned types unless ownership is required.
- Use `impl Trait`/generics for “accept anything that can be viewed as X” APIs; reserve `&dyn Trait` for intentional runtime polymorphism.
- When ownership is required, make it explicit: take `T`/`Box<T>`/`Arc<T>` by value and document the transfer.
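
The three ownership rules above can be sketched as follows. This is a minimal illustration, not code from the Lithos codebase: `slugify`, `is_markdown`, and `NoteIndex` are hypothetical names chosen for the example.

```rust
use std::path::{Path, PathBuf};
use std::sync::Arc;

/// Borrowed input: callers may pass `&str`, `&String`, or a substring.
fn slugify(title: &str) -> String {
    title
        .chars()
        .map(|c| if c.is_ascii_alphanumeric() { c.to_ascii_lowercase() } else { '-' })
        .collect()
}

/// Accepts anything viewable as a path: `&str`, `String`, `&Path`, `PathBuf`.
fn is_markdown(path: impl AsRef<Path>) -> bool {
    path.as_ref().extension().is_some_and(|ext| ext == "md")
}

/// Hypothetical type that stores its inputs, so it takes them by value.
struct NoteIndex {
    root: PathBuf,
    titles: Arc<[String]>,
}

impl NoteIndex {
    /// Ownership of `root` and of one `Arc` handle to `titles` moves into the index.
    fn new(root: PathBuf, titles: Arc<[String]>) -> Self {
        Self { root, titles }
    }

    /// Simple getter named after the field (no `get_` prefix).
    fn root(&self) -> &Path {
        &self.root
    }

    fn title_count(&self) -> usize {
        self.titles.len()
    }
}

fn main() {
    let owned = String::from("Hello World");
    // `&String` coerces to `&str`, so no clone is needed at the call site.
    println!("{}", slugify(&owned));
    println!("{}", is_markdown("notes/todo.md"));
    println!("{}", is_markdown(PathBuf::from("notes/todo.txt")));

    // The caller keeps its own handle and gives the index a cheap `Arc` clone.
    let titles: Arc<[String]> = Arc::from(vec![owned]);
    let index = NoteIndex::new(PathBuf::from("notes"), Arc::clone(&titles));
    println!("{} {}", index.root().display(), index.title_count());
}
```

Taking `PathBuf` and `Arc<[String]>` by value in `NoteIndex::new` makes the transfer visible in the signature, while the two free functions stay allocation-free on the caller's side.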
### Construction & Defaults

- Use conventional constructors: `new()` for infallible, `try_new()` / `new_checked()` for fallible, and `from_*`/`try_from_*` conversions via `From`/`TryFrom`.
- Prefer builders when there are many optional parameters or invariants to enforce; keep `new()` small and unsurprising.
- Implement or derive `Default` when a sensible default exists; prefer struct update syntax (`..Default::default()`) for ergonomic initialization.

### Strings & Formatting

- Use `format!`/`write!`/`writeln!` for structured string construction; avoid repeated `+` concatenation in loops.
- Accept string inputs as `&str` (or `impl AsRef<str>` when appropriate); store immutable string data as `Box<str>` when ownership is needed and mutability isn’t.

### Mutation, Moves, and Invariants

- Keep `mut` scopes tight: prefer temporary mutability (shadowing) to long-lived `mut` bindings.
- When you need to move out of a field or replace a value, prefer `std::mem::take` / `std::mem::replace` over cloning.
- Prefer iterators over indexing; when indexing is unavoidable, use `.get()` and handle `None`.
- Treat `Option` as an iterable for control flow: use `if let`, `while let`, `.into_iter()`, and combinators (`map`, `and_then`, `ok_or`) instead of sentinel values.

### Resource Management

- Use RAII: acquire resources in constructors and release in `Drop`; avoid “manual close” APIs unless required for performance or correctness.
- Never panic across FFI boundaries; Rust must not unwind into C.

### Closures & Captures

- Be explicit about closure capture semantics: use `move` when the closure must own captured values.
- When a closure needs owned data but the surrounding scope still needs it, explicitly rebind (e.g., clone an `Arc`/`String` into a new binding) rather than fighting the borrow checker.

### Extensibility & Public API Evolution

- For public enums/structs intended to evolve, use `#[non_exhaustive]` (or private fields) to prevent downstream exhaustive construction/matching.
- When matching on non-exhaustive enums, always include a wildcard arm to preserve forward compatibility.

### Documentation & Doctests

- Write rustdoc examples as compilable code; hide setup noise in doctests using `#` lines to keep examples readable.

### Error Handling & FFI Interop

- Prefer `Result<T, E>` with structured errors (`thiserror` in non-domain crates); avoid `unwrap()`/`expect()` in production.
- For fallible operations that consume an input, prefer returning the consumed value on failure (e.g., `Result` or an error type that carries the input) when it materially improves recovery.
- In FFI:
  - Accept strings as `*const c_char` + `CStr`; pass strings as `CString`/`*const c_char` with clear ownership rules.
  - Return errors as status codes and/or out-parameters; ensure all FFI-exposed functions are `extern "C"` and panic-free.

### ✅ Always Do

- **Error handling**: Use `Result<T, E>` with the `?` operator, never `unwrap()`/`expect()` in production
- **Paths**: Use `PathBuf` (owned) or `&Path` (borrowed), NEVER `String` for file paths
- **String efficiency**: Use `&str` for borrows, `Box<str>` for immutable data, `String` only when mutable
- **Async blocking**: Use `tokio::task::spawn_blocking` for any `std::fs` or CPU-intensive work
- **Collections**: Use `.get()` instead of `[index]`, `entry()` API for HashMap updates
- **Conversion traits**: Implement `From/Into` for infallible conversions, `TryFrom/TryInto` for fallible ones
- **Lifetimes as documentation**: `fn get<'a>(&'a self) -> Guard<'a>` shows zero-copy, `fn get(&self) -> T` hides allocation
- **Box over String**: Use `Box<str>` for immutable string storage to avoid heap over-allocation
- **Static strings**: Use `"literal".into()` instead of `"literal".to_owned().into()` for error fields

### ❌ Never Do

- **String cloning for paths**: Path operations must use `Path`/`PathBuf` APIs
- **Clone in traits**: `trait Cache<T: Clone>` forces all implementations to allocate
- **Unwrap/panic**: Use `?`, `ok_or()`, `context()` - panics crash
the process
- **Async mutex across await**: NEVER hold `std::sync::MutexGuard` across `.await` (deadlock risk)
- **Numeric casting with 'as'**: Use `.try_into()?` to catch overflow/truncation errors
- **Generic `String` errors**: Use `thiserror` for structured errors with context
- **Ad-hoc conversions**: Don't write `to_x()` methods - use `From/Into` traits instead
- **Unnecessary to_owned()**: NEVER use `"text".to_owned().into()` - use `"text".into()` directly

### String Allocation Anti-Patterns (Must Avoid)

These patterns create unnecessary heap allocations:

1. **`"text".to_owned().into()`** → Use `"text".into()` instead
   - `Box<str>: From<&str>` builds the box directly from the literal, with no intermediate `String`
   - Found 100+ occurrences in codebase before fixes
2. **Unnecessary to_string() for errors**
   - Error fields using `String` type: prefer `"literal".into()` over `to_string()`
   - Only allocate when the error message actually needs the full String
3. **UUID to_string() in hot paths**
   - Database lookups using `id.to_string()` allocate 36 bytes per call
   - Consider: thread-local buffers, UUID-native DB methods, or adapter-level caching
4. **Case conversion for pre-validated data** (context-specific)
   - Only applies when data is already validated to be lowercase via regex (e.g., `^[a-z0-9_-]+$`)
   - In this case, `to_lowercase()` is redundant and allocates unnecessarily

### Zero-Copy API Patterns

For performance-critical paths (LSP queries, hot database reads):

```rust
// `T` stands for the context's domain type; `Guard` is the borrow-holding wrapper.

// ✅ GOOD: Closure-based zero-copy access
fn with_archived<F, R>(&self, id: Id, f: F) -> Result<Option<R>, Error>
where
    F: for<'a> FnOnce(&'a Archived<T>) -> R;

// ❌ BAD: Returning guards requires self-referential structs
fn get_archived(&self, id: Id) -> Result<Guard<'_>, Error>;
```

When implementing Repository traits:

- Prefer closure-based `with_archived()` over returning guards
- Avoid complex lifetime patterns that require `self_cell` or GAT Guard patterns

## Definition of Done

Before marking any task complete:

- [ ] All tests pass (`mise run test`)
- [ ] Code formatted (`mise run fmt`)
- [ ] No clippy warnings (`mise run lint`)
- [ ] All public APIs have tests (functions, methods, traits)
- [ ] Tests cover critical paths and business logic (not chasing % targets)
- [ ] No `unwrap()`/`panic!` in production code
- [ ] Context boundaries respected (business contexts isolated, no cross-imports)
- [ ] Unified Repository pattern followed (single trait per context)
- [ ] Type-driven design applied (private fields, validated constructors)
- [ ] Documentation updated (doc comments for public APIs)
- [ ] Doc tests run when docs/examples changed (`cargo test --doc`)
- [ ] ADR created if architectural decision made
- [ ] **No string allocation anti-patterns**: No `.to_owned().into()`, no unnecessary `.to_lowercase()`, no `.to_string()` in hot paths

## Before Submitting Work

1. **Run full verification**: `mise run verify` must be 100% green
2. **Review test quality**: Critical paths tested, edge cases covered
3. **Code hygiene check**: No debug prints, commented code, or TODOs
4. **Documentation**: If the change is architectural, an ADR exists in `docs/adr/`

## Common Commands (mise tasks)

| Command | Action |
| :--- | :--- |
| `mise run verify` | Full quality gate orchestration (fmt + lint + tests + adr:validate) (alias: `v`). |
| `mise run quality` | Run all quality gates (fmt, lint, adr:validate) (alias: `q`). |
| `mise run lint` | Run linting checks using clippy. |
| `mise run fmt` | Format code using rustfmt. |
| `mise run deny` | Check dependencies for security and license issues. |
| `mise run clean` | Clean build artifacts and temporary files. |
| `mise run clean:cargo` | Clean only cargo build artifacts. |
| `mise run clean:test` | Clean only test output artifacts. |
| `mise run clean:reports` | Clean only coverage and JUnit reports. |
| `mise run build` | Build the project binaries. |
| `mise run doc` | Generate and open project documentation. |
| `mise run dev-setup` | Set up development environment and dependencies. |
| `mise run adr:validate` | Validate ADR files for compliance. |
| `mise run adr:metrics` | Generate metrics for ADR management. |
| `mise run ci` | Simulate CI/CD pipeline. |
| `mise run timing` | Run verify with detailed timing information. |
| `mise run test` | Run all tests (unit, integration, e2e) (alias: `t`). |
| `mise run test:unit` | Run all unit tests using `nextest` (alias: `tu`). |
| `mise run test:unit:core` | Run core crate unit tests (alias: `tucore`). |
| `mise run test:unit:cli` | Run CLI crate unit tests (alias: `tucli`). |
| `mise run test:unit:config` | Run config module unit tests (alias: `tuconf`). |
| `mise run test:unit:note` | Run note module unit tests (alias: `tunote`). |
| `mise run test:unit:schema` | Run schema module unit tests (alias: `tusch`). |
| `mise run test:unit:template` | Run template module unit tests (alias: `tutemp`). |
| `mise run test:unit:db` | Run db module unit tests (alias: `tudb`). |
| `mise run test:unit:fs` | Run fs module unit tests (alias: `tufs`). |
| `mise run test:bench` | Run all performance benchmarks using `criterion`. |
| `mise run test:bench:core` | Run core crate benchmarks (alias: `tbcore`). |
| `mise run test:bench:cli` | Run CLI crate benchmarks (alias: `tbcli`). |
| `mise run test:integration` | Run all integration tests across the workspace (alias: `ti`). |
| `mise run test:e2e` | Run end-to-end tests (alias: `te`). |
| `mise run test:coverage` | Generate code coverage reports using `tarpaulin` (alias: `tc`). |
| `mise run test:watch` | Watch mode: automatically run tests on file changes (alias: `tw`). |
| `mise run test:burn-in` | Run tests repeatedly to detect flaky failures (alias: `tb`). |
| `mise run test:changed` | Run tests only for crates affected by changes (alias: `tc`). |