# Architecture
OpenChronicle is a single daemon that ingests capture events, compresses them through a deterministic funnel, and classifies the result into durable Markdown memory. There is only one ingestion path — no modes.
```mermaid
flowchart LR
W[mac-ax-watcher
Swift binary]
subgraph capture [Capture Layer]
direction TB
S0["S0 event_dispatcher
dedup · debounce · min-gap"]
S1["S1 s1_parser
focused_element · visible_text · url"]
BUF[(capture-buffer/*.json)]
S0 --> S1 --> BUF
end
subgraph compress [Compression Layer]
direction TB
TL["Timeline aggregator · LLM
1-min normalized blocks
verbatim-preserving"]
BLOCKS[(timeline_blocks)]
SM["Session manager
3-rule cutter
active → ended"]
S2["S2 session_reducer · LLM
async thread"]
TL --> BLOCKS
BLOCKS -- read window --> S2
SM -. trigger: flush 5m / on_session_end .-> S2
end
subgraph memory [Memory Layer]
direction TB
ED[(event-YYYY-MM-DD.md)]
CLF["Classifier · LLM
tool-call loop · 30 m tick + terminal"]
MF[(user- · project- · tool- ·
topic- · person- · org-*.md)]
CMP["Compact · LLM
on-demand"]
ED --> CLF --> MF
MF -. read / rewrite .-> CMP
CMP -. supersede .-> MF
end
subgraph query [Query Layer]
direction TB
FTS[(SQLite FTS5
entries_fts · captures_fts)]
MCP["MCP server
127.0.0.1:8742/mcp"]
AG[Tool-capable agents
Claude Code · Desktop · Cursor · Codex · …]
FTS --> MCP --> AG
end
W --> S0
BUF -. pre_capture_hook
(post-write · skipped on content-dedup) .-> SM
BUF --> TL
S2 --> ED
BLOCKS -. grounding .-> CLF
MF --> FTS
ED --> FTS
BUF -. indexed .-> FTS
```
## Runtime sequence
A typical 5-minute flush window, showing how one AX event propagates through to durable memory:
```mermaid
sequenceDiagram
participant W as mac-ax-watcher
participant S0 as S0 dispatcher
participant S1 as S1 parser
participant BUF as capture-buffer
participant SM as Session mgr
participant TL as Timeline tick
participant R as S2 reducer
participant CLF as Classifier
participant DB as SQLite + memory/
participant MCP as MCP / agent
W->>S0: AX event
S0->>S0: debounce / dedup / min-gap
S0->>S1: schedule capture runner (threaded)
S1->>BUF: write enriched {iso}.json
Note right of BUF: content-fingerprint dedup
drops consecutive duplicates
BUF->>SM: pre_capture_hook → on_event
(post-write · skipped on content-dedup)
Note over TL,BUF: timeline tick · every 60 s
TL->>BUF: scan closed 1-min windows
TL->>DB: LLM → insert timeline_blocks
Note over SM,R: flush tick · every 5 min
SM->>R: reduce(flush_end → now)
R->>DB: read blocks · LLM · append [flush] entry
R->>SM: advance flush_end
Note over CLF,DB: classifier tick · every 30 min
CLF->>DB: read event-daily (tagged sid:)
+ timeline_blocks in window (grounding)
CLF->>DB: LLM tool-call loop → update memory files
CLF->>SM: advance classified_end
Note over SM,CLF: on_session_end
(idle / soft-cut / timeout / shutdown / 23:55)
SM->>R: terminal reduce (full trailing range)
R->>DB: final entry
R-->>CLF: on_done callback
CLF->>DB: classify trailing window
Note over MCP,DB: any time
MCP->>DB: FTS search / list / read
DB-->>MCP: results
```
## Tasks in the daemon
Defined in `src/openchronicle/daemon.py`.
| Task | Purpose |
|---|---|
| `capture` | Consumes `mac-ax-watcher` events, debounces, writes enriched JSON captures (incl. S1 fields) to `~/.openchronicle/capture-buffer/`. Heartbeat catches quiet periods. Also calls `SessionManager.on_event` on every capture so the session cutter sees the same signal. |
| `timeline` | Every 60s scans for closed wall-clock windows (default 1 min) and runs the `timeline` LLM stage to normalize each window while preserving authored text verbatim. Cleans buffer files older than the newest block. |
| `session` | Every `session.tick_seconds` (default 30), calls `SessionManager.check_cuts()` so idle-gap and timeout cuts fire even when the dispatcher is quiet. |
| `flush` | Every `session.flush_minutes` (default 5, clamped to 5-min floor), runs the reducer incrementally over the active session's newly closed timeline blocks (~5 of them at defaults) and appends `[flush]`-tagged partial entries to today's event-daily. |
| `classifier-tick` | Every `classifier.interval_minutes` (default 30, min 5), runs the classifier over any event-daily entries appended since the session's `classified_end` bookmark. Silent no-op when no new entries have landed. |
| `daily-safety-net` | Once per local day at `reducer.daily_tick_hour:minute` (default 23:55), force-ends the currently-open session and reduces every stranded `ended`/`failed` session row — the "we survived a crash or midnight rollover" safety net. |
| `mcp` | Hosts the Reader MCP server inside the daemon. Exponential backoff on crash. |
The session cutter itself doesn't have a dedicated task — it runs inline on every capture via the `pre_capture_hook` wired in `daemon.py`. Session-end callbacks spawn the reducer on a daemon thread, which then fires the terminal classifier via its success callback (covering any trailing window the 30-min tick didn't reach). Each session's progress on both stages is bookkept on its sessions row: `flush_end` for the reducer, `classified_end` for the classifier.
`--capture-only` disables the timeline aggregator and MCP server. Capture, session, and the daily-safety-net still run so session rows land on disk.
## The session boundary
Three rules (ported verbatim from Einsia-Partner), all enforced in `session/manager.py`:
1. **Hard cut.** No capture-worthy events for `session.gap_minutes` (default 5) → close the session at the last event's timestamp.
2. **Soft cut.** A single unrelated app is focused for `session.soft_cut_minutes` (default 3) unless ≥2 distinct apps were focused in the preceding 2 minutes (frequent-switching defuses the rule).
3. **Timeout.** A session older than `session.max_session_hours` (default 2) is force-cut regardless.
Force-end is also called on daemon shutdown and on the 23:55 safety net, so a session never outlives the process that opened it.
## On-disk state
```
~/.openchronicle/
├── config.toml # single source of truth for runtime config
├── .pid # daemon PID; absence ⇒ stopped
├── .paused # sentinel — capture skips while present
├── index.db # SQLite WAL; entries / files / timeline_blocks / sessions
├── capture-buffer/ # S1-enriched {iso8601}.json captures
├── memory/
│ ├── index.md # auto-generated overview
│ ├── event-YYYY-MM-DD.md # one file per day, one entry per reduced session
│ ├── user-*.md # identity, preferences (durable)
│ └── project-*.md / tool-*.md / topic-*.md / person-*.md / org-*.md
└── logs/
├── capture.log # watcher events, dedup, writes
├── timeline.log # window scan, block production, buffer cleanup
├── session.log # cut decisions, session-end events
├── writer.log # reducer + classifier runs, tool calls
├── compact.log # compact rounds with preservation ratios
└── daemon.log # lifecycle + MCP server
```
SQLite is opened with WAL mode — the MCP reader and the writer paths coexist without blocking.
## Code layout
```
src/openchronicle/
├── cli.py # Typer entry point
├── daemon.py # Async task orchestration
├── config.py # TOML loader, per-stage ModelConfig inheritance
├── paths.py # ~/.openchronicle/* paths
├── logger.py # Rotating file sinks per component
├── capture/
│ ├── watcher.py # Spawns mac-ax-watcher, parses JSONL
│ ├── event_dispatcher.py # Debounce / dedup / min-gap
│ ├── ax_capture.py # One-shot mac-ax-helper invocation
│ ├── ax_models.py # ax_tree_to_markdown, prune helpers
│ ├── s1_parser.py # Enriches captures with focused_element / visible_text / url
│ ├── screenshot.py # mss + PIL → base64 JPEG
│ ├── window_meta.py # foreground app / title / bundle_id
│ └── scheduler.py # Capture loop + buffer cleanup
├── timeline/
│ ├── store.py # timeline_blocks schema + CRUD
│ ├── aggregator.py # Captures-in-window → LLM → entries list
│ └── tick.py # Every-minute scan for closed windows
├── session/
│ ├── store.py # sessions table + retry bookkeeping
│ ├── manager.py # 3-rule session cutter
│ └── tick.py # Daemon wiring: check_cuts loop + daily safety net
├── writer/
│ ├── agent.py # CLI entry: catch up pending sessions + classify
│ ├── session_reducer.py # S2: session → event-YYYY-MM-DD.md entry
│ ├── classifier.py # Extracts durable facts via the tool-call loop
│ ├── tools.py # read/search/append/create/supersede/commit
│ ├── compact.py # Per-file compaction with fact-preservation check
│ └── llm.py # litellm wrapper; per-stage config
├── store/
│ ├── fts.py # SQLite FTS5 schema, search, cursor context manager
│ ├── files.py # Markdown + YAML frontmatter IO
│ ├── entries.py # Entry format, supersede logic, rebuild_index
│ └── index_md.py # Rebuild memory/index.md from the files table
├── mcp/
│ ├── server.py # FastMCP server + tool definitions
│ └── captures.py # Read-side helpers for raw capture buffer + captures_fts
└── prompts/
├── timeline_block.md # short-window normalizer (verbatim-preserving)
├── session_reduce.md # S2 reducer
├── classifier.md # Durable-fact extraction
├── compact.md # Compaction
└── schema.md # Full memory spec — also returned by MCP get_schema
```
## Why this shape
- **Compression first, classification second.** S1 → Timeline → S2 is a deterministic funnel with bounded prompt size at each step. By the time the classifier runs it sees a session-level summary, not raw AX snapshots — so there is no "is this worth writing?" triage call; the classifier just extracts any durable facts it finds, or skips.
- **Session as the natural unit.** A "session" — a bounded chunk of focused work — is what humans remember. Cutting on idle / app-switch / timeout produces event-daily entries with accurate time ranges, which solves the v1 problem of long sessions being under-reported after the first append.
- **Periodic classifier, bookmarked.** The classifier fires on a 30-min interval during each active session, then one last trailing-window pass at session end. Each pass advances the session's `classified_end` bookmark, so entries are never double-classified and long sessions produce durable facts without waiting to close.
- **Daily event files.** `event-YYYY-MM-DD.md` sorts alphabetically by day. Weekly files from v1 are left untouched — they stay searchable via FTS.
- **One process, many tasks.** Avoids IPC overhead and keeps `index.db` single-writer in practice. SQLite WAL gives the MCP reader what it needs.
- **MCP inside the daemon.** External MCP clients get a stable localhost URL instead of spawning a fresh stdio subprocess per session.