# nano-brain Roadmap

> Last updated: 2026-06-23

---

## Vision

nano-brain is a persistent memory and code intelligence layer for AI coding agents.
Goal: agents know the project context and decision history and can anticipate what's needed next, across sessions, machines, and team members.

---

## Pillar 1: Code Intelligence

**What:** Understand the codebase like a senior engineer.

| Feature | Description | Status |
|---|---|---|
| File indexing | Watch + chunk + embed all source files | ✅ |
| Symbol extraction | Functions, types, interfaces, constants | ✅ |
| Knowledge graph | Module → function → dependency relationships | ✅ |
| Impact analytics | Change X → affects Y, Z (cross-file) | ✅ |
| Call chain tracing | Trace execution path from entry point | ✅ |
| Control-flow graphs | CFG extraction with branch-aware edges | ✅ |
| Sequence diagrams | Mermaid sequence diagrams from flow data | ✅ |
| Ruby/Rails support | Rails routes, controller→service→model chains | ✅ |
| Ruby cross-file resolution | Class→file index, resolver, reconcile edges | ✅ |
| Ruby CFG extraction | `if`/`else`, loops, `begin`/`rescue`, method defs | ✅ |
| Nuxt/Next frontend support | Framework-aware route extraction, API handlers, page/component graph traversal for modern frontend repos | ❌ Planned |
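
As an illustration of the "Change X → affects Y, Z" row, impact analysis can be sketched as a breadth-first walk over reversed dependency edges. This is a minimal Python sketch, not nano-brain's Go implementation; the file names and the `DEPENDENCIES` map are hypothetical.

```python
from collections import defaultdict, deque


def build_reverse_graph(dependencies):
    """Invert 'A depends on B' edges so we can walk from a changed node to its dependents."""
    reverse = defaultdict(set)
    for source, targets in dependencies.items():
        for target in targets:
            reverse[target].add(source)
    return reverse


def impacted_by(changed, dependencies):
    """Return every node that transitively depends on `changed`, in breadth-first order."""
    reverse = build_reverse_graph(dependencies)
    seen = {changed}
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        # Sort for a deterministic result; sets have no stable order.
        for dependent in sorted(reverse[node]):
            if dependent not in seen:
                seen.add(dependent)
                order.append(dependent)
                queue.append(dependent)
    return order


# Hypothetical edges: each key calls or imports the files listed as its value.
DEPENDENCIES = {
    "handlers/login.go": ["auth/session.go", "auth/token.go"],
    "auth/session.go": ["auth/token.go"],
    "cmd/server.go": ["handlers/login.go"],
}
```

Calling `impacted_by("auth/token.go", DEPENDENCIES)` lists the direct dependents first and the transitive ones after them, which matches how an agent would want to prioritize files to re-check.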

---

## Pillar 2: Session Harvesting

**What:** Collect and summarize sessions from AI tools, scoped per workspace.

| Feature | Description | Status |
|---|---|---|
| OpenCode SQLite harvester | Parse `opencode.db`, extract sessions/messages | ✅ |
| Claude Code JSONL harvester | Parse `ses_*.jsonl` transcripts | ✅ |
| Workspace filtering | Only harvest sessions matching registered workspace paths | ✅ |
| LLM summarization pipeline | Map-reduce chunking, token-bucket rate limiter | ✅ |
| Incremental harvest | Track last-harvested timestamp, dedup by session ID | ✅ |
| Summary persistence | `.md` files + vector DB (`session-summary` collection) | ✅ |
| Embed queue workspace isolation | Queue scan scoped to registered workspaces only | ✅ |
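
The workspace filtering and incremental harvest rows can be pictured with the following Python sketch. It assumes each session records a working directory and a last-updated timestamp; the `Session` type, its field names, and the `last_harvested` map are hypothetical, not the actual schema.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Session:
    session_id: str
    directory: str   # working directory recorded for the session (assumed field)
    updated_at: int  # Unix seconds of the last message (assumed field)


def in_workspace(directory, workspaces):
    """True if `directory` is a registered workspace root or lies inside one."""
    path = PurePosixPath(directory)
    for root in workspaces:
        root_path = PurePosixPath(root)
        # Compare path components, not string prefixes, so /app2 never matches /app.
        if path == root_path or root_path in path.parents:
            return True
    return False


def select_for_harvest(sessions, workspaces, last_harvested):
    """Keep sessions in a registered workspace that are new or changed since the last harvest."""
    selected = []
    for session in sessions:
        if not in_workspace(session.directory, workspaces):
            continue
        if session.updated_at <= last_harvested.get(session.session_id, -1):
            continue  # already harvested in its current state
        selected.append(session)
    return selected
```

Comparing path components rather than raw string prefixes is the design choice worth noting: a plain `startswith` check would let `/home/dev/app2` leak into a workspace registered as `/home/dev/app`.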

### Architecture

```
opencode.db / ses_*.jsonl
  → filter by workspace path
  → extract messages
  → map-reduce LLM summary (token-bucket rate limited)
  → chunk → embed → index (session-summary collection)
  → .md summary file to output_dir
```
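
The map-reduce step in the pipeline above can be sketched as follows. This is an illustrative Python sketch under stated assumptions: `summarize` stands in for the LLM call, `estimate_tokens` is a crude word count rather than a real tokenizer, and the rate limiter is left out.

```python
def estimate_tokens(text):
    """Rough token estimate: one token per whitespace-separated word (an assumption)."""
    return max(1, len(text.split()))


def chunk_messages(messages, max_tokens):
    """Group consecutive messages into chunks whose estimated size stays within max_tokens.

    A single message larger than max_tokens still becomes its own chunk.
    """
    chunks, current, used = [], [], 0
    for message in messages:
        cost = estimate_tokens(message)
        if current and used + cost > max_tokens:
            chunks.append(current)
            current, used = [], 0
        current.append(message)
        used += cost
    if current:
        chunks.append(current)
    return chunks


def map_reduce_summary(messages, summarize, max_tokens):
    """Map: summarize each chunk. Reduce: summarize the joined partial summaries."""
    chunks = chunk_messages(messages, max_tokens)
    if not chunks:
        return ""
    partials = [summarize("\n".join(chunk)) for chunk in chunks]
    if len(partials) == 1:
        return partials[0]  # nothing to reduce
    return summarize("\n".join(partials))
```

The sketch performs a single reduce pass; a very long session could need a recursive reduce if the joined partial summaries themselves exceed the budget.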

### Config

```yaml
harvester:
  opencode:
    session_dir: ~/.local/share/opencode/storage
  claudecode:
    enabled: false
    session_dir: ~/.claude/transcripts/

summarization:
  enabled: true
  provider_url: "https://ai-proxy.example.com/v1"
  model: "claude-sonnet-4-5"
  max_tokens: 4096
  concurrency: 3
  output_dir: "~/.nano-brain/summaries"
```

---

## Pillar 3: Memory & Developer Experience

**What:** Persistent cross-session memory + ergonomic tooling.

| Feature | Description | Status |
|---|---|---|
| Write memory | `nano-brain write "..."` | ✅ |
| Semantic search | `nano-brain query "..."` | ✅ |
| Tag-based filter | `--tags decision,auth` | ✅ |
| Supersede | Replace stale memory entries | ✅ |
| Auto-memory from sessions | Extract decisions from harvested sessions | ✅ |
| 14 MCP tools | query, search, vsearch, get, write, tags, status, update, wake_up, graph, trace, impact, symbols, flow | ✅ |
| Hybrid search pipeline | BM25 + pgvector HNSW + RRF fusion + recency decay | ✅ |
| BM25 OR fallback | Retry with OR semantics when AND returns 0 results | ✅ |
| Debugging-aware search | Parallel search mode for debugging queries | ✅ |
| Incoming edges symbol fallback | Fallback to symbol name when target_node lookup fails | ✅ |
| Benchmarking suite | generate, run, compare, stress | ✅ |
| Workspace-specific benchmarks | Queries tailored to each project's domain | ✅ |
| Init onboarding wizard | Interactive config setup on first run | ✅ |
| Doctor command | Check prerequisites (PG, pgvector, Ollama, model) | ✅ |
| V1 SQLite migration | Import from V1 format (pure Go, no CGO) | ✅ |
| Config hot-reload | `POST /api/reload-config` | ✅ |
| Search telemetry | Local-only, 90-day retention, non-blocking | ✅ |
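
To make the hybrid search row concrete, here is a minimal Python sketch of Reciprocal Rank Fusion followed by a recency decay. It is not the project's implementation: `k=60` is the constant commonly used for RRF, while the exponential half-life form and the 30-day default are assumptions made for illustration.

```python
def rrf_fuse(rankings, k=60):
    """Reciprocal Rank Fusion: score(d) = sum over rankings of 1 / (k + rank), rank from 1."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def apply_recency(scores, age_days, half_life_days=30.0):
    """Halve a document's score for every half_life_days of age (assumed decay form)."""
    return {
        doc_id: score * 0.5 ** (age_days.get(doc_id, 0.0) / half_life_days)
        for doc_id, score in scores.items()
    }


def hybrid_rank(bm25, vector, age_days, k=60, half_life_days=30.0):
    """Fuse a BM25 ranking and a vector ranking, decay by age, return ids best-first."""
    fused = apply_recency(rrf_fuse([bm25, vector], k), age_days, half_life_days)
    # Ties are broken by id so the output is deterministic.
    return sorted(fused, key=lambda doc_id: (-fused[doc_id], doc_id))
```

RRF only looks at ranks, so BM25 scores and vector distances never need to be normalized onto a common scale before fusion.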

---

## Pillar 4: Team & Multi-user

**What:** Shared knowledge base for the whole team — one server, multiple users, role-controlled access.

**Use case:** Deploy one nano-brain server for the entire team. Every developer's AI agent connects to the same PostgreSQL instance — decisions, architecture notes, and code intelligence are instantly shared. New team members get full project context from day one without any per-machine setup.

### Authentication

| Method | Description | Status |
|---|---|---|
| Bearer token auth | Single shared token for all users | ✅ |
| Basic auth | Username/password per user | ✅ |
| TLS termination | HTTPS support (native or reverse proxy) | ❌ |
| Rate limiting | Per-user, per-IP request limits | ❌ |
| CORS configuration | Restrict allowed origins | ❌ |

### Authorization (Role-Based Access Control)

| Role | Description | Status |
|---|---|---|
| Admin | Full read/write + config + workspace management | ❌ |
| Developer | Read/write memory, scoped to assigned workspaces | ❌ |
| Reader | Read-only access (search, get, wake-up, status) | ❌ |

### Role Matrix

| Operation | Admin | Developer | Reader |
|---|---|---|---|
| `memory_query` / `memory_search` / `memory_vsearch` | ✅ | ✅ | ✅ |
| `memory_get` / `memory_wake_up` / `memory_status` | ✅ | ✅ | ✅ |
| `memory_write` / `memory_update` | ✅ | ✅ | ❌ |
| `memory_graph` / `memory_impact` / `memory_trace` | ✅ | ✅ | ✅ |
| Workspace init / delete | ✅ | ❌ | ❌ |
| Config reload / patch | ✅ | ❌ | ❌ |
| Collection create / delete | ✅ | ✅ | ❌ |
| Reindex / harvest | ✅ | ✅ | ❌ |
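
The matrix above could be enforced with a deny-by-default lookup, sketched here in Python. The `memory_*` names come from the table; the other operation identifiers (`workspace_init`, `config_reload`, and so on) are hypothetical labels for the table rows, since RBAC is not implemented yet.

```python
ALL_ROLES = frozenset({"admin", "developer", "reader"})
WRITERS = frozenset({"admin", "developer"})
ADMIN_ONLY = frozenset({"admin"})

# One entry per row of the role matrix; non-memory operation names are hypothetical.
_ROWS = [
    (("memory_query", "memory_search", "memory_vsearch"), ALL_ROLES),
    (("memory_get", "memory_wake_up", "memory_status"), ALL_ROLES),
    (("memory_write", "memory_update"), WRITERS),
    (("memory_graph", "memory_impact", "memory_trace"), ALL_ROLES),
    (("workspace_init", "workspace_delete"), ADMIN_ONLY),
    (("config_reload", "config_patch"), ADMIN_ONLY),
    (("collection_create", "collection_delete"), WRITERS),
    (("reindex", "harvest"), WRITERS),
]

ROLE_MATRIX = {op: roles for ops, roles in _ROWS for op in ops}


def is_allowed(role, operation):
    """Return True only if the matrix explicitly grants `operation` to `role`."""
    return role in ROLE_MATRIX.get(operation, frozenset())
```

Unknown operations and unknown roles are both rejected, so adding a new endpoint without updating the matrix fails closed rather than open.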

### Deployment Options

| Option | Description | Status |
|---|---|---|
| Local machine | Ollama + Docker, single user | ✅ |
| VPS / team server | Shared memory across machines | ✅ |
| Build from source | Go binary, no CGO | ✅ |
| Docker Compose | Production-ready container setup | ❌ |
| Kubernetes / Helm | Cloud-native deployment | ❌ |
| Cloud managed | AWS RDS, GCP Cloud SQL, Azure DB | ❌ |

### Config sketch (proposed)

```yaml
server:
  auth:
    enabled: true
    users:
      - username: alice
        password_hash: "$2a$10$..."
        role: admin
      - username: bob
        password_hash: "$2a$10$..."
        role: developer
        workspaces: ["abc123..."]  # scoped to specific workspaces
      - username: reviewer
        password_hash: "$2a$10$..."
        role: reader
    tokens:
      - token: "nbt_admin_..."
        role: admin
      - token: "nbt_dev_..."
        role: developer
        workspaces: ["abc123..."]
      - token: "nbt_readonly_..."
        role: reader
  rate_limit:
    enabled: true
    requests_per_minute: 60
    burst: 10
  cors:
    enabled: true
    allowed_origins:
      - "https://app.example.com"
```
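
One way to read the proposed `rate_limit` settings is as a token bucket, sketched below in Python. Treating `burst` as the bucket capacity and `requests_per_minute` as the refill rate is an assumption; the config sketch does not define the semantics.

```python
class TokenBucket:
    """Token bucket: refills at requests_per_minute / 60 tokens per second, capped at burst."""

    def __init__(self, requests_per_minute, burst, now=0.0):
        self.rate = requests_per_minute / 60.0  # tokens added per second
        self.capacity = float(burst)
        self.tokens = float(burst)              # start full so an initial burst is allowed
        self.updated = now

    def allow(self, now):
        """Consume one token if available. `now` is a timestamp in seconds from the caller."""
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = max(self.updated, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False
```

Passing the clock in (for example a monotonic timestamp) keeps the limiter deterministic under test. Per-user or per-IP limiting would then be a dictionary of buckets keyed by username or address.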

---

## Pillar 5: Self-Learning & Prediction

**What:** Learn patterns from user behavior → prepare context proactively.

> ⚠️ Needs further design discussion on scope and approach.

### 5a. Pattern Learning
- Analyze prompt history from harvested sessions
- Identify recurring workflows (e.g., "user typically fixes bug → runs tests → commits")
- Build user-specific workflow graph

### 5b. Proactive Context Pre-loading
- Based on current task → predict what's needed next
- Pre-fetch relevant code symbols, memory entries, past decisions
- Surface as "you might need next: ..."
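
Since this pillar is still under discussion, the following is only one possible prototype of 5a and 5b: first-order transition counts over step sequences, used to rank likely next steps. The step labels (`fix_bug`, `run_tests`, `commit`) and the idea that sessions can be reduced to such labels are assumptions.

```python
from collections import Counter, defaultdict


def build_workflow_graph(sessions):
    """Count how often each step is immediately followed by another across all sessions."""
    graph = defaultdict(Counter)
    for steps in sessions:
        for current, following in zip(steps, steps[1:]):
            graph[current][following] += 1
    return graph


def predict_next(graph, current, top_n=1):
    """Return up to top_n (step, probability) pairs for what tends to follow `current`."""
    counts = graph.get(current)
    if not counts:
        return []  # step never seen, or never followed by anything
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(step, count / total) for step, count in ranked[:top_n]]
```

The returned probability is a plain relative frequency, which gives a natural confidence value for deciding whether a suggestion is worth surfacing.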

### 5c. Self-Lesson Extraction
- After each session: extract lessons ("what worked", "what failed")
- Store as tagged memory entries
- Surface relevant lessons when starting a similar task

### 5d. Auto-execution (Stretch)
- nano-brain autonomously triggers the next step without a user prompt
- Requires: high confidence prediction + explicit user opt-in flag
- Risk: false positives — needs confidence threshold
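
The two requirements above (high confidence plus explicit opt-in) can be expressed as a small gate. This Python sketch is illustrative only; the threshold values and the three-way `Action` split are assumptions, not decided design.

```python
from enum import Enum


class Action(Enum):
    EXECUTE = "execute"  # run the next step without a prompt
    SUGGEST = "suggest"  # surface it as a suggestion only
    SKIP = "skip"        # stay silent


def decide(confidence, opt_in, execute_threshold=0.9, suggest_threshold=0.5):
    """Auto-execute only when the user opted in AND confidence clears the higher bar."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be in [0, 1]")
    if opt_in and confidence >= execute_threshold:
        return Action.EXECUTE
    if confidence >= suggest_threshold:
        return Action.SUGGEST
    return Action.SKIP
```

Without opt-in, even a high-confidence prediction degrades to a suggestion, which keeps false positives from ever triggering an action on their own.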

---

## Implementation Order

```
Phase 1 — Foundation ✅ (shipped 2026-05)
  ├── File indexing, watcher, chunking, embedding
  ├── OpenCode SQLite harvester
  ├── Claude Code JSONL harvester
  └── Workspace registration + isolation

Phase 2 — Code Intelligence ✅ (shipped 2026-05)
  ├── Symbol extraction (regex-based)
  ├── Knowledge graph (module → function → dependency)
  ├── Impact analytics (cross-file change propagation)
  └── Call chain tracing

Phase 3 — Memory & DX ✅ (shipped 2026-05)
  ├── Hybrid search (BM25 + vector + RRF + recency)
  ├── MCP tools (9 tools)
  ├── Session summarization pipeline
  ├── Workspace filtering for harvest + embed
  ├── Init onboarding, doctor, benchmarks
  └── V1 migration, config hot-reload, telemetry

Phase 4 — Hardening (mostly shipped)
  ├── ✅ #180 — Ollama context length overflow on large chunks (PR #208/#209)
  ├── ✅ #181 — UTF-8 null byte in harvested sessions
  ├── ⚠️ #184 — Require explicit --workspace on CLI commands (partial: only reset-embeddings)
  ├── ✅ #158 — Incremental reindex (only changed files) — watcher real-time cleanup shipped
  ├── ✅ #190 — cleanup-stale-raw command
  └── ✅ #191 — Summarization max_tokens default 4096 → 8000

Phase 5 — CLI Completeness (in progress)
  ├── ✅ #153 — Code intelligence CLI (context, code-impact, detect-changes)
  ├── ⚠️ #151 — Wake-up (REST API + MCP done; CLI command pending)
  ├── ❌ #152 — get, tags, multi-get commands
  ├── ❌ #155 — Workspace remove command (SQL query exists, no CLI/handler)
  ├── ❌ #156 — Cross-workspace search (--scope=all)
  ├── ❌ #157 — Cache management (clear, stats)
  └── ⚠️ #160 — --tags filter (works on write; pending on query/search)

Phase 6 — Enhanced Code Intelligence (in progress)
  ├── ✅ #174 — Symbol extraction with go-tree-sitter (Python extractor shipped)
  ├── ✅ Ruby/Rails flow & sequence diagrams (PRs #467, #469, #471, #473)
  │   ├── Ruby CFG extraction (if/else, loops, begin/rescue, method defs)
  │   ├── Ruby call graph extractor (class/module capture, unresolved edges)
  │   ├── Rails route extraction (resources, get/post/patch/put/delete, namespace)
  │   ├── Ruby class→file index with namespace preference
  │   ├── Cross-file resolver with reconcile edge builder
  │   └── Flows reach 20-34 nodes (entry → handler → func → calls chain)
  ├── ❌ Nuxt/Next support
  │   ├── Framework-aware route extraction for `pages/`, `app/`, `server/api`, and Nuxt file routing
  │   ├── Frontend entry→component→data-fetch flow tracing
  │   └── API handler + server action + middleware graph support
  └── ⚠️ Cross-language support — Python, Ruby via tree-sitter; TypeScript, Rust pending

Phase 7 — Team & Multi-user (Planned)
  ├── Role-based access control (Admin / Developer / Reader)
  ├── Per-token and per-user role assignment in config
  ├── Read-only enforcement at middleware layer
  └── Audit log (who wrote/deleted what)

Phase 8 — Self-Learning (Discuss)
  ├── #154 — Memory consolidation + categorization + Thompson Sampling
  ├── Pattern learning from prompt history
  ├── Proactive context pre-loading
  └── Self-lesson extraction

Phase 9 — Agent Memory Benchmarking ✅ (shipped 2026-06)
  ├── ✅ Benchmark framework (20 queries, ground truth, 6 tool runners)
  ├── ✅ Competitor comparison (LlamaIndex, Qdrant/Mem0)
  ├── ✅ Fair comparison with same raw source files
  ├── ✅ Workspace-specific queries (gaming-platform, nano-brain, rails-project)
  ├── ✅ BM25 OR fallback for zero-result queries
  ├── ✅ Results: nano-brain P@5=0.749, MRR=0.967
  └── ⚠️ Known issue: 2 rails-project queries still return 0 results

Phase 10 — Deployment & Security (Planned)
  ├── Deployment guides
  │   ├── ✅ Local machine (Ollama + Docker, ~5 min)
  │   ├── ✅ VPS / team server (shared memory across machines)
  │   ├── ✅ Build from source
  │   ├── ⚠️ Docker Compose production setup
  │   ├── ❌ Kubernetes / Helm chart
  │   ├── ❌ Cloud provider guides (AWS, GCP, Azure)
  │   └── ❌ CI/CD integration (GitHub Actions, GitLab CI)
  ├── Authentication & authorization
  │   ├── ✅ Bearer token auth (single shared token)
  │   ├── ✅ Basic auth (username/password per user)
  │   ├── ❌ Role-based access control (Admin / Developer / Reader)
  │   ├── ❌ Per-token role assignment (each `nbt_` token carries a role)
  │   ├── ❌ Per-user basic auth role (role assigned per username)
  │   ├── ❌ Workspace-scoped access (Developer role limited to assigned workspaces)
  │   └── ❌ Audit log (who wrote/deleted what, when)
  ├── Security hardening
  │   ├── ❌ Rate limiting (per-user, per-IP)
  │   ├── ❌ Request size limits (prevent abuse)
  │   ├── ❌ CORS configuration (restrict origins)
  │   ├── ❌ TLS termination (HTTPS support)
  │   ├── ❌ Input validation & sanitization
  │   └── ❌ Secrets management (env vars, not config files)
  └── Observability
      ├── ✅ Search telemetry (local-only, 90-day retention)
      ├── ❌ Prometheus metrics endpoint
      ├── ❌ Structured logging with request IDs
      ├── ❌ Health check enhancements (dependency checks)
      └── ❌ Distributed tracing (OpenTelemetry)
```

---

## Open Questions

1. **Pillar 5 scope**: Proactive suggestions only, or auto-execution? (Needs design discussion)
2. **Cross-workspace search**: #156 — privacy implications of searching across all workspaces?
3. **Memory consolidation**: #154 — Thompson Sampling for relevance ranking — need benchmarks first?
4. **Ruby limitations**: `before_action`/`after_action` callbacks, ActiveRecord dynamic methods, and metaprogramming are not handled yet. Worth implementing?
5. **Benchmark accuracy**: How to improve P@5 from 0.749 to 0.9+ — better embedding models, HyDE, or reranking?
6. **Deployment target**: Self-hosted VPS vs cloud-managed (RDS, Cloud SQL) — which to prioritize?
7. **Auth granularity**: Is workspace-scoped access enough, or do we need collection-level permissions?
8. **TLS**: Should nano-brain handle TLS termination, or rely on reverse proxy (nginx, Caddy)?
9. **Frontend scope**: For Next/Nuxt, should we prioritize route extraction first, or end-to-end page→component→API flow tracing?

### Resolved Questions

- ~~LLM for summarization~~ → OpenAI-compatible endpoint via `summarization.provider_url`
- ~~Output dir~~ → `~/.nano-brain/summaries/` (configurable)
- ~~Incremental harvest~~ → On-demand via `POST /api/harvest`, tracks last-harvested per session
- ~~Claude projects/memory~~ → Not harvesting `~/.claude/projects/` — only transcripts
- ~~costs.jsonl~~ → Not indexed — analytics-only, not searchable
- ~~Tree-sitter vs regex~~ → go-tree-sitter for Python, Ruby; regex for Go, JS/TS
- ~~Agent memory benchmark~~ → nano-brain P@5=0.749 vs LlamaIndex 0.55 vs Qdrant 0.27