---
name: py_mnn_kb
description: >
  Local vector knowledge base with GraphRAG retrieval (vector + BM25 + knowledge graph).
  Use this skill when the user mentions: "查知识库", "加入知识库", "记住这个", "save to KB",
  "add to knowledge base", "query knowledge base", "记录一下", or similar intent to
  store or retrieve private knowledge.
license: MIT
allowed-tools:
  - kb_build
  - kb_note
  - kb_query
  - kb_status
disable: false
---

# py_mnn_kb — MNN Knowledge Base Skill

Local GraphRAG knowledge base backed by SQLite + MNN embeddings.
Fully compatible with the Android OfflineAI RAG database format.

---

## Setup

### 1. Install dependencies
```bash
pip install -r requirements.txt
```

### 2. Configure
```bash
cp config.example.json config.json
# Edit config.json: set llm_api.api_key, optionally change default_name
```

Key fields in `config.json`:

| Field | Default | Description |
|---|---|---|
| `knowledge_base.default_name` | `default` | KB used when `--kb` is omitted |
| `knowledge_base.storage_dir` | `assets/knowledge_bases` | Where DB files are stored |
| `llm_api.api_key` | *(required for query+LLM)* | OpenAI-compatible API key |
| `graph_ner.custom_dict_path` | `assets/example_terms.json` | Domain terminology for NER |
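
To illustrate how these fields fit together, here is a minimal Python sketch that merges `config.json` over the defaults listed in the table. The nested JSON layout is an assumption inferred from the dotted field names, and `load_config` / `resolve_kb_name` are hypothetical helpers, not functions shipped with this skill.

```python
import json
from pathlib import Path

# Defaults copied from the table above. The nested layout is an assumption
# inferred from the dotted field names (e.g. knowledge_base.default_name).
DEFAULTS = {
    "knowledge_base": {
        "default_name": "default",
        "storage_dir": "assets/knowledge_bases",
    },
    "llm_api": {"api_key": ""},
    "graph_ner": {"custom_dict_path": "assets/example_terms.json"},
}


def load_config(path="config.json"):
    """Merge config.json (if it exists) over the documented defaults."""
    merged = {section: dict(values) for section, values in DEFAULTS.items()}
    p = Path(path)
    if p.exists():
        user = json.loads(p.read_text(encoding="utf-8"))
        for section, values in user.items():
            if isinstance(values, dict):
                merged.setdefault(section, {}).update(values)
            else:
                merged[section] = values
    return merged


def resolve_kb_name(config, kb_name=None):
    """Return the explicit KB name, else knowledge_base.default_name."""
    return kb_name or config["knowledge_base"]["default_name"]
```

With no `config.json` present, `resolve_kb_name(load_config())` falls back to `default`, which mirrors what the CLI does when `--kb` is omitted.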

### 3. First run (auto-downloads embedding model)
```bash
python scripts/py_mnn_kb.py status
```
On first use, `Qwen3-Embedding-0.6B-MNN-int4` (~400 MB) is auto-downloaded into `assets/`.

---

## Tools

### `kb_build` — Build / append knowledge base from files

Recursively indexes a directory of documents. **Runs in the background and returns immediately.**
Check progress with `kb_status`.

**Parameters:**
| Name | Type | Required | Description |
|---|---|---|---|
| `dir_path` | string | yes | Directory path to index (recursive) |
| `kb_name` | string | no | KB name (default: value of `knowledge_base.default_name` in `config.json`) |

**Returns:** `{ status, command, kb_name, pid, files, message }`

**Supported formats:** `.txt` `.md` `.pdf` `.docx` `.pptx` `.xlsx` `.csv` `.html` `.json` `.jsonl`
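
Before starting a build it can help to preview which files in a directory match the supported formats. The sketch below is a hypothetical helper, not part of the skill; treating extensions as case-insensitive is an assumption, and the real indexer may apply additional filters.

```python
from pathlib import Path

# Extensions listed under "Supported formats" above.
SUPPORTED_EXTENSIONS = {
    ".txt", ".md", ".pdf", ".docx", ".pptx",
    ".xlsx", ".csv", ".html", ".json", ".jsonl",
}


def is_supported(path):
    """True if the file extension is in the supported list.

    Case-insensitive matching is an assumption, not documented behavior.
    """
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS


def preview_indexable(dir_path):
    """Recursively list files under dir_path with a supported extension."""
    return sorted(
        p for p in Path(dir_path).rglob("*")
        if p.is_file() and is_supported(p)
    )
```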

**CLI:**
```bash
python scripts/py_mnn_kb.py build ./my_docs/ --kb my_kb
python scripts/py_mnn_kb.py build ./my_docs/          # uses default KB name
```
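
The same build can be launched from Python. This is a minimal sketch that shells out to the CLI shown above; `build_command` and `start_build` are hypothetical wrapper names, and it assumes that `--output json` placed before the subcommand prints a single JSON object on stdout.

```python
import json
import subprocess
import sys

SCRIPT = "scripts/py_mnn_kb.py"


def build_command(dir_path, kb_name=None):
    """Assemble the argv for a build, with --output json before the subcommand."""
    cmd = [sys.executable, SCRIPT, "--output", "json", "build", dir_path]
    if kb_name:
        cmd += ["--kb", kb_name]
    return cmd


def start_build(dir_path, kb_name=None):
    """Run the build command and parse stdout as JSON.

    Assumes stdout contains exactly one JSON object.
    """
    result = subprocess.run(
        build_command(dir_path, kb_name),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Because the build runs in the background, `start_build` should return quickly; progress is then checked with `kb_status`.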

**Trigger phrases:** "加入知识库", "索引这个目录", "build KB", "index these files"

---

### `kb_note` — Insert a text note directly into the knowledge base

Embeds and stores a free-form text snippet. **Synchronous. Refused (returns `status: building`) while a build is running.**

**Parameters:**
| Name | Type | Required | Description |
|---|---|---|---|
| `text` | string | yes | Text content to store |
| `kb_name` | string | no | KB name (default: `knowledge_base.default_name`) |
| `title` | string | no | Optional title, stored as source label |

**Returns:** `{ status, kb_name, chunks_added, elapsed_sec }`

**CLI:**
```bash
python scripts/py_mnn_kb.py note "Q1 roadmap: focus on modules A and B" --kb my_kb
python scripts/py_mnn_kb.py note "$(cat meeting.txt)" --kb my_kb --title "Weekly meeting"
```

**Trigger phrases:** "记住这个", "记录一下", "加个笔记", "save this", "remember this"

---

### `kb_query` — Retrieve relevant chunks (RAG retrieval)

Runs vector + BM25 + GraphRAG fusion and returns the top-N context string.
**The agent appends this context to its prompt — no LLM call is made inside this tool.**
**Synchronous. Refused (returns `status: building`) while a build is running.**

**Parameters:**
| Name | Type | Required | Description |
|---|---|---|---|
| `prompt` | string | yes | Query question or keywords |
| `kb_name` | string | no | KB name (default: `knowledge_base.default_name`) |

**Returns:** Multi-document context string, e.g.:
```
Document1 [ID:42 source:manual.pdf]:
Deployment has three steps...

Document2 [ID:55 source:notes.md]:
...
```
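
If the agent needs per-document metadata (for citations, say), the context string can be split back into its parts. The sketch below assumes the header format is exactly as in the sample above (`Document<N> [ID:<id> source:<name>]:`); `parse_context` is a hypothetical helper, and real output may differ in details.

```python
import re

# Header pattern assumed from the sample: "Document<N> [ID:<id> source:<name>]:"
HEADER = re.compile(
    r"^Document(\d+) \[ID:(\d+) source:(.+?)\]:[ \t]*$",
    re.MULTILINE,
)


def parse_context(context):
    """Split a kb_query context string into {rank, id, source, text} dicts."""
    matches = list(HEADER.finditer(context))
    docs = []
    for i, m in enumerate(matches):
        # A document's text runs until the next header or the end of the string.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(context)
        docs.append({
            "rank": int(m.group(1)),
            "id": int(m.group(2)),
            "source": m.group(3),
            "text": context[m.end():end].strip(),
        })
    return docs
```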

**CLI:**
```bash
python scripts/py_mnn_kb.py query "NAND筛选核心流程" --kb my_kb --no-llm
python scripts/py_mnn_kb.py --output json query "产品路线图" --kb my_kb
```

**Agent usage pattern:**
```
context = kb_query(prompt=user_question, kb_name="my_kb")
# Then build the LLM prompt from the retrieved context:
llm_prompt = f"Based on the following context:\n{context}\n\nQuestion: {user_question}"
```

**Trigger phrases:** "查知识库", "查一下", "知识库里有没有", "search KB"

---

### `kb_status` — Check build progress or last build result

**No KB initialization needed. Always returns instantly.**

**Parameters:**
| Name | Type | Required | Description |
|---|---|---|---|
| `kb_name` | string | no | Informational only; does not affect the result |

**Returns:**
- While building: `{ status: "building", progress: 0-100, message }`
- After success:  `{ status: "ok", message, stats: { chunks_added, elapsed_sec, ... } }`
- After failure:  `{ status: "error", error }`
- Not yet run:    `{ status: "idle", message }`
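
The four result shapes above can be collapsed into a single decision for the agent. This is a minimal sketch using only the fields listed here; `describe_status` is a hypothetical helper, and the message wording is illustrative.

```python
def describe_status(st):
    """Map a kb_status result to (blocked, message).

    blocked is True only while a build is running, which is when
    kb_note and kb_query are refused.
    """
    status = st.get("status")
    if status == "building":
        return True, f"Build in progress: {st.get('progress', 0)}%"
    if status == "ok":
        chunks = st.get("stats", {}).get("chunks_added", "?")
        return False, f"Last build succeeded ({chunks} chunks added)"
    if status == "error":
        return False, f"Last build failed: {st.get('error', 'unknown error')}"
    if status == "idle":
        return False, "No build has been run yet"
    return False, f"Unrecognized status: {status!r}"
```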

**CLI:**
```bash
python scripts/py_mnn_kb.py status
```

**Trigger phrases:** "构建进度", "build status", "知识库建好了吗"

---

## Workflow Examples

**A · User uploads files → auto-index**
```
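User: "把这些文档加入知识库"   # "Add these documents to the knowledge base"
Agent → save files to temp dir
      → kb_build(dir_path=tmp_dir, kb_name="my_kb")   # returns immediately
      → "已开始后台构建，用 kb_status 检查进度"   # "Background build started; check progress with kb_status"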
```

**B · User dictates a note → insert**
```
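User: "记住：STAR2000 低温写性能提升 8%"   # "Remember: STAR2000 low-temperature write performance improved by 8%"
Agent → kb_note(text="STAR2000 低温写性能提升 8%", kb_name="my_kb", title="技术发现")   # title: "Technical finding"
      → "已保存到知识库 my_kb"   # "Saved to knowledge base my_kb"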
```

**C · User asks a question → KB-assisted answer**
```
User: "NAND 筛选核心流程是什么？"   # "What is the core NAND screening flow?"
Agent → context = kb_query("NAND 筛选核心流程", kb_name="my_kb")
      → append context to LLM prompt → generate answer
```

**D · Check if build finished before querying**
```
Agent → st = kb_status()
      → if st["status"] == "building": tell user to wait
      → else: proceed with kb_query(...)
```
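
When the agent should wait for the build instead of asking the user to retry, the check in workflow D can be turned into a polling loop. The sketch below is one possible approach, not the skill's own behavior: `wait_for_build` is a hypothetical helper, and `get_status` stands for any callable that returns a `kb_status`-shaped dict.

```python
import time


def wait_for_build(get_status, timeout_sec=600.0, poll_sec=2.0, sleep=time.sleep):
    """Poll get_status() until it stops reporting "building".

    get_status and sleep are injected so the loop can be exercised
    without a real knowledge base. Raises TimeoutError if the build
    is still running once timeout_sec has elapsed.
    """
    deadline = time.monotonic() + timeout_sec
    while True:
        st = get_status()
        if st.get("status") != "building":
            return st
        if time.monotonic() >= deadline:
            raise TimeoutError(f"build still running after {timeout_sec:.0f} s")
        sleep(poll_sec)
```

The returned dict is the final status, so the caller can still distinguish `ok`, `error`, and `idle` before calling `kb_query`.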

---

## Notes

- `kb_build` is **incremental append** — re-running on the same directory adds only new content
- `kb_note` and `kb_query` are **blocked** (return `status: building`) while a build is running
- `--output json` on any CLI command returns machine-parseable JSON on stdout
- When `--kb` is omitted, the KB named by `knowledge_base.default_name` in `config.json` is used (`default` unless changed)