LightMem: Lightweight and Efficient Memory-Augmented Generation

arXiv GitHub Stars License: MIT Last Commit PRs Welcome

⭐ If you like our project, please give us a star on GitHub for the latest updates!

---

**LightMem** is a lightweight and efficient memory management framework designed for Large Language Models and AI Agents. It provides a simple yet powerful memory storage, retrieval, and update mechanism to help you quickly build intelligent applications with long-term memory capabilities.

* 🚀 **Lightweight & Efficient**
  Minimalist design with minimal resource consumption and fast response times
* 🎯 **Easy to Use**
  Simple API design - integrate into your application with just a few lines of code
* 🔌 **Flexible & Extensible**
  Modular architecture supporting custom storage engines and retrieval strategies
* 🌐 **Broad Compatibility**
  Support for cloud APIs (OpenAI, DeepSeek) and local models (Ollama, vLLM, etc.)
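
As a rough illustration of the backend compatibility listed above, the `memory_manager` block of the config dict (see the Examples section) selects a backend through `model_name` and passes model settings through `configs`. The helper below is a hypothetical sketch and not part of LightMem: `build_memory_manager_config` and `_BASE_URL_KEYS` are names invented here, and only the `openai_base_url` and `deepseek_base_url` keys are taken from this README. Other backends may expect different keys, so check their config classes before relying on it.

```python
from typing import Any, Dict, Optional

# Base-URL keys named in this README's example config comment.
# Keys for other backends (e.g. 'ollama', 'vllm') are not assumed here.
_BASE_URL_KEYS = {"openai": "openai_base_url", "deepseek": "deepseek_base_url"}


def build_memory_manager_config(
    backend: str,
    model: str,
    api_key: Optional[str] = None,
    base_url: Optional[str] = None,
    max_tokens: int = 16000,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble the `memory_manager` section of a config dict."""
    configs: Dict[str, Any] = {"model": model, "max_tokens": max_tokens}
    if api_key is not None:
        configs["api_key"] = api_key
    if base_url is not None:
        if backend not in _BASE_URL_KEYS:
            raise ValueError(f"no known base-URL key for backend {backend!r}")
        configs[_BASE_URL_KEYS[backend]] = base_url
    return {"model_name": backend, "configs": configs}


# Cloud API backend: needs an API key and, optionally, a custom base URL.
cloud = build_memory_manager_config(
    "openai", "gpt-4o-mini", api_key="your_api_key", base_url="your_api_base_url"
)

# Local backend: only the backend name and model are set in this sketch.
local = build_memory_manager_config("ollama", "gemma3:latest")
```

Either dict can be placed under the `"memory_manager"` key of the full configuration; the rest of the config (embedder, retriever, and so on) stays the same when switching backends.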

## 📢 News

- **[2026-04-24]**: 🚀 LightMem now supports the latest [**DeepSeek**](./src/lightmem/configs/memory_manager/base_config.py) models, including `deepseek-v4-flash` and `deepseek-v4-pro`, with `reasoning_effort` and thinking-mode configuration!
- **[2026-04-24]**: 🎉🎉🎉 [**StructMem: Structured Memory for Long-Horizon Behavior in LLMs**](https://arxiv.org/abs/2604.21748) has been accepted by **ACL 2026**!
- **[2026-03-21]**: 🚀 We provide a more comprehensive [baseline evaluation framework](https://github.com/zjunlp/MemBase), supporting the benchmarking of memory layers such as Mem0, A-MEM, EverMemOS, and LangMem on multiple datasets like LoCoMo and LongMemEval.
- **[2026-02-15]**: 🚀 **[StructMem](./StructMem.md)** is released: a hierarchical memory framework that preserves event-level memory bindings and cross-event memory connections.
- **[2026-01-26]**: 🎉🎉🎉 [**LightMem: Lightweight and Efficient Memory-Augmented Generation**](https://arxiv.org/abs/2510.18866) has been accepted by **ICLR 2026**!
- **[2026-01-17]**: 🚀 We provide a comprehensive [baseline evaluation framework](https://github.com/zjunlp/LightMem/blob/main/src/lightmem/memory_toolkits/readme.md), supporting the benchmarking of memory layers such as Mem0, A-MEM, and LangMem on multiple datasets like LoCoMo and LongMemEval.
- **[2025-12-09]**: 🎬 Released a **[Demo Video](#demo)** showcasing long-context handling, along with comprehensive **[Tutorial Notebooks](./tutorial-notebooks/)** for various scenarios!
- **[2025-11-30]**: 🚌 LightMem now supports calling multiple tools provided by its [**MCP Server**](https://github.com/zjunlp/LightMem/blob/main/mcp/server.py).
- **[2025-11-26]**: 🚀 Added full **LoCoMo** dataset support, delivering strong [results](https://github.com/zjunlp/LightMem?tab=readme-ov-file#locomo) with leading performance and efficiency! Here is the [**reproduction script**](https://github.com/zjunlp/LightMem/blob/main/experiments/locomo/readme.md)!
- **[2025-11-09]**: ✨ LightMem now supports local deployment via [**Ollama**](https://github.com/zjunlp/LightMem/blob/main/src/lightmem/factory/memory_manager/ollama.py), [**vLLM**](https://github.com/zjunlp/LightMem/blob/main/src/lightmem/factory/memory_manager/vllm_offline.py), and [**Transformers**](https://github.com/zjunlp/LightMem/blob/main/src/lightmem/factory/memory_manager/transformers.py) auto-loading!
- **[2025-10-12]**: 🎉 LightMem project is officially Open-Sourced!

## 🧪 Reproduction Scripts for LoCoMo & LongMemEval

We provide lightweight, ready-to-run scripts for reproducing results on **LoCoMo**, **LongMemEval**, and their combined baselines.

| Dataset | Description | Script | Result |
| :--- | :--- | :--- | :--- |
| **LongMemEval** | Run LightMem on LongMemEval, including evaluation and offline memory update. | [run_lightmem_longmemeval.md](https://github.com/zjunlp/LightMem/blob/main/experiments/longmemeval/readme.md) | [LongMemEval Results](https://github.com/zjunlp/LightMem/blob/main/experiments/longmemeval/readme.md#results) |
| **LoCoMo** | Scripts for reproducing LightMem results on LoCoMo. | [run_lightmem_locomo.md](https://github.com/zjunlp/LightMem/blob/main/experiments/locomo/readme.md) | [LoCoMo Results](https://github.com/zjunlp/LightMem/blob/main/experiments/locomo/readme.md#results) |
| **LongMemEval & LoCoMo** | Unified baseline scripts for running both datasets. | [run_baselines.md](https://github.com/zjunlp/LightMem/blob/main/src/lightmem/memory_toolkits/readme.md) | [Baseline Results](#experimental-results) |

## 🧪 Baseline Evaluation

We provide a comprehensive [baseline evaluation framework](https://github.com/zjunlp/LightMem/blob/main/src/lightmem/memory_toolkits/readme.md), supporting the benchmarking of memory layers such as Mem0, A-MEM, and LangMem on multiple datasets like LoCoMo and LongMemEval.

## 🎥 Demo & Tutorials

**Watch Demo:** [YouTube](https://www.youtube.com/watch?v=r7sk_7Yv66I) | [Bilibili](https://www.bilibili.com/video/BV1a7mJBbEVM/)

### 📚 Hands-on Tutorials

We provide ready-to-use Jupyter notebooks corresponding to the demo and other use cases. You can find them in the [`tutorial-notebooks`](./tutorial-notebooks/) directory.

| Scenario | Description | Notebook Link |
| :--- | :--- | :--- |
| **Travel Planning** | A complete guide to building a travel agent with memory. | [LightMem_Example_travel.ipynb](./tutorial-notebooks/LightMem_Example_travel.ipynb) |
| **Code Assistant** | A complete guide to building a code agent with memory. | [LightMem_Example_code.ipynb](./tutorial-notebooks/LightMem_Example_code.ipynb) |
| **LongMemEval** | A tutorial on how to run evaluations on LongMemEval benchmarks using LightMem. | [LightMem_Example_longmemeval.ipynb](./tutorial-notebooks/LightMem_Example_longmemeval.ipynb) |

## ☑️ Todo List

LightMem is continuously evolving!
Here's what's coming:

- Offline Pre-computation of KV Cache for Update (Lossless)
- Online Pre-computation of KV Cache Before Q&A (Lossy)
- Integration of More Models and Feature Enhancements
- Coordinated Use of Context and Long-Term Memory Storage
- Multi-Modal Memory

## 📑 Table of Contents

* 📢 News
* 🧪 Reproduction Scripts
* 🧪 Baseline Evaluation
* 🎥 Demo & Tutorials
* ☑️ Todo List
* 🔧 Installation
* ⚡ Quick Start
* 🏗️ Architecture
* 💡 Examples
* 📁 Experimental Results
* ⚙️ Configuration
* 👥 Contributors
* 🔗 Related Projects

## 🔧 Installation

### Installation Steps

#### Option 1: Install from Source

```bash
# Clone the repository
git clone https://github.com/zjunlp/LightMem.git
cd LightMem

# Create virtual environment
conda create -n lightmem python=3.11 -y
conda activate lightmem

# Install dependencies
unset ALL_PROXY
pip install -e .
```

#### Option 2: Install via pip

```bash
pip install lightmem  # Coming soon
```

## ⚡ Quick Start

1. Modify the `JUDGE_MODEL`, `LLM_MODEL`, and their respective `API_KEY` and `BASE_URL` in `API Configuration`.
2. Download `LLMLINGUA_MODEL` from [microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank](https://huggingface.co/microsoft/llmlingua-2-bert-base-multilingual-cased-meetingbank) and `EMBEDDING_MODEL` from [sentence-transformers/all-MiniLM-L6-v2](https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2), then modify their paths in `Model Paths`.
3. Download the dataset from [longmemeval-cleaned](https://huggingface.co/datasets/xiaowu0162/longmemeval-cleaned), and modify the path in `Data Configuration`.

```bash
cd experiments
python run_lightmem_qwen.py
```

## 🏗️ Architecture

### 🗺️ Core Modules Overview

LightMem adopts a modular design, breaking down the memory management process into several pluggable components.
The core directory structure exposed to users is outlined below, allowing for easy customization and extension:

```text
LightMem/
├── src/lightmem/            # Main package
│   ├── __init__.py          # Package initialization
│   ├── configs/             # Configuration files
│   ├── factory/             # Factory methods
│   ├── memory/              # Core memory management
│   └── memory_toolkits/     # Memory toolkits
├── mcp/                     # LightMem MCP server
├── experiments/             # Experiment scripts
├── datasets/                # Dataset files
└── examples/                # Examples
```

### 🧩 Supported Backends per Module

The following table lists the backend values currently recognized by each configuration module. Use the `model_name` field (or the corresponding config object) to select one of these backends.

| Module (config) | Supported backends |
| :--- | :--- |
| `PreCompressorConfig` | `llmlingua-2`, `entropy_compress` |
| `TopicSegmenterConfig` | `llmlingua-2` |
| `MemoryManagerConfig` | `openai`, `deepseek`, `ollama`, `vllm`, etc. |
| `TextEmbedderConfig` | `huggingface` |
| `MMEmbedderConfig` | `huggingface` |
| `RetrieverConfig` | `qdrant`, `FAISS`, `BM25` |

## 💡 Examples

### Initialize LightMem

```python
import os
from datetime import datetime
from lightmem.memory.lightmem import LightMemory

# Write logs for each run into its own timestamped directory
LOGS_ROOT = "./logs"
RUN_TIMESTAMP = datetime.now().strftime("%Y%m%d_%H%M%S")
RUN_LOG_DIR = os.path.join(LOGS_ROOT, RUN_TIMESTAMP)
os.makedirs(RUN_LOG_DIR, exist_ok=True)

API_KEY = 'your_api_key'
API_BASE_URL = 'your_api_base_url'
LLM_MODEL = 'your_model_name'  # such as 'gpt-4o-mini' (API) or 'gemma3:latest' (Local Ollama) ...
EMBEDDING_MODEL_PATH = '/your/path/to/models/all-MiniLM-L6-v2'
LLMLINGUA_MODEL_PATH = '/your/path/to/models/llmlingua-2-bert-base-multilingual-cased-meetingbank'

config_dict = {
    "pre_compress": True,
    "pre_compressor": {
        "model_name": "llmlingua-2",
        "configs": {
            "llmlingua_config": {
                "model_name": LLMLINGUA_MODEL_PATH,
                "device_map": "cuda",
                "use_llmlingua2": True,
            },
        }
    },
    "topic_segment": True,
    "precomp_topic_shared": True,
    "topic_segmenter": {
        "model_name": "llmlingua-2",
    },
    "messages_use": "user_only",
    "metadata_generate": True,
    "text_summary": True,
    "memory_manager": {
        "model_name": 'xxx',  # such as 'openai' or 'ollama' ...
        "configs": {
            "model": LLM_MODEL,
            "api_key": API_KEY,
            "max_tokens": 16000,
            "xxx_base_url": API_BASE_URL  # API model specific, such as 'openai_base_url' or 'deepseek_base_url' ...
        }
    },
    "extract_threshold": 0.1,
    "index_strategy": "embedding",
    "text_embedder": {
        "model_name": "huggingface",
        "configs": {
            "model": EMBEDDING_MODEL_PATH,
            "embedding_dims": 384,
            "model_kwargs": {"device": "cuda"},
        },
    },
    "retrieve_strategy": "embedding",
    "embedding_retriever": {
        "model_name": "qdrant",
        "configs": {
            "collection_name": "my_long_term_chat",
            "embedding_model_dims": 384,
            "path": "./my_long_term_chat",
        }
    },
    "summary_retriever": {
        "model_name": "qdrant",
        "configs": {
            "collection_name": "my_chat_summaries",
            "embedding_model_dims": 384,
            "path": "./my_chat_summaries",
        }
    },
    "update": "offline",
    "logging": {
        "level": "DEBUG",
        "file_enabled": True,
        "log_dir": RUN_LOG_DIR,
    }
}

lightmem = LightMemory.from_config(config_dict)
```

### Add Memory

```python
session = {
    "timestamp": "2025-01-10",
    "turns": [
        [
            {"role": "user", "content": "My favorite ice cream flavor is pistachio, and my dog's name is Rex."},
            {"role": "assistant", "content": "Got it. Pistachio is a great choice."},
        ],
    ]
}

for turn_messages in session["turns"]:
    # Attach the session timestamp to every message in the turn
    timestamp = session["timestamp"]
    for msg in turn_messages:
        msg["time_stamp"] = timestamp

    store_result = lightmem.add_memory(
        messages=turn_messages,
        force_segment=True,
        force_extract=True
    )
```

### Offline Update

```python
lightmem.construct_update_queue_all_entries()
lightmem.offline_update_all_entries(score_threshold=0.8)
```

### Generate Summaries

```python
summary_result = lightmem.summarize()
```

### Retrieve Memory

```python
question = "What is the name of my dog?"
related_memories = lightmem.retrieve(question, limit=5)
print(related_memories)
```

### MCP Server

LightMem also provides a Model Context Protocol ([MCP](https://modelcontextprotocol.io/docs/getting-started/intro)) server:

```bash
# Run from the repository root
cd LightMem

# Environment
pip install '.[mcp]'

# MCP Inspector [Optional]
npx @modelcontextprotocol/inspector python mcp/server.py

# Start API by HTTP (http://127.0.0.1:8000/mcp)
fastmcp run mcp/server.py:mcp --transport http --port 8000
```

The MCP config `json` file of your local client may look like:

```json
{
  "yourMcpServers": {
    "LightMem": {
      "url": "http://127.0.0.1:8000/mcp",
      "otherParameters": "..."
    }
  }
}
```

## 📁 Experimental Results

For transparency and reproducibility, we have shared the results of our experiments on Google Drive. This includes model outputs, evaluation logs, and predictions used in our study.

🔗 Access the data here: [Google Drive - Experimental Results](https://drive.google.com/drive/folders/1n1YCqq0aDeWiPILhkq-uS3sU3FDmslz9?usp=drive_link)

Please feel free to download, explore, and use these resources for research or reference purposes.

### LoCoMo

#### Overview

backbone: `gpt-4o-mini`, judge model: `gpt-4o-mini` & `qwen2.5-32b-instruct`

| Method | ACC(%) gpt-4o-mini | ACC(%) qwen2.5-32b-instruct | Memory-Con Tokens(k) Total | QA Tokens(k) Total | Total(k) | Calls | Runtime(s) Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FullText | 73.83 | 73.18 | – | 54,884.479 | 54,884.479 | – | 6,971 |
| NaiveRAG | 63.64 | 63.12 | – | 3,870.187 | 3,870.187 | – | 1,884 |
| A-MEM | 64.16 | 60.71 | 11,494.344 | 10,170.567 | 21,664.907 | 11,754 | 67,084 |
| MemoryOS(eval) | 58.25 | 61.04 | 2,870.036 | 7,649.343 | 10,519.379 | 5,534 | 26,129 |
| MemoryOS(pypi) | 54.87 | 55.91 | 5,264.801 | 6,126.111 | 11,390.004 | 10,160 | 37,912 |
| Mem0 | 36.49 | 37.01 | 24,304.872 | 1,488.618 | 25,793.490 | 19,070 | 120,175 |
| Mem0(api) | 61.69 | 61.69 | 68,347.720 | 4,169.909 | 72,517.629 | 6,022 | 10,445 |
| Mem0-g(api) | 60.32 | 59.48 | 69,684.818 | 4,389.147 | 74,073.965 | 6,022 | 10,926 |

backbone: `qwen3-30b-a3b-instruct-2507`, judge model: `gpt-4o-mini` & `qwen2.5-32b-instruct`

| Method | ACC(%) gpt-4o-mini | ACC(%) qwen2.5-32b-instruct | Memory-Con Tokens(k) Total | QA Tokens(k) Total | Total(k) | Calls | Runtime(s) Total |
| --- | --- | --- | --- | --- | --- | --- | --- |
| FullText | 74.87 | 74.35 | – | 60,873.076 | 60,873.076 | – | 10,555 |
| NaiveRAG | 66.95 | 64.68 | – | 4,271.052 | 4,271.052 | – | 1,252 |
| A-MEM | 56.10 | 54.81 | 16,267.997 | 17,340.881 | 33,608.878 | 11,754 | 69,339 |
| MemoryOS(eval) | 61.04 | 59.81 | 3,615.087 | 9,703.169 | 11,946.442 | 4,147 | 13,710 |
| MemoryOS(pypi) | 51.30 | 51.95 | 6,663.527 | 7,764.991 | 14,428.518 | 10,046 | 20,830 |
| Mem0 | 43.31 | 43.25 | 17,994.035 | 1,765.570 | 19,759.605 | 16,145 | 46,500 |

#### Details

backbone: `gpt-4o-mini`, judge model: `gpt-4o-mini` & `qwen2.5-32b-instruct`

| Method | Summary Tokens(k) In | Summary Tokens(k) Out | Update Tokens(k) In | Update Tokens(k) Out | QA Tokens(k) In | QA Tokens(k) Out | Runtime(s) mem-con | Runtime(s) qa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FullText | – | – | – | – | 54,858.770 | 25.709 | – | 6,971 |
| NaiveRAG | – | – | – | – | 3,851.029 | 19.158 | – | 1,884 |
| A-MEM | 1,827.373 | 492.883 | 7,298.878 | 1,875.210 | 10,113.252 | 57.315 | 60,607 | 6,477 |
| MemoryOS(eval) | 1,109.849 | 333.970 | 780.807 | 645.410 | 7,638.539 | 10.804 | 24,220 | 1,909 |
| MemoryOS(pypi) | 1,007.729 | 294.601 | 3,037.509 | 924.962 | 6,116.239 | 9.872 | 33,325 | 4,587 |
| Mem0 | 8,127.398 | 253.187 | 12,722.011 | 3,202.276 | 1,478.830 | 9.788 | 118,268 | 1,907 |
| Mem0(api) | \ | \ | \ | \ | 4,156.850 | 13.059 | 4,328 | 6,117 |
| Mem0-g(api) | \ | \ | \ | \ | 4,375.900 | 13.247 | 5,381 | 5,545 |

backbone: `qwen3-30b-a3b-instruct-2507`, judge model: `gpt-4o-mini` & `qwen2.5-32b-instruct`

| Method | Summary Tokens(k) In | Summary Tokens(k) Out | Update Tokens(k) In | Update Tokens(k) Out | QA Tokens(k) In | QA Tokens(k) Out | Runtime(s) mem-con | Runtime(s) qa |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| FullText | – | – | – | – | 60,838.694 | 34.382 | – | 10,555 |
| NaiveRAG | – | – | – | – | 4,239.030 | 32.022 | – | 1,252 |
| A-MEM | 1,582.942 | 608.507 | 9,241.928 | 4,835.070 | 17,528.876 | 82.005 | 55,439 | 13,900 |
| MemoryOS(eval) | 1,222.139 | 531.157 | 1,044.307 | 817.484 | 9,679.996 | 23.173 | 12,697 | 1,012 |
| MemoryOS(pypi) | 2,288.533 | 516.024 | 2,422.693 | 1,436.277 | 7,743.391 | 21.600 | 19,822 | 1,007 |
| Mem0 | 8,270.874 | 186.354 | 7,638.827 | 1,897.980 | 1,739.246 | 26.324 | 45,407 | 1,093 |

#### Performance metrics

backbone: `gpt-4o-mini`, judge model: `gpt-4o-mini`

| Method | Overall ↑ | Multi | Open | Single | Temp |
| :--- | :---: | :---: | :---: | :---: | :---: |
| FullText | 73.83 | 68.79 | 56.25 | 86.56 | 50.16 |
| NaiveRAG | 63.64 | 55.32 | 47.92 | 70.99 | 56.39 |
| A-MEM | 64.16 | 56.03 | 31.25 | 72.06 | 60.44 |
| MemoryOS(eval) | 58.25 | 56.74 | 45.83 | 67.06 | 40.19 |
| MemoryOS(pypi) | 54.87 | 52.13 | 43.75 | 63.97 | 36.76 |
| Mem0 | 36.49 | 30.85 | 34.38 | 38.41 | 37.07 |
| Mem0(api) | 61.69 | 56.38 | 43.75 | 66.47 | 59.19 |
| Mem0-g(api) | 60.32 | 54.26 | 39.58 | 65.99 | 57.01 |

backbone: `gpt-4o-mini`, judge model: `qwen2.5-32b-instruct`

| Method | Overall ↑ | Multi | Open | Single | Temp |
| :--- | :---: | :---: | :---: | :---: | :---: |
| FullText | 73.18 | 68.09 | 54.17 | 86.21 | 49.22 |
| NaiveRAG | 63.12 | 53.55 | 50.00 | 71.34 | 53.89 |
| A-MEM | 60.71 | 53.55 | 32.29 | 69.08 | 53.58 |
| MemoryOS(eval) | 61.04 | 64.18 | 40.62 | 70.15 | 40.50 |
| MemoryOS(pypi) | 55.91 | 52.48 | 41.67 | 66.35 | 35.83 |
| Mem0 | 37.01 | 31.91 | 37.50 | 38.53 | 37.38 |
| Mem0(api) | 61.69 | 54.26 | 46.88 | 67.66 | 57.01 |
| Mem0-g(api) | 59.48 | 55.32 | 42.71 | 65.04 | 53.58 |

backbone: `qwen3-30b-a3b-instruct-2507`, judge model: `gpt-4o-mini`

| Method | Overall ↑ | Multi | Open | Single | Temp |
| :--- | :---: | :---: | :---: | :---: | :---: |
| FullText | 74.87 | 69.86 | 57.29 | 87.40 | 51.71 |
| NaiveRAG | 66.95 | 62.41 | 57.29 | 76.81 | 47.98 |
| A-MEM | 56.10 | 57.45 | 43.75 | 67.90 | 27.73 |
| MemoryOS(eval) | 61.04 | 62.77 | 51.04 | 72.29 | 33.02 |
| MemoryOS(pypi) | 51.30 | 52.48 | 40.62 | 61.59 | 26.48 |
| Mem0 | 43.31 | 42.91 | 46.88 | 46.37 | 34.58 |
| Mem0(api) | 61.69 | 54.26 | 46.88 | 67.66 | 57.01 |
| Mem0-g(api) | 59.48 | 55.32 | 42.71 | 65.04 | 53.58 |

backbone: `qwen3-30b-a3b-instruct-2507`, judge model: `qwen2.5-32b-instruct`

| Method | Overall ↑ | Multi | Open | Single | Temp |
| :--- | :---: | :---: | :---: | :---: | :---: |
| FullText | 74.35 | 68.09 | 63.54 | 86.33 | 51.71 |
| NaiveRAG | 64.68 | 60.28 | 52.08 | 75.62 | 43.61 |
| A-MEM | 54.81 | 56.74 | 39.58 | 67.42 | 24.61 |
| MemoryOS(eval) | 59.81 | 63.12 | 48.96 | 70.51 | 32.09 |
| MemoryOS(pypi) | 51.95 | 55.67 | 39.58 | 61.47 | 27.41 |
| Mem0 | 43.25 | 45.04 | 46.88 | 45.78 | 33.96 |
| Mem0(api) | 61.69 | 54.26 | 46.88 | 67.66 | 57.01 |
| Mem0-g(api) | 59.48 | 55.32 | 42.71 | 65.04 | 53.58 |

## ⚙️ Configuration

All behaviors of LightMem are controlled via the `BaseMemoryConfigs` configuration class. Users can customize aspects like pre-processing, memory extraction, retrieval strategy, and update mechanisms by providing a custom configuration.

#### Key Configuration Options (Usage)

| Option | Default | Usage (allowed values and behavior) |
| :--- | :--- | :--- |
| `pre_compress` | `False` | True / False. If True, input messages are pre-compressed using the `pre_compressor` configuration before being stored. This reduces storage and indexing cost but may remove fine-grained details. If False, messages are stored without pre-compression. |
| `pre_compressor` | `None` | dict / object. Configuration for the pre-compression component (`PreCompressorConfig`) with fields like `model_name` (e.g., `llmlingua-2`, `entropy_compress`) and `configs` (model-specific parameters). Effective only when `pre_compress=True`. |
| `topic_segment` | `False` | True / False. Enables topic-based segmentation of long conversations. When True, long conversations are split into topic segments and each segment can be indexed/stored independently (requires `topic_segmenter`). When False, messages are stored sequentially. |
| `precomp_topic_shared` | `False` | True / False. If True, pre-compression and topic segmentation can share intermediate results to avoid redundant processing. May improve performance but requires careful configuration to avoid cross-topic leakage. |
| `topic_segmenter` | `None` | dict / object. Configuration for topic segmentation (`TopicSegmenterConfig`), including `model_name` and `configs` (segment length, overlap, etc.). Used when `topic_segment=True`. |
| `messages_use` | `'user_only'` | `'user_only'` / `'assistant_only'` / `'hybrid'`. Controls which messages are used to generate metadata and summaries: `user_only` uses user inputs, `assistant_only` uses assistant responses, `hybrid` uses both. Choosing `hybrid` increases processing but yields richer context. |
| `metadata_generate` | `True` | True / False. If True, metadata such as keywords and entities are extracted and stored to support attribute-based and filtered retrieval. If False, no metadata extraction occurs. |
| `text_summary` | `True` | True / False. If True, a text summary is generated and stored alongside the original text (reduces retrieval cost and speeds review). If False, only the original text is stored. Summary quality depends on `memory_manager`. |
| `memory_manager` | `MemoryManagerConfig()` | dict / object. Controls the model used to generate summaries and metadata (`MemoryManagerConfig`), e.g., `model_name` (`openai`, `ollama`, etc.) and `configs`. Changing this affects summary style, length, and cost. |
| `extract_threshold` | `0.5` | float (0.0 - 1.0). Threshold used to decide whether content is important enough to be extracted as metadata or highlight. Higher values (e.g., 0.8) mean more conservative extraction; lower values (e.g., 0.2) extract more items (may increase noise). |
| `index_strategy` | `None` | `'embedding'` / `'context'` / `'hybrid'` / `None`. Determines how memories are indexed: `'embedding'` uses vector-based indexing (requires embedders/retriever) for semantic search; `'context'` uses text-based/contextual retrieval (requires `context_retriever`) for keyword/document similarity; and `'hybrid'` combines context filtering and vector reranking for robustness and higher accuracy. |
| `text_embedder` | `None` | dict / object. Configuration for text embedding model (`TextEmbedderConfig`) with `model_name` (e.g., `huggingface`) and `configs` (batch size, device, embedding dim). Required when `index_strategy` or `retrieve_strategy` includes `'embedding'`. |
| `multimodal_embedder` | `None` | dict / object. Configuration for multimodal/image embedder (`MMEmbedderConfig`). Used for non-text modalities. |
| `history_db_path` | `os.path.join(lightmem_dir, "history.db")` | str. Path to persist conversation history and lightweight state. Useful to restore state across restarts. |
| `retrieve_strategy` | `'embedding'` | `'embedding'` / `'context'` / `'hybrid'`. Strategy used at query time to fetch relevant memories. Pick based on data and query type: semantic queries -> `'embedding'`; keyword/structured queries -> `'context'`; mixed -> `'hybrid'`. |
| `context_retriever` | `None` | dict / object. Configuration for context-based retriever (`ContextRetrieverConfig`), e.g., `model_name='BM25'` and `configs` like `top_k`. Used when `retrieve_strategy` includes `'context'`. |
| `embedding_retriever` | `None` | dict / object. Vector store configuration (`EmbeddingRetrieverConfig`), e.g., `model_name='qdrant'` and connection/index params. Used when `retrieve_strategy` includes `'embedding'`. |
| `summary_retriever` | `None` | dict / object. Configuration for summary-specific vector store (`EmbeddingRetrieverConfig`). When configured, summaries are stored in a separate collection for hierarchical retrieval. Used in StructMem mode to store and retrieve session/topic summaries independently from detailed memories. |
| `update` | `'offline'` | `'online'` / `'offline'`. `'offline'`: batch or scheduled updates to save cost and aggregate changes; this is the fully supported mode with complete functionality. `'online'`: reserved for future development (currently a no-op placeholder; memory will not be persisted when this mode is set). |
| `kv_cache` | `False` | True / False. If True, attempt to precompute and persist model KV caches to accelerate repeated LLM calls (requires support from the LLM runtime and may increase storage). Uses `kv_cache_path` to store cache. |
| `kv_cache_path` | `os.path.join(lightmem_dir, "kv_cache.db")` | str. File path for KV cache storage when `kv_cache=True`. |
| `graph_mem` | `False` | True / False. When True, some memories will be organized as a graph (nodes and relationships) to support complex relation queries and reasoning. Requires additional graph processing/storage. |
| `extraction_mode` | `'flat'` | `'flat'` / `'event'`. Memory extraction mode: `'flat'` extracts factual entries as independent units suitable for general knowledge retention; `'event'` extracts event-level structures with both factual and relational components, preserving temporal bindings and causal relationships. Use `'event'` for narrative-heavy or time-sensitive scenarios. |
| `version` | `'v1.1'` | str. Configuration/API version. Only change if you know compatibility implications. |
| `logging` | `'None'` | dict / object. Logging configuration, e.g., `level`, `file_enabled`, and `log_dir`. |

## 🏆 Contributors

- JizhanFang
- Xinle-Deng
- Xubqpanda
- HaomingX
- 453251
- James-TYQ
- evy568
- Norah-Feathertail
- TongjiCst

We welcome contributions from the community! If you'd like to contribute, please fork the repository and submit a pull request. For major changes, please open an issue first to discuss what you would like to change.

## 🔗 Related Projects