简体中文 | English

An enterprise-grade intelligent knowledge platform powered by RAG and AI Agent technology

Every answer should be traceable: document upload · intelligent parsing · hybrid retrieval · AI conversation · citation provenance

Why Argus · Core Highlights · System Architecture · Feature Modules · Technology Stack · Quick Start · API Overview · Project Structure

---

## Why Choose Argus?

> **Argus** is named after the hundred-eyed giant in Greek mythology. Even while asleep, some of Argus's eyes stayed watchful. The name reflects the platform's goal: to observe and understand private knowledge assets comprehensively, so **every answer can be backed by evidence**.

**Argus** is not another "ChatGPT wrapper". It is a ground-up **RAG (Retrieval-Augmented Generation) knowledge base platform** that deeply integrates enterprise private documents with large language models, addressing three core pain points in vertical LLM applications:

| Pain Point | Argus Solution |
|------|-----------------|
| **Hallucinated answers** | Hybrid retrieval + evidence evaluation + structured output ensure answers are grounded in real documents, with proactive refusal when the evidence is insufficient. |
| **Fragmented knowledge** | Automatic document parsing, chunking, vectorization, and indexing connect the full path from files to usable knowledge. |
| **Memoryless conversations** | ReactAgent plus three-level short-term memory compression supports context-aware multi-turn conversations. |

## Core Highlights

### End-to-End RAG Pipeline

Argus builds a complete **RAG pipeline** from document upload to AI answer generation:

```text
Document upload -> Intelligent parsing -> Text chunking
        ↓
Vector embedding (PGvector HNSW) + keyword indexing (Elasticsearch IK)
        ↓
User question -> Query planning (LLM) -> Hybrid retrieval (RRF fusion)
        ↓
Evidence evaluation (four sufficiency levels) -> LLM generation -> citation provenance
```

It is **not a simple "search + GPT wrapper"**. Argus implements key stages such as query planning, RRF fusion ranking, and four-level evidence evaluation.
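
The evidence-evaluation stage can be pictured as a gate between retrieval and generation. The sketch below is a minimal illustration, not the Argus implementation: the level names `NONE`, `WEAK`, `PARTIAL`, and `SUFFICIENT` come from the Feature Modules section of this README, while the class name, the 0.75 score threshold, the chunk counts, and the `minimum` parameter are all assumptions made for the example.

```java
import java.util.List;

/** Illustrative evidence gate; names and thresholds are assumptions, not Argus code. */
class EvidenceGateSketch {

    /** The four levels named in this README, ordered from weakest to strongest. */
    enum Sufficiency { NONE, WEAK, PARTIAL, SUFFICIENT }

    /** Hypothetical cut-off for a "strong" chunk; the real criteria are not stated here. */
    static final double STRONG_SCORE = 0.75;

    /** Grades retrieved evidence by how many chunks score at or above STRONG_SCORE. */
    static Sufficiency evaluate(List<Double> relevanceScores) {
        if (relevanceScores.isEmpty()) {
            return Sufficiency.NONE;          // nothing was retrieved
        }
        long strong = relevanceScores.stream().filter(s -> s >= STRONG_SCORE).count();
        if (strong == 0) {
            return Sufficiency.WEAK;          // only low-scoring chunks
        }
        if (strong < 2) {
            return Sufficiency.PARTIAL;       // a single strong chunk
        }
        return Sufficiency.SUFFICIENT;        // two or more strong chunks
    }

    /** Answers only when the level reaches the configured minimum; otherwise the caller refuses. */
    static boolean shouldAnswer(Sufficiency level, Sufficiency minimum) {
        return level.compareTo(minimum) >= 0;
    }

    public static void main(String[] args) {
        Sufficiency level = evaluate(List.of(0.91, 0.42));
        System.out.println(level + " -> answer? " + shouldAnswer(level, Sufficiency.PARTIAL));
    }
}
```

Keeping the levels in an ordered enum lets the refusal rule be a single comparison, so the minimum level required to answer can be tuned without touching the grading logic.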

### AI Agent Conversation Engine

Built on the **Spring AI Alibaba ReactAgent** graph execution engine, Argus supports:

- **Dual-mode switching**: pure chat (`CHAT`) and knowledge base retrieval (`KB_SEARCH`) can switch dynamically within the same session.
- **Tool orchestration**: the Agent decides whether to call retrieval tools, with at most one retrieval call per turn to avoid waste.
- **SSE streaming output**: model responses are streamed token by token to the frontend, so output appears as soon as it is generated.
- **Short-term memory**: three-level progressive compression (session memory -> compact summary -> runtime truncation) keeps long conversations usable within limited context windows.

### Hybrid Retrieval Architecture

**Vector semantic retrieval + keyword full-text retrieval** run in parallel, then RRF (Reciprocal Rank Fusion) merges and ranks the results:

- **Semantic matching**: PGvector + HNSW index + `COSINE_DISTANCE` capture semantic similarity.
- **Exact matching**: Elasticsearch + IK Chinese analyzer + BM25 accurately match domain terms.
- **Evidence enhancement**: cluster aggregation and neighbor-window expansion add context and reduce chunk fragmentation.
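
To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion over two ranked lists of chunk IDs. It is an illustration under stated assumptions rather than the Argus code: the class name, the string chunk IDs, and the smoothing constant `K = 60` (a value commonly used in RRF write-ups) are not taken from this project.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Minimal Reciprocal Rank Fusion sketch; names and the constant are illustrative assumptions. */
class RrfFusionSketch {

    /** Smoothing constant; the value Argus uses is not stated in this README. */
    static final int K = 60;

    /**
     * Fuses ranked lists of chunk IDs. Each list contributes 1 / (K + rank) for a chunk,
     * where rank starts at 1, and the contributions are summed across lists.
     */
    @SafeVarargs
    static List<String> fuse(List<String>... rankedLists) {
        Map<String, Double> scores = new HashMap<>();
        for (List<String> ranked : rankedLists) {
            for (int i = 0; i < ranked.size(); i++) {
                scores.merge(ranked.get(i), 1.0 / (K + i + 1), Double::sum);
            }
        }
        List<String> fused = new ArrayList<>(scores.keySet());
        // Highest fused score first; ties are broken by ID so the order is deterministic.
        fused.sort((a, b) -> {
            int byScore = Double.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : a.compareTo(b);
        });
        return fused;
    }

    public static void main(String[] args) {
        List<String> vectorHits = List.of("chunk-3", "chunk-1", "chunk-7");
        List<String> keywordHits = List.of("chunk-1", "chunk-9", "chunk-3");
        // prints [chunk-1, chunk-3, chunk-9, chunk-7]
        System.out.println(fuse(vectorHits, keywordHits));
    }
}
```

Because RRF uses only rank positions, the cosine distances from the vector channel and the BM25 scores from the keyword channel never need to be normalized onto a common scale.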

### Enterprise-Grade Security

- **Three-level role permissions**: Admin / Group Owner / Member, following the principle of least privilege.
- **JWT dual-token flow**: Access Token (15 min) + Refresh Token (`httpOnly` Cookie + database rotation).
- **BCrypt password hashing** with forced password changes.
- **Group data isolation**: both vector retrieval and Elasticsearch retrieval apply `groupId` filters to prevent cross-group data leakage.
- **AOP operation logs**: critical operations are tracked end to end.
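
The refresh-token rotation mentioned above can be sketched as a single-use token store: every refresh consumes the presented token and issues a replacement, so a replayed token is rejected. This is a hedged illustration only. The class and method names are hypothetical, an in-memory map stands in for the database table, and a production store would typically persist a hash of the token with an expiry rather than the raw value.

```java
import java.security.SecureRandom;
import java.util.Base64;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Illustrative refresh-token rotation; an in-memory map stands in for the database. */
class RefreshTokenRotationSketch {

    /** token -> username; hypothetical stand-in for a refresh-token table. */
    private final Map<String, String> activeTokens = new ConcurrentHashMap<>();
    private final SecureRandom random = new SecureRandom();

    /** Issues a new opaque refresh token for the user and records it as active. */
    String issue(String username) {
        byte[] bytes = new byte[32];
        random.nextBytes(bytes);
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
        activeTokens.put(token, username);
        return token;
    }

    /**
     * Consumes the presented token and returns its replacement.
     * Returns empty when the token is unknown or was already rotated.
     */
    Optional<String> rotate(String presentedToken) {
        if (presentedToken == null) {
            return Optional.empty();
        }
        // remove() is atomic, so two concurrent refreshes cannot both succeed.
        String username = activeTokens.remove(presentedToken);
        return username == null ? Optional.empty() : Optional.of(issue(username));
    }

    public static void main(String[] args) {
        RefreshTokenRotationSketch store = new RefreshTokenRotationSketch();
        String first = store.issue("alice");
        System.out.println("first refresh accepted: " + store.rotate(first).isPresent());
        System.out.println("replay accepted: " + store.rotate(first).isPresent());
    }
}
```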

## System Architecture

```mermaid
graph TB
    subgraph "Frontend"
        A["Vue 3 SPA<br/>Element Plus + Pinia"]
    end

    subgraph "Gateway"
        B["JWT authentication filter<br/>Access Token + Refresh Token"]
    end

    subgraph "Business Services"
        C["Authentication and authorization<br/>Registration/login/token management"]
        D["Group collaboration<br/>Create/invite/approve/member management"]
        E["Document management<br/>Chunked upload/preview/download/soft delete"]
        F["ETL pipeline<br/>Parse/clean/chunk/vectorize/index"]
        G["Knowledge Q&A<br/>Query planning/hybrid retrieval/evidence evaluation/answer generation"]
        H["AI assistant<br/>ReactAgent/session management/short-term memory/SSE streaming"]
    end

    subgraph "Data and Retrieval Engines"
        I[("PostgreSQL<br/>+ pgvector<br/>HNSW vector index")]
        J[("Elasticsearch<br/>+ IK analyzer<br/>keyword retrieval")]
        K[("MinIO<br/>object storage<br/>document persistence")]
    end

    subgraph "AI Model Layer"
        L["DashScope<br/>Qwen Chat"]
        M["DashScope<br/>text-embedding-v3"]
    end

    A --> B
    B --> C & D & E & F & G & H
    E --> K
    F --> I & J & K
    G --> I & J & L
    H --> G & L
    E -.->|Triggered asynchronously by Spring Event| F
```

## Feature Modules

### User Authentication and Group Collaboration

- User registration/login, JWT dual-token authentication, and role permissions (Admin / regular user).
- Create knowledge-base groups, invite members through invite codes, and handle join requests with approval workflows.
- Three group roles: Owner / Manager / Member, with fine-grained permission control.

### Full Document Lifecycle Management

- **Chunked upload protocol**: three stages (`init -> chunk upload -> complete`) with resumable upload and instant-upload detection via SHA-256.
- **Multi-format parsing**: PDF / DOCX / MD / TXT with automatic encoding detection.
- **Asynchronous ETL pipeline**: Spring Event + `@Async` + `@Retryable`, with seven fully automated processing steps.
- **Object storage**: MinIO S3-compatible storage, enabled on demand through `@ConditionalOnProperty`.

### Knowledge Base Q&A (RAG Q&A)

- **LLM query planning**: automatically chooses `DIRECT`, `REWRITE`, or `DECOMPOSE`, with up to three parallel retrieval queries.
- **RRF dual-channel fusion**: unifies vector and keyword results, with cluster aggregation and neighbor-window expansion.
- **Four-level evidence evaluation**: `NONE -> WEAK -> PARTIAL -> SUFFICIENT`; the system refuses to answer when evidence is insufficient.
- **Citation provenance**: each answer includes cited chunks, source documents, and relevance scores.

### AI Assistant

- **ReactAgent graph execution engine**: a complete "think -> tool call -> generate response" flow.
- **CHAT / KB_SEARCH dual modes**: pure conversation or knowledge base retrieval, dynamically switchable inside one session.
- **BEFORE_MODEL hook**: automatically injects context before model calls (`compact summary -> session memory -> recent messages`).
- **Three-level short-term memory compression**:
  - L1 session memory: incremental LLM summaries that preserve key facts and decisions.
  - L2 compact summary: condensed historical context that removes redundant details.
  - L3 runtime truncation: the final safeguard when tokens exceed 50,000.
- **SSE streaming output**: delta deduplication plus `AGENT_MODEL_FINISHED` fallback.

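
The last memory level, runtime truncation, can be sketched as a loop that keeps only the most recent messages that fit a token budget. The 50,000-token limit is the figure given above. Everything else here is an assumption for illustration: the class name, the plain-string message type, and the rough "about four characters per token" estimate, which a real system would replace with the model's tokenizer.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Illustrative runtime truncation; the token estimate is a rough assumption. */
class RuntimeTruncationSketch {

    /** Limit quoted in this README for the final truncation safeguard. */
    static final int TOKEN_LIMIT = 50_000;

    /** Hypothetical heuristic: about four characters per token, rounded up. */
    static int estimateTokens(String text) {
        return (text.length() + 3) / 4;
    }

    /** Keeps the newest messages whose combined estimate fits within tokenLimit, in original order. */
    static List<String> truncate(List<String> messages, int tokenLimit) {
        Deque<String> kept = new ArrayDeque<>();
        int used = 0;
        // Walk from newest to oldest and stop at the first message that would overflow.
        for (int i = messages.size() - 1; i >= 0; i--) {
            int cost = estimateTokens(messages.get(i));
            if (used + cost > tokenLimit) {
                break;
            }
            kept.addFirst(messages.get(i));
            used += cost;
        }
        return new ArrayList<>(kept);
    }

    public static void main(String[] args) {
        List<String> history = List.of("oldest message", "middle message", "newest message");
        System.out.println(truncate(history, TOKEN_LIMIT));
    }
}
```

Dropping from the oldest end first matches the role of this level as a last resort: by the time it runs, older turns are assumed to be already represented in the session memory and compact summary.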
## Technology Stack

### Backend Core

| Layer | Technology | Version | Description |
|------|------|------|------|
| Language | **Java** | 21 | Records, virtual threads, pattern matching |
| Framework | **Spring Boot** | 3.5.0 | Spring MVC, Jakarta EE 9+ |
| ORM | **MyBatis-Plus** | 3.5.15 | Lambda type-safe queries |
| Database | **PostgreSQL + pgvector** | 16+ | HNSW vector index, `COSINE_DISTANCE` |
| Search Engine | **Elasticsearch** | 8.x | IK Chinese analyzer, direct JDK HttpClient integration |
| Object Storage | **MinIO** | latest | S3-compatible storage, `composeObject` for chunk merging |
| AI Chat | **Spring AI Alibaba** | 1.1.2.0 | Native DashScope integration (Qwen) |
| AI Agent | **Spring AI Alibaba Agent** | 1.1.2.0 | ReactAgent graph execution engine |
| AI Embedding | **Spring AI** | 1.1.2 | OpenAI-compatible mode, 512-dimensional vectors |
| Authentication | **JJWT** | 0.12.6 | HMAC-SHA256 JWT issuing and parsing |
| Password Hashing | **Spring Security Crypto** | - | BCrypt adaptive hashing |
| Document Parsing | **Apache PDFBox / POI** | 2.0.31 / 5.2.5 | PDF + DOCX text extraction |
| Retry Framework | **Spring Retry** | - | Declarative retry with `@Retryable` + `@Recover` |
| API Docs | **Knife4j + SpringDoc** | 4.5.0 | Enhanced `/doc.html` UI and online debugging |

### Frontend Core

| Layer | Technology | Version |
|------|------|------|
| Language | **TypeScript** | 6.0 |
| Framework | **Vue 3** (Composition API) | 3.5 |
| Build Tool | **Vite** | 8.0 |
| Routing | **Vue Router** | 5.0 |
| State Management | **Pinia** | 3.0 |
| UI Components | **Element Plus** | 2.14 |
| HTTP Client | **Axios** | 1.16 |
| Markdown | **marked** | 18.0 |

### Infrastructure

| Component | Purpose |
|------|------|
| **PostgreSQL + pgvector** | Relational primary storage + HNSW vector index (512 dimensions, `COSINE_DISTANCE`) |
| **Elasticsearch 8.x** | IK Chinese analyzer + BM25 keyword retrieval + two-stage bool/rescore scoring |
| **MinIO** | S3-compatible object storage, chunk merging with `composeObject`, conditional assembly |
| **DashScope** | Qwen Chat model + text-embedding-v3 embedding model |

## Quick Start

### Requirements

| Component | Version | Description |
|------|---------|------|
| **JDK** | 21 | Records and virtual threads |
| **Node.js** | >= 20.19 | Frontend build |
| **PostgreSQL** | 16+ | Requires the `pgvector` extension |
| **Elasticsearch** | 8.x | Requires the IK Chinese analyzer plugin |
| **MinIO** | latest | Object storage, optional |
| **DashScope API Key** | - | Shared by LLM Chat and Embedding |

### 1. Initialize Middleware

#### PostgreSQL + pgvector

```bash
# Install pgvector extension
psql -h <host> -U <user> -d <database> -c "CREATE EXTENSION IF NOT EXISTS vector;"

# Run schema script
psql -h <host> -U <user> -d <database> -f sql/schema.sql
```

#### MinIO (Docker)

```bash
docker run -d --name minio \
  -p 9000:9000 -p 9001:9001 \
  -e MINIO_ROOT_USER=minioadmin \
  -e MINIO_ROOT_PASSWORD=minioadmin \
  minio/minio server /data --console-address ":9001"

# Visit http://localhost:9001 and create a bucket.
# Default bucket: argus-rag-documents
```

#### Elasticsearch + IK Analyzer

```bash
# Replace every 8.x below with the exact Elasticsearch version you run;
# the IK plugin version must match the Elasticsearch version.
docker run -d --name elasticsearch \
  -p 9200:9200 -p 9300:9300 \
  -e "discovery.type=single-node" \
  -e "xpack.security.enabled=false" \
  elasticsearch:8.x

# Install IK analyzer
docker exec -it elasticsearch bin/elasticsearch-plugin install \
  https://github.com/medcl/elasticsearch-analysis-ik/releases/download/v8.x/elasticsearch-analysis-ik-8.x.zip

docker restart elasticsearch
```

### 2. Configure Environment

Edit `Argus-backend/src/main/resources/application-local.yml` and fill in database, middleware, and LLM settings:

```yaml
# Database
spring.datasource.url: jdbc:postgresql://localhost:5432/argus_rag
spring.datasource.username: your_username
spring.datasource.password: your_password

# LLM
spring.ai.dashscope.api-key: ${DASHSCOPE_API_KEY}
spring.ai.openai.api-key: ${DASHSCOPE_API_KEY}   # Shared with DashScope

# Object storage (optional)
storage.minio.endpoint: http://localhost:9000
storage.minio.access-key: minioadmin
storage.minio.secret-key: minioadmin

# Elasticsearch
elasticsearch.host: localhost
elasticsearch.port: 9200
```

### 3. Start Backend

```bash
# Set JDK 21
export JAVA_HOME="/path/to/jdk-21"

cd Argus-backend

# Compile
./mvnw clean compile

# Start. The default profile is local, and the default port is 10001.
./mvnw spring-boot:run

# API docs: http://localhost:10001/doc.html
```

### 4. Start Frontend

```bash
cd Argus-frontend
npm install
npm run dev

# Visit: http://localhost:5173
```

### Default Account

In the development environment (`--spring.profiles.active=dev`), an administrator account is created automatically:

| Username | Password | Role |
|--------|------|------|
| `admin` | `admin123` | System Administrator |

## API Overview

### Authentication · `/api/auth`

| Method | Path | Description |
|------|------|------|
| POST | `/api/auth/register` | Register a user |
| POST | `/api/auth/login` | Log in and return Access + Refresh Token |
| POST | `/api/auth/refresh` | Refresh token using the Refresh Token in the cookie |
| POST | `/api/auth/logout` | Log out and clear the Refresh Token |
| GET | `/api/auth/me` | Get current user information |

### Group Collaboration · `/api/groups`

| Method | Path | Description |
|------|------|------|
| POST | `/api/groups` | Create a group |
| GET | `/api/groups` | Query visible groups |
| POST | `/api/groups/{id}/invitations` | Create an invitation |
| POST | `/api/groups/{id}/join-request` | Request to join |
| POST | `/api/groups/invitations/{id}/accept` | Accept an invitation |
| DELETE | `/api/groups/{id}/members/{userId}` | Remove a member |

### Document Management · `/api/documents`

| Method | Path | Description |
|------|------|------|
| POST | `/api/documents/upload/init` | Initialize chunked upload with instant/resume detection |
| POST | `/api/documents/upload/chunks` | Upload a chunk |
| POST | `/api/documents/upload/{id}/complete` | Complete upload and trigger ETL |
| POST | `/api/documents/upload` | Upload a small file directly (<=10 MB) |
| GET | `/api/documents` | List documents with multiple filters |
| GET | `/api/documents/{id}/preview` | Preview a document |
| GET | `/api/documents/{id}/download` | Download a document |
| DELETE | `/api/documents/{id}` | Soft-delete a document |

### Knowledge Base Q&A · `/api/qa`

| Method | Path | Description |
|------|------|------|
| POST | `/api/qa/ask` | Submit a question and receive an AI answer with citations |

#### Request/Response Example

**Request**:

```json
{
  "groupId": 1,
  "question": "What is the document upload flow? How can I retry after an upload failure?"
}
```

**Response**:

```json
{
  "answered": true,
  "answer": "The document upload flow has three stages: initialization, chunk upload, and final merge. First, call /upload/init to initialize the session...",
  "citations": [
    {
      "documentId": 1,
      "chunkId": 15,
      "fileName": "Argus User Manual.pdf",
      "score": 0.97
    }
  ]
}
```
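
On the client side, the response shape above maps naturally onto Java records. The sketch below is an assumption-laden illustration, not part of the Argus codebase: the field names mirror the JSON example, but the record and class names, the numeric types, and the `renderSources` helper are hypothetical, and JSON deserialization is left to whichever library the caller already uses.

```java
import java.util.List;
import java.util.Locale;
import java.util.stream.Collectors;

/** Illustrative client-side model of the /api/qa/ask response; type names are assumptions. */
class AskResponseSketch {

    /** One cited chunk; fields mirror the JSON example in this README. */
    record Citation(long documentId, long chunkId, String fileName, double score) {}

    /** Top-level response; fields mirror the JSON example in this README. */
    record AskResponse(boolean answered, String answer, List<Citation> citations) {}

    /** Hypothetical helper that renders citations as a one-line source list. */
    static String renderSources(AskResponse response) {
        if (!response.answered() || response.citations().isEmpty()) {
            return "No sources cited";
        }
        return response.citations().stream()
                .map(c -> String.format(Locale.ROOT, "%s (chunk %d, score %.2f)",
                        c.fileName(), c.chunkId(), c.score()))
                .collect(Collectors.joining("; "));
    }

    public static void main(String[] args) {
        AskResponse response = new AskResponse(true, "The document upload flow has three stages...",
                List.of(new Citation(1, 15, "Argus User Manual.pdf", 0.97)));
        // prints Argus User Manual.pdf (chunk 15, score 0.97)
        System.out.println(renderSources(response));
    }
}
```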

### AI Assistant · `/api/assistant`

| Method | Path | Description |
|------|------|------|
| POST | `/api/assistant/sessions` | Create a new session |
| GET | `/api/assistant/sessions` | List sessions |
| PATCH | `/api/assistant/sessions/{id}` | Rename a session |
| DELETE | `/api/assistant/sessions/{id}` | Delete a session |
| POST | `/api/assistant/chat` | Synchronous chat (`CHAT` / `KB_SEARCH`) |
| POST | `/api/assistant/chat/stream` | Streaming chat over SSE |
| GET | `/api/assistant/sessions/{id}/context` | Get session context, including summaries |

## Project Structure

```text
Argus/
├── Argus-backend/                    # Spring Boot backend
│   └── src/main/java/com/argus/rag/
│       ├── auth/                     # Authentication and authorization (JWT dual tokens)
│       ├── user/                     # User management
│       ├── group/                    # Group collaboration (invitations/approval/roles)
│       ├── document/                 # Document management (chunked upload/preview/download)
│       ├── ingestion/                # ETL pipeline (parse/chunk/vectorize/index)
│       │   └── service/pipeline/
│       │       ├── reader/           # Document readers
│       │       ├── parser/           # Multi-format parsers (PDF/DOCX/MD/TXT)
│       │       └── transformer/      # Text cleaning + structure-aware chunking
│       ├── qa/                       # Knowledge Q&A (query planning/RRF fusion/evidence evaluation)
│       │   └── rag/                  # Hybrid retrieval engine
│       ├── assistant/                # AI assistant (ReactAgent/short-term memory/SSE streaming)
│       │   ├── agent/                # Agent factory + knowledge base retrieval tools
│       │   ├── memory/               # Three-level short-term memory compression
│       │   └── service/              # Conversation orchestration + session management
│       └── engine/                   # Infrastructure (ES/PGvector/MinIO)
│
├── Argus-frontend/                   # Vue 3 frontend
│   └── src/
│       ├── api/                      # Backend API wrappers
│       ├── views/                    # Page components
│       │   ├── HomeView.vue          # Product home page
│       │   ├── documents/            # Document management
│       │   ├── qa/                   # Knowledge Q&A
│       │   ├── assistant/            # AI assistant
│       │   ├── groups/               # Collaboration groups
│       │   └── admin/                # User management
│       ├── stores/                   # Pinia stores
│       └── components/               # Shared components
│
├── docs/                             # Project documentation
│   ├── V1.0-项目文档.md               # User authentication + group management
│   ├── V2.0-项目文档.md               # Document upload + ETL pipeline
│   ├── V3.0-项目文档.md               # Knowledge Q&A (RAG)
│   └── V4.0-项目文档.md               # AI Assistant Agent + streaming chat
│
└── sql/
    └── schema.sql                    # Database schema DDL
```

## Version Evolution

Argus follows a **progressive iteration** development model, with each version focusing on one core theme:

| Version | Theme | Core Deliverables |
|------|------|---------|
| **V1.0** | Infrastructure | User authentication (JWT dual tokens), group collaboration (invitation/approval/roles), project skeleton |
| **V2.0** | Document Engine | Chunked upload (resume/instant upload), ETL pipeline, dual-channel retrieval (vector + keyword) |
| **V3.0** | RAG Q&A | Query planning (LLM), RRF fusion ranking, four-level evidence evaluation, citation provenance |
| **V4.0** | AI Agent | ReactAgent graph engine, CHAT/KB_SEARCH dual modes, three-level short-term memory compression, SSE streaming |

> For detailed design decisions and technical documentation, see the [`docs/`](docs/) directory.

## Contributing

Issues and Pull Requests are welcome. Before submitting a PR, please make sure:

- The code compiles successfully (`./mvnw clean compile`).
- The existing code style and naming conventions are followed.
- New features include appropriate JavaDoc or comments.

## License

This project is licensed under the [MIT](LICENSE) License.

---

Made with love by the Argus team · Every answer should be traceable