---
name: rag-frameworks
description: Use when "RAG", "retrieval augmented generation", "LangChain", "LlamaIndex", "sentence transformers", "embeddings", "document QA", "chatbot with documents", "semantic search"
version: 1.0.0
---

# RAG Frameworks

Frameworks for building retrieval-augmented generation applications.

## Comparison

| Framework | Best For | Learning Curve | Flexibility |
|-----------|----------|----------------|-------------|
| **LangChain** | Agents, chains, tools | Steeper | Highest |
| **LlamaIndex** | Data indexing, simple RAG | Gentle | Medium |
| **Sentence Transformers** | Custom embeddings | Low | High |

---

## LangChain

Orchestration framework for building complex LLM applications.

**Core concepts:**

- **Chains**: Sequential operations (retrieve → prompt → generate)
- **Agents**: LLM decides which tools to use
- **LCEL**: Declarative pipeline syntax with `|` operator
- **Retrievers**: Abstract interface to vector stores

**Strengths**: Rich ecosystem, many integrations, agent capabilities

**Limitations**: Abstractions can be confusing, rapid API changes

**Key concept**: LCEL (LangChain Expression Language) for composable pipelines.

---

## LlamaIndex

Data framework focused on connecting LLMs to external data.

**Core concepts:**

- **Documents → Nodes**: Automatic chunking and indexing
- **Index types**: Vector, keyword, tree, knowledge graph
- **Query engines**: Retrieve and synthesize answers
- **Chat engines**: Stateful conversation over data

**Strengths**: Simple API, great for document QA, data connectors

**Limitations**: Less flexible for complex agent workflows

**Key concept**: "Load data, index it, query it" - simpler mental model than LangChain.

---

## Sentence Transformers

Generate high-quality embeddings for semantic similarity.

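
Semantic similarity between embeddings is typically measured with cosine similarity: encode each text into a vector, then compare vectors by angle. The snippet below is a minimal sketch of that comparison step only. `toy_embed` is a hypothetical stand-in (a bag-of-words count vector) so the example runs with the standard library alone; it matches on word overlap, not meaning, and in practice the vectors would come from a real embedding model. The function names and sample documents are illustrative assumptions.

```python
import math
from collections import Counter


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for a real encoder: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query: str, documents: list[str], top_k: int = 2):
    """Return the top_k (score, document) pairs, highest similarity first."""
    # Documents and query are encoded independently of each other.
    doc_vectors = [toy_embed(doc) for doc in documents]
    query_vector = toy_embed(query)
    scored = [
        (cosine_similarity(query_vector, vec), doc)
        for vec, doc in zip(doc_vectors, documents)
    ]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


documents = [
    "the cat sat on the mat",
    "stock prices fell sharply today",
    "a cat chased the mouse",
]
results = rank_by_similarity("where is the cat", documents, top_k=2)
for score, doc in results:
    print(f"{score:.3f}  {doc}")
```

Because documents are encoded without reference to the query, their vectors can be computed once and stored, leaving only the query to encode at search time.
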
**Popular models:**

| Model | Dimensions | Quality | Speed |
|-------|------------|---------|-------|
| all-MiniLM-L6-v2 | 384 | Good | Fast |
| all-mpnet-base-v2 | 768 | Better | Medium |
| e5-large-v2 | 1024 | Best | Slow |

**Key concept**: Bi-encoder architecture - encode query and documents separately, compare with cosine similarity.

---

## RAG Architecture Patterns

| Pattern | Description | When to Use |
|---------|-------------|-------------|
| **Naive RAG** | Retrieve top-k, stuff in prompt | Simple QA |
| **Parent-Child** | Retrieve chunks, return parent docs | Context preservation |
| **Hybrid Search** | Vector + keyword search | Better recall |
| **Re-ranking** | Retrieve many, re-rank with cross-encoder | Higher precision |
| **Query Expansion** | Generate variations of query | Ambiguous queries |

---

## Decision Guide

| Scenario | Recommendation |
|----------|----------------|
| Simple document QA | LlamaIndex |
| Complex agents/tools | LangChain |
| Custom embedding pipeline | Sentence Transformers |
| Production RAG | LangChain or custom |
| Quick prototype | LlamaIndex |
| Maximum control | Build custom with Sentence Transformers |

## Resources

- LangChain:
- LlamaIndex:
- Sentence Transformers:
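
As an illustration of the Hybrid Search pattern listed under RAG Architecture Patterns, the sketch below merges a vector-search ranking with a keyword-search ranking. The document does not say how the two result lists should be combined; reciprocal rank fusion (RRF) is used here as one common option, and the document IDs, the function name, and the constant `k=60` are assumptions for illustration only.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked highly by multiple retrievers rise to the top.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending); break ties by ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical outputs of two separate retrievers for the same query.
vector_hits = ["doc_a", "doc_b", "doc_c"]
keyword_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([vector_hits, keyword_hits])
print(fused)
```

RRF uses only rank positions, so the vector and keyword scores never have to be put on a common scale. Documents found by both retrievers (`doc_a`, `doc_c`) end up ahead of those found by only one.
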