---
name: hybrid-search
description: Use when building search systems that need both semantic similarity and keyword matching - covers combining vector and BM25 search with Reciprocal Rank Fusion, alpha tuning for search weight control, and optimizing retrieval quality
version: 0.5.0
---

# LLMemory Hybrid Search

## Installation

```bash
uv add llmemory
# or
pip install llmemory
```

## Overview

Hybrid search combines **vector similarity search** (semantic understanding) with **full-text search** (keyword matching) to deliver superior retrieval quality. Results are merged using **Reciprocal Rank Fusion (RRF)** to create a unified ranking.

**When to use hybrid search:**
- Need both semantic similarity AND exact keyword matches
- Queries contain specific terms, names, or technical jargon
- Want best-of-both-worlds retrieval quality (recommended default)

**When to use vector-only search:**
- Purely semantic/conceptual queries
- Cross-lingual search
- Queries with synonyms or paraphrasing

**When to use text-only search:**
- Exact keyword/phrase matching required
- Search in structured data or code
- When embeddings are not available

## Quick Start

```python
from llmemory import LLMemory, SearchType

async with LLMemory(connection_string="postgresql://localhost/mydb") as memory:
    # Hybrid search (default, recommended)
    results = await memory.search(
        owner_id="workspace-1",
        query_text="machine learning algorithms",
        search_type=SearchType.HYBRID,
        limit=10,
        alpha=0.5  # Equal weight to vector and text
    )

    for result in results:
        print(f"[RRF={result.rrf_score:.3f}] {result.content[:80]}...")
```

## Complete API Documentation

### SearchType Enum

```python
class SearchType(str, Enum):
    VECTOR = "vector"  # Vector similarity only
    TEXT = "text"      # Full-text search only
    HYBRID = "hybrid"  # Combines vector + text (recommended)
```

### search() - Hybrid Mode

**Signature:**

```python
async def search(
    owner_id: str,
    query_text: str,
    search_type: Union[SearchType, str] = SearchType.HYBRID,
    limit: int = 10,
    alpha: float = 0.5,
    metadata_filter: Optional[Dict[str, Any]] = None,
    id_at_origins: Optional[List[str]] = None,
    date_from: Optional[datetime] = None,
    date_to: Optional[datetime] = None,
    include_parent_context: bool = False,
    context_window: int = 2
) -> List[SearchResult]
```

**Hybrid Search Parameters:**

- `search_type` (SearchType, default: HYBRID): Set to `SearchType.HYBRID` for hybrid search
- `alpha` (float, default: 0.5): Weight for vector vs text search
  - `0.0` = text search only
  - `0.5` = equal weight (balanced, recommended)
  - `1.0` = vector search only
  - `0.3` = favor text search (good for keyword-heavy queries)
  - `0.7` = favor vector search (good for semantic queries)

**Returns:**

- `List[SearchResult]` with hybrid-specific fields:
  - `rrf_score` (float): Reciprocal Rank Fusion score (primary ranking)
  - `similarity` (float): Vector similarity score (0-1)
  - `text_rank` (float): Full-text search rank
  - `score` (float): Overall score (equals rrf_score for hybrid)

**Example:**

```python
# Balanced hybrid search
results = await memory.search(
    owner_id="workspace-1",
    query_text="quarterly revenue growth",
    search_type=SearchType.HYBRID,
    alpha=0.5,  # Equal weight
    limit=20
)

for result in results:
    print(f"RRF Score: {result.rrf_score:.3f}")
    print(f"Vector Similarity: {result.similarity:.3f}")
    print(f"Text Rank: {result.text_rank:.3f}")
    print(f"Content: {result.content[:100]}...")
    print("---")
```
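Because the signature types `search_type` as `Union[SearchType, str]`, the plain string form should be accepted as well, which is handy when the search type comes from configuration. A minimal sketch:

```python
# Same search as above, with the search type supplied as a plain string
# (accepted per the Union[SearchType, str] signature)
results = await memory.search(
    owner_id="workspace-1",
    query_text="quarterly revenue growth",
    search_type="hybrid",  # equivalent to SearchType.HYBRID
    limit=20
)
```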
## Understanding Alpha Parameter

The `alpha` parameter controls the balance between vector and text search in hybrid mode.

### Alpha Values Guide

```python
# Text-heavy (alpha = 0.0 to 0.3)
# Use when: Query has specific keywords, names, or technical terms
results = await memory.search(
    owner_id="workspace-1",
    query_text="Python asyncio gather timeout",
    search_type=SearchType.HYBRID,
    alpha=0.3  # Favor keyword matching
)

# Balanced (alpha = 0.4 to 0.6)
# Use when: General queries, uncertain which is better
results = await memory.search(
    owner_id="workspace-1",
    query_text="customer retention strategies",
    search_type=SearchType.HYBRID,
    alpha=0.5  # Equal weight (recommended default)
)

# Semantic-heavy (alpha = 0.7 to 1.0)
# Use when: Conceptual queries, synonyms, paraphrasing
results = await memory.search(
    owner_id="workspace-1",
    query_text="ways to keep customers happy",
    search_type=SearchType.HYBRID,
    alpha=0.7  # Favor semantic similarity
)
```

### Choosing Alpha for Different Query Types

| Query Type | Example | Recommended Alpha | Reasoning |
|------------|---------|-------------------|-----------|
| Specific keywords | "PostgreSQL CONNECTION_LIMIT error" | 0.2-0.3 | Need exact keyword matches |
| Product/person names | "iPhone 15 Pro specifications" | 0.3-0.4 | Names matter more than semantics |
| Technical jargon | "SOLID principles dependency injection" | 0.4-0.5 | Balance needed |
| General concepts | "improve team collaboration" | 0.5-0.6 | Balanced approach |
| Semantic queries | "how to motivate employees" | 0.6-0.7 | Semantic understanding key |
| Paraphrased questions | "what are good ways to retain staff" | 0.7-0.8 | Vector search excels |

## Reciprocal Rank Fusion (RRF)

Hybrid search uses RRF to merge vector and text search results into a unified ranking.

### How RRF Works

```python
k = 50  # RRF constant, library default (prevents top-ranked results from dominating)

# Initialize score accumulator for each chunk
rrf_scores = {}

# Process vector search results (ranked best-first)
for rank, result in enumerate(vector_results):
    chunk_id = result["chunk_id"]
    vector_contribution = alpha / (k + rank + 1)
    rrf_scores[chunk_id] = rrf_scores.get(chunk_id, 0) + vector_contribution

# Process text search results (ranked best-first)
for rank, result in enumerate(text_results):
    chunk_id = result["chunk_id"]
    text_contribution = (1 - alpha) / (k + rank + 1)
    rrf_scores[chunk_id] = rrf_scores.get(chunk_id, 0) + text_contribution

# Sort by accumulated RRF score descending
sorted_results = sorted(rrf_scores.items(), key=lambda x: x[1], reverse=True)
```

**Key points:**

- Alpha weights each method's per-rank contribution: `alpha / (k + rank + 1)` for vector results, `(1 - alpha) / (k + rank + 1)` for text results
- Rank is 1-indexed in the formula (`rank + 1`, where `rank` starts at 0)
- Chunks appearing in **both** result lists get contributions from both
- k = 50 by default (configurable via `SearchConfig.rrf_k`)
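To make the fusion concrete, here is a self-contained toy example applying the same formula. The chunk IDs and rankings are invented for illustration; this is not library code:

```python
def rrf_fuse(vector_ids, text_ids, alpha=0.5, k=50):
    """Merge two ranked chunk-ID lists with Reciprocal Rank Fusion."""
    scores = {}
    for rank, chunk_id in enumerate(vector_ids):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + alpha / (k + rank + 1)
    for rank, chunk_id in enumerate(text_ids):
        scores[chunk_id] = scores.get(chunk_id, 0.0) + (1 - alpha) / (k + rank + 1)
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

vector_ids = ["A", "B", "C"]  # semantic ranking, best first
text_ids = ["B", "D", "A"]    # keyword ranking, best first

for chunk_id, score in rrf_fuse(vector_ids, text_ids):
    print(f"{chunk_id}: {score:.4f}")
# B: 0.0194  <- ranked high in both lists; consensus wins
# A: 0.0192
# D: 0.0096
# C: 0.0094
```

Note how "B" outranks "A" even though "A" tops the vector list: appearing near the top of *both* lists beats topping one.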
### RRF Benefits

1. **Handles different score scales**: Vector similarities (0-1) and text ranks (varying) are normalized
2. **Position-based fusion**: Emphasizes consensus across search methods
3. **Robust to score outliers**: Single high score doesn't dominate
4. **Tunable with alpha**: Control the balance between search methods

### Example: RRF in Action

```python
results = await memory.search(
    owner_id="workspace-1",
    query_text="machine learning neural networks",
    search_type=SearchType.HYBRID,
    alpha=0.5,
    limit=5
)

for i, result in enumerate(results, 1):
    print(f"Result #{i}")
    print(f"  RRF Score: {result.rrf_score:.4f}")
    print(f"  Vector Sim: {result.similarity:.4f} (semantic match)")
    print(f"  Text Rank: {result.text_rank:.4f} (keyword match)")
    print(f"  Content: {result.content[:80]}...")
    print()

# Output shows how RRF balances both signals:
# Result #1
#   RRF Score: 0.0196 (near the 1/(k+1) maximum -- top of both lists)
#   Vector Sim: 0.8500 (very semantically similar)
#   Text Rank: 12.5000 (good keyword match)
#   Content: Deep learning uses neural networks with multiple layers...
```

## Configuring Hybrid Search with SearchConfig

LLMemory's `SearchConfig` provides fine-grained control over hybrid search behavior, including HNSW vector index parameters and RRF fusion settings. You can configure these settings via environment variables or programmatically through `LLMemoryConfig`.

### HNSW Index Configuration

The HNSW (Hierarchical Navigable Small World) index powers fast approximate nearest neighbor vector search. LLMemory provides three preset profiles and supports custom configuration.

#### HNSW Parameters

- **`hnsw_m`** (int, default: 16): Number of bi-directional links per node
  - Higher values = better recall, larger index, slower construction
  - Range: 8-64, typical values: 8 (fast), 16 (balanced), 32 (accurate)
- **`hnsw_ef_construction`** (int, default: 200): Size of dynamic candidate list during index construction
  - Higher values = better index quality, slower construction
  - Range: 80-1000, typical values: 80 (fast), 200 (balanced), 400 (accurate)
- **`hnsw_ef_search`** (int, default: 100): Size of dynamic candidate list during search
  - Higher values = better recall, slower search
  - Range: 40-500, typical values: 40 (fast), 100 (balanced), 200 (accurate)

#### HNSW Presets

LLMemory includes three built-in presets for common use cases:

```python
HNSW_PRESETS = {
    "fast": {
        "m": 8,
        "ef_construction": 80,
        "ef_search": 40
    },
    "balanced": {
        "m": 16,
        "ef_construction": 200,
        "ef_search": 100
    },
    "accurate": {
        "m": 32,
        "ef_construction": 400,
        "ef_search": 200
    }
}
```

**Preset Recommendations:**

- **fast**: Latency-critical applications (40-60ms search, ~95% recall)
- **balanced**: General-purpose use (80-120ms search, ~98% recall) - **Default**
- **accurate**: High-precision requirements (150-250ms search, ~99.5% recall)

#### Using HNSW Presets via Environment Variable

Set the `LLMEMORY_HNSW_PROFILE` environment variable to use a preset:

```bash
# Use fast profile for low-latency applications
export LLMEMORY_HNSW_PROFILE=fast

# Use accurate profile for high-precision requirements
export LLMEMORY_HNSW_PROFILE=accurate

# Use balanced profile (default, can be omitted)
export LLMEMORY_HNSW_PROFILE=balanced
```

Then initialize LLMemory normally - the preset will be applied automatically:

```python
from llmemory import LLMemory, SearchType

# Automatically uses HNSW preset from environment
async with LLMemory(connection_string="postgresql://localhost/mydb") as memory:
    results = await memory.search(
        owner_id="workspace-1",
        query_text="machine learning",
        search_type=SearchType.HYBRID,
        limit=10
    )
```
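If you want to choose a preset from a latency budget, a small helper over the figures above may be convenient. The `pick_profile` function and its thresholds are illustrative, not part of llmemory:

```python
import os

def pick_profile(max_latency_ms: int) -> str:
    """Map a rough latency budget to a preset, using the figures above (illustrative)."""
    if max_latency_ms < 100:
        return "fast"       # ~40-60ms search, ~95% recall
    if max_latency_ms < 200:
        return "balanced"   # ~80-120ms search, ~98% recall
    return "accurate"       # ~150-250ms search, ~99.5% recall

# Apply before initializing LLMemory
os.environ["LLMEMORY_HNSW_PROFILE"] = pick_profile(max_latency_ms=150)
```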
#### Programmatic HNSW Configuration

For more control, configure HNSW parameters programmatically:

```python
from llmemory import LLMemory, SearchType
from llmemory.config import LLMemoryConfig

# Create custom configuration
config = LLMemoryConfig()

# Configure search parameters
config.search.hnsw_ef_search = 150  # Higher search accuracy

# Configure database/index parameters
config.database.hnsw_m = 24
config.database.hnsw_ef_construction = 300

# Initialize with custom config
async with LLMemory(
    connection_string="postgresql://localhost/mydb",
    config=config
) as memory:
    results = await memory.search(
        owner_id="workspace-1",
        query_text="neural networks",
        search_type=SearchType.HYBRID,
        limit=10
    )
```

**Note:** Index construction parameters (`hnsw_m`, `hnsw_ef_construction`) only affect **new indexes**. To apply them to an existing index, you must recreate the index:

```sql
-- Recreate HNSW index with new parameters
DROP INDEX IF EXISTS llmemory.document_chunks_embedding_hnsw;

CREATE INDEX document_chunks_embedding_hnsw
ON llmemory.document_chunks
USING hnsw (embedding vector_cosine_ops)
WITH (m = 24, ef_construction = 300);
```
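To run the same rebuild from Python, here is a minimal sketch using `asyncpg` (any PostgreSQL driver works; the DSN and parameter values are placeholders, and the SQL is the statement shown above):

```python
import asyncpg

async def rebuild_hnsw_index(dsn: str, m: int = 24, ef_construction: int = 300) -> None:
    """Drop and recreate the HNSW index with new construction parameters."""
    conn = await asyncpg.connect(dsn)
    try:
        await conn.execute(
            "DROP INDEX IF EXISTS llmemory.document_chunks_embedding_hnsw"
        )
        await conn.execute(
            f"CREATE INDEX document_chunks_embedding_hnsw "
            f"ON llmemory.document_chunks "
            f"USING hnsw (embedding vector_cosine_ops) "
            f"WITH (m = {m}, ef_construction = {ef_construction})"
        )
    finally:
        await conn.close()

# await rebuild_hnsw_index("postgresql://localhost/mydb")
```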
### RRF Configuration

The `rrf_k` parameter controls the Reciprocal Rank Fusion constant used to merge vector and text search results.

#### RRF Parameter

- **`rrf_k`** (int, default: 50): RRF constant that controls rank position sensitivity
  - Higher values = less weight on top positions, more democratic fusion
  - Lower values = more weight on top positions, favors high-ranking results
  - Range: 10-100, typical values: 30 (aggressive), 50 (balanced), 70 (democratic)

**How rrf_k affects fusion** (contributions shown with alpha = 1.0 for clarity):

```python
# For a chunk at rank position r (0-indexed):
rrf_score_contribution = alpha / (rrf_k + r + 1)

# Example with rrf_k=50:
# Rank 0:  1.0 / (50 + 0 + 1)  = 0.0196
# Rank 1:  1.0 / (50 + 1 + 1)  = 0.0192
# Rank 10: 1.0 / (50 + 10 + 1) = 0.0164

# Example with rrf_k=20 (favors top results):
# Rank 0:  1.0 / (20 + 0 + 1)  = 0.0476
# Rank 1:  1.0 / (20 + 1 + 1)  = 0.0455
# Rank 10: 1.0 / (20 + 10 + 1) = 0.0323

# Example with rrf_k=80 (more democratic):
# Rank 0:  1.0 / (80 + 0 + 1)  = 0.0123
# Rank 1:  1.0 / (80 + 1 + 1)  = 0.0122
# Rank 10: 1.0 / (80 + 10 + 1) = 0.0110
```

#### Configuring RRF

**Note:** Currently, `rrf_k` is not directly exposed via an environment variable. To configure it, use programmatic configuration:

```python
from llmemory import LLMemory
from llmemory.config import LLMemoryConfig

config = LLMemoryConfig()
config.search.rrf_k = 30  # Favor top-ranked results

async with LLMemory(
    connection_string="postgresql://localhost/mydb",
    config=config
) as memory:
    results = await memory.search(
        owner_id="workspace-1",
        query_text="search query",
        search_type=SearchType.HYBRID,
        limit=10
    )
```

### Complete Configuration Example

Here's a complete example showing both environment variable and programmatic configuration:

```python
import os
from llmemory import LLMemory, SearchType
from llmemory.config import LLMemoryConfig

# Option 1: Environment variable configuration
os.environ["LLMEMORY_HNSW_PROFILE"] = "accurate"
# HNSW will use: m=32, ef_construction=400, ef_search=200

async with LLMemory(connection_string="postgresql://localhost/mydb") as memory:
    results = await memory.search(
        owner_id="workspace-1",
        query_text="deep learning transformers",
        search_type=SearchType.HYBRID,
        alpha=0.6,
        limit=15
    )

# Option 2: Programmatic configuration with fine-tuning
config = LLMemoryConfig()

# HNSW search configuration
config.search.hnsw_ef_search = 150  # Higher accuracy than default

# HNSW index construction (for new indexes)
config.database.hnsw_m = 20
config.database.hnsw_ef_construction = 250

# RRF configuration
config.search.rrf_k = 40  # Favor top-ranked results slightly

# Other search settings
config.search.default_limit = 20
config.search.default_search_type = "hybrid"

async with LLMemory(
    connection_string="postgresql://localhost/mydb",
    config=config
) as memory:
    # Search with custom configuration
    results = await memory.search(
        owner_id="workspace-1",
        query_text="neural network architectures",
        search_type=SearchType.HYBRID,
        alpha=0.5,
        limit=20
    )

    for result in results:
        print(f"RRF: {result.rrf_score:.4f} | "
              f"Vector: {result.similarity:.4f} | "
              f"Text: {result.text_rank:.4f}")
        print(f"  {result.content[:80]}...")
```

### Configuration Performance Impact

Different HNSW settings have measurable performance impacts:

| Profile | Index Size (100k docs) | Construction Time | Search Latency | Recall |
|---------|------------------------|-------------------|----------------|--------|
| fast | 150 MB | 5 min | 40-60ms | ~95% |
| balanced | 250 MB | 12 min | 80-120ms | ~98% |
| accurate | 450 MB | 30 min | 150-250ms | ~99.5% |

**Tuning Guidelines:**

1. **Start with balanced** (default) for most applications
2. **Use fast** if:
   - Search latency must be under 100ms
   - Recall around 95% is acceptable
   - Index size is a constraint
3. **Use accurate** if:
   - High precision is critical (medical, legal, financial)
   - Search latency under 300ms is acceptable
   - Maximum recall is required
4. **Custom tune** if:
   - You have specific latency/recall requirements
   - You've measured performance with your data (see the sketch below)
   - You're optimizing for your embedding model
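When custom-tuning, measure against your own data and queries. A minimal latency-comparison sketch using only the documented configuration API (the query list, DSN, and `ef_search` values are illustrative):

```python
import time
from llmemory import LLMemory, SearchType
from llmemory.config import LLMemoryConfig

async def avg_latency_ms(ef_search: int, queries: list[str]) -> float:
    """Average hybrid-search latency (ms) for a given hnsw_ef_search value."""
    config = LLMemoryConfig()
    config.search.hnsw_ef_search = ef_search

    async with LLMemory(
        connection_string="postgresql://localhost/mydb",
        config=config
    ) as memory:
        start = time.perf_counter()
        for query in queries:
            await memory.search(
                owner_id="workspace-1",
                query_text=query,
                search_type=SearchType.HYBRID,
                limit=10
            )
        return (time.perf_counter() - start) * 1000 / len(queries)

# queries = [...]  # a representative sample of real user queries
# print(await avg_latency_ms(40, queries))   # fast-profile search setting
# print(await avg_latency_ms(200, queries))  # accurate-profile search setting
```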
## Search Type Comparison

### Vector Search Only

```python
# Pure semantic similarity
results = await memory.search(
    owner_id="workspace-1",
    query_text="artificial intelligence",
    search_type=SearchType.VECTOR,
    limit=10
)

# Good for:
# - "AI" matching "machine learning" (synonym)
# - "dog" matching "puppy" (semantic)
# - Cross-lingual search
#
# Weak for:
# - Specific keywords ("PostgreSQL 14.2")
# - Exact phrases ("return on investment")
# - Technical terms ("ValueError exception")
```

### Text Search Only

```python
# Pure keyword matching
results = await memory.search(
    owner_id="workspace-1",
    query_text="PostgreSQL CONNECTION_LIMIT",
    search_type=SearchType.TEXT,
    limit=10
)

# Good for:
# - Exact keyword matches
# - Technical error messages
# - Code search
# - Structured data
#
# Weak for:
# - Synonyms ("automobile" vs "car")
# - Paraphrasing
# - Conceptual queries
```

### Hybrid Search (Recommended)

```python
# Combines both vector and text
results = await memory.search(
    owner_id="workspace-1",
    query_text="reduce server response time",
    search_type=SearchType.HYBRID,
    alpha=0.5,
    limit=10
)

# Strengths:
# - Finds semantically similar content ("optimize latency")
# - Also finds exact keywords ("response time")
# - Best overall retrieval quality
# - Robust to different query styles
#
# Use cases:
# - General-purpose search (recommended default)
# - Unknown query patterns
# - Mixed keyword + semantic needs
```

## Practical Examples

### E-commerce Product Search

```python
# Product search benefits from hybrid
# - Vector: Understands "laptop for programming"
# - Text: Matches exact model numbers "MacBook Pro M3"
results = await memory.search(
    owner_id="store-1",
    query_text="fast laptop for developers",
    search_type=SearchType.HYBRID,
    alpha=0.6,  # Favor semantic understanding
    metadata_filter={"category": "computers"},
    limit=20
)
```

### Technical Documentation Search

```python
# Documentation needs both semantic and exact matches
# - Vector: Finds conceptually related docs
# - Text: Finds exact function/class names
results = await memory.search(
    owner_id="docs-site",
    query_text="authenticate users with OAuth2",
    search_type=SearchType.HYBRID,
    alpha=0.4,  # Slight favor to keywords ("OAuth2")
    metadata_filter={"doc_type": "api_reference"},
    limit=15
)
```

### Customer Support Search

```python
# Support tickets need semantic understanding
# - Vector: Matches similar issues ("can't log in" = "login failed")
# - Text: Matches error codes, product names
results = await memory.search(
    owner_id="support-team",
    query_text="error code 500 payment processing",
    search_type=SearchType.HYBRID,
    alpha=0.3,  # Favor exact error codes
    metadata_filter={"status": "resolved"},
    limit=10
)
```

### Research Paper Search

```python
# Academic search benefits from semantic understanding
# - Vector: Finds related concepts and methods
# - Text: Finds exact citations, author names
results = await memory.search(
    owner_id="research-db",
    query_text="transformer attention mechanism",
    search_type=SearchType.HYBRID,
    alpha=0.7,  # Favor semantic similarity
    date_from=datetime(2020, 1, 1),  # Recent papers
    limit=25
)
```

## Performance Optimization

### Hybrid Search Performance

Hybrid search runs vector and text searches **in parallel** for optimal performance:
```python
# Both searches execute concurrently
# Total time ≈ max(vector_time, text_time) + rrf_fusion_time
# Typically: 50-150ms for hybrid search
import time

start = time.time()
results = await memory.search(
    owner_id="workspace-1",
    query_text="customer retention",
    search_type=SearchType.HYBRID,
    limit=20
)
elapsed = (time.time() - start) * 1000

print(f"Search completed in {elapsed:.2f}ms")
```

### Tuning for Speed vs Quality

```python
# Faster hybrid search (fewer candidates)
results = await memory.search(
    owner_id="workspace-1",
    query_text="query text",
    search_type=SearchType.HYBRID,
    limit=10,  # Lower limit = faster
    alpha=0.5
)

# Higher quality hybrid search (more candidates considered)
# Note: Uses internal candidate multiplier (typically limit * 2)
results = await memory.search(
    owner_id="workspace-1",
    query_text="query text",
    search_type=SearchType.HYBRID,
    limit=20,  # Higher limit for better recall
    alpha=0.5
)
```

## Advanced Filtering with Hybrid Search

```python
# Combine hybrid search with metadata filters
results = await memory.search(
    owner_id="workspace-1",
    query_text="financial performance analysis",
    search_type=SearchType.HYBRID,
    alpha=0.5,
    metadata_filter={
        "department": "finance",
        "year": 2024,
        "confidential": False
    },
    date_from=datetime(2024, 1, 1),
    date_to=datetime(2024, 12, 31),
    limit=15
)

# Hybrid search finds:
# - Vector: Similar financial concepts
# - Text: Exact keyword "performance analysis"
# - Both filtered by metadata and date range
```

## Common Mistakes

❌ **Wrong: Always using default alpha=0.5**

```python
# This works but may not be optimal
results = await memory.search(
    owner_id="workspace-1",
    query_text="iPhone 14 Pro specs",  # Specific product name
    search_type=SearchType.HYBRID,
    alpha=0.5  # Equal weight not ideal here
)
```

✅ **Right: Tune alpha for query type**

```python
# Product names and specific terms favor text search
results = await memory.search(
    owner_id="workspace-1",
    query_text="iPhone 14 Pro specs",
    search_type=SearchType.HYBRID,
    alpha=0.3  # Favor exact keyword matching
)
```

❌ **Wrong: Using VECTOR for exact keyword matching**

```python
results = await memory.search(
    owner_id="workspace-1",
    query_text="ERROR CODE 404",
    search_type=SearchType.VECTOR  # Won't find exact "404"
)
```

✅ **Right: Use HYBRID or TEXT for exact keywords**

```python
results = await memory.search(
    owner_id="workspace-1",
    query_text="ERROR CODE 404",
    search_type=SearchType.HYBRID,
    alpha=0.2  # Heavily favor exact keywords
)
```

❌ **Wrong: Using TEXT for conceptual queries**

```python
results = await memory.search(
    owner_id="workspace-1",
    query_text="how to improve customer satisfaction",
    search_type=SearchType.TEXT  # Misses semantic matches
)
```

✅ **Right: Use HYBRID for conceptual queries**

```python
results = await memory.search(
    owner_id="workspace-1",
    query_text="how to improve customer satisfaction",
    search_type=SearchType.HYBRID,
    alpha=0.7  # Favor semantic understanding
)
```

## Alpha Tuning Strategies

### A/B Testing Different Alpha Values

```python
# Test different alpha values to find optimal setting
query = "product launch strategy roadmap"
alpha_values = [0.3, 0.5, 0.7]

for alpha in alpha_values:
    results = await memory.search(
        owner_id="workspace-1",
        query_text=query,
        search_type=SearchType.HYBRID,
        alpha=alpha,
        limit=10
    )

    print(f"\nAlpha = {alpha}")
    for i, result in enumerate(results[:3], 1):
        print(f"  #{i}: {result.content[:60]}... (RRF={result.rrf_score:.4f})")

# Compare results quality and adjust
```
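To compare alpha settings more systematically than by eyeballing output, you can quantify how much the top results shift between runs. A small helper, assuming only the documented `chunk_id` field on results:

```python
def topk_overlap(results_a, results_b, k: int = 10) -> float:
    """Jaccard overlap of the top-k chunk IDs from two searches."""
    ids_a = {r.chunk_id for r in results_a[:k]}
    ids_b = {r.chunk_id for r in results_b[:k]}
    union = ids_a | ids_b
    return len(ids_a & ids_b) / len(union) if union else 1.0

# low = await memory.search(owner_id="workspace-1", query_text=query,
#                           search_type=SearchType.HYBRID, alpha=0.3, limit=10)
# high = await memory.search(owner_id="workspace-1", query_text=query,
#                            search_type=SearchType.HYBRID, alpha=0.7, limit=10)
# print(f"Top-10 overlap between alpha=0.3 and 0.7: {topk_overlap(low, high):.0%}")
```

A low overlap means alpha materially changes what users see for that query class, so tuning it is worth the effort.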
(RRF={result.rrf_score:.4f})") # Compare results quality and adjust ``` ### Dynamic Alpha Based on Query Analysis ```python def calculate_alpha(query_text: str) -> float: """Dynamically adjust alpha based on query characteristics.""" # Check for exact phrases (quotes) if '"' in query_text: return 0.2 # Favor exact matching # Check for technical terms or codes if any(char.isdigit() or char.isupper() for char in query_text.split()): return 0.3 # Favor keywords # Check for question words (semantic query) question_words = ["how", "why", "what", "when", "where", "who"] if any(word in query_text.lower() for word in question_words): return 0.7 # Favor semantic # Default balanced return 0.5 # Use dynamic alpha query = "how to optimize database queries" alpha = calculate_alpha(query) results = await memory.search( owner_id="workspace-1", query_text=query, search_type=SearchType.HYBRID, alpha=alpha, limit=10 ) ``` ## Monitoring and Debugging ### Understanding Result Scores ```python results = await memory.search( owner_id="workspace-1", query_text="test query", search_type=SearchType.HYBRID, alpha=0.5, limit=5 ) for result in results: # Inspect individual scores print(f"Chunk ID: {result.chunk_id}") print(f" RRF Score: {result.rrf_score:.4f} (overall ranking)") print(f" Vector Similarity: {result.similarity:.4f}") print(f" Text Rank: {result.text_rank:.4f}") print(f" Content preview: {result.content[:80]}...") print() # Look for: # - High RRF but low similarity = text search dominated # - High RRF but low text rank = vector search dominated # - High in both = strong consensus (best results) ``` ## Related Skills - `basic-usage` - Core document and search operations - `multi-query` - Query expansion for better hybrid search results - `rag` - Using hybrid search in RAG systems with reranking - `multi-tenant` - Multi-tenant isolation patterns ## Important Notes **HNSW Configuration:** Hybrid search uses HNSW (Hierarchical Navigable Small World) index for fast vector similarity. Performance can be tuned with `LLMEMORY_HNSW_PROFILE` environment variable or programmatically via `SearchConfig`. See the "Configuring Hybrid Search with SearchConfig" section for comprehensive configuration details including: - Three presets: `fast`, `balanced` (default), `accurate` - Individual HNSW parameters (m, ef_construction, ef_search) - RRF tuning with `rrf_k` parameter - Performance impact comparison table **Language Support:** Text search automatically detects document language and uses appropriate full-text search configuration (supports 14+ languages including English, Spanish, French, German, etc.). **Embedding Models:** Vector search quality depends on embedding model. Default is OpenAI `text-embedding-3-small` (1536 dimensions). For local embeddings, use `all-MiniLM-L6-v2` (384 dimensions). **Search Limits:** Hybrid search internally retrieves `limit * 2` candidates from each search method before RRF fusion. This ensures high-quality results even when vector and text return different chunks.