---
name: bedrock-knowledge-bases
description: Amazon Bedrock Knowledge Bases for RAG (Retrieval-Augmented Generation). Create knowledge bases with vector stores, ingest data from S3/web/Confluence/SharePoint, configure chunking strategies, query with retrieve and generate APIs, manage sessions. Use when building RAG applications, implementing semantic search, creating document Q&A systems, integrating knowledge bases with agents, optimizing chunking for accuracy, or querying enterprise knowledge.
allowed-tools:
  - Read
  - Write
  - Edit
  - Bash
  - Grep
  - Glob
triggers:
  - RAG
  - knowledge base
  - vector database
  - semantic search
  - document retrieval
  - chunking
  - embeddings
  - OpenSearch
  - S3 vectors
  - Neptune GraphRAG
  - retrieve and generate
  - ingestion
  - data source
---

# Amazon Bedrock Knowledge Bases

Amazon Bedrock Knowledge Bases is a fully managed RAG (Retrieval-Augmented Generation) solution that handles data ingestion, embedding generation, vector storage, retrieval with reranking, source attribution, and session context management.

## Overview

### What It Does

Amazon Bedrock Knowledge Bases provides:
- **Data Ingestion**: Automatically process documents from S3, web, Confluence, SharePoint, Salesforce
- **Embedding Generation**: Convert text to vectors using foundation models
- **Vector Storage**: Store embeddings in multiple vector database options
- **Retrieval**: Semantic and hybrid search with metadata filtering
- **Generation**: RAG workflows with source attribution
- **Session Management**: Multi-turn conversations with context
- **Chunking Strategies**: Fixed, semantic, hierarchical, and custom chunking

### When to Use This Skill

Use this skill when you need to:
- Build RAG applications for document Q&A
- Implement semantic search over enterprise knowledge
- Create chatbots with knowledge bases
- Integrate retrieval with Bedrock Agents
- Configure optimal chunking strategies
- Query documents with source attribution
- Manage multi-turn conversations with context
- Optimize RAG performance and cost

### Key Capabilities

1. **Multiple Vector Store Options**: OpenSearch, S3 Vectors, Neptune, Pinecone, MongoDB, Redis
2. **Flexible Data Sources**: S3, web crawlers, Confluence, SharePoint, Salesforce
3. **Advanced Chunking**: Fixed-size, semantic, hierarchical, custom Lambda
4. **Hybrid Search**: Combine semantic (vector) and keyword search
5. **Session Management**: Built-in conversation context tracking
6. **GraphRAG**: Relationship-aware retrieval with Neptune Analytics
7. **Cost Optimization**: S3 Vectors for up to 90% storage savings

---

## Quick Start

### Basic RAG Workflow

```python
import boto3
import json

# Initialize clients
bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

# 1. Create Knowledge Base
kb_response = bedrock_agent.create_knowledge_base(
    name='enterprise-docs-kb',
    description='Company documentation knowledge base',
    roleArn='arn:aws:iam::123456789012:role/BedrockKBRole',
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            'embeddingModelArn': 'arn:aws:bedrock:us-east-1::foundation-model/amazon.titan-embed-text-v2:0'
        }
    },
    storageConfiguration={
        'type': 'OPENSEARCH_SERVERLESS',
        'opensearchServerlessConfiguration': {
            'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
            'vectorIndexName': 'bedrock-knowledge-base-index',
            'fieldMapping': {
                'vectorField': 'bedrock-knowledge-base-default-vector',
                'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                'metadataField': 'AMAZON_BEDROCK_METADATA'
            }
        }
    }
)

knowledge_base_id = kb_response['knowledgeBase']['knowledgeBaseId']
print(f"Knowledge Base ID: {knowledge_base_id}")

# 2. Add S3 Data Source
ds_response = bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-documents',
    description='Company documents from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['documents/']
        }
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20
            }
        }
    }
)

data_source_id = ds_response['dataSource']['dataSourceId']

# 3. Start Ingestion
ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Initial document ingestion'
)

print(f"Ingestion Job ID: {ingestion_response['ingestionJob']['ingestionJobId']}")

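# 4. Query with Retrieve and Generate
# NOTE: ingestion is asynchronous. Wait until get_ingestion_job reports
# status COMPLETE (see "Monitor Ingestion Job") before querying; otherwise
# the knowledge base has nothing to retrieve yet.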
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is our vacation policy?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': knowledge_base_id,
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            }
        }
    }
)

print(f"Answer: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']['s3Location']['uri']}")
```
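
The citation loop above assumes every source lives in S3. Knowledge bases that also ingest from web, Confluence, SharePoint, or Salesforce return a different location sub-object per source type. A minimal sketch of a more tolerant lookup follows, assuming each location sub-object carries either a `uri` or a `url` field. `source_uri` is a hypothetical helper, not part of boto3.

```python
def source_uri(location):
    """Return the first 'uri' or 'url' found in a citation location dict, else None.

    Assumption: each source type nests its address one level down,
    e.g. {'type': 'S3', 's3Location': {'uri': ...}}.
    """
    for value in location.values():
        if isinstance(value, dict):
            address = value.get('uri') or value.get('url')
            if address:
                return address
    return None


# Example with hand-written location dicts (no AWS call involved)
print(source_uri({'type': 'S3', 's3Location': {'uri': 's3://my-docs-bucket/documents/policy.pdf'}}))
print(source_uri({'type': 'WEB', 'webLocation': {'url': 'https://www.example.com/docs'}}))
```

In the citation loop, `source_uri(reference['location'])` could replace the hard-coded `['s3Location']['uri']` lookup.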

---

## Vector Store Options

### 1. Amazon OpenSearch Serverless

**Best for**: Production RAG applications with auto-scaling requirements

**Benefits**:
- Fully managed, serverless operation
- Auto-scaling compute and storage
- High availability with multi-AZ deployment
- Fast query performance

**Configuration**:

```python
storageConfiguration={
    'type': 'OPENSEARCH_SERVERLESS',
    'opensearchServerlessConfiguration': {
        'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
        'vectorIndexName': 'bedrock-knowledge-base-index',
        'fieldMapping': {
            'vectorField': 'bedrock-knowledge-base-default-vector',
            'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
            'metadataField': 'AMAZON_BEDROCK_METADATA'
        }
    }
}
```

### 2. Amazon S3 Vectors (Preview)

**Best for**: Cost-optimized, large-scale RAG applications

**Benefits**:
- Up to 90% cost reduction for vector storage
- Built-in vector support in S3
- Subsecond query performance
- Massive scale and durability

**Ideal Use Cases**:
- Large document collections (millions of chunks)
- Cost-sensitive applications
- Archival knowledge bases
- Low-to-medium QPS workloads

**Configuration**:

```python
storageConfiguration={
    'type': 'S3_VECTORS',
    's3VectorsConfiguration': {
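        # S3 Vectors uses a vector bucket and a vector index, not a general-purpose bucket and prefix
        'vectorBucketArn': 'arn:aws:s3vectors:us-east-1:123456789012:bucket/my-vector-bucket',
        'indexName': 'bedrock-kb-index'  # placeholder index name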
    }
}
```

**Limitations**:
- Still in preview (no CloudFormation/CDK support yet)
- Not suitable for high QPS, millisecond-latency requirements
- Best for cost optimization over ultra-low latency

### 3. Amazon Neptune Analytics (GraphRAG)

**Best for**: Interconnected knowledge domains requiring relationship-aware retrieval

**Benefits**:
- Automatic graph creation linking related content
- Improved retrieval accuracy through relationships
- Comprehensive responses leveraging knowledge graph
- Explainable results with relationship context

**Use Cases**:
- Legal document analysis with case precedents
- Scientific research with paper citations
- Product catalogs with dependencies
- Organizational knowledge with team relationships

**Configuration**:

```python
storageConfiguration={
    'type': 'NEPTUNE_ANALYTICS',
    'neptuneAnalyticsConfiguration': {
        'graphArn': 'arn:aws:neptune-graph:us-east-1:123456789012:graph/g-12345678',
        # Field mapping for the text and metadata stored with each embedding
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

### 4. Amazon OpenSearch Service Managed Cluster

**Best for**: Existing OpenSearch infrastructure, advanced customization

**Configuration**:

```python
storageConfiguration={
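    'type': 'OPENSEARCH_MANAGED_CLUSTER',
    'opensearchManagedClusterConfiguration': {
        'domainArn': 'arn:aws:es:us-east-1:123456789012:domain/my-domain',
        'domainEndpoint': 'https://my-domain.us-east-1.es.amazonaws.com',  # placeholder endpoint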
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

### 5. Third-Party Vector Databases

**Pinecone**:

```python
storageConfiguration={
    'type': 'PINECONE',
    'pineconeConfiguration': {
        'connectionString': 'https://my-index-abc123.svc.us-west1-gcp.pinecone.io',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:pinecone-api-key',
        'namespace': 'bedrock-kb',
        'fieldMapping': {
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

**MongoDB Atlas**:

```python
storageConfiguration={
    'type': 'MONGO_DB_ATLAS',
    'mongoDbAtlasConfiguration': {
        'endpoint': 'https://cluster0.mongodb.net',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:mongodb-creds',
        'databaseName': 'bedrock_kb',
        'collectionName': 'vectors',
        'vectorIndexName': 'vector_index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

**Redis Enterprise Cloud**:

```python
storageConfiguration={
    'type': 'REDIS_ENTERPRISE_CLOUD',
    'redisEnterpriseCloudConfiguration': {
        'endpoint': 'redis-12345.c1.us-east-1-2.ec2.cloud.redislabs.com:12345',
        'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:redis-creds',
        'vectorIndexName': 'bedrock-kb-index',
        'fieldMapping': {
            'vectorField': 'embedding',
            'textField': 'text',
            'metadataField': 'metadata'
        }
    }
}
```

---

## Data Source Configuration

### 1. Amazon S3

**Supported File Types**: PDF, TXT, MD, HTML, DOC, DOCX, CSV, XLS, XLSX

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='s3-technical-docs',
    description='Technical documentation from S3',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': 'arn:aws:s3:::my-docs-bucket',
            'inclusionPrefixes': ['docs/technical/', 'docs/manuals/'],
            'exclusionPrefixes': ['docs/archive/']
        }
    }
)
```

### 2. Web Crawler

**Automatic website scraping and indexing**:

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='company-website',
    description='Public company website content',
    dataSourceConfiguration={
        'type': 'WEB',
        'webConfiguration': {
            'sourceConfiguration': {
                'urlConfiguration': {
                    'seedUrls': [
                        {'url': 'https://www.example.com/docs'},
                        {'url': 'https://www.example.com/blog'}
                    ]
                }
            },
            'crawlerConfiguration': {
                'crawlerLimits': {
                    'rateLimit': 300  # Max URLs crawled per host per minute
                }
            }
        }
    }
)
```

### 3. Confluence

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='confluence-wiki',
    description='Company Confluence knowledge base',
    dataSourceConfiguration={
        'type': 'CONFLUENCE',
        'confluenceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.atlassian.net/wiki',
                'hostType': 'SAAS',
                'authType': 'BASIC',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:confluence-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Space',
                                'inclusionFilters': ['Engineering', 'Product'],
                                'exclusionFilters': ['Archive']
                            }
                        ]
                    }
                }
            }
        }
    }
)
```

### 4. SharePoint

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='sharepoint-docs',
    description='SharePoint document library',
    dataSourceConfiguration={
        'type': 'SHAREPOINT',
        'sharePointConfiguration': {
            'sourceConfiguration': {
                'siteUrls': [
                    'https://company.sharepoint.com/sites/Engineering',
                    'https://company.sharepoint.com/sites/Product'
                ],
                'tenantId': 'tenant-id',
                'domain': 'company',
                'hostType': 'ONLINE',  # SharePoint Online
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:sharepoint-creds'
            }
        }
    }
)
```

### 5. Salesforce

```python
bedrock_agent.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='salesforce-knowledge',
    description='Salesforce knowledge articles',
    dataSourceConfiguration={
        'type': 'SALESFORCE',
        'salesforceConfiguration': {
            'sourceConfiguration': {
                'hostUrl': 'https://company.my.salesforce.com',
                'authType': 'OAUTH2_CLIENT_CREDENTIALS',
                'credentialsSecretArn': 'arn:aws:secretsmanager:us-east-1:123456789012:secret:salesforce-creds'
            },
            'crawlerConfiguration': {
                'filterConfiguration': {
                    'type': 'PATTERN',
                    'patternObjectFilter': {
                        'filters': [
                            {
                                'objectType': 'Knowledge',
                                'inclusionFilters': ['Product_Documentation', 'Support_Articles']
                            }
                        ]
                    }
                }
            }
        }
    }
)
```

---

## Chunking Strategies

### 1. Fixed-Size Chunking

**Best for**: Simple documents with uniform structure

**How it works**: Splits text into chunks of fixed token size with overlap

**Parameters**:
- `maxTokens`: 200-8192 tokens (typically 512-1024)
- `overlapPercentage`: 10-50% (typically 20%)

**Configuration**:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'FIXED_SIZE',
        'fixedSizeChunkingConfiguration': {
            'maxTokens': 512,
            'overlapPercentage': 20
        }
    }
}
```

**Use Cases**:
- Blog posts and articles
- Technical documentation with consistent formatting
- FAQs and Q&A content
- Simple text files

**Pros**:
- Fast and predictable
- No additional costs
- Easy to tune

**Cons**:
- May split semantic units awkwardly
- Doesn't respect document structure
- Can break context mid-sentence
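
To make the effect of `maxTokens` and `overlapPercentage` concrete, here is a minimal sketch of fixed-size chunking over an already-tokenized list. It is an illustration of the idea only: `fixed_size_chunks` is a hypothetical helper, and the tokenizer and exact boundary rules Bedrock uses are not described in this document.

```python
def fixed_size_chunks(tokens, max_tokens, overlap_percentage):
    """Split a token list into windows of max_tokens that share an overlap."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap_percentage < 100:
        raise ValueError("overlap_percentage must be in [0, 100)")

    overlap = max_tokens * overlap_percentage // 100
    step = max_tokens - overlap  # always >= 1 because overlap < max_tokens

    chunks = []
    start = 0
    while start < len(tokens):
        chunks.append(tokens[start:start + max_tokens])
        if start + max_tokens >= len(tokens):
            break  # the last window already reached the end
        start += step
    return chunks


# Ten tokens, windows of four, 50% overlap: each window starts two tokens later
for chunk in fixed_size_chunks(list(range(10)), max_tokens=4, overlap_percentage=50):
    print(chunk)
```

A larger overlap reduces the chance that a sentence is cut off from its context, at the price of more chunks to embed and store.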

### 2. Semantic Chunking

**Best for**: Documents without clear boundaries (legal, technical, academic)

**How it works**: Uses sentence similarity to group related content

**Parameters**:
- `maxTokens`: 20-8192 tokens (typically 300-500)
- `bufferSize`: Number of neighboring sentences (default: 1)
- `breakpointPercentileThreshold`: Similarity threshold (recommended: 95%)

**Configuration**:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'SEMANTIC',
        'semanticChunkingConfiguration': {
            'maxTokens': 300,
            'bufferSize': 1,
            'breakpointPercentileThreshold': 95
        }
    }
}
```

**Use Cases**:
- Legal documents and contracts
- Academic papers
- Technical specifications
- Medical records
- Research reports

**Pros**:
- Preserves semantic meaning
- Better context preservation
- Improved retrieval accuracy

**Cons**:
- Additional cost (foundation model usage)
- Slower ingestion
- Less predictable chunk sizes

**Cost Consideration**: Semantic chunking uses foundation models for similarity analysis, incurring additional costs beyond storage and retrieval.
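
The role of `breakpointPercentileThreshold` is easier to see in code. The sketch below assumes the dissimilarity between each pair of adjacent sentences has already been computed (for example from embeddings) and starts a new chunk wherever that value exceeds the chosen percentile. It illustrates the general percentile-breakpoint technique; the exact procedure Bedrock runs internally is an assumption here, and `percentile` and `split_at_breakpoints` are hypothetical helpers.

```python
import math


def percentile(values, pct):
    """Linear-interpolated percentile (pct in 0..100) of a non-empty list."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * pct / 100
    low, high = math.floor(rank), math.ceil(rank)
    return ordered[low] + (ordered[high] - ordered[low]) * (rank - low)


def split_at_breakpoints(sentences, distances, threshold_pct):
    """Group sentences into chunks, breaking where distance exceeds the percentile.

    distances[i] is the dissimilarity between sentences[i] and sentences[i + 1].
    """
    if not sentences:
        return []
    if len(distances) != len(sentences) - 1:
        raise ValueError("need exactly one distance per adjacent sentence pair")
    if not distances:
        return [list(sentences)]

    cutoff = percentile(distances, threshold_pct)
    chunks, current = [], [sentences[0]]
    for sentence, distance in zip(sentences[1:], distances):
        if distance > cutoff:  # topic shift: close the current chunk
            chunks.append(current)
            current = []
        current.append(sentence)
    chunks.append(current)
    return chunks


sentences = ['a', 'b', 'c', 'd', 'e']
distances = [0.1, 0.9, 0.2, 0.1]  # made-up values for illustration
print(split_at_breakpoints(sentences, distances, threshold_pct=75))
```

A higher threshold yields fewer, larger chunks because only the sharpest topic shifts count as breakpoints.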

### 3. Hierarchical Chunking

**Best for**: Complex documents with nested structure

**How it works**: Creates parent and child chunks; retrieves child, returns parent for context

**Parameters**:
- `levelConfigurations`: Array of chunk sizes (parent → child)
- `overlapTokens`: Overlap between chunks

**Configuration**:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'HIERARCHICAL',
        'hierarchicalChunkingConfiguration': {
            'levelConfigurations': [
                {
                    'maxTokens': 1500  # Parent chunk (comprehensive context)
                },
                {
                    'maxTokens': 300   # Child chunk (focused retrieval)
                }
            ],
            'overlapTokens': 60
        }
    }
}
```

**Use Cases**:
- Technical manuals with sections and subsections
- Academic papers with abstract, sections, and subsections
- Legal documents with articles and clauses
- Product documentation with categories and details

**How Retrieval Works**:
1. Query matches against child chunks (fast, focused)
2. Returns parent chunks (comprehensive context)
3. Best of both: precision retrieval + complete context
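
The child-to-parent step can be sketched as a small lookup. This is a conceptual illustration with in-memory dictionaries, not the service's implementation; `parents_for_hits` and the ID scheme are hypothetical.

```python
def parents_for_hits(child_hits, parent_of, parents):
    """Map ranked child-chunk hits to their parent chunks, deduplicated in rank order."""
    seen = set()
    ordered = []
    for child_id in child_hits:
        parent_id = parent_of[child_id]
        if parent_id not in seen:  # several children may share one parent
            seen.add(parent_id)
            ordered.append(parents[parent_id])
    return ordered


parent_of = {'c1': 'p1', 'c3': 'p2', 'c4': 'p2'}
parents = {'p1': 'Parent one text', 'p2': 'Parent two text'}
print(parents_for_hits(['c3', 'c1', 'c4'], parent_of, parents))
```

Because several child hits can collapse into one parent, the number of chunks returned may be smaller than `numberOfResults`.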

**Pros**:
- Optimal balance of precision and context
- Excellent for nested documents
- Better accuracy for complex queries

**Cons**:
- More complex configuration
- Larger storage footprint
- Requires understanding of document structure

### 4. Custom Chunking (Lambda)

**Best for**: Specialized domain logic, custom parsing requirements

**How it works**: Invoke Lambda function for custom chunking logic

**Configuration**:

```python
vectorIngestionConfiguration={
    'chunkingConfiguration': {
        'chunkingStrategy': 'NONE'  # Custom via Lambda
    },
    'customTransformationConfiguration': {
        'intermediateStorage': {
            's3Location': {
                'uri': 's3://my-kb-bucket/intermediate/'
            }
        },
        'transformations': [
            {
                'stepToApply': 'POST_CHUNKING',
                'transformationFunction': {
                    'transformationLambdaConfiguration': {
                        'lambdaArn': 'arn:aws:lambda:us-east-1:123456789012:function:custom-chunker'
                    }
                }
            }
        ]
    }
}
```

**Example Lambda Handler**:

```python
# Lambda function for custom chunking
import json

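def lambda_handler(event, context):
    """
    Custom chunking logic for specialized documents.

    NOTE: the event shape used here is a simplified illustration. A real
    transformation Lambda receives references to intermediate files in S3
    (the intermediateStorage location) rather than inline document content,
    and returns references to the files it wrote. Check the current AWS
    documentation for the exact contract before deploying.

    Input: event contains document content and metadata
    Output: array of chunks with text and metadata
    """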

    # Extract document content
    document = event['document']
    content = document['content']
    metadata = document.get('metadata', {})

    # Custom chunking logic (example: split by custom delimiter)
    chunks = []
    sections = content.split('---SECTION---')

    for idx, section in enumerate(sections):
        if section.strip():
            chunks.append({
                'text': section.strip(),
                'metadata': {
                    **metadata,
                    'chunk_id': f'section_{idx}',
                    'chunk_type': 'custom_section'
                }
            })

    return {
        'chunks': chunks
    }
```

**Use Cases**:
- Medical records with structured sections (SOAP notes)
- Financial documents with tables and calculations
- Code documentation with code blocks and explanations
- Domain-specific formats (HL7, FHIR, etc.)

**Pros**:
- Complete control over chunking logic
- Can handle any document format
- Integrate domain expertise

**Cons**:
- Requires Lambda development and maintenance
- Additional operational complexity
- Harder to debug and iterate

### Chunking Strategy Selection Guide

| Document Type | Recommended Strategy | Rationale |
|--------------|---------------------|-----------|
| Blog posts, articles | Fixed-size | Simple, uniform structure |
| Legal documents | Semantic | Preserve legal reasoning flow |
| Technical manuals | Hierarchical | Nested sections and subsections |
| Academic papers | Hierarchical | Abstract, sections, subsections |
| FAQs | Fixed-size | Independent Q&A pairs |
| Medical records | Custom Lambda | Structured sections (SOAP, HL7) |
| Code documentation | Custom Lambda | Code blocks + explanations |
| Product catalogs | Fixed-size | Uniform product descriptions |
| Research reports | Semantic | Preserve research narrative |

---

## Retrieval Operations

### 1. Retrieve API (Retrieval Only)

Returns raw retrieved chunks without generation.

**Use Cases**:
- Custom generation logic
- Debugging retrieval quality
- Building custom RAG pipelines
- Integrating with non-Bedrock models

```python
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'What are the benefits of hierarchical chunking?'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 5,
            'overrideSearchType': 'HYBRID',  # SEMANTIC, HYBRID
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'technical_guide'
                        }
                    },
                    {
                        'greaterThan': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    }
                ]
            }
        }
    }
)

# Process retrieved chunks
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content: {result['content']['text']}")
    print(f"Location: {result['location']}")
    print(f"Metadata: {result.get('metadata', {})}")
    print("---")
```
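
For the custom generation use case, the retrieved chunks usually need to be assembled into a prompt context. A minimal sketch, assuming results shaped like the `retrievalResults` entries printed above; `build_context` and the character budget are hypothetical choices, not part of the API.

```python
def build_context(retrieval_results, max_chars=4000):
    """Join retrieved chunk texts, highest score first, within a character budget."""
    ranked = sorted(retrieval_results, key=lambda r: r.get('score', 0.0), reverse=True)
    parts, used = [], 0
    for rank, result in enumerate(ranked, start=1):
        text = result['content']['text']
        if used + len(text) > max_chars:
            break  # stop before exceeding the budget
        parts.append(f"[{rank}] {text}")
        used += len(text)
    return "\n\n".join(parts)


# Hand-written results for illustration (no AWS call involved)
sample = [
    {'content': {'text': 'Child chunks are matched first.'}, 'score': 0.42},
    {'content': {'text': 'Parent chunks supply context.'}, 'score': 0.87},
]
print(build_context(sample))
```

The numbered prefixes make it possible to ask the model to cite chunks by index.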

### 2. Retrieve and Generate API (RAG)

Returns generated response with source attribution.

**Use Cases**:
- Complete RAG workflows
- Question answering
- Document summarization
- Chatbots with knowledge bases

```python
response = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Explain semantic chunking benefits and when to use it'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'numberOfResults': 5,
                    'overrideSearchType': 'HYBRID'
                }
            },
            'generationConfiguration': {
                'inferenceConfig': {
                    'textInferenceConfig': {
                        'temperature': 0.7,
                        'maxTokens': 2048,
                        'topP': 0.9
                    }
                },
                'promptTemplate': {
                    'textPromptTemplate': '''You are a helpful assistant. Answer the user's question based on the provided context.

Context: $search_results$

Question: $query$

Answer:'''
                }
            }
        }
    }
)

print(f"Generated Response: {response['output']['text']}")
print(f"\nSources:")
for citation in response['citations']:
    for reference in citation['retrievedReferences']:
        print(f"  - {reference['location']}")
        # Retrieved references carry content, location, and metadata (no relevance score)
        print(f"    Metadata: {reference.get('metadata', {})}")
```

### 3. Multi-Turn Conversations with Session Management

Bedrock automatically manages conversation context across turns.

```python
# First turn - creates session automatically
response1 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What is Amazon Bedrock Knowledge Bases?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    }
)

session_id = response1['sessionId']
print(f"Session ID: {session_id}")
print(f"Response: {response1['output']['text']}\n")

# Follow-up turn - reuse session for context
response2 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'What chunking strategies does it support?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id  # Continue conversation with context
)

print(f"Follow-up Response: {response2['output']['text']}")

# Third turn
response3 = bedrock_agent_runtime.retrieve_and_generate(
    input={
        'text': 'Which strategy would you recommend for legal documents?'
    },
    retrieveAndGenerateConfiguration={
        'type': 'KNOWLEDGE_BASE',
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'KB123456',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
        }
    },
    sessionId=session_id
)

print(f"Third Response: {response3['output']['text']}")
```
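
The three calls above differ only in the question and the session ID, so they can be folded into one helper. A minimal sketch, where `ask` is a hypothetical wrapper and `client` is expected to be a `bedrock-agent-runtime` client (or any object exposing `retrieve_and_generate`):

```python
def ask(client, knowledge_base_id, model_arn, question, session_id=None):
    """Send one turn to retrieve_and_generate; return (answer_text, session_id)."""
    kwargs = {
        'input': {'text': question},
        'retrieveAndGenerateConfiguration': {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
            },
        },
    }
    # Omit sessionId on the first turn so the service creates a new session
    if session_id is not None:
        kwargs['sessionId'] = session_id
    response = client.retrieve_and_generate(**kwargs)
    return response['output']['text'], response['sessionId']
```

A conversation then becomes a loop that threads the returned session ID into the next call, for example `answer, session_id = ask(bedrock_agent_runtime, 'KB123456', model_arn, question, session_id)`.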

### 4. Advanced Metadata Filtering

Filter retrieval by metadata attributes for precision.

```python
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={
        'text': 'Security best practices for production deployments'
    },
    retrievalConfiguration={
        'vectorSearchConfiguration': {
            'numberOfResults': 10,
            'overrideSearchType': 'HYBRID',
            'filter': {
                'andAll': [
                    {
                        'equals': {
                            'key': 'document_type',
                            'value': 'security_guide'
                        }
                    },
                    {
                        'greaterThanOrEquals': {
                            'key': 'publish_year',
                            'value': 2024
                        }
                    },
                    {
                        'in': {
                            'key': 'category',
                            'value': ['production', 'security', 'compliance']
                        }
                    }
                ]
            }
        }
    }
)
```

**Supported Filter Operators**:
- `equals`: Exact match
- `notEquals`: Not equal
- `greaterThan`, `greaterThanOrEquals`: Numeric comparison
- `lessThan`, `lessThanOrEquals`: Numeric comparison
- `in`: Match any value in array
- `notIn`: Not match any value in array
- `startsWith`: String prefix match
- `andAll`: Combine filters with AND
- `orAll`: Combine filters with OR

---

## Ingestion Management

### 1. Start Ingestion Job

```python
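import uuid

ingestion_response = bedrock_agent.start_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    description='Monthly document sync',
    # Idempotency token: the API expects at least 33 characters,
    # which a UUID string (36 characters) satisfies
    clientToken=str(uuid.uuid4())
)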

job_id = ingestion_response['ingestionJob']['ingestionJobId']
print(f"Ingestion Job ID: {job_id}")
```

### 2. Monitor Ingestion Job

```python
import time

# Get job status
job_status = bedrock_agent.get_ingestion_job(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    ingestionJobId=job_id
)

print(f"Status: {job_status['ingestionJob']['status']}")
print(f"Started: {job_status['ingestionJob']['startedAt']}")
print(f"Updated: {job_status['ingestionJob']['updatedAt']}")

if 'statistics' in job_status['ingestionJob']:
    stats = job_status['ingestionJob']['statistics']
    print(f"Documents Scanned: {stats.get('numberOfDocumentsScanned', 0)}")
    # Indexed counts are reported separately for new and modified documents
    print(f"New Documents Indexed: {stats.get('numberOfNewDocumentsIndexed', 0)}")
    print(f"Modified Documents Indexed: {stats.get('numberOfModifiedDocumentsIndexed', 0)}")
    print(f"Documents Failed: {stats.get('numberOfDocumentsFailed', 0)}")

# Wait for completion: poll until the job reaches a terminal state
while True:
    status = bedrock_agent.get_ingestion_job(
        knowledgeBaseId=knowledge_base_id,
        dataSourceId=data_source_id,
        ingestionJobId=job_id
    )

    current_status = status['ingestionJob']['status']

    # STOPPED is also terminal (the job was cancelled)
    if current_status in ['COMPLETE', 'FAILED', 'STOPPED']:
        print(f"Ingestion job {current_status}")
        break

    print(f"Status: {current_status}, waiting...")
    time.sleep(30)
```
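
The polling loop above has no upper bound on how long it waits. A minimal sketch of a reusable variant with a timeout follows. `wait_for_ingestion` is a hypothetical helper, `client` is expected to be a `bedrock-agent` client, and the `sleep` parameter exists only so the function can be exercised without real delays.

```python
import time

TERMINAL_STATUSES = {'COMPLETE', 'FAILED', 'STOPPED'}


def wait_for_ingestion(client, knowledge_base_id, data_source_id, job_id,
                       poll_seconds=30, timeout_seconds=3600, sleep=time.sleep):
    """Poll get_ingestion_job until a terminal status; return the final job dict."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        job = client.get_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            ingestionJobId=job_id,
        )['ingestionJob']
        if job['status'] in TERMINAL_STATUSES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Ingestion job {job_id} still {job['status']} after {timeout_seconds}s")
        sleep(poll_seconds)
```

Callers should still inspect the returned `status`, since `FAILED` and `STOPPED` end the wait just as `COMPLETE` does.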

### 3. List Ingestion Jobs

```python
list_response = bedrock_agent.list_ingestion_jobs(
    knowledgeBaseId=knowledge_base_id,
    dataSourceId=data_source_id,
    maxResults=50
)

for job in list_response['ingestionJobSummaries']:
    print(f"Job ID: {job['ingestionJobId']}")
    print(f"Status: {job['status']}")
    print(f"Started: {job['startedAt']}")
    print(f"Updated: {job['updatedAt']}")
    print("---")
```
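
A single call returns at most `maxResults` jobs. To walk the full history, follow the `nextToken` value that the response includes when more pages exist. A minimal sketch, where `iter_ingestion_jobs` is a hypothetical helper and `client` is expected to be a `bedrock-agent` client:

```python
def iter_ingestion_jobs(client, knowledge_base_id, data_source_id, page_size=50):
    """Yield every ingestion job summary, following nextToken across pages."""
    kwargs = {
        'knowledgeBaseId': knowledge_base_id,
        'dataSourceId': data_source_id,
        'maxResults': page_size,
    }
    while True:
        page = client.list_ingestion_jobs(**kwargs)
        yield from page.get('ingestionJobSummaries', [])
        token = page.get('nextToken')
        if not token:
            return  # no more pages
        kwargs['nextToken'] = token
```

For example, `[j for j in iter_ingestion_jobs(bedrock_agent, knowledge_base_id, data_source_id) if j['status'] == 'FAILED']` collects every failed job.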

---

## Integration with Bedrock Agents

### 1. Agent with Knowledge Base Action

```python
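import time

import boto3

bedrock_agent = boto3.client('bedrock-agent', region_name='us-east-1')

def wait_for_agent_status(agent_id, target, poll_seconds=5):
    """Poll get_agent until the agent reaches the target status."""
    while True:
        status = bedrock_agent.get_agent(agentId=agent_id)['agent']['agentStatus']
        if status == target:
            return
        if status == 'FAILED':
            raise RuntimeError(f"Agent {agent_id} entered FAILED state")
        time.sleep(poll_seconds)

# Create agent with knowledge base
agent_response = bedrock_agent.create_agent(
    agentName='customer-support-agent',
    description='Customer support agent with knowledge base access',
    instruction='''You are a customer support agent. When answering questions:
1. Search the knowledge base for relevant information
2. Provide accurate answers based on retrieved context
3. Cite your sources
4. Admit when you don't know something''',
    foundationModel='anthropic.claude-3-sonnet-20240229-v1:0',
    agentResourceRoleArn='arn:aws:iam::123456789012:role/BedrockAgentRole'
)

agent_id = agent_response['agent']['agentId']

# Agent creation is asynchronous; wait until it leaves the CREATING state
wait_for_agent_status(agent_id, 'NOT_PREPARED')

# Associate knowledge base with agent
kb_association = bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB123456',
    description='Company documentation knowledge base',
    knowledgeBaseState='ENABLED'
)

# Prepare the agent, then wait for PREPARED before creating an alias
bedrock_agent.prepare_agent(agentId=agent_id)
wait_for_agent_status(agent_id, 'PREPARED')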

alias_response = bedrock_agent.create_agent_alias(
    agentId=agent_id,
    agentAliasName='production',
    description='Production alias'
)

agent_alias_id = alias_response['agentAlias']['agentAliasId']

# Invoke agent (automatically queries knowledge base)
bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name='us-east-1')

response = bedrock_agent_runtime.invoke_agent(
    agentId=agent_id,
    agentAliasId=agent_alias_id,
    sessionId='session-123',
    inputText='What is our return policy for defective products?'
)

for event in response['completion']:
    if 'chunk' in event:
        chunk = event['chunk']
        print(chunk['bytes'].decode())
```
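
Printing each chunk as it arrives works for a console, but callers often want the full answer as one string. The sketch below gathers the raw bytes first and decodes once, so a multi-byte character split across two chunks still decodes correctly. `collect_completion` is a hypothetical helper that accepts any iterable of events shaped like those in `response['completion']`.

```python
def collect_completion(event_stream, encoding='utf-8'):
    """Concatenate all 'chunk' payloads from an agent event stream into one string."""
    parts = []
    for event in event_stream:
        chunk = event.get('chunk')
        if chunk and 'bytes' in chunk:
            parts.append(chunk['bytes'])
        # Other event types (for example trace events) are ignored here
    return b''.join(parts).decode(encoding)


# Hand-written events for illustration (no AWS call involved)
events = [{'chunk': {'bytes': b'Returns are '}}, {'trace': {}}, {'chunk': {'bytes': b'accepted.'}}]
print(collect_completion(events))
```

With a live response, the call would be `collect_completion(response['completion'])`.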

### 2. Agent with Multiple Knowledge Bases

```python
# Associate multiple knowledge bases
bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-PRODUCT-DOCS',
    description='Product documentation'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-SUPPORT-ARTICLES',
    description='Support knowledge articles'
)

bedrock_agent.associate_agent_knowledge_base(
    agentId=agent_id,
    agentVersion='DRAFT',
    knowledgeBaseId='KB-COMPANY-POLICIES',
    description='Company policies and procedures'
)

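# Re-prepare the agent so the DRAFT version picks up the new associations.
# At runtime the agent uses each description to decide which knowledge bases to query.
bedrock_agent.prepare_agent(agentId=agent_id)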
```

---

## Best Practices

### 1. Chunking Strategy Selection

**Decision Framework**:

1. **Simple, uniform documents** → Fixed-size chunking
   - Blog posts, articles, simple FAQs
   - Fast, predictable, cost-effective

2. **Documents without clear boundaries** → Semantic chunking
   - Legal documents, contracts, academic papers
   - Preserves semantic meaning, better accuracy
   - Consider additional cost

3. **Nested, hierarchical documents** → Hierarchical chunking
   - Technical manuals, product docs, research papers
   - Best balance of precision and context
   - Optimal for complex structures

4. **Specialized formats** → Custom Lambda chunking
   - Medical records (HL7, FHIR), code docs, custom formats
   - Complete control, domain expertise
   - Higher operational complexity

**Tuning Guidelines**:

- **Fixed-size**: Start with `maxTokens` 512 and `overlapPercentage` 20
- **Semantic**: Start with `maxTokens` 300, `bufferSize` 1, and `breakpointPercentileThreshold` 95
- **Hierarchical**: Parent `maxTokens` 1500, child `maxTokens` 300, `overlapTokens` 60
- **Custom**: Test extensively with domain experts
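
The starting values above map onto the `chunkingConfiguration` block of a data source's `vectorIngestionConfiguration`. The sketch below builds that block for each built-in strategy; `starting_chunking_config` is a hypothetical helper name, and the numbers are only the starting points listed here, not tuned recommendations.

```python
from typing import Any, Dict


def starting_chunking_config(strategy: str) -> Dict[str, Any]:
    """Return a chunkingConfiguration dict seeded with the starting values."""
    if strategy == 'FIXED_SIZE':
        return {
            'chunkingStrategy': 'FIXED_SIZE',
            'fixedSizeChunkingConfiguration': {
                'maxTokens': 512,
                'overlapPercentage': 20,
            },
        }
    if strategy == 'SEMANTIC':
        return {
            'chunkingStrategy': 'SEMANTIC',
            'semanticChunkingConfiguration': {
                'maxTokens': 300,
                'bufferSize': 1,
                'breakpointPercentileThreshold': 95,
            },
        }
    if strategy == 'HIERARCHICAL':
        return {
            'chunkingStrategy': 'HIERARCHICAL',
            'hierarchicalChunkingConfiguration': {
                # Parent level first, then child level
                'levelConfigurations': [{'maxTokens': 1500}, {'maxTokens': 300}],
                'overlapTokens': 60,
            },
        }
    raise ValueError(f'No starting values defined for strategy: {strategy}')


# Use as: vectorIngestionConfiguration={'chunkingConfiguration': config}
config = starting_chunking_config('HIERARCHICAL')
print(config)
```

Custom Lambda chunking is left out on purpose, since its configuration depends on the transformation function you supply.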

### 2. Retrieval Optimization

**Number of Results**:
- Start with 5-10 results
- Increase if answers lack detail
- Decrease if too much noise

**Search Type**:
- **SEMANTIC**: Pure vector similarity (faster, good for conceptual queries)
- **HYBRID**: Vector + keyword (better recall, recommended for production when the vector store supports it)

**Use Hybrid Search** when:
- Queries contain specific terms or names
- Need to match exact keywords
- Domain has specialized vocabulary

**Use Semantic Search** when:
- Purely conceptual queries
- Prioritizing speed over perfect recall
- Well-embedded domain knowledge

**Metadata Filters**:
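- Apply a filter whenever the query scope is known (for example, a single document type or category)
- Restricts the search to matching chunks, which usually improves precision
- Can reduce retrieval latency on large indexes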
- Examples: document_type, publish_date, category, author
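
The result count, search type, and metadata filters all live in one `retrievalConfiguration` object. The following sketch assembles it; `build_retrieval_config` is a hypothetical helper, and the metadata keys (`document_type`, `publish_year`) are assumed to exist on your ingested documents.

```python
from typing import Any, Dict, List, Optional


def build_retrieval_config(
    num_results: int = 5,
    search_type: str = 'HYBRID',
    filters: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build a retrievalConfiguration dict for retrieve / retrieve_and_generate."""
    if search_type not in ('HYBRID', 'SEMANTIC'):
        raise ValueError(f'Unknown search type: {search_type}')
    if num_results < 1:
        raise ValueError('num_results must be at least 1')

    vector_config: Dict[str, Any] = {
        'numberOfResults': num_results,
        'overrideSearchType': search_type,
    }
    if filters:
        # andAll expects two or more conditions, so a single condition is passed as-is
        vector_config['filter'] = (
            filters[0] if len(filters) == 1 else {'andAll': filters}
        )
    return {'vectorSearchConfiguration': vector_config}


retrieval_config = build_retrieval_config(
    num_results=8,
    filters=[
        {'equals': {'key': 'document_type', 'value': 'policy'}},
        {'greaterThanOrEquals': {'key': 'publish_year', 'value': 2023}},
    ],
)
print(retrieval_config)
```

The result can be passed as `retrievalConfiguration=retrieval_config` to `bedrock_agent_runtime.retrieve(...)`. Storing a numeric `publish_year` rather than a date string is a design choice made for this sketch so that range comparisons work on numbers.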

### 3. Cost Optimization

**S3 Vectors**:
- Use for large-scale knowledge bases (millions of chunks)
- Up to 90% cost savings vs. OpenSearch
- Ideal for cost-sensitive applications
- Trade-off: Slightly higher latency

**Semantic Chunking**:
- Incurs foundation model costs during ingestion
- Consider cost vs. accuracy benefit
- May not be worth it for simple documents
- Best for complex, high-value content

**Ingestion Frequency**:
- Schedule ingestion during off-peak hours
- Use incremental updates when possible
- Don't re-ingest unchanged documents

**Model Selection**:
- Use smaller embedding models when accuracy permits
- Titan Embed Text v2 is cost-effective
- Consider Cohere Embed for multilingual

**Token Usage**:
- Monitor generation token usage
- Set appropriate maxTokens limits
- Use prompt templates to control verbosity
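
As an illustration of the token-usage points, the sketch below builds a `generationConfiguration` that caps output tokens and asks for short answers. The prompt wording and the `build_generation_config` helper are hypothetical; `$search_results$` is the placeholder that Knowledge Bases fills with the retrieved chunks.

```python
from typing import Any, Dict

# Hypothetical prompt wording; adjust to your own style guide
CONCISE_PROMPT = (
    'Answer the question using only the search results below. '
    'Keep the answer under three sentences, and say so if the results '
    'do not contain the answer.\n\n'
    '$search_results$'
)


def build_generation_config(
    max_tokens: int = 512, temperature: float = 0.0
) -> Dict[str, Any]:
    """Build a generationConfiguration that limits output length and verbosity."""
    if max_tokens <= 0:
        raise ValueError('max_tokens must be positive')
    return {
        'promptTemplate': {'textPromptTemplate': CONCISE_PROMPT},
        'inferenceConfig': {
            'textInferenceConfig': {
                'maxTokens': max_tokens,
                'temperature': temperature,
            }
        },
    }


generation_config = build_generation_config(max_tokens=400)
print(generation_config['inferenceConfig'])
```

The dict goes under `knowledgeBaseConfiguration['generationConfiguration']` in a `retrieve_and_generate` request.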

### 4. Session Management

**Always Reuse Sessions**:
- Pass `sessionId` for follow-up turns
- Bedrock handles context automatically
- No manual conversation history needed

**Session Lifecycle**:
- Sessions expire after inactivity (default: 60 minutes)
- Create new session for unrelated conversations
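- Keep one session per user conversation; with `retrieve_and_generate`, omit `sessionId` on the first call and reuse the ID returned in the response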

**Context Limits**:
- Monitor conversation length
- Long sessions may hit context limits
- Consider summarization for very long conversations
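
One way to apply these session rules on the client side is a small tracker that remembers the session ID and starts a fresh session after too many turns or too long an idle gap. This is only a sketch: `SessionTracker` and its `max_turns` limit are hypothetical, and the idle timeout simply mirrors the 60-minute figure mentioned above.

```python
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SessionTracker:
    """Remembers one session ID and decides when to start a new session."""

    max_turns: int = 20             # hypothetical cap; tune to the model's context window
    idle_timeout_s: float = 3600.0  # mirrors the 60-minute inactivity figure
    session_id: Optional[str] = None
    turns: int = 0
    last_used: float = field(default_factory=time.monotonic)

    def current(self, now: Optional[float] = None) -> Optional[str]:
        """Return the session ID to send, or None to let the service create one."""
        now = time.monotonic() if now is None else now
        expired = now - self.last_used > self.idle_timeout_s
        if self.turns >= self.max_turns or expired:
            self.session_id, self.turns = None, 0
        return self.session_id

    def record(self, session_id: str, now: Optional[float] = None) -> None:
        """Store the session ID returned by the service after a completed turn."""
        self.session_id = session_id
        self.turns += 1
        self.last_used = time.monotonic() if now is None else now


tracker = SessionTracker(max_turns=2)
print(tracker.current(now=0.0))   # no session yet
tracker.record('session-abc', now=0.0)
print(tracker.current(now=1.0))   # reuse the returned ID
```

Before each call, pass `tracker.current()` as the optional `sessionId`; afterwards, call `tracker.record(response['sessionId'])`.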

### 5. GraphRAG with Neptune

**When to Use**:
- Interconnected knowledge domains
- Relationship-aware queries
- Need for explainability
- Complex knowledge graphs

**Benefits**:
- Automatic graph creation
- Improved accuracy through relationships
- Comprehensive answers
- Explainable results

**Considerations**:
- Higher setup complexity
- Neptune Analytics costs
- Best for domains with rich relationships

### 6. Data Source Management

**S3 Best Practices**:
- Organize with clear prefixes
- Use inclusion/exclusion filters
- Maintain consistent metadata
- Version documents when updating

**Web Crawler**:
- Set appropriate rate limits
- Use robots.txt for guidance
- Monitor for broken links
- Schedule regular re-crawls

**Confluence/SharePoint**:
- Filter by spaces/sites
- Exclude archived content
- Use fine-grained permissions
- Schedule incremental syncs

**Metadata Enrichment**:
- Add custom metadata to documents
- Include: document_type, publish_date, category, author, version
- Enables powerful filtering
- Improves retrieval precision
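
For S3 data sources, custom metadata is typically supplied as a sidecar file named `<document>.metadata.json` placed next to the document, with the attributes under a `metadataAttributes` key. The sketch below writes such a file locally before upload; `write_metadata_sidecar`, the file name, and the attribute values are all made up for illustration.

```python
import json
import tempfile
from pathlib import Path
from typing import Any, Dict


def write_metadata_sidecar(document_path: Path, attributes: Dict[str, Any]) -> Path:
    """Write <document>.metadata.json next to the document and return its path."""
    sidecar = document_path.with_name(document_path.name + '.metadata.json')
    sidecar.write_text(json.dumps({'metadataAttributes': attributes}, indent=2))
    return sidecar


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder document standing in for a real file
    doc = Path(tmp) / 'returns-policy.pdf'
    doc.write_bytes(b'placeholder')

    sidecar_path = write_metadata_sidecar(
        doc,
        {
            'document_type': 'policy',
            'category': 'returns',
            'author': 'support-team',
            'version': 3,
        },
    )
    sidecar_name = sidecar_path.name
    sidecar_body = json.loads(sidecar_path.read_text())

print(sidecar_name)
print(sidecar_body)
```

Upload the document and its sidecar to the same S3 prefix, then run an ingestion job so the attributes become available for filtering.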

### 7. Monitoring and Debugging

**Enable CloudWatch Logs**:
- Monitor retrieval quality
- Track query latency, retrieval scores, and generation quality
- Set alarms for high latency, low scores, and high error rates

**Test Retrieval Quality**:
```python
# Use retrieve API to debug
response = bedrock_agent_runtime.retrieve(
    knowledgeBaseId='KB123456',
    retrievalQuery={'text': 'test query'}
)

# Analyze retrieval scores
for result in response['retrievalResults']:
    print(f"Score: {result['score']}")
    print(f"Content preview: {result['content']['text'][:200]}")
```
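
To compare chunking or search settings across many test queries, it helps to reduce each response to a few numbers. The helper below is a sketch under assumptions: `summarize_scores` is a hypothetical name, the 0.4 cutoff is an arbitrary illustration rather than a recommended threshold, and the sample response is invented.

```python
from statistics import mean
from typing import Any, Dict


def summarize_scores(response: Dict[str, Any], low_score: float = 0.4) -> Dict[str, Any]:
    """Summarize relevance scores from a retrieve() response."""
    scores = [r['score'] for r in response.get('retrievalResults', []) if 'score' in r]
    if not scores:
        return {'count': 0, 'min': None, 'max': None, 'mean': None, 'below_threshold': 0}
    return {
        'count': len(scores),
        'min': min(scores),
        'max': max(scores),
        'mean': round(mean(scores), 3),
        # Results under the cutoff are candidates for noise
        'below_threshold': sum(1 for s in scores if s < low_score),
    }


# Invented response for illustration; real ones come from bedrock_agent_runtime.retrieve
sample_response = {
    'retrievalResults': [
        {'score': 0.82, 'content': {'text': 'chunk one'}},
        {'score': 0.55, 'content': {'text': 'chunk two'}},
        {'score': 0.31, 'content': {'text': 'chunk three'}},
    ]
}

summary = summarize_scores(sample_response)
print(summary)
```

Score ranges differ between vector stores and search types, so compare summaries only between runs that use the same configuration.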

**Common Issues**:

1. **Low Retrieval Scores**:
   - Check chunking strategy
   - Verify embedding model
   - Ensure documents are properly ingested
   - Consider semantic or hierarchical chunking

2. **Irrelevant Results**:
   - Add metadata filters
   - Use hybrid search
   - Refine chunking strategy
   - Decrease numberOfResults so low-scoring chunks add less noise

3. **Missing Information**:
   - Verify data source configuration
   - Check ingestion job status
   - Ensure documents are not excluded by filters
   - Increase numberOfResults

4. **Slow Retrieval**:
   - Use metadata filters to narrow scope
   - Optimize vector database configuration
   - Prefer a lower-latency vector store when speed matters (S3 Vectors trades some latency for cost)
   - Reduce numberOfResults

### 8. Security Best Practices

**IAM Permissions**:
- Use least privilege for Knowledge Base role
- Separate roles for data sources, ingestion, retrieval
- Enable VPC endpoints for private connectivity

**Data Encryption**:
- All data encrypted at rest (AWS KMS)
- Data encrypted in transit (TLS)
- Use customer-managed KMS keys for compliance

**Access Control**:
- Use IAM policies to control who can query
- Implement fine-grained access control
- Monitor access with CloudTrail
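
As a sketch of least-privilege query access, the function below generates an identity policy that allows only `bedrock:Retrieve` on a single knowledge base. `retrieve_only_policy` is a hypothetical helper and the account and knowledge base IDs are placeholders. Callers of `retrieve_and_generate` need further permissions (for the API itself and for invoking the generation model), which this sketch does not cover.

```python
import json
from typing import Any, Dict


def retrieve_only_policy(region: str, account_id: str, knowledge_base_id: str) -> Dict[str, Any]:
    """Build an IAM policy document allowing Retrieve on one knowledge base."""
    kb_arn = f'arn:aws:bedrock:{region}:{account_id}:knowledge-base/{knowledge_base_id}'
    return {
        'Version': '2012-10-17',
        'Statement': [
            {
                'Sid': 'QueryKnowledgeBase',
                'Effect': 'Allow',
                'Action': ['bedrock:Retrieve'],
                'Resource': kb_arn,
            }
        ],
    }


# Placeholder identifiers for illustration
policy = retrieve_only_policy('us-east-1', '123456789012', 'KB123456')
print(json.dumps(policy, indent=2))
```

The JSON output can be attached to the role or user that runs queries; keep ingestion and data source permissions on separate roles, as noted above.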

**PII Handling**:
- Use Bedrock Guardrails for PII redaction
- Implement data masking for sensitive fields
- Consider custom Lambda for advanced PII handling

---

## Complete Production Example

### End-to-End RAG Application

```python
from typing import Dict, List, Optional

import boto3

class BedrockKnowledgeBaseRAG:
    """Production RAG application with Amazon Bedrock Knowledge Bases"""

    def __init__(self, region_name: str = 'us-east-1'):
        self.bedrock_agent = boto3.client('bedrock-agent', region_name=region_name)
        self.bedrock_agent_runtime = boto3.client('bedrock-agent-runtime', region_name=region_name)

    def create_knowledge_base(
        self,
        name: str,
        description: str,
        role_arn: str,
        vector_store_config: Dict,
        embedding_model: str = 'amazon.titan-embed-text-v2:0'
    ) -> str:
        """Create knowledge base with vector store"""

        response = self.bedrock_agent.create_knowledge_base(
            name=name,
            description=description,
            roleArn=role_arn,
            knowledgeBaseConfiguration={
                'type': 'VECTOR',
                'vectorKnowledgeBaseConfiguration': {
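                    # Build the ARN from the client's region instead of hardcoding us-east-1
                    'embeddingModelArn': f'arn:aws:bedrock:{self.bedrock_agent.meta.region_name}::foundation-model/{embedding_model}'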
                }
            },
            storageConfiguration=vector_store_config
        )

        return response['knowledgeBase']['knowledgeBaseId']

    def add_s3_data_source(
        self,
        knowledge_base_id: str,
        name: str,
        bucket_arn: str,
        inclusion_prefixes: List[str],
        chunking_strategy: str = 'FIXED_SIZE',
        chunking_config: Optional[Dict] = None
    ) -> str:
        """Add S3 data source with chunking configuration"""

        # Map each built-in strategy to the configuration key the API expects
        config_keys = {
            'FIXED_SIZE': 'fixedSizeChunkingConfiguration',
            'SEMANTIC': 'semanticChunkingConfiguration',
            'HIERARCHICAL': 'hierarchicalChunkingConfiguration'
        }

        chunking = {'chunkingStrategy': chunking_strategy}

        if chunking_strategy in config_keys:
            if chunking_config is None:
                # The fixed-size defaults are not valid for other strategies
                if chunking_strategy != 'FIXED_SIZE':
                    raise ValueError(f'chunking_config is required for {chunking_strategy}')
                chunking_config = {
                    'maxTokens': 512,
                    'overlapPercentage': 20
                }
            chunking[config_keys[chunking_strategy]] = chunking_config

        vector_ingestion_config = {'chunkingConfiguration': chunking}

        response = self.bedrock_agent.create_data_source(
            knowledgeBaseId=knowledge_base_id,
            name=name,
            description=f'S3 data source: {name}',
            dataSourceConfiguration={
                'type': 'S3',
                's3Configuration': {
                    'bucketArn': bucket_arn,
                    'inclusionPrefixes': inclusion_prefixes
                }
            },
            vectorIngestionConfiguration=vector_ingestion_config
        )

        return response['dataSource']['dataSourceId']

    def ingest_data(self, knowledge_base_id: str, data_source_id: str) -> str:
        """Start ingestion job and wait for completion"""

        import time

        # Start ingestion
        response = self.bedrock_agent.start_ingestion_job(
            knowledgeBaseId=knowledge_base_id,
            dataSourceId=data_source_id,
            description='Automated ingestion'
        )

        job_id = response['ingestionJob']['ingestionJobId']

        # Poll until the job reaches a terminal state
        while True:
            job = self.bedrock_agent.get_ingestion_job(
                knowledgeBaseId=knowledge_base_id,
                dataSourceId=data_source_id,
                ingestionJobId=job_id
            )['ingestionJob']

            status = job['status']

            if status == 'COMPLETE':
                stats = job.get('statistics', {})
                indexed = (
                    stats.get('numberOfNewDocumentsIndexed', 0)
                    + stats.get('numberOfModifiedDocumentsIndexed', 0)
                )
                print('Ingestion completed successfully')
                print(f'Documents indexed: {indexed}')
                break
            elif status in ('FAILED', 'STOPPED'):
                # Surface the failure instead of returning as if the job succeeded
                reasons = job.get('failureReasons', [])
                raise RuntimeError(f'Ingestion job {job_id} ended as {status}: {reasons}')

            print(f'Ingestion status: {status}')
            time.sleep(30)

        return job_id

    def query(
        self,
        knowledge_base_id: str,
        query: str,
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
        num_results: int = 5,
        search_type: str = 'HYBRID',
        metadata_filter: Optional[Dict] = None,
        session_id: Optional[str] = None
    ) -> Dict:
        """Query knowledge base with retrieve and generate"""

        retrieval_config = {
            'type': 'KNOWLEDGE_BASE',
            'knowledgeBaseConfiguration': {
                'knowledgeBaseId': knowledge_base_id,
                'modelArn': model_arn,
                'retrievalConfiguration': {
                    'vectorSearchConfiguration': {
                        'numberOfResults': num_results,
                        'overrideSearchType': search_type
                    }
                },
                'generationConfiguration': {
                    'inferenceConfig': {
                        'textInferenceConfig': {
                            'temperature': 0.7,
                            'maxTokens': 2048
                        }
                    }
                }
            }
        }

        # Add metadata filter if provided
        if metadata_filter:
            retrieval_config['knowledgeBaseConfiguration']['retrievalConfiguration']['vectorSearchConfiguration']['filter'] = metadata_filter

        # Build request
        request = {
            'input': {'text': query},
            'retrieveAndGenerateConfiguration': retrieval_config
        }

        # Add session if provided
        if session_id:
            request['sessionId'] = session_id

        response = self.bedrock_agent_runtime.retrieve_and_generate(**request)

        return {
            'answer': response['output']['text'],
            'citations': response.get('citations', []),
            'session_id': response['sessionId']
        }

    def multi_turn_conversation(
        self,
        knowledge_base_id: str,
        queries: List[str],
        model_arn: str = 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0'
    ) -> List[Dict]:
        """Execute multi-turn conversation with context"""

        session_id = None
        conversation = []

        for query in queries:
            result = self.query(
                knowledge_base_id=knowledge_base_id,
                query=query,
                model_arn=model_arn,
                session_id=session_id
            )

            session_id = result['session_id']

            conversation.append({
                'query': query,
                'answer': result['answer'],
                'citations': result['citations']
            })

        return conversation


# Example Usage
if __name__ == '__main__':
    rag = BedrockKnowledgeBaseRAG(region_name='us-east-1')

    # Create knowledge base
    kb_id = rag.create_knowledge_base(
        name='production-docs-kb',
        description='Production documentation knowledge base',
        role_arn='arn:aws:iam::123456789012:role/BedrockKBRole',
        vector_store_config={
            'type': 'OPENSEARCH_SERVERLESS',
            'opensearchServerlessConfiguration': {
                'collectionArn': 'arn:aws:aoss:us-east-1:123456789012:collection/kb-collection',
                'vectorIndexName': 'bedrock-kb-index',
                'fieldMapping': {
                    'vectorField': 'bedrock-knowledge-base-default-vector',
                    'textField': 'AMAZON_BEDROCK_TEXT_CHUNK',
                    'metadataField': 'AMAZON_BEDROCK_METADATA'
                }
            }
        }
    )

    # Add data source
    ds_id = rag.add_s3_data_source(
        knowledge_base_id=kb_id,
        name='technical-docs',
        bucket_arn='arn:aws:s3:::my-docs-bucket',
        inclusion_prefixes=['docs/'],
        chunking_strategy='HIERARCHICAL',
        chunking_config={
            'levelConfigurations': [
                {'maxTokens': 1500},
                {'maxTokens': 300}
            ],
            'overlapTokens': 60
        }
    )

    # Ingest data
    rag.ingest_data(kb_id, ds_id)

    # Single query
    result = rag.query(
        knowledge_base_id=kb_id,
        query='What are the best practices for RAG applications?',
        metadata_filter={
            'equals': {
                'key': 'document_type',
                'value': 'best_practices'
            }
        }
    )

    print(f"Answer: {result['answer']}")
    print("\nSources:")
    for citation in result['citations']:
        for ref in citation['retrievedReferences']:
            print(f"  - {ref['location']}")

    # Multi-turn conversation
    conversation = rag.multi_turn_conversation(
        knowledge_base_id=kb_id,
        queries=[
            'What is hierarchical chunking?',
            'When should I use it?',
            'What are the configuration parameters?'
        ]
    )

    for turn in conversation:
        print(f"\nQ: {turn['query']}")
        print(f"A: {turn['answer']}")
```

---

## Related Skills

### Amazon Bedrock Core Skills
- **bedrock-guardrails**: Content safety, PII redaction, hallucination detection
- **bedrock-agents**: Agentic workflows with tool use and knowledge bases
- **bedrock-flows**: Visual workflow builder for generative AI
- **bedrock-model-customization**: Fine-tuning, reinforcement fine-tuning, distillation
- **bedrock-prompt-management**: Prompt versioning and deployment

### AWS Infrastructure Skills
- **opensearch-serverless**: Vector database configuration and management
- **neptune-analytics**: GraphRAG configuration and queries
- **s3-management**: S3 bucket configuration for data sources and vectors
- **iam-bedrock**: IAM roles and policies for Knowledge Bases

### Observability Skills
- **cloudwatch-bedrock-monitoring**: Monitor Knowledge Bases metrics and logs
- **bedrock-cost-optimization**: Track and optimize Knowledge Bases costs

---

## Additional Resources

### Official Documentation
- [Amazon Bedrock Knowledge Bases](https://aws.amazon.com/bedrock/knowledge-bases/)
- [Knowledge Bases User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/knowledge-base.html)
- [Chunking Strategies](https://docs.aws.amazon.com/bedrock/latest/userguide/kb-chunking.html)
- [Boto3 Knowledge Bases API](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-agent.html)

### Best Practices
- [Building Cost-Effective RAG with S3 Vectors](https://aws.amazon.com/blogs/machine-learning/building-cost-effective-rag-applications-with-amazon-bedrock-knowledge-bases-and-amazon-s3-vectors/)
- [Advanced Parsing and Chunking](https://aws.amazon.com/blogs/machine-learning/amazon-bedrock-knowledge-bases-now-supports-advanced-parsing-chunking-and-query-reformulation-giving-greater-control-of-accuracy-in-rag-based-applications/)

### Research Document
- `/mnt/c/data/github/skrillz/AMAZON-BEDROCK-COMPREHENSIVE-RESEARCH-2025.md` - Section 2 (Complete Knowledge Bases research)