--- name: cloudflare-vectorize description: | Build semantic search with Cloudflare Vectorize V2. Covers async mutations, 5M vectors/index, 31ms latency, returnMetadata enum changes, and V1 deprecation. Prevents 14 errors including dimension mismatches, TypeScript types, testing setup. Use when: building RAG or semantic search, troubleshooting returnMetadata, V2 timing, metadata index, dimension errors, vitest setup, or wrangler --json output. user-invocable: true --- # Cloudflare Vectorize Complete implementation guide for Cloudflare Vectorize - a globally distributed vector database for building semantic search, RAG (Retrieval Augmented Generation), and AI-powered applications with Cloudflare Workers. **Status**: Production Ready ✅ **Last Updated**: 2026-01-21 **Dependencies**: cloudflare-worker-base (for Worker setup), cloudflare-workers-ai (for embeddings) **Latest Versions**: wrangler@4.59.3, @cloudflare/workers-types@4.20260109.0 **Token Savings**: ~70% **Errors Prevented**: 14 **Dev Time Saved**: ~4 hours ## What This Skill Provides ### Core Capabilities - ✅ **Index Management**: Create, configure, and manage vector indexes - ✅ **Vector Operations**: Insert, upsert, query, delete, and list vectors (list-vectors added August 2025) - ✅ **Metadata Filtering**: Advanced filtering with 10 metadata indexes per index - ✅ **Semantic Search**: Find similar vectors using cosine, euclidean, or dot-product metrics - ✅ **RAG Patterns**: Complete retrieval-augmented generation workflows - ✅ **Workers AI Integration**: Native embedding generation with @cf/baai/bge-base-en-v1.5 - ✅ **OpenAI Integration**: Support for text-embedding-3-small/large models - ✅ **Document Processing**: Text chunking and batch ingestion pipelines - ✅ **Testing Setup**: Vitest configuration with Vectorize bindings ### Templates Included 1. **basic-search.ts** - Simple vector search with Workers AI 2. **rag-chat.ts** - Full RAG chatbot with context retrieval 3. **document-ingestion.ts** - Document chunking and embedding pipeline 4. **metadata-filtering.ts** - Advanced filtering patterns --- ## ⚠️ Vectorize V2 Breaking Changes (September 2024) **IMPORTANT**: Vectorize V2 became GA in September 2024 with significant breaking changes. ### What Changed in V2 **Performance Improvements**: - **Index capacity**: 200,000 → **5 million vectors** per index - **Query latency**: 549ms → **31ms** median (18× faster) - **TopK limit**: 20 → **100** results per query - **Scale limits**: 100 → **50,000 indexes** per account - **Namespace limits**: 100 → **50,000 namespaces** per index **Breaking API Changes**: 1. **Async Mutations** - All mutations now asynchronous: ```typescript // V2: Returns mutationId const result = await env.VECTORIZE_INDEX.insert(vectors); console.log(result.mutationId); // "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" // Vector inserts/deletes may take a few seconds to be reflected ``` 2. **returnMetadata Parameter** - Boolean → String enum: ```typescript // ❌ V1 (deprecated) { returnMetadata: true } // ✅ V2 (required) { returnMetadata: 'all' | 'indexed' | 'none' } ``` 3. **Metadata Indexes Required Before Insert**: - V2 requires metadata indexes created BEFORE vectors inserted - Vectors added before metadata index won't be indexed - Must re-upsert vectors after creating metadata index **V1 Deprecation Timeline**: - **December 2024**: Can no longer create V1 indexes - **Existing V1 indexes**: Continue to work (other operations unaffected) - **Migration**: Use `wrangler vectorize --deprecated-v1` flag for V1 operations **Wrangler Version Required**: - **Minimum**: wrangler@3.71.0 for V2 commands - **Recommended**: wrangler@4.54.0+ (latest) ### Check Mutation Status ```typescript // Get index info to check last mutation processed const info = await env.VECTORIZE_INDEX.describe(); console.log(info.mutationId); // Last mutation ID console.log(info.processedUpToMutation); // Last processed timestamp ``` --- ## Critical Setup Rules ### ⚠️ MUST DO BEFORE INSERTING VECTORS ```bash # 1. Create the index with FIXED dimensions and metric npx wrangler vectorize create my-index \ --dimensions=768 \ --metric=cosine # 2. Create metadata indexes IMMEDIATELY (before inserting vectors!) npx wrangler vectorize create-metadata-index my-index \ --property-name=category \ --type=string npx wrangler vectorize create-metadata-index my-index \ --property-name=timestamp \ --type=number ``` **Why**: Metadata indexes MUST exist before vectors are inserted. Vectors added before a metadata index was created won't be filterable on that property. ### Index Configuration (Cannot Be Changed Later) ```bash # Dimensions MUST match your embedding model output: # - Workers AI @cf/baai/bge-base-en-v1.5: 768 dimensions # - OpenAI text-embedding-3-small: 1536 dimensions # - OpenAI text-embedding-3-large: 3072 dimensions # Metrics determine similarity calculation: # - cosine: Best for normalized embeddings (most common) # - euclidean: Absolute distance between vectors # - dot-product: For non-normalized vectors ``` ## Wrangler Configuration **wrangler.jsonc**: ```jsonc { "name": "my-vectorize-worker", "main": "src/index.ts", "compatibility_date": "2025-10-21", "vectorize": [ { "binding": "VECTORIZE_INDEX", "index_name": "my-index" } ], "ai": { "binding": "AI" } } ``` ## TypeScript Types ```typescript export interface Env { VECTORIZE_INDEX: VectorizeIndex; AI: Ai; } interface VectorizeVector { id: string; values: number[] | Float32Array | Float64Array; namespace?: string; metadata?: Record; } interface VectorizeMatches { matches: Array<{ id: string; score: number; values?: number[]; metadata?: Record; namespace?: string; }>; count: number; } ``` ## Metadata Filter Operators (V2) Vectorize V2 supports advanced metadata filtering with range queries: ```typescript // Equality (implicit $eq) { category: "docs" } // Not equals { status: { $ne: "archived" } } // In/Not in arrays { category: { $in: ["docs", "tutorials"] } } { category: { $nin: ["deprecated", "draft"] } } // Range queries (numbers) - NEW in V2 { timestamp: { $gte: 1704067200, $lt: 1735689600 } } // Range queries (strings) - prefix searching { url: { $gte: "/docs/workers", $lt: "/docs/workersz" } } // Nested metadata with dot notation { "author.id": "user123" } // Multiple conditions (implicit AND) { category: "docs", language: "en", "metadata.published": true } ``` ## Metadata Best Practices ### 1. Cardinality Considerations **Low Cardinality (Good for $eq filters)**: ```typescript // Few unique values - efficient filtering metadata: { category: "docs", // ~10 categories language: "en", // ~5 languages published: true // 2 values (boolean) } ``` **High Cardinality (Avoid in range queries)**: ```typescript // Many unique values - avoid large range scans metadata: { user_id: "uuid-v4...", // Millions of unique values timestamp_ms: 1704067200123 // Use seconds instead } ``` ### 2. Metadata Limits - **Max 10 metadata indexes** per Vectorize index - **Max 10 KiB metadata** per vector - **String indexes**: First 64 bytes (UTF-8) - **Number indexes**: Float64 precision - **Filter size**: Max 2048 bytes (compact JSON) ### 3. Vector Dimension Limit **Current Limit**: 1536 dimensions per vector **Source**: [GitHub Issue #8729](https://github.com/cloudflare/workers-sdk/issues/8729) **Supported Embedding Models**: - Workers AI `@cf/baai/bge-base-en-v1.5`: 768 dimensions ✅ - OpenAI `text-embedding-3-small`: 1536 dimensions ✅ - OpenAI `text-embedding-3-large`: 3072 dimensions ❌ (requires dimension reduction) **Unsupported Models** (>1536 dimensions): - `nomic-embed-code`: 3584 dimensions - `Qodo-Embed-1-7B`: >1536 dimensions **Workaround**: Use dimensionality reduction (e.g., PCA) to compress embeddings to 1536 or fewer dimensions, though this may reduce semantic quality. **Feature Request**: Higher dimension support is under consideration. Use [Limit Increase Request Form](https://forms.gle/nyamy2SM9zwWTXKE6) if this blocks your use case. ### 4. Key Restrictions ```typescript // ❌ INVALID metadata keys metadata: { "": "value", // Empty key "user.name": "John", // Contains dot (reserved for nesting) "$admin": true, // Starts with $ "key\"with\"quotes": 1 // Contains quotes } // ✅ VALID metadata keys metadata: { "user_name": "John", "isAdmin": true, "nested": { "allowed": true } // Access as "nested.allowed" in filters } ``` ## Best Practices ### Batch Insert Performance **Critical**: Use batch size of 5000 vectors for optimal performance. **Performance Data**: - **Individual inserts**: 2.5M vectors in 36+ hours (incomplete) - **Batch inserts (5000)**: 4M vectors in ~12 hours - **18× faster with proper batching** **Why 5000?** - Vectorize's internal Write-Ahead Log (WAL) optimized for this size - Avoids Cloudflare API rate limits - Balances throughput and memory usage **Optimal Pattern**: ```typescript const BATCH_SIZE = 5000; async function insertVectors(vectors: VectorizeVector[]) { for (let i = 0; i < vectors.length; i += BATCH_SIZE) { const batch = vectors.slice(i, i + BATCH_SIZE); const result = await env.VECTORIZE.insert(batch); console.log(`Inserted batch ${i / BATCH_SIZE + 1}, mutationId: ${result.mutationId}`); // Optional: Rate limiting delay if (i + BATCH_SIZE < vectors.length) { await new Promise(resolve => setTimeout(resolve, 100)); } } } ``` **Sources**: - [Community Report](https://community.cloudflare.com/t/performance-issues-with-large-scale-inserts-in-vectorize/788917) - [Official Best Practices](https://developers.cloudflare.com/vectorize/best-practices/insert-vectors/) --- ### Query Accuracy Modes Vectorize uses approximate nearest neighbor (ANN) search by default with ~80% accuracy compared to exact search. **Default Mode**: Approximate scoring (~80% accuracy) - Faster latency - Good for RAG, search, recommendations - topK up to 100 **High-Precision Mode**: Near 100% accuracy - Enabled via `returnValues: true` - Higher latency - Limited to topK=20 **Trade-off Example**: ```typescript // Fast, ~80% accuracy, topK up to 100 const results = await env.VECTORIZE.query(embedding, { topK: 50, returnValues: false // Default }); // Slower, ~100% accuracy, topK max 20 const preciseResults = await env.VECTORIZE.query(embedding, { topK: 10, returnValues: true // High-precision scoring }); ``` **When to Use High-Precision**: - Critical applications (fraud detection, legal compliance) - Small result sets (topK < 20) - Accuracy is higher priority than latency **Source**: [Cloudflare Blog - Building Vectorize](https://blog.cloudflare.com/building-vectorize-a-distributed-vector-database-on-cloudflare-developer-platform/) --- ## Common Errors & Solutions ### Error 1: Metadata Index Created After Vectors Inserted ``` Problem: Filtering doesn't work on existing vectors Solution: Delete and re-insert vectors OR create metadata indexes BEFORE inserting ``` ### Error 2: Dimension Mismatch ``` Problem: "Vector dimensions do not match index configuration" Solution: Ensure embedding model output matches index dimensions: - Workers AI bge-base: 768 - OpenAI small: 1536 - OpenAI large: 3072 ``` ### Error 3: Invalid Metadata Keys ``` Problem: "Invalid metadata key" Solution: Keys cannot: - Be empty - Contain . (dot) - Contain " (quote) - Start with $ (dollar sign) ``` ### Error 4: Filter Too Large ``` Problem: "Filter exceeds 2048 bytes" Solution: Simplify filter or split into multiple queries ``` ### Error 5: Range Query on High Cardinality ``` Problem: Slow queries or reduced accuracy Solution: Use lower cardinality fields for range queries, or use seconds instead of milliseconds for timestamps ``` ### Error 6: Insert vs Upsert Confusion ``` Problem: Updates not reflecting in index Solution: Use upsert() to overwrite existing vectors, not insert() ``` ### Error 7: Missing Bindings ``` Problem: "VECTORIZE_INDEX is not defined" Solution: Add [[vectorize]] binding to wrangler.jsonc ``` ### Error 8: Namespace vs Metadata Confusion ``` Problem: Unclear when to use namespace vs metadata filtering Solution: - Namespace: Partition key, applied BEFORE metadata filters - Metadata: Flexible key-value filtering within namespace ``` ### Error 9: V2 Async Mutation Timing (NEW in V2) ``` Problem: Inserted vectors not immediately queryable Solution: V2 mutations are asynchronous - vectors may take a few seconds to be reflected - Use mutationId to track mutation status - Check env.VECTORIZE_INDEX.describe() for processedUpToMutation timestamp ``` ### Error 10: V1 returnMetadata Boolean (BREAKING in V2) ``` Problem: "returnMetadata must be 'all', 'indexed', or 'none'" Solution: V2 changed returnMetadata from boolean to string enum: - ❌ V1: { returnMetadata: true } - ✅ V2: { returnMetadata: 'all' } ``` ### Error 11: Wrangler --json Output Contains Log Prefix **Error**: `wrangler vectorize list --json` output starts with log message, breaking JSON parsing **Source**: [GitHub Issue #11011](https://github.com/cloudflare/workers-sdk/issues/11011) **Affected Commands**: - `wrangler vectorize list --json` - `wrangler vectorize list-metadata-index --json` **Problem**: ```bash $ wrangler vectorize list --json 📋 Listing Vectorize indexes... [ { "created_on": "2025-10-18T13:28:30.259277Z", ... } ] ``` The log message makes output invalid JSON, breaking piping to `jq` or other tools. **Solution**: Strip first line before parsing: ```bash # Using tail wrangler vectorize list --json | tail -n +2 | jq '.' # Using sed wrangler vectorize list --json | sed '1d' | jq '.' ``` --- ### Error 12: TypeScript Types Missing Filter Operators **Error**: `wrangler types` generates incomplete `VectorizeVectorMetadataFilterOp` type **Source**: [GitHub Issue #10092](https://github.com/cloudflare/workers-sdk/issues/10092) **Status**: OPEN (tracked internally as VS-461) **Problem**: Generated type only includes `$eq` and `$ne`, missing V2 operators: `$in`, `$nin`, `$lt`, `$lte`, `$gt`, `$gte` **Impact**: TypeScript shows false errors when using valid V2 metadata filter operators: ```typescript const vectorizeRes = env.VECTORIZE.queryById(imgId, { filter: { gender: { $in: genderFilters } }, // ❌ TS error but works! topK, returnMetadata: 'indexed', }); ``` **Workaround**: Manual type override until wrangler types is fixed: ```typescript // Add to your types file type VectorizeMetadataFilter = Record; ``` --- ### Error 13: Windows Dev Registry Failure (FIXED) **Error**: `ENOENT: no such file or directory` when running `wrangler dev` on Windows **Source**: [GitHub Issue #10383](https://github.com/cloudflare/workers-sdk/issues/10383) **Status**: FIXED in wrangler@4.32.0 **Problem**: Wrangler attempted to create external worker files with colons in the name (invalid on Windows): ``` Error: ENOENT: ... '__WRANGLER_EXTERNAL_VECTORIZE_WORKER::' ``` **Solution**: Update to wrangler@4.32.0 or later: ```bash npm install -g wrangler@latest ``` --- ### Error 14: topK Limit Depends on returnValues/returnMetadata **Error**: `topK exceeds maximum allowed value` **Source**: [Vectorize Limits](https://developers.cloudflare.com/vectorize/platform/limits/) **Problem**: Maximum topK value changes based on query options: | Configuration | Max topK | |---------------|----------| | `returnValues: false`, `returnMetadata: 'none'` | 100 | | `returnValues: true` OR `returnMetadata: 'all'` | 20 | | `returnMetadata: 'indexed'` | 100 | **Common Error**: ```typescript // ❌ ERROR - topK too high with returnValues query(embedding, { topK: 100, // Exceeds limit! returnValues: true // Max topK=20 when true }); ``` **Solution**: ```typescript // ✅ OK - respects conditional limit query(embedding, { topK: 20, returnValues: true }); // ✅ OK - higher topK without values query(embedding, { topK: 100, returnValues: false, returnMetadata: 'indexed' }); ``` --- ## V2 Migration Checklist **If migrating from V1 to V2**: 1. ✅ Update wrangler to 3.71.0+ (`npm install -g wrangler@latest`) 2. ✅ Create new V2 index (can't upgrade V1 → V2) 3. ✅ Create metadata indexes BEFORE inserting vectors 4. ✅ Update `returnMetadata` boolean → string enum ('all', 'indexed', 'none') 5. ✅ Handle async mutations (expect `mutationId` in responses) 6. ✅ Test with V2 limits (topK up to 100, 5M vectors per index) 7. ✅ Update error handling for async behavior **V1 Deprecation**: - After December 2024: Cannot create new V1 indexes - Existing V1 indexes: Continue to work - Use `wrangler vectorize --deprecated-v1` for V1 operations --- ## Testing Considerations ### Vitest with Vectorize Bindings **Issue**: Using `@cloudflare/vitest-pool-workers` with Vectorize or Workers AI bindings causes runtime failure. **Source**: [GitHub Issue #7434](https://github.com/cloudflare/workers-sdk/issues/7434) **Error**: `wrapped binding module can't be resolved` **Workaround**: 1. Create `wrangler-test.jsonc` without Vectorize/AI bindings 2. Point vitest config to test-specific wrangler file 3. Mock bindings in your tests **Example**: ```typescript // wrangler-test.jsonc (no Vectorize binding) { "name": "my-worker-test", "main": "src/index.ts", "compatibility_date": "2025-10-21" // No vectorize binding } // vitest.config.ts import { defineWorkersProject } from '@cloudflare/vitest-pool-workers/config'; export default defineWorkersProject({ test: { poolOptions: { workers: { wrangler: { configPath: "./wrangler-test.jsonc" } } } } }); // Mock in tests import { vi } from 'vitest'; const mockVectorize = { query: vi.fn().mockResolvedValue({ matches: [ { id: 'test-1', score: 0.95, metadata: { category: 'docs' } } ], count: 1 }), insert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }), upsert: vi.fn().mockResolvedValue({ mutationId: "test-mutation-id" }) }; // Use mock in tests test('vector search', async () => { const env = { VECTORIZE_INDEX: mockVectorize }; // ... test logic }); ``` --- ## Community Tips **Note**: These tips come from community discussions and official blog posts. Verify against your Vectorize version. ### Tip 1: Range Queries at Scale May Have Reduced Accuracy (Community-sourced) **Source**: [Query Best Practices](https://developers.cloudflare.com/vectorize/best-practices/query-vectors/) **Confidence**: MEDIUM **Applies to**: Datasets with ~10M+ vectors Range queries (`$lt`, `$lte`, `$gt`, `$gte`) on large datasets may experience reduced accuracy. **Optimization Strategy**: ```typescript // ❌ High-cardinality range at scale metadata: { timestamp_ms: 1704067200123 } filter: { timestamp_ms: { $gte: 1704067200000 } } // ✅ Bucketed into discrete values metadata: { timestamp_bucket: "2025-01-01-00:00", // 1-hour buckets timestamp_ms: 1704067200123 // Original (non-indexed) } filter: { timestamp_bucket: { $in: ["2025-01-01-00:00", "2025-01-01-01:00"] } } ``` **When This Matters**: - Time-based filtering over months/years - User IDs, transaction IDs (UUID ranges) - Any high-cardinality continuous data **Alternative**: Use equality filters (`$eq`, `$in`) with bucketed values. --- ### Tip 2: List Vectors Operation (Added August 2025) **Source**: [Vectorize Changelog](https://developers.cloudflare.com/vectorize/platform/changelog/) Vectorize V2 added support for the `list-vectors` operation for paginated iteration through vector IDs. **Use Cases**: - Auditing vector collections - Bulk vector operations - Debugging index contents **API**: ```typescript const result = await env.VECTORIZE_INDEX.list({ limit: 1000, // Max 1000 per page cursor?: string }); // result.vectors: Array<{ id: string }> // result.cursor: string | undefined // result.count: number // Pagination example let cursor: string | undefined; const allVectorIds: string[] = []; do { const result = await env.VECTORIZE_INDEX.list({ limit: 1000, cursor }); allVectorIds.push(...result.vectors.map(v => v.id)); cursor = result.cursor; } while (cursor); ``` **Limitations**: - Returns IDs only (not values or metadata) - Max 1000 vectors per page - Use cursor for pagination --- ## Official Documentation - **Vectorize V2 Docs**: https://developers.cloudflare.com/vectorize/ - **V2 Changelog**: https://developers.cloudflare.com/vectorize/platform/changelog/ - **V1 to V2 Migration**: https://developers.cloudflare.com/vectorize/reference/transition-vectorize-legacy/ - **Metadata Filtering**: https://developers.cloudflare.com/vectorize/reference/metadata-filtering/ - **Workers AI Models**: https://developers.cloudflare.com/workers-ai/models/ --- **Status**: Production Ready ✅ (Vectorize V2 GA - September 2024) **Last Updated**: 2026-01-21 **Token Savings**: ~70% **Errors Prevented**: 14 (includes V2 breaking changes, testing setup, TypeScript types) **Changes**: Added 4 new errors (wrangler --json, TypeScript types, Windows dev, topK limits), batch performance best practices, query accuracy modes, testing setup, community tips on range queries and list-vectors operation.