--- name: exa-architecture-variants description: | Choose and implement Exa validated architecture blueprints for different scales. Use when designing new Exa integrations, choosing between monolith/service/microservice architectures, or planning migration paths for Exa applications. Trigger with phrases like "exa architecture", "exa blueprint", "how to structure exa", "exa project layout", "exa microservice". allowed-tools: Read, Grep version: 1.0.0 license: MIT author: Jeremy Longshore compatible-with: claude-code, codex, openclaw --- # Exa Architecture Variants ## Overview Deployment architectures for Exa neural search at different scales. Exa's search-and-contents model supports everything from simple search features to full RAG pipelines and semantic knowledge bases. ## Prerequisites - Exa API configured - Clear search use case defined - Infrastructure for chosen architecture tier ## Instructions ### Step 1: Direct Search Integration (Simple) **Best for:** Adding search to an existing app, < 1K queries/day. ``` User Query -> Backend -> Exa Search API -> Format Results -> User ``` ```python from exa_py import Exa exa = Exa(api_key=os.environ["EXA_API_KEY"]) @app.route('/search') def search(): query = request.args.get('q') results = exa.search_and_contents( query, num_results=5, text={"max_characters": 1000} # 1000: 1 second in ms ) return jsonify([{ "title": r.title, "url": r.url, "snippet": r.text[:200] # HTTP 200 OK } for r in results.results]) ``` ### Step 2: Cached Search with Semantic Layer (Moderate) **Best for:** High-traffic search, 1K-50K queries/day, content aggregation. ``` User Query -> Cache Check -> (miss) -> Exa API -> Cache Store -> User | v (hit) Cached Results -> User ``` ```python class CachedExaSearch: def __init__(self, exa_client, redis_client, ttl=600): # 600: timeout: 10 minutes self.exa = exa_client self.cache = redis_client self.ttl = ttl def search(self, query: str, **kwargs): key = self._cache_key(query, **kwargs) cached = self.cache.get(key) if cached: return json.loads(cached) results = self.exa.search_and_contents(query, **kwargs) serialized = self._serialize(results) self.cache.setex(key, self.ttl, json.dumps(serialized)) return serialized ``` ### Step 3: RAG Pipeline with Exa as Knowledge Source (Scale) **Best for:** AI-powered apps, 50K+ queries/day, LLM-augmented answers. ``` User Query -> Query Planner -> Exa Search -> Content Extraction | v Vector Store (cache) | v LLM Generation with Context -> User ``` ```python class ExaRAGPipeline: def __init__(self, exa, llm, vector_store): self.exa = exa self.llm = llm self.vectors = vector_store async def answer(self, question: str) -> dict: # 1. Search for relevant content results = self.exa.search_and_contents( question, num_results=5, text={"max_characters": 3000}, # 3000: 3 seconds in ms highlights=True ) # 2. Store in vector cache for future queries for r in results.results: self.vectors.upsert(r.url, r.text, {"title": r.title}) # 3. Generate answer with citations context = "\n\n".join([f"[{i+1}] {r.text}" for i, r in enumerate(results.results)]) answer = await self.llm.generate( f"Based on the following sources, answer: {question}\n\n{context}" ) return {"answer": answer, "sources": [r.url for r in results.results]} ``` ## Decision Matrix | Factor | Direct | Cached | RAG Pipeline | |--------|--------|--------|--------------| | Volume | < 1K/day | 1K-50K/day | 50K+/day | | Latency | 1-3s | 50ms (cached) | 3-8s | | Use Case | Simple search | Content aggregation | AI-powered answers | | Complexity | Low | Medium | High | ## Error Handling | Issue | Cause | Solution | |-------|-------|----------| | Slow search in UI | No caching | Add result cache with TTL | | Stale cached results | Long TTL | Reduce TTL for time-sensitive queries | | RAG hallucination | Poor source selection | Use highlights, increase num_results | | High API costs | No query deduplication | Cache layer deduplicates identical queries | ## Examples **Basic usage**: Apply exa architecture variants to a standard project setup with default configuration options. **Advanced scenario**: Customize exa architecture variants for production environments with multiple constraints and team-specific requirements. ## Resources - [Exa API Docs](https://docs.exa.ai) - [Exa RAG Guide](https://docs.exa.ai/reference/rag) ## Output - Configuration files or code changes applied to the project - Validation report confirming correct implementation - Summary of changes made and their rationale