--- name: cva-patterns-context description: Context management patterns for multi-source AI agents in Clojure+Vertex AI. Covers 4 context types (static/query/API/previous-result), lifecycle management (load/cache/invalidate), TTL strategies, and LGPD-compliant sensitive data handling. Includes production metrics (58% cost reduction via caching). Use when designing agent contexts, implementing multi-source data integration, optimizing cache strategies, or building LGPD-compliant systems. allowed-tools: Read,Bash,Edit,Write,Glob --- # Context Management > **Pattern Type:** Architectural + Optimization > **Complexity:** Medium > **Best For:** Agents requiring multiple data sources, production systems with caching needs, LGPD/compliance requirements ## 🎯 Overview Context management patterns enable efficient integration of multiple data sources into AI agent prompts. This pattern solves three challenges: 1. **Multi-Source Integration**: Combining static files, databases, APIs, and previous agent results 2. **Performance Optimization**: Caching strategies to reduce latency and cost 3. **Security & Compliance**: LGPD-compliant handling of sensitive data **When to Use:** - Agent needs data from 2+ sources (files, DB, API) - Production system requiring cache optimization - Healthcare/financial domain with PII/sensitive data - Multi-tenant systems with per-tenant contexts **Trade-offs:** - **Complexity**: Adds cache invalidation logic and lifecycle management - **Memory**: Static contexts consume ~0.45 MB, query contexts ~2 MB (100 cached) - **Performance**: Cache reduces latency by 33% and cost by 58% **Production ROI:** Healthcare pipeline achieved 58% cost reduction ($0.391 → $0.162) via aggressive caching of API contexts (73% hit rate) and query contexts (85% hit rate). ## 📊 Pattern Explanation ### Core Concept Contexts are categorized by source and caching strategy: ``` CONTEXT TYPE TAXONOMY 1. STATIC (Filesystem) - LGPD guidelines, JSON schemas, disclaimers - Loaded once, cached permanently - Latency: 0ms (after startup preload) - Cost: $0 2. QUERY (Database) - Professional profiles, SEO keywords - TTL cache (1h typical) - Latency: 35ms (miss) / <1ms (hit) - Cost: Negligible (DB query) 3. API (External Services) - PubMed articles, grounding data - TTL cache (24h typical) - Latency: 1.8s (miss) / <1ms (hit) - Cost: Variable (API-dependent) 4. PREVIOUS RESULT (Pipeline State) - Output from previous agent - In-memory only (no persistence) - Latency: 0ms - Cost: $0 ``` ### Implementation Approach **Step 1**: Identify context sources for your agent - Static: Unchanging guidelines, schemas, templates - Query: Per-tenant/per-user data from database - API: Real-time data from external services - Previous: Results from earlier agents in pipeline **Step 2**: Choose TTL strategy per source - Static: Permanent (reload only on deploy) - Query: 1h (balance freshness vs hit rate) - API: 24h (external data changes slowly) - Previous: N/A (ephemeral pipeline state) **Step 3**: Implement cache invalidation - Static: Explicit reload on content update - Query: Invalidate on database write (e.g., profile update) - API: Force refresh on user request or timeout - Previous: Garbage collected with pipeline execution **Step 4**: Add security scanning - Detect PII/sensitive data (CPF, health records) - Redact or mask before LLM processing - Audit access for LGPD compliance ## 💻 Clojure Implementation ### Basic Example: Static Context with Lazy Loading ```clojure (ns lab.contexts.static "Static context management with permanent cache" (:require [clojure.java.io :as io] [cheshire.core :as json])) (defrecord StaticContext [id ; Keyword identifier (:compliance-lgpd, :json-schema-extraction, etc.) type ; :markdown, :json, :edn content ; Parsed content (string or map) size-bytes ; Content size in bytes loaded-at ; Timestamp (millis since epoch) version]) ; Version string (e.g., "1.0.0") (defn load-static-context "Load static context from resources/ directory. Contexts are typically stored in resources/contexts/ and loaded once at startup. Content is parsed based on type. Args: id - Keyword identifier path - Relative path in resources/ (e.g., 'contexts/lgpd.md') opts - {:type :markdown/:json/:edn, :version string} Returns: StaticContext record Example: (load-static-context :compliance-lgpd 'contexts/diretrizes_protecao_dados.md' {:type :markdown, :version '1.0.0'})" [id path opts] (let [resource (io/resource path) content-str (slurp resource) size (count (.getBytes content-str "UTF-8")) ;; Parse based on type parsed-content (case (:type opts) :json (json/parse-string content-str true) :edn (clojure.edn/read-string content-str) :markdown content-str content-str)] (map->StaticContext {:id id :type (:type opts :markdown) :content parsed-content :size-bytes size :loaded-at (System/currentTimeMillis) :version (:version opts "1.0.0")}))) ;; Catalog of available static contexts (defonce static-contexts-catalog "Registry of all static contexts in the system. Each entry defines: - path: Location in resources/ - type: Content format - version: Semantic version - description: Human-readable purpose" {:compliance-lgpd {:path "contexts/diretrizes_protecao_dados.md" :type :markdown :version "1.0.0" :description "LGPD data protection guidelines"} :json-schema-extraction {:path "contexts/formato_json_extracao.json" :type :json :version "1.0.0" :description "JSON Schema for S.1.1 extraction output validation"} :disclaimers-cfm {:path "contexts/disclaimers_cfm_crp.md" :type :markdown :version "1.0.0" :description "Mandatory CFM/CRP medical disclaimers"}}) ;; Lazy-loaded cache (load on first access) (defonce static-contexts-cache "Permanent in-memory cache for static contexts. Contexts are loaded lazily on first access via get-static-context. Cache persists for application lifetime (no TTL eviction)." (atom {})) (defn get-static-context "Retrieve static context from cache (load if necessary). Uses lazy loading pattern: context is loaded on first access, then cached permanently. Subsequent accesses are instant (0ms). Args: id - Keyword from static-contexts-catalog Returns: StaticContext record or nil if not found Example: (def lgpd-ctx (get-static-context :compliance-lgpd)) (:content lgpd-ctx) ;; => '# Diretrizes de Proteção de Dados...'" [id] (or (@static-contexts-cache id) (when-let [catalog-entry (get static-contexts-catalog id)] (let [loaded (load-static-context id (:path catalog-entry) catalog-entry)] (swap! static-contexts-cache assoc id loaded) loaded)))) (defn preload-all-contexts! "Eagerly load all static contexts at startup. Recommended for production: eliminates cold start latency on first request. Loads all contexts in catalog concurrently. Returns: {:loaded-count int :total-size-mb float :duration-ms int} Example: (preload-all-contexts!) ;; => {:loaded-count 12, :total-size-mb 0.45, :duration-ms 127} ;; After preload, all get-static-context calls are instant" [] (let [start-time (System/currentTimeMillis)] (doseq [[id _] static-contexts-catalog] (get-static-context id)) (let [end-time (System/currentTimeMillis) total-size (reduce + (map #(:size-bytes %) (vals @static-contexts-cache)))] {:loaded-count (count @static-contexts-cache) :total-size-mb (/ total-size 1048576.0) :duration-ms (- end-time start-time)}))) (comment ;; Startup: Eagerly load all contexts (production pattern) (preload-all-contexts!) ;; => {:loaded-count 12, :total-size-mb 0.45, :duration-ms 127} ;; Runtime: Instant access after preload (def lgpd-context (get-static-context :compliance-lgpd)) (:content lgpd-context) ;; => "# Diretrizes de Proteção de Dados\n\n## Princípios..." ;; Production metrics: ;; - Memory usage: 0.45 MB total (12 contexts) ;; - Startup overhead: 127ms (one-time) ;; - Access latency: 0ms (after preload) ;; - Cache hit rate: 100% (permanent cache) ) ``` ### Production Example: Multi-Source Context with Caching ```clojure (ns lab.contexts.multi-source "Production context management with multi-layer caching" (:require [clojure.core.cache :as cache] [next.jdbc :as jdbc] [next.jdbc.result-set :as rs] [clj-http.client :as http] [cheshire.core :as json] [lab.contexts.static :as static-ctx])) (defrecord QueryContext [id ; Cache key (vector of [type params]) content ; Formatted content (string for prompt injection) cached-at ; Timestamp (millis) ttl-ms ; Time-to-live in milliseconds source]) ; :database or :api ;; TTL cache for query contexts (1h default) (defonce query-contexts-cache (atom (cache/ttl-cache-factory {} :ttl (* 60 60 1000)))) ;; TTL cache for API contexts (24h default) (defonce api-contexts-cache (atom (cache/ttl-cache-factory {} :ttl (* 24 60 60 1000)))) (defn fetch-professional-profile "Query professional profile from database. Returns: Map with raw database columns" [db-spec prof-id] (jdbc/execute-one! (jdbc/get-datasource db-spec) ["SELECT nome_completo, crm, especialidade, anos_experiencia, tom_voz, cidade_atuacao, bio FROM profissionais WHERE id = ?::uuid AND ativo = true" (str prof-id)] {:builder-fn rs/as-unqualified-lower-maps})) (defn format-professional-profile "Format profile for LLM prompt injection (Markdown). Args: profile - Raw database map Returns: Formatted string" [profile] (format "**Perfil Profissional:** - Nome: %s - Registro: %s %s - Especialidade: %s - Experiência: %d anos - Tom de voz: %s - Cidade: %s **Bio:** %s" (:nome_completo profile) (if (= (:especialidade profile) "Medicina") "CRM" "CRP") (:crm profile) (:especialidade profile) (:anos_experiencia profile) (:tom_voz profile "Profissional e acolhedor") (:cidade_atuacao profile) (:bio profile ""))) (defn fetch-pubmed-articles "Fetch scientific articles from PubMed API. Two-step process: 1. esearch: Get article IDs for query 2. esummary: Fetch article metadata Args: query - Search query string max-results - Number of articles (default 5) timeout-ms - Request timeout (default 2500ms) Returns: {:success? boolean :articles [{:pmid, :title, :authors, :journal, :doi, :link}] :metadata {:latency-ms, :source :pubmed}}" [query max-results & [timeout-ms]] (let [start-time (System/currentTimeMillis) timeout-ms (or timeout-ms 2500)] (try ;; Step 1: Search for article IDs (let [search-response (http/get "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi" {:query-params {:db "pubmed" :term query :retmax max-results :retmode "json"} :socket-timeout timeout-ms :connection-timeout timeout-ms :as :json}) pmids (get-in search-response [:body :esearchresult :idlist])] (if (empty? pmids) {:success? false :error "No articles found" :metadata {:latency-ms (- (System/currentTimeMillis) start-time) :source :pubmed}} ;; Step 2: Fetch article summaries (let [summary-response (http/get "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi" {:query-params {:db "pubmed" :id (clojure.string/join "," pmids) :retmode "json"} :socket-timeout timeout-ms :connection-timeout timeout-ms :as :json}) result-map (get-in summary-response [:body :result]) articles (mapv (fn [pmid] (let [article (get result-map pmid)] {:pmid pmid :title (:title article) :authors (take 3 (:authors article)) :journal (:fulljournalname article) :pubdate (:pubdate article) :doi (:doi article) :link (str "https://pubmed.ncbi.nlm.nih.gov/" pmid "/")})) pmids)] {:success? true :articles articles :metadata {:latency-ms (- (System/currentTimeMillis) start-time) :source :pubmed :query query :results-count (count articles)}}))) (catch java.net.SocketTimeoutException e {:success? false :error "PubMed API timeout" :timeout? true :metadata {:latency-ms timeout-ms :source :pubmed}}) (catch Exception e {:success? false :error (.getMessage e) :metadata {:latency-ms (- (System/currentTimeMillis) start-time) :source :pubmed}})))) (defn get-api-context "Retrieve API context with aggressive caching and fallback. Caching strategy: - Cache successful responses for 24h (scientific data is stable) - Cache miss: Call API with timeout protection - API failure: Use fallback value if provided Args: api-type - :pubmed, :google-scholar, :grounding params - Map {:query string, :max-results int} opts - {:ttl-ms int, :timeout-ms int, :fallback-value string, :force-refresh? boolean} Returns: {:id cache-key :content formatted-string :cached-at timestamp :ttl-ms int :api-source keyword :metadata {:latency-ms, :from-cache?, :fallback-used?}} Example: (get-api-context :pubmed {:query 'anxiety treatment CBT', :max-results 5} {:timeout-ms 3000 :fallback-value 'References temporarily unavailable'}) ;; First call: 1847ms (API call) ;; Second call: <1ms (cache hit)" [api-type params & [opts]] (let [cache-key [api-type params] force-refresh? (:force-refresh? opts false) ttl-ms (:ttl-ms opts (* 24 60 60 1000)) ; 24h default timeout-ms (:timeout-ms opts 2500)] (if (and (not force-refresh?) (cache/has? @api-contexts-cache cache-key)) ;; Cache hit (let [cached (cache/lookup @api-contexts-cache cache-key)] (update cached :metadata assoc :from-cache? true)) ;; Cache miss - call API (let [api-result (case api-type :pubmed (fetch-pubmed-articles (:query params) (:max-results params 5) timeout-ms) (throw (ex-info "Unknown API type" {:type api-type})))] (if (:success? api-result) ;; Success - format and cache (let [formatted-content (format-pubmed-articles (:articles api-result)) ctx (map->QueryContext {:id cache-key :content formatted-content :cached-at (System/currentTimeMillis) :ttl-ms ttl-ms :source :api :metadata (assoc (:metadata api-result) :from-cache? false)})] (swap! api-contexts-cache cache/miss cache-key ctx) ctx) ;; Failure - use fallback if available (if-let [fallback (:fallback-value opts)] (do (println "⚠️ API" api-type "failed - using fallback") (map->QueryContext {:id cache-key :content fallback :cached-at (System/currentTimeMillis) :ttl-ms ttl-ms :source :api :metadata (assoc (:metadata api-result) :from-cache? false :fallback-used? true)})) ;; No fallback - propagate error (throw (ex-info "API call failed and no fallback provided" {:api-type api-type :error (:error api-result) :metadata (:metadata api-result)})))))))) (defn format-pubmed-articles "Format PubMed articles for prompt injection. Returns: Markdown-formatted string" [articles] (str "**Referências Científicas (PubMed):**\n\n" (clojure.string/join "\n\n" (map-indexed (fn [idx article] (format "%d. **%s**\n - Autores: %s\n - Journal: %s (%s)\n - PMID: %s | DOI: %s" (inc idx) (:title article) (clojure.string/join ", " (map :name (:authors article))) (:journal article) (:pubdate article) (:pmid article) (:doi article "N/A"))) articles)))) (comment ;; Usage: Fetch PubMed context with caching (def pubmed-ctx (get-api-context :pubmed {:query "anxiety treatment cognitive behavioral therapy" :max-results 5} {:timeout-ms 3000 :fallback-value "Scientific references temporarily unavailable."})) (get-in pubmed-ctx [:metadata :latency-ms]) ;; => 1847ms (first call) (get-in pubmed-ctx [:metadata :from-cache?]) ;; => false ;; Second call (cache hit) (def pubmed-ctx-2 (get-api-context :pubmed {:query "..." :max-results 5})) (get-in pubmed-ctx-2 [:metadata :from-cache?]) ;; => true (get-in pubmed-ctx-2 [:metadata :latency-ms]) ;; => 1847ms (original) ;; Production metrics (healthcare pipeline): ;; - Cache hit rate: 73% ;; - Latency (cache miss): 1.8s average ;; - Latency (cache hit): <1ms ;; - Cost savings: 73% × $0.067 = $0.049 per request ;; - Total savings: $0.115 per pipeline (-29% total cost) ) ``` ## 💡 Best Practices 1. **Preload Static Contexts at Startup** - **Rationale**: Eliminates cold start latency. Static contexts are ~0.45 MB total, acceptable memory overhead for instant access. - **Example**: Healthcare pipeline preloads 12 contexts in 127ms. First request avoids 127ms delay. 2. **Use Aggressive TTLs for Stable Data** - **Rationale**: Scientific articles, SEO keywords change slowly. 24h TTL achieves 73% cache hit rate with no quality impact. - **Example**: PubMed cache (24h TTL) saves $0.049 per request, 73% of time. Annual savings: $0.049 × 0.73 × 12000 = $429. 3. **Invalidate Query Contexts on Database Writes** - **Rationale**: Stale cache causes incorrect agent behavior. Invalidate immediately when source data changes. - **Example**: When user updates profile, invalidate `:professional-profile` cache for that user. Next request fetches fresh data. 4. **Always Provide Fallbacks for API Contexts** - **Rationale**: External APIs have 2-3% timeout rate. Fallback prevents pipeline failure. - **Example**: PubMed fallback is "References temporarily unavailable." Agent generates content without references instead of crashing. 5. **Scan for Sensitive Data Before LLM Processing** - **Rationale**: LGPD/GDPR require PII protection. Prevent accidental logging/transmission of CPF, health records. - **Example**: S.1.1 (data extraction) scans for CPF, health diagnoses. Redacts before storing in checkpoint database. 6. **Monitor Cache Hit Rates** - **Rationale**: Low hit rates indicate wrong TTL or cache key strategy. Measure to optimize. - **Example**: Initially used query string as PubMed cache key. Hit rate was 12%. Changed to semantic hash of query intent → 73% hit rate. ## 🔗 Related Skills - [`cva-concepts-agent-types`](../cva-concepts-agent-types/SKILL.md) - Agent types requiring different context strategies ⭐ - [`cva-healthcare-pipeline`](../cva-healthcare-pipeline/SKILL.md) - Real 4-type context usage ⭐ - [`cva-patterns-workflows`](../cva-patterns-workflows/SKILL.md) - Pipeline state management - [`cva-patterns-cost`](../cva-patterns-cost/SKILL.md) - Caching for cost optimization - [`cva-basics-prompts`](../cva-basics-prompts/SKILL.md) - Injecting contexts into prompts - [`cva-security-lgpd`](../cva-security-lgpd/SKILL.md) - LGPD-compliant data handling ## 📘 Additional Resources ### Pattern Variations **Distributed Cache**: Use Redis instead of in-memory cache for multi-instance deployments. Enables cache sharing across application instances. **Semantic Caching**: Cache by semantic similarity instead of exact match. Example: "anxiety treatment" and "treating anxiety" map to same cache entry (95% similarity threshold). **Versioned Contexts**: Store multiple versions of static contexts, switch per deployment. Enables A/B testing of prompt variations without code changes. ### Advanced Topics **Cache Warming**: Pre-populate cache on startup with most common queries. Healthcare pipeline warms 5 common PubMed queries (68% hit rate in first 2 hours vs 12% cold). **Multi-Tenant Context Isolation**: Ensure tenant A cannot access tenant B's cached contexts. Use tenant ID in cache key, enforce row-level security in database. **Context Compression**: Gzip large contexts before caching. Healthcare LGPD guidelines (12KB) compress to 3KB, 4x memory savings for 100+ cached contexts. ### Security Considerations **PII Detection Patterns** (LGPD-specific): - CPF: `\d{3}\.\d{3}\.\d{3}-\d{2}` - CNS (health card): `\d{15}` - Medical diagnoses: `CID-10`, `F\d{2}\.\d` (ICD codes) - Email, phone: Standard regex patterns **Redaction Strategy**: - **Mask**: Replace with asterisks (for display) - **Remove**: Delete entirely (for LLM processing) - **Anonymize**: Replace with fake data preserving format **Audit Requirements**: - Log all context accesses with user ID, timestamp - Track which agents accessed which sensitive data - Retention period: 5 years (LGPD requirement) ### Performance Benchmarks | Context Type | Latency (miss) | Latency (hit) | Hit Rate | Memory | Use Case | |--------------|----------------|---------------|----------|--------|----------| | Static | 127ms (startup) | 0ms | 100% | 0.45 MB | Guidelines, schemas | | Query | 35ms | <1ms | 85% | ~2 MB | User profiles, config | | API | 1.8s | <1ms | 73% | Variable | External data | | Previous | 0ms | 0ms | N/A | Negligible | Pipeline state | **Cache Savings** (healthcare pipeline, 1000 executions/month): - API contexts: $0.049 × 0.73 × 1000 = $35.77/month - Query contexts: $0.066 × 0.85 × 1000 = $56.10/month - **Total**: $91.87/month savings from caching alone