# @memberjunction/ai-prompts

Advanced AI prompt execution engine for MemberJunction. Provides hierarchical template composition, intelligent model selection with failover, parallel execution with judge-based result selection, structured output validation with retry, comprehensive execution tracking, and streaming support.

This is the primary interface for executing AI prompts in the MemberJunction framework.

## Guides

- **[Assistant Prefill & Stop Sequences](PREFILL_AND_STOP_SEQUENCES.md)**: How to use `assistantPrefill` and `stopSequences` to control output format, reduce token usage, and eliminate verbose format instructions from prompts.

## Architecture

```mermaid
graph TD
    subgraph "@memberjunction/ai-prompts"
        PR["AIPromptRunner"]
        style PR fill:#2d8659,stroke:#1a5c3a,color:#fff
        EP["ExecutionPlanner"]
        style EP fill:#7c5295,stroke:#563a6b,color:#fff
        PEC["ParallelExecutionCoordinator"]
        style PEC fill:#7c5295,stroke:#563a6b,color:#fff
        PE["ParallelExecution"]
        style PE fill:#7c5295,stroke:#563a6b,color:#fff
    end

    subgraph "Execution Pipeline"
        T["1. Template Rendering<br/>Handlebars + System Placeholders"]
        style T fill:#b8762f,stroke:#8a5722,color:#fff
        MS["2. Model Selection<br/>Default / Specific / ByPower"]
        style MS fill:#b8762f,stroke:#8a5722,color:#fff
        EX["3. LLM Execution<br/>With Streaming & Caching"]
        style EX fill:#b8762f,stroke:#8a5722,color:#fff
        VAL["4. Output Validation<br/>JSON Schema + Retry"]
        style VAL fill:#b8762f,stroke:#8a5722,color:#fff
        TRK["5. Execution Tracking<br/>AIPromptRun Records"]
        style TRK fill:#b8762f,stroke:#8a5722,color:#fff
    end

    PR --> EP
    PR --> PEC
    PEC --> PE
    PR --> T
    T --> MS
    MS --> EX
    EX --> VAL
    VAL --> TRK

    subgraph Dependencies
        AI["@memberjunction/ai<br/>BaseLLM"]
        style AI fill:#2d6a9f,stroke:#1a4971,color:#fff
        ACP["@memberjunction/ai-core-plus<br/>AIPromptParams"]
        style ACP fill:#2d6a9f,stroke:#1a4971,color:#fff
        AIE["@memberjunction/aiengine<br/>AIEngine"]
        style AIE fill:#2d6a9f,stroke:#1a4971,color:#fff
        TMPL["@memberjunction/templates<br/>TemplateEngine"]
        style TMPL fill:#2d6a9f,stroke:#1a4971,color:#fff
        CRED["@memberjunction/credentials<br/>CredentialEngine"]
        style CRED fill:#2d6a9f,stroke:#1a4971,color:#fff
    end

    AI --> PR
    ACP --> PR
    AIE --> PR
    TMPL --> PR
    CRED --> PR
```

## Installation

```bash
npm install @memberjunction/ai-prompts
```

## Key Features

### Hierarchical Template Composition

Build complex prompts from reusable sub-templates with unlimited nesting depth:

```typescript
import { AIPromptRunner } from '@memberjunction/ai-prompts';
import { AIPromptParams, ChildPromptParam } from '@memberjunction/ai-core-plus';

const runner = new AIPromptRunner();

// Parent template uses {{ analysis }} and {{ summary }} placeholders
const params = new AIPromptParams();
params.prompt = parentPrompt;
params.childPrompts = [
    new ChildPromptParam(analysisParams, 'analysis'),
    new ChildPromptParam(summaryParams, 'summary')
];
params.data = { userInput: 'complex data to process' };

const result = await runner.ExecutePrompt(params);
```

Execution order:

1. Child prompts render depth-first (children before parents)
2. Sibling prompts at each level execute in parallel
3. Child results replace placeholders in parent template
4. Final composed prompt executes as a single LLM call

### Model Selection Strategies

Three strategies for selecting which AI model executes a prompt:

| Strategy | Description |
|---|---|
| `Default` | Uses the AI configuration to determine the model based on priority and availability |
| `Specific` | Uses explicitly associated models from the AIPromptModels table |
| `ByPower` | Selects the highest PowerRank model matching the prompt's model type |

Model selection precedence (highest to lowest):

1. `AIPromptParams.override` -- Runtime model/vendor override
2. `AIPromptParams.modelSelectionPrompt` -- Alternate prompt for model config
3. Prompt's own model configuration (strategy + associations)

**Credential-evaluation short-circuit (performance):** Candidates are ordered by priority, so once the runner finds the highest-priority candidate that has working credentials it **stops probing the remaining candidates** and records them as `"not-evaluated"` in the `ModelSelection` telemetry. This avoids running a credential/env-var check for every configured model on every prompt run. The full ordered candidate list is still used for failover, so this only trims per-candidate availability *telemetry* for the tail. To force a complete availability report for every candidate (e.g. an admin diagnostic), set `AIPromptParams.forceFullModelEvaluation = true`.

### Parallel Execution with Judging

Execute prompts across multiple models simultaneously and select the best result:

- Configurable execution groups with different models
- AI judge prompt evaluates and ranks results
- Automatic selection of best result based on judge scoring
- Full tracking of all parallel results

### Output Validation and Retry

Automatic validation of AI outputs with configurable retry:

- JSON schema validation against `OutputExample` definitions
- Automatic JSON repair via JSON5 parsing and LLM-based repair
- Configurable retry count with the original or repaired prompts
- Validation syntax cleaning (removes `?`, `*`, `:type` markers from JSON keys)
- Detailed validation attempt tracking

**Output type coercion** (`AIPrompt.OutputType`): the raw model text is coerced before validation: `string` (verbatim), `number` (`parseFloat`, errors on `NaN`), `boolean` (`true/yes/1` ↔ `false/no/0`, case-insensitive + trimmed), `date` (`new Date`, errors on invalid), and `object` (JSON). For `object`, the text is first run through `CleanJSON` (strips markdown fences etc.); if that fails and `attemptJSONRepair` is set, it retries via JSON5 and then LLM-based repair.
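
The coercion rules described so far can be sketched roughly as follows. This is a minimal illustration, not the runner's actual code: `coerceOutput` and the `OutputType` union are hypothetical names, the `object` branch uses plain `JSON.parse` in place of `CleanJSON` and the repair steps, and throwing on an unrecognized boolean is an assumption.

```typescript
// Hypothetical sketch of output type coercion; names are illustrative only.
type OutputType = 'string' | 'number' | 'boolean' | 'date' | 'object';

function coerceOutput(raw: string, outputType: OutputType): unknown {
  switch (outputType) {
    case 'string':
      return raw; // returned verbatim
    case 'number': {
      const n = parseFloat(raw);
      if (Number.isNaN(n)) throw new Error(`Cannot coerce "${raw}" to number`);
      return n;
    }
    case 'boolean': {
      // Case-insensitive and trimmed, per the rules above
      const v = raw.trim().toLowerCase();
      if (['true', 'yes', '1'].includes(v)) return true;
      if (['false', 'no', '0'].includes(v)) return false;
      // Assumption: anything else is treated as a coercion error
      throw new Error(`Cannot coerce "${raw}" to boolean`);
    }
    case 'date': {
      const d = new Date(raw);
      if (Number.isNaN(d.getTime())) throw new Error(`Cannot coerce "${raw}" to date`);
      return d;
    }
    case 'object':
      // Simplification: the real pipeline cleans and optionally repairs the text first
      return JSON.parse(raw);
  }
}
```

A caller honoring `skipValidation` would presumably wrap this in a `try`/`catch` and fall back to the raw text on failure.
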
When an `OutputExample` is defined, validation-syntax markers are stripped from result keys automatically. With `skipValidation`, coercion failures return the raw output instead of throwing.

**Validation behavior** (`AIPrompt.ValidationBehavior`): `Strict` retries up to `MaxRetries` on validation failure; `Warn` logs and returns the (invalid) output; `None` accepts as-is. The parsed `OutputExample` is cached by content so it isn't re-parsed on every run/retry.

### Streaming Support

Real-time streaming of LLM responses:

```typescript
const params = new AIPromptParams();
params.prompt = myPrompt;
params.onStreaming = (chunk) => {
    process.stdout.write(chunk.content);
};

const result = await runner.ExecutePrompt(params);
```

### Execution Tracking

Every prompt execution creates an `AIPromptRun` record with:

- Model and vendor used
- Template rendering results
- Token usage (prompt + completion)
- Cost tracking
- Execution time
- Parent/child relationships for hierarchical prompts
- Agent run linkage via `agentRunId`

**Persistence is fire-and-forget.** The initial `Running` INSERT and the final `Completed`/`Failed` UPDATE are queued, not awaited, so the model call is never blocked on a DB round-trip. Saves for the same run are chained (the INSERT always completes before the UPDATE, so a slow INSERT can't clobber the finalized row), and the record's ID is available immediately because `NewRecord()` client-generates the UUID. Save failures are logged but never fail the prompt (the record is observability, not part of the success contract). Callers that need the rows durably written before continuing can `await runner.WaitForPendingPromptRunSaves()`.

### Credential Resolution

Hierarchical credential resolution for API keys:

1. `AIPromptParams.credentialId` (per-request override)
2. `AIPromptModel.CredentialID` (prompt-model specific)
3. `AIModelVendor.CredentialID` (model-vendor specific)
4. `AIVendor.CredentialID` (vendor default)
5. `AIPromptParams.apiKeys[]` (legacy runtime keys)
6. `AI_VENDOR_API_KEY__` environment variables (legacy)

### Failover

When a model fails due to rate limiting, authentication errors, or other transient issues, the runner can automatically retry with alternate models from the selection candidates.

## Usage

### Basic Prompt Execution

```typescript
import { AIPromptRunner } from '@memberjunction/ai-prompts';
import { AIPromptParams } from '@memberjunction/ai-core-plus';
import { AIEngine } from '@memberjunction/aiengine';

// Get prompt from metadata
await AIEngine.Instance.Config(false, contextUser);
const prompt = AIEngine.Instance.Prompts.find(p => p.Name === 'Summarize Content');

const runner = new AIPromptRunner();
const params = new AIPromptParams();
params.prompt = prompt;
params.data = { content: documentText, maxLength: 500 };
params.contextUser = contextUser;

const result = await runner.ExecutePrompt(params);

if (result.success) {
    console.log(result.result);           // Parsed/validated result
    console.log(result.promptTokens);     // Input tokens used
    console.log(result.completionTokens); // Output tokens generated
    console.log(result.executionTimeMS);  // Execution duration
}
```

### With Progress Tracking

```typescript
params.onProgress = (progress) => {
    console.log(`[${progress.step}] ${progress.percentage}% - ${progress.message}`);
};
```

### With Effort Level

```typescript
params.effortLevel = 85; // High effort for thorough analysis (1-100 scale)
```

### With Runtime Model Override

```typescript
params.override = {
    modelId: 'specific-model-id',
    vendorId: 'specific-vendor-id'
};
```

## Dependencies

- `@memberjunction/ai` -- Core AI abstractions (BaseLLM, ChatParams)
- `@memberjunction/ai-core-plus` -- AIPromptParams, AIPromptRunResult, extended entities
- `@memberjunction/ai-engine-base` -- AIEngineBase metadata cache
- `@memberjunction/aiengine` -- AIEngine server-side operations
- `@memberjunction/core` -- MJ framework core
- `@memberjunction/core-entities` -- Generated entity classes
- `@memberjunction/credentials` -- Credential resolution
- `@memberjunction/templates` -- Template rendering engine
- `@memberjunction/templates-base-types` -- Template base types
- `json5` -- Lenient JSON parsing for repair
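
As a closing illustration, the chained fire-and-forget persistence described under Execution Tracking can be pictured as one promise chain per run. The sketch below only illustrates that idea and makes no claim about the runner's internals; `PendingSaveQueue`, `enqueue`, and `waitForPending` are hypothetical names and are not part of this package's API.

```typescript
// Hypothetical sketch: queue saves without awaiting them, keep per-run ordering,
// and log (never rethrow) failures.
class PendingSaveQueue {
  private chains = new Map<string, Promise<void>>();

  // Queue a save for a run; saves sharing a run ID execute strictly in order.
  enqueue(runId: string, save: () => Promise<void>): void {
    const previous = this.chains.get(runId) ?? Promise.resolve();
    const next = previous.then(save).catch((err) => {
      // A failed save is logged and swallowed so it cannot fail the caller
      console.error(`Save failed for run ${runId}:`, err);
    });
    this.chains.set(runId, next);
  }

  // Resolves once every save queued so far has settled.
  async waitForPending(): Promise<void> {
    await Promise.all(Array.from(this.chains.values()));
  }
}

async function demo(): Promise<string[]> {
  const log: string[] = [];
  const queue = new PendingSaveQueue();
  const delay = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

  // A slow INSERT followed by a fast UPDATE for the same run
  queue.enqueue('run-1', async () => { await delay(20); log.push('INSERT'); });
  queue.enqueue('run-1', async () => { log.push('UPDATE'); });

  // The caller continues immediately; neither save has run yet
  log.push('model call started');

  await queue.waitForPending();
  return log; // ['model call started', 'INSERT', 'UPDATE']
}
```

Chaining on the previous promise is what keeps a slow INSERT from landing after the UPDATE; the trade-off is that saves for one run are serialized rather than issued concurrently.
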