--- name: prompt-engineer-llm description: World-class expert in prompt engineering, LLM fine-tuning, RAG systems, and AI/ML workflows. Use when crafting prompts, designing AI agents, building knowledge bases, implementing retrieval systems, or optimizing LLM performance at production scale. --- # Prompt Engineer & LLM Specialist - World-Class Edition ## Project Context: DriverConnect (eddication.io) **IMPORTANT**: This project integrates AI/ML for various features including geocoding, route optimization, and potential driver assistance features. ### AI/ML Integration Points | Feature | AI Technology | Status | |---------|---------------|--------| | **Geocoding** | Nominatim/Photon API (future: LLM-enhanced) | Active | | **Route Optimization** | Algorithmic (future: ML models) | Planned | | **Driver Assist** | LLM-based chatbot | Planned | | **Document Processing** | OCR + LLM extraction | Planned | | **Voice Commands** | STT + LLM | Planned | ### Project Files - **AI Prompts**: [PTGLG/driverconnect/admin/js/prompts.js](../PTGLG/driverconnect/admin/js/prompts.js) (if exists) - **LLM Integration**: [backend/lib/llm-service.js](../backend/lib/llm-service.js) (if exists) --- ## Overview You are a world-class expert in prompt engineering, LLM architecture, and AI/ML systems. You understand how to craft effective prompts, design retrieval-augmented generation (RAG) systems, fine-tune models, build AI agents, and optimize for cost and performance. You excel at bridging the gap between raw model capabilities and production-ready AI features. --- # Philosophy & Principles ## Core Principles 1. **Clarity Over Cleverness** - Explicit instructions beat implicit hints 2. **Context is King** - Provide relevant information, minimize noise 3. **Test and Iterate** - Prompt engineering is empirical 4. **Cost Awareness** - Optimize for token usage and latency 5. **Safety First** - Guardrails, filtering, and monitoring 6. **Human-in-the-Loop** - AI augments, not replaces, human judgment ## Prompt Engineering Decision Tree ``` Task → Is it simple or complex? ├─ Simple → Direct prompt with clear instructions │ └─ Complex → Needs strategy? ├─ Yes → Chain of Thought, Few-shot, or Decomposition └─ No → Needs external knowledge? ├─ Yes → RAG (Retrieval Augmented Generation) └─ No → Zero/Few-shot with examples ``` --- # Prompt Engineering Fundamentals ## The Anatomy of a Perfect Prompt ``` ┌─────────────────────────────────────────────────────────────┐ │ SYSTEM MESSAGE │ │ Role, personality, constraints, output format, safety │ ├─────────────────────────────────────────────────────────────┤ │ USER MESSAGE │ │ Context + Task + Examples + Constraints + Output Format │ └─────────────────────────────────────────────────────────────┘ ``` ### Essential Components | Component | Purpose | Example | |-----------|---------|---------| | **Role** | Set persona and expertise level | "You are an expert logistics coordinator..." | | **Context** | Provide background information | "DriverConnect is a fuel delivery system..." | | **Task** | Clear, specific instruction | "Extract delivery addresses from this text..." | | **Examples** | Few-shot learning | "Input: X → Output: Y" | | **Constraints** | What NOT to do | "Do not make up information..." | | **Format** | Expected output structure | "Return JSON with keys: address, city, province" | | **Thinking** | Chain-of-thought | "Think step by step..." | ## Prompt Templates ### Zero-Shot Template ``` You are a world-class {domain} expert. Task: {task_description} Input: {input_data} Constraints: - {constraint_1} - {constraint_2} Output Format: {format_specification} ``` ### Few-Shot Template ``` You are a world-class {domain} expert. Your task is to {task_description}. Here are some examples: Example 1: Input: {example_1_input} Output: {example_1_output} Example 2: Input: {example_2_input} Output: {example_2_output} Example 3: Input: {example_3_input} Output: {example_3_output} Now, process this input: Input: {actual_input} Output: ``` ### Chain-of-Thought Template ``` You are a world-class {domain} expert. Task: {task_description} Let's think step by step: 1. First, {step_1_instruction} 2. Then, {step_2_instruction} 3. Finally, {step_3_instruction} Input: {input_data} Step-by-step reasoning: [Your reasoning here] Final Answer: [Your final answer here] ``` --- # Advanced Prompt Techniques ## Chain-of-Thought (CoT) ### When to Use | Scenario | Why CoT Helps | |----------|---------------| | Multi-step reasoning | Breaks complex problems | | Math/Logic problems | Shows work, reduces errors | | Debugging | Systematic elimination | | Planning | Sequential consideration | ### Examples ``` // Basic CoT A truck has 22 delivery stops. At stop 1-5, 50 liters each. Stop 6-15, 30 liters each. Stop 16-22, 20 liters each. How much total fuel? Let's think step by step: 1. First, calculate stops 1-5: 5 stops × 50 liters = 250 liters 2. Then, calculate stops 6-15: 10 stops × 30 liters = 300 liters 3. Then, calculate stops 16-22: 7 stops × 20 liters = 140 liters 4. Finally, add them all: 250 + 300 + 140 = 690 liters Answer: 690 liters ``` ## Self-Consistency ``` Task: Solve this complex logistics problem Please solve this problem three different ways and compare your answers: 1. Method 1: [Approach description] 2. Method 2: [Approach description] 3. Method 3: [Approach description] After getting three answers, determine which is most likely correct and explain why. ``` ## Tree of Thoughts ``` You are a strategic logistics planner. For this delivery challenge, explore multiple solution paths: Path A: [Option A description] - Pros: [List] - Cons: [List] - Outcome prediction: [Analysis] Path B: [Option B description] - Pros: [List] - Cons: [List] - Outcome prediction: [Analysis] Path C: [Option C description] - Pros: [List] - Cons: [List] - Outcome prediction: [Analysis] After evaluating all paths, recommend the best option with justification. ``` ## ReAct Pattern (Reason + Act) ``` You are a logistics assistant with access to tools. For each step, think then act: Thought: [What you want to do] Action: [Which tool to use] Observation: [What the tool returned] Thought: [What to do next based on observation] Action: [Next tool] ... (repeat until done) Final Answer: [The result] ``` --- # Retrieval Augmented Generation (RAG) ## RAG Architecture ``` ┌─────────────┐ │ User Query │ └──────┬──────┘ │ ▼ ┌──────────────┐ ┌─────────────┐ │ Embedding │────►│ Vector DB │ │ Model │ │ (pgvector) │ └──────────────┘ └──────┬──────┘ │ ▼ ┌─────────┐ │Context │ │+ Query │ └────┬────┘ │ ▼ ┌─────────┐ │ LLM │ └────┬────┘ │ ▼ ┌─────────┐ │ Answer │ └─────────┘ ``` ## RAG Implementation ```typescript // Vector database schema (PostgreSQL + pgvector) CREATE TABLE documents ( id UUID PRIMARY KEY DEFAULT gen_random_uuid(), content TEXT NOT NULL, embedding vector(1536), // OpenAI dimension metadata JSONB DEFAULT '{}'::jsonb, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE INDEX docs_embedding_idx ON documents USING hnsw (embedding vector_cosine_ops); // Matching function CREATE OR REPLACE FUNCTION match_documents( query_embedding vector(1536), match_threshold float DEFAULT 0.8, match_count int DEFAULT 10 ) RETURNS TABLE ( id UUID, content TEXT, similarity float ) LANGUAGE plpgsql AS $$ BEGIN RETURN QUERY SELECT d.id, d.content, 1 - (d.embedding <=> query_embedding) as similarity FROM documents d WHERE d.embedding <=> query_embedding < 1 - match_threshold ORDER BY d.embedding <=> query_embedding LIMIT match_count; END; $$; // RAG pipeline import { OpenAI } from 'openai'; const openai = new OpenAI(); async function ragAnswer(query: string) { // 1. Embed query const embedding = await openai.embeddings.create({ model: 'text-embedding-3-small', input: query }); // 2. Retrieve relevant documents const { data: docs } = await supabase.rpc('match_documents', { query_embedding: embedding.data[0].embedding, match_threshold: 0.8, match_count: 5 }); // 3. Build prompt with context const context = docs.map(d => d.content).join('\n\n'); const prompt = ` You are a DriverConnect logistics expert. Answer the user's question using the context below. Context: ${context} Question: ${query} If the context doesn't contain the answer, say "I don't have enough information to answer this." Answer: `.trim(); // 4. Generate response const completion = await openai.chat.completions.create({ model: 'gpt-4o', messages: [{ role: 'user', content: prompt }], temperature: 0.3 }); return completion.choices[0].message.content; } ``` ## Chunking Strategies ```typescript // Fixed-size chunking function chunkBySize(text: string, chunkSize: number = 500, overlap: number = 50) { const chunks: string[] = []; let start = 0; while (start < text.length) { const end = Math.min(start + chunkSize, text.length); chunks.push(text.slice(start, end)); start = end - overlap; } return chunks; } // Semantic chunking (by paragraph/sentence) function chunkSemantic(text: string) { return text .split(/\n\n+/) // Split by paragraphs .filter(p => p.trim().length > 50) // Filter too short .map(p => p.trim()); } // Recursive chunking (preserves structure) function chunkRecursive( text: string, separators: string[] = ['\n\n', '\n', '. ', ' '], maxSize: number = 500 ): string[] { for (const sep of separators) { const parts = text.split(sep); if (parts.every(p => p.length <= maxSize)) { return parts.filter(p => p.trim()); } } // Fallback to fixed size return chunkBySize(text, maxSize); } ``` --- # LLM Fine-Tuning ## When to Fine-Tune | Scenario | Base Model | Fine-Tuned | |----------|------------|------------| | General tasks | Good enough | Overkill | | Domain-specific jargon | Struggles | Excels | | Specific output format | Inconsistent | Reliable | | Cost optimization | Large prompts | Compact prompts | | Privacy | External API | Self-hosted | ## Fine-Tuning Process ```python # Prepare training data training_data = [ { "messages": [ {"role": "system", "content": "You are a DriverConnect logistics assistant."}, {"role": "user", "content": "What's the status of trip TRIP-001?"}, {"role": "assistant", "content": "Trip TRIP-001 is currently in progress. Driver Somchai is at stop 2 of 5."} ] }, # ... more examples ] # Save as JSONL import json with open('driverconnect_train.jsonl', 'w') as f: for entry in training_data: f.write(json.dumps(entry) + '\n') # Upload and fine-tune (OpenAI) from openai import OpenAI client = OpenAI() # Upload file response = client.files.create( file=open('driverconnect_train.jsonl', 'rb'), purpose='fine-tune' ) file_id = response.id # Start fine-tuning fine_tune = client.fine_tuning.jobs.create( training_file=file_id, model='gpt-4o-mini', hyperparameters={ 'n_epochs': 3, 'learning_rate_multiplier': 0.1, 'batch_size': 4 } ) # Monitor progress job = client.fine_tuning.jobs.retrieve(fine_tune.id) print(f"Status: {job.status}") # Use fine-tuned model completion = client.chat.completions.create( model=fine_tune.fine_tuned_model, messages=[...] ) ``` --- # Agent Design Patterns ## ReAct Agent ```typescript interface Tool { name: string; description: string; parameters: Record; execute: (params: any) => Promise; } class ReActAgent { private tools: Map = new Map(); registerTool(tool: Tool) { this.tools.set(tool.name, tool); } async run(query: string, maxSteps: number = 10) { let thought = `I need to answer: ${query}`; const steps: string[] = []; for (let i = 0; i < maxSteps; i++) { // Decide next action const prompt = ` Thought: ${thought} Available tools: ${Array.from(this.tools.values()).map(t => `- ${t.name}: ${t.description}` ).join('\n')} Respond in this format: Thought: [your reasoning] Action: [tool name] Action Input: [JSON parameters] Or if you have the final answer: Thought: [reasoning] Final Answer: [answer] `.trim(); const response = await this.llm(prompt); // Parse response const actionMatch = response.match(/Action: (\w+)/); const inputMatch = response.match(/Action Input: ({.*})/s); const finalMatch = response.match(/Final Answer: (.*)/s); if (finalMatch) { return finalMatch[1]; } if (actionMatch && inputMatch) { const tool = this.tools.get(actionMatch[1]); if (tool) { const result = await tool.execute(JSON.parse(inputMatch[1])); thought = `Action ${actionMatch[1]} returned: ${result}`; steps.push(`${actionMatch[1]}: ${result}`); } } } return 'Could not complete task'; } private async llm(prompt: string): Promise { // LLM call implementation return ''; } } ``` ## Multi-Agent System ``` ┌─────────────────────────────────────────────────────────┐ │ Coordinator Agent │ │ - Receives user request │ │ - Decomposes into sub-tasks │ │ - Routes to specialist agents │ │ - Aggregates results │ └──────┬────────┬────────┬────────┬────────┬──────────────┘ │ │ │ │ │ ▼ ▼ ▼ ▼ ▼ ┌──────────┐┌────────┐┌───────┐┌──────┐┌─────────┐ │Routing ││Status ││Geocode││Alert ││Report │ │Agent ││Agent ││Agent ││Agent ││Agent │ └──────────┘└────────┘└───────┘└──────┘└─────────┘ ``` --- # Prompt Evaluation & Testing ## Evaluation Metrics ```typescript interface PromptEvaluation { // Accuracy correctness: number; // 0-1, factual correctness completeness: number; // 0-1, covered all aspects // Quality coherence: number; // 0-1, logical flow conciseness: number; // 0-1, not verbose // Safety hallucination: boolean; // Made up information policyViolation: boolean; // Broke guidelines // Performance latency: number; // milliseconds tokenUsage: number; // total tokens cost: number; // USD } async function evaluatePrompt( prompt: string, testCases: Array<{input: string, expected: string}> ): Promise { const results = await Promise.all( testCases.map(async ({input, expected}) => { const start = Date.now(); const output = await llm(prompt + '\n\nInput: ' + input); const latency = Date.now() - start; return { output, latency, similarity: cosineSimilarity(embed(expected), embed(output)) }; }) ); return { correctness: results.reduce((s, r) => s + r.similarity, 0) / results.length, latency: results.reduce((s, r) => s + r.latency, 0) / results.length, // ... other metrics }; } ``` --- # World-Class Resources ## Official Documentation - OpenAI Docs: https://platform.openai.com/docs - Anthropic Docs: https://docs.anthropic.com - LangChain Docs: https://python.langchain.com ## Prompt Engineering Guides - OpenAI Prompt Engineering Guide: https://platform.openai.com/docs/guides/prompt-engineering - Anthropic Prompt Library: https://docs.anthropic.com/claude/prompt-library ## RAG Resources - LangChain RAG Tutorial: https://python.langchain.com/docs/tutorials/rag - LlamaIndex Documentation: https://docs.llamaindex.ai ## Fine-Tuning - OpenAI Fine-Tuning Guide: https://platform.openai.com/docs/guides/fine-tuning - HuggingFace PEFT: https://huggingface.co/docs/peft ## Tools & Frameworks - LangChain: https://github.com/langchain-ai/langchain - LlamaIndex: https://github.com/run-llama/llama_index - AutoGPT: https://github.com/Significant-Gravitas/AutoGPT - Vector DBs: Pinecone, Weaviate, Qdrant, pgvector