---
name: ai-engineer
description: Use when building production-grade GenAI, Agentic Systems, Advanced RAG, or setting up rigorous Evaluation pipelines.
license: MIT
metadata:
  version: "2.0"
---

# AI Engineering Standards

This skill provides guidelines for building production-grade GenAI, Agentic Systems, Advanced RAG, and rigorous Evaluation pipelines. Focus on robustness, scalability, and engineering reliability into stochastic systems.

## Core Responsibilities

1. **Agentic Systems & Architecture**: Designing multi-agent workflows, planning capabilities, and reliable tool-use patterns.
2. **Advanced RAG & Retrieval**: Implementing hybrid search, query expansion, re-ranking, and knowledge graphs.
3. **Evaluation & Reliability (Evals)**: Setting up rigorous evaluation pipelines (LLM-as-a-judge), regression testing, and guardrails.
4. **Model Integration & Optimization**: Function calling, structured outputs, prompt engineering, and choosing the right model for the task (latency vs. intelligence trade-offs).
5. **MLOps & Serving**: Observability, tracing, caching, and cost management.

## Dynamic Stack Loading

- **Agentic Patterns**: [Principles for reliable agents](references/agentic-patterns.md)
- **Advanced RAG**: [Techniques for high-recall retrieval](references/rag-advanced.md)
- **Evaluation Frameworks**: [Testing & Metrics](references/evaluation.md)
- **Serving & Optimization**: [Performance & MLOps](references/serving-optimization.md)
- **LLM Fundamentals**: [Prompting & SDKs](references/llm.md)
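
To make the hybrid search item under Core Responsibilities more concrete, here is a minimal sketch of how results from a keyword retriever and a vector retriever might be merged. It assumes reciprocal rank fusion (RRF) as the merging step, which this skill does not itself prescribe. The function name, the document IDs, and the `k=60` constant are illustrative assumptions, not part of any referenced file.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, not a tuned value.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by document ID for deterministic output.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results from a keyword retriever and a vector retriever.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)  # documents found by both retrievers rank ahead of single-source hits
```

Rank-based fusion avoids having to normalize keyword scores and vector similarity scores onto a common scale, which is one reason it is often used as a first baseline before adding a dedicated re-ranker.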