---
name: opik-optimizer
description: Optimize LLM prompts, tools, and agents in Opik using standardized optimizer workflows (prompt optimization, tool optimization, and parameter tuning), dataset/metric wiring, and result interpretation.
metadata:
  internal: true
---

# Opik Optimizer

## Purpose

Design, run, and interpret Opik Optimizer workflows for prompts, tools, and model parameters with consistent dataset/metric wiring and reproducible evaluation.

## When to use

Use this skill when a user asks for:

- Choosing and configuring Opik Optimizer algorithms for prompt/agent optimization.
- Writing `ChatPrompt`-based optimization runs and custom metric functions.
- Optimizing with tools (function calling or MCP), selected prompt roles, or prompt segments.
- Tuning LLM call parameters with `optimize_parameter`.
- Comparing optimizer outputs and interpreting `OptimizationResult`.

## Workflow

1. Select optimizer strategy (`MetaPromptOptimizer`, `FewShotBayesianOptimizer`, `HRPO`, etc.) based on the target optimization goal.
2. Build prompt/dataset/metric wiring and validate placeholder-field alignment.
3. Run prompt, tool, or parameter optimization with explicit controls (`n_threads`, `n_samples`, `max_trials`, seed).
4. Inspect `OptimizationResult` and compare score deltas against initial baselines.
5. Summarize recommendations, risks, and next experiments.

## Inputs

- Target optimization objective (prompt/tool/parameter) and success metric.
- Dataset source and expected schema fields.
- Model/provider constraints and runtime limits.
- Optional scope constraints (`optimize_prompts` segments, tool fields, project names).

## Outputs

- Optimizer run configuration and rationale.
- Result interpretation (`score`, `initial_score`, history trends).
- Recommended next changes and follow-up experiment plan.

Use the reference files in this skill for details before implementing code:

- `references/algorithms.md`
- `references/prompt_agent_workflow.md`
- `references/example_patterns.md`

## Opik Optimizer quickstart

1. Install and import:

```bash
pip install opik-optimizer
```

```python
from opik_optimizer import ChatPrompt, MetaPromptOptimizer, HRPO, FewShotBayesianOptimizer
from opik_optimizer import datasets
```

2. Build a prompt and metric:

```python
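from opik.evaluation.metrics import LevenshteinRatio

# The {question} placeholder must match a field name in each dataset item.
prompt = ChatPrompt(
    system="You are a concise answerer.",
    user="{question}",
)

# Build the scorer once instead of on every metric call.
levenshtein = LevenshteinRatio()

def metric(dataset_item: dict, output: str) -> float:
    # Compare the model output with the reference answer; higher is better.
    return levenshtein.score(
        reference=dataset_item["answer"],
        output=output,
    ).value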
```

3. Load dataset and run:

```python
dataset = datasets.hotpot(count=30)

result = MetaPromptOptimizer(model="openai/gpt-5-nano").optimize_prompt(
    prompt=prompt,
    dataset=dataset,
    metric=metric,
    n_samples=20,
    max_trials=10,
)
result.display()
```
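
A metric does not have to come from `opik.evaluation.metrics`; any callable with the `(dataset_item, llm_output) -> float` shape can be used. The following is a minimal sketch of a dependency-free token-overlap F1 metric. It assumes dataset items carry an `answer` field as in the quickstart above; `token_f1_metric` is a hypothetical name, not part of Opik.

```python
from collections import Counter


def token_f1_metric(dataset_item: dict, llm_output: str) -> float:
    """Token-overlap F1 between the model output and dataset_item["answer"].

    Hypothetical helper for illustration; assumes an "answer" field exists.
    """
    reference_tokens = str(dataset_item["answer"]).lower().split()
    output_tokens = llm_output.lower().split()
    if not reference_tokens or not output_tokens:
        # Score 1.0 only when both sides are empty, otherwise 0.0.
        return float(reference_tokens == output_tokens)
    # Multiset intersection: each shared token counts at most as often as it
    # appears on both sides.
    overlap = sum((Counter(reference_tokens) & Counter(output_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(output_tokens)
    recall = overlap / len(reference_tokens)
    return 2 * precision * recall / (precision + recall)
```

Token F1 is more forgiving than edit distance when answers differ only in extra words, which may suit short-answer QA data better; whether it suits your dataset is a judgment call.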

## Core workflow you should follow

1. Pick optimizer class:
   - Few-shot examples + Bayesian selection: `FewShotBayesianOptimizer`
   - LLM meta-reasoning: `MetaPromptOptimizer`
   - Genetic search with multi-objective optimization (MOO) and LLM-driven crossover: `EvolutionaryOptimizer`
   - Hierarchical reflective diagnostics: `HierarchicalReflectiveOptimizer` (`HRPO`)
   - Pareto-based genetic strategy: `GepaOptimizer`
   - Parameter tuning only: `ParameterOptimizer`
2. Define a single `ChatPrompt` (or dict of prompts for multi-prompt cases).
3. Provide a dataset from `opik_optimizer.datasets`.
4. Provide metric callable with signature `(dataset_item, llm_output) -> float` (or `ScoreResult`/list of `ScoreResult`).
5. Set optimizer controls (`n_threads`, `n_samples`, `max_trials`, seed, etc.).
6. Run one of:
   - `optimize_prompt(...)` for prompt/system behavior changes.
   - `optimize_parameter(...)` for model-call hyperparameters.
7. Inspect `OptimizationResult` (`score`, `initial_score`, `history`, `optimization_id`, `get_optimized_parameters`).
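
When comparing runs, it helps to reduce `score` and `initial_score` to an absolute and a relative gain. The sketch below works on plain floats so it does not depend on the `OptimizationResult` class itself; `ScoreSummary` and `summarize_scores` are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScoreSummary:
    """Hypothetical container for a baseline-vs-optimized comparison."""
    initial_score: float
    score: float
    absolute_gain: float
    relative_gain: Optional[float]  # None when the baseline is zero


def summarize_scores(initial_score: float, score: float) -> ScoreSummary:
    """Compute absolute and relative improvement over the baseline."""
    absolute_gain = score - initial_score
    # A relative gain is undefined for a zero baseline, so report None.
    relative_gain = (
        absolute_gain / abs(initial_score) if initial_score != 0 else None
    )
    return ScoreSummary(initial_score, score, absolute_gain, relative_gain)


summary = summarize_scores(initial_score=0.40, score=0.50)
print(f"absolute gain: {summary.absolute_gain:+.3f}")
if summary.relative_gain is not None:
    print(f"relative gain: {summary.relative_gain:+.1%}")
```

With a real run, the two floats would presumably come from `result.initial_score` and `result.score`. Small gains on a small `n_samples` can be noise, so treat them with caution before recommending a prompt change.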

## Key execution details to enforce

- Prefer explicit `project_name` for Opik tracking if you are using org-level observability.
- Keep placeholders in prompts aligned with dataset fields (for example `{question}`).
- Start with `optimize_prompts="system"` or `"user"` when scope should be constrained.
- Keep `model` names used for `MetaPromptOptimizer` and reasoning calls compatible with the providers enabled on your account.
- For multimodal inputs, validate payloads and keep only non-empty content segments.
- For small datasets, set `n_samples` and `n_samples_strategy` carefully; requesting more samples than the dataset holds falls back automatically to the full set.
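
Placeholder alignment can be checked before any optimizer call is made. The following is a minimal sketch using only the standard library; it assumes prompts use `str.format`-style placeholders such as `{question}` and that a dataset item is a plain dict. `placeholder_names` and `missing_fields` are hypothetical helpers, not part of Opik.

```python
from string import Formatter


def placeholder_names(template: str) -> set:
    """Return the placeholder names used in a str.format-style template."""
    # Formatter.parse yields (literal, field_name, format_spec, conversion);
    # field_name is None for pure literal text.
    return {field for _, field, _, _ in Formatter().parse(template) if field}


def missing_fields(templates: list, dataset_item: dict) -> set:
    """Return placeholders that have no matching key in the dataset item."""
    needed = set()
    for template in templates:
        needed |= placeholder_names(template)
    return needed - set(dataset_item)


# Sample messages and item shaped like the quickstart (values are made up).
messages = ["You are a concise answerer.", "{question}"]
item = {"question": "Who wrote Hamlet?", "answer": "William Shakespeare"}
print(missing_fields(messages, item))
```

An empty set means every placeholder has a matching field. Templates that contain literal braces (for example inline JSON) would need escaping before this check applies.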

## Tooling and segment-based control

- Tools can be optimized with MCP/function schema fields, not only by changing prompt wording.
- For fine-grained text updates, use `optimize_prompts` values and helper functions from `prompt_segments`:
  - `extract_prompt_segments(ChatPrompt)` to inspect stable segment IDs.
  - `apply_segment_updates(ChatPrompt, updates)` for deterministic edits.
- Tool optimization is distinct from prompt optimization.

Runnable examples live upstream in the Opik repo:

- https://github.com/comet-ml/opik/tree/main/sdks/opik_optimizer/src/opik_optimizer

If you need local runnable scripts, vendor the upstream examples into a `scripts/` folder and keep references one level deep.

## Common mistakes to avoid

- Passing an empty dataset or placeholder names that do not match dataset fields.
- Mixing the deprecated constructor argument `num_threads` with `n_threads`; pass `n_threads` only.
- Assuming tool optimization is the same as agent function-calling optimization.
- Calling `ParameterOptimizer.optimize_prompt`; it raises, so use `optimize_parameter` for parameter tuning instead.
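
The `num_threads`/`n_threads` mix-up can be caught in configuration code before an optimizer is constructed. The sketch below assumes optimizer keyword arguments are assembled in a plain dict; `normalize_thread_arg` is a hypothetical helper, not part of Opik.

```python
def normalize_thread_arg(optimizer_kwargs: dict) -> dict:
    """Return a copy of the kwargs that uses only `n_threads`.

    Hypothetical helper: maps the deprecated `num_threads` key to
    `n_threads` and rejects configurations that set both.
    """
    kwargs = dict(optimizer_kwargs)  # do not mutate the caller's dict
    if "num_threads" in kwargs:
        if "n_threads" in kwargs:
            raise ValueError("Pass only n_threads; num_threads is deprecated.")
        kwargs["n_threads"] = kwargs.pop("num_threads")
    return kwargs


print(normalize_thread_arg({"num_threads": 4, "seed": 42}))
```

Failing loudly when both keys are present avoids silently picking one value over the other.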

## Next actions

- For in-depth behavior and per-class parameter tables: `references/algorithms.md`
- For exact `optimize_prompt` signatures, prompts, tool constraints, and result usage: `references/prompt_agent_workflow.md`
- For pattern examples and source-backed workflows: `references/example_patterns.md`