When to use
- You want iterative prompt refinement with proposal stages
- You have reliable traces and a stable task app
- You’re optimizing online from production data (MIPRO is the online method)
What you need
- A task app with multiple examples
- Traces from eval or prior rollouts
- A MIPRO online job (backend returns a proxy URL)
Basic workflow
- Run eval to collect traces
- Start a MIPRO online job (sketched below)
- Evaluate candidates without leaking prompts to the task app
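A minimal sketch of the first two steps, assuming a local task app on port 8001 and a hypothetical `/mipro/jobs` endpoint on the backend; the real endpoint, payload, and response fields come from your backend's API:

```python
import requests

TASK_APP_URL = "http://localhost:8001"        # local task app (assumed port)
BACKEND_URL = "https://backend.example.com"   # placeholder backend URL

# 1. Health-check the local task app before starting the job.
resp = requests.get(f"{TASK_APP_URL}/health", timeout=10)
resp.raise_for_status()

# 2. Create a MIPRO online job; the backend responds with a proxy URL
#    used for prompt substitution. Endpoint and payload are assumptions.
job = requests.post(
    f"{BACKEND_URL}/mipro/jobs",
    json={"task_app_url": TASK_APP_URL, "train_size": 50, "val_size": 20},
    timeout=30,
).json()
proxy_url = job["proxy_url"]  # assumed response field
```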
From demos: exact online loop (Banking77)
In the `mipro_banking77` demo, the online loop looks like this (the minimal example below sketches it end to end):
- Start a local task app and health-check it
- Create a MIPRO online job on the backend
- Receive a proxy URL for prompt substitution
- For each rollout, call `{proxy_url}/{rollout_id}/chat/completions`
- Compute reward locally and POST status updates:
  - `status=reward` with the reward value
  - `status=done` when the rollout finishes
- Use `--train-size` and `--val-size` to define seed ranges
- Use `--min-proposal-rollouts` to control when new proposals appear
Minimal example
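A minimal sketch of a single rollout, continuing from the `proxy_url` returned at job creation. The `{proxy_url}/{rollout_id}/chat/completions` path follows the demo; the `/status` endpoint, payload shapes, model name, and exact reward rule are illustrative assumptions:

```python
import requests

def run_rollout(proxy_url: str, rollout_id: str, example: dict) -> float:
    # All LLM calls go through the proxy so MIPRO can substitute the
    # candidate prompt server-side; the task app never sees it.
    completion = requests.post(
        f"{proxy_url}/{rollout_id}/chat/completions",
        json={
            "model": "gpt-4o-mini",  # placeholder model name
            "messages": [{"role": "user", "content": example["text"]}],
        },
        timeout=60,
    ).json()
    prediction = completion["choices"][0]["message"]["content"]

    # Compute the reward locally. Exact match against the Banking77
    # label is used here purely as an illustration.
    reward = 1.0 if prediction.strip() == example["label"] else 0.0

    # POST status updates: first the reward, then done when the rollout
    # finishes. The /status endpoint is an assumption.
    requests.post(f"{proxy_url}/{rollout_id}/status",
                  json={"status": "reward", "reward": reward}, timeout=10)
    requests.post(f"{proxy_url}/{rollout_id}/status",
                  json={"status": "done"}, timeout=10)
    return reward
```

Looping `run_rollout` over the train and validation seeds gives MIPRO the rewards it needs to score each proposal.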
Key constraints
- MIPRO must not leak prompts to the task app
- Route all LLM calls through the interceptor (see the sketch after this list)
- Large traces may require RLM verifiers
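One way to satisfy the interceptor constraint, sketched with the OpenAI client: point `base_url` at the per-rollout proxy so every call passes through the interceptor and the candidate prompt is injected server-side. The URL values and auth handling are assumptions:

```python
from openai import OpenAI

proxy_url = "https://backend.example.com/proxy/job-123"  # from job creation
rollout_id = "rollout-0"                                  # from the rollout loop

# The proxy serves /chat/completions under {proxy_url}/{rollout_id}, so the
# client never talks to the provider directly and the task app never sees
# the candidate prompt.
client = OpenAI(
    base_url=f"{proxy_url}/{rollout_id}",
    api_key="unused-behind-proxy",  # assumption: the proxy handles auth
)

response = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "Classify: I lost my card"}],
)
```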
Tradeoff vs GEPA
- MIPRO is less efficient than GEPA but can learn continuously from production rollouts and fresh data.
Next steps
- Task app overview: `/sdk/localapi/overview`
- Traces and rubrics: `/sdk/tracing/v3-traces` and `/sdk/tracing/rubrics`