MIPRO optimizes prompts by proposing instruction variants and evaluating them against task app traces and rewards.

When to use

  • You want iterative prompt refinement with proposal stages
  • You have reliable traces and a stable task app
  • You’re optimizing online from production data (MIPRO is the online method, in contrast to offline optimizers such as GEPA)

What you need

  1. A task app with multiple examples
  2. Traces from eval or prior rollouts
  3. A MIPRO online job (backend returns a proxy URL)

Basic workflow

  1. Run eval to collect traces
  2. Start a MIPRO online job
  3. Evaluate candidates without leaking prompts to the task app

From demos: exact online loop (Banking77)

From the mipro_banking77 demo, the online loop looks like this (a code sketch follows below):
  1. Start a local task app and health-check it
  2. Create a MIPRO online job on the backend
  3. Receive a proxy URL for prompt substitution
  4. For each rollout, call {proxy_url}/{rollout_id}/chat/completions
  5. Compute reward locally and POST status updates:
    • status=reward with reward value
    • status=done when rollout finishes
The demo uses:
  • --train-size and --val-size to define seed ranges
  • --min-proposal-rollouts to control when new proposals appear
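
A minimal sketch of that rollout loop is shown below, assuming a synchronous HTTP client. The {proxy_url}/{rollout_id}/chat/completions path comes from the demo; the /status endpoint, its payload shape, and the exact-match scorer are illustrative assumptions, not a documented API.

import requests

def exact_match_reward(completion: str, label: str) -> float:
    # Hypothetical reward for an intent-classification task such as Banking77.
    return 1.0 if completion.strip().lower() == label.lower() else 0.0

def run_rollouts(proxy_url: str, rollouts: list[dict]) -> None:
    for rollout in rollouts:
        base = f"{proxy_url}/{rollout['id']}"

        # Route the LLM call through the proxy so the backend can substitute the
        # candidate prompt; the task app never sees it.
        resp = requests.post(
            f"{base}/chat/completions",
            json={"model": rollout["model"], "messages": rollout["messages"]},
            timeout=60,
        )
        completion = resp.json()["choices"][0]["message"]["content"]

        # Compute the reward locally, then report progress back to the backend.
        reward = exact_match_reward(completion, rollout["label"])
        requests.post(f"{base}/status", json={"status": "reward", "reward": reward}, timeout=30)  # illustrative path
        requests.post(f"{base}/status", json={"status": "done"}, timeout=30)  # illustrative path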

Minimal example

import asyncio
import os

from synth_ai.sdk import PromptLearningClient

async def main() -> None:
    client = PromptLearningClient(api_key=os.environ["SYNTH_API_KEY"])
    # Create a MIPRO job from the TOML config, start it, and wait for a terminal state.
    job = await client.create_job_from_toml("mipro.toml")
    await client.start_job(job["id"])
    result = await client.poll_until_terminal(job["id"])
    print(result["best_prompt"])

asyncio.run(main())
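
For reference, a config like mipro.toml might wire the job to your task app and seed ranges roughly as follows. The key names below are illustrative assumptions that mirror the demo flags, not the SDK's documented schema.

# mipro.toml -- illustrative only; key names are assumptions, not a documented schema
[task_app]
url = "http://localhost:8001"    # locally hosted task app

[mipro]
train_size = 50                  # mirrors --train-size
val_size = 20                    # mirrors --val-size
min_proposal_rollouts = 10       # mirrors --min-proposal-rollouts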

Key constraints

  • MIPRO must not leak prompts to the task app
  • Route every LLM call through the interceptor (the per-rollout proxy URL), so the backend substitutes candidate prompts instead of exposing them to the task app (see the sketch below)
  • Large traces may require RLM verifiers
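
A minimal sketch of what that routing looks like from the client side, assuming the interceptor exposes an OpenAI-compatible endpoint (as the /chat/completions path above suggests); the proxy_url, rollout_id, API key, and model name are placeholders, not values the SDK prescribes.

from openai import OpenAI

proxy_url = "https://backend.example/proxy"  # placeholder; returned when the online job is created
rollout_id = "rollout-0"                     # placeholder

# Point the client at the per-rollout proxy; it appends /chat/completions,
# matching the {proxy_url}/{rollout_id}/chat/completions path from the demo.
client = OpenAI(base_url=f"{proxy_url}/{rollout_id}", api_key="placeholder")

resp = client.chat.completions.create(
    model="gpt-4o-mini",  # placeholder model name
    messages=[{"role": "user", "content": "When will my new card arrive?"}],
)
print(resp.choices[0].message.content)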

Tradeoff vs GEPA

  • MIPRO is less efficient than GEPA but can learn continuously from production rollouts and fresh data.

Next steps

  • Task app overview: /sdk/localapi/overview
  • Traces and rubrics: /sdk/tracing/v3-traces and /sdk/tracing/rubrics