This walkthrough shows how to use Synth in production to continually optimize your prompts based on production results. We’ll do this by running MIPRO Online on the Banking77 intent classification task. In online mode, rollouts run locally while the backend proposes and selects prompt candidates reactively in real time, adding under ~500 ms of latency per call to retrieve cached candidates.

Prerequisites

  • Python 3.11+
  • uv package manager
  • SYNTH_API_KEY set in your environment
  • Access to a Synth backend (default is production)

Run the demo locally as a script

From the synth-ai repo:
cd /path/to/synth-ai

export SYNTH_API_KEY="sk_live_your_key"
export SYNTH_URL="https://api.usesynth.ai"

uv run python demos/mipro_banking77/run_online_demo.py \
  --rollouts 100 \
  --train-size 200 \
  --val-size 10 \
  --min-proposal-rollouts 20

What the flags do

  • --rollouts: Number of online rollouts to run
  • --train-size: Number of training seeds (0..train-size-1)
  • --val-size: Number of validation seeds (train-size..train-size+val-size-1)
  • --min-proposal-rollouts: Minimum rollouts before generating new proposals
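The training and validation seed ranges implied by these flags can be computed directly. A minimal sketch (the helper name is illustrative, not part of the demo):

```python
def seed_ranges(train_size: int, val_size: int) -> tuple[range, range]:
    """Training seeds come first; validation seeds follow immediately after."""
    train_seeds = range(0, train_size)
    val_seeds = range(train_size, train_size + val_size)
    return train_seeds, val_seeds

# With --train-size 200 --val-size 10: train seeds 0..199, val seeds 200..209.
train, val = seed_ranges(200, 10)
```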

What happens

  1. The script starts a local task app and health-checks it.
  2. A MIPRO online job is created on the backend.
  3. The backend returns a proxy URL for prompt candidate selection.
  4. The script runs rollouts locally, calling the proxy URL for each LLM call.
  5. Rewards are reported back and proposals evolve in real time.
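The steps above can be sketched as a single loop. The helper callables (`create_job`, `run_rollout`, `report_reward`) are hypothetical stand-ins for the demo's actual client code; only the overall flow matches the numbered steps:

```python
def run_online_loop(num_rollouts, create_job, run_rollout, report_reward):
    """Sketch of the online MIPRO loop; helpers are hypothetical stand-ins."""
    job = create_job()                  # step 2: create the MIPRO online job
    proxy_url = job["mipro_proxy_url"]  # step 3: backend returns a proxy URL
    rewards = []
    for seed in range(num_rollouts):
        reward = run_rollout(proxy_url, seed)            # step 4: LLM calls go via the proxy
        report_reward(job["system_id"], seed, reward)    # step 5: rewards drive new proposals
        rewards.append(reward)
    return rewards
```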

Tips

  • To use a different backend, set SYNTH_URL (preferred). SYNTH_BACKEND_URL and RUST_BACKEND_URL are also supported for compatibility.
  • You can change the policy model with --model gpt-4.1-nano (or another supported model).
  • The script auto-generates ENVIRONMENT_API_KEY if it is not set.

Production usage

When you move this flow to production, the loop stays the same: you swap the backend URL, report rewards back to the online MIPRO system, and rely on the proxy URL to perform prompt substitution.

1) Set the backend URL

Point to the production backend:
export SYNTH_API_KEY="sk_live_your_key"
export SYNTH_URL="https://api.usesynth.ai"
The demo resolves the backend URL from SYNTH_URL (preferred), then SYNTH_BACKEND_URL, then RUST_BACKEND_URL.
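That resolution order can be expressed as a small helper. A sketch (the demo's actual resolution code may differ; the default value assumes the production URL shown above):

```python
import os

def resolve_backend_url(env: dict, default: str = "https://api.usesynth.ai") -> str:
    """Return the first set variable: SYNTH_URL, then SYNTH_BACKEND_URL, then RUST_BACKEND_URL."""
    for var in ("SYNTH_URL", "SYNTH_BACKEND_URL", "RUST_BACKEND_URL"):
        if env.get(var):
            return env[var]
    return default

backend_url = resolve_backend_url(dict(os.environ))
```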

2) Send reward updates

After each rollout, report the reward to the backend system. The demo uses:
POST /api/prompt-learning/online/mipro/systems/{system_id}/status
{
  "rollout_id": "<id>",
  "status": "reward",
  "reward": <float>
}
Then it sends a final "done" status for the rollout:
POST /api/prompt-learning/online/mipro/systems/{system_id}/status
{
  "rollout_id": "<id>",
  "status": "done"
}
This is what push_status() does in the demo.
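A minimal version of `push_status()` might look like the following, using only the standard library. The payload fields match the JSON above; the bearer-token auth header is an assumption, and `build_status_payload` is an illustrative helper, not the demo's actual function:

```python
import json
import urllib.request

def build_status_payload(rollout_id: str, status: str, reward=None) -> dict:
    """Build the status body: reward is included only for 'reward' updates."""
    payload = {"rollout_id": rollout_id, "status": status}
    if reward is not None:
        payload["reward"] = reward
    return payload

def push_status(base_url, api_key, system_id, rollout_id, status, reward=None):
    """POST a rollout status update to the online MIPRO system."""
    data = json.dumps(build_status_payload(rollout_id, status, reward)).encode()
    req = urllib.request.Request(
        f"{base_url}/api/prompt-learning/online/mipro/systems/{system_id}/status",
        data=data,
        headers={"Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=30) as resp:
        resp.read()
```

After the final rollout step, you would call `push_status(..., status="done")` with no reward, mirroring the second request above.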

3) Prompt substitution (what happens behind the scenes)

The backend returns a proxy URL (e.g. mipro_proxy_url) for each online job. You call it like:
{proxy_url}/{rollout_id}/chat/completions
Behind the scenes:
  1. The proxy selects the current best candidate prompt for the rollout.
  2. It substitutes that candidate into the prompt template (system/user message patterns).
  3. The proxy forwards the request to the model provider with the substituted prompt.
  4. You receive the model response and compute a reward locally.
  5. Reward updates drive the next round of proposals.
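Calling the proxy looks like any OpenAI-compatible chat-completions request. A stdlib-only sketch (the URL shape comes from the docs above; the auth header and the minimal request body are assumptions, and `proxy_endpoint` is an illustrative helper):

```python
import json
import urllib.request

def proxy_endpoint(proxy_url: str, rollout_id: str) -> str:
    """Build the per-rollout chat-completions URL served by the proxy."""
    return f"{proxy_url}/{rollout_id}/chat/completions"

def call_proxy(proxy_url: str, rollout_id: str, api_key: str, messages: list) -> dict:
    """POST through the proxy; it substitutes the current best candidate prompt
    into the template before forwarding the request to the model provider."""
    body = json.dumps({"messages": messages}).encode()
    req = urllib.request.Request(
        proxy_endpoint(proxy_url, rollout_id),
        data=body,
        headers={"Authorization": f"Bearer {api_key}",  # auth scheme is an assumption
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.loads(resp.read())
```

You then score the returned completion locally (e.g. against the Banking77 label) and report that reward via the status endpoint from step 2.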

See Also