Note: This page is auto-generated from SDK validation code. Parameters and types are extracted automatically and will update when the code changes.
MIPRO Online (Multi-prompt Instruction Proposal Optimizer) is an algorithm for optimizing prompts through systematic instruction proposal and evaluation. In online mode, you drive rollouts locally while the backend provides prompt candidates through proxy URLs.
Endpoint: POST /api/policy-optimization/online/jobs
Authentication: Bearer token via Authorization: Bearer $SYNTH_API_KEY
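As a minimal sketch of how the endpoint and bearer token fit together, the snippet below builds (but does not send) the create-job request with Python's standard library. The host `https://api.example.com` is a placeholder assumption, not the real backend URL, and the helper name `build_create_job_request` is hypothetical.

```python
import json
import os
import urllib.request

# Placeholder host: an assumption for illustration, not the real backend URL.
BASE_URL = "https://api.example.com"


def build_create_job_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (without sending) the POST request that creates an online job."""
    return urllib.request.Request(
        url=f"{BASE_URL}/api/policy-optimization/online/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# The empty body stands in for the job configuration described under Parameters.
request = build_create_job_request({}, os.environ.get("SYNTH_API_KEY", "sk-placeholder"))
```

Passing `request` to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access.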
Overview
In online mode:
- You control rollouts: Drive the rollout loop locally
- No tunneling required: Backend never calls your task app
- Real-time evolution: Prompts evolve as rewards are reported
- Proxy URL: Backend provides a proxy URL that selects prompt candidates for each LLM call
Request
Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| mipro.mode | string | Yes | Must be "online" |
| mipro.bootstrap_train_seeds | array[int] | Yes | Initial training seeds for bootstrap phase |
| mipro.val_seeds | array[int] | Yes | Validation seeds for evaluation |
| mipro.online_pool | array[int] | Yes | Pool of seeds for online optimization |
| mipro.online_proposer_min_rollouts | int | Yes | Minimum rollouts before generating new proposals |
| mipro.online_proposer_mode | string | Yes | Proposer mode: "inline" (proposals generated during optimization) |
| mipro.online_rollouts_per_candidate | int | Yes | Number of rollouts per candidate before switching |
| mipro.proposer | object | Yes | Proposer configuration for generating prompt proposals |
| mipro.proposer.max_tokens | int | No | Maximum tokens for proposer output (default: 512) |
| mipro.proposer.mode | string | Yes | Proposer generation mode: "instruction_only" |
| mipro.proposer.model | string | Yes | Model for generating proposals |
| mipro.proposer.provider | string | Yes | Provider for proposer model |
| mipro.proposer.temperature | float | No | Temperature for proposer generation (default: 0.7) |
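To make the table concrete, here is a hedged sketch of a configuration covering every required parameter, plus a small local check for missing keys. Nesting the fields under a top-level `mipro` key is inferred from the dotted parameter names and may not match the actual request schema. The seed ranges, rollout counts, and the model and provider values are illustrative placeholders, and `missing_required` is a hypothetical helper, not part of the SDK.

```python
# Required keys, taken from the parameter table above.
REQUIRED_MIPRO_KEYS = {
    "mode", "bootstrap_train_seeds", "val_seeds", "online_pool",
    "online_proposer_min_rollouts", "online_proposer_mode",
    "online_rollouts_per_candidate", "proposer",
}
REQUIRED_PROPOSER_KEYS = {"mode", "model", "provider"}


def missing_required(mipro: dict) -> list:
    """Return dotted names of required parameters absent from the config."""
    missing = [f"mipro.{k}" for k in sorted(REQUIRED_MIPRO_KEYS - mipro.keys())]
    proposer = mipro.get("proposer")
    if isinstance(proposer, dict):
        missing += [
            f"mipro.proposer.{k}"
            for k in sorted(REQUIRED_PROPOSER_KEYS - proposer.keys())
        ]
    return missing


# All concrete values below are placeholders chosen for illustration.
job_config = {
    "mipro": {
        "mode": "online",
        "bootstrap_train_seeds": list(range(0, 5)),
        "val_seeds": list(range(5, 10)),
        "online_pool": list(range(10, 30)),
        "online_proposer_min_rollouts": 20,
        "online_proposer_mode": "inline",
        "online_rollouts_per_candidate": 10,
        "proposer": {
            "mode": "instruction_only",
            "model": "placeholder-model-name",
            "provider": "openai",
            "max_tokens": 512,   # optional; documented default
            "temperature": 0.7,  # optional; documented default
        },
    }
}

problems = missing_required(job_config["mipro"])
```

An empty `problems` list means every parameter marked "Yes" in the Required column is present; the backend may still apply further validation that this check does not model.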
Workflow
- Create job: Submit MIPRO job with mode: "online"
- Get proxy URL: Backend returns a proxy URL endpoint (via MiproOnlineSession)
- Run rollouts: For each rollout:
  - Call proxy URL with your task input
  - Proxy selects best prompt candidate
  - Execute LLM call with selected prompt
  - Report reward back to backend using MiproOnlineSession.update_reward()
- Automatic evolution: Backend generates new proposals based on rewards
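The rollout step above can be sketched as a plain local loop. This is an illustration under stated assumptions, not the SDK's implementation: `StubSession` is a hypothetical stand-in for MiproOnlineSession, the `update_reward(rollout_id, reward)` signature is assumed, and the proxy call and scoring function are supplied as ordinary callables so the example runs offline.

```python
from typing import Callable, Iterable, List, Tuple


class StubSession:
    """Hypothetical stand-in for MiproOnlineSession; it only records rewards."""

    def __init__(self) -> None:
        self.rewards: List[Tuple[str, float]] = []

    def update_reward(self, rollout_id: str, reward: float) -> None:
        # Assumed signature: the real SDK method may take different arguments.
        self.rewards.append((rollout_id, reward))


def run_rollouts(
    session: StubSession,
    seeds: Iterable[int],
    call_llm_via_proxy: Callable[[int], str],
    score: Callable[[int, str], float],
) -> float:
    """Drive rollouts locally and report one reward per rollout."""
    total, count = 0.0, 0
    for seed in seeds:
        output = call_llm_via_proxy(seed)   # proxy picks the prompt candidate
        reward = score(seed, output)        # task-specific scoring, done locally
        session.update_reward(f"rollout-{seed}", reward)
        total += reward
        count += 1
    return total / count if count else 0.0


# Offline usage: a fake "LLM" and a toy scorer that rewards even seeds.
session = StubSession()
mean_reward = run_rollouts(
    session,
    seeds=[0, 1, 2, 3],
    call_llm_via_proxy=lambda seed: f"answer-{seed}",
    score=lambda seed, output: 1.0 if seed % 2 == 0 else 0.0,
)
```

In a real run, `call_llm_via_proxy` would send the LLM request to the proxy URL returned for the job, and the stub would be replaced by the actual session object.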
Response
Polling for Completion
Use GET /api/policy-optimization/online/jobs/{job_id} to check status.
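A polling loop for this endpoint might look like the sketch below. The response shape is not documented on this page, so the `status` field and the terminal values `succeeded`, `failed`, and `cancelled` are assumptions. The HTTP call is injected as a `fetch_status` callable, which keeps the example runnable without network access.

```python
import time
from typing import Callable, FrozenSet

# Assumed terminal status values; adjust to whatever the backend returns.
ASSUMED_TERMINAL = frozenset({"succeeded", "failed", "cancelled"})


def poll_until_done(
    fetch_status: Callable[[], dict],
    terminal: FrozenSet[str] = ASSUMED_TERMINAL,
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_status repeatedly until it reports a terminal status."""
    for attempt in range(max_polls):
        body = fetch_status()
        if body.get("status") in terminal:
            return body
        if attempt < max_polls - 1:
            sleep(interval_s)  # wait before the next GET
    raise TimeoutError(f"job not finished after {max_polls} polls")


# Offline usage: canned responses stand in for GET .../online/jobs/{job_id}.
fake_responses = iter([{"status": "running"}, {"status": "running"}, {"status": "succeeded"}])
final = poll_until_done(lambda: next(fake_responses), sleep=lambda _seconds: None)
```

Capping the number of polls avoids waiting forever on a job that never reaches a terminal state.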
Notes
- No tunneling required: Backend never calls your task app, so no public URL needed
- You control rollouts: Drive the rollout loop locally in your code
- Real-time evolution: Prompts evolve as rewards are reported
- Proposer API key: Automatically resolved from backend environment (OPENAI_API_KEY or PROD_OPENAI_API_KEY)
- Session management: Use MiproOnlineSession SDK class for managing online sessions and reporting rewards
See Also
- MIPRO Offline API - Offline mode documentation
- MIPRO Banking77 Online Demo - Complete online mode example
- MIPRO SDK Reference - SDK usage guide
- MiproOnlineSession SDK - Online session management