synth_ai.sdk.api.train.prompt_learning
Alpha
First-class SDK API for prompt learning (MIPRO and GEPA).
Note: MIPRO is Experimental, GEPA is Alpha.
This module provides high-level abstractions for running prompt optimization jobs
both via CLI (uvx synth-ai train) and programmatically in Python scripts.
Example CLI usage: uvx synth-ai train
See PromptLearningJudgeConfig in synth_ai.sdk.api.train.configs.prompt_learning for configuration details.
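A minimal programmatic sketch of the job lifecycle (the config path and polling values are illustrative; the class and method names follow the Job API documented below):

```python
from synth_ai.sdk.api.train.prompt_learning import PromptLearningJob

# Build a job from a TOML config (the path here is illustrative).
job = PromptLearningJob.from_config(config_path="configs/prompt_learning.toml")

job.submit()  # returns the job ID; raises RuntimeError if submission fails
final_status = job.poll_until_complete(
    timeout=3600,                  # maximum seconds to wait
    interval=15,                   # seconds between poll attempts
    on_status=lambda s: print(s),  # optional callback on each status update
)

results = job.get_results()              # dict with best_prompt, best_score, etc.
best = job.get_best_prompt_text(rank=1)  # rank 1 = best prompt
```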
synth_ai.sdk.api.train.configs.prompt_learning
Prompt Learning configuration models for MIPRO and GEPA.
This module defines configuration schemas for prompt optimization training jobs,
supporting two optimization algorithms: GEPA (evolutionary search) and MIPROv2 (Bayesian optimization).
Papers:
- GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning
- MIPROv2: Optimizing Instructions and Demonstrations for Multi-Stage Language Model Programs
When to Use Prompt Learning
GEPA (Genetic Evolution of Prompt Architectures):
- Multi-component prompts with complex structure (system + few-shot + chain-of-thought)
- When you want interpretable, incremental prompt improvements
- Exploring diverse prompt mutations via evolutionary search
MIPROv2:
- Efficient prompt optimization with fewer evaluations
- When you have a limited compute budget
- Bayesian optimization approach with bootstrap demonstrations
References:
- GEPA training reference: /training/gepa
- MIPRO training reference: /training/mipro
- Quickstart: /quickstart/prompt-optimization
Functions
resolve_adaptive_pool_config
Parameters:
- level: Preset level (NONE, LOW, MODERATE, HIGH). Defaults to LOW if None.
- overrides: Dict of field overrides to apply on top of level defaults.
- dev_pool_size: Optional dev pool size to cap pool_init_size if needed.
Returns:
- AdaptivePoolConfig with resolved values.
resolve_adaptive_batch_config
Parameters:
- level: Preset level (NONE, LOW, MODERATE, HIGH). Defaults to MODERATE if None.
- overrides: Dict of field overrides to apply on top of level defaults.
Returns:
- GEPAAdaptiveBatchConfig with resolved values.
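A hedged usage sketch; the import path, keyword arguments, and the AdaptiveCurriculumLevel enum members are assumed from this reference page:

```python
# Import path assumed from this reference page.
from synth_ai.sdk.api.train.configs.prompt_learning import (
    AdaptiveCurriculumLevel,
    resolve_adaptive_pool_config,
    resolve_adaptive_batch_config,
)

# Start from the MODERATE preset, then override one field on top of it.
pool_cfg = resolve_adaptive_pool_config(
    level=AdaptiveCurriculumLevel.MODERATE,
    overrides={"anchor_size": 20},  # field name from AdaptivePoolConfig below
    dev_pool_size=100,              # caps pool_init_size if needed
)

batch_cfg = resolve_adaptive_batch_config(level=None)  # None -> MODERATE default
```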
Job API
PromptLearningJobConfig
Configuration for a prompt learning job.
PromptLearningJobPoller
Poller for prompt learning jobs.
Methods:
poll_job
Parameters:
- job_id: Job ID (e.g., "pl_9c58b711c2644083")
Returns:
- PollOutcome with status and payload
PromptLearningJob
High-level SDK class for running prompt learning jobs (MIPRO or GEPA).
This class provides a clean API for:
- Submitting prompt learning jobs
- Polling job status
- Retrieving results
from_config
Parameters:
- config_path: Path to TOML config file
- backend_url: Backend API URL (defaults to env or production)
- api_key: API key (defaults to SYNTH_API_KEY env var)
- task_app_api_key: Task app API key (defaults to ENVIRONMENT_API_KEY env var)
- allow_experimental: Allow experimental models
- overrides: Config overrides
Returns:
- PromptLearningJob instance
Raises:
- ValueError: If required config is missing
- FileNotFoundError: If config file doesn't exist
from_job_id
Parameters:
- job_id: Existing job ID
- backend_url: Backend API URL (defaults to env or production)
- api_key: API key (defaults to SYNTH_API_KEY env var)
Returns:
- PromptLearningJob instance for the existing job
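For example, attaching to an already-submitted job (the job ID is the sample shown above):

```python
from synth_ai.sdk.api.train.prompt_learning import PromptLearningJob

# Reattach to an existing job and check on it.
job = PromptLearningJob.from_job_id("pl_9c58b711c2644083")
print(job.get_status())  # current job status dictionary
```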
submit
Returns:
- Job ID
Raises:
- RuntimeError: If job submission fails
- ValueError: If task app health check fails
job_id
get_status
Returns:
- Job status dictionary
Raises:
- RuntimeError: If job hasn't been submitted yet
- ValueError: If job ID format is invalid
poll_until_complete
Parameters:
- timeout: Maximum seconds to wait
- interval: Seconds between poll attempts
- on_status: Optional callback called on each status update
Returns:
- Final job status dictionary
Raises:
- RuntimeError: If job hasn't been submitted yet
- TimeoutError: If timeout is exceeded
get_results
Returns:
- Results dictionary with best_prompt, best_score, etc.
Raises:
- RuntimeError: If job hasn't been submitted yet
get_best_prompt_text
Parameters:
- rank: Prompt rank (1 = best, 2 = second best, etc.)
Returns:
- Prompt text or None if not found
Configuration Reference
SeedRange
Compact seed range notation for TOML configs.
Allows writing seeds = {start = 0, end = 50} instead of seeds = [0, 1, 2, ..., 49].
Examples:
seeds = {start = 0, end = 10} # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
seeds = {start = 0, end = 100, step = 2} # [0, 2, 4, …, 98]
Methods:
to_list
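For example (the import path and keyword construction are assumed; end is exclusive, matching the TOML examples above):

```python
# Import path and Pydantic-style keyword construction assumed.
from synth_ai.sdk.api.train.configs.prompt_learning import SeedRange

assert SeedRange(start=0, end=10).to_list() == list(range(0, 10))
assert SeedRange(start=0, end=100, step=2).to_list() == list(range(0, 100, 2))
```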
InferenceMode
ProviderName
PromptLearningPolicyConfig
Policy configuration for prompt learning (model, provider, etc.).
MessagePatternConfig
Configuration for a single message pattern.
PromptPatternConfig
Initial prompt pattern configuration.
MIPROMetaConfig
DEPRECATED: Meta-model config is now controlled by proposer_effort and proposer_output_tokens.
This class is kept for backwards compatibility but should not be used.
Use proposer_effort (LOW_CONTEXT, LOW, MEDIUM, HIGH) and proposer_output_tokens (RAPID, FAST, SLOW) instead.
MIPROStageConfig
Configuration for a single MIPRO stage inside a module.
Each stage MUST have its own policy configuration. The policy field is required
and must include ‘model’ and ‘provider’ fields.
MIPROModuleConfig
Configuration for a single module in a MIPRO pipeline.
MIPROSeedConfig
Seed pools used across bootstrap, optimization, and evaluation.
PromptLearningJudgeConfig
Verifier configuration shared by GEPA and MIPRO.
This configures LLM-based evaluation of agent trajectories during prompt optimization.
You can use standard rubrics or registered Verifier Graphs.
Attributes:
- enabled: Whether to enable verifier-based scoring.
- reward_source: Source of the final reward for optimization.
  - "task_app": Use only environment rewards from the task app (default).
  - "judge": Use only verifier quality scores.
  - "fused": Weighted combination of environment and verifier rewards.
- backend_base: Base URL for the verifier service (e.g. "https://api.usesynth.ai").
- backend_api_key_env: Env var containing the Synth API key (default: "SYNTH_API_KEY").
- backend_provider: Provider for the verifier model (e.g. "openai", "groq").
- backend_model: Model used to execute the verifier rubric or graph (e.g. "gpt-4o-mini").
- synth_verifier_id: ID or name of a registered Verifier Graph or Rubric on the backend. Use this to point to a specific, versioned verifier artifact.
- backend_rubric_id: Legacy alias for synth_verifier_id.
- backend_event_enabled: Whether to enable fine-grained event-level scoring.
- backend_outcome_enabled: Whether to enable episode-level outcome scoring.
- weight_env: Weight for environment rewards in "fused" mode (default: 1.0).
- weight_event: Weight for verifier event rewards in "fused" mode (default: 0.0).
- weight_outcome: Weight for verifier outcome rewards in "fused" mode (default: 0.0).
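A hedged construction sketch (Pydantic-style keyword construction and the import path are assumed; field names are taken from the attribute list above):

```python
from synth_ai.sdk.api.train.configs.prompt_learning import PromptLearningJudgeConfig

judge = PromptLearningJudgeConfig(
    enabled=True,
    reward_source="fused",        # blend env and verifier rewards
    backend_provider="openai",
    backend_model="gpt-4o-mini",
    weight_env=0.5,               # fused-mode weights
    weight_event=0.0,
    weight_outcome=0.5,
)
```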
PromptLearningVerifierConfig
Alias for PromptLearningJudgeConfig with verifier terminology.
ProxyModelsConfig
Configuration for proxy usage on policy evaluations.
Uses a low-fidelity (LO) model for most evaluations and a high-fidelity (HI) model
for verification, with dynamic switching based on calibration and correlation.
The proxy system starts by evaluating examples with both HI and LO models to build
a calibration regression. Once calibrated (R² >= r2_thresh), it switches to using
only the LO model for most evaluations, falling back to HI when reliability drops.
Attributes:
- hi_provider: Provider for the high-fidelity model (e.g., "openai", "groq", "google"). This is the expensive model used for ground-truth evaluations.
- hi_model: High-fidelity model name (e.g., "gpt-4o", "gpt-oss-120b"). Must be a supported model for the provider.
- lo_provider: Provider for the low-fidelity proxy model (e.g., "groq", "openai"). This is the cheaper model used for most evaluations after calibration.
- lo_model: Low-fidelity proxy model name (e.g., "gpt-oss-20b", "gpt-4o-mini"). Must be a supported model for the provider. Should be cheaper than hi_model.
- n_min_hi: Minimum number of HI evaluations before allowing proxy substitution. Default: 5. Ensures sufficient calibration data before proxying.
- r2_thresh: R² correlation threshold (0.0-1.0) required to enable proxying. Default: 0.5. Higher values require stronger correlation before proxying.
- r2_stop: R² threshold (0.0-1.0) below which proxying is disabled. Default: 0.2. If correlation drops below this, revert to HI-only.
- sigma_max: Maximum residual variance (sigma²) allowed for proxy calibration. Default: 1e6. Higher values allow more variance in predictions.
- sigma_stop: Stop proxying if residual variance exceeds this value. Default: 1e9. If variance exceeds this, revert to HI-only.
- verify_every: Periodically verify calibration every N LO-only evaluations. Default: 0 (no periodic verification). Set to > 0 to periodically run both models and check that the calibration is still valid.
- proxy_patience_usd: Stop proxying if the cumulative net gain drops below this value (USD). Default: -100.0. Negative values allow some loss before stopping. Set to 0.0 to stop immediately if the proxy becomes unprofitable.
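An illustrative sketch of the switching rule described above, not the library's implementation (thresholds use the documented defaults):

```python
def should_use_proxy(
    n_hi: int,            # HI evaluations completed so far
    r2: float,            # current calibration R^2
    sigma2: float,        # current residual variance
    proxying: bool,       # whether LO-only mode is currently active
    n_min_hi: int = 5,
    r2_thresh: float = 0.5,
    r2_stop: float = 0.2,
    sigma_max: float = 1e6,
    sigma_stop: float = 1e9,
) -> bool:
    if proxying:
        # Already proxying: revert to HI-only if reliability drops.
        return r2 >= r2_stop and sigma2 <= sigma_stop
    # Not yet proxying: require enough HI data and a strong calibration.
    return n_hi >= n_min_hi and r2 >= r2_thresh and sigma2 <= sigma_max
```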
AdaptiveCurriculumLevel
Preset levels for adaptive pooling curriculum.
AdaptivePoolConfig
Configuration for adaptive pooling (dynamically adjusting evaluation pool size).
Reduces evaluation costs by focusing on the most informative examples while
maintaining optimization quality through informativeness-based selection.
The adaptive pool starts with a larger pool and gradually reduces to a minimum
size, selecting examples based on informativeness (variance across prompts).
Examples are divided into anchors (always evaluated) and exploration pool
(selected based on informativeness).
Attributes:
- level: Preset level (NONE, LOW, MODERATE, HIGH). Default: LOW. NONE disables adaptive pooling. Higher levels use smaller pools and more aggressive annealing for greater cost savings.
- anchor_size: Number of anchor examples that are always evaluated. Default: 30. Anchors provide a stable baseline for optimization. Must be <= pool_min_size.
- pool_init_size: Initial pool size at the start of optimization. Default: None (uses all available examples). Set to limit the initial pool. Must be >= pool_min_size if both are set.
- pool_min_size: Target minimum pool size after annealing completes. Default: None (uses anchor_size). The pool anneals linearly from pool_init_size to pool_min_size between warmup_iters and anneal_stop_iter. Must be >= anchor_size.
- warmup_iters: Number of iterations before pool annealing starts. Default: 5. During warmup, the pool stays at pool_init_size to gather informativeness data.
- anneal_stop_iter: Iteration at which the pool reaches pool_min_size. Default: 20. Pool size decreases linearly from warmup_iters to this iteration. Must be > warmup_iters.
- pool_update_period: Update informativeness scores every N generations. Default: 3. More frequent updates (lower value) adapt faster but require more computation.
- min_evals_per_example: Minimum evaluations per example before computing informativeness. Default: 3. Examples with fewer evals get info=0.0.
- k_info_prompts: Number of top-performing prompts used for informativeness computation. Default: 10. Only scores from these prompts are used to compute variance-based informativeness.
- info_buffer_factor: Buffer factor (0.0-1.0) for preserving informativeness during pool reduction. Default: 0.9. Higher values preserve more informativeness but allow less reduction; lower values allow more aggressive reduction but may lose informativeness.
- info_epsilon: Small epsilon added to prevent division by zero in informativeness calculations. Default: 1e-6.
- anchor_selection_method: Method for selecting anchor examples. Default: "clustering". Options:
  - "random": Random selection
  - "clustering": Select diverse examples via clustering
- exploration_strategy: Strategy for selecting exploration-pool examples. Default: "diversity". Options:
  - "random": Random selection
  - "diversity": Select diverse examples based on informativeness
- heatup_reserve_pool: Optional list of seed IDs reserved for the heat-up phase. Default: None. If provided, these seeds are added back to the pool during heat-up phases to prevent overfitting to a small pool.
- heatup_trigger: When to trigger the heat-up phase (adding seeds back to the pool). Default: "after_min_size". Options:
  - "after_min_size": Trigger after the pool reaches min_size
  - "immediate": Trigger immediately
  - "every_N_trials_after_min": Trigger periodically after min_size
- heatup_size: Number of seeds to add during the heat-up phase. Default: 20. Seeds are selected from heatup_reserve_pool or the reserve pool.
- heatup_cooldown_trials: Number of trials to wait before cooling down (removing heat-up seeds) after a heat-up. Default: 50.
- heatup_schedule: Whether heat-up repeats or happens once. Default: "repeat". Options:
  - "once": Heat-up happens once
  - "repeat": Heat-up repeats after each cooldown
enabled
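The linear pool annealing described in the attributes above can be sketched as follows (illustrative only, not the library's implementation):

```python
def pool_size_at(
    iteration: int,
    pool_init_size: int,
    pool_min_size: int,
    warmup_iters: int = 5,
    anneal_stop_iter: int = 20,
) -> int:
    """Linear anneal from pool_init_size to pool_min_size (illustrative)."""
    if iteration <= warmup_iters:
        return pool_init_size   # warmup: keep the full initial pool
    if iteration >= anneal_stop_iter:
        return pool_min_size    # annealing complete
    frac = (iteration - warmup_iters) / (anneal_stop_iter - warmup_iters)
    return round(pool_init_size - frac * (pool_init_size - pool_min_size))
```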
AdaptiveBatchLevel
Preset levels for adaptive batch curriculum (GEPA only).
GEPAAdaptiveBatchConfig
Configuration for adaptive batch evaluation (GEPA only).
Reduces evaluation costs by using smaller minibatches and subsampling validation.
Methods:
enabled
MIPROConfig
MIPRO-specific configuration.
MIPROv2 uses meta-learning with a bootstrap phase, TPE optimization, and mini-batch evaluation
to optimize prompts efficiently with fewer evaluations than genetic algorithms.
Attributes:
- proposer_effort: Effort level for proposer model selection. Controls which model is used for generating prompt proposals. Default: "LOW". Options:
  - "LOW_CONTEXT": Uses gpt-oss-120b (Groq) with minimal context. Fastest/cheapest. Required when proposer_output_tokens="RAPID".
  - "LOW": Uses smaller/faster models (e.g., gpt-4o-mini). Good balance.
  - "MEDIUM": Uses medium models (e.g., gpt-4o). Higher-quality proposals.
  - "HIGH": Uses best models (e.g., gpt-5). Highest quality but expensive.
- proposer_output_tokens: Maximum output tokens allowed for the proposer model. Default: "FAST". Controls proposal length and cost. Options:
  - "RAPID": 3000 tokens max. Fastest/cheapest. Requires proposer_effort="LOW_CONTEXT" and the gpt-oss-120b model. Use for short, focused proposals.
  - "FAST": 10000 tokens max. Good balance. Works with any effort level.
  - "SLOW": 25000 tokens max. Allows longer proposals. Use for complex prompts.
- min_bootstrap_demos: Minimum number of qualified bootstrap demonstrations required. Default: None (no minimum). If set, the bootstrap phase fails early if fewer than this many demos pass the few_shot_score_threshold. Use with strict_bootstrap=True for fail-fast behavior.
- strict_bootstrap: If True, fail immediately when bootstrap doesn't produce enough qualified demos (< min_bootstrap_demos). Default: False. When False, optimization continues but may produce suboptimal results with insufficient demos.
simple
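A hedged construction sketch (Pydantic-style keyword construction and the import path are assumed; the RAPID/LOW_CONTEXT pairing follows the constraint documented above):

```python
from synth_ai.sdk.api.train.configs.prompt_learning import MIPROConfig

mipro = MIPROConfig(
    proposer_effort="LOW_CONTEXT",   # required when proposer_output_tokens="RAPID"
    proposer_output_tokens="RAPID",  # 3000-token cap: fastest/cheapest proposals
    min_bootstrap_demos=8,           # fail early if fewer than 8 demos qualify...
    strict_bootstrap=True,           # ...and fail fast rather than continue
)
```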
GEPARolloutConfig
GEPA rollout configuration (mirrors RL [rollout] section).
GEPAEvaluationConfig
GEPA evaluation configuration (mirrors RL [evaluation] section).
GEPAMutationConfig
GEPA mutation configuration.
NOTE: Mutation model selection is controlled by proposer_effort, NOT llm_model.
The llm_model/llm_provider fields are deprecated and should not be used.
GEPAPopulationConfig
GEPA population configuration (evolution parameters).
GEPAArchiveConfig
GEPA archive configuration (Pareto archive settings).
GEPATokenConfig
GEPA token and budget configuration.
GEPAModuleConfig
Configuration for a single GEPA pipeline module/stage (instruction-only).
Each module MUST have its own policy configuration. The policy field is required
and must include ‘model’ and ‘provider’ fields.
GEPAConfig
GEPA-specific configuration with nested subsections.
GEPA (Genetic Evolution of Prompt Architectures) uses evolutionary algorithms
with LLM-guided mutations to optimize prompts through population-based search.
Attributes:
- proposer_type: Type of proposer to use for generating mutations. Default: "dspy". Options: "dspy" (DSPy-style proposer) or "spec" (spec-based).
- proposer_effort: Effort level for proposer model selection. Controls which model is used for generating prompt mutations. Default: "LOW". Options:
  - "LOW_CONTEXT": Uses gpt-oss-120b (Groq) with minimal context. Fastest/cheapest. Required when proposer_output_tokens="RAPID".
  - "LOW": Uses smaller/faster models (e.g., gpt-4o-mini). Good balance.
  - "MEDIUM": Uses medium models (e.g., gpt-4o). Higher-quality mutations.
  - "HIGH": Uses best models (e.g., gpt-5). Highest quality but expensive.
- proposer_output_tokens: Maximum output tokens allowed for the proposer model. Default: "FAST". Controls mutation length and cost. Options:
  - "RAPID": 3000 tokens max. Fastest/cheapest. Requires proposer_effort="LOW_CONTEXT" and the gpt-oss-120b model. Use for short, focused mutations.
  - "FAST": 10000 tokens max. Good balance. Works with any effort level.
  - "SLOW": 25000 tokens max. Allows longer mutations. Use for complex prompts.
- metaprompt: Optional custom metaprompt text to include in mutation prompts. Default: None. If provided, replaces the default metaprompt template.
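A hedged construction sketch for the top-level GEPA section (keyword construction and the import path are assumed; nested subsections such as rollout, mutation, population, archive, and token budgets are left at whatever defaults the library provides):

```python
from synth_ai.sdk.api.train.configs.prompt_learning import GEPAConfig

gepa = GEPAConfig(
    proposer_type="dspy",            # DSPy-style mutation proposer
    proposer_effort="MEDIUM",        # medium models, higher-quality mutations
    proposer_output_tokens="FAST",   # 10000-token cap, works with any effort
    metaprompt=None,                 # keep the default metaprompt template
)
```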