synth_ai.sdk.api.train.prompt_learning

Alpha First-class SDK API for prompt learning (MIPRO and GEPA). Note: MIPRO is Experimental; GEPA is Alpha. This module provides high-level abstractions for running prompt optimization jobs, both via the CLI (uvx synth-ai train) and programmatically in Python scripts. Example CLI usage:
uvx synth-ai train --type prompt_learning --config my_config.toml --poll
Example SDK usage:
from synth_ai.sdk.api.train.prompt_learning import PromptLearningJob

job = PromptLearningJob.from_config("my_config.toml")
job.submit()
result = job.poll_until_complete()
print(f"Best score: {result['best_score']}")
For domain-specific judging, you can use Verifier Graphs. See PromptLearningJudgeConfig in synth_ai.sdk.api.train.configs.prompt_learning for configuration details.

synth_ai.sdk.api.train.configs.prompt_learning

Prompt Learning configuration models for MIPRO and GEPA. This module defines configuration schemas for prompt optimization training jobs, supporting two optimization algorithms: GEPA (evolutionary search) and MIPROv2 (Bayesian optimization with bootstrapped demonstrations).

When to Use Prompt Learning

GEPA (Genetic Evolution of Prompt Architectures):
  • Complex prompt structures with multiple components
  • Multi-component prompts (system + few-shot + chain-of-thought)
  • When you want interpretable, incremental prompt improvements
  • Exploring diverse prompt mutations via evolutionary search
MIPROv2 (Multi-prompt Instruction Proposal Optimizer):
  • Efficient prompt optimization with fewer evaluations
  • When you have limited compute budget
  • Bayesian optimization approach with bootstrap demonstrations
Example GEPA configuration:
[prompt_learning]
algorithm = "gepa"
task_app_url = "https://your-tunnel.trycloudflare.com"
task_app_api_key = "$ENVIRONMENT_API_KEY"
task_app_id = "your-task"

[prompt_learning.initial_prompt]
id = "classifier_prompt"
name = "Classification Prompt"

[[prompt_learning.initial_prompt.messages]]
role = "system"
pattern = "You are a classifier. {instructions}"
order = 0

[[prompt_learning.initial_prompt.messages]]
role = "user"
pattern = "{query}"
order = 1

[prompt_learning.policy]
model = "gpt-4o-mini"
provider = "openai"
temperature = 0.0
max_completion_tokens = 512

[prompt_learning.gepa]
env_name = "my-task"
proposer_effort = "LOW"           # Model quality: LOW_CONTEXT, LOW, MEDIUM, HIGH
proposer_output_tokens = "FAST"   # Token limit: RAPID (3k), FAST (10k), SLOW (25k)

[prompt_learning.gepa.rollout]
budget = 100                      # Total prompt evaluations
max_concurrent = 20               # Concurrent rollouts

[prompt_learning.gepa.evaluation]
seeds = {start = 0, end = 50}     # Training seeds (range syntax)
validation_seeds = [50, 51, 52, 53, 54]  # Held-out validation

[prompt_learning.gepa.population]
initial_size = 20                 # Initial population
num_generations = 10              # Evolution generations
children_per_generation = 5       # Children per generation
See Also:
  • Training reference: /training/gepa
  • Training reference: /training/mipro
  • Quickstart: /quickstart/prompt-optimization

Functions

resolve_adaptive_pool_config

resolve_adaptive_pool_config(level=None, overrides=None, dev_pool_size=None) -> AdaptivePoolConfig
Resolve adaptive pool config from level preset and overrides. Args:
  • level: Preset level (NONE, LOW, MODERATE, HIGH). Defaults to LOW if None.
  • overrides: Dict of field overrides to apply on top of level defaults.
  • dev_pool_size: Optional dev pool size to cap pool_init_size if needed.
Returns:
  • AdaptivePoolConfig with resolved values.

resolve_adaptive_batch_config

resolve_adaptive_batch_config(level=None, overrides=None) -> GEPAAdaptiveBatchConfig
Resolve adaptive batch config from level preset and overrides. Args:
  • level: Preset level (NONE, LOW, MODERATE, HIGH). Defaults to MODERATE if None.
  • overrides: Dict of field overrides to apply on top of level defaults.
Returns:
  • GEPAAdaptiveBatchConfig with resolved values.
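Example (a minimal sketch: keyword names follow the Args lists above, the override keys are AdaptivePoolConfig fields documented below, and passing level as a plain string rather than an AdaptiveCurriculumLevel member is an assumption):
from synth_ai.sdk.api.train.configs.prompt_learning import (
    resolve_adaptive_pool_config,
    resolve_adaptive_batch_config,
)

# Start from the LOW preset, override two fields, and cap the pool
# against a hypothetical dev set of 200 examples.
pool_cfg = resolve_adaptive_pool_config(
    level="LOW",
    overrides={"pool_min_size": 40, "warmup_iters": 3},
    dev_pool_size=200,
)

# MODERATE is the documented default when level is None.
batch_cfg = resolve_adaptive_batch_config(level="MODERATE")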

Job API

PromptLearningJobConfig

Configuration for a prompt learning job.

PromptLearningJobPoller

Poller for prompt learning jobs. Methods:

poll_job

poll_job(self, job_id: str) -> PollOutcome
Poll a prompt learning job by ID. Args:
  • job_id: Job ID (e.g., “pl_9c58b711c2644083”)
Returns:
  • PollOutcome with status and payload

PromptLearningJob

High-level SDK class for running prompt learning jobs (MIPRO or GEPA). This class provides a clean API for:
  1. Submitting prompt learning jobs
  2. Polling job status
  3. Retrieving results
Methods:

from_config

from_config(cls, config_path: str | Path, backend_url: Optional[str] = None, api_key: Optional[str] = None, task_app_api_key: Optional[str] = None, allow_experimental: Optional[bool] = None, overrides: Optional[Dict[str, Any]] = None) -> PromptLearningJob
Create a job from a TOML config file. Args:
  • config_path: Path to TOML config file
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
  • task_app_api_key: Task app API key (defaults to ENVIRONMENT_API_KEY env var)
  • allow_experimental: Allow experimental models
  • overrides: Config overrides
Returns:
  • PromptLearningJob instance
Raises:
  • ValueError: If required config is missing
  • FileNotFoundError: If config file doesn’t exist
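Example (a sketch of passing credentials explicitly rather than relying on environment variables; the file name and key values are placeholders):
from synth_ai.sdk.api.train.prompt_learning import PromptLearningJob

job = PromptLearningJob.from_config(
    "my_config.toml",
    backend_url="https://api.usesynth.ai",  # defaults to env or production
    api_key="sk-synth-...",                 # defaults to SYNTH_API_KEY
    task_app_api_key="env-...",             # defaults to ENVIRONMENT_API_KEY
)
job_id = job.submit()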

from_job_id

from_job_id(cls, job_id: str, backend_url: Optional[str] = None, api_key: Optional[str] = None) -> PromptLearningJob
Resume an existing job by ID. Args:
  • job_id: Existing job ID
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
Returns:
  • PromptLearningJob instance for the existing job
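Example (a sketch of re-attaching to a job submitted earlier, e.g. from the CLI; the job ID is the placeholder used above):
from synth_ai.sdk.api.train.prompt_learning import PromptLearningJob

job = PromptLearningJob.from_job_id("pl_9c58b711c2644083")
print(job.get_status())
result = job.poll_until_complete()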

submit

submit(self) -> str
Submit the job to the backend. Returns:
  • Job ID
Raises:
  • RuntimeError: If job submission fails
  • ValueError: If task app health check fails

job_id

job_id(self) -> Optional[str]
Get the job ID (None if not yet submitted).

get_status

get_status(self) -> Dict[str, Any]
Get current job status. Returns:
  • Job status dictionary
Raises:
  • RuntimeError: If job hasn’t been submitted yet
  • ValueError: If job ID format is invalid

poll_until_complete

poll_until_complete(self) -> Dict[str, Any]
Poll job until it reaches a terminal state. Args:
  • timeout: Maximum seconds to wait
  • interval: Seconds between poll attempts
  • on_status: Optional callback called on each status update
Returns:
  • Final job status dictionary
Raises:
  • RuntimeError: If job hasn’t been submitted yet
  • TimeoutError: If timeout is exceeded
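Example (a sketch using the documented arguments; the numeric values and the keys inside the status dictionary are assumptions):
def log_status(status: dict) -> None:
    # Keys depend on the backend payload; "status" is assumed here.
    print(status.get("status"))

final_status = job.poll_until_complete(
    timeout=3600,        # give up after an hour
    interval=15,         # poll every 15 seconds
    on_status=log_status,
)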

get_results

get_results(self) -> Dict[str, Any]
Get job results (prompts, scores, etc.). Returns:
  • Results dictionary with best_prompt, best_score, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted yet

get_best_prompt_text

get_best_prompt_text(self, rank: int = 1) -> Optional[str]
Get the text of the best prompt by rank. Args:
  • rank: Prompt rank (1 = best, 2 = second best, etc.)
Returns:
  • Prompt text or None if not found
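Example (a sketch; the best_score key and the rank argument follow the descriptions above):
results = job.get_results()
print("Best score:", results.get("best_score"))

best_prompt = job.get_best_prompt_text(rank=1)   # best-ranked prompt
runner_up = job.get_best_prompt_text(rank=2)     # second best, may be None
if best_prompt is not None:
    print(best_prompt)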

Configuration Reference

SeedRange

Compact seed range notation for TOML configs. Allows writing seeds = {start = 0, end = 50} instead of seeds = [0, 1, 2, ..., 49]. Examples:
seeds = {start = 0, end = 10}             # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
seeds = {start = 0, end = 100, step = 2}  # [0, 2, 4, ..., 98]
Methods:

to_list

to_list(self) -> list[int]
Convert range to list of integers.
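Example (a sketch assuming the constructor takes the same start/end/step fields used in the TOML notation):
from synth_ai.sdk.api.train.configs.prompt_learning import SeedRange

SeedRange(start=0, end=10).to_list()           # [0, 1, ..., 9]
SeedRange(start=0, end=100, step=2).to_list()  # [0, 2, ..., 98]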

InferenceMode

ProviderName

PromptLearningPolicyConfig

Policy configuration for prompt learning (model, provider, etc.).

MessagePatternConfig

Configuration for a single message pattern.

PromptPatternConfig

Initial prompt pattern configuration.

MIPROMetaConfig

DEPRECATED: Meta-model config is now controlled by proposer_effort and proposer_output_tokens. This class is kept for backwards compatibility but should not be used. Use proposer_effort (LOW_CONTEXT, LOW, MEDIUM, HIGH) and proposer_output_tokens (RAPID, FAST, SLOW) instead.

MIPROStageConfig

Configuration for a single MIPRO stage inside a module. Each stage MUST have its own policy configuration. The policy field is required and must include ‘model’ and ‘provider’ fields.

MIPROModuleConfig

Configuration for a single module in a MIPRO pipeline.

MIPROSeedConfig

Seed pools used across bootstrap, optimization, and evaluation.

PromptLearningJudgeConfig

Verifier configuration shared by GEPA and MIPRO. This configures LLM-based evaluation of agent trajectories during prompt optimization. You can use standard rubrics or registered Verifier Graphs. Attributes:
  • enabled: Whether to enable verifier-based scoring.
  • reward_source: Source of the final reward for optimization. Options:
    • “task_app”: Use only environment rewards from task app (default).
    • “judge”: Use only verifier quality scores.
    • “fused”: Weighted combination of environment and verifier rewards.
  • backend_base: Base URL for the verifier service (e.g. “https://api.usesynth.ai”).
  • backend_api_key_env: Env var containing the Synth API key (default: “SYNTH_API_KEY”).
  • backend_provider: Provider for the verifier model (e.g. “openai”, “groq”).
  • backend_model: Model used to execute the verifier rubric or graph (e.g. “gpt-4o-mini”).
  • synth_verifier_id: ID or Name of a registered Verifier Graph or Rubric on the backend. Use this to point to a specific, versioned verifier artifact.
  • backend_rubric_id: Legacy alias for synth_verifier_id.
  • backend_event_enabled: Whether to enable fine-grained event-level scoring.
  • backend_outcome_enabled: Whether to enable episode-level outcome scoring.
  • weight_env: Weight for environment rewards in “fused” mode (default: 1.0).
  • weight_event: Weight for verifier event rewards in “fused” mode (default: 0.0).
  • weight_outcome: Weight for verifier outcome rewards in “fused” mode (default: 0.0).
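Example (a sketch constructing the config directly with keyword arguments; the field names are the attributes listed above, but keyword construction and how the object is attached to the rest of a prompt learning config are assumptions):
from synth_ai.sdk.api.train.configs.prompt_learning import PromptLearningJudgeConfig

# Fuse environment rewards with an outcome-level verifier score.
judge = PromptLearningJudgeConfig(
    enabled=True,
    reward_source="fused",
    backend_base="https://api.usesynth.ai",
    backend_provider="openai",
    backend_model="gpt-4o-mini",
    synth_verifier_id="my-registered-verifier",  # hypothetical verifier name
    backend_outcome_enabled=True,
    weight_env=1.0,
    weight_outcome=0.5,
)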

PromptLearningVerifierConfig

Alias for PromptLearningJudgeConfig with verifier terminology.

ProxyModelsConfig

Configuration for proxy usage on policy evaluations. Uses a low-fidelity (LO) model for most evaluations and a high-fidelity (HI) model for verification, with dynamic switching based on calibration and correlation. The proxy system starts by evaluating examples with both HI and LO models to build a calibration regression. Once calibrated (R² >= r2_thresh), it switches to using only the LO model for most evaluations, falling back to HI when reliability drops. Attributes:
  • hi_provider: Provider for high-fidelity model (e.g., “openai”, “groq”, “google”). This is the expensive model used for ground-truth evaluations.
  • hi_model: High-fidelity model name (e.g., “gpt-4o”, “gpt-oss-120b”). Must be a supported model for the provider.
  • lo_provider: Provider for low-fidelity proxy model (e.g., “groq”, “openai”). This is the cheaper model used for most evaluations after calibration.
  • lo_model: Low-fidelity proxy model name (e.g., “gpt-oss-20b”, “gpt-4o-mini”). Must be a supported model for the provider. Should be cheaper than hi_model.
  • n_min_hi: Minimum number of HI evaluations before allowing proxy substitution. Default: 5. Ensures sufficient calibration data before proxying.
  • r2_thresh: R² correlation threshold (0.0-1.0) required to enable proxying. Default: 0.5. Higher values require stronger correlation before proxying.
  • r2_stop: R² threshold (0.0-1.0) below which proxying is disabled. Default: 0.2. If correlation drops below this, revert to HI-only.
  • sigma_max: Maximum residual variance (sigma²) allowed for proxy calibration. Default: 1e6. Higher values allow more variance in predictions.
  • sigma_stop: Stop proxying if residual variance exceeds this value. Default: 1e9. If variance exceeds this, revert to HI-only.
  • verify_every: Periodically verify calibration every N LO-only evaluations. Default: 0 (no periodic verification). Set to a value > 0 to periodically run both HI and LO models and check that the calibration is still valid.
  • proxy_patience_usd: Stop proxying if cumulative net gain drops below this (USD). Default: -100.0. Negative values allow some loss before stopping. Set to 0.0 to stop immediately if proxy becomes unprofitable.
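Example (a sketch using the documented fields; the model and provider names come from the attribute descriptions above, and keyword construction is assumed):
from synth_ai.sdk.api.train.configs.prompt_learning import ProxyModelsConfig

# Expensive HI model for ground truth, cheap LO proxy once calibrated.
proxy = ProxyModelsConfig(
    hi_provider="openai",
    hi_model="gpt-4o",
    lo_provider="groq",
    lo_model="gpt-oss-20b",
    n_min_hi=5,          # calibration points before proxying
    r2_thresh=0.5,       # correlation required to enable proxying
    verify_every=20,     # re-check calibration every 20 LO-only evaluations
)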

AdaptiveCurriculumLevel

Preset levels for adaptive pooling curriculum.

AdaptivePoolConfig

Configuration for adaptive pooling (dynamically adjusting evaluation pool size). Reduces evaluation costs by focusing on the most informative examples while maintaining optimization quality through informativeness-based selection. The adaptive pool starts with a larger pool and gradually reduces to a minimum size, selecting examples based on informativeness (variance across prompts). Examples are divided into anchors (always evaluated) and exploration pool (selected based on informativeness). Attributes:
  • level: Preset level (NONE, LOW, MODERATE, HIGH). Default: LOW. NONE disables adaptive pooling. Higher levels use smaller pools and more aggressive annealing for greater cost savings.
  • anchor_size: Number of anchor examples that are always evaluated. Default: 30. Anchors provide stable baseline for optimization. Must be <= pool_min_size.
  • pool_init_size: Initial pool size at start of optimization. Default: None (uses all available examples). Set to limit initial pool. Must be >= pool_min_size if both are set.
  • pool_min_size: Target minimum pool size after annealing completes. Default: None (uses anchor_size). Pool anneals linearly from pool_init_size to pool_min_size between warmup_iters and anneal_stop_iter. Must be >= anchor_size.
  • warmup_iters: Number of iterations before starting pool annealing. Default: 5. During warmup, pool stays at pool_init_size to gather informativeness data.
  • anneal_stop_iter: Iteration at which pool reaches pool_min_size. Default: 20. Pool size decreases linearly from warmup_iters to this. Must be > warmup_iters.
  • pool_update_period: Update informativeness scores every N generations. Default: 3. More frequent updates (lower value) adapt faster but require more computation.
  • min_evals_per_example: Minimum evaluations per example before computing informativeness. Default: 3. Examples with fewer evals get info=0.0.
  • k_info_prompts: Number of top-performing prompts used for informativeness computation. Default: 10. Only scores from these prompts are used to compute variance-based informativeness.
  • info_buffer_factor: Buffer factor (0.0-1.0) for preserving informativeness during pool reduction. Default: 0.9. Higher values preserve more informativeness but allow less reduction. Lower values allow more aggressive reduction but may lose informativeness.
  • info_epsilon: Small epsilon value added to prevent division by zero in informativeness calculations. Default: 1e-6.
  • anchor_selection_method: Method for selecting anchor examples. Default: “clustering”. Options:
    • “random”: Random selection
    • “clustering”: Select diverse examples via clustering
  • exploration_strategy: Strategy for selecting exploration pool examples. Default: “diversity”. Options:
    • “random”: Random selection
    • “diversity”: Select diverse examples based on informativeness
  • heatup_reserve_pool: Optional list of seed IDs reserved for heat-up phase. Default: None. If provided, these seeds are added back to pool during heat-up phases to prevent overfitting to small pool.
  • heatup_trigger: When to trigger heat-up phase (adding seeds back to pool). Default: “after_min_size”. Options:
    • “after_min_size”: Trigger after pool reaches min_size
    • “immediate”: Trigger immediately
    • “every_N_trials_after_min”: Trigger periodically after min_size
  • heatup_size: Number of seeds to add during heat-up phase. Default: 20. Seeds are selected from heatup_reserve_pool or reserve pool.
  • heatup_cooldown_trials: Number of trials to wait before cooling down (removing heat-up seeds) after heat-up. Default: 50.
  • heatup_schedule: Whether heat-up repeats or happens once. Default: “repeat”. Options:
    • “once”: Heat-up happens once
    • “repeat”: Heat-up repeats after cooldown
Methods:

enabled

enabled(self) -> bool
Whether adaptive pooling is enabled (level != NONE).

AdaptiveBatchLevel

Preset levels for adaptive batch curriculum (GEPA only).

GEPAAdaptiveBatchConfig

Configuration for adaptive batch evaluation (GEPA only). Reduces evaluation costs by using smaller minibatches and subsampling validation. Methods:

enabled

enabled(self) -> bool
Whether adaptive batch is enabled (level != NONE).

MIPROConfig

MIPRO-specific configuration. MIPROv2 uses meta-learning with a bootstrap phase, TPE optimization, and mini-batch evaluation to optimize prompts with fewer evaluations than genetic algorithms. Attributes:
  • proposer_effort: Effort level for proposer model selection. Controls which model is used for generating prompt proposals. Default: “LOW”. Options:
    • “LOW_CONTEXT”: Uses gpt-oss-120b (Groq) with minimal context. Fastest/cheapest. Required when proposer_output_tokens=“RAPID”.
    • “LOW”: Uses smaller/faster models (e.g., gpt-4o-mini). Good balance.
    • “MEDIUM”: Uses medium models (e.g., gpt-4o). Higher quality proposals.
    • “HIGH”: Uses best models (e.g., gpt-5). Highest quality but expensive.
  • proposer_output_tokens: Maximum output tokens allowed for proposer model. Default: “FAST”. Controls proposal length and cost. Options:
    • “RAPID”: 3000 tokens max. Fastest/cheapest. Requires proposer_effort=“LOW_CONTEXT” and gpt-oss-120b model. Use for short, focused proposals.
    • “FAST”: 10000 tokens max. Good balance. Works with any effort level.
    • “SLOW”: 25000 tokens max. Allows longer proposals. Use for complex prompts.
  • min_bootstrap_demos: Minimum number of qualified bootstrap demonstrations required. Default: None (no minimum). If set, bootstrap phase will fail early if fewer than this many demos pass the few_shot_score_threshold. Use with strict_bootstrap=True for fail-fast behavior.
  • strict_bootstrap: If True, fail immediately when bootstrap doesn’t produce enough qualified demos (< min_bootstrap_demos). Default: False. When False, optimization continues but may produce suboptimal results with insufficient demos.
Methods:

simple

simple(cls) -> MIPROConfig
Convenience constructor for single-stage MIPRO tasks. Automatically infers reasonable defaults for seeds, iterations, and module layout based on the rollout budget. This keeps simple benchmarks (e.g., Iris) readable while leaving the full constructor available for complex multi-stage pipelines.
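Example (a sketch; the classmethod is documented to take no arguments, and wiring the result into a full PromptLearningConfig is not shown here):
from synth_ai.sdk.api.train.configs.prompt_learning import MIPROConfig

# Single-stage MIPRO with inferred defaults for seeds, iterations, and modules.
mipro_cfg = MIPROConfig.simple()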

GEPARolloutConfig

GEPA rollout configuration (mirrors RL [rollout] section).

GEPAEvaluationConfig

GEPA evaluation configuration (mirrors RL [evaluation] section).

GEPAMutationConfig

GEPA mutation configuration. NOTE: Mutation model selection is controlled by proposer_effort, NOT llm_model. The llm_model/llm_provider fields are deprecated and should not be used.

GEPAPopulationConfig

GEPA population configuration (evolution parameters).

GEPAArchiveConfig

GEPA archive configuration (Pareto archive settings).

GEPATokenConfig

GEPA token and budget configuration.

GEPAModuleConfig

Configuration for a single GEPA pipeline module/stage (instruction-only). Each module MUST have its own policy configuration. The policy field is required and must include ‘model’ and ‘provider’ fields.

GEPAConfig

GEPA-specific configuration with nested subsections. GEPA (Genetic Evolution of Prompt Architectures) uses evolutionary algorithms with LLM-guided mutations to optimize prompts through population-based search. Attributes:
  • proposer_type: Type of proposer to use for generating mutations. Default: “dspy”. Options: “dspy” (DSPy-style proposer) or “spec” (spec-based).
  • proposer_effort: Effort level for proposer model selection. Controls which model is used for generating prompt mutations. Default: “LOW”. Options:
    • “LOW_CONTEXT”: Uses gpt-oss-120b (Groq) with minimal context. Fastest/cheapest. Required when proposer_output_tokens=“RAPID”.
    • “LOW”: Uses smaller/faster models (e.g., gpt-4o-mini). Good balance.
    • “MEDIUM”: Uses medium models (e.g., gpt-4o). Higher quality mutations.
    • “HIGH”: Uses best models (e.g., gpt-5). Highest quality but expensive.
  • proposer_output_tokens: Maximum output tokens allowed for proposer model. Default: “FAST”. Controls mutation length and cost. Options:
    • “RAPID”: 3000 tokens max. Fastest/cheapest. Requires proposer_effort=“LOW_CONTEXT” and gpt-oss-120b model. Use for short, focused mutations.
    • “FAST”: 10000 tokens max. Good balance. Works with any effort level.
    • “SLOW”: 25000 tokens max. Allows longer mutations. Use for complex prompts.
  • metaprompt: Optional custom metaprompt text to include in mutation prompts. Default: None. If provided, replaces default metaprompt template.
Methods:

from_mapping

from_mapping(cls, data: Mapping[str, Any]) -> GEPAConfig
Load GEPA config from dict/TOML, handling both nested and flat structures.
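Example (a sketch; the nested keys mirror the [prompt_learning.gepa] TOML sections shown earlier on this page):
from synth_ai.sdk.api.train.configs.prompt_learning import GEPAConfig

gepa_cfg = GEPAConfig.from_mapping({
    "env_name": "my-task",
    "proposer_effort": "LOW",
    "proposer_output_tokens": "FAST",
    "rollout": {"budget": 100, "max_concurrent": 20},
    "evaluation": {
        "seeds": {"start": 0, "end": 50},
        "validation_seeds": [50, 51, 52, 53, 54],
    },
    "population": {
        "initial_size": 20,
        "num_generations": 10,
        "children_per_generation": 5,
    },
})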

PromptLearningConfig

Top-level prompt learning configuration. Methods:

to_dict

to_dict(self) -> dict[str, Any]
Convert config to dictionary for API payload.

from_mapping

from_mapping(cls, data: Mapping[str, Any]) -> PromptLearningConfig
Load prompt learning config from dict/TOML mapping.

from_path

from_path(cls, path: Path) -> PromptLearningConfig
Load prompt learning config from TOML file.
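Example (a sketch using the documented classmethods; the file name is a placeholder):
from pathlib import Path
from synth_ai.sdk.api.train.configs.prompt_learning import PromptLearningConfig

cfg = PromptLearningConfig.from_path(Path("my_config.toml"))
payload = cfg.to_dict()  # dictionary suitable for the API payload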