Prompt learning configs use the schema defined in synth_ai/train/configs/prompt_learning.py. They may be declared either as a top-level [prompt_learning] table or with all keys at the root; the loader normalizes both forms.

Top-Level Keys

  • algorithm: "mipro" or "gepa" (required)
  • task_app_url: base URL of the task app the optimizer will hit (required)
  • task_app_api_key: optional API key, if not supplied via the environment
  • task_app_id: optional identifier for logging
  • initial_prompt: PromptPatternConfig defining starting messages/wildcards
  • policy: model/provider settings (see below)
  • mipro: MIPROConfig block (if algorithm = "mipro")
  • gepa: GEPAConfig block (if algorithm = "gepa")
  • env_config: optional dict passed through to the task app

[policy] (PromptLearningPolicyConfig)

  • model (required)
  • provider ("openai", "groq", "google")
  • inference_mode (defaults to synth_hosted)
  • Optional: inference_url, temperature, max_completion_tokens, policy_name
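Putting those fields together, a policy block might look like the following (values are illustrative):

```toml
[prompt_learning.policy]
model = "gpt-4o-mini"            # required
provider = "openai"              # "openai", "groq", or "google"
inference_mode = "synth_hosted"  # the default
temperature = 0.2
max_completion_tokens = 512
policy_name = "baseline-policy"
```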

MIPRO Fields ([prompt_learning.mipro])

See MIPROConfig for the full list. Key parameters include:
  • num_iterations, num_evaluations_per_iteration, batch_size, max_concurrent
  • env_name, env_config
  • meta_model, meta_model_provider, meta_model_inference_url
  • few_shot_score_threshold, results_file, max_wall_clock_seconds, max_total_tokens
  • Budget controls: max_token_limit, max_spend_usd, token_counting_model, enforce_token_limit
  • Optional nested dicts: tpe, demo, grounding, meta_update, parallelism
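A MIPRO block combining the parameters above might look like this (values are illustrative; consult MIPROConfig for defaults and the contents of the optional nested dicts):

```toml
[prompt_learning.mipro]
num_iterations = 20
num_evaluations_per_iteration = 5
batch_size = 16
max_concurrent = 8
env_name = "banking77"
meta_model = "gpt-4o"
meta_model_provider = "openai"
few_shot_score_threshold = 0.8
max_wall_clock_seconds = 3600

# Budget controls
max_token_limit = 1_000_000
max_spend_usd = 100.0
enforce_token_limit = true
```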

GEPA Fields ([prompt_learning.gepa])

GEPAConfig supports both nested sections and flat keys. Recommended nested structure:
  • [prompt_learning.gepa.rollout] (GEPARolloutConfig): budget, max_concurrent, minibatch_size
  • [prompt_learning.gepa.evaluation] (GEPAEvaluationConfig): seeds, validation_seeds, test_pool, validation_pool, validation_top_k
  • [prompt_learning.gepa.mutation] (GEPAMutationConfig): rate, llm_model, llm_provider, llm_inference_url, prompt
  • [prompt_learning.gepa.population] (GEPAPopulationConfig): initial_size, num_generations, children_per_generation, crossover_rate, selection_pressure, patience_generations
  • [prompt_learning.gepa.archive] (GEPAArchiveConfig): size, pareto_set_size, pareto_eps, feedback_fraction
  • [prompt_learning.gepa.token] (GEPATokenConfig): max_limit, counting_model, enforce_pattern_limit, max_spend_usd
  • [prompt_learning.gepa.modules]: optional list of GEPAModuleConfig entries for multi-stage pipelines (module_id, limits, tool allowances)
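The nested sections above compose like so (values are illustrative, not recommended defaults):

```toml
[prompt_learning.gepa.rollout]
budget = 200
max_concurrent = 8
minibatch_size = 4

[prompt_learning.gepa.evaluation]
seeds = [0, 1, 2, 3]
validation_seeds = [100, 101]
validation_top_k = 3

[prompt_learning.gepa.mutation]
rate = 0.3
llm_model = "gpt-4o"
llm_provider = "openai"

[prompt_learning.gepa.population]
initial_size = 8
num_generations = 10
children_per_generation = 4

[prompt_learning.gepa.archive]
size = 16
pareto_set_size = 8

[prompt_learning.gepa.token]
max_limit = 2_000_000
max_spend_usd = 50.0
```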

Flat Backwards-Compatible Keys

If you cannot nest sections, the loader also recognizes:
  • rollout_budget, max_concurrent_rollouts, minibatch_size
  • evaluation_seeds, validation_seeds, test_pool, validation_pool, validation_top_k
  • mutation_rate, mutation_llm_model, mutation_llm_provider, mutation_llm_inference_url, mutation_prompt
  • initial_population_size, num_generations, children_per_generation, crossover_rate, selection_pressure, patience_generations
  • archive_size, pareto_set_size, pareto_eps, feedback_fraction
  • max_token_limit, token_counting_model, enforce_pattern_token_limit, max_spend_usd
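Using the flat keys, an equivalent GEPA config keeps everything in a single table (values are illustrative):

```toml
[prompt_learning]
algorithm = "gepa"
task_app_url = "https://my-task-app.modal.run"
rollout_budget = 200
max_concurrent_rollouts = 8
minibatch_size = 4
mutation_rate = 0.3
initial_population_size = 8
num_generations = 10
archive_size = 16
max_token_limit = 2_000_000
```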

Sample TOML

```toml
[prompt_learning]
algorithm = "mipro"
task_app_url = "https://my-task-app.modal.run"

[prompt_learning.policy]
model = "gpt-4o-mini"
provider = "openai"
temperature = 0.2

[prompt_learning.initial_prompt]
id = "baseline"
messages = [
  { role = "system", pattern = "You are a helpful agent.", order = 0 }
]

[prompt_learning.mipro]
num_iterations = 30
num_evaluations_per_iteration = 5
batch_size = 32
env_name = "banking77"
meta_model = "gpt-4o"
few_shot_score_threshold = 0.8
max_token_limit = 2_000_000
max_spend_usd = 200.0
```