Complete reference for prompt optimization configuration files, including all algorithm parameters, model requirements, and best practices.

Configuration Structure

[prompt_learning]
algorithm = "gepa"  # or "mipro"
task_app_url = "http://127.0.0.1:8102"
task_app_id = "banking77"
evaluation_seeds = [50, 51, 52, ...]
validation_seeds = [0, 1, 2, ...]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "..." },
  { role = "user", pattern = "Query: {query}" }
]

[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
mutation_rate = 0.3
crossover_rate = 0.5
rollout_budget = 1000
max_concurrent_rollouts = 20
pareto_set_size = 20

Core Settings

[prompt_learning]

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| algorithm | "gepa" \| "mipro" | Yes | Optimization algorithm |
| task_app_url | string | Yes | Task app endpoint URL |
| task_app_id | string | No | Task app identifier |
| evaluation_seeds | array[int] | Yes | Training seed indices |
| validation_seeds | array[int] | Yes | Validation seed indices |

[prompt_learning.initial_prompt]

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| messages | array[object] | Yes | Initial prompt template |

Each message object:
  • role: "system", "user", or "assistant"
  • content: static message text (string)
  • pattern: template containing a {query} placeholder (string)
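The content/pattern distinction can be sketched in Python. `render_messages` below is a hypothetical helper (not part of the optimizer's API) showing how a pattern message is filled at rollout time while content messages pass through unchanged:

```python
# Sketch: how a `pattern` message might be rendered at rollout time.
# `render_messages` is a hypothetical helper for illustration only.
def render_messages(messages, query):
    rendered = []
    for msg in messages:
        if "pattern" in msg:
            # Template message: substitute the {query} placeholder.
            rendered.append({"role": msg["role"],
                             "content": msg["pattern"].format(query=query)})
        else:
            # Static message: passed through unchanged.
            rendered.append({"role": msg["role"], "content": msg["content"]})
    return rendered

msgs = [
    {"role": "system", "content": "You are a banking intent classification assistant."},
    {"role": "user", "pattern": "Query: {query}"},
]
print(render_messages(msgs, "card declined abroad")[1]["content"])
# Query: card declined abroad
```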

GEPA Parameters

[prompt_learning.gepa]

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| initial_population_size | int | 20 | Starting number of prompt variants |
| num_generations | int | 15 | Evolutionary cycles to run |
| mutation_rate | float | 0.3 | Probability of mutation (0-1) |
| crossover_rate | float | 0.5 | Probability of crossover (0-1) |
| rollout_budget | int | 1000 | Total task evaluations allowed |
| max_concurrent_rollouts | int | 20 | Parallel rollout limit |
| pareto_set_size | int | 20 | Pareto front size |
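The budget parameters interact: the rollout budget is spread across generations and variants. This back-of-the-envelope check is an assumption about how the budget is spent, not the optimizer's exact accounting; the numbers match the defaults above.

```python
# Rough budget arithmetic for the default GEPA settings.
rollout_budget = 1000
num_generations = 15
initial_population_size = 20

per_generation = rollout_budget / num_generations
per_variant = per_generation / initial_population_size
print(f"~{per_generation:.0f} rollouts/generation, ~{per_variant:.1f} per variant")
# ~67 rollouts/generation, ~3.3 per variant
```

With only a few rollouts per variant per generation, small budgets favor small populations.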

MIPRO Parameters

[prompt_learning.mipro]

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| num_iterations | int | 16 | Number of optimization iterations |
| num_evaluations_per_iteration | int | 6 | Prompt variants evaluated per iteration |
| batch_size | int | 6 | Concurrent evaluations per iteration |
| max_concurrent | int | 20 | Maximum concurrent rollouts |
| bootstrap_train_seeds | array[int] | Required | Seeds for bootstrap phase (few-shot collection) |
| online_pool | array[int] | Required | Seeds for mini-batch evaluation during optimization |
| test_pool | array[int] | Required | Seeds for final held-out evaluation |
| reference_pool | array[int] | Optional | Seeds for reference corpus (up to 50k tokens) |
| meta_model | string | "gpt-4o-mini" | Meta-model for instruction proposals |
| meta_model_provider | string | "openai" | Provider for meta-model ("openai", "groq", "google") |
| meta_model_inference_url | string | Provider default | Inference URL for meta-model |
| few_shot_score_threshold | float | 0.85 | Minimum score for bootstrap examples |
| max_token_limit | int | Optional | Maximum tokens per prompt |
| max_spend_usd | float | Optional | Maximum spend in USD |
| token_counting_model | string | Optional | Model for token counting |
| enforce_token_limit | bool | false | Enforce token limits strictly |
| spec_path | string | Optional | Path to system spec JSON file |
| spec_max_tokens | int | 5000 | Max tokens from spec to include |
| spec_include_examples | bool | true | Include examples from spec |
| spec_priority_threshold | int | 8 | Minimum priority for spec rules |
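The bootstrap phase can be pictured as a filter over scored traces. This sketch shows the selection rule implied by `few_shot_score_threshold`; the data shapes are illustrative, not the optimizer's internal format.

```python
# Sketch: keep only bootstrap traces scoring at or above the threshold
# as few-shot demos (record shapes are illustrative).
few_shot_score_threshold = 0.85

bootstrap_results = [
    {"seed": 0, "score": 0.92, "demo": "..."},
    {"seed": 1, "score": 0.60, "demo": "..."},
    {"seed": 2, "score": 0.88, "demo": "..."},
]

demos = [r for r in bootstrap_results if r["score"] >= few_shot_score_threshold]
print([r["seed"] for r in demos])  # [0, 2]
```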

Example Configurations

Banking77 (GEPA)

[prompt_learning]
algorithm = "gepa"
task_app_url = "http://127.0.0.1:8102"
task_app_id = "banking77"
evaluation_seeds = [50, 51, 52, ..., 79]
validation_seeds = [0, 1, 2, ..., 49]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a banking intent classification assistant." },
  { role = "user", pattern = "Customer Query: {query}\n\nClassify this query into one of 77 banking intents." }
]

[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
mutation_rate = 0.3
crossover_rate = 0.5
rollout_budget = 1000
max_concurrent_rollouts = 20
pareto_set_size = 20

HotpotQA (GEPA)

[prompt_learning]
algorithm = "gepa"
task_app_url = "http://127.0.0.1:8103"
task_app_id = "hotpotqa"
evaluation_seeds = [0, 1, 2, ..., 29]
validation_seeds = [30, 31, 32, ..., 79]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a question-answering assistant." },
  { role = "user", pattern = "Question: {query}\n\nAnswer this multi-hop question using reasoning." }
]

[prompt_learning.gepa]
initial_population_size = 30
num_generations = 20
mutation_rate = 0.25
crossover_rate = 0.6
rollout_budget = 1500
max_concurrent_rollouts = 25
pareto_set_size = 25

Supported Models

Policy Models (Task Execution)

Both GEPA and MIPRO support policy models from three providers:

OpenAI Models

  • gpt-4o
  • gpt-4o-mini
  • gpt-4.1
  • gpt-4.1-mini
  • gpt-4.1-nano
  • gpt-5
  • gpt-5-mini
  • gpt-5-nano
Explicitly REJECTED: gpt-5-pro (too expensive: 15/15/120 per 1M tokens)

Groq Models

  • gpt-oss-Xb pattern (e.g., gpt-oss-20b, openai/gpt-oss-120b)
  • llama-3.3-70b and variants (e.g., llama-3.3-70b-versatile)
  • qwen-32b, qwen3-32b, groq/qwen3-32b

Google/Gemini Models

  • gemini-2.5-pro
  • gemini-2.5-pro-gt200k
  • gemini-2.5-flash
  • gemini-2.5-flash-lite

Mutation Models (GEPA Only)

Used to generate prompt mutations/variations:
| Model | Provider | Common Usage |
| --- | --- | --- |
| openai/gpt-oss-120b | Groq | Most common |
| openai/gpt-oss-20b | Groq | Alternative |
| llama-3.3-70b-versatile | Groq | Alternative |
| llama3-groq-70b-8192-tool-use-preview | Groq | Alternative |
Nano models are REJECTED (too small for generation tasks)

Meta Models (MIPRO Only)

Used to generate instruction proposals:
| Model | Provider | Common Usage |
| --- | --- | --- |
| gpt-4o-mini | OpenAI | Most common default |
| gpt-4.1-mini | OpenAI | Alternative |
| gpt-4o | OpenAI | Higher quality, more expensive |
Nano models are REJECTED (too small for generation tasks)

Model Configuration

# Policy model (both algorithms)
[prompt_learning.policy]
model = "openai/gpt-oss-20b"
provider = "groq"
inference_url = "https://api.groq.com/openai/v1"

# Mutation model (GEPA only)
[prompt_learning.gepa.mutation]
llm_model = "openai/gpt-oss-120b"
llm_provider = "groq"
llm_inference_url = "https://api.groq.com/openai/v1"

# Meta model (MIPRO only)
[prompt_learning.mipro]
meta_model = "gpt-4o-mini"
meta_model_provider = "openai"
meta_model_inference_url = "https://api.openai.com/v1"

Policy Configuration

[prompt_learning.policy]

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | Policy model identifier |
| provider | string | Yes | Provider ("openai", "groq", "google") |
| inference_url | string | Yes | Inference endpoint URL |
| inference_mode | string | Optional | "synth_hosted" or custom |
| temperature | float | Optional | Sampling temperature (default: 0.0) |
| max_completion_tokens | int | Optional | Maximum tokens (default: 512) |

GEPA Mutation Configuration

[prompt_learning.gepa.mutation]

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| rate | float | 0.3 | Probability of mutation (0-1) |
| llm_model | string | Optional | LLM for guided mutations |
| llm_provider | string | Optional | Provider for mutation LLM |
| llm_inference_url | string | Optional | Inference URL for mutation LLM |
| proposer_type | string | "dspy" | "dspy" or "spec" |

System Spec Configuration

Both GEPA and MIPRO support system specifications (specs) for constraint-aware optimization.

[prompt_learning.gepa] Spec Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| proposer_type | string | "dspy" | "dspy" or "spec" (requires spec_path) |
| spec_path | string | Optional | Path to system spec JSON file (required if proposer_type = "spec") |
| spec_max_tokens | int | 5000 | Max tokens for spec context in mutation prompts |
| spec_include_examples | bool | true | Include examples from spec |
| spec_priority_threshold | int | Optional | Only include rules with priority >= threshold |

[prompt_learning.mipro] Spec Parameters

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| spec_path | string | Optional | Path to system spec JSON file |
| spec_max_tokens | int | 5000 | Max tokens for spec context in meta-prompt |
| spec_include_examples | bool | true | Include examples from spec |
| spec_priority_threshold | int | Optional | Only include rules with priority >= threshold |
See System Specifications for complete details on creating and using specs.
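As a sketch of what `spec_priority_threshold` does: rules below the threshold are dropped before the spec context is built. The spec JSON shape here is a simplified guess for illustration; see the System Specifications page for the real schema.

```python
# Sketch of priority filtering: only rules with priority >= threshold
# reach the mutation/meta prompt. Spec shape is illustrative only.
spec = {
    "rules": [
        {"text": "Never reveal account numbers.", "priority": 10},
        {"text": "Prefer concise answers.", "priority": 5},
    ]
}
spec_priority_threshold = 8

kept = [r for r in spec["rules"] if r["priority"] >= spec_priority_threshold]
print(len(kept))  # 1
```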

Multi-Stage Pipeline Configuration

Both algorithms support multi-stage pipelines:

GEPA Multi-Stage

[[prompt_learning.gepa.modules]]
module_id = "classifier"
max_instruction_slots = 3
max_tokens = 512

[[prompt_learning.gepa.modules]]
module_id = "calibrator"
max_instruction_slots = 2
max_tokens = 256

MIPRO Multi-Stage

[[prompt_learning.mipro.modules]]
module_id = "classifier"
max_instruction_slots = 3
max_demo_slots = 5

[[prompt_learning.mipro.modules]]
module_id = "calibrator"
max_instruction_slots = 3
max_demo_slots = 5

Complete Example Configurations

Banking77 (GEPA)

[prompt_learning]
algorithm = "gepa"
task_app_url = "http://127.0.0.1:8102"
task_app_id = "banking77"
evaluation_seeds = [50, 51, 52, ..., 79]
validation_seeds = [0, 1, 2, ..., 49]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a banking intent classification assistant." },
  { role = "user", pattern = "Customer Query: {query}\n\nClassify this query into one of 77 banking intents." }
]

[prompt_learning.policy]
model = "openai/gpt-oss-20b"
provider = "groq"
inference_url = "https://api.groq.com/openai/v1"
temperature = 0.0
max_completion_tokens = 128

[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
mutation_rate = 0.3
crossover_rate = 0.5
rollout_budget = 1000
max_concurrent_rollouts = 20
pareto_set_size = 20

[prompt_learning.gepa.mutation]
llm_model = "openai/gpt-oss-120b"
llm_provider = "groq"
llm_inference_url = "https://api.groq.com/openai/v1"

Banking77 (MIPRO)

[prompt_learning]
algorithm = "mipro"
task_app_url = "https://synth-laboratories-dev--synth-banking77-web-web.modal.run"
task_app_id = "banking77"

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a banking intent classification assistant." },
  { role = "user", pattern = "Customer Query: {query}\n\nClassify this query into one of 77 banking intents." }
]

[prompt_learning.policy]
model = "openai/gpt-oss-20b"
provider = "groq"
inference_url = "https://api.groq.com/openai/v1"
temperature = 0.0
max_completion_tokens = 128

[prompt_learning.mipro]
num_iterations = 16
num_evaluations_per_iteration = 6
batch_size = 6
max_concurrent = 20
bootstrap_train_seeds = [0, 1, 2, 3, 4]
online_pool = [5, 6, 7, 8, 9]
test_pool = [20, 21, 22, 23, 24]
meta_model = "gpt-4o-mini"
meta_model_provider = "openai"
meta_model_inference_url = "https://api.openai.com/v1"
few_shot_score_threshold = 0.85

Best Practices

GEPA Best Practices

  1. Population Size: Start with 20-30 variants for most tasks; increase for complex tasks.
  2. Generations: 10-15 generations are usually sufficient; run more for complex optimization problems.
  3. Mutation Rate: 0.2-0.4 works well. Higher values favor exploration; lower values favor exploitation.
  4. Rollout Budget: Allocate 50-100 rollouts per generation for stable score estimates.
  5. Concurrency: Set max_concurrent_rollouts based on task app capacity (typically 10-50).
  6. Mutation Model: Use gpt-oss-120b for the best-quality mutations, or gpt-oss-20b for faster, cheaper runs.

MIPRO Best Practices

  1. Bootstrap Seeds: Use 5-15 seeds for the bootstrap phase. A higher few_shot_score_threshold yields fewer but better examples.
  2. Iterations: 10-20 iterations are usually sufficient; run more for complex tasks.
  3. Evaluations per Iteration: 4-6 variants per iteration balances exploration against cost.
  4. Meta Model: gpt-4o-mini is the sweet spot for quality and cost; use gpt-4o for higher quality.
  5. Reference Pool: Optional but recommended. 50-100 seeds provide rich context (up to 50k tokens).
  6. Token Budget: Set max_token_limit and max_spend_usd to control costs.
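The token-budget rule amounts to a running cost cap. This sketch shows the idea behind `max_spend_usd` with placeholder numbers; the real enforcement is handled by the optimizer, and the price figure is not real provider pricing.

```python
# Illustrative spend guard: stop launching rollouts once the estimated
# cost would cross the cap. Pricing is a placeholder.
max_spend_usd = 2.0
price_per_1m_tokens = 0.15  # placeholder $/1M tokens

spent = 0.0
completed = 0
for tokens_used in [200_000] * 100:  # simulated per-rollout token counts
    cost = tokens_used / 1_000_000 * price_per_1m_tokens
    if spent + cost > max_spend_usd:
        break  # budget exhausted; stop scheduling rollouts
    spent += cost
    completed += 1

print(completed)  # 66
```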

General Best Practices

  1. Seed Splitting: Keep training, validation, and test seeds separate; never let them overlap.
  2. Baseline Prompt: Start with a clear, task-specific baseline. A better baseline yields better optimization.
  3. Model Selection: Use Groq models (e.g., gpt-oss-20b) for cost-effective policy execution.
  4. Concurrency: Match max_concurrent to your task app's capacity; setting it too high triggers rate limits.
  5. Monitoring: Track accuracy, token count, and cost throughout optimization.
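A quick way to enforce the first rule is a disjointness check over the seed pools before launching a run. `check_disjoint` is a hypothetical helper, not part of the toolkit:

```python
# Hypothetical helper enforcing seed-pool separation: raises if any
# seed appears in more than one pool.
def check_disjoint(**pools):
    seen = {}
    for name, seeds in pools.items():
        for s in seeds:
            if s in seen:
                raise ValueError(f"seed {s} appears in both {seen[s]} and {name}")
            seen[s] = name

check_disjoint(
    evaluation_seeds=list(range(50, 80)),
    validation_seeds=list(range(0, 50)),
)
print("pools are disjoint")
```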