## Configuration Structure

### Core Settings

#### `[prompt_learning]`
| Field | Type | Required | Description |
|---|---|---|---|
| `algorithm` | `"gepa"` \| `"mipro"` | Yes | Optimization algorithm |
| `task_app_url` | string | Yes | Task app endpoint URL |
| `task_app_id` | string | No | Task app identifier |
| `evaluation_seeds` | array[int] | Yes | Training seed indices |
| `validation_seeds` | array[int] | Yes | Validation seed indices |
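
A minimal sketch of this block; the URL and seed values below are placeholders:

```toml
[prompt_learning]
algorithm = "gepa"                       # or "mipro"
task_app_url = "http://localhost:8001"   # illustrative endpoint
evaluation_seeds = [0, 1, 2, 3, 4]       # training seeds
validation_seeds = [5, 6, 7]             # validation seeds
```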
#### `[prompt_learning.initial_prompt]`
| Field | Type | Required | Description |
|---|---|---|---|
| `messages` | array[object] | Yes | Initial prompt template |

Each message object has:

- `role`: `"system"`, `"user"`, or `"assistant"`
- `content`: static content (string)
- `pattern`: template with a `{query}` placeholder (string)
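
For instance, a classification-style template might look like this (the message text is illustrative):

```toml
[prompt_learning.initial_prompt]
messages = [
  # Static system message uses `content`
  { role = "system", content = "You are a precise classification assistant." },
  # Templated user message uses `pattern` with the {query} placeholder
  { role = "user", pattern = "Classify this query: {query}" },
]
```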
## GEPA Parameters

### `[prompt_learning.gepa]`
| Parameter | Type | Default | Description |
|---|---|---|---|
| `initial_population_size` | int | 20 | Starting number of prompt variants |
| `num_generations` | int | 15 | Evolutionary cycles to run |
| `mutation_rate` | float | 0.3 | Probability of mutation (0-1) |
| `crossover_rate` | float | 0.5 | Probability of crossover (0-1) |
| `rollout_budget` | int | 1000 | Total task evaluations allowed |
| `max_concurrent_rollouts` | int | 20 | Parallel rollout limit |
| `pareto_set_size` | int | 20 | Pareto front size |
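
The same defaults written out as TOML; in a typical setup you would set only the fields you want to change:

```toml
[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
mutation_rate = 0.3
crossover_rate = 0.5
rollout_budget = 1000
max_concurrent_rollouts = 20
pareto_set_size = 20
```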
## MIPRO Parameters

### `[prompt_learning.mipro]`
| Parameter | Type | Default | Description |
|---|---|---|---|
| `num_iterations` | int | 16 | Number of optimization iterations |
| `num_evaluations_per_iteration` | int | 6 | Prompt variants evaluated per iteration |
| `batch_size` | int | 6 | Concurrent evaluations per iteration |
| `max_concurrent` | int | 20 | Maximum concurrent rollouts |
| `bootstrap_train_seeds` | array[int] | Required | Seeds for bootstrap phase (few-shot collection) |
| `online_pool` | array[int] | Required | Seeds for mini-batch evaluation during optimization |
| `test_pool` | array[int] | Required | Seeds for final held-out evaluation |
| `reference_pool` | array[int] | Optional | Seeds for reference corpus (up to 50k tokens) |
| `meta_model` | string | `"gpt-4o-mini"` | Meta-model for instruction proposals |
| `meta_model_provider` | string | `"openai"` | Provider for meta-model (`"openai"`, `"groq"`, `"google"`) |
| `meta_model_inference_url` | string | Provider default | Inference URL for meta-model |
| `few_shot_score_threshold` | float | 0.85 | Minimum score for bootstrap examples |
| `max_token_limit` | int | Optional | Maximum tokens per prompt |
| `max_spend_usd` | float | Optional | Maximum spend in USD |
| `token_counting_model` | string | Optional | Model for token counting |
| `enforce_token_limit` | bool | false | Enforce token limits strictly |
| `spec_path` | string | Optional | Path to system spec JSON file |
| `spec_max_tokens` | int | 5000 | Max tokens from spec to include |
| `spec_include_examples` | bool | true | Include examples from spec |
| `spec_priority_threshold` | int | 8 | Minimum priority for spec rules |
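
A sketch combining the required pools with the documented defaults; the seed values are placeholders:

```toml
[prompt_learning.mipro]
num_iterations = 16
num_evaluations_per_iteration = 6
batch_size = 6
max_concurrent = 20
bootstrap_train_seeds = [0, 1, 2, 3, 4]   # few-shot collection
online_pool = [5, 6, 7, 8, 9]             # mini-batch evaluation
test_pool = [10, 11, 12]                  # held-out final evaluation
meta_model = "gpt-4o-mini"
meta_model_provider = "openai"
few_shot_score_threshold = 0.85
```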
## Example Configurations

### Banking77 (GEPA)
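
A compact sketch for Banking77-style intent classification, assuming the fields documented above (URL, seeds, and prompt text are illustrative):

```toml
[prompt_learning]
algorithm = "gepa"
task_app_url = "http://localhost:8001"   # your Banking77 task app
evaluation_seeds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
validation_seeds = [10, 11, 12, 13, 14]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "Classify the customer query into one of the 77 Banking77 intents." },
  { role = "user", pattern = "Query: {query}" },
]
```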
### HotpotQA (GEPA)
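
The HotpotQA variant changes only the task app and the prompt (again illustrative):

```toml
[prompt_learning]
algorithm = "gepa"
task_app_url = "http://localhost:8002"   # your HotpotQA task app
evaluation_seeds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
validation_seeds = [10, 11, 12, 13, 14]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "Answer the multi-hop question using the provided context." },
  { role = "user", pattern = "Question: {query}" },
]
```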
## Supported Models

### Policy Models (Task Execution)

Both GEPA and MIPRO support policy models from three providers:

#### OpenAI Models
- `gpt-4o`
- `gpt-4o-mini`
- `gpt-4.1`
- `gpt-4.1-mini`
- `gpt-4.1-nano`
- `gpt-5`
- `gpt-5-mini`
- `gpt-5-nano`
- `gpt-5-pro` (too expensive: $120 per 1M tokens)

#### Groq Models

- `gpt-oss-Xb` pattern (e.g., `gpt-oss-20b`, `openai/gpt-oss-120b`)
- `llama-3.3-70b` and variants (e.g., `llama-3.3-70b-versatile`)
- `qwen-32b`, `qwen3-32b`, `groq/qwen3-32b`

#### Google/Gemini Models

- `gemini-2.5-pro`
- `gemini-2.5-pro-gt200k`
- `gemini-2.5-flash`
- `gemini-2.5-flash-lite`
### Mutation Models (GEPA Only)

Used to generate prompt mutations/variations:

| Model | Provider | Common Usage |
|---|---|---|
| `openai/gpt-oss-120b` | Groq | Most common |
| `openai/gpt-oss-20b` | Groq | Alternative |
| `llama-3.3-70b-versatile` | Groq | Alternative |
| `llama3-groq-70b-8192-tool-use-preview` | Groq | Alternative |
### Meta Models (MIPRO Only)

Used to generate instruction proposals:

| Model | Provider | Common Usage |
|---|---|---|
| `gpt-4o-mini` | OpenAI | Most common default |
| `gpt-4.1-mini` | OpenAI | Alternative |
| `gpt-4o` | OpenAI | Higher quality, more expensive |
## Model Configuration

### Policy Configuration

#### `[prompt_learning.policy]`
| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Policy model identifier |
| `provider` | string | Yes | Provider (`"openai"`, `"groq"`, `"google"`) |
| `inference_url` | string | Yes | Inference endpoint URL |
| `inference_mode` | string | Optional | `"synth_hosted"` or custom |
| `temperature` | float | Optional | Sampling temperature (default: 0.0) |
| `max_completion_tokens` | int | Optional | Maximum tokens (default: 512) |
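
A sketch of a Groq-hosted policy; the inference URL is illustrative, so substitute your provider or synth-hosted endpoint:

```toml
[prompt_learning.policy]
model = "gpt-oss-20b"
provider = "groq"
inference_url = "https://api.groq.com/openai/v1"   # illustrative endpoint
temperature = 0.0
max_completion_tokens = 512
```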
### GEPA Mutation Configuration

#### `[prompt_learning.gepa.mutation]`
| Parameter | Type | Default | Description |
|---|---|---|---|
| `rate` | float | 0.3 | Probability of mutation (0-1) |
| `llm_model` | string | Optional | LLM for guided mutations |
| `llm_provider` | string | Optional | Provider for mutation LLM |
| `llm_inference_url` | string | Optional | Inference URL for mutation LLM |
| `proposer_type` | string | `"dspy"` | `"dspy"` or `"spec"` |
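
A sketch wiring up the most common mutation model from the table above (values illustrative):

```toml
[prompt_learning.gepa.mutation]
rate = 0.3
llm_model = "openai/gpt-oss-120b"   # most common mutation model
llm_provider = "groq"
proposer_type = "dspy"
```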
## System Spec Configuration

Both GEPA and MIPRO support system specifications (specs) for constraint-aware optimization.

### `[prompt_learning.gepa]` Spec Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `proposer_type` | string | `"dspy"` | `"dspy"` or `"spec"` (requires `spec_path`) |
| `spec_path` | string | Optional | Path to system spec JSON file (required if `proposer_type = "spec"`) |
| `spec_max_tokens` | int | 5000 | Max tokens for spec context in mutation prompts |
| `spec_include_examples` | bool | true | Include examples from spec |
| `spec_priority_threshold` | int | Optional | Only include rules with priority >= threshold |
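
A sketch enabling the spec proposer for GEPA; the spec path is a placeholder:

```toml
[prompt_learning.gepa]
proposer_type = "spec"                   # spec-guided mutations require spec_path
spec_path = "specs/system_spec.json"     # illustrative path
spec_max_tokens = 5000
spec_include_examples = true
spec_priority_threshold = 8
```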
### `[prompt_learning.mipro]` Spec Parameters
| Parameter | Type | Default | Description |
|---|---|---|---|
| `spec_path` | string | Optional | Path to system spec JSON file |
| `spec_max_tokens` | int | 5000 | Max tokens for spec context in meta-prompt |
| `spec_include_examples` | bool | true | Include examples from spec |
| `spec_priority_threshold` | int | Optional | Only include rules with priority >= threshold |
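
The MIPRO version is the same shape, without `proposer_type`:

```toml
[prompt_learning.mipro]
spec_path = "specs/system_spec.json"     # illustrative path
spec_max_tokens = 5000
```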
## Multi-Stage Pipeline Configuration

Both algorithms support multi-stage pipelines:

### GEPA Multi-Stage

### MIPRO Multi-Stage
## Complete Example Configurations

### Banking77 (GEPA)
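
A fuller sketch assembled from the blocks documented above (URLs, seeds, and prompt text are illustrative):

```toml
[prompt_learning]
algorithm = "gepa"
task_app_url = "http://localhost:8001"
evaluation_seeds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
validation_seeds = [10, 11, 12, 13, 14]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "Classify the customer query into one of the 77 Banking77 intents." },
  { role = "user", pattern = "Query: {query}" },
]

[prompt_learning.policy]
model = "gpt-oss-20b"
provider = "groq"
inference_url = "https://api.groq.com/openai/v1"

[prompt_learning.gepa]
initial_population_size = 20
num_generations = 15
rollout_budget = 1000
max_concurrent_rollouts = 20

[prompt_learning.gepa.mutation]
rate = 0.3
llm_model = "openai/gpt-oss-120b"
llm_provider = "groq"
```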
### Banking77 (MIPRO)
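
The MIPRO counterpart swaps the algorithm and optimizer block (same caveats apply):

```toml
[prompt_learning]
algorithm = "mipro"
task_app_url = "http://localhost:8001"
evaluation_seeds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
validation_seeds = [10, 11, 12, 13, 14]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "Classify the customer query into one of the 77 Banking77 intents." },
  { role = "user", pattern = "Query: {query}" },
]

[prompt_learning.policy]
model = "gpt-4o-mini"
provider = "openai"
inference_url = "https://api.openai.com/v1"

[prompt_learning.mipro]
num_iterations = 16
num_evaluations_per_iteration = 6
bootstrap_train_seeds = [0, 1, 2, 3, 4]
online_pool = [5, 6, 7, 8, 9]
test_pool = [15, 16, 17, 18, 19]     # disjoint from training seeds
meta_model = "gpt-4o-mini"
meta_model_provider = "openai"
few_shot_score_threshold = 0.85
max_spend_usd = 10.0                 # illustrative budget
```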
## Best Practices

### GEPA Best Practices
- Population Size: Start with 20-30 variants for most tasks; increase for complex tasks.
- Generations: 10-15 generations are usually sufficient; run more for complex optimization.
- Mutation Rate: 0.2-0.4 works well. Higher = more exploration, lower = more exploitation.
- Rollout Budget: Allocate 50-100 rollouts per generation for stable estimates (see the sanity check below).
- Concurrency: Set `max_concurrent_rollouts` based on task app capacity (typically 10-50).
- Mutation Model: Use `gpt-oss-120b` for the best mutation quality, `gpt-oss-20b` for faster/cheaper runs.
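
As a budget sanity check on the defaults: 1000 rollouts over 15 generations is roughly 67 per generation, inside the 50-100 guideline. Scale `rollout_budget` with `num_generations` accordingly:

```toml
[prompt_learning.gepa]
num_generations = 15
rollout_budget = 1000    # 1000 / 15 is about 67 rollouts per generation
```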
### MIPRO Best Practices
- Bootstrap Seeds: Use 5-15 seeds for the bootstrap phase. A higher `few_shot_score_threshold` = fewer but better examples.
- Iterations: 10-20 iterations are usually sufficient; run more for complex tasks.
- Evaluations per Iteration: 4-6 variants per iteration balances exploration vs. cost.
- Meta Model: `gpt-4o-mini` is the sweet spot (quality + cost). Use `gpt-4o` for higher quality.
- Reference Pool: Optional but recommended. 50-100 seeds provide rich context (up to 50k tokens).
- Token Budget: Set `max_token_limit` and `max_spend_usd` to control costs, as sketched below.
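
The cost-control knobs from the parameter table, sketched with illustrative limits:

```toml
[prompt_learning.mipro]
meta_model = "gpt-4o-mini"      # quality/cost sweet spot
max_token_limit = 4000          # illustrative cap per prompt
max_spend_usd = 10.0            # illustrative budget
enforce_token_limit = true
```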
### General Best Practices
- Seed Splitting: Keep training, validation, and test seeds separate. Never overlap them (see the sketch after this list).
- Baseline Prompt: Start with a clear, task-specific baseline. A better baseline means better optimization.
- Model Selection: Use Groq models (e.g., `gpt-oss-20b`) for cost-effective policy execution.
- Concurrency: Match `max_concurrent` to your task app's capacity. Too high = rate limits.
- Monitoring: Track accuracy, token count, and cost throughout optimization.
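
For example, disjoint seed splits might look like this (ranges illustrative):

```toml
[prompt_learning]
evaluation_seeds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]   # training
validation_seeds = [10, 11, 12, 13, 14]             # validation

[prompt_learning.mipro]
test_pool = [15, 16, 17, 18, 19]                    # held-out; never overlaps the above
```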