At a glance
When to use
- You have a dataset-backed container with multiple examples
- You want to optimize a prompt against a reward signal
- You can deterministically map `seed` → example
- You’re optimizing offline on a fixed dataset (the backend drives `/rollout` calls to your container; “offline” means batch optimization, not “no rollouts”)
- You want broad exploration and diverse prompt variants (evolutionary search)
What you need
- A container that implements `/health`, `/task_info`, and `/rollout`
- A dataset with deterministic seed mapping (`seed` → example)
- A reward function returned as `metrics.mean_return`
- A GEPA config (TOML) with train + validation seeds
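To make the contract concrete, here is a minimal sketch of the three handlers a container exposes. The route names and the `metrics.mean_return` field come from this page; the payload shapes, the stand-in dataset, and the reward logic are illustrative assumptions, not the real schema.

```python
# Hypothetical container handlers; only the route names and the
# metrics.mean_return field are taken from the docs above.

DATASET = ["add 2+2", "add 3+5", "add 10+1"]  # stand-in dataset

def health() -> dict:
    # GET /health — liveness probe
    return {"status": "ok"}

def task_info() -> dict:
    # GET /task_info — advertise the task and dataset size (shape assumed)
    return {"task": "arithmetic-demo", "num_examples": len(DATASET)}

def rollout(payload: dict) -> dict:
    # POST /rollout — map the seed deterministically to one example,
    # run the policy on it (stubbed here), and report the reward
    # as metrics.mean_return.
    seed = payload["seed"]
    example = DATASET[seed % len(DATASET)]  # deterministic seed -> example
    reward = 1.0 if "add" in example else 0.0  # stub reward, replace with yours
    return {"metrics": {"mean_return": reward}, "example": example}
```

Because the seed fully determines the example, the same seed always scores the same candidate prompt against the same input, which is what makes train/validation seed splits meaningful.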
How evaluation works (no prompt leakage)
GEPA never sends optimized prompts to your container. The backend registers each candidate with an interceptor and passes your container an `inference_url` to call. The interceptor substitutes the candidate prompt at LLM time and captures traces.
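In other words, the container just sends an ordinary chat request to the URL it is handed. A minimal sketch, assuming an OpenAI-compatible chat-completions endpoint behind `inference_url` (the path suffix and payload shape are assumptions):

```python
import json

def build_chat_request(inference_url: str, user_text: str) -> tuple[str, bytes]:
    """Build the request a rollout would POST to the backend-supplied
    inference_url. The interceptor behind that URL injects the candidate
    prompt, so the container never handles optimized prompts itself."""
    url = inference_url.rstrip("/") + "/chat/completions"  # assumed path
    payload = {
        "model": "policy",  # placeholder; the interceptor controls the model
        "messages": [{"role": "user", "content": user_text}],
    }
    return url, json.dumps(payload).encode()
```

The container only supplies the task-specific user content; the system/candidate prompt is spliced in server-side, which is why no prompt text ever needs to leak into your container.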
Basic workflow
- Run your container (in-process, local + tunnel, or deployed)
- Create a GEPA config
- Submit a prompt optimization job
- Poll until complete and fetch the best prompt
Minimal example
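The four workflow steps can be sketched as below. The job statuses, the polling shape, and the commented submission/fetch calls are assumptions standing in for the real SDK surface (see the GEPA Reference), not its actual names.

```python
import time

TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}  # assumed names

def wait_for_job(get_status, interval_s: float = 5.0) -> str:
    """Poll get_status() — a stand-in for fetching the job's status over
    HTTP — until the job reaches a terminal status, then return it."""
    while True:
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(interval_s)

# Sketch of the full flow; each call below is a hypothetical placeholder
# for the corresponding SDK/HTTP call:
#   job_id = submit_prompt_optimization_job(config_toml)
#   final = wait_for_job(lambda: fetch_status(job_id))
#   if final == "succeeded":
#       best_prompt = fetch_best_prompt(job_id)
```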
Config essentials
Your config must include:

- `prompt_learning.container_url`
- `prompt_learning.container_api_key` (`ENVIRONMENT_API_KEY`)
- `prompt_learning.initial_prompt`
- `prompt_learning.gepa.evaluation.seeds`
- `prompt_learning.gepa.evaluation.validation_seeds`
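A sketch of a TOML config carrying just the required keys listed above; the URL, prompt text, and seed values are illustrative, and real configs may carry additional GEPA settings:

```toml
[prompt_learning]
container_url = "https://my-container.example.com"  # your running container
container_api_key = "..."            # the container's ENVIRONMENT_API_KEY
initial_prompt = "You are a concise assistant. Solve the task."

[prompt_learning.gepa.evaluation]
seeds = [0, 1, 2, 3, 4, 5, 6, 7]          # training seeds
validation_seeds = [100, 101, 102, 103]   # held-out validation seeds
```

Keeping `validation_seeds` disjoint from `seeds` gives the search a held-out score, so the best prompt is selected on examples it was not optimized against.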
Container tips
- Use `seed` to select examples deterministically
- Return `metrics.mean_return` for each rollout
- Route LLM calls through `policy_config.inference_url`
Next steps
- Container overview: /sdk/container/overview
- GEPA config + SDK surface: GEPA Reference