Using objectives for prompt optimization
GEPA/MIPRO can optimize prompts against objectives that combine reward, time spent, and cost. The typical pattern (see the sketch after this list):
- Record rewards during the rollout (event or outcome)
- Record time/cost usage for the same run
- Define objectives that trade off quality vs. efficiency:
  - Quality: maximize reward (outcome total or verifier score)
  - Latency: minimize wall time / steps
  - Cost: minimize token or USD usage
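A minimal sketch of such a combined objective, assuming each run produces a stats record with reward, wall time, and spend. The `RolloutStats` class, its field names, and the weights are illustrative assumptions, not a GEPA/MIPRO API:

```python
from dataclasses import dataclass

@dataclass
class RolloutStats:
    total_reward: float   # quality: outcome total or verifier score
    wall_time_s: float    # latency: wall time for the run
    usd_cost: float       # cost: USD (or token-equivalent) spend

def combined_objective(stats: RolloutStats,
                       latency_weight: float = 0.01,
                       cost_weight: float = 1.0) -> float:
    """Higher is better: maximize reward, penalize time and spend."""
    return (stats.total_reward
            - latency_weight * stats.wall_time_s
            - cost_weight * stats.usd_cost)

# Two candidate prompts with equal reward: the faster, cheaper one scores higher.
fast = RolloutStats(total_reward=5.0, wall_time_s=12.0, usd_cost=0.02)
slow = RolloutStats(total_reward=5.0, wall_time_s=45.0, usd_cost=0.10)
assert combined_objective(fast) > combined_objective(slow)
```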
Event Rewards
Per-step rewards attached to individual events. Use for credit assignment.
Schema
| Field | Type | Description |
|---|---|---|
| event_id | int | FK to event |
| reward_value | float | Reward (positive or negative) |
| reward_type | str | achievement, achievement_delta, unique_achievement_delta, shaped, sparse, penalty, evaluator, human |
| key | str | Achievement name or reward identifier |
| source | str | environment, runner, evaluator, human |
| annotation | dict | Additional context |
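The same schema expressed as a hypothetical Python dataclass. The `EventReward` name, the default for `annotation`, and the example achievement key are assumptions; the fields mirror the table above:

```python
from dataclasses import dataclass, field

@dataclass
class EventReward:
    event_id: int        # FK to event
    reward_value: float  # reward (positive or negative)
    reward_type: str     # one of the reward types listed below
    key: str             # achievement name or reward identifier
    source: str          # environment, runner, evaluator, human
    annotation: dict = field(default_factory=dict)  # additional context

# Example: the agent unlocked the "collect_wood" achievement at event 42.
reward = EventReward(event_id=42, reward_value=1.0,
                     reward_type="achievement", key="collect_wood",
                     source="environment")
```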
Reward Types
- achievement: Binary, one per achievement unlocked
- achievement_delta: Count of achievements unlocked this step
- unique_achievement_delta: Count of new achievements this episode
- shaped: Dense signal for incremental progress
- sparse: Reward only at milestones
- evaluator: From automated verifier
- human: Human annotation
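A small illustrative helper (not part of the schema; the function and its arguments are assumptions) showing how achievement_delta and unique_achievement_delta differ for the same step:

```python
def achievement_deltas(unlocked_this_step: set[str],
                       seen_this_episode: set[str]) -> tuple[int, int]:
    # achievement_delta counts everything unlocked at this step;
    # unique_achievement_delta counts only first-time unlocks for the episode.
    achievement_delta = len(unlocked_this_step)
    unique_achievement_delta = len(unlocked_this_step - seen_this_episode)
    return achievement_delta, unique_achievement_delta

# Step unlocks "collect_wood" (seen earlier this episode) and "eat_cow" (new):
print(achievement_deltas({"collect_wood", "eat_cow"}, {"collect_wood"}))  # (2, 1)
```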
Outcome Rewards
Episode-level summary. Use for filtering and ranking.
Schema
| Field | Type | Description |
|---|---|---|
| session_id | str | FK to session |
| total_reward | int | Episode score (e.g., unique achievements) |
| achievements_count | int | Milestones reached |
| total_steps | int | Episode length |
| reward_metadata | dict | Achievements list, final state, etc. |
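A hypothetical sketch of this schema and the filter/rank use it supports. The `OutcomeReward` class, the session IDs, and the score-per-step ranking are assumptions, not a fixed API:

```python
from dataclasses import dataclass, field

@dataclass
class OutcomeReward:
    session_id: str          # FK to session
    total_reward: int        # episode score (e.g., unique achievements)
    achievements_count: int  # milestones reached
    total_steps: int         # episode length
    reward_metadata: dict = field(default_factory=dict)  # achievements, final state, etc.

episodes = [
    OutcomeReward("sess-a", total_reward=4, achievements_count=4, total_steps=120),
    OutcomeReward("sess-b", total_reward=4, achievements_count=4, total_steps=300),
    OutcomeReward("sess-c", total_reward=0, achievements_count=0, total_steps=50),
]

# Filter out episodes with no milestones, then rank by score per step
# (one way to trade off quality against episode length).
kept = [e for e in episodes if e.achievements_count > 0]
ranked = sorted(kept, key=lambda e: e.total_reward / e.total_steps, reverse=True)
print([e.session_id for e in ranked])  # ['sess-a', 'sess-b']
```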