Accurate for Stack 0.1.0 and later unless a section cites a newer release. See Stack Changelog.
Stack runs GEPA prompt optimization locally and surfaces the loop in the cockpit, so you can optimize a prompt against your own task and inspect every step as a receipt.

What GEPA does

GEPA iteratively proposes and evaluates prompt variants against a task’s training and held-out sets, keeping changes that improve the score. Stack drives the optimizer service, runs the rollouts, and records the result as a StackEval packet you can replay.
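The propose, evaluate, keep cycle described above can be illustrated with a small sketch. This is plain greedy hill-climbing swapped in for illustration, not GEPA's actual proposal or selection strategy and not Stack's code. Every name here (`score`, `optimize`, `toy_run`, `make_proposer`, the hint format) is hypothetical, the "model" is a keyword matcher standing in for real rollouts, and accepting on the training score while only reporting the held-out score is an assumption of this sketch.

```python
from typing import Callable, Iterable, List, Tuple

Example = Tuple[str, str]  # (input text, expected label)


def score(prompt: str, examples: List[Example], run: Callable[[str, str], str]) -> float:
    """Fraction of examples for which run(prompt, text) returns the expected label."""
    if not examples:
        return 0.0
    hits = sum(1 for text, label in examples if run(prompt, text) == label)
    return hits / len(examples)


def optimize(
    initial_prompt: str,
    propose: Callable[[str], str],
    run: Callable[[str, str], str],
    train: List[Example],
    heldout: List[Example],
    steps: int = 10,
) -> Tuple[str, float, float]:
    """Greedy loop: keep a candidate only if it strictly improves the training score."""
    best = initial_prompt
    best_score = score(best, train, run)
    for _ in range(steps):
        candidate = propose(best)
        candidate_score = score(candidate, train, run)
        if candidate_score > best_score:
            best, best_score = candidate, candidate_score
    # The held-out set is used only for the final report in this sketch.
    return best, best_score, score(best, heldout, run)


def toy_run(prompt: str, text: str) -> str:
    """Hypothetical stand-in for a model: 'label=keyword' lines in the prompt act as rules."""
    for line in prompt.splitlines():
        if "=" in line:
            label, keyword = line.split("=", 1)
            if keyword in text:
                return label
    return "unknown"


def make_proposer(hints: Iterable[str]) -> Callable[[str], str]:
    """Hypothetical proposer: append the next unused hint, or return the prompt unchanged."""
    remaining = iter(hints)

    def propose(prompt: str) -> str:
        hint = next(remaining, None)
        return prompt if hint is None else prompt + "\n" + hint

    return propose


TRAIN = [("I lost my card", "card_lost"), ("how do I top up", "top_up")]
HELDOUT = [("my card is lost", "card_lost"), ("need a refund please", "refund")]
HINTS = ["card_lost=lost", "top_up=top up", "refund=refund"]

if __name__ == "__main__":
    prompt, train_score, heldout_score = optimize(
        "Classify the banking query.", make_proposer(HINTS), toy_run, TRAIN, HELDOUT
    )
    print(prompt)
    print(f"train={train_score:.2f} heldout={heldout_score:.2f}")
```

In this toy run the refund hint is rejected because it does not raise the training score, which is why the held-out score ends up lower than the training score. That gap is the reason to track a held-out set at all.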

Run a built-in task

Stack ships StackEval GEPA tasks you can run end-to-end:
Task                    What it optimizes
banking77-local-gepa    intent classification on the Banking77 set
crafter-local-gepa      an agent policy prompt on Crafter
Prepare and launch a task from the cockpit harness:
cd ~/Documents/GitHub/stack
./bin/stackeval harness prepare banking77-local-gepa --preset smoke
This brings up stackd and the Stack TUI, scaffolds a StackEval packet (initial_prompt.txt, metadata.json, trace pointers), and starts the optimization loop. Progress, candidate scores, and the final improved prompt are written to the packet’s trace directory and shown in the cockpit.
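A quick way to confirm the scaffold step produced the files named above is to check the packet directory for them. The sketch below assumes the two files sit directly under the packet directory, which the text does not state; `missing_packet_files` and `EXPECTED_FILES` are hypothetical names, not part of Stack.

```python
from pathlib import Path
from typing import List

# File names the text above says a scaffolded packet contains.
EXPECTED_FILES = ("initial_prompt.txt", "metadata.json")


def missing_packet_files(packet_dir: str) -> List[str]:
    """Return the expected file names not found directly under packet_dir.

    Assumption: the files live at the top level of the packet directory.
    """
    root = Path(packet_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]
```

Calling `missing_packet_files` with the path of a packet directory returns an empty list when both files are present.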

Inspect the run

Every optimizer run is grounded in stackd receipts. Capture the harness status and export the thread into the packet directory:
./bin/stackeval harness status --capture-pane -o "$STACKEVAL_PACKET/harness.debug.json"
./bin/stackeval harness export-thread --packet-dir "$STACKEVAL_PACKET" -o "$STACKEVAL_PACKET/harness.export.json"
The export is the auditable record of what the optimizer tried, what scored, and the prompt it shipped.
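To get a first look at an exported file without assuming anything about its schema, a small helper can list the top-level keys and their value types. This sketch relies only on the export being JSON, as the `.json` file name suggests; `summarize_json` is a hypothetical helper, not a Stack command, and the export's actual fields are not documented here.

```python
import json
import os
from pathlib import Path
from typing import Dict


def summarize_json(path: str) -> Dict[str, str]:
    """Map each top-level key of a JSON file to the Python type name of its value."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: type(value).__name__ for key, value in data.items()}
    # Non-object roots (for example a list) are reported under a placeholder key.
    return {"<root>": type(data).__name__}


if __name__ == "__main__":
    # Assumes STACKEVAL_PACKET is set in the environment, as in the commands above.
    packet = os.environ["STACKEVAL_PACKET"]
    for key, kind in summarize_json(os.path.join(packet, "harness.export.json")).items():
        print(f"{key}: {kind}")
```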

Hosted optimizers

The same loop can run on Synth’s hosted optimizer service when you are signed in. See the hosted-gepa Codex skill for connecting a Stack session to hosted runs.