Banking77

Complete guide to optimizing prompts for Banking77 intent classification using GEPA.

Overview

Banking77 is an intent classification task: each customer query is assigned one of 77 banking-related intents. GEPA typically raises accuracy from a 60-75% baseline to 85-90% or higher over 15 generations.

Prerequisites

# Install dependencies
uv pip install -e .

# Set API keys
export SYNTH_API_KEY="your-backend-api-key"
export GROQ_API_KEY="gsk_your_groq_key"
export ENVIRONMENT_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"

Where to get API keys:

GROQ_API_KEY: Create one at https://console.groq.com/keys
SYNTH_API_KEY: Get it from your backend administrator or your .env.dev file
ENVIRONMENT_API_KEY: Generate a random secure token (the export command above does this)
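
Before deploying, it can help to confirm that all three variables are actually set in the current shell. The following is a minimal sketch of such a check; the helper name `missing_keys` is hypothetical and not part of synth-ai, and only the variable names come from the export commands above.

```python
import os

# Variable names are taken from the export commands above.
REQUIRED_KEYS = ("SYNTH_API_KEY", "GROQ_API_KEY", "ENVIRONMENT_API_KEY")


def missing_keys(env=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_KEYS if not env.get(name)]


missing = missing_keys()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All required API keys are set.")
```

Passing a plain dict instead of `os.environ` makes the check easy to exercise without touching the real environment.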

Step 1: Deploy Task App

Option A: Using helper script (recommended)

# Terminal 1
./examples/blog_posts/gepa/deploy_banking77_task_app.sh

Option B: Using CLI

uvx synth-ai deploy banking77 --runtime uvicorn --port 8102

Option C: Deploy to Modal

uvx synth-ai deploy banking77 --runtime modal --name banking77-gepa --env-file .env

Verify the task app is running:

curl -H "X-API-Key: $ENVIRONMENT_API_KEY" http://127.0.0.1:8102/health
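
The same health check can be scripted, for example to wait for the task app before starting a run. This sketch uses only the Python standard library and mirrors the curl command above (same URL, same X-API-Key header). The function names are hypothetical, and treating HTTP 200 as "healthy" is an assumption, since the response format of /health is not documented here.

```python
import os
import urllib.request


def build_health_request(base_url, api_key):
    """Build the same GET request as the curl command: /health plus an X-API-Key header."""
    url = base_url.rstrip("/") + "/health"
    return urllib.request.Request(url, headers={"X-API-Key": api_key})


def task_app_is_healthy(base_url="http://127.0.0.1:8102", timeout=5.0):
    """Return True if the task app answers /health with HTTP 200 (assumed success code)."""
    api_key = os.environ.get("ENVIRONMENT_API_KEY", "")
    request = build_health_request(base_url, api_key)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status == 200
    except OSError:  # URLError, HTTPError, and socket timeouts are OSError subclasses
        return False
```

Calling `task_app_is_healthy()` returns False rather than raising when the app is not reachable, which keeps it convenient for retry loops.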

Step 2: Create Config

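Create banking77_gepa.toml. The CLI example in Step 3 points at the repository's config, examples/blog_posts/gepa/configs/banking77_gepa_local.toml; to use your own file, pass its path to --config: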

[prompt_learning]
algorithm = "gepa"
task_app_url = "http://127.0.0.1:8102"
task_app_id = "banking77"

# Training seeds (30 seeds from train pool)
evaluation_seeds = [50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79]

# Validation seeds (50 seeds from validation pool - not in training)
validation_seeds = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49]

[prompt_learning.initial_prompt]
messages = [
  { role = "system", content = "You are a banking intent classification assistant." },
  { role = "user", pattern = "Customer Query: {query}\n\nClassify this query into one of 77 banking intents." }
]

[prompt_learning.gepa]
initial_population_size = 20    # Starting population of prompts
num_generations = 15            # Number of evolutionary cycles
mutation_rate = 0.3             # Probability of mutation
crossover_rate = 0.5            # Probability of crossover
rollout_budget = 1000           # Total rollouts across all generations
max_concurrent_rollouts = 20    # Parallel rollout limit
pareto_set_size = 20            # Size of Pareto front
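
The two seed lists are contiguous ranges, so they can be generated instead of typed by hand. The sketch below rebuilds them, checks that the pools do not overlap, and gives a rough feel for the rollout budget. The budget estimate rests on an assumption that is not confirmed by this guide: that one rollout scores one candidate prompt on one training seed.

```python
# Seed ranges mirror the config above: 30 training seeds and 50 validation seeds.
train_seeds = list(range(50, 80))
validation_seeds = list(range(0, 50))

# The validation pool should not overlap the training pool.
overlap = set(train_seeds) & set(validation_seeds)
assert not overlap, f"seeds appear in both pools: {sorted(overlap)}"

# ASSUMPTION: one rollout scores one candidate prompt on one training seed,
# so a full pass over the training seeds costs len(train_seeds) rollouts.
rollout_budget = 1000
full_passes = rollout_budget // len(train_seeds)

# Python's list repr is also valid TOML array syntax, so these lines can be pasted into the config.
print(f"evaluation_seeds = {train_seeds}")
print(f"validation_seeds = {validation_seeds}")
print(f"# about {full_passes} full training passes fit in {rollout_budget} rollouts")
```

If the optimizer evaluates candidates on subsets of the training seeds, the real number of candidates explored will differ from this estimate.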

Step 3: Run Optimization

Option A: Using helper script (recommended)

# Terminal 2
./examples/blog_posts/gepa/run_gepa_banking77.sh

Option B: Using CLI directly

uvx synth-ai train \
  --config examples/blog_posts/gepa/configs/banking77_gepa_local.toml \
  --backend http://localhost:8000 \
  --poll
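
To launch the same run from a Python script, the command can be assembled as an argument list. This is a sketch that uses only the flags shown above (--config, --backend, --poll); the helper name `build_train_command` is hypothetical.

```python
import shlex


def build_train_command(config_path, backend_url="http://localhost:8000", poll=True):
    """Return the argv list for the `uvx synth-ai train` call shown above."""
    argv = ["uvx", "synth-ai", "train", "--config", config_path, "--backend", backend_url]
    if poll:
        argv.append("--poll")
    return argv


command = build_train_command("examples/blog_posts/gepa/configs/banking77_gepa_local.toml")
print(shlex.join(command))
# To actually start the run: subprocess.run(command, check=True)
```

Building a list rather than a shell string avoids quoting problems when the config path contains spaces.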

Step 4: Monitor Progress

You’ll see real-time output:

🧬 Running GEPA on Banking77
=============================
✅ Backend URL: http://localhost:8000
✅ Task app is healthy

🚀 Starting GEPA training...

proposal[0] train_accuracy=0.65 len=120 tool_rate=0.95 N=30
  🔄 TRANSFORMATION:
    [SYSTEM]: Classify customer banking queries into intents...

Generation 1/15: Best reward=0.75 (75% accuracy)
Generation 2/15: Best reward=0.82 (82% accuracy)
...
✅ GEPA training complete!
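
If you capture this output to a file, the per-generation rewards can be pulled out for plotting or comparison between runs. The sketch below assumes the "Generation N/M: Best reward=X" line format shown in the sample output; that format may change between versions, so treat the regular expression as illustrative.

```python
import re

# Matches lines such as "Generation 2/15: Best reward=0.82 (82% accuracy)".
GENERATION_LINE = re.compile(r"Generation (\d+)/(\d+): Best reward=([0-9.]+)")


def best_rewards(log_text):
    """Return {generation: best_reward} for every generation line in the log."""
    return {
        int(match.group(1)): float(match.group(3))
        for match in GENERATION_LINE.finditer(log_text)
    }


sample = """Generation 1/15: Best reward=0.75 (75% accuracy)
Generation 2/15: Best reward=0.82 (82% accuracy)
"""
print(best_rewards(sample))  # {1: 0.75, 2: 0.82}
```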

Step 5: Query Results

from synth_ai.learning import get_prompt_text, get_scoring_summary

# Get best prompt
best_prompt = get_prompt_text(
    job_id="pl_abc123",
    base_url="http://localhost:8000",
    api_key="sk_...",
    rank=1
)

# Get scoring summary
summary = get_scoring_summary(
    job_id="pl_abc123",
    base_url="http://localhost:8000",
    api_key="sk_..."
)

print(f"Best Train Accuracy: {summary['best_train_accuracy']:.3f}")
print(f"Best Validation Accuracy: {summary['best_validation_accuracy']:.3f}")
print(f"Mean Train Accuracy: {summary['mean_train_accuracy']:.3f}")
print(f"Candidates Tried: {summary['num_candidates_tried']}")
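
A large difference between train and validation accuracy suggests the best prompt is overfitting the training seeds. The sketch below computes that gap from the two summary keys used above. It assumes both values are present as floats, and the numbers in the example are hypothetical, not real results.

```python
def generalization_gap(summary):
    """Return best train accuracy minus best validation accuracy."""
    return summary["best_train_accuracy"] - summary["best_validation_accuracy"]


# Hypothetical values for illustration only, not real results.
example_summary = {"best_train_accuracy": 0.90, "best_validation_accuracy": 0.85}
gap = generalization_gap(example_summary)
print(f"Train/validation gap: {gap:.3f}")
```

In practice, pass the dictionary returned by get_scoring_summary instead of the example values.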

Expected Results

Generation	Typical Accuracy	Notes
1 (baseline)	60-75%	Initial random/baseline prompts
5	75-80%	Early optimization gains
10	80-85%	Convergence begins
15 (final)	85-90%+	Optimized prompts on Pareto front

Troubleshooting

❌ “Banking77 task app is not running”

Solution: Start the task app first

./examples/blog_posts/gepa/deploy_banking77_task_app.sh

❌ “Cannot connect to backend”

Solution: Verify backend is running

curl http://localhost:8000/api/health

❌ “GROQ_API_KEY environment variable is required”

Solution: Export your Groq API key

export GROQ_API_KEY="gsk_your_key_here"

❌ Pattern validation failed

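Solution: Ensure the user message in your config’s initial_prompt.messages includes the {query} placeholder. The array-of-tables form below is equivalent to the inline form in Step 2: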

[[prompt_learning.initial_prompt.messages]]
role = "user"
pattern = "Customer Query: {query}\n\nClassify this query."
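
A pattern can be checked locally before submitting a job. This sketch assumes the pattern uses Python str.format-style {name} fields and only tests for the presence of query; the actual validation rules in the backend may be stricter or different. Both function names are hypothetical.

```python
from string import Formatter


def placeholders(pattern):
    """Return the set of {name} fields that appear in a pattern string."""
    return {field for _, field, _, _ in Formatter().parse(pattern) if field}


def has_query_placeholder(pattern):
    """Return True if the pattern contains a {query} field."""
    return "query" in placeholders(pattern)


print(has_query_placeholder("Customer Query: {query}\n\nClassify this query."))  # True
print(has_query_placeholder("Classify this query."))  # False
```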

Helper Scripts

Script	Purpose
`deploy_banking77_task_app.sh`	Start Banking77 task app locally
`run_gepa_banking77.sh`	Run GEPA optimization with validation checks
`test_gepa_local.sh`	Quick test script for local setup
`verify_banking77_setup.sh`	Comprehensive setup verification
`query_prompts_example.py`	Example script for querying results

Next Steps

Configuration Reference – All algorithm parameters
Evaluate Results – Querying and validation
Other Examples – HotpotQA, IFBench, HoVer, PUPA
