Online prompt optimization (GEPA and MIPRO) supports two ways to serve the best prompt at inference time:
| Pattern | How it works | When to use |
| --- | --- | --- |
| A: Proxy-based | Runtime calls a Synth proxy URL; Synth performs live candidate selection and injects the prompt | Simplest integration; you want Synth to own selection |
| B: Retrieval + JIT apply | Runtime fetches candidates via APIs, picks the best, and applies it just-in-time before each request | You control selection logic; you call the LLM directly |
Both patterns work with GEPA and MIPRO online sessions. The backend proposes new candidates and tracks rewards the same way; only the serving path differs.

Pattern A: Proxy-based serving

Your rollout loop sends LLM requests to a Synth proxy URL. The proxy:
  1. Selects the current best candidate (or assigns one for exploration)
  2. Injects the candidate prompt as the system message
  3. Forwards the request to your configured LLM provider
  4. Returns the response with headers (x-gepa-rollout-id, x-gepa-candidate-id) for reward attribution
Flow:
Your app → POST {proxy_url}/chat/completions → Synth proxy → LLM provider → response
                                               (Synth selects the candidate and injects the prompt)
SDK usage:
session = client.optimization.online.create(kind="gepa_online", config_path="gepa.toml")
urls = session.get_prompt_urls()
# Call urls["chat_completions_url"] for each LLM request
See the GEPA Online Banking77 example and the GEPA Online API reference for full details.
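The proxy endpoint is OpenAI-compatible, so any HTTP client can call it. A minimal standard-library sketch follows; the `call_proxy` and `extract_attribution` helpers and the `urllib` wiring are ours, while the attribution header names are the ones documented above:

```python
import json
import urllib.request

# Header names documented for the Synth proxy response.
ATTRIBUTION_HEADERS = ("x-gepa-rollout-id", "x-gepa-candidate-id")

def extract_attribution(headers) -> dict:
    """Pull the reward-attribution IDs out of a proxy response's headers."""
    lowered = {str(k).lower(): v for k, v in dict(headers).items()}
    return {name: lowered.get(name) for name in ATTRIBUTION_HEADERS}

def call_proxy(chat_completions_url: str, api_key: str, user_input: str):
    """POST an OpenAI-style chat request. The proxy selects a candidate and
    injects the system prompt, so we send only the user turn."""
    body = json.dumps({"messages": [{"role": "user", "content": user_input}]})
    req = urllib.request.Request(
        chat_completions_url,
        data=body.encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp), extract_attribution(resp.headers)
```

Keep the returned attribution dict alongside the response so you can report the reward for the right rollout and candidate later.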

Pattern B: Retrieval + JIT apply

Your rollout loop fetches candidates from Synth, picks the best (or your own logic), and applies the prompt yourself before calling the LLM directly. Flow:
  1. Create an online session (same as Pattern A)
  2. Poll session state for best_candidate_id (or list candidates and choose)
  3. Fetch the candidate payload via session.get_candidate(candidate_id) or session.list_candidates()
  4. Extract prompt text from the candidate (candidate_content, artifact_payload, or nested fields)
  5. Call your LLM with that prompt as the system message
  6. Report rewards via session.update_reward(...) with candidate_id and rollout_id
When to use Pattern B:
  • You call the LLM directly (no proxy in the path)
  • You want custom selection logic (e.g., A/B by user segment, fallback rules)
  • You need to cache or preload candidates
  • Your infra cannot route through a Synth proxy
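The custom-selection case (e.g., A/B by user segment with fallback rules) can be sketched as a pure function over the candidate summaries that `list_candidates()` returns. The segment-hashing scheme and the `explore_fraction` knob are illustrative assumptions, not part of the SDK:

```python
import hashlib

def pick_candidate(candidates, best_candidate_id, user_id, explore_fraction=0.1):
    """Route a small, deterministic slice of users to an exploration
    candidate; serve everyone else the backend's current best."""
    by_id = {c["candidate_id"]: c for c in candidates}
    # Hash the user ID into 100 buckets so each user always sees the same arm.
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    if candidates and bucket < explore_fraction * 100:
        # Exploration arm: the least-tried candidate (fewest rollouts).
        return min(candidates, key=lambda c: c.get("rollout_count", 0))
    # Exploit arm: backend's best, falling back to highest avg_reward
    # if best_candidate_id is missing from the fetched list.
    return by_id.get(best_candidate_id) or max(
        candidates, key=lambda c: c.get("avg_reward", 0.0)
    )
```

Because the bucketing is deterministic, a given user keeps the same prompt across requests, which keeps per-candidate reward statistics clean.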

Retrieval APIs

List candidates for a session

GET /api/v1/systems/{system_id}/candidates

For online sessions, system_id is the session_id returned when you create the session.

Query params: job_id, algorithm, mode, status, limit, cursor, sort, include.

Response:
{
  "items": [
    {
      "candidate_id": "cand_abc123",
      "candidate_content": "You are a helpful assistant...",
      "avg_reward": 0.85,
      "rollout_count": 42
    }
  ],
  "next_cursor": "..."
}
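Following next_cursor until it comes back empty can be factored into a small generator. Here `fetch_page` stands in for whatever HTTP call you make to the endpoint above (its name and signature are ours); the `limit`/`cursor` params and the `items`/`next_cursor` fields match the response shape shown:

```python
def iter_candidates(fetch_page, limit=50):
    """Yield every candidate for a session, following next_cursor to the end.

    fetch_page(limit=..., cursor=...) must return one decoded response page
    shaped like {"items": [...], "next_cursor": "..." | None}.
    """
    cursor = None
    while True:
        page = fetch_page(limit=limit, cursor=cursor)
        yield from page.get("items", [])
        cursor = page.get("next_cursor")
        if not cursor:  # absent, None, or empty string all mean "last page"
            break
```

Passing the HTTP call in as a function keeps the cursor loop independent of your client library and easy to test.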

Get a single candidate

GET /api/v1/candidates/{candidate_id}
GET /api/v1/offline/jobs/{job_id}/candidates/{candidate_id}
Returns the full candidate payload including candidate_content, artifact_payload, or nested prompt structures.
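Because the prompt may live in candidate_content, artifact_payload, or a nested structure, a defensive extractor is useful. A sketch; the nested key names it probes ("prompt", "content", "text") are illustrative guesses, not a guaranteed schema:

```python
def extract_prompt_text(candidate: dict):
    """Return the prompt string from a candidate payload, trying the
    documented locations in order; None if nothing usable is found."""
    direct = candidate.get("candidate_content")
    if isinstance(direct, str):
        return direct
    payload = candidate.get("artifact_payload")
    if isinstance(payload, str):
        return payload
    if isinstance(payload, dict):
        # Probe a few plausible nested field names (assumed, not guaranteed).
        for key in ("prompt", "content", "text"):
            if isinstance(payload.get(key), str):
                return payload[key]
    return None
```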

Session state (best candidate)

GET /api/v1/online/sessions/{session_id}

Returns live state including:
  • best_candidate_id — the backend’s current best
  • best_reward / best_objective_value — associated score
  • candidates — summary list with candidate_id, avg_reward, rollout_count
Use best_candidate_id to fetch the prompt via get_candidate() when using Pattern B.
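A common Pattern B serving shape is to poll session state and refetch the prompt only when best_candidate_id changes. `BestPromptCache` below is a sketch of ours, not an SDK class; `get_state` and `get_candidate` stand in for `session.get_status()` and `session.get_candidate()` so the caching logic stays testable:

```python
class BestPromptCache:
    """Serve the current best prompt, refetching only on candidate change."""

    def __init__(self, get_state, get_candidate):
        self._get_state = get_state        # e.g. session.get_status
        self._get_candidate = get_candidate  # e.g. session.get_candidate
        self._cached_id = None
        self._cached_prompt = None

    def current_prompt(self):
        """Return (candidate_id, prompt_text) for the backend's current best."""
        best_id = self._get_state().get("best_candidate_id") or "baseline"
        if best_id != self._cached_id:
            # Best changed (or first call): fetch the new candidate's prompt.
            candidate = self._get_candidate(best_id)
            self._cached_prompt = candidate.get("candidate_content")
            self._cached_id = best_id
        return self._cached_id, self._cached_prompt
```

This keeps the per-request cost to one lightweight state poll, with the heavier candidate fetch amortized across requests.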

SDK usage (Pattern B)

GEPA online

import os

from synth_ai import SynthClient

client = SynthClient(api_key=os.environ["SYNTH_API_KEY"])
session = client.optimization.online.create(
    kind="gepa_online",
    config_path="gepa.toml",
)

# Option 1: Use best from session state
state = session.get_status()
best_id = state.get("best_candidate_id") or "baseline"
candidate = session.get_candidate(best_id)
prompt_text = candidate.get("candidate_content") or candidate.get("artifact_payload")

# Option 2: List and pick (e.g., by your own logic)
page = session.list_candidates(limit=10)
items = page.get("items", [])
best = max(items, key=lambda x: x.get("avg_reward", 0))
best_id = best["candidate_id"]
candidate = session.get_candidate(best_id)
prompt_text = candidate.get("candidate_content") or candidate.get("artifact_payload")

# Apply prompt and call your LLM
messages = [{"role": "system", "content": prompt_text}, {"role": "user", "content": user_input}]
response = your_llm_client.chat(messages)

# Report reward (use rollout_id from your own tracking)
session.update_reward(
    reward_info={"score": 0.9},
    rollout_id="my_rollout_123",
    candidate_id=best_id,
)

MIPRO online

MiproOnlineSession exposes the same retrieval methods: list_candidates(), list_candidates_async(), get_candidate(), get_candidate_async(). Use session_id as the system identifier when calling the REST APIs directly.

Reward attribution

For Pattern B, you must track rollout_id and candidate_id yourself and pass them to update_reward() so the backend can attribute rewards correctly. The backend uses this to update per-candidate statistics and propose new candidates.
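A minimal sketch of that bookkeeping, assuming an in-memory map is acceptable (use durable storage if rollouts outlive the process). `RolloutTracker` and the rollout-ID format are ours; `update_reward(...)` is the SDK call shown above:

```python
import uuid

class RolloutTracker:
    """Remember which candidate served each rollout so rewards can be
    attributed to the right candidate when they arrive later."""

    def __init__(self):
        self._served = {}  # rollout_id -> candidate_id

    def start_rollout(self, candidate_id: str) -> str:
        """Record that this rollout was served with candidate_id."""
        rollout_id = f"rollout_{uuid.uuid4().hex[:8]}"  # assumed ID format
        self._served[rollout_id] = candidate_id
        return rollout_id

    def report(self, session, rollout_id: str, score: float):
        """Send the reward with both IDs so the backend can attribute it."""
        candidate_id = self._served.pop(rollout_id)
        session.update_reward(
            reward_info={"score": score},
            rollout_id=rollout_id,
            candidate_id=candidate_id,
        )
```

Call `start_rollout(...)` when you apply a prompt, and `report(...)` once the rollout's reward is known; the pop also guards against double-reporting the same rollout.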

Next steps