`synth_ai/task/server.py`, require `X-API-Key` authentication, and emit traces/events for downstream tooling. The difference is how the app is used:
- The prompt-learning CLI (`synth_ai/train/cli.py`) never uses the app during training; instead, the backend orchestrator repeatedly calls the `/rollout` endpoint (and related routes) to evaluate prompt candidates.
- The SDK/CLI health-checks the task app (`check_task_app_health`) before submitting jobs, so your app must respond quickly to `/health` and `/task_info` with the expected schema (see the sketch below).
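For illustration, a pre-flight check along those lines might look like this sketch; the URL, key handling, and use of `httpx` are assumptions, and `check_task_app_health` wraps this logic for you:

```python
import os

import httpx

# Placeholders: point TASK_APP_URL at your deployment and supply the same key
# the task app was configured with.
TASK_APP_URL = os.environ.get("TASK_APP_URL", "http://localhost:8001")
API_KEY = os.environ["ENVIRONMENT_API_KEY"]


def preflight() -> None:
    """Confirm the task app answers /health and /task_info before submitting a job."""
    headers = {"X-API-Key": API_KEY}
    with httpx.Client(base_url=TASK_APP_URL, headers=headers, timeout=10.0) as client:
        client.get("/health").raise_for_status()
        task_info = client.get("/task_info")
        task_info.raise_for_status()
        print("task_info keys:", list(task_info.json()))


if __name__ == "__main__":
    preflight()
```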
## Core Responsibilities
- Expose the standard endpoints (root, `/health`, `/info`, `/task_info`, `/rollout`) exactly as implemented by `TaskAppConfig`. Prompt-learning configs reference these endpoints via `task_app_url`.
- Provide rich `TaskInfo` metadata describing the environment, dataset splits, and capabilities so the optimizer knows which dataset to pull from and how to score candidates.
- Emit traces and events through the tracing_v3 pipeline (enable `TASKAPP_TRACING_ENABLED`, `TASKAPP_SFT_OUTPUT_DIR`, and the DB environment variables). The CLI fetches prompt-learning events (`prompt.learning.*`) after jobs complete, so write the relevant events/metrics during `/rollout`.
- Support automated rollouts: `/rollout` should process a `RolloutRequest` and return a `RolloutResponse` with a consistent `pipeline_metadata.inference_url` plus per-step `info.meta.inference_url` (same requirement as RL; see `synth_ai.task.validators` and the sketch below).
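As a rough sketch of that URL-mirroring rule, here is the shape of the relevant `RolloutResponse` fields rendered as plain dicts (all values are placeholders, not the actual response models):

```python
# Placeholder cid token; in practice the inference gateway issues it per rollout.
INFERENCE_URL = "https://inference.example.com/v1/chat/completions?cid=trace-abc123"

rollout_response = {
    "trajectories": [
        {
            "steps": [
                {
                    "obs": {},
                    # Every step mirrors the pipeline-level inference URL.
                    "info": {"meta": {"inference_url": INFERENCE_URL}},
                },
            ],
        },
    ],
    "metrics": {"mean_return": 0.0, "details": {}},
    "trace": {},
    "pipeline_metadata": {"inference_url": INFERENCE_URL},
}
```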
## Authentication
Same as other task apps: every protected route enforces `X-API-Key` / `Authorization: Bearer` headers. Use `require_api_key_dependency` from `synth_ai.task.server`.
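A minimal sketch of reusing that dependency on a custom route, assuming the standard FastAPI `Depends` pattern (verify the exact signature in `synth_ai.task.server`):

```python
from fastapi import APIRouter, Depends

from synth_ai.task.server import require_api_key_dependency

router = APIRouter()


@router.get("/custom_stats", dependencies=[Depends(require_api_key_dependency)])
def custom_stats() -> dict:
    # Reached only with a valid X-API-Key / Authorization: Bearer header.
    return {"status": "ok"}
```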
## Endpoint Contract Summary
(All implemented automatically when you create a `TaskAppConfig`; include custom logic in your config factory if needed.)
- `/` – basic liveness probe: `{ "status": "ok", "service": "<task-app-id>" }`
- `/health` – verifies the API key and returns a health payload.
- `/info` – returns `TaskInfo` metadata: `{ "service": { "task": {...} }, "dataset": {...}, "rubrics": {...}, "inference": {...}, "limits": {...} }`
- `/task_info` – without seeds: `{ "taskset": {...} }`; with seeds: a `TaskInfo` or list of `TaskInfo` describing each requested instance.
- `/rollout` – accepts a `RolloutRequest` (from the prompt-learning orchestrator) and returns a `RolloutResponse` with trajectories, metrics, `trace`, and `pipeline_metadata`. Ensure `pipeline_metadata.inference_url` carries a `?cid=` token and every `trajectory.steps[*].info.meta.inference_url` mirrors it (see the check below).
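Before returning from `/rollout`, it can help to assert the mirroring rule yourself. A sketch operating on the plain-dict shape shown earlier (the authoritative checks live in `synth_ai.task.validators`):

```python
def check_inference_urls(response: dict) -> None:
    """Fail fast if the cid token is missing or any step URL diverges."""
    url = response["pipeline_metadata"]["inference_url"]
    if "cid=" not in url:
        raise ValueError("pipeline_metadata.inference_url must carry a ?cid= token")
    for trajectory in response["trajectories"]:
        for step in trajectory["steps"]:
            step_url = step["info"]["meta"]["inference_url"]
            if step_url != url:
                raise ValueError(f"step URL {step_url!r} does not mirror {url!r}")
```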
## Tracing and Events
- Set `TASKAPP_TRACING_ENABLED=1` when hosting the app so v3 traces capture every rollout.
- `TASKAPP_SFT_OUTPUT_DIR` (or `SFT_OUTPUT_DIR`) controls where, if anywhere, the app writes raw JSONL. Prompt learning mainly relies on the trace DB, but following the same pattern as RL/SFT simplifies reuse.
- Emit prompt-learning-specific events (e.g., `prompt.learning.progress`, `prompt.learning.final.results`) via your app logic so the CLI can build summaries (see the sketch after this list).
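A sketch of the hosting-side setup; the environment variables are the ones named above, but `emit_event` is a hypothetical stand-in for whatever emitter your tracing_v3 integration exposes:

```python
import os

# Enable v3 tracing and (optionally) raw JSONL output before the app starts.
os.environ["TASKAPP_TRACING_ENABLED"] = "1"
os.environ["TASKAPP_SFT_OUTPUT_DIR"] = "/data/sft"  # trace DB remains the primary sink


def emit_event(name: str, payload: dict) -> None:
    """Hypothetical helper: forward an event into the tracing_v3 pipeline."""
    print(f"event {name}: {payload}")  # replace with your real emitter


# Emitted from /rollout so the CLI can build summaries once the job completes.
emit_event("prompt.learning.progress", {"candidate": 3, "score": 0.71})
emit_event("prompt.learning.final.results", {"best_candidate": 3, "score": 0.74})
```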
## Modal / Remote Hosting
Prompt learning often runs long evaluations, so hosting on Modal is common:
- Provide a Modal entry script (same format as RL/SFT; a sketch follows this list) and register secrets (`ENVIRONMENT_API_KEY`, vendor keys) so `uvx synth-ai deploy --runtime modal` or the CLI's automatic deploy path can boot the app.
- Ensure the Modal deployment exposes stable `task_app_url` values; the prompt-learning config references this URL directly.
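A minimal Modal entry-script sketch; the app name, secret names, and the `build_task_app` factory are placeholders for your own setup:

```python
import modal

app = modal.App("prompt-learning-task-app")  # placeholder name

image = modal.Image.debian_slim().pip_install("synth-ai", "fastapi", "uvicorn")


@app.function(
    image=image,
    secrets=[
        modal.Secret.from_name("environment-api-key"),  # supplies ENVIRONMENT_API_KEY
        modal.Secret.from_name("vendor-keys"),          # e.g. model-provider API keys
    ],
)
@modal.asgi_app()
def task_app():
    # Hypothetical factory that builds the FastAPI app from your TaskAppConfig.
    from my_task_app import build_task_app

    return build_task_app()
```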
## Best Practices
- Deterministic seeds: implement `/task_info` so specific seeds map to reproducible task instances (same as RL); GEPA/MIPRO depend on consistent evaluation (see the sketch below).
- Comprehensive metadata: fill the `dataset`, `rubric`, `limits`, and `task_metadata` fields so the optimizer can display context and filter results.
- Robust error handling: `/rollout` should gracefully handle invalid prompts or tool calls and report meaningful `metrics.details` so you can debug candidate failures.
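For the deterministic-seed practice, one simple pattern is to derive the task instance entirely from the seed (a sketch; `TASK_POOL` is a hypothetical dataset index):

```python
import random

TASK_POOL = ["task-a", "task-b", "task-c"]  # hypothetical dataset index


def instance_for_seed(seed: int) -> dict:
    """Map a seed to a reproducible task instance: same seed, same task, every call."""
    rng = random.Random(seed)  # local RNG, isolated from global state
    return {
        "seed": seed,
        "task_id": TASK_POOL[seed % len(TASK_POOL)],
        "example_order": rng.sample(range(10), 10),
    }


assert instance_for_seed(7) == instance_for_seed(7)  # stable across calls
```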