Before You Begin
- Config: A TOML file that sets algorithm, task URLs, model defaults, and hyperparameters.
- Secrets: SYNTH_API_KEY, ENVIRONMENT_API_KEY, and TASK_APP_URL stored in a .env. Provide the path up front with --env-file /path/.env if you want to skip prompts.
- Task app health: Ensure /health and /task_info endpoints respond; the CLI calls them and will abort if they fail.
- Optional overrides:
  - --model MODEL_ID to force a specific backend model.
  - --task-url https://... to override what’s in the config for this run.
  - --idempotency some-uuid so retried submissions don’t duplicate jobs.
  - --allow-experimental (or --no-allow-experimental) to temporarily change the SDK experimental flag.
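The prerequisites above can be checked before invoking the CLI. The following is a minimal sketch, not the CLI's own logic: it assumes the .env uses plain KEY=VALUE lines, and the helper names (parse_env_text, missing_keys, health_urls) are hypothetical. Only the three key names and the /health and /task_info paths come from this guide.

```python
# Hypothetical pre-flight helpers; the real CLI performs its own resolution.
REQUIRED_KEYS = ("SYNTH_API_KEY", "ENVIRONMENT_API_KEY", "TASK_APP_URL")


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env: dict[str, str]) -> list[str]:
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not env.get(key)]


def health_urls(task_app_url: str) -> list[str]:
    """Build the two endpoint URLs the CLI is described as calling."""
    base = task_app_url.rstrip("/")
    return [f"{base}/health", f"{base}/task_info"]
```

Running missing_keys on the parsed file before a submission shows which secrets the CLI would prompt for, and the URLs from health_urls can be opened by hand to confirm the task app is reachable.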
Run the CLI
- Config selection: If you omit --config, the CLI lists discovered TOMLs and remembers your last choice.
- Env resolution: Required keys are shown with masked values; select another .env, fetch Modal secrets, or enter values manually if something is missing.
- Verification calls: The CLI hits POST /rl/verify_task_app using every org credential combination to make sure the backend can talk to your task app. Failures print a full diagnostics payload so you can fix auth without guessing.
- Task-app health check: check_task_app_health pings the task app directly with your ENVIRONMENT_API_KEY. If it fails, no job is created; fix the task app first.
- Job creation: The CLI prints the payload preview and runs POST {backend}/rl/jobs. The response must include a job_id.
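The masking and job-creation steps above can be sketched as follows. This is an illustration under assumptions rather than the CLI's implementation: the masking style, the bearer-token header, and the helper names (mask, build_job_request, extract_job_id) are hypothetical. Only the POST {backend}/rl/jobs route and the job_id requirement come from this guide. The request object is built but never sent.

```python
import json
import urllib.request


def mask(value: str, visible: int = 4) -> str:
    """Hide all but the last few characters (masking style is an assumption)."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def build_job_request(backend: str, payload: dict, api_key: str) -> urllib.request.Request:
    """Prepare, without sending, a POST to {backend}/rl/jobs."""
    # Assumption: JSON body with bearer-token auth; the real CLI may differ.
    return urllib.request.Request(
        url=f"{backend.rstrip('/')}/rl/jobs",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extract_job_id(response_body: str) -> str:
    """Return job_id from a JSON response, or raise if it is missing."""
    data = json.loads(response_body)
    job_id = data.get("job_id") if isinstance(data, dict) else None
    if not job_id:
        raise ValueError(f"backend response has no job_id: {response_body[:400]}")
    return str(job_id)
```

Treating a missing job_id as an error mirrors the rule stated above: a response without one means no job was created.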
Live Monitoring
- Leave --poll (default) enabled to launch the JobStreamer.
  - --stream-format cli (default) prints concise status + event updates while hiding noisy Hatchet/Modal logs.
  - --stream-format chart opens a live loss/score view that tracks gepa.transformation.mean_score.
  - --poll-timeout (seconds) and --poll-interval (seconds) control how long and how often the streamer checks in.
- Disable polling (--no-poll) when you only need the job ID, for example when triggering runs from CI and checking status later.
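How a timeout and an interval interact can be shown with a generic polling loop. This sketch is not the JobStreamer: the function name, the terminal state names, and the injectable sleep and clock parameters are assumptions made for illustration.

```python
import time
from typing import Callable, Optional

# Assumed terminal state names; the backend's actual values may differ.
TERMINAL_STATES = {"succeeded", "failed", "cancelled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    timeout: float,
    interval: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Call fetch_status every `interval` seconds until a terminal state.

    Returns the terminal status, or None if the next check would fall
    after `timeout` seconds have elapsed.
    """
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL_STATES:
            return status
        if clock() + interval > deadline:
            return None  # Give up; the job may still be running remotely.
        sleep(interval)
```

A timeout in this loop only stops the local watcher. That matches the --no-poll use case above, where the job continues and its status is checked later.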
What You See
- Verification summary listing candidate credentials and status codes.
- Task app health result (✓ Task app healthy or a detailed failure reason).
- Payload preview plus the raw backend response (truncated to 400 chars for readability).
- Streaming events emitted by the RL job (status transitions, environment events, metrics).
- Final status JSON once the job reaches a terminal state.
Troubleshooting Tips
- Authentication errors usually mean the .env lacks ENVIRONMENT_API_KEY or it’s scoped to another task app; rerun with --env-file pointing to the correct secrets file.
- If the CLI hangs at “Verifying task app…”, the task app is likely offline; test its /health endpoint manually.
- Use --idempotency whenever you expect to rerun commands (e.g., in scripts) to avoid duplicate jobs.
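For scripted reruns, one way to get a stable value for --idempotency is to derive it from the run's inputs, so a retry of the same submission reuses the same key. This is a sketch under assumptions: the namespace URL, the run label, and the helper names are hypothetical, and whether the backend accepts any particular UUID version is not stated in this guide. The flag names are the ones listed above; the CLI executable name is deliberately left out.

```python
import hashlib
import uuid

# Hypothetical namespace; any fixed UUID works as long as it never changes.
NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, "https://example.invalid/rl-jobs")


def idempotency_key(config_text: str, run_label: str) -> str:
    """Derive a deterministic UUID from the config contents and a run label."""
    digest = hashlib.sha256(config_text.encode("utf-8")).hexdigest()
    return str(uuid.uuid5(NAMESPACE, f"{run_label}:{digest}"))


def cli_flags(config_path: str, key: str) -> list[str]:
    """Flags to append after the CLI command for a fire-and-forget run."""
    return ["--config", config_path, "--idempotency", key, "--no-poll"]
```

Changing the run label yields a new key, which is how a deliberate resubmission of the same config would be distinguished from a retry.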