`synth-ai train` validates a training config, collects the required credentials, and submits either an RL or SFT job to the Synth backend. When you omit inputs the command runs interactively: it asks you to pick a config, select an `.env` file, and confirm overrides before it talks to the API.
How it Works
- Config discovery – `discover_configs` scans common directories (including the current working tree and repo examples). If you pass multiple `--config` flags they are processed sequentially.
- Environment resolution – `resolve_env` prompts you to choose an `.env` file and ensures mandatory keys are present. RL jobs require `SYNTH_API_KEY`, `ENVIRONMENT_API_KEY`, and (when not provided on the CLI) `TASK_APP_URL`.
- RL preflight – Before submitting, the CLI asks the backend to verify your task app (`/rl/verify_task_app`) and then calls the app's `/task_info` and `/health` endpoints with the resolved environment keys. Failures stop the run with contextual diagnostics.
- SFT payload prep – SFT jobs can upload the dataset referenced in the config or provided via `--dataset`. The optional `--examples` limit trims a temporary copy so you can run smoke-sized training.
- Job streaming – Successful submissions are streamed via `JobStreamer`. The default `cli` format prints status events; `--stream-format chart` opens the live loss panel.
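The `--examples` trimming described above can be pictured as copying the first N records of a JSONL file into a temporary file and uploading that copy instead of the original. The following is a minimal sketch, assuming one JSON object per line; `truncate_jsonl` is a hypothetical helper and not part of the synth-ai API, and the real CLI may treat blank lines or validation differently.

```python
import tempfile
from pathlib import Path


def truncate_jsonl(dataset_path: str, max_examples: int) -> Path:
    """Copy the first `max_examples` non-empty lines into a temporary JSONL file.

    Hypothetical helper for illustration; not the synth-ai implementation.
    """
    if max_examples <= 0:
        raise ValueError("max_examples must be positive")

    kept = 0
    # delete=False keeps the file on disk after the handle closes;
    # the caller removes it once the upload is done.
    with open(dataset_path, "r", encoding="utf-8") as src, tempfile.NamedTemporaryFile(
        "w", suffix=".jsonl", delete=False, encoding="utf-8"
    ) as dst:
        for line in src:
            if not line.strip():
                continue  # blank lines do not count as examples
            dst.write(line if line.endswith("\n") else line + "\n")
            kept += 1
            if kept >= max_examples:
                break
        tmp_path = Path(dst.name)
    return tmp_path
```

Because the original dataset is never modified, a small smoke run followed by a full run only requires dropping the limit.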
Usage
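As an illustration, here is a sketch of scripting an invocation from Python using only flags documented under Options. The config path, the `.env` path, and the `build_train_command` helper are hypothetical placeholders; the actual `subprocess.run` call is left as a comment so the snippet has no side effects.

```python
import shlex
import uuid
from typing import List, Optional


def build_train_command(
    config: str,
    job_type: str = "auto",
    env_file: Optional[str] = None,
    poll: bool = True,
    idempotency_key: Optional[str] = None,
) -> List[str]:
    """Assemble an argv list for `synth-ai train` (hypothetical helper)."""
    if job_type not in {"auto", "rl", "sft"}:
        raise ValueError(f"unsupported --type value: {job_type}")

    argv = ["synth-ai", "train", "--config", config, "--type", job_type]
    if env_file is not None:
        argv += ["--env-file", env_file]
    if idempotency_key is not None:
        argv += ["--idempotency", idempotency_key]
    argv.append("--poll" if poll else "--no-poll")
    return argv


# Placeholder paths; substitute your own config and .env file.
cmd = build_train_command(
    "configs/example_rl.toml",
    job_type="rl",
    env_file=".env",
    idempotency_key=str(uuid.uuid4()),
)
print(shlex.join(cmd))
# To actually launch the job: subprocess.run(cmd, check=True)
```

Passing `--env-file` skips the interactive picker, which is what makes a scripted call like this practical.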
Options
- `--config PATH` – Repeatable training TOML paths. Omit to auto-discover and prompt.
- `--type {auto,rl,sft}` – Force the workflow type instead of letting the CLI infer it.
- `--env-file PATH` – Preload one or more `.env` files. Suppresses the interactive picker.
- `--task-url URL` – Override the RL task app URL (otherwise taken from env/config).
- `--dataset PATH` – Override the SFT dataset JSONL (otherwise taken from config discovery).
- `--backend URL` – Custom backend base URL (defaults to production via `get_backend_from_env`).
- `--model VALUE` – Override the model identifier in the config.
- `--allow-experimental / --no-allow-experimental` – Toggle experimental model access without editing configs.
- `--idempotency VALUE` – Provide an `Idempotency-Key` header for job creation.
- `--poll / --no-poll` – Control whether the CLI waits for terminal status (default `--poll`).
- `--poll-timeout SECONDS` – Maximum time to stream a run (default `3600` seconds).
- `--poll-interval SECONDS` – Delay between status fetches while streaming.
- `--stream-format {cli,chart}` – Choose between line-based updates and the live loss chart (default `cli`).
- `--examples VALUE` – Limit SFT training to the first N examples by uploading a truncated copy.
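To make the interplay of `--poll-timeout` and `--poll-interval` concrete, here is a generic polling loop. It is a sketch under stated assumptions: `fetch_status` is a hypothetical callable standing in for a status request, the terminal status names are invented for illustration, and `JobStreamer` itself may behave differently.

```python
import time
from typing import Callable, Optional

# Illustrative terminal states; the backend's real status names are an assumption here.
TERMINAL_STATUSES = {"succeeded", "failed", "cancelled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    timeout_s: float,
    interval_s: float,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the terminal status, or None if `timeout_s` elapses first."""
    deadline = clock() + timeout_s
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        remaining = deadline - clock()
        if remaining <= 0:
            return None
        # Never sleep past the deadline.
        sleep(min(interval_s, remaining))
```

The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting; a shorter interval gives fresher status at the cost of more requests.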
Notes
- The env resolver caches values back into the chosen `.env`, so subsequent invocations reuse the keys you confirm during the prompts.
- If job submission fails (non-2xx response) the CLI prints the request payload and backend response body before exiting.
- When multiple configs are provided, later entries are skipped if an earlier submission fails validation or backend checks.
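The required-key check for RL jobs and the write-back into the chosen `.env` can be approximated as follows. This is a sketch under assumptions: `read_env_file`, `missing_rl_keys`, and `upsert_env_value` are hypothetical helpers, the parser handles only simple `KEY=value` lines, and the real `resolve_env` may differ.

```python
from pathlib import Path
from typing import Dict, List

# Keys the documentation lists as required for RL jobs.
RL_REQUIRED_KEYS = ["SYNTH_API_KEY", "ENVIRONMENT_API_KEY", "TASK_APP_URL"]


def read_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=value lines, ignoring blanks and # comments."""
    values: Dict[str, str] = {}
    if not path.exists():
        return values
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_rl_keys(values: Dict[str, str], task_url_on_cli: bool = False) -> List[str]:
    """List required RL keys that are absent or empty."""
    required = [
        k for k in RL_REQUIRED_KEYS if not (k == "TASK_APP_URL" and task_url_on_cli)
    ]
    return [k for k in required if not values.get(k)]


def upsert_env_value(path: Path, key: str, value: str) -> None:
    """Write `key=value` back to the .env file, replacing an existing entry."""
    lines = path.read_text(encoding="utf-8").splitlines() if path.exists() else []
    replaced = False
    for i, raw in enumerate(lines):
        if raw.strip().startswith(f"{key}="):
            lines[i] = f"{key}={value}"
            replaced = True
    if not replaced:
        lines.append(f"{key}={value}")
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
```

Replacing an entry in place rather than appending keeps the file free of duplicate keys across repeated runs.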