synth-ai train validates a training config, collects the required credentials, and submits either an RL or SFT job to the Synth backend. When you omit inputs, the command is fully interactive: it asks you to pick a config, select an .env file, and confirm overrides before it contacts the API.

How it Works

  • Config discovery – discover_configs scans common directories (including the current working tree and repo examples). If you pass multiple --config flags, they are processed sequentially.
  • Environment resolution – resolve_env prompts you to choose an .env file and ensures mandatory keys are present. RL jobs require SYNTH_API_KEY, ENVIRONMENT_API_KEY, and (when not provided on the CLI) TASK_APP_URL.
  • RL preflight – Before submitting, the CLI asks the backend to verify your task app (/rl/verify_task_app) and then calls the app’s /task_info and /health endpoints with the resolved environment keys. Failures stop the run with contextual diagnostics.
  • SFT payload prep – SFT jobs can upload the dataset referenced in the config or provided via --dataset. The optional --examples limit trims a temporary copy so you can run smoke-sized training.
  • Job streaming – Successful submissions are streamed via JobStreamer. The default cli format prints status events; --stream-format chart opens the live loss panel.
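
To make the SFT payload prep step concrete, here is a minimal sketch of trimming a JSONL dataset to its first N examples in a temporary copy. It is not the CLI's actual implementation: the function name truncate_jsonl and the choice to skip blank lines are assumptions made for illustration.

```python
import tempfile
from itertools import islice
from pathlib import Path


def truncate_jsonl(dataset: Path, limit: int) -> Path:
    """Copy the first `limit` non-empty lines of `dataset` into a temp file.

    Hypothetical helper: the name and blank-line handling are assumptions,
    not taken from the synth-ai source.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    # delete=False keeps the file on disk after it is closed, so the
    # caller can upload it and remove it afterwards.
    tmp = tempfile.NamedTemporaryFile(
        mode="w", suffix=".jsonl", delete=False, encoding="utf-8"
    )
    with dataset.open("r", encoding="utf-8") as src, tmp:
        rows = (line for line in src if line.strip())
        for line in islice(rows, limit):
            # Guarantee one record per line even if the source file
            # lacks a trailing newline.
            tmp.write(line if line.endswith("\n") else line + "\n")
    return Path(tmp.name)
```

The original dataset is never modified; only the temporary copy is shortened, which matches the description of --examples as a way to run smoke-sized training.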

Usage

# RL job using an interactive .env picker
uvx synth-ai train --config configs/rl/grpo.toml

# SFT job with an explicit dataset and loss chart streaming
uvx synth-ai train \
  --config configs/sft/qwen.toml \
  --dataset datasets/crafter/train.jsonl \
  --stream-format chart

Options

  • --config PATH — Repeatable training TOML paths. Omit to auto-discover and prompt.
  • --type {auto,rl,sft} — Force the workflow type instead of letting the CLI infer it.
  • --env-file PATH — Preload one or more .env files. Suppresses the interactive picker.
  • --task-url URL — Override the RL task app URL (otherwise taken from env/config).
  • --dataset PATH — Override the SFT dataset JSONL (otherwise taken from config discovery).
  • --backend URL — Custom backend base URL (defaults to production via get_backend_from_env).
  • --model VALUE — Override the model identifier in the config.
  • --allow-experimental / --no-allow-experimental — Toggle experimental model access without editing configs.
  • --idempotency VALUE — Provide an Idempotency-Key header for job creation.
  • --poll / --no-poll — Control whether the CLI waits for terminal status (default --poll).
  • --poll-timeout SECONDS — Maximum time to stream a run (default 3600 seconds).
  • --poll-interval SECONDS — Delay between status fetches while streaming.
  • --stream-format {cli,chart} — Choose between line-based updates and the live loss chart (default cli).
  • --examples VALUE — Limit SFT training to the first N examples by uploading a truncated copy.
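
The interaction between --poll-timeout and --poll-interval can be pictured as a bounded polling loop. The sketch below is illustrative only: fetch_status is a hypothetical callable standing in for the backend status request, the terminal status names are assumed, and the 5-second default interval is a placeholder because the documented default is not stated here.

```python
import time
from typing import Callable, Optional

# Assumed terminal status names; the real backend may use different ones.
TERMINAL = {"succeeded", "failed", "cancelled"}


def poll_until_terminal(
    fetch_status: Callable[[], str],
    timeout: float = 3600.0,   # mirrors the documented --poll-timeout default
    interval: float = 5.0,     # placeholder; the real default is not documented here
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Return the terminal status, or None if the timeout elapses first."""
    deadline = clock() + timeout
    while True:
        status = fetch_status()
        if status in TERMINAL:
            return status
        # Stop if the next wait would run past the deadline.
        if clock() + interval > deadline:
            return None
        sleep(interval)
```

Passing sleep and clock as parameters keeps the loop testable without real waiting. With --no-poll, a loop like this would simply not run after submission.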

Notes

  • The env resolver caches values back into the chosen .env, so subsequent invocations reuse the keys you confirm during the prompts.
  • If job submission fails (non-2xx response), the CLI prints the request payload and backend response body before exiting.
  • When multiple configs are provided, later entries are skipped if an earlier submission fails validation or backend checks.
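
The multi-config behavior described in the last note can be sketched as a sequential loop that stops submitting after the first failure. This is an illustration under assumptions, not the CLI's code: submit_all and the submit callable are hypothetical, and the result labels are invented for the example.

```python
from typing import Callable, Dict, Iterable


def submit_all(
    configs: Iterable[str], submit: Callable[[str], None]
) -> Dict[str, str]:
    """Submit configs in order; mark the rest as skipped after a failure.

    `submit` is a hypothetical stand-in for validation plus job creation
    and is expected to raise on any validation or backend error.
    """
    results: Dict[str, str] = {}
    failed = False
    for path in configs:
        if failed:
            # An earlier config failed, so later entries are not attempted.
            results[path] = "skipped"
            continue
        try:
            submit(path)
            results[path] = "submitted"
        except Exception as exc:
            results[path] = f"failed: {exc}"
            failed = True
    return results
```

Order matters under this model: placing the config most likely to fail first avoids submitting jobs that you would want to cancel afterwards.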