Prerequisites
- uvx synth-ai setup has been run and your .env contains required keys.
- The task app you want to train against is registered (@register_task_app) and ready to deploy.
- Modal CLI is installed and logged in (modal token new).
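The first and third prerequisites can be checked with a short script before deploying. The sketch below is illustrative only: the helper names are hypothetical, the list of required keys is supplied by the caller (this page does not enumerate them), and it only confirms that a modal executable is on PATH, not that you are logged in.

```python
import shutil
from pathlib import Path


def read_env_keys(env_path):
    """Return the set of variable names defined in a .env-style file."""
    keys = set()
    path = Path(env_path)
    if not path.exists():
        return keys
    for raw in path.read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        name = line.split("=", 1)[0].strip()
        # Tolerate the optional "export " prefix some .env files use.
        if name.startswith("export "):
            name = name[len("export "):].strip()
        if name:
            keys.add(name)
    return keys


def missing_prerequisites(env_path, required_keys, cli_name="modal"):
    """List human-readable problems; an empty list means the checks passed."""
    problems = []
    present = read_env_keys(env_path)
    for key in required_keys:
        if key not in present:
            problems.append(f"{key} is not set in {env_path}")
    # Only checks that the executable exists, not that a token is configured.
    if shutil.which(cli_name) is None:
        problems.append(f"{cli_name} CLI not found on PATH")
    return problems
```

For example, missing_prerequisites(".env", ["ENVIRONMENT_API_KEY"]) returns an empty list when the key is present and the CLI is installed. Treating ENVIRONMENT_API_KEY as a required key here is an assumption based on its mention in the deploy step.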
1. Deploy the task app
ENVIRONMENT_API_KEY, and prints the hosted URL. Store it in .env as TASK_APP_URL so future commands find it automatically.
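Storing the URL can be scripted so that a redeploy overwrites the old value instead of appending a second entry. This is a minimal sketch with a hypothetical helper name; it assumes a plain KEY=value .env file and does not handle quoting or multi-line values.

```python
from pathlib import Path


def upsert_env_var(env_path, key, value):
    """Set key=value in a .env-style file, replacing any existing entry."""
    path = Path(env_path)
    lines = path.read_text().splitlines() if path.exists() else []
    new_line = f"{key}={value}"
    replaced = False
    for i, line in enumerate(lines):
        name = line.split("=", 1)[0].strip()
        # Commented-out entries keep their "#" in the name, so they never match.
        if "=" in line and name == key:
            lines[i] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    path.write_text("\n".join(lines) + "\n")
```

Calling upsert_env_var(".env", "TASK_APP_URL", url) with the URL printed by the deploy step leaves every other line of the file unchanged.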
2. Smoke-test with modal-serve (optional)
3. Verify wiring
Run the built-in verifications before launching RL:
--dry-run prints the payload and runs all checks except job submission: .env resolution, /rl/verify_task_app, /health, and /task_info. Fix any issues before removing --dry-run.
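If a dry run reports a failing check, it can help to probe the task app directly. The sketch below requests /health and /task_info on the deployed URL and reports which ones answer with a 2xx status. It is not the CLI's own verification logic: the function names are hypothetical, and sending the key in an X-API-Key header is an assumption about how the task app authenticates. /rl/verify_task_app is left out because this page does not say which host serves it.

```python
import urllib.error
import urllib.request


def http_status(url, headers, timeout=10.0):
    """Return the HTTP status code for a GET request, or None if unreachable."""
    request = urllib.request.Request(url, headers=headers, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # The server answered with a 4xx/5xx status.
    except (urllib.error.URLError, OSError):
        return None  # DNS failure, refused connection, timeout, and similar.


def probe_task_app(base_url, api_key, paths=("/health", "/task_info"), fetch=http_status):
    """Map each path to True when it answers with a 2xx status."""
    # Assumption: the task app expects the key in an X-API-Key header.
    headers = {"X-API-Key": api_key}
    results = {}
    for path in paths:
        status = fetch(base_url.rstrip("/") + path, headers)
        results[path] = status is not None and 200 <= status < 300
    return results
```

The fetch parameter exists so the probe can be exercised with a stub instead of a live deployment; in normal use, pass the values of TASK_APP_URL and ENVIRONMENT_API_KEY and leave fetch at its default.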
4. Launch the RL job
5. Inspect results & iterate
- Inspect checkpoints and logs; adjust reward shaping and hyperparameters
- Use --idempotency in automation to avoid duplicate job submissions
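One way to choose a value for --idempotency in automation is to derive it from the job configuration, so that resubmitting an identical config produces the same key. This is a sketch under the assumption that the flag accepts a caller-chosen string; the page does not specify the expected format, and the function name and "rl" prefix are hypothetical.

```python
import hashlib
import json


def idempotency_key(config, prefix="rl"):
    """Derive a stable key from a job config so identical submissions collide."""
    # Sorted keys and fixed separators make the serialization order-independent.
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return f"{prefix}-{digest[:32]}"
```

Because the key depends only on the config contents, a deliberate rerun of the same config needs an extra distinguishing field (for example a run label) added to the dictionary before hashing.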
Tips
- Keep environment-agnostic configs under version control; the train command embeds them into job payloads for reproducibility.
- Use --idempotency if you automate submissions and want the backend to reject accidental duplicates.