RL Examples

Synth bundles several reinforcement-learning demos that mirror the production workflows supported by the CLI. Each example lives in the SDK repository under examples/, complete with task apps, configs, and helper scripts. Pick an example that matches your goal:

Math Single-Step: spin up the math tool-use environment, deploy it locally or on Modal, and launch an RL job with live metrics.
Crafter On-Policy Loop: deploy the Crafter task app, run smoke checks, and execute a full on-policy RL cycle with tracing.
Evaluation Playbook: run hosted evaluations, export traces, and convert them into SFT-ready datasets.
Reference Configs: browse the RL TOML files and task app entry points that ship with examples/.

All commands assume you have run uvx synth-ai setup and have a .env with at least SYNTH_API_KEY and ENVIRONMENT_API_KEY. When working outside the example directories, pass --env-file path/to/.env so the CLI loads the right credentials.
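
Before launching a job, it can help to confirm that the .env file actually defines both keys. The snippet below is a minimal sketch, not part of the Synth SDK: the helper name missing_env_keys and its simple KEY=VALUE parsing are assumptions made for illustration, and only the two key names come from the requirement above.

```python
from pathlib import Path

# Key names taken from the prerequisite above; everything else is illustrative.
REQUIRED_KEYS = ("SYNTH_API_KEY", "ENVIRONMENT_API_KEY")


def missing_env_keys(env_path, required=REQUIRED_KEYS):
    """Return the required keys that are absent or empty in a .env file.

    Hypothetical helper: handles plain KEY=VALUE lines, optional
    'export ' prefixes, comments, and single- or double-quoted values.
    """
    found = {}
    for raw in Path(env_path).read_text().splitlines():
        line = raw.strip()
        # Skip blanks, comments, and lines that are not assignments.
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        found[key.strip()] = value.strip().strip("'\"")
    # A key counts as missing when it is undefined or set to an empty string.
    return [key for key in required if not found.get(key)]


if __name__ == "__main__":
    missing = missing_env_keys(".env")
    if missing:
        print("Missing keys:", ", ".join(missing))
    else:
        print("Both required keys are present.")
```

If the check passes for a file outside the example directories, that same path is the one to hand to --env-file.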

Looking for fine-tuning workflows? Head over to the FT examples.
