Synth ships a set of examples that demonstrate the full lifecycle: deploy a task app, launch jobs, and turn traces into new datasets. Start with the SDK demos to understand the payloads and configs, then move to hosted runs.
Fine-Tuning
- Fine-Tuning Demo – collect rollout data locally, assemble SFT datasets, and launch Synth training jobs.
- Qwen Coder LoRA/QLoRA – run the Qwen Coder adapter example and learn the key config knobs for 30B models.
- Rejection Finetuning Demo – generate rollouts, curate JSONL data, launch SFT jobs, and inspect the resulting checkpoint.
- Synth Qwen – adapt the Crafter loop into a rejection SFT cycle targeting Synth’s Qwen base model.
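The rejection fine-tuning cycle in the demos above boils down to: collect rollouts, keep only the high-reward ones, and write the survivors out as SFT-style JSONL. Here is a minimal sketch of that curation step; the `messages`/`reward` record shape and the threshold are illustrative assumptions, not Synth's actual dataset schema.

```python
import json

def build_sft_dataset(rollouts, reward_threshold=0.5):
    """Keep rollouts at or above the threshold; emit chat-style SFT rows.

    NOTE: the record fields here ("messages", "reward") are hypothetical,
    chosen for illustration rather than taken from Synth's SDK.
    """
    kept = [r for r in rollouts if r["reward"] >= reward_threshold]
    return [{"messages": r["messages"]} for r in kept]

# Two toy rollouts: one correct (reward 1.0), one rejected (reward 0.0).
rollouts = [
    {"messages": [{"role": "user", "content": "2 + 2 = ?"},
                  {"role": "assistant", "content": "4"}], "reward": 1.0},
    {"messages": [{"role": "user", "content": "2 + 2 = ?"},
                  {"role": "assistant", "content": "5"}], "reward": 0.0},
]

rows = build_sft_dataset(rollouts)
with open("sft_dataset.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```

The resulting JSONL file is the kind of curated dataset the demos then hand off to an SFT job.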
Reinforcement Learning
- Math RL Demo – train the single-step math task app, evaluate checkpoints, and experiment with reward settings.
- On-Policy RL Demo – deploy a task app, verify hosted access, and drive an RL run end-to-end with the train CLI.
- RL Examples – reference configs for single-step and multi-step RL workloads.
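"Single-step" in these examples means each episode is one prompt, one completion, and one scalar reward. The following sketch shows that loop in miniature; the task set, stand-in policy, and exact-match reward rule are illustrative assumptions, not Synth's task-app or trainer interfaces.

```python
def reward(expected: str, answer: str) -> float:
    """Toy reward rule: 1.0 for an exact-match answer, 0.0 otherwise."""
    return 1.0 if answer.strip() == expected else 0.0

def rollout(policy, tasks):
    """Run one pass over the tasks, returning (prompt, answer, reward) traces."""
    traces = []
    for prompt, expected in tasks:
        answer = policy(prompt)
        traces.append((prompt, answer, reward(expected, answer)))
    return traces

# Two toy math tasks and a stand-in policy that gets one of them right.
tasks = [("12 * 12 = ?", "144"), ("7 + 8 = ?", "15")]
policy = lambda prompt: "144" if "12" in prompt else "16"

traces = rollout(policy, tasks)
mean_reward = sum(r for _, _, r in traces) / len(traces)
```

An RL trainer would use the per-episode rewards to update the policy; multi-step workloads differ only in that an episode accumulates reward over several environment steps before it ends.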
Evaluation
- Evals Demo – run comparative evaluations against the Crafter task app, filter traces, and export summary stats.
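At its core, the comparative-evaluation step is grouping traces by model and computing per-model summary statistics. A minimal sketch, with a made-up trace record shape (the `model`/`reward`/`achievements` fields are assumptions for illustration, not the Crafter task app's actual trace format):

```python
from statistics import mean

# Hypothetical eval traces for a base model vs. a fine-tuned checkpoint.
traces = [
    {"model": "base",  "reward": 2.0, "achievements": 3},
    {"model": "base",  "reward": 1.0, "achievements": 1},
    {"model": "tuned", "reward": 4.0, "achievements": 6},
    {"model": "tuned", "reward": 3.0, "achievements": 5},
]

def summarize(traces, model):
    """Filter traces for one model and reduce them to summary stats."""
    subset = [t for t in traces if t["model"] == model]
    return {
        "n": len(subset),
        "mean_reward": mean(t["reward"] for t in subset),
        "mean_achievements": mean(t["achievements"] for t in subset),
    }

for model in ("base", "tuned"):
    print(model, summarize(traces, model))
```

The same filter-then-aggregate pattern scales to whatever fields the exported traces carry.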
Most examples assume you have already run uvx synth-ai setup and deployed the task app to Modal. For local fine-tuning, start with the Fine-Tuning Demo.