Synth ships a set of examples that demonstrate the full lifecycle: deploy a task app, launch jobs, and turn traces into new datasets. Start with the SDK demos to understand the payloads and configs, then move to hosted runs.
Fine-Tuning
- Fine-Tuning Demo – collect rollout data locally, assemble SFT datasets, and launch Synth training jobs.
- Qwen Coder LoRA/QLoRA – run the Qwen Coder adapter example and learn the key config knobs for 30B models.
- Rejection Finetuning Demo – generate rollouts, curate JSONL data, launch SFT jobs, and inspect the resulting checkpoint.
- Synth Qwen – adapt the Crafter loop into a rejection SFT cycle targeting Synth’s Qwen base model.
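The rejection fine-tuning cycle in the demos above boils down to: collect rollouts, keep only the high-reward ones, and write the survivors out as SFT-style JSONL. Here is a minimal sketch of that curation step; the `messages`/`reward` record shape and the threshold are illustrative assumptions, not Synth's actual dataset schema.

```python
import json

def build_sft_dataset(rollouts, reward_threshold=0.5):
    """Keep rollouts at or above the threshold; emit chat-style SFT rows.

    NOTE: the record fields here ("messages", "reward") are hypothetical,
    chosen for illustration rather than taken from Synth's SDK.
    """
    kept = [r for r in rollouts if r["reward"] >= reward_threshold]
    return [{"messages": r["messages"]} for r in kept]

# Two toy rollouts: one correct (reward 1.0), one rejected (reward 0.0).
rollouts = [
    {"messages": [{"role": "user", "content": "2 + 2 = ?"},
                  {"role": "assistant", "content": "4"}], "reward": 1.0},
    {"messages": [{"role": "user", "content": "2 + 2 = ?"},
                  {"role": "assistant", "content": "5"}], "reward": 0.0},
]

rows = build_sft_dataset(rollouts)
with open("sft_dataset.jsonl", "w") as f:
    for row in rows:
        f.write(json.dumps(row) + "\n")
```

The resulting JSONL file is the kind of curated dataset the demos then hand off to an SFT job.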
Reinforcement Learning
- Math RL Demo – train the single-step math task app, evaluate checkpoints, and experiment with reward settings.
- On-Policy RL Demo – deploy a task app, verify hosted access, and drive an RL run end-to-end with the train CLI.
- RL Examples – reference configs for single-step and multi-step RL workloads.
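"Single-step" in these examples means each episode is one prompt, one completion, and one scalar reward. The following sketch shows that loop in miniature; the task set, stand-in policy, and exact-match reward rule are illustrative assumptions, not Synth's task-app or trainer interfaces.

```python
def reward(expected: str, answer: str) -> float:
    """Toy reward rule: 1.0 for an exact-match answer, 0.0 otherwise."""
    return 1.0 if answer.strip() == expected else 0.0

def rollout(policy, tasks):
    """Run one pass over the tasks, returning (prompt, answer, reward) traces."""
    traces = []
    for prompt, expected in tasks:
        answer = policy(prompt)
        traces.append((prompt, answer, reward(expected, answer)))
    return traces

# Two toy math tasks and a stand-in policy that gets one of them right.
tasks = [("12 * 12 = ?", "144"), ("7 + 8 = ?", "15")]
policy = lambda prompt: "144" if "12" in prompt else "16"

traces = rollout(policy, tasks)
mean_reward = sum(r for _, _, r in traces) / len(traces)
```

An RL trainer would use the per-episode rewards to update the policy; multi-step workloads differ only in that an episode accumulates reward over several environment steps before it ends.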
Evaluation
- Evals Demo – run comparative evaluations against the Crafter task app, filter traces, and export summary stats.
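At its core, the comparative-evaluation step is grouping traces by model and computing per-model summary statistics. A minimal sketch, with a made-up trace record shape (the `model`/`reward`/`achievements` fields are assumptions for illustration, not the Crafter task app's actual trace format):

```python
from statistics import mean

# Hypothetical eval traces for a base model vs. a fine-tuned checkpoint.
traces = [
    {"model": "base",  "reward": 2.0, "achievements": 3},
    {"model": "base",  "reward": 1.0, "achievements": 1},
    {"model": "tuned", "reward": 4.0, "achievements": 6},
    {"model": "tuned", "reward": 3.0, "achievements": 5},
]

def summarize(traces, model):
    """Filter traces for one model and reduce them to summary stats."""
    subset = [t for t in traces if t["model"] == model]
    return {
        "n": len(subset),
        "mean_reward": mean(t["reward"] for t in subset),
        "mean_achievements": mean(t["achievements"] for t in subset),
    }

for model in ("base", "tuned"):
    print(model, summarize(traces, model))
```

The same filter-then-aggregate pattern scales to whatever fields the exported traces carry.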
Most examples assume you have already run uvx synth-ai setup and deployed the task app to Modal. For local fine-tuning, start with the Fine-Tuning Demo.