Synth ships a set of examples that demonstrate the full lifecycle: deploy a task app, launch jobs, and turn traces into new datasets. Start with the SDK demos to understand the payloads and configs, then move to hosted runs.

Fine-Tuning

  • Fine-Tuning Demo – collect rollout data locally, assemble SFT datasets, and launch Synth training jobs.
  • Qwen Coder LoRA/QLoRA – run the Qwen Coder adapter example and learn the key config knobs for 30B models.
  • Rejection Finetuning Demo – generate rollouts, curate JSONL data, launch SFT jobs, and inspect the resulting checkpoint.
  • Synth Qwen – adapt the Crafter loop into a rejection SFT cycle targeting Synth’s Qwen base model.
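The fine-tuning demos above share one pattern: collect rollouts, keep only the good ones (rejection sampling), and write the survivors as JSONL for SFT. As a minimal sketch of that curation step, assuming a hypothetical rollout record shape with chat-style `messages` and a scalar `reward` (the actual field names in the demos may differ):

```python
import json

# Hypothetical rollout records: chat messages plus a scalar reward.
rollouts = [
    {"messages": [{"role": "user", "content": "2+2?"},
                  {"role": "assistant", "content": "4"}], "reward": 1.0},
    {"messages": [{"role": "user", "content": "3*3?"},
                  {"role": "assistant", "content": "8"}], "reward": 0.0},
]

def to_sft_jsonl(rollouts, path, min_reward=1.0):
    """Keep only high-reward rollouts (rejection sampling) and write JSONL.

    Returns the number of examples written.
    """
    kept = [r for r in rollouts if r["reward"] >= min_reward]
    with open(path, "w") as f:
        for r in kept:
            f.write(json.dumps({"messages": r["messages"]}) + "\n")
    return len(kept)
```

Each demo documents its own expected JSONL schema; treat this as an illustration of the filtering logic, not the exact format.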

Reinforcement Learning

  • Math RL Demo – train the single-step math task app, evaluate checkpoints, and experiment with reward settings.
  • On-Policy RL Demo – deploy a task app, verify hosted access, and drive an RL run end-to-end with the train CLI.
  • RL Examples – reference configs for single-step and multi-step RL workloads.
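In the single-step math setting, "reward settings" usually reduce to a grading function over the model's final answer. A minimal sketch, assuming exact-match grading (the hosted demos may use richer parsing or partial credit; `math_reward` is a hypothetical name, not an SDK function):

```python
# Hypothetical exact-match reward for a single-step math task:
# the policy emits a final answer string; reward is 1.0 on match, else 0.0.
def math_reward(completion: str, expected: str) -> float:
    return 1.0 if completion.strip() == expected.strip() else 0.0

math_reward(" 42 ", "42")  # → 1.0
math_reward("41", "42")    # → 0.0
```

Experimenting with reward settings typically means swapping this function (or its tolerance) while keeping the rest of the config fixed.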

Evaluation

  • Evals Demo – run comparative evaluations against the Crafter task app, filter traces, and export summary stats.
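The filter-and-summarize step in the Evals Demo can be sketched as grouping trace records by model and averaging their scores. This assumes a hypothetical trace shape with `model` and `score` fields; the demo's exported traces may carry different keys:

```python
from statistics import mean

# Hypothetical eval traces: one record per episode with a score in [0, 1].
traces = [
    {"model": "base", "score": 0.4},
    {"model": "base", "score": 0.6},
    {"model": "tuned", "score": 0.9},
]

def summarize(traces):
    """Group traces by model and report the mean score per model."""
    by_model = {}
    for t in traces:
        by_model.setdefault(t["model"], []).append(t["score"])
    return {m: mean(scores) for m, scores in by_model.items()}
```

The same grouping generalizes to any comparative axis (checkpoint, prompt variant, task seed) by swapping the key you group on.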

Most examples assume you have already run uvx synth-ai setup and deployed the task app to Modal. For local fine-tuning, start with the Fine-Tuning Demo.