This example mirrors the examples/qwen_coder/ project bundled with the SDK. It fine-tunes the 30B A3B Qwen Coder model using LoRA or QLoRA adapters.

Prerequisites

  • Task app deployed (or hosted) and accessible (TASK_APP_URL in .env).
  • uvx synth-ai setup has been run in the repo so .env contains SYNTH_API_KEY and ENVIRONMENT_API_KEY.
  • Dataset JSONL prepared in the Synth SFT schema.
  • Access to at least 2 × H200 GPUs (the configs default to this).
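The training file must contain one JSON object per line. As a rough illustration (the field names below are assumptions based on common chat-style SFT layouts, not the authoritative Synth SFT schema), a record can be written and round-trip-checked like this:

```python
import json

# Hypothetical record -- the "messages" layout is an assumption;
# consult the Synth SFT schema reference for the exact field names.
record = {
    "messages": [
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a function that reverses a string."},
        {"role": "assistant", "content": "def reverse(s):\n    return s[::-1]"},
    ]
}

with open("train.jsonl", "w") as f:
    f.write(json.dumps(record) + "\n")

# Sanity check: every line of a JSONL file must parse as standalone JSON.
with open("train.jsonl") as f:
    rows = [json.loads(line) for line in f]

print(len(rows))  # number of training examples in the file
```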

Minimal command

uvx synth-ai train \
  --config examples/qwen_coder/configs/coder_lora_30b.toml \
  --dataset /absolute/path/to/train.jsonl \
  --env-file .env

The CLI validates the dataset, uploads it, and streams training metrics; when the job completes, it prints the fine-tuned model ID (ft:Qwen/...).

Config highlights

[job]
model = "Qwen/Qwen3-Coder-30B-A3B-Instruct"

[compute]
gpu_type = "H200"
gpu_count = 2
nodes = 1

[training]
mode = "lora"
use_qlora = true  # disable for pure LoRA

[hyperparameters]
n_epochs = 1
per_device_batch = 2
gradient_accumulation_steps = 32
sequence_length = 4096
learning_rate = 5e-6

[lora]
r = 16
alpha = 32
dropout = 0.05
target_modules = ["all-linear"]

Adjust use_qlora, r, alpha, and the learning rate to explore different adapter trade-offs. For smaller runs you can switch to coder_lora_4b.toml or coder_lora_small.toml.
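With the defaults above, the effective global batch size is per_device_batch × gradient_accumulation_steps × gpu_count, which is worth checking before changing any one of the three:

```python
# Effective global batch size implied by the config above.
per_device_batch = 2
gradient_accumulation_steps = 32
gpu_count = 2

effective_batch = per_device_batch * gradient_accumulation_steps * gpu_count
print(effective_batch)  # → 128
```

If you halve gpu_count for a smaller run, doubling gradient_accumulation_steps keeps the effective batch size (and thus the learning-rate regime) unchanged.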

Tips

  • Set [hyperparameters].lora_rank or [training].max_steps when sweeping adapter size or training duration.
  • Use --idempotency to guard against duplicate job submissions in automation.
  • Upload evaluation datasets and reference this fine-tuned model in your RL configs to continue iterating.
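As a sketch of the first tip, a sweep variant of the config might override the adapter rank and cap the step count (the values here are illustrative, not recommendations):

```toml
[hyperparameters]
lora_rank = 32   # illustrative sweep value

[training]
max_steps = 500  # cap run length for quick comparisons
```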