LoRA SFT

Train larger models on constrained hardware by enabling LoRA while keeping the same SFT flow and payload schema.
  • Works via the standard SFT CLI: uvx synth-ai train --type sft --config <path>
  • Toggle with training.use_qlora = true in your TOML
  • Uses the same hyperparameter keys as FFT; the backend interprets LoRA-appropriate settings

Quickstart

uvx synth-ai train --type sft --config examples/warming_up_to_rl/configs/crafter_fft_4b.toml --dataset /abs/path/to/train.jsonl

Minimal TOML (LoRA enabled)

[job]
model = "Qwen/Qwen3-4B"
# Either set here or pass via --dataset
# data = "/abs/path/to/train.jsonl"

[compute]
gpu_type = "H100"       # required by backend
gpu_count = 1
nodes = 1

[data]
# Optional; forwarded into metadata.effective_config
topology = {}
# Optional local validation file; client uploads if present
# validation_path = "/abs/path/to/validation.jsonl"

[training]
mode = "sft_offline"
use_qlora = true         # LoRA toggle

[training.validation]
enabled = true
evaluation_strategy = "steps"
eval_steps = 20
save_best_model_at_end = true
metric_for_best_model = "val.loss"
greater_is_better = false

[hyperparameters]
n_epochs = 1
per_device_batch = 1
gradient_accumulation_steps = 64
sequence_length = 4096
learning_rate = 5e-6
warmup_ratio = 0.03

# Optional parallelism block forwarded as-is
#[hyperparameters.parallelism]
# use_deepspeed = true
# deepspeed_stage = 2
# bf16 = true
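
Save a TOML like the one above and launch it with the standard SFT command. The config path below is a placeholder; point it at your own file:

uvx synth-ai train --type sft --config /abs/path/to/lora_sft.toml --dataset /abs/path/to/train.jsonl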

What the client validates and sends

  • Validates dataset path existence (from [job].data or --dataset) and JSONL shape
  • Uploads training (and optional validation) files to /api/learning/files
  • Builds payload with:
    • model from [job].model
    • training_type = "sft_offline"
    • hyperparameters from [hyperparameters] (+ [training.validation] knobs)
    • metadata.effective_config.compute from [compute]
    • metadata.effective_config.data.topology from [data.topology]
    • metadata.effective_config.training.{mode,use_qlora} from [training]
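
Put together, the request body looks roughly like the sketch below. Values mirror the minimal TOML above; treat it as illustrative of the nesting rather than an exact wire format:

{
  "model": "Qwen/Qwen3-4B",
  "training_type": "sft_offline",
  "hyperparameters": {
    "n_epochs": 1,
    "per_device_batch": 1,
    "gradient_accumulation_steps": 64,
    "sequence_length": 4096,
    "learning_rate": 5e-6,
    "warmup_ratio": 0.03,
    "evaluation_strategy": "steps",
    "eval_steps": 20,
    "save_best_model_at_end": true,
    "metric_for_best_model": "val.loss",
    "greater_is_better": false
  },
  "metadata": {
    "effective_config": {
      "compute": { "gpu_type": "H100", "gpu_count": 1, "nodes": 1 },
      "data": { "topology": {} },
      "training": { "mode": "sft_offline", "use_qlora": true }
    }
  }
}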

Multi‑GPU guidance

  • Set [compute].gpu_type, gpu_count, and optionally nodes
  • Use [hyperparameters.parallelism] for deepspeed/FSDP/precision/TP/PP knobs; forwarded verbatim
  • Optionally add [data.topology] (e.g., container_count) for visibility; backend validates resource consistency
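
A minimal sketch of these blocks for a single node with 4 GPUs, assuming DeepSpeed ZeRO stage 2 with bf16 (values are illustrative; the parallelism table is forwarded verbatim, so use whatever keys your backend expects):

[compute]
gpu_type = "H100"
gpu_count = 4
nodes = 1

[data]
topology = { container_count = 1 }

[hyperparameters.parallelism]
use_deepspeed = true
deepspeed_stage = 2
bf16 = true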

Common issues

  • HTTP 400 missing_gpu_type: set [compute].gpu_type (and typically gpu_count) so it appears under metadata.effective_config.compute
  • Dataset not found: provide absolute path or use --dataset; the client resolves relative paths from the current working directory

Helpful CLI flags

  • --dataset to override [job].data
  • --examples N to use only the first N rows of the dataset for quick smoke tests
  • --dry-run to preview payload without submitting
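
For example, previewing the payload for a 50-example smoke test without submitting the job (paths are placeholders):

uvx synth-ai train --type sft --config /abs/path/to/lora_sft.toml --dataset /abs/path/to/train.jsonl --examples 50 --dry-run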

All sections and parameters (LoRA SFT)

The client recognizes and/or forwards the following sections:
  • [job] (client reads)
    • model (string, required): base model identifier
    • data or data_path (string): local path to the training JSONL. Required unless --dataset is passed on the command line
    • Notes:
      • Paths are resolved relative to the current working directory (CWD), not the TOML location
      • poll_seconds appears only in legacy scripts; the new CLI uses --poll-* flags instead
  • [compute] (forwarded into metadata.effective_config.compute)
    • gpu_type (string): required by backend (e.g., "H100", "A10G"). Missing this often causes HTTP 400
    • gpu_count (int): number of GPUs
    • nodes (int, optional)
  • [data] (partially read)
    • topology (table/dict): forwarded as-is to metadata.effective_config.data.topology
    • validation_path (string, optional): if present and the file exists, the client uploads it and wires validation
    • Path resolution: relative to the current working directory (CWD)
  • [training] (partially read)
    • mode (string, optional): copied into metadata.effective_config.training.mode (documentation hint)
    • use_qlora (bool): set to true for LoRA
    • [training.validation] (optional; some keys are promoted into hyperparameters)
      • enabled (bool, default true): surfaced in metadata.effective_config.training.validation.enabled
      • evaluation_strategy (string, default "steps"): forwarded into hyperparameters
      • eval_steps (int, default 0): forwarded
      • save_best_model_at_end (bool, default true): forwarded
      • metric_for_best_model (string, default "val.loss"): forwarded
      • greater_is_better (bool, default false): forwarded
  • [hyperparameters] (client reads selective keys)
    • Required/defaulted:
      • n_epochs (int, default 1)
    • Optional (forwarded if present):
      • batch_size, global_batch, per_device_batch, gradient_accumulation_steps, sequence_length, learning_rate, warmup_ratio, train_kind
    • Note: some legacy examples include world_size. The client does not forward world_size; prefer specifying per_device_batch and gradient_accumulation_steps explicitly.
    • [hyperparameters.parallelism] (dict): forwarded verbatim (e.g., use_deepspeed, deepspeed_stage, fsdp, bf16, fp16, tensor_parallel_size, pipeline_parallel_size)
  • [algorithm] (ignored by client): present in some examples for documentation; no effect on payload
Validation and error rules (client):
  • Missing dataset path -> prompt or error; dataset must exist and be valid JSONL
  • Missing gpu_type (backend rule) -> HTTP 400 at create job
  • Validation path missing -> warning; continues without validation
Payload mapping recap:
  • model from [job].model
  • training_type = "sft_offline"
  • hyperparameters from [hyperparameters] + [training.validation] select keys
  • metadata.effective_config.compute from [compute]
  • metadata.effective_config.data.topology from [data.topology]
  • metadata.effective_config.training.{mode,use_qlora} from [training]