synth_ai.sdk.api.train.sft
Experimental
First-class SDK API for SFT (Supervised Fine-Tuning).
This module provides high-level abstractions for running SFT jobs
both via the CLI (uvx synth-ai train) and programmatically in Python scripts.
Example usage:
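A minimal sketch of both entry points; the CLI flag, config path, and import location are assumptions based on this page rather than verified against the installed package:

```python
# CLI (the --config flag is an assumption; check the `uvx synth-ai train` help output):
#   uvx synth-ai train --config configs/sft.toml
#
# Programmatic usage (import path assumed from this module's name):
from synth_ai.sdk.api.train.sft import SFTJob

job = SFTJob.from_config("configs/sft.toml")  # picks up SYNTH_API_KEY from the env
job_id = job.submit()
print(f"Submitted SFT job: {job_id}")

status = job.poll_until_complete(timeout=3600, interval=10)
print(status.get("status"), job.get_fine_tuned_model())
```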
synth_ai.sdk.api.train.configs.sft
SFT (Supervised Fine-Tuning) configuration models.
This module defines the configuration schema for SFT training jobs.
Paper: Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models (ReST-EM)
When to Use SFT
- Cloning successful AI generations (ReST-EM style self-training)
- Distilling from a larger model to a smaller one
- Training on domain-specific data (code, medical, legal, etc.)
- Teaching specific output formats or styles
- Vision fine-tuning with image-text pairs
- Training reference: /training/sft
- Quickstart: /quickstart/supervised-fine-tuning
Job API
SFTJobConfig
Configuration for an SFT job.
SFTJob
High-level SDK class for running SFT jobs.
This class provides a clean API for:
- Submitting SFT jobs
- Polling job status
- Retrieving results
from_config
Parameters:
- config_path: Path to TOML config file
- backend_url: Backend API URL (defaults to env or production)
- api_key: API key (defaults to SYNTH_API_KEY env var)
- dataset_override: Override dataset path from config
- allow_experimental: Allow experimental models
- overrides: Config overrides
Returns:
- SFTJob instance
Raises:
- ValueError: If required config is missing
- FileNotFoundError: If config file doesn't exist
from_job_id
Parameters:
- job_id: Existing job ID
- backend_url: Backend API URL (defaults to env or production)
- api_key: API key (defaults to SYNTH_API_KEY env var)
Returns:
- SFTJob instance for the existing job
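For example, to re-attach to a job that was submitted earlier (a sketch; the job ID is a placeholder and the import path is assumed from the module name):

```python
from synth_ai.sdk.api.train.sft import SFTJob

job = SFTJob.from_job_id("sft_abc123")  # placeholder job ID
print(job.get_status())                 # raw status dictionary from the backend
```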
submit
Returns:
- Job ID
Raises:
- RuntimeError: If job submission fails
job_id
get_status
Returns:
- Job status dictionary
Raises:
- RuntimeError: If job hasn't been submitted yet
poll_until_complete
Parameters:
- timeout: Maximum seconds to wait
- interval: Seconds between poll attempts
- on_status: Optional callback called on each status update
Returns:
- Final job status dictionary
Raises:
- RuntimeError: If job hasn't been submitted yet
- TimeoutError: If timeout is exceeded
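Continuing from a submitted job above, a sketch of polling with a status callback and timeout handling (the contents of the status dictionary are assumptions):

```python
def report(status: dict) -> None:
    # Called on every poll; the "status" key is an assumption about the payload shape.
    print("current status:", status.get("status"))

try:
    final = job.poll_until_complete(timeout=7200, interval=15, on_status=report)
except TimeoutError:
    print("Gave up after 2 hours; the job may still be running on the backend.")
else:
    print("final status:", final.get("status"))
```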
get_fine_tuned_model
Returns:
- Fine-tuned model ID, or None if not available
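For example, continuing from a completed job:

```python
model_id = job.get_fine_tuned_model()
if model_id is None:
    print("No fine-tuned model yet; the job may still be running or may have failed.")
else:
    print("Fine-tuned model:", model_id)  # e.g. "ft:Qwen/Qwen3-4B:sft_abc123"
```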
Configuration Reference
JobConfig
Core job configuration for SFT.
Attributes:
- model: Base model to fine-tune (e.g., "Qwen/Qwen3-4B", "meta-llama/Llama-3-8B").
- data: Dataset identifier (if using registered datasets).
- data_path: Path to JSONL training data file.
- poll_seconds: Polling interval for job status. Default: 10.
SFTDataConfig
Data configuration for SFT training.
Attributes:
- topology: Data loading topology configuration.
- validation_path: Path to validation JSONL file for eval during training.
TrainingValidationConfig
Validation configuration during training.
Attributes:
- enabled: Enable validation during training. Default: False.
- evaluation_strategy: When to evaluate - "steps" or "epoch".
- eval_steps: Evaluate every N steps (if strategy is "steps").
- save_best_model_at_end: Save only the best model checkpoint.
- metric_for_best_model: Metric to use for best model selection (e.g., "eval_loss").
- greater_is_better: Whether higher metric is better. Default: False for loss.
TrainingConfig
Training mode configuration.
Attributes:
- mode: Training mode - "lora", "qlora", or "full".
- use_qlora: Enable QLoRA (4-bit quantized LoRA). Default: False.
- validation: Validation configuration.
- lora: LoRA hyperparameters (r, alpha, dropout, target_modules).
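As an illustration, a training section expressed as a Python mapping (a TOML file would use the same nesting); the key names under lora and validation, and the target_modules values, are assumptions based on the attributes listed above:

```python
training = {
    "mode": "lora",              # "lora", "qlora", or "full"
    "use_qlora": False,
    "lora": {                    # key names assumed from the attribute list above
        "r": 16,
        "alpha": 32,
        "dropout": 0.05,
        "target_modules": ["q_proj", "k_proj", "v_proj", "o_proj"],
    },
    "validation": {
        "enabled": True,
        "evaluation_strategy": "steps",
        "eval_steps": 50,
        "save_best_model_at_end": True,
        "metric_for_best_model": "eval_loss",
        "greater_is_better": False,
    },
}
```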
HyperparametersParallelism
Parallelism configuration for distributed training.
Attributes:
- use_deepspeed: Enable DeepSpeed. Default: False.
- deepspeed_stage: DeepSpeed ZeRO stage (1, 2, or 3).
- fsdp: Enable PyTorch FSDP. Default: False.
- bf16: Use bfloat16 precision. Default: True on supported hardware.
- fp16: Use float16 precision. Default: False.
- activation_checkpointing: Enable gradient checkpointing. Default: False.
- tensor_parallel_size: Tensor parallelism degree.
- pipeline_parallel_size: Pipeline parallelism degree.
HyperparametersConfig
Training hyperparameters for SFT.
Attributes:
- n_epochs: Number of training epochs. Default: 1.
- batch_size: Training batch size (alias for global_batch).
- global_batch: Global batch size across all GPUs.
- per_device_batch: Per-device batch size.
- gradient_accumulation_steps: Steps to accumulate gradients. Default: 1.
- sequence_length: Maximum sequence length. Default: 2048.
- learning_rate: Optimizer learning rate (e.g., 2e-5).
- warmup_ratio: Fraction of steps for LR warmup. Default: 0.1.
- train_kind: Training variant (advanced).
- weight_decay: Weight decay coefficient. Default: 0.01.
- parallelism: Distributed training configuration.
SFTConfig
Root configuration for SFT (Supervised Fine-Tuning) jobs.
This is the top-level config loaded from a TOML file.
Attributes:
- algorithm: Algorithm configuration (type="offline", method="sft").
- job: Core job configuration (model, data_path).
- policy: Policy configuration (preferred over job.model).
- compute: GPU and compute configuration.
- data: Data loading configuration.
- training: Training mode (lora, full) and LoRA config.
- hyperparameters: Training hyperparameters.
- lora: Deprecated - use training.lora instead.
- tags: Optional metadata tags.
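A sketch of a complete config as a Python mapping, using only the sections listed above; every key name and value here is illustrative, and the compute and tags contents in particular are placeholders:

```python
sft_config_mapping = {
    "algorithm": {"type": "offline", "method": "sft"},
    "job": {
        "model": "Qwen/Qwen3-4B",
        "data_path": "data/train.jsonl",   # placeholder path
        "poll_seconds": 10,
    },
    "compute": {},                          # GPU settings omitted; backend-specific
    "data": {"validation_path": "data/val.jsonl"},  # placeholder path
    "training": {"mode": "lora"},           # see the training sketch above for details
    "hyperparameters": {
        "n_epochs": 3,
        "global_batch": 32,
        "per_device_batch": 4,
        "gradient_accumulation_steps": 1,
        "sequence_length": 2048,
        "learning_rate": 2e-5,
        "warmup_ratio": 0.1,
        "weight_decay": 0.01,
        "parallelism": {"bf16": True, "use_deepspeed": False},
    },
    "tags": {"project": "demo"},            # placeholder tag
}
```

A TOML file with the same table structure can be passed to SFTJob.from_config or SFTConfig.from_path.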
After training completes, you receive a result dict:

{
    "status": "succeeded",
    "model_id": "ft:Qwen/Qwen3-4B:sft_abc123",
    "final_loss": 0.42,
    "checkpoints": [
        {"epoch": 1, "loss": 0.65, "path": "…"},
        {"epoch": 2, "loss": 0.52, "path": "…"},
        {"epoch": 3, "loss": 0.42, "path": "…"},
    ],
}
to_dict
from_mapping
Parameters:
- data: Dictionary or TOML mapping with configuration.
Returns:
- Validated SFTConfig instance.
from_path
Parameters:
- path: Path to the TOML configuration file.
Returns:
- Validated SFTConfig instance.
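For example (import path taken from the module name above; tomllib ships with Python 3.11+ and is only needed for the from_mapping variant):

```python
import tomllib

from synth_ai.sdk.api.train.configs.sft import SFTConfig  # import path from this page

# Validate directly from a TOML file...
config = SFTConfig.from_path("configs/sft.toml")

# ...or from an already-parsed mapping.
with open("configs/sft.toml", "rb") as f:
    config = SFTConfig.from_mapping(tomllib.load(f))

print(config.to_dict())
```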