This guide walks you through the practical steps to collect rollouts from a task app, export them into SFT-ready JSONL files, and validate the output before launching a fine-tuning run. Everything below focuses on user-facing tasks; implementation details are only mentioned when they affect what you need to do.

1. Prepare Your Task App

  1. Write or pick a TaskAppConfig (a .py file that registers your app in synth_ai.task.apps.registry).
  2. Create a .env file containing at least:
    ENVIRONMENT_API_KEY=...
    SYNTH_API_KEY=...
    
    Use synth-ai setup or a password manager to keep these up to date.
  3. Sanity-check the app locally (optional but recommended). Set an output directory so successful rollouts can emit JSONL immediately:
    export TASKAPP_SFT_OUTPUT_DIR=traces/sft_records
    uvx synth-ai deploy \
      --task-app task_apps/my_app.py \
      --runtime local \
      --env .env
    
    This starts uvicorn, enables tracing by default, and ensures the basic endpoints respond. Stop here if the app fails; SFT data collection depends on a healthy task app.
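Before deploying, it can help to confirm that the .env file actually defines both required keys. The sketch below is a simplification for illustration only (real .env files can also use quoting and `export` prefixes, which this parser ignores):

```python
from pathlib import Path

REQUIRED_KEYS = {"ENVIRONMENT_API_KEY", "SYNTH_API_KEY"}

def check_env_file(path: str) -> set[str]:
    """Return the required keys missing from a simple KEY=VALUE .env file."""
    found = set()
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blanks, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        if value.strip():  # only count keys with a non-empty value
            found.add(key.strip())
    return REQUIRED_KEYS - found
```

Running `check_env_file(".env")` before a deploy gives you an empty set on success, or the names of any keys you still need to fill in.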

2. Deploy for Rollouts

Pick the runtime best suited for the collection session:
  • Local (--runtime local): fastest iteration while you develop. Tracing is auto-enabled; set TASKAPP_SFT_OUTPUT_DIR to choose where JSONL batches land (for example traces/sft_records).
  • Modal (--runtime modal): deploys to Modal servers so teammates can collect rollouts or run larger batches. Provide --modal-app and (optionally) --name to map to an existing Modal config.
Example Modal deploy:
uvx synth-ai deploy \
  --task-app task_apps/my_app.py \
  --runtime modal \
  --modal-app task_apps/my_app_modal.py \
  --env .env \
  --name my-app-prod
The deploy command validates the task app, ensures SYNTH_API_KEY and ENVIRONMENT_API_KEY are loaded (from the CLI environment or the supplied .env), and prints the URL of the running task app. Keep that URL for the rollout step.

3. Collect Rollouts with Tracing Enabled

Rollouts are what power SFT. You can gather them manually, via automation, or by sharing the task app with labelers:
  1. Confirm tracing is still on:
    • Local deploys run with --trace enabled by default (TASKAPP_TRACING_ENABLED=1).
    • You must set TASKAPP_SFT_OUTPUT_DIR (or SFT_OUTPUT_DIR) yourself before launching if you want JSONL written to disk; otherwise resolve_sft_output_dir() returns None and the task app skips SFT dumps.
    • Tracing_v3 automatically provisions a SQLite database under traces/ (see TRACE_DB_DIR in synth_ai/tracing_v3/constants.py).
    • Modal deploys rely on your Modal app’s tracing settings; export the same env vars there if you need local JSONL files.
  2. Point your collector at the task app. Common options:
    • uvx synth-ai eval ... --task-app-url <URL> to run scripted rollouts.
    • Custom agents or labeler tools hitting /rollout and /states.
  3. Run enough variety. Aim for dozens to hundreds of sessions that demonstrate the behaviors you care about. The tracing backend records:
    • Session metadata (model, env, seed, scores, timestamps)
    • Message streams (system/user/assistant/tool turns)
    • Outcome rewards and judge scores if your app sets them
  4. Verify traces exist:
    • Inspect the directory you pointed TASKAPP_SFT_OUTPUT_DIR at (for example traces/sft_records) to make sure JSONL batches are appearing.
    • Look under traces/ for the trace database file (default names reference TRACE_DB_DIR) or whatever path you configured; confirm its timestamp updates as you run rollouts.
    • For Modal deployments, download the trace DB or connect via the remote database URL that your Modal app prints.
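The JSONL check in step 4 can be scripted so you can watch record counts grow as rollouts run. A minimal sketch, assuming flat *.jsonl batch files in the output directory (the layout is an assumption, not a guaranteed convention):

```python
import json
from pathlib import Path

def count_sft_records(output_dir: str) -> int:
    """Count JSON lines across all *.jsonl files in an SFT output directory."""
    total = 0
    for jsonl_path in sorted(Path(output_dir).glob("*.jsonl")):
        with jsonl_path.open() as fh:
            for line in fh:
                if line.strip():
                    json.loads(line)  # raises if a record is corrupt
                    total += 1
    return total
```

Call it between rollout batches (for example `count_sft_records("traces/sft_records")`); a number that stops increasing usually means tracing or the output dir is misconfigured.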

4. Export SFT JSONL with synth-ai filter

Once you have traces, convert them into training examples:
  1. Write a [filter] TOML describing where to read from and where to write (point db at whichever .db file lives under your TRACE_DB_DIR, for example traces/trace_records.db or a timestamped variant):
    [filter]
    db = "traces/trace_records.db"
    output = "ft_data/my_app_sft.jsonl"
    splits = ["train"]
    min_official_score = 0.5
    min_judge_scores = { "accuracy" = 0.7 }
    limit = 500
    
    Supported filter knobs include:
    • splits, task_ids, models
    • Score thresholds (min_official_score, max_official_score, per-judge min/max)
    • Time windows (min_created_at, max_created_at), pagination (limit, offset), shuffle/shuffle_seed
  2. Run the filter command:
uvx synth-ai filter --config configs/filter-my-app.toml
The CLI:
  • Validates the config (FilterConfig) and ensures output ends with .jsonl or .json.
  • Connects to the tracing DB via SessionTracer.
  • Applies your filters session by session.
  • Emits one JSONL record per qualifying conversation turn, each shaped like:
    {
      "messages": [
        {"role": "system", "content": "..."},
        {"role": "user", "content": "..."},
        {"role": "assistant", "content": "..."}
      ],
      "metadata": {
        "session_id": "...",
        "env_name": "...",
        "model": "...",
        "total_reward": 3.0,
        "created_at": "2024-02-01T12:34:56Z"
      }
    }
    
  3. Review the output: open a few rows to confirm the instructions/responses look correct and that metadata contains the fields you need downstream.
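That review can be partly automated by loading a few rows and checking them against the record shape shown above. A minimal sketch; the field names follow the example record and should be treated as illustrative:

```python
import json
from pathlib import Path

def peek_sft_jsonl(path: str, n: int = 3) -> list[dict]:
    """Return the first n records, asserting each has usable messages."""
    records = []
    with Path(path).open() as fh:
        for line in fh:
            if not line.strip():
                continue
            record = json.loads(line)
            assert record.get("messages"), "record has no messages"
            roles = [m["role"] for m in record["messages"]]
            assert "assistant" in roles, "no assistant turn to learn from"
            records.append(record)
            if len(records) >= n:
                break
    return records
```

Printing the returned records (for example with `json.dumps(..., indent=2)`) is usually enough to spot truncated prompts or missing metadata before training.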

5. Validate the Dataset Before Training

The SFT CLI automatically validates JSONL files, but you can pre-check them to catch issues early:
uv run python - <<'PY'
from pathlib import Path
from synth_ai.train.utils import validate_sft_jsonl

validate_sft_jsonl(Path("ft_data/my_app_sft.jsonl"))
PY
Validation enforces:
  • Each line is valid JSON with at least one message.
  • Roles are limited to system, user, assistant, or tool.
  • Tool definitions/calls include required names/function signatures when present.
  • Multimodal payloads follow the structure expected by the trainer.
Fix any reported issues (missing messages, malformed payloads, etc.) and rerun the validator. Only move to training once the dataset passes.
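If you want a quick pre-check without importing the library, the role and structure rules above are easy to mirror. The following is a simplified stand-in for illustration, not the validate_sft_jsonl implementation (it skips the tool-signature and multimodal checks):

```python
import json
from pathlib import Path

ALLOWED_ROLES = {"system", "user", "assistant", "tool"}

def precheck_sft_jsonl(path: str) -> list[str]:
    """Return human-readable problems found; an empty list means the file passes."""
    problems = []
    for lineno, line in enumerate(Path(path).read_text().splitlines(), start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append(f"line {lineno}: invalid JSON ({exc})")
            continue
        messages = record.get("messages") or []
        if not messages:
            problems.append(f"line {lineno}: no messages")
        for msg in messages:
            if msg.get("role") not in ALLOWED_ROLES:
                problems.append(f"line {lineno}: bad role {msg.get('role')!r}")
    return problems
```

Even when this pre-check passes, still run the library validator, which enforces the full rule set.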

6. Next Steps: Train with synth-ai train --type sft

With a validated JSONL, you can update your SFT config to point at the new dataset (or pass --dataset to the CLI) and launch training:
uvx synth-ai train \
  --type sft \
  --config configs/sft/my_app.toml \
  --dataset ft_data/my_app_sft.jsonl \
  --env .env
The training command will:
  • Validate the dataset again.
  • Upload it to the Synth backend.
  • Create and start the learning job.
  • Stream progress (train.loss, validation summaries) until completion.

Checklist

  • Task app validated (local or Modal) via synth-ai deploy.
  • Tracing enabled (JSONL directory + trace DB confirmed).
  • Rollouts collected with diverse prompts and seeds.
  • Filter TOML written and run; JSONL exported with expected metadata.
  • Dataset validated with validate_sft_jsonl.
  • SFT config points to the new dataset, ready for synth-ai train.
Following these steps ensures the SFT pipeline has clean, reproducible data without surprises once training begins.