This demo mirrors the hosted Crafter workflow. It assumes you have cloned the SDK repo (for task app code and configs) or spun up the Crafter demo via uvx synth-ai demo.

Prerequisites

  • uvx synth-ai setup has been run in the current directory.
  • Modal CLI installed and authenticated (modal token new), unless you are staying on local uvicorn.
  • Task app registered (the demo registers grpo-crafter-demo automatically).

1. Deploy the task app

uvx synth-ai deploy \
  --runtime modal \
  --task-app task_app.py \
  --modal-app modal_app.py \
  --name crafter-prod \
  --env-file .env
The CLI encrypts ENVIRONMENT_API_KEY, builds a Modal image with your code, and stores the resulting TASK_APP_BASE_URL in .env. For local testing, replace --runtime modal with --runtime local.
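Before moving on, it can help to confirm that the deploy step actually wrote TASK_APP_BASE_URL to .env. The following is a minimal sketch, not part of the synth-ai CLI: the helper names parse_env and missing_keys are hypothetical, and the parser assumes plain KEY=VALUE lines rather than the full dotenv syntax.

```python
from pathlib import Path

# Only TASK_APP_BASE_URL is documented as being written to .env by deploy.
# Add other keys here if your setup stores them in the same file (assumption).
REQUIRED_KEYS = ("TASK_APP_BASE_URL",)


def parse_env(text: str) -> dict:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        if line.startswith("export "):
            line = line[len("export "):]
        key, _, value = line.partition("=")
        # Drop surrounding whitespace and one layer of quotes.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def missing_keys(text: str, required=REQUIRED_KEYS) -> list:
    """Return the required keys that are absent or empty."""
    env = parse_env(text)
    return [key for key in required if not env.get(key)]


if __name__ == "__main__":
    env_path = Path(".env")
    text = env_path.read_text() if env_path.exists() else ""
    missing = missing_keys(text)
    print("missing:", missing if missing else "none")
```

Run it from the directory that holds .env; an empty "missing" list suggests the deploy step recorded the task app URL.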

2. Run smoke tests

uvx synth-ai smoke \
  --config configs/crafter_smoke.toml \
  --env-file .env
The smoke test resolves environment variables the same way the trainer does, and verifies that your task app can serve rollouts, return proper metadata, and log traces.

3. Launch the RL job

uvx synth-ai train \
  --config configs/rl_from_base_qwen4b.toml \
  --env-file .env
Key points:
  • --dry-run is deprecated. Run the command for real; the trainer will perform /rl/verify_task_app, /health, and /task_info checks before submitting work.
  • The CLI streams job events until completion. Press Ctrl+C if you prefer to monitor via synth-ai status jobs … later.
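If you want to probe the task app yourself before launching, the /health and /task_info checks mentioned above can be approximated by hand. This is a rough sketch under several assumptions: that both paths are served by the task app at TASK_APP_BASE_URL, that they answer plain GET requests, and that the key is sent in an X-API-Key header. None of these details are confirmed here, and the function names are hypothetical. The /rl/verify_task_app check is left out because its host and payload are not described.

```python
import os
import urllib.error
import urllib.request

# Paths named in the trainer's pre-submit checks; assumed to live on the task app.
CHECK_PATHS = ("/health", "/task_info")


def check_urls(base_url: str, paths=CHECK_PATHS) -> list:
    """Join the base URL with each check path, avoiding double slashes."""
    base = base_url.rstrip("/")
    return [base + path for path in paths]


def http_status(url: str, api_key=None, timeout: float = 10.0):
    """GET the URL and return its HTTP status, or None if unreachable."""
    # The X-API-Key header name is an assumption, not a documented contract.
    headers = {"X-API-Key": api_key} if api_key else {}
    request = urllib.request.Request(url, headers=headers)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as exc:
        return exc.code  # Server answered, but with an error status.
    except urllib.error.URLError:
        return None  # DNS failure, refused connection, timeout, etc.


def preflight(base_url: str, fetch=http_status) -> dict:
    """Map each check URL to the status reported by fetch."""
    return {url: fetch(url) for url in check_urls(base_url)}


if __name__ == "__main__":
    base = os.environ.get("TASK_APP_BASE_URL")
    if base:
        key = os.environ.get("ENVIRONMENT_API_KEY")
        results = preflight(base, lambda url: http_status(url, key))
        for url, status in results.items():
            print(status, url)
    else:
        print("TASK_APP_BASE_URL is not set; export it or load .env first")
```

The fetch parameter exists so the URL logic can be exercised without network access, for example preflight("https://example.invalid", fetch=lambda url: 200).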

4. Monitor jobs

uvx synth-ai status jobs list --status running --limit 5
uvx synth-ai status jobs logs rl_job_123 --follow
Use the status commands to tail metrics and inspect job timelines. Replace rl_job_123 with the ID of your own job.

5. Iterate

  • Adjust rewards and hyperparameters in configs/rl_from_base_qwen4b.toml.
  • Reference the latest checkpoint in [model].source once you have a good run.
  • Combine with the Rejection Loop to feed curated traces into SFT jobs.