Package your coding environment as a Dockerfile, deploy it to Harbor, and run Codex or Claude Code against it — all from one SDK call or CLI command. Harbor builds a Daytona snapshot from your image so each rollout gets a fresh, isolated sandbox that provisions in seconds.

Architecture

What happens under the hood:
  1. You upload a Dockerfile + build context to Harbor
  2. Harbor builds a Daytona snapshot (cached after first build)
  3. Each rollout provisions a fresh sandbox from the snapshot (~3s)
  4. Your chosen coding agent runs inside the sandbox
  5. Tests execute; results and LLM traces are returned

Prerequisites

  • Python 3.11+
  • uv or pip
  • API keys:
    • SYNTH_API_KEY — your Synth platform key
    • OPENAI_API_KEY — for Codex agent (or ANTHROPIC_API_KEY for Claude Code)

1. Install the SDK

pip install synth-ai
# or
uv add synth-ai

2. Write your Dockerfile

Create a Dockerfile for your coding task environment. This defines the container image your agent will work inside.
Dockerfile
FROM python:3.11-slim

# Install system dependencies
RUN apt-get update && apt-get install -y git curl && rm -rf /var/lib/apt/lists/*

# Copy your project code
WORKDIR /workspace
COPY . .

# Install project dependencies
RUN pip install -r requirements.txt

# Default entrypoint for Harbor rollouts
CMD ["run_rollout", "--input", "/tmp/rollout.json", "--output", "/tmp/result.json"]
For a Rust project:
Dockerfile
FROM rust:1.82-slim

RUN apt-get update && apt-get install -y git curl pkg-config && rm -rf /var/lib/apt/lists/*

WORKDIR /workspace
COPY . .

# Pre-compile to warm the build cache; ignore failures so the image still builds
RUN cargo build --release 2>/dev/null || true

CMD ["run_rollout", "--input", "/tmp/rollout.json", "--output", "/tmp/result.json"]
Harbor automatically injects LLM API keys via the interceptor. Do not bake OPENAI_API_KEY or ANTHROPIC_API_KEY into your Dockerfile — the SDK will reject it.
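
The run_rollout entrypoint referenced in the CMD is a script you provide: Harbor writes the rollout input to /tmp/rollout.json and collects the result from /tmp/result.json. Below is a minimal, hypothetical sketch of such a script; the exact input/output schema is an assumption modeled on the rollout payloads in step 4, so adapt the field names to your task.
run_rollout (Python)
import json
import subprocess

def main() -> None:
    # Read the rollout input from the path given in the CMD above.
    with open("/tmp/rollout.json") as f:
        rollout = json.load(f)
    seed = rollout.get("env", {}).get("seed", 0)  # e.g. select a task variant

    # Run the project's test suite and map pass/fail to a scalar reward.
    proc = subprocess.run(["pytest", "-q"], cwd="/workspace")
    reward = 1.0 if proc.returncode == 0 else 0.0

    # Write the result where Harbor expects it.
    with open("/tmp/result.json", "w") as f:
        json.dump({"seed": seed, "reward_info": {"outcome_reward": reward}}, f)

if __name__ == "__main__":
    main()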

3. Upload and build the deployment

Python SDK

from synth_ai.sdk.harbor import HarborBuildSpec, upload_harbor_deployment

spec = HarborBuildSpec(
    name="my-coding-task-v1",
    dockerfile_path="./Dockerfile",
    context_dir=".",
    entrypoint="run_rollout --input /tmp/rollout.json --output /tmp/result.json",
    limits={
        "timeout_s": 600,
        "cpu_cores": 4,
        "memory_mb": 8192,
    },
    env_vars={
        "RUST_BACKTRACE": "1",
    },
    metadata={
        "agent_type": "codex",
        "project": "my-project",
    },
)

# Upload and wait for the image to build
result = upload_harbor_deployment(spec, wait_for_ready=True)
print(f"Deployment ready: {result.deployment_id}")
print(f"Name: {result.name}")
print(f"Status: {result.status}")

CLI

synth harbor upload \
  --name my-coding-task-v1 \
  --dockerfile ./Dockerfile \
  --context . \
  --wait
The first build takes 2-10 minutes depending on your image size. Subsequent rollouts reuse the cached Daytona snapshot and provision in ~3 seconds.
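
If you would rather not block on --wait or wait_for_ready=True, you can poll the build yourself. A minimal sketch using the get_deployment_status call from step 6; the status values follow the Deployment Lifecycle table below.

import time

from synth_ai.sdk.harbor import HarborDeploymentUploader

uploader = HarborDeploymentUploader()
while True:
    status = uploader.get_deployment_status("my-coding-task-v1")["status"]
    if status == "ready":
        break
    if status == "failed":
        raise RuntimeError("Harbor build failed; check the build logs and fix the Dockerfile")
    time.sleep(15)  # first builds take 2-10 minutes, so poll sparingly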

4. Run agent rollouts

CLI — Codex with GPT-4.1

# Run 10 rollouts with Codex
synth harbor run my-coding-task-v1 \
  --seeds 10 \
  --model gpt-4.1-mini \
  --timeout 300
# Run specific seeds
synth harbor run my-coding-task-v1 \
  --seed 0 --seed 5 --seed 10 \
  --model gpt-4.1 \
  --timeout 600

Python SDK — Codex

import httpx
import os

SYNTH_API_KEY = os.environ["SYNTH_API_KEY"]
BACKEND_URL = "https://api.usesynth.ai"
DEPLOYMENT = "my-coding-task-v1"

headers = {
    "Authorization": f"Bearer {SYNTH_API_KEY}",
    "Content-Type": "application/json",
}

# Run a single rollout
response = httpx.post(
    f"{BACKEND_URL}/api/harbor/deployments/{DEPLOYMENT}/rollout",
    json={
        "run_id": "my-run-001",
        "trace_correlation_id": "my-run-001-s0",
        "env": {
            "seed": 0,
            "env_name": "harbor",
            "config": {},
        },
        "policy": {
            "config": {
                "model": "gpt-4.1-mini",
                "inference_url": "https://api.openai.com/v1",
                "provider": "openai",
            },
        },
    },
    headers=headers,
    timeout=600.0,
)

result = response.json()
reward = result.get("reward_info", {}).get("outcome_reward", 0.0)
print(f"Reward: {reward}")

Python SDK — Claude Code

response = httpx.post(
    f"{BACKEND_URL}/api/harbor/deployments/{DEPLOYMENT}/rollout",
    json={
        "run_id": "claude-run-001",
        "trace_correlation_id": "claude-run-001-s0",
        "env": {
            "seed": 0,
            "env_name": "harbor",
            "config": {},
        },
        "policy": {
            "config": {
                "model": "claude-sonnet-4-20250514",
                "provider": "anthropic",
            },
        },
    },
    headers=headers,
    timeout=600.0,
)
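
The response shape matches the Codex rollout, so extract the reward the same way. It is also worth surfacing HTTP errors before parsing; the sketch below uses httpx's standard raise_for_status.

response.raise_for_status()  # fail fast on 4xx/5xx instead of parsing an error body
result = response.json()
reward = result.get("reward_info", {}).get("outcome_reward", 0.0)
print(f"Reward: {reward}")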

5. Run multiple rollouts in batch

import concurrent.futures

seeds = list(range(10))

def run_seed(seed: int) -> dict:
    resp = httpx.post(
        f"{BACKEND_URL}/api/harbor/deployments/{DEPLOYMENT}/rollout",
        json={
            "run_id": f"batch-{DEPLOYMENT[:8]}",
            "trace_correlation_id": f"batch-{DEPLOYMENT[:8]}-s{seed}",
            "env": {"seed": seed, "env_name": "harbor", "config": {}},
            "policy": {
                "config": {
                    "model": "gpt-4.1-mini",
                    "provider": "openai",
                },
            },
        },
        headers=headers,
        timeout=600.0,
    )
    data = resp.json()
    reward = data.get("reward_info", {}).get("outcome_reward", 0.0)
    return {"seed": seed, "reward": reward}

with concurrent.futures.ThreadPoolExecutor(max_workers=6) as pool:
    futures = [pool.submit(run_seed, s) for s in seeds]
    results = [f.result() for f in concurrent.futures.as_completed(futures)]

rewards = [r["reward"] for r in results]
print(f"Mean reward: {sum(rewards) / len(rewards):.3f}")
print(f"Results: {sorted(results, key=lambda r: r['seed'])}")

6. Check deployment status

CLI

synth harbor list
synth harbor status my-coding-task-v1

Python SDK

from synth_ai.sdk.harbor import HarborDeploymentUploader

uploader = HarborDeploymentUploader()
status = uploader.get_deployment_status("my-coding-task-v1")
print(f"Status: {status['status']}")
print(f"Snapshot: {status.get('snapshot_id')}")

Agent Comparison

| Agent | Model | Best For | CLI Example |
| --- | --- | --- | --- |
| Codex | gpt-4.1-mini, gpt-4.1 | Fast iteration, OpenAI models | synth harbor run DEPLOY --model gpt-4.1-mini |
| Claude Code | claude-sonnet-4-* | Complex reasoning, Anthropic models | synth harbor run DEPLOY --model claude-sonnet-4-20250514 |
| OpenCode | gpt-4.1-mini, claude-sonnet-4-* | Multi-provider flexibility | synth harbor run DEPLOY --model gpt-4.1-mini |

Deployment Lifecycle

| Status | Meaning |
| --- | --- |
| pending | Deployment created, build not started |
| building | Daytona snapshot is being built from your Dockerfile |
| ready | Snapshot cached, rollouts can run (~3s provisioning) |
| failed | Build failed — check logs, fix Dockerfile, re-trigger |

Troubleshooting

  • Build fails — Check your Dockerfile builds locally first: docker build -t test .
  • “LLM API key in env_vars” — Remove OPENAI_API_KEY / ANTHROPIC_API_KEY from your Dockerfile and env_vars. Harbor injects these automatically via the interceptor.
  • Rollout timeouts — Increase timeout_s in limits and the --timeout CLI flag. Complex coding tasks may need 600s+.
  • Slow first rollout — The first rollout after a build provisions the Daytona snapshot. Subsequent rollouts reuse the cached snapshot (~3s).
  • Agent can’t find files — Make sure your WORKDIR in the Dockerfile matches where the agent expects to find code.

Next Steps

  • Optimize instructions with GEPA: Use Harbor deployments as the execution backend for coding agent optimization to evolve AGENTS.md and skills files.
  • Add custom evaluation: Write test suites that return pass/fail for automated reward scoring.
  • Use Environment Pools: For pre-provisioned pools of containers, see the Environment Pools SDK reference.
