Overview

Containers are HTTP services that evaluate prompts for Synth AI’s optimization algorithms. They implement a simple contract: receive rollout requests, execute episodes, and return rewards.

High-Level Architecture

Key Components:
  1. Container - Your HTTP service (any language)
  2. Synth AI Backend - Coordinates optimization
  3. GEPA Optimizer - Evolutionary search engine
  4. Interceptor - Prompt transformation layer
  5. LLM Provider - Inference endpoint

Container Contract

Required Endpoints

GET /health
  • Liveness probe (unauthenticated OK)
  • Returns: {"healthy": true}
GET /task_info
  • Dataset metadata (authenticated)
  • Returns: Task description, seeds, rubric, inference mode
POST /rollout
  • Execute one episode (authenticated)
  • Input: trace_correlation_id, env.seed, policy.config
  • Returns: Trajectories, metrics, rewards
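The three endpoints above can be sketched as plain handler functions, independent of any web framework. This is an illustrative sketch: `EXPECTED_API_KEY` and the payload fields beyond those listed on this page are assumptions, not the exact schema.

```python
EXPECTED_API_KEY = "secret-key"  # hypothetical: your ENVIRONMENT_API_KEY

def handle_health():
    # GET /health - liveness probe, no auth required
    return 200, {"healthy": True}

def handle_task_info(headers):
    # GET /task_info - dataset metadata, authenticated via X-API-Key
    if headers.get("X-API-Key") != EXPECTED_API_KEY:
        return 401, {"error": "unauthorized"}
    return 200, {
        "task": "banking77-classification",  # illustrative values
        "seeds": [0, 1, 2],
        "rubric": "exact-match",
        "inference_mode": "chat",
    }

def handle_rollout(headers, body):
    # POST /rollout - execute one episode, authenticated
    if headers.get("X-API-Key") != EXPECTED_API_KEY:
        return 401, {"error": "unauthorized"}
    seed = body["env"]["seed"]
    # ... run the episode using body["policy"]["config"] and the seed ...
    return 200, {
        "trace_correlation_id": body["trace_correlation_id"],
        "trajectories": [],
        "metrics": {"outcome_reward": 1.0},
    }
```

Wiring these into `http.server`, FastAPI, or any other stack is up to you; the contract only cares about paths, auth headers, and JSON shapes.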

Request Flow

Rollout Request Structure

{
  "trace_correlation_id": "unique-run-id",
  "env": {
    "seed": 0,
    "config": {}
  },
  "policy": {
    "config": {
      "model": "gpt-4o-mini",
      "inference_url": "https://interceptor-url/...",
      "prompt_template": {...}  // Baseline only, not optimized
    }
  }
}
Key Point: The Container receives only baseline prompts. The Interceptor substitutes optimized prompts at inference time.
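A container typically pulls a handful of fields out of this request before running the episode. A minimal parsing sketch, with field names taken directly from the structure above:

```python
def parse_rollout_request(body: dict) -> tuple:
    """Extract the fields a container needs from a rollout request."""
    trace_id = body["trace_correlation_id"]
    seed = body["env"]["seed"]
    policy_cfg = body["policy"]["config"]
    # inference_url points at the Interceptor, not the LLM provider directly
    inference_url = policy_cfg["inference_url"]
    return trace_id, seed, policy_cfg["model"], inference_url
```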

Response Flow

Rollout Response Structure

{
  "trace_correlation_id": "unique-run-id",
  "trajectories": [{
    "env_id": "task::train::0",
    "policy_id": "policy-1",
    "steps": [{
      "obs": {"query": "...", "index": 0},
      "tool_calls": [...],
      "reward": 1.0,
      "done": true,
      "info": {"expected": "...", "predicted": "...", "correct": true}
    }],
    "length": 1,
    "inference_url": "..."
  }],
  "metrics": {
    "episode_rewards": [1.0],
    "outcome_reward": 1.0,
    "num_steps": 1,
    "num_episodes": 1,
    "outcome_score": 1.0
  },
  "aborted": false,
  "ops_executed": 1
}
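One way to assemble this response from per-step results is sketched below. Note the aggregation shown (mean of episode rewards for `outcome_reward`/`outcome_score`) is an assumption that happens to match the single-episode example above; check the exact semantics for your task.

```python
def build_rollout_response(trace_id: str, steps: list, inference_url: str) -> dict:
    """Assemble a rollout response from executed steps (illustrative sketch)."""
    episode_rewards = [s["reward"] for s in steps]
    # assumption: outcome_reward is the mean of the episode rewards
    outcome = sum(episode_rewards) / len(episode_rewards) if episode_rewards else 0.0
    return {
        "trace_correlation_id": trace_id,
        "trajectories": [{
            "env_id": "task::train::0",  # illustrative id
            "policy_id": "policy-1",
            "steps": steps,
            "length": len(steps),
            "inference_url": inference_url,
        }],
        "metrics": {
            "episode_rewards": episode_rewards,
            "outcome_reward": outcome,
            "num_steps": len(steps),
            "num_episodes": 1,
            "outcome_score": outcome,
        },
        "aborted": False,
        "ops_executed": len(steps),
    }
```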

Interceptor Pattern (Critical)

How It Works

Step 1: Container Receives Baseline
Synth AI Backend → Container
  POST /rollout
  {
    "policy": {
      "config": {
        "inference_url": "https://interceptor/v1/trial-123",
        "prompt_template": {...}  // Baseline prompt
      }
    }
  }
Step 2: Container Calls LLM
Container → Interceptor
  POST /chat/completions
  {
    "model": "gpt-4o-mini",
    "messages": [...]  // Baseline messages
  }
Step 3: Interceptor Substitutes
Interceptor:
  1. Receives baseline messages
  2. Looks up registered transformation for trial-123
  3. Applies transformation to messages
  4. Forwards to actual LLM provider
Step 4: LLM Response
LLM Provider → Interceptor → Container
  {
    "choices": [{
      "message": {...},
      "tool_calls": [...]
    }]
  }
Key Benefits:
  • ✅ Container never sees optimized prompts
  • ✅ Prompts stay secure in backend
  • ✅ No Container code changes needed
  • ✅ Pattern-based transformations
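From the container's side, Step 2 is just an OpenAI-style chat call aimed at the provided `inference_url`. A sketch that builds (without sending) the request; joining the path as `inference_url + "/chat/completions"` is an assumption based on the URLs shown above:

```python
def build_llm_request(inference_url: str, messages: list, model: str) -> tuple:
    """Build the chat-completions request the container sends to the Interceptor.

    The container sends baseline messages; the Interceptor rewrites them
    before forwarding to the real provider.
    """
    url = inference_url.rstrip("/") + "/chat/completions"
    body = {"model": model, "messages": messages}
    return url, body
```

Because the Interceptor speaks the same chat-completions dialect as the provider, the container's HTTP client code is identical whether it talks to the Interceptor or directly to an LLM.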

GEPA Optimization Flow

Complete Optimization Cycle

Phase 1: Job Submission
User → Synth AI Backend
  POST /prompt-learning/online/jobs
  {
    "algorithm": "gepa",
    "config_body": {
      "container_url": "https://your-container.com",
      "container_api_key": "...",
      ...
    }
  }
Phase 2: Pattern Validation
Backend:
  1. Start Interceptor
  2. Fetch baseline messages from Container
  3. Validate pattern matches initial_template
Phase 3: Population Initialization
Backend:
  1. Create baseline transformation
  2. Generate mutations
  3. Initialize population (20-30 variants)
Phase 4: Evaluation Loop
For each generation:
  For each candidate:
    1. Register transformation with Interceptor
    2. Backend → Container: Rollout request (baseline)
    3. Container → Interceptor: LLM call
    4. Interceptor: Substitute optimized prompt
    5. LLM Provider → Interceptor → Container: Response
    6. Container → Backend: Trajectory with reward
    7. Backend: Update Pareto archive
Phase 5: Selection & Mutation
Backend:
  1. Select parents from Pareto archive
  2. Generate mutations/crossover
  3. Next generation
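Phases 4 and 5 amount to the loop below. This is Python-shaped pseudocode: every callable is a hypothetical stand-in for backend internals (registering transformations, driving rollouts, Pareto selection), not the real implementation.

```python
def run_gepa(population, generations, evaluate, select_parents, mutate):
    """Toy skeleton of the GEPA evaluation/selection loop.

    evaluate(candidate)      -> score (registers transform, runs rollouts)
    select_parents(archive)  -> parent candidates (Pareto selection)
    mutate(parents)          -> next-generation population
    """
    archive = []  # Pareto archive of (candidate, score) pairs
    for _ in range(generations):
        for candidate in population:
            score = evaluate(candidate)
            archive.append((candidate, score))
        parents = select_parents(archive)
        population = mutate(parents)
    return archive
```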

Deployment Architectures

Option 1: Embedded Container

Your Application
├── Container Logic (in-process)
└── Synth AI SDK Integration
    └── InProcessContainer
Use Case: Python applications, quick prototyping

Option 2: Standalone Container

Your Server
└── Container (HTTP service)
    ├── Any language (Rust, Go, TypeScript, Python)
    └── Exposed via tunnel or direct URL
Use Case: Production deployments, polyglot implementations

Option 3: Cloud-Deployed Container

Cloud Platform (Render, Fly.io, etc.)
└── Container (HTTP service)
    └── Public HTTPS URL
Use Case: Production, scalable deployments

Authentication Flow

Two Separate Auth Flows

1. Container Authentication (X-API-Key)
Synth AI Backend → Container
  Headers: X-API-Key: <ENVIRONMENT_API_KEY>

Purpose: Authenticate Synth AI to your Container
2. LLM Provider Authentication (Authorization: Bearer)
Container → LLM Provider
  Headers: Authorization: Bearer <LLM_API_KEY>

Purpose: Authenticate Container to LLM
Important: These are separate flows. The Container manages LLM authentication internally.
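The two flows side by side, as a sketch. The header names come from this page; reading keys from environment variables is an illustrative convention:

```python
import os

def check_inbound_auth(headers: dict) -> bool:
    # Flow 1: Synth AI Backend -> Container, via X-API-Key
    return headers.get("X-API-Key") == os.environ.get("ENVIRONMENT_API_KEY")

def outbound_llm_headers() -> dict:
    # Flow 2: Container -> LLM provider, via Authorization: Bearer
    return {"Authorization": f"Bearer {os.environ.get('LLM_API_KEY', '')}"}
```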

Data Flow: Complete Example

Banking77 Classification Example

1. Job Submission
import asyncio
import os

from synth_ai.sdk import SynthClient

async def main():
    client = SynthClient(api_key=os.environ["SYNTH_API_KEY"])

    # Create job from config, then start it
    job = await client.create_job(config={
        "algorithm": "gepa",
        "container_url": "https://my-container.com",
        "container_api_key": "secret-key",
    })
    await client.start_job(job["id"])

asyncio.run(main())
2. Backend Validates Pattern
Backend → Container: GET /task_info?seed=0
Container → Backend: Task metadata

Backend → Container: POST /rollout (baseline)
Container → Interceptor: LLM call
Interceptor → LLM: Optimized prompt
LLM → Container: Response
Container → Backend: Reward (baseline score)
3. GEPA Optimization
For each generation:
  For each candidate:
    Backend registers transformation
    Backend → Container: Rollout (baseline)
    Container → Interceptor: LLM call
    Interceptor substitutes optimized prompt
    Container computes reward
    Backend updates archive
4. Job Completion
Backend → User: Job status = "succeeded"
Metadata includes:
  - Best prompt transformation
  - Best score
  - Pareto archive

Error Handling

Common Scenarios

Container Unreachable:
  • Backend retries with exponential backoff
  • Job fails after max retries
LLM Call Failure:
  • Container returns 502 Bad Gateway
  • Backend marks rollout as failed
  • Continues with other candidates
Invalid Response Format:
  • Backend validates response structure
  • Marks rollout as failed if invalid
Timeout:
  • Container should respond within timeout_seconds
  • Backend cancels long-running rollouts
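A container can apply the same retry discipline to its own upstream LLM calls before giving up and returning 502. A generic exponential-backoff sketch; the delays and retry count are illustrative assumptions:

```python
import time

def call_with_backoff(fn, max_retries=3, base_delay=0.5):
    """Retry fn() with exponential backoff; re-raise after max_retries."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Keep the total retry budget well under the job's `timeout_seconds`, or the backend will cancel the rollout before your retries finish.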

Performance Considerations

Throughput

Bottlenecks:
  1. LLM inference latency (~1-3s per rollout)
  2. Network latency (Container ↔ Backend)
  3. Container processing time
Optimization:
  • Parallel rollouts (max_concurrent)
  • Minibatch gating (GEPA)
  • Efficient Container implementation

Scalability

Container:
  • Stateless design (scales horizontally)
  • Efficient dataset loading
  • Connection pooling for LLM calls
Backend:
  • Handles multiple jobs concurrently
  • Manages Interceptor instances
  • Efficient archive updates

Security Considerations

API Keys:
  • ENVIRONMENT_API_KEY - Container authentication
  • SYNTH_API_KEY - Backend authentication
  • LLM_API_KEY - LLM provider authentication
Network Security:
  • HTTPS for all connections
  • Tunnel options for local development
  • API key validation
Prompt Security:
  • Optimized prompts are never sent to Containers
  • Transformations registered securely
  • No prompt leakage

Next Steps