Overview

Containers are HTTP services that evaluate prompts for Synth AI’s optimization algorithms. They implement a simple contract: receive rollout requests, execute episodes, and return rewards.

High-Level Architecture

Key Components:
  1. Container - Your HTTP service (any language)
  2. Synth AI Backend - Coordinates optimization
  3. GEPA Optimizer - Evolutionary search engine
  4. Interceptor - Prompt transformation layer
  5. LLM Provider - Inference endpoint

Container Contract

Required Endpoints

GET /health
  • Liveness probe (unauthenticated OK)
  • Returns: {"healthy": true}
GET /task_info
  • Dataset metadata (authenticated)
  • Returns: Task description, seeds, rubric, inference mode
POST /rollout
  • Execute one episode (authenticated)
  • Input: trace_correlation_id, env.seed, policy.config
  • Returns: Trajectories, metrics, rewards
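The three endpoints above can be sketched as plain handler functions, independent of any web framework. This is an illustrative sketch: `EXPECTED_API_KEY` and the payload fields beyond those listed on this page are assumptions, not the exact schema.

```python
EXPECTED_API_KEY = "secret-key"  # hypothetical: your ENVIRONMENT_API_KEY

def handle_health():
    # GET /health - liveness probe, no auth required
    return 200, {"healthy": True}

def handle_task_info(headers):
    # GET /task_info - dataset metadata, authenticated via X-API-Key
    if headers.get("X-API-Key") != EXPECTED_API_KEY:
        return 401, {"error": "unauthorized"}
    return 200, {
        "task": "banking77-classification",  # illustrative values
        "seeds": [0, 1, 2],
        "rubric": "exact-match",
        "inference_mode": "chat",
    }

def handle_rollout(headers, body):
    # POST /rollout - execute one episode, authenticated
    if headers.get("X-API-Key") != EXPECTED_API_KEY:
        return 401, {"error": "unauthorized"}
    seed = body["env"]["seed"]
    # ... run the episode using body["policy"]["config"] and the seed ...
    return 200, {
        "trace_correlation_id": body["trace_correlation_id"],
        "trajectories": [],
        "metrics": {"outcome_reward": 1.0},
    }
```

Wiring these into `http.server`, FastAPI, or any other stack is up to you; the contract only cares about paths, auth headers, and JSON shapes.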

Request Flow

Rollout Request Structure

{
  "trace_correlation_id": "unique-run-id",
  "env": {
    "seed": 0,
    "config": {}
  },
  "policy": {
    "config": {
      "model": "gpt-4o-mini",
      "inference_url": "https://interceptor-url/...",
      "prompt_template": {...}  // Baseline only, not optimized
    }
  }
}
Key Point: The Container receives only baseline prompts. The Interceptor substitutes optimized prompts at inference time.
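A container typically pulls a handful of fields out of this request before running the episode. A minimal parsing sketch, with field names taken directly from the structure above:

```python
def parse_rollout_request(body: dict) -> tuple:
    """Extract the fields a container needs from a rollout request."""
    trace_id = body["trace_correlation_id"]
    seed = body["env"]["seed"]
    policy_cfg = body["policy"]["config"]
    # inference_url points at the Interceptor, not the LLM provider directly
    inference_url = policy_cfg["inference_url"]
    return trace_id, seed, policy_cfg["model"], inference_url
```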

Response Flow

Rollout Response Structure

{
  "trace_correlation_id": "unique-run-id",
  "trajectories": [{
    "env_id": "task::train::0",
    "policy_id": "policy-1",
    "steps": [{
      "obs": {"query": "...", "index": 0},
      "tool_calls": [...],
      "reward": 1.0,
      "done": true,
      "info": {"expected": "...", "predicted": "...", "correct": true}
    }],
    "length": 1,
    "inference_url": "..."
  }],
  "metrics": {
    "episode_rewards": [1.0],
    "outcome_reward": 1.0,
    "num_steps": 1,
    "num_episodes": 1,
    "outcome_score": 1.0
  },
  "aborted": false,
  "ops_executed": 1
}
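One way to assemble this response from per-step results is sketched below. Note the aggregation shown (mean of episode rewards for `outcome_reward`/`outcome_score`) is an assumption that happens to match the single-episode example above; check the exact semantics for your task.

```python
def build_rollout_response(trace_id: str, steps: list, inference_url: str) -> dict:
    """Assemble a rollout response from executed steps (illustrative sketch)."""
    episode_rewards = [s["reward"] for s in steps]
    # assumption: outcome_reward is the mean of the episode rewards
    outcome = sum(episode_rewards) / len(episode_rewards) if episode_rewards else 0.0
    return {
        "trace_correlation_id": trace_id,
        "trajectories": [{
            "env_id": "task::train::0",  # illustrative id
            "policy_id": "policy-1",
            "steps": steps,
            "length": len(steps),
            "inference_url": inference_url,
        }],
        "metrics": {
            "episode_rewards": episode_rewards,
            "outcome_reward": outcome,
            "num_steps": len(steps),
            "num_episodes": 1,
            "outcome_score": outcome,
        },
        "aborted": False,
        "ops_executed": len(steps),
    }
```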

Interceptor Pattern (Critical)

How It Works

Step 1: Container Receives Baseline
Synth AI Backend → Container
  POST /rollout
  {
    "policy": {
      "config": {
        "inference_url": "https://interceptor/v1/trial-123",
        "prompt_template": {...}  // Baseline prompt
      }
    }
  }
Step 2: Container Calls LLM
Container → Interceptor
  POST /chat/completions
  {
    "model": "gpt-4o-mini",
    "messages": [...]  // Baseline messages
  }
Step 3: Interceptor Substitutes
Interceptor:
  1. Receives baseline messages
  2. Looks up registered transformation for trial-123
  3. Applies transformation to messages
  4. Forwards to actual LLM provider
Step 4: LLM Response
LLM Provider → Interceptor → Container
  {
    "choices": [{
      "message": {...},
      "tool_calls": [...]
    }]
  }
Key Benefits:
  • ✅ Container never sees optimized prompts
  • ✅ Prompts stay secure in backend
  • ✅ No Container code changes needed
  • ✅ Pattern-based transformations
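From the container's side, Step 2 is just an OpenAI-style chat call aimed at the provided `inference_url`. A sketch that builds (without sending) the request; joining the path as `inference_url + "/chat/completions"` is an assumption based on the URLs shown above:

```python
def build_llm_request(inference_url: str, messages: list, model: str) -> tuple:
    """Build the chat-completions request the container sends to the Interceptor.

    The container sends baseline messages; the Interceptor rewrites them
    before forwarding to the real provider.
    """
    url = inference_url.rstrip("/") + "/chat/completions"
    body = {"model": model, "messages": messages}
    return url, body
```

Because the Interceptor speaks the same chat-completions dialect as the provider, the container's HTTP client code is identical whether it talks to the Interceptor or directly to an LLM.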

GEPA Optimization Flow

Complete Optimization Cycle

Phase 1: Job Submission
User → Synth AI Backend
  POST /prompt-learning/online/jobs
  {
    "algorithm": "gepa",
    "config_body": {
      "container_url": "https://your-container.com",
      "container_api_key": "...",
      ...
    }
  }
Phase 2: Pattern Validation
Backend:
  1. Start Interceptor
  2. Fetch baseline messages from Container
  3. Validate pattern matches initial_template
Phase 3: Population Initialization
Backend:
  1. Create baseline transformation
  2. Generate mutations
  3. Initialize population (20-30 variants)
Phase 4: Evaluation Loop
For each generation:
  For each candidate:
    1. Register transformation with Interceptor
    2. Backend → Container: Rollout request (baseline)
    3. Container → Interceptor: LLM call
    4. Interceptor: Substitute optimized prompt
    5. LLM Provider → Interceptor → Container: Response
    6. Container → Backend: Trajectory with reward
    7. Backend: Update Pareto archive
Phase 5: Selection & Mutation
Backend:
  1. Select parents from Pareto archive
  2. Generate mutations/crossover
  3. Next generation
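Phases 4 and 5 amount to the loop below. This is Python-shaped pseudocode: every callable is a hypothetical stand-in for backend internals (registering transformations, driving rollouts, Pareto selection), not the real implementation.

```python
def run_gepa(population, generations, evaluate, select_parents, mutate):
    """Toy skeleton of the GEPA evaluation/selection loop.

    evaluate(candidate)      -> score (registers transform, runs rollouts)
    select_parents(archive)  -> parent candidates (Pareto selection)
    mutate(parents)          -> next-generation population
    """
    archive = []  # Pareto archive of (candidate, score) pairs
    for _ in range(generations):
        for candidate in population:
            score = evaluate(candidate)
            archive.append((candidate, score))
        parents = select_parents(archive)
        population = mutate(parents)
    return archive
```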

Deployment Architectures

Option 1: Embedded Container

Your Application
├── Container Logic (in-process)
└── Synth AI SDK Integration
    └── InProcessContainer
Use Case: Python applications, quick prototyping

Option 2: Standalone Container

Your Server
└── Container (HTTP service)
    ├── Any language (Rust, Go, TypeScript, Python)
    └── Exposed via tunnel or direct URL
Use Case: Production deployments, polyglot implementations

Option 3: Cloud-Deployed Container

Cloud Platform (Render, Fly.io, etc.)
└── Container (HTTP service)
    └── Public HTTPS URL
Use Case: Production, scalable deployments

Authentication Flow

Two Separate Auth Flows

1. Container Authentication (X-API-Key)
Synth AI Backend → Container
  Headers: X-API-Key: <ENVIRONMENT_API_KEY>

Purpose: Authenticate Synth AI to your Container
2. LLM Provider Authentication (Authorization: Bearer)
Container → LLM Provider
  Headers: Authorization: Bearer <LLM_API_KEY>

Purpose: Authenticate Container to LLM
Important: These are separate flows. The Container manages LLM authentication internally.
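The two flows side by side, as a sketch. The header names come from this page; reading keys from environment variables is an illustrative convention:

```python
import os

def check_inbound_auth(headers: dict) -> bool:
    # Flow 1: Synth AI Backend -> Container, via X-API-Key
    return headers.get("X-API-Key") == os.environ.get("ENVIRONMENT_API_KEY")

def outbound_llm_headers() -> dict:
    # Flow 2: Container -> LLM provider, via Authorization: Bearer
    return {"Authorization": f"Bearer {os.environ.get('LLM_API_KEY', '')}"}
```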

Data Flow: Complete Example

Banking77 Classification Example

1. Job Submission
import asyncio
import os

from synth_ai.sdk import SynthClient

async def main():
    client = SynthClient(api_key=os.environ["SYNTH_API_KEY"])

    # Create job from config, then start it
    job = await client.create_job(config={
        "algorithm": "gepa",
        "container_url": "https://my-container.com",
        "container_api_key": "secret-key",
    })
    await client.start_job(job["id"])

asyncio.run(main())
2. Backend Validates Pattern
Backend → Container: GET /task_info?seed=0
Container → Backend: Task metadata

Backend → Container: POST /rollout (baseline)
Container → Interceptor: LLM call
Interceptor → LLM: Optimized prompt
LLM → Container: Response
Container → Backend: Reward (baseline score)
3. GEPA Optimization
For each generation:
  For each candidate:
    Backend registers transformation
    Backend → Container: Rollout (baseline)
    Container → Interceptor: LLM call
    Interceptor substitutes optimized prompt
    Container computes reward
    Backend updates archive
4. Job Completion
Backend → User: Job status = "succeeded"
Metadata includes:
  - Best prompt transformation
  - Best score
  - Pareto archive

Error Handling

Common Scenarios

Container Unreachable:
  • Backend retries with exponential backoff
  • Job fails after max retries
LLM Call Failure:
  • Container returns 502 Bad Gateway
  • Backend marks rollout as failed
  • Continues with other candidates
Invalid Response Format:
  • Backend validates response structure
  • Marks rollout as failed if invalid
Timeout:
  • Container should respond within timeout_seconds
  • Backend cancels long-running rollouts
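A container can apply the same retry discipline to its own upstream LLM calls before giving up and returning 502. A generic exponential-backoff sketch; the delays and retry count are illustrative assumptions:

```python
import time

def call_with_backoff(fn, max_retries=3, base_delay=0.5):
    """Retry fn() with exponential backoff; re-raise after max_retries."""
    for attempt in range(max_retries):
        try:
            return fn()
        except Exception:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.5s, 1s, 2s, ...
```

Keep the total retry budget well under the job's `timeout_seconds`, or the backend will cancel the rollout before your retries finish.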

Performance Considerations

Throughput

Bottlenecks:
  1. LLM inference latency (~1-3s per rollout)
  2. Network latency (Container ↔ Backend)
  3. Container processing time
Optimization:
  • Parallel rollouts (max_concurrent)
  • Minibatch gating (GEPA)
  • Efficient Container implementation

Scalability

Container:
  • Stateless design (scales horizontally)
  • Efficient dataset loading
  • Connection pooling for LLM calls
Backend:
  • Handles multiple jobs concurrently
  • Manages Interceptor instances
  • Efficient archive updates

Security Considerations

API Keys:
  • ENVIRONMENT_API_KEY - Container authentication
  • SYNTH_API_KEY - Backend authentication
  • LLM_API_KEY - LLM provider authentication
Network Security:
  • HTTPS for all connections
  • Tunnel options for local development
  • API key validation
Prompt Security:
  • Optimized prompts are never sent to Containers
  • Transformations registered securely
  • No prompt leakage

Next Steps