
Overview

Task Apps are HTTP services that evaluate prompts for Synth AI’s optimization algorithms. They implement a simple contract: receive rollout requests, execute episodes, and return rewards.

High-Level Architecture

Key Components:
  1. Task App - Your HTTP service (any language)
  2. Synth AI Backend - Coordinates optimization
  3. GEPA Optimizer - Evolutionary search engine
  4. Interceptor - Prompt transformation layer
  5. LLM Provider - Inference endpoint

Task App Contract

Required Endpoints

GET /health
  • Liveness probe (unauthenticated OK)
  • Returns: {"healthy": true}
GET /task_info
  • Dataset metadata (authenticated)
  • Returns: Task description, seeds, rubric, inference mode
POST /rollout
  • Execute one episode (authenticated)
  • Input: run_id, env.seed, policy.config
  • Returns: Trajectories, metrics, rewards
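The three endpoints above can be sketched as plain handler functions, independent of any particular HTTP framework. This is an illustrative sketch, not the SDK's API: `EXPECTED_API_KEY` stands in for your `ENVIRONMENT_API_KEY` value, and the reward computation is a placeholder.

```python
EXPECTED_API_KEY = "secret-key"  # hypothetical ENVIRONMENT_API_KEY value

def handle_health() -> dict:
    # Liveness probe: unauthenticated and cheap.
    return {"healthy": True}

def handle_task_info(api_key: str) -> dict:
    if api_key != EXPECTED_API_KEY:
        raise PermissionError("invalid X-API-Key")
    # Dataset metadata: description, seeds, rubric, inference mode.
    return {
        "task": "banking77-classification",
        "seeds": list(range(100)),
        "rubric": "1.0 if the predicted label matches the expected label",
        "inference_mode": "tool_call",
    }

def handle_rollout(api_key: str, request: dict) -> dict:
    if api_key != EXPECTED_API_KEY:
        raise PermissionError("invalid X-API-Key")
    seed = request["env"]["seed"]
    # A real episode would call request["policy"]["config"]["inference_url"]
    # here; the reward below is a placeholder for the real comparison.
    reward = 1.0
    return {
        "run_id": request["run_id"],
        "trajectories": [{"env_id": f"task::train::{seed}", "steps": [], "length": 0}],
        "metrics": {"episode_rewards": [reward], "reward_mean": reward},
        "aborted": False,
    }
```

Routing these through FastAPI, Express, or any other framework is a thin layer on top; the contract lives entirely in the request and response shapes.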

Request Flow

Rollout Request Structure

{
  "run_id": "unique-run-id",
  "env": {
    "seed": 0,
    "config": {}
  },
  "policy": {
    "config": {
      "model": "gpt-4o-mini",
      "inference_url": "https://interceptor-url/...",
      "prompt_template": {...}  // Baseline only, not optimized
    }
  }
}
Key Point: Task App receives baseline prompts only. Optimized prompts are substituted by the Interceptor.

Response Flow

Rollout Response Structure

{
  "run_id": "unique-run-id",
  "trajectories": [{
    "env_id": "task::train::0",
    "policy_id": "policy-1",
    "steps": [{
      "obs": {"query": "...", "index": 0},
      "tool_calls": [...],
      "reward": 1.0,
      "done": true,
      "info": {"expected": "...", "predicted": "...", "correct": true}
    }],
    "length": 1,
    "inference_url": "..."
  }],
  "metrics": {
    "episode_rewards": [1.0],
    "reward_mean": 1.0,
    "num_steps": 1,
    "num_episodes": 1,
    "outcome_score": 1.0
  },
  "aborted": false,
  "ops_executed": 1
}
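A helper that assembles the metrics block above from per-episode rewards might look like the following sketch. The field names follow the response structure shown; the assumption that `outcome_score` mirrors `reward_mean` is illustrative, not a contract guarantee.

```python
def build_metrics(episode_rewards: list[float], num_steps: int) -> dict:
    # Aggregate per-episode rewards into the metrics block the backend expects.
    mean = sum(episode_rewards) / len(episode_rewards) if episode_rewards else 0.0
    return {
        "episode_rewards": episode_rewards,
        "reward_mean": mean,
        "num_steps": num_steps,
        "num_episodes": len(episode_rewards),
        "outcome_score": mean,  # assumption: outcome mirrors mean reward here
    }
```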

Interceptor Pattern (Critical)

How It Works

Step 1: Task App Receives Baseline
Synth AI Backend → Task App
  POST /rollout
  {
    "policy": {
      "config": {
        "inference_url": "https://interceptor/v1/trial-123",
        "prompt_template": {...}  // Baseline prompt
      }
    }
  }
Step 2: Task App Calls LLM
Task App → Interceptor
  POST /chat/completions
  {
    "model": "gpt-4o-mini",
    "messages": [...]  // Baseline messages
  }
Step 3: Interceptor Substitutes
Interceptor:
  1. Receives baseline messages
  2. Looks up registered transformation for trial-123
  3. Applies transformation to messages
  4. Forwards to actual LLM provider
Step 4: LLM Response
LLM Provider → Interceptor → Task App
  {
    "choices": [{
      "message": {...},
      "tool_calls": [...]
    }]
  }
Key Benefits:
  • ✅ Task App never sees optimized prompts
  • ✅ Prompts stay secure in backend
  • ✅ No Task App code changes needed
  • ✅ Pattern-based transformations
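From the Task App's side, steps 1-2 amount to building a standard chat-completions request against whatever `inference_url` arrives in the policy config. A minimal sketch of that request construction (the actual HTTP send is omitted; `build_llm_request` is a hypothetical helper name):

```python
import json

def build_llm_request(policy_config: dict, messages: list[dict]) -> tuple[str, bytes]:
    # The Task App always sends its *baseline* messages; the Interceptor
    # behind inference_url substitutes the optimized prompt for this trial.
    url = policy_config["inference_url"].rstrip("/") + "/chat/completions"
    body = json.dumps({"model": policy_config["model"], "messages": messages}).encode()
    return url, body
```

The Task App never needs to know whether `inference_url` points at an Interceptor or directly at a provider; the request shape is identical either way.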

GEPA Optimization Flow

Complete Optimization Cycle

Phase 1: Job Submission
User → Synth AI Backend
  POST /prompt-learning/online/jobs
  {
    "algorithm": "gepa",
    "config_body": {
      "task_app_url": "https://your-task-app.com",
      "task_app_api_key": "...",
      ...
    }
  }
Phase 2: Pattern Validation
Backend:
  1. Start Interceptor
  2. Fetch baseline messages from Task App
  3. Validate pattern matches initial_template
Phase 3: Population Initialization
Backend:
  1. Create baseline transformation
  2. Generate mutations
  3. Initialize population (20-30 variants)
Phase 4: Evaluation Loop
For each generation:
  For each candidate:
    1. Register transformation with Interceptor
    2. Backend → Task App: Rollout request (baseline)
    3. Task App → Interceptor: LLM call
    4. Interceptor: Substitute optimized prompt
    5. LLM Provider → Interceptor → Task App: Response
    6. Task App → Backend: Trajectory with reward
    7. Backend: Update Pareto archive
Phase 5: Selection & Mutation
Backend:
  1. Select parents from Pareto archive
  2. Generate mutations/crossover
  3. Next generation
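The Pareto-archive update in phase 4, step 7 can be sketched with standard dominance logic: a candidate enters the archive only if nothing already there dominates it, and it evicts anything it dominates. This is a simplified illustration of the general technique, not GEPA's actual implementation.

```python
def dominates(a: list[float], b: list[float]) -> bool:
    # a dominates b if it is >= on every objective and > on at least one.
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def update_archive(archive: list[dict], candidate: dict) -> list[dict]:
    # Reject the candidate if any archived entry dominates it;
    # otherwise keep it and evict the entries it dominates.
    if any(dominates(e["scores"], candidate["scores"]) for e in archive):
        return archive
    kept = [e for e in archive if not dominates(candidate["scores"], e["scores"])]
    return kept + [candidate]
```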

Deployment Architectures

Option 1: Embedded Task App

Your Application
├── Task App Logic (in-process)
└── Synth AI SDK Integration
    └── InProcessTaskApp
Use Case: Python applications, quick prototyping

Option 2: Standalone Task App

Your Server
└── Task App (HTTP service)
    ├── Any language (Rust, Go, TypeScript, Python)
    └── Exposed via tunnel or direct URL
Use Case: Production deployments, polyglot implementations

Option 3: Cloud-Deployed Task App

Cloud Platform (Render, Fly.io, etc.)
└── Task App (HTTP service)
    └── Public HTTPS URL
Use Case: Production, scalable deployments

Authentication Flow

Two Separate Auth Flows

1. Task App Authentication (X-API-Key)
Synth AI Backend → Task App
  Headers: X-API-Key: <ENVIRONMENT_API_KEY>

Purpose: Authenticate Synth AI to your Task App
2. LLM Provider Authentication (Authorization: Bearer)
Task App → LLM Provider
  Headers: Authorization: Bearer <LLM_API_KEY>

Purpose: Authenticate Task App to LLM
Important: These flows are separate; the Task App manages LLM authentication internally, and provider keys never pass through the backend.
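The separation is visible in the headers each hop sends. A trivial sketch (helper names are illustrative):

```python
def backend_to_task_app_headers(environment_api_key: str) -> dict:
    # Synth AI Backend authenticating to *your* Task App.
    return {"X-API-Key": environment_api_key}

def task_app_to_llm_headers(llm_api_key: str) -> dict:
    # Task App authenticating to the LLM provider; managed internally,
    # never shared with the backend.
    return {"Authorization": f"Bearer {llm_api_key}"}
```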

Data Flow: Complete Example

Banking77 Classification Example

1. Job Submission
import os
from synth_ai.sdk import PromptLearningClient

client = PromptLearningClient(api_key=os.environ["SYNTH_API_KEY"])

# Create job from config (run inside an async context)
job = await client.create_job(config={
    "algorithm": "gepa",
    "task_app_url": "https://my-task-app.com",
    "task_app_api_key": "secret-key"
})
await client.start_job(job["id"])
2. Backend Validates Pattern
Backend → Task App: GET /task_info?seed=0
Task App → Backend: Task metadata

Backend → Task App: POST /rollout (baseline)
Task App → Interceptor: LLM call
Interceptor → LLM: Baseline prompt (pass-through; nothing optimized yet)
LLM → Task App: Response
Task App → Backend: Reward (baseline score)
3. GEPA Optimization
For each generation:
  For each candidate:
    Backend registers transformation
    Backend → Task App: Rollout (baseline)
    Task App → Interceptor: LLM call
    Interceptor substitutes optimized prompt
    Task App computes reward
    Backend updates archive
4. Job Completion
Backend → User: Job status = "succeeded"
Metadata includes:
  - Best prompt transformation
  - Best score
  - Pareto archive

Error Handling

Common Scenarios

Task App Unreachable:
  • Backend retries with exponential backoff
  • Job fails after max retries
LLM Call Failure:
  • Task App returns 502 Bad Gateway
  • Backend marks rollout as failed
  • Continues with other candidates
Invalid Response Format:
  • Backend validates response structure
  • Marks rollout as failed if invalid
Timeout:
  • Task App should respond within timeout_seconds
  • Backend cancels long-running rollouts
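The retry-with-exponential-backoff behavior described for an unreachable Task App follows a standard pattern; a generic sketch (not the backend's actual code):

```python
import time

def with_retries(fn, max_retries: int = 3, base_delay: float = 0.5):
    # Retry a flaky call with exponential backoff (0.5s, 1s, 2s, ...);
    # re-raise once max_retries is exhausted, which fails the job.
    for attempt in range(max_retries):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_retries - 1:
                raise
            time.sleep(base_delay * 2 ** attempt)
```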

Performance Considerations

Throughput

Bottlenecks:
  1. LLM inference latency (~1-3s per rollout)
  2. Network latency (Task App ↔ Backend)
  3. Task App processing time
Optimization:
  • Parallel rollouts (max_concurrent)
  • Minibatch gating (GEPA)
  • Efficient Task App implementation
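The `max_concurrent` knob mentioned above maps naturally onto a semaphore that bounds in-flight rollouts. A generic asyncio sketch (function names are illustrative):

```python
import asyncio

async def run_rollouts(seeds: list[int], rollout_fn, max_concurrent: int = 8) -> list:
    # Bound in-flight rollouts so LLM latency overlaps without
    # overwhelming the Task App or the provider.
    sem = asyncio.Semaphore(max_concurrent)

    async def bounded(seed: int):
        async with sem:
            return await rollout_fn(seed)

    # gather preserves input order regardless of completion order.
    return await asyncio.gather(*(bounded(s) for s in seeds))
```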

Scalability

Task App:
  • Stateless design (scales horizontally)
  • Efficient dataset loading
  • Connection pooling for LLM calls
Backend:
  • Handles multiple jobs concurrently
  • Manages Interceptor instances
  • Efficient archive updates

Security Considerations

API Keys:
  • ENVIRONMENT_API_KEY - Task App authentication
  • SYNTH_API_KEY - Backend authentication
  • LLM_API_KEY - LLM provider authentication
Network Security:
  • HTTPS for all connections
  • Tunnel options for local development
  • API key validation
Prompt Security:
  • Prompts never sent to Task Apps
  • Transformations registered securely
  • No prompt leakage

Next Steps