
synth_ai.sdk.api.train.graphgen

Alpha First-class SDK API for GraphGen (Automated Design of Agentic Systems). GraphGen is a simplified “Workflows API” for prompt optimization that:
  • Uses a simple JSON dataset format (GraphGenTaskSet) instead of TOML configs
  • Auto-generates task apps from the dataset (no user-managed task apps)
  • Has built-in judge configurations (rubric, contrastive, gold_examples)
  • Wraps GEPA internally for the actual optimization
Example CLI usage:
uvx synth-ai train --type graphgen --dataset my_tasks.json --poll
Example SDK usage:
from synth_ai.sdk.api.train.graphgen import GraphGenJob
from synth_ai.sdk.api.train.graphgen_models import (
    GraphGenGoldOutput,
    GraphGenTask,
    GraphGenTaskSet,
    GraphGenTaskSetMetadata,
)

# From a dataset file
job = GraphGenJob.from_dataset("my_tasks.json")
job.submit()
result = job.stream_until_complete()
print(f"Best score: {result.get('best_score')}")

# Or programmatically
dataset = GraphGenTaskSet(
    metadata=GraphGenTaskSetMetadata(name="My Tasks"),
    tasks=[GraphGenTask(id="t1", input={"question": "What is 2+2?"})],
    gold_outputs=[GraphGenGoldOutput(output={"answer": "4"}, task_id="t1")],
)
job = GraphGenJob.from_dataset(dataset, policy_model="gpt-4o-mini", problem_spec="You are a helpful assistant.")
job.submit()

synth_ai.sdk.api.train.graphgen_models

GraphGen (Automated Design of Agentic Systems) data models. This module provides Pydantic models for defining GraphGen datasets and job configurations. GraphGen is a simplified “Workflows API” for prompt optimization that wraps GEPA with auto-generated task apps and built-in judge configurations. Example:
from synth_ai.sdk.api.train.graphgen_models import (
    GraphGenTaskSet,
    GraphGenTaskSetMetadata,
    GraphGenTask,
    GraphGenGoldOutput,
    GraphGenRubric,
    GraphGenJudgeConfig,
    GraphGenJobConfig,
)

# Create a dataset
dataset = GraphGenTaskSet(
    metadata=GraphGenTaskSetMetadata(name="My Dataset"),
    tasks=[
        GraphGenTask(id="task1", input={"question": "What is 2+2?"}),
        GraphGenTask(id="task2", input={"question": "What is the capital of France?"}),
    ],
    gold_outputs=[
        GraphGenGoldOutput(output={"answer": "4"}, task_id="task1"),
        GraphGenGoldOutput(output={"answer": "Paris"}, task_id="task2"),
    ],
    judge_config=GraphGenJudgeConfig(mode="rubric"),
)

Functions

parse_graphgen_taskset

parse_graphgen_taskset(data: Dict[str, Any]) -> GraphGenTaskSet
Parse a dictionary into an GraphGenTaskSet. Args:
  • data: Dictionary containing the taskset data (from JSON)
Returns:
  • Validated GraphGenTaskSet
Raises:
  • ValueError: If validation fails

load_graphgen_taskset

load_graphgen_taskset(path: str | Path) -> GraphGenTaskSet
Load a GraphGenTaskSet from a JSON file. Args:
  • path: Path to JSON file
Returns:
  • Validated GraphGenTaskSet
Raises:
  • FileNotFoundError: If file doesn’t exist
  • ValueError: If validation fails
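The JSON file is simply a serialized GraphGenTaskSet. As a sketch of the expected shape (field names inferred from the examples on this page, not from a schema dump), a minimal dataset file could be written and round-tripped like this:

```python
import json
import tempfile
from pathlib import Path

# Minimal taskset JSON mirroring the GraphGenTaskSet fields documented above.
taskset = {
    "metadata": {"name": "My Tasks"},
    "tasks": [{"id": "t1", "input": {"question": "What is 2+2?"}}],
    "gold_outputs": [{"output": {"answer": "4"}, "task_id": "t1"}],
}

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "my_tasks.json"
    path.write_text(json.dumps(taskset, indent=2))

    # load_graphgen_taskset(path) would validate this; here we just round-trip it.
    loaded = json.loads(path.read_text())

assert loaded["tasks"][0]["id"] == "t1"
assert loaded["gold_outputs"][0]["output"]["answer"] == "4"
```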

Job API

GraphGenJobResult

Result from a GraphGen job.

GraphGenSubmitResult

Result from submitting a GraphGen job.

GraphGenJob

High-level SDK class for running GraphGen workflow optimization jobs. GraphGen (Automated Design of Agentic Systems) provides a simplified API for graph/workflow optimization that doesn’t require users to manage task apps. Key differences from PromptLearningJob:
  • Uses JSON dataset format (GraphGenTaskSet) instead of TOML configs
  • No task app management required - GraphGen builds it internally
  • Built-in judge modes (rubric, contrastive, gold_examples)
  • Graph-first: trains multi-node workflows by default (Graph-GEPA)
  • Public graph downloads are redacted .txt exports only
  • Simpler configuration with sensible defaults
Methods:

from_dataset

from_dataset(cls, dataset: str | Path | Dict[str, Any] | GraphGenTaskSet) -> GraphGenJob
Create a GraphGen job from a dataset. Args:
  • dataset: Dataset as file path, dict, or GraphGenTaskSet object
  • graph_type: Type of graph to train:
    • "policy": Maps inputs to outputs (default).
    • "verifier": Judges/scores traces (requires a verifier-compliant dataset).
    • "rlm": Recursive Language Model; handles massive contexts via tool-based search and recursive LLM calls. Requires the configured_tools parameter.
  • policy_model: Model to use for policy inference
  • rollout_budget: Total number of rollouts for optimization
  • proposer_effort: Proposer effort level ("medium" or "high"). "low" is not allowed because gpt-4.1-mini is too weak for graph generation.
  • judge_model: Override the judge model from the dataset
  • judge_provider: Override the judge provider from the dataset
  • population_size: Population size for GEPA
  • num_generations: Number of generations (auto-calculated if not specified)
  • problem_spec: Detailed problem specification for the graph proposer. Include domain-specific info such as valid output labels for classification.
  • target_llm_calls: Target number of LLM calls for the graph (1-10). Controls how many LLM nodes the graph should use. Defaults to 5.
  • configured_tools: Optional list of tool bindings for RLM graphs. Required for graph_type="rlm". Each tool should be a dict with 'name', 'kind', and 'stateful'. Example: [{'name': 'materialize_context', 'kind': 'rlm_materialize', 'stateful': True}]
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
  • auto_start: Whether to start the job immediately
  • metadata: Additional metadata for the job
Returns:
  • GraphGenJob instance

from_job_id

from_job_id(cls, job_id: str, backend_url: Optional[str] = None, api_key: Optional[str] = None) -> GraphGenJob
Resume an existing GraphGen job by ID. Args:
  • job_id: GraphGen job ID ("graphgen_" prefix) or underlying GEPA job ID ("pl_" prefix)
  • backend_url: Backend API URL (defaults to env or production)
  • api_key: API key (defaults to SYNTH_API_KEY env var)
Returns:
  • GraphGenJob instance for the existing job
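The two ID formats can be told apart by prefix. A small illustrative helper (hypothetical, not part of the SDK; from_job_id accepts both forms directly):

```python
def classify_job_id(job_id: str) -> str:
    """Classify a job ID by the documented prefixes. Hypothetical helper."""
    if job_id.startswith("graphgen_"):
        return "graphgen"  # GraphGen job ID
    if job_id.startswith("pl_"):
        return "gepa"      # underlying GEPA job ID
    raise ValueError(f"Unrecognized job ID: {job_id!r}")

print(classify_job_id("graphgen_abc123"))  # graphgen
print(classify_job_id("pl_xyz789"))        # gepa
```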

from_graph_evolve_job_id

from_graph_evolve_job_id(cls, graph_evolve_job_id: str, backend_url: Optional[str] = None, api_key: Optional[str] = None) -> GraphGenJob
Alias for resuming a GraphGen job from a GEPA job ID.

job_id

job_id(self) -> Optional[str]
Get the GraphGen job ID (None if not yet submitted).

graph_evolve_job_id

graph_evolve_job_id(self) -> Optional[str]
Get the underlying GEPA job ID if known.

submit

submit(self) -> GraphGenSubmitResult
Submit the job to the backend. Returns:
  • GraphGenSubmitResult with job IDs and initial status
Raises:
  • RuntimeError: If job submission fails

get_status

get_status(self) -> Dict[str, Any]
Get current job status. Returns:
  • Job status dictionary containing ‘status’, ‘best_score’, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted yet or API call fails.

start

start(self) -> Dict[str, Any]
Start a queued GraphGen job. This is only needed if the job was created with auto_start=False or ended up queued. Returns:
  • Updated job status dictionary.

get_events

get_events(self) -> Dict[str, Any]
Fetch events for this GraphGen job. Args:
  • since_seq: Return events with sequence number greater than this.
  • limit: Maximum number of events to return.
Returns:
  • Backend envelope: {“events”: […], “has_more”: bool, “next_seq”: int}.

get_metrics

get_metrics(self) -> Dict[str, Any]
Fetch metrics for this GraphGen job. Args:
  • name: Optional metric name filter.
  • after_step: Optional step filter.
  • limit: Maximum number of metrics to return.
  • run_id: Optional run identifier filter.
Returns:
  • Dictionary containing ‘metrics’ list.

stream_until_complete

stream_until_complete(self) -> Dict[str, Any]
Stream job events until completion using Server-Sent Events (SSE). This method connects to the backend SSE stream and processes events in real-time until the job reaches a terminal state (completed, failed, or cancelled). Events include:
  • job_started: Job execution began
  • generation_started: New generation of candidates started
  • candidate_evaluated: A candidate graph was evaluated
  • generation_completed: Generation finished
  • optimization_completed: Job finished successfully
  • job_failed: Job encountered an error
Args:
  • timeout: Maximum seconds to wait for completion
  • interval: Seconds between status checks (for SSE reconnects)
  • handlers: Optional StreamHandler instances for custom event handling. Defaults to GraphGenHandler which provides formatted CLI output.
  • on_event: Optional callback function called on each event. Receives the event dict as argument.
Returns:
  • Final job status dictionary containing ‘status’, ‘best_score’, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted yet
  • TimeoutError: If timeout exceeded before job completion
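The on_event callback simply receives raw event dicts. A hypothetical handler, exercised here against sample events rather than a live SSE stream (the event payload shapes below are illustrative assumptions based on the event names above, not a backend schema):

```python
# Collect candidate scores as events arrive.
scores = []

def on_event(event: dict) -> None:
    if event.get("type") == "candidate_evaluated":
        scores.append(event.get("data", {}).get("score"))

# In real use: job.stream_until_complete(on_event=on_event)
sample_events = [
    {"type": "job_started"},
    {"type": "candidate_evaluated", "data": {"score": 0.62}},
    {"type": "candidate_evaluated", "data": {"score": 0.81}},
    {"type": "optimization_completed"},
]
for ev in sample_events:
    on_event(ev)

print(max(scores))  # 0.81
```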

download_prompt

download_prompt(self) -> str
Download the optimized prompt from a completed job. For graph-first jobs, prefer download_graph_txt(); this method is mainly useful for legacy single-node prompt workflows. Returns:
  • Optimized prompt text
Raises:
  • RuntimeError: If job hasn’t been submitted or isn’t complete

download_graph_txt

download_graph_txt(self) -> str
Download a PUBLIC (redacted) graph export for a completed job. Graph-first GraphGen jobs produce multi-node graphs. The internal graph YAML/spec is proprietary and never exposed. This helper downloads the .txt export from: GET /api/graphgen/jobs/{job_id}/graph.txt

run_inference

run_inference(self, input_data: Dict[str, Any]) -> Dict[str, Any]
Run inference with the optimized graph/workflow. Args:
  • input_data: Input data matching the task format
  • model: Override model (default: use job’s policy model)
  • prompt_snapshot_id: Legacy alias for selecting a specific snapshot.
  • graph_snapshot_id: Specific GraphSnapshot to use (default: best). Preferred for graph-first jobs. If provided, it is sent as prompt_snapshot_id for backward-compatible backend routing.
Returns:
  • Output dictionary containing ‘output’, ‘usage’, etc.
Raises:
  • RuntimeError: If job hasn’t been submitted or inference fails.
  • ValueError: If both prompt_snapshot_id and graph_snapshot_id are provided.

run_inference_output

run_inference_output(self, input_data: Dict[str, Any]) -> Any
Convenience wrapper returning only the model output.

run_verifier

run_verifier(self, session_trace: Dict[str, Any] | SessionTraceInput) -> GraphGenGraphJudgeResponse
Run a verifier graph on an execution trace. This method is specifically for graphs trained with graph_type="verifier". It accepts a V3 trace and returns structured rewards (score, reasoning, per-event rewards). Args:
  • session_trace: V3 session trace to evaluate. Can be a dict or SessionTraceInput.
  • context: Additional context for evaluation (e.g., rubric overrides, task description).
  • prompt_snapshot_id: Specific snapshot to use (default: best).
  • graph_snapshot_id: Specific GraphSnapshot to use (default: best). Preferred for graph-first jobs.
Returns:
  • GraphGenGraphJudgeResponse containing structured rewards and reasoning.
Raises:
  • RuntimeError: If job hasn’t been submitted or inference fails.

run_judge

run_judge(self, session_trace: Dict[str, Any] | SessionTraceInput) -> GraphGenGraphJudgeResponse
Deprecated: use run_verifier instead.

get_graph_record

get_graph_record(self) -> Dict[str, Any]
Get the optimized graph record (snapshot) for a completed job. Note: for graph-first jobs, this record is redacted and never includes proprietary YAML/spec. Use download_graph_txt() for the public export. Args:
  • prompt_snapshot_id: Legacy alias for selecting a specific snapshot.
  • graph_snapshot_id: Specific GraphSnapshot to use (default: best).
Returns:
  • Graph record dictionary containing:
    • job_id: The job ID
    • snapshot_id: The snapshot ID used
    • prompt: Extracted prompt text (legacy single-node only; may be empty)
    • graph: Public graph record payload (e.g., export metadata)
    • model: Model used for this graph (optional)
Raises:
  • RuntimeError: If job hasn’t been submitted or API call fails.
  • ValueError: If both prompt_snapshot_id and graph_snapshot_id are provided.

Configuration Reference

OutputConfig

Configuration for graph output extraction + validation. This model defines how graph outputs should be extracted and validated. It supports JSON Schema validation, multiple output formats, and configurable extraction paths. Attributes:
  • schema_: JSON Schema (draft-07) for output validation. Use the alias "schema" in JSON.
  • format: Expected output format: "json", "text", "tool_calls", or "image".
  • strict: If True, validation failures fail the run; if False, log warnings and continue.
  • extract_from: Ordered list of dot-paths/keys to try when extracting output from final_state.
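The extract_from paths behave like ordered dot-path lookups into final_state. A rough sketch of that extraction logic (an assumption about the semantics, not the SDK's source):

```python
from typing import Any, Optional

def extract_output(final_state: dict, extract_from: list) -> Optional[Any]:
    """Try each dot-path in order; return the first value that resolves."""
    for path in extract_from:
        node: Any = final_state
        for key in path.split("."):
            if isinstance(node, dict) and key in node:
                node = node[key]
            else:
                node = None
                break
        if node is not None:
            return node
    return None

state = {"result": {"answer": "4"}, "raw_text": "the answer is 4"}
# "output.answer" misses, "result.answer" resolves first.
print(extract_output(state, ["output.answer", "result.answer", "raw_text"]))  # prints 4
```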

GraphGenTaskSetMetadata

Metadata about the dataset. Methods:

validate_select_output

validate_select_output(cls, v: Any) -> Optional[Union[str, List[str]]]
Validate select_output is a string or list of strings.

validate_output_config

validate_output_config(cls, v: Any) -> Optional[OutputConfig]
Convert dict to OutputConfig for backward compatibility.

GraphGenRubricCriterion

A single rubric criterion for evaluation.

GraphGenRubricOutcome

Outcome-level rubric (evaluates final output).

GraphGenRubricEvents

Event-level rubric (evaluates intermediate steps).

GraphGenRubric

Rubric for evaluating task outputs.

GraphGenTask

A single task in the dataset. Tasks have arbitrary JSON inputs and optional task-specific rubrics. Gold outputs are stored separately and linked via task_id.

GraphGenGoldOutput

A gold/reference output. Can be linked to a specific task via task_id, or standalone (for reference examples). Standalone gold outputs (no task_id) are used as reference pool for contrastive judging.

GraphGenJudgeConfig

Configuration for the judge used during optimization.

GraphGenTaskSet

The complete GraphGen dataset format. Contains tasks with arbitrary JSON inputs, gold outputs (optionally linked to tasks), rubrics (task-specific and/or default), and judge configuration. Methods:

validate_unique_task_ids

validate_unique_task_ids(cls, v: List[GraphGenTask]) -> List[GraphGenTask]
Ensure all task IDs are unique.

validate_gold_output_task_ids

validate_gold_output_task_ids(cls, v: List[GraphGenGoldOutput], info: ValidationInfo) -> List[GraphGenGoldOutput]
Ensure gold output task_ids reference valid tasks. Args:
  • v: The list of gold outputs being validated.
  • info: Pydantic ValidationInfo providing access to other fields via info.data.
Returns:
  • The validated list of gold outputs.
Raises:
  • ValueError: If a gold output references a non-existent task ID.
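Taken together, the two validators above enforce that task IDs are unique and that every linked gold output points at a real task. A plain-Python sketch of those checks (illustrative, not the Pydantic implementation):

```python
def validate_taskset(tasks: list, gold_outputs: list) -> None:
    """Sketch of the uniqueness and cross-reference checks described above."""
    ids = [t["id"] for t in tasks]
    if len(ids) != len(set(ids)):
        raise ValueError("Duplicate task IDs")
    known = set(ids)
    for g in gold_outputs:
        task_id = g.get("task_id")  # standalone gold outputs have no task_id
        if task_id is not None and task_id not in known:
            raise ValueError(f"Gold output references unknown task: {task_id}")

tasks = [{"id": "t1"}, {"id": "t2"}]
golds = [{"task_id": "t1", "output": {"answer": "4"}}, {"output": {"answer": "ref"}}]
validate_taskset(tasks, golds)  # passes

try:
    validate_taskset(tasks, [{"task_id": "t9", "output": {}}])
except ValueError as e:
    print(e)  # Gold output references unknown task: t9
```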

validate_select_output

validate_select_output(cls, v: Any) -> Optional[Union[str, List[str]]]
Validate select_output is a string or list of strings.

validate_output_config

validate_output_config(cls, v: Any) -> Optional[OutputConfig]
Convert dict to OutputConfig for backward compatibility.

get_task_by_id

get_task_by_id(self, task_id: str) -> Optional[GraphGenTask]
Get a task by its ID.

get_task_by_index

get_task_by_index(self, index: int) -> Optional[GraphGenTask]
Get a task by zero-based index. Args:
  • index: Zero-based index into tasks list (0 to len(tasks)-1).
Returns:
  • Task at the specified index, or None if index is out of range.

get_gold_output_for_task

get_gold_output_for_task(self, task_id: str) -> Optional[GraphGenGoldOutput]
Get the gold output linked to a specific task.

get_standalone_gold_outputs

get_standalone_gold_outputs(self) -> List[GraphGenGoldOutput]
Get gold outputs not linked to any task (reference pool for contrastive judge).
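Standalone entries are simply those without a task_id; conceptually:

```python
gold_outputs = [
    {"task_id": "t1", "output": {"answer": "4"}},
    {"output": {"answer": "Paris"}},  # standalone: no task_id
]

# Equivalent of get_standalone_gold_outputs(): the contrastive reference pool.
standalone = [g for g in gold_outputs if g.get("task_id") is None]
print(len(standalone))  # 1
```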

EventInput

V3-compatible event input for verifier evaluation.

SessionTimeStepInput

V3-compatible session time step input.

SessionTraceInput

V3-compatible session trace input for judge evaluation.

GraphGenGraphJudgeRequest

Request for verifier graph inference.

GraphGenGraphCompletionsModelUsage

Token usage and cost for a single model in a graph completion.

EventRewardResponse

Event-level reward from verifier evaluation.

OutcomeRewardResponse

Outcome-level reward from verifier evaluation.

GraphGenGraphJudgeResponse

Response from verifier graph inference.

GraphGenGraphVerifierRequest

Alias for GraphGenGraphJudgeRequest with verifier terminology.

GraphGenGraphVerifierResponse

Alias for GraphGenGraphJudgeResponse with verifier terminology.

GraphGenJobConfig

Configuration for a GraphGen optimization job. Methods:

get_policy_provider

get_policy_provider(self) -> str
Get the policy provider (auto-detect if not specified).