
Quickstart: Graph Evolve

This guide walks you through training a multi-node LLM graph using Graph Evolve. By the end, you’ll have an optimized graph that outperforms a single prompt.
For most use cases, we recommend using the Graphs quickstart which provides a simpler interface. Use Graph Evolve directly when you need fine-grained control over evolution parameters.

Prerequisites

  • Synth AI API key (get one here)
  • Python 3.11+
  • synth-ai package installed
pip install synth-ai

Step 1: Prepare Your Dataset

Create a JSON file with your tasks and expected outputs:
{
  "tasks": [
    {
      "task_id": "q1",
      "input": {
        "question": "What is the capital of France?",
        "context": "France is a country in Western Europe."
      }
    },
    {
      "task_id": "q2",
      "input": {
        "question": "Who wrote Romeo and Juliet?",
        "context": "Romeo and Juliet is a famous tragedy."
      }
    }
  ],
  "gold_outputs": [
    {
      "task_id": "q1",
      "output": { "answer": "Paris" },
      "score": 1.0
    },
    {
      "task_id": "q2",
      "output": { "answer": "William Shakespeare" },
      "score": 1.0
    }
  ],
  "metadata": {
    "name": "simple_qa",
    "task_description": "Answer questions using the provided context"
  }
}
Save this as dataset.json.
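Before submitting, it can help to confirm that every task has a matching gold output, since a mismatched task_id would leave a task unscored. A minimal sketch (the check_dataset helper is illustrative, not part of the SDK):

```python
import json


def check_dataset(data: dict) -> int:
    """Verify tasks and gold_outputs line up by task_id; return the task count."""
    task_ids = {t["task_id"] for t in data["tasks"]}
    gold_ids = {g["task_id"] for g in data["gold_outputs"]}
    if task_ids != gold_ids:
        # ^ is symmetric difference: ids present on one side but not the other
        raise ValueError(f"mismatched task_ids: {task_ids ^ gold_ids}")
    return len(task_ids)
```

To use it, load your file and pass the parsed dict: `check_dataset(json.load(open("dataset.json")))`.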

Step 2: Create Configuration

Create a TOML configuration file:
# config.toml
[graph_optimization]
algorithm = "graph_evolve"
dataset_name = "simple_qa"

# Graph settings
graph_type = "policy"
graph_structure = "dag"
topology_guidance = "First extract relevant information from context, then formulate the answer"

# Models
allowed_policy_models = ["gpt-4o-mini"]
verifier_model = "gpt-4o-mini"
scoring_strategy = "rubric"

# Evolution
[graph_optimization.evolution]
num_generations = 3
children_per_generation = 2

[graph_optimization.proposer]
model = "gpt-4.1"

# Data splits
[graph_optimization.seeds]
train = [0, 1, 2, 3, 4]
validation = [5, 6, 7]

# Budget
[graph_optimization.limits]
max_spend_usd = 5.0
timeout_seconds = 1800

Step 3: Run Training

Submit the job with the Python SDK:
from synth_ai.sdk import GraphOptimizationJob

# Create a job from the dataset prepared in Step 1
job = GraphOptimizationJob.from_dataset(
    "dataset.json",
    policy_model="gpt-4o-mini",
    rollout_budget=100,
    proposer_effort="medium",
)
job.submit()

# Block until training finishes, streaming progress as it runs
result = job.stream_until_complete()
print(f"Best score: {result.best_score}")

Step 4: Use Your Graph

Production Inference

# Using Graph Evolve job
output = job.run_inference({
    "question": "What is the largest planet?",
    "context": "Jupiter is the largest planet in our solar system."
})
print(output)  # {"answer": "Jupiter"}
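run_inference takes one input at a time; for a small evaluation set you can simply loop over it. A sketch (the batch_infer helper is illustrative, not part of the SDK):

```python
def batch_infer(run_inference, inputs):
    """Apply a single-input inference callable to a list of inputs, in order."""
    return [run_inference(x) for x in inputs]
```

Usage with the job from Step 3: `outputs = batch_infer(job.run_inference, eval_inputs)`.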

Download for Local Use

# Get the optimized graph
graph_export = job.download_prompt()
print(graph_export)

What Happens During Training

  1. Initialization: Graph Evolve creates an initial population of graph candidates
  2. Evaluation: Each candidate is run on training seeds and scored
  3. Selection: Best candidates are selected for the next generation
  4. Mutation: LLM proposes modifications to prompts and structure
  5. Repeat: Process continues for num_generations
  6. Validation: Top candidates are evaluated on held-out validation seeds
Progress is reported per generation, for example:
Generation 1: best_score=0.65, candidates=5
Generation 2: best_score=0.72, candidates=5
Generation 3: best_score=0.81, candidates=5
Validation: final_score=0.79
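The steps above can be sketched as a generic evolutionary loop. This is a toy illustration of the algorithm's shape, not the service's actual implementation; evaluate and mutate stand in for LLM-driven scoring and proposal:

```python
import random


def evolve(initial, evaluate, mutate,
           num_generations=3, children_per_generation=2, seed=0):
    """Keep the best candidate; each generation, propose mutated children
    and select the fittest among the parent and its children."""
    rng = random.Random(seed)
    best = max(initial, key=evaluate)
    for _ in range(num_generations):
        children = [mutate(best, rng) for _ in range(children_per_generation)]
        best = max([best, *children], key=evaluate)
    return best


# Toy task: evolve a number toward the target value 10.
best = evolve(
    initial=[0.0],
    evaluate=lambda x: -abs(x - 10),
    mutate=lambda x, rng: x + rng.uniform(-1, 2),
)
```

In Graph Evolve, the candidates are graphs, evaluation is a rollout over the training seeds, and mutation is the proposer model rewriting prompts or structure.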

Tips for Better Results

1. More Training Data

More training examples generally yield better optimization:
[graph_optimization.seeds]
train = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
validation = [15, 16, 17, 18, 19]
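As the dataset grows, writing out contiguous seed lists by hand gets error-prone; you can derive the splits from the task count instead. A small helper (illustrative, not part of the SDK):

```python
def make_seed_split(num_tasks: int, validation_fraction: float = 0.25):
    """Split seed indices 0..num_tasks-1 into train and validation lists,
    reserving the last validation_fraction of seeds (at least one) for validation."""
    n_val = max(1, int(num_tasks * validation_fraction))
    seeds = list(range(num_tasks))
    return seeds[:-n_val], seeds[-n_val:]


train, validation = make_seed_split(20)
```

For 20 tasks this reproduces the split shown above: seeds 0-14 for training, 15-19 for validation.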

2. Topology Guidance

Help the proposer understand your task:
topology_guidance = """
For multi-hop reasoning questions:
1. First identify what information is needed
2. Extract relevant facts from context
3. Combine facts to form the answer
"""

3. Appropriate Structure

Match structure to task complexity:
Task                       Recommended Structure
Simple classification      single_prompt
Multi-step reasoning       dag
Routing/branching logic    conditional

4. Budget Allocation

More generations with fewer children per generation often beat a few generations with many children:
[graph_optimization.evolution]
num_generations = 5        # More iterations
children_per_generation = 2  # Fewer variants per iteration
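A quick way to reason about this trade-off is rollout count: each generation evaluates every child on every training seed, so training cost scales roughly as generations × children × train seeds (a back-of-the-envelope approximation; the service's exact accounting may differ):

```python
def approx_rollouts(num_generations: int, children_per_generation: int,
                    num_train_seeds: int) -> int:
    """Rough count of training rollouts for one evolution run."""
    return num_generations * children_per_generation * num_train_seeds


# Both settings cost ~50 rollouts on 5 train seeds, but the first
# gets five rounds of proposer feedback instead of two.
deep = approx_rollouts(5, 2, 5)
wide = approx_rollouts(2, 5, 5)
```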

Troubleshooting

Low Scores

  • Add more diverse training examples
  • Increase num_generations
  • Try different topology_guidance
  • Check that gold outputs are correct

Slow Training

  • Reduce children_per_generation
  • Use faster policy model (e.g., gpt-4o-mini)
  • Reduce training seed count

High Costs

  • Set max_spend_usd limit
  • Use max_llm_calls_per_run to limit graph complexity
  • Use cheaper models in allowed_policy_models
