System Specifications (Specs)

Overview

References:

GEPA: Agrawal et al. (2025). “GEPA: Reflective Prompt Evolution Can Outperform Reinforcement Learning.” arXiv:2507.19457

System specifications (specs) are structured JSON documents that define task principles, rules, policies, and constraints for prompt optimization. GEPA uses specs to guide prompt generation with domain-specific knowledge.

What Are Specs?

Specs are JSON files that encode:

Principles: High-level guidelines for the task
Rules: Specific policies with priorities (0-10)
Constraints: must / must_not / should / should_not directives
Examples: Good and bad examples for each rule
Glossary: Domain-specific terminology
Interfaces: Input/output formats and capabilities

Example Spec Structure

{
  "metadata": {
    "id": "spec.banking77_pipeline.v1",
    "title": "Banking77 Two-Stage Classification Pipeline Specification",
    "version": "1.0.0",
    "scope": "banking-intent-classification-pipeline"
  },
  "principles": [
    {
      "id": "P-clarity",
      "text": "Prioritize immediate-action intents over informational queries when multiple interpretations are possible.",
      "rationale": "Customers with urgent issues need immediate assistance."
    }
  ],
  "rules": [
    {
      "id": "R-card-disambiguation",
      "title": "Disambiguate card arrival vs. card payment issues",
      "priority": 10,
      "constraints": {
        "must": [
          "Classify as 'declined_card_payment' if payment-related keywords are present",
          "Classify as 'card_arrival' if delivery-related keywords are present"
        ],
        "must_not": [
          "Assume 'card' always refers to physical delivery"
        ]
      },
      "examples": [
        {
          "kind": "good",
          "prompt": "My card was declined at the store",
          "response": "declined_card_payment",
          "description": "Payment keyword indicates payment issue"
        }
      ]
    }
  ],
  "glossary": [
    {
      "term": "disambiguate",
      "definition": "To distinguish between multiple plausible interpretations",
      "aliases": ["clarify", "distinguish"]
    }
  ]
}
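
Before pointing GEPA at a spec, it can help to check that the file parses and contains the sections you expect. The following is a minimal sketch, not part of GEPA: `summarize_spec`, `REQUIRED_TOP_LEVEL`, and the choice of which keys count as required are assumptions made for illustration, and `EXAMPLE_SPEC` is a trimmed copy of the structure above.

```python
import json

# Assumption: a spec needs at least these top-level keys to be useful.
REQUIRED_TOP_LEVEL = ("metadata", "rules")


def summarize_spec(spec_text: str) -> dict:
    """Parse spec JSON text and return a small summary of what it contains."""
    spec = json.loads(spec_text)
    missing = [key for key in REQUIRED_TOP_LEVEL if key not in spec]
    if missing:
        raise ValueError(f"spec is missing top-level keys: {missing}")
    return {
        "id": spec["metadata"].get("id"),
        "principles": [p["id"] for p in spec.get("principles", [])],
        "rules": {r["id"]: r.get("priority") for r in spec["rules"]},
        "glossary_terms": [g["term"] for g in spec.get("glossary", [])],
    }


# Trimmed copy of the example structure, kept inline so the sketch runs as-is.
EXAMPLE_SPEC = """
{
  "metadata": {"id": "spec.banking77_pipeline.v1", "version": "1.0.0"},
  "principles": [{"id": "P-clarity", "text": "Prioritize immediate-action intents."}],
  "rules": [{"id": "R-card-disambiguation", "priority": 10}],
  "glossary": [{"term": "disambiguate", "definition": "To distinguish between interpretations"}]
}
"""

if __name__ == "__main__":
    print(summarize_spec(EXAMPLE_SPEC))
```

A summary like this makes it easy to spot a rule that lacks a priority, since it shows up as `None` in the `rules` mapping.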

How Specs Are Used

GEPA: Spec-Guided Mutations

When GEPA uses proposer_type = "spec", the spec is included in mutation prompts:

[prompt_learning.gepa]
proposer_type = "spec"  # Use spec mode
spec_path = "examples/containers/banking77_pipeline/banking77_pipeline_spec.json"
spec_max_tokens = 5000
spec_include_examples = true
spec_priority_threshold = 8  # Only include high-priority rules (8+)

How it works:

Spec Loading: GEPA loads the spec JSON file at initialization
Context Serialization: Spec is converted to compact markdown format (up to spec_max_tokens)
Mutation Prompts: Spec context is injected into LLM-guided mutation prompts
Rule Filtering: Only rules with priority >= spec_priority_threshold are included
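
The filtering and serialization steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not GEPA's serializer: `filter_rules`, `render_spec_context`, the exact markdown layout, and the low-priority rule `R-low` are all hypothetical.

```python
def filter_rules(rules, threshold):
    """Keep rules with priority >= threshold, highest priority first."""
    kept = [r for r in rules if r.get("priority", 0) >= threshold]
    return sorted(kept, key=lambda r: r.get("priority", 0), reverse=True)


def render_spec_context(spec, threshold=8, include_examples=True):
    """Render principles and filtered rules as compact markdown-style text."""
    lines = []
    for principle in spec.get("principles", []):
        lines.append(f"- Principle {principle['id']}: {principle['text']}")
    for rule in filter_rules(spec.get("rules", []), threshold):
        lines.append(f"### {rule['id']} (priority {rule.get('priority', 0)})")
        for kind, items in rule.get("constraints", {}).items():
            label = kind.upper().replace("_", " ")  # must_not -> MUST NOT
            lines.extend(f"- {label}: {item}" for item in items)
        if include_examples:
            for ex in rule.get("examples", []):
                lines.append(f"- Example ({ex['kind']}): {ex['prompt']} -> {ex['response']}")
    return "\n".join(lines)


SPEC = {
    "principles": [{"id": "P-clarity", "text": "Prioritize immediate-action intents."}],
    "rules": [
        # Hypothetical low-priority rule, included only to show filtering.
        {"id": "R-low", "priority": 5, "constraints": {"should": ["Prefer short labels"]}},
        {
            "id": "R-card-disambiguation",
            "priority": 10,
            "constraints": {"must_not": ["Assume 'card' always refers to physical delivery"]},
            "examples": [
                {"kind": "good", "prompt": "My card was declined", "response": "declined_card_payment"}
            ],
        },
    ],
}

if __name__ == "__main__":
    print(render_spec_context(SPEC, threshold=8))
```

With `threshold=8`, the priority-5 rule is dropped and only the card disambiguation rule reaches the rendered context.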

Mutation Prompt Structure:

You are a prompt engineering expert. Improve the instruction text (DSPy-style).

Requirements:
- Preserve placeholders (e.g., {{query}}) and tool names
- Be precise, action-oriented, and unambiguous
- Keep guidance concise; avoid fluff

Current instruction:
{classifier_instruction}

Feedback (hints to address):
{feedback_text}

## System Specification
(Task principles, rules, and policies from spec document)
{spec_context}

Output: 1-3 bullet snippets (1-2 sentences each) that replace/augment the instruction.
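
One detail worth illustrating is how the template slots might be filled without damaging runtime placeholders such as {{query}}. Python's `str.format` would collapse `{{query}}` to `{query}`, so the sketch below fills only single-brace slots with a regular expression. The template is abbreviated, and `build_mutation_prompt` and `SINGLE_BRACE_SLOT` are hypothetical names; how GEPA fills its template internally is not documented here.

```python
import re

# Abbreviated version of the mutation prompt structure shown above.
MUTATION_TEMPLATE = """You are a prompt engineering expert. Improve the instruction text (DSPy-style).

Requirements:
- Preserve placeholders (e.g., {{query}}) and tool names

Current instruction:
{classifier_instruction}

Feedback (hints to address):
{feedback_text}

## System Specification
{spec_context}
"""

# Matches {name} but not {{name}}, so runtime placeholders survive untouched.
SINGLE_BRACE_SLOT = re.compile(r"(?<!\{)\{(\w+)\}(?!\})")


def build_mutation_prompt(instruction: str, feedback: str, spec_context: str) -> str:
    """Fill the single-brace slots in one pass; inserted text is not rescanned."""
    values = {
        "classifier_instruction": instruction,
        "feedback_text": feedback,
        "spec_context": spec_context,
    }
    return SINGLE_BRACE_SLOT.sub(
        lambda match: values.get(match.group(1), match.group(0)),
        MUTATION_TEMPLATE,
    )


if __name__ == "__main__":
    print(
        build_mutation_prompt(
            "Classify {{query}} into one banking intent.",
            "Confuses card_arrival with declined_card_payment.",
            "- MUST NOT: Assume 'card' always refers to physical delivery",
        )
    )
```

Because substitution happens in a single pass, braces that appear inside the current instruction or the feedback are left exactly as written.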

Configuration Parameters

`spec_path` (Required)

Path to the spec JSON file (relative to config file or absolute).

spec_path = "examples/containers/banking77_pipeline/banking77_pipeline_spec.json"
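
If you want to check where a relative spec_path will point, the resolution rule can be sketched as below. This assumes "relative to config file" means relative to the directory containing the config file; `resolve_spec_path` is a hypothetical helper, and the paths in the usage lines are placeholders.

```python
from pathlib import Path


def resolve_spec_path(spec_path: str, config_file: str) -> Path:
    """Return spec_path unchanged if absolute, else join it to the config's directory."""
    path = Path(spec_path)
    if path.is_absolute():
        return path
    # Assumption: relative paths are anchored at the config file's parent directory.
    return Path(config_file).parent / path


if __name__ == "__main__":
    print(resolve_spec_path("specs/banking77.json", "/tmp/project/config.toml"))
    print(resolve_spec_path("/opt/specs/banking77.json", "/tmp/project/config.toml"))
```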

`spec_max_tokens` (Default: 5000)

Maximum tokens for spec context in prompts. The serializer will:

Start with high-priority rules (priority >= 7)
Remove examples if still too long
Remove glossary if still too long
Increase priority threshold if still too long

spec_max_tokens = 5000  # Default
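
The fallback order described above can be sketched as a small loop. This is an illustration only: `estimate_tokens` uses a rough 4-characters-per-token guess rather than a real tokenizer, `render` is a simplified stand-in for the serializer, and stopping at threshold 10 is an assumption.

```python
def estimate_tokens(text: str) -> int:
    """Crude token estimate (assumption: about 4 characters per token)."""
    return len(text) // 4


def render(spec: dict, threshold: int, examples: bool, glossary: bool) -> str:
    """Simplified stand-in for the serializer: one line per directive."""
    lines = []
    for rule in spec.get("rules", []):
        if rule.get("priority", 0) < threshold:
            continue
        lines.append(f"{rule['id']} (priority {rule.get('priority', 0)})")
        for kind, items in rule.get("constraints", {}).items():
            lines.extend(f"  {kind}: {item}" for item in items)
        if examples:
            lines.extend(
                f"  example: {ex['prompt']} -> {ex['response']}"
                for ex in rule.get("examples", [])
            )
    if glossary:
        lines.extend(f"{g['term']}: {g['definition']}" for g in spec.get("glossary", []))
    return "\n".join(lines)


def fit_to_budget(spec: dict, max_tokens: int = 5000):
    """Apply the documented fallbacks in order until the context fits."""
    options = {"threshold": 7, "examples": True, "glossary": True}

    def too_long() -> bool:
        return estimate_tokens(render(spec, **options)) > max_tokens

    if too_long():
        options["examples"] = False      # step 2: remove examples
    if too_long():
        options["glossary"] = False      # step 3: remove glossary
    while too_long() and options["threshold"] < 10:
        options["threshold"] += 1        # step 4: raise the priority threshold
    return render(spec, **options), options
```

Calling `fit_to_budget(spec, max_tokens=5000)` returns the rendered text together with the options that were needed to fit, which is a quick way to see whether your examples or glossary are likely to be dropped.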

`spec_include_examples` (Default: true)

Whether to include rule examples in the spec context.

spec_include_examples = true  # Include good/bad examples

`spec_priority_threshold` (Optional)

Only include rules with priority >= threshold. A higher threshold keeps fewer, higher-priority rules.

spec_priority_threshold = 8  # Only include priority 8+ rules
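
To choose a threshold, it can be useful to preview how many rules survive at each value. The sketch below uses the rule ids and priorities of the Banking77 pipeline spec described later on this page; `rules_kept` and `preview_thresholds` are hypothetical helpers, not GEPA functions.

```python
# Rule priorities as listed for the Banking77 pipeline spec.
BANKING77_RULE_PRIORITIES = {
    "R-card-disambiguation": 10,
    "R-urgency-signals": 10,
    "R-balance-transfer": 9,
    "R-stage-coordination": 8,
}


def rules_kept(priorities: dict, threshold: int) -> list:
    """Return the ids of rules whose priority is at or above the threshold."""
    return [rule_id for rule_id, priority in priorities.items() if priority >= threshold]


def preview_thresholds(priorities: dict, thresholds=(7, 8, 9, 10)) -> dict:
    """Map each candidate threshold to the number of rules it would keep."""
    return {t: len(rules_kept(priorities, t)) for t in thresholds}


if __name__ == "__main__":
    for threshold, count in preview_thresholds(BANKING77_RULE_PRIORITIES).items():
        print(f"threshold {threshold}: {count} rules kept")
```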

Priority Guidelines:

10: Critical rules (must always be followed)
9: High-priority rules (important for accuracy)
8: Medium-high priority (recommended)
7: Medium priority (helpful)
<7: Lower priority (may be filtered out)

Spec Format Details

Principles

High-level guidelines that apply across all rules:

{
  "id": "P-clarity",
  "text": "Prioritize immediate-action intents over informational queries",
  "rationale": "Customers with urgent issues need immediate assistance."
}

Rules

Specific policies with priorities and constraints:

{
  "id": "R-card-disambiguation",
  "title": "Disambiguate card arrival vs. card payment issues",
  "priority": 10,
  "rationale": "Queries mentioning 'card' can refer to physical delivery or payment problems.",
  "constraints": {
    "must": [
      "Classify as 'declined_card_payment' if payment-related keywords are present"
    ],
    "must_not": [
      "Assume 'card' always refers to physical delivery"
    ],
    "should": [
      "Consider the query_analyzer's complexity assessment"
    ]
  },
  "examples": [
    {
      "kind": "good",
      "prompt": "My card was declined",
      "response": "declined_card_payment",
      "description": "Payment keyword indicates payment issue"
    },
    {
      "kind": "bad",
      "prompt": "My card isn't working",
      "response": "card_arrival",
      "description": "WRONG: Should be card_not_working"
    }
  ]
}

Constraint Types

must: Required behaviors (always enforced)
must_not: Prohibited behaviors (never allowed)
should: Recommended behaviors (preferred when possible)
should_not: Discouraged behaviors (avoid when possible)
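
A small lint pass can catch misspelled constraint types or out-of-range priorities before a run. The sketch below is a hypothetical checker, not part of GEPA; the allowed sets come from the constraint types and example kinds described on this page, and treating an empty directive list as a problem is an assumption.

```python
ALLOWED_CONSTRAINT_KINDS = {"must", "must_not", "should", "should_not"}
ALLOWED_EXAMPLE_KINDS = {"good", "bad"}


def lint_rule(rule: dict) -> list:
    """Return a list of human-readable problems found in one rule."""
    problems = []
    rule_id = rule.get("id", "<missing id>")

    priority = rule.get("priority")
    if not isinstance(priority, int) or not 0 <= priority <= 10:
        problems.append(f"{rule_id}: priority must be an integer from 0 to 10")

    for kind, items in rule.get("constraints", {}).items():
        if kind not in ALLOWED_CONSTRAINT_KINDS:
            problems.append(f"{rule_id}: unknown constraint type '{kind}'")
        elif not items or not all(isinstance(i, str) and i.strip() for i in items):
            # Assumption: every constraint type should list at least one directive.
            problems.append(f"{rule_id}: '{kind}' needs non-empty string directives")

    for example in rule.get("examples", []):
        if example.get("kind") not in ALLOWED_EXAMPLE_KINDS:
            problems.append(f"{rule_id}: example kind must be 'good' or 'bad'")

    return problems
```

An empty list means the rule passed every check; anything else can be printed as a report before the spec is used.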

Benefits of Using Specs

1. Domain Knowledge Injection

Specs encode expert knowledge about the task:

Edge cases and disambiguation rules
Domain-specific terminology
Priority-based policies

2. Constraint-Aware Optimization

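In spec mode, each rule's must and must_not constraints are part of the spec context injected into mutation prompts, so proposed mutations are steered toward instructions that follow them.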

3. Faster Convergence

Spec-guided optimization typically:

Converges faster (fewer generations/iterations)
Produces more accurate prompts
Better handles edge cases

4. Consistency

Specs ensure:

Consistent terminology across prompts
Alignment with domain requirements
Compliance with business rules

Example: Banking77 Pipeline Spec

Location: examples/containers/banking77_pipeline/banking77_pipeline_spec.json

Key Rules:

R-card-disambiguation (Priority 10): Distinguish card delivery vs. payment issues
R-urgency-signals (Priority 10): Handle urgent queries (lost cards, fraud)
R-balance-transfer (Priority 9): Disambiguate balance update scenarios
R-stage-coordination (Priority 8): Coordinate between analyzer and classifier stages

Usage in Config:

[prompt_learning.gepa]
proposer_type = "spec"
spec_path = "examples/containers/banking77_pipeline/banking77_pipeline_spec.json"
spec_max_tokens = 5000
spec_include_examples = true
spec_priority_threshold = 8

When to Use Specs

Use specs when:

✅ You have domain expertise to encode
✅ Task has complex edge cases or disambiguation rules
✅ You want faster convergence
✅ Consistency with business rules is critical
✅ Multi-stage pipelines need coordination rules

Skip specs when:

❌ Task is simple and straightforward
❌ No domain-specific rules or constraints
❌ You want maximum exploration (specs may constrain search)

Creating a Spec

Step 1: Define Principles

Start with high-level guidelines:

{
  "principles": [
    {
      "id": "P-clarity",
      "text": "Prioritize immediate-action intents over informational queries",
      "rationale": "Urgent issues need immediate assistance."
    }
  ]
}

Step 2: Add Rules

Define specific policies with priorities:

{
  "rules": [
    {
      "id": "R-card-disambiguation",
      "title": "Disambiguate card arrival vs. card payment issues",
      "priority": 10,
      "constraints": {
        "must": [
          "Classify as 'declined_card_payment' if payment keywords present"
        ]
      },
      "examples": [
        {
          "kind": "good",
          "prompt": "My card was declined",
          "response": "declined_card_payment"
        }
      ]
    }
  ]
}

Step 3: Add Glossary

Define domain-specific terms:

{
  "glossary": [
    {
      "term": "disambiguate",
      "definition": "To distinguish between multiple plausible interpretations",
      "aliases": ["clarify", "distinguish"]
    }
  ]
}

Step 4: Reference in Config

Point to the spec file:

[prompt_learning.gepa]
proposer_type = "spec"
spec_path = "path/to/your/spec.json"
spec_max_tokens = 5000
spec_priority_threshold = 8
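
If you prefer to generate the spec rather than write JSON by hand, steps 1 to 3 can be assembled in a short script. This is a minimal sketch: `build_spec` and `spec_to_json` are hypothetical helpers, the hard-coded version string is a placeholder, and both the uniqueness check on rule ids and the priority ordering are assumptions rather than documented requirements.

```python
import json


def build_spec(spec_id, title, principles, rules, glossary=()):
    """Assemble a spec dict from the pieces defined in steps 1 to 3."""
    rule_ids = [rule["id"] for rule in rules]
    if len(rule_ids) != len(set(rule_ids)):
        # Assumption: rule ids should be unique within a spec.
        raise ValueError("rule ids must be unique")
    return {
        "metadata": {"id": spec_id, "title": title, "version": "1.0.0"},
        "principles": list(principles),
        # Highest-priority rules first, purely for readability of the file.
        "rules": sorted(rules, key=lambda r: r.get("priority", 0), reverse=True),
        "glossary": list(glossary),
    }


def spec_to_json(spec: dict) -> str:
    """Serialize the spec as indented JSON text."""
    return json.dumps(spec, indent=2, ensure_ascii=False)


if __name__ == "__main__":
    spec = build_spec(
        "spec.banking77_pipeline.v1",
        "Banking77 Two-Stage Classification Pipeline Specification",
        principles=[{"id": "P-clarity", "text": "Prioritize immediate-action intents over informational queries"}],
        rules=[{"id": "R-card-disambiguation", "priority": 10}],
        glossary=[{"term": "disambiguate", "definition": "To distinguish between multiple plausible interpretations"}],
    )
    print(spec_to_json(spec))
```

The resulting text can be saved to the location named in spec_path, for example with `pathlib.Path(...).write_text(spec_to_json(spec), encoding="utf-8")`.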

Best Practices

Start with High-Priority Rules: Focus on critical constraints first (priority 8+)
Include Examples: Good and bad examples help the optimizer understand intent
Use Clear Constraints: Be specific with must/must_not directives
Test Token Limits: Ensure spec_max_tokens fits in your model’s context window
Filter by Priority: Use spec_priority_threshold to focus on important rules
Update Regularly: Keep specs in sync with task requirements

Comparison: DSPy vs Spec Mode

| Aspect | DSPy Mode | Spec Mode |
| --- | --- | --- |
| Guidance | Generic prompt engineering principles | Domain-specific rules and constraints |
| Convergence | Slower (broader exploration) | Faster (focused search) |
| Accuracy | Good for general tasks | Better for domain-specific tasks |
| Setup | No additional files | Requires spec JSON file |
| Best For | Simple tasks, exploration | Complex tasks, edge cases |

Next Steps

Configuration Reference – Complete spec parameter documentation
GEPA Guide – How GEPA uses specs
Prompt Optimization Cookbook – Real-world spec usage
