These scenarios validate launch readiness and partner onboarding.

Howie (email agent)

What it validates:
  • End-to-end orchestration for a repo-backed harness
  • Spend metering and budget accounting
  • Morning summary and proof artifact generation
Expected outcome:
  • Completed run with non-zero spend and downloadable deliverables.

Crafter (GEPA vs MIPRO)

What it validates:
  • Research scenario orchestration
  • Policy optimization flows and trial matrix behavior
  • Algorithm-specific artifact generation
Expected outcome:
  • Completed optimization run with experiment outputs and synthesis artifacts.

MintlifyBench (docs generation)

What it validates:
  • Content generation from SDK repo + company URL
  • Worker compliance with strict output contracts
  • Scoring pipeline for generated docs quality
Expected outcome:
  • Mintlify docs output (.mdx + mint.json) and associated scored artifacts.
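The expected outcome above can be spot-checked with a small script. This is a minimal sketch, not the platform's scoring pipeline: the function name `check_docs_output` and the assumption that deliverables land in a single local directory are hypothetical. It only confirms that at least one .mdx page and a parseable mint.json are present.

```python
import json
from pathlib import Path


def check_docs_output(out_dir: Path) -> list[str]:
    """Return a list of problems found; an empty list means the check passed.

    Hypothetical helper: assumes the generated docs were downloaded into
    `out_dir`, with mint.json at the top level of that directory.
    """
    problems = []

    # At least one .mdx page must exist somewhere under the output directory.
    if not any(out_dir.rglob("*.mdx")):
        problems.append("no .mdx pages found")

    # mint.json must exist and parse as JSON.
    config = out_dir / "mint.json"
    if not config.is_file():
        problems.append("mint.json is missing")
    else:
        try:
            json.loads(config.read_text(encoding="utf-8"))
        except json.JSONDecodeError as exc:
            problems.append(f"mint.json is not valid JSON: {exc}")

    return problems
```

Calling `check_docs_output(Path("downloaded_docs"))` on a downloaded deliverable (the directory name is illustrative) returns an empty list when both parts of the output are present.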

Choosing a scenario

  • Start with Howie for baseline platform validation.
  • Use MintlifyBench for content-generation and docs-focused partners.
  • Use Crafter for optimization-focused partners and research workloads.
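
The guidance above amounts to a simple lookup. The sketch below encodes it for use in onboarding scripts; the `PartnerFocus` categories and the `choose_scenario` helper are illustrative names, not part of any platform API.

```python
from enum import Enum


class PartnerFocus(Enum):
    """Hypothetical categories for what a partner wants to validate."""
    BASELINE = "baseline"
    DOCS = "docs"
    OPTIMIZATION = "optimization"


# Mapping taken directly from the guidance in this section.
SCENARIO_FOR_FOCUS = {
    PartnerFocus.BASELINE: "Howie",
    PartnerFocus.DOCS: "MintlifyBench",
    PartnerFocus.OPTIMIZATION: "Crafter",
}


def choose_scenario(focus: PartnerFocus = PartnerFocus.BASELINE) -> str:
    """Return the scenario name for a focus; defaults to the baseline scenario."""
    return SCENARIO_FOR_FOCUS[focus]
```

Defaulting to the baseline focus mirrors the recommendation to start with Howie before moving to a specialized scenario.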