Accurate for Stack 0.1.0 and later unless a section cites a newer release. See Stack Changelog.
Stack’s auxiliary agents (and your own clients) can call NVIDIA Nemotron 3 Ultra through Synth’s OpenAI-compatible inference endpoint. It is a drop-in /responses API authenticated with a Synth API key.

1. Sign in and get a key

Create a Synth account and an API key at usesynth.ai/keys.
export SYNTH_API_KEY=sk_...
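A small guard can catch a missing key before any request is sent. The following is a minimal sketch; the function name require_synth_key is hypothetical and is not part of Stack or Synth.

```shell
# Hypothetical helper (not part of Stack or Synth): fail early when the
# SYNTH_API_KEY variable exported above is unset or empty.
require_synth_key() {
  if [ -z "${SYNTH_API_KEY:-}" ]; then
    echo "SYNTH_API_KEY is not set; create a key at usesynth.ai/keys" >&2
    return 1
  fi
}
```

Calling require_synth_key at the top of a script lets the script stop with a readable message before it attempts an authenticated request.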

2. List models

curl https://api.usesynth.ai/api/v1/synth/models \
  -H "authorization: Bearer $SYNTH_API_KEY"
You should see nemotron-3-ultra in the catalog.
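To check for the model in a script rather than by eye, one option is a plain text search over the response body. This is a sketch under assumptions: the shape of the catalog JSON is not documented here, so the helper only looks for the quoted model id anywhere in the output, and the name catalog_has_model is hypothetical.

```shell
# Hypothetical helper: read a catalog response on stdin and succeed if the
# quoted model id appears anywhere in it. The JSON layout is not assumed;
# this is a fixed-string match, not a JSON parse.
catalog_has_model() {
  grep -qF -- "\"$1\""
}
```

It could be used as: curl -sS https://api.usesynth.ai/api/v1/synth/models -H "authorization: Bearer $SYNTH_API_KEY" | catalog_has_model nemotron-3-ultra && echo "model available". A JSON-aware tool would be more precise once the response shape is known.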

3. Run inference

The aux endpoint requires an X-Stack-Actor-Role header set to monitor, gardener, or aux. HTTP header names are case-insensitive, so the lowercase form in the example is equivalent. Use aux for general calls:
curl https://api.usesynth.ai/api/v1/stack-aux/openai/v1/responses \
  -H "authorization: Bearer $SYNTH_API_KEY" \
  -H "x-stack-actor-role: aux" \
  -H "content-type: application/json" \
  -d '{"model":"nemotron-3-ultra","input":"Reply with one word: pong."}'
Without the x-stack-actor-role header, the request returns 403. Primary coding roles (worker, primary, codex, cursor) are intentionally rejected because this endpoint is reserved for auxiliary agents and aux inference.
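Because a wrong role is rejected server-side, a client can check the role locally and skip the round trip. The wrapper below is a minimal sketch, not an official client: the function names is_aux_role and aux_respond are hypothetical, the accepted roles are the three listed above, and the JSON body is passed through unchanged.

```shell
# Hypothetical helpers (names are illustrative, not part of Stack or Synth).

# Succeeds only for the roles this page lists as accepted.
is_aux_role() {
  case "$1" in
    monitor|gardener|aux) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: aux_respond ROLE JSON_BODY
# Refuses locally (exit status 2) when ROLE is not an accepted role;
# otherwise sends the same request as the curl example above.
aux_respond() {
  role="$1"
  body="$2"
  if ! is_aux_role "$role"; then
    echo "refusing to send: '$role' is not an auxiliary role" >&2
    return 2
  fi
  curl -sS https://api.usesynth.ai/api/v1/stack-aux/openai/v1/responses \
    -H "authorization: Bearer $SYNTH_API_KEY" \
    -H "x-stack-actor-role: $role" \
    -H "content-type: application/json" \
    -d "$body"
}
```

For example, aux_respond aux '{"model":"nemotron-3-ultra","input":"Reply with one word: pong."}' sends the request, while aux_respond worker '{}' returns without contacting the server.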

4. Usage and cost

Every request is metered per token and recorded against your org. Inference runs on a free aux tier with a per-org daily cap; usage and remaining quota are available at:
curl https://api.usesynth.ai/api/v1/stack-aux/usage \
  -H "authorization: Bearer $SYNTH_API_KEY" \
  -H "x-stack-actor-role: aux"

In Stack

When you sign in, Stack’s monitor and gardener aux agents use this endpoint automatically, so you do not need to set headers by hand. The signed-out local loop (Quickstart) keeps working without any of this.