Accurate for Stack 0.1.0 and later unless a section cites a newer release.
See Stack Changelog.
Stack’s auxiliary agents (and your own clients) can call NVIDIA Nemotron 3 Ultra
through Synth’s OpenAI-compatible inference endpoint. The endpoint is a drop-in
replacement for the OpenAI /responses API and authenticates with a Synth API key.
1. Sign in and get a key
Create a Synth account and an API key at usesynth.ai/keys.
export SYNTH_API_KEY=sk_...
2. List models
curl https://api.usesynth.ai/api/v1/synth/models \
  -H "authorization: Bearer $SYNTH_API_KEY"
You should see nemotron-3-ultra in the catalog.
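The same catalog check can be scripted. The following is a minimal sketch using only the Python standard library. The helper names `fetch_models` and `model_ids` are hypothetical, and the response shape is an assumption: this page does not document it, so the parser accepts either an OpenAI-style `{"data": [{"id": ...}]}` object or a bare list.

```python
import json
import urllib.request

MODELS_URL = "https://api.usesynth.ai/api/v1/synth/models"


def model_ids(payload):
    """Collect model ids from a catalog payload.

    Assumption: the catalog is either an OpenAI-style {"data": [{"id": ...}]}
    object or a bare list of entries; adjust if the real shape differs.
    """
    entries = payload.get("data", []) if isinstance(payload, dict) else payload
    ids = []
    for entry in entries:
        if isinstance(entry, dict) and "id" in entry:
            ids.append(entry["id"])
        elif isinstance(entry, str):
            ids.append(entry)
    return ids


def fetch_models(api_key):
    """GET the model catalog with the same bearer header as the curl call."""
    request = urllib.request.Request(
        MODELS_URL, headers={"authorization": f"Bearer {api_key}"}
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

With `SYNTH_API_KEY` exported, `"nemotron-3-ultra" in model_ids(fetch_models(os.environ["SYNTH_API_KEY"]))` (after `import os`) should evaluate to true if the catalog matches one of the assumed shapes.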
3. Run inference
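The aux endpoint requires an X-Stack-Actor-Role header set to monitor, gardener, or
aux. HTTP header names are case-insensitive, so the lowercase form in the example
below is equivalent. Use aux for general calls: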
curl https://api.usesynth.ai/api/v1/stack-aux/openai/v1/responses \
  -H "authorization: Bearer $SYNTH_API_KEY" \
  -H "x-stack-actor-role: aux" \
  -H "content-type: application/json" \
  -d '{"model":"nemotron-3-ultra","input":"Reply with one word: pong."}'
Without the x-stack-actor-role header, the request returns 403. Primary coding
roles (worker, primary, codex, cursor) are intentionally rejected, because this
endpoint is for auxiliary agents and aux inference only.
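For clients that prefer code over curl, the request and the role rule can be sketched as below, using only the Python standard library. The function names `build_aux_request` and `output_text` are hypothetical. The role check is a client-side convenience that mirrors the list above, not a description of how the server validates roles. The reply parsing assumes the OpenAI Responses shape (an `output` list of items whose `content` parts have type `output_text`), which this page implies through "OpenAI-compatible" but does not spell out.

```python
import json
import urllib.request

RESPONSES_URL = "https://api.usesynth.ai/api/v1/stack-aux/openai/v1/responses"
AUX_ROLES = {"monitor", "gardener", "aux"}  # roles this page lists as accepted


def build_aux_request(api_key, prompt, role="aux", model="nemotron-3-ultra"):
    """Build (but do not send) a /responses request for an auxiliary role."""
    if role not in AUX_ROLES:
        # Client-side guard only: fail early instead of waiting for the server
        # to reject a primary coding role such as "worker" or "codex".
        raise ValueError(f"unsupported actor role: {role!r}")
    body = json.dumps({"model": model, "input": prompt}).encode("utf-8")
    return urllib.request.Request(
        RESPONSES_URL,
        data=body,
        method="POST",
        headers={
            "authorization": f"Bearer {api_key}",
            "x-stack-actor-role": role,
            "content-type": "application/json",
        },
    )


def output_text(response_body):
    """Join the text parts of a Responses-style reply.

    Assumption: replies follow the OpenAI Responses shape, where "output"
    holds items whose "content" parts have type "output_text".
    """
    parts = []
    for item in response_body.get("output", []):
        for part in item.get("content") or []:
            if part.get("type") == "output_text":
                parts.append(part.get("text", ""))
    return "".join(parts)
```

To send the request, pass the result of `build_aux_request` to `urllib.request.urlopen`, decode the body with `json.load`, and hand the resulting dict to `output_text`. Building the request separately from sending it keeps the header logic easy to inspect without touching the network.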
4. Usage and cost
Every request is metered per token and recorded against your org. Inference runs on a
free aux tier with a per-org daily cap; usage and remaining quota are available at:
curl https://api.usesynth.ai/api/v1/stack-aux/usage \
  -H "authorization: Bearer $SYNTH_API_KEY" \
  -H "x-stack-actor-role: aux"
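A script that polls quota before a batch of calls might look like the sketch below, which uses only the Python standard library. `build_usage_request` and `fetch_usage` are hypothetical helper names. Because this page does not document the fields of the usage response, the sketch returns the decoded JSON as-is rather than guessing at key names.

```python
import json
import urllib.request

USAGE_URL = "https://api.usesynth.ai/api/v1/stack-aux/usage"


def build_usage_request(api_key, role="aux"):
    """Build the GET request with the same two headers as the curl call."""
    return urllib.request.Request(
        USAGE_URL,
        headers={
            "authorization": f"Bearer {api_key}",
            "x-stack-actor-role": role,
        },
    )


def fetch_usage(api_key, role="aux"):
    """Send the request and return the decoded JSON body unchanged.

    The field names in the body are not documented on this page, so callers
    should inspect the returned object before relying on any particular key.
    """
    request = build_usage_request(api_key, role)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Printing `json.dumps(fetch_usage(api_key), indent=2)` once is a reasonable way to learn the actual field names before building logic on them.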
In Stack
When you sign in, Stack’s monitor and gardener aux agents use this endpoint
automatically, so you do not need to set headers by hand. The signed-out local loop
(Quickstart) keeps working without a Synth account or API key.