# Hosted inference

> Sign in to Synth and call Nemotron 3 Ultra through Stack's OpenAI-compatible aux endpoint.

<Note>
  Accurate for Stack **`0.1.0`** and later unless a section cites a newer release.
  See [Stack Changelog](/stack/changelog).
</Note>

Stack's auxiliary agents (and your own clients) can call **NVIDIA Nemotron 3 Ultra**
through Synth's OpenAI-compatible inference endpoint. The endpoint exposes an
OpenAI-style `/responses` API and authenticates with a Synth API key. Each request
must also carry the role header described in step 3.

## 1. Sign in and get a key

Create a Synth account and an API key at [usesynth.ai/keys](https://usesynth.ai/keys).

```bash theme={null}
export SYNTH_API_KEY=sk_...
```
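
If later calls fail with an authentication error, it helps to first confirm that the key was actually exported in the current shell. The function below is a minimal sketch of such a check; `require_synth_key` is a hypothetical helper name, not part of Stack or Synth.

```bash
# Hypothetical helper (not part of Stack): fail fast when the key is missing.
# It only checks that the variable is non-empty, not that the key is valid.
require_synth_key() {
  if [ -z "${SYNTH_API_KEY:-}" ]; then
    echo "SYNTH_API_KEY is not set; export it before calling the API" >&2
    return 1
  fi
}

# Usage:
#   require_synth_key && echo "key found"
```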

## 2. List models

```bash theme={null}
curl https://api.usesynth.ai/api/v1/synth/models \
  -H "authorization: Bearer $SYNTH_API_KEY"
```

You should see `nemotron-3-ultra` in the catalog.
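
To check for the model in a script rather than by eye, one option is to search the response for the quoted model id. This is a sketch only: `has_model` is a hypothetical helper, and because the JSON layout of the catalog is not documented here, it matches the id as plain text instead of parsing the response.

```bash
# Hypothetical helper: succeeds if the quoted model id appears anywhere in
# the catalog response read from stdin. Plain-text match, so it assumes
# nothing about the JSON structure of the response.
has_model() {
  grep -qF "\"$1\""
}

# Usage (needs network access and SYNTH_API_KEY):
#   curl -sS https://api.usesynth.ai/api/v1/synth/models \
#     -H "authorization: Bearer $SYNTH_API_KEY" \
#     | has_model nemotron-3-ultra && echo "model available"
```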

## 3. Run inference

The aux endpoint requires an `X-Stack-Actor-Role` header set to `monitor`, `gardener`,
or `aux`. HTTP header names are case-insensitive, so the lowercase form in the example
is equivalent. Use `aux` for general calls:

```bash theme={null}
curl https://api.usesynth.ai/api/v1/stack-aux/openai/v1/responses \
  -H "authorization: Bearer $SYNTH_API_KEY" \
  -H "x-stack-actor-role: aux" \
  -H "content-type: application/json" \
  -d '{"model":"nemotron-3-ultra","input":"Reply with one word: pong."}'
```

<Warning>
  Requests without the `x-stack-actor-role` header return `403`. Primary coding
  roles (`worker`, `primary`, `codex`, `cursor`) are intentionally rejected: this
  endpoint is for auxiliary agents and aux inference only.
</Warning>
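
A client can mirror this rule locally so that a mistyped or disallowed role fails before any request is sent. The sketch below assumes the three accepted roles listed in this step; `is_aux_role` and `aux_responses` are hypothetical helper names, and the server remains the source of truth for which roles are accepted.

```bash
# Hypothetical client-side check mirroring the accepted roles listed above.
is_aux_role() {
  case "$1" in
    monitor|gardener|aux) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: aux_responses ROLE JSON_BODY
# Sends the request only when ROLE passes the local check.
aux_responses() {
  if ! is_aux_role "$1"; then
    echo "role '$1' is not accepted by the aux endpoint" >&2
    return 2
  fi
  curl -sS https://api.usesynth.ai/api/v1/stack-aux/openai/v1/responses \
    -H "authorization: Bearer $SYNTH_API_KEY" \
    -H "x-stack-actor-role: $1" \
    -H "content-type: application/json" \
    -d "$2"
}

# Example (needs network access and SYNTH_API_KEY):
#   aux_responses aux '{"model":"nemotron-3-ultra","input":"Reply with one word: pong."}'
```

Returning a distinct exit code (`2` here, an arbitrary choice) lets calling scripts tell a locally rejected role apart from a failed `curl` call.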

## 4. Usage and cost

Every request is metered per token and recorded against your org. Inference runs on a
free aux tier with a per-org daily cap; usage and remaining quota are available at:

```bash theme={null}
curl https://api.usesynth.ai/api/v1/stack-aux/usage \
  -H "authorization: Bearer $SYNTH_API_KEY" \
  -H "x-stack-actor-role: aux"
```

## In Stack

When you sign in, Stack's monitor and gardener aux agents use this endpoint
automatically, so you do not need to set headers by hand. The signed-out local loop
([Quickstart](/stack/quickstart)) keeps working without any of this.
