NVIDIA: Nemotron 3 Ultra

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model from NVIDIA, with 55B active parameters out of 550B total (MoE). Built on a hybrid Transformer-Mamba mixture-of-experts architecture, it...

Context: 1M tokens
Input: $0.50 per 1M tokens
Output: $2.20 per 1M tokens

Pricing

Input tokens: $0.50 per 1M tokens
Output tokens: $2.20 per 1M tokens
Cache read: $0.10 per 1M tokens
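
As a rough illustration of how these per-million rates turn into a request cost, here is a minimal sketch. The `TokenUsage` shape and the `estimateCostUSD` helper are hypothetical names made up for this example, not part of any MegaBrain API, and it assumes cached input tokens are billed only at the cache-read rate, separately from regular input tokens.

```typescript
// Prices in USD per 1M tokens, copied from the pricing list above.
const PRICE_PER_MILLION = {
  input: 0.5,
  output: 2.2,
  cacheRead: 0.1,
} as const

// Hypothetical shape for illustration only; not a MegaBrain API type.
interface TokenUsage {
  inputTokens: number // uncached input tokens
  outputTokens: number
  cachedInputTokens?: number // assumed to be billed at the cache-read rate
}

// Estimate the cost of one request in USD.
function estimateCostUSD(usage: TokenUsage): number {
  const cached = usage.cachedInputTokens ?? 0
  const total =
    usage.inputTokens * PRICE_PER_MILLION.input +
    usage.outputTokens * PRICE_PER_MILLION.output +
    cached * PRICE_PER_MILLION.cacheRead
  return total / 1_000_000
}

// Example: 200,000 input tokens and 10,000 output tokens, nothing cached.
const exampleCost = estimateCostUSD({ inputTokens: 200_000, outputTokens: 10_000 })
console.log(exampleCost.toFixed(3)) // roughly 0.122 USD
```

Actual billing may round or group tokens differently, so treat the result as an estimate.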

Technical details

Model ID: nvidia/nemotron-3-ultra-550b-a55b
Context window: 1M tokens
Input modalities: text
Output modalities: text
Tokenizer: Other
Max output tokens: 16,384
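
The limits above can be turned into a small guard that picks a safe output-token budget before sending a request. This is only a sketch: `clampMaxTokens` is a hypothetical helper, "1M tokens" is taken to mean exactly 1,000,000, and it assumes that prompt and output tokens share the same context window.

```typescript
// Limits taken from the technical details above.
const CONTEXT_WINDOW = 1_000_000 // assumption: "1M" means exactly 1,000,000 tokens
const MAX_OUTPUT_TOKENS = 16_384

// Hypothetical helper: cap a requested output budget by the model's
// output limit and by the room left in the context window.
function clampMaxTokens(requested: number, promptTokens: number): number {
  const roomInContext = Math.max(0, CONTEXT_WINDOW - promptTokens)
  return Math.max(0, Math.min(Math.floor(requested), MAX_OUTPUT_TOKENS, roomInContext))
}

console.log(clampMaxTokens(50_000, 1_000)) // capped by the output limit
console.log(clampMaxTokens(1_000, 999_500)) // capped by remaining context
```

A result of 0 means the prompt alone already fills the context window under these assumptions, so the prompt would need to be shortened first.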

Use with MegaBrain

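import OpenAI from 'openai'

// Point the OpenAI SDK at the MegaBrain gateway instead of the default endpoint.
const client = new OpenAI({
  baseURL: 'https://getmegabrain.com/api/gateway/v1',
  apiKey: process.env.MEGABRAIN_API_KEY,
})

// Top-level await requires an ES module context.
const response = await client.chat.completions.create({
  model: 'nvidia/nemotron-3-ultra-550b-a55b',
  messages: [{ role: 'user', content: 'Hello!' }],
})

// Print the text of the first completion choice.
console.log(response.choices[0].message.content)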

Ready to use NVIDIA: Nemotron 3 Ultra?

Get an API key and start making requests in minutes.
