NVIDIA: Llama 3.3 Nemotron Super 49B V1.5

Llama-3.3-Nemotron-Super-49B-v1.5 is a 49B-parameter, English-centric reasoning/chat model derived from Meta’s Llama-3.3-70B-Instruct with a 128K context. It’s post-trained for agentic workflows (RAG, tool calling) via SFT across math, code, science, and...

Context: 131k tokens
Input: $0.40 per 1M tokens
Output: $0.40 per 1M tokens

Pricing

Input tokens: $0.40 per 1M tokens
Output tokens: $0.40 per 1M tokens
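
As a rough illustration of how these rates translate into a per-request cost, the sketch below multiplies token counts by the listed prices. The helper name `estimateCostUsd` is hypothetical and not part of any SDK, and the sketch assumes billing is strictly linear in token count with no minimums or rounding.

```typescript
// Listed prices in USD per 1M tokens (from the pricing table above).
const INPUT_USD_PER_MILLION = 0.40
const OUTPUT_USD_PER_MILLION = 0.40

// Hypothetical helper: estimates the cost of one request in USD,
// assuming cost scales linearly with the number of tokens.
function estimateCostUsd(inputTokens: number, outputTokens: number): number {
  return (
    (inputTokens / 1_000_000) * INPUT_USD_PER_MILLION +
    (outputTokens / 1_000_000) * OUTPUT_USD_PER_MILLION
  )
}

// Example: a 100,000-token prompt with a 16,384-token reply.
console.log(estimateCostUsd(100_000, 16_384).toFixed(4)) // prints "0.0466"
```

Actual charges may differ from this estimate; token counts reported in the API response are the more reliable input.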

Technical details

Model ID: nvidia/llama-3.3-nemotron-super-49b-v1.5
Context window: 131k tokens
Input modalities: text
Output modalities: text
Tokenizer: Llama3
Max output tokens: 16,384
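
The two limits above interact when choosing an output budget for a request. The sketch below is one way to pick a safe value, under two assumptions that the page does not state: that "131k" means 131,072 tokens (128 × 1024), and that prompt and completion tokens share the same context window. `clampMaxTokens` is a hypothetical helper, not a gateway or SDK function.

```typescript
// Assumption: "131k tokens" is 128 * 1024 = 131,072.
const CONTEXT_WINDOW = 131_072
// Listed maximum number of output tokens.
const MAX_OUTPUT_TOKENS = 16_384

// Hypothetical helper: returns the largest output budget that respects
// the requested size, the output cap, and the room left after the prompt.
function clampMaxTokens(promptTokens: number, requested: number): number {
  const room = Math.max(0, CONTEXT_WINDOW - promptTokens)
  return Math.min(requested, MAX_OUTPUT_TOKENS, room)
}

console.log(clampMaxTokens(1_000, 50_000)) // capped by the output limit: 16384
console.log(clampMaxTokens(130_000, 4_000)) // capped by remaining context: 1072
```

A result of 0 means the prompt alone already fills the assumed window and should be shortened before sending.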

Use with MegaBrain

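import OpenAI from 'openai'

// Point the OpenAI SDK at the MegaBrain gateway instead of the default endpoint.
const client = new OpenAI({
  baseURL: 'https://getmegabrain.com/api/gateway/v1',
  apiKey: process.env.MEGABRAIN_API_KEY,
})

// Send a single-turn chat request to the model.
const response = await client.chat.completions.create({
  model: 'nvidia/llama-3.3-nemotron-super-49b-v1.5',
  messages: [{ role: 'user', content: 'Hello!' }],
})

// Print the assistant's reply text.
console.log(response.choices[0]?.message.content)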

Ready to use NVIDIA: Llama 3.3 Nemotron Super 49B V1.5?

Get an API key and start making requests in minutes.