NVIDIA: Llama 3.1 Nemotron Ultra 253B v1
nvidia/llama-3.1-nemotron-ultra-253b-v1
Released Apr 8, 2025
Knowledge cutoff Mar 31, 2024
131,072 context
$0.60/M input tokens
$1.80/M output tokens
Prompt tokens measure input size. Reasoning tokens show internal thinking before a response. Completion tokens reflect total output length.
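As a rough illustration of how the listed prices translate into a per-request cost, here is a minimal Python sketch. The function names (`estimate_cost`, `fits_context`) are hypothetical and not part of any provider API. It assumes that reasoning tokens are already included in the completion token count and billed at the output rate, and that the 131,072-token context covers prompt and completion together. Actual billing may differ by provider.

```python
from decimal import Decimal

# Prices and context size as listed for nvidia/llama-3.1-nemotron-ultra-253b-v1.
INPUT_PRICE_PER_M = Decimal("0.60")   # USD per 1M input (prompt) tokens
OUTPUT_PRICE_PER_M = Decimal("1.80")  # USD per 1M output (completion) tokens
CONTEXT_LIMIT = 131_072               # tokens

_MILLION = Decimal(1_000_000)


def estimate_cost(prompt_tokens: int, completion_tokens: int) -> Decimal:
    """Estimate the USD cost of one request.

    Assumption: completion_tokens already includes any reasoning tokens,
    and those are billed at the output rate.
    """
    if prompt_tokens < 0 or completion_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (prompt_tokens * INPUT_PRICE_PER_M
             + completion_tokens * OUTPUT_PRICE_PER_M)
    return total / _MILLION


def fits_context(prompt_tokens: int, completion_tokens: int) -> bool:
    """Check the request against the context size.

    Assumption: prompt and completion share the same 131,072-token window.
    """
    return prompt_tokens + completion_tokens <= CONTEXT_LIMIT


if __name__ == "__main__":
    cost = estimate_cost(prompt_tokens=10_000, completion_tokens=2_000)
    print(f"Estimated cost: ${cost}")
    print("Fits in context:", fits_context(10_000, 2_000))
```

`Decimal` is used instead of floats so that small per-request amounts do not pick up rounding noise when summed over many requests.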