Riva-Translate-4B-Instruct-v1.1

Model Overview

We’re excited to share our latest work on the next version of Riva-Translate-4B-Instruct! The new release outperforms the initial version across multiple benchmarks, including FLORES, NTREX, and WMT24, and performs comparably to EuroLLM-9B-Instruct.

The model supports translation across 12 languages and dialects, including Chinese, Spanish, and Portuguese varieties. Specifically, it covers: English (en), German (de), European Spanish (es-ES), Latin American Spanish (es-US), French (fr), Brazilian Portuguese (pt-BR), Russian (ru), Simplified Chinese (zh-CN), Traditional Chinese (zh-TW), Japanese (ja), Korean (ko), and Arabic (ar).

Built on a decoder-only Transformer architecture, this model is a fine-tuned version of a 4B base model that was pruned and distilled from nvidia/Mistral-NeMo-Minitron-8B-Base using NVIDIA’s LLM compression techniques. Training followed a multi-stage pipeline consisting of Continued Pre-Training (CPT), Supervised Fine-Tuning (SFT), and Reward-aware Preference Optimization (RPO). The model uses tiktoken as its tokenizer and supports a context length of 8K tokens.

Model Developer: NVIDIA

Model Dates: Riva-Translate-4B-Instruct-v1.1 was trained between June 2025 and August 2025.

License

GOVERNING TERMS: The NIM container is governed by the NVIDIA Software License Agreement and Product-Specific Terms for AI Products. Use of this model is governed by the NVIDIA Community Model License. ADDITIONAL INFORMATION: Apache 2.0.

Prompt Format:

Optimal performance is achieved when using the prompt shown below.

<s>System
{system prompt}</s>
<s>User
{user prompt}</s>
<s>Assistant\n
  • Note that a newline character (\n) should be added after <s>Assistant as a generation prompt.
  • Note that users are required to use the correct language name in the prompt: 'ar': 'Arabic', 'en': 'English', 'de': 'German', 'es-es': 'European Spanish', 'es-us': 'Latin American Spanish', 'fr': 'French', 'ja': 'Japanese', 'ko': 'Korean', 'ru': 'Russian', 'zh-cn': 'Simplified Chinese', 'zh-tw': 'Traditional Chinese', 'pt-br': 'Brazilian Portuguese'

For example, to translate an English sentence into Simplified Chinese:

<s>System
You are an expert at translating text from English to Simplified Chinese.</s>
<s>User
What is the Simplified Chinese translation of the sentence: The GRACE mission is a collaboration between NASA and the German Aerospace Center.?</s>
<s>Assistant
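
If you build the prompt string yourself rather than relying on the chat template shown in the Usage section below, the format above can be assembled directly. The snippet below is a minimal sketch that follows the template and the system/user phrasing from the example; the LANG_NAMES mapping and build_prompt helper are illustrative names, not part of the model or its API.

LANG_NAMES = {
    "ar": "Arabic", "en": "English", "de": "German",
    "es-es": "European Spanish", "es-us": "Latin American Spanish",
    "fr": "French", "ja": "Japanese", "ko": "Korean", "ru": "Russian",
    "zh-cn": "Simplified Chinese", "zh-tw": "Traditional Chinese",
    "pt-br": "Brazilian Portuguese",
}

def build_prompt(text: str, src: str = "en", tgt: str = "zh-cn") -> str:
    # Fill in the template shown above; the trailing newline after
    # <s>Assistant serves as the generation prompt.
    src_name, tgt_name = LANG_NAMES[src], LANG_NAMES[tgt]
    system = f"You are an expert at translating text from {src_name} to {tgt_name}."
    user = f"What is the {tgt_name} translation of the sentence: {text}?"
    return f"<s>System\n{system}</s>\n<s>User\n{user}</s>\n<s>Assistant\n"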

Usage

from transformers import AutoTokenizer, AutoModelForCausalLM


# Load the tokenizer and the model, moving the model to the GPU
tokenizer = AutoTokenizer.from_pretrained("nvidia/Riva-Translate-4B-Instruct-v1.1")
model = AutoModelForCausalLM.from_pretrained("nvidia/Riva-Translate-4B-Instruct-v1.1").cuda()


# Use the prompt template
messages = [
    {
        "role": "system",
        "content": "You are an expert at translating text from English to Simplified Chinese.",
    },
    {"role": "user", "content": "What is the Simplified Chinese translation of the sentence: The GRACE mission is a collaboration between the NASA and German Aerospace Center.?"},
 ]
# Apply the chat template and move the inputs to the model's device
tokenized_chat = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to(model.device)
# Generate the translation and print the full decoded sequence
outputs = model.generate(tokenized_chat, max_new_tokens=128, pad_token_id=tokenizer.eos_token_id)
print(tokenizer.decode(outputs[0]))
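
The decoded output above contains the prompt followed by the completion. As a minimal follow-on sketch (reusing the variable names from the snippet above), you can decode only the newly generated tokens to print just the translation:

# Decode only the tokens produced after the prompt; skip_special_tokens
# drops markers such as </s> from the printed text.
generated_tokens = outputs[0][tokenized_chat.shape[-1]:]
print(tokenizer.decode(generated_tokens, skip_special_tokens=True))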

Ethical Considerations:

NVIDIA believes Trustworthy AI is a shared responsibility and we have established policies and practices to enable development for a wide array of AI applications. When downloaded or used in accordance with our terms of service, developers should work with their internal model team to ensure this model meets requirements for the relevant industry and use case and addresses unforeseen product misuse.

Technical Limitations & Mitigation:

Accuracy varies with the characteristics of the input (domain, use case, noise, context, etc.). Translations may contain grammatical errors or semantic issues. As a potential mitigation, users can adjust the prompt to obtain a better translation.

Use Case Restrictions:

Abide by the NVIDIA Community Model License.
