Mistral AI and NVIDIA announce 12B NeMo model

Released under the Apache 2.0 license by Mistral AI in collaboration with NVIDIA, NeMo is a 12-billion-parameter model. Mistral NeMo offers a large context window of up to 128k tokens. Read more about Mistral AI's latest model.
Mistral NeMo, picture by Mistral

Mistral AI announces NeMo, a 12B model created in collaboration with NVIDIA. This cutting-edge 12B language model has an impressive context window of up to 128,000 tokens and excels in reasoning, world knowledge, and coding accuracy for its size category.

The collaboration between Mistral AI and NVIDIA has resulted in a model that not only pushes the boundaries of performance but also prioritises ease of use. Mistral NeMo is designed to be a seamless replacement for systems currently using Mistral 7B, thanks to its reliance on standard architecture.

One of Mistral NeMo's key features is its quantisation awareness, which enables FP8 inference without compromising performance. This capability could be crucial for organisations looking to deploy large language models efficiently.
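
As a rough, unofficial illustration of what an FP8 deployment could look like, here is a minimal sketch using vLLM. It assumes the instruct weights are published on Hugging Face as mistralai/Mistral-Nemo-Instruct-2407 and that your vLLM build and GPU support the fp8 quantisation path; adapt names and settings to your setup.

```python
# Minimal sketch: serving Mistral NeMo with FP8 weights via vLLM.
# The repo id and fp8 support are assumptions, not an official Mistral example.
from vllm import LLM, SamplingParams

llm = LLM(
    model="mistralai/Mistral-Nemo-Instruct-2407",
    quantization="fp8",     # on-the-fly FP8 weight quantisation
    max_model_len=32768,    # trim the 128k window to fit smaller GPUs
)

params = SamplingParams(temperature=0.3, max_tokens=128)
outputs = llm.generate(["Explain FP8 inference in one paragraph."], params)
print(outputs[0].outputs[0].text)
```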

Mistral AI has provided performance comparisons between the Mistral NeMo base model and two other recent open-source pre-trained models: Gemma 2 9B and Llama 3 8B.

Here are the performance statistics.

Mistral NeMo performance comparison

In its announcement, Mistral AI explained: “The model is designed for global, multilingual applications. It is trained on function calling, has a large context window, and is particularly strong in English, French, German, Spanish, Italian, Portuguese, Chinese, Japanese, Korean, Arabic, and Hindi.”

“This is a new step toward bringing frontier AI models to everyone’s hands in all languages that form human culture.”

Mistral NeMo also introduces Tekken, a new tokeniser based on Tiktoken. Trained on over 100 languages, Tekken offers improved compression efficiency for both natural language text and source code compared to the SentencePiece tokeniser used in previous Mistral models. The company says that Tekken is at least 30% more efficient at compressing source code and text in several major languages, with even more significant gains for Korean and Arabic.
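
One informal way to see this kind of difference is to count how many tokens each tokeniser needs for the same text. The sketch below does this with Hugging Face's AutoTokenizer; the repo ids (mistralai/Mistral-Nemo-Instruct-2407 for the Tekken-based tokeniser, mistralai/Mistral-7B-Instruct-v0.3 for the SentencePiece one) are assumptions, and both repos may require accepting Mistral's terms and logging in on Hugging Face.

```python
# Rough sketch: comparing token counts of the two tokenisers on the same text.
# Repo ids are assumptions; results will vary with the text you try.
from transformers import AutoTokenizer

tekken = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
sentencepiece = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3")

samples = {
    "english": "Large context windows make long-document question answering practical.",
    "code": "def fib(n):\n    return n if n < 2 else fib(n - 1) + fib(n - 2)",
    "korean": "미스트랄 네모는 한국어 압축 효율이 크게 개선되었습니다.",
}

for name, text in samples.items():
    a = len(tekken.encode(text, add_special_tokens=False))
    b = len(sentencepiece.encode(text, add_special_tokens=False))
    print(f"{name:8s}  Tekken: {a:3d} tokens   SentencePiece: {b:3d} tokens")
```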

Developers can start experimenting with Mistral NeMo using the mistral-inference tool and adapt it with mistral-finetune; the weights for both the base and instruct versions are now available on Hugging Face. For those using Mistral's platform, the model is accessible under the name open-mistral-nemo.
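
For platform users, a minimal sketch of a chat call might look like the following. It assumes the mistralai Python SDK (v1 interface) and an API key exported as MISTRAL_API_KEY; only the model name open-mistral-nemo comes from Mistral's announcement.

```python
# Minimal sketch: querying Mistral NeMo on la Plateforme as "open-mistral-nemo".
# Assumes the mistralai Python SDK (v1) and MISTRAL_API_KEY set in the environment.
import os
from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="open-mistral-nemo",
    messages=[{"role": "user", "content": "Summarise Mistral NeMo in two sentences."}],
)
print(response.choices[0].message.content)
```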

The release of Mistral NeMo represents a significant step forward in the fast-evolving AI landscape. By combining high performance, multilingual capabilities, and open-source availability, Mistral AI and NVIDIA are positioning this model as a versatile tool for a wide range of AI applications across industries and research fields.

TAGS

Mistral

Mistral AI

NeMo

AI