TMTPost · Yesterday
Nvidia Accelerates Mistral AI Models, Helps Close Gap with OpenAI

TMTPOST -- Nvidia Corp. and French artificial intelligence (AI) startup Mistral AI have achieved significant performance breakthroughs through their latest collaboration, delivering up to 10 times faster inference speeds for Mistral's new model family on Nvidia's GB200 NVL72 systems compared to the previous-generation H200 chips.

AI Generated Image

Mistral AI on Tuesday released its Mistral 3 family of open-weight models, optimized for Nvidia platforms from data centers to edge devices. The release includes Mistral Large 3, a 675 billion total parameter mixture-of-experts model with multilingual and multimodal capabilities, alongside nine smaller Ministral 3 variants designed for deployment on robots, drones and offline devices.

The partnership positions the two-year-old French company to better compete with leading AI labs including OpenAI and Google, particularly in enterprise deployments where customization and cost efficiency matter. Mistral has raised $2.7 billion at a $13.7 billion valuation, with Nvidia among its investors.

The collaboration delivers practical advantages for enterprise users. On the GB200 NVL72, Mistral Large 3 achieved over 5 million tokens per second per megawatt at 40 tokens per second per user, translating to lower per-token costs and improved energy efficiency for production AI systems.
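To put the reported 5 million tokens per second per megawatt in perspective, a back-of-envelope conversion yields the electricity consumed per token. The $0.10/kWh rate below is an illustrative assumption, not a figure from the article:

```python
# Back-of-envelope electricity use from the reported 5M tokens/s per MW.
tokens_per_sec_per_mw = 5_000_000
watts = 1_000_000  # 1 MW

joules_per_token = watts / tokens_per_sec_per_mw          # energy per token
kwh_per_million_tokens = joules_per_token * 1_000_000 / 3_600_000  # J -> kWh

price_per_kwh = 0.10  # USD, assumed for illustration
cost_per_million_tokens = kwh_per_million_tokens * price_per_kwh

print(f"{joules_per_token:.2f} J/token")
print(f"{kwh_per_million_tokens:.4f} kWh per million tokens")
print(f"${cost_per_million_tokens:.4f} electricity per million tokens")
```

At that rate, raw electricity is a fraction of a cent per million tokens; hardware amortization and cooling, not power, dominate real per-token cost.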

GB200 Systems Drive Performance Gains

Mistral Large 3's architecture leverages Nvidia's hardware optimizations to unlock substantial efficiency improvements. The model's mixture-of-experts design activates only the most relevant parts for each task rather than engaging all 675 billion parameters, reducing computational waste while maintaining accuracy.
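The "activate only the relevant parts" idea can be sketched as top-k router gating, the standard mixture-of-experts mechanism. This is a minimal illustration of the technique, not Mistral's actual code; the sizes and router are toy values:

```python
import numpy as np

# Toy mixture-of-experts forward pass: a router scores every expert for a
# token, but only the top-k experts actually run.
rng = np.random.default_rng(0)

n_experts, top_k, d = 8, 2, 16
router_w = rng.normal(size=(d, n_experts))
experts = [rng.normal(size=(d, d)) for _ in range(n_experts)]

def moe_forward(x):
    """Route a single token vector x through its top-k experts only."""
    logits = x @ router_w
    top = np.argsort(logits)[-top_k:]        # indices of the best-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                 # softmax over the chosen experts only
    # Only top_k of the n_experts matrices are multiplied -- the compute saving.
    return sum(w * (x @ experts[i]) for i, w in zip(top, weights))

y = moe_forward(rng.normal(size=d))
print(y.shape)  # (16,)
```

With 2 of 8 experts active per token, the per-token expert compute is a quarter of a dense layer of the same total parameter count, which is how a 675-billion-parameter model stays affordable to serve.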

The performance leap stems from several technical advances. Nvidia's TensorRT-LLM Wide Expert Parallelism exploits the GB200 NVL72's coherent memory domain through NVLink fabric, enabling optimized expert distribution and load balancing. The system also employs NVFP4 low-precision inference and Dynamo disaggregated inference optimizations to deliver peak performance for large-scale training and deployment.
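NVFP4 belongs to the family of 4-bit floating-point formats, whose core idea is a tiny value grid plus a per-block scale. The sketch below illustrates that general technique using the FP4 E2M1 value set; it is an assumption-laden illustration, not Nvidia's implementation, which uses hardware-managed scale factors:

```python
import numpy as np

# Illustrative 4-bit floating-point (FP4 E2M1) block quantization.
# The representable magnitudes in E2M1 are {0, 0.5, 1, 1.5, 2, 3, 4, 6}.
FP4_GRID = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0])
FP4_GRID = np.concatenate([-FP4_GRID[::-1], FP4_GRID])  # add signed values

def quantize_block(x):
    """Scale a block so its max magnitude maps to 6.0, then snap to the grid."""
    scale = np.abs(x).max() / 6.0 or 1.0
    idx = np.abs(x[:, None] / scale - FP4_GRID[None, :]).argmin(axis=1)
    return FP4_GRID[idx] * scale

x = np.random.default_rng(1).normal(size=32)
xq = quantize_block(x)
print("max abs error:", np.abs(x - xq).max())
```

Each weight shrinks from 16 bits to 4 plus a shared scale, cutting memory traffic roughly 4x, which is where much of the inference speedup on bandwidth-bound workloads comes from.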

These optimizations work across Nvidia's inference frameworks including TensorRT-LLM, SGLang and vLLM. The models are available through leading open-source platforms and cloud service providers, with deployment expected soon as Nvidia NIM microservices.

Ministral 3 Targets Edge Deployment

The compact Ministral 3 suite brings AI capabilities to devices operating without network connectivity. Available in 3 billion, 8 billion and 14 billion parameter configurations, each size offers Base, Instruct and Reasoning variants to match specific use cases.

Performance on edge platforms demonstrates practical viability. The Ministral-3B variants achieve up to 385 tokens per second on Nvidia's RTX 5090 GPU. On Nvidia Jetson Thor, the models deliver 52 tokens per second for single concurrency, scaling to 273 tokens per second with eight concurrent requests.
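The Jetson Thor figures above imply the usual batching trade-off, which a quick calculation makes explicit:

```python
# Batching arithmetic from the reported Jetson Thor numbers.
single = 52      # tokens/s with one request
batched = 273    # aggregate tokens/s with eight concurrent requests
n = 8

speedup = batched / single    # aggregate gain from batching
per_request = batched / n     # throughput each request still sees

print(f"{speedup:.2f}x aggregate, {per_request:.1f} tok/s per request")
```

Aggregate throughput rises about 5.25x while each individual request still streams at roughly 34 tokens per second, i.e. batching trades a modest per-user slowdown for much higher total device utilization.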

Guillaume Lample, Mistral co-founder and chief scientist, emphasized the efficiency advantage: "The huge majority of enterprise use cases are things that can be tackled by small models, especially if you fine-tune them." All Ministral 3 variants support vision, handle context windows of 128,000 to 256,000 tokens, and run on single GPUs, reducing deployment costs and latency.

Commercial Push Intensifies Competition

The release comes as Mistral accelerates commercial activity following a 1.7 billion euro funding round in September that valued the company at 11.7 billion euros. Dutch chip equipment maker ASML contributed 1.3 billion euros, with Nvidia also participating.

Mistral has secured contracts worth hundreds of millions of dollars with corporate clients and announced a deal Monday with HSBC for financial analysis and translation tasks. The company is also expanding through acquisitions to compete with U.S. rivals establishing European operations, including Anthropic and OpenAI, which both opened European offices this year.

The startup's open-weight approach contrasts with closed-source competitors. While OpenAI and Anthropic maintain proprietary models accessible only through APIs, Mistral releases model weights publicly for download and customization. Lample argues this delivers superior results for specific enterprise deployments: "In many cases, you can actually match or even out-perform closed-source models" through fine-tuning.
