Reasoning, Vision, and Code in One Body: Mistral Small 4's 119B Unified Experiment

Editor J

Mistral AI has released Mistral Small 4 for free: a single model that handles Q&A, image analysis, and code writing. Despite its massive size, it runs fast by activating only the parts it needs for each request. A coding model released alongside it can replace existing paid services at one-seventh the cost.

Mistral AI, which has championed an efficiency-first strategy in the AI model market, has now changed the game entirely. Released on March 16, 2026, Mistral Small 4 merges three previously separate models into one: Magistral for logical reasoning, Pixtral for image analysis, and Devstral for code writing. It holds 119 billion learned knowledge units (parameters) in total, but doesn't use them all at once. Out of 128 specialists, only the 4 best suited to each question are called upon. As a result, only about 6 billion units are active at any time, making the model remarkably light and fast for its enormous size.

One model that reasons, sees, and codes. There's no longer a reason to run three separate ones.

Previously, users had to pick different models for different tasks: Magistral for hard logic problems, Pixtral for image analysis, Devstral for code generation. Mistral Small 4 combines all three into one. A reasoning effort setting lets users control how deeply it thinks — light answers for simple questions, deep analysis for complex math. The design philosophy: handle everything with a single model.
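To make the "reasoning effort" idea concrete, here is a minimal sketch of what such a per-request knob might look like, assuming an OpenAI-style chat API. The field name `reasoning_effort` and the model id `mistral-small-4` are illustrative assumptions, not Mistral's documented interface.

```python
# Sketch of a per-request reasoning-effort setting, assuming an OpenAI-style
# chat endpoint. Field names and model id are illustrative, not Mistral's API.

def build_request(question: str, effort: str = "low") -> dict:
    """Build a chat-completion payload with a reasoning-effort hint."""
    assert effort in {"low", "medium", "high"}
    return {
        "model": "mistral-small-4",       # hypothetical model id
        "reasoning_effort": effort,       # light answer vs. deep analysis
        "messages": [{"role": "user", "content": question}],
    }

# A simple lookup gets a light setting; a hard math problem gets "high".
quick = build_request("What year was the Eiffel Tower built?", "low")
deep = build_request("Prove that sqrt(2) is irrational.", "high")
```

The point is that depth of thinking becomes a dial the caller turns per question, rather than a reason to switch to a different model.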

Waking Only 4 of 128 Experts

[Image: Mistral Small 4's 119-billion-parameter mixture-of-experts (MoE) architecture]

The core of Mistral Small 4 is its 'pick only the experts you need' structure. Its 119 billion knowledge units are divided into 128 specialist groups, and when a question arrives, only the 4 best-suited specialists are summoned. Only about 6 billion units are active at any given time. This uses far less computing power than conventional models that tap into all their knowledge for every query.
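The routing described above can be sketched in a few lines. The 128-expert and top-4 numbers come from the article; the router itself is a toy stand-in (random scores), not Mistral's implementation.

```python
# Toy sketch of mixture-of-experts routing: a router scores all 128 experts
# for an incoming token and only the top 4 actually run. Scores are random
# stand-ins; a real router is a small learned network.
import random

NUM_EXPERTS = 128
TOP_K = 4

random.seed(0)
# Router assigns each expert a relevance score for this token.
scores = [random.random() for _ in range(NUM_EXPERTS)]
# Keep only the 4 best-scoring experts; the other 124 stay idle.
active = sorted(range(NUM_EXPERTS), key=lambda i: scores[i], reverse=True)[:TOP_K]

print(f"active experts: {sorted(active)}")
print(f"~{TOP_K / NUM_EXPERTS:.1%} of experts run per token")
```

Because only 4 of 128 experts compute per token, the per-query cost tracks the ~6 billion active units rather than the full 119 billion.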

The amount of text it can process at once is also remarkable — about 250,000 tokens, roughly the length of a full novel, in a single session. According to Mistral AI, response speed improved 40% over its predecessor Mistral Small 3, and the volume it can handle tripled. Faster and more productive on the same hardware.
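As a back-of-envelope check on the "full novel" claim, here is a rough conversion from tokens to English words. The 0.75 words-per-token ratio is a common rule of thumb, not a Mistral figure.

```python
# Rough conversion of the ~250K-token context window into English words,
# using the common ~0.75 words-per-token rule of thumb (an assumption).
context_tokens = 250_000
words_per_token = 0.75
approx_words = int(context_tokens * words_per_token)
print(f"~{approx_words:,} words")
```

That works out to roughly 187,000 words, on the order of a long novel, consistent with the "full novel in a single session" framing.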

What the Benchmarks Say — and What They Don't

The performance test results Mistral AI released are impressive. On math reasoning tests, it outscored similarly sized competing models while using 20% fewer words in its answers. Coding ability tests showed similar results. In other words, it delivers the same answers with fewer resources. Building a model that's fast, accurate, and cheap to run is Mistral AI's stated goal.

Mistral Small 4 Key Specifications
- Total knowledge scale: 119 billion units
- Number of specialists: 128
- Active at a time: 4 specialists (~6 billion units)
- Text capacity per session: ~250K tokens (about one novel)
- License: Apache 2.0 (free, commercial use OK)
- Response speed: 40% faster than predecessor
- Processing volume: 3x more than predecessor

That said, a company's own report card should always be taken with a grain of salt. Until independent experts verify these claims, the design direction matters more than the numbers. Tech blogger Simon Willison shared his hands-on review, giving high marks to the convenience of handling all tasks with one model instead of switching between several.

Available to Anyone Under Apache 2.0

[Image: SWE-Bench comparison of open-weight vs. proprietary models; Devstral 2 achieves top performance among open-weight models]

Mistral Small 4 is released under the Apache 2.0 license — meaning anyone can use it for free with no strings attached. Companies and individuals alike can download, modify, and embed it in their own services. It's available on Hugging Face, the popular AI model sharing platform, and can run on personal computers or private servers. A compressed lightweight version was also released in partnership with NVIDIA, running even faster while using less memory.

Mistral AI has adhered to Apache 2.0 licensing from the start. Meta's Llama series comes with some usage restrictions, and Google Gemma leans toward research purposes. Mistral places no restrictions on any use, including commercial. Unlike paid API services such as Claude and GPT, companies can host it on their own servers — that's the differentiator.

Devstral 2 Competes on Price, Not Performance

[Image: Devstral 2 and Mistral Vibe CLI, an AI coding assistant available directly in the terminal]

Announced alongside Mistral Small 4, Devstral 2 is a model specialized in writing code. With 123 billion knowledge units of its own, it scored a 72.2% success rate on SWE-Bench, a benchmark of fixing real-world software bugs. Top-tier paid models score above 80%, so the performance itself isn't dominant. But the pitch is price: Mistral AI emphasizes that it delivers comparable results to Anthropic's Claude Sonnet at one-seventh the cost.

Devstral Small 2 also arrived — a lightweight 24-billion-unit version that runs on a personal computer's graphics card. It targets individual developers and small teams who want an AI coding assistant without paying for expensive cloud servers.
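A quick estimate shows why a 24-billion-unit model is plausible on a consumer graphics card. The quantization levels and ~20% runtime overhead below are generic assumptions, not Mistral's packaging.

```python
# Rough GPU-memory estimate for a 24B-parameter model at different weight
# precisions. The 20% overhead for activations/KV cache is a loose assumption.
params = 24e9

def vram_gb(bits_per_param: float, overhead: float = 1.2) -> float:
    """Approximate memory footprint in GB: weights plus runtime overhead."""
    return params * bits_per_param / 8 / 1e9 * overhead

for bits, label in [(16, "fp16"), (8, "int8"), (4, "4-bit")]:
    print(f"{label}: ~{vram_gb(bits):.0f} GB")
```

Under these assumptions, full 16-bit weights need a data-center card, but a 4-bit quantized copy lands around 14 GB, within reach of a high-end consumer GPU.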

The performance doesn't top the leaderboard, but at one-seventh the price, the equation changes. Devstral 2's bet is on value for money.

Also released was Vibe CLI, an AI coding assistant that works directly in the terminal (command prompt). It generates and modifies code with a single command, no separate development tool required. It competes directly with Anthropic's Claude Code and OpenAI's Codex CLI, with the key difference being its free, open-source foundation.

NVIDIA's Nemotron Coalition and the Open Frontier Declaration

Mistral AI is also a founding member of NVIDIA's Nemotron Coalition. Eight AI research labs have joined forces to build top-tier free models together, sharing NVIDIA's supercomputers for training. Their goal: bring free models up to the performance level that only paid models have reached until now.

The coalition's significance goes beyond a simple partnership. In a market dominated by paid models from OpenAI, Anthropic, and Google, the world's largest AI chip maker NVIDIA is rallying the free model camp. Mistral Small 4 reads as one of the first fruits of this alliance.

How Mistral Small 4 performs in practice remains to be seen. Test scores and real-world use are different, and true capability only reveals itself as the developer community runs its own tests. What is clear: a free model that combines reasoning, vision, and code writing in one package has arrived.
