Devstral Medium | Devstral Medium builds on the strengths of Devstral Small and raises performance further, scoring 61.6% on SWE-Bench Verified. It is available through the Mistral public API and offers strong performance at a competitive price point, making it a good fit for businesses and developers who want a high-quality, cost-effective model. | Jul 10, 2025 | | 61.6% | - | - | - | - | |
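Since Devstral Medium is served through the Mistral public API, a request can be sketched with the official `mistralai` Python SDK. This is a minimal sketch: the model identifier `devstral-medium-2507` is an assumption and should be checked against Mistral's current model list.

```python
# Minimal sketch of calling Devstral Medium via the Mistral public API.
# Assumes the `mistralai` v1 Python SDK and a MISTRAL_API_KEY env var;
# the model name "devstral-medium-2507" is an assumption.
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="devstral-medium-2507",  # assumed identifier
    messages=[
        {"role": "user", "content": "Refactor this function to remove the global state: ..."},
    ],
)
print(response.choices[0].message.content)
```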
Devstral Small 1.1 | Devstral Small 1.1 (also called devstral-small-2507) is based on the Mistral-Small-3.1 foundation model and contains approximately 24 billion parameters. It supports a 128k-token context window, which allows it to handle multi-file code inputs and the long prompts typical of software engineering workflows. The model is fine-tuned specifically for structured outputs, including XML and function-calling formats. This makes it compatible with agent frameworks such as OpenHands and suitable for tasks like program navigation, multi-step edits, and code search. It is licensed under Apache 2.0 and available for both research and commercial use. | Jul 11, 2025 | | 53.6% | - | - | - | - | |
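The function-calling support mentioned above can be exercised through the API's tools parameter. The sketch below assumes the `mistralai` Python SDK and the `devstral-small-2507` identifier; the `search_repo` tool is a hypothetical stand-in for the kind of code-search action an agent framework like OpenHands would expose.

```python
# Sketch: function calling with Devstral Small 1.1 through the Mistral API.
# The search_repo tool schema below is a hypothetical example.
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

tools = [{
    "type": "function",
    "function": {
        "name": "search_repo",  # hypothetical tool for a code-search agent
        "description": "Search the repository for a symbol or string.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}]

response = client.chat.complete(
    model="devstral-small-2507",  # assumed identifier
    messages=[{"role": "user", "content": "Where is parse_config defined?"}],
    tools=tools,
    tool_choice="auto",
)
# The model replies either with text or with a structured tool call.
print(response.choices[0].message.tool_calls)
```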
Magistral Small 2506 | Built on Mistral Small 3.1 (2503) with added reasoning capabilities, Magistral Small is trained with SFT on traces from Magistral Medium followed by RL on top; the result is a small, efficient 24B-parameter reasoning model. Magistral Small can be deployed locally, fitting on a single RTX 4090 or a 32GB-RAM MacBook once quantized. | Jun 10, 2025 | | - | - | - | 51.3% | - | |
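For the local deployment described above, a quantized GGUF build can be loaded with `llama-cpp-python`. This is a minimal sketch; the file name and context size are assumptions tied to whichever quantization you download.

```python
# Sketch: running a quantized Magistral Small GGUF locally with llama-cpp-python.
# The file name below is an assumption; use whichever quantization you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./magistral-small-2506-Q4_K_M.gguf",  # assumed local file
    n_ctx=32768,      # large context; lower this if memory is tight
    n_gpu_layers=-1,  # offload all layers to the GPU (e.g., an RTX 4090)
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
    max_tokens=1024,
)
print(out["choices"][0]["message"]["content"])
```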
Magistral Medium | Trained solely with reinforcement learning on top of Mistral Medium 3, Magistral Medium is a reasoning model that achieves strong performance on complex math and code tasks without relying on distillation from existing reasoning models. The training uses an RLVR (reinforcement learning with verifiable rewards) framework with modifications to GRPO, enabling improved reasoning ability and multilingual consistency. | Jun 10, 2025 | | - | 47.1% | - | 50.3% | - | |
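GRPO, which the training builds on, scores each sampled completion relative to the other completions drawn for the same prompt, so no learned value model is needed. A minimal sketch of that group-relative advantage follows; it shows only the normalization step and omits the clipped policy update and Mistral's specific modifications, which are not detailed here.

```python
# Minimal sketch of the group-relative advantage at the heart of GRPO:
# sample a group of completions per prompt, score each with a verifiable
# reward (e.g., does the code pass tests?), then normalize within the group.
import numpy as np


def group_relative_advantages(rewards: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """rewards: shape (group_size,), one scalar reward per sampled completion."""
    return (rewards - rewards.mean()) / (rewards.std() + eps)


rewards = np.array([1.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 0.0])  # pass/fail rewards
print(group_relative_advantages(rewards))  # positive for the passing samples
```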
Mistral Small 3.1 24B Base | Pretrained base-model version of Mistral Small 3.1. Features improved text performance, multimodal understanding, multilingual capabilities, and an expanded 128k-token context window compared to Mistral Small 3. Designed for fine-tuning. | Mar 17, 2025 | | - | - | - | - | - | |
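Since the base model is intended for fine-tuning, a parameter-efficient setup is one natural starting point. The sketch below uses `transformers` with `peft` LoRA adapters; the Hugging Face repo id and the choice of target modules are assumptions, not a prescribed recipe.

```python
# Sketch: preparing the base model for LoRA fine-tuning with transformers + peft.
# The Hugging Face repo id is an assumption -- check the actual model card.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-Small-3.1-24B-Base-2503"  # assumed repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections, a common choice
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only a small fraction of the 24B weights
```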
Mistral Small 3.1 24B Instruct | Building upon Mistral Small 3 (2501), Mistral Small 3.1 (2503) adds state-of-the-art vision understanding and extends long-context capabilities up to 128k tokens without compromising text performance. With 24 billion parameters, this model achieves top-tier capabilities in both text and vision tasks. | Mar 17, 2025 | | - | - | 88.4% | - | 74.7% | |
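The vision capability is exposed through mixed text/image content parts in the chat API. A minimal sketch, assuming the `mistralai` SDK and a `mistral-small-2503` model identifier:

```python
# Sketch: sending an image alongside text to Mistral Small 3.1 via the API.
# The model id is an assumption; the message uses mixed text/image_url parts.
import os

from mistralai import Mistral

client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

response = client.chat.complete(
    model="mistral-small-2503",  # assumed identifier
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "What architecture does this diagram show?"},
            {"type": "image_url", "image_url": "https://example.com/diagram.png"},
        ],
    }],
)
print(response.choices[0].message.content)
```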
Mistral Small 3 24B Base | Mistral Small 3 is competitive with larger models such as Llama 3.3 70B and Qwen 32B, and is an excellent open replacement for opaque proprietary models like GPT-4o-mini. It is on par with Llama 3.3 70B Instruct while being more than 3x faster on the same hardware. | Jan 30, 2025 | | - | - | - | - | 69.6% | |
Mistral Small 3 24B Instruct | Mistral Small 3 is a 24B-parameter LLM licensed under Apache 2.0. It focuses on low-latency, high-efficiency instruction following while maintaining performance comparable to larger models. It provides quick, accurate responses for conversational agents, function calling, and domain-specific fine-tuning. Suitable for local inference when quantized, it rivals models 2–3× its size while using significantly fewer compute resources. | Jan 30, 2025 | | - | - | 84.8% | - | - | |
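For the local-inference use case, serving the open weights can be sketched with vLLM. The Hugging Face repo id below is an assumption, and 24B weights in 16-bit precision need a large GPU; a quantized variant suits smaller cards.

```python
# Sketch: local inference with Mistral Small 3 using vLLM.
# The repo id is an assumption; check the model card for the exact name.
from vllm import LLM, SamplingParams

llm = LLM(model="mistralai/Mistral-Small-24B-Instruct-2501")  # assumed repo id
params = SamplingParams(temperature=0.15, max_tokens=512)

outputs = llm.chat(
    [{"role": "user", "content": "Summarize what a B-tree is in two sentences."}],
    params,
)
print(outputs[0].outputs[0].text)
```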
Pixtral Large | A 124B-parameter multimodal model built on top of Mistral Large 2, featuring frontier-level image-understanding capabilities. Excels at understanding documents, charts, and natural images while maintaining strong text-only performance. Comprises a 123B multimodal decoder and a 1B-parameter vision encoder, with a 128K context window supporting up to 30 high-resolution images. | Nov 18, 2024 | Mistral Research License (MRL) for research; Mistral Commercial License for commercial use | - | - | - | - | - | |
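Multi-image requests follow the same content-part format, with local files inlined as base64 data URIs. A sketch, assuming the `mistralai` SDK and a `pixtral-large-2411` identifier; the chart file names are hypothetical:

```python
# Sketch: passing several local images to Pixtral Large in one request.
# Images are inlined as base64 data URIs; the model id is an assumption.
# Pixtral Large accepts up to 30 images per 128K-token context window.
import base64
import os

from mistralai import Mistral


def to_data_uri(path: str) -> str:
    with open(path, "rb") as f:
        return "data:image/png;base64," + base64.b64encode(f.read()).decode()


client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

content = [{"type": "text", "text": "Compare the trends in these charts."}]
for path in ["q1.png", "q2.png", "q3.png"]:  # hypothetical chart files
    content.append({"type": "image_url", "image_url": to_data_uri(path)})

response = client.chat.complete(
    model="pixtral-large-2411",  # assumed identifier
    messages=[{"role": "user", "content": content}],
)
print(response.choices[0].message.content)
```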