| Model | Description | Released | License | Benchmark 1 | Benchmark 2 | Benchmark 3 | Benchmark 4 | Benchmark 5 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DeepSeek-R1-0528 | An upgraded version of DeepSeek-R1 with significantly improved reasoning. Post-training uses increased compute and algorithmic optimizations, yielding strong performance on mathematics, programming, and general logic tasks. | May 28, 2025 | - | 57.6% | 71.6% | - | 73.3% | - |
| DeepSeek-R1 | First-generation reasoning model built atop DeepSeek-V3 (671B total parameters, 37B activated per token). Uses large-scale reinforcement learning (RL) to strengthen chain-of-thought reasoning, delivering strong performance on math, code, and multi-step reasoning tasks. | Jan 20, 2025 | - | 49.2% | 53.3% | - | 65.9% | - |
| DeepSeek-V3 | Mixture-of-Experts (MoE) language model with 671B total parameters (37B activated per token). Features Multi-head Latent Attention (MLA), auxiliary-loss-free load balancing, and multi-token prediction training; pre-trained on 14.8T tokens, with strong results on reasoning, math, and code tasks. | Dec 25, 2024 | MIT + Model License (commercial use allowed) | 42.0% | 49.6% | - | 37.6% | - |
| DeepSeek-V2.5 | Merges DeepSeek-V2-Chat and DeepSeek-Coder-V2-Instruct, integrating general and coding abilities. Better aligned with human preferences and improved in writing and instruction following. | May 8, 2024 | - | 16.8% | - | 89.0% | - | - |
| DeepSeek-V3-0324 | March 2025 checkpoint update of DeepSeek-V3, keeping the same 671B-parameter MoE architecture (37B activated per token). | Mar 25, 2025 | MIT + Model License (commercial use allowed) | - | - | - | 49.2% | - |
| DeepSeek-R1-Distill-Qwen-7B | Dense 7B Qwen-based model distilled from DeepSeek-R1, transferring R1's reasoning patterns to a much smaller checkpoint. | Jan 20, 2025 | - | - | - | - | 37.6% | - |
| DeepSeek-R1-Distill-Qwen-1.5B | Dense 1.5B Qwen-based model distilled from DeepSeek-R1, transferring R1's reasoning patterns to a much smaller checkpoint. | Jan 20, 2025 | - | - | - | - | 16.9% | - |
| DeepSeek-R1-Distill-Qwen-14B | Dense 14B Qwen-based model distilled from DeepSeek-R1, transferring R1's reasoning patterns to a smaller checkpoint. | Jan 20, 2025 | - | - | - | - | 53.1% | - |
| DeepSeek-R1-Distill-Qwen-32B | Dense 32B Qwen-based model distilled from DeepSeek-R1, transferring R1's reasoning patterns to a smaller checkpoint. | Jan 20, 2025 | - | - | - | - | 57.2% | - |
| DeepSeek-R1-Zero | Trained via large-scale RL without supervised fine-tuning (SFT) as a preliminary step; exhibits remarkable emergent reasoning behaviors, but suffers from endless repetition, poor readability, and language mixing. DeepSeek-R1 addresses these issues by adding cold-start data before RL, reaching performance comparable to OpenAI o1 on math, code, and reasoning tasks. | Jan 20, 2025 | - | - | - | - | 50.0% | - |
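The "671B total / 37B activated per token" figures quoted for the V3/R1 family mean that only a small slice of the MoE parameters participates in any single forward pass. A quick sketch of that arithmetic, using only the numbers from the table above:

```python
# Activated-parameter fraction for a Mixture-of-Experts (MoE) model.
# Figures from the table: DeepSeek-V3/R1 have 671B total parameters,
# of which 37B are activated for each token.
TOTAL_PARAMS_B = 671
ACTIVE_PARAMS_B = 37

fraction = ACTIVE_PARAMS_B / TOTAL_PARAMS_B
print(f"Activated per token: {fraction:.1%} of total parameters")
# Roughly 5.5%: per-token compute cost is closer to that of a ~37B
# dense model, even though the full parameter count is 671B.
```

This is why the distill models in the table (dense 1.5B-32B checkpoints) can be a practical trade-off: they give up the MoE capacity but avoid hosting 671B parameters at all.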