China’s DeepSeek has launched its latest AI model, DeepSeek-V4, claiming it offers enhanced capabilities over other open-source alternatives. The new model is optimized for domestic chips, features an ultra-long context of one million tokens, and, the company asserts, leads open-source models in agent capabilities, world knowledge, and reasoning performance.
DeepSeek-V4 is available in two editions: DeepSeek-V4-Pro and DeepSeek-V4-Flash, the latter marketed as a more efficient and economical option. In world knowledge benchmarks, DeepSeek-V4-Pro significantly outperforms other open-source models and closely trails Google’s closed-source model, Gemini-3.1-Pro.
🚀 DeepSeek-V4 Preview is officially live & open-sourced! Welcome to the era of cost-effective 1M context length.
🔹 DeepSeek-V4-Pro: 1.6T total / 49B active params. Performance rivaling the world’s top closed-source models.
🔹 DeepSeek-V4-Flash: 284B total / 13B active params.… pic.twitter.com/n1AgwMIymu
— DeepSeek (@deepseek_ai) April 24, 2026
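The wide gap between total and active parameters in both editions is characteristic of mixture-of-experts designs, where only a fraction of the network runs for any given token. That framing is an inference from the published figures, not something the announcement spells out; the back-of-envelope arithmetic looks like this:

```python
# Rough arithmetic on the parameter counts from DeepSeek's announcement.
# Interpreting the total/active split as mixture-of-experts routing is
# an assumption based on these figures, not confirmed by the company.
pro_total, pro_active = 1.6e12, 49e9      # DeepSeek-V4-Pro
flash_total, flash_active = 284e9, 13e9   # DeepSeek-V4-Flash

for name, total, active in [("Pro", pro_total, pro_active),
                            ("Flash", flash_total, flash_active)]:
    print(f"V4-{name}: {active / total:.1%} of parameters active per token")
# V4-Pro:   ~3.1% active per token
# V4-Flash: ~4.6% active per token
```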
The DeepSeek-V4-Pro edition introduces a “maximum reasoning effort mode,” which the company says pushes its knowledge capabilities beyond existing open-source models. The release follows the market turmoil set off by DeepSeek’s earlier R1 model, which competed effectively against ChatGPT at a far lower cost.
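DeepSeek’s API has historically been OpenAI-compatible. If that holds for V4, toggling such a mode might look like the hypothetical sketch below; the `reasoning_effort` parameter name and the `deepseek-v4-pro` model identifier are illustrative assumptions, not documented values:

```python
from openai import OpenAI

# Hypothetical sketch: assumes an OpenAI-compatible endpoint and that the
# reasoning mode is exposed as a request parameter. The parameter name
# "reasoning_effort" and model id "deepseek-v4-pro" are guesses.
client = OpenAI(base_url="https://api.deepseek.com", api_key="YOUR_KEY")

response = client.chat.completions.create(
    model="deepseek-v4-pro",    # assumed model identifier
    reasoning_effort="high",    # assumed name for the max-effort mode
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.content)
```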
While DeepSeek has not disclosed which chip system was used to train the V4 models, the company stated that its software is compatible with both Nvidia and Huawei chips. The launch coincides with tightening U.S. restrictions on semiconductor exports to China, particularly for the high-end GPUs crucial to AI development.
The new model can generate a maximum output of 384,000 tokens. Tokens are the fundamental units of data AI models process, with one token typically representing about four characters of English text. DeepSeek claims a significant jump in computational efficiency, allowing the model to take in a context of up to one million tokens.
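Using that rough four-characters-per-token heuristic, the claimed window works out to about four million characters of context; a quick sketch of the conversion (the heuristic only, not a real tokenizer):

```python
# Back-of-envelope conversion using the ~4 characters-per-token heuristic
# mentioned above; real tokenizers vary by language and text.
CHARS_PER_TOKEN = 4

context_tokens = 1_000_000   # claimed context window
max_output_tokens = 384_000  # claimed maximum output

print(f"~{context_tokens * CHARS_PER_TOKEN / 1e6:.0f}M characters of context")
print(f"~{max_output_tokens * CHARS_PER_TOKEN / 1e6:.1f}M characters of output")
# ~4M characters of context, ~1.5M characters of output
```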
DeepSeek-V4-Pro reportedly outperforms Google’s Gemini-3.1-Pro at processing long text but still trails Anthropic’s Claude Opus 4.6 model. The company says it aims to further improve the model’s intelligence and usability across various applications.
DeepSeek indicated that the advance would usher in an era of million-token contexts for next-generation language models. “This breakthrough enables efficient support for a context length of one million tokens,” the company stated in its announcement.