DeepSeek has significantly reduced its API prices by up to 90% following the release of its V4 model. The company attributes these price cuts, which establish a new industry low, to its sparse attention architecture. This new architecture reportedly lowers per-token compute needs and supports context windows of up to 1 million tokens.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Accelerates the trend of falling AI inference costs, potentially enabling wider adoption of large-context-window models.
RANK_REASON Model release from a significant AI lab with a notable price reduction and technical innovation.