Ollama has released version 0.23.1, which adds support for Gemma 4 MTP (Multi-Token Prediction) with speculative decoding on Macs. The feature can reportedly double generation speed for the Gemma 4 31B model on coding tasks. The update also includes threading fixes for MLX and MLX-C.
Summary written by gemini-2.5-flash-lite from 1 source.
IMPACT Improves performance for running specific models on Mac hardware, potentially speeding up development workflows.
RANK_REASON This is a software release for a tool that facilitates running models, not a release of a frontier model itself.