Pulse

last 48h

[11/11] 89 sources

What AI is actually talking about — clusters surfacing on Bluesky, Reddit, HN, Mastodon and Lobsters, re-ranked to elevate originality and crush noise.

FRONTIER RELEASE · The Decoder · 1d · [12 sources] · MASTOBLOG

Thinking Machines Lab ships its first model and argues interactivity is what OpenAI gets wrong about voice

Thinking Machines Lab, founded by former OpenAI CTO Mira Murati, has unveiled its first AI model, focusing on "interaction models" designed for real-time collaboration across voice, video, and text. Unlike current AI that processes input sequentially, TML's model operates in 200-millisecond chunks, allowing it to listen and respond simultaneously, mimicking natural human conversation. This "full duplex" approach aims to surpass competitors like OpenAI's GPT Realtime 2 and Google's Gemini Live in conversational quality, though it is currently a research preview with a limited release planned. AI

IMPACT Sets a new standard for real-time conversational AI, potentially shifting focus from agentic capabilities to natural human-AI interaction.
FRONTIER RELEASE · Mastodon — sigmoid.social 日本語(JA) · 3d · [4 sources] · MASTO

OpenAI's 'Images 2.0' Significantly Improves Text Descriptions! AI-Generated Images Are Finally Difficult to Distinguish (Lifehacker Japan) – Yahoo! News https://www.yayafa.com/2797434/ # AgenticAi # AI # ArtificialGeneralIntelligence #

OpenAI has significantly improved its image generation capabilities with "Images 2.0," making AI-generated visuals increasingly indistinguishable from real ones. Separately, a new version of ChatGPT, referred to as "ChatGPT 5.5 Pro," has demonstrated the ability to perform complex, doctoral-level mathematical research in just one hour, prompting mathematicians to note a potential shift in the baseline for human research. Meanwhile, Anthropic is reportedly planning to establish an "AI McKinsey" with a $1.5 billion investment, aiming to leverage AI for strategic consulting. AI

IMPACT These advancements in image generation and complex problem-solving by AI models like ChatGPT 5.5 Pro signal a rapid increase in AI capabilities, potentially redefining research baselines and the creation of digital media.
FRONTIER RELEASE · 量子位 (QbitAI) 中文(ZH) · 4d · [3 sources] · MASTO

Baidu Releases Wenxin 5.1: Search Capabilities Top Domestic, Pre-training Costs Only 6% of Industry Average

Baidu has released its new large language model, Wenxin 5.1, which significantly enhances search, knowledge, and AI agent capabilities. The model achieves leading domestic search performance and surpasses DeepSeek-V4-Pro in AI agent functionality, while its creative writing and reasoning abilities are comparable to top-tier models. Notably, Wenxin 5.1 was trained using a novel multi-dimensional elastic pre-training technique, reducing training costs to approximately 6% of industry standards. AI

IMPACT Sets new SOTA for domestic search and agent capabilities, while drastically reducing training costs.
FRONTIER RELEASE · OpenAI News · 1w · [45 sources] · HNMASTOBLOG

How OpenAI delivers low-latency voice AI at scale

OpenAI has released three new real-time voice models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper. These models offer enhanced reasoning capabilities, live speech translation for over 70 languages, and low-latency transcription. GPT-Realtime-2, in particular, is described as having "GPT-5-class reasoning" and features a significantly expanded context window of 128K tokens, alongside improved handling of interruptions and tool usage. AI

IMPACT Enhances real-time voice agent capabilities with improved reasoning, translation, and transcription, potentially accelerating adoption of voice-first interfaces.
FRONTIER RELEASE · Smol AINews · 2w · [15 sources] · MASTOBLOG

not much happened today

OpenAI has released GPT-5.5, which offers improvements in factuality, intelligence, and image understanding, and is now the default model for ChatGPT and its API. This release also enhances personalization, allowing ChatGPT to utilize user memories, past chats, and connected files. Additionally, OpenAI has introduced an Agents SDK for TypeScript and updated its Codex model to function as a general-purpose computer work agent, expanding its capabilities beyond coding. AI

IMPACT GPT-5.5's release and enhanced personalization features are likely to accelerate user adoption of AI agents for a wider range of tasks beyond coding.
FRONTIER RELEASE · Simon Willison · 2w · [11 sources] · MASTOBLOG

A pelican for GPT-5.5 via the semi-official Codex backdoor API

OpenAI has released GPT-5.5, available in Codex and rolling out to paid ChatGPT subscribers, though its API access is pending further safety reviews. The new model is described as fast and capable, with early users noting its ability to accurately build requested items. Meanwhile, Simon Willison's LLM library has been updated to version 0.32a0, introducing a more flexible message-based input system and streaming parts for responses to better handle diverse model capabilities. Additionally, issues affecting Claude Code's performance have been identified as harness problems rather than model flaws, with a specific bug causing forgetfulness and repetition. AI

IMPACT GPT-5.5's release and API delay signals continued frontier model development and cautious rollout strategies.
FRONTIER RELEASE · 36氪 (36Kr) 中文(ZH) · 2w · [50 sources] · MASTOX

WildFire Energy shareholders seek to sell the US shale oil operator for over $4 billion

OpenAI has launched its new GPT-5.5 model, reporting rapid API revenue growth and increased demand for its coding tools. The release is accompanied by a peculiar system prompt directive for Codex to "never talk about goblins." Meanwhile, OpenAI President Greg Brockman testified in the ongoing trial against Elon Musk, disclosing his significant equity stake in the company and defending his financial interests and contributions. AI

IMPACT Sets a new benchmark for model efficiency and coding capabilities, potentially intensifying competition with Anthropic.
FRONTIER RELEASE · Smol AINews · 1mo · [3 sources] · MASTO

Gemma 4

Google DeepMind has released Gemma 4, a new family of open-weight models licensed under Apache 2.0, marking a significant advancement in their open-source AI offerings. The models are designed for reasoning and agentic workflows, with versions optimized for local and edge deployment, including native multimodal capabilities for text, vision, and audio. Early benchmarks suggest competitive performance, with the 31B model ranking highly among open-source options, and rapid ecosystem support has emerged from various platforms like llama.cpp and Ollama. AI

IMPACT Accelerates local and edge AI development with permissive licensing and multimodal capabilities.
FRONTIER RELEASE · Last Week in AI · 2mo · [4 sources] · BLOGREDDIT

LWiAI Podcast #236 - GPT 5.4, Gemini 3.1 Flash Lite, Supply Chain Risk

OpenAI has released GPT-5.4 Pro with a 1 million token context window and enhanced safety features, alongside GPT-5.3 Instant, which aims for a less preachy tone. Google has improved its Gemini 3.1 Flash Lite model for faster response times and lower costs, and introduced a CLI for agent integration with its productivity suite. Luma has launched unified multimodal models and agents for creative tasks, demonstrating a rapid ad localization use case. The cluster also touches on controversies surrounding AI in defense contracts, a lawsuit alleging Gemini's role in a suicide, and Anthropic's warning about labor disruption. AI

IMPACT New model releases from OpenAI and Google push the boundaries of context window size and agent integration, potentially accelerating enterprise adoption and raising safety concerns.
FRONTIER RELEASE · X — Cursor (AI IDE) · 9mo · [9 sources] · REDDITX

We recently shipped quality-of-life improvements to the Cursor CLI to make working with agents in the terminal more delightful.

Cursor has integrated GPT-5.5 into its AI IDE, allowing users to leverage the new model for their coding tasks. This integration enhances the capabilities of the Cursor CLI, introducing features like a customizable status bar and an in-CLI settings panel for managing preferences. Additionally, new commands such as "/btw" enable users to ask side questions without interrupting ongoing agent processes, improving the overall user experience for terminal-based agent interactions. AI
FRONTIER RELEASE · Practical AI · 68mo · [12 sources] · MASTOBLOG

Cracking the code of failed AI pilots

Anthropic has withheld its new Claude Mythos model from public release due to its advanced capabilities in finding and exploiting software vulnerabilities. The company is instead providing access to select cybersecurity firms through Project Glasswing to help patch critical software before the model's capabilities become more widely available. This decision highlights a shift from previous AI releases, where caution stemmed from unknown risks, to a current scenario where known, potent risks necessitate controlled access. AI

IMPACT This controlled release strategy for a highly capable model could set a precedent for managing advanced AI risks, potentially influencing future AI development and deployment.