Pulse

last 48h

[18/18] 97 sources

What AI is actually talking about — clusters surfacing on Bluesky, Reddit, HN, Mastodon and Lobsters, re-ranked to elevate originality and crush noise.

RESEARCH · X — Cohere Français(FR) · 5h · X

RT @aidangomez: Today, Cohere announced a new partnership with the Government of Quebec.

Cohere has announced a new partnership with the government of Quebec. This collaboration aims to leverage Cohere's AI technology within the Quebec region. Further details on the specific applications or goals of this partnership were not immediately available. AI

IMPACT This partnership could lead to increased AI adoption and development within Quebec's public sector.
RESEARCH · X — Together (inference / OSS) English(EN) · 23h · X

RT @vipulved: PSA: Just added a few thousand chips, including B200s and B300s to our Dedicated Model Inference (https://t.co/sD3mEZtSAa).…

Together AI has added a few thousand chips, including NVIDIA B200 and B300 accelerators, to its Dedicated Model Inference service. The expansion is aimed at bolstering the service's capacity for AI model deployment and operation. AI

IMPACT Increases available compute for AI model inference, potentially lowering costs and improving performance for users.
RESEARCH · X — SemiAnalysis English(EN) · 1d · X

China's Unitree Will Dominate Global Robotics

Unitree, a Chinese robotics company, is poised to lead the global market due to its rapid iteration cycle. This accelerated development is expected to drive significant advancements in next-generation robotics. The company's fast-paced approach suggests a strong competitive advantage in the evolving robotics landscape. AI

IMPACT Unitree's rapid product iteration could set new benchmarks for development speed in the robotics sector, potentially influencing AI integration and deployment.
RESEARCH · X — Perplexity English(EN) · 1d · [8 sources] · X

We found that more autonomy with autonomous agents like Computer tracks with higher quality and satisfaction. https://t.co/MbRxsXxSPS

Perplexity has released research detailing the impact of its autonomous agent, Computer, on knowledge work. The study, conducted in collaboration with Harvard, found that Computer significantly reduces task completion time and cost compared to traditional search methods. Users reported higher satisfaction and increased autonomy when using Computer, which handles complex, multi-field queries that often extend beyond typical search capabilities. AI

IMPACT Autonomous agents like Perplexity's Computer are demonstrating significant efficiency gains, potentially reshaping how knowledge work is performed by reducing time and cost.
RESEARCH · X — SemiAnalysis English(EN) · 22h · [2 sources] · X

We just published a deep dive on why Unitree is going to dominate global robotics. Timing could not be better. (2/2)

Unitree, a Chinese robotics company, has been added to the U.S. Department of Defense's Section 1260H list of Chinese military companies. The designation places Unitree alongside other prominent Chinese tech firms such as BYD, Alibaba, Baidu, and Tencent. SemiAnalysis, which had just published a deep dive arguing that Unitree will dominate global robotics, remarked that the timing could not be better. AI

IMPACT This designation could impact Unitree's global market access and partnerships, potentially affecting the broader robotics industry.
RESEARCH · Hugging Face Daily Papers English(EN) · 3w · [97 sources] · MASTO REDDIT X

Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

Researchers are exploring novel approaches to enhance the efficiency and effectiveness of attention mechanisms in transformers. Several papers introduce methods to mitigate issues like over-smoothing and computational bottlenecks, particularly in graph transformers and large language models. Techniques include capacity-controlled attention gating, analyzing attention sinks to differentiate between adaptive no-op and broadcast mechanisms, and developing sparse attention strategies for ultra-long contexts. These advancements aim to improve model performance on various benchmarks while reducing computational costs. AI

IMPACT These research papers introduce techniques to improve transformer efficiency and performance, potentially leading to more capable and cost-effective AI models for various applications.
RESEARCH · X — SemiAnalysis English(EN) · 1mo · [3 sources] · X

@manicely6005 The public documentation can be found here too (3/3)

NVIDIA has open-sourced parts of its cuDNN library after 12 years as closed source. The release includes over 20 Mixture-of-Experts (MoE) kernels and NSA sparse attention kernels. The kernels are largely written in Python CuTe-DSL, and public documentation is now available. AI

IMPACT Open-sourcing of cuDNN kernels could accelerate research and development in AI infrastructure and model optimization.
RESEARCH · X — Google DeepMind English(EN) · 1mo · [6 sources] · X

We’re advancing this research with academics and institutions globally, and will gradually expand our clinician-facing trusted tester program to additional site

Google DeepMind has introduced an AI co-clinician research initiative aimed at assisting healthcare professionals and patients. This system utilizes live video and audio to analyze physical symptoms in real-time, such as a patient's gait or breathing. In testing, the AI demonstrated strong performance, matching or exceeding physicians in 68 out of 140 assessed areas, including triage, and made zero critical errors in 97 out of 98 primary care queries under the NOHARM safety framework. AI

IMPACT Potential to augment clinical decision-making and improve patient care through multimodal AI analysis.
RESEARCH · X — Qwen (Alibaba) English(EN) · 1mo · [3 sources] · X

Forward and backward benchmark results across common configurations. https://t.co/IHMCZRw9AW

Alibaba's Qwen team has released FlashQLA, a new set of high-performance linear attention kernels developed using TileLang. These kernels are designed to improve the efficiency of attention mechanisms in large language models. The team also shared benchmark results for their Qwen models, showcasing performance across various configurations. AI

IMPACT Introduces optimized kernels that could improve LLM inference speed and efficiency.
RESEARCH · X — Google DeepMind English(EN) · 1mo · [3 sources] · X

As AI continues to evolve, our commitment to education remains.

Google DeepMind's "Experience AI" program has trained over 30,000 educators globally since 2023, with 93% reporting increased AI knowledge and 87% feeling more confident teaching the subject. The initiative, developed with Raspberry Pi, has reached 2.9 million students across 180 countries in 19 languages. The program is now expanding into Latin America with $4.6 million in funding from Google.org, aiming to train an additional 24,000 educators and impact 1.25 million students by 2028. AI

IMPACT This program aims to improve AI literacy among educators and students globally, potentially fostering a larger future talent pool.
RESEARCH · X — Google DeepMind English(EN) · 1mo · [6 sources] · X

This is Decoupled DiLoCo: our new resilient and flexible way to train advanced AI models across multiple data centres. 🧵 https://t.co/YRmPrqIbYE

Google DeepMind has introduced Decoupled DiLoCo, an approach to training advanced AI models that improves resilience and flexibility across data centers. It can train models such as Google's 12B Gemma model across geographically dispersed regions over low-bandwidth networks, and can mix hardware generations such as TPU v6e and TPU v5p. Decoupled DiLoCo is designed to be self-healing: it isolates artificial hardware failures, continues training through them, and reintegrates units when they come back online, addressing the synchronization issues that typically stall AI training. AI

IMPACT Enables more robust and flexible large-scale AI model training, potentially reducing costs and increasing accessibility.
RESEARCH · X — Runway (video gen) English(EN) · 1mo · [9 sources] · X

Have a big idea but no advertising budget? Make it yourself with Runway. All you need is a concept to start creating high impact ads for TV, social and more. Tr

Runway has released several updates to its video generation platform. Seedance 2.0 is now available in 1080p, via the iOS app, and through the Runway API. Additionally, users can now animate Runway Characters using scripts, bringing them to life with text prompts. AI
RESEARCH · X — Google AI English(EN) · 1mo · [3 sources] · X

Last week, we launched Gemini 3.1 TTS, our latest and best text-to-speech model. This new model introduces [awe] audio tags, an intuitive way to guide vocal sty

Google AI has released Gemini 3.1 TTS and Gemini 3.1 Flash TTS, its newest text-to-speech models. The models offer enhanced expressiveness and control, introducing audio tags that let users guide vocal style, pace, and delivery through natural language commands. AI
RESEARCH · X — Qwen (Alibaba) English(EN) · 1mo · [12 sources] · MASTO X

Thanks to @lmsysorg! Try it on SGLang now!🚀🚀

Alibaba has released its Qwen3.6-27B model, an open-source, dense model that demonstrates strong coding performance, outperforming a significantly larger predecessor on key benchmarks. This new model is natively multimodal, capable of processing both vision and language inputs. The release has been accompanied by rapid integration with popular AI tools like vLLM and SGLang, enabling local execution and broader accessibility. AI
RESEARCH · Google AI / Research English(EN) · 10mo · [633 sources] · HN LOBSTERS MASTO BLOG REDDIT X

Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG

Researchers are developing advanced agent frameworks to improve AI reliability and efficiency across various domains. Google introduced an agentic RAG system that enhances enterprise query handling by iteratively searching for complete context, boosting accuracy by up to 34%. Hugging Face demonstrated a multi-agent economy simulation using a small 3B model, highlighting the trade-offs between model size and real-time performance. Other research explores methods for reliable tool use, regulatory compliance through agent-to-agent protocols, dynamic benchmarking for agent behavior, and robust self-evolution mechanisms for AI agents. AI

IMPACT New agentic frameworks and evaluation methods promise more reliable, efficient, and compliant AI systems across enterprise, simulation, and regulatory domains.
RESEARCH · arXiv cs.CL English(EN) · 13mo · [53 sources] · MASTO REDDIT X

FlexDraft: Flexible Speculative Decoding via Attention Tuning and Bonus-Guided Calibration

Researchers have developed several new methods to accelerate large language model (LLM) inference through speculative decoding. AdaPLD improves retrieval and draft construction by using semantic similarity and branched hypotheses, achieving up to 3.10x speedup. SSSD combines n-gram matching with hardware-aware speculation for up to 2.9x latency reduction without training. D^2SD uses a dual diffusion model and confidence-guided prefix trees to enhance acceptance rates, while TAPS optimizes prefix tree selection for diffusion-drafted decoding, yielding up to 7.9x speedup. KnapSpec treats draft model selection as a knapsack problem to maximize throughput, achieving up to 1.47x speedup, and Vegas uses verification-guided sparse attention for improved decoding throughput. Additionally, LK Losses directly optimize the acceptance rate during training, leading to gains of 8-10% in average acceptance length. AI

IMPACT These advancements in speculative decoding promise significant speedups and efficiency gains for LLM inference, potentially lowering costs and increasing accessibility.
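
For readers unfamiliar with the draft-then-verify idea behind these papers, here is a minimal sketch in Python. It is not the method of any paper listed above. It assumes a training-free n-gram drafter (copying tokens that followed an earlier occurrence of the current suffix) and greedy verification against a target model. The names `ngram_draft`, `speculative_step` and `target_next` are hypothetical, and the "model" is a toy function.

```python
from typing import Callable, List, Sequence

Token = int


def ngram_draft(context: Sequence[Token], n: int = 2, max_draft: int = 4) -> List[Token]:
    """Propose draft tokens by copying what followed the most recent
    earlier occurrence of the last n tokens of the context."""
    if len(context) < n:
        return []
    tail = list(context[-n:])
    # Scan backwards over earlier positions, skipping the tail itself.
    for start in range(len(context) - n - 1, -1, -1):
        if list(context[start:start + n]) == tail:
            return list(context[start + n:start + n + max_draft])
    return []


def speculative_step(
    context: Sequence[Token],
    target_next: Callable[[List[Token]], Token],
    n: int = 2,
    max_draft: int = 4,
) -> List[Token]:
    """Run one draft-then-verify step and return the newly emitted tokens.

    Assumption: target_next(seq) returns the target model's greedy next
    token. A real system would score all draft positions in one batched
    forward pass; this sketch calls the target sequentially for clarity.
    """
    accepted: List[Token] = []
    for tok in ngram_draft(context, n, max_draft):
        expected = target_next(list(context) + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and stop.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Every draft token matched (or there was no draft): the target
    # still contributes one extra token.
    accepted.append(target_next(list(context) + accepted))
    return accepted


if __name__ == "__main__":
    # Toy target "model": always predicts (last token + 1) mod 5.
    def toy_target(seq: List[Token]) -> Token:
        return (seq[-1] + 1) % 5

    print(speculative_step([0, 1, 2, 3, 4, 0, 1], toy_target))
```

Because every emitted token is either confirmed or supplied by the target, the output matches plain greedy decoding with the target model. Any speedup comes only from how many draft tokens are accepted per step.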
RESEARCH · Medium — MLOps tag English(EN) · 34mo · [63 sources] · HN MASTO BLOG REDDIT X

Building Secure AI Gateways with MLflow AI Gateway

Google Research has introduced ReasoningBank, a novel framework designed to enhance AI agents' ability to learn from their experiences, both successes and failures, after deployment. This system distills generalizable reasoning strategies from past interactions, allowing agents to continuously improve and avoid repeating mistakes. Separately, new research explores optimizing multi-agent communication through latent representations and introduces Agent Evolving Learning (AEL) for agents operating in open-ended environments, focusing on how to effectively use remembered information. Additionally, DeepSeek has released preview models of its V4 series, offering large context windows and advanced capabilities at a significantly lower cost than comparable frontier models. AI

IMPACT New frameworks for agent learning and memory, alongside cost-effective frontier models, could accelerate AI adoption in complex tasks and personalized applications.
RESEARCH · OpenAI News English(EN) · 122mo · [741 sources] · MASTO BLOG X

RL²: Fast reinforcement learning via slow reinforcement learning

OpenAI has published a series of research papers detailing advancements in reinforcement learning. These include achieving superhuman performance in Dota 2 with OpenAI Five, developing benchmarks for safe exploration in RL, and quantifying generalization capabilities with the CoinRun environment. The company also explored novel methods like prediction-based rewards for curiosity-driven exploration, learning policy representations in multiagent systems, and an experimental metalearning approach called Evolved Policy Gradients for faster training on new tasks. Further research addresses variance reduction in policy gradients and the equivalence between policy gradients and soft Q-learning, alongside challenging robotics environments for multi-goal RL. AI

IMPACT Demonstrates significant progress in RL capabilities, including superhuman performance, safety, generalization, and exploration, pushing the boundaries of AI.