Pulse

last 48h

[8/8] 97 sources

What AI is actually talking about — clusters surfacing on Bluesky, Reddit, HN, Mastodon and Lobsters, re-ranked to elevate originality and crush noise.

RESEARCH · X — Perplexity English(EN) · 1d · [8 sources] · X

We found that more autonomy with autonomous agents like Computer tracks with higher quality and satisfaction. https://t.co/MbRxsXxSPS

Perplexity has released research detailing the impact of its autonomous agent, Computer, on knowledge work. The study, conducted in collaboration with Harvard, found that Computer significantly reduces task completion time and cost compared to traditional search methods. Users reported higher satisfaction and increased autonomy when using Computer, which handles complex, multi-field queries that often extend beyond typical search capabilities.

IMPACT Autonomous agents like Perplexity's Computer are demonstrating significant efficiency gains, potentially reshaping how knowledge work is performed by reducing time and cost.
RESEARCH · Hugging Face Daily Papers English(EN) · 3w · [97 sources] · MASTO, REDDIT, X

Full Attention Strikes Back: Transferring Full Attention into Sparse within Hundred Training Steps

Researchers are exploring novel approaches to enhance the efficiency and effectiveness of attention mechanisms in transformers. Several papers introduce methods to mitigate issues like over-smoothing and computational bottlenecks, particularly in graph transformers and large language models. Techniques include capacity-controlled attention gating, analyzing attention sinks to differentiate between adaptive no-op and broadcast mechanisms, and developing sparse attention strategies for ultra-long contexts. These advancements aim to improve model performance on various benchmarks while reducing computational costs.

IMPACT These research papers introduce techniques to improve transformer efficiency and performance, potentially leading to more capable and cost-effective AI models for various applications.
RESEARCH · X — Google DeepMind English(EN) · 1mo · [6 sources] · X

We’re advancing this research with academics and institutions globally, and will gradually expand our clinician-facing trusted tester program to additional site

Google DeepMind has introduced an AI co-clinician research initiative aimed at assisting healthcare professionals and patients. The system uses live video and audio to analyze physical symptoms, such as a patient's gait or breathing, in real time. In testing, the AI matched or exceeded physicians in 68 out of 140 assessed areas, including triage, and made zero critical errors in 97 out of 98 primary care queries under the NOHARM safety framework.

IMPACT Potential to augment clinical decision-making and improve patient care through multimodal AI analysis.
RESEARCH · X — Qwen (Alibaba) English(EN) · 1mo · [12 sources] · MASTO, X

Thanks to @lmsysorg ！ Try it on SGLang now!🚀🚀

Alibaba has released Qwen3.6-27B, an open-source dense model that demonstrates strong coding performance, outperforming a significantly larger predecessor on key benchmarks. The model is natively multimodal, capable of processing both vision and language inputs. The release has been accompanied by rapid integration with popular inference tools such as vLLM and SGLang, enabling local execution and broader accessibility.
RESEARCH · Google AI / Research English(EN) · 10mo · [633 sources] · HN, LOBSTERS, MASTO, BLOG, REDDIT, X

Unlocking dependable responses with Gemini Enterprise Agent Platform’s Agentic RAG

Researchers are developing advanced agent frameworks to improve AI reliability and efficiency across various domains. Google introduced an agentic RAG system that enhances enterprise query handling by iteratively searching for complete context, boosting accuracy by up to 34%. Hugging Face demonstrated a multi-agent economy simulation using a small 3B model, highlighting the trade-offs between model size and real-time performance. Other research explores methods for reliable tool use, regulatory compliance through agent-to-agent protocols, dynamic benchmarking for agent behavior, and robust self-evolution mechanisms for AI agents.

IMPACT New agentic frameworks and evaluation methods promise more reliable, efficient, and compliant AI systems across enterprise, simulation, and regulatory domains.
RESEARCH · arXiv cs.CL English(EN) · 13mo · [53 sources] · MASTO, REDDIT, X

FlexDraft: Flexible Speculative Decoding via Attention Tuning and Bonus-Guided Calibration

Researchers have developed several new methods to accelerate large language model (LLM) inference through speculative decoding. AdaPLD improves retrieval and draft construction by using semantic similarity and branched hypotheses, achieving up to 3.10x speedup. SSSD combines n-gram matching with hardware-aware speculation for up to 2.9x latency reduction without training. D^2SD uses a dual diffusion model and confidence-guided prefix trees to enhance acceptance rates, while TAPS optimizes prefix tree selection for diffusion-drafted decoding, yielding up to 7.9x speedup. KnapSpec treats draft model selection as a knapsack problem to maximize throughput, achieving up to 1.47x speedup, and Vegas uses verification-guided sparse attention for improved decoding throughput. Additionally, LK Losses directly optimize the acceptance rate during training, leading to gains of 8-10% in average acceptance length.

IMPACT These advancements in speculative decoding promise significant speedups and efficiency gains for LLM inference, potentially lowering costs and increasing accessibility.
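
For readers unfamiliar with the draft-then-verify idea these papers build on, here is a minimal sketch of generic speculative decoding with n-gram lookup drafts and greedy verification. It is not the method of any paper named above. The function names (`ngram_draft`, `speculative_step`, `generate`), the parameters `ngram` and `max_draft`, and the `toy_target` stand-in model are all hypothetical, chosen only for illustration. A real system would verify all draft tokens in one batched forward pass of the target model; the sequential calls below only simulate that.

```python
from typing import Callable, List, Sequence

Token = int
NextToken = Callable[[Sequence[Token]], Token]


def ngram_draft(context: Sequence[Token], ngram: int = 2, max_draft: int = 4) -> List[Token]:
    """Draft tokens by copying what followed the most recent earlier
    occurrence of the last `ngram` tokens of the context."""
    if len(context) <= ngram:
        return []
    tail = list(context[-ngram:])
    # Scan right to left over earlier positions, excluding the tail itself.
    for start in range(len(context) - ngram - 1, -1, -1):
        if list(context[start:start + ngram]) == tail:
            return list(context[start + ngram:start + ngram + max_draft])
    return []


def speculative_step(context: Sequence[Token], target_next: NextToken,
                     ngram: int = 2, max_draft: int = 4) -> List[Token]:
    """Return the tokens accepted in one draft-then-verify step.
    Always yields at least one token, so decoding makes progress."""
    accepted: List[Token] = []
    for tok in ngram_draft(context, ngram, max_draft):
        expected = target_next(list(context) + accepted)
        if tok != expected:
            # First mismatch: keep the target's token and discard the rest.
            accepted.append(expected)
            return accepted
        accepted.append(tok)
    # Every draft token matched (or there was no draft): add one more
    # token from the target model.
    accepted.append(target_next(list(context) + accepted))
    return accepted


def generate(prompt: Sequence[Token], target_next: NextToken,
             max_new_tokens: int, ngram: int = 2, max_draft: int = 4) -> List[Token]:
    """Greedy generation using repeated speculative steps."""
    out = list(prompt)
    while len(out) - len(prompt) < max_new_tokens:
        out.extend(speculative_step(out, target_next, ngram, max_draft))
    return out[:len(prompt) + max_new_tokens]


def toy_target(context: Sequence[Token]) -> Token:
    """Hypothetical stand-in for a target model: counts modulo 5."""
    return (context[-1] + 1) % 5


if __name__ == "__main__":
    print(generate([0, 1, 2, 3, 4, 0, 1], toy_target, max_new_tokens=8))
```

Because every accepted token is checked against the target model's own greedy choice, the output matches plain greedy decoding; the speedup in practice comes from accepting several tokens per target-model pass when the drafts are good.
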
RESEARCH · Medium — MLOps tag English(EN) · 34mo · [63 sources] · HN, MASTO, BLOG, REDDIT, X

Building Secure AI Gateways with MLflow AI Gateway

Google Research has introduced ReasoningBank, a novel framework designed to enhance AI agents' ability to learn from their experiences, both successes and failures, after deployment. This system distills generalizable reasoning strategies from past interactions, allowing agents to continuously improve and avoid repeating mistakes. Separately, new research explores optimizing multi-agent communication through latent representations and introduces Agent Evolving Learning (AEL) for agents operating in open-ended environments, focusing on how to effectively use remembered information. Additionally, DeepSeek has released preview models of its V4 series, offering large context windows and advanced capabilities at a significantly lower cost than comparable frontier models.

IMPACT New frameworks for agent learning and memory, alongside cost-effective frontier models, could accelerate AI adoption in complex tasks and personalized applications.
RESEARCH · OpenAI News English(EN) · 122mo · [741 sources] · MASTO, BLOG, X

RL²: Fast reinforcement learning via slow reinforcement learning

OpenAI has published a series of research papers detailing advancements in reinforcement learning. These include achieving superhuman performance in Dota 2 with OpenAI Five, developing benchmarks for safe exploration in RL, and quantifying generalization capabilities with the CoinRun environment. The company also explored novel methods like prediction-based rewards for curiosity-driven exploration, learning policy representations in multiagent systems, and an experimental metalearning approach called Evolved Policy Gradients for faster training on new tasks. Further research addresses variance reduction in policy gradients and the equivalence between policy gradients and soft Q-learning, alongside challenging robotics environments for multi-goal RL.

IMPACT Demonstrates significant progress in RL capabilities, including superhuman performance, safety, generalization, and exploration, pushing the boundaries of AI.