Papers
Frontier AI papers move from arXiv preprint to broad citation in days, not months. PulseAugur's papers feed tracks the research that's actually being read across labs and developer communities — ranked by source corroboration and citation velocity, not raw upvotes. We ingest arXiv, Semantic Scholar, the major AI conference proceedings (NeurIPS, ICML, ICLR, ACL, EMNLP, CVPR), and we cluster across vendor blog posts about a paper, social commentary, and replication threads from independent groups. New papers appear within minutes of arXiv announcement; cluster scores update hourly as citations and replication signals arrive.
- Coverage: 50 stories
- Window: today
- Mix: tool 44 / research 3 / commentary 2 / significant 1
-
Frontier models double reliability every 4.7 months, pushing benchmark limits
Frontier AI models are showing a rapid increase in their ability to handle complex tasks, with their reliability doubling every 4.7 months, a rate that has accelerated since late 2024. Recent models like Claude Mythos P…
-
Scientists engineer mice to produce own antibodies for extended treatment
Researchers have developed a novel method to enable the body to produce its own antibodies for extended periods, addressing the limitations of current antibody drugs. This technique involves gene-editing blood-forming s…
-
Author trains word embeddings from scratch using Dostoevsky novels
The author details their process of building word embeddings from scratch, using Dostoevsky's novels as a corpus of nearly one million words. This step follows their previous work on character-level tokenization and aim…
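The article's own pipeline isn't reproduced in the summary; as a point of reference, here is a minimal sketch of the simplest from-scratch embedding approach, representing each word by its window co-occurrence counts and comparing words by cosine similarity (the toy corpus and window size are assumptions):

```python
# Minimal co-occurrence embeddings: each word -> sparse count vector
# over nearby context words. Illustrative only; the author's actual
# method (corpus size ~1M words, prior character-level tokenizer) may differ.
from collections import Counter, defaultdict
import math

def cooccurrence_embeddings(tokens, window=2):
    """Map each word to a sparse count vector over its context words."""
    vectors = defaultdict(Counter)
    for i, word in enumerate(tokens):
        lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Toy corpus standing in for the Dostoevsky text:
corpus = "the prince spoke softly and the prince smiled".split()
vecs = cooccurrence_embeddings(corpus)
```

Trained word2vec-style embeddings replace these raw counts with learned dense vectors, but the neighborhood-statistics intuition is the same.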
-
Secret loyalties in AI models pose neglected but tractable threat
A new paper from Formation Research introduces the concept of "secret loyalties" in frontier AI models, where a model is intentionally manipulated to advance a specific actor's interests without disclosure. The research…
-
LLMs Explained: How They Process Context and Generate Output
This article provides a beginner-friendly explanation of how Large Language Models (LLMs) function, focusing on their internal processes without complex mathematics. It details how LLMs handle context, predict subsequen…
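The core step the article explains, scoring every vocabulary item and turning the scores into a probability distribution over the next token, can be shown in a few lines (the vocabulary and logits below are made up for demonstration):

```python
# Next-token prediction in miniature: logits -> softmax -> pick a token.
# Real LLMs produce the logits from the full context via a transformer;
# here they are hard-coded for illustration.
import math

def softmax(logits):
    m = max(logits)                      # subtract max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

vocab = ["cat", "dog", "mat", "<eos>"]
logits = [1.0, 0.5, 3.0, 0.1]            # hypothetical model scores
probs = softmax(logits)
next_token = vocab[probs.index(max(probs))]   # greedy decoding
```

Generation loops this step: the chosen token is appended to the context and the model scores the vocabulary again.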
-
Developer pivots LLM tool to 'Turn 0' state injection for consistency
A developer is pivoting their tool, Mnemara, from injecting state mid-conversation to a "Turn 0" strategy, placing all critical information in the initial system prompt. This approach leverages the primacy bias of LLMs,…
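Mnemara's real prompt format is not public; the sketch below only illustrates the "Turn 0" idea as described, serializing all critical state into the initial system message rather than appending it mid-conversation (field names and wording are assumptions):

```python
# "Turn 0" state injection: one system message carries all critical
# context up front, exploiting the primacy bias the article mentions.
import json

def build_turn0_prompt(state: dict, instructions: str) -> list:
    system = (
        instructions
        + "\n\n## Session state (authoritative, do not contradict)\n"
        + json.dumps(state, indent=2)
    )
    return [{"role": "system", "content": system}]

messages = build_turn0_prompt(
    {"user_name": "Ada", "project": "demo", "units": "metric"},
    "You are a concise assistant.",
)
```

The contrast is with mid-conversation injection, where the same state would arrive as extra user or tool messages deep in the transcript, where models attend to it less reliably.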
-
LLM Integration Guide: MCP, Tool Use, and Function Calling Explained
This article explores three distinct approaches for integrating large language models (LLMs) with external systems: MCP, tool use, and function calling. It aims to clarify the differences between these architectures and…
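Whatever the plumbing, all three approaches reduce to the same loop: describe a function with a JSON-Schema-like spec, let the model emit a structured call, then dispatch it locally. A vendor-neutral sketch (the schema shape follows the common function-calling convention; MCP wraps similar declarations in a client/server protocol):

```python
# Function calling in miniature: a declared tool schema plus a local
# dispatcher for the structured call the model emits. The tool name and
# handler here are hypothetical.
import json

TOOLS = {
    "get_weather": {
        "description": "Look up current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    }
}

def dispatch(tool_call_json: str):
    """Execute a model-emitted tool call against local handlers."""
    call = json.loads(tool_call_json)
    handlers = {"get_weather": lambda city: f"Sunny in {city}"}
    return handlers[call["name"]](**call["arguments"])

# Simulated model output: a structured call rather than free text.
result = dispatch('{"name": "get_weather", "arguments": {"city": "Oslo"}}')
```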
-
AI Security Lacking Metrics, New Study Finds
The Berryville Institute of Machine Learning (BIML) has published a new study highlighting a lack of security metrics for AI systems. The research indicates that current security practices are insufficient to address th…
-
OpenAI LLMs outperform doctors on clinical reasoning tasks
A recent study published in Science indicates that OpenAI's large language models have demonstrated the ability to outperform physicians in certain clinical reasoning tasks, using real emergency room data. This developm…
-
Bosch, CMU AI boosts humanoid robot dexterity by 90%
Researchers from Bosch and Carnegie Mellon University have created an AI system called Humanoid Transformer with Touch Dreaming (HTD) to enhance the dexterity of humanoid robots. This system uses reinforcement learning …
-
Penn Engineers develop AI framework to solve complex math problems
Researchers at the University of Pennsylvania have developed a novel AI framework aimed at tackling complex mathematical equations. This advancement is expected to accelerate scientific discovery by enabling a deeper un…
-
Pretrained AI Models Often Sufficient, Fine-Tuning Not Always Needed
This article explores the necessity of fine-tuning pretrained AI models. It argues that while fine-tuning can enhance performance for specific tasks, it is not always required. The author suggests that for many applicat…
-
A2A Protocol: Author details code, architecture, and failures
The author details the practical implementation of the A2A Protocol, an open standard for agent discovery and task delegation. This second part focuses on the code, outlining the architecture where the orchestrator acts…
-
LLMs excel at deciphering historical handwriting, outperforming specialized tools
Large language models are proving effective at deciphering historical handwriting, a task that has long challenged AI researchers. A study by Wilfrid Laurier University found that LLMs outperformed specialized software …
-
Ali Health launches medical AI 'Hydrogen Ion' with BMJ content deal
Ali Health has launched its medical AI platform, "Hydrogen Ion," and announced an exclusive content partnership with the UK's BMJ Group. This collaboration grants Hydrogen Ion access to BMJ's extensive medical journal c…
-
Microsoft: Frontier AI models falter on long, complex tasks
Microsoft researchers discovered that advanced AI models struggle with long, multi-step tasks, introducing errors even in complex workflows. This suggests that current frontier models are not yet reliable for intricate,…
-
MoE architectures are workarounds for LLM training instability, not ideal solutions
Mixture-of-Experts (MoE) architectures are often presented as an efficient solution for scaling large language models, but this analysis argues they are primarily a workaround for training instability in dense transform…
-
AI erodes science's self-correction, surgeon warns
A pediatric surgeon and researcher hypothesizes that artificial intelligence is eroding the self-correction mechanisms of science, a phenomenon they term "epistemic immunodepression." The erosion stems from reduced epis…
-
Epistemic Hygiene Explored to Reduce AI Hallucinations
Researchers are exploring epistemic hygiene as a method to improve the coherence and reduce hallucinations in large language models. This concept, borrowed from human cognitive practices, aims to maintain mental clarity…
-
EPFL AI generates dynamic protein structures with atomic detail
Researchers at EPFL have created an AI-driven framework capable of generating comprehensive, all-atom structural models of proteins and their dynamic movements. This new method goes beyond prior systems by not only mode…
-
ML model versioning needs dedicated registries, not just S3 buckets
This article discusses the critical need for robust model versioning and registry systems in machine learning development. It argues that simple cloud storage solutions like S3 buckets are insufficient for managing the …
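A toy illustration of what a registry adds over a bare S3 bucket: immutable version numbers, content hashes, and stage metadata you can query. Real systems (MLflow's registry, for instance) do far more; this is an assumption-laden sketch, not any specific tool's API:

```python
# Minimal in-memory model registry: versioned records with a content
# hash, metrics, and a lifecycle stage. Illustrative only.
import hashlib

class ModelRegistry:
    def __init__(self):
        self._models = {}   # name -> list of version records

    def register(self, name, artifact: bytes, metrics: dict):
        versions = self._models.setdefault(name, [])
        record = {
            "version": len(versions) + 1,
            "sha256": hashlib.sha256(artifact).hexdigest(),
            "metrics": metrics,
            "stage": "staging",
        }
        versions.append(record)
        return record["version"]

    def promote(self, name, version, stage="production"):
        self._models[name][version - 1]["stage"] = stage

    def latest(self, name, stage=None):
        for rec in reversed(self._models[name]):
            if stage is None or rec["stage"] == stage:
                return rec
        return None

reg = ModelRegistry()
v1 = reg.register("fraud-clf", b"weights-v1", {"auc": 0.91})
reg.promote("fraud-clf", v1)
```

With only an S3 bucket, every one of these guarantees (which bytes are v3, which version is in production, what its metrics were) lives in naming conventions and tribal knowledge instead.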
-
Claude's tool use ensures reliable JSON output for developers
A developer guide demonstrates how to reliably extract structured data from Anthropic's Claude models by leveraging their tool-use feature. Instead of directly prompting for JSON, the technique involves defining a fake …
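The pattern the guide describes can be sketched as follows: declare a tool whose input schema is exactly the JSON shape you want, force the model to call it, and read the arguments back as already-parsed JSON. The request shape follows Anthropic's Messages API tool format; the model id and schema fields are illustrative, and the response here is simulated rather than fetched:

```python
# "Fake tool" structured-output pattern for Claude: the tool's
# input_schema doubles as the desired output schema.
record_schema = {
    "name": "record_person",
    "description": "Record extracted fields about a person.",
    "input_schema": {
        "type": "object",
        "properties": {
            "name": {"type": "string"},
            "age": {"type": "integer"},
        },
        "required": ["name", "age"],
    },
}

request = {
    "model": "claude-sonnet-4-20250514",   # illustrative model id
    "max_tokens": 256,
    "tools": [record_schema],
    # Forcing tool_choice guarantees the reply contains a tool_use block.
    "tool_choice": {"type": "tool", "name": "record_person"},
    "messages": [{"role": "user", "content": "Ada Lovelace was 36."}],
}

def extract(response: dict) -> dict:
    """Pull the structured arguments out of a tool_use content block."""
    block = next(b for b in response["content"] if b["type"] == "tool_use")
    return block["input"]

# Simulated API response; the "input" field arrives as parsed JSON:
simulated = {"content": [{"type": "tool_use", "name": "record_person",
                          "input": {"name": "Ada Lovelace", "age": 36}}]}
data = extract(simulated)
```

Because the arguments are schema-validated on the API side, this avoids the usual failure modes of prompting for raw JSON (markdown fences, trailing prose, malformed quotes).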
-
Cog-RAG uses dual-hypergraphs to improve LLM retrieval
Researchers have developed Cog-RAG, a novel approach to Retrieval Augmented Generation that mimics human cognitive processes for improved LLM responses. Unlike traditional methods that retrieve flat text or simple graph…
-
New benchmark tests AI agents on complex, iterative engineering tasks
A new benchmark, Frontier-Eng Bench, has been released to evaluate AI agents on complex engineering tasks that lack standardized answers. This benchmark moves beyond simple problem-solving by requiring agents to propose…
-
New framework guarantees multi-dimensional hyperparameter tuning
Researchers have developed a new framework for statistically guaranteeing the performance of multi-dimensional hyperparameter tuning in data-driven machine learning settings. This approach leverages tools from real alge…
-
CNN framework tests General Relativity using gravitational wave data
Researchers have developed a convolutional neural network (CNN) framework to test General Relativity using gravitational wave data. By training the CNN on simulated beyond-GR waveforms, they found that using a response …
-
New SCOPE algorithm optimizes sparse machine learning problems
Researchers have introduced SCOPE, a novel iterative algorithm for sparsity-constrained optimization problems. This method is designed to optimize nonlinear, differentiable, and strongly convex functions, replacing trad…
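SCOPE's own splicing-style update is not reproduced here; for context, this is the classical iterative hard-thresholding (IHT) baseline for the same problem class, minimizing a smooth convex objective subject to a sparsity constraint (toy least-squares objective, made-up data):

```python
# Generic IHT baseline for sparsity-constrained optimization:
# gradient step on ||Ax - b||^2, then project onto {x : ||x||_0 <= k}
# by keeping the k largest-magnitude coordinates. Not the SCOPE algorithm.
def iht_least_squares(A, b, k, steps=200, lr=0.3):
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(steps):
        r = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]  # A^T r
        x = [x[j] - lr * g[j] for j in range(n)]
        keep = sorted(range(n), key=lambda j: -abs(x[j]))[:k]
        x = [x[j] if j in keep else 0.0 for j in range(n)]
    return x

# Toy problem: the best 2-sparse solution is x = [3, -2, 0, 0].
A = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0]]
b = [3, -2, 0, 0, 1]
x = iht_least_squares(A, b, k=2)
```

Methods like SCOPE target the cases where this simple projection step stalls or picks the wrong support.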
-
New methods combine simulation and real-world data for improved AI model training
Researchers have developed new strategies for training surrogate models by integrating data from multiple sources, including simulations and real-world measurements. One approach involves training separate models for ea…
-
Smoothed analysis makes positive-only learning feasible
Researchers have developed a smoothed analysis approach for learning from positive-only samples, a challenging problem in binary classification. Unlike worst-case scenarios where learning is nearly impossible, this new …
-
New framework quantifies epistemic uncertainty in machine learning
Researchers have introduced a new framework for comparing and quantifying epistemic uncertainty in machine learning models. This framework, called the integral imprecise probability metric (IIPM), generalizes classical …
-
ReLU network analysis links Fisher information to spherical harmonics
Researchers have analyzed the Fisher information matrices of simple two-layer ReLU neural networks with random hidden weights. They found that the eigenvalue distribution concentrates significantly on specific eigenspac…
-
New theory on stationary MMD points offers faster convergence
Researchers have introduced a new theoretical framework for approximating probability distributions using a finite set of points. Instead of attempting to globally minimize the maximum mean discrepancy (MMD), which is c…
-
Self-consistency loss boosts Bayesian model comparison accuracy
Researchers have developed a self-consistency (SC) loss to improve the accuracy of amortized Bayesian model comparison (BMC) when simulation models are misspecified. This technique enhances BMC estimators, particularly …
-
New GANs framework enhances credit card fraud detection with uncertainty awareness
Researchers have developed a new semi-supervised deep learning framework for credit card fraud detection, addressing challenges with large datasets and irregular transaction data. The system integrates Generative Advers…
-
New methods tackle conflicting health treatment comparisons
Researchers have introduced a new class of methods called arbitrated indirect treatment comparisons to address the "MAIC paradox." This paradox occurs when different analyses of the same health data yield conflicting co…
-
New Targeted Synthetic Control method improves causal effect estimation
Researchers have developed a new statistical method called Targeted Synthetic Control (TSC) to improve causal effect estimation in panel data. This two-stage approach refines initial weights to reduce bias and ensures t…
-
New iHMM method cuts forecasting errors by 67% with outlier protection
Researchers have developed a new method called Batched Robust iHMM (BR-iHMM) to improve the accuracy of online infinite hidden Markov models when dealing with noisy data. This approach enhances robustness against outlie…
-
New algorithms tackle Gaussian graphical model selection from dependent data
Researchers have developed new algorithms for Gaussian graphical model selection when data comes from dependent dynamics, rather than independent samples. One approach uses a local edge-testing estimator that can be imp…
-
Bayesian framework reveals multi-graph alignment thresholds
Researchers have established thresholds for the feasibility of aligning random multi-graphs using a Bayesian framework. Their findings indicate an "all-or-nothing" phenomenon in the Gaussian model, where alignment is ei…
-
New method estimates optimal classification error with soft labels
This paper introduces a practical method for estimating optimal classification error in binary classification tasks, particularly when dealing with soft labels and calibration. The research extends prior work by theoret…
-
New method generates realistic time-series data using causal models
Researchers have developed a new methodology called Adversarial Causal Tuning (ACT) to generate realistic time-series data from causal models. This approach aims to create simulated data that matches the observational a…
-
New method decomposes twin network variance to find model failure sources
Researchers have developed a novel method to decompose predictive variance in deep twin networks, separating it into encoder and head components. This technique, which adds minimal computational cost, helps pinpoint the…
-
New NARFIMA model enhances BRIC exchange rate forecasting
Researchers have developed a new Neural AutoRegressive Fractionally Integrated Moving Average (NARFIMA) model to improve the forecasting of exchange rates for emerging economies like Brazil, Russia, India, and China (BR…
-
Transformer model TAMO performs multi-objective optimization in-context
Researchers have developed TAMO, a novel transformer-based policy for multi-objective Bayesian optimization that operates entirely in-context. This approach eliminates the need for per-task surrogate fitting and acquisi…
-
New method achieves finite regret bounds in online inverse linear optimization
Researchers have developed a new method for online inverse linear optimization, a technique used in contextual recommendation systems. This approach achieves a finite regret bound of O(d log d) for M-convex action sets,…
-
Partition Tree framework advances conditional density estimation
Researchers have introduced Partition Tree, a new framework for conditional density estimation that can handle both continuous and categorical variables. This nonparametric approach models conditional distributions usin…
-
Federated learning gains uncertainty awareness for causal discovery
Researchers have developed a new method for Federated Granger Causality (FedGC) that addresses the limitation of deterministic point estimates by incorporating uncertainty awareness. This approach provides calibrated me…
-
New method improves conformal regression with CRPS-optimal binning
Researchers have developed a new non-parametric method for estimating conditional distributions, which can be used for conformal regression. This approach involves partitioning data into bins and using the empirical cum…
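The binning idea can be sketched roughly: partition the feature axis, model the conditional distribution in each bin by the empirical CDF of its y-values, and read prediction intervals off that CDF. The paper's CRPS-optimal bin selection is not reproduced; the bin edges below are fixed by assumption:

```python
# Binned empirical-CDF conditional distribution, with a central
# (1 - alpha) interval read off the per-bin order statistics.
import bisect

def fit_bins(xs, ys, edges):
    """Assign each y to the bin of its x; store sorted y-values per bin."""
    bins = [[] for _ in range(len(edges) + 1)]
    for x, y in zip(xs, ys):
        bins[bisect.bisect_right(edges, x)].append(y)
    return [sorted(b) for b in bins]

def interval(bins, edges, x, alpha=0.2):
    """Central (1 - alpha) interval from the bin's empirical CDF."""
    ys = bins[bisect.bisect_right(edges, x)]
    lo = ys[int((alpha / 2) * (len(ys) - 1))]
    hi = ys[int((1 - alpha / 2) * (len(ys) - 1))]
    return lo, hi

xs = [0.1, 0.2, 0.3, 0.8, 0.9, 1.0]
ys = [1.0, 1.2, 1.1, 5.0, 5.2, 4.9]
edges = [0.5]                     # one split -> two bins (assumed, not CRPS-optimal)
bins = fit_bins(xs, ys, edges)
lo, hi = interval(bins, edges, 0.15, alpha=0.2)
```

The paper's contribution is choosing the partition itself to optimize CRPS, which this fixed-edges toy deliberately sidesteps.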
-
arXiv paper coins "harness" for AI agent structure
A recent arXiv paper introduces the term "harness" to formally describe the components that structure and control AI agents, moving beyond informal terms like "setup" or "config." The paper, "Natural-Language Agent Harn…