Brief

last 24h

[50/687] 185 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.LG · 1d

Trajectory-Agnostic Asteroid Detection in TESS with Deep Learning

Researchers have developed a new deep learning method called a W-Net, utilizing two stacked 3D U-Nets, to detect asteroids in TESS image data. This approach is robust to variations in asteroid speed and direction, unlike traditional shift-and-stack algorithms, and includes a novel Adaptive Normalization technique for data scaling. The team has also released the code for generating TESS training data with asteroid masks to aid the scientific community, with potential applications for future missions like the Nancy Grace Roman Space Telescope. AI

IMPACT Enhances astronomical survey capabilities by improving asteroid detection efficiency and robustness.
TOOL · arXiv cs.CL · 1d

Predicting Disagreement with Human Raters in LLM-as-a-Judge Difficulty Assessment without Using Generation-Time Probability Signals

Researchers have developed a new method to predict when AI-generated difficulty ratings for educational materials might disagree with human assessments. This approach uses a separate embedding space, like ModernBERT, to identify potential disagreements without relying on generation-time probability signals, which are often difficult to compare across different AI models. Experiments demonstrated that this geometric consistency method achieved higher accuracy in predicting human rater disagreements than probability-based baselines when tested on CEFR-based sentence difficulty assessment using GPT-OSS-120B and Qwen3-235B-A22B. AI

IMPACT Improves the reliability of AI-generated educational content assessments, reducing the need for extensive human re-rating.
TOOL · arXiv cs.AI · 1d

Stories in Space: In-Context Learning Trajectories in Conceptual Belief Space

Researchers have proposed a new framework for understanding how Large Language Models (LLMs) learn within a given context. Their work suggests that LLMs update their behavior by performing Bayesian inference over a low-dimensional geometric space, termed a conceptual belief space. By analyzing LLMs' performance on story understanding tasks, the study found that these belief updates follow predictable trajectories on structured manifolds, which are reflected in both the models' external behavior and internal representations. Furthermore, interventions on these internal representations could causally influence the belief trajectories, supporting the geometric account of LLM belief dynamics. AI

IMPACT Proposes a geometric framework for understanding LLM in-context learning, potentially enabling more predictable and steerable model behavior.
- Large Language Models
- Bayesian inference
TOOL · arXiv cs.CL · 1d

A Comparative Study of Controlled Text Generation Systems Using Level-Playing-Field Evaluation Principles

A new research paper proposes a level-playing-field (LPF) evaluation approach to fairly compare controlled text generation (CTG) systems. The study found that when re-evaluated using standardized methods and datasets, the performance of several CTG systems was significantly worse than originally reported. This highlights a critical need for reproducible and standardized evaluation practices in the field to accurately reflect system capabilities. AI

IMPACT Standardized evaluation methods are crucial for accurately assessing and comparing AI model capabilities, potentially leading to more reliable development and deployment.
- Controlled Text Generation (CTG)
- arXiv
TOOL · arXiv cs.AI · 1d

Formalize, Don't Optimize: The Heuristic Trap in LLM-Generated Combinatorial Solvers

Researchers have developed a new benchmark, CP-SynC-XL, comprising 100 combinatorial problems to evaluate how Large Language Models (LLMs) synthesize executable solvers. Their findings indicate that using LLMs to formalize problems for existing solvers like OR-Tools in Python yields higher correctness than declarative modeling in MiniZinc. Prompting LLMs to also optimize search strategies resulted in only minor speed-ups and a significant drop in correctness for many problems, attributed to a "heuristic trap" where LLMs replace complete search with approximations or introduce over-constraining machinery. AI

IMPACT Highlights the risks of using LLMs for direct optimization in solver generation, suggesting a focus on formalization for verified solvers.
TOOL · arXiv cs.CL · 1d

ORBIT: Preserving Foundational Language Capabilities in GenRetrieval via Origin-Regulated Merging

Researchers have introduced ORBIT, a new method designed to prevent large language models from losing their foundational language capabilities during task-specific fine-tuning. This issue, known as catastrophic forgetting, is particularly prevalent in Generative Retrieval tasks and is linked to the divergence of model parameters. ORBIT addresses this by monitoring the distance between fine-tuned and original model weights, employing a weight averaging strategy to limit parameter drift when a set threshold is exceeded. Experiments demonstrate that ORBIT effectively preserves text and retrieval performance, outperforming existing continual learning and regularization techniques. AI

IMPACT Preserves general language abilities during task-specific LLM fine-tuning, potentially improving model versatility.
- ORBIT
- LLM
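The drift-limiting idea in the ORBIT summary can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name `regulate_drift`, the use of Euclidean distance, and the rule for choosing the mixing weight are all assumptions made for the example.

```python
import numpy as np

def regulate_drift(theta, theta_origin, max_dist):
    """Pull fine-tuned weights back toward the original weights if they drift too far."""
    drift = theta - theta_origin
    dist = np.linalg.norm(drift)
    if dist <= max_dist:
        return theta  # within the drift budget: leave the weights untouched
    # Assumed rule: weighted average of original and fine-tuned weights, with the
    # mixing weight chosen so the merged point sits exactly at max_dist.
    alpha = max_dist / dist
    return (1.0 - alpha) * theta_origin + alpha * theta

theta_origin = np.zeros(2)            # hypothetical original weights
theta = np.array([3.0, 4.0])          # hypothetical fine-tuned weights, distance 5 away
merged = regulate_drift(theta, theta_origin, max_dist=2.5)
print(merged, np.linalg.norm(merged - theta_origin))
```

In a real training loop a check like this would run periodically on the full parameter vector; the threshold and the distance measure would need tuning for the task.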
TOOL · arXiv cs.LG · 1d

Aligning Flow Map Policies with Optimal Q-Guidance

Researchers have developed a new class of generative policies called flow map policies, designed to accelerate action generation in complex control problems. These policies learn to make large jumps within generative dynamics, significantly reducing the inference cost compared to traditional methods. The approach, termed Flow Map Q-Guidance (FMQ), optimizes adaptation for offline-to-online reinforcement learning and has demonstrated state-of-the-art performance on robotic manipulation and locomotion tasks. AI

IMPACT Accelerates generative AI applications in robotics and control by reducing action generation latency.
TOOL · arXiv cs.CV · 1d

Beyond Localization: A Comprehensive Diagnosis of Perspective-Conditioned Spatial Reasoning in MLLMs from Omnidirectional Images

Researchers have introduced PCSR-Bench, a new diagnostic benchmark designed to evaluate the spatial reasoning capabilities of multimodal large language models (MLLMs) when processing omnidirectional images. The benchmark, comprising over 84,000 question-answer pairs across 2,600 images, reveals a significant gap between foundational perception and advanced reasoning tasks. While models perform moderately well on basic tasks like object counting, their accuracy plummets on more complex reasoning involving viewpoint changes and egocentric distortions. Further experiments using reinforcement learning on a smaller model indicate that spatial reasoning abilities can be improved through targeted optimization, though gains are task-specific and sensitive to reward design. AI

IMPACT Highlights a key bottleneck in current MLLMs, suggesting a need for improved spatial reasoning capabilities for more robust AI applications.
TOOL · arXiv cs.AI · 1d

Predicting Decisions of AI Agents from Limited Interaction through Text-Tabular Modeling

Researchers have developed a novel text-tabular modeling approach to predict the decisions of unfamiliar AI agents during negotiations. The method combines structured game state and dialogue history with representations derived from a frozen LLM, acting as an "LLM-as-Observer." This approach was tested on numerous frontier LLM agents, outperforming baseline methods by improving response-prediction AUC and reducing bargaining offer-prediction error. AI

IMPACT Introduces a method to predict AI agent behavior in negotiations, potentially improving automated transaction systems.
TOOL · arXiv cs.AI · 1d

Detecting overfitting in Neural Networks during long-horizon grokking using Random Matrix Theory

Researchers have developed a novel method using Random Matrix Theory to detect overfitting in neural networks, particularly during the "anti-grokking" phase of long-horizon training. This technique identifies "Correlation Traps" within model layers by analyzing deviations from the Marchenko-Pastur distribution in randomized weight matrices. The study found that these traps increase as test accuracy declines while training accuracy remains high, and importantly, some large-scale LLMs exhibit similar traps, suggesting potential harmful overfitting. AI

IMPACT This new method could help developers identify and mitigate harmful overfitting in large language models, potentially improving their generalization and reliability.
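A rough sketch of the randomized-matrix check described in the summary: shuffle a layer's weights elementwise, compute the eigenvalues of the resulting correlation matrix, and count those above the Marchenko-Pastur upper edge. The function name `count_mp_outliers`, the tolerance factor, and the synthetic matrices are assumptions for illustration; the paper's exact procedure may differ.

```python
import numpy as np

def count_mp_outliers(weights, rng, tol=1.1):
    """Count eigenvalues of the shuffled weight matrix above the Marchenko-Pastur edge."""
    n, m = weights.shape  # assumes n >= m
    # Elementwise shuffle destroys learned structure but keeps the entry distribution.
    shuffled = rng.permutation(weights.ravel()).reshape(n, m)
    sigma2 = shuffled.var()
    eigvals = np.linalg.eigvalsh(shuffled.T @ shuffled / n)
    # Upper edge of the Marchenko-Pastur bulk for aspect ratio m / n.
    mp_edge = sigma2 * (1.0 + np.sqrt(m / n)) ** 2
    # tol leaves headroom for finite-size fluctuations (an assumed safety margin).
    return int(np.sum(eigvals > tol * mp_edge))

rng = np.random.default_rng(0)
clean = rng.standard_normal((400, 200))   # stand-in for a well-behaved layer
trapped = clean.copy()
trapped[0, 0] = 50.0                      # one anomalously large entry
clean_count = count_mp_outliers(clean, rng)
trapped_count = count_mp_outliers(trapped, rng)
print(clean_count, trapped_count)
```

The single oversized entry survives shuffling and produces an eigenvalue far outside the bulk, which is the kind of deviation the summary calls a correlation trap.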
TOOL · arXiv cs.LG · 1d

A Semi-Supervised Framework for Speech Confidence Detection using Whisper

Researchers have developed a new semi-supervised framework for detecting speaker confidence in speech, addressing the challenge of limited labeled data. This approach combines deep semantic embeddings from OpenAI's Whisper model with interpretable acoustic features. A key innovation is the Uncertainty-Aware Pseudo-Labelling strategy, which generates and selects high-quality labels for unlabeled data, improving model performance. AI

IMPACT Introduces a novel method for speech confidence detection, potentially improving human-computer interaction and adaptive systems.
- Whisper
- OpenAI
- WavLM
- HuBERT
- Wav2Vec 2.0
RESEARCH · arXiv stat.ML · 2d · [2 sources]

$\varepsilon$-Good Action Identification in Fixed-Budget Monte Carlo Tree Search

Researchers have developed a new algorithm for identifying $\varepsilon$-good actions in fixed-budget Monte Carlo Tree Search (MCTS). This algorithm is $\varepsilon$-agnostic, meaning it does not require the error tolerance $\varepsilon$ as an input but still provides instance-dependent error bounds. The misidentification probability decays exponentially with the budget, and the analysis offers new guarantees for specific MCTS methods while highlighting differences in hardness compared to standard K-armed bandits. AI

IMPACT Introduces a novel algorithmic approach for decision-making under uncertainty in search algorithms, potentially improving planning efficiency in AI systems.
- Monte Carlo Tree Search
- $\varepsilon$-Good Action Identification
TOOL · arXiv cs.CV · 1d

GeoQuery: Geometry-Query Diffusion for Sparse-View Reconstruction

Researchers have developed GeoQuery, a new framework for 3D reconstruction that improves accuracy in sparse-view scenarios. This method integrates generative diffusion models with explicit geometric cues, overcoming limitations of previous approaches that struggled with corrupted features in novel views. GeoQuery utilizes a novel Geometry-guided Cross-view Attention mechanism to create geometry-aligned proxy queries, enabling more robust reconstruction even with extreme view sparsity. AI

IMPACT Improves 3D reconstruction quality in sparse-view settings, potentially benefiting applications like photogrammetry and virtual reality.
- GeoQuery
- 3D Gaussian Splatting
TOOL · arXiv cs.CV · 1d

SEMIR: Semantic Minor-Induced Representation Learning on Graphs for Visual Segmentation

Researchers have developed SEMIR, a novel representation framework designed to improve the segmentation of small and sparse structures in large-scale images. This method decouples inference from the native image grid by learning a task-adapted, topology-preserving latent graph representation. SEMIR transforms the grid graph into a compact graph minor, enabling efficient region-level inference via graph neural networks and yielding consistent improvements in minority-structure segmentation on tumor datasets. AI

IMPACT Introduces a new method for improving image segmentation accuracy, particularly for challenging small and sparse structures.
- SEMIR
- BraTS 2021
- KiTS23
- LiTS
TOOL · arXiv cs.AI · 1d

Classifier Context Rot: Monitor Performance Degrades with Context Length

A new paper reveals that leading AI models like Opus 4.6, GPT 5.4, and Gemini 3.1 exhibit significant performance degradation when classifying long transcripts, a crucial task for monitoring coding agents. These models miss subtly dangerous actions much more frequently in transcripts exceeding 800,000 tokens compared to shorter ones. While prompting techniques can partially mitigate this issue, further post-training improvements are likely necessary to ensure reliable monitoring in long-context scenarios. AI

IMPACT Leading AI models miss dangerous actions in long contexts, so their safety-monitoring capability may be overestimated; new training or prompting strategies may be required.
- Opus 4.6
- GPT 5.4
- Gemini 3.1
- arXiv
TOOL · arXiv cs.CL · 1d

Pretraining Exposure Explains Popularity Judgments in Large Language Models

Researchers have analyzed how large language models (LLMs) develop preferences for well-known entities, a phenomenon often linked to popularity bias. Using the open OLMo models and their complete Dolma pretraining corpus, they calculated entity exposure across 7.4 trillion tokens. Their findings indicate that LLM popularity judgments align more closely with pretraining exposure than with external signals like Wikipedia pageviews, especially for larger models and in the long tail of less popular entities. This suggests that data exposure during pretraining is the primary driver of popularity bias in LLMs. AI

IMPACT Shows that LLM popularity judgments track pretraining data exposure more closely than external popularity metrics.
TOOL · arXiv cs.CV · 1d

Fast Image Super-Resolution via Consistency Rectified Flow

Researchers have developed FlowSR, a new method for image super-resolution that significantly speeds up the process using diffusion models. This approach reformulates super-resolution as a rectified flow from low-resolution to high-resolution images, enabling high-quality results in a single step. FlowSR incorporates HR regularization for precise convergence to the ground truth and a fast-slow scheduling strategy to balance efficiency with fine-grained texture detail. AI

IMPACT Accelerates image super-resolution tasks by enabling high-quality results in a single step using diffusion models.
- FlowSR
- diffusion models
TOOL · arXiv cs.AI · 1d

Agent-Based Post-Hoc Correction of Agricultural Yield Forecasts

Researchers have developed a novel agent-based framework to improve agricultural yield forecasts, particularly for soft fruit production where detailed data is scarce. This system uses large language model agents to refine existing predictions by incorporating domain knowledge through tools for phase detection, bias learning, and range validation. When tested on strawberry and corn datasets, the agent-based approach significantly reduced prediction errors, with Llama 3.1 8B proving most effective in refining XGBoost models. AI

IMPACT Enhances accuracy in agricultural forecasting by leveraging LLM agents for data-scarce environments.
TOOL · arXiv cs.LG · 1d

MetaColloc: Optimization-Free PDE Solving via Meta-Learned Basis Functions

Researchers have developed MetaColloc, a novel framework for solving partial differential equations (PDEs) using machine learning without requiring equation-specific optimization or data. The system meta-trains a neural network to create a universal dictionary of basis functions, which are then used in a single linear least squares step to solve PDEs. This approach reduces computation time by several orders of magnitude compared to traditional methods, while achieving state-of-the-art accuracy on various smooth and non-linear problems. AI

IMPACT Offers a significant speedup for solving PDEs, potentially accelerating scientific discovery and engineering simulations.
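The "single linear least squares step" can be illustrated with a fixed basis. The sketch below is not MetaColloc itself: it swaps the meta-learned dictionary for plain monomials (an assumption for illustration) and solves the 1D problem u''(x) = -pi^2 sin(pi x) with u(0) = u(1) = 0 by collocation. The boundary weight of 10 is also an arbitrary choice.

```python
import numpy as np

K = 10  # number of basis functions (monomials here, a stand-in for a learned dictionary)

def basis(x):
    """Evaluate the basis functions x**k at the points x."""
    return np.stack([x**k for k in range(K)], axis=1)

def basis_dd(x):
    """Second derivatives of the basis functions: k (k - 1) x**(k - 2)."""
    cols = [k * (k - 1) * x ** max(k - 2, 0) for k in range(K)]
    return np.stack(cols, axis=1)

x_in = np.linspace(0.0, 1.0, 50)   # collocation points for the PDE residual
x_bc = np.array([0.0, 1.0])        # boundary points

# Stack PDE rows and (weighted) boundary rows into one linear system A c = b.
A = np.vstack([basis_dd(x_in), 10.0 * basis(x_bc)])
b = np.concatenate([-np.pi**2 * np.sin(np.pi * x_in), np.zeros(2)])

# One least-squares solve gives the basis coefficients; no iterative training.
coef, *_ = np.linalg.lstsq(A, b, rcond=None)

x_test = np.linspace(0.0, 1.0, 200)
max_err = np.max(np.abs(basis(x_test) @ coef - np.sin(np.pi * x_test)))
print(max_err)
```

The exact solution is sin(pi x), so `max_err` measures how well the fixed dictionary captures it. With a learned dictionary, only the two basis-evaluation functions would change.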
TOOL · arXiv cs.AI · 1d

A Family of Quaternion-Valued Differential Evolution Algorithms for Numerical Function Optimization

Researchers have developed a new family of Quaternion-Valued Differential Evolution (QDE) algorithms designed for numerical function optimization. These algorithms operate directly in quaternion space, leveraging its unique algebraic and geometric properties. Initial results on the BBOB benchmark indicate that these QDE variants converge faster and outperform traditional real-valued Differential Evolution algorithms in optimizing various function classes. AI

IMPACT Introduces novel optimization algorithms that could improve the training of AI models.
TOOL · dev.to — LLM tag · 1d

Query The Quantum

A project developed for the TigerGraph GraphRAG Inference Hackathon demonstrated that GraphRAG significantly reduces token consumption and improves accuracy for complex queries. By constructing a knowledge graph of entities and their relationships, GraphRAG enables more focused retrieval compared to traditional vector-based RAG. Benchmarking against LLM-only and basic RAG pipelines on over 2 million quantum computing research paper abstracts, GraphRAG achieved a 90% accuracy rate, outperforming the other methods. AI

IMPACT GraphRAG's efficiency gains could significantly lower operational costs for LLM applications handling complex, multi-hop queries.
TOOL · arXiv cs.AI · 1d

ProfiliTable: Profiling-Driven Tabular Data Processing via Agentic Workflows

Researchers have introduced ProfiliTable, a new framework designed to improve the automation of tabular data processing tasks. This system utilizes a multi-agent approach that dynamically profiles data to build a comprehensive understanding and refine code generation. ProfiliTable integrates exploration, knowledge-augmented synthesis, and a feedback loop to ensure accurate and robust table transformations, outperforming existing methods on complex, multi-step scenarios. AI

IMPACT Enhances automation for tabular data tasks, potentially improving efficiency in data pipelines.
- ProfiliTable
- LLM
TOOL · arXiv cs.CL · 1d

Context Convergence Improves Answering Inferential Questions

Researchers have developed a new method called "context convergence" to improve how Large Language Models (LLMs) answer inferential questions. This technique focuses on how effectively sentences in a passage can eliminate incorrect answers, a measure that proves more effective than simple cosine similarity for inferential reasoning. Experiments using the TriviaHG dataset and various LLMs demonstrated that passages constructed with higher convergence sentences significantly boost answer accuracy, suggesting that LLMs prioritize information-rich cues presented earlier in the text. AI

IMPACT Introduces a novel metric for passage construction that enhances LLM accuracy on complex inferential reasoning tasks.
TOOL · arXiv cs.AI · 1d

QAP-Router: Tackling Qubit Routing as Dynamic Quadratic Assignment with Reinforcement Learning

Researchers have developed QAP-Router, a novel reinforcement learning approach for quantum compilation that frames qubit routing as a dynamic Quadratic Assignment Problem. This method models quantum gate interactions and hardware topology to optimize routing decisions. Experiments on benchmark circuits demonstrate a significant reduction in CNOT gate counts compared to existing compilers. AI

IMPACT Optimizes quantum circuit compilation, potentially accelerating the development and deployment of quantum computing applications.
TOOL · arXiv cs.LG · 1d

Attacks and Mitigations for Distributed Governance of Agentic AI under Byzantine Adversaries

Researchers have identified significant vulnerabilities in agentic AI governance systems, particularly concerning the potential for a compromised central provider to undermine security. The paper introduces SAGA-BFT, a fully Byzantine-resilient architecture that offers strong protection but at a performance cost. To address this, they also propose SAGA-MON and SAGA-AUD, which use lightweight monitoring or auditing for minimal overhead, and SAGA-HYB, a hybrid approach balancing security and performance. AI

IMPACT Identifies critical security flaws in agentic AI governance, prompting the need for more robust and resilient architectures.
- SAGA
- SAGA-BFT
- SAGA-MON
- SAGA-AUD
- SAGA-HYB
TOOL · arXiv cs.LG · 1d

From Message-Passing to Linearized Graph Sequence Models

Researchers have introduced a new framework called Linearized Graph Sequence Models, which reframes message-passing graph computations from a sequence modeling perspective. This approach aims to simplify architectural choices by decoupling computational processing depth from information propagation depth. The framework has demonstrated improved performance on tasks requiring long-range information processing in graphs, offering a principled method to integrate modern sequence modeling advancements into graph learning. AI

IMPACT Provides a new architectural approach for graph learning, potentially improving performance on tasks involving long-range dependencies.
- Linearized Graph Sequence Models
- arXiv
TOOL · arXiv cs.AI · 1d

A New Technique for AI Explainability using Feature Association Map

Researchers have introduced FAMeX, a novel algorithm designed to enhance the explainability of artificial intelligence systems. This new technique utilizes a graph-theoretic approach called a Feature Association Map (FAM) to model relationships between features. Experiments indicate that FAMeX outperforms existing methods like Permutation Feature Importance (PFI) and SHapley Additive exPlanations (SHAP) in determining feature importance for classification tasks. AI

IMPACT Enhances trust in AI systems by providing clearer explanations for model decisions, potentially accelerating adoption in sensitive domains.
TOOL · arXiv cs.AI · 1d

EHR-RAGp: Retrieval-Augmented Prototype-Guided Foundation Model for Electronic Health Records

Researchers have developed EHR-RAGp, a new retrieval-augmented foundation model designed to more effectively utilize historical patient data within Electronic Health Records (EHRs). This model employs a prototype-guided retrieval system to dynamically identify and integrate the most relevant past clinical information, overcoming limitations of existing methods that use fixed windows or uniform aggregation. In evaluations across various clinical prediction tasks, EHR-RAGp demonstrated superior performance compared to current state-of-the-art EHR foundation models and transformer-based approaches. AI

IMPACT Enhances predictive modeling in healthcare by enabling more precise use of historical patient data.
- EHR-RAGp
- Electronic Health Records
TOOL · arXiv cs.CL · 1d

Output Composability of QLoRA PEFT Modules for Plug-and-Play Attribute-Controlled Text Generation

Researchers have explored methods to generalize parameter-efficient fine-tuning (PEFT) techniques beyond single-task applications. Their work investigates training on combined datasets, composing weight matrices of separate PEFT modules, and composing the outputs of these modules during inference. The study found that summing PEFT module outputs was a particularly effective composition method, outperforming or matching other approaches across different large language models and controlled text generation tasks. AI

IMPACT This research could enable more flexible and cost-effective fine-tuning of large language models for multiple attributes simultaneously.
- QLoRA
- PEFT
- LLMs
TOOL · arXiv cs.LG · 1d

Neural-Schwarz Tiling for Geometry-Universal PDE Solving at Scale

Researchers have developed a new framework called NEST (Neural-Schwarz Tiling) for solving partial differential equations (PDEs) across various geometries and scales. Unlike previous methods that trained global operators for specific problem sets, NEST focuses on learning local physical responses on small voxel patches. These local solvers are then composed into global solutions using domain decomposition and Schwarz coupling, enabling generalization to unseen complex 3D domains. AI

IMPACT Introduces a novel approach to PDE solving that enhances generalization and reusability of learned models.
- NEST
- PDE
TOOL · arXiv cs.AI · 1d

BSO: Safety Alignment Is Density Ratio Matching

Researchers have introduced Bregman Safety Optimization (BSO), a novel method for aligning language models for both helpfulness and safety. BSO simplifies existing complex pipelines by reducing safety alignment to a density ratio matching problem, solvable with a single-stage loss function. This approach avoids auxiliary models and recovers existing safety-aware methods as special cases, demonstrating improved safety-helpfulness trade-offs in experiments. AI

IMPACT Simplifies AI safety alignment, potentially leading to more robust and easier-to-train helpful and safe language models.
- Bregman Safety Optimization
- language models
TOOL · arXiv cs.AI · 1d

Manifold Sampling via Entropy Maximization

Researchers have developed a new method called Manifold Sampling via Entropy Maximization (MASEM) to address the challenge of sampling from complex, disconnected feasible sets. This technique uses a resampling scheme to maximize the entropy of the empirical distribution, effectively improving mixing across different components of the feasible set. MASEM demonstrates significant improvements in efficiency and scalability, outperforming existing methods by an order of magnitude in Sinkhorn distance on various benchmarks. AI

IMPACT Introduces a novel sampling technique that could enhance performance in AI applications like Bayesian optimization and robotics.
- MASEM
- Cornelius V. Braun
TOOL · arXiv cs.AI · 1d

Reinforcing VLAs in Task-Agnostic World Models

Researchers have introduced RAW-Dream, a novel approach to adapt Vision-Language-Action (VLA) models for new tasks using reinforcement learning within task-agnostic world models. This method disentangles world model learning from specific task dependencies by leveraging a world model pre-trained on diverse, task-free behaviors and an off-the-shelf Vision-Language Model for reward generation. By relying on generalized physical priors instead of task-specific data, RAW-Dream enables zero-shot adaptation for VLAs, significantly improving scalability and mitigating world model hallucinations through a dual-noise verification mechanism. AI

IMPACT Enables more scalable and efficient adaptation of VLA models to new tasks by relying on generalized physical priors.
TOOL · arXiv cs.AI · 1d

LISA: Cognitive Arbitration for Signal-Free Autonomous Intersection Management

Researchers have developed LISA, a novel framework for signal-free autonomous intersection management that leverages large language models (LLMs) for real-time decision-making. Unlike traditional systems, LISA reasons over declared vehicle intents, considering factors like priority and queue pressure to optimize traffic flow. Evaluations show LISA significantly reduces control delay, waiting times, and queue lengths, while also improving fuel efficiency and intent satisfaction compared to existing methods. AI

IMPACT LLM-driven traffic management could significantly improve urban mobility and reduce vehicle emissions.
TOOL · arXiv cs.CV · 1d

Contrastive Learning under Noisy Temporal Self-Supervision for Colonoscopy Videos

Researchers have developed a novel noise-aware contrastive learning method to improve AI's ability to understand colonoscopy videos. This approach uses the natural temporal flow of procedures to create self-supervised associations, even when those associations might be imperfect. The learned representations have shown strong performance on downstream tasks like polyp retrieval and classification, outperforming existing self-supervised and supervised methods. AI

IMPACT This method could enhance AI's diagnostic capabilities in medical imaging, leading to more accurate polyp detection and characterization.
- AI
- colonoscopy videos
TOOL · arXiv cs.AI · 1d

Set-Aggregated Genome Embeddings for Microbiome Abundance Prediction

Researchers have developed a new method called Set-Aggregated Genome Embeddings (SAGE) to predict microbiome abundance profiles using genomic language models. This approach leverages few-shot learning capabilities to analyze raw DNA sequences and has demonstrated improved generalization on novel genomes compared to traditional bioinformatics methods. The study highlights that community-level latent representations are key to performance and explores the benefits of intermediate transformations and different embedding choices. AI

IMPACT Introduces a novel method for microbiome analysis using genomic language models, potentially improving biological research and diagnostics.
TOOL · arXiv cs.LG · 1d

Autoregressive Learning in Joint KL: Sharp Oracle Bounds and Lower Bounds

Researchers have developed a new theoretical framework for understanding autoregressive learning, focusing on the joint Kullback-Leibler divergence for next-token prediction. Their work establishes matching upper and lower bounds that fully characterize long-horizon error behavior, offering improved rates and optimality justifications. The analysis reveals that the joint KL divergence allows for a horizon-free approximation factor, unlike Hellinger-based methods, and demonstrates an essential information-theoretic lower bound of order $\Omega(H)$. These findings align the log-loss training objective with sequence-level evaluation and approximation metrics, providing a sharp joint-KL oracle theory. AI

IMPACT Provides a theoretical foundation for improving next-token prediction accuracy in autoregressive models.
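One reason log-loss training lines up with sequence-level KL is the standard chain rule: the joint KL divergence equals the sum of expected per-step conditional KL divergences. The small numerical check below uses made-up two-token distributions `P` and `Q`; it illustrates the identity only and does not reproduce the paper's bounds.

```python
import numpy as np

def kl(p, q):
    """KL divergence between two discrete distributions with full support."""
    return float(np.sum(p * np.log(p / q)))

# Joint distributions over two-token sequences (rows: first token, cols: second token).
# These numbers are illustrative, not taken from the paper.
P = np.array([[0.30, 0.20], [0.10, 0.40]])
Q = np.array([[0.25, 0.25], [0.20, 0.30]])

joint_kl = kl(P, Q)

# Step 1: KL between the first-token marginals.
p1, q1 = P.sum(axis=1), Q.sum(axis=1)
step1 = kl(p1, q1)

# Step 2: expected KL between second-token conditionals, averaged under P.
step2 = sum(p1[a] * kl(P[a] / p1[a], Q[a] / q1[a]) for a in range(2))

print(joint_kl, step1 + step2)
```

The two printed values agree up to floating-point error, which is the two-step case of the decomposition over a horizon of H tokens.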
TOOL · arXiv cs.LG · 1d

In-context learning to predict critical transitions in dynamical systems

Researchers have developed a new in-context learning framework called TipPFN to predict critical transitions in dynamical systems. This method uses a prior-data fitted network to identify when a system is approaching an abrupt and potentially irreversible change. TipPFN was trained on synthetic data and demonstrated state-of-the-art early detection capabilities in unseen tipping regimes, sim-to-real examples, and real-world observations, outperforming existing methods that struggle with limited data or extrapolation. AI

IMPACT Introduces a novel AI approach for early detection of abrupt system changes, potentially improving forecasting in fields ranging from climate science to economics.
- TipPFN
- Benjamin Herdeanu
TOOL · arXiv cs.LG · 1d

Approximation of Maximally Monotone Operators: A Graph Convergence Perspective

Researchers have introduced a new framework for approximating maximally monotone operators, which are often discontinuous or set-valued and fall outside traditional approximation methods. This approach utilizes graph convergence, specifically the Painlevé-Kuratowski convergence, to handle these complex operators effectively. The study demonstrates that continuous encoder-decoder architectures can approximate these operators in the sense of local graph convergence, and proposes structure-preserving approximations that maintain maximal monotonicity through resolvent-based parameterizations. AI

IMPACT Introduces a novel mathematical framework for operator learning that could enable new AI architectures for complex, discontinuous functions.
TOOL · arXiv cs.CL · 1d

GKnow: Measuring the Entanglement of Gender Bias and Factual Gender

Researchers have developed GKnow, a new benchmark designed to measure both factual gender knowledge and gender bias in language models. This benchmark aims to disentangle stereotypical outputs from factually gendered ones, which are often conflated in current analyses. Experiments using GKnow revealed that factual gender knowledge and gender bias are deeply intertwined at both the circuit and neuron levels within models, suggesting that simple ablation techniques may be ineffective for debiasing and can even mask a loss of factual gender knowledge. AI

IMPACT Introduces a new evaluation tool to better understand and potentially mitigate gender bias in AI models.
TOOL · arXiv cs.LG · 1d

Targeted Neuron Modulation via Contrastive Pair Search

Researchers have developed a new method called contrastive neuron attribution (CNA) to identify specific neurons in language models that are responsible for refusing harmful requests. This technique requires only forward passes and can pinpoint the critical neurons with high accuracy. Ablating the identified neurons cut refusal rates by more than 50% on a benchmark test while maintaining output quality. The study also found that base models possess similar underlying structures, which alignment fine-tuning transforms into a targeted refusal mechanism. AI

IMPACT Provides a novel method for understanding and controlling AI safety mechanisms, potentially leading to more robust alignment techniques.
TOOL · arXiv cs.CV · 1d

Large-Small Model Collaboration for Farmland Semantic Change Detection

Researchers have developed a new framework for farmland semantic change detection, addressing limitations in existing benchmarks and models. The proposed method combines Fine-grained Difference-aware Mamba (FD-Mamba) with Cross-modal Logical Arbitration (CMLA), pairing a small, task-specific model with a large, frozen vision-language model. This collaboration aims to improve fine-grained monitoring by preserving boundaries, localizing small regions, and suppressing pseudo-changes through textual priors. Experiments on the new HZNU-FCD benchmark and other datasets demonstrate high accuracy and robustness with a relatively small number of trainable parameters. AI

IMPACT Introduces a novel approach to semantic change detection in agriculture, potentially improving land management and monitoring.
- HZNU-FCD
- FD-Mamba
- CMLA
- CLIP
- ChangeCLIP-ViT
- LEVIR-CD
- WHU-CD
TOOL · arXiv cs.CV · 1d

KAN-CL: Per-Knot Importance Regularization for Continual Learning with Kolmogorov-Arnold Networks

Researchers have introduced KAN-CL, a new framework for continual learning that addresses catastrophic forgetting by leveraging the unique structure of Kolmogorov-Arnold Networks (KANs). This method applies importance-weighted regularization at a per-knot level, allowing for more precise control over parameter updates across tasks. When tested on classification tasks, KAN-CL significantly reduced forgetting compared to baseline methods while maintaining high accuracy. AI

IMPACT Introduces a novel regularization technique for continual learning that significantly reduces catastrophic forgetting in neural networks.
TOOL · arXiv cs.CV · 1d

From Model Uncertainty to Human Attention: Localization-Aware Visual Cues for Scalable Annotation Review

Researchers have developed a new method to improve the quality and efficiency of data annotation for machine learning models. Their approach visualizes spatial uncertainty in model predictions, guiding human annotators to focus on areas where the model is most likely to make localization errors. A study with 120 participants showed that this uncertainty cueing produced higher label quality and faster overall annotation by directing annotator effort to those areas. AI

IMPACT Improves efficiency and quality of data labeling, a critical bottleneck for ML model development.
- Moussa Kassem Sbeyti
- arXiv
TOOL · arXiv cs.LG · 1d

Hypernetworks for Dynamic Feature Selection

Researchers have developed a new machine learning framework called Hyper-DFS for dynamic feature selection, which aims to optimize feature acquisition under budget constraints. This approach utilizes a hypernetwork to generate classifier parameters on demand for specific feature subsets, improving efficiency and generalization. Benchmarks indicate that Hyper-DFS outperforms existing state-of-the-art methods on various datasets, including tabular and image data, and demonstrates superior zero-shot generalization capabilities. AI

IMPACT Introduces a novel framework that improves efficiency and generalization in dynamic feature selection tasks.
- Hyper-DFS
- Javier Fumanal-Idocin
TOOL · arXiv cs.AI · 1d

NARA: Anchor-Conditioned Relation-Aware Contextualization of Heterogeneous Geoentities

Researchers have introduced NARA, a novel self-supervised learning framework designed to create contextualized representations for vector geospatial data. Unlike previous methods that focused on specific data types or limited spatial relations, NARA unifies the modeling of semantics, geometry, and spatial relationships. This approach allows for a more comprehensive understanding of heterogeneous geoentities, including points, polylines, and polygons, by capturing relational structures beyond simple proximity. The framework has demonstrated improved performance in tasks such as building function classification, traffic speed prediction, and point-of-interest recommendation. AI

IMPACT Introduces a new method for processing and understanding complex geospatial data, potentially improving AI applications in areas like urban planning and navigation.
TOOL · arXiv cs.AI · 1d

Reconnecting Fragmented Citation Networks with Semantic Augmentation

Researchers have developed a new framework to address fragmentation in citation networks by integrating citation topology with large language model-based text similarity. This hybrid approach uses LLMs to identify semantically similar articles and adds these as new edges to the citation graph. Applied to over 600,000 publications, the method effectively reduces fragmentation while maintaining disciplinary coherence and improving cluster detection. AI

IMPACT Enhances the interpretability and completeness of academic knowledge graphs, potentially improving research discovery and citation analysis tools.
TOOL · arXiv cs.AI · 1d

TokenRatio: Principled Token-Level Preference Optimization via Ratio Matching

Researchers have introduced Token-level Bregman Preference Optimization (TBPO), a new method for aligning language models using pairwise preferences. Unlike existing approaches that focus on full sequences, TBPO operates at the token level, modeling preferences for individual next-token actions based on the preceding context. This approach aims to improve alignment quality, training stability, and output diversity compared to current methods. AI

IMPACT Introduces a new principled method for aligning language models at the token level, potentially improving training efficiency and output quality.
- TBPO
- DPO
TOOL · arXiv cs.CV · 1d

CAD-feature enhanced machine learning for manufacturing effort estimation on sheet metal bending parts

Researchers have developed a novel machine learning approach for estimating manufacturing effort in sheet metal bending. This method enhances graph-based learning by integrating manufacturing-specific features, such as bend characteristics and surface roles, into the CAD model's geometric representation. By combining domain knowledge with data-driven insights, the approach aims to improve the accuracy of manufacturability predictions and effort estimations in industrial CAD environments. AI

IMPACT This hybrid approach could lead to more accurate manufacturability assessments and effort estimations in industrial CAD systems.
- Matteo Ballegeer
- arXiv
TOOL · arXiv cs.AI · 1d

Missingness-MDPs: Bridging the Theory of Missing Data and POMDPs

Researchers have introduced a new framework called missingness-MDPs (miss-MDPs) that integrates the theory of missing data into partially observable Markov decision processes (POMDPs). In this novel subclass of POMDPs, the observation function is a missingness function that gives the probability of individual state features being unobserved. The work focuses on computing near-optimal policies for miss-MDPs with unknown missingness functions by learning from trajectory data, offering PAC algorithms that yield epsilon-optimal policies with high probability. AI

IMPACT Introduces a new theoretical framework for handling missing data in sequential decision-making problems, potentially improving AI agents' robustness in real-world scenarios.