Brief

last 24h

[50/1790] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.CV English(EN) · 16h

AUCp: Pseudo-AUC for Inference Model Selection with Unlabeled Validation Data in Abnormality Detection

Researchers have introduced AUCp, a new metric designed to improve model selection in abnormality detection tasks, particularly within medical imaging. This metric addresses the challenge of relying on labeled validation data, which is often scarce or time-consuming to acquire for rare diseases. By treating all unannotated test samples as positive and using a traditional AUC calculation, AUCp effectively identifies the optimal model for inference without needing annotated test sets, outperforming conventional metrics in unsupervised and self-supervised learning scenarios. AI

IMPACT Introduces a novel metric to improve model selection in medical abnormality detection, potentially enhancing diagnostic accuracy in resource-limited settings.
- Md Mahfuzur Rahman Siddiquee
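A minimal sketch of the pseudo-AUC idea summarized above, not the authors' implementation. It assumes each candidate model outputs an anomaly score (higher means more abnormal) and that a small set of known-normal samples is available to act as negatives; the summary does not say where the negatives come from, so that choice is an assumption. The function names `auc`, `pseudo_auc`, and `select_model` are hypothetical.

```python
def auc(neg_scores, pos_scores):
    """Rank-based (Mann-Whitney) AUC: P(positive score > negative score), ties count half."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


def pseudo_auc(normal_scores, unlabeled_test_scores):
    """Treat every unlabeled test sample as positive, then compute an ordinary AUC.

    Assumption: `normal_scores` come from samples known to be normal.
    """
    return auc(normal_scores, unlabeled_test_scores)


def select_model(candidates):
    """Pick the candidate with the highest pseudo-AUC.

    `candidates` maps a model name to (normal_scores, unlabeled_test_scores).
    """
    return max(candidates, key=lambda name: pseudo_auc(*candidates[name]))


# Toy scores, invented purely for illustration.
candidates = {
    "model_a": ([0.10, 0.20, 0.15], [0.10, 0.90, 0.20, 0.80]),
    "model_b": ([0.40, 0.50, 0.45], [0.50, 0.40, 0.45, 0.50]),
}
best = select_model(candidates)
```

In this toy case `model_a` pushes part of the unlabeled set well above the normal scores, so it receives the higher pseudo-AUC and is selected.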
TOOL · arXiv cs.CV English(EN) · 16h

G2G: Exploiting Intra-Group Geometry for Inter-Group Pose Estimation

Researchers have developed a new method called G2G to improve the estimation of relative 6-DoF poses between groups of images. This technique leverages existing intra-group geometry information and pretrained multi-view backbones. G2G introduces lightweight trainable modules that fuse information across different groups, achieving state-of-the-art accuracy on various datasets without retraining the entire foundation model. AI

IMPACT Enhances computer vision capabilities for tasks requiring precise spatial understanding between image sets.
- arXiv
TOOL · arXiv cs.LG English(EN) · 16h

Model-Based Learning of Whittle indices

Researchers have developed BLINQ, a novel model-based algorithm designed to learn Whittle indices for Markov Decision Processes. This new approach constructs an empirical estimate of the MDP and then computes the indices, offering a proven convergence guarantee and a bound on learning time. Numerical experiments indicate BLINQ requires fewer samples than existing Q-learning methods for accurate approximations and has a lower overall computational cost. AI
TOOL · arXiv cs.LG English(EN) · 16h

Learning What's Real: Disentangling Signal and Measurement Artifacts in Multi-Sensor Data, with Applications to Astrophysics

Researchers have developed a deep learning framework designed to separate true signals from measurement artifacts in multi-sensor data. This method uses a dual-encoder architecture and a counterfactual generation objective to disentangle intrinsic physical properties from sensor-specific distortions. The framework's effectiveness was demonstrated on astrophysical galaxy images from the DESI Legacy Imaging Survey and the Hyper Suprime-Cam Survey, offering a general approach for scientific and multi-modal self-supervised pretraining. AI

IMPACT Provides a generalizable method for improving data analysis in scientific and multi-modal settings by disentangling true signals from measurement artifacts.
TOOL · arXiv cs.LG English(EN) · 16h

ERBench: A Benchmark and Testsuite for Equation Discovery Algorithms

Researchers have introduced ERBench, a new benchmark and test suite specifically designed to evaluate algorithms for equation discovery. This framework focuses on assessing how well these algorithms can recover known ground-truth formulas, addressing limitations in existing benchmarks that often use small datasets and lack robustness testing. ERBench aims to provide a more rigorous evaluation of symbolic regression algorithms, which are crucial for automating the discovery of scientific models from data. AI

IMPACT Provides a standardized evaluation framework for equation discovery algorithms, potentially accelerating scientific model development.
- symbolic regression
- ERBench
TOOL · arXiv cs.LG English(EN) · 16h

Multi-resolution Enhancement for Full Spectrum Neural Representations

Researchers have developed WIEN-INR, a novel hierarchical implicit neural representation (INR) framework designed to improve the representation of complex scientific data. This multi-scale architecture distributes modeling across different resolution levels, using an enhancement network to recover fine details. The approach aims to enable smaller networks to capture the full spectrum of information, thereby reducing computational and storage costs while maintaining high fidelity. WIEN-INR has demonstrated its effectiveness on diverse experimental measurements, offering a practical solution for broader adoption of neural representations in scientific workflows. AI

IMPACT Enables more efficient and detailed representation of scientific data using neural networks, potentially accelerating research across various scientific domains.
- Yuan Ni
- WIEN-INR
TOOL · arXiv cs.CV English(EN) · 16h

REACT 2026: The Fourth Multiple Appropriate Facial Reaction Generation Challenge: Personalised MAFRG and Appropriate EEG Reaction Prediction

The REACT 2026 challenge focuses on generating multiple appropriate facial reactions in response to speaker behavior, building on previous iterations. This year's challenge introduces personalization by incorporating individual-level personality labels and EEG recordings, moving towards a one-to-many personalized facial reaction generation setting. New baselines and guidelines are provided for both offline and online generic and personalized MAFRG sub-challenges. AI

IMPACT Introduces novel personalized facial reaction generation by integrating personality and neurophysiological data.
- REACT 2026
- MAFRG
- MARS dataset
- EEG
TOOL · arXiv cs.LG English(EN) · 16h

CTS-Bench: Benchmarking Graph Coarsening Trade-offs for GNNs in Clock Tree Synthesis

Researchers have introduced CTS-Bench, a new benchmark suite designed to evaluate the trade-offs between graph coarsening techniques and the accuracy of Graph Neural Networks (GNNs) for Clock Tree Synthesis (CTS) in electronic design automation. The benchmark includes 4,860 converged physical design solutions, allowing for systematic analysis of how graph coarsening impacts prediction accuracy and computational efficiency. While coarsening can significantly reduce memory usage and training time, the study found it often removes critical structural information, leading to poor performance on CTS-specific tasks like clock skew prediction. AI

IMPACT Provides a standardized method to assess GNN performance for chip design tasks, potentially guiding future development of more efficient AI models in EDA.
TOOL · arXiv cs.LG English(EN) · 16h

Multi-View Speech Representation Learning for Parkinson's Disease Detection Using Context-guided Cross-modal Attention

Researchers have developed a novel multi-branch deep learning framework designed to improve the detection of Parkinson's disease through speech analysis. This approach utilizes three distinct speech representations: Log-Mel spectrograms, MFCCs, and HuBERT embeddings, each processed by specialized neural networks. A key innovation is a context-guided cross-modal attention mechanism that dynamically integrates these diverse features, leading to enhanced accuracy in identifying the disease. AI

IMPACT This research demonstrates a novel approach to using AI for early disease detection, potentially improving diagnostic accuracy and patient outcomes.
TOOL · arXiv cs.CV English(EN) · 16h

DAL-PCQA: Enabling Distortion-Level and Language-Driven Reasoning for Point Cloud Quality Assessment

Researchers have introduced DAL-PCQA, a new dataset designed to improve point cloud quality assessment by incorporating distortion-level and language-driven reasoning. Unlike previous methods that provide only a single score, DAL-PCQA includes multi-level distortion severity labels, quality categories, and natural language descriptions of artifacts. This dataset aims to enable more interpretable and explainable quality assessment by aligning with how humans perceive and describe point cloud degradations. AI

IMPACT Enables more interpretable AI models for assessing visual data quality.
- DAL-PCQA
- Swarna Chakraborty
TOOL · arXiv cs.LG English(EN) · 16h

Medial Axis Aware Learning of Signed Distance Functions

Researchers have developed a new variational method for accurately computing signed distance functions (SDFs) from point clouds. This approach explicitly incorporates the medial axis, which is the jump set of the SDF's gradient, by using a higher-order variational formulation. The method employs a phase field approximation to implicitly describe the medial axis and uses neural networks to approximate both the SDF and the phase field, demonstrating improved accuracy compared to existing methods. AI
- Christoph Norden-Smoch
TOOL · arXiv cs.LG English(EN) · 16h

Improving User Experience with Personalized Review Ranking and Summarization

Researchers have developed a new framework to improve the user experience of online shopping by personalizing review ranking and summarization. This system integrates user preference modeling, sentiment analysis, and Large Language Models (LLMs) to tailor review content to individual needs. By analyzing historical reviews and user-selected product aspects, the framework ranks and summarizes reviews to reduce information overload and enhance decision-making confidence. Evaluations showed this personalized approach significantly outperformed traditional ranking methods and improved user satisfaction and efficiency. AI

IMPACT Enhances e-commerce decision-making by personalizing review content and reducing information overload.
- Amazon
- Large Language Model
TOOL · arXiv cs.LG English(EN) · 16h

Beyond Neural Collapse: Task-Intrinsic Geometry Governs Neural Representations in Modular Arithmetic

Researchers have developed a new framework to understand neural network representations in modular arithmetic tasks. Their work refines the explanation for why these networks adopt a two-dimensional cyclic geometry, deviating from the predicted neural collapse phenomenon. The study details a layerwise training mechanism where classifier weights form a rank-2 configuration before embeddings align, and explains this cyclic solution's advantage over standard neural collapse under certain conditions. AI

IMPACT Provides a theoretical framework for understanding neural network behavior in specific mathematical tasks, potentially guiding future model design.
- Neural Collapse
TOOL · arXiv cs.LG English(EN) · 16h

From A to B to A: Palindromic Zero-Shot Voice Conversion with Non-Parallel Data

Researchers have developed a novel voice conversion framework that uses K-Nearest Neighbors (KNN) retrieval on WavLM representations to align non-parallel speech data. This method constructs synthetic training pairs from non-parallel source and target audio, enabling supervised learning without requiring explicit alignment or parallel corpora. The framework also incorporates a speaker loss to maintain consistent target-speaker identity, demonstrating high naturalness and speaker similarity across multiple languages, even when trained solely on English data. AI

IMPACT This method could enable more accessible and multilingual voice conversion without requiring parallel datasets.
- K-Nearest Neighbors (KNN)
- WavLM
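A rough sketch of how nearest-neighbour retrieval can turn non-parallel audio into training pairs, under stated assumptions. Frames are plain lists of floats standing in for WavLM features, similarity is cosine, and each source frame is paired with the mean of its k closest target-speaker frames. The pairing rule and the name `knn_pseudo_targets` are illustrative assumptions, not details confirmed by the summary.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def knn_pseudo_targets(source_frames, target_frames, k=4):
    """Pair each source frame with the average of its k nearest target frames."""
    pairs = []
    for s in source_frames:
        # Rank all target-speaker frames by similarity to this source frame.
        nearest = sorted(target_frames, key=lambda t: cosine(s, t), reverse=True)[:k]
        # Average the retrieved frames dimension by dimension.
        matched = [sum(col) / len(nearest) for col in zip(*nearest)]
        pairs.append((s, matched))
    return pairs


# Toy 2-D "features", invented for illustration only.
source = [[1.0, 0.0]]
target = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.0]]
pairs = knn_pseudo_targets(source, target, k=2)
```

The resulting (source frame, retrieved target frame) pairs are frame-aligned by construction, which is what lets a supervised objective be used without a parallel corpus.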
TOOL · arXiv cs.LG English(EN) · 16h

Enhancing Strawberry Yield Forecasting with Backcasted IoT Sensor Data and Machine Learning

Researchers have developed an AI-based backcasting approach to generate synthetic IoT sensor data for strawberry yield forecasting. By combining this synthetic data with actual sensor and yield records, they trained models that improved forecasting accuracy. This method addresses data gaps in agricultural settings, enabling more robust data-driven resource management for farmers. AI

IMPACT Enhances agricultural forecasting capabilities by enabling more accurate yield predictions with limited real-world sensor data.
- Georgios Leontidis
- Machine Learning
TOOL · arXiv cs.LG English(EN) · 16h

Conditional Random Ordered Transport Spaces

Researchers have introduced Conditional Random Ordered Transport Spaces (CROTS), a novel framework for evaluating distributional learning. CROTS equips spaces of random probability measures with an ambient Wasserstein metric and a stochastic order, enabling the assessment of mass movement admissibility. This theory provides a mathematical language to describe issues like evidence overreach and distributional shift in machine learning. AI

IMPACT Introduces a new theoretical framework for evaluating distributional learning, potentially improving robustness and understanding of model behavior.
- Conditional Random Ordered Transport Spaces
- Wasserstein
TOOL · arXiv cs.CV English(EN) · 16h

RAD: A Dataset and Benchmark for Real-Life Anomaly Detection with Robotic Observations

Researchers have introduced RAD, a new dataset and benchmark designed to evaluate anomaly detection capabilities in real-world robotic scenarios. Unlike previous benchmarks, RAD features objects captured from numerous robotic viewpoints under uncontrolled lighting, simulating practical deployment challenges. The study found that established 2D feature-based methods surprisingly outperformed newer 3D and vision-language models in image-level anomaly detection, though the gap narrowed for precise defect localization. AI

IMPACT Establishes a more realistic benchmark for robotic perception, potentially guiding future research in anomaly detection for real-world applications.
- Kaichen Zhou
TOOL · arXiv cs.LG English(EN) · 16h

Hierarchical Projection for Adaptive Knowledge Transfer

Researchers have introduced Projection Transfer Learning (ProjectionTL), a novel framework designed to improve learning from multiple, heterogeneous data sources when the target dataset is limited. This method uses a hierarchical Bayesian model to adaptively weigh information from different sources, capturing global alignment. It then refines this transfer at the feature level through a posterior-projection step, selecting features that agree locally with the target signal. ProjectionTL aims to mitigate negative transfer and enhance interpretability, showing improved accuracy and stability in simulations and biomedical applications. AI

IMPACT Introduces a principled method for integrating heterogeneous data, potentially improving model robustness and interpretability in high-dimensional settings.
- Projection Transfer Learning
- ProjectionTL
TOOL · arXiv cs.LG English(EN) · 16h

Counterfactual Transport Flows for Offline Conservative Trajectory Refinement

Researchers have introduced a new framework called counterfactual transport flows for offline reinforcement learning. This method aims to improve decision-making policies using only logged historical data, without extrapolating beyond the available information. The approach constructs local preference pairs by finding similar trajectories with higher feedback in latent space, which then guides a conservative refinement process. This allows for a controllable trade-off between maintaining the original behavior and applying stronger improvements, as demonstrated on D4RL benchmarks. AI

IMPACT Introduces a novel method for improving decision-making from historical data, potentially enhancing the efficiency and safety of offline RL applications.
TOOL · arXiv cs.LG English(EN) · 16h

Statistical Decision Theory with Counterfactual Loss

Researchers have developed a new framework for statistical decision theory that incorporates counterfactual loss, addressing limitations in classical approaches that only consider realized outcomes. This new method allows for the evaluation of decision quality against feasible alternatives at an individual level, which is crucial in fields like pretrial bail decisions. The framework demonstrates that counterfactual risk is identifiable under specific conditions, particularly when the loss function is additive in potential outcomes, and can capture both decision accuracy and difficulty, unlike standard losses that only reflect accuracy. AI

IMPACT Introduces a novel theoretical framework for decision-making that could influence AI agent design and evaluation.
- Benedikt Koch
TOOL · arXiv cs.CV English(EN) · 16h

CSFlow: Aligning Flow Matching with Human Contrast Sensitivity

Researchers have developed CSFlow, a novel weighting scheme that aligns the iterative denoising process in flow matching models with human contrast sensitivity. This method accounts for the human visual system's varying sensitivity to different spatial frequencies and the tendency of diffusion models to stabilize coarse image content before fine details. By estimating which frequencies are generated at each reverse flow interval and weighting timesteps accordingly, CSFlow has demonstrated improvements in image generation quality, reducing FID scores and enhancing visual realism. AI

IMPACT Improves realism and quality of generated images by incorporating human visual perception into the generation process.
- CSFlow
TOOL · arXiv cs.LG English(EN) · 16h

Improved Convergence Analysis of Topology Dependence in Decentralized SGD

Researchers have developed a more precise convergence analysis for Decentralized SGD, a key algorithm in decentralized learning. Unlike previous methods that focused solely on the spectral gap of the network topology, this new analysis considers all eigenvalues of the mixing matrix. Experiments confirmed that this refined approach more accurately describes how different network topologies impact the convergence rate of Decentralized SGD, particularly in heterogeneous settings. AI

IMPACT Provides a more accurate theoretical framework for understanding and optimizing decentralized machine learning training.
- Decentralized SGD
- arXiv
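To make the "spectral gap versus full spectrum" distinction concrete, here is a small sketch under simplifying assumptions: a ring topology with uniform 1/3 weights, scalar parameters, local quadratic losses f_i(x) = 0.5 (x - b_i)^2, and exact gradients in place of stochastic ones. It is not the paper's analysis; the helper names are hypothetical.

```python
import math


def ring_mixing_matrix(n):
    """Doubly stochastic ring: each node averages itself and its two neighbours."""
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def ring_eigenvalues(n):
    """Closed-form spectrum of the circulant ring matrix above, sorted descending."""
    return sorted(
        ((1.0 + 2.0 * math.cos(2.0 * math.pi * k / n)) / 3.0 for k in range(n)),
        reverse=True,
    )


def spectral_gap(eigenvalues):
    """One minus the largest eigenvalue magnitude, excluding the leading eigenvalue 1."""
    return 1.0 - max(abs(lam) for lam in eigenvalues[1:])


def run_dsgd(W, b, steps=500, lr=0.1):
    """Decentralized gradient steps: mix with neighbours, then apply the local gradient."""
    n = len(b)
    x = [0.0] * n
    for _ in range(steps):
        mixed = [sum(W[i][j] * x[j] for j in range(n)) for i in range(n)]
        # Gradient of 0.5 * (x_i - b_i)^2 evaluated at the pre-mixing iterate.
        x = [mixed[i] - lr * (x[i] - b[i]) for i in range(n)]
    return x


eigs = ring_eigenvalues(8)
gap = spectral_gap(eigs)
x_final = run_dsgd(ring_mixing_matrix(8), [float(i) for i in range(8)])
```

The spectral gap compresses `eigs` into a single number, while the list itself shows how the remaining eigenvalues are spread out, which is the extra information a full-spectrum analysis can draw on. Because the mixing matrix is doubly stochastic, the network-wide average of `x_final` settles at the mean of the `b` values.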
TOOL · arXiv cs.LG English(EN) · 16h

A Survey on Deep Multi-Task Learning in Connected Autonomous Vehicles

This paper provides a comprehensive review of deep multi-task learning (MTL) techniques applied to connected autonomous vehicles (CAVs). It explores how MTL can enable a single model to handle diverse tasks like perception, prediction, planning, and control, which is crucial for efficient and real-time operation in complex driving scenarios. The survey categorizes existing research based on whether tasks are performed solely by the ego vehicle or enhanced through vehicle-to-everything (V2X) communication, and also examines MTL in the context of V2X communications and radio resource management. The authors identify current research gaps and suggest future directions for advancing MTL in CAV systems. AI

IMPACT Provides a structured overview of multi-task learning applications for autonomous driving systems.
TOOL · arXiv cs.CV English(EN) · 16h

Frequency Decoupled Framework for Screen Content Image Super-Resolution

Researchers have developed a novel framework for screen content image super-resolution (SCISR) that addresses limitations in existing methods by considering frequency characteristics. Their Frequency Decoupled Framework (FDF) separates images into amplitude and phase streams, utilizing specialized modules to capture periodic patterns and global configuration. This approach has demonstrated state-of-the-art performance across multiple datasets and scales. AI

IMPACT This new framework could lead to more accurate and detailed digital content rendering, improving user experience in applications that rely on screen content.
TOOL · arXiv cs.CV English(EN) · 16h

MinNav: Minimalist Navigation Using Optical Flow For Active Tiny Aerial Robots

Researchers have developed MinNav, a novel navigation system for tiny aerial robots that uses optical flow and its uncertainty to navigate complex environments. The system can autonomously fly through scenes with static and dynamic obstacles, as well as unknown gaps, without prior knowledge of the surroundings. MinNav achieves a 70% success rate in real-world experiments and performs comparably to depth-based methods while using significantly less computational power, making it suitable for onboard use on tiny aerial robots. AI

IMPACT This research could enable more autonomous and efficient navigation for small drones in complex, unknown environments.
- tiny aerial robots
- MinNav
TOOL · arXiv cs.CV English(EN) · 16h

Empowering Feed-Forward Reconstruction Models with Metric Scale via Satellite Images

Researchers have developed a new method to resolve the scale ambiguity in feed-forward 3D reconstruction models by utilizing satellite imagery as a global metric reference. This approach integrates satellite patches with reconstruction backbones, enforcing consistency to infer absolute scale, refine geometry, and estimate camera pose in a metric coordinate frame. The framework demonstrated improvements in metric depth estimation, point-cloud reconstruction, and camera localization across multiple datasets. AI

IMPACT Enables metric understanding in 3D reconstruction, crucial for applications requiring precise environmental measurement.
TOOL · arXiv cs.CV English(EN) · 16h

Gravity-guided Contact Dynamics Estimation from 3D Human Motions

Researchers have developed a new method called GraCE to estimate ground contact forces from 3D human motion capture data. This approach utilizes the human's center of gravity and body mass distribution to infer contact dynamics, offering a more scalable solution than traditional force plates or pressure mats. GraCE aims to capture complex pressure distributions and has demonstrated superior performance on datasets like GroundLink and MOYO for ground reaction force and contact pressure prediction, respectively. AI

IMPACT Provides a more scalable method for biomechanical analysis by estimating contact dynamics from motion capture data.
TOOL · arXiv cs.CV English(EN) · 16h

DifferSeg: Towards Diverse Multimodal Binary Segmentation via Differential Perception and Frequency Guidance

Researchers have introduced DifferSeg, a novel framework for multimodal binary segmentation that addresses challenges in aligning complementary features and balancing high- and low-frequency representations. The framework utilizes a differential perception fusion module to adaptively align multimodal features and enhance their complementarity, while a frequency-guided decoder ensures consistency between detailed structures and semantic information. DifferSeg has demonstrated superior performance across numerous datasets and tasks, outperforming 67 existing methods. AI

IMPACT Introduces a new method for multimodal segmentation, potentially improving performance in diverse applications.
- DifferSeg
- arXiv
TOOL · arXiv cs.LG English(EN) · 16h

Zero and Few Shot Load Forecasting with Large Language Models

Researchers have developed a novel approach for load forecasting in data-scarce environments by leveraging a large language model called Chronos. The framework draws on the model's pre-trained knowledge to produce accurate predictions with little or no fine-tuning on specific datasets. Experiments across five real-world datasets demonstrated that Chronos significantly outperforms nine traditional baseline models in both deterministic and probabilistic forecasting, showing substantial reductions in error metrics. AI

IMPACT Demonstrates LLMs' potential for accurate forecasting in data-limited domains, potentially reducing data acquisition costs and improving efficiency.
TOOL · arXiv cs.LG English(EN) · 16h

Stochastic Dimension Implicit Functional Projections for Global Integral Conservation in High-Dimensional PINNs

Researchers have introduced a new framework called Stochastic Dimension Implicit Functional Projection (SDIFP) to address challenges in enforcing integral constraints within high-dimensional neural network solvers for partial differential equations. This method replaces traditional grid-based projection techniques with a global affine correction, determined by scalar coefficients derived from a weighted quadrature rule. SDIFP aims to improve scalability and efficiency, particularly for mesh-free methods like physics-informed neural networks (PINNs), by separating quadrature evaluation from automatic differentiation memory costs and enabling pointwise inference efficiency. AI

IMPACT Introduces a novel method for improving the accuracy and efficiency of neural network solvers in high-dimensional scientific computing tasks.
- Zhangyong Liang
- Stochastic Dimension Implicit Functional Projection
TOOL · arXiv cs.CV English(EN) · 16h

Illumination-Invariant Anomaly Detection for Sub-Canopy UAV Multispectral Point Clouds

Researchers have developed a new framework for anomaly detection in multispectral point clouds captured by drones under dense foliage. This method addresses the challenge of varying illumination conditions, which can obscure targets. It achieves this by first estimating solar angles to distinguish shadows from dark objects and then employing an illumination-consistent sparse representation to separate spectral reflectance from lighting effects. The framework demonstrates improved performance in complex forest environments for tasks like identifying hidden military targets or archaeological sites. AI

IMPACT This framework could improve the accuracy of identifying hidden objects in challenging environments, aiding applications from military surveillance to archaeology.
- multispectral point clouds
- UAV
TOOL · arXiv cs.LG English(EN) · 16h

Energy-Regularized Spatial Masking: A Novel Approach to Enhancing Robustness and Interpretability in Vision Models

Researchers have introduced Energy-Regularized Spatial Masking (ERSM), a new framework designed to improve the robustness and interpretability of vision models. ERSM treats feature selection as a differentiable energy minimization problem, assigning each visual token an energy value based on its importance and spatial coherence. This approach allows models to autonomously find an optimal balance of information density, leading to emergent sparsity and enhanced performance in robustness tests without explicit supervision. AI

IMPACT Enhances vision model interpretability and robustness, potentially leading to more reliable AI systems in critical applications.
- Energy-Regularized Spatial Masking
- Tom Devynck
TOOL · arXiv cs.LG English(EN) · 16h

On the Superlinear Relationship between SGD Noise Covariance and Loss Landscape Curvature

Researchers have uncovered a new relationship between the noise introduced by Stochastic Gradient Descent (SGD) and the curvature of the loss landscape in deep learning models. Their findings indicate that this noise is not directly proportional to the Hessian of the loss, as previously assumed under specific conditions. Instead, the study reveals a more general connection where the SGD noise covariance is related to the expected value of per-sample Hessians, suggesting these two factors approximately commute rather than coincide. AI

IMPACT Provides a more accurate theoretical understanding of SGD noise and its interaction with loss landscape curvature, potentially guiding future optimization algorithm development.
- Stochastic Gradient Descent
- Yikuan Zhang
TOOL · arXiv cs.LG English(EN) · 16h

Compositional Approximation Can Strictly Outperform Superpositional Approximation

A new research paper explores the theoretical limits of function approximation, demonstrating that compositional methods, such as neural networks, can significantly outperform superpositional methods. The study constructs specific examples where the approximation error gap between these two approaches can be arbitrarily large. This work has implications for understanding the fundamental capabilities of different model architectures in machine learning. AI

IMPACT This theoretical work could inform the design of future AI architectures, potentially leading to more efficient and powerful models.
TOOL · arXiv cs.CV English(EN) · 16h

Revisiting Articulated Parts Perception in Robot Manipulation

Researchers have introduced a new representation called Geometric Primary Structure (GPS) for understanding articulated parts in robotic manipulation. This method aims to balance scalability and quality by abstracting the geometric structure of object parts. An efficient VR-based annotation system was used to collect a dataset of 41,000 frames for 234 objects, enabling the training of a generalizable GPS model that achieved a 73% success rate in object manipulation tasks. AI

IMPACT Introduces a novel representation and efficient data collection method that could improve robot dexterity and adaptability in handling objects with movable parts.
- robot manipulation
- Geometric Primary Structure
TOOL · arXiv cs.LG English(EN) · 16h

A Geometric Measure of Linear Separability for Neural Representations

Researchers have developed a new metric called the directional linear separability measure (LSM) to analyze the geometric properties of neural network representations. This measure quantifies how well a target class can be separated from other classes using affine halfspaces, providing a class-wise and asymmetric assessment. LSM is designed to distinguish between changes due to linear reparameterization and those caused by information loss or nonlinear transformations, offering a tool to diagnose class-wise intrusion in deep learning architectures. AI

IMPACT Provides a new quantitative tool for understanding and diagnosing the internal geometry of neural network representations.
- arXiv
TOOL · arXiv cs.CV English(EN) · 16h

Test-Time Scaling in Multimodal Foundation Models: A Comprehensive Survey of Generation and Reasoning

A new survey paper details the emerging field of Test-Time Scaling (TTS) for Multimodal Foundation Models (MFMs). The paper categorizes existing TTS methods into sampling-based, feedback-based, and search-based approaches. It also outlines common applications, benchmarks, and future research directions for enhancing MFM performance in generation and reasoning tasks. AI

IMPACT Provides a structured overview and taxonomy for multimodal AI scaling research, guiding future development.
- Multimodal Foundation Models
- Test-Time Scaling
TOOL · arXiv cs.LG English(EN) · 16h

Normality Calibration in Semi-supervised Graph Anomaly Detection

Researchers have developed a new framework called GraphNC to improve semi-supervised graph anomaly detection. This method calibrates normality by leveraging both labeled and unlabeled data, using a teacher model to guide the process. GraphNC incorporates anomaly score distribution alignment and perturbation-based normality regularization to enhance the accuracy and separability of anomaly scores and node representations. AI
- Hezhe Qiao
- GraphNC
TOOL · arXiv cs.LG English(EN) · 16h

GNSS-FM: A Self-Supervised Foundation Model for Daily GNSS Displacement Time Series

Researchers have developed GNSS-FM, a novel self-supervised foundation model designed for analyzing daily Global Navigation Satellite System (GNSS) displacement time series. This model utilizes a dual-stream input combining displacement and velocity data, pre-trained with a masked latent prediction objective. After pre-training on data from over 17,000 GNSS stations, GNSS-FM demonstrated strong performance when fine-tuned for displacement forecasting and seismic step localization, outperforming existing task-specific baselines. AI

IMPACT This self-supervised approach could enable more widespread use of AI in geophysics by overcoming data labeling limitations.
- wav2vec 2.0
- GNSS-FM
TOOL · arXiv cs.LG English(EN) · 16h

Bulk-boundary decomposition of neural networks

Researchers have introduced the bulk-boundary decomposition, a novel framework for analyzing the training dynamics of deep neural networks. This approach separates the network's Lagrangian into a data-independent bulk term and a data-dependent boundary term. The bulk term characterizes the inherent dynamics influenced by network architecture and activation functions, while the boundary term reflects the stochastic interactions arising from training samples at the input and output layers. This decomposition reveals the local and homogeneous structure within deep networks, leading to the derivation of an energy continuity equation. AI

IMPACT Introduces a new theoretical lens for understanding and potentially optimizing neural network training processes.
- Donghee Lee
TOOL · arXiv cs.LG English(EN) · 16h

Self-Consistent Generative Paths via Admissible Random Variational Transport

Researchers have introduced a new framework for understanding generative models, focusing on the concept of "self-consistent generative paths." This framework defines a path as self-consistent if it represents a random fixed point of admissible local variational transport corrections. The theory yields a metric called the random fixed-point path residual (R-FPR) to quantify the gap between a generated path and its correction, offering a principle for diagnosing and improving various generative models. AI

IMPACT Introduces a theoretical framework for unifying and improving various generative models, potentially impacting future research and development.
TOOL · arXiv cs.CV English(EN) · 16h

Rethinking 3D Shape Generation: Diffusion over Superquadrics

Researchers have developed a new method for generating 3D shapes by diffusing over superquadric parameters instead of dense geometric representations. This approach significantly reduces the dimensionality of the diffusion state, requiring only 7 KB of parameters per shape. The diffusion-over-superquadrics method enables faster generation and improved scalability, and supports advanced capabilities like part-level editing and constraint-based design, while achieving competitive performance on standard benchmarks. AI

IMPACT Enables more efficient and controllable 3D shape generation, potentially impacting fields requiring rapid asset creation.
- Diffusion models
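To make "superquadric parameters" concrete, here is a minimal sketch of the textbook inside-outside function for a single axis-aligned superquadric. It is not taken from the paper: the function name, argument names, and the choice to omit pose (rotation and translation) are assumptions made for illustration, and the paper's actual parameterization may differ.

```python
def superquadric_value(point, scales, exponents):
    """Textbook inside-outside function F for a superquadric centred at the origin.

    F < 1 means the point is inside, F == 1 on the surface, F > 1 outside.
    scales    = (a1, a2, a3): half-extents along x, y, z
    exponents = (e1, e2): shape exponents (1, 1 gives an ellipsoid;
                values near 0 give a box-like shape)
    """
    x, y, z = point
    a1, a2, a3 = scales
    e1, e2 = exponents
    xy_term = (abs(x / a1) ** (2.0 / e2) + abs(y / a2) ** (2.0 / e2)) ** (e2 / e1)
    z_term = abs(z / a3) ** (2.0 / e1)
    return xy_term + z_term


corner = (0.9, 0.9, 0.9)
# The same point is outside a unit sphere but inside a unit box-like shape.
print(superquadric_value(corner, (1.0, 1.0, 1.0), (1.0, 1.0)) > 1.0)
print(superquadric_value(corner, (1.0, 1.0, 1.0), (0.1, 0.1)) < 1.0)
```

Each primitive here needs only three scales and two exponents (plus a pose in practice), which is why a shape described as a set of superquadrics can be far more compact than a dense grid or point cloud.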
TOOL · arXiv cs.LG English(EN) · 16h

Decentralized Online Riemannian Optimization Beyond Hadamard Manifolds

Researchers have developed a decentralized online Riemannian optimization algorithm that works beyond Hadamard manifolds, extending its applicability to spaces with positive curvature. The algorithm incorporates a curvature-aware consensus step that facilitates linear convergence even in these more complex geometric settings. This yields an $O(\sqrt{T})$ regret bound for the decentralized online Riemannian gradient descent method, with similar bounds achieved in a two-point bandit feedback scenario using efficient gradient estimators. AI
- Emre Sahinoglu
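For readers unfamiliar with the term, a square-root regret bound can be read as follows. This is a generic single-agent sketch, not the paper's decentralized definition; the losses $f_t$, iterates $x_t$, feasible set $\mathcal{K}$, and constant $C$ are illustrative symbols that do not appear in the summary.

```latex
\mathrm{Reg}_T \;=\; \sum_{t=1}^{T} f_t(x_t) \;-\; \min_{x \in \mathcal{K}} \sum_{t=1}^{T} f_t(x),
\qquad
\mathrm{Reg}_T \le C\sqrt{T}
\;\Longrightarrow\;
\frac{\mathrm{Reg}_T}{T} \le \frac{C}{\sqrt{T}} \xrightarrow[T \to \infty]{} 0 .
```

In words: the cumulative loss may exceed that of the best fixed decision in hindsight, but the average gap per round shrinks to zero as the number of rounds $T$ grows.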
TOOL · arXiv cs.CV English(EN) · 16h

MB-Loc: Multi-planar Bird's-eye-view Localization in outdoor LiDAR scenes

Researchers have developed MB-Loc, a new framework for multi-planar bird's-eye-view localization in outdoor LiDAR scenes. This method addresses computational inefficiency and viewpoint sensitivity in existing scene coordinate regression techniques. MB-Loc projects LiDAR scans into a 2.5D representation, enabling faster processing with standard 2D CNNs while retaining crucial 3D geometric information. The framework also incorporates a KL-regularized latent bottleneck for spatial uncertainty modeling and 3D spatial augmentations for rotation robustness, outperforming current state-of-the-art methods on the NCLT dataset at real-time inference speeds. AI

IMPACT Enhances autonomous navigation systems by improving the efficiency and robustness of LiDAR localization.
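As a rough illustration of what projecting a LiDAR scan into a 2.5D bird's-eye-view grid can look like, the sketch below builds a single max-height map in plain Python. It is a generic example rather than MB-Loc's multi-planar projection; the function name, the grid ranges, the cell size, and the `empty` fill value are all assumptions.

```python
def bev_height_map(points, x_range, y_range, cell_size, empty=0.0):
    """Rasterize (x, y, z) points into a 2D grid storing the max height per cell.

    The result is a 2.5D image: two spatial axes plus one height value,
    which a standard 2D CNN could consume as a single channel.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    n_cols = int(round((x_max - x_min) / cell_size))
    n_rows = int(round((y_max - y_min) / cell_size))
    grid = [[None] * n_cols for _ in range(n_rows)]
    for x, y, z in points:
        col = int((x - x_min) // cell_size)
        row = int((y - y_min) // cell_size)
        if 0 <= col < n_cols and 0 <= row < n_rows:  # drop out-of-range points
            current = grid[row][col]
            grid[row][col] = z if current is None else max(current, z)
    # Cells that received no points get the fill value.
    return [[empty if h is None else h for h in row] for row in grid]


scan = [(0.5, 0.5, 1.0), (0.6, 0.4, 2.0), (3.5, 2.5, -1.0), (10.0, 10.0, 5.0)]
for row in bev_height_map(scan, (0.0, 4.0), (0.0, 3.0), 1.0):
    print(row)
```

Keeping one height per cell is the simplest way to retain some vertical structure; a multi-planar scheme would presumably add further channels or planes on top of this idea.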
TOOL · arXiv cs.LG English(EN) · 16h

IR-SIM: A Lightweight Skill-Native Simulator for Navigation, Learning, and Benchmarking

Researchers have developed IR-SIM, a new lightweight simulator designed to streamline robotics research, particularly for tasks involving large language models. This simulator allows for the creation and modification of navigation scenarios using simple YAML configuration files and text prompts, making it easier to prototype and develop algorithms. IR-SIM also facilitates automated benchmarking and data generation for robot learning, with capabilities to bridge to higher-fidelity simulators and real-world deployments. AI

IMPACT Simplifies the development and benchmarking of AI-powered robot navigation systems.
- IR-SIM
- large language models
TOOL · arXiv cs.CV English(EN) · 16h

RGB-S: Image-Aligned Tactile Saliency for Robust Dexterous Manipulation

Researchers have developed a new framework called RGB-S that explicitly aligns tactile sensor data with visual information for robotic manipulation. This method projects tactile sensor locations directly onto RGB images, creating saliency maps that account for spatial uncertainty. By integrating these 2D anchors, the system injects physical contact priors into visual models, improving their ability to handle unreliable or occluded visual inputs. Experiments demonstrated a significant improvement in success rates for dexterous manipulation tasks under severe visual occlusion. AI

IMPACT Enhances robotic manipulation capabilities by improving sensor fusion and robustness to visual occlusions.
- Robotic Dexterous Manipulation
TOOL · arXiv cs.LG English(EN) · 16h

Temporal Coverage over Density: Parsimonious Training-Set Design for ML Climate Downscaling

Researchers have developed a new method for training machine learning models to downscale climate data, focusing on how to select training years effectively. Their study, using the CESM2 Large Ensemble, found that training models on years distributed across the entire climate trajectory, rather than on contiguous historical periods, significantly improves their ability to reproduce climate variability. Models trained this way, even with limited data, outperform those trained solely on historical data, which suggests that broad sampling of climate states is more beneficial than temporal continuity when allocating scarce high-resolution simulation resources. AI

IMPACT Optimizes training data selection for climate models, potentially improving accuracy and efficiency in climate impact assessments.
- CESM2 Large Ensemble
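The contrast between the two training-set designs can be sketched in a few lines. This is only an illustration of the idea of temporal coverage versus a contiguous block; the helper names, the even spacing rule, and the example year span and budget are assumptions, not the study's actual sampling scheme.

```python
def contiguous_years(start, count):
    """A dense block of consecutive years, e.g. a historical period."""
    return list(range(start, start + count))


def spread_years(first, last, count):
    """The same number of years, spaced evenly across the whole trajectory."""
    if count < 2:
        return [first]
    step = (last - first) / (count - 1)
    return [first + round(i * step) for i in range(count)]


budget = 6  # hypothetical number of high-resolution years we can afford
print(contiguous_years(1850, budget))    # dense coverage of one climate state
print(spread_years(1850, 2100, budget))  # sparse coverage of many climate states
```

Both designs cost the same number of simulated years; the second simply trades temporal continuity for a wider range of climate states.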
TOOL · arXiv cs.LG English(EN) · 16h

From inverse problems to neural operators: prediction, mechanism, and generalization of data-driven models

A new paper explores the relationship between traditional differential equation models and modern data-driven approaches like neural operators. It argues that many modeling strategies share a common structure, differing primarily in their assumed input-output mappings. The research suggests that only certain models are capable of true mechanism discovery and subsequent generalization, offering insights into their appropriate applications. AI

IMPACT Provides a theoretical framework for understanding and comparing different data-driven modeling approaches in scientific applications.
TOOL · arXiv cs.LG English(EN) · 16h

Solving Inverse Problems with Flow-based Models via Model Predictive Control

Researchers have developed MPC-Flow, a novel framework for solving inverse problems using flow-based generative models. This method employs model predictive control to guide the model's dynamics, making conditional generation more practical. MPC-Flow offers a spectrum of guidance algorithms, some of which bypass the need for backpropagation through the generative model's trajectory. The framework has demonstrated strong performance and scalability on image restoration tasks, including in-painting, deblurring, and super-resolution, even with large-scale models like FLUX.2 on consumer hardware. AI

IMPACT Introduces a more efficient method for conditional generation in flow-based models, potentially improving performance on tasks like image restoration.
TOOL · arXiv cs.CV English(EN) · 16h

DALE-CT: Depth-Aware Foundation Models for Computed Tomography

Researchers have developed DALE-CT, a new family of 2D foundation models for processing computed tomography (CT) data. Built from scratch using a self-supervised learning approach called LeJEPA, DALE-CT incorporates a novel 3D depth-aware pre-training strategy with both automated and human-annotated supervision. This model achieved a Macro AUROC of 0.833 on the CT-RATE dataset for multi-abnormality detection, nearing the performance of state-of-the-art 3D vision-language models with less data and no textual supervision. AI

IMPACT Introduces a novel, data-efficient approach for medical image analysis, potentially improving diagnostic accuracy in CT scans.
- DALE-CT
- LeJEPA
- CT-RATE dataset
- DINOv2
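For context on the reported metric, macro AUROC in a multi-label setting is usually the unweighted mean of one AUROC per abnormality. The sketch below computes it in plain Python by pair counting; the function names and the toy labels and scores are made up for illustration and have no connection to CT-RATE or to the paper's evaluation code.

```python
def auroc(labels, scores):
    """AUROC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auroc(label_rows, score_rows):
    """Unweighted mean of per-class AUROC; rows are samples, columns are classes."""
    n_classes = len(label_rows[0])
    per_class = [
        auroc([row[c] for row in label_rows], [row[c] for row in score_rows])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Toy data: 4 scans, 2 abnormality classes.
toy_labels = [[1, 0], [0, 1], [1, 1], [0, 0]]
toy_scores = [[0.9, 0.2], [0.1, 0.8], [0.8, 0.3], [0.2, 0.4]]
print(macro_auroc(toy_labels, toy_scores))  # 0.875 (class AUROCs 1.0 and 0.75)
```

Because every class contributes equally, a rare abnormality influences the macro score as much as a common one.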