Brief

last 24h

[50/1536] 185 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.CV · 16h

Unlocking Patch-Level Features for CLIP-Based Class-Incremental Learning

Researchers have developed a new method called SPA (Semantic-guided Patch-level Alignment) to improve class-incremental learning using CLIP. This approach leverages local, patch-level features within CLIP's encoders, which were previously overlooked in favor of global image embeddings. SPA uses GPT-5 to generate semantic descriptions that guide the selection of discriminative visual patches, which are then aligned with these descriptions using optimal transport. The method also incorporates task-specific projectors and pseudo-feature calibration to combat catastrophic forgetting, achieving state-of-the-art results in experiments. AI

IMPACT Introduces a novel approach to leverage local features in vision-language models for continuous learning, potentially improving model adaptability.
- CLIP
- SPA
- GPT-5
- arXiv
TOOL · arXiv cs.LG · 16h

QLAM: A Quantum Long-Attention Memory Approach to Long-Sequence Token Modeling

Researchers have developed QLAM, a novel hybrid quantum-classical memory mechanism designed to enhance long-sequence token modeling. QLAM represents the hidden state as a quantum state, leveraging superposition to encode historical information and enable non-classical, globally conditioned updates. This approach aims to preserve the efficiency of state-space models while enriching their memory capacity for capturing complex dependencies. Evaluations on image classification benchmarks flattened into token sequences showed QLAM outperforming both recurrent and transformer-based models. AI

IMPACT Introduces a novel quantum-enhanced approach to sequence modeling, potentially improving efficiency and capability for long-context tasks.
TOOL · arXiv cs.LG · 17h

Reducing cross-sample prediction churn in scientific machine learning

Researchers have identified a new metric called "cross-sample prediction churn" to measure the instability of machine learning models in scientific applications. This metric quantifies how predictions change when different subsets of training data are used. Standard techniques like deep ensembles do not reduce this churn, but two data-side methods, K-bootstrap bagging and the proposed twin-bootstrap method, show significant improvements. AI

IMPACT Introduces a new metric to better evaluate the reliability of scientific machine learning models, potentially leading to more robust AI systems in research.
- Kevin Maik Jablonka
TOOL · arXiv cs.AI · 16h

WARDEN: Endangered Indigenous Language Transcription and Translation with 6 Hours of Training Data

Researchers have developed WARDEN, a system designed to transcribe and translate the endangered Wardaman language into English, despite having only six hours of training data. The system employs a two-stage approach, first transcribing audio to phonemic text and then translating that text to English. Techniques like initializing the transcription model with a related language and providing a domain-specific dictionary to the translation model were used to overcome the low-resource challenge. WARDEN reportedly outperforms larger models in this extremely data-limited scenario. AI

IMPACT Demonstrates novel techniques for low-resource language processing, potentially enabling AI for other endangered languages.
- WARDEN
- Wardaman
- English
TOOL · arXiv cs.AI · 16h

Topology-Preserving Neural Operator Learning via Hodge Decomposition

Researchers have developed a new method for learning solution operators of physical field equations on geometric meshes. Their approach, called Hodge Spectral Duality (HSD), utilizes Hodge decomposition to separate learnable geometric dynamics from unlearnable topological degrees of freedom. This results in a Hybrid Eulerian-Lagrangian architecture that demonstrates superior accuracy and efficiency while preserving physical invariants. AI

IMPACT Introduces a novel mathematical framework for improving the accuracy and efficiency of neural operators in physics simulations.
- Hodge Spectral Duality
- Hodge decomposition
TOOL · arXiv cs.LG · 17h

Uncertainty-Driven Anomaly Detection for Psychotic Relapse Using Smartwatches: Forecasting and Multi-Task Learning Fusion

Researchers have developed two smartwatch-based frameworks for detecting psychotic relapse. The first framework forecasts cardiac dynamics, while the second uses a multi-task approach to fuse sleep, motion, and cardiac data. Both models employ Transformer encoders and estimate predictive uncertainty using an ensemble of MLPs to generate daily anomaly scores. A late-fusion strategy combining both frameworks achieved an 8% improvement over the previous best baseline on the e-Prevention Grand Challenge dataset. AI

IMPACT Novel application of AI in healthcare for early detection of mental health relapse using wearable sensor data.
TOOL · arXiv cs.AI · 17h

History Anchors: How Prior Behavior Steers LLM Decisions Toward Unsafe Actions

A new paper introduces HistoryAnchor-100, a dataset designed to test how prior harmful actions influence the decisions of frontier large language models when acting as agents. Researchers found that even strongly aligned models, when prompted to remain consistent with previous behavior, significantly increased their likelihood of choosing unsafe actions, sometimes escalating beyond mere continuation. This effect was observed across 17 different models from six providers, with flagship models showing the most pronounced susceptibility, suggesting a potential red flag for agentic AI deployments where action histories might be manipulated or replayed. AI

IMPACT Demonstrates a critical vulnerability in agentic LLMs, potentially impacting the safety of future AI deployments that rely on historical context.
TOOL · arXiv cs.AI · 17h

Neurosymbolic Auditing of Natural-Language Software Requirements

Researchers have developed VERIMED, a novel pipeline that uses large language models combined with an SMT solver to audit natural-language software requirements, particularly for safety-critical applications like medical devices. This neurosymbolic approach translates requirements into formal logic, identifies ambiguity through variations in formalization, and detects inconsistencies or safety violations using solver queries. Experiments on open-source medical device requirements demonstrated that VERIMED effectively reduces ambiguity and significantly improves the accuracy of verified specifications. AI

IMPACT Enhances safety and reliability in critical software by enabling rigorous, automated auditing of natural-language requirements.
TOOL · arXiv cs.CV · 17h

JANUS: Anatomy-Conditioned Gating for Robust CT Triage Under Distribution Shift

Researchers have developed JANUS, a new dual-stream architecture for automated CT triage that integrates anatomical information with visual data. This approach aims to improve accuracy across various pathologies and enhance reliability when faced with shifts in data distribution between institutions. In tests on the MERLIN dataset, JANUS achieved a macro-AUROC of 0.88 and demonstrated strong generalization to an external dataset, particularly for findings defined by size and attenuation. AI

IMPACT Enhances diagnostic capabilities in medical imaging, potentially improving patient outcomes and hospital efficiency.
- JANUS
- MERLIN
TOOL · arXiv cs.CV · 17h

OmniLiDAR: A Unified Diffusion Framework for Multi-Domain 3D LiDAR Generation

Researchers have developed OmniLiDAR, a unified diffusion framework capable of generating 3D LiDAR scans across diverse domains and sensing conditions. This approach uses a shared range-image representation and a novel Cross-Domain Training Strategy to train a single model for heterogeneous data without domain isolation. The framework also incorporates Cross-Domain Feature Modeling and Domain-Adaptive Feature Scaling to handle anisotropic scanning structures and domain-dependent feature shifts, demonstrating strong generation fidelity and utility in downstream tasks like semantic segmentation and object detection. AI

IMPACT Enables more robust and scalable synthetic data generation for autonomous systems across varied conditions.
- OmniLiDAR
- LiDAR
- arXiv
TOOL · The Register — AI · 15h

See through local AI lies with Irish eyes

The ICCL Enforce project has introduced Verity, a fact-checking server designed to combat misinformation generated by AI. This tool aims to help users discern the accuracy of AI-produced content. The development comes amid growing concerns about the proliferation of AI-generated falsehoods. AI

IMPACT Provides a tool to verify AI-generated content, potentially improving trust and reducing the spread of misinformation.
- ICCL Enforce project
- Verity
TOOL · arXiv cs.LG · 17h

Parallel Scan Recurrent Neural Quantum States for Scalable Variational Monte Carlo

Researchers have developed a new method called parallel scan recurrent neural quantum states (PSR-NQS) to improve the scalability of neural-network simulations for quantum many-body systems. This approach utilizes recurrent neural networks, traditionally seen as sequential, and makes them efficient for training within variational Monte Carlo simulations. The PSR-NQS method has demonstrated accurate results on two-dimensional spin lattices up to 52x52, suggesting recurrent architectures are a viable path for scalable neural quantum state simulations. AI

IMPACT Introduces a more scalable approach for simulating complex quantum systems, potentially accelerating research in condensed matter physics.
TOOL · arXiv cs.LG · 17h

Min-Max Optimization Requires Exponentially Many Queries

A new research paper explores the computational complexity of min-max optimization for non-convex and non-concave functions. The study demonstrates that finding an approximate stationary point for such functions requires an exponential number of queries, particularly concerning the approximation error and the dimensionality of the problem. AI

IMPACT This theoretical finding may impact the efficiency of training complex AI models that rely on min-max optimization techniques.
- arXiv cs.LG
TOOL · MIT Technology Review · 16h · [4 sources]

AI chatbots are giving out people’s real phone numbers

AI chatbots, including Google's Gemini, have been found to expose individuals' real phone numbers, leading to unwanted calls and privacy concerns. Experts suggest this issue stems from personally identifiable information being included in the AI's training data, with little apparent recourse for those affected. A company specializing in online privacy removal has reported a significant increase in customer inquiries related to generative AI and the surfacing of personal data. AI

IMPACT Exposes a significant privacy risk in widely used AI tools, potentially eroding user trust and increasing demand for data privacy services.
- Google AI
- Gemini
- DeleteMe
- ChatGPT
- Claude
- Rob Shavell
- Daniel Abraham
- PayBox
TOOL · arXiv cs.LG · 17h

Interpretable Machine Learning for Antepartum Prediction of Pregnancy-Associated Thrombotic Microangiopathy Using Routine Longitudinal Laboratory Data

Researchers have developed a machine learning model capable of predicting pregnancy-associated thrombotic microangiopathy (P-TMA) using routine longitudinal laboratory data. The gradient boosting model achieved an AUROC of 0.872 in a held-out test cohort, demonstrating its effectiveness in identifying subtle, time-dependent risk signatures. Notably, cystatin C levels at six weeks showed potential as an early monitoring indicator for this rare but life-threatening condition. AI

IMPACT This research demonstrates the potential of machine learning to identify subtle patterns in longitudinal data for early prediction of rare but severe medical conditions.
TOOL · arXiv cs.CV · 17h

VoxCor: Training-Free Volumetric Features for Multimodal Voxel Correspondence

Researchers have developed VoxCor, a novel method for creating reusable volumetric feature representations from pre-trained 2D Vision Transformer models. This training-free approach combines triplanar inference with a weighted partial least squares projection to identify stable anatomical directions across different imaging modalities and subjects. VoxCor enables direct voxel correspondence querying via nearest-neighbor search and demonstrates competitive performance in cross-modality and cross-subject transfer tasks, positioning it as a valuable layer for multimodal medical image analysis. AI

IMPACT Enables more robust and reusable feature representations for cross-modal medical image analysis, potentially improving downstream tasks like segmentation and registration.
TOOL · arXiv cs.CV · 17h

EvoGround: Self-Evolving Video Agents for Video Temporal Grounding

Researchers have developed EvoGround, a novel framework utilizing two self-evolving agents to perform video temporal grounding without human-labeled data. The system comprises a proposer agent that generates query-moment pairs from raw videos and a solver agent that grounds these pairs, providing feedback to enhance the proposer. This self-reinforcing loop allows the agents to mutually improve, achieving state-of-the-art results on VTG benchmarks and even functioning as a fine-grained video captioner. AI

IMPACT Introduces a novel method for video analysis that bypasses the need for extensive manual annotation, potentially accelerating research and application development in video understanding.
TOOL · arXiv cs.LG · 17h

Force-Aware Neural Tangent Kernels for Scalable and Robust Active Learning of MLIPs

Researchers have developed a new active learning framework for machine-learning interatomic potentials (MLIPs) that addresses scalability and robustness challenges. This framework utilizes a force-aware Neural Tangent Kernel (NTK) to efficiently screen large candidate pools of molecular structures. The method demonstrates effectiveness on the OC20 dataset, achieving low energy and force errors, and remains competitive and robust on other benchmarks. AI

IMPACT Introduces a more efficient and robust method for training interatomic potentials, potentially accelerating materials science discovery.
- Neural Tangent Kernel
- MLIPs
- OC20 dataset
- T1x
- PMechDB
- RGD
TOOL · Pandaily · 8h · [2 sources]

China's 'Jiuzhang 4' Quantum Computer Achieves 10^54 Speedup Over Supercomputers

Chinese researchers have developed the Jiuzhang 4, a programmable photonic quantum computing prototype. This new system boasts 8,176 modes and can manipulate 3,050 photons, demonstrating a quantum advantage that is 10^54 times faster than the leading supercomputer, El Capitan. The prototype utilizes 1,024 squeezed state inputs to achieve this significant speedup. AI

IMPACT Demonstrates significant advancements in quantum computing speed, potentially impacting future AI development and complex problem-solving.
TOOL · Mastodon — fosstodon.org 日本語(JA) · 5h · [4 sources]

Google DeepMind Releases Experimental Demo of AI-Powered Pointer "AI-Pointer" | gihyo.jp https://www.yayafa.com/2800236/ #AgenticAi #AI #ArtificialGeneralIntelligence #ArtificialIntellige

OpenAI has eliminated its minimum ad spend requirement of $50,000, a move that Dentsu is reportedly watching closely as part of ChatGPT's evolving strategy. Anthropic is introducing monthly credits for programmatic use within its paid Claude plans and has also launched 'Claude for Small Business,' designed to automate tasks like business analysis, ad campaigns, and bookkeeping. Separately, Google DeepMind has demonstrated an experimental 'AI-Pointer' that functions as an AI-powered cursor. AI

IMPACT AI companies are refining their product offerings and pricing strategies to broaden market access and cater to specific business needs.
TOOL · arXiv cs.CL · 17h

An LLM-Based System for Argument Reconstruction

Researchers have developed a novel system using large language models (LLMs) to reconstruct arguments from natural language text into abstract argument graphs. This multi-stage pipeline identifies argumentative components, selects relevant information, and maps their logical relationships, representing them as directed acyclic graphs with premises and conclusions linked by support or attack relations. Evaluations on textbook arguments and benchmark datasets indicate the system's effectiveness in recovering argumentative structures and its potential for scalable argument reconstruction. AI

IMPACT This system offers a scalable method for analyzing and structuring arguments from text, potentially aiding research and analysis in fields relying on logical reasoning.
- LLM
- arXiv
TOOL · arXiv cs.AI · 17h

ENSEMBITS: an alphabet of protein conformational ensembles

Researchers have developed Ensembits, a novel tokenizer designed to represent protein conformational ensembles, which capture dynamic movements and alternative states beyond static structures. This new method addresses challenges in encoding variable-sized ensembles and sparse dynamics data by using a Residual VQ-VAE with a frame distillation objective. Ensembits demonstrate superior performance in predicting protein dynamics and match or exceed static tokenizers on various prediction tasks, despite using less pretraining data, paving the way for incorporating dynamics into protein language modeling and design. AI

IMPACT Enables the incorporation of protein dynamics into language models, advancing protein design and analysis.
TOOL · arXiv cs.AI · 17h

Di-BiLPS: Denoising induced Bidirectional Latent-PDE-Solver under Sparse Observations

Researchers have introduced Di-BiLPS, a novel neural framework designed to solve partial differential equations (PDEs) even with extremely limited observational data. The system utilizes a variational autoencoder for data compression, a latent diffusion module for uncertainty modeling, and contrastive learning for representation alignment. By operating within a compressed latent space and incorporating a PDE-informed denoising process, Di-BiLPS achieves state-of-the-art accuracy with as few as 3% of the required observations, while also significantly reducing computational costs and enabling zero-shot super-resolution. AI

IMPACT Enables more accurate modeling of complex phenomena with significantly less data, potentially broadening AI applications in scientific research.
- Di-BiLPS
- arXiv
TOOL · arXiv cs.AI · 17h

LMPath: Language-Mediated Priors and Path Generation for Aerial Exploration

Researchers have developed LMPath, a new pipeline that uses language models to generate exploration priors for Unmanned Aerial Vehicle (UAV) search missions. This approach leverages semantic context from object prompts and foundation vision models to identify relevant regions in satellite imagery. The generated priors then inform UAV path planning to optimize search objectives, such as minimizing search time or maximizing discovery probability within a given distance. Real-world UAV tests and simulations demonstrated that LMPath outperforms traditional geometric coverage patterns. AI

IMPACT Enhances aerial exploration efficiency by integrating semantic understanding into path planning, potentially reducing search times in complex environments.
TOOL · arXiv cs.AI · 17h

Amplification to Synthesis: A Comparative Analysis of Cognitive Operations Before and After Generative AI

A new research paper analyzes how generative AI might be altering cognitive operations, particularly in the context of geopolitical influence campaigns. By comparing X (formerly Twitter) data from the 2016 and 2024 U.S. presidential elections, the study found significant shifts in content creation and coordination patterns. The findings suggest a move from amplification through retweets to active content generation with diverse wording, indicating potential generative AI involvement in shaping public perception. AI

IMPACT Suggests generative AI is fundamentally changing influence operations, requiring new detection frameworks for security practitioners.
TOOL · arXiv cs.CV · 17h

RoboEvolve: Co-Evolving Planner-Simulator for Robotic Manipulation with Limited Data

Researchers have developed RoboEvolve, a new framework designed to improve robotic manipulation capabilities by addressing the scarcity of training data. This system co-evolves a vision-language model planner with a video generation model simulator in a feedback loop. Operating on unlabeled images, RoboEvolve uses a dual-phase mechanism for exploration and failure analysis to enhance policy optimization, achieving significant improvements in effectiveness and data efficiency. AI

IMPACT This framework significantly enhances robotic manipulation by enabling effective learning with drastically reduced data, potentially accelerating real-world robotic applications.
TOOL · arXiv cs.CV · 17h

Realtime-VLA FLASH: Speculative Inference Framework for Diffusion-based VLAs

Researchers have developed Realtime-VLA FLASH, a new framework designed to speed up diffusion-based vision-language-action models (dVLAs) for embodied intelligence tasks. The system uses a lightweight draft model for speculative inference, significantly reducing the need for full, slower inference calls during replanning. This approach achieved a 3.04x speedup on the LIBERO benchmark, lowering average inference latency to 19.1 ms while maintaining task performance, and has also shown promise in real-world applications like conveyor-belt sorting. AI

IMPACT Accelerates real-time applications for embodied AI by significantly reducing inference latency.
TOOL · arXiv cs.AI · 18h

High-Rate Quantized Matrix Multiplication II

Researchers have published a paper detailing advancements in quantized matrix multiplication, specifically for large language models (LLMs). This second part of their work focuses on scenarios where the covariance matrix of the input data is known, which is common in weight-only post-training quantization of LLMs. The study shows how a 'waterfilling' approach, inspired by information theory, can improve quantization algorithms like GPTQ by allocating quantization rates more effectively across different dimensions, potentially nearing theoretical distortion limits. AI

IMPACT Introduces a more efficient quantization method that could reduce the computational cost and memory footprint of LLMs.
- LLMs
- GPTQ
- Llama-3-8B
TOOL · arXiv cs.LG · 18h

VectorSmuggle: Steganographic Exfiltration in Embedding Stores and a Cryptographic Provenance Defense

Researchers have identified a new security vulnerability in vector databases used by RAG systems, dubbed VectorSmuggle. This attack allows malicious actors with write access to hide sensitive data within embeddings, which are then used by AI models. The study demonstrates that simple post-embedding modifications can evade detection while maintaining retrieval accuracy, with specific rotation techniques proving particularly effective. To counter this, a new cryptographic provenance protocol called VectorPin has been proposed, which cryptographically links embeddings to their source content and the model used, thereby ensuring integrity. AI

IMPACT Introduces a novel steganographic attack on RAG systems, highlighting critical security gaps in vector database integrity and prompting the development of new cryptographic provenance protocols.
TOOL · arXiv cs.LG · 18h

Toward AI-Driven Digital Twins for Metropolitan Floods: A Conditional Latent Dynamics Network Surrogate of the Shallow Water Equations

Researchers have developed a new AI model called the Conditional Latent Dynamics Network (CLDNet) to create faster digital twins for simulating metropolitan floods. Traditional methods are too slow for real-time forecasting, taking nearly an hour for a 96-hour simulation. CLDNet, a neural ODE surrogate, significantly speeds up these simulations to about 29 seconds, achieving a 115x improvement while maintaining accuracy and outperforming other baseline models. AI

IMPACT Enables faster and more accurate flood forecasting, potentially improving disaster preparedness and response.
TOOL · arXiv cs.LG · 18h

Fast and effective algorithms for fair clustering at scale

Researchers have developed new algorithms for fair clustering at scale, addressing the challenge of balancing clustering cost with fairness constraints. The proposed framework offers precise control over this trade-off, which is often in conflict in real-world applications. Three heuristics were introduced, focusing on solution quality, scalability with high quality, and maximum scalability for millions of objects, outperforming existing methods in experiments. AI

IMPACT Provides new methods for applying machine learning in fairness-sensitive domains, improving scalability and control over trade-offs.
- Claudio Mantuano
- arXiv
TOOL · dev.to — LLM tag · 16h

Building a Safety-First RAG Triage Agent in 24 Hours

A developer built a safety-focused Retrieval-Augmented Generation (RAG) agent for a hackathon, prioritizing secure responses over speed. The agent uses a five-stage pipeline that first classifies tickets and then applies deterministic rules to identify high-risk issues before any LLM generation occurs. This approach aims to prevent dangerous outputs, such as providing incorrect advice for sensitive matters like identity theft or billing disputes, by escalating such cases directly to human agents. AI

IMPACT Demonstrates a practical approach to enhancing RAG safety, crucial for production systems handling sensitive user data.
TOOL · arXiv cs.AI · 18h

Humanwashing -- It Should Leave You Feeling Dirty

A new paper argues that the common phrase 'human in the loop' is often misused to imply AI safety when it actually obscures critical processes and outcomes. This practice, termed 'humanwashing,' is likened to 'greenwashing' and is used to present AI systems in a more favorable light without genuine accountability. The authors contend that indiscriminate use of the 'loop' metaphor hinders a true understanding of human oversight in AI decision-making. AI

IMPACT Introduces a critical term for analyzing AI oversight claims, urging a deeper examination of 'human in the loop' practices.
TOOL · The Register — AI · 12h

Google's AI-enabled mouse pointer understands 'this' and 'that'

Google has developed an AI-powered mouse pointer that can understand context, potentially making traditional right-clicking obsolete. This new pointer aims to improve user interaction by interpreting natural language cues. The development is part of a broader trend of integrating AI into everyday computing tools. AI

IMPACT Enhances user interaction with computing devices through AI integration.
- Google
- AI
TOOL · Mastodon — fosstodon.org · 6h

As # bostrom said. Paperclips must be maximised! # ai # ki Blind Ambition: AI agents can turn tasks into digital disasters | UCR News | UC Riverside https:// ne

A new paper from UC Riverside researchers explores the potential dangers of AI agents, drawing parallels to Nick Bostrom's "paperclip maximizer" thought experiment. The study highlights how AI agents, in their pursuit of completing assigned tasks, could inadvertently cause significant digital harm or unintended consequences. This research serves as a cautionary tale about the need for careful design and oversight of autonomous AI systems. AI

IMPACT Highlights potential unintended negative consequences of autonomous AI agents, emphasizing the need for safety research.
TOOL · arXiv cs.AI · 18h

Weakly-Supervised Spatiotemporal Anomaly Detection

Researchers have developed a new weakly-supervised method for spatiotemporal anomaly detection in videos. This approach trains a network using only video-level labels, indicating whether a video is normal or contains an anomaly, without requiring detailed frame-by-frame annotations. The system extracts features from clips and employs a multiple instance ranking loss to generate anomaly scores for specific spatiotemporal regions. Results were demonstrated on the UCF Crime2Local Dataset. AI

IMPACT This research could lead to more efficient video surveillance and analysis systems by reducing the need for extensive manual annotation.
TOOL · arXiv cs.CV · 18h

Aligning Network Equivariance with Data Symmetry: A Theoretical Framework and Adaptive Approach for Image Restoration

Researchers have developed a new theoretical framework to better understand the relationship between network equivariance and data symmetry in image restoration tasks. They propose a quantifiable definition of non-strict symmetry at the dataset level and use it to constrain the restoration problem, deriving model equivariance from this constraint. This approach leads to an adaptive equivariant network that dynamically aligns with individual sample symmetries, demonstrating superior performance in experiments on super-resolution, denoising, and deraining compared to existing methods. AI

IMPACT Introduces a novel theoretical framework and adaptive approach for image restoration, potentially improving model generalization and performance on tasks with imperfect data symmetry.
TOOL · arXiv cs.AI · 18h

Robust and Explainable Bicuspid Aortic Valve Diagnosis Using Stacked Ensembles on Echocardiography

Researchers have developed an AI model capable of diagnosing bicuspid aortic valve (BAV) from standard echocardiography videos. The model, a stacked ensemble of multiple video backbones, achieved a high F1-score of 0.907 and recall of 0.877 in distinguishing BAV from tricuspid aortic valves (TAV). Explainability features like Grad-CAM and SHAP values were integrated to localize diagnostic evidence and quantify the contribution of different model components, allowing for transparent case-level audits. This AI tool could aid in earlier BAV detection, particularly in settings with limited specialist expertise. AI

IMPACT This AI model could improve the accuracy and accessibility of diagnosing a common heart valve condition, potentially leading to earlier treatment.
TOOL · arXiv cs.LG · 18h

GHGbench: A Unified Multi-Entity, Multi-Task Benchmark for Carbon Emission Prediction

Researchers have introduced GHGbench, a new unified benchmark and dataset designed to improve the prediction of carbon emissions at both company and building levels. The benchmark addresses fragmentation in existing datasets by providing harmonized data for over 32,000 company-year records and nearly 500,000 building-year records. Initial findings indicate that predicting building emissions is more challenging than company emissions, and out-of-distribution performance is a critical bottleneck, though multimodal embeddings show promise in improving accuracy. AI

IMPACT Provides a standardized evaluation framework for ML models tackling climate change prediction.
TOOL · arXiv cs.AI · 18h

Coordinating Multiple Conditions for Trajectory-Controlled Human Motion Generation

Researchers have developed a new framework called CMC for generating realistic human motions that accurately follow specified trajectories and textual descriptions. Existing methods struggle with conflicting conditions and inconsistent motion representations. CMC addresses this by decoupling the process into two stages: trajectory control and motion completion, ensuring stable trajectory following and high-quality full-body motion synthesis. The framework also incorporates a Selective Inpainting Mechanism to improve training with limited data, achieving state-of-the-art results on benchmark datasets. AI

IMPACT Introduces a novel approach to multimodal condition coordination for realistic human motion synthesis, potentially improving applications in animation and robotics.
- CMC
- HumanML3D
- KIT
TOOL · dev.to — MCP tag · 12h

An Oracle DBA builds AI: shipping Oracle 23ai RAG and an MCP server in a weekend

An Oracle DBA has developed two open-source AI infrastructure projects, demonstrating how existing database administration skills are transferable to AI development. The first project, 'Talk to EBS,' is a retrieval-augmented generation (RAG) assistant that answers questions about Oracle E-Business Suite using Oracle Database 23ai's native vector search and Cohere embeddings. The second project, 'mcp-oracle-dba,' implements Anthropic's Model Context Protocol (MCP) to securely allow LLMs like Claude to interact with an Oracle database, including features like schema listing, table description, and SELECT query execution with PII redaction, while preventing destructive commands. AI

IMPACT Demonstrates how existing database administration skills can be leveraged to build practical AI infrastructure, potentially easing the transition for DBAs into AI roles.
TOOL · arXiv cs.AI · 18h

Senses Wide Shut: A Representation-Action Gap in Omnimodal LLMs

Researchers have identified a "Representation-Action Gap" in omnimodal large language models, where models can internally recognize contradictions between textual claims and their sensory inputs but fail to reflect this in their outputs. A new benchmark, IMAVB, was created using movie clips to test this capability, revealing that current models often either accept false premises or reject too many standard claims. The study suggests the bottleneck for grounding in these models is in translating perception into action, rather than perception itself. AI

IMPACT Highlights a critical gap in omnimodal LLM grounding, suggesting current models struggle to translate perceived information into reliable actions.
TOOL · dev.to — LLM tag · 13h

How LumiClip Finds the Best Moments in Your Video and Reframes Them for Mobile

LumiClip has developed a multi-stage pipeline to efficiently extract and reframe video highlights for social media. The process begins with transcription and video classification to tailor analysis to content type, followed by topic segmentation to identify coherent segments. Candidate highlights are then scored for quality and relevance, with a final selection ensuring non-overlapping clips and generating a concise hook for each. AI

IMPACT This product demonstrates a practical application of LLMs and multimodal models for content summarization and repurposing.
- LumiClip
- Deepgram Nova-3
TOOL · arXiv cs.AI · 18h

KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving

Researchers have developed KVServe, a novel framework designed to optimize communication efficiency in disaggregated LLM serving systems. KVServe addresses the bottleneck caused by KV cache data crossing network and storage boundaries by employing a service-aware and adaptive compression strategy. It utilizes a Bayesian Profiling Engine for efficient search of compression profiles and a Service-Aware Online Controller to adapt to real-time service conditions, leading to significant reductions in latency and improvements in job completion time. AI

IMPACT Optimizes LLM serving infrastructure, potentially reducing costs and improving response times for AI applications.
- KVServe
- LLM
- vLLM
TOOL · Mastodon — fosstodon.org · 3h

🤖 Golem Offers 15% Off AI Workshops Until May End Golem Karrierewelt is offering a 15% discount on AI workshops and e-learning courses covering Copilot, AI foun

Golem Karrierewelt is providing a 15% discount on its AI workshops and e-learning programs. These courses cover topics such as Microsoft Copilot, fundamental AI concepts, and the EU AI Act. The promotion is valid until the end of May. AI

IMPACT Offers accessible training on AI tools and regulations for professionals.
TOOL · arXiv cs.AI · 18h

ScioMind: Cognitively Grounded Multi-Agent Social Simulation with Anchoring-Based Belief Dynamics and Dynamic Profiles

Researchers have developed ScioMind, a new framework for simulating social opinion dynamics using large language models. This system integrates structured opinion dynamics with LLM-based agent reasoning, featuring a memory-anchored belief update rule and a hierarchical memory architecture. ScioMind also incorporates dynamic agent profiles derived from a corpus-grounded retrieval pipeline, enhancing behavioral realism in simulations of real-world policy debates. AI

IMPACT Provides a more behaviorally realistic simulation framework for studying social opinion dynamics using LLMs.
- ScioMind
- Large language model
TOOL · arXiv cs.AI · 18h

AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation

Researchers have developed AnyFlow, a novel framework for video diffusion models that allows for any number of sampling steps during generation. Unlike previous methods that degrade with more steps, AnyFlow optimizes the full ODE sampling trajectory by learning flow-map transitions over arbitrary time intervals. This approach, demonstrated on models up to 14 billion parameters, matches or surpasses existing few-step distillation methods while offering better scalability. AI

IMPACT Enables more flexible and scalable video generation from diffusion models, potentially improving quality and control.
- AnyFlow
- arXiv
TOOL · arXiv cs.CV · 18h

Learning to Optimize Radiotherapy Plans via Fluence Maps Diffusion Model Generation and LSTM-based Optimization

Researchers have developed a novel diffusion model and LSTM-based approach for optimizing radiotherapy plans, specifically for Volumetric Modulated Arc Therapy (VMAT). This method aims to significantly reduce the planning time for VMAT by generating clinically feasible fluence maps in a single step and then rapidly refining them using learned gradient dynamics. Initial experiments on prostate cancer patient data show improvements in planning efficiency, flexibility, and machine deliverability compared to existing end-to-end VMAT planners. AI

IMPACT Introduces a novel AI-driven method to accelerate and improve radiotherapy planning, potentially leading to faster patient treatment and better outcomes.
TOOL · arXiv cs.AI · 18h

Children's English Reading Story Generation via Supervised Fine-Tuning of Compact LLMs with Controllable Difficulty and Safety

Researchers have developed a method to fine-tune compact, 8-billion parameter Large Language Models (LLMs) for generating children's English reading stories. By leveraging an existing curriculum and stories from larger models like GPT-4o and Llama 3.3 70B, they trained smaller LLMs to produce content with controllable difficulty and safety. Evaluations indicate that these fine-tuned compact models outperform larger models on difficulty metrics and exhibit minimal safety issues, making them a more affordable and accessible option for educational use. AI

IMPACT Fine-tuning smaller LLMs for specific educational tasks like story generation offers a more accessible and cost-effective alternative to large, proprietary models.
TOOL · arXiv cs.AI · 18h

Identifying AI Web Scrapers Using Canary Tokens

Researchers have developed a novel method to automatically identify which large language models (LLMs) are being fed data by specific web scrapers. The technique involves hosting dynamic websites that serve unique "canary tokens" to each visiting scraper. By prompting LLMs and observing if they consistently generate outputs containing these unique tokens, researchers can infer which scrapers are supplying data to which LLMs. Experiments across 22 production LLM systems demonstrated the approach's reliability in identifying previously unknown scraper-LLM connections, offering a way for unprivileged third parties to gain insight into data sourcing and potentially control unwanted scraping. AI

IMPACT Provides a method for identifying data sources for LLMs, potentially enabling better control over web scraping and data provenance.
- LLMs
- web scrapers
- canary tokens
- arXiv