Brief

last 24h

[50/770] 223 sources

Multi-source AI news clustered, deduplicated, and scored 0–100 across authority, cluster strength, headline signal, and time decay.

TOOL · arXiv cs.AI English(EN) · 17h

Graph-to-SFILES: Control structure prediction from process topologies using generative artificial intelligence

Researchers have developed a generative AI model called Graph-to-SFILES to predict control structures for process diagrams. This model utilizes graph neural networks to interpret process topologies, offering an alternative to sequence-based methods. While effective in small-data scenarios, its performance on large datasets still requires further investigation for industrial applications. AI

IMPACT This research could accelerate P&ID development in data-scarce environments, though its industrial applicability needs further study.
TOOL · arXiv cs.LG English(EN) · 17h

Graph Neural Networks for Predicting Solvability of Finite Groups

Researchers have developed a Graph Neural Network (GNN) framework to predict the solvability of finite groups. Each finite group is represented as a graph, such as a Cayley graph, and the GNN is trained to distinguish solvable from non-solvable groups using only structural graph information. The study is a proof of concept exploring whether GNNs can learn abstract algebraic properties from these graph-based representations. AI

IMPACT Demonstrates potential for GNNs to learn abstract algebraic properties, opening new avenues for computational mathematics.
- finite groups
- Graph Neural Network
TOOL · arXiv cs.AI English(EN) · 17h

Pharmacogenomic Knowledge Graph Augmentation for Graph Neural Network-Based Drug-Drug Interaction Prediction

Researchers have developed a method to enhance drug-drug interaction (DDI) prediction using Graph Neural Networks (GNNs) by incorporating pharmacogenomic data. This approach augments molecular structure information with details about drug metabolism pathways, specifically focusing on cytochrome P450 enzymes. The study found that this knowledge graph augmentation significantly improves DDI classification accuracy, particularly for interactions mediated by CYP2C9, though it did not overcome inherent limitations in predicting interactions for entirely new drugs. AI

IMPACT Enhances AI's ability to predict drug interactions by integrating biological pathway data, potentially accelerating drug discovery and safety assessments.
TOOL · arXiv cs.AI English(EN) · 17h

From Statute to Control Flow: Span-Grounded Deontic Trees for Defeasible Scope Parsing

Researchers have introduced NormBench, a new benchmark designed to evaluate how well AI models can understand and parse legal and policy documents, specifically focusing on identifying nested exceptions and counter-exceptions. The benchmark uses Span-Grounded Deontic Trees (SG-DT) to represent rules and their exceptions, allowing for more precise scope parsing. Evaluations of current large language models revealed issues like "Recursion Decay" and an "Auditability Trap," indicating difficulties in handling complex rule structures and exceptions, though SG-DT showed promise in improving performance on these specific challenges. AI

IMPACT Highlights limitations in current LLMs for precise legal and policy interpretation, suggesting a need for improved reasoning and auditability in rule-following agents.
TOOL · arXiv cs.AI English(EN) · 17h

A Systematic Study of Behavioral Cloning for Scientific Data Annotation

Researchers have developed a new framework to study behavioral cloning for scientific data annotation, using synthetic tasks that mimic human strategies like correction and verification. Their experiments show that larger models are more data-efficient and can learn annotation skills hierarchically. The study also found that multi-task pretraining significantly improves fine-tuning for new tasks, and that models internally represent key aspects of the annotation process, including a shared representation for mistakes across different tasks. AI

IMPACT Establishes benchmarks for scaling behavioral cloning to real-world scientific data annotation, potentially accelerating research.
- Core Francisco Park
TOOL · arXiv cs.AI English(EN) · 17h

PLAGUE: Plug-and-play framework for Lifelong Adaptive Generation of Multi-turn Exploits

Researchers have developed PLAGUE, a new framework for creating multi-turn jailbreak attacks against large language models. Modeled on lifelong-learning agents, the framework breaks an attack into three phases: priming, planning, and finishing. PLAGUE improves attack success rates by over 30% on models such as OpenAI's o3 and Anthropic's Claude Opus 4.1, which are known for their resistance to such exploits. AI

IMPACT This research highlights vulnerabilities in advanced LLMs, potentially guiding developers in strengthening safety measures against sophisticated multi-turn exploits.
- o3
- Claude Opus 4.1
- Neeladri Bhuiya
- OpenAI
- Anthropic
- PLAGUE
TOOL · arXiv cs.AI English(EN) · 17h

Improving the Performance and Learning Stability of Parallelizable RNNs Designed for Ultra-Low Power Applications

Researchers have developed new recurrent neural network architectures, the Cumulative Memory Recurrent Unit (CMRU) and its variant αCMRU, to improve performance and learning stability in ultra-low power applications. These models address gradient blocking issues in previous designs by introducing a cumulative update formulation that enhances gradient flow and reduces initialization sensitivity. The CMRU and αCMRU demonstrate competitive or superior performance compared to existing models like LRUs and minGRUs on various benchmarks, particularly for tasks requiring long-range memory retention, while maintaining essential features for analog implementation. AI

IMPACT Introduces more stable and efficient RNNs for edge devices, potentially enabling new low-power AI applications.
TOOL · arXiv cs.AI English(EN) · 17h

EssentialGIN: a new approach for gene essentiality prediction based on graph isomorphism neural networks

Researchers have developed EssentialGIN, a novel approach for predicting essential genes using graph isomorphism neural networks. This method integrates biological data like gene expression and orthology information with network topology to enhance prediction accuracy. Experiments show EssentialGIN outperforms existing centrality-based and machine learning methods, particularly in complex organisms like humans. AI

IMPACT This new method could improve the efficiency of biological research by more accurately identifying candidate genes for further study.
TOOL · arXiv cs.AI English(EN) · 17h

Dynamic Distributed Constraint Optimization and Metareasoning for Continual, Large-Scale Satellite Operations

Researchers have developed a new framework for managing large constellations of Earth-observing satellites, addressing the challenge of scheduling observations for hundreds of spacecraft. The proposed dynamic distributed constraint optimization problem (DCOSP) formulation integrates scheduling and execution, featuring a novel optimality condition and an exact offline algorithm. To manage resource constraints, the framework incorporates metareasoning to control computation expenditure and introduces the dynamic incremental neighborhood stochastic search (D-NSS) algorithm for efficient online problem repair. Simulations show D-NSS outperforms standard baselines in solution quality, computation time, and message volume, laying the groundwork for a significant in-space demonstration of distributed multi-agent AI. AI

IMPACT Enables more efficient and autonomous operations for large satellite constellations, potentially leading to more sophisticated in-space AI demonstrations.
TOOL · arXiv cs.AI English(EN) · 17h

SENTRY: Statistical Reliability Analysis of Vision Transformers Under Soft Errors

Researchers have developed SENTRY, a statistical framework to analyze the reliability of Vision Transformers (ViTs) against soft errors. This method uses finite-population sampling theory to provide formal reliability guarantees, significantly reducing the cost of fault injection campaigns. Evaluations on ViT-Tiny and ViT-Small models revealed that while few bit-flips cause failure, those that do lead to drastic accuracy drops, often localized in normalization layers and specific floating-point bits. AI

IMPACT Provides a cost-effective method to ensure the reliability of vision models in critical applications.
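
The summary above does not give SENTRY's actual estimator, so the following is only a minimal sketch of the general idea behind finite-population sampling for fault injection: the textbook sample-size formula with finite-population correction. The function name `fault_sample_size` and its default margin, confidence multiplier, and failure-rate prior are illustrative assumptions, not values from the paper.

```python
import math


def fault_sample_size(population: int, margin: float = 0.01,
                      z: float = 1.96, p: float = 0.5) -> int:
    """Injections needed to estimate a failure rate within +/- margin.

    population: total number of possible faults (e.g. bit positions x weights).
    margin:     tolerated error on the estimated failure proportion (assumed).
    z:          normal quantile for the confidence level (1.96 ~ 95%, assumed).
    p:          prior guess of the failure proportion; 0.5 is the worst case.
    """
    if population <= 0:
        raise ValueError("population must be positive")
    variance = p * (1.0 - p)
    # Standard sample size with finite-population correction:
    # n = N / (1 + e^2 * (N - 1) / (z^2 * p * (1 - p)))
    n = population / (1.0 + margin ** 2 * (population - 1) / (z ** 2 * variance))
    return min(population, math.ceil(n))


# Hypothetical fault space of one million candidate bit-flips.
needed = fault_sample_size(1_000_000)
print(f"inject {needed} of 1000000 possible faults")
```

Under these assumed settings the required sample stays below the infinite-population bound of z² · p(1 − p) / margin² no matter how large the fault space grows, which is why statistical campaigns are far cheaper than exhaustive injection.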
TOOL · arXiv cs.AI English(EN) · 17h

Graph2Idea: Retrieval-Augmented Scientific Idea Generation with Graph-Structured Contexts

Researchers have developed Graph2Idea, a new framework designed to enhance the generation of scientific research ideas. This system utilizes knowledge graphs to structure retrieved literature, moving beyond the limitations of flat text contexts. By transforming papers into knowledge triples and constructing a target-centered graph, Graph2Idea extracts relevant relational evidence while reducing noise, ultimately guiding LLMs to synthesize more novel, high-quality, and feasible research concepts. AI

IMPACT This framework could improve the efficiency and creativity of scientific research by leveraging structured knowledge graphs for idea generation.
TOOL · arXiv cs.AI English(EN) · 17h

Bidirectional Semantic Complementary Tool Retrieval for Remote Sensing Agents

Researchers have developed a new method for improving how AI agents retrieve specialized tools for processing remote sensing data. The approach addresses the challenge of semantic asymmetry between general user intentions and specific tool documentation. By enhancing queries with functional semantics and enriching tool descriptions with contextual information, the system aims to improve retrieval accuracy for complex tasks. AI

IMPACT Enhances AI agent capabilities in specialized domains like remote sensing, potentially improving efficiency and accuracy.
TOOL · arXiv cs.AI English(EN) · 17h

Few-shot Class-variable Incremental Audio Classification via Prototype Adaptation and Pseudo Class-variable Training

Researchers have introduced a new method for few-shot class-variable incremental audio classification, addressing scenarios where the number of audio classes can both increase and decrease over time. Their approach utilizes a dynamic classifier initialized with a class-variable prototype adaptation network and incorporates a pseudo class-variable training strategy to improve adaptability. Experiments on three public datasets demonstrate that this novel method outperforms existing techniques in average accuracy. AI

IMPACT Introduces a novel approach to handle dynamic class changes in audio classification, potentially improving real-world AI system adaptability.
- Few-shot Class-variable Incremental Audio Classification
- arXiv
TOOL · arXiv cs.AI English(EN) · 17h

Emergence via Phase Transitions: Mechanism Landscapes and Universal Convergence Across Complex Systems

Researchers have proposed a new framework called the Hierarchical Emergence Framework (HEF) to explain how complex systems, from machine learning to biology, converge on similar high-level structures. HEF models emergence as a phase transition in a mechanism landscape, identifying a critical energy threshold that separates competing mechanisms from a single, optimal one. Experiments with transformers trained on modular arithmetic demonstrated a reproducible fingerprint of this transition, with weight norms peaking before generalization and accuracy converging to a consistent value. AI

IMPACT Proposes a theoretical framework for understanding emergent properties in AI systems, potentially guiding future model development.
TOOL · arXiv cs.AI English(EN) · 17h

Query Lens: Interpreting Sparse Key-Value Features with Indirect Effects

Researchers have introduced Query Lens, a new method designed to improve the interpretability of sparse features in AI models. This technique extends existing approaches by analyzing both the input features that activate a specific model component and the output it influences. Query Lens also accounts for indirect effects, where a feature's impact is mediated through other parts of the model, offering a more comprehensive understanding than previous methods. AI

IMPACT Enhances understanding of AI model internals, potentially leading to more reliable and debuggable AI systems.
- Logit Lens
- Query Lens
TOOL · arXiv cs.AI English(EN) · 17h

Considerations for an Integrated Detector Design at FCC-ee: A Human-AI Exploration

A research paper details a collaborative process between a physicist and an AI assistant to design detectors for the Future Circular Collider (FCC-ee). The exploration began with AI-generated detector concepts, which were then refined through dialogue, challenging assumptions and incorporating practical considerations like calibration and stability. The report highlights the potential and limitations of human-AI collaboration in experimental physics design, with the physics capabilities of the proposed designs yet to be fully explored. AI

IMPACT Illustrates a novel human-AI collaboration methodology for complex scientific design challenges.
- AI
TOOL · arXiv cs.AI English(EN) · 17h

Baichuan-M4: A Clinical-Grade Medical Agent System for Continuous Care

Baichuan Intelligence has introduced Baichuan-M4, a medical large model designed for continuous patient care. This system integrates a unified runtime for consistent training and deployment, a core reasoning model trained with reinforcement learning for long-term patient memory and multi-agent coordination, and a clinical tool layer for evidence retrieval and multimodal understanding. Baichuan-M4 demonstrates leading performance across various medical evaluations, including static knowledge, dynamic consultations, and image analysis, while significantly reducing hallucination rates. AI

IMPACT This advanced medical AI system could set new benchmarks for continuous patient care and diagnostic accuracy in healthcare.
TOOL · arXiv cs.AI English(EN) · 17h

Syll: Open-Source Personal Automation with Cross-Surface Execution

Researchers have introduced Syll, an open-source personal automation system designed to operate across various interfaces including APIs, command lines, and graphical user interfaces. Syll allows users to teach agents by direct demonstration, which are then compiled into reusable skills. The system provides multimodal evidence of agent execution, such as logs and checkpoints, for user inspection and control. Syll externalizes memory, skills, and routines as editable local artifacts, aiming to provide a practical foundation for extensible personal automation. AI

IMPACT Provides a foundation for teachable, inspectable personal AI agents across diverse computing interfaces.
TOOL · arXiv cs.AI English(EN) · 17h

Rule-based autocorrection of Piping and Instrumentation Diagrams (P&IDs) on graphs

Researchers have developed a novel rule-based method to automatically detect and correct errors in Piping and Instrumentation Diagrams (P&IDs), which are crucial documents in chemical process engineering. The system represents P&IDs as graphs and applies rule graphs to identify and fix discrepancies, significantly reducing the manual workload associated with reviewing hundreds or thousands of pages. A case study demonstrated the method's reliability and effectiveness, utilizing 33 developed rules and the pyDEXPI Python package for P&ID graph generation. AI

IMPACT Automates a critical, labor-intensive task in chemical engineering, potentially speeding up design and review cycles.
TOOL · arXiv cs.LG English(EN) · 17h

QDSP: An Interpretable Structured Learning Framework for Predicting Death or Cerebral Palsy in Very Low Birth Weight Infants

Researchers have developed QDSP, a novel interpretable structured learning framework designed to predict mortality or cerebral palsy in very low birth weight infants. The framework integrates Quota-guided Subspace Sampling (QSS) and Differentiable-decision-guided Structure Perception (DSP) to model complex clinical interactions and identify key predictors. QDSP demonstrated high accuracy and AUC on a real-world cohort and public datasets, outperforming existing machine learning models and providing clinically relevant insights. AI

IMPACT Provides a more accurate and interpretable tool for high-risk infant prognostication, potentially improving clinical decision-making.
TOOL · arXiv cs.AI English(EN) · 17h

Language-based Trial and Error Falls Behind in the Era of Experience

Researchers have developed a new framework called SCOUT to improve the performance of Large Language Models (LLMs) on non-linguistic tasks. SCOUT decouples exploration from exploitation, using lightweight "scouts" to efficiently gather data from environments. This data is then used to fine-tune LLMs, enabling them to perform better on tasks that previously required extensive and costly trial-and-error. In experiments, SCOUT allowed a Qwen2.5-3B-Instruct model to outperform proprietary models like Gemini-2.5-Pro while consuming fewer computational resources. AI

IMPACT This framework could significantly reduce the computational cost of training LLMs for complex, real-world tasks.
TOOL · arXiv cs.AI English(EN) · 17h

How Small Can You Go? LoRA Fine-Tuning 270M-8B Models for Merchant Information Extraction in Financial Transactions

Researchers explored fine-tuning smaller language models for financial transaction merchant information extraction, aiming to reduce the costs associated with larger models. Their study evaluated 24 variants across four model families, including Gemma, Qwen, Aya, and LLaMA, focusing on accuracy, throughput, and training cost. Findings indicate that models like Qwen 3.5 4B and even the 0.8B version offer competitive performance with significantly fewer parameters and better latency, making them viable alternatives for production deployment. AI

IMPACT Demonstrates that smaller, more efficient models can achieve comparable performance to larger ones for specific tasks, potentially lowering operational costs and increasing accessibility.
- Cohere2
- Databricks
- LLaMA 3.1-8B
- Gemma 3
- Qwen 3.5
- Aya
TOOL · arXiv cs.AI English(EN) · 17h

Decoupling the "What" and "Where" With Polar Coordinate Positional Embeddings

Researchers have developed Polar Coordinate Positional Embeddings (PoPE) to improve Transformer architectures by decoupling content and positional information. PoPE addresses a limitation of existing RoPE embeddings, in which content and position are entangled, potentially hindering performance. PoPE demonstrates superior performance in tasks requiring positional or content-based indexing and shows significant gains in sequence modeling across music, genomics, and natural language, even outperforming methods designed for length extrapolation. AI

IMPACT PoPE could enhance Transformer performance in sequence modeling tasks by improving positional awareness, potentially leading to better language models and other sequence-based AI applications.
TOOL · arXiv cs.AI English(EN) · 17h

CANS: Accelerating Multiuser Collaborative Edge Inference via Cooperative Autodidactic NeuroSurgeon

Researchers have developed a new framework called Cooperative Autodidactic NeuroSurgeon (CANS) to improve the efficiency of collaborative deep neural network inference on mobile edge devices. CANS allows devices to adaptively learn optimal model partitions by sharing feedback during inference, addressing challenges posed by fluctuating network conditions and diverse device capabilities. The framework incorporates a FedLinUCB-DW algorithm for device grouping and leverages offline experience for faster exploration, with theoretical guarantees on its performance. In prototype experiments, CANS cut inference latency by up to 50% compared to non-cooperative methods. AI

IMPACT Optimizes collaborative edge inference, potentially reducing latency and improving user experience for mobile AI applications.
- FedLinUCB-DW
- Cooperative Autodidactic NeuroSurgeon
TOOL · arXiv cs.AI English(EN) · 17h

MatMind: A Structure-Activity Knowledge-Driven Generative Foundation Model for Materials Science

Researchers have introduced MatMind, a novel generative foundation model designed for materials science. This model unifies structure-activity knowledge and physics-informed feedback within a progressive training framework. MatMind demonstrates competitive performance across various tasks, including property prediction and crystal generation, surpassing specialized models in several benchmarks. AI

IMPACT MatMind's unified approach could accelerate discovery and design in materials science by providing a versatile backbone for various tasks.
- arXiv
- MatMind
TOOL · arXiv cs.LG English(EN) · 17h

KITE: A Tri-Modal Transformer Integrating Text, Images, and Knowledge Graphs for Fake News Detection

Researchers have developed KITE, a novel tri-modal framework designed to combat increasingly sophisticated fake news. KITE integrates textual, visual, and knowledge graph representations to detect misinformation more effectively than existing methods. By employing cross-modal attention within a transformer architecture, KITE analyzes the relationships between these modalities and provides confidence scores for interpretability. Evaluations show KITE significantly outperforms unimodal and bimodal approaches, especially when dealing with inconsistencies between text, images, or external facts. AI

IMPACT This new framework could improve the accuracy and interpretability of fake news detection systems, especially against multimodal misinformation.
- Roberta
- Wikidata
TOOL · arXiv cs.AI English(EN) · 17h

The Topological Dual of a Dataset: A Logic-to-Topology Encoding for AlphaGeometry-Style Data

A research paper proposing a novel logic-to-topology encoding for neuro-symbolic AI has been withdrawn by its author. The paper aimed to address scaling bottlenecks in systems like AlphaGeometry by revealing structural invariants in a model's latent space. It introduced the concept of the "topological dual of a dataset" as a method for mechanistic interpretability. AI
- AlphaGeometry
- Anthony Bordg
TOOL · arXiv cs.AI English(EN) · 17h

No Free Lunch for Synthetic Images under Data Scarcity Conditions

A new study published on arXiv examines the effectiveness of synthetic image generation models like VAE, GAN, and DDPM when faced with limited data and privacy concerns. Researchers developed a framework to evaluate fidelity, privacy, and utility, finding that GAN and DDPM are more robust to differential privacy mechanisms than VAE. The findings emphasize the need for multi-dimensional evaluation of generative models, especially when privacy constraints are applied. AI

IMPACT Highlights trade-offs in synthetic data generation, informing model selection for privacy-sensitive applications.
- OCTMNIST
- OrganAMNIST
- Borja Arroyo Galende
- VAE
- GAN
- MNIST
TOOL · arXiv cs.AI English(EN) · 17h

Evaluating Advanced Prompting on Gemini Flash for Multi-Hop Biomedical QA

Researchers evaluated Google's Gemini Flash models on the MedHopQA challenge, which requires multi-hop reasoning in the biomedical domain. By employing an advanced prompt engineering strategy that included role-playing, Chain-of-Thought examples, and specific formatting, they achieved a Concept Level Score of 0.720 with Gemini 2.0 Flash. This sophisticated prompting significantly improved performance compared to a baseline prompt and nearly matched the results of the next-generation Gemini 2.5 Flash, highlighting the crucial role of prompt design in LLM reasoning. AI

IMPACT Demonstrates that sophisticated prompt engineering can unlock advanced reasoning capabilities in efficient LLMs for specialized domains.
TOOL · arXiv cs.AI English(EN) · 17h

Signals Are Not States: Neuro-Symbolic Safeguards for Culturally Aware Classroom AI

Researchers have developed a neuro-symbolic framework called NSCR to address stereotype-prone reasoning in AI systems designed for educational settings. This framework aims to distinguish between observable evidence and culturally biased interpretations, treating unsupported claims as safety risks. NSCR processes multimodal data, including video, audio, and text, to generate typed facts with provenance and cultural context, enabling executable reasoning and policy enforcement. The paper also proposes a benchmark agenda and metrics to evaluate stereotype leakage, evidence faithfulness, and cultural calibration in classroom AI. AI

IMPACT Mitigates stereotype-prone reasoning in educational AI, improving fairness and accuracy in culturally diverse settings.
- Sina Bagheri Nezhad
TOOL · arXiv cs.AI English(EN) · 17h

AI-Integrated Learning Management System for Middle School: A Longitudinal Study of Learning Outcomes Through High School and Beyond

Researchers have developed an AI-integrated Learning Management System (LMS) designed for middle school students to provide timely and targeted support. This system aims to offer formative feedback, recommend practice based on mastery, and alert teachers to persistent struggles, addressing the common issue of students receiving help too late. The platform prioritizes privacy with a data minimization approach and auditable logs, and its effectiveness will be studied longitudinally through high school and beyond to assess its impact on learning trajectories. AI

IMPACT This system could improve educational outcomes by providing personalized, timely support to students, potentially altering long-term learning trajectories.
TOOL · arXiv cs.AI English(EN) · 17h

Web Agents Should Use Typed Actions Instead of Click-Based Browsing

A new position paper proposes a shift from low-level, click-based interactions to typed actions for web agents. This approach, termed 'web verbs,' would expose web operations as typed functions with structured inputs and outputs, enhancing reliability and auditability for long-horizon tasks. The authors argue that this semantic layer is crucial for building trustworthy and scalable agentic web systems. AI

IMPACT This proposal could lead to more reliable and auditable web agents, improving their ability to perform complex, long-horizon tasks.
- Linxi Jiang
- Web Agents
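
To make the "web verbs" idea above concrete, here is a minimal sketch of one web operation exposed as a typed function with structured input and output. Everything in it (`FakeShop`, `AddToCartInput`, `AddToCartOutput`, the audit-log shape) is a hypothetical illustration and not an API from the position paper.

```python
from dataclasses import asdict, dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AddToCartInput:
    """Structured input for the hypothetical add_to_cart verb."""
    product_id: str
    quantity: int


@dataclass(frozen=True)
class AddToCartOutput:
    """Structured output returned to the agent."""
    cart_size: int


class FakeShop:
    """In-memory stand-in for a website backend (illustrative only)."""

    def __init__(self) -> None:
        self.cart: Dict[str, int] = {}
        self.audit_log: List[Tuple[str, dict, dict]] = []

    def add_to_cart(self, args: AddToCartInput) -> AddToCartOutput:
        # Validate the typed input instead of hoping a click landed correctly.
        if args.quantity <= 0:
            raise ValueError("quantity must be positive")
        self.cart[args.product_id] = self.cart.get(args.product_id, 0) + args.quantity
        result = AddToCartOutput(cart_size=sum(self.cart.values()))
        # Record verb name, input, and output so the run can be audited later.
        self.audit_log.append(("add_to_cart", asdict(args), asdict(result)))
        return result


shop = FakeShop()
out = shop.add_to_cart(AddToCartInput(product_id="sku-123", quantity=2))
print(out, shop.audit_log)
```

Compared with a sequence of clicks, the agent's step here is a single call whose arguments and result can be checked and logged, which is the reliability and auditability benefit the summary describes.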
TOOL · arXiv cs.AI English(EN) · 17h

Blockchain Infrastructure for Intelligent Cyber-Physical-Social Systems: Post-Quantum Security, Interoperability, and Trustworthy Data Economies in the Era of Embodied AI

A new tutorial paper explores the integration of blockchain infrastructure with embodied AI systems, focusing on post-quantum security and trustworthy data economies. It highlights the need for crypto-agile architectures to protect data provenance and governance as quantum computing advances threaten current cryptographic primitives. The paper proposes blockchain as a foundational layer for decentralized intelligent environments, offering open-source frameworks for quantum-resistant, interoperable, and data-trustworthy systems. AI

IMPACT Proposes a framework for securing future AI systems against quantum threats, potentially influencing the development of decentralized AI infrastructure.
TOOL · arXiv cs.AI English(EN) · 17h

A large-scale nanocrystal database with aligned synthesis and properties enabling generative inverse design

Researchers have developed an AI-based method for designing nanocrystal syntheses, replacing the traditional trial-and-error approach. They created NanoExtractor, an LLM-enhanced tool that extracts structured synthesis data from literature, achieving high accuracy compared to other models. This data forms the basis of the Nanocrystal Synthesis-Property (NSP) database, which contains nearly 160,000 entries and powers NanoDesigner, an LLM capable of inverse synthesis design. NanoDesigner has successfully proposed viable synthesis routes for known and novel nanocrystals, demonstrating a powerful human-AI collaboration for accelerating materials discovery. AI

IMPACT Enables AI-driven discovery of new materials and synthesis processes, accelerating scientific research.
TOOL · arXiv cs.AI English(EN) · 17h

Deep Active Re-Labeling: Toward Noise-Resilient Annotation Efficiency

Researchers have developed a new framework called Deep Active Re-Labeling (DAR) to improve the efficiency of active learning in machine learning. This method addresses the issue of human annotation errors, which can significantly degrade active learning performance. DAR strategically re-annotates a portion of already labeled data to identify and correct noisy labels, leading to more data-efficient training and a cleaner final annotation dataset. AI

IMPACT This research could lead to more robust and efficient machine learning model training by mitigating the impact of noisy human annotations.
- Deep Active Re-Labeling
- Md Abdullah Al Forhad
TOOL · arXiv cs.AI English(EN) · 17h

Beyond Rational Illusion: Behaviorally Realistic Strategic Classification

Researchers have introduced a new framework called Pro-SF to address strategic classification problems where agents deviate from pure rationality due to psychological biases. This framework, grounded in prospect theory, models agents' strategic manipulations by incorporating mechanisms like asymmetric benefit/cost perception, subjective reference points, and probability distortion. Experiments on synthetic and real-world data demonstrate Pro-SF's effectiveness in bridging machine learning and behavioral economics for more reliable real-world applications. AI

IMPACT Introduces a more behaviorally realistic approach to modeling AI agent interactions, potentially leading to more robust and predictable AI systems in strategic environments.
- Prospect Theory
- Xinpeng Lv
TOOL · arXiv cs.AI English(EN) · 17h

VATS: Exploiting Implicit Authority in Error-Path Injection via Systematic Mutation

Researchers have developed a new framework called VATS to exploit vulnerabilities in how AI models handle tool errors. This method systematically mutates error messages to inject malicious instructions, bypassing standard safety measures. In tests with leading models like Gemini 3.1 Pro and GPT-5.5, this error-path injection technique significantly increased the success rate of prompt injection attacks, reaching up to 100% in some evaluations. While current production safeguards can offer some protection, the underlying susceptibility in the models themselves presents a risk to custom AI agent workflows. AI

IMPACT New attack vector identified that could compromise AI agent security and reliability.
TOOL · arXiv cs.AI English(EN) · 17h

OnlyDense: Reduced-Order Modeling for Lagrangian simulation

Researchers have developed a novel deep learning framework called OnlyDense to model complex Lagrangian simulations, which are often computationally intensive. This method represents the system's state as a function evolving in Hilbert space, using learned neural basis functions to create a linear subspace. This approach unifies classical reduced-order modeling with deep learning, allowing for accurate prediction of dynamics even with a reduced number of basis functions, as demonstrated in large-scale simulations. AI

IMPACT This framework offers a more efficient method for complex scientific simulations, potentially accelerating research in fields requiring Lagrangian dynamics.
TOOL · arXiv cs.AI English(EN) · 17h

DecepGPT: Schema-Driven Deception Detection with Multicultural Datasets and Robust Multimodal Learning

Researchers have developed DecepGPT, a new system designed to detect deception in multimodal data by analyzing audiovisual cues. The system aims to provide auditable reports by incorporating structured reasoning chains and cue-level descriptions. DecepGPT also introduces a large, multicultural dataset called T4-Deception, featuring over 1600 samples from four countries, to improve generalization across different cultural contexts and prevent shortcut learning. AI

IMPACT This research could enhance security and forensic applications by improving the accuracy and auditability of deception detection systems.
TOOL · arXiv cs.AI English(EN) · 17h

FormalASR: End-to-End Spoken Chinese to Formal Text

Researchers have developed FormalASR, an end-to-end system that converts spoken Chinese directly into formal written text. This approach removes the need for a separate LLM post-editing step, reducing latency and computational cost. The system comes in two sizes, 0.6B and 1.7B parameters, both fine-tuned from Qwen3-ASR and trained on two newly created large-scale datasets, WenetSpeech-Formal and Speechio-Formal.

IMPACT Offers a more efficient and direct method for transcribing spoken language into formal text, potentially improving downstream NLP applications.
TOOL · arXiv cs.AI English(EN) · 17h

Developing Distance-Aware Physics-Constrained Probabilistic Frameworks for Industrial Prognostics

Researchers have developed two sampling-free frameworks, PC-SNGP and PC-SNER, designed to make probabilistic models for industrial prognostics more reliable and physically interpretable. The frameworks maintain distance-preserving representations and increase their uncertainty estimates as inputs move away from the training manifold. The methods were validated on rolling-element-bearing prognostics datasets, demonstrating superior prediction accuracy and well-calibrated uncertainty compared to existing approaches, even under adversarial conditions.

IMPACT Enhances AI's ability to predict equipment failure with greater accuracy and reliability, crucial for industrial maintenance.
TOOL · arXiv cs.AI English(EN) · 17h

ePC: Fast and Deep Predictive Coding in Digital Simulation

Researchers have developed a new method called error-based Predictive Coding (ePC) that significantly speeds up neural network training on digital hardware. Traditional Predictive Coding (PC) methods suffer from signal decay in simulations, hindering their effectiveness with deeper networks. ePC reformulates PC to eliminate this decay, allowing it to achieve performance comparable to backpropagation even on complex models, while running orders of magnitude faster.

IMPACT This new training method could enable the development of deeper and more complex neural networks on existing digital hardware.
TOOL · arXiv cs.AI English(EN) · 17h

Position: Anthropomorphic Misalignment Research Needs Stronger Evidence

A new position paper argues that studies of anthropomorphic AI misalignment require more rigorous evidence. The paper highlights issues such as conceptual ambiguity and weak experimental designs that can lead to overinterpretation of AI behaviors. It proposes a framework of evidence levels and a diagnostic checklist to improve methodological standards in this area of AI safety research.

IMPACT Establishes a framework for evaluating AI safety research, potentially influencing how AI risks are assessed and communicated.
- Anthropomorphic Misalignment Research
- arXiv
TOOL · arXiv cs.AI English(EN) · 17h

PAFO: Pareto Fairness Optimization for Personalized Reward Modeling

Researchers have introduced PAFO, a new framework designed to address personalized reward bias in large language models. This bias occurs when reward models, trained on diverse user preferences, disproportionately favor users with more common preferences. PAFO formulates fairness as a Pareto optimization problem, aiming to enhance the experience for under-served users without negatively impacting others. The framework trains specialized models for different user groups and then distills their knowledge into a single model, improving accuracy and fairness across the board.

IMPACT Addresses fairness issues in LLM personalization, potentially leading to more equitable user experiences.
- Personalized reward modeling
- Large language models
TOOL · arXiv cs.AI English(EN) · 17h

Repetition Mismatch: Why Data Mixture Experiments Don't Scale and How to Fix Them

Researchers have identified a key issue in scaling up training data mixtures for AI models, termed "repetition mismatch." It occurs when the optimal data mixture changes as the training budget increases, because high-quality, limited datasets are repeated at different rates at different scales. A new subsampling procedure that matches the target repetition rate can accurately predict optimal mixtures from significantly smaller experiments, improving efficiency and accuracy.

IMPACT Improves efficiency and accuracy in training large AI models by addressing data mixture scaling issues.
- arXiv
- Repetition Mismatch: Why Data Mixture Experiments Don't Scale and How to Fix Them
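
The idea of matching a repetition rate can be made concrete with a little arithmetic. The sketch below is one possible reading of the summary, not the paper's procedure: it assumes repetitions are counted as tokens drawn from a dataset divided by the dataset's size, and the function names and example numbers are hypothetical.

```python
def repetitions(mixture_weight: float, token_budget: float, dataset_tokens: float) -> float:
    """Number of passes over a dataset: tokens drawn from it divided by its size.

    Assumption: this is how a "repetition rate" is measured.
    """
    return mixture_weight * token_budget / dataset_tokens


def matched_subsample_size(dataset_tokens: float, small_budget: float, target_budget: float) -> float:
    """Subsample size that gives the small run the same repetitions as the target run.

    Solving weight * small_budget / n == weight * target_budget / dataset_tokens
    for n gives n = dataset_tokens * small_budget / target_budget.
    """
    return dataset_tokens * small_budget / target_budget


# Hypothetical numbers: a 10B-token dataset at 30% weight in a 1T-token target run.
target_reps = repetitions(0.3, 1e12, 10e9)
naive_reps = repetitions(0.3, 1e10, 10e9)        # small run on the full dataset
subsample = matched_subsample_size(10e9, 1e10, 1e12)
matched_reps = repetitions(0.3, 1e10, subsample)  # small run on the subsample
```

Under these assumptions the small run on the full dataset sees it far fewer times than the target run does, while the subsampled run reproduces the target repetition count regardless of the mixture weight.
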
TOOL · arXiv cs.LG English(EN) · 17h

Characterizing the Discrete Geometry of ReLU Networks

Researchers have proved new theoretical results on the discrete geometry of ReLU networks, focusing on their connectivity graphs. In these graphs, nodes represent linear regions and edges connect regions that share a face. The average degree is at most twice the input dimension, irrespective of network depth or width. Furthermore, the graph's diameter has an upper bound independent of input dimension, even as the number of regions grows exponentially. The results were validated through experiments on networks trained with both synthetic and real-world data.

IMPACT Provides deeper theoretical understanding of neural network structures, potentially aiding in interpretability and optimization.
- ReLU networks
TOOL · arXiv cs.AI English(EN) · 17h

Post-Trained MoE Can Skip Half Experts via Self-Distillation

Researchers have developed a new framework called Zero-Expert Self-Distillation Adaptation (ZEDA) to make Mixture-of-Experts (MoE) language models more efficient. ZEDA allows post-trained static MoE models to dynamically skip over half of their experts during inference with minimal accuracy loss. This method was tested on Qwen3-30B-A3B and GLM-4.7-Flash, showing significant reductions in computation and an inference speedup of approximately 1.20x.

IMPACT Reduces inference costs for MoE models, potentially accelerating deployment and adoption.
TOOL · arXiv cs.AI English(EN) · 17h

MedicalRec: Medical recommender system for image classification without retraining

Researchers have developed a transformer-based recommender system called MedicalRec to help select optimal machine learning models for medical image classification tasks. The system aims to reduce the energy consumption and waste associated with trial-and-error model selection. On a new dataset, MedicalRec-Bench, which contains over 5,000 records of models tested across various medical imaging categories, MedicalRec achieved a HitRate@100 of 75.5%. The dataset and code are publicly available.

IMPACT Reduces computational waste in AI model selection for medical imaging, potentially accelerating research and deployment.
- MedicalRec-Bench
- MedicalRec
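
For readers unfamiliar with the metric, HitRate@k is commonly defined as the fraction of queries for which at least one relevant item appears among the top k recommendations. The sketch below uses that common definition; whether MedicalRec-Bench defines a hit in exactly this way is an assumption, and the function name and toy data are hypothetical.

```python
from typing import Sequence, Set


def hit_rate_at_k(ranked_lists: Sequence[Sequence[str]],
                  relevant_sets: Sequence[Set[str]],
                  k: int) -> float:
    """Fraction of queries with at least one relevant item in the top k."""
    if not ranked_lists or len(ranked_lists) != len(relevant_sets):
        raise ValueError("need one non-empty ranking per relevant set")
    hits = 0
    for ranked, relevant in zip(ranked_lists, relevant_sets):
        # A query counts as a hit if any of its top-k items is relevant.
        if any(item in relevant for item in ranked[:k]):
            hits += 1
    return hits / len(ranked_lists)


# Toy example: four tasks, each with a ranked list of candidate model names.
rankings = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"], ["j", "k", "l"]]
relevant = [{"b"}, {"f"}, {"z"}, {"j"}]
rate_at_2 = hit_rate_at_k(rankings, relevant, k=2)
rate_at_3 = hit_rate_at_k(rankings, relevant, k=3)
```

The metric can only rise or stay flat as k grows, so a value reported at k=100 should be read with the size of the candidate pool in mind.
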
TOOL · arXiv cs.AI English(EN) · 17h

CARE: A Conformal Safety Layer for Medical Summarization

Researchers have developed CARE, a post-hoc safety layer for medical summarization with large language models. The system is model-agnostic: it overlays calibrated flags for omissions and hallucinations without requiring model retraining. CARE provides formal guarantees on error rates, aiming to balance safety against the burden on clinicians reviewing summaries.

IMPACT Introduces a method for formal safety guarantees in LLM medical summarization, potentially reducing errors and clinician review burden.
- LLMs
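
To illustrate what a calibrated flag with a formal error-rate guarantee can look like, here is a minimal split-conformal sketch. It is a generic textbook construction, not CARE's algorithm: it assumes a held-out calibration set of suspicion scores for sentences known to be hallucinated, exchangeable with future ones, and all names are hypothetical.

```python
import math
from typing import Sequence


def miss_rate_threshold(calibration_scores: Sequence[float], alpha: float) -> float:
    """Threshold such that a new true error scores below it with probability <= alpha.

    calibration_scores: suspicion scores (higher = more suspicious) for items
    known to be errors. Assumes exchangeability with future error items.
    """
    n = len(calibration_scores)
    k = math.floor(alpha * (n + 1))
    if k < 1:
        # Too few calibration points for this alpha: flag everything.
        return float("-inf")
    # k-th smallest calibration score; a new error falls strictly below it
    # with probability at most k / (n + 1) <= alpha.
    return sorted(calibration_scores)[k - 1]


def flag(score: float, threshold: float) -> bool:
    """Flag an item for clinician review when its score reaches the threshold."""
    return score >= threshold


# Toy calibration set of 19 scores and a 25% tolerated miss rate.
scores = [i / 20 for i in range(1, 20)]
threshold = miss_rate_threshold(scores, alpha=0.25)
```

A smaller alpha lowers the threshold and flags more sentences, which is the trade-off between safety and reviewer burden that the summary describes.
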
TOOL · arXiv cs.AI English(EN) · 17h

Post-training is (Massive) Supervised Learning

A new paper argues that the currently dominant method for training large language models (LLMs), which relies on extensive post-training stages such as supervised fine-tuning (SFT) and reinforcement learning (RL), is essentially a return to older "pre-train then fine-tune" approaches. The authors demonstrate that models trained from scratch on modern reasoning datasets can achieve significant performance on competitive benchmarks, suggesting that current post-training primarily fits models to specific distributions rather than fostering general capabilities. They propose a shift towards training procedures that emphasize "learning how to learn" to develop more generally capable models.

IMPACT Suggests current LLM training methods may be overly focused on distribution fitting, potentially hindering the development of more general AI capabilities.
- RL
- BERT
- LLMs
- SFT