arXivDaily arXiv每日学术速递 周一至周五更新
全部学科分类 1386
专题追踪
2601.21014 2026-02-12 stat.ML cs.LG stat.AP

Efficient Causal Structure Learning via Modular Subgraph Integration

Haixiang Sun, Pengchao Tian, Zihan Zhou, Jielei Zhang, Peiyi Li, Andrew L. Liu

详情
英文摘要

Learning causal structures from observational data remains a fundamental yet computationally intensive task, particularly in high-dimensional settings where existing methods face challenges such as the super-exponential growth of the search space and increasing computational demands. To address this, we introduce VISTA (Voting-based Integration of Subgraph Topologies for Acyclicity), a modular framework that decomposes the global causal structure learning problem into local subgraphs based on Markov Blankets. The global integration is achieved through a weighted voting mechanism that penalizes low-support edges via exponential decay, filters unreliable ones with an adaptive threshold, and ensures acyclicity using a Feedback Arc Set (FAS) algorithm. The framework is model-agnostic, imposing no assumptions on the inductive biases of base learners, is compatible with arbitrary data settings without requiring specific structural forms, and fully supports parallelization. We also theoretically establish finite-sample error bounds for VISTA, and prove its asymptotic consistency under mild conditions. Extensive experiments on both synthetic and real datasets consistently demonstrate the effectiveness of VISTA, yielding notable improvements in both accuracy and efficiency over a wide range of base learners.

2601.05588 2026-02-12 cs.IR cs.AI cs.LG

Autoregressive Ranking: Bridging the Gap Between Dual and Cross Encoders

Benjamin Rozonoyer, Chong You, Michael Boratko, Himanshu Jain, Nilesh Gupta, Srinadh Bhojanapalli, Andrew McCallum, Felix Yu

Comments 22 pages, 5 figures

详情
英文摘要

The success of Large Language Models (LLMs) has motivated a shift toward generative approaches to retrieval and ranking, aiming to supersede classical Dual Encoders (DEs) and Cross Encoders (CEs). A prominent paradigm is pointwise Autoregressive Ranking (ARR), where an LLM generates document identifiers (docIDs) token-by-token to enable ranking via beam search. ARR offers the promise of superior expressivity compared to DEs while avoiding the prohibitive computational cost of CEs. However, a formal theoretical foundation for this expressive power has been missing. Moreover, the standard next-token prediction loss is rank-agnostic and inappropriate for finetuning an LLM for ranking tasks. In this paper, we first prove that the expressive capacity of ARR is strictly superior to DEs. While a DE requires an embedding dimension that grows linearly with corpus size to achieve arbitrary rankings, ARR can solve it with a constant hidden dimension. We then propose SToICaL (Simple Token-Item Calibrated Loss), a generalized rank-aware training loss for LLM finetuning. By using item-level reweighting and prefix-tree marginalization, we distribute probability mass over valid docID tokens based on their ground-truth relevance. Experiments on WordNet and ESCI datasets verify that our loss suppresses invalid docID generations and significantly improves ranking metrics beyond top-1 retrieval.

2512.18542 2026-02-12 cs.CR cs.AI cs.CL cs.LG

SecureCode: A Production-Grade Multi-Turn Dataset for Training Security-Aware Code Generation Models

Scott Thornton

Comments 27 pages, 12 figures, 10 tables. Dataset available at https://huggingface.co/datasets/scthornton/securecode. Code and validation tools at https://github.com/scthornton/securecode

详情
英文摘要

AI coding assistants produce vulnerable code in 45\% of security-relevant scenarios~\cite{veracode2025}, yet no public training dataset teaches both traditional web security and AI/ML-specific defenses in a format suitable for instruction tuning. We present SecureCode, a production-grade dataset of 2,185 multi-turn security training examples spanning two domains: web application security (1,435 examples covering the OWASP Top 10 2021 across 11 languages and 9 frameworks, 100\% grounded in documented CVEs and security incidents) and AI/ML security (750 examples covering all 10 OWASP LLM Top 10 2025 categories across more than 40 frameworks, including LangChain, OpenAI, and Hugging Face). Every example follows a 4-turn conversational structure -- feature request; vulnerable and secure implementations with attack demonstrations; advanced probing; and defense-in-depth operational guidance -- designed for direct use in instruction tuning pipelines. Quality assurance combines automated structural validation with multi-agent review from seven specialist AI perspectives (more than 10{,}500 assessments) and an 8-phase remediation pipeline, producing a rubric-calibrated mean quality score of 93.8/100 ($σ= 0.93$) for the AI/ML component. Each example provides SIEM integration strategies, infrastructure hardening recommendations, and testing approaches using production frameworks. We release the unified dataset on Hugging Face with domain-specific loading configurations (web, aiml, default), alongside eight fine-tuned open-source models (3B--20B parameters, QLoRA), and an evaluation framework with four security-specific metrics. To our knowledge, SecureCode is the first public dataset that jointly provides OWASP Top 10 2021 web coverage and OWASP LLM Top 10 2025 AI/ML coverage in a unified conversational schema suitable for instruction tuning.

2511.12544 2026-02-12 cs.AR cs.ET cs.LG eess.IV

FERMI-ML: A Flexible and Resource-Efficient Memory-In-Situ SRAM Macro for TinyML acceleration

Mukul Lokhande, Akash Sankhe, S. V. Jaya Chand, Santosh Kumar Vishvakarma

Journal ref 37th International Conference on Microelectronics (ICM), Cairo, Egypt, 2025

详情
英文摘要

The growing demand for low-power and area-efficient TinyML inference on AIoT devices necessitates memory architectures that minimise data movement while sustaining high computational efficiency. This paper presents FERMI-ML, a Flexible and Resource-Efficient Memory-In-Situ (MIS) SRAM macro designed for TinyML acceleration. The proposed 9T XNOR-based RX9T bit-cell integrates a 5T storage cell with a 4T XNOR compute unit, enabling variable-precision MAC and CAM operations within the same array. A 22-transistor (C22T) compressor-tree-based accumulator facilitates logarithmic 1-64-bit MAC computation with reduced delay and power compared to conventional adder trees. The 4 KB macro achieves dual functionality for in-situ computation and CAM-based lookup operations, supporting Posit-4 or FP-4 precision. Post-layout results at 65 nm show operation at 350 MHz with 0.9 V, delivering a throughput of 1.93 TOPS and an energy efficiency of 364 TOPS/W, while maintaining a Quality-of-Result (QoR) above 97.5% with InceptionV4 and ResNet-18. FERMI-ML thus demonstrates a compact, reconfigurable, and energy-aware digital Memory-In-Situ macro capable of supporting mixed-precision TinyML workloads.

2511.11847 2026-02-12 cs.IR cs.AI cs.CY

A Multimodal Manufacturing Safety Chatbot: Knowledge Base Design, Benchmark Development, and Evaluation of Multiple RAG Approaches

Ryan Singh, Austin Hamilton, Amanda White, Michael Wise, Ibrahim Yousif, Arthur Carvalho, Zhe Shan, Reza Abrisham Baf, Mohammad Mayyas, Lora A. Cavuoto, Fadel M. Megahed

Comments 25 pages, 5 figures

详情
英文摘要

Ensuring worker safety remains a critical challenge in modern manufacturing environments. Industry 5.0 reorients the prevailing manufacturing paradigm toward more human-centric operations. Using a design science research methodology, we identify three essential requirements for next-generation safety training systems: high accuracy, low latency, and low cost. We introduce a multimodal chatbot powered by large language models that meets these design requirements. The chatbot uses retrieval-augmented generation to ground its responses in curated regulatory and technical documentation. To evaluate our solution, we developed a domain-specific benchmark of expert-validated question and answer pairs for three representative machines: a Bridgeport manual mill, a Haas TL-1 CNC lathe, and a Universal Robots UR5e collaborative robot. We tested 24 RAG configurations using a full-factorial design and assessed them with automated evaluations of correctness, latency, and cost. Our top 2 configurations were then evaluated by ten industry experts and academic researchers. Our results show that retrieval strategy and model configuration have a significant impact on performance. The top configuration, selected for chatbot deployment, achieved an accuracy of 86.66%, an average cost of $0.005 per query, and an average end-to-end latency of 10.04 seconds. This latency is practical for delivering a complete safety instruction and is measured from query submission to full instruction delivery rather than generation onset. Overall, our work provides three contributions: an open-source, domain-grounded safety training chatbot; a validated benchmark for evaluating AI-assisted safety instruction; and a systematic methodology for designing and assessing AI-enabled instructional and immersive safety training systems for Industry 5.0 environments.

2511.07277 2026-02-12 cs.HC cs.AI cs.CY

Designing Beyond Language: Sociotechnical Barriers in AI Health Technologies for Limited English Proficiency

Michelle Huang, Violeta J. Rodriguez, Koustuv Saha, Tal August

Journal ref Proceedings of the 2026 CHI Conference on Human Factors in Computing Systems

详情
英文摘要

Limited English proficiency (LEP) patients in the U.S. face systemic barriers to healthcare beyond language and interpreter access, encompassing procedural and institutional constraints. AI advances may support communication and care through on-demand translation and visit preparation, but also risk exacerbating existing inequalities. We conducted storyboard-driven interviews with 14 patient navigators to explore how AI could shape care experiences for Spanish-speaking LEP individuals. We identified tensions around linguistic and cultural misunderstandings, privacy concerns, and opportunities and risks for AI to augment care workflows. Participants highlighted structural factors that can undermine trust in AI systems, including sensitive information disclosure, unstable technology access, and low literacy. While AI tools can potentially alleviate social barriers and institutional constraints, there are risks of misinformation and reducing human-to-human interactions. Our findings contribute AI design considerations that support LEP patients and care teams via rapport-building, educational and language support, and minimizing disruptions to existing practices.

2511.00494 2026-02-12 eess.SP cs.AI

An Indoor Radio Mapping Dataset Combining 3D Point Clouds and RSSI

Ljupcho Milosheski, Kuon Akiyama, Blaž Bertalanič, Jernej Hribar, Ryoichi Shinkuma

Comments 19 pages, 8 figures, 3 tables

详情
英文摘要

The growing number of smart devices supporting bandwidth-intensive and latency-sensitive applications, such as real-time video analytics, smart sensing, Extended Reality (XR), etc., necessitates reliable wireless connectivity in indoor environments. In such environments, accurate design of Radio Environment Maps (REMs) enables adaptive wireless network planning and optimization of Access Point (AP) placement. However, generating realistic REMs remains difficult due to the variability of indoor environments and the limitations of existing modeling approaches, which often rely on simplified layouts or fully synthetic data. These challenges are further amplified by the adoption of next-generation Wi-Fi standards, which operate at higher frequencies and suffer from limited range and wall penetration. To support the efforts in addressing these challenges, we collected a dataset that combines high-resolution 3D LiDAR scans with Wi-Fi RSSI measurements collected across 20 setups in a multi-room indoor environment. The dataset includes two measurement scenarios, the first without human presence in the environment, and the second with human presence, enabling the development and validation of REM estimation models that incorporate physical geometry and environmental dynamics. The described dataset supports research in data-driven wireless modeling and the development of high-capacity indoor communication networks.

2510.09655 2026-02-12 cs.CR cs.AI

Data Provenance Auditing of Fine-Tuned Large Language Models with a Text-Preserving Technique

Yanming Li, Cédric Eichler, Nicolas Anciaux, Alexandra Bensamoun, Lorena Gonzalez Manzano, Seifeddine Ghozzi

详情
英文摘要

We propose a system for marking sensitive or copyrighted texts to detect their use in fine-tuning large language models under black-box access with statistical guarantees. Our method builds digital ``marks'' using invisible Unicode characters organized into (``cue'', ``reply'') pairs. During an audit, prompts containing only ``cue'' fragments are issued to trigger regurgitation of the corresponding ``reply'', indicating document usage. To control false positives, we compare against held-out counterfactual marks and apply a ranking test, yielding a verifiable bound on the false positive rate. The approach is minimally invasive, scalable across many sources, robust to standard processing pipelines, and achieves high detection power even when marked data is a small fraction of the fine-tuning corpus.

2510.02700 2026-02-12 eess.IV cs.CV eess.SP

A UAV-Based VNIR Hyperspectral Benchmark Dataset for Landmine and UXO Detection

Sagar Lekhak, Emmett J. Ientilucci, Jasper Baur, Susmita Ghosh

Comments This work was accepted and presented as an oral paper at the Indian Geoscience and Remote Sensing Symposium (InGARSS) 2025 and appears in the IEEE InGARSS 2025 Proceedings

详情
英文摘要

This paper introduces a novel benchmark dataset of Visible and Near-Infrared (VNIR) hyperspectral imagery acquired via an unmanned aerial vehicle (UAV) platform for landmine and unexploded ordnance (UXO) detection research. The dataset was collected over a controlled test field seeded with 143 realistic surrogate landmine and UXO targets, including surface, partially buried, and fully buried configurations. Data acquisition was performed using a Headwall Nano-Hyperspec sensor mounted on a multi-sensor drone platform, flown at an altitude of approximately 20.6 m, capturing 270 contiguous spectral bands spanning 398-1002 nm. Radiometric calibration, orthorectification, and mosaicking were performed followed by reflectance retrieval using a two-point Empirical Line Method (ELM), with reference spectra acquired using an SVC spectroradiometer. Cross-validation against six reference objects yielded RMSE values below 1.0 and SAM values between 1 and 6 degrees in the 400-900 nm range, demonstrating high spectral fidelity. The dataset is released alongside raw radiance cubes, GCP/AeroPoint data, and reference spectra to support reproducible research. This contribution fills a critical gap in open-access UAV-based hyperspectral data for landmine detection and offers a multi-sensor benchmark when combined with previously published drone-based electromagnetic induction (EMI) data from the same test field.

2509.24975 2026-02-12 cs.SE cs.CL

DiffuTester: Accelerating Unit Test Generation for Diffusion LLMs via Mining Structural Pattern

Lekang Yang, Yuetong Liu, Yitong Zhang, Jia Li

Comments Update format and add some experimental results

详情
英文摘要

Diffusion large language models (dLLMs) enable parallel generation and are promising for unit test generation (UTG), where efficient and large-scale automated testing is essential in software development. Despite this advantage, their application to UTG is still constrained by a clear trade-off between efficiency and test quality, since increasing the number of tokens generated in each step often causes a sharp decline in the quality of test cases. To overcome this limitation, we present DiffuTester, an acceleration framework specifically tailored for dLLMs in UTG. The motivation of DiffuTester is that unit tests targeting the same focal method often share structural patterns. DiffuTester employs a novel structural pattern based decoding approach, which dynamically identifies structural patterns across unit tests through their abstract syntax trees and additionally decodes the corresponding tokens, thereby achieving acceleration without compromising the quality of the output. To enable comprehensive evaluation, we extend the original TestEval benchmark to three programming languages. Extensive experiments on three benchmarks with two representative models show that DiffuTester delivers significant acceleration while preserving test coverage. Moreover, DiffuTester generalizes well across different dLLMs and programming languages, providing a practical and scalable solution for efficient UTG in software development. Code and data are publicly available at https://github.com/TsinghuaISE/DiffuTester.

2509.05941 2026-02-12 cs.SE cs.LG cs.MA

Code2MCP: Transforming Code Repositories into MCP Services

Chaoqian Ouyang, Ling Yue, Shimin Di, Libin Zheng, Linan Yue, Shaowu Pan, Jian Yin, Min-Ling Zhang

详情
英文摘要

The Model Context Protocol (MCP) aims to create a standard for how Large Language Models use tools. However, most current research focuses on selecting tools from an existing pool. A more fundamental, yet largely overlooked, problem is how to populate this pool by converting the vast number of existing software projects into MCP-compatible services. To bridge this gap, we introduce Code2MCP, an agent-based framework that automatically transforms a GitHub repository into a functional MCP service with minimal human intervention. Code2MCP employs a multi-agent workflow for code analysis, environment setup, tool function design, and service generation, enhanced by a self-correcting loop to ensure reliability. We demonstrate that Code2MCP successfully transforms open-source computing libraries in scientific fields such as bioinformatics, mathematics, and fluid dynamics that are not available in existing MCP servers. By providing a novel automated pathway to unlock GitHub, the world's largest code repository, for the MCP ecosystem, Code2MCP serves as a catalyst to significantly accelerate the protocol's adoption and practical application. The code is public at https://github.com/DEFENSE-SEU/Code2MCP.

2508.13374 2026-02-12 cs.DC cs.ET cs.LG cs.NI

OrbitChain: Orchestrating In-orbit Real-time Analytics of Earth Observation Data

Zhouyu Li, Zhijin Yang, Huayue Gu, Xiaojian Wang, Yuchen Liu, Ruozhou Yu

Comments currently under review

详情
英文摘要

Earth observation analytics have the potential to transform many sectors. However, due to limited ground connections, it currently takes hours to days to download and analyze Earth observation data, diminishing the value of data for time-sensitive applications like disaster monitoring or search-and-rescue. To enable real-time analytics, we propose OrbitChain, an in-orbit multi-satellite Earth analytics framework. OrbitChain uses a pipelined design to decompose workflows into analytics functions, and orchestrates constellation-wide resources to finish real-time analytics tasks. It provides timely insights to Earth sensing applications and enables advanced workflows like in-orbit tip-and-cue. Hardware-in-the-loop experiments show that OrbitChain can deliver analytics results in minutes, supports up to 60% more analytics workload than existing frameworks, and reduces inter-satellite communication overhead by up to 45%.

2508.08232 2026-02-12 eess.IV cs.CV eess.SP

Mamba-FCS: Joint Spatio- Frequency Feature Fusion, Change-Guided Attention, and SeK Loss for Enhanced Semantic Change Detection in Remote Sensing

Buddhi Wijenayake, Athulya Ratnayake, Praveen Sumanasekara, Roshan Godaliyadda, Parakrama Ekanayake, Vijitha Herath, Nichula Wasalathilaka

Comments 19 pages, 13 Figures

详情
英文摘要

Semantic Change Detection (SCD) from remote sensing imagery requires models balancing extensive spatial context, computational efficiency, and sensitivity to class-imbalanced land-cover transitions. While Convolutional Neural Networks excel at local feature extraction but lack global context, Transformers provide global modeling at high computational costs. Recent Mamba architectures based on state-space models offer compelling solutions through linear complexity and efficient long-range modeling. In this study, we introduce Mamba-FCS, a SCD framework built upon Visual State Space Model backbone incorporating, a Joint Spatio-Frequency Fusion block incorporating log-amplitude frequency domain features to enhance edge clarity and suppress illumination artifacts, a Change-Guided Attention (CGA) module that explicitly links the naturally intertwined BCD and SCD tasks, and a Separated Kappa (SeK) loss tailored for class-imbalanced performance optimization. Extensive evaluation on SECOND and Landsat-SCD datasets shows that Mamba-FCS achieves state-of-the-art metrics, 88.62% Overall Accuracy, 65.78% F_scd, and 25.50% SeK on SECOND, 96.25% Overall Accuracy, 89.27% F_scd, and 60.26% SeK on Landsat-SCD. Ablation analyses confirm distinct contributions of each novel component, with qualitative assessments highlighting significant improvements in SCD. Our results underline the substantial potential of Mamba architectures, enhanced by proposed techniques, setting a new benchmark for effective and scalable semantic change detection in remote sensing applications. The complete source code, configuration files, and pre-trained models will be publicly available upon publication.

2507.10136 2026-02-12 q-bio.QM cs.AI

A PBN-RL-XAI Framework for Discovering a "Hit-and-Run" Therapeutic Strategy in Melanoma

Zhonglin Liu

Comments 7 pages, 7 figures. Accepted by the IEEE International Conference on Bioinformatics and Biomedicine (BIBM) 2025. Code is available at https://github.com/Liu-Zhonglin/pbn-melanoma-project

详情
英文摘要

Innate resistance to anti-PD-1 immunotherapy remains a major clinical challenge in metastatic melanoma, with the underlying molecular networks being poorly understood. To address this, we constructed a dynamic Probabilistic Boolean Network model using transcriptomic data from patient tumor biopsies to elucidate the regulatory logic governing therapy response. We then employed a reinforcement learning agent to systematically discover optimal, multi-step therapeutic interventions and used explainable artificial intelligence to mechanistically interpret the agent's control policy. The analysis revealed that a precisely timed, 4-step temporary inhibition of the lysyl oxidase like 2 protein (LOXL2) was the most effective strategy. Our explainable analysis showed that this ''hit-and-run" intervention is sufficient to erase the molecular signature driving resistance, allowing the network to self-correct without requiring sustained intervention. This study presents a novel, time-dependent therapeutic hypothesis for overcoming immunotherapy resistance and provides a powerful computational framework for identifying non-obvious intervention protocols in complex biological systems.

2506.11214 2026-02-12 math.OC cs.AI cs.CC cs.LG stat.ML

Complexity of normalized stochastic first-order methods with momentum under heavy-tailed noise

Chuan He, Zhaosong Lu, Defeng Sun, Zhanwang Deng

详情
英文摘要

In this paper, we propose practical normalized stochastic first-order methods with Polyak momentum, multi-extrapolated momentum, and recursive momentum for solving unconstrained optimization problems. These methods employ dynamically updated algorithmic parameters and do not require explicit knowledge of problem-dependent quantities such as the Lipschitz constant or noise bound. We establish first-order oracle complexity results for finding approximate stochastic stationary points under heavy-tailed noise and weakly average smoothness conditions -- both of which are weaker than the commonly used bounded variance and mean-squared smoothness assumptions. Our complexity bounds either improve upon or match the best-known results in the literature. Numerical experiments are presented to demonstrate the practical effectiveness of the proposed methods.

2506.04518 2026-02-12 eess.AS cs.CL

Towards Efficient Speech-Text Jointly Decoding within One Speech Language Model

Haibin Wu, Yuxuan Hu, Ruchao Fan, Xiaofei Wang, Kenichi Kumatani, Bo Ren, Jianwei Yu, Heng Lu, Lijuan Wang, Yao Qian, Jinyu Li

Comments Accepted by ASRU 2025

详情
英文摘要

Speech language models (Speech LMs) enable end-to-end speech-text modeling within a single model, offering a promising direction for spoken dialogue systems. The choice of speech-text jointly decoding paradigm plays a critical role in performance, efficiency, and alignment quality. In this work, we systematically compare representative joint speech-text decoding strategies, including the interleaved, and parallel generation paradigms, under a controlled experimental setup using the same base language model, speech tokenizer and training data. Our results show that the interleaved approach achieves the best alignment. However it suffers from slow inference due to long token sequence length. To address this, we propose a novel early-stop interleaved (ESI) pattern that not only significantly accelerates decoding but also yields slightly better performance. Additionally, we curate high-quality question answering (QA) datasets to further improve speech QA performance.

2504.20610 2026-02-12 cs.IR cs.AI cs.PF

Information Retrieval in the Age of Generative AI: The RGB Model

Michele Garetto, Alessandro Cornacchia, Franco Galante, Emilio Leonardi, Alessandro Nordio, Alberto Tarable

Comments To be presented at ACM SIGIR 25

详情
英文摘要

The advent of Large Language Models (LLMs) and generative AI is fundamentally transforming information retrieval and processing on the Internet, bringing both great potential and significant concerns regarding content authenticity and reliability. This paper presents a novel quantitative approach to shed light on the complex information dynamics arising from the growing use of generative AI tools. Despite their significant impact on the digital ecosystem, these dynamics remain largely uncharted and poorly understood. We propose a stochastic model to characterize the generation, indexing, and dissemination of information in response to new topics. This scenario particularly challenges current LLMs, which often rely on real-time Retrieval-Augmented Generation (RAG) techniques to overcome their static knowledge limitations. Our findings suggest that the rapid pace of generative AI adoption, combined with increasing user reliance, can outpace human verification, escalating the risk of inaccurate information proliferation across digital resources. An in-depth analysis of Stack Exchange data confirms that high-quality answers inevitably require substantial time and human effort to emerge. This underscores the considerable risks associated with generating persuasive text in response to new questions and highlights the critical need for responsible development and deployment of future generative AI tools.

2504.20058 2026-02-12 q-fin.ST cs.LG

Predictive AI with External Knowledge Infusion: Datasets and Benchmarks for Stock Markets

Ambedkar Dukkipati, Kawin Mayilvaghanan, Naveen Kumar Pallekonda, Sai Prakash Hadnoor, Ranga Shaarad Ayyagari

详情
英文摘要

Fluctuations in stock prices are influenced by a complex interplay of factors that go beyond mere historical data. These factors, themselves influenced by external forces, encompass inter-stock dynamics, broader economic factors, various government policy decisions, outbreaks of wars, etc. Furthermore, all of these factors are dynamic and exhibit changes over time. In this paper, for the first time, we tackle the forecasting problem under external influence by proposing learning mechanisms that not only learn from historical trends but also incorporate external knowledge from temporal knowledge graphs. Since there are no such datasets or temporal knowledge graphs available, we study this problem with stock market data, and we construct comprehensive temporal knowledge graph datasets. In our proposed approach, we model relations on external temporal knowledge graphs as events of a Hawkes process on graphs. With extensive experiments, we show that learned dynamic representations effectively rank stocks based on returns across multiple holding periods, outperforming related baselines on relevant metrics.

2504.09221 2026-02-12 cs.HC cs.LG

CMCRD: Cross-Modal Contrastive Representation Distillation for Emotion Recognition

Siyuan Kan, Huanyu Wu, Zhenyao Cui, Fan Huang, Xiaolong Xu, Dongrui Wu

Journal ref IEEE Trans. on Affective Computing, 2026

详情
英文摘要

Emotion recognition is an important component of affective computing, and also human-machine interaction. Unimodal emotion recognition is convenient, but the accuracy may not be high enough; on the contrary, multi-modal emotion recognition may be more accurate, but it also increases the complexity and cost of the data collection system. This paper considers cross-modal emotion recognition, i.e., using both electroencephalography (EEG) and eye movement in training, but only EEG or eye movement in test. We propose cross-modal contrastive representation distillation (CMCRD), which uses a pre-trained eye movement classification model to assist the training of an EEG classification model, improving feature extraction from EEG signals, or vice versa. During test, only EEG signals (or eye movement signals) are acquired, eliminating the need for multi-modal data. CMCRD not only improves the emotion recognition accuracy, but also makes the system more simplified and practical. Experiments using three different neural network architectures on three multi-modal emotion recognition datasets demonstrated the effectiveness of CMCRD. Compared with the EEG-only model, it improved the average classification accuracy by about 6.2%.

2504.03785 2026-02-12 eess.SP cs.CE cs.LG

Detecting Plant VOC Traces Using Indoor Air Quality Sensors

Seyed Hamidreza Nabaei, Ryan Lenfant, Viswajith Govinda Rajan, Dong Chen, Michael P. Timko, Bradford Campbell, Arsalan Heydarian

Journal ref Indoor Air, 2025, Article ID 7134467

详情
英文摘要

In the era of growing interest in healthy buildings and smart homes, the importance of sustainable, health conscious indoor environments is paramount. Smart tools, especially VOC sensors, are crucial for monitoring indoor air quality, yet interpreting signals from various VOC sources remains challenging. A promising approach involves understanding how indoor plants respond to environmental conditions. Plants produce terpenes, a type of VOC, when exposed to abiotic and biotic stressors - including pathogens, predators, light, and temperature - offering a novel pathway for monitoring indoor air quality. While prior work often relies on specialized laboratory sensors, our research leverages readily available commercial sensors to detect and classify plant emitted VOCs that signify changes in indoor conditions. We quantified the sensitivity of these sensors by measuring 16 terpenes in controlled experiments, then identified and tested the most promising terpenes in realistic environments. We also examined physics based models to map VOC responses but found them lacking for real world complexity. Consequently, we trained machine learning models to classify terpenes using commercial sensors and identified optimal sensor placement. To validate this approach, we analyzed emissions from a living basil plant, successfully detecting terpene output. Our findings establish a foundation for overcoming challenges in plant VOC detection, paving the way for advanced plant based sensors to enhance indoor environmental quality in future smart buildings.

2504.03387 2026-02-12 hep-ph cs.LG hep-ex

BitHEP -- The Limits of Low-Precision ML in HEP

Claudius Krause, Daohan Wang, Ramon Winterhalder

Comments 20 pages, 6 figures. v2: accepted for publication

Journal ref SciPost Phys. 20, 038 (2026)

详情
英文摘要

The increasing complexity of modern neural network architectures demands fast and memory-efficient implementations to mitigate computational bottlenecks. In this work, we evaluate the recently proposed BitNet architecture in HEP applications, assessing its performance in classification, regression, and generative modeling tasks. Specifically, we investigate its suitability for quark-gluon discrimination, SMEFT parameter estimation, and detector simulation, comparing its efficiency and accuracy to state-of-the-art methods. Our results show that while BitNet consistently performs competitively in classification tasks, its performance in regression and generation varies with the size and type of the network, highlighting key limitations and potential areas for improvement.

2503.19925 2026-02-12 eess.IV cs.LG math.OC physics.med-ph

Accurate, provable and fast polychromatic tomographic reconstruction: A variational inequality approach

Mengqi Lou, Kabir Aladin Verchand, Sara Fridovich-Keil, Ashwin Pananjady

Comments Accepted at SIAM Journal on Imaging Sciences

详情
英文摘要

We consider the problem of signal reconstruction for computed tomography (CT) under a nonlinear forward model that accounts for exponential signal attenuation, a polychromatic X-ray source, general measurement noise (e.g., Poisson shot noise), and observations acquired over multiple wavelength windows. We develop a simple iterative algorithm for single-material reconstruction, which we call EXACT (EXtragradient Algorithm for Computed Tomography), based on formulating our estimate as the fixed point of a monotone variational inequality. We prove guarantees on the statistical and computational performance of EXACT under realistic assumptions on the measurement process. We also consider a recently introduced variant of this model with Gaussian measurements and present sample and iteration complexity bounds for EXACT that improve upon those of existing algorithms. We apply our EXACT algorithm to a CT phantom image recovery task and show that it often requires fewer X-ray views, lower source intensity, and less computation time to achieve reconstruction quality similar to existing methods. Code is available at https://github.com/voilalab/exact.

2503.19339 2026-02-12 cs.CR cs.AI

Efficient IoT Intrusion Detection with an Improved Attention-Based CNN-BiLSTM Architecture

Amna Naeem, Jawad Ahmad, Muazzam A. Khan, Aizaz Ahmad Khattak, Muhammad Shahbaz Khan

详情
英文摘要

The ever-increasing security vulnerabilities in the Internet-of-Things (IoT) systems require improved threat detection approaches. This paper presents a compact and efficient approach to detect botnet attacks by employing an integrated approach that consists of traffic pattern analysis, temporal support learning, and focused feature extraction. The proposed attention-based model benefits from a hybrid CNN-BiLSTM architecture and achieves 99% classification accuracy in detecting botnet attacks utilizing the N-BaIoT dataset, while maintaining high precision and recall across various scenarios. The proposed model's performance is further validated by key parameters, such as Mathews Correlation Coefficient and Cohen's kappa Correlation Coefficient. The close-to-ideal results for these parameters demonstrate the proposed model's ability to detect botnet attacks accurately and efficiently in practical settings and on unseen data. The proposed model proved to be a powerful defense mechanism for IoT networks to face emerging security challenges.

2503.10186 2026-02-12 cs.MA cs.AI cs.GT math.DS

Convergence and Connectivity: Dynamics of Multi-Agent Q-Learning in Random Networks

Dan Leonte, Aamal Hussain, Raphael Huser, Francesco Belardinelli, Dario Paccagnan

详情
英文摘要

Beyond specific settings, many multi-agent learning algorithms fail to converge to an equilibrium solution, instead displaying complex, non-stationary behaviours such as recurrent or chaotic orbits. In fact, recent literature suggests that such complex behaviours are likely to occur when the number of agents increases. In this paper, we study Q-learning dynamics in network polymatrix normal-form games where the network structure is drawn from classical random graph models. In particular, we focus on the Erdős-Rényi model, which is used to analyze connectivity in distributed systems, and the Stochastic Block model, which generalizes the above by accounting for community structures that naturally arise in multi-agent systems. In each setting, we establish sufficient conditions under which the agents' joint strategies converge to a unique equilibrium. We investigate how this condition depends on the exploration rates, payoff matrices and, crucially, the probabilities of interaction between network agents. We validate our theoretical findings through numerical simulations and demonstrate that convergence can be reliably achieved in many-agent systems, provided interactions in the network are controlled.

2503.02437 2026-02-12 stat.ML cs.LG

Decentralized Reinforcement Learning for Multi-Agent Multi-Resource Allocation via Dynamic Cluster Agreements

Antonio Marino, Esteban Restrepo, Claudio Pacchierotti, Paolo Robuffo Giordano

Journal ref IEEE Robotics and Automation Letters, 2025, 10 (8), pp.8123-8130

详情
英文摘要

This paper addresses the challenge of allocating heterogeneous resources among multiple agents in a decentralized manner. Our proposed method, Liquid-Graph-Time Clustering-IPPO, builds upon Independent Proximal Policy Optimization (IPPO) by integrating dynamic cluster consensus, a mechanism that allows agents to form and adapt local sub-teams based on resource demands. This decentralized coordination strategy reduces reliance on global information and enhances scalability. We evaluate LGTC-IPPO against standard multi-agent reinforcement learning baselines and a centralized expert solution across a range of team sizes and resource distributions. Experimental results demonstrate that LGTC-IPPO achieves more stable rewards, better coordination, and robust performance even as the number of agents or resource types increases. Additionally, we illustrate how dynamic clustering enables agents to reallocate resources efficiently also for scenarios with discharging resources.

2502.14121 2026-02-12 stat.ML cs.AI cs.LG

Multi-Objective Bayesian Optimization for Networked Black-Box Systems: A Path to Greener Profits and Smarter Designs

Akshay Kudva, Wei-Ting Tang, Joel A. Paulson

详情
英文摘要

Designing modern industrial systems requires balancing several competing objectives, such as profitability, resilience, and sustainability, while accounting for complex interactions between technological, economic, and environmental factors. Multi-objective optimization (MOO) methods are commonly used to navigate these tradeoffs, but selecting the appropriate algorithm to tackle these problems is often unclear, particularly when system representations vary from fully equation-based (white-box) to entirely data-driven (black-box) models. While grey-box MOO methods attempt to bridge this gap, they typically impose rigid assumptions on system structure, requiring models to conform to the underlying structural assumptions of the solver rather than the solver adapting to the natural representation of the system of interest. In this chapter, we introduce a unifying approach to grey-box MOO by leveraging network representations, which provide a general and flexible framework for modeling interconnected systems as a series of function nodes that share various inputs and outputs. Specifically, we propose MOBONS, a novel Bayesian optimization-inspired algorithm that can efficiently optimize general function networks, including those with cyclic dependencies, enabling the modeling of feedback loops, recycle streams, and multi-scale simulations - features that existing methods fail to capture. Furthermore, MOBONS incorporates constraints, supports parallel evaluations, and preserves the sample efficiency of Bayesian optimization while leveraging network structure for improved scalability. We demonstrate the effectiveness of MOBONS through two case studies, including one related to sustainable process design. By enabling efficient MOO under general graph representations, MOBONS has the potential to significantly enhance the design of more profitable, resilient, and sustainable engineering systems.

2412.19441 2026-02-12 quant-ph cs.LG

Comparative Performance Analysis of Quantum Machine Learning Architectures for Credit Card Fraud Detection

Mansour El Alami, Nouhaila Innan, Muhammad Shafique, Mohamed Bennai

Comments 15 pages, 20 figures, 8 tables. Accepted in Applied Intelligence

Journal ref Appl Intell 56, 83 (2026)

详情
英文摘要

As financial fraud becomes increasingly complex, effective detection methods are essential. Quantum Machine Learning (QML) introduces certain capabilities that may enhance both accuracy and efficiency in this area. This study examines how different quantum feature maps and ansatz configurations affect the performance of three QML-based classifiers, the Variational Quantum Classifier (VQC), the Sampler Quantum Neural Network (SQNN), and the Estimator Quantum Neural Network (EQNN), when applied to two non-normalized financial fraud datasets. Different quantum feature map and ansatz configurations are evaluated, revealing distinct performance patterns. The VQC consistently demonstrates strong classification results, achieving an F1-score of 0.88, while the SQNN also delivers promising outcomes. In contrast, the EQNN struggles to produce robust results, emphasizing the challenges presented by non-standardized data. Statistical validation using ANOVA confirms the significance of observed performance differences. Additionally, robustness tests on the best-performing models under five quantum noise types show that they maintain competitive performance, supporting their practical applicability. These findings highlight the importance of careful model configuration in QML-based financial fraud detection. By showing how specific feature maps and ansatz choices influence predictive success, this work guides researchers and practitioners in refining QML approaches for complex financial applications.

2412.11702 2026-02-12 cs.AR cs.CV cs.DC cs.ET eess.IV

Flex-PE: Flexible and SIMD Multi-Precision Processing Element for AI Workloads

Mukul Lokhande, Gopal Raut, Santosh Kumar Vishvakarma

Comments 10 pages, 5 figures, Preprint, Submitted to TVLSI Regular papers

Journal ref IEEE Transactions on Very Large Scale Integration (VLSI) Systems, June 2025

详情
英文摘要

The rapid adaptation of data driven AI models, such as deep learning inference, training, Vision Transformers (ViTs), and other HPC applications, drives a strong need for runtime precision configurable different non linear activation functions (AF) hardware support. Existing solutions support diverse precision or runtime AF reconfigurability but fail to address both simultaneously. This work proposes a flexible and SIMD multiprecision processing element (FlexPE), which supports diverse runtime configurable AFs, including sigmoid, tanh, ReLU and softmax, and MAC operation. The proposed design achieves an improved throughput of up to 16X FxP4, 8X FxP8, 4X FxP16 and 1X FxP32 in pipeline mode with 100% time multiplexed hardware. This work proposes an area efficient multiprecision iterative mode in the SIMD systolic arrays for edge AI use cases. The design delivers superior performance with up to 62X and 371X reductions in DMA reads for input feature maps and weight filters in VGG16, with an energy efficiency of 8.42 GOPS / W within the accuracy loss of 2%. The proposed architecture supports emerging 4-bit computations for DL inference while enhancing throughput in FxP8/16 modes for transformers and other HPC applications. The proposed approach enables future energy-efficient AI accelerators in edge and cloud environments.

2406.13761 2026-02-12 math.NA cs.LG cs.NA math.OC

Exponential time differencing for matrix-valued dynamical systems

Nayef Shkeir, Tobias Grafke

详情
英文摘要

Matrix evolution equations occur in many applications, such as dynamical Lyapunov/Sylvester systems or Riccati equations in optimization and stochastic control, machine learning or data assimilation. In many such problems, the dominant stability restriction is imposed by a stiff linear term, making standard explicit integrators impractical. Exponential time differencing (ETD) is known to produce highly stable numerical schemes by treating the linear term in an exact fashion. In particular, for stiff problems, ETD methods are the methods of choice. We extend ETD to matrix-valued evolution equations of the form $\dot Q = LQ + QR + N(Q,t)$ by deriving explicit matrix-ETD (METD) schemes. When $L$ and $R$ commute, we construct an explicit $p$-th order METD$p$ family and prove order-$p$ global convergence under standard assumptions; for the non-commuting case, we develop a Baker-Campbell-Hausdorff (BCH)-based extension. This allows us to produce highly efficient and stable integration schemes. We demonstrate efficiency and applicability on stiff PDE-derived and large-scale matrix dynamics, including an Allen-Cahn system, turbulent jet fluctuation statistics, and continuous graph neural networks. We further show that the scheme is more accurate, stable, and efficient than competing schemes in large-scale high-rank stiff systems.

2406.01552 2026-02-12 stat.ML cs.AI cs.LG

Tensor learning with orthogonal, Lorentz, and symplectic symmetries

Wilson G. Gregory, Josué Tonelli-Cueto, Nicholas F. Marshall, Andrew S. Lee, Soledad Villar

Comments 40 pages, 1 figure. To appear at ICLR 2026

详情
英文摘要

Tensors are a fundamental data structure for many scientific contexts, such as time series analysis, materials science, and physics, among many others. Improving our ability to produce and handle tensors is essential to efficiently address problems in these domains. In this paper, we show how to exploit the underlying symmetries of functions that map tensors to tensors. More concretely, we develop universally expressive equivariant machine learning architectures on tensors that exploit that, in many cases, these tensor functions are equivariant with respect to the diagonal action of the orthogonal, Lorentz, and/or symplectic groups. We showcase our results on three problems coming from material science, theoretical computer science, and time series analysis. For time series, we combine our method with the increasingly popular path signatures approach, which is also invariant with respect to reparameterizations. Our numerical experiments show that our equivariant models perform better than corresponding non-equivariant baselines.