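arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.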

2602.05629 2026-03-26 cs.SE cs.CV

ROMAN: Reward-Orchestrated Multi-Head Attention Network for Autonomous Driving System Testing

Jianlei Chi, Yuzhen Wu, Jiaxuan Hou, Xiaodong Zhang, Ming Fan, Suhui Sun, Weijun Dai, Bo Li, Jianguo Sun, Jun Sun

Comments The manuscript includes 13 pages, 8 tables, and 7 figures

English abstract

The Automated Driving System (ADS) acts as the brain of an autonomous vehicle and is responsible for its safety and efficiency. Safe deployment requires thorough testing in diverse real-world scenarios and compliance with traffic laws such as speed limits, signal obedience, and right-of-way rules. Violations such as running red lights or speeding pose severe safety risks. However, current testing approaches face significant challenges: a limited ability to generate complex, high-risk law-breaking scenarios, and a failure to account for complex interactions involving multiple vehicles and critical situations. To address these challenges, we propose ROMAN, a novel scenario generation approach for ADS testing that combines a multi-head attention network with a traffic law weighting mechanism. ROMAN is designed to generate high-risk violation scenarios to enable more thorough and targeted ADS evaluation. The multi-head attention mechanism models interactions among vehicles, traffic signals, and other factors. The traffic law weighting mechanism implements a workflow that leverages an LLM-based risk weighting module to evaluate violations along two dimensions: severity and occurrence. We evaluated ROMAN by testing the Baidu Apollo ADS within the CARLA simulation platform and conducting extensive experiments to measure its performance. Experimental results demonstrate that ROMAN surpasses the state-of-the-art tools ABLE and LawBreaker, achieving a 7.91% higher average violation count than ABLE and a 55.96% higher count than LawBreaker, while also maintaining greater scenario diversity. In addition, only ROMAN successfully generated violation scenarios for every clause of the input traffic laws, enabling it to identify more high-risk violations than existing approaches.
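
The abstract describes the traffic law weighting only at a high level: violations are evaluated along severity and occurrence. As a rough sketch of how two such scores might be combined into per-clause weights, the Python below multiplies them and normalizes the result. The clause names, the 1-5 scales, and the multiplicative rule are all illustrative assumptions; the abstract does not say how ROMAN or its LLM-based module computes weights.

```python
def risk_weights(scores):
    """Turn {clause: (severity, occurrence)} into weights that sum to 1.

    Hypothetical rule: raw risk = severity * occurrence (a classic
    risk-matrix product). This is NOT taken from the ROMAN paper.
    """
    raw = {clause: sev * occ for clause, (sev, occ) in scores.items()}
    total = sum(raw.values())
    if total == 0:
        raise ValueError("at least one clause needs a non-zero risk score")
    return {clause: value / total for clause, value in raw.items()}


# Made-up scores on an assumed 1-5 scale: (severity, occurrence).
example_scores = {
    "run_red_light": (5, 2),
    "speeding": (3, 4),
    "fail_to_yield": (4, 1),
}
weights = risk_weights(example_scores)
```

With these made-up inputs, a frequent moderate violation (speeding) receives a larger weight than a rare severe one, which is the kind of trade-off a two-dimensional score makes explicit.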

2512.15829 2026-03-26 cs.ET cs.AI cs.CV cs.NE

Physics-driven human-like working memory outperforms digital networks in dynamic vision

Jingli Liu, Huannan Zheng, Bohao Zou, Kezhou Yang

2511.20888 2026-03-26 stat.ML cs.CC cs.LG

Deep Learning as a Convex Paradigm of Computation: Minimizing Circuit Size with ResNets

Arthur Jacot

2510.11269 2026-03-26 cs.NI cs.AI

From Prompts to Packets: A View from the Network on ChatGPT, Copilot, and Gemini

Antonio Montieri, Alfredo Nascita, Antonio Pescapè

Comments 15 pages, 8 figures, 2 tables, 4 research questions, accepted on Elsevier Computer Networks

DOI: 10.1016/j.comnet.2026.112237

English abstract

GenAI chatbots are now pervasive in digital ecosystems, fundamentally reshaping user interactions over the Internet. Their reliance on an always-online, cloud-centric operating model introduces novel traffic dynamics that challenge practical network management. Despite the critical need to anticipate these changes in network demand, the traffic characterization of these chatbots remains largely underexplored. To fill this gap, this study presents an in-depth traffic analysis of ChatGPT, Copilot, and Gemini used via Android mobile apps. Using a dedicated capture architecture, we collect two complementary datasets, combining unconstrained user interactions with a controlled workload of selected prompts for both text and image generation. This dual design allows us to address practical research questions on the distinctiveness of chatbot traffic, its divergence from that of conventional messaging apps, and its novel implications for network usage. To this end, we provide a multi-granular traffic characterization and model packet-sequence dynamics to uncover the underlying transmission mechanisms. Our analysis reveals app-/content-specific traffic patterns and distinctive protocol footprints. We highlight the predominance of TLS, with Gemini extensively leveraging QUIC, ChatGPT exclusively using TLS 1.3, and characteristic Server Name Indication (SNI) values. Through occlusion analysis, we quantify the reliance on SNI for traffic visibility, demonstrating that masking this field reduces classification performance by up to 20 percentage points. Finally, the comparison with conventional messaging apps confirms that GenAI workloads introduce novel stress factors, such as sustained upstream activity and high-rate bursts, with direct implications for capacity planning and network management. We publicly release the datasets to support reproducibility and foster extensions to other use cases.
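
Occlusion analysis, as mentioned in the abstract, measures how much a classifier depends on a field by masking it and comparing accuracy before and after. The sketch below shows the idea on toy data. The rule-based classifier, the hostnames, and the flows are hypothetical placeholders, not the authors' model or dataset, and the resulting numbers have no relation to the reported figure.

```python
def accuracy(classifier, samples):
    """Fraction of (features, label) samples the classifier gets right."""
    correct = sum(1 for features, label in samples if classifier(features) == label)
    return correct / len(samples)


def occlude(samples, field, mask_value=""):
    """Return a copy of the samples with one feature replaced by a mask value."""
    return [({**features, field: mask_value}, label) for features, label in samples]


def toy_classifier(features):
    """Hypothetical rule-based stand-in for a trained traffic classifier."""
    sni = features["sni"]
    if "chatgpt" in sni:
        return "ChatGPT"
    if "gemini" in sni:
        return "Gemini"
    # Fallback when SNI is uninformative: guess from the transport protocol.
    return "Gemini" if features["proto"] == "QUIC" else "ChatGPT"


# Made-up flows; hostnames and labels are placeholders, not dataset values.
flows = [
    ({"sni": "chatgpt.example", "proto": "TLS1.3"}, "ChatGPT"),
    ({"sni": "gemini.example", "proto": "QUIC"}, "Gemini"),
    ({"sni": "gemini.example", "proto": "TLS1.3"}, "Gemini"),
    ({"sni": "copilot.example", "proto": "TLS1.3"}, "Copilot"),
]

baseline = accuracy(toy_classifier, flows)
masked = accuracy(toy_classifier, occlude(flows, "sni"))
drop_pp = (baseline - masked) * 100  # accuracy drop in percentage points
```

Reporting the drop in percentage points (an absolute difference between two accuracies) rather than as a relative percentage matches the wording of the abstract.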

2510.04607 2026-03-26 cs.OS cs.AI cs.LG

From Imperative to Declarative: Towards LLM-friendly OS Interfaces for Boosted Computer-Use Agents

Yuan Wang, Mingyu Li, Haibo Chen

2509.03394 2026-03-26 cs.DC cs.LG cs.PF

CloudFormer: An Attention-based Performance Prediction for Public Clouds with Unknown Workload

Amirhossein Shahbazinia, Darong Huang, Luis Costero, David Atienza

English abstract

Cloud platforms are increasingly relied upon to host diverse, resource-intensive workloads due to their scalability, flexibility, and cost-efficiency. In multi-tenant cloud environments, virtual machines are consolidated on shared physical servers to improve resource utilization. While virtualization guarantees resource partitioning for CPU, memory, and storage, it cannot ensure performance isolation. Competition for shared resources such as last-level cache, memory bandwidth, and network interfaces often leads to severe performance degradation. Existing management techniques, including VM scheduling and resource provisioning, require accurate performance prediction to mitigate interference. However, this remains challenging in public clouds due to the black-box nature of VMs and the highly dynamic nature of workloads. To address these limitations, we propose CloudFormer, a dual-branch Transformer-based model designed to predict VM performance degradation in black-box environments. CloudFormer jointly models temporal dynamics and system-level interactions, leveraging 206 system metrics at one-second resolution across both static and dynamic scenarios. This design enables the model to capture transient interference effects and adapt to varying workload conditions without scenario-specific tuning. Complementing the methodology, we provide a fine-grained dataset that significantly expands the temporal resolution and metric diversity compared to existing benchmarks. Experimental results demonstrate that CloudFormer consistently outperforms state-of-the-art baselines across multiple evaluation metrics, achieving robust generalization across diverse and previously unseen workloads. Notably, CloudFormer attains a mean absolute error (MAE) of just 7.8%, representing a substantial improvement in predictive accuracy and outperforming existing methods by at least 28%.
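
For readers unfamiliar with the metric, the sketch below computes a mean absolute error and the relative error reduction of one predictor over another. All values are made up for illustration, and reading the reported 28% as a relative reduction in error is an assumption; the abstract does not define it.

```python
def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def relative_improvement(baseline_error, model_error):
    """Fraction by which the model's error is lower than the baseline's."""
    return (baseline_error - model_error) / baseline_error


# Made-up performance-degradation values, in percent.
observed = [10.0, 25.0, 40.0, 5.0]
model_pred = [12.0, 20.0, 43.0, 5.0]
baseline_pred = [15.0, 15.0, 45.0, 10.0]

model_mae = mean_absolute_error(observed, model_pred)
baseline_mae = mean_absolute_error(observed, baseline_pred)
improvement = relative_improvement(baseline_mae, model_mae)
```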

2507.22171 2026-03-26 cs.CR cs.AI

Enhancing Jailbreak Attacks on LLMs via Persona Prompts

Zheng Zhang, Peilin Zhao, Deheng Ye, Hao Wang

Comments Workshop on LLM Persona Modeling at NeurIPS 2025

2507.00629 2026-03-26 cond-mat.dis-nn cs.LG math.PR math.ST stat.TH

Generalization performance of narrow one-hidden layer networks in the teacher-student setting

Rodrigo Pérez Ortiz, Gibbs Nwemadji, Jean Barbier, Federica Gerace, Alessandro Ingrosso, Clarissa Lauditi, Enrico M. Malatesta

Comments 37 pages, 7 figures

2506.20334 2026-03-26 eess.SY cs.LG cs.SY

Recurrent neural network-based robust control systems with regional properties and application to MPC design

Daniele Ravasio, Alessio La Bella, Marcello Farina, Andrea Ballarino

Comments 27 pages, 5 figures

2505.00574 2026-03-26 cond-mat.mtrl-sci cs.LG

Transition States Energies from Machine Learning: An Application to Reverse Water-Gas Shift on Single-Atom Alloys

Raffaele Cheula, Mie Andersen

DOI: 10.1021/acscatal.5c02818
Journal ref: ACS Catalysis 2025

English abstract

Obtaining accurate transition state (TS) energies is a bottleneck in computational screening of complex materials and reaction networks due to the high cost of TS search methods and first-principles methods such as density functional theory (DFT). Here we propose a machine learning (ML) model for predicting TS energies based on Gaussian process regression with the Wasserstein Weisfeiler-Lehman graph kernel (WWL-GPR). Applying the model to predict adsorption and TS energies for the reverse water-gas shift (RWGS) reaction on single-atom alloy (SAA) catalysts, we show that it can significantly improve the accuracy compared to traditional approaches based on scaling relations or ML models without a graph representation. Further benefitting from the low cost of model training, we train an ensemble of WWL-GPR models to obtain uncertainties through subsampling of the training data and show how these uncertainties propagate to turnover frequency (TOF) predictions through the construction of an ensemble of microkinetic models. Comparing the errors in model-based vs DFT-based TOF predictions, we show that the WWL-GPR model reduces errors by almost an order of magnitude compared to scaling relations. This demonstrates the critical impact of accurate energy predictions on catalytic activity estimation. Finally, we apply our model to screen new materials, identifying promising catalysts for RWGS. This work highlights the power of combining advanced ML techniques with DFT and microkinetic modeling for screening catalysts for complex reactions like RWGS, providing a robust framework for future catalyst design.

2504.13868 2026-03-26 cs.HC cs.AI

Diverse AI Personas Can Mitigate the Homogenization Effect in Human-AI Collaborative Ideation

Yun Wan, Yoram M Kalman

DOI: 10.1016/j.chbah.2026.100289
Journal ref: Computers in Human Behavior: Artificial Humans, 2026

English abstract

Recent studies suggest that while generative AI (GenAI) can enhance individual creativity, it often reduces the diversity of collective outputs. A well-known demonstration of this homogenization effect is the study by Doshi and Hauser (2024), who found that GenAI-generated plot ideas improved story writing creativity but led to convergence across writers' outputs. This study extends their experiment, identifying the design choices behind the apparent creativity-diversity trade-off. In Phase 1, we used structured prompting with 10 diverse GenAI personas to generate 300 story plots, and confirmed the plots' diversity using text embedding analysis. In Phase 2, participants wrote stories with or without access to these plots. Results show that diverse GenAI inputs can preserve story diversity compared to a human-only baseline, with some evidence of enhancement in the 1-plot condition. Beyond addressing the diversity component of the trade-off, our findings offer broader insights for human-AI system design. Our findings suggest that the trade-off may emerge from uniform deployment practices rather than from an inherent limitation of GenAI, and that diversity can be intentionally built into AI-mediated collaboration. Our study highlights the risks of over-standardization, the importance of prompt variation, and the value of treating GenAI not as a static tool but as a configurable partner. These insights have important implications for the design of GenAI systems that support, not constrain, collective creativity.
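
One common way to quantify the diversity of a set of texts from their embeddings is the mean pairwise cosine distance. The sketch below illustrates that measure on toy 2-D vectors. Whether the authors used this particular measure is not stated in the abstract, so treat it as one plausible choice, and treat the vectors as placeholders for real text embeddings.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 - cosine similarity; 0 for parallel vectors, 2 for opposite ones."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def mean_pairwise_distance(vectors):
    """Average cosine distance over all unordered pairs of vectors."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        raise ValueError("need at least two vectors")
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Toy "embeddings": one tightly clustered set and one spread-out set.
similar_plots = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.0]]
varied_plots = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]

similar_score = mean_pairwise_distance(similar_plots)
varied_score = mean_pairwise_distance(varied_plots)
```

A higher score means the items point in more different directions in embedding space, so the spread-out set scores higher than the clustered one.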

2504.09271 2026-03-26 cs.HC cs.AI cs.CL cs.SI

Linguistic Comparison of AI- and Human-Written Responses to Online Mental Health Queries

Koustuv Saha, Yoshee Jain, Violeta J. Rodriguez, Munmun De Choudhury

2504.05296 2026-03-26 cs.GR cs.CV

Let it Snow! Animating 3D Gaussian Scenes with Dynamic Weather Effects via Physics-Guided Score Distillation

Gal Fiebelman, Hadar Averbuch-Elor, Sagie Benaim

Comments Accepted to CVPR 2026. Project webpage: https://galfiebelman.github.io/let-it-snow/

2504.01924 2026-03-26 cs.GR cs.LG

Gen-C: Populating Virtual Worlds with Generative Crowds

Andreas Panayiotou, Panayiotis Charalambous, Ioannis Karamouzas

Comments 13 pages

2502.03377 2026-03-26 cs.NI cs.LG

Energy-Efficient UAV-assisted LoRa Gateways: A Multi-Agent Optimization Approach

Abdullahi Isa Ahmed, Jamal Bentahar, El Mehdi Amhoud

Comments 6 pages, 5 figures, 2 tables

2502.02861 2026-03-26 stat.ML cs.DS cs.LG

Algorithms with Calibrated Machine Learning Predictions

Judy Hanwen Shen, Ellen Vitercik, Anders Wikum

Comments Matches the camera-ready version accepted at ICML 2025

2412.05450 2026-03-26 cs.GT cs.AI nlin.AO q-bio.PE

Promoting Cooperation in the Public Goods Game using Artificial Intelligent Agents

Arend Hintze, Christoph Adami

Comments 16 pages, 6 figures

2405.17573 2026-03-26 stat.ML cs.AI cs.LG

Hamiltonian Mechanics of Feature Learning: Bottleneck Structure in Leaky ResNets

Arthur Jacot, Alexandre Kaiser

2404.04265 2026-03-26 cs.IR cs.LG

Accelerating Matrix Factorization by Dynamic Pruning for Fast Recommendation

Yining Wu, Shengyu Duan, Gaole Sai, Chenhong Cao, Guobing Zou

English abstract

Matrix factorization (MF) is a widely used collaborative filtering (CF) algorithm for recommendation systems (RSs), due to its high prediction accuracy, great flexibility, and high efficiency in big data processing. However, with the dramatic increase in the number of users/items in current RSs, the computational complexity of training an MF model has grown substantially. Many existing works accelerate MF either by adding computational resources or by using parallel systems, both of which incur a large cost. In this paper, we propose algorithmic methods to accelerate MF without requiring any additional computational resources. Specifically, we observe fine-grained structured sparsity in the decomposed feature matrices when a certain threshold is applied. This fine-grained structured sparsity causes a large number of unnecessary operations during both matrix multiplication and latent factor update, increasing the computation time of the MF training process. Based on this observation, we first propose to rearrange the feature matrices based on joint sparsity, which potentially makes a latent vector with a smaller index denser than one with a larger index. The feature matrix rearrangement limits the error caused by the pruning process performed later. We then propose to prune the insignificant latent factors through an early stopping process during both matrix multiplication and latent factor update. The pruning is performed dynamically according to the sparsity of the latent factors for different users/items, to accelerate the process. The experiments show that our method can achieve speedups of 1.2 to 1.65 times, with up to a 20.08% increase in error, compared with the conventional MF training process. We also prove that the proposed methods are applicable under different hyperparameters, including the optimizer, optimization strategy, and initialization method.
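
The two ideas in the abstract, rearranging latent dimensions by joint sparsity and then stopping a dot product early, can be illustrated with a small sketch. The function names, the toy matrices, the threshold, and the exact stopping rule (stop at the first factor whose magnitude falls below the threshold) are assumptions made for illustration; the abstract does not give the authors' precise criteria.

```python
def joint_sparsity_order(P, Q, threshold):
    """Order latent indices so the least sparse dimensions come first.

    A dimension's joint sparsity is counted here as the number of entries in
    that column of P and Q whose magnitude is below the threshold.
    """
    k = len(P[0])
    counts = [
        sum(1 for row in P if abs(row[j]) < threshold)
        + sum(1 for row in Q if abs(row[j]) < threshold)
        for j in range(k)
    ]
    return sorted(range(k), key=lambda j: counts[j])


def rearrange(M, order):
    """Permute the columns (latent dimensions) of a feature matrix."""
    return [[row[j] for j in order] for row in M]


def pruned_dot(p, q, threshold):
    """Dot product that stops at the first insignificant latent factor."""
    total = 0.0
    for a, b in zip(p, q):
        if abs(a) < threshold or abs(b) < threshold:
            break  # hypothetical early-stopping rule
        total += a * b
    return total


# Toy feature matrices: rows are users (P) or items (Q), columns are latent factors.
P = [[0.01, 0.9, 0.5], [0.02, 0.8, 0.0]]
Q = [[0.0, 0.7, 0.6], [0.03, 0.5, 0.4]]
t = 0.05

order = joint_sparsity_order(P, Q, t)
P_r, Q_r = rearrange(P, order), rearrange(Q, order)

exact = sum(a * b for a, b in zip(P[0], Q[1]))
approx = pruned_dot(P_r[0], Q_r[1], t)
unordered = pruned_dot(P[0], Q[1], t)
```

In this toy case the rearranged, pruned product skips only the last (near-zero) factor and stays close to the exact value, whereas pruning without rearrangement stops at the first index and discards the whole product. That is the motivation the abstract gives for rearranging before pruning.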

2402.08151 2026-03-26 stat.ME cs.AI cs.LG math.SP math.ST stat.TH

Perturbative adaptive importance sampling for Bayesian LOO cross-validation

Joshua C Chang, Xiangting Li, Tianyi Su, Shixin Xu, Hao-Ren Yao, Julia Porcino, Carson Chow

Comments Submitted

2312.00357 2026-03-26 eess.IV cs.CV cs.LG

A Generalizable Deep Learning System for Cardiac MRI

Rohan Shad, Cyril Zakka, Dhamanpreet Kaur, Mrudang Mathur, Robyn Fong, Joseph Cho, Ross Warren Filice, John Mongan, Kimberly Kalianos, Nishith Khandwala, David Eng, Matthew Leipzig, Walter R. Witschey, Alejandro de Feria, Victor A. Ferrari, Euan A. Ashley, Michael A. Acker, Curtis Langlotz, William Hiesinger

Comments Published in Nature Biomedical Engineering; Supplementary Appendix available on publisher website. Code: https://github.com/rohanshad/cmr_transformer

2306.17466 2026-03-26 eess.IV cs.CV

MedAugment: Universal Automatic Data Augmentation Plug-in for Medical Image Analysis

Zhaoshan Liu, Qiujie Lv, Yifan Li, Ziduo Yang, Lei Shen

Comments Accepted at Knowledge-Based Systems

2603.24109 2026-03-26 eess.IV cs.AI cs.CV

Comparative analysis of dual-form networks for live land monitoring using multi-modal satellite image time series

Iris Dumeur, Jérémy Anger, Gabriele Facciolo

2603.24054 2026-03-26 cs.DB cs.IR cs.LG

Hierarchical Spatial-Temporal Graph-Enhanced Model for Map-Matching

Anjun Gao, Zhenglin Wan, Pingfu Chao, Shunyu Yao

DOI: 10.1007/978-981-96-1242-0_4
Journal ref: Gao, A., Wan, Z., Chao, P., Yao, S. (2025). Hierarchical Spatial-Temporal Graph-Enhanced Model for Map-Matching. In: Databases Theory and Applications. ADC 2024. Lecture Notes in Computer Science, vol 15449. Springer, Singapore

English abstract

The integration of GNSS data into portable devices has led to the generation of vast amounts of trajectory data, which is crucial for applications such as map-matching. To overcome the limitations of rule-based methods, recent work has applied deep learning to trajectory-related tasks. However, existing models still face challenges such as the difficulty of large-scale data labeling, ineffective modeling of spatial-temporal relationships, and discrepancies between training and test data distributions. To tackle these challenges, we propose HSTGMatch, a novel model designed to enhance map-matching performance. Our approach involves a two-stage process: hierarchical self-supervised learning and spatial-temporal supervised learning. We introduce a hierarchical trajectory representation, leveraging both grid cells and geographic tuples to capture moving patterns effectively. The model constructs an Adaptive Trajectory Adjacency Graph to dynamically capture spatial relationships, optimizing GATs for improved efficiency. Furthermore, we incorporate a Spatial-Temporal Factor to extract relevant features and employ a decay coefficient to address variations in trajectory length. Our extensive experiments demonstrate the model's superior performance, module effectiveness, and robustness, providing a promising solution for overcoming the existing limitations in map-matching applications. The source code of HSTGMatch is publicly available on GitHub at https://github.com/Nerooo-g/HSTGMatch.

2603.24041 2026-03-26 stat.ME cs.LG

Minimal Sufficient Representations for Self-interpretable Deep Neural Networks

Zhiyao Tan, Liu Li, Huazhen Lin

2603.24038 2026-03-26 eess.AS cs.SD

ACAVCaps: Enabling large-scale training for fine-grained and diverse audio understanding

Yadong Niu, Tianzi Wang, Heinrich Dinkel, Xingwei Sun, Jiahao Zhou, Gang Li, Jizhong Liu, Junbo Zhang, Jian Luan

Comments accepted by ICASSP 2026

2603.23990 2026-03-26 cs.CY cs.AI

From Untamed Black Box to Interpretable Pedagogical Orchestration: The Ensemble of Specialized LLMs Architecture for Adaptive Tutoring

Nizam Kadir

Comments Accepted as a FULL paper at the 27th International Conference on Artificial Intelligence in Education (AIED 2026). 15 pages, 4 figures, 4 tables

2603.23974 2026-03-26 physics.optics cs.CV cs.ET cs.LG physics.data-an

Machine vision with small numbers of detected photons per inference

Shi-Yuan Ma, Jérémie Laydevant, Mandar M. Sohoni, Logan G. Wright, Tianyu Wang, Peter L. McMahon

Comments 98 pages, 34 figures

2603.23943 2026-03-26 cond-mat.mtrl-sci cs.LG

ChargeFlow: Flow-Matching Refinement of Charge-Conditioned Electron Densities

Tri Minh Nguyen, Sherif Abdulkader Tawfik, Truyen Tran, Svetha Venkatesh

2603.23933 2026-03-26 cs.GR cs.CL cs.CV cs.LG

ORACLE: Orchestrate NPC Daily Activities using Contrastive Learning with Transformer-CVAE

Seong-Eun Hong, JuYeong Hwang, RyunHa Lee, HyeongYeop Kang

Comments 17 pages, 7 figures. Accepted to CVM 2026