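arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.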

2603.23076 2026-03-25 cs.LG

MsFormer: Enabling Robust Predictive Maintenance Services for Industrial Devices

Jiahui Zhou, Dan Li, Ruibing Jin, Jian Lou, Yanran Zhao, Zhenghua Chen, Zigui Jiang, See-Kiong Ng

Abstract

Providing reliable predictive maintenance is a critical industrial AI service essential for ensuring the high availability of manufacturing devices. Existing deep-learning methods present competitive results on such tasks but lack a general service-oriented framework to capture complex dependencies in industrial IoT sensor data. While Transformer-based models show strong sequence modeling capabilities, their direct deployment as robust AI services faces significant bottlenecks. Specifically, streaming sensor data collected in real-world service environments often exhibits multi-scale temporal correlations driven by machine working principles. Besides, the datasets available for training time-to-failure predictive services are typically limited in size. These issues pose significant challenges for directly applying existing models as robust predictive services. To address these challenges, we propose MsFormer, a lightweight Multi-scale Transformer designed as a unified AI service model for reliable industrial predictive maintenance. MsFormer incorporates a Multi-scale Sampling (MS) module and a tailored position encoding mechanism to capture sequential correlations across multi-streaming service data. Additionally, to accommodate data-scarce service environments, MsFormer adopts a lightweight attention mechanism with straightforward pooling operations instead of self-attention. Extensive experiments on real-world datasets demonstrate that the proposed framework achieves significant performance improvements over state-of-the-art methods. Furthermore, MsFormer outperforms across industrial devices and operating conditions, demonstrating strong generalizability while maintaining a highly reliable Quality of Service (QoS).
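The two ideas named in the abstract above, sampling a sensor stream at several temporal scales and mixing neighbouring positions by pooling rather than self-attention, can be illustrated with a minimal sketch. The abstract does not specify how the MS module or the pooling operation is defined, so the stride values, the window size, and the function names `multi_scale_sample` and `avg_pool_mix` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Sequence


def multi_scale_sample(series: List[float],
                       strides: Sequence[int] = (1, 2, 4)) -> List[List[float]]:
    """Return one subsampled view of the series per stride.

    The strides are hypothetical; the abstract does not list the scales used.
    """
    return [series[::s] for s in strides]


def avg_pool_mix(tokens: List[float], window: int = 3) -> List[float]:
    """Mix each position with its neighbours by average pooling.

    Output has the same length as the input; windows are clipped at the edges.
    This stands in for the attention step and has no learned parameters.
    """
    half = window // 2
    mixed = []
    for i in range(len(tokens)):
        lo, hi = max(0, i - half), min(len(tokens), i + half + 1)
        mixed.append(sum(tokens[lo:hi]) / (hi - lo))
    return mixed


# Toy sensor readings (made up for the example).
readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
views = multi_scale_sample(readings)
mixed_views = [avg_pool_mix(v) for v in views]
```

Pooling of this kind costs time linear in the sequence length and adds no trainable weights, which is one plausible reading of why a pooling-based mixer might suit small training sets.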

2603.23072 2026-03-25 cs.LG cs.NA math.AP math.NA

Generalization Bounds for Physics-Informed Neural Networks for the Incompressible Navier-Stokes Equations

Sebastien Andre-Sloan, Dibyakanti Kumar, Alejandro F Frangi, Anirbit Mukherjee

2603.23071 2026-03-25 cs.CV

PolarAPP: Beyond Polarization Demosaicking for Polarimetric Applications

Yidong Luo, Chenggong Li, Yunfeng Song, Ping Wang, Boxin Shi, Junchao Zhang, Xin Yuan

2603.23059 2026-03-25 cs.AI

Minibal: Balanced Game-Playing Without Opponent Modeling

Quentin Cohen-Solal, Tristan Cazenave

2603.23048 2026-03-25 cs.SD cs.AI

MSR-HuBERT: Self-supervised Pre-training for Adaptation to Multiple Sampling Rates

Zikang Huang, Meng Ge, Tianrui Wang, Xuanchen Li, Xiaobao Wang, Longbiao Wang, Jianwu Dang

2603.23047 2026-03-25 cs.CL cs.AI cs.CE

Parametric Knowledge and Retrieval Behavior in RAG Fine-Tuning for Electronic Design Automation

Julian Oestreich, Maximilian Bley, Frank Binder, Lydia Müller, Maksym Sydorenko, André Alcalde

2603.23044 2026-03-25 cs.RO

Learning Actuator-Aware Spectral Submanifolds for Precise Control of Continuum Robots

Paul Leonard Wolff, Hugo Buurmeijer, Luis Pabon, John Irvin Alora, Mark Leone, Roshan S. Kaundinya, Amirhossein Kazemipour, Robert K. Katzschmann, Marco Pavone

2603.23041 2026-03-25 cs.CV cs.AI cs.LG

HUydra: Full-Range Lung CT Synthesis via Multiple HU Interval Generative Modelling

António Cardoso, Pedro Sousa, Tania Pereira, Hélder P. Oliveira

Comments Submitted to IEEE TPAMI (Transactions on Pattern Analysis and Machine Intelligence)

2603.23037 2026-03-25 cs.CV cs.AI cs.CL cs.LG cs.RO

YOLOv10 with Kolmogorov-Arnold networks and vision-language foundation models for interpretable object detection and trustworthy multimodal AI in computer vision perception

Marios Impraimakis, Daniel Vazquez, Feiyu Zhou

Comments 14 pages, 23 Figures, 6 Tables

2603.23034 2026-03-25 cs.CV

Traffic Sign Recognition in Autonomous Driving: Dataset, Benchmark, and Field Experiment

Guoyang Zhao, Weiqing Qi, Kai Zhang, Chenguang Zhang, Zeying Gong, Zhihai Bi, Kai Chen, Benshan Ma, Ming Liu, Jun Ma

Abstract

Traffic Sign Recognition (TSR) is a core perception capability for autonomous driving, where robustness to cross-region variation, long-tailed categories, and semantic ambiguity is essential for reliable real-world deployment. Despite steady progress in recognition accuracy, existing traffic sign datasets and benchmarks offer limited diagnostic insight into how different modeling paradigms behave under these practical challenges. We present TS-1M, a large-scale and globally diverse traffic sign dataset comprising over one million real-world images across 454 standardized categories, together with a diagnostic benchmark designed to analyze model capability boundaries. Beyond standard train-test evaluation, we provide a suite of challenge-oriented settings, including cross-region recognition, rare-class identification, low-clarity robustness, and semantic text understanding, enabling systematic and fine-grained assessment of modern TSR models. Using TS-1M, we conduct a unified benchmark across three representative learning paradigms: classical supervised models, self-supervised pretrained models, and multimodal vision-language models (VLMs). Our analysis reveals consistent paradigm-dependent behaviors, showing that semantic alignment is a key factor for cross-region generalization and rare-category recognition, while purely visual models remain sensitive to appearance shift and data imbalance. Finally, we validate the practical relevance of TS-1M through real-scene autonomous driving experiments, where traffic sign recognition is integrated with semantic reasoning and spatial localization to support map-level decision constraints. Overall, TS-1M establishes a reference-level diagnostic benchmark for TSR and provides principled insights into robust and semantic-aware traffic sign perception. Project page: https://guoyangzhao.github.io/projects/ts1m.

2603.23030 2026-03-25 cs.CV cs.AI

Looking Beyond the Window: Global-Local Aligned CLIP for Training-free Open-Vocabulary Semantic Segmentation

ByeongCheol Lee, Hyun Seok Seong, Sangeek Hyun, Gilhan Park, WonJun Moon, Jae-Pil Heo

Comments 18 pages, 13 figures, 12 tables, Accepted to CVPR 2026

2603.23023 2026-03-25 cs.CV

Cog3DMap: Multi-View Vision-Language Reasoning with 3D Cognitive Maps

Chanyoung Gwak, Yoonwoo Jeong, Byungwoo Jeon, Hyunseok Lee, Jinwoo Shin, Minsu Cho

Comments Project Page: https://cog3dmap.github.io

2603.23020 2026-03-25 cs.CV cs.AI

Concept-based explanations of Segmentation and Detection models in Natural Disaster Management

Samar Heydari, Jawher Said, Galip Ümit Yolcu, Evgenii Kortukov, Elena Golimblevskaia, Evgenios Vlachos, Vasileios Mygdalis, Ioannis Pitas, Sebastian Lapuschkin, Leila Arras

Comments 8 pages, 4 figures

2603.23016 2026-03-25 cs.LG cs.AI

A Sobering Look at Tabular Data Generation via Probabilistic Circuits

Davide Scassola, Dylan Ponsford, Adrián Javaloy, Sebastiano Saccani, Luca Bortolussi, Henry Gouk, Antonio Vergari

2603.23013 2026-03-25 cs.CL

Knowledge Access Beats Model Size: Memory Augmented Routing for Persistent AI Agents

Xunzhuo Liu, Bowei He, Xue Liu, Andy Luo, Haichen Zhang, Huamin Chen

Abstract

Production AI agents frequently receive user-specific queries that are highly repetitive, with up to 47% being semantically similar to prior interactions, yet each query is typically processed with the same computational cost. We argue that this redundancy can be exploited through conversational memory, transforming repetition from a cost burden into an efficiency advantage. We propose a memory-augmented inference framework in which a lightweight 8B-parameter model leverages retrieved conversational context to answer all queries via a low-cost inference path. Without any additional training or labeled data, this approach achieves 30.5% F1, recovering 69% of the performance of a full-context 235B model while reducing effective cost by 96%. Notably, a 235B model without memory (13.7% F1) underperforms even the standalone 8B model (15.4% F1), indicating that for user-specific queries, access to relevant knowledge outweighs model scale. We further analyze the role of routing and confidence. At practical confidence thresholds, routing alone already directs 96% of queries to the small model, but yields poor accuracy (13.0% F1) due to confident hallucinations. Memory does not substantially alter routing decisions; instead, it improves correctness by grounding responses in retrieved user-specific information. As conversational memory accumulates over time, coverage of recurring topics increases, further narrowing the performance gap. We evaluate on 152 LoCoMo questions (Qwen3-8B/235B) and 500 LongMemEval questions. Incorporating hybrid retrieval (BM25 + cosine similarity) improves performance by an additional +7.7 F1, demonstrating that retrieval quality directly enhances end-to-end system performance. Overall, our results highlight that memory, rather than model size, is the primary driver of accuracy and efficiency in persistent AI agents.
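The hybrid retrieval mentioned in the abstract above (BM25 + cosine similarity) can be sketched roughly as follows. The abstract does not say how the two scores are combined or what vectors the cosine similarity is computed over, so the bag-of-words cosine, the max-normalised BM25, the mixing weight `alpha`, and all function names here are assumptions for illustration only.

```python
import math
from collections import Counter
from typing import List


def tokenize(text: str) -> List[str]:
    """Lowercase whitespace tokenizer (a deliberate simplification)."""
    return text.lower().split()


def bm25_scores(query: str, docs: List[str],
                k1: float = 1.5, b: float = 0.75) -> List[float]:
    """Okapi BM25 score of each document for the query."""
    doc_tokens = [tokenize(d) for d in docs]
    n_docs = len(docs)
    avg_len = sum(len(t) for t in doc_tokens) / n_docs
    # Document frequency: number of documents containing each term.
    df = Counter(term for t in doc_tokens for term in set(t))
    scores = []
    for tokens in doc_tokens:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1.0)
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


def cosine_scores(query: str, docs: List[str]) -> List[float]:
    """Cosine similarity over term-count vectors (assumed stand-in for embeddings)."""
    q = Counter(tokenize(query))
    q_norm = math.sqrt(sum(v * v for v in q.values()))
    out = []
    for d in docs:
        c = Counter(tokenize(d))
        dot = sum(q[t] * c[t] for t in q)
        d_norm = math.sqrt(sum(v * v for v in c.values()))
        out.append(dot / (q_norm * d_norm) if q_norm and d_norm else 0.0)
    return out


def hybrid_rank(query: str, docs: List[str], alpha: float = 0.5) -> List[int]:
    """Rank document indices by alpha * normalised BM25 + (1 - alpha) * cosine."""
    bm25 = bm25_scores(query, docs)
    top = max(bm25) or 1.0  # avoid dividing by zero when nothing matches
    cos = cosine_scores(query, docs)
    combined = [alpha * s / top + (1 - alpha) * c for s, c in zip(bm25, cos)]
    return sorted(range(len(docs)), key=lambda i: combined[i], reverse=True)


# Toy conversational memory (made up for the example).
memory = [
    "user prefers vegetarian food",
    "user lives in Berlin",
    "user has a dog named Rex",
]
ranking = hybrid_rank("what city does the user live in", memory)
```

In a memory-augmented setup of the kind the abstract describes, the top-ranked entries would be prepended to the small model's prompt; that step is omitted here.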

2603.23010 2026-03-25 cs.CV

Zero-Shot Personalization of Objects via Textual Inversion

Aniket Roy, Maitreya Suin, Rama Chellappa

2603.23004 2026-03-25 cs.AI cs.LG

Can Large Language Models Reason and Optimize Under Constraints?

Fabien Bernier, Salah Ghamizi, Pantelis Dogoulis, Maxime Cordy

2603.23003 2026-03-25 cs.AI

On the use of Aggregation Operators to improve Human Identification using Dental Records

Antonio D. Villegas-Yeguas, Guillermo R-García, Tzipi Kahana, Jorge Pinares Toledo, Esi Sharon, Oscar Ibañez, Oscar Cordón

2603.22998 2026-03-25 cs.CV

VQ-Jarvis: Retrieval-Augmented Video Restoration Agent with Sharp Vision and Fast Thought

Xuanyu Zhang, Weiqi Li, Qunliang Xing, Jingfen Xie, Bin Chen, Junlin Li, Li Zhang, Jian Zhang, Shijie Zhao

Comments Video restoration, Agent-based restoration

2603.22991 2026-03-25 cs.CV

VLA-IAP: Training-Free Visual Token Pruning via Interaction Alignment for Vision-Language-Action Models

Jintao Cheng, Haozhe Wang, Weibin Li, Gang Wang, Yipu Zhang, Xiaoyu Tang, Jin Wu, Xieyuanli Chen, Yunhui Liu, Wei Zhang

Comments 27 pages, 8 figures

2603.22988 2026-03-25 cs.LG

Robustness Quantification and Uncertainty Quantification: Comparing Two Methods for Assessing the Reliability of Classifier Predictions

Adrián Detavernier, Jasper De Bock

2603.22985 2026-03-25 cs.CL cs.CY

Beyond Hate: Differentiating Uncivil and Intolerant Speech in Multimodal Content Moderation

Nils A. Herrmann, Tobias Eder, Jingyi He, Georg Groh

Comments Preprint. Under review

2603.22984 2026-03-25 cs.LG cs.AI cs.SI

Can Graph Foundation Models Generalize Over Architecture?

Benjamin Gutteridge, Michael Bronstein, Xiaowen Dong

Comments 9 pages main text + 18 pages references and appendix (27 pages total), 5 figures. Accepted to GRaM Workshop @ ICLR 2026: Workshop on Geometry-grounded Representation Learning and Generative Modeling (to appear in PMLR)

2603.22978 2026-03-25 cs.AI

JFTA-Bench: Evaluate LLM's Ability of Tracking and Analyzing Malfunctions Using Fault Trees

Yuhui Wang, Zhixiong Yang, Ming Zhang, Shihan Dou, Zhiheng Xi, Enyu Zhou, Senjie Jin, Yujiong Shen, Dingwei Zhu, Yi Dong, Tao Gui, Qi Zhang, Xuanjing Huang

2603.22977 2026-03-25 cs.CL cs.AI cs.LG

DariMis: Harm-Aware Modeling for Dari Misinformation Detection on YouTube

Jawid Ahmad Baktash, Mosa Ebrahimi, Mohammad Zarif Joya, Mursal Dawodi

Comments 9 pages, 8 figures. Accepted for submission; dataset and code will be released upon publication

2603.22973 2026-03-25 cs.AI

Where Experts Disagree, Models Fail: Detecting Implicit Legal Citations in French Court Decisions

Avrile Floro, Tamara Dhorasoo, Soline Pellez, Nils Holzenberger

2603.22969 2026-03-25 cs.CV

FCL-COD: Weakly Supervised Camouflaged Object Detection with Frequency-aware and Contrastive Learning

Jingchen Ni, Quan Zhang, Dan Jiang, Keyu Lv, Ke Zhang, Chun Yuan

Comments CVPR 2026 Findings

2603.22966 2026-03-25 cs.CL cs.AI

Set-Valued Prediction for Large Language Models with Feasibility-Aware Coverage Guarantees

Ye Li, Anqi Hu, Yuanchang Ye, Shiyan Tong, Zhiyuan Wang, Bo Fu

2603.22953 2026-03-25 cs.CV

Cluster-Wise Spatio-Temporal Masking for Efficient Video-Language Pretraining

Weijun Zhuang, Yuqing Huang, Weikang Meng, Xin Li, Ming Liu, Xiaopeng Hong, Yaowei Wang, Wangmeng Zuo

Comments Accepted by CVPR 2026

2603.22951 2026-03-25 cs.LG

Weak-PDE-Net: Discovering Open-Form PDEs via Differentiable Symbolic Networks and Weak Formulation

Xinxin Li, Xingyu Cui, Jin Qi, Juan Zhang, Da Li, Junping Yin