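arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.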

2510.03904 2026-03-31 cs.LG

LLM as an Algorithmist: Enhancing Anomaly Detectors via Programmatic Synthesis

Hangting Ye, Jinmeng Li, He Zhao, Mingchen Zhuge, Dandan Guo, Yi Chang, Hongyuan Zha

Comments Accepted by the Fourteenth International Conference on Learning Representations (ICLR 2026)

English abstract

Existing anomaly detection (AD) methods for tabular data usually rely on some assumptions about anomaly patterns, leading to inconsistent performance in real-world scenarios. While Large Language Models (LLMs) show remarkable reasoning capabilities, their direct application to tabular AD is impeded by fundamental challenges, including difficulties in processing heterogeneous data and significant privacy risks. To address these limitations, we propose LLM-DAS, a novel framework that repositions the LLM from a "data processor" to an "algorithmist". Instead of being exposed to raw data, our framework leverages the LLM's ability to reason about algorithms. It analyzes a high-level description of a given detector to understand its intrinsic weaknesses and then generates detector-specific, data-agnostic Python code to synthesize "hard-to-detect" anomalies that exploit these vulnerabilities. This generated synthesis program, which is reusable across diverse datasets, is then instantiated to augment training data, systematically enhancing the detector's robustness by transforming the problem into a more discriminative two-class classification task. Extensive experiments on 36 tabular AD (TAD) benchmarks show that LLM-DAS consistently boosts the performance of mainstream detectors. By bridging LLM reasoning with classic AD algorithms via programmatic synthesis, LLM-DAS offers a scalable, effective, and privacy-preserving approach to patching the logical blind spots of existing detectors.
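The abstract does not reproduce any generated synthesis program, so the following is only a rough illustration of the idea and not LLM-DAS itself. It assumes a hypothetical detector that scores each feature on its own, and it uses a hand-written generator in place of the LLM output. The function names and the toy data are made up for this sketch.

```python
import random


def synthesize_hard_anomalies(normal_rows, n_samples, seed=0):
    """Hypothetical synthesis program (an assumption, not from the paper).

    Each feature is drawn independently from its own column, so every value
    looks normal in isolation while the cross-feature structure is broken.
    The function never inspects what the columns mean, so it can be reused
    on any numeric table.
    """
    rng = random.Random(seed)
    columns = list(zip(*normal_rows))
    return [[rng.choice(col) for col in columns] for _ in range(n_samples)]


def build_two_class_training_set(normal_rows, n_samples, seed=0):
    """Label normal rows 0 and synthesized rows 1 for a two-class classifier."""
    synthetic = synthesize_hard_anomalies(normal_rows, n_samples, seed)
    features = [list(row) for row in normal_rows] + synthetic
    labels = [0] * len(normal_rows) + [1] * len(synthetic)
    return features, labels


# Toy normal data in which the second feature is always twice the first.
normal = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
features, labels = build_two_class_training_set(normal, n_samples=6, seed=0)
```

Some synthesized rows can coincide with genuine normal rows by chance, so a real pipeline would probably need a filtering step. The abstract does not say how LLM-DAS handles this.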

2510.03181 2026-03-31 cs.LG

Q-Learning with Shift-Aware Upper Confidence Bound in Non-Stationary Reinforcement Learning

Ha Manh Bui, Felix Parker, Kimia Ghobadi, Anqi Liu

Comments International Conference on Artificial Intelligence and Statistics, 2026

2510.01014 2026-03-31 cs.CV

Revisiting Adversarial Training under Hyperspectral Image

Weihua Zhang, Chengze Jiang, Minjing Dong, Jie Gui, Lu Dong, Zhipeng Gui, Yuan Yan Tang, James Tin-Yau Kwok

2509.26351 2026-03-31 cs.LG

LLM-Assisted Emergency Triage Benchmark: Bridging Hospital-Rich and MCI-Like Field Simulation

Joshua Sebastian, Karma Tobden, KMA Solaiman

Comments Submitted to GenAI4Health@NeurIPS 2025. This was the first version of the LLM-assisted emergency triage benchmark dataset and baseline models. A related but separate benchmark-focused study on emergency triage under constrained sensing has been accepted at the IEEE International Conference on Healthcare Informatics (ICHI) 2026 (see arXiv:2602.20168)

2509.23392 2026-03-31 cs.AI cs.CL

Your Models Have Thought Enough: Training Large Reasoning Models to Stop Overthinking

Jinyi Han, Ying Huang, Ying Liao, Zishang Jiang, Xikun Lu, Haiquan Zhao, Xinyi Wang, Guanghao Zhou, Sihang Jiang, Jiaqing Liang, Weikang Zhou, Zeye Sun, Fei Yu, Yanghua Xiao

2509.16952 2026-03-31 cs.CL cs.AI

AirQA: A Comprehensive QA Dataset for AI Research with Instance-Level Evaluation

Tiancheng Huang, Ruisheng Cao, Yuxin Zhang, Zhangyi Kang, Zijian Wang, Chenrun Wang, Yijie Luo, Hang Zheng, Lirong Qian, Lu Chen, Kai Yu

Comments 29 pages, 6 figures, 17 tables, accepted to ICLR 2026

2509.12554 2026-03-31 cs.CV

Multimodal Graph Network Modeling for Human-Object Interaction Detection with PDE Graph Diffusion

Wenxuan Ji, Haichao Shi, Xiao-Yu Zhang

2509.11334 2026-03-31 cs.CV

Dual Band Thermal Videography: Separating Time-Varying Reflection and Emission Near Ambient Conditions

Sriram Narayanan, Mani Ramanagopal, Srinivasa G. Narasimhan

Comments CVPR 2026. Project Page: https://dual-band-thermal.github.io/

2509.04959 2026-03-31 cs.LG

On the Normalization of Confusion Matrices: Methods and Geometric Interpretations

Johan Erbani, Pierre-Edouard Portier, Elod Egyed-Zsigmond, Sonia Ben Mokhtar, Diana Nurbakova

2509.03417 2026-03-31 cs.LG

Initialization Schemes for Kolmogorov-Arnold Networks: An Empirical Study

Spyros Rigas, Dhruv Verma, Georgios Alexandridis, Yixuan Wang

Comments Accepted at ICLR 2026

2509.00761 2026-03-31 cs.AI cs.CL

L-MARS: Legal Multi-Agent Workflow with Orchestrated Reasoning and Agentic Search

Ziqi Wang, Boqin Yuan

2508.13669 2026-03-31 cs.CV

DeH4R: A Decoupled and Hybrid Method for Road Network Graph Extraction

Dengxian Gong, Shunping Ji

Comments Accepted for publication in the IEEE Transactions on Geoscience and Remote Sensing (TGRS)

2508.11524 2026-03-31 cs.AI

Inspire or Predict? Exploring New Paradigms in Assisting Classical Planners with Large Language Models

Wenkai Yu, Jianhang Tang, Yang Zhang, Yixiong Feng, Celimuge Wu, Kebing Jin, Hankz Hankui Zhuo

2508.11479 2026-03-31 cs.RO

OVSegDT: Segmenting Transformer for Open-Vocabulary Object Goal Navigation

Tatiana Zemskova, Aleksei Staroverov, Dmitry Yudin, Aleksandr Panov

2508.02343 2026-03-31 cs.LG cs.AI

MicroMix: Efficient Mixed-Precision Quantization with Microscaling Formats for Large Language Models

Wenyuan Liu, Haoqian Meng, Yilun Luo, Yafei Zhao, Peng Zhang, Xindian Ma

2508.00605 2026-03-31 cs.CL

GHTM: A Graph-based Hybrid Topic Modeling Approach with a Benchmark Dataset for the Low-Resource Bengali Language

Farhana Haque, Md. Abdur Rahman, Sumon Ahmed

English abstract

Topic modeling is a Natural Language Processing (NLP) technique used to discover latent themes and abstract topics from text corpora by grouping co-occurring keywords. Although widely researched in English, topic modeling remains understudied in Bengali due to a lack of adequate resources and initiatives. Existing Bengali topic modeling research lacks standardized evaluation frameworks with comprehensive baselines and diverse datasets, exploration of modern methodological approaches, and reproducible implementations, with only three Bengali-specific architectures proposed to date. To address these gaps, this study presents a comprehensive evaluation of traditional and contemporary topic modeling approaches across three Bengali datasets and introduces GHTM (Graph-based Hybrid Topic Model), a novel architecture that strategically integrates TF-IDF-weighted GloVe embeddings, Graph Convolutional Networks (GCN), and Non-negative Matrix Factorization (NMF). GHTM represents text documents using hybrid TF-IDF-weighted GloVe embeddings. It builds a document-similarity graph and leverages GCN to refine the representations through neighborhood aggregation. Finally, it decomposes the refined representations using NMF to extract interpretable topics. Experimental results demonstrate that GHTM achieves superior topic coherence (NPMI: 0.27-0.28) and diversity compared to existing methods while maintaining computational efficiency across datasets of varying scales. The model also demonstrates strong cross-lingual generalization, outperforming established graph-based models on the English 20Newsgroups benchmark. Additionally, we introduce NCTBText, a diverse Bengali textbook-based dataset comprising 8,650 text documents, curated from eight subject areas, providing much-needed topical diversity beyond newspaper-centric Bengali corpora and serving as a benchmark for future research.
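The three stages named in the abstract (TF-IDF-weighted embeddings, graph-based neighborhood aggregation, NMF) can be sketched in plain Python as follows. This is a simplified illustration under several assumptions, not the GHTM implementation. The word vectors are tiny made-up non-negative vectors rather than GloVe, the TF-IDF smoothing is one common variant, the graph uses a hypothetical cosine threshold, the "GCN" is a single parameter-free propagation step with no learned weights, and NMF uses basic multiplicative updates.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    bt = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in bt] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def tfidf_weighted_embeddings(docs, vectors):
    """Average each document's word vectors, weighted by a smoothed TF-IDF."""
    n = len(docs)
    df = {}
    for doc in docs:
        for w in set(doc):
            df[w] = df.get(w, 0) + 1
    dim = len(next(iter(vectors.values())))
    out = []
    for doc in docs:
        acc, total = [0.0] * dim, 0.0
        for w in set(doc):
            if w not in vectors:
                continue
            idf = math.log((1 + n) / (1 + df[w])) + 1.0
            weight = (doc.count(w) / len(doc)) * idf
            acc = [a + weight * v for a, v in zip(acc, vectors[w])]
            total += weight
        out.append([a / total for a in acc] if total else acc)
    return out


def cosine(u, v):
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(x * y for x, y in zip(u, v)) / (nu * nv) if nu and nv else 0.0


def normalized_adjacency(x, threshold):
    """Thresholded cosine graph with self-loops, scaled as D^-1/2 A D^-1/2."""
    n = len(x)
    a = [[1.0 if i == j or cosine(x[i], x[j]) >= threshold else 0.0
          for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor a non-negative matrix v into w @ h with multiplicative updates."""
    rng = random.Random(seed)
    n, d = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(d)] for _ in range(k)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(d)]
             for i in range(k)]
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h


# Toy corpus and made-up 2-d word vectors (assumptions for illustration).
docs = [["rice", "farm", "soil"], ["soil", "farm", "crop"],
        ["atom", "energy", "force"], ["force", "energy", "motion"]]
vectors = {"rice": [0.9, 0.1], "farm": [0.8, 0.2], "soil": [0.9, 0.0],
           "crop": [1.0, 0.1], "atom": [0.1, 0.9], "energy": [0.0, 1.0],
           "force": [0.2, 0.8], "motion": [0.1, 1.0]}

x = tfidf_weighted_embeddings(docs, vectors)
a_hat = normalized_adjacency(x, threshold=0.9)
refined = matmul(a_hat, x)              # one neighborhood-aggregation step
doc_topic, topic_dim = nmf(refined, k=2)
```

The toy vectors are non-negative so that NMF applies directly. Real GloVe vectors contain negative values, and the abstract does not say how GHTM reconciles that with NMF.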

2507.17988 2026-03-31 cs.AI

Synthesis of timeline-based planning strategies avoiding determinization

Dario Della Monica, Angelo Montanari, Pietro Sala

Comments arXiv admin note: text overlap with arXiv:2410.22757

2507.15162 2026-03-31 cs.LG

Designing User-Centric Metrics for Evaluation of Counterfactual Explanations

Firdaus Ahmed Choudhury, Ethan Leicht, Jude Ethan Bislig, Hangzhi Guo, Amulya Yadav

2507.10195 2026-03-31 cs.CV

Minimizing the Pretraining Gap: Domain-aligned Text-Based Person Retrieval

Shuyu Yang, Yaxiong Wang, Yongrui Li, Li Zhu, Zhedong Zheng

2507.07743 2026-03-31 cs.AI

Identification of Violin Reduction via Contour Lines Classification

Philémon Beghin, Anne-Emmanuelle Ceulemans, François Glineur

DOI: 10.2312/dh.20253083

English abstract

The first violins appeared in late 16th-century Italy. Over the next 200 years, they spread across Europe and luthiers of various royal courts, eager to experiment with new techniques, created a highly diverse family of instruments. Around 1750, size standards were introduced to unify violin making for orchestras and conservatories. Instruments that fell between two standards were then reduced to a smaller size by luthiers. These reductions have an impact on several characteristics of violins, in particular on the contour lines, i.e., lines of constant altitude, which look more like a U for non-reduced instruments and a V for reduced ones. While such differences are observed by experts, they have not been studied quantitatively. This paper presents a method for classifying violins as reduced or non-reduced based on their contour lines. We study a corpus of 25 instruments whose 3D geometric meshes were acquired via photogrammetry. For each instrument, we extract 10-20 contour lines regularly spaced every millimetre. Each line is fitted with a parabola-like curve (with an equation of the type y = alpha*abs(x)**beta) depending on two parameters, describing how open (beta) and how vertically stretched (alpha) the curve is. We compute additional features from those parameters, using regressions and counting how many values fall under some threshold. We also deal with outliers and unequal numbers of levels, and eventually obtain a numerical profile for each instrument. We then apply classification methods to assess whether geometry alone can predict size reduction. We find that distinguishing between reduced and non-reduced instruments is feasible to some degree, taking into account that a whole spectrum of more or less transformed violins exists, for which it is more difficult to quantify the reduction. We also find the opening parameter beta to be the most predictive.
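The abstract gives the curve family y = alpha*abs(x)**beta but not the fitting procedure. One simple option, assumed here purely for illustration, is ordinary least squares in log-log space, since ln(y) = ln(alpha) + beta*ln(abs(x)) is linear in ln(abs(x)). The function name and the sample points are hypothetical.

```python
import math


def fit_power_curve(xs, ys):
    """Fit y = alpha * abs(x) ** beta by least squares on log-transformed data.

    Points with x == 0 or y <= 0 are skipped because their logarithm is
    undefined. Returns the pair (alpha, beta).
    """
    pts = [(math.log(abs(x)), math.log(y))
           for x, y in zip(xs, ys) if x != 0 and y > 0]
    if len(pts) < 2:
        raise ValueError("need at least two usable points")
    n = len(pts)
    mean_u = sum(u for u, _ in pts) / n
    mean_v = sum(v for _, v in pts) / n
    sxx = sum((u - mean_u) ** 2 for u, _ in pts)
    if sxx == 0:
        raise ValueError("all abs(x) values are identical")
    beta = sum((u - mean_u) * (v - mean_v) for u, v in pts) / sxx
    alpha = math.exp(mean_v - beta * mean_u)
    return alpha, beta


# Synthetic contour line generated from alpha = 0.5, beta = 2.0.
xs = [-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
ys = [0.5 * abs(x) ** 2.0 for x in xs]
alpha, beta = fit_power_curve(xs, ys)
```

With beta close to 1 the fitted curve is a V, and larger beta values flatten the bottom into a U, which matches the qualitative description in the abstract. A log-space fit weights small y values more heavily than a direct nonlinear fit would, so the authors' actual estimates may differ.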

2507.06233 2026-03-31 cs.CV

AnthroTAP: Learning Point Tracking with Real-World Motion

Inès Hyeonsu Kim, Seokju Cho, Jahyeok Koo, Junghyun Park, Jiahui Huang, Honglak Lee, Joon-Young Lee, Seungryong Kim

Comments CVPR 2026. Project Page: https://cvlab-kaist.github.io/AnthroTAP/

2507.04990 2026-03-31 cs.CV cs.SE

Effort-Optimized, Accuracy-Driven Labelling and Validation of Test Inputs for DL Systems: A Mixed-Integer Linear Programming Approach

Mohammad Hossein Amini, Mehrdad Sabetzadeh, Shiva Nejati

Comments Accepted in the Empirical Software Engineering (EMSE) Journal (2026)

English abstract

Software systems increasingly include AI components based on deep learning (DL). Reliable testing of such systems requires near-perfect test-input validity and label accuracy, with minimal human effort. Yet, the DL community has largely overlooked the need to build highly accurate datasets with minimal effort, since DL training is generally tolerant of labelling errors. This challenge, instead, reflects concerns more familiar to software engineering, where a central goal is to construct high-accuracy test inputs, with accuracy as close to 100% as possible, while keeping associated costs in check. In this article, we introduce OPAL, a human-assisted labelling method that can be configured to target a desired accuracy level while minimizing the manual effort required for labelling. The main contribution of OPAL is a mixed-integer linear programming (MILP) formulation that minimizes labelling effort subject to a specified accuracy target. To evaluate OPAL, we instantiate it for two tasks in the context of testing vision systems: automatic labelling of test inputs and automated validation of test inputs. Our evaluation, based on more than 2500 experiments performed on nine datasets, comparing OPAL with eight baseline methods, shows that OPAL, relying on its MILP formulation, achieves an average accuracy of 98.8%, while cutting manual labelling by more than half. OPAL significantly outperforms automated labelling baselines in labelling accuracy across all nine datasets, when all methods are provided with the same manual-labelling budget. For automated test-input validation, on average, OPAL reduces manual effort by 28.8% while achieving 4.5% higher accuracy than the state-of-the-art test-input validation baselines. Finally, we show that augmenting OPAL with an active-learning loop leads to an additional 4.5% reduction in required manual labelling, without compromising accuracy.
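The abstract states the shape of the optimization (minimize labelling effort subject to an accuracy target) but not the actual MILP. The toy model below is therefore an assumption made up for illustration, not OPAL's formulation. Each input has a hypothetical confidence that its automatic label is correct and a hypothetical manual-labelling cost. A binary variable per input decides whether a human labels it, and manual labels are assumed to be always correct. Exhaustive search over the binary variables stands in for a MILP solver, which is only practical for a handful of inputs.

```python
from itertools import product


def min_effort_plan(confidences, costs, target):
    """Choose which inputs to label manually at minimum total cost.

    Toy model (an assumption): x_i = 1 means input i is labelled by a human
    and counts as correct; x_i = 0 keeps the automatic label, which is
    correct with probability confidences[i]. The constraint is that the
    expected accuracy over all inputs reaches the target.
    Returns (total_cost, plan), or None if no plan is feasible.
    """
    n = len(confidences)
    best = None
    for plan in product((0, 1), repeat=n):
        expected_acc = sum(1.0 if x else p
                           for x, p in zip(plan, confidences)) / n
        if expected_acc + 1e-12 < target:
            continue  # violates the accuracy constraint
        effort = sum(c for x, c in zip(plan, costs) if x)
        if best is None or effort < best[0]:
            best = (effort, plan)
    return best


# Five inputs with unit labelling cost and a 90% expected-accuracy target.
confidences = [0.99, 0.95, 0.60, 0.55, 0.98]
costs = [1, 1, 1, 1, 1]
effort, plan = min_effort_plan(confidences, costs, target=0.90)
```

In this example only the least confident input needs a human label: leaving all five automatic gives an expected accuracy of 0.814, and labelling the fourth input raises it to 0.904.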

2506.23919 2026-03-31 cs.RO

Goal-VLA: Image-Generative VLMs as Object-Centric World Models Empowering Zero-shot Robot Manipulation

Haonan Chen, Jingxiang Guo, Bangjun Wang, Tianrui Zhang, Xuchuan Huang, Boren Zheng, Yiwen Hou, Chenrui Tie, Jiajun Deng, Lin Shao

2506.23555 2026-03-31 cs.CV

LH2Face: Loss function for Hard High-quality Face

Fan Xie, Yang Wang, Yikang Jiao, Zhenyu Yuan, Congxi Chen, Chuanxin Zhao

2506.21742 2026-03-31 cs.CV

VRR-QA: Visual Relational Reasoning in Videos Beyond Explicit Cues

Sirnam Swetha, Rohit Gupta, Parth Parag Kulkarni, David G Shatwell, Jeffrey A Chan Santiago, Nyle Siddiqui, Joseph Fioresi, Mubarak Shah

Comments Accepted at CVPR 2026

2506.18410 2026-03-31 cs.RO

Integrating Maneuverable Planning and Adaptive Control for Robot Cart-Pushing under Disturbances

Zhe Zhang, Peijia Xie, Yuhan Pang, Zhirui Sun, Bingyi Xia, Bi-Ke Zhu, Jiankun Wang

Comments 11 pages, 11 figures

2506.12433 2026-03-31 cs.CL cs.AI

Exploring Cultural Variations in Moral Judgments with Large Language Models

Hadi Mohammadi, Ayoub Bagheri

2506.10127 2026-03-31 cs.LG

Meet Me at the Arm: The Cooperative Multi-Armed Bandits Problem with Shareable Arms

Xinyi Hu, Aldo Pacchiano

2506.05207 2026-03-31 cs.CV

Follow-Your-Motion: Video Motion Transfer via Efficient Spatial-Temporal Decoupled Finetuning

Yue Ma, Yulong Liu, Qiyuan Zhu, Ayden Yang, Kunyu Feng, Xinhua Zhang, Zexuan Yan, Zhifeng Li, Sirui Han, Chenyang Qi, Qifeng Chen

Comments Accepted by ICLR 2026, project page: https://follow-your-motion.github.io/

2505.13280 2026-03-31 cs.LG cs.AI cs.CR

FlowPure: Continuous Normalizing Flows for Adversarial Purification

Elias Collaert, Abel Rodríguez, Sander Joos, Lieven Desmet, Vera Rimmer