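arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.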

Choro Ulan uulu, Mikhail Kulyabin, Iris Fuhrmann, Jan Joosten, Nuno Miguel Martins Pacheco, Filippos Petridis, Rebecca Johnson, Jan Bosch, Helena Holmström Olsson

2601.15141 2026-01-22 cs.LG

CLEANER: Self-Purified Trajectories Boost Agentic Reinforcement Learning

Tianshi Xu, Yuteng Chen, Meng Li

2601.15131 2026-01-22 cs.AI

Vehicle Routing with Finite Time Horizon using Deep Reinforcement Learning with Improved Network Embedding

Ayan Maity, Sudeshna Sarkar

Comments Accepted at AAAI-26 Workshop on AI for Urban Planning

2601.15130 2026-01-22 cs.AI cs.CL

The Plausibility Trap: Using Probabilistic Engines for Deterministic Tasks

Ivan Carrera, Daniel Maldonado-Ruiz

2601.15129 2026-01-22 cs.CL

RSNA Large Language Model Benchmark Dataset for Chest Radiographs of Cardiothoracic Disease: Radiologist Evaluation and Validation Enhanced by AI Labels (REVEAL-CXR)

Yishu Wei, Adam E. Flanders, Errol Colak, John Mongan, Luciano M Prevedello, Po-Hao Chen, Henrique Min Ho Lee, Gilberto Szarf, Hamilton Shoji, Jason Sho, Katherine Andriole, Tessa Cook, Lisa C. Adams, Linda C. Chu, Maggie Chung, Geraldine Brusca-Augello, Djeven P. Deva, Navneet Singh, Felipe Sanchez Tijmes, Jeffrey B. Alpert, Elsie T. Nguyen, Drew A. Torigian, Kate Hanneman, Lauren K Groner, Alexander Phan, Ali Islam, Matias F. Callejas, Gustavo Borges da Silva Teles, Faisal Jamal, Maryam Vazirabad, Ali Tejani, Hari Trivedi, Paulo Kuriki, Rajesh Bhayana, Elana T. Benishay, Yi Lin, Yifan Peng, George Shih

Abstract

Multimodal large language models have demonstrated performance comparable to that of radiology trainees on multiple-choice board-style exams. However, to develop clinically useful multimodal LLM tools, high-quality benchmarks curated by domain experts are essential. This work aims to curate released and holdout datasets of 100 chest radiographic studies each and to propose an artificial intelligence (AI)-assisted expert labeling procedure that allows radiologists to label studies more efficiently. A total of 13,735 deidentified chest radiographs and their corresponding reports from the MIDRC were used. GPT-4o extracted abnormal findings from the reports, which were then mapped to 12 benchmark labels with a locally hosted LLM (Phi-4-Reasoning). From these studies, 1,000 were sampled on the basis of the AI-suggested benchmark labels for expert review; the sampling algorithm ensured that the selected studies were clinically relevant and captured a range of difficulty levels. Seventeen chest radiologists participated, marking "Agree all", "Agree mostly", or "Disagree" to indicate their assessment of the correctness of the LLM-suggested labels. Each chest radiograph was evaluated by three experts. For 381 radiographs, at least two radiologists selected "Agree all". From this set, 200 were selected, prioritizing those with less common or multiple finding labels, and divided into 100 released radiographs and 100 reserved as the holdout dataset. The holdout dataset is used exclusively by RSNA to independently evaluate different models. A benchmark of 200 chest radiographic studies with 12 benchmark labels was created and made publicly available at https://imaging.rsna.org, with each chest radiograph verified by three radiologists. In addition, an AI-assisted labeling procedure was developed to help radiologists label at scale, minimize unnecessary omissions, and support a semicollaborative environment.
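The consensus rule described in the abstract (three reviewers per study, kept when at least two choose "Agree all") can be illustrated with a minimal sketch. The study IDs, the `reviews` mapping, and the `has_consensus` helper are hypothetical names introduced here for illustration; this is not the authors' code, and the paper's sampling and prioritization steps are not reproduced.

```python
from collections import Counter

# Hypothetical helper: the threshold of 2 mirrors the "at least two
# radiologists" rule stated in the abstract.
def has_consensus(ratings, required=2):
    """Return True when at least `required` reviewers chose "Agree all"."""
    return Counter(ratings)["Agree all"] >= required

# Made-up example data: three ratings per study, as in the abstract.
reviews = {
    "study_001": ["Agree all", "Agree all", "Disagree"],
    "study_002": ["Agree mostly", "Agree all", "Disagree"],
    "study_003": ["Agree all", "Agree all", "Agree all"],
}

# Keep only the studies that meet the consensus threshold.
accepted = [study_id for study_id, ratings in reviews.items()
            if has_consensus(ratings)]
print(accepted)
```

In this toy data, `study_002` is dropped because only one reviewer fully agreed with the suggested labels.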

2601.15115 2026-01-22 cs.CV

Training-Free and Interpretable Hateful Video Detection via Multi-stage Adversarial Reasoning

Shuonan Yang, Yuchen Zhang, Zeyu Fu

Comments Accepted at ICASSP 2026. © 2026 IEEE. This is the author accepted manuscript. The final published version will be available via IEEE Xplore

2601.15111 2026-01-22 cs.LG cs.AI

Auditing Language Model Unlearning via Information Decomposition

Anmol Goel, Alan Ritter, Iryna Gurevych

Comments EACL 2026 Main

2601.15110 2026-01-22 cs.CV

Pb4U-GNet: Resolution-Adaptive Garment Simulation via Propagation-before-Update Graph Network

Aoran Liu, Kun Hu, Clinton Ansun Mo, Qiuxia Wu, Wenxiong Kang, Zhiyong Wang

Comments Camera-ready version accepted at AAAI 2026

2601.15102 2026-01-22 cs.LG eess.IV

Field-Space Autoencoder for Scalable Climate Emulators

Johannes Meuer, Maximilian Witte, Étiénne Plésiat, Thomas Ludwig, Christopher Kadow

2601.15098 2026-01-22 cs.CV

Three-dimensional visualization of X-ray micro-CT with large-scale datasets: Efficiency and accuracy for real-time interaction

Yipeng Yin, Rao Yao, Qingying Li, Dazhong Wang, Hong Zhou, Zhijun Fang, Jianing Chen, Longjie Qian, Mingyue Wu

Comments Pages 1-37

2601.15091 2026-01-22 cs.CL cs.CY cs.SI q-bio.NC

Circadian Modulation of Semantic Exploration in Social Media Language

Vuong Hung Truong, Mariana Gabrielle Cangco Reyes, Masatoshi Koizumi, Jihwan Myung

Comments 25 pages, 6 figures, 3 supplementary figures

2601.15086 2026-01-22 cs.LG cs.AI

Memory Retention Is Not Enough to Master Memory Tasks in Reinforcement Learning

Oleg Shchendrigin, Egor Cherepanov, Alexey K. Kovalev, Aleksandr I. Panov

Comments 11 pages, 6 figures, 7 tables

2601.15083 2026-01-22 cs.SD cs.LG

Bangla Music Genre Classification Using Bidirectional LSTMs

Muntakimur Rahaman, Md Mahmudul Hoque, Md Mehedi Hassain

2601.15079 2026-01-22 cs.LG cs.SI

LoRAP: Low-Rank Aggregation Prompting for Quantized Graph Neural Networks Training

Chenyu Liu, Haige Li, Luca Rossi

2601.15077 2026-01-22 cs.CL cs.AI cs.LG cs.MA

Multi-Agent Constraint Factorization Reveals Latent Invariant Solution Structure

Christopher Scofield

2601.15069 2026-01-22 cs.RO

Influence of Operator Expertise on Robot Supervision and Intervention

Yanran Jiang, Pavan Sikka, Leimin Tian, Dana Kuliic, Cecile Paris

2601.15061 2026-01-22 cs.CV cs.AI

Differential Privacy Image Generation with Reconstruction Loss and Noise Injection Using an Error Feedback SGD

Qiwei Ma, Jun Zhang

2601.15059 2026-01-22 cs.AI cs.SY eess.SY

The Responsibility Vacuum: Organizational Failure in Scaled Agent Systems

Oleg Romanchuk, Roman Bondar

2601.15056 2026-01-22 cs.RO

Systematic Evaluation of Hip Exoskeleton Assistance Parameters for Enhancing Gait Stability During Ground Slip Perturbations

Maria T. Tagliaferri, Inseung Kang

2601.15049 2026-01-22 cs.CV

Deep Leakage with Generative Flow Matching Denoiser

Isaac Baglin, Xiatian Zhu, Simon Hadfield

2601.15042 2026-01-22 cs.CV cs.AI

Federated Transformer-GNN for Privacy-Preserving Brain Tumor Localization with Modality-Level Explainability

Andrea Protani, Riccardo Taiello, Marc Molina Van Den Bosch, Luigi Serio

2601.15041 2026-01-22 cs.LG cs.SE

HyperNet-Adaptation for Diffusion-Based Test Case Generation

Oliver Weißl, Vincenzo Riccio, Severin Kacianka, Andrea Stocco

2601.15038 2026-01-22 cs.LG cs.AI

A Curriculum-Based Deep Reinforcement Learning Framework for the Electric Vehicle Routing Problem

Mertcan Daysalilar, Fuat Uyguroglu, Gabriel Nicolosi, Adam Meyers

Abstract

The electric vehicle routing problem with time windows (EVRPTW) is a complex optimization problem in sustainable logistics, where routing decisions must minimize total travel distance, fleet size, and battery usage while satisfying strict customer time constraints. Although deep reinforcement learning (DRL) has shown great potential as an alternative to classical heuristics and exact solvers, existing DRL models often struggle to maintain training stability, failing to converge or generalize when constraints are dense. In this study, we propose a curriculum-based deep reinforcement learning (CB-DRL) framework designed to resolve this instability. The framework utilizes a structured three-phase curriculum that gradually increases problem complexity: the agent first learns distance and fleet optimization (Phase A), then battery management (Phase B), and finally the full EVRPTW (Phase C). To ensure stable learning across phases, the framework employs a modified proximal policy optimization algorithm with phase-specific hyperparameters, value and advantage clipping, and adaptive learning-rate scheduling. The policy network is built upon a heterogeneous graph attention encoder enhanced by global-local attention and feature-wise linear modulation. This specialized architecture explicitly captures the distinct properties of depots, customers, and charging stations. Trained exclusively on small instances with N=10 customers, the model demonstrates robust generalization to unseen instances ranging from N=5 to N=100, significantly outperforming standard baselines on medium-scale problems. Experimental results confirm that this curriculum-guided approach achieves high feasibility rates and competitive solution quality on out-of-distribution instances where standard DRL baselines fail, effectively bridging the gap between neural speed and operational reliability.
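The three-phase curriculum in the abstract can be pictured as a schedule that switches constraints on progressively. The sketch below is only an illustration under stated assumptions: the `Phase` fields, the equal-length phases, and the idea that time windows are enforced only in Phase C are guesses made for the example, not details confirmed by the abstract, and no PPO training is implemented.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Phase:
    name: str
    use_battery: bool        # assumption: battery limits enter in Phase B
    use_time_windows: bool   # assumption: time windows enter in Phase C

# Phase order follows the abstract: A (distance and fleet),
# B (battery management), C (full EVRPTW).
CURRICULUM = (
    Phase("A", use_battery=False, use_time_windows=False),
    Phase("B", use_battery=True, use_time_windows=False),
    Phase("C", use_battery=True, use_time_windows=True),
)

def phase_for_step(step, steps_per_phase):
    """Map a training step to a phase; equal-length phases are an assumption."""
    index = min(step // steps_per_phase, len(CURRICULUM) - 1)
    return CURRICULUM[index]

# Hypothetical schedule with 100 training steps per phase.
for step in (0, 150, 250, 1000):
    print(step, phase_for_step(step, steps_per_phase=100).name)
```

Once the last phase is reached, training stays on the full problem, which is why the index is clamped to the final entry.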

2601.15037 2026-01-22 cs.CL cs.AI

Knowledge Restoration-driven Prompt Optimization: Unlocking LLM Potential for Open-Domain Relational Triplet Extraction

Xiaonan Jing, Gongqing Wu, Xingrui Zhuo, Lang Sun, Jiapu Wang

2601.15025 2026-01-22 cs.RO cs.CV

ExPrIS: Knowledge-Level Expectations as Priors for Object Interpretation from Sensor Data

Marian Renz, Martin Günther, Felix Igelbrink, Oscar Lima, Martin Atzmueller

Comments This preprint has not undergone peer review or any post-submission improvements or corrections. The Version of Record of this article is published in KI - Künstliche Intelligenz, and is available online at https://doi.org/10.1007/s13218-026-00901-7

Journal ref Künstl Intell (2026)

2601.15021 2026-01-22 cs.LG cs.CV

Mixture-of-Experts Models in Vision: Routing, Optimization, and Generalization

Adam Rokah, Daniel Veress, Caleb Caulk, Sourav Sharan

Comments 7 pages, 8 figures. Code available at: https://github.com/moe-project-uu/mixture-of-experts-project

2601.15018 2026-01-22 cs.RO

Risk Estimation for Automated Driving

Leon Tolksdorf, Arturo Tejada, Jonas Bauernfeind, Christian Birkner, Nathan van de Wouw

Comments 10 pages, 5 figures

2601.15016 2026-01-22 cs.CV

LiViBench: An Omnimodal Benchmark for Interactive Livestream Video Understanding

Xiaodong Wang, Langling Huang, Zhirong Wu, Xu Zhao, Teng Xu, Xuhong Xia, Peixi Peng

Comments AAAI 2026 Main Track

2601.15006 2026-01-22 cs.RO

DWPP: Dynamic Window Pure Pursuit Considering Velocity and Acceleration Constraints

Fumiya Ohnishi, Masaki Takahashi

Comments 28 pages, 12 figures

2601.15000 2026-01-22 cs.LG

Lineup Regularized Adjusted Plus-Minus (L-RAPM): Basketball Lineup Ratings with Informed Priors

Christos Petridis, Konstantinos Pelechrinis

Comments 7 pages, 4 figures

How to Build AI Agents by Augmenting LLMs with Codified Human Expert Domain Knowledge? A Software Engineering Framework