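arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.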

2604.24001 2026-04-28 cs.AI

CT-FineBench: A Diagnostic Fidelity Benchmark for Fine-Grained Evaluation of CT Report Generation

Ruifeng Yuan, Wanxing Chang, Weiwei Cao, Bowen Shi, Zhongyu Wei, Ling Zhang, Jianpeng Zhang

Comments Accepted by ACL 2026 Main

English abstract

The evaluation of generated reports remains a critical challenge in Computed Tomography (CT) report generation, due to the large volume of text, the diversity and complexity of findings, and the presence of fine-grained, disease-oriented attributes. Conventional evaluation metrics offer only coarse measures of lexical overlap or entity matching and fail to reflect the granular diagnostic accuracy required for clinical use. To address this gap, we propose CT-FineBench, a benchmark built from CT-RATE and Merlin to evaluate the fine-grained factual consistency of CT reports. Our benchmark is constructed through a meticulous, question-answering (QA) based process: first, we identify and structure key, finding-specific clinical attributes (such as location, size, and margin). Second, we systematically transform these attributes into a QA dataset, where questions probe for specific clinical details grounded in gold-standard reports. The evaluation protocol for CT-FineBench involves using this QA dataset to query a machine-generated report and scoring the correctness of the answers. This allows for a comprehensive, interpretable, and clinically relevant assessment, moving beyond superficial lexical overlap to pinpoint specific clinical errors. Experiments show that CT-FineBench correlates better with expert clinical assessment and is substantially more sensitive to fine-grained factual errors than prior metrics.
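
The QA-based protocol described in this abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration, not the authors' implementation: the `QAItem` fields, the closed option lists, and especially `keyword_answerer`, a toy string-matching stand-in for whatever model actually answers questions against a generated report.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class QAItem:
    """One question derived from a gold-standard report (hypothetical schema)."""
    finding: str      # e.g. "nodule"
    attribute: str    # e.g. "location", "size", "margin"
    question: str
    gold_answer: str  # answer grounded in the gold-standard report


def keyword_answerer(report: str, item: QAItem, options: List[str]) -> str:
    """Toy stand-in for the answering model: return the first option
    whose text appears in the generated report."""
    text = report.lower()
    for option in options:
        if option.lower() in text:
            return option
    return "not mentioned"


def qa_score(
    report: str,
    items: List[QAItem],
    options_by_attribute: Dict[str, List[str]],
    answerer: Callable[[str, QAItem, List[str]], str] = keyword_answerer,
) -> float:
    """Fraction of questions whose answer from the generated report
    matches the gold answer."""
    if not items:
        return 0.0
    correct = 0
    for item in items:
        predicted = answerer(report, item, options_by_attribute[item.attribute])
        correct += predicted.lower() == item.gold_answer.lower()
    return correct / len(items)


# Illustrative data only; not taken from CT-RATE or Merlin.
items = [
    QAItem("nodule", "location", "Where is the nodule located?", "right upper lobe"),
    QAItem("nodule", "margin", "What is the margin of the nodule?", "spiculated"),
]
options = {
    "location": ["right upper lobe", "left lower lobe"],
    "margin": ["spiculated", "smooth"],
}
generated = "A 12 mm nodule in the right upper lobe with smooth margins."
score = qa_score(generated, items, options)
```

In this toy case the generated report gets the location right but the margin wrong, so the per-question results show which clinical attribute failed, which is the kind of interpretability the abstract claims over lexical-overlap metrics.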

2604.23996 2026-04-28 cs.CV

SMoES: Soft Modality-Guided Expert Specialization in MoE-VLMs

Zi-Hao Bo, Yaqian Li, Anzhou Hou, Rinyoichi Takezoe, Ertao Zhao, Tianxiang Pan, Jiale Yan, Mo Guang, Kaiwen Long

Comments CVPR 2026

2604.23994 2026-04-28 cs.LG cs.CL

When to Commit? Towards Variable-Size Self-Contained Blocks for Discrete Diffusion Language Models

Danny Wang, Ruihong Qiu, Zi Huang

2604.23993 2026-04-28 cs.CL cs.AI cs.DB cs.LG cs.MA

EPM-RL: Reinforcement Learning for On-Premise Product Mapping in E-Commerce

Minhyeong Yu, Wonduk Seo

Comments preprint

2604.23990 2026-04-28 cs.AI

Failure-Centered Runtime Evaluation for Deployed Trilingual Public-Space Agents

M. Meng

Comments 25 pages, 5 figures. arXiv preprint

2604.23989 2026-04-28 cs.LG cs.AI

Fix Initial Codes and Iteratively Refine Textual Directions Toward Safe Multi-Turn Code Correction

Yuto Tanaka, Issei Sato

2604.23988 2026-04-28 cs.LG cs.AI

Hindsight Preference Optimization for Financial Time Series Advisory

Yanwei Cui, Guanghui Wang, Xing Zhang, Peiyang He, Ziyuan Li, Bing Zhu, Wei Qiu, Xusheng Wang, Zheng Yu, Anqi Xin

Comments Accepted at ICLR 2026 TSALM Workshop

2604.23987 2026-04-28 cs.LG

Continual Calibration: Coverage Can Collapse Before Accuracy in Lifelong LLM Fine-Tuning

Ibne Farabi Shihab, Sanjeda Akter, Anuj Sharma

2604.23985 2026-04-28 cs.AI cs.CL cs.LG

Representational Curvature Modulates Behavioral Uncertainty in Large Language Models

Jack King, Evelina Fedorenko, Eghbal A. Hosseini

2604.23982 2026-04-28 cs.CV

Hierarchical Prototype-based Domain Priors for Multiple Instance Learning in Multimodal Histopathology Analysis

Xuemei Qiu, Dawei Fan, Yebin Huang, Yanping Chen, Lifang Wei

2604.23978 2026-04-28 cs.RO cs.HC

Supporting Family-School Partnerships with Robot-Facilitated Home-Based Activities

Michael F Xu, Qiyao Yang, Heather Kirkorian, Bilge Mutlu

Comments Proceedings of the 25th Interaction Design and Children Conference (IDC '26)

2604.23977 2026-04-28 cs.CV

Multi-View Synergistic Learning with Vision-Language Adaption for Low-Resource Biomedical Image Classification

Xiaoliu Luo, Minxue Xiao, Ting Xie, Mengzhu Wang, Huiqing Qi, Joey Tianyi Zhou, Taiping Zhang, Xu Wang

2604.23976 2026-04-28 cs.RO cs.HC

Designing Robots to Support Parent-Child Connections: Opportunities Through Robot-Mediated Communication

Michael F Xu, Bengisu Cagiltay, Yaxin Hu, Anjun Zhu, Bilge Mutlu

Comments Proceedings of the 25th Interaction Design and Children Conference (IDC '26)

2604.23974 2026-04-28 cs.CL

Propagation Structure-Semantic Transfer Learning for Robust Fake News Detection

Mengyang Chen, Lingwei Wei, Han Cao, Wei Zhou, Zhou Yan, Songlin Hu

Comments Accepted by ECML-PKDD 2024

2604.23972 2026-04-28 cs.CL cs.AI cs.SC

Quantum Knowledge Graph: Modeling Context-Dependent Triplet Validity

Yao Wang, Zixu Geng, Jun Yan

Comments 15 pages main text, 6 pages appendix, 5 figures, preprint

2604.23970 2026-04-28 cs.AI cs.CV cs.HC cs.MA

LLM-Guided Agentic Floor Plan Parsing for Accessible Indoor Navigation of Blind and Low-Vision People

Aydin Ayanzadeh, Tim Oates

2604.23968 2026-04-28 cs.LG cs.AI stat.ML

DecompKAN: Decomposed Patch-KAN for Long-Term Time Series Forecasting

Naveen Mysore

Comments 15 pages, 6 figures, 8 tables. Preprint; under review

2604.23964 2026-04-28 cs.LG cs.AI

Task-guided Spatiotemporal Network with Diffusion Augmentation for EEG-based Dementia Diagnosis and MMSE Prediction

Xiaoyu Zheng, Xu Tian, Bin Jiao, Kunbo Cui, Hanhe Lin, Lu Shen, Jin Liu

2604.23960 2026-04-28 cs.RO

Multi-Robot Motions in Milliseconds: Vector-Accelerated Primitives for Sampling-Based Planning

James D. Motes, Marco Morales, Nancy M. Amato

2604.23957 2026-04-28 cs.CV

LAVA: Layered Audio-Visual Anti-tampering Watermarking for Robust Deepfake Detection and Localization

Bokang Zeng, Zheng Gao, Xiaoyu Li, Xiaoyan Feng, Jiaojiao Jiang

Comments 10 pages, submitted to ACMMM 2026

2604.23954 2026-04-28 cs.AI

An empirical evaluation of the risks of AI model updates using clinical data: stability, arbitrariness, and fairness

Ioannis Bilionis, Ricardo C. Berrios, Luis Fernandez-Luque, Carlos Castillo

Comments Accepted to IEEE EMBC 2026. 4 pages, 3 figures

2604.23953 2026-04-28 cs.CV cs.AI

Viewport-Unaware Blind Omnidirectional Image Quality Assessment: A Unified and Generalized Approach

Jiebin Yan, Kangcheng Wu, Jingwen Hou, Jiayu Zhang, Pengfei Chen, Yuming Fang

2604.23950 2026-04-28 cs.CV

LearnPruner: Rethinking Attention-based Token Pruning in Vision Language Models

Rinyoichi Takezoe, Yaqian Li, Zihao Bo, Anzhou Hou, Mo Guang, Kaiwen Long

Comments Accepted to ICLR 2026

2604.23949 2026-04-28 cs.AI

Context-Aware Hospitalization Forecasting Evaluations for Decision Support using LLMs

Rhea Makkuni, Ananya Joshi

2604.23948 2026-04-28 cs.CL cs.AI

KOMBO: Korean Character Representations Based on the Combination Rules of Subcharacters

SungHo Kim, Juhyeong Park, Yeachan Kim, SangKeun Lee

Comments Presented at ACL 2024 Findings

2604.23941 2026-04-28 cs.CV

GoClick: Lightweight Element Grounding Model for Autonomous GUI Interaction

Hongxin Li, Yuntao Chen, Zhaoxiang Zhang

Comments Technical Report

English abstract

Graphical User Interface (GUI) element grounding (precisely locating elements on screenshots based on natural language instructions) is fundamental for agents interacting with GUIs. Deploying this capability directly on resource-constrained devices like mobile phones is increasingly critical for GUI agents requiring low latency. However, this goal faces a significant challenge, as current visual grounding methods typically employ large vision-language models (VLMs) with more than 2.5B parameters, making them impractical for on-device execution due to memory and computational constraints. To address this, this paper introduces GoClick, a lightweight GUI element grounding VLM with only 230M parameters that achieves excellent visual grounding accuracy, even on par with significantly larger models. Simply downsizing existing decoder-only VLMs is a straightforward way to design a lightweight model, but our experiments reveal that this approach yields suboptimal results. Instead, we select an encoder-decoder architecture, which outperforms decoder-only alternatives at small parameter scales for GUI grounding tasks. Additionally, the limited capacity of small VLMs encourages us to develop a Progressive Data Refinement pipeline that utilizes task type filtering and data ratio adjustment to extract a high-quality 3.8M-sample core set from a 10.8M raw dataset. Training GoClick using this core set brings notable grounding accuracy gains. Our experiments show that GoClick excels on multiple GUI element grounding benchmarks while maintaining a small size and high inference speed. GoClick also enhances GUI agent performance when integrated into a device-cloud collaboration framework, where GoClick helps cloud-based task planners perform precise element localization and achieve higher success rates. We hope our method serves as a meaningful exploration within the GUI agent community.
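
The two operations this abstract names for its data pipeline, task type filtering and data ratio adjustment, can be sketched generically as follows. This is a hypothetical illustration rather than the paper's Progressive Data Refinement pipeline: the `refine` function, the `task` field, the task names, and the target ratios are all assumed for the example.

```python
import random
from collections import defaultdict
from typing import Dict, List, Set


def refine(
    samples: List[dict],
    keep_tasks: Set[str],
    ratios: Dict[str, float],
    total: int,
    seed: int = 0,
) -> List[dict]:
    """Filter samples by task type, then subsample each kept task so it
    makes up roughly `ratios[task]` of a core set of size `total`."""
    rng = random.Random(seed)  # fixed seed keeps the selection reproducible

    # Step 1: task type filtering.
    by_task = defaultdict(list)
    for sample in samples:
        if sample["task"] in keep_tasks:
            by_task[sample["task"]].append(sample)

    # Step 2: data ratio adjustment via per-task subsampling.
    core: List[dict] = []
    for task, ratio in ratios.items():
        pool = by_task.get(task, [])
        k = min(len(pool), round(total * ratio))  # never request more than exists
        core.extend(rng.sample(pool, k))
    return core


# Illustrative raw pool; task names are assumptions, not from the paper.
raw = (
    [{"task": "grounding", "id": i} for i in range(100)]
    + [{"task": "captioning", "id": i} for i in range(100)]
    + [{"task": "ocr", "id": i} for i in range(50)]
)
core_set = refine(
    raw,
    keep_tasks={"grounding", "ocr"},
    ratios={"grounding": 0.75, "ocr": 0.25},
    total=40,
)
```

Here the captioning samples are dropped entirely by the filter, and the remaining two tasks are resampled to a 3:1 mix, which is one simple way to read "ratio adjustment"; the actual criteria used in the paper are not stated in the abstract.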

2604.23935 2026-04-28 cs.CV

2nd of the 5th PVUW MeViS-Audio Track: ASR-SaSaSa2VA

Zhiyu Wang, Xudong Kang, Shutao Li

Comments 5 pages

2604.23933 2026-04-28 cs.LG eess.SP q-bio.NC

Robust and Clinically Reliable EEG Biomarkers: A Cross Population Framework for Generalizable Parkinson's Disease Detection

Nicholas R. Rasmussen, Longwei Wang, Rodrigue Rizk, Md Rezwanul Akter Pallab, Samuel Stuwart, Martina Mancini, Arun Singh, KC Santosh

Comments This is the non anonymized preprint corresponding to the version submitted to ACM Transactions on Computing for Healthcare. It is not the final typeset or accepted version

2604.23924 2026-04-28 cs.AI q-bio.BM

Agentic AI platforms for autonomous training and rule induction of human-human and virus-human protein-protein interactions

Hung N. Do, Jessica Z. Kubicek-Sutherland, Oscar A. Negrete, S. Gnanakaran

Comments Other correspondence email: donguyenhung238@gmail.com

2604.23921 2026-04-28 cs.LG cs.AI

Crystal structure prediction using graph neural combinatorial optimization

Stavros Gerolymatos, J. Kyle Brubaker, Martin J. A. Schuetz, Vladimir V. Gusev