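arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.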

2604.14829 2026-04-17 cs.AI

Beyond Literal Summarization: Redefining Hallucination for Medical SOAP Note Evaluation

Bhavik Vachhani, Kush Shrisvastava, Pranshu Nema, Sai Chiranthan

Comments 12 pages, 2 figures, 3 tables

Abstract

Evaluating large language models (LLMs) for clinical documentation tasks such as SOAP note generation remains challenging. Unlike standard summarization, these tasks require clinical abstraction, normalization of colloquial language, and medically grounded inference. However, prevailing evaluation methods, including automated metrics and LLM-as-judge frameworks, rely on lexical faithfulness, often labeling any information not explicitly present in the transcript as hallucination. We show that such approaches systematically misclassify clinically valid outputs as errors, inflating hallucination rates and distorting model assessment. Our analysis reveals that many flagged hallucinations correspond to legitimate clinical transformations, including synonym mapping, abstraction of examination findings, diagnostic inference, and guideline-consistent care planning. By aligning evaluation criteria with clinical reasoning through calibrated prompting and retrieval grounded in medical ontologies, we observe a significant shift in outcomes. Under a lexical evaluation regime, the mean hallucination rate is 35%, heavily penalizing valid reasoning. With inference-aware evaluation, this drops to 9%, with remaining cases reflecting genuine safety concerns. These findings suggest that current evaluation practices over-penalize valid clinical reasoning and may measure artifacts of evaluation design rather than true errors, underscoring the need for clinically informed evaluation in high-context domains like medicine.
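
The gap between lexical and inference-aware scoring can be illustrated with a toy sketch. It covers only synonym mapping, one of the transformations the abstract names, and it is not the paper's method (which relies on calibrated prompting and ontology-grounded retrieval). The synonym table, the transcript, and the claims below are all hypothetical.

```python
# Toy illustration only: SYNONYMS, the transcript, and the claims are
# hypothetical and are not taken from the paper's data or prompts.
SYNONYMS = {
    "hypertension": "high blood pressure",
    "dyspnea": "shortness of breath",
}


def lexical_flags(claims, transcript):
    """Flag every claim whose exact wording is absent from the transcript."""
    text = transcript.lower()
    return [c for c in claims if c.lower() not in text]


def normalized_flags(claims, transcript):
    """Flag a claim only if neither it nor its lay synonym appears."""
    text = transcript.lower()
    flagged = []
    for claim in claims:
        key = claim.lower()
        lay_term = SYNONYMS.get(key, key)
        if key not in text and lay_term not in text:
            flagged.append(claim)
    return flagged


def hallucination_rate(flagged, claims):
    """Share of claims that were flagged (0.0 when there are no claims)."""
    return len(flagged) / len(claims) if claims else 0.0


transcript = "Patient reports high blood pressure and shortness of breath on stairs."
claims = ["hypertension", "dyspnea", "chest pain"]

lexical = lexical_flags(claims, transcript)
normalized = normalized_flags(claims, transcript)
lexical_rate = hallucination_rate(lexical, claims)
normalized_rate = hallucination_rate(normalized, claims)
```

In this toy case the lexical check flags all three claims, while the synonym-aware check flags only "chest pain", the one claim with no support in the transcript.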

2604.14828 2026-04-17 cs.CL

Pangu-ACE: Adaptive Cascaded Experts for Educational Response Generation on EduBench

Dinghao Li, Wenlong Zhou, Zhimin Chen, Yuehan Peng, Hong Ni, Chengfu Zou, Guoyu Shi, Yaochen Li

2604.14816 2026-04-17 cs.CV cs.HC cs.MM

NTIRE 2026 Challenge on Video Saliency Prediction: Methods and Results

Andrey Moskalenko, Alexey Bryncev, Ivan Kosmynin, Kira Shilovskaya, Mikhail Erofeev, Dmitry Vatolin, Radu Timofte, Kun Wang, Yupeng Hu, Zhiran Li, Hao Liu, Qianlong Xiang, Liqiang Nie, Konstantinos Chaldaiopoulos, Niki Efthymiou, Athanasia Zlatintsi, Panagiotis Filntisis, Katerina Pastra, Petros Maragos, Li Yang, Gen Zhan, Yiting Liao, Yabin Zhang, Yuxin Liu, Xu Wu, Yunheng Zheng, Linze Li, Kun He, Cong Wu, Xuefeng Zhu, Tianyang Xu, Xiaojun Wu, Wenzhuo Zhao, Keren Fu, Gongyang Li, Shixiang Shi, Jianlin Chen, Haibin Ling, Yaoxin Jiang, Guoyi Xu, Jiajia Liu, Yaokun Shi, Jiachen Tu

Comments CVPRW 2026

2604.14815 2026-04-17 cs.CL

Domain Fine-Tuning FinBERT on Finnish Histopathological Reports: Train-Time Signals and Downstream Correlations

Rami Luisto, Liisa Petäinen, Tommi Grönholm, Jan Böhm, Maarit Ahtiainen, Tomi Lilja, Ilkka Pölönen, Sami Äyrämö

2604.14811 2026-04-17 cs.LG cs.MA cs.NI

Learning Ad Hoc Network Dynamics via Graph-Structured World Models

Can Karacelebi, Yusuf Talha Sahin, Elif Surer, Ertan Onur

Comments 6 pages, 4 figures. Submitted to the IEEE Global Communications Conference (GLOBECOM) 2026

2604.14808 2026-04-17 cs.CL

Modeling LLM Unlearning as an Asymmetric Two-Task Learning Problem

Zeguan Xiao, Siqing Li, Yong Wang, Xuetao Wei, Jian Yang, Yun Chen, Guanhua Chen

Comments ACL 2026

2604.14806 2026-04-17 cs.SD cs.MM

Listen, Pause, and Reason: Toward Perception-Grounded Hybrid Reasoning for Audio Understanding

Jieyi Wang, Yazhe Niu, Dexuan Xu, Zhongyu Wei

2604.14805 2026-04-17 cs.CV

From Boundaries to Semantics: Prompt-Guided Multi-Task Learning for Petrographic Thin-section Segmentation

Yili Ren, Shiqi Wen, Li Hou, Dingwen Xiao, Weiming Zhang, Caleb Chen Cao, Lin Wang, Zilu Zheng, Qianxiao Su, Mingjun Zhao, Lei Chen

2604.14799 2026-04-17 cs.CL cs.CV

Knowing When Not to Answer: Evaluating Abstention in Multimodal Reasoning Systems

Nishanth Madhusudhan, Vikas Yadav, Alexandre Lacoste

Comments 10 pages and 4 figures (excluding appendix)

2604.14795 2026-04-17 cs.RO

Keep It CALM: Toward Calibration-Free Kilometer-Level SLAM with Visual Geometry Foundation Models via an Assistant Eye

Tianjun Zhang, Fengyi Zhang, Tianchen Deng, Lin Zhang, Hesheng Wang

Comments 19 pages, 8 figures, submitted to IEEE TPAMI

Abstract

Visual Geometry Foundation Models (VGFMs) demonstrate remarkable zero-shot capabilities in local reconstruction. However, deploying them for kilometer-level Simultaneous Localization and Mapping (SLAM) remains challenging. In such scenarios, current approaches mainly rely on linear transforms (e.g., Sim3 and SL4) for sub-map alignment, while we argue that a single linear transform is fundamentally insufficient to model the complex, non-linear geometric distortions inherent in VGFM outputs. Forcing such rigid alignment leads to the rapid accumulation of uncorrected residuals, eventually resulting in significant trajectory drift and map divergence. To address these limitations, we present CAL2M (Calibration-free Assistant-eye based Large-scale Localization and Mapping), a plug-and-play framework compatible with arbitrary VGFMs. Distinct from traditional systems, CAL2M introduces an "assistant eye" solely to leverage the prior of constant physical spacing, effectively eliminating scale ambiguity without any temporal or spatial pre-calibration. Furthermore, leveraging the assumption of accurate feature matching, we propose an epipolar-guided intrinsic and pose correction model. Supported by an online intrinsic search module, it can effectively rectify rotation and translation errors caused by inaccurate intrinsics through fundamental matrix decomposition. Finally, to ensure accurate mapping, we introduce a globally consistent mapping strategy based on anchor propagation. By constructing and fusing anchors across the trajectory, we establish a direct local-to-global mapping relationship. This enables the application of nonlinear transformations to elastically align sub-maps, effectively eliminating geometric misalignments and ensuring a globally consistent reconstruction. The source code of CAL2M will be publicly available at https://github.com/IRMVLab/CALM.
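
The idea of using a fixed camera spacing to resolve scale ambiguity can be sketched as follows. This is an illustrative assumption-laden example, not CAL2M's implementation: the abstract states only that the spacing is constant, so the reference value `baseline`, the function names, and the sample coordinates are hypothetical.

```python
import math


def relative_scale(main_centers, assist_centers, baseline=1.0):
    """Scale factor that makes the mean main-to-assistant camera distance
    in a sub-map equal to `baseline` (a hypothetical reference value)."""
    dists = [math.dist(m, a) for m, a in zip(main_centers, assist_centers)]
    if not dists:
        raise ValueError("need at least one camera pair")
    mean_dist = sum(dists) / len(dists)
    if mean_dist == 0.0:
        raise ValueError("degenerate sub-map: camera centers coincide")
    return baseline / mean_dist


def rescale(points, scale):
    """Apply a uniform scale to a list of 3D points."""
    return [tuple(scale * c for c in p) for p in points]


# Hypothetical sub-map in arbitrary units: the two cameras sit 0.5 units apart.
main = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
assist = [(0.0, 0.5, 0.0), (1.0, 0.5, 0.0)]

s = relative_scale(main, assist, baseline=0.2)
scaled_main = rescale(main, s)
```

Applying the same `baseline` to every sub-map would put them in a common scale before any further alignment; the nonlinear, anchor-based alignment described in the abstract is not covered here.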

2604.14790 2026-04-17 cs.AI

Diffusion Crossover: Defining Evolutionary Recombination in Diffusion Models via Noise Sequence Interpolation

Chisatao Kumada, Satoru Hiwa, Tomoyuki Hiroyasu

Comments 14 pages, 7 figures, 2 tables

2604.14789 2026-04-17 cs.AI

A Comparative Study of CNN Optimization Methods for Edge AI: Exploring the Role of Early Exits

Nekane Fernandez, Ivan Valdes, Steven Van Vaerenbergh, Idoia de la Iglesia, Julen Arratibel

2604.14788 2026-04-17 cs.AI

Sequence Search: Automated Sequence Design using Neural Architecture Search

Rokgi Hong, Hongjun An, Sooyeon Ji, Jongho Lee

Comments 10 pages, 6 figures

2604.14782 2026-04-17 cs.CV

One-shot Compositional 3D Head Avatars with Deformable Hair

Yuan Sun, Xuan Wang, WeiLi Zhang, Wenxuan Zhang, Yu Guo, Fei Wang

Comments project page: https://yuansun-xjtu.github.io/CompHairHead.io

Abstract

We propose a compositional method for constructing a complete 3D head avatar from a single image. Prior one-shot holistic approaches frequently fail to produce realistic hair dynamics during animation, largely due to inadequate decoupling of hair from the facial region, resulting in entangled geometry and unnatural deformations. Our method explicitly decouples hair from the face, modeling these components using distinct deformation paradigms while integrating them into a unified rendering pipeline. Furthermore, by leveraging image-to-3D lifting techniques, we preserve fine-grained textures from the input image to the greatest extent possible, effectively mitigating the common issue of high-frequency information loss in generalized models. Specifically, given a frontal portrait image, we first perform hair removal to obtain a bald image. Both the original image and the bald image are then lifted to dense, detail-rich 3D Gaussian Splatting (3DGS) representations. For the bald 3DGS, we rig it to a FLAME mesh via non-rigid registration with a prior model, enabling natural deformation that follows the mesh triangles during animation. For the hair component, we employ semantic label supervision combined with a boundary-aware reassignment strategy to extract a clean and isolated set of hair Gaussians. To control hair deformation, we introduce a cage structure that supports Position-Based Dynamics (PBD) simulation, allowing realistic and physically plausible transformations of the hair Gaussian primitives under head motion, gravity, and inertial effects. Striking qualitative results, including dynamic animations under diverse head motions, gravity effects, and expressions, showcase substantially more realistic hair behavior alongside faithfully preserved facial details, outperforming state-of-the-art one-shot methods in perceptual realism.
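
For readers unfamiliar with Position-Based Dynamics, a minimal generic PBD step with distance constraints is sketched below. It is not the paper's cage structure: the particle layout, the function name `pbd_step`, and all parameter values are assumptions chosen for illustration.

```python
def pbd_step(pos, vel, inv_mass, edges, dt=1.0 / 60.0,
             gravity=(0.0, -9.81, 0.0), iters=10):
    """One generic PBD step: predict, project distance constraints, update.

    pos, vel: lists of 3D points/velocities; inv_mass: 0.0 pins a particle;
    edges: (i, j, rest_length) distance constraints.
    """
    # 1) Predict positions from velocities and gravity.
    pred = []
    for x, v, w in zip(pos, vel, inv_mass):
        if w == 0.0:
            pred.append(list(x))  # pinned particles do not move
        else:
            pred.append([x[k] + dt * (v[k] + dt * gravity[k]) for k in range(3)])

    # 2) Iteratively project each distance constraint back to its rest length.
    for _ in range(iters):
        for i, j, rest in edges:
            d = [pred[j][k] - pred[i][k] for k in range(3)]
            length = sum(c * c for c in d) ** 0.5
            w_sum = inv_mass[i] + inv_mass[j]
            if length == 0.0 or w_sum == 0.0:
                continue
            corr = (length - rest) / (length * w_sum)
            for k in range(3):
                pred[i][k] += inv_mass[i] * corr * d[k]
                pred[j][k] -= inv_mass[j] * corr * d[k]

    # 3) Derive velocities from the corrected positions.
    new_vel = [[(p[k] - x[k]) / dt for k in range(3)] for p, x in zip(pred, pos)]
    return pred, new_vel


# Hypothetical strand: one pinned root and one free tip, rest length 1.
positions = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
velocities = [[0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
inverse_masses = [0.0, 1.0]
constraints = [(0, 1, 1.0)]

for _ in range(30):
    positions, velocities = pbd_step(positions, velocities, inverse_masses, constraints)
```

The free particle swings downward under gravity while the constraint keeps it at its rest distance from the pinned root, which is the basic behavior a PBD-driven structure builds on.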

2604.14781 2026-04-17 cs.CV

Integrating Object Detection, LiDAR-Enhanced Depth Estimation, and Segmentation Models for Railway Environments

Enrico Francesco Giannico, Federico Nesti, Gianluca D'Amico, Mauro Marinoni, Edoardo Carosio, Filippo Salotti, Salvatore Sabina, Giorgio Buttazzo

Comments Under submission for publication

2604.14779 2026-04-17 cs.CV cs.CL

AIM: Asymmetric Information Masking for Visual Question Answering Continual Learning

Peifeng Zhang, Zice Qiu, Donghua Yu, Shilei Cao, Juepeng Zheng, Yutong Lu, Haohuan Fu

Comments 18 pages, 9 figures. Submitted to ACM MM 2026

2604.14773 2026-04-17 cs.CL

CoPA: Benchmarking Personalized Question Answering with Data-Informed Cognitive Factors

Hang Su, Zequn Liu, Chen Hu, Xuesong Lu, Yingce Xia, Zhen Liu

Comments Accepted to ACL. 30 pages, 10 figures

2604.14769 2026-04-17 cs.LG

Constraint-based Pre-training: From Structured Constraints to Scalable Model Initialization

Fu Feng, Yucheng Xie, Ruixiao Shi, Jing Wang, Xin Geng

2604.14768 2026-04-17 cs.AI

CoTEvol: Self-Evolving Chain-of-Thoughts for Data Synthesis in Mathematical Reasoning

Zhuo Wang, Zhuo Zhang, Yafu Li, Yu Cheng, Lizhen Qu, Zenglin Xu

Comments ACL 2026 Findings

2604.14765 2026-04-17 cs.LG math.OC math.PR

Wasserstein Formulation of Reinforcement Learning. An Optimal Transport Perspective on Policy Optimization

Mathias Dus

2604.14762 2026-04-17 cs.CV

OmniGCD: Abstracting Generalized Category Discovery for Modality Agnosticism

Jordan Shipard, Arnold Wiliem, Kien Nguyen Thanh, Wei Xiang, Clinton Fookes

Comments Accepted to CVPR 2026 Findings

2604.14755 2026-04-17 cs.CV

ASGNet: Adaptive Spectrum Guidance Network for Automatic Polyp Segmentation

Yanguang Sun, Hengmin Zhang, Jianjun Qian, Jian Yang, Lei Luo

Comments Accepted at TCSVT 2026

2604.14749 2026-04-17 cs.CL cs.AI

Which bird does not have wings: Negative-constrained KGQA with Schema-guided Semantic Matching and Self-directed Refinement

Midan Shim, Seokju Hwang, Kaehyun Um, Kyong-Ho Lee

Comments ACL 2026 Findings

2604.14747 2026-04-17 cs.CV cs.RO

Efficient closed-form approaches for pose estimation using Sylvester forms

Jana Vráblíková, Ezio Malis, Laurent Busé

2604.14746 2026-04-17 cs.AI

Disentangle-then-Refine: LLM-Guided Decoupling and Structure-Aware Refinement for Graph Contrastive Learning

Zhaoxing Li, Hai-Feng Zhang, Xiaoming Zhang

Comments Accepted at ICME 2026

2604.14739 2026-04-17 cs.LG

Assessing the Performance-Efficiency Trade-off of Foundation Models in Probabilistic Electricity Price Forecasting

Jan Niklas Lettner, Hadeer El Ashhab, Veit Hagenmeyer, Benjamin Schäfer

Comments Submitted to the 7th International Workshop on Energy Data and Analytics (EDA), held in conjunction with ACM e-Energy 2026

Abstract

Large-scale renewable energy deployment introduces pronounced volatility into the electricity system, turning grid operation into a complex stochastic optimization problem. Accurate electricity price forecasting (EPF) is essential not only to support operational decisions, such as optimal bidding strategies and balancing power preparation, but also to reduce economic risk and improve market efficiency. Probabilistic forecasts are particularly valuable because they quantify uncertainty stemming from renewable intermittency, market coupling, and regulatory changes, enabling market participants to make informed decisions that minimize losses and optimize expected revenues. However, it remains an open question which models to employ to produce accurate forecasts. Should these be task-specific machine learning (ML) models or Time Series Foundation Models (TSFMs)? In this work, we compare four models for day-ahead probabilistic EPF (PEPF) in European bidding zones: a deterministic NHITS backbone with Quantile-Regression Averaging (NHITS+QRA) and a conditional Normalizing-Flow forecaster (NF) are compared with two TSFMs, namely Moirai and ChronosX. On the one hand, we find that TSFMs outperform task-specific deep learning models trained from scratch in terms of CRPS, Energy Score, and predictive interval calibration across market conditions. On the other hand, we find that well-configured task-specific models, particularly NHITS combined with QRA, achieve performance very close to TSFMs, and in some scenarios, such as when supplied with additional informative feature groups or adapted via few-shot learning from other European markets, they can even surpass TSFMs. Overall, our findings show that while TSFMs offer expressive modeling capabilities, conventional models remain highly competitive, emphasizing the need to weigh computational expense against marginal performance improvements in PEPF.
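
As background on how quantile forecasts like those above are typically scored, the sketch below computes the pinball (quantile) loss and a common quantile-based approximation of CRPS, using the identity that CRPS equals twice the pinball loss integrated over all quantile levels. It is not the paper's evaluation code; the price, the quantile levels, and the forecast values are hypothetical.

```python
def pinball_loss(y, q, tau):
    """Pinball (quantile) loss of forecast q at level tau for observation y."""
    diff = y - q
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def crps_from_quantiles(y, quantiles):
    """Approximate CRPS as twice the mean pinball loss.

    quantiles: dict mapping level tau -> forecast value. The approximation
    assumes the levels are spread evenly over (0, 1).
    """
    if not quantiles:
        raise ValueError("need at least one quantile forecast")
    losses = [pinball_loss(y, q, tau) for tau, q in quantiles.items()]
    return 2.0 * sum(losses) / len(losses)


# Hypothetical day-ahead price (EUR/MWh) and a three-quantile forecast.
observed_price = 50.0
forecast = {0.1: 40.0, 0.5: 52.0, 0.9: 65.0}

score = crps_from_quantiles(observed_price, forecast)
```

Lower values are better; a denser grid of quantile levels gives a closer approximation to the full CRPS.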

2604.14738 2026-04-17 cs.AI

Personalized and Context-Aware Transformer Models for Predicting Post-Intervention Physiological Responses from Wearable Sensor Data

Esther Brown, Victoria Dean, Finale Doshi-Velez

2604.14733 2026-04-17 cs.RO

Differentiable Object Pose Connectivity Metrics for Regrasp Sequence Optimization

Liang Qin, Weiwei Wan, Kensuke Harada

2604.14727 2026-04-17 cs.LG

Expressivity of Transformers: A Tropical Geometry Perspective

Ye Su, Yong Liu

2604.14724 2026-04-17 cs.CV cs.LG eess.IV

HAMSA: Scanning-Free Vision State Space Models via SpectralPulseNet

Badri N. Patro, Vijay S. Agneeswaran