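arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translation, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.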

2507.02314 2026-04-30 cs.CV cs.AI

MAGIC: Few-Shot Mask-Guided Anomaly Inpainting with Prompt Perturbation, Spatially Adaptive Guidance, and Context Awareness

JaeHyuck Choi, MinJun Kim, Je Hyeong Hong

Comments Accepted at CVPR 2026 Findings. Supplementary material included after references. 47 pages, 47 figures, 28 tables. Code: https://github.com/SpatialAILab/MAGIC

2506.21444 2026-04-30 cs.CV

Benchmarking Deep Learning and Vision Foundation Models for Atypical vs. Normal Mitosis Classification with Cross-Dataset Evaluation

Sweta Banerjee, Viktoria Weiss, Taryn A. Donovan, Rutger H. J. Fick, Thomas Conrad, Jonas Ammeling, Nils Porsche, Robert Klopfleisch, Christopher Kaltenecker, Katharina Breininger, Marc Aubreville, Christof A. Bertram

Comments Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2026:006

Details

DOI: 10.59275/j.melba.2026-6c1g
Journal ref: Machine.Learning.for.Biomedical.Imaging. 2026 (2026)

Abstract

Atypical mitosis marks a deviation in the cell division process that has been shown to be an independent prognostic marker for tumor malignancy. However, atypical mitosis classification remains challenging due to low prevalence, at times subtle morphological differences from normal mitotic figures, low inter-rater agreement among pathologists, and class imbalance in datasets. Building on the Atypical Mitosis dataset for Breast Cancer (AMi-Br), this study presents a comprehensive benchmark comparing deep learning approaches for automated atypical mitotic figure (AMF) classification, including end-to-end trained deep learning models, foundation models with linear probing, and foundation models fine-tuned with low-rank adaptation (LoRA). For rigorous evaluation, we further introduce two new held-out AMF datasets: AtNorM-Br, a dataset of mitotic figures from the TCGA breast cancer cohort, and AtNorM-MD, a multi-domain dataset of mitotic figures from a subset of the MIDOG++ training set. We found average balanced accuracy values of up to 0.8135, 0.7788, and 0.7723 on the in-domain AMi-Br and the out-of-domain AtNorM-Br and AtNorM-MD datasets, respectively. Our work shows that atypical mitotic figure classification, while a challenging problem, can be effectively addressed through recent advances in transfer learning and model fine-tuning techniques. We make all code and data used in this paper available in this GitHub repository: https://github.com/DeepMicroscopy/AMi-Br_Benchmark.

2505.21072 2026-04-30 cs.CL

Faithfulness-Aware Uncertainty Quantification for Fact-Checking the Output of Retrieval Augmented Generation

Ekaterina Fadeeva, Aleksandr Rubashevskii, Dzianis Piatrashyn, Roman Vashurin, Shehzaad Dhuliawala, Artem Shelmanov, Timothy Baldwin, Preslav Nakov, Mrinmaya Sachan, Maxim Panov

2504.15458 2026-04-30 cs.LG hep-ph nucl-th quant-ph

Compton Form Factor Extraction using Quantum Deep Neural Networks

Brandon B. Le, Dustin Keller

Comments 24 pages, 14 figures. v4: matches published version

2502.11614 2026-04-30 cs.CL cs.AI

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

Yuxia Wang, Rui Xing, Jonibek Mansurov, Giovanni Puccetti, Zhuohan Xie, Minh Ngoc Ta, Jiahui Geng, Jinyan Su, Mervat Abassy, Saad El Dine Ahmed, Kareem Elozeiri, Nurkhan Laiyk, Maiya Goloburda, Tarek Mahmoud, Raj Vardhan Tomar, Alexander Aziz, Ryuto Koike, Masahiro Kaneko, Artem Shelmanov, Ekaterina Artemova, Vladislav Mikhailov, Akim Tsvigun, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov

Comments ACL 2026 Main

2412.13682 2026-04-30 cs.AI cs.CL

ChinaTravel: An Open-Ended Travel Planning Benchmark with Compositional Constraint Validation for Language Agents

Jie-Jing Shao, Bo-Wen Zhang, Xiao-Wen Yang, Baizhi Chen, Si-Yu Han, Jinghao Pang, Wen-Da Wei, Guohao Cai, Zhenhua Dong, Lan-Zhe Guo, Yu-Feng Li

Comments ICLR 2026. Webpage: https://www.lamda.nju.edu.cn/shaojj/chinatravel

2412.11399 2026-04-30 cs.LG eess.SP

Quantifying Climate Change Impacts on Renewable Energy Generation: A Super-Resolution Recurrent Diffusion Model

Xiaochong Dong, Jun Dan, Yingyun Sun, Yang Liu, Xuemin Zhang, Shengwei Mei

Comments Accepted by CSEE Journal of Power and Energy Systems in Jul. 2025

2412.10679 2026-04-30 cs.CV eess.IV

U-FaceBP: Uncertainty-aware Bayesian Ensemble Deep Learning for Face Video-based Blood Pressure Estimation

Yusuke Akamatsu, Akinori F. Ebihara, Terumi Umematsu

Comments Accepted to IEEE Transactions on Instrumentation and Measurement

2409.12059 2026-04-30 cs.CL cs.AI cs.LG

MeTHanol: Modularized Thinking Language Models with Intermediate Layer Thinking, Decoding and Bootstrapping Reasoning

Ningyuan Xi, Xiaoyu Wang, Yetao Wu, Teng Chen, Qingqing Gu, Yue Zhao, Jinxian Qu, Zhonglin Jiang, Yong Chen, Luo Ji

Comments 19 pages, 7 figures. IJCNN2025

2409.06624 2026-04-30 cs.CL cs.AI cs.LG

A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio

Ningyuan Xi, Yetao Wu, Kun Fan, Teng Chen, Qingqing Gu, Luo Ji

Comments 12 pages, 2 figures. PAKDD2025

2309.09346 2026-04-30 cs.AI cs.RO

Speech-Gesture GAN: Gesture Generation for Robots and Embodied Agents

Carson Yu Liu, Gelareh Mohammadi, Yang Song, Wafa Johal

Comments RO-MAN'23, 32nd IEEE International Conference on Robot and Human Interactive Communication (RO-MAN), August 2023, Busan, South Korea

2604.26633 2026-04-30 cs.CV cs.AI

SynSur: An end-to-end generative pipeline for synthetic industrial surface defect generation and detection

Paul Julius Kühn, Mika Pommeranz, Arjan Kuijper, Saptarshi Neil Sinha

2604.26630 2026-04-30 cs.CL

SAGE: A Strategy-Aware Graph-Enhanced Generation Framework For Online Counseling

Eliya Naomi Aharon, Meytal Grimland, Avi Segal, Loona Ben Dayan, Inbar Shenfeld, Yossi Levi Belz, Kobi Gal

Comments Full version of the work accepted as a short paper at the 34th ACM Conference on User Modeling, Adaptation and Personalization (UMAP '26). 9 pages, 4 figures, 5 tables

2604.26626 2026-04-30 cs.RO

STAR-Filter: Efficient Convex Free-Space Approximation via Starshaped Set Filtering in Noisy Environments

Yuwei Wu, Yichen Zhao, Dexter Ong, Vijay Kumar

2604.26622 2026-04-30 cs.CL

OCR-Memory: Optical Context Retrieval for Long-Horizon Agent Memory

Jinze Li, Yang Zhang, Xin Yang, Jiayi Qu, Jinfeng Xu, Shuo Yang, Junhua Ding, Edith Cheuk-Han Ngai

Comments Accepted to ACL 2026 (Main Conference)

2604.26620 2026-04-30 cs.CV

SnapPose3D: Diffusion-Based Single-Frame 2D-to-3D Lifting of Human Poses

Alessandro Simoni, Riccardo Catalini, Davide Di Nucci, Guido Borghi, Davide Davoli, Lorenzo Garattoni, Gianpiero Francesca, Yuki Kawana, Roberto Vezzani

Comments Accepted at ICPR 2026

2604.26619 2026-04-30 cs.CL

Zero-Shot to Full-Resource: Cross-lingual Transfer Strategies for Aspect-Based Sentiment Analysis

Jakob Fehle, Nils Constantin Hellwig, Udo Kruschwitz, Christian Wolff

2604.26614 2026-04-30 cs.CV

State Beyond Appearance: Diagnosing and Improving State Consistency in Dial-Based Measurement Reading

Yuanze Hu, Gen Li, Yuqin Lan, Qingchen Yu, Zhichao Yang, Junwei Jing, Zhaoxin Fan, Xiaotie Deng

2604.26607 2026-04-30 cs.AI cs.CY cs.SE

Human-in-the-Loop Benchmarking of Heterogeneous LLMs for Automated Competency Assessment in Secondary Level Mathematics

Jatin Bhusal, Nancy Mahatha, Aayush Acharya, Raunak Regmi

Comments 5 pages, 3 figures, 5 tables. Submitted to 2AI-2026-Applied AI Conference

2604.26604 2026-04-30 cs.LG

Who Trains Matters: Federated Learning under Enrollment and Participation Selection Biases

Gota Morishita

Comments 10 pages, 2 figures

2604.26598 2026-04-30 cs.CV

FunFace: Feature Utility and Norm Estimation for Face Recognition

Žiga Babnik, Fadi Boutros, Naser Damer, Deepak Kumar Jain, Peter Peer, Vitomir Štruc

2604.26597 2026-04-30 cs.CL cs.AI

Translating Under Pressure: Domain-Aware LLMs for Crisis Communication

Antonio Castaldo, Maria Carmen Staiano, Johanna Monti, Sheila Castilho, Francesca Chiusaroli

2604.26593 2026-04-30 cs.LG physics.app-ph

PiGGO: Physics-Guided Learnable Graph Kalman Filters for Virtual Sensing of Nonlinear Dynamic Structures under Uncertainty

Marcus Haywood-Alexander, Gregory Duthé, Eleni Chatzi

2604.26582 2026-04-30 cs.CV cs.AI

Star-Fusion: A Multi-modal Transformer Architecture for Discrete Celestial Orientation via Spherical Topology

May Hammad, Menatallh Hammad

2604.26577 2026-04-30 cs.AI cs.CY cs.RO

Benchmarking the Safety of Large Language Models for Robotic Health Attendant Control

Mahiro Nakao, Kazuhiro Takemoto

Comments 20 pages, 9 figures, 3 tables, 8 pages supplementary material

2604.26573 2026-04-30 cs.LG

PAINT: Partial-Solution Adaptive Interpolated Training for Self-Distilled Reasoners

Zhiquan Tan, Yinrong Hong

2604.26569 2026-04-30 cs.RO

LLM-Flax: Generalizable Robotic Task Planning via Neuro-Symbolic Approaches with Large Language Models

Seongmin Kim, Daegyu Lee

2604.26568 2026-04-30 cs.CL

Multimodal LLMs are not all you need for Pediatric Speech Language Pathology

Darren Fürst, Sebastian Steindl, Ulrich Schäfer

2604.26567 2026-04-30 cs.CV

AirZoo: A Unified Large-Scale Dataset for Grounding Aerial Geometric 3D Vision

Xiaoya Cheng, Rouwan Wu, Xinyi Liu, Zeyu Cui, Yan Liu, Na Zhao, Yu Liu, Maojun Zhang, Shen Yan

2604.26565 2026-04-30 cs.CV

DenseStep2M: A Scalable, Training-Free Pipeline for Dense Instructional Video Annotation

Mingji Ge, Qirui Chen, Zeqian Li, Weidi Xie