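arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.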

2604.14734 2026-04-24 cs.CV

Find the Differences: Differential Morphing Attack Detection vs Face Recognition

Una M. Kelly, Luuk J. Spreeuwers, Raymond N. J. Veldhuis

Abstract

Morphing is a challenge to face recognition (FR) for which several morphing attack detection solutions have been proposed. We argue that face recognition and differential morphing attack detection (D-MAD) in principle perform very similar tasks, which we support by comparing an FR system with two existing D-MAD approaches. We also show that currently used decision thresholds inherently lead to FR systems being vulnerable to morphing attacks and that this explains the tradeoff between performance on normal images and vulnerability to morphing attacks. We propose using FR systems that are already in place for morphing detection and introduce a new evaluation threshold that guarantees an upper limit to the vulnerability to morphing attacks - even of unknown types.

2604.14626 2026-04-24 cs.LG cs.AI cs.AR cs.DC

ELMoE-3D: Leveraging Intrinsic Elasticity of MoE for Hybrid-Bonding-Enabled Self-Speculative Decoding in On-Premises Serving

Yuseon Choi, Jingu Lee, Jungjun Oh, Sunjoo Whang, Byeongcheol Kim, Minsung Kim, Hoi-Jun Yoo, Sangjin Kim

2604.13589 2026-04-24 cs.CV

Dehaze-then-Splat: Generative Dehazing with Physics-Informed 3D Gaussian Splatting for Smoke-Free Novel View Synthesis

Boss Chen, Hanqing Wang

2604.13359 2026-04-24 cs.LG cs.AR eess.SP

BioTrain: Sub-MB, Sub-50mW On-Device Fine-Tuning for Edge-AI on Biosignals

Run Wang, Victor J. B. Jung, Philip Wiese, Sebastian Frey, Giusy Spacone, Francesco Conti, Alessio Burrello, Luca Benini

2604.12867 2026-04-24 cs.AI

QuarkMedSearch: A Long-Horizon Deep Search Agent for Exploring Medical Intelligence

Zhichao Lin, Zhichao Liang, Gaoqiang Liu, Meng Xu, Baoyu Xiang, Shuxin Zhao, Yao Wu, Jian Xu, Guanjun Jiang

2604.12710 2026-04-24 cs.LG cs.AI cs.CL

LASA: Language-Agnostic Semantic Alignment at the Semantic Bottleneck for LLM Safety

Junxiao Yang, Haoran Liu, Jinzhe Tu, Jiale Cheng, Zhexin Zhang, Shiyao Cui, Jiaqi Weng, Jialing Tao, Hui Xue, Hongning Wang, Han Qiu, Minlie Huang

2604.10275 2026-04-24 cs.CV

FastSHADE: Fast Self-augmented Hierarchical Asymmetric Denoising for Efficient inference on mobile devices

Nikolay Falaleev

Comments To appear in the Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) 2026

2604.10072 2026-04-24 cs.CL

Reason Only When Needed: Efficient Generative Reward Modeling via Model-Internal Uncertainty

Chao Xue, Yao Wang, Mengqiao Liu, Di Liang, Xingsheng Han, Peiyang Liu, Xianjie Wu, Chenyao Lu, Lei Jiang, Yu Lu, Haibo Shi, Shuang Liang, Minlong Peng, Flora D. Salim

Comments accepted by ACL 2026 Findings

2604.09251 2026-04-24 cs.AI

DRBENCHER: Can Your Agent Identify the Entity, Retrieve Its Properties and Do the Math?

Young-Suk Lee, Ramon Fernandez Astudillo, Radu Florian

2604.09000 2026-04-24 cs.CV

StreamMeCo: Long-Term Agent Memory Compression for Efficient Streaming Video Understanding

Junxi Wang, Te Sun, Jiayi Zhu, Junxian Li, Haowen Xu, Zichen Wen, Xuming Hu, Zhiyu Li, Linfeng Zhang

Comments ACL 2026 Findings

2604.05320 2026-04-24 cs.RO

ExpressMM: Expressive Mobile Manipulation Behaviors in Human-Robot Interactions

Souren Pashangpour, Haitong Wang, Matthew Lisondra, Goldie Nejat

2604.05260 2026-04-24 cs.RO cond-mat.soft cs.HC

ZipFold: Modular Actuators for Scaleable Adaptive Robots

Niklas Hagemann, Daniela Rus

2604.04395 2026-04-24 cs.CV cs.MM

BiTDiff: Fine-Grained 3D Conducting Motion Generation via BiMamba-Transformer Diffusion

Tianzhi Jia, Kaixing Yang, Xiaole Yang, Xulong Tang, Ke Qiu, Shikui Wei, Yao Zhao

Comments 15 pages, 7 figures

2604.03956 2026-04-24 cs.CV cs.AI

VLA-Forget: Vision-Language-Action Unlearning for Embodied Foundation Models

Ravi Ranjan, Agoritsa Polyzou

Comments 18 pages, 9 figures, Accepted to ACL-2026, KnowFM

2604.03873 2026-04-24 cs.LG cs.CL

SODA: Semi On-Policy Black-Box Distillation for Large Language Models

Xiwen Chen, Jingjing Wang, Wenhui Zhu, Peijie Qiu, Xuanzhao Dong, Hejian Sang, Zhipeng Wang, Alborz Geramifard, Feng Luo

2603.28342 2026-04-24 cs.CL cs.LG

Kernel-Smith: A Unified Recipe for Evolutionary Kernel Optimization

He Du, Qiming Ge, Jiakai Hu, Aijun Yang, Zheng Cai, Zixian Huang, Sheng Yuan, Qinxiu Cheng, Xinchen Xie, Yicheng Chen, Yining Li, Jiaxing Xie, Huanan Dong, Yaguang Wu, Xiangjun Huang, Jian Yang, Hui Wang, Bowen Zhou, Bowen Li, Qipeng Guo, Kai Chen

2603.27820 2026-04-24 cs.CL

Improving Clinical Diagnosis with Counterfactual Multi-Agent Reasoning

Zhiwen You, Xi Chen, Aniket Vashishtha, Simo Du, Gabriel Erion-Barner, Hongyuan Mei, Hao Peng, Yue Guo

Abstract

Clinical diagnosis is a complex reasoning process in which clinicians gather evidence, form hypotheses, and test them against alternative explanations. In medical training, this reasoning is explicitly developed through counterfactual questioning--e.g., asking how a diagnosis would change if a key symptom were absent or altered--to strengthen differential diagnosis skills. As large language model (LLM)-based systems are increasingly used for diagnostic support, ensuring the interpretability of their recommendations becomes critical. However, most existing LLM-based diagnostic agents reason over fixed clinical evidence without explicitly testing how individual findings support or weaken competing diagnoses. In this work, we propose a counterfactual multi-agent diagnostic framework inspired by clinician training that makes hypothesis testing explicit and evidence-grounded. Our framework introduces counterfactual case editing to modify clinical findings and evaluate how these changes affect competing diagnoses. We further define the Counterfactual Probability Gap, a method that quantifies how strongly individual findings support a diagnosis by measuring confidence shifts under these edits. These counterfactual signals guide multi-round specialist discussions, enabling agents to challenge unsupported hypotheses, refine differential diagnoses, and produce more interpretable reasoning trajectories. Across three diagnostic benchmarks and seven LLMs, our method consistently improves diagnostic accuracy over prompting and prior multi-agent baselines, with the largest gains observed in complex and ambiguous cases. Human evaluation further indicates that our framework produces more clinically useful, reliable, and coherent reasoning. These results suggest that incorporating counterfactual evidence verification is an important step toward building reliable AI systems for clinical decision support.
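
The abstract describes the Counterfactual Probability Gap only in words: a confidence shift measured when a clinical finding is edited. One way to read that is sketched below. This is a minimal illustration under stated assumptions, not the authors' implementation: the boolean case representation, the `probability_gap` function, and the `toy_confidence` stand-in (which replaces a real LLM confidence query) are all hypothetical.

```python
from typing import Callable, Dict

# Hypothetical case representation: finding name -> present (True) / absent (False).
Case = Dict[str, bool]


def probability_gap(
    confidence: Callable[[Case, str], float],
    case: Case,
    finding: str,
    diagnosis: str,
) -> float:
    """Confidence in `diagnosis` on the original case minus the confidence
    after a counterfactual edit that flips `finding`."""
    edited = dict(case)  # copy so the original case is left untouched
    edited[finding] = not case[finding]
    return confidence(case, diagnosis) - confidence(edited, diagnosis)


def toy_confidence(case: Case, diagnosis: str) -> float:
    """Stand-in for a model's confidence; a real system would query an LLM.
    The weights are made up for illustration only."""
    weights = {"pneumonia": {"fever": 0.3, "cough": 0.2, "rash": 0.0}}
    base = 0.1
    return base + sum(
        w for f, w in weights[diagnosis].items() if case.get(f, False)
    )


case = {"fever": True, "cough": True, "rash": False}
gaps = {f: probability_gap(toy_confidence, case, f, "pneumonia") for f in case}
```

In this toy setup, `fever` receives the largest gap, `cough` a smaller one, and `rash` a gap of zero, so a discussion round could prioritise the findings whose removal shifts confidence most.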

2603.27406 2026-04-24 cs.AI cs.LG

On the Relationship between Bayesian Networks and Probabilistic Structural Causal Models

Peter J. F. Lucas, Eleonora Zullo, Fabio Stella

2603.22823 2026-04-24 cs.AI

Empirical Comparison of Agent Communication Protocols for Task Orchestration

Ivan Dobrovolskyi

2603.16797 2026-04-24 cs.LG cs.CV

Adaptive Moments are Surprisingly Effective for Plug-and-Play Diffusion Sampling

Christian Belardi, Justin Lovelace, Kilian Q. Weinberger, Carla P. Gomes

2603.12845 2026-04-24 cs.CV

Multimodal Protein Language Models for Enzyme Kinetic Parameters: From Substrate Recognition to Conformational Adaptation

Fei Wang, Xinye Zheng, Kun Li, Yanyan Wei, Yuxin Liu, Ganpeng Hu, Tong Bao, Jingwen Yang

Comments Accepted by CVPR 2026

Abstract

Predicting enzyme kinetic parameters quantifies how efficiently an enzyme catalyzes a specific substrate under defined biochemical conditions. Canonical parameters such as the turnover number ($k_\text{cat}$), Michaelis constant ($K_\text{m}$), and inhibition constant ($K_\text{i}$) depend jointly on the enzyme sequence, the substrate chemistry, and the conformational adaptation of the active site during binding. Many learning pipelines simplify this process to a static compatibility problem between the enzyme and substrate, fusing their representations through shallow operations and regressing a single value. Such formulations overlook the staged nature of catalysis, which involves both substrate recognition and conformational adaptation. In this regard, we reformulate kinetic prediction as a staged multimodal conditional modeling problem and introduce the Enzyme-Reaction Bridging Adapter (ERBA), which injects cross-modal information via fine-tuning into Protein Language Models (PLMs) while preserving their biochemical priors. ERBA performs conditioning in two stages: Molecular Recognition Cross-Attention (MRCA) first injects substrate information into the enzyme representation to capture specificity; Geometry-aware Mixture-of-Experts (G-MoE) then integrates active-site structure and routes samples to pocket-specialized experts to reflect induced fit. To maintain semantic fidelity, Enzyme-Substrate Distribution Alignment (ESDA) enforces distributional consistency within the PLM manifold in a reproducing kernel Hilbert space. Across three kinetic endpoints and multiple PLM backbones, ERBA delivers consistent gains and stronger out-of-distribution performance compared with sequence-only and shallow-fusion baselines, offering a biologically grounded route to scalable kinetic prediction and a foundation for adding cofactors, mutations, and time-resolved structural cues.

2603.07961 2026-04-24 cs.CV

SGG-R$^{\rm 3}$: From Next-Token Prediction to End-to-End Unbiased Scene Graph Generation

Jiaye Feng, Qixiang Yin, Yuankun Liu, Tong Mo, Weiping Li

2602.23408 2026-04-24 cs.RO cs.CV

Demystifying Action Space Design for Robotic Manipulation Policies

Yuchun Feng, Jinliang Zheng, Zhihao Wang, Dongxiu Liu, Jianxiong Li, Jiangmiao Pang, Tai Wang, Xianyuan Zhan

2602.11569 2026-04-24 cs.AI

SemaPop: Semantic-Persona Conditioned and Controllable Population Synthesis

Zhenlin Qin, Yancheng Ling, Leizhen Wang, Francisco Câmara Pereira, Zhenliang Ma

Comments Submitted to Transportation Research Part C: Emerging Technologies

2602.03875 2026-04-24 cs.LG cs.AI q-bio.QM

Reversible Deep Learning for 13C NMR in Chemoinformatics: On Structures and Spectra

Stefan Kuhn, Vandana Dwarka, Przemyslaw Karol Grenda, Eero Vainikko

Comments 10 pages, 4 figures, 4 tables

2602.02409 2026-04-24 cs.CV

Catalyst: Out-of-Distribution Detection via Elastic Scaling

Abid Hassan, Tuan Ngo, Saad Shafiq, Nenad Medvidovic

Comments Accepted at Conference on Computer Vision and Pattern Recognition (CVPR) 2026. arXiv admin note: text overlap with arXiv:2601.22703

2602.01493 2026-04-24 cs.LG cs.AI

OpInf-LLM: Parametric PDE Solving with LLMs via Operator Inference

Zhuoyuan Wang, Hanjiang Hu, Xiyu Deng, Saviz Mowlavi, Yorie Nakahira

2602.00931 2026-04-24 cs.LG cs.AI

Continuous-Utility Direct Preference Optimization

Muhammad Ahmed Mohsin, Muhammad Umer, Ahsan Bilal, Zihao He, Muhammad Usman Rafique, Asad Aali, Muhammad Ali Jamshed, John M. Cioffi, Emily Fox

2602.00469 2026-04-24 cs.CL cs.AI

Words that make SENSE: Sensorimotor Norms in Learned Lexical Token Representations

Abhinav Gupta, Toben H. Mintz, Jesse Thomason

Comments 5 pages, 2 figures, codebase can be found at: https://github.com/abhinav-usc/SENSE-model/tree/main

2601.22703 2026-04-24 cs.CV

DAVIS: OOD Detection via Dominant Activations and Variance for Increased Separation

Abid Hassan, Tuan Ngo, Saad Shafiq, Nenad Medvidovic