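arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.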

2604.27598 2026-05-01 cs.LG

Privacy-Preserving Federated Learning via Differential Privacy and Homomorphic Encryption for Cardiovascular Disease Risk Modeling

Gaurang Sharma, Juha Pajula, Aada Illikainen, Markus Rautell, Noora Lipsonen, Petri Alhainen, Mika Hilvo

Details

English abstract

Protecting sensitive health data while enabling collaborative analysis is a central challenge in healthcare. Traditional machine learning (ML) requires institutions to pool anonymized patient records, centralizing analytical development and privacy risks at a single site. Privacy-enhancing technologies (PETs), including Differential Privacy (DP) and Homomorphic Encryption (HE), can mitigate these risks. However, they are mainly studied in conventional data-sharing settings and often introduce trade-offs, including reduced model utility, higher computational cost, and increased implementation complexity. Federated Learning (FL) reduces data centralization by enabling institutions to train models locally and share only model updates. Nevertheless, FL does not eliminate privacy risks, as shared parameters or gradients may still reveal sensitive information. Integrating DP or HE into FL can strengthen privacy guarantees, yet their comparative performance and deployment implications in real-world healthcare settings remain unclear. We systematically evaluated DP and HE integration in FL under real-world conditions, comparing them with standard FL and centralized ML (cML) to quantify privacy-utility trade-offs in multi-institutional settings. Using nationwide Swedish healthcare data, we evaluated cardiovascular disease risk prediction using logistic regression (LR) and neural network (NN) learners. FL with HE achieved performance comparable to cML but introduced measurable cryptographic overhead, particularly in the NN implementation. FL with DP incurred lower computational cost; however, LR was more sensitive to calibrated noise than the NN, resulting in greater performance degradation. Our findings provide practical guidance for deploying privacy-preserving FL in fragmented healthcare systems.
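To illustrate the DP-in-FL idea this abstract describes, here is a minimal sketch of one aggregation round in which each site's model update is clipped and Gaussian noise is added to the average. The abstract does not say which mechanism the authors used, so the function names, the clipping bound `max_norm`, and the `noise_multiplier` parameter are hypothetical choices for this sketch. No privacy-budget accounting is shown.

```python
import math
import random

def clip_update(update, max_norm):
    """Scale an update vector so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(u * u for u in update))
    if norm <= max_norm or norm == 0.0:
        return list(update)
    scale = max_norm / norm
    return [u * scale for u in update]

def dp_federated_average(client_updates, max_norm, noise_multiplier, rng):
    """Average clipped client updates, then add Gaussian noise to the mean.

    Assumption for this sketch: the noise standard deviation is
    noise_multiplier * max_norm / number_of_clients.
    """
    n = len(client_updates)
    clipped = [clip_update(u, max_norm) for u in client_updates]
    dim = len(clipped[0])
    mean = [sum(c[i] for c in clipped) / n for i in range(dim)]
    sigma = noise_multiplier * max_norm / n
    return [m + rng.gauss(0.0, sigma) for m in mean]

if __name__ == "__main__":
    rng = random.Random(0)
    updates = [[3.0, 4.0], [0.1, -0.2], [0.0, 0.5]]  # made-up site updates
    print(dp_federated_average(updates, max_norm=1.0, noise_multiplier=1.0, rng=rng))
```

A larger `noise_multiplier` gives stronger protection at the cost of a noisier aggregate, which is the privacy-utility trade-off the abstract refers to.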

2604.27596 2026-05-01 cs.CV

SECOS: Semantic Capture for Rigorous Classification in Open-World Semi-Supervised Learning

Hezhao Liu, Jiacheng Yang, Junlong Gao, Mengke Li, Yiqun Zhang, Shreyank N Gowda, Yang Lu

Comments Accepted by CVPR 2026

2604.27591 2026-05-01 cs.CV cs.AI

ClipTBP: Clip-Pair based Temporal Boundary Prediction with Boundary-Aware Learning for Moment Retrieval

Ji-Hyeon Kim, Ho-Joong Kim, Seong-Whan Lee

Comments 15 pages

2604.27590 2026-05-01 cs.CV

Fake3DGS: A Benchmark for 3D Manipulation Detection in Neural Rendering

Davide Di Nucci, Riccardo Catalini, Guido Borghi, Roberto Vezzani

Comments Accepted at ICPR 2026. Code and data: https://github.com/iot-unimore/Fake3DGS

2604.27586 2026-05-01 cs.AI cs.LG

Trace-Level Analysis of Information Contamination in Multi-Agent Systems

Anna Mazhar, Huzaifa Suri, Sainyam Galhotra

2604.27582 2026-05-01 cs.CV

Assessing Pancreatic Ductal Adenocarcinoma Vascular Invasion: the PDACVI Benchmark

M. Riera-Marín, O. K. Sikha, J. Rodríguez-Comas, M. S. May, T. Kirscher, X. Coubez, P. Meyer, S. Faisan, Z. Pan, X. Zhou, X. Liang, C. Hémon, V. Boussot, J. -L. Dillenseger, J. -C. Nunes, K. -C. Kahl, C. Lüth, J. Traub, P. -H. Conze, M. M. Duh, A. Aubanell, R. de Figueiredo Cardoso, S. Egger-Hackenschmidt, J. García-López, M. A. González-Ballester, A. Galdran

2604.27578 2026-05-01 cs.CV

World2Minecraft: Occupancy-Driven Simulated Scenes Construction

Lechao Zhang, Haoran Xu, Jingyu Gong, Xuhong Wang, Yuan Xie, Xin Tan

2604.27574 2026-05-01 cs.LG cs.AI cs.IT eess.SP math.IT

Statistical Channel Fingerprint Construction for Massive MIMO: A Unified Tensor Learning Framework

Zhenzhou Jin, Li You, Xiang-Gen Xia, Xiqi Gao

Comments 15 pages, 7 figures

Details

English abstract

Channel fingerprint (CF) is considered a key enabler for facilitating the acquisition of channel state information (CSI) in massive multiple-input multiple-output (MIMO) communication systems. In this work, we investigate a novel type of CF that stores statistical CSI (sCSI) at each potential location, referred to as statistical CF (sCF). Specifically, we reveal the relationship between sCSI, namely the channel spatial covariance matrix (CSCM), and the channel power angular spectrum (CPAS). Building on this foundation, we construct a unified tensor representation of the sCF and further reduce its dimension by exploiting the eigenvalue decomposition of the CSCM and its correlation with the PAS. Considering the practical constraints imposed by measurement cost, privacy, and security, we focus on three representative scenarios and uniformly formulate them as tensor restoration tasks. To this end, we propose a unified tensor-based learning architecture, termed LPWTNet. The architecture incorporates a closed-form Laplacian pyramid (LP) decomposition and reconstruction framework that replaces the traditional encoder-decoder structure, enabling efficient inference while capturing multi-scale frequency subband characteristics of the sCF. Additionally, a shared mask learning strategy is introduced to adaptively refine high-frequency sCF components through level-wise adjustments. To achieve a larger receptive field without over-parameterization, we further propose a small-kernel convolution mechanism based on the wavelet transform (WT), which decouples convolution across different frequency components of the sCF and enhances feature extraction efficiency. Extensive experiments show that the proposed approach delivers competitive reconstruction accuracy and computational efficiency across various sCF construction scenarios when compared with state-of-the-art baselines.
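The closed-form Laplacian pyramid decomposition and reconstruction mentioned in this abstract can be illustrated on a 1-D signal. This is only a sketch under simplifying assumptions, not the LPWTNet architecture: it uses pair averaging instead of a Gaussian filter, has no learned components, and requires the signal length to be divisible by 2 for every level. All function names are hypothetical.

```python
def downsample(signal):
    """Halve the length by averaging neighbouring pairs (box filter)."""
    return [(signal[i] + signal[i + 1]) / 2.0 for i in range(0, len(signal), 2)]

def upsample(signal):
    """Double the length by repeating each sample."""
    return [value for value in signal for _ in range(2)]

def lp_decompose(signal, levels):
    """Return (high-frequency residuals per level, coarsest approximation)."""
    residuals = []
    current = list(signal)
    for _ in range(levels):
        coarse = downsample(current)
        predicted = upsample(coarse)
        # The residual holds the detail lost by going to the coarser level.
        residuals.append([c - p for c, p in zip(current, predicted)])
        current = coarse
    return residuals, current

def lp_reconstruct(residuals, coarse):
    """Invert lp_decompose by upsampling and adding residuals back."""
    current = list(coarse)
    for residual in reversed(residuals):
        current = [p + r for p, r in zip(upsample(current), residual)]
    return current

if __name__ == "__main__":
    signal = [1.0, 3.0, 5.0, 7.0, 2.0, 4.0, 6.0, 8.0]
    residuals, coarse = lp_decompose(signal, levels=2)
    print(coarse)
    print(lp_reconstruct(residuals, coarse))
```

Because reconstruction only upsamples and adds the stored residuals, it needs no trained decoder, which is the property the abstract relies on for efficient inference.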

2604.27564 2026-05-01 cs.LG

Learning from a single labeled face and a stream of unlabeled data

Branislav Kveton, Michal Valko

Comments Published at IEEE International Conference on Automatic Face and Gesture Recognition (FG 2013). doi:10.1109/FG.2013.6553720

2604.27563 2026-05-01 cs.LG

Bayesian policy gradient and actor-critic algorithms

Mohammad Ghavamzadeh, Yaakov Engel, Michal Valko

Comments Published in Journal of Machine Learning Research 17(66):1-53, 2016

Details

Journal ref: Journal of Machine Learning Research 17(66):1-53, 2016

English abstract

Policy gradient methods are reinforcement learning algorithms that adapt a parameterized policy by following a performance gradient estimate. Conventional policy gradient methods use Monte-Carlo techniques to estimate the gradient, which tend to have high variance, requiring many samples and resulting in slow convergence. We first propose a Bayesian framework for policy gradient, based on modeling the policy gradient as a Gaussian process. This reduces the number of samples needed to obtain accurate gradient estimates. Moreover, estimates of the natural gradient and a measure of the uncertainty in the gradient estimates, namely, the gradient covariance, are provided at little extra cost. Since the proposed framework considers system trajectories as its basic observable unit, it does not require the dynamics within trajectories to be of any particular form, and can be extended to partially observable problems. On the downside, it cannot exploit the Markov property when the system is Markovian. To address this, we supplement our Bayesian policy gradient framework with a new actor-critic learning model in which a Bayesian class of non-parametric critics, based on Gaussian process temporal difference learning, is used. Such critics model the action-value function as a Gaussian process, allowing Bayes rule to be used to compute the posterior distribution over action-value functions, conditioned on the observed data. Appropriate choices of the policy parameterization and of the prior covariance (kernel) between action-values yield closed-form expressions for the posterior of the gradient of the expected return with respect to the policy parameters. We perform detailed experimental comparisons of the proposed Bayesian policy gradient and actor-critic algorithms with classic Monte-Carlo based policy gradient methods, on a number of reinforcement learning problems.
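For context, the conventional Monte-Carlo gradient estimate that this abstract contrasts with can be sketched on a toy problem. The example below is not the Bayesian method of the paper. It is a plain score-function estimator for a hypothetical two-armed bandit with a sigmoid policy, and every name and parameter in it is an assumption made for illustration.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def mc_policy_gradient(theta, rewards, num_samples, rng):
    """Monte-Carlo score-function estimate of d E[reward] / d theta.

    Policy: choose arm 1 with probability sigmoid(theta), else arm 0.
    rewards: (reward of arm 0, reward of arm 1), deterministic here.
    """
    p = sigmoid(theta)
    total = 0.0
    for _ in range(num_samples):
        action = 1 if rng.random() < p else 0
        score = action - p  # derivative of log pi(action | theta) w.r.t. theta
        total += rewards[action] * score
    return total / num_samples

def exact_policy_gradient(theta, rewards):
    """Closed-form gradient for this toy problem, used as a reference."""
    p = sigmoid(theta)
    return p * (1.0 - p) * (rewards[1] - rewards[0])

if __name__ == "__main__":
    rng = random.Random(0)
    for n in (10, 100, 10000):
        print(n, mc_policy_gradient(0.0, (0.0, 1.0), n, rng))
    print("exact", exact_policy_gradient(0.0, (0.0, 1.0)))
```

With few samples the estimate fluctuates noticeably around the exact value, which is the high-variance behaviour that motivates modelling the gradient with a Gaussian process.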

2604.27562 2026-05-01 cs.LG

Online semi-supervised perception: Real-time learning without explicit feedback

Branislav Kveton, Michal Valko, Matthai Phillipose, Ling Huang

Comments IEEE Computer Vision and Pattern Recognition Workshop on Online Learning for Computer Vision (CVPR 2010 OLCV)

2604.27559 2026-05-01 cs.CV cs.AI

RIHA: Report-Image Hierarchical Alignment for Radiology Report Generation

Yucheng Chen, Yang Yu, Yufei Shi, Conghao Xiong, Xulei Yang, Si Yong Yeo

Comments Accepted by Journal of Biomedical and Health Informatics (JBHI)

Details

DOI: 10.1109/JBHI.2026.3670023

English abstract

Radiology report generation (RRG) has emerged as a promising approach to alleviate radiologists' workload and reduce human errors by automatically generating diagnostic reports from medical images. A key challenge in RRG is achieving fine-grained alignment between complex visual features and the hierarchical structure of long-form radiology reports. Although recent methods have improved image-text representation learning, they often treat reports as flat sequences, overlooking their structured sections and semantic hierarchies. This simplification hinders precise cross-modal alignment and weakens RRG accuracy. To address this challenge, we propose RIHA (Report-Image Hierarchical Alignment Transformer), a novel end-to-end framework that performs multi-level alignment between radiological images and their corresponding reports across paragraph, sentence, and word levels. This hierarchical alignment enables more precise cross-modal mapping, essential for capturing the nuanced semantics embedded in clinical narratives. Specifically, RIHA introduces a Visual Feature Pyramid (VFP) to extract multi-scale visual features and a Text Feature Pyramid (TFP) to represent multi-granularity textual structures. These components are integrated through a Cross-modal Hierarchical Alignment (CHA) module, leveraging optimal transport to effectively align visual and textual features across various levels. Furthermore, we incorporate Relative Positional Encoding (RPE) into the decoder to model spatial and semantic relationships among tokens, enhancing the token-level alignment between visual features and generated text. Extensive experiments on two benchmark chest X-ray datasets, IU-Xray and MIMIC-CXR, demonstrate that RIHA outperforms existing state-of-the-art models in both natural language generation and clinical efficacy metrics.
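The abstract says the CHA module uses optimal transport to align visual and textual features but gives no further detail. As a generic illustration only, the sketch below computes an entropy-regularised transport plan with Sinkhorn iterations. The cost matrix, the uniform masses, `epsilon`, and the function name are all hypothetical. In an alignment setting the cost could be, for example, one minus the cosine similarity between an image-region feature and a sentence feature.

```python
import math

def sinkhorn(cost, row_mass, col_mass, epsilon=0.5, num_iters=200):
    """Entropy-regularised optimal transport plan via Sinkhorn iterations.

    cost[i][j]: cost of matching source item i to target item j.
    row_mass / col_mass: marginals, each summing to the same total.
    """
    n, m = len(cost), len(cost[0])
    kernel = [[math.exp(-cost[i][j] / epsilon) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(num_iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [row_mass[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [col_mass[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]

if __name__ == "__main__":
    # Two hypothetical image regions vs. two hypothetical sentences.
    cost = [[0.0, 1.0], [1.0, 0.0]]
    for row in sinkhorn(cost, [0.5, 0.5], [0.5, 0.5]):
        print(row)
```

The resulting plan places most mass on low-cost pairs, so it can be read as a soft assignment between the two sets of features.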

2604.27557 2026-05-01 cs.RO

Function-based Parametric Co-Design Optimization of Dexterous Hands

Mohammad Amin Mirzaee, Harsh Gupta, Wenzhen Yuan

Comments 8 pages, 7 figures, https://www.aminmirzaee.com/HandCDO/

2604.27555 2026-05-01 cs.AI

SpatialGrammar: A Domain-Specific Language for LLM-Based 3D Indoor Scene Generation

Song Tang, Kaiyong Zhao, Yuliang Li, Qingsong Yan, Penglei Sun, Junyi Zou, Qiang Wang, Xiaowen Chu

2604.27553 2026-05-01 cs.CV

Revealing the Impact of Visual Text Style on Attribute-based Descriptions Produced by Large Visual Language Models

Xiaomeng Wang, Martha Larson, Zhengyu Zhao

Comments Accepted by ICMR 2026. Code is available at https://github.com/XiaomengWang-AI/The-Impact-of-Visual-Text-style-on-Attribute-based-Descriptions-Produced-by-LVLMs

2604.27552 2026-05-01 cs.CV

Residual Gaussian Splatting for Ultra Sparse-View CBCT Reconstruction

Jian Lin, Jiancheng Fang, Shaoyu Wang, Changan Lai, Yikun Zhang, Yang Chen, Qiegen Liu

2604.27551 2026-05-01 cs.LG cs.AI cs.CL

Beyond the Training Distribution: Mapping Generalization Boundaries in Neural Program Synthesis

Henrik Voigt, Michael Habeck, Joachim Giesen

2604.27550 2026-05-01 cs.CL cs.AI

APPSI-139: A Parallel Corpus of English Application Privacy Policy Summarization and Interpretation

Pengyun Zhu, Qiheng Sun, Long Wen, Yanbo Wang, Yang Cao, Junxu Liu, Deyi Xiong, Jinfei Liu, Zhibo Wang, Kui Ren

Comments Accepted to ACL 2026 Main Conference

2604.27547 2026-05-01 cs.LG

Diagnosing Capability Gaps in Fine-Tuning Data

Saeid Asgari Taghanaki, Rakshanda Agarwal, Bruce Sun, Rohan Jha, Elias Stengel-Eskin, Sara Malvar, Rui Ying, Yifei Xu, Guilherme Potje, Tusher Chakraborty, Leonardo de Oliveira Nunes, Ranveer Chandra, Emre Kiciman

2604.27543 2026-05-01 cs.CL

AppTek Call-Center Dialogues: A Multi-Accent Long-Form Benchmark for English ASR

Eugen Beck, Sarah Beranek, Uma Moothiringote, Daniel Mann, Wilfried Michel, Katie Nguyen, Taylor Tragemann

Comments Submitted to INTERSPEECH 2026

2604.27542 2026-05-01 cs.CL

HATS: An Open data set Integrating Human Perception Applied to the Evaluation of Automatic Speech Recognition Metrics

Thibault Bañeras Roux, Jane Wottawa, Mickael Rouvier, Teva Merlin, Richard Dufour

Comments 164--175

2604.27540 2026-05-01 cs.AI

In-Context Examples Suppress Scientific Knowledge Recall in LLMs

Chaemin Jang, Woojin Park, Hyeok Yun, Dongman Lee, Jihee Kim

2604.27538 2026-05-01 cs.CV

Self-Supervised Learning of Plant Image Representations

Ilyass Moummad, Kawtar Zaher, Hervé Goëau, Jean-Christophe Lombardo, Pierre Bonnet, Alexis Joly

2604.27536 2026-05-01 cs.AI

Belief-Guided Inference Control for Large Language Model Services via Verifiable Observations

Wenhao Yuan, Chenchen Lin, Jian Chen, Jinfeng Xu, Shuo Yang, Edith Cheuk Han Ngai

Comments Accepted by KnowFM@ACL2026

2604.27534 2026-05-01 cs.CL

Entropy of Ukrainian

Anton Lavreniuk, Mykyta Mudryi, Markiian Chaklosh

Comments 8 pages, 5 figures, 2 tables. Accepted at UNLP 2026

2604.27533 2026-05-01 cs.CL

Qualitative Evaluation of Language Model Rescoring in Automatic Speech Recognition

Thibault Bañeras-Roux, Mickaël Rouvier, Jane Wottawa, Richard Dufour

Comments 3968--3972

2604.27529 2026-05-01 cs.CV

Adjoint Inversion Reveals Holographic Superposition and Destructive Interference in CNN Classifiers

Kaixiang Shu

2604.27510 2026-05-01 cs.LG cs.CV

FMCL: Class-Aware Client Clustering with Foundation Model Representations for Heterogeneous Federated Learning

Mahad Ali, Laura J. Brattain

Comments 14 pages, 2 figures

2604.27508 2026-05-01 cs.RO

SASI: Leveraging Sub-Action Semantics for Robust Early Action Recognition in Human-Robot Interaction

Yongpeng Cao, Masahiro Hirano, Hyuno Kim, Yuji Yamakawa

2604.27504 2026-05-01 cs.CV

REVIVE 3D: Refinement via Encoded Voluminous Inflated prior for Volume Enhancement

Hankyeol Lee, Wooyeol Baek, Seongdo Kim, Jongyoo Kim

Comments Accepted by CVPR 2026