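arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.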

2603.09111 2026-03-11 cs.CV

Progressive Representation Learning for Multimodal Sentiment Analysis with Incomplete Modalities

Jindi Bao, Jianjun Qian, Mengkai Yan, Jian Yang

Abstract

Multimodal Sentiment Analysis (MSA) seeks to infer human emotions by integrating textual, acoustic, and visual cues. However, existing approaches often assume that all modalities are complete, whereas real-world applications frequently encounter noise, hardware failures, or privacy restrictions that result in missing modalities. There exists a significant feature misalignment between incomplete and complete modalities, and directly fusing them may even distort the well-learned representations of the intact modalities. To this end, we propose PRLF, a Progressive Representation Learning Framework designed for MSA under uncertain missing-modality conditions. PRLF introduces an Adaptive Modality Reliability Estimator (AMRE), which dynamically quantifies the reliability of each modality using recognition confidence and Fisher information to determine the dominant modality. In addition, the Progressive Interaction (ProgInteract) module iteratively aligns the other modalities with the dominant one, thereby enhancing cross-modal consistency while suppressing noise. Extensive experiments on CMU-MOSI, CMU-MOSEI, and SIMS verify that PRLF outperforms state-of-the-art methods across both inter- and intra-modality missing scenarios, demonstrating its robustness and generalization capability.

2603.08707 2026-03-11 cs.LG

Impermanent: A Live Benchmark for Temporal Generalization in Time Series Forecasting

Azul Garza, Renée Rosillo, Rodrigo Mendoza-Smith, David Salinas, Andrew Robert Williams, Arjun Ashok, Mononito Goswami, José Martín Juárez

2603.08590 2026-03-11 cs.CV

PRISM: Streaming Human Motion Generation with Per-Joint Latent Decomposition

Zeyu Ling, Qing Shuai, Teng Zhang, Shiyang Li, Bo Han, Changqing Zou

2603.08574 2026-03-11 cs.SD

Scalable Neural Vocoder from Range-Null Space Decomposition

Andong Li, Tong Lei, Zhihang Sun, Rilin Chen, Xiaodong Li, Dong Yu, Chengshi Zheng

Comments 30 pages, 30 figures, 21 tables, Extension journal

2603.08483 2026-03-11 cs.CV cs.AI cs.LG

X-AVDT: Audio-Visual Cross-Attention for Robust Deepfake Detection

Youngseo Kim, Kwan Yun, Seokhyeon Hong, Sihun Cha, Colette Suhjung Koo, Junyong Noh

2603.08465 2026-03-11 cs.LG

MUSA-PINN: Multi-scale Weak-form Physics-Informed Neural Networks for Fluid Flow in Complex Geometries

Weizheng Zhang, Xunjie Xie, Hao Pan, Xiaowei Duan, Bingteng Sun, Qiang Du, Lin Lu

2603.08390 2026-03-11 cs.RO cs.CV

StructBiHOI: Structured Articulation Modeling for Long-Horizon Bimanual Hand-Object Interaction Generation

Zhi Wang, Liu Liu, Ruonan Liu, Dan Guo, Meng Wang

2603.08252 2026-03-11 cs.LG

FedPrism: Adaptive Personalized Federated Learning under Non-IID Data

Prakash Kumbhakar, Shrey Srivastava, Haroon R Lone

2603.07893 2026-03-11 cs.LG cs.AI econ.GN physics.ao-ph q-fin.EC

Designing probabilistic AI monsoon forecasts to inform agricultural decision-making

Colin Aitken, Rajat Masiwal, Adam Marchakitus, Katherine Kowal, Mayank Gupta, Tyler Yang, Amir Jina, Pedram Hassanzadeh, William R. Boos, Michael Kremer

2603.07528 2026-03-11 cs.CL

TableMind++: An Uncertainty-Aware Programmatic Agent for Tool-Augmented Table Reasoning

Mingyue Cheng, Shuo Yu, Chuang Jiang, Xiaoyu Tao, Qingyang Mao, Jie Ouyang, Qi Liu, Enhong Chen

Comments 6 tables, 9 figures. arXiv admin note: text overlap with arXiv:2509.06278

2603.07422 2026-03-11 cs.AI

Dynamic Vehicle Routing Problem with Prompt Confirmation of Advance Requests

Amutheezan Sivagnanam, Ayan Mukhopadhyay, Samitha Samaranayake, Abhishek Dubey, Aron Laszka

2603.07357 2026-03-11 cs.LG cs.AI

Latent Generative Models with Tunable Complexity for Compressed Sensing and other Inverse Problems

Sean Gunn, Jorio Cocola, Oliver De Candido, Vaggos Chatziafratis, Paul Hand

2603.07170 2026-03-11 cs.CV

Class Visualizations and Activation Atlases for Enhancing Interpretability in Deep Learning-Based Computational Pathology

Marco Gustav, Fabian Wolf, Christina Glasner, Nic G. Reitsam, Stefan Schulz, Kira Aschenbroich, Bruno Märkl, Sebastian Foersch, Jakob Nikolas Kather

Abstract

The rapid adoption of transformer-based models in computational pathology has enabled prediction of molecular and clinical biomarkers from H&E whole-slide images, yet interpretability has not kept pace with model complexity. While attribution- and generative-based methods are common, feature visualization approaches such as class visualizations (CVs) and activation atlases (AAs) have not been systematically evaluated for these models. We developed a visualization framework and assessed CVs and AAs for a transformer-based foundation model across tissue and multi-organ cancer classification tasks with increasing label granularity. Four pathologists annotated real and generated images to quantify inter-observer agreement, complemented by attribution and similarity metrics. CVs preserved recognizability for morphologically distinct tissues but showed reduced separability for overlapping cancer subclasses. In tissue classification, agreement decreased from Fleiss' κ = 0.75 (scans) to κ = 0.31 (CVs), with similar trends in cancer subclass tasks. AAs revealed layer-dependent organization: coarse tissue-level concepts formed coherent regions, whereas finer subclasses exhibited dispersion and overlap. Agreement was moderate for tissue classification (κ = 0.58), high for coarse cancer groupings (κ = 0.82), and low at subclass level (κ = 0.11). Atlas separability closely tracked expert agreement on real images, indicating that representational ambiguity reflects intrinsic pathological complexity. Attribution-based metrics approximated expert variability in low-complexity settings, whereas perceptual and distributional metrics showed limited alignment. Overall, concept-level feature visualization reveals structured morphological manifolds in transformer-based pathology models and provides a framework for expert-centered interrogation of learned representations across label granularities.
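The abstract above reports inter-observer agreement among four pathologists as Fleiss' kappa. As a reading aid, here is a minimal sketch of the standard Fleiss' kappa formula. It is not the authors' code; the function name `fleiss_kappa` and the toy rating counts are assumptions made for illustration only.

```python
import math


def fleiss_kappa(counts):
    """Standard Fleiss' kappa.

    counts[i][j] is the number of raters who assigned item i to category j.
    Every item must be rated by the same number of raters (at least 2).
    """
    n_items = len(counts)
    n_cats = len(counts[0])
    n_raters = sum(counts[0])
    if n_raters < 2 or any(sum(row) != n_raters for row in counts):
        raise ValueError("each item needs the same number (>= 2) of ratings")

    # Share of all ratings that fell into each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_cats)
    ]
    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]
    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)

    if math.isclose(p_expected, 1.0):
        # All ratings are in one category, so kappa is undefined.
        return float("nan")
    return (p_observed - p_expected) / (1 - p_expected)


# Toy data (made up): 3 images, 4 raters, 2 categories.
toy_counts = [[4, 0], [0, 4], [3, 1]]
kappa = fleiss_kappa(toy_counts)
```

A value near 1 means the raters agree far more often than chance would predict, a value near 0 means agreement is at chance level, and negative values mean agreement is below chance.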

2603.07071 2026-03-11 cs.CV

VirtueBench: Evaluating Trustworthiness under Uncertainty in Long Video Understanding

Xueqing Yu, Bohan Li, Yan Li, Zhenheng Yang

Comments Accepted to CVPR 2026

2603.06758 2026-03-11 cs.LG cs.AI

Enhancing SHAP Explainability for Diagnostic and Prognostic ML Models in Alzheimer Disease

Pablo Guillén, Enrique Frias-Martinez

DOI: 10.32604/cmc.2026.076400
Journal ref: CMC 1546-2226 (2026)

Abstract

Alzheimer disease (AD) diagnosis and prognosis increasingly rely on machine learning (ML) models. Although these models provide good results, clinical adoption is limited by the need for technical expertise and the lack of trustworthy and consistent model explanations. SHAP (SHapley Additive exPlanations) is commonly used to interpret AD models, but existing studies tend to focus on explanations for isolated tasks, providing little evidence about their robustness across disease stages, model architectures, or prediction objectives. This paper proposes a multi-level explainability framework that measures the coherence, stability and consistency of explanations by integrating: (1) within-model coherence metrics between feature importance and SHAP, (2) SHAP stability across AD boundaries, and (3) SHAP cross-task consistency between diagnosis and prognosis. Using AutoML to optimize classifiers on the NACC dataset, we trained four diagnostic and four prognostic models covering the standard AD progression stages. Stability was then evaluated using correlation metrics, top-k feature overlap, SHAP sign consistency, and domain-level contribution ratios. Results show that cognitive and functional markers dominate SHAP explanations in both diagnosis and prognosis. SHAP-SHAP consistency between diagnostic and prognostic models was high across all classifiers, with 100% sign stability and minimal shifts in explanatory magnitude. Domain-level contributions also remained stable, with only minimal increases in genetic features for prognosis. These results demonstrate that SHAP explanations can be quantitatively validated for robustness and transferability, providing clinicians with more reliable interpretations of ML predictions.
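The abstract above names top-k feature overlap and SHAP sign consistency among its stability measures but does not define them. The sketch below shows one plausible reading of each, assuming each model is summarized by a mapping from feature name to mean signed SHAP value. The function names, the feature names, and the numbers are all hypothetical, and the paper's exact definitions may differ.

```python
def top_k_overlap(attr_a, attr_b, k):
    """Fraction of features shared by the two top-k lists.

    Features are ranked by absolute attribution; attr_a and attr_b map
    feature name -> mean signed SHAP value (an assumed summary format).
    """
    if k <= 0 or k > min(len(attr_a), len(attr_b)):
        raise ValueError("k must be between 1 and the number of features")
    top_a = set(sorted(attr_a, key=lambda f: abs(attr_a[f]), reverse=True)[:k])
    top_b = set(sorted(attr_b, key=lambda f: abs(attr_b[f]), reverse=True)[:k])
    return len(top_a & top_b) / k


def sign_consistency(attr_a, attr_b):
    """Fraction of shared, nonzero features whose attribution sign matches."""
    shared = [
        f for f in attr_a
        if f in attr_b and attr_a[f] != 0 and attr_b[f] != 0
    ]
    if not shared:
        return float("nan")  # nothing to compare
    same = sum(1 for f in shared if (attr_a[f] > 0) == (attr_b[f] > 0))
    return same / len(shared)


# Made-up attributions for a diagnostic and a prognostic model.
diagnostic = {"memory": 0.50, "function": -0.30, "age": 0.10, "gene": -0.05}
prognostic = {"memory": 0.40, "function": -0.35, "age": -0.02, "gene": -0.20}

overlap = top_k_overlap(diagnostic, prognostic, k=2)
consistency = sign_consistency(diagnostic, prognostic)
```

In this toy case both models rank the same two features highest, while one of the four features flips sign between tasks. The "100% sign stability" reported in the abstract would correspond to a sign consistency of 1.0 under this reading.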

2603.06748 2026-03-11 cs.LG cs.AI

Property-driven Protein Inverse Folding With Multi-Objective Preference Alignment

Xiaoyang Hou, Junqi Liu, Chence Shi, Xin Liu, Zhi Yang, Jian Tang

2603.06698 2026-03-11 cs.CV

Breaking the Geometric Bottleneck: Contrastive Expansion in Asymmetric Cross-Modal Distillation

Kabir Thayani

Comments Introduced auxiliary InfoNCE objective to reverse dimensional collapse. Expanded experiments to DINOv2 teacher and CIFAR-100 dataset. 3 pages, 3 figures, 2 tables

2603.06656 2026-03-11 cs.CV cs.AI

GameVerse: Can Vision-Language Models Learn from Video-based Reflection?

Kuan Zhang, Dongchen Liu, Qiyue Zhao, Jinkun Hou, Xinran Zhang, Qinlei Xie, Miao Liu, Yiming Li

Comments https://gameverse-bench.github.io/

2603.06634 2026-03-11 cs.LG hep-th math-ph math.MP

A new Uncertainty Principle in Machine Learning

V. Dolotin, A. Morozov

Comments 24 pages

2603.06602 2026-03-11 cs.LG stat.ML

Khatri-Rao Clustering for Data Summarization

Martino Ciaperoni, Collin Leiber, Aristides Gionis, Heikki Mannila

2603.06135 2026-03-11 cs.CL cs.AI

A Causal Graph Approach to Oppositional Narrative Analysis

Diego Revilla, Martin Fernandez-de-Retana, Lingfeng Chen, Aritz Bilbao-Jayo, Miguel Fernandez-de-Retana

2603.05960 2026-03-11 cs.LG

Omni-Masked Gradient Descent: Memory-Efficient Optimization via Mask Traversal with Improved Convergence

Hui Yang, Tao Ren, Jinyang Jiang, Wan Tian, Yijie Peng

2603.05494 2026-03-11 cs.LG cs.AI cs.CL

Censored LLMs as a Natural Testbed for Secret Knowledge Elicitation

Helena Casademunt, Bartosz Cywiński, Khoi Tran, Arya Jakkli, Samuel Marks, Neel Nanda

2603.05228 2026-03-11 cs.LG cs.AI

The Geometric Inductive Bias of Grokking: Bypassing Phase Transitions via Architectural Topology

Alper Yıldırım

Comments 19 pages, 2 figures, 3 tables. Code available at https://github.com/AlperYildirim1/geometric-grokking

2603.04818 2026-03-11 cs.AI

LLM-Grounded Explainable AI for Supply Chain Risk Early Warning via Temporal Graph Attention Networks

Zhiming Xue, Yujue Wang, Menghao Huo

2603.03930 2026-03-11 cs.CV

N-gram Injection into Transformers for Dynamic Language Model Adaptation in Handwritten Text Recognition

Florent Meyer, Laurent Guichard, Yann Soullard, Denis Coquenet, Guillaume Gravier, Bertrand Coüasnon

Comments Fix order of authors

2603.02023 2026-03-11 cs.CL

PonderLM-3: Adaptive Token-Wise Pondering with Differentiable Masking

He Li, Feichen Song, Boyi Zeng, Shixiang Song, Zhiqin John Xu, Ziwei He, Zhouhan Lin

2603.01433 2026-03-11 cs.CV

DOCFORGE-BENCH: A Comprehensive 0-shot Benchmark for Document Forgery Detection and Analysis

Zengqi Zhao, Weidi Xia, En Wei, Yan Zhang, Jane Mo, Tiannan Zhang, Yuanqin Dai, Zexi Chen, Yiran Tao, Simiao Ren

2603.01367 2026-03-11 cs.LG

DUEL: Exact Likelihood for Masked Diffusion via Deterministic Unmasking

Gilad Turok, Chris De Sa, Volodymyr Kuleshov

Comments 22 pages, 5 figures, 8 tables

2603.00718 2026-03-11 cs.CL cs.SE

SkillCraft: Can LLM Agents Learn to Use Tools Skillfully?

Shiqi Chen, Jingze Gai, Ruochen Zhou, Jinghan Zhang, Tongyao Zhu, Junlong Li, Kangrui Wang, Zihan Wang, Zhengyu Chen, Klara Kaleb, Ning Miao, Siyang Gao, Cong Lu, Manling Li, Junxian He, Yee Whye Teh

Comments 21 pages. Code: https://github.com/shiqichen17/SkillCraft ; Project page: https://skillcraft-website.github.io/page