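arXivDaily: a daily academic digest that syncs the full arXiv dataset and provides AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.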

2603.26816 2026-03-31 cs.LG cs.AI

PiCSRL: Physics-Informed Contextual Spectral Reinforcement Learning

Mitra Nasr Azadani, Syed Usama Imtiaz, Nasrin Alamdari

Comments Accepted to IGARSS 2026

Abstract

High-dimensional low-sample-size (HDLSS) datasets constrain the development of reliable environmental models, a setting in which labeled data remain sparse. Reinforcement learning (RL)-based adaptive sensing methods can learn optimal sampling policies, yet their application is severely limited in HDLSS contexts. In this work, we present PiCSRL (Physics-Informed Contextual Spectral Reinforcement Learning), where embeddings are designed using domain knowledge and incorporated directly into the RL state representation for improved adaptive sensing. We developed an uncertainty-aware belief model that encodes physics-informed features to improve prediction. As a representative example, we evaluated our approach on a cyanobacterial gene concentration adaptive sampling task using NASA PACE hyperspectral imagery over Lake Erie. PiCSRL achieves optimal station selection (RMSE = 0.153, 98.4% bloom detection rate), outperforming the random (RMSE = 0.296) and UCB (RMSE = 0.178) baselines. Our ablation experiments demonstrate that physics-informed features improve test generalization (R^2 = 0.52, +0.11 over raw bands) in semi-supervised learning. In addition, our scalability test shows that PiCSRL scales effectively to large networks (50 stations, >2M combinations) with significant improvements over baselines (p = 0.002). We posit PiCSRL as a sample-efficient adaptive sensing method across Earth observation domains for improved observation-to-target mapping.
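
The abstract reports a network of 50 stations with more than 2M combinations but does not state how many stations are selected at once. As a rough sanity check only, the sketch below counts k-station subsets of 50 and finds the smallest k whose count exceeds 2,000,000. The subset-selection reading and the helper names `subset_count` and `smallest_k_exceeding` are assumptions for illustration, not details taken from the paper.

```python
import math
from typing import Optional

N_STATIONS = 50          # station count quoted in the abstract
THRESHOLD = 2_000_000    # ">2M combinations" quoted in the abstract


def subset_count(n: int, k: int) -> int:
    """Number of ways to choose k stations out of n, ignoring order."""
    return math.comb(n, k)


def smallest_k_exceeding(n: int, threshold: int) -> Optional[int]:
    """Smallest subset size k with C(n, k) > threshold, or None if no k qualifies."""
    for k in range(n + 1):
        if math.comb(n, k) > threshold:
            return k
    return None


if __name__ == "__main__":
    k = smallest_k_exceeding(N_STATIONS, THRESHOLD)
    # Assumed reading: the search space is all k-station subsets of the network.
    print(k, subset_count(N_STATIONS, k))  # prints "5 2118760"
```

Under this assumed reading, choosing 5 of 50 stations gives 2,118,760 candidate subsets, which is consistent with the ">2M" figure; other subset sizes or a sequential formulation would give different counts.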

2603.26814 2026-03-31 cs.CV cs.RO

arg-VU: Affordance Reasoning with Physics-Aware 3D Geometry for Visual Understanding in Robotic Surgery

Nan Xiao, Yunxin Fan, Farong Wang, Fei Liu

2603.26811 2026-03-31 cs.CV cs.AI cs.LG q-bio.NC

Implicit neural representations for larval zebrafish brain microscopy: a reproducible benchmark on the MapZebrain atlas

Agnieszka Pregowska

2603.26810 2026-03-31 cs.CV eess.IV

Unblur-SLAM: Dense Neural SLAM for Blurry Inputs

Qi Zhang, Denis Rozumny, Francesco Girlanda, Sezer Karaoglu, Marc Pollefeys, Theo Gevers, Martin R. Oswald

Comments 14 pages, 9 figures. Accepted by CVPR 2026

2603.26804 2026-03-31 cs.CV cs.AI

The Language of Touch: Translating Vibrations into Text with Dual-Branch Learning

Jin Chen, Yifeng Lin, Chao Zeng, Si Wu, Tiesong Zhao

Comments 9 pages, 6 figures

2603.26803 2026-03-31 cs.LG stat.ML

A Comparative Investigation of Thermodynamic Structure-Informed Neural Networks

Guojie Li, Liu Hong

Comments 30 pages, 9 figures, 2 tables

2603.26802 2026-03-31 cs.CV cs.RO eess.IV

Deep Learning Aided Vision System for Planetary Rovers

Lomash Relia, Jai G Singla, Amitabh, Nitant Dube

2603.26801 2026-03-31 cs.LG cs.AI

Sparse-by-Design Cross-Modality Prediction: L0-Gated Representations for Reliable and Efficient Learning

Filippo Cenacchi

Abstract

Predictive systems increasingly span heterogeneous modalities such as graphs, language, and tabular records, but sparsity and efficiency remain modality-specific (graph edge or neighborhood sparsification, Transformer head or layer pruning, and separate tabular feature-selection pipelines). This fragmentation makes results hard to compare, complicates deployment, and weakens reliability analysis across end-to-end KDD pipelines. A unified sparsification primitive would make accuracy-efficiency trade-offs comparable across modalities and enable controlled reliability analysis under representation compression. We ask whether a single representation-level mechanism can yield comparable accuracy-efficiency trade-offs across modalities while preserving or improving probability calibration. We propose L0-Gated Cross-Modality Learning (L0GM), a modality-agnostic, feature-wise hard-concrete gating framework that enforces L0-style sparsity directly on learned representations. L0GM attaches hard-concrete stochastic gates to each modality's classifier-facing interface: node embeddings (GNNs), pooled sequence embeddings such as CLS (Transformers), and learned tabular embedding vectors (tabular models). This yields end-to-end trainable sparsification with an explicit control knob for the active feature fraction. To stabilize optimization and make trade-offs interpretable, we introduce an L0-annealing schedule that induces clear accuracy-sparsity Pareto frontiers. Across three public benchmarks (ogbn-products, Adult, IMDB), L0GM achieves competitive predictive performance while activating fewer representation dimensions, and it reduces Expected Calibration Error (ECE) in our evaluation. Overall, L0GM establishes a modality-agnostic, reproducible sparsification primitive that supports comparable accuracy, efficiency, and calibration trade-off analysis across heterogeneous modalities.
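
The abstract does not give L0GM's implementation, so the following is only a minimal sketch of a feature-wise hard-concrete gate in its commonly used formulation: a stretched, clipped sigmoid of logistic noise. The constants `GAMMA`, `ZETA`, and `BETA` are widely used defaults assumed here, and the function names (`sample_gate`, `deterministic_gate`, `prob_active`, `gate_features`) are hypothetical. It shows how a per-feature parameter `log_alpha` yields a gate in [0, 1] and a differentiable estimate of the active feature fraction.

```python
import math
import random

# Stretch interval (GAMMA, ZETA) and temperature BETA: assumed common defaults.
GAMMA, ZETA, BETA = -0.1, 1.1, 2.0 / 3.0


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def _stretch_and_clip(s: float) -> float:
    """Stretch s from (0, 1) to (GAMMA, ZETA), then clip to [0, 1]."""
    return min(1.0, max(0.0, s * (ZETA - GAMMA) + GAMMA))


def sample_gate(log_alpha: float, rng: random.Random) -> float:
    """Stochastic training-time gate for one feature."""
    u = min(max(rng.random(), 1e-6), 1.0 - 1e-6)  # keep log() finite
    s = sigmoid((math.log(u) - math.log(1.0 - u) + log_alpha) / BETA)
    return _stretch_and_clip(s)


def deterministic_gate(log_alpha: float) -> float:
    """Noise-free gate typically used at evaluation time."""
    return _stretch_and_clip(sigmoid(log_alpha))


def prob_active(log_alpha: float) -> float:
    """Probability that the gate is non-zero; summing this gives the expected L0 penalty."""
    return sigmoid(log_alpha - BETA * math.log(-GAMMA / ZETA))


def gate_features(features, log_alphas, rng):
    """Apply one sampled gate per representation dimension."""
    return [x * sample_gate(a, rng) for x, a in zip(features, log_alphas)]


if __name__ == "__main__":
    rng = random.Random(0)
    embedding = [0.5, -1.2, 2.0, 0.3]      # toy classifier-facing embedding
    log_alphas = [4.0, -4.0, 0.0, 2.0]     # toy gate parameters
    print(gate_features(embedding, log_alphas, rng))
    print(sum(prob_active(a) for a in log_alphas) / len(log_alphas))
```

In a trained model, `log_alphas` would be learned jointly with the network and the mean of `prob_active` would be penalized, with the penalty weight acting as the control knob for the active feature fraction; an annealing schedule such as the one the abstract mentions would vary that weight over training.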

2603.26800 2026-03-31 cs.LG cs.AI physics.flu-dyn

DSO: Dual-Scale Neural Operators for Stable Long-term Fluid Dynamics Forecasting

Huanshuo Dong, Hao Wu, Hong Wang, Qin-Yi Zhang, Zhezheng Hao

2603.26799 2026-03-31 cs.LG

Gaussian Joint Embeddings For Self-Supervised Representation Learning

Yongchao Huang

Comments 92 pages

2603.26798 2026-03-31 cs.LG cs.AI

Explaining, Verifying, and Aligning Semantic Hierarchies in Vision-Language Model Embeddings

Gesina Schwalbe, Mert Keser, Moritz Bayerkuhnlein, Edgar Heinert, Annika Mütze, Marvin Keller, Sparsh Tiwari, Georgii Mikriukov, Diedrich Wolter, Jae Hee Lee, Matthias Rottmann

2603.26797 2026-03-31 cs.LG

MemGuard-Alpha: Detecting and Filtering Memorization-Contaminated Signals in LLM-Based Financial Forecasting via Membership Inference and Cross-Model Disagreement

Anisha Roy, Dip Roy

2603.26796 2026-03-31 cs.LG cs.AI stat.ML

Robust Batch-Level Query Routing for Large Language Models under Cost and Capacity Constraints

Jelena Markovic-Voronov, Kayhan Behdin, Yuanda Xu, Zhengze Zhou, Zhipeng Wang, Rahul Mazumder

2603.26794 2026-03-31 cs.CV cs.AI

PhyDCM: A Reproducible Open-Source Framework for AI-Assisted Brain Tumor Classification from Multi-Sequence MRI

Hayder Saad Abdulbaqi, Mohammed Hadi Rahim, Mohammed Hassan Hadi, Haider Ali Aboud, Ali Hussein Allawi

Comments 18 pages, 9 figures, 6 tables

2603.26790 2026-03-31 cs.CV

Elucidating the Design Space of Flow Matching for Cellular Microscopy

Charles Jones, Emmanuel Noutahi, Jason Hartford, Cian Eastwood

2603.26789 2026-03-31 cs.CV

Confidence Matters: Uncertainty Quantification and Precision Assessment of Deep Learning-based CMR Biomarker Estimates Using Scan-rescan Data

Dewmini Hasara Wickremasinghe, Michelle Gibogwe, Andrew Bell, Esther Puyol-Antón, Muhummad Sohaib Nazir, Reza Razavi, Bruno Paun, Paul Aljabar, Andrew P. King

2603.26787 2026-03-31 cs.CV

Brain-Inspired Multimodal Spiking Neural Network for Image-Text Retrieval

Xintao Zong, Xian Zhong, Wenxuan Liu, Jianhao Ding, Zhaofei Yu, Tiejun Huang

2603.26786 2026-03-31 cs.LG cs.AI

A Step Toward Federated Pretraining of Multimodal Large Language Models

Baochen Xiong, Yifan Xu, Xiaoshan Yang, Yaguang Song, Yaowei Wang, Changsheng Xu

2603.26784 2026-03-31 cs.CV

HighlightBench: Benchmarking Markup-Driven Table Reasoning in Scientific Documents

Lexin Wang, Shenghua Liu, Yiwei Wang, Yujun Cai, Yuyao Ge, Jiayu Yao, Jiafeng Guo, Xueqi Cheng

2603.26780 2026-03-31 cs.CV

RatSeizure: A Benchmark and Saliency-Context Transformer for Rat Seizure Localization

Ting Yu Tsai, An Yu, Lucy Lee, Felix X. -F. Ye, Damian S. Shin, Tzu-Jen Kao, Xin Li, Ming-Ching Chang

2603.26778 2026-03-31 cs.LG cs.AI

TED: Training-Free Experience Distillation for Multimodal Reasoning

Shuozhi Yuan, Jinqing Wang, Zihao Liu, Miaomiao Yuan, Haoran Peng, Jin Zhao, Bingwen Wang, Haoyi Wang

Comments 13 pages, 4 figures

2603.26777 2026-03-31 cs.CV astro-ph.IM cs.LG

BHCast: Unlocking Black Hole Plasma Dynamics from a Single Blurry Image with Long-Term Forecasting

Renbo Tu, Ali SaraerToosi, Nicholas S. Conroy, Gennady Pekhimenko, Aviad Levis

Comments CVPR 2026

2603.26776 2026-03-31 cs.CV

From Prediction to Diagnosis: Reasoning-Aware AI for Photovoltaic Defect Inspection

Dev Mistry, Feng Qiu, Bo Chen, Feng Liu, Can Chen, Mohammad Shahidehpour, Ren Wang

Comments 34 pages, 5 figures

2603.26775 2026-03-31 cs.LG cs.AI cs.CL cs.CV

Learning to Select Visual In-Context Demonstrations

Eugene Lee, Yu-Chi Lin, Jiajie Diao

Comments 21 pages, 12 figures, accepted to Computer Vision and Pattern Recognition Conference (CVPR) 2026 Findings Track

2603.26773 2026-03-31 cs.RO cs.LG

Robot Arm Control via Cognitive Map Learners

Nathan McDonald, Colyn Seeley, Christian Brazeau

2603.26772 2026-03-31 cs.CV cs.AI cs.CY

From Content to Audience: A Multimodal Annotation Framework for Broadcast Television Analytics

Paolo Cupini, Francesco Pierri

2603.26770 2026-03-31 cs.CV

Quantized Vision-Language Models for Damage Assessment: A Comparative Study of LLaVA-1.5-7B Quantization Levels

Takato Yasuno

Comments 16 pages, 4 figures, 8 tables

2603.26769 2026-03-31 cs.CV cs.AI

Edge Reliability Gap in Vision-Language Models: Quantifying Failure Modes of Compressed VLMs Under Visual Corruption

Mehmet Kaan Erol

Comments 16 pages, 5 figures

2603.26768 2026-03-31 cs.CV cs.AI cs.CL

Aesthetic Assessment of Chinese Handwritings Based on Vision Language Models

Chen Zheng, Yuxuan Lai, Haoyang Lu, Wentao Ma, Jitao Yang, Jian Wang

Comments Accepted by CCL 2025

2603.26767 2026-03-31 cs.CV

A training-free framework for high-fidelity appearance transfer via diffusion transformers

Shengrong Gu, Ye Wang, Song Wu, Rui Ma, Qian Wang, Lanjun Wang, Zili Yi