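arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.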

2602.21550 2026-03-13 cs.LG q-bio.GN

Extending Sequence Length is Not All You Need: Effective Integration of Multimodal Signals for Gene Expression Prediction

Zhao Yang, Yi Duan, Jiwei Zhu, Ying Ba, Chuan Cao, Bing Su

Comments Accepted at ICLR 2026

English abstract

Gene expression prediction, which predicts mRNA expression levels from DNA sequences, presents significant challenges. Previous works often focus on extending input sequence length to locate distal enhancers, which may influence target genes from hundreds of kilobases away. Our work first reveals that for current models, long sequence modeling can decrease performance. Even carefully designed algorithms only mitigate the performance degradation caused by long sequences. Instead, we find that proximal multimodal epigenomic signals near target genes prove more essential. Hence we focus on how to better integrate these signals, which has been overlooked. We find that different signal types serve distinct biological roles, with some directly marking active regulatory elements while others reflect background chromatin patterns that may introduce confounding effects. Simple concatenation may lead models to develop spurious associations with these background patterns. To address this challenge, we propose Prism, a framework that learns multiple combinations of high-dimensional epigenomic features to represent distinct background chromatin states and uses backdoor adjustment to mitigate confounding effects. Our experimental results demonstrate that proper modeling of multimodal epigenomic signals achieves state-of-the-art performance using only short sequences for gene expression prediction.

2602.21421 2026-03-13 cs.CV cs.AI cs.LG

ECHOSAT: Estimating Canopy Height Over Space And Time

Jan Pauls, Karsten Schrödter, Sven Ligensa, Martin Schwartz, Berkant Turan, Max Zimmer, Sassan Saatchi, Sebastian Pokutta, Philippe Ciais, Fabian Gieseke

Comments 19 pages, 12 figures, 6 tables

2602.20197 2026-03-13 cs.LG cs.AI

Controllable Exploration in Hybrid-Policy RLVR for Multi-Modal Reasoning

Zhuoxu Huang, Mengxi Jia, Hao Sun, Xuelong Li, Jungong Han

Comments Published as a conference paper at ICLR 2026

2602.19912 2026-03-13 cs.LG

De novo molecular structure elucidation from mass spectra via flow matching

Ghaith Mqawass, Tuan Le, Fabian Theis, Djork-Arné Clevert

Comments This preprint has been withdrawn by the authors after identifying a potential data leakage issue. Further analysis is underway

2602.19281 2026-03-13 cs.AI

Limited Reasoning Space: The cage of long-horizon reasoning in LLMs

Zhenyu Li, Guanlin Wu, Cheems Wang, Yongqiang Zhao

2602.18990 2026-03-13 cs.CV

IDSelect: A RL-Based Cost-Aware Selection Agent for Video-based Multi-Modal Person Recognition

Yuyang Ji, Yixuan Shen, Kien Nguyen, Lifeng Zhou, Feng Liu

2602.18324 2026-03-13 cs.CL

PsihoRo: Depression and Anxiety Romanian Text Corpus

Alexandra Ciobotaru, Ana-Maria Bucur, Liviu P. Dinu

Comments This article was accepted at LREC 2026

2602.15112 2026-03-13 cs.AI

ResearchGym: Evaluating Language Model Agents on Real-World AI Research

Aniketh Garikaparthi, Manasi Patwardhan, Arman Cohan

Comments ICLR 2026 Agents in the Wild Workshop

2602.13093 2026-03-13 cs.AI cs.CL

Consistency of Large Reasoning Models Under Multi-Turn Attacks

Yubo Li, Ramayya Krishnan, Rema Padman

2602.08958 2026-03-13 cs.CV

Grow with the Flow: 4D Reconstruction of Growing Plants with Gaussian Flow Fields

Weihan Luo, Lily Goli, Sherwin Bahmani, Felix Taubner, Andrea Tagliasacchi, David B. Lindell

Comments Project page: https://weihanluo.ca/growflow/

2602.08430 2026-03-13 cs.CV

Understanding and Optimizing Attention-Based Sparse Matching for Diverse Local Features

Qiang Wang

Comments v2: add results with RaCo, RDD, DaD and Air-to-Ground benchmark

2602.06965 2026-03-13 cs.CV

MedMO: Grounding and Understanding Multimodal Large Language Model for Medical Images

Ankan Deria, Komal Kumar, Adinath Madhavrao Dukre, Eran Segal, Salman Khan, Imran Razzak

Comments 21 pages, 6 figures and 4 tables

2602.04329 2026-03-13 cs.RO

Safe and Stylized Trajectory Planning for Autonomous Driving via Diffusion Model

Shuo Pei, Yong Wang, Yuanchen Zhu, Chen Sun, Qin Li, Yanan Zhao, Huachun Tan

Comments 12 pages, 7 figures, submitted to IEEE Transactions on Intelligent Transportation Systems

2602.01716 2026-03-13 cs.CL

Mechanistic Indicators of Steering Effectiveness in Large Language Models

Mehdi Jafari, Hao Xue, Flora Salim

2602.00813 2026-03-13 cs.CV

Generating a Paracosm for Training-Free Zero-Shot Composed Image Retrieval

Tong Wang, Yunhan Zhao, Shu Kong

2601.22511 2026-03-13 cs.CL

Mock Worlds, Real Skills: Building Small Agentic Language Models with Synthetic Tasks, Simulated Environments, and Rubric-Based Rewards

Yuanjie Lyu, Chengyu Wang, Lei Shen, Jun Huang, Tong Xu

Comments The first author prefers the more commonly used English name "Yuanjie Lyu" over "Yuan-Jay Lü", so we have updated it; both refer to the same person

2601.20900 2026-03-13 cs.SD cs.CL cs.LG eess.AS

Text-only adaptation in LLM-based ASR through text denoising

Andrés Carofilis, Sergio Burdisso, Esaú Villatoro-Tello, Shashi Kumar, Kadri Hacioglu, Srikanth Madikeri, Pradeep Rangappa, Manjunath K E, Petr Motlicek, Shankar Venkatesan, Andreas Stolcke

2601.19410 2026-03-13 cs.CL cs.HC

Do LLMs Truly Benefit from Longer Context in Automatic Post-Editing?

Ahrii Kim, Seong-heum Kim

2601.16667 2026-03-13 cs.RO cs.CV

ReViP: Mitigating False Completion in Vision-Language-Action Models with Vision-Proprioception Rebalance

Zhuohao Li, Yinghao Li, Jian-Jian Jiang, Lang Zhou, Tianyu Zhang, Jiadong Yin, Mu Lin, Yi-Lin Wei, Wei-Shi Zheng

2601.15479 2026-03-13 cs.CL cs.AI

Benchmarking LLMs for Pairwise Causal Discovery in Biomedical and Multi-Domain Contexts

Sydney Anuyah, Sneha Shajee-Mohan, Ankit-Singh Chauhan, Sunandan Chakraborty

2601.13435 2026-03-13 cs.LG cs.AI q-fin.CP

A Learnable Wavelet Transformer for Long-Short Equity Trading and Risk-Adjusted Return Optimization

Shuozhe Li, Du Cheng, Leqi Liu

2601.08246 2026-03-13 cs.RO

FSAG: Enhancing Human-to-Dexterous-Hand Finger-Specific Affordance Grounding via Diffusion Models

Yifan Han, Yichuan Peng, Pengfei Yi, Junyan Li, Hanqing Wang, Gaojing Zhang, Qi Peng Liu, Wenzhao Lian

2601.07796 2026-03-13 cs.CL cs.HC

Learning Through Dialogue: Engagement and Efficacy Matter More Than Explanations

Shaz Furniturewala, Gerard Christopher Yeo, Kokil Jaidka

2601.02447 2026-03-13 cs.CV

Don't Mind the Gaps: Implicit Neural Representations for Resolution-Agnostic Retinal OCT Analysis

Bennet Kahrs, Julia Andresen, Fenja Falta, Monty Santarossa, Heinz Handels, Timo Kepp

Comments Accepted for publication at the Journal of Machine Learning for Biomedical Imaging (MELBA) https://melba-journal.org/2026:004

Details

DOI: 10.59275/j.melba.2026-38ba
Journal ref: Machine Learning for Biomedical Imaging 2026 (2026)

English abstract

Routine clinical imaging of the retina using optical coherence tomography (OCT) is performed with large slice spacing, resulting in highly anisotropic images and a sparsely scanned retina. Most learning-based methods circumvent the problems arising from the anisotropy by using 2D approaches rather than performing volumetric analyses. These approaches inherently bear the risk of generating inconsistent results for neighboring B-scans. For example, 2D retinal layer segmentations can have irregular surfaces in 3D. Furthermore, the typically used convolutional neural networks are bound to the resolution of the training data, which prevents their usage for images acquired with a different imaging protocol. Implicit neural representations (INRs) have recently emerged as a tool to store voxelized data as a continuous representation. Using coordinates as input, INRs are resolution-agnostic, which allows them to be applied to anisotropic data. In this paper, we propose two frameworks that make use of this characteristic of INRs for dense 3D analyses of retinal OCT volumes. 1) We perform inter-B-scan interpolation by incorporating additional information from en-face modalities, that help retain relevant structures between B-scans. 2) We create a resolution-agnostic retinal atlas that enables general analysis without strict requirements for the data. Both methods leverage generalizable INRs, improving retinal shape representation through population-based training and allowing predictions for unseen cases. Our resolution-independent frameworks facilitate the analysis of OCT images with large B-scan distances, opening up possibilities for the volumetric evaluation of retinal structures and pathologies.

2601.00411 2026-03-13 cs.CL cs.AI

Do LLMs Judge Distantly Supervised Named Entity Labels Well? Constructing the JudgeWEL Dataset

Alistair Plum, Laura Bernardy, Tharindu Ranasinghe

Comments Accepted at LREC 2026

2512.24873 2026-03-13 cs.AI cs.CL

Let It Flow: Agentic Crafting on Rock and Roll, Building the ROME Model within an Open Agentic Learning Ecosystem

Weixun Wang, XiaoXiao Xu, Wanhe An, Fangwen Dai, Wei Gao, Yancheng He, Ju Huang, Qiang Ji, Hanqi Jin, Xiaoyang Li, Yang Li, Zhongwen Li, Shirong Lin, Jiashun Liu, Zenan Liu, Tao Luo, Dilxat Muhtar, Yuanbin Qu, Jiaqiang Shi, Qinghui Sun, Yingshui Tan, Hao Tang, Runze Wang, Yi Wang, Zhaoguo Wang, Yanan Wu, Shaopan Xiong, Binchen Xu, Xander Xu, Yuchi Xu, Qipeng Zhang, Xixia Zhang, Haizhou Zhao, Jie Zhao, Shuaibing Zhao, Baihui Zheng, Jianhui Zheng, Suhang Zheng, Yanni Zhu, Mengze Cai, Kerui Cao, Xitong Chen, Yue Dai, Lifan Du, Tao Feng, Tao He, Jin Hu, Yijie Hu, Ziyu Jiang, Cheng Li, Xiang Li, Jing Liang, Xin Lin, Chonghuan Liu, ZhenDong Liu, Zhiqiang Lv, Haodong Mi, Yanhu Mo, Junjia Ni, Shixin Pei, Jingyu Shen, XiaoShuai Song, Cecilia Wang, Chaofan Wang, Kangyu Wang, Pei Wang, Tao Wang, Wei Wang, Ke Xiao, Mingyu Xu, Tiange Xu, Nan Ya, Siran Yang, Jianan Ye, Yaxing Zang, Duo Zhang, Junbo Zhang, Boren Zheng, Wanxi Deng, Ling Pan, Lin Qu, Wenbo Su, Jiamang Wang, Wei Wang, Hu Wei, Minggang Wu, Cheng Yu, Bing Zhao, Zhicheng Zheng, Bo Zheng

Comments 36 pages, 15 figures

2512.21692 2026-03-13 cs.CV cs.GR

ShinyNeRF: Digitizing Anisotropic Appearance in Neural Radiance Fields

Albert Barreiro, Roger Marí, Rafael Redondo, Gloria Haro, Carles Bosch

2512.21066 2026-03-13 cs.AI cs.HC

Agentic Explainable Artificial Intelligence (Agentic XAI) Approach To Explore Better Explanation

Tomoaki Yamaguchi, Yutong Zhou, Masahiro Ryo, Keisuke Katsura

English abstract

Explainable artificial intelligence (XAI) enables data-driven understanding of factor associations with response variables, yet communicating XAI outputs to laypersons remains challenging, hindering trust in AI-based predictions. Large language models (LLMs) have emerged as promising tools for translating technical explanations into accessible narratives, yet the integration of agentic AI, where LLMs operate as autonomous agents through iterative refinement, with XAI remains unexplored. This study proposes an agentic XAI framework combining SHAP-based explainability with multimodal LLM-driven iterative refinement to generate progressively enhanced explanations. As a use case, we tested this framework as an agricultural recommendation system using rice yield data from 26 fields in Japan. The Agentic XAI initially provided a SHAP result and explored how to improve the explanation through additional analysis iteratively across 11 refinement rounds (Rounds 0-10). Explanations were evaluated by human experts (crop scientists) (n=12) and LLMs (n=14) against seven metrics: Specificity, Clarity, Conciseness, Practicality, Contextual Relevance, Cost Consideration, and Crop Science Credibility. Both evaluator groups confirmed that the framework successfully enhanced recommendation quality with an average score increase of 30-33% from Round 0, peaking at Rounds 3-4. However, excessive refinement showed a substantial drop in recommendation quality, indicating a bias-variance trade-off where early rounds lacked explanation depth (bias) while excessive iteration introduced verbosity and ungrounded abstraction (variance), as revealed by metric-specific analysis. These findings suggest that strategic early stopping (regularization) is needed for optimizing practical utility, challenging assumptions about monotonic improvement and providing evidence-based design principles for agentic XAI systems.

2512.20299 2026-03-13 cs.RO cs.AI cs.CV

KnowVal: A Knowledge-Augmented and Value-Guided Autonomous Driving System

Zhongyu Xia, Wenhao Chen, Yongtao Wang, Ming-Hsuan Yang

Comments Accepted to CVPR 2026

2512.17137 2026-03-13 cs.CV cs.AI

SDUM: A Scalable Deep Unrolled Model for Universal MRI Reconstruction

Puyang Wang, Pengfei Guo, Keyi Chai, Jinyuan Zhou, Daguang Xu, Shanshan Jiang

Comments https://github.com/NVIDIA-Medtech/NV-Raw2insights-MRI