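arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.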

2604.13397 2026-04-16 cs.CV

A Multimodal Clinically Informed Coarse-to-Fine Framework for Longitudinal CT Registration in Proton Therapy

Caiwen Jiang, Yuzhen Ding, Mi Jia, Samir H. Patel, Terence T. Sio, Jonathan B. Ashman, Lisa A. McGee, Jean-Claude M. Rwigema, William G. Rule, Sameer R. Keole, Sujay A. Vora, William W. Wong, Nathan Y. Yu, Michele Y. Halyard, Steven E. Schild, Dinggang Shen, Wei Liu

English abstract

Proton therapy offers superior organ-at-risk sparing but is highly sensitive to anatomical changes, making accurate deformable image registration (DIR) across longitudinal CT scans essential. Conventional DIR methods are often too slow for emerging online adaptive workflows, while existing deep learning-based approaches are primarily designed for generic benchmarks and underutilize clinically relevant information beyond images. To address this gap, we propose a clinically scalable coarse-to-fine deformable registration framework that integrates multimodal information from the proton radiotherapy workflow to accommodate diverse clinical scenarios. The model employs dual CNN-based encoders for hierarchical feature extraction and a transformer-based decoder to progressively refine deformation fields. Beyond CT intensities, clinically critical priors, including target and organ-at-risk contours, dose distributions, and treatment planning text, are incorporated through anatomy- and risk-guided attention, text-conditioned feature modulation, and foreground-aware optimization, enabling anatomically focused and clinically informed deformation estimation. We evaluate the proposed framework on a large-scale proton therapy DIR dataset comprising 1,222 paired planning and repeat CT scans across multiple anatomical regions and disease types. Extensive experiments demonstrate consistent improvements over state-of-the-art methods, enabling fast and robust clinically meaningful registration.

2604.13395 2026-04-16 cs.AI cs.LG

Quantifying and Understanding Uncertainty in Large Reasoning Models

Yangyi Li, Chenxu Zhao, Mengdi Huai

2604.13386 2026-04-16 cs.LG

Linear Probe Accuracy Scales with Model Size and Benefits from Multi-Layer Ensembling

Erik Nordby, Tasha Pais, Aviel Parrack

2604.13383 2026-04-16 cs.CV

UniBlendNet: Unified Global, Multi-Scale, and Region-Adaptive Modeling for Ambient Lighting Normalization

Jiatao Dai, Wei Dong, Han Zhou, Chengzhou Tang, Jun Chen

Comments Accepted to CVPR 2026 NTIRE Workshop on New Trends in Image Restoration and Enhancement. 8 pages, 4 figures

2604.13371 2026-04-16 cs.CL

Empirical Evidence of Complexity-Induced Limits in Large Language Models on Finite Discrete State-Space Problems with Explicit Validity Constraints

Md. Fahad Ullah Utsho, Mohd. Ruhul Ameen, Akif Islam, Md. Golam Rashed, Dipankar Das

Comments 45 pages, 36 figures, 7 tables, Journal Preprint

2604.13368 2026-04-16 cs.CL

TLoRA+: A Low-Rank Parameter-Efficient Fine-Tuning Method for Large Language Models

Yarui Cao, Kai Liu

Comments 16 pages, 12 figures and 6 tables in total. Submitted to CoLM

2604.13367 2026-04-16 cs.CV cs.AI

A 3D SAM-Based Progressive Prompting Framework for Multi-Task Segmentation of Radiotherapy-induced Normal Tissue Injuries in Limited-Data Settings

Caiwen Jiang, Lei Zeng, Wei Liu

2604.13349 2026-04-16 cs.LG

When Less Latent Leads to Better Relay: Information-Preserving Compression for Latent Multi-Agent LLM Collaboration

Yiping Li, Zhiyu An, Wan Du

2604.13348 2026-04-16 cs.AI cs.CR

Listening Alone, Understanding Together: Collaborative Context Recovery for Privacy-Aware AI

Tanmay Srivastava, Amartya Basu, Shubham Jain, Vaishnavi Ranganathan

2604.13346 2026-04-16 cs.CL

AgentSPEX: An Agent SPecification and EXecution Language

Pengcheng Wang, Jerry Huang, Jiarui Yao, Rui Pan, Peizhi Niu, Yaowenqi Liu, Ruida Wang, Renhao Lu, Yuwei Guo, Tong Zhang

2604.13345 2026-04-16 cs.CV

Multi-Agent Object Detection Framework Based on Raspberry Pi YOLO Detector and Slack-Ollama Natural Language Interface

Vladimir Kalušev, Branko Brkljač, Milan Brkljač

Comments 19 pages, 7 figures, 2 tables, implementation code will be made available upon manuscript publication

2604.13340 2026-04-16 cs.CV cs.GR

MSGS: Multispectral 3D Gaussian Splatting

Iris Zheng, Guojun Tang, Alexander Doronin, Paul Teal, Fang-Lue Zhang

Comments Published in IEEE ISMAR 2025 Adjunct

2604.13335 2026-04-16 cs.CV

SEDTalker: Emotion-Aware 3D Facial Animation Using Frame-Level Speech Emotion Diarization

Farzaneh Jafari, Stefano Berretti, Anup Basu

Comments 15 pages; 4 figures; conference

2604.13333 2026-04-16 cs.CV cs.GR

SSD-GS: Scattering and Shadow Decomposition for Relightable 3D Gaussian Splatting

Iris Zheng, Guojun Tang, Alexander Doronin, Paul Teal, Fang-Lue Zhang

Comments Accepted to ICLR 2026. Code available at: https://github.com/irisfreesiri/SSD-GS

2604.13332 2026-04-16 cs.LG

Selecting Feature Interactions for Generalized Additive Models by Distilling Foundation Models

Jingyun Jia, Chandan Singh, Rich Caruana, Ben Lengerich

2604.13328 2026-04-16 cs.LG

Multi-Task LLM with LoRA Fine-Tuning for Automated Cancer Staging and Biomarker Extraction

Jiahao Shao, Anam Nawaz Khan, Christopher Brett, Tom Berg, Xueping Li, Bing Yao

Comments 11 pages, 3 figures and 4 tables in the main manuscript. Additional content, figures and tables are in supplementary material section. 17 pages in total

2604.13325 2026-04-16 cs.RO cs.SY eess.SY

Boundary Sampling to Learn Predictive Safety Filters via Pontryagin's Maximum Principle

James Dallas, Thomas Lew, John Talbot, Jonathan DeCastro, Somil Bansal, John Subosits

Comments This work has been submitted to the IEEE for possible publication

2604.13323 2026-04-16 cs.RO

Vectorizing Projection in Manifold-Constrained Motion Planning for Real-Time Whole-Body Control

Shrutheesh R Iyer, I-Chia Chang, Andrew Z. Liu, Yan Gu, Zachary Kingston

Comments 8 pages, 8 figures, 3 tables. Under review

2604.13322 2026-04-16 cs.CV

Towards Successful Implementation of Automated Raveling Detection: Effects of Training Data Size, Illumination Difference, and Spatial Shift

Xinan Zhang, Haolin Wang, Zhongyu Yang, Yi-Chang Tsai

Comments Accepted and presented in TRBAM 2026

2604.13321 2026-04-16 cs.CV

Why MLLMs Struggle to Determine Object Orientations

Anju Gopinath, Nikhil Krishnaswamy, Bruce Draper

2604.13318 2026-04-16 cs.AI cs.CL

WebXSkill: Skill Learning for Autonomous Web Agents

Zhaoyang Wang, Qianhui Wu, Xuchao Zhang, Chaoyun Zhang, Wenlin Yao, Fazle Elahi Faisal, Baolin Peng, Si Qin, Suman Nath, Qingwei Lin, Chetan Bansal, Dongmei Zhang, Saravan Rajmohan, Jianfeng Gao, Huaxiu Yao

Comments 21 pages

2604.13316 2026-04-16 cs.LG cs.AI

Beyond Uniform Sampling: Synergistic Active Learning and Input Denoising for Robust Neural Operators

Samrendra Roy, Souvik Chakraborty, Syed Bahauddin Alam

2604.13313 2026-04-16 cs.LG

Concrete Jungle: Towards Concreteness Paved Contrastive Negative Mining for Compositional Understanding

Eun Woo Im, Dhruv Madhwal, Vivek Gupta

Comments 10 pages

2604.13309 2026-04-16 cs.RO

Utilizing Inpainting for Keypoint Detection for Vision-Based Control of Robotic Manipulators

Sreejani Chatterjee, Venkatesh Mullur, Abhinav Gandhi, Berk Calli

2604.13307 2026-04-16 cs.CV cs.LG

Deep Spatially-Regularized and Superpixel-Based Diffusion Learning for Unsupervised Hyperspectral Image Clustering

Vutichart Buranasiri, James M. Murphy

Comments To appear in IEEE IGARSS 2026

2604.13305 2026-04-16 cs.CV

Bias at the End of the Score

Salma Abdel Magid, Grace Guo, Esin Tureci, Amaya Dharmasiri, Vikram V. Ramaswamy, Hanspeter Pfister, Olga Russakovsky

Comments Accepted to The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2026

2604.13304 2026-04-16 cs.CV cs.AI

Can Cross-Layer Transcoders Replace Vision Transformer Activations? An Interpretable Perspective on Vision

Gerasimos Chatzoudis, Konstantinos D. Polyzos, Zhuowei Li, Difei Gu, Gemma E. Moran, Hao Wang, Dimitris N. Metaxas

2604.13295 2026-04-16 cs.LG math.PR stat.ML

Some Theoretical Limitations of t-SNE

Rupert Li, Elchanan Mossel

Comments 19 pages, 7 figures

2604.13292 2026-04-16 cs.CV

See&Say: Vision Language Guided Safe Zone Detection for Autonomous Package Delivery Drones

Mahyar Ghazanfari, Peng Wei

2604.13291 2026-04-16 cs.LG

Physics-informed reservoir characterization from bulk and extreme pressure events with a differentiable simulator

Harun Ur Rashid, Mingxin Li, Aleksandra Pachalieva, Georg Stadler, Daniel O'Malley

English abstract

Accurate characterization of subsurface heterogeneity is challenging but essential for applications such as reservoir pressure management, geothermal energy extraction and CO$_2$, H$_2$, and wastewater injection operations. This challenge becomes especially acute in extreme pressure events, which are rarely observed but can strongly affect operational risk. Traditional history matching and inversion techniques rely on expensive full-physics simulations, making it infeasible to handle uncertainty and extreme events at scale. Purely data-driven models often struggle to maintain physics consistency when dealing with sparse observations, complex geology, and extreme events. To overcome these limitations, we introduce a physics-informed machine learning method that embeds a differentiable subsurface flow simulator directly into neural network training. The network infers heterogeneous permeability fields from limited pressure observations, while training minimizes both permeability and pressure losses through the simulator, enforcing physical consistency. Because the simulator is used only during training, inference remains fast once the model is learned. In an initial test, the proposed method reduces the pressure inference error by half compared with a purely data-driven approach. We then extend the test over eight distinct data scenarios, and in every case, our method produces significantly lower pressure inference errors than the purely data-driven model. We also evaluate our method on extreme events, which represent high-consequence data in the tail of the sample distribution. Similar to the bulk distribution, the physics-informed model maintains higher pressure inference accuracy in the extreme event regimes. Overall, the proposed method enables rapid, physics-consistent subsurface inversion for real-time reservoir characterization and risk-aware decision-making.
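
The training objective described in the abstract (a permeability loss plus a pressure loss evaluated through a simulator) can be illustrated with a toy sketch. Everything here is an assumption for illustration and not taken from the paper: the 1D steady-state Darcy "simulator", the function names `simulate_pressure` and `combined_loss`, the `weight` parameter, and the mean-squared-error form of both terms. The sketch only evaluates the loss; in an actual physics-informed setup, an automatic-differentiation framework would propagate gradients through the simulator.

```python
def simulate_pressure(perm, p_in=1.0, p_out=0.0):
    """Toy 1D steady-state Darcy flow through unit-width cells in series.

    Returns pressures at the cell interfaces (len(perm) + 1 values).
    This is a hypothetical stand-in for a real subsurface flow simulator.
    """
    resist = [1.0 / k for k in perm]          # hydraulic resistance per cell
    flux = (p_in - p_out) / sum(resist)       # constant flux through the column
    pressures = [p_in]
    for r in resist:
        pressures.append(pressures[-1] - flux * r)  # pressure drop across a cell
    return pressures


def mse(a, b):
    """Mean squared error between two equal-length sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(perm_pred, perm_true, p_obs, obs_idx, weight=1.0):
    """Permeability loss plus pressure loss computed through the simulator.

    obs_idx lists the interface indices where pressure was observed
    (assumed sparse); weight balances the two terms (an assumption).
    """
    p_sim = simulate_pressure(perm_pred)
    perm_loss = mse(perm_pred, perm_true)
    pressure_loss = mse([p_sim[i] for i in obs_idx], p_obs)
    return perm_loss + weight * pressure_loss


# Usage: observe pressure at two interior interfaces of a 3-cell column.
perm_true = [1.0, 2.0, 4.0]
obs_idx = [1, 2]
p_true = simulate_pressure(perm_true)
p_obs = [p_true[i] for i in obs_idx]

loss_exact = combined_loss(perm_true, perm_true, p_obs, obs_idx)
loss_wrong = combined_loss([1.0, 1.0, 1.0], perm_true, p_obs, obs_idx)
```

In this toy setting a prediction that matches the true permeability gives zero loss, while a uniform guess is penalized both for the permeability mismatch and for the pressure mismatch it causes at the observed interfaces.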
