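arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.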

2603.27948 2026-04-03 cs.CV

RehearsalNeRF: Decoupling Intrinsic Neural Fields of Dynamic Illuminations for Scene Editing

Changyeon Won, Hyunjun Jung, Jungu Cho, Seonmi Park, Chi-Hoon Lee, Hae-Gon Jeon

Comments: Accepted to the International Journal of Computer Vision (IJCV). Changyeon Won and Hyunjun Jung contributed equally to this work

English abstract

Although there has been significant progress in neural radiance fields, an issue on dynamic illumination changes still remains unsolved. Different from relevant works that parameterize time-variant/-invariant components in scenes, subjects' radiance is highly entangled with their own emitted radiance and lighting colors in spatio-temporal domain. In this paper, we present a new effective method to learn disentangled neural fields under the severe illumination changes, named RehearsalNeRF. Our key idea is to leverage scenes captured under stable lighting like rehearsal stages, easily taken before dynamic illumination occurs, to enforce geometric consistency between the different lighting conditions. In particular, RehearsalNeRF employs a learnable vector for lighting effects which represents illumination colors in a temporal dimension and is used to disentangle projected light colors from scene radiance. Furthermore, our RehearsalNeRF is also able to reconstruct the neural fields of dynamic objects by simply adopting off-the-shelf interactive masks. To decouple the dynamic objects, we propose a new regularization leveraging optical flow, which provides coarse supervision for the color disentanglement. We demonstrate the effectiveness of RehearsalNeRF by showing robust performances on novel view synthesis and scene editing under dynamic illumination conditions. Our source code and video datasets will be publicly available.

2603.27529 2026-04-03 cs.LG cs.AI

Cross-attentive Cohesive Subgraph Embedding to Mitigate Oversquashing in GNNs

Tanvir Hossain, Muhammad Ifte Khairul Islam, Lilia Chebbah, Charles Fanning, Esra Akbas

2603.27346 2026-04-03 cs.RO cs.AI cs.LG

D-SPEAR: Dual-Stream Prioritized Experience Adaptive Replay for Stable Reinforcement Learning in Robotic Manipulation

Yu Zhang, Karl Mason

Comments: Accepted at IEEE 11th International Conference on Control and Robotics Engineering (ICCRE 2026)

2603.27290 2026-04-03 cs.CV

Human-Centric Perception for Child Sexual Abuse Imagery

Camila Laranjeira, João Macedo, Sandra Avila, Fabrício Benevenuto, Jefersson A. dos Santos

Comments: Submitted to IEEE Transactions on Information Forensics and Security (TIFS)

2603.26546 2026-04-03 cs.CV

AutoWeather4D: Autonomous Driving Video Weather Conversion via G-Buffer Dual-Pass Editing

Tianyu Liu, Weitao Xiong, Kunming Luo, Manyuan Zhang, Peng Li, Yuan Liu, Ping Tan

Comments: Project Page: https://lty2226262.github.io/autoweather4d/ | Github: https://github.com/lty2226262/AutoWeather4D

2603.25467 2026-04-03 cs.CV

GridVAD: Open-Set Video Anomaly Detection via Spatial Reasoning over Stratified Frame Grids

Mohamed Eltahir, Ahmed O. Ibrahim, Obada Siralkhatim, Tabarak Abdallah, Sondos Mohamed

2603.25040 2026-04-03 cs.LG cs.CL cs.CV

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Yicheng Zou, Dongsheng Zhu, Lin Zhu, Tong Zhu, Yunhua Zhou, Peiheng Zhou, Xinyu Zhou, Dongzhan Zhou, Zhiwang Zhou, Yuhao Zhou, Bowen Zhou, Zhanping Zhong, Zhijie Zhong, Haiteng Zhao, Penghao Zhao, Xiaomeng Zhao, Zhiyuan Zhao, Yechen Zhang, Jin Zhang, Wenwei Zhang, Hongjie Zhang, Zhuo Zhang, Wenlong Zhang, Bo Zhang, Chao Zhang, Chen Zhang, Yuhang Zang, Fei Yuan, Jiakang Yuan, Jiashuo Yu, Jinhui Yin, Haochen Ye, Qian Yao, Bowen Yang, Danni Yang, Kaichen Yang, Ziang Yan, Jun Xu, Yicheng Xu, Wanghan Xu, Xuenan Xu, Chao Xu, Ruiliang Xu, Shuhao Xing, Long Xing, Xinchen Xie, Ling-I Wu, Zijian Wu, Zhenyu Wu, Lijun Wu, Yue Wu, Jianyu Wu, Wen Wu, Fan Wu, Xilin Wei, Qi Wei, Bingli Wang, Rui Wang, Ziyi Wang, Zun Wang, Yi Wang, Haomin Wang, Yizhou Wang, Lintao Wang, Yiheng Wang, Longjiang Wang, Bin Wang, Jian Tong, Zhongbo Tian, Huanze Tang, Chen Tang, Shixiang Tang, Yu Sun, Qiushi Sun, Xuerui Su, Qisheng Su, Chenlin Su, Demin Song, Jin Shi, Fukai Shang, Yuchen Ren, Pengli Ren, Xiaoye Qu, Yuan Qu, Jiantao Qiu, Yu Qiao, Biqing Qi, Runyu Peng, Tianshuo Peng, Jiahui Peng, Qizhi Pei, Zhuoshi Pan, Linke Ouyang, Wenchang Ning, Yichuan Ma, Zerun Ma, Ningsheng Ma, Runyuan Ma, Chengqi Lyu, Haijun Lv, Han Lv, Lindong Lu, Kuikun Liu, Jiangning Liu, Yuhong Liu, Kai Liu, Hongwei Liu, Zhoumianze Liu, Mengjie Liu, Ziyu Liu, Wenran Liu, Yang Liu, Liwei Liu, Kaiwen Liu, Junyao Lin, Junming Lin, Tianyang Lin, Dahua Lin, Jianze Liang, Linyang Li, Peiji Li, Zonglin Li, Zehao Li, Pengze Li, Guoyan Li, Lingkai Kong, Linglin Jing, Zhenjiang Jin, Feifei Jiang, Qian Jiang, Junhao Huang, Zixian Huang, Haian Huang, Zhouqi Hua, Ermo Hua, Han Hu, Linfeng Hou, Yinan He, Conghui He, Tianyao He, Xu Guo, Qipeng Guo, Aijia Guo, Yuzhe Gu, Lixin Gu, Jingyang Gong, Qiming Ge, Jiaye Ge, Songyang Gao, Jianfei Gao, Xinyu Fang, Caihua fan, Yue Fan, Yanhui Duan, Zichen Ding, Shengyuan Ding, Ning Ding, Xuanlang Dai, Erfei Cui, Ganqu Cui, Pei Chu, Tao Chu, Guangran Cheng, Yu Cheng, Kai Chen, Yongkang Chen, Chiyu Chen, 
Guanzhou Chen, Qiaosheng Chen, Sitao Chen, Xin Chen, Haojiong Chen, Yicheng Chen, Weihan Cao, Yuhang Cao, Qinglong Cao, Lei Bai

2603.24270 2026-04-03 cs.CV

ScrollScape: Unlocking 32K Image Generation With Video Diffusion Priors

Haodong Yu, Yabo Zhang, Donglin Di, Ruyi Zhang, Wangmeng Zuo

2603.23711 2026-04-03 cs.CV

Mind the Hitch: Dynamic Calibration and Articulated Perception for Autonomous Trucks

Morui Zhu, Yongqi Zhu, Song Fu, Qing Yang

Comments: CVPR 2026 camera-ready version (minor revision & supplementary included)

2603.23406 2026-04-03 cs.AI cs.CL cs.HC

Beyond Preset Identities: How Agents Form Stances and Boundaries in Generative Societies

Hanzhong Zhang, Siyang Song, Jindong Wang

Comments: 22 pages, 3 figures. arXiv admin note: substantial text overlap with arXiv:2508.17366

2603.21745 2026-04-03 cs.AI cs.CL

The Presupposition Problem in Representation Genesis

Yiling Wu

2603.21736 2026-04-03 cs.AI cs.CL

The Reasoning Error About Reasoning: Why Different Types of Reasoning Require Different Representational Structures

Yiling Wu

2603.21687 2026-04-03 cs.AI

MIRAGE: The Illusion of Visual Understanding

Mohammad Asadi, Jack W. O'Sullivan, Fang Cao, Tahoura Nedaee, Kamyar Rajabalifardi, Fei-Fei Li, Ehsan Adeli, Euan Ashley

2603.20673 2026-04-03 cs.CL cs.AI

PAVE: Premise-Aware Validation and Editing for Retrieval-Augmented LLMs

Tianyi Huang, Caden Yang, Emily Yin, Eric Wang, Michael Zhang

Comments: Accepted at the ICLR 2026 Workshop on Logical Reasoning of Large Language Models

2603.20453 2026-04-03 cs.LG

Regret Bounds for Reinforcement Learning from Multi-Source Imperfect Preferences

Ming Shi, Yingbin Liang, Ness B. Shroff, Ananthram Swami

2603.19834 2026-04-03 cs.CV

Fourier Splatting: Generalized Fourier encoded primitives for scalable radiance fields

Mihnea-Bogdan Jurca, Bert Van hauwermeiren, Adrian Munteanu

2603.19136 2026-04-03 cs.LG cs.AI q-fin.ST

Adaptive Regime-Aware Stock Price Prediction Using Autoencoder-Gated Dual Node Transformers with Reinforcement Learning Control

Mohammad Al Ridhawi, Mahtab Haj Ali, Hussein Al Osman

Comments: Submitted to Applied Intelligence (Springer). 17 pages, 9 figures, 10 tables

English abstract

Stock markets exhibit regime-dependent behavior where prediction models optimized for stable conditions often fail during volatile periods. Existing approaches typically treat all market states uniformly or require manual regime labeling, which is expensive and quickly becomes stale as market dynamics evolve. This paper introduces an adaptive prediction framework that adaptively identifies deviations from normal market conditions and routes data through specialized prediction pathways. The architecture consists of three components: (1) an autoencoder trained on normal market conditions that identifies anomalous regimes through reconstruction error, (2) dual node transformer networks specialized for stable and event-driven market conditions respectively, and (3) a Soft Actor-Critic reinforcement learning controller that adaptively tunes the regime detection threshold and pathway blending weights based on prediction performance feedback. The reinforcement learning component enables the system to learn adaptive regime boundaries, defining anomalies as market states where standard prediction approaches fail. Experiments on 20 S&P 500 stocks spanning 1982 to 2025 demonstrate that the proposed framework achieves 0.68% mean absolute percentage error (MAPE) for one-day predictions without the reinforcement controller and 0.59% MAPE with the full adaptive system, compared to 0.80% for the baseline integrated node transformer. Directional accuracy reaches 72% with the complete framework. The system maintains robust performance during high-volatility periods, with MAPE below 0.85% when baseline models exceed 1.5%. Ablation studies confirm that each component contributes meaningfully: autoencoder routing accounts for 36% relative MAPE degradation upon removal, followed by the SAC controller at 15% and the dual-path architecture at 7%.
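
As a rough illustration of the routing idea in this abstract, the sketch below gates between two pathway predictions using an autoencoder reconstruction error and a threshold, and recomputes the relative MAPE improvement from the quoted figures. The sigmoid gate, the `sharpness` parameter, and all function names are assumptions made for illustration; the abstract does not specify how the blending weights are formed, and in the paper the threshold and weights are tuned by a Soft Actor-Critic controller rather than fixed by hand.

```python
import math


def mape(actual, predicted):
    """Mean absolute percentage error, in percent."""
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


def blend_weight(recon_error, threshold, sharpness=10.0):
    """Hypothetical soft gate in [0, 1].

    Values near 0 favor the stable pathway; values near 1 favor the
    event-driven pathway. The sigmoid form is an assumption.
    """
    return 1.0 / (1.0 + math.exp(-sharpness * (recon_error - threshold)))


def routed_prediction(stable_pred, event_pred, recon_error, threshold):
    """Blend the two pathway outputs according to the gate."""
    w = blend_weight(recon_error, threshold)
    return (1.0 - w) * stable_pred + w * event_pred


def relative_reduction(baseline, improved):
    """Relative improvement over a baseline error, in percent."""
    return 100.0 * (baseline - improved) / baseline


if __name__ == "__main__":
    # MAPE figures quoted in the abstract: 0.80% baseline, 0.59% full system.
    print(relative_reduction(0.80, 0.59))
```

With the abstract's numbers, moving from 0.80% to 0.59% MAPE corresponds to a relative reduction of about 26%.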

2603.18588 2026-04-03 cs.CV cs.MM

A Novel FACS-Aligned Anatomical Text Description Paradigm for Fine-Grained Facial Behavior Synthesis

Jiahe Wang, Cong Liang, Xuandong Huang, Yuxin Wang, Xin Yun, Yi Wu, Yanan Chang, Shangfei Wang

2603.17979 2026-04-03 cs.CV

AdaRadar: Rate Adaptive Spectral Compression for Radar-based Perception

Jinho Park, Se Young Chun, Mingoo Seok

Comments: Accepted to CVPR 2026. Project page: https://jp4327.github.io/adaradar/

2603.17219 2026-04-03 cs.CV cs.AI cs.LG eess.IV

SA-CycleGAN-2.5D: Self-Attention CycleGAN with Tri-Planar Context for Multi-Site MRI Harmonization

Ishrith Gowda, Chunwei Liu

Comments: 12 pages, 5 figures, 5 tables. Submitted to MICCAI 2026

English abstract

Multi-site neuroimaging analysis is fundamentally confounded by scanner-induced covariate shifts, where the marginal distribution of voxel intensities $P(\mathbf{x})$ varies non-linearly across acquisition protocols while the conditional anatomy $P(\mathbf{y}|\mathbf{x})$ remains constant. This is particularly detrimental to radiomic reproducibility, where acquisition variance often exceeds biological pathology variance. Existing statistical harmonization methods (e.g., ComBat) operate in feature space, precluding spatial downstream tasks, while standard deep learning approaches are theoretically bounded by local effective receptive fields (ERF), failing to model the global intensity correlations characteristic of field-strength bias. We propose SA-CycleGAN-2.5D, a domain adaptation framework motivated by the $\mathcal{H}\Delta\mathcal{H}$-divergence bound of Ben-David et al., integrating three architectural innovations: (1) A 2.5D tri-planar manifold injection preserving through-plane gradients $\nabla_z$ at $O(HW)$ complexity; (2) A U-ResNet generator with dense voxel-to-voxel self-attention, surpassing the $O(\sqrt{L})$ receptive field limit of CNNs to model global scanner field biases; and (3) A spectrally-normalized discriminator constraining the Lipschitz constant ($K_D \le 1$) for stable adversarial optimization. Evaluated on 654 glioma patients across two institutional domains (BraTS and UPenn-GBM), our method reduces Maximum Mean Discrepancy (MMD) by 99.1% ($1.729 \to 0.015$) and degrades domain classifier accuracy to near-chance (59.7%). Ablation confirms that global attention is statistically essential (Cohen's $d = 1.32$, $p < 0.001$) for the harder heterogeneous-to-homogeneous translation direction. By bridging 2D efficiency and 3D consistency, our framework yields voxel-level harmonized images that preserve tumor pathophysiology, enabling reproducible multi-center radiomic analysis.
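
For readers unfamiliar with the Maximum Mean Discrepancy figure quoted above, the following is a minimal sketch of a biased squared-MMD estimator with an RBF kernel, plus a check of the reported percentage. The kernel choice, the `gamma` bandwidth, and the use of the squared, biased estimator are assumptions; the abstract does not say which variant the authors computed.

```python
import math


def rbf(x, y, gamma=1.0):
    """RBF kernel between two equal-length feature vectors."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def mmd2(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between two samples of vectors."""
    def mean_k(a, b):
        return sum(rbf(u, v, gamma) for u in a for v in b) / (len(a) * len(b))
    return mean_k(xs, xs) + mean_k(ys, ys) - 2.0 * mean_k(xs, ys)


def percent_reduction(before, after):
    """Relative reduction from `before` to `after`, in percent."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    site_a = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3)]   # toy feature vectors
    site_b = [(1.0, 1.1), (1.2, 0.9), (0.9, 1.3)]   # shifted "scanner" domain
    print(mmd2(site_a, site_b))                      # larger when domains differ
    print(round(percent_reduction(1.729, 0.015), 1))
```

Identical samples give an estimate of zero, and the quoted drop from 1.729 to 0.015 is consistent with the stated 99.1% reduction.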

2603.15789 2026-04-03 cs.RO

Emergent Dexterity via Diverse Resets and Large-Scale Reinforcement Learning

Patrick Yin, Tyler Westenbroek, Zhengyu Zhang, Joshua Tran, Ignacio Dagnino, Eeshani Shilamkar, Numfor Mbiziwo-Tiapo, Simran Bagaria, Xinlei Liu, Galen Mullins, Andrey Kolobov, Abhishek Gupta

2603.15033 2026-04-03 cs.LG

Rethinking Machine Unlearning: Models Designed to Forget via Key Deletion

Sonia Laguna, Jorge da Silva Goncalves, Moritz Vandenhirtz, Alain Ryser, Irene Cannistraci, Julia E. Vogt

2603.14218 2026-04-03 cs.LG

Interleaved Resampling and Refitting: Data and Compute-Efficient Evaluation of Black-Box Predictors

Haichen Hu, David Simchi-Levi

2603.13651 2026-04-03 cs.CL cs.AI cs.IR

Benchmarking Large Language Models on Reference Extraction and Parsing in the Social Sciences and Humanities

Yurui Zhu, Giovanni Colavizza, Matteo Romanello

Comments: 12 pages, 2 figures. Accepted at the SCOLIA 2026 Workshop (Second Workshop on Scholarly Information Access), co-located with ECIR 2026. Workshop date: April 2, 2026

Journal ref: Proceedings of the Second International Workshop on Scholarly Information Access (SCOLIA 2026), co-located with ECIR 2026, Delft, The Netherlands, April 2, 2026. CEUR Workshop Proceedings, Vol. 4187, pp. 16-30

English abstract

Bibliographic reference extraction and parsing are foundational for citation indexing, linking, and downstream scholarly knowledge-graph construction. However, most established evaluations focus on clean, English, end-of-document bibliographies, and therefore underrepresent the Social Sciences and Humanities (SSH), where citations are frequently multilingual, embedded in footnotes, abbreviated, and shaped by heterogeneous historical conventions. We present a unified benchmark that targets these SSH-realistic conditions across three complementary datasets: CEX (English journal articles spanning multiple disciplines), EXCITE (German/English documents with end-section, footnote-only, and mixed regimes), and LinkedBooks (humanities references with strong stylistic variation and multilinguality). We evaluate three tasks of increasing difficulty -- reference extraction, reference parsing, and end-to-end document parsing -- under a schema-constrained setup that enables direct comparison between a strong supervised pipeline baseline (GROBID) and contemporary LLMs (DeepSeek-V3.1, Mistral-Small-3.2-24B, Gemma-3-27B-it, and Qwen3-VL (4B-32B variants)). Across datasets, extraction largely saturates beyond a moderate capability threshold, while parsing and end-to-end parsing remain the primary bottlenecks due to structured-output brittleness under noisy layouts. We further show that lightweight LoRA adaptation yields consistent gains -- especially on SSH-heavy benchmarks -- and that segmentation/pipelining can substantially improve robustness. Finally, we argue for hybrid deployment via routing: leveraging GROBID for well-structured, in-distribution PDFs while escalating multilingual and footnote-heavy documents to task-adapted LLMs.
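
The hybrid routing argued for at the end of this abstract could look roughly like the sketch below. Everything here is hypothetical: the `DocProfile` fields, the `footnote_cutoff` value, and the returned labels are illustrative stand-ins, not the authors' routing rule, and no real GROBID or LLM API is called.

```python
from dataclasses import dataclass


@dataclass
class DocProfile:
    """Hypothetical per-document features used for routing."""
    languages: frozenset            # detected languages, e.g. frozenset({"en"})
    footnote_citation_ratio: float  # share of citations found in footnotes
    layout_is_clean: bool           # well-structured, in-distribution PDF


def choose_parser(doc, footnote_cutoff=0.5):
    """Return "grobid" for clean monolingual documents, else "llm".

    The cutoff is an assumed value chosen only for illustration.
    """
    multilingual = len(doc.languages) > 1
    footnote_heavy = doc.footnote_citation_ratio >= footnote_cutoff
    if doc.layout_is_clean and not multilingual and not footnote_heavy:
        return "grobid"
    return "llm"


if __name__ == "__main__":
    article = DocProfile(frozenset({"en"}), 0.05, True)
    monograph = DocProfile(frozenset({"de", "it"}), 0.9, False)
    print(choose_parser(article), choose_parser(monograph))
```

A rule of this shape keeps the cheaper supervised pipeline as the default and escalates only the documents the abstract identifies as hard: multilingual or footnote-heavy ones.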

2603.12372 2026-04-03 cs.AI cs.CL cs.LG

Efficient Reasoning with Balanced Thinking

Yulin Li, Tengyao Tu, Li Ding, Junjie Wang, Huiling Zhen, Yixin Chen, Yong Li, Zhuotao Tian

Comments: Accepted by ICLR 2026

2603.02788 2026-04-03 cs.AI

Agentified Assessment of Logical Reasoning Agents

Zhiyu Ni, Yifeng Xiao, Zheng Liang

Comments: Accepted at ICLR 2026 Agents in the Wild (AIWILD) Workshop. 5 pages, 2 figures, 1 table

2603.02491 2026-04-03 cs.LG cs.AI cs.RO q-bio.NC stat.ML

What Capable Agents Must Know: Selection Theorems for Robust Decision-Making under Uncertainty

Aran Nayebi

Comments: 23 pages; added PSR recovery (Theorems 3 & 4), and updated related work

2603.01593 2026-04-03 cs.CV

PPEDCRF: Privacy-Preserving Enhanced Dynamic CRF for Location-Privacy Protection for Sequence Videos with Minimal Detection Degradation

Bo Ma, Jinsong Wu, Weiqi Yan, Catherine Shi, Minh Nguyen

Comments: We would like to withdraw this paper due to identified issues in the experimental design and insufficient supporting data, which affect the reliability of the reported results. A substantially revised version with corrected experiments and extended evaluations will be prepared and submitted in the future

2602.23205 2026-04-03 cs.CV

EmbodMocap: In-the-Wild 4D Human-Scene Reconstruction for Embodied Agents

Wenjia Wang, Liang Pan, Huaijin Pi, Yuke Lou, Xuqian Ren, Yifan Wu, Zhouyingcheng Liao, Lei Yang, Rishabh Dabral, Christian Theobalt, Taku Komura

2602.20985 2026-04-03 cs.CV

EW-DETR: Evolving World Object Detection via Incremental Low-Rank DEtection TRansformer

Munish Monga, Vishal Chudasama, Pankaj Wasnik, C. V. Jawahar

Comments: Accepted at CVPR 2026