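arXivDaily daily academic digest: synced with the full arXiv data set, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.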

2603.18940 2026-03-30 cs.CL cs.LG

Entropy trajectory shape predicts LLM reasoning reliability: A diagnostic study of uncertainty dynamics in chain-of-thought

Xinghao Zhao

Abstract

Understanding uncertainty in chain-of-thought reasoning is critical for reliable deployment of large language models. In this work, we propose a simple yet effective diagnostic approach based on trajectory shape rather than scalar magnitude. We show that this signal is practical, interpretable, and inexpensive to obtain in black-box settings, while remaining robust across models and datasets. Through extensive ablations and cross-domain replications, we demonstrate its utility for selective prediction and triage. Our findings offer a generalizable insight into uncertainty dynamics in reasoning tasks, with particular focus on numeric and discrete-answer settings.

2603.18739 2026-03-30 cs.CV

EdgeCrafter: Compact ViTs for Edge Dense Prediction via Task-Specialized Distillation

Longfei Liu, Yongjie Hou, Yang Li, Qirui Wang, Youyang Sha, Yongjun Yu, Yinzhi Wang, Peizhe Ru, Xuanlong Yu, Xi Shen

Comments Code is available at: https://intellindust-ai-lab.github.io/projects/EdgeCrafter/

2603.17576 2026-03-30 cs.CV

LoGSAM: Parameter-Efficient Cross-Modal Grounding for MRI Segmentation

Mohammad Robaitul Islam Bhuiyan, Sheethal Bhat, Melika Qahqaie, Tri-Thien Nguyen, Paula Andrea Perez-Toro, Tomas Arias-Vergara, Andreas Maier

Comments 10 pages, 3 figures

2603.17528 2026-03-30 cs.CV

MM-OVSeg: Multimodal Optical-SAR Fusion for Open-Vocabulary Segmentation in Remote Sensing

Yimin Wei, Aoran Xiao, Hongruixuan Chen, Junshi Xia, Naoto Yokoya

Comments CVPR2026

2603.15636 2026-03-30 cs.AI

AIDABench: AI Data Analytics Benchmark

Yibo Yang, Fei Lei, Yixuan Sun, Yantao Zeng, Chengguang Lv, Jiancao Hong, Jiaojiao Tian, Tianyu Qiu, Xin Wang, Yanbing Chen, Yanjie Li, Zheng Pan, Xiaochen Zhou, Guanzhou Chen, Haoran Lv, Yuning Xu, Yue Ou, Haodong Liu, Shiqi He, Anya Jia, Yulei Xin, Huan Wu, Liang Liu, Jiaye Ge, Jianxin Dong, Dahua Lin, Wenxiu Sun

Comments 22 pages (including appendix), 9 figures, 4 tables. Code: https://github.com/MichaelYang-lyx/AIDABench. Dataset: https://huggingface.co/datasets/MichaelYang-lyx/AIDA

2603.15304 2026-03-30 cs.CV

UE5-Forest: A Photorealistic Synthetic Stereo Dataset for UAV Forestry Depth Estimation

Yida Lin, Bing Xue, Mengjie Zhang, Sam Schofield, Richard Green

2603.15195 2026-03-30 cs.LG

Massive Redundancy in Gradient Transport Enables Sparse Online Learning

Aur Shalev Merin

Comments 26 pages, 5 figures, 14 tables

Abstract

Real-time recurrent learning (RTRL) computes exact online gradients by propagating a Jacobian tensor forward through recurrent dynamics, but at O(n^4) cost per step. Prior work has sought structured approximations (rank-1 compression, graph-based sparsity, Kronecker factorization). We show that, in the continuous error signal regime, the recurrent Jacobian is massively redundant: propagating through a random 6% of paths (k=4 of n=64) recovers 84 +/- 6% of full RTRL's adaptation ability across five seeds, and the absolute count k=4 remains effective from n=64 to n=256 (6% to 1.6%, recovery 84 to 78%), meaning sparse RTRL becomes relatively cheaper as networks grow. In RNNs, the recovery is selection-invariant (even adversarial path selection works) and exhibits a step-function transition from zero to any nonzero propagation. Spectral analysis reveals the mechanism: the Jacobian is full-rank but near-isotropic (condition numbers 2.6-6.5), so any random subset provides a directionally representative gradient estimate. On chaotic dynamics (Lorenz attractor), sparse propagation is more numerically stable than full RTRL (CV 13% vs. 88%), as subsampling avoids amplifying pathological spectral modes. The redundancy extends to LSTMs (k=4 matches full RTRL) and to transformers via sparse gradient transport (50% head sparsity outperforms the dense reference; 33% is borderline), with higher thresholds reflecting head specialization rather than isotropy. On real primate neural data, sparse RTRL (k=4) adapts online to cross-session electrode drift (80 +/- 11% recovery, 5 seeds), where sparse propagation is again more stable than full RTRL. Without continuous error signal, Jacobian propagation accumulates numerical drift and degrades all RTRL variants, a scope condition for all forward-mode methods. Results hold with SGD (92 +/- 1% recovery), suggesting independence from optimizer choice.

2603.14688 2026-03-30 cs.LG cs.AI cs.SE

AgentTrace: Causal Graph Tracing for Root Cause Analysis in Deployed Multi-Agent Systems

Zhaohui Geoffrey Wang

Comments 11 pages, 1 figure, 19 tables. Published at ICLR 2026 Workshop on Agents in the Wild. Camera-ready version with revised layout and framework overview figure

2603.14375 2026-03-30 cs.CV cs.AI

The Pulse of Motion: Measuring Physical Frame Rate from Visual Dynamics

Xiangbo Gao, Mingyang Wu, Siyuan Yang, Jiongze Yu, Pardis Taghavi, Fangzhou Lin, Zhengzhong Tu

2603.13352 2026-03-30 cs.CV

Local Precise Refinement: A Dual-Gated Mixture-of-Experts for Enhancing Foundation Model Generalization against Spectral Shifts

Xi Chen, Maojun Zhang, Yu Liu, Shen Yan

2603.12760 2026-03-30 cs.CV

HIFICL: High-Fidelity In-Context Learning for Multimodal Tasks

Xiaoyu Li, Yuhang Liu, Xuanshuo Kang, Zheng Luo, Fangqi Lou, Xiaohua Wu, Zihan Xiong

Comments Accepted to CVPR 2026. Code available at https://github.com/bbbandari/HiFICL

2603.12206 2026-03-30 cs.CL

CLASP: Defending Hybrid Large Language Models Against Hidden State Poisoning Attacks

Alexandre Le Mercier, Thomas Demeester, Chris Develder

Comments 22 pages, 6 figures

2603.11601 2026-03-30 cs.AI

See, Symbolize, Act: Grounding VLMs with Spatial Representations for Better Gameplay

Ashish Baghel, Paras Chopra

Comments 11 pages, 13 figures. Accepted to LMReasoning Workshop at AAAI 2026

2603.06178 2026-03-30 cs.CV

Making Training-Free Diffusion Segmentors Scale with the Generative Power

Benyuan Meng, Qianqian Xu, Zitai Wang, Xiaochun Cao, Longtao Huang, Qingming Huang

Comments Accepted to CVPR 2026

2603.00717 2026-03-30 cs.CV

Leveraging Arbitrary Data Sources for AI-Generated Image Detection Without Sacrificing Generalization

Qinghui He, Haifeng Zhang, Xiuli Bi, Bo Liu, Chi-Man Pun, Bin Xiao

Comments Accepted to CVPR Findings 2026

2602.22949 2026-03-30 cs.CV

OpenFS: Multi-Hand-Capable Fingerspelling Recognition with Implicit Signing-Hand Detection and Frame-Wise Letter-Conditioned Synthesis

Junuk Cha, Jihyeon Kim, Han-Mu Park

Comments Accepted to CVPR 2026, camera-ready version

2602.22025 2026-03-30 cs.CV

Olbedo: An Albedo and Shading Aerial Dataset for Large-Scale Outdoor Environments

Shuang Song, Debao Huang, Deyan Deng, Haolin Xiong, Yang Tang, Yajie Zhao, Rongjun Qin

Comments CVPR 2026

2602.21100 2026-03-30 cs.CV cs.GR

Skullptor: High Fidelity 3D Head Reconstruction in Seconds with Multi-View Normal Prediction

Noé Artru, Rukhshanda Hussain, Emeline Got, Alexandre Messier, David B. Lindell, Abdallah Dib

Comments For our project page, see https://ubisoft-laforge.github.io/character/skullptor/

2602.20396 2026-03-30 cs.LG stat.ME

cc-Shapley: Measuring Multivariate Feature Importance Needs Causal Context

Jörg Martin, Stefan Haufe

2602.19623 2026-03-30 cs.CV cs.AI cs.HC

PedaCo-Gen: Scaffolding Pedagogical Agency in Human-AI Collaborative Video Authoring

Injun Baek, Yearim Kim, Nojun Kwak

2602.19530 2026-03-30 cs.CV

ORION: ORthonormal Text Encoding for Universal VLM AdaptatION

Omprakash Chakraborty, Jose Dolz, Ismail Ben Ayed

2602.18846 2026-03-30 cs.CV cs.AI

DUET-VLM: Dual stage Unified Efficient Token reduction for VLM Training and Inference

Aditya Kumar Singh, Hitesh Kandala, Pratik Prabhanjan Brahma, Zicheng Liu, Emad Barsoum

Comments 15 Pages, 8 figures, 15 tables, CVPR 2026; Code: https://github.com/AMD-AGI/DUET-VLM

2602.18709 2026-03-30 cs.CV cs.RO

IRIS-SLAM: Unified Geo-Instance Representations for Robust Semantic Localization and Mapping

Tingyang Xiao, Liu Liu, Wei Feng, Zhengyu Zou, Xiaolin Zhou, Wei Sui, Hao Li, Dingwen Zhang, Zhizhong Su

2602.11391 2026-03-30 cs.CL

Advancing AI Trustworthiness Through Patient Simulation: Risk Assessment of Conversational Agents for Antidepressant Selection

Md Tanvir Rouf Shawon, Mohammad Sabik Irbaz, Hadeel R. A. Elyazori, Keerti Reddy Resapu, Yili Lin, Vladimir Franzuela Cardenas, Farrokh Alemi, Kevin Lybarger

2602.08277 2026-03-30 cs.CV cs.AI

PISCO: Precise Video Instance Insertion with Sparse Control

Xiangbo Gao, Renjie Li, Xinghao Chen, Yuheng Wu, Suofei Feng, Qing Yin, Zhengzhong Tu

Abstract

The landscape of AI video generation is undergoing a pivotal shift: moving beyond general generation - which relies on exhaustive prompt-engineering and "cherry-picking" - towards fine-grained, controllable generation and high-fidelity post-processing. In professional AI-assisted filmmaking, it is crucial to perform precise, targeted modifications. A cornerstone of this transition is video instance insertion, which requires inserting a specific instance into existing footage while maintaining scene integrity. Unlike traditional video editing, this task demands several requirements: precise spatial-temporal placement, physically consistent scene interaction, and the faithful preservation of original dynamics - all achieved under minimal user effort. In this paper, we propose PISCO, a video diffusion model for precise video instance insertion with arbitrary sparse keyframe control. PISCO allows users to specify a single keyframe, start-and-end keyframes, or sparse keyframes at arbitrary timestamps, and automatically propagates object appearance, motion, and interaction. To address the severe distribution shift induced by sparse conditioning in pretrained video diffusion models, we introduce Variable-Information Guidance for robust conditioning and Distribution-Preserving Temporal Masking to stabilize temporal generation, together with geometry-aware conditioning for realistic scene adaptation. We further construct PISCO-Bench, a benchmark with verified instance annotations and paired clean background videos, and evaluate performance using both reference-based and reference-free perceptual metrics. Experiments demonstrate that PISCO consistently outperforms strong inpainting and video editing baselines under sparse control, and exhibits clear, monotonic performance improvements as additional control signals are provided. Project page: xiangbogaobarry.github.io/PISCO.

2602.07374 2026-03-30 cs.CL cs.AI

TernaryLM: Memory-Efficient Language Modeling via Native 1.5-Bit Quantization with Adaptive Layer-wise Scaling

Nisharg Nargund, Priyesh Shukla

2602.03220 2026-03-30 cs.CV

PokeFusion Attention: A Lightweight Cross-Attention Mechanism for Style-Conditioned Image Generation

Jingbang Tang

Comments 12 pages, 5 figures. Revised version with improved method description and corrected references

2601.21419 2026-03-30 cs.LG cs.CV

Revisiting Diffusion Model Predictions Through Dimensionality

Qing Jin, Chaoyang Wang

Comments 19 pages, 5 figures

2601.19933 2026-03-30 cs.CL cs.AI cs.LG

NRR-Phi: Text-to-State Mapping for Ambiguity Preservation in LLM Inference

Kei Saito

Comments 25 pages, 5 figures, 7 tables. Replacement synced to repository snapshot v39. Series hub link: https://github.com/kei-saito-research/nrr-series-hub

2601.17468 2026-03-30 cs.CV cs.LG

ReflexSplit: Single Image Reflection Separation via Layer Fusion-Separation

Chia-Ming Lee, Yu-Fan Lin, Jin-Hui Jiang, Yu-Jou Hsiao, Chih-Chung Hsu, Yu-Lun Liu

Comments CVPR 2026 Camera Ready; Project page: https://wuw2135.github.io/ReflexSplit-ProjectPage/