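arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.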

2603.14763 2026-03-17 cs.RO cs.CV

LiDAR-EVS: Enhance Extrapolated View Synthesis for 3D Gaussian Splatting with Pseudo-LiDAR Supervision

Yiming Huang, Xin Kang, Sipeng Zhang, Hongliang Ren, Weihua Zhang, Junjie Lai

Comments 22 pages, 8 figures

Abstract

3D Gaussian Splatting (3DGS) has emerged as a powerful technique for real-time LiDAR and camera synthesis in autonomous driving simulation. However, simulating LiDAR with 3DGS remains challenging for extrapolated views beyond the training trajectory, as existing methods, typically trained on single-traversal sensor scans, suffer from severe overfitting and poor generalization to novel ego-vehicle paths. To enable reliable simulation of LiDAR along unseen driving trajectories without external multi-pass data, we present LiDAR-EVS, a lightweight framework for robust extrapolated-view LiDAR simulation in autonomous driving. Designed to be plug-and-play, LiDAR-EVS readily extends to diverse LiDAR sensors and neural rendering baselines with minimal modification. Our framework comprises two key components: (1) pseudo extrapolated-view point cloud supervision with multi-frame LiDAR fusion, view transformation, occlusion curling, and intensity adjustment; (2) spatially-constrained dropout regularization that promotes robustness to diverse trajectory variations encountered in real-world driving. Extensive experiments demonstrate that LiDAR-EVS achieves SOTA performance on extrapolated-view LiDAR synthesis across three datasets, making it a promising tool for data-driven simulation, closed-loop evaluation, and synthetic data generation in autonomous driving systems.
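
The "multi-frame LiDAR fusion" and "view transformation" steps named in the abstract can be illustrated with a plain rigid-body transform. The sketch below is only an assumption about what such a step might look like, not the authors' implementation: the function names, the yaw-only rotation, and the example offsets are all hypothetical, and occlusion handling and intensity adjustment are left out.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]

def fuse_frames(frames: Sequence[Sequence[Point]]) -> List[Point]:
    """Concatenate several LiDAR frames that are already in the world frame.
    (Hypothetical helper: real fusion would first apply each frame's pose.)"""
    fused: List[Point] = []
    for frame in frames:
        fused.extend(frame)
    return fused

def world_to_sensor(points: Sequence[Point],
                    sensor_xyz: Point,
                    yaw: float) -> List[Point]:
    """Express world-frame points in a sensor frame located at sensor_xyz
    and rotated by `yaw` radians about the z axis.
    Uses p_sensor = R(yaw)^T * (p_world - t)."""
    c, s = math.cos(yaw), math.sin(yaw)
    out: List[Point] = []
    for x, y, z in points:
        dx, dy, dz = x - sensor_xyz[0], y - sensor_xyz[1], z - sensor_xyz[2]
        out.append((c * dx + s * dy, -s * dx + c * dy, dz))
    return out

# Two toy frames fused, then viewed from a pose shifted 2 m to the left
# of the original trajectory (an assumed "extrapolated" pose).
fused = fuse_frames([[(10.0, 0.0, 0.0)], [(12.0, 1.0, 0.5)]])
pseudo_view = world_to_sensor(fused, sensor_xyz=(0.0, 2.0, 0.0), yaw=0.0)
```

Here `pseudo_view` holds the fused points as seen from the shifted pose. A supervision target would additionally need the points hidden from that pose removed.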

2603.14756 2026-03-17 cs.CL cs.AI

Towards Privacy-Preserving Machine Translation at the Inference Stage: A New Task and Benchmark

Wei Shao, Lemao Liu, Yinqiao Li, Guoping Huang, Shuming Shi, Linqi Song

Comments 15 pages, 5 figures, Accepted by IEEE Journal of Selected Topics in Signal Processing

2603.14755 2026-03-17 cs.CL

Learning Constituent Headedness

Zeyao Qi, Yige Chen, KyungTae Lim, Haihua Pan, Jungyeul Park

2603.14750 2026-03-17 cs.CV

Face-Guided Sentiment Boundary Enhancement for Weakly-Supervised Temporal Sentiment Localization

Cailing Han, Zhangbin Li, Jinxing Zhou, Wei Qian, Jingjing Hu, Yanghao Zhou, Zhangling Duan, Dan Guo

2603.14745 2026-03-17 cs.LG

CAMD: Coverage-Aware Multimodal Decoding for Efficient Reasoning of Multimodal Large Language Models

Huijie Guo, Jingyao Wang, Lingyu Si, Jiahuan Zhou, Changwen Zheng, Wenwen Qiang

2603.14741 2026-03-17 cs.CV

PHAC: Promptable Human Amodal Completion

Seung Young Noh, Ju Yong Chang

Comments Accepted to CVPR 2026

Abstract

Conditional image generation methods are increasingly used in human-centric applications, yet existing human amodal completion (HAC) models offer users limited control over the completed content. Given an occluded person image, they hallucinate invisible regions while preserving visible ones, but cannot reliably incorporate user-specified constraints such as a desired pose or spatial extent. As a result, users often resort to repeatedly sampling the model until they obtain a satisfactory output. Pose-guided person image synthesis (PGPIS) methods allow explicit pose conditioning, but frequently fail to preserve the instance-specific visible appearance and tend to be biased toward the training distribution, even when built on strong diffusion model priors. To address these limitations, we introduce promptable human amodal completion (PHAC), a new task that completes occluded human images while satisfying both visible appearance constraints and multiple user prompts. Users provide simple point-based prompts, such as additional joints for the target pose or bounding boxes for desired regions; these prompts are encoded using ControlNet modules specialized for each prompt type. These modules inject the prompt signals into a pre-trained diffusion model, and we fine-tune only the cross-attention blocks to obtain strong prompt alignment without degrading the underlying generative prior. To further preserve visible content, we propose an inpainting-based refinement module that starts from a slightly noised coarse completion, faithfully preserves the visible regions, and ensures seamless blending at occlusion boundaries. Extensive experiments on the HAC and PGPIS benchmarks show that our approach yields more physically plausible and higher-quality completions, while significantly improving prompt alignment compared with existing amodal completion and pose-guided synthesis methods.

2603.14739 2026-03-17 cs.CV cs.AI

TrajMamba: An Ego-Motion-Guided Mamba Model for Pedestrian Trajectory Prediction from an Egocentric Perspective

Yusheng Peng, Gaofeng Zhang, Liping Zheng

Comments Accepted by ICRA 2026

2603.14738 2026-03-17 cs.CV cs.RO

Efficient Event Camera Volume System

Juan Camilo Soto, Ian Noronha, Saru Bharti, Upinder Kaur

Comments Accepted to ICRA 2026

2603.14734 2026-03-17 cs.AI cs.LG

Gauge-Equivariant Intrinsic Neural Operators for Geometry-Consistent Learning of Elliptic PDE Maps

Pengcheng Cheng

Comments 55 pages, 13 figures

2603.14733 2026-03-17 cs.CV

A Skill-augmented Agentic Framework and Benchmark for Multi-Video Understanding

Yue Zhang, Liqiang Jing, Jia Li, Yapeng Tian, Xinya Du, Yunhui Guo, Vibhav Gogate

2603.14729 2026-03-17 cs.LG cs.DC

DeFRiS: Silo-Cooperative IoT Applications Scheduling via Decentralized Federated Reinforcement Learning

Zhiyu Wang, Mohammad Goudarzi, Mingming Gong, Rajkumar Buyya

Abstract

Next-generation IoT applications increasingly span across autonomous administrative entities, necessitating silo-cooperative scheduling to leverage diverse computational resources while preserving data privacy. However, realizing efficient cooperation faces significant challenges arising from infrastructure heterogeneity, Non-IID workload shifts, and the inherent risks of adversarial environments. Existing approaches, relying predominantly on centralized coordination or independent learning, fail to address the incompatibility of state-action spaces across heterogeneous silos and lack robustness against malicious attacks. This paper proposes DeFRiS, a Decentralized Federated Reinforcement Learning framework for robust and scalable Silo-cooperative IoT application scheduling. DeFRiS integrates three synergistic innovations: (i) an action-space-agnostic policy utilizing candidate resource scoring to enable seamless knowledge transfer across heterogeneous silos; (ii) a silo-optimized local learning mechanism combining Generalized Advantage Estimation (GAE) with clipped policy updates to resolve sparse delayed reward challenges; and (iii) a Dual-Track Non-IID robust decentralized aggregation protocol leveraging gradient fingerprints for similarity-aware knowledge transfer and anomaly detection, and gradient tracking for optimization momentum. Extensive experiments on a distributed testbed with 20 heterogeneous silos and realistic IoT workloads demonstrate that DeFRiS significantly outperforms state-of-the-art baselines, reducing average response time by 6.4% and energy consumption by 7.2%, while lowering tail latency risk (CVaR$_{0.95}$) by 10.4% and achieving near-zero deadline violations. Furthermore, DeFRiS achieves over 3 times better performance retention as the system scales and over 8 times better stability in adversarial environments compared to the best-performing baseline.
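
The abstract mentions combining Generalized Advantage Estimation (GAE) with clipped policy updates. As a rough illustration of those two standard ingredients, here is a minimal sketch using the textbook GAE recursion and the PPO-style clipped surrogate. It is not taken from DeFRiS: the function names, default hyperparameters, and toy inputs are assumptions.

```python
from typing import List, Sequence

def gae_advantages(rewards: Sequence[float],
                   values: Sequence[float],
                   gamma: float = 0.99,
                   lam: float = 0.95) -> List[float]:
    """Generalized Advantage Estimation over one trajectory.
    `values` must have len(rewards) + 1 entries; the last one is the
    bootstrap value of the state after the final step."""
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages

def clipped_surrogate(ratio: float, advantage: float, eps: float = 0.2) -> float:
    """PPO-style clipped objective for one sample, where `ratio` is
    pi_new(a|s) / pi_old(a|s)."""
    clipped_ratio = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped_ratio * advantage)

# A sparse, delayed reward: only the last step pays off.
adv = gae_advantages([0.0, 0.0, 1.0], [0.0, 0.0, 0.0, 0.0], gamma=1.0, lam=1.0)
```

With `gamma = lam = 1` and zero value estimates, the delayed reward is credited to every earlier step, which is the behaviour that makes GAE attractive when rewards arrive late. The clip then bounds how far a single update can move the policy ratio.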

2603.14727 2026-03-17 cs.CV

Automated Diabetic Screening via Anterior Segment Ocular Imaging: A Deep Learning and Explainable AI Approach

Hasaan Maqsood, Saif Ur Rehman Khan, Sebastian Vollmer, Andreas Dengel, Muhammad Nabeel Asim

2603.14726 2026-03-17 cs.CV

Enhancing Hands in 3D Whole-Body Pose Estimation with Conditional Hands Modulator

Gyeongsik Moon

Comments Accepted to CVPR 2026

2603.14724 2026-03-17 cs.AI

GameUIAgent: An LLM-Powered Framework for Automated Game UI Design with Structured Intermediate Representation

Wei Zeng, Fengwei An, Zhen Liu, Jian Zhao

Comments 8 pages, 6 figures

2603.14723 2026-03-17 cs.CL

Beyond Creed: A Non-Identity Safety Condition A Strong Empirical Alternative to Identity Framing in Low-Data LoRA Fine-Tuning

Xinran Zhang

2603.14719 2026-03-17 cs.LG

Multimodal Deep Learning for Early Prediction of Patient Deterioration in the ICU: Integrating Time-Series EHR Data with Clinical Notes

Binesh Sadanandan

2603.14717 2026-03-17 cs.LG

Training-Free Generation of Protein Sequences from Small Family Alignments via Stochastic Attention

Jeffrey D. Varner

2603.14712 2026-03-17 cs.CL cs.LG

Towards Next-Generation LLM Training: From the Data-Centric Perspective

Hao Liang, Zhengyang Zhao, Zhaoyang Han, Meiyi Qiang, Xiaochen Ma, Bohan Zeng, Qifeng Cai, Zhiyu Li, Linpeng Tang, Weinan E, Wentao Zhang

2603.14707 2026-03-17 cs.CV cs.CL

Visual Confused Deputy: Exploiting and Defending Perception Failures in Computer-Using Agents

Xunzhuo Liu, Bowei He, Xue Liu, Andy Luo, Haichen Zhang, Huamin Chen

2603.14706 2026-03-17 cs.CV cs.AI cs.LG

AdapterTune: Zero-Initialized Low-Rank Adapters for Frozen Vision Transformers

Salim Khazem

2603.14704 2026-03-17 cs.LG cs.CV stat.ML

Chain-of-Trajectories: Unlocking the Intrinsic Generative Optimality of Diffusion Models via Graph-Theoretic Planning

Ping Chen, Xiang Liu, Xingpeng Zhang, Fei Shen, Xun Gong, Zhaoxiang Liu, Zezhou Chen, Huan Hu, Kai Wang, Shiguo Lian

Comments 12 figures, 5 tables

2603.14702 2026-03-17 cs.CV

Fractal Autoregressive Depth Estimation with Continuous Token Diffusion

Jinchang Zhang, Xinrou Kang, Guoyu Lu

2603.14701 2026-03-17 cs.CV

AURORA-KITTI: Any-Weather Depth Completion and Denoising in the Wild

Yiting Wang, Tim Brödermann, Hamed Haghighi, Haonan Zhao, Christos Sakaridis, Kurt Debattista, Valentina Donzella

2603.14686 2026-03-17 cs.CV cs.AI

MVHOI: Bridge Multi-view Condition to Complex Human-Object Interaction Video Reenactment via 3D Foundation Model

Jinguang Tong, Jinbo Wu, Kaisiyuan Wang, Zhelun Shen, Xuan Huang, Mochu Xiang, Xuesong Li, Yingying Li, Haocheng Feng, Chen Zhao, Hang Zhou, Wei He, Chuong Nguyen, Jingdong Wang, Hongdong Li

2603.14684 2026-03-17 cs.CV

E2EGS: Event-to-Edge Gaussian Splatting for Pose-Free 3D Reconstruction

Yunsoo Kim, Changki Sung, Dasol Hong, Hyun Myung

Comments 10 pages, 6 figures, accepted to CVPR 2026

2603.14674 2026-03-17 cs.CL

Computational Analysis of Semantic Connections Between Herman Melville Reading and Writing

Nudrat Habib, Elisa Barney Smith, Steven Olsen Smith

2603.14669 2026-03-17 cs.AI

RenderMem: Rendering as Spatial Memory Retrieval

JooHyun Park, HyeongYeop Kang

2603.14667 2026-03-17 cs.CV cs.AI

Comparative Analysis of 3D Convolutional and 2.5D Slice-Conditioned U-Net Architectures for MRI Super-Resolution via Elucidated Diffusion Models

Hendrik Chiche, Ludovic Corcos, Logan Rouge

2603.14666 2026-03-17 cs.CV

EviATTA: Evidential Active Test-Time Adaptation for Medical Segment Anything Models

Jiayi Chen, Yasmeen George, Winston Chong, Jianfei Cai

Comments 10 pages, 8 figures, 5 tables

2603.14664 2026-03-17 cs.AI cs.CL cs.CY

Punctuated Equilibria in Artificial Intelligence: The Institutional Scaling Law and the Speciation of Sovereign AI

Mark Baciak, Thomas A. Cellucci, Deanna M. Falkowski