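arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.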

2603.05507 2026-03-06 cs.CV cs.GR

Transformer-Based Inpainting for Real-Time 3D Streaming in Sparse Multi-Camera Setups

Leif Van Holland, Domenic Zingsheim, Mana Takhsha, Hannah Dröge, Patrick Stotko, Markus Plack, Reinhard Klein

Comments Project page: https://github.com/vc-bonn/transformer-based-inpainting

English abstract

High-quality 3D streaming from multiple cameras is crucial for immersive experiences in many AR/VR applications. The limited number of views - often due to real-time constraints - leads to missing information and incomplete surfaces in the rendered images. Existing approaches typically rely on simple heuristics for the hole filling, which can result in inconsistencies or visual artifacts. We propose to complete the missing textures using a novel, application-targeted inpainting method independent of the underlying representation as an image-based post-processing step after the novel view rendering. The method is designed as a standalone module compatible with any calibrated multi-camera system. For this we introduce a multi-view aware, transformer-based network architecture using spatio-temporal embeddings to ensure consistency across frames while preserving fine details. Additionally, our resolution-independent design allows adaptation to different camera setups, while an adaptive patch selection strategy balances inference speed and quality, allowing real-time performance. We evaluate our approach against state-of-the-art inpainting techniques under the same real-time constraints and demonstrate that our model achieves the best trade-off between quality and speed, outperforming competitors in both image and video-based metrics.

2603.05506 2026-03-06 cs.CV

FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

Weijie Lyu, Ming-Hsuan Yang, Zhixin Shu

Comments Accepted by CVPR 2026. Project page: https://weijielyu.github.io/FaceCam

2603.05503 2026-03-06 cs.CV

Accelerating Text-to-Video Generation with Calibrated Sparse Attention

Shai Yehezkel, Shahar Yadin, Noam Elata, Yaron Ostrovsky-Berman, Bahjat Kawar

2603.05498 2026-03-06 cs.AI cs.CL

The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

Shangwen Sun, Alfredo Canziani, Yann LeCun, Jiachen Zhu

2603.05487 2026-03-06 cs.RO

Observing and Controlling Features in Vision-Language-Action Models

Hugo Buurmeijer, Carmen Amo Alonso, Aiden Swann, Marco Pavone

2603.05485 2026-03-06 cs.AI

Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

Benjamin Feuer, Lucas Rosenblatt, Oussama Elachqar

2603.05484 2026-03-06 cs.CV

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Guo Chen, Lidong Lu, Yicheng Liu, Liangrui Dong, Lidong Zou, Jixin Lv, Zhenquan Li, Xinyi Mao, Baoqi Pei, Shihao Wang, Zhiqi Li, Karan Sapra, Fuxiao Liu, Yin-Dong Zheng, Yifei Huang, Limin Wang, Zhiding Yu, Andrew Tao, Guilin Liu, Tong Lu

2603.05483 2026-03-06 cs.LG cs.AI stat.ML

SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis

Shahriar Noroozizadeh, Xiaobin Shen, Jeremy C. Weiss, George H. Chen

Comments The Fourteenth International Conference on Learning Representations (ICLR 2026)

2603.05473 2026-03-06 cs.CV

Towards 3D Scene Understanding of Gas Plumes in LWIR Hyperspectral Images Using Neural Radiance Fields

Scout Jarman, Zigfried Hampel-Arias, Adra Carr, Kevin R. Moon

Comments This manuscript was submitted to SPIE JARS and is under review. Code and Data can be found at https://github.com/lanl/HSI-Nerfstudio and https://zenodo.org/records/18626884 respectively. Video 1 and Video 2 can be found at https://github.com/lanl/HSI-Nerfstudio/blob/main/renders/paper/grid_Falsecolor.mp4 and https://github.com/lanl/HSI-Nerfstudio/blob/main/renders/paper/grid_ACE.mp4 respectively

2603.05471 2026-03-06 cs.CL cs.AI

Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Artem Vazhentsev, Maria Marina, Daniil Moskovskiy, Sergey Pletenev, Mikhail Seleznyov, Mikhail Salnikov, Elena Tutubalina, Vasily Konovalov, Irina Nikishina, Alexander Panchenko, Viktor Moskvoretskii

Comments Preprint

2603.05468 2026-03-06 cs.LG

Kraus Constrained Sequence Learning For Quantum Trajectories from Continuous Measurement

Priyanshi Singh, Krishna Bhatia

Comments Poster at AI&PDE: ICLR 2026 Workshop on AI and Partial Differential Equations. 17 pages, 3 figures

2603.05465 2026-03-06 cs.CV

HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token

Sai Akhil Kogilathota, Sripadha Vallabha E G, Luzhe Sun, Jiawei Zhou

2603.05462 2026-03-06 cs.CL

NCTB-QA: A Large-Scale Bangla Educational Question Answering Dataset and Benchmarking Performance

Abrar Eyasir, Tahsin Ahmed, Muhammad Ibrahim

Comments 18 pages, 7 figures, 6 tables. Dataset contains 87,805 Bangla QA pairs from NCTB textbooks

2603.05454 2026-03-06 cs.CV

Beyond Scattered Acceptance: Fast and Coherent Inference for DLMs via Longest Stable Prefixes

Pengxiang Li, Joey Tsai, Hongwei Xue, Kunyu Shi, Shilin Yan

Comments Accepted at ICLR 2026

2603.05451 2026-03-06 cs.CL

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

Ted Zadouri, Markus Hoehnerbach, Jay Shah, Timmy Liu, Vijay Thakkar, Tri Dao

2603.05449 2026-03-06 cs.CV cs.AI cs.GR

RealWonder: Real-Time Physical Action-Conditioned Video Generation

Wei Liu, Ziyu Chen, Zizhang Li, Yue Wang, Hong-Xing Yu, Jiajun Wu

Comments The first two authors contributed equally. The last two authors advised equally. Project website: https://liuwei283.github.io/RealWonder/

2603.05448 2026-03-06 cs.RO cs.AI

Residual RL–MPC for Robust Microrobotic Cell Pushing Under Time-Varying Flow

Yanda Yang, Sambeeta Das

Comments 8 pages, 8 figures

2603.05446 2026-03-06 cs.CV

NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries

Kanon Amemiya, Daichi Yashima, Kei Katsumata, Takumi Komatsu, Ryosuke Korekata, Seitaro Otsuki, Komei Sugiura

Comments Accepted to CVPR 2026 Findings

2603.05440 2026-03-06 cs.LG

Latent Wasserstein Adversarial Imitation Learning

Siqi Yang, Kai Yan, Alexander G. Schwing, Yu-Xiong Wang

Comments 10 pages, accepted to ICLR 2026

2603.05438 2026-03-06 cs.CV cs.AI cs.RO

Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Dongwon Kim, Gawon Seo, Jinsung Lee, Minsu Cho, Suha Kwak

Comments CVPR 2026

2603.05432 2026-03-06 cs.CL cs.AI cs.LG

Ensembling Language Models with Sequential Monte Carlo

Robin Shing Moon Chan, Tianyu Liu, Samuel Kiegeland, Clemente Pasti, Jacob Hoover Vigly, Timothy J. O'Donnell, Ryan Cotterell, Tim Vieira

2603.05423 2026-03-06 cs.LG

An interpretable prototype parts-based neural network for medical tabular data

Jacek Karolczak, Jerzy Stefanowski

Comments Proc. of EXPLIMED at ECAI 2025

2603.05410 2026-03-06 cs.RO

PhysiFlow: Physics-Aware Humanoid Whole-Body VLA via Multi-Brain Latent Flow Matching and Robust Tracking

Weikai Qin, Sichen Wu, Ci Chen, Mengfan Liu, Linxi Feng, Xinru Cui, Haoqi Han, Hesheng Wang

2603.05407 2026-03-06 cs.CV

Video-based Locomotion Analysis for Fish Health Monitoring

Timon Palm, Clemens Seibold, Anna Hilsmann, Peter Eisert

Comments Accepted at VISAPP 2026

2603.05400 2026-03-06 cs.CL

An Exploration-Analysis-Disambiguation Reasoning Framework for Word Sense Disambiguation with Low-Parameter LLMs

Deshan Sumanathilaka, Nicholas Micallef, Julian Hough

Comments Accepted at LREC 2026, 15 pages, 11 Tables

2603.05399 2026-03-06 cs.AI

Judge Reliability Harness: Stress Testing the Reliability of LLM Judges

Sunishchal Dev, Andrew Sloan, Joshua Kavner, Nicholas Kong, Morgan Sandler

Comments Accepted at Agents in the Wild: Safety, Security, and Beyond Workshop at ICLR 2026 - April 26, 2026, Rio de Janeiro, Brazil

2603.05397 2026-03-06 cs.RO cs.CV

Loop Closure via Maximal Cliques in 3D LiDAR-Based SLAM

Javier Laserna, Saurabh Gupta, Oscar Martinez Mozos, Cyrill Stachniss, Pablo San Segundo

Comments Accepted in the 2025 European Conference on Mobile Robots (ECMR). This is the author's version of the work

2603.05395 2026-03-06 cs.LG

On the Necessity of Learnable Sheaf Laplacians

Ferran Hernandez Caralt, Mar Gonzàlez i Català, Adrián Bazaga, Pietro Liò

2603.05392 2026-03-06 cs.AI

Legal interpretation and AI: from expert systems to argumentation and LLMs

Václav Janeček, Giovanni Sartor

2603.05386 2026-03-06 cs.CV

Fusion-CAM: Integrating Gradient and Region-Based Class Activation Maps for Robust Visual Explanations

Hajar Dekdegue, Moncef Garouani, Josiane Mothe, Jordan Bernigaud