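arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.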

2603.05507 2026-03-06 cs.CV cs.GR

Transformer-Based Inpainting for Real-Time 3D Streaming in Sparse Multi-Camera Setups

Leif Van Holland, Domenic Zingsheim, Mana Takhsha, Hannah Dröge, Patrick Stotko, Markus Plack, Reinhard Klein

Comments You can find the project page at https://github.com/vc-bonn/transformer-based-inpainting

Abstract

High-quality 3D streaming from multiple cameras is crucial for immersive experiences in many AR/VR applications. The limited number of views, often due to real-time constraints, leads to missing information and incomplete surfaces in the rendered images. Existing approaches typically rely on simple heuristics for hole filling, which can result in inconsistencies or visual artifacts. We propose to complete the missing textures using a novel, application-targeted inpainting method that is independent of the underlying representation and runs as an image-based post-processing step after the novel view rendering. The method is designed as a standalone module compatible with any calibrated multi-camera system. To this end, we introduce a multi-view aware, transformer-based network architecture using spatio-temporal embeddings to ensure consistency across frames while preserving fine details. Additionally, our resolution-independent design allows adaptation to different camera setups, while an adaptive patch selection strategy balances inference speed and quality, allowing real-time performance. We evaluate our approach against state-of-the-art inpainting techniques under the same real-time constraints and demonstrate that our model achieves the best trade-off between quality and speed, outperforming competitors in both image and video-based metrics.


2603.05506 2026-03-06 cs.CV

FaceCam: Portrait Video Camera Control via Scale-Aware Conditioning

Weijie Lyu, Ming-Hsuan Yang, Zhixin Shu

Comments Accepted by CVPR 2026. Project page: https://weijielyu.github.io/FaceCam

2603.05503 2026-03-06 cs.CV

Accelerating Text-to-Video Generation with Calibrated Sparse Attention

Shai Yehezkel, Shahar Yadin, Noam Elata, Yaron Ostrovsky-Berman, Bahjat Kawar

2603.05498 2026-03-06 cs.AI cs.CL

The Spike, the Sparse and the Sink: Anatomy of Massive Activations and Attention Sinks

Shangwen Sun, Alfredo Canziani, Yann LeCun, Jiachen Zhu

2603.05489 2026-03-06 cs.AR cs.CY cs.LO cs.SY eess.SY

NL2GDS: LLM-aided interface for Open Source Chip Design

Max Eland, Jeyan Thiyagalingam, Dinesh Pamunuwa, Roshan Weerasekera

Comments 10 pages, 6 figures

2603.05487 2026-03-06 cs.RO

Observing and Controlling Features in Vision-Language-Action Models

Hugo Buurmeijer, Carmen Amo Alonso, Aiden Swann, Marco Pavone

2603.05486 2026-03-06 quant-ph cs.IT math.IT

Improved Decoding of Quantum Tanner Codes Using Generalized Check Nodes

Olai Å. Mostad, Eirik Rosnes, Hsuan-Yin Lin

Comments Submission for possible publication

2603.05485 2026-03-06 cs.AI

Towards Provably Unbiased LLM Judges via Bias-Bounded Evaluation

Benjamin Feuer, Lucas Rosenblatt, Oussama Elachqar

2603.05484 2026-03-06 cs.CV

Towards Multimodal Lifelong Understanding: A Dataset and Agentic Baseline

Guo Chen, Lidong Lu, Yicheng Liu, Liangrui Dong, Lidong Zou, Jixin Lv, Zhenquan Li, Xinyi Mao, Baoqi Pei, Shihao Wang, Zhiqi Li, Karan Sapra, Fuxiao Liu, Yin-Dong Zheng, Yifei Huang, Limin Wang, Zhiding Yu, Andrew Tao, Guilin Liu, Tong Lu

2603.05483 2026-03-06 cs.LG cs.AI stat.ML

SurvHTE-Bench: A Benchmark for Heterogeneous Treatment Effect Estimation in Survival Analysis

Shahriar Noroozizadeh, Xiaobin Shen, Jeremy C. Weiss, George H. Chen

Comments The Fourteenth International Conference on Learning Representations (ICLR 2026)

2603.05480 2026-03-06 stat.ML cs.LG math.ST stat.TH

Thermodynamic Response Functions in Singular Bayesian Models

Sean Plummer

Abstract

Singular statistical models, including mixtures, matrix factorization, and neural networks, violate regular asymptotics due to parameter non-identifiability and degenerate Fisher geometry. Although singular learning theory characterizes marginal likelihood behavior through invariants such as the real log canonical threshold and singular fluctuation, these quantities remain difficult to interpret operationally. At the same time, widely used criteria such as WAIC and WBIC appear disconnected from underlying singular geometry. We show that posterior tempering induces a one-parameter deformation of the posterior distribution whose associated observables generate a hierarchy of thermodynamic response functions. A universal covariance identity links derivatives of tempered expectations to posterior fluctuations, placing WAIC, WBIC, and singular fluctuation within a unified response framework. Within this framework, classical quantities from singular learning theory acquire natural thermodynamic interpretations: RLCT governs the leading free-energy slope, singular fluctuation corresponds to curvature of the tempered free energy, and WAIC measures predictive fluctuation. We formalize an observable algebra that quotients out non-identifiable directions, allowing structurally meaningful order parameters to be constructed in singular models. Across canonical singular examples, including symmetric Gaussian mixtures, reduced-rank regression, and overparameterized neural networks, we empirically demonstrate phase-transition-like behavior under tempering. Order parameters collapse, susceptibilities peak, and complexity measures align with structural reorganization in posterior geometry. Our results suggest that thermodynamic response theory provides a natural organizing framework for interpreting complexity, predictive variability, and structural reorganization in singular Bayesian learning.
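
The covariance identity mentioned in the abstract can be illustrated with the textbook tempered-posterior calculation. The notation below (empirical loss $L_n$, prior $\varphi$, partition function $Z_n(\beta)$, free energy $F_n(\beta)$, and a $\beta$-independent observable $f$) is an assumption chosen for illustration and may differ from the paper's own definitions.

```latex
\begin{align*}
p_\beta(w) &= \frac{e^{-\beta n L_n(w)}\,\varphi(w)}{Z_n(\beta)},
\qquad
Z_n(\beta) = \int e^{-\beta n L_n(w)}\,\varphi(w)\,dw,
\qquad
F_n(\beta) = -\log Z_n(\beta), \\
\frac{d}{d\beta}\,\mathbb{E}_\beta[f]
&= -\operatorname{Cov}_\beta\bigl(f,\; n L_n\bigr), \\
\frac{dF_n}{d\beta} &= \mathbb{E}_\beta[\,n L_n\,],
\qquad
\frac{d^2 F_n}{d\beta^2} = -\operatorname{Var}_\beta\bigl(n L_n\bigr).
\end{align*}
```

In this notation, the first derivative of the free energy evaluated at $\beta = 1/\log n$ is the quantity Watanabe defines as WBIC, and the second derivative is a fluctuation (variance) term, which is the sense in which such criteria can be read as response functions.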


2603.05473 2026-03-06 cs.CV

Towards 3D Scene Understanding of Gas Plumes in LWIR Hyperspectral Images Using Neural Radiance Fields

Scout Jarman, Zigfried Hampel-Arias, Adra Carr, Kevin R. Moon

Comments This manuscript was submitted to SPIE JARS and is under review. Code and Data can be found at https://github.com/lanl/HSI-Nerfstudio and https://zenodo.org/records/18626884 respectively. Video 1 and Video 2 can be found at https://github.com/lanl/HSI-Nerfstudio/blob/main/renders/paper/grid_Falsecolor.mp4 and https://github.com/lanl/HSI-Nerfstudio/blob/main/renders/paper/grid_ACE.mp4 respectively

2603.05471 2026-03-06 cs.CL cs.AI

Leveraging LLM Parametric Knowledge for Fact Checking without Retrieval

Artem Vazhentsev, Maria Marina, Daniil Moskovskiy, Sergey Pletenev, Mikhail Seleznyov, Mikhail Salnikov, Elena Tutubalina, Vasily Konovalov, Irina Nikishina, Alexander Panchenko, Viktor Moskvoretskii

Comments Preprint

2603.05469 2026-03-06 math.NA cs.NA physics.comp-ph

A Space-Time Galerkin Boundary Element Method for Aeroacoustic Scattering

Maks Groom, Beckett Zhou

2603.05468 2026-03-06 cs.LG

Kraus Constrained Sequence Learning For Quantum Trajectories from Continuous Measurement

Priyanshi Singh, Krishna Bhatia

Comments Poster at AI&PDE: ICLR 2026 Workshop on AI and Partial Differential Equations. 17 pages, 3 figures

2603.05465 2026-03-06 cs.CV

HALP: Detecting Hallucinations in Vision-Language Models without Generating a Single Token

Sai Akhil Kogilathota, Sripadha Vallabha E G, Luzhe Sun, Jiawei Zhou

2603.05462 2026-03-06 cs.CL

NCTB-QA: A Large-Scale Bangla Educational Question Answering Dataset and Benchmarking Performance

Abrar Eyasir, Tahsin Ahmed, Muhammad Ibrahim

Comments 18 pages, 7 figures, 6 tables. Dataset contains 87,805 Bangla QA pairs from NCTB textbooks

2603.05461 2026-03-06 cs.GT math.GN

Equilibrium for max-plus payoff

Taras Radul

2603.05454 2026-03-06 cs.CV

Beyond Scattered Acceptance: Fast and Coherent Inference for DLMs via Longest Stable Prefixes

Pengxiang Li, Joey Tsai, Hongwei Xue, Kunyu Shi, Shilin Yan

Comments Accepted at ICLR 2026

2603.05451 2026-03-06 cs.CL

FlashAttention-4: Algorithm and Kernel Pipelining Co-Design for Asymmetric Hardware Scaling

Ted Zadouri, Markus Hoehnerbach, Jay Shah, Timmy Liu, Vijay Thakkar, Tri Dao

2603.05449 2026-03-06 cs.CV cs.AI cs.GR

RealWonder: Real-Time Physical Action-Conditioned Video Generation

Wei Liu, Ziyu Chen, Zizhang Li, Yue Wang, Hong-Xing Yu, Jiajun Wu

Comments The first two authors contributed equally. The last two authors advised equally. Project website: https://liuwei283.github.io/RealWonder/

2603.05448 2026-03-06 cs.RO cs.AI

Residual RL-MPC for Robust Microrobotic Cell Pushing Under Time-Varying Flow

Yanda Yang, Sambeeta Das

Comments 8 pages, 8 figures

2603.05446 2026-03-06 cs.CV

NaiLIA: Multimodal Nail Design Retrieval Based on Dense Intent Descriptions and Palette Queries

Kanon Amemiya, Daichi Yashima, Kei Katsumata, Takumi Komatsu, Ryosuke Korekata, Seitaro Otsuki, Komei Sugiura

Comments Accepted to CVPR 2026 Findings

2603.05440 2026-03-06 cs.LG

Latent Wasserstein Adversarial Imitation Learning

Siqi Yang, Kai Yan, Alexander G. Schwing, Yu-Xiong Wang

Comments 10 pages, accepted to ICLR 2026

2603.05439 2026-03-06 cs.DB

O^3-LSM: Maximizing Disaggregated LSM Write Performance via Three-Layer Offloading

Qi Lin, Gangqi Huang, Te Guo, Chang Guo, Viraj Thakkar, Zichen Zhu, Jianguo Wang, Zhichao Cao

Comments Accepted to SIGMOD 2026 as a full research paper

2603.05438 2026-03-06 cs.CV cs.AI cs.RO

Planning in 8 Tokens: A Compact Discrete Tokenizer for Latent World Model

Dongwon Kim, Gawon Seo, Jinsung Lee, Minsu Cho, Suha Kwak

Comments CVPR 2026

2603.05432 2026-03-06 cs.CL cs.AI cs.LG

Ensembling Language Models with Sequential Monte Carlo

Robin Shing Moon Chan, Tianyu Liu, Samuel Kiegeland, Clemente Pasti, Jacob Hoover Vigly, Timothy J. O'Donnell, Ryan Cotterell, Tim Vieira

2603.05427 2026-03-06 cs.IT math.IT

Spatially-aware Secondary License Sharing in mmWave Networks

Shuchi Tripathi, Abhishek K. Gupta

Comments 32 pages, 12 figures

2603.05423 2026-03-06 cs.LG

An interpretable prototype parts-based neural network for medical tabular data

Jacek Karolczak, Jerzy Stefanowski

Comments Proc. of EXPLIMED at ECAI 2025

2603.05419 2026-03-06 math.NA cs.NA

Structured distance to singularity as a nonlinear system of equations

Miryam Gnazzo, Nicola Guglielmi, Federico Poloni, Stefano Sicilia

Comments 21 pages, 2 tables

Abstract

In this article we study the structured distance to singularity for a nonsingular matrix $A\in\mathbb{C}^{n\times n}$, with a prescribed linear structure $\mathcal{S}$ (for instance, a sparsity pattern, or a real Toeplitz structure), i.e., the norm of the smallest perturbation $\Delta\in \mathcal{S}$ such that $A + \Delta$ is singular. This is an example of a structured matrix nearness problem: a family of problems that arise in control and systems theory and in numerical analysis, when characterizing the robustness of a certain property of a system with respect to perturbations that are constrained to a certain structure (for example the structure of the nominal system). We start by highlighting the parallelism between two main tools which have been proposed in the literature: a gradient system approach for a functional in the eigenvalues, which requires the solution of certain low-rank matrix differential equations (see [Guglielmi, Lubich, Sicilia, SINUM 2023]), and a two-level optimization approach in which the inner linear least-squares problem is solved explicitly (see [Usevich, Markovsky, JCAM 2014] and [Gnazzo, Noferini, Nyman, Poloni, FoCM 2025]). In particular, these articles underline the remarkable property that $\Delta$ is (at least generically) the orthogonal projection onto the structure $\mathcal{S}$ of a rank-1 matrix $uv^*$. This property and the parallelism suggest a new reformulation of the problem into a system of nonlinear equations in the two vector unknowns $u,v \in\mathbb{C}^n$. We study this new formulation, and propose an algorithm to solve these nonlinear equations directly with the multivariate Newton's method. We discuss how to avoid the singularity of such a system of nonlinear equations, and how to ensure monotonic convergence. The resulting algorithm is faster than the existing ones for large matrices, and maintains comparable accuracy.
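
The two building blocks named in the abstract, a rank-1 matrix and the orthogonal projection onto a structure, can be sketched with NumPy. This is not the paper's Newton algorithm: it only shows the classical unstructured case, where the nearest singular matrix is obtained from the smallest singular triplet, and the Frobenius-orthogonal projection onto a sparsity pattern, which is elementwise masking. The function names, the random test matrix, and the tridiagonal pattern are all hypothetical choices made for illustration.

```python
import numpy as np

def nearest_singular_unstructured(A):
    """Smallest (2-norm and Frobenius-norm) perturbation making A singular.

    Uses the SVD A = U diag(s) Vh: subtracting the last rank-1 term
    s[-1] * u_n v_n^* sets the smallest singular value to zero.
    """
    U, s, Vh = np.linalg.svd(A)
    Delta = -s[-1] * np.outer(U[:, -1], Vh[-1, :])  # Vh[-1, :] is already v_n^*
    return Delta, s[-1]

def project_onto_pattern(M, mask):
    """Frobenius-orthogonal projection onto matrices supported on `mask`."""
    return np.where(mask, M, 0.0)

# Illustrative data (assumed, not from the paper).
rng = np.random.default_rng(0)
n = 5
A = rng.standard_normal((n, n))

Delta, sigma_min = nearest_singular_unstructured(A)
smallest_after = np.linalg.svd(A + Delta, compute_uv=False)[-1]

# Hypothetical structure S: tridiagonal sparsity pattern.
i, j = np.indices((n, n))
mask = np.abs(i - j) <= 1
Delta_S = project_onto_pattern(Delta, mask)

print("sigma_min(A):", sigma_min)
print("sigma_min(A + Delta):", smallest_after)
print("||P_S(Delta)||_F:", np.linalg.norm(Delta_S))
```

Projecting the unstructured minimizer onto the pattern does not, in general, make `A + Delta_S` singular. According to the abstract, the structured problem instead looks for vectors `u, v` such that the projection of `u v^*` is the optimal perturbation, which is what leads to a nonlinear system in those two unknowns.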
