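arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and related fields.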

2603.30045 2026-04-01 cs.CV

OmniRoam: World Wandering via Long-Horizon Panoramic Video Generation

Yuheng Liu, Xin Lin, Xinke Li, Baihan Yang, Chen Wang, Kalyan Sunkavalli, Yannick Hold-Geoffroy, Hao Tan, Kai Zhang, Xiaohui Xie, Zifan Shi, Yiwei Hu

Comments Code is available at https://github.com/yuhengliu02/OmniRoam

English abstract

Modeling scenes using video generation models has garnered growing research interest in recent years. However, most existing approaches rely on perspective video models that synthesize only limited observations of a scene, leading to issues of completeness and global consistency. We propose OmniRoam, a controllable panoramic video generation framework that exploits the rich per-frame scene coverage and inherent long-term spatial and temporal consistency of panoramic representation, enabling long-horizon scene wandering. Our framework begins with a preview stage, where a trajectory-controlled video generation model creates a quick overview of the scene from a given input image or video. Then, in the refine stage, this video is temporally extended and spatially upsampled to produce long-range, high-resolution videos, thus enabling high-fidelity world wandering. To train our model, we introduce two panoramic video datasets that incorporate both synthetic and real-world captured videos. Experiments show that our framework consistently outperforms state-of-the-art methods in terms of visual quality, controllability, and long-term scene consistency, both qualitatively and quantitatively. We further showcase several extensions of this framework, including real-time video generation and 3D reconstruction. Code is available at https://github.com/yuhengliu02/OmniRoam.

2603.30043 2026-04-01 cs.CV

Video Models Reason Early: Exploiting Plan Commitment for Maze Solving

Kaleb Newman, Tyler Zhu, Olga Russakovsky

2603.30042 2026-04-01 cs.RO cs.HC

HapCompass: A Rotational Haptic Device for Contact-Rich Robotic Teleoperation

Xiangshan Tan, Jingtian Ji, Tianchong Jiang, Pedro Lopes, Matthew R. Walter

Comments Accepted to IEEE International Conference on Robotics and Automation (ICRA), 2026. 8 pages, 5 figures. Project page: https://ripl.github.io/HapCompass/

2603.30040 2026-04-01 cs.SE cs.AI

Automatic Identification of Parallelizable Loops Using Transformer-Based Source Code Representations

Izavan dos S. Correia, Henrique C. T. Santos, Tiago A. E. Ferreira

Comments 28 pages, 12 figures

2603.30038 2026-04-01 cs.CV

Benchmarking PhD-Level Coding in 3D Geometric Computer Vision

Wenyi Li, Renkai Luo, Yue Yu, Huan-ang Gao, Mingju Gao, Li Yuan, Chaoyou Fu, Hao Zhao

Comments Accepted by CVPR 2026; Project page: https://geocodebench.github.io/

2603.30036 2026-04-01 cs.LG cs.AI

Aligned, Orthogonal or In-conflict: When can we safely optimize Chain-of-Thought?

Max Kaufmann, David Lindner, Roland S. Zimmermann, Rohin Shah

2603.30035 2026-04-01 cs.LG cs.CL

Reward-Based Online LLM Routing via NeuralUCB

Ming-Hua Tsai, Phat Tran

2603.30034 2026-04-01 cs.CR

EnsembleSHAP: Faithful and Certifiably Robust Attribution for Random Subspace Method

Yanting Wang, Jinyuan Jia

Comments Published at ICLR 2026

2603.30033 2026-04-01 cs.LG cs.AI

Tucker Attention: A generalization of approximate attention mechanisms

Timon Klein, Jonas Kusch, Sebastian Sager, Stefan Schnake, Steffen Schotthöfer

2603.30032 2026-04-01 cs.CL cs.SD

Covertly improving intelligibility with data-driven adaptations of speech timing

Paige Tuttösí, Angelica Lim, H. Henny Yeung, Yue Wang, Jean-Julien Aucouturier

English abstract

Human talkers often address listeners with language-comprehension challenges, such as hard-of-hearing or non-native adults, by globally slowing down their speech. However, it remains unclear whether this strategy actually makes speech more intelligible. Here, we take advantage of recent advancements in machine-generated speech allowing more precise control of speech rate in order to systematically examine how targeted speech-rate adjustments may improve comprehension. We first use reverse-correlation experiments to show that the temporal influence of speech rate prior to a target vowel contrast (e.g., the tense-lax distinction) in fact manifests in a scissor-like pattern, with opposite effects in early versus late context windows; this pattern is remarkably stable both within individuals and across native L1-English listeners and L2-English listeners with French, Mandarin, and Japanese L1s. Second, we show that this speech rate structure not only facilitates L2 listeners' comprehension of the target vowel contrast, but that native listeners also rely on this pattern in challenging acoustic conditions. Finally, we build a data-driven text-to-speech algorithm that replicates this temporal structure on novel speech sequences. Across a variety of sentences and vowel contrasts, listeners remained unaware that such targeted slowing improved word comprehension. Strikingly, participants instead judged the common strategy of global slowing as clearer, even though it actually increased comprehension errors. Together, these results show that targeted adjustments to speech rate significantly aid intelligibility under challenging conditions, while often going unnoticed. More generally, this paper provides a data-driven methodology to improve the accessibility of machine-generated speech which can be extended to other aspects of speech comprehension and a wide variety of listeners and environments.

2603.30030 2026-04-01 cs.DC cs.SE

A Lightweight Hybrid Publish/Subscribe Event Fabric for IPC and Modular Distributed Systems

Dimitris Gkoulis

2603.30028 2026-04-01 cs.CY

Can Commercial LLMs Be Parliamentary Political Companions? Comparing LLM Reasoning Against Romanian Legislative Expuneri de Motive

Iulian Lucău, Adelin-George Voicu

Comments 12 Figures

2603.30025 2026-04-01 cs.CL

ContextClaim: A Context-Driven Paradigm for Verifiable Claim Detection

Yufeng Li, Rrubaa Panchendrarajan, Arkaitz Zubiaga

2603.30023 2026-04-01 quant-ph cs.IT eess.SP math.IT

LO-Free Phase and Amplitude Recovery of an RF Signal with a DC-Stark-Enabled Rydberg Receiver

Vladislav Katkov, Nikola Zlatanov

2603.30022 2026-04-01 cs.RO cs.AI

Hybrid Framework for Robotic Manipulation: Integrating Reinforcement Learning and Large Language Models

Md Saad, Sajjad Hussain, Mohd Suhaib

2603.30020 2026-04-01 cs.DS

Approximation algorithms for satisfiable and nearly satisfiable ordering CSPs

Yury Makarychev

2603.30019 2026-04-01 math.OC cs.NA math.NA

A McKean-Pontryagin maximum principle for entropic-regularized optimal transport

Sebastian Reich

2603.30017 2026-04-01 cs.LG cs.CR stat.ML

Refined Detection for Gumbel Watermarking

Tor Lattimore

2603.30016 2026-04-01 cs.CR cs.AI

Architecting Secure AI Agents: Perspectives on System-Level Defenses Against Indirect Prompt Injection Attacks

Chong Xiang, Drew Zagieboylo, Shaona Ghosh, Sanjay Kariyappa, Kai Greshake, Hanshen Xiao, Chaowei Xiao, G. Edward Suh

2603.30014 2026-04-01 cs.DC cs.AI

Scalable AI-assisted Workflow Management for Detector Design Optimization Using Distributed Computing

Derek Anderson, Amit Bashyal, Markus Diefenthaler, Cristiano Fanelli, Wen Guan, Tanja Horn, Alex Jentsch, Meifeng Lin, Tadashi Maeno, Kei Nagai, Hemalata Nayak, Connor Pecar, Karthik Suresh, Fang-Ying Tsai, Anselm Vossen, Tianle Wang, Torre Wenaus

2603.30004 2026-04-01 q-bio.NC cs.CY

From Patterns to Policy: A Scoping Review Based on Bibliometric Analysis (ScoRBA) of Intelligent and Secure Smart Hospital Ecosystems

Adi Wijaya, Budi Hermawan, Wiga Maulana Baihaqi, Catur Supriyanto

Comments 28 pages, 8 figures, 3 tables

2603.30002 2026-04-01 cs.LG cs.CL

Tracking Equivalent Mechanistic Interpretations Across Neural Networks

Alan Sun, Mariya Toneva

Comments 32 pages, 5 figures, ICLR 2026

2603.29999 2026-04-01 cs.SE cs.AI cs.PL

Phyelds: A Pythonic Framework for Aggregate Computing

Gianluca Aguzzi, Davide Domini, Nicolas Farabegoli, Mirko Viroli

2603.29997 2026-04-01 cs.CL cs.AI

Enhancing Structural Mapping with LLM-derived Abstractions for Analogical Reasoning in Narratives

Mohammadhossein Khojasteh, Yifan Jiang, Stefano De Giorgis, Frank van Harmelen, Filip Ilievski

2603.29993 2026-04-01 cs.AI

Extending MONA in Camera Dropbox: Reproduction, Learned Approval, and Design Implications for Reward-Hacking Mitigation

Nathan Heath

2603.29990 2026-04-01 cs.CV

SurgNavAR: An Augmented Reality Surgical Navigation Framework for Optical See-Through Head Mounted Displays

Abdullah Thabit, Mohamed Benmahdjoub, Rafiuddin Jinabade, Hizirwan S. Salim, Marie-Lise C. van Veelen, Mark G. van Vledder, Eppo B. Wolvius, Theo van Walsum

Comments This work has been submitted to the IEEE for possible publication

2603.29986 2026-04-01 q-bio.QM cs.MS

ParetoEnsembles.jl: A Julia Package for Multiobjective Parameter Estimation Using Pareto Optimal Ensemble Techniques

Jeffrey D. Varner

2603.29982 2026-04-01 cs.GT

Performative Scenario Optimization

Quanyan Zhu, Zhengye Han

2603.29980 2026-04-01 math.MG cs.CG math.CO

Voronoi-Based Vacuum Leakage Detection in Composite Manufacturing

Christoph Brauer, Arne Hindersmann, Timo de Wolff

Comments 25 pages, 8 pages appendix, 17 figures

2603.29979 2026-04-01 cs.CL cs.HC cs.IR

Structural Feature Engineering for Generative Engine Optimization: How Content Structure Shapes Citation Behavior

Junwei Yu, Mufeng Yang, Yepeng Ding, Hiroyuki Sato

Comments 12 pages, 5 figures. This paper proposes GEO-SFE, a structural feature engineering framework for generative engine optimization