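arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.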

2603.19607 2026-03-23 cs.CV

Physion-Eval: Evaluating Physical Realism in Generated Video via Human Reasoning

Qin Zhang, Peiyu Jing, Hong-Xing Yu, Fangqiang Ding, Fan Nie, Weimin Wang, Yilun Du, James Zou, Jiajun Wu, Bing Shuai

English abstract

Video generation models are increasingly used as world simulators for storytelling, simulation, and embodied AI. As these models advance, a key question arises: do generated videos obey the physical laws of the real world? Existing evaluations largely rely on automated metrics or coarse human judgments such as preferences or rubric-based checks. While useful for assessing perceptual quality, these methods provide limited insight into when and why generated dynamics violate real-world physical constraints. We introduce Physion-Eval, a large-scale benchmark of expert human reasoning for diagnosing physical realism failures in videos generated by five state-of-the-art models across egocentric and exocentric views, containing 10,990 expert reasoning traces spanning 22 fine-grained physical categories. Each generated video is derived from a corresponding real-world reference video depicting a clear physical process, and annotated with temporally localized glitches, structured failure categories, and natural-language explanations of the violated physical behavior. Using this dataset, we reveal a striking limitation of current video generation models: in physics-critical scenarios, 83.3% of exocentric and 93.5% of egocentric generated videos exhibit at least one human-identifiable physical glitch. We hope Physion-Eval will set a new standard for physical realism evaluation and guide the development of physics-grounded video generation. The benchmark is publicly available at https://huggingface.co/datasets/PhysionLabs/Physion-Eval.

2603.19606 2026-03-23 cs.CV

Beyond Quadratic: Linear-Time Change Detection with RWKV

Zhenyu Yang, Gensheng Pei, Tao Chen, Xia Yuan, Haofeng Zhang, Xiangbo Shu, Yazhou Yao

2603.19602 2026-03-23 cs.RO

CeRLP: A Cross-embodiment Robot Local Planning Framework for Visual Navigation

Haoyu Xi, Mingao Tan, Xinming Zhang, Siwei Cheng, Shanze Wang, Yin Gu, Xiaoyu Shen, Wei Zhang

2603.19601 2026-03-23 cs.CV cs.LG

K-GMRF: Kinetic Gauss-Markov Random Field for First-Principles Covariance Tracking on Lie Groups

ZhiMing Li

Comments 33 pages, 13 figures

2603.19598 2026-03-23 cs.CV

FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow

Zhifei Yang, Guangyao Zhai, Keyang Lu, YuYang Yin, Chao Zhang, Zhen Xiao, Jieyi Long, Nassir Navab, Yikai Wang

2603.19594 2026-03-23 cs.LG cs.AI

ARMOR: Adaptive Resilience Against Model Poisoning Attacks in Continual Federated Learning for Mobile Indoor Localization

Danish Gufran, Akhil Singampalli, Sudeep Pasricha

2603.19584 2026-03-23 cs.AI cs.SY eess.SY

PowerLens: Taming LLM Agents for Safe and Personalized Mobile Power Management

Xingyu Feng, Chang Sun, Yuzhu Wang, Zhangbing Zhou, Chengwen Luo, Zhuangzhuang Chen, Xiaomin Ouyang, Huanqi Yang

2603.19582 2026-03-23 cs.RO cs.AI

Evolving Embodied Intelligence: Graph Neural Network–Driven Co-Design of Morphology and Control in Soft Robotics

Jianqiang Wang, Shuaiqun Pan, Alvaro Serra-Gomez, Xiaohan Wei, Yue Xie

2603.19579 2026-03-23 cs.AI cs.LG

PA2D-MORL: Pareto Ascent Directional Decomposition based Multi-Objective Reinforcement Learning

Tianmeng Hu, Biao Luo

Comments AAAI 2024

2603.19234 2026-03-23 cs.CV cs.GR

Matryoshka Gaussian Splatting

Zhilin Guo, Boqiao Zhang, Hakan Aktas, Kyle Fogarty, Jeffrey Hu, Nursena Koprucu Aslan, Wenzhao Li, Canberk Baykal, Albert Miao, Josef Bengtson, Chenliang Zhou, Weihao Xia, Cristina Nader Vasconcelos, Cengiz Oztireli

Comments project page: https://zhilinguo.github.io/MGS

2603.19203 2026-03-23 cs.CV

Tinted Frames: Question Framing Blinds Vision-Language Models

Wan-Cyuan Fan, Jiayun Luo, Declan Kutscher, Leonid Sigal, Ritwik Gupta

Comments Preprint. Project page: https://davidhalladay.github.io/tinted_frames_demo/

2603.19121 2026-03-23 cs.CV cs.AI

CustomTex: High-fidelity Indoor Scene Texturing via Multi-Reference Customization

Weilin Chen, Jiahao Rao, Wenhao Wang, Xinyang Li, Xuan Cheng, Liujuan Cao

Comments Accepted to CVPR 2026. This version integrates the main paper and supplementary material

2603.18994 2026-03-23 cs.AI cs.LG

Evaluating Game Difficulty in Tetris Block Puzzle

Chun-Jui Wang, Jian-Ting Guo, Hung Guei, Chung-Chin Shih, Ti-Rong Wu, I-Chen Wu

Comments Accepted by the Game Programming Workshop (GPW 2025)

2603.18987 2026-03-23 cs.AI

Unmasking Algorithmic Bias in Predictive Policing: A GAN-Based Simulation Framework with Multi-City Temporal Analysis

Pronob Kumar Barman, Pronoy Kumar Barman

2603.18533 2026-03-23 cs.LG cs.CL

Balancing the Reasoning Load: Difficulty-Differentiated Policy Optimization with Length Redistribution for Efficient and Robust Reinforcement Learning

Yinan Xia, Haotian Zhang, Huiming Wang

Comments 13 pages

2603.18282 2026-03-23 cs.CV

CycleCap: Improving VLMs Captioning Performance via Self-Supervised Cycle Consistency Fine-Tuning

Marios Krestenitis, Christos Tzelepis, Konstantinos Ioannidis, Stefanos Vrochidis, Ioannis Kompatsiaris, Georgios Tzimiropoulos, Shaogang Gong, Ioannis Patras

2603.18090 2026-03-23 cs.SD cs.AI cs.CL

MOSS-TTS Technical Report

Yitian Gong, Botian Jiang, Yiwei Zhao, Yucheng Yuan, Kuangwei Chen, Yaozhou Jiang, Cheng Chang, Dong Hong, Mingshu Chen, Ruixiao Li, Yiyang Zhang, Yang Gao, Hanfu Chen, Ke Chen, Songlin Wang, Xiaogui Yang, Yuqian Zhang, Kexin Huang, ZhengYuan Lin, Kang Yu, Ziqi Chen, Jin Wang, Zhaoye Fei, Qinyuan Cheng, Shimin Li, Xipeng Qiu

Comments Project page: https://github.com/OpenMOSS/MOSS-TTS

2603.17470 2026-03-23 cs.CV cs.AI

VirPro: Visual-referred Probabilistic Prompt Learning for Weakly-Supervised Monocular 3D Detection

Chupeng Liu, Jiyong Rao, Shangquan Sun, Runkai Zhao, Weidong Cai

Comments Accepted by CVPR 2026 Findings

2603.17246 2026-03-23 cs.LG

On the Cone Effect and Modality Gap in Medical Vision-Language Embeddings

David Restrepo, Miguel L Martins, Chenwei Wu, Luis Filipe Nakayama, Diego M Lopez, Stergios Christodoulidis, Maria Vakalopoulou, Enzo Ferrante

2603.17065 2026-03-23 cs.RO

TeleDex: Accessible Dexterous Teleoperation

Omar Rayyan, Maximilian Gilles, Yuchen Cui

Comments For project website and videos, see https://www.orayyan.com/teledex

2603.16432 2026-03-23 cs.CV cs.LG

IRIS: A Real-World Benchmark for Inverse Recovery and Identification of Physical Dynamic Systems from Monocular Video

Rasul Khanbayov, Mohamed Rayan Barhdadi, Erchin Serpedin, Hasan Kurban

2603.16180 2026-03-23 cs.RO

Task-Specified Compliance Bounds for Humanoids via Lipschitz-Constrained Policies

Zewen He, Yoshihiko Nakamura

Comments Submitted to IEEE for possible publication, under review

2603.15709 2026-03-23 cs.AI cs.CE cs.CY cs.LG

Survey of Various Fuzzy and Uncertain Decision-Making Methods

Takaaki Fujita, Florentin Smarandache

Comments Book. Publisher: Neutrosophic Science International Association (NSIA) Publishing House. ISBN: 978-1-59973-883-3. 446 pages

2603.15597 2026-03-23 cs.SD cs.CV cs.LG cs.MM eess.AS

AC-Foley: Reference-Audio-Guided Video-to-Audio Synthesis with Acoustic Transfer

Pengjun Fang, Yingqing He, Yazhou Xing, Qifeng Chen, Ser-Nam Lim, Harry Yang

Comments Accepted at ICLR 2026. 15 pages, 5 figures, add project webpage

2603.14977 2026-03-23 cs.RO

ReMAP-DP: Reprojected Multi-view Aligned PointMaps for Diffusion Policy

Xinzhang Yang, Renjun Wu, Jinyan Liu, Xuesong Li

Comments fix some typos

2603.14579 2026-03-23 cs.CV cs.LG

Medical Image Spatial Grounding with Semantic Sampling

Andrew Seohwan Yu, Mohsen Hariri, Kunio Nakamura, Mingrui Yang, Xiaojuan Li, Vipin Chaudhary

Comments 10 pages, 2 figures, under review at MICCAI 2026

2603.12788 2026-03-23 cs.CV

Think and Answer ME: Benchmarking and Exploring Multi-Entity Reasoning Grounding in Remote Sensing

Shuchang Lyu, Haiquan Wen, Guangliang Cheng, Meng Li, Zheng Zhou, You Zhou, Dingding Yao, Zhenwei Shi

Comments 22 pages, 9 figures, 5 tables

2603.12180 2026-03-23 cs.CL cs.AI

Strategic Navigation or Stochastic Search? How Agents and Humans Reason Over Document Collections

Łukasz Borchmann, Jordy Van Landeghem, Michał Turski, Shreyansh Padarha, Ryan Othniel Kearns, Adam Mahdi, Niels Rogge, Clémentine Fourrier, Siwei Han, Huaxiu Yao, Artemis Llabrés, Yiming Xu, Dimosthenis Karatzas, Hao Zhang, Anupam Datta

2603.10685 2026-03-23 cs.CV

A$^2$-Edit: Precise Reference-Guided Image Editing of Arbitrary Objects and Ambiguous Masks

Huayu Zheng, Guangzhao Li, Baixuan Zhao, Siqi Luo, Hantao Jiang, Guangtao Zhai, Xiaohong Liu

2603.10598 2026-03-23 cs.CV

Layer Consistency Matters: Elegant Latent Transition Discrepancy for Generalizable Synthetic Image Detection

Yawen Yang, Feng Li, Shuqi Kong, Yunfeng Diao, Xinjian Gao, Zenglin Shi, Meng Wang

Comments Accepted by CVPR 2026 (main track)