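arXivDaily daily academic digest: synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.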

2511.18886 2026-03-19 cs.CV

MagicWorld: Towards Long-Horizon Stability for Interactive Video World Exploration

Guangyuan Li, Bo Li, Jinwei Chen, Xiaobin Hu, Lei Zhao, Peng-Tao Jiang

English abstract

Recent interactive video world model methods generate scene evolution conditioned on user instructions. Although they achieve impressive results, two key limitations remain. First, they exhibit motion drift in complex environments with multiple interacting subjects, where dynamic subjects fail to follow realistic motion patterns during scene evolution. Second, they suffer from error accumulation in long-horizon interactions, where autoregressive generation gradually drifts from earlier scene states and causes structural and semantic inconsistencies. In this paper, we propose MagicWorld, an interactive video world model built upon an autoregressive framework. To address motion drift, we incorporate a flow-guided motion preservation constraint that mitigates motion degradation in dynamic subjects, encouraging realistic motion patterns and stable interactions during scene evolution. To mitigate error accumulation in long-horizon interactions, we design two complementary strategies, including a history cache retrieval strategy and an enhanced interactive training strategy. The former reinforces historical scene states by retrieving past generations during interaction, while the latter adopts multi-shot aggregated distillation with dual-reward weighting for interactive training, enhancing long-term stability and reducing error accumulation. In addition, we construct RealWM120K, a real-world dataset with diverse city-walk videos and multimodal annotations to support dynamic perception and long-horizon world modeling. Experimental results demonstrate that MagicWorld improves motion realism and alleviates error accumulation during long-horizon interactions.

2511.07231 2026-03-19 cs.CV

Semi-supervised Shelter Mapping for WASH Accessibility Assessment in Rohingya Refugee Camps

Kyeongjin Ahn, YongHun Suh, Sungwon Han, Jeasurk Yang, Hannes Taubenböck, Meeyoung Cha

Comments 22 pages, 13 figures, 2 tables

2510.26969 2026-03-19 cs.CL cs.AI

Frame Semantic Patterns for Identifying Underreporting of Notifiable Events in Healthcare: The Case of Gender-Based Violence

Lívia Dutra, Arthur Lorenzi, Laís Berno, Franciany Campos, Karoline Biscardi, Kenneth Brown, Marcelo Viridiano, Frederico Belcavello, Ely Matos, Olívia Guaranha, Erik Santos, Sofia Reinach, Tiago Timponi Torrent

Comments Paper accepted to the LREC 2026 in the Main Conference track

2510.14959 2026-03-19 cs.RO cs.AI cs.LG cs.SY eess.SY

CBF-RL: Safety Filtering Reinforcement Learning in Training with Control Barrier Functions

Lizhi Yang, Blake Werner, Massimiliano de Sa, Aaron D. Ames

Comments To appear at ICRA 2026; sample code for the navigation example with CBF-RL reward core construction can be found at https://github.com/lzyang2000/cbf-rl-navigation-demo

2509.24910 2026-03-19 cs.CV

Learning Goal-Oriented Vision-and-Language Navigation with Self-Improving Demonstrations at Scale

Songze Li, Zun Wang, Gengze Zhou, Jialu Li, Xiangyu Zeng, Ziyang Gong, Limin Wang, Yu Qiao, Qi Wu, Mohit Bansal, Yi Wang

2509.24384 2026-03-19 cs.CL cs.AI

HarmMetric Eval: Benchmarking Metrics and Judges for LLM Harmfulness Assessment

Langqi Yang, Tianhang Zheng, Yixuan Chen, Kedong Xiu, Hao Zhou, Wangze Ni, Lei Chen, Zhan Qin, Kui Ren

2509.22621 2026-03-19 cs.LG cs.AI cs.CL

IA2: Alignment with ICL Activations Improves Supervised Fine-Tuning

Aayush Mishra, Daniel Khashabi, Anqi Liu

Comments International Conference on Learning Representations (ICLR) 2026

2508.21096 2026-03-19 cs.CV

ROBUST-MIPS: A Combined Skeletal Pose and Instance Segmentation Dataset for Laparoscopic Surgical Instruments

Zhe Han, Charlie Budd, Gongyu Zhang, Huanyu Tian, Christos Bergeles, Tom Vercauteren

2508.19945 2026-03-19 cs.LG cs.SY eess.SY

Constraint Learning in Multi-Agent Dynamic Games from Demonstrations of Local Nash Interactions

Zhouyu Zhang, Chih-Yuan Chiu, Glen Chou

2508.13526 2026-03-19 cs.CL

MATA: Mindful Assessment of the Telugu Abilities of Large Language Models

Chalamalasetti Kranti, Sowmya Vajjala

Comments Accepted to LREC 2026

2508.05059 2026-03-19 cs.LG cs.AI cs.CV

Learning from Oblivion: Predicting Knowledge Overflowed Weights via Retrodiction of Forgetting

Jinhyeok Jang, Jaehong Kim, Jung Uk Kim

Comments To appear in CVPR 2026

2508.01310 2026-03-19 cs.LG

GraphVSSM: Graph Variational State-Space Model for Probabilistic Spatiotemporal Inference of Dynamic Exposure and Vulnerability for Regional Disaster Resilience Assessment

Joshua Dimasaka, Christian Geiß, Emily So

Comments Non-peer-reviewed Preprint | Keywords: graph state-space model, building exposure, physical vulnerability, weak supervision, probabilistic model, disaster resilience, risk audit | Code: https://github.com/riskaudit/GraphVSSM | Quezon City (Philippines) Dataset: https://doi.org/pzj2 | METEOR 2.5D Dataset, https://doi.org/pzq4, https://doi.org/pzrd | Khurushkul-Freetown Dataset: https://doi.org/pzkw

DOI: 10.1609/aaai.v40i45.41178
Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, 40(45): 38376-38384, 2026

English abstract

Regional disaster resilience quantifies the changing nature of physical risks to inform policy instruments ranging from local immediate recovery to international sustainable development. While many existing state-of-practice methods have greatly advanced the dynamic mapping of exposure and hazard, our understanding of large-scale physical vulnerability has remained static, costly, limited, region-specific, coarse-grained, overly aggregated, and inadequately calibrated. With the significant growth in the availability of time-series satellite imagery and derived products for exposure and hazard, we focus our work on the equally important yet challenging element of the risk equation: physical vulnerability. We leverage machine learning methods that flexibly capture spatial contextual relationships, limited temporal observations, and uncertainty in a unified probabilistic spatiotemporal inference framework. We therefore introduce Graph Variational State-Space Model (GraphVSSM), a novel modular spatiotemporal approach that uniquely integrates graph deep learning, state-space modeling, and variational inference using time-series data and prior expert belief systems in a weakly supervised or coarse-to-fine-grained manner. We present three major results: a city-wide demonstration in Quezon City, Philippines; an investigation of sudden changes in the cyclone-impacted coastal Khurushkul community (Bangladesh) and mudslide-affected Freetown (Sierra Leone); and an open geospatial dataset, METEOR 2.5D, that spatiotemporally enhances the existing global static dataset for UN Least Developed Countries (2020). Beyond advancing regional disaster resilience assessment and improving our understanding of global disaster risk reduction progress, our method also offers a probabilistic deep learning approach, contributing to broader urban studies that require compositional data analysis in weak supervision.

2506.21982 2026-03-19 cs.RO cs.SY eess.SY

A MILP-Based Solution to Multi-Agent Motion Planning and Collision Avoidance in Constrained Environments

Akshay Jaitly, Jack Cline, Siavash Farzan

Comments Accepted to 2025 IEEE International Conference on Automation Science and Engineering (CASE 2025). This arXiv version adds a supplementary appendix with figures not in the IEEE proceedings

2506.08460 2026-03-19 cs.LG cs.AI cs.RO

MOBODY: Model Based Off-Dynamics Offline Reinforcement Learning

Yihong Guo, Yu Yang, Pan Xu, Anqi Liu

Comments Published at ICLR 2026

2505.22977 2026-03-19 cs.CV

HyperMotionX: The Dataset and Benchmark with DiT-Based Pose-Guided Human Image Animation of Complex Motions

Shuolin Xu, Siming Zheng, Ziyi Wang, HC Yu, Jinwei Chen, Huaqi Zhang, Daquan Zhou, Tong-Yee Lee, Bo Li, Peng-Tao Jiang

Comments 17 pages, 7 figures

2505.20321 2026-03-19 cs.CL cs.AI cs.LG

BiomedSQL: Text-to-SQL for Scientific Reasoning on Biomedical Knowledge Bases

Mathew J. Koretsky, Maya Willey, Owen Bianchi, Chelsea X. Alvarado, Tanay Nayak, Nicole Kuznetsov, Sungwon Kim, Mike A. Nalls, Daniel Khashabi, Faraz Faghri

Comments Accepted at the non-archival Gen2 Workshop at ICLR 2026. Under Review

2505.11611 2026-03-19 cs.AI cs.CL cs.CR

Signal in the Noise: Polysemantic Interference Transfers and Predicts Cross-Model Influence

Bofan Gong, Shiyang Lai, James Evans, Dawn Song

2503.13921 2026-03-19 cs.LG cs.AI

Learning Over Dirty Data with Minimal Repairs

Cheng Zhen, Prayoga, Nischal Aryal, Arash Termehchy, Garrett Biwer, Lubna Alzamil

2503.05305 2026-03-19 cs.CV cs.AI

Frequency Autoregressive Image Generation with Continuous Tokens

Hu Yu, Hao Luo, Hangjie Yuan, Yu Rong, Jie Huang, Feng Zhao

2502.12855 2026-03-19 cs.CL cs.AI cs.LG

Integrating Arithmetic Learning Improves Mathematical Reasoning in Smaller Models

Neeraj Gangwar, Suma P Bhat, Nickvash Kani

Comments Accepted to LREC 2026

2501.09127 2026-03-19 cs.CL

Multilingual LLMs Struggle to Link Orthography and Semantics in Bilingual Word Processing

Eshaan Tanwar, Gayatri Oke, Tanmoy Chakraborty

Comments Code available at: https://github.com/EshaanT/Bilingual_processing_LLMs

2412.16146 2026-03-19 cs.CV

Mamba2D: A Natively Multi-Dimensional State-Space Model for Vision Tasks

Enis Baty, Alejandro Hernández Díaz, Rebecca Davidson, Chris Bridges, Simon Hadfield

2409.17385 2026-03-19 cs.LG cs.AI cs.CV

Den-TP: A Density-Balanced Data Curation and Evaluation Framework for Trajectory Prediction

Ruining Yang, Yi Xu, Yun Fu, Lili Su

Comments Accepted by CVPR 2026

2409.13106 2026-03-19 cs.CV

UL-VIO: Ultra-lightweight Visual-Inertial Odometry with Noise Robust Test-time Adaptation

Jinho Park, Se Young Chun, Mingoo Seok

2405.16924 2026-03-19 cs.LG stat.ML

Demystifying amortized causal discovery with transformers

Francesco Montagna, Max Cairney-Leeming, Dhanya Sridhar, Francesco Locatello

2403.10932 2026-03-19 cs.RO cs.SY eess.SY

Learning-Based Design of Off-Policy Gaussian Controllers: Integrating Model Predictive Control and Gaussian Process Regression

Shiva Kumar Tekumatla, Varun Gampa, Siavash Farzan

Comments Accepted to ACC 2024. 8 pages, 9 figures

2403.10924 2026-03-19 cs.RO

PAAMP: Polytopic Action-Set And Motion Planning for Long Horizon Dynamic Motion Planning via Mixed Integer Linear Programming

Akshay Jaitly, Siavash Farzan

Comments Accepted to 2024 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2024). 8 pages, 10 figures

2402.09262 2026-03-19 cs.CV

MultiMedEval: A Benchmark and a Toolkit for Evaluating Medical Vision-Language Models

Corentin Royer, Bjoern Menze, Anjany Sekuboyina

Comments Accepted at MIDL 2024

2311.04055 2026-03-19 cs.LG

Feature Space Renormalization for Semi-supervised Learning

Jun Sun, Wancheng Zhang, Chao Zhou, Zhongjie Mao, Chao Li, Xiao-Jun Wu

Comments Version 2

2305.13047 2026-03-19 cs.CL

Automated stance detection in complex topics and small languages: the challenging case of immigration in polarizing news media

Mark Mets, Andres Karjus, Indrek Ibrus, Maximilian Schich