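arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and related fields.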

2603.04399 2026-03-05 cs.CV cs.LG

SimpliHuMoN: Simplifying Human Motion Prediction

Aadya Agrawal, Alexander Schwing

Comments 19 pages, 7 figures. Preprint

Abstract

Human motion prediction combines the tasks of trajectory forecasting and human pose prediction. For each of the two tasks, specialized models have been developed. Combining these models for holistic human motion prediction is non-trivial, and recent methods have struggled to compete on established benchmarks for individual tasks. To address this, we propose a simple yet effective transformer-based model for human motion prediction. The model employs a stack of self-attention modules to effectively capture both spatial dependencies within a pose and temporal relationships across a motion sequence. This simple, streamlined, end-to-end model is sufficiently versatile to handle pose-only, trajectory-only, and combined prediction tasks without task-specific modifications. We demonstrate that this approach achieves state-of-the-art results across all tasks through extensive experiments on a wide range of benchmark datasets, including Human3.6M, AMASS, ETH-UCY, and 3DPW.
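
The abstract describes self-attention applied across both the joints of a pose and the frames of a sequence. The following is a minimal sketch of that general idea, not the authors' architecture: it assumes one token per (frame, joint) pair, uses raw coordinates as token features, and omits embeddings, positional encodings, multiple heads, and layer stacking. All sizes, weights, and names (`self_attention`, `motion`, `w_q`, `w_k`, `w_v`) are hypothetical.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v), weights


# Hypothetical toy sizes: 3 frames, 2 joints, 3-D coordinates per joint.
T, J, D = 3, 2, 3
rng = random.Random(0)
motion = [[[rng.uniform(-1, 1) for _ in range(D)] for _ in range(J)] for _ in range(T)]

# Assumption: flatten to one token per (frame, joint), so every token can
# attend to other joints in its frame and to every joint in other frames.
tokens = [joint for frame in motion for joint in frame]

# Random projection matrices stand in for learned parameters.
w_q, w_k, w_v = ([[rng.gauss(0, 1) for _ in range(D)] for _ in range(D)] for _ in range(3))

out, weights = self_attention(tokens, w_q, w_k, w_v)
```

Each row of `weights` has one entry per (frame, joint) token, so a single attention map mixes spatial and temporal context in this simplified setup.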

2603.04395 2026-03-05 cs.LG physics.ao-ph

Accurate and Efficient Hybrid-Ensemble Atmospheric Data Assimilation in Latent Space with Uncertainty Quantification

Hang Fan, Juan Nathaniel, Yi Xiao, Ce Bian, Fenghua Ling, Ben Fei, Lei Bai, Pierre Gentine

Comments 23 pages, 12 figures

2603.04390 2026-03-05 cs.AI cs.SE

A Dual-Helix Governance Approach Towards Reliable Agentic AI for WebGIS Development

Boyuan Guan, Wencong Cui, Levente Juhasz

Comments Paper submitted to Transactions in GIS

2603.04380 2026-03-05 cs.CV cs.CL

TaxonRL: Reinforcement Learning with Intermediate Rewards for Interpretable Fine-Grained Visual Reasoning

Maximilian von Klinski, Maximilian Schall

Comments Accepted at WACV 2026

2603.04379 2026-03-05 cs.CV

Helios: Real Real-Time Long Video Generation Model

Shenghai Yuan, Yuanyang Yin, Zongjian Li, Xinwei Huang, Xiao Yang, Li Yuan

Comments Page: pku-yuangroup.github.io/Helios-Page

2603.04378 2026-03-05 cs.LG cs.AI cs.CR cs.MA

Robustness of Agentic AI Systems via Adversarially-Aligned Jacobian Regularization

Furkan Mumcu, Yasin Yilmaz

2603.04370 2026-03-05 cs.AI cs.CL cs.IR

$τ$-Knowledge: Evaluating Conversational Agents over Unstructured Knowledge

Quan Shi, Alexandra Zytek, Pedram Razavi, Karthik Narasimhan, Victor Barres

Comments 29 pages (10 main + 19 appendix)

2603.04366 2026-03-05 cs.SD cs.AI cs.LG

Low-Resource Guidance for Controllable Latent Audio Diffusion

Zachary Novack, Zack Zukowski, CJ Carr, Julian Parker, Zach Evans, Josiah Taylor, Taylor Berg-Kirkpatrick, Julian McAuley, Jordi Pons

Comments Accepted at ICASSP 2026

2603.04364 2026-03-05 cs.LG cs.AI cs.CL

Dual-Modality Multi-Stage Adversarial Safety Training: Robustifying Multimodal Web Agents Against Cross-Modal Attacks

Haoyu Liu, Dingcheng Li, Lukas Rutishauser, Zeyu Zheng

2603.04363 2026-03-05 cs.RO

ManipulationNet: An Infrastructure for Benchmarking Real-World Robot Manipulation with Physical Skill Challenges and Embodied Multimodal Reasoning

Yiting Chen, Kenneth Kimble, Edward H. Adelson, Tamim Asfour, Podshara Chanrungmaneekul, Sachin Chitta, Yash Chitambar, Ziyang Chen, Ken Goldberg, Danica Kragic, Hui Li, Xiang Li, Yunzhu Li, Aaron Prather, Nancy Pollard, Maximo A. Roa-Garzon, Robert Seney, Shuo Sha, Shihefeng Wang, Yu Xiang, Kaifeng Zhang, Yuke Zhu, Kaiyu Hang

Comments 32 pages, 8 figures

2603.04360 2026-03-05 cs.LG eess.SP

Robust Unscented Kalman Filtering via Recurrent Meta-Adaptation of Sigma-Point Weights

Kenan Majewski, Michał Modzelewski, Marcin Żugaj, Piotr Lichota

Comments 8 pages, 3 figures, Submitted to the 29th International Conference on Information Fusion (FUSION 2026)

2603.04359 2026-03-05 cs.LG cs.AI

Dissecting Quantization Error: A Concentration-Alignment Perspective

Marco Federici, Boris van Breugel, Paul Whatmough, Markus Nagel

2603.04356 2026-03-05 cs.RO cs.AI cs.LG

RoboCasa365: A Large-Scale Simulation Framework for Training and Benchmarking Generalist Robots

Soroush Nasiriany, Sepehr Nasiriany, Abhiram Maddukuri, Yuke Zhu

Comments ICLR 2026; First three authors contributed equally

2603.04355 2026-03-05 cs.LG cs.AI

Efficient Refusal Ablation in LLM through Optimal Transport

Geraldin Nanfack, Eugene Belilovsky, Elvis Dohmatob

2603.04354 2026-03-05 cs.LG

Out-of-distribution transfer of PDE foundation models to material dynamics under extreme loading

Mahindra Rautela, Alexander Most, Siddharth Mansingh, Aleksandra Pachalieva, Bradley Love, Daniel O'Malley, Alexander Scheinker, Kyle Hickmann, Diane Oyen, Nathan Debardeleben, Earl Lawrence, Ayan Biswas

2603.04351 2026-03-05 cs.RO

Tendon Force Modeling for Sim2Real Transfer of Reinforcement Learning Policies for Tendon-Driven Robots

Valentin Yuryev, Josie Hughes

Comments preprint

2603.04349 2026-03-05 cs.CV

FocusGraph: Graph-Structured Frame Selection for Embodied Long Video Question Answering

Tatiana Zemskova, Solomon Andryushenko, Ilya Obrubov, Viktoriia Khoruzhaia, Ekaterina Eroshenko, Ekaterina Derevyanka, Dmitry Yudin

2603.04346 2026-03-05 cs.CV

Underrepresented in Foundation Model Pretraining Data? A One-Shot Probe

Chris Vorster, Mayug Maniparambil, Noel E. O'Connor, Noel Murphy, Derek Molloy

2603.04343 2026-03-05 cs.CV cs.LG

Enhancing Authorship Attribution with Synthetic Paintings

Clarissa Loures, Caio Hosken, Luan Oliveira, Gianlucca Zuin, Adriano Veloso

Comments Accepted for publication at the 24th IEEE International Conference on Machine Learning and Applications (ICMLA 2025)

2603.04341 2026-03-05 cs.CV

Hold-One-Shot-Out (HOSO) for Validation-Free Few-Shot CLIP Adapters

Chris Vorster, Mayug Maniparambil, Noel E. O'Connor, Noel Murphy, Derek Molloy

2603.04340 2026-03-05 cs.CV cs.LG

Balancing Fidelity, Utility, and Privacy in Synthetic Cardiac MRI Generation: A Comparative Study

Madhura Edirisooriya, Dasuni Kawya, Ishan Kumarasinghe, Isuri Devindi, Mary M. Maleckar, Roshan Ragel, Isuru Nawinne, Vajira Thambawita

Comments 7 pages, 4 figures, Preprint

2603.04338 2026-03-05 cs.CV

ArtHOI: Articulated Human-Object Interaction Synthesis by 4D Reconstruction from Video Priors

Zihao Huang, Tianqi Liu, Zhaoxi Chen, Shaocong Xu, Saining Zhang, Lixing Xiao, Zhiguo Cao, Wei Li, Hao Zhao, Ziwei Liu

Comments Project Page: https://arthoi.github.io/

Abstract

Synthesizing physically plausible articulated human-object interactions (HOI) without 3D/4D supervision remains a fundamental challenge. While recent zero-shot approaches leverage video diffusion models to synthesize human-object interactions, they are largely confined to rigid-object manipulation and lack explicit 4D geometric reasoning. To bridge this gap, we formulate articulated HOI synthesis as a 4D reconstruction problem from monocular video priors: given only a video generated by a diffusion model, we reconstruct a full 4D articulated scene without any 3D supervision. This reconstruction-based approach treats the generated 2D video as supervision for an inverse rendering problem, recovering geometrically consistent and physically plausible 4D scenes that naturally respect contact, articulation, and temporal coherence. We introduce ArtHOI, the first zero-shot framework for articulated human-object interaction synthesis via 4D reconstruction from video priors. Our key designs are: 1) Flow-based part segmentation: leveraging optical flow as a geometric cue to disentangle dynamic from static regions in monocular video; 2) Decoupled reconstruction pipeline: joint optimization of human motion and object articulation is unstable under monocular ambiguity, so we first recover object articulation, then synthesize human motion conditioned on the reconstructed object states. ArtHOI bridges video-based generation and geometry-aware reconstruction, producing interactions that are both semantically aligned and physically grounded. Across diverse articulated scenes (e.g., opening fridges, cabinets, microwaves), ArtHOI significantly outperforms prior methods in contact accuracy, penetration reduction, and articulation fidelity, extending zero-shot interaction synthesis beyond rigid manipulation through reconstruction-informed synthesis.
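
The abstract mentions using optical flow as a cue to separate dynamic from static regions. As a rough illustration only, the sketch below thresholds flow magnitude per pixel. This is a deliberately crude stand-in, not ArtHOI's segmentation method; the flow values, the threshold, and the function name `dynamic_mask` are hypothetical.

```python
import math


def dynamic_mask(flow, threshold):
    """Mark pixels whose flow magnitude exceeds the threshold as dynamic."""
    return [[math.hypot(dx, dy) > threshold for dx, dy in row] for row in flow]


# Hypothetical 2x3 flow field of (dx, dy) displacements in pixels per frame.
flow = [
    [(0.0, 0.1), (3.0, 4.0), (0.2, 0.0)],
    [(0.0, 0.0), (1.5, 2.0), (0.1, 0.1)],
]

# Assumed threshold of 1 pixel per frame; only the middle column moves.
mask = dynamic_mask(flow, threshold=1.0)
```

A real pipeline would estimate the flow with a dedicated model and would likely need more than a fixed threshold to handle camera motion and noise.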

2603.04329 2026-03-05 cs.RO

Gaussian Mixture-Based Inverse Perception Contract for Uncertainty-Aware Robot Navigation

Bingyao Du, Joonkyung Kim, Yiwei Lyu

Comments 8 pages, 5 figures. Accepted to ACC 2026 (American Control Conference)

2603.04325 2026-03-05 cs.CV cs.LG

Scalable Evaluation of the Realism of Synthetic Environmental Augmentations in Images

Damian J. Ruck, Paul Vautravers, Oliver Chalkley, Jake Thomas

Abstract

Evaluation of AI systems often requires synthetic test cases, particularly for rare or safety-critical conditions that are difficult to observe in operational data. Generative AI offers a promising approach for producing such data through controllable image editing, but its usefulness depends on whether the resulting images are sufficiently realistic to support meaningful evaluation. We present a scalable framework for assessing the realism of synthetic image-editing methods and apply it to the task of adding environmental conditions (fog, rain, snow, and nighttime) to car-mounted camera images. Using 40 clear-day images, we compare rule-based augmentation libraries with generative AI image-editing models. Realism is evaluated using two complementary automated metrics: a vision-language model (VLM) jury for perceptual realism assessment, and embedding-based distributional analysis to measure similarity to genuine adverse-condition imagery. Generative AI methods substantially outperform rule-based approaches, with the best generative method achieving approximately 3.6 times the acceptance rate of the best rule-based method. Performance varies across conditions: fog proves easiest to simulate, while nighttime transformations remain challenging. Notably, the VLM jury assigns imperfect acceptance even to real adverse-condition imagery, establishing practical ceilings against which synthetic methods can be judged. By this standard, leading generative methods match or exceed real-image performance for most conditions. These results suggest that modern generative image-editing models can enable scalable generation of realistic adverse-condition imagery for evaluation pipelines. Our framework therefore provides a practical approach for scalable realism evaluation, though validation against human studies remains an important direction for future work.
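
The abstract compares methods by jury acceptance rate and uses the rate on real images as a practical ceiling. The sketch below shows one way such a comparison could be computed. It assumes majority voting among jurors, which the abstract does not specify, and every vote in it is made-up illustration data rather than a result from the paper.

```python
def acceptance_rate(votes):
    """Fraction of images that a strict majority of jurors accepted as realistic."""
    accepted = sum(1 for image_votes in votes if sum(image_votes) * 2 > len(image_votes))
    return accepted / len(votes)


# Hypothetical votes: one inner list per image, one boolean per juror.
rule_based = [[True, False, False], [False, False, False], [True, True, False], [False, False, True]]
generative = [[True, True, True], [True, True, False], [True, False, True], [False, True, True]]
real = [[True, True, True], [True, True, False], [False, False, True], [True, True, True]]

rule_rate = acceptance_rate(rule_based)
gen_rate = acceptance_rate(generative)
ceiling = acceptance_rate(real)  # real images are not always accepted either

ratio = gen_rate / rule_rate          # how many times better than rule-based
reaches_ceiling = gen_rate >= ceiling  # does the method match real-image acceptance?
```

Comparing against `ceiling` rather than against 1.0 reflects the abstract's point that the jury itself is imperfect, so a synthetic method that matches the real-image rate has reached what this metric can measure.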

2603.04323 2026-03-05 cs.LG cs.CR cs.DC math.AT stat.ML

PTOPOFL: Privacy-Preserving Personalised Federated Learning via Persistent Homology

Kelly L Vomo-Donfack, Adryel Hoszu, Grégory Ginot, Ian Morilla

Comments 22 pages, 6 Figures

2603.04321 2026-03-05 cs.CV cs.AI

SPRINT: Semi-supervised Prototypical Representation for Few-Shot Class-Incremental Tabular Learning

Umid Suleymanov, Murat Kantarcioglu, Kevin S Chan, Michael De Lucia, Kevin Hamlen, Latifur Khan, Sharad Mehrotra, Ananthram Swami, Bhavani Thuraisingham

Comments Under Review

2603.04319 2026-03-05 cs.CL

AILS-NTUA at SemEval-2026 Task 12: Graph-Based Retrieval and Reflective Prompting for Abductive Event Reasoning

Nikolas Karafyllis, Maria Lymperaiou, Giorgos Filandrianos, Athanasios Voulodimos, Giorgos Stamou

2603.04317 2026-03-05 cs.CL cs.AI cs.LG

World Properties without World Models: Recovering Spatial and Temporal Structure from Co-occurrence Statistics in Static Word Embeddings

Elan Barenholtz

Comments 12 pages, 3 figures, 3 tables

2603.04309 2026-03-05 cs.LG cs.AI cs.CV

CRESTomics: Analyzing Carotid Plaques in the CREST-2 Trial with a New Additive Classification Model

Pranav Kulkarni, Brajesh K. Lal, Georges Jreij, Sai Vallamchetla, Langford Green, Jenifer Voeks, John Huston, Lloyd Edwards, George Howard, Bradley A. Maron, Thomas G. Brott, James F. Meschia, Florence X. Doo, Heng Huang

Comments 4 pages, 3 figures, 1 table, accepted to ISBI 2026

2603.04308 2026-03-05 cs.LG cs.AI

Activation Outliers in Transformer Quantization: Reproduction, Statistical Analysis, and Deployment Tradeoffs

Pranav Kumar Kaliaperumal

Comments 10 pages, 3 tables. Reproducible study of transformer PTQ activation outliers based on Bondarenko et al. (EMNLP 2021, Qualcomm AI Research). Code: https://github.com/pranavkkp4/TransQuant-Edge