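arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.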

2604.04934 2026-04-07 cs.CV

Vanast: Virtual Try-On with Human Image Animation via Synthetic Triplet Supervision

Hyunsoo Cha, Wonjung Woo, Byungjun Kim, Hanbyul Joo

Comments Accepted to CVPR 2026, Project Page: https://hyunsoocha.github.io/vanast/

English abstract

We present Vanast, a unified framework that generates garment-transferred human animation videos directly from a single human image, garment images, and a pose guidance video. Conventional two-stage pipelines treat image-based virtual try-on and pose-driven animation as separate processes, which often results in identity drift, garment distortion, and front-back inconsistency. Our model addresses these issues by performing the entire process in a single unified step to achieve coherent synthesis. To enable this setting, we construct large-scale triplet supervision. Our data generation pipeline includes generating identity-preserving human images in alternative outfits that differ from garment catalog images, capturing full upper and lower garment triplets to overcome the single-garment-posed video pair limitation, and assembling diverse in-the-wild triplets without requiring garment catalog images. We further introduce a Dual Module architecture for video diffusion transformers to stabilize training, preserve pretrained generative quality, and improve garment accuracy, pose adherence, and identity preservation while supporting zero-shot garment interpolation. Together, these contributions allow Vanast to produce high-fidelity, identity-consistent animation across a wide range of garment types.

2604.04933 2026-04-07 cs.CV

PointTPA: Dynamic Network Parameter Adaptation for 3D Scene Understanding

Siyuan Liu, Chaoqun Zheng, Xin Zhou, Tianrui Feng, Dingkang Liang, Xiang Bai

Comments Accepted by CVPR 2026. The code is available at https://github.com/H-EmbodVis/PointTPA

2604.04931 2026-04-07 cs.CV

LoMa: Local Feature Matching Revisited

David Nordström, Johan Edstedt, Georg Bökman, Jonathan Astermark, Anders Heyden, Viktor Larsson, Mårten Wadenbäck, Michael Felsberg, Fredrik Kahl

2604.04930 2026-04-07 cs.CL cs.AI cs.LG

Early Stopping for Large Reasoning Models via Confidence Dynamics

Parsa Hosseini, Sumit Nawathe, Mahdi Salmani, Meisam Razaviyayn, Soheil Feizi

2604.04929 2026-04-07 cs.CV

Rethinking Model Efficiency: Multi-Agent Inference with Large Models

Sixun Dong, Juhua Hu, Steven Li, Wei Wen, Qi Qian

2604.04924 2026-04-07 cs.CV cs.AI

Your Pre-trained Diffusion Model Secretly Knows Restoration

Sudarshan Rajagopalan, Vishal M. Patel

Comments Project page: https://sudraj2002.github.io/yptpage/

2604.04923 2026-04-07 cs.LG cs.LO cs.SY eess.SY math.AT

Stratifying Reinforcement Learning with Signal Temporal Logic

Justin Curry, Alberto Speranzon

Comments 8 pages, 13 figures

2604.04921 2026-04-07 cs.CL cs.CV

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Weian Mao, Xi Lin, Wei Huang, Yuxin Xie, Tianfu Fu, Bohan Zhuang, Song Han, Yukang Chen

Comments Code is available at https://github.com/WeianMao/triattention

2604.04916 2026-04-07 cs.LG

Empowering Power Outage Prediction with Spatially Aware Hybrid Graph Neural Networks and Contrastive Learning

Xuyang Shen, Zijie Pan, Diego Cerrai, Xinxuan Zhang, Christopher Colorio, Emmanouil N. Anagnostou, Dongjin Song

2604.04913 2026-04-07 cs.CV

A Frame is Worth One Token: Efficient Generative World Modeling with Delta Tokens

Tommie Kerssies, Gabriele Berton, Ju He, Qihang Yu, Wufei Ma, Daan de Geus, Gijs Dubbelman, Liang-Chieh Chen

Comments CVPR 2026. Code and weights: https://deltatok.github.io

2604.04908 2026-04-07 cs.LG

HI-MoE: Hierarchical Instance-Conditioned Mixture-of-Experts for Object Detection

Vadim Vashkelis, Natalia Trukhina

2604.04905 2026-04-07 cs.CV cs.GR cs.HC

ClickAIXR: On-Device Multimodal Vision-Language Interaction with Real-World Objects in Extended Reality

Dawar Khan, Alexandre Kouyoumdjian, Xinyu Liu, Omar Mena, Dominik Engel, Ivan Viola

2604.04902 2026-04-07 cs.LG

Are Latent Reasoning Models Easily Interpretable?

Connor Dilgren, Sarah Wiegreffe

Comments Preprint

2604.04901 2026-04-07 cs.CV cs.AI

FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Shuai Liu, Shulin Tian, Kairui Hu, Yuhao Dong, Zhe Yang, Bo Li, Jingkang Yang, Chen Change Loy, Ziwei Liu

Comments Project Page: https://filegram.choiszt.com, Code: https://github.com/synvo-ai/FileGram

2604.04898 2026-04-07 cs.AI cs.CL cs.LG

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

LM-Provers, Yuxiao Qu, Amrith Setlur, Jasper Dekoninck, Edward Beeching, Jia Li, Ian Wu, Lewis Tunstall, Aviral Kumar

2604.04892 2026-04-07 cs.LG

Data Attribution in Adaptive Learning

Amit Kiran Rege

Comments Work in progress

2604.04887 2026-04-07 cs.CV

HorizonWeaver: Generalizable Multi-Level Semantic Editing for Driving Scenes

Mauricio Soroco, Francesco Pittaluga, Zaid Tasneem, Abhishek Aich, Bingbing Zhuang, Wuyang Chen, Manmohan Chandraker, Ziyu Jiang

Comments CVPR Findings 2026

2604.04878 2026-04-07 cs.AI cs.PF

Learning, Potential, and Retention: An Approach for Evaluating Adaptive AI-Enabled Medical Devices

Alexis Burgon, Berkman Sahiner, Nicholas A Petrick, Gene Pennello, Ravi K Samala

2604.04876 2026-04-07 cs.AI

Incompleteness of AI Safety Verification via Kolmogorov Complexity

Munawar Hasan

2604.04875 2026-04-07 cs.CV cs.AI cs.MM

DIRECT: Video Mashup Creation via Hierarchical Multi-Agent Planning and Intent-Guided Editing

Ke Li, Maoliang Li, Jialiang Chen, Jiayu Chen, Zihao Zheng, Shaoqi Wang, Xiang Chen

2604.04874 2026-04-07 cs.CV

Free-Range Gaussians: Non-Grid-Aligned Generative 3D Gaussian Reconstruction

Ahan Shabanov, Peter Hedman, Ethan Weber, Zhengqin Li, Denis Rozumny, Gael Le Lan, Naina Dhingra, Lei Luo, Andrea Vedaldi, Christian Richardt, Andrea Tagliasacchi, Bo Zhu, Numair Khan

Comments Project Page: https://free-range-gaussians.github.io

2604.04872 2026-04-07 cs.CL cs.LG

Synthetic Sandbox for Training Machine Learning Engineering Agents

Yuhang Zhou, Lizhu Zhang, Yifan Wu, Jiayi Liu, Xiangjun Fan, Zhuokai Zhao, Hong Yan

Comments 28 pages, 9 tables, 8 figures

2604.04869 2026-04-07 cs.LG

Optimizing LLM Prompt Engineering with DSPy Based Declarative Learning

Shiek Ruksana, Sailesh Kiran Kurra, Thipparthi Sanjay Baradwaj

Comments Best Paper Award, IEEE International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, Mar 11-13, 2026

2604.04863 2026-04-07 cs.CV

Beyond the Global Scores: Fine-Grained Token Grounding as a Robust Detector of LVLM Hallucinations

Tuan Dung Nguyen, Minh Khoi Ho, Qi Chen, Yutong Xie, Nguyen Cam-Tu, Minh Khoi Nguyen, Dang Huy Pham Nguyen, Anton van den Hengel, Johan W. Verjans, Phi Le Nguyen, Vu Minh Hieu Phan

Comments Accepted at CVPR 2026 Main Track

2604.04862 2026-04-07 cs.RO

Outlier-Robust Nonlinear Moving Horizon Estimation using Adaptive Loss Functions

Nestor Deniz, Guido Sanchez, Fernando Auat Cheein, Leonardo Giovanini

2604.04859 2026-04-07 cs.CV

Unified Vector Floorplan Generation via Markup Representation

Kaede Shiohara, Toshihiko Yamasaki

Comments CVPR 2026. Webpage: https://mapooon.github.io/FMLPage

2604.04858 2026-04-07 cs.LG q-bio.QM

FairLogue: A Toolkit for Intersectional Fairness Analysis in Clinical Machine Learning Models

Nick Souligne, Vignesh Subbian

2604.04857 2026-04-07 cs.CV

The Blind Spot of Adaptation: Quantifying and Mitigating Forgetting in Fine-tuned Driving Models

Runhao Mao, Hanshi Wang, Yixiang Yang, Qianli Ma, Jingmeng Zhou, Zhipeng Zhang

Comments Received by CVPR 2026

2604.04855 2026-04-07 cs.LG

The Role of Generator Access in Autoregressive Post-Training

Amit Kiran Rege

Comments Work in progress

2604.04853 2026-04-07 cs.AI

MemMachine: A Ground-Truth-Preserving Memory System for Personalized AI Agents

Shu Wang, Edwin Yu, Oscar Love, Tom Zhang, Tom Wong, Steve Scargall, Charles Fan

Comments 18 pages, 16 tables, 3 figures