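arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.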

2601.16148 2026-04-02 cs.CV

ActionMesh: Animated 3D Mesh Generation with Temporal 3D Diffusion

Remy Sabathier, David Novotny, Niloy J. Mitra, Tom Monnier

Comments CVPR 2026. Project webpage with code and videos: https://remysabathier.github.io/actionmesh/ . V2 update includes more baseline models with a larger evaluation set on our new publicly released benchmark ActionBench, and {3D+video}-to-animated-mesh qualitative comparison in supplemental

English abstract

Generating animated 3D objects is at the heart of many applications, yet most advanced works are typically difficult to apply in practice because of their limited setup, their long runtime, or their limited quality. We introduce ActionMesh, a generative model that predicts production-ready 3D meshes "in action" in a feed-forward manner. Drawing inspiration from early video models, our key insight is to modify existing 3D diffusion models to include a temporal axis, resulting in a framework we dubbed "temporal 3D diffusion". Specifically, we first adapt the 3D diffusion stage to generate a sequence of synchronized latents representing time-varying and independent 3D shapes. Second, we design a temporal 3D autoencoder that translates a sequence of independent shapes into the corresponding deformations of a pre-defined reference shape, allowing us to build an animation. Combining these two components, ActionMesh generates animated 3D meshes from different inputs like a monocular video, a text description, or even a 3D mesh with a text prompt describing its animation. Besides, compared to previous approaches, our method is fast and produces results that are rig-free and topology consistent, hence enabling rapid iteration and seamless applications like texturing and retargeting. We evaluate our model on standard video-to-4D benchmarks (Consistent4D, Objaverse) and report state-of-the-art performances on both geometric accuracy and temporal consistency, demonstrating that our model can deliver animated 3D meshes with unprecedented speed and quality.

2601.12604 2026-04-02 cs.LG

Beyond Softmax and Entropy: Convergence Rates of Policy Gradients with f-SoftArgmax Parameterization & Coupled Regularization

Safwan Labbi, Daniil Tiapkin, Paul Mangold, Eric Moulines

2601.10001 2026-04-02 cs.CV

DW-DGAT: Dynamically Weighted Dual Graph Attention Network for Neurodegenerative Disease Diagnosis

Chengjia Liang, Zhenjiong Wang, Chao Chen, Ruizhi Zhang, Songxi Liang, Hai Xie, Haijun Lei, Zhongwei Huang

Comments The extended version of an AAAI-2026 accepted poster paper

2601.09504 2026-04-02 cs.CL

MVSS: A Unified Framework for Multi-View Structured Survey Generation

Yinqi Liu, Yueqi Zhu, Yongkang Zhang, Feiran Liu, Yutong Shen, Yufei Sun, Xin Wang, Renzhao Liang, Yidong Wang, Cunxiang Wang

2601.08476 2026-04-02 cs.CV cs.MM

Cross-modal Proxy Evolving for OOD Detection with Vision-Language Models

Hao Tang, Yu Liu, Shuanglin Yan, Fei Shen, Shengfeng He, Jing Qin

Comments Accepted by AAAI 2026

2601.08165 2026-04-02 cs.CV

Representation Learning with Semantic-aware Instance and Sparse Token Alignments

Phuoc-Nguyen Bui, Toan Duc Nguyen, Junghyun Bum, Duc-Tai Le, Hyunseung Choo

Comments Accepted to ICPR 2026

2601.05144 2026-04-02 cs.AI

Distilling the Thought, Watermarking the Answer: A Principle Semantic Guided Watermark for Large Reasoning Models

Shuliang Liu, Xingyu Li, Hongyi Liu, Dong Fang, Yibo Yan, Bingchen Duan, Qi Zheng, Lingfeng Su, Xuming Hu

Comments 31 pages, Published in ICLR 2026

2601.03811 2026-04-02 cs.CV cs.LG

EvalBlocks: A Modular Pipeline for Rapidly Evaluating Foundation Models in Medical Imaging

Jan Tagscherer, Sarah de Boer, Lena Philipp, Fennie van der Graaf, Dré Peeters, Joeran Bosma, Lars Leijten, Bogdan Obreja, Ewoud Smit, Alessa Hering

Comments Accepted and published in BVM 2026 proceedings (Springer)

2601.00267 2026-04-02 cs.CV

ActErase: A Training-Free Paradigm for Precise Concept Erasure via Activation Redirection

Yi Sun, Xinhao Zhong, Hongyan Li, Yimin Zhou, Junhao Li, Bin Chen, Xuan Wang

2512.24212 2026-04-02 cs.RO cs.CV

RANGER: A Monocular Zero-Shot Semantic Navigation Framework through Visual Contextual Adaptation

Ming-Ming Yu, Yi Chen, Börje F. Karlsson, Wenjun Wu

Comments Accepted at ICRA 2026

2512.21038 2026-04-02 cs.CV

Next-Scale Prediction: A Self-Supervised Approach for Real-World Image Denoising

Yiwen Shan, Haiyu Zhao, Peng Hu, Xi Peng, Yuanbiao Gou

2512.19693 2026-04-02 cs.CV

The Prism Hypothesis: Harmonizing Semantic and Pixel Representations via Unified Autoencoding

Weichen Fan, Haiwen Diao, Quan Wang, Dahua Lin, Ziwei Liu

Comments Code link: https://github.com/WeichenFan/UAE

2512.18640 2026-04-02 cs.CV cs.AI cs.RO

Geometric-Photometric Event-based 3D Gaussian Ray Tracing

Kai Kohyama, Yoshimitsu Aoki, Guillermo Gallego, Shintaro Shiba

Comments 15 pages, 12 figures, 5 tables

2512.17312 2026-04-02 cs.CV

CodeDance: A Dynamic Tool-integrated MLLM for Executable Visual Reasoning

Qi Song, Honglin Li, Yingchen Yu, Haoyi Zhou, Lin Yang, Song Bai, Qi She, Zilong Huang, Yunqing Zhao

Comments CVPR 2026. Project page: https://codedance-vl.github.io/

2512.12090 2026-04-02 cs.CV cs.CR cs.LG

SPDMark: Selective Parameter Displacement for Robust Video Watermarking

Samar Fares, Nurbek Tastan, Karthik Nandakumar

Comments CVPR 2026

2512.10394 2026-04-02 cs.RO cs.LG

RoboNeuron: A Middle-Layer Infrastructure for Agent-Driven Orchestration in Embodied AI

Weifan Guan, Qinghao Hu, Huasen Xi, Chenxiao Zhang, Aosheng Li, Jian Cheng

2512.08545 2026-04-02 cs.CL cs.AI cs.CV cs.MA

Curriculum Guided Massive Multi Agent System Solving For Robust Long Horizon Tasks

Indrajit Kar, Kalathur Chenchu Kishore Kumar

Comments 22 pages, 2 tables, 9 figures

2512.03932 2026-04-02 cs.CV

Beyond the Ground Truth: Enhanced Supervision for Image Restoration

Donghun Ryou, Inju Ha, Sanghyeok Chu, Bohyung Han

Comments Project page: https://hij1112.github.io/beyond-the-ground-truth/ . Accepted to CVPR 2026

2512.02496 2026-04-02 cs.CV cs.GR

Attention-guided reference point shifting for Gaussian-mixture-based partial point set registration

Mizuki Kikkawa, Tatsuya Yatagawa, Yutaka Ohtake, Hiromasa Suzuki

Comments 16 pages, 9 figures, 7 tables

2512.02413 2026-04-02 cs.CV cs.AI

Enhancing Floor Plan Recognition: A Hybrid Mix-Transformer and U-Net Approach for Precise Wall Segmentation

Dmitriy Parashchuk, Alexey Kaspshitskiy, Yuriy Karyakin

Comments 11 pages, 5 figures, 3 tables

2512.02079 2026-04-02 cs.RO cs.MA cs.SY eess.SY

Robust Geospatial Coordination of Multi-Agent Communications Networks Under Attrition

Jonathan S. Kent, Eliana Stefani, Brian Plancher

Comments 8 pages, 4 figures, 4 tables, accepted to IEEE RA-L

2512.00580 2026-04-02 cs.LG stat.ML

Non-Asymptotic Convergence of Discrete Diffusion Models: Masked and Random Walk dynamics

Giovanni Conforti, Alain Durmus, Le-Tuyet-Nhi Pham, Gael Raoul

2512.00234 2026-04-02 cs.CL cs.AI

OmniFusion: Simultaneous Multilingual Multimodal Translations via Modular Fusion

Sai Koneru, Matthias Huck, Jan Niehues

Comments Revised submission in review for ACL ARR

2511.21523 2026-04-02 cs.CV

EoS-FM: Can an Ensemble of Specialist Models act as a Generalist Feature Extractor?

Pierre Adorni, Minh-Tan Pham, Stéphane May, Sébastien Lefèvre

2511.20836 2026-04-02 cs.CL cs.AI cs.LG

Structured Prompts Improve Evaluation of Language Models

Asad Aali, Muhammad Ahmed Mohsin, Vasiliki Bikia, Arnav Singhvi, Richard Gaus, Suhana Bedi, Hejie Cui, Miguel Fuentes, Alyssa Unell, Yifan Mai, Jordan Cahoon, Michael Pfeffer, Roxana Daneshjou, Sanmi Koyejo, Emily Alsentzer, Christopher Potts, Nigam H. Shah, Akshay S. Chaudhari

2511.20224 2026-04-02 cs.SD cs.AI

DuoTok: Source-Aware Dual-Track Tokenization for Multi-Track Music Language Modeling

Rui Lin, Zhiyue Wu, Jiahe Le, Kangdi Wang, Weixiong Chen, Junyu Dai, Tao Jiang

Comments 17 pages, 5 figures, 8 tables. Project page: https://eps-acoustic-revolution-lab.github.io/DUO_TOK/

2511.16908 2026-04-02 cs.CV

Q-REAL: Towards Realism and Plausibility Evaluation for AI-Generated Content

Shushi Wang, Zicheng Zhang, Chunyi Li, Wei Wang, Liya Ma, Fengjiao Chen, Xiaoyu Li, Xuezhi Cao, Guangtao Zhai, Xiaohong Liu

2511.15411 2026-04-02 cs.CV cs.LG

D4C: Data-Free Quantization for Contrastive Language-Image Pre-training Models

Wenlun Zhang, Yunshan Zhong, Zihao Ding, Xinyu Li, Kentaro Yoshioka

Comments Accepted to CVPRF 2026

2511.14702 2026-04-02 cs.CV cs.AI

Seeing Beyond the Image: ECG and Anatomical Knowledge-Guided Myocardial Scar Segmentation from Late Gadolinium-Enhanced Images

Farheen Ramzan, Yusuf Kiberu, Nikesh Jathanna, Meryem Jabrane, Vicente Grau, Shahnaz Jamil-Copley, Richard H. Clayton, Chen Chen

Comments Oral presentation at the International Symposium on Biomedical Imaging (ISBI 2026)

2511.14275 2026-04-02 cs.CL

Let the Model Distribute Its Doubt: Confidence Estimation through Verbalized Probability Distribution

Ante Wang, Weizhi Ma, Yang Liu