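arXivDaily: a daily academic digest synced with the full arXiv data set, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.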

2506.23690 2026-04-29 cs.CV

SynMotion: Semantic-Visual Adaptation for Motion Customized Video Generation

Shuai Tan, Biao Gong, Yujie Wei, Shiwei Zhang, Zhuoxin Liu, Ke Ma, Yan Wang, Kecheng Zheng, Xing Zhu, Yujun Shen, Hengshuang Zhao

Comments Project page: https://lucaria-academy.github.io/SynMotion/

English abstract

Diffusion-based video motion customization facilitates the acquisition of human motion representations from a few video samples, while achieving transfer to arbitrary subjects through precise textual conditioning. Existing approaches often rely on semantic-level alignment, expecting the model to learn new motion concepts and combine them with other entities (e.g., "cats" or "dogs") to produce visually appealing results. However, video data involve complex spatio-temporal patterns, and focusing solely on semantics causes the model to overlook the visual complexity of motion. Conversely, tuning only the visual representation leads to semantic confusion in representing the intended action. To address these limitations, we propose SynMotion, a new motion-customized video generation model that jointly leverages semantic guidance and visual adaptation. At the semantic level, we introduce a dual-embedding semantic comprehension mechanism that disentangles subject and motion representations, allowing the model to learn customized motion features while preserving its generative capabilities for diverse subjects. At the visual level, we integrate parameter-efficient motion adapters into a pre-trained video generation model to enhance motion fidelity and temporal coherence. Furthermore, we introduce a new embedding-specific training strategy that alternately optimizes subject and motion embeddings, supported by the manually constructed Subject Prior Video (SPV) training dataset. This strategy promotes motion specificity while preserving generalization across diverse subjects. Lastly, we introduce MotionBench, a newly curated benchmark with diverse motion patterns. Experimental results across both T2V and I2V settings demonstrate that SynMotion outperforms existing baselines. Project page: https://lucaria-academy.github.io/SynMotion/

2506.20941 2026-04-29 cs.LG

Revisiting the Past: Data Unlearning with Model State History

Keivan Rezaei, Mehrdad Saberi, Abhilasha Ravichander, Soheil Feizi

Comments Accepted to ICLR 2026

2506.14980 2026-04-29 cs.CV cs.RO

Advances in Compliance Detection: Novel Models Using Vision-Based Tactile Sensors

Ziteng Li, Malte Kuhlmann, Ilana Nisky, Nicolás Navarro-Guerrero

Comments Accepted in the IEEE International Conference on Development and Learning (ICDL). The paper contains 8 pages and 7 figures

2506.09981 2026-04-29 cs.CV cs.RO

ReSim: Reliable World Simulation for Autonomous Driving

Jiazhi Yang, Kashyap Chitta, Shenyuan Gao, Long Chen, Yuqian Shao, Xiaosong Jia, Hongyang Li, Andreas Geiger, Xiangyu Yue, Li Chen

Comments NeurIPS 2025 Spotlight. Project page: https://opendrivelab.com/ReSim

2506.06455 2026-04-29 cs.LG cs.AI stat.ML

WISCA: A Consensus-Based Approach to Harmonizing Interpretability in Tabular Datasets

Antonio Jesús Banegas-Luna, Horacio Pérez-Sánchez, Carlos Martínez-Cortés

Comments 27 pages, 11 figures, 2 tables, 13 equations

2506.05425 2026-04-29 cs.CV cs.AI

SIV-Bench: A Video Benchmark for Social Interaction Understanding and Reasoning

Fanqi Kong, Weiqin Zu, Xinyu Chen, Yaodong Yang, Song-Chun Zhu, Xue Feng

2506.05205 2026-04-29 cs.CL

RELIC: Evaluating Complex Reasoning via the Recognition of Languages In-Context

Jackson Petty, Michael Y. Hu, Wentao Wang, Shauli Ravfogel, William Merrill, Tal Linzen

Comments Accepted to TACL

2506.05199 2026-04-29 cs.CV

DEGround: An Effective Baseline for Ego-centric 3D Visual Grounding with a Homogeneous Framework

Yani Zhang, Dongming Wu, Hao Shi, Yingfei Liu, Tiancai Wang, Xingping Dong

Comments 1st place on EmbodiedScan visual grounding

2505.14174 2026-04-29 cs.CL cs.LG

Cheaper, Better, Faster, Stronger: Robust Text-to-SQL without Chain-of-Thought or Fine-Tuning

Yusuf Denizay Dönder, Derek Hommel, Andrea W Wen-Yi, David Mimno, Unso Eun Seo Jo

2505.13302 2026-04-29 cs.CL

Images Amplify Misinformation Sharing in Vision-Language Models

Alice Plebe, Timothy Douglas, Diana Riazi, R. Maria del Rio-Chanona

Comments Accepted for oral presentation at ICWSM 2026

2505.12202 2026-04-29 cs.LG stat.ML

Near-Optimal Sample Complexities of Divergence-based S-rectangular Distributionally Robust Reinforcement Learning

Zhenghao Li, Shengbo Wang, Nian Si

2503.12759 2026-04-29 cs.CL

RAG-RL: Advancing Retrieval-Augmented Generation via RL and Curriculum Learning

Jerry Huang, Siddarth Madala, Risham Sidhu, Cheng Niu, Hao Peng, Julia Hockenmaier, Tong Zhang

2503.06778 2026-04-29 cs.CL cs.AI

Large Language Models Are Effective Human Annotation Assistants, But Not Good Independent Annotators

Feng Gu, Zongxia Li, Carlos Rafael Colon, Benjamin Evans, Ishani Mondal, Jordan Lee Boyd-Graber

Comments 9 pages, 4 figures

2503.06100 2026-04-29 cs.CV

High-Precision Dichotomous Image Segmentation via Depth Integrity-Prior and Fine-Grained Patch Strategy

Xianjie Liu, Keren Fu, Qijun Zhao

2503.03426 2026-04-29 cs.LG math.ST stat.ML stat.TH

Sharp Risk Bounds for Early-Stopping in Gaussian Linear Regression

Tobias Wegel, Gil Kur, Patrick Rebeschini

2502.14427 2026-04-29 cs.CL

Token-Level Density-Based Uncertainty Quantification Methods for Eliciting Truthfulness of Large Language Models

Artem Vazhentsev, Lyudmila Rvanova, Ivan Lazichny, Alexander Panchenko, Maxim Panov, Timothy Baldwin, Artem Shelmanov

2412.08835 2026-04-29 cs.LG

Grothendieck Graph Neural Networks Framework: An Algebraic Platform for Crafting Topology-Aware GNNs

Amirreza Shiralinasab Langari, Leila Yeganeh, Kim Khoa Nguyen

2411.16073 2026-04-29 cs.LG cs.AI cs.CV

Soft-TransFormers for Continual Learning

Haeyong Kang, Chang D. Yoo

2411.08533 2026-04-29 cs.RO cs.AI

ACROSS: A Deformation-Based Cross-Modal Representation for Robotic Tactile Perception

Wadhah Zai El Amri, Malte Kuhlmann, Nicolás Navarro-Guerrero

Comments Accepted to 2025 IEEE Conference on Robotics and Automation (ICRA 2025). arXiv admin note: text overlap with arXiv:2410.14310

2411.05174 2026-04-29 cs.LG cs.AI stat.ML

Bayesian Inverse Transition Learning: Learning Dynamics From Near-Optimal Trajectories

Leo Benac, Abhishek Sharma, Sonali Parbhoo, Finale Doshi-Velez

2410.24214 2026-04-29 cs.LG cs.CR cs.CV

ARQ: A Mixed-Precision Quantization Framework for Accurate and Certifiably Robust DNNs

Yuchen Yang, Yifan Zhao, Shubham Ugare, Gagandeep Singh, Sasa Misailovic

2410.24116 2026-04-29 cs.CV cs.AI cs.LG

AIDOVECL: AI-generated Dataset of Outpainted Vehicles for Eye-level Classification and Localization

Amir Kazemi, Qurat ul ain Fatima, Volodymyr Kindratenko, Christopher W. Tessum

Comments 34 pages, 10 figures, 5 tables

2410.02082 2026-04-29 cs.LG q-bio.QM

FARM: Enhancing Molecular Representations with Functional Group Awareness

Thao Nguyen, Kuan-Hao Huang, Ge Liu, Martin D. Burke, Ying Diao, Heng Ji

Comments Preprint. The code is available at: https://github.com/thaonguyen217/farm_molecular_representation

2409.13869 2026-04-29 cs.AI cs.CL cs.CY

Generative AI Carries Non-Democratic Biases and Stereotypes: Representation of Women, Black Individuals, Age Groups, and People with Disability in AI-Generated Images across Occupations

Ayoob Sadeghiani

2408.16322 2026-04-29 cs.CV cs.RO

BEVal: A Cross-dataset Evaluation Study of BEV Segmentation Models for Autonomous Driving

Manuel Alejandro Diaz-Zapata, Wenqian Liu, Robin Baruffa, Christian Laugier

2408.12974 2026-04-29 cs.CV

Accuracy Improvement of Cell Image Segmentation Using Feedback Former

Hinako Mitsuoka, Kazuhiro Hotta

Comments Accepted by the ECCV 2024 Workshop "Human-inspired Computer Vision (HCV)". 2025/3/19: An extended version of this paper has been accepted for publication in IEEE Access. The published version is available at DOI: https://doi.org/10.1109/ACCESS.2025.3552847

2408.10692 2026-04-29 cs.CL

Unconditional Truthfulness: Learning Unconditional Uncertainty of Large Language Models

Artem Vazhentsev, Ekaterina Fadeeva, Rui Xing, Gleb Kuzmin, Ivan Lazichny, Alexander Panchenko, Preslav Nakov, Timothy Baldwin, Maxim Panov, Artem Shelmanov

2406.06587 2026-04-29 cs.CL cs.AI cs.HC

TouchAI: Exploring human-AI perceptual alignment in touch through language model representations

Shu Zhong, Elia Gatti, Youngjun Cho, Marianna Obrist

Comments Accepted at IJHCS

2406.04855 2026-04-29 cs.CL

The Russian Legislative Corpus

Denis Saveliev, Ruslan Kuchakov

Comments 6 pages, 2 figures, 2 tables

2404.10425 2026-04-29 cs.RO cs.AI

Optimizing BioTac Simulation for Realistic Tactile Perception

Wadhah Zai El Amri, Nicolás Navarro-Guerrero

Comments 12 pages (including appendix), Accepted at the International Joint Conference on Neural Networks (IJCNN) 2024, Yokohama, Japan. © 2024 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media... (We refer to IEEE Copyrights)