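arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.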

2604.08636 2026-04-13 cs.RO cs.AI

LEGO: Latent-space Exploration for Geometry-aware Optimization of Humanoid Kinematic Design

Jihwan Yoon, Taemoon Jeong, Jeongeun Park, Chanwoo Kim, Jaewoon Kwon, Yonghyeon Lee, Kyungjae Lee, Sungjoon Choi

Comments Accepted in ICRA 2026

Abstract

Designing robot morphologies and kinematics has traditionally relied on human intuition, with little systematic foundation. Motion-design co-optimization offers a promising path toward automation, but two major challenges remain: (i) the vast, unstructured design space and (ii) the difficulty of constructing task-specific loss functions. We propose a new paradigm that minimizes human involvement by (i) learning the design search space from existing mechanical designs, rather than hand-crafting it, and (ii) defining the loss directly from human motion data via motion retargeting and Procrustes analysis. Using screw-theory-based joint axis representation and isometric manifold learning, we construct a compact, geometry-preserving latent space of humanoid upper body designs in which optimization is tractable. We then solve design optimization in this latent space using gradient-free optimization. Our approach establishes a principled framework for data-driven robot design and demonstrates that leveraging existing designs and human motion can effectively guide the automated discovery of novel robot design.
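The abstract names Procrustes analysis as one ingredient of the motion-derived loss but does not spell out the formulation. As a rough illustration only, the sketch below computes a standard Procrustes residual between two point sets (for example, keypoints of a human pose and of a retargeted robot pose) after removing translation, uniform scale, and rotation. The function name `procrustes_distance` and the use of 3D keypoints are assumptions made for this example, not details taken from the paper.

```python
import numpy as np

def procrustes_distance(X, Y):
    """Squared residual after aligning Y to X by translation, uniform scale, and rotation.

    X, Y: arrays of shape (n_points, dim) with corresponding rows.
    Hypothetical helper for illustration; not the paper's loss.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)

    # Remove translation by centering both point sets.
    X0 = X - X.mean(axis=0)
    Y0 = Y - Y.mean(axis=0)

    # Remove scale by normalizing to unit Frobenius norm.
    X0 = X0 / np.linalg.norm(X0)
    Y0 = Y0 / np.linalg.norm(Y0)

    # Optimal rotation from the SVD of the cross-covariance matrix.
    U, S, Vt = np.linalg.svd(Y0.T @ X0)

    # Flip the last singular direction if needed so R is a proper rotation.
    D = np.ones(len(S))
    D[-1] = np.sign(np.linalg.det(U @ Vt))
    R = U @ np.diag(D) @ Vt

    # Optimal residual scale for the normalized sets.
    s = float(np.sum(S * D))

    aligned = s * (Y0 @ R)
    return float(np.sum((X0 - aligned) ** 2))


# Usage: a rotated, scaled, and shifted copy of a shape should give a near-zero distance.
theta = 0.7
Rz = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta),  np.cos(theta), 0.0],
    [0.0,            0.0,           1.0],
])
shape_a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
shape_b = 2.5 * shape_a @ Rz.T + np.array([1.0, -2.0, 3.0])
shape_c = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 3.0]])

same_shape = procrustes_distance(shape_a, shape_b)
different_shape = procrustes_distance(shape_a, shape_c)
```

Because the residual is invariant to position, size, and orientation, a measure of this kind compares only shape, which is presumably why it suits comparing human motion with robots of different proportions.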

2604.08457 2026-04-13 cs.CV cs.AI cs.RO

CrashSight: A Phase-Aware, Infrastructure-Centric Video Benchmark for Traffic Crash Scene Understanding and Reasoning

Rui Gan, Junyi Ma, Pei Li, Xingyou Yang, Kai Chen, Sikai Chen, Bin Ran

2604.08405 2026-04-13 cs.CV

SyncBreaker: Stage-Aware Multimodal Adversarial Attacks on Audio-Driven Talking Head Generation

Wenli Zhang, Xianglong Shi, Sirui Zhao, Xinqi Chen, Guo Cheng, Yifan Xu, Tong Xu, Yong Liao

2604.07928 2026-04-13 cs.CV cs.LG

Generative 3D Gaussian Splatting for Arbitrary-Resolution Atmospheric Downscaling and Forecasting

Tao Han, Zhibin Wen, Zhenghao Chen, Fenghua Lin, Junyu Gao, Song Guo, Lei Bai

Comments 20 pages, 13 figures

2604.07779 2026-04-13 cs.CV

Plug-and-Play Logit Fusion for Heterogeneous Pathology Foundation Models

Gexin Huang, Anqi Li, Yusheng Tan, Beidi Zhao, Gang Wang, Zu-Hua Gao, Xiaoxiao Li

Comments 10 pages, 2 figures

2604.07725 2026-04-13 cs.AI cs.CL

Squeeze Evolve: Unified Multi-Model Orchestration for Verifier-Free Evolution

Monishwaran Maheswaran, Leon Lakhani, Zhongzhu Zhou, Shijia Yang, Junxiong Wang, Coleman Hooper, Yuezhou Hu, Rishabh Tiwari, Jue Wang, Harman Singh, Qingyang Wu, Yuqing Jian, Ce Zhang, Kurt Keutzer, Tri Dao, Xiaoxia Wu, Ben Athiwaratkun, James Zou, Chenfeng Xu

Comments 40 Pages, Project Page: https://squeeze-evolve.github.io/

2604.07720 2026-04-13 cs.AI

Towards Knowledgeable Deep Research: Framework and Benchmark

Wenxuan Liu, Zixuan Li, Long Bai, Chunmao Zhang, Fenghui Zhang, Zhuo Chen, Wei Li, Yuxin Zuo, Fei Wang, Bingbing Xu, Xuhui Jiang, Jin Zhang, Xiaolong Jin, Jiafeng Guo, Tat-Seng Chua, Xueqi Cheng

2604.06734 2026-04-13 cs.CL

TEC: A Collection of Human Trial-and-error Trajectories for Problem Solving

Xinkai Zhang, Jingtao Zhan, Yiqun Liu, Qingyao Ai

2604.06165 2026-04-13 cs.CV cs.LG

HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models

Reihaneh Zohrabi, Hosein Hasani, Akshita Gupta, Mahdieh Soleymani Baghshah, Anna Rohrbach, Marcus Rohrbach

2604.06014 2026-04-13 cs.LG

Gated-SwinRMT: Unifying Swin Windowed Attention with Retentive Manhattan Decay via Input-Dependent Gating

Dipan Maity, Suman Mondal, Arindam Roy

2604.04144 2026-04-13 cs.CL cs.AI

Many Preferences, Few Policies: Towards Scalable Language Model Personalization

Cheol Woo Kim, Jai Moondra, Roozbeh Nahavandi, Andrew Perrault, Milind Tambe, Swati Gupta

Comments Fixed typos

2604.02781 2026-04-13 cs.SD

DynFOA: Generating First-Order Ambisonics with Conditional Diffusion for Dynamic and Acoustically Complex 360-Degree Videos

Ziyu Luo, Lin Chen, Qiang Qu, Xiaoming Chen, Yiran Shen

Comments Accidental duplicate submission. This paper was intended to be a replacement (v2) for arXiv:2602.06846

2604.02241 2026-04-13 cs.CV cs.RO

UAV-Track VLA: Embodied Aerial Tracking via Vision-Language-Action Models

Qiyao Zhang, Shuhua Zheng, Jianli Sun, Chengxiang Li, Xianke Wu, Zihan Song, Zhiyong Cui, Yisheng Lv, Yonglin Tian

2604.02183 2026-04-13 cs.AI

TRU: Targeted Reverse Update for Efficient Multimodal Recommendation Unlearning

Zhanting Zhou, KaHou Tam, Ziqiang Zheng, Zeyu Ma, Yang Yang

2603.27222 2026-04-13 cs.CV

HD-VGGT: High-Resolution Visual Geometry Transformer

Tianrun Chen, Yuanqi Hu, Yidong Han, Hanjie Xu, Deyi Ji, Qi Zhu, Chunan Yu, Xin Zhang, Cheng Chen, Chaotao Ding, Ying Zang, Xuanfu Li, Jin Ma, Lanyun Zhu

2603.21692 2026-04-13 cs.AI cs.DC cs.SE

Reasoning Provenance for Autonomous AI Agents: Structured Behavioral Analytics Beyond State Checkpoints and Execution Traces

Neelmani Vispute, Aditya Kadam

Comments 10 pages, 2 tables, 1 figure, preprint, v2: adds co-author and transport-layer verification mechanism

2603.15757 2026-04-13 cs.RO cs.AI

You've Got a Golden Ticket: Improving Generative Robot Policies With A Single Noise Vector

Omkar Patil, Ondrej Biza, Thomas Weng, Karl Schmeckpeper, Wil Thomason, Xiaohan Zhang, Robin Walters, Nakul Gopalan, Sebastian Castro, Eric Rosen

Comments 13 pages, 9 figures

2603.01944 2026-04-13 cs.CV

MobileMold: A Smartphone-Based Microscopy Dataset for Food Mold Detection

Dinh Nam Pham, Leonard Prokisch, Bennet Meyer, Jonas Thumbs

Comments Accepted to ACM Multimedia Systems (MMSys'26). Dataset and code available at https://mobilemold.github.io/dataset/

2602.22812 2026-04-13 cs.LG cs.DC

Accelerating Local LLMs on Resource-Constrained Edge Devices via Distributed Prompt Caching

Hiroki Matsutani, Naoki Matsuda, Naoto Sugiura

Comments EuroMLSys'26

2602.01997 2026-04-13 cs.LG cs.AI

On the Limits of Layer Pruning for Generative Reasoning in Large Language Models

Safal Shrestha, Anubhav Shrestha, Aadim Nepal, Minwu Kim, Keith Ross

2602.00821 2026-04-13 cs.CV

Zero-Shot Generative De-identification: Inversion-Free Flow for Privacy-Preserving Skin Image Analysis

Konstantinos Moutselos, Ilias Maglogiannis

Comments 10 pages, 5 figures

2601.23045 2026-04-13 cs.AI

The Hot Mess of AI: How Does Misalignment Scale With Model Intelligence and Task Complexity?

Alexander Hägele, Aryo Pradipta Gema, Henry Sleight, Ethan Perez, Jascha Sohl-Dickstein

Comments ICLR 2026. 10 pages main text, 40 total, 27 figures. v2: typos, improved writing, references

2601.22990 2026-04-13 cs.CV cs.AI

Self-Supervised Slice-to-Volume Reconstruction with Gaussian Representations for Fetal MRI

Yinsong Wang, Thomas Fletcher, Xinzhe Luo, Aine Travers Dineen, Rhodri Cusack, Chen Qin

2601.18150 2026-04-13 cs.LG cs.CL

FP8-RL: A Practical and Stable Low-Precision Stack for LLM Reinforcement Learning

Zhaopeng Qiu, Shuang Yu, Jingqi Zhang, Shuai Zhang, Xue Huang, Jingyi Yang, Junjie Lai

Comments Added more FP8 end2end experiments

2601.10632 2026-04-13 cs.CV

CoMoVi: Co-Generation of 3D Human Motions and Realistic Videos

Chengfeng Zhao, Jiazhi Shu, Yubo Zhao, Tianyu Huang, Jiahao Lu, Zekai Gu, Chengwei Ren, Zhiyang Dou, Qing Shuai, Yuan Liu

Comments Project Page: https://igl-hkust.github.io/CoMoVi/

2601.10587 2026-04-13 cs.CV cs.AI

Adversarial Evasion Attacks on Computer Vision using SHAP Values

Frank Mollard, Marcus Becker, Florian Roehrbein

Comments 10th bwHPC Symposium - September 25th & 26th, 2024

2512.22437 2026-04-13 cs.CV

EmoCtrl: Controllable Emotional Image Content Generation

Jingyuan Yang, Weibin Luo, Hui Huang

2512.22187 2026-04-13 cs.RO cs.ET cs.SY eess.SY

Joint UAV-UGV Positioning and Trajectory Planning via Meta A3C for Reliable Emergency Communications

Ndagijimana Cyprien, Mehdi Sookhak, Hosein Zarini, Chandra N Sekharan, Mohammed Atiquzzaman

2512.21334 2026-04-13 cs.CV

Streaming Video Instruction Tuning

Jiaer Xia, Peixian Chen, Mengdan Zhang, Xing Sun, Kaiyang Zhou

Comments Accepted by CVPR2026

2512.07417 2026-04-13 cs.LG

Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning

Giray Önür, Azita Dabiri, Bart De Schutter

Comments Accepted for presentation and publication in the proceedings of the 2026 European Control Conference (ECC)