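arXivDaily: a daily academic digest synced with the full arXiv corpus, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.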

2603.28766 2026-03-31 cs.CV

HandX: Scaling Bimanual Motion and Interaction Generation

Zimu Zhang, Yucheng Zhang, Xiyan Xu, Ziyin Wang, Sirui Xu, Kai Zhou, Bing Zhou, Chuan Guo, Jian Wang, Yu-Xiong Wang, Liang-Yan Gui

Comments CVPR 2026. Project Page: https://handx-project.github.io. Code: https://github.com/handx-project/HandX

English abstract

Synthesizing human motion has advanced rapidly, yet realistic hand motion and bimanual interaction remain underexplored. Whole-body models often miss the fine-grained cues that drive dexterous behavior (finger articulation, contact timing, and inter-hand coordination), and existing resources lack high-fidelity bimanual sequences that capture nuanced finger dynamics and collaboration. To fill this gap, we present HandX, a unified foundation spanning data, annotation, and evaluation. We consolidate and filter existing datasets for quality, and collect a new motion-capture dataset targeting underrepresented bimanual interactions with detailed finger dynamics. For scalable annotation, we introduce a decoupled strategy that extracts representative motion features, e.g., contact events and finger flexion, and then leverages reasoning from large language models to produce fine-grained, semantically rich descriptions aligned with these features. Building on the resulting data and annotations, we benchmark diffusion and autoregressive models with versatile conditioning modes. Experiments demonstrate high-quality dexterous motion generation, supported by our newly proposed hand-focused metrics. We further observe clear scaling trends: larger models trained on larger, higher-quality datasets produce more semantically coherent bimanual motion. Our dataset is released to support future research.

2603.28765 2026-03-31 cs.CL

Adaptive Block-Scaled Data Types

Jack Cook, Hyemin S. Lee, Kathryn Le, Junxian Guo, Giovanni Traverso, Anantha P. Chandrakasan, Song Han

Comments 19 pages, 9 figures

2603.28763 2026-03-31 cs.CV

PoseDreamer: Scalable and Photorealistic Human Data Generation Pipeline with Diffusion Models

Lorenza Prospero, Orest Kupyn, Ostap Viniavskyi, João F. Henriques, Christian Rupprecht

2603.28760 2026-03-31 cs.CV cs.RO

SHOW3D: Capturing Scenes of 3D Hands and Objects in the Wild

Patrick Rim, Kevin Harris, Braden Copple, Shangchen Han, Xu Xie, Ivan Shugurov, Sizhe An, He Wen, Alex Wong, Tomas Hodan, Kun He

Comments CVPR 2026

2603.28757 2026-03-31 cs.CV cs.MM cs.SD

SonoWorld: From One Image to a 3D Audio-Visual Scene

Derong Jin, Xiyi Chen, Ming C. Lin, Ruohan Gao

Comments Accepted by CVPR 2026, project page: https://humathe.github.io/sonoworld/

2603.28756 2026-03-31 cs.MS

Fast Large-Scale Model-Based Iterative Tomography via Exploiting Mathematical Structure, Hierarchical Optimization, Smart Initialization, and Distributed GPU Computing

Dinesh Kumar, Jeffrey Donatelli

2603.28755 2026-03-31 cs.CY

Graphilosophy: Graph-Based Digital Humanities Computing with The Four Books

Minh-Thu Do, Quynh-Chau Le-Tran, Duc-Duy Nguyen-Mai, Thien-Trang Nguyen, Khanh-Duy Le, Minh-Triet Tran, Tam V. Nguyen, Trung-Nghia Le

Comments AI & Society journal

2603.28754 2026-03-31 eess.SY cs.SY

Sparse State-Space Realizations of Linear Controllers

Yaozhi Du, Jing Shuang Li

Comments Submitted to 2026 CDC

2603.28753 2026-03-31 cs.NI

Iran's January 2026 Internet Shutdown: Public Data, Censorship Methods, and Circumvention Techniques

Giuseppe Aceto, Valerio Persico, Antonio Pescapè

Comments 12 pages, 3 figures, 1 table

2603.28747 2026-03-31 math.OC cs.SY eess.SY

Constrained Optimization on Matrix Lie Groups via Interior-Point Method

Aclécio J. Santos, Jean C. Pereira, Guilherme V. Raffo

Comments This is a preprint submitted to IEEE Control Systems Letters

2603.28744 2026-03-31 cs.LG

Stop Probing, Start Coding: Why Linear Probes and Sparse Autoencoders Fail at Compositional Generalisation

Vitória Barin Pacela, Shruti Joshi, Isabela Camacho, Simon Lacoste-Julien, David Klindt

2603.28740 2026-03-31 cs.RO

FocusVLA: Focused Visual Utilization for Vision-Language-Action Models

Yichi Zhang, Weihao Yuan, Yizhuo Zhang, Xidong Zhang, Jia Wan

Comments 25 pages, 18 figures

2603.28739 2026-03-31 cs.LG stat.ML

Expectation Error Bounds for Transfer Learning in Linear Regression and Linear Neural Networks

Meitong Liu, Christopher Jung, Rui Li, Xue Feng, Han Zhao

2603.28737 2026-03-31 eess.AS cs.AI cs.CL cs.SD

ParaSpeechCLAP: A Dual-Encoder Speech-Text Model for Rich Stylistic Language-Audio Pretraining

Anuj Diwan, Eunsol Choi, David Harwath

Comments Under review

2603.28735 2026-03-31 cs.SE cs.AI

RAD-AI: Rethinking Architecture Documentation for AI-Augmented Ecosystems

Oliver Aleksander Larsen, Mahyar T. Moghaddam

Comments Accepted at ANGE 2026, co-located with IEEE ICSA 2026. 8 pages

2603.28732 2026-03-31 cs.RO cs.CV

Pandora: Articulated 3D Scene Graphs from Egocentric Vision

Alan Yu, Yun Chang, Christopher Xie, Luca Carlone

Comments 14 pages, 5 figures. Presented at the 2025 British Machine Vision Conference (BMVC) in Sheffield, UK

2603.28731 2026-03-31 cs.SE cs.AI

SAGAI-MID: A Generative AI-Driven Middleware for Dynamic Runtime Interoperability

Oliver Aleksander Larsen, Mahyar T. Moghaddam

Comments Accepted at SAGAI 2026, co-located with IEEE ICSA 2026. 8 pages

2603.28728 2026-03-31 cs.NI

Study of Post Quantum status of Widely Used Protocols

Tushin Mallick, Ashish Kundu, Ramana Kompella

2603.28727 2026-03-31 cs.CR cs.DC cs.NI cs.SE

BitSov: A Composable Bitcoin-Native Architecture for Sovereign Internet Infrastructure

Oliver Aleksander Larsen, Rasmus Thorsen Larsen, Mahyar T. Moghaddam

Comments Accepted at BlockArch 2026, co-located with IEEE ICSA 2026. 4 pages

2603.28719 2026-03-31 eess.SY cs.SY

Alertness Optimization for Shift Workers Using a Physiology-based Mathematical Model

Zidi Tao, A. Agung Julius, John T Wen

Comments 35 pages single column, 9 figures

2603.28718 2026-03-31 cs.LG cs.AI cs.CV

Stepwise Credit Assignment for GRPO on Flow-Matching Models

Yash Savani, Branislav Kveton, Yuchen Liu, Yilin Wang, Jing Shi, Subhojyoti Mukherjee, Nikos Vlassis, Krishna Kumar Singh

Comments Accepted to the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2026. Project page: https://stepwiseflowgrpo.com

2603.28713 2026-03-31 cs.CV

DreamLite: A Lightweight On-Device Unified Model for Image Generation and Editing

Kailai Feng, Yuxiang Wei, Bo Chen, Yang Pan, Hu Ye, Songwei Liu, Chenqian Yan, Yuan Gao

Comments https://carlofkl.github.io/dreamlite/

2603.28709 2026-03-31 cs.AR

Physical Design of UET-RVMCU: A Streamlined Open-Source RISC-V Microcontroller

Abdullah Azhar, Uneeb Kamal, Wajid Ali, Saad Gillani, Dr Suleman Sami Qazi

2603.28708 2026-03-31 cs.LG cs.DC

GPU-Accelerated Optimization of Transformer-Based Neural Networks for Real-Time Inference

Soutrik Mukherjee, Sangwhan Cha

Comments 10 pages, 8 figures, 15 tables

2603.28706 2026-03-31 math.NA cs.NA physics.comp-ph

A Scalable Monolithic Modified Newton Multigrid Framework for Time-Dependent $p$-Navier-Stokes Flow

Nils Margenberg, Carolin Mehlmann

Comments 28 pages, 7 figures, 3 tables

2603.28700 2026-03-31 cs.DS

Improved Approximation Algorithms for Multiway Cut by Large Mixtures of New and Old Rounding Schemes

Joshua Brakensiek, Neng Huang, Aaron Potechin, Uri Zwick

Comments 49 pages, full version of STOC 2026 paper

English abstract

The input to the Multiway Cut problem is a weighted undirected graph, with nonnegative edge weights, and $k$ designated terminals. The goal is to partition the vertices of the graph into $k$ parts, each containing exactly one of the terminals, such that the sum of weights of the edges connecting vertices in different parts of the partition is minimized. The problem is APX-hard for $k\ge3$. The currently best known approximation algorithm for the problem for arbitrary $k$, obtained by Sharma and Vondrák [STOC 2014] more than a decade ago, has an approximation ratio of 1.2965. We present an algorithm with an improved approximation ratio of 1.2787. Also, for small values of $k \ge 4$ we obtain the first improvements in 25 years over the currently best approximation ratios obtained by Karger et al. [STOC 1999]. (For $k=3$ an optimal approximation algorithm is known.) Our main technical contributions are new insights on rounding the LP relaxation of Călinescu, Karloff, and Rabani [STOC 1998], whose integrality ratio matches Multiway Cut's approximability ratio, assuming the Unique Games Conjecture [Manokaran et al., STOC 2008]. First, we introduce a generalized form of a rounding scheme suggested by Kleinberg and Tardos [FOCS 1999] and use it to replace the Exponential Clocks rounding scheme used by Buchbinder et al. [STOC 2013] and by Sharma and Vondrák. Second, while previous algorithms use a mixture of two, three, or four basic rounding schemes, each from a different family of rounding schemes, our algorithm uses a computationally-discovered mixture of hundreds of basic rounding schemes, each parametrized by a random variable with a distinct probability distribution, including in particular many different rounding schemes from the same family. We give a completely rigorous analysis of our improved algorithms using a combination of analytical techniques and interval arithmetic.
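
To make the problem statement in this abstract concrete, the following is a minimal sketch of the Multiway Cut objective: it evaluates the weight of a partition and finds an optimal one by exhaustive search on a tiny instance. It is not the paper's LP-rounding algorithm; the function names (`cut_weight`, `brute_force_multiway_cut`) and the toy graph are hypothetical and chosen only for illustration, and the search is exponential in the number of non-terminal vertices.

```python
from itertools import product

def cut_weight(edges, assignment):
    """Sum the weights of edges whose endpoints lie in different parts."""
    return sum(w for u, v, w in edges if assignment[u] != assignment[v])

def brute_force_multiway_cut(vertices, edges, terminals):
    """Try every assignment of non-terminal vertices to the k terminal parts.

    Part i always contains terminals[i], so each part holds exactly one terminal.
    """
    k = len(terminals)
    free = [v for v in vertices if v not in terminals]
    best_cost, best_assignment = float("inf"), None
    for labels in product(range(k), repeat=len(free)):
        assignment = {t: i for i, t in enumerate(terminals)}
        assignment.update(zip(free, labels))
        cost = cut_weight(edges, assignment)
        if cost < best_cost:
            best_cost, best_assignment = cost, assignment
    return best_cost, best_assignment

# Hypothetical toy instance: three terminals a, b, c and one free vertex x.
vertices = ["a", "b", "c", "x"]
edges = [("a", "x", 3), ("b", "x", 2), ("c", "x", 1), ("a", "b", 1)]
terminals = ["a", "b", "c"]

best_cost, best_assignment = brute_force_multiway_cut(vertices, edges, terminals)
print(best_cost, best_assignment)
```

On this toy instance the edge between terminals a and b is cut in every feasible partition, so the only choice is which terminal's part receives x; attaching x to a keeps the heaviest edge uncut.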

2603.28698 2026-03-31 cs.CL

EpiScreen: Early Epilepsy Detection from Electronic Health Records with Large Language Models

Shuang Zhou, Kai Yu, Zaifu Zhan, Huixue Zhou, Min Zeng, Feng Xie, Zhiyi Sha, Rui Zhang

Comments 24 pages, 5 figures, 4 tables

2603.28696 2026-03-31 cs.CV cs.AI

AdaptToken: Entropy-based Adaptive Token Selection for MLLM Long Video Understanding

Haozhe Qi, Kevin Qu, Mahdi Rad, Rui Wang, Alexander Mathis, Marc Pollefeys

Comments Project page: https://haozheqi.github.io/adapt-token

2603.28691 2026-03-31 cs.RO

DRIVE-Nav: Directional Reasoning, Inspection, and Verification for Efficient Open-Vocabulary Navigation

Maoguo Gao, Zejun Zhu, Zhiming Sun, Zhengwei Ma, Longze Yuan, Zhongjing Ma, Zhigang Gao, Jinhui Zhang, Suli Zou

Comments 8 pages, 4 figures. Project page: https://coolmaoguo.github.io/drive-nav-page/

2603.28690 2026-03-31 cs.RO cs.CE

Vision-Based Robotic Disassembly Combined with Real-Time MFA Data Acquisition

Federico Zocco, Maria Pozzi, Monica Malvezzi

Comments Submitted