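arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.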

2604.01224 2026-04-02 cs.RO

Functional Force-Aware Retargeting from Virtual Human Demos to Soft Robot Policies

Uksang Yoo, Mengjia Zhu, Evan Pezent, Jom Preechayasomboon, Jean Oh, Jeffrey Ichnowski, Amir Memar, Ben Abbatematteo, Homanga Bharadhwaj, Ashish Deshpande, Harsha Prahlad

Abstract

We introduce SoftAct, a framework for teaching soft robot hands to perform human-like manipulation skills by explicitly reasoning about contact forces. Leveraging immersive virtual reality, our system captures rich human demonstrations, including hand kinematics, object motion, dense contact patches, and detailed contact force information. Unlike conventional approaches that retarget human joint trajectories, SoftAct employs a two-stage, force-aware retargeting algorithm. The first stage attributes demonstrated contact forces to individual human fingers and allocates robot fingers proportionally, establishing a force-balanced mapping between human and robot hands. The second stage performs online retargeting by combining baseline end-effector pose tracking with geodesic-weighted contact refinements, using contact geometry and force magnitude to adjust robot fingertip targets in real time. This formulation enables soft robotic hands to reproduce the functional intent of human demonstrations while naturally accommodating extreme embodiment mismatch and nonlinear compliance. We evaluate SoftAct on a suite of contact-rich manipulation tasks using a custom non-anthropomorphic pneumatic soft robot hand. SoftAct's controller reduces fingertip trajectory tracking RMSE by up to 55 percent and reduces tracking variance by up to 69 percent compared to kinematic and learning-based baselines. At the policy level, SoftAct achieves consistently higher success in zero-shot real-world deployment and in simulation. These results demonstrate that explicitly modeling contact geometry and force distribution is essential for effective skill transfer to soft robotic hands, and cannot be recovered through kinematic imitation alone. Project videos and additional details are available at https://soft-act.github.io/.
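
The first retargeting stage described above, allocating robot fingers in proportion to the contact force attributed to each human finger, can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' algorithm: the function name `allocate_robot_fingers`, the largest-remainder rounding rule, and the force values are all assumptions made for illustration.

```python
def allocate_robot_fingers(human_forces, n_robot_fingers):
    """Split robot fingers across human fingers in proportion to contact force.

    Illustrative only: uses largest-remainder rounding, which is an assumption
    of this sketch and not a detail taken from the SoftAct abstract.
    """
    total = sum(human_forces.values())
    if total <= 0:
        raise ValueError("total contact force must be positive")
    # Ideal (fractional) share of robot fingers for each human finger.
    quotas = {f: n_robot_fingers * v / total for f, v in human_forces.items()}
    # Start from the integer part of every share.
    allocation = {f: int(q) for f, q in quotas.items()}
    leftover = n_robot_fingers - sum(allocation.values())
    # Give the remaining fingers to the largest fractional remainders.
    ranked = sorted(quotas, key=lambda f: quotas[f] - allocation[f], reverse=True)
    for finger in ranked[:leftover]:
        allocation[finger] += 1
    return allocation


# Hypothetical per-finger force magnitudes in newtons.
demo_forces = {"thumb": 6.0, "index": 3.0, "middle": 1.0}
demo_allocation = allocate_robot_fingers(demo_forces, 3)
print(demo_allocation)
```

In this toy input the thumb carries most of the load, so it receives the largest share of robot fingers, while a lightly loaded finger may receive none.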

2604.01221 2026-04-02 cs.AI cs.CV

HippoCamp: Benchmarking Contextual Agents on Personal Computers

Zhe Yang, Shulin Tian, Kairui Hu, Shuai Liu, Hoang-Nhat Nguyen, Yichi Zhang, Zujin Guo, Mengying Yu, Zinan Zhang, Jingkang Yang, Chen Change Loy, Ziwei Liu

Comments: Project Page: https://hippocamp-ai.github.io/

2604.01220 2026-04-02 cs.CL

Universal YOCO for Efficient Depth Scaling

Yutao Sun, Li Dong, Tianzhu Ye, Shaohan Huang, Jianyong Wang, Furu Wei

2604.01217 2026-04-02 quant-ph cond-mat.other cs.IT hep-th math-ph math.IT math.MP

Conditional channel entropy sets fundamental limits on thermodynamic quantum information processing

Himanshu Badhani, Siddhartha Das

Comments: 33+20 pages, 1 table, 3 figures

2604.01216 2026-04-02 cs.LG cs.AI cs.CV

LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders (LAPIS-SHRED)

Yuxuan Bao, Xingyue Zhang, J. Nathan Kutz

Abstract

Reconstructing full spatio-temporal dynamics from sparse observations in both space and time remains a central challenge in complex systems, as measurements can be spatially incomplete and can also be limited to narrow temporal windows. Yet approximating the complete spatio-temporal trajectory is essential for mechanistic insight and understanding, model calibration, and operational decision-making. We introduce LAPIS-SHRED (LAtent Phase Inference from Short time sequences using SHallow REcurrent Decoders), a modular architecture that reconstructs and/or forecasts complete spatio-temporal dynamics from sparse sensor observations confined to short temporal windows. LAPIS-SHRED operates through a three-stage pipeline: (i) a SHRED model is pre-trained entirely on simulation data to map sensor time-histories into a structured latent space, (ii) a temporal sequence model, trained on simulation-derived latent trajectories, learns to propagate latent states forward or backward in time to span unobserved temporal regions from short observational time windows, and (iii) at deployment, only a short observation window of hyper-sparse sensor measurements from the true system is provided, from which the frozen SHRED model and the temporal model jointly reconstruct or forecast the complete spatio-temporal trajectory. The framework supports bidirectional inference, inherits data assimilation and multiscale reconstruction capabilities from its modular structure, and accommodates extreme observational constraints including single-frame terminal inputs. We evaluate LAPIS-SHRED on six experiments spanning complex spatio-temporal physics: turbulent flows, multiscale propulsion physics, volatile combustion transients, and satellite-derived environmental fields, highlighting a lightweight, modular architecture suited for operational settings where observation is constrained by physical or logistical limitations.

2604.01215 2026-04-02 cs.LG cs.AI physics.ao-ph

The Recipe Matters More Than the Kitchen: Mathematical Foundations of the AI Weather Prediction Pipeline

Piyush Garg, Diana R. Gergel, Andrew E. Shao, Galen J. Yacalis

Abstract

AI weather prediction has advanced rapidly, yet no unified mathematical framework explains what determines forecast skill. Existing theory addresses specific architectural choices rather than the learning pipeline as a whole, while operational evidence from 2023-2026 demonstrates that training methodology, loss function design, and data diversity matter at least as much as architecture selection. This paper makes two interleaved contributions. Theoretically, we construct a framework rooted in approximation theory on the sphere, dynamical systems theory, information theory, and statistical learning theory that treats the complete learning pipeline (architecture, loss function, training strategy, data distribution) rather than architecture alone. We establish a Learning Pipeline Error Decomposition showing that estimation error (loss- and data-dependent) dominates approximation error (architecture-dependent) at current scales. We develop a Loss Function Spectral Theory formalizing MSE-induced spectral blurring in spherical harmonic coordinates, and derive Out-of-Distribution Extrapolation Bounds proving that data-driven models systematically underestimate record-breaking extremes with bias growing linearly in record exceedance. Empirically, we validate these predictions via inference across ten architecturally diverse AI weather models using NVIDIA Earth2Studio with ERA5 initial conditions, evaluating six metrics across 30 initialization dates spanning all seasons. Results confirm universal spectral energy loss at high wavenumbers for MSE-trained models, rising Error Consensus Ratios showing that the majority of forecast error is shared across architectures, and linear negative bias during extreme events. A Holistic Model Assessment Score provides unified multi-dimensional evaluation, and a prescriptive framework enables mathematical evaluation of proposed pipelines before training.
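
The MSE-induced spectral blurring mentioned above can be motivated with a toy calculation. It is not the paper's derivation. It relies on the standard fact that the prediction minimizing expected squared error is the mean of the plausible outcomes. The setup is an assumption made for illustration: a single wave `cos(k * (x - s))` whose position shift `s` is uncertain and uniformly spread over a symmetric interval. The function name and the spread value are hypothetical.

```python
import math


def ensemble_mean_amplitude(wavenumber, shift_spread, n_members=2001):
    """Amplitude of the mean of cos(k * (x - s)) over shifts s in [-spread, spread].

    By symmetry of the shifts the mean is amplitude * cos(k * x), so it is
    enough to evaluate the average at x = 0.
    """
    step = 2.0 * shift_spread / (n_members - 1)
    shifts = [-shift_spread + i * step for i in range(n_members)]
    return sum(math.cos(wavenumber * s) for s in shifts) / n_members


# Same positional uncertainty (hypothetical value), increasing wavenumber.
spread = 0.5
amplitudes = {k: ensemble_mean_amplitude(k, spread) for k in (1, 2, 4)}
for k, a in amplitudes.items():
    print(k, round(a, 3))
```

For a fixed positional uncertainty, the averaged wave keeps most of its amplitude at low wavenumber and loses progressively more as the wavenumber grows, which is the qualitative pattern of energy loss at small scales.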

2604.01213 2026-04-02 cs.RO cs.MA

Collaborative Task and Path Planning for Heterogeneous Robotic Teams using Multi-Agent PPO

Matthias Rubio, Julia Richter, Hendrik Kolvenbach, Marco Hutter

Comments: 8 pages, 3 figures, associated code on https://github.com/leggedrobotics/multi_robot_global_planner

2604.01212 2026-04-02 cs.CL cs.AI

YC-Bench: Benchmarking AI Agents for Long-Term Planning and Consistent Execution

Muyu He, Adit Jain, Anand Kumar, Vincent Tu, Soumyadeep Bakshi, Sachin Patro, Nazneen Rajani

Comments: 16 pages, 10 figures

2604.01211 2026-04-02 eess.SY cs.SY

Making Every Bit Count for $A$-Optimal State Estimation

Cameron Khanpour, Daniel Turizo, Samuel Talkington

2604.01210 2026-04-02 cs.LG cs.AI

CliffSearch: Structured Agentic Co-Evolution over Theory and Code for Scientific Algorithm Discovery

Youssef Mroueh, Carlos Fonseca, Brian Belgodere, David Cox

2604.01207 2026-04-02 cs.CV

TRACE: High-Fidelity 3D Scene Editing via Tangible Reconstruction and Geometry-Aligned Contextual Video Masking

Jiyuan Hu, Zechuan Zhang, Zongxin Yang, Yi Yang

Comments: 22 pages, 9 figures

2604.01206 2026-04-02 cs.CL cs.LG

LLM REgression with a Latent Iterative State Head

Yiheng Su, Matthew Lease

2604.01205 2026-04-02 quant-ph cs.NA math.NA

Programmable Signal Design for Quantum Phase Estimation via Quantum Signal Processing

Zikang Jia, Suying Liu, Yulong Dong

Comments: 23 pages, 7 figures

2604.01200 2026-04-02 math.NA cs.NA math.AP

A Posteriori Error Analysis of Runge-Kutta Discontinuous Galerkin Schemes with SIAC Post-Processing for Nonlinear Convection-Diffusion Systems

Jan Giesselmann, Kiwoong Kwon, Sebastian Krumscheid

Comments: 21 pages, 1 figure, 10 tables

2604.01199 2026-04-02 math.NA cs.NA

A high-order, structure preserving scheme for the stochastic Galerkin shallow water equations -- unification and two-dimensional extension

Philipp Öffner, Per Pettersson, Andrew R. Winters

2604.01194 2026-04-02 cs.CR

AgentWatcher: A Rule-based Prompt Injection Monitor

Yanting Wang, Wei Zou, Runpeng Geng, Jinyuan Jia

Comments: The code is available at https://github.com/wang-yanting/AgentWatcher

2604.01193 2026-04-02 cs.CL

Embarrassingly Simple Self-Distillation Improves Code Generation

Ruixiang Zhang, Richard He Bai, Huangjie Zheng, Navdeep Jaitly, Ronan Collobert, Yizhe Zhang

2604.01188 2026-04-02 eess.SY cs.SY

Learning Neural Network Controllers with Certified Robust Performance via Adversarial Training

Neelay Junnarkar, Yasin Sonmez, Murat Arcak

2604.01186 2026-04-02 cs.DL cs.IR

From Validity to Inter-Subjectivity: An Argument for Reliability Signals in Search Environments

Frans van der Sluis

Comments: 4 pages. Extended abstract / conference paper for SEASON 2025 (September 24-25, 2025, Hamburg, Germany). Peer reviewed.

2602.02326 2026-04-02 cs.CL

Language Steering for Multilingual In-Context Learning

Neeraja Kirtane, Kuan-Hao Huang

2601.02728 2026-04-02 cs.LG

CRoPE: Efficient Parametrization of Rotary Positional Embedding

Beicheng Lou, Zifei Xu, Vivian W. H. Wong

2511.23445 2026-04-02 quant-ph cs.CC cs.LO math.CO

Quantum Polymorphisms and the Complexity of Quantum Constraint Satisfaction

Lorenzo Ciardo, Gideo Joubert, Antoine Mottet

Comments: We included several new results on quantum polymorphisms, quantum relational constructions, and the complexity of quantum CSPs

2511.08592 2026-04-02 cs.CL cs.AI

The Collective Turing Test: Large Language Models Can Generate Realistic Multi-User Discussions

Azza Bouleimen, Giordano De Marzo, Taehee Kim, Nicolò Pagan, Hannah Metzler, Silvia Giordano, Anikó Hannák, David Garcia

2505.20507 2026-04-02 cs.CV cs.AI

Electrolyzers-HSI: Close-Range Multi-Scene Hyperspectral Imaging Benchmark Dataset

Elias Arbash, Ahmed Jamal Afifi, Ymane Belahsen, Margret Fuchs, Pedram Ghamisi, Paul Scheunders, Richard Gloaguen

DOI: 10.1038/s41597-025-06279-9
Journal ref: Sci Data 12, 1818 (2025)

Abstract

The global challenge of sustainable recycling demands automated, fast, accurate, state-of-the-art (SOTA) material detection systems that act as a bedrock for a circular economy. Democratizing access to these cutting-edge solutions that enable real-time waste analysis is essential for scaling up recycling efforts and fostering the Green Deal. In response, we introduce Electrolyzers-HSI, a novel multimodal benchmark dataset designed to accelerate the recovery of critical raw materials through accurate electrolyzer materials classification. The dataset comprises 55 co-registered high-resolution RGB images and hyperspectral imaging (HSI) data cubes spanning the 400-2500 nm spectral range, yielding over 4.2 million pixel vectors and 424,169 labeled ones. This enables non-invasive spectral analysis of shredded electrolyzer samples, supporting quantitative and qualitative material classification and spectral properties investigation. We evaluate a suite of baseline machine learning (ML) methods alongside SOTA transformer-based deep learning (DL) architectures, including Vision Transformer, SpectralFormer, and the Multimodal Fusion Transformer, to investigate architectural bottlenecks for further efficiency optimisation when deploying transformers in material identification. We implement zero-shot detection techniques and majority voting across pixel-level predictions to establish object-level classification robustness. In adherence to the FAIR data principles, the Electrolyzers-HSI dataset and accompanying codebase are openly available at https://github.com/hifexplo/Electrolyzers-HSI and https://rodare.hzdr.de/record/3668, supporting reproducible research and facilitating the broader adoption of smart and sustainable e-waste recycling solutions.
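
The majority-voting step mentioned above, turning pixel-level predictions into one label per object, can be sketched as follows. This is a minimal illustration, not the code from the linked repository: the function name `object_level_labels`, the `(object_id, label)` input layout, and the class names are assumptions.

```python
from collections import Counter, defaultdict


def object_level_labels(pixel_predictions):
    """Map each object ID to the most frequent label among its pixels."""
    labels_by_object = defaultdict(list)
    for object_id, label in pixel_predictions:
        labels_by_object[object_id].append(label)
    return {
        object_id: Counter(labels).most_common(1)[0][0]
        for object_id, labels in labels_by_object.items()
    }


# Hypothetical (object_id, predicted_class) pairs; class names are made up.
demo_pixels = [
    (0, "steel"), (0, "steel"), (0, "mesh"),
    (1, "membrane"), (1, "membrane"), (1, "steel"),
]
demo_objects = object_level_labels(demo_pixels)
print(demo_objects)
```

A few misclassified pixels inside an object do not change its label as long as the correct class keeps the plurality. On a tie, `Counter.most_common` returns the label encountered first, so a real pipeline may want an explicit tie-breaking rule.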

2503.19115 2026-04-02 q-bio.MN cs.NE

Implementation of Support Vector Machines using Reaction Networks

Amey Choudhary, Jiaxin Jin, Abhishek Deshpande

Comments: 28 pages, 4 figures, 1 table

1809.03377 2026-04-02 math.NA cs.NA

Isogeometric Simulation and Shape Optimization with Applications to Electrical Machines

Peter Gangl, Ulrich Langer, Angelos Mantzaflaris, Rainer Schneckenleitner

1609.06236 2026-04-02 math.NA cs.NA

A Local Mesh Modification Strategy for Interface Problems with Application to Shape and Topology Optimization

Peter Gangl, Ulrich Langer

Comments: 8 pages, 2 figures, submitted to proceedings of SCEE (Scientific Computing in Electrical Engineering) 2016 in Strobl, Austria

2604.01183 2026-04-02 cs.HC

Assessing Affective Objectives for Communicative Visualizations

Elsie Lee-Robbins, Eytan Adar

2604.01181 2026-04-02 cs.HC cs.CL cs.CV

True (VIS) Lies: Analyzing How Generative AI Recognizes Intentionality, Rhetoric, and Misleadingness in Visualization Lies

Graziano Blasilli, Marco Angelini

Abstract

This study investigates the ability of multimodal Large Language Models (LLMs) to identify and interpret misleading visualizations, and recognize these observations along with their underlying causes and potential intentionality. Our analysis leverages concepts from visualization rhetoric and a newly developed taxonomy of authorial intents as explanatory lenses. We formulated three research questions and addressed them experimentally using a dataset of 2,336 COVID-19-related tweets, half of which contain misleading visualizations, and supplemented it with real-world examples of perceptual, cognitive, and conceptual errors drawn from VisLies, the IEEE VIS community event dedicated to showcasing deceptive and misleading visualizations. To ensure broad coverage of the current LLM landscape, we evaluated 16 state-of-the-art models. Among them, 15 are open-weight models, spanning a wide range of model sizes, architectural families, and reasoning capabilities. The selection comprises small models, namely Nemotron-Nano-V2-VL (12B parameters), Mistral-Small-3.2 (24B), DeepSeek-VL2 (27B), Gemma3 (27B), and GTA1 (32B); medium-sized models, namely Qianfan-VL (70B), Molmo (72B), GLM-4.5V (108B), LLaVA-NeXT (110B), and Pixtral-Large (124B); and large models, namely Qwen3-VL (235B), InternVL3.5 (241B), Step3 (321B), Llama-4-Maverick (400B), and Kimi-K2.5 (1000B). In addition, we employed OpenAI GPT-5.4, a frontier proprietary model. To establish a human perspective on these tasks, we also conducted a user study with visualization experts to assess how people perceive rhetorical techniques and the authorial intentions behind the same misleading visualizations. This allows comparison between model and expert behavior, revealing similarities and differences that provide insights into where LLMs align with human judgment and where they diverge.

2604.01180 2026-04-02 math.NA cs.NA

On the error of the Euler scheme for approximation of solutions of nonlinear DDEs under inexact information

Paweł Przybyłowicz, Martyna Wiącek