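arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.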

2603.07644 2026-03-10 cs.RO

PanoDP: Learning Collision-Free Navigation with Panoramic Depth and Differentiable Physics

Hao Zhong, Pei Chi, Jiang Zhao, Shenghai Yuan, Xuyang Gao, Thien-Minh Nguyen, Lihua Xie

English abstract

Autonomous collision-free navigation in cluttered environments requires safe decision-making under partial observability with both static structure and dynamic obstacles. We present PanoDP, a communication-free learning framework that combines four-view panoramic depth perception with differentiable-physics-based training signals. PanoDP encodes panoramic depth using a lightweight CNN and optimizes policies with dense differentiable collision and motion-feasibility terms, improving training stability beyond sparse terminal collisions. We evaluate PanoDP on a controlled ring-to-center benchmark with systematic sweeps over agent count, obstacle density/layout, and dynamic behaviors, and further test out-of-distribution generalization in an external simulator (e.g., AirSim). Across settings, PanoDP increases collision-free and completion rates over single-view and non-physics-guided baselines under matched training budgets, and ablations (view masking, rotation augmentation) confirm the policy leverages 360-degree information. Code will be open source upon acceptance.


2603.07642 2026-03-10 cs.LG

Helix: Evolutionary Reinforcement Learning for Open-Ended Scientific Problem Solving

Chang Su, Zhongkai Hao, Zhizhou Zhang, Zeyu Xia, Youjia Wu, Hang Su, Jun Zhu

Comments: Accepted at ICLR 2026

2603.07630 2026-03-10 cs.CV

Real-Time Glottis Detection Framework via Spatial-decoupled Feature Learning for Nasal Transnasal Intubation

Jinyu Liu, Gaoyang Zhang, Yang Zhou, Ruoyi Hao, Yang Zhang, Hongliang Ren

Comments: 15 pages, 7 figures

2603.07629 2026-03-10 cs.RO cs.LG

Exoskeleton Control through Learning to Reduce Biological Joint Moments in Simulations

Zihang You, Xianlian Zhou

English abstract

Data-driven joint-moment predictors offer a scalable alternative to laboratory-based inverse-dynamics pipelines for biomechanics estimation and exoskeleton control. Meanwhile, physics-based reinforcement learning (RL) enables simulation-trained controllers to learn dynamics-aware assistance strategies without extensive human experimentation. However, quantitative verification of simulation-trained exoskeleton torque predictors, and their impact on human joint power injection, remains limited. This paper presents (1) an RL framework to learn exoskeleton assistance policies that reduce biological joint moments, and (2) a validation pipeline that verifies the trained control networks using an open-source gait dataset through inference and comparison with biological joint moments. Simulation-trained multilayer perceptron (MLP) controllers are developed for level-ground and ramp walking, mapping short-horizon histories of bilateral hip and knee kinematics to normalized assistance torques. Results show that predicted assistance preserves task-intensity trends across speeds and inclines. Agreement is particularly strong at the hip, with cross-correlation coefficients reaching 0.94 at 1.8 m/s and 0.98 during 5° decline walking, demonstrating near-matched temporal structure. Discrepancies increase at higher speeds and steeper inclines, especially at the knee, and are more pronounced in joint power comparisons. Delay tuning biases assistance toward greater positive power injection; modest timing shifts increase positive power and improve agreement in specific gait intervals. Together, these results establish a quantitative validation framework for simulation-trained exoskeleton controllers, demonstrate strong sim-to-data consistency at the torque level, and highlight both the promise and the remaining challenges for sim-to-real transfer.


2603.07625 2026-03-10 cs.CV

Duala: Dual-Level Alignment of Subjects and Stimuli for Cross-Subject fMRI Decoding

Shumeng Li, Jintao Guo, Jian Zhang, Yulin Zhou, Luyang Cao, Yinghuan Shi

2603.07624 2026-03-10 cs.RO

GeoLoco: Leveraging 3D Geometric Priors from Visual Foundation Model for Robust RGB-Only Humanoid Locomotion

Yufei Liu, Xieyuanli Chen, Hainan Pan, Chenghao Shi, Yanjie Chen, Kaihong Huang, Zhiwen Zeng, Huimin Lu

Comments: 8 pages, 6 figures, conference

2603.07618 2026-03-10 cs.RO cs.AI cs.LG

SMAT: Staged Multi-Agent Training for Co-Adaptive Exoskeleton Control

Yifei Yuan, Ghaith Androwis, Xianlian Zhou

2603.07614 2026-03-10 cs.CV

Looking Into the Water by Unsupervised Learning of the Surface Shape

Ori Lifschitz, Tali Treibitz, Dan Rosenbaum

2603.07612 2026-03-10 cs.CL

KohakuRAG: A simple RAG framework with hierarchical document indexing

Shih-Ying Yeh, Yueh-Feng Ku, Ko-Wei Huang, Buu-Khang Tu

Comments: 38 pages

2603.07606 2026-03-10 cs.LG

TT-Sparse: Learning Sparse Rule Models with Differentiable Truth Tables

Hans Farrell Soegeng, Sarthak Ketanbhai Modi, Thomas Peyrin

2603.07604 2026-03-10 cs.CV

EmbedTalk: Triplane-Free Talking Head Synthesis using Embedding-Driven Gaussian Deformation

Arpita Saggar, Jonathan C. Darling, Duygu Sarikaya, David C. Hogg

Comments: Preprint

2603.07599 2026-03-10 cs.CL

StyleBench: Evaluating Speech Language Models on Conversational Speaking Style Control

Haishu Zhao, Aokai Hao, Yuan Ge, Zhenqiang Hong, Tong Xiao, Jingbo Zhu

2603.07598 2026-03-10 cs.AI cs.LG

Shorter Thoughts, Same Answers: Difficulty-Scaled Segment-Wise RL for CoT Compression

Ye Tian, Aijun Liu

Comments: 12 pages, 3 figures. Preprint. Code available at the GitHub project repository

2603.07593 2026-03-10 cs.CV

Fast Attention-Based Simplification of LiDAR Point Clouds for Object Detection and Classification

Z. Rozsa, Á. Madaras, Q. Wei, X. Lu, M. Golarits, H. Yuan, T. Sziranyi, R. Hamzaoui

2603.07590 2026-03-10 cs.CV cs.LG

Models as Lego Builders: Assembling Malice from Benign Blocks via Semantic Blueprints

Chenxi Li, Xianggan Liu, Dake Shen, Yaosong Du, Zhibo Yao, Hao Jiang, Linyi Jiang, Chengwei Cao, Jingzhe Zhang, RanYi Peng, Peiling Bai, Xiande Huang

2603.07587 2026-03-10 cs.CV

3DGS-HPC: Distractor-free 3D Gaussian Splatting with Hybrid Patch-wise Classification

Jiahao Chen, Yipeng Qin, Ganlong Zhao, Xin Li, Wenping Wang, Guanbin Li

2603.07580 2026-03-10 cs.RO

FeasibleCap: Real-Time Embodiment Constraint Guidance for In-the-Wild Robot Demonstration Collection

Zi Yin, Fanhong Li, Yun Gui, Jia Liu

2603.07577 2026-03-10 cs.CV cs.AI cs.LG

Integration of deep generative Anomaly Detection algorithm in high-speed industrial line

Niccolò Ferrari, Nicola Zanarini, Michele Fraccaroli, Alice Bizzarri, Evelina Lamma

Comments: Preprint under review at a Springer Nature journal. 36 pages, 3 tables, 29 figures. Updated and expanded version of the SSRN preprint (abstract_id=4858664), with substantial revisions and Springer Nature formatting

2603.07572 2026-03-10 cs.LG

TS-MLLM: A Multi-Modal Large Language Model-based Framework for Industrial Time-Series Big Data Analysis

Haiteng Wang, Yikang Li, Yunfei Zhu, Jingheng Yan, Lei Ren, Laurence T. Yang

2603.07570 2026-03-10 cs.CV

Efficient RGB-D Scene Understanding via Multi-task Adaptive Learning and Cross-dimensional Feature Guidance

Guodong Sun, Junjie Liu, Gaoyang Zhang, Bo Wu, Yang Zhang

Comments: 23 pages, 13 figures

2603.07568 2026-03-10 cs.LG

Constraints Matrix Diffusion based Generative Neural Solver for Vehicle Routing Problems

Zhenwei Wang, Tiehua Zhang, Ning Xue, Ender Ozcan, Ling Wang, Ruibin Bai

English abstract

Over the past decade, neural network solvers powered by generative artificial intelligence have garnered significant attention in the domain of vehicle routing problems (VRPs), owing to their exceptional computational efficiency and superior reasoning capabilities. In particular, autoregressive solvers integrated with reinforcement learning have emerged as a prominent trend. However, much of the existing work emphasizes large-scale generalization of neural approaches while neglecting the limited robustness of attention-based methods across heterogeneous distributions of problem parameters. Their improvements over heuristic search remain largely restricted to hand-curated, fixed-distribution benchmarks. Furthermore, these architectures tend to degrade significantly when node representations are highly similar or when tasks involve long decision horizons. To address the aforementioned limitations, we propose a novel fusion neural network framework that employs a discrete noise graph diffusion model to learn the underlying constraints of vehicle routing problems and generate a constraint assignment matrix. This matrix is subsequently integrated adaptively into the feature representation learning and decision process of the autoregressive solver, serving as a graph structure mask that facilitates the formation of solutions characterized by both global vision and local feature integration. To the best of our knowledge, this work represents the first comprehensive experimental investigation of neural network model solvers across a 378-combinatorial space spanning four distinct dimensions within the CVRPlib public dataset. Extensive experimental evaluations demonstrate that our proposed fusion model effectively captures and leverages problem constraints, achieving state-of-the-art performance across multiple benchmark datasets.


2603.07566 2026-03-10 cs.CV cs.AI cs.LG

GRD-Net: Generative-Reconstructive-Discriminative Anomaly Detection with Region of Interest Attention Module

Niccolò Ferrari, Michele Fraccaroli, Evelina Lamma

Comments: Peer-reviewed journal version published. 18 pages, 12 figures, 7 tables

Details

DOI: 10.1155/2023/7773481
Journal ref: International Journal of Intelligent Systems, vol. 2023, Article ID 7773481, 2023

English abstract

Anomaly detection is increasingly used in industrial applications and processes. One of its main application fields is visual inspection for surface anomaly detection, which aims to spot regions that deviate from regularity and consequently identify abnormal products. Defect localization is a key task that is usually achieved by a basic comparison between the generated image and the original one, implementing blob-analysis or image-editing algorithms in the post-processing step; this approach is heavily biased towards the source dataset and is unable to generalize. Furthermore, in industrial applications, the whole image is not always of interest; often only one or a few regions of interest (ROIs) contain relevant anomalies to be spotted. For these reasons, we propose a new architecture composed of two blocks. The first block is a Generative Adversarial Network (GAN), based on a residual autoencoder (ResAE), that performs reconstruction and denoising, while the second block produces an image segmentation that spots defects. This method learns from a dataset composed of good products and generated synthetic defects. The discriminative network is trained using an ROI for each image in the training dataset, so that the network learns in which areas anomalies are relevant. This approach guarantees a reduction in the use of pre-processing algorithms, formerly developed with blob-analysis and image-editing procedures. To test our model, we used the challenging MVTec anomaly detection datasets and a large industrial dataset of pharmaceutical BFS strips of vials. This set constitutes a more realistic use case of the aforementioned network.


2603.07564 2026-03-10 cs.CV

SiamGM: Siamese Geometry-Aware and Motion-Guided Network for Real-Time Satellite Video Object Tracking

Zixiao Wen, Zhen Yang, Jiawei Li, Xiantai Xiang, Guangyao Zhou, Yuxin Hu, Yuhan Liu

Comments: This work has been submitted to the IEEE for possible publication

2603.07562 2026-03-10 cs.CV

Brain-WM: Brain Glioblastoma World Model

Chenhui Wang, Boyun Zheng, Liuxin Bao, Zhihao Peng, Peter Y. M. Woo, Hongming Shan, Yixuan Yuan

2603.07559 2026-03-10 cs.CV

Active Inference for Micro-Gesture Recognition: EFE-Guided Temporal Sampling and Adaptive Learning

Weijia Feng, Jingyu Yang, Ruojia Zhang, Fengtao Sun, Qian Gao, Chenyang Wang, Tongtong Su, Jia Guo, Xiaobai Li, Minglai Shao

Comments: 10 pages, accepted by CVPR 2026

2603.07558 2026-03-10 cs.LG

ECG Classification on PTB-XL: A Data-Centric Approach with Simplified CNN-VAE

Naqcho Ali Mehdi, Amir Ali

2603.07552 2026-03-10 cs.CV cs.RO

ReconDrive: Fast Feed-Forward 4D Gaussian Splatting for Autonomous Driving Scene Reconstruction

Haibao Yu, Kuntao Xiao, Jiahang Wang, Ruiyang Hao, Yuxin Huang, Guoran Hu, Haifang Qin, Bowen Jing, Yuntian Bo, Ping Luo

2603.07550 2026-03-10 cs.CL cs.AI

Learning-free L2-Accented Speech Generation using Phonological Rules

Thanathai Lertpetchpun, Yoonjeong Lee, Jihwan Lee, Tiantian Feng, Dani Byrd, Shrikanth Narayanan

Comments: Submitted to Interspeech 2026

2603.07546 2026-03-10 cs.AI cs.LG

COOL-MC: Verifying and Explaining RL Policies for Multi-bridge Network Maintenance

Dennis Gross

2603.07545 2026-03-10 cs.CV cs.AI cs.LG

DreamSAC: Learning Hamiltonian World Models via Symmetry Exploration

Jinzhou Tang, Fan Feng, Minghao Fu, Wenjun Lin, Biwei Huang, Keze Wang

Comments: 19 pages, 5 figures