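arXivDaily: a daily academic digest synced with the full arXiv feed, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.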

2602.21599 2026-03-10 cs.RO cs.CV

Iterative Closed-Loop Motion Synthesis for Scaling the Capabilities of Humanoid Control

Weisheng Xu, Qiwei Wu, Jiaxi Zhang, Tan Jing, Yangfan Li, Yuetong Fang, Jiaqi Xiong, Kai Wu, Rong Ou, Renjing Xu

Abstract

Physics-based humanoid control relies on training with motion datasets that have diverse data distributions. However, the fixed difficulty distribution of these datasets limits the performance ceiling of the trained control policies. In addition, acquiring high-quality data through professional motion capture systems is costly, which makes it difficult to scale. To address these issues, we propose a closed-loop framework that automatically generates motion data and iteratively refines it. The framework can generate high-quality motion data with rich action semantics, including martial arts, dance, combat, sports, gymnastics, and more. Furthermore, it iteratively raises the difficulty of both policies and data using physical metrics and objective evaluations, allowing the trained tracker to break through its original difficulty limits. On the PHC single-primitive tracker, using only approximately 1/10 of the AMASS dataset size, the average failure rate on the test set (2201 clips) is reduced by 45% compared to the baseline. Finally, we conduct comprehensive ablation and comparative experiments to highlight the rationality and advantages of our framework.

2602.20980 2026-03-10 cs.CV cs.AI

CrystaL: Spontaneous Emergence of Visual Latents in MLLMs

Yang Zhang, Danyang Li, Yuxuan Li, Xin Zhang, Tianyu Xie, Mingming Cheng, Xiang Li

2602.15895 2026-03-10 cs.CL cs.AI

Understand Then Memory: A Cognitive Gist-Driven RAG Framework with Global Semantic Diffusion

Pengcheng Zhou, Haochen Li, Zhiqiang Nie, JiaLe Chen, Qing Gong, Weizhen Zhang, Chun Yu

2602.10480 2026-03-10 cs.CL

Neuro-Symbolic Synergy for Interactive World Modeling

Hongyu Zhao, Siyu Zhou, Haolin Yang, Zengyi Qin, Tianyi Zhou

2602.08419 2026-03-10 cs.LG cs.NA math.NA

Radial Müntz-Szász Networks: Neural Architectures with Learnable Power Bases for Multidimensional Singularities

Gnankan Landry Regis N'guessan, Bum Jun Kim

Comments 52 pages, 15 figures

2602.05760 2026-03-10 cs.RO cs.HC

Task-Oriented Robot-Human Handovers on Legged Manipulators

Andreea Tulbure, Carmen Scheidemann, Elias Steiner, Marco Hutter

Comments Accepted to 21st ACM/IEEE International Conference on Human-Robot Interaction (HRI) 2026

2602.03036 2026-03-10 cs.CL cs.LG cs.MA

LatentMem: Customizing Latent Memory for Multi-Agent Systems

Muxin Fu, Xiangyuan Xue, Yafu Li, Zefeng He, Siyuan Huang, Xiaoye Qu, Yu Cheng, Yang Yang

2602.02026 2026-03-10 cs.RO

Synchronized Online Friction Estimation and Adaptive Grasp Control for Robust Gentle Grasp

Zhenwei Niu, Xiaoyi Chen, Jiayu Hu, Zhaoyang Liu, Tang Jian, Xiaozu Ju

2601.23014 2026-03-10 cs.LG cs.CL

Mem-T: Densifying Rewards for Long-Horizon Memory Agents

Yanwei Yue, Boci Peng, Xuanbo Fan, Jiaxin Guo, Qiankun Li, Yan Zhang

2512.23718 2026-03-10 cs.LG cs.NI

Network Traffic Analysis with Process Mining: The UPSIDE Case Study

Francesco Vitale, Paolo Palmiero, Massimiliano Rak, Nicola Mazzocca

Comments Accepted to the 2026 IEEE International Instrumentation and Measurement Technology Conference (I2MTC 2026)

2512.22179 2026-03-10 cs.LG cs.CR

Latent Sculpting for Zero-Shot Generalization: A Manifold Learning Approach to Out-of-Distribution Anomaly Detection

Rajeeb Thapa Chhetri, Saurab Thapa, Avinash Kumar, Zhixiong Chen

Comments 8 pages, 3 figures. Code available at: https://github.com/Rajeeb321123/Latent_sculpting_using_two_stage_method

Abstract

A critical vulnerability of supervised deep learning in high-dimensional tabular domains is "generalization collapse": models form precise decision boundaries around known training distributions but fail catastrophically when encountering Out-of-Distribution (OOD) data. To overcome this, we propose Latent Sculpting, a hierarchical, two-stage representation learning architecture designed to enforce explicit structural boundaries prior to density estimation. In the first stage, a Transformer-based tabular encoder is trained using our novel Binary Latent Sculpting loss. This objective explicitly condenses benign network traffic into a dense, low-entropy hypersphere while enforcing a strict geometric minimum-distance margin for anomalous patterns. In the second stage, a Masked Autoregressive Flow (MAF) maps this structurally optimized manifold to calculate exact, probabilistic anomaly thresholds. We evaluate this methodology on the CIC-IDS-2017 benchmark under a rigorous zero-shot protocol, deliberately withholding complex attack classes during training to test true OOD generalization. Averaged across three random initialization seeds to ensure statistical robustness, our framework maintains near-perfect classification on known signatures (F1 = 0.980 +/- 0.000) while achieving an overall zero-shot OOD F1-Score of 0.867 +/- 0.021 and an AUROC of 0.913 +/- 0.010 at an 85th-percentile threshold. Most notably, the model achieves an average recall of 78.7% (peaking at 97.2%) on stealthy "Infiltration" attacks and over 94% on low-volume DoS variations - complex distributional shifts where standard supervised and unsupervised baselines historically suffer near-total detection failure. These empirical results demonstrate that explicitly decoupling topological manifold structuring from probabilistic density estimation establishes a highly stable and scalable defense against zero-day cyber threats.
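The abstract does not give the Binary Latent Sculpting loss or the thresholding rule in closed form, so the following is only a minimal sketch of the two ideas it names: pulling benign embeddings into a compact region while keeping anomalous ones at least a margin away, and cutting anomaly scores at a percentile. The function names, the origin-centered hypersphere, the squared-hinge form, and the choice to take the percentile over benign scores are all assumptions for illustration, not the authors' implementation.

```python
import math

def sculpting_loss(embeddings, labels, margin=1.0):
    """Hypothetical margin objective (an assumption, not the paper's loss).

    Benign points (label 0) are penalized by their squared distance to the
    origin; anomalous points (label 1) are penalized only when they fall
    closer than `margin` to the origin.
    """
    total = 0.0
    for z, y in zip(embeddings, labels):
        dist = math.sqrt(sum(v * v for v in z))
        if y == 0:
            total += dist ** 2                     # compactness term
        else:
            total += max(0.0, margin - dist) ** 2  # minimum-distance hinge
    return total / len(embeddings)

def percentile_threshold(scores, q=85.0):
    """q-th percentile of `scores`, with linear interpolation between ranks."""
    s = sorted(scores)
    pos = (len(s) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return s[lo] + (s[hi] - s[lo]) * (pos - lo)

def is_anomalous(score, threshold):
    """Flag a sample whose anomaly score exceeds the chosen threshold."""
    return score > threshold
```

In this sketch a sample would be flagged when its score exceeds `percentile_threshold(benign_scores, 85.0)`; in the paper the scores come from the second-stage Masked Autoregressive Flow, which is not modeled here.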

2512.15111 2026-03-10 cs.RO cs.CV

BEV-Patch-PF: Particle Filtering with BEV-Aerial Feature Matching for Off-Road Geo-Localization

Dongmyeong Lee, Jesse Quattrociocchi, Christian Ellis, Rwik Rana, Amanda Adkins, Adam Uccello, Garrett Warnell, Joydeep Biswas

2512.10054 2026-03-10 cs.AI cs.CL

Parallel Decoder Transformer: Planner-Seeded Latent Coordination for Synchronized Parallel Decoding

Logan Robbins

Comments Note: Updated to reflect revised architecture

2512.01738 2026-03-10 cs.LG

MSPT: Efficient Large-Scale Physical Modeling via Parallelized Multi-Scale Attention

Pedro M. P. Curvo, Jan-Willem van de Meent, Maksim Zhdanov

2512.01034 2026-03-10 cs.LG cs.AI

AltNet: Addressing the Plasticity-Stability Dilemma in Reinforcement Learning

Mansi Maheshwari, John C. Raisbeck, Bruno Castro da Silva

2512.00969 2026-03-10 cs.AI cs.SY eess.SY

Integrating a Causal Foundation Model into a Prescriptive Maintenance Framework for Optimising Production-Line OEE

Felix Saretzky, Lucas Andersen, Thomas Engel, Fazel Ansari

Comments 9 pages, 3 images, 1 table, conference paper

2512.00927 2026-03-10 cs.CV

LAHNet: Local Attentive Hashing Network for Point Cloud Registration

Wentao Qu, Xiaoshui Huang, Liang Xiao

2511.22854 2026-03-10 cs.LG cs.MA

CRAwDAD: Causal Reasoning Augmentation with Dual-Agent Debate

Finn G. Vamosi, Nils D. Forkert

Comments 12 pages, 8 figures. Code available at https://github.com/finnvamosi/CRAwDAD

Abstract

When people reason about cause and effect, they often consider many competing "what if" scenarios before deciding which explanation fits best. Analogously, advanced language models capable of causal inference can consider multiple interventions and counterfactuals to judge the validity of causal claims. Crucially, this type of reasoning is less like a single calculation and more like an internal dialogue between alternative hypotheses. In this paper, we make this dialogue explicit through a dual-agent debate framework where one model provides a structured causal inference, and the other critically examines this reasoning for logical flaws. When disagreements arise, the agents attempt to persuade each other, challenging each other's logic and revising their conclusions until they converge on a mutually agreed answer. To take advantage of this deliberative process, we specifically use reasoning language models, whose strengths in both causal inference and adversarial debate remain under-explored relative to standard large language models. We evaluate our approach on the CLadder dataset, a benchmark linking natural language questions to formally defined causal graphs across all three rungs of Pearl's ladder of causation. With Qwen3 and DeepSeek-R1 as debater agents, we demonstrate that multi-agent debate improves DeepSeek-R1's overall accuracy in causal inference from 78.03% to 87.45%, with the counterfactual category specifically improving from 67.94% to 80.04% accuracy. Similarly, Qwen3's overall accuracy improves from 84.16% to 89.41%, and counterfactual questions from 71.53% to 80.35%, showing that even strong models can still benefit greatly from debate with weaker agents. Our results highlight the potential of reasoning models as building blocks for multi-agent systems in causal inference, and demonstrate the importance of diverse perspectives in causal problem-solving.
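The debate procedure described in the abstract can be sketched as a simple loop: one agent answers, the other responds, and they keep exchanging answers until they agree or a round limit is reached. This is a hedged illustration only. The `debate` function, the `(question, other_answer) -> answer` agent signature, and the `max_rounds` cap are hypothetical; the abstract does not specify how the agents are prompted or how non-convergence is handled.

```python
def debate(question, proposer, critic, max_rounds=3):
    """Run a two-agent debate until the answers match or rounds run out.

    Each agent is a callable taking (question, other_answer) and returning
    its own answer. This interface is an assumption for illustration.
    Returns (final_answer, converged).
    """
    answer_a = proposer(question, None)      # initial causal inference
    answer_b = critic(question, answer_a)    # critical examination
    for _ in range(max_rounds):
        if answer_a == answer_b:
            return answer_a, True
        # Each side sees the other's latest answer and may revise its own.
        answer_a = proposer(question, answer_b)
        answer_b = critic(question, answer_a)
    return answer_a, answer_a == answer_b

# Stub agents standing in for the reasoning models.
def persuadable(question, other_answer):
    return other_answer if other_answer is not None else "no"

def stubborn(question, other_answer):
    return "yes"
```

With these stubs, `debate("Does X cause Y?", persuadable, stubborn)` converges once the persuadable agent adopts the other side's answer; real agents would be calls to the debater models named in the abstract.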

2511.21194 2026-03-10 cs.CV cs.AI

BotaCLIP: Contrastive Learning for Botany-Aware Representation of Earth Observation Data

Selene Cerna, Sara Si-Moussi, Wilfried Thuiller, Hadrien Hendrikx, Vincent Miele

2511.07405 2026-03-10 cs.CL cs.CY

SPOT: An Annotated French Corpus and Benchmark for Detecting Critical Interventions in Online Conversations

Manon Berriche, Célia Nouri, Chloé Clavel, Jean-Philippe Cointet

2511.06325 2026-03-10 cs.CV cs.AI cs.CY

Detecting AI-Generated Images via Contextual Anomaly Estimation in Masked AutoEncoders

Minsuk Jang, Hyunseo Jeong, Minseok Son, Changick Kim

2510.24232 2026-03-10 cs.CV

Delving into Cascaded Instability: A Lipschitz Continuity View on Image Restoration and Object Detection Synergy

Qing Zhao, Weijian Deng, Pengxu Wei, ZiYi Dong, Hannan Lu, Xiangyang Ji, Liang Lin

Comments NeurIPS 2025

2510.12363 2026-03-10 cs.RO cs.LG

Pretraining in Actor-Critic Reinforcement Learning for Robot Locomotion

Jiale Fan, Andrei Cramariuc, Tifanny Portela, Marco Hutter

2510.11892 2026-03-10 cs.CL

R-WoM: Retrieval-augmented World Model For Computer-use Agents

Kai Mei, Jiang Guo, Shuaichen Chang, Mingwen Dong, Dongkyu Lee, Xing Niu, Jiarong Jiang

2510.11549 2026-03-10 cs.CV

ODI-Bench: Can MLLMs Understand Immersive Omnidirectional Environments?

Liu Yang, Huiyu Duan, Ran Tao, Juntao Cheng, Sijing Wu, Yunhao Li, Jing Liu, Xiongkuo Min, Guangtao Zhai

2510.07896 2026-03-10 cs.CL

ACE: Attribution-Controlled Knowledge Editing for Multi-hop Factual Recall

Jiayu Yang, Yuxuan Fan, Songning Lai, Shengen Wu, Jiaqi Tang, Chun Kang, Zhijiang Guo, Yutao Yue

Comments Accepted by ICLR 2026

2510.04543 2026-03-10 cs.LG stat.ML

The Role of Feature Interactions in Graph-based Tabular Deep Learning

Elias Dubbeldam, Reza Mohammadi, Marit Schoonhoven, S. Ilker Birbil

Comments 12 pages, 5 figures, accepted at TMLR 2026

2510.03348 2026-03-10 cs.CV

FVO: Fast Visual Odometry with Transformers

Vladimir Yugay, Duy-Kien Nguyen, Theo Gevers, Cees G. M. Snoek, Martin R. Oswald

2510.02286 2026-03-10 cs.LG cs.AI cs.CL

Tree-based Dialogue Reinforced Policy Optimization for Red-Teaming Attacks

Ruohao Guo, Afshin Oroojlooy, Roshan Sridhar, Miguel Ballesteros, Alan Ritter, Dan Roth

Comments Accepted at ICLR 2026

2510.00726 2026-03-10 cs.RO cs.AI cs.LG

CroSTAta: Cross-State Transition Attention Transformer for Robotic Manipulation

Giovanni Minelli, Giulio Turrisi, Victor Barasuol, Claudio Semini

Comments Code and data available at https://github.com/iit-DLSLab/croSTAta