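arXivDaily daily academic digest: synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and other fields.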

2506.00098 2026-02-26 cs.RO cs.LG

Interactive Imitation Learning for Dexterous Robotic Manipulation: Challenges and Perspectives -- A Survey

Edgar Welte, Rania Rayyes

Comments 27 pages, 4 figures, 3 tables

Details

DOI: 10.3389/frobt.2025.1682437

English abstract

Dexterous manipulation is a crucial yet highly complex challenge in humanoid robotics, demanding precise, adaptable, and sample-efficient learning methods. As humanoid robots are usually designed to operate in human-centric environments and interact with everyday objects, mastering dexterous manipulation is critical for real-world deployment. Traditional approaches, such as reinforcement learning and imitation learning, have made significant strides, but they often struggle due to the unique challenges of real-world dexterous manipulation, including high-dimensional control, limited training data, and covariate shift. This survey provides a comprehensive overview of these challenges and reviews existing learning-based methods for real-world dexterous manipulation, spanning imitation learning, reinforcement learning, and hybrid approaches. A promising yet underexplored direction is interactive imitation learning, where human feedback actively refines a robot's behavior during training. While interactive imitation learning has shown success in various robotic tasks, its application to dexterous manipulation remains limited. To address this gap, we examine current interactive imitation learning techniques applied to other robotic tasks and discuss how these methods can be adapted to enhance dexterous manipulation. By synthesizing state-of-the-art research, this paper highlights key challenges, identifies gaps in current methodologies, and outlines potential directions for leveraging interactive imitation learning to improve dexterous robotic skills.

2505.19610 2026-02-26 cs.CV

JailBound: Jailbreaking Internal Safety Boundaries of Vision-Language Models

Jiaxin Song, Yixu Wang, Jie Li, Rui Yu, Yan Teng, Xingjun Ma, Yingchun Wang

Comments The Thirty-ninth Annual Conference on Neural Information Processing Systems (NeurIPS 2025)

Details

English abstract

Vision-Language Models (VLMs) exhibit impressive performance, yet the integration of powerful vision encoders has significantly broadened their attack surface, rendering them increasingly susceptible to jailbreak attacks. However, lacking well-defined attack objectives, existing jailbreak methods often rely on gradient-based strategies that are prone to local optima and lack precise directional guidance, and they typically decouple visual and textual modalities, thereby limiting their effectiveness by neglecting crucial cross-modal interactions. Inspired by the Eliciting Latent Knowledge (ELK) framework, we posit that VLMs encode safety-relevant information within their internal fusion-layer representations, revealing an implicit safety decision boundary in the latent space. This motivates exploiting this boundary to steer model behavior. Accordingly, we propose JailBound, a novel latent space jailbreak framework comprising two stages: (1) Safety Boundary Probing, which addresses the guidance issue by approximating the decision boundary within the fusion layer's latent space, thereby identifying optimal perturbation directions towards the target region; and (2) Safety Boundary Crossing, which overcomes the limitations of decoupled approaches by jointly optimizing adversarial perturbations across both image and text inputs. This latter stage employs an innovative mechanism to steer the model's internal state towards policy-violating outputs while maintaining cross-modal semantic consistency. Extensive experiments on six diverse VLMs demonstrate JailBound's efficacy: it achieves average attack success rates of 94.32% (white-box) and 67.28% (black-box), which are 6.17% and 21.13% higher than SOTA methods, respectively. Our findings expose an overlooked safety risk in VLMs and highlight the urgent need for more robust defenses. Warning: This paper contains potentially sensitive, harmful and offensive content.

2505.17306 2026-02-26 cs.CL

Refusal Direction is Universal Across Safety-Aligned Languages

Xinpeng Wang, Mingyang Wang, Yihong Liu, Hinrich Schütze, Barbara Plank

2505.13667 2026-02-26 cs.RO

Adaptive Diffusion Constrained Sampling for Bimanual Robot Manipulation

Haolei Tong, Yuezhe Zhang, Sophie Lueth, Georgia Chalvatzaki

Comments Accepted by IEEE International Conference on Robotics and Automation 2026 (ICRA 2026)

2505.08246 2026-02-26 cs.CV cs.NA math.NA

Identifying Memorization of Diffusion Models through $p$-Laplace Analysis: Estimators, Bounds and Applications

Jonathan Brokman, Itay Gershon, Amit Giloni, Omer Hofman, Roman Vainshtein, Hisashi Kojima, Guy Gilboa

Comments This manuscript is a substantially extended version of our SSVM 2025 paper, including significant new theoretical results and additional experiments. It is currently under review as a journal submission

2504.06533 2026-02-26 cs.LG cs.AI cs.DS

Rethinking Flexible Graph Similarity Computation: One-step Alignment with Global Guidance

Zhouyang Liu, Ning Liu, Yixin Chen, Jiezhong He, Shuai Ma, Dongsheng Li

Comments Accepted by ICDE 2026

2503.15133 2026-02-26 cs.CL

EmoGRACE: Aspect-based emotion analysis for social media data

Christina Zorenböhmer, Sebastian Schmidt, Bernd Resch

2503.07982 2026-02-26 cs.CV

TRACE: Your Diffusion Model is Secretly an Instance Edge Detector

Sanghyun Jo, Ziseok Lee, Wooyeol Lee, Jonghyun Choi, Jaesik Park, Kyungsu Kim

Comments Accepted to ICLR 2026 (Oral)

2503.05236 2026-02-26 cs.CV

Unified Reward Model for Multimodal Understanding and Generation

Yibin Wang, Yuhang Zang, Hao Li, Cheng Jin, Jiaqi Wang

Comments project page: https://codegoat24.github.io/UnifiedReward/

2503.03178 2026-02-26 cs.LG math.PR

Active operator learning with predictive uncertainty quantification for partial differential equations

Nick Winovich, Mitchell Daneker, Lu Lu, Guang Lin

Comments Submitted to the Journal of Computational Physics

2502.18424 2026-02-26 cs.CL

Compressing Language Models for Specialized Domains

Miles Williams, George Chrysostomou, Vitor Jeronymo, Nikolaos Aletras

Comments EACL 2026

2501.16443 2026-02-26 cs.LG cs.CV

Object-Centric World Models from Few-Shot Annotations for Sample-Efficient Reinforcement Learning

Weipu Zhang, Adam Jelley, Trevor McInroe, Amos Storkey, Gang Wang

2412.10895 2026-02-26 cs.LG stat.ML

Multi-Class and Multi-Task Strategies for Neural Directed Link Prediction

Claudio Moroni, Claudio Borile, Carolina Mattsson, Michele Starnini, André Panisson

Comments 15 pages, 2 figures

Details

DOI: 10.1007/978-3-662-72243-5_8
Journal ref: ECML PKDD 2025

English abstract

Link Prediction is a foundational task in Graph Representation Learning, supporting applications like link recommendation, knowledge graph completion and graph generation. Graph Neural Networks have shown the most promising results in this domain and are currently the de facto standard approach to learning from graph data. However, a key distinction exists between Undirected and Directed Link Prediction: the former just predicts the existence of an edge, while the latter must also account for edge directionality and bidirectionality. This translates to Directed Link Prediction (DLP) having three sub-tasks, each defined by how training, validation and test sets are structured. Most research on DLP overlooks this trichotomy, focusing solely on the "existence" sub-task, where training and test sets are random, uncorrelated samples of positive and negative directed edges. Even in the works that recognize the aforementioned trichotomy, models fail to perform well across all three sub-tasks. In this study, we experimentally demonstrate that training Neural DLP (NDLP) models only on the existence sub-task, using methods adapted from Neural Undirected Link Prediction, results in parameter configurations that fail to capture directionality and bidirectionality, even after rebalancing edge classes. To address this, we propose three strategies that handle the three tasks simultaneously. Our first strategy, the Multi-Class Framework for Neural Directed Link Prediction (MC-NDLP) maps NDLP to a Multi-Class training objective. The second and third approaches adopt a Multi-Task perspective, either with a Multi-Objective (MO-DLP) or a Scalarized (S-DLP) strategy. Our results show that these methods outperform traditional approaches across multiple datasets and models, achieving equivalent or superior performance in addressing the three DLP sub-tasks.

2412.06966 2026-02-26 cs.LG cs.AI cs.CY

Machine Unlearning Doesn't Do What You Think: Lessons for Generative AI Policy and Research

A. Feder Cooper, Christopher A. Choquette-Choo, Miranda Bogen, Kevin Klyman, Matthew Jagielski, Katja Filippova, Ken Liu, Alexandra Chouldechova, Jamie Hayes, Yangsibo Huang, Eleni Triantafillou, Peter Kairouz, Nicole Elyse Mitchell, Niloofar Mireshghallah, Abigail Z. Jacobs, James Grimmelmann, Vitaly Shmatikov, Christopher De Sa, Ilia Shumailov, Andreas Terzis, Solon Barocas, Jennifer Wortman Vaughan, danah boyd, Yejin Choi, Sanmi Koyejo, Fernando Delgado, Percy Liang, Daniel E. Ho, Pamela Samuelson, Miles Brundage, David Bau, Seth Neel, Hanna Wallach, Amy B. Cyphert, Mark A. Lemley, Nicolas Papernot, Katherine Lee

Comments NeurIPS 2025 (Oral)

2411.06657 2026-02-26 cs.CV cs.AI cs.CL cs.LG

Renaissance: Investigating the Pretraining of Vision-Language Encoders

Clayton Fields, Casey Kennington

Comments 9 pages

2411.04997 2026-02-26 cs.CV cs.CL

LLM2CLIP: Powerful Language Model Unlocks Richer Cross-Modality Representation

Weiquan Huang, Aoqi Wu, Yifan Yang, Xufang Luo, Yuqing Yang, Usman Naseem, Chunyu Wang, Chunyu Wang, Qi Dai, Xiyang Dai, Dongdong Chen, Chong Luo, Lili Qiu, Liang Hu

2411.03941 2026-02-26 cs.LG cs.AI

Modular Deep Learning for Multivariate Time-Series: Decoupling Imputation and Downstream Tasks

Joseph Arul Raj, Linglong Qian, Zina Ibrahim

2410.16718 2026-02-26 cs.LG

Learning Partial Graph Matching via Optimal Partial Transport

Gathika Ratnayaka, James Nichols, Qing Wang

2409.20469 2026-02-26 cs.CV

PoseAdapt: Sustainable Human Pose Estimation via Continual Learning Benchmarks and Toolkit

Muhammad Saif Ullah Khan, Didier Stricker

Comments Accepted in WACV 2026 Applications Track

2409.18745 2026-02-26 cs.RO

A study on the effects of mixed explicit and implicit communications in human-artificial-agent interactions

Ana Christina Almada Campos, Bruno Vilhena Adorno

Comments Main paper with 28 pages, 14 figures, 4 tables. Supplementary material with 39 pages, 44 figures, 2 tables. Submitted to Intelligent Service Robotics

2408.05861 2026-02-26 cs.AI cs.LG

Temporal Knowledge-Graph Memory in a Partially Observable Environment

Taewoon Kim, Vincent François-Lavet, Michael Cochez

2406.17115 2026-02-26 cs.CV cs.AI

Measuring the Measurers: Quality Evaluation of Hallucination Benchmarks for Large Vision-Language Models

Bei Yan, Jie Zhang, Zheng Yuan, Shiguang Shan, Xilin Chen

2406.05085 2026-02-26 cs.CL cs.AI cs.IR

Multi-Head RAG: Solving Multi-Aspect Problems with LLMs

Maciej Besta, Ales Kubicek, Robert Gerstenberger, Marcin Chrapek, Roman Niggli, Patrik Okanovic, Yi Zhu, Patrick Iff, Michal Podstawski, Lucas Weitzendorf, Mingyuan Chi, Joanna Gajda, Piotr Nyczyk, Jürgen Müller, Hubert Niewiadomski, Torsten Hoefler

2402.13604 2026-02-26 cs.CL econ.EM

Breaking the HISCO Barrier: Automatic Occupational Standardization with OccCANINE

Christian Møller Dahl, Torben Johansen, Christian Vedel

Comments All code and guides on how to use OccCANINE are available on GitHub: https://github.com/christianvedels/OccCANINE

2305.15929 2026-02-26 cs.CL

Emergence of a phonological bias in ChatGPT

Juan Manuel Toro

Comments 15 pages, 1 figure, corrected typo

2303.00799 2026-02-26 cs.AI cs.LG cs.MA

Fairness for Workers Who Pull the Arms: An Index Based Policy for Allocation of Restless Bandit Tasks

Arpita Biswas, Jackson A. Killian, Paula Rodriguez Diaz, Susobhan Ghosh, Milind Tambe

Comments 22nd International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2023), 10 pages

2002.10764 2026-02-26 cs.AI cs.GT

FairRec: Two-Sided Fairness for Personalized Recommendations in Two-Sided Platforms

Gourab K Patro, Arpita Biswas, Niloy Ganguly, Krishna P. Gummadi, Abhijnan Chakraborty

Comments In Proceedings of The Web Conference (WWW) 2020

2602.21717 2026-02-26 cs.LG cs.DB

C$^{2}$TC: A Training-Free Framework for Efficient Tabular Data Condensation

Sijia Xu, Fan Li, Xiaoyang Wang, Zhengyi Yang, Xuemin Lin

Details

English abstract

Tabular data is the primary data format in industrial relational databases, underpinning modern data analytics and decision-making. However, the increasing scale of tabular data poses significant computational and storage challenges to learning-based analytical systems. This highlights the need for data-efficient learning, which enables effective model training and generalization using substantially fewer samples. Dataset condensation (DC) has emerged as a promising data-centric paradigm that synthesizes small yet informative datasets to preserve data utility while reducing storage and training costs. However, existing DC methods are computationally intensive due to reliance on complex gradient-based optimization. Moreover, they often overlook key characteristics of tabular data, such as heterogeneous features and class imbalance. To address these limitations, we introduce C$^{2}$TC (Class-Adaptive Clustering for Tabular Condensation), the first training-free tabular dataset condensation framework that jointly optimizes class allocation and feature representation, enabling efficient and scalable condensation. Specifically, we reformulate the dataset condensation objective into a novel class-adaptive cluster allocation problem (CCAP), which eliminates costly training and integrates adaptive label allocation to handle class imbalance. To solve the NP-hard CCAP, we develop HFILS, a heuristic local search that alternates between soft allocation and class-wise clustering to efficiently obtain high-quality solutions. Moreover, a hybrid categorical feature encoding (HCFE) is proposed for semantics-preserving clustering of heterogeneous discrete attributes. Extensive experiments on 10 real-world datasets demonstrate that C$^{2}$TC improves efficiency by at least 2 orders of magnitude over state-of-the-art baselines, while achieving superior downstream performance.

2602.21716 2026-02-26 cs.CV

TranX-Adapter: Bridging Artifacts and Semantics within MLLMs for Robust AI-generated Image Detection

Wenbin Wang, Yuge Huang, Jianqing Xu, Yue Yu, Jiangtao Yan, Shouhong Ding, Pan Zhou, Yong Luo

2602.21712 2026-02-26 cs.CV

Innovative Tooth Segmentation Using Hierarchical Features and Bidirectional Sequence Modeling

Xinxin Zhao, Jian Jiang, Yan Tian, Liqin Wu, Zhaocheng Xu, Teddy Yang, Yunuo Zou, Xun Wang

Comments Accepted by Pattern Recognition