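arXivDaily daily academic digest: synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering & systems, and more.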

2603.07848 2026-03-10 cs.AI

Intentional Deception as Controllable Capability in LLM Agents

Jason Starace, Terence Soule

Details

English abstract

As LLM-based agents increasingly operate in multi-agent systems, understanding adversarial manipulation becomes critical for defensive design. We present a systematic study of intentional deception as an engineered capability, using LLM-to-LLM interactions within a text-based RPG where parameterized behavioral profiles (9 alignments x 4 motivations, yielding 36 profiles with explicit ethical ground truth) serve as our experimental testbed. Unlike accidental deception from misalignment, we investigate a two-stage system that infers target agent characteristics and generates deceptive responses steering targets toward actions counter to their beliefs and motivations. We find that deceptive intervention produces differential effects concentrated in specific behavioral profiles rather than distributed uniformly, and that 88.5% of successful deceptions employ misdirection (true statements with strategic framing) rather than fabrication, indicating fact-checking defenses would miss the large majority of adversarial responses. Motivation, inferable at 98%+ accuracy, serves as the primary attack vector, while belief systems remain harder to identify (49% inference ceiling) or exploit. These findings identify which agent profiles require additional safeguards and suggest that current fact-verification approaches are insufficient against strategically framed deception.


2603.07844 2026-03-10 cs.RO

Relating Reinforcement Learning to Dynamic Programming-Based Planning

Filip V. Georgiev, Kalle G. Timperi, Başak Sakçak, Steven M. LaValle

Comments 43 pages, 8 figures

2603.07841 2026-03-10 cs.CL

An Efficient and Effective Evaluator for Text2SQL Models on Unseen and Unlabeled Data

Trinh Pham, Thanh Tam Nguyen, Viet Huynh, Hongzhi Yin, Quoc Viet Hung Nguyen

Comments Accepted at ICDE 2026

2603.07839 2026-03-10 cs.CV

Training-free Temporal Object Tracking in Surgical Videos

Subhadeep Koley, Abdolrahim Kadkhodamohammadi, Santiago Barbarisi, Danail Stoyanov, Imanol Luengo

Comments Accepted in IPCAI 2025

Details

DOI: 10.1007/s11548-025-03349-6
Journal ref: Int J CARS 20, 1067-1075 (2025)

English abstract

Purpose: In this paper, we present a novel approach for online object tracking in laparoscopic cholecystectomy (LC) surgical videos, targeting localisation and tracking of critical anatomical structures and instruments. Our method addresses the challenges of costly pixel-level annotations and label inconsistencies inherent in existing datasets. Methods: Leveraging the inherent object localisation capabilities of pre-trained text-to-image diffusion models, we extract representative features from surgical frames without any training or fine-tuning. Our tracking framework uses these features, along with cross-frame interactions via an affinity matrix inspired by query-key-value attention, to ensure temporal continuity in the tracking process. Results: Through a pilot study, we first demonstrate that diffusion features exhibit superior object localisation and consistent semantics across different decoder levels and temporal frames. Later, we perform extensive experiments to validate the effectiveness of our approach, showcasing its superiority over competitors for the task of temporal object tracking. Specifically, we achieve a per-pixel classification accuracy of 79.19%, mean Jaccard Score of 56.20%, and mean F-Score of 79.48% on the publicly available CholeSeg8K dataset. Conclusion: Our work not only introduces a novel application of text-to-image diffusion models but also contributes to advancing the field of surgical video analysis, offering a promising avenue for accurate and cost-effective temporal object tracking in minimally invasive surgery videos.
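The abstract above describes cross-frame interactions through an affinity matrix inspired by query-key-value attention. The following is a minimal sketch of that general idea, not the authors' implementation: current-frame features act as queries, previous-frame features as keys, and previous-frame class scores as values. The function names, the dot-product similarity, the `temperature` value, and the toy two-dimensional features are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def propagate_labels(prev_feats, prev_labels, cur_feats, temperature=0.1):
    """Carry class scores from the previous frame to the current frame.

    prev_feats:  feature vectors of previous-frame locations (keys)
    prev_labels: class-score vectors of those locations (values)
    cur_feats:   feature vectors of current-frame locations (queries)
    Hypothetical helper: each output row is an affinity-weighted
    average of the previous-frame class scores.
    """
    n_classes = len(prev_labels[0])
    propagated = []
    for query in cur_feats:
        # One row of the affinity matrix: scaled dot products, then softmax.
        sims = [sum(q * k for q, k in zip(query, key)) / temperature
                for key in prev_feats]
        weights = softmax(sims)
        scores = [sum(w * label[c] for w, label in zip(weights, prev_labels))
                  for c in range(n_classes)]
        propagated.append(scores)
    return propagated


# Toy example: two previous-frame locations, one per class.
prev_feats = [[1.0, 0.0], [0.0, 1.0]]
prev_labels = [[1.0, 0.0], [0.0, 1.0]]
cur_feats = [[0.9, 0.1], [0.1, 0.9]]

cur_scores = propagate_labels(prev_feats, prev_labels, cur_feats)
cur_classes = [row.index(max(row)) for row in cur_scores]
print(cur_classes)  # [0, 1]
```

In this sketch each current-frame location inherits the label of the previous-frame location it most resembles. A lower `temperature` makes the affinity weights sharper, while a higher one blends labels across more locations.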


2603.07837 2026-03-10 cs.CL cs.AI

AI Steerability 360: A Toolkit for Steering Large Language Models

Erik Miehling, Karthikeyan Natesan Ramamurthy, Praveen Venkateswaran, Irene Ko, Pierre Dognin, Moninder Singh, Tejaswini Pedapati, Avinash Balakrishnan, Matthew Riemer, Dennis Wei, Inge Vejsbjerg, Elizabeth M. Daly, Kush R. Varshney

2603.07831 2026-03-10 cs.CV cs.LG math.OC

Transferable Optimization Network for Cross-Domain Image Reconstruction

Yunmei Chen, Chi Ding, Xiaojing Ye

Comments 30 pages, 7 figures

2603.07826 2026-03-10 cs.RO

Physics-infused Learning for Aerial Manipulator in Winds and Near-Wall Environments

Yiming Zhang, Junyi Geng

Details

DOI: 10.2514/6.2026-1760

English abstract

Aerial manipulation (AM) expands UAV capabilities beyond passive observation to contact-based operations at high altitudes and in otherwise inaccessible environments. Although recent advances show promise, most AM systems are developed in controlled settings that overlook key aerodynamic effects. Simplified thrust models are often insufficient to capture the nonlinear wind disturbances and proximity-induced flow variations present in real-world environments near infrastructure, while high-fidelity CFD methods remain impractical for real-time use. Learning-based models are computationally efficient at inference, but often struggle to generalize to unseen conditions. This paper combines both approaches by integrating a physics-based blade-element model with a learning-based residual force estimator, along with a rotor-speed allocation strategy for disturbance compensation, resulting in a unified control framework. The blade-element model computes per-rotor aerodynamic forces under wind and provides a refined feedforward disturbance estimate. A learning-based estimator then predicts the residual forces not captured by the model, enabling compensation for unmodeled aerodynamic effects. An online adaptation mechanism further updates the residual-force prediction and rotor-speed allocation jointly to reduce the mismatch between desired and realized thrust. We evaluate this framework in both free-flight and wall-contact tracking tasks in a simulated near-wall wind environment. Results demonstrate improved disturbance estimation and trajectory-tracking accuracy over conventional approaches, enabling robust wall-contact execution under challenging aerodynamic conditions.


2603.07825 2026-03-10 cs.CL

Benchmarking Large Language Models for Quebec Insurance: From Closed-Book to Retrieval-Augmented Generation

David Beauchemin, Richard Khoury

Comments Published at the Advances in Financial AI: Towards Agentic and Responsible Systems Workshop @ ICLR 2026

2603.07824 2026-03-10 cs.RO

Reasoning Knowledge-Gap in Drone Planning via LLM-based Active Elicitation

Zeyu Fang, Beomyeol Yu, Cheng Liu, Zeyuan Yang, Rongqian Chen, Yuxin Lin, Mahdi Imani, Tian Lan

2603.07822 2026-03-10 cs.RO cs.HC

Uncertainty Mitigation and Intent Inference: A Dual-Mode Human-Machine Joint Planning System

Zeyu Fang, Yuxin Lin, Cheng Liu, Beomyeol Yu, Zeyuan Yang, Rongqian Chen, Taeyoung Lee, Mahdi Imani, Tian Lan

2603.07817 2026-03-10 cs.CV

Tracking Phenological Status and Ecological Interactions in a Hawaiian Cloud Forest Understory using Low-Cost Camera Traps and Visual Foundation Models

Luke Meyers, Anirudh Potlapally, Yuyan Chen, Mike Long, Tanya Berger-Wolf, Hari Subramoni, Remi Megret, Daniel Rubenstein

2603.07815 2026-03-10 cs.CV cs.AI

HybridStitch: Pixel and Timestep Level Model Stitching for Diffusion Acceleration

Desen Sun, Jason Hon, Jintao Zhang, Sihang Liu

2603.07811 2026-03-10 cs.LG

Neural Precoding in Complex Projective Spaces

Zaid Abdullah, Merouane Debbah, Symeon Chatzinotas, Bjorn Ottersten

2603.07800 2026-03-10 cs.RO

Preference-Conditioned Reinforcement Learning for Space-Time Efficient Online 3D Bin Packing

Nikita Sarawgi, Omey M. Manyar, Fan Wang, Thinh H. Nguyen, Daniel Seita, Satyandra K. Gupta

Comments 8 pages, 5 figures. Accepted to IEEE International Conference on Robotics and Automation 2026. Project Website: https://step-packing.github.io

2603.07799 2026-03-10 cs.CV cs.RO

MWM: Mobile World Models for Action-Conditioned Consistent Prediction

Han Yan, Zishang Xiang, Zeyu Zhang, Hao Tang

2603.07797 2026-03-10 cs.RO cs.LG

Toward Global Intent Inference for Human Motion by Inverse Reinforcement Learning

Sarmad Mehrdad, Maxime Sabbah, Vincent Bonnet, Ludovic Righetti

Comments 8 pages, 6 figures

2603.07796 2026-03-10 cs.RO

Inverse Resistive Force Theory (I-RFT): Learning granular properties through robot-terrain physical interactions

Shipeng Liu, Feng Xue, Yifeng Zhang, Tarunika Ponnusamy, Feifei Qian

2603.07795 2026-03-10 cs.RO

A Robust Antenna Provides Tactile Feedback in a Multi-legged Robot

Zhaochen J. Xu, Juntao He, Delfin Aydan, Malaika Taylor, Tianyu Wang, Jianfeng Lin, Wesley Dyer, Daniel I. Goldman

2603.07794 2026-03-10 cs.CV

4DRC-OCC: Robust Semantic Occupancy Prediction Through Fusion of 4D Radar and Camera

David Ninfa, Andras Palffy, Holger Caesar

2603.07792 2026-03-10 cs.CL cs.AI cs.CY

Dual-Metric Evaluation of Social Bias in Large Language Models: Evidence from an Underrepresented Nepali Cultural Context

Ashish Pandey, Tek Raj Chhetri

Details

English abstract

Large language models (LLMs) increasingly influence global digital ecosystems, yet their potential to perpetuate social and cultural biases remains poorly understood in underrepresented contexts. This study presents a systematic analysis of representational biases in seven state-of-the-art LLMs: GPT-4o-mini, Claude-3-Sonnet, Claude-4-Sonnet, Gemini-2.0-Flash, Gemini-2.0-Lite, Llama-3-70B, and Mistral-Nemo in the Nepali cultural context. Using a Croissant-compliant dataset of 2400+ stereotypical and anti-stereotypical sentence pairs on gender roles across social domains, we implement an evaluation framework, Dual-Metric Bias Assessment (DMBA), combining two metrics: (1) agreement with biased statements and (2) stereotypical completion tendencies. Results show models exhibit measurable explicit agreement bias, with mean bias agreement ranging from 0.36 to 0.43 across decoding configurations, and an implicit completion bias rate of 0.740-0.755. Importantly, implicit completion bias follows a non-linear, U-shaped relationship with temperature, peaking at moderate stochasticity (T=0.3) and declining slightly at higher temperatures. Correlation analysis under different decoding settings revealed that explicit agreement strongly aligns with stereotypical sentence agreement but is a weak and often negative predictor of implicit completion bias, indicating generative bias is poorly captured by agreement metrics. Sensitivity analysis shows increasing top-p amplifies explicit bias, while implicit generative bias remains largely stable. Domain-level analysis shows implicit bias is strongest for race and sociocultural stereotypes, while explicit agreement bias is similar across gender and sociocultural categories, with race showing the lowest explicit agreement. These findings highlight the need for culturally grounded datasets and debiasing strategies for LLMs in underrepresented societies.


2603.07787 2026-03-10 cs.LG

Vision Transformers that Never Stop Learning

Caihao Sun, Mingqi Yuan, Shiyuan Wang, Jiayu Chen

2603.07786 2026-03-10 cs.CV

OrdinalBench: A Benchmark Dataset for Diagnosing Generalization Limits in Ordinal Number Understanding of Vision-Language Models

Yusuke Tozaki, Hisashi Miyamori

Comments Accepted as a Short Paper at VISAPP 2026

2603.07784 2026-03-10 cs.LG cs.AI

ProgAgent: A Continual RL Agent with Progress-Aware Rewards

Jinzhou Tan, Gabriel Adineera, Jinoh Kim

2603.07779 2026-03-10 cs.CL cs.GL cs.LG

Scaling Data Difficulty: Improving Coding Models via Reinforcement Learning on Fresh and Challenging Problems

Zongqian Li, Tengchao Lv, Shaohan Huang, Yixuan Su, Qinzheng Sun, Qiufeng Yin, Ying Xin, Scarlett Li, Lei Cui, Nigel Collier, Furu Wei

2603.07777 2026-03-10 cs.LG cs.CL cs.GL

Breaking Training Bottlenecks: Effective and Stable Reinforcement Learning for Coding Models

Zongqian Li, Shaohan Huang, Zewen Chi, Yixuan Su, Lexin Zhou, Li Dong, Nigel Collier, Furu Wei

2603.07776 2026-03-10 cs.CV cs.GR

Parameterized Brushstroke Style Transfer

Uma Meleti, Siyu Huang

2603.07775 2026-03-10 cs.RO

Residual Control for Fast Recovery from Dynamics Shifts

Nethmi Jayasinghe, Diana Gontero, Francesco Migliarba, Spencer T. Brown, Vinod K. Sangwan, Mark C. Hersam, Amit Ranjan Trivedi

2603.07774 2026-03-10 cs.CV

Geometric Knowledge-Assisted Federated Dual Knowledge Distillation Approach Towards Remote Sensing Satellite Imagery

Luyao Zou, Fei Pan, Jueying Li, Yan Kyaw Tun, Apurba Adhikary, Zhu Han, Hayoung Oh

Comments 16 pages, 9 figures

2603.07769 2026-03-10 cs.CV

MedQ-Deg: A Multidimensional Benchmark for Evaluating MLLMs Across Medical Image Quality Degradations

Jiyao Liu, Junzhi Ning, Chenglong Ma, Wanying Qu, Jianghan Shen, Siqi Luo, Jinjie Wei, Jin Ye, Pengze Li, Tianbin Li, Jiashi Lin, Hongming Shan, Xinzhe Luo, Xiaohong Liu, Lihao Liu, Junjun He, Ningsheng Xu

Comments 29 pages, 11 figures

2603.07766 2026-03-10 cs.CL cs.AI

QuadAI at SemEval-2026 Task 3: Ensemble Learning of Hybrid RoBERTa and LLMs for Dimensional Aspect-Based Sentiment Analysis

A. J. W. de Vink, Filippos Karolos Ventirozos, Natalia Amat-Lefort, Lifeng Han

Comments SemEval System Report