1612.04023 2026-06-04 eess.SY cs.AI cs.RO cs.SY

Proceedings of the The First Workshop on Verification and Validation of Cyber-Physical Systems

Mehdi Kargahi, Ashutosh Trivedi

Location: Reykjavík, Iceland
Papers in the proceedings:
MITL Specification Debugging for Monitoring of Cyber-Physical Systems
Automatic Synthesis of Controllers from Specifications using Control Certificates
A Compositional Framework for Preference-Aware Agents
Output Feedback Controller Design with Symbolic Observers for Cyber-physical Systems
Towards an Approximate Conformance Relation for Hybrid I/O Automata
On Nonlinear Prices in Timed Automata
Towards the Verification of Safety-critical Autonomous Systems in Dynamic Environments

AI summary: This volume presents the first Workshop on Verification and Validation of Cyber-Physical Systems, which covered verification and validation methods including control, simulation, and formal methods, with the aim of addressing the verification of complex software and algorithms.

Journal ref
EPTCS 232, 2016

Abstract

The first International Workshop on Verification and Validation of Cyber-Physical Systems (V2CPS-16) was held in conjunction with the 12th International Conference on integration of Formal Methods (iFM 2016) in Reykjavik, Iceland. The purpose of V2CPS-16 was to bring together researchers and experts of the fields of formal verification and cyber-physical systems (CPS) to cover the theme of this workshop, namely a wide spectrum of verification and validation methods including (but not limited to) control, simulation, formal methods, etc. A CPS is an integration of networked computational and physical processes with meaningful inter-effects; the former monitors, controls, and affects the latter, while the latter also impacts the former. CPSs have applications in a wide-range of systems spanning robotics, transportation, communication, infrastructure, energy, and manufacturing. Many safety-critical systems such as chemical processes, medical devices, aircraft flight control, and automotive systems, are indeed CPS. The advanced capabilities of CPS require complex software and synthesis algorithms, which are hard to verify. In fact, many problems in this area are undecidable. Thus, a major step is to find particular abstractions of such systems which might be algorithmically verifiable regarding specific properties of such systems, describing the partial/overall behaviors of CPSs.

2606.04850 2026-06-04 cs.LG cs.AI cs.AR math.OC

Uncertainty-Aware End-to-End Co-Design of Neural Network Processors: From Training and Mapping to Fabrication

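Yuyang Du, Yujun Huang, Gioele Zardini

AI summary: Proposes a unified framework grounded in monotone co-design theory that achieves end-to-end co-design of neural network processors through four interoperable design blocks (network training, chip mapping, wafer-level fabrication, and compute resource allocation), and introduces Confidence (the inverse of success probability) as an explicit, optimizable resource for handling uncertainty.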

Comments 14 pages

Abstract

Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference workload; hardware mapping decisions determine chip area, latency, and energy; and these characteristics govern fabrication yield and manufacturing cost. In practice, these decisions are made in separate stages, and existing co-design methodologies are tightly coupled to specific algorithms, making it difficult to improve one component without reworking the entire pipeline. This paper presents a unified framework, grounded in monotone co-design theory, that composes four interoperable design blocks spanning network training, chip mapping, wafer-level fabrication, and compute resource allocation. Each block exposes only a functionality-resource interface to the rest of the system, so any block can be refined without structural changes elsewhere. A central contribution is the treatment of uncertainty: rather than collapsing stochastic outcomes into point estimates, the framework introduces Confidence, the inverse of success probability, as an explicit and optimizable resource alongside cost, time, and power. Three case studies validate the approach. The first recovers Pareto-optimal implementations across heterogeneous application scenarios. The second confirms that Confidence functions as a continuously tunable design knob rather than a post-hoc diagnostic. The third demonstrates that improving a single block's implementation set automatically propagates to the global Pareto front, without modifying the co-design diagram.
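
The idea of treating Confidence as one more resource next to cost, time, and power can be illustrated with a plain Pareto filter. The sketch below is not the paper's framework: the `Implementation` class, its field names, and the sample numbers are hypothetical, and it assumes that all four resources are to be minimized, so that a lower Confidence value corresponds to a higher success probability.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Implementation:
    """One candidate design; all field names and units are hypothetical."""
    name: str
    cost: float
    time: float
    power: float
    success_prob: float  # probability in (0, 1] that the design meets its spec

    def resources(self) -> tuple:
        # Confidence is the inverse of success probability, as in the abstract.
        confidence = 1.0 / self.success_prob
        return (self.cost, self.time, self.power, confidence)


def dominates(a: tuple, b: tuple) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(impls: list) -> list:
    """Keep the implementations that no other implementation dominates."""
    return [
        i for i in impls
        if not any(dominates(j.resources(), i.resources()) for j in impls if j is not i)
    ]


candidates = [
    Implementation("A", cost=10, time=5, power=2, success_prob=0.90),
    Implementation("B", cost=12, time=5, power=2, success_prob=0.90),
    Implementation("C", cost=8, time=7, power=2, success_prob=0.50),
    Implementation("D", cost=10, time=5, power=2, success_prob=0.99),
]
print([i.name for i in pareto_front(candidates)])  # ['C', 'D']
```

Design D matches A on cost, time, and power but succeeds more often, so it removes A from the front. C survives because it is cheaper, even though it is slower and less likely to succeed. Keeping the probability as its own coordinate, rather than folding it into an expected cost, is what lets a designer trade it off explicitly.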

2606.04493 2026-06-04 cs.CV cs.AI

SFMambaNet: Spectral-Frequency Enhanced Selective State Space Model for Correspondence Pruning

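Zhihua Wang, Yanping Li, Yizhang Liu

AI summary: Proposes SFMambaNet, which brings frequency-domain perception to correspondence pruning for the first time through a Local Spectral-Geometric Attention block and a Spectral-Integrated Global Mamba block, improving the separability of inliers and outliers.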

Abstract

Correspondence pruning aims to identify inliers from an initial set of correspondences. Most existing Graph Neural Network (GNN)-based methods rely on geometric features mapped from coarse Euclidean coordinates, which struggle to capture the subtle geometric consistencies presented by inliers. While Mamba-based methods possess global receptive fields and long sequence modeling capabilities, they tend to accumulate substantial inconsistent features within the hidden state space, making it difficult to distinguish inliers from outliers. In this paper, we integrate frequency domain perception into this task for the first time and propose SFMambaNet, a novel Spectral-Frequency enhanced Mamba-based two-view correspondence pruning network. Our method is collaboratively composed of two components: First, we design a Local Spectral-Geometric Attention (LSGA) block. LSGA incorporates spectral positional encoding into local graph interactions and introduces multi-scale Mamba processing to enhance the capture of subtle geometric consistencies and improve local feature discriminability. Building upon this, we design a Spectral-Integrated Global Mamba (SIGM) block. SIGM embeds a frequency gating mechanism within the state space, utilizing the frequency information provided by LSGA to explicitly suppress high-frequency noise accumulation within hidden states and mitigate the propagation of inconsistent features. This enhances inlier-outlier separability and achieves robust global context modeling capabilities with nearly linear complexity. Extensive experiments demonstrate that SFMambaNet outperforms current state-of-the-art methods on several challenging tasks. The code is available at https://github.com/Kirito14IT/SFMambaNet.

2606.04118 2026-06-04 cs.CL

Computational conceptual history of scientific concepts: From early digital methods to LLMs

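Michael Zichert, Arno Simons

AI summary: Reviews computational approaches to conceptual history from early digital methods to large language models, analyzes how LLMs inherit old problems while opening new opportunities, and focuses on the challenges of corpus construction, model choice, operationalization, and evaluation and interpretation.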

Comments 19 pages, chapter in the book Understanding Science with Large Language Models? (pp. 383-412). transcript. Edited by Arno Simons, Adrian Wüthrich, Michael Zichert, Gerd Graßhoff (eds.)

Abstract

This article situates large language models (LLMs) within the longer history of computational approaches to concept analysis in the history, philosophy, and sociology of science (HPSS). We examine what LLMs add to existing methods, how they inherit longstanding problems, and review recent case studies that employ them. In the first part, we reconstruct computational conceptual history before LLMs by bringing together three strands of work: early digital methods in HPSS, distributional approaches from digital history and related research, and lexical semantic change detection. We provide an overview of the main challenges and opportunities, focusing on corpus construction, operationalization and modelling choices, and evaluation and interpretation. In the second part, we turn to the era of LLMs, starting with a short introduction to LLMs before reviewing LLM-based work on lexical semantic change detection and relevant case studies in HPSS. We then revisit the earlier methodological questions, showing how issues of corpus construction, model choice and training data, operationalization trade-offs, and evaluation and interpretation play out in LLM-based workflows.

2603.24747 2026-06-04 cs.AI cs.MA

Formal Semantics for Agentic Tool Protocols: A Process Calculus Approach

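Andreas Schlapbach

AI summary: Formalizes two agentic tool protocols (SGD and MCP) in a process calculus, proves that they are structurally bisimilar under a mapping Phi while the reverse mapping is lossy, and proposes the MCP+ extension to achieve full equivalence.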

Comments Logical flaw in Theorem 21

Abstract

The emergence of large language model agents capable of invoking external tools has created urgent need for formal verification of agent protocols. Two paradigms dominate this space: Schema-Guided Dialogue (SGD), a research framework for zero-shot API generalization, and the Model Context Protocol (MCP), an industry standard for agent-tool integration. While both enable dynamic service discovery through schema descriptions, their formal relationship remains unexplored. Building on prior work establishing the conceptual convergence of these paradigms, we present the first process calculus formalization of SGD and MCP, proving they are structurally bisimilar under a well-defined mapping Phi. However, we demonstrate that the reverse mapping Phi^{-1} is partial and lossy, revealing critical gaps in MCP's expressivity. Through bidirectional analysis, we identify five principles -- semantic completeness, explicit action boundaries, failure mode documentation, progressive disclosure compatibility, and inter-tool relationship declaration -- as necessary and sufficient conditions for full behavioral equivalence. We formalize these principles as type-system extensions MCP+, proving MCP+ is isomorphic to SGD. Our work provides the first formal foundation for verified agent systems and establishes schema quality as a provable safety property.

2602.12643 2026-06-04 cs.LG cs.AI stat.ML

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

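Jashaswimalya Acharjee, Balaraman Ravindran

AI summary: Proposes the Unified Latent Dynamics algorithm, which embeds state-action pairs into a latent space where the value function is approximately linear, combining model-free efficiency with the strengths of model-based representations without planning overhead, and matches or exceeds specialized baselines across 80 environments.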

Comments Similarities found with a prior work. Hence, requesting for withdrawal until further notice

Abstract

We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a latent space in which the true value function is approximately linear, our method supports a single set of hyperparameters across diverse domains -- from continuous control with low-dimensional and pixel inputs to high-dimensional Atari games. We prove that, under mild conditions, the fixed point of our embedding-based temporal-difference updates coincides with that of a corresponding linear model-based value expansion, and we derive explicit error bounds relating embedding fidelity to value approximation quality. In practice, ULD employs synchronized updates of encoder, value, and policy networks, auxiliary losses for short-horizon predictive dynamics, and reward-scale normalization to ensure stable learning under sparse rewards. Evaluated on 80 environments spanning Gym locomotion, DeepMind Control (proprioceptive and visual), and Atari, our approach matches or exceeds the performance of specialized model-free and general model-based baselines -- achieving cross-domain competence with minimal tuning and a fraction of the parameter footprint. These results indicate that value-aligned latent representations alone can deliver the adaptability and sample efficiency traditionally attributed to full model-based planning.

1710.04465 2026-06-04 cs.RO cs.SY eess.SY stat.CO

Markerless visual servoing on unknown objects for humanoid robot platforms

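Claudio Fantacci, Giulia Vezzani, Ugo Pattacini, Vadim Tikhanoff, Lorenzo Natale

AI summary: Proposes a framework for markerless visual servoing on unknown objects for humanoid robots: stereo vision is used to compute the graspable volume of the object, recursive Bayesian filtering estimates the 6D pose of the end-effector, a nonlinear constrained optimization problem yields the desired pose, and image-based visual servo control drives the end-effector to it.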

Journal ref
IEEE International Conference on Robotics and Automation (ICRA), 2018

Abstract

To precisely reach for an object with a humanoid robot, it is of central importance to have good knowledge of both end-effector, object pose and shape. In this work we propose a framework for markerless visual servoing on unknown objects, which is divided in four main parts: I) a least-squares minimization problem is formulated to find the volume of the object graspable by the robot's hand using its stereo vision; II) a recursive Bayesian filtering technique, based on Sequential Monte Carlo (SMC) filtering, estimates the 6D pose (position and orientation) of the robot's end-effector without the use of markers; III) a nonlinear constrained optimization problem is formulated to compute the desired graspable pose about the object; IV) an image-based visual servo control commands the robot's end-effector toward the desired pose. We demonstrate effectiveness and robustness of our approach with extensive experiments on the iCub humanoid robot platform, achieving real-time computation, smooth trajectories and sub-pixel precisions.

1905.12191 2026-06-04 cs.RO cs.MA cs.SY eess.SY

CARE: Cooperative Autonomy for Resilience and Efficiency of Robot Teams for Complete Coverage of Unknown Environments under Robot Failures

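Junnan Song, Shalabh Gupta

AI summary: Proposes CARE, a distributed algorithm for multi-robot coverage path planning in unknown environments that provides resilience to robot failures and improves overall efficiency, reallocating tasks through event-driven replanning; experiments show complete coverage under failures, reduced coverage time, and faster target discovery.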

Journal ref
Autonomous Robots, volume 44, 2020

Abstract

This paper addresses the problem of Multi-robot Coverage Path Planning (MCPP) for unknown environments in the presence of robot failures. Unexpected robot failures can seriously degrade the performance of a robot team and in extreme cases jeopardize the overall operation. Therefore, this paper presents a distributed algorithm, called Cooperative Autonomy for Resilience and Efficiency (CARE), which not only provides resilience to the robot team against failures of individual robots, but also improves the overall efficiency of operation via event-driven replanning. The algorithm uses distributed Discrete Event Supervisors (DESs), which trigger games between a set of feasible players in the event of a robot failure or idling, to make collaborative decisions for task reallocations. The game-theoretic structure is built using Potential Games, where the utility of each player is aligned with a shared objective function for all players. The algorithm has been validated in various complex scenarios on a high-fidelity robotic simulator, and the results demonstrate that the team achieves complete coverage under failures, reduced coverage time, and faster target discovery as compared to three alternative methods.
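
The role of a potential game in task reallocation can be sketched with best-response dynamics. This is not the CARE algorithm itself: it assumes the simplest case, an identical-interest game in which every robot's utility equals the shared objective, and the robot names, task names, values, and costs are all hypothetical.

```python
def potential(assignment: dict, value: dict, cost: dict) -> float:
    """Shared objective: value of the distinct tasks covered minus travel costs."""
    covered = set(assignment.values())
    travel = sum(cost[robot][task] for robot, task in assignment.items())
    return sum(value[task] for task in covered) - travel


def best_response_dynamics(robots: list, tasks: list, value: dict, cost: dict) -> dict:
    """Let robots switch tasks one at a time while the shared objective improves."""
    assignment = {robot: tasks[0] for robot in robots}
    improved = True
    while improved:
        improved = False
        for robot in robots:
            best_task = assignment[robot]
            best_val = potential(assignment, value, cost)
            for task in tasks:
                trial = dict(assignment)
                trial[robot] = task
                val = potential(trial, value, cost)
                if val > best_val:  # strict improvement guarantees termination
                    best_task, best_val = task, val
            if best_task != assignment[robot]:
                assignment[robot] = best_task
                improved = True
    return assignment


# Hypothetical scenario: two surviving robots, two open tasks.
robots = ["r1", "r2"]
tasks = ["A", "B"]
value = {"A": 10.0, "B": 8.0}
cost = {"r1": {"A": 1.0, "B": 4.0}, "r2": {"A": 3.0, "B": 2.0}}
print(best_response_dynamics(robots, tasks, value, cost))
```

Every switch strictly raises the shared objective and there are finitely many assignments, so the loop stops at a Nash equilibrium. In this toy scenario it stops at r1 on B and r2 on A (objective 11), although swapping both robots would reach 15. An equilibrium of a potential game is a local optimum of the potential, not necessarily the global one.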

1812.03412 2026-06-04 cs.LG cs.NA math.NA stat.ML

Learning Multiplication-free Linear Transformations

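Cristian Rusu

AI summary: Proposes dictionary learning algorithms for sparse representations that impose specific structures on the learned dictionaries to make them numerically efficient, reducing the number of additions and multiplications or avoiding multiplications altogether. The work builds on factorizations of the dictionary into highly structured basic building blocks (binary orthonormal, scaling, and shear transformations), for which the optimization problems have closed-form solutions. The methods are demonstrated on image data and compared against well-known numerically efficient transforms such as the fast Fourier and fast discrete cosine transforms.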

Abstract

In this paper, we propose several dictionary learning algorithms for sparse representations that also impose specific structures on the learned dictionaries such that they are numerically efficient to use: reduced number of addition/multiplications and even avoiding multiplications altogether. We base our work on factorizations of the dictionary in highly structured basic building blocks (binary orthonormal, scaling and shear transformations) for which we can write closed-form solutions to the optimization problems that we consider. We show the effectiveness of our methods on image data where we can compare against well-known numerically efficient transforms such as the fast Fourier and the fast discrete cosine transforms.
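
To show what a multiplication-free linear transform looks like in practice, the sketch below implements the classical fast Walsh-Hadamard transform, which applies a structured matrix using only additions and subtractions. It is a stand-in chosen for illustration, not one of the learned dictionaries from the paper.

```python
def fwht(x: list) -> list:
    """Unnormalized fast Walsh-Hadamard transform (Sylvester ordering).

    Uses n*log2(n) additions/subtractions and no multiplications on the data.
    """
    a = list(x)
    n = len(a)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly stage: combine entries that are h apart.
        for start in range(0, n, 2 * h):
            for j in range(start, start + h):
                u, v = a[j], a[j + h]
                a[j], a[j + h] = u + v, u - v
        h *= 2
    return a


print(fwht([1, 0, 1, 0]))  # [2, 2, 0, 0]
```

Applying the transform twice returns the input scaled by its length, which is why a single scaling step is enough to make it orthonormal. The scaling and shear building blocks named in the abstract play a comparable role in the learned factorizations.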

1601.03094 2026-06-04 cs.CV cs.SY eess.SY math.OC

A metric for sets of trajectories that is practical and mathematically consistent

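José Bento, Jia Jie Zhu

AI summary: Proposes a new metric for sets of trajectories that addresses two problems: existing mathematically consistent metrics are of limited practical use, and the practical notions of closeness are inconsistent. The new metric can be computed quickly, handles confusion of trajectory identities optimally, and is a valid metric in the mathematical sense.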

Comments Submitted to IEEE Transactions on Signal Processing

Abstract

Metrics on the space of sets of trajectories are important for scientists in the field of computer vision, machine learning, robotics, and general artificial intelligence. However, existing notions of closeness between sets of trajectories are either mathematically inconsistent or of limited practical use. In this paper, we outline the limitations in the current mathematically-consistent metrics, which are based on OSPA (Schuhmacher et al. 2008); and the inconsistencies in the heuristic notions of closeness used in practice, whose main ideas are common to the CLEAR MOT measures (Keni and Rainer 2008) widely used in computer vision. In two steps, we then propose a new intuitive metric between sets of trajectories and address these limitations. First, we explain a solution that leads to a metric that is hard to compute. Then we modify this formulation to obtain a metric that is easy to compute while keeping the useful properties of the previous metric. Our notion of closeness is the first demonstrating the following three features: the metric 1) can be quickly computed, 2) incorporates confusion of trajectories' identity in an optimal way, and 3) is a metric in the mathematical sense.
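
For context, the OSPA distance that the abstract cites as the basis of existing mathematically consistent metrics can be computed as follows for two finite point sets. This is a sketch of OSPA only, not of the new trajectory metric, and it searches assignments by brute force, so it suits only small sets. The cutoff `c` and order `p` are the usual OSPA parameters; the default values are arbitrary.

```python
import itertools
import math


def ospa(X: list, Y: list, c: float = 1.0, p: float = 2.0) -> float:
    """OSPA distance between two finite sets of points (tuples of equal dimension)."""
    if len(X) > len(Y):
        X, Y = Y, X  # ensure X is the smaller set
    m, n = len(X), len(Y)
    if n == 0:
        return 0.0  # both sets are empty
    # Best assignment of the m points in X to distinct points in Y,
    # with each point-to-point distance cut off at c.
    best = min(
        sum(min(c, math.dist(x, Y[j])) ** p for x, j in zip(X, perm))
        for perm in itertools.permutations(range(n), m)
    )
    # Each of the n - m unassigned points is penalized with the cutoff c.
    return ((best + (c ** p) * (n - m)) / n) ** (1.0 / p)


print(ospa([(0.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (0.0, 0.0)]))  # 0.0
print(ospa([], [(0.0, 0.0)], c=2.0))                              # 2.0
```

The minimization over assignments is what makes the distance independent of how the points are labelled. The cardinality penalty is what distinguishes sets of different sizes.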

1904.02851 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY

Planning under non-rational perception of uncertain spatial costs

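Aamodh Suresh, Sonia Martinez

AI summary: Studies motion-planning strategies that account for non-rational risk perception under uncertain spatial costs, proposes generating perceived risk maps with Cumulative Prospect Theory (CPT), and validates CPT's modeling power in theory and simulation, showing advantages in path planning over other risk perception models such as CVaR.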

Comments 12 pages and 10 figures. This revision adds more explanation and clearer figures

Abstract

This work investigates the design of risk-perception-aware motion-planning strategies that incorporate non-rational perception of risks associated with uncertain spatial costs. Our proposed method employs the Cumulative Prospect Theory (CPT) to generate a perceived risk map over a given environment. CPT-like perceived risks and path-length metrics are then combined to define a cost function that is compliant with the requirements of asymptotic optimality of sampling-based motion planners (RRT*). The modeling power of CPT is illustrated in theory and in simulation, along with a comparison to other risk perception models like Conditional Value at Risk (CVaR). Theoretically, we define a notion of expressiveness for a risk perception model and show that CPT's is higher than that of CVaR and expected risk. We then show that this expressiveness translates to our path planning setting, where we observe that a planner equipped with CPT together with a simultaneous perturbation stochastic approximation (SPSA) method can better approximate arbitrary paths in an environment. Additionally, we show in simulation that our planner captures a rich set of meaningful paths, representative of different risk perceptions in a custom environment. We then compare the performance of our planner with T-RRT* (a planner for continuous cost spaces) and Risk-RRT* (a risk-aware planner for dynamic human obstacles) through simulations in cluttered and dynamic environments respectively, showing the advantage of our proposed planner.
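
The difference between the risk perception models compared in the abstract can be made concrete on a small discrete cost distribution. The sketch below uses the Tversky-Kahneman probability weighting function in a rank-dependent form and a simple empirical CVaR estimator. The paper's exact CPT parameterization is not given in the abstract, so the functional forms, the `gamma` value, and the sample numbers are assumptions made for illustration.

```python
import math


def tk_weight(p: float, gamma: float) -> float:
    """Tversky-Kahneman weighting: overweights small probabilities when gamma < 1."""
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def perceived_cost(costs: list, probs: list, gamma: float) -> float:
    """Rank-dependent perceived cost: weights are applied to tail probabilities."""
    order = sorted(range(len(costs)), key=lambda i: costs[i])
    total = 0.0
    tail = 1.0  # probability of a cost at least as large as the current one
    for i in order:
        next_tail = max(0.0, tail - probs[i])
        total += (tk_weight(tail, gamma) - tk_weight(next_tail, gamma)) * costs[i]
        tail = next_tail
    return total


def empirical_cvar(samples: list, alpha: float) -> float:
    """Mean of the worst (1 - alpha) fraction of equally likely cost samples."""
    worst_first = sorted(samples, reverse=True)
    k = max(1, math.ceil(round((1.0 - alpha) * len(samples), 9)))
    return sum(worst_first[:k]) / k


costs, probs = [1.0, 2.0, 10.0], [0.5, 0.3, 0.2]
print(perceived_cost(costs, probs, gamma=1.0))  # equals the expected cost
print(perceived_cost(costs, probs, gamma=0.7))  # rare high cost is overweighted
print(empirical_cvar(list(range(1, 11)), alpha=0.8))
```

With `gamma = 1` the weighting is the identity and the perceived cost reduces to the expected cost. Lowering `gamma` inflates the weight of the rare, expensive outcome. CVaR has a single knob `alpha` that only selects how much of the tail to average, which gives some intuition for why a weighting-based model can represent a wider range of perceptions.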

1809.04048 2026-06-04 cs.RO cs.SY eess.SY

Accurate Tracking of Aggressive Quadrotor Trajectories using Incremental Nonlinear Dynamic Inversion and Differential Flatness

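Ezra Tal, Sertac Karaman

AI summary: Proposes a new control law for tracking a quadrotor's position and yaw angle and their derivatives up to fourth order, namely velocity, acceleration, jerk, and snap along with yaw rate and yaw acceleration. Jerk and snap are tracked using feedforward inputs based on the differential flatness of the quadrotor dynamics. Snap tracking is achieved by directly controlling body torque through closed-loop motor speed control. The controller uses incremental nonlinear dynamic inversion (INDI) for robust tracking of linear and angular accelerations despite external disturbances such as aerodynamic drag. The control law is analyzed rigorously through response analysis and demonstrated in experiments. It enables a quadrotor UAV to track complex 3D trajectories at speeds up to 12.9 m/s and accelerations up to 2.1g while keeping the root-mean-square tracking error down to 6.6 cm, in a flight volume of roughly 18 m × 7 m × 3 m. Robustness is further demonstrated by attaching a drag plate in flight tests and by pulling on the UAV during hover.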

Comments To be published in IEEE Transactions on Control Systems Technology. Revision: new set of experiments at increased speed (up to 12.9 m/s), updated controller design using quaternion representation, new video available at https://youtu.be/K15lNBAKDCs

Abstract

Autonomous unmanned aerial vehicles (UAVs) that can execute aggressive (i.e., high-speed and high-acceleration) maneuvers have attracted significant attention in the past few years. This paper focuses on accurate tracking of aggressive quadcopter trajectories. We propose a novel control law for tracking of position and yaw angle and their derivatives of up to fourth order, specifically, velocity, acceleration, jerk, and snap along with yaw rate and yaw acceleration. Jerk and snap are tracked using feedforward inputs for angular rate and angular acceleration based on the differential flatness of the quadcopter dynamics. Snap tracking requires direct control of body torque, which we achieve using closed-loop motor speed control based on measurements from optical encoders attached to the motors. The controller utilizes incremental nonlinear dynamic inversion (INDI) for robust tracking of linear and angular accelerations despite external disturbances, such as aerodynamic drag forces. Hence, prior modeling of aerodynamic effects is not required. We rigorously analyze the proposed control law through response analysis, and we demonstrate it in experiments. The controller enables a quadcopter UAV to track complex 3D trajectories, reaching speeds up to 12.9 m/s and accelerations up to 2.1g, while keeping the root-mean-square tracking error down to 6.6 cm, in a flight volume that is roughly 18 m by 7 m and 3 m tall. We also demonstrate the robustness of the controller by attaching a drag plate to the UAV in flight tests and by pulling on the UAV with a rope during hover.
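
A controller that tracks derivatives up to snap needs a reference trajectory that supplies them. The sketch below shows only that first step for one axis, assuming a polynomial reference whose coefficients are hypothetical. The mapping from jerk and snap to angular rate and angular acceleration feedforward, and the INDI loop itself, are not shown.

```python
def poly_derivative(coeffs: list) -> list:
    """Differentiate a polynomial; coeffs[k] multiplies t**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def poly_eval(coeffs: list, t: float) -> float:
    """Evaluate the polynomial at t with Horner's rule (empty list gives 0)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * t + c
    return result


def flat_output_references(coeffs: list, t: float) -> dict:
    """Position and its first four time derivatives at time t for one axis."""
    names = ("position", "velocity", "acceleration", "jerk", "snap")
    refs = {}
    current = list(coeffs)
    for name in names:
        refs[name] = poly_eval(current, t)
        current = poly_derivative(current)
    return refs


# Hypothetical reference x(t) = t**4, evaluated at t = 2 s.
print(flat_output_references([0, 0, 0, 0, 1], 2.0))
# {'position': 16.0, 'velocity': 32.0, 'acceleration': 48.0, 'jerk': 48.0, 'snap': 24.0}
```

A polynomial of degree at least four is needed for the snap reference to be nonzero. Trajectory generators for quadrotors often use piecewise polynomials for this reason.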

1810.03345 2026-06-04 cs.RO cs.SY eess.SY

Bounded Collision Force by the Sobolev Norm

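Kevin Haninger, Dragoljub Surdilovic

AI summary: Adapts the Sobolev norm as a system norm to give rigorous bounds on the maximum collision force in dynamic systems, and validates the approach experimentally and in simulation for analyzing how robot collision force relates to control strategy, joint flexibility, and end-effector compliance.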

Comments Accepted ICRA2019, supported by EU H2020 programme, Grant #820689

Abstract

A robot making contact with an environment or human presents potential safety risks, including excessive collision force. While experiments on the effect of robot inertia, relative velocity, and interface stiffness on collision are in literature, analytical models for maximum collision force are limited to a simplified mass-spring robot model. This simplified model limits the analysis of control (force/torque, impedance, or admittance) or compliant robots (joint and end-effector compliance). Here, the Sobolev norm is adapted to be a system norm, giving rigorous bounds on the maximum force on a stiffness element in a general dynamic system, allowing the study of collision with more accurate models and feedback control. The Sobolev norm can be found through the $\mathcal{H}_2$ norm of a transformed system, allowing efficient computation, connection with existing control theory, and controller synthesis to minimize collision force. The Sobolev norm is validated, first experimentally with an admittance-controlled robot, then in simulation with a linear flexible-joint robot. It is then used to investigate the impact of control, joint flexibility and end-effector compliance on collision, and a trade-off between collision performance and environmental estimation uncertainty is shown.
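
The reason a Sobolev norm can bound a peak value is the standard one-dimensional Sobolev embedding, sketched below for a signal $f$ with $f$ and $f'$ square-integrable on the real line. The abstract does not state the paper's exact norm or weighting, so this is the textbook inequality rather than the paper's result. In the collision setting, $f$ would play the role of the deflection of the stiffness element, whose peak is proportional to the peak force.

```latex
% Standard 1-D Sobolev embedding for f in H^1(\mathbb{R}); f vanishes at infinity.
% The first line uses the fundamental theorem of calculus from both ends;
% the second adds the two expressions, then applies Cauchy-Schwarz and AM-GM.
\begin{align}
  f(t)^2 &= \int_{-\infty}^{t} 2 f(s) f'(s)\,ds
          = -\int_{t}^{\infty} 2 f(s) f'(s)\,ds \\
  \Rightarrow\quad f(t)^2 &\le \int_{-\infty}^{\infty} |f(s)|\,|f'(s)|\,ds
      \le \|f\|_{L^2}\,\|f'\|_{L^2}
      \le \tfrac{1}{2}\left(\|f\|_{L^2}^2 + \|f'\|_{L^2}^2\right) \\
  \Rightarrow\quad \|f\|_{\infty} &\le \tfrac{1}{\sqrt{2}}\,\|f\|_{W^{1,2}},
  \qquad \|f\|_{W^{1,2}}^2 = \|f\|_{L^2}^2 + \|f'\|_{L^2}^2 .
\end{align}
```

Both terms on the right are energy-type ($L^2$) quantities of a signal and its derivative, which is consistent with the abstract's remark that the norm can be obtained from the $\mathcal{H}_2$ norm of a transformed system.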

1904.13317 2026-06-04 cs.RO cs.LG cs.SY eess.SY

A data-efficient geometrically inspired polynomial kernel for robot inverse dynamics

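Alberto Dalla Libera, Ruggero Carli

AI summary: Proposes a data-driven inverse dynamics estimator based on Gaussian Process Regression and introduces the Geometrically Inspired Polynomial (GIP) kernel, which describes the inverse dynamics as a polynomial function on a suitable input space. The kernel is proved to define a finite-dimensional Reproducing Kernel Hilbert Space containing the inverse dynamics function computed through rigid body dynamics. Experiments show that the approach is more data-efficient and generalizes better than other data-driven methods, while requiring less prior information than model-based methods and being unaffected by model bias.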

Journal ref
IEEE Robotics and Automation Letters, vol. 5, no. 1, pp. 24-31, Jan. 2020

Abstract

In this paper, we introduce a novel data-driven inverse dynamics estimator based on Gaussian Process Regression. Driven by the fact that the inverse dynamics can be described as a polynomial function on a suitable input space, we propose the use of a novel kernel, called Geometrically Inspired Polynomial Kernel (GIP). The resulting estimator behaves similarly to model-based approaches as concerns data efficiency. Indeed, we proved that the GIP kernel defines a finite-dimensional Reproducing Kernel Hilbert Space that contains the inverse dynamics function computed through the Rigid Body Dynamics. The proposed kernel is based on the recently introduced Multiplicative Polynomial Kernel, a redefinition of the classical polynomial kernel equipped with a set of parameters that allows for a higher regularization. We tested the proposed approach in a simulated environment, and also in real experiments with a UR10 robot. The obtained results confirm that, compared to other data-driven estimators, the proposed approach is more data-efficient and exhibits better generalization properties. Instead, with respect to model-based estimators, our approach requires less prior information and is not affected by model bias.
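
The data-efficiency argument rests on the target function lying inside a finite-dimensional polynomial function space. The sketch below illustrates this with Gaussian Process regression under the classical polynomial kernel, not the GIP or Multiplicative Polynomial Kernel from the paper. The quadratic target, the sample points, and the noise level are hypothetical.

```python
def poly_kernel(x: tuple, y: tuple, degree: int) -> float:
    """Classical inhomogeneous polynomial kernel (1 + <x, y>)**degree."""
    return (1.0 + sum(a * b for a, b in zip(x, y))) ** degree


def solve(A: list, b: list) -> list:
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def gp_posterior_mean(X: list, y: list, x_star: tuple,
                      degree: int = 2, noise: float = 1e-6) -> float:
    """GP posterior mean k(x*, X) (K + noise * I)^-1 y."""
    K = [[poly_kernel(xi, xj, degree) + (noise if i == j else 0.0)
          for j, xj in enumerate(X)] for i, xi in enumerate(X)]
    alpha = solve(K, y)
    return sum(a * poly_kernel(xi, x_star, degree) for a, xi in zip(alpha, X))


# Hypothetical target f(x) = 2x^2 - x + 1, observed at four points only.
X = [(-1.0,), (-0.5,), (0.5,), (1.0,)]
y = [2 * x[0] ** 2 - x[0] + 1 for x in X]
print(gp_posterior_mean(X, y, (0.3,)))  # close to f(0.3) = 0.88
```

Because the degree-2 kernel spans exactly the quadratics in one variable, four samples are enough to recover the target almost exactly at unseen inputs. A generic kernel such as the squared exponential would typically need more data for the same accuracy, which is the intuition behind matching the kernel to the polynomial structure of the dynamics.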

1904.08831 2026-06-04 cs.LG cs.SY eess.SY stat.ML

Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs

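Matthew A. Wright, Simon F. G. Ehlers, Roberto Horowitz

AI summary: Proposes a neural-attention-based deep learning architecture for modeling traffic dynamics on lane graphs, improves prediction performance by explicitly encoding the lane-to-lane relationship types, and demonstrates transfer of the model to a more complex road network.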

Comments To appear at 2019 IEEE Conference on Intelligent Transportation Systems

Abstract

Deep neural networks can be powerful tools, but require careful application-specific design to ensure that the most informative relationships in the data are learnable. In this paper, we apply deep neural networks to the nonlinear spatiotemporal physics problem of vehicle traffic dynamics. We consider problems of estimating macroscopic quantities (e.g., the queue at an intersection) at a lane level. First-principles modeling at the lane scale has been a challenge due to complexities in modeling social behaviors like lane changes, and those behaviors' resultant macro-scale effects. Following domain knowledge that upstream/downstream lanes and neighboring lanes affect each others' traffic flows in distinct ways, we apply a form of neural attention that allows the neural network layers to aggregate information from different lanes in different manners. Using a microscopic traffic simulator as a testbed, we obtain results showing that an attentional neural network model can use information from nearby lanes to improve predictions, and, that explicitly encoding the lane-to-lane relationship types significantly improves performance. We also demonstrate the transfer of our learned neural network to a more complex road network, discuss how its performance degradation may be attributable to new traffic behaviors induced by increased topological complexity, and motivate learning dynamics models from many road network topologies.

1903.12311 2026-06-04 cs.RO cs.SY eess.SY

Mesh-based Tools to Analyze Deep Reinforcement Learning Policies for Underactuated Biped Locomotion

基于网格的工具用于分析深度强化学习策略在欠驱动双足运动中的稳定性与鲁棒性

Nihar Talele, Katie Byl

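AI summary This paper proposes a mesh-based method for analyzing the stability and robustness of biped locomotion policies obtained via deep reinforcement learning, offering a quantitative and more computationally efficient tool for assessing the robustness of such policies.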

详情
AI中文摘要

在本文中,我们提出了一种基于网格的方法,用于分析通过深度强化学习获得的五连杆平面模型各种双足步态策略的稳定性与鲁棒性。直观上,人们可能会认为在训练过程中包含扰动和/或其它类型的噪声会导致最终的控制策略更加鲁棒。然而,人们也希望有一个定量且计算高效的手段来评估这种可能性的程度。而不是依赖蒙特卡洛模拟,这种模拟在量化性能指标时可能会变得计算负担很重,我们的目标是提供更复杂的工具来评估此类策略的鲁棒性特性。我们的工作受双重视假的启发,即当动态收缩可行时,可以简化所需控制策略的复杂性,并且通过深度学习获得的控制策略可能会倾向于收缩到全状态空间中的低维流形,从而产生这种倾向。本文中网格工具的可操作性提供了一些证据表明这可能是正确的。

英文摘要

In this paper, we present a mesh-based approach to analyze stability and robustness of the policies obtained via deep reinforcement learning for various biped gaits of a five-link planar model. Intuitively, one would expect that including perturbations and/or other types of noise during training would likely result in more robustness of the resulting control policy. However, one would also like to have a quantitative and computationally efficient means of evaluating the degree to which this might be so. Rather than relying on Monte Carlo simulations, which can become quite computationally burdensome in quantifying performance metrics, our goal is to provide more sophisticated tools to assess robustness properties of such policies. Our work is motivated by the twin hypotheses that contraction of dynamics, when achievable, can simplify the required complexity of a control policy and that control policies obtained via deep learning may therefore exhibit a tendency to contract to lower-dimensional manifolds within the full state space. The tractability of our mesh-based tools in this work provides some evidence that this may be so.

1812.04426 2026-06-04 cs.LG cs.NA math.NA physics.comp-ph stat.ML

PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network

PDE-Net 2.0:基于数据学习PDE的数值-符号混合深度网络

Zichao Long, Yiping Lu, Bin Dong

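AI summary This paper proposes PDE-Net 2.0, a deep network that combines numerical approximation with symbolic computation to learn partial differential equations from dynamic data, with high flexibility and expressive power.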

Comments 16 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:1710.09668

详情
AI中文摘要

偏微分方程(PDEs)通常是基于经验观察推导得出的。然而,技术的进步使我们能够收集和存储大量数据,这为数据驱动的PDE发现提供了新机会。本文提出了一种新的深度神经网络,称为PDE-Net 2.0,用于从观测动态数据中发现(时间依赖的)PDE,仅需少量对驱动动态机制的先验知识。PDE-Net 2.0的设计基于我们先前的工作\cite{Long2018PDE},其中提出了原始版本的PDE-Net。PDE-Net 2.0是通过卷积近似微分算子和用于模型恢复的符号多层神经网络的结合。与现有方法相比,PDE-Net 2.0通过学习微分算子和PDE模型的非线性响应函数,具有最大的灵活性和表达能力。数值实验表明,PDE-Net 2.0有潜力揭示观测动态的隐藏PDE,并在噪声环境中预测相对较长时间的动力学行为。

英文摘要

Partial differential equations (PDEs) are commonly derived based on empirical observations. However, recent advances in technology enable us to collect and store massive amounts of data, which offers new opportunities for data-driven discovery of PDEs. In this paper, we propose a new deep neural network, called PDE-Net 2.0, to discover (time-dependent) PDEs from observed dynamic data with little prior knowledge of the underlying mechanism that drives the dynamics. The design of PDE-Net 2.0 is based on our earlier work \cite{Long2018PDE} where the original version of PDE-Net was proposed. PDE-Net 2.0 is a combination of numerical approximation of differential operators by convolutions and a symbolic multi-layer neural network for model recovery. Compared with existing approaches, PDE-Net 2.0 has the most flexibility and expressive power by learning both differential operators and the nonlinear response function of the underlying PDE model. Numerical experiments show that PDE-Net 2.0 has the potential to uncover the hidden PDE of the observed dynamics, and predict the dynamical behavior for a relatively long time, even in a noisy environment.
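
The idea of approximating a differential operator by a convolution can be illustrated with a fixed finite-difference filter. The sketch below is not the PDE-Net 2.0 implementation (which, per the abstract, learns its operators); the helper names, the 1D grid, and the 3-tap central-difference kernel are assumptions chosen for illustration.

```python
import math

def conv1d_valid(signal, kernel):
    """Cross-correlate signal with kernel in 'valid' mode (no padding)."""
    k = len(kernel)
    return [sum(kernel[j] * signal[i + j] for j in range(k))
            for i in range(len(signal) - k + 1)]

def central_difference_kernel(h):
    """Hypothetical helper: 3-tap filter approximating d/dx on a grid of spacing h."""
    return [-1.0 / (2.0 * h), 0.0, 1.0 / (2.0 * h)]

# Sample u(x) = sin(x) on a uniform grid over [0, 2*pi).
n = 200
h = 2.0 * math.pi / n
xs = [i * h for i in range(n)]
u = [math.sin(x) for x in xs]

# du[i] approximates u'(xs[i + 1]) = cos(xs[i + 1]).
du = conv1d_valid(u, central_difference_kernel(h))
max_err = max(abs(du[i] - math.cos(xs[i + 1])) for i in range(len(du)))
```

The error of this filter shrinks quadratically with the grid spacing `h`; a learned filter would replace the fixed kernel entries with trainable ones.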

1904.03830 2026-06-04 cs.RO cs.SY eess.SY

A Hybrid Compositional Approach to Optimal Mission Planning for Multi-rotor UAVs using Metric Temporal Logic

一种基于度量时序逻辑的混合组合方法用于多旋翼无人机最优任务规划

Usman A. Fiaz, John S. Baras

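AI summary This paper proposes a hybrid compositional approach to planning a time-critical search and rescue mission for multi-rotor UAVs in a constrained environment. Task specifications are formalized in Metric Temporal Logic, a hybrid model captures the UAVs' operating modes, and the mission is decomposed into sub-tasks whose optimal paths are computed with a mixed integer linear programming solver, reducing computational complexity and enabling real-time use.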

Comments 8 pages, 5 figures, 1 table. Fixed typos, added new references

详情
AI中文摘要

本文研究了一种混合组合方法用于多旋翼无人机的最优任务规划。我们考虑了一个时间敏感的搜索和救援场景,其中两个四旋翼无人机在一个受约束的环境中运行。度量时序逻辑(MTL)用于正式描述任务规范。为了捕捉无人机的各种操作模式,我们使用一个混合模型,该模型在不同操作点周围线性化动力学。我们通过利用各种任务规范的不变性质,即路径上的安全性和时间约束的相互独立性以及机器人不同模式(即动力学)来将任务分解为多个子任务。对于每个子任务,我们将MTL公式转换为线性约束,并使用混合整数线性规划(MILP)求解器求解所需的路径最优控制问题。完整的路径由各个最优子路径的组合构成。我们证明所得到的轨迹满足任务规范,并且所提出的方法显著降低了问题的计算复杂度,使其能够实现实时应用。

英文摘要

This paper investigates a hybrid compositional approach to optimal mission planning for multi-rotor Unmanned Aerial Vehicles (UAVs). We consider a time-critical search and rescue scenario with two quadrotors in a constrained environment. Metric Temporal Logic (MTL) is used to formally describe the task specifications. In order to capture the various modes of UAV operation, we utilize a hybrid model for the system with linearized dynamics around different operating points. We divide the mission into several sub-tasks by exploiting the invariant nature of various task specifications, i.e., the mutual independence of safety and timing constraints along the way, and the different modes (i.e., dynamics) of the robot. For each sub-task, we translate the MTL formulae into linear constraints, and solve the associated optimal control problem for the desired path using a Mixed Integer Linear Program (MILP) solver. The complete path is constructed by the composition of individual optimal sub-paths. We show that the resulting trajectory satisfies the task specifications, and that the proposed approach leads to a significant reduction in the computational complexity of the problem, making it possible to implement in real time.

1807.08229 2026-06-04 cs.AI cs.RO cs.SY eess.SY

Optimal Continuous State POMDP Planning with Semantic Observations: A Variational Approach

基于语义观测的最优连续状态POMDP规划:一种变分方法

Luke Burks, Ian Loefgren, Nisar Ahmed

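AI summary This paper proposes variational strategies for optimal planning in continuous-state partially observable Markov decision processes (CPOMDPs) with semantic observations. Variational Bayes methods address the representation of, and reasoning over, hybrid continuous-discrete probabilistic models, improving the efficiency and robustness of dynamic decision-making tasks.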

Comments Final version accepted to IEEE Transactions on Robotics (in press as of August 2019)

详情
AI中文摘要

本文开发了用于利用语义观测进行最优规划的新策略,使用连续状态部分可观测马尔可夫决策过程(CPOMDP)。在高斯混合(GM)CPOMDP策略近似方法方面,提出了两项主要创新。尽管现有方法具有许多有益的理论性质,但它们无法高效地表示和推理混合连续-离散概率模型。第一项主要创新是通过softmax模型推导出连续-离散语义观测概率的变分贝叶斯GM近似,用于点基值迭代贝尔曼策略备份。这种方法的关键优势是可以在复杂的非高斯不确定性下进行动态决策,同时利用连续动态状态空间模型(从而避免繁琐且昂贵的离散化)。第二项主要创新是一种基于聚类的混合物凝聚技术,能够很好地扩展到非常大的GM策略函数和信念函数。针对目标搜索和拦截任务的仿真结果表明,这些创新所产生的GM策略比其他最先进的策略近似方法产生的策略更有效,但需要显著较少的建模开销和在线运行时间成本。此外,结果还显示该方法对模型误差具有鲁棒性,并能扩展到更高维度。

英文摘要

This work develops novel strategies for optimal planning with semantic observations using continuous state partially observable Markov decision processes (CPOMDPs). Two major innovations are presented in relation to Gaussian mixture (GM) CPOMDP policy approximation methods. While existing methods have many desirable theoretical properties, they are unable to efficiently represent and reason over hybrid continuous-discrete probabilistic models. The first major innovation is the derivation of closed-form variational Bayes GM approximations of Point-Based Value Iteration Bellman policy backups, using softmax models of continuous-discrete semantic observation probabilities. A key benefit of this approach is that dynamic decision-making tasks can be performed with complex non-Gaussian uncertainties, while also exploiting continuous dynamic state space models (thus avoiding cumbersome and costly discretization). The second major innovation is a new clustering-based technique for mixture condensation that scales well to very large GM policy functions and belief functions. Simulation results for a target search and interception task with semantic observations show that the GM policies resulting from these innovations are more effective than those produced by other state-of-the-art policy approximations, while requiring significantly less modeling overhead and online runtime cost. Additional results show the robustness of this approach to model errors and scaling to higher dimensions.

1905.04835 2026-06-04 cs.LG cs.CV cs.MA cs.RO cs.SY eess.SY stat.ML

Multi-Agent Image Classification via Reinforcement Learning

通过强化学习进行多智能体图像分类

Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč, Nader Motee

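AI summary This paper studies image classification with multiple mobile agents that collect partial, pose-dependent observations of an unknown environment. It proposes a network architecture that governs how agents form local beliefs, take local actions, and extract relevant features from raw partial observations. Agents update their beliefs by exchanging information with neighbors, and reinforcement learning is used to obtain a decentralized implementation of the classification problem.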

Comments Preprint of the paper to be published in IROS'19 proceedings

详情
AI中文摘要

我们研究了使用多个能够收集未知环境部分姿态依赖观测的移动智能体进行分类问题。目标是在有限的时间范围内对图像进行分类。我们提出了一种网络架构,用于指导智能体如何形成局部信念、采取局部行动并从原始部分观测中提取相关特征。智能体被允许与邻居智能体交换信息以更新自身信念。证明了如何利用强化学习技术通过运行去中心化共识协议来实现分类问题的去中心化实现。我们在MNIST手写数字数据集上的实验结果展示了我们所提框架的有效性。

英文摘要

We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment. The objective is to classify an image over a finite time horizon. We propose a network architecture that specifies how agents should form a local belief, take local actions, and extract relevant features from their raw partial observations. Agents are allowed to exchange information with their neighboring agents to update their own beliefs. It is shown how reinforcement learning techniques can be utilized to achieve decentralized implementation of the classification problem by running a decentralized consensus protocol. Our experimental results on the MNIST handwritten digit dataset demonstrate the effectiveness of our proposed framework.
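
The abstract does not specify which consensus protocol the agents run. As a rough illustration of the general idea, the sketch below uses standard linear average consensus: each agent repeatedly nudges its belief vector toward those of its neighbors. The ring topology, the step size, and the belief values are all assumptions.

```python
def consensus_step(beliefs, neighbors, step=0.3):
    """One synchronous update: each agent moves toward its neighbors' beliefs."""
    new_beliefs = []
    for i, own in enumerate(beliefs):
        updated = list(own)
        for j in neighbors[i]:
            for k in range(len(own)):
                updated[k] += step * (beliefs[j][k] - own[k])
        new_beliefs.append(updated)
    return new_beliefs

def run_consensus(beliefs, neighbors, iters=200, step=0.3):
    """Iterate the averaging update a fixed number of times."""
    for _ in range(iters):
        beliefs = consensus_step(beliefs, neighbors, step)
    return beliefs

# Four hypothetical agents on a ring, each holding a 3-class belief vector.
neighbors = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
beliefs = [[0.7, 0.2, 0.1],
           [0.1, 0.8, 0.1],
           [0.3, 0.3, 0.4],
           [0.5, 0.4, 0.1]]
final = run_consensus(beliefs, neighbors)
```

Because the graph is undirected and connected and the step size is small enough, every agent converges to the average of the initial beliefs without any central coordinator.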

1904.04968 2026-06-04 cs.RO cs.SY eess.SY math.OC

Asymptotic Optimality of a Time Optimal Path Parametrization Algorithm

时间最优路径参数化算法的渐近最优性

Igor Spasojevic, Varun Murali, Sertac Karaman

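AI summary This paper studies the time-optimal path parametrization problem, proves that a linear-time algorithm is asymptotically optimal for all problems solved optimally by convex optimization approaches, and characterizes the optimum of the problem.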

详情
AI中文摘要

The time-optimal path parametrization problem is that of minimizing the time interval during which an actuation constrained agent can traverse a given path. Recently, an efficient linear-time algorithm for solving this problem was proposed. However, its optimality was proved for only a strict subclass of problems solved optimally by more computationally intensive approaches based on convex programming. In this paper, we prove that the same linear-time algorithm is asymptotically optimal for all problems solved optimally by convex optimization approaches. We also characterize the optimum of the Time Optimal Path Parametrization Problem, which may be of independent interest.

英文摘要

Time Optimal Path Parametrization is the problem of minimizing the time interval during which an actuation constrained agent can traverse a given path. Recently, an efficient linear-time algorithm for solving this problem was proposed. However, its optimality was proved for only a strict subclass of problems solved optimally by more computationally intensive approaches based on convex programming. In this paper, we prove that the same linear-time algorithm is asymptotically optimal for all problems solved optimally by convex optimization approaches. We also characterize the optimum of the Time Optimal Path Parametrization Problem, which may be of independent interest.
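
To give a feel for what a linear-time path parametrization pass looks like, here is a minimal sketch for a deliberately simplified setting: a path discretized into equal arc-length steps, a pointwise speed limit, and a symmetric bound on path acceleration. This is not the algorithm analyzed in the paper; the function names and the constraint model are assumptions.

```python
import math

def forward_backward_profile(v_max, ds, a_max):
    """Largest squared-speed profile x[i] = s_dot**2 that starts and ends at
    rest, respects s_dot <= v_max[i], and keeps |s_ddot| <= a_max.
    Over one segment with constant acceleration, x changes by 2 * a * ds."""
    n = len(v_max)
    x = [v * v for v in v_max]
    x[0] = 0.0
    x[-1] = 0.0
    # Forward pass: cap how fast the agent can speed up.
    for i in range(1, n):
        x[i] = min(x[i], x[i - 1] + 2.0 * a_max * ds)
    # Backward pass: cap how fast the agent must slow down.
    for i in range(n - 2, -1, -1):
        x[i] = min(x[i], x[i + 1] + 2.0 * a_max * ds)
    return x

def traversal_time(x, ds):
    """Total time with constant acceleration per segment:
    dt = 2 * ds / (v_i + v_{i+1}). Assumes no two consecutive zero speeds."""
    return sum(2.0 * ds / (math.sqrt(x[i]) + math.sqrt(x[i + 1]))
               for i in range(len(x) - 1))

# Hypothetical path: 11 grid points with a slow zone in the middle.
v_max = [2.0] * 5 + [0.5] + [2.0] * 5
ds, a_max = 0.1, 1.0
x = forward_backward_profile(v_max, ds, a_max)
total_time = traversal_time(x, ds)
```

Each pass touches every grid point once, so the cost is linear in the number of points. General formulations also handle state-dependent acceleration bounds, which this sketch omits.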

1903.05073 2026-06-04 cs.RO cs.LO cs.SY eess.SY

A Formal Safety Net for Waypoint Following in Ground Robots

为地面机器人路径跟随提供形式安全网

Brandon Bohrer, Yong Kiam Tan, Stefan Mitsch, Andrew Sogokon, André Platzer

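AI summary This paper presents a reusable, formally verified safety net that provides end-to-end safety and liveness guarantees for Dubins-type ground robots with tolerances and acceleration. Safety and liveness properties are proved formally, and a monitor is synthesized to enforce model compliance at runtime.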

详情
AI中文摘要

我们提出一个可重用的形式验证安全网,为具有容差和加速度的Dubins型地面机器人提供端到端的安全性和活跃性保证。我们:i) 用微分动态逻辑(dL)建模机器人,并对控制器和机器人运动学制定假设;ii) 证明具有速度限制的路径跟随形式安全性和活跃性属性;iii) 合成一个监控器,该监控器被自动证明在运行时强制执行模型合规性;iv) 我们使用VeriPhy工具链使这些保证延伸到不受信任的控制器、环境和计划的机器代码层面。安全网的保证适用于任何机器人,只要路径被安全选择且其模型中的物理假设成立。实验表明这些假设在实践中成立,存在合规性与性能之间的内在权衡。

英文摘要

We present a reusable formally verified safety net that provides end-to-end safety and liveness guarantees for 2D waypoint-following of Dubins-type ground robots with tolerances and acceleration. We: i) model a robot in differential dynamic logic (dL), and specify assumptions on the controller and robot kinematics, ii) prove formal safety and liveness properties for waypoint-following with speed limits, iii) synthesize a monitor, which is automatically proven to enforce model compliance at runtime, and iv) use the VeriPhy toolchain to carry these guarantees down to the level of machine code with untrusted controllers, environments, and plans. The guarantees for the safety net apply to any robot as long as the waypoints are chosen safely and the physical assumptions in its model hold. Experiments show these assumptions hold in practice, with an inherent trade-off between compliance and performance.

1905.11176 2026-06-04 cs.RO cs.SY eess.SY

Temporally Coupled Dynamical Movement Primitives in Cartesian Space

笛卡尔空间中耦合的动态运动素体

Martin Karlsson, Anders Robertsson, Rolf Johansson

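AI summary This paper proposes a control method for temporally coupled dynamical movement primitives (DMPs) in Cartesian space. Orientations are represented by unit quaternions, and the unit quaternion set minus one point is shown to be contractible, which is used to design the control system.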

详情
AI中文摘要

机器人在笛卡尔空间中的姿态控制涉及一些困难,因为旋转群SO(3)不是可缩的,只有全局可缩的状态空间才能支持连续且全局渐近稳定的反馈控制系统。本文利用单位四元数来表示姿态,并首次证明单位四元数集减去一个单点是可缩的。这一结果被用于设计笛卡尔空间中耦合动态运动素体(DMPs)的控制系统。该控制系统的功能在工业机器人上进行了实验验证。

英文摘要

Control of robot orientation in Cartesian space involves some difficulties, because the rotation group SO(3) is not contractible, and only globally contractible state spaces support continuous and globally asymptotically stable feedback control systems. In this paper, unit quaternions are used to represent orientations, and it is first shown that the unit quaternion set minus one single point is contractible. This is used to design a control system for temporally coupled dynamical movement primitives (DMPs) in Cartesian space. The functionality of the control system is verified experimentally on an industrial robot.
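
The contractibility claim can be made concrete with a standard construction: stereographic projection maps the unit quaternions minus one point onto R^3, where every point can be slid along a straight line to the origin. The abstract does not say which point the authors remove or how their proof proceeds, so the choice of (-1, 0, 0, 0) and the helper names below are assumptions.

```python
def stereographic(q):
    """Map a unit quaternion q = (w, x, y, z), q != (-1, 0, 0, 0), to R^3."""
    w, x, y, z = q
    d = 1.0 + w
    return (x / d, y / d, z / d)

def inverse_stereographic(p):
    """Map a point of R^3 back to the unit quaternion sphere."""
    r2 = p[0] ** 2 + p[1] ** 2 + p[2] ** 2
    s = 2.0 / (1.0 + r2)
    return ((1.0 - r2) / (1.0 + r2), s * p[0], s * p[1], s * p[2])

def contract(q, t):
    """Continuous deformation: t = 0 returns q, t = 1 returns the identity
    quaternion (1, 0, 0, 0). Shrinks the projected point toward the origin."""
    p = stereographic(q)
    return inverse_stereographic(tuple((1.0 - t) * c for c in p))

q = (0.5, 0.5, -0.5, 0.5)   # a unit quaternion
halfway = contract(q, 0.5)  # stays on the unit sphere for every t
```

The map `contract` is continuous in both `q` and `t` on the punctured sphere, which is exactly what contractibility requires; no such map exists once the removed point is put back.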

1905.09396 2026-06-04 cs.RO cs.SY eess.SY

Predictive Control for Chasing a Ground Vehicle using a UAV

使用无人机追击地面车辆的预测控制

Jaeseung Byun, Karan P. Jain, Siddharth H. Nair, Haoyun Xu, Jiaming Zha

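AI summary This paper proposes a high-level planner that lets a multirotor UAV chase a ground vehicle while respecting state and input constraints. Assuming a minimal kinematic model of the ground vehicle, it uses data collected online to generate predictions within a model predictive control framework, and the approach is demonstrated in simulation and in experiments on a quadcopter platform.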

详情
AI中文摘要

我们提出了一种多旋翼无人机追击地面车辆的高层规划器,同时满足各种状态和输入约束。假设地面车辆的最小运动学模型,我们利用在线收集的数据在模型预测控制框架内生成预测。我们的解决方案通过仿真和在稳定四旋翼平台上进行的实验得到了验证。

英文摘要

We propose a high-level planner for a multirotor to chase a ground vehicle, while simultaneously respecting various state and input constraints. Assuming a minimal kinematic model for the ground vehicle, we use data collected online to generate predictions for our planner within a model predictive control framework. Our solution is demonstrated both via simulations and via experiments on a stable quadcopter platform.

1905.05946 2026-06-04 cs.RO cs.CV cs.SY eess.SY

Depth map estimation methodology for detecting free-obstacle navigation areas

用于自由障碍区域检测的深度图估计方法

Sergio Trejo, Karla Martinez, Gerardo Flores

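AI summary This paper presents a vision-based method that uses a stereo camera and a one-dimensional LiDAR to estimate obstacle-free areas for quadrotor navigation. The depth map is filtered with a weighted least squares filter and fused with the LiDAR range through a Kalman filter to determine whether a free region is large enough for the quadrotor to pass through. The whole pipeline is implemented in ROS on a Jetson TX2 embedded computer.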

详情
Journal ref
ICUAS'19 The 2019 International Conference on Unmanned Aircraft Systems
AI中文摘要

本文提出了一种基于视觉的方法,利用立体相机和一维LiDAR估计四旋翼导航中的自由障碍区域。所提出的方法利用立体相机提供的深度图和一维LiDAR的测距信息。在对深度图进行加权最小二乘滤波器(WLS)过滤后,通过卡尔曼滤波算法融合信息。通过使用卡尔曼滤波器的输出信息,在视差图中标记一个区域,以确定是否存在足够大的自由空间供四旋翼通过。整个过程在Jetson TX2嵌入式计算机上用机器人操作系统(ROS)实现。实验展示了该方法的有效性。

英文摘要

This paper presents a vision-based methodology which makes use of a stereo camera rig and a one-dimensional LiDAR to estimate obstacle-free areas for quadrotor navigation. The presented approach fuses information provided by a depth map from a stereo camera rig, and the sensing distance of the 1D-LiDAR. Once the depth map is filtered with a Weighted Least Squares (WLS) filter, the information is fused through a Kalman filter algorithm. To determine if there is a free space large enough for the quadrotor to pass through, our approach marks an area inside the disparity map by using the Kalman filter output information. The whole process is implemented on a Jetson TX2 embedded computer and coded in the Robot Operating System (ROS). Experiments demonstrate the effectiveness of our approach.
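
The fusion step can be pictured with a scalar Kalman measurement update applied twice, once per sensor. The abstract does not give the filter's state or noise model, so the static one-dimensional state, the variances, and the function names below are assumptions rather than the authors' design.

```python
def kalman_update(mean, var, measurement, meas_var):
    """Scalar Kalman measurement update for a directly observed state."""
    gain = var / (var + meas_var)
    new_mean = mean + gain * (measurement - mean)
    new_var = (1.0 - gain) * var
    return new_mean, new_var

def fuse_depth(stereo_depth, stereo_var, lidar_range, lidar_var,
               prior_mean=0.0, prior_var=1e6):
    """Fuse a stereo depth reading and a 1D LiDAR range into one estimate.
    The very large prior variance stands in for 'no prior knowledge'."""
    mean, var = kalman_update(prior_mean, prior_var, stereo_depth, stereo_var)
    mean, var = kalman_update(mean, var, lidar_range, lidar_var)
    return mean, var

# Hypothetical readings in meters: noisy stereo depth, more precise LiDAR.
fused_mean, fused_var = fuse_depth(2.4, 0.09, 2.0, 0.01)
```

The fused estimate lands between the two readings, closer to the sensor with the smaller variance, and its variance is lower than either sensor's alone.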

1810.03076 2026-06-04 cs.RO cs.LG cs.SY eess.SY

Online Center of Mass Estimation for a Humanoid Wheeled Inverted Pendulum Robot

人形轮式反重力摆机器人在线质心估计

Munzir Zafar, Akash Patel, Bogdan Vlahov, Nathaniel Glaser, Sergio Aguillera, Seth Hutchinson

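AI summary This paper presents a novel combination of robust control and online learning for balancing an n-degree-of-freedom wheeled inverted pendulum humanoid robot. Online learning updates the mass model to obtain a more accurate center-of-mass estimate, and experiments show that this improves overall control and efficiency.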

详情
AI中文摘要

我们提出了一种新颖的鲁棒控制和在线学习应用,用于平衡具有n个自由度(DoF)的轮式反重力摆(WIP)人形机器人。我们的技术将质量模型的不准确性转化为质心(CoM)误差,并在存在误差的情况下实现平衡,同时利用在线学习更新质量模型以获得更好的质心估计。使用我们机器人的模拟模型,我们元学习了一组激励关节姿态,使我们的梯度下降算法快速收敛到准确的(CoM)估计。该模拟流程完全在线执行,使用主动扰动抵消来解决由于持续演变的质量模型所产生的质量误差。在19个自由度的WIP上进行了实验,我们手动获取了学习姿态集的数据,并展示了由梯度下降产生的质量模型产生的质心估计能够提高整体控制和效率。本工作为Golem Krang人形机器人整体控制贡献了更丰富的文献。

英文摘要

We present a novel application of robust control and online learning for the balancing of an n-Degree-of-Freedom (DoF) Wheeled Inverted Pendulum (WIP) humanoid robot. Our technique condenses the inaccuracies of a mass model into a Center of Mass (CoM) error, balances despite this error, and uses online learning to update the mass model for a better CoM estimate. Using a simulated model of our robot, we meta-learn a set of excitatory joint poses that makes our gradient descent algorithm quickly converge to an accurate CoM estimate. This simulated pipeline executes in a fully online fashion, using active disturbance rejection to address the mass errors that result from a steadily evolving mass model. Experiments were performed on a 19 DoF WIP, in which we manually acquired the data for the learned set of poses and show that the mass model produced by gradient descent yields a CoM estimate that improves overall control and efficiency. This work contributes to a greater corpus of whole body control on the Golem Krang humanoid robot.

1905.03131 2026-06-04 cs.RO cs.SY eess.SY

Vision-based Unscented FastSLAM for Mobile Robot

基于视觉的无迹快速SLAM

Chunxin Qiu, Xiaorui Zhu, Xiaobing Zhao

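AI summary This paper presents a vision-based Unscented FastSLAM algorithm that combines the Rao-Blackwellized particle filter with the unscented Kalman filter. Landmarks are detected by binocular vision for localization and mapping, and the unscented formulation improves accuracy and robustness.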

详情
AI中文摘要

本文提出一种基于视觉的无迹快速SLAM(UFastSLAM)算法,结合了 Rao-Blackwellized 粒子滤波和无迹卡尔曼滤波(UKF)。地标通过双目视觉检测来整合定位与建图。由于双目视觉系统通常继承较大的测量误差,因此采用无迹快速SLAM来提高定位与建图的性能。无迹快速SLAM利用UKF代替非线性函数的线性近似,有效粒子数作为标准以减少粒子退化。通过仿真和实验证明,无迹快速SLAM算法在视觉系统中比FastSLAM2.0算法在精度和鲁棒性方面表现更优。

英文摘要

This paper presents a vision-based Unscented FastSLAM (UFastSLAM) algorithm combining the Rao-Blackwellized particle filter and the Unscented Kalman filter (UKF). The landmarks are detected by binocular vision to integrate localization and mapping. Since such a binocular vision system generally inherits larger measurement errors, it is suitable to adopt Unscented FastSLAM to improve the performance of localization and mapping. Unscented FastSLAM takes advantage of the UKF instead of linear approximations of the nonlinear function, and the effective number of particles is used as the criterion to reduce particle degeneration. Simulations and experiments are carried out to demonstrate that the Unscented FastSLAM algorithm can achieve much better accuracy and robustness in the vision-based system than the FastSLAM2.0 algorithm.
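
The effective number of particles mentioned above is commonly computed as the reciprocal of the sum of squared normalized weights. The sketch below shows that standard formula and a resampling trigger; the 0.5 threshold ratio and the function names are assumptions, since the abstract does not state the authors' settings.

```python
def effective_particle_count(weights):
    """N_eff = 1 / sum(w_i^2) over normalized particle weights."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return 1.0 / sum(w * w for w in normalized)

def needs_resampling(weights, threshold_ratio=0.5):
    """Resample only when N_eff drops below a fraction of the particle count."""
    return effective_particle_count(weights) < threshold_ratio * len(weights)

uniform = [0.25, 0.25, 0.25, 0.25]      # healthy particle set
degenerate = [0.97, 0.01, 0.01, 0.01]   # one particle dominates
```

With uniform weights N_eff equals the number of particles, and it falls toward 1 as a single particle takes over, so resampling is triggered only when degeneracy actually appears.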

1905.03051 2026-06-04 cs.RO cs.SY eess.SY

Bayesian Optimization for Polynomial Time Probabilistically Complete STL Trajectory Synthesis

基于多项式时间概率完备STL轨迹综合的贝叶斯优化

Vince Kurtz, Hai Lin

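AI summary This paper proposes an STL trajectory synthesis method based on Bayesian optimization as an alternative to the conventional mixed integer linear programming approach. It scales polynomially in the STL time bound and linearly in the number of predicates, while guaranteeing probabilistic completeness.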

Comments CDC 2019 Submission

详情
AI中文摘要

近年来,信号临时逻辑(STL)作为机器人和网络物理系统控制目标编码的实用且表达能力强的手段获得了广泛关注。STL轨迹综合的最新进展是将问题形式化为混合整数线性规划(MILP)。MILP方法在有界规范下是正确且完整的,但这种强正确性保证以指数复杂度和规范的时间界为代价。在本工作中,我们提出了一种替代的综合范式,依赖于贝叶斯优化而不是混合整数规划。这将完备性保证放松为概率完备性,但显著更高效:我们的方法在STL时间界上呈多项式复杂度,在预测数量上呈线性复杂度。我们证明我们的方法是正确且概率完备的,并通过一个非平凡的例子展示了其可扩展性。

英文摘要

In recent years, Signal Temporal Logic (STL) has gained traction as a practical and expressive means of encoding control objectives for robotic and cyber-physical systems. The state of the art in STL trajectory synthesis is to formulate the problem as a Mixed Integer Linear Program (MILP). The MILP approach is sound and complete for bounded specifications, but such strong correctness guarantees come at the price of exponential complexity in the number of predicates and the time bound of the specification. In this work, we propose an alternative synthesis paradigm that relies on Bayesian optimization rather than mixed integer programming. This relaxes the completeness guarantee to probabilistic completeness, but is significantly more efficient: our approach scales polynomially in the STL time bound and linearly in the number of predicates. We prove that our approach is sound and probabilistically complete, and demonstrate its scalability with a nontrivial example.
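
STL specifications admit a quantitative "robustness" score, which is the kind of scalar objective a black-box optimizer can work with. The sketch below evaluates the standard robustness semantics for a predicate x > c under bounded "always" and "eventually" operators on a discrete-time signal. How the paper couples such a score to Bayesian optimization is not described in the abstract, so this shows only the scoring, and the function names are assumptions.

```python
def rho_greater(signal, c):
    """Robustness of the predicate x > c at every time step."""
    return [x - c for x in signal]

def rho_always(rho, a, b):
    """G_[a,b]: the worst-case robustness over time steps a..b (inclusive)."""
    return min(rho[a:b + 1])

def rho_eventually(rho, a, b):
    """F_[a,b]: the best-case robustness over time steps a..b (inclusive)."""
    return max(rho[a:b + 1])

signal = [0.0, 0.5, 1.5, 2.0, 1.0]
rho = rho_greater(signal, 1.0)          # [-1.0, -0.5, 0.5, 1.0, 0.0]
eventually_score = rho_eventually(rho, 0, 4)
always_score = rho_always(rho, 2, 3)
```

A positive score means the specification is satisfied with margin and a negative score means it is violated, so maximizing the score over candidate trajectories searches for a satisfying one.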

1905.01683 2026-06-04 cs.RO cs.SY eess.SY

Path Planning for Autonomous Bus Driving in Urban Environments

城市环境中自动驾驶巴士的路径规划

Rui Oliveira, Pedro F. Lima, Gonçalo Collares Pereira, Jonas Mårtensson, Bo Wahlberg

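AI summary This paper presents a path planning framework for bus driving in urban environments. The demanding driving task is formulated as an optimization problem using a road-aligned vehicle model, and the bus's overhangs are taken into account when formulating safe collision avoidance constraints.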

Comments 6 pages, 8 figures

详情
AI中文摘要

在城市环境中驾驶往往面临需要专家操作车辆的困难情况。当考虑大型车辆,如巴士时,这些情况变得更加具有挑战性。我们提出了一种路径规划框架,以解决巴士在城市区域中的复杂驾驶任务。该方法使用道路对齐的车辆模型进行建模。道路对齐的坐标系引入了对车辆本体和障碍物的扭曲,促使开发新的近似方法来捕捉这种扭曲。这些近似方法允许安全且非保守的碰撞避免约束的制定。与其他路径规划方法不同,我们的方法利用了 curb 和其他可扫过的区域,这些区域巴士在执行某些操作时必须扫过。此外,它充分利用了巴士的特定特性,即过道,这是车辆底盘的升高部分,可以扫过 curb。进行了模拟,展示了所提出方法的适用性和优势。

英文摘要

Driving in urban environments often presents difficult situations that require expert maneuvering of a vehicle. These situations become even more challenging when considering large vehicles, such as buses. We present a path planning framework that addresses the demanding driving task of buses in urban areas. The approach is formulated as an optimization problem using the road-aligned vehicle model. The road-aligned frame introduces a distortion on the vehicle body and obstacles, motivating the development of novel approximations that capture this distortion. These approximations allow for the formulation of safe and non-conservative collision avoidance constraints. Unlike other path planning approaches, our method exploits curbs and other sweepable regions, which a bus must often sweep over in order to manage certain maneuvers. Furthermore, it takes full advantage of a particular characteristic of buses, namely the overhangs: elevated parts of the vehicle chassis that can sweep over curbs. Simulations are presented, showing the applicability and benefits of the proposed method.

1903.05803 2026-06-04 cs.LG cs.SY eess.SY stat.ML

On Applications of Bootstrap in Continuous Space Reinforcement Learning

在连续空间强化学习中Bootstrap应用的探讨

Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

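AI summary This paper studies bootstrap-based policies for decision-making problems with continuous state and action spaces, shows that their regret scales with the square root of time, and examines the accuracy of learning the model's dynamics.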

详情
AI中文摘要

在连续状态和动作空间的决策问题中,线性动力学模型被广泛采用。具体而言,针对受二次成本函数约束的随机线性系统,其策略在强化学习中涵盖了大量应用。最近的文献中研究了随机策略,以解决识别与控制之间的权衡。然而,关于基于Bootstrap观察状态和动作的策略知之甚少。在本文中,我们证明基于Bootstrap的策略在时间方面具有平方根缩放的regret。我们还获得了关于学习模型动态准确性结果。此外,还提供了支持技术结果的数值分析。

英文摘要

In decision-making problems for continuous state and action spaces, linear dynamical models are widely employed. Specifically, policies for stochastic linear systems subject to quadratic cost functions capture a large number of applications in reinforcement learning. Selected randomized policies that address the trade-off between identification and control have been studied in the recent literature. However, little is known about policies based on bootstrapping observed states and actions. In this work, we show that bootstrap-based policies achieve a square root scaling of regret with respect to time. We also obtain results on the accuracy of learning the model's dynamics. Corroborative numerical analysis that illustrates the technical results is also provided.
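
Bootstrapping observed states and actions can be sketched as resampling recorded transitions with replacement and re-fitting the dynamics on the resampled set. The example below does this for a scalar linear system with a least-squares fit. The scalar model, the open-loop random inputs, the noise level, and all names are assumptions, and the control design that a full policy would build on the estimate is left out.

```python
import random

def least_squares_ab(data):
    """Fit x_next ~ a * x + b * u by solving the 2x2 normal equations."""
    sxx = sum(x * x for x, u, y in data)
    sxu = sum(x * u for x, u, y in data)
    suu = sum(u * u for x, u, y in data)
    sxy = sum(x * y for x, u, y in data)
    suy = sum(u * y for x, u, y in data)
    det = sxx * suu - sxu * sxu
    a = (sxy * suu - suy * sxu) / det
    b = (suy * sxx - sxy * sxu) / det
    return a, b

def simulate(a, b, steps, rng, noise_std=0.1):
    """Roll out x_{t+1} = a x_t + b u_t + w_t with random inputs u_t."""
    data, x = [], 0.0
    for _ in range(steps):
        u = rng.gauss(0.0, 1.0)
        x_next = a * x + b * u + rng.gauss(0.0, noise_std)
        data.append((x, u, x_next))
        x = x_next
    return data

def bootstrap_estimate(data, rng):
    """Resample transitions with replacement, then re-fit the dynamics."""
    resampled = [rng.choice(data) for _ in data]
    return least_squares_ab(resampled)

rng = random.Random(0)
data = simulate(0.8, 0.5, 500, rng)
a_hat, b_hat = bootstrap_estimate(data, rng)
```

Each call to `bootstrap_estimate` returns a slightly different plausible model, and that spread is a source of randomization a policy could use to keep exploring while it controls the system.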