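arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI-generated summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and other fields.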

1710.06537 2026-06-04 cs.RO cs.SY eess.SY

Sim-to-Real Transfer of Robotic Control with Dynamics Randomization

Xue Bin Peng, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel

Affiliations: OpenAI

AI summary: This paper proposes a simple method for bridging the gap between simulation and reality by randomizing the simulator's dynamics, so that policies adapt to varying dynamics and generalize to the real world without any training on the physical system.

DOI: 10.1109/ICRA.2018.8460528

Abstract

Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real world counterparts. In this paper, we demonstrate a simple method to bridge this "reality gap". By randomizing the dynamics of the simulator during training, we are able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained. This adaptivity enables the policies to generalize to the dynamics of the real world without any training on the physical system. Our approach is demonstrated on an object pushing task using a robotic arm. Despite being trained exclusively in simulation, our policies are able to maintain a similar level of performance when deployed on a real robot, reliably moving an object to a desired location from random initial configurations. We explore the impact of various design decisions and show that the resulting policies are robust to significant calibration error.
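
As a rough illustration of the idea in the abstract above, the sketch below resamples a set of simulator dynamics parameters at the start of every training episode. The parameter names, the sampling ranges, and the `training_episodes` helper are hypothetical and are not taken from the paper; a real setup would pass the sampled values to a physics simulator before collecting each rollout.

```python
import random
from dataclasses import dataclass


@dataclass
class DynamicsParams:
    link_mass_scale: float    # multiplier on nominal link masses (hypothetical)
    friction: float           # object/table friction coefficient (hypothetical)
    action_delay_steps: int   # control latency in simulator steps (hypothetical)


# Hypothetical sampling ranges; the abstract does not state the real ones.
RANGES = {
    "link_mass_scale": (0.5, 1.5),
    "friction": (0.2, 1.0),
    "action_delay_steps": (0, 3),
}


def sample_dynamics(rng: random.Random) -> DynamicsParams:
    """Draw one set of dynamics parameters uniformly from RANGES."""
    mass = rng.uniform(*RANGES["link_mass_scale"])
    friction = rng.uniform(*RANGES["friction"])
    delay = rng.randint(*RANGES["action_delay_steps"])
    return DynamicsParams(mass, friction, delay)


def training_episodes(num_episodes: int, seed: int = 0):
    """Yield (episode index, freshly sampled dynamics) pairs."""
    rng = random.Random(seed)
    for episode in range(num_episodes):
        params = sample_dynamics(rng)
        # A real loop would configure the simulator with `params`
        # and collect a policy rollout here.
        yield episode, params
```

Because the policy never sees the same dynamics twice, it cannot overfit to one simulator configuration, which is the intuition the abstract describes.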

1809.07098 2026-06-04 cs.AI cs.LG cs.MA cs.NE cs.SY eess.SY

Novelty-organizing team of classifiers in noisy and dynamic environments

Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

Affiliations: Graduate School of Information Science; Electrical Engineering, Kyushu University, Fukuoka, Japan; Faculty of Information Science

AI summary: This study proposes the Novelty-Organizing Team of Classifiers (NOTC), which works effectively in noisy and dynamic environments, and validates it on the continuous action mountain car problem and its variants, showing NOTC's performance advantage even though its initialization takes some time.

DOI: 10.1109/CEC.2015.7257254
Journal ref: 2015 IEEE Congress on Evolutionary Computation (CEC)

Abstract

In the real world, the environment is constantly changing and the input variables are subject to noise. However, few algorithms have been shown to work under those circumstances. Here, Novelty-Organizing Team of Classifiers (NOTC) is applied to the continuous action mountain car as well as two variations of it: a noisy mountain car and an unstable weather mountain car. These problems take noise and changes in problem dynamics into account, respectively. Moreover, NOTC is compared with NeuroEvolution of Augmenting Topologies (NEAT) on these problems, revealing a trade-off between the approaches. While NOTC achieves the best performance in all of the problems, NEAT needs fewer trials to converge. It is demonstrated that NOTC achieves better performance because of its division of the input space (creating easier problems). Unfortunately, this division of the input space also requires some time to bootstrap.

1809.06970 2026-06-04 cs.LG cs.NI cs.PF cs.SY eess.SY stat.ML

FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

Shuochao Yao, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Lu Su, Tarek Abdelzaher

Affiliations: University of Illinois at Urbana-Champaign; State University of New York at Buffalo

AI summary: This paper proposes the FastDeepIoT framework, which uncovers the nonlinear relation between neural network structure and execution time and uses it to improve the trade-off between execution time and accuracy on mobile and embedded devices, without prior knowledge of the hardware specifications or the implementation details of the deep learning library.

Comments: Accepted by SenSys '18

DOI: 10.1145/3274783.3274840

Abstract

Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand slows down execution time, posing a serious impediment to deployment on low-end devices. To address this challenge, recent literature has focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the non-linear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the used deep learning library. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by $48\%$ to $78\%$ and energy consumption by $37\%$ to $69\%$ compared with the state-of-the-art compression algorithms.

1809.06179 2026-06-04 cs.RO cs.LG cs.SY eess.SY

Learning of Multi-Context Models for Autonomous Underwater Vehicles

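Bilal Wehbe, Octavio Arriaga, Mario Michael Krell, Frank Kirchner

Affiliations: DFKI - Robotic Innovation Center; Robotics Research Group

AI summary: This paper proposes using LSTM networks to learn multi-context models for autonomous underwater vehicles: a simulation model is built from experimental data and used to generate different contexts, classification accuracy is improved, and the approach shows robustness to noise and scalability to large datasets.

Comments: 6 pages, 7 figures, AUV 2018 author copy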

1809.06009 2026-06-04 cs.LG cs.NA math.NA stat.ML

Uncertainty Propagation in Deep Neural Networks Using Extended Kalman Filtering

Jessica S. Titensky, Hayden Jananthan, Jeremy Kepner

Affiliations: Massachusetts Institute of Technology; Department of Mathematics; Lincoln Laboratory Supercomputing Center

AI summary: This paper proposes using extended Kalman filtering to propagate and quantify input uncertainty in deep neural networks; the method is more computationally efficient than existing techniques and naturally incorporates model error into the output uncertainty.

Comments: 4 pages, 8 figures. Accepted at MIT IEEE Undergraduate Research Technology Conference 2018. Publication pending

1806.06161 2026-06-04 cs.RO cs.LG cs.SY eess.SY

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

Boris Ivanovic, James Harrison, Apoorva Sharma, Mo Chen, Marco Pavone

Affiliations: Department of Mechanical Engineering, Stanford University; School of Computing Science, Simon Fraser University

AI summary: This paper proposes BaRC, which uses physical priors to design a curriculum that, via backward reachability, accelerates the training of model-free RL algorithms on continuous control MDPs, improving performance and reducing the exploration required.

Abstract

Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often forces training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state of the system. In this work, we leverage physical priors in the form of an approximate system dynamics model to design a curriculum scheme for a model-free policy optimization algorithm. Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a dynamically-consistent manner once the policy optimization algorithm demonstrates sufficient performance. BaRC is general, in that it can accelerate training of any model-free RL algorithm on a broad class of goal-directed continuous control MDPs. Its curriculum strategy is physically intuitive, easy-to-tune, and allows incorporating physical priors to accelerate training without hindering the performance, flexibility, and applicability of the model-free RL algorithm. We evaluate our approach on two representative dynamic robotic learning problems and find substantial performance improvement relative to previous curriculum generation techniques and naive exploration strategies.
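
The curriculum idea described above can be sketched on a toy problem. The code below is a minimal illustration, not the paper's algorithm: it assumes a 1-D chain of discrete states with unit moves as the "approximate dynamics model", and the `success_rate` callback stands in for training and evaluating a policy. All names here are hypothetical.

```python
def backward_reachable(starts, horizon, n_states):
    """States on a 1-D chain that can reach `starts` within `horizon` unit moves."""
    frontier = set(starts)
    for _ in range(horizon):
        frontier |= {
            s + d for s in frontier for d in (-1, 1) if 0 <= s + d < n_states
        }
    return frontier


def curriculum(goal, n_states, horizon, success_rate, threshold=0.8, max_iters=100):
    """Grow the start-state set backwards from the goal as the policy improves.

    `success_rate(starts)` is a placeholder for "train the policy from these
    start states, then measure how often it reaches the goal".
    """
    starts = {goal}
    history = [set(starts)]
    for _ in range(max_iters):
        if len(starts) == n_states:
            break  # the curriculum already covers the whole state space
        if success_rate(starts) >= threshold:
            starts = backward_reachable(starts, horizon, n_states)
            history.append(set(starts))
    return history
```

With `goal=5`, `n_states=10`, `horizon=2`, and a policy that always succeeds, the start set grows from `{5}` to `{3, 4, 5, 6, 7}` and eventually to every state on the chain.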

1804.01031 2026-06-04 cs.RO cs.LG cs.SY eess.SY

Provably Robust Learning-Based Approach for High-Accuracy Tracking Control of Lagrangian Systems

Mohamed K. Helwa, Adam Heins, Angela P. Schoellig

Affiliations: Dynamic Systems Lab, Institute for Aerospace Studies, University of Toronto

AI summary: This paper proposes a novel learning-based control approach based on Gaussian processes that ensures system stability and high-accuracy tracking, guarantees robustness through an uncertainty bound, and is validated in simulation and experiments.

Comments: 8 pages, 4 figures, 2 tables, submitted to IEEE Robotics and Automation Letters (RA-L) and the 2019 International Conference on Robotics and Automation (ICRA) (created: March 2018; updated: September 2018)

Abstract

Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feedforward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.
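
To make the step "use the predicted mean and variance of the GP to calculate an upper bound" concrete, here is a minimal sketch of one common way to turn a GP prediction into a high-probability bound: per axis, take the absolute mean plus a multiple of the standard deviation, then combine the axes with a Euclidean norm. The multiplier `beta` and this exact form are assumptions for illustration; the abstract does not give the paper's bound.

```python
import math


def uncertainty_bound(means, variances, beta=3.0):
    """Upper-bound the norm of a model error from per-axis GP predictions.

    means, variances: GP predictive mean and variance for each axis.
    beta: hypothetical confidence multiplier on the standard deviation.
    """
    if any(v < 0.0 for v in variances):
        raise ValueError("variances must be non-negative")
    per_axis = [abs(m) + beta * math.sqrt(v) for m, v in zip(means, variances)]
    return math.sqrt(sum(b * b for b in per_axis))
```

A robust outer-loop controller could then use this scalar as the worst-case disturbance magnitude it must dominate; a larger predictive variance yields a more conservative bound.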

1809.03314 2026-06-04 cs.CV cs.SY eess.SY

A Robotic Auto-Focus System based on Deep Reinforcement Learning

Xiaofan Yu, Runze Yu, Jingsong Yang, Xiaohui Duan

Affiliations: Center of Wireless Communication and Signal Processing

AI summary: This paper proposes an end-to-end auto-focus method that uses deep reinforcement learning to learn focusing policies from visual input and reach a sharp image automatically. By discretizing the action space and applying DQN, the method solves the auto-focus problem and extends to vision-based control problems.

Comments: To appear at ICARCV 2018

Abstract

Considering its advantages in dealing with high-dimensional visual input and learning control policies in discrete domains, Deep Q Network (DQN) could become an alternative to traditional auto-focus methods. In this paper, based on Deep Reinforcement Learning, we propose an end-to-end approach that can learn auto-focus policies from visual input and finish at a clear spot automatically. We demonstrate that our method, discretizing the action space with coarse-to-fine steps and applying DQN, is not only a solution to auto-focus but also a general approach to vision-based control problems. Separate phases of training in virtual and real environments are applied to obtain an effective model. Virtual experiments, which are carried out after the virtual training phase, indicate that our method could achieve 100% accuracy on a certain view with different focus ranges. Further training on real robots could eliminate the deviation between the simulator and the real scenario, leading to reliable performance in real applications.
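
The coarse-to-fine discretization mentioned in the abstract can be pictured with a toy lens model. In the sketch below, the step sizes, the position range, and the `stand_in_policy` are all hypothetical: the policy is a hand-written substitute for a trained Q-network and, unlike a real agent, is given the true focus position directly instead of an image.

```python
# Hypothetical discrete action set in motor ticks: coarse steps, fine steps, stop.
ACTIONS = (-10, -1, 0, 1, 10)


def step(position, action_index, lo=0, hi=100):
    """Apply one discrete action and clamp the lens position to [lo, hi]."""
    return max(lo, min(hi, position + ACTIONS[action_index]))


def stand_in_policy(position, focus):
    """Stand-in for argmax over Q-values: choose the action landing closest to focus."""
    return min(range(len(ACTIONS)), key=lambda i: abs(step(position, i) - focus))


def run_episode(start, focus, max_steps=50):
    """Roll out the policy until it chooses the stop action (or steps run out)."""
    position, trace = start, [start]
    for _ in range(max_steps):
        action = stand_in_policy(position, focus)
        if ACTIONS[action] == 0:
            break
        position = step(position, action)
        trace.append(position)
    return trace
```

Starting at 3 with the focus at 47, the rollout takes four coarse steps and then four fine steps, which shows why mixing step sizes shortens episodes compared with fine steps alone.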

1709.06196 2026-06-04 cs.AI cs.RO cs.SY eess.SY

Online algorithms for POMDPs with continuous state, action, and observation spaces

Zachary Sunberg, Mykel Kochenderfer

Affiliations: Aeronautics and Astronautics Dept., Stanford University

AI summary: This paper proposes the POMCPOW and PFT-DPW algorithms, which use weighted particle filtering to solve POMDPs with continuous state spaces, and validates the effectiveness of the improved methods.

Comments: Added Multilane section

1809.00037 2026-06-04 cs.RO cs.SY eess.SY

Estimation for Quadrotors

Stefanie Tellex, Andy Brown, Sergei Lupashin

Affiliations: Brown University; Udacity, Inc.; Fotokite

AI summary: Based on quadrotor models, this paper derives the extended Kalman filter and provides pseudocode for the EKF, the Bayes filter, and the unscented Kalman filter, aiming to address sensor noise and limited computation in quadrotor state estimation.

Abstract

This document describes standard approaches for filtering and estimation for quadrotors, created for the Udacity Flying Cars course. We assume previous knowledge of probability and some knowledge of linear algebra. We do not assume previous knowledge of Kalman filters or Bayes filters. This document derives an EKF for various models of drones in 1D, 2D, and 3D. We use the EKF and notation as defined in Thrun et al. [13]. We also give pseudocode for the Bayes filter, the EKF, and the Unscented Kalman filter [14]. The motivation behind this document is the lack of a step-by-step EKF tutorial that provides the derivations for a quadrotor helicopter. The goal of estimation is to infer the drone's state (pose, velocity, acceleration, and biases) from its sensor values and control inputs. This problem is challenging because sensors are noisy. Additionally, because of weight and cost issues, many drones have limited on-board computation, so we want to estimate these values as quickly as possible. The standard method for performing this estimation is the Extended Kalman filter, a nonlinear extension of the Kalman filter which linearizes a nonlinear transition and measurement model around the current state. However, the Unscented Kalman filter is better in almost every respect: simpler to implement, more accurate, and comparable in runtime.
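
As a small worked example of the predict/update cycle the abstract refers to, here is a scalar EKF sketch. The model is an assumption chosen for illustration, not one of the document's drone models: the state is a 1-D position `x`, the control `u` is a velocity, and the sensor measures the range `sqrt(x^2 + d^2)` to a beacon offset by `d`, which makes the measurement model nonlinear. The noise values `q` and `r` are hypothetical.

```python
import math


def ekf_step(mu, var, u, z, dt=0.1, q=0.01, r=0.04, d=1.0):
    """One EKF predict/update cycle for a scalar state.

    mu, var: prior mean and variance of the position estimate.
    u: commanded velocity; z: measured range to the beacon.
    q, r: process and measurement noise variances (hypothetical values).
    """
    # Predict: g(x, u) = x + u * dt, so the transition Jacobian is 1.
    mu_bar = mu + u * dt
    var_bar = var + q

    # Update: h(x) = sqrt(x^2 + d^2), linearized around the predicted mean.
    h = math.hypot(mu_bar, d)
    H = mu_bar / h                      # dh/dx evaluated at mu_bar
    S = H * var_bar * H + r             # innovation variance
    K = var_bar * H / S                 # Kalman gain
    mu_new = mu_bar + K * (z - h)
    var_new = (1.0 - K * H) * var_bar
    return mu_new, var_new
```

Feeding the filter repeated noiseless range readings from a drone hovering at `x = 2.0` pulls an initial guess of `1.5` towards the true position while the variance shrinks.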

1806.05220 2026-06-04 cs.RO cs.SY eess.SY

Decentralized Ergodic Control: Distribution-Driven Sensing and Exploration for Multi-Agent Systems

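Ian Abraham, Todd D. Murphey

Affiliations: Neuroscience and Robotics Laboratory

AI summary: This paper proposes a decentralized ergodic control strategy for time-varying area coverage with multi-agent nonlinear dynamical systems; it uses consensus to obtain a fully decentralized multi-agent control policy and demonstrates applications to multi-agent terrain mapping and target localization.

Comments: 8 pages, Accepted for publication in IEEE Robotics and Automation Letters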

1802.08215 2026-06-04 cs.RO cs.SY eess.SY

ArduSoar: an Open-Source Thermalling Controller for Resource-Constrained Autopilots

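Samuel Tabor, Iain Guilliard, Andrey Kolobov

Affiliations: Glasgow, Scotland; Australian National University; Microsoft Research

AI summary: This paper presents ArduSoar, the first thermalling controller integrated into mainstream autopilot software for small UAVs, covering its algorithm design, its integration with ArduPlane, and flight tests that verify its robustness in non-ideal atmospheric conditions.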

1803.10309 2026-06-04 cs.LG cs.SY eess.SY stat.ML

Canonical Correlation Analysis of Datasets with a Common Source Graph

Jia Chen, Gang Wang, Yanning Shen, Georgios B. Giannakis

Affiliations: University of Minnesota

AI summary: This paper proposes graph-regularized canonical correlation analysis (gCCA), which introduces graph structure to exploit knowledge of the common sources and thereby improve data fusion and classification performance.

Comments: 10 pages, 7 figures

DOI: 10.1109/TSP.2018.2853130

Abstract

Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. Standard CCA, however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data samples is smaller than the data vector dimensions, the dual formulation of gCCA is also developed. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel (gk) CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.

1509.02223 2026-06-04 cs.CV cs.NA math.NA

Diffusion tensor imaging with deterministic error bounds

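Artur Gorokh, Yury Korolev, Tuomo Valkonen

Affiliations: Faculty of Physics, Lomonosov Moscow State University; School of Engineering and Materials Science, Queen Mary University of London

AI summary: This paper uses the theory of partial orders in Banach lattices to model errors in inverse problems and applies it to the difficult noise-modelling problem in diffusion tensor imaging, where deterministic error bounds simplify the treatment of the nonlinear Stejskal-Tanner equation.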

1808.03983 2026-06-04 cs.RO cs.SY eess.SY

Robot Safe Interaction System for Intelligent Industrial Co-Robots

Changliu Liu, Masayoshi Tomizuka

Affiliations: University of California, Berkeley

AI summary: This paper proposes a safe interaction system whose parallel planning and control architecture improves the efficiency and safety of co-robots in dynamic, uncertain environments; experiments verify the method's effectiveness.

Comments: 12 pages

Abstract

Human-robot interactions have been recognized as a key element of future industrial collaborative robots (co-robots). Unlike traditional robots that work in structured and deterministic environments, co-robots need to operate in highly unstructured and stochastic environments. To ensure that co-robots operate efficiently and safely in dynamic uncertain environments, this paper introduces the robot safe interaction system. In order to address the uncertainties during human-robot interactions, a unique parallel planning and control architecture is proposed, which has a long-term global planner to ensure efficiency of robot behavior, and a short-term local planner to ensure real-time safety under uncertainties. In order for the robot to respond immediately to environmental changes, fast algorithms are used for real-time computation, i.e., the convex feasible set algorithm for the long-term optimization, and the safe set algorithm for the short-term optimization. Several test platforms are introduced for safe evaluation of the developed system in the early phase of deployment. The effectiveness and the efficiency of the proposed method have been verified in experiments with an industrial robot manipulator.

1808.03037 2026-06-04 cs.RO cs.SY eess.SY

Passive Compliance Control of Aerial Manipulators

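Min Jun Kim, Ribin Balachandran, Marco De Stefano, Konstantin Kondak, Christian Ott

Affiliations: German Aerospace Center (DLR)

AI summary: This paper proposes a passive compliance control method that, through a suitable choice of end-effector coordinates and a time-domain passivity technique, ensures that aerial manipulators achieve stable environmental interaction in the absence of powered actuation.

Comments: IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS) 2018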

1608.02702 2026-06-04 cs.CV cs.NA math.NA

Steerable Principal Components for Space-Frequency Localized Images

Boris Landa, Yoel Shkolnisky

Affiliations: Department of Applied Mathematics, School of Mathematical Sciences

AI summary: This paper proposes a fast and accurate method that expands images in two-dimensional Prolate Spheroidal Wave Functions to obtain steerable principal components, which are optimal for expanding the images and their rotations.

DOI: 10.1137/16M1085334

Abstract

This paper describes a fast and accurate method for obtaining steerable principal components from a large dataset of images, assuming the images are well localized in space and frequency. The obtained steerable principal components are optimal for expanding the images in the dataset and all of their rotations. The method relies upon first expanding the images using a series of two-dimensional Prolate Spheroidal Wave Functions (PSWFs), where the expansion coefficients are evaluated using a specially designed numerical integration scheme. Then, the expansion coefficients are used to construct a rotationally-invariant covariance matrix which admits a block-diagonal structure, and the eigen-decomposition of its blocks provides us with the desired steerable principal components. The proposed method is shown to be faster than existing methods, while providing appropriate error bounds which guarantee its accuracy.

1712.07249 2026-06-04 cs.RO cs.LG cs.SY eess.SY

Probabilistic Learning of Torque Controllers from Kinematic and Force Constraints

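João Silvério, Yanlong Huang, Leonel Rozo, Sylvain Calinon, Darwin G. Caldwell

Affiliations: Department of Advanced Robotics, Istituto Italiano di Tecnologia; Idiap Research Institute

AI summary: This paper proposes a probabilistic approach for simultaneously learning and synthesizing torque control commands under task-space, joint-space, and force constraints; it learns the correlations among different torque controllers probabilistically and exploits properties of Gaussian distributions to generate new torque commands that satisfy the task's characteristics.

Comments: Accepted for publication at 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)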

1807.05290 2026-06-04 cs.RO cs.SY eess.SY

Adaptive Model Predictive Control for High-Accuracy Trajectory Tracking in Changing Conditions

Karime Pereida, Angela Schoellig

Affiliations: Dynamic Systems Lab; University of Toronto Institute for Aerospace Studies

AI summary: This paper proposes an adaptive model predictive controller that combines model predictive control with an $\mathcal{L}_1$ adaptive controller to improve trajectory tracking under unknown and changing disturbances. Experiments on a quadrotor show that the method achieves lower trajectory tracking error.

Abstract

Robots and automated systems are increasingly being introduced to unknown and dynamic environments where they are required to handle disturbances, unmodeled dynamics, and parametric uncertainties. Robust and adaptive control strategies are required to achieve high performance in these dynamic environments. In this paper, we propose a novel adaptive model predictive controller that combines model predictive control (MPC) with an underlying $\mathcal{L}_1$ adaptive controller to improve trajectory tracking of a system subject to unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller forces the system to behave in a predefined way, as specified by a reference model. A higher-level model predictive controller then uses this reference model to calculate the optimal reference input based on a cost function, while taking into account input and state constraints. We focus on the experimental validation of the proposed approach and demonstrate its effectiveness in experiments on a quadrotor. We show that the proposed approach has a lower trajectory tracking error compared to non-predictive, adaptive approaches and a predictive, non-adaptive approach, even when external wind disturbances are applied.

1709.03726 2026-06-04 cs.LG cs.SY eess.SY

Adaptive Graph Signal Processing: Algorithms and Optimal Sampling Strategies

Paolo Di Lorenzo, Paolo Banelli, Elvin Isufi, Sergio Barbarossa, Geert Leus

Affiliations: Dept. of Engineering, University of Perugia

AI summary: This paper proposes new strategies for adaptive learning of graph signals, analyzes the effect of random sampling on algorithm performance, and designs optimized sampling strategies to improve steady-state performance and convergence rate.

Comments: Submitted to IEEE Transactions on Signal Processing, September 2017

DOI: 10.1109/TSP.2018.2835384

Abstract

The goal of this paper is to propose novel strategies for adaptive learning of signals defined over graphs, which are observed over a (randomly time-varying) subset of vertices. We recast two classical adaptive algorithms in the graph signal processing framework, namely, the least mean squares (LMS) and the recursive least squares (RLS) adaptive estimation strategies. For both methods, a detailed mean-square analysis illustrates the effect of random sampling on the adaptive reconstruction capability and the steady-state performance. Then, several probabilistic sampling strategies are proposed to design the sampling probability at each node in the graph, with the aim of optimizing the tradeoff between steady-state performance, graph sampling rate, and convergence rate of the adaptive algorithms. Finally, a distributed RLS strategy is derived and is shown to be convergent to its centralized counterpart. Numerical simulations carried out over both synthetic and real data illustrate the good performance of the proposed sampling and reconstruction strategies for (possibly distributed) adaptive learning of signals defined over graphs.
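
To give a feel for LMS estimation with random vertex sampling, the sketch below runs one commonly used form of the recursion for band-limited graph signals, `x <- x + mu * B * D[n] * (y[n] - x)`, where `D[n]` is a random 0/1 sampling mask and `B` projects onto the band-limited subspace. Whether this matches the paper's exact update is an assumption; `B`, `probs`, and `mu` are supplied by the caller, and the example uses the simplest possible projector (averaging over all nodes, i.e. a constant signal on a connected graph).

```python
import random


def graph_lms(y_stream, B, probs, mu=0.5, seed=0):
    """LMS estimate of a band-limited graph signal from randomly sampled nodes.

    y_stream: iterable of observation vectors, one per time step.
    B: projector onto the band-limited subspace (list of rows), assumed given.
    probs: per-node sampling probabilities.
    """
    rng = random.Random(seed)
    n = len(B)
    x = [0.0] * n
    for y in y_stream:
        # Node i is observed at this time step with probability probs[i].
        mask = [1.0 if rng.random() < p else 0.0 for p in probs]
        err = [m * (yi - xi) for m, yi, xi in zip(mask, y, x)]
        # Project the sampled error back onto the band-limited subspace.
        corr = [sum(B[i][j] * err[j] for j in range(n)) for i in range(n)]
        x = [xi + mu * ci for xi, ci in zip(x, corr)]
    return x


# Example: 4 nodes, constant signal, each node sampled half of the time.
N = 4
B_CONST = [[1.0 / N] * N for _ in range(N)]
ESTIMATE = graph_lms([[2.0] * N] * 300, B_CONST, probs=[0.5] * N, mu=0.5, seed=1)
```

Raising a node's sampling probability speeds up convergence at the cost of a higher sampling rate, which is the trade-off the proposed sampling strategies are designed to balance.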

1807.10757 2026-06-04 cs.CV cs.NA math.NA

A multi-contrast MRI approach to thalamus segmentation

Veronica Corona, Jan Lellmann, Peter Nestor, Carola-Bibiane Schoenlieb, Julio Acosta-Cabronero

Affiliations: Department of Applied Mathematics and Theoretical Physics, University of Cambridge; Queensland Brain Institute, University of Queensland; Mater Hospital, South Brisbane, Queensland, Australia; Wellcome Centre for Human Neuroimaging, UCL Institute of Neurology, University College London, United Kingdom; German Center for Neurodegenerative Diseases (DZNE), Magdeburg, Germany

AI summary: This paper proposes a multi-modal MRI segmentation method that uses multi-contrast data to improve the accuracy of thalamic subnuclei segmentation, combining iterative registration, manual segmentation on a template, supervised learning, and convex optimization to improve segmentation performance and robustness.

Abstract

Thalamic alterations are relevant to many neurological disorders including Alzheimer's disease, Parkinson's disease and multiple sclerosis. Routine interventions to improve symptom severity in movement disorders, for example, often consist of surgery or deep brain stimulation to diencephalic nuclei. Therefore, accurate delineation of grey matter thalamic subregions is of the utmost clinical importance. MRI is highly appropriate for structural segmentation as it provides different views of the anatomy from a single scanning session. With several contrasts potentially available, it is also increasingly important to develop new image segmentation techniques that can operate multi-spectrally. We propose a new segmentation method for use with multi-modality data, which we evaluated for automated segmentation of major thalamic subnuclear groups using T1-, T2*-weighted and quantitative susceptibility mapping (QSM) information. The proposed method consists of four steps: highly iterative image co-registration, manual segmentation on the average training-data template, supervised learning for pattern recognition, and a final convex optimisation step imposing further spatial constraints to refine the solution. This led to solutions in greater agreement with manual segmentation than the standard Morel atlas based approach. Furthermore, we show that the multi-contrast approach boosts segmentation performance. We then investigated whether prior knowledge using the training-template contours could further improve convex segmentation accuracy and robustness, which led to highly precise multi-contrast segmentations in single subjects. This approach can be extended to most 3D imaging data types and any region of interest discernible in single scans or multi-subject templates.

1807.08048 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY

Baidu Apollo EM Motion Planner

百度 Apollo EM 运动规划器

Haoyang Fan, Fan Zhu, Changchun Liu, Liangliang Zhang, Li Zhuang, Dong Li, Weicheng Zhu, Jiangtao Hu, Hongye Li, Qi Kong

发表机构 * Baidu USA LLC（百度美国有限公司）

AI总结本文提出基于百度 Apollo 开源自动驾驶平台的实时运动规划系统，解决工业级4级运动规划问题，兼顾安全性、舒适性和可扩展性，通过分层结构实现多车道和单车道自动驾驶。

Abstract

In this manuscript, we introduce a real-time motion planning system based on the Baidu Apollo (open source) autonomous driving platform. The developed system aims to address the industrial level-4 motion planning problem while considering safety, comfort and scalability. The system covers multilane and single-lane autonomous driving in a hierarchical manner: (1) The top layer of the system is a multilane strategy that handles lane-change scenarios by comparing lane-level trajectories computed in parallel. (2) Inside the lane-level trajectory generator, it iteratively solves path and speed optimization based on a Frenet frame. (3) For path and speed optimization, a combination of dynamic programming and spline-based quadratic programming is proposed to construct a scalable and easy-to-tune framework to handle traffic rules, obstacle decisions and smoothness simultaneously. The planner is scalable to both highway and lower-speed city driving scenarios. We also demonstrate the algorithm through scenario illustrations and on-road test results. The system described in this manuscript has been deployed to dozens of Baidu Apollo autonomous driving vehicles since Apollo v1.5 was announced in September 2017. As of May 16th, 2018, the system has been tested under 3,380 hours and approximately 68,000 kilometers (42,253 miles) of closed-loop autonomous driving under various urban scenarios. The algorithm described in this manuscript is available at https://github.com/ApolloAuto/apollo/tree/master/modules/planning.
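The dynamic programming step mentioned in the abstract can be illustrated with a toy lattice search in a Frenet frame (station index along the lane, lateral offset across it). This is a minimal sketch under stated assumptions, not Apollo's implementation: the function `dp_path`, the cost terms, the weights and the sample grid are all hypothetical, and the spline-based quadratic programming refinement is omitted.

```python
def dp_path(lateral_samples, num_stations, node_cost, start_l=0.0, smooth_weight=1.0):
    """Choose one lateral offset per station by dynamic programming.

    Hypothetical toy cost: node_cost(station, l) plus a squared penalty on
    lateral change between consecutive stations (a stand-in for smoothness).
    """
    n = len(lateral_samples)
    # cost[i][j]: cheapest cost of reaching sample j at station i
    cost = [[smooth_weight * (l - start_l) ** 2 + node_cost(0, l)
             for l in lateral_samples]]
    back = []  # back[i - 1][j]: best predecessor index for sample j at station i
    for i in range(1, num_stations):
        row, ptr = [], []
        for l in lateral_samples:
            step = [cost[-1][j] + smooth_weight * (l - lateral_samples[j]) ** 2
                    for j in range(n)]
            best_j = min(range(n), key=step.__getitem__)
            row.append(step[best_j] + node_cost(i, l))
            ptr.append(best_j)
        cost.append(row)
        back.append(ptr)
    # Backtrack from the cheapest sample at the final station.
    j = min(range(n), key=cost[-1].__getitem__)
    path = [lateral_samples[j]]
    for ptr in reversed(back):
        j = ptr[j]
        path.append(lateral_samples[j])
    return path[::-1]


def toy_cost(station, l):
    """Assumed cost: stay near the lane center, avoid an obstacle at station 2."""
    blocked = 100.0 if (station == 2 and l == 0.0) else 0.0
    return 0.1 * l * l + blocked


path = dp_path([-1.0, 0.0, 1.0], num_stations=5, node_cost=toy_cost)
```

In this toy setup the search keeps the lane center until the blocked station and then moves to a lateral offset. In a full planner, a coarse result like this would typically seed a smoother optimization stage.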

1709.05077 2026-06-04 cs.AI cs.SY eess.SY

Transforming Cooling Optimization for Green Data Center via Deep Reinforcement Learning

Yuanlong Li, Yonggang Wen, Kyle Guan, Dacheng Tao

Affiliations: School of Computer Science and Engineering, Nanyang Technological University; Bell Labs, Nokia

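AI summary: This paper proposes using data center monitoring data to optimize the cooling control policy. It designs an end-to-end cooling control algorithm within a deep reinforcement learning framework, achieving about 11% cooling cost savings on a simulation platform and about 15% cooling energy savings on a real data trace.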

Abstract

The cooling system plays a critical role in a modern data center (DC). Developing an optimal control policy for a DC cooling system is a challenging task. The prevailing approaches often rely on approximating system models that are built upon the knowledge of mechanical cooling, electrical and thermal management, which are difficult to design and may lead to sub-optimal or unstable performance. In this paper, we propose utilizing the large amount of monitoring data in a DC to optimize the control policy. To do so, we cast the cooling control policy design into an energy cost minimization problem with temperature constraints, and tap into the emerging deep reinforcement learning (DRL) framework. Specifically, we propose an end-to-end cooling control algorithm (CCA) that is based on the actor-critic framework and an off-policy offline version of the deep deterministic policy gradient (DDPG) algorithm. In the proposed CCA, an evaluation network is trained to predict an energy cost counter penalized by the cooling status of the DC room, and a policy network is trained to predict optimized control settings when given the current load and weather information. The proposed algorithm is evaluated on the EnergyPlus simulation platform and on a real data trace collected from the National Super Computing Centre (NSCC) of Singapore. Our results show that the proposed CCA can achieve about 11% cooling cost saving on the simulation platform compared with a manually configured baseline control algorithm. In the trace-based study, we propose a de-underestimation (DUE) validation mechanism as we cannot directly test the algorithm on a real DC. Even though the results with DUE are conservative, we can still achieve about 15% cooling energy saving on the NSCC data trace if we set the inlet temperature threshold at 26.6 degrees Celsius.
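The idea of an energy cost penalized by the cooling status can be sketched as a single scalar that a critic might be trained to predict. This is only an illustration: the hinge-style penalty, the function `penalized_cost` and the value of `penalty_weight` are assumptions rather than the paper's formulation. Only the 26.6 degrees Celsius threshold comes from the abstract.

```python
def penalized_cost(energy_kwh, inlet_temps_c, threshold_c=26.6, penalty_weight=10.0):
    """Energy cost plus a penalty for inlet temperatures above the threshold.

    Assumed form: each sensor contributes max(0, temperature - threshold),
    scaled by a hypothetical penalty_weight.
    """
    violation = sum(max(0.0, t - threshold_c) for t in inlet_temps_c)
    return energy_kwh + penalty_weight * violation


# No sensor exceeds the threshold, so the cost is the energy term alone.
within_limits = penalized_cost(100.0, [25.0, 26.0])
# One sensor is 1 degree over, so the penalty term is added.
over_limit = penalized_cost(100.0, [27.6])
```

A larger `penalty_weight` would push a learned policy toward colder, more conservative settings, while a smaller one trades temperature margin for energy savings.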

1807.05289 2026-06-04 cs.RO cs.SY eess.SY

Transfer Learning for High-Precision Trajectory Tracking Through $\mathcal{L}_1$ Adaptive Feedback and Iterative Learning

Karime Pereida, Dave Kooijman, Rikky R. P. R. Duivenvoorden, Angela P. Schoellig

Affiliation: Institute for Aerospace Studies, University of Toronto, North York, ON M3H 5T6, Canada

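AI summary: This paper presents a framework combining $\mathcal{L}_1$ adaptive control with iterative learning control for high-precision trajectory tracking in unknown, dynamic environments, and uses transfer learning to pass learned experience between different systems.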

DOI: 10.1002/acs.2887

Abstract

Robust and adaptive control strategies are needed when robots or automated systems are introduced to unknown and dynamic environments where they are required to cope with disturbances, unmodeled dynamics, and parametric uncertainties. In this paper, we demonstrate the capabilities of a combined $\mathcal{L}_1$ adaptive control and iterative learning control (ILC) framework to achieve high-precision trajectory tracking in the presence of unknown and changing disturbances. The $\mathcal{L}_1$ adaptive controller makes the system behave close to a reference model; however, it does not guarantee that perfect trajectory tracking is achieved, while ILC improves trajectory tracking performance based on previous iterations. The combined framework in this paper uses $\mathcal{L}_1$ adaptive control as an underlying controller that achieves a robust and repeatable behavior, while the ILC acts as a high-level adaptation scheme that mainly compensates for systematic tracking errors. We illustrate that this framework enables transfer learning between dynamically different systems, where learned experience of one system can be shown to be beneficial for another different system. Experimental results with two different quadrotors show the superior performance of the combined $\mathcal{L}_1$-ILC framework compared with approaches using ILC with an underlying proportional-derivative controller or proportional-integral-derivative controller. Results highlight that our $\mathcal{L}_1$-ILC framework can achieve high-precision trajectory tracking when unknown and changing disturbances are present and can achieve transfer of learned experience between dynamically different systems. Moreover, our approach is able to achieve precise trajectory tracking in the first attempt when the initial input is generated based on the reference model of the adaptive controller.
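The way ILC improves tracking from one iteration to the next can be illustrated with a textbook P-type update, u_{j+1}[k] = u_j[k] + gain * e_j[k]. This is a sketch under stated assumptions and not the paper's formulation: the first-order plant, its coefficients, the learning gain and the reference signal are all hypothetical, and the $\mathcal{L}_1$ adaptive controller is not modeled.

```python
import math


def simulate(u, a=0.3, b=1.0):
    """Run the assumed toy plant y[k+1] = a*y[k] + b*u[k] from y[0] = 0."""
    y = [0.0]
    for uk in u:
        y.append(a * y[-1] + b * uk)
    return y[1:]  # each output is aligned with the input that produced it


def ilc(reference, iterations=30, gain=0.8):
    """Repeat the same task, correcting the input with the previous error."""
    u = [0.0] * len(reference)
    max_errors = []
    for _ in range(iterations):
        y = simulate(u)
        error = [r - yk for r, yk in zip(reference, y)]
        max_errors.append(max(abs(e) for e in error))
        # P-type update: add a scaled copy of the last tracking error.
        u = [uk + gain * ek for uk, ek in zip(u, error)]
    return u, max_errors


reference = [math.sin(2 * math.pi * k / 20) for k in range(1, 21)]
u_learned, max_errors = ilc(reference)
```

With these assumed coefficients the worst-case tracking error shrinks at every iteration. Reusing `u_learned` as the starting input for a different plant would be one simple way to experiment with the transfer idea described in the abstract.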

1709.08174 2026-06-04 cs.LG cs.NA math.NA

Function approximation with zonal function networks with activation functions analogous to the rectified linear unit functions

Hrushikesh N. Mhaskar

Affiliation: Institute of Mathematical Sciences, Claremont Graduate University

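AI summary: This paper studies the approximation properties of zonal function networks on the q-dimensional sphere, examines approximation with activation functions that are not positive definite, and establishes the corresponding smoothness classes and approximation results.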

Comments 18 pages, title changed from the previous version

1807.02297 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML

Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

Tanner Fiez, Shreyas Sekar, Liyuan Zheng, Lillian J. Ratliff

Affiliation: Electrical Engineering Department, University of Washington

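AI summary: This paper proposes a multi-armed bandit framework for matching incentives to users under resource constraints. It combines greedy matching, the UCB algorithm and Markov chain mixing times, analyzes regret theoretically, and validates performance on synthetic and real-world examples.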

Comments Published as a conference paper in Conference on Uncertainty in Artificial Intelligence (UAI) 2018

1807.00553 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.DS stat.ML

A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics

Roel Dobbe, Sarah Dean, Thomas Gilbert, Nitin Kohli

Affiliations: Department of Electrical Engineering and Computer Sciences, University of California Berkeley, USA; Department of Rhetoric, University of California Berkeley, USA; School of Information, University of California Berkeley, USA

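AI summary: This paper examines the roots of bias in automated decision-making, framing technical bias as an epistemological problem and emergent bias as a dynamical feedback phenomenon. It stresses the need to reflect on epistemology and to adopt value-sensitive design methods to improve decision-making systems.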

Comments Presented at the 2018 Workshop on Fairness, Accountability and Transparency in Machine Learning during ICML 2018, Stockholm, Sweden

1806.10472 2026-06-04 cs.CV cs.NA math.NA

Homogeneity of a region in the logarithmic image processing framework: application to region growing algorithms

Michel Jourlin, Guillaume Noyel

Affiliations: Lab. H. Curien, UMR CNRS 5516; University of Strathclyde Institute of Global Public Health; International Prevention Research Institute, iPRI

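AI summary: This paper examines the role of logarithmic image processing (LIP) operators in evaluating the homogeneity of a region. It proposes two new heterogeneity criteria and improves Revol's technique to increase robustness to contrast variations and reduce the chaining effect in region growing.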

1806.09919 2026-06-04 cs.LG cs.SY eess.SY stat.ML

Tangent-Space Regularization for Neural-Network Models of Dynamical Systems

Fredrik Bagge Carlson, Rolf Johansson, Anders Robertsson

Affiliation: LCCC Linnaeus Center

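AI summary: This paper proposes tangent-space regularization for neural-network models of dynamical systems. By exploiting properties of the tangent space of the dynamics function, it improves regularization of the model Jacobian and reduces the reliance on large amounts of training data. It also examines how different network architectures affect the ability to learn the input-output Jacobian and how L2 regularization affects system stability.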

1806.08083 2026-06-04 cs.AI cs.SY eess.SY

Expanding the Active Inference Landscape: More Intrinsic Motivations in the Perception-Action Loop

Martin Biehl, Christian Guckelsberger, Christoph Salge, Simón C. Smith, Daniel Polani

Affiliations: Araya Inc.; Computational Creativity Group, Department of Computing, Goldsmiths, University of London; Game Innovation Lab, Department of Computer Science and Engineering, New York University; Sepia Lab, Adaptive Systems Research Group, Department of Computer Science, University of Hertfordshire; Institute of Perception, Action and Behaviour, School of Informatics, The University of Edinburgh

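AI summary: This paper investigates whether other intrinsic motivations can replace the original one in active inference while keeping its core machinery intact, and connects the approach to universal reinforcement learning through a formal treatment.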

Comments 53 pages, 6 figures, 2 tables

Abstract

Active inference is an ambitious theory that treats perception, inference and action selection of autonomous agents under the heading of a single principle. It suggests biologically plausible explanations for many cognitive phenomena, including consciousness. In active inference, action selection is driven by an objective function that evaluates possible future actions with respect to current, inferred beliefs about the world. Active inference at its core is independent from extrinsic rewards, resulting in a high level of robustness across e.g. different environments or agent morphologies. In the literature, paradigms that share this independence have been summarised under the notion of intrinsic motivations. In general and in contrast to active inference, these models of motivation come without a commitment to particular inference and action selection mechanisms. In this article, we study if the inference and action selection machinery of active inference can also be used by alternatives to the originally included intrinsic motivation. The perception-action loop explicitly relates inference and action selection to the environment and agent memory, and is consequently used as foundation for our analysis. We reconstruct the active inference approach, locate the original formulation within, and show how alternative intrinsic motivations can be used while keeping many of the original features intact. Furthermore, we illustrate the connection to universal reinforcement learning by means of our formalism. Active inference research may profit from comparisons of the dynamics induced by alternative intrinsic motivations. Research on intrinsic motivations may profit from an additional way to implement intrinsically motivated agents that also share the biological plausibility of active inference.
