1807.11534 2026-06-04 math.NA cs.CV cs.NA

A Restricted-Domain Dual Formulation for Two-Phase Image Segmentation

两相图像分割的受限域双变量公式

Jack Spencer

发表机构 * Department of Mathematics, University of Liverpool, UK(利物浦大学数学系)

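AI summary: This paper examines the nature of data fitting in two-phase image segmentation, proposes solving a dual formulation in a restricted domain to improve computational efficiency, and validates the approach experimentally.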

Journal ref
Irish Machine Vision and Image Processing Conference Proceedings, pp. 139-146, 2017
AI中文摘要

在两相图像分割中,凸松弛方法允许计算各种数据拟合项的全局最小解。许多高效方法可以快速求解。然而,我们考虑该公式中数据拟合的本质是否允许对解做出合理假设,以进一步提高计算性能。特别是,我们采用了一个广为人知的双变量公式,并在受限域内求解相应的方程。我们展示了实验结果,探讨了该限制对解的影响,并量化了计算性能的改进。这种方法可以简单地扩展到类似方法,并可能为此类问题提供一种高效替代方案。

英文摘要

In two-phase image segmentation, convex relaxation has allowed global minimisers to be computed for a variety of data fitting terms. Many efficient approaches exist to compute a solution quickly. However, we consider whether the nature of the data fitting in this formulation allows for reasonable assumptions to be made about the solution that can improve the computational performance further. In particular, we employ a well-known dual formulation of this problem and solve the corresponding equations in a restricted domain. We present experimental results that explore the dependence of the solution on this restriction and quantify improvements in the computational performance. This approach can be extended to analogous methods simply and could provide an efficient alternative for problems of this type.

1807.03515 2026-06-04 eess.SY cs.NI cs.RO cs.SY

A Reinforcement Learning Approach to Jointly Adapt Vehicular Communications and Planning for Optimized Driving

一种联合适应车载通信与规划的强化学习方法

Mayank K. Pal, Rupali Bhati, Anil Sharma, Sanjit K. Kaul, Saket Anand, P. B. Sujit

Affiliation: IIIT-Delhi (Indraprastha Institute of Information Technology Delhi)

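AI summary: This paper proposes a reinforcement learning method for jointly optimizing the communications and motion planning of an autonomous vehicle, and uses simulations to show that it improves driving utility.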

Comments 7 pages, 7 figures; Accepted as a conference paper at IEEE ITSC 2018

AI中文摘要

我们的前提是自动驾驶车辆必须联合优化通信和运动规划。具体而言,车辆必须在考虑通信速率相关约束的情况下调整其运动计划,并在考虑道路环境可能施加的运动规划限制的情况下调整通信使用。为此,我们提出了一个强化学习问题,其中自动驾驶车辆同时选择(a)在道路上执行的运动规划动作和(b)查询基础设施感知信息的通信动作。目标是优化自动驾驶车辆的驾驶效益。我们应用Q学习算法使车辆学习最优策略,该策略在任何给定时间都能做出最优的规划和通信动作选择。我们通过模拟验证了最优策略在智能适应通信和规划动作方面的能力,同时实现了较高的驾驶效益。

英文摘要

Our premise is that autonomous vehicles must optimize communications and motion planning jointly. Specifically, a vehicle must adapt its motion plan staying cognizant of communications rate related constraints and adapt the use of communications while being cognizant of motion planning related restrictions that may be imposed by the on-road environment. To this end, we formulate a reinforcement learning problem wherein an autonomous vehicle jointly chooses (a) a motion planning action that executes on-road and (b) a communications action of querying sensed information from the infrastructure. The goal is to optimize the driving utility of the autonomous vehicle. We apply the Q-learning algorithm to make the vehicle learn the optimal policy, which makes the optimal choice of planning and communications actions at any given time. We demonstrate the ability of the optimal policy to smartly adapt communications and planning actions, while achieving large driving utilities, using simulations.
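The joint choice of a planning action and a communications action can be illustrated with a small tabular Q-learning sketch. Everything in it is an assumption made for illustration: the five-cell road, the hazard at cell 1, the reward values, and the names `step`, `train` and `greedy` are hypothetical and are not taken from the paper.

```python
import random

MOVES = (0, 1)      # 0 = hold position, 1 = advance one cell
QUERIES = (0, 1)    # 0 = stay silent, 1 = query the infrastructure
ACTIONS = [(m, q) for m in MOVES for q in QUERIES]  # joint action space
GOAL, HAZARD_CELL = 4, 1  # hypothetical toy road layout


def step(pos, action):
    """Toy dynamics: every step costs 1, a query costs 0.5 extra,
    and advancing out of the hazard cell without querying costs 5."""
    move, query = action
    reward = -1.0 - 0.5 * query
    if move and pos == HAZARD_CELL and not query:
        reward -= 5.0
    nxt = min(pos + move, GOAL)
    if nxt == GOAL:
        reward += 10.0
    return nxt, reward, nxt == GOAL


def train(episodes=2000, alpha=0.1, gamma=0.95, eps=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning over the joint action space."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(GOAL + 1) for a in ACTIONS}
    for _ in range(episodes):
        pos, done, steps = 0, False, 0
        while not done and steps < 50:
            if rng.random() < eps:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda b: q[(pos, b)])
            nxt, r, done = step(pos, a)
            best_next = 0.0 if done else max(q[(nxt, b)] for b in ACTIONS)
            q[(pos, a)] += alpha * (r + gamma * best_next - q[(pos, a)])
            pos, steps = nxt, steps + 1
    return q


def greedy(q, pos):
    """Joint (move, query) action with the highest learned value."""
    return max(ACTIONS, key=lambda b: q[(pos, b)])
```

In this toy setting the learned policy queries only where the information pays for itself, which is the kind of adaptation the abstract describes.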

1803.09022 2026-06-04 eess.SY cs.RO cs.SY math.OC

Controller Synthesis for Discrete-Time Polynomial Systems via Occupation Measures

通过占用测度方法为离散时间多项式系统设计控制器

Weiqiao Han, Russ Tedrake

发表机构 * Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology(计算机科学与人工智能实验室,麻省理工学院)

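AI summary: This paper designs nonlinear state feedback controllers for discrete-time polynomial dynamical systems via occupation measures. Controller synthesis is posed as an infinite-dimensional linear program on measures and relaxed to finite-dimensional semidefinite programs, from whose solutions nonlinear controllers are extracted. The problems solved are convex, with computational complexity polynomial in the state and input dimensions.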

AI中文摘要

本文中,我们通过占用测度方法为离散时间多项式动力系统设计非线性状态反馈控制器。我们提出了离散时间受控李雅普诺夫方程,并利用该方程将控制器合成问题转化为关于测度的无限维线性规划问题,随后将其松弛为关于测度矩及其对偶的有限维半正定规划问题。非线性控制器可以从松弛问题的解中提取。占用测度方法的优势在于我们解决的是凸问题而非通常非凸问题,且计算复杂度与状态和输入维度呈多项式关系,因此该方法更具可扩展性。此外,我们还展示了该方法可以应用于近似离散时间自主多项式系统的后向可达集以及在已知状态反馈控制律下的离散时间多项式系统的可控集。我们还在多个动力系统上展示了我们的方法。

英文摘要

In this paper, we design nonlinear state feedback controllers for discrete-time polynomial dynamical systems via the occupation measure approach. We propose the discrete-time controlled Liouville equation, and use it to formulate the controller synthesis problem as an infinite-dimensional linear programming problem on measures, which is then relaxed as finite-dimensional semidefinite programming problems on moments of measures and their duals on sums-of-squares polynomials. Nonlinear controllers can be extracted from the solutions to the relaxed problems. The advantage of the occupation measure approach is that we solve convex problems instead of generally non-convex problems, and the computational complexity is polynomial in the state and input dimensions, and hence the approach is more scalable. In addition, we show that the approach can be applied to over-approximating the backward reachable set of discrete-time autonomous polynomial systems and the controllable set of discrete-time polynomial systems under known state feedback control laws. We illustrate our approach on several dynamical systems.

1807.08855 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SP eess.SY

Weak in the NEES?: Auto-tuning Kalman Filters with Bayesian Optimization

在NEES中薄弱:基于贝叶斯优化的自动调节卡尔曼滤波器

Zhaozhong Chen, Christoffer Heckman, Simon Julier, Nisar Ahmed

发表机构 * Department of Computer Science(计算机科学系) University of Colorado Boulder(科罗拉多大学博尔德分校) University College London(伦敦大学学院) Smead Aerospace Engineering Sciences(Smead航空航天工程科学系)

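AI summary: This paper proposes a Bayesian optimization method for auto-tuning Kalman filters. By intelligently sampling the parameter space with a nonparametric Gaussian process surrogate, it efficiently identifies multiple local minima and quantifies the uncertainty of its results.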

Comments Final version presented at FUSION 2018 Conference, Cambridge, UK, July 2018 (submitted June 1, 2018)

AI中文摘要

卡尔曼滤波器被广泛用于数据融合应用,包括导航、跟踪和同时定位与建图问题。然而,调整各种卡尔曼滤波器模型参数需要大量时间和努力,例如过程噪声协方差、非白噪声预白化滤波器模型等。传统优化技术在调整时容易陷入较差的局部极小值,并且使用真实传感器数据实施成本较高。为了解决这些问题,本文开发了一种新的“黑箱”贝叶斯优化策略,用于自动调节卡尔曼滤波器。在该方法中,性能由两种随机目标函数之一来表征:当可用真实状态模型时为归一化估计误差平方(NEES),当只有传感器数据可用时为归一化创新误差平方(NIS)。通过智能采样参数空间,学习和利用非参数高斯过程代理函数,贝叶斯优化可以高效地识别多个局部极小值,并对其结果提供不确定性量化。

英文摘要

Kalman filters are routinely used for many data fusion applications including navigation, tracking, and simultaneous localization and mapping problems. However, significant time and effort is frequently required to tune various Kalman filter model parameters, e.g. process noise covariance, pre-whitening filter models for non-white noise, etc. Conventional optimization techniques for tuning can get stuck in poor local minima and can be expensive to implement with real sensor data. To address these issues, a new "black box" Bayesian optimization strategy is developed for automatically tuning Kalman filters. In this approach, performance is characterized by one of two stochastic objective functions: normalized estimation error squared (NEES) when ground truth state models are available, or the normalized innovation error squared (NIS) when only sensor data is available. By intelligently sampling the parameter space to both learn and exploit a nonparametric Gaussian process surrogate function for the NEES/NIS costs, Bayesian optimization can efficiently identify multiple local minima and provide uncertainty quantification on its results.
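To make the NEES objective concrete, here is a minimal sketch that evaluates the average NEES of a scalar Kalman filter tracking a random walk. The model, the noise values and the function name `mean_nees` are assumptions chosen for illustration; the sketch only evaluates the cost and does not implement the Bayesian optimization loop described in the abstract.

```python
import random


def mean_nees(q_filter, q_true=0.5, r=1.0, steps=200, runs=50, seed=0):
    """Average NEES of a scalar Kalman filter on a simulated random walk.

    q_filter is the process noise variance the filter assumes; q_true is
    the variance used to simulate the truth. For a consistent filter the
    average NEES is close to the state dimension (1 here).
    """
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(runs):
        x = rng.gauss(0.0, 1.0)      # true initial state ~ N(0, 1)
        x_hat, p = 0.0, 1.0          # filter prior matches that distribution
        for _ in range(steps):
            x += rng.gauss(0.0, q_true ** 0.5)      # simulate the truth
            z = x + rng.gauss(0.0, r ** 0.5)        # noisy measurement
            p += q_filter                           # predict step
            k = p / (p + r)                         # Kalman gain
            x_hat += k * (z - x_hat)                # measurement update
            p *= 1.0 - k
            total += (x - x_hat) ** 2 / p           # NEES for this step
            count += 1
    return total / count
```

A tuner would treat `q_filter` as the parameter to search over, preferring values whose average NEES is close to 1; a filter that underestimates the process noise reports a much larger value.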

1807.07099 2026-06-04 eess.SP cs.LG cs.NA math.NA stat.ML

Comparative study of Discrete Wavelet Transforms and Wavelet Tensor Train decomposition to feature extraction of FTIR data of medicinal plants

对离散小波变换与小波张量分解在药用植物FTIR数据特征提取中的比较研究

Pavel Kharyuk, Dmitry Nazarenko, Ivan Oseledets

Affiliations: Skolkovo Institute of Science and Technology; Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University; Faculty of Chemistry, Lomonosov Moscow State University; Institute of Numerical Mathematics of the Russian Academy of Sciences

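AI summary: This paper compares Wavelet Tensor Train (WTT) decomposition and Discrete Wavelet Transforms (DWT) for feature extraction from FTIR data of medicinal plants. The best results of the two are similar, and WTT, with a single parameter (rank) to tune, is the easier to apply across signal processing tasks.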

AI中文摘要

本文利用7种植物样本的傅里叶变换红外(FTIR)光谱,探讨了预处理和特征提取对机器学习算法效率的影响。将小波张量分解(WTT)与离散小波变换(DWT)作为药用植物FTIR数据的特征提取技术进行比较。各种信号处理步骤在应用于分类和聚类任务时表现出不同的行为。通过网格搜索找到的WTT和DWT的最佳结果相似,显著提高了聚类质量和调优后的逻辑回归分类准确率,相比原始光谱。与DWT不同,WTT只有一个参数(秩)需要调优,使其成为在各种信号处理应用中更通用和易用的数据处理工具。

英文摘要

Fourier-transform infra-red (FTIR) spectra of samples from 7 plant species were used to explore the influence of preprocessing and feature extraction on the efficiency of machine learning algorithms. Wavelet Tensor Train (WTT) and Discrete Wavelet Transforms (DWT) were compared as feature extraction techniques for FTIR data of medicinal plants. Various combinations of signal processing steps showed different behavior when applied to classification and clustering tasks. The best results for WTT and DWT found through grid search were similar, significantly improving the quality of clustering as well as the classification accuracy of tuned logistic regression in comparison to the original spectra. Unlike DWT, WTT has only one parameter to be tuned (rank), making it a more versatile and easier-to-use data processing tool in various signal processing applications.
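As a rough illustration of DWT-based feature extraction, the sketch below applies a multi-level Haar transform to a short signal. The Haar wavelet, the number of levels and the function names are assumptions; the abstract does not say which wavelet families or settings were searched, and this is not the WTT method.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar transform (even-length input)."""
    s = 1.0 / math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(signal, levels=2):
    """Concatenate the coarsest approximation with all detail coefficients.

    The signal length must be divisible by 2 ** levels.
    """
    details, current = [], list(signal)
    for _ in range(levels):
        current, detail = haar_step(current)
        details = detail + details   # coarser details go first
    return current + details
```

Because the transform is orthonormal, the feature vector has the same length and energy as the input; a classifier would typically keep only a subset of the coefficients.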

1709.07224 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY stat.ML

Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning

局部通信协议用于通过深度强化学习学习复杂群集行为

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

发表机构 * School of Computer Science, University of Lincoln(林肯大学计算机科学学院) Department of Electrical Engineering, Technische Universität Darmstadt(达姆施塔特技术大学电气工程系)

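AI summary: This paper proposes simple communication protocols that let deep reinforcement learning find decentralized control policies in a multi-robot swarm environment. The protocols use histograms to encode local neighborhood relations and transmit task-specific information, such as the shortest distance and direction to a target, to accomplish collaborative tasks.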

Comments 13 pages, 4 figures, version 2, accepted at ANTS 2018

AI中文摘要

群集系统对强化学习(RL)构成挑战,因为算法需要学习去中心化控制策略以应对代理的有限局部感知和通信能力。虽然直接定义代理行为困难,但可通过先验知识定义简单的通信协议。本文提出多种简单通信协议,用于深度强化学习在多机器人群环境中寻找去中心化控制策略。协议基于直方图编码代理的局部邻域关系,并可传输任务特定信息,如到目标的最短距离和方向。在我们的框架中,我们采用信任区域策略优化的变体来学习复杂协作任务,如编队和建立通信链路。我们在模拟的2D物理环境中评估了我们的发现,并比较了不同通信协议的影响。

英文摘要

Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. While it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building and building a communication link. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.
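A histogram over neighbor distances is one simple way to encode a local neighborhood in a fixed-size observation. The sketch below is a hypothetical illustration: the communication range, the bin count and the function name `distance_histogram` are assumptions, not the protocols defined in the paper.

```python
import math


def distance_histogram(agent, others, comm_range=2.0, bins=4):
    """Count neighbors within comm_range, binned by distance to the agent.

    The result has a fixed length regardless of how many neighbors are
    present, which suits a policy network with a fixed input size.
    """
    counts = [0] * bins
    width = comm_range / bins
    for x, y in others:
        d = math.hypot(x - agent[0], y - agent[1])
        if d < comm_range:
            # min() guards against floating-point round-up at the last bin edge
            counts[min(int(d / width), bins - 1)] += 1
    return counts
```

For example, an agent at the origin with neighbors at distances 0.3, 1.2 and 1.9, plus one agent out of range, would observe one neighbor in each of the first, third and fourth bins.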

1807.03769 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML

Kernel-Based Learning for Smart Inverter Control

基于核方法的智能逆变器控制

Aditie Garg, Mana Jalali, Vassilis Kekatos, Nikolaos Gatsis

发表机构 * Dept. of ECE, Virginia Tech(维吉尼亚理工大学电子工程系) Dept. of ECE, Un. of Texas at San Antonio(德克萨斯大学圣安东尼奥分校电子工程系)

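AI summary: This paper proposes nonlinear inverter control policies. By analogy with multi-task learning, reactive control is posed as a kernel-based regression task. Using a linearized grid model and anticipated data scenarios, inverter rules are designed jointly at the feeder level to minimize voltage deviations and ohmic losses.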

Comments Submitted to the 2018 IEEE Global Signal and Information Processing Conf., Symposium on Smart Energy Infrastructures

AI中文摘要

目前,分布电网面临由间歇性太阳能发电引起的频繁电压波动的挑战。智能逆变器被倡导为一种快速响应的手段,用于调节电压并最小化电阻损耗。由于最优逆变器协调可能计算上具有挑战性,而预设的本地控制规则表现不佳,因此定制化的准静态控制规则被视为最佳折中方案。本文从仿射控制规则出发,提出非线性逆变器控制策略。通过类比多任务学习,将反应控制视为基于核的回归任务。利用线性化电网模型和给定的预期数据场景,在馈线层面联合设计逆变器规则,以最小化电压偏差和电阻损耗的凸组合,通过线性约束的二次规划。使用真实世界数据在基准馈线上的数值测试表明,非线性控制规则即使由少数非本地读数驱动,也能实现近最优性能。

英文摘要

Distribution grids are currently challenged by frequent voltage excursions induced by intermittent solar generation. Smart inverters have been advocated as a fast-responding means to regulate voltage and minimize ohmic losses. Since optimal inverter coordination may be computationally challenging and preset local control rules are subpar, the approach of customized control rules designed in a quasi-static fashion features as a golden middle. Departing from affine control rules, this work puts forth non-linear inverter control policies. Drawing analogies to multi-task learning, reactive control is posed as a kernel-based regression task. Leveraging a linearized grid model and given anticipated data scenarios, inverter rules are jointly designed at the feeder level to minimize a convex combination of voltage deviations and ohmic losses via a linearly-constrained quadratic program. Numerical tests using real-world data on a benchmark feeder demonstrate that nonlinear control rules driven also by a few non-local readings can attain near-optimal performance.

1709.01268 2026-06-04 cs.CE cs.LG cs.NA math.NA q-fin.TR

Tensor Representation in High-Frequency Financial Data for Price Change Prediction

高频金融数据中的张量表示用于价格变动预测

Dat Thanh Tran, Martin Magris, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis

发表机构 * 1 Laboratory of Signal Processing, Tampere University of Technology, Tampere, Finland 2 Laboratory of Industrial

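AI summary: This paper studies the effectiveness of tensor-based multilinear methods for mid-price prediction. Experiments on a large-scale dataset show that the tensor representation outperforms vector-based and other competing methods.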

Comments accepted in SSCI 2017, typos fixed

Journal ref
IEEE Symposium Series on Computational Intelligence (SSCI), 2017
AI中文摘要

如今,随着大量交易数据的可用性,金融市场的动态既是对高频交易者的一种挑战,也是一种机会。为了利用高频交易中资产快速微妙的变动,必须有自动算法来分析和检测基于交易记录的价格变动模式。金融数据的多通道时间序列表示自然地建议了基于张量的学习算法。在本工作中,我们研究了两种多线性方法在中价预测问题中的有效性,与其他现有方法相比。在包含超过400万笔限价订单的大型数据集上的实验表明,通过利用张量表示,多线性模型优于向量方法和其他竞争方法。

英文摘要

Nowadays, with massive amounts of trade data available, the dynamics of the financial markets pose both a challenge and an opportunity for high-frequency traders. In order to take advantage of the rapid, subtle movement of assets in High Frequency Trading (HFT), an automatic algorithm to analyze and detect patterns of price change based on transaction records must be available. The multichannel, time-series representation of financial data naturally suggests tensor-based learning algorithms. In this work, we investigate the effectiveness of two multilinear methods for the mid-price prediction problem against other existing methods. Experiments on a large-scale dataset containing more than 4 million limit orders show that, by utilizing the tensor representation, multilinear models outperform vector-based approaches and other competing ones.

1806.00627 2026-06-04 eess.SY cs.RO cs.SY

Fast Rigid 3D Registration Solution: A Simple Method Free of SVD and Eigen-Decomposition

快速刚性三维配准解决方案:一种无需SVD和特征分解的简单方法

Jin Wu, Ming Liu, Zebo Zhou, Rui Li

发表机构 * School of Aeronautics and Astronautics, University of Electronic Science and Technology of China, Chengdu, 611731, China(电子科技大学航空宇航学院)

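AI summary: This paper proposes a fast rigid 3D registration method that needs neither SVD nor eigen-decomposition. It computes the optimal eigenvector of the point cross-covariance matrix iteratively, and simulations on noisy point clouds confirm its robustness and speed.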

AI中文摘要

本文提出了一种新的解决方案来解决刚性三维配准问题,受之前基于特征分解的方法启发。与现有求解器不同,所提算法不需要复杂的矩阵运算,如奇异值分解或特征值分解。相反,点交叉协方差矩阵的最优特征向量可以在几次迭代中计算出来。此外,证明了在不需要四元数的情况下可以直接计算最优旋转矩阵。该简单框架提供了在嵌入式平台上整数实现的非常简便方法。对噪声污染点云的仿真验证了所提方法的鲁棒性和计算速度。最终结果表明,所提算法准确、鲁棒,并且比代表方法计算时间减少了60%至80%。它还已应用于实际世界,以实现更快的相对机器人导航。

英文摘要

A novel solution to the rigid 3D registration problem is obtained, motivated by previous eigen-decomposition approaches. Unlike existing solvers, the proposed algorithm does not require sophisticated matrix operations such as singular value decomposition or eigenvalue decomposition. Instead, the optimal eigenvector of the point cross-covariance matrix can be computed within several iterations. It is also proven that the optimal rotation matrix can be computed directly, without the need for quaternions. The simple framework offers a very easy route to integer implementation on embedded platforms. Simulations on noise-corrupted point clouds have verified the robustness and computation speed of the proposed method. The final results indicate that the proposed algorithm is accurate and robust, and takes $60\% \sim 80\%$ less computation time than representative methods. It has also been applied to real-world applications for faster relative robotic navigation.

1710.04265 2026-06-04 math.NA cs.CV cs.NA

Solutions of Quadratic First-Order ODEs applied to Computer Vision Problems

二次一阶微分方程的解及其在计算机视觉问题中的应用

David Casillas-Perez, Daniel Pizarro, Manuel Mazo, Adrien Bartoli

发表机构 * Department of Electronic, University of Alcalá(阿尔卡拉大学电子系) ISIT - CNRS/Université d’Auvergne(奥弗涅大学ISIT-CNRS)

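AI summary: This paper studies the existence and uniqueness of solutions of a specific quadratic first-order ODE, discusses its application to the reconstruction of planar-perspective curves, and introduces the maximal depth function and the maximal-depth solution problem.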

Comments The version 2: New change of variable. Maximal Curve Maximal Solution Convergence Cones The version 3: modifies the author's list and the abstract in metadata

英文摘要

This article studies the existence and uniqueness of solutions of a specific quadratic first-order ODE that frequently appears in multiple reconstruction problems. It is called the \emph{planar-perspective equation} due to the duality with the geometric problem of reconstructing planar-perspective curves from their modulus. Owing to this geometric interpretation, solutions of the \emph{planar-perspective equation} are related to planar curves parametrized with perspective parametrization. The article proves the existence of only two local solutions to the \emph{initial value problem} with \emph{regular initial conditions} and a maximum of two analytic solutions with \emph{critical initial conditions}. The article also gives theorems to extend the local definition domain where the existence of both solutions is guaranteed. It introduces the \emph{maximal depth function} as a function that upper-bounds all possible solutions of the \emph{planar-perspective equation} and contains all its possible \emph{critical points}. Finally, the article describes the \emph{maximal-depth solution problem}, which consists of finding the solution of the referred equation that has the maximum depth, and proves its uniqueness. It is an important problem, as it does not need initial conditions to obtain the unique solution and it is the solution that state-of-the-art practical algorithms frequently return.

1806.09849 2026-06-04 eess.SY cs.RO cs.SY

SENSE: Abstraction-Based Synthesis of Networked Control Systems

SENSE:基于抽象的网络控制系统合成

Mahmoud Khaled, Matthias Rungger, Majid Zamani

发表机构 * Technical University of Munich(慕尼黑技术大学)

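AI summary: SENSE constructs symbolic models and automatically synthesizes controllers that enforce complex specifications on networked control systems, and can generate VHDL/Verilog or C/C++ code.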

Comments In Proceedings MeTRiD 2018, arXiv:1806.09330

Journal ref
EPTCS 272, 2018, pp. 65-78
英文摘要

While many studies and tools target the basic stabilizability problem of networked control systems (NCS), modern systems require more sophisticated objectives, such as those expressed as formulae in linear temporal logic or as automata on infinite strings. One general technique to achieve this is based on so-called symbolic models, where complex systems are approximated by finite abstractions, and then correct-by-construction controllers are automatically synthesized for them. We present the tool SENSE for the construction of finite abstractions for NCS and the automated synthesis of controllers. Constructed controllers enforce complex specifications over plants in NCS by taking into account several non-idealities of the communication channels. Given a symbolic model of the plant and network parameters, SENSE can efficiently construct a symbolic model of the NCS by employing operations on binary decision diagrams (BDDs). Then, it synthesizes symbolic controllers satisfying a class of specifications. It has interfaces for the simulation and the visualization of the resulting closed-loop systems using OMNETPP and MATLAB. Additionally, SENSE can generate ready-to-implement VHDL/Verilog or C/C++ code from the synthesized controllers.

1806.09620 2026-06-04 math.OC cs.LG cs.NA math.NA

A DCA-Like Algorithm and its Accelerated Version with Application in Data Visualization

一种类似DCA的算法及其加速版本在数据可视化中的应用

Hoai An Le Thi, Hoai Minh Le, Duy Nhat Phan, Bach Tran

发表机构 * Department Informatics and Application(信息与应用系) LGIPM University of Lorraine(洛林大学) France(法国)

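AI summary: This paper proposes two variants of DCA that aim to speed up minimization of the constrained sum of a differentiable function and composite functions. The first improves DCA through a new decomposition technique; the second adds Nesterov's acceleration. Convergence under the Kurdyka-Łojasiewicz assumption is rigorously studied, and the algorithms are applied to t-distributed stochastic neighbor embedding.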

AI中文摘要

本文提出两种DCA变体,旨在加速约束下可微函数和复合函数的最小化问题。在第一种变体DCA-Like中,我们引入了一种新的技术来迭代修改目标函数的分解。这种连续分解可以导致更好的主导性,从而比基本DCA有更快的收敛速度。然后,我们将Nesterov的加速技术纳入DCA-Like中,得到第二种变体,称为加速DCA-Like。两种变体的收敛性质和在Kurdyka-Lojasiewicz假设下的收敛率被严格研究。作为应用,我们研究了我们的算法用于t-分布随机邻居嵌入。在几个基准数据集上的数值实验展示了我们算法的有效性。

英文摘要

In this paper, we present two variants of DCA (Difference of Convex functions Algorithm) to solve the constrained sum of differentiable function and composite functions minimization problem, with the aim of increasing the convergence speed of DCA. In the first variant, DCA-Like, we introduce a new technique to iteratively modify the decomposition of the objective function. This successive decomposition could lead to a better majorization and consequently a better convergence speed than the basic DCA. We then incorporate Nesterov's acceleration technique into DCA-Like to give rise to the second variant, named Accelerated DCA-Like. The convergence properties and the convergence rate of both variants under the Kurdyka-Łojasiewicz assumption are rigorously studied. As an application, we investigate our algorithms for the t-distributed stochastic neighbor embedding. Numerical experiments on several benchmark datasets illustrate the efficiency of our algorithms.
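For readers unfamiliar with the basic DCA that both variants build on, here is a minimal one-dimensional sketch. It is the plain DCA, not DCA-Like or its accelerated version, and the test function f(x) = x^4/4 - x^2/2 with the split g(x) = x^4/4, h(x) = x^2/2 is an assumption chosen because each convex subproblem has a closed-form solution.

```python
import math


def f(x):
    """DC objective f = g - h with g(x) = x**4 / 4 and h(x) = x**2 / 2."""
    return x ** 4 / 4.0 - x ** 2 / 2.0


def dca(x0, iters=60):
    """Basic DCA: linearize h at x_k, then minimize the convex majorant.

    Each step takes y_k = h'(x_k) = x_k and sets
    x_{k+1} = argmin_x g(x) - y_k * x, whose solution satisfies x**3 = y_k.
    """
    x = x0
    history = [f(x)]
    for _ in range(iters):
        y = x                                          # gradient of h at x_k
        x = math.copysign(abs(y) ** (1.0 / 3.0), y)    # real cube root of y
        history.append(f(x))
    return x, history
```

Starting from a positive point the iterates approach the local minimizer x = 1, from a negative point x = -1, and the objective values never increase along the way.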

1806.08810 2026-06-04 cs.LO cs.RO cs.SY eess.SY

Self-Driving Vehicle Verification Towards a Benchmark

自动驾驶车辆验证:一个基准

Nima Roohi, Ramneet Kaur, James Weimer, Oleg Sokolsky, Insup Lee

发表机构 * University of Pennsylvania(宾夕法尼亚大学)

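AI summary: This paper presents a simple formal model of a self-driving car. Although the safety of the simplified system has been proven manually, no automatic formal verification tool currently supports its dynamics; the model is offered as a challenge problem for such tools.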

Comments 7 pages

AI中文摘要

工业蜂窝物理系统是具有严格安全要求的混合系统。尽管没有形式语义,大多数这些系统主要由于两个原因被建模为Stateflow/Simulink:(1) 使用这些工具建模、测试和模拟更方便,(2)这些系统的动态特性不被大多数其他工具支持。此外,随着蜂窝物理系统复杂性的不断增加,自动形式化验证工具所能建模的范围与工业蜂窝物理系统的模型之间的差距也在扩大。在本文中,我们提出了一个简单的形式化模型用于自动驾驶车辆。尽管经过一些简化,该系统的安全性已经手动证明,据我们所知,目前没有自动形式化验证工具支持其动态特性。我们希望这能为针对工业应用的形式化验证工具提供一个挑战问题。

英文摘要

Industrial cyber-physical systems are hybrid systems with strict safety requirements. Despite the lack of a formal semantics, most of these systems are modeled using Stateflow/Simulink, mainly for two reasons: (1) it is easier to model, test, and simulate using these tools, and (2) the dynamics of these systems are not supported by most other tools. Furthermore, with the ever-growing complexity of cyber-physical systems, the gap grows between what can be modeled using an automatic formal verification tool and models of industrial cyber-physical systems. In this paper, we present a simple formal model for self-driving cars. Although, after some simplification, the safety of this system has already been proven manually, to the best of our knowledge no automatic formal verification tool supports its dynamics. We hope this serves as a challenge problem for formal verification tools targeting industrial applications.

1711.10868 2026-06-04 eess.SY cs.AI cs.SY math.OC

La production de nitrites lors de la dénitrification des eaux usées par biofiltration - Stratégie de contrôle et de réduction des concentrations résiduelles

废水生物过滤脱硝过程中亚硝酸盐的生成 - 控制与残留浓度的减少策略

Vincent Rocher, Cédric Join, Stéphane Mottelet, Jean Bernier, Sabrina Rechdaoui-Guérin, Sam Azimi, Paul Lessard, André Pauss, Michel Fliess

发表机构 * SIAAP (Syndicat Interdépartemental pour l'Assainissement de l'Agglomération Parisienne)(巴黎大都会污水处理协会) CRAN (CNRS, UMR 7039)(CRAN(国家科学研究中心,UMR 7039)) TIMR (EA 4297)(TIMR(EA 4297)) Département de génie civil et de génie des eaux, Université Laval(土木工程与水工程系,拉瓦尔大学) LIX (CNRS, UMR 7161)(LIX(国家科学研究中心,UMR 7161)) AL.I.E.N. (ALgèbre pour Identification & Estimation Numériques)(AL.I.E.N.(代数用于识别与数值估计))

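AI summary: Within the MOCOPEE program, this study investigates the mechanisms of nitrite production during wastewater denitrification, develops measurement and control tools to reduce on-site nitrite concentrations, and applies a model-free control strategy to lower and stabilize effluent nitrite levels.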

Comments in french, Journal of Water Science, to appear

Journal ref
Revue des Sciences de l'Eau, 31(1), 2018, 61-73
AI中文摘要

近年来,巴黎大区污水处理厂对脱硝后处理过程的流行导致塞纳河中亚硝酸盐浓度回升。控制脱硝后亚硝酸盐生成成为关键技术问题。MOCOPEE项目研究了废水脱硝过程中亚硝酸盐生成的机理,并开发了测量和控制工具以减少现场亚硝酸盐产量。先前研究表明,典型的甲醇投加策略会导致反应器中碳氮比波动,从而引起出水亚硝酸盐浓度不稳定。因此,在SimBio模型上测试了将模型无关控制添加到经典投加策略的可能性,该模型模拟了废水生物滤池的行为。相应的

英文摘要

The recent popularity of post-denitrification processes in the greater Paris area wastewater treatment plants has caused a resurgence of the presence of nitrite in the Seine river. Controlling the production of nitrite during the post-denitrification has thus become a major technical issue. Research studies have been led in the MOCOPEE program (www.mocopee.com) to better understand the underlying mechanisms behind the production of nitrite during wastewater denitrification and to develop technical tools (measurement and control solutions) to assist on-site reductions of nitrite productions. Prior studies have shown that typical methanol dosage strategies produce a varying carbon-to-nitrogen ratio in the reactor, which in turn leads to unstable nitrite concentrations in the effluent. The possibility of adding a model-free control to the actual classical dosage strategy has thus been tested on the SimBio model, which simulates the behavior of wastewater biofilters. The corresponding "intelligent" feedback loop, which is using effluent nitrite concentrations, compensates the classical strategy only when needed. Simulation results show a clear improvement in average nitrite concentration level and level stability in the effluent, without a notable overcost in methanol.

1806.05419 2026-06-04 stat.ML cs.LG cs.NA math.NA math.ST stat.TH

Ranking Recovery from Limited Comparisons using Low-Rank Matrix Completion

通过低秩矩阵补全进行有限比较的排序恢复

Tal Levy, Alireza Vahid, Raja Giryes

发表机构 * School of Electrical Engineering, Tel-Aviv University(特拉维夫大学电气工程学院) Electrical Engineering Department, University of Colorado Denver(科罗拉多大学丹佛分校电气工程系)

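AI summary: This paper applies low-rank matrix completion to the classical rank aggregation problem. Partial, noisy pairwise comparison data are put in matrix form, and alternating minimization combined with maximum likelihood estimation reconstructs the true underlying preference intensities.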

Comments 10 Pages, 9 figures. A prediction table for 2018 FIFA soccer world cup is included

AI中文摘要

本文提出了一种新的方法,利用低秩矩阵补全技术解决经典的排名聚合问题。通过将成对比较的不完全和噪声数据转换为矩阵形式,并利用矩阵补全工具(如Netflix挑战中的低秩补全解决方案)来构建不同对象的偏好。在我们的方法中,利用多个比较数据估计对象i相对于对象j获胜(或被选择)的概率,其中仅已知N个对象的部分比较数据。然后将数据转换为矩阵形式,其无噪声解具有已知的秩为一。接着使用目标矩阵具有双线性形式的交替最小化算法,并结合最大似然估计对两个因素进行估计。重建的矩阵用于获得真实的潜在偏好强度。本工作在模拟场景和真实数据中展示了所提算法相对于当前最先进方法的改进。

英文摘要

This paper proposes a new method for solving the well-known rank aggregation problem from pairwise comparisons using the method of low-rank matrix completion. The partial and noisy data of pairwise comparisons is transformed into a matrix form. We then use tools from matrix completion, which has served as a major component in the low-rank completion solution of the Netflix challenge, to construct the preference of the different objects. In our approach, the data of multiple comparisons is used to create an estimate of the probability of object i to win (or be chosen) over object j, where only a partial set of comparisons between N objects is known. The data is then transformed into a matrix form for which the noiseless solution has a known rank of one. An alternating minimization algorithm, in which the target matrix takes a bilinear form, is then used in combination with maximum likelihood estimation for both factors. The reconstructed matrix is used to obtain the true underlying preference intensity. This work demonstrates the improvement of our proposed algorithm over the current state-of-the-art in both simulated scenarios and real data.
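The idea of completing a rank-one matrix by alternating minimization can be sketched as follows. This is a simplified stand-in: it uses plain least-squares updates rather than the maximum likelihood estimation described in the abstract, and the ratio matrix M[i][j] = w[i] / w[j], the strengths `w` and the function name `rank_one_als` are assumptions made for illustration.

```python
def rank_one_als(observed, n, iters=200):
    """Fit M[i][j] ~ u[i] * v[j] on the observed entries only.

    observed maps (i, j) index pairs to known matrix values. Each sweep
    solves a least-squares problem for u with v fixed, then for v with
    u fixed (alternating minimization of the bilinear model).
    """
    u = [1.0] * n
    v = [1.0] * n
    for _ in range(iters):
        for i in range(n):
            num = sum(m * v[j] for (a, j), m in observed.items() if a == i)
            den = sum(v[j] ** 2 for (a, j) in observed if a == i)
            u[i] = num / den
        for j in range(n):
            num = sum(m * u[i] for (i, b), m in observed.items() if b == j)
            den = sum(u[i] ** 2 for (i, b) in observed if b == j)
            v[j] = num / den
    return u, v


# Hypothetical strengths; the pair (0, 3) was never compared.
w = [4.0, 3.0, 2.0, 1.0]
observed = {(i, j): w[i] / w[j]
            for i in range(4) for j in range(4)
            if (i, j) not in {(0, 3), (3, 0)}}
u, v = rank_one_als(observed, n=4)
ranking = sorted(range(4), key=lambda i: -u[i])
```

The recovered factor `u` is proportional to the hidden strengths, so sorting by it yields the ranking even though one pair of objects was never compared directly.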

1806.04830 2026-06-04 math.NA cs.LG cs.NA

Deep Multiscale Model Learning

深度多尺度模型学习

Yating Wang, Siu Wun Cheung, Eric T. Chung, Yalchin Efendiev, Min Wang

Affiliations: Department of Mathematics, Texas A&M University; Department of Mathematics, The Chinese University of Hong Kong; Department of Mathematics & Institute for Scientific Computation (ISC), Texas A&M University

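AI summary: This paper combines deep learning with local multiscale model reduction, drawing on both observed data and physical modeling concepts, to improve the predictive capability of multiscale flow simulations.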

AI中文摘要

本文的目标是设计新型多层神经网络架构,用于考虑观测数据和物理建模概念的流体多尺度模拟。我们的方法结合深度学习概念与局部多尺度模型降阶方法,预测流体动力学。使用降阶模型对于构建稳健的深度学习架构至关重要,因为降阶模型提供较少的自由度。流体动力学可以视为多层网络。更准确地说,时间瞬间n+1的解(例如压力和饱和度)取决于时间瞬间n的解和输入参数,如渗透率场、强迫项和初始条件。可以将解视为多层网络,其中每一层通常是一个非线性前向映射,层数与内部时间步数相关。我们将依赖严格的模型降阶概念来定义每个层的未知数和连接。在每一层中,我们的降阶模型将提供一个前向映射,该映射将通过可用数据进行修改(“训练”)。使用降阶模型至关重要,因为它们将识别影响区域和适当的变量数量。由于可用数据有限,训练将补充计算数据,并在数据丰富和数据贫乏的模型之间进行插值。我们还将使用深度学习算法来训练降阶模型离散系统的元素。我们将介绍我们方法的主要成分和数值结果。数值结果表明,使用深度学习和多尺度模型,可以提高受可用数据条件的前向模型。

英文摘要

The objective of this paper is to design novel multi-layer neural network architectures for multiscale simulations of flows taking into account the observed data and physical modeling concepts. Our approaches use deep learning concepts combined with local multiscale model reduction methodologies to predict flow dynamics. Using reduced-order model concepts is important for constructing robust deep learning architectures since the reduced-order models provide fewer degrees of freedom. Flow dynamics can be thought of as multi-layer networks. More precisely, the solution (e.g., pressures and saturations) at the time instant $n+1$ depends on the solution at the time instant $n$ and input parameters, such as permeability fields, forcing terms, and initial conditions. One can regard the solution as a multi-layer network, where each layer, in general, is a nonlinear forward map and the number of layers relates to the internal time steps. We will rely on rigorous model reduction concepts to define unknowns and connections for each layer. In each layer, our reduced-order models will provide a forward map, which will be modified ("trained") using available data. It is critical to use reduced-order models for this purpose, which will identify the regions of influence and the appropriate number of variables. Because of the lack of available data, the training will be supplemented with computational data as needed and with interpolation between data-rich and data-deficient models. We will also use deep learning algorithms to train the elements of the reduced model discrete system. We will present the main ingredients of our approach and numerical results. Numerical results show that using deep learning and multiscale models, we can improve the forward models, which are conditioned to the available data.

1803.02998 2026-06-04 eess.SY cs.AI cs.SY

DeepCAS: A Deep Reinforcement Learning Algorithm for Control-Aware Scheduling

DeepCAS: 一种用于控制感知调度的深度强化学习算法

Burak Demirel, Arunselvan Ramaswamy, Daniel E. Quevedo, Holger Karl

发表机构 * Paderborn University(帕德博恩大学)

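AI summary: This paper proposes DeepCAS, a deep reinforcement learning algorithm for control-aware scheduling. An optimal controller is synthesized for each subsystem, the learned schedule minimizes the control loss, and experiments show that it outperforms periodic schedules.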

AI中文摘要

我们考虑由多个独立受控子系统组成的网络控制系统,这些系统通过共享通信网络运行。此类系统在网络物理系统、物联网和大规模工业系统中普遍存在。在许多大规模设置中,通信网络的规模小于系统的规模,从而引发调度问题。本文的主要贡献是开发一种基于深度强化学习的控制感知调度(DeepCAS)算法,以解决这些问题。我们采用以下(最优)设计策略:首先,为每个子系统合成最优控制器;其次,设计一个学习算法,以适应所选子系统(被控对象)和控制器。由于这种适应性,我们的算法找到一个调度方案,以最小化控制损失。我们通过实验证明,DeepCAS找到的调度性能优于周期性调度。

英文摘要

We consider networked control systems consisting of multiple independent controlled subsystems, operating over a shared communication network. Such systems are ubiquitous in cyber-physical systems, Internet of Things, and large-scale industrial systems. In many large-scale settings, the size of the communication network is smaller than the size of the system. In consequence, scheduling issues arise. The main contribution of this paper is to develop a deep reinforcement learning-based \emph{control-aware} scheduling (\textsc{DeepCAS}) algorithm to tackle these issues. We use the following (optimal) design strategy: First, we synthesize an optimal controller for each subsystem; next, we design a learning algorithm that adapts to the chosen subsystems (plants) and controllers. As a consequence of this adaptation, our algorithm finds a schedule that minimizes the \emph{control loss}. We present empirical results to show that \textsc{DeepCAS} finds schedules with better performance than periodic ones.

1806.04167 2026-06-04 eess.SY cs.LG cs.SY

Learning an Approximate Model Predictive Controller with Guarantees

学习具有保证的近似模型预测控制器

Michael Hertneck, Johannes Köhler, Sebastian Trimpe, Frank Allgöwer

发表机构 * University of Stuttgart(斯图加特大学)

AI总结 本文提出一种监督学习框架,用于在降低计算复杂度的同时近似模型预测控制器,并保证稳定性和约束满足。通过结合鲁棒MPC设计和统计学习界限,为学习的MPC提供闭环保证。

Comments 6 pages, 3 figures, to appear in IEEE Control Systems Letters

详情
AI中文摘要

本文提出了一种监督学习框架,用于近似模型预测控制器(MPC),以降低计算复杂度并保证稳定性和约束满足。该框架可应用于广泛非线性系统。任何标准监督学习技术(例如神经网络)均可用于从样本中近似MPC。为了获得学习MPC的闭环保证,将鲁棒MPC设计与统计学习界限相结合。MPC设计确保在给定范围内输入不准确时的鲁棒性,Hoeffding不等式用于验证学习的MPC在高置信度下满足这些界限。结果是学习MPC的闭环统计保证,确保稳定性和约束满足。所提出的基于学习的MPC框架在非线性基准问题上进行了示例说明,其中我们学习了一个具有保证的神经网络控制器。

英文摘要

A supervised learning framework is proposed to approximate a model predictive controller (MPC) with reduced computational complexity and guarantees on stability and constraint satisfaction. The framework can be used for a wide class of nonlinear systems. Any standard supervised learning technique (e.g. neural networks) can be employed to approximate the MPC from samples. In order to obtain closed-loop guarantees for the learned MPC, a robust MPC design is combined with statistical learning bounds. The MPC design ensures robustness to inaccurate inputs within given bounds, and Hoeffding's Inequality is used to validate that the learned MPC satisfies these bounds with high confidence. The result is a closed-loop statistical guarantee on stability and constraint satisfaction for the learned MPC. The proposed learning-based MPC framework is illustrated on a nonlinear benchmark problem, for which we learn a neural network controller with guarantees.
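
As a rough illustration of the validation step described above, the following sketch applies the one-sided Hoeffding inequality to the empirical fraction of sampled states at which an approximate controller stays within a prescribed error bound. The function names, the sample counts, and the choice of a one-sided bound are assumptions made for this example; the abstract does not specify them.

```python
import math


def hoeffding_lower_bound(successes: int, n: int, delta: float) -> float:
    """Lower confidence bound on the true success probability.

    For n i.i.d. samples, with probability at least 1 - delta:
        true_rate >= successes / n - sqrt(ln(1 / delta) / (2 * n)).
    """
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return successes / n - eps


def samples_needed(eps: float, delta: float) -> int:
    """Smallest n for which the Hoeffding deviation term is at most eps."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * eps ** 2))


if __name__ == "__main__":
    # Hypothetical validation run: all 1000 sampled states met the error bound.
    print(hoeffding_lower_bound(1000, 1000, delta=0.01))
    print(samples_needed(eps=0.01, delta=0.01))
```

The bound only certifies the sampled property with the stated confidence; any closed-loop guarantee additionally depends on the robust MPC design tolerating the input error bound that was tested.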

1711.03449 2026-06-04 math.OC cs.RO cs.SY eess.SY

Optimization-Based Collision Avoidance

基于优化的避障方法

Xiaojing Zhang, Alexander Liniger, Francesco Borrelli

发表机构 * Model Predictive Control Laboratory, Department of Mechanical Engineering, University of California, Berkeley, USA(加州大学伯克利分校机械工程系模型预测控制实验室) Automatic Control Laboratory, Department of Information Technology and Electrical Engineering, ETH Zurich, Switzerland(苏黎世联邦理工学院信息科技与电气工程系自动控制实验室)

AI总结 本文提出一种将非光滑避障约束转化为光滑非线性约束的方法,利用凸优化的强对偶性。该方法适用于在n维空间中移动的受控物体,能处理一般障碍物和可表示为有限凸集并集的受控物体,并与传统轨迹生成算法中常用的有符号距离概念建立了联系。

Comments 27 pages, 9 figures, 2 tables

详情
AI中文摘要

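本文提出了一种将非光滑避障约束转化为光滑非线性约束的新方法,利用凸优化的强对偶性。我们关注一个受控物体,其目标是在n维空间中移动时避开障碍物。所提出的改写方法不引入近似,并适用于一般障碍物和可在n维空间中表示为有限凸集并集的受控物体。此外,我们将结果与传统轨迹生成算法中广泛使用的有符号距离概念联系起来。我们的方法可用于通用的导航和轨迹规划任务,其光滑性使得可以使用基于梯度和Hessian的通用优化算法。最后,在无法避免碰撞的情况下,我们的框架允许找到以穿透程度衡量的“最小侵入”轨迹。我们在四旋翼导航和自动泊车问题上验证了该框架的有效性,数值实验表明所提方法能够在狭窄环境中实现基于优化的实时轨迹规划。我们的实现源码见 https://github.com/XiaojingGeorgeZhang/OBCA。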

英文摘要

This paper presents a novel method for reformulating non-differentiable collision avoidance constraints into smooth nonlinear constraints using strong duality of convex optimization. We focus on a controlled object whose goal is to avoid obstacles while moving in an n-dimensional space. The proposed reformulation does not introduce approximations, and applies to general obstacles and controlled objects that can be represented in an n-dimensional space as the finite union of convex sets. Furthermore, we connect our results with the notion of signed distance, which is widely used in traditional trajectory generation algorithms. Our method can be used in generic navigation and trajectory planning tasks, and the smoothness property allows the use of general-purpose gradient- and Hessian-based optimization algorithms. Finally, in case a collision cannot be avoided, our framework allows us to find "least-intrusive" trajectories, measured in terms of penetration. We demonstrate the efficacy of our framework on a quadcopter navigation and automated parking problem, and our numerical experiments suggest that the proposed methods enable real-time optimization-based trajectory planning problems in tight environments. Source code of our implementation is provided at https://github.com/XiaojingGeorgeZhang/OBCA.
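
For intuition, the duality argument can be written out for the simplest case of a point $p$ and a single nonempty polyhedral obstacle $\mathbb{O}=\{y : Ay \le b\}$ with Euclidean distance. This is a sketch under those assumptions (the symbols $A$, $b$, $p$, $\lambda$ and $d_{\min}$ are introduced here for illustration), not a restatement of the paper's general result.

```latex
\begin{align*}
\operatorname{dist}(p,\mathbb{O})
  &= \min_{y}\ \bigl\{\, \|y-p\|_2 \;:\; Ay \le b \,\bigr\} \\
  &= \max_{\lambda \ge 0}\ \min_{z}\ \Bigl( \|z\|_2 + \lambda^\top A z \Bigr) + \lambda^\top (Ap-b)
     \qquad (z = y-p) \\
  &= \max_{\lambda}\ \bigl\{\, (Ap-b)^\top \lambda \;:\; \|A^\top \lambda\|_2 \le 1,\ \lambda \ge 0 \,\bigr\}.
\end{align*}
% The inner minimum over z is 0 when \|A^\top\lambda\|_2 \le 1 and -\infty otherwise.
% Hence, for a safety margin d_{\min} \ge 0:
\[
\operatorname{dist}(p,\mathbb{O}) > d_{\min}
\iff
\exists\, \lambda \ge 0:\quad (Ap-b)^\top \lambda > d_{\min},\quad \|A^\top \lambda\|_2 \le 1 .
\]
```

The non-differentiable distance condition on the left is replaced by constraints that are smooth in $(p,\lambda)$ once the norm constraint is squared, at the cost of carrying $\lambda$ as an extra decision variable.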

1806.02499 2026-06-04 eess.SY cs.LG cs.SY stat.ML

Conditional probability calculation using restricted Boltzmann machine with application to system identification

基于受限玻尔兹曼机的条件概率计算及其在系统辨识中的应用

Erick de la Rosa, Wen Yu

发表机构 * Departamento de Control Automatico CINVESTAV-IPN (National Polytechnic Institute)(自动控制系 CINVESTAV-IPN(国家理工学院))

AI总结 本文利用受限玻尔兹曼机计算条件概率用于非线性系统辨识,通过二进制编码和连续值方法改进模型,提出通用逼近分析,验证在噪声大和系统动态复杂时方法优势。

详情
AI中文摘要

使用概率方法进行非线性系统辨识具有优势,如数据集中的噪声和离群值对概率模型影响小,输入特征可概率形式提取。概率模型的主要障碍是概率分布难以获得。本文将非线性系统辨识转化为求解条件概率问题,修改受限玻尔兹曼机(RBM),使联合概率、输入分布和条件概率可通过RBM训练计算。讨论了二进制编码和连续值方法,提出基于条件概率建模的通用逼近分析。使用两个非线性系统基准测试比较本文概率建模方法与其他黑盒建模方法。结果表明,该新方法在存在大量噪声和系统动态复杂时表现更优。

英文摘要

Using probabilistic methods for nonlinear system identification has several advantages: noise and outliers in the data set do not affect the probability models significantly, and the input features can be extracted in probabilistic form. The biggest obstacle for probability models is that the probability distributions are not easy to obtain. In this paper, we recast nonlinear system identification as the problem of solving for a conditional probability. We then modify the restricted Boltzmann machine (RBM) such that the joint probability, the input distribution, and the conditional probability can be calculated by RBM training. Binary encoding and continuous-valued methods are discussed. A universal approximation analysis for conditional-probability-based modelling is proposed. We use two benchmark nonlinear systems to compare our probability modelling method with other black-box modelling methods. The results show that this novel method performs much better when the noise is large and the system dynamics are complex.

1806.01777 2026-06-04 eess.SP cs.RO cs.SY eess.SY

Safe Driving Capacity of Autonomous Vehicles

自动驾驶车辆的安全驾驶能力

Yuan-Ying Wang, Hung-Yu Wei

发表机构 * Department of Electrical Engineering(电气工程系) National Taiwan University(国立台湾大学)

AI总结 本文通过线性时序逻辑定义道路和车辆的安全状态,提出安全驾驶吞吐量和容量概念,分析不同因素对安全驾驶吞吐量的影响,并比较基于感知和协作车辆的道路安全驾驶容量差异。

Comments 5 pages, VTC 2018

详情
AI中文摘要

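一辆优秀的自动驾驶汽车应当能将乘客安全、高效地从一地送往另一地。然而,对安全和效率的不同定义方式可能显著影响我们得出的结论。本文使用线性时序逻辑(LTL)的语法,对道路的安全状态和车辆的安全状态给出形式化定义。随后我们提出安全驾驶吞吐量(SDT)和安全驾驶容量(SDC)的概念,用于衡量道路上处于安全状态的车辆数量。我们分析了不同因素对SDT的影响,并给出了基于感知的车辆(PBV)道路与基于协作的车辆(CBV)道路在SDC上的解析差异。我们认为,通过合理设计,全部为PBV的道路的SDC将以全部为CBV的道路的SDC为上界。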

英文摘要

An excellent self-driving car is expected to take its passengers safely and efficiently from one place to another. However, different ways of defining safety and efficiency may significantly affect the conclusions we draw. In this paper, we give formal definitions of the safe state of a road and the safe state of a vehicle using the syntax of linear temporal logic (LTL). We then propose the concepts of safe driving throughput (SDT) and safe driving capacity (SDC), which measure the number of vehicles in the safe state on a road. We analyze how SDT is affected by different factors. We show the analytic difference in SDC between a road with perception-based vehicles (PBV) and a road with cooperative-based vehicles (CBV). We claim that, through proper design, the SDC of a road filled with PBVs will be upper-bounded by the SDC of a road filled with CBVs.

1806.01678 2026-06-04 math.NA cs.LG cs.NA stat.ML

A Projection Method for Metric-Constrained Optimization

度量约束优化的一种投影方法

Nate Veldt, David Gleich, Anthony Wirth, James Saunderson

发表机构 * Purdue University, Mathematics Department(普渡大学数学系) Purdue University, Computer Science Department(普渡大学计算机科学系) The University of Melbourne, Computing and Information Systems School(墨尔本大学计算与信息系统学院) Monash University, Department of Electrical and Computer Systems Engineering(莫纳什大学电子与计算机系统工程系)

AI总结 本文提出一种解决度量约束优化问题的新方法,通过改进投影算法解决图聚类中的高维优化问题,并提供新的近似保证。

详情
AI中文摘要

我们概述了一种解决强制输出变量三角不等式的优化问题的新方法。我们将其称为度量约束优化,并给出了在机器学习应用和图聚类理论近似算法中出现的几个例子。尽管这些问题是理论上的有趣问题,但实际求解具有挑战性,因为黑箱求解器需要高内存。为了解决这一挑战,我们首先证明了相关聚类的度量约束线性规划松弛等价于度量接近问题的特殊情况。然后我们通过推广和改进最初用于度量接近的简单投影算法,开发了一个通用求解器。我们为使用我们的框架找到几个具有挑战性的图聚类问题的最优解的下界提供了几种新的近似保证。我们还通过解决包含高达10^8个变量和10^11个约束的优化问题来展示我们框架的威力。

英文摘要

We outline a new approach for solving optimization problems which enforce triangle inequalities on output variables. We refer to this as metric-constrained optimization, and give several examples where problems of this form arise in machine learning applications and theoretical approximation algorithms for graph clustering. Although these problems are interesting from a theoretical perspective, they are challenging to solve in practice due to the high memory requirement of black-box solvers. In order to address this challenge we first prove that the metric-constrained linear program relaxation of correlation clustering is equivalent to a special case of the metric nearness problem. We then develop a general solver for metric-constrained linear and quadratic programs by generalizing and improving a simple projection algorithm originally developed for metric nearness. We give several novel approximation guarantees for using our framework to find lower bounds for optimal solutions to several challenging graph clustering problems. We also demonstrate the power of our framework by solving optimization problems involving up to 10^{8} variables and 10^{11} constraints.
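
To make the idea of a projection step concrete, the sketch below cyclically projects a symmetric dissimilarity matrix onto one triangle-inequality half-space at a time. It is a simplified illustration with assumed names (`project_triangles`, `tol`, `max_sweeps`): plain cyclic projection reaches a matrix that satisfies the constraints but not necessarily the nearest one, and it omits the correction terms and objective handling that a full metric-nearness or LP/QP solver would need.

```python
from itertools import combinations


def project_triangles(d, tol=1e-9, max_sweeps=1000):
    """Cyclically project onto constraints x[a][b] <= x[a][c] + x[b][c].

    d is a symmetric matrix given as a list of lists. Returns a new matrix.
    """
    n = len(d)
    x = [row[:] for row in d]
    for _ in range(max_sweeps):
        worst = 0.0
        for i, j, k in combinations(range(n), 3):
            # Each triple gives three constraints, one per choice of "long" edge.
            for a, b, c in ((i, j, k), (i, k, j), (j, k, i)):
                v = x[a][b] - x[a][c] - x[b][c]
                if v > tol:
                    worst = max(worst, v)
                    # Euclidean projection onto the violated half-space:
                    # move each of the three entries by one third of the violation.
                    s = v / 3.0
                    x[a][b] -= s
                    x[b][a] = x[a][b]
                    x[a][c] += s
                    x[c][a] = x[a][c]
                    x[b][c] += s
                    x[c][b] = x[b][c]
        if worst <= tol:
            break
    return x


if __name__ == "__main__":
    d = [[0, 5, 1],
         [5, 0, 1],
         [1, 1, 0]]
    print(project_triangles(d))
```

Only the three entries involved in a violated constraint are touched per step, so a method of this shape does not need to form the full constraint matrix, which grows cubically with the number of points.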

1806.01003 2026-06-04 eess.SY cs.LG cs.SY stat.ML

Distributed Learning from Interactions in Social Networks

社交网络中交互的分布式学习

Francesco Sasso, Angelo Coluccia, Giuseppe Notarstefano

发表机构 * European Research Council (ERC)(欧洲研究理事会)

AI总结 本文提出基于社交网络交互的分布式学习框架,利用贝叶斯方法和最大似然估计,通过图模型工具实现参数和超参数的分布式估计,用于用户画像建模。

Comments This submission is a shorter work (for conference publication) of a more comprehensive paper, already submitted as arXiv:1706.04081 (under review for journal publication). In this short submission only one social set-up is considered and only one of the relaxed estimators is proposed. Moreover, the exhaustive analysis, carried out in the longer manuscript, is completely missing in this version

详情
AI中文摘要

我们考虑一个网络场景,其中代理可以根据表示某些交互的评分图来评估彼此。目标是设计一个分布式协议,由代理运行,使他们能够在有限的可能值中学习其未知状态。我们提出一个贝叶斯框架,其中评分和状态与具有未知参数和超参数的概率事件相关联。我们展示每个代理可以通过本地贝叶斯分类器和结合普通最大似然估计和经验贝叶斯方法的(集中式)最大似然(ML)估计器来学习其状态。通过使用图模型工具,我们可以获得评分和状态的条件依赖性的洞察,从而提供一个放松的概率模型,最终导致一个适合分布式计算的参数-超参数估计器。为了突出所提放松的适当性,我们将在社交互动设置中演示分布式估计器。

英文摘要

We consider a network scenario in which agents can evaluate each other according to a score graph that models some interactions. The goal is to design a distributed protocol, run by the agents, that allows them to learn their unknown state among a finite set of possible values. We propose a Bayesian framework in which scores and states are associated with probabilistic events with unknown parameters and hyperparameters, respectively. We show that each agent can learn its state by means of a local Bayesian classifier and a (centralized) Maximum-Likelihood (ML) estimator of the parameters and hyperparameters that combines plain ML and Empirical Bayes approaches. By using tools from graphical models, which allow us to gain insight into the conditional dependencies of scores and states, we provide a relaxed probabilistic model that ultimately leads to a parameter-hyperparameter estimator amenable to distributed computation. To highlight the appropriateness of the proposed relaxation, we demonstrate the distributed estimators on a social interaction set-up for user profiling.

1805.12170 2026-06-04 eess.SY cs.RO cs.SY

An Improved Active Disturbance Rejection Control for a Differential Drive Mobile Robot with Mismatched Disturbances and Uncertainties

一种针对存在不匹配扰动和不确定性的差速驱动移动机器人的改进自抗扰控制

Ibraheem Kasim Ibraheem, Wameedh Riyadh Abdul-Adeem

发表机构 * Electrical Engineering Department(电气工程系) College of Engineering, Baghdad University(巴格达大学工程学院)

AI总结 本文提出了一种基于扰动和不确定性估计与抑制技术的改进自抗扰控制方法,用于非线性运动学模型的差速驱动移动机器人,通过估计并抵消总扰动提高动态性能。

详情
AI中文摘要

本文提出了一种基于扰动和不确定性(DU)估计与抑制技术的新策略,并在差速驱动移动机器人(DDMR)的非线性运动学模型上进行了测试。所提出的方法是J. Han提出的自抗扰控制(ADRC)策略的改进版本。ADRC用于主动抵消由未知外源信号和系统模型的匹配不确定性引起的扰动,这些扰动被合并为总扰动。在本工作中,假设系统为仿射系统,并认为总扰动和输入位于不同通道。为处理不匹配扰动和不确定性,总扰动被转换为匹配扰动。然后基于改进的ADRC(IADRC),通过估计总扰动并将其从系统中抵消,提高了DDMR的动态性能。通过数字仿真,应用了不同的性能指标,所有结果都表明所提出的IADRC是有效的:它几乎消除了抖振现象,并使闭环系统对扭矩扰动具有很高的抗扰能力。

英文摘要

In this paper a new strategy based on disturbance and uncertainty (DU) estimation and attenuation technique is proposed and tested on the nonlinear kinematic model of the differential drive mobile robot (DDMR). The proposed technique is an improved version of the Active Disturbance Rejection Control (ADRC) strategy suggested by J. Han. The ADRC is used to actively reject disturbances caused by the unknown exogenous signals and the matched uncertainties of the system model, which are lumped all together and attributed as a total disturbance. In this work, the considered system is assumed to be affine and the total disturbance and the input are considered to be on different channels. To deal with the mismatched disturbances and uncertainties, the total disturbance has been converted into a matched one. Then, based on the improved ADRC (IADRC), the dynamic performance of the DDMR has been enhanced by estimating the total disturbance and canceling it from the system. Through digital simulations, different performance measures are applied, and they all indicate the effectiveness of the proposed IADRC by almost removing the chattering phenomenon and providing a high immunity in the closed-loop system against torque disturbance.

1805.09613 2026-06-04 stat.ML cs.AI cs.LG cs.RO cs.SY eess.SY

A0C: Alpha Zero in Continuous Action Space

A0C:在连续动作空间中的Alpha Zero

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker

发表机构 * Dep. of Computer Science, Delft University of Technology, The Netherlands(代尔夫特理工大学计算机科学系,荷兰) Dep. of Computer Science, Leiden University, The Netherlands(莱顿大学计算机科学系,荷兰)

AI总结 本文提出将Alpha Zero扩展到连续动作空间的理论方法,并在倒摆任务中验证了其可行性,为连续动作空间中的迭代搜索与学习应用奠定了基础。

详情
AI中文摘要

Alpha Zero的核心创新在于树搜索与深度学习的交替结合,这在国际象棋、将棋和围棋等棋类游戏中证明非常成功。这些游戏具有离散动作空间。然而,许多现实世界的强化学习领域具有连续动作空间,例如机器人控制、导航和自动驾驶汽车。本文提出了将Alpha Zero扩展到连续动作空间所需的理论扩展。我们还提供了在单摆摆起(Pendulum swing-up)任务上的一些初步实验,实证地展示了我们方法的可行性。因此,这项工作为在连续动作空间领域中应用迭代搜索与学习迈出了第一步。

英文摘要

A core novelty of Alpha Zero is the interleaving of tree search and deep learning, which has proven very successful in board games like Chess, Shogi and Go. These games have a discrete action space. However, many real-world reinforcement learning domains have continuous action spaces, for example in robotic control, navigation and self-driving cars. This paper presents the necessary theoretical extensions of Alpha Zero to deal with continuous action space. We also provide some preliminary experiments on the Pendulum swing-up task, empirically showing the feasibility of our approach. Thereby, this work provides a first step towards the application of iterated search and learning in domains with a continuous action space.

1804.02307 2026-06-04 math.OC cs.CV cs.NA math.NA

Accelerated Optimization in the PDE Framework: Formulations for the Manifold of Diffeomorphisms

PDE框架下的加速优化:微分同胚流形上的表述

Ganesh Sundaramoorthi, Anthony Yezzi

发表机构 * KAUST (King Abdullah University of Science and Technology)(阿卜杜拉国王科技大学) Georgia Institute of Technology(佐治亚理工学院)

AI总结 本文提出了一种适用于微分同胚流形上优化问题的新方法,通过将Nesterov加速优化推广到无限维流形,推导出连续演化方程并将其与流体力学原理联系起来,同时与最优质量传输问题建立联系。

详情
AI中文摘要

我们考虑在无限维微分同胚流形上优化代价泛函的问题。通过将Nesterov加速优化推广到微分同胚流形,我们提出了一类新的优化方法,适用于任何定义在微分同胚空间上的优化问题。虽然我们的框架对无限维流形具有一般性,但受计算机视觉中光流问题的启发,我们专门处理微分同胚的情形。这一工作建立在Wibisono、Wilson和Jordan最近针对一般类加速优化方法提出的变分方法之上,该方法适用于有限维空间;我们将其推广到无限维流形。我们推导出形式出人意料地简单的连续演化方程(即偏微分方程)用于加速梯度下降,并将其与流体力学中的简单力学原理联系起来。我们的方法与最优质量传输问题有自然联系,因为可以将其视为无限多个带质量(以质量密度表示)的粒子在能量景观中运动的演化。质量随优化变量演化,并赋予粒子动力学。这与有限维情形不同:后者只有单个粒子运动,因此动力学不依赖于质量。我们推导了理论,计算了加速优化的偏微分方程,并展示了这些新加速优化方案的行为。

英文摘要

We consider the problem of optimization of cost functionals on the infinite-dimensional manifold of diffeomorphisms. We present a new class of optimization methods, valid for any optimization problem set up on the space of diffeomorphisms, by generalizing Nesterov accelerated optimization to the manifold of diffeomorphisms. While our framework is general for infinite-dimensional manifolds, we specifically treat the case of diffeomorphisms, motivated by optical flow problems in computer vision. This is accomplished by building on a recent variational approach to a general class of accelerated optimization methods by Wibisono, Wilson and Jordan, which applies in finite dimensions. We generalize that approach to infinite-dimensional manifolds. We derive the surprisingly simple continuum evolution equations, which are partial differential equations, for accelerated gradient descent, and relate them to simple mechanical principles from fluid mechanics. Our approach has natural connections to the optimal mass transport problem. This is because one can think of our approach as an evolution of an infinite number of particles endowed with mass (represented with a mass density) that move in an energy landscape. The mass evolves with the optimization variable, and endows the particles with dynamics. This is different from the finite-dimensional case, where only a single particle moves and hence the dynamics do not depend on the mass. We derive the theory, compute the PDEs for accelerated optimization, and illustrate the behavior of these new accelerated optimization schemes.
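
As background for the finite-dimensional method being generalized, the sketch below compares plain gradient descent with a Nesterov-style accelerated iteration on a small ill-conditioned quadratic. It is only a finite-dimensional toy, not the PDE formulation on diffeomorphisms; the test function, the step size 1/L, and the momentum schedule (k - 1)/(k + 2) are choices made for this example.

```python
def f(x):
    """Toy objective: 0.5 * (x0^2 + 100 * x1^2), with Lipschitz constant L = 100."""
    return 0.5 * (x[0] ** 2 + 100.0 * x[1] ** 2)


def grad(x):
    """Gradient of the toy objective."""
    return [x[0], 100.0 * x[1]]


def gradient_descent(x0, step, iters):
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = [xi - step * gi for xi, gi in zip(x, g)]
    return x


def nesterov(x0, step, iters):
    x_prev = list(x0)
    x = list(x0)
    for k in range(1, iters + 1):
        beta = (k - 1) / (k + 2)  # momentum weight, zero on the first step
        # Extrapolate along the previous displacement, then take a gradient step.
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        g = grad(y)
        x_prev, x = x, [yi - step * gi for yi, gi in zip(y, g)]
    return x


if __name__ == "__main__":
    start = [1.0, 1.0]
    print("gradient descent:", f(gradient_descent(start, 0.01, 100)))
    print("nesterov:        ", f(nesterov(start, 0.01, 100)))
```

In the single-particle picture mentioned in the abstract, the extrapolation term plays the role of the particle's velocity; in the infinite-dimensional setting the abstract describes, a mass density is transported as well.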

1805.08468 2026-06-04 math.NA cs.LG cs.NA

Rank Minimization on Tensor Ring: A New Paradigm in Scalable Tensor Decomposition and Completion

张量环上的秩最小化:一种可扩展张量分解与补全的新范式

Longhao Yuan, Chao Li, Danilo Mandic, Jianting Cao, Qibin Zhao

发表机构 * Graduate School of Engineering, Saitama Institute of Technology, Japan(日本埼玉工业大学工学研究科) Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan(日本RIKEN高级智能项目(AIP)张量学习单元) School of Automation, Guangdong University of Technology, China(中国广东工业大学自动化学院) School of Computer Science and Technology, Hangzhou Dianzi University, China(中国杭州电子科技大学计算机科学与技术学院) Department of Electrical and Electronic Engineering, Imperial College London, United Kingdom(英国伦敦帝国理工学院电气与电子工程系)

AI总结 本文提出基于张量环的秩最小化方法,通过引入凸替代项解决传统方法的高计算成本和模型复杂度敏感问题,提出两种算法以不同结构的Schatten范数优化张量环因子,实验显示其高效性与高性能。

详情
AI中文摘要

在低秩张量补全任务中,由于传统方法需要多次大规模奇异值分解(SVD)操作和秩选择问题,导致计算成本高且对模型复杂度敏感。本文利用最近提出的张量环(TR)分解的高压缩性,提出了一种新的张量补全模型。通过引入凸替代项对潜在张量环因子的低秩假设,使得基于Schatten范数正则化的模型可以在更小的规模上求解。我们提出了两种算法,分别对张量环因子应用不同的结构化Schatten范数。通过交替方向乘子法(ADMM)方案,张量环因子和预测张量可以同时优化。在合成数据和实际数据上的实验显示了所提方法的高性能和高效性。

英文摘要

In low-rank tensor completion tasks, traditional methods suffer from high computational cost and high sensitivity to model complexity, owing to the multiple large-scale singular value decomposition (SVD) operations they require and to the rank selection problem. In this paper, taking advantage of the high compressibility of the recently proposed tensor ring (TR) decomposition, we propose a new model for the tensor completion problem. This is achieved by introducing convex surrogates of the tensor low-rank assumption on the latent tensor ring factors, which makes it possible to solve Schatten-norm-regularized models at a much smaller scale. We propose two algorithms that apply different structured Schatten norms to the tensor ring factors. Using the alternating direction method of multipliers (ADMM) scheme, the tensor ring factors and the predicted tensor can be optimized simultaneously. Experiments on synthetic and real-world data show the high performance and efficiency of the proposed approach.

1805.07196 2026-06-04 eess.SY cs.AI cs.SY

Supervisory Control of Probabilistic Discrete Event Systems under Partial Observation

对在部分观测下概率离散事件系统的监督控制

Weilin Deng, Jingkai Yang, Daowen Qiu

发表机构 * School of Data and Computer Science, Sun Yat-sen University(中山大学数据与计算机科学学院)

AI总结 研究在概率监督控制器和部分观测假设下概率离散事件系统(PDESs)的监督控制,提出概率可控性和可观测性的概念,并设计多项式验证算法,同时引入并计算了最优控制问题的解。

Comments 36 pages, comments are welcome

详情
AI中文摘要

本文研究了部分观测下概率离散事件系统(PDESs)的监督控制,假设监督控制器(监督器)是概率性的且只具有部分观测。定义了概率P监督器,它为每个观测指定控制模式上的概率分布。提出了概率可控性和概率可观测性的概念,并证明它们是概率P监督器存在的充分必要条件。此外,给出了验证概率可控性和可观测性的多项式时间算法。还引入了下确界概率可控且可观测的超语言,并将其作为PDESs最优控制问题的解进行计算。通过几个例子说明了所获得的结果。

英文摘要

The supervisory control of probabilistic discrete event systems (PDESs) is investigated under the assumptions that the supervisory controller (supervisor) is probabilistic and has only partial observation. The probabilistic P-supervisor is defined, which specifies a probability distribution on the control patterns for each observation. The notions of probabilistic controllability and observability are proposed and shown to be necessary and sufficient conditions for the existence of probabilistic P-supervisors. Moreover, polynomial-time verification algorithms for probabilistic controllability and observability are put forward. In addition, the infimal probabilistic controllable and observable superlanguage is introduced and computed as the solution of the optimal control problem of PDESs. Several examples are presented to illustrate the results obtained.

1805.03090 2026-06-04 math.OC cs.AI cs.SY eess.SY

Deception in Optimal Control

最优控制中的欺骗

Melkior Ornik, Ufuk Topcu

发表机构 * Institute for Computational Engineering and Sciences, University of Texas at Austin(德克萨斯大学奥斯汀分校计算工程与科学研究所) Department of Aerospace Engineering and Engineering Mechanics and the Institute for Computational Engineering and Sciences, University of Texas at Austin(德克萨斯大学奥斯汀分校航空航天工程与工程力学系及计算工程与科学研究所)

AI总结 本文提出一个数学严谨的框架,用于定义最优控制中的欺骗,通过设计最优欺骗策略,考虑代理和对手的信念空间,并讨论在不确定性和部分可观测马尔可夫决策过程中的欺骗策略设计。

详情
AI中文摘要

本文考虑了一个对抗性场景,其中一方试图实现目标,而其对手试图学习该方的意图并阻止其达成目标。代理有动机试图欺骗对手,同时努力实现其目标。本文的主要贡献是引入了一个数学严谨的框架,用于在最优控制背景下定义欺骗。核心概念是信念诱导奖励:一种奖励不仅依赖于代理的状态和动作,还依赖于对手的信念。设计最优欺骗策略成为在代理状态空间和对手信念空间的乘积上进行最优控制设计的问题。所提出的框架允许在任意具有奖励函数的控制系统中定义欺骗,以及带有额外限制代理控制策略的规范。除了定义欺骗外,我们还讨论了在代理对对手学习过程的知识不确定时如何设计最优欺骗策略。在论文后半部分,我们聚焦于代理行为由马尔可夫决策过程决定的场景,并展示在缺乏对手知识时设计最优欺骗策略自然减少到之前讨论的控制设计问题中部分可观测或不确定的马尔可夫决策过程中。最后,我们给出了两个欺骗策略的例子:一个“警察与小偷”场景和一个代理在移动时使用伪装的例子。我们展示了在这些例子中最优欺骗策略遵循上述设置中如何欺骗对手的直观想法。

英文摘要

In this paper, we consider an adversarial scenario where one agent seeks to achieve an objective and its adversary seeks to learn the agent's intentions and prevent the agent from achieving its objective. The agent has an incentive to try to deceive the adversary about its intentions, while at the same time working to achieve its objective. The primary contribution of this paper is to introduce a mathematically rigorous framework for the notion of deception within the context of optimal control. The central notion introduced in the paper is that of a belief-induced reward: a reward dependent not only on the agent's state and action, but also on the adversary's beliefs. Design of an optimal deceptive strategy then becomes a question of optimal control design on the product of the agent's state space and the adversary's belief space. The proposed framework allows for deception to be defined in an arbitrary control system endowed with a reward function, as well as with additional specifications limiting the agent's control policy. In addition to defining deception, we discuss the design of optimally deceptive strategies under uncertainties in the agent's knowledge about the adversary's learning process. In the latter part of the paper, we focus on a setting where the agent's behavior is governed by a Markov decision process, and show that the design of optimally deceptive strategies under lack of knowledge about the adversary naturally reduces to previously discussed problems in control design on partially observable or uncertain Markov decision processes. Finally, we present two examples of deceptive strategies: a "cops and robbers" scenario and an example where an agent may use camouflage while moving. We show that optimally deceptive strategies in such examples follow the intuitive idea of how to deceive an adversary in the above settings.

1805.00983 2026-06-04 eess.SY cs.AI cs.SY math.OC

Robust Deep Reinforcement Learning for Security and Safety in Autonomous Vehicle Systems

面向自动驾驶车辆系统信息安全与行车安全的鲁棒深度强化学习

Aidin Ferdowsi, Ursula Challita, Walid Saad, Narayan B. Mandayam

发表机构 * Ericsson Research(爱立信研究) WINLAB, Dept. of ECE, Rutgers University(WINLAB,电子与计算机工程系,罗格斯大学)

AI总结 本文提出了一种新颖的对抗深度强化学习算法,用于提高自动驾驶系统在面对网络物理攻击时的鲁棒性,通过博弈论框架分析攻击者与自动驾驶车辆之间的对抗行为,利用LSTM块学习预期间距偏差以优化安全控制。

Comments 8 pages, 4 figures

详情
AI中文摘要

为了在未来的智能城市中有效运行,自动驾驶车辆(AVs)必须依赖摄像头和雷达等车载传感器以及车间通信。这种对传感器和通信链路的依赖使AVs容易受到网络物理(CP)攻击,攻击者试图通过操纵数据来控制AVs。因此,为了确保安全和最优的AV动态控制,AVs的数据处理功能必须对这些CP攻击具有鲁棒性。为此,本文分析了在存在CP攻击情况下监控AV动态的状态估计过程,并提出了一种新颖的对抗深度强化学习(RL)算法,以最大化AV动态控制对CP攻击的鲁棒性。攻击者的行动和AV对CP攻击的反应在博弈论框架下进行研究。在所构建的博弈中,攻击者试图向AV传感器读数中注入错误数据,以操纵车间最优安全间距,从而可能增加AV事故风险或减少道路上的车流量。同时,AV作为防御方,试图最小化间距偏差以确保对攻击者行为的鲁棒性。由于AV没有关于攻击者行动的信息,且数据值操纵的可能性无限,参与者过去交互的结果被输入到长短期记忆(LSTM)块中。每个参与者的LSTM块学习其自身行动导致的预期间距偏差,并将其反馈到其RL算法中。然后,攻击者的RL算法选择最大化间距偏差的动作,而AV的RL算法则试图找到最小化此类偏差的最优动作。

英文摘要

To operate effectively in tomorrow's smart cities, autonomous vehicles (AVs) must rely on intra-vehicle sensors such as cameras and radar as well as inter-vehicle communication. Such dependence on sensors and communication links exposes AVs to cyber-physical (CP) attacks by adversaries that seek to take control of the AVs by manipulating their data. Thus, to ensure safe and optimal AV dynamics control, the data processing functions at AVs must be robust to such CP attacks. To this end, in this paper, the state estimation process for monitoring AV dynamics in the presence of CP attacks is analyzed, and a novel adversarial deep reinforcement learning (RL) algorithm is proposed to maximize the robustness of AV dynamics control to CP attacks. The attacker's action and the AV's reaction to CP attacks are studied in a game-theoretic framework. In the formulated game, the attacker seeks to inject faulty data into AV sensor readings so as to manipulate the inter-vehicle optimal safe spacing and potentially increase the risk of AV accidents or reduce the vehicle flow on the roads. Meanwhile, the AV, acting as a defender, seeks to minimize the deviations of spacing so as to ensure robustness to the attacker's actions. Since the AV has no information about the attacker's action, and due to the infinite possibilities for data value manipulations, the outcomes of the players' past interactions are fed to long short-term memory (LSTM) blocks. Each player's LSTM block learns the expected spacing deviation resulting from its own action and feeds it to its RL algorithm. Then, the attacker's RL algorithm chooses the action which maximizes the spacing deviation, while the AV's RL algorithm tries to find the optimal action that minimizes such deviation.