arXivDaily: daily arXiv academic digest, updated Monday through Friday
1210.0888 2026-06-03 cs.RO cs.SY eess.SY math.OC

Control Design along Trajectories with Sums of Squares Programming

Anirudha Majumdar, Amir Ali Ahmadi, Russ Tedrake

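AI summary: Proposes a control design method that uses sums of squares programming to maximize the size of invariant funnels, providing formal guarantees of stability and safety for robot control tasks.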

Abstract

Motivated by the need for formal guarantees on the stability and safety of controllers for challenging robot control tasks, we present a control design procedure that explicitly seeks to maximize the size of an invariant "funnel" that leads to a predefined goal set. Our certificates of invariance are given in terms of sums of squares proofs of a set of appropriately defined Lyapunov inequalities. These certificates, together with our proposed polynomial controllers, can be efficiently obtained via semidefinite optimization. Our approach can handle time-varying dynamics resulting from tracking a given trajectory, input saturations (e.g. torque limits), and can be extended to deal with uncertainty in the dynamics and state. The resulting controllers can be used by space-filling feedback motion planning algorithms to fill up the space with significantly fewer trajectories. We demonstrate our approach on a severely torque limited underactuated double pendulum (Acrobot) and provide extensive simulation and hardware validation.
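
The funnel idea in this abstract can be written down compactly. The block below is a generic sketch of a funnel invariance condition and one sums of squares relaxation of it, not a transcription of the paper's exact program; the symbols $V$, $\rho$, $L$ and $f$ are assumptions introduced here for illustration, and the extra multipliers needed to restrict time to an interval are omitted.

```latex
% Funnel: a time-varying sublevel set of a Lyapunov-like function V
\mathcal{F}(t) = \{\, x : V(t, x) \le \rho(t) \,\}, \qquad t \in [t_0, t_f]

% Time derivative of V along the closed-loop dynamics \dot x = f(t, x)
\dot V(t, x) = \frac{\partial V}{\partial t} + \frac{\partial V}{\partial x} f(t, x)

% Invariance: on the funnel boundary, V may not grow faster than the level \rho
V(t, x) = \rho(t) \;\Longrightarrow\; \dot V(t, x) \le \dot\rho(t)

% A sufficient condition with a free polynomial multiplier L(t, x):
\dot\rho(t) - \dot V(t, x) + L(t, x)\,\bigl(V(t, x) - \rho(t)\bigr) \ \text{is a sum of squares}
```

On the boundary $V = \rho$ the multiplier term vanishes, so the sums of squares condition forces $\dot\rho - \dot V \ge 0$ there. Checking that a polynomial is a sum of squares reduces to a semidefinite program, which is why certificates of this type can be searched for with semidefinite optimization.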

1207.3554 2026-06-03 cs.CV cs.NA math.NA stat.ME stat.ML

Designing various component analysis at will

Akisato Kimura, Masashi Sugiyama, Sakano Hitoshi, Hirokazu Kameoka

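AI summary: Proposes a generic component analysis framework based on the Generalized Pairwise Expression (GPE), covering standard methods, regularization, weighting, clustering, and semi-supervised extensions, and gives a simple strategy for designing new methods by combining templates.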

Comments Accepted to IAPR International Conference on Pattern Recognition; submitted to IPSJ Transactions on Mathematical Modeling and its Applications (TOM). Only a one-page abstract for now, due to novelty-violation concerns for the journal submission. The details will be disclosed in late September

Abstract

This paper provides a generic framework of component analysis (CA) methods introducing a new expression for scatter matrices and Gram matrices, called Generalized Pairwise Expression (GPE). This expression is quite compact but highly powerful: The framework includes not only (1) the standard CA methods but also (2) several regularization techniques, (3) weighted extensions, (4) some clustering methods, and (5) their semi-supervised extensions. This paper also presents quite a simple methodology for designing a desired CA method from the proposed framework: Adopting the known GPEs as templates, and generating a new method by combining these templates appropriately.

1205.3668 2026-06-03 cs.RO cs.SY eess.SY nlin.AO physics.comp-ph

Synthesis and Adaptation of Effective Motor Synergies for the Solution of Reaching Tasks

Cristiano Alessandro, Juan Pablo Carbajal, Andrea d'Avella

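AI summary: Inspired by the muscle synergy hypothesis, proposes generating open-loop controllers as linear combinations of a small set of predefined synergies, letting an agent autonomously synthesize and adapt an effective synergy set for point-to-point reaching tasks; this greatly reduces the dimensionality of the control problem while keeping good performance.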

Comments conference paper

Abstract

Taking inspiration from the hypothesis of muscle synergies, we propose a method to generate open loop controllers for an agent solving point-to-point reaching tasks. The controller output is defined as a linear combination of a small set of predefined actuations, termed synergies. The method can be interpreted from a developmental perspective, since it allows the agent to autonomously synthesize and adapt an effective set of synergies to new behavioral needs. This scheme greatly reduces the dimensionality of the control problem, while keeping a good performance level. The framework is evaluated in a planar kinematic chain, and the quality of the solutions is quantified in several scenarios.
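
The central construction, a controller output formed as a linear combination of a few predefined actuations, can be illustrated with a minimal sketch. The function name, the two synergy profiles, and the weights below are hypothetical values chosen for illustration; the abstract does not specify how synergies are represented or how weights are found.

```python
from typing import List, Sequence


def combine_synergies(
    synergies: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Return the open-loop command u[t] = sum_i weights[i] * synergies[i][t]."""
    if len(synergies) != len(weights):
        raise ValueError("need exactly one weight per synergy")
    horizon = len(synergies[0])
    if any(len(s) != horizon for s in synergies):
        raise ValueError("all synergies must share the same time horizon")
    return [
        sum(w * s[t] for w, s in zip(weights, synergies)) for t in range(horizon)
    ]


# Two hypothetical synergies sampled at four time steps (illustrative values).
SYNERGIES = [
    [1.0, 0.5, 0.0, -0.5],
    [0.0, 1.0, 2.0, 1.0],
]

# The control problem now has two unknowns (the weights) instead of four.
command = combine_synergies(SYNERGIES, [2.0, -1.0])
print(command)
```

The dimensionality reduction mentioned in the abstract shows up here directly: the search is over one weight per synergy rather than over one value per time step.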

1209.0001 2026-06-03 cs.LG cs.NA math.NA stat.ML

An Improved Bound for the Nystrom Method for Large Eigengap

Mehrdad Mahdavi, Tianbao Yang, Rong Jin

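AI summary: For kernel matrices with a large eigengap in the spectrum, uses a concentration inequality for integral operators and matrix perturbation theory to improve the Frobenius-norm approximation error of the Nyström method from O(N/m^{1/4}) to O(N/m^{1/2}).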

Abstract

We develop an improved bound for the approximation error of the Nyström method under the assumption that there is a large eigengap in the spectrum of the kernel matrix. This is based on the empirical observation that the eigengap has a significant impact on the approximation error of the Nyström method. Our approach is based on a concentration inequality for integral operators and the theory of matrix perturbation. Our analysis shows that when there is a large eigengap, we can improve the approximation error of the Nyström method from $O(N/m^{1/4})$ to $O(N/m^{1/2})$ when measured in Frobenius norm, where $N$ is the size of the kernel matrix, and $m$ is the number of sampled columns.
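
For readers unfamiliar with the method being analyzed, here is a minimal sketch of the Nyström approximation $K \approx C W^{-1} C^{\top}$, where $C$ holds $m$ sampled columns of the kernel matrix and $W$ is the block where those columns meet the matching rows. To stay dependency-free it fixes $m = 2$ and inverts $W$ in closed form; the rank-2 kernel $k(x, y) = 1 + xy$, the sample points, and the function names are assumptions chosen so that the approximation is exact, which is not the general case studied in the paper.

```python
from typing import List

Matrix = List[List[float]]


def kernel(x: float, y: float) -> float:
    """Rank-2 kernel k(x, y) = 1 + x*y, i.e. feature map [1, x]."""
    return 1.0 + x * y


def nystrom_two_landmarks(points: List[float], i: int, j: int) -> Matrix:
    """Approximate the kernel matrix by C W^{-1} C^T using columns i and j."""
    landmarks = [points[i], points[j]]
    # C: the two sampled columns of K. W: the 2x2 block among the landmarks.
    C = [[kernel(p, q) for q in landmarks] for p in points]
    a = kernel(landmarks[0], landmarks[0])
    b = kernel(landmarks[0], landmarks[1])
    d = kernel(landmarks[1], landmarks[1])
    det = a * d - b * b
    if abs(det) < 1e-12:
        raise ValueError("landmark block W is singular; pick other columns")
    W_inv = [[d / det, -b / det], [-b / det, a / det]]
    n = len(points)
    CW = [
        [sum(C[r][k] * W_inv[k][c] for k in range(2)) for c in range(2)]
        for r in range(n)
    ]
    return [
        [sum(CW[r][k] * C[c][k] for k in range(2)) for c in range(n)]
        for r in range(n)
    ]


POINTS = [0.0, 1.0, 2.0, -1.5, 3.0]
K_exact = [[kernel(p, q) for q in POINTS] for p in POINTS]
K_approx = nystrom_two_landmarks(POINTS, 0, 1)
max_error = max(
    abs(K_exact[r][c] - K_approx[r][c])
    for r in range(len(POINTS))
    for c in range(len(POINTS))
)
print(max_error)
```

Because this toy kernel has rank 2, two well-chosen columns recover it up to rounding. For kernels of higher rank the error depends on the spectrum, which is the regime the stated bounds address.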

1007.3753 2026-06-03 cs.CV cs.NA math.NA

Fast L1-Minimization Algorithms For Robust Face Recognition

Allen Y. Yang, Zihan Zhou, Arvind Ganesh, S. Shankar Sastry, Yi Ma

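AI summary: For the sparse-representation classification framework in robust face recognition, proposes fast L1-minimization solvers based on Augmented Lagrangian Methods, addressing the poor scalability of traditional algorithms in large-scale applications.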

Abstract

L1-minimization refers to finding the minimum L1-norm solution to an underdetermined linear system b=Ax. Under certain conditions as described in compressive sensing theory, the minimum L1-norm solution is also the sparsest solution. In this paper, our study addresses the speed and scalability of its algorithms. In particular, we focus on the numerical implementation of a sparsity-based classification framework in robust face recognition, where sparse representation is sought to recover human identities from very high-dimensional facial images that may be corrupted by illumination, facial disguise, and pose variation. Although the underlying numerical problem is a linear program, traditional algorithms are known to suffer poor scalability for large-scale applications. We investigate a new solution based on a classical convex optimization framework, known as Augmented Lagrangian Methods (ALM). The new convex solvers provide a viable solution to real-world, time-critical applications such as face recognition. We conduct extensive experiments to validate and compare the performance of the ALM algorithms against several popular L1-minimization solvers, including interior-point method, Homotopy, FISTA, SESOP-PCD, approximate message passing (AMP) and TFOCS. To aid peer evaluation, the code for all the algorithms has been made publicly available.
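
To make the problem concrete, the sketch below solves a tiny underdetermined system with iterative soft-thresholding (ISTA), the plain predecessor of the FISTA solver named in the abstract. It is not the paper's ALM solver, and it minimizes the penalized form $\tfrac12\|Ax - b\|_2^2 + \lambda\|x\|_1$ rather than the equality-constrained problem. The matrix, the value of `lam`, and the step size are illustrative assumptions.

```python
from typing import List


def soft_threshold(v: float, t: float) -> float:
    """Shrink v toward zero by t; values within [-t, t] become exactly 0."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(
    A: List[List[float]], b: List[float], lam: float, step: float, iterations: int
) -> List[float]:
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 by proximal gradient steps."""
    rows, cols = len(A), len(A[0])
    x = [0.0] * cols
    for _ in range(iterations):
        residual = [
            sum(A[i][j] * x[j] for j in range(cols)) - b[i] for i in range(rows)
        ]
        grad = [sum(A[i][j] * residual[i] for i in range(rows)) for j in range(cols)]
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(cols)]
    return x


# One equation, two unknowns: x1 + 2*x2 = 2. The sparsest and minimum L1-norm
# solution is (0, 1). The step is 1/L with L = ||A||^2 = 5 for this matrix.
solution = ista([[1.0, 2.0]], [2.0], lam=0.01, step=0.2, iterations=2000)
print(solution)
```

The first coordinate is driven to exactly zero by the thresholding step, and the second settles slightly below 1 because of the small penalty weight. Plain ISTA needs many iterations even on this toy problem, which is a small-scale illustration of why faster solvers matter at scale.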

1208.1103 2026-06-03 cs.AI cs.SY eess.SY

System identification and modeling for interacting and non-interacting tank systems using intelligent techniques

N. S. Bhuvaneswari, R. Praveena, R. Divya

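AI summary: Uses statistical model identification, the process reaction curve method, ARX models, genetic algorithms, neural networks, and fuzzy logic to identify transfer function models and intelligent models of interacting and non-interacting tank processes from real-time experimental data.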

Comments 13 pages, 8 figures

Abstract

System identification from experimental data plays a vital role in model-based controller design. Deriving a process model from first principles is often difficult due to its complexity. The first stage in the development of any control and monitoring system is the identification and modeling of the system. Each model is developed within the context of a specific control problem. Thus, the need for a general system identification framework is warranted. The proposed framework should be able to adapt and emphasize different properties based on the control objective and the nature of the behavior of the system. Therefore, system identification has been a valuable tool in identifying the model of the system based on the input and output data for the design of the controller. The present work is concerned with the identification of transfer function models using statistical model identification, the process reaction curve method, the ARX model, and a genetic algorithm, and with modeling using neural networks and fuzzy logic, for interacting and non-interacting tank processes. The identification techniques and modeling used are prone to parameter changes and disturbances. The proposed methods are used for identifying the mathematical model and intelligent model of interacting and non-interacting processes from real-time experimental data.

1207.4154 2026-06-03 cs.AI cs.SY eess.SY math.OC

Discretized Approximations for POMDP with Average Cost

Huizhen Yu, Dimitri Bertsekas

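AI summary: For average cost POMDPs, proposes a new lower approximation scheme based on discretization over a finite set of belief points, computes it efficiently with multi-chain algorithms for finite-state MDPs, and proves its convergence.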

Comments Appears in Proceedings of the Twentieth Conference on Uncertainty in Artificial Intelligence (UAI2004)

Abstract

In this paper, we propose a new lower approximation scheme for POMDP with discounted and average cost criteria. The approximating functions are determined by their values at a finite number of belief points, and can be computed efficiently using value iteration algorithms for finite-state MDP. While for discounted problems several lower approximation schemes have been proposed earlier, ours seems the first of its kind for average cost problems. We focus primarily on the average cost case, and we show that the corresponding approximation can be computed efficiently using multi-chain algorithms for finite-state MDP. We give a preliminary analysis showing that regardless of the existence of the optimal average cost J in the POMDP, the approximation obtained is a lower bound of the liminf optimal average cost function, and can also be used to calculate an upper bound on the limsup optimal average cost function, as well as bounds on the cost of executing the stationary policy associated with the approximation. We show the convergence of the cost approximation when the optimal average cost is constant and the optimal differential cost is continuous.

1207.3434 2026-06-03 cs.AI cs.RO cs.SY eess.SY

An Approach to Model Interest for Planetary Rover through Dezert-Smarandache Theory

Matteo Ceriotti, Massimiliano Vasile, Giovanni Giardini, Mauro Massari

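AI summary: Proposes quantifying the interest level of planetary rover goals by fusing payload and navigation information through Dezert-Smarandache Theory, enabling autonomous goal reallocation and selection of the most interesting scientific objectives.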

Comments Journal Of Aerospace Computing, Information, And Communication Vol. 5, Month 2008

Abstract

In this paper, we propose an approach for assigning an interest level to the goals of a planetary rover. Assigning an interest level to goals allows the rover to transform and reallocate the goals autonomously. The interest level is defined by data-fusing payload and navigation information. The fusion yields an "interest map" that quantifies the level of interest of each area around the rover. In this way the planner can choose the most interesting scientific objectives to be analyzed, with limited human intervention, and reallocate its goals autonomously. The Dezert-Smarandache Theory of Plausible and Paradoxical Reasoning was used for information fusion: this theory allows dealing with vague and conflicting data. In particular, it allows us to directly model the behavior of the scientists who have to evaluate the relevance of a particular set of goals. The paper shows an application of the proposed approach to the generation of a reliable interest map.

1207.1280 2026-06-03 cs.RO cs.SY eess.SY

Probabilistically Safe Control of Noisy Dubins Vehicles

Igor Cizelj, Calin Belta

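AI summary: For a noisy Dubins vehicle, uses a Markov Decision Process (MDP) and Probabilistic Computation Tree Logic (PCTL) to maximize the probability of satisfying a temporal logic specification, and guarantees a lower bound on the satisfaction probability in the original environment.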

Comments Technical Report

Abstract

We address the problem of controlling a stochastic version of a Dubins vehicle such that the probability of satisfying a temporal logic specification over a set of properties at the regions in a partitioned environment is maximized. We assume that the vehicle can determine its precise initial position in a known map of the environment. However, inspired by practical limitations, we assume that the vehicle is equipped with noisy actuators and, during its motion in the environment, it can only measure its angular velocity using a limited accuracy gyroscope. Through quantization and discretization, we construct a finite approximation for the motion of the vehicle in the form of a Markov Decision Process (MDP). We allow for task specifications given as temporal logic statements over the environmental properties, and use tools in Probabilistic Computation Tree Logic (PCTL) to generate an MDP control policy that maximizes the probability of satisfaction. We translate this policy to a vehicle feedback control strategy and show that the probability that the vehicle satisfies the specification in the original environment is bounded from below by the maximum probability of satisfying the specification on the MDP.
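
The core computation behind "an MDP control policy that maximizes the probability of satisfaction" can be illustrated on the simplest kind of specification, reaching a goal state while avoiding an unsafe one. The sketch below runs value iteration for maximum reachability probability on a three-state toy MDP. The states, actions, and probabilities are invented for illustration, and the paper's construction (quantized Dubins dynamics, full PCTL formulas) is considerably richer.

```python
from typing import Dict, List, Tuple

# transitions[state][action] = list of (next_state, probability) pairs
Transitions = Dict[int, Dict[str, List[Tuple[int, float]]]]


def max_reach_probability(
    transitions: Transitions, goal: int, iterations: int = 200
) -> Dict[int, float]:
    """Value iteration for the maximum probability of eventually reaching goal."""
    value = {state: 0.0 for state in transitions}
    value[goal] = 1.0
    for _ in range(iterations):
        updated = dict(value)
        for state, actions in transitions.items():
            if state == goal or not actions:
                continue  # the goal and absorbing states keep their value
            updated[state] = max(
                sum(p * value[nxt] for nxt, p in outcomes)
                for outcomes in actions.values()
            )
        value = updated
    return value


# State 0: start, state 1: goal, state 2: unsafe and absorbing (hypothetical).
TOY_MDP: Transitions = {
    0: {
        "fast": [(1, 0.8), (2, 0.2)],
        "careful": [(1, 0.5), (0, 0.4), (2, 0.1)],
    },
    1: {},
    2: {},
}

reach = max_reach_probability(TOY_MDP, goal=1)
print(reach[0])
```

In this toy model the "careful" action wins: retrying from the start state gives a success probability of 0.5 / (1 - 0.4) = 5/6, compared with 0.8 for the single "fast" attempt. A probabilistic model checker performs the same kind of fixed-point computation for more general temporal logic formulas.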

1203.1007 2026-06-03 cs.LG cs.AI cs.SY eess.SY stat.ML

Agnostic System Identification for Model-Based Reinforcement Learning

Stephane Ross, J. Andrew Bagnell

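AI summary: For the agnostic case in which the model class may not contain the real system, proposes an iterative method that uses no-regret online learning algorithms to obtain near-optimal policies, and validates it on discrete and continuous domains.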

Comments 8 pages, published in ICML 2012

Abstract

A fundamental problem in control is to learn a model of a system from observations that is useful for controller synthesis. To provide good performance guarantees, existing methods must assume that the real system is in the class of models considered during learning. We present an iterative method with strong guarantees even in the agnostic case where the system is not in the class. In particular, we show that any no-regret online learning algorithm can be used to obtain a near-optimal policy, provided some model achieves low training error and we have access to a good exploration distribution. Our approach applies to both discrete and continuous domains. We demonstrate its efficacy and scalability on a challenging helicopter domain from the literature.

1206.6857 2026-06-03 cs.LG cs.NA math.NA stat.ML

Faster Gaussian Summation: Theory and Experiment

Dongryeol Lee, Alexander G. Gray

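AI summary: For the Gaussian summation problem common in machine learning, proposes two new extensions (an O(Dp) Taylor expansion with rigorous error bounds and a new error control scheme that integrates arbitrary approximation methods), builds faster algorithms on adaptive hierarchical data structures, and, through optimal bandwidth selection experiments in kernel density estimation, reveals for the first time the strengths and weaknesses of current state-of-the-art approaches.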

Comments Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

Abstract

We provide faster algorithms for the problem of Gaussian summation, which occurs in many machine learning methods. We develop two new extensions - an O(Dp) Taylor expansion for the Gaussian kernel with rigorous error bounds and a new error control scheme integrating any arbitrary approximation method - within the best discrete algorithmic framework using adaptive hierarchical data structures. We rigorously evaluate these techniques empirically in the context of optimal bandwidth selection in kernel density estimation, revealing the strengths and weaknesses of current state-of-the-art approaches for the first time. Our results demonstrate that the new error control scheme yields improved performance, whereas the series expansion approach is only effective in low dimensions (five or fewer).
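
As a point of reference, the quantity being accelerated is, for each query point $q$, the sum $G(q) = \sum_r \exp\bigl(-\|q - r\|^2 / (2h^2)\bigr)$ over all reference points $r$. The sketch below is the naive direct evaluation, which costs one kernel evaluation per query-reference pair. It is the baseline that hierarchical methods try to beat, not the authors' algorithm, and the function name and sample points are illustrative assumptions.

```python
import math
from typing import Sequence


def gaussian_sum(
    query: Sequence[float],
    references: Sequence[Sequence[float]],
    bandwidth: float,
) -> float:
    """Directly evaluate sum_r exp(-||query - r||^2 / (2 * bandwidth^2))."""
    if bandwidth <= 0.0:
        raise ValueError("bandwidth must be positive")
    total = 0.0
    for ref in references:
        squared_distance = sum((q - x) ** 2 for q, x in zip(query, ref))
        total += math.exp(-squared_distance / (2.0 * bandwidth**2))
    return total


# Two reference points in the plane: one at the query, one at distance sqrt(2).
value = gaussian_sum((0.0, 0.0), [(0.0, 0.0), (1.0, 1.0)], bandwidth=1.0)
print(value)
```

Dividing such sums by a normalizing constant gives a kernel density estimate, and bandwidth selection repeats the computation for many values of `bandwidth`, which is why the direct quadratic cost becomes the bottleneck.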

1206.6833 2026-06-03 cs.LG cs.CE cs.NA math.NA stat.ML

Matrix Tile Analysis

Inmar Givoni, Vincent Cheung, Brendan J. Frey

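AI summary: Introduces the matrix tile analysis (MTA) problem, which decomposes a matrix into non-overlapping tiles, designs an approximate iterative algorithm and a sum-product relaxation, and validates them on synthetic data and yeast gene-knockout data.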

Comments Appears in Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (UAI2006)

Abstract

Many tasks require finding groups of elements in a matrix of numbers, symbols or class likelihoods. One approach is to use efficient bi- or tri-linear factorization techniques including PCA, ICA, sparse matrix factorization and plaid analysis. These techniques are not appropriate when addition and multiplication of matrix elements are not sensibly defined. More directly, methods like bi-clustering can be used to classify matrix elements, but these methods make the overly-restrictive assumption that the class of each element is a function of a row class and a column class. We introduce a general computational problem, `matrix tile analysis' (MTA), which consists of decomposing a matrix into a set of non-overlapping tiles, each of which is defined by a subset of usually nonadjacent rows and columns. MTA does not require an algebra for combining tiles, but must search over discrete combinations of tile assignments. Exact MTA is a computationally intractable integer programming problem, but we describe an approximate iterative technique and a computationally efficient sum-product relaxation of the integer program. We compare the effectiveness of these methods to PCA and plaid on hundreds of randomly generated tasks. Using double-gene-knockout data, we show that MTA finds groups of interacting yeast genes that have biologically-related functions.

1206.6470 2026-06-03 cs.LG cs.DM cs.NA math.NA stat.ML

A Combinatorial Algebraic Approach for the Identifiability of Low-Rank Matrix Completion

Franz Kiraly, Ryota Tomioka

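AI summary: Using a combinatorial algebraic approach, gives the first necessary and sufficient combinatorial conditions for matrices of arbitrary rank to be identifiable from a set of matrix entries, and proposes new algorithms.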

Comments Appears in Proceedings of the 29th International Conference on Machine Learning (ICML 2012)

Abstract

In this paper, we review the problem of matrix completion and expose its intimate relations with algebraic geometry, combinatorics and graph theory. We present the first necessary and sufficient combinatorial conditions for matrices of arbitrary rank to be identifiable from a set of matrix entries, yielding theoretical constraints and new algorithms for the problem of matrix completion. We conclude by algorithmically evaluating the tightness of the given conditions and algorithms for practically relevant matrix sizes, showing that the algebraic-combinatoric approach can lead to improvements over state-of-the-art matrix completion methods.

1206.6141 2026-06-03 cs.LG cs.SY eess.SY stat.ML

Directed Time Series Regression for Control

Yi-Hao Kao, Benjamin Van Roy

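AI summary: Proposes directed time series regression, which combines the merits of least squares regression and empirical optimization, for estimating time-series model parameters in certainty equivalent model predictive control; it significantly improves controller performance on a stochastic inverted pendulum balancing problem.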

Abstract

We propose directed time series regression, a new approach to estimating parameters of time-series models for use in certainty equivalent model predictive control. The approach combines merits of least squares regression and empirical optimization. Through a computational study involving a stochastic version of a well known inverted pendulum balancing problem, we demonstrate that directed time series regression can generate significant improvements in controller performance over either of the aforementioned alternatives.

1206.4676 2026-06-03 cs.LG cs.CV cs.NA math.NA stat.ML

Clustering by Low-Rank Doubly Stochastic Matrix Decomposition

Zhirong Yang, Erkki Oja

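AI summary: Proposes a low-rank learning method beyond matrix factorization that approximates the data through a two-step bipartite random walk built only from cluster assigning probabilities, performs maximum likelihood estimation of a discriminative model by minimizing Kullback-Leibler divergence, and optimizes with a relaxed Majorization-Minimization algorithm, markedly improving clustering purity on large-scale manifold data.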

Comments ICML2012

Abstract

Clustering analysis by nonnegative low-rank approximations has achieved remarkable progress in the past decade. However, most approximation approaches in this direction are still restricted to matrix factorization. We propose a new low-rank learning method to improve the clustering performance, which is beyond matrix factorization. The approximation is based on a two-step bipartite random walk through virtual cluster nodes, where the approximation is formed by only cluster assigning probabilities. Minimizing the approximation error measured by Kullback-Leibler divergence is equivalent to maximizing the likelihood of a discriminative model, which endows our method with a solid probabilistic interpretation. The optimization is implemented by a relaxed Majorization-Minimization algorithm that is advantageous in finding good local minima. Furthermore, we point out that the regularized algorithm with Dirichlet prior only serves as initialization. Experimental results show that the new method has strong performance in clustering purity for various datasets, especially for large-scale manifold data.
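
One way to read the "two-step bipartite random walk through virtual cluster nodes" is sketched below: step from data point i to a cluster k with probability W[i][k], then from cluster k back to a data point j in proportion to W[j][k]. Under the assumption of a uniform prior over data points this gives $\hat A_{ij} = \sum_k W_{ik} W_{jk} / \sum_v W_{vk}$. The abstract does not state this formula, so treat it, the function name, and the example matrix as assumptions made for illustration.

```python
from typing import List

Matrix = List[List[float]]


def two_step_similarity(W: Matrix) -> Matrix:
    """Build A_hat[i][j] = sum_k W[i][k] * W[j][k] / (sum_v W[v][k]).

    W[i][k] is the probability of assigning point i to cluster k, so each
    row of W is expected to sum to 1 and each column to be nonzero.
    """
    n, k = len(W), len(W[0])
    column_sums = [sum(W[v][c] for v in range(n)) for c in range(k)]
    if any(s <= 0.0 for s in column_sums):
        raise ValueError("every cluster needs positive total assignment mass")
    return [
        [
            sum(W[i][c] * W[j][c] / column_sums[c] for c in range(k))
            for j in range(n)
        ]
        for i in range(n)
    ]


# Three points, two clusters (illustrative assignment probabilities).
ASSIGNMENTS = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9]]
A_hat = two_step_similarity(ASSIGNMENTS)
row_sums = [sum(row) for row in A_hat]
print(row_sums)
```

The resulting matrix is symmetric, and each row sums to 1 because the rows of the assignment matrix do, so it is doubly stochastic by construction and has rank at most the number of clusters. That matches the "low-rank doubly stochastic" structure named in the title.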

1206.4645 2026-06-03 cs.LG cs.NA math.NA stat.ME stat.ML

Ensemble Methods for Convex Regression with Applications to Geometric Programming Based Circuit Design

Lauren Hannah, David Dunson

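AI summary: Proposes ensemble methods (such as bagging, smearing, and random partitioning) to improve the stability of piecewise linear convex regression, and applies them to device modeling and constraint approximation in geometric programming based circuit design.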

Comments ICML2012

Abstract

Convex regression is a promising area for bridging statistical estimation and deterministic convex optimization. New piecewise linear convex regression methods are fast and scalable, but can have instability when used to approximate constraints or objective functions for optimization. Ensemble methods, like bagging, smearing and random partitioning, can alleviate this problem and maintain the theoretical properties of the underlying estimator. We empirically examine the performance of ensemble methods for prediction and optimization, and then apply them to device modeling and constraint approximation for geometric programming based circuit design.

1206.4643 2026-06-03 cs.LG cs.GT cs.SY eess.SY

Lightning Does Not Strike Twice: Robust MDPs with Coupled Uncertainty

Shie Mannor, Ofir Mebel, Huan Xu

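AI summary: For coupled parameter uncertainty in Markov decision processes, introduces the "Lightning Does Not Strike Twice" concept and designs tractable algorithms for computing optimal policies.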

Comments ICML2012

Abstract

We consider Markov decision processes under parameter uncertainty. Previous studies all restrict to the case that uncertainties among different states are uncoupled, which leads to conservative solutions. In contrast, we introduce an intuitive concept, termed "Lightning Does not Strike Twice," to model coupled uncertain parameters. Specifically, we require that the system can deviate from its nominal parameters only a bounded number of times. We give probabilistic guarantees indicating that this model represents real life situations and devise tractable algorithms for computing optimal control policies using this concept.

1206.4608 2026-06-03 cs.LG cs.DS cs.NA math.NA stat.ML

A Hybrid Algorithm for Convex Semidefinite Optimization

Soeren Laue

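AI summary: Proposes a hybrid algorithm for optimizing a convex, smooth function over the cone of positive semidefinite matrices; the algorithm converges to the global optimum, can solve large-scale semidefinite programs, and outperforms existing methods on matrix completion, metric learning, and sparse PCA.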

Comments ICML2012

Abstract

We present a hybrid algorithm for optimizing a convex, smooth function over the cone of positive semidefinite matrices. Our algorithm converges to the global optimal solution and can be used to solve general large-scale semidefinite programs and hence can be readily applied to a variety of machine learning problems. We show experimental results on three machine learning problems (matrix completion, metric learning, and sparse PCA). Our approach outperforms state-of-the-art algorithms.

1206.4329 2026-06-03 cs.AI cs.NA math.NA

An Improved Gauss-Newtons Method based Back-propagation Algorithm for Fast Convergence

Sudarshan Nandy, Partha Pratim Sarkar, Achintya Das

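AI summary: Proposes an improved back-propagation algorithm based on the Gauss-Newton numerical optimization method that achieves fast convergence with a multilayer neural network, and shows on several datasets that it outperforms steepest descent.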

Comments 7 pages, 6 figures, 2 tables, Published with International Journal of Computer Applications (IJCA)

Journal ref International Journal of Computer Applications 39(8):1-7, February 2012. Published by Foundation of Computer Science, New York, USA

Abstract

The present work deals with an improved back-propagation algorithm based on the Gauss-Newton numerical optimization method for fast convergence. Standard back-propagation uses the steepest descent method. The algorithm is tested using various datasets and compared with the steepest descent back-propagation algorithm. In the system, optimization is carried out using a multilayer neural network. The efficacy of the proposed method is observed during the training period, as it converges quickly for the datasets used in testing. The memory required for computing the steps of the algorithm is also analyzed.

1206.3285 2026-06-03 cs.AI cs.LG cs.SY eess.SY

Dyna-Style Planning with Linear Function Approximation and Prioritized Sweeping

Richard S. Sutton, Csaba Szepesvari, Alborz Geramifard, Michael P. Bowling

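AI summary: Proposes a model-based Dyna-style planning method extended to linear function approximation, proves its convergence, and introduces prioritized sweeping algorithms for linear Dyna.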

Comments Appears in Proceedings of the Twenty-Fourth Conference on Uncertainty in Artificial Intelligence (UAI2008)

Abstract

We consider the problem of efficiently learning optimal control policies and value functions over large state spaces in an online setting in which estimates must be available after each interaction with the world. This paper develops an explicitly model-based approach extending the Dyna architecture to linear function approximation. Dyna-style planning proceeds by generating imaginary experience from the world model and then applying model-free reinforcement learning algorithms to the imagined state transitions. Our main results are to prove that linear Dyna-style planning converges to a unique solution independent of the generating distribution, under natural conditions. In the policy evaluation setting, we prove that the limit point is the least-squares (LSTD) solution. An implication of our results is that prioritized sweeping can be soundly extended to the linear approximation case, backing up to preceding features rather than to preceding states. We introduce two versions of prioritized sweeping with linear Dyna and briefly illustrate their performance empirically on the Mountain Car and Boyan Chain problems.

1205.2643 2026-06-03 cs.LG cs.SY eess.SY math.OC stat.CO stat.ML

New inference strategies for solving Markov Decision Processes using reversible jump MCMC

Matthias Hoffman, Hendrik Kueck, Nando de Freitas, Arnaud Doucet

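AI summary: Proposes improved inference strategies based on reversible jump MCMC, using a new target distribution and breaking the correlation between parameters and trajectories, to estimate optimal policies in higher-dimensional spaces.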

Comments Appears in Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence (UAI2009)

Abstract

In this paper we build on previous work which uses inference techniques, in particular Markov Chain Monte Carlo (MCMC) methods, to solve parameterized control problems. We propose a number of modifications in order to make this approach more practical in general, higher-dimensional spaces. We first introduce a new target distribution which is able to incorporate more reward information from sampled trajectories. We also show how to break strong correlations between the policy parameters and sampled trajectories in order to sample more freely. Finally, we show how to incorporate these techniques in a principled manner to obtain estimates of the optimal policy.

1204.4476 2026-06-03 cs.CV cs.SY eess.SY

Dynamic Template Tracking and Recognition

Rizwan Chaudhry, Gregory Hager, Rene Vidal

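AI summary: Proposes modeling the temporal evolution of a non-rigid object's appearance/motion with a Linear Dynamical System, using it as a dynamic template for tracking, and achieving simultaneous tracking and recognition.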

Abstract

In this paper we address the problem of tracking non-rigid objects whose local appearance and motion change as a function of time. This class of objects includes dynamic textures such as steam, fire, smoke, water, etc., as well as articulated objects such as humans performing various actions. We model the temporal evolution of the object's appearance/motion using a Linear Dynamical System (LDS). We learn such models from sample videos and use them as dynamic templates for tracking objects in novel videos. We pose the problem of tracking a dynamic non-rigid object in the current frame as a maximum a-posteriori estimate of the location of the object and the latent state of the dynamical system, given the current image features and the best estimate of the state in the previous frame. The advantage of our approach is that we can specify a-priori the type of texture to be tracked in the scene by using previously trained models for the dynamics of these textures. Our framework naturally generalizes common tracking methods such as SSD and kernel-based tracking from static templates to dynamic templates. We test our algorithm on synthetic as well as real examples of dynamic textures and show that our simple dynamics-based trackers perform on par with, if not better than, the state of the art. Since our approach is general and applicable to any image feature, we also apply it to the problem of human action tracking and build action-specific optical flow trackers that perform better than the state of the art when tracking a human performing a particular action. Finally, since our approach is generative, we can use a-priori trained trackers for different texture or action classes to simultaneously track and recognize the texture or action in the video.

1203.2210 2026-06-03 cs.CV cs.NA math.NA

Fixed-Rank Representation for Unsupervised Visual Learning

固定秩表示用于无监督视觉学习

Risheng Liu, Zhouchen Lin, Fernando De la Torre, Zhixun Su

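AI summary This paper proposes fixed-rank representation (FRR) as a unified framework for unsupervised visual learning; it reveals multi-subspace structure in closed form, adds a sparse regularizer for robustness, and comes with a fast numerical solver.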

Comments accepted by CVPR 2012

AI中文摘要

子空间聚类和特征提取是计算机视觉和模式识别中最常用的两种无监督学习技术。最先进的子空间聚类技术利用了稀疏性和秩最小化的最新进展。然而,现有技术计算成本高,并且在数据采样不足的情况下可能导致退化解,从而降低聚类性能。为了部分解决这些问题,并受现有矩阵分解工作的启发,本文提出固定秩表示(FRR)作为无监督视觉学习的统一框架。当数据无噪声时,FRR能够以闭式形式揭示多个子空间的结构。此外,我们证明在某些适当条件下,即使观测不足,FRR仍然能够揭示真实的子空间成员关系。为了实现对异常值和噪声的鲁棒性,我们在FRR框架中引入了稀疏正则化。除了子空间聚类,FRR还可用于无监督特征提取。作为一个非平凡的副产品,我们为FRR开发了一个快速数值求解器。在合成数据和实际应用上的实验结果验证了我们的理论分析,并展示了FRR在无监督视觉学习中的优势。

英文摘要

Subspace clustering and feature extraction are two of the most commonly used unsupervised learning techniques in computer vision and pattern recognition. State-of-the-art techniques for subspace clustering make use of recent advances in sparsity and rank minimization. However, existing techniques are computationally expensive and may result in degenerate solutions that degrade clustering performance in the case of insufficient data sampling. To partially solve these problems, and inspired by existing work on matrix factorization, this paper proposes fixed-rank representation (FRR) as a unified framework for unsupervised visual learning. FRR is able to reveal the structure of multiple subspaces in closed form when the data is noiseless. Furthermore, we prove that under some suitable conditions, even with insufficient observations, FRR can still reveal the true subspace memberships. To achieve robustness to outliers and noise, a sparse regularizer is introduced into the FRR framework. Beyond subspace clustering, FRR can be used for unsupervised feature extraction. As a non-trivial byproduct, a fast numerical solver is developed for FRR. Experimental results on both synthetic data and real applications validate our theoretical analysis and demonstrate the benefits of FRR for unsupervised visual learning.

1203.2556 2026-06-03 cs.AI cs.SY eess.SY

A Probabilistic Transmission Expansion Planning Methodology based on Roulette Wheel Selection and Social Welfare

基于轮盘赌选择和社会福利的概率输电扩展规划方法

Neeraj Gupta, Rajiv Shekhar, Prem Kumar Kalra

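AI summary Proposes a probabilistic transmission expansion planning method that needs no a priori specification of new transmission capacity and uses the concept of social welfare; a roulette wheel computes line capacities, load flow analysis computes expected demand not served (EDNS), and tests on a modified IEEE 5-bus system show that adding new lines alone is not sufficient to minimize EDNS.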

Comments 22 pages, 4 figures

AI中文摘要

提出了一种新的概率输电扩展规划(TEP)方法,该方法不需要预先指定新的/额外的输电容量,并利用了社会福利的概念。本文引入了两个新概念:(i)使用轮盘赌方法计算新输电线路的容量,(ii)使用潮流分析计算期望未供电量(EDNS)。整体方法已在改进的IEEE 5节点测试系统上实现。仿真结果表明一个重要结果:仅增加新的输电线路不足以最小化EDNS。

英文摘要

A new probabilistic methodology for transmission expansion planning (TEP) that does not require a priori specification of new/additional transmission capacities and uses the concept of social welfare has been proposed. Two new concepts have been introduced in this paper: (i) roulette wheel methodology has been used to calculate the capacity of new transmission lines and (ii) load flow analysis has been used to calculate expected demand not served (EDNS). The overall methodology has been implemented on a modified IEEE 5-bus test system. Simulations show an important result: addition of only new transmission lines is not sufficient to minimize EDNS.
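
For readers unfamiliar with roulette wheel selection, a minimal sketch follows. It only shows the generic selection step (an index is drawn with probability proportional to its weight); how the paper maps this onto line capacities is not stated in the abstract, so the candidate capacities and weights below are hypothetical.

```python
import random


def roulette_wheel_select(weights, u):
    """Return the index chosen by a wheel whose slices are proportional to weights.

    u is a uniform draw in [0, 1); passing it in keeps the function deterministic.
    """
    if not weights or any(w < 0 for w in weights):
        raise ValueError("weights must be non-empty and non-negative")
    total = sum(weights)
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    threshold = u * total
    cumulative = 0.0
    for index, weight in enumerate(weights):
        cumulative += weight
        if threshold < cumulative:
            return index
    return len(weights) - 1  # guards against floating-point round-off


# Hypothetical candidate capacities (MW) and selection weights, for illustration only.
capacities_mw = [50, 100, 150]
weights = [1.0, 3.0, 6.0]
chosen = capacities_mw[roulette_wheel_select(weights, random.random())]
print(chosen)
```

Repeating the draw many times yields a sample of capacities whose frequencies follow the weights, which is what makes the approach probabilistic rather than a fixed a priori choice.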

1203.2511 2026-06-03 cs.LG cs.CE cs.NI cs.SY eess.SY stat.AP

A Simple Flood Forecasting Scheme Using Wireless Sensor Networks

一种使用无线传感器网络的简单洪水预测方案

Victor Seal, Arnab Raha, Shovan Maity, Souvik Kr Mitra, Amitava Mukherjee, Mrinal Kanti Naskar

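AI summary Proposes a flood forecasting model based on wireless sensor networks and multiple-variable robust linear regression that delivers real-time predictions with simple, fast calculations, and compares it with another algorithm to demonstrate the improvement.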

Comments 16 pages, 4 figures, published in International Journal of Ad-Hoc, Sensor and Ubiquitous Computing, February 2012; V. Seal et al., 'A Simple Flood Forecasting Scheme Using Wireless Sensor Networks', IJASUC, Feb. 2012

AI中文摘要

本文提出一种使用无线传感器网络(WSNs)设计的预测模型,用于预测河流洪水,采用简单快速的计算提供实时结果,以拯救可能受洪水影响的生命。我们的预测模型使用多元鲁棒线性回归,易于理解,实现简单且成本效益高,速度高效,资源利用率低,同时提供可靠精度的实时预测,因此具有任何实际算法所期望的特征。我们的预测模型独立于参数数量,即可以根据现场需求添加或删除任意数量的参数。当水位上升时,我们使用多项式表示水位,其性质用于判断水位是否可能在近期超过洪水线。我们将我们的工作与一种当代算法进行比较,以展示我们的改进。然后,我们展示了预测水位与实际水位的仿真结果。

英文摘要

This paper presents a forecasting model designed using WSNs (Wireless Sensor Networks) to predict floods in rivers using simple and fast calculations, providing real-time results to help save the lives of people who may be affected by the flood. Our prediction model uses multiple-variable robust linear regression, which is easy to understand, simple and cost-effective to implement, and speed-efficient; it has low resource utilization yet provides real-time predictions with reliable accuracy, features that are desirable in any real-world algorithm. Our prediction model is independent of the number of parameters, i.e. any number of parameters may be added or removed based on the on-site requirements. When the water level rises, we represent it using a polynomial whose nature is used to determine if the water level may exceed the flood line in the near future. We compare our work with a contemporary algorithm to demonstrate our improvements over it. Then we present our simulation results for the predicted water level compared to the actual water level.
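
The flood-line check described above can be sketched with a much simpler stand-in: an ordinary least-squares line through recent water levels, extrapolated a short horizon ahead. This is an assumption-laden simplification (one variable, a degree-1 polynomial, no robust weighting), not the multiple-variable robust regression of the paper; all names and numbers are hypothetical.

```python
def fit_line(times, levels):
    """Ordinary least-squares fit of level = intercept + slope * time."""
    n = len(times)
    if n < 2 or n != len(levels):
        raise ValueError("need at least two paired samples")
    mean_t = sum(times) / n
    mean_y = sum(levels) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    if sxx == 0:
        raise ValueError("times must not all be equal")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times, levels))
    slope = sxy / sxx
    return mean_y - slope * mean_t, slope


def may_exceed(times, levels, flood_line, horizon):
    """True if the fitted line predicts a level at or above flood_line
    `horizon` time units after the last sample."""
    intercept, slope = fit_line(times, levels)
    predicted = intercept + slope * (times[-1] + horizon)
    return predicted >= flood_line


# Hypothetical hourly readings (metres) from one sensor node.
print(may_exceed([0, 1, 2, 3], [1.0, 1.5, 2.0, 2.5], flood_line=3.5, horizon=3))
```

The closed-form fit needs only a handful of sums, which is in line with the low resource utilization the abstract asks of a sensor-node algorithm.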

1202.5544 2026-06-03 cs.RO cs.SY eess.SY math.DS math.OC math.PR

An Incremental Sampling-based Algorithm for Stochastic Optimal Control

基于增量采样的随机最优控制算法

Vu Anh Huynh, Sertac Karaman, Emilio Frazzoli

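AI summary For continuous-time, continuous-space stochastic optimal control problems, proposes the incremental Markov Decision Process (iMDP) algorithm, which generates a sequence of discretizations by randomly sampling the state space and applies asynchronous value iteration to approximate an optimal policy arbitrarily well.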

Comments Part of the results have been submitted to the IEEE International Conference on Robotics and Automation (ICRA 2012). Minnesota, USA, May 2012

AI中文摘要

本文考虑一类连续时间、连续空间的随机最优控制问题。基于马尔可夫链近似方法和确定性路径规划中基于采样的算法的最新进展,我们提出了一种名为增量马尔可夫决策过程(iMDP)的新算法,用于增量计算在期望成本意义上任意逼近最优策略的控制策略。该算法的主要思想是通过对状态空间进行随机采样,生成原始问题的一系列有限离散化。在每次迭代中,离散化问题是一个马尔可夫决策过程,作为原始问题的增量细化模型。我们证明,以概率1,(i)每个离散化问题的最优值函数序列一致收敛到原始随机最优控制问题的最优值函数,并且(ii)原始最优值函数可以使用异步值迭代以增量方式高效计算。因此,所提出的算法为连续问题的最优控制策略计算提供了一种随时方法。在存在过程噪声的杂乱环境中,通过运动规划和控制问题展示了所提出方法的有效性。

英文摘要

In this paper, we consider a class of continuous-time, continuous-space stochastic optimal control problems. Building upon recent advances in Markov chain approximation methods and sampling-based algorithms for deterministic path planning, we propose a novel algorithm called the incremental Markov Decision Process (iMDP) to incrementally compute control policies that approximate an optimal policy arbitrarily well in terms of the expected cost. The main idea behind the algorithm is to generate a sequence of finite discretizations of the original problem through random sampling of the state space. At each iteration, the discretized problem is a Markov Decision Process that serves as an incrementally refined model of the original problem. We show that with probability one, (i) the sequence of the optimal value functions for each of the discretized problems converges uniformly to the optimal value function of the original stochastic optimal control problem, and (ii) the original optimal value function can be computed efficiently in an incremental manner using asynchronous value iterations. Thus, the proposed algorithm provides an anytime approach to the computation of optimal control policies of the continuous problem. The effectiveness of the proposed approach is demonstrated on motion planning and control problems in cluttered environments in the presence of process noise.
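
The asynchronous value iteration mentioned in (ii) can be illustrated on a fixed, hand-written MDP. The sketch below uses in-place (Gauss-Seidel style) updates for expected-cost minimization; it deliberately leaves out the part that makes iMDP incremental, namely sampling new states and refining the discretization between updates. The data layout and the toy MDP are assumptions.

```python
def async_value_iteration(mdp, sweeps=100, discount=1.0):
    """In-place value iteration for expected-cost minimization.

    mdp maps state -> {action: [(probability, next_state, cost), ...]}.
    States with no actions are treated as absorbing goals with value 0.
    """
    values = {state: 0.0 for state in mdp}
    for _ in range(sweeps):
        for state, actions in mdp.items():
            if not actions:
                continue
            # Each update immediately reuses the newest values of other states.
            values[state] = min(
                sum(p * (cost + discount * values[nxt]) for p, nxt, cost in outcomes)
                for outcomes in actions.values()
            )
    return values


# Hypothetical toy MDP: state 2 is the goal; "go" from state 1 succeeds half the time.
toy_mdp = {
    0: {"go": [(1.0, 1, 1.0)]},
    1: {"go": [(0.5, 2, 1.0), (0.5, 1, 1.0)], "jump": [(1.0, 2, 3.0)]},
    2: {},
}
print(async_value_iteration(toy_mdp))
```

Because values are overwritten as soon as they are computed, a partially converged table is always available, which matches the anytime flavour described in the abstract.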

1202.2185 2026-06-03 cs.RO cs.SY eess.SY math.OC

Temporal Logic Motion Control using Actor-Critic Methods

使用Actor-Critic方法的时序逻辑运动控制

Xu Chu Ding, Jing Wang, Morteza Lahijanian, Ioannis Ch. Paschalidis, Calin A. Belta

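AI summary For control from temporal logic specifications in large partitioned environments, proposes an actor-critic approximate dynamic programming framework based on least-squares temporal difference learning that optimizes the parameters of a randomized control policy to obtain an approximately optimal policy.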

Comments Technical Report which accompanies an ICRA2012 paper

AI中文摘要

本文考虑从以时序逻辑语句形式给出的规范部署机器人的问题,该规范涉及大型分区环境中区域满足的某些属性。我们假设机器人具有噪声传感器和执行器,并将其在环境区域中的运动建模为马尔可夫决策过程(MDP)。机器人控制问题变为寻找在MDP上最大化满足时序逻辑任务概率的控制策略。对于大型环境,获取每个状态-动作对的转移概率以及求解最优策略所需的优化问题通常计算上不可行。为解决这些问题,我们提出了一种基于最小二乘时序差分学习方法的Actor-Critic类型近似动态规划框架。该框架在机器人的样本路径上运行,并针对少量参数优化随机控制策略。转移概率仅在需要时获取。硬件在环仿真证实,参数的收敛转化为近似最优策略。

英文摘要

In this paper, we consider the problem of deploying a robot from a specification given as a temporal logic statement about some properties satisfied by the regions of a large, partitioned environment. We assume that the robot has noisy sensors and actuators and model its motion through the regions of the environment as a Markov Decision Process (MDP). The robot control problem becomes finding the control policy maximizing the probability of satisfying the temporal logic task on the MDP. For a large environment, obtaining transition probabilities for each state-action pair and solving the optimization problem for the optimal policy are usually not computationally feasible. To address these issues, we propose an approximate dynamic programming framework based on a least-squares temporal difference learning method of the actor-critic type. This framework operates on sample paths of the robot and optimizes a randomized control policy with respect to a small set of parameters. The transition probabilities are obtained only when needed. Hardware-in-the-loop simulations confirm that convergence of the parameters translates to an approximately optimal policy.

1112.5282 2026-06-03 cs.RO cs.SY eess.SY

Observability of Strapdown INS Alignment: A Global Perspective

捷联惯导系统对准的可观测性:全局视角

Yuanxin Wu, Hongliang Zhang, Meiping Wu, Xiaoping Hu, Dewen Hu

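AI summary Taking a global perspective and using inherent properties such as attitude evolution on the SO(3) manifold, this paper studies the observability of strapdown INS static and tumbling alignment, proving that successive rotation about two different axes gives complete observability, whereas rotation about a single axis leaves a finite number of unobservable states.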

Comments 25 pages; IEEE Trans. on Aerospace and Electronic Systems, Jan. 2012

Journal ref
IEEE Trans. on Aerospace and Electronic Systems, 48(1), pp. 78-102, 2012
AI中文摘要

捷联惯导系统(INS)的对准具有很强的非线性,当采用机动(例如翻滚技术)来改善对准时,非线性甚至更严重。由于没有通用的规则来处理非线性系统的可观测性,大多数先前的工作通过隐式假设原始非线性系统和线性化系统具有相同的可观测性特征,来研究相应线性化系统的可观测性。捷联惯导对准是一个具有自身特性的非线性系统。利用捷联惯导的固有属性,例如SO(3)流形上的姿态演化,我们从基本定义出发,开发了一种全局且构造性的方法来研究捷联惯导静态和翻滚对准的可观测性,突出了姿态机动对可观测性的影响。我们证明,考虑未知常值传感器偏差,如果捷联惯导绕两个不同轴连续旋转,则对准将是完全可观测的;如果绕单轴旋转,则对于有限已知不可观测状态(不超过两个)几乎是可观测的。全局视角的可观测性为我们提供了对问题的深入理解和更清晰的图景,揭示了先前关于捷联惯导对准的理论结果的不全面或不一致之处。这些不一致的报告要求对大量文献中所有基于线性化的可观测性研究进行重新审视。我们进行了大量仿真,包括构造理想观测器和扩展卡尔曼滤波器,数值结果与分析一致。这些结论还有助于在实践中设计最优翻滚策略和合适的状态观测器,以最大化对准性能。

英文摘要

Alignment of the strapdown inertial navigation system (INS) is strongly nonlinear, and even more so when maneuvers, e.g., tumbling techniques, are employed to improve the alignment. There is no general rule to attack the observability of a nonlinear system, so most previous works addressed the observability of the corresponding linearized system by implicitly assuming that the original nonlinear system and the linearized one have identical observability characteristics. Strapdown INS alignment is a nonlinear system that has its own characteristics. Using the inherent properties of strapdown INS, e.g., the attitude evolution on the SO(3) manifold, we start from the basic definition and develop a global and constructive approach to investigate the observability of strapdown INS static and tumbling alignment, highlighting the effects of the attitude maneuver on observability. We prove that strapdown INS alignment, considering the unknown constant sensor biases, will be completely observable if the strapdown INS is rotated successively about two different axes and will be nearly observable for finite known unobservable states (no more than two) if it is rotated about a single axis. Observability from a global perspective provides us with insights into and a clearer picture of the problem, shedding light on previous theoretical results on strapdown INS alignment that were not comprehensive or consistent. The reporting of inconsistencies calls for a review of all linearization-based observability studies in the vast literature. Extensive simulations with constructed ideal observers and an extended Kalman filter are carried out, and the numerical results accord with the analysis. The conclusions can also assist in designing the optimal tumbling strategy and the appropriate state observer in practice to maximize the alignment performance.

1108.6296 2026-06-03 cs.LG cs.NA math.NA

Infinite Tucker Decomposition: Nonparametric Bayesian Models for Multiway Data Analysis

无限Tucker分解:用于多路数据分析的非参数贝叶斯模型

Zenglin Xu, Feng Yan, Yuan Qi

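AI summary Proposes the infinite Tucker decomposition model (InfTucker), a nonparametric Bayesian model that uses latent Gaussian or t processes with nonlinear covariance functions to handle continuous and binary data in a probabilistic framework, and develops an efficient variational inference method that significantly improves prediction accuracy.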

AI中文摘要

张量分解是多路数据分析的强大计算工具。许多流行的张量分解方法——如Tucker分解和CANDECOMP/PARAFAC (CP)——本质上是多线性因子分解。它们不足以建模(i)数据实体间的复杂交互、(ii)各种数据类型(如缺失数据和二元数据)以及(iii)噪声观测和异常值。为解决这些问题,我们提出了张量变量潜在非参数贝叶斯模型,并结合高效推理方法,用于多路数据分析。我们将这些模型命名为InfTucker。使用这些InfTucker,我们在无限特征空间中进行Tucker分解。与经典张量分解模型不同,我们的新方法在概率框架下处理连续和二元数据。与先前关于矩阵和张量的贝叶斯模型不同,我们的模型基于具有非线性协方差函数的潜在高斯或t过程。为了从数据中高效学习InfTucker,我们开发了一种张量上的变分推理技术。与经典实现相比,新技术将时间和空间复杂度降低了几个数量级。我们在化学计量学和社交网络数据集上的实验结果表明,我们的新模型比最先进的张量分解方法取得了显著更高的预测精度。

英文摘要

Tensor decomposition is a powerful computational tool for multiway data analysis. Many popular tensor decomposition approaches, such as the Tucker decomposition and CANDECOMP/PARAFAC (CP), amount to multi-linear factorization. They are insufficient to model (i) complex interactions between data entities, (ii) various data types (e.g. missing data and binary data), and (iii) noisy observations and outliers. To address these issues, we propose tensor-variate latent nonparametric Bayesian models, coupled with efficient inference methods, for multiway data analysis. We name these models InfTucker. Using InfTucker, we conduct Tucker decomposition in an infinite feature space. Unlike classical tensor decomposition models, our new approaches handle both continuous and binary data in a probabilistic framework. Unlike previous Bayesian models on matrices and tensors, our models are based on latent Gaussian or $t$ processes with nonlinear covariance functions. To efficiently learn InfTucker from data, we develop a variational inference technique on tensors. Compared with a classical implementation, the new technique reduces both time and space complexities by several orders of magnitude. Our experimental results on chemometrics and social network datasets demonstrate that our new models achieve significantly higher prediction accuracy than state-of-the-art tensor decomposition methods.

1112.4057 2026-06-03 cs.AI cs.SY eess.SY

Performance Evaluation of Road Traffic Control Using a Fuzzy Cellular Model

基于模糊细胞模型的道路交通控制性能评估

Bartłomiej Płaczek

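AI summary Proposes a method based on a fuzzy cellular model for evaluating the performance of adaptive traffic control strategies in an on-line simulation environment, combining cellular automata and fuzzy calculus to handle imprecise measurements.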

Comments The final publication is available at http://www.springerlink.com

Journal ref
Płaczek, B., Performance Evaluation of Road Traffic Control Using a Fuzzy Cellular Model. Lecture Notes in Artificial Intelligence 6679. Springer-Verlag, Berlin Heidelberg, pp. 59-66, 2011
AI中文摘要

本文提出了一种用于道路交通控制系统性能评估的方法。该方法设计用于在线仿真环境,能够优化自适应交通控制策略。性能指标通过模糊细胞交通模型计算,该模型是结合元胞自动机和模糊演算的混合系统。实验结果表明,所引入的方法允许使用不精确的交通测量进行性能评估。此外,性能指标的模糊定义便于交通控制决策中的不确定性确定。

英文摘要

In this paper a method is proposed for performance evaluation of road traffic control systems. The method is designed to be implemented in an on-line simulation environment, which enables optimisation of adaptive traffic control strategies. Performance measures are computed using a fuzzy cellular traffic model, formulated as a hybrid system combining cellular automata and fuzzy calculus. Experimental results show that the introduced method allows the performance to be evaluated using imprecise traffic measurements. Moreover, the fuzzy definitions of performance measures are convenient for uncertainty determination in traffic control decisions.
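
To give a feel for the cellular-automaton half of such a model, the sketch below steps a crisp (non-fuzzy) single-lane traffic automaton, the well-known rule 184, and counts how many cars can move as a simple performance measure. It is a swapped-in textbook automaton, not the paper's fuzzy cellular model: there, cell states and measures would be fuzzy quantities rather than 0/1 values. Function names are hypothetical.

```python
def step_rule_184(road):
    """Advance a circular single-lane road one step; 1 = car, 0 = empty cell."""
    n = len(road)
    new_road = []
    for i in range(n):
        behind, here, ahead = road[(i - 1) % n], road[i], road[(i + 1) % n]
        if here == 1:
            # A blocked car stays in place; a free car leaves the cell.
            new_road.append(1 if ahead == 1 else 0)
        else:
            # An empty cell receives the car directly behind it, if any.
            new_road.append(1 if behind == 1 else 0)
    return new_road


def flow(road):
    """Number of cars able to move this step, a crisp stand-in for a performance measure."""
    n = len(road)
    return sum(1 for i in range(n) if road[i] == 1 and road[(i + 1) % n] == 0)


road = [1, 1, 0, 0]
print(flow(road), step_rule_184(road))
```

Replacing the 0/1 occupancy with a fuzzy number per cell is one way imprecise detector counts could be carried through the same update, which is the kind of uncertainty handling the abstract refers to.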