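arXivDaily: a daily academic digest synchronized with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and other fields.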

1603.04586 2026-06-04 cs.AI cs.RO cs.SY eess.SY

Optimal Sensing via Multi-armed Bandit Relaxations in Mixed Observability Domains

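Mikko Lauri, Risto Ritala

AI summary: This paper studies decision making under uncertainty in mixed observability domains. It derives an upper bound on the optimal value function by relaxing constraints, and exploits the computable optimal policy of multi-armed bandits to prune the search space more efficiently. Experiments show the approach is effective in a target monitoring domain.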

Comments 6 pages, 2 figures

1510.06083 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML

Regularization vs. Relaxation: A conic optimization perspective of statistical variable selection

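Hongbo Dong, Kun Chen, Jeff Linderoth

AI summary: This paper examines variable selection from a conic optimization perspective. It shows that the MCP and reverse Huber penalties can be viewed as special cases of the perspective relaxation, solves the resulting problem via a semidefinite relaxation, and obtains approximate solutions with Goemans-Williamson rounding.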

Comments Also available on Optimization Online: http://www.optimization-online.org/DB_HTML/2015/05/4932.html

Abstract

Variable selection is a fundamental task in statistical data analysis. Sparsity-inducing regularization methods are a popular class of methods that simultaneously perform variable selection and model estimation. The central problem is a quadratic optimization problem with an l0-norm penalty. Exactly enforcing the l0-norm penalty is computationally intractable for larger scale problems, so different sparsity-inducing penalty functions that approximate the l0-norm have been introduced. In this paper, we show that viewing the problem from a convex relaxation perspective offers new insights. In particular, we show that a popular sparsity-inducing concave penalty function known as the Minimax Concave Penalty (MCP), and the reverse Huber penalty derived in a recent work by Pilanci, Wainwright and Ghaoui, can both be derived as special cases of a lifted convex relaxation called the perspective relaxation. The optimal perspective relaxation is a related minimax problem that balances the overall convexity and tightness of approximation to the l0 norm. We show it can be solved by a semidefinite relaxation. Moreover, a probabilistic interpretation of the semidefinite relaxation reveals connections with the boolean quadric polytope in combinatorial optimization. Finally, by reformulating the l0-norm penalized problem as a two-level problem, with the inner level being a Max-Cut problem, our proposed semidefinite relaxation can be realized by replacing the inner level problem with its semidefinite relaxation studied by Goemans and Williamson. This interpretation suggests using the Goemans-Williamson rounding procedure to find approximate solutions to the l0-norm penalized problem. Numerical experiments demonstrate the tightness of our proposed semidefinite relaxation, and the effectiveness of finding approximate solutions by Goemans-Williamson rounding.
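
The Goemans-Williamson rounding step named in the abstract can be illustrated in its original Max-Cut setting. The sketch below is not the paper's procedure for the l0-norm penalized problem. It assumes the semidefinite relaxation has already been solved and that its solution is available as unit vectors; here they are written by hand for a triangle graph, where three vectors spaced 120 degrees apart are optimal. The function names and data are hypothetical.

```python
import math
import random


def gw_round(vectors, rng):
    """Random-hyperplane rounding: label each unit vector by the side of a
    random hyperplane through the origin on which it falls."""
    dim = len(vectors[0])
    # A standard Gaussian vector gives a uniformly random hyperplane normal.
    normal = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    return [1 if sum(v_k * n_k for v_k, n_k in zip(v, normal)) >= 0 else -1
            for v in vectors]


def cut_value(edges, labels):
    """Total weight of edges whose endpoints received different labels."""
    return sum(w for i, j, w in edges if labels[i] != labels[j])


# Triangle graph with unit weights (illustrative data, not from the paper).
edges = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0)]
# Assumed SDP solution: three unit vectors spaced 120 degrees apart.
vectors = [(math.cos(2 * math.pi * k / 3), math.sin(2 * math.pi * k / 3))
           for k in range(3)]

labels = gw_round(vectors, random.Random(0))
print(labels, cut_value(edges, labels))
```

For this embedding every hyperplane separates one vector from the other two, so the rounded cut always has weight 2, which is the maximum cut of a triangle.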

1603.02381 2026-06-04 cs.RO cs.SY eess.SY

The Effect of Communication Topology on Scalar Field Estimation by Networked Robotic Swarms

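Ragesh K Ramachandran, Spring Berman

AI summary: This paper studies the reconstruction of a two-dimensional scalar field by a networked robotic swarm that communicates over a chain or grid topology. It uses an optimization-based method with gradient computation, derives bounds on the trace of the observability Gramian, and validates the estimation capability and robustness of the chain and grid topologies.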

1603.02038 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY

Unscented Bayesian Optimization for Safe Robot Grasping

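José Nogueira, Ruben Martinez-Cantin, Alexandre Bernardino, Lorenzo Jamone

AI summary: This paper proposes the unscented Bayesian optimization algorithm, which accounts for input noise to find optimal grasps in safe regions, improving the safety and efficiency of robot grasping.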

Comments conference paper

Abstract

We address the robot grasp optimization problem of unknown objects considering uncertainty in the input space. Grasping unknown objects can be achieved by using a trial and error exploration strategy. Bayesian optimization is a sample efficient optimization algorithm that is especially suitable for this setup as it actively reduces the number of trials for learning about the function to optimize. In fact, this active object exploration is the same strategy that infants use to learn optimal grasps. One problem that arises while learning grasping policies is that some configurations of grasp parameters may be very sensitive to error in the relative pose between the object and robot end-effector. We call these configurations unsafe because small errors during grasp execution may turn good grasps into bad grasps. Therefore, to reduce the risk of grasp failure, grasps should be planned in safe areas. We propose a new algorithm, Unscented Bayesian optimization, that is able to perform sample efficient optimization while taking into consideration input noise to find safe optima. The contribution of Unscented Bayesian optimization is twofold, as it provides a new decision process that drives exploration to safe regions and a new selection procedure that chooses the optimal in terms of its safety without extra analysis or computational cost. Both contributions are rooted in the strong theory behind the unscented transformation, a popular nonlinear approximation method. We show its advantages with respect to the classical Bayesian optimization both in synthetic problems and in realistic robot grasp simulations. The results highlight that our method achieves optimal and robust grasping policies after few trials while the selected grasps remain in safe regions.
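
The unscented transformation that the method builds on can be sketched in one dimension. The code below is not the authors' algorithm. It only shows how a few sigma points approximate the expected value of an objective under Gaussian input noise, which is why a broad optimum can score higher than a sharp one. The objective, the noise level, and the parameter `kappa` are illustrative assumptions.

```python
import math


def unscented_mean(f, x, var, kappa=2.0):
    """Approximate E[f(x + e)], e ~ N(0, var), with three sigma points (1-D)."""
    n = 1  # input dimension
    spread = math.sqrt((n + kappa) * var)
    w_center = kappa / (n + kappa)
    w_side = 1.0 / (2.0 * (n + kappa))
    return w_center * f(x) + w_side * (f(x + spread) + f(x - spread))


def objective(x):
    """Hypothetical grasp quality: a tall narrow peak at 0, a lower broad peak at 3."""
    narrow = 1.0 * math.exp(-(x / 0.05) ** 2)
    broad = 0.8 * math.exp(-((x - 3.0) / 1.0) ** 2)
    return narrow + broad


noise_var = 0.1 ** 2  # assumed variance of the input (pose) noise
for x in (0.0, 3.0):
    # Nominal value versus the noise-aware estimate at the same query point.
    print(x, objective(x), unscented_mean(objective, x, noise_var))
```

With these assumed numbers the narrow peak has the higher nominal value, but the broad peak has the higher unscented estimate, so a noise-aware selection rule would prefer it.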

1603.00748 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY

Continuous Deep Q-Learning with Model-based Acceleration

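Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

AI summary: This paper proposes NAF, a continuous deep Q-learning algorithm, together with a model-based acceleration method, to improve sample efficiency and learning speed on continuous control tasks.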

Abstract

Model-free reinforcement learning has been successfully applied to a range of challenging problems, and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms, particularly when using high-dimensional function approximators, tends to limit their applicability to physical systems. In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks. We propose two complementary techniques for improving the efficiency of such algorithms. First, we derive a continuous variant of the Q-learning algorithm, which we call normalized advantage functions (NAF), as an alternative to the more commonly used policy gradient and actor-critic methods. NAF representation allows us to apply Q-learning with experience replay to continuous tasks, and substantially improves performance on a set of simulated robotic control tasks. To further improve the efficiency of our approach, we explore the use of learned models for accelerating model-free reinforcement learning. We show that iteratively refitted local linear models are especially effective for this, and demonstrate substantially faster learning on domains where such models are applicable.
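
The normalized advantage function can be sketched for a single state. The sketch assumes the quadratic parameterization Q(x, u) = V(x) + A(x, u), with A(x, u) = -1/2 (u - mu)^T P (u - mu) and P = L L^T for a lower-triangular L, so that the greedy action is mu in closed form. The numbers below stand in for network outputs and are purely illustrative; the function name is hypothetical.

```python
def naf_q_value(u, mu, L, v):
    """Q(x, u) = V(x) - 0.5 * (u - mu)^T L L^T (u - mu) for one state x."""
    d = [u_k - mu_k for u_k, mu_k in zip(u, mu)]
    # (u - mu)^T L L^T (u - mu) equals ||L^T (u - mu)||^2, so it is never negative.
    lt_d = [sum(L[i][j] * d[i] for i in range(len(d))) for j in range(len(d))]
    advantage = -0.5 * sum(c * c for c in lt_d)
    return v + advantage


# Illustrative stand-ins for the network outputs at one state.
mu = [0.5, -1.0]                 # action that maximizes Q
L = [[1.0, 0.0], [0.3, 2.0]]     # lower-triangular factor with positive diagonal
v = 4.0                          # state value V(x)

print(naf_q_value(mu, mu, L, v))          # the advantage at mu is zero, so Q = V
print(naf_q_value([0.0, 0.0], mu, L, v))  # any other action has a lower Q
```

Because the advantage term is a negative semidefinite quadratic in u, the maximization inside the Q-learning target needs no search over actions: max over u of Q(x, u) is simply V(x).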

1602.08609 2026-06-04 cs.SD cs.SY eess.SY

A New Robust Frequency Domain Echo Canceller With Closed-Loop Learning Rate Adaptation

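Jean-Marc Valin, Iain B. Collings

AI summary: This paper proposes a closed-loop method based on a multidelay block frequency domain echo canceller, in which the learning rate is made proportional to an alignment parameter. The method improves echo cancellation performance and outperforms existing double-talk detection techniques by 6 dB.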

Comments 4 pages, Proc. International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2007

1511.04634 2026-06-04 cs.RO cs.SY eess.SY

Motion Planning for Global Localization in Non-Gaussian Belief Spaces

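Saurav Agarwal, Amirhossein Tamjidi, Suman Chakravorty

AI summary: This paper presents a method for motion planning under uncertainty that handles the multi-modal hypotheses caused by ambiguous data association. A recursive approach progressively eliminates modes of the multi-modal belief, so that the robot localizes accurately in finite time.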

Comments extends previous submission with updated figures, analysis and justifications. arXiv admin note: text overlap with arXiv:1506.01780

1602.08044 2026-06-04 cs.SD cs.SY eess.SY

On Adjusting the Learning Rate in Frequency Domain Echo Cancellation With Double-Talk

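Jean-Marc Valin

AI summary: This paper proposes a new method for adjusting the learning rate in frequency domain echo cancellation in response to double-talk and echo path changes. The method is based on a derivation of the optimal learning rate of the NLMS algorithm in the presence of noise. It is evaluated with a multidelay block frequency domain adaptive filter and is shown to outperform existing double-talk detection techniques while being easy to implement.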

Comments 5 pages

1602.04434 2026-06-04 cs.LG cs.SY eess.SY stat.ML

Frequency Analysis of Temporal Graph Signals

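Andreas Loukas, Damien Foucard

AI summary: This paper introduces frequency analysis for temporal graph signals, unifying temporal frequency analysis and graph frequency analysis, and uses a joint time-frequency transform to design distributed filters for interference cancellation.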

Comments 5 pages, 3 figures

1510.00771 2026-06-04 cs.CV cs.RO cs.SY eess.SY

Design and Analysis of a Single-Camera Omnistereo Sensor for Quadrotor Micro Aerial Vehicles (MAVs)

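Carlos Jaramillo

AI summary: This paper presents a single-camera omnistereo sensor design suited to low-payload quadrotor micro aerial vehicles. The sensor achieves stereo vision with coaxial hyperboloidal mirrors, and the paper analyzes its geometric properties and 3D sensing performance.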

Comments 49 pages, 22 figures, journal article draft

DOI: 10.3390/s16020217
Journal ref: Sensors 16 (2016) 217

Abstract

We describe the design and 3D sensing performance of an omnidirectional stereo-vision system (omnistereo) as applied to Micro Aerial Vehicles (MAVs). The proposed omnistereo model employs a monocular camera that is co-axially aligned with a pair of hyperboloidal mirrors (folded catadioptric configuration). We show that this arrangement is practical for performing stereo-vision when mounted on top of propeller-based MAVs characterized by low payloads. The theoretical single viewpoint (SVP) constraint helps us derive analytical solutions for the sensor's projective geometry and generate SVP-compliant panoramic images to compute 3D information from stereo correspondences (in a truly synchronous fashion). We perform an extensive analysis on various system characteristics such as its size, catadioptric spatial resolution, field-of-view. In addition, we pose a probabilistic model for uncertainty estimation of the depth from triangulation for skew back-projection rays. We expect to motivate the reproducibility of our solution since it can be adapted (optimally) to other catadioptric-based omnistereo vision applications.

1506.00438 2026-06-04 cs.LG cs.DM cs.SY eess.SY stat.ME

Network Topology Identification using PCA and its Graph Theoretic Interpretations

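Aravind Rajeswaran, Shankar Narasimhan

AI summary: This paper uses PCA to estimate linear relationships and then uses f-cut-sets and f-circuits to identify network topology, showing how network structure can be identified from steady-state data and what the procedure means in graph-theoretic terms.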

Comments Structure of paper is changed to improve presentation. Methods and results are unchanged. A more detailed literature survey has been added

Abstract

We solve the problem of identifying (reconstructing) network topology from steady state network measurements. Concretely, given only a data matrix $\mathbf{X}$ where the $X_{ij}$ entry corresponds to flow in edge $i$ in configuration (steady-state) $j$, we wish to find a network structure for which flow conservation is obeyed at all the nodes. This models many network problems involving conserved quantities like water, power, and metabolic networks. We show that identification is equivalent to learning a model $\mathbf{A_n}$ which captures the approximate linear relationships between the different variables comprising $\mathbf{X}$ (i.e. of the form $\mathbf{A_n X \approx 0}$) such that $\mathbf{A_n}$ is full rank (highest possible) and consistent with a network node-edge incidence structure. The problem is solved through a sequence of steps like estimating approximate linear relationships using Principal Component Analysis, obtaining f-cut-sets from these approximate relationships, and graph realization from f-cut-sets (or equivalently f-circuits). Each step and the overall process is polynomial time. The method is illustrated by identifying topology of a water distribution network. We also study the extent of identifiability from steady-state data.

1510.06895 2026-06-04 cs.LG cs.CV cs.NA math.NA

Nonconvex Nonsmooth Low-Rank Minimization via Iteratively Reweighted Nuclear Norm

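Canyi Lu, Jinhui Tang, Shuicheng Yan, Zhouchen Lin

AI summary: This paper proposes the iteratively reweighted nuclear norm algorithm for nonconvex nonsmooth low-rank minimization, using nonconvex surrogate functions to approximate the rank function and improve low-rank matrix recovery.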

DOI: 10.1109/TIP.2015.2511584

Abstract

The nuclear norm is widely used as a convex surrogate of the rank function in compressive sensing for low rank matrix recovery with its applications in image recovery and signal processing. However, solving the nuclear norm based relaxed convex problem usually leads to a suboptimal solution of the original rank minimization problem. In this paper, we propose to perform a family of nonconvex surrogates of $L_0$-norm on the singular values of a matrix to approximate the rank function. This leads to a nonconvex nonsmooth minimization problem. Then we propose to solve the problem by Iteratively Reweighted Nuclear Norm (IRNN) algorithm. IRNN iteratively solves a Weighted Singular Value Thresholding (WSVT) problem, which has a closed form solution due to the special properties of the nonconvex surrogate functions. We also extend IRNN to solve the nonconvex problem with two or more blocks of variables. In theory, we prove that IRNN decreases the objective function value monotonically, and any limit point is a stationary point. Extensive experiments on both synthesized data and real images demonstrate that IRNN enhances the low-rank matrix recovery compared with state-of-the-art convex algorithms.

1506.08350 2026-06-04 cs.LG cs.NA math.NA

Stochastic Gradient Made Stable: A Manifold Propagation Approach for Large-Scale Optimization

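Yadong Mu, Wei Liu, Wei Fan

AI summary: This paper proposes S3GD, a new semi-stochastic gradient descent algorithm that uses efficient manifold propagation to reduce computational complexity and improve optimization stability.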

Comments 14 pages, 9 figures

Abstract

Stochastic gradient descent (SGD) holds as a classical method to build large scale machine learning models over big data. A stochastic gradient is typically calculated from a limited number of samples (known as mini-batch), so it potentially incurs a high variance and causes the estimated parameters to bounce around the optimal solution. To improve the stability of stochastic gradient, recent years have witnessed the proposal of several semi-stochastic gradient descent algorithms, which distinguish themselves from standard SGD by incorporating global information into gradient computation. In this paper we contribute a novel stratified semi-stochastic gradient descent (S3GD) algorithm to this nascent research area, accelerating the optimization of a large family of composite convex functions. Though theoretically converging faster, prior semi-stochastic algorithms are found to suffer from high iteration complexity, which makes them even slower than SGD in practice on many datasets. In our proposed S3GD, the semi-stochastic gradient is calculated based on efficient manifold propagation, which can be numerically accomplished by sparse matrix multiplications. This way S3GD is able to generate a highly-accurate estimate of the exact gradient from each mini-batch with largely-reduced computational complexity. Theoretical analysis reveals that the proposed S3GD elegantly balances the geometric algorithmic convergence rate against the space and time complexities during the optimization. The efficacy of S3GD is also experimentally corroborated on several large-scale benchmark datasets.

1512.06427 2026-06-04 cs.AI cs.DS cs.SY eess.SY math.OC

Towards Integrated Glance To Restructuring in Combinatorial Optimization

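Mark Sh. Levin

AI summary: This paper studies the restructuring of solutions in combinatorial optimization, considering both the cost of restructuring and the proximity to the goal solution, and proposes single-criterion and multi-criteria solution approaches for three types of restructuring.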

Comments 31 pages, 34 figures, 10 tables

1512.03351 2026-06-04 cs.RO cs.SY eess.SY

Adaptive Neural Control for Mobile Robots Autonomous Navigation

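Monica Dragoicea, Ioan Dumitrache, Nicolae Constantin

AI summary: This paper proposes an adaptive neural control strategy for the autonomous navigation of nonholonomic mobile robots, which achieves stable tracking by simultaneously learning the kinematic steering and the velocity dynamics.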

Comments in Proceedings of the 7th Int. Symposium on Automatic Control and Computer Science SACCS 2001, Iasi, Romania, CD ISBN 973-8292-11-5, 2001

1512.03345 2026-06-04 cs.RO cs.SY eess.SY

Mobile Robots Adaptive Control Using Neural Networks

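Ioan Dumitrache, Monica Dragoicea

AI summary: This paper proposes a feedforward control strategy that accounts for the nonlinear model of a mobile robot and its input-output interactions, using a neural network controller to compensate for modeling uncertainty as part of an intelligent control strategy.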

1504.06917 2026-06-04 cs.RO cs.SY eess.SY math.OC

Spline Path Following for Redundant Mechanical Systems

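Rajan Gill, Dana Kulić, Christopher Nielsen

AI summary: This paper proposes a path following control method for framed curves generated by splines, resolves redundancy by solving a constrained quadratic optimization problem, and experimentally validates the method on a 4-DOF manipulator with significant model uncertainty.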

Comments Submitted to IEEE TRO (under review)

DOI: 10.1109/TRO.2015.2489502
Journal ref: Robotics, IEEE Transactions on (Volume:31 , Issue: 6 ) 02 December 2015

Abstract

Path following controllers make the output of a control system approach and traverse a pre-specified path with no a priori time parametrization. In this paper we present a method for path following control design applicable to framed curves generated by splines in the workspace of kinematically redundant mechanical systems. The class of admissible paths includes self-intersecting curves. Kinematic redundancies are resolved by designing controllers that solve a suitably defined constrained quadratic optimization problem. By employing partial feedback linearization, the proposed path following controllers have a clear physical meaning. The approach is experimentally verified on a 4-degree-of-freedom (4-DOF) manipulator with a combination of revolute and linear actuated links and significant model uncertainty.

1512.01885 2026-06-04 cs.AI cs.SY eess.SY math.OC

Probabilistic Structural Controllability in Causal Bayesian Networks

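Ardavan Salehi Nobandegani, Ioannis N. Psaromiligkos

AI summary: This paper is the first to study probabilistic controllability in causal Bayesian networks. It proposes a definition of probabilistic structural controllability and identifies a sufficient set of driver variables for probabilistic control of the states of target variables.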

1505.00274 2026-06-04 cs.AI cs.SY eess.SY stat.ML

Stick-Breaking Policy Learning in Dec-POMDPs

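Miao Liu, Christopher Amato, Xuejun Liao, Lawrence Carin, Jonathan P. How

AI summary: This paper proposes Dec-SBPR, a framework with variable-size state controllers whose local policies are built from a stick-breaking prior. It learns the controller parameters without assuming a Dec-POMDP model and improves performance on large-scale problems.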

1509.03044 2026-06-04 cs.LG cs.AI cs.SY eess.SY

Recurrent Reinforcement Learning: A Hybrid Approach

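Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He

AI summary: This paper proposes a hybrid model that combines supervised learning and reinforcement learning to learn state representations for partially observable tasks, and that is effective with minimal domain knowledge.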

Comments 11 pages, 6 figures

Abstract

Successful applications of reinforcement learning in real-world problems often require dealing with partially observable states. It is in general very challenging to construct and infer hidden states as they often depend on the agent's entire interaction history and may require substantial domain knowledge. In this work, we investigate a deep-learning approach to learning the representation of states in partially observable tasks, with minimal prior knowledge of the domain. In particular, we propose a new family of hybrid models that combines the strength of both supervised learning (SL) and reinforcement learning (RL), trained in a joint fashion: The SL component can be a recurrent neural network (RNN) or its long short-term memory (LSTM) version, which is equipped with the desired property of being able to capture long-term dependency on history, thus providing an effective way of learning the representation of hidden states. The RL component is a deep Q-network (DQN) that learns to optimize the control for maximizing long-term rewards. Extensive experiments in a direct mailing campaign problem demonstrate the effectiveness and advantages of the proposed approach, which performs the best among a set of previous state-of-the-art methods.

1511.04628 2026-06-04 cs.RO cs.SY eess.SY

A Framework for Planning and Controlling Non-Periodic Bipedal Locomotion

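Ye Zhao, Benito R. Fernandez, Luis Sentis

AI summary: This paper proposes a theoretical framework for planning and controlling bipedal locomotion based on robustly tracking non-periodic apex states, using a hybrid phase-space planning and control approach to generate non-periodic gaits and reject disturbances.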

Comments 33 pages, 18 figures, journal

Abstract

This study presents a theoretical framework for planning and controlling agile bipedal locomotion based on robustly tracking a set of non-periodic apex states. Based on the prismatic inverted pendulum model, we formulate a hybrid phase-space planning and control framework which includes the following key components: (1) a step transition solver that enables dynamically tracking non-periodic apex or keyframe states over various types of terrains, (2) a robust hybrid automaton to effectively formulate planning and control algorithms, (3) a phase-space metric to measure distance to the planned locomotion manifolds, and (4) a hybrid control method based on the previous distance metric to produce robust dynamic locomotion under external disturbances. Compared to other locomotion frameworks, we have a larger focus on non-periodic gait generation and robustness metrics to deal with disturbances. Such focus enables the proposed control framework to robustly track non-periodic apex states over various challenging terrains and under external disturbances as illustrated through several simulations. Additionally, it allows a bipedal robot to perform non-periodic bouncing maneuvers over disjointed terrains.

1502.06800 2026-06-04 cs.LG cs.NA math.NA stat.ML

On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions

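Francis Bach

AI summary: This study shows that kernel quadrature rules are a special case of random feature expansions, derives through theoretical analysis the relationship between the number of samples and the eigenvalues of the integral operator, extends the results to function approximation, and improves the generalization guarantees for learning with random features.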

Abstract

We show that kernel-based quadrature rules for computing integrals can be seen as a special case of random feature expansions for positive definite kernels, for a particular decomposition that always exists for such kernels. We provide a theoretical analysis of the number of required samples for a given approximation error, leading to both upper and lower bounds that are based solely on the eigenvalues of the associated integral operator and match up to logarithmic terms. In particular, we show that the upper bound may be obtained from independent and identically distributed samples from a specific non-uniform distribution, while the lower bound is valid for any set of points. Applying our results to kernel-based quadrature, while our results are fairly general, we recover known upper and lower bounds for the special cases of Sobolev spaces. Moreover, our results extend to the more general problem of full function approximations (beyond simply computing an integral), with results in $L_2$- and $L_\infty$-norm that match known results for special cases. Applying our results to random features, we show an improvement of the number of random features needed to preserve the generalization guarantees for learning with Lipschitz-continuous losses.
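
A random feature expansion of a positive definite kernel can be illustrated with the standard random Fourier features for a one-dimensional Gaussian kernel. This is a generic textbook construction that samples frequencies from the kernel's spectral density. It is not the optimized non-uniform sampling distribution analyzed in the paper, and the feature count, seed, and function names are arbitrary choices.

```python
import math
import random


def make_features(num_features, rng):
    """Draw frequencies w ~ N(0, 1) and phases b ~ U[0, 2*pi) for the
    kernel k(x, y) = exp(-(x - y)^2 / 2)."""
    return [(rng.gauss(0.0, 1.0), rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(num_features)]


def feature_map(x, features):
    """Map a scalar x to its random feature vector."""
    scale = math.sqrt(2.0 / len(features))
    return [scale * math.cos(w * x + b) for w, b in features]


def approx_kernel(x, y, features):
    """Inner product of feature vectors, an unbiased estimate of k(x, y)."""
    return sum(p * q for p, q in zip(feature_map(x, features),
                                     feature_map(y, features)))


features = make_features(5000, random.Random(0))
x, y = 0.3, 1.1
exact = math.exp(-((x - y) ** 2) / 2.0)
# The two printed numbers should be close; the gap shrinks as features are added.
print(exact, approx_kernel(x, y, features))
```

The question the abstract addresses is how many such features (or quadrature points) are needed for a given error, and how much a better sampling distribution can reduce that number.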

1411.2276 2026-06-04 cs.RO cs.NE cs.SY eess.SY

Trade-Offs in Exploiting Body Morphology for Control: from Simple Bodies and Model-Based Control to Complex Bodies with Model-Free Distributed Control Schemes

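Matej Hoffmann, Vincent C. Müller

AI summary: This paper discusses the trade-off between model-based and model-free control in the design of complex bodies, analyzing the pros and cons of soft robot bodies automatically taking over control and the feasibility of building models.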

Journal ref: Helmut Hauser; Rudolf M. Füchslin & Rolf Pfeifer, ed., 'E-book on Opinions and Outlooks on Morphological Computation', 2014, pp. 185--194

Abstract

Tailoring the design of robot bodies for control purposes is implicitly performed by engineers; however, a methodology or set of tools is largely absent and optimization of morphology (shape, material properties of robot bodies, etc.) is lagging behind the development of controllers. This has become even more prominent with the advent of compliant, deformable or "soft" bodies. These carry substantial potential regarding their exploitation for control, sometimes referred to as "morphological computation" in the sense of offloading computation needed for control to the body. Here, we will argue in favor of a dynamical systems rather than computational perspective on the problem. Then, we will look at the pros and cons of simple vs. complex bodies, critically reviewing the attractive notion of "soft" bodies automatically taking over control tasks. We will address another key dimension of the design space: whether model-based control should be used and to what extent it is feasible to develop faithful models for different morphologies.

1405.7178 2026-06-04 cs.RO cs.SY eess.SY

Artificial Wrestling: A Dynamical Formulation of Autonomous Agents Fighting in a Coupled Inverted Pendula Framework

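Katsutoshi Yoshida, Shigeki Matsumoto, Yoichi Matsue

AI summary: This paper proposes a model of competing autonomous agents based on a coupled inverted pendula framework, in which a dynamically formulated controller stores state correspondences and generates control forces. Simulations show that the delay element and the quantization resolution affect performance.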

Comments The 12th International Conference on Motion and Vibration Control (MOVIC 2014), August 3-7, 2014, Sapporo, Japan. This article was selected as an article of Mechanical Engineering Journal after minor revisions; the final version is available at http://dx.doi.org/10.1299/mej.14-00518

Abstract

We develop autonomous agents fighting with each other, inspired by human wrestling. For this purpose, we propose a coupled inverted pendula (CIP) framework in which: 1) tips of two inverted pendulums are linked by a connection rod, 2) each pendulum is primarily stabilized by a PD-controller, 3) and is additionally equipped with an intelligent controller. Based on this framework, we dynamically formulate an intelligent controller designed to store dynamical correspondence from initial states to final states of the CIP model, to receive state vectors of the model, and to output impulsive control forces to produce desired final states of the model. Developing a quantized and reduced order design of this controller, we have a practical control procedure based on an off-line learning method. We then conduct numerical simulations to investigate individual performance of the intelligent controller, showing that the performance can be improved by adding a delay element into the intelligent controller. The result shows that the performance depends not only on quantization resolutions of learning data but also on delay time of the delay element. Finally, we install the intelligent controllers into both pendulums in the proposed framework to demonstrate autonomous competitive behavior between inverted pendulums.

1510.04914 2026-06-04 cs.AI cs.DC cs.MS cs.NA math.NA math.OC

Hybridization of Interval CP and Evolutionary Algorithms for Optimizing Difficult Problems

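Charlie Vanaret, Jean-Baptiste Gotteland, Nicolas Durand, Jean-Marc Alliot

AI summary: This paper proposes a hybrid framework that combines interval methods with evolutionary algorithms and runs the two searches in parallel with message passing. It shows that Charibde outperforms existing solvers on difficult COCONUT problems.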

Comments 21st International Conference on Principles and Practice of Constraint Programming (CP 2015), 2015

DOI: 10.1007/978-3-319-23219-5_32

Abstract

The only rigorous approaches for achieving a numerical proof of optimality in global optimization are interval-based methods that interleave branching of the search-space and pruning of the subdomains that cannot contain an optimal solution. State-of-the-art solvers generally integrate local optimization algorithms to compute a good upper bound of the global minimum over each subspace. In this document, we propose a cooperative framework in which interval methods cooperate with evolutionary algorithms. The latter are stochastic algorithms in which a population of candidate solutions iteratively evolves in the search-space to reach satisfactory solutions. Within our cooperative solver Charibde, the evolutionary algorithm and the interval-based algorithm run in parallel and exchange bounds, solutions and search-space in an advanced manner via message passing. A comparison of Charibde with state-of-the-art interval-based solvers (GlobSol, IBBA, Ibex) and NLP solvers (Couenne, BARON) on a benchmark of difficult COCONUT problems shows that Charibde is highly competitive against non-rigorous solvers and converges faster than rigorous solvers by an order of magnitude.
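
The interleaving of branching and pruning described in the abstract can be sketched with a one-dimensional interval branch and bound. This is a simplified illustration, not Charibde: there is no evolutionary algorithm or message passing, the upper bound comes from box midpoints, and the interval arithmetic uses ordinary floating point without outward rounding, so the result is not a rigorous proof. The objective and all names are hypothetical.

```python
import heapq


def f(x):
    """Objective to minimize: f(x) = x^2 - 2x, with minimum -1 at x = 1."""
    return x * x - 2.0 * x


def f_interval(lo, hi):
    """Enclosure of f over [lo, hi] via natural interval arithmetic."""
    if lo <= 0.0 <= hi:
        sq_lo, sq_hi = 0.0, max(lo * lo, hi * hi)
    else:
        sq_lo, sq_hi = min(lo * lo, hi * hi), max(lo * lo, hi * hi)
    # [sq_lo, sq_hi] - 2 * [lo, hi]
    return sq_lo - 2.0 * hi, sq_hi - 2.0 * lo


def branch_and_bound(lo, hi, tol=1e-6):
    """Return (lower, upper) bounds on the global minimum of f over [lo, hi]."""
    best_upper = f(0.5 * (lo + hi))           # any evaluated point gives an upper bound
    heap = [(f_interval(lo, hi)[0], lo, hi)]  # boxes ordered by their lower bound
    while heap:
        lower, a, b = heapq.heappop(heap)
        if best_upper - lower <= tol:
            return lower, best_upper          # smallest remaining lower bound is close enough
        mid = 0.5 * (a + b)
        best_upper = min(best_upper, f(mid))
        for c, d in ((a, mid), (mid, b)):
            sub_lower = f_interval(c, d)[0]
            if sub_lower <= best_upper:       # prune boxes that cannot contain the minimum
                heapq.heappush(heap, (sub_lower, c, d))
    return best_upper, best_upper


print(branch_and_bound(-3.0, 3.0))
```

The pruning test depends on how good the current upper bound is, which is where a cooperating heuristic search can help: a better upper bound found early lets the interval method discard more boxes.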

1505.05908 2026-06-04 cs.RO cs.SY eess.SY

Cooperative localization for mobile agents: a recursive decentralized algorithm based on Kalman filter decoupling

移动代理的协同定位：基于卡尔曼滤波解耦的递归分布式算法

Solmaz S. Kia, Stephen Rounds, Sonia Martinez

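AI summary: This paper proposes a recursive decentralized cooperative localization algorithm that decouples the Kalman filter so that each mobile agent's estimate matches that of a centralized extended Kalman filter, with communication needed only when a relative measurement is taken.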

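AI abstract (translated from Chinese)

We consider cooperative localization techniques for mobile agents with communication and computation capabilities. We first survey the decentralization strategies in the literature, focusing on how these algorithms maintain the intrinsic correlations between the state estimates of team members. We then present a new decentralized cooperative localization algorithm that is a decentralized implementation of a centralized extended Kalman filter. In this algorithm, each agent propagates new intermediate local variables that can be used in the update stage to generate the required propagated cross-covariance terms. Whenever a relative measurement occurs in the network, the algorithm declares the agent taking the measurement the interim master. By acquiring information from the interim landmark (the agent being measured), the interim master can compute and broadcast a set of intermediate variables that each robot can use to update its estimate to match that of the centralized extended Kalman filter. Once an update is done, no further communication is needed until the next relative measurement.

English abstract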

We consider a cooperative localization technique for mobile agents with communication and computation capabilities. We start by providing an overview of different decentralization strategies in the literature, with special focus on how these algorithms maintain an account of intrinsic correlations between the state estimates of team members. Then, we present a novel decentralized cooperative localization algorithm that is a decentralized implementation of a centralized Extended Kalman Filter for cooperative localization. In this algorithm, instead of propagating cross-covariance terms, each agent propagates new intermediate local variables that can be used in an update stage to create the required propagated cross-covariance terms. Whenever there is a relative measurement in the network, the algorithm declares the agent making this measurement as the interim master. By acquiring information from the interim landmark (the agent the relative measurement is taken from), the interim master can calculate and broadcast a set of intermediate variables, which each robot can then use to update its estimates to match those of a centralized Extended Kalman Filter for cooperative localization. Once an update is done, no further communication is needed until the next relative measurement.
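
To see why cross-covariance terms matter here, the following minimal sketch shows the centralized update that a decentralized scheme would aim to reproduce. It is not the paper's algorithm: it assumes two agents with scalar positions and a linear relative measurement z = x2 - x1 + noise (so the extended Kalman filter reduces to a plain Kalman filter), and the function name `relative_update` is hypothetical.

```python
def relative_update(x, P, z, r):
    """Joint Kalman update for a relative measurement between two agents.

    x: [x1, x2] joint state estimate, P: 2x2 symmetric joint covariance,
    z: measured x2 - x1, r: measurement noise variance.
    """
    # Measurement matrix H = [-1, 1], so P H^T is:
    ph = [P[0][1] - P[0][0], P[1][1] - P[1][0]]
    s = ph[1] - ph[0] + r                 # innovation variance H P H^T + r
    k = [ph[0] / s, ph[1] / s]            # Kalman gain
    innov = z - (x[1] - x[0])             # measurement residual
    x_new = [x[0] + k[0] * innov, x[1] + k[1] * innov]
    # P - K H P, using H P = (P H^T)^T for symmetric P
    P_new = [[P[i][j] - k[i] * ph[j] for j in range(2)] for i in range(2)]
    return x_new, P_new

# Two initially uncorrelated agents become correlated after one update.
x_new, P_new = relative_update([0.0, 10.0], [[1.0, 0.0], [0.0, 4.0]], 9.0, 1.0)
print(x_new, P_new)
```

After the update the off-diagonal entries of `P_new` are nonzero even though the agents started uncorrelated, which is the coupling that any decentralized implementation has to account for.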

1509.01208 2026-06-04 cs.LG cs.IR cs.NA math.NA

Fast Clustering and Topic Modeling Based on Rank-2 Nonnegative Matrix Factorization

基于秩2非负矩阵分解的快速聚类与主题建模

Da Kuang, Barry Drake, Haesun Park

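AI summary: This paper proposes HierNMF2 and FlatNMF2, which use rank-2 nonnegative matrix factorization for efficient hierarchical and flat clustering and topic modeling. Experiments show significant improvements in both computation time and solution quality.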

Comments This paper has been withdrawn by the author to clarify the authorship

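AI abstract (translated from Chinese)

Unsupervised clustering and topic modeling are increasingly important as the volume of text data grows. This paper proposes a fast method called HierNMF2, based on fast rank-2 nonnegative matrix factorization (NMF), which performs binary clustering, and on an efficient node-splitting rule. Using the final leaf nodes generated by HierNMF2 together with nonnegative least squares fitting, it further proposes a new clustering/topic modeling method, FlatNMF2, which recovers a flat clustering/topic modeling result in a very simple yet significantly more effective way. We implemented highly optimized open-source C++ software for both HierNMF2 and FlatNMF2, for hierarchical and partitional clustering/topic modeling of document data sets. Extensive experiments show significant improvements in both computation time and solution quality. We compare our methods with other clustering methods (K-means, standard NMF, and CLUTO) and with topic modeling methods (latent Dirichlet allocation (LDA) and recently proposed NMF algorithms with separability constraints). Overall, we present efficient tools for analyzing large-scale data sets, and techniques that generalize to many other data analytics problem domains.

English abstract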

The importance of unsupervised clustering and topic modeling is well recognized with ever-increasing volumes of text data. In this paper, we propose a fast method for hierarchical clustering and topic modeling called HierNMF2. Our method is based on fast Rank-2 nonnegative matrix factorization (NMF) that performs binary clustering and an efficient node splitting rule. Further utilizing the final leaf nodes generated in HierNMF2 and the idea of nonnegative least squares fitting, we propose a new clustering/topic modeling method called FlatNMF2 that recovers a flat clustering/topic modeling result in a very simple yet significantly more effective way than any other existing method. We implement highly optimized open source software in C++ for both HierNMF2 and FlatNMF2 for hierarchical and partitional clustering/topic modeling of document data sets. Substantial experimental tests are presented that illustrate significant improvements in both computational time and quality of solutions. We compare our methods to other clustering methods including K-means, standard NMF, and CLUTO, and also topic modeling methods including latent Dirichlet allocation (LDA) and recently proposed algorithms for NMF with separability constraints. Overall, we present efficient tools for analyzing large-scale data sets, and techniques that can be generalized to many other data analytics problem domains.

1509.06458 2026-06-04 cs.LG cs.NA math.NA

Harmonic Extension

调和扩展

Zuoqiang Shi, Jian Sun, Minghao Tian

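AI summary: This paper proposes the point integral method (PIM) and the volume constraint method (VCM) for the harmonic extension problem, addressing shortcomings of the traditional graph Laplacian approach, and reports the best performance when applied to semi-supervised learning.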

Comments 10 pages, 2 figures

1509.04237 2026-06-04 cs.CV cs.NA math.NA

A Total Fractional-Order Variation Model for Image Restoration with Non-homogeneous Boundary Conditions and its Numerical Solution

一种总分数阶变化模型用于图像恢复及其数值解法，具有非均匀边界条件

Jianping Zhang, Ke Chen

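AI summary: This paper proposes a fractional-order total α-order variation model for image restoration to overcome weaknesses of traditional total variation models. It analyzes the model's theoretical properties and develops four algorithms, with restoration quality and efficiency that are competitive with established high-order models.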

Comments 26 pages

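AI abstract (translated from Chinese)

To overcome the weaknesses of total variation based image restoration models, a variety of high-order (typically second-order) regularization models have been proposed in recent years. This paper analyzes and tests a total α-order variation model based on fractional-order derivatives, which can outperform the currently popular high-order regularization models. Although earlier work has used total α-order variation for image restoration, no theoretical analysis has been carried out, and all tested formulations use zero Dirichlet boundary conditions, which are unrealistic (while non-zero boundary conditions violate the definitions of fractional-order derivatives). The paper first reviews some results on fractional-order derivatives and then rigorously analyzes the theoretical properties of the proposed total α-order variation model. It then develops four algorithms for solving the variational problem: one based on the variational Split-Bregman idea and three based on directly solving the discretized optimization problem. Numerical experiments show that, in terms of restoration quality and solution efficiency, the proposed model produces highly competitive results on smooth images compared with two established high-order models, mean curvature and total generalized variation.

English abstract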

To overcome the weakness of a total variation based model for image restoration, various high order (typically second order) regularization models have been proposed and studied recently. In this paper we analyze and test a fractional-order derivative based total $\alpha$-order variation model, which can outperform the currently popular high order regularization models. There exist several previous works using total $\alpha$-order variations for image restoration; however, first, no analysis has been done yet and, second, all tested formulations, which differ from each other, use zero Dirichlet boundary conditions, which are not realistic (while non-zero boundary conditions violate definitions of fractional-order derivatives). This paper first reviews some results of fractional-order derivatives and then analyzes the theoretical properties of the proposed total $\alpha$-order variational model rigorously. It then develops four algorithms for solving the variational problem, one based on the variational Split-Bregman idea and three based on direct solution of the discretise-optimization problem. Numerical experiments show that, in terms of restoration quality and solution efficiency, the proposed model can produce highly competitive results, for smooth images, to two established high order models: the mean curvature and the total generalized variation.
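
As background on what a fractional-order derivative looks like numerically, here is a minimal sketch using the Grünwald-Letnikov discretization on a uniform grid. The choice of this discretization and the names `gl_weights` and `gl_derivative` are assumptions for illustration; the abstract does not say which definition or scheme the paper uses.

```python
def gl_weights(alpha, n):
    """First n Grunwald-Letnikov weights w_k = (-1)^k * binom(alpha, k)."""
    w = [1.0]
    for k in range(1, n):
        # Recurrence: w_k = w_{k-1} * (1 - (alpha + 1) / k)
        w.append(w[-1] * (1.0 - (alpha + 1.0) / k))
    return w

def gl_derivative(samples, alpha, h):
    """Left-sided approximation of the alpha-order derivative.

    samples: function values on a uniform grid with spacing h.
    The sum at index i only reaches back to the first sample, so the
    signal is implicitly treated as zero before the grid starts.
    """
    w = gl_weights(alpha, len(samples))
    return [
        sum(w[k] * samples[i - k] for k in range(i + 1)) / h**alpha
        for i in range(len(samples))
    ]

print(gl_weights(1.5, 4))
print(gl_derivative([0.0, 0.5, 1.0, 1.5], 1, 0.5))
```

For integer orders the weights reduce to the familiar finite differences (first difference for order 1, second difference for order 2), while for fractional orders every earlier sample contributes. Because the sum treats the signal as zero outside the grid, this sketch also gives some intuition for why boundary conditions are delicate for such models.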

1509.03946 2026-06-04 cs.LG cs.NA math.NA

Parametric Maxflows for Structured Sparse Learning with Convex Relaxations of Submodular Functions

参数最大流用于结构稀疏学习的凸松弛子模函数

Yoshinobu Kawahara, Yutaro Yamaguchi

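AI summary: This paper proposes using parametric maximum-flow optimization to solve structured sparse learning problems regularized by convex relaxations of submodular functions, and shows that existing structured penalties satisfy the required conditions, so the regularized learning problems can be solved quickly.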