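arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, and electrical engineering & systems.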

1607.02480 2026-06-04 cs.AI cs.DC cs.SY eess.SY

Real-Time Anomaly Detection for Streaming Analytics

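Subutai Ahmad, Scott Purdy

AI summary: The paper proposes a real-time anomaly detection method based on the Hierarchical Temporal Memory (HTM) algorithm. It makes predictions by processing and learning from streaming data in real time, and achieves the best performance on both financial metrics and the NAB benchmark.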

1307.4847 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML

Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization

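Zheng Wen, Benjamin Van Roy

AI summary: The paper proposes the OCP (optimistic constraint propagation) algorithm, which achieves efficient exploration and value function generalization, selects optimal actions in finite-horizon deterministic systems, and comes with guarantees on efficiency and asymptotic behavior.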

1607.01478 2026-06-04 cs.RO cs.AI cs.SY eess.SY

Mixed Strategy for Constrained Stochastic Optimal Control

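Masahiro Ono, Mahmoud El Chamie, Marco Pavone, Behcet Acikmese

AI summary: The paper proposes a mixed strategy for constrained stochastic optimal control. It proves that randomizing control inputs can reduce cost in nonconvex optimization problems, with the reduction equal to the duality gap, and proposes an efficient solution method based on dual optimization.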

Comments 11 pages, 9 figures. Preliminary version of a working journal paper

Abstract

Choosing control inputs randomly can result in a reduced expected cost in optimal control problems with stochastic constraints, such as stochastic model predictive control (SMPC). We consider a controller with initial randomization, meaning that the controller randomly chooses from K+1 control sequences at the beginning (called K-randomization). It is known that, for a finite-state, finite-action Markov Decision Process (MDP) with K constraints, K-randomization is sufficient to achieve the minimum cost. We found that the same result holds for stochastic optimal control problems with continuous state and action spaces. Furthermore, we show that randomization of the control input can result in reduced cost when the optimization problem is nonconvex, and that the cost reduction is equal to the duality gap. We then provide necessary and sufficient conditions for the optimality of a randomized solution, and develop an efficient solution method based on dual optimization. In the special case of K=1, such as a joint chance-constrained problem, the dual optimization can be solved even more efficiently by root finding. Finally, we test the theory and demonstrate the solution method on multiple practical problems ranging from path planning to the planning of entry, descent, and landing (EDL) for future Mars missions.
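The claim that randomization recovers the duality gap can be checked on a toy problem. The sketch below is not the authors' method: the two control sequences, their costs and risks, and the risk bound are made-up numbers, and the dual is maximized by a plain ternary search rather than the root finding the abstract mentions. With one constraint (K = 1), it compares the best single sequence, the best mix of two sequences, and the dual optimum.

```python
from itertools import combinations

# Hypothetical toy data: (expected cost, risk of violating the constraint)
# for each candidate control sequence. One constraint, so K = 1.
SEQUENCES = {"safe": (10.0, 0.0), "risky": (4.0, 0.2)}
RISK_BOUND = 0.1


def best_deterministic(seqs, bound):
    """Cheapest single sequence whose risk respects the bound."""
    return min(cost for cost, risk in seqs.values() if risk <= bound)


def best_randomized(seqs, bound):
    """Cheapest initial randomization over at most K + 1 = 2 sequences."""
    best = best_deterministic(seqs, bound)
    for (c1, r1), (c2, r2) in combinations(seqs.values(), 2):
        if r1 == r2:
            continue
        # Weight on the first sequence that makes the expected risk hit the bound.
        p = (bound - r2) / (r1 - r2)
        if 0.0 <= p <= 1.0:
            best = min(best, p * c1 + (1.0 - p) * c2)
    return best


def dual_function(seqs, bound, lam):
    """Lagrangian dual: minimize cost + lam * (risk - bound) over sequences."""
    return min(cost + lam * (risk - bound) for cost, risk in seqs.values())


def dual_optimum(seqs, bound, lo=0.0, hi=1000.0, iters=200):
    """Maximize the concave dual over lam in [lo, hi] by ternary search."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if dual_function(seqs, bound, m1) < dual_function(seqs, bound, m2):
            lo = m1
        else:
            hi = m2
    return dual_function(seqs, bound, 0.5 * (lo + hi))


deterministic = best_deterministic(SEQUENCES, RISK_BOUND)
randomized = best_randomized(SEQUENCES, RISK_BOUND)
dual = dual_optimum(SEQUENCES, RISK_BOUND)
print(deterministic, randomized, dual)
```

In this toy case the only feasible single sequence costs 10, while mixing the two sequences with equal probability meets the risk bound at an expected cost of 7, which is also the dual optimum. The cost reduction of 3 therefore equals the duality gap of the deterministic problem.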

1607.00644 2026-06-04 cs.RO cs.SY eess.SY

Nearest Neighbor-based Rendezvous for Sparsely Connected Mobile Agents

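Ahmad A. Masoud

AI summary: The paper proposes a convergent nearest-neighbor control protocol for mobile agents with nontrivial dynamics. The protocol guarantees convergence to a common point even when each agent communicates with only a single nearest neighbor. The nearest neighbor must lie outside an arbitrarily small priority zone. The protocol has two layers: the first provides a rendezvous signal for first-order dynamics, and the second converts that signal into a control signal suited to realistic agents.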

1602.04621 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML

Deep Exploration via Bootstrapped DQN

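Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy

AI summary: The paper proposes the Bootstrapped DQN algorithm, which uses randomized value functions for efficient exploration, improving learning speed and performance in complex environments, with particularly strong results on Atari games.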

1606.09383 2026-06-04 cs.LG cs.SY eess.SY

On Approximate Dynamic Programming with Multivariate Splines for Adaptive Control

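Willem Eerland, Coen de Visser, Erik-Jan van Kampen

AI summary: The paper proposes an SDP framework based on the RLSTD algorithm and multivariate simplex splines, introduces a local forget factor that preserves spline continuity, and shows experimentally that SDP has advantages in tracking time-varying systems and improving control performance.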

Comments 23 pages

Abstract

We define an SDP framework based on the RLSTD algorithm and multivariate simplex B-splines. We introduce a local forget factor capable of preserving the continuity of the simplex splines. This local forget factor is integrated with the RLSTD algorithm, resulting in a modified RLSTD algorithm that is capable of tracking time-varying systems. We present the results of two numerical experiments, one validating SDP and comparing it with NDP, and another showing the advantages of the modified RLSTD algorithm over the original. While SDP requires more computations per time step, the first experiment shows that, for the same number of function approximator parameters, performance improves in terms of stability and learning rate compared to NDP. The second experiment shows that SDP in combination with the modified RLSTD algorithm allows for faster recovery than the original RLSTD algorithm when system parameters are altered, paving the way for an adaptive high-performance nonlinear control method.
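To see why a forget factor helps recursive least-squares temporal-difference learning track a changing system, a scalar sketch is enough. It is an illustration under loud assumptions, not the paper's algorithm: it uses a single global forget factor instead of the local, spline-aware one, and a hypothetical one-state problem whose state loops to itself with feature 1, so that the true value is reward / (1 - gamma).

```python
def rlstd_scalar(rewards, gamma, forget, p0=1000.0):
    """Scalar recursive least-squares TD with a global forget factor.

    forget = 1.0 disables forgetting. The single self-looping state with
    feature 1 is a made-up test problem for illustration only.
    """
    theta, p = 0.0, p0
    phi = phi_next = 1.0
    for r in rewards:
        d = phi - gamma * phi_next            # temporal-difference feature
        denom = forget + d * p * phi
        theta += p * phi * (r - d * theta) / denom
        p = (p - p * phi * d * p / denom) / forget
    return theta


GAMMA = 0.5
rewards = [1.0] * 50 + [2.0] * 50             # the reward jumps halfway through
target = 2.0 / (1.0 - GAMMA)                  # value function after the change

plain = rlstd_scalar(rewards, GAMMA, forget=1.0)
forgetting = rlstd_scalar(rewards, GAMMA, forget=0.9)
print(plain, forgetting, target)
```

Without forgetting, the estimate averages over the whole history and lands between the old and new values. With a forget factor of 0.9, old samples are discounted geometrically and the estimate ends close to the new value.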

1606.09275 2026-06-04 cs.RO cs.SY eess.SY

A Harmonic Potential Approach For Simultaneous Planning And Control Of A Generic UAV Platform

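Ahmad A. Masoud

AI summary: The paper uses the harmonic potential field approach for simultaneous planning and control of a wide variety of UAVs. A dense reference velocity field regulates the UAV's velocity so that it moves toward a target point while satisfying behavioral constraints.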

DOI: 10.1007/s10846-011-9570-8
Journal ref: Journal Of Intelligent & Robotic Systems: Volume 65, Issue 1 (2012), Page 153-173

Abstract

Simultaneous planning and control of a large variety of unmanned aerial vehicles (UAVs) is tackled using the harmonic potential field (HPF) approach. A dense reference velocity field generated from the gradient of an HPF is used to regulate the velocity of the UAV concerned in a manner that propels the UAV to a target point while enforcing the constraints on behavior that were encoded a priori in the reference field. The regulation process is carried out using a novel and simple concept called the virtual velocity attractor (VVA). The combined effect of the HPF gradient and the VVA is found to yield an efficient, easy-to-implement, well-behaved, and provably correct context-sensitive control action that suits a wide variety of UAVs. The approach is developed and basic proofs of correctness are provided along with simulation results.
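The reference-field half of this idea can be sketched on a small grid. The code below is an assumption-laden illustration, not the paper's controller: the map, the boundary values (obstacles held at 1, goal at 0), and the Gauss-Seidel relaxation are choices made here, and the virtual velocity attractor is not modeled at all. It solves a discrete Laplace equation and then follows the steepest descent of the potential, which stands in for the reference velocity direction.

```python
# Hypothetical map: '#' obstacle or wall, '.' free cell, 'S' start, 'G' goal.
GRID = [
    "#######",
    "#S....#",
    "#.##..#",
    "#..#..#",
    "#.....#",
    "#....G#",
    "#######",
]


def solve_potential(grid, sweeps=500):
    """Relax the discrete Laplace equation with Gauss-Seidel sweeps."""
    rows, cols = len(grid), len(grid[0])
    # Obstacles and free cells start at 1; the goal is pinned at 0.
    phi = [[0.0 if grid[r][c] == "G" else 1.0 for c in range(cols)]
           for r in range(rows)]
    for _ in range(sweeps):
        for r in range(1, rows - 1):
            for c in range(1, cols - 1):
                if grid[r][c] in ".S":
                    phi[r][c] = 0.25 * (phi[r - 1][c] + phi[r + 1][c]
                                        + phi[r][c - 1] + phi[r][c + 1])
    return phi


def descend(grid, phi):
    """Step to the lowest-potential 4-neighbor until the goal is reached."""
    (start,) = [(r, c) for r, row in enumerate(grid)
                for c, ch in enumerate(row) if ch == "S"]
    path = [start]
    while grid[path[-1][0]][path[-1][1]] != "G":
        r, c = path[-1]
        neighbors = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
        nxt = min(neighbors, key=lambda rc: phi[rc[0]][rc[1]])
        if phi[nxt[0]][nxt[1]] >= phi[r][c]:
            break  # no strictly lower neighbor: field not converged
        path.append(nxt)
    return path


potential = solve_potential(GRID)
path = descend(GRID, potential)
print(path)
```

Because every free cell equals the average of its neighbors, a converged field has no local minimum other than the goal, so the descent cannot get trapped the way it can with ad hoc attractive and repulsive potentials.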

1606.07414 2026-06-04 cs.CV cs.MM cs.NA math.NA stat.ME

Multiplierless 16-point DCT Approximation for Low-complexity Image and Video Coding

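T. L. T. Silveira, R. S. Oliveira, F. M. Bayer, R. J. Cintra, A. Madanayake

AI summary: The paper proposes a 16-point approximate DCT that requires neither multiplications nor bit-shifting operations. A fast algorithm based on matrix factorization needs only 44 additions, the lowest arithmetic cost reported, and the transform shows the best cost-benefit ratio in image and video coding.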

Comments 12 pages, 5 figures, 3 tables

DOI: 10.1007/s11760-016-0923-4

Abstract

An orthogonal 16-point approximate discrete cosine transform (DCT) is introduced. The proposed transform requires neither multiplications nor bit-shifting operations. A fast algorithm based on matrix factorization is introduced, requiring only 44 additions, the lowest arithmetic cost in the literature. To assess the introduced transform, computational complexity, similarity with the exact DCT, and coding performance measures are computed. Classical and state-of-the-art 16-point low-complexity transforms were used in a comparative analysis. In the context of image compression, the proposed approximation was evaluated via PSNR and SSIM measurements, attaining the best cost-benefit ratio among the competitors. For video encoding, the proposed approximation was embedded into an HEVC reference software for direct comparison with the original HEVC standard. Physically realized and tested using FPGA hardware, the proposed transform showed 35% and 37% improvements in the area-time and area-time-squared VLSI metrics, respectively, when compared to the best competing transform in the literature.

1411.0728 2026-06-04 cs.LG cs.GT cs.SY eess.SY math.OC

Approachability in Stackelberg Stochastic Games with Vector Costs

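Dileep Kalathil, Vivek Borkar, Rahul Jain

AI summary: Motivated by multi-objective optimization in dynamically changing environments, the paper proposes an approachability strategy for Stackelberg stochastic games with vector costs, and designs a computationally tractable algorithm and a reinforcement learning method.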

Comments 18 Pages, Submitted to Dynamic Games and Applications

Abstract

The notion of approachability was introduced by Blackwell [1] in the context of vector-valued repeated games. Blackwell's famous approachability theorem prescribes a strategy for approachability, i.e., for 'steering' the average cost of a given agent towards a given target set, irrespective of the strategies of the other agents. In this paper, motivated by multi-objective optimization and decision-making problems in dynamically changing environments, we address the approachability problem in Stackelberg stochastic games with vector-valued cost functions. We make two main contributions. First, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's. Second, we give a reinforcement learning algorithm for learning the approachable strategy when the transition kernel is unknown. As a by-product, we also recover Blackwell's necessary and sufficient condition for approachability for convex sets in this setup, and thus a complete characterization. We also give sufficient conditions for non-convex sets.

1508.01308 2026-06-04 cs.CV cs.NA math.HO math.NA math.OC

Collaborative Total Variation: A General Framework for Vectorial TV Models

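Joan Duran, Michael Moeller, Catalina Sbert, Daniel Cremers

AI summary: The paper proposes collaborative total variation (CTV) models, which measure the smoothness of a color image's gradient tensor with different norms along different dimensions. It examines their theoretical properties and practical behavior, and experimentally compares many CTV methods on inverse problems such as denoising, deblurring, and inpainting.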

DOI: 10.1137/15M102873X
Journal ref: SIAM Journal on Imaging Sciences, vol. 9(1), pp. 116-151, 2016

Abstract

Even after over two decades, total variation (TV) remains one of the most popular regularizations for image processing problems and has sparked a tremendous amount of research, particularly on moving from scalar to vector-valued functions. In this paper, we consider the gradient of a color image as a three-dimensional matrix or tensor with dimensions corresponding to the spatial extent, the differences to other pixels, and the spectral channels. The smoothness of this tensor is then measured by taking different norms along the different dimensions. Depending on the type of these norms, one obtains very different properties of the regularization, leading to novel models for color images. We call this class of regularizations collaborative total variation (CTV). On the theoretical side, we characterize the dual norm, the subdifferential, and the proximal mapping of the proposed regularizers. We further prove, with the help of the generalized concept of singular vectors, that an $\ell^{\infty}$ channel coupling makes the most prior assumptions and has the greatest potential to reduce color artifacts. Our practical contributions consist of an extensive experimental section where we compare the performance of a large number of collaborative TV methods for inverse problems such as denoising, deblurring, and inpainting.
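The idea of taking different norms along different dimensions of the gradient tensor can be made concrete with a few lines of code. This is a minimal sketch under assumptions: the nesting order used here (a norm over the derivative directions, then a norm over the color channels, then a sum over pixels) and the two hand-made gradient examples are illustrative choices, not taken from the paper.

```python
import math


def lp_norm(values, p):
    """Plain l^p norm of a sequence, with p = math.inf for the max norm."""
    magnitudes = [abs(v) for v in values]
    if p == math.inf:
        return max(magnitudes)
    return sum(v ** p for v in magnitudes) ** (1.0 / p)


def collaborative_norm(grad, p_deriv, q_chan):
    """grad[pixel][channel] = (dx, dy).

    Applies an l^p norm over derivatives, an l^q norm over channels,
    and sums the result over pixels.
    """
    total = 0.0
    for pixel in grad:
        per_channel = [lp_norm(derivs, p_deriv) for derivs in pixel]
        total += lp_norm(per_channel, q_chan)
    return total


# Hypothetical one-pixel RGB gradients: an edge in red only, and the same
# edge present in all three channels.
edge_in_red_only = [[(3.0, 4.0), (0.0, 0.0), (0.0, 0.0)]]
edge_in_all = [[(3.0, 4.0), (3.0, 4.0), (3.0, 4.0)]]

separate = [collaborative_norm(g, 2, 1) for g in (edge_in_red_only, edge_in_all)]
coupled = [collaborative_norm(g, 2, math.inf) for g in (edge_in_red_only, edge_in_all)]
print(separate, coupled)
```

With an l^1 sum over channels the shared edge costs three times as much as the red-only edge (15 versus 5), whereas with the l^infinity coupling both cost 5. That is one way to read the abstract's remark about channel coupling: an edge that all channels share is not penalized more than an edge in a single channel.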

1606.05124 2026-06-04 cs.RO cs.AI cs.SY eess.SY

Robust Active Perception via Data-association aware Belief Space planning

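Shashank Pathak, Antony Thomas, Asaf Feniger, Vadim Indelman

AI summary: The paper proposes a belief space planning method that incorporates data association reasoning to handle localization uncertainty and perceptually ambiguous environments, and designs new cost functions that improve active disambiguation.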

Abstract

We develop a belief space planning (BSP) approach that advances the state of the art by incorporating reasoning about data association (DA) within planning, while considering additional sources of uncertainty. Existing BSP approaches typically assume data association is given and perfect, an assumption that can be harder to justify when operating, in the presence of localization uncertainty, in ambiguous and perceptually aliased environments. In contrast, our data association aware belief space planning (DA-BSP) approach explicitly reasons about DA within belief evolution, and as such can better accommodate these challenging real-world scenarios. In particular, we show that due to perceptual aliasing, the posterior belief becomes a mixture of probability distribution functions, and we design cost functions that measure the expected level of ambiguity and posterior uncertainty. Using these and standard costs (e.g., control penalty, distance to goal) within the objective function yields a general framework that reliably represents action impact and is, in particular, capable of active disambiguation. Our approach is thus applicable to robust active perception and autonomous navigation in perceptually aliased environments. We demonstrate key aspects in basic and realistic simulations.

1602.02990 2026-06-04 cs.RO cs.LG cs.SY eess.SY

Self-organized control for musculoskeletal robots

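Ralf Der, Georg Martius

AI summary: The paper proposes a self-organized control approach in which a controller with no functionality of its own drives dynamic interaction between robot and environment. It demonstrates self-organized behavior in a muscle-tendon driven arm-shoulder system, including resonance with the dynamics of attached objects.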

Comments 11 pages, 4 figures, 1 table

Abstract

With the accelerated development of robot technologies, optimal control becomes one of the central themes of research. In traditional approaches, the controller, by its internal functionality, finds appropriate actions on the basis of the history of sensor values, guided by the goals, intentions, objectives, learning schemes, and so on planted into it. The idea is that the controller controls the world (the body plus its environment) as reliably as possible. However, in elastically actuated robots this approach faces severe difficulties. This paper advocates a new paradigm of self-organized control. The paper presents a solution with a controller that is devoid of any functionalities of its own, given by a fixed, explicit, and context-free function of the recent history of the sensor values. When applying this controller to a muscle-tendon driven arm-shoulder system from the Myorobotics toolkit, we observe a vast variety of self-organized behavior patterns: when left alone, the arm realizes pseudo-random sequences of different poses, but one can also manipulate the system into definite motion patterns. Most interestingly, after an object is attached, the controller gets into a functional resonance with the object's internal dynamics: when given a half-filled bottle, the system spontaneously starts shaking the bottle so that maximum response from the dynamics of the water is generated. After a pendulum is attached to the arm, the controller drives the pendulum into a circular mode. In this way, the robot discovers dynamical affordances of the objects its body is interacting with. We also discuss perspectives for using this controller paradigm for intention-driven behavior generation.

1411.0181 2026-06-04 cs.RO cs.SY eess.SY

Restricted Discrete Invariance and Self-Synchronization For Stable Walking of Bipedal Robots

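Hamed Razavi, Anthony M. Bloch, Christine Chevallereau, J. W. Grizzle

AI summary: The paper studies the invariance of a low-dimensional submanifold for stable bipedal walking and introduces the notion of self-synchronization. The analysis starts from the 3D Linear Inverted Pendulum model and is extended to a 9-DOF biped to verify the feasibility of asymptotically stable walking.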

Comments Conference

Abstract

Models of bipedal locomotion are hybrid, with a continuous component often generated by a Lagrangian plus actuators, and a discrete component where leg transfer takes place. The discrete component typically consists of a locally embedded co-dimension one submanifold in the continuous state space of the robot, called the switching surface, and a reset map that provides a new initial condition when a solution of the continuous component intersects the switching surface. The aim of this paper is to identify a low-dimensional submanifold of the switching surface which, when it can be rendered invariant by the closed-loop dynamics, leads to asymptotically stable periodic gaits. The paper begins this process by studying the well-known 3D Linear Inverted Pendulum (LIP) model, where analytical results are much easier to obtain. A key contribution here is the notion of self-synchronization, which refers to the periods of the pendular motions in the sagittal and frontal planes tending to a common period. The notion of invariance resulting from the study of the 3D LIP model is then extended to a 9-DOF 3D biped. A numerical study is performed to illustrate that asymptotically stable walking may be obtained.
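Part of the reason analytical results are easier for the LIP model is that its continuous phase has a closed-form solution. The sketch below assumes the textbook constant-height LIP, in which each horizontal axis obeys x'' = (g / z0) x; the height and the initial conditions are made-up values, and the switching surface, reset map, and self-synchronization analysis of the paper are not modeled.

```python
import math

G = 9.81                      # gravitational acceleration, m/s^2
Z0 = 0.8                      # assumed constant center-of-mass height, m
OMEGA = math.sqrt(G / Z0)


def lip_state(x0, v0, t):
    """Closed-form solution of x'' = OMEGA**2 * x for one horizontal axis."""
    c, s = math.cosh(OMEGA * t), math.sinh(OMEGA * t)
    position = x0 * c + (v0 / OMEGA) * s
    velocity = x0 * OMEGA * s + v0 * c
    return position, velocity


def orbital_energy(x, v):
    """Quantity conserved along LIP trajectories within a single step."""
    return 0.5 * v * v - 0.5 * OMEGA ** 2 * x * x


# Hypothetical initial conditions for the sagittal (x) and frontal (y) axes,
# which evolve independently during the continuous phase.
times = (0.0, 0.1, 0.2, 0.3)
sagittal = [lip_state(-0.10, 0.60, t) for t in times]
frontal = [lip_state(0.05, -0.20, t) for t in times]
energies_x = [orbital_energy(x, v) for x, v in sagittal]
energies_y = [orbital_energy(y, v) for y, v in frontal]
print(energies_x, energies_y)
```

The orbital energy stays constant along each axis during a step, which is the kind of closed-form invariant that makes the step-to-step behavior of the LIP tractable by hand.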

1606.01949 2026-06-04 cs.AI cs.SY eess.SY

Assisted Energy Management in Smart Microgrids

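Andrea Monacchi, Wilfried Elmenreich

AI summary: The paper studies how forward contracts can mitigate service interruptions caused by competing demand. It designs a policy-based broker and a learning broker implemented with neural networks, aiming to reduce compensation costs and increase overall profit.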

1605.09049 2026-06-04 cs.LG cs.NA math.NA stat.ML

Recycling Randomness with Structure for Sublinear time Kernel Expansions

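Krzysztof Choromanski, Vikas Sindhwani

AI summary: The paper proposes approximating a variety of kernel functions with structured matrices, extending the Fastfood construction, and supports the effectiveness of structured matrices for improving kernel methods with theoretical analysis and experiments.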

1605.08257 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML

Low-rank tensor completion: a Riemannian manifold preconditioning approach

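Hiroyuki Kasai, Bamdev Mishra

AI summary: The paper proposes a Riemannian manifold preconditioning approach for tensor completion with a rank constraint. A new Riemannian metric exploits the least-squares structure and the symmetry of the Tucker decomposition, leading to preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms. Experiments show that it outperforms existing methods across different datasets.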

Comments The 33rd International Conference on Machine Learning (ICML 2016). arXiv admin note: substantial text overlap with arXiv:1506.02159

1511.03722 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ME stat.ML

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning

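Nan Jiang, Lihong Li

AI summary: The paper proposes a doubly robust estimator for off-policy value evaluation that combines unbiasedness with low variance, and validates it on benchmark problems.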

Comments 14 pages; 4 figures; ICML 2016

1605.04368 2026-06-04 cs.RO cs.SY eess.SY

Decentralized Autonomous Navigation Strategies for Multi-Robot Search and Rescue

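Ahmad Baranzadeh

AI summary: The paper proposes three algorithms based on a triangular grid pattern for multi-robot search, proves their convergence mathematically, and validates them through simulation and experiments. It also addresses decentralized formation building with obstacle avoidance.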

Comments arXiv admin note: substantial text overlap with arXiv:1402.5188 by other authors

Abstract

In this report, we try to improve the performance of existing approaches for search operations in a multi-robot context. We propose three novel algorithms that use a triangular grid pattern, i.e., robots are guaranteed to pass through the vertices of a triangular grid during the search procedure. The main advantage of using a triangular grid pattern is that it is asymptotically optimal in terms of the minimum number of robots required for the complete coverage of an arbitrary bounded area. We use a new topological map which is built and shared by the robots during the search operation. We consider an area of arbitrary shape that is unknown to the robots a priori and contains some obstacles. Unlike many current heuristic algorithms, we give mathematical proofs of convergence of the algorithms. The computer simulation results for the proposed algorithms are presented using a simulator of real robots and environment. We evaluate the performance of the algorithms via experiments with real robots. We compare the performance of our own algorithms with three existing algorithms from other researchers. The results demonstrate the merits of our proposed solution. A further study on formation building with obstacle avoidance for a team of mobile robots is presented in this report. We propose a decentralized formation building algorithm with obstacle avoidance for a group of mobile robots to move in a defined geometric configuration. Furthermore, we consider a more complicated formation problem with a group of anonymous robots; these robots are not aware of their position in the final configuration and need to reach a consensus during the formation process. We propose a randomized algorithm for the anonymous robots that achieves convergence to a desired configuration with probability 1. We also propose a novel obstacle avoidance rule, used in the formation building algorithm.
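The coverage property of a triangular grid is easy to check numerically. The sketch below is a hedged illustration, not one of the report's three algorithms: it assumes each robot senses a disk of radius r and that grid vertices are spaced sqrt(3) * r apart (the classical spacing for covering the plane with equal disks), and the area size and sample resolution are arbitrary.

```python
import math


def triangular_grid(width, height, sensing_radius):
    """Vertices of a triangular grid that overlaps a width x height rectangle."""
    side = math.sqrt(3.0) * sensing_radius        # assumed vertex spacing
    row_height = side * math.sqrt(3.0) / 2.0
    rows = int(math.ceil(height / row_height)) + 2
    cols = int(math.ceil(width / side)) + 2
    vertices = []
    for j in range(-1, rows):
        offset = 0.5 * side if j % 2 else 0.0     # odd rows are shifted by half a side
        for i in range(-1, cols):
            vertices.append((i * side + offset, j * row_height))
    return vertices


def is_covered(point, vertices, sensing_radius, tol=1e-9):
    """True if some vertex lies within the sensing radius of the point."""
    return any(math.dist(point, v) <= sensing_radius + tol for v in vertices)


R = 1.0
vertices = triangular_grid(10.0, 6.0, R)
samples = [(0.25 * i, 0.25 * j) for i in range(41) for j in range(25)]
all_covered = all(is_covered(p, vertices, R) for p in samples)

# Centroid of one grid triangle: the point farthest from every vertex.
side = math.sqrt(3.0) * R
centroid = (0.5 * side, side * math.sqrt(3.0) / 6.0)
print(all_covered, is_covered(centroid, vertices, 0.99 * R))
```

Every sample point is within the sensing radius of some vertex, while a slightly smaller radius leaves the triangle centroids uncovered. This shows that the spacing is tight: visiting the vertices covers the area with no slack to spare.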

1605.00716 2026-06-04 cs.LG cs.NI cs.SY eess.SY

Radio Transformer Networks: Attention Models for Learning to Synchronize in Wireless Systems

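Timothy J O'Shea, Latha Pemula, Dhruv Batra, T. Charles Clancy

AI summary: The paper builds on spatial transformer networks and new transformations adapted to the radio domain, introducing learned attention models to improve modulation recognition accuracy. Signal synchronization and normalization are achieved by optimizing classification accuracy, sparse representation, and regularization.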

Comments 5 pages, 8 figures

1604.08768 2026-06-04 cs.AI cs.SY eess.SY

Supervisory Control for Behavior Composition

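Paolo Felli, Nitin Yadav, Sebastian Sardina

AI summary: The paper connects the behavior composition synthesis task in AI with supervisory control theory from the field of discrete event systems. A target module is realized by coordinating the available behaviors, drawing on the theoretical foundations and tools of discrete event systems.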

1604.06558 2026-06-04 cs.RO cs.SY eess.SY

Folding Assembly by Means of Dual-Arm Robotic Manipulation

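Diogo Almeida, Yiannis Karayiannidis

AI summary: The paper proposes a folding assembly primitive for dual-arm manipulators, intended for use in higher-level assembly strategies. Experiments validate the feasibility of the approach when the two parts are in contact.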

Comments 7 pages, accepted for ICRA 2016

1504.07259 2026-06-04 cs.CV cs.NA math.AP math.NA

Image Segmentation and Restoration Using Parametric Contours With Free Endpoints

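Heike Benninghoff, Harald Garcke

AI summary: The paper proposes a new active contour method with free endpoints, performing image segmentation and restoration by discretizing the Mumford-Shah functional. It combines an evolution law for the curve in the normal direction with one for the endpoints in the tangential direction, and uses parametric contours with edge-preserving denoising for fast segmentation and restoration.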

1604.05064 2026-06-04 cs.RO cs.SY eess.SY

An Approximation Algorithm for a Shortest Dubins Path Problem

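Sivakumar Rathinam, Pramod Khargonekar

AI summary: The paper proposes an improved approximation algorithm that lowers the guaranteed approximation factor for the shortest Dubins path problem from 3.04 to 2.04, and demonstrates the method's effectiveness experimentally.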

1604.03912 2026-06-04 cs.AI cs.LG cs.SY eess.SY stat.ML

Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics

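Michael Herman, Tobias Gindele, Jörg Wagner, Felix Schmitt, Wolfram Burgard

AI summary: The paper proposes a gradient-based inverse reinforcement learning method that estimates the system dynamics and the reward function simultaneously, improving sample efficiency and estimation accuracy.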

Comments accepted to appear in AISTATS 2016

Abstract

Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy, and MDP policies depend on both the stochastic system dynamics and the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL approaches assume that, if the transition model is unknown, additional samples from the system's dynamics are accessible, or that the observed behavior provides enough samples of the system's dynamics to solve the inverse problem accurately. These assumptions are often not satisfied. To overcome this, we present a gradient-based IRL approach that simultaneously estimates the system's dynamics. By solving the combined optimization problem, our approach takes into account the bias of the demonstrations, which stems from the generating policy. The evaluation on a synthetic MDP and a transfer learning task shows improvements in sample efficiency as well as in the accuracy of the estimated reward functions and transition models.

1604.02930 2026-06-04 cs.RO cs.SY eess.SY

Implementation of haptic communication in comanipulative tasks: a statistical state machine model

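Lucas Roche, Ludovic Saint-Bauzel

AI summary: Through robotic arm experiments under lightweight conditions, the paper examines time-based communication mechanisms in physical human-robot interaction, proposes a statistical state machine model, and verifies that its performance is close to that of human interaction.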

1604.02080 2026-06-04 cs.AI cs.SY eess.SY

Planning with Information-Processing Constraints and Model Uncertainty in Markov Decision Processes

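Jordi Grau-Moya, Felix Leibfried, Tim Genewein, Daniel A. Braun

AI summary: The paper proposes a planning method for Markov decision processes that accounts for model uncertainty, treats information-processing constraints in a unified way through information-theoretic principles, derives a value iteration scheme from a generalized variational principle, and validates it in grid-world simulations.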

Comments 16 pages, 3 figures

1402.4893 2026-06-04 cs.CV cs.NA math.NA

Anisotropic Mesh Adaptation for Image Representation

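Xianping Li

AI summary: The paper proposes GPRAMA, a method based on anisotropic mesh adaptation that uses an improved mesh patching technique to achieve higher-quality image representation at lower computational cost.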

Comments 25 pages, 15 figures

1603.08497 2026-06-04 cs.CV cs.NA math.NA

On distances, paths and connections for hyperspectral image segmentation

Guillaume Noyel, Jesus Angulo, Dominique Jeulin

AI summary: The paper proposes η and η connections to add regional information to λ-flat zones, achieving finer segmentation through a top-down approach.

1410.7632 2026-06-04 cs.CV cs.RO cs.SY eess.SY

On the Covariance of ICP-based Scan-matching Techniques

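Silvère Bonnabel, Martin Barczyk, François Goulette

AI summary: The paper studies the problem of computing the covariance of the rotational transformation estimated by the ICP algorithm, points out that applying it to the point-to-point version of ICP yields an incorrect covariance, and verifies by mathematical proof that the point-to-plane version is correct.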

Comments Accepted at 2016 American Control Conference

1506.09016 2026-06-04 cs.LG cs.CV cs.NA math.NA math.OC stat.ML

Online Learning to Sample

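Guillaume Bouchard, Théo Trouillon, Julien Perez, Adrien Gaidon

AI summary: The paper proposes the AW-SGD algorithm, which learns the sampling strategy online to make online optimization more efficient, with applications to image classification, matrix factorization, and reinforcement learning.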

Comments Update: removed convergence theorem and proof as there is an error. Submitted to UAI 2016

Abstract

Stochastic Gradient Descent (SGD) is one of the most widely used techniques for online optimization in machine learning. In this work, we accelerate SGD by adaptively learning how to sample the most useful training examples at each time step. First, we show that SGD can be used to learn the best possible sampling distribution of an importance sampling estimator. Second, we show that the sampling distribution of an SGD algorithm can be estimated online by incrementally minimizing the variance of the gradient. The resulting algorithm, called Adaptive Weighted SGD (AW-SGD), maintains a set of parameters to optimize, as well as a set of parameters to sample learning examples. We show that AW-SGD yields faster convergence in three different applications: (i) image classification with deep features, where the sampling of images depends on their labels, (ii) matrix factorization, where rows and columns are not sampled uniformly, and (iii) reinforcement learning, where the optimized and exploration policies are estimated at the same time and our approach corresponds to an off-policy gradient algorithm.
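The variance argument behind learning a sampling distribution can be verified without any randomness. The sketch below is only the target that such a method would aim for, not AW-SGD itself: it assumes scalar, positive per-example gradients (made-up numbers) and computes the exact mean and variance of the importance-weighted estimate g_i / (N q_i) under two fixed sampling distributions.

```python
def estimator_mean(grads, probs):
    """Exact expectation of g_i / (N * q_i) when index i is drawn with probability q_i."""
    n = len(grads)
    return sum(q * (g / (n * q)) for g, q in zip(grads, probs))


def estimator_variance(grads, probs):
    """Exact variance of the same importance-weighted estimate."""
    n = len(grads)
    second_moment = sum(q * (g / (n * q)) ** 2 for g, q in zip(grads, probs))
    return second_moment - estimator_mean(grads, probs) ** 2


# Hypothetical per-example gradients: one example dominates.
grads = [0.1, 0.2, 0.3, 5.0]
n = len(grads)
true_mean = sum(grads) / n

uniform = [1.0 / n] * n
total = sum(abs(g) for g in grads)
proportional = [abs(g) / total for g in grads]   # sample large gradients more often

var_uniform = estimator_variance(grads, uniform)
var_proportional = estimator_variance(grads, proportional)
print(var_uniform, var_proportional)
```

Both distributions give an unbiased estimate of the average gradient, but sampling in proportion to gradient magnitude drives the variance to essentially zero in this toy case, while uniform sampling leaves a large variance caused by the rare dominant example.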
