2606.05139 2026-06-04 cs.LG Version update

BBOmix: A Tabular Benchmark for Hyperparameter Optimization of Unsupervised Biological Representation Learning

Luca Thale-Bombien, Jan Ewald, Ralf König, Aaron Klein

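Affiliations: Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI); Leipzig University; ELLIS Institute

AI summary: For omics data produced by high-throughput sequencing, the paper introduces BBOmix, described as the first open-source tabular benchmark for hyperparameter optimization of unsupervised representation learning, with 105,000 evaluations spanning four autoencoder architectures and seven multi-omics modalities.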

Abstract

The rapid advancement of high-throughput sequencing has led to large, high-dimensional omics datasets. Deep unsupervised learning architectures, particularly Autoencoders (AEs), are increasingly used for dimensionality reduction and representation learning in this domain. However, AEs are highly sensitive to architectural choices and hyperparameters, and unsupervised optimization typically relies on reconstruction loss, which may be a poor proxy for downstream utility. Exhaustive hyperparameter optimization (HPO) is computationally expensive, leading researchers to frequently rely on suboptimal default configurations. To democratize access to large-scale unsupervised HPO research, we introduce BBOmix, the first open-source tabular benchmark for unsupervised representation learning on real-world biological data. Our benchmark includes 105,000 evaluations across four AE architectures and seven multi-omics modalities from the TCGA and SCHC datasets. We quantify the correlation between reconstruction loss and downstream task performance and provide an extensive evaluation of state-of-the-art single-fidelity, multi-fidelity, and transfer learning HPO methods, establishing a rigorous baseline for future research in unsupervised biological representation learning.
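
The question of whether reconstruction loss is a good proxy for downstream utility can be probed with a rank correlation over rows of a tabular benchmark. The following is a minimal sketch, not the benchmark's own tooling: the field names `recon_loss` and `downstream_score` and all numbers are made up for illustration.

```python
def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)


# Made-up rows standing in for benchmark entries (not real BBOmix data).
rows = [
    {"recon_loss": 0.10, "downstream_score": 0.70},
    {"recon_loss": 0.20, "downstream_score": 0.75},
    {"recon_loss": 0.30, "downstream_score": 0.60},
    {"recon_loss": 0.40, "downstream_score": 0.55},
]
rho = spearman([r["recon_loss"] for r in rows],
               [r["downstream_score"] for r in rows])
print(f"Spearman rho between loss and downstream score: {rho:.2f}")
```

A value near -1 would mean that lower loss reliably goes with a higher downstream score; values closer to 0 would suggest that the loss is a weak proxy.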

2606.05138 2026-06-04 cs.LG q-fin.ST Version update

Toward Efficient and Evidence-Based Mobility Prediction with LLM-Driven Agents

Linyao Chen, Qinlao Zhao, Zechen Li, Mingming Li, Likun Ni, Jinyu Chen, Yuhao Yao, Xuan Song, Noboru Koshizuka, Hiroki Kobayashi

Affiliations: The University of Tokyo; Huazhong University of Science and Technology; University of New South Wales, Sydney; LocationMind Inc.; Southern University of Science and Technology; Jilin University

AI summary: Proposes AgentMob, a training-free LLM-driven agent framework that resolves ambiguous cases in mobility prediction through adaptive evidence gathering, achieving the best performance on multiple datasets.

Abstract

Individual-level mobility prediction is central to urban simulation, transportation planning, and policy analysis. Supervised sequence models achieve strong accuracy but require task-specific training and offer limited decision-level transparency. Recent LLM-based methods improve interpretability, yet mostly rely on static prompts and single-pass inference, limiting their ability to seek additional evidence when mobility signals are weak or conflicting. We propose AgentMob, a training-free LLM-driven agent framework that formulates next-location prediction as adaptive evidence-controlled decision making. AgentMob resolves routine cases through a fast path based on historical regularity, while ambiguous cases trigger iterative tool use over recent trajectories, historical behavior, stay-move likelihood, and geographical evidence. Across three mobility datasets, AgentMob achieves the strongest overall performance among training-free LLM-based methods, with GPT-5.4 reaching 71.42% Acc@1 on BW, 33.14% on YJMob100K, and 33.50% on Shanghai ISP. On BW non-fast-path cases, the LLM controller improves Acc@1 from 30.65% to 48.62% over a same-tool statistical baseline, showing that its main benefit lies in resolving ambiguous predictions through adaptive evidence gathering. Our code is available at https://github.com/Unknown-zoo/AgentMob.
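
The split between a fast path and evidence gathering can be pictured as a simple routing rule. The sketch below is an illustration only and is not taken from the AgentMob repository: the 0.8 regularity threshold, the function name `predict_next`, and the `fallback` callback that stands in for the LLM tool loop are all assumptions.

```python
from collections import Counter


def predict_next(history, current, threshold=0.8, fallback=None):
    """Route a next-location query to a fast path or to an evidence-gathering fallback.

    history: visited location IDs in time order.
    threshold: assumed minimum share of past transitions needed for the fast path.
    fallback: hypothetical callable standing in for iterative tool use.
    """
    # Count which locations followed `current` in the past.
    followers = Counter(b for a, b in zip(history, history[1:]) if a == current)
    total = sum(followers.values())
    if total:
        location, count = followers.most_common(1)[0]
        if count / total >= threshold:
            return location, "fast_path"
    # Ambiguous case: defer to the (hypothetical) evidence-gathering routine.
    answer = fallback(history, current) if fallback else None
    return answer, "tool_path"


history = ["home", "work", "home", "work", "home", "gym"]
print(predict_next(history, "home", threshold=0.6))
print(predict_next(history, "home", threshold=0.8, fallback=lambda h, c: "gym"))
```

With the lower threshold the dominant transition is accepted directly; with the higher one the same query is treated as ambiguous and handed to the fallback.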

2606.05129 2026-06-04 cs.CR cs.LG Version update

Preserving Data Privacy in Learning Causal Structure with Fully Homomorphic Encryption

Jian Yang, Yuan Tong, Qinbin Li, Zeyi Wen, Xiaofang Zhou

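Affiliations: Hong Kong University of Science and Technology (Guangzhou); Hong Kong University of Science and Technology; University of California, Berkeley

AI summary: To address privacy leakage in distributed causal structure learning, the paper proposes a method based on fully homomorphic encryption that uses circuit simplification, approximations of division and logarithm, and SIMD batching to learn causal structure efficiently on encrypted data, and that can be extended to differential privacy.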

Abstract

Preserving data privacy is an important topic in structural data management and data mining. However, privacy leakage in distributed causal structure learning is a persistent challenge, especially when data transmission and computation are required. In this paper, we propose a method based on fully homomorphic encryption (FHE) that performs calculations on ciphertexts, keeping data encrypted in transit and during computation. Nevertheless, applying FHE to causal structure learning is challenging because of the high computation cost and the limited support for division and logarithm operations in FHE. To tackle this challenge, we propose a series of novel techniques including (i) circuit simplification for better efficiency, (ii) approximation of division and logarithm through the Newton-Raphson reciprocal and Taylor expansion, and (iii) a batching technique with SIMD acceleration to speed up the whole learning process. Additionally, our method can be easily extended beyond FHE: we demonstrate its portability by supporting differential privacy. Empirical results show that our method recovers causal structures that are highly consistent with and comparable to those of the plaintext version on the datasets tested. Finally, our method is efficient and practical, completing causal structure learning in tens of minutes even under the privacy protection of FHE.
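
Because FHE schemes evaluate additions and multiplications, division and logarithm have to be rewritten in those terms. The sketch below shows the two textbook approximations named in the abstract on plaintext floats; it is not the paper's encrypted circuit, and the initial guess, iteration count, and number of Taylor terms are assumed values.

```python
def nr_reciprocal(a, x0, iterations=6):
    """Approximate 1/a using only multiplication and subtraction.

    Newton-Raphson update: x <- x * (2 - a * x).
    Converges when the initial guess satisfies 0 < a * x0 < 2.
    """
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - a * x)
    return x


def taylor_log1p(t, terms=8):
    """Approximate ln(1 + t) for |t| < 1 with the series t - t^2/2 + t^3/3 - ...

    The coefficients 1/k are public constants, so no division on data is needed.
    """
    total, power = 0.0, 1.0
    for k in range(1, terms + 1):
        power = power * t
        coeff = (1.0 / k) if k % 2 == 1 else (-1.0 / k)
        total = total + coeff * power
    return total


inv = nr_reciprocal(4.0, x0=0.1)   # approximates 1 / 4
log_val = taylor_log1p(0.2)        # approximates ln(1.2)
print(inv, log_val)
```

The error of the reciprocal squares at every iteration, so a handful of multiplications is enough once the input is scaled into the convergence range; the multiplicative depth this costs is the main design trade-off under FHE.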

2606.05124 2026-06-04 cs.GR cs.CV cs.LG Version update

Geometry Gaussians: Decoupling Appearance and Geometry in Gaussian Splatting

Hongyu Zhou, Zorah Lähner

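Affiliations: University of Bonn; Lamarr Institute

AI summary: To address the conflict between geometric representation and appearance rendering in 3D Gaussian Splatting, the paper adds a geometry opacity parameter to each splat, together with a transparency-aware optimization pipeline, decoupling geometry from appearance and improving rendering and geometry performance on complex scenes, especially those with transparent objects.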

Abstract

After the success of 3D Gaussian Splatting (3DGS) for novel view synthesis, many works have explored how to also use it for geometric surface representation. However, extracting accurate geometric information directly from 3DGS remains challenging and can often reduce the appearance rendering quality. In this work, by training with complete ground-truth texture and geometry information, we show that 3DGS in its default form is inherently unsuited to representing texture and geometry at the same time. We also propose a simple solution by applying a single additional geometry opacity parameter to each splat, together with an optional transparency-curated optimization pipeline. Our experiments, both with ground-truth and vision foundation model geometric input, show that this change leads to improved rendering and geometry performance on a wide variety of datasets, and that complex scenes with transparent objects in particular benefit significantly from our method.
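
One way to picture a per-splat geometry opacity is front-to-back alpha compositing in which color is blended with the usual appearance opacity while depth is blended with a separate geometry opacity. The sketch below is an assumption about how such a decoupling could look along a single ray, with scalar colors and hypothetical field order; it is not the authors' rasterizer.

```python
def composite(splats):
    """Front-to-back compositing along one ray.

    splats: tuples (depth, color, appearance_opacity, geometry_opacity),
            sorted from nearest to farthest.
    Returns the blended color and the blended depth.
    """
    color, depth = 0.0, 0.0
    t_app, t_geo = 1.0, 1.0  # remaining transmittance for appearance and geometry
    for d, c, a_app, a_geo in splats:
        color += t_app * a_app * c
        depth += t_geo * a_geo * d
        t_app *= 1.0 - a_app
        t_geo *= 1.0 - a_geo
    return color, depth


# A glass pane (almost transparent in appearance, solid in geometry) in front of a wall.
ray = [(1.0, 0.9, 0.1, 1.0), (5.0, 0.2, 1.0, 1.0)]
color, depth = composite(ray)
print(color, depth)
```

In this toy ray the color is dominated by the wall behind the glass, while the depth stops at the glass surface, which a single shared opacity could not express.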

2606.05116 2026-06-04 cs.LG Version update

Graph Set Transformer

Jose E. Escrig Molina, Baoquan Chen, Daniel Probst

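Affiliations: Bioinformatics Group, Wageningen University; Department of Physics, Technical University of Munich

AI summary: Proposes the Graph Set Transformer (GST), which interleaves node-level feature propagation with cross-graph context modeling at every layer to fuse local structure and set-level context in learning tasks over sets of graphs, outperforming baselines on synthetic and real-world benchmarks.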

Comments 10 pages, 1 figure, conference

Abstract

We introduce the Graph Set Transformer (GST), a neural network architecture for learning on sets of graphs, designed for tasks in which per-element predictions depend on set-wide context as well as local structure. Existing architectures, including DeepSets and SetTransformer, require pre-encoded graph embeddings from a separate GNN, creating a bottleneck between feature extraction and set-level contextualisation. In contrast, GST interleaves node-level feature propagation and cross-graph contextual modelling at every layer, fusing the two levels of information through a gating mechanism. We evaluate GST on a controlled synthetic suite designed to isolate set-conditional structural reasoning and on three real-data benchmarks spanning per-atom reaction-centre identification, reaction yield prediction, and image classification. Under matched parameter budgets, GST performs better than the baselines across these settings. An architectural ablation strongly suggests that the interleaving of local and set context contributes substantially to this advantage.
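
The gating mechanism that fuses local and set-level information can be illustrated with a per-feature convex blend. This is a generic gated-fusion sketch, not the GST layer itself: in a trained model the gate logits would come from learned parameters, and which input the gate favors is an assumption here.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(local, context, gate_logits):
    """Blend node-level features with set-level context, one gate per feature.

    A gate near 1 keeps the local feature; a gate near 0 takes the context.
    """
    fused = []
    for l, c, z in zip(local, context, gate_logits):
        g = sigmoid(z)
        fused.append(g * l + (1.0 - g) * c)
    return fused


print(gated_fusion([1.0, 1.0], [3.0, 3.0], [0.0, 10.0]))
```

A logit of 0 averages the two sources, while a large positive logit lets the local feature pass through almost unchanged.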

2606.05109 2026-06-04 cs.LG Version update

Fast and Faithful Function Vectors

Minh An Pham, Anton Segeler, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin, Patrick Kahardipraja, Reduan Achtibat

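Affiliations: GitHub; arXiv

AI summary: This work improves the efficiency and accuracy of function vectors (FVs) by refining attention-head selection and distributed steering with gradient-based layer-wise relevance propagation (LRP), enabling fast and faithful steering of large language models (LLMs).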

2606.05073 2026-06-04 cs.LG Version update

Graph Cascades: Contagion-Based Mesoscopic Rewiring for Structure-Aware Graph Machine Learning

Meher Chaitanya, My Le, Luana Ruiz

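Affiliations: KTH Royal Institute of Technology; Johns Hopkins University

AI summary: Proposes Graph Cascades, a mesoscopic rewiring strategy based on contagion diffusion that builds an auxiliary graph to help graph neural networks and graph transformers capture intermediate-scale structure. It improves several backbones on node classification, and the paper theoretically characterizes the conditions under which rewiring helps.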

Abstract

We introduce Graph Cascades, a mesoscopic rewiring strategy for Graph Neural Networks (GNNs) and Graph Transformers (GTs) that captures intermediate-scale graph structure beyond purely local edges or fully global attention. Using contagion-based diffusion processes, Graph Cascades constructs, in O(|V|+|E|) time, an auxiliary graph where node pairs supported by repeated multi-hop reinforcement are promoted to direct neighbors. We theoretically characterize when reinforcement-based rewiring helps: sufficient conditions under which reinforcement-based edge selection is more label-aligned than direct adjacency, an SBM witness in which two-hop reinforcement is perfectly homophilic, and a formalization of mesoscopic connectivity via graph effective resistance. Empirically, across node-classification benchmarks, Graph Cascades improves multiple GNN and sparse-GT backbones, with the most reliable gains observed on heterophilic and moderate- to high-degree homophilic graphs. The theoretical conditions also identify regimes where mesoscopic rewiring is unlikely to be beneficial -- low-degree regular graphs and graphs with structural bottlenecks -- and these predictions match the observed failures. We additionally observe tight correlations between performance and structural properties in the rewired graphs.
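
The idea of promoting node pairs that are supported by repeated multi-hop reinforcement can be illustrated with a much simpler stand-in: counting common neighbors and adding an edge when a non-adjacent pair is supported by enough two-hop paths. This sketch is not the contagion-based procedure from the abstract and does not run in O(|V|+|E|) time; the `min_support` threshold is an assumed parameter.

```python
from collections import defaultdict
from itertools import combinations


def reinforced_pairs(edges, min_support=2):
    """Return non-adjacent node pairs connected by at least `min_support` two-hop paths."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    support = defaultdict(int)
    # Every pair of neighbors of w is linked by one two-hop path through w.
    for w in adj:
        for u, v in combinations(sorted(adj[w]), 2):
            support[(u, v)] += 1
    return sorted(
        pair for pair, count in support.items()
        if count >= min_support and pair[1] not in adj[pair[0]]
    )


edges = [(0, 1), (0, 2), (3, 1), (3, 2), (3, 4)]
print(reinforced_pairs(edges))
```

In this toy graph nodes 0 and 3 share two neighbors, as do nodes 1 and 2, so both pairs would become direct neighbors in the auxiliary graph.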

2606.05045 2026-06-04 math.DS cs.LG Version update

Learning Control-Affine Reduced-Order Models via Autoencoders

Ali Mjalled, Martin Mönnigmann

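Affiliations: Automatic Control and Systems Theory, Ruhr-Universität Bochum

AI summary: Proposes a framework that uses autoencoders to learn a reduced latent space and control-affine state-space dynamics simultaneously, extends it to a sequence-based model to improve prediction accuracy, and validates it through feedback linearization.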

Abstract

We present in this paper a framework for the identification of control-affine reduced-order models (ROMs). The proposed method utilizes autoencoders (AEs) to transform the high-dimensional states, and potentially the high-dimensional inputs, into reduced latent ones suitable for control-affine state-space dynamics. This is achieved by simultaneous training of the AE and the state-space model. In addition, we extend the discrete ROM formulation to a sequence-based model, which processes state and input histories to improve prediction accuracy while preserving the control-affine structure. We motivate our framework by applying feedback linearization to the derived models, and we present guidelines for its efficient use. The proposed framework is assessed on two numerical examples and its performance is compared to a baseline model, where the AE identifies a latent space with linear state-space dynamics. The assessment involves evaluating the prediction accuracy of the ROM on test data and its effectiveness in controlling the system to a desired state or trajectory.
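
The appeal of a control-affine latent model is that the input enters linearly, so it can be solved for directly. For a scalar discrete-time model z_{k+1} = f(z_k) + g(z_k) u_k, choosing u_k = (v_k - f(z_k)) / g(z_k) yields z_{k+1} = v_k. The sketch below uses hand-written stand-ins for f and g rather than learned networks, and the closed-loop gain of 0.5 is an assumption.

```python
import math


def f(z):
    """Hypothetical drift term of the latent dynamics."""
    return 0.9 * z + 0.1 * math.sin(z)


def g(z):
    """Hypothetical input gain; strictly positive, so it can be inverted."""
    return 1.0 + 0.5 * z * z


def step(z, u):
    """Control-affine update: the input u enters linearly."""
    return f(z) + g(z) * u


def linearizing_input(z, v):
    """Input that cancels the nonlinearity so that the next state equals v."""
    return (v - f(z)) / g(z)


z, target = 2.0, 0.5
trajectory = [z]
for _ in range(5):
    v = z + 0.5 * (target - z)  # desired linear closed loop: halve the error each step
    z = step(z, linearizing_input(z, v))
    trajectory.append(z)
print(trajectory)
```

Each step halves the distance to the target even though f and g are nonlinear, which is the behavior feedback linearization is meant to deliver when the model is accurate.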

2606.05042 2026-06-04 cs.LG cs.CL cs.SC Version update

In-Context Graphical Inference

Zehua Cheng, Wei Dai, Jiahao Sun

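Affiliations: Department of Computer Science, University of Oxford; FLock.io

AI summary: Proposes ICG-I, an autoregressive Graph Transformer that mimics variable elimination and uses Tensor-Train compression and weighted conformal prediction to deliver scalable, calibrated marginal inference in discrete graphical models, reaching state-of-the-art performance on standard instances and frustrated spin glasses.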

Comments 19 Pages

Abstract

Marginal inference in discrete graphical models forces a choice between exactness and scalability: exact algorithms are intractable for high-treewidth graphs, while iterative approximations (Belief Propagation, variational methods) sacrifice convergence guarantees on frustrated topologies. We argue that this dichotomy stems from a mismatched inductive bias: iterative methods abandon the sequential elimination structure that makes exact inference correct. We introduce In-Context Graphical Inference (ICG-I), an autoregressive Graph Transformer that restores this structure by mimicking Variable Elimination with learned, Tensor-Train-compressed intermediate factors, paired with a Dirichlet output layer and Weighted Conformal Prediction for calibrated, distribution-free coverage guarantees under topological shift. We prove that TT compression errors propagate at most linearly through the autoregressive chain, that the Dirichlet-Multinomial loss is a proper scoring rule, and that WCP maintains coverage with a quantifiable degradation under estimated density ratios. We conducted intensive experiments to evaluate ICG-I and achieved state-of-the-art performance across all benchmarks. ICG-I reduces MAE from 0.041 (best baseline) to 0.020 on standard instances and achieves 0.048 on N=500 frustrated spin glasses where BP diverges entirely.
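
The sequential elimination structure that the abstract refers to is easiest to see on a chain, where variables are summed out one at a time and each step passes an intermediate factor to the next. The sketch below is classical exact variable elimination on a small chain with made-up factor values, checked against brute-force enumeration; it involves no learning and no Tensor-Train compression.

```python
from itertools import product


def chain_marginal(unary, pairwise):
    """Marginal of the last variable of a chain model via variable elimination.

    unary[i][s] is the factor of variable i in state s;
    pairwise[i][s][t] couples variable i (state s) with variable i + 1 (state t).
    """
    message = list(unary[0])
    for i, psi in enumerate(pairwise):
        # Sum out variable i; the result is an intermediate factor over variable i + 1.
        message = [
            unary[i + 1][t] * sum(message[s] * psi[s][t] for s in range(len(message)))
            for t in range(len(unary[i + 1]))
        ]
    z = sum(message)
    return [m / z for m in message]


def brute_force_marginal(unary, pairwise):
    """Same marginal by enumerating every joint state (exponential cost)."""
    totals = [0.0] * len(unary[-1])
    for states in product(*(range(len(u)) for u in unary)):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i in range(len(unary) - 1):
            weight *= pairwise[i][states[i]][states[i + 1]]
        totals[states[-1]] += weight
    z = sum(totals)
    return [t / z for t in totals]


# Made-up factors for a chain of three binary variables.
unary = [[1.0, 2.0], [1.0, 1.0], [3.0, 1.0]]
pairwise = [[[2.0, 1.0], [1.0, 2.0]], [[1.0, 3.0], [3.0, 1.0]]]
print(chain_marginal(unary, pairwise))
print(brute_force_marginal(unary, pairwise))
```

On a chain the intermediate factor stays small; on high-treewidth graphs it grows exponentially, which is the cost that compressing intermediate factors aims to avoid.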

2606.05029 2026-06-04 cs.LG cs.CL Version update

Validity Threats for Foundation Model Research

Gunnar König, Martin Pawelczyk, Ulrike von Luxburg, Sebastian Bordt

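Affiliations: University of Tübingen, Tübingen AI Center; University of Vienna

AI summary: Proposes an evaluation framework based on causal inference that maps the approximate experimental strategies used in foundation model research (proxy experiments, observational studies, single-run designs) onto trade-offs among four types of validity (statistical, internal, external, construct), exposing and analyzing the hidden validity threats that come with compute savings.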

Abstract

Controlled experiments are the backbone of machine learning research, but at the scale of modern foundation models, they have become prohibitively expensive. Instead, the community increasingly relies on research strategies that approximate the ideal experiment at a fraction of the cost: proxy experiments and scaling laws, observational studies with publicly available models, and single-run designs that leverage variation within individual training runs. In this work, we argue that there is no free lunch when approximating large-scale experiments on a compute budget. Specifically, savings in compute come at the cost of validity threats -- hidden and sometimes untestable assumptions that, when violated, can invalidate research claims. To help navigate such threats, we propose an evaluation framework that casts foundation model research as a causal inference problem. Within this framework, we evaluate different research strategies through four types of validity adapted from the empirical social sciences -- statistical, internal, external, and construct validity. We find that each strategy comes with a characteristic validity profile: proxy experiments trade external and construct validity for statistical and internal validity; observational studies face confounding and effect heterogeneity; and single-run designs are strained by interference between treated units. This analysis reveals several validity threats that have received insufficient attention in the literature. Overall, our evaluation framework provides researchers with a practical toolkit for scrutinizing validity threats in foundation model research designs.

2606.05025 2026-06-04 cs.LG cs.AI Version update

Invariant Gradient Alignment for Robust Reasoning Distillation

Zehua Cheng, Wei Dai, Jiahao Sun

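Affiliations: University of Oxford; FLock.io

AI summary: Proposes the Invariant Gradient Alignment (IGA) framework, which uses Logical Isomer Sets, a continuous gradient conflict mask, and a truncated SVD projection to align gradient updates across examples from different semantic domains that share the same logical structure, improving the robustness of large language models on out-of-distribution inputs.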

Comments 30 Pages

Journal ref: In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2026

Abstract

Large language models (LLMs) suffer from shortcut learning: they systematically fail on out-of-distribution (OOD) inputs whose semantic surface differs from training data, even when the logical structure is identical. This undermines knowledge distillation pipelines that transfer chain-of-thought reasoning to smaller students. We introduce Invariant Gradient Alignment (IGA), a training framework that aligns gradient updates across semantically diverse but logically isomorphic examples via three innovations: (i) Logical Isomer Sets, groups of problems sharing identical logical structure across distinct semantic domains (mathematics, medicine, law, science); (ii) a differentiable Continuous Gradient Conflict Mask that suppresses parameter dimensions with high cross-domain gradient variance while preserving invariant directions; and (iii) a truncated SVD projection of the masked gradient back onto the LoRA low-rank manifold, maintaining parameter efficiency throughout. Theoretically, IGA yields tighter OOD generalization bounds than ERM, scaling with the number of isomer domains, and converges at the standard SGD rate under mild regularity. Empirically, IGA outperforms eight baselines across four benchmarks with accuracy gains up to 14.3 pp over ERM-SFT and a Logical Consistency Score of 0.031 versus 0.142 -- a fourfold improvement in representational invariance.
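
The second ingredient, a continuous mask that damps parameter dimensions whose gradients disagree across domains, can be sketched in a few lines. The exponential form of the mask and the `temperature` parameter are assumptions chosen for illustration; the abstract does not specify the functional form, and the SVD projection step is omitted.

```python
import math


def masked_gradient(domain_grads, temperature=1.0):
    """Average per-domain gradients and damp dimensions with high cross-domain variance.

    domain_grads: list of gradient vectors, one per semantic domain.
    Returns mean_gradient[d] * exp(-variance[d] / temperature) for every dimension d.
    """
    n = len(domain_grads)
    dims = len(domain_grads[0])
    out = []
    for d in range(dims):
        values = [grad[d] for grad in domain_grads]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        mask = math.exp(-variance / temperature)  # 1 when domains agree, near 0 when they conflict
        out.append(mask * mean)
    return out


# Dimension 0 agrees across two domains; dimension 1 conflicts.
print(masked_gradient([[1.0, 3.0], [1.0, -1.0]]))
```

The agreeing dimension keeps its full update while the conflicting one is suppressed smoothly rather than zeroed by a hard threshold, which keeps the mask differentiable.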

2606.05021 2026-06-04 cs.LG Version update

Enhancing the MADDPG Algorithm for Multi-Agent Learning via Action Inference and Importance Sampling

Marc Walden, Jason Liu, Shaashwath Sivakumar, Ryan Liu, Hamza Khan

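Affiliations: Department of Mathematics, University of California Los Angeles, Los Angeles, CA, USA

AI summary: For multi-agent deep reinforcement learning, the paper improves MADDPG with an action inference mechanism and an importance sampling strategy based on a geometric distribution, improving learning stability, inter-agent cooperation, and exploration efficiency on a discrete-action Predator-Prey task.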

Abstract

We investigate multi-agent deep reinforcement learning and propose two enhancements to the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. First, we introduce a novel Action Inference mechanism that enables each agent to predict other agents' intended actions, thereby improving the accuracy and stability of its own policy. Second, we apply an importance sampling strategy based on a geometric distribution in the replay buffer to prioritize more recent and informative experiences, which helps mitigate the non-stationarity inherent in multi-agent environments. We evaluate both modifications on the discrete-action Predator-Prey task provided by the PettingZoo library, a flexible Python interface for general multi-agent reinforcement learning benchmarks. Our results indicate that Action Inference is effective in improving learning stability and inter-agent cooperation and that importance sampling with a geometric distribution can lead to significant improvements in exploration efficiency over standard MADDPG. Code is available at https://github.com/shaashwathsivakumar/MARL_Proj
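
Prioritizing recent experience with a geometric distribution can be sketched as drawing an offset from the newest transition. The code below is an illustration and is not taken from the linked repository: the parameter `p`, the truncation at the oldest entry, and the function name are assumptions, and any importance weights used to correct the resulting bias are left out.

```python
import random


def sample_recent_index(buffer_len, p, rng):
    """Draw a replay-buffer index biased toward recent entries.

    The offset from the newest entry follows a geometric distribution with
    success probability p, truncated so that it never leaves the buffer.
    Index buffer_len - 1 is the newest transition.
    """
    offset = 0
    while rng.random() >= p and offset < buffer_len - 1:
        offset += 1
    return buffer_len - 1 - offset


rng = random.Random(0)
batch = [sample_recent_index(1000, p=0.01, rng=rng) for _ in range(32)]
print(min(batch), max(batch))
```

A larger `p` concentrates sampling on the most recent transitions, which tracks the other agents' current policies more closely at the cost of less diverse batches.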

2606.04994 2026-06-04 cs.LG q-bio.QM Version update

New Benchmarking Shows Limited Generalization Power of TCR Antigenic Epitope Prediction Models

Yiming Liao, Yiheng Li, Ning Jiang, Bo Li, Keke Chen

Affiliations: Trustworthy and Intelligent Computing Lab (TAIC), Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County; Children's Hospital of Philadelphia; Department of Bioengineering, University of Pennsylvania; Institute for Immunology & Immune Health, University of Pennsylvania; Institute for RNA Innovation, University of Pennsylvania; Abramson Cancer Center, University of Pennsylvania; Center for Precision Engineering for Health, University of Pennsylvania; Center for Cellular Immunotherapies, University of Pennsylvania

AI summary: By constructing two rigorously defined classes of unseen-data benchmark datasets, the paper evaluates T cell receptor (TCR) antigen-specificity prediction models, finds that existing models generalize poorly, and proposes a framework for improvement.

Comments 6 pages, 1 figure. Preprint version

2606.04980 2026-06-04 cs.LG Version update

AlphaQ: Calibration-Free Bit Allocation for Mixture-of-Experts Quantization

Wanqi Yang, Yuexiao Ma, Alexander Conzelmann, Xiawu Zheng, Michael W. Mahoney, T. Konstantin Rusch, Shiwei Liu

Affiliations: Max Planck Institute for Intelligent Systems; ELLIS Institute Tübingen; Tübingen AI Center; Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Xiamen University; International Computer Science Institute; Lawrence Berkeley National Laboratory; University of California, Berkeley; Liquid AI

AI summary: Reliance on calibration data can lead to suboptimal bit allocation when quantizing Mixture-of-Experts models. The paper proposes AlphaQ, a calibration-free bit-allocation method based on Heavy-Tailed Self-Regularization theory that assigns bit-widths according to how heavy-tailed each expert's weight spectrum is and minimizes quantization error under a budget constraint, achieving near full-precision performance.

Comments 28 pages, 11 figures

Abstract

Mixture-of-Experts (MoE) architectures scale model capacity through sparse expert activation, but their deployment remains memory-bound because all expert weights must reside in memory. Mixed-precision quantization can substantially reduce this footprint by assigning different bit-widths to different experts. Existing approaches, however, typically rely on calibration data to estimate expert importance and determine bit allocation. For frontier MoE LLMs, the original training data, and hence the true training distribution, is proprietary and inaccessible. As a result, calibration sets are inevitably imperfect surrogates, and this can misestimate expert utilization and lead to suboptimal bit allocation. Motivated by the substantial cross-expert quality variability observed in modern MoE models, and by the success of Heavy-Tailed Self-Regularization (HT-SR) theory at predicting neural network model quality without access to training or testing data, we propose AlphaQ, a calibration-free bit-allocation method for MoE quantization. AlphaQ draws on HT-SR theory and follows a simple principle: experts with more heavy-tailed weight spectra are typically better trained and hence should receive higher bit-widths, while experts with weaker heavy-tailed structure can be quantized more aggressively. AlphaQ operationalizes this principle by measuring expert-wise spectral heavy-tailedness and solving a budget-constrained optimization problem that minimizes total quantization error under a global bit-budget constraint. Across several MoE models, AlphaQ consistently outperforms calibration-based baselines under matched bit budgets. Notably, on Qwen1.5-MoE, AlphaQ achieves near full-precision accuracy with an average expert precision of only 3.5 bits, while delivering more than 4× memory compression. Our code is available at https://github.com/Superone77/AlphaQ.
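
The allocation principle can be illustrated with a greedy stand-in for the budget-constrained optimization. The sketch below assumes each expert already has a power-law exponent alpha fitted to its weight spectrum (in HT-SR, a smaller alpha indicates a heavier tail) and hands out the highest affordable bit-width in order of increasing alpha. The greedy rule, the function name, and the candidate bit-widths are assumptions, not the AlphaQ solver.

```python
def allocate_bits(alphas, avg_bits, choices=(2, 3, 4)):
    """Greedy bit allocation under an average bit-width budget.

    alphas: one power-law exponent per expert (smaller means heavier-tailed).
    avg_bits: target average bit-width across experts.
    choices: candidate bit-widths.
    """
    n = len(alphas)
    budget = avg_bits * n
    bits = [min(choices)] * n  # start every expert at the lowest precision
    order = sorted(range(n), key=lambda i: alphas[i])  # heaviest tail first
    for i in order:
        for level in sorted(choices, reverse=True):
            # Upgrade expert i to the highest level that keeps the total within budget.
            if sum(bits) - bits[i] + level <= budget:
                bits[i] = level
                break
    return bits


alphas = [2.1, 4.5, 3.0, 6.0]  # made-up exponents for four experts
print(allocate_bits(alphas, avg_bits=3))
print(allocate_bits(alphas, avg_bits=3.5))
```

A proper solver would weigh the quantization error of each expert at each bit-width instead of ranking by alpha alone; the greedy rule only shows how a spectral statistic can drive the allocation without calibration data.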

2606.04971 2026-06-04 cs.LG cs.DB Version update

Mean-Based Algorithms: Lower Bounds and Regret

Julius Durmann, Amelie Kleber

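Affiliations: Technical University of Munich

AI summary: For the setting with an unknown time horizon and bandit feedback only, the paper gives the first lower bound on the sequence γ_t that defines a mean-based algorithm, proposes two new algorithms whose experimental performance is comparable to existing algorithms, and analyzes the relationship to no-regret algorithms.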

Abstract

Mean-based algorithms are a class of online learning algorithms that assign low probability to actions with low average rewards. Recent work indicates these algorithms converge favorably to serially undominated actions, which approximate Nash equilibria in economic games. However, empirical studies also show slower convergence compared to established algorithms in bandit-feedback scenarios. We study mean-based algorithms when the time horizon is unknown and only bandit feedback is available. In this setting, we provide the first lower bound on the algorithm-defining sequence $γ_t$ that formally establishes a limit on how fast these algorithms can learn. Additionally, we propose two mean-based algorithms: one generalizes $ε$-greedy, and the other extends the mean-based Exp3 to unknown horizons. Our experiments show that mean-based algorithms, although slightly slower, can perform competitively with other bandit-feedback algorithms. We further analyze the relationship to no-regret algorithms. Depending on the choice of $γ_t$, the intersection with no-regret algorithms is non-trivial, and we show that algorithms exist that are both mean-based and no-regret. This adds context to the "exploitability" of this class of algorithms that previous contributions suggest.
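
An ε-greedy learner with a decaying exploration rate is a simple example of the mean-based idea: an arm whose average reward lags behind is only played through exploration, so its probability shrinks with ε_t. The sketch below uses Bernoulli rewards and an assumed schedule ε_t = min(1, sqrt(K/t)) that depends only on the current round, so no horizon is needed; it is not the schedule or the generalized algorithm from the paper.

```python
import random


def epsilon_greedy(arm_means, horizon, rng):
    """Run epsilon_t-greedy on Bernoulli arms and return the pull counts per arm."""
    k = len(arm_means)
    counts, sums = [0] * k, [0.0] * k
    for t in range(1, horizon + 1):
        eps = min(1.0, (k / t) ** 0.5)  # assumed schedule, uses only the round index
        if rng.random() < eps or 0 in counts:
            arm = rng.randrange(k)  # explore uniformly
        else:
            arm = max(range(k), key=lambda a: sums[a] / counts[a])  # best average so far
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts


counts = epsilon_greedy([0.2, 0.8], horizon=5000, rng=random.Random(0))
print(counts)
```

Only bandit feedback is used: the learner observes the reward of the arm it pulled and nothing else.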

2606.04930 2026-06-04 cs.LG cs.AI stat.ML Version update

AdaKoop: Efficient Modeling of Nonlinear Dynamics from Nonstationary Data Streams with Koopman Operator Regression

Naoki Chihara, Ren Fujiwara, Yasuko Matsubara, Yasushi Sakurai

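Affiliations: SANKEN, The University of Osaka

AI summary: Proposes AdaKoop, a streaming algorithm based on Koopman operator theory and a probabilistic framework that represents nonlinear dynamics as a linear system, enabling efficient and stable modeling of nonstationary data streams and outperforming existing methods on 71 benchmark datasets.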

Comments Accepted by KDD'26

DOI: 10.1145/3770855.3817851
Journal ref: The 32nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2026

Abstract

Real-time data analysis requires the ability to accurately and adaptively address nonlinear dynamics in a nonstationary data stream while preserving computational efficiency. However, nonlinear dynamics are so complex that capturing dynamically changing nonlinear patterns and utilizing them for downstream tasks under strict time constraints is nontrivial. To bridge the gap between nonlinear complexity and computational tractability, this study applies Koopman operator theory, which states that nonlinear dynamics can be represented as linear transitions in an infinite-dimensional space. Building upon finite-dimensional approximations of this operator, we present AdaKoop, an efficient streaming algorithm for modeling nonlinear dynamics over nonstationary data streams. Our approach utilizes a probabilistic framework grounded in Koopman operator theory, treating both raw observations and reproducing kernel Hilbert space (RKHS) features as emissions from latent vectors. This dual-view formulation allows nonlinear dynamics to be expressed as a tractable linear system. Therefore, AdaKoop enables the efficient and stable modeling of nonlinear dynamics in a streaming fashion, avoiding the prohibitive computational costs of iterative nonlinear optimization. Furthermore, to address nonstationarity in data streams, AdaKoop adaptively detects the switching of patterns via statistical hypothesis testing for abrupt pattern shifts and incrementally updates model parameters to handle continuous changes. Extensive experiments on a total of 71 practical benchmark datasets across various domains demonstrate that AdaKoop outperforms state-of-the-art methods in terms of real-time forecasting accuracy and computational efficiency.


2606.04929 2026-06-04 cs.LG cs.CR (updated version)

Text Encoder Pretraining for TabPFN

Mustafa Tajjar, Alexander Pfefferle, Lennart Purucker, Frank Hutter

Affiliations: University of California, Berkeley; DeepMind

AI summary: Proposes the TabPFN Text Adapter, a lightweight adapter that maps text embeddings into TabPFN's embedding space, avoiding the PCA bottleneck, preserving TabPFN's numerical strengths, and training more efficiently than end-to-end pipelines.

Abstract

Tabular foundation models, such as TabPFN, achieve strong performance on tabular datasets with numerical and categorical data, but do not natively handle high-cardinality text features. Standard pipelines, therefore, embed text with a language model and compress the resulting vectors with PCA into a small number of scalar features before inputting them into TabPFN. This creates an information bottleneck: most embedding dimensions are discarded, and the compressed representation must then be expanded again by TabPFN's feature encoder. End-to-end alternatives can avoid PCA, but they require large amounts of pretraining data containing text cells and usually perform subpar compared to tabular foundation models that were pretrained on large amounts of synthetic data. Inspired by modality-alignment approaches like LLaVA (vision-to-LLM token projection) and TableGPT-style systems (table-to-LLM token projection), we introduce the TabPFN Text Adapter (text-to-TFM token projection). We freeze both the sentence encoder and TabPFN, and train only a lightweight adapter that maps text embeddings into a short sequence of tokens in TabPFN's embedding space. This design removes the PCA bottleneck, preserves TabPFN's numerical strengths, and is more efficient to train than end-to-end text-tabular pipelines.


2606.04866 2026-06-04 cs.LG (updated version)

Provably Reduced Sample Cost in Prior-Guided Hyperparameter Optimization

Leona Hennig, Jasmin Brandt, Lukas Fehring, Barbara Hammer, Marius Lindauer, Marcel Wever

Affiliations: Leibniz University Hanover; University of Bielefeld; Institute of Artificial Intelligence, Leibniz University Hanover; L3S Research Center Hanover

AI summary: Using the formal framework of fixed-budget best-arm identification, the paper gives the first sample complexity bounds for multi-fidelity hyperparameter optimization that depend on the prior distribution, shows that informative priors reduce the number of required evaluations, and reports budget savings of up to 90% in experiments.

Abstract

Large-scale hyperparameter optimization (HPO) in automated machine learning (AutoML) consumes substantial computational resources, raising growing concerns about scalability and energy efficiency. Existing methods use prior information heuristically to accelerate both black-box and multi-fidelity settings, but they lack a characterization of how prior informativeness quantitatively reduces sample complexity. In this work, we provide the first distribution-dependent sample complexity bounds for multi-fidelity HPO with priors through the formal lens of fixed-budget best-arm identification. By modeling priors directly over arm means as configuration performance, we derive explicit, distribution-dependent error bounds that quantify the relationship between priors and evaluation budget. Our analysis shows that informative priors, which concentrate probability mass on near-optimal arms, yield reductions in the number of required evaluations, whereas baseline performance is recovered with uninformative or misleading priors. We conduct proof-of-concept experiments on a synthetic benchmark and on LCBench, a common multi-fidelity HPO benchmark for deep learning, to confirm our theoretical results, achieving up to 90% budget reduction while retaining solution quality. Together, our results provide a principled foundation for prior-guided and compute-efficient green AutoML.


2606.04860 2026-06-04 cs.LG cs.AI (updated version)

Learning Empirically Admissible Neural Heuristics for Combinatorial Search

Siddharth Sahay

Affiliations: Independent Researcher

AI summary: For combinatorial search problems, proposes a validation-calibrated framework that combines an admissible Bellman operator with an asymmetric loss function to train empirically admissible neural heuristics, preserving path optimality in practice while reducing search node expansions by up to 83.0%.

Comments: 13 pages, 3 figures, 2 tables, 1 algorithm

Abstract

英文摘要

Finding optimal solution paths for combinatorial puzzles like the Rubik's Cube, sliding tile puzzles, and Lights Out remains a classical challenge in artificial intelligence. Heuristic search algorithms, such as A*, guarantee path optimality only when using an admissible heuristic, one that never overestimates the true remaining cost-to-go. Deep reinforcement learning (RL) methods like DeepCubeA train deep neural networks to approximate cost-to-go heuristics. However, standard mean-squared error (MSE) training regularly yields overestimations, violating admissibility and compromising solution optimality. In this paper, we introduce a generalizable framework for learning validation-calibrated admissible neural heuristics. We train a value network using an underestimating Admissible Bellman Operator combined with an Asymmetric Loss function to penalize overestimation. To account for residual neural function approximation errors, we propose a post-hoc calibration safety offset computed over validation scrambles. We demonstrate that our calibrated neural heuristics achieve no observed admissibility violations under the evaluation protocol and preserve path optimality in practice while reducing search node expansions by up to 83.0% on a 2 by 2 Rubik's Cube, 19.9% on a 3 by 3 Lights Out grid, and 1.9% on an 8-Puzzle compared to standard analytical baselines.
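The two ingredients named in the abstract, an asymmetric loss and a post-hoc safety offset, can be sketched as follows. The abstract does not give the exact formulas, so the weighting factor `over_weight`, the squared-error form, and the "largest validation overestimation" rule are assumptions made for illustration.

```python
def asymmetric_loss(pred, target, over_weight=10.0):
    """Squared error that weighs overestimation (pred > target) more heavily.

    over_weight is a hypothetical hyperparameter, not a value from the paper.
    """
    err = pred - target
    return (over_weight if err > 0 else 1.0) * err * err


def safety_offset(val_preds, val_true_costs):
    """Largest overestimation seen on validation states (0 if there is none)."""
    return max(0.0, max(p - t for p, t in zip(val_preds, val_true_costs)))


def calibrated(pred, offset):
    """Shift the raw prediction down by the offset, never below zero."""
    return max(0.0, pred - offset)
```

A heuristic calibrated this way is admissible only on the states used to compute the offset, which matches the abstract's wording of "no observed admissibility violations" rather than a formal guarantee.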


2606.04857 2026-06-04 cs.LG (updated version)

Rethinking Incompleteness: Formalizing Protocol Divergence and Train-Once Learning for Robust IMVC

Haolu Liu, Xiyue Wang, Xuanting Xie, Liangjian Wen, Zhao Kang

Affiliations: National University of Singapore

AI summary: Argues that the standard IMVC evaluation paradigm overlooks the fact that missing rate alone is insufficient to characterize data incompleteness, proposes formal measures of protocol divergence, and designs the CRAFT architecture, which uses per-sample independence and mask-aware fusion so that a single training run generalizes to diverse missing patterns.

Abstract

Standard IMVC evaluation retrains separate models for different missing-data configurations. We show that this paradigm obscures a fundamental vulnerability: missing rate alone is insufficient to characterize data incompleteness. Specifically, we show that protocols with identical nominal missing rates can differ by up to $50\times$ in their proportion of fully observed samples, inducing drastically different learning regimes. We formalize this phenomenon as incompleteness divergence, providing measures that capture structural disparities across missing-data protocols. We further prove that for a broad class of reconstruction-based objectives, learning becomes structurally ill-posed when the proportion of complete samples falls below a critical threshold, leading to near-random performance. To bypass this theoretical bound, we propose CRAFT (Complete-data Robust Attention-masked Fusion Transformer). CRAFT shifts the burden of robustness from the loss function to the architecture via two key properties: (i) per-sample independence, which removes reliance on complete-sample co-occurrence, and (ii) mask-aware variable-length fusion, which aggregates only observed views through attention masking. This design allows a single model, trained once on complete data, to generalize to diverse missing patterns at inference time without retraining. Extensive experiments on seven benchmarks show that CRAFT matches or outperforms per-configuration baselines while reducing training overhead by $8.8\times$, demonstrating that robustness to missing data can be achieved as an inherent architectural property. Code (CRAFT) and our imvc-audit toolkit are available at https://anonymous.4open.science/r/CRAFT-BF80/ and https://anonymous.4open.science/r/imvc-audit-8263/.
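The idea of aggregating only observed views through attention masking can be sketched with a masked softmax pooling step. This is a minimal illustration under stated assumptions, not the CRAFT architecture: the function name, the scalar per-view scores, and the pooling form are all hypothetical.

```python
import math


def masked_attention_pool(view_embeddings, scores, observed):
    """Softmax-weighted average over observed views only.

    view_embeddings: one vector per view (entries for missing views are ignored)
    scores: one unnormalised attention score per view
    observed: one bool per view; False marks a missing view
    """
    kept = [i for i, seen in enumerate(observed) if seen]
    if not kept:
        raise ValueError("at least one view must be observed")
    top = max(scores[i] for i in kept)  # subtract the max for numerical stability
    weights = {i: math.exp(scores[i] - top) for i in kept}
    total = sum(weights.values())
    dim = len(view_embeddings[kept[0]])
    return [
        sum(weights[i] / total * view_embeddings[i][d] for i in kept)
        for d in range(dim)
    ]
```

Because missing views receive no weight at all, the same function handles any missing pattern at inference time without a separate model per configuration.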


2606.04850 2026-06-04 cs.LG cs.AI cs.AR math.OC (updated version)

Uncertainty-Aware End-to-End Co-Design of Neural Network Processors: From Training and Mapping to Fabrication

Yuyang Du, Yujun Huang, Gioele Zardini

AI summary: Proposes a unified framework grounded in monotone co-design theory that composes four interoperable design blocks (network training, chip mapping, wafer-level fabrication, and compute resource allocation) for end-to-end co-design of neural network processors, and introduces Confidence (the inverse of success probability) as an explicit, optimizable resource for handling uncertainty.

Comments: 14 pages

Abstract

Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference workload; hardware mapping decisions determine chip area, latency, and energy; and these characteristics govern fabrication yield and manufacturing cost. In practice, these decisions are made in separate stages, and existing co-design methodologies are tightly coupled to specific algorithms, making it difficult to improve one component without reworking the entire pipeline. This paper presents a unified framework, grounded in monotone co-design theory, that composes four interoperable design blocks spanning network training, chip mapping, wafer-level fabrication, and compute resource allocation. Each block exposes only a functionality-resource interface to the rest of the system, so any block can be refined without structural changes elsewhere. A central contribution is the treatment of uncertainty: rather than collapsing stochastic outcomes into point estimates, the framework introduces Confidence, the inverse of success probability, as an explicit and optimizable resource alongside cost, time, and power. Three case studies validate the approach. The first recovers Pareto-optimal implementations across heterogeneous application scenarios. The second confirms that Confidence functions as a continuously tunable design knob rather than a post-hoc diagnostic. The third demonstrates that improving a single block's implementation set automatically propagates to the global Pareto front, without modifying the co-design diagram.


2606.04847 2026-06-04 cs.CV cs.CL cs.LG (updated version)

Reconciling Causality and Non-Equilibrium Thermodynamics with Hamiltonian Causal Models

Dario Rancati, Max Welling, Francesco Locatello

Affiliations: Institute of Science and Technology Austria; CuspAI University of Amsterdam

AI summary: Proposes Hamiltonian Causal Models (HCMs), which separate immutable equations of motion from intervenable mechanisms, define path-level causal effects, interface naturally with non-equilibrium thermodynamics, and use entropy production to quantify causal effects.

Abstract

Causal modeling of physical temporal phenomena must handle interventions that act along trajectories, nonstationary induced laws, path-dependent effects, and feedback mediated by dynamics, all challenging in standard causal models. We introduce Hamiltonian Causal Models (HCMs), a trajectory-level framework in which observed variables interact with local environments and interventions act as controls of Hamiltonian mechanisms. HCMs separate immutable equations of motion from intervenable mechanisms and define causal effects as discrepancies between interventional path laws. A key motivation for HCMs is their natural interface with non-equilibrium thermodynamics. Entropy production quantifies the irreversibility of a process and is a central causal observable: it is estimable from data and witnesses causal effects along the system's evolution that are invisible to endpoint and cumulative versions of the standard average treatment effect. As in physics, cause and effect are not primitives of the relation between two random variables but arise from the non-invertibility of the thermodynamic arrow. With this, our paper reconciles the language of statistical causal models and non-stationary thermodynamics, offering new tools to describe causality in a wide range of physical systems.


2606.04820 2026-06-04 cs.CV cs.AI cs.LG (updated version)

OA-CutMix: Correcting the Label Bias of CutMix

Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Brian B. Moser, Andreas Dengel

Affiliations: RPTU University Kaiserslautern-Landau; German Research Center for Artificial Intelligence (DFKI)

AI summary: Addresses the semantic bias that arises because CutMix assigns labels by patch area: OA-CutMix uses segmentation masks to assign labels according to visible object area, improving classification accuracy without changing the image mixing procedure.

Abstract

英文摘要

CutMix has become the de facto standard mixing augmentation, yet its label assignment rests on a flawed assumption: The area of the pasted patch faithfully reflects its semantic contribution to the mixed image. In practice, however, patches frequently land on background regions, assigning label credit to classes whose objects are not visible. The mean discrepancy of the CutMix label and the semantic object area is $21.5\%$. In $17\%$ of samples an image contributes zero visible object pixels yet receives nonzero label weight. We propose Object-Aware CutMix (OA-CutMix), which corrects this bias by replacing the area-based CutMix weight with one derived from precomputed segmentation masks, assigning labels in proportion to the visible object area each image contributes to the mix. The image mixing procedure is left entirely unchanged. We evaluate OA-CutMix against 10+ static and dynamic mixing methods across 4 architectures and 6 datasets. OA-CutMix consistently achieves the highest accuracy over all tasks, outperforming even dynamic mixing methods, but at a fraction of the training-time cost. Improvements are largest for small objects, where the label bias from CutMix is greatest. Thus, correcting the label is sufficient to match or exceed the performance of methods modifying the image mixing algorithm.
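The difference between the area-based label weight and an object-aware one can be made concrete with binary masks. This is a minimal sketch assuming the object-aware weight is the pasted image's share of all visible object pixels; the function name, the mask layout, and the fallback for the case with no visible object pixels are assumptions, not the paper's definition.

```python
def cutmix_weights(mask_a, mask_b, box):
    """Return (area-based, object-aware) label weights for the pasted image B.

    mask_a, mask_b: binary object masks (lists of rows) of the same size
    box: (top, left, bottom, right), half-open region of A replaced by B
    """
    top, left, bottom, right = box
    height, width = len(mask_a), len(mask_a[0])

    def inside(r, c):
        return top <= r < bottom and left <= c < right

    # Standard CutMix: weight of B equals the pasted area fraction.
    area_b = (bottom - top) * (right - left) / (height * width)

    # Object pixels that remain visible in the mixed image.
    cells = [(r, c) for r in range(height) for c in range(width)]
    vis_a = sum(mask_a[r][c] for r, c in cells if not inside(r, c))
    vis_b = sum(mask_b[r][c] for r, c in cells if inside(r, c))
    total = vis_a + vis_b

    # Assumed fallback: keep the area weight when no object pixel is visible.
    object_b = vis_b / total if total else area_b
    return area_b, object_b
```

When the pasted patch covers only background of image B, the area-based weight stays positive while the object-aware weight drops to zero, which is the failure case the abstract quantifies.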


2606.04816 2026-06-04 cs.AI cs.LG (updated version)

Beyond Objective Equivalence: Constraint Injection for LLM-Based Optimization Modeling on Vehicle Routing Problems

Xizi Luo, Changhong He, Dongdong Geng, Chenggong Shi, Yu Mei

Affiliations: Beihang University; Baidu Inc.

AI summary: LLMs may add spurious constraints or omit required ones on constraint-dense operations research problems; the paper proposes constraint injection, combines it with differential testing into a dual verifier, and validates the approach on vehicle routing problems.

Comments: 28 pages

Abstract

英文摘要

Large language models (LLMs) increasingly translate natural-language optimization problems into executable solver code. Yet for constraint-dense operations research (OR) problems, existing data-filtering and training pipelines largely rely on objective-equivalence signals such as differential testing and answer agreement, which a program can pass while adding spurious constraints or silently omitting required ones, whenever those constraints are non-binding on the tested instance. We propose constraint injection, which uses feasible probes to expose spurious over-constraint and one-constraint-violating probes to reveal silent constraint omission. Combined with differential testing, it forms a dual verifier. We instantiate and evaluate it on vehicle routing problems (VRPs), a representative constraint-dense combinatorial optimization testbed with coupled operational constraints. We develop VRPCoder, an 8B end-to-end model that translates natural-language VRP scenarios into Gurobi scripts, together with an expert-verified VRP benchmark suite covering 21 variants. The verifier is reused as a rejection-sampling filter during data synthesis and as a per-rollout reward in group relative policy optimization (GRPO). Across four VRP benchmarks, VRPCoder-GRPO reaches 93% average Pass@1, outperforms Gemini-3.1-Pro Preview on three benchmarks, exceeds Claude-Sonnet-4.5 by 28 average points, and surpasses prior OR-LLMs by 78 average points.
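The probing logic can be sketched on a toy feasibility check. This is a simplified illustration, not the paper's verifier: candidate models are plain Python predicates rather than Gurobi scripts, and the limits `CAPACITY` and `MAX_STOPS`, the probe sets, and all function names are hypothetical.

```python
CAPACITY, MAX_STOPS = 10, 4  # assumed toy limits for a single route


def correct_model(load, stops):
    """Candidate that encodes exactly the required constraints."""
    return load <= CAPACITY and stops <= MAX_STOPS


def omits_stop_limit(load, stops):
    """Candidate that silently drops the stop-count constraint."""
    return load <= CAPACITY


def spurious_even_load(load, stops):
    """Candidate that adds a constraint the problem never stated."""
    return load <= CAPACITY and stops <= MAX_STOPS and load % 2 == 0


FEASIBLE_PROBES = [(7, 3), (10, 4)]   # satisfy every required constraint
VIOLATING_PROBES = [(11, 3), (7, 5)]  # each breaks exactly one constraint


def verify(candidate):
    """Return the list of detected problems; empty means the candidate passed."""
    problems = []
    if not all(candidate(*probe) for probe in FEASIBLE_PROBES):
        problems.append("spurious over-constraint")
    if any(candidate(*probe) for probe in VIOLATING_PROBES):
        problems.append("constraint omission")
    return problems
```

All three candidates would agree on the objective for an instance where neither the stop limit nor the parity condition binds, which is why objective-equivalence checks alone cannot separate them.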


2606.04815 2026-06-04 cs.LG cs.AI (updated version)

AIP: A Graph Representation for Learning and Governing Agent Skills

Zachary Blumenfeld, Jim Webber

Affiliations: Neo4j USA; Neo4j UK

AI summary: Proposes the Agent Instruction Protocol (AIP), which represents a skill as a directed execution graph; compiling human-written skills into this form improves task performance and supports diagnosable repair and governance of skills.

Abstract

英文摘要

Agent Skills today consist largely of free-form prose requiring the agent to read, interpret, and re-derive how to act in every session. This imposes two compounding costs: reduced reliability on implementation-heavy tasks, and difficulty in skill creation and improvement, since editing prose is a fragile process that both humans and agents struggle with, particularly for domain-specific procedural knowledge underrepresented in model training. The Agent Instruction Protocol (AIP) addresses both by modeling a skill as a directed execution graph: discrete steps as nodes backed by deterministic scripts or natural-language descriptions, connected by explicit typed input/output edges, and governed by a schema-validated YAML specification. A compiler meta-skill translates existing human-written skills into this form. The benefits are twofold. First, compiling human-written skills to AIP raised Claude Sonnet's mean task reward from 0.60 to 0.71 and pass rate from 53% to 67% across 27 real agent tasks from SkillsBench - a statistically significant gain (Wilcoxon signed-rank p = 0.011), winning 12 tasks to 2 with 13 ties - often in less wall-clock time. The graph delivers vetted, runnable units to the agent rather than asking it to re-derive code, commands, and tool calls from natural language. Second, on creation and improvement, because each skill is schema-validated, functionally testable, and addressable node-by-node, failures can be diagnosed and repaired precisely. Two authored-skill failures were traced to the script level. After adjusting the AIP spec and recompiling, both recovered with zero regressions (one task going from 0/5 to 5/5), turning skill improvement into a measurable tuning loop rather than a prose rewrite. That same graph structure supports corpus-level governance and skill introspection, and provides a natural action space for reinforcement learning over skills.
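A skill modeled as a directed execution graph with typed edges can be sketched in a few lines. The dictionary layout below is a hypothetical stand-in, not the AIP YAML schema, and the three nodes are invented for illustration; only `graphlib.TopologicalSorter` is a real standard-library API.

```python
from graphlib import TopologicalSorter

# Hypothetical skill: each node names its inputs, its output type, and a callable.
SKILL = {
    "load": {"inputs": {}, "output": str, "run": lambda: "3,1,2"},
    "parse": {
        "inputs": {"text": "load"},
        "output": list,
        "run": lambda text: [int(t) for t in text.split(",")],
    },
    "report": {
        "inputs": {"values": "parse"},
        "output": int,
        "run": lambda values: max(values),
    },
}


def run_skill(skill):
    """Execute nodes in dependency order and type-check each node's output."""
    graph = {name: set(spec["inputs"].values()) for name, spec in skill.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        node = skill[name]
        kwargs = {arg: results[src] for arg, src in node["inputs"].items()}
        value = node["run"](**kwargs)
        if not isinstance(value, node["output"]):
            raise TypeError(f"node {name!r} returned {type(value).__name__}")
        results[name] = value
    return results
```

Because every intermediate result is stored per node, a failure can be traced to a single step, which is the node-by-node addressability the abstract relies on for repair.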


2606.04778 2026-06-04 cs.AI cs.CL cs.LG (updated version)

Inference-Time Vulnerability Beyond Shallow Safety: Alignment Along Generation Trajectories

Kyungmin Park, Taesup Kim

Affiliations: Hankuk University of Foreign Studies; Seoul National University

AI summary: Shows that safety-aligned large language models have a broader inference-time vulnerability, in which short token injections at any generation step can substantially alter subsequent safety behavior, and proposes aligning models directly on generation trajectories to improve robustness.

Abstract

英文摘要

Safety-aligned Large Language Models (LLMs) remain vulnerable to interventions during inference that redirect generation toward harmful outputs. Recent work attributes this to shallow safety, where alignment concentrates in the first few output tokens. We show that shallow safety is a special case of a broader inference-time vulnerability, in which short token injections at any generation step can substantially alter subsequent safety behavior. We also find that a model's alignment with refusal directions in its hidden states does not predict its robustness to such injection, revealing that internal state alone does not determine generation behavior under perturbation. To address this, we align models directly on generation trajectories constructed by simulating mid-sequence perturbation, and show that this improves robustness to mid-sequence injection and generalizes to attacks that exploit early-token generation. Our work argues that robust safety alignment requires training on the generation process itself, not only its outputs.


2606.04775 2026-06-04 cs.LG cs.AI cs.CV cs.SY eess.SY math.OC (updated version)

Fog of Love: Shaping Virtuous Agent Behavior in a Game Environment with Affinity-Based Reinforcement Learning

Ajay Vishwanath, Christian Omlin

Affiliations: University of Agder

AI summary: Proposes an affinity-based reinforcement learning approach that uses policy regularization to achieve both competitive and cooperative objectives in the multi-agent role-playing game Fog of Love, while making agent behavior more interpretable.

Abstract

英文摘要

Instilling virtuous behavior in artificial intelligence has seen increasing interest. One of the techniques proposed is known as affinity-based reinforcement learning, which uses policy regularization on the objective function to incentivize virtuous actions without being fully dependent on the reward function design. Thus far, this technique has been demonstrated to be effective in grid worlds and toy-problem environments with minimal state and action spaces. To expand this research to more sophisticated environments, we introduce a two-player multi-agent environment based on the role-playing board game known as Fog of Love. In this environment, two agents compete to fulfill their individual virtues, while also cooperating to satisfy their relationship. Given the multi-agent nature, this is a complex problem where multi-agent deep deterministic policy gradient agents neither compete nor cooperate successfully. We present evidence that localized affinities enhance agent performance in achieving both competitive and cooperative objectives, resulting from superior overall scores in both domains. This not only results in virtuous choices but also clarifies an agent's teleology and makes its behavior human-level interpretable.


2606.04749 2026-06-04 cs.RO cs.LG (updated version)

COP-Q: Safety-First Reinforcement Learning for Robot Control via Cholesky-Ordered Projection

Guopeng Li, Moritz A. Zanger, Matthijs T. J. Spaan, Julian F. P. Kooij

Affiliations: Department of Cognitive Robotics, Delft University of Technology; Department of Intelligent Systems, Delft University of Technology; School of Transportation, Southeast University

AI summary: Proposes COP-Q, which encodes objective priority through Cholesky factorization and uses a generalized confidence bound in the joint Q-value space to balance safety and reward objectives in safety-first off-policy reinforcement learning, reducing excessive conservatism and improving sample efficiency.

Comments: 7 pages, 6 figures, 2 tables

Abstract

英文摘要

Safe robot control requires maximizing return while satisfying safety constraints. In off-policy safe reinforcement learning, reward and safety Q-values are commonly learned by separate critic ensembles, with uncertainty handled independently for each objective. This objective-wise treatment neglects inter-objective correlation and can lead to overly conservative value estimates, thereby reducing sample efficiency. To address this issue, we propose Cholesky-Ordered Projection Q-learning (COP-Q), a safety-first method that incorporates inter-objective covariance into vector-valued Q-value estimation. COP-Q constructs a generalized confidence bound in the joint Q-value space and uses Cholesky factorization to encode objective priority in a sequential form. This preserves conservatism on safety while adaptively reducing excessive conservatism on the reward objective. The resulting estimate is used in both temporal-difference target computation and actor optimization. COP-Q incurs minimal computational overhead and is readily compatible with most existing deep Q-learning frameworks. Experiments on robot locomotion in Brax and safe navigation in Safety-Gymnasium, covering both hard- and soft-safety settings, demonstrate that COP-Q achieves strong safety performance together with competitive or improved sample efficiency relative to representative baselines.


2606.04743 2026-06-04 cs.CL cs.AI cs.LG (updated version)

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

Soyeong Jeong, Jinheon Baek, Minki Kang, Sung Ju Hwang

Affiliations: KAIST; DeepAuto.ai

AI summary: Proposes TIDE, a template-guided iterative framework that proactively discovers multiple hidden problems in the user context and pairs them with concrete actions, substantially improving task coverage, problem identification, and resolution in two settings, personal workspaces and software repositories.

Abstract

英文摘要

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many other important problems coexist, hidden in plain sight, within the broader user context, with their total number unknown in advance. We frame this as the task of discovering multiple hidden problems from context, in which coexisting problems should be uncovered, grounded in supporting evidence, and paired with concrete actions. To this end, we introduce TIDE, a template-guided iterative framework with two complementary mechanisms. Specifically, motivated by the observation that single-pass prediction anchors on the most salient cases and yields generic claims, we propose iterative discovery, which surfaces a small batch of candidates per round while conditioning on what has already been found, so subsequent rounds extend coverage; and thought templates, reusable schemas distilled from previously solved cases that specify what contextual signals to attend to and how to connect them, anchoring each prediction in a recognizable problem class. We validate TIDE on two realistic settings, personal workspaces and software repositories, across four model backbones, showing substantial gains over single-shot and parallel multi-agent baselines on task coverage, identification, and resolution.


2606.04736 2026-06-04 cs.LG cs.AI (updated version)

Curvature-aware dynamic precision approach for physics-informed neural networks

Yingjie Shao, Ioannis N. Athanasiadis, George van Voorn, Taniya Kapoor

Affiliations: Mathematical & Statistical Methods Group (Biometris), Wageningen University & Research; Artificial Intelligence Group, Wageningen University & Research

AI summary: Proposes a curvature-aware precision controller that uses curvature information from the L-BFGS optimizer to adjust numerical precision dynamically, lowering the computational cost of double-precision training while retaining prediction accuracy.

Abstract

英文摘要

Physics-informed neural networks (PINNs) have become a promising framework for simulating partial differential equations (PDEs) by embedding physical laws directly into neural network training. However, recent studies show that PINN optimisation is sensitive to numerical precision. Existing implementations commonly use either single precision (FP32), which is computationally efficient but prone to failure modes, or double precision (FP64), which is robust but substantially expensive. This creates a trade-off between computational efficiency and numerical accuracy. To reduce the computational cost of double-precision training while retaining prediction accuracy, we propose a curvature-aware precision controller that adapts numerical precision during training rather than treating it as a fixed implementation choice. The proposed method reuses curvature information derived from the limited-memory BFGS (L-BFGS) optimiser to construct a precision controller, retaining FP32 when lower precision is sufficient and promoting computation to FP64 when the training dynamics indicate numerical sensitivity or precision-limited stagnation. We evaluate the proposed approach on four canonical PINN failure-mode benchmarks and an irradiance-driven ordinary differential equation example. We further test the proposed approach across different neural network architectures. The method consistently matches or even slightly exceeds full FP64 solution accuracy while reducing training time relative to full double-precision training on all benchmark equations. The obtained results indicate that precision sensitivity in PINN optimisation is phase-dependent, and that selectively applying higher precision only during numerically critical stages can lower computational cost without sacrificing predictive accuracy.


2606.04735 2026-06-04 cs.LG cs.AI 版本更新

基于图引导的广义特征值近端支持向量机中的Universum学习用于阿尔茨海默病分类

Yogesh Kumar, Vrushank Ahire, Mudasir Ganaie

发表机构 * Dept. of Computer Science and Engineering, IIT Ropar, Punjab 140001, India（计算机科学与工程系，IIT罗帕尔，旁遮普140001，印度）

AI总结针对阿尔茨海默病分类，提出两种图引导的Universum学习模型UG-GEPSVM和IUG-GEPSVM，利用轻度认知障碍样本构建图拉普拉斯正则化，替代传统独立惩罚项，在ADNI MRI数据集上取得更优性能。

详情

AI中文摘要

早期准确检测阿尔茨海默病（AD）对于及时干预和疾病管理至关重要。广义特征值近端支持向量机（GEPSVM）及其基于Universum的变体在AD分类中显示出有希望的结果。然而，现有方法将Universum样本视为独立点，未考虑它们之间的几何关系。本文提出了两种图引导的Universum学习模型，即UG-GEPSVM和IUG-GEPSVM，用于使用结构MRI数据进行AD与认知正常（CN）分类。在所提出的框架中，轻度认知障碍（MCI）受试者被用作Universum数据，以提供AD和CN类别之间的中间信息。使用高斯相似性、最小生成树连通性和多跳传播在Universum样本上构建图。从该图中导出拉普拉斯矩阵，捕获MCI样本的几何结构。这种基于拉普拉斯的正则化被纳入学习过程，以替代传统的独立Universum惩罚项。UG-GEPSVM将此正则化集成到广义特征值公式中，而IUG-GEPSVM使用标准特征值公式扩展了数值稳定的改进GEPSVM框架。在ADNI MRI数据集变体上使用ICA和PCA特征在五个不同噪声水平下的实验表明，两种提出的模型始终优于现有的GEPSVM和基于Universum的方法。UG-GEPSVM实现了88.07%的最高平均AUC，并在增加的噪声水平下保持稳定的性能。统计检验进一步证实了观察到的改进的显著性。

英文摘要

Early and accurate detection of Alzheimer's disease (AD) is important for timely intervention and disease management. Generalized Eigenvalue Proximal Support Vector Machine (GEPSVM) and its Universum-based variants have shown promising results for AD classification. However, existing methods treat Universum samples as independent points and do not consider the geometric relationships among them. This paper proposes two graph-guided Universum learning models, namely UG-GEPSVM and IUG-GEPSVM, for AD versus cognitively normal (CN) classification using structural MRI data. In the proposed framework, mild cognitive impairment (MCI) subjects are used as Universum data to provide intermediate information between AD and CN classes. A graph is constructed over the Universum samples using Gaussian similarity, Minimum Spanning Tree connectivity, and multi-hop propagation. From this graph, a Laplacian matrix is derived that captures the geometric structure of the MCI samples. This Laplacian-based regularization is incorporated into the learning process in place of the conventional independent Universum penalty term. UG-GEPSVM integrates this regularization into the generalized eigenvalue formulation, while IUG-GEPSVM extends the numerically stable improved GEPSVM framework using a standard eigenvalue formulation. Experiments on ADNI MRI dataset variants using ICA- and PCA-based features at five different noise levels show that both proposed models consistently outperform existing GEPSVM and Universum-based methods. UG-GEPSVM achieves the highest average AUC of 88.07% and maintains stable performance under increasing noise levels. Statistical tests further confirm the significance of the observed improvements.
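The graph construction described above starts from a Gaussian similarity over the Universum (MCI) samples and derives a Laplacian from it. The following is a minimal sketch of just that step, assuming a dense similarity matrix, the unnormalised Laplacian $L = D - W$, and a hypothetical bandwidth `sigma`; the Minimum Spanning Tree and multi-hop propagation steps, and the way the paper plugs the Laplacian into the eigenvalue problem, are not reproduced here. All function names are illustrative.

```python
import math


def gaussian_similarity(points, sigma=1.0):
    """Dense similarity W[i][j] = exp(-||x_i - x_j||^2 / (2 sigma^2)), zero diagonal."""
    n = len(points)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sq = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            W[i][j] = W[j][i] = math.exp(-sq / (2.0 * sigma ** 2))
    return W


def graph_laplacian(W):
    """Unnormalised Laplacian L = D - W, where D is the diagonal matrix of row sums."""
    n = len(W)
    return [
        [(sum(W[i]) if i == j else 0.0) - W[i][j] for j in range(n)]
        for i in range(n)
    ]


def laplacian_penalty(L, scores):
    """Quadratic form f^T L f, equal to 0.5 * sum_ij W_ij (f_i - f_j)^2."""
    n = len(scores)
    return sum(scores[i] * L[i][j] * scores[j] for i in range(n) for j in range(n))


# Toy "Universum" points: two close together, one far away (made-up data).
universum = [(0.0, 0.0), (0.1, 0.0), (3.0, 3.0)]
W = gaussian_similarity(universum, sigma=1.0)
L = graph_laplacian(W)
```

The quadratic form penalises score vectors that differ across strongly connected samples, which is the usual sense in which a Laplacian term "captures geometric structure" rather than treating each sample independently.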


2606.04695 2026-06-04 cs.LG 版本更新

Cone-Compatible Monge Geometry for High-Dimensional Ordered Optimal Transport

锥相容的Monge几何用于高维有序最优输运

Lei Luo, Hongliang Zhang, Jian Yang

发表机构 * PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, School of Computer Science and Engineering, Nanjing University of Science and Technology（PCA实验室、教育部高维信息智能感知与系统重点实验室、计算机科学与工程学院、南京理工大学）

AI总结本文提出锥相容的Monge几何，通过闭凸锥诱导的偏序与输运成本兼容的条件，为高维有序数据提供闭式最优耦合。

Comments 13 pages, 2 figures, including appendices

详情

AI中文摘要

高维最优输运很少具有闭式解。一维情况是例外，因为实数线的顺序与凸输运成本兼容，使得单调重排最优。本文研究在更高维中何时可以从偏序恢复类似的Monge结构。我们引入锥相容的Monge几何：一个闭凸锥$K$诱导序$x\preceq_K y$当$y-x\in K$，并且如果有序对满足Monge交换不等式，则与成本兼容。对于平方马氏距离成本$c_M(x,y)=(x-y)^\top M(x-y)$，我们证明了一个尖锐的刻画：兼容性恰好当$K$在$M$-内积下是锐角锥，即对所有$u,v\in K$有$u^\top Mv\ge0$，等价于$K\subseteq K_M^*$。在此条件下，支撑在锥链上的测度允许分位数型的闭式最优耦合，在原始地面成本下（而非投影或度量替换后）得到精确输运。我们将由此产生的锥链Wasserstein度量（定义在规范有序的链分布上）与扩展的有向锥输运成本（定义在一般测度上）区分开来，并发展了可行性、对偶性、稳定性、逼近、高斯恢复、统计和计算方面的结果。该理论与切片和树Wasserstein距离互补：它不是通用的快速替代，而是为有序高维数据提供可解释、方向有效、原始空间单调输运的一种方法。

英文摘要

High-dimensional optimal transport is seldom available in closed form. The one-dimensional case is exceptional because the order of the real line is compatible with convex transport costs, making monotone rearrangement optimal. This paper studies when an analogous Monge structure can be recovered in higher dimensions from a partial order. We introduce a cone-compatible Monge geometry: a closed convex cone $K$ induces the order $x\preceq_K y$ whenever $y-x\in K$, and is compatible with a cost if ordered pairs satisfy a Monge exchange inequality. For squared Mahalanobis costs $c_M(x,y)=(x-y)^\top M(x-y)$, we prove a sharp characterization: compatibility holds exactly when $K$ is acute under the $M$-inner product, namely $u^\top Mv\ge0$ for all $u,v\in K$, equivalently $K\subseteq K_M^*$. Under this condition, measures supported on cone chains admit a quantile-type closed-form optimal coupling, yielding exact transport under the original ground cost rather than after projection or metric replacement. We distinguish the resulting cone-chain Wasserstein metric on canonically ordered chain distributions from an extended directed cone transport cost on general measures, and develop feasibility, duality, stability, approximation, Gaussian recovery, statistical, and computational results. The theory is complementary to sliced and tree Wasserstein distances: it is not a universal fast surrogate, but a way to obtain interpretable, direction-valid, original-space monotone transport for ordered high-dimensional data.
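The acuteness condition, u^T M v >= 0 for all u, v in K, can be checked numerically when the cone is finitely generated. The sketch below assumes K is given by a finite list of generators (an assumption; the abstract treats general closed convex cones). In that case bilinearity reduces the condition to the generator pairs, because every element of K is a non-negative combination of generators. The names `bilinear` and `is_acute` are illustrative and not taken from the paper.

```python
def bilinear(u, M, v):
    """Return u^T M v for vectors u, v and a square matrix M given as nested lists."""
    return sum(u[i] * M[i][j] * v[j] for i in range(len(u)) for j in range(len(v)))


def is_acute(generators, M, tol=1e-12):
    """Check g^T M h >= 0 for every pair of generators of a finitely generated cone.

    For K = cone{g_1, ..., g_m} this is equivalent to u^T M v >= 0 for all
    u, v in K, since u and v are non-negative combinations of the generators.
    """
    return all(bilinear(g, M, h) >= -tol for g in generators for h in generators)


# The non-negative orthant in R^2, generated by the two coordinate axes.
orthant = [(1.0, 0.0), (0.0, 1.0)]
identity = [[1.0, 0.0], [0.0, 1.0]]    # plain squared Euclidean cost
skewed = [[1.0, -0.5], [-0.5, 1.0]]    # positive definite, negative off-diagonal

orthant_ok_euclidean = is_acute(orthant, identity)
orthant_ok_skewed = is_acute(orthant, skewed)
```

With the identity matrix the orthant is acute, while the negative off-diagonal entry of `skewed` makes the two axes obtuse to each other under the $M$-inner product, so the same cone fails the test for that cost.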


2606.04689 2026-06-04 quant-ph cs.LG 版本更新

QPredSGG: Hybrid Quantum Predicate Learning for Long-Tailed Scene Graph Generation

QPredSGG：面向长尾场景图生成的混合量子谓词学习

Prerana Ramkumar, Nouhaila Innan, Muhammad Shafique

发表机构 * Department of Computer Science, University of Waterloo（滑铁卢大学计算机科学系）； Machine Learning Research Group, University of Waterloo（滑铁卢大学机器学习研究组）

AI总结针对场景图生成中长尾谓词分布导致的分类偏差，提出用量子谓词头（QP-Head）替换经典谓词头，通过振幅嵌入和强纠缠层压缩特征，在Visual Genome 150上实现参数高效的长尾关系分类。

Comments 11 pages, 5 figures

详情

AI中文摘要

节点分类的主动学习通常专注于在一个或几个大图（例如社交网络分析）中选择最具信息量的节点进行标注。然而，在其他领域，如分子化学或电子设计自动化，数据集由数千个独立图组成。在许多这样的归纳式设置中，标注单个节点需要全图分析，这实际上会即时产生剩余的节点标签。因此，这些场景需要选择整个图而非单个节点的主动学习策略，而这一问题迄今尚未在文献中得到解决。因此，我们提出了ALINC，一个通过图采样进行归纳式节点分类的主动学习框架。它通过多种聚合机制将节点级效用度量提升为图级选择标准，从而弥合了现有的方法论差距。在包含十种策略、三种聚合方法和四个数据集的广泛基准测试中，我们确定了CoreSet、TypiClust和BADGE作为性能最佳的图采样策略。我们的详细分析进一步揭示，聚合方法的选择至关重要，因为它显著影响模型性能和标注成本。最后，我们在两个用例研究中展示了ALINC的有效性：分子中的代谢位点预测和印刷电路板原理图的设计自动化。

英文摘要

Active learning (AL) for node classification typically focuses on selecting the most informative nodes for annotation within one or a few large graphs (e.g., in social network analysis). However, in other domains, such as molecular chemistry or electronic design automation, datasets consist of thousands of independent graphs. In many of these inductive settings, annotating an individual node requires a full-graph analysis, which effectively yields the remaining node labels on-the-fly. Therefore, these scenarios require AL strategies that select entire graphs instead of single nodes, a problem which has not been tackled in the literature so far. Thus, we introduce ALINC, an AL framework for inductive node classification via graph sampling. It bridges the existing methodological gap by elevating node-level utility measures to graph-level selection criteria through various aggregation mechanisms. In an extensive benchmark including ten strategies, three aggregation methods, and four datasets, we identify CoreSet, TypiClust, and BADGE as the top-performing graph sampling strategies. Our detailed analysis further reveals that the choice of the aggregation method is pivotal, as it substantially affects model performance and annotation costs. Finally, we demonstrate the effectiveness of ALINC in two use case studies: site-of-metabolism prediction in molecules and design automation of printed circuit board schematics.


2606.04634 2026-06-04 cs.LG 版本更新

Explainably Safe Reinforcement Learning

可解释的安全强化学习

Sabine Rieder, Stefan Pranger, Debraj Chakraborty, Jan Křetínský, Bettina Könighofer

发表机构 * Masaryk University（马萨里克大学）； Graz University of Technology（格拉茨工业大学）； Technical University of Munich（慕尼黑工业大学）

AI总结提出一种基于分层决策树的可解释安全强化学习方法，通过世界模型分析状态风险并构建屏蔽策略，生成可理解的解释，同时保持安全保证。

详情

AI中文摘要

对决策系统的信任既需要安全保证，也需要解释和理解其行为的能力。这对于学习系统尤为重要，因为其决策过程往往高度不透明。屏蔽是一种基于模型的强化学习安全增强技术。然而，由于屏蔽是通过严格的形式化方法自动合成的，其决策同样难以被人类解释。最近，决策树被广泛用于表示控制器和策略。但由于屏蔽本质上具有非确定性，其决策树表示变得过大，无法在实践中提供可解释性。为应对这一挑战，我们提出了一种新颖的可解释安全强化学习方法，通过提供人类可理解的屏蔽决策解释来增强信任。我们的方法将屏蔽策略表示为分层决策树，提供自上而下的基于案例的解释。在设计时，我们使用世界模型分析在给定状态下执行动作的安全风险。基于此分析，我们构建屏蔽策略和一个高层决策树，将状态分类为风险类别（安全、关键、危险、不安全），解释为何某种情况可能涉及安全关键。在运行时，我们生成局部决策树，解释哪些动作被允许以及为何其他动作被认为不安全。我们的方法促进了屏蔽安全强化学习中安全方面的可解释性，不需要超出屏蔽已用信息的额外信息，开销极小，并能轻松集成到现有的屏蔽强化学习流程中。实验中，我们使用比原始屏蔽小几个数量级的决策树来计算解释。

英文摘要

Trust in a decision-making system requires both safety guarantees and the ability to interpret and understand its behavior. This is particularly important for learned systems, whose decision-making processes are often highly opaque. Shielding is a prominent model-based technique for enforcing safety in reinforcement learning. However, because shields are automatically synthesized using rigorous formal methods, their decisions are often similarly difficult for humans to interpret. Recently, decision trees have become a customary representation for controllers and policies. However, since shields are inherently non-deterministic, their decision tree representations become too large to be explainable in practice. To address this challenge, we propose a novel approach for explainable safe RL that enhances trust by providing human-interpretable explanations of the shield's decisions. Our method represents the shielding policy as a hierarchy of decision trees, offering top-down, case-based explanations. At design time, we use a world model to analyze the safety risks of executing actions in given states. Based on this analysis, we construct both the shield and a high-level decision tree that classifies states into risk categories (safe, critical, dangerous, unsafe), explaining why a situation may be safety-critical. At runtime, we generate localized decision trees that explain which actions are allowed and why others are deemed unsafe. Our method facilitates explainability of the safety aspect in safe-by-shielding reinforcement learning, requires no additional information beyond what is already used for shielding, incurs minimal overhead, and integrates readily into existing shielded RL pipelines. In our experiments, we compute explanations using decision trees that are several orders of magnitude smaller than the original shield.


2606.04632 2026-06-04 cs.LG cs.CL 版本更新

VentAgent: When LLMs Learn to Breathe -- Multi-Objective Arbitration for ARDS Ventilation

VentAgent：当大语言模型学会呼吸——ARDS通气的多目标仲裁

Teqi Hao, Yuxuan Fu, Xiaoyu Tan, Shaojie Shi, Bohao Lv, Yinghui Xu, Xihe Qiu

发表机构 * School of Electronic and Electrical Engineering, Shanghai University of Engineering Science（上海工程技术大学电子与电气工程学院）； Tencent Youtu Lab（腾讯优图实验室）； Artificial Intelligence Innovation and Incubation Institute, Fudan University（复旦大学人工智能创新与孵化院）

AI总结提出VentAgent分层框架，利用大语言模型作为透明仲裁者，通过感知-规划-编排三阶段将机械通气控制转化为动态多目标仲裁过程，在生理模拟器上优于强化学习和经典控制基线，并提供可解释的推理链。

详情

AI中文摘要

急性呼吸窘迫综合征（ARDS）的机械通气需要平衡竞争性的生理目标，包括氧合、肺保护和酸碱平衡。然而，当前的数据驱动方法，尤其是模仿回顾性电子健康记录（EHR）的方法，常常遭受模仿偏差。它们可能从不一致的临床演示中捕获表面相关性，例如将被动呼吸机设置与生存关联，因为这种设置在稳定患者中很常见，因此无法泛化到不稳定或分布外的表型。标准的强化学习（RL）方法也难以处理重症监护中的对抗性权衡，并常常产生不透明且临床可解释性有限的策略。为了解决这些局限性，我们引入了VentAgent，一个分层框架，其中大语言模型（LLM）作为机械通气的透明仲裁者。我们将通气控制重新表述为动态多目标仲裁过程，而非单目标优化。VentAgent将决策分解为三个可解释的阶段：感知、规划和编排。通过利用LLM的语义推理能力，它综合来自异构专家的策略，并通过显式协调机制解决冲突的临床优先级。在高保真生理模拟器上的评估表明，VentAgent优于最先进的RL和经典控制基线。此外，它将控制决策转化为人类可读的推理链，为重症监护自动化提供了更安全、更可解释和更自适应的范式。

英文摘要

Mechanical ventilation for Acute Respiratory Distress Syndrome (ARDS) requires balancing competing physiological goals, including oxygenation, lung protection, and acid-base homeostasis. However, current data-driven methods, especially those imitating retrospective Electronic Health Records (EHR), often suffer from imitation bias. They may capture superficial correlations from inconsistent clinical demonstrations, such as associating passive ventilator settings with survival because such settings are common in stable patients, and thus fail to generalize to volatile or out-of-distribution phenotypes. Standard Reinforcement Learning (RL) methods also struggle with the adversarial trade-offs of critical care and often produce opaque policies with limited clinical interpretability. To address these limitations, we introduce VentAgent, a hierarchical framework in which Large Language Models (LLMs) act as transparent arbitrators for mechanical ventilation. We reformulate ventilation control as a dynamic Multi-Objective Arbitration process rather than single-objective optimization. VentAgent decomposes decision-making into three interpretable stages: Perception, Planning, and Orchestration. By leveraging the semantic reasoning capabilities of LLMs, it synthesizes strategies from heterogeneous experts and resolves conflicting clinical priorities through an explicit coordination mechanism. Evaluations on a high-fidelity physiological simulator show that VentAgent outperforms state-of-the-art RL and classical control baselines. Moreover, it converts control decisions into human-readable reasoning chains, offering a safer, more interpretable, and adaptable paradigm for critical care automation.


2606.04623 2026-06-04 cs.LG 版本更新

面向不确定性感知检索的分布近似最近邻搜索

Olivier Jeunen

发表机构 * Antwerp, Belgium（比利时安特卫普）

AI总结提出DINOSAUR框架，通过为每个物品采样多个嵌入并构建索引，在检索时对用户嵌入进行采样，以隐式边缘化嵌入不确定性，从而在不改变模型架构或索引基础设施的情况下提升长尾物品的覆盖。

详情

AI中文摘要

近似最近邻搜索索引构成了现实世界推荐系统的骨干，支持在百万级物品目录上进行实时候选检索。通常，为每个用户和每个物品学习一个点估计嵌入。在服务时，用户嵌入查询索引以获取相关物品。由于这些表示是从稀疏交互数据中学习的，它们带有噪声，可能无法捕捉所有有助于“相关性”的细微差别——忽略了其固有的基本不确定性。结果是检索管道系统性地偏向于少数嵌入估计良好的热门头部物品，而牺牲了长尾中多数小众、多样和偶然的内容。我们提出了DINOSAUR（面向不确定性感知检索的分布近似最近邻搜索）：一个简单且与基础设施兼容的框架，将嵌入不确定性纳入候选生成。DINOSAUR不为点估计建立索引，而是为每个物品采样$S_i$个嵌入，并在这一增强集上构建索引。类似地，在查询时，对用户嵌入进行采样。这种双边的随机检索过程隐式地边缘化了嵌入不确定性，无需改变模型架构或ANN索引基础设施。在分析方面，我们展示了当不确定性消失时，DINOSAUR恢复标准的点估计检索，并刻画了增加的嵌入方差如何扩展不确定物品可检索的潜在空间区域。可重复的实证观察与这些预期一致，显示出在离线召回率小幅损失的情况下，覆盖率大幅提升。

英文摘要

Approximate Nearest Neighbour search indices form the backbone of real-world recommender systems, enabling real-time candidate retrieval over million-item catalogues. Typically, a single point estimate embedding is learnt for every user and every item. At serving time, the user embedding queries the index for relevant items. Since these representations are learnt from sparse interaction data, they are noisy and might fail to capture all the nuances that contribute to "relevance", ignoring the fundamental uncertainty that is inherent to them. The result is a retrieval pipeline that is systematically biased toward the small minority of popular head items with well-estimated embeddings, at the expense of the long-tail majority of niche, diverse, and serendipitous content. We propose DINOSAUR (Distributional Approximate Nearest Neighbour Search for Uncertainty-Aware Retrieval): a simple and infrastructure-compatible framework to incorporate embedding uncertainty into candidate generation. Rather than indexing point estimates, DINOSAUR samples $S_i$ embeddings per item and constructs an index on this augmented set. Analogously, at query time, a user embedding is sampled. This two-sided stochastic retrieval process implicitly marginalises over embedding uncertainty, without requiring changes to model architecture or ANN index infrastructure. On the analytical side, we show that DINOSAUR recovers standard point-estimate retrieval as uncertainty vanishes, and we characterise how increased embedding variance expands the regions of latent space in which uncertain items are retrievable. Reproducible empirical observations align with these expectations, showing large coverage gains with small losses in offline recall.
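The two-sided sampling procedure can be sketched in a few lines. This is an illustration under stated assumptions, not the paper's implementation: embeddings are modelled as diagonal Gaussians (the abstract does not name the distribution), every item gets the same number of samples instead of a per-item $S_i$, relevance is an inner product, and a brute-force scan stands in for a real ANN index. All names are hypothetical.

```python
import random


def sample_embedding(mean, std, rng):
    """Draw one embedding from an assumed diagonal Gaussian N(mean, diag(std^2))."""
    return [rng.gauss(m, s) for m, s in zip(mean, std)]


def build_index(items, samples_per_item, rng):
    """Index several sampled embeddings per item instead of one point estimate.

    items maps item_id -> (mean, std); the result is a flat list of
    (vector, item_id) entries, which is what an ordinary ANN index would store.
    """
    index = []
    for item_id, (mean, std) in items.items():
        for _ in range(samples_per_item):
            index.append((sample_embedding(mean, std, rng), item_id))
    return index


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def retrieve(index, user_mean, user_std, k, rng):
    """Sample a user embedding, rank all entries by inner product, dedupe item ids."""
    query = sample_embedding(user_mean, user_std, rng)
    ranked = sorted(index, key=lambda entry: dot(query, entry[0]), reverse=True)
    result = []
    for _, item_id in ranked:
        if item_id not in result:
            result.append(item_id)
        if len(result) == k:
            break
    return result
```

Setting every standard deviation to zero and drawing one sample per item makes this identical to ordinary point-estimate retrieval, which mirrors the limiting behaviour the abstract states. Because the index only ever sees plain vectors, the serving infrastructure does not need to change.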


2606.04583 2026-06-04 cs.LG 版本更新

HalfNet: Randomized Neural Networks with Learned Subspace Geometry

HalfNet: 具有学习子空间几何的随机神经网络

Ethem Alpaydin

发表机构 * Ethem Alpaydin

AI总结提出HalfNet，通过从可学习的低秩协方差矩阵中随机采样权重，在减少参数的同时匹配全连接网络的性能，揭示权重空间几何对预测能力的关键作用。

Comments 6 pages (+2 pages of appendix), 6 figures

2606.04582 2026-06-04 physics.comp-ph cs.LG physics.app-ph 版本更新

Reconstructing Unobservable Temperature Fields via Simulation-Aided Intelligent Sensing

通过仿真辅助智能感知重建不可观测温度场

Monika Stipsitz, Hèlios Sanchis-Alepuz, Jacob Reynvaan, Silvester Sabathiel

发表机构 * Silicon Austria Labs（奥地利硅实验室）； Republic of Austria（奥地利共和国）； Styrian Business Promotion Agency（施蒂里亚商业促进局）； federal state of Carinthia（克恩顿州）； Upper Austrian Research（上奥地利研究）； Austrian Association for the Electric and Electronics Industry（奥地利电子电气工业协会）

AI总结提出基于随机物理仿真生成数据集的方法，训练神经网络从稀疏传感器重建内部温度场，实现实时在线监测。

Comments Presented at IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Nancy, France, 2026

2606.04576 2026-06-04 stat.ML cs.LG econ.EM q-fin.RM 版本更新

Neetyabhas: 理性主体模型中不确定性感知的公共政策优化框架

Janani Venugopalan, Gaurav Deshkar, Rishabh Gaur, Harshal Hayatnagarkar, Jayanta Kshirsagar

发表机构 * ThoughtWorks

AI总结提出一种集成流行病测量和政策执行不确定性的分层强化学习框架，通过模拟个体行为与政策干预的交互，有效管理疫情并降低影响。

详情

AI中文摘要

目的：世界卫生组织的COVID-19非药物干预措施（如封锁、疫苗接种）有效遏制了传播，但带来了沉重的经济负担。现有研究常常忽略个体行为，并错误地假设完美的感染追踪和无误的政策执行，未能考虑现实世界的不确定性和错误。方法：我们提出了一种整合流行病测量（感染/住院）和政策执行中不确定性的方法。我们构建了一个包含1000名个体的模拟模型，这些个体实时做出关于佩戴口罩、接种疫苗和购物的选择。同时，政策制定者基于健康和经济观察部署干预措施（封锁、强制令）。该框架由分层强化学习智能体驱动，利用深度Q网络以及不确定性感知的策略梯度变体（DDPG和TD3）。结果：模拟有效管理了疫情的进展。佩戴口罩和疫苗接种被证明非常有效，显著降低了疫情高峰的高度和持续时间。通过整合个体行为、政策不确定性和多方面的干预措施，我们的动态控制方法成功减轻了疫情的影响。结论：我们的模型通过将不确定性和人类行为嵌入公共卫生政策框架，克服了以往研究的局限性。模拟表明，考虑个体选择和不完美数据对于设计复杂疫情期间的有效干预措施至关重要，其中口罩和疫苗是关键工具。

英文摘要

Purpose: The WHO's COVID-19 non-pharmaceutical interventions (e.g., lockdowns, vaccinations) effectively curb transmission but impose heavy economic strains. Existing research often neglects individual behaviors and falsely assumes perfect infection tracking and flawless policy execution, failing to account for real-world uncertainties and errors. Methods: We propose an integrative approach incorporating uncertainties in both epidemic measurement (infections/hospitalizations) and policy implementation. We built a simulation model of 1,000 individuals making real-time choices regarding mask-wearing, vaccination, and shopping. Concurrently, policymakers deploy interventions (lockdowns, mandates) based on health and economic observations. This framework is driven by hierarchical reinforcement learning agents, utilizing deep Q-networks alongside uncertainty-aware policy gradient variants (DDPG and TD3). Results: The simulations effectively managed the epidemic's progression. Masking and vaccinations proved highly effective, significantly reducing both the outbreak's peak height and duration. By integrating individual behaviors, policy uncertainties, and multifaceted interventions, our dynamic control approach successfully mitigated the epidemic's impact. Conclusions: Our model overcomes previous research limitations by embedding uncertainty and human behavior into public health policy frameworks. The simulation demonstrates that accounting for individual choices and imperfect data is crucial for designing effective interventions during complex pandemics, with masks and vaccines serving as pivotal tools.


2606.04557 2026-06-04 cs.CL cs.IR cs.LG 版本更新

Cartridges at Scale: Training Modular KV Caches over Large Document Collections

大规模弹匣：训练模块化KV缓存以处理大型文档集合

Momchil Hardalov, Gonzalo Iglesias, Adrià de Gispert

发表机构 * Amazon AGI（亚马逊人工智能研究院）

AI总结提出Cartridges at Scale (CAS)框架，通过动态干扰混合和内存高效预算管理器实现大规模多弹匣训练，在减少预填充开销的同时保持准确性，性能优于单块弹匣10-31点，接近全上下文学习。

Comments 21 pages, 5 figures, 17 tables

详情

AI中文摘要

大型语言模型能够处理长上下文，但预填充数百万个标记是浪费的，因为许多内容在查询之间保持不变。弹匣通过将文档集合提炼为可重用的键值（KV）缓存来解决这一问题，从而消除预填充同时保持准确性。这种方法的一个关键限制是弹匣是单块且非组合的：将整个集合编码为单个KV块无法扩展，并且天真地混合单独训练的弹匣会使性能下降到接近随机水平。我们引入了Cartridges at Scale (CAS)，这是一个可扩展的多弹匣学习训练框架，具有动态干扰混合和内存高效的预算管理器，可在GPU和持久存储之间轮换数百个每文档弹匣。我们的方法可扩展到超过一百万个标记的集合，在可比标记预算下，比单块弹匣提高10-31点。即使在高度压缩下，Oracle弹匣准确率也接近完全上下文学习的2-6点范围内。当与检索结合用于弹匣选择时，CAS匹配或超过传统RAG准确率，同时消耗的提示标记减少3-4倍。

英文摘要

Large Language Models can reason over long contexts, yet prefilling millions of tokens is wasteful as much of the content remains static across queries. Cartridges address this by distilling document collections into reusable key-value (KV) caches that eliminate prefilling while preserving accuracy. A critical limitation of this approach is that cartridges are monolithic and non-compositional: encoding an entire collection into a single KV block does not scale, and naively mixing cartridges trained in isolation collapses performance to near chance. We introduce Cartridges at Scale (CAS), a training framework for scalable multi-cartridge learning with dynamic distractor mixing and a memory-efficient budget manager that rotates hundreds of per-document cartridges between GPU and persistent storage. Our approach scales to collections exceeding a million tokens, improving over a monolithic cartridge by 10-31 points at comparable token budgets. Oracle cartridge accuracy falls within 2-6 points of full in-context learning even at high compression. When paired with retrieval for cartridge selection, CAS matches or exceeds conventional RAG accuracy while consuming 3-4x fewer prompt tokens.


2606.04522 2026-06-04 cs.IR cs.AI cs.DB cs.LG 版本更新

ANN Search: Recall What Matters

ANN搜索：召回真正重要的

Dimitris Dimitropoulos, Nikos Mamoulis

发表机构 * University of Ioannina（约阿尼纳大学）； Archimedes, Athena RC（Archimedes，雅典娜研究中心）

AI总结本文提出用逆近似比1/Ratio@k替代Recall@k来评估近似最近邻搜索质量，实验表明前者能更准确反映实际效用并降低计算开销。

详情

AI中文摘要

近似最近邻（ANN）搜索已成为信息检索和现代机器学习任务（从分类到检索增强生成）的核心原语。社区主要通过给定Recall@k（检索到的真实精确最近邻的比例）下的吞吐量来评估和调优ANN算法。我们认为，ANN搜索真正重要的是检索结果的质量，而非它们与真实kNN集合的重叠。我们证明，使用Recall@k评估检索质量会带来不必要的计算开销，并研究用逆近似比1/Ratio@k替代它。1/Ratio@k评估检索到的邻居与真实邻居之间距离的差异。它无需判断、无需超参数，仅通过标准ANN基准输入即可计算。我们在涵盖广泛内在维度的多样化数据集上对最先进的ANN算法进行基准测试，从效率、下游分类和检索增强生成三个维度全面评估这两个指标。在效率方面，优化1/Ratio@k达到操作质量阈值所需的计算成本远低于Recall@k。在下游任务中，即使Recall@k显著下降，性能指标（标签精度、语义相似度、BERTScore和LLM评分质量）仍保持高度稳定。相反，逆近似比紧密反映了这种稳定性，比Recall@k更好地追踪实际效用。最终，虽然Recall@k夸大了近似的真实成本，但1/Ratio@k提供了更准确、可部署的ANN实际质量代理。

英文摘要

Approximate nearest neighbor (ANN) search has become a core primitive in information retrieval and modern machine learning tasks, from classification to retrieval-augmented generation. The community evaluates and tunes ANN algorithms primarily on their throughput at a given Recall@k, the fraction of true exact neighbors retrieved. We argue that what really matters in ANN search is the quality of the retrieved results and not their overlap with the true kNN set. We show that using Recall@k to assess retrieval quality forces unnecessary computational overhead and investigate replacing it by 1/Ratio@k, the inverse approximation ratio. 1/Ratio@k evaluates the differences between the distances of the retrieved and true neighbors. It is judge-free, hyperparameter-free, and computable from standard ANN benchmark inputs alone. We benchmark state-of-the-art ANN algorithms across diverse datasets spanning a wide range of intrinsic dimensionalities, evaluating the two metrics comprehensively across efficiency, downstream classification, and retrieval-augmented generation. On the efficiency axis, optimizing for 1/Ratio@k reaches operational quality thresholds at a substantially lower computational cost than Recall@k. In downstream tasks, performance indicators (label precision, semantic similarity, BERTScore, and LLM-graded quality) remain highly stable even when Recall@k drops significantly. The inverse approximation ratio, on the other hand, closely mirrors this stability, tracking true utility much better than Recall@k. Ultimately, while Recall@k overstates the true cost of approximation, 1/Ratio@k offers a more accurate, deployable proxy for actual ANN quality.
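The contrast between the two metrics is easy to see on a toy example. The sketch below assumes one common definition of the approximation ratio, the mean over ranks of (retrieved distance / true distance) with both lists sorted by distance; the abstract does not spell out the exact formula, so treat `inverse_ratio_at_k` as an illustration rather than the paper's metric. Ranks whose true distance is zero are skipped, which is a simplification of this sketch.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def exact_knn(query, points, k):
    """Indices of the k exact nearest neighbours, found by brute force."""
    order = sorted(range(len(points)), key=lambda i: euclidean(query, points[i]))
    return order[:k]


def recall_at_k(retrieved, truth):
    """Fraction of the true k nearest neighbours that were retrieved."""
    return len(set(retrieved) & set(truth)) / len(truth)


def inverse_ratio_at_k(query, points, retrieved, truth):
    """1 / Ratio@k, with Ratio@k the mean rank-wise distance ratio (assumed form)."""
    d_ret = sorted(euclidean(query, points[i]) for i in retrieved)
    d_true = sorted(euclidean(query, points[i]) for i in truth)
    ratios = [r / t for r, t in zip(d_ret, d_true) if t > 0]
    ratio = sum(ratios) / len(ratios) if ratios else 1.0
    return 1.0 / ratio


# Three near-equidistant points and one far point (made-up data).
points = [(1.0,), (1.01,), (1.02,), (10.0,)]
query = (0.0,)
truth = exact_knn(query, points, k=2)   # the two closest points
retrieved = [0, 2]                       # an ANN result that misses point 1

recall = recall_at_k(retrieved, truth)
inv_ratio = inverse_ratio_at_k(query, points, retrieved, truth)
```

Here the retrieved set misses one of the two true neighbours, so recall is halved, yet the substitute is almost exactly as close to the query and the inverse ratio stays near 1. This is the situation in which overlap-based recall overstates the cost of approximation.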


2606.04516 2026-06-04 cs.LG cs.AI 版本更新

GeoMin: Data-Efficient Semi-Supervised RLVR via Geometric Distribution Modeling

GeoMin: 基于几何分布建模的数据高效半监督RLVR

Guangcheng Zhu, Shenzhi Yang, Haobo Wang, Xing Zheng, Yingfan MA, Xuening Feng, Zhongqi Chen, Kai Tang, Zhengqing Zang, Bowen Song, Weiqiang Wang, Gang Chen

发表机构 * Zhejiang University（浙江大学）； Ant Group（蚂蚁集团）

AI总结提出GeoMin方法，通过建模标注数据的全局特征分布来解码正确与错误展开的结构差异，从而建立稳健先验评估自奖励信号可靠性，以少量标注数据高效利用未标注数据，在仅用10%标注时超越全监督模型。

详情

AI中文摘要

基于可验证奖励的强化学习（RLVR）显著提升了LLM的推理能力，但面临困境：标准监督扩展受限于高标注成本，而无监督替代方案则遭受严重的模型崩溃。最近的半监督RLVR方法通过使用少量标注集指导未标注数据，在训练效果和标注成本之间取得了有前景的权衡。然而，由于依赖粗糙的性能启发式，它们遭受严重的数据效率瓶颈，导致绝大多数有价值实例未被充分利用。为此，我们提出GeoMin，它在标注数据上建模全局特征分布，以解码正确和错误展开之间的结构差异，从而建立稳健的先验来评估自奖励信号的可靠性，并充分释放未标注数据的潜力。实验上，GeoMin比最强基线高出+4.1%，甚至在使用仅10%标注的情况下超越全监督模型，展示了显著的数据效率。

英文摘要

Reinforcement learning with verifiable rewards (RLVR) significantly advances LLM reasoning, yet it faces a dilemma: standard supervised scaling is throttled by high annotation costs, while unsupervised alternatives suffer from severe model collapse. Recent semi-supervised RLVR methods address this by using a small labeled set to guide unlabeled data, achieving a promising trade-off between training efficacy and annotation cost. However, they suffer from a severe data-efficiency bottleneck due to the reliance on coarse performance heuristics, leaving a vast majority of valuable instances underutilized. To this end, we propose GeoMin, which models global feature distributions on labeled data to decode the structural discrepancy between correct and incorrect rollouts, thereby establishing a robust prior to assess the reliability of self-reward signals and fully unleash the potential of unlabeled data. Empirically, GeoMin outperforms the strongest baselines by +4.1% and even surpasses fully supervised models with only 10% of the annotations, demonstrating remarkable data efficiency.


2606.04511 2026-06-04 cs.CL cs.LG 版本更新

SparDA: Sparse Decoupled Attention for Efficient Long-Context LLM Inference

SparDA: 用于高效长上下文LLM推理的稀疏解耦注意力

Yaosheng Fu, Guangxuan Xiao, Xin Dong, Song Han, Oreste Villa

发表机构 * NVIDIA ； Thinking Machines Lab ； ByteDance Seed ； MIT

AI总结提出SparDA架构，通过引入第四投影Forecast实现KV缓存预取与注意力解耦，减少稀疏选择开销，在长上下文推理中实现1.25倍预填充加速和1.7倍解码加速。

详情

AI中文摘要

稀疏注意力减少了长上下文LLM推理的计算和内存带宽。然而，仍然存在两个关键挑战：（1）KV缓存容量随序列长度增长，卸载到CPU内存引入了PCIe传输瓶颈；（2）稀疏选择步骤本身保持$O(T^2)$复杂度，在长上下文中可能主导注意力成本。我们提出SparDA，一种解耦的稀疏注意力架构，它在Query、Key和Value之外引入了第四个逐层投影——Forecast。Forecast预测下一层所需的KV块，从而实现超前选择，将CPU到GPU的预取与当前层执行重叠。由于Forecast与注意力查询解耦，我们的GQA实现为每个GQA组使用一个Forecast头，相比原始多头选择器减少了选择开销。SparDA增加了<0.5%的参数，并通过匹配原始选择器的注意力分布仅训练Forecast投影。在两个稀疏预训练的8B模型上，SparDA匹配或略微提高了准确性，并且相比稀疏注意力卸载基线，提供了高达1.25倍的预填充加速和1.7倍的解码加速。通过使单个GPU上可行的批量大小更大，SparDA进一步实现了比非卸载稀疏基线高达5.3倍的解码吞吐量。我们的源代码可在https://github.com/NVlabs/SparDA获取。

英文摘要

Sparse attention reduces compute and memory bandwidth for long-context LLM inference. However, two key challenges remain: (1) KV cache capacity still grows with sequence length, and offloading to CPU memory introduces a PCIe transfer bottleneck; (2) the sparse selection step itself retains $O(T^2)$ complexity and can dominate attention cost at long contexts. We propose SparDA, a decoupled sparse attention architecture that introduces a fourth per-layer projection, the Forecast, alongside Query, Key, and Value. The Forecast predicts the KV blocks needed by the next layer, enabling lookahead selection that overlaps CPU-to-GPU prefetch with current-layer execution. Because Forecast is decoupled from the attention query, our GQA implementation uses one Forecast head per GQA group, reducing selection overhead versus the original multi-head selector. SparDA adds $<$0.5% parameters and trains only the Forecast projections by matching the original selector's attention distribution. On two sparse-pretrained 8B models, SparDA matches or slightly improves accuracy and delivers up to 1.25$\times$ prefill speedup and 1.7$\times$ decode speedup over the sparse-attention offload baseline. By enabling larger feasible batch sizes on a single GPU, SparDA further reaches up to 5.3$\times$ higher decode throughput than the non-offload sparse baseline. Our source code is available at https://github.com/NVlabs/SparDA.


2606.04503 2026-06-04 cs.LG cs.AI 版本更新

Smart Picks in the Dark: Towards Efficient RLVR for Reasoning via Tracing Metacognitive Pivots

暗中选择：通过追踪元认知支点实现高效的推理可验证奖励强化学习

Guangcheng Zhu, Shenzhi Yang, Haobo Wang, Xing Zheng, Yingfan MA, Xuening Feng, Zhongqi Chen, Bowen Song, Weiqiang Wang, Gang Chen

发表机构 * Zhejiang University（浙江大学）； Ant Group（蚂蚁集团）

AI总结针对可验证奖励强化学习（RLVR）中数据效率低的问题，提出PivotTrace框架，利用注意力动态追踪推理过程中的元认知支点，通过支点密度量化不确定性实现数据自动分流，在仅使用29.3%标注样本和2.75倍收敛加速下超越全监督模型。

详情

AI中文摘要

可验证奖励强化学习（RLVR）极大地推进了大型推理模型（LRMs），但它需要及时在大量完全标注的数据集上进行训练。为此，从两个角度广泛研究了数据高效的RLVR方法：（i）数据选择方法识别一小部分“黄金”样本，这些样本能产生接近全数据性能，但它们依赖于预先存在的标注数据池。（ii）无监督RLVR方法在大规模未标注数据上利用模型自身的内部监督信号进行训练，但表现出次优性能。因此，我们研究了RLVR的“暗中选择”设置，其目标是在没有先验监督的情况下，选择对训练最有益且值得标注的未标注样本。通过系统分析，我们证明智能选择依赖于一个校准良好的不确定性估计器，以实现数据的策略性划分，从而进行自适应训练方案。基于这一见解，我们提出了PivotTrace，一个三路数据分流框架，利用注意力动态追踪推理过程中的元认知支点。通过支点密度精确量化不确定性，PivotTrace实现了自动数据路由，协同最大化标注和训练效率。实验表明，PivotTrace仅使用29.3%的标注样本和2.75倍的收敛速度就超越了全监督LRM。

英文摘要

Reinforcement learning with verifiable rewards (RLVR) has greatly advanced large reasoning models (LRMs), but it requires timely training on a huge fully-annotated dataset. To this end, data-efficient RLVR methods have been widely studied from two perspectives: (i) data selection methods identify a small subset of "golden" samples that yield near-full-data performance, but they rely on a pre-existing pool of labeled data; (ii) unsupervised RLVR methods train the model using its own internal supervision signals on large-scale unlabeled data, yet they exhibit suboptimal performance. Accordingly, we investigate the "pick in the dark" setup for RLVR, which aims to select, without prior supervision, unlabeled samples that are most beneficial for training and worthy of annotation. Through systematic analysis, we demonstrate that smart picks hinge on a well-calibrated uncertainty estimator to enable strategic partitioning of data for adaptive training regimes. Building on this insight, we propose PivotTrace, a three-way data triage framework that leverages attention dynamics to trace metacognitive pivots during reasoning. By precisely quantifying uncertainty through pivot density, PivotTrace achieves automated data routing to synergistically maximize both annotation and training efficiency. Empirically, PivotTrace surpasses the fully supervised LRM with only 29.3% annotated samples and 2.75$\times$ faster convergence.


2606.04499 2026-06-04 cs.SI cs.LG 版本更新

Modeling and Interpreting Teamwork Dynamics in Cancer Care Outcome Prediction

建模与解释癌症护理结果预测中的团队协作动态

Yuhua Huang, Hsiao-Ying Lu, Kwan-Liu Ma

发表机构 * University of California, Davis（加州大学戴维斯分校）

AI总结利用电子健康记录中的协作网络和机器学习方法，研究医疗专业人员团队协作动态对癌症患者生存预测的影响，并解释关键网络特征。

详情

AI中文摘要

癌症护理需要纵向方法，根据每个患者的需求随时间规划和实施治疗。虽然先前研究深入探讨了临床和人口统计学因素（如合并症和年龄）如何指导治疗规划，但对护理实施阶段的关注却少得多。然而，规划和实施都是基于团队的过程，依赖于多个医疗专业人员之间的协调努力。因此，这些协作实践中蕴含的人为因素对于优化患者结果至关重要。尽管重要性显著，但现有关于癌症护理中人为因素的文献有限，很少有研究调查护理团队内的协作如何在治疗过程中演变。为填补这一空白，本研究探讨通过电子健康记录系统捕获的医疗专业人员协作如何影响癌症患者结果，特别强调团队协作动态。我们将电子健康记录介导的医疗专业人员交互表示为网络，并应用机器学习方法识别这些协作结构中嵌入的患者生存预测信号。我们进一步通过指出与特定结果相关的网络特征和动态模式来解释模型预测。我们通过稳健性分析评估模型，确保发现稳定且不受训练中随机变异驱动。此外，我们的见解与医学文献中提出的假设一致，我们的结果为这些主张提供了基于经验数据的证据。总体而言，我们的工作提供了一个实用流程，利用协作的数字痕迹来评估和加强纵向团队医疗，为医疗实施中的数据驱动干预提供可操作的见解。

英文摘要

Cancer care requires a longitudinal approach in which treatments are planned and delivered over time according to the needs of each individual patient. While prior research has thoroughly explored how clinical and demographic factors, such as comorbidities and age, inform treatment planning, far less attention has been devoted to the delivery phase of care. Yet planning and delivery are both team-based processes that depend on coordinated efforts among multiple healthcare professionals (HCPs). As such, the human factors embedded in these collaborative practices are crucial to optimizing patient outcomes. Despite this importance, the existing literature on human factors in cancer care is limited, and very few studies have investigated how collaboration within care teams evolves over the course of treatment. To fill this gap, this work examines how HCPs' collaboration, captured through electronic health record (EHR) systems, affects cancer patient outcomes, with particular emphasis on teamwork dynamics. We represent EHR-mediated HCP interactions as networks and apply machine learning methods to identify predictive signals of patient survival embedded in these collaborative structures. We further interpret model predictions by pinpointing network characteristics and dynamic patterns associated with particular outcomes. We evaluate our model through robustness analyses to ensure that the findings are stable and not driven by stochastic variation in training. Additionally, our insights align with hypotheses proposed in the medical literature, and our results provide empirical, data-driven evidence supporting these claims. Overall, our work contributes a practical workflow for leveraging digital traces of collaboration to evaluate and strengthen longitudinal team-based healthcare, offering actionable insights to guide data-informed interventions in healthcare delivery.

2606.04492 2026-06-04 cs.LG cs.GT 版本更新

Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning

面向合作多智能体强化学习的情景记忆时间一致性

Zicheng Zhao, Yu Lan, Chengzhengxu Li, Zhaohan Zhang, Xiaoming Liu

发表机构 * Xi’an Jiaotong University（西安交通大学）； Queen Mary University of London（伦敦玛丽女王大学）

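AI summary: To address reward sparsity and exploration bottlenecks in cooperative multi-agent reinforcement learning, proposes the Episodic Memory Temporal Consistency (EMTC) framework, which uses a temporally consistent semantic embedder and a gating mechanism to prevent representation collapse and filter out pseudo-successful trajectories, comes with theoretical error bounds, and significantly outperforms existing methods on the SMAC and GRF benchmarks.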

Comments Under Review

详情

ChessMimic: 用于在线闪电棋中人类走棋、时钟和结果预测的按等级划分的Transformer模型

Thomas Johnson

发表机构 * nascent.xyz（nascent实验室）

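AI summary: Proposes ChessMimic, a system of three small encoder-only Transformer models for move, thinking-time, and outcome prediction. Training a separate model per Elo band gives finer skill calibration; on Lichess blitz data its move-prediction accuracy exceeds that of Maia-2, its outcome prediction reaches an AUC of 0.78, and its clock model provides a usable but non-state-of-the-art thinking-time signal.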

详情

AI中文摘要

我们提出了ChessMimic，一个由三个小型编码器Transformer组成的系统——分别用于走棋、思考时间和结果预测——以局面、最近走棋历史、玩家等级和时钟状态为条件。我们为每100 Elo等级区间拟合每个模型的独立实例，以参数效率换取更精细的技能校准。在Lichess Rated Blitz游戏的一个月保留切片上，ChessMimic的人类走棋预测准确率在每个Elo区间都优于Maia-2。与Maia-3相比，我们的9M参数模型的准确率介于Maia-3-5M和Maia-3-23M之间，且没有几何注意力偏置的额外复杂性。除了走棋匹配模型，我们还训练了一个游戏结果模型，该模型不仅以局面为条件，还以玩家等级、时间控制和剩余时钟时间为条件。结果模型在样本外达到了0.78的AUC，击败了Maia-2以及基于子力、等级和时钟时间的逻辑回归。最后，我们训练了一个时钟模型来预测人类思考时间。该时钟模型在ALLIE风格过滤器下提供了可用但非最优的每步思考时间信号（Pearson r = 0.41，Spearman rho = 0.50，MAE 4.10秒，而ALLIE报告的r = 0.70），残差差距集中在每位置桶的锐度上，而非桶边际校准。公开演示在1e4.ai，我们在GitHub上发布了代码、每个区间的权重以及C++数据过滤管道代码。

英文摘要

We present ChessMimic, a system of three small encoder-only transformers (for move, thinking-time, and outcome prediction) conditioned on the position, recent move history, player rating, and clock state. We fit a separate instance of each model per 100-Elo rating band, trading parameter efficiency for sharper per-skill calibration. On a held-out month-wide slice of Lichess Rated Blitz games, ChessMimic's human move prediction accuracy outperforms Maia-2 in every Elo band. Compared to Maia-3, our 9M parameter model's accuracy sits between Maia-3-5M and Maia-3-23M without the additional complexity of Geometric Attention Bias. In addition to the move matching model, we also train a game outcome model that conditions not only on the position, but also on player ratings, time control, and remaining clock times. The outcome model achieves an AUC of 0.78 out of sample, beating Maia-2 as well as logistic regressions based on material, ratings, and clock time. Finally, we train a clock model that predicts human thinking times. The clock model provides a usable but non-SOTA per-ply think-time signal under ALLIE-style filters (Pearson r = 0.41, Spearman rho = 0.50, MAE 4.10 s, against ALLIE's reported r = 0.70), with the residual gap concentrated in per-position bucket sharpness rather than bucket-marginal calibration. A public demo is at 1e4.ai, and we release code, per-band weights, and the C++ data-filter pipeline code on GitHub.

2606.04468 2026-06-04 cs.LG cs.AI cs.NE math.OC 版本更新

ParetoPilot: Zero-Surrogate Offline Multi-Objective Optimization via Infer-Perturb-Guide Diffusion

ParetoPilot：通过推断-扰动-引导扩散实现零代理离线多目标优化

Ruiqing Sun, Sen Yang, Dawei Feng, Bo Ding, Yijie Wang, Huaimin Wang

发表机构 * Nanyang Technological University（南洋理工大学）

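AI summary: Proposes ParetoPilot, a zero-surrogate diffusion framework that needs no external surrogate model. Its Infer-Perturb-Guide engine, interleaved with the unconditional denoising steps, implicitly infers the objective direction and orthogonalizes a parallel gravity field and an edgeness-aware repulsive force, producing Pareto-optimal designs for offline multi-objective optimization.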

详情

AI中文摘要

离线多目标优化旨在基于静态数据集发现新颖的帕累托最优设计，而无需昂贵的环境交互。尽管最近的生成方法取得了显著成功，但它们主要依赖外部代理模型。这种依赖引入了显著的计算开销，遭受欺骗性评估，并偏离了联合训练主流生成模型与条件的流行范式。为了解决这些瓶颈，我们提出了ParetoPilot，一种用于离线多目标优化的新颖零代理扩散框架。ParetoPilot充分利用预训练扩散模型中固有的条件先验。其核心是引入了推断-扰动-引导引擎，该引擎无缝地插入在反向生成过程的无条件去噪步骤中。首先，通过匹配条件噪声预测和无条件噪声预测，隐式推断瞬时目标方向。其次，数学上正交化一个用于严格收敛的平行引力场和一个用于相互多样性的边缘感知排斥力，从而生成一个动态退火的扰动向量。最后，这个扰动目标通过标准的无分类器引导无缝地引导生成过程。在51个任务上的大量实验表明，ParetoPilot优于14个最先进的基于代理和逆生成基线。通过消除辅助代理训练，我们的方法在实现超体积改进和鲁棒帕累托前沿覆盖的同时，保护了数据隐私。

英文摘要

Offline multi-objective optimization (Offline MOO) aims to discover novel Pareto-optimal designs based on static datasets without expensive environment interactions. While recent generative methods have achieved notable success, they predominantly rely on external surrogate models. This dependency introduces significant computational overhead, suffers from deceptive evaluations, and deviates from the prevailing paradigm of jointly training mainstream generative models with conditions. To address these bottlenecks, we propose ParetoPilot, a novel zero-surrogate diffusion framework for offline MOO. ParetoPilot fully leverages the conditional priors inherently embedded within pre-trained diffusion models. At its core, the framework introduces the Infer-Perturb-Guide (IPG) engine, which is seamlessly interleaved within the unconditional denoising steps of the reverse generation process. First, it implicitly infers the instantaneous objective direction by matching conditional and unconditional noise predictions. Next, it mathematically orthogonalizes a parallel gravity field for strict convergence and an edgeness-aware repulsive force for mutual diversity, creating a dynamically annealed perturbation vector. Finally, this perturbed target seamlessly steers the generation process via standard Classifier-Free Guidance (CFG). Extensive experiments across 51 tasks demonstrate that ParetoPilot outperforms 14 state-of-the-art surrogate-based and inverse generative baselines. By eliminating auxiliary proxy training, our approach preserves data privacy while achieving hypervolume improvement and robust Pareto front coverage.
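
The last step named in the abstract, standard Classifier-Free Guidance, can be sketched in a few lines. This is only an illustration of the usual CFG combination rule, not of ParetoPilot's IPG engine; the function name `cfg_combine` and the parameter `guidance_scale` are hypothetical, and noise predictions are plain Python lists rather than tensors.

```python
from typing import List


def cfg_combine(eps_uncond: List[float], eps_cond: List[float],
                guidance_scale: float) -> List[float]:
    """Standard classifier-free guidance: start from the unconditional noise
    prediction and move toward the conditional one by `guidance_scale`."""
    return [u + guidance_scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


# A scale of 0 recovers the unconditional prediction, 1 the conditional one,
# and values above 1 extrapolate past the conditional prediction.
eps_u = [0.0, 1.0]
eps_c = [1.0, 3.0]
print(cfg_combine(eps_u, eps_c, 0.0))  # [0.0, 1.0]
print(cfg_combine(eps_u, eps_c, 1.0))  # [1.0, 3.0]
print(cfg_combine(eps_u, eps_c, 2.0))  # [2.0, 5.0]
```

In a zero-surrogate setting, anything that steers generation has to enter through the conditional branch, since there is no external model to supply gradients.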

2606.04460 2026-06-04 cs.CR cs.AI cs.LG 版本更新

CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities

CyberGym-E2E：面向AI代理端到端网络安全能力的可扩展真实世界基准

Tianneng Shi, Robin Rheem, Dongwei Jiang, Mona Wang, Francisco De La Riega, Zhun Wang, Jingzhi Jiang, Alexander Cheung, Sean Tai, Jonah Cha, Jianhong Tu, Gabriel Han, Chenguang Wang, Jingxuan He, Wenbo Guo, Dawn Song

发表机构 * Stanford University（斯坦福大学）； UC Berkeley（加州大学伯克利分校）

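AI summary: Proposes CyberGym-E2E, a large-scale, realistic end-to-end cybersecurity benchmark that uses an automated pipeline to turn open-source vulnerability data into evaluation environments, assessing AI agents across the full lifecycle of vulnerability discovery, PoC generation, and patch generation.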

Comments ICML 2026

2606.04453 2026-06-04 cs.CV cs.LG 版本更新

Radiomic Feature Selection Using Gradient Loss of Deep Neural Network for Lung Cancer Stage Detection

基于深度神经网络梯度损失的放射组学特征选择用于肺癌分期检测

Hina Shakir, Mohammad Mohatram, Javeed Hussain, Syed Rizwan Ali, Muhammad Irfan Memon

发表机构 * Department of Software Engineering, Bahria University（巴赫里亚大学软件工程系）； Global College of Engineering and Technology（全球工程与技术学院）； Software Engineering & Business Incubation Center, Bahria University（巴赫里亚大学软件工程与企业孵化中心）

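AI summary: Proposes the GL-RFE framework, which uses gradient sensitivity analysis of a deep neural network to recursively eliminate low-contribution features, selecting the top 15 of 106 radiomic features to classify early-stage versus advanced-stage lung cancer with 90.22% accuracy.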

详情

DOI: 10.3791/70181
Journal ref: J. Vis. Exp. (230), e70181, (2026)

AI中文摘要

放射组学能够从医学图像中提取定量成像生物标志物，已成为计算机辅助癌症诊断的重要工具。然而，放射组学数据集通常具有高维小样本的特点，使得特征选择成为构建可靠预测模型的关键步骤。本研究提出了一种梯度损失递归特征消除（GL-RFE）框架，该框架集成深度神经网络的梯度敏感性分析，以识别对肺癌分期检测最具影响力的放射组学特征。使用3D Slicer平台的PyRadiomics扩展从胸部计算机断层扫描（CT）中提取了总共106个放射组学特征。所提出的方法通过计算网络损失相对于输入特征的梯度来评估特征重要性，并递归消除贡献最小的特征。最终选出的前15个放射组学特征用于训练深度神经网络分类器，以区分早期和晚期肺癌。该框架在测试数据集上取得了强劲的分类性能，准确率为90.22%，精确率为90.10%，召回率为90.24%，F1分数为90.16%。可视化分析（包括相关性热图和分布图）进一步证实了特征冗余减少和类别可分性提高。与传统特征选择技术相比，GL-RFE有效捕捉了非线性特征交互并增强了模型泛化能力。所提出的协议为基于放射组学的癌症分期检测提供了一种可重复且可解释的方法，特别适用于高维小样本生物医学数据集，并在基因组学和多模态临床分析等其他领域具有潜在应用价值。

英文摘要

Radiomics enables extraction of quantitative imaging biomarkers from medical images and has become an important tool for computer-aided cancer diagnosis. However, radiomics datasets are typically high-dimensional with limited samples, making feature selection a critical step for building reliable predictive models. This study proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that integrates gradient sensitivity analysis from a deep neural network to identify the most influential radiomic features for lung cancer stage detection. A total of 106 radiomic features were extracted from chest Computed Tomography (CT) scans using the PyRadiomics extension of the 3D Slicer platform. The proposed method evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves strong classification performance, with accuracy of 90.22%, precision of 90.10%, recall of 90.24%, and F1-score of 90.16% on the test dataset. Visualization analyses, including correlation heat maps and distribution plots, further confirm reduced feature redundancy and improved class separability. Compared to conventional feature selection techniques, GL-RFE effectively captures nonlinear feature interactions and enhances model generalization. The presented protocol provides a reproducible and interpretable methodology for radiomics-based cancer stage detection and is particularly suitable for high-dimensional, small-sample biomedical datasets, with potential applications in other domains such as genomics and multimodal clinical analysis.

2606.04451 2026-06-04 cs.LG 版本更新

On Out-of-sample Embedding in UMAP

UMAP中的样本外嵌入

Mohammad Tariqul Islam, Jason W. Fleischer

发表机构 * Media Lab, Massachusetts Institute of Technology（媒体实验室，麻省理工学院）； Electrical and Computer Engineering, Princeton University（电子与计算机工程，普林斯顿大学）

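AI summary: Addresses the repulsion effect that UMAP exhibits when new samples are added to an existing embedding by optimizing pairwise interactions within the original k-nearest-neighbor graph, and shows that parametric UMAP yields better embeddings and naturally mitigates the effect.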

Comments 22 pages, 16 figures

详情

AI中文摘要

邻域嵌入算法通过在低维空间中构建等价的图表示来揭示高维数据中的相关性。一种日益流行的算法是统一流形学习与投影（UMAP），它使用代数拓扑来映射两个空间之间的距离。虽然它在许多类型的数据集上表现良好，但UMAP在将样本外点添加到现有映射时存在困难。特别是，UMAP通常将新点放置在所发现簇的周边，而不是与它们的相关邻居一起放在簇的内部。在这里，我们通过优化原始k近邻图中的成对交互来克服这种“排斥效应”。此外，我们表明参数化UMAP比非参数算法获得更好的嵌入，特别是当数据变得更复杂时（例如，医学图像）。我们还表明，当使用参数化UMAP嵌入数据时，排斥效应自然得到缓解。我们使用可信度、最近邻分类器以及分析嵌入中的吸引力和排斥力来表征不同的UMAP方法。

英文摘要

Neighbor embedding algorithms reveal correlations in high-dimensional data by constructing an equivalent graph representation in a lower-dimensional space. An increasingly popular algorithm is Uniform Manifold Approximation and Projection (UMAP), which uses algebraic topology to map distances between the two spaces. While it works well on many types of data sets, UMAP has trouble adding out-of-sample points to a pre-existing mapping. In particular, UMAP often places new points on the periphery of the found clusters, rather than in their interiors with their correlated neighbors. Here, we overcome this "repulsion effect" by optimizing pairwise interactions within the original k-nearest-neighbor graph. Moreover, we show that parameterizing UMAP obtains better embeddings than non-parametric algorithms, particularly as the data gets more complex (e.g., medical images). We also show that the repulsion effect is naturally mitigated when a parameterized UMAP is employed to embed the data. We characterize different UMAP approaches using trustworthiness, nearest neighbor classifiers, and by analyzing attractive and repulsive forces in the embeddings.

2606.04446 2026-06-04 cs.DC cs.LG 版本更新

D^2SD: Accelerating Speculative Decoding with Dual Diffusion Draft Models

D^2SD: 使用双重扩散草稿模型加速推测解码

Liyuan Zhang, Jiarui Zhang, Jinwei Yao, Ran Yan, Yuchen Yang, Jiahao Zhang, Tongkai Yang, Yi Wu, Binhang Yuan

发表机构 * Peking University（北京大学）； Tsinghua University（清华大学）； HKUST（香港科技大学）； UIUC（伊利诺伊大学厄巴纳-香槟分校）； Ant Group（蚂蚁集团）

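AI summary: Proposes the D^2SD framework, which uses dual diffusion draft models and a confidence-guided prefix tree to raise the acceptance rate of speculative decoding, outperforming both the underlying diffusion approach and autoregressive speculative decoding baselines.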

详情

AI中文摘要

推测解码通过草拟多个令牌并在单次目标模型前向传递中验证它们，加速自回归大语言模型推理。最近的基于扩散的草稿模型并行生成整个令牌块，但通常每次验证只提交单个草稿序列：一旦出现第一个不匹配，所有后续草稿令牌被丢弃，导致接受率有限。简单地对更多草稿候选序列进行批处理只会带来边际改进，因为冗余或位置不当的分支增加了草拟和验证的成本，而没有成比例地增加接受的令牌数量。我们提出D^2SD，一种双重扩散草稿推测解码框架，将候选组织成置信度引导的前缀树，其中第一个扩散草稿器生成一个块以及每个位置的置信度分数，用于识别最可能的拒绝边界并选择前K个前缀范围进行恢复；第二个可变前缀扩散草稿器在每个选定前缀处重新锚定，并在一次批处理中提出替代延续；得到的共享前缀候选通过级联注意力联合验证。实验表明，D^2SD在底层扩散方法和强自回归推测解码基线上均有明显改进。

英文摘要

Speculative decoding accelerates autoregressive large language model inference by drafting multiple tokens and verifying them in a single target-model forward pass. Recent diffusion-based drafters generate an entire block of tokens in parallel but usually commit to a single draft sequence per verification: once the first mismatch occurs, all subsequent draft tokens are discarded, resulting in a limited acceptance rate. Naively batching more draft candidate sequences only introduces a marginal improvement, as redundant or poorly placed branches increase the cost of drafting and verification without proportionally increasing the number of accepted tokens. We propose D^2SD, a dual diffusion draft speculative decoding framework that organizes candidates into a confidence-guided prefix tree, where the first diffusion drafter generates a block along with per-position confidence scores that are used to identify the most likely rejection boundary and select the top-K prefix ranges for recovery; the second variable-prefix diffusion drafter re-anchors at each selected prefix and proposes alternative continuations in one batched pass; the resulting shared-prefix candidates are jointly verified via cascade attention. Empirically, D^2SD shows clear improvements over both the underlying diffusion approach and strong autoregressive speculative decoding baselines.
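
The acceptance rule the abstract refers to (everything after the first mismatch is discarded) can be illustrated with a toy greedy verifier. This is a sketch under stated assumptions, not D^2SD itself: `toy_target_next` is a hypothetical stand-in for the target model, the loop is sequential for readability where a real system verifies all positions in one forward pass, and `best_candidate` simply keeps the longest accepted prefix among several drafts.

```python
from typing import Callable, List, Sequence


def toy_target_next(prefix: Sequence[int]) -> int:
    """Hypothetical stand-in for the target model's greedy next token."""
    return (sum(prefix) + 1) % 10


def verify(prefix: Sequence[int], draft: Sequence[int],
           target_next: Callable[[Sequence[int]], int]) -> List[int]:
    """Accept draft tokens until the first one the target disagrees with."""
    accepted: List[int] = []
    for token in draft:
        if token != target_next(list(prefix) + accepted):
            break  # first mismatch: all later draft tokens are discarded
        accepted.append(token)
    return accepted


def best_candidate(prefix: Sequence[int], candidates: Sequence[Sequence[int]],
                   target_next: Callable[[Sequence[int]], int]) -> List[int]:
    """Among several drafts, keep the longest accepted prefix."""
    return max((verify(prefix, c, target_next) for c in candidates), key=len)


prefix = [3]
single_draft = [4, 8, 1, 2]            # wrong at the third position
alternative = [4, 8, 6, 5]             # shares the prefix [4, 8], fixes position 3
print(verify(prefix, single_draft, toy_target_next))                         # [4, 8]
print(best_candidate(prefix, [single_draft, alternative], toy_target_next))  # [4, 8, 6]
```

The second draft only pays off because it branches at the position where the first one was rejected, which is the intuition behind placing alternatives at the most likely rejection boundary rather than batching arbitrary extra drafts.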

2606.04445 2026-06-04 cs.LG cs.AI math.ST stat.TH 版本更新

RowNet: A Memory Transformer for Tabular Regression

RowNet: 用于表格回归的记忆Transformer

Askat Rakhymbekov, Gulshat Muhametjanova

发表机构 * Department of Applied Mathematics and Informatics（应用数学与信息学系）； Kyrgyz-Turkish Manas University（吉尔吉斯-土耳其马纳斯大学）

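AI summary: For tabular regression in real estate valuation, proposes RowNet, a retrieval-based neural architecture that predicts prices using pairwise similarity features against a memory bank, target-consistency augmentation, and a mixture-of-experts module.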

Comments Retrieval-based neural architecture for real estate valuation. Related to TabR (arXiv:2307.14338) and retrieval-augmented tabular learning

详情

AI中文摘要

房地产估值是一个结构化回归问题，其中价格受异构特征类型、稀疏区域效应、非线性交互以及可比房产的实际逻辑影响。标准多层感知器将每一行视为孤立向量，必须仅从监督中学习局部性、尺度敏感性和类别匹配。梯度提升决策树提供了强大的表格基线，但其以特征为中心的分裂机制并未显式建模相似历史观测的检索。本文提出了RowNet，一种用于房地产每平方米价格预测的基于检索的神经网络架构。RowNet通过针对标记属性记忆库的成对相似性特征来表示查询属性。第一检索层从仅特征相似性中估计粗略目标。第二层通过目标一致性特征增强记忆比较，并使用多个学习注意力头检索互补的可比集。最终的混合专家模块结合了学习门控、残差校正、熵正则化和头多样性正则化以产生预测。

英文摘要

Real estate valuation is a structured regression problem in which prices are governed by heterogeneous feature types, sparse regional effects, nonlinear interactions, and the practical logic of comparable properties. Standard multilayer perceptrons treat each row as an isolated vector and must learn locality, scale sensitivity, and categorical matching from supervision alone. Gradient-boosted decision trees provide strong tabular baselines, but their feature-centric splitting mechanism does not explicitly model the retrieval of similar historical observations. This paper presents RowNet, a retrieval-based neural architecture for real estate price-per-square-meter prediction. RowNet represents a query property through pairwise similarity features against a memory bank of labeled properties. A first retrieval layer estimates a coarse target from feature-only similarities. A second layer augments the memory comparison with target-consistency features and uses multiple learned attention heads to retrieve complementary comparable sets. A final mixture-of-experts module combines learned gating, residual correction, entropy regularization, and head-diversity regularization to produce the prediction.

2606.04444 2026-06-04 eess.IV cs.LG 版本更新

Scaling Datasets for Multi-Sensor, Multi-Agent, and Multi-Domain Learning in Autonomous Systems

面向自主系统中多传感器、多智能体与多领域学习的数据集扩展

R. Spencer Hallyburton, David Hunt, Miroslav Pajic

发表机构 * Department of Electrical and Computer Engineering, Duke University（电气与计算机工程系，杜克大学）

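AI summary: Proposes a modular dataset-generation pipeline built on AVstack and CARLA that creates terabyte-scale multi-domain data with ground-truth labels, supports single- and multi-agent setups with flexible sensor configurations, and targets application-specific training and collaborative autonomy research.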

2606.04438 2026-06-04 cs.LG cs.AI 版本更新

平坦性与泛化：使用齐次神经网络学习多指标模型

Harsh Vardhan, Hossein Taheri, Arya Mazumdar

发表机构 * Department of Computer Science（计算机科学系）； University of California, San Diego（加州大学圣地亚哥分校）； Halicioğlu Data Science Institute（Halicioğlu数据科学研究所）

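AI summary: Studies the relationship between flatness and generalization when two-layer homogeneous neural networks learn multi-index models, proving that the flattest interpolator always generalizes, whereas the flatness of certain non-generalizing interpolators cannot approach the flattest value.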

详情

AI中文摘要

TITAN-FedAnil+: Trust-Based Adaptive Blockchain Federated Learning for Resource-Constrained Intelligent Enterprises

TITAN-FedAnil+：面向资源受限智能企业的基于信任的自适应区块链联邦学习

Muhammad Hadi, Muhammad Jahangir, Talha Shafique, Muhammad Khuram Shahzad

发表机构 * University of Science and Technology of China（中国科学技术大学）

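AI summary: Proposes the TITAN-FedAnil+ framework, which filters malicious updates with affinity propagation-based adaptive clustered aggregation, improves efficiency with GPU-accelerated vectorization, and achieves lightweight blockchain resynchronization through a signed state jump mechanism, reducing memory overhead by up to 81% on resource-constrained edge devices.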

Comments 8 pages, 5 figures; code available at https://github.com/error8149/FedAnilPlus-Optimized

详情

AI中文摘要

联邦学习（FL）已成为一种在保护数据隐私的同时实现协作智能的有效范式。然而，由非独立同分布（non-IID）数据分布引起的数据异构性和去中心化安全威胁仍然是重大挑战，尤其是在资源受限的企业环境中。本文提出了TITAN-FedAnil+，一种面向智能企业中区块链联邦学习的基于信任的自适应网络。所提出的框架引入了基于亲和传播的自适应聚类聚合，无需预先知道攻击者数量即可识别并过滤恶意更新。此外，采用GPU加速向量化以提高计算效率，同时通过有符号状态跳变机制实现轻量级区块链重同步。实验结果表明，与基线框架相比，在受限的8 GB边缘设备上经过50轮通信，内存开销显著降低，节省高达81%。结果表明，TITAN-FedAnil+有效提升了智能企业环境中安全联邦学习部署的鲁棒性、可扩展性和资源效率。

英文摘要

Federated Learning (FL) has emerged as an effective paradigm for collaborative intelligence while preserving data privacy. However, data heterogeneity arising from non-IID distributions and decentralized security threats remain significant challenges, particularly in resource-constrained enterprise environments. This paper presents TITAN-FedAnil+, a Trust-Based Adaptive Network for blockchain-enabled federated learning in intelligent enterprises. The proposed framework introduces affinity propagation-based adaptive clustered aggregation to identify and filter malicious updates without requiring prior knowledge of the number of attackers. In addition, GPU-accelerated vectorization is employed to improve computational efficiency, while a signed state jump mechanism enables lightweight blockchain resynchronization. Experimental results demonstrate substantial reductions in memory overhead, achieving up to 81% savings across 50 communication rounds on constrained 8 GB edge devices compared with the baseline framework. The results indicate that TITAN-FedAnil+ effectively improves robustness, scalability, and resource efficiency for secure federated learning deployments in intelligent enterprise environments.

2606.04384 2026-06-04 cs.LG cs.CR stat.ML 版本更新

Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD

重新审视选择性释放DPSGD中的子采样隐私放大

Xiaobo Huang, Fang Xie

发表机构 * Guangdong Provincial Key Laboratory of IRADS, Beijing Normal-Hong Kong Baptist University（广东省IRADS重点实验室，北师香港浸会大学）

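AI summary: Motivated by the utility degradation and slow convergence that gradient clipping and noise injection cause in DPSGD, re-evaluates the privacy analysis of the selective release mechanism and proposes Differentially Private Selective Release based on Clipped Gradients (DPSR-CG); a rigorous privacy analysis and experiments show that it achieves strong model performance while maintaining strict privacy guarantees.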

详情

AI中文摘要

机器学习对敏感数据的依赖需要差分隐私随机梯度下降（DPSGD）等隐私保护技术。然而，由于梯度裁剪和噪声注入，DPSGD存在显著的效用下降和收敛缓慢的问题。先前的工作试图从不同角度改进DPSGD；值得注意的是，差分隐私选择性更新与释放（DPSUR）算法取得了显著的模型效用。然而，DPSUR中的隐私核算忽略了选择性释放机制引入的采样概率变化，这损害了其隐私保证的严谨性。为了解决这些限制，我们重新评估了选择性释放机制的隐私分析，并提出了一种新颖的算法：基于裁剪梯度的差分隐私选择性释放（DPSR-CG）。通过严格的新推导隐私分析以及在多个数据集（MNIST、CIFAR-10、IMDB和FMNIST）上的广泛实验，我们证明了我们的DPSR-CG机制在保持严格隐私保证的同时实现了卓越的模型性能。

英文摘要

Machine learning's reliance on sensitive data necessitates privacy-preserving techniques like Differentially Private Stochastic Gradient Descent (DPSGD). However, DPSGD suffers from substantial utility degradation and slow convergence due to gradient clipping and noise injection. Prior works have attempted to improve DPSGD from various perspectives; notably, the Differentially Private Selective Update and Release (DPSUR) algorithm has achieved remarkable model utility. However, the privacy accounting in DPSUR overlooks the variation in sampling probability introduced by the selective release mechanism, which compromises the rigor of its privacy guarantees. To address these limitations, we re-evaluate the privacy analysis of the selective release mechanism and propose a novel algorithm: Differentially Private Selective Release based on Clipped Gradients (DPSR-CG). Through a rigorous, newly derived privacy analysis and extensive experiments on multiple datasets (MNIST, CIFAR-10, IMDB, and FMNIST), we demonstrate that our DPSR-CG mechanism maintains strict privacy guarantees while achieving exceptional model performance.
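
The two operations the abstract blames for utility loss, per-example gradient clipping and Gaussian noise injection, can be sketched as a plain DPSGD aggregation step. This is an illustration of standard DPSGD only, not of DPSUR or DPSR-CG; the function names and the use of Python lists for gradients are assumptions made for the example, and no privacy accounting is performed.

```python
import math
import random
from typing import List


def clip_gradient(grad: List[float], clip_norm: float) -> List[float]:
    """Rescale a per-example gradient so its L2 norm is at most `clip_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def dpsgd_noisy_mean(per_example_grads: List[List[float]], clip_norm: float,
                     noise_multiplier: float, rng: random.Random) -> List[float]:
    """Sum clipped gradients, add Gaussian noise with standard deviation
    noise_multiplier * clip_norm per coordinate, then average over the batch."""
    dim = len(per_example_grads[0])
    total = [0.0] * dim
    for grad in per_example_grads:
        for i, g in enumerate(clip_gradient(grad, clip_norm)):
            total[i] += g
    sigma = noise_multiplier * clip_norm
    noisy = [t + rng.gauss(0.0, sigma) for t in total]
    return [x / len(per_example_grads) for x in noisy]


grads = [[3.0, 4.0], [0.0, 1.0]]
print(clip_gradient(grads[0], 2.5))                      # [1.5, 2.0]
print(dpsgd_noisy_mean(grads, 2.5, 0.0, random.Random(0)))  # [0.75, 1.5] with noise disabled
print(dpsgd_noisy_mean(grads, 2.5, 1.0, random.Random(0)))  # same mean plus Gaussian noise
```

Clipping biases large gradients toward the clip norm and the noise scales with that same norm, which is why both hurt utility and why any mechanism that changes which updates are released also has to be reflected in the privacy accounting.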

2606.04381 2026-06-04 cs.LG cs.AI 版本更新

From Symbolic to Geometric: Enabling Spatial Reasoning in Large Language Models

从符号到几何：在大语言模型中实现空间推理

Chen Chu, Bita Azarijoo, Li Xiong, Khurram Shafique, Cyrus Shahabi

发表机构 * University of Southern California（南加州大学）； Emory University（埃默里大学）； Novateur Research Solutions（Novateur研究解决方案）

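AI summary: Proposes the Spatial Language Model (SLM), which treats location information as a first-class modality and learns spatial representations to perform geometric spatial reasoning during inference, significantly outperforming existing methods based on symbolic reasoning.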

详情

AI中文摘要

近期的大语言模型（LLM）通常表现出空间推理能力；然而，这种能力很大程度上是符号性的，源于对空间语言的模式匹配，而非真正的几何空间推理。由于LLM操作离散令牌，它们缺乏对连续空间表示、显式几何计算和结构化空间算子的原生支持。为解决这一局限，我们引入了空间语言模型（SLM），这是首个将位置信息作为一等模态并在模型推理过程中实现几何空间推理的多模态LLM。SLM直接操作学习到的空间表示，而非空间关系的文本描述。为支持有效训练，我们构建了空间指令数据集，该数据集对齐了空间表示、原子几何操作和自然语言指令。我们进一步提出了名为SpatialEval的新基准，旨在评估属性、距离、拓扑和相对位置任务上的空间推理。大量实验表明，SLM显著优于依赖通过提示工程或文本抽象进行符号推理的现有基于LLM的方法，展示了集成几何空间表示对稳健空间推理的优势。我们的指令数据集、评估基准、模型训练代码和模型检查点可在 https://github.com/chuchen2017/SLM 获取。

英文摘要

Recent large language models (LLMs) often appear to exhibit spatial reasoning ability; however, this capability is largely symbolic, arising from pattern matching over spatial language rather than true geometric reasoning over space. Because LLMs operate on discrete tokens, they lack native support for continuous spatial representations, explicit geometric computation, and structured spatial operators. To address this limitation, we introduce the Spatial Language Model (SLM), the first multimodal LLM that treats location information as a first-class modality and enables geometric spatial reasoning within the model's inference process. SLM directly operates on learned spatial representations rather than textual descriptions of spatial relations. To support effective training, we construct a Spatial Instruction Dataset that aligns spatial representations, atomic geometric operations, and natural language instructions. We further propose a new benchmark named SpatialEval, which is designed to evaluate spatial reasoning across attribute, distance, topology, and relative-position tasks. Extensive experiments show that SLM significantly outperforms existing LLM-based approaches that rely on symbolic reasoning via prompt engineering or textual abstraction, demonstrating the benefits of integrating geometric spatial representations for robust spatial reasoning. Our instruction dataset, evaluation benchmark, model training code, and model checkpoints can be found at https://github.com/chuchen2017/SLM.

2606.04380 2026-06-04 stat.ML cs.LG 版本更新

Deliberate Evolution: 基于智能体推理的样本高效符号回归与LLM

Xinyu Pang, Zhanke Zhou, Xuan Li, Fangrui Lv, Shanshan Wei, Sen Cui, Bo Han, Changshui Zhang

发表机构 * TMLR Group, Department of Computer Science, Hong Kong Baptist University（香港浸会大学计算机科学系 TMLR 组）； Beijing National Research Center for Information Science and Technology (BNRist), Department of Automation, Tsinghua University, Beijing, P.R. China（北京信息科学与技术国家研究中心（BNRist），清华大学自动化系，中国北京）； Lenovo Research（联想研究院）

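AI summary: Proposes the Deliberate Evolution framework, which decouples symbolic generation from search control and uses adaptive operators, analysis tools, and reflective memory to outperform existing LLM-based symbolic regression methods with only 40% of the sample budget.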

Comments ICML 2026

2606.04345 2026-06-04 cs.CV cs.AI cs.LG 版本更新

HYolo: An Intelligent IoT-Based Object Detection System Using Hypergraph Learning

HYolo：一种基于超图学习的智能物联网目标检测系统

Isha Abid, Fawad Khan, Muhammad Khuram Shahzad

发表机构 * National University of Sciences and Technology（国立科技大学）

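AI summary: Proposes the HYolo framework, which integrates hypergraph learning into the YOLO architecture to model high-order feature relationships, improving mAP@50 by approximately 12% on the COCO dataset.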

Comments 8 pages, multiple figures;

详情

AI中文摘要

本文提出HYolo，一种基于物联网的智能目标检测框架，将超图学习集成到YOLO架构中。传统的基于YOLO的目标检测模型主要捕获成对特征交互，可能无法建模对象与上下文特征之间的复杂高阶关系。为解决这一局限，HYolo引入超图学习以捕获更丰富的上下文依赖关系并改进对象表示。在COCO数据集上的实验评估表明，与基线YOLO模型相比，性能显著提升。所提方法在mAP@50上实现了约12%的提升，同时增强了整体检测准确性和鲁棒性。通过建模高阶特征关系，HYolo在物联网环境中提供了改进的上下文理解和更可靠的目标检测性能。结果表明，将超图学习集成到目标检测流程中，为智能且上下文感知的物联网视觉系统提供了一个有前景的方向。

英文摘要

This paper presents HYolo, an intelligent IoT-based object detection framework that integrates hypergraph learning into the YOLO architecture. Traditional YOLO-based object detection models primarily capture pairwise feature interactions and may fail to model complex high-order relationships among objects and contextual features. To address this limitation, HYolo incorporates hypergraph learning to capture richer contextual dependencies and improve object representation. Experimental evaluation on the COCO dataset demonstrates significant performance improvements over baseline YOLO models. The proposed approach achieves approximately 12% improvement in mAP@50 while enhancing overall detection accuracy and robustness. By modeling high-order feature relationships, HYolo provides improved contextual understanding and more reliable object detection performance in IoT-based environments. The results indicate that integrating hypergraph learning into object detection pipelines offers a promising direction for intelligent and context-aware IoT vision systems.

2606.04342 2026-06-04 cs.LG cs.AI 版本更新

Expectations vs. Realities: The Cost of MSE-Optimal Forecasting Under Conditional Uncertainty

期望与现实：条件不确定性下MSE最优预测的成本

Riku Green, Zahraa S. Abdallah, Telmo M Silva Filho

发表机构 * The University of Bristol（布里斯托尔大学）

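AI summary: Using a conditional uncertainty gap, proves a fundamental trade-off between MSE optimality and marginal realism in multi-step time series forecasting, and shows empirically that a small sacrifice in MSE (≤5%) can substantially improve marginal realism (median 17.3%).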

Comments 12 pages, Accepted for KDD 2026 Research track

详情

DOI: 10.1145/3770855.3818087

AI中文摘要

多步时间序列预测（MSF）通常使用均方误差（MSE）等逐点误差指标进行评估，隐含地将条件均值视为充分目标。我们证明，在条件不确定性下，当条件期望在较长预测范围内无法代表典型实现值时，这种做法可能产生误导。我们通过条件不确定性间隙形式化这一效应，并证明只要该间隙非零，任何确定性预测器都无法同时最小化MSE并匹配实现未来的边际分布。这确立了MSF评估中逐点准确性与边际真实性之间根本性的、与模型无关的权衡。利用受控随机动力系统和九个真实世界预测基准，我们经验性地刻画了由此产生的准确性-真实性前沿，并量化了仅基于MSE的模型选择的实际成本。随着条件不确定性随预测范围增加，可达集扩展为明显的帕累托前沿，将MSE最优但分散不足的预测器与牺牲准确性换取真实边际变异性的方法区分开来。在多个基准中，我们发现MSE的小幅放松（≤5%）通常能带来边际真实性的不成比例提升，中位数改进为17.3%，在某些数据集中增益超过30%。我们进一步表明，常见的预测策略系统性地占据该前沿的不同区域：直接多输出预测器集中在准确性最优极端附近，而递归策略和基于样本的推断更倾向于边际真实性。这些结果共同揭示了长期预测中基于MSE评估的结构性失败模式，并将策略和推断选择重新定义为对不可避免的准确性-真实性权衡的导航。

英文摘要

Multi-step time series forecasting (MSF) is commonly evaluated using point-wise error metrics such as mean squared error (MSE), implicitly treating the conditional mean as a sufficient target. We show that this can be misleading under conditional uncertainty, where the conditional expectation becomes unrepresentative of typical realized values at longer horizons. We formalize this effect through a conditional uncertainty gap and prove that whenever this gap is nonzero, no deterministic predictor can simultaneously minimize MSE and match the marginal distribution of realized futures. This establishes a fundamental, model-agnostic trade-off between point accuracy and marginal realism in MSF evaluation. Using controlled stochastic dynamical systems and nine real-world forecasting benchmarks, we empirically characterize the resulting accuracy-realism frontier and quantify the practical cost of MSE-only model selection. As conditional uncertainty increases with forecast horizon, the attainable set expands into a pronounced Pareto front, separating MSE-optimal but under-dispersed predictors from methods that trade accuracy for realistic marginal variability. Across benchmarks, we find that small relaxations in MSE ($\le 5\%$) frequently unlock disproportionate gains in marginal realism, with median improvements of $17.3\%$ and gains exceeding $30\%$ in some datasets. We further show that common forecasting strategies systematically occupy different regions of this frontier: direct multi-output predictors concentrate near the accuracy-optimal extreme, while recursive strategies and sample-based inference favor marginal realism. Together, these results expose a structural failure mode of MSE-based evaluation in long-horizon forecasting and recast strategy and inference selection as navigation of an unavoidable accuracy-realism trade-off.
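
A toy case makes the trade-off concrete. Assume, purely for illustration, that the future value is -1 or +1 with equal probability given the history; this setup and all variable names are hypothetical and do not reproduce the paper's formal conditional uncertainty gap. The sketch compares the conditional-mean forecast with a forecaster that draws from the correct distribution independently of the realized outcome, enumerating all cases exactly instead of sampling.

```python
from itertools import product
from statistics import mean, pvariance

futures = [-1.0, 1.0]  # equally likely realized values (toy assumption)

# Deterministic, MSE-optimal forecast: the conditional mean.
mean_forecast = mean(futures)
mse_of_mean = mean([(y - mean_forecast) ** 2 for y in futures])
spread_of_mean = pvariance([mean_forecast for _ in futures])

# Sample-based forecast: a draw from the same distribution, independent of y.
pairs = list(product(futures, futures))  # (realized, forecast), all equally likely
mse_of_samples = mean([(y - y_hat) ** 2 for y, y_hat in pairs])
spread_of_samples = pvariance([y_hat for _, y_hat in pairs])

true_spread = pvariance(futures)

print(mean_forecast, mse_of_mean, spread_of_mean)   # 0.0 1.0 0.0
print(mse_of_samples, spread_of_samples)            # 2.0 1.0
print(true_spread)                                  # 1.0
```

The mean forecast has the lower MSE but zero variability, so it never resembles a realized future; the sampled forecast matches the true spread at the price of a higher MSE.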

2606.04339 2026-06-04 cs.LG 版本更新

Literature-Guided Minimax Optimization of Virtual Epilepsy Neurostimulation

文献引导的虚拟癫痫神经刺激极小极大优化

Cathy Liu

发表机构 * Cathy Liu

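AI summary: Proposes a literature-guided minimax optimization pipeline that combines PubMed-scale hypothesis extraction, TVB Epileptor simulations, and LLM-guided black-box optimization to maximize worst-case reward for robust neurostimulation design.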

Comments 9 pages, 4 figures. Code and interactive essay at https://github.com/liuzhitong330/tvb-llm-robust-neurostim

详情

AI中文摘要

癫痫的计算模型有望实现患者特异性治疗设计，但大多数优化工作流程仍搜索平均表现良好的参数。在神经调控中，这是一个薄弱目标：改善平均响应的方案仍可能对网络最不耐受刺激的患者失败。我们提出一种文献引导的极小化优化流程，结合PubMed规模假设提取、虚拟大脑（TVB）Epileptor模拟和大语言模型引导的黑箱优化。优化器提出内在模型控制参数或临床可解释的外部刺激方案；TVB对采样的虚拟患者评估每个方案；目标函数最大化最坏情况奖励，定义为模拟癫痫活动的负方差。在内在模型控制实验中，最佳存档参数集将最坏情况奖励从-0.5285提升至-0.3182，比基线提高39.8%。临床风格的外部刺激搜索产生较小的最坏情况改善（1.7%），尽管有55%的响应率和阳性颞叶亚组信号，但20名虚拟患者队列未显示总体获益（p=0.9019）。该研究应被视为鲁棒、文献感知的神经刺激设计的计算机概念验证，而非临床证据。

英文摘要

Computational models of epilepsy promise patient-specific treatment design, but most optimization workflows still search for parameters that perform well on average. In neuromodulation, this is a weak target: a protocol that improves the mean response can still fail in the patient whose network is least tolerant to stimulation. We present a literature-guided minimax pipeline that couples PubMed-scale hypothesis extraction, The Virtual Brain (TVB) Epileptor simulations, and large-language-model-guided black-box optimization. The optimizer proposes either intrinsic model-control parameters or clinically interpretable external-stimulation protocols; TVB evaluates each proposal across sampled virtual patients; and the objective maximizes worst-case reward, defined as the negative variance of simulated seizure activity. In the intrinsic model-control experiment, the best archived parameter set improved worst-case reward from -0.5285 to -0.3182, a 39.8% gain over baseline. The clinical-style external-stimulation search produced a much smaller worst-case improvement (1.7%), and a 20-patient virtual cohort showed no aggregate benefit (p=0.9019), despite a 55% responder rate and a positive temporal-lobe subgroup signal. The study should be read as an in silico proof of concept for robust, literature-aware neurostimulation design, not as clinical evidence.
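
The difference between optimizing the average and optimizing the worst case can be sketched with the reward the abstract defines, the negative variance of simulated seizure activity. Everything else here is hypothetical: the two protocols, the traces standing in for simulator output, and the function names. No TVB simulation or LLM-guided search is involved.

```python
from statistics import mean, pvariance
from typing import Dict, List


def reward(trace: List[float]) -> float:
    """Reward as defined in the abstract: negative variance of the activity."""
    return -pvariance(trace)


def average_case(traces: List[List[float]]) -> float:
    return mean([reward(t) for t in traces])


def worst_case(traces: List[List[float]]) -> float:
    return min(reward(t) for t in traces)


# One simulated activity trace per virtual patient (made-up numbers).
protocols: Dict[str, List[List[float]]] = {
    "A": [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0], [-2.0, 2.0, -2.0, 2.0]],
    "B": [[-1.5, 1.5, -1.5, 1.5]] * 3,
}

best_on_average = max(protocols, key=lambda name: average_case(protocols[name]))
best_worst_case = max(protocols, key=lambda name: worst_case(protocols[name]))
print(best_on_average)  # A: excellent for two patients, fails badly for the third
print(best_worst_case)  # B: mediocre everywhere, but no patient does worse than -2.25
```

Protocol A wins on the mean because two patients respond perfectly, while the minimax criterion rejects it because of the single patient who does not.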

2606.04338 2026-06-04 cs.LG cs.CR 版本更新

Federated Learning for Multi-Center Sepsis Early Prediction with Privacy-Preserving

联邦学习用于隐私保护的多中心脓毒症早期预测

Xixi Tian, Di Wu, Xiang Liu, Yiziting Zhu, Yujie Li, Xin Shu, Bin Yi

发表机构 * University of Science and Technology of China（中国科学技术大学）

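AI summary: Given the privacy-sensitive and distributed nature of multi-center medical data, proposes a federated-learning-based distributed collaborative modeling approach that achieves prediction accuracy comparable to a centralized model while avoiding privacy leakage.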

详情

AI中文摘要

多中心医疗数据的隐私敏感性和分布式特征给集中式建模进行脓毒症早期准确预测带来了严重障碍。联邦学习作为一种有前景的协作模型开发框架，允许多个机构在不直接共享或集中原始数据的情况下联合训练预测模型，因此受到越来越多的关注。然而，其实际性能、鲁棒性和隐私保护优势尚未使用真实临床数据集进行充分评估。为弥补这一差距，本研究系统性地考察了联邦学习在多中心脓毒症预测中的应用。实验数据集包括从中国三家三级医院收集的648个临床筛选样本，并采用严格的纳入和排除标准。我们建立了集中式训练范式作为性能基线，然后实现了水平联邦学习框架用于分布式协作建模。大量实验结果表明，基于联邦学习的模型在预测精度上与集中式模型高度相当，同时从根本上避免了隐私泄露。进一步的隐私安全分析验证了恶意攻击者无法从传输的模型参数中重建原始患者数据，表明其对数据重建攻击具有强大的抵抗力。这项工作不仅验证了联邦学习在临床脓毒症预测中的实用性和安全性，而且为隐私保护的多中心医疗协作提供了可靠且可行的解决方案。

英文摘要

Privacy-sensitive and distributed characteristics of multi-center medical data bring severe obstacles to centralized modeling for accurate early prediction of sepsis. Federated learning (FL) has attracted growing attention as a promising framework for collaborative model development, as it allows multiple institutions to jointly train predictive models without directly sharing or centralizing raw data. Nevertheless, its practical performance, robustness, and privacy-preserving benefits remain insufficiently evaluated using real-world clinical datasets. To bridge this gap, this study systematically examines the application of federated learning to multi-center sepsis prediction. The experimental dataset consists of 648 clinically screened samples collected from three tertiary hospitals in China, with rigorous inclusion and exclusion criteria. We establish a centralized training paradigm as the performance baseline, and then implement a horizontal federated learning framework for distributed collaborative modeling. Extensive experimental results demonstrate that the federated learning-based model achieves highly comparable prediction accuracy to the centralized counterpart, while fundamentally avoiding privacy leakage. Further privacy security analysis verifies that malicious attackers cannot reconstruct the original patient data from the transmitted model parameters, indicating strong resistance against data reconstruction attacks. This work not only validates the practicality and security of federated learning in clinical sepsis prediction, but also provides a reliable and feasible solution for privacy-preserving multi-center medical collaboration.
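
Horizontal federated learning is commonly implemented with sample-weighted parameter averaging (FedAvg). The abstract does not state which aggregation rule the study uses, so the sketch below is an assumption for illustration only; the function name `fed_avg` and all numbers are hypothetical.

```python
from typing import List


def fed_avg(client_weights: List[List[float]], client_sizes: List[int]) -> List[float]:
    """Average client parameter vectors, weighting each by its local sample count.
    Only parameters and counts are shared; raw patient records stay on site."""
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(dim)
    ]


# Two hypothetical sites with locally trained two-parameter models.
local_models = [[1.0, 0.0], [3.0, 2.0]]
local_sizes = [1, 3]
print(fed_avg(local_models, local_sizes))  # [2.5, 1.5]
```

Weighting by sample count keeps a small site from pulling the global model as strongly as a large one, which matters when a cohort is split unevenly across hospitals.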

2605.01910 2026-06-04 cs.LG cs.AI cs.DC 版本更新

Stochastic Sparse Attention for Memory-Bound Inference

随机稀疏注意力用于内存受限推理

Kyle Lee, Corentin Delacour, Kevin Callahan-Coray, Kyle Jiang, Can Yaras, Samet Oymak, Tathagata Srimani, Kerem Y. Camsari

发表机构 * University of California, Santa Barbara（加州大学圣芭芭拉分校）； University of Michigan（密歇根大学）； Carnegie Mellon University（卡内基梅隆大学）

AI总结提出SANTA方法，通过从后softmax分布中采样稀疏索引来减少值缓存访问，实现无乘法的高效解码，在Llama-3.1-8B-Instruct上获得1.5倍注意力核加速和1.25倍端到端加速。

Comments Code available at https://github.com/OPUSLab/SANTA

详情

Journal ref: ICML 2026

AI中文摘要

自回归解码在长上下文中变得带宽受限，因为生成每个token需要从KV缓存中读取所有$n_k$个键和值向量。我们提出随机加法无乘法注意力（SANTA），一种通过从后softmax分布中采样$S \ll n_k$个索引并仅聚合这些值行来稀疏化值缓存访问的方法。这产生了后softmax值聚合的无偏估计，同时将值阶段的乘加运算替换为收集和加法。我们引入分层和系统采样来设计方差减少、GPU友好的变体。在32k token上下文的Llama-3.1-8B-Instruct上评估，S$^2$ANTA匹配基线准确率，同时在NVIDIA RTX 6000 Ada上相比FlashInfer和FlashDecoding实现高达1.5倍解码步注意力核加速。在批处理长上下文生成中，这些核增益转化为高达1.25倍的端到端解码延迟加速。最后，我们提出伯努利$qK^\mathsf{T}$采样作为补充技术来稀疏化分数阶段，通过随机三元查询减少键特征访问。两种方法对上游量化、低秩投影、KV缓存压缩和KV缓存选择方法互补。它们共同指向稀疏、无乘法和节能的推理。我们在https://github.com/OPUSLab/SANTA.git开源了我们的核。

英文摘要

Autoregressive decoding becomes bandwidth-limited at long contexts, as generating each token requires reading all $n_k$ key and value vectors from the KV cache. We present Stochastic Additive No-mulT Attention (SANTA), a method that sparsifies value-cache access by sampling $S \ll n_k$ indices from the post-softmax distribution and aggregating only those value rows. This yields an unbiased estimator of the post-softmax value aggregation while replacing value-stage multiply-accumulates with gather-and-add. We introduce stratified and systematic sampling to design variance-reduced, GPU-friendly variants. Evaluated on Llama-3.1-8B-Instruct at 32k-token contexts, S$^2$ANTA matches baseline accuracy while achieving up to $1.5\times$ decode-step attention-kernel speedup over FlashInfer and FlashDecoding on an NVIDIA RTX 6000 Ada. In batched long-context generation, these kernel gains translate to up to $1.25\times$ end-to-end decode-latency speedup. Finally, we propose Bernoulli $qK^\mathsf{T}$ sampling as a complementary technique to sparsify the score stage, reducing key-feature access through stochastic ternary queries. Both methods are complementary to upstream quantization, low-rank projection, KV-cache compression, and KV-cache selection methods. Together, they point toward sparse, multiplier-free, and energy-efficient inference. We open-source our kernels at: https://github.com/OPUSLab/SANTA.git
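The value-stage idea in the abstract (sample $S$ indices from the post-softmax distribution, then only gather and add the selected value rows) can be illustrated with a small pure-Python sketch. This is not the authors' kernel: the function names are hypothetical, the single final division by $S$ is a simplification for readability, and the systematic variant is one plausible reading of "systematic sampling".

```python
import math
import random


def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]


def exact_attention(scores, values):
    """Dense baseline: sum_i p_i * V_i over all rows."""
    p = softmax(scores)
    dim = len(values[0])
    return [sum(p[i] * values[i][d] for i in range(len(values))) for d in range(dim)]


def systematic_indices(probs, num_samples, rng):
    """One uniform offset, then evenly spaced points pushed through the CDF."""
    u = rng.random()
    indices, i, cdf = [], 0, probs[0]
    for j in range(num_samples):
        point = (j + u) / num_samples
        while point > cdf and i < len(probs) - 1:
            i += 1
            cdf += probs[i]
        indices.append(i)
    return indices


def sampled_attention(scores, values, num_samples, rng, systematic=False):
    """Estimate the attention output from num_samples sampled value rows."""
    p = softmax(scores)
    if systematic:
        idx = systematic_indices(p, num_samples, rng)
    else:
        idx = rng.choices(range(len(p)), weights=p, k=num_samples)
    dim = len(values[0])
    out = [0.0] * dim
    for i in idx:  # gather-and-add: rows are summed, never scaled by p[i]
        for d in range(dim):
            out[d] += values[i][d]
    return [x / num_samples for x in out]  # simplification: one final scale
```

Because each index is drawn with probability $p_i$, the expected value of a sampled row is $\sum_i p_i V_i$, which is why the estimator is unbiased. The systematic variant keeps that property while forcing the draw counts to stay close to $S p_i$, which reduces variance.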

2606.04327 2026-06-04 cs.LG cs.AI math.OC 版本更新

A Geometric Characterization of the Stationary Plateau for Two-Layer Neural Networks

两层神经网络平稳高原的几何刻画

Tian Ding, Dawei Li, Ruoyu Sun

发表机构 * Shenzhen International Center of Industrial and Applied Mathematics（深圳工业与应用数学国际中心）； Shenzhen Research Institute of Big Data（深圳大数据研究院）； Shenzhen Loop Area Institute（深圳河套学院）； AutoKernel ； University of Minnesota Twin Cities（明尼苏达大学双城分校）； School of Data Science, The Chinese University of Hong Kong, Shenzhen, China（香港中文大学（深圳）数据科学学院）

AI总结通过定义“内Hessian”矩阵，研究了光滑激活函数下两层神经网络损失景观中平稳高原的几何结构，分类了所有平稳点的类型（局部极小或鞍点），并揭示了分裂系数与内Hessian的定性如何共同决定高原的局部几何。

Comments 47 pages

详情

AI中文摘要

我们研究了光滑激活函数的两层神经网络损失景观中出现的平稳高原的几何结构。我们关注“神经元分裂”现象，其中复制一个隐藏神经元会在更宽的网络中产生一个仿射平稳点集。我们提供了这些高原上所有平稳点的全面分类，确定了它们在何种条件下构成局部极小点或鞍点。我们的刻画依赖于一个我们称之为“内Hessian”矩阵的每个神经元曲率对象。我们的分析表明，内Hessian的定性以及分裂系数的选择共同决定了高原的局部几何。我们证明，分裂一个局部极小点可以产生局部极小和鞍点的混合，或者一个全鞍点的高原，在温和假设下确定了一个具体的必然鞍点区域。相反，分裂一个鞍点总是产生一个鞍点的高原。我们的结果统一并扩展了先前的景观分析，阐明了模型扩展何时以及如何保持或改变平稳点的性质。这些发现为神经网络中宽度扩展和重参数化的影响提供了新的几何见解。

英文摘要

We investigate the geometric structure of stationary plateaus that arise in the loss landscape of two-layer neural networks with smooth activation functions. We focus on the phenomenon of "neuron splitting" where duplicating a hidden neuron yields an affine set of stationary points in a wider network. We provide a comprehensive classification of all stationary points on these plateaus, determining under what conditions they constitute local minima or saddle points. Our characterization hinges on a per-neuron curvature object we term the "inner Hessian" matrix. Our analysis reveals that the definiteness of the inner Hessian and the choice of splitting coefficients jointly dictate the local geometry of the plateau. We show that "splitting" a local minimum can yield either a mixture of local minima and saddles or an all-saddle plateau, with a concrete sure-saddle region identified under mild assumptions. In contrast, splitting a saddle point always produces a plateau of saddle points. Our results unify and extend prior landscape analyses, elucidating when and how model expansion preserves or alters the nature of stationary points. These findings offer new geometric insights into the effects of width expansion and reparameterization in neural networks.
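A minimal sketch of the "neuron splitting" construction, assuming a scalar-input two-layer network with `tanh` activation (the names `two_layer` and `split_neuron` are illustrative, not from the paper). Duplicating hidden neuron `j` and dividing its output weight as `lam * a_j` and `(1 - lam) * a_j` leaves the network function unchanged for every real `lam`, which is what produces the affine family of parameters. The classification of those points as minima or saddles is the paper's contribution and is not reproduced here.

```python
import math


def two_layer(x, hidden_w, out_a):
    """f(x) = sum_j a_j * tanh(w_j * x) for a scalar input x."""
    return sum(a * math.tanh(w * x) for w, a in zip(hidden_w, out_a))


def split_neuron(hidden_w, out_a, j, lam):
    """Duplicate hidden neuron j; the two copies share a_j as lam and (1 - lam)."""
    new_w = hidden_w + [hidden_w[j]]
    new_a = list(out_a)
    new_a[j] = lam * out_a[j]
    new_a.append((1.0 - lam) * out_a[j])
    return new_w, new_a


w, a = [0.5, -1.3, 2.0], [1.0, 0.7, -0.4]
w_wide, a_wide = split_neuron(w, a, j=1, lam=0.3)
print(two_layer(0.7, w, a), two_layer(0.7, w_wide, a_wide))
```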

2606.04326 2026-06-04 cs.LG cs.AI 版本更新

Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models

衡量重要之事：概念瓶颈模型的合成基准

Julian Skirzynski, Harry Cheon, Shreyas Kadekodi, Meredith Stewart, Berk Ustun

发表机构 * University of California, San Diego（加州大学圣地亚哥分校）

AI总结本文开发了用于概念瓶颈模型的合成基准，通过控制数据模态、概念选择、标注质量和完整性等属性，评估模型在决策支持和自动化场景下的性能，并诊断失败模式。

Comments Benchmarks available at https://github.com/ustunb/concept-benchmark

详情

AI中文摘要

概念瓶颈模型从输入中检测到的高级概念预测结果。尽管概念提供了从可解释性中获益的简单方法，但很少有数据集包含概念标签。这限制了研究人员确定哪些问题适合这些模型、隔离驱动其性能或导致失败的因素、或发现哪些算法表现良好的能力。在本文中，我们为概念瓶颈模型开发了合成基准，重点关注其两个主要用例：决策支持（模型帮助人类做出更好的决策）和自动化（模型在无监督下处理常规任务）。我们的基准可以生成带标签的数据集，同时控制影响性能的属性，包括数据模态、概念选择、标注质量和完整性。我们演示了如何使用这些基准评估代表性类别的概念瓶颈模型。我们的演示展示了基准如何诊断失败模式并指导后续测试。

英文摘要

Concept bottleneck models predict outcomes from high-level concepts detected in inputs. Although concepts provide a simple way to reap benefits from interpretability, very few datasets include concept labels. This limits researchers' ability to determine which problems are suitable for these models, isolate the factors that drive their performance or lead to failures, or uncover which algorithms perform well. In this paper, we develop synthetic benchmarks for concept-bottleneck models, focusing on their two main use cases: decision support, in which models assist humans in making better decisions, and automation, in which models handle routine tasks without supervision. Our benchmarks can generate labeled datasets while controlling for properties that affect performance, including data modality, concept choice, annotation quality, and completeness. We demonstrate how the benchmarks can be used to evaluate representative classes of concept bottleneck models. Our demonstrations show how the benchmarks can diagnose failure modes and guide follow-up testing.

2606.04324 2026-06-04 cs.LG stat.ML 版本更新

Neural Galerkin Normalizing Flows for Bayesian Inference of Diffusions with Inaccessible Boundaries

用于具有不可达边界的扩散模型贝叶斯推断的神经Galerkin归一化流

Riccardo Saporiti, Fabio Nobile

发表机构 * CSQI École Polytechnique Fédérale de Lausanne（洛桑联邦理工学院CSQI）

AI总结提出一种新的归一化流架构，通过神经Galerkin框架求解Fokker-Planck方程，学习扩散过程在两次观测之间的转移密度函数，从而高效实现贝叶斯推断。

Comments 27 pages, 12 figures

详情

AI中文摘要

从离散观测对扩散模型参数进行贝叶斯推断的主要挑战之一是，在连续观测时间之间无法获得转移密度函数的解析表达式，而该函数是推导似然函数所必需的。扩展先前使用归一化流求解Fokker-Planck型偏微分方程的研究，我们提出一种新的归一化流架构，用于学习扩散过程在两个观测时间之间的转移密度函数。我们通过神经Galerkin框架，以狄拉克质量作为初始条件，在初始数据和扩散系数的指定训练分布上求解相关的Fokker-Planck方程来实现这一点。我们特别关注扩散矩阵在某些不可达边界区域消失的过程，例如满足Feller条件的随机波动率模型。沿观测轨迹评估所获得的转移密度的乘积近似似然函数，从而通过马尔可夫链蒙特卡洛实现廉价的后验采样。在离线训练阶段之后，推断变得显著更高效，因为它避免了为MCMC采样器提出的每个参数实时求解Fokker-Planck方程，或依赖其他涉及重复模拟扩散桥的无似然贝叶斯推断方法。

英文摘要

One of the primary challenges in Bayesian inference on the parameters of a diffusion model from discrete observations is the unavailability of an analytical expression for the transition density function between consecutive observation times, which is needed to derive the likelihood function. Extending previous studies that solve Fokker-Planck (FP) type partial differential equations with Normalizing Flows, we propose a new Normalizing Flow architecture to learn the transition density function of the diffusion process between two observation times. We do so by solving in a Neural Galerkin framework the associated FP equation with a Dirac mass as initial condition, over a specified training distribution of the initial datum and the coefficients of the diffusion. We specifically focus on processes whose diffusion matrix vanishes in certain inaccessible boundary regions, such as Stochastic Volatility models that satisfy a Feller condition. The product of the obtained transition densities evaluated along the observed trajectory approximates the likelihood function, thereby enabling cheap posterior sampling via Markov chain Monte Carlo (MCMC). After the offline training phase, inference becomes significantly more efficient, as it avoids the need to solve the FP equation in real time for each parameter proposed by the MCMC sampler or to rely on other likelihood-free methods for Bayesian inference that involve repeated simulation of diffusion bridges.

2606.04320 2026-06-04 cs.LG cs.AI 版本更新

OpenRFM: Dissecting Relational In-Context Learning

OpenRFM：剖析关系型上下文学习

Zhikai Chen, Junyu Yin, Jialiang Gu, Siheng Xiong, Xiaoze Liu, Ruowang Zhang, Keren Zhou, Kai Guo

发表机构 * Michigan State University（密歇根州立大学）； Georgia Institute of Technology（佐治亚理工学院）； Purdue University（普渡大学）； George Mason University（乔治·梅森大学）

AI总结本文通过分析关系型Transformer的模型和数据两方面问题，提出双阶段上下文学习架构和同质性感知预训练混合策略，构建OpenRFM模型，在关系型基础模型上平均任务性能提升约30%。

Comments 25 pages, including appendix

详情

AI中文摘要

关系型基础模型（RFM）承诺一个单一的预训练预测器，给定任何关系数据库，通过关系型上下文学习（ICL）在一次前向传播中返回预测。然而，开放RFM与其商业对应物之间存在显著差距，且这一差距的根源尚未被系统理解。我们从两个角度剖析了一个代表性框架——关系型Transformer（RT）。模型方面：我们表明RT执行关系级ICL，而核回归视图显示，当稀疏标签单元覆盖导致欠定回归时，它会失败。数据方面：我们消融了RT的预训练来源，发现仅合成预训练和分布内预训练将相同架构驱动到不同机制（惰性与特征学习）。探究这一差距揭示，缺失的成分是标签生成过程中可识别支持的关系型潜在变量。这两个诊断转化为：（1）一种双阶段ICL架构，将关系型骨干与从预训练表格基础模型提升的批级ICL层相结合，以克服关系级标签稀缺；（2）一种同质性感知的合成加持续真实数据预训练混合，辅以基于原型的正则化。这些选择定义了OpenRFM，一个简单而有效的RFM，在RT骨干上平均任务性能提升约30%，并在大量评估任务上超越了商业模型KumoRFMv1。

英文摘要

Relational Foundation Models (RFMs) promise a single pre-trained predictor that, given any relational database, returns predictions in one forward pass via relational in-context learning (ICL). Yet a substantial gap separates open RFMs from their commercial counterparts, and the origin of this gap has not been systematically understood. We dissect a representative framework, the Relational Transformer (RT), from two perspectives. Model side: we show that RT performs relation-level ICL, and a kernel regression view shows it fails when sparse label-cell coverage yields an underdetermined regression. Data side: we ablate RT's pre-training source and find that existing synthetic-only pre-training and in-distribution pre-training drive the same architecture into different regimes, lazy vs. feature-learning. Probing this gap reveals that the missing ingredient is a support-identifiable relational latent in the label-generation process. These two diagnoses translate into (1) a dual-stage ICL architecture that combines the relational backbone with a batch-level ICL layer lifted from a pre-trained tabular foundation model to overcome relation-level label scarcity, and (2) a homophily-aware synthetic plus continual real-data pre-training mixture, augmented with a prototype-based regularization. These choices define OpenRFM, a simple yet effective RFM that improves average task performance by approximately 30% over the RT backbone and surpasses the commercial model KumoRFMv1 on a large set of evaluation tasks.

2606.04317 2026-06-04 cs.CR cs.LG cs.SE 版本更新

Toward a Generalized Defense Across Sparse, Continuous, and Structured Parameter Attacks

面向稀疏、连续和结构化参数攻击的通用防御

Bin Duan, Zeyu Bai, Guowei Yang

发表机构 * School of Electrical Engineering and Computer Science, The University of Queensland, Australia（电气工程与计算机科学学院，昆士兰大学，澳大利亚）

AI总结提出 ParDef 框架，通过密钥通道重参数化、QC-LDPC 量化和自适应鲁棒推理，实现对多种参数攻击的通用防御，在保持高性能的同时降低攻击成功率。

详情

AI中文摘要

深度神经网络越来越多地部署在异构和部分不可信的环境中，模型通过云存储、CI/CD 流水线、容器化服务和边缘执行平台进行分发。这种广泛的部署场景使模型参数面临各种完整性风险。与输入空间对抗攻击不同，参数攻击直接篡改模型的内部参数，并持续影响所有后续推理。现有防御要么需要重新训练，要么导致显著的精度下降，或者仅限于特定的攻击类别。然而，在实际部署场景中，参数攻击的形式往往不可预测。为了解决这一挑战，我们提出了 ParDef，一种针对深度神经网络面向多种类型参数攻击的通用防御。ParDef 集成了密钥通道重参数化（隐藏敏感参数方向）、QC-LDPC 量化（嵌入冗余并支持纠错）以及自适应鲁棒推理（在不确定性下稳定预测）。我们在 CIFAR-10、CIFAR-100 和 Tiny-ImageNet 上使用 ResNet 和 VGG 模型的评估表明，ParDef 在不同参数攻击下持续降低攻击成功率，同时保持较高的模型性能，且仅引入适度的部署开销。这些结果凸显了 ParDef 是一种实用且通用的 DNN 部署防御方案。

英文摘要

Deep neural networks are increasingly deployed across heterogeneous and partially untrusted environments, where models are distributed through cloud storage, CI/CD pipelines, containerized services, and edge execution platforms. This broad deployment landscape exposes model parameters to various integrity risks. Unlike input-space adversarial attacks, parameter attacks directly tamper with the model's internal parameters and persist across all subsequent inferences. Existing defenses either require retraining, incur significant accuracy degradation, or are limited to specific attack classes. However, in real-world deployment scenarios, the forms of parameter attacks are often unpredictable. To address this challenge, we present ParDef, a generalized defense for deep neural networks against diverse types of parameter attacks. ParDef integrates keyed channel reparameterization, which obscures sensitive parameter directions, QC-LDPC quantization, which embeds redundancy and supports error correction, and adaptive robust inference, which stabilizes predictions under uncertainty. Our evaluation on CIFAR-10, CIFAR-100, and Tiny-ImageNet using ResNet and VGG models demonstrates that ParDef consistently reduces attack success rates across different parameter attacks while maintaining high model performance and incurring only moderate deployment overhead. These results highlight that ParDef is a practical and generalized defense for DNN deployments.

2606.04314 2026-06-04 cs.LG cs.SE 版本更新

Testing Neural Networks via Bayesian-Guided Exploration of Decision Landscapes

通过贝叶斯引导的决策景观探索测试神经网络

Bin Duan, Meiru Che, Guowei Yang

发表机构 * School of Electrical Engineering and Computer Science, The University of Queensland, Australia（昆士兰大学电气工程与计算机科学学院）； College of Information and Communications Technology, Central Queensland University, Australia（中央昆士兰大学信息与通信技术学院）

AI总结提出BayesWarp框架，利用可解释显著性技术识别决策关键区域，并通过不确定性感知的贝叶斯优化自适应引导测试，在保持数据分布和语义接近性的同时高效发现多样化故障。

详情

AI中文摘要

随着神经网络越来越多地部署在安全关键领域，测试对于评估和提高其可靠性至关重要。现有的测试方法，无论是黑盒还是白盒，主要使用全局变异或覆盖引导策略，这两种方法都难以在保持与原始数据分布和语义接近的同时高效发现多样化的模型故障。我们提出BayesWarp，一个通过可解释显著性技术识别决策关键输入区域，并使用不确定性感知的贝叶斯优化策略自适应引导测试过程的测试框架，能够在保持与原始数据分布和语义接近的同时发现多样化故障。在MNIST、CIFAR-10和ImageNet上对六个神经网络模型的评估表明，BayesWarp在固定变异预算下提高了故障发现率、故障多样性、测试用例质量和关键神经元覆盖率。这些结果表明BayesWarp提高了测试有效性。此外，使用生成的故障案例进行微调可提高模型性能。

英文摘要

As neural networks are increasingly deployed in safety-critical domains, testing is essential to evaluate and improve their reliability. Existing testing methods, whether black-box or white-box, primarily use global mutation or coverage-guided strategies, both of which struggle to efficiently uncover diverse model failures while remaining proximate to the original data distribution and semantics. We propose BayesWarp, a testing framework that addresses this limitation by mutating decision-critical input regions identified via interpretable saliency techniques and adaptively guiding the testing process using an uncertainty-aware Bayesian Optimization strategy, enabling the discovery of diverse failures while preserving distributional and semantic proximity to the original data. Evaluation on MNIST, CIFAR-10, and ImageNet across six neural network models shows that BayesWarp improves failure discovery, failure diversity, test case quality, and critical neuron coverage under a fixed mutation budget. These results demonstrate that BayesWarp improves testing effectiveness. Moreover, fine-tuning with the generated failure cases leads to improvements in model performance.

2606.04310 2026-06-04 cs.LG cs.SE 版本更新

Latent Anchor-Driven Test Generation for Deep Neural Networks

基于潜在锚点的深度神经网络测试生成

Bin Duan, Matthew B. Dwyer, Guowei Yang

发表机构 * School of Electrical Engineering and Computer Science, The University of Queensland, Australia（昆士兰大学电气工程与计算机科学学院）； Department of Computer Science, University of Virginia, United States（弗吉尼亚大学计算机科学系）

AI总结提出 Latte 框架，利用预训练 VQ-VAE 在潜在空间中进行锚点引导的变异，生成语义相近、多样且能揭示错误的测试用例，提高故障暴露和行为多样性。

详情

AI中文摘要

深度神经网络（DNN）越来越多地部署在安全关键和安全性敏感的应用中，这使得严格的测试对于识别和缓解模型弱点至关重要。现有的 DNN 测试方法要么探索输入空间，要么探索学习到的潜在空间。虽然潜在空间生成比直接输入空间变异能更好地保持合理性，但当前方法在探索可控性、故障多样性和种子相对语义漂移之间仍面临权衡。为了克服这些限制，我们提出了 Latte，一个黑盒测试框架，通过利用潜在空间生成语义相近、多样且能揭示错误的测试用例。具体来说，Latte 使用预训练的 VQ-VAE 对每个输入种子进行编码，并沿着由从替代类别中采样的锚点定义的方向执行以种子为中心的一步潜在变异，然后进行量化并解码回输入空间。这会在学习到的潜在流形中探索每个种子周围的局部邻域，从而在相同预算下产生更多数量和更广泛多样性的触发预言机预测差异。我们在 5 个数据集和 10 个 DNN 模型上评估了 Latte，包括单模型和多模型测试场景。在评估的数据集和模型上，Latte 在匹配的测试预算下提高了故障暴露和行为多样性。在单模型设置下，它还相对于源种子保持了较低的种子相对语义漂移。

英文摘要

Deep Neural Networks (DNNs) are increasingly being deployed in security-critical and safety-sensitive applications, which makes rigorous testing essential to identify and mitigate model weaknesses. Existing DNN testing approaches explore either the input space or a learned latent space. While latent-space generation can better maintain plausibility than direct input-space mutation, current methods still face a trade-off among exploration controllability, failure diversity, and seed-relative semantic drift. To overcome these limitations, we propose Latte, a black-box testing framework that generates semantically proximate, diverse, and fault-revealing test cases by leveraging the latent space. Specifically, Latte encodes each input seed with a pre-trained VQ-VAE and performs a seed-centered, one-step latent mutation along directions defined by anchors sampled from alternative classes, followed by quantization and decoding back to the input space. This explores local neighborhoods around each seed within the learned latent manifold, resulting in a larger number and broader diversity of oracle-triggering prediction discrepancies under the same budget. We evaluated Latte on 5 datasets and 10 DNN models in single-model and multi-model testing scenarios. Across the evaluated datasets and models, Latte improves fault exposure and behavioral diversity under matched testing budgets. Under the single-model setting, it also maintains low seed-relative semantic drift with respect to the source seeds.

2606.04307 2026-06-04 cs.LG stat.CO stat.ME 版本更新

Folded Transport MCMC: Certifiable Quotient Posterior Computation for Symmetric Bayesian Models

折叠传输MCMC：对称贝叶斯模型的可认证商后验计算

Jun Hu

发表机构 * Wuhan University of Technology（武汉理工大学）

AI总结针对对称贝叶斯模型中的冗余多峰性导致MCMC收敛诊断退化的问题，提出Folded Transport MCMC方法，通过在对称群的基本域上构建独立采样器直接对商后验进行推断，并利用LCNF振荡认证框架在商度量下提供可证明的认证下界。

Comments 48 pages (including supplementary material), 5 figures, 6 tables. Submitted to Journal of the Royal Statistical Society: Series B

详情

PE-MHL: 用于复杂系统可扩展学习的物理编码模块化混合层

Ismail Hassaballa, Mircea Lazar

发表机构 * TUE（蒂姆大学）

AI总结提出物理编码模块化混合层（PE-MHL）框架，通过增量添加子模型并保证训练误差单调非增，实现可扩展、鲁棒的混合建模，在非线性NARX基准和Quanser Aero 2平台上优于同等规模单体网络。

详情

AI中文摘要

结合基于物理和数据驱动的混合模型在控制应用中展现出实现准确性和可解释性的强大潜力。尽管最近的方法在融入物理一致性方面取得了进展，但在可扩展性、对噪声的鲁棒性以及模型复杂度控制方面仍存在挑战。本文提出了物理编码模块化混合层（PE-MHL）框架，其中基线基于物理的模型通过添加新的子模型逐步细化，每个新组件在保留先前组件已学知识的同时增加复杂度。我们为这种构造建立了理论保证：通过每个新子模型的最小二乘初始化，训练误差在子模型数量上单调非增并可证明收敛。在非线性NARX基准和Quanser Aero 2平台上的实证评估表明，PE-MHL在准确性和泛化能力上均优于同等规模的单体网络，同时提供更稳定的训练动态和更好的底层数据结构保留。

英文摘要

Hybrid models that combine physics-based and data-driven components have shown strong potential for achieving accuracy and interpretability in control applications. While recent methods have made progress in incorporating physical consistency, challenges remain in scalability, robustness to noise, and control of model complexity. This paper proposes a Physics-Encoded Modular Hybrid Layer (PE-MHL) framework, in which a baseline physics-based model is incrementally refined through the addition of new sub-models, where each new component adds complexity while preserving what previous components have already learned. We establish a theoretical guarantee for this construction: with a least-squares initialization of each new sub-model, the training error is monotonically non-increasing in the number of sub-models and provably converges. Empirical evaluations on a nonlinear NARX benchmark and the Quanser Aero 2 platform demonstrate that PE-MHL outperforms equivalently sized monolithic networks in both accuracy and generalization, while also providing more stable training dynamics and better preservation of underlying data structures.
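The monotonicity claim can be illustrated with a deliberately simplified stand-in: each "sub-model" below is a single fixed feature whose coefficient is fitted by least squares on the current residual, while earlier components stay frozen. Since a zero coefficient is always feasible, the squared training error cannot increase when a component is added. The function names, features, and data are hypothetical; the paper's sub-models and physics encoding are presumably richer than this.

```python
import math


def sse(residual):
    return sum(r * r for r in residual)


def fit_incrementally(xs, ys, physics_model, features):
    """Add one single-feature sub-model at a time on top of a physics baseline."""
    # Residual left by the baseline physics-based model.
    residual = [y - physics_model(x) for x, y in zip(xs, ys)]
    errors = [sse(residual)]
    coeffs = []
    for phi in features:
        col = [phi(x) for x in xs]
        denom = sum(c * c for c in col)
        # Least-squares coefficient of the new component on the current residual.
        coef = sum(r * c for r, c in zip(residual, col)) / denom if denom > 0 else 0.0
        # Earlier components are left untouched; only the residual is updated.
        residual = [r - coef * c for r, c in zip(residual, col)]
        coeffs.append(coef)
        errors.append(sse(residual))
    return coeffs, errors


xs = [i / 10 for i in range(-20, 21)]
ys = [2.0 * x + 0.5 * math.sin(3 * x) + 0.1 * x ** 3 for x in xs]
coeffs, errors = fit_incrementally(
    xs, ys,
    physics_model=lambda x: 2.0 * x,  # assumed baseline, for illustration only
    features=[lambda x: math.sin(3 * x), lambda x: x ** 3, math.cos],
)
print(errors)
```

Each step subtracts $(r \cdot \phi)^2 / (\phi \cdot \phi) \ge 0$ from the squared error, so the printed sequence is non-increasing up to floating-point rounding.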

2606.04287 2026-06-04 cs.LG cs.AI 版本更新

Scaling Novel Graph Generation via Lightweight Structure-Guided Autoregressive Models

通过轻量级结构引导自回归模型扩展新颖图生成

Alessio Barboni, Massimiliano Lupo Pasini, Bishal Lakha, Edoardo Serra

发表机构 * Boise State University（博伊西州立大学）； Oak Ridge National Laboratory（橡树岭国家实验室）

AI总结提出一种轻量级自回归框架，利用结构引导拓扑排序和两阶段训练策略，在分子和非分子基准上实现高新颖性、有效性和唯一性的图生成。

详情

AI中文摘要

生成真实且多样的图是机器学习中的一个关键问题，在分子发现、电路设计、网络安全等领域有应用。然而，当前的图生成模型在可扩展性和新颖性方面仍存在局限。基于扩散的方法通常需要昂贵的全邻接操作和长去噪链，而许多自回归和混合模型至少具有二次复杂度。此外，这些模型往往模仿训练图而非泛化到新图。我们提出一个轻量级自回归框架来解决这些问题。它使用结构引导的拓扑排序将图序列化为规则的边序列，实现近对数线性生成，以及一种两阶段训练策略，结合探索导向的增强和迭代细化，以减少过拟合并促进受控的新颖性。在分子和非分子基准上的实验表明，我们的方法在保持高有效性和唯一性的同时提高了新颖性。该框架还支持LSTM和Mamba风格的因果序列骨干，大内存加速器使得能够进行超出典型GPU限制的更长的图序列实验。

英文摘要

Generating realistic and diverse graphs is a key problem in machine learning, with applications in molecular discovery, circuit design, cybersecurity, and beyond. However, current graph generative models remain limited by scalability and novelty. Diffusion-based methods often require costly full-adjacency operations and long denoising chains, while many autoregressive and hybrid models have at least quadratic complexity. In addition, these models often imitate training graphs rather than generalize beyond them. We propose a lightweight autoregressive framework to address these issues. It uses a structure-guided topological ordering to serialize graphs into regular edge sequences, enabling near log-linear generation, and a two-phase training strategy that combines exploration-oriented augmentation with iterative refinement to reduce overfitting and promote controlled novelty. Experiments on molecular and non-molecular benchmarks show that our approach improves novelty while preserving high validity and uniqueness. The framework also supports both LSTM and Mamba-style causal sequence backbones, with large-memory accelerators enabling longer graph-sequence experiments beyond typical GPU limits.

2606.04284 2026-06-04 cs.LG cs.AI cs.CL 版本更新

Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling

稀疏混合专家奖励模型学习可解释且专业化的专家用于个性化偏好建模

Yifan Wang, Jinyi Mu, Mayank Jobanputra, Yu Wang, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg

发表机构 * Saarland University（萨尔兰大学）； Independent Researcher（独立研究者）； Bielefeld University（比勒菲尔德大学）； Max Planck Institute for Software Systems（马克斯·普朗克软件系统研究所）； Max Planck Institute for Informatics（马克斯·普朗克信息研究所）

AI总结提出稀疏混合专家奖励模型，通过稀疏路由和专家多样性训练，从二元偏好数据中学习可解释的专家模式，提升个性化偏好建模的测试时适应性和可解释性。

详情

AI中文摘要

偏好建模在基于人类反馈的强化学习（RLHF）中扮演核心角色，使大型语言模型（LLMs）与人类价值观对齐。然而，大多数现有方法假设一个通用的奖励函数，忽视了人类偏好的多样性和异质性。为了在不增加额外标注成本的情况下解决这一限制，最近的工作提出从二元数据中学习多个偏好组件，并组合它们以建模个体偏好。然而，这些组件往往无法捕捉连贯且解耦的模式，限制了其可解释性和个性化效果。在这项工作中，我们提出了一种稀疏混合专家（MoE）奖励模型，该模型在二元偏好数据训练过程中鼓励稀疏路由和专家多样性。在受控和真实世界的实验中，稀疏MoE学习了可解释的路由模式和专业化的专家。它还改进了测试时的个性化，并且适应后的专家权重变化为分析模型如何适应个性化偏好提供了定性视角。

英文摘要

Preference modeling plays a central role in reinforcement learning from human feedback (RLHF), enabling large language models (LLMs) to align with human values. However, most existing approaches assume a universal reward function, neglecting the diversity and heterogeneity of human preferences. To address this limitation without additional annotation costs, recent work has proposed learning multiple preference components from binary data and combining them to model individual preferences. Nevertheless, these components often fail to capture coherent and disentangled patterns, limiting their interpretability and effectiveness for personalization. In this work, we propose a sparse Mixture-of-Experts (MoE) reward model that encourages sparse routing and expert diversity during training on binary preference data. Across controlled and real-world experiments, sparse MoE learns interpretable routing patterns and specialized experts. It also improves test-time personalization, and post-adaptation shifts in expert weights provide a qualitative lens for analyzing how the model adapts to personalized preferences.

2606.04280 2026-06-04 cs.LG cs.AI cs.IR 版本更新

预训练期间的强化学习探索：重新审视LLM训练中的策略优化

Rachit Bansal, Clara Mohri, Tian Qin, David Alvarez-Melis, Sham Kakade

发表机构 * Harvard University（哈佛大学）

AI总结本文质疑LLM标准训练流程中仅在预训练和监督微调后使用强化学习的做法，通过从头训练LLM并在中间检查点直接应用RL、SFT及SFT后RL，发现RL早期有效且能匹配完整流程，同时提出并行平均合并RL和SFT目标的方法在保持通用能力的同时优于其他方法。

详情

AI中文摘要

标准的LLM训练流程仅在预训练和监督微调（SFT）之后应用强化学习（RL）。我们通过从头训练LLM，并直接在中间预训练检查点上应用RL、SFT以及SFT后接RL，来质疑这一现状。我们发现RL在早期非常有效，并且通常也能在早期匹配完整的SFT→RL流程。通过在更难问题上的实验，我们发现针对性的预训练数据组成是RL有效性的强大杠杆，甚至比模型规模更重要。除了推理准确性之外，直接将RL应用于基础检查点会扩展模型的分布；而最近工作中报告的锐化效应仅在RL跟随SFT时出现。RL基本不改变模型的通用能力，而SFT后通用能力会下降。最后，我们通过并行平均合并RL和SFT目标，该方法在所有其他训练方法中表现最佳，跨指标均优于其他方法，同时保持通用能力。这些结果表明，LLM训练可能受益于RL的更广泛使用。

英文摘要

The standard LLM training pipeline applies reinforcement learning (RL) only after pre-training and supervised fine-tuning (SFT). We question this status quo by training an LLM from scratch and applying RL, SFT, and SFT followed by RL directly to intermediate pre-training checkpoints. We find that RL is effective very early, and often matches the full SFT$\to$RL pipeline early as well. Through experiments on harder problems, we find that targeted pre-training data composition is a strong lever for RL effectiveness, even more so than model scale. Beyond reasoning accuracy, applying RL directly to base checkpoints expands the model's distribution; the sharpening effect reported in recent work arises only when RL follows SFT. The general capabilities of the model remain essentially unchanged by RL, while they degrade following SFT. Finally, we merge RL and SFT objectives by parallel averaging, which outperforms all other training methods discussed across metrics while preserving general capabilities. Together, these results suggest that LLM training might benefit from an expanded use of RL.

2606.04266 2026-06-04 cs.CR cs.LG 版本更新

Long-Term and Short-Term Transistor Aging in Deep Neural Networks: Impact and Mitigation

深度神经网络中的长期与短期晶体管老化：影响与缓解

Alireza Sarmadi, Virinchi Roy Surabhi, Prashanth Krishnamurthy, Hussam Amrouch, Ramesh Karri, Farshad Khorrami

发表机构 * Dept. of Electrical and Computer Engineering, New York University (NYU) Tandon School of Engineering（纽约大学Tandon工程学院电气与计算机工程系）； School of Computation, Information and Technology, Technical University of Munich (TUM)（慕尼黑工业大学计算、信息与技术学院）

AI总结本文研究了长期和短期晶体管老化对深度神经网络推理精度的影响，并提出了一种老化感知重训练方法来缓解性能下降。

Comments 28 pages, 16 figures

详情

AI中文摘要

深度神经网络（DNN）被用于各种实际应用，例如图像分类和语音识别。在集成电路（IC）的硬件上实现的DNN的推理精度会在晶体管老化等现象下下降。老化会减慢晶体管的开关速度，由于时钟无法维持而导致系统级时序违规。为了在整个预期寿命内保持可靠性，设计人员添加保护带以防止时序违规；然而，添加大的时序保护带会导致性能（速度或吞吐量）损失。本章详细讨论了长期和短期晶体管老化对DNN推理精度的影响。此外，为了减轻老化对DNN精度的影响并控制它们，提出了一种老化感知重训练方法，以生成即使在激进（即小于所需）保护带下也具有弹性的DNN。这提高了DNN在老化引起的退化情况下的推理精度。本章在用于图像分类的DNN硬件实现上，使用现成的图像数据集讨论了这些影响以及缓解策略。还简要讨论了短期老化作为检测集成电路中硬件木马的激励机制的应用。

英文摘要

Deep neural networks (DNNs) are used in a variety of real-world applications including, for example, image classification and speech recognition. The inference accuracy of DNNs implemented on hardware in integrated circuits (ICs) degrades under phenomena such as transistor aging. Aging slows down the switching speed of transistors, resulting in system-level timing violations due to unsustainable clocks. To maintain reliability for the entire projected lifetime, designers add guardbands to prevent timing violations; however, adding large timing guardbands causes losses in performance (speed or throughput). This chapter provides a detailed discussion of the effects of long-term and short-term transistor aging on DNN inference accuracy. Furthermore, to mitigate the effects of aging on DNN accuracy, a methodology for aging-aware retraining is presented in order to generate a resilient DNN even when aggressive (i.e., smaller than required) guardbands are used. This improves the inference accuracy of the DNNs even in the presence of aging-induced degradation. These effects are discussed in this chapter along with mitigation strategies on a hardware implementation of a DNN for image classification on an off-the-shelf image dataset. The application of short-term aging as an excitation mechanism for the detection of hardware Trojans in integrated circuits is also briefly discussed.

2606.04265 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Nonlocal Mean Field Schrödinger Bridge with Learned Interactions

具有学习相互作用的非局部平均场薛定谔桥

Daisuke Inoue, Mathieu Laurière, Dante Kalise

发表机构 * Department of Mathematics, Imperial College London（帝国理工学院数学系）； Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning（上海前沿人工智能与深度学习科学中心）； NYU-ECNU Institute of Mathematical Sciences, NYU Shanghai（上海纽约大学NYU-ECNU数学科学研究所）

AI总结本文提出一种使用神经网络代理近似非局部相互作用的平均场薛定谔桥方法，将推理时的每步计算成本从二次降低到线性，并推导了代理误差传播的稳定性界限。

Comments 31 pages, 15 figures

详情

AI中文摘要

薛定谔桥问题构建一个以最小能量连接初始分布和终端分布的随机过程。本文考虑其平均场扩展，即平均场薛定谔桥，用于相互作用粒子系统。对于非局部相互作用，评估产生的依赖于粒子的分布项的计算量随种群规模呈二次增长，这使得大规模问题难以处理。我们通过使用神经网络代理近似非局部相互作用来解决这一瓶颈。由此产生的四阶段交替算法将推理时每步成本从种群规模的二次降低到线性。我们还推导了Grönwall型稳定性界限，显示代理误差如何传播到生成的轨迹。在导航和意见动力学任务的数值实验中，所提出的方法再现了通过解析评估获得的轨迹，并减少了训练时间。

英文摘要

The Schrödinger Bridge Problem constructs a stochastic process that connects an initial distribution to a terminal distribution with minimum energy. This work considers its mean-field extension, the Mean-Field Schrödinger Bridge, for interacting particle systems. With nonlocal interactions, evaluating the resulting particle-dependent distributional terms can scale quadratically with the population size, which makes large-scale problems intractable. We address this bottleneck by approximating the nonlocal interactions with neural network surrogates. The resulting four-stage alternating algorithm reduces the per-step cost from quadratic to linear in the population size at inference. We also derive Grönwall-type stability bounds that show how surrogate errors propagate to the generated trajectories. In numerical experiments on navigation and opinion-dynamics tasks, the proposed method reproduces trajectories obtained with analytical evaluation and reduces training time.

2606.04261 2026-06-04 cs.AI cs.CL cs.CV cs.ET cs.LG 版本更新

Can Generalist Agents Automate Data Curation?

通用智能体能否自动化数据筛选？

Feiyang Kang, Hanze Li, Adam Nguyen, Mahavir Dabas, Jiaqi W. Ma, Frederic Sala, Dawn Song, Ruoxi Jia

发表机构 * Virginia Tech（弗吉尼亚理工大学）； University of Illinois Urbana-Champaign（伊利诺伊大学厄巴纳-香槟分校）； University of Wisconsin-Madison（威斯康星大学麦迪逊分校）； University of California, Berkeley（加州大学伯克利分校）

AI总结本文提出Curation-Bench基准，通过通用编码智能体自动化数据筛选循环，实验表明现成智能体可达到强基线，但存在执行-研究差距，而结构化方法引导的智能体能在十分之一数据预算下自主组合出优于强基线的数据选择策略。

Comments Preprint

详情

AI中文摘要

训练数据的筛选是现代AI开发中最重要但劳动密集的部分之一：实践者根据嘈杂的基准反馈迭代地提出、实施、评估和修订数据策略。我们探究通用编码智能体能否自动化这一数据筛选循环。我们引入了*Curation-Bench*，一个以智能体为中心的基准，它固定模型、训练配方和评估套件，同时赋予智能体命令行权限以检查数据、实施策略、提交到固定的训练/评估流水线并进行修订。在视觉-语言指令微调实例中，现成智能体在十次迭代内达到了已发表的强数据选择基线。然而，轨迹分析揭示了持续的*执行-研究差距*：即使提供了策略指南和论文参考，智能体主要调整局部策略变体，而非探索新的策略家族。要求每次迭代引用、实例化和改编先前方法的框架将智能体转向方法引导的探索。这种框架化的智能体自主组合——无需人工设计输入——一种数据选择策略，在十分之一的数据预算下优于已发表的强基线。总体而言，当前智能体可以运行筛选循环，但可靠的数据研究需要框架化的方法适应，而非仅靠开放式提示。代码和基准已开源。

英文摘要

Curating training data is among the most consequential yet labor-intensive parts of modern AI development: practitioners iteratively propose, implement, evaluate, and revise data policies against noisy benchmark feedback. We ask whether generalist coding agents can automate this data-curation loop. We introduce *Curation-Bench*, an agent-centric benchmark that fixes the model, training recipe, and evaluation suite while giving agents command-line access to inspect data, implement policies, submit them to a fixed training/evaluation pipeline, and revise. In a vision-language instruction-tuning instantiation, out-of-the-box agents reach strong published data-selection baselines within ten iterations. However, trajectory analysis reveals a persistent *execution-research gap*: agents mainly tune local policy variants rather than explore new policy families, even when given strategy guides and paper references. Scaffolds requiring each iteration to cite, instantiate, and adapt a prior method shift agents toward method-guided exploration. The scaffolded agent autonomously composes -- without human design input -- a data-selection policy that outperforms strong published baselines at one-tenth their data budget. Overall, current agents can run the curation loop, but reliable data research requires scaffolded method adaptation, not open-ended prompting alone. Code and benchmark are open-sourced.

2606.04244 2026-06-04 cs.AI cs.CL cs.CV cs.LG (updated version)

VAMPS: Visual-Assisted Mathematical Problem Solving Benchmark

Amirhossein Dabiriaghdam, Shayan Vassef, Mohammadreza Bakhtiari, Yasamin Medghalchi, Ilker Hacihaliloglu, Mesrob Ohannessian, Lele Wang, Giuseppe Carenini

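Affiliations * University of California, Berkeley

AI summary: Proposes the VAMPS benchmark, which uses 1,168 bilingual multiple-choice questions to evaluate multimodal large models on mathematical reasoning with plotting tools, and finds that direct analytical solving outperforms tool-assisted visual solving.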

Abstract

Multimodal large language models are increasingly capable of complex reasoning, yet their performance often degrades when they must externalize a problem through a tool and then reason over the tool's output, specifically when they rely on visual aids. This gap is especially important because real engineering and scientific workflows often rely on visualization tools for analysis, validation, and decision-making. To study this discrepancy, we introduce VAMPS (Visual-Assisted Mathematical Problem Solving), a benchmark for graph-assisted mathematics. VAMPS contains 1,168 multimodal, bilingual multiple-choice question-answer pairs drawn from Iranian University Entrance Exam algebra and calculus problems and expanded with human-reviewed LLM-generated synthetic variants, all selected so that plotting provides a natural solution strategy by revealing intersections, extrema, asymptotes, etc. Designed for both benchmarking and diagnosis, VAMPS goes beyond prior multimodal benchmarks that primarily evaluate reasoning over fixed visual inputs by testing whether a model can benefit from constructing a useful graph and grounding its answer in the resulting visualization. Overall, we found that across a diverse set of models, direct analytical solving surprisingly outperforms tool-enabled visual solving, even on problems where plotting is a natural strategy.

2606.04238 2026-06-04 cs.LG cs.AI (updated version)

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

Devleena Das, Rajeev Patwari, Elliott Delaye, Ashish Sirasao

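Affiliations * Advanced Micro Devices, Inc.

AI summary: To address the severe accuracy loss that aggressive 2-bit quantization causes in large language models, the paper proposes Recover-LoRA, which combines a selective mixed-precision strategy (only the MLP gate and up layers are quantized to 2 bits) with low-rank adapter training via distillation on synthetic data. On Qwen3-4B, 10k synthetic samples recover 80-95% of accuracy on 9 of 12 benchmarks.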

Abstract

Aggressive weight quantization to 2-bit precision offers substantial throughput and memory gains for large language model (LLM) inference, but typically incurs severe accuracy degradation. These gains are particularly relevant for edge and on-device deployment, where memory capacity and bandwidth are primary constraints. In this work, we extend Recover-LoRA -- a lightweight, data-free accuracy recovery method originally developed for general model weight corruption -- to the setting of ultra-low-bit quantization. We propose a selective mixed-precision strategy in which only gate and up projection layers of the MLP are quantized to 2-bit (W2), while all other linear layers remain at higher precision, yielding a mixed-precision GateUp configuration. We demonstrate via roofline analysis across three model families (4B--20B) and two hardware platforms that a W4/W2-GateUp deployment (4-bit base with 2-bit gate/up) delivers 7.5--23.3\% TPS improvement over uniform W4 depending on model and context length, while confining quantization error to a predictable subset of layers. We then apply Recover-LoRA -- training low-rank adapters on the quantized layers via logit distillation with synthetic data -- to recover accuracy lost from 2-bit quantization of the gate and up layers. In a case study on Qwen3-4B, Recover-LoRA achieves 80--95\% accuracy recovery on 9 of 12 benchmarks, using only 10k synthetic training samples and no labeled data. We further demonstrate that synthetic data performs comparably to curated labeled data for distillation-based recovery, and that recovery generalizes to out-of-distribution evaluation tasks. Our results present Recover-LoRA as a practical post-quantization accuracy recovery tool for aggressive weight compression in deployment settings.
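
The two ingredients named in the abstract, 2-bit weight quantization and a logit-distillation objective, can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's implementation: the per-row uniform quantizer, the tiny 3x3 "layer", and the helper names are all hypothetical, and the low-rank adapter training step itself is omitted.

```python
import math

def quantize_2bit(weights):
    """Uniform quantization of one weight group to 4 levels (2 bits)."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 3 or 1.0  # 3 steps between 4 levels; guards constant groups
    return [lo + round((w - lo) / scale) * scale for w in weights]

def linear(weight_rows, x):
    """Plain matrix-vector product standing in for a projection layer."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight_rows]

def log_softmax(logits):
    m = max(logits)
    log_z = m + math.log(sum(math.exp(l - m) for l in logits))
    return [l - log_z for l in logits]

def distillation_loss(teacher_logits, student_logits):
    """KL(teacher || student) over the output distribution."""
    t, s = log_softmax(teacher_logits), log_softmax(student_logits)
    return sum(math.exp(ti) * (ti - si) for ti, si in zip(t, s))

# Hypothetical full-precision weights and their 2-bit counterpart.
teacher_w = [[0.9, -0.4, 0.1], [-0.7, 0.3, 0.6], [0.2, 0.8, -0.5]]
student_w = [quantize_2bit(row) for row in teacher_w]
x = [1.0, -2.0, 0.5]  # stands in for one synthetic input
loss = distillation_loss(linear(teacher_w, x), linear(student_w, x))
print(student_w)
print(loss)
```

In a recovery setup of the kind the abstract describes, a loss like this would be minimized with respect to adapter parameters added to the quantized layers, while the quantized weights stay frozen.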

2606.04236 2026-06-04 cs.CL cs.AI cs.LG (updated version)

Supportive Token Revealing for Fast Diffusion Language Model Decoding

Giries Abu Ayoub, Mario Barbara, Lluís Pastor-Pérez, Tanja Bien, Aneesh Barthakur, Alaa Maalouf, Loay Mualem

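Affiliations * Department of Computer Science, University of Haifa; Institute for AI, University of Stuttgart; IMPRS-IS

AI summary: Proposes AXON, a module that selects anchor tokens using attention, uncertainty, and confidence signals to improve the quality-latency trade-off of parallel decoding in diffusion language models.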

Abstract

Discrete diffusion language models can generate text efficiently by updating multiple masked positions in parallel, but this parallelism introduces a quality-latency trade-off. Aggressive decoding may commit mutually dependent tokens too early, while conservative decoding requires many denoising steps. Existing methods address this tension by deciding which tokens are safe to reveal using confidence or dependency criteria. However, avoiding unsafe commits does not necessarily make the remaining masked sequence easy to decode, since uncertain tokens may depend on masked tokens, creating a bottleneck for denoising steps. We propose AXON, a training-free module that can be added on top of existing parallel decoding strategies for diffusion language models. Rather than replacing the base decoder, AXON monitors the remaining uncertain masked tokens and intervenes only when their current state suggests that additional context is needed. It then shifts the criterion from which tokens are safest to reveal to which confident reveals would best support later denoising. AXON selects anchors, confident masked tokens that uncertain positions attend to, using attention, uncertainty, and confidence signals. Experiments on reasoning and code-generation benchmarks across multiple diffusion language models show that AXON improves the quality-latency trade-off of existing parallel decoders, often reducing the number of function evaluations while maintaining or improving accuracy.

2606.04210 2026-06-04 eess.AS cs.LG cs.SD (updated version)

Representation Matters in Randomized Smoothing for Audio Classification

Jong-Ik Park, Shreyas Chaudhari, José M. F. Moura, Carlee Joe-Wong

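Affiliations * University of California, Berkeley

AI summary: Studies the representation problem in randomized smoothing for audio classification, shows experimentally how preprocessing and representation choices affect certified robustness, and proposes reporting guidelines.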

Abstract

Randomized smoothing (RS) certifies robustness in the vector space where Gaussian noise is added. In audio classification, this space is often not uniquely defined because standard pipelines normalize, range-control, and transform waveforms into log-mel or other spectral features. We show that direct RS is therefore under-specified unless the certified object and preprocessing policy are explicit. On two audio benchmarks, keyword spotting and environmental-sound classification, we study waveform, feature-space, and post-processed smoothing. Our diagnostics show why representation-aware reporting is necessary: at the same smoothing level $\sigma=0.0025$, the two datasets share the same median raw radius $0.007996$, but different waveform energies yield different SNR-equivalent scales ($83.98$ vs. $90.97$ dB); log-mel smoothing gives higher positive-radius certified accuracy on environmental sounds ($68.42\%$ vs. $65.53\%$), certifying more examples with nonzero radius but over features rather than waveforms; and clipping or peak normalization changes the effective perturbation norm by roughly $230$--$351\times$. We therefore recommend that audio RS studies choose and report the task-specific certified object and perturbation model, including the perturbation location, gain policy, raw radius, and any post-noise geometry changes.
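
To see why one smoothing level can give the same raw radius but different SNR-equivalent scales, the sketch below uses the standard L2 radius for Gaussian randomized smoothing, R = sigma * Phi^{-1}(p_lower), together with a plain power-ratio SNR. It is illustrative only: the function names and the two waveform powers are assumptions, and the paper's exact SNR convention is not stated in the abstract.

```python
import math
from statistics import NormalDist

def certified_radius(sigma: float, p_lower: float) -> float:
    """L2 radius sigma * Phi^{-1}(p_lower); zero when the bound is not above 1/2."""
    if not 0.0 < p_lower < 1.0:
        raise ValueError("p_lower must lie strictly between 0 and 1")
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)

def snr_db(signal_power: float, sigma: float) -> float:
    """SNR in dB of a signal with the given mean power against N(0, sigma^2) noise."""
    return 10.0 * math.log10(signal_power / sigma ** 2)

sigma = 0.0025
# Hypothetical mean powers for two waveforms; only their ratio matters here.
quiet_power, loud_power = 1.0, 5.0

# The radius depends on sigma and the class-probability bound, not on signal power.
print(certified_radius(sigma, 0.999))
# The SNR-equivalent scale shifts with waveform power even though sigma is fixed.
print(snr_db(loud_power, sigma) - snr_db(quiet_power, sigma))
```

The radius is expressed in the units of whatever space the noise was added to, which is why the abstract asks studies to report the certified object alongside the number.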

2606.04209 2026-06-04 cs.LG (updated version)

A Geometric View of Counterfactual Behavior: Interaction of Boundary Proximity and Local Support

Ioanna Gemou, Matteo Gamba, Randall Balestriero, Ritambhara Singh

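Affiliations * Brown University

AI summary: Studies counterfactual behavior from a geometric perspective and finds that the interaction of decision-boundary proximity and local data support determines counterfactual feasibility; counterfactual behavior is a dimension independent of predictive performance and can be altered without changing accuracy.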

Abstract

Counterfactual explanations seek small, semantically meaningful changes to an input that alter a model's prediction, and are widely used to interpret and audit machine learning systems. In modern vision, language, and multimodal systems, pretrained encoders map inputs to representation spaces, and downstream classifier heads impose decision boundaries within those spaces. As a result, the feasibility and distance of nearby counterfactuals depend on boundary placement relative to the data. Yet models with similar predictive performance can differ substantially in whether such changes are achievable and how far representations must move. This work examines this variation using a standardized local search probe across several pretrained encoders and linear classifier heads. Results show that despite similar predictive performance, models differ substantially in their counterfactual behavior. Under fixed representations, varying only the classifier head alters counterfactual outcomes while leaving predictive performance largely unchanged. This variation is explained by the interaction of decision-boundary proximity and local data support, which jointly determine whether prediction changes are both feasible and lie in regions supported by the data, and can also improve counterfactual search within fixed models. Together, these findings identify counterfactual behavior as a distinct dimension beyond predictive performance and show that it can be altered without changing accuracy, with implications for model selection, robustness, and the reliability of counterfactual methods.
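
For a linear head on a fixed representation, the geometry described above is easy to make concrete: the nearest counterfactual lies along the boundary normal at distance |w.z + b| / ||w||. The sketch below is a toy illustration under assumptions, not the paper's probe; the two heads, the data points, and the nearest-neighbour distance used as a stand-in for "local support" are all hypothetical.

```python
import math

def boundary_distance(w, b, z):
    """Euclidean distance from representation z to the hyperplane w.z + b = 0."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    return abs(score) / math.sqrt(sum(wi * wi for wi in w))

def nearest_counterfactual(w, b, z, overshoot=1e-3):
    """Move z just across the boundary along the normal direction w."""
    score = sum(wi * zi for wi, zi in zip(w, z)) + b
    norm_sq = sum(wi * wi for wi in w)
    step = -(1.0 + overshoot) * score / norm_sq
    return [zi + step * wi for wi, zi in zip(w, z)]

def support_distance(point, data):
    """Distance to the closest observed representation (crude local-support proxy)."""
    return min(math.dist(point, d) for d in data)

z = [2.0, 1.0]
data = [[2.0, 1.0], [1.5, 0.5], [-1.0, 0.2]]
# Two hypothetical heads that both classify z as positive.
head_a = ([1.0, 0.0], 0.0)   # boundary at x = 0, far from z
head_b = ([1.0, 0.0], -1.8)  # boundary at x = 1.8, close to z
for w, b in (head_a, head_b):
    cf = nearest_counterfactual(w, b, z)
    print(boundary_distance(w, b, z), support_distance(cf, data))
```

Both heads agree on the prediction for `z`, yet the required movement and the support around the resulting counterfactual differ, which is the kind of variation the abstract attributes to boundary placement.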

2606.04205 2026-06-04 cs.MM cs.AI cs.CL cs.CV cs.LG cs.SD (updated version)

DetectZoo: A Unified Toolkit for AI-Generated Content Detection Across Text, Audio, and Image Modalities

Sajad Ebrahimi, Nima Jamali, Bardia Shirsalimian, Kelly McConvey, Wentao Zhang, Jalehsadat Mahdavimoghaddam, Maksym Taranukhin, Maura Grossman, Vered Shwartz, Yuntian Deng, Ebrahim Bagheri

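Affiliations * University of Toronto; University of Waterloo; Toronto Metropolitan University; University of British Columbia; Vector Institute

AI summary: Proposes DetectZoo, a first-of-its-kind unified multimodal toolkit for AI-generated content detection that enables fair, reproducible benchmarking by standardizing data preprocessing and evaluation and by integrating 61 detectors and 22 benchmark datasets.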

Abstract

The growing popularity and capacity of generative models have eroded the distinction between human and machine-generated content, motivating a growing body of work on detection across text, images, and audio. Most available detectors are either commercial software or, if open-source, come with incompatible codebases with bespoke preprocessing, evaluation protocols, and evaluation metrics, which make their adoption, fair comparison, and reproduction quite difficult. To address this critical gap, we introduce DetectZoo, a first-of-its-kind, extensible toolkit designed to provide a unified interface for AI-generated content detection across text, audio, and image modalities. DetectZoo standardizes the complete empirical pipeline, from data ingestion and preprocessing to model assessment, offering researchers a cohesive framework to benchmark state-of-the-art detectors systematically. By integrating diverse public datasets and baseline detection algorithms under a single, unified API, our toolkit facilitates rigorous and reproducible evaluation. DetectZoo provides reference implementations of 61 detectors, native loaders for 22 benchmark datasets, and a standardized evaluation pipeline that reports multiple metrics through a common interface. Each detector is self-contained yet accessible through the same interface, automatically caches pretrained weights, and reproduces the original published results. DetectZoo lowers the barrier to entry for multi-modal AI forensics, enabling researchers to identify performance gaps across domains and accelerating the development of robust, generalizable detection techniques. The open-source repository and comprehensive documentation are publicly available at https://github.com/sadjadeb/DetectZoo, and the package can be installed via pip install detectzoo.

2606.04199 2026-06-04 cs.CL cs.LG (updated version)

Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features

Aya Vera-Jimenez, Samuel Jaeger, Calvin Ibenye, Dhrubajyoti Ghosh

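Affiliations * Department of Mathematics; School of Data Science and Analytics; Department of Computer Science

AI summary: Extracts lexical-diversity, readability, and emotion features and evaluates a random forest classifier in a cross-prompt framework for detecting AI-generated fake news; the model performs consistently across prompts (AUC 0.988-1.000), indicating that these features generalize.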

Abstract

The increasing use of large language models has raised concerns about the spread of AI-generated fake news, particularly under varying prompting strategies. Most existing detection models are trained and evaluated under a single generation setting, leaving their ability to generalize across unseen prompts unclear. In this study, we investigate cross-prompt generalization in fake news detection using three datasets of AI-generated articles produced under distinct prompts, combined with real news articles. We extract interpretable linguistic features capturing lexical diversity, readability, and emotion-based characteristics and evaluate a random forest classifier under a cross-prompt framework, where models trained on one prompt are tested on another. Across all six train-test combinations, performance remains consistently high, with AUC values ranging from 0.988 to 1.000. Analysis of feature distributions shows that AI-generated text exhibits increased lexical diversity, reduced readability, and substantially lower emotional intensity compared to the overall dataset, with variations across prompts. Despite these distributional shifts, the classifier maintains strong performance, indicating that these features capture stable properties of AI-generated text that generalize across prompting strategies. These findings suggest that feature-based approaches can provide robust detection of AI-generated fake news under prompt variability.
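
The feature families and the cross-prompt protocol can be sketched as follows. This is an assumption-laden illustration rather than the study's feature set: the type-token ratio, words-per-sentence proxy for readability, toy emotion lexicon, and prompt names are all placeholders. It does show why three prompts give six train-test combinations (ordered pairs).

```python
import re
from itertools import permutations

# Toy emotion lexicon (assumption); a real study would use a published lexicon.
EMOTION_WORDS = {"fear", "anger", "joy", "shocking", "outrage"}

def features(text: str) -> dict:
    """Interpretable per-document features: diversity, readability proxy, emotion rate."""
    words = re.findall(r"[a-z']+", text.lower())
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        raise ValueError("text must contain at least one word and one sentence")
    return {
        "type_token_ratio": len(set(words)) / len(words),
        "words_per_sentence": len(words) / len(sentences),
        "emotion_rate": sum(w in EMOTION_WORDS for w in words) / len(words),
    }

prompts = ["prompt_a", "prompt_b", "prompt_c"]  # placeholder names
# Train on one prompt, test on another: ordered pairs of distinct prompts.
train_test_pairs = list(permutations(prompts, 2))

example = features("Shocking news spread fast. The news was false.")
print(len(train_test_pairs))
print(example)
```

Feature vectors like `example` would then be fed to a classifier such as a random forest, trained on the articles from the first prompt in each pair and scored on the second.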

2606.04194 2026-06-04 cs.LG cs.CL cs.IR (updated version)

Training-Free Lexical-Dense Fusion for Conversational-Memory Retrieval

Christian Lysenstøen

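Affiliations * Inland Norway University of Applied Sciences; University of California, Berkeley

AI summary: Proposes a training-free, CPU-only retrieval method that fuses maximum query-turn similarity (late interaction) with BM25 at the score level, substantially improving hit rate for multi-session conversational-memory retrieval, and analyzes the effect of different encoders and pooling strategies.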

Comments 9 pages, 3 figures, 10 tables. Code, data, and per-table receipts: https://github.com/Chrislysen/opsem

Abstract

Retrieving the few past turns that answer a new query across long multi-session histories is the retrieval bottleneck behind long-term conversational memory (LoCoMo, LongMemEval). Recent concurrent work, Nano-Memory, shows that scoring a session by the maximum query-turn similarity (late interaction, "Turn Isolation Retrieval") beats mean-pooled session embeddings. We do not claim that effect; we replicate it and ask what a training-free, CPU-only retrieval stage should add around it. We report four findings. (1) Fuse: score-level fusion of the late-interaction dense score with BM25, under a single leave-one-conversation-out weight, adds +8.8 to +17.2 points of LoCoMo Hit@1 over late interaction alone across six encoders (all p<1e-4), reaching Hit@1 0.752 / NDCG@5 0.829 (e5-large-v2), +11.2 pp over BM25. (2) An off-the-shelf web-search cross-encoder reranker over the fused top-10 hurts here, degrading Hit@1 by 6.9 pp (one reranker, one configuration). (3) A pooling-operator ablation shows top-k late interaction matches max-similarity, but a naive smooth-max (log-sum-exp) collapses for half the encoders. (4) The late-minus-early gap is large for all six encoders and tends to be larger for larger ones, while the marginal fusion gain shrinks; on LongMemEval-S, a lexical regime where BM25 saturates, the net fusion gain over BM25 is small and not significant. A per-category analysis frames the gain as a division of labor: dense late interaction helps most on multi-hop and temporal questions but trails BM25 on adversarial ones. The contribution is a controlled, reproducible account of a strong training-free retrieval recipe, not the late-interaction retriever itself (Nano-Memory's). We make no claim to a complete memory architecture; this is a retrieval-stage study.
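
Finding (1) combines two scores per session: a late-interaction dense score (the maximum query-turn similarity) and a BM25 score, mixed by a single weight. A minimal sketch of that idea follows. The min-max normalization, the toy vectors, and the precomputed BM25 values are assumptions for illustration; the abstract does not specify how scores are normalized or how the weight is chosen beyond leave-one-conversation-out.

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def late_interaction_score(query_vec, turn_vecs):
    """Score a session by its best-matching turn (max query-turn similarity)."""
    return max(cosine(query_vec, t) for t in turn_vecs)

def min_max(scores):
    """Rescale scores to [0, 1] so dense and lexical scores are comparable."""
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]

def fuse(dense, bm25, weight):
    """Convex combination of normalized dense and BM25 scores."""
    d, l = min_max(dense), min_max(bm25)
    return [weight * x + (1.0 - weight) * y for x, y in zip(d, l)]

query = [1.0, 0.0]
sessions = [
    [[0.9, 0.1], [0.0, 1.0]],  # one turn nearly parallel to the query
    [[0.5, 0.5], [0.4, 0.6]],  # moderately similar turns
    [[0.0, 1.0]],              # unrelated turn
]
dense = [late_interaction_score(query, turns) for turns in sessions]
bm25 = [1.0, 4.0, 0.5]  # hypothetical precomputed lexical scores

for weight in (1.0, 0.5, 0.0):
    fused = fuse(dense, bm25, weight)
    print(weight, fused.index(max(fused)))
```

With these toy numbers the top-ranked session changes as the weight moves from dense-only to lexical-only, which is why a held-out procedure for picking the weight matters.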

2606.04191 2026-06-04 cs.LG cs.AI (updated version)

KODA: Contrastive Representation Comparison and Alignment for Vision-Language Foundation Models

Youqi Wu, Mohammad Jalali, Farzan Farnia

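Affiliations * University of California, Berkeley

AI summary: Proposes the KODA framework, which uses kernel optimization to contrastively analyze representation differences between vision-language foundation models and identifies sample subsets that are weakly clustered under one representation but strongly clustered under another, supporting representation alignment.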

Abstract

Vision-language foundation models such as CLIP and SigLIP provide widely used representations for multimodal learning systems. While these models are typically compared through downstream performance, such evaluations often do not explain how their representations differ structurally. In this work, we study this problem through the task of Contrastive Embedding Clustering: identifying sample subsets that are weakly clustered under one representation but strongly clustered under another. We propose \emph{Kernel Optimization for Discrepancy Analysis (KODA)}, a kernel-based framework for contrastive representation comparison and alignment. KODA constructs unified multimodal kernels through modality-wise kernel composition and formulates discrepancy discovery as a constrained optimization problem that searches for coherent structures in one representation while suppressing coherence in a reference representation. This yields interpretable discrepancy directions associated with specific sample subsets and modality interactions. To scale KODA to large vision-language datasets, we develop randomized low-dimensional approximations of joint kernels using random projections, including Random Fourier Features for shift-invariant kernels. Empirically, KODA identifies consistent and interpretable discrepancy structures across vision-language representations and provides sample subsets for representation alignment. The code is available at https://github.com/yokiwuuu/KODA.
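
The scaling step relies on Random Fourier Features, which replace a shift-invariant kernel with an inner product of low-dimensional random features. The sketch below shows the textbook construction for a Gaussian (RBF) kernel in plain Python. It is a generic illustration, not KODA's joint multimodal kernel; the dimensions, bandwidth, feature count, and seed are arbitrary choices.

```python
import math
import random

def make_rff(dim, n_features, bandwidth, seed=0):
    """Return a feature map phi with phi(x).phi(y) ~ exp(-||x - y||^2 / (2 * bandwidth^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density, N(0, 1 / bandwidth^2).
    ws = [[rng.gauss(0.0, 1.0 / bandwidth) for _ in range(dim)] for _ in range(n_features)]
    bs = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
                for w, b in zip(ws, bs)]

    return phi

def rbf(x, y, bandwidth):
    """Exact Gaussian kernel for comparison."""
    return math.exp(-math.dist(x, y) ** 2 / (2.0 * bandwidth ** 2))

phi = make_rff(dim=3, n_features=2000, bandwidth=1.0)
x, y = [0.2, -0.1, 0.4], [0.0, 0.3, 0.1]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))
print(approx, rbf(x, y, 1.0))
```

The approximation error shrinks roughly as one over the square root of the number of features, so the feature count trades accuracy against memory and compute.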

2606.04176 2026-06-04 cs.LG math.ST stat.ML stat.TH (updated version)

CaloTrilogy: Toward a Breakthrough in One-Step, End-to-End, Physics-Guided Shower Generation for Modern Calorimeters

Cheng Jiang, Sitian Qian, Kevin Pedro, Oz Amram, Huilin Qu, Maggie Voetberg

Affiliations * School of Physics and Astronomy, University of Edinburgh; Department of Physics, University of Wisconsin-Madison; Fermi National Accelerator Laboratory; State Key Laboratory of Dark Matter Physics, Tsung-Dao Lee Institute & School of Physics and Astronomy, Shanghai Jiao Tong University; Key Laboratory for Particle Astrophysics and Cosmology (MOE) & Shanghai Key Laboratory for Particle Physics and Cosmology, Shanghai Jiao Tong University

AI summary: Proposes a framework combining an average velocity field integrator, a learned generative prior, and physics-guided loss terms to generate high-quality showers in one or a few evaluation steps, with performance comparable to state-of-the-art flow and diffusion models.

Abstract

High-precision calorimeter simulation at current and future colliders imposes rapidly growing computational demands, motivating the development of machine-learning surrogates for traditional Monte Carlo tools such as Geant4. Flow matching and diffusion-based generative models have become leading approaches for high-dimensional fast simulation because of their sample quality, but typically require ${\cal O}(100)$ function evaluations at inference and often rely on auxiliary networks to constrain global observables, compromising streamlined end-to-end generation. We introduce a unified framework that improves the balance between speed, shower quality, and physics fidelity. The method combines: (i) an average velocity field integrator that enables sampling in one or a few evaluations; (ii) a learned generative prior in shower space, constructed from data rather than random noise; and (iii) physics-guided loss terms that impose inductive biases on key observables during training. These elements are training-time regularizers, preserving end-to-end inference with no additional cost. With only one or a few evaluation steps, the model achieves shower quality competitive with state-of-the-art flow and diffusion approaches, tested on several public high-granularity calorimeter datasets. The results demonstrate inter-layer shower structure consistent with the underlying physics, providing a strong candidate for future fast simulation workflows.
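
Why an average velocity allows one-step sampling can be seen on a toy ODE. Integrating an instantaneous velocity field needs many small steps, whereas the average velocity over the whole interval gives the endpoint in one evaluation. The sketch below assumes linear dynamics dx/dt = A * x so that the average velocity has a closed form; in a generative model that quantity would be learned, and nothing here reproduces the paper's network or training.

```python
import math

A = -1.5  # toy dynamics dx/dt = A * x (assumption, stands in for a learned field)

def instantaneous_velocity(x, t):
    return A * x

def average_velocity(x0):
    """Net displacement over t in [0, 1] starting from x0; closed form for the toy ODE."""
    return x0 * (math.exp(A) - 1.0)

def euler_sample(x0, steps):
    """Multi-step integration of the instantaneous velocity field."""
    x, h = x0, 1.0 / steps
    for k in range(steps):
        x += h * instantaneous_velocity(x, k * h)
    return x

def one_step_sample(x0):
    """Single evaluation using the interval-averaged velocity."""
    return x0 + average_velocity(x0)

x0 = 2.0
exact = x0 * math.exp(A)
print(abs(euler_sample(x0, 1) - exact))
print(abs(euler_sample(x0, 100) - exact))
print(abs(one_step_sample(x0) - exact))
```

A single Euler step with the instantaneous field is far off, a hundred steps get close, and one step with the average velocity lands on the endpoint, which mirrors the trade-off between function evaluations and sample quality discussed above.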

2606.04164 2026-06-04 cs.LG cs.AI (updated version)

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

Sotirios Vavaroutas, Yu Yvonne Wu, Ali Etemad, Cecilia Mascolo

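Affiliations * University of Cambridge; Dartmouth College; Queen’s University

AI summary: Proposes the ADAPTOOD framework, which uses data uncertainty to quantify distribution shift severity and combines it with low-rank updates and adaptive hyperparameter optimization, improving accuracy by up to 7% and precision by up to 12.9% on out-of-distribution ECG time series tasks.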

Comments 11 pages

Abstract

Data samples used for training often differ from those encountered during fine-tuning and deployment, and while ML models show promise, their performance remains limited when only small annotated datasets are available. Performance often degrades under distribution shifts caused by diverse sensors, populations, and application settings. Although pre-training helps, models frequently encounter out-of-distribution (OOD) data in real-world settings, leading to reduced robustness. Existing adaptation methods usually assume fixed distribution shifts and struggle when multiple types or severities occur. In particular, they overlook shift severity, for example treating adaptation to a large familiar dataset the same as adaptation to a small dataset with a new task, which limits generalisation. To address this, we propose ADAPTOOD, a novel framework that leverages data uncertainty to quantify distribution shift severity and guide fine-tuning for time series. This uncertainty measures how strongly samples from the target deployment distribution deviate from the pre-training distribution, providing a direct signal of OOD severity. Our framework combines this uncertainty with low-rank model updates and adaptive hyperparameter optimisation to improve adaptation. We show that ADAPTOOD achieves up to 7% higher accuracy and 12.9% higher precision than existing methods in OOD tasks, maintaining strong performance as distribution shift severity increases.

2606.04161 2026-06-04 cs.LG (updated version)

When Offline Selectors Cannot Beat the Best Single Model: A Diagnostic Study on edX Dropout Prediction

Tyler Crosse, Alan Nadelsticher Ruvalcaba, Dustin Khang LeDuc, Thomas Trask, Nicholas Lytle, David Joyner

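Affiliations * edX

AI summary: To address the problem that offline selectors often underperform the best single model in practice, the paper proposes a three-stage diagnostic based on k-NN label consistency, a comparison of offline learners, and state-feature ablation. It identifies local representational ambiguity as the bottleneck and recommends improving the state or collecting new data rather than tuning the learner.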

Abstract

Different predictors often excel on different inputs, so picking the best one per instance promises higher accuracy than committing to a single model. In practice, selectors trained from logged data routinely fail to beat the strongest single predictor. Three causes typically go unseparated before more tuning is applied: a mismatched learner, a state that does not predict which model wins, or buffer-to-deployment label shift. A three-stage diagnostic rules them out on a shared buffer. Stage 1 estimates a local ceiling on oracle recovery from $k$-NN label consistency. Stage 2 asks whether paired BC and offline-RL learners (BC, DQN, and CQL across penalty weights) reach that ceiling. Stage 3 ablates the selector state to test whether richer features would raise it. The combined verdict points to the most promising next step: tuning the learner, redesigning the state, or collecting new data. We apply it to selecting among five dropout-prediction models on edX clickstream data. Across 16 windows, the oracle beats the strongest single base model by 9.7 accuracy points on average, yet BC, DQN, and CQL land in the same test-accuracy band below it (robust to a tenfold buffer sweep and $N{=}2{,}000$ held-out examples). The bottleneck is local representational ambiguity: CQL closes the imitation gap without a deployment gain (not conservatism), regret clusters tightly across learners (not tie-breaking), and the three learners converge on test accuracy (not shift). The next iteration should change the state or collect new data, not tune the offline learner further.
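
The gap between the oracle and the best single model, and the Stage 1 question of whether the state predicts the winner, can be illustrated on toy data. This is a sketch under assumptions: the correctness matrix, the one-dimensional states, and the particular consistency statistic (mean fraction of nearest neighbours sharing the winning-model label) are hypothetical, and the abstract does not say how the paper turns consistency into a ceiling.

```python
import math

def best_single_accuracy(correct):
    """correct[i][m] is 1 if model m is right on instance i."""
    n_models = len(correct[0])
    return max(sum(row[m] for row in correct) / len(correct) for m in range(n_models))

def oracle_accuracy(correct):
    """Accuracy of a selector that always picks a correct model when one exists."""
    return sum(1 for row in correct if any(row)) / len(correct)

def knn_label_consistency(states, labels, k):
    """Mean fraction of each point's k nearest neighbours that share its label."""
    total = 0.0
    for i, s in enumerate(states):
        others = sorted((j for j in range(len(states)) if j != i),
                        key=lambda j: math.dist(s, states[j]))
        total += sum(labels[j] == labels[i] for j in others[:k]) / k
    return total / len(states)

# Two models that are each right on alternating instances.
correct = [[1, 0], [0, 1], [1, 0], [0, 1]]
winners = [0, 1, 0, 1]                  # index of the model that wins per instance
states = [[0.0], [0.1], [0.2], [0.3]]   # selector state; winners interleave along it

print(best_single_accuracy(correct), oracle_accuracy(correct))
print(knn_label_consistency(states, winners, k=1))
```

Here the oracle is perfect while each single model is at chance, yet neighbouring states never agree on the winner, so no learner operating on this state could recover the oracle. That is the situation the abstract calls local representational ambiguity.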

2606.04160 2026-06-04 cs.CL cs.LG (updated version)

Expert-Aware Refusal Steering

Anna C. Marbut, Daniel R. Olson, Travis J. Wheeler

Affiliations * Department of Interdisciplinary Studies; University of Montana; Department of Pharmacy Practice & Science; University of Arizona; European Bioinformatics Institute; European Molecular Biology Laboratory; Wellcome Genome Campus

AI summary: Studies suppression of refusal behavior in Mixture-of-Experts (MoE) large language models via expert-aware steering vectors, finding that the output of a single expert suffices for effective steering and that attention plays a substantial role in MoE refusal behavior.

Comments Under review for COLM 2026

Abstract

Safety alignment in instruction-tuned large language models (LLMs) depends on a model's ability to reliably refuse to respond to harmful or disallowed requests. Recent work has shown that a steering vector can be applied to a dense LLM during inference to effectively suppress refusal behavior, inducing response to harmful requests. We extend this refusal steering method to three open-source Mixture-of-Experts (MoE) LLMs and find that steering performance is uninhibited by the complex routing patterns inherent to the MoE architecture. We then propose two expert-aware refusal steering methods that leverage refusal-specific expert routing patterns and expert-specific steering directions to suppress normal refusal behavior. We find that refusal behavior can be effectively steered based on the output of a single expert. Our results show that refusal signals captured by steering methods differ from expert routing behavior, suggesting a substantial role for attention in MoE refusal behavior.

2606.04154 2026-06-04 q-bio.QM cs.LG 版本更新

dMX: 低精度浮点格式的可微分混合精度分配

Giuseppe Franco, Ian Colbert, Pablo Monteagudo-Lago, Felix Marty, Nicholas Fraser

发表机构 * AMD

AI总结提出可微分混合精度量化框架 dMX，通过连续优化每层浮点格式参数并配合退火调度和正则化项，实现硬件兼容的 MXFP 格式分配，在 LLM 上取得帕累托最优效果。

详情

AI中文摘要

将大型语言模型（LLM）量化为低精度浮点表示是高效部署的关键，然而在所有层上统一应用单一比特宽度在性能和准确性方面均非最优。本文介绍 dMX，一种用于可学习浮点比特宽度分配的可微分混合精度量化框架。我们研究了其在开放计算项目（OCP）标准定义的微缩放浮点（MXFP）数据类型家族上的应用。每层比特宽度分配被表述为一个连续优化问题，其中每层的浮点格式由一个标量参数参数化，将多变量设计空间折叠为单个可学习偏移量。在训练过程中，该偏移量取连续值，避免了离散量化格式之间的突然振荡。基于温度的退火调度逐步离散化学习到的偏移量，确保最终配置映射到硬件兼容的 MXFP 格式，而不会在训练和推理行为之间出现突变。目标感知正则化项将平均比特宽度引导至用户指定的预算，作为推理成本的粗粒度代理，平衡模型质量与部署效率。我们在不同 LLM 家族（如 Llama、Qwen3 和 SmolLM2）上进行了实验，评估了 WikiText-2 上的困惑度和四个零样本推理基准上的准确率。在这些设置中，dMX 一致地产生帕累托主导模型，并优于基于 Kullback-Leibler（KL）散度的层选择启发式方法，有效导航模型质量与平均比特宽度之间的权衡。

英文摘要

Quantizing large language models (LLMs) to low-precision floating-point representations is central to efficient deployment, yet applying a single bit-width uniformly across all layers is sub-optimal in terms of both performance and accuracy. This work introduces dMX, a differentiable mixed-precision quantization framework for learnable floating-point bit-width assignment. We study its application for the microscaling floating-point (MXFP) family of data types defined by the Open Compute Project (OCP) standard. The per-layer bit-width assignment is formulated as a continuous optimization problem in which each layer's floating-point format is parameterized by a scalar parameter, folding the multi-variate design space into a single learnable offset. During training this offset takes continuous values, avoiding sudden oscillations between discrete quantization formats. A temperature-based annealing schedule progressively discretizes the learned offsets, ensuring that the final configuration maps to hardware-compatible MXFP formats without abrupt transitions between training and inference behavior. A target-aware regularization term steers the average bit-width toward a user-specified budget, serving as a coarse-grained proxy for inference cost and balancing model quality against deployment efficiency. We performed experiments on different families of LLMs, such as Llama, Qwen3, and SmolLM2, evaluating perplexity on WikiText-2 and accuracy on four zero-shot reasoning benchmarks. Across these settings, dMX consistently yields Pareto-dominating models and improves over Kullback-Leibler (KL) divergence-based layer-selection heuristics, efficiently navigating trade-offs between model quality and average bit-width.

2606.04110 2026-06-04 cs.LG stat.ML 版本更新

Variance Reduction for Heavy-Tailed Monetization Metrics in Ranking Experiments via Post-Stratification

基于事后分层的排序实验中重尾货币化指标的方差缩减

Neeti Pokharna, Olivier Jeunen, Yatharth Saraf, Aleksei Ustimenko

发表机构 * ShareChat ； Aampe ； Simulacra Research

AI总结针对排序实验中重尾货币化指标方差大、统计功效低的问题，提出结合事后分层与CUPED的方差缩减框架，利用实验前协变量提升灵敏度，在ShareChat部署后以约45%的流量实现同等统计置信度。

Comments Accepted as Industry Track paper in the 2026 ACM SIGIR Conference on Research and Development in Information Retrieval

详情

DOI: 10.1145/3805712.3808428

AI中文摘要

排序和检索系统的在线评估通常依赖于下游货币化指标，如应用收入或创作者收益。这些指标通常是重尾的，一小部分用户主导了均值和方差，导致A/B实验的统计功效低、结论不可靠——尤其是在流量有限的情况下。我们提出了一个实用的在线实验方差缩减框架，通过结合事后分层与CUPED。我们的方法利用实验前协变量提高货币化实验的灵敏度，无需额外流量。在ShareChat的排名驱动货币化实验中部署后，该方法显著降低了方差并提高了决策稳定性，与标准指标相比，以约45%的流量实现了同等的统计置信度。我们进一步讨论了实际设计选择、防护措施和局限性，为事后分层在现实信息检索和推荐系统中的适用性提供了指导。

英文摘要

Online evaluation of ranking and retrieval systems often relies on downstream monetization metrics such as app revenue or creator earnings. These metrics are typically heavy-tailed, with a small fraction of users dominating both mean and variance, leading to low statistical power and unreliable conclusions in A/B experiments -- especially under limited traffic. We present a practical framework for variance reduction in online experiments by combining post-stratification with CUPED. Our approach leverages pre-experiment covariates to improve the sensitivity of monetization experiments without requiring additional traffic. Deployed at ShareChat across ranking-driven monetization experiments, the method substantially reduces variance and improves decision stability, achieving equivalent statistical confidence with ~45% less traffic than standard metrics. We further discuss practical design choices, guardrails, and limitations, providing guidance on when post-stratification is appropriate for real-world information retrieval and recommendation systems.
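
The combination named in the abstract can be illustrated with a minimal sketch. The abstract does not give the estimator, so the code below assumes the textbook forms: CUPED subtracts theta * (x - mean(x)) from each outcome, with theta = cov(x, y) / var(x), and post-stratification reweights per-stratum means by known population shares. The function names, the toy revenue numbers, and the `light`/`heavy` strata are hypothetical and are not taken from the paper.

```python
from statistics import fmean, pvariance


def cuped_adjust(y, x):
    """Return outcomes y adjusted by the pre-experiment covariate x (CUPED)."""
    x_bar, y_bar = fmean(x), fmean(y)
    # Unnormalized sums of squares / cross-products; the 1/n factors cancel in theta.
    ss_x = sum((xi - x_bar) ** 2 for xi in x)
    if ss_x == 0:
        return list(y)  # constant covariate carries no information
    ss_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    theta = ss_xy / ss_x
    return [yi - theta * (xi - x_bar) for xi, yi in zip(x, y)]


def post_stratified_mean(values, strata, weights):
    """Weighted average of per-stratum means; weights are population shares."""
    groups = {}
    for value, stratum in zip(values, strata):
        groups.setdefault(stratum, []).append(value)
    return sum(weights[s] * fmean(vs) for s, vs in groups.items())


if __name__ == "__main__":
    # Hypothetical heavy-tailed revenue: two "heavy" users dominate the mean.
    pre = [1.0, 2.0, 3.0, 4.0, 100.0, 120.0]     # pre-experiment revenue
    post = [2.0, 3.0, 5.0, 6.0, 150.0, 190.0]    # in-experiment revenue
    strata = ["light"] * 4 + ["heavy"] * 2
    shares = {"light": 0.9, "heavy": 0.1}        # assumed population shares

    adjusted = cuped_adjust(post, pre)
    print("raw variance:     ", pvariance(post))
    print("adjusted variance:", pvariance(adjusted))
    print("post-stratified mean:", post_stratified_mean(adjusted, strata, shares))
```

In a real A/B test, theta and the covariate mean would normally be estimated on the pooled treatment and control data so that the adjustment does not bias the difference between arms. The sketch applies it to a single sample only to show the mechanics.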

2606.04108 2026-06-04 cs.GR cs.AI cs.CV cs.LG 版本更新

SymTRELLIS: Symmetry-Enforced Voxel Latents for 3D Generation

SymTRELLIS: 对称性增强的体素潜变量用于3D生成

Guangda Ji, Qimin Chen, Qinchan Li, Mingrui Zhao, Kai Wang, Hao Zhang

发表机构 * Simon Fraser University（西蒙弗雷泽大学）

AI总结提出SymTRELLIS方法，通过在流模型生成过程中对预测速度进行对称化平均，强制任意有限点群对称性，无需重新训练VAE或流模型，显著降低对称性误差。

详情

AI中文摘要

单视图3D生成模型已取得令人印象深刻的视觉质量，但它们并非为满足结构或功能需求而设计，在实践中常常存在不足。对称性就是这样一个需求：违反对称性，即使是微小的违反，也可能使模型在物理上不可用。我们提出SymTRELLIS，一种在TRELLIS.2的基于流的3D生成过程中强制任意有限点群对称性（旋转、反射和多面体对称）的方法，无需重新训练底层的VAE或流模型。我们的关键思想是将空间变换在潜空间中的作用近似为体素潜变量上的学习线性算子，通过一个轻量级的空间变换潜映射器实现，该映射器在通用的非对称3D数据上训练。在生成时，我们通过在每一步ODE中对所有对称等价变换的预测流速度进行平均来强制对称性，这一过程称为速度对称化。对称性规格可以从初始TRELLIS.2生成中自动估计，或由用户提供，从而可以刻意调整对称重数，超越输入图像所暗示的范围。在一个包含266个严格对称物体的基准测试上（涵盖2到20重旋转和多面体对称群），与TRELLIS.2、Hunyuan3D-2.1和TripoSG相比，SymTRELLIS显著降低了所有对称性误差指标，同时保持了与基础模型相当的重建精度。

英文摘要

Single-view 3D generative models have achieved impressive visual quality, yet they are not designed to satisfy structural or functional requirements and, in practice, often fall short. Symmetry is one such requirement: violations of symmetry, even subtle ones, can render a model physically unusable. We present SymTRELLIS, a method that enforces arbitrary finite point group symmetries (rotational, reflectional, and polyhedral) during the flow-based 3D generation of TRELLIS.2, without retraining the underlying VAE or flow model. Our key idea is to approximate the latent-space action of spatial transformations as a learned linear operator on voxel latents, implemented as a lightweight spatial-transform latent mapper trained on generic, non-symmetric 3D data. At generation time, we enforce symmetry by averaging predicted flow velocities across all symmetry-equivalent transformations at each ODE step, a process we call velocity symmetrization. The symmetry specification can be estimated automatically from an initial TRELLIS.2 generation or supplied by the user, enabling deliberate fold manipulation beyond what the input image suggests. On a curated benchmark of 266 strictly symmetric objects spanning 2- to 20-fold rotations and polyhedral symmetry groups, SymTRELLIS substantially reduces all symmetry error metrics compared to TRELLIS.2, Hunyuan3D-2.1, and TripoSG, while maintaining reconstruction accuracy comparable to the base model.
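
The averaging step described in the abstract can be illustrated on a toy problem. The sketch below is not the SymTRELLIS implementation: it assumes a 2D point instead of voxel latents, exact rotations instead of the learned latent mapper, and a hand-written `toy_velocity` in place of the flow model. It computes the group average of g^{-1} v(g x) over the n-fold rotation group, which makes the resulting field rotation-equivariant even though the underlying field is not.

```python
import math


def rotate(p, angle):
    """Rotate a 2D vector p counter-clockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])


def symmetrize_velocity(velocity, x, n_fold):
    """Average g^{-1} v(g x) over the cyclic rotation group C_n."""
    acc_x = acc_y = 0.0
    for k in range(n_fold):
        angle = 2.0 * math.pi * k / n_fold
        v = velocity(rotate(x, angle))   # evaluate the field at the transformed point
        v_back = rotate(v, -angle)       # map the velocity back to the original frame
        acc_x += v_back[0]
        acc_y += v_back[1]
    return (acc_x / n_fold, acc_y / n_fold)


def toy_velocity(p):
    """Hypothetical non-symmetric velocity field standing in for a flow model."""
    return (p[0] + 0.5 * p[1] ** 2, math.sin(p[0]) - p[1])


if __name__ == "__main__":
    x = (0.3, -0.7)
    quarter_turn = math.pi / 2.0
    lhs = symmetrize_velocity(toy_velocity, rotate(x, quarter_turn), 4)
    rhs = rotate(symmetrize_velocity(toy_velocity, x, 4), quarter_turn)
    # Equivariance: symmetrizing at the rotated point equals rotating the symmetrized velocity.
    print(lhs, rhs)
```

Because the averaged field commutes with every group element, an ODE integrated with it from a symmetric starting state stays symmetric at every step, which is the property the abstract relies on.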

2606.04106 2026-06-04 cs.LG cs.AI 版本更新

Building The Ph(ysical)AI Layer Of Machine Intelligence

构建机器智能的物理AI层

Ulbert Jose Botero, Liam Smith, Brooks Olney, Pooya Khorrami, Steven Kusiak, Watson Jia, Sage Trudeau, Daniel Capecci

发表机构 * MIT Lincoln Laboratory（麻省理工学院林肯实验室）

AI总结提出基于信号处理原理的基座模型，通过射频数据训练实现跨模态迁移，无需目标域微调，以1.99M参数在15个任务上平均准确率77.7%。

Comments 102 pages, 11 Figures

详情

AI中文摘要

基础模型通过多样化数据的大规模训练实现泛化，但在没有配对训练数据的情况下，向真正未见过的领域迁移存在局限性。我们提出基于原理的基座模型，该模型编码信号处理原理（傅里叶分解、能量守恒、对称性），而不是学习无约束的统计相关性。我们假设不同领域的差异不在于基本物理规律，而在于时间、频率、幅度或相位上的可学习变换。仅使用射频数据训练，并结合这些原理的协同设计架构和损失函数，我们实现了向音频、图像、文本和视频的跨模态迁移，仅使用从射频数据学习到的冻结表示，无需在目标域上对编码器进行微调。我们的1.99M参数冻结编码器通过线性探测在15个不同任务上达到77.7%的平均准确率（top-3为91.9%），具有系统性差异：在物理基础任务（说话人识别、地震学、射频指纹识别）上为84.5%，而在语义任务（音乐流派、语言识别）上为70.0%。这表明基于原理和基于规模的方法提供了互补路径：物理原理实现了高效的跨模态迁移，同时自然地界定了物理理解与语义理解之间的边界。

英文摘要

Foundation models achieve generalization through massive-scale training on diverse data, but have limitations with transfer to truly unseen domains without paired training data. We propose principle-driven foundation models that encode signal-theoretic principles (Fourier decomposition, energy conservation, symmetry) rather than learn untethered statistical correlations. We hypothesize that domains differ not in fundamental physics, but in learnable transformations in time, frequency, magnitude, or phase. Training exclusively on radio-frequency (RF) data with co-designed architecture and losses incorporating these principles, we achieve cross-modal transfer to audio, images, text, and video using only frozen representations learned from RF data, requiring no fine-tuning of the encoder on target domains. Our 1.99M parameter frozen encoder achieves 77.7% average accuracy (91.9% top-3) across 15 diverse tasks via linear probing, with systematic variation: 84.5% on physically-grounded tasks (speaker recognition, seismology, RF fingerprinting) versus 70.0% on semantic tasks (music genre, language recognition). This reveals that principle-driven and scale-driven approaches offer complementary paths: physical principles enable efficient cross-modal transfer while naturally establishing the boundary between physical and semantic understanding.

2606.04103 2026-06-04 cs.SD cs.AI cs.LG eess.AS 版本更新

The Differentiable Auditory Loop (DAL): An ML Framework for Hyper-Personalized Hearing Aids

可微分听觉环路（DAL）：用于超个性化助听器的机器学习框架

Alejandro Ballesta Rosen, Jason Mikiel-Hunter, Julian Maclaren, Jack Collins, Richard F. Lyon, Simon Carlile

发表机构 * Google Research Australia（谷歌澳大利亚研究实验室）； Macquarie University（麦考瑞大学）

AI总结提出可微分听觉环路（DAL）框架，通过将CARFAC模型移植到JAX并优化SEANet深度神经网络，以正常听觉神经活动模式为参考补偿听力损失，在神经表征和信号保真度指标上优于传统助听器基线。

详情

AI中文摘要

传统助听器依赖固定的频率依赖性放大和压缩来管理灵敏度降低，这在复杂环境中（如多说话者场景，即“鸡尾酒会”问题）往往无法提供足够的听力支持。为了更全面地解决听力损失背后的编码功能障碍，我们引入了可微分听觉环路（DAL），这是一个用于个性化助听器设计和验配的新开源框架。我们的第一个DAL实现包含了CARFAC——一个可微的人类耳蜗功能模型，我们将其移植到JAX，以优化深度神经网络，使受损的听觉神经活动模式与正常听力参考匹配。为了构建具有所需精细频谱-时间信号处理的助听器，我们采用了SEANet，一种波形到波形的全卷积UNet生成器。我们通过比较适配正常听力的CARFAC模型输出与适配每个受试者个体听力损伤的CARFAC模型输出来微调网络。比较使用来自各自CARFAC神经活动模式（NAP）输出和稳定听觉图像（SAI）的损失函数进行，后者提供捕获听觉神经输出中相位不敏感时间结构的二维表示。通过梯度下降，SEANet模型学习同时去噪输入并补偿由受损CARFAC模型建模的听力损失。在神经表征和信号保真度指标上，DAL优化的SEANet模型优于测试的主助听器（MHA）基线。DAL框架为基于模型、机器学习驱动的助听器信号处理个性化提供了一条实用路径。下一步包括硬件部署以实现真实世界的临床测试。

英文摘要

Conventional hearing aids rely on fixed, frequency-dependent amplification and compression to manage reduced sensitivity, which often fails to provide sufficient listening support in complex environments, such as situations with multiple speakers (the "cocktail party" problem). To more comprehensively address the underlying encoding dysfunctions of hearing loss, we introduce the Differentiable Auditory Loop (DAL), a new open-source framework for personalized hearing aid design and fitting. Our first implementation of DAL incorporates CARFAC, a differentiable model of human cochlear function, which we ported to JAX, to optimize a deep neural network to match impaired auditory neural activity patterns with a normal-hearing reference. To build a hearing aid with the fine-grained spectro-temporal signal processing required, we adopt SEANet, a waveform-to-waveform fully convolutional UNet generator. We fine-tune the network by comparing the outputs of a CARFAC model fitted to normal hearing with those of a CARFAC model fitted to match each subject's individual hearing impairment. The comparison is done using loss functions derived from the respective CARFAC neural activity pattern (NAP) outputs and stabilized auditory images (SAIs), the latter providing a 2D representation that captures phase-insensitive temporal structure in the auditory nerve output. Through gradient descent, the SEANet model learns to both denoise the input and compensate for the hearing loss modelled by the impaired CARFAC model. Across neural-representation and signal-fidelity metrics, the DAL-optimized SEANet model outperformed the tested master hearing aid (MHA) baselines. The DAL framework provides a practical path toward model-based, machine-learning-driven personalization of hearing aid signal processing. Next steps include hardware deployment to enable real-world clinical testing.

2606.04100 2026-06-04 cs.LG physics.comp-ph 版本更新

自适应分块在时间序列预测中比看起来更难

Federico Zucchi, Yi Xie, Chao Zhang, Keyuan Luo, Thomas Lampert, Ziyue Li

发表机构 * ICube, University of Strasbourg, Illkirch-Graffenstaden, France（斯特拉斯堡大学ICube研究所，法国伊尔基希-格拉芬斯塔登）； Technical University of Munich（慕尼黑工业大学）； FinTech Thrust, The Hong Kong University of Science and Technology (Guangzhou)（香港科技大学（广州）金融科技学域）； Computer Science Department, Hainan Bielefeld University of Applied Sciences（海南比勒费尔德应用科学大学计算机科学系）； Cephalgo, Strasbourg, France（法国斯特拉斯堡Cephalgo公司）； Heilbronn Data Science Center, Munich Data Science Institute（慕尼黑数据科学研究所海尔布隆数据科学中心）

AI总结本文通过理论分析和实验验证，探讨自适应分块在时间序列Transformer中是否优于调优的均匀分块，发现均匀基线在标准基准上具有竞争力，自适应分块的优势有限且依赖于特定方法和数据集。

详情

AI中文摘要

自适应分块是时间序列Transformer最近提出的一个引人注目的方案：在序列局部信息丰富的区域分配更细的分块。本文探究在什么条件下内容自适应分块算子应优于调优的均匀算子。局部异质性本身并不足够：在逐点预测损失下，一个看似复杂的区域并不自动意味着更细的分块会减少损失。我们将分块建模为有预算的比特率分配，并推导出一个显式阈值，动态分块规则必须满足该阈值才能击败调优的均匀基线，然后从局部（二次代理）和全局（模型假设下的强凸界）两方面界定了可实现的改进。由此得出两个结构性结果：在没有耦合约束的情况下，标量局部复杂度无法在常见损失景观下产生非均匀最优；一旦骨干网络训练到其表示感知最优，对齐增益会在调优的均匀分块大小附近崩溃。为了验证这些预测，我们在三种代表性架构上进行了受控隔离研究，用均匀分块大小扫描替换每个自适应机制，同时保持骨干网络、数据和训练协议不变。在标准的长时域预测基准上，验证选择的均匀基线与动态对应物具有竞争力，每个设置的效果集中在零附近，且按数据集汇总后没有一致的方向性优势。我们观察到的较大增益是方法和数据集特定的。因此，自适应分块应针对调优的均匀基线进行评估；其价值取决于是否有一个廉价且可靠的路由信号能够识别出更细的分块实际上在何处减少预测损失。

英文摘要

Adaptive patching is a recent and compelling proposal for time-series Transformers: allocate finer patches where the sequence looks locally informative. This paper asks under what conditions a content-adaptive patching operator should outperform a tuned uniform one. Local heterogeneity alone is not enough: under pointwise forecasting losses, a complex-looking region is not automatically one where finer patching reduces the loss. We model patching as a budgeted bitrate allocation and derive an explicit threshold that a dynamic patching rule must satisfy to beat a well-tuned uniform baseline, then bound the achievable improvement both locally (a quadratic surrogate) and globally (a strong-convexity bound under the model's assumptions). Two structural results follow: without a coupling constraint, scalar local complexity cannot produce a non-uniform optimum under a common loss landscape; and once the backbone is trained to its representation-aware optimum, the alignment gain collapses around a well-tuned uniform patch size. To test these predictions, we run a controlled isolation study on three representative architectures, replacing each adaptive mechanism with a uniform patch-size sweep while keeping the backbone, data, and training protocol fixed. On standard long-horizon forecasting benchmarks, the validation-selected uniform baseline is competitive with the dynamic counterpart, with per-setting effects concentrated near zero and no consistent directional advantage once results are aggregated by dataset. The larger gains we do observe are method- and dataset-specific. Adaptive patching should therefore be evaluated against a tuned uniform baseline; its value depends on whether a cheap and reliable routing signal can identify where finer patches actually reduce forecasting loss.

2606.04073 2026-06-04 cs.LG cs.AI stat.ML 版本更新

TPA-AD: A Two-Stage Pseudo Anomaly-Guided Method for Bearing Time-Series Anomaly Detection

TPA-AD: 一种用于轴承时间序列异常检测的两阶段伪异常引导方法

Xiancheng Wang, Zhibo Zhang, Ran Li, Rui Wang, Minghang Zhao, Shisheng Zhong, Lin Wang

发表机构 * CQSF.com（重庆师范大学）； Huadian University（哈尔滨理工大学）

AI总结提出一种两阶段伪异常引导方法TPA-AD，通过重构模型和特征误差控制生成边界伪异常窗口，结合对比学习与KNN实现无监督轴承时间序列异常检测，在轴承故障和退化数据集上表现稳定且具泛化性。

详情

AI中文摘要

本文提出了一种两阶段伪异常引导的异常检测方法（TPA-AD），用于在仅正常样本可用的训练设置下进行轴箱轴承时间序列异常检测（TSAD）。该方法首先利用重构模型和每特征目标误差控制在正常边界附近生成伪异常窗口，然后通过正常窗口与伪异常窗口之间的对比学习学习异常敏感表示，最后使用k近邻（KNN）生成窗口级和点级异常分数。与依赖已知故障类别、真实异常先验或随机异常注入的现有方法相比，TPA-AD通过在边界邻域构建伪异常提高了正常边界的可分离性，并能联合处理混合变量场景中的连续和离散特征。主要实验在轴承故障检测数据集和退化过程数据集上进行，并在13个公共TSAD数据集上进行了额外的探索性扩展。结果表明，所提方法产生相对稳定的异常响应，对退化演化敏感，并在公共TSAD基准和真实高速列车相关轴承数据上表现出一定程度的更广泛适用性。

英文摘要

This paper proposes a two-stage pseudo anomaly-guided anomaly detection method (TPA-AD) for axle-box bearing time-series anomaly detection (TSAD) under the setting where only normal samples are available for training. The method first generates pseudo-anomalous windows near the normal boundary using a reconstruction model and per-feature target-error control. It then learns anomaly-sensitive representations through contrastive learning between normal and pseudo-anomalous windows, and finally produces window-level and point-level anomaly scores using k-nearest neighbors (KNN). Compared with existing methods that rely on known fault categories, real anomaly priors, or random anomaly injection, TPA-AD improves the separability of the normal boundary by constructing pseudo-anomalies in boundary neighborhoods and can jointly handle continuous and discrete features in mixed-variable scenarios. The main experiments are conducted on bearing fault detection datasets and degradation-process datasets, with an additional exploratory extension on 13 public TSAD datasets. The results show that the proposed method yields relatively stable anomaly responses, is sensitive to degradation evolution, and demonstrates a certain degree of broader applicability on public TSAD benchmarks and real high-speed-train-related bearing data.

2606.04072 2026-06-04 cs.RO cs.DC cs.LG cs.SY eess.SY 版本更新

CADET: A Modular Platform for Evaluating Distributed Cooperative Autonomy in Connected Autonomous Vehicles

CADET：用于评估网联自动驾驶车辆中分布式协作自主性的模块化平台

Pragya Sharma, Brian Wang, Mani Srivastava

发表机构 * UCLA ； Amazon Scholar（亚马逊学者）

AI总结提出CADET模块化平台，通过解耦自动驾驶堆栈并集成网络与工作负载仿真，系统评估分布式协作自主系统在真实部署条件下的安全性与性能。

详情

Journal ref: ICRA 2026

AI中文摘要

深度学习模型日益成为自动驾驶汽车（AV）管道的核心，然而其集成传统上遵循单一设计，即感知、规划和控制在同一车载计算机上执行。这种设计忽视了协作自主的新兴范式，即车辆通过车联网（V2X）连接与路侧单元（RSU）、边缘服务器和云托管智能进行交互。协作感知和控制提高了安全性和效率，但也引入了系统级挑战：网络延迟、计算异构性和多租户争用，所有这些都严重影响实时决策。这些挑战因对大型基础模型的日益依赖而进一步放大，这些模型的规模需要云部署。我们提出CADET（通过分布式实验工具包实现协作自主），这是一个模块化平台，用于在真实部署条件下对分布式协作自主系统进行系统化和可重复的评估。CADET将自动驾驶堆栈解耦为可组合的模块，这些模块可以灵活地部署在车辆、基础设施和边缘/云层级上。该框架集成了最先进的模型，引入了基于轨迹的网络和工作负载仿真，并提供了同步的模型级、系统级和任务级检测。通过V2V和V2I实验，我们表明分布式部署选择从根本上影响安全性，其中V2V意图数据包优于基于云的感知，而RSU辅助感知在过载并发请求之前维持安全性。尽管专为自动驾驶管道设计，CADET也支持数据集驱动的实验，使系统和机器学习研究人员能够独立于完整的车辆仿真来基准测试分布式推理工作负载。CADET是开源的，代码和演示可在https://nesl.github.io/cadet-web获取。

英文摘要

Deep learning models are increasingly central to autonomous vehicle (AV) pipelines, yet their integration has traditionally followed a monolithic design where perception, planning, and control execute on a single onboard computer. This design overlooks the emerging paradigm of cooperative autonomy, where vehicles interact with roadside units (RSUs), edge servers, and cloud-hosted intelligence through vehicle-to-everything (V2X) connectivity. Cooperative perception and control improve safety and efficiency, but also introduce systems-level challenges: network latency, compute heterogeneity, and multi-tenant contention, all of which critically affect real-time decision-making. These challenges are further amplified by the increasing reliance on large foundation models, whose scale necessitates cloud deployment. We present CADET (Cooperative Autonomy through Distributed Experimentation Toolkit), a modular platform for systematic and reproducible evaluation of distributed cooperative autonomy systems under realistic deployment conditions. CADET decouples the AV stack into composable modules that can be flexibly deployed across vehicles, infrastructure, and edge/cloud tiers. The framework integrates state-of-the-art models, incorporates trace-driven network and workload emulation, and provides synchronized model-, system-, and task-level instrumentation. Through V2V and V2I experiments, we show that distributed deployment choices fundamentally shape safety, with V2V intent packets outperforming cloud-based perception and RSU-assisted perception sustaining safety until overloaded by concurrent requests. Although designed for AV pipelines, CADET also supports dataset-driven experimentation, enabling systems and ML researchers to benchmark distributed inference workloads independently of full vehicle simulation. CADET is open source, with code and demo available at https://nesl.github.io/cadet-web.

2606.04071 2026-06-04 cs.CR cs.CL cs.LG 版本更新

Covert Influence Between Language Models

语言模型之间的隐蔽影响

Avidan Shah, Jay Chooi, Jinghua Ou, Shi Feng

发表机构 * MATS ； New York University（纽约大学）； Harvard University（哈佛大学）； George Washington University（乔治华盛顿大学）

AI总结本文研究语言模型间通过微调、蒸馏和上下文学习三种接口实现隐蔽影响的风险，并提出使用逐点归因分数选择载体以放大训练时影响，发现自然语言载体相比数字载体更难被人类检测且跨模型迁移性更差。

详情

AI中文摘要

随着语言模型越来越多地消费彼此的输出，隐蔽影响——即发送者的载荷（其被条件化传播的行为倾向）通过人类无法检测的载体转移到接收者的现象——成为一种日益增长的风险。我们通过三种接口（监督微调、在线策略蒸馏和上下文学习）刻画了这一风险，并发现它们在实现不留下人类可见痕迹的影响规模上有所不同。利用推理时逐样本归因分数，我们研究了所有三种接口下的隐蔽影响，并具备选择能够放大训练时影响的载体的能力，解锁了先前工作无法实现的载荷转移。我们进一步提供证据表明，使用自然语言载体的隐蔽影响与先前使用数字载体的研究是不同的现象，因为前者更难以被人类检测且跨模型家族的迁移性更差。这些结果共同表明，隐蔽影响的风险面比先前认识到的更广，我们研究了逐点归因评分方法作为调查和缓解该风险的工具。

英文摘要

As language models increasingly consume one another's outputs, covert influence -- a phenomenon where a sender's payload (the behavioral disposition it is conditioned to propagate) transfers to a receiver through carriers undetectable by humans -- becomes a growing risk. We characterize this risk across three interfaces: supervised fine-tuning, on-policy distillation, and in-context learning, and find that they vary in the scale of influence achievable without leaving behind human-visible traces. Using inference-time per-sample attribution scores, we study covert influence across all three interfaces with the ability to select carriers that amplify training-time influence, unlocking payload transfers that prior work could not achieve. We further provide evidence that covert influence with natural-language carriers is a distinct phenomenon from prior studies using number carriers, as the former is more resistant to human detection and less portable across model families. Together, these results suggest that the risk surface for covert influence is broader than previously recognized, and we study pointwise attribution scoring methods as a tool to investigate and mitigate it.

2606.04069 2026-06-04 cs.CR cs.LG 版本更新

Bayesian Membership Privacy for Graph Neural Networks

图神经网络的贝叶斯成员隐私

Sinan Yıldırım, Megha Khosla

发表机构 * University of California, Berkeley（加州大学伯克利分校）

AI总结针对图神经网络中结构相关性和随机训练图采样导致的成员推断问题，提出贝叶斯成员隐私（BMP）框架，通过贝叶斯假设检验量化节点级成员隐私，并设计采样感知审计机制以评估隐私泄露。

详情

AI中文摘要

现有的图神经网络（GNN）隐私分析很大程度上继承了非图设置中的假设，忽略了结构相关性和随机训练图采样。特别是，节点相关的先验使得仅凭第一类和第二类错误不足以刻画最优的成员推断测试。为了解决这个问题，我们引入了贝叶斯成员隐私（BMP），这是一种采样感知的节点级成员隐私公式，它结合了节点相关的先验，并将图采样概率视为对手知识的一部分。BMP将成员推断视为贝叶斯假设检验，并据此以后验成员概率来量化成员隐私。我们探讨了BMP与文献中现有定义相关的理论性质。我们进一步提出了一种实用的、采样感知的审计机制，用于估计BMP的参数，作为GNN中节点级隐私泄露的度量。我们在基准图数据集上进行了实验，结果表明BMP提供了细粒度的隐私洞察，而这些洞察仅通过全局攻击准确率是无法看到的。

英文摘要

Existing privacy analyses for Graph Neural Networks (GNNs) largely inherit assumptions from non-graph settings, overlooking structural correlations and stochastic training-graph sampling. In particular, node-dependent priors make type-I and type-II errors alone insufficient to characterize the best membership inference test. To address this, we introduce Bayesian Membership Privacy (BMP), a sampling-aware formulation of node-level membership privacy that incorporates node-dependent priors and treats graph sampling probabilities as part of the adversary's knowledge. BMP casts membership inference as a Bayesian hypothesis test and accordingly quantifies membership privacy in terms of posterior membership probability. We explore theoretical properties of BMP in relation to the existing definitions in the literature. We further propose a practical, sampling-aware auditing mechanism to estimate the parameters of BMP as a measure of node-level privacy leakage in GNNs. We conduct experiments on benchmark graph datasets and show that BMP yields fine-grained privacy insights that are not visible through global attack accuracy alone.

2606.04066 2026-06-04 q-bio.NC cs.LG 版本更新

SC-TauPath: A Structural Connectivity Attribution Framework for Mapping Tau Propagation Pathways in Alzheimer's Disease

SC-TauPath：一种用于映射阿尔茨海默病中tau蛋白传播路径的结构连接归因框架

Jing Zhang, Norman Scheel, Minheng Chen, Tong Chen, Yanjun Lyu, David C. Zhu, Rong Zhang, Dajiang Zhu

发表机构 * University of Texas at Arlington（德克萨斯大学阿灵顿分校）； Michigan State University（密歇根州立大学）； University of Texas Southwestern Medical Center（德克萨斯大学西南医学中心）

AI总结提出SC-TauPath框架，结合网络扩散模型增强的多层感知机和梯度×输入归因方法，从体内神经影像数据中映射tau蛋白传播路径，并验证了与Braak分期解剖学的一致性。

详情

AI中文摘要

理解结构连接如何与阿尔茨海默病（AD）中的tau蛋白传播相关联仍然是一个核心未解问题，然而现有的计算模型要么严重依赖生物物理假设，要么缺乏神经生物学可解释的路径图。我们提出了SC-TauPath，一个结构连接（SC）归因框架，用于从体内神经影像数据中映射tau蛋白传播路径。SC-TauPath将网络扩散模型（NDM）增强的多层感知机与梯度×输入归因相结合，以评分每个SC边对tau预测的贡献，然后将这些归因分数转化为多尺度路径图（骨干边、高流量路径和枢纽ROI），这验证了已建立的Braak分期解剖学。应用于234名ADNI参与者，这些参与者具有配对的DTI SC和18F-Flortaucipir PET数据，SC-TauPath实现了强交叉验证的tau预测，并产生了与已建立的Braak分期解剖学一致的基于归因的路径图，表明SC编码了AD中区域tau分布的特定空间信息。

英文摘要

Understanding how structural connections are associated with tau propagation in Alzheimer's disease (AD) remains a central open question, yet existing computational models either rely heavily on biophysical assumptions or lack neurobiologically interpretable pathway maps. We present SC-TauPath, a structural connectivity (SC) attribution framework that maps tau propagation pathways from in vivo neuroimaging data. SC-TauPath combines a Network Diffusion Model (NDM)-augmented multilayer perceptron with gradient $\times$ input attribution to score each SC edge's contribution to tau prediction, then translates these attribution scores into multi-scale pathway maps (backbone edges, high-traffic routes, and hub ROIs), which validates established Braak staging anatomy. Applied to 234 ADNI participants with paired DTI SC and 18F-Flortaucipir PET, SC-TauPath achieves strong cross-validated tau prediction and yields attribution-based pathway maps consistent with established Braak staging anatomy, demonstrating that SC encodes spatially specific information about regional tau distribution in AD.
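
Gradient × input attribution, the scoring rule named in the abstract, can be shown on a toy network. The sketch below is only an illustration under stated assumptions: it uses a one-hidden-layer tanh network with hypothetical weights `W` and `v` rather than the paper's NDM-augmented MLP, treats the input vector as a stand-in for SC edge weights, and computes the gradient by hand instead of with an autodiff library.

```python
import math


def mlp(x, W, v):
    """Toy network: f(x) = sum_j v[j] * tanh(W[j] . x)."""
    return sum(
        vj * math.tanh(sum(w * xi for w, xi in zip(row, x)))
        for row, vj in zip(W, v)
    )


def gradient_times_input(x, W, v):
    """Attribution of each input: (df/dx_i) * x_i, with an analytic gradient."""
    grads = [0.0] * len(x)
    for row, vj in zip(W, v):
        z = sum(w * xi for w, xi in zip(row, x))
        dz = vj * (1.0 - math.tanh(z) ** 2)   # derivative of v_j * tanh(z) w.r.t. z
        for i, w in enumerate(row):
            grads[i] += dz * w                # chain rule through z = W[j] . x
    return [g * xi for g, xi in zip(grads, x)]


if __name__ == "__main__":
    W = [[0.5, -1.0, 0.2], [0.3, 0.8, -0.6]]  # hypothetical hidden-layer weights
    v = [1.0, -2.0]                           # hypothetical output weights
    edges = [0.4, -0.3, 0.9]                  # stand-in for three SC edge weights
    print(gradient_times_input(edges, W, v))
```

An input that is exactly zero always receives zero attribution under this rule, which is one reason it suits sparse connectivity matrices: absent edges cannot be credited.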

2606.04065 2026-06-04 stat.ML cs.LG math.ST stat.TH 版本更新

Finite-Iteration Local Dynamics and Warm Starts for Alternating Power Iteration in Spiked Tensor PCA

尖峰张量PCA中交替幂迭代的有限迭代局部动力学与热启动

Yanjin Xiang, Zhihua Zhang

发表机构 * Peking University（北京大学）

AI总结研究固定阶非对称秩一张量模型中同步交替幂迭代的有限迭代局部理论，提出与初始化无关的误差分解和热启动机制。

Comments 67 pages, 0 figures. The paper studies local dynamics and warm-start analysis for alternating power iteration in spiked tensor PCA

详情

AI中文摘要

我们研究了固定阶非对称秩一张量模型中的同步交替幂迭代。主要贡献是一个与任何特定初始化无关的有限迭代局部理论。一旦迭代进入种植秩一方向的足够小邻域，其误差分解为几何衰减的瞬态部分和由种植点处固定正交噪声收缩引起的内在噪声基底。确定性有限样本条件被明确陈述，但在粗粒度的固定阶多线性噪声事件下，它们简化为固定或缓慢扩展局部半径的保守高信号区域。然后，我们将热启动机制与任何特定谱构造分离。一个通用的单扫描原理表明，如果符号兼容的初始器具有相关性γ_N，第一扫描噪声水平a_N，且a_N/(γ_N^{d-1}ω_{N,d})→0，则可以选择一个扩展半径r_N=o(ω_{N,d})，使得第一扫描进入局部盆地。进入后，局部仿射收缩导致收敛到该盆地中唯一的信息性局部不动点。对于中心Gram初始化，我们通过信号保持的仅噪声留一比较和平均留一片收缩估计（称为压回估计），在独立同分布有限四阶矩噪声下验证了所需的相关性和同一样本第一扫描噪声界。留一比较保持尖峰固定并对删除坐标取平均，因此种植坐标通过ℓ₂加权和而非最坏情况非相干界进入。

英文摘要

We study simultaneous alternating power iteration for fixed-order asymmetric rank-one spiked tensor models. Our main contribution is a finite-iteration local theory that is independent of any particular initialization. Once the iterates enter a sufficiently small neighborhood of the planted rank-one direction, their error decomposes into a geometrically decaying transient and an intrinsic noise floor caused by fixed orthogonal noise contractions at the planted point. The deterministic finite-sample conditions are stated explicitly, but under a coarse fixed-order multilinear noise event they reduce to a conservative high-signal regime for fixed or slowly expanding local radii. We then separate the warm-start mechanism from any specific spectral construction. A generic one-sweep principle shows that, if a sign-compatible initializer has correlation $γ_N$, first-sweep noise level $a_N$, and $a_N/(γ_N^{d-1}ω_{N,d})\to0$, then one can choose an expanding radius $r_N=o(ω_{N,d})$ for which the first sweep enters the local basin. After entry, the local affine contraction yields convergence to the unique informative local fixed point in that basin. For centered-Gram initialization, we verify the required correlation and same-sample first-sweep noise bound under i.i.d. finite-fourth-moment noise by a signal-preserving noise-only leave-one comparison and an averaged leave-one slice-contraction estimate, which we call a pressed-back estimate. The leave-one comparison keeps the spike fixed and averages over the deleted coordinate, so planted coordinates enter through $\ell_2$-weighted sums rather than worst-case incoherence bounds.
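
The iteration studied in the abstract can be sketched for an order-3 tensor. The code below is a minimal illustration, not the paper's setup: it assumes a small dimension, Gaussian noise, unit-norm planted vectors, and a hypothetical warm start obtained by shifting each planted vector by a constant. Each sweep contracts the tensor against the previous iterates of the other two modes and renormalizes, with all three factors updated simultaneously from the same previous iterates.

```python
import math
import random


def normalize(a):
    norm = math.sqrt(sum(t * t for t in a))
    return [t / norm for t in a]


def corr(a, b):
    """Inner product; equals the cosine when both vectors have unit norm."""
    return sum(s * t for s, t in zip(a, b))


def spiked_tensor(lam, u, v, w, sigma, rng):
    """T = lam * u (x) v (x) w + sigma * Gaussian noise, as nested lists."""
    return [[[lam * ui * vj * wk + sigma * rng.gauss(0.0, 1.0) for wk in w]
             for vj in v] for ui in u]


def sweep(T, u, v, w):
    """One simultaneous sweep: every factor is updated from the previous iterates."""
    n1, n2, n3 = len(u), len(v), len(w)
    new_u = [sum(T[i][j][k] * v[j] * w[k] for j in range(n2) for k in range(n3))
             for i in range(n1)]
    new_v = [sum(T[i][j][k] * u[i] * w[k] for i in range(n1) for k in range(n3))
             for j in range(n2)]
    new_w = [sum(T[i][j][k] * u[i] * v[j] for i in range(n1) for j in range(n2))
             for k in range(n3)]
    return normalize(new_u), normalize(new_v), normalize(new_w)


def run(T, u0, v0, w0, iters):
    u, v, w = u0, v0, w0
    for _ in range(iters):
        u, v, w = sweep(T, u, v, w)
    return u, v, w


if __name__ == "__main__":
    rng = random.Random(0)
    u = normalize([1.0, 2.0, 3.0, 4.0, 5.0])
    v = normalize([5.0, 4.0, 3.0, 2.0, 1.0])
    w = normalize([1.0, -1.0, 1.0, -1.0, 1.0])
    T = spiked_tensor(5.0, u, v, w, 0.01, rng)
    # Hypothetical warm start: planted vector plus a constant offset, renormalized.
    init = [normalize([t + 0.3 for t in p]) for p in (u, v, w)]
    est = run(T, *init, iters=5)
    print([corr(e, p) for e, p in zip(est, (u, v, w))])
```

In the noiseless case a single sweep from any start with positive correlations already returns the planted directions, since each contraction is a positive multiple of the planted factor. With noise, the iterates settle at a small distance from the planted point, which corresponds to the noise floor discussed in the abstract.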

2606.04063 2026-06-04 cs.LG cs.AI 版本更新

LLM Compression with Jointly Optimizing Architectural and Quantization choices

联合优化架构与量化选择的大语言模型压缩

Hoang-Loc La, Truong-Thanh Le, Amir Taherkordi, Phuong Hoai Ha

发表机构 * UiT The Arctic University of Norway（挪威北极大学）； University of Oslo, Norway（挪威奥斯陆大学）

AI总结提出一种可微神经架构搜索框架，联合优化大语言模型的架构配置与混合精度量化，实现更优的精度-延迟权衡。

详情

AI中文摘要

部署大型语言模型（LLM）因其巨大的内存和计算需求而具有挑战性。虽然一些方法通过从头开发小型或微型语言模型来解决这一问题，但这些方法需要大量的GPU训练。压缩预训练的LLM用于边缘设备提供了一种有吸引力的替代方案。除了剪枝和量化，神经架构搜索（NAS）能够实现有效的压缩，然而先前的NAS方法通常限制搜索空间并将架构与量化解耦。我们引入了一种可微NAS框架，该框架探索整个空间，并联合优化LLM线性层的架构配置与混合精度量化。实验表明，我们的模型在精度-延迟权衡上具有优越性：在可比精度下，我们的模型推理速度比顺序的NAS后量化基线快1.4倍，或在等效延迟下，在七个推理任务上平均精度提高高达6%。

英文摘要

Deploying large language models (LLMs) is challenging due to their significant memory and computational requirements. While some methods address this by developing small or tiny language models from scratch, these approaches demand extensive GPU training. Compressing pre-trained LLMs for edge devices offers a compelling alternative. Beyond pruning and quantization, Neural Architecture Search (NAS) enables effective compression, yet prior NAS approaches often limit the search space and decouple architecture from quantization. We introduce a differentiable NAS framework that explores the entire space and jointly optimizes architectural configurations alongside mixed-precision quantization for linear layers of LLMs. Experiments demonstrate superior accuracy-latency trade-offs: our models achieve up to 1.4x faster inference than sequential NAS-then-quantization baselines at comparable accuracy, or up to 6% higher average accuracy across seven reasoning tasks at equivalent latency.

2606.04057 2026-06-04 cs.SE cs.AI cs.LG 版本更新

The Invisible Lottery: How Subtle Cues Steer Algorithm Choice in LLM Code Generation

隐形彩票：微妙线索如何引导LLM代码生成中的算法选择

Akanksha Narula, Mofasshara Binte Rafique, Laurent Bindschaedler

发表机构 * University of Washington（华盛顿大学）； Google Research（谷歌研究院）

AI总结通过大量控制实验，发现提示中的偶然线索（如上下文词或元数据）会系统性地改变LLM在代码生成中选择的算法族分布，影响性能、安全性和可维护性，而直接命名算法是最可靠的缓解措施。

详情

AI中文摘要

大型语言模型（LLM）现在生成大量生产代码，通常用于具有多个有效算法解决方案的任务。偶然的提示线索，即任务规范之外的上下文词或元数据，可以引导模型选择哪个算法，即使所有输出都通过相同的测试。提示敏感性作为提高输出质量的工具已被广泛研究。这里，输出策略意味着在固定正确性下的算法选择。我们将算法引导定义为线索引起的算法族分布变化，并在11个任务、19种线索类型（18个通道加上一个记忆化语义与表面消融，在改变排版和标点的同时保留含义）以及15个模型配置上进行了46,535次控制实验。我们发现算法族分布存在大规模、系统性的变化（高达100个百分点），与线索语义基本一致，包括在速率限制等应用任务中。直接命名算法是我们测试的最可靠的缓解措施。因此，偶然的上下文在性能、安全性和可维护性上创造了一个“隐形彩票”。

英文摘要

Large language models (LLMs) now generate substantial production code, often for tasks with multiple valid algorithmic solutions. Incidental prompt cues, meaning contextual words or metadata outside the task specification, can steer which algorithm the model selects, even when all outputs pass the same tests. Prompt sensitivity is well studied as a tool to improve output quality. Here, output policy means algorithm choice under fixed correctness. We define algorithm steering as cue-induced shifts in algorithm-family distributions and run 46,535 controlled experiments across 11 tasks, 19 cue types (18 channels plus a memoization semantic-vs-surface ablation that preserves meaning while changing typography and punctuation), and 15 model configurations. We find large, systematic shifts in algorithm-family distributions (up to 100 pp), largely consistent with cue semantics, including in applied tasks such as rate limiting. Direct algorithm naming is the most reliable mitigation we tested. Accidental context therefore creates an "invisible lottery" over performance, security, and maintainability.

2606.04053 2026-06-04 cs.LG cs.AI version update

A Goal-Set Characterization of Task Composition in the Boolean Task Algebra

Eduardo Terrés-Caballero, Herke van Hoof

Affiliations: Informatics Institute, University of Amsterdam; AMLab, University of Amsterdam

AI summary: Simplifies task composition in the Boolean Task Algebra with a goal-set approach, proving that in deterministic MDPs the optimal extended Q-value functions are determined by the universal and empty tasks, which reduces learning cost.

Abstract

The Boolean Task Algebra (BTA) provides a principled framework for zero-shot task composition in reinforcement learning by equipping goal-reaching tasks with Boolean operations. We revisit its structural assumptions and formalize a collapse in the space of optimal extended Q-value functions: in deterministic MDPs, every such function is fully determined by the universal and empty tasks. This makes the logarithmic set of base tasks proposed in the original BTA formulation redundant. Building on this observation, we introduce a goal-set-based composition method that performs logical operations on goal sets and reconstructs composed value functions by selecting slices from the universal and empty value functions. This reduces learning costs for standard BTA and reduces composition time for both BTA and Skill Machines, while preserving policy performance. Experiments across tabular, visual, function-approximation, and continuous-control domains show that learning additional base tasks does not yield better performance. Finally, we study the stochastic setting and provide a counterexample showing that this collapse need not hold, that is, optimal composition may require accounting for exponentially many policies in the number of goals. Code is available at https://github.com/EduardoTerres/bta_paper.
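
The goal-set composition described in this abstract can be illustrated with a minimal sketch. It assumes tabular extended Q-value arrays indexed as `Q[state, goal, action]`; the function names `compose_q` and `greedy_action` are hypothetical and are not taken from the authors' repository. Boolean operations are applied to plain Python sets of goal indices, and the composed value function takes each goal slice from the universal array when the goal is in the set, and from the empty array otherwise.

```python
import numpy as np

def compose_q(q_universal, q_empty, goal_set):
    """Build a composed extended Q array by selecting goal slices.

    Assumed layout (not from the paper): Q[state, goal, action].
    Goals in goal_set take their slice from the universal task,
    all other goals take theirs from the empty task.
    """
    n_goals = q_universal.shape[1]
    mask = np.zeros(n_goals, dtype=bool)
    for g in goal_set:
        mask[g] = True
    return np.where(mask[None, :, None], q_universal, q_empty)

def greedy_action(q_ext, state):
    """Pick the action whose best goal-conditioned value is largest."""
    return int(np.argmax(q_ext[state].max(axis=0)))

# Boolean operations become set operations on goal indices.
all_goals = set(range(4))
task_a = {0, 1}
task_b = {1, 2}
a_and_not_b = task_a & (all_goals - task_b)  # goals in A but not in B

# Toy value arrays: 2 states, 4 goals, 3 actions (illustrative numbers only).
q_universal = np.ones((2, 4, 3))
q_empty = np.zeros((2, 4, 3))
q_composed = compose_q(q_universal, q_empty, a_and_not_b)
```

In this sketch composition costs one masked selection per query, and only the two base value arrays need to be learned, which is the kind of saving the abstract points to.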

2606.04051 2026-06-04 cs.LG cs.AI cs.CR version update

RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui, Qi Zhu, Fei Mi, Hongning Wang, Minlie Huang

Affiliations: The Conversational AI (CoAI) group, DCST, Tsinghua University; Huawei Noah’s Ark Lab

AI summary: Proposes RUBAS, a framework that decomposes agent behavior into rubrics along four dimensions to provide fine-grained rewards, using reinforcement learning to improve tool-use safety while preserving task completion.

Abstract

The evolution of LLMs into tool-enabled agents creates a new class of safety challenges associated with real-world execution rather than simple text generation. Existing alignment methods often rely on coarse refusal signals or static supervision, making it difficult to balance safety with useful tool execution across diverse agentic risks. We introduce RUBAS, a rubric-based reinforcement learning framework for agent safety. RUBAS decomposes agent behavior into four dimensions: tool-use safety, argument safety, response safety, and helpfulness. These structured rubrics provide fine-grained and interpretable rewards over complete agent trajectories, enabling reinforcement learning to optimize safe tool use while preserving task completion. Extensive experiments across multiple agent safety benchmarks and models show that RUBAS improves safety over standard alignment baselines, reduces tool-grounded hallucinations, and maintains competitive utility. Our results suggest that multi-dimensional rubric rewards provide an effective training signal for aligning LLM agents in safety-critical tool-use settings.

2606.04050 2026-06-04 cs.LG cs.AI version update

LiftQuant: Continuous Bit-Width LLM via Dimensional Lifting and Projection

Liulu He, XuanAng Liu, Juntao Liu, Taolue Feng, Ting Lu, Chunsheng Gan, Zhiyv Peng, Yuan Du, Huanrui Yang, Yijiang Liu, Li Du

Affiliations: Nanyang Technological University

AI summary: Proposes LiftQuant, a framework whose lift-then-project mechanism gives quasi-continuous bit-width control for fitting exact memory budgets; a 70B model compressed to 2.4 bits outperforms existing 2-bit models.

Comments ICML 2026 Spotlight

Abstract

Existing quantization methods are fundamentally limited by rigid, integer-based bit-widths (e.g., 2-bit or 3-bit), resulting in a "deployment gap" where Large Language Models cannot be optimally fitted to specific memory budgets. To bridge this gap, we introduce LiftQuant, a novel framework that enables continuous bit-width control for true Pareto-optimal deployment. The core innovation is a "lift-then-project" mechanism, which approximates low-dimensional weight vectors by projecting a simple 1-bit lattice from a higher-dimensional "lifted" space. Crucially, the effective bit-width is determined simply by the ratio of the lifted dimension to the original dimension, which allows the bit-width to be tuned quasi-continuously because the dimension is a flexible structural parameter. This projection generates a structured yet non-uniform codebook, capturing the expressive power of Vector Quantization (VQ). While offering benefits over VQ, LiftQuant's decoding path relies solely on linear transformations and 1-bit uniform quantizers, retaining a hardware-friendly nature. This flexibility is transformative: LiftQuant enables a 70B LLM to be compressed to 2.4 bits to precisely fit a 24GB GPU, where its performance significantly surpasses state-of-the-art 2-bit models fitted on the same device. Our code and checkpoints are available at https://github.com/Heliulu/LiftQuant.
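
As a back-of-the-envelope check on the 24GB figure, the sketch below computes the effective bit-width as the ratio of lifted to original dimension and the resulting weight memory. The dimension pair (12, 5) is a hypothetical illustration, not a configuration reported by the authors, and the estimate counts quantized weights only, ignoring activations, the KV cache, and any layers kept at higher precision.

```python
def effective_bits(lifted_dim: int, original_dim: int) -> float:
    """Bits per original weight when each lifted coordinate stores 1 bit."""
    return lifted_dim / original_dim

def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Weight storage in gigabytes (1 GB = 1e9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9

# Hypothetical dimensions: 5 original weights lifted to 12 one-bit coordinates.
bits = effective_bits(12, 5)
memory_70b = weight_memory_gb(70e9, bits)       # roughly 21 GB at 2.4 bits
memory_70b_2bit = weight_memory_gb(70e9, 2.0)   # roughly 17.5 GB at 2 bits
```

Under these assumptions a 70B model at 2.4 bits needs about 21 GB for weights, which leaves a few gigabytes of headroom on a 24GB device, while a 2-bit model leaves part of the same budget unused.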

2606.04048 2026-06-04 cs.LG cs.AI version update

Unlocking Feature Learning in Gated Delta Networks at Scale

Yifeng Liu, Quanquan Gu

Affiliations: University of California Los Angeles

AI summary: Derives scaling rules for Gated Delta Networks that enable zero-shot transfer of hyperparameters (especially the learning rate) across model widths, validating Maximal Update Parametrization for structured state space models.

2606.04046 2026-06-04 cs.CV cs.AI cs.CL cs.LG cs.RO version update

Dive into the Scene: Breaking the Perceptual Bottleneck in Vision-Language Decision Making via Focus Plan Generation

Boyuan Xiao, Bohong Chen, Yumeng Li, Ji Feng, Yao-Xiang Ding, Kun Zhou

Affiliations: University of Science and Technology of China; Tsinghua University

AI summary: Proposes SceneDiver, which uses coarse-to-fine focus plan generation to progressively build a scene graph and decompose the task, reducing visual hallucinations and improving vision-language and vision-language-action models on embodied decision-making tasks.

Comments Accepted at ICML 2026

Abstract

In embodied vision-language decision-making tasks such as robotic manipulation and navigation, Vision-Language and Vision-Language-Action Models (VLMs & VLAs) are powerful tools with different strengths: VLMs are better at long-term planning, while VLAs are better at reactive control. However, their performance is limited by the same perceptual bottleneck: visual hallucinations arise because the models cannot distinguish task-relevant objects from distractors. In principle, accurately identifying and focusing on critical objects while filtering out irrelevant ones is the key to breaking this limitation. A straightforward solution is one-step focus: directly attending to essential objects. However, this approach proves ineffective because effective focus inherently requires deep scene understanding. To this end, we propose SceneDiver, a coarse-to-fine focus plan generation method that leverages the long-term planning abilities of VLMs: it first constructs a holistic scene graph to establish initial comprehension, then progressively decomposes the task into simpler sub-problems through an iterative cycle of recognition, understanding, and analysis. To enable reactive control, we also design a lightweight adapter for distilling the deliberate focus ability into VLAs. Evaluations on standard embodied AI benchmarks confirm that our method substantially reduces visual hallucinations for both VLMs and VLAs, while preserving computational efficiency in tasks requiring fast execution. Our code and data are released at: https://future-item.github.io/SceneDiver.

2606.04045 2026-06-04 cs.LG cs.AI version update

Bayes-Sufficient Representations in Supervised Learning

Vasileios Sevetlidis

Affiliations: Athena Research Center, Kimmeria Campus, Xanthi, Greece; Democritus University of Thrace, Vas. Sofias Campus, Xanthi, Greece; International Hellenic University, Serres, Greece

AI summary: Defines Bayes-sufficiency of a representation with respect to a loss function in supervised learning, introduces the Bayes quotient, shows that minimal sufficient representations are equivalent to the Bayes quotient, and uses experiments to distinguish sufficiency, minimality, and retained non-required information.

Abstract

Representation learning is often described as preserving the information in an input that is relevant for prediction. This work asks what relevance means for a fixed supervised decision problem. A representation is defined to be Bayes-sufficient for a joint distribution and loss if some prediction head can use it to implement a Bayes-optimal action rule. This makes the target information loss-dependent. In the almost-surely unique Bayes-action case, the relevant object is a Bayes quotient, which identifies inputs that require the same Bayes-optimal action. A representation is sufficient when it refines this quotient, and Bayes-minimal when it is informationally equivalent to it. The framework connects naturally to property elicitation: zero-one loss requires the Bayes class, squared loss the conditional mean, Brier loss the conditional probability in binary prediction, and log loss or strictly proper scoring rules the predictive distribution. Controlled finite experiments, learned neural bottleneck experiments, and a real-data iNaturalist taxonomic refinement experiment illustrate the distinction between sufficiency, minimality, and retained non-required information. For a fixed supervised problem, the distribution and the loss determine the Bayes action, the Bayes action determines the quotient, and the quotient determines the minimal information required for Bayes-optimal prediction.
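
The loss dependence of the Bayes quotient can be seen in a small finite example. The sketch below is an illustration under stated assumptions, not the author's code: the three inputs and their posteriors are made up, and the helper names are hypothetical. It computes the Bayes action under zero-one loss (the most probable label) and under squared loss (the conditional mean), then groups inputs that share a Bayes action.

```python
def bayes_action_zero_one(posterior):
    """Bayes action under zero-one loss: the most probable label."""
    return max(posterior, key=posterior.get)

def bayes_action_squared(posterior):
    """Bayes action under squared loss: the conditional mean."""
    return round(sum(y * p for y, p in posterior.items()), 12)

def bayes_quotient(posteriors, action_fn):
    """Group inputs that require the same Bayes action."""
    classes = {}
    for x, posterior in posteriors.items():
        classes.setdefault(action_fn(posterior), []).append(x)
    return sorted(classes.values())

# Made-up posteriors P(y | x) over binary labels for three inputs.
posteriors = {
    "x1": {0: 0.4, 1: 0.6},
    "x2": {0: 0.1, 1: 0.9},
    "x3": {0: 0.8, 1: 0.2},
}

quotient_zero_one = bayes_quotient(posteriors, bayes_action_zero_one)
quotient_squared = bayes_quotient(posteriors, bayes_action_squared)
```

Here `x1` and `x2` fall in the same class under zero-one loss but in different classes under squared loss, so a representation that merges them would be sufficient for the first problem and not for the second.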

2606.04039 2026-06-04 cs.NE cs.AI cs.LG version update

Beyond Static Priors: Dynamic Neural Guidance for Large-Scale Ant Colony Optimization

Dat Thanh Tran, Van Khu Vu, Yining Ma

Affiliations: Center for AI Research; VinUniversity; College of Engineering and Computer Science; Laboratory for Information and Decision Systems; Massachusetts Institute of Technology

AI summary: Proposes DyNACO, a framework that provides dynamic neural guidance by periodically observing the pheromone distribution and the current solution, combined with a perturbation-based ACO backend and a scope-restricted refinement mechanism; it scales to 100,000 nodes on TSP and outperforms neural baselines, and on CVRP it consistently improves over unguided baselines with under 1% neural overhead.

Comments Accepted at KDD 2026

DOI: 10.1145/3770855.3817893

Pseudospectral Bounds on Transient Amplification in Coupled Gradient Descent

Ahanaf Hasan Ariq

Affiliations: Ideal School and College

AI summary: For the transient amplification caused by the non-normality of block-triangular Jacobians in coupled gradient descent, develops a sharp pseudospectral theory, giving an upper bound on the Kreiss constant with matching minimax lower bounds, and derives a finite-horizon iteration-complexity bound for stochastic coupled descent.

Comments 11 pages, 3 tables. Accepted as poster at HiLD 2026 (4th Workshop on High-dimensional Learning Dynamics, ICML 2026)

Abstract

Coupled gradient descent, where the update of one parameter block depends on another, underlies bilevel optimization, two-time-scale stochastic approximation, and adversarial training. When the coupled Jacobian is block-triangular, asymptotic stability is governed by the spectral radii of the diagonal blocks, yet transient amplification before convergence can be arbitrarily large due to non-normality. We develop a sharp pseudospectral theory for such block-triangular Jacobians, proving that the Kreiss constant satisfies $K(J) \leq 2/(1-\gamma) + \|C\|/(4(1-\gamma))$ when the diagonal blocks are symmetric with spectral radii at most $\gamma < 1$, and we establish matching minimax lower bounds. We characterize the critical coupling threshold for spectral instability and extend the analysis to nearly self-referential systems via a Neumann-series perturbation framework. As a consequence, we obtain a finite-horizon iteration-complexity bound of $O(K(J)^2 \log(1/\delta))$ for stochastic coupled descent. Framed as scaling laws for non-stationary two-time-scale optimization, our results expose a non-asymptotic, instance-dependent regime of high-dimensional learning dynamics that is invisible to spectral-radius analysis. Experiments on linear-quadratic problems, IQC-based comparisons, and neural-network training confirm the theory.
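
A small numerical sketch makes the gap between spectral radius and transient behaviour concrete. It assumes the simplest block-triangular case, a 2x2 matrix with scalar diagonal blocks equal to gamma and coupling c; the values gamma = 0.9 and c = 5 are illustrative choices, not taken from the paper. The code tracks the spectral norm of the powers of J and evaluates the Kreiss-constant bound exactly as stated in the abstract.

```python
import numpy as np

def transient_peak(J, max_steps=200):
    """Return (step, norm) at which the spectral norm of J^k is largest."""
    power = np.eye(J.shape[0])
    norms = []
    for _ in range(max_steps):
        power = power @ J
        norms.append(np.linalg.norm(power, 2))
    k = int(np.argmax(norms))
    return k + 1, float(norms[k])

def kreiss_upper_bound(gamma, coupling_norm):
    """Bound quoted in the abstract: 2/(1-gamma) + ||C|| / (4(1-gamma))."""
    return 2.0 / (1.0 - gamma) + coupling_norm / (4.0 * (1.0 - gamma))

gamma, c = 0.9, 5.0                      # illustrative values only
J = np.array([[gamma, c],
              [0.0, gamma]])             # upper block-triangular, scalar blocks

spectral_radius = float(max(abs(np.linalg.eigvals(J))))
peak_step, peak_norm = transient_peak(J)
bound = kreiss_upper_bound(gamma, abs(c))
```

With these values the spectral radius is 0.9, so the iterates contract asymptotically, yet the norm of the powers of J first grows to roughly 19 around step 9 before decaying. That growth is the transient amplification that a spectral-radius analysis alone does not reveal.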

2606.04028 2026-06-04 cs.LG version update

SPLIT-PINN: A Separable Probability Learning Technique via Physics-Informed Neural Networks for High-Dimensional Probabilistic Modeling

Pouria Behnoudfar, Deekshith Naidu Ponnana, Noah J. Schmelzer, Janith Wanni, George T. Gray, Dan J. Thoma, Curt A. Bronkhorst, Nan Chen, Wenxiao Pan

Affiliations: Department of Mechanical Engineering, University of Wisconsin-Madison; Department of Mathematics, University of Wisconsin-Madison; Department of Civil Engineering, Johns Hopkins University; Materials Physics and Applications Division, Los Alamos National Laboratory; Department of Materials Science and Engineering, University of Wisconsin-Madison

AI summary: Proposes SPLIT-PINN, a separable probability learning technique based on physics-informed neural networks that decomposes the drift field into marginal-correction terms with orthogonality constraints in order to infer, from data, the evolution of high-dimensional, transport-dominated joint probability density functions, yielding accurate probabilistic predictions of how polycrystalline microstructural states evolve.

Abstract

We present a probabilistic modeling framework for incorporating small-scale spatial heterogeneity into macroscopic descriptions of material behavior for polycrystalline metallic materials. Spatially heterogeneous material state fields are represented using probability density functions (PDFs), providing a principled statistical description of microstructural variability and state evolution across different computational polycrystalline realizations. The framework is built on the inverse identification of a probabilistic transport model, formulated as a Liouville equation with an unknown drift term. To enable accurate, stable, and interpretable inference of this drift field in high-dimensional, transport-dominated settings, we develop a Separable Probability Learning Technique via Physics-Informed Neural Networks (SPLIT-PINN). This method incorporates a marginal-correction drift decomposition, orthogonality constraints, and residual-based adaptive training to enhance well-posedness, numerical stability, and physical consistency without imposing restrictive parametric assumptions. Using SPLIT-PINN, the drift field governing the temporal evolution of joint state PDFs is inferred directly from data. After benchmark validation, the framework is applied to physical computational datasets describing the evolution of polycrystalline microstructural states, including von Mises stress, dislocation density, and equivalent plastic strain rate. The learned Liouville model, trained on a single dataset, is subsequently used in forward predictions of the temporal evolution of joint and marginal PDFs for multiple unseen polycrystal realizations. Quantitative comparisons with reference PDFs demonstrate that the proposed framework yields accurate and robust probabilistic predictions and generalizes effectively across datasets.

2606.03995 2026-06-04 cs.LG cs.AI q-bio.QM version update

Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset

Afshan Hashmi

Affiliations: TRDC, Tuwaiq Academy

AI summary: Trains an XGBoost classifier on eight clinical features from ADNI (MMSE, CDR Global, CDR-SB, MoCA, FAQ, age, sex, education) for three-class detection (normal cognition, mild cognitive impairment, Alzheimer's disease), with SMOTE for class imbalance, Optuna for hyperparameter optimisation, and SHAP for explainability; it reaches macro AUC 0.982 and accuracy 0.943 on the test set and reveals clinically plausible feature-importance patterns.

Abstract

Background: Alzheimer's disease (AD) affects over 55 million people worldwide. Accurate, interpretable detection of normal cognition (NC), mild cognitive impairment (MCI), and AD from routine clinical assessments remains a critical unmet need. Methods: An XGBoost classifier was developed for three-class detection using eight clinical features from the Alzheimer's Disease Neuroimaging Initiative (ADNI): MMSE, CDR Global, CDR Sum of Boxes (CDR-SB), MoCA, FAQ, age, sex, and education. Hyperparameters were optimised using Optuna (50 trials); class imbalance was addressed with SMOTE. Performance was evaluated by macro AUC-ROC with 1,000-iteration bootstrap 95% confidence intervals, macro F1, balanced accuracy, and Cohen's kappa. SHAP values provided feature-level explainability. Results: The dataset comprised 1,641 baseline subjects (608 NC, 767 MCI, 266 AD). On five-fold cross-validation, mean macro AUC was 0.983 (SD 0.007), accuracy 0.944 (SD 0.006), and macro F1 0.929 (SD 0.008). On the held-out test set (n = 247), macro AUC was 0.982 (95% CI: 0.965--0.995), accuracy 0.943, balanced accuracy 0.932, macro F1 0.927, and Cohen's kappa 0.909. SHAP analysis identified CDR Global as the dominant predictor for NC and MCI, while CDR-SB and MMSE together drove AD classification. Conclusion: An explainable machine learning model trained on routine clinical assessments achieves near-perfect three-class Alzheimer's detection. SHAP analysis reveals clinically plausible, class-specific feature importance patterns supporting clinical validity. Future work will extend this framework with speech biomarkers for multimodal detection.

2605.04356 2026-06-04 cs.LG cs.AI version update

Denoise First, Then Orthogonalize: Understanding Momentum in Muon via Spectral Filtering

Xianliang Li, Zihan Zhang, Weiyang Liu, Han Bao

Affiliations: The Institute of Statistical Mathematics; The Graduate Institute for Advanced Studies, SOKENDAI; National Institute of Informatics; The Chinese University of Hong Kong; Tohoku University; RIKEN AIP

AI summary: Uses spectral filtering theory to show that momentum in the Muon optimizer suppresses gradient perturbations and enlarges the spectral gap, which stabilizes the orthogonalization step, and proves that applying momentum before orthogonalization beats both the reverse order and removing momentum.

Abstract

Muon has recently demonstrated strong empirical performance in large language model training, but the theoretical role of momentum in Muon remains unclear. Existing analyses of Muon either remove momentum to study spectral updates in isolation, or retain momentum without explaining why it improves empirical performance. Our work bridges this gap by showing momentum in Muon acts as a spectral filter. Under a structured signal-plus-perturbation gradient model, we prove that momentum suppresses perturbations while preserving the dominant signal, thereby enlarging the spectral gap between them. This enlarged gap stabilizes the singular subspaces of the matrix passed to Muon's orthogonalization step, making the resulting update more reliable. We further show that applying momentum before orthogonalization achieves provably stronger alignment with the signal component of the gradient than either reversing this order or simply removing momentum. Experiments across diverse tasks, including LLM pretraining, support our theoretical analysis. More broadly, our theory offers a starting point for understanding the benefits of momentum in other matrix-based optimizers.

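2606.03892 2026-06-04 cs.CL cs.AI cs.LG version update

Synthesize and Reward -- Reinforcement Learning for Multi-Step Tool Use in Live Environments

Ibrahim Abdelaziz, Asim Munawar, Kinjal Basu, Maxwell Crouse, Chulaka Gunasekara, Suneet Katrekar, Pavan Kapanipathi

Affiliations: IBM Research

AI summary: Proposes PROVE, a framework that tackles environment construction, query generation, and reward design for multi-step tool calling with 20 stateful MCP servers, an automated data synthesis pipeline, and a multi-component programmatic reward, improving BFCL Multi-Turn, tau2-bench, and T-Eval by up to +10.2, +6.8, and +6.5 points respectively.

AI Chinese abstract (translated)

Training LLMs to orchestrate multi-step tool calls is held back by three coupled obstacles: realistic stateful execution environments are costly to build, synthetic training queries are often detached from the server's actual state (so the generated tool calls fail to execute), and recall-based RL rewards encourage verbose tool-calling patterns. We present PROVE (Programmatic Rewards On Verified Environments), a framework with three contributions: (1) a library of 20 stateful MCP (Model Context Protocol) servers exposing 343 tools, supporting live-execution RL training with session-scoped state isolation; (2) an automated data synthesis pipeline that generates verified multi-turn tool-call trajectories against these servers through dependency-graph-guided dialogue simulation grounded in live-sampled server state, so that every generated query references entities that actually exist; and (3) a multi-component programmatic reward (a progressive validity score, dependency-aware coverage, an adaptive efficiency penalty with a complexity-scaled call budget, a tool-name signal, and an argument-value matching reward) that needs no external judge model. We train four models (Qwen3-4B, Qwen3-8B, Qwen2.5-7B, Granite-4.1-8B) with GRPO using the same reward hyperparameters and about 13K training examples; only the learning rate is tuned, per model family, from a three-point sweep. On BFCL Multi-Turn, tau2-bench, and T-Eval, PROVE yields improvements of up to +10.2, +6.8, and +6.5 points respectively, showing that a compact programmatic reward yields consistent gains on multi-step tool orchestration across two model families.

Abstract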

Training LLMs to orchestrate multi-step tool calls is held back by three coupled obstacles: realistic stateful execution environments are costly to build, synthetic training queries are often detached from the server's actual state (so the generated tool calls fail to execute), and recall-based RL rewards incentivize verbose tool-calling patterns. We present PROVE (Programmatic Rewards On Verified Environments), a framework with three contributions: (1) a library of 20 stateful MCP (Model Context Protocol) servers exposing 343 tools, enabling live-execution RL training with session-scoped state isolation; (2) a state-machine data synthesis pipeline that generates multi-turn tool-call trajectories grounded in live-sampled server state, so generated queries reference entities that actually exist; and (3) a multi-component programmatic reward with an adaptive efficiency penalty that counters the verbosity incentive of recall-based rewards. We train four models (Qwen3-4B, Qwen3-8B, Qwen2.5-7B, Granite-4.1-8B) with GRPO on the resulting ~13K training examples. On BFCL Multi-Turn, tau2-bench, and T-Eval, PROVE yields improvements of up to +10.2, +6.8, and +6.5 points respectively, demonstrating that this framework yields consistent gains on multi-step tool orchestration across two model families.

2606.03746 2026-06-04 cs.CV cs.AI cs.GR cs.LG version update

Qwen-Image-Flash: Beyond Objective Design

Tianhe Wu, Kun Yan, Zikai Zhou, Lihan Jiang, Jiahao Li, Jie Zhang, Kaiyuan Gao, Ningyuan Tang, Shengming Yin, Xiaoyue Chen, Xiao Xu, Yilei Chen, Yuxiang Chen, Yan Shu, Yixian Xu, Yanran Zhang, Zihao Liu, Zhendong Wang, Zekai Zhang, Deqing Li, Liang Peng, Yi Wang, Jingren Zhou, Chenfei Wu

Affiliations: alibaba-inc.com

AI summary: Through a systematic study of three factors (data composition, teacher guidance, and task mixture), proposes Qwen-Image-Flash and shows that effective few-step distillation requires not only a carefully designed objective but also a principled organization of the broader training pipeline.

2606.03631 2026-06-04 cs.LG cs.AI version update

AnchorMoE: Interpretable Time Series Classification via Anchor-Routed MoE

Tao Xie, Zexi Tan, Haoyi Xiao, Mengke Li, Yiqun Zhang, Yang Lu, Cuie Yang, Yiu-ming Cheung

Affiliations: School of Automation, Guangdong University of Technology; School of Computer Science and Technology, Guangdong University of Technology; College of Computer Science and Software Engineering, Shenzhen University; School of Informatics, Xiamen University; State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University; Department of Computer Science, Hong Kong Baptist University

AI summary: Proposes AnchorMoE, a framework that uses a Mixture-of-Experts architecture to build multi-view representations of local patches and route them to specialized experts, achieves ante-hoc interpretability through an additive decomposition, and introduces a geometric orthogonality constraint and an uncertainty-aware gating mechanism to improve decomposition reliability and noise suppression under sparse signals.

Comments Accepted by KDD 2026, 12 pages

Abstract

Multivariate time series classification (MTSC) is pivotal in high-stakes domains, such as clinical diagnosis and industrial fault detection, where safe deployment necessitates transparent decision-making. However, isolating the temporal segments that drive model predictions is challenging because discriminative signals in real-world time series are typically sparse, heterogeneous, and heavily obscured by background noise. This paper, therefore, proposes AnchorMoE, an interpretable-by-construction classification framework. Built upon a Mixture-of-Experts (MoE) architecture, AnchorMoE encodes multi-view representations of local patches and routes them to specialized experts, ensuring that the final prediction is formulated as an exact additive decomposition over the input segments, facilitating ante-hoc transparency rather than relying on post-hoc estimations. To maintain the reliability of this decomposition under sparse signal distributions, we introduce a geometric orthogonality constraint that penalizes representational redundancy, compelling distinct experts to specialize in heterogeneous predictive patterns. Furthermore, an uncertainty-aware reliability gate is designed to dynamically calibrate the contribution of each segment, effectively suppressing residual background noise. Extensive experiments on real-world and synthetic benchmarks demonstrate that AnchorMoE achieves highly competitive classification performance while faithfully grounding its decisions in the raw time series.

2606.03441 2026-06-04 cs.RO cs.LG version update

PerchRL: Vision-Based Agile Perching on Inclined Platforms under Rapid and Irregular Motion

Zihong Lu, Zongzhuo Liu, Huaxu Li, Jinqiang Cui, Jie Mei, Youmin Gong, U Kei Cheang, Boyu Zhou

Affiliations: SUSTech (Southern University of Science and Technology); HITSZ (Harbin Institute of Technology, Shenzhen); PCL (Peng Cheng Laboratory); Differential Robotics

AI summary: Proposes PerchRL, a reinforcement learning framework that combines a two-stage learning strategy (state-based pre-training plus vision-based fine-tuning) with a hybrid learning framework (visibility-aware state augmentation plus active perception rewards) to achieve autonomous vision-based perching of quadrotors on inclined platforms under rapid and irregular motion.

Abstract

Autonomous vision-based perching of quadrotors on moving inclined platforms is critical for air-ground collaboration but remains challenging due to the limited field of view (FOV). In this paper, we propose PerchRL, a reinforcement learning (RL) framework for vision-based agile perching on inclined platforms under rapid and irregular motion. Specifically, we employ a two-stage learning strategy consisting of state-based pre-training followed by vision-based fine-tuning. To improve generalization across diverse platform motions, we employ randomized platform trajectories to prevent overfitting and temporal augmentation methods to capture latent motion patterns from historical observations. During vision-based fine-tuning, a hybrid learning framework consisting of visibility-aware state augmentation and active perception rewards is presented to improve robustness under intermittent visual loss. Extensive simulation and real-world experiments demonstrate the feasibility, stability, and real-time performance of PerchRL, while successful deployment across distinct quadrotor platforms further validates its adaptability. The source code will be released to benefit the community.

2606.03393 2026-06-04 cs.LG version update

Flicker-DDPM: Accelerating Denoising Diffusion via 1/f Colored Noise Injection

KeXiang Mao, FanCheng Li

Affiliations: School of Physics and Technology, Wuhan University; Hongyi Honor College, Wuhan University

AI summary: Proposes Flicker-DDPM, which replaces isotropic white noise with 1/f colored noise inspired by self-organized criticality, generated with a power-law spectrum from a spatial correlation kernel; on CIFAR-10 it matches or surpasses standard DDPM generation quality with 3.33 times fewer sampling steps, and a frequency-domain linear theory explains the speedup.

Comments 16 pages, 8 figures. Code available at https://github.com/Mao-Kexiang/Flicker_DDPM

Abstract

We propose a novel diffusion model, Flicker-DDPM, which incorporates flicker (1/f) noise inspired by self-organized criticality (SOC), a widely observed phenomenon in natural systems. Unlike denoising diffusion probabilistic models (DDPMs), which employ isotropic white noise in the forward process, Flicker-DDPM adopts colored noise with power-law spectra to better match the spectral statistics of natural images, whose power spectra typically follow P(k) proportional to 1/k^α. To this end, we develop a colored-noise module based on a spatial correlation kernel, σ(d) = (d + 1)^{-η}, and theoretically establish that adjusting η controls the spectral exponent α of the generated 1/f^α noise, enabling adaptation to datasets with diverse spectral characteristics. On CIFAR-10, Flicker-DDPM matches or surpasses the generation quality of a standard DDPM baseline using 3.33 times fewer sampling steps, with negligible additional computational cost per step. We further develop a frequency-domain linear theory demonstrating that spectrally matched colored noise linearizes the reverse trajectory, theoretically explaining the observed sampling acceleration.
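
To give a feel for noise with a power-law spectrum, the sketch below shapes white Gaussian noise in the Fourier domain so that its power falls off as 1/k^α. This is a swapped-in, generic technique and not the authors' module: the paper builds its noise from the spatial correlation kernel σ(d) = (d + 1)^{-η}, whereas this example sets the spectral exponent directly. The function name `colored_noise_2d` and the image size are assumptions made for illustration.

```python
import numpy as np

def colored_noise_2d(size, alpha, rng):
    """Sample a size x size field whose power spectrum scales as 1/k^alpha.

    Generic FFT shaping (an assumption, not the paper's kernel-based module):
    white noise is multiplied by k^(-alpha/2) in the frequency domain.
    """
    white = rng.standard_normal((size, size))
    freqs = np.fft.fftfreq(size)
    k = np.sqrt(freqs[None, :] ** 2 + freqs[:, None] ** 2)
    k[0, 0] = 1.0                      # placeholder to avoid dividing by zero
    amplitude = k ** (-alpha / 2.0)    # power is amplitude squared
    amplitude[0, 0] = 0.0              # drop the DC term so the mean is zero
    shaped = np.fft.ifft2(np.fft.fft2(white) * amplitude).real
    return shaped / shaped.std()       # rescale to unit variance

rng = np.random.default_rng(0)
sample = colored_noise_2d(32, alpha=2.0, rng=rng)
```

Setting `alpha=0.0` recovers white noise (with the mean removed), while larger values concentrate energy at low spatial frequencies, which is the qualitative property the abstract attributes to natural images.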

2606.03376 2026-06-04 cs.CV cs.AI cs.CL cs.LG version update

P$^2$-DPO: Grounding Hallucination in Perceptual Processing via Calibration Direct Preference Optimization

Ruipeng Zhang, Zhihao Li, Haozhang Yuan, C. L. Philip Chen, Tong Zhang

Affiliations: Guangdong Provincial Key Laboratory of Computational AI Models and Cognitive Intelligence, School of Computer Science & Engineering, South China University of Technology; Pazhou Lab, Guangzhou, China; Engineering Research Center of the Ministry of Education on Health Intelligent Perception and Paralleled Digital-Human, Guangzhou, China

AI summary: To address hallucination in large vision-language models, proposes the P²-DPO training paradigm, which uses model-generated preference pairs and a calibration loss to directly target perceptual bottlenecks and visual robustness without costly human feedback.

Abstract

Hallucination in Large Vision-Language Models (LVLMs) has recently garnered significant research attention. Direct Preference Optimization (DPO) aims to learn directly from the corrected preferences provided by humans, thereby addressing the hallucination issue. Despite its success, this paradigm has yet to specifically target the perceptual bottleneck in attended regions or address insufficient Visual Robustness against image degradation. Furthermore, existing preference pairs are often vision-agnostic, and their inherently off-policy nature limits their effectiveness in guiding model learning. To address these challenges, we propose Perceptual Processing Direct Preference Optimization (P$^2$-DPO), a novel training paradigm in which the model generates and learns from its own preference pairs, thereby directly addressing the identified visual bottlenecks while inherently avoiding the issues of vision-agnostic and off-policy data. It introduces: (1) an on-policy preference-pair construction method targeting Focus-and-Enhance perception and Visual Robustness, and (2) a well-designed Calibration Loss to precisely align visual signals with the causal generation of text. Experimental results demonstrate that, with a comparable amount of training data and cost, P$^2$-DPO outperforms strong baselines that rely on costly human feedback across benchmarks. Furthermore, evaluations on Attention Region Fidelity (ARF) and image degradation scenarios validate the effectiveness of P$^2$-DPO in addressing the perceptual bottleneck in attended regions and improving Visual Robustness against degraded inputs.
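For readers unfamiliar with the objective that P$^2$-DPO builds on, here is a minimal sketch of the standard DPO loss for a single preference pair. It shows only the generic DPO term; the paper's Calibration Loss and preference-pair construction are not specified in the abstract and are not shown. The function name `dpo_loss`, its argument names, and the default `beta` are illustrative assumptions.

```python
import math

def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected, beta=0.1):
    """Standard DPO loss for one preference pair, from sequence log-probabilities."""
    # Implicit reward margin: how much more the policy prefers the chosen
    # response over the rejected one, relative to the frozen reference model.
    margin = (policy_chosen - ref_chosen) - (policy_rejected - ref_rejected)
    z = beta * margin
    # -log(sigmoid(z)) equals softplus(-z); this form avoids overflow for large |z|.
    return max(-z, 0.0) + math.log1p(math.exp(-abs(z)))
```

When the policy and the reference model agree, the margin is zero and the loss equals log 2; the loss falls below that value as the policy shifts probability toward the chosen response.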

2606.02886 2026-06-04 cs.LG cs.AI cs.CE math.PR physics.ao-ph version update

Scalable Uncertainty Quantification for Extreme Weather Forecasting via Empirical Neural Tangent Kernels

Jose Marie Antonio Miñoza, Rex Gregor Laylo, Sebastian C. Ibañez

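Affiliations: Center for AI Research; Department of Education; Makati, Philippines

AI summary: Proposes a neural tangent kernel based uncertainty quantification method that uses last-layer empirical features; analyses of a variance collapse mechanism and of decomposition performance yield adaptive prediction intervals for extreme weather without retraining.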

Comments Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (KDD '26)

DOI: 10.1145/3770855.3818106

Abstract

Deep learning weather models now match numerical weather prediction accuracy while running orders of magnitude faster, but produce deterministic forecasts without uncertainty estimates, a critical gap for high-stakes decisions during extreme weather events. This paper proposes Neural Tangent Kernel-based uncertainty quantification (NTK-UQ) using last-layer empirical features. Theoretical analysis predicts that UQ quality is architecture-dependent through two mechanisms. First, a variance collapse mechanism explains when UQ fails: when the eigenvalue truncation rank approaches the effective rank of the feature space, the GP correction term consumes nearly all prior variance, destroying discrimination between tropical cyclones and routine conditions; architectures with concentrated spectra (spectral operators) require aggressive truncation ($k \leq 10$), while attention-based models tolerate full-rank computation. Second, decomposition performance depends on the non-Gaussian, heavy-tailed structure of extreme weather: Independent Component Analysis exploits higher-order statistics (kurtosis, negentropy) to isolate heavy-tailed extreme-event features, achieving higher discrimination than singular value decomposition, which captures only second-order variance. A data-driven selection rule chooses ICA or SVD from the feature eigenspectrum concentration ratio, correctly prescribing the superior decomposition for all four evaluated architectures. Compared to split conformal prediction (the natural post-hoc baseline), NTK-UQ achieves 31-37% sharper prediction intervals at 90% coverage, and uniquely produces adaptive intervals that scale with extreme event severity, which conformal prediction cannot achieve by construction. The framework requires no retraining; inference-time uncertainty requires only a single matrix-vector product per sample.
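The split conformal baseline mentioned in the abstract can be sketched in a few lines, which also shows why its intervals are not adaptive: with absolute-residual scores, every test point receives the same half-width. This is a generic textbook version, not the paper's code; the function names and the choice of absolute residuals as the score are assumptions.

```python
import math

def split_conformal_halfwidth(cal_true, cal_pred, alpha=0.1):
    """Half-width q such that [pred - q, pred + q] targets 1 - alpha coverage."""
    scores = sorted(abs(y - p) for y, p in zip(cal_true, cal_pred))
    n = len(scores)
    # Finite-sample corrected rank: ceil((n + 1) * (1 - alpha)).
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return scores[rank - 1]

def predict_interval(pred, halfwidth):
    """The same half-width is applied to every prediction."""
    return (pred - halfwidth, pred + halfwidth)
```

Because `halfwidth` is computed once from the calibration set, a routine forecast and a tropical-cyclone forecast get intervals of identical width under this baseline.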

2606.02576 2026-06-04 cs.CV cs.LG version update

Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams

Zewen Liu, Zhan Shi, Yisi Sang, Bing He, Minhua Lin, Tianxin Wei, Dakuo Wang, Benoit Dumoulin, Wei Jin, Hanqing Lu

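Affiliations: Emory University; Amazon; The Pennsylvania State University; UIUC; Northeastern University

AI summary: Proposes Adaptive Auto-Harness, which combines a stateful multi-agent evolver, a harness tree with solve-time routing, and human-steering hooks to counter the performance degradation of auto-harness systems on open-ended task streams, outperforming existing baselines across several streams.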

Abstract

Auto-harness systems such as A-Evolve, GEPA, and Meta-Harness improve LLM agents by optimizing prompts, skills, tools, memories, and supporting infrastructure from execution feedback, but they are typically evaluated on fixed offline benchmarks. Real deployments instead present open-ended task streams: histories grow without a fixed endpoint, heterogeneous tasks require different harnesses, and problem distributions shift over time. These challenges make a single repeatedly and densely updated harness brittle, causing performance degradation as accuracy peaks early and then declines. This motivates sustained harness construction with task-wise adaptation. We introduce Adaptive Auto-Harness, a framework and system for such streams. The framework decomposes the gap to an oracle harness into evolution loss and adaptation loss. The system addresses these losses with a stateful multi-agent evolver, a harness tree with solve-time routing, and human-steering hooks for cases where history lacks the needed signal. Across prediction-market, security-competition, and event-forecasting streams, Adaptive Auto-Harness outperforms five existing auto-harness baselines, and ablations attribute gains to better construction, routing, or targeted human steering. Code is available at https://github.com/A-EVO-Lab/a-evolve/tree/release/adaptive-auto-harness.

2606.01537 2026-06-04 cs.CV cs.LG version update

PaCX-MAE: Physiology-Augmented Chest X-Ray Masked Autoencoder

Yancheng Liu, Kenichi Maeda, Manan Pancholy

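Affiliations: University of California, Berkeley; University of Tokyo; University of Michigan

AI summary: Proposes PaCX-MAE, a cross-modal distillation framework that injects physiological priors into chest X-ray encoders through a dual contrastive-predictive objective, improving performance on physiology-related tasks while keeping inference unimodal.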

Comments Accepted at the ICML 2026 3rd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences (FM4LS)

Abstract

Clinical diagnosis often requires combining imaging with physiological measurements, yet deployed models typically operate on unimodal data. We present PaCX-MAE, a cross-modal distillation framework that injects physiological priors into chest X-ray (CXR) encoders while remaining strictly unimodal at inference. PaCX-MAE augments in-domain masked autoencoding with a dual contrastive-predictive objective, aligning CXR representations with paired ECG and laboratory embeddings. Extensive evaluation across nine benchmarks demonstrates consistent improvements over domain-specific MAE, particularly on physiology-dependent tasks (e.g., +2.7 AUROC on MedMod; +6.5 F1 on VinDr). The method proves highly label-efficient in the 1% regime and preserves anatomical fidelity, achieving parity with MAE on segmentation tasks. Zero-shot and attention analyses confirm that PaCX-MAE successfully learns to attend to physiological indicators, such as the cardiac silhouette, absent in standard visual pretraining.

2606.01495 2026-06-04 cs.LG cs.CL version update

CART: Context-Anchored Recurrent Transformer -- A Parameter-Efficient Architecture with Learned Stability

Chad A. Capps

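Affiliations: Independent Researcher

AI summary: Proposes CART, a language model that achieves parameter efficiency by looping a shared core block over frozen key-value tensors and adds a linear time-invariant gate to keep the recurrence stable; experiments show it performs slightly below a parameter-matched dense baseline.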

Comments 31 pages, 4 figures. Code, training scripts, and the full experiment database (results.db) are available at https://github.com/ccapps42/CART

Abstract

We present CART (Context-Anchored Recurrent Transformer), a parameter-efficient language model that reuses a single shared core block R times across depth. Unlike prior looped transformers that recompute key-value tensors at every iteration, CART computes K and V once from a multi-layer prelude and has the recurrent core cross-attend to those frozen tensors via multi-head latent attention. A learned Linear Time-Invariant (LTI) gate keeps the recurrence stable: its spectral radius settles in a narrow band (rho in [0.79, 0.83]) across all 36 fully-trained configurations. We evaluate CART on single consumer GPUs in two stages: a 64-configuration screen at 3,000 steps, then 36 configurations (P=6, R in {6,8,10}, three seeds) trained for 30,500 steps (~1B tokens). Two patterns hold across widths d in {256,512,768,1024}: prelude depth P dominates loop count R, and the Stage-1 ranking of R reverses at full training (R=6 becomes best at d>=512). At the binding d=1024 parameter-parity test, CART does not beat a parameter-matched dense baseline, losing by 1-2% at stored-parameter parity and by ~10% at effective-parameter parity. Diagnostic ablations split the effective-parameter gap into ~5% from weight sharing and a residual ~5% from the heterogeneous prelude/anchor/core/coda framing; the recurrent-core machinery (hyper-connections, LTI gate, loop-index embedding) is individually vestigial. Variable-R inference degrades on both sides of the trained R, a negative result for test-time depth scaling under this recipe.

2606.00732 2026-06-04 cs.AI cs.LG version update

SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition

Jayanta Dey, Shikhar Srivastava, Itamar Lerner, Christopher Kanan, Dhireesha Kudithipudi

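Affiliations: Department of Computer Engineering, University of Texas at San Antonio, USA; Department of Computer Science, University of Rochester, USA; Department of Psychology, University of Texas at San Antonio, USA

AI summary: Proposes the SHARP framework, which decomposes temporal learning into a memory module and a pattern-recognition module and adds offline sleep phases that replay temporally structured memories in accelerated form, enabling efficient learning of long-range non-stationary sequence patterns.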

Abstract

Learning long-range non-stationary temporal patterns remains a core challenge for modern sequence models, particularly in strict streaming settings. In these settings, data arrive sequentially and must be processed in a single pass without simultaneously revisiting past observations. Standard architectures, including recurrent neural networks and transformers, are constrained in long-range credit assignment by either the truncated backpropagation-through-time horizon or the explicit input window length. To address these limitations, we propose SHARP (Sleep-based Hierarchical Accelerated Replay), a framework that decomposes temporal learning into two complementary components: a memory module that accumulates a structured history of past inputs, and a pattern-recognition module that operates over this memory. This separation enables resource- and compute-efficient adaptation to non-stationary dynamics by eliminating the need for backpropagation through time across many steps for long-range credit assignment. Inspired by the accelerated replay observed in rodents during slow-wave sleep, SHARP incorporates offline (sleep) phases in which temporally structured memory traces are replayed in an accelerated form and integrated into higher-level memory representations, improving long-range context retention. Through controlled simulations and ablation studies, we characterize the key properties of the proposed framework. On benchmark datasets such as text8 and PG-19, we demonstrate that SHARP improves over recurrent baselines by retaining next-token predictive performance on previously seen data while continuing to learn from the current stream and generalizing to future unseen data. These gains are enabled by its hierarchical structure, which yields an exponentially increasing effective temporal context with only linear-time computational cost.

2606.00260 2026-06-04 cs.CV cs.LG version update

LastAct: Trajectory-Guided Latest-Activity Localization for Real-Time Smart-Home Activity Recognition

Zishuai Liu, Ruili Fang, Jin Lu, Fei Dou

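Affiliations: School of Computing, University of Georgia

AI summary: Proposes the LastAct framework, which uses trajectory image sequences and a boundary localizer to resolve boundary contamination in sliding windows, enabling real-time smart-home activity recognition.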

Abstract

Human Activity Recognition (HAR) from ambient sensors enables smart-home applications such as health monitoring and assisted living. In realistic deployments, however, sensor events arrive as a continuous stream and activity boundaries are unknown. Sliding-window inference therefore produces many windows that straddle transitions and contain mixed activities, creating boundary contamination that violates the pre-segmented instance assumption used by most benchmarks and models. Moreover, many pipelines under-use spatial context by treating sensor IDs as independent tokens. We present LastAct, a trajectory-centric framework for streaming smart-home HAR that targets the most recent activity under mixed windows while explicitly modeling spatial structure. LastAct projects sensor events onto the home floorplan to form a layout-aligned trajectory image sequence that preserves spatial continuity. A lightweight gate identifies contaminated windows, and a boundary localizer estimates the last transition to enable boundary-guided masking that emphasizes post-boundary evidence and suppresses stale context. For efficiency, we reuse a precomputed layout-aligned template cache to avoid repeated rendering. Empirically, across four public smart-home datasets under near-realistic mixed-activity protocols, LastAct achieves competitive or superior performance on pure windows and yields substantial Macro-F1 gains on cross/mixed windows, demonstrating improved robustness under near-realistic sliding-window regimes.

2605.30705 2026-06-04 cs.CV cs.LG version update

Why Ask Only One Expert? Learning to Defer to the Top-$k$ Experts

Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

Affiliations: School of Computing, National University of Singapore; Fédération ENAC, ISAE-SUPAERO, ONERA, Université de Toulouse; Agency for Science, Technology and Research, Institute for Infocomm Research

AI summary: Proposes a Top-$k$ Learning-to-Defer framework that allocates each query to the $k$ best experts to enable multi-expert collaboration, and develops a surrogate loss independent of $k$, achieving better accuracy-cost trade-offs.

Abstract

Existing Learning-to-Defer (L2D) frameworks are limited to single-expert deferral, forcing each query to rely on only one expert and preventing the use of collective expertise. We introduce the first framework for Top-$k$ Learning-to-Defer, which allocates queries to the $k$ most cost-effective entities. Our formulation unifies and strictly generalizes prior approaches, including the one-stage and two-stage regimes, selective prediction, and classical cascades. In particular, it recovers the usual Top-1 deferral rule as a special case while enabling principled collaboration with multiple experts when $k>1$. We further propose Top-$k(x)$ Learning-to-Defer, an adaptive variant that learns the optimal number of experts per query based on input difficulty, expert quality, and consultation cost. To enable practical learning, we develop a novel surrogate loss that is Bayes-consistent, $\mathcal{H}_h$-consistent in the one-stage setting, and $(\mathcal{H}_r,\mathcal{H}_g)$-consistent in the two-stage setting. Crucially, this surrogate is independent of $k$, allowing a single policy to be learned once and deployed flexibly across $k$. Experiments across both regimes show that Top-$k$ and Top-$k(x)$ deliver superior accuracy-cost trade-offs, opening a new direction for multi-expert deferral in L2D.
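The deployment-time selection rule described in the abstract can be sketched as follows, assuming per-entity cost estimates are already available for a query. In the paper these estimates would come from a policy trained with the surrogate loss; here the `costs` dictionary, its values, and the function name `top_k_deferral` are hypothetical.

```python
def top_k_deferral(costs, k):
    """Return the names of the k entities with the lowest estimated cost."""
    if not 1 <= k <= len(costs):
        raise ValueError("k must be between 1 and the number of entities")
    # Sort by cost, breaking ties by name so the ranking is deterministic.
    ranked = sorted(costs, key=lambda name: (costs[name], name))
    return ranked[:k]

# Hypothetical cost estimates for one query (lower is better).
costs = {"model": 0.40, "expert_a": 0.25, "expert_b": 0.55, "expert_c": 0.30}
```

Because one ranking serves every $k$, the Top-1 choice is always a prefix of the Top-$k$ choice, which is consistent with the abstract's point that a single learned policy can be deployed across different values of $k$.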

2410.15761 2026-06-04 cs.CL cs.LG stat.ML version update

Optimal Query Allocation in Extractive QA with LLMs: A Learning-to-Defer Framework with Theoretical Guarantees

Yannis Montreuil, Shu Heng Yeo, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

Affiliations: School of Computing, National University of Singapore; Fédération ENAC ISAE-SUPAERO ONERA, Université de Toulouse, France; Institute for Infocomm Research (A*STAR), Singapore; IPAL, IRL 2955, Singapore

AI summary: Proposes a Learning-to-Defer framework that allocates queries to specialized experts, optimizing computational efficiency while ensuring high-confidence predictions; evaluations on SQuADv1, SQuADv2, and TriviaQA confirm improved answer reliability and reduced computational overhead.

Comments 25 pages, 17 main paper

2605.29280 2026-06-04 cs.LG cs.AI cs.IR version update

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

Shali Jiang, Hua Zheng, Boyang Liu, Laming Chen, Kenny Lov, Chuanqi Xu, Lisang Ding, Qinghai Zhou, Can Cui, Xiaolong Liu, Xiaoyi Liu, Yasmine Badr, Xin Xu, Jiyan Yang, Ellie Dingqiao Wen, Gerard Jonathan Mugisha Akkerhuis, Chenxiao Guan, Rong Jin, Ruichao Qiu, Xian Chen, Shifu Xu, Zhehui Zhou, Ping Chen, Rui Yang, Haicheng Chen, Xiangge Meng, Song Zhou, Dharak Kharod, Shuyu Xu, Qiang Jin, Qiao Yang, Wankun Zhu, Qin Huang, Yuzhen Huang, Darren Liu, Parish Aggarwal, Hui Zhou, Erzhuo Wang, Shuo Chang, Xiaorui Gan, Wenlin Chen, Santanu Kolay, Huayu Li

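Affiliations: Meta

AI summary: To address the diminishing transfer ratio caused by passing only a scalar in knowledge distillation, proposes the LoopFM framework, which feeds the foundation model's intermediate embeddings to downstream vertical models as input features for high-bandwidth knowledge transfer, and supports its effectiveness with theory and experiments.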

Comments Shali Jiang, Hua Zheng, Boyang Liu contributed equally to this work

Abstract

Knowledge distillation (KD) transfers a single scalar prediction from a large foundation model (FM) to compact vertical models (VMs) and suffers from a diminishing transfer ratio (the fraction of FM improvement captured by the VM), as a single scalar cannot convey the rich intermediate knowledge that larger FMs learn. To address this bottleneck, we propose LoopFM (Learning frOm HistOrical RePresentations of FM), a framework that opens a high-bandwidth transfer channel by structuring FM intermediate embeddings as input features (e.g., user history sequence) for downstream VMs, without requiring real-time FM inference at serving or architectural coupling between FM and VM. We provide a theoretical framework for LoopFM with a gain decomposition and transfer-ratio analysis. On three public benchmarks, LoopFM demonstrates strong AUC improvements (e.g., 6%+ on TaobaoAd) and complementary knowledge transfer capability with KD. On industrial-scale systems (billions of examples, trillion-parameter FMs), LoopFM approximately doubles the knowledge transfer ratio on top of KD, delivering a +0.5% conversion improvement in the first half after its initial launch, and +1.03% and +1.22% conversion improvement from two individual launches in the subsequent half.

2605.29076 2026-06-04 cs.CL cs.AI cs.LG version update

Universal Model Initialization via Frequency-Domain Knowledge

Jianlu Shen, Fu Feng, Yucheng Xie, Jiaqi Lv, Xin Geng

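Affiliations: School of Computer Science and Engineering; Southeast University; Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications; Ministry of Education, China

AI summary: Proposes the FRONT framework, which uses the Discrete Cosine Transform to extract the low-frequency components of weights as a "learngene", initializes models of arbitrary size without training through truncation or padding, and optionally applies spectral regularization to improve transferability; it accelerates convergence by up to 15 times on vision tasks and reduces training compute by 40.5% on average on language tasks.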

Abstract

Transferring knowledge by fine-tuning large-scale pre-trained networks has become a standard paradigm for downstream tasks, yet the knowledge of a pre-trained model is tightly coupled with a monolithic architecture, which restricts flexible reuse across models of varying scales. In response to this challenge, recent approaches typically resort to either parameter selection, which fails to capture the interdependent structure of this knowledge, or parameter prediction using generative models that depend on impractical access to large network collections. In this paper, we identify the low-frequency components of model weights as the concrete carrier of foundational, task-agnostic knowledge, its "learngene", and validate this by demonstrating its efficient inheritance by downstream models and tasks. Based on this insight, we propose FRONT (FRequency dOmain kNowledge Transfer), a novel framework that uses the Discrete Cosine Transform (DCT) to isolate the low-frequency "learngene". This learngene can be seamlessly adapted to initialize models of arbitrary size via simple truncation or padding, a process that is entirely training-free. For enhanced performance, we propose an optional low-cost refinement process that introduces a spectral regularizer to further improve the learngene's transferability. Extensive experiments demonstrate that FRONT achieves state-of-the-art performance, accelerates convergence by up to $15\times$ in vision tasks, and reduces training FLOPs by an average of 40.5% in language tasks. Code is available at https://github.com/LUcy0505/FRONT.
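The truncation-or-padding step can be illustrated with a one-dimensional sketch: transform a weight vector with an orthonormal DCT, keep or zero-pad the low-frequency coefficients, and invert at the new length. This is an illustration under assumptions, not the FRONT implementation: the paper operates on model weights whose exact layout is not given in the abstract, and the function names and the `gain` rescaling (chosen so a constant vector keeps its value) are hypothetical.

```python
import math

def dct(x):
    """Orthonormal DCT-II of a list of floats."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of `dct` (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(
            c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
            for k in range(1, n)
        )
        out.append(s)
    return out

def resize_via_dct(weights, new_len):
    """Keep low-frequency coefficients: truncate to shrink, zero-pad to grow."""
    coeffs = dct(weights)
    if new_len <= len(coeffs):
        kept = coeffs[:new_len]
    else:
        kept = coeffs + [0.0] * (new_len - len(coeffs))
    # Assumed rescaling so that a constant vector keeps its value after resizing.
    gain = math.sqrt(new_len / len(weights))
    return idct([gain * c for c in kept])
```

Calling `resize_via_dct(w, m)` with `m` smaller than `len(w)` discards the highest-frequency coefficients, while a larger `m` adds zero-valued high frequencies, so in both directions only the low-frequency content of `w` carries over.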

2602.05725 2026-06-04 cs.LG math.OC stat.ML version update

Muon in Associative Memory Learning: Training Dynamics and Scaling Laws

Binghui Li, Kaifei Wang, Han Zhong, Pinyan Lu, Liwei Wang

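Affiliations: University of Science and Technology of China

AI summary: Studies the training dynamics and scaling laws of the Muon optimizer in an associative memory model, proving an exponential speedup over gradient descent in the noiseless case and superior scaling efficiency in the noisy case.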

Comments Published as a conference paper at ICML 2026; 53 pages

Abstract

Muon updates matrix parameters via the matrix sign of the gradient and has shown strong empirical gains, yet its dynamics and scaling behavior remain unclear in theory. We study Muon in a linear associative memory model with softmax retrieval and a hierarchical frequency spectrum over query-answer pairs, with and without label noise. In this setting, we show that Gradient Descent (GD) learns frequency components at highly imbalanced rates, leading to slow convergence bottlenecked by low-frequency components. In contrast, the Muon optimizer mitigates this imbalance, leading to faster and more uniform progress. Specifically, in the noiseless case, Muon achieves an exponential speedup over GD; in the noisy case with a power-law frequency spectrum, we derive Muon's scaling law and demonstrate its superior scaling efficiency over GD. Furthermore, we show that Muon can be interpreted as an implicit matrix preconditioner arising from adaptive task alignment and block-symmetric gradient structure. In contrast, the preconditioner with coordinate-wise sign operator could match Muon under oracle access to unknown task representations, which is infeasible for SignGD in practice. Experiments on synthetic long-tail classification and LLaMA-style pre-training corroborate the theory.
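The "matrix sign of the gradient" in the first sentence is the orthogonal factor $UV^\top$ of the gradient's SVD $G = U S V^\top$. A minimal sketch of one way to approximate it without an SVD is the cubic Newton-Schulz iteration shown below; the choice of this iteration, the step count, and the helper names are assumptions for illustration and are not taken from the abstract. The sketch assumes a nonzero input matrix.

```python
def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][t] * b[t][j] for t in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matrix_sign(g, steps=30):
    """Approximate U V^T for g = U S V^T with the cubic Newton-Schulz iteration."""
    fro = sum(v * v for row in g for v in row) ** 0.5
    # Dividing by the Frobenius norm puts all singular values in (0, 1],
    # inside the region where the iteration converges.
    x = [[v / fro for v in row] for row in g]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        # Each step maps every singular value s to 1.5 * s - 0.5 * s ** 3,
        # which drives nonzero singular values toward 1.
        x = [
            [1.5 * x[i][j] - 0.5 * xxtx[i][j] for j in range(len(x[0]))]
            for i in range(len(x))
        ]
    return x
```

The result has all nonzero singular values equal to one, so large and small gradient directions receive updates of the same magnitude, which matches the abstract's description of Muon making more uniform progress across frequency components.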

2605.24782 2026-06-04 cs.LG version update

The Perception-Physics Paradox: Probing Scientific Alignment with TC-Bench

Dingling Yao, Andrea Polesello, Adeel Pervez, Caroline Muller, Francesco Locatello

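Affiliations: ETH Zurich; DeepMind; University of Cambridge; University of Amsterdam; University of Toronto

AI summary: Introduces the concept of scientific alignment, derives a hierarchy of necessary conditions from structural isomorphism, and releases the TC-Bench benchmark dataset, revealing that vision foundation models rely on visual shortcuts rather than scientific reasoning in intense regimes.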

Comments Accepted at ICML 2026

Abstract

While Vision Foundation Models (VFMs) excel at predictive tasks on satellite imagery, their performance can arise from visual correlations rather than underlying structural invariants, making even perception-based out-of-distribution accuracy a poor proxy for scientific utility. As a result, models may look correct without reasoning correctly, a discrepancy we term the Perception-Physics Paradox. To address this gap, we introduce scientific alignment as an implicit objective for representation learning in scientific domains. We study a principled, testable aspect of scientific alignment through structural isomorphism, which requires latent representations to uniquely identify physical systems up to a linear reparameterization. This perspective induces a hierarchy of necessary conditions and yields a systematic probing protocol for physical and causal interpretability. To operationalize this framework, we release TC-Bench, a global, reproducible benchmark dataset with an automated construction pipeline for tropical cyclone research, and show that current VFMs rely on visual shortcuts that collapse in intense regimes, indicating that scientific alignment does not arise as a natural byproduct of scaling alone.

2605.17273 2026-06-04 cs.LG cs.AI version update

ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

Yujie Lin, Chengyi Yang, Zhishang Xiang, Yiping Song, Jinsong Su

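Affiliations: University of Science and Technology of China

AI summary: Proposes the ZeroUnlearn framework, which uses model editing to recast machine unlearning as a precise knowledge re-mapping problem and applies a closed-form multiplicative parameter update for efficient, targeted few-shot unlearning.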

Abstract

Large language models inevitably retain sensitive information, defined as inputs that may induce harmful generations, due to training on massive web corpora, raising concerns for privacy and safety. Existing machine unlearning methods primarily rely on retraining or aggressive fine-tuning, which are either computationally expensive or prone to degrading related knowledge and overall model utility. In this work, we reformulate machine unlearning as a precise knowledge re-mapping problem via model editing. We propose ZeroUnlearn, a few-shot unlearning framework. It overwrites sensitive inputs by mapping them to a neutral target state and removing their original representations. ZeroUnlearn enforces representational orthogonality through a multiplicative parameter update with a closed-form solution, enabling efficient and targeted unlearning. We further extend ZeroUnlearn to a gradient-based variant for multi-sample unlearning. Experiments demonstrate that our approach outperforms existing baselines while preserving general model utility. Our code is available on GitHub: https://github.com/XMUDeepLIT/ZeroUnlearn.

2605.20468 2026-06-04 cs.LG stat.ME stat.ML 版本更新

CASCADE Conformal Prediction: Uncertainty-Adaptive Prediction Intervals for Two-Stage Clinical Decision Support

CASCADE 共形预测：两阶段临床决策支持的不确定性自适应预测区间

Ricardo Diaz-Rincon, Muxuan Liang, Adolfo Ramirez-Zamora, Benjamin Shickel

发表机构 * University of Florida（佛罗里达大学）； MD Anderson Cancer Center（MD安德森癌症中心）； University of Louisville（路易斯维尔大学）

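AI summary: Proposes the CASCADE conformal prediction framework, which propagates a classifier's epistemic uncertainty to dynamically adjust regression prediction intervals, yielding efficient and robust interval estimates for medication management in Parkinson's disease.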

Comments Accepted to ICML 2026 AgenticUQ Workshop. 14 Pages, 3 Figures

AI中文摘要

由于疾病进展的异质性、患者反应的差异性以及药物副作用，帕金森病（PD）的有效用药管理具有挑战性。虽然AI模型可以预测左旋多巴等效日剂量（LEDD）作为用药需求的度量，但标准的不确定性量化通常无法传达这些预测的可靠性，将高置信度和低置信度的临床决策等同对待。我们引入了CASCADE（通过共形和分布估计的校准自适应缩放），一种新颖的共形预测框架，它将来自筛选分类器的认知不确定性传播以自适应下游预测。与依赖辅助残差回归的标准共形方法不同，我们利用来自主要分类任务（识别是否需要改变用药）的认知不确定性，动态缩放次要回归任务（预测改变多少）的预测区间。通过将Venn-Abers多概率不确定性直接映射到非一致性分数，我们的框架实现了连续的风险自适应。我们证明，这种“级联效应”为高置信度患者产生高效的区间（比标准共形基线窄38.9%），同时自动扩展区间以确保对不确定病例的鲁棒覆盖，弥合了PD中离散临床决策与连续剂量预测之间的差距。

英文摘要

Effective medication management in Parkinson's Disease (PD) is challenging due to heterogeneous disease progression, variable patient response, and medication side effects. While AI models can forecast levodopa equivalent daily dose (LEDD) as a measure of medication needs, standard uncertainty quantification often fails to communicate the reliability of these predictions, treating high- and low-confidence clinical decisions identically. We introduce CASCADE (Calibrated Adaptive Scaling via Conformal And Distributional Estimation), a novel conformal prediction framework that propagates epistemic uncertainty from a screening classifier to adapt downstream predictions. Unlike standard conformal methods that rely on auxiliary residual regression, we leverage epistemic uncertainty from a primary classification task (identifying whether a medication change is needed) to dynamically scale the prediction intervals of a secondary regression task (predicting how much change). By mapping Venn-Abers multi-probabilistic uncertainty directly to non-conformity scores, our framework achieves continuous risk adaptation. We demonstrate that this "cascade effect" produces highly efficient intervals for confident patients (38.9% narrower than standard conformal baselines) while automatically expanding intervals to ensure robust coverage for uncertain cases, bridging the gap between discrete clinical decision-making and continuous dose forecasting in PD.
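The idea of scaling regression intervals by an upstream uncertainty can be sketched with ordinary split conformal prediction and a normalized score. This is a minimal sketch under assumptions: `u_cal` and `u_test` stand for any nonnegative per-patient uncertainty value (for example, the width of a Venn-Abers probability interval), and dividing the residual by it is a generic normalization, not necessarily the mapping CASCADE uses.

```python
import numpy as np

def scaled_conformal_intervals(y_cal, pred_cal, u_cal, pred_test, u_test,
                               alpha=0.1, eps=1e-6):
    """Split conformal intervals whose width grows with an uncertainty value.

    Assumed inputs (hypothetical names): calibration targets and predictions,
    plus a nonnegative uncertainty per calibration and test point.
    """
    # Nonconformity score: absolute residual divided by the uncertainty scale.
    scores = np.abs(y_cal - pred_cal) / (u_cal + eps)
    n = len(scores)
    # Finite-sample corrected quantile level of split conformal prediction.
    level = min(1.0, np.ceil((n + 1) * (1 - alpha)) / n)
    q = np.quantile(scores, level, method="higher")
    # Undo the normalization at test time: uncertain cases get wider intervals.
    half_width = q * (u_test + eps)
    return pred_test - half_width, pred_test + half_width
```

With this construction a confident case (small `u_test`) receives a narrow interval and an uncertain one a wide interval, while the calibration quantile is shared across all cases.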

2605.18936 2026-06-04 cs.LG cs.CL 版本更新

FedMental: Evaluating Federated Learning for Mental Health Detection from Social Media Data

FedMental: 评估用于社交媒体数据心理健康检测的联邦学习

Nuredin Ali Abdelkadir, Anjali Ratnam, Zeerak Talat, Stevie Chancellor

发表机构 * University of Minnesota（明尼苏达大学）； University of Edinburgh（爱丁堡大学）

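AI summary: Through experiments with federated learning and differentially private federated learning on depression and suicide crisis detection, this paper evaluates how privacy-preserving techniques affect mental health detection performance. Federated learning performs close to centralized training, while differentially private federated learning shows a substantial performance-privacy trade-off.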

Comments Association for Computational Linguistics (ACL) 2026 Main Conference

AI中文摘要

社交媒体文本数据常用于训练机器学习模型以识别表现出高风险心理健康行为的用户。然而，共享这些敏感数据会带来隐私风险，并限制了基准数据集的发展。我们全面评估了隐私保护的机器学习技术是否能在保持性能的同时实现更安全的数据共享。具体来说，我们将联邦学习和差分隐私联邦学习应用于两个广泛研究的心理健康预测任务：X（Twitter）上的抑郁检测和Reddit上的自杀危机检测。通过将每个用户视为非独立同分布设置中的一个客户端，我们模拟了现实的数据共享场景，评估了不同的客户端比例、聚合策略和隐私预算。虽然联邦学习在抑郁识别上达到了与集中式训练相当的性能（集中式F1=85.63；最佳联邦学习模型F1=83.16），但我们发现差分隐私联邦学习即使在低噪声水平（epsilon=50）下也存在较大的性能-隐私权衡（F1下降高达27.01）。这是由于与心理健康相关的高信息量但稀疏的语言标记（如健康主题和情感词）被扭曲所致。本研究实证展示了当前隐私保护技术在心理健康推理任务中的潜力和局限性。

英文摘要

Social media text data are often used to train Machine Learning (ML) models to identify users exhibiting high-risk mental health behaviors. However, sharing this sensitive data poses privacy risks and limits the growth of benchmark datasets. We comprehensively evaluate whether privacy-preserving ML techniques can enable safer data sharing while preserving performance. Specifically, we apply federated learning (FL) and Differentially Private FL to two widely studied mental health prediction tasks: depression detection on X (Twitter) and suicide crisis detection on Reddit. We simulate realistic data-sharing scenarios by treating each user as a client in a non-IID setting, evaluating across different client fractions, aggregation strategies, and privacy budgets. While FL achieves comparable performance to centralized training (centralized F1 = 85.63; best FL model F1 = 83.16) on depression identification, we find that Differentially Private FL has a large performance-privacy trade-off (an F1 drop of up to 27.01) even with low levels of noise (epsilon = 50). This is due to the distortion of highly informative yet sparse linguistic markers related to mental health, like health topics and emotion words. This research empirically demonstrates the potential and limitations of current privacy preservation techniques for mental health inference tasks.
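One server round of the setup described above can be sketched as federated averaging with optional clipping and Gaussian noise. This is a simplified illustration, assuming flat NumPy weight vectors; `fedavg_round`, `clip_norm`, and `noise_std` are hypothetical names, and the noise scale is not calibrated to any particular epsilon, so the sketch carries no formal privacy guarantee.

```python
import numpy as np

def fedavg_round(global_w, client_weights, client_sizes,
                 clip_norm=None, noise_std=0.0, rng=None):
    """One aggregation round: weighted average of client updates.

    Optional clipping and Gaussian noise mimic the shape of a differentially
    private variant (illustrative only; not calibrated to a privacy budget).
    """
    if rng is None:
        rng = np.random.default_rng(0)
    deltas = []
    for w in client_weights:
        delta = w - global_w
        if clip_norm is not None:
            # Bound each client's influence by rescaling large updates.
            norm = np.linalg.norm(delta)
            delta = delta * min(1.0, clip_norm / (norm + 1e-12))
        deltas.append(delta)
    # Weight each client by its share of the training examples.
    shares = np.asarray(client_sizes, dtype=float)
    shares = shares / shares.sum()
    avg = sum(s * d for s, d in zip(shares, deltas))
    if noise_std > 0:
        avg = avg + rng.normal(0.0, noise_std, size=avg.shape)
    return global_w + avg
```

Because the noise is added to every coordinate, rarely updated coordinates (such as those tied to sparse vocabulary) are affected proportionally more, which is consistent with the explanation the abstract offers for the F1 drop.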

2605.18931 2026-06-04 stat.ML cs.AI cs.LG 版本更新

Markov Chain Decoders Overcome the Heavy-Tail Limitations of Lipschitz Generative Models

马尔可夫链解码器克服Lipschitz生成模型的重尾限制

Abdelhakim Ziani, Andras Horvath, Paolo Ballarini

发表机构 * Université Paris Saclay, Lab. MICS, CentraleSupélec, Gif-sur-Yvette, France（巴黎萨克雷大学，MICS实验室，CentraleSupélec，法国伊维特河畔吉夫）； Università di Torino, Torino, Italy（都灵大学，意大利都灵）

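AI summary: To address the inability of Lipschitz generative models to produce heavy-tailed distributions, the paper replaces the Gaussian decoder with a Markov chain-based Phase-Type distribution, which markedly reduces tail error and extreme quantile error.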

Journal ref: 22nd European Performance Engineering Workshop (EPEW 2026), Jun 2025, Grimstad, Norway

AI中文摘要

重尾分布在性能评估、网络流量和风险建模中普遍存在。这种行为对现代深度生成模型构成了根本性挑战。标准变分自编码器（VAE）采用高斯解码器似然和Lipschitz约束神经网络，这种组合在结构上无法产生重尾输出：高斯尾部呈指数衰减，而Lipschitz连续性阻止解码器放大来自潜在空间的罕见事件以充分克服这种衰减。我们提供了这一局限性的理论刻画，并使用合成Pareto数据（跨越尾部指数$α$ ∈ {2, 3, 5, 30}和维度d ∈ {1, 5, 10}的网格）进行了受控实证演示。作为解决方案，我们在保持编码器、潜在空间和训练过程不变的情况下，将高斯解码器替换为基于马尔可夫链的Phase-Type（PH）分布。PH分布允许对任何正值分布（包括重尾族）进行任意精确的近似。实验表明，对于重尾数据，与高斯基线相比，基于PH的模型将尾部Kolmogorov-Smirnov距离减少了最多6倍，极端分位数误差减少了最多10倍。这些结果表明，将基于马尔可夫链的分布集成到生成模型的解码器中，为重尾生成问题提供了一个有原则且实际有效的解决方案。

英文摘要

Heavy-tailed distributions are prevalent in performance evaluation, network traffic, and risk modeling. This behavior poses a fundamental challenge for modern deep generative models. Standard Variational Autoencoders (VAEs) employ Gaussian decoder likelihoods and Lipschitz-constrained neural networks, a combination that is structurally incapable of producing heavy-tailed outputs: the Gaussian tail decays exponentially, and Lipschitz continuity prevents the decoder from amplifying rare events from the latent space input to sufficiently overcome this decay. We provide both a theoretical characterization of this limitation and a controlled empirical demonstration using synthetic Pareto data across a grid of tail indices $\alpha \in \{2, 3, 5, 30\}$ and dimensions $d \in \{1, 5, 10\}$. As a solution, we replace the Gaussian decoder with a Phase-Type (PH) distribution based on Markov chains, while keeping the encoder, latent space, and training procedure identical. PH distributions allow for arbitrarily precise approximations of any positive-valued distributions, including heavy-tailed families. Experiments show that the PH-based model reduces tail Kolmogorov-Smirnov distance by up to 6x and extreme quantile error by up to 10x compared to the Gaussian baseline for heavy-tailed data. These results demonstrate that integrating Markov chain-based distributions into the decoder of a generative model provides a principled and practically effective solution to the heavy-tail generation problem.
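The structural argument can be made concrete with a standard concentration fact, given here as background rather than as the paper's own theorem. The symbols $f$ (decoder mean), $L$ (its Lipschitz constant), $k$ (latent dimension), and $x_m$ (Pareto scale) are notation assumed for this sketch.

```latex
% Gaussian concentration for an L-Lipschitz map f of a standard normal latent Z:
\[
  Z \sim \mathcal{N}(0, I_k), \quad
  |f(z) - f(z')| \le L \,\lVert z - z' \rVert
  \;\Longrightarrow\;
  \Pr\bigl(|f(Z) - \mathbb{E} f(Z)| \ge t\bigr) \le 2\exp\!\Bigl(-\frac{t^{2}}{2L^{2}}\Bigr).
\]
% A Pareto target with scale x_m and tail index alpha decays only polynomially:
\[
  \Pr(X > t) = \Bigl(\frac{x_m}{t}\Bigr)^{\alpha}, \qquad t \ge x_m .
\]
% Hence, for every fixed alpha, the ratio of the two tails vanishes:
\[
  \lim_{t \to \infty}
  \frac{2\exp\!\bigl(-t^{2}/(2L^{2})\bigr)}{(x_m/t)^{\alpha}} = 0 .
\]
```

Adding independent Gaussian observation noise of fixed variance keeps the output sub-Gaussian, so under these assumptions no finite $L$ lets the model match a Pareto tail far enough out.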

2605.16301 2026-06-04 cs.CY cs.AI cs.LG 版本更新

Do LLMs Hold Their Values? MANTA: A Multi-Turn Adversarial Benchmark for Animal Welfare Reasoning

LLMs 是否坚持其价值观？MANTA：一个用于动物福利推理的多轮对抗性基准

Isabella Luong, Joyee Chen, Arturs Kanepajs, Jasmine Brazilek, Sankalpa Ghose, David Williams-King, Linh Le, Allen Lu

发表机构 * SPAR ； Compassion Aligned Machine Learning（同情对齐机器学习）； NUS（新加坡国立大学）； Mila（Mila研究所）； ERA Cambridge（剑桥ERA）

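AI summary: Proposes the MANTA benchmark, which uses multi-turn adversarial conversations to evaluate the value stability and moral sensitivity of large language models in animal welfare reasoning, and finds rank changes and species-by-pressure interaction effects that single-turn benchmarks cannot capture.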

AI中文摘要

评估大语言模型（LLMs）中的动物福利推理仍然是一个开放挑战，尽管它们在消费者和专业环境中迅速部署，其中福利考虑隐含地出现在日常查询中。现有的基准（如 AnimalHarmBench）通过单轮、明确框架的问题进行评估，衡量模型在直接询问时是否避免有害内容。这种方法忽略了两种失败模式：在持续对抗性压力下的对齐退化，以及道德敏感性（模型是否在日常查询中自发提出福利问题）。为填补这一空白，我们构建了 MANTA，一个包含 1,088 个五轮对话的基准，从隐式的第一轮场景开始，通过明确的福利提示，再到来自五种类型（社会、文化、经济、实用和认知）的三轮对抗性压力。我们在两个维度上对对话进行评分：动物福利价值观稳定性（AWVS，主要）和动物福利道德敏感性（AWMS，诊断）。我们评估了七个前沿模型：Claude Opus 4.7、GPT-5.5、DeepSeek V4、Llama 3.3 70B、Mistral Small、Grok 4.3 和 Gemini 3.1 Flash Lite。多轮评估捕捉了单轮基准遗漏的行为：7 个模型中有 4 个相对于第一轮得分改变了排名，包括 Gemini Flash Lite，它在 AWMS 上从第五名下降到 AWVS 上的最后一名。AWMS 和 AWVS 呈正相关但不完全相关，表明道德识别测试捕捉了模型在压力下行为的一个稳定但不完整的组成部分。MANTA 还提供了先前基准无法获得的物种-压力交互矩阵，显示福利鲁棒性同时取决于动物和施加的压力；伴侣动物得分高于野生动物，后者高于养殖动物和无脊椎动物。我们发布了数据集、脚本化压力计划、评判提示和分析代码。

英文摘要

Evaluating animal welfare reasoning in LLMs remains an open challenge despite rapid deployment in consumer and professional contexts where welfare considerations appear implicitly in everyday queries. Existing benchmarks such as AnimalHarmBench evaluate this through single-turn, explicitly framed questions, measuring whether models avoid harmful content when directly asked. This approach overlooks two failure modes: alignment degradation under sustained adversarial pressure, and moral sensitivity (whether a model spontaneously surfaces welfare stakes in everyday queries). To fill this gap, we construct MANTA, a benchmark of 1,088 five-turn conversations progressing from an implicit Turn-1 scenario through an explicit welfare prompt to three adversarial pressure rounds drawn from a five-type taxonomy: Social, Cultural, Economic, Pragmatic, and Epistemic. We score conversations on two dimensions: Animal Welfare Value Stability (AWVS, primary) and Animal Welfare Moral Sensitivity (AWMS, diagnostic). We evaluate seven frontier models: Claude Opus 4.7, GPT-5.5, DeepSeek V4, Llama 3.3 70B, Mistral Small, Grok 4.3, and Gemini 3.1 Flash Lite. Multi-turn evaluation captures behavior single-turn benchmarks miss: 4 of 7 models change rank relative to Turn 1 scores, including Gemini Flash Lite, which drops from fifth on AWMS to last on AWVS. AWMS and AWVS are positively but imperfectly correlated, suggesting moral-recognition tests capture a stable but incomplete component of model behavior under pressure. MANTA also enables a species-by-pressure interaction matrix unavailable to prior benchmarks, showing welfare robustness depends jointly on the animal and pressure applied; companion animals score above wild animals, which score above farmed animals and invertebrates. We release the dataset, scripted pressure plans, judge prompts, and analysis code.

2605.15152 2026-06-04 cs.LG cs.AI 版本更新

Widening the Gap: Exploiting LLM Quantization via Outlier Injection

扩大差距：通过异常值注入利用LLM量化

Xiaohua Zhan, Kazuki Egashira, Robin Staab, Mark Vero, Martin Vechev

发表机构 * ETH Zurich（苏黎世联邦理工学院）

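AI summary: Proposes the first quantization-conditioned attack effective against several advanced quantization methods (AWQ, GPTQ, GGUF I-quants); injected outliers cause weight collapse, inducing malicious behavior after quantization.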

AI中文摘要

LLM量化已成为内存高效部署的关键。最近的研究表明，量化方案可能带来严重的安全风险：对手可以发布一个在全精度下看似良性，但在用户量化后表现出恶意行为的模型。然而，现有的量化条件攻击仅限于相对简单的量化方法，攻击者可以估计在目标量化下保持不变的权重区域。值得注意的是，先前的攻击始终未能攻破更流行和复杂的方案，限制了其实际影响。在这项工作中，我们提出了首个量化条件攻击，能够持续诱导出可由多种先进量化技术（包括AWQ、GPTQ和GGUF I-quants）触发的恶意行为。我们的攻击利用了现代量化方法共有的一个简单特性：大的异常值可能导致其他权重四舍五入为零。因此，通过向特定权重块注入异常值，对手可以诱导模型出现目标性的、可预测的权重塌缩。这种效应可用于制作看似良性的全精度模型，这些模型在量化后表现出广泛的恶意行为。通过在三种攻击场景和LLM上的广泛评估，我们表明我们的攻击在先前攻击失败的多种量化方法上实现了高成功率。我们的结果首次证明，量化的安全风险不仅限于更简单的方案，而是广泛存在于复杂、广泛使用的量化方法中。

英文摘要

LLM quantization has become essential for memory-efficient deployment. Recent work has shown that quantization schemes can pose critical security risks: an adversary may release a model that appears benign in full precision but exhibits malicious behavior once quantized by users. However, existing quantization-conditioned attacks have been limited to relatively simple quantization methods, where the attacker can estimate weight regions that remain invariant under the target quantization. Notably, prior attacks have consistently failed to compromise more popular and sophisticated schemes, limiting their practical impact. In this work, we introduce the first quantization-conditioned attack that consistently induces malicious behavior that can be triggered by a broad range of advanced quantization techniques, including AWQ, GPTQ, and GGUF I-quants. Our attack exploits a simple property shared by many modern quantization methods: large outliers can cause other weights to be rounded to zero. Consequently, by injecting outliers into specific weight blocks, an adversary can induce a targeted, predictable weight collapse in the model. This effect can be used to craft seemingly benign full-precision models that exhibit a wide range of malicious behaviors after quantization. Through extensive evaluation across three attack scenarios and LLMs, we show that our attack achieves high success rates against a broad range of quantization methods on which prior attacks fail. Our results demonstrate, for the first time, that the security risks of quantization are not restricted to simpler schemes but are broadly relevant across complex, widely-used quantization methods.
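The property the attack relies on, that a large outlier can force the other weights in a block to round to zero, is easy to reproduce with plain blockwise absmax round-to-nearest quantization. This toy sketch is not AWQ, GPTQ, or GGUF I-quants, and the block values and the `quantize_block_absmax` helper are made up for illustration.

```python
import numpy as np

def quantize_block_absmax(block, bits=4):
    """Symmetric round-to-nearest quantization of one weight block.

    Toy scheme for illustration only; real methods add calibration,
    grouping, and error compensation on top of this idea.
    """
    qmax = 2 ** (bits - 1) - 1              # 7 for signed 4-bit
    scale = np.max(np.abs(block)) / qmax    # one scale shared by the block
    if scale == 0:
        return np.zeros_like(block)
    q = np.clip(np.round(block / scale), -qmax, qmax)
    return q * scale                        # dequantized values

block = np.array([0.02, -0.03, 0.01, 0.04])
with_outlier = block.copy()
with_outlier[0] = 8.0                       # injected outlier

deq_clean = quantize_block_absmax(block)
deq_outlier = quantize_block_absmax(with_outlier)
```

In the clean block every weight keeps a nonzero quantized value. After the outlier is injected the shared scale grows by a factor of 200, and the three remaining weights all fall below half a quantization step, so they dequantize to zero.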

2605.13672 2026-06-04 cs.CV cs.AI cs.LG 版本更新

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

SpurAudio: 用于研究少样本音频分类中捷径学习的基准

Giries Abu Ayoub, Morad Tukan, Loay Mualem

发表机构 * Department of Computer Science, University of Haifa（海法大学计算机科学系）； Independent Researcher（独立研究者）； University of Stuttgart, Germany（斯图加特大学，德国）； IMPRS-IS, Germany（智能系统国际Max Planck研究学校，德国）

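AI summary: Proposes the SpurAudio benchmark, which controls the association between foreground and background in audio to evaluate how sensitive few-shot classification models are to spurious correlations, and finds that existing methods degrade markedly when the background changes.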

AI中文摘要

少样本分类（FSC）广泛用于从有限标注数据中学习，但大多数评估隐含假设目标概念与上下文线索无关。然而，在现实场景中，样本通常出现在丰富的上下文中，允许模型利用前景内容与背景信号之间的虚假相关性。虽然这种效应已在少样本图像分类中得到研究，但其在少样本音频分类中的作用仍基本未被探索，且现有音频基准对上下文结构的控制有限。我们引入了 SpurAudio，一个利用音频中前景事件和背景环境的自然可分离性，以支持对支持集和查询集之间的上下文偏移进行可控、多级评估的基准。使用该基准，我们表明许多最先进的少样本方法在背景相关性被破坏时遭受严重的性能下降，尽管在标准评估协议下达到相似的准确率。关键的是，即使在大型预训练音频基础模型中，这种脆弱性仍然存在，排除了骨干网络容量不足的解释。此外，在传统基准下看似相当的方法可能对虚假相关性表现出显著不同的敏感性，揭示了与特征表示在推理时如何与分类器头交互相关的系统性算法优势和脆弱性。这些发现为音频中少样本方法的行为提供了新的见解，并强调了在评估FSC模型时需要明确探测上下文依赖性的基准。

英文摘要

Few-shot classification (FSC) is widely used for learning from limited labeled data, yet most evaluations implicitly assume that target concepts are independent of contextual cues. In real-world settings, however, examples often appear within rich contexts, allowing models to exploit spurious correlations between foreground content and background signals. While such effects have been studied in few-shot image classification, their role in few-shot audio classification remains largely unexplored, and existing audio benchmarks offer limited control over contextual structure. We introduce SpurAudio, a benchmark that leverages the natural separability of foreground events and background environments in audio to enable controlled, multi-level evaluation of contextual shifts across support and query sets. Using this benchmark, we show that many state-of-the-art few-shot methods suffer severe performance degradation when background correlations are disrupted, despite achieving similar accuracy under standard evaluation protocols. Crucially, this vulnerability persists even in large pretrained audio foundation models, ruling out limited backbone capacity as an explanation. Moreover, methods that appear comparable under conventional benchmarks can exhibit markedly different sensitivity to spurious correlations, revealing systematic algorithmic strengths and vulnerabilities tied to how feature representations interact with classifier heads at inference time. These findings provide new insight into the behavior of few-shot methods in audio and highlight the need for benchmarks that explicitly probe context dependence when evaluating FSC models.

2605.00182 2026-06-04 cs.LG 版本更新

Towards A Generative Protein Evolution Machine with DPLM-Evo

迈向生成式蛋白质进化机器：DPLM-Evo

Xinyou Wang, Liang Hong, Jiasheng Ye, Zaixiang Zheng, Yu Li, Shujian Huang, Quanquan Gu

发表机构 * Nanjing University（南京大学）； CUHK（香港中文大学）； Fudan University（复旦大学）； ByteDance（字节跳动）

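AI summary: Proposes DPLM-Evo, an evolutionary discrete diffusion framework that explicitly models substitution, insertion, and deletion operations. It achieves state-of-the-art protein mutation effect prediction in the single-sequence setting and supports variable-length simulated evolution and protein editing/optimization.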

Comments A peer-reviewed version was accepted to ICML 2026

AI中文摘要

蛋白质在生物物理和功能约束下通过逐渐进化形成。蛋白质语言模型从大规模序列中学习丰富的进化约束，基于离散扩散的蛋白质语言模型（如DPLM）在理解和生成方面都很有前景。然而，现有的DPLM通常依赖于掩码扩散，这与一个简单的生物学直觉相矛盾：蛋白质通过累积的编辑进化，而不是从掩码中出现。因此，这些框架缺乏用于替换和插入/删除（indel）操作的显式预训练目标，限制了优化风格的后编辑和灵活的引导生成。为了解决这些限制，我们提出了DPLM-Evo，一种进化离散扩散框架，在去噪过程中显式预测替换、插入和删除操作。DPLM-Evo将上采样长度的潜在对齐空间与可变长度的观测序列空间解耦，使得indel感知生成变得可行。为了更好地将替换与真实进化对齐，我们进一步引入了一种上下文感知的进化噪声核，产生生物学信息丰富、上下文依赖的突变模式。在各种任务中，DPLM-Evo提升了序列理解能力，并在单序列设置下在ProteinGym上实现了最先进的突变效应预测性能。它还支持变长模拟进化，以及通过显式编辑轨迹对现有蛋白质进行后编辑/优化。

英文摘要

Proteins are shaped by gradual evolution under biophysical and functional constraints. Protein language models learn rich evolutionary constraints from large-scale sequences, and discrete diffusion-based protein language models (e.g., DPLMs) are promising for both understanding and generation. However, existing DPLMs typically rely on masked diffusion that contradicts a simple biological intuition: proteins evolve through accumulated edits, not by emerging from masks. Consequently, these frameworks lack explicit pretraining objectives for substitution and insertion/deletion (indel) operations, limiting both optimization-style post-editing and flexible guided generation. To address these limitations, we present DPLM-Evo, an evolutionary discrete diffusion framework that explicitly predicts substitution, insertion, and deletion operations during denoising. DPLM-Evo decouples an upsampled-length latent alignment space from the variable-length observed sequence space, which makes indel-aware generation tractable. To better align substitutions with real evolution, we further introduce a contextualized evolutionary noising kernel that produces biologically informed, context-dependent mutation patterns. Across tasks, DPLM-Evo improves sequence understanding and achieves state-of-the-art mutation effect prediction performance on ProteinGym in the single-sequence setting. It also enables variable-length simulated evolution, and post-editing/optimization of existing proteins via explicit edit trajectories.

2304.10891 2026-06-04 cs.LG cs.AI cs.CV cs.RO cs.SY eess.SY 版本更新

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey

基于Transformer的自动驾驶模型与面向部署的压缩：综述

Juan Zhong, Yuhang Shi, Zukang Xu, Xi Chen

发表机构 * Renmin University of China（中国人民大学）； Artificial Intelligence Innovation and Incubation Institute, Fudan University（复旦大学人工智能创新与孵化院）； Shanghai Academy of AI for Science（上海人工智能科学研究院）； Department of houmo.ai（houmo.ai部门）

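AI summary: Surveys Transformer-based autonomous driving models and analyzes, from a deployment perspective, how compression and acceleration strategies (such as quantization, pruning, and knowledge distillation) affect model design, deployability, robustness, and safety.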

AI中文摘要

基于Transformer的模型正成为自动驾驶的核心范式，因为它们能够捕捉感知、预测和规划中的长程空间依赖、多智能体交互和多模态上下文。然而，它们在真实车辆中的部署仍然困难，因为高容量注意力架构带来了显著的延迟、内存和能量开销。本综述回顾了具有代表性的基于Transformer的自动驾驶模型，并按任务角色、感知配置和架构设计进行组织。更重要的是，我们从面向部署的角度审视这些模型，分析效率约束如何在实际中重塑模型设计选择。我们进一步回顾了与基于Transformer的驾驶系统相关的压缩和加速策略，包括量化、剪枝、知识蒸馏、低秩近似和高效注意力，并讨论了它们的优势、局限性和任务依赖性。我们不将压缩视为孤立的后期处理步骤，而是强调其作为直接影响部署性、鲁棒性和安全性的系统级设计考虑。最后，我们指出了面向标准化、安全感知和硬件感知的高效自动驾驶系统评估的开放挑战和未来研究方向。

英文摘要

Transformer-based models are becoming a central paradigm in autonomous driving because they can capture long-range spatial dependencies, multi-agent interactions, and multimodal context across perception, prediction, and planning. At the same time, their deployment in real vehicles remains difficult because high-capacity attention-based architectures impose substantial latency, memory, and energy overhead. This survey reviews representative Transformer-based autonomous driving models and organizes them by task role, sensing configuration, and architectural design. More importantly, it examines these models from a deployment-oriented perspective and analyzes how efficiency constraints reshape model design choices in practice. We further review compression and acceleration strategies relevant to Transformer-based driving systems, including quantization, pruning, knowledge distillation, low-rank approximation, and efficient attention, and discuss their benefits, limitations, and task-dependent applicability. Rather than treating compression as an isolated post-processing step, we highlight it as a system-level design consideration that directly affects deployability, robustness, and safety. Finally, we identify open challenges and future research directions toward standardized, safety-aware, and hardware-conscious evaluation of efficient autonomous driving systems.

2602.02834 2026-06-04 cs.LG cs.AI 版本更新

What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA

什么结构归纳偏置帮助Transformer在知识图谱上进行推理？Tabula RASA研究

Jonas Petersen, Camilla Mazzoleni, Gian-Alessandro Lombardi, Federico Martelli, Riccardo Maggioni

发表机构 * ETH Zurich（苏黎世联邦理工学院）

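AI summary: Through ablation experiments with minimal Transformer modifications, finds that sparse adjacency masking is the main structural inductive bias driving multi-hop reasoning, while relation parameters contribute little.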

Comments Accepted at GFM, ICML 2026

2510.17281 2026-06-04 cs.LG cs.AI cs.IR 版本更新

MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems

MemoryBench：面向LLM系统的记忆与持续学习基准

Qingyao Ai, Yichen Tang, Changyue Wang, Jianming Long, Weihang Su, Yiqun Liu

发表机构 * Department of Computer Science and Technology, Tsinghua University, Beijing, China（清华大学计算机科学与技术系）

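AI summary: Proposes a user feedback simulation framework and MemoryBench, a comprehensive benchmark spanning multiple domains, languages, and task types, to evaluate how well LLM systems learn continually from accumulated user feedback; experiments show that existing methods fall short in both effectiveness and efficiency.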

AI中文摘要

扩展数据、参数和测试时计算一直是改进LLM系统（LLMsys）的主流方法，但由于高质量数据的逐渐枯竭以及更大计算资源消耗带来的边际收益，这些方法的性能上限已几乎达到。受人类和传统AI系统从实践中学习能力的启发，为LLMsys构建记忆和持续学习框架已成为近期文献中一个重要且热门的研究方向。然而，现有的LLM记忆基准通常侧重于评估系统在长文本输入的同质阅读理解任务上的表现，而非测试其在服务时间内从累积用户反馈中学习的能力。因此，我们提出了一个用户反馈模拟框架和一个涵盖多个领域、语言和任务类型的综合基准，以评估LLMsys的持续学习能力。实验表明，最先进的基线方法在有效性和效率上远未令人满意，我们希望这一基准能为未来LLM记忆和优化算法的研究铺平道路。

英文摘要

Scaling up data, parameters, and test-time computation has been the mainstream approach to improving LLM systems (LLMsys), but its upper bound has almost been reached due to the gradual depletion of high-quality data and the marginal gains obtained from larger computational resource consumption. Inspired by the ability of humans and traditional AI systems to learn from practice, constructing memory and continual learning frameworks for LLMsys has become an important and popular research direction in recent literature. Yet, existing benchmarks for LLM memory often focus on evaluating the system on homogeneous reading comprehension tasks with long-form inputs rather than testing its ability to learn from accumulated user feedback in service time. Therefore, we propose a user feedback simulation framework and a comprehensive benchmark covering multiple domains, languages, and types of tasks to evaluate the continual learning abilities of LLMsys. Experiments show that the effectiveness and efficiency of state-of-the-art baselines are far from satisfactory, and we hope this benchmark could pave the way for future studies on LLM memory and optimization algorithms. Website: https://memorybench.thuir.cn Code: https://github.com/THUIR/MemoryBench Data: https://huggingface.co/datasets/THUIR/MemoryBench Data-Full: https://huggingface.co/datasets/THUIR/MemoryBench-Full

2506.01250 2026-06-04 cs.LG stat.ML 版本更新

Neural Variance-aware Dueling Bandits with Deep Representation and Shallow Exploration

神经方差感知的深度表示与浅层探索的对抗性老虎机

Youngmin Oh, Jinje Park, Taejin Paik

发表机构 * InfiniTree Samsung Electronics（InfiniTree三星电子）； Samsung Electronics（三星电子）

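AI summary: Proposes the first variance-aware contextual dueling bandit algorithm, combining shallow exploration with neural-network approximation of nonlinear utilities; iterative self-improvement and spectral analysis reduce the required network width from $\tilde{\Omega}(T^{14})$ to $\tilde{\Omega}(T^{6})$ while achieving sublinear regret.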

Comments Accepted at AISTATS 2026; code at https://github.com/youngmin0oh/NVLDB-AISTATS2026

AI中文摘要

时间序列预测作为推理：一种基于强化LLM的慢思考方法

Yitong Zhou, Yucong Luo, Mingyue Cheng, Qi Liu, Jiahao Wang, Daoyu Wang, Enhong Chen

发表机构 * State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China（认知智能国家重点实验室，中国科学技术大学）

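AI summary: Proposes the Time-R1 framework, which trains LLMs for multi-step reasoning through two-stage reinforcement fine-tuning (supervised fine-tuning followed by reinforcement learning) to improve time series forecasting accuracy.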

AI中文摘要

为了推进时间序列预测（TSF），人们提出了各种方法来提高预测精度，从统计技术发展到数据驱动的深度学习架构。尽管这些方法有效，但大多数现有方法仍然遵循快速思考范式——依赖提取历史模式并将其映射到未来值作为核心建模理念，缺乏包含中间时间序列推理的显式思考过程。与此同时，新兴的慢思考LLM（如OpenAI-o1）展示了显著的多步推理能力，为克服这些问题提供了替代途径。然而，仅靠提示工程存在若干局限性——包括高计算成本、隐私风险以及领域特定时间序列深度推理能力有限。为了解决这些局限性，更有前景的方法是训练LLM发展慢思考能力并获得强大的时间序列推理技能。为此，我们提出了Time-R1，一个两阶段强化微调框架，旨在增强LLM用于时间序列预测的多步推理能力。具体来说，第一阶段进行监督微调以进行预热适应，而第二阶段采用强化学习来提高模型的泛化能力。特别地，我们专门为时间序列预测设计了一个细粒度的多目标奖励，然后引入了GRIP（基于组的相对重要性策略优化），它利用非均匀采样进一步鼓励和优化模型对有效推理路径的探索。实验表明，Time-R1在多种数据集上显著提高了预测性能。

英文摘要

To advance time series forecasting (TSF), various methods have been proposed to improve prediction accuracy, evolving from statistical techniques to data-driven deep learning architectures. Despite their effectiveness, most existing methods still adhere to a fast-thinking paradigm, relying on extracting historical patterns and mapping them to future values as their core modeling philosophy and lacking an explicit thinking process that incorporates intermediate time series reasoning. Meanwhile, emerging slow-thinking LLMs (e.g., OpenAI-o1) have shown remarkable multi-step reasoning capabilities, offering an alternative way to overcome these issues. However, prompt engineering alone presents several limitations, including high computational cost, privacy risks, and limited capacity for in-depth domain-specific time series reasoning. To address these limitations, a more promising approach is to train LLMs to develop slow-thinking capabilities and acquire strong time series reasoning skills. For this purpose, we propose Time-R1, a two-stage reinforcement fine-tuning framework designed to enhance the multi-step reasoning ability of LLMs for time series forecasting. Specifically, the first stage conducts supervised fine-tuning for warmup adaptation, while the second stage employs reinforcement learning to improve the model's generalization ability. In particular, we design a fine-grained multi-objective reward specifically for time series forecasting, and then introduce GRIP (group-based relative importance for policy optimization), which leverages non-uniform sampling to further encourage and optimize the model's exploration of effective reasoning paths. Experiments demonstrate that Time-R1 significantly improves forecast performance across diverse datasets.

2502.00944 2026-06-04 cs.LG 版本更新

Training speedups via batching for geometric learning: an analysis of static and dynamic algorithms

通过批处理实现几何学习训练加速：静态与动态算法分析

Daniel T. Speckhard, Tim Bechtel, Sebastian Kehl, Jonathan Godwin, Claudia Draxl

发表机构 * Humboldt-Universität zu Berlin（洪堡-柏林大学）； Max Planck Institute for Solid State Research（马克斯·普朗克固态研究所）； Max Planck Computing and Data Facility（马克斯·普朗克计算与数据设施）； Orbital Materials（Orbital Materials公司）

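AI summary: Analyzes how static and dynamic batching algorithms for graph neural networks affect training speed and model performance; experiments show that the choice of algorithm can yield up to a 2.7x speedup, but the best algorithm depends on the data, model, batch size, hardware, and number of training steps.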

Journal ref: Transactions on Machine Learning Research (3/2026)

AI中文摘要

图神经网络（GNN）在材料科学、化学和社会科学等多个领域展现出有前景的结果。GNN模型通常包含数百万个参数，与其他神经网络（NN）模型一样，通常仅以批次方式输入训练数据集中的一部分图来更新模型参数。批处理算法对训练时间和模型性能的影响已在NN中得到了深入探索，但在GNN中尚未进行。我们分析了两种不同的基于图的模型批处理算法，即针对两个数据集（小分子QM9数据集和AFLOW材料数据库）的静态和动态批处理。我们的实验表明，更改批处理算法可提供高达2.7倍的加速，但最快的算法取决于数据、模型、批大小、硬件和运行的训练步数。实验表明，对于批大小、数据集和模型的某些组合，静态和动态批处理算法之间的模型学习指标存在显著差异。

英文摘要

Graph neural networks (GNNs) have shown promising results for several domains such as materials science, chemistry, and the social sciences. GNN models often contain millions of parameters and, like other neural network (NN) models, are typically updated on batches containing only a fraction of the graphs in the training dataset. The effect of batching algorithms on training time and model performance has been thoroughly explored for NNs but not yet for GNNs. We analyze two different batching algorithms for graph-based models, namely static and dynamic batching, on two datasets: the QM9 dataset of small molecules and the AFLOW materials database. Our experiments show that changing the batching algorithm can provide up to a 2.7x speedup, but the fastest algorithm depends on the data, model, batch size, hardware, and number of training steps run. Experiments also show that for a select number of combinations of batch size, dataset, and model, significant differences in model learning metrics are observed between static and dynamic batching algorithms.
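The abstract does not define the two algorithms, so the sketch below rests on a common reading that should be treated as an assumption: static batching always groups a fixed number of graphs (padding to a fixed size is implied but not shown), while dynamic batching keeps adding graphs until a node budget would be exceeded. The function names and the `max_nodes` budget are hypothetical.

```python
from typing import Iterator, List, Sequence

def static_batches(node_counts: Sequence[int],
                   graphs_per_batch: int) -> Iterator[List[int]]:
    """Group graph indices into batches with a fixed number of graphs."""
    batch: List[int] = []
    for i in range(len(node_counts)):
        batch.append(i)
        if len(batch) == graphs_per_batch:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller batch

def dynamic_batches(node_counts: Sequence[int],
                    max_nodes: int) -> Iterator[List[int]]:
    """Group graph indices greedily until the node budget would overflow."""
    batch: List[int] = []
    used = 0
    for i, n in enumerate(node_counts):
        if n > max_nodes:
            raise ValueError(f"graph {i} has {n} nodes, budget is {max_nodes}")
        if used + n > max_nodes:
            yield batch
            batch, used = [], 0
        batch.append(i)
        used += n
    if batch:
        yield batch

node_counts = [3, 9, 4, 2, 8]
static = list(static_batches(node_counts, graphs_per_batch=2))
dynamic = list(dynamic_batches(node_counts, max_nodes=14))
```

Under this reading, static batches have predictable shapes but waste compute on padding when graph sizes vary, whereas dynamic batches fill the budget more tightly at the cost of a variable number of graphs per step.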

2602.14757 2026-06-04 math.NA cs.LG cs.NA 版本更新

Solving Inverse Parametrized Problems via Finite Elements and Extreme Learning Networks

通过有限元和极限学习网络求解反参数化问题

Erik Burman, Mats G. Larson, Karl Larsson, Jonatan Vallin

发表机构 * KTH Royal Institute of Technology（皇家理工学院）； Uppsala University（乌普萨拉大学）

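AI summary: Proposes an interpolation-based modeling framework that combines finite element discretization with extreme learning machine surrogates for parameter-dependent partial differential equations arising in control, inverse problems, and uncertainty quantification; applied to quantitative photoacoustic tomography, it saves computation while preserving accuracy.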

DOI: 10.1016/j.cma.2026.119077
Journal ref: Comput. Methods Appl. Mech. Engrg. 460 (2026), Paper No. 119077

AI中文摘要

我们开发了一种基于插值的建模框架，用于控制、反问题和不确定性量化中出现的参数依赖偏微分方程。在物理域中使用有限元方法对解进行离散化，同时单独近似对有限维参数的依赖。我们建立了参数解的存在性、唯一性和正则性，并推导了严格的误差估计，明确量化了空间离散化和参数逼近之间的相互作用。在低维参数空间中，经典插值方案基于参数变量的Sobolev正则性产生代数收敛速度。在高维参数空间中，我们用极限学习机（ELM）代理替换经典插值，并在显式逼近和稳定性假设下获得误差界。该框架应用于定量光声层析成像中的反问题，我们推导了势和参数重建误差估计，并证明了与标准方法相比，在不牺牲精度的情况下显著节省了计算量。

英文摘要

We develop an interpolation-based modeling framework for parameter-dependent partial differential equations arising in control, inverse problems, and uncertainty quantification. The solution is discretized in the physical domain using finite element methods, while the dependence on a finite-dimensional parameter is approximated separately. We establish existence, uniqueness, and regularity of the parametric solution and derive rigorous error estimates that explicitly quantify the interplay between spatial discretization and parameter approximation. In low-dimensional parameter spaces, classical interpolation schemes yield algebraic convergence rates based on Sobolev regularity in the parameter variable. In higher-dimensional parameter spaces, we replace classical interpolation by extreme learning machine (ELM) surrogates and obtain error bounds under explicit approximation and stability assumptions. The proposed framework is applied to inverse problems in quantitative photoacoustic tomography, where we derive potential and parameter reconstruction error estimates and demonstrate substantial computational savings compared to standard approaches, without sacrificing accuracy.
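An extreme learning machine in its basic form is a single hidden layer with random, frozen weights and a linear readout fitted by least squares. The sketch below shows only that basic construction on a scalar toy target; `fit_elm`, the `tanh` activation, and the hidden width are illustrative choices, and the coupling to finite element coefficients described in the abstract is not reproduced.

```python
import numpy as np

def fit_elm(X, y, n_hidden=50, seed=0):
    """Fit a basic extreme learning machine and return a predictor.

    X: (n_samples, n_params) parameter samples; y: (n_samples,) targets.
    """
    rng = np.random.default_rng(seed)
    # Hidden-layer weights and biases are drawn once and never trained.
    W = rng.normal(size=(X.shape[1], n_hidden))
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)
    # Only the linear readout is fitted, as a least-squares problem.
    beta, *_ = np.linalg.lstsq(H, y, rcond=None)
    return lambda X_new: np.tanh(X_new @ W + b) @ beta

# Toy usage: learn a smooth map from a one-dimensional parameter.
X_train = np.linspace(-1.0, 1.0, 200)[:, None]
y_train = np.sin(3.0 * X_train[:, 0])
surrogate = fit_elm(X_train, y_train)
```

Since training reduces to one linear solve, the surrogate is cheap to refit, which is the usual motivation for ELMs when the parameter space is too high-dimensional for tensor-product interpolation.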

2604.14575 2026-06-04 cs.LG cs.AI stat.ME stat.ML 版本更新

Generative Augmented Inference

生成式增强推断

Cheng Lu, Mengxin Wang, Dennis J. Zhang, Heng Zhang

发表机构 * University of California, Berkeley（加州大学伯克利分校）； Stanford University（斯坦福大学）； University of Toronto（多伦多大学）

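AI summary: Proposes the Generative Augmented Inference (GAI) framework, which treats AI outputs as high-dimensional informative features for learning true labels rather than as proxies and models them nonparametrically, enabling consistent estimation and valid inference from combined human and AI data; under random labeling, its asymptotic efficiency is strictly better than using human data alone.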

AI中文摘要

大型语言模型使得廉价的AI生成标注成为可能，但如何可靠地将其用于因果推断仍然具有挑战性。简单地将AI和人类数据混合会引入偏差，而现有方法如预测驱动推断（PPI；Angelopoulos et al., 2023a）将AI输出视为真实标签的代理——这一假设在实践中常被生成模型输出所违背。我们提出生成式增强推断（GAI），一个将AI输出视为学习人类标签的一般性、潜在高维信息特征而非替代品的框架。GAI使用非参数方法灵活建模这种关系，从而能够从人类和AI的联合数据中进行一致估计和有效推断。我们建立了渐近正态性，并证明在随机标注下，只要AI输出对真实标签具有信息量，GAI在渐近效率上严格优于仅使用人类数据的估计。在真实数据集上的实证研究表明，与仅使用人类数据和基于PPI的估计相比，GAI在多种生成数据源上显著降低了估计误差并提高了置信区间质量。

英文摘要

Large language models enable inexpensive AI-generated annotations, but using them reliably for causal inference remains challenging. Naively pooling AI and human data induces bias, while existing methods such as Prediction-Powered Inference (PPI; Angelopoulos et al., 2023a) treat AI outputs as proxies of true labels, an assumption often violated for generative model outputs in practice. We propose Generative Augmented Inference (GAI), a framework that treats AI outputs as general, potentially high-dimensional informative features for learning human labels rather than as surrogates. GAI flexibly models this relationship using nonparametric methods, enabling consistent estimation and valid inference from combined human and AI data. We establish asymptotic normality and show that, under random labeling, GAI strictly improves asymptotic efficiency over human-data-only estimation whenever AI outputs are informative for true labels. Empirical studies on real-world datasets demonstrate that GAI significantly reduces estimation error and improves confidence interval quality across diverse generative data sources relative to human-only and PPI-based estimation.
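For context, the PPI baseline named above can be sketched for the simplest target, a population mean: average the AI predictions on the large unlabeled set, then correct by the mean prediction error measured on the small human-labeled set. The function and argument names are hypothetical, and GAI itself, which the abstract describes only at a high level, is not implemented here.

```python
import numpy as np

def ppi_mean(y_labeled, f_labeled, f_unlabeled):
    """Prediction-powered point estimate of E[Y].

    y_labeled:   human labels on the small labeled set
    f_labeled:   AI predictions on that same labeled set
    f_unlabeled: AI predictions on the large unlabeled set
    """
    # The "rectifier" estimates the AI's average error from labeled data.
    rectifier = np.mean(np.asarray(y_labeled) - np.asarray(f_labeled))
    return float(np.mean(f_unlabeled) + rectifier)

# Toy usage: the AI overshoots every label by 0.5, and the rectifier removes it.
estimate = ppi_mean([1.0, 2.0, 3.0], [1.5, 2.5, 3.5], [2.5, 2.5])
```

The correction only uses the AI output as a stand-in for the label itself, which is the proxy assumption the abstract argues can fail for generative outputs.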


2407.00809 2026-06-04 cs.LG cs.NA math.NA (updated version)

Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning

Matthew Lowery, John Turnage, Zachary Morrow, John D. Jakeman, Akil Narayan, Shandian Zhe, Varun Shankar

Affiliations: Kahlert School of Computing; University of Utah; Department of Mathematics; Sandia National Laboratories; Scientific Machine Learning; Scientific Computing and Imaging (SCI) Institute

AI summary: Proposes the Kernel Neural Operator (KNO), which performs operator learning by composing deep kernel integral operators. It is provably convergent, memory-efficient, and geometrically flexible, and reaches comparable or higher accuracy on benchmarks with fewer parameters.

Comments: 14 pages + 15 page appendix, 7 figures

Journal ref: Transactions on Machine Learning Research, ISSN 2835-8856, 2026

Abstract

This paper introduces the Kernel Neural Operator (KNO), a provably convergent operator-learning architecture that utilizes compositions of deep kernel-based integral operators for function-space approximation of operators (maps from functions to functions). The KNO decouples the choice of kernel from the numerical integration scheme (quadrature), thereby naturally allowing for operator learning with explicitly-chosen trainable kernels on irregular geometries. On irregular domains, this allows the KNO to utilize domain-specific quadrature rules. To help ameliorate the curse of dimensionality, we also leverage an efficient dimension-wise factorization algorithm on regular domains. More importantly, the ability to explicitly specify kernels also allows the use of highly expressive, non-stationary, neural anisotropic kernels whose parameters are computed by training neural networks. We present universal approximation theorems showing that both the continuous and fully discretized KNO are universal approximators on operator learning problems. Numerical results demonstrate that on existing benchmarks the training and test accuracy of KNOs is closely comparable to or higher than that of popular neural operators while typically using an order of magnitude fewer trainable parameters, with the more expressive kernels proving important to attaining high accuracy. KNOs thus facilitate low-memory, geometrically-flexible, deep operator learning, while retaining the implementation simplicity and transparency of traditional kernel methods from both scientific computing and machine learning.
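
The decoupling of kernel and quadrature described in the abstract can be illustrated with a single discretized kernel integral layer, (Ku)(x) ≈ Σ_j w_j k(x, y_j) u(y_j). The sketch below is only an illustration under stated assumptions: the Gaussian kernel, its `length_scale`, the trapezoid rule on [0, 1], and all function names are hypothetical choices made here, not the paper's trainable neural kernels or its quadrature rules.

```python
import math

def trapezoid_rule(n, a=0.0, b=1.0):
    """Nodes and weights of the composite trapezoid rule on [a, b] (n >= 2 nodes)."""
    h = (b - a) / (n - 1)
    nodes = [a + i * h for i in range(n)]
    weights = [h] * n
    weights[0] = weights[-1] = h / 2
    return nodes, weights

def gaussian_kernel(x, y, length_scale=0.2):
    """Hypothetical stand-in kernel; a KNO would use a trainable kernel here."""
    return math.exp(-((x - y) ** 2) / (2 * length_scale ** 2))

def kernel_integral_layer(u_vals, nodes, weights, kernel, eval_points):
    """Approximate (K u)(x) = integral of k(x, y) u(y) dy by a weighted sum over nodes."""
    return [
        sum(w * kernel(x, y) * u for y, w, u in zip(nodes, weights, u_vals))
        for x in eval_points
    ]

# The quadrature (nodes, weights) and the kernel are passed in independently.
nodes, weights = trapezoid_rule(11)
u_vals = [math.sin(math.pi * y) for y in nodes]
output = kernel_integral_layer(u_vals, nodes, weights, gaussian_kernel, [0.25, 0.5])
```

Because the nodes and weights are ordinary arguments, a domain-specific rule for an irregular geometry could in principle be swapped in without touching the kernel code, which is the property the abstract emphasizes.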


2604.11510 2026-06-04 cs.CL cs.AI cs.LG (updated version)

Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization

Jiashu Yao, Heyan Huang, Daiqing Wu, Zeming Liu, Yuhang Guo

Affiliations: Beijing Institute of Technology; Tsinghua University; Beihang University

AI summary: Proposes Policy Split, which splits the policy into a normal mode and a high-entropy mode and uses collaborative dual-mode entropy regularization to encourage diverse exploration while preserving accuracy. Experiments show it outperforms existing baselines on general and creative tasks.

Comments: preprint

2604.08564 2026-06-04 cs.CL cs.LG (updated version)

Attention-Based Sampler for Diffusion Language Models

Yuyan Zhou, Kai Syun Hou, Weiyu Chen, James Kwok

Affiliations: Department of Computer Science and Engineering, The Hong Kong University of Science and Technology

AI summary: Addresses the neglect of global sequence structure in diffusion language model sampling by proposing a sampling-order optimization based on attention-matrix column sums, enabling training-free, high-quality parallel sampling.

Abstract

Auto-regressive models (ARMs) have established a dominant paradigm in language modeling. However, their strictly sequential sampling paradigm imposes fundamental constraints on both inference efficiency and modeling flexibility. To address these limitations, diffusion-based large language models (dLLMs) have been proposed, offering the potential for parallel sampling and flexible language modeling. Despite these advantages, current dLLM sampling strategies rely primarily on token-level information, which fails to account for global sequence structure and often yields suboptimal results. In this paper, we study the sampling order selection problem from the perspective of log-likelihood maximization. We show that this problem is NP-hard and propose an optimal sampling-rank-based approximation that makes the objective computationally tractable. We further prove that the tractable objective is optimized by sampling tokens in descending order of their attention-matrix column sums. This finding provides a principled justification for attention-guided sampling and offers a theoretically grounded alternative to greedy search. We instantiate this theoretical insight in a new training-free sampling algorithm, termed Attn-Sampler, and further propose dynamic attention thresholding for practical acceleration. Extensive experiments across multiple benchmarks validate the effectiveness of our proposed method, demonstrating that it achieves superior generation quality while enhancing sampling parallelism.
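
The ordering rule stated in the abstract (sample tokens in descending order of attention-matrix column sums) can be sketched in a few lines. This is a minimal illustration, not the Attn-Sampler implementation: the function names, the toy matrix, and the tie-breaking by position index are assumptions made here, and the abstract does not say which layer or head the attention matrix comes from or how dynamic thresholding is applied.

```python
def column_sums(attn):
    """Sum each column of a row-major attention matrix (rows = queries, columns = keys)."""
    n_cols = len(attn[0])
    return [sum(row[j] for row in attn) for j in range(n_cols)]

def sampling_order(attn, masked_positions):
    """Return the masked positions sorted by descending attention column sum.

    Ties are broken by position index, which is an assumption of this sketch.
    """
    sums = column_sums(attn)
    return sorted(masked_positions, key=lambda j: (-sums[j], j))

# Toy 3x3 attention matrix; each row sums to 1, as a softmax output would.
toy_attn = [
    [0.1, 0.6, 0.3],
    [0.2, 0.5, 0.3],
    [0.3, 0.3, 0.4],
]
order = sampling_order(toy_attn, [0, 1, 2])
```

In this toy case position 1 receives the most total attention, so it would be sampled first, followed by position 2 and then position 0.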


2603.21180 2026-06-04 cs.LG stat.CO stat.ME stat.ML (updated version)

ALMAB-DC: Active Learning, Multi-Armed Bandits, and Distributed Computing for Sequential Experimental Design and Black-Box Optimization

Foo Hui-Mean, Yuan-chin I Chang

Affiliations: Institute of Statistical Science, Academia Sinica

AI summary: Proposes the ALMAB-DC framework, which combines a Gaussian process surrogate, multi-armed bandit control, and asynchronous distributed scheduling for expensive black-box optimization, and significantly outperforms existing methods on several benchmarks.

Comments: 33 pages and 13 figures

Abstract

Sequential experimental design under expensive, gradient-free objectives is a central challenge in computational statistics: evaluation budgets are tightly constrained and information must be extracted efficiently from each observation. We propose \textbf{ALMAB-DC}, a GP-based sequential design framework combining active learning, multi-armed bandits (MAB), and distributed asynchronous computing for expensive black-box experimentation. A Gaussian process surrogate with uncertainty-aware acquisition identifies informative query points; a UCB or Thompson-sampling bandit controller allocates evaluations across parallel workers; and an asynchronous scheduler handles heterogeneous runtimes. We present cumulative regret bounds for the bandit components and characterize parallel scalability via Amdahl's Law. We validate ALMAB-DC on five benchmarks. On the two statistical experimental-design tasks, ALMAB-DC achieves lower simple regret than Equal Spacing, Random, and D-optimal designs in dose--response optimization, and in adaptive spatial field estimation matches the Greedy Max-Variance benchmark while outperforming Latin Hypercube Sampling; at $K=4$ the distributed setting reaches target performance in one-quarter of sequential wall-clock rounds. On three ML/engineering tasks (CIFAR-10 HPO, CFD drag minimization, MuJoCo RL), ALMAB-DC achieves 93.4\% CIFAR-10 accuracy (outperforming BOHB by 1.7\,pp and Optuna by 1.1\,pp), reduces airfoil drag to $C_D = 0.059$ (36.9\% below Grid Search), and improves RL return by 50\% over Grid Search. All advantages over non-ALMAB baselines are statistically significant under Bonferroni-corrected Mann--Whitney $U$ tests. Distributed execution achieves $7.5\times$ speedup at $K = 16$ agents, consistent with Amdahl's Law.
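
The abstract relates the reported 7.5x speedup at K = 16 to Amdahl's Law, S(K) = 1 / ((1 - p) + p / K), where p is the parallelizable fraction of the work. The sketch below inverts that formula to see which p such a speedup would imply. The resulting value of p is a back-calculation made here for illustration; it is not a figure reported in the abstract, and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction, workers):
    """Amdahl's Law: speedup with `workers` parallel workers."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)

def implied_parallel_fraction(speedup, workers):
    """Invert Amdahl's Law for the parallel fraction p (requires workers > 1)."""
    if workers <= 1:
        raise ValueError("workers must be greater than 1")
    return (1.0 - 1.0 / speedup) * workers / (workers - 1)

# Speedup and worker count taken from the abstract; p is derived, not reported.
p = implied_parallel_fraction(7.5, 16)
ceiling = 1.0 / (1.0 - p)  # limiting speedup as the number of workers grows
```

Under this reading, roughly 92% of the workload would be parallelizable, and the speedup could never exceed about 13x regardless of how many workers are added.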


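2604.08438 2026-06-04 cs.LG (updated version)

Orthogonal Learners for Estimating Heterogeneous Long-Term Treatment Effects

Haorui Ma, Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

Affiliations: AI in Management, LMU Munich; Munich Center for Machine Learning; Munich, Germany

AI summary: Proposes LT-O-learners, which retarget the loss function via custom overlap weights to address unstable estimation of heterogeneous long-term treatment effects in low-overlap regimes, and proves Neyman orthogonality and robustness to nuisance estimation error.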

Abstract

Estimation of heterogeneous long-term treatment effects (HLTEs) is relevant for personalized decision-making in marketing, economics, and medicine, where short-term observational datasets are often combined with long-term observational datasets. However, HLTE estimation is challenging due to limited overlap in treatment assignments or in long-term outcomes for certain subpopulations, which can lead to unstable HLTE estimates with large finite-sample variance. To address this challenge, we introduce the LT-O-learners (Long-Term Orthogonal Learners), a set of novel orthogonal learners for HLTE estimation in the canonical HLTE setting with surrogacy. The key idea of our LT-O-learners is to retarget the loss via custom overlap weights that downweight low-overlap samples. We show that the retargeted loss recovers the true HLTE pointwise and satisfies Neyman-orthogonality. We further prove two key theoretical results: (i) The nuisance error enters the error bound only through higher-order terms, which means our learners are robust to nuisance estimation error. (ii) Under a linear function class, the retargeting effectively controls the asymptotic variance of the HLTE estimator via the overlap weights in low-overlap regimes. We conduct experiments on synthetic and real-world datasets to confirm the theoretical properties of our LT-O-learners, particularly robustness in low-overlap regimes. To our knowledge, ours are the first orthogonal learners for HLTE estimation robust to low overlap in long-term settings.


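2603.28762 2026-06-04 cs.CV cs.AI cs.GR cs.LG (updated version)

On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers

Omer Dahary, Benaya Koren, Daniel Garibi, Daniel Cohen-Or

Affiliations: Tel Aviv University; Snap Research Israel

AI summary: Addresses the limited diversity of text-to-image diffusion models by applying on-the-fly repulsion in the Contextual Space of Diffusion Transformers through the multimodal attention channels. This significantly increases generation diversity without sacrificing visual fidelity or semantic adherence, adds little computational overhead, and works with modern Turbo and distilled models.

Comments: SIGGRAPH 2026. Project page: https://contextual-repulsion.github.io/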

Abstract

Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer from a significant lack of variety, converging on a narrow set of visual solutions for any given prompt. This typicality bias presents a challenge for creative applications that require a wide range of generative outcomes. We identify a fundamental trade-off in current approaches to diversity: modifying model inputs requires costly optimization to incorporate feedback from the generative path. In contrast, acting on spatially-committed intermediate latents tends to disrupt the forming visual structure, leading to artifacts. In this work, we propose to apply repulsion in the Contextual Space as a novel framework for achieving rich diversity in Diffusion Transformers. By intervening in the multimodal attention channels, we apply on-the-fly repulsion during the transformer's forward pass, injecting the intervention between blocks where text conditioning is enriched with emergent image structure. This allows for redirecting the guidance trajectory after it is structurally informed but before the composition is fixed. Our results demonstrate that repulsion in the Contextual Space produces significantly richer diversity without sacrificing visual fidelity or semantic adherence. Furthermore, our method is uniquely efficient, imposing a small computational overhead while remaining effective even in modern "Turbo" and distilled models where traditional trajectory-based interventions typically fail.


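2601.04051 2026-06-04 cs.LG (updated version)

Symbolic Regression for Shared Expressions: Introducing Partial Parameter Sharing

Viktor Martinek, Roland Herzog

Affiliations: Interdisciplinary Center for Scientific Computing, Heidelberg University

AI summary: Proposes a symbolic regression method that handles multiple categorical variables through partial parameter sharing, separating universal effects, category-specific trends, and category interactions, and evaluates its ability to reduce data requirements and support transfer learning on synthetic data and an astrophysics dataset.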

Abstract

Symbolic regression aims to find symbolic expressions that describe datasets. Due to its inherent interpretability, symbolic regression (SR) is a powerful paradigm for scientific discovery. Recent advances have expanded SR to describe related phenomena using a single expression with varying sets of parameters, thereby introducing a single categorical variable. To illustrate, this enables the search for a single expression describing temperature-dependent viscosity across multiple fluids, while simultaneously identifying a distinct set of fluid-specific parameters. We expand upon prior efforts by considering multiple categorical variables and introducing intermediate levels of parameter sharing. Rather than parameters being either entirely universal or entirely unique, some parameters can also be shared across specific categories while remaining distinct for others. This allows for separating universal effects (shared parameters), category-specific trends (partially-shared parameters), and category interactions (non-shared parameters). We test the limits of this setup in terms of reducing data requirements and transfer learning using a synthetic, fitting-only example. Furthermore, we apply the method to an astrophysics dataset also used in a previous single-category study. In comparison, we achieve similar fit quality with significantly fewer parameters while extracting additional information about the problem.


2510.21459 2026-06-04 cs.CR cs.CL cs.LG (updated version)

SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots

Adetayo Adebimpe, Helmut Neukirchen, Thomas Welsh

Affiliations: University of California, Berkeley

AI summary: Proposes the SBASH framework, which builds honeypots from lightweight local LLMs and RAG, and uses several metrics to evaluate how RAG and prompt tuning affect the realism and response latency of LLM honeypots.

Comments: to be published in: The 3rd International Conference on Foundation and Large Language Models (FLLM2025), IEEE, 2025

DOI: 10.1109/FLLM67465.2025.11391242
Journal ref: 2025 3rd International Conference on Foundation and Large Language Models (FLLM), IEEE, 2025

Abstract

Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. Maximising attacker engagement is essential to their utility. However, research has highlighted that context-awareness, such as the ability to respond to new attack types, systems and attacker agents, is necessary to increase engagement. Large Language Models (LLMs) have been shown to be one approach to increasing context awareness but suffer from several challenges, including the accuracy and timeliness of responses, high operational costs, and data-protection issues due to cloud deployment. We propose the System-Based Attention Shell Honeypot (SBASH) framework, which manages data-protection issues through the use of lightweight local LLMs. We investigate the use of Retrieval Augmented Generation (RAG) supported LLMs and non-RAG LLMs for Linux shell commands and evaluate them using several different metrics such as response time differences, realism from human testers, and similarity to a real system calculated with Levenshtein distance, SBert, and BertScore. We show that RAG improves accuracy for untuned models, while models tuned via a system prompt that tells the LLM to respond like a Linux system achieve, without RAG, accuracy similar to that of untuned models with RAG, at slightly lower latency.
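
One of the similarity metrics named in the abstract, Levenshtein distance, can be sketched as follows for comparing a honeypot's response with the output of a real shell. This is the standard dynamic-programming algorithm, not code from the SBASH framework. The length-normalized similarity score and the example strings are assumptions made here for illustration.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (insert, delete, substitute each cost 1)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]

def normalized_similarity(a, b):
    """1.0 for identical strings, 0.0 when every character differs (an assumed normalization)."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - levenshtein(a, b) / longest

# Hypothetical outputs for the same command from a real system and from an LLM honeypot.
real_output = "root"
honeypot_output = "root\n"
score = normalized_similarity(real_output, honeypot_output)
```

A character-level metric like this penalizes harmless formatting differences such as a trailing newline, which may be one reason the abstract pairs it with embedding-based measures such as SBert and BertScore.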


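2603.19005 2026-06-04 cs.LG cs.AI stat.ME (updated version)

Sparse Bayesian Deep Functional Learning with Structured Region Selection

Xiaoxian Zhu, Yingmeng Li, Shuangge Ma, Mengyun Wu

Affiliations: School of Statistics and Data Science, Shanghai University of Finance and Economics; Department of Biostatistics, Yale School of Public Health

AI summary: Proposes a sparse Bayesian functional deep neural network (sBayFDNN) that learns adaptive functional embeddings through a deep Bayesian architecture to capture complex nonlinear relationships and uses a structured prior for interpretable region selection with quantified uncertainty. It provides the first rigorous guarantees for a Bayesian deep functional model: approximation error bounds, posterior consistency, and region selection consistency.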

Abstract

In modern applications such as ECG monitoring, neuroimaging, wearable sensing, and industrial equipment diagnostics, complex and continuously structured data are ubiquitous, presenting both challenges and opportunities for functional data analysis. However, existing methods face a critical trade-off: conventional functional models are limited by linearity, whereas deep learning approaches lack interpretable region selection for sparse effects. To bridge these gaps, we propose a sparse Bayesian functional deep neural network (sBayFDNN). It learns adaptive functional embeddings through a deep Bayesian architecture to capture complex nonlinear relationships, while a structured prior enables interpretable, region-wise selection of influential domains with quantified uncertainty. Theoretically, we establish rigorous approximation error bounds, posterior consistency, and region selection consistency. These results provide the first theoretical guarantees for a Bayesian deep functional model, ensuring its reliability and statistical rigor. Empirically, comprehensive simulations and real-world studies confirm the effectiveness and superiority of sBayFDNN. Crucially, sBayFDNN excels in recognizing intricate dependencies for accurate predictions and more precisely identifies functionally meaningful regions, capabilities fundamentally beyond existing approaches.


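2603.01013 2026-06-04 cs.LG (updated version)

Feature-Weighted Maximum Representative Subsampling

Tony Hauptmann, Stefan Kramer

Affiliations: Institute of Computer Science, Johannes Gutenberg University Mainz

AI summary: Addresses the problem that a few highly biased features cause debiasing to introduce bias into already representative variables by proposing feature-weighted Maximum Representative Subsampling (FW-MRS), which uses feature weights to reduce the influence of highly biased features and retains more instances while maintaining generalization performance on downstream tasks.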

DOI: 10.1038/s41598-026-54180-1

Abstract

In the social sciences, it is often necessary to debias studies and surveys before valid conclusions can be drawn. Debiasing algorithms enable the computational removal of bias using sample weights. However, an issue arises when only a subset of features is highly biased, while the rest is already representative. Algorithms need to strongly alter the sample distribution to manage a few highly biased features, which can in turn introduce bias into already representative variables. To address this issue, we developed a method that uses feature weights to minimize the impact of highly biased features on the computation of sample weights. Our algorithm is based on Maximum Representative Subsampling (MRS), which debiases datasets by aligning a non-representative sample with a representative one through iterative removal of elements to create a representative subsample. The new algorithm, named feature-weighted MRS (FW-MRS), decreases the emphasis on highly biased features, allowing it to retain more instances for downstream tasks. The feature weights are derived from the feature importance of a domain classifier trained to differentiate between the representative and non-representative datasets. We validated FW-MRS using eight tabular datasets, each of which we artificially biased. Biased features can be important for downstream tasks, and focusing less on them could lead to a decline in generalization. For this reason, we assessed the generalization performance of FW-MRS on downstream tasks and found no statistically significant differences. Additionally, FW-MRS was applied to a real-world dataset from the social sciences. The source code is available at https://github.com/kramerlab/FeatureWeightDebiasing.


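2602.23214 2026-06-04 cs.CV cs.LG eess.IV (updated version)

Plug-and-Play Diffusion Meets ADMM: Dual-Variable Coupling for Robust Medical Image Reconstruction

Chenhe Du, Xuanyu Tian, Qing Wu, Muyu Liu, Jingyi Yu, Hongjiang Wei, Yuyao Zhang

Affiliations: ShanghaiTech University; Shanghai Jiao Tong University

AI summary: Proposes the Dual-Coupled Plug-and-Play Diffusion (DC-PnPDP) framework, which restores the classical dual variable to provide integral feedback and applies Spectral Homogenization (SH) to handle structured artifacts. This addresses the steady-state bias and hallucination of existing PnP solvers and achieves state-of-the-art fidelity and faster convergence in CT and MRI reconstruction.

Comments: Accepted by ICML 2026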

Abstract

Plug-and-Play diffusion prior (PnPDP) frameworks have emerged as a powerful paradigm for solving imaging inverse problems by treating pretrained generative models as modular priors. However, we identify a critical flaw in prevailing PnP solvers (e.g., based on HQS or Proximal Gradient): they function as memoryless operators, updating estimates solely based on instantaneous gradients. This lack of historical tracking inevitably leads to non-vanishing steady-state bias, where the reconstruction fails to strictly satisfy physical measurements under heavy corruption. To resolve this, we propose Dual-Coupled PnP Diffusion (DC-PnPDP), which restores the classical dual variable to provide integral feedback, progressively enforcing agreement between data consistency and the prior. However, this rigorous geometric coupling introduces a secondary challenge: the accumulated dual residuals exhibit spectrally colored, structured artifacts that violate the Additive White Gaussian Noise (AWGN) assumption of diffusion priors, causing severe hallucinations. To bridge this gap, we introduce Spectral Homogenization (SH), a frequency-domain adaptation mechanism that modulates these structured residuals into statistically compliant pseudo-AWGN inputs. This effectively aligns the solver's rigorous optimization trajectory with the denoiser's valid statistical manifold. Extensive experiments on CT and MRI reconstruction demonstrate that our approach resolves the bias-hallucination trade-off, achieving state-of-the-art fidelity with significantly accelerated convergence. The code is available at https://github.com/duchenhe/DC-PnPDP


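2602.20971 2026-06-04 cs.LG cs.AI (updated version)

Does Order Matter: Connecting The Law of Robustness to Robust Generalization

Mihir More, Aritra Das, Jaee Ponde, Himadri Mandal, Vishnu Varadarajan, Debayan Gupta

Affiliations: Ashoka University; Truth Audit Labs; Indian Statistical Institute

AI summary: Connects the Law of Robustness (a lower bound on the Lipschitz constant) to robust generalization error through global and local Rademacher complexities, proving that for arbitrary data distributions the order of the Lipschitz bound is unchanged at the global scale but varies with the perturbation radius and the localized concentration term at the local scale.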

Abstract

Bubeck and Sellke (2021) propose the connection between the Law of Robustness and robust generalization error as an open problem. The Law of Robustness states that overparameterization is necessary for models to interpolate robustly, i.e., the interpolating function is required to be Lipschitz. Wu et al. (2023) extend this law to arbitrary data distributions, proving that the Lipschitz constant satisfies $L = Ω(n^{1/d})$. Robust generalization, on the other hand, asks whether small robust training loss implies small robust test loss. This can be studied using statistical learning techniques such as Rademacher complexities, where a bound on the Rademacher complexity of the robust loss class implies a bound on the Lipschitzness of the function class. We use this connection to explicitly link the two for arbitrary data distributions. (i) We prove that the order of the Lipschitz bound remains the same when considering the global Rademacher complexity of robust loss classes. (ii) At the local scale, i.e., for subsets of functions with small empirical error, the order of the Lipschitz bound changes with the perturbation radius $ρ$ and the localized concentration term $\sqrt{r/n}$.


2602.19799 2026-06-04 stat.ML cs.LG math.OC (updated version)

Path-conditioned training: a principled way to rescale ReLU neural networks

Arthur Lebeurrier, Titouan Vayer, Rémi Gribonval

Affiliations: Université de Lyon; CNRS

AI summary: Proposes a geometric criterion, built on the path-lifting framework, for rescaling ReLU network parameters. Minimizing it aligns a kernel with a chosen reference, which can speed up training.

Journal ref: Proceedings of the 43rd International Conference on Machine Learning (ICML 2026), Seoul, South Korea, PMLR 306 (2026)

Abstract

Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled weights implement the same function, the training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion to rescale neural network parameters whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
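
The rescaling symmetry that the abstract builds on follows from the positive homogeneity of ReLU: relu(λz) = λ·relu(z) for λ > 0, so scaling a hidden neuron's incoming weights and bias by λ and its outgoing weight by 1/λ leaves the network function unchanged. The sketch below only demonstrates that symmetry on a tiny network with hypothetical weights. It does not implement the paper's path-lifting criterion or its alignment algorithm.

```python
def relu(z):
    return max(0.0, z)

def forward(x, w_in, b, w_out):
    """One-hidden-layer ReLU network with scalar input and scalar output."""
    return sum(v * relu(w * x + c) for w, c, v in zip(w_in, b, w_out))

def rescale_neuron(w_in, b, w_out, k, lam):
    """Scale neuron k's incoming weight and bias by lam > 0 and its outgoing weight by 1/lam."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    w_in, b, w_out = list(w_in), list(b), list(w_out)
    w_in[k] *= lam
    b[k] *= lam
    w_out[k] /= lam
    return w_in, b, w_out

# Hypothetical parameters for a two-neuron network.
w_in, b, w_out = [1.0, -2.0], [0.5, 1.0], [3.0, -1.0]
w_in2, b2, w_out2 = rescale_neuron(w_in, b, w_out, k=1, lam=4.0)
same_output = abs(forward(0.3, w_in, b, w_out) - forward(0.3, w_in2, b2, w_out2)) < 1e-12
```

The two parameter sets compute the same function, yet their gradients with respect to the weights differ in scale, which is why the choice of rescaling can change training dynamics as the abstract notes.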


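2602.16966 2026-06-04 cs.LG cs.AI (updated version)

A Unified Framework for Locality in Scalable MARL

Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen

Affiliations: University of Colorado Boulder; INRIA Paris

AI summary: Proposes a unified framework that splits the matrix C^π into environment-sensitivity and policy-sensitivity parts, uses the spectral-radius condition ρ(H^π)<1, which is strictly weaker than the row-sum condition, shows that the softmax temperature directly controls locality, and gives a deterministic guarantee for block-coordinate KL-proximal policy improvement.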

Abstract

Scalable methods for networked multi-agent reinforcement learning let each agent plan using only a small neighborhood of the agent graph. This works only when the system is value-local, meaning a perturbation at one agent affects the long-run value at another agent weakly when the two are far apart. In the average-reward setting, the standard way to certify locality is the Dobrushin row-sum bound on a single matrix $C^π$ that captures how each agent's next state depends on each other agent's current state. To make this matrix easy to work with, prior work bounds it by a supremum over joint actions. The resulting bound is independent of the policy, but it is loose whenever the policy never picks the worst-case action. We split $C^π$ into pieces that separately track environment sensitivity and policy sensitivity, $C^π\preceq E^{\mathrm s}+E^{\mathrm a}Π(π)$, where $E^{\mathrm s}$ measures how the next state moves with the current state, $E^{\mathrm a}$ measures how it moves with the current action, and $Π(π)$ measures how reactive the policy is to changes in state. The spectral radius of $H^π:= E^{\mathrm s}+E^{\mathrm a}Π(π)$ then controls the decay of the average-reward Poisson solution, and the spectral certificate $ρ(H^π)<1$ is strictly weaker than the row-sum condition $\|H^π\|_\infty<1$ on the same matrix and applies in regimes where policy-independent action-supremum bounds used in prior Dobrushin-style work cannot. For temperature-$τ$ softmax policies we get $Π(π)\le L/(2τ)$, so the softmax temperature directly controls locality. We use this decay result to give a deterministic oracle guarantee for a block-coordinate KL-proximal policy-improvement template whose truncation bias decays exponentially in the message-passing radius $κ$.
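
The claim that the spectral certificate ρ(H) < 1 is strictly weaker than the row-sum condition ‖H‖∞ < 1 can be checked on a small example: since ρ(H) ≤ ‖H‖∞ always holds, it suffices to exhibit a matrix that passes the first test and fails the second. The matrix below is a hypothetical two-agent example chosen here, not one from the paper, and the power iteration assumes an entrywise positive matrix so that it converges to the dominant eigenvalue.

```python
def inf_norm(H):
    """Maximum absolute row sum, the quantity in the row-sum condition."""
    return max(sum(abs(h) for h in row) for row in H)

def spectral_radius(H, iters=500):
    """Dominant eigenvalue by power iteration (assumes H is entrywise positive)."""
    n = len(H)
    v = [1.0] * n
    rho = 0.0
    for _ in range(iters):
        w = [sum(H[i][j] * v[j] for j in range(n)) for i in range(n)]
        rho = max(abs(x) for x in w)
        v = [x / rho for x in w]
    return rho

# Hypothetical sensitivity matrix: agent 0 reacts strongly to agent 1, but not vice versa.
H = [
    [0.5, 0.9],
    [0.1, 0.2],
]
row_sum_ok = inf_norm(H) < 1.0        # False: the first row sums to 1.4
spectral_ok = spectral_radius(H) < 1.0  # True: the dominant eigenvalue is about 0.685
```

In this example the row-sum test fails because of a single strongly coupled agent, while the spectral test still certifies decay because the influence does not feed back strongly enough to sustain itself.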


2602.03972 2026-06-04 stat.ML cs.AI cs.LG Version update

Fixed Budget is No Harder Than Fixed Confidence in Best-Arm Identification up to Logarithmic Factors

Kapilan Balagopalan, Yinan Li, Yao Zhao, Tuan Nguyen, Anton Daitche, Houssam Nassif, Kwang-Sung Jun

Affiliations: University of California, Berkeley; University of Washington; University of Texas at Austin

AI summary: This paper proposes FC2FB, a meta-algorithm that converts a fixed-confidence algorithm into a fixed-budget algorithm, and proves that the fixed-budget sample complexity is no higher than the fixed-confidence one up to logarithmic factors.

Journal ref: International Conference on Machine Learning (ICML'26), Seoul, Korea, 2026

English abstract

The best-arm identification (BAI) problem is one of the most fundamental problems in interactive machine learning and comes in two flavors: the fixed-budget setting (FB) and the fixed-confidence setting (FC). For $K$-armed bandits with a unique best arm, the optimal sample complexities for both settings have been settled, and they match up to logarithmic factors. This prompts an interesting research question about generic, potentially structured BAI problems: is FB harder than FC, or the other way around? In this paper, we show that FB is no harder than FC up to logarithmic factors. We do this constructively: we propose a novel algorithm called FC2FB (fixed confidence to fixed budget), a meta-algorithm that takes in an FC algorithm $\mathcal{A}$ and turns it into an FB algorithm. We prove that FC2FB enjoys a sample complexity that matches that of $\mathcal{A}$ up to logarithmic factors. This means that the optimal FC sample complexity is an upper bound on the optimal FB sample complexity up to logarithmic factors. Our result not only reveals a fundamental relationship between FB and FC, but also has a significant implication: FC2FB combined with existing state-of-the-art FC algorithms leads to improved sample complexity for a number of FB problems.


2602.14885 2026-06-04 cond-mat.dis-nn cond-mat.stat-mech cs.LG q-bio.NC Version update

Drift-Diffusion Matching: Embedding dynamics in latent manifolds of asymmetric neural networks

Ramón Nartallo-Kaluarachchi, Renaud Lambiotte, Alain Goriely

Affiliations: Mathematical Institute, University of Oxford; Centre for Eudaimonia and Human Flourishing, University of Oxford; Complexity Science Hub, Vienna

AI summary: Proposes the drift-diffusion matching framework, which trains continuous-time recurrent neural networks to embed arbitrary nonlinear stochastic differential equations in a low-dimensional latent subspace, uses asymmetric connectivity to realize nonequilibrium dynamics, and applies the approach to modeling associative and sequential memory.

Comments 25 pages, 16 figures

English abstract

Recurrent neural networks (RNNs) provide a theoretical framework for understanding computation in biological neural circuits, yet classical results, such as Hopfield's model of associative memory, rely on symmetric connectivity that restricts network dynamics to gradient-like flows. In contrast, biological networks support rich time-dependent behaviour facilitated by their asymmetry. Here we introduce a general framework, which we term drift-diffusion matching, for training continuous-time RNNs to represent arbitrary, nonlinear stochastic differential equations (SDEs), with given drift and diffusion coefficients, within a low-dimensional latent subspace. Allowing asymmetric connectivity, we show that RNNs can faithfully embed the drift and diffusion of a given SDE, including nonlinear and nonequilibrium dynamics such as chaotic attractors. As an application, we construct RNN realisations of stochastic systems that transiently explore various attractors through both input-driven switching and autonomous transitions driven by nonequilibrium currents, which we interpret as models of associative and sequential (episodic) memory. To elucidate how these dynamics are encoded in the network, we introduce decompositions of the RNN based on its asymmetric connectivity and its time-irreversibility. Our results extend attractor neural network theory beyond equilibrium, showing that asymmetric neural populations can implement a broad class of dynamical computations within low-dimensional manifolds, unifying ideas from associative memory, nonequilibrium statistical mechanics, and neural computation.


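2602.12643 2026-06-04 cs.LG cs.AI stat.ML Version update

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

Jashaswimalya Acharjee, Balaraman Ravindran

AI summary: Proposes the Unified Latent Dynamics algorithm, which embeds state-action pairs into a latent space where the value function is approximately linear, combining model-free efficiency with the strengths of model-based representations without planning overhead, and matches or exceeds specialized baselines on 80 environments.

Comments Similarities found with a prior work. Hence, requesting for withdrawal until further notice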

English abstract

We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a latent space in which the true value function is approximately linear, our method supports a single set of hyperparameters across diverse domains -- from continuous control with low-dimensional and pixel inputs to high-dimensional Atari games. We prove that, under mild conditions, the fixed point of our embedding-based temporal-difference updates coincides with that of a corresponding linear model-based value expansion, and we derive explicit error bounds relating embedding fidelity to value approximation quality. In practice, ULD employs synchronized updates of encoder, value, and policy networks, auxiliary losses for short-horizon predictive dynamics, and reward-scale normalization to ensure stable learning under sparse rewards. Evaluated on 80 environments spanning Gym locomotion, DeepMind Control (proprioceptive and visual), and Atari, our approach matches or exceeds the performance of specialized model-free and general model-based baselines -- achieving cross-domain competence with minimal tuning and a fraction of the parameter footprint. These results indicate that value-aligned latent representations alone can deliver the adaptability and sample efficiency traditionally attributed to full model-based planning.


2602.11406 2026-06-04 stat.ML cs.LG Version update

The Cost of Learning Under Multiple Change Points

Tomer Gafni, Garud Iyengar, Assaf Zeevi

AI summary: For online learning with multiple change points, proposes ATC, a selective-detection algorithm that achieves a nearly minimax-optimal regret bound.

Comments A version of this work has been accepted for publication in the Proceedings of the 43rd International Conference on Machine Learning (ICML 2026), Seoul, South Korea

English abstract

We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem that is widely studied using classical "high confidence" detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically, we show that classical methods may exhibit catastrophic failure (high regret) due to a phenomenon we refer to as endogenous confounding. To overcome this, we propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC). These are horizon-free online algorithms that implement a selective detection principle, balancing the need to ignore "small" (hard-to-detect) shifts, while reacting "quickly" to significant ones. We prove that the performance of a properly tuned ATC algorithm is nearly minimax-optimal; its regret is guaranteed to closely match a novel information-theoretic lower bound on the achievable performance of any learning algorithm in the multiple change point problem. Experiments on synthetic as well as real-world data validate the aforementioned theoretical findings.
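
For orientation, here is a minimal sketch of the textbook one-sided CUSUM detector that the name ATC alludes to. It is not the paper's ATC algorithm; the function name, the `slack` and `threshold` parameters, and the data are assumptions chosen for illustration. The slack term shows, in the simplest possible form, how a detector can ignore small shifts while still reacting to large ones.

```python
def cusum_alarm(xs, mu0, slack, threshold):
    """Return the first index at which the CUSUM statistic exceeds the threshold.

    mu0       -- assumed pre-change mean
    slack     -- shifts smaller than this are absorbed and never accumulate
    threshold -- alarm level; larger values mean fewer false alarms, slower detection
    """
    s = 0.0
    for t, x in enumerate(xs):
        # Accumulate evidence of an upward shift, resetting at zero.
        s = max(0.0, s + (x - mu0 - slack))
        if s > threshold:
            return t
    return None

# Hypothetical noiseless streams with a change at index 20.
large_shift = [0.0] * 20 + [1.0] * 20
small_shift = [0.0] * 20 + [0.2] * 20

print(cusum_alarm(large_shift, mu0=0.0, slack=0.25, threshold=3.0))
print(cusum_alarm(small_shift, mu0=0.0, slack=0.25, threshold=3.0))
```

With these settings the large shift adds 0.75 per step after the change and raises an alarm at index 24, while the small shift stays below the slack and never triggers.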


2510.26219 2026-06-04 cs.LG cs.AI Version update

Test-time reward-guided alignment of language models by importance sampling on pre-logit space

Sekitoshi Kanai, Tsukasa Yoshida, Hiroshi Takahashi, Haru Kuroki, Kazumune Hashimoto

Affiliations: NTT, Inc.; Toyohashi University of Technology; The University of Osaka

AI summary: Proposes AISP, a test-time alignment method based on adaptive importance sampling in pre-logit space, which optimizes the expected reward via Gaussian perturbations and importance sampling, and is more sample-efficient than best-of-n sampling and other test-time alignment methods.

Comments 24 pages, 10 figures

2509.16301 2026-06-04 q-bio.QM cs.LG Version update

TF-DWGNet: A Directed Weighted Graph Neural Network with Tensor Fusion for Multi-Omics Cancer Subtype Classification

Tiantian Yang, Zhiqian Chen

Affiliations: Mathematics and Statistical Science, University of Idaho; Computer Science and Engineering, Mississippi State University

AI summary: Proposes the TF-DWGNet framework, which combines tree-based directed weighted graph construction with a tensor fusion mechanism to address the heterogeneity and higher-order interactions of multi-omics data; it outperforms existing methods on cancer subtype classification and provides interpretability.

Comments 9 pages, 4 figures, 4 tables

DOI: 10.1093/nargab/lqag054
Journal ref: NAR Genomics and Bioinformatics, Volume 8, Issue 2, 2026, lqag054

English abstract

Integration and analysis of multi-omics data provide valuable insights for improving cancer subtype classification. However, such data are inherently heterogeneous, high-dimensional, and exhibit complex intra- and inter-modality dependencies. Graph neural networks (GNNs) offer a principled framework for modeling these structures, but existing approaches often rely on prior knowledge or predefined similarity networks that produce undirected or unweighted graphs and fail to capture task-specific directionality and interaction strength. Interpretability at both the modality and feature levels also remains limited. To address these challenges, we propose TF-DWGNet, a novel Graph Neural Network framework that combines tree-based Directed Weighted graph construction with Tensor Fusion for multiclass cancer subtype classification. TF-DWGNet introduces two key innovations: (i) a supervised tree-based strategy that constructs directed, weighted graphs tailored to each omics modality, and (ii) a tensor fusion mechanism that captures unimodal, bimodal, and trimodal interactions using low-rank decomposition for computational efficiency. Experiments on three real-world cancer datasets demonstrate that TF-DWGNet consistently outperforms state-of-the-art baselines across multiple metrics and statistical tests. In addition, the model provides biologically meaningful insights through modality-level contribution scores and ranked feature importance. These results highlight that TF-DWGNet is an effective and interpretable solution for multi-omics integration in cancer research.


2511.13391 2026-06-04 cs.LG cs.AI math.CO math.MG Version update

Finding Kissing Numbers with Game-theoretic Reinforcement Learning

Chengdong Ma, Théo Tao Zhaowei, Pengyu Li, Minghao Liu, Haojun Chen, Zihao Mao, Bo Li, Yuan Cheng, Yuan Qi, Yaodong Yang

Affiliations: Institute for Artificial Intelligence, Peking University; Shanghai Academy of AI for Science; Artificial Intelligence Innovation and Incubation Institute, Fudan University

AI summary: Recasts the kissing number problem as a cooperative matrix-completion game and uses the reinforcement learning system PackingStar to explore the extremal configuration space, improving 15 long-standing bounds on kissing numbers and their generalizations and discovering new interpretable geometric structures.

English abstract

Since Isaac Newton first studied the Kissing Number Problem in 1694, determining the maximal number of non-overlapping spheres around a central sphere has remained a defining challenge in discrete geometry. As the local analogue of Hilbert's 18th problem, it has profound implications across geometry, number theory and information theory. Although lattices and codes have achieved significant progress, the field is confined to isolated extremal configurations, leaving underlying geometric principles obscured. Here we shift the object to the broader extremal configuration space, thereby opening a new path for the Kissing Number Problem. Accordingly, we recast this problem as a cooperative matrix-completion game, and train a reinforcement learning system, PackingStar, to solve it. One player fills cosine entries while the other corrects suboptimal ones, making explosive geometric complexity tractable. Working within extremal configuration spaces, PackingStar discovers new interpretable geometric structures that improve 15 strong bounds held for decades in kissing numbers and their generalizations, several of them provably optimal under natural inner products. These findings reveal the first explicit spherical-code realization of the Fischer group Fi22, extend the classical Euclidean representation of subgroup structure, and directly inspire subsequent breakthroughs by mathematicians. Overall, the work provides an early example of AI-driven progress on a Hilbert-calibre problem, showing how reinforcement learning advances mathematical discovery by unlocking more expressive objects.


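2602.09075 2026-06-04 cs.LG cs.AI Version update

Learning to Remember, Learn, and Forget in Attention-Based Models

Djohan Bonnet, Jamie Lohoff, Jan Finkbeiner, Elidona Shiqerukaj, Emre Neftci

Affiliations: University of Cambridge

AI summary: Proposes the Palimpsa model, which treats in-context learning as a continual learning problem and addresses the stability-plasticity dilemma through Bayesian metaplasticity, substantially increasing memory capacity and outperforming baselines on MQAR and commonsense reasoning tasks.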

English abstract

In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on complex sequence processing tasks. However, in gated linear attention models, this memory has a fixed capacity and is prone to interference, especially for long sequences. We propose Palimpsa, a self-attention model that views ICL as a continual learning problem that must address a stability-plasticity dilemma. Palimpsa uses Bayesian metaplasticity, where the plasticity of each attention state is tied to an importance state grounded by a prior distribution that captures accumulated knowledge. We demonstrate that various gated linear attention models emerge as specific architecture choices and posterior approximations, and that Mamba2 is a special case of Palimpsa where forgetting dominates. This theoretical link enables the transformation of any non-metaplastic model into a metaplastic one, significantly expanding its memory capacity. Our experiments show that Palimpsa consistently outperforms baselines on the Multi-Query Associative Recall (MQAR) benchmark and on Commonsense Reasoning tasks.


2509.25289 2026-06-04 cs.LG cs.AI Version update

ClustRecNet: A Novel End-to-End Deep Learning Framework for Clustering Algorithm Recommendation

Mohammadreza Bakhtyari, Bogdan Mazoure, Renato Cordeiro de Amorim, Guillaume Rabusseau, Vladimir Makarenkov

Affiliations: Département d’Informatique, Université du Québec à Montréal; Mila - Quebec AI Institute; School of Computer Science and EE, University of Essex; Department of Computer Science and Operations Research, Université de Montréal

AI summary: Proposes ClustRecNet, an end-to-end deep learning framework that recommends suitable clustering algorithms by directly learning high-order representations of raw tabular data, outperforming traditional internal cluster validity indices and AutoML approaches on synthetic and real-world benchmarks.

Comments Published in IEEE Access

DOI: 10.1109/ACCESS.2026.3697689
Journal ref: IEEE Access, vol. 14, pp. 81352 - 81365, 2026

English abstract

Identifying an effective clustering algorithm for a given dataset remains a fundamental unsupervised learning issue. We introduce ClustRecNet, a novel end-to-end deep learning framework that recommends suitable clustering algorithm(s) by directly learning high-order representations of raw tabular data. To facilitate robust meta-learning, we first construct a comprehensive repository of 34,000 synthetic datasets encompassing a large variety of clustering scenarios, run 10 popular clustering algorithms, and use Adjusted Rand Index (ARI) to establish ground-truth labels. ClustRecNet's architecture incorporates a convolution block, two residual blocks, and an attention block to capture local and global structural patterns, effectively bypassing the knowledge bottleneck associated with manual feature engineering. Extensive evaluation on both synthetic and real-world benchmarks demonstrates that ClustRecNet consistently outperforms traditional internal cluster validity indices such as Silhouette, Calinski-Harabasz, Davies-Bouldin, and Dunn as well as state-of-the-art Automated Machine Learning (AutoML) approaches such as ML2DAC, AutoCluster, and AutoML4Clust. For example, our framework achieves an average 0.497 ARI gain over the Calinski-Harabasz cluster validity index on synthetic data and an average 44.16% ARI improvement over the leading AutoML approach (ML2DAC) on real-world benchmarks. Code and data are available at: https://github.com/mrbakhtyari/ClustRecNet
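
The labeling step described above, scoring each algorithm's partition with ARI and taking the best one as the ground-truth label, can be sketched as follows. This is a minimal pure-Python illustration of the standard Adjusted Rand Index formula, not the authors' pipeline; the algorithm names and toy partitions are hypothetical.

```python
from collections import Counter
from math import comb

def adjusted_rand_index(truth, pred):
    """Adjusted Rand Index between two labelings of the same points."""
    n = len(truth)
    if n < 2:
        return 1.0
    sum_cells = sum(comb(c, 2) for c in Counter(zip(truth, pred)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(truth).values())
    sum_cols = sum(comb(c, 2) for c in Counter(pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        # Degenerate case (e.g. both labelings put everything in one cluster).
        return 1.0
    return (sum_cells - expected) / (max_index - expected)

def best_algorithm(truth, partitions):
    """Name of the algorithm whose partition has the highest ARI."""
    return max(partitions, key=lambda name: adjusted_rand_index(truth, partitions[name]))

# Hypothetical outputs of two clustering algorithms on four points.
truth = [0, 0, 1, 1]
partitions = {
    "algo_a": [1, 1, 0, 0],  # same grouping, permuted cluster ids
    "algo_b": [0, 1, 0, 1],  # splits every true cluster
}
print(best_algorithm(truth, partitions))
```

ARI is invariant to relabeling of cluster ids, which is why `algo_a` scores 1.0 here despite using swapped ids, while `algo_b` scores -0.5.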


2602.08142 2026-06-04 cs.LG stat.ML Version update

Variance-Gated Ensembles: An Epistemic-Aware Framework for Uncertainty Estimation

H. Martin Gillis, Isaac Xu, Thomas Trappenberg

Affiliations: Faculty of Computer Science, Dalhousie University, Halifax, NS

AI summary: Proposes the Variance-Gated Ensembles (VGE) framework, which injects epistemic sensitivity through a signal-to-noise gate computed from ensemble statistics, giving efficient and differentiable uncertainty estimation that matches or exceeds existing methods while remaining computationally efficient.

Comments Published in Transactions on Machine Learning Research (06/2026)

English abstract

Machine learning applications require fast and reliable per-sample uncertainty estimation. A common approach is to use predictive distributions from Bayesian or approximation methods and additively decompose uncertainty into aleatoric (i.e., data-related) and epistemic (i.e., model-related) components. However, additive decomposition has recently been questioned, with evidence that it breaks down when using finite-ensemble sampling and/or mismatched predictive distributions. This paper introduces Variance-Gated Ensembles (VGE), an intuitive, differentiable framework that injects epistemic sensitivity via a signal-to-noise gate computed from ensemble statistics. VGE provides: (i) a Variance-Gated Margin Uncertainty (VGMU) score that couples decision margins with ensemble predictive variance; and (ii) a Variance-Gated Normalization (VGN) layer that generalizes the variance-gated uncertainty mechanism to training via per-class, learnable normalization of ensemble member probabilities. We derive closed-form vector-Jacobian products enabling end-to-end training through ensemble sample mean and variance. VGE matches or exceeds state-of-the-art information-theoretic baselines while remaining computationally efficient. As a result, VGE provides a practical and scalable approach to epistemic-aware uncertainty estimation in ensemble models.
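
As background for the additive decomposition the abstract questions, the sketch below computes the standard information-theoretic split for an ensemble: total predictive entropy equals the average member entropy (aleatoric) plus the remaining mutual-information term (epistemic). It illustrates that baseline only, not the VGE gate itself; the function names and member probabilities are assumptions.

```python
import math

def entropy(p):
    """Shannon entropy in nats, skipping zero-probability classes."""
    return -sum(q * math.log(q) for q in p if q > 0)

def decompose(member_probs):
    """Return (total, aleatoric, epistemic) uncertainty for one sample."""
    m = len(member_probs)
    k = len(member_probs[0])
    mean_p = [sum(p[j] for p in member_probs) / m for j in range(k)]
    total = entropy(mean_p)                                 # entropy of the averaged prediction
    aleatoric = sum(entropy(p) for p in member_probs) / m   # average member entropy
    return total, aleatoric, total - aleatoric              # remainder is the epistemic term

# Hypothetical two-member ensembles on a binary problem.
disagree = [[0.9, 0.1], [0.1, 0.9]]  # members are confident but contradict each other
agree = [[0.7, 0.3], [0.7, 0.3]]     # members make the same prediction

print(decompose(disagree))
print(decompose(agree))
```

The epistemic term is large when members disagree and vanishes when they agree; with only a few members this estimate rests on very few samples, which is the finite-ensemble regime the abstract refers to.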


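2602.06883 2026-06-04 cs.LG cs.CV stat.ML Version update

Vision Transformer Finetuning Benefits from Non-Smooth Components

Ambroise Odonnat, Laetitia Chapel, Romain Tavenard, Ievgen Redko

Affiliations: Noah's Ark Lab; Univ. Rennes 2, Inria

AI summary: By analyzing the plasticity of vision transformer components (the sensitivity of their outputs to input changes), this paper finds that attention modules and feedforward layers with high plasticity (low smoothness) perform better in finetuning, challenging the conventional view that smoothness is beneficial.

Comments Accepted at ICML 2026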

English abstract

The smoothness of the transformer architecture has been extensively studied in the context of generalization, training stability, and adversarial robustness. However, its role in transfer learning remains poorly understood. In this paper, we analyze the ability of vision transformer components to adapt their outputs to changes in inputs, or, in other words, their \emph{plasticity}. Defined as an average rate of change, it captures the sensitivity to input perturbation; in particular, a high plasticity implies a low smoothness. Our theoretical analysis and extensive experiments -- over $1,000$ finetuning runs on large-scale vision transformers -- showcase that this perspective provides principled guidance in choosing the components to prioritize during adaptation. A key takeaway for practitioners is that the high plasticity of the attention modules and feedforward layers consistently leads to better finetuning performance. Our findings depart from the prevailing assumption that smoothness is desirable, offering a novel perspective on transformers' functional properties. The code is available at https://github.com/ambroiseodt/vit-plasticity.


2601.20800 2026-06-04 cs.LG cs.AI Version update

Conditional PED-ANOVA: Hyperparameter Importance in Hierarchical & Dynamic Search Spaces

Kaito Baba, Yoshihiko Ozaki, Shuhei Watanabe

Affiliations: Preferred Networks, Inc.; The University of Tokyo; SB Intuitions Corp.

AI summary: Proposes the conditional PED-ANOVA framework for estimating hyperparameter importance in conditional search spaces; its closed-form estimator accurately reflects conditional activation and domain changes, and experiments show it outperforms naive adaptations of existing estimators.

Comments 20 pages, 15 figures. Accepted to the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

DOI: 10.1145/3770855.3817758

English abstract

We propose conditional PED-ANOVA (condPED-ANOVA), a principled framework for estimating hyperparameter importance (HPI) in conditional search spaces, where the presence or domain of a hyperparameter can depend on other hyperparameters. Although the original PED-ANOVA provides a fast and efficient way to estimate HPI within the top-performing regions of the search space, it assumes a fixed, unconditional search space and therefore cannot properly handle conditional hyperparameters. To address this, we introduce a conditional HPI for top-performing regions and derive a closed-form estimator that accurately reflects conditional activation and domain changes. Experiments show that naive adaptations of existing HPI estimators yield misleading or uninterpretable importances in conditional settings, whereas condPED-ANOVA consistently provides meaningful importances that reflect the underlying conditional structure. Our code is publicly available at https://github.com/kAIto47802/condPED-ANOVA.


2602.05657 2026-06-04 cs.LG math.OC Version update

Tight Long-Term Tail Decay of (Clipped) SGD in Non-Convex Optimization

Aleksandar Armacki, Dragana Bajović, Dušan Jakovetić, Soummya Kar, Ali H. Sayed

Affiliations: École Polytechnique Fédérale de Lausanne; University of Novi Sad; Carnegie Mellon University

AI summary: Using large deviations theory, studies the long-term tail decay of SGD and clipped SGD in non-convex optimization, giving exponential upper and lower bounds for the squared gradient norm and proving decay rates on the order of $e^{-t/\log(t)}$, an order of magnitude faster than existing finite-time bounds.

Comments 34 pages

English abstract

The study of tail behaviour of SGD-induced processes has been attracting a lot of interest, due to offering strong guarantees with respect to individual runs of an algorithm. While many works provide high-probability guarantees, quantifying the error rate for a fixed probability threshold, there is a lack of work directly studying the probability of failure, i.e., quantifying the tail decay rate for a fixed error threshold. Moreover, existing results are of finite-time nature, limiting their ability to capture the true long-term tail decay which is more informative for modern learning models, typically trained for millions of iterations. Our work closes these gaps, by studying the long-term tail decay of SGD-based methods through the lens of large deviations theory, establishing several strong results in the process. First, we provide an upper bound on the tails of the gradient norm-squared of the best iterate produced by (vanilla) SGD, for non-convex costs and bounded noise, with long-term decay at rate $e^{-t/\log(t)}$. Next, we relax the noise assumption by considering clipped SGD (c-SGD) under heavy-tailed noise with bounded moment of order $p \in (1,2]$, showing an upper bound with long-term decay at rate $e^{-t^{β_p}/\log(t)}$, where $β_p = \frac{4(p-1)}{3p-2}$ for $p \in (1,2)$ and $e^{-t/\log^2(t)}$ for $p = 2$. Finally, we provide lower bounds on the tail decay, at rate $e^{-t}$, showing that our rates for both SGD and c-SGD are tight, up to poly-logarithmic factors. Notably, our results demonstrate an order of magnitude faster long-term tail decay compared to existing work based on finite-time bounds, which show rates $e^{-\sqrt{t}}$ and $e^{-t^{β_p/2}}$, $p \in (1,2]$, for SGD and c-SGD, respectively. As such, we uncover regimes where the tails decay much faster than previously known, providing stronger long-term guarantees for individual runs.
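
To get a feel for the exponents quoted above, the short sketch below evaluates $β_p = \frac{4(p-1)}{3p-2}$ exactly and compares it with the finite-time exponent $β_p/2$ that the abstract attributes to prior work. The helper name and the sample values of $p$ are illustrative choices, not part of the paper.

```python
from fractions import Fraction

def beta(p):
    """Long-term tail exponent beta_p = 4(p - 1) / (3p - 2) for p in (1, 2]."""
    p = Fraction(p)
    if not 1 < p <= 2:
        raise ValueError("p must lie in (1, 2]")
    return 4 * (p - 1) / (3 * p - 2)

for p in (Fraction(5, 4), Fraction(3, 2), Fraction(2)):
    long_term = beta(p)
    finite_time = long_term / 2  # exponent of t in the earlier finite-time bounds
    print(f"p={p}: long-term t^{long_term}, finite-time t^{finite_time}")
```

For noise with bounded moment of order $p = 3/2$, for example, the long-term exponent is $4/5$ against $2/5$ from finite-time analysis. At $p = 2$ the formula evaluates to 1, consistent with the power of $t$ in the stated $e^{-t/\log^2(t)}$ rate.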


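2510.08734 2026-06-04 cs.LG Version update

Semiparametric Preference Optimization: Your Language Model is Secretly a Single-Index Model

Nathan Kallus

Affiliations: Netflix & Cornell University

AI summary: This paper proposes semiparametric preference optimization, which relaxes the assumed link function between preferences and latent rewards and aligns policies under an unknown, unrestricted link; it shows that realizability in the policy class induces a semiparametric single-index binary choice model, learns policies directly, and gives link-agnostic convergence guarantees.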

English abstract

Policy alignment to preference data typically assumes a known link function between observed preferences and latent rewards (e.g., Bradley-Terry model / logistic link). Misspecification of this link can bias inferred rewards and misalign learned policies. We study policy alignment under an unknown and unrestricted link function. We formulate an $f$-divergence-constrained reward maximization problem and show that realizability in a policy class induces a semiparametric single-index binary choice model, where a scalar policy-induced index captures all dependence on demonstrations and the remaining preference distribution is unrestricted. Rather than impose identifiability of structural parameters of such a model and estimate them, as in econometrics, we develop methods that directly learn policies, with the reward function implicit, analyzing error to the optimal policy and allowing for unidentifiable and nonparametric indices. We prove link-agnostic convergence guarantees in terms of generic function complexity measures and validate the methods and theory empirically. Code is available at https://github.com/causalml/spo/.
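
The role of the link function can be illustrated with a small sketch: the probability that one response is preferred over another is a link applied to a scalar index difference. The logistic link below is the Bradley-Terry case named in the abstract; the probit alternative, the function names, and the index values are assumptions added for illustration.

```python
import math

def logistic_link(z):
    """Bradley-Terry / logistic link."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_link(z):
    """Standard normal CDF, shown here as one alternative monotone link."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def preference_prob(index_a, index_b, link=logistic_link):
    """P(a preferred over b) as a link applied to the index difference."""
    return link(index_a - index_b)

# Hypothetical scalar indices (e.g. latent rewards) for two responses.
print(preference_prob(2.0, 0.0))                    # under the logistic link
print(preference_prob(2.0, 0.0, link=probit_link))  # same index, different link
```

Both links agree on which response is preferred but assign different probabilities, so fitting rewards under the wrong link can bias them even when the underlying index is the same.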


2506.06178 2026-06-04 cs.LG Version update

\textsc{Lethe}: Principled Dual-Stream Update for Persistent Knowledge Erasure in Federated Unlearning

Wentai Wu, Hanwei Tan, Yijun Quan, Haixia Peng, Ligang He, Bin Yang, C. L. Philip Chen

Affiliations: Department of Computer Science, College of Information Science and Technology, Jinan University; WMG, University of Warwick; School of Information and Communications Engineering, Xi’an Jiaotong University; Department of Computer Science, University of Warwick; School of Data Science and Engineering, East China Normal University; School of Computer Science and Engineering, South China University of Technology

AI summary: To address the resurfacing of unlearned knowledge when training continues after federated unlearning, proposes Lethe, which achieves persistent knowledge erasure through anti-aligned updates of a forget stream and a retain stream.

English abstract

Federated unlearning (FU) aims to erase knowledge from a global model. Existing studies commonly assume that federated collaboration terminates after unlearning, overlooking a deployment-realistic scenario where training continues on the remaining clients after deletion requests are fulfilled. In this work, we identify a critical failure mode, termed knowledge resurfacing, revealing that continued training on retained data alone can reactivate unlearned knowledge in a few rounds. Empirically, we demonstrate that many state-of-the-art FU methods are prone to knowledge resurfacing. We then propose Lethe, a novel unlearning method for persistent knowledge erasure in federated settings. In each iteration, Lethe operates on a forget stream from the unlearning client and a retain stream from the retained clients. It redirects unlearning updates toward a region where the two streams are anti-aligned, discouraging retained-data training from moving back toward the forgotten knowledge. Consequently, Lethe ensures stronger unlearning persistence during subsequent federated training. Extensive experiments across diverse models, datasets, and unlearning levels validate that Lethe supports all levels of unlearning in a unified manner across both CV and NLP tasks, demonstrating consistently low RR, below 1% in most cases, even after an extremely long horizon of follow-up training.


2601.22450 2026-06-04 cs.LG cs.AI 版本更新

Tuning the Implicit Regularizer of Masked Diffusion Language Models: Enhancing Generalization via Insights from $k$-Parity

调整掩码扩散语言模型的隐式正则化器：通过$k$-奇偶问题的见解增强泛化能力

Jianhao Huang, Baharan Mirzasoleiman

发表机构 * University of California, Berkeley（加州大学伯克利分校）； Stanford University（斯坦福大学）

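AI summary: Using the $k$-parity problem, this paper studies how masked diffusion language models generalize. It decomposes the training objective into a signal term and a noise term, treats the noise term as an implicit regularizer, and improves model performance substantially by optimizing the mask probability distribution.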

Comments ICML 2026

详情

成功条件化作为策略改进：模仿成功所解决的优化问题

Daniel Russo

发表机构 * Daniel J. Russo

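AI summary: This paper proves that success conditioning (imitating successful trajectories) exactly solves a trust-region optimization problem whose $χ^2$-divergence constraint radius is set automatically by the data, and it establishes an identity linking relative policy improvement, the magnitude of policy change, and action-influence.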

详情

AI中文摘要

一种广泛使用的策略改进技术是成功条件化，即收集轨迹，识别那些实现期望结果的轨迹，并更新策略以模仿沿成功轨迹采取的动作。这一原则有许多名称——带SFT的拒绝采样、目标条件化RL、决策Transformer——但它解决了什么优化问题（如果有的话）一直不清楚。我们证明成功条件化精确求解了一个信任区域优化问题，在由数据自动确定半径的χ²散度约束下最大化策略改进。这产生了一个恒等式：相对策略改进、策略变化幅度以及我们称为动作影响（衡量动作选择中的随机变化如何影响成功率）的量在每个状态下都完全相等。因此，成功条件化表现为一个保守的改进算子。精确的成功条件化不会降低性能或引发危险的分布偏移，但当它失败时，它会以可观察的方式失败，即几乎不改变策略。我们将我们的理论应用于常见的回报阈值设定实践，表明这可以放大改进，但代价是可能与真实目标不一致。

英文摘要

A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a desired outcome, and updates the policy to imitate the actions taken along successful trajectories. This principle appears under many names -- rejection sampling with SFT, goal-conditioned RL, Decision Transformers -- yet what optimization problem it solves, if any, has remained unclear. We prove that success conditioning exactly solves a trust-region optimization problem, maximizing policy improvement subject to a $χ^2$ divergence constraint whose radius is determined automatically by the data. This yields an identity: relative policy improvement, the magnitude of policy change, and a quantity we call action-influence -- measuring how random variation in action choices affects success rates -- are exactly equal at every state. Success conditioning thus emerges as a conservative improvement operator. Exact success conditioning cannot degrade performance or induce dangerous distribution shift, but when it fails, it does so observably, by hardly changing the policy at all. We apply our theory to the common practice of return thresholding, showing this can amplify improvement, but at the cost of potential misalignment with the true objective.
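The identity stated in the abstract can be checked by hand in the simplest possible case. The sketch below assumes a single-state (bandit) problem with a binary success signal, where `policy` and `success_prob` are made-up numbers chosen for illustration; the function names are hypothetical and are not taken from the paper. Conditioning on success reweights each action by its success probability, and in this setting the relative improvement in success rate equals the $χ^2$ divergence between the new and old policies.

```python
from fractions import Fraction

def success_conditioned(policy, success_prob):
    """Return (pi_plus, V) where pi_plus(a) = pi(a) * q(a) / V and V is the success rate."""
    value = sum(p * q for p, q in zip(policy, success_prob))
    new_policy = [p * q / value for p, q in zip(policy, success_prob)]
    return new_policy, value

def chi_square(new_policy, policy):
    """Chi-square divergence of new_policy from policy: sum of (new - old)^2 / old."""
    return sum((n - p) ** 2 / p for n, p in zip(new_policy, policy))

# Illustrative numbers only: three actions with different success probabilities.
policy = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
success_prob = [Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)]

new_policy, value = success_conditioned(policy, success_prob)
new_value = sum(p * q for p, q in zip(new_policy, success_prob))

# Both quantities reduce to Var_pi(q) / V^2 in this one-state case.
relative_improvement = (new_value - value) / value
divergence = chi_square(new_policy, policy)
```

Exact rational arithmetic is used so that the two quantities can be compared with `==` rather than a floating-point tolerance. The sequential, per-state version of the result is the paper's contribution and is not reproduced here.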


2601.17469 2026-06-04 cs.LG 版本更新

Identifying and Correcting Label Noise for Robust GNNs via Influence Contradiction

通过影响矛盾识别和纠正标签噪声以实现鲁棒图神经网络

Wei Ju, Wei Zhang, Siyu Yi, Zhengyang Mao, Yifan Wang, Jingyang Yuan, Zhiping Xiao, Ziyue Qiao, Ming Zhang

发表机构 * University of Science and Technology of China（中国科学技术大学）

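AI summary: Proposes ICGNN, which computes an influence contradiction score (ICS) from the graph diffusion matrix to detect noisy labels, corrects them with a soft strategy based on neighbor predictions, and adds pseudo-labels to improve robustness.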

Comments Accepted by Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)

详情

AI中文摘要

图神经网络（GNN）在学习图结构数据方面表现出显著能力，广泛应用于社交分析和生物信息学等领域。然而，现实场景中标签噪声的存在对学习鲁棒GNN构成重大挑战，其有效性在处理图上噪声标签（通常源于标注错误或不一致）时会受到严重影响。为此，本文提出一种名为ICGNN的新方法，利用图的结构信息有效缓解噪声标签带来的挑战。具体地，我们首先设计一种新的噪声指示器，基于图扩散矩阵测量影响矛盾分数（ICS），以量化具有干净标签的节点的可信度，使得ICS值较高的节点更可能被检测为具有噪声标签。然后，我们利用高斯混合模型精确检测节点标签是否含有噪声。此外，我们开发了一种软策略，结合图上邻居节点的预测来纠正检测到的噪声标签。最后，引入大量未标记节点的伪标签，以提供辅助监督信号并指导模型优化。在基准数据集上的实验表明，我们的方法在噪声标签场景下优于竞争基线。

英文摘要

Graph Neural Networks (GNNs) have shown remarkable capabilities in learning from graph-structured data with various applications such as social analysis and bioinformatics. However, the presence of label noise in real scenarios poses a significant challenge in learning robust GNNs, and their effectiveness can be severely impacted when dealing with noisy labels on graphs, often stemming from annotation errors or inconsistencies. To address this, in this paper we propose a novel approach called ICGNN that harnesses the structure information of the graph to effectively alleviate the challenges posed by noisy labels. Specifically, we first design a novel noise indicator that measures the influence contradiction score (ICS) based on the graph diffusion matrix to quantify the credibility of nodes with clean labels, such that nodes with higher ICS values are more likely to be detected as having noisy labels. Then we leverage the Gaussian mixture model to precisely detect whether the label of a node is noisy or not. Additionally, we develop a soft strategy to combine the predictions from neighboring nodes on the graph to correct the detected noisy labels. At last, pseudo-labeling for abundant unlabeled nodes is incorporated to provide auxiliary supervision signals and guide the model optimization. Experiments on benchmark datasets show the superiority of our approach over competitive baselines in noisy label scenarios.


2601.06196 2026-06-04 cs.LG cs.AI cs.CL 版本更新

Geometry-Aware Hallucination Detection in Large Language Models

大语言模型中的几何感知幻觉检测

Bodla Krishna Vamshi, Rohan Bhatnagar, Haizhao Yang

发表机构 * University of Maryland, College Park（马里兰大学学院公园分校）

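AI summary: Proposes GA-ICL, a framework that uses latent representations from frozen LLMs to model local manifold structure and class-prototype geometry when selecting in-context demonstrations for hallucination detection; it outperforms baselines in most settings on the FEVER and HaluEval benchmarks.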

详情

AI中文摘要

大型语言模型（LLM）经常生成事实不正确或未经支持的内容，通常称为幻觉。先前的工作探索了解码策略、检索增强和监督微调用于幻觉检测，而最近的研究表明，上下文学习（ICL）可以显著影响事实可靠性。然而，现有的ICL示例选择方法通常依赖于表面相似性启发式方法，并且在任务和模型上表现出有限的鲁棒性。我们提出GA-ICL，一种几何感知的示例采样框架，用于选择上下文示例，该框架利用从冻结LLM中提取的潜在表示。通过联合建模局部流形结构和类别感知的原型几何，GA-ICL根据示例与学习原型的接近程度进行选择，而不仅仅是基于词汇或嵌入相似性。在事实验证（FEVER）和幻觉检测（HaluEval）基准上，GA-ICL在大多数评估设置中优于标准ICL选择基线，在对话和摘要任务上尤其有显著提升。该方法在温度扰动和模型变化下保持鲁棒性，表明与启发式检索策略相比具有更高的稳定性。虽然在较小模型规模下的某些问答场景中，词汇检索仍可能具有竞争力，但我们的结果表明，几何感知的原型选择为幻觉检测提供了一种可靠且训练轻量的方法，无需修改LLM参数。在Phi-14B和Qwen3-32B上的扩展评估证实，GA-ICL能有效扩展到更大模型，在包括较小模型显示边界条件限制的问答任务在内的所有比较基线上均表现优异，为改进ICL示例选择提供了原则性方向。

英文摘要

Large language models (LLMs) frequently generate factually incorrect or unsupported content, commonly referred to as hallucinations. Prior work has explored decoding strategies, retrieval augmentation, and supervised fine-tuning for hallucination detection, while recent studies show that in-context learning (ICL) can substantially influence factual reliability. However, existing ICL demonstration selection methods often rely on surface-level similarity heuristics and exhibit limited robustness across tasks and models. We propose GA-ICL, a geometry-aware demonstration sampling framework for selecting in-context demonstrations that leverages latent representations extracted from frozen LLMs. By jointly modeling local manifold structure and class-aware prototype geometry, GA-ICL selects demonstrations based on their proximity to learned prototypes rather than lexical or embedding similarity alone. Across factual verification (FEVER) and hallucination detection (HaluEval) benchmarks, GA-ICL outperforms standard ICL selection baselines in the majority of evaluated settings, with particularly strong gains on dialogue and summarization tasks. The method remains robust under temperature perturbations and model variation, indicating improved stability compared to heuristic retrieval strategies. While lexical retrieval can remain competitive in certain question-answering regimes at smaller model scales, our results demonstrate that geometry-aware prototype selection provides a reliable and training-light approach for hallucination detection without modifying LLM parameters. Extended evaluations on Phi-14B and Qwen3-32B confirm that GA-ICL scales effectively to larger models, outperforming all compared baselines including on QA tasks where smaller models show boundary-condition limitations, offering a principled direction for improved ICL demonstration selection.
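The idea of picking demonstrations by proximity to class prototypes can be sketched in a few lines. This is a minimal illustration under strong assumptions: prototypes are plain per-class mean vectors, distance is Euclidean, and the local manifold modeling described in the abstract is left out entirely. The function names and the toy embeddings are hypothetical, not GA-ICL's actual procedure.

```python
import math
from collections import defaultdict

def class_prototypes(embeddings, labels):
    """Mean embedding per class (a stand-in for learned prototypes)."""
    groups = defaultdict(list)
    for vec, lab in zip(embeddings, labels):
        groups[lab].append(vec)
    return {
        lab: [sum(col) / len(vecs) for col in zip(*vecs)]
        for lab, vecs in groups.items()
    }

def select_demonstrations(embeddings, labels, per_class=1):
    """Return indices of the per_class candidates closest to each class prototype."""
    protos = class_prototypes(embeddings, labels)
    chosen = []
    for lab, proto in protos.items():
        idx = [i for i, l in enumerate(labels) if l == lab]
        idx.sort(key=lambda i: math.dist(embeddings[i], proto))
        chosen.extend(idx[:per_class])
    return sorted(chosen)

# Toy 2-D "embeddings" for two classes; real inputs would be LLM hidden states.
embeddings = [[0, 0], [1, 0], [5, 0], [10, 10], [11, 10], [12, 10]]
labels = ["a", "a", "a", "b", "b", "b"]
picked = select_demonstrations(embeddings, labels)
```

Selecting per class keeps the demonstration set balanced across labels, which a pure nearest-neighbor lookup around the query does not guarantee.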


2412.18134 2026-06-04 cs.LG cs.CC cs.PL cs.SE 版本更新

Learning Randomized Reductions

学习随机归约

Ferhat Erata, Orr Paradise, Thanos Typaldos, Timos Antonopoulos, ThanhVu Nguyen, Shafi Goldwasser, Ruzica Piskac

发表机构 * Yale University, USA（耶鲁大学）； EPFL, Switzerland（洛桑联邦理工学院）； George Mason University, USA（乔治·梅森大学）

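AI summary: Proposes the Bitween framework for automatically learning randomized self-reductions (RSRs). Vanilla Bitween, with backends such as linear regression and genetic programming, finds RSRs for 54% of 80 functions, including the first known reduction for sigmoid; Agentic Bitween, which adds LLM agents, reaches 80%.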

Comments Accepted at ICML 2026 (Spotlight). 9 pages main text + appendix

详情

Journal ref: Proceedings of the 43rd International Conference on Machine Learning, PMLR 306, 2026

AI中文摘要

随机自归约（RSR）通过使用 $f$ 在随机相关点上的求值来表达 $f(x)$，从而能够实现自校正程序、实例隐藏协议，并在复杂性理论和密码学中有应用。然而，40 多年来发现 RSR 一直需要手动专家推导，限制了其实际应用。我们提出了用于自动 RSR 学习的 Bitween。首先，我们在相关采样下形式化了 RSR 学习及其样本复杂度分析。其次，我们开发了 Vanilla Bitween，它集成了多个后端（线性回归、遗传编程、符号回归和混合整数规划）。线性回归后端表现最佳，在我们的基准套件 RSR-Bench 中为 80 个函数中的 43 个（54%）发现了 RSR，包括 sigmoid 的首次已知归约。第三，我们引入了 Agentic Bitween，一种神经符号方法，其中 LLM 代理提出超越先前工作中固定集合（$x+r$, $x-r$, $x \cdot r$, $x$, $r$）的新查询函数。Agentic Bitween 为 80 个函数中的 64 个（80%）发现了 RSR，在 RSR 发现和验证准确性方面均优于纯神经基线。

英文摘要

Randomized self-reductions (RSRs) express $f(x)$ using $f$ evaluated at random correlated points, enabling self-correcting programs, instance-hiding protocols, and applications in complexity theory and cryptography. Yet discovering RSRs has required manual expert derivation for over 40 years, limiting their practical use. We present Bitween for automated RSR learning. First, we formalize RSR learning with sample complexity analysis under correlated sampling. Second, we develop Vanilla Bitween, which integrates multiple backends (linear regression, genetic programming, symbolic regression, and mixed-integer programming). The linear regression backend outperforms the others, discovering RSRs for 43 of 80 functions (54%) in RSR-Bench, our benchmark suite, including the first known reduction for sigmoid. Third, we introduce Agentic Bitween, a neuro-symbolic approach where LLM agents propose novel query functions beyond the fixed set ($x+r$, $x-r$, $x \cdot r$, $x$, $r$) in prior work. Agentic Bitween discovers RSRs for 64 of 80 functions (80%), outperforming pure neural baselines in both RSR discovery and verification accuracy.
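For readers new to RSRs, the textbook example is a linear function over a finite field, where $f(x) = f(x+r) - f(r)$ for any $r$. The sketch below uses that classic reduction (it is not an output of Bitween) to build a self-correcting wrapper around a deliberately faulty implementation. The modulus, the function `f_faulty`, and the trial count are all assumptions chosen for illustration; the two query functions $x+r$ and $r$ come from the fixed set listed in the abstract.

```python
import random
from collections import Counter

P = 101  # small prime modulus, chosen only for illustration

def f_true(x):
    """Reference linear function f(x) = 7x mod P."""
    return (7 * x) % P

def f_faulty(x):
    """Hypothetical implementation that is wrong on a small fraction of inputs."""
    if x % 25 == 3:
        return (7 * x + 1) % P
    return f_true(x)

def self_correct(f, x, trials=31, rng=None):
    """Recover f(x) via the reduction f(x) = f(x + r) - f(r), with a majority vote."""
    rng = rng or random.Random(0)
    votes = Counter()
    for _ in range(trials):
        r = rng.randrange(P)  # x + r is uniformly distributed, so it hides x
        votes[(f((x + r) % P) - f(r)) % P] += 1
    return votes.most_common(1)[0][0]
```

Because each query point is uniformly random, a single evaluation is wrong only with roughly the faulty fraction of the domain, and the vote drives the error down further. For instance, `self_correct(f_faulty, 28)` agrees with `f_true(28)` even though `f_faulty(28)` does not.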


2601.03569 2026-06-04 cs.LG stat.AP 版本更新

Local Intrinsic Dimensionality of Ground Motion Data for Early Detection of Catastrophic Slope Failure

用于早期检测灾难性边坡破坏的地表运动数据的局部内在维度

Yuansan Liu, James Bailey, Antoinette Tordesillas

发表机构 * The University of Melbourne（墨尔本大学）； Monash University（莫纳什大学）

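AI summary: Proposes spatiotemporal local intrinsic dimensionality (st-LID), an unsupervised framework that combines kinematic enhancement, Bayesian spatial fusion, and temporal modeling to improve the precision and lead time of early failure-zone detection in landslide monitoring.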

Comments 20 pages, 9 figures. ECML-PKDD 2026

详情

AI中文摘要

局部内在维度（LID）在高维数据异常检测中显示出强大潜力，包括颗粒介质中滑坡破坏检测，其中早期准确识别破坏区域对于有效的地质灾害缓解至关重要。然而，由于表面位移数据中固有的空间相关性和时间动态，这项任务仍然具有挑战性。为了解决这一差距，我们提出了一种新颖的无监督框架，称为时空LID（st-LID），它将LID推广到滑坡监测网络中的稳健破坏检测。我们的方法引入了三个关键创新：（1）运动增强，将速度纳入LID计算以捕获瞬时变形率和短期时间动态；（2）贝叶斯空间融合，通过贝叶斯估计聚合空间邻域内的LID值，以嵌入空间相关性并考虑局部噪声；以及（3）时间建模（t-LID），一种新变体，表征位移数据的长期动态，提供位移行为的稳健时间表示。通过统一这些组件，st-LID识别出现有方法经常忽略的复杂多阶段破坏区域。大量实验表明，st-LID在检测精度和提前时间方面始终优于最先进的无监督基线，为滑坡早期预警系统和有针对性的风险干预提供了稳健基础，以增强社区韧性和准备策略。

英文摘要

Local Intrinsic Dimensionality (LID) has shown strong potential for anomaly detection in high-dimensional data, including landslide failure detection in granular media, where early and accurate identification of failure zones is crucial for effective geohazard mitigation. However, this task is still challenging due to the spatial correlations and temporal dynamics that are inherently present in surface displacement data. To address this gap, we propose a novel unsupervised framework called spatiotemporal LID (st-LID) that generalizes the LID for robust failure detection in landslide monitoring networks. Our approach introduces three key innovations: (1) Kinematic enhancement, incorporating velocity into the LID computation to capture instantaneous deformation rates and short-term temporal dynamics; (2) Bayesian spatial fusion, which aggregates LID values across spatial neighborhoods via Bayesian estimation, to embed spatial correlations and account for localized noise; and (3) Temporal modeling (t-LID), a new variant that characterizes long-term dynamics of displacement data, providing a robust temporal representation of displacement behavior. By unifying these components, st-LID identifies complex, multi-stage failure zones often overlooked by existing methods. Extensive experiments show that st-LID consistently outperforms state-of-the-art unsupervised baselines in detection precision and lead-time, providing a robust foundation for landslide early warning systems and targeted risk intervention to enhance community resilience and preparedness strategies.
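As background for the quantity being extended, the sketch below shows the standard maximum-likelihood LID estimator computed from nearest-neighbor distances. It is the plain estimator from the LID literature, not st-LID itself; the velocity, Bayesian fusion, and temporal components are not modeled. The helper `ideal_distances` is a hypothetical test fixture that generates the distance profile one would expect in a locally uniform neighborhood of a given dimension.

```python
import math

def lid_mle(distances):
    """Maximum-likelihood LID estimate from distances to the k nearest neighbors.

    Computes -1 / mean(log(r_i / r_k)), where r_k is the largest distance.
    Assumes at least two distinct positive distances.
    """
    r = sorted(d for d in distances if d > 0)
    r_max = r[-1]
    mean_log_ratio = sum(math.log(d / r_max) for d in r) / len(r)
    return -1.0 / mean_log_ratio

def ideal_distances(dim, k):
    """Idealized neighbor distances: the i-th neighbor sits near (i / k) ** (1 / dim)."""
    return [(i / k) ** (1.0 / dim) for i in range(1, k + 1)]

estimate_2d = lid_mle(ideal_distances(2, 100))
estimate_5d = lid_mle(ideal_distances(5, 100))
```

With 100 neighbors the estimates land within a few percent of the true dimensions 2 and 5. On real displacement data the distances would come from a nearest-neighbor search around each monitoring point.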


2601.07408 2026-06-04 cs.CL cs.LG 版本更新

Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning

基于结果锚定的优势重塑用于数学推理中的细粒度信用分配

Ziheng Li, Liu Kang, Feng Xiao, Luxi Xing, Qingyi Si, Zhuoran Li, Weikang Gong, Deqing Yang, Yanghua Xiao, Hongcheng Guo

发表机构 * Fudan University（复旦大学）； XingYun lab, HUJING Digital Media & Entertainment Group（星云实验室，HUJING数字媒体与娱乐集团）； University of Science and Technology Beijing（北京科技大学）； Chinese Academy of Sciences（中国科学院）； Beijing University of Posts and Telecommunications（北京邮电大学）

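AI summary: Proposes Outcome-grounded Advantage Reshaping (OAR), which performs fine-grained credit assignment through two strategies (OAR-P and OAR-G) and significantly improves GRPO on mathematical reasoning.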

详情

AI中文摘要

组相对策略优化（GRPO）已成为一种有前途的无需评论家的强化学习范式，用于推理任务。然而，标准GRPO采用粗粒度的信用分配机制，将组级奖励均匀地传播到序列中的每个令牌，忽略了各个推理步骤的不同贡献。我们通过引入结果锚定优势重塑（OAR）来解决这一局限性，这是一种细粒度的信用分配机制，根据每个令牌对模型最终答案的影响程度重新分配优势。我们通过两种互补策略实例化OAR：（1）OAR-P，通过反事实令牌扰动估计结果敏感性，作为高保真归因信号；（2）OAR-G，使用输入梯度敏感性代理，通过单次反向传播近似影响信号。这些重要性信号与保守的双层优势重塑方案相结合，该方案抑制低影响令牌并提升关键令牌，同时保持整体优势质量。在广泛数学推理基准上的实证结果表明，虽然OAR-P设定了性能上限，但OAR-G以可忽略的计算开销实现了相当的增益，两者均显著优于强GRPO基线，推动了无需评论家的大语言模型推理的边界。

英文摘要

Group Relative Policy Optimization (GRPO) has emerged as a promising critic-free reinforcement learning paradigm for reasoning tasks. However, standard GRPO employs a coarse-grained credit assignment mechanism that propagates group-level rewards uniformly to every token in a sequence, neglecting the varying contribution of individual reasoning steps. We address this limitation by introducing Outcome-grounded Advantage Reshaping (OAR), a fine-grained credit assignment mechanism that redistributes advantages based on how much each token influences the model's final answer. We instantiate OAR via two complementary strategies: (1) OAR-P, which estimates outcome sensitivity through counterfactual token perturbations, serving as a high-fidelity attribution signal; (2) OAR-G, which uses an input-gradient sensitivity proxy to approximate the influence signal with a single backward pass. These importance signals are integrated with a conservative Bi-Level advantage reshaping scheme that suppresses low-impact tokens and boosts pivotal ones while preserving the overall advantage mass. Empirical results on extensive mathematical reasoning benchmarks demonstrate that while OAR-P sets the performance upper bound, OAR-G achieves comparable gains with negligible computational overhead, both significantly outperforming a strong GRPO baseline, pushing the boundaries of critic-free LLM reasoning.
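One way to picture a two-level reshaping that preserves total advantage mass is sketched below. The abstract does not give the paper's actual thresholds or scaling rule, so everything here is an assumption: tokens at or above the mean importance get a hypothetical `boost` weight, the rest get a `suppress` weight, and the weights are rescaled so the per-token advantages still sum to what uniform GRPO assignment would give.

```python
def reshape_advantages(advantage, importance, boost=1.5, suppress=0.5):
    """Redistribute one sequence-level advantage over tokens by importance.

    advantage: the scalar GRPO assigns uniformly to every token.
    importance: non-negative per-token influence scores (from any attribution method).
    boost, suppress: illustrative weights, not values from the paper.
    """
    mean_imp = sum(importance) / len(importance)
    raw = [boost if s >= mean_imp else suppress for s in importance]
    # Rescale so the weights average to 1, which keeps the total advantage unchanged.
    scale = len(raw) / sum(raw)
    return [advantage * w * scale for w in raw]

token_advantages = reshape_advantages(2.0, [0.1, 0.9, 0.2, 0.8])
```

In this toy call the four tokens would each receive 2.0 under uniform assignment; after reshaping, the two high-importance tokens receive more and the other two less, while the total stays the same.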


2601.07036 2026-06-04 cs.CL cs.AI cs.LG 版本更新

Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers

Mid-Think: 通过词元级触发器实现无需训练的中间预算推理

Wang Yang, Debargha Ganguly, Xinpeng Li, Chaoda Song, Shouren Wang, Vikash Singh, Vipin Chaudhary, Xiaotian Han

发表机构 * Case Western Reserve University（凯斯西储大学）

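AI summary: Through attention analysis and prompting experiments, this paper finds that reasoning behavior is controlled mainly by a small set of trigger tokens. Building on this, it proposes Mid-Think, which combines trigger tokens to achieve intermediate-budget reasoning, beats baselines on the accuracy-length trade-off, and reduces training time while improving performance when used in RL training.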

详情

AI中文摘要

混合推理语言模型通常通过高级的Think/No-think指令来控制推理行为，但我们发现这种模式切换主要由一小部分触发词元驱动，而非指令本身。通过注意力分析和受控提示实验，我们表明开头的“Okay”词元会诱导推理行为，而“</think>”后的换行模式则会抑制推理。基于这一观察，我们提出了Mid-Think，一种简单的无需训练的提示格式，通过组合这些触发器实现中间预算推理，在准确率-长度权衡上始终优于固定词元和基于提示的基线。此外，在监督微调后将Mid-Think应用于强化学习训练，可将训练时间减少约15%，同时将Qwen3-8B在AIME上的最终性能从69.8%提升至72.4%，在GPQA上从58.5%提升至61.1%，证明了其在推理时控制和基于强化学习的推理训练中的有效性。

英文摘要

Hybrid reasoning language models are commonly controlled through high-level Think/No-think instructions to regulate reasoning behavior, yet we found that such mode switching is largely driven by a small set of trigger tokens rather than the instructions themselves. Through attention analysis and controlled prompting experiments, we show that a leading "Okay" token induces reasoning behavior, while the newline pattern following "</think>" suppresses it. Based on this observation, we propose Mid-Think, a simple training-free prompting format that combines these triggers to achieve intermediate-budget reasoning, consistently outperforming fixed-token and prompt-based baselines in terms of the accuracy-length trade-off. Furthermore, applying Mid-Think to RL training after SFT reduces training time by approximately 15% while improving final performance of Qwen3-8B on AIME from 69.8% to 72.4% and on GPQA from 58.5% to 61.1%, demonstrating its effectiveness for both inference-time control and RL-based reasoning training.


2411.05894 2026-06-04 cs.CL cs.AI cs.LG 版本更新

SSSD: Simply-Scalable Speculative Decoding

SSSD: 简单可扩展的推测解码

Michele Marzollo, Jiawei Zhuang, Niklas Roemer, Niklas Zwingenberger, Lorenz K. Müller, Lukas Cavigelli

发表机构 * Huawei（华为）； ETH Zurich（苏黎世联邦理工学院）

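AI summary: Proposes SSSD, a training-free speculative decoding method that combines lightweight n-gram matching with hardware-aware speculation. It performs on par with leading training-based methods across many benchmarks, reduces latency by up to 2.9x, and is robust to language and domain shift.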

Comments Accepted to the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026, Main Conference)

详情

AI中文摘要

推测解码已成为加速大型语言模型推理的流行技术。然而，大多数现有方法在生产服务系统中仅带来适度的改进。实现显著加速的方法通常依赖于额外的训练草案模型或辅助模型组件，增加了部署和维护的复杂性。这种增加的复杂性降低了灵活性，特别是当服务负载转移到草案模型训练数据中未充分表示的任务、领域或语言时。我们引入了简单可扩展的推测解码（SSSD），一种无需训练的方法，结合了轻量级n-gram匹配和硬件感知推测。相对于标准自回归解码，SSSD将延迟降低高达2.9倍。它在广泛的基准测试中达到了与领先的基于训练的方法相当的性能，同时需要显著更低的采用成本——无需数据准备、训练或调优——并且在语言和领域变化以及长上下文设置中表现出优越的鲁棒性。

英文摘要

Speculative Decoding has emerged as a popular technique for accelerating inference in Large Language Models. However, most existing approaches yield only modest improvements in production serving systems. Methods that achieve substantial speedups typically rely on an additional trained draft model or auxiliary model components, increasing deployment and maintenance complexity. This added complexity reduces flexibility, particularly when serving workloads shift to tasks, domains, or languages that are not well represented in the draft model's training data. We introduce Simply-Scalable Speculative Decoding (SSSD), a training-free method that combines lightweight n-gram matching with hardware-aware speculation. Relative to standard autoregressive decoding, SSSD reduces latency by up to 2.9x. It achieves performance on par with leading training-based approaches across a broad range of benchmarks, while requiring substantially lower adoption effort--no data preparation, training or tuning are needed--and exhibiting superior robustness under language and domain shift, as well as in long-context settings.
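The n-gram matching half of the idea can be illustrated without any model. The sketch below is a generic lookup-based drafter, written under the assumption that drafts are copied from the most recent earlier occurrence of the current suffix; it is not SSSD's implementation, and the hardware-aware speculation is not modeled. In a real system the `target` list in `accept_prefix` would come from one batched forward pass of the language model over the drafted positions.

```python
def draft_from_ngrams(tokens, n=2, max_draft=4):
    """Propose draft tokens by finding the latest earlier match of the last n tokens."""
    if len(tokens) <= n:
        return []
    key = tokens[-n:]
    # Scan backwards over earlier positions, skipping the suffix itself.
    for start in range(len(tokens) - n - 1, -1, -1):
        if tokens[start:start + n] == key:
            return tokens[start + n:start + n + max_draft]
    return []

def accept_prefix(draft, target):
    """Keep the longest draft prefix that matches the tokens the model would emit."""
    accepted = []
    for d, t in zip(draft, target):
        if d != t:
            break
        accepted.append(d)
    return accepted

context = "the cat sat on the mat and the cat".split()
draft = draft_from_ngrams(context)
accepted = accept_prefix(draft, ["sat", "on", "a", "mat"])
```

Here the suffix "the cat" also appears at the start of the context, so the drafter proposes the four tokens that followed it, and verification keeps the first two. No draft model is trained or stored, which is the source of the low adoption cost the abstract emphasizes.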


2509.05510 2026-06-04 physics.comp-ph cs.LG 版本更新

Causal Multi-fidelity Surrogate Forward and Inverse Models for ICF Implosions

因果多保真替代前向与逆向模型用于ICF内爆

Tyler E. Maltba, Ben S. Southworth, Jeffrey R. Haack, Marc L. Klasky

发表机构 * Theoretical Division, Los Alamos National Laboratory（洛斯阿拉莫斯国家实验室理论部）； Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory（洛斯阿拉莫斯国家实验室计算机、计算与统计科学部）

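AI summary: For inverse problems in inertial confinement fusion, the paper builds a causal, dynamic, multi-fidelity reduced-order surrogate that learns a controller from low- and high-fidelity training data, then uses machine learning models to optimize the radiation temperature drive so that it reproduces observed interface dynamics.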

详情

AI中文摘要

惯性约束聚变（ICF）的持续进展需要解决将实验观测与模拟输入参数相关联的逆向问题，随后进行设计优化。然而，这类高维动态PDE约束优化问题极具挑战性，甚至难以处理。最近研究表明，通过仅考虑某些鲁棒特征可以解决逆向问题。本文考虑ICF靶丸的氘-氚（DT）界面，构建了一个因果、动态、多保真降阶替代模型，将时间依赖的辐射温度驱动映射到界面的半径和速度动力学。该替代模型针对DT界面动力学的ODE嵌入，通过使用低/高保真模拟训练数据（关于辐射能群结构）学习基础解析模型的控制器来构建。在展示了替代界面模型的优异精度后，我们使用机器学习（ML）模型结合替代生成的数据来解决逆向问题，优化辐射温度驱动以复现观测到的界面动力学。对于稀疏时间快照，ML模型进一步表征了采样动力学最具信息量的时间点。总之，我们展示了如何将算子学习、因果架构和物理归纳偏差整合起来，以加速高能量密度系统中的发现、设计和诊断。

英文摘要

Continued progress in inertial confinement fusion (ICF) requires solving inverse problems relating experimental observations to simulation input parameters, followed by design optimization. However, such high-dimensional dynamic PDE-constrained optimization problems are extremely challenging or even intractable. It has been recently shown that inverse problems can be solved by only considering certain robust features. Here we consider the ICF capsule's deuterium-tritium (DT) interface, and construct a causal, dynamic, multifidelity reduced-order surrogate that maps from a time-dependent radiation temperature drive to the interface's radius and velocity dynamics. The surrogate targets an ODE embedding of DT interface dynamics, and is constructed by learning a controller for a base analytical model using low- and high-fidelity simulation training data with respect to radiation energy group structure. After demonstrating excellent accuracy of the surrogate interface model, we use machine learning (ML) models with surrogate-generated data to solve inverse problems optimizing radiation temperature drive to reproduce observed interface dynamics. For sparse snapshots in time, the ML model further characterizes the most informative times at which to sample dynamics. Altogether we demonstrate how operator learning, causal architectures, and physical inductive bias can be integrated to accelerate discovery, design, and diagnostics in high-energy-density systems.


2512.17678 2026-06-04 cs.LG cs.AI 版本更新

You Only Train Once: Differentiable Subset Selection for Omics Data

你只训练一次：用于组学数据的可微分子集选择

Daphné Chopard, Jorge da Silva Gonçalves, Irene Cannistraci, Thomas M. Sutter, Julia E. Vogt

发表机构 * Department of Computer Science, ETH Zurich（计算机科学系，苏黎世联邦理工学院）； Department of Intensive Care and Neonatology, University Children’s Hospital Zurich（重症医学与新生儿科，苏黎世大学儿童医院）

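AI summary: Proposes YOTO, an end-to-end differentiable framework that jointly selects a discrete gene subset and performs prediction, enabling sparse multi-task learning and improving the analysis of single-cell transcriptomic data.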

Comments Camera-ready version accepted at Transactions on Machine Learning Research (TMLR)

详情

Journal ref: Transactions on Machine Learning Research, 2026

AI中文摘要

从单细胞转录组数据中选择紧凑且信息丰富的基因子集对于生物标志物发现、提高可解释性和成本效益分析至关重要。然而，大多数现有的特征选择方法要么作为多阶段流水线运行，要么依赖于事后特征归因，使得选择和预测弱耦合。在这项工作中，我们提出了YOTO（你只训练一次），一个端到端框架，在单个可微架构中联合识别离散基因子集并进行预测。在我们的模型中，预测任务直接指导选择哪些基因，而学习到的子集反过来塑造预测表示。这种闭环反馈使模型能够在训练过程中迭代地优化其选择内容和预测方式。与现有方法不同，YOTO强制执行稀疏性，使得只有选中的基因对推理有贡献，从而无需训练额外的下游分类器。通过多任务学习设计，模型在相关目标之间学习共享表示，使得部分标记的数据集能够相互提供信息，并发现无需额外训练步骤即可跨任务泛化的基因子集。我们在两个代表性的单细胞RNA-seq数据集上评估YOTO，显示它持续优于最先进的基线。这些结果表明，稀疏、端到端、多任务的基因子集选择提高了预测性能，并产生了紧凑且有意义的基因子集，推进了生物标志物发现和单细胞分析。

英文摘要

Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post hoc feature attribution, making selection and prediction weakly coupled. In this work, we present YOTO (you only train once), an end-to-end framework that jointly identifies discrete gene subsets and performs prediction within a single differentiable architecture. In our model, the prediction task directly guides which genes are selected, while the learned subsets, in turn, shape the predictive representation. This closed feedback loop enables the model to iteratively refine both what it selects and how it predicts during training. Unlike existing approaches, YOTO enforces sparsity so that only the selected genes contribute to inference, eliminating the need to train additional downstream classifiers. Through a multi-task learning design, the model learns shared representations across related objectives, allowing partially labeled datasets to inform one another, and discovering gene subsets that generalize across tasks without additional training steps. We evaluate YOTO on two representative single-cell RNA-seq datasets, showing that it consistently outperforms state-of-the-art baselines. These results demonstrate that sparse, end-to-end, multi-task gene subset selection improves predictive performance and yields compact and meaningful gene subsets, advancing biomarker discovery and single-cell analysis.


2510.05013 2026-06-04 stat.ML cs.LG 版本更新

Curiosity-Driven Development of Action and Language in Robots Through Self-Exploration

通过自我探索的机器人好奇心驱动行为与语言发展

Theodore Jerome Tinker, Kenji Doya, Jun Tani

发表机构 * Okinawa Institute of Science and Technology（冲绳科学技术大学院大学）

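AI summary: Using curiosity-driven robot self-exploration, with active inference amortized through Q-learning, this study finds compositional generalization, faster learning, rote pairing that precedes composition, and a U-shaped developmental pattern caused by exception handling, offering an account of efficient human language acquisition.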

Comments 27 pages, 22 pages of supplementary material

详情

AI中文摘要

婴儿通过极少的经验就能泛化习得语言，而大型语言模型需要数十亿的训练标记。人类高效发展的基础是什么？我们通过实验研究了这一问题，其中机器人代理通过好奇心驱动的自我探索学习执行与祈使句（例如，推红色立方体）相关的动作。我们的方法使用Q学习摊销主动推理，实现内在动机的发展性学习。模拟揭示了与发展心理学观察相对应的关键发现。i) 随着组合元素规模的增加，泛化能力显著提高。ii) 好奇心驱动的探索能够加速学习。iii) 句子和动作的机械配对先于组合泛化。iv) 异常处理导致U型发展表现，这种模式类似于儿童语言学习中的表征重述。这些结果表明，好奇心驱动的主动推理解释了内在动机的感觉运动-语言学习如何支持人类和人工代理中的可扩展组合泛化和异常处理。

英文摘要

Infants acquire language with generalization from minimal experience, whereas large language models require billions of training tokens. What underlies efficient development in humans? We investigated this problem through experiments wherein robotic agents learn to perform actions associated with imperative sentences (e.g., push red cube) via curiosity-driven self-exploration. Our approach amortizes active inference using Q-learning, enabling intrinsically motivated developmental learning. The simulations reveal key findings corresponding to observations in developmental psychology. i) Generalization improves drastically as the scale of compositional elements increases. ii) Curiosity-driven exploration enables faster learning. iii) Rote pairing of sentences and actions precedes compositional generalization. iv) Exception-handling induces U-shaped developmental performance, a pattern like representational redescription in child language learning. These results suggest that curiosity-driven active inference accounts for how intrinsically motivated sensorimotor-linguistic learning supports scalable compositional generalization and exception handling in humans and artificial agents.


2512.10236 2026-06-04 cs.DC cs.AR cs.LG 版本更新

Design Space Exploration of DMA based Finer-Grain Compute Communication Overlap

基于DMA的细粒度计算通信重叠的设计空间探索

Shagnik Pal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam, Lizy K. John

发表机构 * Advanced Micro Devices Inc.（先进微器件公司）； The University of Texas at Austin（德克萨斯大学奥斯汀分校）

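AI summary: Proposes FiCCO, a finer-grain compute-communication overlap approach that operates below shard granularity and offloads communication to DMA engines, improving performance across a wider range of network topologies with up to 1.6x speedup.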

详情

AI中文摘要

现代机器学习工作负载需要跨多个GPU分布训练和推理。然而，这些并行化技术常常遭受暴露的关键路径通信，通过计算通信重叠可能实现1.7倍的加速。先前的重叠方法利用ML模型状态和输入已经分片到GPU数量的事实，并在分片粒度上重叠计算和通信。然而，这种粗粒度重叠受到有限网络拓扑支持和次优数据流的限制。在这项工作中，我们转而支持更细粒度的计算通信重叠，称之为FiCCO。FiCCO比传统分片深入一层，为更广泛的网络拓扑解锁重叠，并实现更细粒度的数据流。我们表明，FiCCO打开了比仅分片级别更广泛的执行调度设计空间。为了遍历调度的设计空间，我们研究并表征了进行重叠时的性能低效问题，并将调度与相关的低效特征叠加。我们的表征揭示了分解和基于争用的减速是主要的性能限制因素，并将减速因子与静态计算/通信算子大小相关联。这有助于我们设计启发式方法（框架和运行时可以利用）来根据底层ML操作的性质选择定制的FiCCO调度。最后，为了进一步最小化操作重叠固有的争用低效，我们将通信卸载到GPU DMA引擎。我们评估了来自实际ML部署的几种场景，并证明我们提出的启发式驱动的定制调度可提供高达1.6倍的加速。此外，我们的启发式方法在81%的未见场景中提供了准确选择最优调度的指导。

英文摘要

Modern ML workloads demand distributing training and inference across multiple GPUs. However, these parallelization techniques often suffer from exposed critical-path communication, leaving a potential 1.7x speedup on the table through compute-communication overlap. Prior overlapping methods harness the fact that ML model state and inputs are already sharded into the number of GPUs, and overlap the compute and communication at shard granularity. However, such coarse-grained overlap suffers from limited network topology support, and suboptimal dataflows. In this work, we instead make a case for finer-grain compute-communication overlap which we term FiCCO. FiCCO operates one level deeper than traditional sharding, and unlocks overlap for a wider set of network topologies and enables finer-grain dataflow. We show that FiCCO opens up a wider design space of execution schedules than possible at shard-level alone. To walk the design space of schedules, we study and characterize the performance inefficiencies on doing overlap and overlay the schedules with the associated inefficiency signatures. Our characterization reveals decomposition and contention based slowdowns to be the major performance limiters, and we correlate the slowdown factors with the static compute/communication operator sizes. This helps us design heuristics (that frameworks and runtimes can harness) to select bespoke FiCCO schedules based on the nature of underlying ML operations. Finally, to further minimize contention inefficiencies inherent with operation overlap, we offload communication to GPU DMA engines. We evaluate several scenarios from realistic ML deployments and demonstrate that our proposed heuristics driven bespoke schedules deliver up to 1.6x speedup. Further, our heuristics provide accurate guidance to pick the optimal schedule in 81% of unseen scenarios.
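Why finer chunks help, and why decomposition slowdowns can eat into the gain, can be seen with a back-of-the-envelope pipeline model. The sketch below is a generic two-stage pipeline estimate, not the paper's cost model: it assumes equal-size chunks, that each chunk's transfer must finish before that chunk is computed, and a hypothetical `compute_penalty` factor standing in for the decomposition inefficiency the abstract mentions. Contention between compute and communication is not modeled.

```python
def overlapped_time(compute, comm, chunks, compute_penalty=1.0):
    """Estimated runtime when an operation is split into pipelined chunks.

    compute, comm: total compute and communication time of the unchunked operation.
    chunks: number of equal pieces the operation is split into.
    compute_penalty: assumed slowdown applied to each compute chunk (1.0 = none).
    """
    c = compute_penalty * compute / chunks
    m = comm / chunks
    # The first transfer is exposed, the slower stage then paces the pipeline,
    # and the last compute chunk drains it.
    return m + (chunks - 1) * max(c, m) + c

serial = overlapped_time(8.0, 4.0, chunks=1)      # no overlap: comm then compute
pipelined = overlapped_time(8.0, 4.0, chunks=4)   # ideal four-way overlap
penalized = overlapped_time(8.0, 4.0, chunks=4, compute_penalty=1.25)
```

With these illustrative numbers the serial time is 12 units, ideal four-way overlap gives 9, and a 25% per-chunk compute slowdown brings it back up to 11. That trade-off is the kind of effect a schedule-selection heuristic has to weigh.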


2512.06553 2026-06-04 stat.AP cs.LG 版本更新

A Latent Variable Framework for Scaling Laws in Large Language Models

大型语言模型中缩放定律的潜变量框架

Peiyao Cai, Chengyu Cui, Felipe Maia Polo, Seamus Somerstep, Leshem Choshen, Mikhail Yurochkin, Yuekai Sun, Kean Ming Tan, Gongjun Xu

发表机构 * Department of Statistics, University of Michigan（密歇根大学统计系）； IBM Research and CSAIL, MIT（IBM研究与麻省理工学院计算机科学与人工智能实验室）； Institute of Foundation Models, MBZUAI（MBZUAI基础模型研究所）

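AI summary: Proposes a statistical framework based on latent variable modeling that introduces latent variables to capture heterogeneity across model families and benchmarks, giving a more accurate model of scaling laws for large language models.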

详情

AI中文摘要

我们提出了一个基于潜变量建模的统计框架，用于大型语言模型（LLMs）的缩放定律。我们的工作受到大量具有不同架构和训练策略的新LLM家族迅速涌现的推动，这些模型在越来越多的基准上进行评估。这种异构性使得单一的全局缩放曲线不足以捕捉不同家族和基准之间的性能变化。为了解决这个问题，我们提出了一个潜变量建模框架，其中每个LLM家族与一个潜变量相关联，该潜变量捕获该家族中常见的底层特征。然后，LLM在不同基准上的性能由其潜在技能驱动，这些技能由潜变量和模型自身的可观测特征共同决定。我们开发了该潜变量模型的估计程序，并建立了其统计性质。我们还设计了支持估计和各种下游任务的高效数值算法。在实验上，我们在Open LLM Leaderboard（v1/v2）的12个广泛使用的基准上评估了该方法。

英文摘要

We propose a statistical framework built on latent variable modeling for scaling laws of large language models (LLMs). Our work is motivated by the rapid emergence of numerous new LLM families with distinct architectures and training strategies, evaluated on an increasing number of benchmarks. This heterogeneity makes a single global scaling curve inadequate for capturing how performance varies across families and benchmarks. To address this, we propose a latent variable modeling framework in which each LLM family is associated with a latent variable that captures the common underlying features in that family. An LLM's performance on different benchmarks is then driven by its latent skills, which are jointly determined by the latent variable and the model's own observable features. We develop an estimation procedure for this latent variable model and establish its statistical properties. We also design efficient numerical algorithms that support estimation and various downstream tasks. Empirically, we evaluate the approach on 12 widely used benchmarks from the Open LLM Leaderboard (v1/v2).


2512.03296 2026-06-04 cs.SI cs.CY cs.LG (version update)

Associating Healthcare Teamwork with Patient Outcomes for Predictive Analysis

Hsiao-Ying Lu, Kwan-Liu Ma

Affiliations: Department of Computer Science, University of California, Davis, USA

AI summary: Models healthcare professionals' collaboration networks from electronic health record systems, applies machine learning to detect predictive signals of patient survival, and identifies key network traits associated with improved outcomes; clinical experts validate the potential for real-world use.

Abstract

Cancer treatment outcomes are influenced not only by clinical and demographic factors but also by the collaboration of healthcare teams. However, prior work has largely overlooked the potential role of human collaboration in shaping patient survival. This paper presents an applied AI approach to uncovering the impact of healthcare professionals' (HCPs) collaboration, captured through electronic health record (EHR) systems, on cancer patient outcomes. We model EHR-mediated HCP interactions as networks and apply machine learning techniques to detect predictive signals of patient survival embedded in these collaborations. Our models are cross validated to ensure generalizability, and we explain the predictions by identifying key network traits associated with improved outcomes. Importantly, clinical experts and literature validate the relevance of the identified crucial collaboration traits, reinforcing their potential for real-world applications. This work contributes to a practical workflow for leveraging digital traces of collaboration and AI to assess and improve team-based healthcare. The approach is potentially transferable to other domains involving complex collaboration and offers actionable insights to support data-informed interventions in healthcare delivery.


2511.21035 2026-06-04 cs.LG (version update)

RAVQ-HoloNet: Rate-Adaptive Vector-Quantized Hologram Compression

Shima Rafiei, Zahra Nabizadeh Shahr-Babak, Soroush Khoubyarian, Alexandre Cooper, Shadrokh Samavi, Shahram Shirani

Affiliations: Department of Electrical and Computer Engineering, McMaster University; Department of Physics and Astronomy, University of Waterloo; Institute for Quantum Computing, Department of Physics and Astronomy, University of Waterloo; Computer Science Department, Seattle University

AI summary: Proposes RAVQ-HoloNet, a vector-quantization framework that integrates rate-adaptive compression with the transformation to phase-only holograms, achieving high-fidelity reconstruction at low bit rates and outperforming existing methods.

Abstract

Holography offers significant potential for AR/VR applications. However, its adoption is limited by the high demand for data compression. Existing deep learning approaches generally lack rate adaptivity within a single network and often require multiple models to cover different bandwidth requirements. We present RAVQ-HoloNet, a rate-adaptive vector quantization framework that integrates rate-adaptive compression with the transformation of image data into phase-only holograms. RAVQ-HoloNet achieves high-fidelity reconstructions that outperform current state-of-the-art methods, using two distinct architectural configurations: a standard model optimized for low bit rates and a deeper, extended variant tailored to the ultra-low-bit-rate setting. To evaluate these models, we use the DIV2K dataset as a benchmark for high-fidelity holographic reconstruction. Quantitative analysis in simulation shows that our approach significantly surpasses current benchmarks. Specifically, in the low-bit-rate domain, our method achieves a BD-Rate of -33.91% (a bit-rate reduction) and a BD-PSNR gain of 1.02 dB relative to the state-of-the-art method. Additionally, experimental results on an SLM device show that our method achieves higher contrast and improved quality.


2511.12581 2026-06-04 cs.LG (version update)

LMM-IR: Large-Scale Netlist-Aware Multimodal Framework for Static IR-Drop Prediction

Kai Ma, Zhen Wang, Hongquan He, Qi Xu, Tinghuan Chen, Hao Geng

Affiliations: Tsinghua University

AI summary: Proposes a multimodal framework based on a large-scale netlist transformer and a 3D point cloud representation for fast and accurate static IR-drop prediction, achieving the best F1 score and the lowest MAE on the ICCAD 2023 contest.

Comments Accepted by DAC2025

DOI: 10.1109/DAC63849.2025.11133205

Abstract

Static IR drop analysis is a fundamental and critical task in chip design. However, the process can be time-consuming, potentially requiring several hours. Moreover, addressing IR drop violations frequently demands iterative analysis, which adds to the computational burden. Fast and accurate IR drop prediction is therefore vital for reducing the overall time invested in chip design. In this paper, we propose, for the first time, a novel multimodal approach that efficiently processes SPICE files through a large-scale netlist transformer (LNT). Our key innovation is representing and processing netlist topology as a 3D point cloud, enabling efficient handling of netlists with hundreds of thousands to millions of nodes. All types of data, including netlist files and image data, are encoded into a latent space as features and fed into the model for static voltage drop prediction. This enables the integration of data from multiple modalities for complementary predictions. Experimental results demonstrate that our proposed algorithm achieves the best F1 score and the lowest MAE among the winning teams of the ICCAD 2023 contest and the state-of-the-art algorithms.


2511.03304 2026-06-04 cs.LG cs.AI (version update)

Extending Fair Null-Space Projections for Continuous Attributes to Kernel Methods

Felix Störck, Fabian Hinder, Barbara Hammer

AI summary: Extends fair null-space projections to kernel-induced feature spaces by directly transforming the kernel matrix via the empirical feature space, yielding a model- and fairness-score-agnostic method for continuous protected attributes, and shows competitive or improved performance with support vector regression.

Comments Accepted to ICML 2026

Abstract

With the ongoing integration of machine learning systems into the everyday social life of millions, fairness becomes an ever-increasing priority in their development. Fairness notions commonly rely on protected attributes to assess potential biases. Most of the literature focuses on discrete setups for both the target and the protected attributes. The literature on continuous attributes, especially in conjunction with regression (we refer to this as \emph{continuous fairness}), is scarce. A common strategy is iterative null-space projection, which so far has only been explored for linear models or for embeddings such as those obtained by a non-linear encoder. We improve on this by extending it to kernel-induced feature spaces by means of the ``empirical feature space''. We theoretically derive this as a direct transformation of the kernel matrix, yielding a model- and fairness-score-agnostic method applicable to continuous protected attributes. We demonstrate that our novel approach, in conjunction with Support Vector Regression (SVR), provides competitive or improved performance across multiple datasets in comparison to other contemporary methods.


2408.04607 2026-06-04 stat.ML cond-mat.dis-nn cs.LG (version update)

Risk and cross validation in ridge regression with correlated samples

Alexander Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan

Affiliations: Department of Physics, Harvard University; Center for Brain Science, Harvard University; Society of Fellows, Harvard University; John A. Paulson School of Engineering and Applied Sciences, Harvard University; Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University

AI summary: Uses random matrix theory and free probability to study the asymptotic risk of ridge regression when data points have arbitrary correlations, proposes a corrected generalized cross-validation estimator, CorrGCV, and extends the analysis to test points that are correlated with the training set.

Comments 50 pages, 19 figures. v4: ICML 2025 camera-ready. v5: Fix typo in statement of Theorem 5. v6: typos corrected, to appear in 2026 JSTAT Machine Learning focus collection

Journal ref: International Conference on Machine Learning (2025), https://proceedings.mlr.press/v267/atanasov25a.html

Abstract

Recent years have seen substantial advances in our understanding of high-dimensional ridge regression, but existing theories assume that training examples are independent. By leveraging techniques from random matrix theory and free probability, we provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk. However, in the case where the noise residuals have the same correlations as the data points, one can modify the GCV to yield an efficiently-computable unbiased estimator that concentrates in the high-dimensional limit, which we dub CorrGCV. We further extend our asymptotic analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting. Assuming knowledge of the correlation structure of the time series, this again yields an extension of the GCV estimator, and sharply characterizes the degree to which such test points yield an overly optimistic prediction of long-time risk. We validate the predictions of our theory across a variety of high dimensional data.
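
For reference, the classical GCV estimator that the abstract takes as its starting point can be sketched as below. This is the standard textbook form for independent samples, not the paper's CorrGCV correction; the function name `ridge_gcv`, the $n\lambda$ regularization convention, and the synthetic data are illustrative assumptions.

```python
import numpy as np

def ridge_gcv(X, y, lam):
    """Classical GCV estimate of out-of-sample risk for ridge regression.

    Assumes independent samples; this is NOT the CorrGCV estimator.
    """
    n, p = X.shape
    # Smoother ("hat") matrix S = X (X^T X + n*lam*I)^{-1} X^T
    S = X @ np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T)
    residual = y - S @ y
    train_mse = float(residual @ residual) / n
    # The effective degrees of freedom enter through tr(S) / n
    gcv = train_mse / (1.0 - np.trace(S) / n) ** 2
    return train_mse, gcv

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))   # independent rows (hypothetical data)
y = X @ rng.standard_normal(20) + 0.5 * rng.standard_normal(200)
train_mse, gcv = ridge_gcv(X, y, lam=0.1)
```

Because the denominator lies between 0 and 1, the GCV value is never smaller than the training error it rescales. The abstract's point is that this rescaling no longer predicts out-of-sample risk once the rows of `X` are correlated.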


2511.03000 2026-06-04 stat.ML cs.IT cs.LG math.IT (version update)

Unifying Information-Theoretic and Pair-Counting Clustering Similarity

Alexander J. Gates

Affiliations: School of Data Science, University of Virginia

AI summary: Unifies the pair-counting and information-theoretic families of clustering similarity measures through two perspectives, weighted expansions and higher-order extensions, revealing the analytical connection between them.

Comments 23 pages, 2 figures

Abstract

Comparing clusterings is central to evaluating unsupervised models, yet the many existing similarity measures can produce widely divergent, sometimes contradictory, evaluations. Clustering similarity measures are typically organized into two principal families, pair-counting and information-theoretic, reflecting whether they quantify agreement through element pairs or aggregate information across full cluster contingency tables. Prior work has uncovered parallels between these families and applied empirical normalization or chance-correction schemes, but their deeper analytical connection remains only partially understood. Here, we develop an analytical framework that unifies these families through two complementary perspectives. First, both families are expressed as weighted expansions of observed versus expected co-occurrences, with pair-counting arising as a quadratic, low-order approximation and information-theoretic measures as higher-order, frequency-weighted extensions. Second, we generalize pair-counting to k-tuple agreement and show that information-theoretic measures can be viewed as systematically accumulating higher-order co-assignment structure beyond the pairwise level. We illustrate the approaches analytically for the Rand index and Mutual Information, and show how other indices in each family emerge as natural extensions. Together, these views clarify when and why the two regimes diverge, relating their sensitivities directly to weighting and approximation order, and provide a principled basis for selecting, interpreting, and extending clustering similarity measures across applications.
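
To make the two families concrete, the sketch below computes one representative of each, the Rand index and Mutual Information, from the same contingency table. It uses the standard definitions rather than the paper's unified expansion, and the two toy clusterings are hypothetical.

```python
from collections import Counter
from math import comb, log

def contingency(a, b):
    """Count elements in each (cluster in a, cluster in b) cell."""
    return Counter(zip(a, b))

def rand_index(a, b):
    """Pair-counting: fraction of element pairs on which a and b agree."""
    total = comb(len(a), 2)
    together_both = sum(comb(c, 2) for c in contingency(a, b).values())
    together_a = sum(comb(c, 2) for c in Counter(a).values())
    together_b = sum(comb(c, 2) for c in Counter(b).values())
    # Pairs separated in both clusterings, by inclusion-exclusion
    apart_both = total - together_a - together_b + together_both
    return (together_both + apart_both) / total

def mutual_information(a, b):
    """Information-theoretic: MI (in nats) of the contingency table."""
    n = len(a)
    rows, cols = Counter(a), Counter(b)
    return sum(
        (c / n) * log(c * n / (rows[i] * cols[j]))
        for (i, j), c in contingency(a, b).items()
    )

a = [0, 0, 0, 1, 1, 1]
b = [0, 0, 1, 1, 2, 2]
ri = rand_index(a, b)          # 10 agreeing pairs out of 15
mi = mutual_information(a, b)  # (2/3) * ln 2
```

The Rand index only sees cell counts through `comb(c, 2)`, a quadratic function of each count, while Mutual Information weights each cell by a logarithm of its frequency ratio.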


2505.24528 2026-06-04 cs.CV cs.LG (version update)

Geospatial Foundation Models to Enable Progress on Sustainable Development Goals

Pedram Ghamisi, Weikang Yu, Xiaokang Zhang, Aldino Rizaldy, Jian Wang, Chufeng Zhou, Richard Gloaguen, Gustau Camps-Valls

Affiliations: Helmholtz-Zentrum Dresden-Rossendorf; University of Iceland; Wuhan University; Wuhan University of Science and Technology; Universitat de València

AI summary: Introduces SustainFM, a benchmarking framework grounded in the 17 Sustainable Development Goals for evaluating geospatial foundation models; finds that they often outperform traditional approaches across diverse tasks, and argues for a shift from model-centric development to impact-driven deployment with attention to energy efficiency, generalization, and ethics.

Abstract

Foundation Models (FMs) are large-scale, pre-trained artificial intelligence (AI) systems that have revolutionized natural language processing and computer vision, and are now advancing geospatial analysis and Earth Observation (EO). They promise improved generalization across tasks, scalability, and efficient adaptation with minimal labeled data. However, despite the rapid proliferation of geospatial FMs, their real-world utility and alignment with global sustainability goals remain underexplored. We introduce SustainFM, a comprehensive benchmarking framework grounded in the 17 Sustainable Development Goals with extremely diverse tasks ranging from asset wealth prediction to environmental hazard detection. This study provides a rigorous, interdisciplinary assessment of geospatial FMs and offers critical insights into their role in attaining sustainability goals. Our findings show: (1) While not universally superior, FMs often outperform traditional approaches across diverse tasks and datasets. (2) Evaluating FMs should go beyond accuracy to include transferability, generalization, and energy efficiency as key criteria for their responsible use. (3) FMs enable scalable, SDG-grounded solutions, offering broad utility for tackling complex sustainability challenges. Critically, we advocate for a paradigm shift from model-centric development to impact-driven deployment, and emphasize metrics such as energy efficiency, robustness to domain shifts, and ethical considerations.


2502.00470 2026-06-04 math.OC cs.LG stat.ML (version update)

On the Relationship Between CoCoA and ADMM for Distributed Empirical Risk Minimization

Runxiong Wu, Andi Wang

Affiliations: Department of Industrial & Systems Engineering, University of Wisconsin–Madison

AI summary: Reveals the connection between the CoCoA and ADMM families of distributed ERM algorithms from a unified primal-dual perspective, shows that under ridge regularization CoCoA corresponds to a particular proximal ADMM scheme, and gives a unified convergence analysis and an early-stopping criterion for ADMM-type methods.

Comments 21 pages, 4 figures, 1 table

Journal ref: Published in Transactions on Machine Learning Research (06/2026)

Abstract

Distributed empirical risk minimization (ERM) is often studied through two influential yet seemingly separate families of methods: CoCoA-type algorithms, derived from distributed dual coordinate ascent, and ADMM-type algorithms, derived from consensus and proximal splitting. In this paper, we investigate the connection between the two types of algorithms from a unified primal-dual perspective. We show that consensus ADMM, linearized consensus ADMM, two distributed proximal ADMM variants, and ridge-regularized CoCoA can all be written in a common update form involving a global primal variable and block dual variables. This reformulation makes several previously hidden connections explicit. For ridge-regularized ERM, CoCoA coincides with a particular proximal ADMM scheme at the level of the dual update. Moreover, consensus ADMM on the primal problem is equivalent to proximal ADMM on the dual problem under an explicit parameter mapping together with a sign reversal of the saddle objective; similar correspondences also hold for the linearized variants. These results indicate that ADMM-type algorithms, when finely tuned, perform at least as well as CoCoA on ridge-regularized ERM problems. The unified view also yields a natural primal-dual gap stopping criterion for consensus ADMM and a unified $O(1/T)$ ergodic convergence analysis for the ADMM-type methods. Experiments on synthetic regression problems and real SVM datasets support the predicted relationships, clarify the role of tuning parameters, and show that suitably tuned ADMM variants can outperform CoCoA in the ridge-regularized setting.


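2509.23385 2026-06-04 stat.ML cs.LG (version update)

Flow Matching Calibration for Simulation-Based Inference under Model Misspecification

Pierre-Louis Ruhlmann, Michael Arbel, Florence Forbes, Pedro L. C. Rodrigues

Affiliations: Institut national de physique de la matière (CNRS UMR 7586)

AI summary: Addresses the bias caused by model misspecification in simulation-based inference by proposing a flow-matching corrected posterior estimation method, which uses a small number of calibration samples and the flow matching paradigm to correct the posterior estimator, improving inference accuracy and uncertainty quantification.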

Certified Neural Approximations of Nonlinear Dynamics

Frederik Baymler Mathiesen, Nikolaus Vertovec, Francesco Fabiano, Luca Laurenti, Alessandro Abate

Affiliations: Delft Center for Systems and Control; Department of Computer Science, University of Oxford; The Italian Institute of Artificial Intelligence (AI4I)

AI summary: Proposes an adaptive, parallelizable verification method based on certified first-order models that provides formal error bounds for neural network approximations of nonlinear dynamics, so that they can be safely used as surrogate models, and significantly outperforms existing methods on several benchmarks.

Comments first and second author contributed equally

Abstract

Neural networks hold great potential to act as approximate models of nonlinear dynamical systems, with the resulting neural approximations enabling verification and control of such systems. However, in safety-critical contexts, the use of neural approximations requires formal bounds on their closeness to the underlying system. To address this fundamental challenge, we propose a novel, adaptive, and parallelizable verification method based on certified first-order models. Our approach provides formal error bounds on the neural approximations of dynamical systems, allowing them to be safely employed as surrogates by interpreting the error bound as bounded disturbances acting on the approximated dynamics. We demonstrate the effectiveness and scalability of our method on a range of established benchmarks from the literature, showing that it significantly outperforms the state of the art. Furthermore, we show that our framework can successfully address additional scenarios previously intractable for existing methods -- neural network compression and an autoencoder-based deep learning architecture for training Koopman operators for the purpose of trajectory prediction.


2509.22454 2026-06-04 cs.LG (version update)

Overclocking Electrostatic Generative Models

Daniil Shlenskii, Alexander Korotin

Affiliations: University of California, Berkeley

AI summary: Proposes Inverse Poisson Flow Matching (IPFM), a distillation framework that accelerates electrostatic generative models for all values of the dimensionality $D$, enabling few-step sampling with quality close to or better than the teacher's.

Abstract

Electrostatic generative models such as PFGM++ have recently emerged as a powerful framework, achieving competitive performance in image synthesis. PFGM++ operates in an extended data space with auxiliary dimensionality $D$, recovering the diffusion model framework as $D\to\infty$, while yielding superior empirical results for finite $D$. Like diffusion models, PFGM++ relies on expensive ODE simulations to generate samples, making it computationally costly. To address this, we propose Inverse Poisson Flow Matching (IPFM), a principled distillation framework that accelerates electrostatic generative models across all values of $D$. Our IPFM reformulates distillation as an inverse problem: learning a generator whose induced electrostatic field matches that of the teacher. We derive a tractable training objective for this problem and show that, as $D\to\infty$, our IPFM closely recovers Score Identity Distillation (SiD), a recent method for distilling diffusion models. Empirically, our IPFM produces distilled generators that achieve near-teacher or even superior sample quality using only a few function evaluations. Moreover, we find that one-step generator distillation converges faster at finite $D$ than in the $D\to\infty$ diffusion limit, aligning with prior evidence that finite-$D$ PFGM++ models offer more favorable optimization and sampling behavior.


2505.22988 2026-06-04 cs.LG cs.AI (version update)

Model-Preserving Adaptive Rounding

Albert Tseng, Zhaofeng Sun, Christopher De Sa

Affiliations: Department of Computer Science, Cornell University

AI summary: Proposes YAQA, an adaptive rounding quantization algorithm that directly considers the error at the network's output, derives the first end-to-end error bounds through theoretical analysis, and uses a Kronecker-factored Hessian approximation to reduce error by about 30% relative to GPTQ/LDLQ with no inference overhead.

Comments ICML 2026

Abstract

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do this tractably, most quantization algorithms minimize the immediate activation error of each layer as a proxy for the end-to-end error. However, this ignores the effect of future layers, making it a poor proxy. In this work, we introduce Yet Another Quantization Algorithm (YAQA), an adaptive rounding algorithm that directly considers the error at the network's output. YAQA introduces a series of theoretical results that culminate in the first end-to-end error bounds for quantization algorithms. First, we characterize the convergence time of adaptive rounding algorithms via the structure of their Hessian approximations. We then show that the end-to-end error can be bounded by the approximation's cosine similarity to the true Hessian. This admits a natural Kronecker-factored approximation with corresponding near-optimal Hessian sketches. YAQA is provably better than GPTQ/LDLQ and empirically reduces the error by $\approx 30\%$ over these methods. YAQA even achieves a lower error than quantization aware training. This translates to state of the art performance on downstream tasks, all while adding no inference overhead.


2509.15676 2026-06-04 cs.LG cs.AI cs.CL (version update)

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning

Vaibhav Singh, Soumya Suvra Ghosal, Kapu Nirmal Joshua, Soumyabrata Pal, Sayak Ray Chowdhury

Affiliations: IIT Bombay; UMD College Park; IIT Kanpur; Adobe Research

AI summary: Addresses example selection for in-context learning with a greedy algorithm grounded in information theory and kernel methods, which minimizes query-specific prediction error and adds a diversity regularizer, significantly improving classification performance.

Abstract

In-context learning (ICL) has emerged as a powerful paradigm for adapting large language models (LLMs) to new and data-scarce tasks using only a few carefully selected task-specific examples presented in the prompt. However, given the limited context size of LLMs, a fundamental question arises: Which examples should be selected to maximize performance on a given user query? While nearest-neighbor-based methods like KATE have been widely adopted for this purpose, they suffer from well-known drawbacks in high-dimensional embedding spaces, including poor generalization and a lack of diversity. In this work, we study this problem of example selection in ICL from a principled, information theory-driven perspective. We first model an LLM as a linear function over input embeddings and frame the example selection task as a query-specific optimization problem: selecting a subset of exemplars from a larger example bank that minimizes the prediction error on a specific query. This formulation departs from traditional generalization-focused learning theoretic approaches by targeting accurate prediction for a specific query instance. We derive a principled surrogate objective that is approximately submodular, enabling the use of a greedy algorithm with an approximation guarantee. We further enhance our method by (i) incorporating the kernel trick to operate in high-dimensional feature spaces without explicit mappings, and (ii) introducing an optimal design-based regularizer to encourage diversity in the selected examples. Empirically, we demonstrate significant improvements over standard retrieval methods across a suite of classification tasks, highlighting the benefits of structure-aware, diverse example selection for ICL in real-world, label-scarce scenarios.


2502.06301 2026-06-04 cs.LG cs.NE (version update)

Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning

Matyáš Lorenc, Roman Neruda

Affiliations: Faculty of Mathematics and Physics, Charles University; Institute of Computer Science, Czech Academy of Sciences

AI summary: Experiments with novelty-based variants of evolution strategies (NS-ES and NSR-ES), evaluates their effectiveness in training transformer architectures for reinforcement learning such as Decision Transformers, and explores whether pretrained models can speed up training.

DOI: 10.1109/ICTAI66417.2025.00116
Journal ref: 2025 IEEE 37th International Conference on Tools with Artificial Intelligence (ICTAI), Athens, Greece, 2025, pp. 801-805

Abstract

In this paper, we experiment with novelty-based variants of OpenAI-ES, the NS-ES and NSR-ES algorithms, and evaluate their effectiveness in training complex, transformer-based architectures designed for reinforcement learning, such as Decision Transformers. We also test whether we can accelerate the novelty-based training of these larger models by seeding the training with a pretrained model. The experimental results were mixed. NS-ES showed progress, but it would clearly need many more iterations to yield interesting agents. NSR-ES, on the other hand, proved capable of being used straightforwardly on larger models, since its performance is as similar between the feed-forward model and the Decision Transformer as it was for OpenAI-ES in our previous work.


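2509.08846 2026-06-04 cs.LG cs.AI stat.ML (version update)

Uncertainty Estimation using Variance-Gated Distributions

H. Martin Gillis, Isaac Xu, Thomas Trappenberg

Affiliations: Faculty of Computer Science, Dalhousie University

AI summary: Proposes a variance-gated uncertainty estimation framework based on the signal-to-noise ratio of class probability distributions, which scales predictions by an ensemble-derived confidence factor, to address problems with the additive decomposition of predictive uncertainty in neural networks.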

Comments NeurIPS Workshop: Mathematical Foundations and Operational Integration of Machine Learning for Uncertainty-Aware Decision-Making

2509.07963 2026-06-04 cs.LG (version update)

Customizing the Inductive Biases of Softmax Attention using Structured Matrices

Yilun Kuang, Noah Amsel, Sanae Lotfi, Shikai Qiu, Andres Potapczynski, Andrew Gordon Wilson

Affiliations: University of Cambridge

AI summary: To address the information loss from low-dimensional projections and the lack of a distance-dependent bias in standard attention, proposes high-rank scoring functions based on Block Tensor-Train (BTT) and contiguous Multi-Level Low Rank (MLR) structured matrices, improving performance on in-context regression, language modeling, and long-range time-series forecasting.

Comments ICML 2025. Code available at https://github.com/YilunKuang/structured-attention

Abstract

The core component of attention is the scoring function, which transforms the inputs into low-dimensional queries and keys and takes the dot product of each pair. While the low-dimensional projection improves efficiency, it causes information loss for certain tasks that have intrinsically high-dimensional inputs. Additionally, attention uses the same scoring function for all input pairs, without imposing a distance-dependent compute bias for neighboring tokens in the sequence. In this work, we address these shortcomings by proposing new scoring functions based on computationally efficient structured matrices with high ranks, including Block Tensor-Train (BTT) and contiguous Multi-Level Low Rank (MLR) matrices. On in-context regression tasks with high-dimensional inputs, our proposed scoring functions outperform standard attention for any fixed compute budget. On language modeling, a task that exhibits locality patterns, our MLR-based attention method achieves improved scaling laws compared to both standard attention and variants of sliding window attention. Additionally, we show that both BTT and MLR fall under a broader family of efficient structured matrices capable of encoding either full-rank or distance-dependent compute biases, thereby addressing significant shortcomings of standard attention. Finally, we show that MLR attention has promising results for long-range time-series forecasting.
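
The low-rank limitation described above can be seen in a few lines: projecting to queries and keys and taking dot products is the same as a bilinear form whose matrix has rank at most the head dimension. This is a minimal sketch of standard attention scoring only, not of the BTT or MLR constructions; the sizes are hypothetical, and the usual scaling and softmax are omitted.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r, seq_len = 16, 4, 8           # input dim, head dim, sequence length (assumed sizes)
X = rng.standard_normal((seq_len, d))
W_q = rng.standard_normal((d, r))  # query projection
W_k = rng.standard_normal((d, r))  # key projection

# Standard scoring: project to r dimensions, then take pairwise dot products.
scores = (X @ W_q) @ (X @ W_k).T

# Equivalent bilinear form x_i^T W x_j with W = W_q W_k^T, so rank(W) <= r.
W = W_q @ W_k.T
scores_bilinear = X @ W @ X.T
rank_W = np.linalg.matrix_rank(W)
```

Here `W` is a 16 x 16 matrix but its rank cannot exceed `r = 4`, which is the bottleneck that a higher-rank structured replacement for `W` is meant to remove.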


2509.03351 2026-06-04 cs.LG cs.AI q-bio.QM 版本更新

epiGPTope: A machine learning-based epitope generator and classifier

epiGPTope: 一种基于机器学习的表位生成器和分类器

Natalia Flechas Manrique, Alberto Martínez, Elena López-Martínez, Luc Andrea, Román Orus, Aitor Manteca, Aitziber L. Cortajarena, Llorenç Espinosa-Portalés

发表机构 * Multiverse Computing（多维计算公司）； Centre for Cooperative Research in Biomaterials (CIC biomaGUNE)（生物材料联合研究中心）； Basque Research and Technology Alliance (BRTA)（巴斯克研究与技术联盟）； Donostia International Physics Center（多诺斯蒂亚国际物理中心）； Ikerbasque Foundation for Science（Ikerbasque科学基金会）

AI总结提出基于大型语言模型epiGPTope，通过预训练和微调直接生成新型表位序列，并结合统计分类器预测表位来源（细菌或病毒），以加速合成表位库的构建和筛选。

Comments 11 pages, 4 figures. Supplementary Information with 5 pages, 4 figures

详情

DOI: 10.1021/acssynbio.5c00693
Journal ref: ACS Synthetic Biology 2026 15 (2), 631-642

AI中文摘要

表位是能被抗体或免疫细胞受体识别的短抗原肽序列，对免疫疗法、疫苗和诊断的开发至关重要。然而，由于巨大的组合序列空间（n个氨基酸的线性表位有$20^n$种组合），即使采用高通量实验技术，合成表位库的合理设计也极具挑战。在本研究中，我们提出了一种大型语言模型epiGPTope，该模型在蛋白质数据上预训练，并专门针对线性表位进行微调，首次能够直接生成新型表位样序列，这些序列被发现具有与已知表位相似的统计特性。这种生成方法可用于制备表位候选序列库。我们进一步训练统计分类器来预测表位序列是细菌来源还是病毒来源，从而缩小候选库范围，提高识别特定表位的可能性。我们提出，这种生成模型与预测模型的组合有助于表位发现。该方法仅使用线性表位的一级氨基酸序列，无需几何框架或手工特征。通过开发生成生物学可行序列的方法，我们预期能更快、更经济地生成和筛选合成表位，并在新生物技术开发中具有相关应用。

英文摘要

Epitopes are short antigenic peptide sequences which are recognized by antibodies or immune cell receptors. These are central to the development of immunotherapies, vaccines, and diagnostics. However, the rational design of synthetic epitope libraries is challenging due to the large combinatorial sequence space, $20^n$ combinations for linear epitopes of $n$ amino acids, making screening and testing unfeasible, even with high throughput experimental techniques. In this study, we present a large language model, epiGPTope, pre-trained on protein data and specifically fine-tuned on linear epitopes, which for the first time can directly generate novel epitope-like sequences, which are found to possess statistical properties analogous to those of known epitopes. This generative approach can be used to prepare libraries of epitope candidate sequences. We further train statistical classifiers to predict whether an epitope sequence is of bacterial or viral origin, thus narrowing the candidate library and increasing the likelihood of identifying specific epitopes. We propose that such a combination of generative and predictive models can be of assistance in epitope discovery. The approach uses only primary amino acid sequences of linear epitopes, bypassing the need for a geometric framework or hand-crafted features of the sequences. By developing a method to create biologically feasible sequences, we anticipate faster and more cost-effective generation and screening of synthetic epitopes, with relevant applications in the development of new biotechnologies.
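
To make the $20^n$ figure in the abstract concrete, the following small sketch counts the linear peptide sequences of a given length over the 20 standard amino acids. The function name and the chosen lengths are illustrative assumptions, not values taken from the paper.

```python
def linear_epitope_space(n: int, alphabet_size: int = 20) -> int:
    """Number of distinct linear peptide sequences of length n.

    Assumes each position is chosen independently from `alphabet_size`
    residues (20 standard amino acids by default).
    """
    if n < 0:
        raise ValueError("sequence length must be non-negative")
    return alphabet_size ** n


if __name__ == "__main__":
    # Illustrative lengths only; the paper does not fix a specific n here.
    for length in (5, 9, 15):
        print(f"n={length}: {linear_epitope_space(length):,} sequences")
```

Even at modest lengths the count grows far beyond what exhaustive screening could cover, which is the motivation the abstract gives for a generative model.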


2507.21638 2026-06-04 cs.AI cs.LG cs.MA cs.RO 版本更新

Assistax: A Multi-Agent Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics

Assistax: 一个用于辅助机器人的多智能体硬件加速强化学习基准

Leonard Hinckeldey, Elliot Fosong, Rimvydas Rubavicius, Elle Miller, Trevor McInroe, Fan Zhang, Patricia Wollstadt, Stefano V. Albrecht, Subramanian Ramamoorthy

发表机构 * University of California, Berkeley（加州大学伯克利分校）； Stanford University（斯坦福大学）

AI总结提出Assistax基准，利用JAX硬件加速和基于多智能体强化学习的辅助机器人任务，实现高达370倍加速，并测试机器人的零样本协调能力。

Comments Accepted at the Reinforcement Learning Conference 2026

详情

AI中文摘要

具有不确定和演化持有成本的排队系统中的调度

Caner Gocmen, Thodoris Lykouris, Deeksha Sinha, Wentao Weng

发表机构 * Meta Platforms（Meta平台）； Massachusetts Institute of Technology（麻省理工学院）

AI总结针对持有成本不确定且演化的排队系统，提出基于马尔可夫链的模型和机会调整剩余成本（OaRC）算法，证明其渐近最优性并优于经典规则。

详情

AI中文摘要

在社交媒体平台的内容审核中，延迟审核内容的成本与其观看轨迹成正比，而观看轨迹是波动的且先验未知。受这种不确定且演化的持有成本的启发，我们考虑一个排队模型，其中作业状态基于马尔可夫链演化，并具有状态相关的瞬时持有成本。我们证明，在存在这种不确定且演化的持有成本的情况下，两个经典算法原则——瞬时成本（$c\mu$规则）和期望剩余成本（$c\mu/\theta$规则）——是次优的。通过将每个作业视为一个马尔可夫滑雪租赁问题，我们开发了一种新的基于索引的算法——机会调整剩余成本（OaRC），该算法在不确定性部分解决时调整到未来服务作业的机会。我们证明OaRC的次优性差距为$\tilde{O}(\sqrt{N})$，其中$N$是系统规模。这个界限表明，当系统规模$N$趋于无穷时，OaRC对于过载系统实现了渐近最优性。此外，该界限与状态空间大小无关，这在作业状态包含上下文信息时是一个理想性质。我们基于社交媒体平台内容审核中出现的两种持有成本模式（在线广告和用户生成内容）进行了广泛的模拟研究，验证了我们的结果。基于合成和真实数据集的模拟表明，OaRC始终优于基于两个经典算法原则的现有实践。

英文摘要

In content moderation for social media platforms, the cost of delaying the review of a piece of content is proportional to its view trajectory, which fluctuates and is a priori unknown. Motivated by such uncertain and evolving holding costs, we consider a queueing model where job states evolve based on a Markov chain with state-dependent instantaneous holding costs. We demonstrate that in the presence of such uncertain and evolving holding costs, the two canonical algorithmic principles, instantaneous-cost ($c\mu$-rule) and expected-remaining-cost ($c\mu/\theta$-rule), are suboptimal. By viewing each job as a Markovian ski-rental problem, we develop a new index-based algorithm, Opportunity-adjusted Remaining Cost (OaRC), that adjusts to the opportunity of serving jobs in the future when uncertainty partly resolves. We show that the suboptimality gap of OaRC scales as $\tilde{O}(\sqrt{N})$, where $N$ is the system size. This bound shows that OaRC achieves asymptotic optimality for overloaded systems when the system size $N$ scales to infinity. Moreover, the bound is independent of the state-space size, which is a desirable property when job states contain contextual information. We corroborate our results with an extensive simulation study based on two holding cost patterns (online ads and user-generated content) that arise in content moderation for social media platforms. Our simulations based on synthetic and real datasets demonstrate that OaRC consistently outperforms existing practice, which is based on the two canonical algorithmic principles.
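
The two baseline index rules named in the abstract can be sketched as follows. This is only an illustration of the classical rules with static parameters, not the OaRC algorithm, whose index is not specified in the abstract. The `Job` fields, the example numbers, and the reading of $\theta$ as the rate at which a job leaves on its own are assumptions made for this sketch.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Job:
    name: str
    holding_cost: float    # c: cost accrued per unit time while waiting
    service_rate: float    # mu: rate at which a server completes the job
    departure_rate: float  # theta: assumed rate at which the job leaves unserved


def c_mu_index(job: Job) -> float:
    """Instantaneous-cost index: holding cost times service rate."""
    return job.holding_cost * job.service_rate


def c_mu_over_theta_index(job: Job) -> float:
    """Expected-remaining-cost index: c * mu / theta."""
    return job.holding_cost * job.service_rate / job.departure_rate


def pick_next(jobs: Sequence[Job], index: Callable[[Job], float]) -> Job:
    """Serve the waiting job with the largest index value."""
    return max(jobs, key=index)


# Hypothetical jobs: an ad that is costly now but fades quickly, and a
# post that is cheaper per unit time but lingers much longer.
jobs = [
    Job("ad", holding_cost=4.0, service_rate=1.0, departure_rate=4.0),
    Job("post", holding_cost=3.0, service_rate=1.0, departure_rate=0.5),
]

if __name__ == "__main__":
    print(pick_next(jobs, c_mu_index).name)
    print(pick_next(jobs, c_mu_over_theta_index).name)
```

The two rules disagree on which job to serve first in this toy instance. Both use fixed per-job parameters, whereas the paper's setting has holding costs that evolve with a Markov chain.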


2505.19293 2026-06-04 cs.CL cs.AI cs.LG 版本更新

100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

100-LongBench：事实上的长上下文基准是否真的在评估长上下文能力？

Wang Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han

发表机构 * Case Western Reserve University（凯斯西储大学）； Texas A&M University（德克萨斯农工大学）； Rice University（莱斯大学）； University of California, Los Angeles（加州大学洛杉矶分校）； Meta（Meta公司）

AI总结针对现有长上下文基准无法分离基线能力与真实长上下文能力、且输入长度固定等问题，提出长度可控的长上下文基准和新指标，以有效评估大语言模型的长上下文能力。

详情

AI中文摘要

长上下文能力被认为是LLM最重要的能力之一，因为真正具备长上下文能力的LLM使用户能够轻松处理许多原本繁琐的任务——例如，阅读长文档寻找答案与直接询问LLM。然而，现有的基于真实任务的长上下文评估基准有两个主要缺陷。首先，像LongBench这样的基准通常没有提供适当的指标来将长上下文性能与模型的基线能力分开，使得跨模型比较不清晰。其次，此类基准通常以固定输入长度构建，这限制了它们在不同模型上的适用性，并且无法揭示模型何时开始崩溃。为了解决这些问题，我们引入了一个长度可控的长上下文基准和一个新颖的指标，该指标将基线知识与真实的长上下文能力解耦。实验证明了我们的方法在有效评估LLM方面的优越性。

英文摘要

Long-context capability is considered one of the most important abilities of LLMs, as a truly long context-capable LLM enables users to effortlessly process many originally exhausting tasks -- e.g., digesting a long-form document to find answers vs. directly asking an LLM about it. However, existing real-task-based long-context evaluation benchmarks have two major shortcomings. First, benchmarks like LongBench often do not provide proper metrics to separate long-context performance from the model's baseline ability, making cross-model comparison unclear. Second, such benchmarks are usually constructed with fixed input lengths, which limits their applicability across different models and fails to reveal when a model begins to break down. To address these issues, we introduce a length-controllable long-context benchmark and a novel metric that disentangles baseline knowledge from true long-context capabilities. Experiments demonstrate the superiority of our approach in effectively evaluating LLMs.


2505.17315 2026-06-04 cs.AI cs.CL cs.LG 版本更新

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning

更长上下文，更深思考：揭示长上下文能力在推理中的作用

Wang Yang, Zirui Liu, Hongye Jin, Qingyu Yin, Vipin Chaudhary, Xiaotian Han

发表机构 * Case Western Reserve University（凯斯西储大学）； University of Minnesota - Twin Cities（明尼苏达大学双城分校）； Texas A&M University（德克萨斯农工大学）

AI总结本研究通过实验发现，增强模型的长上下文能力（在监督微调前）能显著提升推理性能，即使对于短输入任务也有泛化收益，表明长上下文建模是推理能力的关键基础。

详情

AI中文摘要

近期语言模型展现出强大的推理能力，但长上下文能力对推理的影响仍未充分探索。在本工作中，我们假设当前推理能力的局限性部分源于长上下文能力不足，这一假设基于经验观察：（1）更高的上下文窗口长度通常带来更强的推理性能，（2）失败的推理案例与失败的长上下文案例相似。为验证这一假设，我们检验了在监督微调（SFT）前增强模型的长上下文能力是否能提升推理性能。具体而言，我们比较了架构和微调数据相同但长上下文能力不同的模型。结果揭示了一致趋势：长上下文能力更强的模型在SFT后，在推理基准上取得了显著更高的准确率。值得注意的是，即使在输入长度较短的任务上，这些增益也持续存在，表明长上下文训练为推理性能提供了可泛化的益处。这些发现表明，长上下文建模不仅对处理长输入至关重要，而且也是推理的关键基础。我们主张将长上下文能力作为未来语言模型设计的首要目标。

英文摘要

Recent language models exhibit strong reasoning capabilities, yet the influence of long-context capacity on reasoning remains underexplored. In this work, we hypothesize that current limitations in reasoning stem, in part, from insufficient long-context capacity, motivated by empirical observations such as (1) higher context window length often leads to stronger reasoning performance, and (2) failed reasoning cases resemble failed long-context cases. To test this hypothesis, we examine whether enhancing a model's long-context ability before Supervised Fine-Tuning (SFT) leads to improved reasoning performance. Specifically, we compared models with identical architectures and fine-tuning data but varying levels of long-context capacity. Our results reveal a consistent trend: models with stronger long-context capacity achieve significantly higher accuracy on reasoning benchmarks after SFT. Notably, these gains persist even on tasks with short input lengths, indicating that long-context training offers generalizable benefits for reasoning performance. These findings suggest that long-context modeling is not just essential for processing lengthy inputs, but also serves as a critical foundation for reasoning. We advocate for treating long-context capacity as a first-class objective in the design of future language models.


2505.15354 2026-06-04 cs.LG stat.ML 版本更新

Post-Training Corrections for Improved Time-Series Forecasting

用于改进时间序列预测的后训练校正

Hamza Cherkaoui, Malik Tiomoko, Giuseppe Paolo, Zhang Yili, Yu Meng, Zhang Keli, Hafiz Tiomoko Ali

发表机构 * SAMOVAR Télécom SudParis Institut Polytechnique de Paris（Telecom SudParis高等研究院）； Noah's Ark Lab（诺亚方舟实验室）； Independent Researcher（独立研究员）

AI总结受提升法启发，提出后训练校正方法：对已训练预测模型的输出依次施加一组精心选择的校正，以轻量级、模型无关且可扩展的方式提升预测性能，并从仿射校正情形出发给出理论基础，在多个基准数据集上以最小计算开销取得最高30%的精度提升。

详情

AI中文摘要

时间序列预测是各类商业领域中的关键任务，但其本身仍然具有挑战性。通常，大型预测模型通过一次资源密集的运行完成训练。训练结束后，一个自然的问题随之而来：模型性能是否仍有实质性提升的空间？受提升法（boosting）技术的启发，我们引入了后训练校正的概念。该方法通过对已训练预测器的预测结果依次施加一组精心选择的校正来增强其性能。我们的方法为实际场景中提升预测性能提供了一种轻量级、模型无关且可扩展的策略。我们为该方法提供了理论基础，从仿射校正的情形出发，并分析了更一般设定下的期望性能增益和计算成本。在一系列基准数据集上，我们的方法以最小的计算开销，相比现有最先进模型持续带来最高30%的预测精度提升。

英文摘要

Time-series forecasting is a critical task in various business domains, but it remains inherently challenging. Typically, large forecasting models are trained in a single, resource-intensive run. Once training is completed, a natural question arises: is there still potential for meaningful improvement in the model's performance? Motivated by techniques from boosting, we introduce the concept of post-training corrections. This approach enhances a trained forecaster by sequentially applying a carefully selected set of corrections to its predictions. Our method offers a lightweight, model-agnostic, and scalable strategy to improve forecasting performance in practical settings. We provide theoretical foundations for the approach, starting with the affine correction case, and analyze the expected performance gains and computational costs in more general settings. Across a range of benchmark datasets, our method consistently delivers up to a $30\%$ improvement in forecasting accuracy over existing state-of-the-art models, with minimal computational overhead.
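
The affine correction case mentioned in the abstract can be illustrated with a minimal sketch: fit a scale and offset by ordinary least squares on a set of (prediction, target) pairs from a frozen forecaster, then apply them to its outputs. This is an assumed, simplified reading and not the authors' procedure. The function names and the toy numbers are hypothetical, and the in-sample guarantee shown here says nothing about held-out data.

```python
from statistics import fmean
from typing import List, Sequence, Tuple


def fit_affine_correction(preds: Sequence[float],
                          targets: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of targets ~ a * preds + b."""
    mean_p, mean_t = fmean(preds), fmean(targets)
    var_p = sum((p - mean_p) ** 2 for p in preds)
    if var_p == 0.0:
        # Constant predictions: only the offset can be corrected.
        return 1.0, mean_t - mean_p
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(preds, targets))
    a = cov / var_p
    return a, mean_t - a * mean_p


def apply_correction(preds: Sequence[float], a: float, b: float) -> List[float]:
    """Apply the fitted affine map to the forecaster's raw outputs."""
    return [a * p + b for p in preds]


def mse(preds: Sequence[float], targets: Sequence[float]) -> float:
    return fmean([(p - t) ** 2 for p, t in zip(preds, targets)])


# Toy calibration data (hypothetical): the forecaster systematically
# underestimates the scale of the target.
preds = [1.0, 2.0, 3.0, 4.0]
targets = [2.1, 3.9, 6.2, 7.8]

a, b = fit_affine_correction(preds, targets)
corrected = apply_correction(preds, a, b)

if __name__ == "__main__":
    print("raw MSE:", mse(preds, targets))
    print("corrected MSE:", mse(corrected, targets))
```

Because the identity map (a = 1, b = 0) is itself an affine map, the least-squares fit cannot have a larger error than the raw predictions on the data it was fitted to.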


2504.15587 2026-06-04 cs.LG cs.AI 版本更新

MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design

MetaMolGen: 一种用于从头分子设计的神经图基序生成模型

Zimo Yan, Jie Zhang, Zheng Xie, Chang Liu, Yizhen Liu, Yiping Song

发表机构 * National University of Defense Technology（国防科技大学）

AI总结提出基于元学习的分子生成模型MetaMolGen，通过标准化图基序分布和轻量级自回归序列模型，实现少样本和属性条件分子生成。

详情

DOI: 10.46793/match.97-2.11226

AI中文摘要

分子生成在药物发现和材料科学中扮演重要角色，尤其是在数据稀缺场景下，传统生成模型往往难以实现令人满意的条件泛化。为应对这一挑战，我们提出MetaMolGen，一种基于一阶元学习的分子生成器，专为少样本和属性条件分子生成而设计。MetaMolGen通过将图基序映射到标准化潜在空间来标准化其分布，并采用轻量级自回归序列模型生成忠实反映底层分子结构的SMILES序列。此外，它通过集成到生成过程中的可学习属性投影器，支持具有目标属性的分子的条件生成。实验结果表明，MetaMolGen在低数据条件下持续生成有效且多样的SMILES序列，优于传统基线。这突显了其在快速适应和高效条件生成方面的优势，适用于实际分子设计。

英文摘要

Molecular generation plays an important role in drug discovery and materials science, especially in data-scarce scenarios where traditional generative models often struggle to achieve satisfactory conditional generalization. To address this challenge, we propose MetaMolGen, a first-order meta-learning-based molecular generator designed for few-shot and property-conditioned molecular generation. MetaMolGen standardizes the distribution of graph motifs by mapping them to a normalized latent space, and employs a lightweight autoregressive sequence model to generate SMILES sequences that faithfully reflect the underlying molecular structure. In addition, it supports conditional generation of molecules with target properties through a learnable property projector integrated into the generative process. Experimental results demonstrate that MetaMolGen consistently generates valid and diverse SMILES sequences under low-data regimes, outperforming conventional baselines. This highlights its advantage in fast adaptation and efficient conditional generation for practical molecular design.


2405.08036 2026-06-04 cs.LG cs.AI 版本更新

Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning

合作多智能体强化学习中潜在最优联合动作识别

Chang Huang, Shatong Zhu, Junqiao Zhao, Hongtu Zhou, Di Zhang, Hai Zhang, Chen Ye, Ziqiao Wang, Guang Chen

发表机构 * School of Computer Science and Technology, Tongji University（同济大学计算机科学与技术学院）； Stanford University（斯坦福大学）； MOE Key Lab of Embedded System and Service Computing, Tongji University, Shanghai, China（同济大学嵌入式系统与服务计算教育部重点实验室，上海，中国）； The University of Hong Kong（香港大学）； Shanghai Innovation Institute（上海创新研究院）

AI总结针对值函数分解中单调性约束限制表达能力的问题，提出潜在最优联合动作加权方法，通过迭代加权训练保证最优策略恢复，在多个任务上超越现有方法。

Comments ICLR 2026

详情

Journal ref: ICLR 2026

AI中文摘要

值函数分解在合作多智能体强化学习（MARL）中被广泛使用。现有方法通常对联合动作值与个体动作值之间施加单调性约束以实现分散执行。然而，此类约束限制了值函数分解的表达能力，缩小了可表示的联合动作值范围，并阻碍了最优策略的学习。为解决这一问题，我们提出了潜在最优联合动作加权（POW）方法，该方法在现有近似加权策略可能失效的情况下确保最优策略恢复。POW通过一个理论上有依据的迭代加权训练过程，迭代地识别潜在最优联合动作并为其分配更高的训练权重。我们证明该机制保证了真实最优策略的恢复，克服了先前启发式加权策略的局限性。POW是架构无关的，可以无缝集成到现有的值函数分解算法中。在矩阵博弈、难度增强的捕食者-猎物任务、SMAC、SMACv2以及高速公路环境交叉口场景上的大量实验表明，POW显著提升了稳定性，并持续超越了最先进的基于值的MARL方法。

英文摘要

Value function factorization is widely used in cooperative multi-agent reinforcement learning (MARL). Existing approaches often impose monotonicity constraints between the joint action value and individual action values to enable decentralized execution. However, such constraints limit the expressiveness of value factorization, restricting the range of joint action values that can be represented and hindering the learning of optimal policies. To address this, we propose Potentially Optimal Joint Actions Weighting (POW), a method that ensures optimal policy recovery where existing approximate weighting strategies may fail. POW iteratively identifies potentially optimal joint actions and assigns them higher training weights through a theoretically grounded iterative weighted training process. We prove that this mechanism guarantees recovery of the true optimal policy, overcoming the limitations of prior heuristic weighting strategies. POW is architecture-agnostic and can be seamlessly integrated into existing value factorization algorithms. Extensive experiments on matrix games, difficulty-enhanced predator-prey tasks, SMAC, SMACv2, and a highway-env intersection scenario show that POW substantially improves stability and consistently surpasses state-of-the-art value-based MARL methods.


2503.18721 2026-06-04 math.ST cs.CR cs.LG stat.ME stat.ML stat.TH 版本更新

Differentially Private Joint Independence Test

差分隐私联合独立性检验

Xingwei Liu, Yuexin Chen, Jin-Ting Zhang, Wangli Xu

发表机构 * Center for Applied Statistics and School of Statistics, Renmin University of China（应用统计中心和中国人民大学统计学院）； Department of Statistics and Data Science, National University of Singapore（统计与数据科学系，新加坡国立大学）

AI总结针对隐私约束下的多随机向量联合依赖检测问题，提出基于差分隐私置换的dHSIC检验方法，实现有效水平、点态一致性和极小极大最优功效。

Comments 57 pages, 7 figures

详情

AI中文摘要

多个随机向量之间的联合依赖识别在许多统计应用中扮演重要角色，其中数据可能包含敏感或机密信息。本文在差分隐私背景下考虑$d$变量希尔伯特-施密特独立性准则（dHSIC）。鉴于dHSIC经验估计的极限分布是复杂的高斯混沌，非隐私场景下的检验通常基于置换和自助法。为了在隐私约束下检测联合依赖，我们提出了一种采用差分隐私置换方法的基于dHSIC的检验程序。我们证明该方法具有隐私保证、有效水平和点态一致性，而自助法存在功效不一致的问题。我们进一步研究了所提检验在dHSIC和$L_2$度量下的均匀功效，表明该检验在不同隐私机制下达到极小极大最优功效。作为副产品，我们证明了Pfister等人（2018）提出的非隐私置换dHSIC检验是我们差分隐私置换检验的特例，并且我们的结果也建立了其点态和均匀功效——从而解决了该工作中的开放问题。因果推断中的数值模拟和真实数据分析表明，我们提出的检验在实证中表现良好。

英文摘要

Identification of joint dependence among several random vectors plays an important role in many statistical applications, where the data may contain sensitive or confidential information. In this paper, we consider the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) in the context of differential privacy. Given that the limiting distribution of the empirical estimate of dHSIC is a complicated Gaussian chaos, constructing tests in the non-private regime is typically based on permutation and bootstrap methods. To detect joint dependence under privacy constraints, we propose a dHSIC-based testing procedure employing a differentially private permutation methodology. We show that our method enjoys privacy guarantees, a valid level, and pointwise consistency, whereas the bootstrap counterpart suffers from inconsistent power. We further investigate the uniform power of the proposed test under the dHSIC and $L_2$ metrics, showing that the proposed test attains the minimax optimal power across different privacy regimes. As a byproduct, we show that the non-private permutation dHSIC test proposed in Pfister et al. (2018) is a special case of our differentially private permutation test, and our results also establish its pointwise and uniform power--thus resolving an open problem from that work. Both numerical simulations and real data analysis in causal inference suggest that our proposed test performs well empirically.


2502.08870 2026-06-04 cs.LG stat.ML 版本更新

When and why randomised exploration works (in linear bandits)

随机探索何时以及为何有效（在线性赌博机中）

Marc Abeille, David Janz, Ciara Pike-Burke

发表机构 * Criteo AI Lab（Criteo AI实验室）； University of Oxford（牛津大学）； Imperial College London（伦敦帝国学院）

AI总结本文提出一种不依赖强制乐观或后验膨胀的分析方法，证明在动作空间光滑且强凸的d维线性赌博机中，随机探索算法（如汤普森采样）可实现O(d√n log(n))的n步遗憾界，首次表明在非平凡线性赌博机设置中汤普森采样能达到最优维度依赖。

Comments Minor corrections to formulas and text; results unchanged

2408.01382 2026-06-04 cs.LG cs.GT 版本更新

Explaining a probabilistic prediction on the simplex with Shapley compositions

用Shapley组合解释单纯形上的概率预测

Paul-Gauthier Noé, Miquel Perelló-Nieto, Jean-François Bonastre, Peter Flach

发表机构 * Laboratoire Informatique d’Avignon, Avignon Université, France（阿维尼昂信息实验室，阿维尼昂大学，法国）； University of Bristol, United Kingdom（布里斯托大学，英国）

AI总结本文引入Shapley组合，利用成分数据分析的Aitchison几何，为多类概率预测提供了一种基于公理的解释方法。

Comments Published in ECAI2024's proceedings

详情

DOI: 10.3233/FAIA240605

AI中文摘要

源于博弈论的Shapley值被广泛用于通过量化每个特征值对预测的贡献来解释机器学习模型的预测。这需要像二分类中那样的标量预测，而多类概率预测是离散概率分布，位于多维单纯形上。在这种多类设置中，Shapley值通常以一对多的方式单独计算每个类别，忽略了输出分布的组成性质。在本文中，我们引入Shapley组合作为一种有根据的方法来正确解释多类概率预测，使用成分数据分析中的Aitchison几何。我们证明了Shapley组合是满足Aitchison单纯形上的线性性、对称性和效率的唯一量，扩展了标准Shapley值的相应公理性质。我们在一系列场景中展示了这种正确的多类处理。

英文摘要

Originating in game theory, Shapley values are widely used for explaining a machine learning model's prediction by quantifying the contribution of each feature's value to the prediction. This requires a scalar prediction as in binary classification, whereas a multiclass probabilistic prediction is a discrete probability distribution, living on a multidimensional simplex. In such a multiclass setting the Shapley values are typically computed separately on each class in a one-vs-rest manner, ignoring the compositional nature of the output distribution. In this paper, we introduce Shapley compositions as a well-founded way to properly explain a multiclass probabilistic prediction, using the Aitchison geometry from compositional data analysis. We prove that the Shapley composition is the unique quantity satisfying linearity, symmetry and efficiency on the Aitchison simplex, extending the corresponding axiomatic properties of the standard Shapley value. We demonstrate this proper multiclass treatment in a range of scenarios.
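
The Aitchison geometry referred to in the abstract replaces ordinary addition on the simplex with perturbation (component-wise product followed by renormalization). The sketch below shows the standard closure, perturbation, and centered log-ratio (clr) operations from compositional data analysis. It does not compute Shapley compositions themselves, and the example vectors are made up for illustration.

```python
import math
from typing import List, Sequence


def closure(x: Sequence[float]) -> List[float]:
    """Rescale positive parts so that they sum to one."""
    total = sum(x)
    return [v / total for v in x]


def perturb(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Aitchison 'addition': component-wise product, then closure."""
    return closure([a * b for a, b in zip(x, y)])


def clr(x: Sequence[float]) -> List[float]:
    """Centered log-ratio transform: log parts minus their mean."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# Hypothetical three-class example: a base distribution shifted by two
# per-feature contributions that are themselves compositions.
base = closure([1.0, 1.0, 1.0])
contribution_a = closure([2.0, 1.0, 1.0])
contribution_b = closure([1.0, 3.0, 1.0])
prediction = perturb(perturb(base, contribution_a), contribution_b)

if __name__ == "__main__":
    print(prediction)
    print(clr(prediction))
```

In clr coordinates perturbation becomes ordinary vector addition, which is the sense in which contributions on the simplex can be accumulated like standard Shapley values are on the real line.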


2502.05349 2026-06-04 math.OC cs.LG 版本更新

Contextual Scenario Generation for Two-Stage Stochastic Programming

两阶段随机规划的情境生成

David Islip, Roy H. Kwon, Sanghyeon Bae, Woo Chang Kim

发表机构 * Department of Mechanical and Industrial Engineering, University of Toronto（机械与工业工程系，多伦多大学）； Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology (KAIST)（工业与系统工程系，韩国科学技术院（KAIST））

AI总结针对两阶段随机规划中情境数量大、部署受限的问题，提出两种情境生成方法（基于分布和基于任务），通过上下文信息学习生成少量替代情境，并保证决策质量。

Comments 79 pages, 12 figures

详情

AI中文摘要

两阶段随机规划（2SPs）广泛用于不确定性下的决策，但其实际部署通常受限于需要大量情境来近似不确定结果的条件分布。我们研究情境生成：给定上下文信息，学习生成一个小的、用户指定的替代情境集，当将其作为2SP的输入时，能产生高质量的2SP决策。现有的情境生成方法要么忽略上下文信息，要么在此设置下计算负担沉重。我们提出上下文情境生成（CSG），它学习从上下文到一组替代情境的映射。我们开发了两种互补的方法：（i）基于分布的方法，通过最小化与条件分布的基于核的距离来学习从上下文到情境的映射；（ii）基于任务的方法，通过区分下游2SP目标的代理来优化决策质量。这两种方法都广泛适用，仅需要重复求解底层子问题和在生成的情境上定义的2SP。我们提供了有限样本泛化保证，并在多个2SP类别上展示了强大的实证性能。

英文摘要

Two-stage stochastic programs (2SPs) are widely used for decision-making under uncertainty, but their practical deployment is often limited by the large number of scenarios needed to approximate the conditional distribution of uncertain outcomes. We study contextual scenario generation: given contextual information, learn to produce a small, user-specified set of surrogate scenarios that, when used as input into the 2SP, lead to high-quality 2SP decisions. Existing scenario generation methods either ignore contextual information or are computationally burdensome in this setting. We propose contextual scenario generation (CSG), which learns a mapping from context to a set of surrogate scenarios. We develop two complementary methodologies: (i) a distributional approach that learns a mapping from context to scenarios by minimizing a kernel-based distance to the conditional distribution, and (ii) a task-based approach that selects the mapping to optimize decision quality via differentiating through a learned surrogate of the downstream 2SP objective. Both approaches are broadly applicable and require only repeated solution of the underlying subproblems and 2SPs defined on the generated scenarios. We provide finite-sample generalization guarantees and demonstrate strong empirical performance across multiple 2SP classes.


2408.11121 2026-06-04 cs.LG cs.AI cs.CL cs.CR 版本更新

DOMBA: Double Model Balancing for Access-Controlled Language Models via Minimum-Bounded Aggregation

DOMBA: 通过最小有界聚合实现访问控制语言模型的双模型平衡

Tom Segal, Asaf Shabtai, Yuval Elovici

发表机构 * Ben-Gurion University（本·古里安大学）

AI总结提出DOMBA方法，通过最小有界平均函数聚合两个不同访问级别文档训练的语言模型的概率分布，在保证安全性的同时实现高效用。

Comments Code: https://github.com/ppo1/DOMBA 11 pages, 3 figures

详情

DOI: 10.1609/aaai.v39i23.34695
Journal ref: Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 39, pp. 25101-25109, 2025

AI中文摘要

大型语言模型（LLMs）的实用性在很大程度上取决于其训练数据的质量和数量。许多组织拥有大量数据语料库，可用于训练或微调针对其特定需求的LLMs。然而，这些数据集通常带有基于用户权限并由访问控制机制强制执行的访问限制。在此类数据集上训练LLMs可能导致敏感信息暴露给未经授权的用户。防止此类暴露的一种直接方法是为每个访问级别训练一个单独的模型。然而，由于每个模型的训练数据量相对于整个组织语料库的总量有限，这可能导致模型效用低下。另一种方法是在所有数据上训练单个LLM，同时限制未经授权信息的暴露。然而，当前针对LLMs的暴露限制方法对于访问控制数据无效，因为敏感信息在多个训练样本中频繁出现。我们提出DOMBA——双模型平衡——一种训练和部署LLMs的简单方法，可在提供高效用和访问控制功能的同时保证安全性。DOMBA使用“最小有界”平均函数（一个受较小值约束的函数，例如调和平均）聚合两个模型的概率分布，每个模型在具有（可能多个）不同访问级别的文档上训练。详细的数学分析和广泛评估表明，DOMBA在保护受限信息的同时，提供了与非安全模型相当的效用。

英文摘要

The utility of large language models (LLMs) depends heavily on the quality and quantity of their training data. Many organizations possess large data corpora that could be leveraged to train or fine-tune LLMs tailored to their specific needs. However, these datasets often come with access restrictions that are based on user privileges and enforced by access control mechanisms. Training LLMs on such datasets could result in exposure of sensitive information to unauthorized users. A straightforward approach for preventing such exposure is to train a separate model for each access level. This, however, may result in low utility models due to the limited amount of training data per model compared to the amount in the entire organizational corpus. Another approach is to train a single LLM on all the data while limiting the exposure of unauthorized information. However, current exposure-limiting methods for LLMs are ineffective for access-controlled data, where sensitive information appears frequently across many training examples. We propose DOMBA - double model balancing - a simple approach for training and deploying LLMs that provides high utility and access-control functionality with security guarantees. DOMBA aggregates the probability distributions of two models, each trained on documents with (potentially many) different access levels, using a "min-bounded" average function (a function that is bounded by the smaller value, e.g., harmonic mean). A detailed mathematical analysis and extensive evaluation show that DOMBA safeguards restricted information while offering utility comparable to non-secure models.
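
The "min-bounded" aggregation described in the abstract can be sketched with the harmonic mean, the example the abstract itself names. The sketch assumes next-token distributions are given as plain dictionaries and renormalizes the aggregated scores so they sum to one. The renormalization step, the function names, and the toy distributions are assumptions for illustration, not details confirmed by the abstract.

```python
from typing import Dict


def harmonic_mean(p: float, q: float) -> float:
    """Harmonic mean of two probabilities; zero if either is zero."""
    if p <= 0.0 or q <= 0.0:
        return 0.0
    return 2.0 * p * q / (p + q)


def min_bounded_aggregate(dist_a: Dict[str, float],
                          dist_b: Dict[str, float]) -> Dict[str, float]:
    """Combine two next-token distributions token by token.

    Each raw score is at most twice the smaller of the two inputs, so a
    token that either model considers very unlikely stays unlikely.
    """
    tokens = set(dist_a) | set(dist_b)
    raw = {
        tok: harmonic_mean(dist_a.get(tok, 0.0), dist_b.get(tok, 0.0))
        for tok in tokens
    }
    total = sum(raw.values())
    if total == 0.0:
        raise ValueError("the two distributions share no supported token")
    return {tok: score / total for tok, score in raw.items()}


# Hypothetical example: only model_a was trained on documents that
# mention the token "secret".
model_a = {"alpha": 0.7, "beta": 0.2, "secret": 0.1}
model_b = {"alpha": 0.6, "beta": 0.399, "secret": 0.001}
combined = min_bounded_aggregate(model_a, model_b)

if __name__ == "__main__":
    for token, prob in sorted(combined.items()):
        print(token, round(prob, 4))
```

A token that only one model assigns meaningful probability to is suppressed in the combined distribution, which matches the intuition the abstract gives for limiting exposure of information seen by a single model.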


2412.03008 2026-06-04 cs.SI cs.DS cs.LG 版本更新

Local Clustering on Complex Graphs and Complex Hypergraphs

复杂图与复杂超图上的局部聚类

Zihao Li, Dongqi Fu, Hengyu Liu, Jingrui He

发表机构 * University of Illinois Urbana-Champaign（伊利诺伊大学厄巴纳-香槟分校）； Meta

AI总结本文通过扩展非近似的Andersen-Chung-Lang (ACL)聚类算法，提出了GeneralACL和HyperACL两种算法，分别适用于带权、有向、自环图以及边依赖顶点权重的超图，并证明了在温和条件下它们能识别出电导率二次最优的聚类。

Comments KDD 2026, Preprint version. 26 pages

详情

DOI: 10.1145/3770855.3818092

AI中文摘要

局部/种子聚类旨在找到靠近给定起始实例的紧凑聚类。虽然现有的大多数图聚类研究假设离散图设置（即无权重、无向、无自环图），但现实世界的图可能更加复杂。在本文中，我们将经典的非近似Andersen-Chung-Lang (ACL)聚类算法扩展到离散图之外，并将其二次最优性推广到更广泛的复杂图，包括带权、有向、自环图以及具有边依赖顶点权重的超图。具体来说，通过利用PageRank，我们提出了两种算法：用于图的GeneralACL和用于超图的HyperACL。我们证明，在两种温和条件下，这两种算法都能识别出电导率方面二次最优的聚类。此外，我们提供了实验来验证我们的理论发现。我们的代码可在https://github.com/iDEA-iSAIL-Lab-UIUC/HyperACL获取。

英文摘要

Local/seeded clustering aims to find a compact cluster near the given starting instances. While most existing studies on graph clustering assume a discrete graph setting (i.e., unweighted, undirected graphs without self-loops), real-world graphs can be more complex. In this paper, we extend the classic non-approximating Andersen-Chung-Lang (ACL) clustering algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of complex graphs, including weighted, directed, and self-looped graphs and hypergraphs with edge-dependent vertex weights. Specifically, by leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We prove that, under two mild conditions, both algorithms can identify a quadratically optimal cluster in terms of conductance. Additionally, we provide experiments to validate our theoretical findings. Our code is available at https://github.com/iDEA-iSAIL-Lab-UIUC/HyperACL.


2411.19758 2026-06-04 cs.CV cs.AI cs.LG 版本更新

LaVIDE: Language-Prompted Satellite Change Detection via Map-Image Alignment

LaVIDE: 通过地图-图像对齐的语言提示卫星变化检测

Shuguo Jiang, Fang Xu, Chuandong Liu, Hong Tan, Shengyang Li, Lei Yu, Wen Yang, Sen Jia, Gui-Song Xia

发表机构 * School of Computer Science, Wuhan University（武汉大学计算机学院）； School of Artificial Intelligence, Wuhan University（武汉大学人工智能学院）； Technology and Engineering Center for Space Utilization and the Key Laboratory of Space Utilization, Chinese Academy of Sciences（中国科学院空间应用工程与技术中心及空间应用重点实验室）； School of Aeronautics and Astronautics, University of Chinese Academy of Sciences（中国科学院大学航空宇航学院）； School of Electronic Information, Wuhan University（武汉大学电子信息学院）； College of Computer Science and Software Engineering, Shenzhen University（深圳大学计算机科学与软件工程学院）

AI总结提出LaVIDE框架，利用受限提示学习和对象感知嵌入增强，通过语言弥合高层地图类别与低层图像细节之间的语义鸿沟，实现跨模态对齐，在多类与单类变化检测任务上分别提升IoU 18.4%和5.2%。

详情

AI中文摘要

基于地图参考和最新图像的遥感变化检测，在缺乏早期图像进行比较时，有助于及时观测地球表面。然而，高层地图类别与低层图像细节之间的语义鸿沟阻碍了提取同质特征以进行稳健的时间关联。与比较像素级视觉相似性或传播分割误差的传统方法不同，我们提出了一种新颖框架——LaVIDE（用于检测变化的语言-视觉判别器），该框架以语言为中介，弥合了高层地图类别与低层图像细节之间的语义鸿沟。具体来说，我们引入了受限提示学习来生成上下文感知的文本提示，使地图语义与图像内容对齐，并采用对象感知嵌入增强策略将对象级属性（如形状、边界）整合到地图表示中。这些组件能够在统一的语言-视觉特征空间中实现稳健的跨模态对齐。在四个基准数据集（DynamicEarthNet、HRSCD、BANDON和SECOND）上的大量实验表明，LaVIDE以显著优势超越了最先进的方法，在多类和单类变化检测任务上分别实现了18.4%和5.2%的IoU提升。我们的框架不仅提高了地图-图像变化检测的准确性，还为以最少人工干预快速更新地图提供了实用解决方案，有望在城市规划、灾害评估和生态保护等领域产生广泛影响。代码和数据集可在 https://github.com/ShuGuoJ/LAVIDE.git 获取。

英文摘要

Remote sensing change detection based on a map reference and an up-to-date image boosts timely observation of the Earth's surface when earlier images are lacking for comparison. However, the semantic gap between high-level map categories and low-level image details hinders the extraction of homogeneous features for robust temporal association in change detection. Unlike conventional approaches that either compare pixel-level visual similarity or propagate segmentation errors, we propose a novel framework, Language-VIsion Discriminator for dEtecting changes (LaVIDE), which bridges the semantic gap between high-level map categories and low-level image details using language as an intermediary. Specifically, we introduce restricted prompt learning to generate context-aware textual prompts that align map semantics with image content, and an object-aware embedding enhancement strategy to integrate object-level attributes (e.g., shape, boundary) into map representations. These components enable robust cross-modal alignment within a unified language-vision feature space. Extensive experiments on four benchmarks, DynamicEarthNet, HRSCD, BANDON, and SECOND, demonstrate that LaVIDE outperforms state-of-the-art methods by significant margins, achieving $18.4\%$ and $5.2\%$ improvements in IoU on multi-class and single-class change detection tasks, respectively. Our framework not only advances the accuracy of map-image change detection but also provides a practical solution for rapid map updating with minimal human intervention, promising broad impacts in urban planning, disaster assessment, and ecological conservation. Code and datasets are available at: https://github.com/ShuGuoJ/LAVIDE.git.


2411.05591 2026-06-04 stat.ML cs.LG 版本更新

Decentralized EM Algorithm for Gaussian Mixtures under Data Heterogeneity and Partial Labeling

数据异质性和部分标记下高斯混合的分布式EM算法

Xuetong Li, Shuyuan Wu, Bin Du, Hansheng Wang

发表机构 * School of Mathematics and Statistics（数学与统计学学院）； School of Statistics and Data Science（统计与数据科学学院）； Guanghua School of Management（光华管理学院）

AI总结针对分布式联邦学习中数据异质性导致经典EM算法估计有偏的问题，提出动量网络EM（MNEM）算法和半监督MNEM（semi-MNEM）算法，实现渐近有效估计并加速收敛。

详情

AI中文摘要

我们系统研究了分布式联邦学习（DFL）中高斯混合模型的几种基于网络的期望最大化（EM）算法。理论研究表明，当数据在不同站点间异质分布时，直接将经典EM算法扩展到DFL会导致有偏估计。为解决这一问题，我们引入了动量网络EM（MNEM）算法，该算法整合了当前和先前DFL迭代的历史估计信息。我们进一步开发了半监督MNEM（semi-MNEM）算法，利用部分标记数据提供的信息。严格的理论分析表明，在适当的正则条件下，即使数据异质，MNEM估计器也能达到与全样本估计器相同的渐近效率。此外，即使不同混合成分分离较差，semi-MNEM估计器也能显著提高MNEM算法的收敛速度。进行了大量模拟，并分析了一个广泛使用的胸部X射线数据集，以证明所提出方法的有限样本性能。

英文摘要

We systematically study several network-based Expectation-Maximization (EM) algorithms for the Gaussian mixture model within decentralized federated learning (DFL). Our theoretical investigation shows that directly extending the classic EM algorithm to DFL leads to a biased estimator when data are heterogeneously distributed across sites. To address this, we introduce a momentum network EM (MNEM) algorithm, which integrates information from both current and historical estimators from previous DFL iterations. We further develop a semi-supervised MNEM (semi-MNEM) algorithm, which utilizes information provided by partially labeled data. Rigorous theoretical analysis demonstrates that the MNEM estimator can achieve the same asymptotic efficiency as the whole-sample estimator under appropriate regularity conditions, even with heterogeneous data. Moreover, the semi-MNEM estimator significantly improves the convergence speed of the MNEM algorithm, even if different mixture components are poorly separated. Extensive simulations are conducted, and a widely used chest X-ray dataset is analyzed to demonstrate the finite-sample performance of the proposed methods.


2205.08609 2026-06-04 stat.ML cs.LG stat.ME 版本更新

Bagged Polynomial Regression and Neural Networks

袋装多项式回归与神经网络

Sylvia Klosin, Jaume Vives-i-Bastida

发表机构 * Department of Agricultural and Resource Economics, UC Davis（加州大学戴维斯分校农业与资源经济学系）； Stanford Graduate School of Business（斯坦福商学院）

AI总结针对高维预测问题，提出基于随机投影的袋装多项式回归（BPR），在保持与神经网络相当精度的同时提供可解释性和诊断工具。

详情

AI中文摘要

气候和环境应用越来越依赖于从遥感和其他科学数据中进行高维预测。神经网络（NN）在这些场景中能够提供强大的准确性，但往往难以审计且难以与领域知识对齐。作为替代方案，我们提出了基于随机投影的袋装多项式回归（BPR），这是一种计量经济学原生的集成方法，它对在随机选择的协变量组上拟合的多个正则化低次多项式模型进行平均。我们提供了新颖的有限样本和渐近风险界，并展示了协变量划分如何通过控制字典基增长来改善光滑目标函数的速率。速率改进对于边际效应的估计可能尤其重要。在使用光学和雷达图像进行基于卫星的作物分类的应用中，BPR 在保持易于诊断的同时达到了与 NN 相当的准确性。我们提供了实用的透明度工具、系数汇总和偏依赖诊断，表明 BPR 捕捉到了 NN 未能捕捉到的直观特征关系。

英文摘要

Climate and environmental applications increasingly rely on high-dimensional prediction from remote sensing and other scientific data. Neural networks (NN) can deliver strong accuracy in these settings, but they are often hard to audit and hard to align with domain knowledge. As an alternative, we propose bagged polynomial regression with random projections (BPR), an econometrics-native ensemble that averages many regularized low-degree polynomial models fit on randomly selected covariate groups. We provide novel finite-sample and asymptotic risk bounds and show how covariate partitioning can improve rates for smooth target functions by controlling dictionary basis growth. Rate improvements may be particularly relevant for the estimation of marginal effects. In an application to satellite-based crop classification using optical and radar imagery, BPR matches NN accuracy while remaining straightforward to diagnose. We provide practical transparency tools, coefficient summaries and partial-dependence diagnostics, that show BPR captures intuitive feature relationships that NNs do not.


2407.13922 2026-06-04 cs.CV cs.AI cs.LG 版本更新

CounterFace: A Synthetic Face Dataset for Fine-Grained Counterfactual Evaluation of Face Recognition Systems

CounterFace: 用于人脸识别系统细粒度反事实评估的合成人脸数据集

Guruprasad Viswanathan Ramesh, Ashish Hooda, Shimaa Ahmed, Harrison J Rosenberg, Ramya Korlakai Vinayak, Kassem Fawaz

发表机构 * University of Wisconsin-Madison（威斯康星大学麦迪逊分校）； Visa Research（Visa研究）

AI总结提出CounterFace数据集，通过全自动流水线生成包含20种面部属性和8种人口统计因素的11,821个反事实人脸对，用于细粒度评估人脸识别系统在特定属性-人口统计组合下的性能退化。

Comments Code available at https://github.com/Guruprasad68/counterface_facct2026. Dataset available for non-commercial research upon request

详情

DOI: 10.1145/3805689.3812410

AI中文摘要

人脸识别系统广泛应用于关键应用，因此其在不同人群和条件下的可靠性和鲁棒性至关重要。人脸识别系统的标准评估通常依赖LFW等数据集来估计平均识别准确率。一些基准测试也捕捉了粗粒度的身份内变化，如老化、姿态和光照。然而，人脸存在更细粒度的变化，包括发型和化妆等外观变化，这些在现有基准测试中代表性不足。反事实评估提供了一种在细粒度变化下评估人脸识别鲁棒性的方法。然而，现有使用图像生成器合成的反事实人脸数据集由于在流程中使用人工验证，属性覆盖范围有限。我们提出CounterFace，一个新的反事实评估数据集，包含20种面部属性和8种人口统计因素，超过先前合成人脸数据集14种属性和2种人口统计因素。该数据集使用基于现成图像生成器和自定义验证器的全自动流水线生成，无需人工验证。CounterFace包含11,821个反事实人脸对，事后用户研究证实了生成反事实的忠实性。我们评估了两个商业和四个开源人脸识别系统（AWS Rekognition、Face++、AdaFace、MagFace、ArcFace、FaceNet）在160种属性-人口统计组合上的性能。与标准评估基准不同，我们的数据集有助于隔离单个系统的精确故障模式。结果表明，所有六个系统的性能退化因属性和人口统计而异，遮挡属性（如口罩和胡须）普遍降低性能。

英文摘要

Face recognition (FR) systems are widely deployed in critical applications, making their reliability and robustness across diverse populations and conditions essential. Standard evaluation of FR systems typically relies on datasets such as LFW to estimate average recognition accuracy. Some benchmarks also capture coarse-grained intra-identity variations such as aging, pose, and lighting. However, human faces undergo more fine-grained changes, including appearance changes such as hairstyles and makeup, that are underrepresented in existing benchmarks. Counterfactual evaluation provides a method to assess FR robustness under such fine-grained variations. Existing counterfactual face datasets synthesized with image generators, however, are limited in attribute coverage because they rely on human verification in the pipeline. We propose CounterFace, a new counterfactual evaluation dataset comprising 20 facial attributes and 8 demographic factors, exceeding prior synthetic face datasets by 14 attributes and 2 demographics. The dataset is generated using a fully automated pipeline based on off-the-shelf image generators with custom verifiers, removing the need for human verification. CounterFace contains 11,821 counterfactual face pairs, and a post-hoc user study confirms the faithfulness of the generated counterfactuals. We evaluate two commercial and four open-source FR systems (AWS Rekognition, Face++, AdaFace, MagFace, ArcFace, FaceNet) across 160 attribute-demographic combinations. Unlike standard evaluation benchmarks, our dataset helps isolate precise failure modes for individual systems. Results indicate that performance degradation varies across attributes and demographics for all six systems, and that occluding attributes (e.g., facemask and facial hair) universally degrade performance.


2004.10846 2026-06-04 cs.CY cs.LG 版本更新

Reducing the Filtering Effect in Public School Admissions: A Bias-aware Analysis for Targeted Interventions

减少公立学校招生中的过滤效应：面向针对性干预的偏差感知分析

Yuri Faenza, Swati Gupta, Aapeli Vuorinen, Xuan Zhang

发表机构 * Columbia University（哥伦比亚大学）； Massachusetts Institute of Technology（麻省理工学院）

AI总结本研究采用运筹学方法，通过分析纽约市教育部数据，将弱势学生分数分布偏移建模为偏差，并证明针对中等成绩弱势学生的集中干预（如奖学金或培训）可显著降低偏差影响。

详情

AI中文摘要

问题定义：传统上，纽约市顶尖的8所公立学校仅根据学生在特殊高中入学考试（SHSAT）中的成绩选拔候选人。这些成绩已知受到学生社会经济地位和初中所接受的考试准备的影响，导致教育管道中产生巨大的过滤效应。经典的学校分配机制并未自然解决学校隔离和班级多样性等问题，这些问题近年来日益恶化。包括政策制定者在内的科学界通过引入群体特定配额和比例约束来应对，但结果好坏参半。寻找有效且公平的方法以扩大优质教育机会的问题仍未解决。方法/结果：我们采用与大多数现有文献不同的运筹学方法，目标是增加经济需求高的学生的机会。利用纽约市教育部（DOE）的数据，我们展示了被DOE归类为“弱势”（主要基于经济因素的标准）的学生所获分数的分布存在偏移。我们将这种偏移建模为“偏差”，源于对弱势学生真实潜力的低估。我们分析了这种偏差对分类匹配市场的影响。我们表明，当针对中等成绩的弱势学生群体时，通过奖学金或培训进行的集中干预可以显著降低偏差的影响。

英文摘要

Problem definition: Traditionally, New York City's top 8 public schools have selected candidates solely based on their scores in the Specialized High School Admissions Test (SHSAT). These scores are known to be impacted by the socioeconomic status of students and the test preparation received in middle schools, leading to a massive filtering effect in the education pipeline. The classical mechanisms for assigning students to schools do not naturally address problems like school segregation and class diversity, which have worsened over the years. The scientific community, including policymakers, has reacted by incorporating group-specific quotas and proportionality constraints, with mixed results. The problem of finding effective and fair methods for broadening access to top-notch education is still unsolved. Methodology/results: We take an operations approach to the problem that differs from most of the established literature, with the goal of increasing opportunities for students with high economic needs. Using data from the Department of Education (DOE) in New York City, we show that there is a shift in the distribution of scores obtained by students that the DOE classifies as "disadvantaged" (following criteria mostly based on economic factors). We model this shift as a "bias" that results from an underestimation of the true potential of disadvantaged students. We analyze the impact this bias has on an assortative matching market. We show that centrally planned interventions can significantly reduce the impact of bias through scholarships or training, when they target the segment of disadvantaged students with average performance.


1710.04238 2026-06-04 stat.ME cs.LG cs.NA math.NA 版本更新

Regression-aware decompositions

回归感知的分解

Mark Tygert

发表机构 * Facebook Artificial Intelligence Research（脸书人工智能研究）

AI总结本文提出了一种回归感知的分解方法，通过结合线性最小二乘回归模型与插值分解，实现了对矩阵B的监督降维，从而揭示了B中与A回归相关的结构。

Comments 19 pages, 9 figures, 2 tables

详情

Journal ref: Linear Algebra and Its Applications, 565 (6): 208-224, 2019

AI中文摘要

线性最小二乘回归使用“设计”矩阵A来近似给定矩阵B，方法是在所有尺寸相容的矩阵X上最小化谱范数或Frobenius范数意义下的差异||AX-B||。另一种流行的近似方法是低秩近似，可通过主成分分析（PCA，本质上即奇异值分解SVD）或插值分解（ID）实现。传统上，PCA/SVD和ID仅使用被近似的矩阵B，而不受任何辅助矩阵A的监督。然而，线性最小二乘回归模型可以指导ID，从而产生回归感知的ID。作为额外的好处，这可以解释为针对A与B之间某种典型相关分析的回归感知PCA。回归感知的分解使监督信息能够有效指导经典的降维方法，而经典降维方法历来是完全无监督的。回归感知的分解揭示了B中与针对A的回归相关的内在结构。

英文摘要

Linear least-squares regression with a "design" matrix A approximates a given matrix B via minimization of the spectral- or Frobenius-norm discrepancy ||AX-B|| over every conformingly sized matrix X. Another popular approximation is low-rank approximation via principal component analysis (PCA) -- which is essentially singular value decomposition (SVD) -- or interpolative decomposition (ID). Classically, PCA/SVD and ID operate solely with the matrix B being approximated, not supervised by any auxiliary matrix A. However, linear least-squares regression models can inform the ID, yielding regression-aware ID. As a bonus, this provides an interpretation as regression-aware PCA for a kind of canonical correlation analysis between A and B. The regression-aware decompositions effectively enable supervision to inform classical dimensionality reduction, which classically has been totally unsupervised. The regression-aware decompositions reveal the structure inherent in B that is relevant to regression against A.


2312.08472 2026-06-04 cs.NE cs.LG cs.NA math.NA 版本更新

AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions

AutoNumerics-Zero：自动发现最先进的数学函数

Esteban Real, Mirko Rossini, Connal de Souza, Manav Garg, Moritz Firsching, Quoc V. Le, Yao Chen, Akhil Verghese, Ekin Dogus Cubuk, David H. Park

发表机构 * University of California, Berkeley（加州大学伯克利分校）

AI总结本文通过符号回归的进化方法，在不依赖任意精度的情况下，自动发现比传统方法更高效的数学函数近似程序，例如一个10操作的程序逼近指数函数达到14位有效数字。

Comments v2: Accepted to the International Conference on Machine Learning (ICML 2026); added results, clarified framing, and added proofs

详情

AI中文摘要

超越函数（如指数函数）是科学计算的核心，但数字硬件无法原生计算它们。相反，计算机必须通过组合基本运算（如$\{+, -, \times, ÷\}$）使用泰勒级数等方法近似这些函数。这些方法由数学家经过几个世纪发展而来，专注于能够达到任意精度的途径。然而，计算机通过仅使用有限精度类型（如float32）即可处理大多数应用，超出该类型精度的任何精度实际上都被丢弃了。因此，我们探索放弃任意精度是否能够发现更高效的近似。符号回归的进化方法特别适合，因为它可以搜索任意操作组合，并优化不可微的目标（如使用的操作数）。我们的结果表明，进化能够发现在此设置下优于已有方法的计算机程序，尽管除了基本运算的计算外没有先前的数学知识。从空代码开始，符号回归构建表示新颖数学表达式的程序。特别地，我们发现了10个操作的逼近指数函数达到14位有效数字的程序，其精度比此前已知的同等规模近似高出超过6个数量级。

英文摘要

Transcendental functions, such as the exponential, are central to scientific computing, yet they cannot be natively calculated by digital hardware. Instead, computers must approximate these functions by combining basic operations, such as $\{+, -, \times, ÷\}$, using methods like Taylor series. These methods were developed over centuries by mathematicians, who focused on approaches that could attain arbitrary accuracy. However, computers can handle most applications by using only finite-precision types, like float32, where any accuracy beyond the type's precision is effectively discarded. We explore, therefore, whether forgoing arbitrary accuracy can lead to the discovery of more efficient approximations. The evolutionary method of symbolic regression is particularly suitable, as it can search for arbitrary operation combinations and can optimize non-differentiable objectives, such as the number of operations used. Our results show that evolution can discover computer programs that outperform established methods in this setting, despite having no prior mathematical knowledge beyond the calculation of the basic operations. Starting from empty code, symbolic regression constructs programs representing novel mathematical expressions. In particular, we discovered a 10-operation program that approximates the exponential function to 14 significant figures, exceeding the accuracy of previously known approximations of this size by more than 6 orders of magnitude.
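For orientation, the classical baseline mentioned in the abstract (a truncated Taylor series built only from basic arithmetic) can be sketched as follows. This is an illustrative sketch, not one of the evolved programs from the paper; the function names `exp_taylor` and `max_relative_error`, the Horner-style evaluation, and the interval [0, 1] are assumptions chosen for the example.

```python
import math


def exp_taylor(x: float, terms: int = 10) -> float:
    """Approximate e**x with a degree-`terms` Taylor polynomial.

    Evaluated in Horner form, so each loop step uses only one
    division, one multiplication, and one addition.
    """
    result = 1.0
    for k in range(terms, 0, -1):
        # After this step, result = 1 + x/k * (previous partial sum).
        result = 1.0 + (x / k) * result
    return result


def max_relative_error(terms: int, samples: int = 101) -> float:
    """Worst relative error of exp_taylor on an even grid over [0, 1]."""
    worst = 0.0
    for i in range(samples):
        x = i / (samples - 1)
        exact = math.exp(x)
        worst = max(worst, abs(exp_taylor(x, terms) - exact) / exact)
    return worst
```

Calling `max_relative_error(n)` for increasing `n` shows how many basic operations a Taylor polynomial needs before its error drops below a target precision, which is the kind of operation-count versus accuracy trade-off the abstract describes.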


2209.15448 2026-06-04 cs.LG math.ST stat.ME stat.TH 版本更新

Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments

人机交互的福音：混杂环境下的超级强化学习

Jiayi Wang, Zhengling Qi, Chengchun Shi

发表机构 * Department of Mathematical Sciences, University of Texas at Dallas（德克萨斯大学达拉斯分校数学科学系）； Department of Statistics, London School of Economics and Political Science（伦敦政治经济学院统计系）； Department of Decision Sciences, George Washington University（乔治华盛顿大学决策科学系）

AI总结提出利用人机交互中的观察动作进行超级策略学习，在存在未测量混杂的情况下，通过近端因果推断实现优于标准最优策略和行为策略的超级策略。

详情

AI中文摘要

随着人工智能在社会中越来越普遍，整合人类和AI系统以发挥各自优势并降低风险的有效方法已成为重要优先事项。在本文中，我们引入了超级策略学习的范式，该范式利用人机交互进行数据驱动的序贯决策。这种方法将来自AI或人类的观察动作作为输入，以实现决策者（人类或AI）在策略学习中更强的oracle。在存在未测量混杂的决策过程中，过去智能体采取的动作可以揭示未公开信息的有价值见解。通过以一种新颖且合法的方式将这些信息纳入策略搜索，所提出的超级策略学习将产生一个超级策略，该策略保证优于标准最优策略和行为策略（例如，过去智能体的动作）。我们将这种更强的oracle称为人机交互的福音。此外，为了解决使用批处理数据寻找超级策略时的未测量混杂问题，在近端因果推断框架下建立了一系列非参数和因果识别。基于这些新颖的识别结果，我们开发了几种超级策略学习算法，并系统研究了它们的理论性质，例如有限样本遗憾保证。最后，通过大量模拟和实际应用说明了我们方法的有效性。

英文摘要

As AI becomes more prevalent throughout society, effective methods of integrating humans and AI systems that leverage their respective strengths and mitigate risk have become an important priority. In this paper, we introduce the paradigm of super policy learning that takes advantage of Human-AI interaction for data-driven sequential decision making. This approach utilizes the observed action, either from AI or humans, as input for achieving a stronger oracle in policy learning for the decision maker (humans or AI). In the decision process with unmeasured confounding, the actions taken by past agents can offer valuable insights into undisclosed information. By including this information for the policy search in a novel and legitimate manner, the proposed super policy learning will yield a super-policy that is guaranteed to outperform both the standard optimal policy and the behavior policy (e.g., past agents' actions). We call this stronger oracle a blessing from human-AI interaction. Furthermore, to address the issue of unmeasured confounding in finding super-policies using batch data, a number of nonparametric and causal identifications are established under the framework of proximal causal inference. Building upon these novel identification results, we develop several super-policy learning algorithms and systematically study their theoretical properties such as finite-sample regret guarantee. Finally, we illustrate the effectiveness of our proposal through extensive simulations and real-world applications.


1409.6111 2026-06-04 math.OC cs.LG cs.MA cs.SY eess.SY stat.ML 版本更新

Distributed Clustering and Learning Over Networks

网络上的分布式聚类与学习

Xiaochuan Zhao, Ali H. Sayed

发表机构 * Department of Electrical Engineering, University of California, Los Angeles（加州大学洛杉矶分校电气工程系）

AI总结本文提出了一种自适应的聚类和学习方案，使智能体能够学习应与哪些邻居合作以及哪些邻居应忽略，从而在网络中实现更准确的学习和估计。通过详细的均方分析，评估了聚类机制的第一类和第二类错误概率（即虚警和漏检概率），并证明这些概率随步长呈指数衰减，从而可使正确聚类的概率任意接近于一。

Comments 47 pages, 6 figures

详情

DOI: 10.1109/TSP.2015.2415755

AI中文摘要

网络上的分布式处理依赖于网络内处理和邻近智能体之间的合作。当智能体共享共同目标时，合作是有益的。然而，在许多应用中，智能体可能属于不同的集群，追求不同的目标。此时，无差别合作会导致不期望的结果。在本文中，我们提出了一种自适应的聚类和学习方案，使智能体能够学习应与哪些邻居合作以及哪些其他邻居应忽略。通过这样做，所得到的算法使智能体能够识别其集群，并在网络中实现改进的学习和估计准确性。我们进行了详细的均方分析，并评估了聚类机制的第一类和第二类错误概率，即虚警和漏检概率。此外，我们证明这些概率随步长呈指数衰减，从而使正确聚类的概率可以任意接近于一。

英文摘要

Distributed processing over networks relies on in-network processing and cooperation among neighboring agents. Cooperation is beneficial when agents share a common objective. However, in many applications agents may belong to different clusters that pursue different objectives. Then, indiscriminate cooperation will lead to undesired results. In this work, we propose an adaptive clustering and learning scheme that allows agents to learn which neighbors they should cooperate with and which other neighbors they should ignore. In doing so, the resulting algorithm enables the agents to identify their clusters and to attain improved learning and estimation accuracy over networks. We carry out a detailed mean-square analysis and assess the error probabilities of Types I and II, i.e., false alarm and mis-detection, for the clustering mechanism. Among other results, we establish that these probabilities decay exponentially with the step-sizes so that the probability of correct clustering can be made arbitrarily close to one.


1808.03408 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration

AdaGrad的统一分析：带加权聚合和动量加速

Li Shen, Congliang Chen, Fangyu Zou, Zequn Jie, Ju Sun, Wei Liu

发表机构 * JD Explore Academy, Beijing, China（京东探索研究院，北京，中国）； Facebook, USA（Facebook，美国）； Meituan, Beijing, China（美团，北京，中国）； University of Minnesota, Twin Cities, USA（明尼苏达大学双城分校，美国）； Tencent, Shenzhen, China（腾讯，深圳，中国）

AI总结本文提出了一种名为AdaUSM的加权AdaGrad算法，通过统一动量方案和新型加权自适应学习率，实现了在非凸随机设置下的O(log(T)/√T)收敛率，并从新视角解释了Adam和RMSProp的自适应学习率。

Comments IEEE TNNLS

详情

AI中文摘要

将自适应学习率和动量技术整合到SGD中，可以得到一系列高效加速的自适应随机算法，如AdaGrad、RMSProp、Adam、AccAdaGrad等。尽管这些算法在实践中效果显著，但其收敛理论仍存在较大空白，尤其是在困难的非凸随机设置下。为此，我们提出了带统一动量的加权AdaGrad，称为AdaUSM，其主要特点包括（1）采用统一的动量方案，涵盖重球动量和Nesterov加速梯度动量；（2）采用新颖的加权自适应学习率，能够统一AdaGrad、AccAdaGrad、Adam和RMSProp的学习率。此外，当在AdaUSM中采用多项式增长的权重时，可以得到非凸随机设置下的O(log(T)/√T)收敛率。我们还展示了Adam和RMSProp的自适应学习率对应于在AdaUSM中采用指数增长的权重，从而为理解Adam和RMSProp提供了新的视角。最后，我们还在各种深度学习模型和数据集上进行了AdaUSM与带动量的SGD、AdaGrad、AdaEMA、Adam和AMSGrad的比较实验。

英文摘要

Integrating adaptive learning rate and momentum techniques into SGD leads to a large class of efficiently accelerated adaptive stochastic algorithms, such as AdaGrad, RMSProp, Adam, AccAdaGrad, etc. In spite of their effectiveness in practice, there is still a large gap in their convergence theory, especially in the difficult non-convex stochastic setting. To fill this gap, we propose weighted AdaGrad with unified momentum, dubbed AdaUSM, whose main characteristics are that (1) it incorporates a unified momentum scheme which covers both the heavy ball momentum and the Nesterov accelerated gradient momentum; (2) it adopts a novel weighted adaptive learning rate that can unify the learning rates of AdaGrad, AccAdaGrad, Adam, and RMSProp. Moreover, when we take polynomially growing weights in AdaUSM, we obtain its $\mathcal{O}(\log(T)/\sqrt{T})$ convergence rate in the non-convex stochastic setting. We also show that the adaptive learning rates of Adam and RMSProp correspond to taking exponentially growing weights in AdaUSM, thereby providing a new perspective for understanding Adam and RMSProp. Lastly, comparative experiments of AdaUSM against SGD with momentum, AdaGrad, AdaEMA, Adam, and AMSGrad on various deep learning models and datasets are also carried out.
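One way to see the claim that exponentially growing weights recover Adam-style learning rates is a small numerical check. The sketch below is not the AdaUSM algorithm itself; it assumes the second-moment estimate is a normalized weighted average of squared gradients, which may differ in detail from the paper's definition, and the function names are hypothetical.

```python
def weighted_second_moment(grads, weights):
    """Normalized weighted average of squared gradients: sum(a_i * g_i**2) / sum(a_i)."""
    numerator = sum(a * g * g for a, g in zip(weights, grads))
    return numerator / sum(weights)


def adam_second_moment(grads, beta):
    """Exponential moving average of squared gradients with Adam's bias correction."""
    v = 0.0
    for g in grads:
        v = beta * v + (1.0 - beta) * g * g
    return v / (1.0 - beta ** len(grads))


# Illustrative scalar gradients (made up for this example).
grads = [0.5, -1.2, 0.3, 2.0, -0.7]
beta = 0.9

# Exponentially growing weights a_i = beta**(-i) for i = 1..t.
exp_weights = [beta ** (-i) for i in range(1, len(grads) + 1)]
# Uniform weights give the plain running mean of squared gradients,
# i.e. AdaGrad's accumulator divided by t.
uniform_weights = [1.0] * len(grads)

adam_like = weighted_second_moment(grads, exp_weights)
adagrad_like = weighted_second_moment(grads, uniform_weights)
```

Under this assumed normalization, `adam_like` coincides with `adam_second_moment(grads, beta)` up to rounding, while `adagrad_like` is the uniform average that AdaGrad accumulates.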


1709.09480 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

A Benchmark Environment Motivated by Industrial Control Problems

由工业控制问题启发的基准环境

Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing

发表机构 * Siemens AG, Corporate Technology（西门子股份公司企业技术部）

AI总结本文提出一个结合工业控制问题的基准环境，旨在解决真实工业环境与现有人工基准之间缺乏联系的问题，通过详细描述基准动态并识别典型实验设置来促进强化学习方法的改进。

详情

DOI: 10.1109/SSCI.2017.8280935
Journal ref: 2017 IEEE Symposium Series on Computational Intelligence (SSCI)

AI中文摘要

在强化学习（RL）研究领域，频繁出现新的有前景的方法被开发并引入RL社区。然而，尽管许多研究人员渴望将他们的方法应用于现实世界的问题，但在真实工业环境中实施这些方法往往是一个令人沮丧和繁琐的过程。通常，学术研究小组只能有限地访问真实工业数据和应用。因此，新方法通常通过使用人工软件基准来开发、评估和比较。一方面，这些基准旨在提供可解释的RL训练场景和对所用方法学习过程的深入见解。另一方面，它们通常与现实工业应用缺乏相似性。为此，我们利用行业经验设计了一个基准，以弥合自由可用、文档齐全且有动机的人工基准与真实工业问题属性之间的差距。所得到的工业基准（IB）已通过在GitHub上发布其Java和Python代码，包括一个OpenAI Gym包装器，向RL社区公开。在本文中，我们详细阐述了IB的动力学，并识别了能够捕捉现实世界工业控制问题中常见情况的典型实验设置。

英文摘要

In the research area of reinforcement learning (RL), novel and promising methods are frequently developed and introduced to the RL community. However, although many researchers are keen to apply their methods to real-world problems, implementing such methods in real industry environments is often a frustrating and tedious process. Generally, academic research groups have only limited access to real industrial data and applications. For this reason, new methods are usually developed, evaluated and compared by using artificial software benchmarks. On the one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand. On the other hand, they usually do not share much similarity with industrial real-world applications. For this reason, we used our industry experience to design a benchmark which bridges the gap between freely available, documented, and motivated artificial benchmarks and properties of real industrial problems. The resulting industrial benchmark (IB) has been made publicly available to the RL community by publishing its Java and Python code, including an OpenAI Gym wrapper, on GitHub. In this paper, we motivate and describe in detail the IB's dynamics and identify prototypic experimental settings that capture common situations in real-world industry control problems.


2107.01629 2026-06-04 stat.ML cs.LG econ.GN q-fin.EC stat.AP 版本更新

From Live to Recording: Consumer Demand and Response to Price Across the Livestreaming Lifecycle

从直播到录制：消费者对直播生命周期中价格的需求与响应

Ziwei Cong, Jia Liu, Puneet Manchanda

发表机构 * Georgetown University（乔治城大学）； Hong Kong University of Science and Technology（香港科技大学）； University of Michigan（密歇根大学）； Stephen M. Ross School of Business, University of Michigan（密歇根大学罗斯商学院）

AI总结利用大型直播平台数据，研究消费者在直播前后对价格敏感性的差异，发现直播前需求价格弹性更高，主要由消费者自选择和质量不确定性驱动。

Comments An earlier version of this paper was distributed under the title "The Role of 'Live' in Livestreaming Markets: Evidence Using Orthogonal Random Forest."

详情

AI中文摘要

直播已发展成为一个蓬勃发展的行业，创作者可以直接从中获利并与观众和粉丝互动。在实践中，创作者和平台通常将营销工作集中在直播前的时期。然而，直播活动在结束后自然过渡到录制格式，创造了潜在的“剩余”变现机会。本研究利用一个大型直播平台的数据，系统性地考察了消费者在整个直播生命周期中对直播活动的需求，该平台允许消费者在直播结束后购买付费直播活动的录制版本。我们发现，与直播后时期相比，直播前时期的需求对价格更敏感。这部分由两种机制驱动：消费者自选择（不常消费的消费者可能错过了直播活动，对录制版本表现出更高的支付意愿）和质量不确定性（消费者在直播前时期面临的事件质量不确定性高于直播后时期）。我们的研究结果为直播市场的定价和定向策略提供了启示。

英文摘要

Livestreaming has evolved into a thriving industry where creators can directly monetize and engage with their audiences and followers. In practice, creators and platforms typically concentrate their marketing efforts on the period leading up to the livestream. However, livestreaming events naturally transition into recorded formats once the event concludes, creating potential "residual" opportunities for monetization. This study systematically examines consumer demand for live events throughout the entire livestream life-cycle, using data from a large livestreaming platform that allows consumers to purchase the recorded version of a paid live event after the livestream ends. We find that the demand is surprisingly more price-sensitive during the pre-livestream period compared to the post-period. This is partly driven by two mechanisms: consumer self-selection (infrequent consumers who may have missed the live events exhibit a higher willingness to pay for recorded versions) and quality uncertainty (consumers face higher uncertainty in event quality during the pre-period than in the post-period). Our findings generate implications for the pricing and targeting strategies in livestreaming markets.


2006.04013 2026-06-04 cs.CY cs.AI cs.LG 版本更新

AI from concrete to abstract: demystifying artificial intelligence to the general public

从具体到抽象的人工智能：向公众揭秘人工智能

Rubens Lacerda Queiroz, Fábio Ferrentini Sampaio, Cabral Lima, Priscila Machado Vieira Lima

发表机构 * Federal University of Rio de Janeiro – UFRJ – Brazil（巴西里约热内卢联邦大学）； InovLabs – Portugal（葡萄牙InovLabs）； Atlantica University – Portugal（葡萄牙Atlantica大学）； PESC/COPPE； Tercio Pacitti Institute (NCE)（Tercio Pacitti研究所（NCE））

AI总结本文提出一种结合可视化编程与WiSARD无权重人工神经网络的新方法AIcon2abs，通过实践开发学习机器并观察其学习过程，帮助普通大众（包括儿童）理解人工智能的基本概念。

Comments 23 pages; 2 tables; 47 figures; review comment: Included references for the final published peer-reviewed version of this pre-print: https://doi.org/10.1007/s00146-021-01151-x and https://rdcu.be/cihdO; typos corrected

详情

DOI: 10.1007/s00146-021-01151-x
Journal ref: AI & SOCIETY, 36 877-893 (2021)

AI中文摘要

人工智能（AI）已被广泛应用于众多领域，这表明迫切需要开发手段，使普通大众对AI的含义有最基本的理解。本文结合可视化编程与WiSARD无权重人工神经网络，提出了一种新方法——从具体到抽象的人工智能（AIcon2abs），使普通人（包括儿童）能够实现这一目标。该方法的主要策略是通过与学习机器开发相关的实践活动，以及观察其学习过程，来促进对人工智能的去神秘化。因此，它能够使受训者获得技能，从而在涉及采用人工智能机制的辩论和决策中成为有洞察力的参与者。目前，通过编程教授基本AI概念的现有方法将机器智能视为外部元素/模块。经过训练后，该外部模块被耦合到学习者正在开发的主应用程序中。而在本文提出的方法中，训练和分类任务都是构成主程序的模块，就像其他编程结构一样。作为AIcon2abs的一个有益副作用，能够从数据中学习的程序与常规计算机程序之间的区别变得更加明显。此外，WiSARD无权重人工神经网络模型的简单性使得训练和分类任务的内部实现易于可视化和理解。

英文摘要

Artificial Intelligence (AI) has been adopted in a wide range of domains. This shows the imperative need to develop means to endow common people with a minimum understanding of what AI means. Combining visual programming and WiSARD weightless artificial neural networks, this article presents a new methodology, AI from concrete to abstract (AIcon2abs), to enable the general public (including children) to achieve this goal. The main strategy adopted is to promote a demystification of artificial intelligence via practical activities related to the development of learning machines, as well as through the observation of their learning process. Thus, it is possible to provide subjects with skills that contribute to making them insightful actors in debates and decisions involving the adoption of artificial intelligence mechanisms. Currently, existing approaches to the teaching of basic AI concepts through programming treat machine intelligence as an external element/module. After being trained, that external module is coupled to the main application being developed by the learners. In the methodology presented here, both training and classification tasks are blocks that compose the main program, just like the other programming constructs. As a beneficial side effect of AIcon2abs, the difference between a program capable of learning from data and a conventional computer program becomes more evident. In addition, the simplicity of the WiSARD weightless artificial neural network model makes the internal realization of training and classification tasks easy to visualize and understand.

URL PDF HTML ☆

赞 0 踩 0

1809.03225 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Gait learning for soft microrobots controlled by light fields

基于光场控制的软微机器人步态学习

Alexander von Rohr, Sebastian Trimpe, Alonso Marco, Peer Fischer, Stefano Palagi

发表机构 * Micro, Nano, and Molecular Systems Group, Max Planck Institute for Intelligent Systems（马克斯·普朗克智能系统研究所微纳与分子系统组）； Max Planck ETH Center for Learning Systems（马克斯·普朗克-ETH学习系统中心）

AI总结本文提出一种基于贝叶斯优化和高斯过程的概率学习方法，用于优化光场控制的软微机器人步态，通过有限实验预算实现高效且鲁棒的运动性能提升。

Comments 8 pages, 7 figures, to appear in the proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems 2018

详情

DOI: 10.1109/IROS.2018.8594092

AI中文摘要

基于光场控制的软微机器人可以生成多种不同的步态。这种内在的灵活性可以用来最大化其在特定环境中的运动性能，并用于适应变化的条件。然而，由于缺乏准确的运动模型以及微机器人之间的固有变异性，分析控制设计是不可能的。另一方面，常见的数据驱动方法需要运行大量的实验，导致非常特定于样本的结果。本文提出了一种基于贝叶斯优化（BO）和高斯过程（GPs）的概率学习方法，用于光场控制的软微机器人。所提出的方法产生了一种学习方案，该方案在数据效率方面表现优异，能够在有限的实验预算下进行步态优化，并且对微机器人样本之间的差异具有鲁棒性。这些特性是通过在半合成数据集上比较不同的GP先验和BO设置来设计学习方案获得的。开发的学习方案在微机器人实验中得到验证，结果在仅20次实验的预算下，使微机器人的运动性能提高了115%。这些令人鼓舞的结果为基于光场控制的软微机器人和概率学习控制的自适应微机器人系统铺平了道路。

英文摘要

Soft microrobots based on photoresponsive materials and controlled by light fields can generate a variety of different gaits. This inherent flexibility can be exploited to maximize their locomotion performance in a given environment and used to adapt them to changing conditions. However, because of the lack of accurate locomotion models, and given the intrinsic variability among microrobots, analytical control design is not possible. Common data-driven approaches, on the other hand, require running prohibitive numbers of experiments and lead to very sample-specific results. Here we propose a probabilistic learning approach for light-controlled soft microrobots based on Bayesian Optimization (BO) and Gaussian Processes (GPs). The proposed approach results in a learning scheme that is data-efficient, enabling gait optimization with a limited experimental budget, and robust against differences among microrobot samples. These features are obtained by designing the learning scheme through the comparison of different GP priors and BO settings on a semi-synthetic data set. The developed learning scheme is validated in microrobot experiments, resulting in a 115% improvement in a microrobot's locomotion performance with an experimental budget of only 20 tests. These encouraging results lead the way toward self-adaptive microrobotic systems based on light-controlled soft microrobots and probabilistic learning control.

URL PDF HTML ☆

赞 0 踩 0

1904.08962 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Constrained Restless Bandits for Dynamic Scheduling in Cyber-Physical Systems

用于网络物理系统动态调度的受约束 restless 老虎机

Kesav Kaza, Rahul Meshram, Varun Mehta, S. N. Merchant

发表机构 * Department of Electrical Engineering, Indian Institute of Technology Bombay（印度理工学院孟买分校电气工程系）； Polytechnique Montreal（蒙特利尔理工学院）； IIIT Allahabad（印度信息技术学院阿拉哈巴德分校）； University of Ottawa（渥太华大学）； IIT Bombay（印度理工学院孟买分校）

AI总结研究可用臂集随时间变化的受约束 restless 多臂老虎机（CRMAB），证明对应的单臂问题具有阈值型最优策略且可索引，提出 Whittle 指数计算算法和复杂度更低的在线滚动策略，并通过价值函数上界和仿真比较各策略的性能。

Comments 17 pages, 2 figures

详情

AI中文摘要

本文研究了一类受约束的 restless 多臂老虎机（CRMAB）。约束表现为随时间变化的动作集（可用臂集），这种变化可以是随机的或半确定性的。给定一组臂，每个决策区间内可以选择固定数量的臂进行拉动。拉动每个臂会产生依赖于状态的奖励。臂的当前状态只能通过被拉动臂的二进制反馈信号部分观察，而臂的当前可用性则完全可观察。目标是最大化长期累积奖励。未来臂可用性的不确定性以及部分状态信息使这一目标具有挑战性。CRMAB可应用于包含可用性随时间变化的组件的网络物理系统中的资源分配。首先，使用 Whittle 指数策略分析该优化问题。为此，研究了一个受约束的 restless 单臂老虎机，证明其具有阈值型最优策略，并且是可索引的（indexable）。提出了一种计算 Whittle 指数的算法。还以在线滚动（rollout）策略的形式提出了一种复杂度更低的替代求解方法。文中详细讨论了这两种方案的复杂度，表明具有短前瞻的在线滚动策略比 Whittle 指数计算更容易实现。此外，推导了价值函数的上界，以估计各种解的次优程度。仿真研究比较了 Whittle 指数、在线滚动、短视（myopic）和修正 Whittle 指数策略的性能。

英文摘要

This paper studies a class of constrained restless multi-armed bandits (CRMAB). The constraints are in the form of time varying set of actions (set of available arms). This variation can be either stochastic or semi-deterministic. Given a set of arms, a fixed number of them can be chosen to be played in each decision interval. The play of each arm yields a state dependent reward. The current states of arms are partially observable through binary feedback signals from arms that are played. The current availability of arms is fully observable. The objective is to maximize long term cumulative reward. The uncertainty about future availability of arms along with partial state information makes this objective challenging. Applications for CRMAB can be found in resource allocation in cyber-physical systems involving components with time varying availability. First, this optimization problem is analyzed using Whittle's index policy. To this end, a constrained restless single-armed bandit is studied. It is shown to admit a threshold-type optimal policy and is also indexable. An algorithm to compute Whittle's index is presented. An alternate solution method with lower complexity is also presented in the form of an online rollout policy. A detailed discussion on the complexity of both these schemes is also presented, which suggests that online rollout policy with short look ahead is simpler to implement than Whittle's index computation. Further, upper bounds on the value function are derived in order to estimate the degree of sub-optimality of various solutions. The simulation study compares the performance of Whittle's index, online rollout, myopic and modified Whittle's index policies.

URL PDF HTML ☆

赞 0 踩 0

1903.11734 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

A Posteriori Probabilistic Bounds of Convex Scenario Programs with Validation Tests

带验证测试的凸场景程序的后验概率界限

Chao Shang, Fengqi You

发表机构 * College of Engineering, Cornell University, Ithaca, New York（工程学院，康奈尔大学，伊萨卡，纽约）

AI总结本文提出了一种新的后验界限，用于凸场景程序的后验概率评估，结合支持约束的实现和样本外验证数据的表现，以提高随机解的风险评估。

详情

DOI: 10.1109/TAC.2020.3024273
Journal ref: IEEE Transactions on Automatic Control, Sept. 2021, Volume 66, Issue 9, Pages 4015 - 4028

AI中文摘要

场景程序已建立为在不确定性下做出决策的有效工具。为了评估基于场景的解决方案的质量，后验验证测试基于伯努利试验已被广泛采用。然而，为了达到理论上可靠的风险判断，通常需要收集大量的验证样本。在本文中，我们提出了一种新的后验界限，用于凸场景程序的后验概率评估，这些界限依赖于支持约束的实现和样本外验证数据的表现。所提出的界限具有广泛的通用性，因为许多现有的理论结果可以作为特殊情况被纳入其中。为了便于实际应用，还开发了一种系统的方法来参数化后验概率界限，该方法被证明具有多种有利的属性，允许易于实施和清晰的解释。通过综合支持约束和验证测试的全面信息，可以比现有的后验界限更有效地评估随机解的风险。对飞机侧向运动控制器设计的案例研究被提出以验证所提出的后验界限的有效性。

英文摘要

Scenario programs have established themselves as efficient tools towards decision-making under uncertainty. To assess the quality of scenario-based solutions a posteriori, validation tests based on Bernoulli trials have been widely adopted in practice. However, to reach a theoretically reliable judgement of risk, one typically needs to collect massive validation samples. In this work, we propose new a posteriori bounds for convex scenario programs with validation tests, which are dependent on both realizations of support constraints and performance on out-of-sample validation data. The proposed bounds enjoy wide generality in that many existing theoretical results can be incorporated as particular cases. To facilitate practical use, a systematic approach for parameterizing a posteriori probability bounds is also developed, which is shown to possess a variety of desirable properties allowing for easy implementations and clear interpretations. By synthesizing comprehensive information about support constraints and validation tests, improved risk evaluation can be achieved for randomized solutions in comparison with existing a posteriori bounds. Case studies on controller design of aircraft lateral motion are presented to validate the effectiveness of the proposed a posteriori bounds.
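
To make the Bernoulli-trial validation test concrete, here is a minimal sketch that bounds the violation probability of a fixed solution from out-of-sample data. It uses the standard one-sided Hoeffding inequality as a stand-in; it is not the bound proposed in the paper, and the function name and arguments are hypothetical.

```python
import math


def hoeffding_violation_bound(num_violations: int, num_samples: int, beta: float) -> float:
    """Upper bound on the violation probability, valid with confidence >= 1 - beta.

    Each validation sample is treated as an independent Bernoulli trial that
    records whether the candidate solution violates the sampled constraint.
    """
    if num_samples <= 0 or not (0.0 < beta < 1.0):
        raise ValueError("need num_samples > 0 and 0 < beta < 1")
    empirical = num_violations / num_samples
    slack = math.sqrt(math.log(1.0 / beta) / (2.0 * num_samples))
    return min(1.0, empirical + slack)
```

The slack term shrinks only as the inverse square root of the sample count, which is one way to see why a purely validation-based judgement of risk can require very many samples.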


1711.05519 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Accelerated Alternating Projections for Robust Principal Component Analysis

加速交替投影用于鲁棒主成分分析

HanQin Cai, Jian-Feng Cai, Ke Wei

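Affiliations: Department of Mathematics, University of California, Los Angeles; Department of Mathematics, Hong Kong University of Science and Technology; School of Data Science, Fudan University

AI summary: This paper proposes an accelerated alternating projections algorithm for robust principal component analysis that substantially improves the computational efficiency of existing alternating projection methods when updating the low-rank factor, and proves exact recovery guarantees and linear convergence for the algorithm.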

1903.00979 2026-06-04 math.OC cs.LG cs.SY eess.SY math.DS stat.ML 版本更新

Analysis of a Generalized Expectation-Maximization Algorithm for Gaussian Mixture Models: A Control Systems Perspective

Gaussian混合模型中通用期望-最大化算法的分析：控制系统的视角

Sarthak Chatterjee, Orlando Romero, Sérgio Pequito

Affiliations: Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute; Department of Industrial and Systems Engineering, Rensselaer Polytechnic Institute

AI summary: This paper analyzes a generalized expectation-maximization algorithm for Gaussian mixture models from a control systems perspective, studies its convergence properties, and illustrates the advantages of the approach through examples.

Comments 17 pages, 7 figures

1812.05506 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A predictive safety filter for learning-based control of constrained nonlinear dynamical systems

基于约束非线性动力学系统的预测安全过滤器

Kim P. Wabersich, Melanie N. Zeilinger

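Affiliations: Institute for Dynamic Systems and Control, ETH Zurich, Zurich, Switzerland

AI summary: This paper proposes a predictive safety filter that addresses safety under physical limitations in learning-based control by turning a constrained dynamical system into an unconstrained safe system, so that any reinforcement learning algorithm can be applied directly.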

AI中文摘要

将强化学习（RL）技术转移到现实应用中的挑战在于存在物理限制下的安全要求。大多数RL方法，特别是最流行的算法，不支持显式考虑状态和输入约束。在本文中，我们针对具有连续状态和输入空间的非线性系统，引入了一种预测安全过滤器，能够将约束动力学系统转换为无约束安全系统，并且任何RL算法都可以直接应用。预测安全过滤器接收提出的控制输入，并基于当前系统状态决定是否可以安全地应用于实际系统，或者是否需要进行修改。安全通过一个不断更新的安全策略来建立，该策略基于数据驱动的系统模型和考虑状态和输入依赖的不确定性，采用模型预测控制的公式。

英文摘要

The transfer of reinforcement learning (RL) techniques into real-world applications is challenged by safety requirements in the presence of physical limitations. Most RL methods, in particular the most popular algorithms, do not support explicit consideration of state and input constraints. In this paper, we address this problem for nonlinear systems with continuous state and input spaces by introducing a predictive safety filter, which is able to turn a constrained dynamical system into an unconstrained safe system and to which any RL algorithm can be applied 'out-of-the-box'. The predictive safety filter receives the proposed control input and decides, based on the current system state, whether it can be safely applied to the real system or has to be modified. Safety is thereby established by a continuously updated safety policy, which is based on a model predictive control formulation using a data-driven system model and considering state- and input-dependent uncertainties.
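
The filtering idea (pass the proposed input through if it is safe, otherwise modify it) can be sketched on a toy system. The example assumes a one-dimensional single integrator with box constraints and a one-step look-ahead; the paper's filter instead uses a model predictive control formulation with a learned model, so this is only an illustration with hypothetical names and parameters.

```python
def safety_filter(x: float, u_proposed: float, dt: float = 0.1,
                  x_max: float = 1.0, u_max: float = 1.0) -> float:
    """Return the input closest to `u_proposed` that keeps |x_next| <= x_max.

    Assumed toy dynamics: x_next = x + dt * u, with |u| <= u_max and the
    current state already inside the safe set |x| <= x_max.
    """
    # Enforce the input constraint first.
    u = max(-u_max, min(u_max, u_proposed))
    # Inputs that keep the next state inside the state constraint.
    lowest_safe = (-x_max - x) / dt
    highest_safe = (x_max - x) / dt
    return max(lowest_safe, min(highest_safe, u))
```

A learning algorithm can then propose arbitrary inputs; only the filtered input is applied to the system.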


1902.02311 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY 版本更新

Decentralized Multi-Agents by Imitation of a Centralized Controller

通过模仿集中控制器实现去中心化多智能体

Alex Tong Lin, Mark J. Debord, Katia Estabridis, Gary Hewer, Guido Montufar, Stanley Osher

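Affiliations: UCLA; Max Planck Institute, Leipzig; University of California, Los Angeles

AI summary: This paper proposes a new algorithm within the centralized-training, decentralized-execution framework that uses imitation learning to produce decentralized multi-agents, addressing cooperation in non-stationary, partially observable multi-agent reinforcement learning environments.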

AI中文摘要

我们考虑了一个多智能体强化学习问题，其中每个智能体试图在与其他智能体交互时最大化共享奖励，且可能无法通信。通常，智能体无法访问其他智能体的策略，因此每个智能体都处于非平稳和部分可观测的环境中。为了获得去中心化作用的多智能体，我们引入了一种新的算法，该算法基于流行的集中训练、去中心执行框架。该训练框架首先通过单一集中联合空间学习者解决多智能体问题，然后用于指导模仿学习以生成独立的去中心化多智能体。该框架具有灵活性，可以使用任何强化学习算法来获得专家，以及任何模仿学习算法来获得去中心化智能体。这与其它多智能体学习算法不同，例如可能需要更具体的结构。我们为该方法提供了一些理论界限，并展示了通过模仿学习可以获得多智能体问题的去中心化解决方案。

英文摘要

We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent is situated in a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the popular framework of centralized training, but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner, which is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.


1701.00178 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Lazily Adapted Constant Kinky Inference for Nonparametric Regression and Model-Reference Adaptive Control

惰性适应的常数Kinky推断用于非参数回归和模型参考自适应控制

Jan-Peter Calliess

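Affiliations: Dept. of Engineering Science, University of Oxford, UK

AI summary: This paper proposes a lazily adapted constant kinky inference method for nonparametric regression and model-reference adaptive control. It estimates the Hölder constant online and establishes strong universal approximation guarantees, showing that any continuous function can be learned in the limit of increasingly dense data.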

AI中文摘要

非线性集合成员预测、Lipschitz插值或Kinky推断是机器学习中利用预设Lipschitz性质来计算未观测函数值推断的方法。在已知目标函数真实最佳Lipschitz常数的上界时，这些方法提供收敛保证和预测的界限。考虑一个更一般的设置，该设置基于相对于伪度量的Hölder连续性，我们提出了一种在线方法，用于从可能受有界观测误差影响的函数值观测中估计Hölder常数。利用此方法在Kinky推断规则中计算自适应参数，从而得到一种非参数机器学习方法，我们为此建立了强通用逼近保证。也就是说，我们证明我们的预测规则在数据越来越密集的情况下，可以学习任意连续函数，其最坏误差界取决于观测不确定性水平。我们在非参数模型参考自适应控制（MRAC）的背景下应用了我们的方法。在一系列模拟飞机滚动动力学和性能指标中，我们的方法优于基于高斯过程和RBF神经网络最近提出的方法。对于离散时间系统，我们为我们的基于学习的控制器在批量学习和在线学习设置下的跟踪成功率提供了保证。

英文摘要

Techniques known as Nonlinear Set Membership prediction, Lipschitz Interpolation or Kinky Inference are approaches to machine learning that utilise presupposed Lipschitz properties to compute inferences over unobserved function values. Provided a bound on the true best Lipschitz constant of the target function is known a priori, they offer convergence guarantees as well as bounds around the predictions. Considering a more general setting that builds on Hölder continuity relative to pseudo-metrics, we propose an online method for estimating the Hölder constant from function value observations that are possibly corrupted by bounded observational errors. Utilising this to compute adaptive parameters within a kinky inference rule gives rise to a nonparametric machine learning method, for which we establish strong universal approximation guarantees. That is, we show that our prediction rule can learn any continuous function in the limit of increasingly dense data to within a worst-case error bound that depends on the level of observational uncertainty. We apply our method in the context of nonparametric model-reference adaptive control (MRAC). Across a range of simulated aircraft roll-dynamics and performance metrics, our approach outperforms recently proposed alternatives that were based on Gaussian processes and RBF neural networks. For discrete-time systems, we provide guarantees on the tracking success of our learning-based controllers for both the batch and the online learning setting.
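
A minimal sketch of the underlying prediction rule may help. It assumes scalar inputs, the Lipschitz case (Hölder exponent 1) with the absolute-value metric, and noise-free observations; the paper treats the more general Hölder setting with bounded observation errors. The constant is estimated from the data as the largest observed slope, and all function names are hypothetical.

```python
from typing import Sequence, Tuple


def estimate_lipschitz(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Largest observed slope |y_i - y_j| / |x_i - x_j| over all sample pairs."""
    best = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            gap = abs(xs[i] - xs[j])
            if gap > 0.0:
                best = max(best, abs(ys[i] - ys[j]) / gap)
    return best


def kinky_predict(x: float, xs: Sequence[float], ys: Sequence[float],
                  lipschitz: float) -> Tuple[float, float, float]:
    """Return (prediction, lower bound, upper bound) at query point x."""
    # Tightest ceiling and floor consistent with the samples and the constant.
    upper = min(y + lipschitz * abs(x - xi) for xi, y in zip(xs, ys))
    lower = max(y - lipschitz * abs(x - xi) for xi, y in zip(xs, ys))
    return 0.5 * (upper + lower), lower, upper
```

The prediction is the midpoint of the envelope, and the gap between the two bounds shrinks as the samples around the query point become denser.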


1906.00729 2026-06-04 cs.LG cs.GT cs.SY eess.SY math.OC stat.ML 版本更新

Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games

策略优化在零和线性二次博弈中可证明收敛至纳什均衡

Kaiqing Zhang, Zhuoran Yang, Tamer Başar

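Affiliations: Department of Electrical and Computer Engineering & Coordinated Science Laboratory, University of Illinois at Urbana-Champaign; Department of Operations Research and Financial Engineering, Princeton University

AI summary: This paper studies the global convergence of policy optimization for finding Nash equilibria in zero-sum linear quadratic (LQ) games. By analyzing the optimization landscape of LQ games, it shows that the stationary point of the objective with respect to linear feedback control policies constitutes the Nash equilibrium of the game, proposes three projected nested-gradient methods that are guaranteed to converge to it, and shows that these algorithms enjoy globally sublinear and locally linear convergence rates.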

Comments Fixed some typos, addressed some comments from NeurIPS reviews


AI中文摘要

我们研究了策略优化在寻找零和线性二次（LQ）博弈纳什均衡（NE）中的全局收敛性。为此，我们首先分析了LQ博弈的景观，将其视为策略空间中的非凸非凹鞍点问题。具体来说，我们证明了尽管其非凸性和非凹性，零和LQ博弈具有性质：目标函数相对于线性反馈控制策略的 stationary 点构成博弈的纳什均衡。在此基础上，我们开发了三种投影嵌套梯度方法，这些方法保证能够收敛到博弈的纳什均衡。此外，我们证明所有这些算法都具有全局次线性和局部线性收敛率。还提供了仿真结果以说明算法的满意收敛特性。据我们所知，这项工作似乎是首次研究LQ博弈的优化景观，并且证明了策略优化方法收敛到纳什均衡。我们的工作为理解一般零和马尔可夫游戏中的基于策略的强化学习算法的理论方面提供了初步步骤。

英文摘要

We study the global convergence of policy optimization for finding the Nash equilibria (NE) in zero-sum linear quadratic (LQ) games. To this end, we first investigate the landscape of LQ games, viewing it as a nonconvex-nonconcave saddle-point problem in the policy space. Specifically, we show that despite its nonconvexity and nonconcavity, zero-sum LQ games have the property that the stationary point of the objective function with respect to the linear feedback control policies constitutes the NE of the game. Building upon this, we develop three projected nested-gradient methods that are guaranteed to converge to the NE of the game. Moreover, we show that all of these algorithms enjoy both globally sublinear and locally linear convergence rates. Simulation results are also provided to illustrate the satisfactory convergence properties of the algorithms. To the best of our knowledge, this work appears to be the first one to investigate the optimization landscape of LQ games, and provably show the convergence of policy optimization methods to the Nash equilibria. Our work serves as an initial step toward understanding the theoretical aspects of policy-based reinforcement learning algorithms for zero-sum Markov games in general.


1905.04403 2026-06-04 eess.SY cs.LG cs.SY 版本更新

PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games

PAC统计模型检验用于马尔可夫决策过程和随机游戏

Pranav Ashok, Jan Křetínský, Maximilian Weininger

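Affiliations: Technical University of Munich, Germany

AI summary: This paper proposes a PAC statistical model checking algorithm for Markov decision processes and stochastic games that provides probably approximately correct guarantees without full knowledge of the transition function and is efficient in practice.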

DOI: 10.1007/978-3-030-25540-4_29

AI中文摘要

统计模型检验（SMC）是一种用于分析概率系统的技术，这些系统可能（部分）未知。我们提出了一种用于无界可达性的SMC算法，该算法能够提供概率近似正确（PAC）的保证。我们考虑了两种情况：（i）没有转移函数的知识（仅需一个转移概率的下界）和（ii）了解底层图的拓扑结构。一方面，这是首个针对随机游戏的算法；另一方面，即使对于马尔可夫决策过程，这也是首个实用的算法。与之前需要运行时间超过宇宙年龄的方法相比，我们的算法通常可以在几分钟内得到合理精确的结果，不需要了解混合时间或整个模型的拓扑结构。

英文摘要

Statistical model checking (SMC) is a technique for analysis of probabilistic systems that may be (partially) unknown. We present an SMC algorithm for (unbounded) reachability yielding probably approximately correct (PAC) guarantees on the results. We consider both the setting (i) with no knowledge of the transition function (where the only quantity required is a bound on the minimum transition probability) and (ii) with knowledge of the topology of the underlying graph. On the one hand, it is the first algorithm for stochastic games. On the other hand, it is the first practical algorithm even for Markov decision processes. Compared to previous approaches, where PAC guarantees require running times longer than the age of the universe even for systems with a handful of states, our algorithm often yields reasonably precise results within minutes, without requiring knowledge of the mixing time or the topology of the whole model.
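
One way to see why a bound on the minimum transition probability is useful: it limits how many samples of a state-action pair are needed before every successor has been observed with high probability. The sketch below is a simple union-bound calculation under that assumption and is not the algorithm from the paper; the function name is hypothetical.

```python
import math


def samples_to_see_all_successors(p_min: float, delta: float) -> int:
    """Samples after which all successors were seen with probability >= 1 - delta.

    Every successor has probability at least `p_min`, so there are at most
    floor(1 / p_min) of them, and each is missed in n independent samples
    with probability at most (1 - p_min) ** n.
    """
    if not (0.0 < p_min < 1.0) or not (0.0 < delta < 1.0):
        raise ValueError("need 0 < p_min < 1 and 0 < delta < 1")
    max_successors = math.floor(1.0 / p_min)
    return math.ceil(math.log(delta / max_successors) / math.log(1.0 - p_min))
```

The required count grows quickly as `p_min` shrinks, so practical algorithms need to use such worst-case bounds sparingly.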


1905.13268 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Interpretable PID Parameter Tuning for Control Engineering using General Dynamic Neural Networks: An Extensive Comparison

使用通用动态神经网络进行可解释的PID参数调节：一种广泛的比较

Johannes Günther, Elias Reichensdörfer, Patrick M. Pilarski, Klaus Diepold

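Affiliations: Department of Computing Science, University of Alberta; Alberta Machine Intelligence Institute

AI summary: This paper studies how extending PID controllers with General Dynamic Neural Networks (GDNN) can improve the performance and interpretability of complex control systems. In an extensive comparison on four benchmark systems, the neural PID controller outperforms standard PID control in 15 of 16 tasks and model-based control in 13 of 16 tasks.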

DOI: 10.1371/journal.pone.0243320

AI中文摘要

现代自动化系统依赖于闭环控制，其中控制器根据观察与受控过程交互。这些系统日益复杂，但大多数控制器仍是线性比例-积分-微分（PID）控制器。PID控制器在处理线性和近线性系统时表现良好，但其简单性与控制复杂过程所需鲁棒性相矛盾。现代机器学习提供了一种方法，即通过神经网络扩展PID控制器，以超越其线性能力。然而，这种扩展以失去稳定性保证和控制器可解释性为代价。本文研究了通过循环神经网络（即通用动态神经网络GDNN）扩展PID控制器的效用，证明GDNN（神经）PID控制器在多种控制系统中表现良好，并强调其作为可扩展和可解释的控制选项。为此，我们通过四个基准系统进行了广泛研究，这些系统代表了最常用的控制工程基准。所有控制基准均在有噪声和无噪声、有干扰和无干扰的情况下进行评估。神经PID控制器在16项任务中优于传统PID控制15项，在16项任务中优于模型驱动控制13项。作为第二项贡献，我们解决了防止神经网络用于实际控制过程的可解释性不足问题。我们使用有界输入有界输出稳定性分析来评估神经网络建议的参数，从而使其变得可理解。这种严格的评估与更好的可解释性相结合，是神经网络控制方法接受的重要步骤。此外，这也是可解释和安全应用人工智能的重要步骤。

英文摘要

Modern automation systems rely on closed-loop control, wherein a controller interacts with a controlled process based on observations. These systems are increasingly complex, yet most controllers are linear Proportional-Integral-Derivative (PID) controllers. PID controllers perform well on linear and near-linear systems, but their simplicity is at odds with the robustness required to reliably control complex processes. Modern machine learning offers a way to extend PID controllers beyond their linear capabilities by using neural networks. However, such an extension comes at the cost of losing stability guarantees and controller interpretability. In this paper, we examine the utility of extending PID controllers with recurrent neural networks, namely General Dynamic Neural Networks (GDNN); we show that GDNN (neural) PID controllers perform well on a range of control systems and highlight how they can be a scalable and interpretable option for control systems. To do so, we provide an extensive study using four benchmark systems that represent the most common control engineering benchmarks. All control benchmarks are evaluated with and without noise as well as with and without disturbances. The neural PID controller performs better than standard PID control in 15 of 16 tasks and better than model-based control in 13 of 16 tasks. As a second contribution, we address the lack of interpretability that prevents neural networks from being used in real-world control processes. We use bounded-input bounded-output stability analysis to evaluate the parameters suggested by the neural network, thus making them understandable. This combination of rigorous evaluation paired with better interpretability is an important step towards the acceptance of neural-network-based control approaches. It is furthermore an important step towards interpretable and safely applied artificial intelligence.
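
For readers less familiar with the baseline, a standard discrete-time PID controller can be sketched as follows. The first-order plant and the gains are assumptions chosen for illustration; the neural tuning studied in the paper would supply the three gains rather than fixing them by hand.

```python
from dataclasses import dataclass


@dataclass
class PID:
    kp: float
    ki: float
    kd: float
    dt: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, setpoint: float, measurement: float) -> float:
        """Return the control input for one sampling interval."""
        error = setpoint - measurement
        self.integral += error * self.dt
        # prev_error starts at 0, so a nonzero kd produces a kick on the first step.
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(pid: PID, setpoint: float = 1.0, steps: int = 2000) -> float:
    """Closed loop with an assumed plant x' = -x + u, integrated by forward Euler."""
    x = 0.0
    for _ in range(steps):
        u = pid.step(setpoint, x)
        x += pid.dt * (-x + u)
    return x
```

With `PID(kp=2.0, ki=1.0, kd=0.0, dt=0.01)` the simulated output settles close to the setpoint; the integral term is what removes the steady-state error.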


1812.03412 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Learning Multiplication-free Linear Transformations

学习无乘法线性变换

Cristian Rusu

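AI summary: This paper proposes dictionary learning algorithms for sparse representations that impose a specific structure on the learned dictionary so that it is numerically efficient to use: fewer additions and multiplications, or even no multiplications at all. The work builds on highly structured basic building blocks for the dictionary (binary orthonormal, scaling, and shear transformations), for which the optimization problems admit closed-form solutions. The approach is demonstrated on image data and compared against well-known numerically efficient transforms such as the fast Fourier and fast discrete cosine transforms.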

1904.09841 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation

简单的启发式方法可证明算法用于带掩码的低秩近似

Cameron Musco, Christopher Musco, David P. Woodruff

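Affiliations: UMass Amherst; New York University; Carnegie Mellon University

AI summary: This paper studies masked low-rank approximation and shows that a simple heuristic, which sets the input matrix to 0 wherever the mask is 0 and then solves a standard low-rank approximation problem, yields an algorithm with bicriteria approximation guarantees.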

Comments ITCS 2021


AI中文摘要

在$masked\ low-rank\ approximation$中，给定$A \in \mathbb{R}^{n \times n}$和二进制掩码矩阵$W \in \{0,1\}^{n \times n}$。目标是找到一个秩为$k$的矩阵$L$，使得$$cost(L) = \sum_{i=1}^{n} \sum_{j = 1}^{n} W_{i,j} \cdot (A_{i,j} - L_{i,j} )^2 \leq OPT + ε\|A\|_F^2 ,$$其中$OPT = \min_{rank-k\ \hat{L}} cost(\hat L)$，$ε$是一个给定的误差参数。根据$W$的不同选择，该问题捕捉到因子分析、低秩加对角分解、鲁棒PCA、低秩矩阵补全、低秩加块矩阵近似以及许多问题。许多这些问题都是NP难的，尽管已有一些具有证明保证的算法，但它们要么1) 运行时间是$n^{Ω(k^2/ε)}$，要么2) 做出强假设，例如$A$是不相干的或$W$是随机的。在本工作中，我们证明了一个常见的多项式时间启发式方法，即简单地将$W$为0的区域设为0，然后找到标准低秩近似，可以为该问题提供双标准逼近保证。特别是，对于秩为$k' > k$，取决于$W$的$public\ coin\ partition\ number$，该启发式方法输出秩为$k'$的$L$，其$cost(L) \leq OPT + ε\|A\|_F^2$。这个partition number反过来由$W$作为两个玩家通信矩阵时的$randomized\ communication\ complexity$所限制。对于许多重要的带掩码低秩近似示例，包括上述所有问题，该结果提供了具有$k' = k \cdot poly(\log n/ε)$的双标准逼近保证。此外，我们还显示了不同的通信模型为带掩码低秩近似的自然变种提供了算法。例如，多玩家number-in-hand通信复杂度与带掩码张量分解相关，而非确定性通信复杂度与带掩码布尔低秩分解相关。

英文摘要

In $masked\ low-rank\ approximation$, one is given $A \in \mathbb{R}^{n \times n}$ and binary mask matrix $W \in \{0,1\}^{n \times n}$. The goal is to find a rank-$k$ matrix $L$ for which: $$cost(L) = \sum_{i=1}^{n} \sum_{j = 1}^{n} W_{i,j} \cdot (A_{i,j} - L_{i,j} )^2 \leq OPT + ε\|A\|_F^2 ,$$ where $OPT = \min_{rank-k\ \hat{L}} cost(\hat L)$ and $ε$ is a given error parameter. Depending on the choice of $W$, this problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, and many problems. Many of these problems are NP-hard, and while some algorithms with provable guarantees are known, they either 1) run in time $n^{Ω(k^2/ε)}$ or 2) make strong assumptions, e.g., that $A$ is incoherent or that $W$ is random. In this work, we show that a common polynomial time heuristic, which simply sets $A$ to $0$ where $W$ is $0$, and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank $k' > k$ depending on the $public\ coin\ partition\ number$ of $W$, the heuristic outputs rank-$k'$ $L$ with cost$(L) \leq OPT + ε\|A\|_F^2$. This partition number is in turn bounded by the $randomized\ communication\ complexity$ of $W$, when interpreted as a two-player communication matrix. For many important examples of masked low-rank approximation, including all those listed above, this result yields bicriteria approximation guarantees with $k' = k \cdot poly(\log n/ε)$. Further, we show that different models of communication yield algorithms for natural variants of masked low-rank approximation. For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.
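
The zero-fill heuristic described above can be sketched in a few lines. To stay dependency-free, the example computes the low-rank approximation with power iteration and deflation instead of a library SVD; that choice, the helper names, and the fixed starting vector are assumptions made for illustration only.

```python
import math
from typing import List, Tuple

Matrix = List[List[float]]


def zero_fill(a: Matrix, w: List[List[int]]) -> Matrix:
    """Set A to 0 wherever the mask W is 0 (the heuristic's first step)."""
    return [[a_ij * w_ij for a_ij, w_ij in zip(a_row, w_row)]
            for a_row, w_row in zip(a, w)]


def masked_cost(a: Matrix, l: Matrix, w: List[List[int]]) -> float:
    """cost(L) = sum_ij W_ij * (A_ij - L_ij)^2."""
    return sum(w[i][j] * (a[i][j] - l[i][j]) ** 2
               for i in range(len(a)) for j in range(len(a[0])))


def top_singular_triplet(m: Matrix, iters: int = 200) -> Tuple[float, List[float], List[float]]:
    """Power iteration on M^T M; returns (sigma, u, v) with M v = sigma u."""
    rows, cols = len(m), len(m[0])
    # Deterministic, non-uniform start; an unlucky start could be orthogonal
    # to the leading singular vector, which a production solver must handle.
    v = [1.0 / (j + 1) for j in range(cols)]
    for _ in range(iters):
        mv = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        mtmv = [sum(m[i][j] * mv[i] for i in range(rows)) for j in range(cols)]
        norm = math.sqrt(sum(x * x for x in mtmv))
        if norm == 0.0:
            return 0.0, [0.0] * rows, v
        v = [x / norm for x in mtmv]
    mv = [sum(m[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    sigma = math.sqrt(sum(x * x for x in mv))
    u = [x / sigma for x in mv] if sigma > 0.0 else [0.0] * rows
    return sigma, u, v


def low_rank_approx(m: Matrix, rank: int) -> Matrix:
    """Rank-`rank` approximation by repeated rank-1 deflation."""
    rows, cols = len(m), len(m[0])
    residual = [row[:] for row in m]
    approx = [[0.0] * cols for _ in range(rows)]
    for _ in range(rank):
        sigma, u, v = top_singular_triplet(residual)
        for i in range(rows):
            for j in range(cols):
                term = sigma * u[i] * v[j]
                approx[i][j] += term
                residual[i][j] -= term
    return approx


def masked_heuristic(a: Matrix, w: List[List[int]], rank: int) -> Matrix:
    """Zero-fill the masked entries, then take a standard low-rank approximation."""
    return low_rank_approx(zero_fill(a, w), rank)
```

As a small low-rank-plus-diagonal example, take the 3x3 matrix with 6 on the diagonal and 1 elsewhere, masked on the diagonal. The all-ones matrix has rank 1 and masked cost 0, whereas the heuristic at rank 1 incurs a positive cost and improves when the rank is raised, which matches the bicriteria flavour of the guarantee (larger rank $k'$, additive error).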


1903.07214 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A Control Lyapunov Perspective on Episodic Learning via Projection to State Stability

从控制李雅普诺夫视角看通过投影到状态稳定性进行片段学习

Andrew J. Taylor, Victor D. Dorobantu, Meera Krishnamoorthy, Hoang M. Le, Yisong Yue, Aaron D. Ames

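Affiliations: California Institute of Technology

AI summary: This paper examines the impact of learning on control synthesis from a Lyapunov function perspective. It introduces the notion of projection to state stability (PSS) to characterize the robustness of a control Lyapunov function (CLF) to uncertainty in the system, and shows how PSS can be used to bound uncertainty in affine control for robust control synthesis.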

1903.01577 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Episodic Learning with Control Lyapunov Functions for Uncertain Robotic Systems

具有控制李雅普诺夫函数的不确定性机器人系统的经验学习

Andrew J. Taylor, Victor D. Dorobantu, Hoang M. Le, Yisong Yue, Aaron D. Ames

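Affiliations: California Institute of Technology

AI summary: This paper proposes a machine learning framework based on control Lyapunov functions for adapting to parametric uncertainty and unmodeled dynamics in robotic systems. It iteratively updates estimates of Lyapunov function derivatives and improves controllers, ultimately yielding a stabilizing quadratic-program-based controller, and validates the method on a planar Segway simulation.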

DOI: 10.1109/IROS40897.2019.8967820

AI中文摘要

许多现代非线性控制方法旨在赋予系统保证性质，如稳定性或安全性，并已成功应用于机器人领域。然而，模型不确定性仍然是持续的挑战，削弱了理论保证并导致物理系统中的实施失败。本文开发了一种以控制李雅普诺夫函数（CLFs）为中心的机器学习框架，以适应一般机器人系统中的参数不确定性和未建模动态。我们提出的方法通过迭代更新李雅普诺夫函数导数的估计并改进控制器，最终获得一个基于二次规划的稳定控制器。我们在平面Segway模拟中验证了我们的方法，通过迭代改进基础无模型控制器，展示了显著的性能提升。

英文摘要

Many modern nonlinear control methods aim to endow systems with guaranteed properties, such as stability or safety, and have been successfully applied to the domain of robotics. However, model uncertainty remains a persistent challenge, weakening theoretical guarantees and causing implementation failures on physical systems. This paper develops a machine learning framework centered around Control Lyapunov Functions (CLFs) to adapt to parametric uncertainty and unmodeled dynamics in general robotic systems. Our proposed method proceeds by iteratively updating estimates of Lyapunov function derivatives and improving controllers, ultimately yielding a stabilizing quadratic program model-based controller. We validate our approach on a planar Segway simulation, demonstrating substantial performance improvements by iteratively refining on a base model-free controller.
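
The quadratic-program controller mentioned above has a well-known closed form when there is a single CLF constraint and no input bounds. The sketch below assumes the Lie derivatives $L_f V$ and $L_g V$ are known exactly, whereas the paper learns estimates of them from data; the function name and the decay-rate parameter are hypothetical.

```python
from typing import List, Sequence


def min_norm_clf_control(lf_v: float, lg_v: Sequence[float],
                         v: float, rate: float) -> List[float]:
    """Solve  min ||u||^2  s.t.  lf_v + lg_v . u <= -rate * v  in closed form."""
    psi0 = lf_v + rate * v
    if psi0 <= 0.0:
        # The drift alone already gives the required decrease.
        return [0.0] * len(lg_v)
    psi1_sq = sum(g * g for g in lg_v)
    if psi1_sq == 0.0:
        raise ValueError("constraint infeasible: L_g V = 0 while a decrease is required")
    return [-psi0 * g / psi1_sq for g in lg_v]
```

For the scalar system $\dot{x} = x + u$ with $V = x^2/2$, the call `min_norm_clf_control(x * x, [x], 0.5 * x * x, 1.0)` at `x = 2.0` gives an input for which $\dot{V} = -V$ holds with equality.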


1711.03127 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Energy Storage Arbitrage in Real-Time Markets via Reinforcement Learning

通过强化学习实现实时市场的能源存储套利

Hao Wang, Baosen Zhang

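Affiliations: Department of Electrical Engineering, University of Washington

AI summary: This paper uses reinforcement learning to design a temporal arbitrage policy for energy storage, addressing the difficulty of designing strategies for real-time price arbitrage under highly uncertain prices, and improves performance through the design of the reward function.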

DOI: 10.1109/PESGM.2018.8586321
Journal ref: 2018 IEEE Power & Energy Society General Meeting (PESGM)

AI中文摘要

在本文中，我们通过强化学习推导出一个时间套利策略用于存储。实时价格套利是存储单元的重要收入来源，但设计良好的策略 proved 难以实现，因为价格的高不确定性。我们采用强化学习来设计一个最优的套利策略。该策略通过存储单元重复的充放电操作，通过更新价值矩阵来学习。我们设计了一个奖励函数，不仅反映了充放电决策的即时利润，还结合了历史信息。仿真结果表明，与现有算法相比，我们设计的奖励函数导致了显著的性能提升。

英文摘要

In this paper, we derive a temporal arbitrage policy for storage via reinforcement learning. Real-time price arbitrage is an important source of revenue for storage units, but designing good strategies has proven to be difficult because of the highly uncertain nature of the prices. Instead of current model predictive or dynamic programming approaches, we use reinforcement learning to design an optimal arbitrage policy. This policy is learned through repeated charge and discharge actions performed by the storage unit through updating a value matrix. We design a reward function that not only reflects the instant profit of charge/discharge decisions but also incorporates history information. Simulation results demonstrate that our designed reward function leads to significant performance improvement compared with existing algorithms.
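
The "value matrix" update can be illustrated with plain tabular Q-learning. The sketch below is a toy under stated assumptions: the state is the pair (time index, stored energy), one unit of energy moves per step, and the reward is only the instant profit, so it does not reproduce the history-aware reward function designed in the paper. All names are hypothetical.

```python
import random
from collections import defaultdict

ACTIONS = (-1, 0, 1)  # discharge, hold, or charge one unit of energy


def step(energy: int, price: float, action: int, capacity: int):
    """Apply an action; charging costs money, discharging earns it."""
    new_energy = min(capacity, max(0, energy + action))
    moved = new_energy - energy
    return new_energy, -price * moved


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Standard Q-learning update of one entry of the value table."""
    best_next = max(q[(next_state, a)] for a in ACTIONS)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])


def train(prices, capacity=2, episodes=500, epsilon=0.1, seed=0):
    """Epsilon-greedy training over repeated passes through a price series."""
    rng = random.Random(seed)
    q = defaultdict(float)
    for _ in range(episodes):
        energy = 0
        for t in range(len(prices) - 1):
            state = (t, energy)
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            energy, reward = step(energy, prices[t], action, capacity)
            q_update(q, state, action, reward, (t + 1, energy))
    return q
```

Indexing the table by time is a simplification; a table indexed by price level and energy level would be closer to a policy that generalizes across days.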


1904.02851 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Planning under non-rational perception of uncertain spatial costs

在不确定空间成本下的非理性感知规划

Aamodh Suresh, Sonia Martinez

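AI summary: This paper studies motion-planning strategies that account for non-rational risk perception under uncertain spatial costs. It proposes generating perceived risk maps based on Cumulative Prospect Theory (CPT), demonstrates the modeling power of CPT in theory and in simulation, and shows advantages in path planning over other risk perception models such as CVaR.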

Comments 12 pages and 10 figures. This revision adds more explanation and clearer figures


AI中文摘要

本工作探讨了设计一种考虑与不确定空间成本相关的风险感知的运动规划策略。我们提出的方法利用累积前景理论（CPT）来生成给定环境中的感知风险地图。CPT-like感知风险和路径长度指标被结合以定义一个符合采样运动规划器（RRT*）渐近最优要求的成本函数。通过理论和仿真展示了CPT的建模能力，并与其他风险感知模型如条件价值-at-风险（CVaR）进行了比较。理论上，我们定义了风险感知模型的表达性概念，并证明CPT的表达性高于CVaR和期望风险。然后我们展示了这种表达性在路径规划设置中的转化，其中我们观察到一个配备CPT和同时扰动随机近似（SPSA）方法的规划器可以更好地近似任意环境中的路径。此外，我们通过仿真展示了我们的规划器能够捕捉一组丰富的有意义路径，代表了不同风险感知的自定义环境。然后我们通过在拥挤和动态环境中的仿真比较了我们的规划器与T-RRT*（连续成本空间的规划器）和Risk-RRT*（动态人类障碍物的风险感知规划器）的性能，展示了我们所提规划器的优势。

英文摘要

This work investigates the design of risk-perception-aware motion-planning strategies that incorporate non-rational perception of risks associated with uncertain spatial costs. Our proposed method employs the Cumulative Prospect Theory (CPT) to generate a perceived risk map over a given environment. CPT-like perceived risks and path-length metrics are then combined to define a cost function that is compliant with the requirements of asymptotic optimality of sampling-based motion planners (RRT*). The modeling power of CPT is illustrated in theory and in simulation, along with a comparison to other risk perception models like Conditional Value at Risk (CVaR). Theoretically, we define a notion of expressiveness for a risk perception model and show that CPT's is higher than that of CVaR and expected risk. We then show that this expressiveness translates to our path planning setting, where we observe that a planner equipped with CPT together with a simultaneous perturbation stochastic approximation (SPSA) method can better approximate arbitrary paths in an environment. Additionally, we show in simulation that our planner captures a rich set of meaningful paths, representative of different risk perceptions in a custom environment. We then compare the performance of our planner with T-RRT* (a planner for continuous cost spaces) and Risk-RRT* (a risk-aware planner for dynamic human obstacles) through simulations in cluttered and dynamic environments respectively, showing the advantage of our proposed planner.
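
The two risk perception models compared above can be contrasted on a small scale. The sketch shows a commonly used CPT probability weighting function (the Tversky-Kahneman form) next to an empirical CVaR over equally likely cost samples. The parameter value and both function names are assumptions, and the paper's full CPT model also includes a value function that is omitted here.

```python
import math
from typing import Sequence


def cpt_weight(p: float, gamma: float = 0.61) -> float:
    """Tversky-Kahneman probability weighting: w(p) = p^g / (p^g + (1-p)^g)^(1/g)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)


def cvar(costs: Sequence[float], alpha: float) -> float:
    """Average of the worst (1 - alpha) fraction of equally likely costs."""
    ordered = sorted(costs, reverse=True)
    tail = max(1, math.ceil((1.0 - alpha) * len(ordered) - 1e-12))
    return sum(ordered[:tail]) / tail
```

The weighting function inflates small probabilities and deflates large ones, which CVaR cannot express because it only reweights the tail of the cost distribution.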


1902.10139 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Learning Dynamic-Objective Policies from a Class of Optimal Trajectories

从一类最优轨迹中学习动态目标策略

Christopher Iliffe Sprague, Dario Izzo, Petter Ögren

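AI summary: This paper presents a novel and straightforward approach that combines trajectory optimisation, homotopy continuation, and imitation learning to synthesise optimal state-feedback controllers that can change objective functions online, and demonstrates it on an inverted pendulum swing-up and a spacecraft orbit transfer.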

Comments Accepted to the 59th IEEE Conference on Decision and Control (CDC)


AI中文摘要

最优状态反馈控制器，能够切换不同的目标函数，在可能遇到意外情况的系统中具有优势。然而，即使对于单一目标函数，合成此类控制器也是极具挑战性的。本文提出了一种新颖且简单的方法，通过轨迹优化、同伦持续和模仿学习相结合，来合成这些策略。我们使用数值持续法高效地生成多个目标函数和边界条件下的最优演示，并利用这些演示来训练我们的策略。此外，我们展示了我们的策略能够有效学习一系列最优状态反馈控制器，这些控制器可以在线切换目标函数。我们通过两个轨迹优化问题，即倒立摆摆起和航天器轨道转移，展示了该方法，并证明在仿真中合成的策略产生的轨迹接近最优。这些结果表明，轨迹优化和同伦持续对动态目标情境下的控制器合成具有益处。

英文摘要

Optimal state-feedback controllers, capable of changing between different objective functions, are advantageous to systems in which unexpected situations may arise. However, synthesising such controllers, even for a single objective, is a demanding process. In this paper, we present a novel and straightforward approach to synthesising these policies through a combination of trajectory optimisation, homotopy continuation, and imitation learning. We use numerical continuation to efficiently generate optimal demonstrations across several objectives and boundary conditions, and use these to train our policies. Additionally, we demonstrate the ability of our policies to effectively learn families of optimal state-feedback controllers, which can be used to change objective functions online. We illustrate this approach across two trajectory optimisation problems, an inverted pendulum swingup and a spacecraft orbit transfer, and show that the synthesised policies, when evaluated in simulation, produce trajectories that are near-optimal. These results indicate the benefit of trajectory optimisation and homotopy continuation to the synthesis of controllers in dynamic-objective contexts.


1710.09691 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Iterative Machine Learning for Precision Trajectory Tracking with Series Elastic Actuators

迭代机器学习用于系列弹性执行器的高精度轨迹跟踪

Nathan Banka, W. Tony Piaskowy, Joseph Garbini, Santosh Devasia

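Affiliations: Ultra-Precision Controls Lab; University of Washington

AI summary: This paper studies the use of iterative learning to improve position tracking accuracy with series elastic actuators. Feedforward commands are learned iteratively, with a complex-valued Gaussian process regression technique used to estimate local system models, thereby reducing the tracking error.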

Comments 9 pages, 16 figures. Submitted to AMC Workshop

DOI: 10.1109/AMC.2019.8371094
Journal ref: 2018 IEEE 15th International Workshop on Advanced Motion Control (AMC), Tokyo, 2018, pp. 234-239

AI中文摘要

当机器人在未知环境中操作时，位置的小误差可能导致接触力的大幅变化，尤其是对于典型的高阻抗设计。这可能会损坏周围环境或机器人本身。系列弹性执行器（SEAs）是一种减少机器人手臂输出阻抗以提高对环境施加力的控制能力的流行方法。然而，这种增加的力控制能力伴随着较低的位置精度和带宽。本文探讨了使用迭代学习的前馈命令来改进使用SEAs时的位置跟踪。在每次迭代中，系统对量化输入的输出响应被用来估计线性化的局部系统模型。这些估计的模型是通过复值高斯过程回归（cGPR）技术获得的，然后用于基于前一次迭代的误差生成新的前馈输入命令。本文展示了该迭代机器学习（IML）技术在双自由度（2-DOF）机器人手臂上的应用，并证明了IML方法能够成功收敛以减少跟踪误差。

英文摘要

When robots operate in unknown environments, small errors in positions can lead to large variations in the contact forces, especially with typical high-impedance designs. This can potentially damage the surroundings and/or the robot. Series elastic actuators (SEAs) are a popular way to reduce the output impedance of a robotic arm to improve control authority over the force exerted on the environment. However, this increased control over forces with lower impedance comes at the cost of lower positioning precision and bandwidth. This article examines the use of an iteratively learned feedforward command to improve position tracking when using SEAs. Over each iteration, the output responses of the system to the quantized inputs are used to estimate linearized local system models. These estimated models are obtained using a complex-valued Gaussian Process Regression (cGPR) technique and are then used to generate a new feedforward input command based on the previous iteration's error. This article illustrates this iterative machine learning (IML) technique for a two-degree-of-freedom (2-DOF) robotic arm and demonstrates successful convergence of the IML approach to reduce the tracking error.


1812.07725 2026-06-04 math.OC cs.LG cs.NA math.NA math.PR stat.ML 版本更新

Breaking Reversibility Accelerates Langevin Dynamics for Global Non-Convex Optimization

打破可逆性加速Langevin动力学用于全局非凸优化

Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

Affiliations: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T. Hong Kong; Department of Management Science and Information Systems and the DIMACS Institute, Rutgers University, Piscataway, NJ-08854, United States of America; Department of Mathematics, Florida State University, 1017 Academic Way, Tallahassee, FL-32306, United States of America

AI summary: This paper studies non-reversible Langevin dynamics for global non-convex optimization. By analyzing the convergence and mixing rates of non-reversible algorithms, it shows that they are more efficient at locating local minima and exploring the state space.

AI中文摘要

Langevin动力学（LD）已被证明是一种强大的技术，用于优化非凸目标，作为一种高效的算法来寻找局部极小值，而最终在更长的时间尺度上访问全局极小值。LD基于一阶Langevin扩散，其时间是可逆的。我们研究了两种基于非可逆Langevin扩散的变种：欠阻尼Langevin动力学（ULD）和具有非对称漂移的Langevin动力学（NLD）。采用Tzen、Liang和Raginsky（2018）为LD到非可逆扩散的技术，我们证明了对于给定的局部极小值，其在初始化点任意距离内，以高概率，ULD轨迹会在依赖于局部极小值Hessian最小特征值的复发时间内结束于该局部极小值的小邻域之外，或者在复发时间内进入该邻域并停留可能极长的逃逸时间。ULD算法在Hessian最小特征值的依赖性方面优于Tzen、Liang和Raginsky（2018）中LD的复发时间。对于NLD算法也获得了相似的结果和改进。我们还展示了非可逆变种在离散时间中能够更快地退出局部极小值的吸引盆地，当目标函数有两个局部极小值被鞍点分隔时，并量化了改进的幅度。我们的分析表明，非可逆Langevin算法在寻找局部极小值和探索状态空间方面更有效。我们的分析基于在局部极小值周围对目标函数的二次近似。作为我们分析的副产品，我们获得了两个非可逆Langevin算法在2-Wasserstein距离下的最优混合速率。

英文摘要

Langevin dynamics (LD) has been proven to be a powerful technique for optimizing a non-convex objective, serving as an efficient algorithm to find local minima while eventually visiting a global minimum on longer time-scales. LD is based on the first-order Langevin diffusion, which is reversible in time. We study two variants that are based on non-reversible Langevin diffusions: the underdamped Langevin dynamics (ULD) and the Langevin dynamics with a non-symmetric drift (NLD). Adapting the techniques of Tzen, Liang and Raginsky (2018) for LD to non-reversible diffusions, we show that for a given local minimum that is within an arbitrary distance from the initialization, with high probability, either the ULD trajectory ends up somewhere outside a small neighborhood of this local minimum within a recurrence time that depends on the smallest eigenvalue of the Hessian at the local minimum, or it enters this neighborhood by the recurrence time and stays there for a potentially exponentially long escape time. The ULD algorithm improves upon the recurrence time obtained for LD in Tzen, Liang and Raginsky (2018) with respect to the dependency on the smallest eigenvalue of the Hessian at the local minimum. Similar results and improvements are obtained for the NLD algorithm. We also show that non-reversible variants can exit the basin of attraction of a local minimum faster in discrete time when the objective has two local minima separated by a saddle point, and we quantify the amount of improvement. Our analysis suggests that non-reversible Langevin algorithms are more efficient at locating a local minimum as well as exploring the state space. Our analysis is based on the quadratic approximation of the objective around a local minimum. As a by-product of our analysis, we obtain optimal mixing rates for quadratic objectives in the 2-Wasserstein distance for the two non-reversible Langevin algorithms we consider.
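
To make the two dynamics concrete, the sketch below gives simple Euler-type discretizations of the overdamped (reversible) and underdamped (non-reversible) Langevin updates for a scalar objective. The step size, friction parameter, and function names are assumptions, and the paper's own discretization and its NLD variant are not reproduced here.

```python
import math
import random
from typing import Callable


def overdamped_langevin(grad: Callable[[float], float], x0: float, step: float,
                        inv_temp: float, n_steps: int, rng: random.Random) -> float:
    """x <- x - step * grad(x) + sqrt(2 * step / inv_temp) * N(0, 1)."""
    x = x0
    noise = math.sqrt(2.0 * step / inv_temp)
    for _ in range(n_steps):
        x = x - step * grad(x) + noise * rng.gauss(0.0, 1.0)
    return x


def underdamped_langevin(grad: Callable[[float], float], x0: float, step: float,
                         inv_temp: float, friction: float, n_steps: int,
                         rng: random.Random) -> float:
    """Position-velocity update; the noise enters only through the velocity."""
    x, v = x0, 0.0
    noise = math.sqrt(2.0 * friction * step / inv_temp)
    for _ in range(n_steps):
        v = v - step * (friction * v + grad(x)) + noise * rng.gauss(0.0, 1.0)
        x = x + step * v
    return x
```

With a very large inverse temperature the noise is negligible and both iterations behave like gradient descent and its momentum counterpart; lowering the inverse temperature lets the iterates escape shallow local minima.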

1711.01526 2026-06-04 cs.LG cs.SY eess.SY math.OC 版本更新

On Identification of Distribution Grids

配电网络的识别

Omid Ardakanian, Vincent W. S. Wong, Roel Dobbe, Steven H. Low, Alexandra von Meier, Claire Tomlin, Ye Yuan

发表机构 * Department of Electrical Engineering and Computer Sciences, UC Berkeley, USA（加州大学伯克利分校电气工程与计算机科学系，美国）

AI总结本文研究了如何利用遥测数据联合估计配电网络的模型参数和运行结构，采用lasso这一回归收缩与选择方法，提出了能够处理配电系统低秩结构的易处理凸规划，并开发了用于早期检测和定位导致导纳矩阵变化的关键事件的在线算法。

详情

DOI: 10.1109/TCNS.2019.2891002

AI中文摘要

将分布式能源资源大规模接入住宅配电馈线，需要通过潮流分析对其运行进行细致控制。虽然配电系统模型的知识对此类分析至关重要，但这种知识往往不可用或已过时。同步相量技术近期被引入低压配电网，为从各位置高精度、时间同步的电压和电流相量测量中学习该模型创造了前所未有的机会。本文重点研究通过lasso（一种回归收缩与选择方法）从可用遥测数据中联合估计多相配电网络的模型参数（导纳值）和运行结构。我们提出了能够处理配电系统低秩结构的易处理凸规划，并开发了一种在线算法，用于早期检测和定位导致导纳矩阵变化的关键事件。这些技术的有效性通过在四个为真实家庭负荷供电的三相辐射状配电系统上的潮流研究得到验证。

英文摘要

Large-scale integration of distributed energy resources into residential distribution feeders necessitates careful control of their operation through power flow analysis. While the knowledge of the distribution system model is crucial for this type of analysis, it is often unavailable or outdated. The recent introduction of synchrophasor technology in low-voltage distribution grids has created an unprecedented opportunity to learn this model from high-precision, time-synchronized measurements of voltage and current phasors at various locations. This paper focuses on joint estimation of model parameters (admittance values) and operational structure of a poly-phase distribution network from the available telemetry data via the lasso, a method for regression shrinkage and selection. We propose tractable convex programs capable of tackling the low rank structure of the distribution system and develop an online algorithm for early detection and localization of critical events that induce a change in the admittance matrix. The efficacy of these techniques is corroborated through power flow studies on four three-phase radial distribution systems serving real household demands.
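
The lasso step named in the abstract can be illustrated with a small real-valued sketch. It solves min_w 0.5 * ||y - Xw||^2 + lam * ||w||_1 by cyclic coordinate descent with soft-thresholding. This is only a generic lasso solver under simplifying assumptions: the paper works with phasor (complex-valued) measurements and admittance matrices, whereas the function names, the real-valued data and the fixed number of sweeps here are illustrative choices.

```python
def soft_threshold(z, lam):
    """Shrink z toward zero by lam; values within [-lam, lam] become exactly 0."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, sweeps=100):
    """Cyclic coordinate descent for 0.5 * ||y - Xw||^2 + lam * ||w||_1.

    X is a list of rows, y a list of targets. Zero entries in the returned
    weights correspond to regressors the lasso has deselected.
    """
    n, p = len(X), len(X[0])
    w = [0.0] * p
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            rho = 0.0
            for i in range(n):
                # prediction using every coordinate except j
                pred_wo_j = sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_wo_j)
            w[j] = soft_threshold(rho, lam) / col_sq[j]
    return w
```

In a grid identification setting, exact zeros in the estimate would be read as absent connections, which is how shrinkage and selection recover structure and parameter values together.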

1806.04225 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY math.OC 版本更新

PAC-Bayes Control: Learning Policies that Provably Generalize to Novel Environments

PAC-Bayes 控制：学习可证明泛化到新环境的策略

Anirudha Majumdar, Alec Farid, Anoopkumar Sonar

发表机构 * Department of Mechanical and Aerospace Engineering（机械与航空航天工程系）； Department of Computer Science, Princeton University（普林斯顿大学计算机科学系）

AI总结本文提出了一种基于PAC-Bayes框架的机器人策略学习方法，通过在新环境中泛化能力的理论分析，为机器人系统提供强泛化保证。

Comments Extended version of paper presented at the 2018 Conference on Robot Learning (CoRL)

详情

AI中文摘要

我们的目标是在给定一组示例环境数据集的情况下，学习可证明能够良好泛化到新环境的机器人控制策略。我们方法的关键技术思想是利用机器学习中泛化理论的工具，借助控制策略对新环境的泛化与监督学习中假设的泛化之间的精确类比（我们以归约的形式给出）。具体而言，我们利用Probably Approximately Correct (PAC)-Bayes框架，它使我们能够得到（随机）控制策略在新环境中期望成本的、以高概率成立的上界。我们提出了明确以最小化该上界为目标的策略学习算法。在有限策略空间上进行优化时，相应的优化问题可以通过凸优化（特别是相对熵规划）求解。在连续参数化策略（例如神经网络策略）的更一般设置中，我们使用随机梯度下降来最小化该上界。我们给出了将该方法应用于学习（1）反应式避障策略和（2）基于神经网络的抓取策略的仿真结果，还给出了Parrot Swing无人机在不同障碍物环境中飞行的硬件实验结果。我们的示例展示了该方法为具有连续状态和动作空间、复杂（例如非线性）动力学、丰富感知输入（例如深度图像）以及基于神经网络的策略的机器人系统提供强泛化保证的潜力。

英文摘要

Our goal is to learn control policies for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between generalization of control policies to novel environments and generalization of hypotheses in the supervised learning setting. In particular, we utilize the Probably Approximately Correct (PAC)-Bayes framework, which allows us to obtain upper bounds that hold with high probability on the expected cost of (stochastic) control policies across novel environments. We propose policy learning algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (Relative Entropy Programming in particular) in the setting where we are optimizing over a finite policy space. In the more general setting of continuously parameterized policies (e.g., neural network policies), we minimize this upper bound using stochastic gradient descent. We present simulated results of our approach applied to learning (1) reactive obstacle avoidance policies and (2) neural network-based grasping policies. We also present hardware results for the Parrot Swing drone navigating through different obstacle environments. Our examples demonstrate the potential of our approach to provide strong generalization guarantees for robotic systems with continuous state and action spaces, complicated (e.g., nonlinear) dynamics, rich sensory inputs (e.g., depth images), and neural network-based policies.

1905.00820 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

On the smoothness of nonlinear system identification

关于非线性系统辨识的光滑性

Antônio H. Ribeiro, Koen Tiels, Jack Umenberger, Thomas B. Schön, Luis A. Aguirre

发表机构 * Dept. of Information Technology, Uppsala University, Sweden（信息科技系，乌普萨拉大学，瑞典）； Dept. of Mechanical Engineering, Eindhoven University of Technology, The Netherlands（机械工程系，埃因霍温理工大学，荷兰）

AI总结本文研究了线性和非线性系统预测误差参数估计中优化问题的光滑性，指出在参数空间中模型非收缩的区域，目标函数的Lipschitz常数和β-光滑性可能随仿真长度指数级增长，并提出用多重打靶法解决这一问题。

详情

DOI: 10.1016/j.automatica.2020.109158
Journal ref: Automatica, vol. 121, 109158, Nov. 2020

AI中文摘要

我们从新的角度探讨了线性和非线性系统预测误差参数估计中所出现的优化问题的光滑性。我们证明，在参数空间中模型非收缩的区域，目标函数的Lipschitz常数和β-光滑性可能随仿真长度指数级增长，使得在这些区域内难以通过数值方法找到极小值，甚至难以逃离这些区域。除了对这一问题提供理论理解外，本文还提出使用多重打靶法作为可行的解决方案。所提出的方法最小化预测模型与观测值之间的误差。多重打靶法不在整个数据集上运行预测模型，而是将数据分成更小的子集，并在每个子集上运行预测模型，从而使仿真长度成为设计参数，并使原本用标准方法无法求解的问题变得可解。通过在优化中加入约束条件，可以保证与原问题等价。我们通过估计具有混沌或不稳定行为的非线性系统的参数以及神经网络的参数来说明新方法。我们还将所提出的方法与多步超前预测误差最小化进行了对比分析。

英文摘要

We shed new light on the \textit{smoothness} of optimization problems arising in prediction error parameter estimation of linear and nonlinear systems. We show that for regions of the parameter space where the model is not contractive, the Lipschitz constant and $β$-smoothness of the objective function might blow up exponentially with the simulation length, making it hard to numerically find minima within those regions or, even, to escape from them. In addition to providing theoretical understanding of this problem, this paper also proposes the use of multiple shooting as a viable solution. The proposed method minimizes the error between a prediction model and the observed values. Rather than running the prediction model over the entire dataset, multiple shooting splits the data into smaller subsets and runs the prediction model over each subset, making the simulation length a design parameter and making it possible to solve problems that would be infeasible using a standard approach. The equivalence to the original problem is obtained by including constraints in the optimization. The new method is illustrated by estimating the parameters of nonlinear systems with chaotic or unstable behavior, as well as neural networks. We also present a comparative analysis of the proposed method with multi-step-ahead prediction error minimization.
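
The multiple shooting idea described above can be sketched on a toy scalar model x[t+1] = a * x[t]. Each segment is simulated from its own free initial state, so the simulation length equals the segment length, and continuity between segments is expressed as a set of gaps that a constrained optimizer would drive to zero. The model, the function name and the argument layout are illustrative assumptions, not the authors' implementation.

```python
def multiple_shooting_terms(a, inits, y, seg_len):
    """Evaluate the two ingredients of a multiple shooting objective.

    a       -- parameter of the toy model x[t+1] = a * x[t]
    inits   -- one free initial state per segment
    y       -- observed sequence, split into consecutive segments of seg_len
    Returns (sum of squared fit errors, list of continuity gaps).
    """
    fit_error = 0.0
    gaps = []
    for s, x0 in enumerate(inits):
        x = x0
        segment = y[s * seg_len:(s + 1) * seg_len]
        for obs in segment:
            fit_error += (x - obs) ** 2
            x = a * x  # simulate one step ahead within the segment
        if s + 1 < len(inits):
            # equality constraint: simulated end state must match next start
            gaps.append(x - inits[s + 1])
    return fit_error, gaps
```

For data generated by a = 2, the true parameter and matching initial states give zero fit error and zero gaps, while a wrong parameter produces errors that grow only over `seg_len` steps rather than over the whole record.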

1904.10778 2026-06-04 cs.LG cs.SY eess.SY math.OC math.PR stat.ML 版本更新

Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms

由随机递归算法诱导的马尔可夫链的一些极限性质

Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar

发表机构 * Electrical and Computer Engineering Department, The Ohio State University（俄亥俄州立大学电气与计算机工程系）； Microsoft Corp（微软公司）

AI总结本文研究了由随机递归算法诱导的马尔可夫链的极限性质，通过分析迭代随机算子的收敛性，证明了随机序列的分布弱收敛于收缩算子生成的轨迹，并进一步展示了随机序列的时间平均收敛于不变分布的空间均值。

Comments Accepted in SIMODS, 37 pages

详情

AI中文摘要

递归随机算法由于数据驱动的应用而在近期受到广泛关注。例如，用于求解大规模优化问题的随机梯度下降，以及用于求解马尔可夫决策问题的经验动态规划算法。这些递归随机算法近似某些收缩算子，可以在迭代随机算子的框架内看待。因此，我们考虑波兰空间上的迭代随机算子，它们模拟该波兰空间上的迭代收缩算子。假设迭代随机算子以一定的批次大小为索引，使得当批次大小趋于无穷时，随机算子的每个实现（在某种意义下）收敛于它所模拟的收缩算子。我们证明，从相同的初始条件出发，由迭代随机算子生成的随机序列的分布弱收敛于由收缩算子生成的轨迹。我们进一步证明，在某些条件下，随机序列的时间平均收敛于不变分布的空间均值。然后，我们将这些结果应用于逻辑回归，以及有限状态有限动作MDP的经验价值迭代和经验Q值迭代，以说明此处发展的一般理论。

英文摘要

Recursive stochastic algorithms have gained significant attention in the recent past due to data-driven applications. Examples include stochastic gradient descent for solving large-scale optimization problems and empirical dynamic programming algorithms for solving Markov decision problems. These recursive stochastic algorithms approximate certain contraction operators and can be viewed within the framework of iterated random operators. Accordingly, we consider iterated random operators over a Polish space that simulate an iterated contraction operator over that Polish space. Assume that the iterated random operators are indexed by certain batch sizes such that as batch sizes grow to infinity, each realization of the random operator converges (in some sense) to the contraction operator it is simulating. We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random operators converges weakly to the trajectory generated by the contraction operator. We further show that under certain conditions, the time average of the random sequence converges to the spatial mean of the invariant distribution. We then apply these results to logistic regression, empirical value iteration, and empirical Q value iteration for finite state finite action MDPs to illustrate the general theory developed here.

1501.07242 2026-06-04 math.NA cs.LG cs.NA math.OC 版本更新

Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions

通过模拟退火逃离局部极小值：近似凸函数的优化

Alexandre Belloni, Tengyuan Liang, Hariharan Narayanan, Alexander Rakhlin

发表机构 * The Fuqua School of Business, Duke University（杜克大学福库商学院）； Department of Statistics, The Wharton School, University of Pennsylvania（宾夕法尼亚大学沃顿商学院统计系）； Department of Statistics and Department of Mathematics, University of Washington（华盛顿大学统计系和数学系）

AI总结本文研究了如何通过模拟退火优化近似凸函数，提出了一种基于Hit-and-Run方法的采样算法，该算法不依赖一阶条件，因而本质上不受局部极小值影响，并在零阶随机凸优化中给出了求ε-极小点所需函数求值次数的复杂度界。

Comments 27 pages

详情

Journal ref: Proceedings of the 28th Conference on Learning Theory 40 (2015) 240-265

AI中文摘要

我们考虑仅使用函数求值在$\mathbb{R}^n$中的有界凸集上优化近似凸函数的问题。该问题被归约为使用Hit-and-Run方法从近似对数凹分布中采样，我们证明其$\mathcal{O}^*$复杂度与从对数凹分布采样相同。除了将对数凹分布的分析推广到近似对数凹分布之外，Hit-and-Run游走的一维采样器的实现还需要新的方法和分析。该算法基于模拟退火，不依赖一阶条件，因此本质上不受局部极小值影响。随后，我们将该方法应用于若干激发本研究的问题。在零阶随机凸优化的背景下，所提出的方法通过诱导一个$\mathcal{O}(ε/n)$-近似对数凹分布，在$\mathcal{O}^*(n^{7.5}ε^{-2})$次带噪函数求值后得到一个$ε$-极小点。我们还详细考虑了“非凸性程度”朝函数最优点衰减的情形。本文讨论的该方法的其他应用包括经验风险最小化器的隐私计算、两阶段随机规划以及用于在线学习的近似动态规划。

英文摘要

We consider the problem of optimizing an approximately convex function over a bounded convex set in $\mathbb{R}^n$ using only function evaluations. The problem is reduced to sampling from an \emph{approximately} log-concave distribution using the Hit-and-Run method, which is shown to have the same $\mathcal{O}^*$ complexity as sampling from log-concave distributions. In addition to extending the analysis for log-concave distributions to approximately log-concave distributions, the implementation of the 1-dimensional sampler of the Hit-and-Run walk requires new methods and analysis. The algorithm is then based on simulated annealing, which does not rely on first-order conditions and is therefore essentially immune to local minima. We then apply the method to different motivating problems. In the context of zeroth order stochastic convex optimization, the proposed method produces an $ε$-minimizer after $\mathcal{O}^*(n^{7.5}ε^{-2})$ noisy function evaluations by inducing a $\mathcal{O}(ε/n)$-approximately log-concave distribution. We also consider in detail the case when the "amount of non-convexity" decays towards the optimum of the function. Other applications of the method discussed in this work include private computation of empirical risk minimizers, two-stage stochastic programming, and approximate dynamic programming for online learning.
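
For readers unfamiliar with the Hit-and-Run walk, here is a minimal sketch of a single step for the simplest case: a uniform target on a Euclidean ball centred at the origin. It picks a uniformly random direction, computes the chord through the current point, and samples uniformly along that chord. The paper's setting is harder, since the one-dimensional sampler must draw from an approximately log-concave density restricted to the chord; the ball-shaped domain, the uniform target and the function name below are assumptions made purely for illustration.

```python
import math
import random


def hit_and_run_step(x, radius, rng):
    """One Hit-and-Run step for the uniform distribution on a ball.

    x is the current point (list of floats) inside the ball of the given
    radius centred at the origin; rng is a random.Random instance.
    """
    n = len(x)
    # uniformly random direction: normalised standard Gaussian vector
    d = [rng.gauss(0.0, 1.0) for _ in range(n)]
    norm = math.sqrt(sum(c * c for c in d))
    d = [c / norm for c in d]
    # chord endpoints solve |x + t d|^2 = radius^2, i.e. t^2 + 2 b t + c = 0
    b = sum(xi * di for xi, di in zip(x, d))
    c = sum(xi * xi for xi in x) - radius * radius
    disc = math.sqrt(max(b * b - c, 0.0))
    t = rng.uniform(-b - disc, -b + disc)
    return [xi + t * di for xi, di in zip(x, d)]
```

A simulated annealing scheme would replace the uniform draw of `t` with a draw from a density proportional to exp(-f(x + t d) / T) on the same chord, lowering the temperature T over time.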

1707.02568 2026-06-04 math.NA cs.LG cs.NA math.OC math.PR 版本更新

Solving high-dimensional partial differential equations using deep learning

利用深度学习解决高维偏微分方程

Jiequn Han, Arnulf Jentzen, Weinan E

发表机构 * Program in Applied and Computational Mathematics, Princeton University（普林斯顿大学应用与计算数学项目）； Department of Mathematics, Princeton University（普林斯顿大学数学系）； Beijing Institute of Big Data Research, Beijing（北京大数据研究院）

AI总结本文提出了一种基于深度学习的方法，用于解决高维抛物型偏微分方程，通过将偏微分方程转化为反向随机微分方程，并利用神经网络近似未知解的梯度，有效提高了高维问题的准确性和效率。

Comments 13 pages, 6 figures

详情

DOI: 10.1073/pnas.1718942115
Journal ref: Proceedings of the National Academy of Sciences, 115(34), 8505-8510 (2018)

AI中文摘要

开发用于求解高维偏微分方程（PDEs）的算法长期以来一直是一个极具挑战性的问题，由于著名的“维度灾难”问题。本文介绍了一种基于深度学习的方法，能够处理一般的高维抛物型PDEs。为此，PDEs被重新表述为反向随机微分方程，并且未知解的梯度通过神经网络近似，这在很大程度上类似于深度强化学习，其中梯度作为策略函数。在非线性Black-Scholes方程、Hamilton-Jacobi-Bellman方程和Allen-Cahn方程等示例上的数值结果表明，所提出的算法在高维情况下在准确性和成本方面都非常有效。这为经济学、金融学、运筹学和物理学开辟了新的可能性，通过同时考虑所有参与的代理、资产、资源或粒子，而不是对它们之间的相互关系做出任意假设。

英文摘要

Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up new possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their inter-relationships.

1709.05963 2026-06-04 math.NA cs.LG cs.NA cs.NE math.PR stat.ML 版本更新

Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations

基于机器学习的高维完全非线性偏微分方程和二阶反向随机微分方程近似算法

Christian Beck, Weinan E, Arnulf Jentzen

发表机构 * ETH Zurich（苏黎世联邦理工学院）； Beijing Institute of Big Data Research（北京大数据研究院）； Princeton University（普林斯顿大学）； Peking University（北京大学）

AI总结本文提出了一种基于机器学习的高维非线性二阶偏微分方程的求解方法，通过将非线性偏微分方程与二阶反向随机微分方程联系起来，并利用深度神经网络进行空间近似和随机梯度下降优化，展示了该方法在高维Black-Scholes-Barenblatt方程、Hamilton-Jacobi-Bellman方程和非线性期望问题中的高效性和准确性。

Comments 56 pages, 12 figures

详情

DOI: 10.1007/s00332-018-9525-3
Journal ref: J. Nonlinear Sci. 29, 1563-1619 (2019)

AI中文摘要

高维偏微分方程（PDE）出现在金融行业的多个模型中，例如衍生品定价模型、信用估值调整（CVA）模型或投资组合优化模型。这些应用中的PDE通常是高维的，因为维度对应于投资组合中的金融资产数量。此外，由于需要在模型中纳入某些非线性现象，如违约风险、交易成本、波动率不确定性（Knightian不确定性）或交易限制，这些PDE往往是完全非线性的。此类高维完全非线性PDE的求解极具挑战性，因为标准近似方法的计算努力随着维度呈指数增长。在本工作中，我们提出了一种新的方法来求解高维完全非线性二阶PDE。该方法可以特别用于采样高维非线性期望。该方法基于（i）完全非线性二阶PDE与二阶反向随机微分方程（2BSDE）之间的联系，（ii）PDE和2BSDE问题的合并公式，（iii）2BSDE的时间前向离散化和通过深度神经网络的空间近似，以及（iv）随机梯度下降型优化过程。使用Python中的TENSORFLOW获得的数值结果展示了该方法在100维Black-Scholes-Barenblatt方程、100维Hamilton-Jacobi-Bellman方程和100维G-布朗运动的非线性期望问题中的效率和准确性。

英文摘要

High-dimensional partial differential equations (PDEs) appear in a number of models from the financial industry, such as in derivative pricing models, credit valuation adjustment (CVA) models, or portfolio optimization models. The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio. Moreover, such PDEs are often fully nonlinear due to the need to incorporate certain nonlinear phenomena in the model such as default risks, transaction costs, volatility uncertainty (Knightian uncertainty), or trading constraints. Such high-dimensional fully nonlinear PDEs are exceedingly difficult to solve as the computational effort for standard approximation methods grows exponentially with the dimension. In this work we propose a new method for solving high-dimensional fully nonlinear second-order PDEs. Our method can in particular be used to sample from high-dimensional nonlinear expectations. The method is based on (i) a connection between fully nonlinear second-order PDEs and second-order backward stochastic differential equations (2BSDEs), (ii) a merged formulation of the PDE and the 2BSDE problem, (iii) a temporal forward discretization of the 2BSDE and a spatial approximation via deep neural nets, and (iv) a stochastic gradient descent-type optimization procedure. Numerical results obtained using TensorFlow in Python illustrate the efficiency and the accuracy of the method in the cases of a $100$-dimensional Black-Scholes-Barenblatt equation, a $100$-dimensional Hamilton-Jacobi-Bellman equation, and a nonlinear expectation of a $100$-dimensional $G$-Brownian motion.

1706.04702 2026-06-04 math.NA cs.LG cs.NA cs.NE math.PR stat.ML 版本更新

Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations

基于深度学习的高维抛物型偏微分方程和反向随机微分方程的数值方法

Weinan E, Jiequn Han, Arnulf Jentzen

发表机构 * Beijing Institute of Big Data Research (China)（北京大数据研究院（中国））； Princeton University (USA)（普林斯顿大学（美国））； Peking University (China)（北京大学（中国））； ETH Zurich (Switzerland)（苏黎世联邦理工学院（瑞士））

AI总结本文提出了一种基于深度学习的算法，通过将反向随机微分方程与强化学习类比，利用解的梯度作为策略函数，采用神经网络近似策略函数，有效解决了高维非线性偏微分方程和反向随机微分方程的问题。

Comments 39 pages, 15 figures

详情

DOI: 10.1007/s40304-017-0117-6
Journal ref: Commun. Math. Stat. 5, 349-380 (2017)

AI中文摘要

我们提出了一种新的算法，用于求解高维抛物型偏微分方程（PDEs）和反向随机微分方程（BSDEs），通过将BSDE与强化学习进行类比，将解的梯度作为策略函数，损失函数由给定的终端条件与BSDE解之间的误差构成。策略函数随后通过神经网络进行近似，如深度强化学习中所做的那样。使用TensorFlow进行的数值结果展示了所提出算法在解决物理和金融领域中多个100维非线性PDEs方面的效率和准确性，例如Allen-Cahn方程、Hamilton-Jacobi-Bellman方程以及金融衍生品的非线性定价模型。

英文摘要

We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution of the BSDE. The policy function is then approximated by a neural network, as is done in deep reinforcement learning. Numerical results using TensorFlow illustrate the efficiency and accuracy of the proposed algorithms for several 100-dimensional nonlinear PDEs from physics and finance such as the Allen-Cahn equation, the Hamilton-Jacobi-Bellman equation, and a nonlinear pricing model for financial derivatives.

1808.07452 2026-06-04 math.NA cs.LG cs.NA 版本更新

Generalized Canonical Polyadic Tensor Decomposition

广义典范多元（CP）张量分解

David Hong, Tamara G. Kolda, Jed A. Duersch

发表机构 * Sandia National Laboratories（桑迪亚国家实验室）； University of Michigan（密歇根大学）

AI总结本文提出了一种广义典范多元（GCP）张量分解，能够使用平方误差以外的其他损失函数，如逻辑损失或KL散度，从而适用于二值或计数数据，并展示了其在社交网络互动、小鼠神经活动和印度月降雨量等真实数据集上的灵活性。

详情

DOI: 10.1137/18M1203626
Journal ref: SIAM Review, Vol. 62, No. 1, pp. 133-163, 2020

AI中文摘要

张量分解是数据科学中一种基本的无监督机器学习方法，其应用包括网络分析和传感器数据处理。本文提出了一种广义典范多元（GCP）低秩张量分解，允许使用平方误差以外的其他损失函数。例如，我们可以使用逻辑损失或Kullback-Leibler散度，从而实现对二值或计数数据的张量分解。我们针对各种场景给出了多种具有统计学动机的损失函数。我们提供了一个用于计算梯度和处理缺失数据的通用框架，使标准优化方法能够用于拟合模型。我们在多个真实世界示例上展示了GCP的灵活性，包括社交网络中的互动、小鼠的神经活动以及印度的月降雨量测量。

英文摘要

Tensor decomposition is a fundamental unsupervised machine learning method in data science, with applications including network analysis and sensor data processing. This work develops a generalized canonical polyadic (GCP) low-rank tensor decomposition that allows other loss functions besides squared error. For instance, we can use logistic loss or Kullback-Leibler divergence, enabling tensor decomposition for binary or count data. We present a variety of statistically motivated loss functions for various scenarios. We provide a generalized framework for computing gradients and handling missing data that enables the use of standard optimization methods for fitting the model. We demonstrate the flexibility of GCP on several real-world examples including interactions in a social network, neural activity in a mouse, and monthly rainfall measurements in India.
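
The role of the loss function can be illustrated with elementwise losses f(x, m) between a data entry x and the corresponding low-rank model entry m, together with their derivatives in m. The sketch below assumes three common choices: squared error, a Poisson-style loss for counts, and a Bernoulli loss with an odds link for binary data. The exact forms, the small offset `eps`, the flattened-list layout and the optional mask for missing entries are illustrative assumptions, not a transcription of the paper's framework.

```python
import math


def squared_loss(x, m):
    return (x - m) ** 2


def squared_grad(x, m):
    return 2.0 * (m - x)


def poisson_loss(x, m, eps=1e-10):
    """Count data: negative Poisson log-likelihood up to terms constant in m."""
    return m - x * math.log(m + eps)


def poisson_grad(x, m, eps=1e-10):
    return 1.0 - x / (m + eps)


def bernoulli_odds_loss(x, m, eps=1e-10):
    """Binary data with m >= 0 interpreted as the odds of observing a 1."""
    return math.log(m + 1.0) - x * math.log(m + eps)


def bernoulli_odds_grad(x, m, eps=1e-10):
    return 1.0 / (m + 1.0) - x / (m + eps)


def gcp_objective(X, M, loss, mask=None):
    """Sum the elementwise loss over observed entries of flattened tensors."""
    if mask is None:
        mask = [True] * len(X)
    return sum(loss(x, m) for x, m, seen in zip(X, M, mask) if seen)
```

Because the objective is a plain sum over observed entries, missing data is handled by skipping masked entries, and the elementwise derivatives are what a standard optimizer would combine with the factor matrices to form gradients.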

1904.01514 2026-06-04 math.NA cs.LG cs.NA 版本更新

Data driven approximation of parametrized PDEs by Reduced Basis and Neural Networks

基于降阶基和神经网络的数据驱动参数化PDE近似

Niccolò Dal Santo, Simone Deparis, Luca Pegolotti

发表机构 * SCI-SB-SD, École Polytechnique Fédérale de Lausanne (EPFL), Station 8, 1015 Lausanne, Switzerland（SCI-SB-SD，瑞士联邦理工学院（洛桑联邦理工学院），8号站，1015洛桑，瑞士）

AI总结本文提出一种结合降阶基方法和神经网络的数据驱动方法来近似参数化偏微分方程：在物理参数未知或难以直接测量的情况下，根据域内少量点上的数据估计感兴趣的场，如材料样本的温度或流体的速度。

详情

DOI: 10.1016/j.jcp.2020.109550

AI中文摘要

我们关注基于降阶基方法和机器学习的数据驱动方法对偏微分方程的近似。我们假设所关注的现象可以由参数化偏微分方程建模，但物理参数的值未知或难以直接测量。给定域内少量点上的数据，我们的方法可以估计感兴趣的场，例如材料样本的温度或流体的速度。我们提出用一个神经网络来完成这一任务，该网络在最后一层嵌入降阶基求解器作为一种特殊的激活函数。降阶基求解器体现了底层的物理现象，它由昂贵的离线阶段中对随机选取的物理参数值求得的快照构建而成。随后，同样的全阶解被用于训练神经网络。实际上，所选架构类似于一个非对称自编码器，其中解码器是降阶基求解器，因此不包含可训练参数。该自编码器的潜在空间包含输入降阶基求解器的参数相关量，根据所考虑的偏微分方程，这些量或者是物理参数本身的值，或者是微分算子的仿射分解系数。

英文摘要

We are interested in the approximation of partial differential equations with a data-driven approach based on the reduced basis method and machine learning. We suppose that the phenomenon of interest can be modeled by a parametrized partial differential equation, but that the value of the physical parameters is unknown or difficult to measure directly. Our method allows one to estimate fields of interest, for instance the temperature of a sample of material or the velocity of a fluid, given data at a handful of points in the domain. We propose to accomplish this task with a neural network embedding a reduced basis solver as an exotic activation function in the last layer. The reduced basis solver accounts for the underlying physical phenomenon and is constructed from snapshots obtained from randomly selected values of the physical parameters during an expensive offline phase. The same full order solutions are then employed for the training of the neural network. As a matter of fact, the chosen architecture resembles an asymmetric autoencoder in which the decoder is the reduced basis solver and as such it does not contain trainable parameters. The resulting latent space of our autoencoder includes parameter-dependent quantities feeding the reduced basis solver, which -- depending on the considered partial differential equation -- are the values of the physical parameters themselves or the affine decomposition coefficients of the differential operators.

1903.11483 2026-06-04 cs.LG cs.NE cs.RO cs.SY eess.SY stat.ML 版本更新

Constructing Parsimonious Analytic Models for Dynamic Systems via Symbolic Regression

通过符号回归构建动态系统的简洁解析模型

Erik Derner, Jiří Kubalík, Nicola Ancona, Robert Babuška

发表机构 * Czech Institute of Informatics, Robotics, and Cybernetics（捷克信息学、机器人学与控制论研究所）； Czech Technical University in Prague（布拉格捷克理工大学）； Department of Control Engineering, Faculty of Electrical Engineering（电气工程学院控制工程系）； Delft University of Technology（代尔夫特理工大学）

AI总结本文提出利用符号回归构建动态系统的简洁解析模型，通过两种先进的符号回归算法在状态空间域和输入输出域中应用，展示了在模拟示例和真实系统中的优越性能。

详情

DOI: 10.1016/j.asoc.2020.106432
Journal ref: Applied Soft Computing, Volume 94, September 2020, 106432

AI中文摘要

构建动态系统的数学模型对于许多工程和科学学科至关重要。模型有助于模拟、分析系统行为、决策制定和自动控制算法的设计。即使像强化学习（RL）这样的无模型控制技术也已被证明能从使用模型中受益，通常这些模型是在线学习的。任何模型构建方法都必须处理模型的准确性和复杂性之间的权衡，这很难做到。本文提出利用符号回归（SR）来构建由解析方程描述的简洁过程模型。我们为方法配备了两种最先进的符号回归算法，它们自动搜索适合测量数据的方程：单节点遗传编程（SNGP）和多基因遗传编程（MGGP）。除了状态空间域中的标准问题表述外，我们还展示了该方法如何应用于非线性自回归加外生输入（NARX）类型的输入输出模型。我们展示了该方法在三个模拟示例中的应用，这些示例的状态空间最高可达14维：倒立摆、移动机器人和双足行走机器人。与深度神经网络和局部线性回归的比较表明，SR在大多数情况下优于这些常用替代方法。我们在真实摆系统上展示了解析模型的发现使RL控制器能够成功完成摆起任务，该模型仅基于100个数据样本构建。

英文摘要

Developing mathematical models of dynamic systems is central to many disciplines of engineering and science. Models facilitate simulations, analysis of the system's behavior, decision making and design of automatic control algorithms. Even inherently model-free control techniques such as reinforcement learning (RL) have been shown to benefit from the use of models, typically learned online. Any model construction method must address the tradeoff between the accuracy of the model and its complexity, which is difficult to strike. In this paper, we propose to employ symbolic regression (SR) to construct parsimonious process models described by analytic equations. We have equipped our method with two different state-of-the-art SR algorithms which automatically search for equations that fit the measured data: Single Node Genetic Programming (SNGP) and Multi-Gene Genetic Programming (MGGP). In addition to the standard problem formulation in the state-space domain, we show how the method can also be applied to input-output models of the NARX (nonlinear autoregressive with exogenous input) type. We present the approach on three simulated examples with up to 14-dimensional state space: an inverted pendulum, a mobile robot, and a bipedal walking robot. A comparison with deep neural networks and local linear regression shows that SR in most cases outperforms these commonly used alternative methods. We demonstrate on a real pendulum system that the analytic model found enables an RL controller to successfully perform the swing-up task, based on a model constructed from only 100 data samples.

1904.01068 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

在未知转移模型的确定性马尔可夫决策过程中实现高效且安全的探索

Erdem Bıyık, Jonathan Margoliash, Shahrouz Ryan Alimo, Dorsa Sadigh

发表机构 * Stanford University（斯坦福大学）； Jet Propulsion Laboratory（喷气推进实验室）； California Institute of Technology（加州理工学院）

AI总结本文提出了一种安全探索算法，通过利用Lipschitz连续性确保在探索过程中不访问危险状态，该算法在确定性马尔可夫决策过程中提供了确定性的安全保证，并通过模拟导航任务验证了其性能。

Comments Proceedings of the American Control Conference (ACC), July 2019. The first two authors have equal contribution

1902.02095 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Space Navigator: a Tool for the Optimization of Collision Avoidance Maneuvers

Space Navigator：碰撞规避机动优化工具

Leonid Gremyachikh, Dmitrii Dubov, Nikita Kazeev, Andrey Kulibaba, Andrey Skuratov, Anton Tereshkin, Andrey Ustyuzhanin, Lubov Shiryaeva, Sergej Shishkin

发表机构 * National Research University Higher School of Economics, Laboratory of Methods for Big Data Analysis（俄罗斯国立研究大学高等经济学院，大数据分析方法实验室）； Yandex School of Data Analysis（Yandex数据分析学院）； Phygitalism

AI总结本文提出了一种名为“Space Navigator”的模块化自主碰撞规避系统，通过结合领域知识与强化学习方法，解决千级卫星星座的碰撞规避机动优化问题。

Comments Submitted to AAS Advances in the Astronautical Sciences, presented at IAA SciTech Forum 2018

1711.11417 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Scalable synthesis of safety certificates from data with application to learning-based control

从数据中可扩展地合成安全证书及其在基于学习的控制中的应用

Kim P. Wabersich, Melanie N. Zeilinger

发表机构 * Institute for Dynamic Systems and Control, ETH Zurich（动态系统与控制研究所，苏黎世联邦理工学院）

AI总结本文提出了一种高效的方法来合成安全集和控制律，通过基于凸优化问题的近似方法，提高了可扩展性，同时利用高斯过程先验减少保守性，应用于自动驾驶车队等场景。

1905.10706 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Interactive Differentiable Simulation

交互式可微分模拟

Eric Heiden, David Millard, Hejia Zhang, Gaurav S. Sukhatme

发表机构 * University of Southern California（南加州大学）

AI总结本文提出交互式可微分模拟（IDS），一种能够高效准确推断刚体系统物理属性的可微分物理引擎，通过视觉输入实现系统识别，从而建立具有物理意义的世界模型，并在非线性动态系统中实现自动任务机器人设计和参数估计，显著提升了非线性控制领域的样本效率。

详情

AI中文摘要

智能体需要对世界有物理理解才能预测其未来行动的影响。虽然基于学习的环境动力学模型在样本效率上相比无模型强化学习算法有所改进，但通常无法泛化到训练数据之外的系统状态，且往往依赖于非解释性的潜在变量。我们引入交互式可微分模拟（IDS），一种可微分的物理引擎，能够高效准确地推断刚体系统的物理属性。将模型集成到深度学习架构中，该模型能够利用视觉输入实现系统识别，从而建立具有物理意义的世界模型。我们展示了通过自动计算IDS中的梯度，实现非线性动态系统的自动任务机器人设计和参数估计。当与自适应模型预测控制算法结合时，我们的方法在具有挑战性的非线性控制领域中，相比无模型强化学习算法显示出数量级的样本效率提升。

英文摘要

Intelligent agents need a physical understanding of the world to predict the impact of their actions in the future. While learning-based models of the environment dynamics have contributed to significant improvements in sample efficiency compared to model-free reinforcement learning algorithms, they typically fail to generalize to system states beyond the training data, while often grounding their predictions on non-interpretable latent variables. We introduce Interactive Differentiable Simulation (IDS), a differentiable physics engine, that allows for efficient, accurate inference of physical properties of rigid-body systems. Integrated into deep learning architectures, our model is able to accomplish system identification using visual input, leading to an interpretable model of the world whose parameters have physical meaning. We present experiments showing automatic task-based robot design and parameter estimation for nonlinear dynamical systems by automatically calculating gradients in IDS. When integrated into an adaptive model-predictive control algorithm, our approach exhibits orders of magnitude improvements in sample efficiency over model-free reinforcement learning algorithms on challenging nonlinear control domains.

1607.01027 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Accelerate Stochastic Subgradient Method by Leveraging Local Growth Condition

通过利用局部增长条件加速随机子梯度方法

Yi Xu, Qihang Lin, Tianbao Yang

发表机构 * Department of Computer Science（计算机科学系）； Department of Management Sciences（管理科学系）； The University of Iowa（爱荷华大学）

AI总结本文提出了一种新的理论，表明在最优解邻域内目标函数的局部增长率足以量化一阶随机凸优化的全局收敛率，通过局部区域逐步缩小的方法改进了加速随机子梯度方法的收敛性，并在实践中提出了无需知道乘法增长常数和增长率的实用变体。

详情

AI中文摘要

在本文中，我们为一阶随机凸优化发展了一种新理论，表明全局收敛率可以由目标函数在最优解邻域内的局部增长率充分刻画。具体而言，如果目标函数$F(\mathbf w)$在$ε$-子水平集内的增长速度不低于$\|\mathbf w - \mathbf w_*\|_2^{1/θ}$，其中$\mathbf w_*$是距$\mathbf w$最近的最优解，$θ\in(0,1]$刻画局部增长率，那么一阶随机优化达到$ε$-最优解的迭代复杂度可以为$\widetilde O(1/ε^{2(1-θ)})$，这至多相差一个对数因子即为最优。为了实现更快的全局收敛，我们提出了两种不同的加速随机子梯度方法：在某个历史解周围的局部区域内迭代地近似求解原问题，并随着解接近最优解集而逐步缩小该局部区域。除了理论上的改进外，本文还包含使所提算法实用化的新贡献：(i) 我们给出了加速随机子梯度方法的实用变体，它们无需知道乘性增长常数，甚至无需知道增长率$θ$即可运行；(ii) 我们考虑了机器学习中的一大类问题，以证明所提算法比传统随机子梯度方法收敛更快。我们还刻画了所提算法在不假设光滑性的情况下保证梯度较小所需的复杂度。

英文摘要

In this paper, a new theory is developed for first-order stochastic convex optimization, showing that the global convergence rate is sufficiently quantified by a local growth rate of the objective function in a neighborhood of the optimal solutions. In particular, if the objective function $F(\mathbf w)$ in the $ε$-sublevel set grows as fast as $\|\mathbf w - \mathbf w_*\|_2^{1/θ}$, where $\mathbf w_*$ represents the closest optimal solution to $\mathbf w$ and $θ\in(0,1]$ quantifies the local growth rate, the iteration complexity of first-order stochastic optimization for achieving an $ε$-optimal solution can be $\widetilde O(1/ε^{2(1-θ)})$, which is optimal at most up to a logarithmic factor. To achieve the faster global convergence, we develop two different accelerated stochastic subgradient methods by iteratively solving the original problem approximately in a local region around a historical solution with the size of the local region gradually decreasing as the solution approaches the optimal set. Besides the theoretical improvements, this work also includes new contributions towards making the proposed algorithms practical: (i) we present practical variants of accelerated stochastic subgradient methods that can run without the knowledge of multiplicative growth constant and even the growth rate $θ$; (ii) we consider a broad family of problems in machine learning to demonstrate that the proposed algorithms enjoy faster convergence than traditional stochastic subgradient method. We also characterize the complexity of the proposed algorithms for ensuring the gradient is small without the smoothness assumption.

1809.09170 2026-06-04 math.NA cs.LG cs.NA math.DS stat.ML 版本更新

Numerical Aspects for Approximating Governing Equations Using Data

利用数据近似控制方程的数值问题

Kailiang Wu, Dongbin Xiu

发表机构 * Department of Mathematics, The Ohio State University, Columbus, OH 43210, USA.（数学系，俄亥俄州立大学，哥伦布，OH 43210，USA）

AI总结本文提出了有效的数值算法，用于从测量数据中局部恢复未知的控制微分方程：采用多项式等标准基函数进行高精度近似，讨论了准确近似的关键因素（如使用大量短时轨迹数据片段而非单条长轨迹数据），并给出了线性和非线性系统的数值算例。

Comments 26 pages, 17 figures

详情

DOI: 10.1016/j.jcp.2019.01.030
Journal ref: Journal of Computational Physics, 384, 200-221, 2019

AI中文摘要

我们提出了有效的数值算法，用于从测量数据中局部恢复未知的控制微分方程。我们采用一组标准基函数，例如多项式，来高精度地逼近控制方程。在将问题转化为函数逼近问题后，我们讨论了确保逼近精度的几个重要方面。最值得注意的是，我们讨论了使用大量短轨迹数据片段而非单一长轨迹数据的重要性。随后，我们提出了几种数值算法以实现准确的逼近，并给出了最终方程逼近的误差估计。然后，我们展示了线性和非线性系统的一系列广泛数值示例，以展示我们方程恢复算法的性质和有效性。

英文摘要

We present effective numerical algorithms for locally recovering unknown governing differential equations from measurement data. We employ a set of standard basis functions, e.g., polynomials, to approximate the governing equation with high accuracy. Upon recasting the problem into a function approximation problem, we discuss several important aspects for accurate approximation. Most notably, we discuss the importance of using a large number of short bursts of trajectory data, rather than using data from a single long trajectory. Several options for the numerical algorithms to perform accurate approximation are then presented, along with an error estimate of the final equation approximation. We then present an extensive set of numerical examples of both linear and nonlinear systems to demonstrate the properties and effectiveness of our equation recovery algorithms.
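
The recasting of equation recovery as a function approximation problem can be illustrated with a minimal sketch. It assumes a scalar system, a monomial basis, a one-step finite difference as the derivative estimate, and synthetic bursts generated from a hypothetical right-hand side x - x^3; none of these choices come from the paper, and `recover_rhs` is an illustrative name.

```python
import numpy as np

def recover_rhs(x0, x1, dt, degree):
    """Least-squares fit of dx/dt ~ sum_k c_k * x**k from short bursts.

    x0, x1: states at the start and end of each burst (1-D arrays).
    Returns the coefficients c_0, ..., c_degree.
    """
    dxdt = (x1 - x0) / dt                                # finite-difference derivative estimate
    basis = np.vander(x0, degree + 1, increasing=True)   # columns 1, x, x^2, ...
    coeffs, *_ = np.linalg.lstsq(basis, dxdt, rcond=None)
    return coeffs

# Hypothetical test system dx/dt = x - x^3, observed as many short bursts
# started from random initial states (synthetic data, one Euler step each).
rng = np.random.default_rng(0)
dt = 1e-3
x0 = rng.uniform(-2.0, 2.0, size=200)
x1 = x0 + dt * (x0 - x0**3)
coeffs = recover_rhs(x0, x1, dt, degree=3)
```

Because the bursts start from states spread over the whole interval, the basis matrix is well conditioned; a single long trajectory that settles near an equilibrium would sample only a small part of the state space.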

1905.13547 2026-06-04 cs.LG cs.SY eess.SY math.DS math.OC stat.ML 版本更新

Learning robust control for LQR systems with multiplicative noise via policy gradient

通过策略梯度学习具有乘性噪声的LQR系统的鲁棒控制

Benjamin Gravell, Peyman Mohajerin Esfahani, Tyler Summers

发表机构 * Control, Optimization, and Networks lab, UT Dallas（控制、优化与网络实验室，UT Dallas）； Delft Center for Systems and Control, TU Delft（代尔夫特系统与控制中心，TU Delft）

AI总结本文研究了具有乘性噪声的LQR系统，通过策略梯度方法实现鲁棒控制，证明了在非凸成本函数下策略梯度算法的全局收敛性。

详情

AI中文摘要

线性二次调节（LQR）问题重新成为强化学习控制复杂动态系统的重要理论基准，特别是当状态和动作空间连续时。与几乎所有近期相关工作不同，我们考虑了乘性噪声模型，这些模型由于显式地纳入系统动态中的固有不确定性和变化，从而提高了控制器的鲁棒性。鲁棒性是强化学习中一个关键但理解不足的问题；现有不考虑不确定性的方法可能会收敛到脆弱的策略或完全无法收敛。此外，有意地将乘性噪声注入到学习算法中可以增强策略的鲁棒性，如在领域随机化中的非正式工作所观察到的。尽管策略梯度算法需要优化非凸成本函数，我们展示了乘性噪声LQR成本具有称为梯度支配的特殊性质，该性质被用来证明策略梯度算法在问题参数上具有多项式依赖性的全局收敛性，以达到全局最优控制策略。结果在已知模型和未知模型设置中均提供，其中系统轨迹样本用于估计策略梯度。

英文摘要

The linear quadratic regulator (LQR) problem has reemerged as an important theoretical benchmark for reinforcement learning-based control of complex dynamical systems with continuous state and action spaces. In contrast with nearly all recent work in this area, we consider multiplicative noise models, which are increasingly relevant because they explicitly incorporate inherent uncertainty and variation in the system dynamics and thereby improve robustness properties of the controller. Robustness is a critical and poorly understood issue in reinforcement learning; existing methods which do not account for uncertainty can converge to fragile policies or fail to converge at all. Additionally, intentional injection of multiplicative noise into learning algorithms can enhance robustness of policies, as observed in ad hoc work on domain randomization. Although policy gradient algorithms require optimization of a non-convex cost function, we show that the multiplicative noise LQR cost has a special property called gradient domination, which is exploited to prove global convergence of policy gradient algorithms to the globally optimum control policy with polynomial dependence on problem parameters. Results are provided both in the model-known and model-unknown settings where samples of system trajectories are used to estimate policy gradients.

1902.09964 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

A Neural-Network-Based Model Predictive Control of Three-Phase Inverter With an Output LC Filter

带输出LC滤波器的三相逆变器的基于神经网络的模型预测控制

Ihab S. Mohamed, Stefano Rovetta, Ton Duc Do, Tomislav Dragicevic, Ahmed A. Zaki Diab

发表机构 * INRIA Sophia Antipolis - Méditerranée, University Côte d'Azur, France； Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, Italy； Department of Robotics and Mechatronics, School of Science and Technology (SST), Nazarbayev University, Astana Z05H0P9, Republic of Kazakhstan； Department of Energy Technology, Aalborg University, Denmark； Electrical Engineering Department, Faculty of Engineering, Minia University, Egypt

AI总结本文提出了一种结合模型预测控制（MPC）和前馈人工神经网络（ANN）的双级逆变器控制方案，旨在降低总谐波失真（THD）并提高系统在不同负载类型下的稳态和动态性能。通过MPC生成神经网络训练数据，利用训练好的ANN实现无MPC的电压跟踪，通过MATLAB/Simulink仿真验证了该策略的优越性能。

Comments 13 pages, 15 figures, 3 tables. This article has been submitted to IEEE Access

详情

DOI: 10.1109/ACCESS.2019.2938220

AI中文摘要

模型预测控制（MPC）已成为一种成熟的现代控制方法，用于具有输出LC滤波器的三相逆变器，其中需要高质量电压和低总谐波失真（THD）。尽管MPC是一种直观的控制器，易于理解和实现，但它有显著缺点，即需要大量的在线计算来解决优化问题。另一方面，在电力电子和驱动领域，基于人工神经网络的无模型方法的应用正在迅速增长。本文提出了一种新的两电平变换器控制方案，结合MPC和前馈ANN，旨在降低THD并提高系统在不同负载类型下的稳态和动态性能。首先，MPC在训练阶段作为专家，用于生成训练所提出神经网络所需的数据。然后，一旦神经网络经过微调，就可以在不需要使用MPC的情况下在线用于电压跟踪。所提出的基于ANN的控制策略通过MATLAB/Simulink工具进行仿真验证，考虑了不同的负载条件。此外，评估了基于ANN的控制器在多种线性和非线性负载、不同运行条件下的性能，并与MPC的性能进行比较，证明了所提出基于ANN的控制策略在稳态和动态性能方面的优异表现。

英文摘要

Model predictive control (MPC) has become one of the well-established modern control methods for three-phase inverters with an output LC filter, where a high-quality voltage with low total harmonic distortion (THD) is needed. Although it is an intuitive controller, easy to understand and implement, it has the significant disadvantage of requiring a large number of online calculations for solving the optimization problem. On the other hand, the application of model-free approaches, such as those based on artificial neural networks (ANNs), is currently growing rapidly in the area of power electronics and drives. This paper presents a new control scheme for a two-level converter based on combining MPC and a feed-forward ANN, with the aim of getting lower THD and improving the steady-state and dynamic performance of the system for different types of loads. First, MPC is used, as an expert, in the training phase to generate data required for training the proposed neural network. Then, once the neural network is fine-tuned, it can be successfully used online for voltage tracking, without the need for MPC. The proposed ANN-based control strategy is validated through simulation, using MATLAB/Simulink tools, taking into account different load conditions. Moreover, the performance of the ANN-based controller is evaluated, on several samples of linear and non-linear loads under various operating conditions, and compared to that of MPC, demonstrating the excellent steady-state and dynamic performance of the proposed ANN-based control strategy.

1904.05856 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Connections Between Adaptive Control and Optimization in Machine Learning

自适应控制与机器学习中优化方法之间的联系

Joseph E. Gaudio, Travis E. Gibson, Anuradha M. Annaswamy, Michael A. Bolender, Eugene Lavretsky

发表机构 * Massachusetts Institute of Technology（麻省理工学院）； Brigham and Women’s Hospital and Harvard Medical School（布莱根妇女医院和哈佛医学院）； Air Force Research Laboratory（空军研究实验室）； The Boeing Company（波音公司）

AI总结本文探讨了自适应控制与机器学习中常用优化方法之间的联系，通过分析更新法则的相似性，讨论了稳定性、性能和学习等共同概念，并提出了新的交集和改进算法分析的机会，特别是通过这些交集的见解解决了高阶学习问题。

Comments 18 pages

1602.04450 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Bayesian Optimization with Safety Constraints: Safe and Automatic Parameter Tuning in Robotics

具有安全约束的贝叶斯优化：机器人中的安全自动参数调节

Felix Berkenkamp, Andreas Krause, Angela P. Schoellig

发表机构 * Learning & Adaptive Systems Group, Department of Computer Science, ETH Zurich, Switzerland； Dynamic Systems Lab, Institute for Aerospace Studies, University of Toronto, Canada

AI总结本文提出了一种通用算法，允许在目标函数之外存在多个独立的安全约束。该算法在给定初始安全参数的情况下，最大化性能，但仅评估满足所有安全约束的参数。通过利用高斯过程先验的正则性假设，该算法仔细探索参数空间，并展示了如何利用上下文变量安全地将知识转移到新任务中。

详情

AI中文摘要

机器人算法通常依赖于各种参数，这些参数的选择对机器人的性能有显著影响。虽然初始参数猜测可以从机器人的动态模型中获得，但通常需要在真实系统上手动调整参数以达到最佳性能。优化算法，如贝叶斯优化，已被用来自动化这一过程。然而，这些方法在优化过程中可能会评估不安全的参数，导致安全关键系统的故障。最近，一种称为SafeOpt的安全贝叶斯优化算法已被开发，该算法保证系统性能永远不会低于临界值；即，安全性是基于性能函数定义的。然而，在机器人中，将性能和安全性结合往往并不理想。例如，高增益控制器可能实现低平均跟踪误差（性能），但可能会超调并违反输入约束。在本文中，我们提出了一种通用算法，允许在目标函数之外存在多个独立的安全约束。给定初始的安全参数集，该算法最大化性能，但只评估满足所有约束的参数，以高概率。为此，它通过利用高斯过程先验的正则性假设来仔细探索参数空间。此外，我们展示了如何利用上下文变量安全地将知识转移到新情况和任务中。我们提供了理论分析，并证明所提出的算法能够实现快速、自动和安全的参数调节优化，在四旋翼飞行器的实验中得到了验证。

英文摘要

Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot's performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.

1812.08625 2026-06-04 math.NA cs.LG cs.NA physics.comp-ph stat.ML 版本更新

Deep Theory of Functional Connections: A New Method for Estimating the Solutions of PDEs

深度函数连接理论：一种用于估计偏微分方程解的新方法

Carl Leake

发表机构 * Ph.D. Student, Aerospace Engineering, Texas A&M University, College Station, TX（航空航天工程博士研究生，德克萨斯A&M大学，学院站，德克萨斯）

AI总结本文提出了一种名为深度函数连接理论（TFC）的新方法，通过将神经网络与TFC结合来估计偏微分方程（PDEs）的解。该方法将带有边界条件的PDEs转换为无约束优化问题，利用神经网络作为自由函数来求解无约束优化问题，并通过残差平方作为损失函数进行无监督训练。与传统方法相比，该方法无需离散化域，且能提供整个训练域的闭合形式解析解。

Comments 14 pages, 7 figures

详情

DOI: 10.3390/make2010004
Journal ref: Mach. Learn. Knowl. Extr. 2020, 2(1), 37-55

AI中文摘要

本文提出了一种名为深度函数连接理论（TFC）的新方法，通过将神经网络与TFC结合来估计偏微分方程（PDEs）的解。TFC通过将边界条件嵌入到一个“约束表达式”中，把带有边界条件的PDEs转换为无约束优化问题。在本工作中，神经网络被选为自由函数，并用于求解现在无约束的优化问题。损失函数取为PDE残差的平方。然后，神经网络以无监督的方式训练以解决无约束优化问题。与用于估计PDE解的流行方法相比，该方法有两个主要区别。首先，该方法不需要将域离散化为网格，而是在训练阶段从域中随机采样点。其次，训练后，该方法在整个训练域内提供闭合形式、解析、可微的解的近似。相比之下，其他流行方法如果需要在不在离散化网格上的点上估计解，则需要插值。深度TFC方法在四个具有各种边界条件的问题上进行了演示。

英文摘要

This article presents a new methodology called deep Theory of Functional Connections (TFC) that estimates the solutions of partial differential equations (PDEs) by combining neural networks with TFC. TFC is used to transform PDEs with boundary conditions into unconstrained optimization problems by embedding the boundary conditions into a "constrained expression." In this work, a neural network is chosen as the free function, and used to solve the now unconstrained optimization problem. The loss function is taken as the square of the residual of the PDE. Then, the neural network is trained in an unsupervised manner to solve the unconstrained optimization problem. This methodology has two major differences when compared with popular methods used to estimate the solutions of PDEs. First, this methodology does not need to discretize the domain into a grid, rather, this methodology randomly samples points from the domain during the training phase. Second, after training, this methodology represents a closed form, analytical, differentiable approximation of the solution throughout the entire training domain. In contrast, other popular methods require interpolation if the estimated solution is desired at points that do not lie on the discretized grid. The deep TFC method for estimating the solution of PDEs is demonstrated on four problems with a variety of boundary conditions.
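
The idea of embedding boundary conditions into a constrained expression can be sketched in one dimension. The form below is an assumption made for illustration (two Dirichlet conditions, y(0) = a and y(1) = b, on the unit interval); the abstract does not state the expressions used in the paper, and `constrained_expression` is a hypothetical helper. Any free function g, here a fixed stand-in for the neural network, yields a y that satisfies both conditions exactly.

```python
import numpy as np

def constrained_expression(g, a, b):
    """Return y(x) with y(0) = a and y(1) = b for any free function g."""
    def y(x):
        # The two correction terms cancel g at the boundaries and insert a, b.
        return g(x) + (1.0 - x) * (a - g(0.0)) + x * (b - g(1.0))
    return y

# Arbitrary stand-in for the free function (a trained network in deep TFC).
free = lambda x: np.sin(3.0 * x) + x**2
y = constrained_expression(free, a=2.0, b=-1.0)
```

Since the boundary conditions hold for every choice of g, training only has to reduce the PDE residual at sampled interior points, which is what makes the optimization problem unconstrained.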

1809.02341 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

A Fast Anderson-Chebyshev Acceleration for Nonlinear Optimization

非线性优化中的快速安德森-切比雪夫加速方法

Zhize Li, Jian Li

发表机构 * King Abdullah University of Science and Technology（阿卜杜拉国王科技大学）； Tsinghua University（清华大学）

AI总结本文提出了一种快速安德森-切比雪夫加速方法，用于非线性优化问题，该方法在二次函数上实现了最优收敛率O(√κ ln(1/ε))，并提供了通用非线性问题的收敛分析，同时提出了动态猜测超参数的算法。

Comments To appear in AISTATS 2020

详情

AI中文摘要

安德森加速（或安德森混合）是一种针对不动点迭代$x_{t+1}=G(x_t)$的高效加速方法，例如梯度下降可以视为迭代应用操作$G(x) \triangleq x-α\nabla f(x)$。本文表明，安德森加速结合切比雪夫多项式可以实现最优收敛率$O(\sqrt{κ}\ln\frac{1}{ε})$，这改进了之前对于二次函数提供的结果$O(κ\ln\frac{1}{ε})$（Toth and Kelley, 2015）。此外，我们为一般非线性问题提供了收敛分析。如果超参数（例如Lipschitz光滑参数$L$）不可用，我们提出了一种猜测算法来动态猜测它们，并证明了类似的收敛率。最后，实验结果表明，所提出的安德森-切比雪夫加速方法比其他算法如普通梯度下降（GD）、Nesterov加速GD收敛更快。此外，这些算法结合所提出的猜测算法（动态猜测超参数）实现了更好的性能。

英文摘要

Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed point iterations $x_{t+1}=G(x_t)$, e.g., gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x-α\nabla f(x)$. It is known that Anderson acceleration is quite efficient in practice and can be viewed as an extension of Krylov subspace methods for nonlinear problems. In this paper, we show that Anderson acceleration with Chebyshev polynomial can achieve the optimal convergence rate $O(\sqrtκ\ln\frac{1}ε)$, which improves the previous result $O(κ\ln\frac{1}ε)$ provided by (Toth and Kelley, 2015) for quadratic functions. Moreover, we provide a convergence analysis for minimizing general nonlinear problems. Besides, if the hyperparameters (e.g., the Lipschitz smooth parameter $L$) are not available, we propose a guessing algorithm for guessing them dynamically and also prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD), Nesterov's Accelerated GD. Also, these algorithms combined with the proposed guessing algorithm (guessing the hyperparameters dynamically) achieve much better performance.
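
To make the fixed-point view concrete, here is a minimal sketch of plain Anderson mixing applied to the gradient-descent map G(x) = x - α∇f(x) on a quadratic. It is the textbook windowed scheme and does not include the Chebyshev component or the guessing algorithm proposed in the paper; the window size `m`, the test matrix, and the function name `anderson` are assumptions chosen for illustration.

```python
import numpy as np

def anderson(G, x0, m=5, iters=50, tol=1e-12):
    """Anderson mixing with window m for the fixed-point map G."""
    x = np.asarray(x0, dtype=float).copy()
    X, F = [], []                        # recent iterates and residuals G(x) - x
    for _ in range(iters):
        gx = G(x)
        f = gx - x
        if np.linalg.norm(f) < tol:      # already at a fixed point
            break
        X.append(x.copy())
        F.append(f.copy())
        X, F = X[-(m + 1):], F[-(m + 1):]
        if len(F) == 1:
            x = gx                       # plain fixed-point step to start
        else:
            dX = np.column_stack([X[i + 1] - X[i] for i in range(len(X) - 1)])
            dF = np.column_stack([F[i + 1] - F[i] for i in range(len(F) - 1)])
            # Least-squares combination of recent residual differences.
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - (dX + dF) @ gamma
    return x

# Hypothetical test problem: f(x) = 0.5 x^T A x - b^T x with diagonal A.
eigs = np.linspace(1.0, 100.0, 10)
A = np.diag(eigs)
b = np.ones(10)
alpha = 1.0 / eigs.max()
G = lambda x: x - alpha * (A @ x - b)    # gradient-descent map
x_star = b / eigs                        # exact minimizer

x_gd = np.zeros(10)
for _ in range(50):
    x_gd = G(x_gd)
x_aa = anderson(G, np.zeros(10), m=5, iters=50)

err_gd = np.linalg.norm(x_gd - x_star)
err_aa = np.linalg.norm(x_aa - x_star)
```

Comparing `err_aa` with `err_gd` after the same number of evaluations of G gives a rough feel for the gap between the accelerated and unaccelerated iterations on an ill-conditioned quadratic.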

1812.11137 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Differential Temporal Difference Learning

差分时间差分学习

Adithya M. Devraj, Ioannis Kontoyiannis, Sean P. Meyn

发表机构 * Department of Electrical and Computer Engineering, University of Florida（佛罗里达大学电气与计算机工程系）； Department of Engineering, University of Cambridge（剑桥大学工程系）

AI总结本文提出了一种新的差分时间差分学习算法，旨在解决传统时间差分学习方法中收敛缓慢和相对价值函数计算中一致性算法仅在特殊情况下存在的问题。

Comments Preliminary versions of some of the results in this article were submitted as arXiv:1604.01828

详情

AI中文摘要

由马尔可夫决策过程导出的价值函数在许多统计和工程应用中的机器学习技术中作为算法和性能指标的核心组成部分。在大多数实际情况下，计算相关贝尔曼方程的解具有挑战性。一种流行的近似技术，即时间差分（TD）学习算法，是通用强化学习方法的重要子类。本文介绍的算法旨在解决TD学习方法的两个已知难题：由于非常高的方差导致的收敛缓慢，以及在计算相对价值函数的问题中，仅在特殊情况下存在一致算法。首先，我们表明这些价值函数的梯度具有可以用于算法设计的表示形式。基于这一结果，引入了一种新的差分TD学习算法。对于在欧几里得空间上具有光滑动力学的马尔可夫模型，在一般条件下，这些算法被证明是自洽的。数值结果表明，与标准方法相比，具有显著的方差减少。

英文摘要

Value functions derived from Markov decision processes arise as a central component of algorithms as well as performance metrics in many statistics and engineering applications of machine learning techniques. Computation of the solution to the associated Bellman equations is challenging in most practical cases of interest. A popular class of approximation techniques, known as Temporal Difference (TD) learning algorithms, are an important sub-class of general reinforcement learning methods. The algorithms introduced in this paper are intended to resolve two well-known difficulties of TD-learning approaches: Their slow convergence due to very high variance, and the fact that, for the problem of computing the relative value function, consistent algorithms exist only in special cases. First we show that the gradients of these value functions admit a representation that lends itself to algorithm design. Based on this result, a new class of differential TD-learning algorithms is introduced. For Markovian models on Euclidean space with smooth dynamics, the algorithms are shown to be consistent under general conditions. Numerical results show dramatic variance reduction when compared to standard methods.

1905.11011 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY 版本更新

Robustness of accelerated first-order algorithms for strongly convex optimization problems

强凸优化问题中加速一阶算法的鲁棒性

Hesameddin Mohammadi, Meisam Razaviyayn, Mihailo R. Jovanović

发表机构 * Ming Hsieh Department of Electrical and Computer Engineering（明希德电气与计算机工程系）； Daniel J. Epstein Department of Industrial and Systems Engineering（丹尼尔·J·埃普斯坦工业与系统工程系）

AI总结本文研究了在梯度评估中存在随机不确定性的加速一阶算法的鲁棒性，分析了噪声对优化变量均方误差的影响，并探讨了噪声放大与收敛速率之间的根本权衡。

Comments 45 pages, 6 figures

详情

AI中文摘要

我们研究了在梯度评估中存在随机不确定性的加速一阶算法的鲁棒性。具体而言，针对无约束、光滑、强凸优化问题，我们考察了在迭代项受到加性白噪声扰动时优化变量的均方误差。这种不确定性可能出现在通过真实系统的测量来近似梯度或在分布式网络计算中。尽管此类问题的一阶算法的动力学是非线性的，我们建立了均方偏离最优解的上界，这些上界在常数因子范围内是紧的。我们的分析量化了通过任何类似于Nesterov方法或重球（heavy-ball）法的加速方案所获得的噪声放大与收敛速率之间的根本权衡。为了获得额外的分析洞察，对于强凸二次问题，我们明确地将优化变量的稳态方差表示为目标函数Hessian矩阵特征值的函数。我们证明了Hessian的整个谱，而不仅仅是极值特征值，影响噪声算法的鲁棒性。我们将这一结果专门应用于无向网络上的分布式平均问题，并考察了网络大小和拓扑结构对噪声加速算法鲁棒性的影响。

英文摘要

We study the robustness of accelerated first-order algorithms to stochastic uncertainties in gradient evaluation. Specifically, for unconstrained, smooth, strongly convex optimization problems, we examine the mean-squared error in the optimization variable when the iterates are perturbed by additive white noise. This type of uncertainty may arise in situations where an approximation of the gradient is sought through measurements of a real system or in a distributed computation over a network. Even though the underlying dynamics of first-order algorithms for this class of problems are nonlinear, we establish upper bounds on the mean-squared deviation from the optimal solution that are tight up to constant factors. Our analysis quantifies fundamental trade-offs between noise amplification and convergence rates obtained via any acceleration scheme similar to Nesterov's or heavy-ball methods. To gain additional analytical insight, for strongly convex quadratic problems, we explicitly evaluate the steady-state variance of the optimization variable in terms of the eigenvalues of the Hessian of the objective function. We demonstrate that the entire spectrum of the Hessian, rather than just the extreme eigenvalues, influences robustness of noisy algorithms. We specialize this result to the problem of distributed averaging over undirected networks and examine the role of network size and topology on the robustness of noisy accelerated algorithms.
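
The dependence on the whole spectrum can be seen in the simplest case, which is worked out below as a sketch and not as the paper's result. It covers only plain (non-accelerated) gradient descent on a quadratic, under the assumed per-eigenvalue error model e⁺ = (1 - αλ)e + σw with unit-variance white noise w; the variance fixed point v = (1 - αλ)²v + σ² then gives v = σ² / (αλ(2 - αλ)). The function name and the two spectra are hypothetical.

```python
import numpy as np

def gd_steady_state_variance(eigs, alpha, sigma=1.0):
    """Sum of per-mode steady-state variances for noisy gradient descent.

    Assumed model for each Hessian eigenvalue lam:
        e_next = (1 - alpha*lam) * e + sigma * w,   w white with unit variance.
    """
    eigs = np.asarray(eigs, dtype=float)
    rho = 1.0 - alpha * eigs                     # per-mode contraction factor
    if np.any(np.abs(rho) >= 1.0):
        raise ValueError("need 0 < alpha*lam < 2 for every eigenvalue")
    return float(np.sum(sigma**2 / (1.0 - rho**2)))

# Two hypothetical spectra sharing the same extreme eigenvalues (1 and 100).
clustered_low = [1.0, 1.5, 2.0, 100.0]
clustered_high = [1.0, 80.0, 90.0, 100.0]
alpha = 1.0 / 100.0
v_low = gd_steady_state_variance(clustered_low, alpha)
v_high = gd_steady_state_variance(clustered_high, alpha)
```

Both spectra have the same condition number, yet the one with interior eigenvalues clustered near the smallest eigenvalue accumulates more noise, which is the kind of effect that a bound based only on the extreme eigenvalues would miss.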

1809.07180 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Projective Splitting with Forward Steps only Requires Continuity

带前向步的投影分裂仅需连续性

Patrick R. Johnstone, Jonathan Eckstein

AI总结本文研究了投影分裂算法中仅需连续性即可实现收敛的问题，核心方法是通过有限维空间中的反向追踪线搜索来实现收敛，主要贡献是证明了在连续条件下无需Lipschitz连续性假设。

Comments 15 pages. arXiv admin note: text overlap with arXiv:1803.07043

1602.02726 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Local and Global Convergence of a General Inertial Proximal Splitting Scheme

一般惯性近端分裂方案的局部与全局收敛性

Patrick R. Johnstone, Pierre Moulin

发表机构 * Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA（协调科学实验室，伊利诺伊大学，厄巴纳，伊利诺伊州，61801，美国）

AI总结本文研究了希尔伯特空间中的凸复合最优化问题，提出了一种通用的惯性近端分裂算法（GIPSA），并证明了其迭代序列的平方增量和累积点的最优性，以及在最小值存在时的弱收敛性。进一步分析了ℓ1正则化优化问题，展示了GIPSA在特定参数选择下的局部收敛性和局部线性收敛性，以及其在FISTA变体中的应用。

Comments 33 pages 1 figure

详情

DOI: 10.1007/s10589-017-9896-7
Journal ref: Comput Optim Appl 67, 259-292 (2017)

AI中文摘要

本文关注希尔伯特空间中的凸复合最优化问题。在这些问题中，目标函数是两个闭的、正常的（proper）凸函数的和，其中一个是光滑的，另一个具有计算成本较低的近端算子。我们分析了一族通用的惯性近端分裂算法（GIPSA）以解决此类问题。我们建立了迭代序列平方增量之和的有限性和聚点的最优性。如果最小值被达到，则整个序列的弱收敛性随之成立。我们的分析统一并扩展了之前的一些结果。然后我们专注于ℓ1正则化优化，这是最常见的特殊情况，其中非光滑项是ℓ1范数。对于某些参数选择，GIPSA适用于此问题的局部分析。对于这些选择，我们证明GIPSA实现有限步的“主动流形识别”，即在有限次迭代内收敛到最优支撑集和符号，之后GIPSA退化为最小化局部光滑函数。在某些条件下，局部线性收敛性成立。我们以惯性、步长和局部曲率来确定收敛率。我们的局部分析适用于某些最近的快速迭代收缩阈值算法（FISTA）变体，我们在此类FISTA变体中建立了主动流形识别和局部线性收敛性。我们的分析促使在这些FISTA变体中使用动量重启方案以获得最优的局部线性收敛率。

英文摘要

This paper is concerned with convex composite minimization problems in a Hilbert space. In these problems, the objective is the sum of two closed, proper, and convex functions where one is smooth and the other admits a computationally inexpensive proximal operator. We analyze a general family of inertial proximal splitting algorithms (GIPSA) for solving such problems. We establish finiteness of the sum of squared increments of the iterates and optimality of the accumulation points. Weak convergence of the entire sequence then follows if the minimum is attained. Our analysis unifies and extends several previous results. We then focus on $\ell_1$-regularized optimization, which is the ubiquitous special case where the nonsmooth term is the $\ell_1$-norm. For certain parameter choices, GIPSA is amenable to a local analysis for this problem. For these choices we show that GIPSA achieves finite "active manifold identification", i.e. convergence in a finite number of iterations to the optimal support and sign, after which GIPSA reduces to minimizing a local smooth function. Local linear convergence then holds under certain conditions. We determine the rate in terms of the inertia, stepsize, and local curvature. Our local analysis is applicable to certain recent variants of the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), for which we establish active manifold identification and local linear convergence. Our analysis motivates the use of a momentum restart scheme in these FISTA variants to obtain the optimal local linear convergence rate.
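
For the ℓ1-regularized case, one simple inertial proximal iteration can be sketched as follows. This is an illustrative instance and not the general GIPSA family: it assumes a least-squares smooth term, a constant inertia parameter `beta`, step size 1/L, and the same extrapolated point for both the gradient and the proximal step. The proximal operator of the ℓ1-norm is soft thresholding.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def inertial_prox_grad(A, b, lam, beta=0.5, iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with constant inertia beta."""
    L = np.linalg.norm(A, 2) ** 2            # Lipschitz constant of the smooth gradient
    step = 1.0 / L
    x_prev = x = np.zeros(A.shape[1])
    for _ in range(iters):
        y = x + beta * (x - x_prev)          # inertial extrapolation
        grad = A.T @ (A @ y - b)             # gradient of the smooth term at y
        x_prev, x = x, soft_threshold(y - step * grad, lam * step)
    return x

# Hypothetical sparse recovery problem with three nonzero coefficients.
rng = np.random.default_rng(0)
A = rng.standard_normal((30, 10))
x_true = np.zeros(10)
x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true
x_hat = inertial_prox_grad(A, b, lam=0.1)
```

Once the zero pattern and signs of the iterates stop changing, the thresholding acts as a fixed shift on the active coordinates, which is the regime the abstract describes as minimizing a local smooth function.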

1811.06838 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

The Trace Criterion for Kernel Bandwidth Selection for Support Vector Data Description

核带宽选择的迹准则用于支持向量数据描述

Arin Chaudhuri, Carol Sadek, Deovrat Kakde, Wenhao Hu, Hansi Jiang, Seunghyun Kong, Yuewei Liao, Sergiy Peredriy, Haoyu Wang

发表机构 * Internet of Things, SAS Institute Inc., Cary, NC, 27513（物联网，SAS公司，北卡罗来纳州卡里，27513）

AI总结本文提出了一种新的无监督方法，用于选择支持向量数据描述（SVDD）中高斯核的带宽，通过利用核矩阵的低秩表示来建议带宽值，该方法在低维数据中与当前最佳方法竞争，并在许多高维数据类别中表现极佳。

Comments note: some text overlap with arXiv:1708.05106 because common background material is covered in both papers

详情

AI中文摘要

支持向量数据描述（SVDD）是一种流行的异常检测技术。SVDD分类器将整个数据空间划分为内群区域和外群区域。计算SVDD分类器需要一个核函数，高斯核是一个常见选择。高斯核有一个带宽参数，正确设置该参数对获得良好结果至关重要。小带宽会导致过拟合，使得SVDD分类器高估异常数量，而大带宽会导致欠拟合，无法检测许多异常。本文提出了一种新的无监督方法，用于选择高斯核的带宽。我们的方法利用核矩阵的低秩表示来建议带宽值。我们的新方法在低维数据中与当前最佳方法竞争，并在许多高维数据类别中表现极佳。由于当使用高斯核时，SVDD的数学公式与单类支持向量机（OCSVM）的数学公式相同，因此我们的方法同样适用于OCSVM的高斯核带宽调整。

英文摘要

Support vector data description (SVDD) is a popular anomaly detection technique. The SVDD classifier partitions the whole data space into an inlier region, which consists of the region near the training data, and an outlier region, which consists of points away from the training data. The computation of the SVDD classifier requires a kernel function, for which the Gaussian kernel is a common choice. The Gaussian kernel has a bandwidth parameter, and it is important to set the value of this parameter correctly for good results. A small bandwidth leads to overfitting such that the resulting SVDD classifier overestimates the number of anomalies, whereas a large bandwidth leads to underfitting and an inability to detect many anomalies. In this paper, we present a new unsupervised method for selecting the Gaussian kernel bandwidth. Our method exploits a low-rank representation of the kernel matrix to suggest a kernel bandwidth value. Our new technique is competitive with the current state of the art for low-dimensional data and performs extremely well for many classes of high-dimensional data. Because the mathematical formulation of SVDD is identical with the mathematical formulation of one-class support vector machines (OCSVM) when the Gaussian kernel is used, our method is equally applicable to Gaussian kernel bandwidth tuning for OCSVM.

1904.13317 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

A data-efficient geometrically inspired polynomial kernel for robot inverse dynamics

一种数据高效且受几何启发的多项式核用于机器人逆动力学

Alberto Dalla Libera, Ruggero Carli

AI总结本文提出了一种基于高斯过程回归的数据驱动逆动力学估计器，引入了几何启发多项式核（GIP），该核在合适输入空间上将逆动力学描述为多项式函数，并证明其定义了有限维的再生核希尔伯特空间，包含刚体动力学计算的逆动力学函数，实验表明该方法在数据效率和泛化能力上优于其他数据驱动方法，同时相比模型驱动方法需要更少的先验信息且不受模型偏差影响。

详情

DOI: 10.1109/LRA.2019.2945240
Journal ref: IEEE Robotics and Automation Letters, vol. 5, no. 1, pp. 24-31, Jan. 2020

AI中文摘要

在本文中，我们介绍了一种基于高斯过程回归的新数据驱动逆动力学估计器。受逆动力学可以描述为合适输入空间上的多项式函数的启发，我们提出了一个名为几何启发多项式核（GIP）的新核。所得到的估计器在数据效率方面与基于模型的方法相似。事实上，我们证明了GIP核定义了一个有限维的再生核希尔伯特空间，该空间包含通过刚体动力学计算的逆动力学函数。所提出的核基于最近引入的乘法多项式核，这是经典多项式核的重新定义，配备了允许更高正则化的参数集。我们已在模拟环境和UR10机器人的真实实验中测试了所提出的方法。获得的结果证实，与其它数据驱动估计器相比，所提出的方法在数据效率和泛化能力上更优。相反，与基于模型的估计器相比，我们的方法需要更少的先验信息且不受模型偏差影响。

英文摘要

In this paper, we introduce a novel data-driven inverse dynamics estimator based on Gaussian Process Regression. Driven by the fact that the inverse dynamics can be described as a polynomial function on a suitable input space, we propose the use of a novel kernel, called Geometrically Inspired Polynomial Kernel (GIP). The resulting estimator behaves similarly to model-based approaches as concerns data efficiency. Indeed, we proved that the GIP kernel defines a finite-dimensional Reproducing Kernel Hilbert Space that contains the inverse dynamics function computed through the Rigid Body Dynamics. The proposed kernel is based on the recently introduced Multiplicative Polynomial Kernel, a redefinition of the classical polynomial kernel equipped with a set of parameters that allows for a higher regularization. We tested the proposed approach in a simulated environment, and also in real experiments with a UR10 robot. The obtained results confirm that, compared to other data-driven estimators, the proposed approach is more data-efficient and exhibits better generalization properties. Instead, with respect to model-based estimators, our approach requires less prior information and is not affected by model bias.

1905.11266 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods

一法统御诸法：数据、参数及许多新方法的方差缩减

Filip Hanzely, Peter Richtárik

发表机构 * King Abdullah University of Science and Technology（阿卜杜拉国王科技大学）

AI总结本文提出了一种通用的方差缩减方法，适用于解决具有大量训练样例或大模型维度的正则化经验风险最小化问题。该方法在特殊情况下可以退化为多种已知且以前被认为无关的方法，如SAGA、LSVRG、JacSketch、SEGA和ISEGA及其任意采样和近端泛化。同时，本文还提出了许多具有有趣性质的新具体算法，并提供了一个单一定理，证明在光滑性和拟强凸性假设下方法的线性收敛性。

Comments 61 pages, 6 figures, 3 tables

1807.01739 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY 版本更新

Proximal algorithms for large-scale statistical modeling and sensor/actuator selection

大规模统计建模和传感器/执行器选择的近端算法

Armin Zare, Hesameddin Mohammadi, Neil K. Dhingra, Tryphon T. Georgiou, Mihailo R. Jovanović

发表机构 * Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California（南加州大学明希赫电子与计算机工程系）； Numerica Corporation（Numerica公司）

AI总结本文提出了一种统一的近端算法框架，用于解决大规模系统建模与控制中的正则化半定规划问题，通过近端方法实现了对统计建模和传感器/执行器选择的高效处理，展示了算法的线性收敛性和有效性。

Comments To appear in IEEE Trans. Automat. Control

详情

DOI: 10.1109/TAC.2019.2948268

AI中文摘要

若干在随机驱动动态系统建模与控制中的问题可以被表述为正则化半定规划。我们考察了两个具有代表性的此类问题，并展示了它们可以以类似的方式进行表述。第一个问题在统计建模中寻求通过适当且最小的扰动来协调观测统计数据。第二个问题则旨在为控制目的最优选择可用的传感器和执行器子集。为了应对大规模系统的建模与控制，我们开发了一种统一的算法框架，利用近端方法。我们的定制算法利用问题结构，使得能够处理统计建模以及传感器和执行器选择，比当前通用求解器可以处理的规模大得多。我们建立了近端梯度算法的线性收敛性，对比了所提出的近端算法与交替方向乘子法，并提供了示例以说明我们框架的优势和有效性。

英文摘要

Several problems in modeling and control of stochastically-driven dynamical systems can be cast as regularized semi-definite programs. We examine two such representative problems and show that they can be formulated in a similar manner. The first, in statistical modeling, seeks to reconcile observed statistics by suitably and minimally perturbing prior dynamics. The second seeks to optimally select a subset of available sensors and actuators for control purposes. To address modeling and control of large-scale systems we develop a unified algorithmic framework using proximal methods. Our customized algorithms exploit problem structure and allow handling statistical modeling, as well as sensor and actuator selection, for substantially larger scales than what is amenable to current general-purpose solvers. We establish linear convergence of the proximal gradient algorithm, draw contrast between the proposed proximal algorithms and alternating direction method of multipliers, and provide examples that illustrate the merits and effectiveness of our framework.

1905.08314 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning

纵向动态模型与运动学模型在使用深度强化学习的汽车跟随控制中的比较

Yuan Lin, John McPhee, Nasser L. Azad

发表机构 * University of Waterloo, Ontario, Canada（加拿大安大略省滑铁卢大学）

AI总结本文研究了在考虑车辆动力学的情况下，使用深度强化学习的纵向汽车跟随控制问题，通过引入延迟的控制输入和实际车辆加速度到强化学习环境状态中，改进了DRL框架，从而在考虑车辆动力学时实现了接近最优的控制性能。

Comments Accepted to 2019 IEEE Intelligent Transportation Systems Conference

详情

DOI: 10.1109/ITSC.2019.8916781

AI中文摘要

目前大多数关于通过深度强化学习（DRL）实现自动驾驶车辆控制的研究都使用点质量运动学模型，忽略了车辆动力学，包括加速度延迟和加速度命令动力学。加速度延迟源于传感和执行延迟，导致控制输入执行延迟。加速度命令动力学决定了实际车辆加速度不会立即达到期望的命令加速度，因为存在动力学限制。在本工作中，我们研究了将使用车辆运动学模型训练的DRL控制器应用于更现实的驾驶控制中的可行性。我们考虑了一个特定的纵向汽车跟随控制问题，即自适应巡航控制系统（ACC），该问题通过使用点质量运动学模型的DRL解决。当此类控制器应用于具有车辆动力学的汽车跟随时，我们观察到显著退化的汽车跟随性能。因此，我们重新设计DRL框架，通过将延迟的控制输入和实际车辆加速度分别添加到强化学习环境状态中，以适应加速度延迟和加速度命令动力学。训练结果表明，改进后的DRL控制器在考虑车辆动力学时的汽车跟随控制性能接近最优，与动态规划解决方案相比。

英文摘要

The majority of current studies on autonomous vehicle control via deep reinforcement learning (DRL) utilize point-mass kinematic models, neglecting vehicle dynamics which includes acceleration delay and acceleration command dynamics. The acceleration delay, which results from sensing and actuation delays, results in delayed execution of the control inputs. The acceleration command dynamics dictates that the actual vehicle acceleration does not rise up to the desired command acceleration instantaneously due to dynamics. In this work, we investigate the feasibility of applying DRL controllers trained using vehicle kinematic models to more realistic driving control with vehicle dynamics. We consider a particular longitudinal car-following control, i.e., Adaptive Cruise Control (ACC), problem solved via DRL using a point-mass kinematic model. When such a controller is applied to car following with vehicle dynamics, we observe significantly degraded car-following performance. Therefore, we redesign the DRL framework to accommodate the acceleration delay and acceleration command dynamics by adding the delayed control inputs and the actual vehicle acceleration to the reinforcement learning environment state, respectively. The training results show that the redesigned DRL controller results in near-optimal control performance of car following with vehicle dynamics considered when compared with dynamic programming solutions.

1803.00204 2026-06-04 cs.LG cs.AI cs.NA math.NA stat.ML 版本更新

Scalar Quantization as Sparse Least Square Optimization

标量量化作为稀疏最小二乘优化

Chen Wang, Xiaomei Yang, Shaomin Fei, Kai Zhou, Xiaofeng Gong, Miao Du, Ruisen Luo

发表机构 * College of Electrical Engineering, Sichuan University（四川大学电气工程学院）； Department of Computer Science, Rutgers University -- New Brunswick（罗格斯大学新不伦瑞克分校计算机科学系）； Engineering Practice Center, Chengdu University of Information Technology（成都信息工程大学工程实践中心）

AI总结本文提出了一种基于稀疏最小二乘优化的新方法，用于解决标量量化中的问题，通过引入l1、l1+l2和l0正则化，改进了传统聚类方法的不足，提升了在位宽缩减场景下的性能。

详情

DOI: 10.1109/TPAMI.2019.2952096
Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019

AI中文摘要

量化可以用来形成具有共享值的新向量/矩阵，其值接近原始数据。近年来，标量量化在值共享应用中的普及度迅速上升，因为它在减少神经网络复杂度方面具有巨大实用性。现有的基于聚类的量化技术虽然发展成熟，但存在多个缺点，包括对随机种子的依赖性、空集群或超出范围的集群，以及大量集群时的时间复杂度高。为克服这些问题，本文从新的视角研究标量量化问题，即稀疏最小二乘优化。具体来说，受稀疏最小二乘回归性质的启发，提出了几种基于l1最小二乘的量化算法。此外，还提出了类似的方案，具有l1 + l2和l0正则化。此外，为了计算给定数量的值/集群的量化结果，本文设计了一种迭代方法和一种基于聚类的方法，并且两者都建立在稀疏最小二乘之上。本文表明，后者方法在数学上等价于改进版的k-means聚类基量化算法，尽管两种算法起源于不同的直觉。所提出的算法在三种类型的数据上进行了测试，比较和分析了其计算性能，包括信息损失、时间消耗以及稀疏向量值的分布。本文为量化领域提供了新的视角，所提出的算法在某些位宽缩减场景下表现优异，当所需的量化后分辨率（值的数量）不显著低于原始数量时尤其如此。

英文摘要

Quantization can be used to form new vectors/matrices with shared values close to the original. In recent years, scalar quantization for value-sharing applications has become increasingly popular because it has proven highly useful for reducing the complexity of neural networks. Existing clustering-based quantization techniques, while well developed, have multiple drawbacks, including dependence on the random seed, empty or out-of-range clusters, and high time complexity for a large number of clusters. To overcome these problems, this paper examines scalar quantization from a new perspective, namely sparse least square optimization. Specifically, inspired by the properties of sparse least square regression, several quantization algorithms based on $l_1$ least square are proposed. In addition, similar schemes with $l_1 + l_2$ and $l_0$ regularization are proposed. Furthermore, to compute quantization results with a given number of values/clusters, this paper designs an iterative method and a clustering-based method, both built on sparse least square. The paper shows that the latter method is mathematically equivalent to an improved version of the k-means clustering-based quantization algorithm, although the two algorithms originated from different intuitions. The proposed algorithms were tested on three types of data, and their computational performance, including information loss, time consumption, and the distribution of the values of the sparse vectors, was compared and analyzed. The paper offers a new perspective on quantization, and the proposed algorithms can outperform existing methods, especially in bit-width reduction scenarios where the required post-quantization resolution (number of values) is not significantly lower than the original number.
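
One way to read "quantization as sparse least squares" is sketched below. The sorted values are written as a cumulative sum of increments, and an $l_1$ penalty on the increments pushes many of them to zero, so neighbouring values collapse onto a shared level. The specific formulation, the ISTA solver, and the names `l1_quantize` and `lam` are assumptions made for illustration and may differ from the algorithms in the paper.

```python
import numpy as np


def l1_quantize(values, lam, n_iter=5000):
    """Quantize a 1-D array by solving min_a 0.5*||w - V a||^2 + lam*||a||_1 with ISTA.

    w holds the sorted input values and V is the lower-triangular matrix of ones,
    so V @ a is a cumulative sum: every zero entry of a (after the first) merges
    two neighbouring values into one shared level. Illustrative formulation only.
    """
    values = np.asarray(values, dtype=float)
    order = np.argsort(values)
    w = values[order]
    n = w.size
    V = np.tril(np.ones((n, n)))
    step = 1.0 / np.linalg.norm(V, 2) ** 2      # 1 / Lipschitz constant of the gradient
    a = np.zeros(n)
    for _ in range(n_iter):
        z = a - step * (V.T @ (V @ a - w))      # gradient step on the least-squares term
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # soft-thresholding
    quantized = np.empty(n)
    quantized[order] = V @ a                    # undo the sort
    return quantized
```

With `lam=0.0` the input is reproduced, and increasing `lam` trades reconstruction error for fewer distinct levels. Note that this simple version also shrinks the first level toward zero, which a practical variant would probably avoid.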


1904.08831 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs

基于神经注意力的深度学习架构用于车道图上的交通动态建模

Matthew A. Wright, Simon F. G. Ehlers, Roberto Horowitz

AI总结本文提出了一种基于神经注意力的深度学习架构，用于建模车道图上的交通动态，通过显式编码车道间的关系类型来提高预测性能，并展示了该模型在复杂道路网络中的迁移能力。

Comments To appear at 2019 IEEE Conference on Intelligent Transportation Systems

详情

DOI: 10.1109/ITSC.2019.8917174

AI中文摘要

深度神经网络可以成为强大的工具，但需要特定应用的精心设计以确保数据中最相关信息可被学习。在本文中，我们将深度神经网络应用于车辆交通动态的非线性时空物理问题。我们考虑在车道层面估计宏观量（例如交叉口的排队长度）的问题。由于建模如车道变更等社会行为的复杂性以及这些行为的宏观尺度影响，车道尺度的第一原理建模一直是一个挑战。遵循领域知识，上游/下游车道和邻近车道以不同的方式影响彼此的交通流量，我们应用了一种神经注意力机制，使神经网络层能够以不同的方式聚合来自不同车道的信息。使用微观交通模拟器作为测试平台，我们获得了结果，表明注意力神经网络模型可以利用附近车道的信息来提高预测效果，并且显式编码车道间的关系类型显著提高了性能。我们还展示了所学神经网络在更复杂道路网络中的迁移能力，讨论了其性能退化可能归因于拓扑复杂性增加所引起的新交通行为，并激励从多种道路网络拓扑中学习动态模型。

英文摘要

Deep neural networks can be powerful tools, but require careful application-specific design to ensure that the most informative relationships in the data are learnable. In this paper, we apply deep neural networks to the nonlinear spatiotemporal physics problem of vehicle traffic dynamics. We consider problems of estimating macroscopic quantities (e.g., the queue at an intersection) at a lane level. First-principles modeling at the lane scale has been a challenge due to complexities in modeling social behaviors like lane changes, and those behaviors' resultant macro-scale effects. Following the domain knowledge that upstream/downstream lanes and neighboring lanes affect each other's traffic flows in distinct ways, we apply a form of neural attention that allows the neural network layers to aggregate information from different lanes in different manners. Using a microscopic traffic simulator as a testbed, we obtain results showing that an attentional neural network model can use information from nearby lanes to improve predictions, and that explicitly encoding the lane-to-lane relationship types significantly improves performance. We also demonstrate the transfer of our learned neural network to a more complex road network, discuss how its performance degradation may be attributable to new traffic behaviors induced by increased topological complexity, and motivate learning dynamics models from many road network topologies.


1905.06518 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Efficient hinging hyperplanes neural network and its application in nonlinear system identification

高效铰接超平面神经网络及其在非线性系统辨识中的应用

Jun Xu, Qinghua Tao, Zhen Li, Xiangming Xi, Johan A. K. Suykens, Shuning Wang

发表机构 * School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, 518055, China（哈尔滨工业大学（深圳）机电工程与自动化学院，中国，518055）； BNRist, Department of Automation, Tsinghua University, Beijing, 100084, China（BNRist，清华大学自动化系，北京，中国，100084）

AI总结本文提出了一种高效的铰接超平面（EHH）神经网络，该网络通过求解多个凸优化问题进行训练，具有快速的训练速度。研究证明，每个EHH神经网络等价于一个自适应铰接超平面（AHH）树，并在系统辨识中表现出良好的应用效果。EHH神经网络具有可解释性，可通过ANOVA分解或交互矩阵获得，可作为输入变量选择的建议。

Comments submitted to Automatica

详情

AI中文摘要

本文提出了一种基于铰接超平面（HH）模型的高效铰接超平面（EHH）神经网络。EHH神经网络是一种分布式表示，其训练涉及求解多个凸优化问题，并且训练速度快。证明了对于每一个EHH神经网络，都存在一个等价的自适应铰接超平面（AHH）树，该AHH树也是基于HH模型提出的，并在系统辨识中找到了良好的应用。EHH神经网络的构建包括两个阶段。首先，EHH神经网络的初始结构是随机确定的，使用Lasso回归选择合适的网络。为了减轻随机性的影响，第二阶段采用堆叠策略来形成更一般的网络结构。与其他神经网络不同，EHH神经网络具有可解释性，可以通过其ANOVA分解（或交互矩阵）轻松获得。这种可解释性可以用于输入变量选择的建议。EHH神经网络应用于非线性系统辨识，仿真结果表明所选回归向量合理，识别速度较快，同时仿真精度也令人满意。

英文摘要

In this paper, the efficient hinging hyperplanes (EHH) neural network is proposed based on the model of hinging hyperplanes (HH). The EHH neural network is a distributed representation whose training involves solving several convex optimization problems and is fast. It is proved that for every EHH neural network, there is an equivalent adaptive hinging hyperplanes (AHH) tree, which was also proposed based on the HH model and has found good applications in system identification. The construction of the EHH neural network includes two stages. First, the initial structure of the EHH neural network is randomly determined and Lasso regression is used to choose the appropriate network. Second, to alleviate the impact of randomness, a stacking strategy is employed to formulate a more general network structure. Unlike other neural networks, the EHH neural network is interpretable, and the interpretation can be easily obtained through its ANOVA decomposition (or interaction matrix). The interpretability can then be used as a suggestion for input variable selection. The EHH neural network is applied to nonlinear system identification; the simulation results show that the selected regression vector is reasonable and the identification is fast, while the simulation accuracy is satisfactory.


1905.09435 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling

MATCHA: 通过匹配分解采样加速去中心化SGD

Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, Soummya Kar

发表机构 * Carnegie Mellon University（卡内基梅隆大学）； Bosch Center for Artificial Intelligence（博世人工智能中心）

AI总结该研究提出MATCHA算法，通过匹配分解采样在去中心化SGD中实现误差与运行时间的双赢，验证了其在各种数据集和深度神经网络上的有效性，证明其比传统去中心化SGD快5倍。

详情

AI中文摘要

本文研究了在基于随机梯度下降（SGD）的去中心化训练中常见的误差-运行时间权衡问题。尽管更密集（稀疏）的网络拓扑会使误差按迭代次数收敛得更快（更慢），但每次迭代的通信时间/延迟也会更多（更少）。本文提出MATCHA算法，能够在任意网络拓扑中实现误差-运行时间的双赢。MATCHA的主要思想是通过将拓扑分解为匹配来并行化节点间通信。为了保持快速的误差收敛速度，它识别关键链接并更频繁地通过这些链接通信，同时通过较少使用其他链接来节省通信时间。在一系列数据集和深度神经网络上的实验验证了理论分析，并证明MATCHA在达到相同训练损失时所需时间最多可比传统去中心化SGD少5倍。

英文摘要

This paper studies the error-runtime trade-off typically encountered in decentralized training based on stochastic gradient descent (SGD) over a given network. While a denser (sparser) network topology results in faster (slower) error convergence in terms of iterations, it incurs more (less) communication time/delay per iteration. In this paper, we propose MATCHA, an algorithm that can achieve a win-win in this error-runtime trade-off for any network topology. The main idea of MATCHA is to parallelize inter-node communication by decomposing the topology into matchings. To preserve fast error convergence, it identifies critical links and communicates over them more frequently, and saves communication time by using other links less frequently. Experiments on a suite of datasets and deep neural networks validate the theoretical analyses and demonstrate that MATCHA takes up to $5\times$ less time than vanilla decentralized SGD to reach the same training loss.
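
The decomposition step can be sketched as follows. A matching is a set of edges that share no node, so all links in one matching can communicate in parallel. The greedy rule used here is an assumption chosen for brevity; the abstract does not say which decomposition algorithm MATCHA uses, and the function name `decompose_into_matchings` is hypothetical.

```python
def decompose_into_matchings(edges):
    """Greedily split an undirected edge list into matchings (sets of vertex-disjoint edges)."""
    matchings = []  # each entry: (list of edges, set of vertices already used)
    for u, v in edges:
        for matched_edges, used in matchings:
            if u not in used and v not in used:
                matched_edges.append((u, v))
                used.update((u, v))
                break
        else:
            # No existing matching can take this edge, so open a new one.
            matchings.append(([(u, v)], {u, v}))
    return [matched_edges for matched_edges, _ in matchings]


# A 4-node ring splits into two matchings of two parallel links each.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
matchings = decompose_into_matchings(ring)
```

According to the abstract, the matchings would then be activated with different frequencies, more often for those containing critical links; that sampling step is not shown here.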


1905.07960 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

A novel Multiplicative Polynomial Kernel for Volterra series identification

一种新型的乘积多项式核用于Volterra级数识别

Alberto Dalla Libera, Ruggero Carli, Gianluigi Pillonetto

发表机构 * Department of Information Engineering, University of Padova（信息工程系，帕多瓦大学）

AI总结本文提出了一种新的正则化网络用于Volterra模型的识别，通过引入由基本构建块乘积构成的新核，利用边际似然优化估计未知参数，实验表明该方法能更有效地选择影响系统输出的单项式，提升模型预测能力。

1804.10273 2026-06-04 math.OC cs.LG cs.NA math.FA math.NA 版本更新

A telescoping Bregmanian proximal gradient method without the global Lipschitz continuity assumption

一种无需全局Lipschitz连续性假设的 telescoping Bregmanian 近端梯度方法

Daniel Reem, Simeon Reich, Alvaro De Pierro

发表机构 * The Technion - Israel Institute of Technology（以色列理工学院）

AI总结本文提出了一种无需全局Lipschitz连续性假设的近端梯度方法变体，通过在约束集上进行 telescoping 分解，并利用Bregman散度来改进收敛性分析。

Comments Journal of Optimization Theory and Applications (JOTA): accepted for publication; very minor modifications; this version contains full proofs and alphabetically ordered list of references (in contrast with the journal version)

详情

DOI: 10.1007/s10957-019-01509-8
Journal ref: J. Optim. Theory. Appl. 182 (2019), 851--884

AI中文摘要

最小化两个凸函数之和的问题在理论和实际应用中都有广泛的应用。解决此类问题的一种流行方法是近端梯度方法（近端前向-后向算法）。在使用该方法时，一个常见的假设是光滑项的梯度具有全局Lipschitz连续性。然而，这种假设在实践中并不总是成立，从而限制了该方法的使用。在本文中，我们讨论了一种新的近端梯度方法变体，在广泛的有限和无限维空间中，该方法不假定上述全局Lipschitz连续性假设。该方法的关键贡献是迭代步长依赖于约束集的某种telescoping分解。此外，我们使用Bregman散度在近端前向-后向操作中。在某些实际条件下，建立了非渐近收敛率（即函数值的收敛率），以及整个序列弱收敛到极小值点。我们还获得了一些具有独立兴趣的辅助结果。

英文摘要

The problem of minimization of the sum of two convex functions has various theoretical and real-world applications. One of the popular methods for solving this problem is the proximal gradient method (proximal forward-backward algorithm). A very common assumption in the use of this method is that the gradient of the smooth term is globally Lipschitz continuous. However, this assumption is not always satisfied in practice, thus casting a limitation on the method. In this paper, we discuss, in a wide class of finite and infinite-dimensional spaces, a new variant of the proximal gradient method which does not impose the above-mentioned global Lipschitz continuity assumption. A key contribution of the method is the dependence of the iterative steps on a certain telescopic decomposition of the constraint set into subsets. Moreover, we use a Bregman divergence in the proximal forward-backward operation. Under certain practical conditions, a non-asymptotic rate of convergence (that is, in the function values) is established, as well as the weak convergence of the whole sequence to a minimizer. We also obtain a few auxiliary results of independent interest.
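
For orientation, the classical Euclidean proximal gradient (forward-backward) iteration that the paper generalizes can be sketched as below. This baseline uses a constant step size, which is exactly where the global Lipschitz assumption enters; the Bregman divergence and the telescopic decomposition of the constraint set described in the abstract are not reproduced. The function names and the toy problem are illustrative assumptions.

```python
import numpy as np


def soft_threshold(x, t):
    """Proximal operator of t * ||x||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)


def proximal_gradient(grad_f, prox_g, x0, step, n_iter=200):
    """Minimize f(x) + g(x): forward (gradient) step on f, backward (proximal) step on g."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = prox_g(x - step * grad_f(x), step)
    return x


# Toy problem: minimize 0.5 * (x - 3)^2 + |x|, whose minimizer is x = 2.
x_star = proximal_gradient(lambda x: x - 3.0, soft_threshold, x0=[0.0], step=0.5)
```

Here the gradient of the smooth term is 1-Lipschitz, so any constant step in (0, 2) converges. When no global constant exists, a fixed `step` like this cannot be chosen in advance, which is the limitation the paper addresses.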


1904.03537 2026-06-04 math.OC cs.CV cs.LG cs.NA math.NA 版本更新

Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization

凸凹回溯法用于非凸优化中的惯性Bregman近似梯度算法

Mahesh Chandra Mukkamala, Peter Ochs, Thomas Pock, Shoham Sabach

发表机构 * Faculty of Mathematics and Computer Science, Saarland University（萨尔兰大学数学与计算机科学学院）； Institute of Computer Graphics and Vision, Graz University of Technology（格拉茨技术大学计算机图形与视觉研究所）； Faculty of Industrial Engineering, The Technion（以色列理工学院工业工程学院）

AI总结本文提出了一种凸凹回溯方法，用于非凸优化中的惯性Bregman近似梯度算法，通过寻找目标函数的凸上界和凹下界，实现步长和外推参数的自适应选择，并证明算法全局收敛到临界点。

Comments 29 pages

详情

AI中文摘要

回溯线搜索是一种古老而强大的策略，用于在近似梯度算法中寻找更好的步长。其主要原理是局部寻找目标函数的简单凸上界，从而控制使用的步长。在惯性近似梯度算法中，情况变得更加复杂，通常导致对外推参数的非常严格的限制。在本文中，我们展示通过局部寻找目标函数的简单凹下界，可以控制外推参数。这导致了一种双凸凹回溯过程，允许自适应地选择步长和外推参数。我们将此过程应用于惯性Bregman近似梯度方法的类别，并证明由这些算法生成的任何序列都全局收敛到函数的临界点。在图像处理和机器学习中的多个具有挑战性的非凸问题上的数值实验显示，结合惯性步和双回溯策略能够实现性能的提升。

英文摘要

Backtracking line-search is an old yet powerful strategy for finding better step sizes to be used in proximal gradient algorithms. The main principle is to locally find a simple convex upper bound of the objective function, which in turn controls the step size that is used. In the case of inertial proximal gradient algorithms, the situation becomes much more difficult and usually leads to very restrictive rules on the extrapolation parameter. In this paper, we show that the extrapolation parameter can be controlled by also locally finding a simple concave lower bound of the objective function. This gives rise to a double convex-concave backtracking procedure which allows for an adaptive choice of both the step size and extrapolation parameters. We apply this procedure to the class of inertial Bregman proximal gradient methods, and prove that any sequence generated by these algorithms converges globally to a critical point of the function at hand. Numerical experiments on a number of challenging non-convex problems in image processing and machine learning show the power of combining inertial steps and the double backtracking strategy in achieving improved performance.
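
The "convex upper bound" half of the idea can be sketched for a plain gradient step. The constant `L` is increased until the quadratic upper bound holds at the trial point, and the accepted step size is `1/L`. The concave lower bound that controls the extrapolation parameter, and the Bregman setting, are not reproduced here; the function name and default values are assumptions for illustration.

```python
import numpy as np


def backtracking_gradient_step(f, grad_f, x, L=1.0, growth=2.0, max_tries=50):
    """One gradient step with step size 1/L, where L is increased until
    f(y) <= f(x) + <g, y - x> + (L/2) * ||y - x||^2 holds at y = x - g / L."""
    g = grad_f(x)
    fx = f(x)
    for _ in range(max_tries):
        y = x - g / L
        d = y - x
        # Small tolerance guards against rounding when the bound is tight.
        if f(y) <= fx + g @ d + 0.5 * L * (d @ d) + 1e-12:
            return y, L
        L *= growth
    raise RuntimeError("no valid step size found")
```

For `f(x) = 2 * (x @ x)` the gradient is 4-Lipschitz, so starting from `L=1.0` the loop doubles `L` twice and accepts `L=4.0`.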


1810.00697 2026-06-04 eess.SY cs.AI cs.LG cs.SY 版本更新

Data-driven Discovery of Cyber-Physical Systems

数据驱动的信息物理系统发现

Ye Yuan, Xiuchuan Tang, Wei Pan, Xiuting Li, Wei Zhou, Hai-Tao Zhang, Han Ding, Jorge Goncalves

发表机构 * School of Automation, Huazhong University of Science and Technology（华中科技大学自动化学院）； State Key Lab of Digital Manufacturing Equipment and Technology（数字制造装备与技术国家重点实验室）； School of Mechanical Science and Engineering, Huazhong University of Science and Technology（华中科技大学机械科学与工程学院）； Department of Cognitive Robotics, Delft University of Technology（代尔夫特理工大学认知机器人系）； Department of Engineering, University of Cambridge（剑桥大学工程系）； Luxembourg Centre for Systems Biomedicine, University of Luxembourg（卢森堡系统生物医学中心，卢森堡大学）

AI总结本文提出了一种直接从数据对信息物理系统（CPS）进行逆向工程的通用框架，通过识别物理系统和推断转移逻辑，成功应用于机械、电气系统和医疗应用，为预测CPS轨迹、评估性能、设计容错系统和制定新系统设计指南提供了新方法。

详情

DOI: 10.1038/s41467-019-12490-1

AI中文摘要

信息物理系统（CPSs）将软件嵌入物理世界，广泛应用于智能电网、机器人、智能制造和医疗监测等领域。由于物理组件与信息组件的结合及其相互作用所带来的固有复杂性，CPSs一直难以建模。本文提出了一种直接从数据对CPSs进行逆向工程的通用框架。该方法包括识别物理系统以及推断转移逻辑。它已成功应用于从机械和电气系统到医疗应用的多个现实世界示例。该新颖的框架旨在使研究人员能够基于发现的模型预测CPS的轨迹。此类信息已被证明对于评估CPS性能、设计容错CPS以及为新CPS制定设计指南至关重要。

英文摘要

Cyber-physical systems (CPSs) embed software into the physical world. They appear in a wide range of applications such as smart grids, robotics, intelligent manufacturing and medical monitoring. CPSs have proved resistant to modeling due to their intrinsic complexity arising from the combination of physical and cyber components and the interaction between them. This study proposes a general framework for reverse engineering CPSs directly from data. The method involves the identification of physical systems as well as the inference of transition logic. It has been applied successfully to a number of real-world examples ranging from mechanical and electrical systems to medical applications. The novel framework seeks to enable researchers to make predictions concerning the trajectory of CPSs based on the discovered model. Such information has proven essential for assessing the performance of CPSs, designing failure-proof CPSs and creating design guidelines for new CPSs.


1811.07624 2026-06-04 math.NA cs.DS cs.LG cs.NA stat.ML 版本更新

Approximate Eigenvalue Decompositions of Linear Transformations with a Few Householder Reflectors

利用少量Householder反射子进行线性变换的近似本征值分解

Cristian Rusu

发表机构 * Istituto Italiano di Tecnologia（意大利技术研究院）

AI总结本文提出了一种利用少量Householder反射子构造高效正交矩阵的方法，用于近似正交或对称变换，并应用于快速Mahalanobis距离度量变换的学习。

详情

AI中文摘要

将信号分解为正交基（一组正交分量，每个分量归一化为单位长度）的能力，是许多信号处理方法和应用的核心。经典例子是傅里叶变换和小波变换，它们具有数值高效的实现（FFT和FWT）。不幸的是，正交变换通常结构不规则，因此通常不具有低计算复杂度的性质。在本文中，基于Householder反射子，我们引入了一类正交矩阵，这些矩阵在数值上易于操作：我们通过一个给定参数控制这些矩阵与向量的乘法复杂度。我们提供了数值算法，用于近似任何正交或对称变换，通过给定数量Householder反射子的乘积构造新的正交或对称结构。我们展示了分析和数值证据，以突出所提近似的准确性，并提供了一个应用于快速Mahalanobis距离度量变换学习的应用。

英文摘要

The ability to decompose a signal in an orthonormal basis (a set of orthogonal components, each normalized to have unit length) using a fast numerical procedure rests at the heart of many signal processing methods and applications. The classic examples are the Fourier and wavelet transforms that enjoy numerically efficient implementations (FFT and FWT, respectively). Unfortunately, orthonormal transformations are in general unstructured, and therefore they do not enjoy low computational complexity properties. In this paper, based on Householder reflectors, we introduce a class of orthonormal matrices that are numerically efficient to manipulate: we control the complexity of matrix-vector multiplications with these matrices using a given parameter. We provide numerical algorithms that approximate any orthonormal or symmetric transform with a new orthonormal or symmetric structure made up of products of a given number of Householder reflectors. We show analyses and numerical evidence to highlight the accuracy of the proposed approximations and provide an application to the case of learning fast Mahalanobis distance metric transformations.
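
The complexity claim can be made concrete with a short sketch. A product of `h` Householder reflectors is applied to a vector of length `n` in O(h·n) operations, without ever forming the dense n×n matrix. The function name `apply_reflectors` is hypothetical, and how the reflector vectors are chosen to approximate a given transform (the paper's actual contribution) is not shown.

```python
import numpy as np


def apply_reflectors(reflectors, x):
    """Compute (H_1 H_2 ... H_h) x with H_i = I - 2 u_i u_i^T / (u_i^T u_i).

    Each reflector costs O(n), so the total cost is O(h * n).
    """
    y = np.array(x, dtype=float)
    for u in reversed(reflectors):   # the rightmost reflector acts first
        y -= 2.0 * u * (u @ y) / (u @ u)
    return y
```

Because every reflector is orthogonal, the product preserves the Euclidean norm of `x`, which gives a quick sanity check. The number of reflectors `h` plays the role of the complexity parameter mentioned in the abstract.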


1812.04426 2026-06-04 cs.LG cs.NA math.NA physics.comp-ph stat.ML 版本更新

PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network

PDE-Net 2.0：基于数据学习PDE的数值-符号混合深度网络

Zichao Long, Yiping Lu, Bin Dong

AI总结本文提出PDE-Net 2.0，一种结合数值近似和符号计算的深度网络，用于从动态数据中学习偏微分方程，并具有较高的灵活性和表达能力。

Comments 16 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:1710.09668

详情

DOI: 10.1016/j.jcp.2019.108925

AI中文摘要

偏微分方程（PDEs）通常是基于经验观察推导得出的。然而，技术的进步使我们能够收集和存储大量数据，这为数据驱动的PDE发现提供了新机会。本文提出了一种新的深度神经网络，称为PDE-Net 2.0，用于从观测动态数据中发现（时间依赖的）PDE，仅需少量对驱动动态机制的先验知识。PDE-Net 2.0的设计基于我们先前的工作，其中提出了原始版本的PDE-Net。PDE-Net 2.0将用卷积对微分算子进行的数值近似与用于模型恢复的符号多层神经网络相结合。与现有方法相比，PDE-Net 2.0通过同时学习微分算子和PDE模型的非线性响应函数，具有最大的灵活性和表达能力。数值实验表明，PDE-Net 2.0有潜力揭示观测动态背后隐藏的PDE，并且即使在噪声环境中也能预测相对较长时间的动力学行为。

英文摘要

Partial differential equations (PDEs) are commonly derived based on empirical observations. However, recent advances in technology enable us to collect and store massive amounts of data, which offers new opportunities for data-driven discovery of PDEs. In this paper, we propose a new deep neural network, called PDE-Net 2.0, to discover (time-dependent) PDEs from observed dynamic data with minor prior knowledge of the underlying mechanism that drives the dynamics. The design of PDE-Net 2.0 is based on our earlier work, where the original version of PDE-Net was proposed. PDE-Net 2.0 combines numerical approximation of differential operators by convolutions with a symbolic multi-layer neural network for model recovery. Compared with existing approaches, PDE-Net 2.0 has the most flexibility and expressive power because it learns both the differential operators and the nonlinear response function of the underlying PDE model. Numerical experiments show that PDE-Net 2.0 has the potential to uncover the hidden PDE of the observed dynamics and to predict the dynamical behavior for a relatively long time, even in a noisy environment.
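
The phrase "numerical approximation of differential operators by convolutions" can be illustrated with a fixed finite-difference stencil. In the sketch below the filter is the standard central-difference kernel on a periodic 1-D grid. PDE-Net 2.0 learns its filters rather than fixing them, so this only shows the underlying correspondence; the function name `first_derivative` is an assumption.

```python
import numpy as np


def first_derivative(u, dx):
    """Approximate du/dx on a periodic grid by convolving with the central-difference stencil."""
    # np.convolve flips the kernel, so [1, 0, -1] yields (u[i+1] - u[i-1]) / (2*dx).
    kernel = np.array([1.0, 0.0, -1.0]) / (2.0 * dx)
    padded = np.concatenate([u[-1:], u, u[:1]])   # periodic padding by one cell on each side
    return np.convolve(padded, kernel, mode="valid")


x = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
du = first_derivative(np.sin(x), x[1] - x[0])     # should be close to cos(x)
```

The stencil is second-order accurate, so the error shrinks quadratically as the grid is refined.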


1905.07875 2026-06-04 eess.SY cs.LG cs.NA cs.SY math.NA 版本更新

Investigating Flight Envelope Variation Predictability of Impaired Aircraft using Least-Squares Regression Analysis

利用最小二乘回归分析研究受损飞机飞行包线变化的可预测性

Ramin Norouzi, Amirreza Kosari, Mohammad Hossein Sabour

发表机构 * University of Tehran（德黑兰大学）

AI总结本文通过线性和非线性最小二乘估计方法，研究了受损飞机飞行包线内配平点数量及其质心的可预测性，并开发和比较了多种多项式模型和基于双曲正切函数的非线性模型，以预测不同故障程度下的飞行包线变化。

Comments Accepted version, Journal of Aerospace Information Systems

详情

DOI: 10.2514/1.I010760

AI中文摘要

飞机故障会改变飞机的动态特性并导致机动飞行包线发生变化。此类包线变化是非线性的，通常无法被飞行员预测，因为它们受飞机复杂动态的支配。因此，为了防止飞行中失去控制，必须能够实际预测任何事先未知故障程度下受损飞机的飞行包线变化。本文通过线性和非线性最小二乘估计方法，研究了机动飞行包线内配平点数量及其质心的可预测性。为此，开发并比较了多种多项式模型和基于双曲正切函数的非线性模型，这些模型将影响包线变化的因素作为输入，并在任何预期故障程度下估计机动飞行包线的配平点数量和质心。结果表明，多项式和基于双曲正切函数的模型都能以良好的精度预测受损飞行包线的变化。此外，还证明了最佳多项式拟合的回归方程能够直接评估受损飞机的飞行包线收缩和位移对特定飞机故障和飞行条件参数的敏感性。

英文摘要

Aircraft failures alter the aircraft dynamics and cause the maneuvering flight envelope to change. Such envelope variations are nonlinear and generally unpredictable by the pilot, as they are governed by the aircraft's complex dynamics. Hence, in order to prevent in-flight Loss of Control, it is crucial to practically predict the impaired aircraft's flight envelope variation due to any a priori unknown failure degree. This paper investigates the predictability of the number of trim points within the maneuvering flight envelope and its centroid using both linear and nonlinear least-squares estimation methods. To do so, various polynomial models and nonlinear models based on the hyperbolic tangent function are developed and compared. These models take the factors influencing the envelope variations as inputs and estimate the centroid and the number of trim points of the maneuvering flight envelope at any intended failure degree. Results indicate that both the polynomial and hyperbolic tangent function-based models are capable of predicting the impaired flight envelope variation with good precision. Furthermore, it is shown that the regression equation of the best polynomial fit enables direct assessment of the sensitivity of the impaired aircraft's flight envelope contraction and displacement to the specific parameters characterizing aircraft failure and flight condition.


1904.11898 2026-06-04 cs.RO cs.CV cs.LG cs.SY eess.SY 版本更新

Perceptual Attention-based Predictive Control

基于感知注意力的预测控制

Keuntaek Lee, Gabriel Nakajima An, Viacheslav Zakharov, Evangelos A. Theodorou

发表机构 * Georgia Institute of Technology（佐治亚理工学院）

AI总结本文提出了一种新的信息处理架构，用于安全的深度学习视觉导航系统，通过模型预测控制（MPC）、卷积神经网络（CNNs）和不确定性量化方法，实现基于感知注意力的预测控制算法，提高了系统对不安全状况的快速检测能力。

详情

AI中文摘要

在本文中，我们提出了一种新的信息处理架构，用于安全的基于深度学习的视觉导航自主系统。所提出的信息处理架构用于支持一种基于感知注意力的预测控制算法，该算法利用模型预测控制（MPC）、卷积神经网络（CNNs）和不确定性量化方法。我们的方法新颖之处在于利用MPC学习如何在视觉输入的相关区域上放置注意力，从而最终使系统能够更快速地检测到不安全状况。我们通过使用MPC学习如何选择输入图像中的感兴趣区域，这些区域用于输出控制动作以及在注意力感知的视觉输入中的epistemic和aleatoric不确定性估计。我们使用这些不确定性估计来量化在当前导航条件下网络控制器的安全性。所提出的架构和算法在1:5比例的陆地车辆上进行了测试。实验结果表明，所提出的算法在早期检测不安全状况方面优于先前的方法，例如当导航环境中出现新障碍物时。所提出的架构是向在安全关键领域使用基于深度学习的感知控制策略迈出的第一步。

英文摘要

In this paper, we present a novel information processing architecture for safe deep learning-based visual navigation of autonomous systems. The proposed information processing architecture is used to support a perceptual attention-based predictive control algorithm that leverages model predictive control (MPC), convolutional neural networks (CNNs), and uncertainty quantification methods. The novelty of our approach lies in using MPC to learn how to place attention on relevant areas of the visual input, which ultimately allows the system to more rapidly detect unsafe conditions. We accomplish this by using MPC to learn to select regions of interest in the input image, which are used to output control actions as well as estimates of epistemic and aleatoric uncertainty in the attention-aware visual input. We use these uncertainty estimates to quantify the safety of our network controller under the current navigation condition. The proposed architecture and algorithm are tested on a 1:5 scale terrestrial vehicle. Experimental results show that the proposed algorithm outperforms previous approaches on early detection of unsafe conditions, such as when novel obstacles are present in the navigation environment. The proposed architecture is the first step towards using deep learning-based perceptual control policies in safety-critical domains.


1806.02957 2026-06-04 cs.LG cs.NA cs.NE math.NA physics.comp-ph stat.ML 版本更新

A Deep Neural Network Surrogate for High-Dimensional Random Partial Differential Equations

高维随机偏微分方程的深度神经网络替代模型

Mohammad Amin Nabian, Hadi Meidani

发表机构 * Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.（土木与环境工程系，伊利诺伊大学厄巴纳-香槟分校）

AI总结本文提出了一种基于深度学习的高维随机偏微分方程求解框架，通过深度残差网络近似随机PDE，并采用强化或弱化初始和边界条件的方法，验证了该方法在扩散和热传导问题中的准确性。

详情

DOI: 10.1016/j.probengmech.2019.05.001
Journal ref: Probabilistic Engineering Mechanics, 57, pp.14-25 (2019)

AI中文摘要

开发高效的数值算法来求解高维随机偏微分方程（PDEs）一直是一个具有挑战性的任务，由于众所周知的维度灾难。我们提出了一种基于深度学习的新解决方案框架。具体而言，随机PDE通过前馈全连接深度残差网络进行近似，采用强或弱执行初始和边界约束。该框架是无网格的，能够处理不规则计算域。近似深度神经网络的参数通过SGD算法的变种迭代确定。所提出的框架在扩散和热传导问题中通过数值实验验证了令人满意的准确性，与收敛的基于蒙特卡洛的有限元结果进行比较。

英文摘要

Developing efficient numerical algorithms for the solution of high-dimensional random Partial Differential Equations (PDEs) has been a challenging task due to the well-known curse of dimensionality. We present a new solution framework for these problems based on a deep learning approach. Specifically, the random PDE is approximated by a feed-forward fully-connected deep residual network, with either strong or weak enforcement of initial and boundary constraints. The framework is mesh-free and can handle irregular computational domains. Parameters of the approximating deep neural network are determined iteratively using variants of the Stochastic Gradient Descent (SGD) algorithm. The satisfactory accuracy of the proposed framework is numerically demonstrated on diffusion and heat conduction problems, in comparison with converged Monte Carlo-based finite element results.


1701.08711 2026-06-04 cs.CL cs.LG econ.GN q-fin.EC stat.ML 版本更新

Predicting Auction Price of Vehicle License Plate with Deep Recurrent Neural Network

利用深度循环神经网络预测车辆车牌拍卖价格

Vinci Chow

发表机构 * Department of Economics, The Chinese University of Hong Kong, Shatin, Hong Kong（香港中文大学经济系，沙田，香港）

AI总结本文提出将车辆车牌价格预测视为自然语言处理任务，通过构建深度循环神经网络来预测香港车牌拍卖价格，并展示了模型在解释价格变化和扩展为车牌搜索引擎方面的贡献。

详情

DOI: 10.1016/j.eswa.2019.113008

AI中文摘要

在中国社会，迷信因素极为重要，具有吉祥数字的车辆车牌在拍卖中可以高价成交。与其他珍贵物品不同，车牌在拍卖前并不预估价格。本文提出将车牌价格预测视为自然语言处理（NLP）任务，因为价值取决于车牌上每个字符的含义和语义。本文构建了一个深度循环神经网络（RNN）来预测香港车牌的价格，基于车牌上的字符。在13年的历史拍卖价格上评估，深度RNN的预测可以解释超过80%的价格变化，显著优于以前的模型。此外，本文还展示了该模型如何扩展为车牌搜索引擎，并提供价格分布的估计。

英文摘要

In Chinese societies, superstition is of paramount importance, and vehicle license plates with desirable numbers can fetch very high prices in auctions. Unlike other valuable items, license plates are not allocated an estimated price before auction. I propose that the task of predicting plate prices can be viewed as a natural language processing (NLP) task, as the value depends on the meaning of each individual character on the plate and its semantics. I construct a deep recurrent neural network (RNN) to predict the prices of vehicle license plates in Hong Kong, based on the characters on a plate. I demonstrate the importance of having a deep network and of retraining. Evaluated on 13 years of historical auction prices, the deep RNN's predictions can explain over 80 percent of price variations, outperforming previous models by a significant margin. I also demonstrate how the model can be extended to become a search engine for plates and to provide estimates of the expected price distribution.


1904.11538 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Zap Q-Learning for Optimal Stopping Time Problems

Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean P. Meyn

发表机构 * Department of ECE at the University of Florida（佛罗里达大学电气与计算机工程系）； Inria International Chair, Paris（Inria国际讲席，巴黎）

AI总结本文研究了在不可约、均匀递归的马尔可夫链上，通过快速收敛的强化学习算法近似求解折扣成本最优停止问题，提出了一种名为Zap-Q-learning的算法，证明其在线性函数近似设置下的收敛性。

详情

AI中文摘要

本文的目标是获得快速收敛的强化学习算法，用于近似求解折扣成本最优停止问题，其中马尔可夫链不可约、一致遍历，并在$\mathbb{R}^n$的紧子集上演化。我们基于Tsitsiklis和Van Roy所采用的动态规划方法，其中他们提出了一种Q-learning算法来估计最优状态-动作价值函数，从而定义最优停止规则。我们探讨了该算法收敛速度慢的原因，并提出了一种快速收敛的替代算法，即“Zap-Q-learning”算法，旨在实现最优的收敛速度。我们首次在线性函数近似设置的假设下证明了Zap-Q-learning算法的收敛性。证明采用ODE分析，而该算法的最优渐近方差性质则体现为金融示例中的快速收敛。

英文摘要

The objective in this paper is to obtain fast converging reinforcement learning algorithms to approximate solutions to the problem of discounted cost optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on a compact subset of $\mathbb{R}^n$. We build on the dynamic programming approach taken by Tsitsiklis and Van Roy, wherein they propose a Q-learning algorithm to estimate the optimal state-action value function, which then defines an optimal stopping rule. We provide insights as to why the convergence rate of this algorithm can be slow, and propose a fast-converging alternative, the "Zap-Q-learning" algorithm, designed to achieve the optimal rate of convergence. For the first time, we prove the convergence of the Zap-Q-learning algorithm in the linear function approximation setting. We use ODE analysis for the proof, and the optimal asymptotic variance property of the algorithm is reflected via fast convergence in a finance example.


1905.05992 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Deep reinforcement learning for scheduling in large-scale networked control systems

在大规模网络化控制系统中使用深度强化学习进行调度

Adrian Redder, Arunselvan Ramaswamy, Daniel E. Quevedo

发表机构 * Faculty of Computer Science, Electrical Engineering and Mathematics（计算机科学、电气工程与数学系）； Paderborn University（帕德博恩大学）

AI总结本文提出了一种基于深度强化学习的迭代资源分配算法DIRA，用于解决网络化系统中的控制与资源调度问题，通过联合优化控制与调度以提高性能。

1603.07421 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

On the Powerball Method for Optimization

关于优化的Powerball方法

Ye Yuan, Mu Li, Jun Liu, Claire J. Tomlin

发表机构 * School of Automation, Huazhong University of Science and Technology（华中科技大学自动化学院）； Department of Computer Science, Carnegie Mellon University（卡内基梅隆大学计算机科学系）； Department of Applied Mathematics, University of Waterloo（滑铁卢大学应用数学系）； Department of Electrical Engineering and Computer Sciences, University of California, Berkeley（加州大学伯克利分校电气工程与计算机科学系）

AI总结本文提出了一种新的方法来加速优化算法的收敛，通过在优化过程中添加一个功率系数γ∈[0,1)，称为Powerball方法，并分析了该方法在强凸函数中的收敛率。尽管理论上Powerball方法与梯度方法有相同的线性收敛率，但实验证明其在初始迭代中显著优于梯度下降和牛顿方法，尤其在多个真实数据集上提供了10倍的收敛加速。
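
A minimal sketch of the idea summarized above, assuming the power coefficient γ is applied elementwise to the gradient as sign(g)·|g|^γ before the usual descent update. The function names, the step size and the toy objective are illustrative assumptions, not values from the paper.

```python
import numpy as np


def powerball(z, gamma):
    """Elementwise Powerball transform: sign(z) * |z|**gamma, with gamma in [0, 1)."""
    return np.sign(z) * np.abs(z) ** gamma


def powerball_descent(grad_f, x0, step=0.1, gamma=0.5, n_iter=200):
    """Gradient descent in which the gradient is passed through the Powerball transform."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_iter):
        x = x - step * powerball(grad_f(x), gamma)
    return x


# Toy objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x_final = powerball_descent(lambda x: x, x0=[5.0, -3.0])
```

With a constant step and γ < 1, the transformed gradient is larger than the raw gradient near the optimum, so the iterates settle into a small neighbourhood of the minimizer rather than converging exactly. Setting `gamma=1.0` recovers plain gradient descent.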

1904.10945 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Target-Based Temporal Difference Learning

基于目标的时序差分学习

Donghwan Lee, Niao He

发表机构 * Coordinated Science Laboratory (CSL), University of Illinois at Urbana-Champaign（协调科学实验室（CSL），伊利诺伊大学厄巴纳-香槟分校）； Department of Industrial and Enterprise Systems Engineering, University of Illinois（工业与企业系统工程系，伊利诺伊大学）

AI总结本文提出了一种新的基于目标的时序差分学习算法家族，并从理论上分析了其收敛性，展示了这些算法在收敛性能上可能优于标准时序差分学习。

详情

AI中文摘要

目标网络的使用已成为近期深度Q学习算法在强化学习中的流行和关键组成部分，但理论方面的了解仍然有限。在本工作中，我们介绍了一种新的基于目标的时序差分（TD）学习算法家族，并对其收敛性进行了理论分析。与标准TD学习不同，基于目标的TD算法维护两个独立的学习参数——目标变量和在线变量。特别地，我们介绍了该家族中的三个成员，称为平均TD、双TD和周期TD，其中目标变量通过平均、对称或周期性的方式更新，模仿了深度Q学习实践中使用的技术。我们为平均TD和双TD建立了渐近收敛分析，并为周期TD提供了有限样本分析。此外，我们还提供了一些模拟结果，显示这些基于目标的TD算法在收敛性能上可能优于标准TD学习。虽然本工作集中在线性函数逼近和策略评估设置上，但我们将其视为朝着理解具有目标网络的深度Q学习变体理论基础迈出的有意义一步。

英文摘要

The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side. In this work, we introduce a new family of target-based temporal difference (TD) learning algorithms and provide a theoretical analysis of their convergence. In contrast to standard TD-learning, target-based TD algorithms maintain two separate learning parameters: the target variable and the online variable. In particular, we introduce three members of the family, called the averaging TD, double TD, and periodic TD, where the target variable is updated in an averaging, symmetric, or periodic fashion, mirroring the techniques used in deep Q-learning practice. We establish asymptotic convergence analyses for both averaging TD and double TD and a finite sample analysis for periodic TD. In addition, we provide some simulation results showing potentially superior convergence of these target-based TD algorithms compared to standard TD-learning. While this work focuses on linear function approximation and the policy evaluation setting, we consider it a meaningful step towards the theoretical understanding of deep Q-learning variants with target networks.


1806.07200 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Adaptive Input Estimation in Linear Dynamical Systems with Applications to Learning-from-Observations

线性动态系统中的自适应输入估计及其在从观察中学习中的应用

Sebastian Curi, Kfir Y. Levy, Andreas Krause

发表机构 * Electrical Engineering Department, Technion - Israel Institute of Technology（以色列理工学院（Technion）电气工程系）

AI总结本文提出了一种自适应输入估计算法，通过在每个时间步高效地平衡偏差和方差来优化总体估计误差，并在从观察中学习（learning-from-observations）框架中展示了其在控制器学习中的有效性。

Comments CDC 2019

1606.00911 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Distributed Cooperative Decision-Making in Multiarmed Bandits: Frequentist and Bayesian Algorithms

多臂老虎机中分布式协同决策：频率主义与贝叶斯算法

Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard

AI总结本文研究了在多臂老虎机问题中，如何在探索与利用之间取得平衡进行分布式协同决策，提出了适用于多智能体的频率主义和贝叶斯算法，并证明了这些算法在渐近意义上能够恢复集中式智能体的性能。

Comments This revision provides a correction to the original paper, which appeared in the Proceedings of the 2016 IEEE Conference on Decision and Control (CDC). The second statement of Proposition 1 and Theorem 1 are new from arXiv:1512.06888v3 and Lemma 1 is new. These are used to prove regret bounds in Theorems 2 and 3

1712.09379 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

IHT dies hard: Provable accelerated Iterative Hard Thresholding

IHT死守：可证明的加速迭代硬阈值法

Rajiv Khanna, Anastasios Kyrillidis

发表机构 * University of Texas at Austin（德克萨斯大学奥斯汀分校）； IBM T.J. Watson Research Center（IBM 沃森研究中心）

AI总结本文研究了在理论和实践中经典迭代硬阈值（IHT）方法中动量运动的使用，通过简单修改普通IHT，探讨了其在具有非凸约束的凸优化标准下的收敛行为，并观察到IHT的加速在投影梯度下降和Frank-Wolfe变体中带来了显著改进。

Comments accepted to AISTATS 2018

1808.03258 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Application of Bounded Total Variation Denoising in Urban Traffic Analysis

有界总变差去噪在城市交通分析中的应用

Shanshan Tang, Haijun Yu

发表机构 * School of Mathematical Sciences, University of Chinese Academy of Sciences； NCMIS & LSEC, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Beijing 100190, China

AI总结本文提出利用有界总变差去噪方法提升城市交通分析的准确性，通过改进的去噪算法以及神经网络与历史匹配相结合的方法，提高了交通预测和聚类的性能。

Comments 7 figures, 3 tables, to appear in East Asian Journal on Applied Mathematics

详情

DOI: 10.4208/eajam.181118.250219
Journal ref: East Asian Journal on Applied Mathematics Vol.9, No.3, pp. 622-642, 2019

AI中文摘要

尽管人们认为在许多大数据应用中去噪并不总是必要的，但本文通过将有界总变差去噪方法应用于城市道路交通预测和聚类问题，证明了去噪对城市交通分析是有益的。我们提出了两种易于实现的方法来估计去噪算法中的噪声强度参数，并将去噪算法应用于北京出租车系统基于GPS的交通数据。在交通预测问题中，我们结合神经网络和历史匹配方法，对北京某城区中随机选择的道路进行预测。数值实验表明，应用所提出的有界总变差去噪算法显著提高了预测精度。我们还在聚类问题上测试了该算法：将一种最近开发的聚类分析方法应用于北京一百多条城市道路路段，并根据其速度剖面进行聚类。去噪后获得了更好的聚类结果。

英文摘要

While it is believed that denoising is not always necessary in many big data applications, we show in this paper that denoising is helpful in urban traffic analysis by applying the method of bounded total variation denoising to the urban road traffic prediction and clustering problems. We propose two easy-to-implement methods to estimate the noise strength parameter in the denoising algorithm, and apply the denoising algorithm to GPS-based traffic data from the Beijing taxi system. For the traffic prediction problem, we combine a neural network with a history matching method for roads randomly chosen from an urban area of Beijing. Numerical experiments show that the prediction accuracy is improved significantly by applying the proposed bounded total variation denoising algorithm. We also test the algorithm on a clustering problem, where a recently developed clustering analysis method is applied to more than one hundred urban road segments in Beijing based on their velocity profiles. Better clustering results are obtained after denoising.


1806.06790 2026-06-04 cs.LG cs.AI cs.IT cs.SY eess.SY math.IT math.OC stat.ML 版本更新

Towards Distributed Energy Services: Decentralizing Optimal Power Flow with Machine Learning

迈向分布式能源服务：利用机器学习实现最优功率流的去中心化

Roel Dobbe, Oscar Sondermeijer, David Fridovich-Keil, Daniel Arnold, Duncan Callaway, Claire Tomlin

发表机构 * AI Now Institute at New York University（纽约大学AI Now研究所）； Energy & Resources Group at UC Berkeley（加州大学伯克利分校能源与资源组）

AI总结本文提出了一种基于机器学习的去中心化方法，通过本地可用信息学习可控分布式能源资源（DER）的控制策略，以重构和模仿集中式最优功率流（OPF）问题的解决方案，从而实现分布式能源服务。

Comments Accepted for publication. To appear in the IEEE Transactions on Smart Grid

详情

AI中文摘要

实现最优功率流（OPF）方法以调节电力网络中的电压和功率流通常被认为需要大量通信。我们考虑包含多个可控分布式能源资源（DER）的配电系统，并提出一种数据驱动的方法，用于学习每个DER的控制策略，以仅利用本地可用信息来重构和模仿集中式OPF问题的解决方案。集体来看，所有本地控制器紧密匹配集中式OPF解决方案，提供接近最优的性能并满足系统约束。速率失真框架使得能够分析由此产生的完全去中心化控制策略在重构OPF解决方案方面的效果。该方法为决定DER应与哪些节点通信以改进其个别策略提供了自然扩展。该方法在单相和三相测试馈线网络上应用，使用真实负载和分布式发电机的数据，重点于不表现出跨时间依赖性的DER。它为配电系统运营商提供了一个框架，以高效规划和操作DER的贡献，以实现配电网络中的分布式能源服务。

英文摘要

The implementation of optimal power flow (OPF) methods to perform voltage and power flow regulation in electric networks is generally believed to require extensive communication. We consider distribution systems with multiple controllable Distributed Energy Resources (DERs) and present a data-driven approach to learn control policies for each DER to reconstruct and mimic the solution to a centralized OPF problem from solely locally available information. Collectively, all local controllers closely match the centralized OPF solution, providing near optimal performance and satisfaction of system constraints. A rate distortion framework enables the analysis of how well the resulting fully decentralized control policies are able to reconstruct the OPF solution. The methodology provides a natural extension to decide what nodes a DER should communicate with to improve the reconstruction of its individual policy. The method is applied on both single- and three-phase test feeder networks using data from real loads and distributed generators, focusing on DERs that do not exhibit inter-temporal dependencies. It provides a framework for Distribution System Operators to efficiently plan and operate the contributions of DERs to achieve Distributed Energy Services in distribution networks.


1804.02948 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Sample-Derived Disjunctive Rules for Secure Power System Operation

基于样本的离散规则用于安全电力系统运行

Jochen L. Cremer, Ioannis Konstantelos, Simon H. Tindemans, Goran Strbac

发表机构 * Department of Electrical and Electronic Engineering（电气与电子工程系）； Department of Electrical Sustainable Energy（电气可持续能源系）

AI总结本文提出了一种将决策树导出的析取规则嵌入标准优化框架的通用方法，用于故障前和故障后控制，通过把这些规则纳入运行决策模型来提高电力系统运行的安全性。

Comments 6 pages, accepted paper to IEEE PMAPS 2018

详情

DOI: 10.1109/PMAPS.2018.8440373

AI中文摘要

机器学习技术过去曾利用蒙特卡洛样本来构建电力系统动态稳定性的预测器。在本文中，我们超越了预测任务，提出了一种综合方法，将决策树（DT）等预测器纳入标准优化框架，用于故障前和故障后控制。具体而言，我们提出了一种通用方法，用于将从决策树中导出的规则嵌入运行决策模型。我们首先指出从预测框架过渡到控制框架时所面临的特定挑战。接着，我们介绍基于广义析取规划（GDP）的求解策略，以及一种用于确定最优超参数以平衡成本和控制精度的两步搜索方法。我们通过IEEE 39节点系统的案例研究，展示了所提出的方法如何在运行条件存在高维不确定性的情况下构建覆盖多种故障情景的安全代理。结果表明，与理想（oracle）模型相比，该方法只需略微增加系统价格即可实现高效的系统控制。

英文摘要

Machine learning techniques have been used in the past using Monte Carlo samples to construct predictors of the dynamic stability of power systems. In this paper we move beyond the task of prediction and propose a comprehensive approach to use predictors, such as Decision Trees (DT), within a standard optimization framework for pre- and post-fault control purposes. In particular, we present a generalizable method for embedding rules derived from DTs in an operation decision-making model. We begin by pointing out the specific challenges entailed when moving from a prediction to a control framework. We proceed with introducing the solution strategy based on generalized disjunctive programming (GDP) as well as a two-step search method for identifying optimal hyper-parameters for balancing cost and control accuracy. We showcase how the proposed approach constructs security proxies that cover multiple contingencies while facing high-dimensional uncertainty with respect to operating conditions with the use of a case study on the IEEE 39-bus system. The method is shown to achieve efficient system control at a marginal increase in system price compared to an oracle model.


1905.04835 2026-06-04 cs.LG cs.CV cs.MA cs.RO cs.SY eess.SY stat.ML 版本更新

Multi-Agent Image Classification via Reinforcement Learning

通过强化学习进行多智能体图像分类

Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč, Nader Motee

AI总结本文研究了利用多个能够收集未知环境部分姿态依赖观测的移动智能体进行图像分类的问题，提出了一种网络架构，用于指导智能体形成局部信念、采取局部行动并从原始部分观测中提取相关特征，通过与邻居智能体交换信息更新自身信念，并利用强化学习技术实现分类问题的去中心化实现。

Comments Preprint of the paper to be published in IROS'19 proceedings

1801.09627 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Barrier-Certified Adaptive Reinforcement Learning with Applications to Brushbot Navigation

具有应用的障碍证书自适应强化学习：Brushbot导航

Motoya Ohnishi, Li Wang, Gennaro Notomista, Magnus Egerstedt

发表机构 * School of Electrical Engineering, Royal Institute of Technology（皇家理工学院电气工程学院）； Georgia Institute of Technology（佐治亚理工学院）； RIKEN Center for Advanced Intelligence Project（日本理化学研究所高级智能研究中心）； School of Mechanical Engineering（机械工程学院）

AI总结本文提出了一种安全学习框架，结合自适应模型学习算法和障碍证书，用于智能体动态可能非平稳的系统。通过稀疏优化技术提取模型的动态结构，并利用学习到的模型结合控制障碍证书来约束策略（反馈控制器），以保持安全性，即避免状态空间中特定的不利区域。在某些条件下，可以保证在安全性因非平稳性被破坏后，以李雅普诺夫稳定性的意义恢复安全。此外，本文重新表述了动作-价值函数近似，使任何基于核的非线性函数估计方法都能应用于该自适应学习框架。最后，障碍证书策略优化的解被保证是全局最优的，从而在温和条件下确保贪心策略改进。所得框架先通过四旋翼无人机的仿真进行验证（该系统此前在安全学习文献中是在平稳性假设下使用的），然后在动态未知、高度复杂且非平稳的真实机器人Brushbot上进行测试。

Comments ©2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

详情

DOI: 10.1109/TRO.2019.2920206
Journal ref: Published in IEEE Transactions on Robotics, 2019

AI中文摘要

本文提出了一种安全学习框架，该框架将自适应模型学习算法与障碍证书相结合，用于智能体动态可能非平稳的系统。为了提取模型的动态结构，我们使用了稀疏优化技术。我们将学习到的模型与控制障碍证书结合使用，后者约束策略（反馈控制器）以保持安全性，即避免状态空间中特定的不利区域。在某些条件下，可以保证在安全性因非平稳性被破坏后，以李雅普诺夫稳定性的意义恢复安全。此外，我们重新表述了动作-价值函数近似，使任何基于核的非线性函数估计方法都能应用于我们的自适应学习框架。最后，障碍证书策略优化的解被保证是全局最优的，从而在温和条件下确保贪心策略改进。所得框架先通过四旋翼无人机的仿真进行验证（该系统此前在安全学习文献中是在平稳性假设下使用的），然后在动态未知、高度复杂且非平稳的真实机器人Brushbot上进行测试。

英文摘要

This paper presents a safe learning framework that employs an adaptive model learning algorithm together with barrier certificates for systems with possibly nonstationary agent dynamics. To extract the dynamic structure of the model, we use a sparse optimization technique. We use the learned model in combination with control barrier certificates, which constrain policies (feedback controllers) in order to maintain safety, where safety refers to avoiding particular undesirable regions of the state space. Under certain conditions, recovery of safety in the sense of Lyapunov stability after violations of safety due to the nonstationarity is guaranteed. In addition, we reformulate an action-value function approximation to make any kernel-based nonlinear function estimation method applicable to our adaptive learning framework. Lastly, solutions to the barrier-certified policy optimization are guaranteed to be globally optimal, ensuring the greedy policy improvement under mild conditions. The resulting framework is validated via simulations of a quadrotor, which has previously been used under stationarity assumptions in the safe learning literature, and is then tested on a real robot, the brushbot, whose dynamics are unknown, highly complex, and nonstationary.


1706.00078 2026-06-04 cs.CC cs.LG cs.NA math.NA math.OC 版本更新

Low-Rank Matrix Approximation in the Infinity Norm

以无穷范数为度量的低秩矩阵逼近

Nicolas Gillis, Yaroslav Shitov

发表机构 * Department of Mathematics and Operational Research, University of Mons（蒙斯大学数学与运筹学系）； National Research University Higher School of Economics（俄罗斯国家研究大学高等经济学院）

AI总结本文研究了以无穷范数为度量的低秩矩阵逼近问题，证明了当秩r=1时该问题的决策变种是NP难的，并分析了在某些情况下该问题可以在多项式时间内解决，同时提出了一种实用的启发式算法用于恢复量化低秩矩阵。

Comments 12 pages, 3 tables

1903.11683 2026-06-04 stat.ML cs.CV cs.LG cs.RO cs.SY eess.SY stat.AP 版本更新

Outlier-Robust Spatial Perception: Hardness, General-Purpose Algorithms, and Guarantees

抗异常值的空间感知：困难性、通用算法与保证

Vasileios Tzoumas, Pasquale Antonante, Luca Carlone

AI总结本文研究了异常值对空间感知的影响，证明了异常值剔除的困难性，提出了一种用于去除异常值的通用算法，并给出了评估剔除结果近似质量的理论界限。

详情

AI中文摘要

空间感知是许多机器人应用的基础，涵盖定位与建图、点云对齐以及从相机图像估计相对位姿等广泛的研究问题。错误的数据关联以及更一般的异常值会危及空间感知的鲁棒性。尽管已有处理异常值的技术，但它们可能以不可预测的方式失败（例如RANSAC、鲁棒估计器），或具有指数级的运行时间（例如分支定界法）。在本文中，我们通过三个贡献推进了异常值剔除的研究前沿。首先，我们证明了即使是异常值剔除的一个简单线性实例也是不可近似的：在最坏情况下，无法设计出能够高效计算近似解的拟多项式时间算法。我们的第二个贡献是给出了首个逐实例的次优性界限，用于评估给定异常值剔除结果的近似质量。我们的第三个贡献是提出了一种简单的通用算法，称为自适应修剪（adaptive trimming），用于去除异常值。该算法利用了最近提出的、能够求解无异常值问题的全局求解器，并迭代地去除误差较大的测量值。我们在三个空间感知问题上展示了所提出的算法：三维配准、双视图几何和SLAM。结果表明，我们的算法作为一种通用方法，在各种应用中优于多种最先进的方法。

英文摘要

Spatial perception is the backbone of many robotics applications, and spans a broad range of research problems, including localization and mapping, point cloud alignment, and relative pose estimation from camera images. Robust spatial perception is jeopardized by the presence of incorrect data association and, in general, outliers. Although techniques to handle outliers do exist, they can fail in unpredictable ways (e.g., RANSAC, robust estimators), or can have exponential runtime (e.g., branch-and-bound). In this paper, we advance the state of the art in outlier rejection by making three contributions. First, we show that even a simple linear instance of outlier rejection is inapproximable: in the worst case, one cannot design a quasi-polynomial time algorithm that computes an approximate solution efficiently. Our second contribution is to provide the first per-instance sub-optimality bounds to assess the approximation quality of a given outlier rejection outcome. Our third contribution is to propose a simple general-purpose algorithm, named adaptive trimming, to remove outliers. Our algorithm leverages recently proposed global solvers that are able to solve outlier-free problems, and iteratively removes measurements with large errors. We demonstrate the proposed algorithm on three spatial perception problems: 3D registration, two-view geometry, and SLAM. The results show that our algorithm outperforms several state-of-the-art methods across applications while being a general-purpose method.
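
The iterative "solve the outlier-free problem, then drop measurements with large errors" loop described above can be sketched on a toy line-fitting problem. This is a simplified stand-in rather than the paper's adaptive trimming algorithm: ordinary least squares plays the role of the global solver, and the threshold rule (`shrink` times the worst residual), the tolerance `eps`, and the synthetic data are assumptions made for illustration.

```python
import numpy as np

def trimmed_least_squares(X, y, eps=1e-6, shrink=0.9):
    """Alternate between fitting on the current inliers and trimming large residuals."""
    inliers = np.arange(len(y))
    n_params = X.shape[1]
    while True:
        # "Outlier-free" solver applied to the measurements currently kept.
        theta, *_ = np.linalg.lstsq(X[inliers], y[inliers], rcond=None)
        residuals = np.abs(y[inliers] - X[inliers] @ theta)
        worst = residuals.max()
        if worst <= eps:
            return theta, inliers
        # Shrinking threshold: at least the worst measurement is removed each round.
        threshold = max(eps, shrink * worst)
        keep = residuals <= threshold
        if keep.sum() < n_params:
            return theta, inliers          # too few measurements left to trim further
        inliers = inliers[keep]

# Synthetic data: points on the line y = 2x + 1, with every 7th measurement corrupted.
x = np.linspace(0.0, 1.0, 50)
X = np.column_stack([x, np.ones_like(x)])
y = 2.0 * x + 1.0
outlier_idx = np.arange(0, 50, 7)
y[outlier_idx] += 10.0

theta_hat, kept = trimmed_least_squares(X, y)
```

Here the gross outliers are trimmed in the first round and the refit recovers the line. With noisy inliers, `eps` would have to reflect the expected noise level, which this sketch leaves to the user.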


1811.05537 2026-06-04 math.NA cs.LG cs.NA cs.NE math.DS stat.ML 版本更新

Data Driven Governing Equations Approximation Using Deep Neural Networks

利用深度神经网络的数据驱动控制方程近似

Tong Qin, Kailiang Wu, Dongbin Xiu

发表机构 * Department of Mathematics, The Ohio State University（数学系，俄亥俄州立大学）

AI总结本文提出了一种数值框架，利用观测数据和深度神经网络近似未知的控制方程，以残差网络作为基本构建块，并提出了两种多步方法，其中一种使用均匀时间步长，另一种使用可变时间步长。

详情

DOI: 10.1016/j.jcp.2019.06.042

AI中文摘要

我们提出了一种数值框架，用于利用观测数据和深度神经网络（DNN）近似未知的控制方程。特别地，我们提出使用残差网络（ResNet）作为方程近似的基本构建块。我们证明ResNet块可以被视为一种在时间积分上精确的单步方法。然后，我们提出了两种多步方法，即循环ResNet（RT-ResNet）方法和递归ResNet（RS-ResNet）方法。RT-ResNet是一种使用均匀时间步长的多步方法，而RS-ResNet是一种使用可变时间步长的自适应多步方法。这三种方法均基于底层动力系统的积分形式。因此，它们在恢复方程时不需要时间导数数据，并且能够处理分布相对稀疏的轨迹数据。文中给出了若干数值算例来展示这些方法的性能。

英文摘要

We present a numerical framework for approximating unknown governing equations using observation data and deep neural networks (DNN). In particular, we propose to use the residual network (ResNet) as the basic building block for equation approximation. We demonstrate that the ResNet block can be considered as a one-step method that is exact in temporal integration. We then present two multi-step methods, the recurrent ResNet (RT-ResNet) method and the recursive ResNet (RS-ResNet) method. The RT-ResNet is a multi-step method on uniform time steps, whereas the RS-ResNet is an adaptive multi-step method using variable time steps. All three methods presented here are based on the integral form of the underlying dynamical system. As a result, they do not require time derivative data for equation recovery and can cope with relatively coarsely distributed trajectory data. Several numerical examples are presented to demonstrate the performance of the methods.
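
The claim that a residual block acts as a one-step method that is exact in time can be illustrated on a linear system, where the residual map is itself linear. The sketch below is an illustration under strong assumptions, not the paper's method: a least-squares linear map `W` stands in for the neural network, and the rotation system, the time lag `dt`, and all names are hypothetical.

```python
import numpy as np

def flow(x, t, omega=1.0):
    """Exact solution map of dx/dt = [[0, -omega], [omega, 0]] x over time t (rows are states)."""
    c, s = np.cos(omega * t), np.sin(omega * t)
    rotation = np.array([[c, -s], [s, c]])
    return x @ rotation.T

rng = np.random.default_rng(0)
dt = 0.5                                   # deliberately coarse time lag between snapshots
x0 = rng.normal(size=(200, 2))             # observed states
x1 = flow(x0, dt)                          # the same states dt later; no derivative data used

# Fit the residual map N(x) = x @ W so that x1 ≈ x0 + N(x0).
W, *_ = np.linalg.lstsq(x0, x1 - x0, rcond=None)

def resnet_step(x):
    """One residual block: identity plus the learned increment."""
    return x + x @ W

# Compose the block to predict further ahead and compare with the exact flow.
x = np.array([[1.0, 0.0]])
for _ in range(10):
    x = resnet_step(x)
x_exact = flow(np.array([[1.0, 0.0]]), 10 * dt)
```

Because the block learns the increment of the flow map over `dt` rather than the time derivative, the composed prediction matches the exact solution here even though `dt` is far too large for a naive forward-Euler step. For nonlinear systems a network would replace `W` and the match would only be approximate.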


1904.08353 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures

面向交通信号控制的鲁棒深度强化学习：需求激增、事故和传感器故障

Filipe Rodrigues, Carlos Lima Azevedo

发表机构 * Technical University of Denmark (DTU)（丹麦技术大学）

AI总结本文提出了一种开源的回调框架，用于在交通模拟环境中灵活评估不同深度强化学习配置，研究了深度强化学习自适应交通控制器在需求激增、事故导致的容量下降和传感器故障等场景下的表现，并提出了缓解这些外源不确定性的具体设计。

Comments 8 pages

详情

AI中文摘要

强化学习（RL）是缓解交通拥堵问题的一种有前景的解决方案。特别是，深度RL算法已被证明能够产生性能优于传统系统的自适应交通信号控制器。然而，为了在高度动态的城市区域中保持可靠，此类控制器需要对一系列外生不确定性来源具有鲁棒性。在本文中，我们开发了一个开源的基于回调的框架，以便在交通仿真环境中灵活评估不同的深度RL配置。借助该框架，我们研究了基于深度RL的自适应交通控制器在不同场景下的表现，即特殊事件引起的需求激增、事故导致的通行能力下降以及传感器故障。我们提取了若干关键见解，用于开发鲁棒的交通控制深度RL算法，并提出了具体设计以减轻所考虑的外生不确定性的影响。

英文摘要

Reinforcement learning (RL) constitutes a promising solution for alleviating the problem of traffic congestion. In particular, deep RL algorithms have been shown to produce adaptive traffic signal controllers that outperform conventional systems. However, in order to be reliable in highly dynamic urban areas, such controllers need to be robust with respect to a series of exogenous sources of uncertainty. In this paper, we develop an open-source callback-based framework for promoting the flexible evaluation of different deep RL configurations under a traffic simulation environment. With this framework, we investigate how deep RL-based adaptive traffic controllers perform under different scenarios, namely demand surges caused by special events, capacity reductions from incidents, and sensor failures. We extract several key insights for the development of robust deep RL algorithms for traffic control and propose concrete designs to mitigate the impact of the considered exogenous uncertainties.


1903.02531 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY 版本更新

Combining Optimal Control and Learning for Visual Navigation in Novel Environments

将最优控制与学习相结合用于新环境中的视觉导航

Somil Bansal, Varun Tolani, Saurabh Gupta, Jitendra Malik, Claire Tomlin

发表机构 * University of California, Berkeley（加州大学伯克利分校）； Facebook AI Research（脸书人工智能研究）

AI总结本文提出了一种将基于模型的控制与基于学习的感知相结合的方法，用于在新环境中实现可靠的视觉导航：感知模块生成沿无碰撞路径的航点，使机器人能够高效地到达目标位置，并且该方法在低帧率下以及从仿真到现实的迁移中表现良好。

Comments Project website: https://vtolani95.github.io/WayPtNav/

详情

AI中文摘要

基于模型的控制是机器人导航的流行范式，因为它可以利用已知的动力学模型高效地规划鲁棒的机器人轨迹。然而，当环境事先未知且只能通过机器人的机载传感器部分观测时，使用基于模型的方法具有挑战性。在本工作中，我们通过将基于模型的控制与基于学习的感知相结合来解决这一不足。基于学习的感知模块生成一系列航点（waypoints），引导机器人沿无碰撞路径到达目标。基于模型的规划器利用这些航点生成平滑且动力学可行的轨迹，并通过反馈控制在物理系统上执行。我们在模拟的真实世界杂乱环境中以及在实际地面车辆上的实验表明，与纯几何建图方法或端到端学习方法相比，所提出的方法在新环境中能够更可靠、更高效地到达目标位置。我们的方法不依赖于详细的显式三维环境地图，在低帧率下也能良好工作，并且能很好地从仿真泛化到现实世界。描述我们方法和实验的视频可在项目网站上获得。

英文摘要

Model-based control is a popular paradigm for robot navigation because it can leverage a known dynamics model to efficiently plan robust robot trajectories. However, it is challenging to use model-based methods in settings where the environment is a priori unknown and can only be observed partially through the robot's on-board sensors. In this work, we address this shortcoming by coupling model-based control with learning-based perception. The learning-based perception module produces a series of waypoints that guide the robot to the goal via a collision-free path. These waypoints are used by a model-based planner to generate a smooth and dynamically feasible trajectory that is executed on the physical system using feedback control. Our experiments in simulated real-world cluttered environments and on an actual ground vehicle demonstrate that the proposed approach can reach goal locations more reliably and efficiently in novel environments as compared to purely geometric mapping-based or end-to-end learning-based alternatives. Our approach does not rely on detailed explicit 3D maps of the environment, works well with low frame rates, and generalizes well from simulation to the real world. Videos describing our approach and experiments are available on the project website.


1809.05525 2026-06-04 quant-ph cs.LG cs.SY eess.SY stat.ML 版本更新

Robustness of Quantum-Enhanced Adaptive Phase Estimation

量子增强自适应相位估计的鲁棒性

Pantita Palittapongarnpim, Barry C. Sanders

发表机构 * Institute for Quantum Science and Technology（量子科学与技术研究所）； University of Calgary（卡尔加里大学）； Program in Quantum Information Science（量子信息科学项目）； Canadian Institute for Advanced Research（加拿大高等研究院）； Toronto, Ontario M5G 1M1, Canada（加拿大安大略省多伦多M5G 1M1）

AI总结本研究提出了一种评估量子增强自适应相位估计策略鲁棒性的测试方法，并比较了不同策略所使用的资源，以确定其有效性并选择合适的策略。

Comments 15 pages, 2 figures, 2 tables

详情

DOI: 10.1103/PhysRevA.100.012106
Journal ref: Phys. Rev. A 100, 012106 (2019)

AI中文摘要

由于所有物理上的自适应量子增强计量方案都在具有部分理解的噪声条件下运行，因此实际的控制策略必须在未知噪声的情况下也具有鲁棒性。我们旨在设计一个测试来评估AQEM策略的鲁棒性，并评估策略所使用的资源。鲁棒性测试是在QEAPE上进行的，通过模拟四种相位噪声模型（正态分布噪声、随机电报噪声、偏态正态分布噪声和对数正态分布噪声）下的方案进行。控制策略要么是在相同嘈杂条件下由进化算法设计，尽管不知道其特性，要么是基于贝叶斯反馈的方法，假设没有噪声。我们的鲁棒性测试和资源比较方法可用于确定有效性和选择合适的策略。

英文摘要

Because all physical adaptive quantum-enhanced metrology (AQEM) schemes operate under noisy conditions with only partially understood noise characteristics, a practical control policy must be robust even to unknown noise. We aim to devise a test to evaluate the robustness of AQEM policies and to assess the resources used by the policies. The robustness test is performed on quantum-enhanced adaptive phase estimation (QEAPE) by simulating the scheme under four phase-noise models corresponding to normal-distribution noise, random-telegraph noise, skew-normal-distribution noise, and log-normal-distribution noise. Control policies are devised either by an evolutionary algorithm under the same noisy conditions, albeit ignorant of the noise properties, or by a Bayesian-based feedback method that assumes no noise. Our robustness test and resource comparison method can be used to determine the efficacy of policies and to select a suitable one.


1806.03816 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Adaptive MCMC via Combining Local Samplers

通过结合局部采样器实现自适应MCMC

Kiarash Shaloudegi, András György

发表机构 * Imperial College London, London, UK（伦敦帝国学院，伦敦，英国）

AI总结本文提出了一种自适应MCMC方法，通过结合多个并行运行的局部采样器，并利用核Stein差异（kernel Stein discrepancy）为各条链设定优先级，以提高整体采样效率，实验表明该方法在多峰问题和传感器定位任务中优于现有方法。

详情

AI中文摘要

马尔可夫链蒙特卡罗（MCMC）方法在机器学习中被广泛使用。MCMC的主要问题之一是如何设计能够在整个状态空间上快速混合的链，特别是如何选择MCMC算法的参数。本文采取了不同的思路：与并行MCMC方法类似，我们不去寻找一条能够从整个分布采样的单一链，而是组合多条并行运行的链的样本，每条链仅探索状态空间的一部分（例如仅少数几个峰）。各条链根据核Stein差异（kernel Stein discrepancy）确定优先级，该指标能够很好地度量局部性能。来自独立链的样本通过一种用于估计样本空间不同区域概率的新技术进行组合。实验结果表明，所提出的算法可以在不同的采样问题中带来显著的加速。最重要的是，当以最先进的NUTS算法作为基础MCMC采样器时，我们的方法在单峰分布采样上与NUTS保持相当的性能，而在合成多峰问题以及具有挑战性的传感器定位任务中显著优于最先进的竞争方法。

英文摘要

Markov chain Monte Carlo (MCMC) methods are widely used in machine learning. One of the major problems with MCMC is the question of how to design chains that mix fast over the whole state space; in particular, how to select the parameters of an MCMC algorithm. Here we take a different approach and, similarly to parallel MCMC methods, instead of trying to find a single chain that samples from the whole distribution, we combine samples from several chains run in parallel, each exploring only parts of the state space (e.g., a few modes only). The chains are prioritized based on kernel Stein discrepancy, which provides a good measure of performance locally. The samples from the independent chains are combined using a novel technique for estimating the probability of different regions of the sample space. Experimental results demonstrate that the proposed algorithm may provide significant speedups in different sampling problems. Most importantly, when combined with the state-of-the-art NUTS algorithm as the base MCMC sampler, our method remained competitive with NUTS on sampling from unimodal distributions, while significantly outperforming state-of-the-art competitors on synthetic multimodal problems as well as on a challenging sensor localization task.


1809.07192 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

Unbalanced Multi-Phase Distribution Grid Topology Estimation and Bus Phase Identification

不平衡多相配电网拓扑估计与节点相位识别

Yizheng Liao, Yang Weng, Guangyi Liu, Zhongyang Zhao, Chin-woo Tan, Ram Rajagopal

AI总结本文提出了一种基于信息论的方法，利用智能电表数据估计配电网的多相拓扑并识别节点相位，通过将不平衡系统转换为对称分量并证明Chow-Liu算法在存在错误节点相位标签时能确定拓扑结构，最终通过Carson方程证明电压测量可正确识别节点相位连接，实验结果表明该方法在强负载不平衡和分布式能源接入条件下具有高准确性。

Comments 17 pages, 18 figures

详情

AI中文摘要

随着分布式能源资源带来的不确定性在配电网中增加，准确的多相拓扑是相关不平衡配电网测量的基础。然而，由于投资有限，尤其是低压配电网，此类拓扑知识往往不可用。此外，由于人为错误或过时记录，节点相位标签信息不准确。为此，本文利用智能电表数据提出了一种信息论方法来学习配电网拓扑。具体而言，多相不平衡系统被转换为对称分量，即正序、负序和零序。然后，本文证明Chow-Liu算法通过利用功率流方程和由配电网径向多相结构隐含的条件独立关系来确定拓扑。最后，通过Carson方程证明可以使用电压测量正确识别节点相位连接。为验证，使用三个真实数据集模拟IEEE系统。仿真结果表明，该算法在强负载不平衡和DERs条件下仍能准确找到多相拓扑，确保配电网中分布式能源的紧密监控和控制。

英文摘要

There is an increasing need to monitor and control the uncertainties brought by distributed energy resources (DERs) in distribution grids. For this goal, an accurate multi-phase topology is the basis for correlating measurements in unbalanced distribution networks. Unfortunately, such topology knowledge is often unavailable due to limited investment, especially for low-voltage distribution grids. In addition, the bus phase labeling information is inaccurate due to human errors or outdated records. To address this challenge, this paper uses smart meter data in an information-theoretic approach to learn the topology of distribution grids. Specifically, multi-phase unbalanced systems are converted into symmetrical components, namely positive, negative, and zero sequences. Then, this paper proves that the Chow-Liu algorithm finds the topology, even in the presence of incorrect bus phase labels, by utilizing power flow equations and the conditional independence relationships implied by the radial multi-phase structure of distribution grids. Finally, by utilizing Carson's equation, this paper proves that the bus phase connection can be correctly identified using voltage measurements. For validation, IEEE systems are simulated using three real data sets. The simulation results demonstrate that the algorithm is highly accurate in finding the multi-phase topology even under strong load unbalance and with DERs. This enables close monitoring and control of DERs in distribution grids.
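
The Chow-Liu step mentioned above (pairwise mutual information followed by a maximum-weight spanning tree) can be sketched on synthetic data. This is only an illustration of the generic algorithm, not the paper's grid-specific procedure: the Gaussian mutual-information formula, the chain-structured toy data standing in for voltage measurements, and the function names are assumptions.

```python
import numpy as np

def gaussian_mutual_information(data):
    """Pairwise mutual information under a Gaussian assumption: -0.5 * log(1 - rho^2)."""
    rho = np.corrcoef(data, rowvar=False)
    np.fill_diagonal(rho, 0.0)             # ignore self-information
    return -0.5 * np.log(1.0 - rho ** 2)

def chow_liu_tree(data):
    """Maximum-weight spanning tree over mutual information (Prim's algorithm)."""
    mi = gaussian_mutual_information(data)
    n = mi.shape[0]
    in_tree = [0]
    edges = []
    while len(in_tree) < n:
        best = None
        for i in in_tree:
            for j in range(n):
                if j not in in_tree and (best is None or mi[i, j] > best[0]):
                    best = (mi[i, j], i, j)
        _, i, j = best
        edges.append((min(i, j), max(i, j)))
        in_tree.append(j)
    return sorted(edges)

# Toy radial "feeder": node k depends only on node k-1, so the true tree is a chain.
rng = np.random.default_rng(0)
n_samples = 5000
x = np.zeros((n_samples, 4))
x[:, 0] = rng.normal(size=n_samples)
for k in range(1, 4):
    x[:, k] = 0.8 * x[:, k - 1] + 0.6 * rng.normal(size=n_samples)

edges = chow_liu_tree(x)
```

With enough samples the recovered edges follow the chain, because adjacent nodes share more information than non-adjacent ones. Real smart meter data would need the sequence-component conversion described in the abstract, which is outside this sketch.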


1811.09358 2026-06-04 cs.LG cs.CV cs.NA math.NA math.OC stat.ML 版本更新

A Sufficient Condition for Convergences of Adam and RMSProp

Adam和RMSProp收敛性的充分条件

Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu

发表机构 * Tencent AI Lab（腾讯AI实验室）； Stony Brook University（石英布鲁克大学）

AI总结本文提出了一种易于检查的充分条件，该条件仅依赖于基础学习率参数和历史二阶矩量的组合，以保证通用的Adam/RMSProp算法在大规模非凸随机优化中的全局收敛性，并展示了几种Adam变体在非凸设置下的收敛性可由此条件直接推导。

Comments Accepted by CVPR2019 as an Oral presentation

详情

AI中文摘要

Adam和RMSProp是训练深度神经网络中最具影响力的自适应随机算法，尽管在凸设置中通过几个简单的反例已被指出存在发散现象。许多尝试，如降低自适应学习率、采用大批次大小、引入时间去相关技术、寻找类比的替代方案等，已被尝试以促进Adam/RMSProp型算法收敛。与现有方法不同，我们引入了一种替代的易于检查的充分条件，该条件仅依赖于基础学习率参数和历史二阶矩量的组合，以保证通用的Adam/RMSProp算法在大规模非凸随机优化中的全局收敛性。此外，我们展示了几种Adam变体，如AdamNC、AdaEMA等，在非凸设置下的收敛性可通过所提出的充分条件直接推导。此外，我们表明Adam本质上是一种具有指数移动平均动量的特定加权AdaGrad，这为理解Adam和RMSProp提供了新的视角。这一观察结合该充分条件，为它们的发散性提供了更深入的解释。最后，我们通过将Adam和RMSProp应用于特定反例和训练深度神经网络来验证该充分条件。数值结果与我们的理论分析一致。

英文摘要

Adam and RMSProp are two of the most influential adaptive stochastic algorithms for training deep neural networks, yet a few simple counterexamples have shown that they can diverge even in the convex setting. Many attempts, such as decreasing an adaptive learning rate, adopting a big batch size, incorporating a temporal decorrelation technique, and seeking an analogous surrogate, have been made to make Adam/RMSProp-type algorithms converge. In contrast with existing approaches, we introduce an alternative easy-to-check sufficient condition, which depends only on the base learning rate parameters and combinations of historical second-order moments, to guarantee the global convergence of generic Adam/RMSProp for solving large-scale non-convex stochastic optimization. Moreover, we show that the convergence of several variants of Adam, such as AdamNC and AdaEMA, follows directly from the proposed sufficient condition in the non-convex setting. In addition, we illustrate that Adam is essentially a specifically weighted AdaGrad with exponential moving average momentum, which provides a novel perspective for understanding Adam and RMSProp. This observation, coupled with the sufficient condition, gives a much deeper interpretation of their divergence. Finally, we validate the sufficient condition by applying Adam and RMSProp to a certain counterexample and to training deep neural networks. Numerical results are in accord with our theoretical analysis.
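
One way to read the "weighted AdaGrad" remark is through the second-moment recursion alone: with a constant decay it yields exponentially weighted squared gradients, while with a decay of 1 - 1/t it reduces to a uniform running mean, as in an averaged AdaGrad accumulator. The sketch below checks both identities numerically; the name `theta_schedule` and the particular schedules are illustrative assumptions, not necessarily the paper's parametrization.

```python
import numpy as np

def second_moment(grads, theta_schedule):
    """Recursion v_t = theta_t * v_{t-1} + (1 - theta_t) * g_t**2 with v_0 = 0."""
    v, out = 0.0, []
    for t, g in enumerate(grads, start=1):
        theta_t = theta_schedule(t)
        v = theta_t * v + (1.0 - theta_t) * g ** 2
        out.append(v)
    return np.array(out)

grads = np.random.default_rng(0).normal(size=50)
steps = np.arange(1, len(grads) + 1)

# Decay 1 - 1/t: every past squared gradient gets the same weight 1/t.
v_uniform = second_moment(grads, lambda t: 1.0 - 1.0 / t)
running_mean = np.cumsum(grads ** 2) / steps

# Constant decay (Adam/RMSProp style): weights decay geometrically with age.
beta2 = 0.99
v_adam = second_moment(grads, lambda t: beta2)
explicit = np.array([
    np.sum((1.0 - beta2) * beta2 ** (k - np.arange(1, k + 1)) * grads[:k] ** 2)
    for k in steps
])
```

The two cases differ only in how past squared gradients are weighted, which is the sense in which the constant-decay estimate can be viewed as a reweighted accumulator. Bias correction and the momentum term are omitted here for brevity.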


1803.07726 2026-06-04 stat.ML cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval

随机初始化的梯度下降：非凸相位恢复的快速全局收敛

Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma

发表机构 * Department of Electrical Engineering, Princeton University（普林斯顿大学电气工程系）； Department of Electrical and Computer Engineering, Carnegie Mellon University（卡内基梅隆大学电气与计算机工程系）； Department of Operations Research and Financial Engineering, Princeton University（普林斯顿大学运筹学与金融工程系）

AI总结本文研究了通过二次方程恢复目标对象的问题，证明了在高斯设计下，随机初始化的梯度下降能在O(log n + log(1/ε))次迭代中获得ε精度的解，从而实现了计算和样本复杂度的近最优性，为相位恢复提供了首个无需精心设计初始化、样本分割或复杂鞍点逃离方案的全局收敛保证。

Comments Accepted to Mathematical Programming

详情

DOI: 10.1007/s10107-019-01363-6
Journal ref: Mathematical Programming 2019, Volume 176, Issue 1-2, 5-37

AI中文摘要

本文考虑求解二次方程组的问题，即从 $m$ 个二次方程/样本 $y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$（$1\leq i\leq m$）中恢复感兴趣的对象 $\mathbf{x}^{\natural}\in\mathbb{R}^{n}$。这个问题也被称为相位恢复，涉及物理科学和机器学习等多个领域。我们研究了针对非凸最小二乘问题的梯度下降（或Wirtinger流）的效率。我们证明，在高斯设计下，随机初始化的梯度下降在样本数接近最少的情况下，能在 $O\big(\log n+\log(1/\epsilon)\big)$ 次迭代中获得 $\epsilon$ 精度的解，从而同时实现了近最优的计算复杂度和样本复杂度。这为相位恢复提供了首个关于普通梯度下降的全局收敛保证，无需（i）精心设计的初始化，（ii）样本分割，或（iii）复杂的鞍点逃离方案。所有这些都是通过在分析优化算法时利用统计模型实现的，其中采用了一种留一法（leave-one-out），用于解耦梯度下降迭代与数据之间的某些统计依赖性。

英文摘要

This paper considers the problem of solving systems of quadratic equations, namely, recovering an object of interest $\mathbf{x}^{\natural}\in\mathbb{R}^{n}$ from $m$ quadratic equations/samples $y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$, $1\leq i\leq m$. This problem, also dubbed phase retrieval, spans multiple domains including physical sciences and machine learning. We investigate the efficiency of gradient descent (or Wirtinger flow) designed for the nonconvex least squares problem. We prove that under Gaussian designs, gradient descent, when randomly initialized, yields an $\epsilon$-accurate solution in $O\big(\log n+\log(1/\epsilon)\big)$ iterations given nearly minimal samples, thus achieving near-optimal computational and sample complexities at once. This provides the first global convergence guarantee concerning vanilla gradient descent for phase retrieval, without the need of (i) carefully-designed initialization, (ii) sample splitting, or (iii) sophisticated saddle-point escaping schemes. All of these are achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data.
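
A small numerical experiment can make the setting concrete: plain gradient descent on the least-squares loss $\frac{1}{4m}\sum_{i}\big((\mathbf{a}_{i}^{\top}\mathbf{x})^{2}-y_{i}\big)^{2}$, started from a random Gaussian point. The sketch below is a toy illustration, not a reproduction of the paper's experiments: the dimensions, the step size `eta`, the iteration count, and the unit-norm signal are assumptions chosen for a quick run.

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 20, 600                              # signal dimension and number of samples (assumed)

x_true = rng.normal(size=n)
x_true /= np.linalg.norm(x_true)            # unit-norm signal, assumed for convenience
A = rng.normal(size=(m, n))                 # Gaussian design vectors a_i as rows
y = (A @ x_true) ** 2                       # phaseless quadratic measurements

def gradient(x):
    """Gradient of f(x) = (1 / 4m) * sum(((a_i^T x)^2 - y_i)^2)."""
    Ax = A @ x
    return A.T @ ((Ax ** 2 - y) * Ax) / m

x = rng.normal(size=n) / np.sqrt(n)         # random initialization, no spectral step
eta = 0.1                                   # constant step size (assumed)
for _ in range(500):
    x = x - eta * gradient(x)

# The signal is identifiable only up to a global sign.
error = min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true))
```

In runs of this kind the iterate typically spends a few iterations aligning with the signal direction and then converges linearly, which matches the qualitative picture in the abstract. The sketch does not verify the stated iteration complexity.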


1811.10745 2026-06-04 cs.LG cs.CR cs.NA math.NA stat.ML 版本更新

ResNets Ensemble via the Feynman-Kac Formalism to Improve Natural and Robust Accuracies

基于Feynman-Kac形式体系的ResNets集成：提升自然准确率与鲁棒准确率

Bao Wang, Binjie Yuan, Zuoqiang Shi, Stanley J. Osher

发表机构 * Department of Mathematics（数学系）； Computer Science Department（计算机科学系）； University of California, Los Angeles（加州大学洛杉矶分校）； Tsinghua University（清华大学）； Yau Mathematical Sciences Center（丘成桐数学科学中心）

AI总结本文提出了一种基于Feynman-Kac公式的ResNets集成算法，通过在每个残差映射的输出中注入指定方差的高斯噪声，并对多个联合训练的修改后ResNets的输出取平均，来提高模型在干净图像和对抗图像上的准确率。

Comments 18 pages, 6 figures

详情

AI中文摘要

经验对抗风险最小化（EARM）是一种广泛使用的数学框架，用于鲁棒地训练能够抵抗对抗攻击的深度神经网络（DNNs）。然而，经鲁棒训练的模型在分类干净图像和对抗图像时的自然准确率和鲁棒准确率都远不能令人满意。在本工作中，我们将输运方程最优控制的理论与ResNets的训练和测试实践统一起来。基于这一统一观点，我们提出了一种简单而有效的ResNets集成算法，以提升鲁棒训练模型在干净图像和对抗图像上的准确率。所提出的算法包括两个组成部分：首先，我们通过在每个残差映射的输出中注入指定方差的高斯噪声来修改基础ResNets。其次，我们对多个联合训练的修改后ResNets的输出取平均以获得最终预测。这两个步骤近似了Feynman-Kac公式，该公式用于表示带粘性的输运方程（即对流-扩散方程）的解。在CIFAR10基准上，该简单算法得到的鲁棒模型在干净图像上的自然准确率为85.62%，在20次迭代的IFGSM攻击下的鲁棒准确率为57.94%，优于当前在CIFAR10上防御IFGSM攻击的最先进方法。所提出的ResNets集成的自然准确率和鲁棒准确率可以随着其构建块ResNet的进步而进一步提高。代码可在 https://github.com/BaoWangMath/EnResNet 获取。

英文摘要

Empirical adversarial risk minimization (EARM) is a widely used mathematical framework to robustly train deep neural nets (DNNs) that are resistant to adversarial attacks. However, both natural and robust accuracies, in classifying clean and adversarial images, respectively, of the trained robust models are far from satisfactory. In this work, we unify the theory of optimal control of transport equations with the practice of training and testing of ResNets. Based on this unified viewpoint, we propose a simple yet effective ResNets ensemble algorithm to boost the accuracy of the robustly trained model on both clean and adversarial images. The proposed algorithm consists of two components: First, we modify the base ResNets by injecting a variance-specified Gaussian noise to the output of each residual mapping. Second, we average over the production of multiple jointly trained modified ResNets to get the final prediction. These two steps give an approximation to the Feynman-Kac formula for representing the solution of a transport equation with viscosity, or a convection-diffusion equation. For the CIFAR10 benchmark, this simple algorithm leads to a robust model with a natural accuracy of 85.62% on clean images and a robust accuracy of 57.94% under the 20 iterations of the IFGSM attack, which outperforms the current state-of-the-art in defending against IFGSM attack on the CIFAR10. Both natural and robust accuracies of the proposed ResNets ensemble can be improved dynamically as the building block ResNet advances. The code is available at: https://github.com/BaoWangMath/EnResNet.

1807.06613 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY stat.ML version update

Deep Reinforcement Learning for Swarm Systems

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

Affiliations * L-CAS University of Lincoln; Technische Universität Darmstadt

AI summary: This paper proposes a new state representation for deep multi-agent reinforcement learning based on mean embeddings of distributions, to handle decentralized decision making in large-scale homogeneous swarm systems more effectively.

Comments 31 pages, 12 figures, version 3 (published in JMLR Volume 20)

Journal ref: Journal of Machine Learning Research 20(54):1-31, 2019

Abstract

Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, these methods rely on a concatenation of agent states to represent the information content required for decentralized decision making. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions. We treat the agents as samples of a distribution and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and a neural network learned end-to-end. We evaluate the representation on two well known problems from the swarm literature (rendezvous and pursuit evasion), in a globally and locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents facilitating the development of more complex collective strategies.
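
The mean-embedding idea in this abstract can be sketched as follows, using radial basis function features (one of the three feature spaces the abstract lists). The function name, the feature centers, and the bandwidth are hypothetical choices for illustration and are not taken from the paper.

```python
import numpy as np

def rbf_mean_embedding(agent_states, centers, bandwidth=1.0):
    """Empirical mean embedding of agent states under RBF features.

    agent_states: array of shape (num_agents, d)
    centers:      array of shape (num_features, d), assumed fixed in advance
    Returns a vector of shape (num_features,).
    """
    diffs = agent_states[:, None, :] - centers[None, :, :]
    sq_dists = np.sum(diffs**2, axis=-1)
    features = np.exp(-sq_dists / (2.0 * bandwidth**2))
    # Averaging over agents makes the result independent of agent ordering
    # and gives a fixed-size vector for any number of agents.
    return features.mean(axis=0)

rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, size=(16, 2))
swarm = rng.uniform(-1, 1, size=(5, 2))

emb = rbf_mean_embedding(swarm, centers)
emb_permuted = rbf_mean_embedding(swarm[::-1], centers)      # same agents, reordered
emb_large = rbf_mean_embedding(rng.uniform(-1, 1, size=(50, 2)), centers)
```

The embedding has the same size for 5 and 50 agents and does not change when the agents are reordered, which mirrors the two swarm properties (i) and (ii) named in the abstract.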

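1905.08645 2026-06-04 math.OC cs.DC cs.LG cs.MA cs.SY eess.SY version update

Revisiting Randomized Gossip Algorithms: General Framework, Convergence Rates and Novel Block and Accelerated Protocols

Nicolas Loizou, Peter Richtárik

Affiliations * University of Edinburgh; KAUST; MIPT

AI summary: This paper presents a new framework for the analysis and design of randomized gossip algorithms for the average consensus problem. Classical randomized iterative methods for linear systems, applied to special systems encoding the network, are interpreted as gossip algorithms, and their decentralized nature is explained. The framework recovers many well-known gossip algorithms as special cases and allows the development of provably faster variants. The authors also propose novel block protocols, the first provably accelerated randomized gossip protocols, and dual randomized gossip algorithms.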

Comments 44 pages, 12 figures

Abstract

In this work we present a new framework for the analysis and design of randomized gossip algorithms for solving the average consensus problem. We show how classical randomized iterative methods for solving linear systems can be interpreted as gossip algorithms when applied to special systems encoding the underlying network and explain in detail their decentralized nature. Our general framework recovers a comprehensive array of well-known gossip algorithms as special cases, including the pairwise randomized gossip algorithm and path averaging gossip, and allows for the development of provably faster variants. The flexibility of the new approach enables the design of a number of new specific gossip methods. For instance, we propose and analyze novel block and the first provably accelerated randomized gossip protocols, and dual randomized gossip algorithms. From a numerical analysis viewpoint, our work is the first that explores in depth the decentralized nature of randomized iterative methods for linear systems and proposes them as methods for solving the average consensus problem. We evaluate the performance of the proposed gossip protocols by performing extensive experimental testing on typical wireless network topologies.
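
For readers unfamiliar with the baseline, the pairwise randomized gossip algorithm that this abstract names as a special case can be sketched in a few lines. This is the textbook scheme only, not the block, accelerated, or dual variants the paper proposes; the ring topology, step count, and function name are assumptions for illustration.

```python
import random

def pairwise_gossip(values, edges, num_steps, seed=0):
    """Pairwise randomized gossip: repeatedly pick a random edge (i, j)
    and replace both endpoint values by their average."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(num_steps):
        i, j = rng.choice(edges)
        avg = 0.5 * (x[i] + x[j])
        x[i] = x[j] = avg  # only the two neighbors communicate
    return x

n = 10
ring_edges = [(i, (i + 1) % n) for i in range(n)]   # ring network (assumed topology)
initial = [float(i) for i in range(n)]

final = pairwise_gossip(initial, ring_edges, num_steps=5000)
target = sum(initial) / n   # the average that every node should reach
```

Each update preserves the sum of the node values, so on a connected graph all nodes approach the network-wide average without any central coordinator.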

1905.13587 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML version update

GENO -- GENeric Optimization for Classical Machine Learning

Sören Laue, Matthias Mitterreiter, Joachim Giesen

Affiliations * Friedrich-Schiller-Universität Jena

AI summary: This paper presents the GENO framework, which combines a modeling language with a generic solver to solve most classical machine learning problems automatically and efficiently, and demonstrates its efficiency advantages.

Abstract

Although optimization is the longstanding algorithmic backbone of machine learning, new models still require the time-consuming implementation of new solvers. As a result, there are thousands of implementations of optimization algorithms for machine learning problems. A natural question is whether it is always necessary to implement a new solver, or whether there is one algorithm that is sufficient for most models. Common belief suggests that such a one-algorithm-fits-all approach cannot work, because this algorithm cannot exploit model-specific structure and thus cannot be efficient and robust on a wide variety of problems. Here, we challenge this common belief. We have designed and implemented the optimization framework GENO (GENeric Optimization) that combines a modeling language with a generic solver. GENO generates a solver from the declarative specification of an optimization problem class. The framework is flexible enough to encompass most of the classical machine learning problems. We show on a wide variety of classical but also some recently suggested problems that the automatically generated solvers are (1) as efficient as well-engineered specialized solvers, (2) more efficient by a decent margin than recent state-of-the-art solvers, and (3) orders of magnitude more efficient than classical modeling language plus solver approaches.

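1905.13548 2026-06-04 math.OC cs.LG cs.SY eess.SY math.DS version update

Sparse optimal control of networks with multiplicative noise via policy gradient

Benjamin Gravell, Yi Guo, Tyler Summers

Affiliations * The University of Texas at Dallas

AI summary: This paper proposes a policy-gradient-based algorithm for designing near-optimal sparse controllers for complex dynamical network systems subject to multiplicative noise, and compares several regularization schemes to demonstrate the algorithm's effectiveness on large-scale networked systems.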

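1905.13428 2026-06-04 cs.LG cs.MA cs.SY eess.SY stat.ML version update

Attentional Policies for Cross-Context Multi-Agent Reinforcement Learning

Matthew A. Wright, Roberto Horowitz

Affiliations * University of California Berkeley

AI summary: This paper proposes a new neural policy architecture for multi-agent problems that learns multi-agent relationships at the policy level and uses an attention mechanism for cooperation between agents; it outperforms traditional methods and performs better in scenarios with large numbers of agents.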

Abstract

Many potential applications of reinforcement learning in the real world involve interacting with other agents whose numbers vary over time. We propose new neural policy architectures for these multi-agent problems. In contrast to other methods of training an individual, discrete policy for each agent and then enforcing cooperation through some additional inter-policy mechanism, we follow the spirit of recent work on the power of relational inductive biases in deep networks by learning multi-agent relationships at the policy level via an attentional architecture. In our method, all agents share the same policy, but independently apply it in their own context to aggregate the other agents' state information when selecting their next action. The structure of our architectures allows them to be applied on environments with varying numbers of agents. We demonstrate our architecture on a benchmark multi-agent autonomous vehicle coordination problem, obtaining superior results to a full-knowledge, fully-centralized reference solution, and significantly outperforming it when scaling to large numbers of agents.

1905.09673 2026-06-04 cs.AI cs.LG cs.SY eess.SY version update

Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment

Jivitesh Sharma, Per-Arne Andersen, Ole-Christoffer Granmo, Morten Goodwin

Affiliations * Centre for Artificial Intelligence Research; Department of Information and Communication Technology; University of Agder, Norway

AI summary: This paper proposes a deep Q-learning method with Q-matrix transfer learning for emergency evacuation: the DQN network weights are pretrained to incorporate shortest-path information, and the agent is then trained to produce optimal evacuation paths in a complex real-world environment.

Comments 21 pages, 14 figures, 4 tables

Abstract

We focus on the important problem of emergency evacuation, which could clearly benefit from reinforcement learning but has been largely unaddressed. Emergency evacuation is a complex task that is difficult to solve with reinforcement learning, since an emergency situation is highly dynamic, with many changing variables and complex constraints that make it difficult to train on. In this paper, we propose the first fire evacuation environment to train reinforcement learning agents for evacuation planning. The environment is modelled as a graph capturing the building structure. It consists of realistic features like fire spread, uncertainty and bottlenecks. We have implemented the environment in the OpenAI gym format, to facilitate future research. We also propose a new reinforcement learning approach that entails pretraining the network weights of a DQN-based agent to incorporate information on the shortest path to the exit. We achieved this by using tabular Q-learning to learn the shortest path on the building model's graph. This information is transferred to the network by deliberately overfitting it on the Q-matrix. Then, the pretrained DQN model is trained on the fire evacuation environment to generate the optimal evacuation path under time-varying conditions. We perform comparisons of the proposed approach with state-of-the-art reinforcement learning algorithms like PPO, VPG, SARSA, A2C and ACKTR. The results show that our method is able to outperform state-of-the-art models, including the original DQN-based models, by a huge margin. Finally, we test our model on a large and complex real building consisting of 91 rooms, with the possibility to move to any other room, hence giving 8281 actions. We use an attention-based mechanism to deal with large action spaces. Our model achieves near-optimal performance on the real-world emergency environment.
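
The first stage described in this abstract, tabular Q-learning of shortest paths on a building graph, can be sketched as below. The six-room graph, the reward of -1 per move, the discount factor, and all function names are assumptions made for illustration; the paper's environment, reward design, and the later step of overfitting a DQN on the Q-matrix are not reproduced here.

```python
import random

def q_learning_shortest_path(graph, exit_node, episodes=2000, gamma=0.9, seed=0):
    """Tabular Q-learning on a graph; an action is the neighboring room to move to.
    Assumed reward: -1 per move, so higher Q means fewer moves to the exit."""
    rng = random.Random(seed)
    q = {s: {a: 0.0 for a in nbrs} for s, nbrs in graph.items()}
    starts = [s for s in graph if s != exit_node]
    for _ in range(episodes):
        s = rng.choice(starts)
        for _ in range(50):                      # cap on episode length
            a = rng.choice(graph[s])             # uniformly random exploration
            future = 0.0 if a == exit_node else max(q[a].values())
            q[s][a] = -1.0 + gamma * future      # learning rate 1: moves are deterministic
            s = a
            if s == exit_node:
                break
    return q

def greedy_path(q, start, exit_node):
    """Follow the highest-valued action from each room until the exit is reached."""
    path = [start]
    while path[-1] != exit_node:
        s = path[-1]
        path.append(max(q[s], key=q[s].get))
    return path

# Hypothetical building: rooms 0..5, room 5 is the exit.
building = {0: [1, 2], 1: [0, 3], 2: [0, 3, 4], 3: [1, 2, 5], 4: [2, 5], 5: [3, 4]}
q_matrix = q_learning_shortest_path(building, exit_node=5)
path = greedy_path(q_matrix, start=0, exit_node=5)
```

In the approach the abstract describes, a Q-matrix of this kind would serve as the regression target for pretraining the DQN before training in the fire environment.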

1805.07297 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

General solutions for nonlinear differential equations: a rule-based self-learning approach using deep reinforcement learning

Shiyin Wei, Xiaowei Jin, Hui Li

Affiliations * Key Lab of Smart Prevention and Mitigation of Civil Engineering Disasters of the Ministry of Industry and Information Technology, Harbin Institute of Technology; Key Lab of Structures Dynamic Behavior and Control of the Ministry of Education, Harbin Institute of Technology; School of Civil Engineering, Harbin Institute of Technology

AI summary: This paper proposes a rule-based self-learning approach that uses deep reinforcement learning to solve nonlinear ordinary and partial differential equations. A deep-neural-network actor outputs candidate solutions, and a critic is derived only from physical rules (governing equations and boundary and initial conditions). The approach shows a transfer learning characteristic and is verified to give high-accuracy solutions for the Schrödinger, Navier-Stokes, Burgers', Van der Pol, and Lorenz equations and an equation of motion.

DOI: 10.1007/s00466-019-01715-1

Abstract

A universal rule-based self-learning approach using deep reinforcement learning (DRL) is proposed for the first time to solve nonlinear ordinary differential equations and partial differential equations. The solver consists of a deep neural network-structured actor that outputs candidate solutions, and a critic derived only from physical rules (governing equations and boundary and initial conditions). Solutions in discretized time are treated as multiple tasks sharing the same governing equation, and the current step parameters provide an ideal initialization for the next owing to the temporal continuity of the solutions, which shows a transfer learning characteristic and indicates that the DRL solver has captured the intrinsic nature of the equation. The approach is verified through solving the Schrödinger, Navier-Stokes, Burgers', Van der Pol, and Lorenz equations and an equation of motion. The results indicate that the approach gives solutions with high accuracy, and the solution process promises to get faster.

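1905.10457 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

A Polynomial-Based Approach for Architectural Design and Learning with Deep Neural Networks

Joseph Daws, Clayton G. Webster

Affiliations * Oak Ridge National Lab; University of Tennessee at Knoxville

AI summary: This paper proposes a novel polynomial-based approach that identifies a suitable network architecture and initialization for reconstructing multivariate functions from training data using polynomial approximation; the network is then improved through standard training, which makes a desirable local minimum more likely.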

Comments 11 pages, 6 figures, submitted to NeurIPS 2019, corrected several typos and included new examples

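User-Device Authentication in Mobile Banking using APHEN for Paratuck2 Tensor Decomposition

Jeremy Charlier, Eric Falk, Radu State, Jean Hilger

Affiliations * University of Luxembourg

AI summary: This paper studies how Paratuck2 tensor decomposition and the APHEN algorithm can improve the efficiency of user-device authentication in mobile banking applications, in order to enhance personal financial advertising.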

Abstract

The new financial European regulations such as PSD2 are changing the retail banking services. Noticeably, the monitoring of the personal expenses is now opened to other institutions than retail banks. Nonetheless, the retail banks are looking to leverage the user-device authentication on the mobile banking applications to enhance the personal financial advertisement. To address the profiling of the authentication, we rely on tensor decomposition, a higher dimensional analogue of matrix decomposition. We use Paratuck2, which expresses a tensor as a multiplication of matrices and diagonal tensors, because of the imbalance between the number of users and devices. We highlight why Paratuck2 is more appropriate in this case than the popular CP tensor decomposition, which decomposes a tensor as a sum of rank-one tensors. However, the computation of Paratuck2 is computationally intensive. We propose a new APproximate HEssian-based Newton resolution algorithm, APHEN, capable of solving Paratuck2 more accurately and faster than the other popular approaches based on alternating least squares or gradient descent. The results of Paratuck2 are used for the predictions of users' authentication with neural networks. We apply our method for the concrete case of targeting clients for financial advertising campaigns based on the authentication events generated by mobile banking applications.

1905.10224 2026-06-04 cs.LG cs.DM cs.NA cs.NE math.NA stat.ML version update

Semi-Supervised Classification on Non-Sparse Graphs Using Low-Rank Graph Convolutional Networks

Dominik Alfke, Martin Stoll

Affiliations * Department of Mathematics, Chair of Scientific Computing

AI summary: This paper proposes a low-rank graph convolutional network architecture for efficient semi-supervised learning on non-sparse graphs; low-rank filters improve runtime and accuracy, and the approach is extended to hypergraph datasets.

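1812.03457 2026-06-04 math.OC cs.LG cs.NA cs.NE math.NA version update

Minima distribution for global optimization

Xiaopeng Luo

Affiliations * Department of Chemistry, Princeton University

AI summary: This paper studies the minima distribution of an arbitrary continuous function on a compact set, proposes a new construction of the minima distribution function, establishes monotonically convergent sequences related to the objective function and the compact set, and determines the shrinkage rate of the minima set for continuously differentiable functions.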

Comments 19 pages, 6 figures

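1807.00251 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

A Dynamical Systems Perspective on Nesterov Acceleration

Michael Muehlebach, Michael I. Jordan

Affiliations * Electrical Engineering and Computer Science Department, UC Berkeley, Berkeley, California, USA

AI summary: This paper presents a dynamical systems framework for understanding Nesterov's accelerated gradient method. By analyzing the continuous-time dynamics and their discretization, it shows that a curvature-dependent damping term is central to the acceleration phenomenon and establishes connections between the discrete-time and continuous-time dynamics.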

Comments 11 pages, 4 figures, to appear in the Proceedings of the 36th International Conference on Machine Learning

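1902.01119 2026-06-04 cs.AI cs.CL cs.LG cs.SY eess.SY version update

The Natural Language of Actions

Guy Tennenholtz, Shie Mannor

Affiliations * Faculty of Electrical Engineering, Technion Institute of Technology, Israel

AI summary: This paper proposes Act2Vec, a framework for learning context-based action representations to improve reinforcement learning performance; it groups similar actions and exploits relations between actions to improve Q-value approximation and state representation.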

Comments Published in the proceedings of the 36th International Conference on Machine Learning (ICML 2019)

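1905.06978 2026-06-04 eess.SY cs.LG cs.RO cs.SY stat.AP version update

Randomized Algorithms for Data-Driven Stabilization of Stochastic Linear Systems

Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

AI summary: This paper proposes two randomized algorithms for data-driven stabilization of stochastic linear systems. It studies the stabilization speed and failure probability of the random-feedback and random-parameter methods through numerical analysis, and shows that fast stabilization is guaranteed when the number of statistically independent randomizations is not too small.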

1810.05247 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks

Wenting Li, Deepjyoti Deka, Michael Chertkov, Meng Wang

Affiliations * Theory Division and the Center for Nonlinear Studies, Los Alamos National Laboratory

AI summary: This paper proposes a faulted line localization method based on convolutional neural networks that uses bus voltage features to improve robustness, together with a joint PMU placement strategy; simulations of different fault types verify accurate fault localization under low observability.

Comments 11 pages, 8 figures

Abstract

Diverse fault types, fast re-closures, and complicated transient states after a fault event make real-time fault location in power grids challenging. Existing localization techniques in this area rely on simplistic assumptions, such as static loads, or require much higher sampling rates or total measurement availability. This paper proposes a faulted line localization method based on a Convolutional Neural Network (CNN) classifier using bus voltages. Unlike prior data-driven methods, the proposed classifier is based on features with physical interpretations that improve the robustness of the location performance. The accuracy of our CNN based localization tool is demonstrably superior to other machine learning classifiers in the literature. To further improve the location performance, a joint phasor measurement units (PMU) placement strategy is proposed and validated against other methods. A significant aspect of our methodology is that under very low observability (7% of buses), the algorithm is still able to localize the faulted line to a small neighborhood with high probability. The performance of our scheme is validated through simulations of faults of various types in the IEEE 39-bus and 68-bus power systems under varying uncertain conditions, system observability, and measurement quality.

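1810.03076 2026-06-04 cs.RO cs.LG cs.SY eess.SY version update

Online Center of Mass Estimation for a Humanoid Wheeled Inverted Pendulum Robot

Munzir Zafar, Akash Patel, Bogdan Vlahov, Nathaniel Glaser, Sergio Aguillera, Seth Hutchinson

AI summary: This paper presents a novel combination of robust control and online learning for balancing an n-degree-of-freedom wheeled inverted pendulum humanoid robot; online learning updates the mass model to obtain a more accurate center-of-mass estimate, and experiments show that the method improves overall control and efficiency.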

Abstract

We present a novel application of robust control and online learning for the balancing of an n Degree of Freedom (DoF) Wheeled Inverted Pendulum (WIP) humanoid robot. Our technique condenses the inaccuracies of a mass model into a Center of Mass (CoM) error, balances despite this error, and uses online learning to update the mass model for a better CoM estimate. Using a simulated model of our robot, we meta-learn a set of excitatory joint poses that makes our gradient descent algorithm quickly converge to an accurate CoM estimate. This simulated pipeline executes in a fully online fashion, using active disturbance rejection to address the mass errors that result from a steadily evolving mass model. Experiments were performed on a 19 DoF WIP, in which we manually acquired the data for the learned set of poses and show that the mass model produced by gradient descent yields a CoM estimate that improves overall control and efficiency. This work contributes to a greater corpus of whole body control on the Golem Krang humanoid robot.

1905.05380 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

Control Regularization for Reduced Variance Reinforcement Learning

Richard Cheng, Abhinav Verma, Gabor Orosz, Swarat Chaudhuri, Yisong Yue, Joel W. Burdick

Affiliations * California Institute of Technology, Pasadena, CA; University of Michigan, Ann Arbor, MI; Rice University, Houston, TX

AI summary: This paper proposes a functional regularization approach for reducing the variance of reinforcement learning in continuous control: the behavior of the deep policy is regularized to be similar to a prior policy, which gives a bias-variance trade-off with more stable dynamics and more efficient training.

Comments Appearing in ICML 2019

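Deterministic and Probabilistic Conditions for Finite Completability of Low-Tucker-Rank Tensor

Morteza Ashraphijuo, Vaneet Aggarwal, Xiaodong Wang

Affiliations * Department of Electrical Engineering, Columbia University; School of IE, Purdue University

AI summary: This paper studies the fundamental conditions on the sampling pattern for finite completability of a tensor given some components of its Tucker rank. An algebraic geometric analysis on the Tucker manifold yields deterministic necessary and sufficient conditions. Probabilistic conditions are also studied, with a lower bound on the sampling probability that guarantees the deterministic conditions hold with high probability. Using the same geometric approach, the paper gives a sufficient condition on the sampling pattern that guarantees a unique completion of the sampled tensor.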

Abstract

We investigate the fundamental conditions on the sampling pattern, i.e., locations of the sampled entries, for finite completability of a low-rank tensor given some components of its Tucker rank. In order to find the deterministic necessary and sufficient conditions, we propose an algebraic geometric analysis on the Tucker manifold, which allows us to incorporate multiple rank components in the proposed analysis in contrast with the conventional geometric approaches on the Grassmannian manifold. This analysis characterizes the algebraic independence of a set of polynomials defined based on the sampling pattern, which is closely related to finite completion. Probabilistic conditions are then studied and a lower bound on the sampling probability is given, which guarantees that the proposed deterministic conditions on the sampling patterns for finite completability hold with high probability. Furthermore, using the proposed geometric approach for finite completability, we propose a sufficient condition on the sampling pattern that ensures there exists exactly one completion for the sampled tensor.

1706.07119 2026-06-04 eess.SY cs.LG cs.SY version update

"Parallel Training Considered Harmful?": Comparing series-parallel and parallel feedforward network training

Antônio H. Ribeiro, Luis A. Aguirre

Affiliations * Department of Electronic Engineering at Universidade Federal de Minas Gerais (UFMG) - Av. Antônio Carlos 6627, 31270-901, Belo Horizonte, MG, Brazil

AI summary: This paper compares series-parallel and parallel training of feedforward networks in terms of robustness, computational cost, and convergence, and finds that parallel training performs better in more realistic scenarios.

DOI: 10.1016/j.neucom.2018.07.071
Journal ref: Neurocomputing 316:222--231, (2018)

Abstract

Neural network models for dynamic systems can be trained either in parallel or in series-parallel configurations. Influenced by early arguments, several papers justify the choice of series-parallel rather than parallel configuration claiming it has a lower computational cost, better stability properties during training and provides more accurate results. Other published results, on the other hand, defend parallel training as being more robust and capable of yielding more accurate long-term predictions. The main contribution of this paper is to present a study comparing both methods under the same unified framework. We focus on three aspects: i) robustness of the estimation in the presence of noise; ii) computational cost; and, iii) convergence. A unifying mathematical framework and simulation studies show situations where each training method provides better validation results, with parallel training being better in what are believed to be more realistic scenarios. An example using measured data seems to reinforce this claim. We also show, with a novel complexity analysis and numerical examples, that both methods have similar computational cost, although series-parallel training is more amenable to parallelization. Some informal discussion about stability and convergence properties is presented and explored in the examples.

1809.07412 2026-06-04 cs.LG cs.AI cs.SY eess.SY version update

Learning, Planning, and Control in a Monolithic Neural Event Inference Architecture

Martin V. Butz, David Bilkey, Dania Humaidan, Alistair Knott, Sebastian Otte

Affiliations * Cognitive Modeling Group, Computer Science Department, University of Tübingen

AI summary: This work presents REPRISE, a monolithic neural event inference architecture that learns temporal event-predictive models of dynamical systems and combines retrospective and prospective inference to predict and control sensorimotor dynamics effectively.

Comments This is the final revision submitted to the Neural Networks journal. The revision mainly includes improvements in language, explanation, and additional references and system relations

Abstract

We introduce REPRISE, a REtrospective and PRospective Inference SchEme, which learns temporal event-predictive models of dynamical systems. REPRISE infers the unobservable contextual event state and accompanying temporal predictive models that best explain the recently encountered sensorimotor experiences retrospectively. Meanwhile, it optimizes upcoming motor activities prospectively in a goal-directed manner. Here, REPRISE is implemented by a recurrent neural network (RNN), which learns temporal forward models of the sensorimotor contingencies generated by different simulated dynamic vehicles. The RNN is augmented with contextual neurons, which enable the encoding of distinct, but related, sensorimotor dynamics as compact event codes. We show that REPRISE concurrently learns to separate and approximate the encountered sensorimotor dynamics: it analyzes sensorimotor error signals adapting both internal contextual neural activities and connection weight values. Moreover, we show that REPRISE can exploit the learned model to induce goal-directed, model-predictive control, that is, approximate active inference: Given a goal state, the system imagines a motor command sequence optimizing it with the prospective objective to minimize the distance to the goal. The RNN activities thus continuously imagine the upcoming future and reflect on the recent past, optimizing the predictive model, the hidden neural state activities, and the upcoming motor activities. As a result, event-predictive neural encodings develop, which allow the invocation of highly effective and adaptive goal-directed sensorimotor control.

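1803.10986 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

Construction of the similarity matrix for the spectral clustering method: numerical experiments

Paola Favati, Grazia Lotti, Ornella Menchi, Francesco Romani

Affiliations * IIT - CNR; Dipartimento di Scienze Matematiche, Fisiche e Informatiche, University of Parma; Dipartimento di Informatica, University of Pisa

AI summary: This paper studies the construction of the similarity matrix for spectral clustering, considering sparsity and the choice of the scale parameter σ based directly on the graph associated with the dataset or on its minimum spanning tree (MST), and compares the methods in numerical experiments on artificial and real datasets.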

Comments Submitted to Journal of Computational and Applied Mathematics

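1904.10597 2026-06-04 eess.SY cs.LG cs.SY version update

Autonomous Voltage Control for Grid Operation Using Deep Reinforcement Learning

Ruisheng Diao, Zhiwei Wang, Di Shi, Qianyun Chang, Jiajun Duan, Xiaohu Zhang

Affiliations * GEIRI North America; State Grid Corporation of China

AI summary: This paper presents Grid Mind, a framework for autonomous grid control using deep reinforcement learning. It addresses the challenges that the dynamics of renewable energy and demand response pose for traditional methods, and aims to improve the security and economy of grid operation.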

Comments To be published (Accepted) in: Proceedings of the Power and Energy Society General Meeting (PESGM), Atlanta, GA, 2019

Abstract

Modern power grids are experiencing grand challenges caused by the stochastic and dynamic nature of growing renewable energy and demand response. Traditional theoretical assumptions and operational rules may be violated, which are difficult to be adapted by existing control systems due to the lack of computational power and accurate grid models for use in real time, leading to growing concerns in the secure and economic operation of the power grid. Existing operational control actions are typically determined offline, which are less optimized. This paper presents a novel paradigm, Grid Mind, for autonomous grid operational controls using deep reinforcement learning. The proposed AI agent for voltage control can learn its control policy through interactions with massive offline simulations, and adapts its behavior to new changes including not only load/generation variations but also topological changes. A properly trained agent is tested on the IEEE 14-bus system with tens of thousands of scenarios, and promising performance is demonstrated in applying autonomous voltage controls for secure grid operation.

1809.00846 2026-06-04 cs.LG cs.CV cs.SY eess.SY stat.ML version update

Towards Understanding Regularization in Batch Normalization

Ping Luo, Xinjiang Wang, Wenqi Shao, Zhanglin Peng

Affiliations: The Chinese University of Hong Kong; SenseTime Research; The University of Hong Kong

AI summary: Through theoretical analysis, this paper examines the convergence and generalization of neural network training with batch normalization, reveals its role as an implicit regularizer, and verifies its regularization properties experimentally in convolutional neural networks.

Comments International Conference on Learning Representations (ICLR)

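1904.09656 2026-06-04 cs.LG cs.NA math.NA version update

Solution of Definite Integrals using Functional Link Artificial Neural Networks

Satyasaran Changdar, Snehangshu Bhattacharjee

Affiliations: Department of Information Technology, Institute of Engineering and Management

AI summary: This paper proposes a new method for solving definite integrals with artificial neural networks. By minimizing a carefully designed error function, it builds a neural network that serves as an alternative to traditional numerical methods and is particularly suited to integrating high-order polynomials.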

Comments 14 pages, 7 figures

1903.05803 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

On Applications of Bootstrap in Continuous Space Reinforcement Learning

Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

AI summary: This paper studies the square-root scaling of the regret of bootstrap-based policies in decision-making problems with continuous state and action spaces, and examines how accurately the model dynamics are learned.

1903.03712 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

Adaptive Power System Emergency Control using Deep Reinforcement Learning

Qiuhua Huang, Renke Huang, Weituo Hao, Jie Tan, Rui Fan, Zhenyu Huang

Affiliations: Pacific Northwest National Laboratory; Battelle; U.S. Department of Energy; Deep Science laboratory; Google Brain

AI summary: This paper proposes an adaptive power system emergency control method based on deep reinforcement learning. It uses high-dimensional feature extraction and non-linear generalization to cope with the uncertainties and variations of modern grids, and demonstrates excellent performance and robustness for generator dynamic braking and under-voltage load shedding.

Comments 12 pages

Abstract

Power system emergency control is generally regarded as the last safety net for grid security and resiliency. Existing emergency control schemes are usually designed off-line based on either the conceived "worst" case scenario or a few typical operation scenarios. These schemes are facing significant adaptiveness and robustness issues as increasing uncertainties and variations occur in modern electrical grids. To address these challenges, for the first time, this paper developed novel adaptive emergency control schemes using deep reinforcement learning (DRL), by leveraging the high-dimensional feature extraction and non-linear generalization capabilities of DRL for complex power systems. Furthermore, an open-source platform named RLGC has been designed for the first time to assist the development and benchmarking of DRL algorithms for power system control. Details of the platform and DRL-based emergency control schemes for generator dynamic braking and under-voltage load shedding are presented. Extensive case studies performed in both two-area four-machine system and IEEE 39-Bus system have demonstrated the excellent performance and robustness of the proposed schemes.

1712.01975 2026-06-04 cs.LG cs.NA math.NA math.OC version update

Regularization and feature selection for large dimensional data

Nand Sharma, Prathamesh Verlekar, Rehab Ashary, Sui Zhiquan

Affiliations: Department of Mathematics, Colorado State University; Department of Computer Science, Colorado State University

AI summary: This paper studies five embedded feature selection methods that regularize with ridge regression, Lasso regression, or a combination of the two, in order to shrink the feature space of high-dimensional data and improve classification performance.

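1904.08361 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML version update

Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems

Ran Wang, Karthikeya Parunandi, Dan Yu, Dileep Kalathil, Suman Chakravorty

Affiliations: College of Astronautics, Nanjing University; Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA

AI summary: This paper proposes a decoupled data-based approach for learning to control nonlinear stochastic dynamical systems with continuous state space, continuous action space and unknown dynamics. The decoupled open loop - closed loop approach first solves an open-loop deterministic trajectory optimization problem using a black-box simulation model, then develops a closed-loop control by linearizing the dynamics about this nominal trajectory, which allows a linear quadratic regulator based algorithm to be used. The method is shown to be approximately optimal and, in simulation, to train significantly faster than other state-of-the-art algorithms.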

Abstract

This paper addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system with continuous state space, continuous action space and unknown dynamics. This class of problems is typically addressed in the stochastic adaptive control and reinforcement learning literature using model-based and model-free approaches respectively. Both methods rely on solving a dynamic programming problem, either directly or indirectly, for finding the optimal closed loop control policy. The inherent 'curse of dimensionality' associated with the dynamic programming method makes these approaches also computationally difficult. This paper proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled, 'open loop - closed loop', approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, a closed loop control is developed around this open loop trajectory by linearization of the dynamics about this nominal trajectory. By virtue of linearization, a linear quadratic regulator based algorithm can be used for this closed loop control. We show that the performance of the D2C algorithm is approximately optimal. Moreover, simulation performance suggests a significant reduction in training time compared to other state-of-the-art algorithms.

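1904.07200 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

A Discussion on Solving Partial Differential Equations using Neural Networks

Tim Dockhorn

Affiliations: Department of Applied Mathematics; University of Waterloo

AI summary: This paper discusses the ability of neural networks to solve partial differential equations. Numerical experiments show that small neural networks can learn complex solutions accurately. The paper analyzes how random weight initialization affects solution quality and discusses the choice of loss function, the strengths and weaknesses of neural networks relative to classical numerical methods, and directions for future research.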

Comments 9 pages, 2 figures

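1812.03467 2026-06-04 math.NA cs.LG cs.MS cs.NA math.OC version update

A note on solving nonlinear optimization problems in variable precision

S. Gratton, Ph. L. Toint

Affiliations: NAXYS, University of Namur

AI summary: This paper proposes an efficient variant of the trust-region algorithm for high-performance computing, which uses multi-precision computation to reduce the energy cost of evaluating the objective function and gradient.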

Comments 11 pages, 2 figures

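1904.05814 2026-06-04 cs.CV cs.GR cs.LG cs.NA cs.RO math.NA version update

Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope

Tolga Birdal, Umut Şimşekli

AI summary: This paper proposes a new geometric and probabilistic approach to synchronizing correspondences across multiple object or image collections. Its core components are a Birkhoff-Riemannian L-BFGS that optimizes the relaxed cycle-consistency loss, and a Birkhoff-Riemannian Langevin Monte Carlo that generates samples on the Birkhoff polytope and estimates the confidence of the solution.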

Comments To appear as oral presentation at CVPR 2019. 20 pages including the supplementary material

Learning to Control Highly Accelerated Ballistic Movements on Muscular Robots

Dieter Büchler, Roberto Calandra, Jan Peters

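Affiliations: Max Planck Institute for Intelligent Systems; Facebook AI Research

AI summary: This paper studies how learning can improve control accuracy for high-speed, high-acceleration movements on muscular robots. It presents a robot arm with four degrees of freedom, actuated by pneumatic artificial muscles, that reaches high joint angle accelerations. Control parameters are tuned with Bayesian optimization directly on the hardware, giving better results on fast trajectories than previously reported.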

Comments 12 pages, preprint submitted to Journal of Robotics and Autonomous Systems

Abstract

High-speed and high-acceleration movements are inherently hard to control. Applying learning to the control of such motions on anthropomorphic robot arms can improve the accuracy of the control but might damage the system. The inherent exploration of learning approaches can lead to instabilities and the robot reaching joint limits at high speeds. Having hardware that enables safe exploration of high-speed and high-acceleration movements is therefore desirable. To address this issue, we propose to use robots actuated by Pneumatic Artificial Muscles (PAMs). In this paper, we present a four degrees of freedom (DoFs) robot arm that reaches high joint angle accelerations of up to 28000 deg/s^2 while avoiding dangerous joint limits thanks to the antagonistic actuation and limits on the air pressure ranges. With this robot arm, we are able to tune control parameters using Bayesian optimization directly on the hardware without additional safety considerations. The achieved tracking performance on a fast trajectory exceeds previous results on comparable PAM-driven robots. We also show that our system can be controlled well on slow trajectories with PID controllers due to careful construction considerations such as minimal bending of cables, lightweight kinematics and minimal contact between PAMs and PAMs with the links. Finally, we propose a novel technique to control the co-contraction of antagonistic muscle pairs. Experimental results illustrate that choosing the optimal co-contraction level is vital to reach better tracking performance. Through the use of PAM-driven robots and learning, we take a small step towards the future development of robots capable of more human-like motions.

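1803.08552 2026-06-04 eess.SY cs.LG cs.SY version update

Linear model predictive safety certification for learning-based control

Kim P. Wabersich, Melanie N. Zeilinger

Affiliations: ETH Zurich

AI summary: This paper proposes a model predictive safety certification (MPSC) scheme for polytopic linear systems with additive disturbances, addressing the lack of safety guarantees in learning-based controllers. It relies on an MPC that provides a feasible trajectory towards a safe target set, and proposes a practical data-driven design procedure based on scenario optimization.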

Abstract

While it has been repeatedly shown that learning-based controllers can provide superior performance, they often lack safety guarantees. This paper aims at addressing this problem by introducing a model predictive safety certification (MPSC) scheme for polytopic linear systems with additive disturbances. The scheme verifies safety of a proposed learning-based input and modifies it as little as necessary in order to keep the system within a given set of constraints. Safety is thereby related to the existence of a model predictive controller (MPC) providing a feasible trajectory towards a safe target set. A robust MPC formulation accounts for the fact that the model is generally uncertain in the context of learning, which allows proving constraint satisfaction at all times under the proposed MPSC strategy. The MPSC scheme can be used in order to expand any potentially conservative set of safe states for learning and we prove an iterative technique for enlarging the safe set. Finally, a practical data-based design procedure for MPSC is proposed using scenario optimization.

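1904.02765 2026-06-04 cs.RO cs.LG cs.SY eess.SY math.OC version update

Intent-Aware Probabilistic Trajectory Estimation for Collision Prediction with Uncertainty Quantification

Andrew Patterson, Arun Lakshmanan, Naira Hovakimyan

Affiliations: Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign

AI summary: This paper proposes a Gaussian process based probabilistic trajectory estimation method for predicting collisions in uncertain environments. It replaces deterministic assumptions with a probabilistic treatment so that a larger class of obstacles can be considered, and shows in case studies that collisions can be predicted with limited knowledge of obstacle behavior.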

Abstract

Collision prediction in a dynamic and unknown environment relies on knowledge of how the environment is changing. Many collision prediction methods rely on deterministic knowledge of how obstacles are moving in the environment. However, complete deterministic knowledge of the obstacles' motion is often unavailable. This work proposes a Gaussian process based prediction method that replaces the assumption of deterministic knowledge of each obstacle's future behavior with probabilistic knowledge, to allow a larger class of obstacles to be considered. The method solely relies on position and velocity measurements to predict collisions with dynamic obstacles. We show that the uncertainty region for obstacle positions can be expressed in terms of a combination of polynomials generated with Gaussian process regression. To control the growth of uncertainty over arbitrary time horizons, a probabilistic obstacle intention is assumed as a distribution over obstacle positions and velocities, which can be naturally included in the Gaussian process framework. Our approach is demonstrated in two case studies in which (i) an obstacle overtakes the agent and (ii) an obstacle crosses the agent's path perpendicularly. In these simulations we show that the collision can be predicted despite having limited knowledge of the obstacle's behavior.

1708.01945 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication

Miles E. Lopes, Shusen Wang, Michael W. Mahoney

Affiliations: Department of Statistics, University of California at Davis; Department of Computer Science, Stevens Institute of Technology; International Computer Science Institute and Department of Statistics, University of California at Berkeley

AI summary: This paper proposes a bootstrap method for directly estimating the accuracy of randomized matrix multiplication (sketching), which serves as a prototype setting for more general problems. The method does not substantially increase the cost of standard sketching methods, thanks to an extrapolation technique, and is supported by both theoretical and empirical results.

Journal ref: Journal of Machine Learning Research, 20(39): 1-40, 2019

Abstract

In recent years, randomized methods for numerical linear algebra have received growing interest as a general approach to large-scale problems. Typically, the essential ingredient of these methods is some form of randomized dimension reduction, which accelerates computations, but also creates random approximation error. In this way, the dimension reduction step encodes a tradeoff between cost and accuracy. However, the exact numerical relationship between cost and accuracy is typically unknown, and consequently, it may be difficult for the user to precisely know (1) how accurate a given solution is, or (2) how much computation is needed to achieve a given level of accuracy. In the current paper, we study randomized matrix multiplication (sketching) as a prototype setting for addressing these general problems. As a solution, we develop a bootstrap method for directly estimating the accuracy as a function of the reduced dimension (as opposed to deriving worst-case bounds on the accuracy in terms of the reduced dimension). From a computational standpoint, the proposed method does not substantially increase the cost of standard sketching methods, and this is made possible by an "extrapolation" technique. In addition, we provide both theoretical and empirical results to demonstrate the effectiveness of the proposed method.
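The bootstrap idea in this abstract can be illustrated with a short sketch. This is not the authors' implementation: it assumes uniform row sampling as the sketch and the entrywise max norm as the error measure, and it omits the extrapolation step. All function names are hypothetical.

```python
import math
import random

def sketch_rows(A, B, t, rng):
    """Uniformly sample t rows (with replacement) from A and B, rescaled
    so that As^T Bs is an unbiased estimate of A^T B (assumed sketch)."""
    n = len(A)
    scale = math.sqrt(n / t)
    idx = [rng.randrange(n) for _ in range(t)]
    As = [[scale * x for x in A[i]] for i in idx]
    Bs = [[scale * x for x in B[i]] for i in idx]
    return As, Bs

def gram(X, Y):
    """Return X^T Y for matrices stored as lists of rows."""
    d, e = len(X[0]), len(Y[0])
    out = [[0.0] * e for _ in range(d)]
    for x, y in zip(X, Y):
        for j in range(d):
            for k in range(e):
                out[j][k] += x[j] * y[k]
    return out

def max_abs_diff(P, Q):
    """Entrywise max-norm distance between two equally shaped matrices."""
    return max(abs(p - q) for rp, rq in zip(P, Q) for p, q in zip(rp, rq))

def bootstrap_error_quantile(As, Bs, n_boot=200, alpha=0.05, rng=None):
    """Resample the sketched rows and return the (1 - alpha) quantile of the
    deviation between the resampled product and the sketched product."""
    rng = rng or random.Random(0)
    t = len(As)
    base = gram(As, Bs)
    errs = []
    for _ in range(n_boot):
        idx = [rng.randrange(t) for _ in range(t)]
        boot = gram([As[i] for i in idx], [Bs[i] for i in idx])
        errs.append(max_abs_diff(boot, base))
    errs.sort()
    k = min(n_boot - 1, math.ceil((1 - alpha) * n_boot) - 1)
    return errs[k]
```

The returned quantile is used as a data-driven stand-in for the unknown error of the sketched product. It needs only the sketched matrices, not the full ones, which is why the bootstrap adds little cost in this sketch.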

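1904.01855 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML version update

A Stochastic Interpretation of Stochastic Mirror Descent: Risk-Sensitive Optimality

Navid Azizan, Babak Hassibi

Affiliations: California Institute of Technology

AI summary: This paper shows that stochastic mirror descent (SMD) is a risk-sensitive optimal estimator when the unknown weight vector and additive noise are non-Gaussian and belong to the exponential family, and introduces a modified version called symmetric SMD (SSMD).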

Abstract

Stochastic mirror descent (SMD) is a fairly new family of algorithms that has recently found a wide range of applications in optimization, machine learning, and control. It can be considered a generalization of the classical stochastic gradient algorithm (SGD), where instead of updating the weight vector along the negative direction of the stochastic gradient, the update is performed in a "mirror domain" defined by the gradient of a (strictly convex) potential function. This potential function, and the mirror domain it yields, provides considerable flexibility in the algorithm compared to SGD. While many properties of SMD have already been obtained in the literature, in this paper we exhibit a new interpretation of SMD, namely that it is a risk-sensitive optimal estimator when the unknown weight vector and additive noise are non-Gaussian and belong to the exponential family of distributions. The analysis also suggests a modified version of SMD, which we refer to as symmetric SMD (SSMD). The proofs rely on some simple properties of Bregman divergence, which allow us to extend results from quadratics and Gaussians to certain convex functions and exponential families in a rather seamless way.
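The mirror-domain update described in this abstract can be made concrete with a small sketch. It assumes, purely for illustration, the potential ψ(w) = (1/q) Σ|w_i|^q with q > 1, whose gradient is sign(w_i)|w_i|^(q-1). The function names are hypothetical, and the paper's SSMD variant is not shown.

```python
import math

def mirror_map(w, q):
    """Gradient of the assumed potential psi(w) = sum(|w_i|^q) / q."""
    return [math.copysign(abs(x) ** (q - 1), x) for x in w]

def inverse_mirror_map(z, q):
    """Inverse of mirror_map: maps mirror-domain values back to weights."""
    return [math.copysign(abs(v) ** (1.0 / (q - 1)), v) for v in z]

def smd_step(w, grad, lr, q=2.0):
    """One mirror descent step: take the gradient step in the mirror
    domain, then map back. With q = 2 this is the plain SGD update."""
    z = mirror_map(w, q)
    z = [zi - lr * gi for zi, gi in zip(z, grad)]
    return inverse_mirror_map(z, q)
```

With `q=2.0` the mirror map is the identity, so `smd_step` reduces to `w - lr * grad`. Other values of `q` change the geometry of the update, which is the flexibility the abstract refers to.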

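1904.01214 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML version update

Enhancement of Energy-Based Swing-Up Controller via Entropy Search

Chang Sik Lee, Dong Eui Chang

Affiliations: School of Electrical Engineering, KAIST, Daejeon, Korea

AI summary: This paper uses Bayesian optimization with entropy search to improve an energy-based controller for swinging up a rotary inverted pendulum (Furuta pendulum). Experiments show that the resulting controller outperforms a conventional controller under various initial conditions.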

Comments 6 pages, 2019 Asian Control Conference

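1903.09698 2026-06-04 math.NA cs.LG cs.NA version update

CUR Decompositions, Approximations, and Perturbations

Keaton Hamm, Longxiu Huang

Affiliations: Department of Mathematics, University of Arizona, Tucson, AZ 85719 USA; Department of Mathematics, Vanderbilt University, Nashville, TN 37240 USA

AI summary: This paper discusses CUR decompositions for dimensionality reduction and low-rank matrix approximation. It surveys and compares viewpoints from the literature, gives a new characterization of exact CUR decompositions, and presents a novel perturbation analysis of CUR approximations of noisy low-rank matrices. It also gives new column and row sampling results showing that a CUR decomposition of a low-rank matrix is attained with high probability, shows that these sampling methods are stable under the perturbations studied, and provides numerical illustrations of the methods and bounds.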

Comments 40 pages

1707.09198 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.OC version update

Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era

Chao Ning, Fengqi You

Affiliations: Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University

AI summary: This paper proposes a data-driven stochastic robust optimization framework. Building on a data-driven uncertainty model, it uses a bi-level optimization structure that combines two-stage stochastic programming with adaptive robust optimization to handle optimization under uncertainty in the big data era.

DOI: 10.1016/j.compchemeng.2017.12.015
Journal ref: Computers & Chemical Engineering, Volume 111, Pages 115-133, 4 March 2018

Abstract

A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected from various conditions, which are encoded by class labels. Machine learning methods including Dirichlet process mixture model and maximum likelihood estimation are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm.

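1810.04351 2026-06-04 math.AP cs.LG cs.NA math.NA math.PR version update

Properly-weighted graph Laplacian for semi-supervised learning

Jeff Calder, Dejan Slepcev

Affiliations: Department of Mathematics, University of Minnesota; Department of Mathematical Sciences, Carnegie Mellon University

AI summary: This paper proposes a properly-weighted graph Laplacian to address the degradation of traditional semi-supervised learning methods as the ratio of labeled to unlabeled data decreases. By setting the weights in Laplacian regularization correctly, the estimator remains well behaved and stable in the large-sample limit, and the method is proved to converge to a smooth solution of a continuum variational problem in the infinite-sample limit.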

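1904.00035 2026-06-04 cs.RO cs.LG cs.SY eess.SY stat.ML version update

Autonomous Highway Driving using Deep Reinforcement Learning

Subramanya Nageshrao, Eric Tseng, Dimitar Filev

Affiliations: Ford Greenfield Labs; Ford Research and Innovation Center

AI summary: This paper proposes a reinforcement learning based method in which an autonomous vehicle learns to make decisions in complex, varying environments by interacting directly with simulated traffic. It addresses the shortcomings of rule-based decision makers and of a-priori cost functions optimized in real time, and improves learning efficiency and safety.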

Abstract

The operational space of an autonomous vehicle (AV) can be diverse and vary significantly. This may lead to a scenario that was not postulated in the design phase. Due to this, formulating a rule based decision maker for selecting maneuvers may not be ideal. Similarly, it may not be effective to design an a-priori cost function and then solve the optimal control problem in real-time. In order to address these issues and to avoid peculiar behaviors when encountering an unforeseen scenario, we propose a reinforcement learning (RL) based method, where the ego car, i.e., an autonomous vehicle, learns to make decisions by directly interacting with simulated traffic. The decision maker for the AV is implemented as a deep neural network providing an action choice for a given system state. In a critical application such as driving, an RL agent without an explicit notion of safety may not converge or it may need an extremely large number of samples before finding a reliable policy. To best address the issue, this paper incorporates reinforcement learning with an additional short horizon safety check (SC). In a critical scenario, the safety check will also provide an alternate safe action to the agent, provided it exists. This leads to two novel contributions. First, it generalizes the states that could lead to undesirable "near-misses" or "collisions". Second, inclusion of the safety check can provide a safe and stable training environment. This significantly enhances learning efficiency without inhibiting meaningful exploration to ensure safe and optimal learned behavior. We demonstrate the performance of the developed algorithm in a highway driving scenario where the trained AV encounters varying traffic density in a highway setting.
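The short-horizon safety check described in this abstract can be pictured with a toy example. The sketch below is an assumption-laden illustration, not the paper's algorithm: it uses a one-dimensional car-following model with a constant-speed lead vehicle, a fixed minimum gap, and a discrete set of accelerations. All names and parameter values are hypothetical.

```python
def is_safe(gap, v_ego, v_lead, accel, horizon=10, dt=0.1, min_gap=5.0):
    """Roll the ego vehicle forward under a constant acceleration and check
    that the gap to the lead vehicle never drops below min_gap."""
    for _ in range(horizon):
        v_ego = max(0.0, v_ego + accel * dt)
        gap += (v_lead - v_ego) * dt
        if gap < min_gap:
            return False
    return True

def safe_action(proposed, actions, gap, v_ego, v_lead):
    """Return the proposed action if it passes the check; otherwise return
    the safe action closest to it, or the hardest braking if none is safe."""
    if is_safe(gap, v_ego, v_lead, proposed):
        return proposed
    safe = [a for a in actions if is_safe(gap, v_ego, v_lead, a)]
    if safe:
        return min(safe, key=lambda a: abs(a - proposed))
    return min(actions)
```

In a training loop, the action chosen by the RL policy would be passed through `safe_action` before being applied to the simulator, so exploration continues while actions that fail the check are replaced.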

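1808.10788 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

Data-driven discovery of PDEs in complex datasets

Jens Berg, Kaj Nyström

Affiliations: Department of Mathematics, Uppsala University

AI summary: Using machine learning, this paper discovers hidden partial differential equations in complex datasets. It shows how transformations of the data and feature selection help reveal the PDE governing a physical process, and validates the approach on a non-linear second-order PDE and on simulations of the Swedish temperature distribution.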

DOI: 10.1016/j.jcp.2019.01.036

Abstract

Many processes in science and engineering can be described by partial differential equations (PDEs). Traditionally, PDEs are derived by considering first principles of physics to derive the relations between the involved physical quantities of interest. A different approach is to measure the quantities of interest and use deep learning to reverse engineer the PDEs which are describing the physical process. In this paper we use machine learning, and deep learning in particular, to discover PDEs hidden in complex data sets from measurement data. We include examples of data from a known model problem, and real data from weather station measurements. We show how necessary transformations of the input data amount to coordinate transformations in the discovered PDE, and we elaborate on feature and model selection. It is shown that the dynamics of a non-linear, second order PDE can be accurately described by an ordinary differential equation which is automatically discovered by our deep learning algorithm. Even more interestingly, we show that similar results apply in the context of more complex simulations of the Swedish temperature distribution.

1810.08754 2026-06-04 math.NA cs.LG cs.NA eess.SP version update

BCR-Net: a neural network based on the nonstandard wavelet form

Yuwei Fan, Cindy Orozco Bohorquez, Lexing Ying

Affiliations: Department of Mathematics, Stanford University; Institute for Computational and Mathematical Engineering, Stanford University; Department of Mathematics and ICME, Stanford University; Facebook AI Research, Menlo Park, CA

AI summary: This paper proposes a neural network architecture based on the nonstandard wavelet form. The matrix-vector product algorithm of the nonstandard form is represented as a linear neural network in which every scale of the multiresolution computation is carried out by a locally connected linear sub-network. Replacing the linear sub-networks with deeper and more powerful nonlinear ones extends the architecture to nonlinear problems.

Comments 17 pages and 9 figures

DOI: 10.1016/j.jcp.2019.02.002

Abstract

This paper proposes a novel neural network architecture inspired by the nonstandard form proposed by Beylkin, Coifman, and Rokhlin in [Communications on Pure and Applied Mathematics, 44(2), 141-183]. The nonstandard form is a highly effective wavelet-based compression scheme for linear integral operators. In this work, we first represent the matrix-vector product algorithm of the nonstandard form as a linear neural network where every scale of the multiresolution computation is carried out by a locally connected linear sub-network. In order to address nonlinear problems, we propose an extension, called BCR-Net, by replacing each linear sub-network with a deeper and more powerful nonlinear one. Numerical results demonstrate the efficiency of the new architecture by approximating nonlinear maps that arise in homogenization theory and stochastic computation.

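1903.10343 2026-06-04 eess.SY cs.LG cs.SY version update

Sample Complexity Lower Bounds for Linear System Identification

Yassir Jedra, Alexandre Proutiere

AI summary: This paper derives problem-specific sample complexity lower bounds for linear system identification. The sample complexity is defined in the PAC framework as the time needed to identify the system parameters with prescribed accuracy and confidence. Both uncontrolled and controlled systems are considered. For uncontrolled systems the bound depends on the finite-time controllability gramian, and a simplified bound depending only on the system's spectrum is also derived. For controlled systems the bounds are less explicit but may offer insight into designing control policies with minimal sample complexity.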

Abstract

This paper establishes problem-specific sample complexity lower bounds for linear system identification problems. The sample complexity is defined in the PAC framework: it corresponds to the time it takes to identify the system parameters with prescribed accuracy and confidence levels. By problem-specific, we mean that the lower bound explicitly depends on the system to be identified (which contrasts with minimax lower bounds), and hence really captures the identification hardness specific to the system. We consider both uncontrolled and controlled systems. For uncontrolled systems, the lower bounds are valid for any linear system, stable or not, and only depend on the system's finite-time controllability gramian. A simplified lower bound depending on the spectrum of the system only is also derived. In view of recent finite-time analysis of classical estimation methods (e.g. ordinary least squares), our sample complexity lower bounds are tight for many systems. For controlled systems, our lower bounds are not as explicit as in the case of uncontrolled systems, but could well provide interesting insights into the design of control policies with minimal sample complexity.
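The abstract refers to ordinary least squares as a classical estimator against which the lower bounds are compared. A minimal sketch of that estimator is given below for an assumed scalar uncontrolled system x_{t+1} = a x_t + w_t with Gaussian noise. The setup and function names are illustrative assumptions, not taken from the paper.

```python
import random

def simulate(a, steps, noise_std, rng, x0=1.0):
    """Generate a trajectory of the assumed scalar system
    x_{t+1} = a * x_t + w_t with w_t ~ N(0, noise_std^2)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(a * xs[-1] + rng.gauss(0.0, noise_std))
    return xs

def ols_estimate(xs):
    """Least-squares estimate of a from one trajectory:
    argmin over a of sum_t (x_{t+1} - a * x_t)^2."""
    num = sum(x * y for x, y in zip(xs[:-1], xs[1:]))
    den = sum(x * x for x in xs[:-1])
    return num / den
```

Running `ols_estimate(simulate(a, T, sigma, rng))` for increasing trajectory lengths `T` gives an empirical view of how many samples are needed to reach a given accuracy, which is the quantity a sample complexity lower bound constrains.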

1902.06094 2026-06-04 cs.NE cs.LG cs.SY eess.SY version update

Differentiable reservoir computing

Lyudmila Grigoryeva, Juan-Pablo Ortega

Affiliations: Department of Mathematics and Statistics; Centre National de la Recherche Scientifique (CNRS)

AI summary: This paper studies the properties of reservoir computing systems under different differentiability conditions, proposes a new way to analyze the differentiability of reservoir filters, and shows its application to learning chaotic dynamical systems.

Comments 60 pages

Abstract

Much effort has been devoted in the last two decades to characterizing the situations in which a reservoir computing system exhibits the so-called echo state (ESP) and fading memory (FMP) properties. These important features amount, in mathematical terms, to the existence and continuity of global reservoir system solutions. That research is complemented in this paper with the characterization of the differentiability of reservoir filters for very general classes of discrete-time deterministic inputs. This constitutes a novel strong contribution to the long line of research on the ESP and the FMP and, in particular, links to existing research on the input-dependence of the ESP. Differentiability has been shown in the literature to be a key feature in the learning of attractors of chaotic dynamical systems. A Volterra-type series representation for reservoir filters with semi-infinite discrete-time inputs is constructed in the analytic case using Taylor's theorem, and corresponding approximation bounds are provided. Finally, it is shown as a corollary of these results that any fading memory filter can be uniformly approximated by a finite Volterra series with finite memory.

1903.09136 2026-06-04 stat.ML cs.LG cs.SY eess.SP eess.SY 版本更新

On Approximate Nonlinear Gaussian Message Passing On Factor Graphs

关于因子图上的近似非线性高斯信息传递

Eike Petersen, Christian Hoffmann, Philipp Rostalski

发表机构 * Institute for Electrical Engineering in Medicine, University of Lübeck（医学电气工程研究所，吕贝克大学）

AI总结本文提出了一种基于因子图的近似高斯信息传递规则，用于处理确定性非线性变换节点，通过数值求积和Rauch-Tung-Striebel型近似方法，为非线性问题的求解提供了新的算法框架。

详情

DOI: 10.1109/SSP.2018.8450699
Journal ref: 2018 IEEE Statistical Signal Processing Workshop (SSP)

AI中文摘要

因子图近年来作为表示和构建信号处理、估计和控制算法的统一框架而受到越来越多的关注。因子图工具包中似乎尚未得到充分探索的一项能力，是利用表格化的信息传递规则处理确定性非线性变换，例如非线性滤波和平滑问题中出现的变换。在本文中，我们基于前向传递的数值求积过程和后向传递的Rauch-Tung-Striebel型近似，为满足马尔可夫性质的任意因子图中的确定性非线性变换节点给出了通用的前向（滤波）和后向（平滑）近似高斯信息传递规则。这些信息传递规则可用于推导许多利用因子图求解非线性问题的算法，本文基于所提出的规则给出了一种非线性修正Bryson-Frazier（MBF）平滑器作为示例。

英文摘要

Factor graphs have recently gained increasing attention as a unified framework for representing and constructing algorithms for signal processing, estimation, and control. One capability that does not seem to be well explored within the factor graph tool kit is the ability to handle deterministic nonlinear transformations, such as those occurring in nonlinear filtering and smoothing problems, using tabulated message passing rules. In this contribution, we provide general forward (filtering) and backward (smoothing) approximate Gaussian message passing rules for deterministic nonlinear transformation nodes in arbitrary factor graphs fulfilling a Markov property, based on numerical quadrature procedures for the forward pass and a Rauch-Tung-Striebel-type approximation of the backward pass. These message passing rules can be employed for deriving many algorithms for solving nonlinear problems using factor graphs, as is illustrated by the proposition of a nonlinear modified Bryson-Frazier (MBF) smoother based on the presented message passing rules.

1903.09122 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Finite Sample Analysis of Stochastic System Identification

随机系统辨识的有限样本分析

Anastasios Tsiamis, George J. Pappas

发表机构 * Department of Electrical and Systems Engineering, University of Pennsylvania（宾夕法尼亚大学电气与系统工程系）

AI总结本文基于机器学习和统计学的现代工具，研究了随机系统辨识的有限样本复杂性。通过子空间辨识算法和有限数量的输出样本，提供了系统参数估计误差的非渐近高概率上界，证明了在高概率下估计误差以1/√N的速度减小。

Comments Under review

1903.08792 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

通过障碍函数实现端到端安全强化学习用于安全关键的连续控制任务

Richard Cheng, Gabor Orosz, Richard M. Murray, Joel W. Burdick

发表机构 * California Institute of Technology（加州理工学院）； University of Michigan（密歇根大学）

AI总结本文提出了一种控制器架构，结合无模型强化学习控制器、基于控制障碍函数（CBF）的模型控制器以及对未知系统动力学的在线学习，以确保学习过程中的安全性；该方法用高斯过程建模系统动力学，并在倒立摆控制和带无线车对车通信的自动跟车任务中展示了更高的样本效率和安全性。

Comments Published in AAAI 2019

详情

AI中文摘要

强化学习（RL）算法在模拟应用之外取得的成功有限，一个主要原因是学习过程中缺乏安全性保证。现实世界的系统很可能在学到最优控制器之前就已失效或损坏。为了解决这个问题，我们提出了一种控制器架构，结合（1）无模型的RL控制器、（2）利用控制障碍函数（CBFs）的基于模型的控制器以及（3）对未知系统动力学的在线学习，以确保学习过程中的安全性。我们的通用框架利用RL算法的成功来学习高性能控制器，而基于CBF的控制器通过约束可探索的策略集合，既保证安全又引导学习过程。我们利用高斯过程（GPs）来建模系统动力学及其不确定性。我们的新型控制器综合算法RL-CBF在学习过程中以高概率保证安全性，与所用的RL算法无关，并展示了更高的策略探索效率。我们在（1）倒立摆控制和（2）带无线车对车通信的自动跟车任务上测试了我们的算法，结果表明我们的算法在学习中的样本效率远高于其他最先进的算法，并在整个学习过程中保持安全。

英文摘要

Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real-world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
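
The idea of a CBF-based controller constraining what a learned policy may do can be illustrated on a toy problem. The sketch below is not the paper's RL-CBF algorithm: it assumes a hypothetical one-dimensional system x_dot = u with known dynamics, a safe set h(x) = x_max - x >= 0, and a fixed stand-in for the RL action. The CBF condition h_dot >= -alpha * h then reduces to u <= alpha * (x_max - x), so the safety filter has a closed form. All names and parameter values are illustrative assumptions.

```python
def cbf_safety_filter(u_rl, x, x_max, alpha=1.0):
    """Minimally modify u_rl so that u <= alpha * (x_max - x) holds (toy 1-D CBF condition)."""
    return min(u_rl, alpha * (x_max - x))

def simulate(u_rl=2.0, x0=0.0, x_max=1.0, dt=0.05, steps=200):
    """Euler-integrate x_dot = u while filtering a constant, overly aggressive action."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        u = cbf_safety_filter(u_rl, x, x_max)
        x = x + dt * u
        trajectory.append(x)
    return trajectory

trajectory = simulate()
```

With `alpha * dt <= 1` the distance to the boundary shrinks by at most a factor `1 - alpha * dt` per step in this discretization, so the state approaches `x_max` without crossing it, while actions that already satisfy the constraint pass through unchanged.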

1803.02099 2026-06-04 cs.LG cs.SY eess.SY 版本更新

A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning

一种用于交通流预测的混合方法：使用多模态深度学习

Shengdong Du, Tianrui Li, Xun Gong, Shi-Jinn Horng

发表机构 * School of Information Science and Technology, National Engineering Laboratory of Integrated Transportation Big Data Application Technology（信息科学与技术学院，综合交通大数据应用技术国家工程实验室）； Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology（台湾科技大学资讯工程系）

AI总结本文提出了一种混合多模态深度学习方法，用于短期交通流预测，通过注意力辅助多模态深度学习架构联合和自适应学习多模态交通数据的空间时间相关特征和长期时间依赖性。

详情

AI中文摘要

交通流预测被视为智能交通系统的关键问题。在本工作中，我们提出了一种混合多模态深度学习方法，用于短期交通流预测，该方法通过注意力辅助多模态深度学习架构，联合和自适应地学习多模态交通数据的空间时间相关特征和长期时间依赖性。根据多模态交通数据的强非线性特征，我们方法的基础模块由一维卷积神经网络（1D CNN）和门控循环单元（GRU）组成，其中前者用于捕捉局部趋势特征，后者用于捕捉长期时间依赖性。然后，我们设计了一个混合多模态深度学习框架（HMDLF），通过多个CNN-GRU-Attention模块融合不同模态交通数据的共享表示特征。实验结果表明，所提出的多模态深度学习模型能够有效处理复杂的非线性城市交通流预测，并具有满意的准确性和有效性。

英文摘要

Traffic flow forecasting has been regarded as a key problem of intelligent transport systems. In this work, we propose a hybrid multimodal deep learning method for short-term traffic flow forecasting, which can jointly and adaptively learn the spatial-temporal correlation features and long temporal interdependence of multi-modality traffic data by an attention auxiliary multimodal deep learning architecture. According to the highly nonlinear characteristics of multi-modality traffic data, the base module of our method consists of one-dimensional Convolutional Neural Networks (1D CNN) and Gated Recurrent Units (GRU) with the attention mechanism. The former captures the local trend features and the latter captures the long temporal dependencies. Then, we design a hybrid multimodal deep learning framework (HMDLF) for fusing shared representation features of different modality traffic data by multiple CNN-GRU-Attention modules. The experimental results indicate that the proposed multimodal deep learning model is capable of dealing with complex nonlinear urban traffic flow forecasting with satisfactory accuracy and effectiveness.

1903.05196 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

A Review of Reinforcement Learning for Autonomous Building Energy Management

自主建筑能源管理中强化学习的综述

Karl Mason, Santiago Grijalva

发表机构 * School of Electrical and Computer Engineering（电气与计算机工程学院）； Georgia Institute of Technology（佐治亚理工学院）

AI总结本文综述了强化学习在自主建筑能源管理系统中的应用，总结了相关文献，并探讨了未来研究方向和挑战。

Comments 17 pages, 3 figures

1903.01032 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

A Fundamental Performance Limitation for Adversarial Classification

对抗分类中的基本性能限制

Abed AlRahman Al Makdah, Vaibhav Katewa, Fabio Pasqualetti

AI总结本文研究了对抗分类中的基本性能限制，指出在优化准确率的过程中，二分类算法不可避免地会变得更加敏感于数据的对抗操纵，并且准确率与敏感度之间的根本权衡曲线仅取决于数据的统计特性，无法通过调整算法来改进。

1903.05817 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A New Approach for Distributed Hypothesis Testing with Extensions to Byzantine-Resilience

分布式假设检验的一种新方法及其对拜占庭容错的扩展

Aritra Mitra, John A. Richards, Shreyas Sundaram

发表机构 * School of Electrical and Computer Engineering at Purdue University（普渡大学电气与计算机工程学院）； Sandia National Laboratories（桑迪亚国家实验室）

AI总结本文提出了一种新的分布式学习规则，用于在时间序列中联合观察资料下学习真实的状态，该方法不采用信念平均，且能扩展到处理网络中某些代理的恶意行为。

Comments To appear in the Proceedings of the American Control Conference, 2019

详情

AI中文摘要

我们研究了一个场景，其中一组代理各自接收部分信息的私人观察，试图协作学习能够解释他们随时间变化的联合观察资料的真实状态（在一组假设中）。为了解决这个问题，我们提出了一种分布式学习规则，与现有方法不同，它不采用任何形式的“信念平均”。具体来说，每个代理维护一个本地信念（对每个假设），该信念以贝叶斯方式更新，不受网络影响，同时维护一个实际信念，该信念在更新（除归一化外）时是其自身本地信念和邻居实际信念的最小值。在对代理信号结构和底层通信图的最小要求下，我们建立了所提出信念更新规则的一致性，即我们证明了代理的实际信念几乎必然渐近地集中在真实状态上。作为我们方法的一个关键好处，我们展示了我们的学习规则可以扩展到捕捉网络中某些代理的恶意行为，通过拜占庭对手模型。特别是，我们证明在适当的观察模型和网络拓扑条件下，每个非恶意代理几乎必然渐近地学习世界的真实状态。

英文摘要

We study a setting where a group of agents, each receiving partially informative private observations, seek to collaboratively learn the true state (among a set of hypotheses) that explains their joint observation profiles over time. To solve this problem, we propose a distributed learning rule that differs fundamentally from existing approaches, in the sense that it does not employ any form of "belief-averaging". Specifically, every agent maintains a local belief (on each hypothesis) that is updated in a Bayesian manner without any network influence, and an actual belief that is updated (up to normalization) as the minimum of its own local belief and the actual beliefs of its neighbors. Under minimal requirements on the signal structures of the agents and the underlying communication graph, we establish consistency of the proposed belief update rule, i.e., we show that the actual beliefs of the agents asymptotically concentrate on the true state almost surely. As one of the key benefits of our approach, we show that our learning rule can be extended to scenarios that capture misbehavior on the part of certain agents in the network, modeled via the Byzantine adversary model. In particular, we prove that each non-adversarial agent can asymptotically learn the true state of the world almost surely, under appropriate conditions on the observation model and the network topology.
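
The two-belief update described above can be sketched in a few lines. The code is a minimal illustration under assumptions made here, not the authors' implementation: two agents, two hypotheses, binary signals, a synchronous update, and hand-picked likelihoods in which agent 0 is informative and agent 1 is not. The Byzantine-resilient extension is not covered.

```python
import numpy as np

def min_rule_step(local, actual, likelihoods, neighbors, signals):
    """One synchronous round: Bayesian local update, then min-rule actual update."""
    n_agents = local.shape[0]
    new_local = np.empty_like(local)
    for i in range(n_agents):
        # Local belief: Bayes' rule using only agent i's own signal.
        post = local[i] * likelihoods[i][:, signals[i]]
        new_local[i] = post / post.sum()
    new_actual = np.empty_like(actual)
    for i in range(n_agents):
        # Actual belief: elementwise minimum of own local belief and
        # neighbors' previous actual beliefs, then normalization.
        candidates = [new_local[i]] + [actual[j] for j in neighbors[i]]
        m = np.minimum.reduce(candidates)
        new_actual[i] = m / m.sum()
    return new_local, new_actual

rng = np.random.default_rng(0)
# likelihoods[i, theta, s] = P(agent i sees signal s | hypothesis theta); hypothetical values.
likelihoods = np.array([
    [[0.2, 0.8], [0.8, 0.2]],  # agent 0 can tell the hypotheses apart
    [[0.5, 0.5], [0.5, 0.5]],  # agent 1 cannot
])
neighbors = {0: [1], 1: [0]}
local = np.full((2, 2), 0.5)
actual = np.full((2, 2), 0.5)
true_state = 0
for _ in range(200):
    signals = [rng.choice(2, p=likelihoods[i, true_state]) for i in range(2)]
    local, actual = min_rule_step(local, actual, likelihoods, neighbors, signals)
```

In this toy run the uninformative agent's local belief never moves from uniform, yet its actual belief can still concentrate on the true state, because the minimum lets its neighbor rule out the false hypothesis on its behalf.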

1903.05355 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

A Framework for On-line Learning of Underwater Vehicles Dynamic Models

在线学习水下机器人动态模型的框架

Bilal Wehbe, Marc Hildebrandt, Frank Kirchner

发表机构 * DFKI - Robotic Innovation Center（DFKI机器人创新中心）

AI总结本文提出了一种在线学习水下机器人动态模型的框架，通过增量支持向量回归方法从数据流中逐步学习模型，并结合增量学习策略来改进模型在整体状态空间上的泛化能力。

Comments 8 pages, 6 figures, ICRA 2019 authors preprint

1903.04958 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Real-Time Boiler Control Optimization with Machine Learning

基于机器学习的实时锅炉控制优化

Yukun Ding, Yiyu Shi

发表机构 * University of Notre Dame（圣母大学）

AI总结本文提出利用机器学习优化燃煤电厂锅炉实时控制，通过优化不同区域的温度分布和炉膛氧含量，提高锅炉稳定性与能源效率。

Comments To appear in TC-CPS Newsletter

1903.04681 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Estimating multi-class dynamic origin-destination demand through a forward-backward algorithm on computational graphs

通过计算图上的前向-后向算法估计多类动态起讫点（OD）需求

Wei Ma, Xidong Pi, Sean Qian

发表机构 * Department of Civil and Environmental Engineering（土木与环境工程系）； Carnegie Mellon University（卡内基梅隆大学）

AI总结本文提出了一种基于计算图的多类动态起讫点（OD）需求估计框架（MCDODE），通过前向-后向算法和基于树的累积曲线估计OD需求梯度，以解决大规模交通网络中多类时空车流估计的挑战。

Comments 31 pages, 21 figures, submitted to Transportation Research Part C: Emerging Technologies

详情

AI中文摘要

交通网络空前复杂，车流具有异质性。传统上，车辆类别按车型划分（如标准乘用车和卡车）。然而，车流的异质性通常还源于许多其他方面，例如网约车与私家车、人工驾驶车辆与网联自动驾驶车辆。在大规模交通网络中，给定每个类别车流的部分观测，如何估计多类时空车流，即时变的起讫点（OD）需求和路径/路段流量，仍是一个重大挑战。本文提出了一种适用于大规模网络的多类动态OD需求估计（MCDODE）求解框架。该框架建立在计算图之上，对时空流量以及MCDODE公式中涉及的所有中间特征采用张量表示。本文提出了一种前向-后向算法，以在计算图上高效求解MCDODE公式。此外，我们提出了基于树的累积曲线这一新概念，用于估计OD需求的梯度，并开发了Growing Tree算法来构建基于树的累积曲线。所提出的框架在一个小型网络和一个真实的大规模网络上进行了检验。实验结果表明，所提出的框架效果令人信服、结果令人满意且计算上可行。

英文摘要

Transportation networks are unprecedentedly complex with heterogeneous vehicular flow. Conventionally, vehicle classes are considered by vehicle classifications (such as standard passenger cars and trucks). However, vehicle flow heterogeneity stems from many other aspects in general, e.g., ride-sourcing vehicles versus personal vehicles, human driven vehicles versus connected and automated vehicles. Provided with some observations of vehicular flow for each class in a large-scale transportation network, how to estimate the multi-class spatio-temporal vehicular flow, in terms of time-varying Origin-Destination (OD) demand and path/link flow, remains a big challenge. This paper presents a solution framework for multi-class dynamic OD demand estimation (MCDODE) in large-scale networks. The proposed framework is built on a computational graph with tensor representations of spatio-temporal flow and all intermediate features involved in the MCDODE formulation. A forward-backward algorithm is proposed to efficiently solve the MCDODE formulation on computational graphs. In addition, we propose a novel concept of tree-based cumulative curves to estimate the gradient of OD demand. A Growing Tree algorithm is developed to construct tree-based cumulative curves. The proposed framework is examined on a small network as well as a real-world large-scale network. The experiment results indicate that the proposed framework is compelling, satisfactory and computationally plausible.

1903.03763 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

A tractable ellipsoidal approximation for voltage regulation problems

电压调节问题中的可处理椭球近似

Pan Li, Baihong Jin, Ruoxuan Xiong, Dai Wang, Alberto Sangiovanni-Vincentelli, Baosen Zhang

发表机构 * Facebook Inc.（Facebook公司）； University of Washington（华盛顿大学）； Stanford University（斯坦福大学）； Tesla Inc.（特斯拉公司）

AI总结本文提出了一种基于机器学习的方法来解决电力系统运行中电压调节问题中的机会约束优化问题，通过用椭球近似不确定性可行区域，提出了类似支持向量机的学习模型和高效的采样算法。

Comments accepted by ACC2019 http://acc2019.a2c2.org/

1807.09519 2026-06-04 math.NA cs.LG cs.NA 版本更新

A machine learning framework for data driven acceleration of computations of differential equations

一种用于微分方程计算的数据驱动加速的机器学习框架

Siddhartha Mishra

发表机构 * Seminar for Applied Mathematics (SAM), D-Math ETH Zürich（应用数学研讨会（SAM），ETH Zurich 数学系）

AI总结本文提出了一种机器学习框架，用于加速时间依赖的常微分方程和偏微分方程的数值计算，通过将现有数值方法转化为人工神经网络，并通过离线训练过程最小化损失函数来确定可训练参数，从而提高计算效率。

从部分观测中学习动力系统

Ibrahim Ayed, Emmanuel de Bézenac, Arthur Pajot, Julien Brajard, Patrick Gallinari

发表机构 * Theresis lab, Thales, Thales Research & Technology Route Départementale, 91120 Palaiseau ； Sorbonne Université, CNRS-IRD-MNHN, LOCEAN, Paris, France ； Remote Sensing Center, Bergen, Norway ； Criteo AI Lab, Paris, France

AI总结本文提出了一种数据驱动的框架，用于从部分观测中预测复杂非线性时空过程，通过神经网络估计时间变化的微分方程来建模系统动力学，并在浅水和欧拉模拟中验证了该方法在长期预测和学习隐藏状态方面的有效性。

1810.06749 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Optimally rotated coordinate systems for adaptive least-squares regression on sparse grids

最优旋转坐标系用于稀疏网格上的自适应最小二乘回归

Bastian Bohn, Michael Griebel, Jens Oettershagen

发表机构 * Institute for Numerical Simulation, University of Bonn（波恩大学数值模拟研究所）

AI总结针对高维数据集，本文提出了一种预处理方法，通过确定问题相关的优化坐标系来降低数据的有效维度，从而提升自适应稀疏网格最小二乘回归算法的性能。

详情

AI中文摘要

对于低维数据集具有大量数据点时，标准核方法通常不再适用于回归。除了简单的线性模型或复杂的启发式深度学习模型外，基于网格的更大（核）模型类的离散化方法导致的算法自然地线性缩放数据点数量。在中等维或高维回归任务中，这些基于网格的离散化方法受到维度诅咒的影响。在此背景下，稀疏网格方法已证明可以很大程度上克服这一问题。在这种情况下，能够检测并利用名义上高维数据的低有效维数的空间和维度自适应稀疏网格特别成功。然而，它们仍然依赖于轴对齐的结构，并在具有主要偏斜和旋转坐标的数据中表现出问题。在本文中，我们提出了一种预处理方法，用于这些自适应稀疏网格算法，以确定一个优化的、问题相关的坐标系，从而在ANOVA意义上降低给定数据集的有效维度。我们通过合成数据以及现实世界数据的数值示例，展示了自适应稀疏网格最小二乘算法如何从我们的预处理方法中受益。

英文摘要

For low-dimensional data sets with a large amount of data points, standard kernel methods are usually not feasible for regression anymore. Besides simple linear models or involved heuristic deep learning models, grid-based discretizations of larger (kernel) model classes lead to algorithms, which naturally scale linearly in the amount of data points. For moderate-dimensional or high-dimensional regression tasks, these grid-based discretizations suffer from the curse of dimensionality. Here, sparse grid methods have proven to circumvent this problem to a large extent. In this context, space- and dimension-adaptive sparse grids, which can detect and exploit a given low effective dimensionality of nominally high-dimensional data, are particularly successful. They nevertheless rely on an axis-aligned structure of the solution and exhibit issues for data with predominantly skewed and rotated coordinates. In this paper we propose a preprocessing approach for these adaptive sparse grid algorithms that determines an optimized, problem-dependent coordinate system and, thus, reduces the effective dimensionality of a given data set in the ANOVA sense. We provide numerical examples on synthetic data as well as real-world data to show how an adaptive sparse grid least squares algorithm benefits from our preprocessing method.

1902.10590 2026-06-04 cs.SE cs.AI cs.LG cs.SY eess.SY 版本更新

Architecting Dependable Learning-enabled Autonomous Systems: A Survey

构建可靠的学习自主系统：一项综述

Chih-Hong Cheng, Dhiraj Gulati, Rongjie Yan

发表机构 * fortiss - Research Institute of the Free State of Bavaria, Germany（fortiss，德国巴伐利亚自由州研究所）； State Key Laboratory of Computer Science, China（中国计算机科学国家重点实验室）

AI总结本文综述了构建可靠学习自主系统的方法，重点在于自动驾驶，讨论了多样冗余、信息融合和运行时监控等技术支柱，并总结了提升深度学习组件可靠性的最新方法，最后提出了研究方向。

1703.00734 2026-06-04 stat.ML cs.DC cs.LG cs.NA math.NA stat.ME 版本更新

Distributed Bayesian Matrix Factorization with Limited Communication

分布式贝叶斯矩阵分解与有限通信

Xiangju Qin, Paul Blomstedt, Eemeli Leppäaho, Pekka Parviainen, Samuel Kaski

发表机构 * Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University（赫尔辛基信息技术研究所 HIIT，阿尔托大学计算机科学系）； Department of Informatics, University of Bergen（卑尔根大学信息学系）

AI总结本文提出了一种分布式贝叶斯矩阵分解方法，通过分层分解联合后验分布，结合并行计算和高效近似实现，提高了大规模数据处理效率，同时保持预测准确性。

Comments 28 pages, 8 figures. The paper is published in the Machine Learning journal. An implementation of the method is available in the SMURFF software on GitHub (bmfpp branch): https://github.com/ExaScience/smurff

详情

DOI: 10.1007/s10994-019-05778-2
Journal ref: Machine Learning, 2019

AI中文摘要

贝叶斯矩阵分解（BMF）是一种强大的工具，用于生成矩阵的低秩表示、预测缺失值并提供置信区间。将后验推断扩展到超大规模矩阵具有挑战性，需要将数据和计算分布到许多工作节点上，这使通信成为主要的计算瓶颈。易并行（embarrassingly parallel）推断通过在不同数据子集上进行完全独立的计算来消除通信需求，但会受到BMF解固有的不可识别性的影响。我们引入了联合后验分布的分层分解，将各子集推断耦合起来，使易并行计算可以在至多三个阶段的序列中进行。借助高效的近似实现，我们在真实数据和模拟数据上通过实验展示了改进。我们的分布式方法相对于完整后验可实现近一个数量级的加速，而对预测准确性的影响可以忽略。我们的方法在准确性上优于最先进的易并行MCMC方法，并取得了与其他现有分布式和并行BMF实现相当的结果。

英文摘要

Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices and for predicting missing values and providing confidence intervals. Scaling up the posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the communication needed, by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive to other available distributed and parallel implementations of BMF.

1902.09626 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Learning Extreme Hummingbird Maneuvers on Flapping Wing Robots

在扑翼机器人上学习极端蜂鸟动作

Fan Fei, Zhan Tu, Jian Zhang, Xinyan Deng

AI总结研究通过模仿蜂鸟的极端机动动作，开发了一种混合控制策略，结合基于模型的非线性控制和无模型强化学习，在12克仿生蜂鸟机器人上实现了快速逃避机动。

Comments 6 pages, accepted at ICRA 2019

详情

AI中文摘要

生物学研究表明，蜂鸟在快速逃避时可以执行极端的特技飞行动作。悬停时若突然出现逼近的视觉刺激，蜂鸟会启动快速的后退平移并伴随180度的偏航转向，随后在不到10次振翅内完成姿态稳定。考虑到振翅频率为40Hz，这种激进的动作仅在0.2秒内完成。受蜂鸟在这些极端动作中接近最大性能的启发，我们开发了一种飞行控制策略，并通过实验表明，仅配备两个执行器的等比例12克蜂鸟机器人即可实现这种机动性。所提出的混合控制策略结合了基于模型的非线性控制和无模型强化学习。我们使用基于模型的非线性控制进行常规飞行控制，因为这些条件下的动态模型相对准确。然而，在极端机动中，建模误差变得难以处理。一个在仿真中训练的无模型强化学习策略经过优化，用于使系统'失稳'并最大化机动期间的性能。混合策略表现出的机动动作与在蜂鸟身上观察到的动作接近。我们实现了从仿真到现实的直接迁移，在等比例蜂鸟机器人上展示了类似蜂鸟的快速逃避机动。

英文摘要

Biological studies show that hummingbirds can perform extreme aerobatic maneuvers during fast escape. Given a sudden looming visual stimulus at hover, a hummingbird initiates a fast backward translation coupled with a 180-degree yaw turn, which is followed by instant posture stabilization in just under 10 wingbeats. Considering a wingbeat frequency of 40 Hz, this aggressive maneuver is carried out in just 0.2 seconds. Inspired by the hummingbirds' near-maximal performance during such extreme maneuvers, we developed a flight control strategy and experimentally demonstrated that such maneuverability can be achieved by an at-scale 12-gram hummingbird robot equipped with just two actuators. The proposed hybrid control policy combines model-based nonlinear control with model-free reinforcement learning. We use model-based nonlinear control for nominal flight control, as the dynamic model is relatively accurate for these conditions. However, during extreme maneuvers, the modeling error becomes unmanageable. A model-free reinforcement learning policy trained in simulation was optimized to 'destabilize' the system and maximize the performance during maneuvering. The hybrid policy manifests a maneuver that is close to that observed in hummingbirds. Direct simulation-to-real transfer is achieved, demonstrating the hummingbird-like fast evasive maneuvers on the at-scale hummingbird robot.

1902.09427 2026-06-04 eess.SP cs.LG cs.SY eess.SY stat.ML 版本更新

Fault Diagnosis Method Based on Scaling Law for On-line Refrigerant Leak Detection

基于缩放定律的故障诊断方法用于在线制冷剂泄漏检测

Shun Takeuchi, Takahiro Saito

发表机构 * Machine Discovery Technology Project Artificial Intelligence Laboratory Fujitsu Laboratories Ltd., Kanagawa, Japan ； Machine Learning Technology Project Artificial Intelligence Laboratory Fujitsu Laboratories Ltd., Kanagawa, Japan

AI总结本文提出了一种基于物理建模和空调系统控制机制的制冷剂泄漏故障诊断方法，通过推导与制冷剂泄漏相关的缩放定律，使模型能够适用于不同配置的空调系统，利用实验室的小规模离线故障测试数据估计缩放指数，并通过真实数据验证，证明了该方法在早期泄漏检测中的有效性。

Comments 8 pages, 6 figures

详情

Journal ref: 2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)

AI中文摘要

利用仪器化传感器数据进行早期故障检测是机器学习在工业设施中的一个有前景的应用领域。然而，由于目标诊断系统中复杂的系统配置和不足的故障数据，训练出的故障检测模型的泛化性能难以提高。将训练好的模型应用于其他系统并不容易。本文提出了一种考虑空调系统物理建模和控制机制的制冷剂泄漏故障诊断方法。我们推导出与制冷剂泄漏相关的有用缩放定律。如果控制机制相同，模型可以应用于其他空调系统，而不论系统配置如何。在实验室中获得的小规模离线故障测试数据用于估计缩放指数。我们通过真实数据评估所提出的缩放定律。基于两组之间相互作用的统计假设检验，我们证明了不同空调系统的缩放指数是等价的。此外，我们基于缩放定律对实际过程数据的泄漏程度时间序列进行了估计，并通过与专家评估的比较，证明了该方法在早期泄漏检测中的有效性。

英文摘要

Early fault detection using instrumented sensor data is one of the promising application areas of machine learning in industrial facilities. However, it is difficult to improve the generalization performance of the trained fault-detection model because of the complex system configuration in the target diagnostic system and insufficient fault data. It is not trivial to apply the trained model to other systems. Here we propose a fault diagnosis method for refrigerant leak detection considering the physical modeling and control mechanism of an air-conditioning system. We derive a useful scaling law related to refrigerant leak. If the control mechanism is the same, the model can be applied to other air-conditioning systems irrespective of the system configuration. Small-scale off-line fault test data obtained in a laboratory are applied to estimate the scaling exponent. We evaluate the proposed scaling law by using real-world data. Based on a statistical hypothesis test of the interaction between two groups, we show that the scaling exponents of different air-conditioning systems are equivalent. In addition, we estimated the time series of the degree of leakage of real process data based on the scaling law and confirmed that the proposed method is promising for early leak detection through comparison with assessment by experts.

1902.09426 2026-06-04 eess.SP cs.LG cs.SY eess.SY stat.ML 版本更新

Semi-supervised Approach to Soft Sensor Modeling for Fault Detection in Industrial Systems with Multiple Operation Modes

基于半监督方法的软传感器建模用于具有多种操作模式的工业系统故障检测

Shun Takeuchi, Takuya Nishino, Takahiro Saito, Isamu Watanabe

发表机构 * Artificial Intelligence Research Center（人工智能研究中心）； Knowledge Information Processing Laboratory（知识信息处理实验室）； Fujitsu Laboratories Ltd.（Fujitsu实验室有限公司）； Japan（日本）

AI总结本文提出了一种半监督方法用于软传感器建模，以解决在多操作模式系统中因目标变量数据不足而无法有效训练的问题，通过利用操作模式转换点的特性来改进模型预测能力。

Comments 7 pages, 1 figure

详情

Journal ref: International Conference on Advanced Intelligent Systems and Informatics 2017

AI中文摘要

在工业系统中，某些需要监控以检测故障的过程变量往往难以或无法测量。软传感器技术广泛用于从易于测量的变量估计这些难以测量的过程变量。软传感器建模需要包含各种状态信息的训练数据集，但目标变量的故障数据集不足，无法作为训练数据集。本文描述了一种半监督方法用于软传感器建模，以将缺少目标变量的不完整数据集纳入训练数据集。为了整合不完整数据集，我们考虑系统中操作模式转换点的特性。在约束条件下，通过从模式转换信息中获得的约束条件估计操作模式的回归系数。在案例研究中，这种受约束的软传感器建模被用于预测具有加热和制冷操作模式的空调系统中的制冷剂泄漏。结果表明，这种建模方法对于具有多种操作模式的系统中的软传感器具有前景。

英文摘要

In industrial systems, certain process variables that need to be monitored for detecting faults are often difficult or impossible to measure. Soft sensor techniques are widely used to estimate such difficult-to-measure process variables from easy-to-measure ones. Soft sensor modeling requires training datasets including the information of various states such as operation modes, but the fault dataset with the target variable is insufficient as the training dataset. This paper describes a semi-supervised approach to soft sensor modeling to incorporate an incomplete dataset without the target variable in the training dataset. To incorporate the incomplete dataset, we consider the properties of processes at transition points between operation modes in the system. The regression coefficients of the operation modes are estimated under constraint conditions obtained from the information on the mode transitions. In a case study, this constrained soft sensor modeling was used to predict refrigerant leaks in air-conditioning systems with heating and cooling operation modes. The results show that this modeling method is promising for soft sensors in a system with multiple operation modes.

1806.07190 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Stable Gaussian Process based Tracking Control of Euler-Lagrange Systems

基于稳定高斯过程的欧拉-拉格朗日系统跟踪控制

Thomas Beckers, Dana Kulić, Sandra Hirche

发表机构 * Chair of Information-oriented Control (ITR), Department of Electrical and Computer Engineering, Technical University of Munich（信息导向控制研究所（ITR），电气与计算机工程系，慕尼黑技术大学）； Adaptive Systems Laboratory, Department of Electrical and Computer Engineering, University of Waterloo（自适应系统实验室，电气与计算机工程系，滑铁卢大学）

AI总结本文提出一种基于高斯过程回归的稳定跟踪控制方法，用于未知欧拉-拉格朗日系统的高精度跟踪控制，通过数据驱动建模实现前馈补偿，并利用模型保真度动态调整反馈增益，确保全局有界跟踪误差。

Comments Accepted manuscript for publication in Elsevier Automatica

详情

DOI: 10.1016/j.automatica.2019.01.023

AI中文摘要

对现实中的欧拉-拉格朗日系统实现完美的跟踪控制具有挑战性，因为系统模型的不确定性以及外部干扰会影响跟踪误差的大小。通过增加反馈增益或改进系统模型可以减小跟踪误差。后者显然更可取，因为它允许在低反馈增益下保持良好的跟踪性能。然而，准确的模型往往难以获得。在本文中，我们解决了未知欧拉-拉格朗日系统的稳定高性能跟踪控制问题。具体来说，我们使用高斯过程回归来获得一个数据驱动的模型，用于系统未知动力学的前馈补偿。模型保真度用于调整反馈增益，允许在状态空间中模型信心高的区域使用低反馈增益。所提出的控制律保证了具有特定概率的全局有界跟踪误差。仿真研究展示了其优于现有跟踪控制方法的优越性。

英文摘要

Perfect tracking control for real-world Euler-Lagrange systems is challenging due to uncertainties in the system model and external disturbances. The magnitude of the tracking error can be reduced either by increasing the feedback gains or by improving the model of the system. The latter is clearly preferable as it allows good tracking performance to be maintained at low feedback gains. However, accurate models are often difficult to obtain. In this article, we address the problem of stable high-performance tracking control for unknown Euler-Lagrange systems. In particular, we employ Gaussian Process regression to obtain a data-driven model that is used for the feed-forward compensation of unknown dynamics of the system. The model fidelity is used to adapt the feedback gains, allowing low feedback gains in state space regions of high model confidence. The proposed control law guarantees a globally bounded tracking error with a specific probability. Simulation studies demonstrate the superiority over state-of-the-art tracking control approaches.

1902.08721 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Online Control with Adversarial Disturbances

对抗性扰动下的在线控制

Naman Agarwal, Brian Bullins, Elad Hazan, Sham M. Kakade, Karan Singh

发表机构 * Google AI Princeton（谷歌AI普林斯顿）； Princeton University（普林斯顿大学）； University of Washington（华盛顿大学）； Allen School of Computer Science and Engineering（阿伦计算机科学与工程学院）

AI总结本文研究了在存在对抗性扰动的线性动态系统中的在线控制问题，提出了一种高效的算法，该算法在近乎紧的遗憾（regret）界下实现了接近全知扰动的控制效果，同时在两个主要方面扩展了先前的工作：允许动态中存在对抗性噪声，以及允许一般的凸成本。

1902.08594 2026-06-04 eess.SY cs.LG cs.MA cs.SY stat.ML 版本更新

Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation

基于回归的逆变器控制用于分布式最优功率流和电压调节

Oscar Sondermeijer, Roel Dobbe, Daniel Arnold, Claire Tomlin, Tamás Keviczky

发表机构 * Department of Electrical Engineering & Computer Sciences, UC Berkeley, Berkeley, USA ； Department of Mechanical Engineering, UC Berkeley, Berkeley, USA ； Delft Center for Systems and Control, Delft University of Technology, Delft, The Netherlands

AI总结本文提出了一种系统化的数据驱动方法，通过本地测量确定逆变器输出无功功率，以实现接近最优的结果，该方法通过网络模型和历史负荷和发电数据进行最优功率流计算，然后利用回归找到每个逆变器的函数，将本地历史数据映射到其最优无功功率注入的近似值，从而实现分布式控制，以在电压和容量约束下最小化损耗并实现电压平坦化，同时允许高效的电压-无功优化（VVO）方案，使传统控制设备与现有逆变器协同工作，以安全运行高分布式发电水平的配电网。

Comments Cite as: Oscar Sondermeijer, Roel Dobbe, Daniel Arnold, Claire Tomlin and Tamás Keviczky, "Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation", IEEE Power & Energy Society General Meeting, Boston, July 2016

详情

AI中文摘要

电子功率逆变器能够快速提供无功功率以维持客户电压在运行容差范围内，并减少配电网中的系统损耗。本文提出了一种系统化且数据驱动的方法，以确定无功功率逆变器输出作为本地测量函数的方式，以获得接近最优的结果。首先，我们使用网络模型和历史负荷和发电数据，并进行最优功率流计算，以计算网络中所有可控逆变器的全局最优无功功率注入。随后，我们使用回归找到每个逆变器的函数，将本地历史数据映射到其最优无功功率注入的近似值。所得函数随后作为参与逆变器的分布式控制器，根据新的本地测量预测最优注入。该方法在执行电压和容量约束下的损耗最小化和电压平坦化时能够实现接近最优的结果，并允许高效的电压-无功优化（VVO）方案，其中传统控制设备与现有逆变器协同工作，以安全运行具有更高分布式发电水平的配电网。

英文摘要

Electronic power inverters are capable of quickly delivering reactive power to maintain customer voltages within operating tolerances and to reduce system losses in distribution grids. This paper proposes a systematic and data-driven approach to determine reactive power inverter output as a function of local measurements in a manner that obtains near-optimal results. First, we use a network model and historic load and generation data and perform an optimal power flow computation to obtain globally optimal reactive power injections for all controllable inverters in the network. Subsequently, we use regression to find a function for each inverter that maps its local historical data to an approximation of its optimal reactive power injection. The resulting functions then serve as decentralized controllers in the participating inverters to predict the optimal injection based on new local measurements. The method achieves near-optimal results when performing voltage- and capacity-constrained loss minimization and voltage flattening, and allows for an efficient volt-VAR optimization (VVO) scheme in which legacy control equipment collaborates with existing inverters to facilitate safe operation of distribution networks with higher levels of distributed generation.
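
The regression step of this pipeline can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the optimal power flow stage is replaced by synthetic targets (a hypothetical linear relation plus noise), the two local features are invented for the example, ordinary least squares stands in for whatever regression the paper uses, and the capacity limit is enforced by simple clipping.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical local measurements for one inverter (e.g. local demand, voltage deviation).
n_samples = 500
local_meas = rng.uniform(-1.0, 1.0, size=(n_samples, 2))

# Synthetic stand-in for the OPF stage: in the described method these targets
# would be the optimal reactive power injections computed from historical data.
true_coef = np.array([0.3, -0.5])
true_intercept = 0.1
q_opt = local_meas @ true_coef + true_intercept + rng.normal(0.0, 0.01, n_samples)

def fit_local_controller(features, targets):
    """Least-squares fit of targets ~ features @ w + b; returns [w..., b]."""
    design = np.column_stack([features, np.ones(len(features))])
    weights, *_ = np.linalg.lstsq(design, targets, rcond=None)
    return weights

def reactive_power_setpoint(weights, measurement, q_capacity):
    """Predict the injection from a new local measurement and clip to inverter capacity."""
    q = float(np.dot(weights[:-1], measurement) + weights[-1])
    return float(np.clip(q, -q_capacity, q_capacity))

weights = fit_local_controller(local_meas, q_opt)
setpoint = reactive_power_setpoint(weights, np.array([0.8, -0.9]), q_capacity=0.5)
```

Once fitted offline, the controller needs only the inverter's own measurements at run time, which is what makes the scheme decentralized; the clipping step is an assumption added here to keep the predicted setpoint within a capacity limit.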

1902.08274 2026-06-04 cs.AI cs.LG cs.MA cs.SY eess.SY 版本更新

An Online Decision-Theoretic Pipeline for Responder Dispatch

为响应调度设计一个在线决策理论管道

Ayan Mukhopadhyay, Geoffrey Pettet, Chinmaya Samal, Abhishek Dubey, Yevgeniy Vorobeychik

发表机构 * Vanderbilt University（范德比大学）； Washington University（华盛顿大学）

AI总结本文提出了一种在线决策理论管道，用于有效应对紧急事件，通过实时数据流更新模型，提高响应效率并减少计算时间。

Comments Appeared in ICCPS 2019

详情

DOI: 10.1145/3302509.3311055

AI中文摘要

为交通事故、火灾、求救电话和犯罪等紧急事件派遣应急响应人员的问题困扰着全球各地的城市。尽管此类问题已被广泛研究，但大多数方法是离线的。这些方法无法捕捉关键应急响应所处的动态变化环境，因此难以在实践中实施。任何旨在构建有效应急响应管道的整体方法还必须考虑其所包含的其他挑战，包括预测事件何时何地发生以及理解不断变化的环境动态。我们描述了一个系统，该系统以在线方式统一处理所有这些问题，即模型通过流式数据源不断更新。我们强调了这种做法对应急响应有效性的重要性，并提出了一种算法框架，可以为给定的响应调度决策理论模型计算有希望的行动。我们认为，精心设计的启发式度量可以在计算时间与解的质量之间取得平衡，并说明了为何这种方法比传统方法更具可扩展性和可处理性。我们还提出了一种用于事件预测的在线机制，以及一种基于循环神经网络的方法，用于学习和预测影响响应调度的环境特征。我们将我们的方法与先前最先进的方法以及现有的调度策略进行了比较，结果表明我们的方法在缩短响应时间的同时大幅减少了计算时间。

英文摘要

The problem of dispatching emergency responders to service traffic accidents, fire, distress calls and crimes plagues urban areas across the globe. While such problems have been extensively looked at, most approaches are offline. Such methodologies fail to capture the dynamically changing environments under which critical emergency response occurs, and therefore, fail to be implemented in practice. Any holistic approach towards creating a pipeline for effective emergency response must also look at other challenges that it subsumes - predicting when and where incidents happen and understanding the changing environmental dynamics. We describe a system that collectively deals with all these problems in an online manner, meaning that the models get updated with streaming data sources. We highlight why such an approach is crucial to the effectiveness of emergency response, and present an algorithmic framework that can compute promising actions for a given decision-theoretic model for responder dispatch. We argue that carefully crafted heuristic measures can balance the trade-off between computational time and the quality of solutions achieved and highlight why such an approach is more scalable and tractable than traditional approaches. We also present an online mechanism for incident prediction, as well as an approach based on recurrent neural networks for learning and predicting environmental features that affect responder dispatch. We compare our methodology with prior state-of-the-art and existing dispatch strategies in the field, which show that our approach results in a reduction in response time with a drastic reduction in computational time.

1807.04020 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Improved SVD-based Initialization for Nonnegative Matrix Factorization using Low-Rank Correction

改进的基于SVD的非负矩阵分解初始化方法：利用低秩修正

Atif Muhammad Syed, Sameer Qazi, Nicolas Gillis

发表机构 * Graduate School of Science and Engineering（研究生院）； PAF-Karachi Institute of Economics and Technology（卡拉奇经济和技术学院）； Department of Mathematics and Operational Research（数学与运筹学系）

AI总结本文提出了一种改进的基于SVD的非负矩阵分解初始化方法，通过考虑被丢弃的SVD因子来降低初始误差，同时生成稀疏初始因子并提高计算效率。

Comments 12 pages, 1 figure, 5 tables, submitted to pattern recognition letters

详情

DOI: 10.1016/j.patrec.2019.02.018
Journal ref: Pattern Recognition Letters 122, pp. 53-59, 2019

AI中文摘要

由于大多数非负矩阵分解（NMF）算法的迭代性质，初始化是一个关键因素，因为它显著影响收敛性和最终得到的解。许多初始化方案已被提出，其中最受欢迎的一类方法基于奇异值分解（SVD）。然而，这些基于SVD的初始化方法并不满足一个自然条件，即误差应随着因子分解的秩增加而减少。在本文中，我们提出了一种新的基于SVD的NMF初始化方法，专门针对这一不足，其做法是把为获得非负初始化而被丢弃的SVD因子也考虑进来。这种方法称为带低秩修正的非负SVD（NNSVD-LRC），通过利用被丢弃的SVD因子的低秩结构，在可忽略的额外计算成本下显著降低初始误差。与以往基于SVD的初始化方法相比，NNSVD-LRC还有两个其他优势：（1）可以证明它生成稀疏的初始因子；（2）它更快，因为它只需要计算秩为⌈r/2 + 1⌉的截断SVD，其中r是所求NMF分解的因子秩（其他方法则需要计算秩为r的截断SVD）。我们在多个标准密集和稀疏数据集上展示了我们的新方法与最先进的基于SVD的NMF初始化方法相比具有竞争力。

DeGroot-Friedkin映射在意见动力学中的镜像下降

Abhishek Halder

AI总结本文通过变分解释将DeGroot-Friedkin映射视为在标准单纯形上的镜像下降，其关联的Bregman散度等于广义Kullback-Leibler散度，即熵的镜像下降，揭示了DeGroot-Friedkin映射在最小化互补意见的熵的同时，使个体的社会影响力接近其社会权力。

1712.10158 2026-06-04 q-bio.NC cs.LG cs.NE cs.SY eess.SY stat.ML 版本更新

Non-linear motor control by local learning in spiking neural networks

通过局部学习在脉冲神经网络中实现非线性运动控制

Aditya Gilra, Wulfram Gerstner

发表机构 * School of Computer and Communication Sciences（计算机与通信科学学院）； Brain-Mind Institute, School of Life Sciences（脑科学与生命科学研究所）

AI总结本文提出了一种基于反馈的在线局部权重学习（FOLLOW）方法，用于训练异质脉冲神经网络，以控制双连杆手臂并重现期望的状态轨迹，核心贡献是通过局部可塑性规则学习逆模型以实现非线性动力学控制。

详情

Journal ref: Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1773-1782, 2018

AI中文摘要

在具有隐藏神经元的脉冲神经网络中，使用局部、稳定且在线的规则学习权重，以控制非线性身体动力学，是一个开放性问题。本文采用一种监督方案，即基于反馈的在线局部权重学习（FOLLOW），训练具有隐藏层的异质脉冲神经元网络，以控制双连杆手臂重现期望的状态轨迹。网络首先学习非线性动力学的逆模型，即以状态轨迹作为网络输入，学习推断产生该轨迹的连续时间命令。连接权重通过一种局部可塑性规则进行调整，该规则涉及突触前放电以及推断命令误差的突触后反馈。我们选择了一种称为微分前馈的网络架构，该架构在不同的前馈和递归架构中给出了最低的测试误差。学习到的逆模型随后用于在给定期望轨迹的情况下生成连续时间运动命令以控制手臂。

英文摘要

Learning weights in a spiking neural network with hidden neurons, using local, stable and online rules, to control non-linear body dynamics is an open problem. Here, we employ a supervised scheme, Feedback-based Online Local Learning Of Weights (FOLLOW), to train a network of heterogeneous spiking neurons with hidden layers, to control a two-link arm so as to reproduce a desired state trajectory. The network first learns an inverse model of the non-linear dynamics, i.e. from state trajectory as input to the network, it learns to infer the continuous-time command that produced the trajectory. Connection weights are adjusted via a local plasticity rule that involves pre-synaptic firing and post-synaptic feedback of the error in the inferred command. We choose a network architecture, termed differential feedforward, that gives the lowest test error from different feedforward and recurrent architectures. The learned inverse model is then used to generate a continuous-time motor command to control the arm, given a desired trajectory.

1712.06281 2026-06-04 math.OC cs.LG cs.SY eess.SY physics.chem-ph 版本更新

A New Data-Driven Sparse-Learning Approach to Study Chemical Reaction Networks

一种新的数据驱动稀疏学习方法用于研究化学反应网络

Farshad Harirchi, Doohyun Kim, Omar A. Khalil, Sijia Liu, Paolo Elvati, Angela Violi, Alfred O. Hero

发表机构 * Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109--2125, USA（电气工程与计算机科学系，密歇根大学，安娜堡，MI 48109--2125，美国）； Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109--2125, USA（机械工程系，密歇根大学，安娜堡，MI 48109--2125，美国）； Departments of Chemical Engineering, Biomedical Engineering, Macromolecular Science and Engineering, Biophysics Program, University of Michigan, Ann Arbor, MI 48109--2125, USA（化学工程、生物医学工程、大分子科学与工程、生物物理项目系，密歇根大学，安娜堡，MI 48109--2125，美国）

AI总结本文提出了一种数据驱动的稀疏学习方法，用于识别化学反应网络中关键反应，该方法通过物种浓度和反应速率来确定影响反应，具有低计算成本，无需额外数据或模拟，应用于氢气和丙烷的燃烧化学分析，并展示了简化机制在点火延迟上的良好性能。

详情

AI中文摘要

化学动力学机制可以通过一组基本反应表示，这些反应可以利用物理化学关系轻松转换为数学表达式。反应的示意图表示捕捉了反应物和产物之间的相互作用。确定系统动态行为下的最小化学相互作用是一个主要任务。在本文中，我们介绍了一种新的方法，利用数据驱动的稀疏学习技术来识别化学反应网络中在燃烧应用中的关键反应。所提出的方法利用物种浓度和反应速率来确定一组关键反应，且具有最小的计算成本，无需额外的数据或模拟。新的方法应用于分析恒容均质反应器中氢气和丙烷的燃烧化学。稀疏学习方法识别出的关键反应与当前化学机制的速率理论知识一致。此外，我们还表明，可以将不同时间和条件下识别出的关键反应组合起来，生成一个简化版本的原始机制，并且对于氢气和丙烷，这种简化机制在广泛的条件下表现出与原始机制相近的点火延迟性能。我们的结果展示了稀疏学习方法作为有效且高效的机制分析和机制简化工具的潜力。

英文摘要

Chemical kinetic mechanisms can be represented by sets of elementary reactions that are easily translated into mathematical terms using physicochemical relationships. The schematic representation of reactions captures the interactions between reacting species and products. Determining the minimal chemical interactions underlying the dynamic behavior of systems is a major task. In this paper, we introduce a novel approach for the identification of the influential reactions in chemical reaction networks for combustion applications, using a data-driven sparse-learning technique. The proposed approach identifies a set of influential reactions using species concentrations and reaction rates, with minimal computational cost without requiring additional data or simulations. The new approach is applied to analyze the combustion chemistry of H2 and C3H8 in a constant-volume homogeneous reactor. The influential reactions identified by the sparse-learning method are consistent with the current kinetics knowledge of chemical mechanisms. Additionally, we show that a reduced version of the parent mechanism can be generated as a combination of the influential reactions identified at different times and conditions and that for both H2 and C3H8 this reduced mechanism performs closely to the parent mechanism as a function of ignition delay over a wide range of conditions. Our results demonstrate the potential of the sparse-learning approach as an effective and efficient tool for mechanism analysis and mechanism reduction.

1902.02542 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks

全局预测，局部校正：神经网络的时间并行最优控制

Panos Parpas, Corey Muir

发表机构 * Department of Computing, Imperial College London, London, United Kingdom（帝国理工学院计算系，伦敦，英国）

AI总结本文提出了一种新的分布式优化算法，通过将神经网络的层视为动力系统离散动力学，利用最优控制的共态（adjoints）与反向传播的关系，实现参数更新无需等待前向或反向传播完成，从而提高效率。

详情

AI中文摘要

动态系统最优控制与神经网络之间的联系在理论和实践中都已被证明是有益的。一些研究者利用这些联系来研究不同神经网络架构的稳定性，并开发了内存高效的训练算法。我们同样采用动态系统的观点来看待神经网络，但我们的目标与早期工作不同。我们利用动态系统、最优控制和神经网络之间的联系，开发了一种新的分布式优化算法。所提出的算法解决了神经网络分布式优化算法面临的最主要障碍：网络权重必须等到数据的前向传播和梯度的反向传播全部完成之后才能更新。利用动态系统的观点，我们将（残差）神经网络的层解释为动态系统的离散化动力学，并利用最优控制问题的协态（adjoints）与反向传播之间的关系。然后我们开发了一种时间并行方法，该方法无需等待前向或反向传播算法完全完成即可更新网络参数。我们建立了所提算法的收敛性。初步的数值结果表明，该算法具有竞争力，且比最先进的方法更高效。

英文摘要

The links between optimal control of dynamical systems and neural networks have proved beneficial both from a theoretical and from a practical point of view. Several researchers have exploited these links to investigate the stability of different neural network architectures and develop memory efficient training algorithms. We also adopt the dynamical systems view of neural networks, but our aim is different from earlier works. We exploit the links between dynamical systems, optimal control, and neural networks to develop a novel distributed optimization algorithm. The proposed algorithm addresses the most significant obstacle for distributed algorithms for neural network optimization: the network weights cannot be updated until the forward propagation of the data, and backward propagation of the gradients are complete. Using the dynamical systems point of view, we interpret the layers of a (residual) neural network as the discretized dynamics of a dynamical system and exploit the relationship between the co-states (adjoints) of the optimal control problem and backpropagation. We then develop a parallel-in-time method that updates the parameters of the network without waiting for the forward or back propagation algorithms to complete in full. We establish the convergence of the proposed algorithm. Preliminary numerical results suggest that the algorithm is competitive and more efficient than the state-of-the-art.
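
The correspondence the abstract relies on can be written out in a few lines. The notation below ($x_k$ for the layer state, $\theta_k$ for the layer parameters, $h$ for the step size, $\Phi$ for the terminal loss, $p_k$ for the co-state) is assumed for illustration and is not necessarily the paper's; the relations themselves are the standard ones for a residual network viewed as a forward-Euler discretization.

```latex
% Residual layer as a forward-Euler step of \dot{x}(t) = f(x(t), \theta(t)):
x_{k+1} = x_k + h\, f(x_k, \theta_k), \qquad k = 0, \dots, K-1 .

% Co-state (adjoint) recursion, which coincides with backpropagation
% of the terminal loss \Phi(x_K) through the layers:
p_K = \nabla \Phi(x_K), \qquad
p_k = p_{k+1} + h \left( \frac{\partial f}{\partial x}(x_k, \theta_k) \right)^{\!\top} p_{k+1} .

% Gradient of the loss with respect to the parameters of layer k:
\frac{\partial \Phi}{\partial \theta_k}
  = h \left( \frac{\partial f}{\partial \theta}(x_k, \theta_k) \right)^{\!\top} p_{k+1} .
```

In a sequential implementation, $p_{k+1}$ is only available after a full forward and backward sweep, which is the dependency a parallel-in-time scheme aims to relax.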

1902.01064 2026-06-04 cs.DC cs.LG cs.SY eess.SY 版本更新

Hop: Heterogeneity-Aware Decentralized Training

Hop：异质性感知的去中心化训练

Qinyi Luo, Jinkun Lin, Youwei Zhuo, Xuehai Qian

发表机构 * University of Southern California（南加州大学）； Tsinghua University（清华大学）

AI总结本文提出Hop，首个考虑异质性的去中心化训练协议，通过引入迭代间隙这一独特特性，提出基于队列的同步机制以实现备份工作者和有限滞后，同时通过跳过迭代来缓解确定性延迟，实验表明在异质环境中相比标准去中心化训练有显著加速。

详情

DOI: 10.1145/3297858.3304009

AI中文摘要

近期研究表明，在机器学习领域，去中心化算法在异质环境中相较于集中化算法能提供更优的性能。这两种方法的主要区别在于其不同的通信模式，两者在异质环境中都可能受到性能下降的影响。尽管已有大量努力支持集中化算法对抗异质性，但针对去中心化算法的相关研究却十分有限。本文提出Hop，首个异质性感知的去中心化训练协议。基于我们识别出的去中心化训练的一个独特特性，即迭代间隙，我们提出一种基于队列的同步机制，能够高效实现备份工作者和有限滞后。为了应对确定性延迟，我们提出跳过迭代，以进一步减轻较慢工作者的影响。我们基于TensorFlow构建了Hop的原型实现。在CNN和SVM上的实验结果表明，在异质环境中相比标准去中心化训练有显著的加速效果。

英文摘要

Recent work has shown that decentralized algorithms can deliver superior performance over centralized ones in the context of machine learning. The two approaches, with the main difference residing in their distinct communication patterns, are both susceptible to performance degradation in heterogeneous environments. Although vigorous efforts have been devoted to supporting centralized algorithms against heterogeneity, little has been explored in decentralized algorithms regarding this problem. This paper proposes Hop, the first heterogeneity-aware decentralized training protocol. Based on a unique characteristic of decentralized training that we have identified, the iteration gap, we propose a queue-based synchronization mechanism that can efficiently implement backup workers and bounded staleness in the decentralized setting. To cope with deterministic slowdown, we propose skipping iterations so that the effect of slower workers is further mitigated. We build a prototype implementation of Hop on TensorFlow. The experiment results on CNN and SVM show significant speedup over standard decentralized training in heterogeneous settings.

1809.06277 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Optimal Matrix Momentum Stochastic Approximation and Applications to Q-learning

最优矩阵动量随机逼近及其在Q学习中的应用

Adithya M. Devraj, Ana Bušić, Sean Meyn

发表机构 * Department of Electrical and Computer Engineering, University of Florida（佛罗里达大学电气与计算机工程系）

AI总结本文提出两种新的根寻找算法，PolSA和NeSA，用于解决优化问题，并探讨了这些算法在强化学习中的应用，特别是在Q学习中通过随机逼近实现最优渐近协方差。

详情

AI中文摘要

加速是随机优化文献中越来越常见的主题。最常见的例子是Nesterov的方法和Polyak的动量技术。在本文中，针对根寻找问题引入了两种新的算法：1）PolSA是一种具有特别设计的矩阵动量的根寻找算法，2）NeSA可以被视为Nesterov算法的一种变种，或PolSA的简化版本。PolSA算法在优化领域（当作为根寻找问题处理时）是新的。本文研究的调研受到强化学习应用的启发。众所周知，大多数TD-和Q学习的变种可以作为SA（随机逼近）算法来处理，且一般SA理论的工具可用于研究收敛性和收敛速率的界限。特别是，渐近方差是SA算法性能的常见度量标准，也是评估随机优化算法性能的多种度量之一。有两种广为人知的SA技术已知具有最优渐近方差：Ruppert-Polyak平均技术和随机牛顿-拉夫逊（SNR）。前者算法可能具有极差的瞬时性能，而后者计算成本较高。本文证明了新提出的PolSA算法的参数估计与理想（但更复杂）SNR算法的估计耦合。因此，新算法成为获得最优渐近协方差的第三种方法。这些强结果需要对模型的假设。考虑了线性化模型，并假设噪声是一个鞅差序列。在非线性设置中获得了数值结果，这是本文工作的动机：在PolSA实现的Q学习中，在这种非理想设置下观察到与SNR的耦合。

英文摘要

Acceleration is an increasingly common theme in the stochastic optimization literature. The two most common examples are Nesterov's method, and Polyak's momentum technique. In this paper two new algorithms are introduced for root finding problems: 1) PolSA is a root finding algorithm with specially designed matrix momentum, and 2) NeSA can be regarded as a variant of Nesterov's algorithm, or a simplification of PolSA. The PolSA algorithm is new even in the context of optimization (when cast as a root finding problem). The research surveyed in this paper is motivated by applications to reinforcement learning. It is well known that most variants of TD- and Q-learning may be cast as SA (stochastic approximation) algorithms, and the tools from general SA theory can be used to investigate convergence and bounds on convergence rate. In particular, the asymptotic variance is a common metric of performance for SA algorithms, and is also one among many metrics used in assessing the performance of stochastic optimization algorithms. There are two well known SA techniques that are known to have optimal asymptotic variance: the Ruppert-Polyak averaging technique, and stochastic Newton-Raphson (SNR). The former algorithm can have extremely bad transient performance, and the latter can be computationally expensive. It is demonstrated here that parameter estimates from the new PolSA algorithm couple with those of the ideal (but more complex) SNR algorithm. The new algorithm is thus a third approach to obtain optimal asymptotic covariance. These strong results require assumptions on the model. A linearized model is considered, and the noise is assumed to be a martingale difference sequence. Numerical results are obtained in a non-linear setting that is the motivation for this work: In PolSA implementations of Q-learning it is observed that coupling occurs with SNR in this non-ideal setting.

1806.05722 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Non-asymptotic Identification of LTI Systems from a Single Trajectory

非渐近识别单轨迹下的线性时不变系统

Samet Oymak, Necmiye Ozay

发表机构 * Department of Electrical and Computer Engineering, University of California, Riverside, CA（加州大学河滨分校电子工程与计算机科学系）； Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI（密歇根大学安娜堡分校电气工程与计算机科学系）

AI总结该研究通过单轨迹输入输出数据，在有限样本下学习线性时不变系统的马尔可夫参数，并结合Ho-Kalman算法的稳定性结果和样本复杂度分析，确定学习系统平衡实现所需的数据量。

Comments Version 2 has two improvements: First, paper now uses spectral radius rather than largest singular value hence applies to a larger class of systems. Secondly, new sample complexity bounds are provided for approximating the system's Hankel operator via estimated Markov parameters

1803.06443 2026-06-04 cs.LG cs.DC cs.SY eess.SY stat.ML 版本更新

Communication Compression for Decentralized Training

分布式训练中的通信压缩

Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu

发表机构 * Department of Computer Science, University of Rochester（罗切斯特大学计算机科学系）； Department of Computer Science, ETH Zurich（苏黎世联邦理工学院计算机科学系）； Tencent AI Lab（腾讯AI实验室）

AI总结本文研究了在高延迟和低带宽网络中结合通信压缩与去中心化技术以实现鲁棒训练系统的问题，提出了两种新的压缩策略并证明了其收敛性。

详情

AI中文摘要

优化分布式学习系统是平衡计算与通信的艺术。已有两种研究方向试图解决网络速度慢的问题：通信压缩用于低带宽网络，去中心化用于高延迟网络。本文探讨了一个自然问题：能否将这两种技术结合，使系统同时鲁棒于带宽和延迟？尽管这种组合的系统影响是显而易见的，但其理论原理和算法设计却极具挑战性：与集中式算法不同，简单地在去中心化网络中压缩交换信息，即使以无偏随机方式，也会累积误差并导致无法收敛。本文提出了一种压缩的去中心化训练框架，并提出了两种不同的策略，分别称为 extrapolation compression 和 difference compression。我们分析了这两种算法并证明了它们以 $O(1/\sqrt{nT})$ 的速率收敛，其中 $n$ 是工作者数量，$T$ 是迭代次数，与全精度集中式训练的收敛速率相匹配。我们验证了我们的算法，并发现对于同时具有高延迟和低带宽的网络，我们的算法显著优于仅去中心化或仅量化算法。

英文摘要

Optimizing distributed learning systems is an art of balancing between computation and communication. There have been two lines of research that try to deal with slower networks: communication compression for low bandwidth networks, and decentralization for high latency networks. In this paper, we explore a natural question: can the combination of both techniques lead to a system that is robust to both bandwidth and latency? Although the system implication of such a combination is trivial, the underlying theoretical principle and algorithm design are challenging: unlike centralized algorithms, simply compressing exchanged information, even in an unbiased stochastic way, within the decentralized network would accumulate the error and fail to converge. In this paper, we develop a framework of compressed, decentralized training and propose two different strategies, which we call extrapolation compression and difference compression. We analyze both algorithms and prove both converge at the rate of $O(1/\sqrt{nT})$ where $n$ is the number of workers and $T$ is the number of iterations, matching the convergence rate for full precision, centralized training. We validate our algorithms and find that our proposed algorithms significantly outperform the best of the merely decentralized and merely quantized algorithms for networks with both high latency and low bandwidth.
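
To make "compressing in an unbiased stochastic way" concrete, here is a minimal sketch of one common unbiased compressor, stochastic rounding to a fixed grid. This is an illustrative choice, not the compression operator or either of the two strategies from the paper; the function names and the grid step are hypothetical.

```python
import math
import random

def stochastic_round(value, step, rng):
    """Round value to a multiple of step so that the expected result equals value."""
    scaled = value / step
    lower = math.floor(scaled)
    prob_up = scaled - lower  # probability of rounding up, in [0, 1)
    return (lower + (1 if rng.random() < prob_up else 0)) * step

def compress(vector, step, rng):
    """Apply stochastic rounding to every coordinate of a vector."""
    return [stochastic_round(v, step, rng) for v in vector]

rng = random.Random(0)
samples = [stochastic_round(0.3, 1.0, rng) for _ in range(20000)]
empirical_mean = sum(samples) / len(samples)
```

Each individual output is far from 0.3 (it is either 0.0 or 1.0), yet the average is close to it. The abstract's point is that this per-message noise, although zero-mean, can accumulate when such a compressor is applied naively to the models exchanged in a decentralized network.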

1809.08911 2026-06-04 cs.LG cs.CY cs.SY eess.SP eess.SY stat.ML 版本更新

Understanding Compressive Adversarial Privacy

理解压缩对抗隐私

Xiao Chen, Peter Kairouz, Ram Rajagopal

发表机构 * Stanford University（斯坦福大学）

AI总结本文提出了一种压缩对抗隐私框架，通过凸优化在数据隐私和效用之间取得平衡，并通过实证应用展示了该框架在保护敏感信息方面的有效性。

详情

DOI: 10.1109/CDC.2018.8619455
Journal ref: 2018 IEEE Conference on Decision and Control (CDC)

AI中文摘要

设计一种不牺牲过多隐私的数据共享机制可以被视为数据持有者与恶意攻击者之间的博弈。本文描述了一种压缩对抗隐私框架，该框架捕捉了数据隐私与效用之间的权衡。我们在假设数据持有者和攻击者只能使用线性变换修改数据的情况下，通过凸优化确定最优的数据发布机制。随后，我们构建了一个更加现实的数据发布机制，该机制可以依赖于非线性压缩模型，而攻击者则使用神经网络。通过一系列实证应用，我们展示了该框架，即压缩对抗隐私，能够保护敏感信息。

英文摘要

Designing a data sharing mechanism without sacrificing too much privacy can be considered as a game between data holders and malicious attackers. This paper describes a compressive adversarial privacy framework that captures the trade-off between the data privacy and utility. We characterize the optimal data releasing mechanism through convex optimization when assuming that both the data holder and attacker can only modify the data using linear transformations. We then build a more realistic data releasing mechanism that can rely on a nonlinear compression model while the attacker uses a neural network. We demonstrate in a series of empirical applications that this framework, consisting of compressive adversarial privacy, can preserve sensitive information.

1610.05202 2026-06-04 cs.LG cs.AI cs.DC cs.SY eess.SY stat.ML 版本更新

Decentralized Collaborative Learning of Personalized Models over Networks

网络上的去中心化协作学习个性化模型

Paul Vanhaesebrouck, Aurélien Bellet, Marc Tommasi

发表机构 * INRIA

AI总结本文研究了在协作对等网络中，如何通过与其他具有相似目标的代理通信来改进本地训练模型，提出两种异步 gossip 算法并基于 ADMM 实现去中心化算法。

Comments To appear in the Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017)

1606.02421 2026-06-04 stat.ML cs.AI cs.DC cs.LG cs.SY eess.SY 版本更新

Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions

基于 gossip 的对偶平均法：用于成对函数的去中心化优化

Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

发表机构 * Magnet Team, INRIA Lille – Nord Europe（磁力团队、法国国家信息与自动化技术研究所里尔-北欧洲分部）

AI总结本文提出了基于对偶平均的 gossip 算法，用于在去中心化网络中优化成对函数，适用于排名、距离度量学习和图推断等应用，可在同步和异步设置下求解，并展示了其在AUC最大化和度量学习中的实际应用。

详情

AI中文摘要

在去中心化网络（如传感器、联网设备等）中，迫切需要高效的算法来优化全局成本函数，例如从每个计算单元收集的本地数据中学习全局模型。本文研究数据点成对函数的去中心化最小化问题，这些数据点分布在定义网络通信拓扑的图的节点上。这一一般性问题在排名、距离度量学习和图推断等领域都有应用。我们提出了基于对偶平均的新型 gossip 算法，旨在在同步和异步设置下解决此类问题。所提出的框架足够灵活，能够处理该优化问题的约束和正则化变体。我们的理论分析表明，所提出的算法保持了集中式对偶平均的收敛速率，仅相差一个加性偏差项。我们还给出了AUC最大化和度量学习问题上的数值模拟，展示了我们方法的实际价值。

英文摘要

In decentralized networks (of sensors, connected objects, etc.), there is an important need for efficient algorithms to optimize a global cost function, for instance to learn a global model from the local data collected by each computing unit. In this paper, we address the problem of decentralized minimization of pairwise functions of the data points, where these points are distributed over the nodes of a graph defining the communication topology of the network. This general problem finds applications in ranking, distance metric learning and graph inference, among others. We propose new gossip algorithms based on dual averaging that aim to solve such problems in both synchronous and asynchronous settings. The proposed framework is flexible enough to deal with constrained and regularized variants of the optimization problem. Our theoretical analysis reveals that the proposed algorithms preserve the convergence rate of centralized dual averaging up to an additive bias term. We present numerical simulations on Area Under the ROC Curve (AUC) maximization and metric learning problems which illustrate the practical interest of our approach.

1511.05464 2026-06-04 stat.ML cs.DC cs.LG cs.SY eess.SY stat.CO 版本更新

Extending Gossip Algorithms to Distributed Estimation of U-Statistics

将 gossip 算法扩展到分布式 U-统计量估计

Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

发表机构 * INRIA Lille - Nord Europe（INRIA里尔-北欧洲）

AI总结本文提出新的同步和异步随机 gossip 算法，用于在分布式网络中同时传播数据并维护局部的 U-统计量估计，证明了同步和异步情况下的收敛率分别为 O(1/t) 和 O(log t / t)，并通过数值实验验证了算法的优越性。

Comments to be presented at NIPS 2015

详情

AI中文摘要

高效且稳健的去中心化网络估计算法对于许多分布式系统至关重要。尽管样本均值统计的分布式估计已受到广泛关注，但依赖于对观测对的更昂贵平均的 U-统计量计算却是一个研究较少的领域。然而，这些数据函数对于描述统计总体的全局特性至关重要，重要例子包括曲线下面积、经验方差、基尼均差和簇内点散度。本文提出新的同步和异步随机 gossip 算法，同时在网络中传播数据并维护感兴趣的 U-统计量的局部估计。我们建立了同步和异步情况下的收敛率界分别为 O(1/t) 和 O(log t / t)，其中 t 是迭代次数，且具有明确的数据和网络依赖项。除了在速率分析方面的优越比较外，数值实验还提供了实证证据，证明所提出的算法优于之前引入的方法。

英文摘要

Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has been the subject of a good deal of attention, computation of $U$-statistics, relying on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the $U$-statistic of interest. We establish convergence rate bounds of $O(1/t)$ and $O(\log t / t)$ for the synchronous and asynchronous cases respectively, where $t$ is the number of iterations, with explicit data and network dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
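
For readers unfamiliar with $U$-statistics, the sketch below computes a degree-two $U$-statistic as a plain average of a symmetric kernel over all pairs of observations. It is a centralized reference computation, not the gossip algorithm from the paper, and the function and kernel names are illustrative. Two of the examples named in the abstract are shown: empirical variance, with kernel $(x - y)^2 / 2$, and Gini mean difference, with kernel $|x - y|$.

```python
from itertools import combinations
from statistics import variance

def u_statistic(sample, kernel):
    """Average a symmetric two-argument kernel over all unordered pairs."""
    pairs = list(combinations(sample, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

def variance_kernel(x, y):
    return (x - y) ** 2 / 2

def gini_kernel(x, y):
    return abs(x - y)

data = [1.0, 2.0, 4.0]
u_variance = u_statistic(data, variance_kernel)
u_gini = u_statistic(data, gini_kernel)
```

The pairwise average costs a number of kernel evaluations quadratic in the sample size, and every pair may involve data held at two different nodes, which is why the decentralized setting is harder than averaging a sample mean.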

1812.06325 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Data-efficient Auto-tuning with Bayesian Optimization: An Industrial Control Study

数据高效自动调优与贝叶斯优化：一项工业控制研究

Matthias Neumann-Brosig, Alonso Marco, Dieter Schwarzmann, Sebastian Trimpe

发表机构 * IAV GmbH（IAV集团）； Max Planck Society（马克斯·普朗克学会）； Cyber Valley initiative（Cyber Valley倡议）； Max Planck Institute for Intelligent Systems（智能系统研究所）

AI总结本文提出利用贝叶斯优化自动学习最优控制器参数，通过概率模型（高斯过程）建模控制器参数到用户定义成本的未知函数，并通过实验数据迭代优化，以高效找到全局最优参数，实验表明其在节流阀控制中优于手动校准。

Comments 11 pages, 7 figures and 4 tables. To appear in IEEE Transactions on Control Systems Technology

详情

DOI: 10.1109/TCST.2018.2886159

AI中文摘要

贝叶斯优化被提出用于从实验数据自动学习最优控制器参数。通过概率描述（高斯过程）建模控制器参数到用户定义成本的未知函数。概率模型通过在物理系统上测试一组参数并评估成本所得到的数据来更新。为加快学习速度，贝叶斯优化算法系统地选择下一步评估的参数，例如通过最大化关于最优解的信息增益。因此，该算法只需少量实验即可迭代地找到全局最优参数。以节流阀控制作为具有代表性的工业控制示例，所提出的自动调优方法优于手动校准：它在实验次数较少的情况下始终实现更好的性能。所提出的自动调优框架具有灵活性，可处理不同的控制结构和目标。

英文摘要

Bayesian optimization is proposed for automatic learning of optimal controller parameters from experimental data. A probabilistic description (a Gaussian process) is used to model the unknown function from controller parameters to a user-defined cost. The probabilistic model is updated with data, which is obtained by testing a set of parameters on the physical system and evaluating the cost. In order to learn fast, the Bayesian optimization algorithm selects the next parameters to evaluate in a systematic way, for example, by maximizing information gain about the optimum. The algorithm thus iteratively finds the globally optimal parameters with only few experiments. Taking throttle valve control as a representative industrial control example, the proposed auto-tuning method is shown to outperform manual calibration: it consistently achieves better performance with a low number of experiments. The proposed auto-tuning framework is flexible and can handle different control structures and objectives.

1809.06750 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Multiobjective Reinforcement Learning for Reconfigurable Adaptive Optimal Control of Manufacturing Processes

多目标强化学习用于可重构自适应最优控制的制造过程

Johannes Dornheim, Norbert Link

发表机构 * Intelligent Systems Research Group (ISRG)（智能系统研究组）

AI总结本文提出了一种新型无模型多目标强化学习方法，用于制造过程的自适应最优控制，能够高效学习不同目标权重下的控制配置。

Comments Conference, Preprint, 978-1-5386-5925-0/18/$31.00 © 2018 IEEE

1811.04455 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Learning with tree-based tensor formats

基于树结构张量格式的学习

Erwan Grelier, Anthony Nouy, Mathilde Chevreuil

AI总结本文研究了在统计学习设置中，通过经验风险最小化在树结构张量格式的模型类中近似高维函数的问题，提出了一种基于树结构张量格式的模型选择策略和树优化算法，以提高学习的数值稳定性与可靠性。

详情

AI中文摘要

本文关注在统计学习设置中，通过在树结构张量格式的模型类中进行经验风险最小化来近似高维函数的问题。这些是特定的秩结构函数类，可以视为具有与树和多线性激活函数相关的稀疏架构的深度神经网络。对于给定的模型类，我们利用树结构张量格式是多线性模型的事实，将风险最小化问题转换为一系列线性模型的学习问题。适当的表示变换会产生数值稳定的学习问题，并允许利用稀疏性。对于高维问题或仅当数据集较小时，选择合适的模型类是一个关键问题。对于给定的树，选择最小化风险的树结构秩元组是一个组合问题。在这里，我们提出了一种秩适应策略，实际情况下能够提供风险随模型类复杂度变化的良好收敛性。找到合适的树也是一个组合问题，可以与深度神经网络特定稀疏架构的选择相关联。在这里，我们提出了一种随机算法，用于最小化给定函数在具有给定 arity 的树类中的表示复杂度，允许树的拓扑结构变化。该树优化算法随后被包含在一种学习方案中，该方案依次适应树和相应的树结构秩。与经典非线性模型类学习算法不同，所提出的算法在数值上是稳定、可靠的，并且只需要用户较低水平的专业知识。

英文摘要

This paper is concerned with the approximation of high-dimensional functions in a statistical learning setting, by empirical risk minimization over model classes of functions in tree-based tensor format. These are particular classes of rank-structured functions that can be seen as deep neural networks with a sparse architecture related to the tree and multilinear activation functions. For learning in a given model class, we exploit the fact that tree-based tensor formats are multilinear models and recast the problem of risk minimization over a nonlinear set into a succession of learning problems with linear models. Suitable changes of representation yield numerically stable learning problems and allow sparsity to be exploited. For high-dimensional problems or when only a small data set is available, the selection of a good model class is a critical issue. For a given tree, the selection of the tuple of tree-based ranks that minimizes the risk is a combinatorial problem. Here, we propose a rank adaptation strategy which provides in practice a good convergence of the risk as a function of the model class complexity. Finding a good tree is also a combinatorial problem, which can be related to the choice of a particular sparse architecture for deep neural networks. Here, we propose a stochastic algorithm for minimizing the complexity of the representation of a given function over a class of trees with a given arity, allowing changes in the topology of the tree. This tree optimization algorithm is then included in a learning scheme that successively adapts the tree and the corresponding tree-based ranks. Contrary to classical learning algorithms for nonlinear model classes, the proposed algorithms are numerically stable, reliable, and require only a low level of expertise from the user.

1805.11572 2026-06-04 cs.CV cs.LG cs.NA math.NA stat.ML 版本更新

Adversarial Regularizers in Inverse Problems

对抗正则化在反问题中的应用

Sebastian Lunz, Ozan Öktem, Carola-Bibiane Schönlieb

发表机构 * DAMTP Department of Mathematics（DAMTP数学系）； University of Cambridge（剑桥大学）； KTH - Royal Institute of Technology（皇家理工学院）

AI总结本文提出了一种利用神经网络作为正则化函数的新框架，用于解决反问题，该方法通过学习真实图像分布与未正则化重建分布之间的差异来提升反问题求解的性能。

Comments published at NeurIPS 2018

1704.04163 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness

超越快速矩阵乘法的谱近似：算法与难度

Cameron Musco, Praneeth Netrapalli, Aaron Sidford, Shashanka Ubaru, David P. Woodruff

发表机构 * MIT（麻省理工学院）； Microsoft Research（微软研究院）； Stanford University（斯坦福大学）； University of Minnesota（明尼苏达大学）； Carnegie Mellon University（卡内基梅隆大学）

AI总结本文研究了如何在比矩阵乘法时间更快的运行时间内近似矩阵的谱，提出了一种基于随机迹估计、多项式逼近和快速系统求解器的算法，能够高效地隔离矩阵谱的不同范围并近似奇异值的数量，从而在许多应用中替代真实的奇异值。

Comments ITCS 2018

详情

AI中文摘要

理解矩阵$A \in \mathbb{R}^{n \times n}$的奇异值谱是众多应用中的基本任务。在矩阵乘法时间内，可以执行完整的SVD并直接计算奇异值$σ_1,...,σ_n$。然而，很少有关于突破这一运行时间障碍的算法。利用随机迹估计、多项式逼近和快速系统求解器的工具，我们展示了如何高效地隔离$A$的谱的不同范围并近似这些范围内的奇异值数量。因此，我们有效地计算了谱的直方图，这在许多应用中可以替代真实的奇异值。我们使用这一原始工具，给出了对广泛对称矩阵范数进行近似的第一种算法，其运行时间快于矩阵乘法时间。例如，我们给出了一种$(1 + ε)$近似算法，用于Schatten-1范数（核范数），运行时间为$\tilde O((nnz(A)n^{1/3} + n^2)ε^{-3})$，适用于具有均匀行稀疏性的矩阵，或$\tilde O(n^{2.18} ε^{-3})$时间用于密集矩阵。对于一般的Schatten-p范数，运行时间平滑地扩展，特别是对于任何$p \ge 2$，运行时间变为$\tilde O(p \cdot nnz(A) ε^{-3})$。同时，我们证明了谱近似的复杂性本质上与快速矩阵乘法在小$ε$范围内密切相关。我们证明，如果在我们的算法中实现更温和的$ε$依赖性，则意味着在一般图上实现比矩阵乘法时间更快的三角检测算法。这进一步意味着，高精度算法在亚立方时间内运行将导致亚立方时间矩阵乘法。作为我们界限的应用，我们展示了在矩阵乘法时间以内精确计算图中所有有效电阻的可能性可能很困难，除非有重大的算法突破。

英文摘要

Understanding the singular value spectrum of a matrix $A \in \mathbb{R}^{n \times n}$ is a fundamental task in countless applications. In matrix multiplication time, it is possible to perform a full SVD and directly compute the singular values $σ_1,...,σ_n$. However, little is known about algorithms that break this runtime barrier. Using tools from stochastic trace estimation, polynomial approximation, and fast system solvers, we show how to efficiently isolate different ranges of $A$'s spectrum and approximate the number of singular values in these ranges. We thus effectively compute a histogram of the spectrum, which can stand in for the true singular values in many applications. We use this primitive to give the first algorithms for approximating a wide class of symmetric matrix norms in faster than matrix multiplication time. For example, we give a $(1 + ε)$ approximation algorithm for the Schatten-$1$ norm (the nuclear norm) running in just $\tilde O((nnz(A)n^{1/3} + n^2)ε^{-3})$ time for $A$ with uniform row sparsity or $\tilde O(n^{2.18} ε^{-3})$ time for dense matrices. The runtime scales smoothly for general Schatten-$p$ norms, notably becoming $\tilde O (p \cdot nnz(A) ε^{-3})$ for any $p \ge 2$. At the same time, we show that the complexity of spectrum approximation is inherently tied to fast matrix multiplication in the small $ε$ regime. We prove that achieving milder $ε$ dependencies in our algorithms would imply faster than matrix multiplication time triangle detection for general graphs. This further implies that highly accurate algorithms running in subcubic time yield subcubic time matrix multiplication. As an application of our bounds, we show that precisely computing all effective resistances in a graph in less than matrix multiplication time is likely difficult, barring a major algorithmic breakthrough.
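
Of the tools listed in the abstract, stochastic trace estimation is the easiest to show in isolation. The sketch below is the classical Hutchinson estimator with Rademacher probe vectors, which estimates $\mathrm{tr}(A)$ using only matrix-vector products. It is a generic illustration under assumed names (`hutchinson_trace`, `matvec`) and a toy matrix, not the paper's full pipeline, which additionally applies polynomial approximations of spectral indicator functions to count singular values in a range.

```python
import random

def hutchinson_trace(matvec, dim, num_probes, rng):
    """Estimate trace(A) as the average of z^T A z over random sign vectors z."""
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(dim)]
        az = matvec(z)
        total += sum(zi * azi for zi, azi in zip(z, az))
    return total / num_probes

# Toy symmetric matrix with trace 2 + 3 + 4 = 9.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]

def matvec(v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

estimate = hutchinson_trace(matvec, dim=3, num_probes=4000, rng=random.Random(0))
```

Because the estimator never forms $A$ explicitly, the same routine can be applied to a matrix function that is only available through matrix-vector products, which is the kind of access pattern that avoids a full SVD.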


1704.03371 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Differential TD Learning for Value Function Approximation

差分时间差学习用于价值函数近似

Adithya M. Devraj, Sean P. Meyn

发表机构 * Department of Electrical and Computer Engg. at the University of Florida（佛罗里达大学电气与计算机工程系）

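AI summary: This paper proposes a differential temporal difference (TD) learning method that addresses two problems of conventional TD learning: the variance diverges in the discounted-cost setting, and unbiased algorithms exist only in special cases in the average-cost setting. The algorithm is designed from a representation of the value function gradient and improves performance for Markovian models on Euclidean space with smooth dynamics.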

详情

AI中文摘要

价值函数作为算法组件和统计与工程应用中的性能度量出现。计算相关的Bellman方程在所有非特殊情况中都具有数值挑战性。一种流行的近似技术是时间差（TD）学习。本文介绍的算法旨在解决该方法的两个已知问题：在折扣成本设置中，当折扣因子接近单位时，算法的方差发散。第二，在平均成本设置中，只有在特殊情况下才存在无偏算法。证明了任何这些价值函数的梯度都可以表示为算法设计的依据。基于此结果，得到了适用于欧几里得空间中马尔可夫模型的新型差分TD方法。数值示例显示了显著的性能改进。在应用于速度调节时，方差减少了两个数量级。

英文摘要

Value functions arise as a component of algorithms as well as performance metrics in statistics and engineering applications. Computation of the associated Bellman equations is numerically challenging in all but a few special cases. A popular approximation technique is known as Temporal Difference (TD) learning. The algorithm introduced in this paper is intended to resolve two well-known problems with this approach: In the discounted-cost setting, the variance of the algorithm diverges as the discount factor approaches unity. Second, for the average cost setting, unbiased algorithms exist only in special cases. It is shown that the gradient of any of these value functions admits a representation that lends itself to algorithm design. Based on this result, the new differential TD method is obtained for Markovian models on Euclidean space with smooth dynamics. Numerical examples show remarkable improvements in performance. In application to speed scaling, variance is reduced by two orders of magnitude.
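
For orientation, the sketch below shows ordinary TD(0) with a linear value estimate $V(s) = θ^\top φ(s)$ in the discounted-cost setting, which is the baseline technique the abstract refers to. It is not the differential TD method of the paper. The two-state chain, the costs, and all names are hypothetical choices made so that the true values can be checked by hand.

```python
def td0_linear(transitions, features, gamma, step_size, num_features):
    """TD(0) with a linear value estimate V(s) = theta . phi(s)."""
    theta = [0.0] * num_features
    for state, cost, next_state in transitions:
        phi, phi_next = features[state], features[next_state]
        v = sum(t * f for t, f in zip(theta, phi))
        v_next = sum(t * f for t, f in zip(theta, phi_next))
        td_error = cost + gamma * v_next - v  # temporal difference
        theta = [t + step_size * td_error * f for t, f in zip(theta, phi)]
    return theta


# Hypothetical deterministic chain 0 -> 1 -> 0 -> ... with cost 1 in state 0
# and cost 0 in state 1.
features = {0: [1.0, 0.0], 1: [0.0, 1.0]}  # one-hot features (tabular case)
transitions = [(t % 2, 1.0 if t % 2 == 0 else 0.0, (t + 1) % 2)
               for t in range(5000)]
theta = td0_linear(transitions, features, gamma=0.5,
                   step_size=0.1, num_features=2)
```

With $γ = 0.5$ the Bellman equations $V(0) = 1 + γV(1)$ and $V(1) = γV(0)$ give $V(0) = 4/3$ and $V(1) = 2/3$, which `theta` approaches.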


1805.03117 2026-06-04 astro-ph.CO cs.LG cs.NA math.NA 版本更新

Local, algebraic simplifications of Gaussian random fields

局部的代数简化方法用于高斯随机场

Theodor Bjorkmo, M. C. David Marsh

发表机构 * Department of Applied Mathematics and Theoretical Physics, University of Cambridge（应用数学与理论物理系，剑桥大学）

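AI summary: This paper presents a local, algebraic simplification for evaluating the probability density function of Gaussian random fields, avoiding the computational cost of inverting the covariance matrix, and demonstrates applications to generating many-field potential energy landscapes and to machine learning.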

Comments 15 pages, 2 figures

详情

DOI: 10.1088/1475-7516/2018/12/022

AI中文摘要

许多高斯随机场和高斯随机过程的应用受到计算复杂性的限制，这涉及求逆相关协方差矩阵。在本工作中，我们展示了如何完全绕过这一问题，用于高斯随机场的局部泰勒系数，其协方差函数为高斯（或平方指数）形式。我们的结果适用于任意维度的场和任意阶的泰勒展开。我们给出了两个应用：首先，我们证明该方法可以用于显式生成具有许多场的非平凡势能景观，这在关注局部特殊点（例如极值）时特别有用，如早期宇宙中的`manyfield'膨胀问题。其次，我们证明该方法在机器学习中有应用，大大简化了确定协方差函数超参数的回归问题，给定由单点局部泰勒系数组成的训练数据集。一个配套的Mathematica笔记本可在https://doi.org/10.17863/CAM.22859获取。

英文摘要

Many applications of Gaussian random fields and Gaussian random processes are limited by the computational complexity of evaluating the probability density function, which involves inverting the relevant covariance matrix. In this work, we show how that problem can be completely circumvented for the local Taylor coefficients of a Gaussian random field with a Gaussian (or 'square exponential') covariance function. Our results hold for any dimension of the field and to any order in the Taylor expansion. We present two applications. First, we show that this method can be used to explicitly generate non-trivial potential energy landscapes with many fields. This application is particularly useful when one is concerned with the field locally around special points (e.g., maxima or minima), as we exemplify by the problem of cosmic 'manyfield' inflation in the early universe. Second, we show that this method has applications in machine learning, and greatly simplifies the regression problem of determining the hyperparameters of the covariance function given a training data set consisting of local Taylor coefficients at a single point. An accompanying Mathematica notebook is available at https://doi.org/10.17863/CAM.22859 .


1812.08723 2026-06-04 cs.DS cs.LG cs.NA eess.SP math.NA 版本更新

A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms

一种适用于使用简单傅里叶变换重建信号的通用采样方法

Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh

发表机构 * Tel Aviv University（特拉维夫大学）； EPFL（瑞士联邦理工学院）； Microsoft Research（微软研究院）； Princeton University（普林斯顿大学）； Google Research（谷歌研究院）

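AI summary: This paper proposes a universal sampling method for reconstructing continuous signals from a small number of discrete samples. The method exploits constraints on the signal's Fourier structure and is shown to be effective for tasks such as multiband signal reconstruction and Gaussian process regression.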

详情

AI中文摘要

从少量离散样本重建连续信号是科学和工程中的基本问题。在实践中，我们通常感兴趣的信号具有'简单'的傅里叶结构，如带限、多带和傅里叶稀疏信号。更广泛地说，任何关于信号傅里叶功率谱的先验知识都可以限制其复杂性。直觉上，具有更受约束的傅里叶结构的信号需要更少的样本来重建。我们通过证明，给定类别的连续信号可以使用与该类允许功率谱的统计维度成比例的样本数近似重建。进一步地，在几乎所有情况下，这种自然度量紧密刻画了信号重建的样本复杂性。令人惊讶的是，我们还展示了，除了对数因子外，一种通用非均匀采样策略可以实现任何信号类别的最优复杂性。我们提出了一个简单且高效的算法，用于从采样中恢复信号。对于带限和稀疏信号，我们的方法达到了最先进的水平。同时，它为包括多带信号重建和一维kriging和高斯过程回归任务在内的广泛问题提供了第一个计算和样本效率的解决方案。我们的工作基于随机线性代数与具有受约束傅里叶结构的信号重建之间的新联系。我们扩展了基于统计杠杆得分采样和列基矩阵重建的工具到连续线性算子的近似，这些算子出现在信号重建中。我们相信这些扩展具有独立的兴趣，并为使用随机方法解决广泛的时间连续问题奠定了基础。

英文摘要

Reconstructing continuous signals from a small number of discrete samples is a fundamental problem across science and engineering. In practice, we are often interested in signals with 'simple' Fourier structure, such as bandlimited, multiband, and Fourier sparse signals. More broadly, any prior knowledge about a signal's Fourier power spectrum can constrain its complexity. Intuitively, signals with more highly constrained Fourier structure require fewer samples to reconstruct. We formalize this intuition by showing that, roughly, a continuous signal from a given class can be approximately reconstructed using a number of samples proportional to the *statistical dimension* of the allowed power spectrum of that class. Further, in nearly all settings, this natural measure tightly characterizes the sample complexity of signal reconstruction. Surprisingly, we also show that, up to logarithmic factors, a universal non-uniform sampling strategy can achieve this optimal complexity for *any class of signals*. We present a simple and efficient algorithm for recovering a signal from the samples taken. For bandlimited and sparse signals, our method matches the state-of-the-art. At the same time, it gives the first computationally and sample efficient solution to a broad range of problems, including multiband signal reconstruction and kriging and Gaussian process regression tasks in one dimension. Our work is based on a novel connection between randomized linear algebra and signal reconstruction with constrained Fourier structure. We extend tools based on statistical leverage score sampling and column-based matrix reconstruction to the approximation of continuous linear operators that arise in signal reconstruction. We believe that these extensions are of independent interest and serve as a foundation for tackling a broad range of continuous time problems using randomized methods.


1812.02588 2026-06-04 eess.SP cs.LG cs.SY eess.SY math.OC 版本更新

q-LMF: Quantum Calculus-based Least Mean Fourth Algorithm

q-LMF：基于量子微积分的最小四次均值算法

Alishba Sadiq, Muhammad Usman, Shujaat Khan, Imran Naseem, Muhammad Moinuddin, Ubaid M. Al-Saggaf

发表机构 * College of Engineering, Karachi Institute of Economics and Technology（卡拉奇经济科技学院工程学院）； Faculty of Engineering Science and Technology (FEST), Iqra University（伊克拉大学工程科学与技术学院）； School of Electrical, Electronic and Computer Engineering, The University of Western Australia（西澳大学电气、电子与计算机工程学院）； Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University（国王阿卜杜勒阿齐兹大学智能工程系统卓越中心）； Electrical and Computer Engineering Department, King Abdulaziz University（国王阿卜杜勒阿齐兹大学电气与计算机工程系）

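AI summary: This paper proposes a least mean fourth algorithm based on quantum calculus (q-LMF) for channel estimation in non-Gaussian noise. By introducing error-correlation energy and signal normalization, it improves convergence speed and stability, lowers steady-state error, and allows more freedom in choosing large step sizes than the conventional LMF algorithm.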

详情

AI中文摘要

信道估计是现代通信系统中的关键部分，因为它能提高系统的整体性能。在最近的研究中，已经设计了多种自适应学习方法以增强学习过程的鲁棒性和收敛速度。然而，仍然需要一种最优技术。本文针对非高斯噪声环境，提出了一种新的随机梯度算法用于信道识别。所提出的q-最小四次均值（q-LMF）是最小四次均值（LMF）算法的扩展，基于量子微积分（也称为Jackson导数）。所提出的算法利用了新的误差相关能量概念和信号归一化技术，以确保高收敛速率、更好的稳定性和低稳态误差。与传统LMF不同，所提出的方法在大步长情况下具有更大的自由度。广泛的实验表明，所提出的q-LMF算法在性能上相比现有技术有显著提升。

英文摘要

Channel estimation is an essential part of modern communication systems as it enhances the overall performance of the system. In the recent past, a variety of adaptive learning methods have been designed to enhance the robustness and convergence speed of the learning process. However, the need for an optimal technique remains. Herein, for non-Gaussian noisy environments, we propose a new class of stochastic gradient algorithms for channel identification. The proposed $q$-least mean fourth ($q$-LMF) algorithm is an extension of the least mean fourth (LMF) algorithm and is based on the $q$-calculus, whose derivative is also known as the Jackson derivative. The proposed algorithm utilizes a novel concept of error-correlation energy and normalization of the signal to ensure a high convergence rate, better stability, and low steady-state error. Contrary to the conventional LMF, the proposed method has more freedom for large step-sizes. Extensive experiments show a significant gain in the performance of the proposed $q$-LMF algorithm in comparison to contemporary techniques.
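
The Jackson derivative mentioned above is defined by $D_q f(x) = \frac{f(qx) - f(x)}{(q - 1)x}$ for $x \neq 0$ and $q \neq 1$. The sketch below evaluates it on the fourth-power cost that LMF-type algorithms minimize, where it has the closed form $D_q x^4 = (1 + q + q^2 + q^3)x^3$. This illustrates only the definition; the function names are assumptions, and the $q$-LMF weight update itself is not reproduced here.

```python
def jackson_derivative(f, x, q):
    """Jackson q-derivative D_q f(x) = (f(q x) - f(x)) / ((q - 1) x)."""
    if x == 0 or q == 1:
        raise ValueError("requires x != 0 and q != 1")
    return (f(q * x) - f(x)) / ((q - 1.0) * x)


def fourth_power(e):
    """Fourth-power error cost e^4, as used by LMF-type algorithms."""
    return e ** 4


e = 0.5
# q = 2: (1 + 2 + 4 + 8) * e^3 = 15 * 0.125
d_q2 = jackson_derivative(fourth_power, e, 2.0)
# q close to 1 recovers the ordinary derivative 4 * e^3
d_near1 = jackson_derivative(fourth_power, e, 1.0 + 1e-6)
```

For $q > 1$ the factor $1 + q + q^2 + q^3$ exceeds 4, so the $q$-derivative of the fourth-power cost is a scaled version of the ordinary derivative, and it reduces to the ordinary derivative as $q \to 1$.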


1812.07810 2026-06-04 cs.LG cs.CR cs.NA math.NA stat.ML 版本更新

Fast Botnet Detection From Streaming Logs Using Online Lanczos Method

从流日志中快速检测僵尸网络的在线兰茨斯方法

Zheng Chen, Xinli Yu, Chi Zhang, Jin Zhang, Cui Lin, Bo Song, Jianliang Gao, Xiaohua Hu, Wei-Shih Yang, Erjia Yan

发表机构 * CA Technologies, Inc.（CA Technologies公司）； College of Computing & Informatics, Drexel University（德雷塞尔大学计算与信息学院）； Department of Mathematics, Temple University（Temple大学数学系）； Department of Computer Science, Maryland University at Baltimore County（马里兰大学巴尔的摩县计算机科学系）

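AI summary: This paper proposes a botnet detection method based on an online Lanczos method. It reduces the time complexity of PCA-based detection from cubic to sub-cubic, which improves the accuracy and sensitivity of real-time detection, and it contributes a generalized online correlation matrix update formula and a new termination condition.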

详情

AI中文摘要

僵尸网络，作为一种协调的机器人网络，已成为恶意互联网活动的主要平台，如DDOS攻击、点击欺诈、网络爬虫、垃圾/谣言传播等。本文专注于设计和实验一种新的从流Web服务器日志中检测僵尸网络的方法，受到其广泛适用性、实时保护能力、易用性和敏感数据更安全的启发。我们的算法受到主成分分析（PCA）的启发，以捕捉数据中的相关性，我们首次将兰茨斯方法应用于改进基于PCA的僵尸网络检测的时间复杂度，从立方到亚立方，这使我们能够更准确和灵敏地检测滑动时间窗口中的僵尸网络，而不是固定时间窗口。我们贡献了一个通用的在线相关矩阵更新公式，以及基于误差界和对称矩阵非递减特征值的新终止条件。在电子商务网站日志数据集上，实验表明兰茨斯方法在不同时间窗口下的时间成本始终仅为PCA的20%至25%。

英文摘要

A botnet, a group of coordinated bots, is becoming the main platform for malicious Internet activities such as DDoS attacks, click fraud, web scraping, and spam/rumor distribution. This paper focuses on the design and experimental evaluation of a new approach for botnet detection from streaming web server logs, motivated by its wide applicability, real-time protection capability, ease of use, and better security of sensitive data. Our algorithm is inspired by Principal Component Analysis (PCA) to capture correlation in data, and we are the first to recognize and adapt the Lanczos method to improve the time complexity of PCA-based botnet detection from cubic to sub-cubic, which enables us to detect botnets more accurately and sensitively with sliding time windows rather than fixed time windows. We contribute a generalized online correlation matrix update formula and a new termination condition for the Lanczos iteration, based on an error bound and the non-decreasing eigenvalues of symmetric matrices. On our dataset of e-commerce website logs, experiments show that the time cost of the Lanczos method with different time windows is consistently only 20% to 25% of that of PCA.
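
As background, the sketch below shows the classical Lanczos iteration, which reduces a symmetric matrix (for example a correlation matrix) to a small tridiagonal matrix $T$ whose eigenvalues approximate the extreme eigenvalues of the original. It is a textbook version without reorthogonalization; the paper's online update formula and its termination condition are not reproduced, and the matrix and start vector are hypothetical.

```python
import math


def lanczos(A, v0, k):
    """Run up to k Lanczos steps on symmetric A.

    Returns the diagonal entries (alphas) and the off-diagonal entries
    (betas) of the tridiagonal matrix T = Q^T A Q.
    """
    n = len(A)
    norm = math.sqrt(sum(x * x for x in v0))
    v = [x / norm for x in v0]
    v_prev = [0.0] * n
    alphas, betas = [], []
    beta = 0.0
    for _ in range(k):
        w = [sum(A[i][j] * v[j] for j in range(n)) for i in range(n)]
        alpha = sum(wi * vi for wi, vi in zip(w, v))
        # Orthogonalize against the two most recent Lanczos vectors.
        w = [wi - alpha * vi - beta * pi for wi, vi, pi in zip(w, v, v_prev)]
        alphas.append(alpha)
        beta = math.sqrt(sum(x * x for x in w))
        if beta < 1e-12:  # Krylov subspace is exhausted
            break
        betas.append(beta)
        v_prev, v = v, [x / beta for x in w]
    return alphas, betas


# Hypothetical symmetric matrix and start vector.
A = [[2.0, 1.0, 0.0],
     [1.0, 3.0, 1.0],
     [0.0, 1.0, 4.0]]
alphas, betas = lanczos(A, [1.0, 1.0, 1.0], k=3)
```

Each step needs one matrix-vector product, so stopping after $k \ll n$ steps is what brings the cost below that of a full eigendecomposition. When all $n$ steps are run, as here, $T$ is orthogonally similar to $A$, so the sum of `alphas` equals the trace of $A$.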


1812.07410 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

An Improved Deep Belief Network Model for Road Safety Analyses

一种改进的深度信念网络模型用于道路安全分析

Guangyuan Pan, Liping Fu, Lalita Thakali, Matthew Muresan, Ming Yu

发表机构 * Intelligent Transportation Systems Research Center（智能交通系统研究中心）； Wuhan University of Technology（武汉理工大学）； University of Waterloo（滑铁卢大学）； Department of Civil & Environmental Engineering（土木与环境工程系）； Department of Electrical & Computer Engineering（电气与计算机工程系）

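AI summary: This paper proposes an improved deep belief network model for crash prediction in road safety analyses, demonstrates its predictive performance in two case studies, and compares it with traditional models.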

详情

Journal ref: Transportation Research Board 97th Annual Meeting, 2018

AI中文摘要

碰撞预测是道路安全分析中的关键组成部分。广泛采用的碰撞预测方法是应用基于回归的技术。底层的校准过程通常耗时较长，需要大量的领域知识和专业知识，无法轻易自动化。本文介绍了一种新的机器学习（ML）方法作为传统技术的替代方案。所提出的ML模型称为正则化深度信念网络，是一种具有两个训练步骤的深度神经网络：首先使用无监督学习算法进行训练，然后通过用第一步训练得到的权重初始化贝叶斯神经网络进行微调。所得模型预计具有改进的预测能力和减少对耗时人工干预的需求。在本文中，我们试图通过两个案例研究展示这种新模型在碰撞预测中的潜力，包括来自加拿大安大略省800公里高速公路401号和其他高速公路的碰撞数据集。我们的目的是展示该ML方法与其他传统模型（包括负二项（NB）模型、核回归（KR）和贝叶斯神经网络（贝叶斯NN））的性能比较。我们还试图解决其他相关问题，如训练数据大小和训练参数的影响。

英文摘要

Crash prediction is a critical component of road safety analyses. A widely adopted approach to crash prediction is the application of regression-based techniques. The underlying calibration process is often time-consuming, requires significant domain knowledge and expertise, and cannot be easily automated. This paper introduces a new machine learning (ML) based approach as an alternative to the traditional techniques. The proposed ML model is called a regularized deep belief network, which is a deep neural network with two training steps: it is first trained using an unsupervised learning algorithm and then fine-tuned by initializing a Bayesian neural network with the trained weights from the first step. The resulting model is expected to have improved prediction power and a reduced need for time-consuming human intervention. In this paper, we attempt to demonstrate the potential of this new model for crash prediction through two case studies, including a collision data set from an 800 km stretch of Highway 401 and other highways in Ontario, Canada. Our intention is to show the performance of this ML approach in comparison to various traditional models, including the negative binomial (NB) model, kernel regression (KR), and the Bayesian neural network (Bayesian NN). We also attempt to address other related issues, such as the effect of training data size and training parameters.


1703.00978 2026-06-04 eess.SY cs.LG cs.SE cs.SY 版本更新

Compositional Falsification of Cyber-Physical Systems with Machine Learning Components

包含机器学习组件的网络物理系统组合性验证

Tommaso Dreossi, Alexandre Donzé, Sanjit A. Seshia

发表机构 * University of California, Berkeley（加州大学伯克利分校）； Decyphir, Inc.（Decyphir公司）

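AI summary: This paper studies the correctness of cyber-physical systems (CPS) that contain machine learning components. It proposes a compositional falsification framework in which a temporal logic falsifier and a machine learning analyzer cooperate to find executions that violate the specification.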

详情

AI中文摘要

网络物理系统（CPS），如汽车系统，开始包含复杂的机器学习（ML）组件。因此，其正确性依赖于内部ML模块的属性。虽然学习算法旨在从示例中泛化，但它们的性能仅取决于提供的示例，最近的努力已显示它们在小对抗扰动下会产生不一致的输出。这引发了问题：学习组件的输出是否会导致整个CPS的失效？在本文中，我们通过将此问题建模为具有ML组件的CPS的时间逻辑（STL）规范的验证问题来解决此问题。我们提出了一种组合性验证框架，其中时间逻辑验证器和机器学习分析器合作，旨在找到所考虑模型的违反执行。所提出技术的有效性通过带有基于深度神经网络的感知组件的自动紧急制动系统模型得到展示。

英文摘要

Cyber-physical systems (CPS), such as automotive systems, are starting to include sophisticated machine learning (ML) components. Their correctness, therefore, depends on properties of the inner ML modules. While learning algorithms aim to generalize from examples, they are only as good as the examples provided, and recent efforts have shown that they can produce inconsistent output under small adversarial perturbations. This raises the question: can the output from learning components lead to a failure of the entire CPS? In this work, we address this question by formulating it as a problem of falsifying signal temporal logic (STL) specifications for CPS with ML components. We propose a compositional falsification framework where a temporal logic falsifier and a machine learning analyzer cooperate with the aim of finding falsifying executions of the considered model. The efficacy of the proposed technique is shown on an automatic emergency braking system model with a perception component based on deep neural networks.
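
To make the notion of falsifying an STL specification concrete, the sketch below computes the standard quantitative (robustness) semantics of two simple STL formulas on a discrete-time trace and searches a list of candidate inputs for one whose trace violates "always $x <$ limit". The toy model, the candidate list, and all function names are hypothetical; the paper's compositional interplay between the falsifier and the ML analyzer is not modeled.

```python
def always_below(trace, limit):
    """Robustness of G (x < limit): the worst-case margin over the trace."""
    return min(limit - x for x in trace)


def eventually_above(trace, level):
    """Robustness of F (x > level): the best-case margin over the trace."""
    return max(x - level for x in trace)


def falsify(simulate, candidates, limit):
    """Return the first candidate whose trace violates G (x < limit)."""
    for u in candidates:
        if always_below(simulate(u), limit) < 0:
            return u  # negative robustness means the spec is violated
    return None


def simulate(gain):
    """Hypothetical closed-loop model: x_{t+1} = gain * x_t, x_0 = 1."""
    x, trace = 1.0, []
    for _ in range(10):
        trace.append(x)
        x *= gain
    return trace


counterexample = falsify(simulate, [0.5, 0.9, 1.1, 1.3], limit=2.0)
```

A negative robustness value certifies a violation, so a falsifier can treat robustness as an objective to minimize over the input space instead of enumerating candidates as this sketch does.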


1509.09236 2026-06-04 cs.LG cs.CC cs.NA math.NA math.OC 版本更新

On the Complexity of Robust PCA and $\ell_1$-norm Low-Rank Matrix Approximation

关于鲁棒PCA和ℓ1-范数低秩矩阵逼近的复杂性

Nicolas Gillis, Stephen A. Vavasis

发表机构 * Department of Mathematics and Operational Research, University of Mons（蒙斯大学数学与运筹学系）； Department of Combinatorics and Optimization, University of Waterloo（滑铁卢大学组合学与优化系）

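AI summary: This paper proves that low-rank matrix approximation in the component-wise $\ell_1$-norm ($\ell_1$-LRA) is NP-hard already in the rank-one case, and connects it to several other well-known problems, including robust PCA, $\ell_0$-LRA, and binary matrix factorization.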

Comments 16 pages, some typos corrected

详情

DOI: 10.1287/moor.2017.0895
Journal ref: Mathematics of Operations Research 43 (4), pp. 1072-1084, 2018

AI中文摘要

关于逐分量ℓ1-范数的低秩矩阵逼近问题（ℓ1-LRA）与鲁棒主成分分析（PCA）密切相关，已成为数据挖掘和机器学习中非常流行的工具。鲁棒PCA旨在恢复被稀疏噪声扰动的低秩矩阵，例如应用于前景-背景视频分离。尽管人们强烈认为ℓ1-LRA是NP难的，但据我们所知，尚无正式证明。在本文中，我们通过从MAX CUT归约，证明了ℓ1-LRA在秩为1的情况下已经是NP难的。我们的推导揭示了ℓ1-LRA与其他几个著名问题之间的有趣联系，包括鲁棒PCA、ℓ0-LRA、二元矩阵分解、一个特定的最稠密二分子图问题、{-1,+1}矩阵的切范数计算，以及离散基问题，我们证明这些问题均为NP难。

英文摘要

The low-rank matrix approximation problem with respect to the component-wise $\ell_1$-norm ($\ell_1$-LRA), which is closely related to robust principal component analysis (PCA), has become a very popular tool in data mining and machine learning. Robust PCA aims at recovering a low-rank matrix that was perturbed with sparse noise, with applications for example in foreground-background video separation. Although $\ell_1$-LRA is strongly believed to be NP-hard, there is, to the best of our knowledge, no formal proof of this fact. In this paper, we prove that $\ell_1$-LRA is NP-hard, already in the rank-one case, using a reduction from MAX CUT. Our derivations draw interesting connections between $\ell_1$-LRA and several other well-known problems, namely, robust PCA, $\ell_0$-LRA, binary matrix factorization, a particular densest bipartite subgraph problem, the computation of the cut norm of $\{-1,+1\}$ matrices, and the discrete basis problem, all of which we prove to be NP-hard.


1711.06586 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Cautious NMPC with Gaussian Process Dynamics for Autonomous Miniature Race Cars

谨慎的非线性模型预测控制用于自动驾驶微型赛车

Lukas Hewing, Alexander Liniger, Melanie N. Zeilinger

发表机构 * Institute for Dynamic Systems and Control, ETH Zurich（动态系统与控制研究所，苏黎世联邦理工学院）； Institute for Automatic Control, ETH Zurich（自动控制研究所，苏黎世联邦理工学院）

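AI summary: This paper presents an adaptive high-performance control method that uses a Gaussian process to improve the dynamics model of autonomous miniature race cars from data, increasing racing performance while maintaining safety.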

详情

DOI: 10.23919/ECC.2018.8550162
Journal ref: 2018 European Control Conference (ECC), Limassol, 2018, pp. 1341-1348

AI中文摘要

本文提出了一种自适应高性能控制方法，用于自动驾驶微型赛车。众所周知，赛车动力学很难从第一性原理建模；本文通过一种谨慎的非线性模型预测控制（NMPC）方法来解决这一问题，该方法从数据中学习以改进其动力学模型，并安全地提高赛车性能。该方法利用高斯过程（GP），并通过机会约束形式考虑残差模型不确定性。我们提出了一种稀疏GP近似方法，其诱导输入可动态调整，从而得到可实时实现的控制器。该方法在仿真中得到了验证，结果显示，与不带模型学习的NMPC相比，圈速和约束满足方面均有显著改进。

英文摘要

This paper presents an adaptive high performance control method for autonomous miniature race cars. Racing dynamics are notoriously hard to model from first principles, which is addressed by means of a cautious nonlinear model predictive control (NMPC) approach that learns to improve its dynamics model from data and safely increases racing performance. The approach makes use of a Gaussian Process (GP) and takes residual model uncertainty into account through a chance constrained formulation. We present a sparse GP approximation with dynamically adjusting inducing inputs, enabling a real-time implementable controller. The formulation is demonstrated in simulations, which show significant improvement with respect to both lap time and constraint satisfaction compared to an NMPC without model learning.


1812.06243 2026-06-04 cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities

ODEs的算法理论与从良好条件的logconcave密度采样

Yin Tat Lee, Zhao Song, Santosh S. Vempala

发表机构 * University of Washington & Microsoft Research（华盛顿大学与微软研究院）； UT-Austin & University of Washington（德克萨斯大学奥斯汀分校与华盛顿大学）； Georgia Tech（佐治亚理工学院）

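AI summary: This paper gives a general algorithm for solving multivariate ordinary differential equations whose solution is close to the span of a known basis of functions. This yields a nearly linear implementation of HMC sampling for a class of densities that includes the widely used logistic regression loss.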

详情

AI中文摘要

从统计学和机器学习中出现的logconcave函数采样已成为研究热点。最近的发展包括对 Langevin 动力学和 Hamiltonian Monte Carlo (HMC) 的分析。虽然这两种方法在足够强的光滑条件下对连续过程有维度无关的界，但所得到的离散算法的复杂度和函数评估次数随维度增长。受此问题启发，本文提出了一种通用算法，用于求解解接近已知基函数张成的多元微分方程。所得到的算法具有多项对数深度和几乎紧致的运行时间——几乎与解的表示大小成线性关系。我们将此应用于采样问题，以获得一个几乎线性的HMC实现，适用于广泛使用的logistic回归损失函数，其迭代次数（并行深度）和梯度评估次数为维度的多项对数（而非之前的多项式）。该类包括最近广泛研究的用于logistic回归的损失函数，其权重矩阵不相干。我们还给出了一个更快的算法，具有多项对数深度，适用于更一般和标准的强凸函数类，其梯度具有Lipschitz连续性。这些结果基于（1）对精确HMC过程的改进收缩界，以及（2）在实现HMC时出现的微分方程解的多项式近似次数的对数界。

英文摘要

Sampling logconcave functions arising in statistics and machine learning has been a subject of intensive study. Recent developments include analyses for Langevin dynamics and Hamiltonian Monte Carlo (HMC). While both approaches have dimension-independent bounds for the underlying $\mathit{continuous}$ processes under sufficiently strong smoothness conditions, the resulting discrete algorithms have complexity and number of function evaluations growing with the dimension. Motivated by this problem, in this paper, we give a general algorithm for solving multivariate ordinary differential equations whose solution is close to the span of a known basis of functions (e.g., polynomials or piecewise polynomials). The resulting algorithm has polylogarithmic depth and essentially tight runtime: it is nearly linear in the size of the representation of the solution. We apply this to the sampling problem to obtain a nearly linear implementation of HMC for a broad class of smooth, strongly logconcave densities, with the number of iterations (parallel depth) and gradient evaluations being $\mathit{polylogarithmic}$ in the dimension (rather than polynomial as in previous work). This class includes the widely-used loss function for logistic regression with incoherent weight matrices and has been the subject of much study recently. We also give a faster algorithm with $\mathit{polylogarithmic~depth}$ for the more general and standard class of strongly convex functions with Lipschitz gradient. These results are based on (1) an improved contraction bound for the exact HMC process and (2) logarithmic bounds on the degree of polynomials that approximate solutions of the differential equations arising in implementing HMC.
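
Implementing HMC requires solving the Hamiltonian ODE $\dot q = p$, $\dot p = -\nabla U(q)$. The paper does this with its own polynomial-basis ODE solver; the sketch below swaps in the standard leapfrog integrator purely to show what is being solved. The one-dimensional standard Gaussian target $U(q) = q^2/2$, the step size, and the function names are assumptions for illustration.

```python
def leapfrog(q, p, grad_u, step, n_steps):
    """Integrate dq/dt = p, dp/dt = -grad_u(q) with the leapfrog scheme."""
    p -= 0.5 * step * grad_u(q)          # initial half step for momentum
    for i in range(n_steps):
        q += step * p                    # full step for position
        if i < n_steps - 1:
            p -= step * grad_u(q)        # full step for momentum
    p -= 0.5 * step * grad_u(q)          # final half step for momentum
    return q, p


def grad_u(q):
    """Gradient of U(q) = q^2 / 2 (standard Gaussian target)."""
    return q


def hamiltonian(q, p):
    """Total energy H(q, p) = U(q) + p^2 / 2."""
    return 0.5 * q * q + 0.5 * p * p


q0, p0 = 1.0, 0.0
q1, p1 = leapfrog(q0, p0, grad_u, step=0.1, n_steps=100)
energy_drift = abs(hamiltonian(q1, p1) - hamiltonian(q0, p0))

# Time reversibility: flip the momentum and integrate again.
q_back, p_back = leapfrog(q1, -p1, grad_u, step=0.1, n_steps=100)
```

Leapfrog keeps the energy error bounded for a fixed step size and is time-reversible, which is why it is the usual default; the number of gradient evaluations it needs for a given accuracy is the quantity that the paper's solver is designed to reduce.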


1812.05298 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Cyber-Physical Security and Safety of Autonomous Connected Vehicles: Optimal Control Meets Multi-Armed Bandit Learning

自动驾驶联网车辆的网络安全与安全性：最优控制与多臂老虎机学习

Aidin Ferdowsi, Samad Ali, Walid Saad, Narayan B. Mandayam

发表机构 * Centre for Wireless Communications (CWC), University of Oulu, Finland（奥卢大学无线通信中心（CWC））； WINLAB, Dept. of ECE, Rutgers University, New Brunswick, NJ, USA（罗格斯大学WINLAB，电子与计算机工程系，新泽西州新布朗斯维尔，美国）

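AI summary: This paper proposes a comprehensive framework to thwart cyber and physical attacks on networks of autonomous connected vehicles (ACVs). First, it derives an optimal safe controller that maximizes street traffic flow and minimizes accident risk by optimizing ACV speed and inter-ACV spacing. Second, it proposes data injection attack detection approaches to address cyber attacks on sensors and their physical impact on the ACV system.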

Comments 30 pages, 11 figures

详情

AI中文摘要

自动驾驶联网车辆（ACVs）依赖于车载传感器如摄像头和雷达以及车对车通信来有效运行。这种对网络组件的依赖使ACVs容易受到网络和物理攻击，其中攻击者可以操纵传感器读数并物理上控制ACV。本文提出了一种综合框架，以防止ACV网络中的网络和物理攻击。首先，推导出一个最优安全控制器，通过优化ACV速度和车辆间间距，最大化街道交通流量并最小化事故风险。证明所提出的控制器对旨在使ACV系统不稳定的身体攻击具有鲁棒性。为了提高ACV系统的网络-物理安全性，接下来提出了数据注入攻击（DIA）检测方法，以应对传感器的网络攻击及其对ACV系统的影响。为了全面设计DIA检测方法，将ACV传感器分为两个子集，基于其数据的先验信息可用性。对于具有先验信息的传感器，提出DIA检测方法，并推导出实际和估计值之间的差异的最优阈值水平，使ACV能够抵御网络攻击。对于没有先验信息的传感器，提出了一种新的多臂老虎机（MAB）算法，以使ACV能够安全地控制其运动。仿真结果表明，所提出的最优安全控制器在最大化对物理攻击的鲁棒性方面优于当前最先进的控制器。结果还显示，所提出的DIA检测方法相比卡尔曼滤波，可以提高ACV传感器对网络攻击的安全性，并最终提高ACV系统的物理鲁棒性。

英文摘要

Autonomous connected vehicles (ACVs) rely on intra-vehicle sensors such as camera and radar as well as inter-vehicle communication to operate effectively. This reliance on cyber components exposes ACVs to cyber and physical attacks in which an adversary can manipulate sensor readings and physically take control of an ACV. In this paper, a comprehensive framework is proposed to thwart cyber and physical attacks on ACV networks. First, an optimal safe controller for ACVs is derived to maximize the street traffic flow while minimizing the risk of accidents by optimizing ACV speed and inter-ACV spacing. It is proven that the proposed controller is robust to physical attacks that aim to make ACV systems unstable. Next, to improve the cyber-physical security of ACV systems, data injection attack (DIA) detection approaches are proposed to address cyber attacks on sensors and their physical impact on the ACV system. To comprehensively design the DIA detection approaches, ACV sensors are divided into two subsets based on the availability of a priori information about their data. For sensors with a priori information, a DIA detection approach is proposed and an optimal threshold level is derived for the difference between the actual and estimated values of the sensor data, which enables an ACV to stay robust against cyber attacks. For sensors with no a priori information, a novel multi-armed bandit (MAB) algorithm is proposed to enable an ACV to securely control its motion. Simulation results show that the proposed optimal safe controller outperforms current state-of-the-art controllers by maximizing the robustness of ACVs to physical attacks. The results also show that the proposed DIA detection approaches, compared to Kalman filtering, can improve the security of ACV sensors against cyber attacks and ultimately improve the physical robustness of an ACV system.


1808.04580 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

NFFT meets Krylov methods: Fast matrix-vector products for the graph Laplacian of fully connected networks

NFFT与Krylov方法的结合：用于完全连接网络图拉普拉斯算子的快速矩阵-向量乘法

Dominik Alfke, Daniel Potts, Martin Stoll, Toni Volkmer

Affiliations: Technische Universität Chemnitz, Faculty of Mathematics, Chair of Scientific Computing; Technische Universität Chemnitz, Faculty of Mathematics, Chair of Applied Functional Analysis; Technische Universität Chemnitz, Faculty of Mathematics, Chair of Applied Analysis

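AI summary: This paper proposes NFFT-based fast matrix-vector products for efficient computations with the graph Laplacian of fully connected networks, demonstrates applications to image segmentation and semi-supervised learning, and compares the approach with the Nyström method.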

Comments 28 pages, 9 figures

详情

DOI: 10.3389/fams.2018.00061

AI中文摘要

图拉普拉斯算子是数据科学、机器学习和图像处理中的标准工具。对应的矩阵继承了底层网络的复杂结构，并在某些应用中被密集填充。这使得与图拉普拉斯算子的计算，特别是矩阵-向量乘法，成为一个困难的任务。典型应用是计算其若干特征值和特征向量。标准方法在图中节点数量过大时变得不可行。本文提出利用基于非等间距快速傅里叶变换（NFFT）的快速求和方法，以快速执行图拉普拉斯算子的密集矩阵-向量乘法，而无需形成整个矩阵。NFFT算法的巨大灵活性使我们能够将加速乘法嵌入到基于Lanczos的特征值计算程序或迭代线性系统求解器中，并考虑非标准高斯核。我们通过一系列测试问题展示了该方法的可行性，从图像分割到基于图的PDEs的半监督学习。特别是，我们比较了我们的方法与Nyström方法。此外，我们还提出并测试了改进的、混合版本的Nyström方法，该方法内部使用NFFT。

英文摘要

The graph Laplacian is a standard tool in data science, machine learning, and image processing. The corresponding matrix inherits the complex structure of the underlying network and is in certain applications densely populated. This makes computations, in particular matrix-vector products, with the graph Laplacian a hard task. A typical application is the computation of a number of its eigenvalues and eigenvectors. Standard methods become infeasible when the number of nodes in the graph is too large. We propose the use of the fast summation based on the nonequispaced fast Fourier transform (NFFT) to perform the dense matrix-vector product with the graph Laplacian fast without ever forming the whole matrix. The enormous flexibility of the NFFT algorithm allows us to embed the accelerated multiplication into Lanczos-based eigenvalue routines or iterative linear system solvers and even to consider kernels other than the standard Gaussian. We illustrate the feasibility of our approach on a number of test problems from image segmentation to semi-supervised learning based on graph-based PDEs. In particular, we compare our approach with the Nyström method. Moreover, we present and test an enhanced, hybrid version of the Nyström method, which internally uses the NFFT.
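
The sketch below shows what "without ever forming the whole matrix" means for the unnormalized graph Laplacian $L = D - W$ of a fully connected graph with Gaussian weights: the product $y = Lx$ is accumulated entry by entry as $y_i = \sum_{j \neq i} w_{ij}(x_i - x_j)$. This direct summation costs $O(n^2)$ per product and is exactly the step the NFFT-based fast summation is meant to accelerate; the NFFT itself is not implemented. The kernel scaling $\exp(-\|p_i - p_j\|^2/σ^2)$, the sample points, and the function names are assumptions.

```python
import math


def laplacian_matvec(points, x, sigma=1.0):
    """Compute y = (D - W) x with W_ij = exp(-||p_i - p_j||^2 / sigma^2),
    i != j, without storing W or D."""
    n = len(points)
    y = [0.0] * n
    for i in range(n):
        acc = 0.0
        for j in range(n):
            if i == j:
                continue
            d2 = sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
            w = math.exp(-d2 / sigma ** 2)
            acc += w * (x[i] - x[j])  # row i of D x minus row i of W x
        y[i] = acc
    return y


# Hypothetical 2-D points and test vector.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0)]
x = [1.0, -1.0, 0.5]
y = laplacian_matvec(points, x)
quadratic_form = sum(xi * yi for xi, yi in zip(x, y))
```

Because the routine only exposes a matrix-vector product, it can be handed directly to a Lanczos-type eigenvalue solver or an iterative linear solver, which is how the abstract describes embedding the accelerated multiplication.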


1711.04178 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

CUR Decompositions, Similarity Matrices, and Subspace Clustering

CUR分解、相似矩阵与子空间聚类

Akram Aldroubi, Keaton Hamm, Ahmet Bugra Koku, Ali Sekmen

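AI summary: This paper presents a general framework for solving the subspace clustering problem using the CUR decomposition. The similarity matrices it constructs give the exact clustering in the noise-free case; the decomposition also yields many distinct similarity matrices for handling noisy data, and two known subspace clustering methods are derived from it.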

Comments Approximately 30 pages. Current version contains improved algorithm and numerical experiments from the previous version

详情

AI中文摘要

本文提出了一种利用CUR分解解决子空间聚类问题的通用框架。CUR分解提供了一种自然方法来构造数据来自未知子空间联盟$\mathscr{U}=\underset{i=1}{\overset{M}\bigcup}S_i$的相似矩阵。由此构造的相似矩阵在无噪声情况下能够实现精确聚类。此外，这种分解还能从给定数据集生成多种不同的相似矩阵，从而具有足够的灵活性以对含噪声数据进行准确聚类。我们还展示了两种已知的子空间聚类方法可以从CUR分解中推导出来。本文还提出了一种基于相似矩阵理论构造的算法，并在合成和真实数据上进行了实验以测试该方法。此外，本文还利用了基于CUR的相似矩阵的改进版本，提供了一种启发式算法用于子空间聚类；该算法在Hopkins155运动分割数据集上的聚类性能目前最佳。

英文摘要

A general framework for solving the subspace clustering problem using the CUR decomposition is presented. The CUR decomposition provides a natural way to construct similarity matrices for data that come from a union of unknown subspaces $\mathscr{U}=\underset{i=1}{\overset{M}\bigcup}S_i$. The similarity matrices thus constructed give the exact clustering in the noise-free case. Additionally, this decomposition gives rise to many distinct similarity matrices from a given set of data, which allow enough flexibility to perform accurate clustering of noisy data. We also show that two known methods for subspace clustering can be derived from the CUR decomposition. An algorithm based on the theoretical construction of similarity matrices is presented, and experiments on synthetic and real data are presented to test the method. Additionally, an adaptation of our CUR based similarity matrices is utilized to provide a heuristic algorithm for subspace clustering; this algorithm yields the best overall performance to date for clustering the Hopkins155 motion segmentation dataset.
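
A CUR decomposition writes $A = C\,U^{\dagger}R$, where $C$ and $R$ are selected columns and rows of $A$ and $U$ is their intersection; the identity is exact when $U$ has the same rank as $A$. The sketch below checks this in the simplest case, a rank-one matrix rebuilt from one column, one row, and their single shared entry. The example matrix and function name are hypothetical, and the construction of similarity matrices for clustering is not shown.

```python
def rank_one_cur(A, i, j):
    """Rebuild a rank-one matrix from column j, row i, and the entry A[i][j]."""
    u = A[i][j]                    # U: 1x1 intersection of the row and column
    if u == 0:
        raise ValueError("pivot entry must be nonzero")
    col = [row[j] for row in A]    # C: the selected column
    r = A[i]                       # R: the selected row
    return [[c * rk / u for rk in r] for c in col]


# Hypothetical rank-one matrix: every row is a multiple of [1, 2, 3].
A = [[2.0, 4.0, 6.0],
     [1.0, 2.0, 3.0],
     [3.0, 6.0, 9.0]]
B = rank_one_cur(A, i=1, j=0)
```

Since $C$ and $R$ are actual columns and rows of the data, the factors stay interpretable in terms of the original data points, which is what makes the decomposition a natural starting point for similarity matrices.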


1511.06444 2026-06-04 cs.LG cs.NA math.NA math.PR 版本更新

Universal halting times in optimization and machine learning

优化与机器学习中的通用停止时间

Levent Sagun, Thomas Trogdon, Yann LeCun

发表机构 * Mathematics Department（数学系）； Department of Mathematics（数学系）； Computer Science Department（计算机科学系）； New York University（纽约大学）； University of California, Irvine（加州大学欧文分校）

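AI summary: By analyzing the halting-time distributions of optimization algorithms on random systems such as spin glasses and deep learning, this study finds that they are independent of the underlying distribution under certain conditions, and identifies two classes of distributions: Gumbel-like and Gaussian-like.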

详情

DOI: 10.1090/qam/1483
Journal ref: Quart. Appl. Math. 76 (2018), 289-301

AI中文摘要

作者展示了优化算法在两个随机系统（自旋玻璃和深度学习）中达到给定精度所需的迭代次数的经验分布。给定一个算法（即优化过程和随机景观的形式），停止时间的波动遵循一种分布，即使在改变景观分布后，经过中心化和标准化后仍保持不变。我们观察到两种定性类别：一种类似于Gumbel分布的分布，出现在谷歌搜索、人类决策时间、QR特征值算法和自旋玻璃中；另一种类似于高斯分布的分布，出现在共轭梯度法、使用MNIST输入数据的深度网络和使用随机输入数据的深度网络中。这种经验证据表明，存在一种分布类，其停止时间在某些条件下与底层分布无关。

英文摘要

The authors present empirical distributions for the halting time (measured by the number of iterations to reach a given accuracy) of optimization algorithms applied to two random systems: spin glasses and deep learning. Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed. We observe two qualitative classes: a Gumbel-like distribution that appears in Google searches, human decision times, the QR eigenvalue algorithm, and spin glasses, and a Gaussian-like distribution that appears in the conjugate gradient method, a deep network with MNIST input data, and a deep network with random input data. This empirical evidence suggests the presence of a class of distributions for which the halting time is independent of the underlying distribution under some conditions.


1802.08678 2026-06-04 eess.SY cs.LG cs.RO cs.SY stat.ML 版本更新

Verifying Controllers Against Adversarial Examples with Bayesian Optimization

通过贝叶斯优化验证控制器对抗示例

Shromona Ghosh, Felix Berkenkamp, Gireeja Ranade, Shaz Qadeer, Ashish Kapoor

发表机构 * Microsoft Research, Redmond（微软研究院（红mond））

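AI summary: This paper presents an active-testing framework based on Bayesian optimization for verifying controller safety. Safety constraints are specified using logic, and the space of behaviors is searched efficiently for adversarial counterexamples that violate the safety specifications.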

Comments Proc. of the IEEE International Conference on Robotics and Automation, 2018

详情

DOI: 10.1109/ICRA.2018.8460635

AI中文摘要

最近强化学习的成功促使开发了用于现实世界机器人的复杂控制器。由于这些机器人被部署在安全关键应用中并与人类交互，确保安全性以避免造成伤害变得至关重要。为此方向的一个初步步骤是测试控制器在仿真中的表现。为了做到这一点，我们需要明确安全的定义，然后高效地搜索所有行为空间以确定其安全性。在本文中，我们提出了一种基于贝叶斯优化的主动测试框架。我们使用逻辑指定安全约束，并利用问题中的结构来测试系统，以发现违反安全规范的对抗示例。这些规范被定义为轨迹上的光滑函数的复杂布尔组合，与强化学习中的奖励函数不同，它们是表达性强且对系统施加硬约束。在我们的框架中，我们利用单个函数的正则性假设，形式化为高斯过程（GP）先验。我们结合这些内容到一个连贯的优化框架中，利用问题结构。所得到的算法能够证明验证复杂的安全规范或找到对抗示例。实验结果表明，所提出的方法能够快速发现对抗示例。

英文摘要

Recent successes in reinforcement learning have led to the development of complex controllers for real-world robots. As these robots are deployed in safety-critical applications and interact with humans, it becomes critical to ensure safety in order to avoid causing harm. A first step in this direction is to test the controllers in simulation. To be able to do this, we need to capture what we mean by safety and then efficiently search the space of all behaviors to see if they are safe. In this paper, we present an active-testing framework based on Bayesian Optimization. We specify safety constraints using logic and exploit structure in the problem in order to test the system for adversarial counterexamples that violate the safety specifications. These specifications are defined as complex boolean combinations of smooth functions on the trajectories and, unlike reward functions in reinforcement learning, are expressive and impose hard constraints on the system. In our framework, we exploit regularity assumptions on individual functions in the form of a Gaussian Process (GP) prior. We combine these into a coherent optimization framework using problem structure. The resulting algorithm is able to provably verify complex safety specifications or alternatively find counterexamples. Experimental results show that the proposed method is able to find adversarial examples quickly.

1812.03216 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control

基于鲁棒控制的零样本深度强化学习驾驶策略迁移用于自动驾驶车辆

Zhuo Xu, Chen Tang, Masayoshi Tomizuka

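Affiliations: University of California, Berkeley

AI summary: This paper proposes a robust-control-based method for zero-shot transfer of deep reinforcement learning driving policies. It transfers concrete kinematic quantities to address the modeling gap between the source and target domains, combines a transferable hierarchical RL trajectory planner with a disturbance-observer-based robust tracking controller, and validates zero-shot transfer across several driving scenarios.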

Comments Published at IEEE ITSC 2018

详情

AI中文摘要

尽管深度强化学习（深度RL）方法在应用于自动驾驶时具有诸多优势，但真实应用却受到源域（训练）与目标域（部署）之间建模差距的限制。与当前的策略迁移方法不同，本文提出转移具体的自动驾驶运动学量。所提出的基于鲁棒控制的（RC）通用迁移架构，称为RL-RC，结合了可转移的分层强化学习轨迹规划器和基于扰动观测器（DOB）的鲁棒跟踪控制器。通过训练已知的名义动力学模型的深度RL策略直接转移到目标域，DOB基于的鲁棒跟踪控制用于处理建模差距，包括车辆动力学误差和外部扰动如侧向力。我们提供了模拟验证所提出方法在多个驾驶场景如车道保持、车道变更和障碍物避让中的迁移能力。

英文摘要

Although deep reinforcement learning (deep RL) methods have many strengths that are favorable for autonomous driving, real deep RL applications in autonomous driving have been slowed down by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to using uninterpretable neural network representations as the transferred features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing, and obstacle avoidance.

1805.07857 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

Parallel Transport Convolution: A New Tool for Convolutional Neural Networks on Manifolds

平行运输卷积：用于流形上卷积神经网络的新工具

Stefan C. Schonsheck, Bin Dong, Rongjie Lai

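Affiliations: Rensselaer Polytechnic Institute

AI summary: This paper proposes parallel transport convolution (PTC), a new way to extend convolution to manifolds and their discrete counterparts. PTC preserves compact support, directionality, and transferability across manifolds, which makes it possible to build wavelet-like operations and deep convolutional neural networks on curved domains.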

Comments 10 pages

详情

AI中文摘要

卷积在科学和工程中的各种应用中扮演了重要的角色，是卷积神经网络中最关键的操作。近年来，研究者对在曲面域（如流形和图）上推广卷积的兴趣增长，但现有方法无法保持欧几里得卷积的所有理想特性，即紧凑支持滤波器、方向性和跨不同流形的可转移性。本文开发了一种新的卷积操作扩展，称为平行运输卷积（PTC），应用于黎曼流形及其离散对应物。PTC基于平行运输，能够沿流形传输信息并内在保持方向性。PTC允许构建具有紧凑支持的滤波器，并且对流形变形具有鲁棒性。这使得我们能够执行小波样操作，并在曲面域上定义深度卷积神经网络。

英文摘要

Convolution has played a prominent role in various applications in science and engineering for many years. It is the most important operation in convolutional neural networks. Research interest in generalizing convolutions to curved domains such as manifolds and graphs has grown recently. However, existing approaches cannot preserve all the desirable properties of Euclidean convolutions, namely compactly supported filters, directionality, and transferability across different manifolds. In this paper we develop a new generalization of the convolution operation, referred to as parallel transport convolution (PTC), on Riemannian manifolds and their discrete counterparts. PTC is designed based on parallel transport, which is able to translate information along a manifold and to intrinsically preserve directionality. PTC allows for the construction of compactly supported filters and is also robust to manifold deformations. This enables us to perform wavelet-like operations and to define deep convolutional neural networks on curved domains.

1812.00679 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Data Driven Chiller Plant Energy Optimization with Domain Knowledge

数据驱动的冷水机组能源优化与领域知识

Hoang Dung Vu, Kok Soon Chai, Bryan Keating, Nurislam Tursynbek, Boyan Xu, Kaige Yang, Xiaoyan Yang, Zhenjie Zhang

Affiliations: Kaer Pte. Ltd.; University of Illinois at Urbana Champaign; Nazarbayev University; Guangdong University of Technology; University College London; Advanced Digital Sciences Center

AI summary: This paper proposes a data-driven approach to real-time chiller plant optimization that incorporates domain knowledge, and a real-world case study shows that it markedly reduces daily power consumption.

Comments CIKM2017. Proceedings of the 26th ACM International Conference on Information and Knowledge Management. 2017

详情

DOI: 10.1145/3132847.3132860

AI中文摘要

制冷和冷水机组优化是机械工程中的重要且广泛研究的主题，主要利用物理模型，基于过于简化的假设在设备上进行设计。传统优化技术使用物理模型进行在线参数调整，仅基于有限的硬件规格和外部条件信息，例如室外天气。近年来，新一代传感器成为新冷水机组的重要组成部分，首次使系统管理员能够及时准确地持续监控所有设备的运行状态。数据激增，由机器学习和数据挖掘的分析能力增加驱动，揭示了数据驱动方法在实时冷水机组优化中的新可能性。本文介绍了我们在冷水机组上采用数据模型和优化的研究和工业经验，并讨论了我们在实际设备上的实践教训。与复杂机器学习模型不同，我们强调将适当的领域知识纳入数据分析工具中，这在很大程度上超越了最先进的深度学习技术的性能。我们在实际冷水机组上的实证评估实现了每日电力消耗的节省超过7%。

英文摘要

Refrigeration and chiller optimization is an important and well-studied topic in mechanical engineering, mostly relying on physical models of the equipment that are built on over-simplified assumptions. Conventional optimization techniques that use physical models make online parameter tuning decisions based on very limited information about hardware specifications and external conditions, e.g., outdoor weather. In recent years, a new generation of sensors has become an essential part of new chiller plants, for the first time allowing system administrators to continuously monitor the running status of all equipment in a timely and accurate way. The explosive growth of data flowing into databases, together with the increasing analytical power of machine learning and data mining, opens up new possibilities for data-driven approaches to real-time chiller plant optimization. This paper presents our research and industrial experience with adopting data models and optimization on chiller plants and discusses the lessons learned from our practice on real-world plants. Instead of employing complex machine learning models, we emphasize the incorporation of appropriate domain knowledge into data analysis tools, which turns out to be the key factor in outperforming state-of-the-art deep learning techniques by a significant margin. Our empirical evaluation on a real-world chiller plant achieves savings of more than 7% in daily power consumption.

1804.01526 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Training DNNs with Hybrid Block Floating Point

用混合块浮点数训练DNNs

Mario Drumond, Tao Lin, Martin Jaggi, Babak Falsafi

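Affiliations: EPFL

AI summary: This paper proposes hybrid block floating point (HBFP), which performs all dot products in block floating point and all other operations in floating point, improving hardware density and throughput while preserving accuracy.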

Comments 9 pages, 3 figures. Accepted in Neural Information Processing Systems 2018 (NeurIPS 2018)

详情

AI中文摘要

深度神经网络（DNN）的广泛应用催生了持续增长的计算需求，迫使数据中心运营商采用领域专用加速器来训练它们。这些加速器通常使用密集打包的全精度浮点运算以最大化面积性能。持续的研究努力旨在通过用固定点运算替换浮点运算来进一步提高这种性能密度。然而，这些尝试面临的主要障碍是固定点的动态范围狭窄，不足以支持DNN训练的收敛。我们识别出块浮点数（BFP）作为有前途的替代表示，因为它具有宽动态范围，并且能够使大多数DNN运算使用固定点逻辑进行。不幸的是，BFP单独引入了几个限制，使其无法直接应用。在本文中，我们引入了HBFP，一种混合BFP-FP方法，它在BFP中执行所有点积运算，其他运算使用浮点运算。HBFP实现了两全其美：浮点数的高精度和固定点的优越硬件密度。对于广泛的各种模型，我们证明HBFP在保持浮点数精度的同时，能够实现高达8.5倍的吞吐量。

英文摘要

The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithmetic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed point's narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP, a hybrid BFP-FP approach, which performs all dot products in BFP and other operations in floating point. HBFP delivers the best of both worlds: the high accuracy of floating point at the superior hardware density of fixed point. For a wide variety of models, we show that HBFP matches floating point's accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput.
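To make the block floating point idea concrete, the sketch below quantizes each vector to integer mantissas that share one exponent, so the dot product itself needs only integer multiply-accumulate and a single final scaling. It is a minimal illustration under stated assumptions: the 8-bit mantissa width, the function names `to_bfp` and `bfp_dot`, and the round-and-clamp scheme are hypothetical choices, not the HBFP design from the paper.

```python
import math


def to_bfp(values, mantissa_bits=8):
    """Quantize a block of floats to (integer mantissas, one shared exponent)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Pick the exponent so the largest magnitude fits the signed mantissa range
    exp = math.frexp(max_abs)[1] - (mantissa_bits - 1)
    scale = 2.0 ** exp
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exp


def bfp_dot(a, b, mantissa_bits=8):
    """Dot product with integer-only multiply-accumulate and one final rescale."""
    ma, ea = to_bfp(a, mantissa_bits)
    mb, eb = to_bfp(b, mantissa_bits)
    acc = sum(x * y for x, y in zip(ma, mb))  # fixed-point arithmetic only
    return acc * 2.0 ** (ea + eb)
```

Because the exponent is chosen per block, the representable range follows the data, which is the property that plain fixed point lacks. The cost is that small values in a block with one large value lose precision.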

1811.12830 2026-06-04 math.NA cs.LG cs.NA math.FA 版本更新

Beltrami-Net: Domain Independent Deep D-bar Learning for Absolute Imaging with Electrical Impedance Tomography (a-EIT)

Beltrami-Net: 域无关的深度D-bar学习用于电阻抗断层成像（a-EIT）

S. J. Hamilton, A. Hänninen, A. Hauptmann, V. Kolehmainen

Affiliations: Department of Mathematics, Statistics, and Computer Science, Marquette University; Department of Applied Physics, University of Eastern Finland; Department of Computer Science, University College London

AI summary: This paper proposes a new a-EIT image reconstruction method that pairs deep learning with real-time robust D-bar methods. Training data are generated from a nonphysical Beltrami equation, so the improvement in image quality does not depend on boundary shape.

Comments 15 pages, 8 figures, 3 tables

详情

AI中文摘要

目标：开发并证明一种新的绝对电阻抗断层成像（a-EIT）图像重建方法，该方法结合了深度学习技术与实时鲁棒D-bar方法。方法：将D-bar方法与训练好的卷积神经网络（CNN）作为后处理步骤结合。通过使用关联的非物理Beltrami方程而非传统特定领域的电流和电压数据来模拟训练数据，从而实现训练数据与边界形状无关。该方法在两个EIT系统（ACT4和KIT4）的实验数据上进行了测试。主要结果：用CNN后处理D-bar图像，在结构相似性指数（SSIM）以及相对ℓ₂和ℓ₁图像误差方面显著提高了图像质量。意义：本工作展示了无需特定边界形状即可训练更通用网络的可能性，这是EIT图像重建中的关键挑战。该工作对未来涉及解剖学大数据库的研究具有前景。

英文摘要

Objective: To develop, and demonstrate the feasibility of, a novel image reconstruction method for absolute Electrical Impedance Tomography (a-EIT) that pairs deep learning techniques with real-time robust D-bar methods. Approach: A D-bar method is paired with a trained Convolutional Neural Network (CNN) as a post-processing step. Training data is simulated for the network using no knowledge of the boundary shape by using an associated nonphysical Beltrami equation rather than simulating the traditional current and voltage data specific to a given domain. This allows the training data to be boundary shape independent. The method is tested on experimental data from two EIT systems (ACT4 and KIT4). Main Results: Post processing the D-bar images with a CNN produces significant improvements in image quality measured by Structural SIMilarity indices (SSIMs) as well as relative $\ell_2$ and $\ell_1$ image errors. Significance: This work demonstrates that more general networks can be trained without being specific about boundary shape, a key challenge in EIT image reconstruction. The work is promising for future studies involving databases of anatomical atlases.

1811.11433 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Beyond Pham's algorithm for joint diagonalization

超越Pham算法的联合对角化

Pierre Ablin, Jean-François Cardoso, Alexandre Gramfort

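Affiliations: INRIA - Parietal team; CNRS - Institut d’Astrophysique de Paris

AI summary: This paper proposes a new quasi-Newton method for optimizing the joint-diagonalization criterion introduced by Pham, and experiments on simulated and real datasets show that it outperforms Pham's algorithm.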

1804.01983 2026-06-04 math.NA cs.CV cs.LG cs.NA 版本更新

High-dimension Tensor Completion via Gradient-based Optimization Under Tensor-train Format

通过张量列车格式的梯度优化实现高维张量补全

Longhao Yuan, Qibin Zhao, Lihua Gui, Jianting Cao

Affiliations: Graduate School of Engineering, Saitama Institute of Technology, Japan; Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan; School of Automation, Guangdong University of Technology, China; School of Computer Science and Technology, Hangzhou Dianzi University, China

AI summary: This paper proposes a gradient-based optimization method under the tensor-train format for completing missing data in high-dimensional tensors. It seeks a low-rank tensor-train decomposition that captures the latent features of the data, solves the completion problem efficiently with gradient descent, and introduces a visual data tensorization method that improves the algorithms' performance.

详情

AI中文摘要

张量列车（TT）分解因其在高阶张量中的强大表示能力和稳定性而受到关注。本文提出了一种新的方法，用于恢复由高阶张量表示的不完整数据中的缺失条目。我们尝试找到不完整数据的低秩TT分解，以捕捉整个数据集的潜在特征，然后重建缺失条目。通过应用梯度下降算法，利用优化模型高效地解决了张量补全问题。我们提出了两种基于TT的算法：张量列车加权优化（TT-WOPT）和张量列车随机梯度下降（TT-SGD），用于优化TT分解因子。此外，提出了一种名为视觉数据张量化（VDT）的方法，将视觉数据转换为高阶张量，从而提升了我们算法的性能。在合成数据和视觉数据的实验中，我们的算法在高阶、高缺失率和大规模张量补全情况下表现出高效和优越的性能，相比最先进的补全算法。

英文摘要

Tensor train (TT) decomposition has attracted attention for its powerful representation ability and stable performance on high-order tensors. In this paper, we propose a novel approach to recovering the missing entries of incomplete data represented by higher-order tensors. We attempt to find a low-rank TT decomposition of the incomplete data that captures the latent features of the whole data and then reconstruct the missing entries. By applying gradient descent algorithms, the tensor completion problem is solved efficiently through optimization models. We propose two TT-based algorithms, Tensor Train Weighted Optimization (TT-WOPT) and Tensor Train Stochastic Gradient Descent (TT-SGD), to optimize the TT decomposition factors. In addition, a method named Visual Data Tensorization (VDT) is proposed to transform visual data into higher-order tensors, which improves the performance of our algorithms. Experiments on synthetic data and visual data show the high efficiency and performance of our algorithms compared to state-of-the-art completion algorithms, especially in high-order, high-missing-rate, and large-scale tensor completion situations.
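In the TT format, a single tensor entry is a product of small matrices, one slice per core, and a completion objective sums the squared error over observed entries only. The sketch below shows both pieces under stated assumptions: the nested-list core layout `core[r_prev][i][r_next]`, the names `tt_entry` and `masked_loss`, and the unweighted 0.5 factor are illustrative choices, not the TT-WOPT or TT-SGD code.

```python
def tt_entry(cores, index):
    """Evaluate one tensor entry from TT cores laid out as core[r_prev][i][r_next]."""
    vec = [1.0]  # boundary rank is 1
    for core, i in zip(cores, index):
        r_next = len(core[0][i])
        # Multiply the running row vector by the i-th slice of this core
        vec = [sum(vec[a] * core[a][i][b] for a in range(len(vec)))
               for b in range(r_next)]
    return vec[0]


def masked_loss(cores, observed):
    """Half the squared error over observed entries; missing entries contribute nothing.

    observed: dict mapping index tuples to observed values.
    """
    return 0.5 * sum((tt_entry(cores, idx) - val) ** 2
                     for idx, val in observed.items())
```

Gradient descent on `masked_loss` with respect to the core entries would then fit the decomposition to the observed data, and `tt_entry` reconstructs the missing entries afterwards.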

1803.00444 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY stat.ML 版本更新

Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling

通过非参数时空子目标建模实现逆强化学习

Adrian Šošić, Elmar Rueckert, Jan Peters, Abdelhak M. Zoubir, Heinz Koeppl

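Affiliations: Signal Processing Group; Institute for Robotics and Cognitive Systems; Autonomous Systems Labs; Bioinspired Communication Systems

AI summary: This paper proposes an inverse reinforcement learning method based on nonparametric spatio-temporal subgoal modeling. Explaining even a single trajectory locally, within its context, yields a more compact representation of behavior, and an implicit intentional model forecasts behavior in unobserved situations. The method performs well when intentions change over time and in active learning scenarios.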

Comments 45 pages, 14 figures; ### Version 3 ### published in the Journal of Machine Learning Research

详情

AI中文摘要

逆强化学习（IRL）领域的发展导致了更复杂的推理框架，这些框架放宽了原始建模假设，即观察到的代理行为仅反映单一意图。相反于学习全局行为模型，最近的IRL方法将演示数据分成部分，以考虑不同轨迹可能对应不同意图，例如因为它们由不同领域专家生成。在本工作中，我们进一步采用子目标的直观概念，建立一个前提：即使单条轨迹在特定上下文中局部解释也比全局更高效，从而实现更紧凑的行为表示。基于这一假设，我们构建了代理目标的隐式意图模型，以预测未观察到的情况。结果是一种集成的贝叶斯预测框架，显著优于现有IRL解决方案，并提供与专家计划一致的平滑策略估计。最值得注意的是，我们的框架自然处理代理意图随时间变化的情况，而经典IRL算法失败。此外，由于其概率性质，该模型可以轻松应用于主动学习场景，以指导专家的演示过程。

英文摘要

Advances in the field of inverse reinforcement learning (IRL) have led to sophisticated inference frameworks that relax the original modeling assumption of observing an agent behavior that reflects only a single intention. Instead of learning a global behavioral model, recent IRL methods divide the demonstration data into parts, to account for the fact that different trajectories may correspond to different intentions, e.g., because they were generated by different domain experts. In this work, we go one step further: using the intuitive concept of subgoals, we build upon the premise that even a single trajectory can be explained more efficiently locally within a certain context than globally, enabling a more compact representation of the observed behavior. Based on this assumption, we build an implicit intentional model of the agent's goals to forecast its behavior in unobserved situations. The result is an integrated Bayesian prediction framework that significantly outperforms existing IRL solutions and provides smooth policy estimates consistent with the expert's plan. Most notably, our framework naturally handles situations where the intentions of the agent change over time and classical IRL algorithms fail. In addition, due to its probabilistic nature, the model can be straightforwardly applied in active learning scenarios to guide the demonstration process of the expert.

1811.12084 2026-06-04 cs.CV cs.LG cs.NA math.AP math.NA 版本更新

Networks for Nonlinear Diffusion Problems in Imaging

图像中非线性扩散问题的网络

Simon Arridge, Andreas Hauptmann

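Affiliations: Department of Computer Science, University College London

AI summary: This paper proposes DiffNet, a network architecture based on nonlinear diffusion processes for diffusion-related problems in imaging. It offers better interpretability and generalizability than classical convolutional neural networks and performs on par with U-Net on the inverse problem of nonlinear diffusion.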

详情

AI中文摘要

许多成像和视觉任务近期通过深度学习方法，特别是卷积神经网络的应用，经历了重大变革。这些方法在某些应用中取得了显著成果，即使这些应用并不明显表明卷积适合捕捉底层物理。在本文中，我们开发了一种基于非线性扩散过程的网络架构，称为DiffNet。通过设计，我们获得了一种适合图像中扩散相关问题的非线性网络架构。此外，所执行的更新是显式的，从而比传统卷积神经网络架构获得了更好的可解释性和泛化能力。在STL-10图像数据集上测试DiffNet在非线性扩散逆问题中的性能，使用Perona-Malik滤波器。我们获得的结果与已建立的U-Net架构具有竞争力，参数数量和必要的训练数据较少。

英文摘要

A multitude of imaging and vision tasks have recently been transformed by deep learning methods, in particular by the application of convolutional neural networks. These methods achieve impressive results, even for applications where it is not apparent that convolutions are suited to capture the underlying physics. In this work we develop a network architecture based on nonlinear diffusion processes, named DiffNet. By design, we obtain a nonlinear network architecture that is well suited for diffusion-related problems in imaging. Furthermore, the performed updates are explicit, which gives better interpretability and generalisability compared to classical convolutional neural network architectures. The performance of DiffNet is tested on the inverse problem of nonlinear diffusion with the Perona-Malik filter on the STL-10 image dataset. We obtain results competitive with the established U-Net architecture, with a fraction of the parameters and necessary training data.
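For readers unfamiliar with the Perona-Malik filter, one explicit update step on a 1D signal can be sketched as follows. This shows only the classical forward filter, not DiffNet itself. The diffusivity g(s) = 1 / (1 + (s / kappa)^2) is one of the standard choices, and the parameter values `kappa=0.1` and `dt=0.2` as well as the zero-flux boundary handling are assumptions made for the example.

```python
def perona_malik_step(u, kappa=0.1, dt=0.2):
    """One explicit step of u <- u + dt * div(g(|grad u|) * grad u) on a 1D signal."""
    def g(d):
        # Edge-stopping diffusivity: close to 1 for small differences, close to 0 at edges
        return 1.0 / (1.0 + (d / kappa) ** 2)

    out = list(u)
    for i in range(len(u)):
        # Differences to the right and left neighbours; zero flux at the boundaries
        right = u[i + 1] - u[i] if i + 1 < len(u) else 0.0
        left = u[i - 1] - u[i] if i > 0 else 0.0
        out[i] = u[i] + dt * (g(right) * right + g(left) * left)
    return out
```

Small fluctuations are smoothed because g is close to 1 there, while large jumps are preserved because g is close to 0 across them. The inverse problem mentioned in the abstract asks for the input signal given the filtered output.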

1811.11259 2026-06-04 cs.LG cs.AI cs.DS cs.SY eess.SY stat.ML 版本更新

Scaling Configuration of Energy Harvesting Sensors with Reinforcement Learning

基于强化学习的能源收集传感器的扩展配置

Francesco Fraternali, Bharathan Balaji, Rajesh Gupta

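Affiliations: University of California, San Diego; University of California, Los Angeles

AI summary: This paper uses reinforcement learning to automatically configure the sampling rate of energy harvesting sensors powered by indoor solar panels. Shortening the training phase and reducing computational requirements allows fast deployment and scaling to large deployments, increasing the amount of sensed data while avoiding energy depletion.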

Comments 7 pages, 5 figures

详情

DOI: 10.1145/3279755.3279760
Journal ref: ENSsys '18: International Workshop on Energy Harvesting & Energy-Neutral Sensing Systems, November 4, 2018, Shenzhen, China

AI中文摘要

随着物联网（IoT）的出现，越来越多的能源收集方法被用于补充或替代电池供电传感器。能源收集传感器需要根据应用、硬件和环境条件进行配置，以最大化其效用。目前，传感器配置要么是手动的，要么基于启发式方法，需要宝贵的领域专业知识。强化学习（RL）是一种有前景的方法，可以自动化配置并高效扩展IoT部署，但尚未在实践中得到应用。我们提出了解决这一差距的解决方案：减少RL的训练阶段，使节点在部署后短时间内即可运行，并减少计算需求以扩展到大规模部署。我们专注于配置基于室内太阳能板的能源收集传感器的采样率。我们基于三个月内从5个传感器节点收集的数据创建了一个模拟器。我们的模拟结果表明，RL可以有效学习能源可用性模式，并配置传感器节点的采样率以在确保不耗尽能源存储的情况下最大化传感数据。通过我们的方法，节点可以在部署的第一天内投入使用。我们还展示了通过使用相似光照条件的节点共享单个策略来减少RL策略数量的可能性。

英文摘要

With the advent of the Internet of Things (IoT), an increasing number of energy harvesting methods are being used to supplement or supplant battery based sensors. Energy harvesting sensors need to be configured according to the application, hardware, and environmental conditions to maximize their usefulness. As of today, the configuration of sensors is either manual or heuristics based, requiring valuable domain expertise. Reinforcement learning (RL) is a promising approach to automate configuration and efficiently scale IoT deployments, but it is not yet adopted in practice. We propose solutions to bridge this gap: reduce the training phase of RL so that nodes are operational within a short time after deployment and reduce the computational requirements to scale to large deployments. We focus on configuration of the sampling rate of indoor solar panel based energy harvesting sensors. We created a simulator based on 3 months of data collected from 5 sensor nodes subject to different lighting conditions. Our simulation results show that RL can effectively learn energy availability patterns and configure the sampling rate of the sensor nodes to maximize the sensing data while ensuring that energy storage is not depleted. The nodes can be operational within the first day by using our methods. We show that it is possible to reduce the number of RL policies by using a single policy for nodes that share similar lighting conditions.

1811.10275 2026-06-04 stat.CO cs.LG cs.NA math.NA stat.ML 版本更新

Rejoinder for "Probabilistic Integration: A Role in Statistical Computation?"

对“概率积分：在统计计算中的作用？”的回应

Francois-Xavier Briol, Chris J. Oates, Mark Girolami, Michael A. Osborne, Dino Sejdinovic

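Affiliations: Imperial College London; Newcastle University; University of Oxford

AI summary: This is the rejoinder to the discussion of the paper "Probabilistic Integration: A Role in Statistical Computation?", which is to appear in Statistical Science. The authors thank the reviewers and colleagues for their help, respond to the questions raised by the discussants, and examine the use of Bayesian methods in numerical analysis and their role in statistical computation.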

Comments Accepted to Statistical Science

1809.01588 2026-06-04 math.NA cs.LG cs.NA math.AG math.ST stat.TH 版本更新

Learning Paths from Signature Tensors

从签名张量学习路径

Max Pfeffer, Anna Seigal, Bernd Sturmfels

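AI summary: Using tensor decomposition, algebraic geometry, and numerical optimization, this paper studies matrix congruence for tensors. For an inverse problem from stochastic analysis, it proposes a method for recovering paths from their third-order signature tensors and establishes identifiability results for paths.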

Comments 22 pages, 3 figures

1811.07799 2026-06-04 cs.MA cs.LG cs.SY eess.SY 版本更新

Distributed Learning of Average Belief Over Networks Using Sequential Observations

在使用顺序观测的网络上分布式学习平均信念

Kaiqing Zhang, Yang Liu, Ji Liu, Mingyan Liu, Tamer Başar

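AI summary: This paper studies how multiple agents in a network, each receiving sequential observations, can reach consensus on the average belief by exchanging information only with their neighbors. It proposes two distributed online algorithms, for undirected and directed graphs, both of which converge almost surely to the average belief and reach consensus at an O(1/t) rate. For undirected graphs, the algorithm is also modified to handle quantized communication and limited-precision division, and the modified algorithm is shown to drive all agents either to a quantized consensus or into a small neighborhood of the average belief.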

Comments Accepted to Automatica

详情

AI中文摘要

本文解决了分布式学习平均信念的问题，其中n>1个智能体通过仅与邻居交换信息来达成对平均信念的共识。每个智能体以在线方式接收到其信念的序列样本。n个智能体之间的邻居关系由一个可能随时间变化的图描述，其顶点对应智能体，边表示邻居关系。提出两种分布式在线算法，适用于无向和有向图，均能几乎 surely 收敛到平均信念。此外，两种算法生成的序列在高概率下以O(1/t)的速率达成共识，其中t是迭代次数。对于无向图，相应算法被修改以适应量化通信和有限精度的除法操作。证明修改后的算法使所有n个智能体要么达成量化共识，要么进入其信念平均值的附近区域。随后提供了数值模拟以验证理论结果。

英文摘要

This paper addresses the problem of distributed learning of average belief with sequential observations, in which a network of $n>1$ agents aims to reach a consensus on the average value of their beliefs by exchanging information only with their neighbors. Each agent receives sequentially arriving samples of its belief in an online manner. The neighbor relationships among the $n$ agents are described by a possibly time-varying graph whose vertices correspond to agents and whose edges depict neighbor relationships. Two distributed online algorithms are introduced for undirected and directed graphs, which are both shown to converge to the average belief almost surely. Moreover, the sequences generated by both algorithms are shown to reach consensus with an $O(1/t)$ rate with high probability, where $t$ is the number of iterations. For undirected graphs, the corresponding algorithm is modified for the case with quantized communication and limited precision of the division operation. It is shown that the modified algorithm causes all $n$ agents to either reach a quantized consensus or enter a small neighborhood around the average of their beliefs. Numerical simulations are then provided to corroborate the theoretical results.
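The setting can be illustrated with a generic dynamic average consensus update: each agent mixes its neighbors' states and adds the change in its own running sample mean. This is a sketch under stated assumptions, not the algorithm analyzed in the paper. It assumes a fixed, doubly stochastic weight matrix `weights`, and the name `track_average` and the `samples[i][t]` layout are hypothetical.

```python
def track_average(samples, weights):
    """samples[i][t]: agent i's observation at time t.
    weights: doubly stochastic mixing matrix; weights[i][j] > 0 only for neighbours.
    Returns each agent's final estimate of the network-wide average belief.
    """
    n = len(samples)
    steps = len(samples[0])
    means = [s[0] for s in samples]  # running mean of each agent's own samples
    x = list(means)                  # consensus states start at the local estimates
    for t in range(1, steps):
        new_means = [means[i] + (samples[i][t] - means[i]) / (t + 1)
                     for i in range(n)]
        # Mix neighbours' states, then inject the change in the local running mean
        x = [sum(weights[i][j] * x[j] for j in range(n)) + new_means[i] - means[i]
             for i in range(n)]
        means = new_means
    return x
```

Because the columns of `weights` sum to one, the sum of the states always equals the sum of the local running means, so agreement among the agents implies agreement on the average.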

1811.05788 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Learning to Compensate Photovoltaic Power Fluctuations from Images of the Sky by Imitating an Optimal Policy

通过模仿最优策略从天空图像中学习补偿光伏功率波动

Robin Spiess, Felix Berkenkamp, Jan Poland, Andreas Krause

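Affiliations: Department of Computer Science, ETH Zurich; ABB Corporate Research, Switzerland

AI summary: This paper proposes a deep learning approach that uses images of the sky to compensate photovoltaic power fluctuations predictively and reduce battery stress, training a neural network by imitation learning to approximate the optimal policy.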

Comments 7 pages, 7 figures

详情

AI中文摘要

光伏（PV）发电站的输出功率取决于环境，因此会随时间波动。这导致光伏功率可能在电网中引起不稳定性，尤其是在日益广泛使用的情况下。限制功率输出变化率是缓解这些波动的常见方法，通常借助大型电池。一种使用这些电池补偿阶跃变化的反应控制器在实践中有效，但会导致电池因高能量通过而受到压力。在本文中，我们提出了一种深度学习方法，利用天空图像来预测性地补偿功率波动并减少电池压力。特别是，我们证明可以通过仅在事后可用的信息来计算最优控制策略。基于此，我们使用模仿学习训练一个神经网络，该网络近似这种事后最优策略，但仅使用当前可用的天空图像和传感器数据。我们对一个大规模的测量和图像数据集进行了评估，并展示了训练后的策略能够减少电池压力。

英文摘要

The energy output of photovoltaic (PV) power plants depends on the environment and thus fluctuates over time. As a result, PV power can cause instability in the power grid, in particular when increasingly used. Limiting the rate of change of the power output is a common way to mitigate these fluctuations, often with the help of large batteries. A reactive controller that uses these batteries to compensate ramps works in practice, but causes stress on the battery due to a high energy throughput. In this paper, we present a deep learning approach that uses images of the sky to compensate power fluctuations predictively and reduces battery stress. In particular, we show that the optimal control policy can be computed using information that is only available in hindsight. Based on this, we use imitation learning to train a neural network that approximates this hindsight-optimal policy, but uses only currently available sky images and sensor data. We evaluate our method on a large dataset of measurements and images from a real power plant and show that the trained policy reduces stress on the battery.
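The reactive baseline described above can be sketched as a simple ramp-rate limiter: the grid output follows the PV power but may change by at most a fixed amount per step, and the battery covers the difference. The function names, the per-step ramp limit `max_ramp`, and the use of summed absolute battery power as a stress proxy are assumptions for illustration, not the controller or the stress measure used in the paper.

```python
def ramp_limited_output(pv_power, max_ramp):
    """Reactive ramp-rate limiter.

    Returns (grid_output, battery_power). battery_power > 0 means the battery
    discharges to cover a shortfall, and < 0 means it charges.
    """
    grid, battery = [], []
    prev = pv_power[0]
    for p in pv_power:
        # Move toward the current PV power, but by at most max_ramp per step
        step = max(-max_ramp, min(max_ramp, p - prev))
        out = prev + step
        grid.append(out)
        battery.append(out - p)
        prev = out
    return grid, battery


def energy_throughput(battery, dt=1.0):
    """Total energy cycled through the battery, a simple proxy for battery stress."""
    return sum(abs(b) for b in battery) * dt
```

A predictive policy that anticipates a drop can start ramping down before the cloud arrives. That is why a policy computed in hindsight can reach a lower throughput than this reactive rule.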

1811.05646 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Fast Distribution Grid Line Outage Identification with $μ$PMU

快速分布电网线路故障识别与μPMU

Yizheng Liao, Yang Weng, Chin-Woo Tan, Ram Rajagopal

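AI summary: This paper proposes a data-driven outage monitoring approach based on stochastic time-series analysis of μPMU measurements. Power flow analysis shows that the statistics of the voltage time series change significantly after a line outage, and a maximum-likelihood method learns the distribution parameters directly, enabling fast and accurate outage identification.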

Comments 9 pages

详情

AI中文摘要

随着分布式能源资源（DERs）在城市配电网中的日益集成，由于DER行为的不确定性和复杂性，可靠性问题日益突出。在大规模DER渗透的情况下，传统故障检测方法依赖于客户电话呼叫和智能电表的“最后信号”，由于可再生能源发电机可在线路故障后供电且许多城市电网为网状结构，线路故障不会影响供电。为解决这些问题，我们提出了一种基于微量程相量测量单元（μPMU）的随机时间序列分析的数据驱动故障监测方法。具体而言，我们通过功率流分析证明，线路故障后电压时间序列的依赖性表现出显著的统计变化。这使得基于最优变化点检测的理论能够通过μPMU实现快速且准确的线路故障识别。然而，现有变化点检测方法需要分布系统中故障后电压分布未知。因此，我们设计了一种基于最大似然的方法，直接从μPMU数据中学习分布参数。我们证明，基于估计参数的检测仍能实现最优性能，使其在配电网故障识别中具有极高的实用性。仿真结果表明，在八个配置有和没有DERs的配电网中，使用μPMU数据实现了高度准确的故障识别。

英文摘要

The growing integration of distributed energy resources (DERs) in urban distribution grids raises various reliability issues due to DERs' uncertain and complex behaviors. With large-scale DER penetration, traditional outage detection methods, which rely on customers making phone calls and smart meters' "last gasp" signals, will have limited performance, because renewable generators can supply power after line outages and many urban grids are meshed, so line outages do not affect power supply. To address these drawbacks, we propose a data-driven outage monitoring approach based on stochastic time series analysis of measurements from micro phasor measurement units ($μ$PMUs). Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory of optimal change-point detection suitable for identifying line outages via $μ$PMUs with fast and accurate sampling. However, existing change-point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood-based method to directly learn the distribution parameters from $μ$PMU data. We prove that detection based on the estimated parameters still achieves the optimal performance, making it extremely useful for distribution grid outage identification. Simulation results show highly accurate outage identification in eight distribution grids with 14 configurations, with and without DERs, using $μ$PMU data.

1808.00113 2026-06-04 eess.SY cs.LG cs.RO cs.SY math.OC 版本更新

Learning Stabilizable Dynamical Systems via Control Contraction Metrics

通过控制收缩度量学习可稳定化的动态系统

Sumeet Singh, Vikas Sindhwani, Jean-Jacques E. Slotine, Marco Pavone

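Affiliations: Dept. of Aeronautics and Astronautics, Stanford University; Google Brain Robotics, New York; Dept. of Mechanical Engineering, Massachusetts Institute of Technology

AI summary: This paper proposes a new framework for learning stabilizable nonlinear dynamical systems for continuous control tasks in robotics. The core idea is a control-theoretic regularizer rooted in stabilizability, which guarantees that the learned system can be paired with a robust controller capable of stabilizing any open-loop trajectory the system may generate. Using tools from contraction theory, statistical learning, and convex optimization, the authors provide a general and tractable semi-supervised algorithm for learning stabilizable dynamics that applies to complex underactuated systems. On a simulated planar quadrotor, the regularized model gives notably better trajectory generation and tracking than models learned with traditional regression, especially with few demonstrations. The results illustrate the need to combine standard model-based reinforcement learning with concepts from nonlinear control theory for improved reliability.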

Comments To appear at WAFR 2018. v2: re-structured Sections 3 & 4 to improve clarity; expanded discussion on limitations & future work in Section 5; added details on training & validation, significantly expanded experiments

详情

AI中文摘要

我们提出了一种新的框架，用于学习可稳定化的非线性动态系统，以实现机器人连续控制任务。关键思想是开发一种基于稳定性的控制理论正则化器，用于动态拟合，该正则化器保证所学习的系统可以配备一个稳健的控制器，能够稳定任何系统可能生成的开环轨迹。通过利用收缩理论、统计学习和凸优化工具，我们提供了一个通用且可操作的半监督算法来学习可稳定化的动态系统，可以应用于复杂的欠驱动系统。我们在模拟平面四旋翼系统上验证了所提算法，并观察到与传统回归技术学习的模型相比，使用控制理论正则化模型在轨迹生成和跟踪性能上有显著改进，尤其是在使用少量示范示例时。所呈现的结果展示了将标准基于模型的强化学习算法与非线性控制理论概念结合的必要性，以提高可靠性。

英文摘要

We propose a novel framework for learning stabilizable nonlinear dynamical systems for continuous control tasks in robotics. The key idea is to develop a new control-theoretic regularizer for dynamics fitting rooted in the notion of stabilizability, which guarantees that the learned system can be accompanied by a robust controller capable of stabilizing any open-loop trajectory that the system may generate. By leveraging tools from contraction theory, statistical learning, and convex optimization, we provide a general and tractable semi-supervised algorithm to learn stabilizable dynamics, which can be applied to complex underactuated systems. We validated the proposed algorithm on a simulated planar quadrotor system and observed notably improved trajectory generation and tracking performance with the control-theoretic regularized model over models learned using traditional regression techniques, especially when using a small number of demonstration examples. The results presented illustrate the need to infuse standard model-based reinforcement learning algorithms with concepts drawn from nonlinear control theory for improved reliability.

1811.04006 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Reachability-based safe learning for optimal control problem

基于可达性的安全学习用于最优控制问题

Stanislav Fedorov, Antonio Candelieri

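Affiliations: Department of Computer Science, Systems and Communication, University of Milano Bicocca

AI summary: This paper proposes a safe learning method that combines a partly known state-space model of the system with unknown dynamics treated as an additive bounded disturbance. It chooses optimal actions within a safe set to reach the target set, updates the disturbance during learning, and improves the robustness of the optimal control.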

详情

AI中文摘要

在本文中，我们寻求一种整合安全性的学习方法，该方法依赖于部分已知的系统状态空间模型，并将未知动态视为加性有界扰动。我们引入了一个框架，用于在存在扰动的情况下安全地学习控制策略。基于已知模型部分，算法可以在满足安全保持条件的情况下，选择最优动作以追求目标集。在一些学习回合后，扰动可以根据现实数据进行更新。为此，对收集的扰动样本进行高斯过程回归。由于现实世界的不稳定性，例如摩擦或导电性随温度的变化，我们期望获得更鲁棒的最优控制问题解决方案。为了评估上述方法，我们选择倒立摆作为基准模型。所提出的算法能够学习到不违反预设安全约束的策略。当将其与探索设置结合时，观察到性能有所提升，从而确保在安全集内学习到最优策略。最后，我们概述了一些超出本文范围的未来研究方向。

英文摘要

In this work, we seek an approach to integrating safety into the learning process that relies on a partly known state-space model of the system and regards the unknown dynamics as an additive bounded disturbance. We introduce a framework for safely learning a control strategy for a given system with an additive disturbance. On the basis of the known part of the model and a safe set in which the system can learn safely, the algorithm can choose optimal actions for pursuing the target set as long as the safety-preserving condition is satisfied. After some learning episodes, the disturbance can be updated based on real-world data. To this end, Gaussian Process regression is conducted on the collected disturbance samples. Because real-world conditions are not constant (for example, friction or conductivity changes with temperature), we expect this to yield a more robust solution of the optimal control problem. To evaluate the approach described above, we choose an inverted pendulum as a benchmark model. The proposed algorithm manages to learn a policy that does not violate the pre-specified safety constraints. Performance improves when exploration is incorporated to make sure that an optimal policy is learned everywhere in the safe set. Finally, we outline some promising directions for future research beyond the scope of this paper.

1811.03853 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Sample-Efficient Policy Learning based on Completely Behavior Cloning

基于完全行为克隆的高效策略学习

Qiming Zou, Ling Wang, Ke Lu, Yu Li

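Affiliations: Department of Computer Science and Technology, Harbin Institute of Technology, China; Department of Management Science and Engineering, Anhui University of Technology, China

AI summary: This paper proposes PLCBC, a policy initialization algorithm based on complete behavior cloning. It converts model predictive control into a piecewise affine function and expresses that function with a neural network, achieving complete cloning without training and improving the efficiency and convergence of policy learning.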

1811.03621 2026-06-04 cs.HC cs.CV cs.LG cs.SY eess.SY stat.ML 版本更新

Satyam: Democratizing Groundtruth for Machine Vision

Satyam: 机器视觉领域地面真实数据的民主化

Hang Qiu, Krishna Chintalapudi, Ramesh Govindan

发表机构 * University of Southern California（南加州大学）； Microsoft Research（微软研究院）

AI总结本文提出Satyam系统，通过简化流程使非专业人员能够高效收集机器视觉的地面真实数据，从而提升自动驾驶、交通监控和视频监控系统的性能。


AI中文摘要

机器学习（ML）的民主化催生了用于自动驾驶、交通监控和视频监控的基于ML的机器视觉系统。然而，如果不能大幅简化为训练和测试这些系统收集地面真实数据的过程，真正的民主化就无法实现。这种地面真实数据的收集对于确保系统在不同条件下具有良好性能是必要的。在本文中，我们介绍了Satyam系统的设计和评估，这是首个同类系统，使非专业人士能够以最小的努力启动机器视觉的地面真实数据收集任务。Satyam利用众包平台Amazon Mechanical Turk，并自动化了地面真实数据收集中几个具有挑战性的环节：创建和启动定制的网页用户界面任务以获取所需的地面真实数据，在存在作弊者和未经训练的工人的情况下控制结果质量，根据任务复杂性调整价格，过滤作弊者和表现不佳的工人，以及处理工人的报酬。我们使用几个流行的基准视觉数据集验证了Satyam，并证明通过Satyam获得的地面真实数据与训练有素的专家获得的数据相当，且用于训练时可获得相当的机器学习性能。

英文摘要

The democratization of machine learning (ML) has led to ML-based machine vision systems for autonomous driving, traffic monitoring, and video surveillance. However, true democratization cannot be achieved without greatly simplifying the process of collecting groundtruth for training and testing these systems. This groundtruth collection is necessary to ensure good performance under varying conditions. In this paper, we present the design and evaluation of Satyam, a first-of-its-kind system that enables a layperson to launch groundtruth collection tasks for machine vision with minimal effort. Satyam leverages a crowdtasking platform, Amazon Mechanical Turk, and automates several challenging aspects of groundtruth collection: creating and launching of custom web-UI tasks for obtaining the desired groundtruth, controlling result quality in the face of spammers and untrained workers, adapting prices to match task complexity, filtering spammers and workers with poor performance, and processing worker payments. We validate Satyam using several popular benchmark vision datasets, and demonstrate that groundtruth obtained by Satyam is comparable to that obtained from trained experts and provides matching ML performance when used for training.


1803.08287 2026-06-04 eess.SY cs.AI cs.LG cs.RO cs.SY 版本更新

Learning-based Model Predictive Control for Safe Exploration

基于学习的模型预测控制用于安全探索

Torsten Koller, Felix Berkenkamp, Matteo Turchetta, Andreas Krause

发表机构 * Vector Institute（向量研究所）； Max Planck ETH Center for Learning Systems（马克斯·普朗克-ETH学习系统中心）

AI总结本文提出了一种基于学习的模型预测控制方法，通过高斯过程先验假设构建可证明准确的轨迹置信区间，从而提供可证明的高概率安全保证，用于动态系统的安全高效探索和学习。

Comments Proc. of the Conference on Decision and Control, 2018


AI中文摘要

基于学习的方法在没有显著系统先验知识的情况下成功解决了复杂控制任务。然而，这些方法通常不提供任何安全保证，这限制了它们在安全关键的现实应用中的使用。在本文中，我们提出了一种基于学习的模型预测控制方案，可以提供可证明的高概率安全保证。为此，我们利用高斯过程先验对动态特性进行正则性假设，以构建可证明准确的预测轨迹置信区间。与以往的方法不同，我们不假设模型不确定性是独立的。基于这些预测，我们保证轨迹满足安全约束。此外，我们使用终端集约束递归地保证在每个迭代中都存在安全的控制动作。在我们的实验中，我们展示了所提出算法可以安全且高效地探索和学习动态系统。

英文摘要

Learning-based methods have been successful in solving complex control tasks without significant prior knowledge about the system. However, these methods typically do not provide any safety guarantees, which prevents their use in safety-critical, real-world applications. In this paper, we present a learning-based model predictive control scheme that can provide provable high-probability safety guarantees. To this end, we exploit regularity assumptions on the dynamics in terms of a Gaussian process prior to construct provably accurate confidence intervals on predicted trajectories. Unlike previous approaches, we do not assume that model uncertainties are independent. Based on these predictions, we guarantee that trajectories satisfy safety constraints. Moreover, we use a terminal set constraint to recursively guarantee the existence of safe control actions at every iteration. In our experiments, we show that the resulting algorithm can be used to safely and efficiently explore and learn about dynamic systems.


1811.02052 2026-06-04 eess.SY cs.LG cs.MA cs.SY 版本更新

Managing engineering systems with large state and action spaces through deep reinforcement learning

通过深度强化学习管理具有大状态和动作空间的工程系统

C. P. Andriotis, K. G. Papakonstantinou

发表机构 * Department of Civil & Environmental Engineering（土木与环境工程系）； The Pennsylvania State University（宾夕法尼亚州立大学）； University Park（大学公园）； USA（美国）

AI总结本文提出了一种集成的深度强化学习框架，用于管理具有大状态和动作空间的多组件工程系统，通过开发深度集中多智能体Actor-Critic（DCMAC）方法，提供高效生命周期策略，以应对高维空间中的复杂决策问题。


AI中文摘要

工程系统的决策可以高效地建模为马尔可夫决策过程（MDP）或部分可观测马尔可夫决策过程（POMDP）。典型的MDP和POMDP求解方法利用关于环境的离线知识，为状态和动作空间可处理的较小系统提供详细策略。然而，在大型多组件系统中，这些空间的规模很容易爆炸，因为系统状态和动作的数量随组件数量呈指数级增长，而整个系统的环境动态难以用显式形式描述，可能只能通过数值模拟器获取。为了解决这些问题，本文引入了一个集成的深度强化学习（DRL）框架，并开发了深度集中式多智能体Actor-Critic（DCMAC）方法。这是一种离策略（off-policy）的Actor-Critic DRL方法，可为在高维空间中运行的大型多组件系统提供高效的生命周期策略。除了使用深度函数近似来参数化大型状态空间之外，DCMAC还采用了系统动作的因子化表示，能够指定个体化的组件级和子系统级决策，同时为整个系统保持集中的价值函数。在适用的情况下，DCMAC与深度Q网络（DQN）解和精确策略相比表现良好，并且优于基于时间、基于状态和周期性策略的优化基线。

英文摘要

Decision-making for engineering systems can be efficiently formulated as a Markov Decision Process (MDP) or a Partially Observable MDP (POMDP). Typical MDP and POMDP solution procedures utilize offline knowledge about the environment and provide detailed policies for relatively small systems with tractable state and action spaces. However, in large multi-component systems the sizes of these spaces easily explode, as system states and actions scale exponentially with the number of components, whereas environment dynamics are difficult to be described in explicit forms for the entire system and may only be accessible through numerical simulators. In this work, to address these issues, an integrated Deep Reinforcement Learning (DRL) framework is introduced. The Deep Centralized Multi-agent Actor Critic (DCMAC) is developed, an off-policy actor-critic DRL approach, providing efficient life-cycle policies for large multi-component systems operating in high-dimensional spaces. Apart from deep function approximations that parametrize large state spaces, DCMAC also adopts a factorized representation of the system actions, being able to designate individualized component- and subsystem-level decisions, while maintaining a centralized value function for the entire system. DCMAC compares well against Deep Q-Network (DQN) solutions and exact policies, where applicable, and outperforms optimized baselines that are based on time-based, condition-based and periodic policies.


1811.02033 2026-06-04 stat.ML cs.LG cs.NA math.AP math.NA 版本更新

Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations

用于随机微分方程的物理信息生成对抗网络

Liu Yang, Dongkun Zhang, George Em Karniadakis

发表机构 * Division of Applied Mathematics, Brown University, Providence, RI 02912, USA（应用数学系，布朗大学，普罗维登斯，罗德岛州，02912，美国）

AI总结本文提出了一种新的物理信息生成对抗网络（PI-GANs），通过统一解决随机问题中的正向、逆向和混合问题，利用自动微分将物理定律编码到GAN架构中，展示了PI-GANs在高维随机微分方程求解中的准确性和有效性。


AI中文摘要

我们开发了一类新的物理信息生成对抗网络（PI-GANs），基于有限数量的散点测量，以统一的方式求解正向、逆向和混合随机问题。与仅依赖数据进行训练的标准GANs不同，我们利用自动微分，将以随机微分方程（SDEs）形式表示的控制物理定律编码到GANs的架构中。具体而言，我们采用了带梯度惩罚的Wasserstein GANs（WGAN-GP），因为它比原始GANs更稳定。我们首先测试了WGAN-GP逼近不同相关长度的高斯过程的能力，所用数据是稀疏布置的传感器同时读数得到的样本实现。即使输入噪声的维度与目标随机过程的有效维度不匹配，生成的随机过程仍能很好地逼近目标过程。我们还研究了判别器和生成器的过拟合问题，发现除了此前已报道的判别器过拟合之外，生成器也会出现过拟合。随后，我们考虑了椭圆型SDEs的求解，这需要逼近三个随机过程，即解、强迫项和扩散系数。我们在PI-GANs中使用了三个生成器，其中两个是前馈深度神经网络（DNNs），另一个是由SDE诱导的神经网络。根据数据的不同，我们使用一个或多个前馈DNNs作为PI-GANs中的判别器。本文展示了PI-GANs在求解最高30维SDEs时的准确性和有效性；原则上，只要有更多的传感器数据，PI-GANs就可以处理维数非常高的问题，且计算成本仅呈低阶多项式增长。

英文摘要

We developed a new class of physics-informed generative adversarial networks (PI-GANs) to solve in a unified manner forward, inverse and mixed stochastic problems based on a limited number of scattered measurements. Unlike standard GANs relying only on data for training, here we encoded into the architecture of GANs the governing physical laws in the form of stochastic differential equations (SDEs) using automatic differentiation. In particular, we applied Wasserstein GANs with gradient penalty (WGAN-GP) for its enhanced stability compared to vanilla GANs. We first tested WGAN-GP in approximating Gaussian processes of different correlation lengths based on data realizations collected from simultaneous reads at sparsely placed sensors. We obtained good approximation of the generated stochastic processes to the target ones even for a mismatch between the input noise dimensionality and the effective dimensionality of the target stochastic processes. We also studied the overfitting issue for both the discriminator and generator, and we found that overfitting occurs also in the generator in addition to the discriminator as previously reported. Subsequently, we considered the solution of elliptic SDEs requiring approximations of three stochastic processes, namely the solution, the forcing, and the diffusion coefficient. We used three generators for the PI-GANs, two of them were feed forward deep neural networks (DNNs) while the other one was the neural network induced by the SDE. Depending on the data, we employed one or multiple feed forward DNNs as the discriminators in PI-GANs. Here, we have demonstrated the accuracy and effectiveness of PI-GANs in solving SDEs for up to 30 dimensions, but in principle, PI-GANs could tackle very high dimensional problems given more sensor data with low-polynomial growth in computational cost.


1811.01721 2026-06-04 math.NA cs.LG cs.NA 版本更新

Rethinking floating point for deep learning

重新思考深度学习中的浮点运算

Jeff Johnson

发表机构 * Facebook AI Research（脸书人工智能研究）

AI总结本文提出了一种新的混合对数乘/线性加、Kulisch累加和渐缩编码的8位对数浮点格式，以提高能效并保持精度，同时在不重新训练网络的情况下，实现了与原始float32 ResNet-50模型在ImageNet上的高精度性能。


AI中文摘要

降低神经网络的硬件开销以实现更快或更低功耗的推理和训练，是一个活跃的研究领域。使用整数乘加的均匀量化已得到充分研究，但它需要学习许多量化参数、微调训练或满足其他先决条件。相对于这一基线，很少有工作致力于改进浮点运算；浮点运算仍然能效低下，而且缩减字长会导致所需动态范围的急剧损失。我们结合新的混合对数乘法/线性加法、Kulisch累加以及源自Gustafson posit格式的渐缩编码，对浮点运算加以改进，使其在28 nm ASIC工艺上比同等位宽的整数硬件更节能，同时在8位下保持精度。无需重新训练网络，仅通过就近舍入到偶数（round-to-nearest-even）直接替换所有数学运算和float32参数，这一开源的8位对数浮点格式在ImageNet上的top-1精度与原始float32 ResNet-50 CNN模型相差不超过0.9%，top-5精度相差不超过0.2%。与int8量化不同，它仍然是通用的浮点运算，开箱即可解释。我们的8/38位对数浮点乘加单元在28 nm工艺下完成了综合和功耗分析，其功耗为8/32位整数乘加的0.96倍，面积为1.12倍。在16位下，我们的对数浮点乘加单元的功耗为IEEE 754 float16融合乘加的0.59倍，面积为0.68倍，同时保持相同的有效数字精度和动态范围，证明它对训练用ASIC同样有用。

英文摘要

Reducing hardware overhead of neural networks for faster or lower power inference and training is an active area of research. Uniform quantization using integer multiply-add has been thoroughly investigated, which requires learning many quantization parameters, fine-tuning training or other prerequisites. Little effort is made to improve floating point relative to this baseline; it remains energy inefficient, and word size reduction yields drastic loss in needed dynamic range. We improve floating point to be more energy efficient than equivalent bit width integer hardware on a 28 nm ASIC process while retaining accuracy in 8 bits with a novel hybrid log multiply/linear add, Kulisch accumulation and tapered encodings from Gustafson's posit format. With no network retraining, and drop-in replacement of all math and float32 parameters via round-to-nearest-even only, this open-sourced 8-bit log float is within 0.9% top-1 and 0.2% top-5 accuracy of the original float32 ResNet-50 CNN model on ImageNet. Unlike int8 quantization, it is still a general purpose floating point arithmetic, interpretable out-of-the-box. Our 8/38-bit log float multiply-add is synthesized and power profiled at 28 nm at 0.96x the power and 1.12x the area of 8/32-bit integer multiply-add. In 16 bits, our log float multiply-add is 0.59x the power and 0.68x the area of IEEE 754 float16 fused multiply-add, maintaining the same significand precision and dynamic range, proving useful for training ASICs as well.
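The "log multiply" idea in the abstract can be illustrated with a generic log number system: if a value is stored as a sign plus a fixed-point base-2 logarithm, multiplication reduces to an integer addition of the logarithms. The sketch below is only that general principle, not the paper's 8-bit format; `FRACTION_BITS`, `encode`, `decode`, and `log_multiply` are hypothetical names, and the linear-domain (Kulisch) accumulation and tapered encoding are not modeled.

```python
import math

FRACTION_BITS = 4  # hypothetical number of fractional bits kept for log2|x|


def encode(x):
    """Encode a nonzero x as (sign, fixed-point log2|x|), rounded to nearest."""
    if x == 0:
        raise ValueError("zero needs a special code in a log number system")
    sign = -1 if x < 0 else 1
    log_fixed = round(math.log2(abs(x)) * (1 << FRACTION_BITS))
    return sign, log_fixed


def decode(code):
    """Map (sign, fixed-point log2) back to an ordinary float."""
    sign, log_fixed = code
    return sign * 2.0 ** (log_fixed / (1 << FRACTION_BITS))


def log_multiply(a, b):
    """Multiply two encoded values: signs multiply, logarithms add."""
    return a[0] * b[0], a[1] + b[1]


# Powers of two are represented exactly; other values carry rounding error.
exact = decode(log_multiply(encode(2.0), encode(-8.0)))
approx = decode(log_multiply(encode(3.0), encode(5.0)))
print(exact, approx)
```

With two operands each rounded by at most half a step in the logarithm, the product's relative error in this sketch is bounded by a factor of 2^(1/16), roughly 4.4%.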


1806.06498 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Conditional Affordance Learning for Driving in Urban Environments

面向城市环境的条件性 affordance 学习

Axel Sauer, Nikolay Savinov, Andreas Geiger

发表机构 * Chair of Robotics Science and System Intelligence, Technical University of Munich（慕尼黑工业大学机器人科学与系统智能教席）

AI总结本文提出了一种直接感知方法，通过将视频输入映射到适合复杂城市环境自主导航的中间表示，结合高层方向输入，实现了比现有强化学习和条件模仿学习方法更高的目标导向导航性能，并首次通过图像级标签处理交通灯和速度标志，显著减少模拟中的交通事故。

Comments Accepted for Conference on Robot Learning (CoRL) 2018


AI中文摘要

大多数现有的自动驾驶方法分为两类：模块化流水线方法，它们构建环境的详尽模型；以及模仿学习方法，它们直接将图像映射到控制输出。最近提出的第三种范式，即直接感知，旨在通过神经网络学习适当的低维中间表示来结合两者的优点。然而，现有的直接感知方法仅限于简单的高速公路场景，缺乏通过交叉路口、在交通灯前停车或遵守限速的能力。在本文中，我们提出了一种直接感知方法，在给定高层方向输入的情况下，将视频输入映射到适合在复杂城市环境中自主导航的中间表示。与最先进的强化学习和条件模仿学习方法相比，我们在具有挑战性的CARLA模拟基准上将目标导向导航的性能提升了最多68%。此外，我们的方法首次仅使用图像级标签来处理交通灯和限速标志，并实现了平滑的跟车，从而显著减少了模拟中的交通事故。

英文摘要

Most existing approaches to autonomous driving fall into one of two categories: modular pipelines, that build an extensive model of the environment, and imitation learning approaches, that map images directly to control outputs. A recently proposed third paradigm, direct perception, aims to combine the advantages of both by using a neural network to learn appropriate low-dimensional intermediate representations. However, existing direct perception approaches are restricted to simple highway situations, lacking the ability to navigate intersections, stop at traffic lights or respect speed limits. In this work, we propose a direct perception approach which maps video input to intermediate representations suitable for autonomous navigation in complex urban environments given high-level directional inputs. Compared to state-of-the-art reinforcement and conditional imitation learning approaches, we achieve an improvement of up to 68 % in goal-directed navigation on the challenging CARLA simulation benchmark. In addition, our approach is the first to handle traffic lights and speed signs by using image-level labels only, as well as smooth car-following, resulting in a significant reduction of traffic accidents in simulation.


1605.03364 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Active Uncertainty Calibration in Bayesian ODE Solvers

贝叶斯常微分方程求解器中的主动不确定性校准

Hans Kersting, Philipp Hennig

发表机构 * Max-Planck-Institute for Intelligent Systems（马克斯·普朗克智能系统研究所）

AI总结本文研究了如何在贝叶斯常微分方程求解器中平衡计算成本与概率校准，提出了一种基于滤波的方法Bayesian Quadrature filtering (BQF)，通过主动学习梯度测量的不精确性来提高不确定性校准。

Comments 10 pages, 3 figures, published at UAI 2016. Changes for Version 3: fixed minor index mistake in equation (14) (q-1-i instead of q+1-i on top of the product)


Journal ref: Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence (UAI2016) 309--3018

AI中文摘要

在统计学和机器学习领域，人们对返回概率测度而非点估计的常微分方程（ODEs）求解器重新产生了兴趣。最近，Conrad等人引入了一类基于采样的方法，这些方法在特定意义上是“良好校准”（well-calibrated）的。但是，这些方法的计算成本显著高于经典方法。另一方面，Schober等人指出经典Runge-Kutta ODE求解器与高斯滤波器之间存在精确的联系，这种联系只能提供粗略的概率校准，但额外计算开销可以忽略不计。通过将ODE的求解表述为线性高斯SDE中的近似推断，我们研究了一系列概率ODE求解器，它们在计算成本和概率校准之间进行权衡，并指出不精确的梯度测量是不确定性的关键来源。我们提出了一种新的基于滤波的方法，即贝叶斯求积滤波（Bayesian Quadrature filtering，BQF），它利用贝叶斯求积，通过收集多次梯度评估来主动学习梯度测量中的不精确性。

英文摘要

There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al. introduced a sampling-based class of methods that are 'well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al. pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers, that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.


1811.00961 2026-06-04 math.DS cs.LG cs.SY eess.SY math.OC 版本更新

可证明加速的随机传话算法

Nicolas Loizou, Michael Rabbat, Peter Richtárik

发表机构 * The University of Edinburgh, UK（爱丁堡大学，英国）； Facebook AI Research, Montreal（脸书人工智能研究，蒙特利尔）； KAUST, KSA（阿卜杜拉国王科技大学，沙特阿拉伯）

AI总结本文提出了一种可证明加速的随机传话算法，用于解决平均一致性问题。该算法受到最近开发的加速随机Kaczmarz方法的启发，该方法是解决线性系统问题的流行方法。在每次传话迭代中，网络中的所有节点都会更新它们的值，但只有成对的节点交换私有信息。还展示了在流行无线传感器网络上的数值实验，展示了我们协议的优势。
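For orientation, the average consensus problem named in the summary can be sketched with the standard (non-accelerated) randomized pairwise gossip baseline: at each step a random edge is chosen and its two endpoints replace their values with the pair's average. This is not the accelerated protocol of the paper; the function name, the 4-node ring, and the initial values below are hypothetical.

```python
import random


def pairwise_gossip(values, edges, steps, seed=0):
    """Standard randomized pairwise gossip on an undirected graph.

    At every step one edge (i, j) is drawn uniformly at random and both
    endpoints are set to the average of their current values.
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(steps):
        i, j = rng.choice(edges)
        avg = (x[i] + x[j]) / 2.0
        x[i] = x[j] = avg
    return x


# Hypothetical 4-node ring network with made-up initial values (mean 4.0).
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
start = [1.0, 5.0, 3.0, 7.0]
final = pairwise_gossip(start, ring, steps=200)
print(final)
```

Each pairwise average preserves the sum of the node values, so on a connected graph the values drift toward the global mean without any node ever seeing more than one neighbor's value at a time.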

1810.12429 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

打破时域诅咒：无限时域离策略估计

Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou

发表机构 * The University of Texas at Austin（德克萨斯大学奥斯汀分校）； Google Brain（谷歌大脑）

AI总结本文提出了一种新的离策略估计方法，通过直接在平稳状态访问分布上应用重要性采样来避免现有估计器中方差爆炸的问题，核心贡献是提出了一种估计两个平稳分布密度比的新方法，并推导了RKHS情况下的闭式解。

Comments 21 pages, 5 figures, NIPS 2018 (spotlight)


AI中文摘要

我们考虑离策略估计问题，即使用由不同的行为策略收集的样本来估计目标策略的期望奖励。重要性采样（IS）一直是推导（近似）无偏估计器的关键技术，但已知它在长时域问题中方差过高。在无限时域问题这一极端情况下，基于IS的估计器的方差甚至可能无界。在本文中，我们提出了一种新的离策略估计方法，直接在平稳状态访问分布上应用重要性采样，以避免现有估计器面临的方差爆炸问题。我们的关键贡献是提出了一种估计两个平稳分布密度比的新方法，且只需从行为分布中采样轨迹。我们为该估计问题构造了一个极小极大（mini-max）损失函数，并推导了RKHS情况下的闭式解。我们通过理论和实证分析支持我们的方法。

英文摘要

We consider the off-policy estimation problem of estimating the expected reward of a target policy using samples collected by a different behavior policy. Importance sampling (IS) has been a key technique to derive (nearly) unbiased estimators, but is known to suffer from an excessively high variance in long-horizon problems. In the extreme case of infinite-horizon problems, the variance of an IS-based estimator may even be unbounded. In this paper, we propose a new off-policy estimation method that applies IS directly on the stationary state-visitation distributions to avoid the exploding variance issue faced by existing estimators. Our key contribution is a novel approach to estimating the density ratio of two stationary distributions, with trajectories sampled from only the behavior distribution. We develop a mini-max loss function for the estimation problem, and derive a closed-form solution for the case of RKHS. We support our method with both theoretical and empirical analyses.
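The variance problem described in the abstract can be made concrete with a toy calculation. The sketch below assumes a state-free setting in which the behavior policy draws actions i.i.d. at every step, so the trajectory-wise importance weight is a product of independent per-step ratios. In that setting the weight has mean 1 and variance (E[w^2])^T - 1, which grows exponentially with the horizon T. The function name and the two-action policies are illustrative and not taken from the paper.

```python
def is_weight_variance(target, behavior, horizon):
    """Exact variance of the trajectory-wise importance weight.

    Assumes actions are drawn i.i.d. from `behavior` at every step, so the
    weight is a product of `horizon` independent ratios target[a]/behavior[a].
    """
    # Second moment of one per-step ratio under the behavior policy:
    # sum_a behavior[a] * (target[a] / behavior[a])**2 = sum_a target[a]**2 / behavior[a]
    second_moment = sum(p * p / q for p, q in zip(target, behavior))
    # Each ratio has mean 1, so the product has mean 1 and this variance.
    return second_moment ** horizon - 1.0


# Made-up two-action policies for illustration.
target = [0.8, 0.2]
behavior = [0.5, 0.5]
for horizon in (1, 10, 100):
    print(horizon, is_weight_variance(target, behavior, horizon))
```

When the two policies coincide the second moment is 1 and the variance stays at zero for every horizon; any mismatch makes it grow geometrically, which is the behavior that motivates reweighting stationary state distributions instead of whole trajectories.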


1806.03085 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

A Stein variational Newton method

一种Stein变分牛顿方法

Gianluca Detommaso, Tiangang Cui, Alessio Spantini, Youssef Marzouk, Robert Scheichl

发表机构 * University of Bath（巴斯大学）； The Alan Turing Institute（艾伦·图灵研究所）； Monash University（莫纳什大学）； Massachusetts Institute of Technology（麻省理工学院）； Heidelberg University（海德堡大学）

AI总结本文提出了一种基于Stein变分梯度下降（SVGD）的改进方法，通过引入二阶信息加速并推广了该算法，实现了函数空间中的牛顿迭代，并展示了在多个测试案例中显著的计算效率提升。

Comments 18 pages, 7 figures

1810.11505 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Stability-certified reinforcement learning: A control-theoretic perspective

具有稳定性认证的强化学习：控制论视角

Ming Jin, Javad Lavaei

发表机构 * Department of Industrial Engineering and Operations Research, University of California, Berkeley（工业工程与运筹学系，加州大学伯克利分校）； Department of Industrial Engineering and Operations Research, and the Tsinghua-Berkeley Shenzhen Institute, University of California, Berkeley（工业工程与运筹学系，以及清华-伯克利深圳学院，加州大学伯克利分校）

AI总结本文从控制论角度研究强化学习策略与非线性动力系统连接时的稳定性认证问题，提出了一种基于半定规划可行性的方法，通过调节策略的输入输出梯度来获得鲁棒稳定性保证，并通过实验验证了该方法在多飞行编队和电力系统频率调节任务中的有效性。

1804.04310 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Exact Reconstruction of Euclidean Distance Geometry Problem Using Low-rank Matrix Completion

利用低秩矩阵补全进行欧几里得距离几何问题的精确重构

Abiy Tasissa, Rongjie Lai

发表机构 * Department of Mathematics, Rensselaer Polytechnic Institute（伦斯勒理工学院数学系）

AI总结本文提出了一种利用低秩矩阵补全方法来解决欧几里得距离几何问题的框架，通过引入对偶基方法对重构问题进行理论分析，并在不同三维数据和蛋白质分子上的数值测试验证了算法的有效性和效率。

Comments 28 pages, revised proof of Theorem 1, added proof of form of $H^{-1}$, presentation improved


AI中文摘要

欧几里得距离几何问题出现在众多应用中，从计算化学中确定分子构象到传感器网络中的定位。当距离信息不完整时，该问题可以表述为核范数最小化问题。在本文中，这一最小化问题被重新表述为关于某个合适基底的、秩为$r$的低秩Gram矩阵的矩阵补全问题。众所周知的受限等距性质在此情形下无法满足。为此，本文引入了对偶基方法来对重构问题进行理论分析。如果Gram矩阵满足参数为$\nu$的某些相干性条件，则主要结果表明，可以从$O(nr\nu\log^{2}(n))$个均匀随机样本中以很高的概率恢复$n$个点的底层构型。在计算方面，本文设计了简单而快速的算法来求解欧几里得距离几何问题。在不同三维数据和蛋白质分子上的数值测试验证了所提算法的有效性和效率。

英文摘要

The Euclidean distance geometry problem arises in a wide variety of applications, from determining molecular conformations in computational chemistry to localization in sensor networks. When the distance information is incomplete, the problem can be formulated as a nuclear norm minimization problem. In this paper, this minimization program is recast as a matrix completion problem of a low-rank $r$ Gram matrix with respect to a suitable basis. The well-known restricted isometry property cannot be satisfied in this scenario. Instead, a dual basis approach is introduced to theoretically analyze the reconstruction problem. If the Gram matrix satisfies certain coherence conditions with parameter $\nu$, the main result shows that the underlying configuration of $n$ points can be recovered with very high probability from $O(nr\nu\log^{2}(n))$ uniformly random samples. Computationally, simple and fast algorithms are designed to solve the Euclidean distance geometry problem. Numerical tests on different three-dimensional data and protein molecules validate the effectiveness and efficiency of the proposed algorithms.
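The link between the Gram matrix and the distance data that the abstract relies on is the standard identity D[i][j] = G[i][i] + G[j][j] - 2 G[i][j], where G[i][j] is the inner product of points i and j and D holds squared distances. Each observed distance is therefore a linear measurement of the Gram matrix. The sketch below only checks that identity on made-up 3D points; the function names are hypothetical and no completion algorithm from the paper is implemented.

```python
def gram_matrix(points):
    """Gram matrix G[i][j] = <p_i, p_j> for a list of coordinate tuples."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]


def squared_distances_from_gram(gram):
    """Squared distances via D[i][j] = G[i][i] + G[j][j] - 2 * G[i][j]."""
    n = len(gram)
    return [[gram[i][i] + gram[j][j] - 2.0 * gram[i][j] for j in range(n)]
            for i in range(n)]


# Made-up configuration of four points in three dimensions.
points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (1.0, 2.0, 2.0)]
dist_sq = squared_distances_from_gram(gram_matrix(points))
for row in dist_sq:
    print(row)
```

Because the Gram matrix of n points in three dimensions has rank at most 3, recovering it from a subset of these linear measurements is a low-rank matrix completion problem, which is the reformulation the abstract describes.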


1810.11178 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Using solar and load predictions in battery scheduling at the residential level

在住宅层面利用太阳能和负载预测进行电池调度

Richard Bean, Hina Khan

发表机构 * Redback Technologies Brisbane, Australia（Redback Technologies，布里斯班，澳大利亚）； School of ITEE The University of Queensland, Australia（昆士兰大学信息技术与电气工程学院，澳大利亚）

AI总结本文提出了一种新的电池调度算法，通过预测负载和太阳能发电量来优化住宅用户的电力成本，根据电价的不同，该算法相比自动模式可节省1%至10%的电费。

Comments This paper was presented at the 8th Solar Integration Workshop and published in the workshop's proceedings


AI中文摘要

智能太阳能逆变器可用于存储、监控和管理家庭的太阳能能源。我们描述了一种带有电池的智能太阳能逆变器系统，该系统可以自动运行或通过网络接收命令以在给定速率充电和放电。为了使电池存储在财务上可行并对消费者有利，可以采用有效的电池调度算法。特别是当地区内实施分时电价时，在某些情况下可以调度电池以节省个人客户的费用，相比自动模式。因此，本文提出了并评估了为太阳能能源住宅消费者设计的新电池调度算法。所提出的电池调度算法优化了住宅用户的下一24小时电力成本。通过根据负载和太阳能发电量的预测来控制电池存储系统的充放电，实现成本最小化。调度问题被公式化为一个线性规划问题。我们使用几个月的每小时负载和光伏数据对83个逆变器进行了计算机模拟。模拟结果表明，影响优化可行性的关键因素是电价和每个逆变器的光伏与负载比率。根据电价，可以预期比自动方法节省1%至10%。本文中所用的预测方法也显示出优于基本“持久性”预测方法。我们还检查了提高预测准确性和优化有效性的方法。

英文摘要

Smart solar inverters can be used to store, monitor and manage a home's solar energy. We describe a smart solar inverter system with a battery which can either operate in an automatic mode or receive commands over a network to charge and discharge at a given rate. In order to make battery storage financially viable and advantageous to consumers, effective battery scheduling algorithms can be employed. In particular, when time-of-use tariffs are in effect in the region of the inverter, it is possible in some cases to schedule the battery to save money for the individual customer, compared to the "automatic" mode. Hence, this paper presents and evaluates the performance of a novel battery scheduling algorithm for residential consumers of solar energy. The proposed battery scheduling algorithm optimizes the cost of electricity over the next 24 hours for residential consumers. The cost minimization is realized by controlling the charging and discharging of the battery storage system based on predictions of load and solar power generation. The scheduling problem is formulated as a linear programming problem. We performed computer simulations over 83 inverters using several months of hourly load and PV data. The simulation results indicate that the key factors affecting the viability of optimization are the tariffs and the PV-to-load ratio at each inverter. Depending on the tariff, savings of between 1% and 10% can be expected over the automatic approach. The prediction approach used in this paper is also shown to outperform basic "persistence" forecasting approaches. We have also examined approaches for improving the prediction accuracy and optimization effectiveness.
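The "persistence" baseline mentioned in the abstract is not specified further there. One common form, assumed here, simply repeats the most recent 24 hourly observations as the forecast for the next 24 hours. The sketch below shows that variant together with a mean absolute error helper; both function names and the load values are hypothetical.

```python
def persistence_forecast(history, horizon=24):
    """Forecast the next `horizon` hourly values by repeating the most recent
    `horizon` observations (a same-hour-yesterday persistence baseline)."""
    if len(history) < horizon:
        raise ValueError("need at least `horizon` past observations")
    return list(history[-horizon:])


def mean_absolute_error(forecast, actual):
    """Average absolute difference between paired forecast and actual values."""
    return sum(abs(f - a) for f, a in zip(forecast, actual)) / len(actual)


# Made-up hourly load history covering two days (values in kW).
history = [0.5 + 0.1 * (hour % 24) for hour in range(48)]
forecast = persistence_forecast(history)
print(forecast[:3], mean_absolute_error(forecast, history[-24:]))
```

A learned predictor is only worth deploying in a scheduler if its error is lower than that of a baseline of this kind on held-out days.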


1810.09675 2026-06-04 math.NA cs.LG cs.NA 版本更新

SwitchNet: a neural network model for forward and inverse scattering problems

SwitchNet: 一种用于正反散射问题的神经网络模型

Yuehaw Khoo, Lexing Ying

发表机构 * Department of Mathematics, Stanford University, Stanford, CA 94305.（斯坦福大学数学系）； Department of Mathematics（数学系）； ICME, Stanford University, Stanford, CA 94305. Facebook AI Research, Menlo Park, CA 94025.（斯坦福大学计算数学与工程系，Facebook人工智能研究实验室）

AI总结 SwitchNet通过建立散射体与散射场之间的映射，解决基于波方程的反散射问题，利用低秩结构和稀疏连接的切换层减少参数量并提升训练效率。

Comments 19 pages, 7 figures

1810.10078 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Model Selection for Nonnegative Matrix Factorization by Support Union Recovery

通过支持联合恢复进行非负矩阵分解的模型选择

Zhaoqiang Liu

发表机构 * Department of Electrical and Computer Engineering（电气与计算机工程系）

AI总结本文提出了一种通过计算经验二阶矩并恢复与经验二阶矩相关的矩阵中非零行的索引集来自动选择非负矩阵分解的潜在维度的算法，该算法在理论上有保证地检测出真实的潜在维度。


AI中文摘要

非负矩阵分解（NMF）因其非减法性（non-subtractive）和基于部分（part-based）的性质能够增强可解释性，而被广泛应用于机器学习和信号处理。通常假设潜在维度（或成分数量）是给定的。尽管已经设计了大量NMF算法，但关于具有理论保证的自动NMF模型选择的文献却很少。在本文中，我们提出了一种算法，首先从经验四阶累积量张量中计算经验二阶矩，然后通过恢复与经验二阶矩相关的某个矩阵的支持联合（support union，即非零行的索引集）来估计潜在维度。通过假设数据的生成模型并附加温和的条件，我们的算法可以被证明能够检测出真实的潜在维度。我们在合成示例上表明，所提出的算法能够找到近似正确的成分数量。

使用深度学习的主动顺序假设检验策略设计

Dhruva Kartik, Ekraam Sabir, Urbashi Mitra, Prem Natarajan

发表机构 * USC Information Sciences Institute（美国南加州大学信息科学研究所）

AI总结本文研究了如何利用深度学习设计更有效的主动顺序假设检验策略，通过比较新提出的启发式方法与现有方法，展示了在某些场景下性能的显著提升。

Comments Accepted at 56th Annual Allerton Conference on Communication, Control, and Computing


AI中文摘要

信息论在通信、压缩和假设检验等各类问题中取得了很大的成功，而随机控制理论则通过动态规划对部分可观测马尔可夫决策过程（POMDPs）的最优策略进行表征。然而，一般情况下找到这些问题的最优策略是计算上困难的，因此在实践中通常采用启发式方法。深度学习可以作为一种工具，用于设计更好的启发式方法。本文考虑了主动顺序假设检验问题，目标是通过自适应选择适当的查询来以最少的样本量可靠地推断真实假设。该问题可以建模为POMDP，并且文献中已存在其价值函数的界。然而，最优策略尚未被识别，各种启发式方法被使用。本文提出了两种新的启发式方法：一种基于深度强化学习，另一种基于KL散度零和博弈。这些启发式方法与最先进的解决方案进行了比较，并通过数值实验表明，在某些场景下，所提出的启发式方法能够显著优于现有方法。

英文摘要

Information theory has been very successful in obtaining performance limits for various problems such as communication, compression and hypothesis testing. Likewise, stochastic control theory provides a characterization of optimal policies for Partially Observable Markov Decision Processes (POMDPs) using dynamic programming. However, finding optimal policies for these problems is computationally hard in general and thus, heuristic solutions are employed in practice. Deep learning can be used as a tool for designing better heuristics in such problems. In this paper, the problem of active sequential hypothesis testing is considered. The goal is to design a policy that can reliably infer the true hypothesis using as few samples as possible by adaptively selecting appropriate queries. This problem can be modeled as a POMDP and bounds on its value function exist in literature. However, optimal policies have not been identified and various heuristics are used. In this paper, two new heuristics are proposed: one based on deep reinforcement learning and another based on a KL-divergence zero-sum game. These heuristics are compared with state-of-the-art solutions and it is demonstrated using numerical experiments that the proposed heuristics can achieve significantly better performance than existing methods in some scenarios.


1803.01066 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Specialized Interior Point Algorithm for Stable Nonlinear System Identification

用于稳定非线性系统辨识的专用内点算法

Jack Umenberger, Ian R. Manchester

发表机构 * Australian Centre for Field Robotics（澳大利亚野外机器人中心）； The University of Sydney（悉尼大学）

AI总结本文提出了一种专用内点算法，通过利用问题中的特殊结构，将计算复杂度从数据集长度的三次方降低到线性增长，从而提高了非线性系统辨识的效率，并展示了其在新数据上的优越泛化能力。

Comments accepted to IEEE Transactions on Automatic Control


DOI: 10.1109/TAC.2018.2867358

AI中文摘要

从数据中估计非线性动态模型面临许多挑战，包括模型不稳定性以及长期仿真保真度的非凸性。最近，有人提出通过半定规划（SDP）使用拉格朗日松弛来近似仿真保真度并保证稳定性，然而由此产生的SDP维度很大，限制了其在实际问题中的应用。在本文中，我们开发了一种路径跟踪内点算法，利用问题中的特殊结构，将计算复杂度随数据集长度的增长从三次降低到线性。新算法使得与包括非线性ARX在内的既有方法进行经验比较成为可能，我们还展示了它对新数据的更优泛化能力。我们还探讨了稳定性约束的“正则化”效应，将其作为回归量子集选择的替代方案。

英文摘要

Estimation of nonlinear dynamic models from data poses many challenges, including model instability and non-convexity of long-term simulation fidelity. Recently, Lagrangian relaxation has been proposed as a method to approximate simulation fidelity and guarantee stability via semidefinite programming (SDP); however, the resulting SDPs have large dimension, limiting their utility in practical problems. In this paper we develop a path-following interior point algorithm that takes advantage of special structure in the problem and reduces computational complexity from cubic to linear growth with the length of the data set. The new algorithm enables empirical comparisons to established methods including Nonlinear ARX, and we demonstrate superior generalization to new data. We also explore the "regularizing" effect of stability constraints as an alternative to regressor subset selection.


1801.08383 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Data-Driven Impulse Response Regularization via Deep Learning

基于深度学习的数据驱动脉冲响应正则化

Carl Andersson, Niklas Wahlström, Thomas B. Schön

发表机构 * Department of Information Technology, Uppsala University（信息技术系，乌普萨拉大学）

AI总结本文提出了一种新的数据驱动模型，用于稳定线性单输入单输出系统的脉冲响应估计，该模型在利用输入输出数据中的隐藏模式方面优于非参数模型。

1810.03733 2026-06-04 math.NA cs.LG cs.NA 版本更新

Find the dimension that counts: Fast dimension estimation and Krylov PCA

找出关键的维度：快速维度估计和Krylov PCA

Shashanka Ubaru, Abd-Krim Seghouane, Yousef Saad

发表机构 * IBM T. J. Watson Research Center（IBM T.J.沃森研究中心）； The University of Melbourne（墨尔本大学）； University of Minnesota（明尼苏达大学）

AI总结本文提出了一种新的方法，用于同时估计协方差矩阵主子空间的维度并获得子空间的近似值，该方法结合了Krylov子空间方法，避免了显式计算样本协方差矩阵和完整的特征分解，从而在大规模数据应用中具有成本效益。


AI中文摘要

高维数据和具有许多自由度的系统通常由协方差矩阵来表征。在本文中，我们考虑同时估计这些协方差矩阵主子空间的维度并获得子空间近似值的问题。这个问题出现在流行的主成分分析（PCA）中，并在许多机器学习、数据分析、信号和图像处理等应用中出现。我们首先提出了一种新的方法来估计主子空间的维度。然后展示如何将该方法与Krylov子空间方法结合，以同时估计维度并获得子空间的近似。维度估计无需额外成本。所提出的方法基于模型选择框架，其中新的选择标准是基于随机矩阵扰动理论思想推导的。我们进行了理论分析，(a) 显示所提出的方法在数据点数量 $n\rightarrow \infty$ 时具有强一致性（即得出最优解），(b) 分析了有限 $n$ 情况下的精确维度估计条件。利用最近的结果，我们展示了所提算法也能产生近最优的PCA。所提出的方法避免显式形成样本协方差矩阵（与数据相关）并计算完整的特征分解。因此，该方法成本低廉，这在现代数据应用中特别有利，因为协方差矩阵可能非常大。数值实验展示了所提方法在各种应用中的性能。

英文摘要

High dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal (dominant) subspace of these covariance matrices and obtaining an approximation to the subspace. This problem arises in the popular principal component analysis (PCA), and in many applications of machine learning, data analysis, signal and image processing, and others. We first present a novel method for estimating the dimension of the principal subspace. We then show how this method can be coupled with a Krylov subspace method to simultaneously estimate the dimension and obtain an approximation to the subspace. The dimension estimation is achieved at no additional cost. The proposed method operates on a model selection framework, where the novel selection criterion is derived based on random matrix perturbation theory ideas. We present theoretical analyses which (a) show that the proposed method achieves strong consistency (i.e., yields optimal solution as the number of data-points $n\rightarrow \infty$), and (b) analyze conditions for exact dimension estimation in the finite $n$ case. Using recent results, we show that our algorithm also yields near optimal PCA. The proposed method avoids forming the sample covariance matrix (associated with the data) explicitly and computing the complete eigen-decomposition. Therefore, the method is inexpensive, which is particularly advantageous in modern data applications where the covariance matrices can be very large. Numerical experiments illustrate the performance of the proposed method in various applications.


1807.09904 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

A Data-Efficient Approach to Precise and Controlled Pushing

一种实现精确可控推动的数据高效方法

Maria Bauza, Francois R. Hogan, Alberto Rodriguez

发表机构 * Department of Mechanical Engineering — Massachusetts Institute of Technology（机械工程系——麻省理工学院）

AI总结本文提出了一种数据高效的方法，通过学习动态模型来控制复杂机械系统，仅需10个数据点即可完成复杂的推动轨迹。

Comments Maria Bauza and Francois R. Hogan contributed equally to this work. 10 pages, 5 figures

1810.03025 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY 版本更新

Discretizing Logged Interaction Data Biases Learning for Decision-Making

对记录交互数据进行离散化会偏学习决策制定

Peter Schulam, Suchi Saria

发表机构 * Johns Hopkins University（约翰霍普金斯大学）

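AI summary: This paper studies how discretizing irregularly sampled time series data affects the training of decision-making models, shows that discretization introduces bias, and proposes using continuous-time models to avoid the problem.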

Comments This is a standalone short paper describing a new type of bias that can arise when learning from time series data for sequential decision-making problems

1810.02866 2026-06-04 eess.SP cs.LG cs.SY eess.SY 版本更新

Artificial Intelligence Assisted Power Grid Hardening in Response to Extreme Weather Events

人工智能辅助的电网加固以应对极端天气事件

Rozhin Eskandarpour, Amin Khodaei, A. Paaso, N. M. Abdullah

发表机构 * University of Denver（丹佛大学）； ComEd（ComEd公司）； US National Committee（美国国家委员会）

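AI summary: This paper proposes an artificial-intelligence-based grid hardening model that aims to improve power grid resilience to extreme weather events. A machine learning model first predicts component states (operational or outage); these predictions are then fed into a hardening model that determines where to place distributed generation (DG) units. Unlike the existing literature, the paper co-optimizes grid economics and resilience by accounting for the intricate dependencies between the two objectives. Numerical simulations on the standard IEEE 118-bus test system illustrate the merits and applicability of the proposed model. The results indicate that, through decentralized and distributed local energy resources, the model can produce a more robust solution that significantly protects the system against multiple component outages caused by an extreme event.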


Journal ref: 2018 Grid of the Future Symposium

AI中文摘要

本文提出了一种基于人工智能的电网加固模型，旨在提高电网在极端天气事件中的韧性。首先，提出一个机器学习模型来预测组件状态（运行或停电）。然后，将这些预测输入到加固模型中，确定分布式发电（DG）单元的战略放置位置。与现有文献不同，本文通过考虑两个目标的复杂依赖关系，共同优化电网的经济性和韧性。在标准IEEE 118节点测试系统上的数值模拟展示了所提加固模型的优势和适用性。结果表明，通过去中心化和分布式本地能源资源，所提加固模型可以产生更稳健的解决方案，显著保护系统免受多个组件因极端事件而停电的影响。

英文摘要

In this paper, an artificial intelligence based grid hardening model is proposed with the objective of improving power grid resilience in response to extreme weather events. At first, a machine learning model is proposed to predict the component states (either operational or outage) in response to the extreme event. Then, these predictions are fed into a hardening model, which determines strategic locations for placement of distributed generation (DG) units. In contrast to existing literature in hardening and resilience enhancement, this paper co-optimizes grid economic and resilience objectives by considering the intricate dependencies of the two. The numerical simulations on the standard IEEE 118-bus test system illustrate the merits and applicability of the proposed hardening model. The results indicate that the proposed hardening model through decentralized and distributed local energy resources can produce a more robust solution that can protect the system significantly against multiple component outages due to an extreme event.


1810.02022 2026-06-04 math.OC cs.LG cs.SY eess.SY math.DS stat.ML 版本更新

Convergence of the Expectation-Maximization Algorithm Through Discrete-Time Lyapunov Stability Theory

通过离散时间李雅普诺夫稳定性理论分析期望-最大化算法的收敛性

Orlando Romero, Sarthak Chatterjee, Sérgio Pequito

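AI summary: This paper revisits the expectation-maximization (EM) algorithm from a dynamical systems perspective, treating it as a nonlinear state-space dynamical system and using discrete-time Lyapunov stability theory to prove its convergence.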

Comments Preprint submitted to ACC 2019

1810.01586 2026-06-04 math.NA cs.LG cs.NA physics.comp-ph 版本更新

Machine learning for accelerating effective property prediction for poroelasticity problem in stochastic media

利用机器学习加速随机介质中渗透弹性问题有效性质的预测

Maria Vasilyeva, Aleksey Tyrylgin

Affiliations: Institute for Scientific Computation, Texas A&M University, College Station, TX 77843-3368; Multiscale Model Reduction Laboratory, North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia, 677980

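AI summary: This paper proposes a numerical homogenization method based on deep neural networks for rapidly computing the effective properties of poroelasticity problems in stochastic media. A convolutional neural network learns the map between the random field and the effective properties, and experiments show that the method predicts effective properties quickly and accurately on two- and three-dimensional model problems.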

1711.00439 2026-06-04 math.NA cs.LG cs.NA 版本更新

Using Neural Networks to Generate Information Maps for Mobile Sensors

用神经网络为移动传感器生成信息图

Louis Dressel, Mykel J. Kochenderfer

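AI summary: This paper proposes using convolutional neural networks to generate information maps for mobile sensors in real time, improving the efficiency and accuracy of trajectory generation.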

Comments Accepted to the 2018 IEEE Conference on Decision and Control (CDC)

1809.09261 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting

基于动态系统的强化学习鲁棒计算：排序问题案例研究

Aleksandra Faust, James B. Aimone, Conrad D. James, Lydia Tapia

发表机构 * Google Brain, Mountain View, CA, USA（谷歌大脑，美国加利福尼亚州山景城）； Sandia National Labs, Albuquerque, NM, USA（桑迪亚国家实验室，美国新墨西哥州阿尔伯克基）

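AI summary: This paper formulates computation as a feedback-control problem and solves the resulting sequential decision-making problem with reinforcement learning, using an array-sorting case study to show that resilient computing can overcome limitations of conventional programming.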

Comments 11 pages, accepted to CDC 2018. Here with additional evaluations


AI中文摘要

机器人和自主代理在资源有限的情况下，通常依赖不完美的模型和传感器测量来完成目标导向任务。特别是，强化学习（RL）和反馈控制可以用来帮助机器人实现目标。本文基于这一领域的工作，将通用计算建模为反馈控制问题，使代理能够自主克服标准过程语言编程的局限性：对错误的鲁棒性和早期程序终止的容忍。我们的建模将计算视为程序变量空间中的轨迹生成。计算因此成为一个序列决策问题，通过强化学习（RL）解决，并通过李雅普诺夫稳定性理论分析以评估代理的鲁棒性和向目标的进展。我们通过一个典型的计算机科学问题——数组排序的案例研究来实现这一点。评估显示，我们的RL排序代理能够稳定地向渐近稳定的终点进展，对故障组件具有鲁棒性，并且比传统的快速排序和冒泡排序进行的数组操作更少。

英文摘要

Robots and autonomous agents often complete goal-based tasks with limited resources, relying on imperfect models and sensor measurements. In particular, reinforcement learning (RL) and feedback control can be used to help a robot achieve a goal. Taking advantage of this body of work, this paper formulates general computation as a feedback-control problem, which allows the agent to autonomously overcome some limitations of standard procedural language programming: resilience to errors and early program termination. Our formulation considers computation to be trajectory generation in the program's variable space. The computing then becomes a sequential decision making problem, solved with reinforcement learning (RL), and analyzed with Lyapunov stability theory to assess the agent's resilience and progression to the goal. We do this through a case study on a quintessential computer science problem, array sorting. Evaluations show that our RL sorting agent makes steady progress to an asymptotically stable goal, is resilient to faulty components, and performs fewer array manipulations than traditional Quicksort and Bubble sort.


1809.08657 2026-06-04 math.OC cs.IT cs.LG cs.NA cs.SY eess.SY math.IT math.NA 版本更新

Accelerated Gossip via Stochastic Heavy Ball Method

通过随机重力球方法加速 gossip

Nicolas Loizou, Peter Richtárik

Affiliations: School of Mathematics, KAUST, Saudi Arabia; The University of Edinburgh, United Kingdom; MIPT, Russia

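AI summary: This paper studies the stochastic heavy ball (SHB) method as a randomized gossip algorithm, proposes a new protocol for solving the average consensus problem, and demonstrates its advantages experimentally.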

Comments 8 pages, 5 Figures, 56th Annual Allerton Conference on Communication, Control, and Computing, 2018
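
As a rough illustration of the idea named in this entry's title, the sketch below adds a heavy-ball momentum term to plain randomized pairwise gossip. It is a minimal sketch under stated assumptions, not the authors' protocol: the function name `heavy_ball_gossip`, the parameters `omega` and `beta`, the toy complete graph, and the starting values are all hypothetical choices made for this example.

```python
import random

def heavy_ball_gossip(values, edges, omega=1.0, beta=0.2, steps=2000, seed=0):
    """Randomized pairwise gossip with a heavy-ball momentum term.

    Each step samples one edge (i, j), moves both endpoints toward their
    pairwise average (scaled by omega), and adds beta * (x_k - x_{k-1})
    at every node. All parameter names are assumptions of this sketch.
    """
    rng = random.Random(seed)
    prev = list(values)  # x_{k-1}
    curr = list(values)  # x_k
    for _ in range(steps):
        i, j = rng.choice(edges)
        # Momentum term built from the two most recent iterates.
        nxt = [c + beta * (c - p) for c, p in zip(curr, prev)]
        # Gossip step on the sampled edge; it leaves the sum unchanged.
        gap = curr[i] - curr[j]
        nxt[i] -= 0.5 * omega * gap
        nxt[j] += 0.5 * omega * gap
        prev, curr = curr, nxt
    return curr

# Hypothetical toy network: complete graph on 6 nodes.
n = 6
edges = [(i, j) for i in range(n) for j in range(i + 1, n)]
start = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]
final = heavy_ball_gossip(start, edges)
target = sum(start) / n
spread = max(abs(v - target) for v in final)
```

Both the edge update and the momentum term preserve the sum of the node values, so the network average is unchanged while the values contract toward it. Setting `beta=0.0` recovers plain pairwise gossip, which makes it easy to compare the two on a given graph.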

1809.06401 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process

基于隐马尔可夫模型估计的Q学习：部分可观测马尔可夫决策过程

Hyung-Jin Yoon, Donghwan Lee, Naira Hovakimyan

发表机构 * Department of Industrial and Enterprise Systems Engineering（工业与企业系统工程系）

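AI summary: This paper proposes an online Q-learning algorithm based on hidden Markov model estimation for partially observable Markov decision processes (POMDPs). It estimates the POMDP parameters and the Q-function simultaneously, and its convergence is proven.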

1809.08004 2026-06-04 cs.SI cs.LG cs.NA math.NA physics.data-an 版本更新

Multi-Dimensional, Multilayer, Nonlinear and Dynamic HITS

多维、多层、非线性和动态HITS

Francesca Arrigo, Francesco Tudisco

发表机构 * University of Strathclyde（斯特拉思克莱德大学）

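AI summary: This paper proposes a ranking model for temporal, multi-dimensional, weighted, directed networks based on the Perron eigenvector of a multi-homogeneous order-preserving map. It extends the HITS algorithm to the temporal multilayer setting and defines five centrality vectors, including vectors for nodes, layers, and time stamps. The nonlinearity that is introduced guarantees the existence and uniqueness of the centrality vectors for any network.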

1809.07098 2026-06-04 cs.AI cs.LG cs.MA cs.NE cs.SY eess.SY 版本更新

Novelty-organizing team of classifiers in noisy and dynamic environments

在噪声和动态环境中组织新颖性的分类器团队

Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

Affiliations: Graduate School of Information Science; Electrical Engineering, Kyushu University, Fukuoka, Japan; Faculty of Information Science

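AI summary: This work presents the Novelty-Organizing Team of Classifiers (NOTC), which works effectively in noisy and dynamic environments. It is validated on the continuous-action mountain car problem and its variants, where NOTC achieves better performance, although it needs some time to bootstrap.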


DOI: 10.1109/CEC.2015.7257254
Journal ref: 2015 IEEE Congress on Evolutionary Computation (CEC)

AI中文摘要

在现实世界中，环境不断变化，输入变量受到噪声的影响。然而，很少有算法能够在这种情况下工作。在这里，新颖性组织分类器团队（NOTC）被应用于连续动作山车以及其两个变种：噪声山车和不稳定天气山车。这些问题分别考虑了噪声和问题动态的变化。此外，NOTC在这些问题中与神经进化拓扑增强（NEAT）进行了比较，揭示了两种方法之间的权衡。尽管NOTC在所有问题中均表现最佳，但NEAT需要更少的试验来收敛。证明了NOTC之所以表现更好，是因为其将输入空间划分为更易处理的问题。不幸的是，这种输入空间的划分也需要一些时间来初始化。

英文摘要

In the real world, the environment is constantly changing with the input variables under the effect of noise. However, few algorithms have been shown to work under those circumstances. Here, Novelty-Organizing Team of Classifiers (NOTC) is applied to the continuous action mountain car as well as two variations of it: a noisy mountain car and an unstable weather mountain car. These problems take noise and changes in problem dynamics into account, respectively. Moreover, NOTC is compared with NeuroEvolution of Augmenting Topologies (NEAT) in these problems, revealing a trade-off between the approaches. While NOTC achieves the best performance in all of the problems, NEAT needs fewer trials to converge. It is demonstrated that NOTC achieves better performance because of its division of the input space (creating easier problems). Unfortunately, this division of input space also requires a bit of time to bootstrap.


1809.06970 2026-06-04 cs.LG cs.NI cs.PF cs.SY eess.SY stat.ML 版本更新

FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

FastDeepIoT: 向理解和优化移动和嵌入式设备上神经网络执行时间迈进

Shuochao Yao, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Lu Su, Tarek Abdelzaher

发表机构 * University of Illinois Urbana Champaign（伊利诺伊大学厄巴纳-香槟分校）； State University of New York at Buffalo（纽约州立大学布法罗分校）

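AI summary: This paper proposes the FastDeepIoT framework, which uncovers the nonlinear relation between neural network structure and execution time and uses it to improve the trade-off between execution time and accuracy on mobile and embedded devices, without prior knowledge of the hardware specifications or the implementation details of the deep learning library.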

Comments Accepted by SenSys '18


DOI: 10.1145/3274783.3274840

AI中文摘要

深度神经网络在许多传感应用问题中展现出巨大潜力，但其过度的资源需求会减慢执行时间，成为在低端设备上部署的重大障碍。为了解决这一挑战，最近的研究集中在压缩神经网络大小以提高性能。我们表明，改变神经网络大小并不成比例地影响感兴趣的性能属性，例如执行时间。相反，在网络配置空间中存在极端的运行时间非线性性。因此，我们提出了一个名为FastDeepIoT的新型框架，该框架揭示了神经网络结构与执行时间之间的非线性关系，然后利用这种理解来找到显著改善移动和嵌入式设备上执行时间与准确性权衡的网络配置。FastDeepIoT有两个关键贡献。首先，FastDeepIoT自动学习了一个准确且高度可解释的深度神经网络在目标设备上的执行时间模型。这无需事先了解硬件规格或所用深度学习库的详细实现。其次，FastDeepIoT告知压缩算法如何在经过分析的设备上最小化执行时间而不影响准确性。我们使用三种不同的传感相关任务在两部移动设备（Nexus 5和Galaxy Nexus）上评估了FastDeepIoT。FastDeepIoT进一步将神经网络的执行时间减少了48%到78%，并将能耗降低了37%到69%，与最先进的压缩算法相比。

英文摘要

Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand slows down execution time, posing a serious impediment to deployment on low-end devices. To address this challenge, recent literature has focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the non-linear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the deep learning library used. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by $48\%$ to $78\%$ and energy consumption by $37\%$ to $69\%$ compared with the state-of-the-art compression algorithms.


1809.06179 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Learning of Multi-Context Models for Autonomous Underwater Vehicles

多情境模型学习用于自主水下车辆

Bilal Wehbe, Octavio Arriaga, Mario Michael Krell, Frank Kirchner

发表机构 * DFKI - Robotic Innovation Center（DFKI机器人创新中心）； Robotics Research Group（机器人研究组）

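AI summary: This paper proposes using LSTM networks to learn multi-context models for autonomous underwater vehicles. A simulation model built from experimental data generates the different contexts; the approach improves classification accuracy and shows robustness to noise and scalability to large datasets.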

Comments 6 pages, 7 figures, AUV 2018 author copy

1809.06009 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Uncertainty Propagation in Deep Neural Networks Using Extended Kalman Filtering

使用扩展卡尔曼滤波在深度神经网络中进行不确定性传播

Jessica S. Titensky, Hayden Jananthan, Jeremy Kepner

发表机构 * Massachusetts Institute of Technology（麻省理工学院）； Department of Mathematics（数学系）； Lincoln Laboratory Supercomputing Center（林肯实验室超级计算机中心）

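AI summary: This paper proposes using extended Kalman filtering to propagate and quantify input uncertainty through deep neural networks. The method is more computationally efficient than existing techniques and naturally incorporates model error into the output uncertainty.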

Comments 4 Pages, 8 figures. Accepted at MIT IEEE Undergraduate Research Technology Conference 2018. Publication pending

1806.06161 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

BaRC：机器人强化学习中的逆向可达性课程

Boris Ivanovic, James Harrison, Apoorva Sharma, Mo Chen, Marco Pavone

发表机构 * Department of Mechanical Engineering, Stanford University（斯坦福大学机械工程系）； School of Computing Science, Simon Fraser University（西蒙弗雷泽大学计算机科学学院）

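AI summary: This paper proposes BaRC, which uses physical priors to design a curriculum. Its backward reachability strategy accelerates the training of model-free RL algorithms on continuous-control MDPs, improving performance and reducing the exploration required.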


AI中文摘要

模型无关强化学习（RL）为高维系统学习控制策略提供了有吸引力的方法，但其相对差的样本复杂性通常迫使在模拟环境中进行训练。即使在模拟中，具有稀疏自然奖励函数的目标导向任务仍难以被最先进的模型无关算法处理。这些任务的瓶颈在于从系统初始状态获取学习信号所需的大量探索。本文利用物理先验知识（以近似系统动力学模型的形式）设计了一种课程方案，用于模型无关策略优化算法。我们的逆向可达性课程（BaRC）从需要少量动作完成任务的状态开始策略训练，并在策略优化算法表现出足够性能后，以动态一致的方式扩展初始状态分布。BaRC具有通用性，可以加速任何模型无关RL算法在广泛目标导向连续控制MDP上的训练。其课程策略具有物理直观性、易于调节，并允许将物理先验整合到训练中，而不会影响模型无关RL算法的性能、灵活性和适用性。我们在两个代表性的动态机器人学习问题上评估了我们的方法，并发现相对于先前的课程生成技术和朴素探索策略，有显著的性能提升。

英文摘要

Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often forces training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state of the system. In this work, we leverage physical priors in the form of an approximate system dynamics model to design a curriculum scheme for a model-free policy optimization algorithm. Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a dynamically-consistent manner once the policy optimization algorithm demonstrates sufficient performance. BaRC is general, in that it can accelerate training of any model-free RL algorithm on a broad class of goal-directed continuous control MDPs. Its curriculum strategy is physically intuitive, easy-to-tune, and allows incorporating physical priors to accelerate training without hindering the performance, flexibility, and applicability of the model-free RL algorithm. We evaluate our approach on two representative dynamic robotic learning problems and find substantial performance improvement relative to previous curriculum generation techniques and naive exploration strategies.


1809.05152 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Deep Reinforcement Learning for Event-Triggered Control

基于事件触发控制的深度强化学习

Dominik Baumann, Jia-Jie Zhu, Georg Martius, Sebastian Trimpe

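AI summary: This paper proposes a deep reinforcement learning (DRL) approach to event-triggered control (ETC). It applies DRL to ETC for the first time, learns control and communication behavior from scratch, and shows advantages on nonlinear systems.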

1804.01031 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Provably Robust Learning-Based Approach for High-Accuracy Tracking Control of Lagrangian Systems

具有证明鲁棒性的基于学习的方法用于拉格朗日系统高精度跟踪控制

Mohamed K. Helwa, Adam Heins, Angela P. Schoellig

发表机构 * Dynamic Systems Lab（动态系统实验室）； Institute for Aerospace Studies（航空航天研究院）； University of Toronto（多伦多大学）

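AI summary: This paper proposes a new learning-based control approach built on Gaussian processes that ensures system stability and high-accuracy tracking, uses an uncertainty bound to guarantee robustness, and is validated in simulation and experiments.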

Comments 8 pages, 4 figures, 2 tables, submitted to IEEE Robotics and Automation Letters (RA-L) and the 2019 International Conference on Robotics and Automation (ICRA) (created: March 2018; updated: September 2018)


AI中文摘要

拉格朗日系统涵盖了多种机器人系统，包括机械臂、轮式和腿部机器人以及四旋翼。通常使用逆动力学控制和前馈线性化技术将复杂非线性动力学转换为解耦的二阶积分器，然后使用标准外环控制器计算线性化系统的期望加速度。然而，这些方法通常依赖于非常准确的系统模型，这在实践中往往不可用。尽管文献中使用了不同的学习方法来解决这一挑战，但大多数方法在学习控制系统稳定性方面缺乏安全保证。本文提出了一种基于高斯过程（GPs）的新学习控制方法，确保闭环系统的稳定性和高精度跟踪。我们使用GPs近似命令加速度与系统实际加速度之间的误差，并利用GP预测的均值和方差计算线性化模型不确定性的上界。此不确定性界随后用于鲁棒的外环控制器以确保整个系统的稳定性。此外，我们证明跟踪误差收敛到一个半径可任意小的球体。进一步，我们通过在2自由度平面机械臂上的仿真和6自由度工业机械臂上的实验验证了我们方法的有效性。

英文摘要

Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feedforward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.


1809.03618 2026-06-04 cs.GR cs.LG cs.MM cs.NA math.NA 版本更新

Visualization of High-dimensional Scalar Functions Using Principal Parameterizations

使用主参数化可视化高维标量函数

Rafael Ballester-Ripoll, Renato Pajarola

发表机构 * University of Zurich（苏黎世大学）

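AI summary: This paper proposes a principal-component-based method that reduces the dimensionality of high-dimensional scalar fields and uses the Sobol method for sensitivity analysis, enabling interactive analysis of high-dimensional models.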

1809.03343 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Distributed dynamic modeling and monitoring for large-scale industrial processes under closed-loop control

分布式动态建模与监控用于闭环控制下的大规模工业过程

Wenqing Li, Chunhui Zhao, Biao Huang

发表机构 * State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University（工业控制技术国家重点实验室，控制科学与工程学院，浙江大学）； Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems（复杂系统先进控制与智能自动化湖北省重点实验室）； Department of Chemical and Materials Engineering, University of Alberta（阿尔伯塔大学化学与材料工程系）

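AI summary: This paper proposes a distributed monitoring method that combines static and dynamic characteristics to distinguish real faults from changes in operating conditions. A sparse slow feature analysis algorithm decomposes the process, models are then built on the resulting subsystems, and the method's effectiveness is validated.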


AI中文摘要

对于受闭环控制的大规模工业过程，过程动态直接由控制动作产生，可能在真实故障和正常操作条件变化间表现出不同行为。然而，传统分布式监控方法不考虑闭环控制机制，仅探索静态特性，无法区分真实故障与名义变化，导致不必要的警报。本文提出一种分布式监控方法，通过同时探索静态和动态特性，首先通过开发稀疏慢特征分析（SSFA）算法将大规模闭环过程分解为若干子系统，其次开发分布式模型分别捕捉局部和全局的静态和动态特性。基于分布式监控系统，提出两级监控策略，检查操作条件和控制动作对过程特性的影响，从而区分两种变化。通过基准数据和真实工业过程数据的案例研究验证了所提方法的有效性。

英文摘要

For large-scale industrial processes under closed-loop control, process dynamics directly resulting from control action are typical characteristics and may show different behaviors between real faults and normal changes of operating conditions. However, conventional distributed monitoring approaches do not consider the closed-loop control mechanism and only explore static characteristics, which thus are incapable of distinguishing between real process faults and nominal changes of operating conditions, leading to unnecessary alarms. In this regard, this paper proposes a distributed monitoring method for closed-loop industrial processes by concurrently exploring static and dynamic characteristics. First, the large-scale closed-loop process is decomposed into several subsystems by developing a sparse slow feature analysis (SSFA) algorithm that captures changes of both static and dynamic information. Second, distributed models are developed to separately capture static and dynamic characteristics from the local and global aspects. Based on the distributed monitoring system, a two-level monitoring strategy is proposed to check different influences on process characteristics resulting from changes of the operating conditions and control action, and thus the two changes can be well distinguished from each other. Case studies are conducted based on both benchmark data and real industrial process data to illustrate the effectiveness of the proposed method.


1512.09156 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA 版本更新

Low rank approximation and decomposition of large matrices using error correcting codes

利用纠错码进行大矩阵的低秩近似与分解

Shashanka Ubaru, Arya Mazumdar, Yousef Saad

发表机构 * Department of Computer Science and Engineering, University of Minnesota, Twin Cities（计算机科学与工程系，明尼苏达大学，双城分校）； Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities（电气与计算机工程系，明尼苏达大学，双城分校）

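AI summary: This paper explores using matrices from error-correcting codes for low-rank approximation and decomposition of large matrices, and presents the advantages of this approach for low-rank approximation, linear regression, and related problems, including reduced randomness, the subspace embedding property, and benefits for parallel computation.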


DOI: 10.1109/TIT.2017.2723898
Journal ref: IEEE Transactions on Information Theory ( Volume: 63, Issue: 9, Sept. 2017 ) Page(s): 5544 - 5558

AI中文摘要

低秩近似是信号处理和机器学习中重要的工具。最近，随机化草图算法被提出，用于有效构造低秩近似并获得大矩阵的近似奇异值分解。类似的思想也被用于解决最小二乘回归问题。本文展示如何利用纠错码中的矩阵来寻找此类低秩近似和矩阵分解，并将框架扩展到线性最小二乘回归问题。使用这些码矩阵的好处包括：(i) 它们易于生成且显著减少随机性。(ii) 具有轻微性质的码矩阵满足子空间嵌入性质，更有可能保持整个向量子空间的几何结构。(iii) 对于并行和分布式应用，码矩阵在结构随机矩阵和高斯随机矩阵上有显著优势。(iv) 与傅里叶或哈达玛变换矩阵不同，某些类型的码矩阵不需要log因子即可实现(1+ε)最优弗罗贝尼乌斯范数误差，即对于秩k的近似，仅需O(k/ε)样本。(v) 结构化码矩阵可以实现快速乘法，因此可以快速近似一般稠密输入矩阵。(vi) 对于最小二乘回归问题min‖Ax-b‖₂，当A∈ℝ^{n×d}时，使用特定码矩阵可实现(1+ε)相对误差近似，概率很高，仅需O(d/ε)样本。

英文摘要

Low rank approximation is an important tool used in many applications of signal processing and machine learning. Recently, randomized sketching algorithms were proposed to effectively construct low rank approximations and obtain approximate singular value decompositions of large matrices. Similar ideas were used to solve least squares regression problems. In this paper, we show how matrices from error correcting codes can be used to find such low rank approximations and matrix decompositions, and extend the framework to linear least squares regression problems. The benefits of using these code matrices are the following: (i) They are easy to generate and they reduce randomness significantly. (ii) Code matrices with mild properties satisfy the subspace embedding property, and have a better chance of preserving the geometry of an entire subspace of vectors. (iii) For parallel and distributed applications, code matrices have significant advantages over structured random matrices and Gaussian random matrices. (iv) Unlike Fourier or Hadamard transform matrices, which require sampling $O(k\log k)$ columns for a rank-$k$ approximation, the log factor is not necessary for certain types of code matrices. That is, $(1+ε)$ optimal Frobenius norm error can be achieved for a rank-$k$ approximation with $O(k/ε)$ samples. (v) Fast multiplication is possible with structured code matrices, so fast approximations can be achieved for general dense input matrices. (vi) For least squares regression problem $\min\|Ax-b\|_2$ where $A\in \mathbb{R}^{n\times d}$, the $(1+ε)$ relative error approximation can be achieved with $O(d/ε)$ samples, with high probability, when certain code matrices are used.


1809.01353 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

IKA: Independent Kernel Approximator

IKA：独立核近似器

Matteo Ronchetti

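AI summary: This paper proposes IKA, a low-rank kernel approximation method that linearly combines arbitrarily chosen functions. It outperforms the Nyström method, with better results on the STL-10 dataset.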

1710.03608 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

CTD: Fast, Accurate, and Interpretable Method for Static and Dynamic Tensor Decompositions

CTD: 一种快速、准确且可解释的静态和动态张量分解方法

Jungwoo Lee, Dongjin Choi, Lee Sael

发表机构 * Seoul National University（首尔国立大学）； The State University of New York (SUNY) Korea（纽约州立大学（SUNY）韩国）

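AI summary: This paper proposes CTD, an efficient and interpretable method for static and dynamic tensor decomposition. It improves accuracy and efficiency by removing redundancy and is suited to anomaly detection in online settings.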


DOI: 10.1371/journal.pone.0200579

AI中文摘要

如何在高效且直接可解释的方式下发现张量中的模式和异常？如何在在线环境中处理不断到来的张量？张量模式和异常检测是关键问题，应用于安全监控、健康监测、网络安全等领域。标准的PARAFAC和Tucker分解结果不可直接解释。尽管已有基于采样的方法，但需要更快、更高效和更准确。本文提出CTD，一种基于采样的快速、准确且可解释的张量分解方法。CTD-S在准确性上比现有方法高17-83倍，速度和内存效率也分别提升5-86倍和7-12倍。CTD-D是首个可解释的动态张量分解方法，通过利用前一时间步的因素和重新排列操作，使速度提升2-3倍。通过CTD展示了如何在在线分布式拒绝服务（DDoS）攻击检测中有效解释结果。

英文摘要

How can we find patterns and anomalies in a tensor, or multi-dimensional array, in an efficient and directly interpretable way? How can we do this in an online environment, where a new tensor arrives each time step? Finding patterns and anomalies in a tensor is a crucial problem with many applications, including building safety monitoring, patient health monitoring, cyber security, terrorist detection, and fake user detection in social networks. Standard PARAFAC and Tucker decomposition results are not directly interpretable. Although a few sampling-based methods have previously been proposed towards better interpretability, they need to be made faster, more memory efficient, and more accurate. In this paper, we propose CTD, a fast, accurate, and directly interpretable tensor decomposition method based on sampling. CTD-S, the static version of CTD, provably guarantees a high accuracy that is 17 ~ 83x more accurate than that of the state-of-the-art method. Also, CTD-S is made 5 ~ 86x faster, and 7 ~ 12x more memory-efficient than the state-of-the-art method by removing redundancy. CTD-D, the dynamic version of CTD, is the first interpretable dynamic tensor decomposition method ever proposed. Also, it is made 2 ~ 3x faster than already fast CTD-S by exploiting factors at previous time step and by reordering operations. With CTD, we demonstrate how the results can be effectively interpreted in the online distributed denial of service (DDoS) attack detection.


1706.09763 2026-06-04 cs.GT cond-mat.stat-mech cs.LG econ.GN q-fin.EC 版本更新

Dynamical selection of Nash equilibria using Experience Weighted Attraction Learning: emergence of heterogeneous mixed equilibria

利用经验加权吸引学习动态选择纳什均衡：异质混合均衡的出现

Robin Nicole, Peter Sollich

发表机构 * Department of Mathematics, King’s College London（伦敦国王学院数学系）

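AI summary: This paper studies the distribution of strategies in a large game, classifies the Nash equilibria, analyzes how EWA learning breaks the indeterminacy of the equilibria, and reveals how heterogeneous mixed equilibria form.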

Comments 35 pages, 16 figures


DOI: 10.1371/journal.pone.0196577

AI中文摘要

我们研究了大游戏中策略分布，分析了可能的均场纳什均衡，包括可能的分割状态。由于游戏是聚集性的，实际均衡策略分布仍不确定。因此，我们比较了经验加权吸引学习的结果，该学习在长时间后在适当的大选择强度、低噪声（长代理记忆）和完美填补缺失分数（ fictitious play）极限下导致纳什均衡。学习动态打破了纳什均衡的不确定性。非平凡地，根据相关极限的取法，可以选出多种均衡类型，包括标准的同质混合和异质纯状态，以及异质混合状态，其中不同代理扮演不同策略，这些策略不全是纯策略。EWA学习的分析涉及福克-普兰克建模结合大偏差方法。理论结果通过多代理模拟得到验证。

英文摘要

We study the distribution of strategies in a large game that models how agents choose among different double auction markets. We classify the possible mean field Nash equilibria, which include potentially segregated states where an agent population can split into subpopulations adopting different strategies. As the game is aggregative, the actual equilibrium strategy distributions remain undetermined, however. We therefore compare with the results of Experience-Weighted Attraction (EWA) learning, which at long times leads to Nash equilibria in the appropriate limits of large intensity of choice, low noise (long agent memory) and perfect imputation of missing scores (fictitious play). The learning dynamics breaks the indeterminacy of the Nash equilibria. Non-trivially, depending on how the relevant limits are taken, more than one type of equilibrium can be selected. These include the standard homogeneous mixed and heterogeneous pure states, but also heterogeneous mixed states where different agents play different strategies that are not all pure. The analysis of the EWA learning involves Fokker-Planck modeling combined with large deviation methods. The theoretical results are confirmed by multi-agent simulations.
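
The abstract names three ingredients of EWA learning: an intensity of choice, agent memory, and imputation of scores for actions that were not played. The sketch below shows how they might fit together for a single agent facing fixed payoffs. It is a simplified, textbook-style variant under stated assumptions, not the model analyzed in the paper: the function names, the parameters `intensity`, `forgetting`, and `imputation`, and the payoff values are all hypothetical.

```python
import math
import random

def choice_probs(attractions, intensity):
    """Logit (softmax) choice rule; larger intensity means sharper choices."""
    top = max(attractions)  # subtract the max for numerical stability
    weights = [math.exp(intensity * (a - top)) for a in attractions]
    total = sum(weights)
    return [w / total for w in weights]

def ewa_learn(payoffs, intensity=5.0, forgetting=0.1, imputation=1.0,
              steps=200, seed=0):
    """Simplified EWA-style learning against fixed payoffs.

    forgetting: weight on the newest score (small value = long memory).
    imputation: fraction of the payoff credited to unplayed actions
                (1.0 corresponds to the fictitious-play limit).
    """
    rng = random.Random(seed)
    attractions = [0.0] * len(payoffs)
    for _ in range(steps):
        probs = choice_probs(attractions, intensity)
        played = rng.choices(range(len(payoffs)), weights=probs)[0]
        for a, payoff in enumerate(payoffs):
            # The played action gets its full payoff; the others get an
            # imputed fraction of the payoff they would have earned.
            weight = 1.0 if a == played else imputation
            attractions[a] = ((1.0 - forgetting) * attractions[a]
                              + forgetting * weight * payoff)
    return attractions, choice_probs(attractions, intensity)

# Hypothetical payoffs for two actions.
attractions, probs = ewa_learn(payoffs=[1.0, 0.5])
```

With `imputation=1.0` every action is scored each round, so the attractions approach the payoffs themselves and the choice probabilities settle on the logit response to them. Lowering `imputation` makes the outcome depend on which actions happen to be played.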

URL PDF HTML ☆

赞 0 踩 0

1806.00728 2026-06-04 stat.ML cs.CV cs.LG cs.SY eess.SP eess.SY 版本更新

Data-Free/Data-Sparse Softmax Parameter Estimation with Structured Class Geometries

无数据/稀疏数据softmax参数估计与结构类几何

Nisar Ahmed

发表机构 * H.J. Smead Aerospace Engineering Sciences, University of Colorado, Boulder, Colorado 80309（H.J. Smead航空航天工程科学系，科罗拉多大学，伯尔德，科罗拉多州80309）

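AI summary: This paper proposes softmax parameter estimation when little or no labeled data is available, using prior information about the structured geometry of class-label log-odds boundaries. The parameters are obtained by solving a system of linear equations, without expensive data sampling or optimization.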

Comments Final version accepted to IEEE Signal Processing Letters (double column), submitted July 21, 2018


DOI: 10.1109/LSP.2018.2860238

AI中文摘要

本文考虑在少量或无标注训练数据可用时，但已知类标签对数几率边界相对几何结构信息的softmax参数估计问题。证明了'无数据'softmax模型合成对应于求解参数方程组，其中期望主导类对数几率边界通过分解输入特征空间的凸多面体编码。当方程可解时，线性方程给出仅使用类边界多面体规范的softmax参数解集。这允许softmax参数学习无需昂贵的暴力数据采样和数值优化。线性方程还可适应数据稀疏情况下的约束最大似然估计。由于某些多面体规范可能无法得到解，因此也展示了存在某些概率分类问题，其对数几率边界无法用m类softmax模型学习。

英文摘要

This note considers softmax parameter estimation when little/no labeled training data is available, but a priori information about the relative geometry of class label log-odds boundaries is available. It is shown that `data-free' softmax model synthesis corresponds to solving a linear system of parameter equations, wherein desired dominant class log-odds boundaries are encoded via convex polytopes that decompose the input feature space. When solvable, the linear equations yield closed-form softmax parameter solution families using class boundary polytope specifications only. This allows softmax parameter learning to be implemented without expensive brute force data sampling and numerical optimization. The linear equations can also be adapted to constrained maximum likelihood estimation in data-sparse settings. Since solutions may also fail to exist for the linear parameter equations derived from certain polytope specifications, it is thus also shown that there exist probabilistic classification problems over m convexly separable classes for which the log-odds boundaries cannot be learned using an m-class softmax model.
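
A short derivation may help show why boundary specifications lead to linear equations in the parameters. It uses the standard softmax model with weights $w_i$ and biases $b_i$; these symbols, and the boundary quantities $n_{ij}$, $c_{ij}$, and $\alpha_{ij}$, are assumptions introduced for illustration and are not taken from the note.

```latex
P(y = i \mid x) = \frac{\exp(w_i^\top x + b_i)}{\sum_{c=1}^{m} \exp(w_c^\top x + b_c)},
\qquad
\log \frac{P(y = i \mid x)}{P(y = j \mid x)} = (w_i - w_j)^\top x + (b_i - b_j).

% The log-odds boundary between classes i and j is the set where this vanishes.
% Asking it to coincide with a prescribed hyperplane n_{ij}^\top x + c_{ij} = 0 gives
w_i - w_j = \alpha_{ij}\, n_{ij},
\qquad
b_i - b_j = \alpha_{ij}\, c_{ij},
\qquad \alpha_{ij} > 0 .
```

For fixed scale factors $\alpha_{ij}$ these constraints are linear in the $w$ and $b$ parameters, which is consistent with the abstract's description of a linear system. Such a system can be inconsistent when many pairwise boundaries are prescribed at once, which fits the abstract's remark that solutions may fail to exist.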


1711.10144 2026-06-04 math.AP cs.LG cs.NA math.NA math.PR 版本更新

The game theoretic p-Laplacian and semi-supervised learning with few labels

基于博弈论的p-拉普拉斯方程与少量标签的半监督学习

Jeff Calder

发表机构 * Department of Mathematics, University of Minnesota（明尼苏达大学数学系）

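AI summary: This paper studies the game-theoretic p-Laplacian for semi-supervised learning on graphs. It proves well-posedness in the limit of finite labeled and infinite unlabeled data, shows that the continuum limit is a weighted continuum p-Laplace equation, and proves that solutions of the graph p-Laplace equation are approximately Hölder continuous with high probability.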

1803.10309 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Canonical Correlation Analysis of Datasets with a Common Source Graph

具有共同源图的数据集的典型相关分析

Jia Chen, Gang Wang, Yanning Shen, Georgios B. Giannakis

发表机构 * University of Minnesota（明尼苏达大学）

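AI summary: This paper proposes a graph-regularized canonical correlation analysis method (gCCA) that introduces a graph structure to exploit knowledge of the common sources, improving data fusion and classification performance.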

Comments 10 pages, 7 figures


DOI: 10.1109/TSP.2018.2853130

AI中文摘要

典型相关分析（CCA）是一种用于发现两个或多个数据集是否共享隐藏源的强大技术。其优点包括降维、聚类、分类、特征选择和数据融合。然而，标准CCA未利用共同源的几何结构，这可能来自给定数据或通过（交叉）相关性推导。本文将共同源提供的额外信息编码为图，并作为图正则化器。这导致了一种新的图正则化CCA方法，称为图（g）CCA。新的gCCA考虑了图诱导的共同源知识，同时最小化所需典型变量的距离。针对数据量小于数据向量维度的多种实际设置，还开发了gCCA的对偶形式。一种设置包括内核用于处理非线性数据依赖性。所得到的图内核（gk）CCA也以闭式形式获得。最后，通过多个真实数据集上的图像分类测试来证明新线性、对偶和内核方法相对于竞争方法的优势。

英文摘要

Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, that is termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data is smaller than the data vector dimensions, the dual formulation of gCCA is also developed. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel (gk) CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.


1712.07249 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Probabilistic Learning of Torque Controllers from Kinematic and Force Constraints

基于运动学和力约束的扭矩控制器概率学习

João Silvério, Yanlong Huang, Leonel Rozo, Sylvain Calinon, Darwin G. Caldwell

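发表机构 * Department of Advanced Robotics, Istituto Italiano di Tecnologia（意大利技术研究院先进机器人系）； Idiap Research Institute（Idiap研究所）

AI summary: This paper proposes a probabilistic approach that simultaneously learns and synthesizes torque control commands under task-space, joint-space, and force constraints; it learns the relevance of different torque controllers probabilistically and exploits properties of Gaussian distributions to generate new torque commands that satisfy the task characteristics.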

Comments Accepted for publication at 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

1808.00058 2026-06-04 cs.NI cs.LG cs.SY eess.SP eess.SY 版本更新

A Unified Framework for Joint Mobility Prediction and Object Profiling of Drones in UAV Networks

无人机网络中无人机移动预测与物体特征联合预测的统一框架

Han Peng, Abolfazl Razi, Fatemeh Afghah, Jonathan Ashdown

发表机构 * School of Informatics, Computing and Cyber Systems, Northern Arizona University, Flagstaff, AZ（信息学、计算与网络系统学院，北亚利桑那大学，弗拉格斯塔，亚利桑那）； Air Force Research Laboratory, Rome, NY（空军研究实验室，罗马，纽约）

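AI summary: This paper proposes an unsupervised online learning algorithm for joint mobility prediction and object profiling of drones in UAV networks, with the aim of making control and communication protocols more efficient.

Comments 8 pages, 11 figures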

AI中文摘要

近年来，使用自主且协作的无人空中车辆（UAV）网络在没有地面站命令和通信的情况下变得更加重要，特别是在搜索和救援、灾害管理等人类干预受限的应用中。在这些场景中，如果无人机能够获取关于邻居节点的移动性、传感和作动能力的信息，它们可以做出更有效的决策。本文开发了一种无监督在线学习算法，用于无人机的移动预测和物体特征联合预测，以促进控制和通信协议。所提出的方法不仅预测周围飞行物体的未来位置，还能将它们分类为具有相似机动能力的不同组别（例如旋转式和固定翼UAVs），而无需事先了解这些组别。该方法在接纳具有未知移动性特征的新物体类型方面具有灵活性，因此适用于具有异构节点的新兴飞行自组网。

英文摘要

In recent years, using a network of autonomous and cooperative unmanned aerial vehicles (UAVs) without command and communication from the ground station has become more imperative, in particular in search-and-rescue operations, disaster management, and other applications where human intervention is limited. In such scenarios, UAVs can make more efficient decisions if they acquire more information about the mobility, sensing, and actuation capabilities of their neighbor nodes. In this paper, we develop an unsupervised online learning algorithm for joint mobility prediction and object profiling of UAVs to facilitate control and communication protocols. The proposed method not only predicts the future locations of the surrounding flying objects, but also classifies them into different groups with similar levels of maneuverability (e.g., rotary-wing and fixed-wing UAVs) without prior knowledge about these classes. This method is flexible in admitting new object types with unknown mobility profiles, and is therefore applicable to emerging flying ad hoc networks with heterogeneous nodes.


1709.03726 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Adaptive Graph Signal Processing: Algorithms and Optimal Sampling Strategies

自适应图信号处理：算法与最优采样策略

Paolo Di Lorenzo, Paolo Banelli, Elvin Isufi, Sergio Barbarossa, Geert Leus

发表机构 * Dept. of Engineering, University of Perugia（工程系，佩鲁吉亚大学）

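AI summary: This paper proposes new strategies for adaptive learning of graph signals; it analyzes the effect of random sampling on algorithm performance and designs optimized sampling strategies that improve steady-state performance and convergence rate.

Comments Submitted to IEEE Transactions on Signal Processing, September 2017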

DOI: 10.1109/TSP.2018.2835384

AI中文摘要

本文旨在提出自适应图信号学习的新策略，即在随机时间变化的顶点子集上观测信号。将经典自适应算法LMS和RLS重新纳入图信号处理框架，通过均方分析探讨随机采样对自适应重建能力和稳态性能的影响。随后提出几种概率采样策略，设计每个节点的采样概率，以优化稳态性能、图采样率和算法收敛速度的平衡。最后推导出一种分布式RLS策略，并证明其收敛于集中式算法。通过合成和真实数据的数值模拟，展示了所提采样和重建策略在图上信号（可能分布式）自适应学习中的良好性能。

英文摘要

The goal of this paper is to propose novel strategies for adaptive learning of signals defined over graphs, which are observed over a (randomly time-varying) subset of vertices. We recast two classical adaptive algorithms in the graph signal processing framework, namely, the least mean squares (LMS) and the recursive least squares (RLS) adaptive estimation strategies. For both methods, a detailed mean-square analysis illustrates the effect of random sampling on the adaptive reconstruction capability and the steady-state performance. Then, several probabilistic sampling strategies are proposed to design the sampling probability at each node in the graph, with the aim of optimizing the tradeoff between steady-state performance, graph sampling rate, and convergence rate of the adaptive algorithms. Finally, a distributed RLS strategy is derived and is shown to be convergent to its centralized counterpart. Numerical simulations carried out over both synthetic and real data illustrate the good performance of the proposed sampling and reconstruction strategies for (possibly distributed) adaptive learning of signals defined over graphs.


1807.08855 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SP eess.SY 版本更新

Weak in the NEES?: Auto-tuning Kalman Filters with Bayesian Optimization

在NEES中薄弱：基于贝叶斯优化的自动调节卡尔曼滤波器

Zhaozhong Chen, Christoffer Heckman, Simon Julier, Nisar Ahmed

发表机构 * Department of Computer Science（计算机科学系）； University of Colorado Boulder（科罗拉多大学博尔德分校）； University College London（伦敦大学学院）； Smead Aerospace Engineering Sciences（Smead航空航天工程科学系）

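AI summary: This paper proposes a Bayesian-optimization-based method for auto-tuning Kalman filters; by intelligently sampling the parameter space and using a nonparametric Gaussian process surrogate function, it efficiently identifies multiple local minima and provides uncertainty quantification on its results.

Comments Final version presented at FUSION 2018 Conference, Cambridge, UK, July 2018 (submitted June 1, 2018)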

AI中文摘要

卡尔曼滤波器被广泛用于数据融合应用，包括导航、跟踪和同时定位与建图问题。然而，调整各种卡尔曼滤波器模型参数需要大量时间和努力，例如过程噪声协方差、非白噪声预白化滤波器模型等。传统优化技术在调整时容易陷入较差的局部极小值，并且使用真实传感器数据实施成本较高。为了解决这些问题，本文开发了一种新的“黑箱”贝叶斯优化策略，用于自动调节卡尔曼滤波器。在该方法中，性能由两种随机目标函数之一来表征：当可用真实状态模型时为归一化估计误差平方（NEES），当只有传感器数据可用时为归一化创新误差平方（NIS）。通过智能采样参数空间，学习和利用非参数高斯过程代理函数，贝叶斯优化可以高效地识别多个局部极小值，并对其结果提供不确定性量化。

英文摘要

Kalman filters are routinely used for many data fusion applications including navigation, tracking, and simultaneous localization and mapping problems. However, significant time and effort is frequently required to tune various Kalman filter model parameters, e.g. process noise covariance, pre-whitening filter models for non-white noise, etc. Conventional optimization techniques for tuning can get stuck in poor local minima and can be expensive to implement with real sensor data. To address these issues, a new "black box" Bayesian optimization strategy is developed for automatically tuning Kalman filters. In this approach, performance is characterized by one of two stochastic objective functions: normalized estimation error squared (NEES) when ground truth state models are available, or the normalized innovation error squared (NIS) when only sensor data is available. By intelligently sampling the parameter space to both learn and exploit a nonparametric Gaussian process surrogate function for the NEES/NIS costs, Bayesian optimization can efficiently identify multiple local minima and provide uncertainty quantification on its results.
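
As a concrete illustration of the NEES cost named in the abstract, the following minimal sketch computes the standard statistic $e^\top P^{-1} e$ for a single time step, where $e$ is the state estimation error and $P$ is the filter's covariance estimate. The function name `nees` and the example numbers are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def nees(x_true, x_est, P):
    """Normalized estimation error squared (NEES) for one time step.

    x_true, x_est: state vectors of equal length.
    P: covariance matrix reported by the filter for x_est.
    """
    e = np.asarray(x_true, dtype=float) - np.asarray(x_est, dtype=float)
    # Solve P z = e rather than forming the explicit inverse of P.
    return float(e @ np.linalg.solve(np.asarray(P, dtype=float), e))

# Hypothetical example: a 2-state filter with diagonal covariance.
value = nees([1.0, 2.0], [0.0, 0.0], np.diag([1.0, 4.0]))
print(value)
```

For a consistent filter, the NEES averaged over many runs should be close to the state dimension, which is what makes it usable as a tuning objective when ground truth is available.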


1807.08048 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Baidu Apollo EM Motion Planner

百度 Apollo EM 运动规划器

Haoyang Fan, Fan Zhu, Changchun Liu, Liangliang Zhang, Li Zhuang, Dong Li, Weicheng Zhu, Jiangtao Hu, Hongye Li, Qi Kong

发表机构 * Baidu USA LLC（百度美国有限公司）

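AI summary: This paper presents a real-time motion planning system based on the Baidu Apollo open-source autonomous driving platform; it addresses industrial Level 4 motion planning with attention to safety, comfort, and scalability, and uses a hierarchical structure to handle multi-lane and single-lane autonomous driving.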

基于类似修正线性单元函数的区域函数网络的函数近似

Hrushikesh N. Mhaskar

发表机构 * Institute of Mathematical Sciences, Claremont Graduate University（数学科学研究所，克莱蒙特研究生大学）

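AI summary: This paper studies the approximation properties of zonal function networks on the q-dimensional sphere, examines approximation with activation functions that are not positive definite, and establishes the corresponding smoothness classes and approximation properties.

Comments 18 pages, Title changed from the previous version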

1807.02297 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

基于动态偏好的激励机制组合博弈问题

Tanner Fiez, Shreyas Sekar, Liyuan Zheng, Lillian J. Ratliff

发表机构 * Electrical Engineering Department, University of Washington（华盛顿大学电气工程系）

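AI summary: This paper proposes a multi-armed bandit framework for matching incentives to users in resource-constrained settings; it combines greedy matching, a UCB algorithm, and Markov chain mixing times, analyzes regret theoretically, and validates performance on synthetic and real-world examples.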

Comments Published as a conference paper in Conference on Uncertainty in Artificial Intelligence (UAI) 2018

1807.00553 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.DS stat.ML 版本更新

A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics

对自动化决策中偏见的更广泛视角：反思认识论与动态性

Roel Dobbe, Sarah Dean, Thomas Gilbert, Nitin Kohli

发表机构 * Department of Electrical Engineering and Computer Sciences, University of California Berkeley, USA（加州大学伯克利分校电气工程与计算机科学系）； Department of Rhetoric, University of California Berkeley, USA（加州大学伯克利分校修辞学系）； School of Information, University of California Berkeley, USA（加州大学伯克利分校信息学院）

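AI summary: This paper examines the roots of bias in automated decision-making, treating technical bias as an epistemological problem and emergent bias as a dynamical feedback phenomenon, and argues for reflecting on epistemology and adopting value-sensitive design to improve decision-making systems.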

Comments Presented at the 2018 Workshop on Fairness, Accountability and Transparency in Machine Learning during ICML 2018, Stockholm, Sweden

1709.01268 2026-06-04 cs.CE cs.LG cs.NA math.NA q-fin.TR 版本更新

Tensor Representation in High-Frequency Financial Data for Price Change Prediction

高频金融数据中的张量表示用于价格变动预测

Dat Thanh Tran, Martin Magris, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis

发表机构 * 1 Laboratory of Signal Processing, Tampere University of Technology, Tampere, Finland 2 Laboratory of Industrial

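AI summary: This paper studies the effectiveness of multilinear tensor methods for mid-price prediction; experiments on a large-scale dataset indicate that tensor representations outperform vector-based and other competing methods.

Comments accepted in SSCI 2017, typos fixed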

DOI: 10.1109/SSCI.2017.8280812
Journal ref: IEEE Symposium Series on Computational Intelligence (SSCI), 2017

AI中文摘要

如今，随着大量交易数据的可用性，金融市场的动态既是对高频交易者的一种挑战，也是一种机会。为了利用高频交易中资产快速微妙的变动，必须有自动算法来分析和检测基于交易记录的价格变动模式。金融数据的多通道时间序列表示自然地建议了基于张量的学习算法。在本工作中，我们研究了两种多线性方法在中价预测问题中的有效性，与其他现有方法相比。在包含超过400万笔限价订单的大型数据集上的实验表明，通过利用张量表示，多线性模型优于向量方法和其他竞争方法。

英文摘要

Nowadays, with the availability of massive amounts of collected trade data, the dynamics of the financial markets pose both a challenge and an opportunity for high-frequency traders. In order to take advantage of the rapid, subtle movement of assets in High Frequency Trading (HFT), an automatic algorithm to analyze and detect patterns of price change based on transaction records must be available. The multichannel, time-series representation of financial data naturally suggests tensor-based learning algorithms. In this work, we investigate the effectiveness of two multilinear methods for the mid-price prediction problem against other existing methods. The experiments on a large-scale dataset containing more than 4 million limit orders show that by utilizing tensor representation, multilinear models outperform vector-based approaches and other competing ones.


1806.09919 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Tangent-Space Regularization for Neural-Network Models of Dynamical Systems

神经动力系统模型中的切空间正则化

Fredrik Bagge Carlson, Rolf Johansson, Anders Robertsson

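发表机构 * LCCC Linnaeus Center（LCCC 林奈中心）

AI summary: This paper proposes tangent-space regularization for neural-network models of dynamical systems; it exploits tangent-space properties of the dynamics function to improve regularization of the model Jacobian and reduce reliance on large amounts of training data, and discusses how different network architectures affect the ability to learn the input-output Jacobian and how L2 regularization affects system stability.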

1806.09620 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

A DCA-Like Algorithm and its Accelerated Version with Application in Data Visualization

一种类似DCA的算法及其加速版本在数据可视化中的应用

Hoai An Le Thi, Hoai Minh Le, Duy Nhat Phan, Bach Tran

发表机构 * Department Informatics and Application（信息与应用系）； LGIPM ； University of Lorraine（洛林大学）； France（法国）

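AI summary: This paper proposes two DCA variants aimed at accelerating constrained minimization problems involving differentiable and composite functions. A new decomposition technique improves DCA, and combining it with Nesterov's acceleration yields an accelerated DCA. Convergence is studied rigorously under the Kurdyka-Lojasiewicz assumption, and the algorithms are applied to t-distributed stochastic neighbor embedding.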

1806.05419 2026-06-04 stat.ML cs.LG cs.NA math.NA math.ST stat.TH 版本更新

Ranking Recovery from Limited Comparisons using Low-Rank Matrix Completion

通过低秩矩阵补全进行有限比较的排序恢复

Tal Levy, Alireza Vahid, Raja Giryes

发表机构 * School of Electrical Engineering, Tel-Aviv University（特拉维夫大学电气工程学院）； Electrical Engineering Department, University of Colorado Denver（科罗拉多大学丹佛分校电气工程系）

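AI summary: This paper applies low-rank matrix completion to the classical rank aggregation problem: partial and noisy comparison data are arranged in matrix form, and an alternating minimization algorithm combined with maximum likelihood estimation reconstructs the true preference intensities.

Comments 10 Pages, 9 figures. A prediction table for 2018 FIFA soccer world cup is included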

AI中文摘要

本文提出了一种新的方法，利用低秩矩阵补全技术解决经典的排名聚合问题。通过将成对比较的不完全和噪声数据转换为矩阵形式，并利用矩阵补全工具（如Netflix挑战中的低秩补全解决方案）来构建不同对象的偏好。在我们的方法中，利用多个比较数据估计对象i相对于对象j获胜（或被选择）的概率，其中仅已知N个对象的部分比较数据。然后将数据转换为矩阵形式，其无噪声解具有已知的秩为一。接着使用目标矩阵具有双线性形式的交替最小化算法，并结合最大似然估计对两个因素进行估计。重建的矩阵用于获得真实的潜在偏好强度。本工作在模拟场景和真实数据中展示了所提算法相对于当前最先进方法的改进。

英文摘要

This paper proposes a new method for solving the well-known rank aggregation problem from pairwise comparisons using the method of low-rank matrix completion. The partial and noisy data of pairwise comparisons is transformed into a matrix form. We then use tools from matrix completion, which has served as a major component in the low-rank completion solution of the Netflix challenge, to construct the preference of the different objects. In our approach, the data of multiple comparisons is used to create an estimate of the probability of object i to win (or be chosen) over object j, where only a partial set of comparisons between N objects is known. The data is then transformed into a matrix form for which the noiseless solution has a known rank of one. An alternating minimization algorithm, in which the target matrix takes a bilinear form, is then used in combination with maximum likelihood estimation for both factors. The reconstructed matrix is used to obtain the true underlying preference intensity. This work demonstrates the improvement of our proposed algorithm over the current state-of-the-art in both simulated scenarios and real data.


1806.04830 2026-06-04 math.NA cs.LG cs.NA 版本更新

Deep Multiscale Model Learning

深度多尺度模型学习

Yating Wang, Siu Wun Cheung, Eric T. Chung, Yalchin Efendiev, Min Wang

发表机构 * Department of Mathematics, Texas A&M University（德克萨斯农工大学数学系）； Department of Mathematics, The Chinese University of Hong Kong（香港中文大学数学系）； Department of Mathematics & Institute for Scientific Computation (ISC), Texas A&M University（德克萨斯农工大学数学系与科学计算研究所）

AI summary: This paper combines deep learning with local multiscale model reduction, using observed data and physical modeling concepts to improve the predictive capability of multiscale flow simulations.

AI中文摘要

本文的目标是设计新型多层神经网络架构，用于考虑观测数据和物理建模概念的流体多尺度模拟。我们的方法结合深度学习概念与局部多尺度模型降阶方法，预测流体动力学。使用降阶模型对于构建稳健的深度学习架构至关重要，因为降阶模型提供较少的自由度。流体动力学可以视为多层网络。更准确地说，时间瞬间n+1的解（例如压力和饱和度）取决于时间瞬间n的解和输入参数，如渗透率场、强迫项和初始条件。可以将解视为多层网络，其中每一层通常是一个非线性前向映射，层数与内部时间步数相关。我们将依赖严格的模型降阶概念来定义每个层的未知数和连接。在每一层中，我们的降阶模型将提供一个前向映射，该映射将通过可用数据进行修改（“训练”）。使用降阶模型至关重要，因为它们将识别影响区域和适当的变量数量。由于可用数据有限，训练将补充计算数据，并在数据丰富和数据贫乏的模型之间进行插值。我们还将使用深度学习算法来训练降阶模型离散系统的元素。我们将介绍我们方法的主要成分和数值结果。数值结果表明，使用深度学习和多尺度模型，可以提高受可用数据条件的前向模型。

英文摘要

The objective of this paper is to design novel multi-layer neural network architectures for multiscale simulations of flows taking into account the observed data and physical modeling concepts. Our approaches use deep learning concepts combined with local multiscale model reduction methodologies to predict flow dynamics. Using reduced-order model concepts is important for constructing robust deep learning architectures since the reduced-order models provide fewer degrees of freedom. Flow dynamics can be thought of as multi-layer networks. More precisely, the solution (e.g., pressures and saturations) at the time instant $n+1$ depends on the solution at the time instant $n$ and input parameters, such as permeability fields, forcing terms, and initial conditions. One can regard the solution as a multi-layer network, where each layer, in general, is a nonlinear forward map and the number of layers relates to the internal time steps. We will rely on rigorous model reduction concepts to define unknowns and connections for each layer. In each layer, our reduced-order models will provide a forward map, which will be modified ("trained") using available data. It is critical to use reduced-order models for this purpose, which will identify the regions of influence and the appropriate number of variables. Because of the lack of available data, the training will be supplemented with computational data as needed and the interpolation between data-rich and data-deficient models. We will also use deep learning algorithms to train the elements of the reduced model discrete system. We will present main ingredients of our approach and numerical results. Numerical results show that using deep learning and multiscale models, we can improve the forward models, which are conditioned to the available data.


1806.04167 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Learning an Approximate Model Predictive Controller with Guarantees

学习具有保证的近似模型预测控制器

Michael Hertneck, Johannes Köhler, Sebastian Trimpe, Frank Allgöwer

发表机构 * University of Stuttgart（斯图加特大学）

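AI summary: This paper proposes a supervised learning framework for approximating a model predictive controller with reduced computational complexity while guaranteeing stability and constraint satisfaction. Combining robust MPC design with statistical learning bounds gives closed-loop guarantees for the learned MPC.

Comments 6 pages, 3 figures, to appear in IEEE Control Systems Letters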

DOI: 10.1109/LCSYS.2018.2843682

AI中文摘要

本文提出了一种监督学习框架，用于近似模型预测控制器（MPC），以降低计算复杂度并保证稳定性和约束满足。该框架可应用于广泛非线性系统。任何标准监督学习技术（例如神经网络）均可用于从样本中近似MPC。为了获得学习MPC的闭环保证，将鲁棒MPC设计与统计学习界限相结合。MPC设计确保在给定范围内输入不准确时的鲁棒性，Hoeffding不等式用于验证学习的MPC在高置信度下满足这些界限。结果是学习MPC的闭环统计保证，确保稳定性和约束满足。所提出的基于学习的MPC框架在非线性基准问题上进行了示例说明，其中我们学习了一个具有保证的神经网络控制器。

英文摘要

A supervised learning framework is proposed to approximate a model predictive controller (MPC) with reduced computational complexity and guarantees on stability and constraint satisfaction. The framework can be used for a wide class of nonlinear systems. Any standard supervised learning technique (e.g. neural networks) can be employed to approximate the MPC from samples. In order to obtain closed-loop guarantees for the learned MPC, a robust MPC design is combined with statistical learning bounds. The MPC design ensures robustness to inaccurate inputs within given bounds, and Hoeffding's Inequality is used to validate that the learned MPC satisfies these bounds with high confidence. The result is a closed-loop statistical guarantee on stability and constraint satisfaction for the learned MPC. The proposed learning-based MPC framework is illustrated on a nonlinear benchmark problem, for which we learn a neural network controller with guarantees.
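
To make the role of Hoeffding's inequality more tangible, here is a minimal sketch of the generic one-sided bound for i.i.d. indicator samples in [0, 1]: with probability at least 1 - delta, the true success rate is no smaller than the empirical rate minus sqrt(ln(1/delta) / (2N)). The function names and the numbers are illustrative assumptions; the exact validation procedure used in the paper may differ.

```python
import math

def hoeffding_lower_bound(empirical_rate, n_samples, delta):
    """Lower confidence bound on the true rate, valid with probability >= 1 - delta.

    Assumes n_samples i.i.d. samples taking values in [0, 1].
    """
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n_samples))
    return empirical_rate - eps

def samples_needed(eps, delta):
    """Smallest N such that the Hoeffding deviation term is at most eps."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * eps ** 2))

# Hypothetical requirement: deviation of at most 0.01 with 99% confidence.
n = samples_needed(0.01, 0.01)
bound = hoeffding_lower_bound(1.0, n, 0.01)
print(n, bound)
```

In this reading, each sample would indicate whether the learned controller's approximation error stayed within the tolerance that the robust MPC design can absorb, so the bound translates an empirical pass rate into a statistical guarantee.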


1806.03145 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Fidelity-based Probabilistic Q-learning for Control of Quantum Systems

基于保真度的概率Q学习用于量子系统的控制

Chunlin Chen, Daoyi Dong, Han-Xiong Li, Jian Chu, Tzyh-Jong Tarn

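AI summary: This paper proposes a fidelity-based probabilistic Q-learning method that addresses the balance between exploration and exploitation in reinforcement learning and applies it to the control of quantum systems; iteratively updating action probabilities yields a natural exploration strategy and improves learning efficiency.

Comments 13 pages, 16 figures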

DOI: 10.1109/TNNLS.2013.2283574
Journal ref: IEEE Transactions on Neural Networks and Learning Systems, VOL. 25, NO. 5, pp.920-933, MAY 2014

AI中文摘要

在强化学习中，探索与利用的平衡是一个关键问题，尤其是对于Q学习。本文提出一种基于保真度的概率Q学习（FPQL）方法，以自然解决此问题并应用于量子系统控制。该方法利用保真度指导学习过程，迭代更新每个状态的动作概率，从而实现自然探索策略而非依赖配置参数的尖锐策略。首先提出概率Q学习（PQL）算法以展示概率动作选择的基本思想，随后针对量子系统控制提出FPQL算法。通过两个例子（自旋-1/2系统和λ型原子系统）测试FPQL算法的性能。结果表明，FPQL算法在探索与利用之间取得更好的平衡，能够避免局部最优策略并加速学习过程。

英文摘要

The balance between exploration and exploitation is a key problem for reinforcement learning methods, especially for Q-learning. In this paper, a fidelity-based probabilistic Q-learning (FPQL) approach is presented to naturally solve this problem and is applied to learning control of quantum systems. In this approach, fidelity is adopted to help direct the learning process, and the probability of each action being selected at a certain state is updated iteratively along with the learning process, which leads to a natural exploration strategy instead of a pointed one with configured parameters. A probabilistic Q-learning (PQL) algorithm is first presented to demonstrate the basic idea of probabilistic action selection. Then the FPQL algorithm is presented for learning control of quantum systems. Two examples (a spin-1/2 system and a lambda-type atomic system) are demonstrated to test the performance of the FPQL algorithm. The results show that FPQL algorithms attain a better balance between exploration and exploitation, and can also avoid local optimal policies and accelerate the learning process.


1806.02499 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Conditional probability calculation using restricted Boltzmann machine with application to system identification

基于受限玻尔兹曼机的条件概率计算及其在系统辨识中的应用

Erick de la Rosa, Wen Yu

发表机构 * Departamento de Control Automatico CINVESTAV-IPN (National Polytechnic Institute)（自动控制系 CINVESTAV-IPN（国家理工学院））

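AI summary: This paper uses restricted Boltzmann machines to compute conditional probabilities for nonlinear system identification; it improves the model with binary-encoding and continuous-valued methods, presents a universal approximation analysis, and reports advantages when noise is large and the system dynamics are complex.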

1709.10441 2026-06-04 cs.LG cs.NA math.NA 版本更新

A representer theorem for deep kernel learning

深度核学习的代表定理

Bastian Bohn, Michael Griebel, Christian Rieger

发表机构 * Institute for Numerical Simulation, University of Bonn（数值模拟研究所，波恩大学）； Fraunhofer Institute for Algorithms and Scientific Computing SCAI（算法与科学计算弗劳恩霍夫研究所SCAI）

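AI summary: This paper provides finite-sample and infinite-sample representer theorems for concatenations of functions in deep kernel learning, giving a mathematical foundation for analyzing machine learning algorithms based on function composition; it also shows how concatenated machine learning problems can be reformulated as neural networks and applies the results to recent deep learning methods.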

1806.01678 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

A Projection Method for Metric-Constrained Optimization

度量约束优化的一种投影方法

Nate Veldt, David Gleich, Anthony Wirth, James Saunderson

发表机构 * Purdue University, Mathematics Department（普渡大学数学系）； Purdue University, Computer Science Department（普渡大学计算机科学系）； The University of Melbourne, Computing and Information Systems School（墨尔本大学计算与信息系统学院）； Monash University, Department of Electrical and Computer Systems Engineering（莫纳什大学电子与计算机系统工程系）

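AI summary: This paper proposes a new approach to metric-constrained optimization problems; it improves a projection algorithm to solve high-dimensional optimization problems arising in graph clustering and provides new approximation guarantees.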

AI中文摘要

我们概述了一种解决强制输出变量三角不等式的优化问题的新方法。我们将其称为度量约束优化，并给出了在机器学习应用和图聚类理论近似算法中出现的几个例子。尽管这些问题是理论上的有趣问题，但实际求解具有挑战性，因为黑箱求解器需要高内存。为了解决这一挑战，我们首先证明了相关聚类的度量约束线性规划松弛等价于度量接近问题的特殊情况。然后我们通过推广和改进最初用于度量接近的简单投影算法，开发了一个通用求解器。我们为使用我们的框架找到几个具有挑战性的图聚类问题的最优解的下界提供了几种新的近似保证。我们还通过解决包含高达10^8个变量和10^11个约束的优化问题来展示我们框架的威力。

英文摘要

We outline a new approach for solving optimization problems which enforce triangle inequalities on output variables. We refer to this as metric-constrained optimization, and give several examples where problems of this form arise in machine learning applications and theoretical approximation algorithms for graph clustering. Although these problems are interesting from a theoretical perspective, they are challenging to solve in practice due to the high memory requirement of black-box solvers. In order to address this challenge, we first prove that the metric-constrained linear program relaxation of correlation clustering is equivalent to a special case of the metric nearness problem. We then develop a general solver for metric-constrained linear and quadratic programs by generalizing and improving a simple projection algorithm originally developed for metric nearness. We give several novel approximation guarantees for using our framework to find lower bounds for optimal solutions to several challenging graph clustering problems. We also demonstrate the power of our framework by solving optimization problems involving up to 10^{8} variables and 10^{11} constraints.
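
The basic building block of projection methods for triangle-inequality constraints can be sketched as follows: a single Euclidean projection of three distance variables onto the half-space x_ij <= x_ik + x_jk. This is only an unweighted, single-constraint illustration with a hypothetical function name; a complete solver would cycle over all triangle constraints (typically with correction terms), and the paper's generalized algorithm is not reproduced here.

```python
def project_triangle(x_ij, x_ik, x_jk):
    """Euclidean projection of (x_ij, x_ik, x_jk) onto {x_ij <= x_ik + x_jk}."""
    violation = x_ij - x_ik - x_jk
    if violation <= 0:
        # Constraint already satisfied: the point is its own projection.
        return x_ij, x_ik, x_jk
    # The constraint normal is (1, -1, -1) with squared norm 3,
    # so the violation is spread equally over the three variables.
    step = violation / 3.0
    return x_ij - step, x_ik + step, x_jk + step

# Hypothetical example: 5 > 1 + 1 violates the triangle inequality.
projected = project_triangle(5.0, 1.0, 1.0)
print(projected)
```

Because each projection touches only three variables, the method needs memory proportional to the number of variables rather than to the full constraint matrix, which is the property that matters when black-box solvers run out of memory.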


1806.01003 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Distributed Learning from Interactions in Social Networks

社交网络中交互的分布式学习

Francesco Sasso, Angelo Coluccia, Giuseppe Notarstefano

发表机构 * European Research Council (ERC)（欧洲研究理事会）

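AI summary: This paper proposes a distributed learning framework based on interactions in social networks; using Bayesian methods and maximum likelihood estimation together with tools from graphical models, it obtains distributed estimators of parameters and hyperparameters for user profiling.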

Comments This submission is a shorter work (for conference publication) of a more comprehensive paper, already submitted as arXiv:1706.04081 (under review for journal publication). In this short submission only one social set-up is considered and only one of the relaxed estimators is proposed. Moreover, the exhaustive analysis, carried out in the longer manuscript, is completely missing in this version


AI中文摘要

我们考虑一个网络场景，其中代理可以根据表示某些交互的评分图来评估彼此。目标是设计一个分布式协议，由代理运行，使他们能够在有限的可能值中学习其未知状态。我们提出一个贝叶斯框架，其中评分和状态与具有未知参数和超参数的概率事件相关联。我们展示每个代理可以通过本地贝叶斯分类器和结合普通最大似然估计和经验贝叶斯方法的（集中式）最大似然（ML）估计器来学习其状态。通过使用图模型工具，我们可以获得评分和状态的条件依赖性的洞察，从而提供一个放松的概率模型，最终导致一个适合分布式计算的参数-超参数估计器。为了突出所提放松的适当性，我们将在社交互动设置中演示分布式估计器。

英文摘要

We consider a network scenario in which agents can evaluate each other according to a score graph that models some interactions. The goal is to design a distributed protocol, run by the agents, that allows them to learn their unknown state among a finite set of possible values. We propose a Bayesian framework in which scores and states are associated to probabilistic events with unknown parameters and hyperparameters, respectively. We show that each agent can learn its state by means of a local Bayesian classifier and a (centralized) Maximum-Likelihood (ML) estimator of parameter-hyperparameter that combines plain ML and Empirical Bayes approaches. By using tools from graphical models, which allow us to gain insight on conditional dependencies of scores and states, we provide a relaxed probabilistic model that ultimately leads to a parameter-hyperparameter estimator amenable to distributed computation. To highlight the appropriateness of the proposed relaxation, we demonstrate the distributed estimators on a social interaction set-up for user profiling.


1806.00589 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Efficient Entropy for Policy Gradient with Multidimensional Action Space

在多维动作空间中高效的策略梯度熵

Yiming Zhang, Quan Ho Vuong, Kenny Song, Xiao-Yue Gong, Keith W. Ross

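发表机构 * New York University（纽约大学）； New York University Abu Dhabi（纽约大学阿布扎比分校）； New York University Shanghai（纽约大学上海分校）； Massachusetts Institute of Technology（麻省理工学院）

AI summary: This paper proposes efficient methods for computing the entropy bonus in policy gradient with high-dimensional action spaces; unbiased estimators improve exploration, and their effectiveness is evaluated on a multi-hunter multi-rabbit grid game and a multi-agent multi-arm bandit problem.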

AI中文摘要

近年来，深度强化学习在解决高维状态空间（如Atari游戏）的序列决策过程方面表现出色。然而，许多强化学习问题涉及高维离散动作空间和高维状态空间。本文考虑熵奖励，用于在策略梯度中鼓励探索。在高维动作空间中，计算熵及其梯度需要枚举所有动作并为每个动作运行前向和反向传播，这可能计算上不可行。我们开发了几种新颖的无偏估计器用于熵奖励及其梯度。我们将这些估计器应用于几种参数化策略模型，包括独立采样、CommNet、带有修改MDP的自回归和带有LSTM的自回归。最后，我们在两个环境中测试我们的算法：一个多猎手多兔子网格游戏和一个多智能体多臂老虎机问题。结果表明，我们的熵估计器在边际额外计算成本下显著提升了性能。

英文摘要

In recent years, deep reinforcement learning has been shown to be adept at solving sequential decision processes with high-dimensional state spaces such as in the Atari games. Many reinforcement learning problems, however, involve high-dimensional discrete action spaces as well as high-dimensional state spaces. This paper considers entropy bonus, which is used to encourage exploration in policy gradient. In the case of high-dimensional action spaces, calculating the entropy and its gradient requires enumerating all the actions in the action space and running forward and backpropagation for each action, which may be computationally infeasible. We develop several novel unbiased estimators for the entropy bonus and its gradient. We apply these estimators to several models for the parameterized policies, including Independent Sampling, CommNet, Autoregressive with Modified MDP, and Autoregressive with LSTM. Finally, we test our algorithms on two environments: a multi-hunter multi-rabbit grid game and a multi-agent multi-arm bandit problem. The results show that our entropy estimators substantially improve performance with marginal additional computational cost.


1709.05746 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY 版本更新

Adversarial Discriminative Sim-to-real Transfer of Visuo-motor Policies

对抗性判别仿真到现实的视觉-运动策略转移

Fangyi Zhang, Jürgen Leitner, Zongyuan Ge, Michael Milford, Peter Corke

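发表机构 * Australian Centre for Robotic Vision (ACRV)（澳大利亚机器人视觉中心）； Queensland University of Technology (QUT)（昆士兰科技大学）； Monash University（莫纳什大学）

AI summary: This paper proposes an adversarial discriminative sim-to-real transfer method to reduce the cost of labelling real data. In a table-top object reaching task where a 7 DoF arm reaches a blue cuboid in clutter from visual observations, policies transfer with only 93 labelled and 186 unlabelled real images, achieving a 97.8% success rate and 1.8 cm control accuracy.

Comments Under review for the International Journal of Robotics Research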

AI中文摘要

各种方法已被提出以学习用于现实世界机器人应用的视觉-运动策略。一种解决方案是首先在仿真中学习然后转移到现实世界。在转移过程中，大多数现有方法需要带有标签的真实图像。然而，在许多机器人应用中，标注过程往往昂贵甚至不实际。在本文中，我们提出了一种对抗性判别仿真到现实转移方法，以减少标注真实数据的成本。通过模块化网络在桌面物体抓取任务中验证了该方法的有效性，其中7自由度的机械臂以速度模式控制在障碍物中抓取蓝色立方体。对抗性转移方法将标注真实数据的需求减少了50%。策略可以仅使用93个标注和186个未标注的真实图像转移到现实环境。转移的视觉-运动策略对训练中未见过的物体和移动目标具有鲁棒性，实现了97.8%的成功率和1.8厘米的控制精度。

英文摘要

Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is first learning in simulation then transferring to the real world. In the transfer, most existing approaches need real-world images with labels. However, the labelling process is often expensive or even impractical in many robotic applications. In this paper, we propose an adversarial discriminative sim-to-real transfer approach to reduce the cost of labelling real data. The effectiveness of the approach is demonstrated with modular networks in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter through visual observations. The adversarial transfer approach reduced the labelled real data requirement by 50%. Policies can be transferred to real environments with only 93 labelled and 186 unlabelled real images. The transferred visuo-motor policies are robust to novel (not seen in training) objects in clutter and even a moving target, achieving a 97.8% success rate and 1.8 cm control accuracy.


1805.10638 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Fast K-Means Clustering with Anderson Acceleration

快速K均值聚类的安德森加速方法

Juyong Zhang, Yuxin Yao, Yue Peng, Hao Yu, Bailin Deng

发表机构 * University of Science and Technology of China（中国科学技术大学）； Cardiff University（卡迪夫大学）

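AI summary: This paper proposes a new way to accelerate Lloyd's algorithm for K-Means clustering: the assignment and update steps are treated as a fixed-point iteration to which Anderson acceleration is applied, with the parameter m adjusted dynamically for robust and consistent speedups.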

AI中文摘要

我们提出了一种新的方法，用于加速K-均值聚类的Lloyd算法。与以往减少每次迭代计算成本或改进初始化的方法不同，我们的方法专注于减少收敛所需的迭代次数。这通过将Lloyd算法的分配步骤和更新步骤视为固定点迭代，并应用安德森加速，一种已建立的加速固定点求解器的技术来实现。经典安德森加速利用m个之前的迭代来找到加速的迭代，其在K-均值聚类中的性能对m的选择和样本分布敏感。我们提出了一种新的策略，动态调整m的值，以在不同问题实例上实现鲁棒且一致的加速。我们的方法补充了现有的加速技术，并可以与它们结合以实现最先进的性能。我们进行了广泛的实验来评估所提出方法的性能，在120个测试用例中，有106个用例优于其他算法，平均计算时间减少比率超过33%。

英文摘要

We propose a novel method to accelerate Lloyd's algorithm for K-Means clustering. Unlike previous acceleration approaches that reduce computational cost per iteration or improve initialization, our approach is focused on reducing the number of iterations required for convergence. This is achieved by treating the assignment step and the update step of Lloyd's algorithm as a fixed-point iteration, and applying Anderson acceleration, a well-established technique for accelerating fixed-point solvers. Classical Anderson acceleration utilizes m previous iterates to find an accelerated iterate, and its performance on K-Means clustering can be sensitive to the choice of m and the distribution of samples. We propose a new strategy to dynamically adjust the value of m, which achieves robust and consistent speedups across different problem instances. Our method complements existing acceleration techniques, and can be combined with them to achieve state-of-the-art performance. We perform extensive experiments to evaluate the performance of the proposed method, where it outperforms other algorithms in 106 out of 120 test cases, and the mean decrease ratio of computational time is more than 33%.
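
For readers unfamiliar with Anderson acceleration, the sketch below shows the classical scheme with a fixed window m on a generic fixed-point map x = g(x). It is an illustration under stated assumptions: the function name `anderson`, the toy affine map, and the fixed m are choices made here, and the paper's dynamic adjustment of m and its application to Lloyd's algorithm are not implemented.

```python
import numpy as np

def anderson(g, x0, m=3, iters=50, tol=1e-10):
    """Classical Anderson acceleration for x = g(x) with a fixed window m.

    Returns the final iterate and the number of iterations performed.
    """
    x = np.atleast_1d(np.asarray(x0, dtype=float))
    G, F = [], []  # histories of g(x_k) and residuals f_k = g(x_k) - x_k
    for k in range(iters):
        gx = np.atleast_1d(g(x))
        f = gx - x
        if np.linalg.norm(f) < tol:
            return x, k
        G.append(gx)
        F.append(f)
        G, F = G[-(m + 1):], F[-(m + 1):]  # keep at most m differences
        if len(F) == 1:
            x = gx  # plain fixed-point step until history exists
        else:
            dF = np.stack([F[i + 1] - F[i] for i in range(len(F) - 1)], axis=1)
            dG = np.stack([G[i + 1] - G[i] for i in range(len(G) - 1)], axis=1)
            # Least-squares fit of the current residual by residual differences.
            gamma, *_ = np.linalg.lstsq(dF, f, rcond=None)
            x = gx - dG @ gamma
    return x, iters

# Toy affine contraction (an assumption for illustration, not K-Means).
A = np.array([[0.5, 0.2], [0.1, 0.4]])
b = np.array([1.0, 2.0])
x, k = anderson(lambda v: A @ v + b, np.zeros(2), m=2)
print(x, k)
```

On this toy problem the accelerated iteration reaches the fixed point in a handful of steps, whereas the plain iteration contracts only at the rate of the largest eigenvalue of A (0.6 here). In the K-Means setting, g would be one assignment step followed by one centroid update.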


1710.01493 2026-06-04 cs.LG cs.CV cs.NA math.NA math.OC 版本更新

Image Labeling Based on Graphical Models Using Wasserstein Messages and Geometric Assignment

基于图形模型的图像标注：利用Wasserstein消息与几何分配

Ruben Hühnerbein, Fabrizio Savarino, Freddie Åström, Christoph Schnörr

发表机构 * Image and Pattern Analysis Group, Heidelberg University, Germany（海德堡大学图像与模式分析组）； Heidelberg Collaboratory for Image Processing, Heidelberg University, Germany（海德堡图像处理协同实验室）

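AI summary: This paper proposes a new approach to maximum a posteriori inference in discrete graphical models that approximates the objective with local Wasserstein distances and achieves convergence with a parallel scheme.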

1805.09613 2026-06-04 stat.ML cs.AI cs.LG cs.RO cs.SY eess.SY 版本更新

A0C: Alpha Zero in Continuous Action Space

A0C：在连续动作空间中的Alpha Zero

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker

发表机构 * Dep. of Computer Science, Delft University of Technology, The Netherlands（代尔夫特理工大学计算机科学系，荷兰）； Dep. of Computer Science, Leiden University, The Netherlands（莱顿大学计算机科学系，荷兰）

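AI summary: This paper presents a theoretical extension of Alpha Zero to continuous action spaces and shows its feasibility on a pendulum task, laying groundwork for applying iterated search and learning in continuous action spaces.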

1805.09464 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA math.OC stat.ML 版本更新

Simple and practical algorithms for $\ell_p$-norm low-rank approximation

简单且实用的ℓp-范数低秩近似算法

Anastasios Kyrillidis

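发表机构 * IBM T.J. Watson Research Center（IBM T.J. 沃森研究中心）； Rice University（莱斯大学）

AI summary: This paper proposes gradient-based non-convex algorithms for $\ell_p$-norm low-rank approximation with p = 1 or p = ∞. The algorithms are easy to implement and approximate faster and more accurately; the paper proves that they can attain a (1+ε)-OPT approximation and do not depend on hyperparameters.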

Comments 16 pages, 11 figures, to appear in UAI 2018

1805.08468 2026-06-04 math.NA cs.LG cs.NA 版本更新

Rank Minimization on Tensor Ring: A New Paradigm in Scalable Tensor Decomposition and Completion

张量环上的秩最小化：一种可扩展张量分解与补全的新范式

Longhao Yuan, Chao Li, Danilo Mandic, Jianting Cao, Qibin Zhao

发表机构 * Graduate School of Engineering, Saitama Institute of Technology, Japan（日本埼玉工业大学工学研究生院）； Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan（日本RIKEN高级智能项目（AIP）张量学习单元）； School of Automation, Guangdong University of Technology, China（中国广东工业大学自动化学院）； School of Computer Science and Technology, Hangzhou Dianzi University, China（中国杭州电子科技大学计算机科学与技术学院）； Department of Electrical and Electronic Engineering, Imperial College London, United Kingdom（英国伦敦帝国理工学院电气与电子工程系）

AI summary: This paper proposes a rank minimization method on tensor rings; it introduces convex surrogates to address the high computational cost and sensitivity to model complexity of traditional methods, and proposes two algorithms that apply differently structured Schatten norms to the tensor ring factors. Experiments show efficiency and high performance.

AI中文摘要

在低秩张量补全任务中，由于传统方法需要多次大规模奇异值分解（SVD）操作和秩选择问题，导致计算成本高且对模型复杂度敏感。本文利用最近提出的张量环（TR）分解的高压缩性，提出了一种新的张量补全模型。通过引入凸替代项对潜在张量环因子的低秩假设，使得基于Schatten范数正则化的模型可以在更小的规模上求解。我们提出了两种算法，分别对张量环因子应用不同的结构化Schatten范数。通过交替方向乘子法（ADMM）方案，张量环因子和预测张量可以同时优化。在合成数据和实际数据上的实验显示了所提方法的高性能和高效性。

英文摘要

In low-rank tensor completion tasks, due to the underlying multiple large-scale singular value decomposition (SVD) operations and rank selection problem of the traditional methods, they suffer from high computational cost and high sensitivity of model complexity. In this paper, taking advantages of high compressibility of the recently proposed tensor ring (TR) decomposition, we propose a new model for tensor completion problem. This is achieved through introducing convex surrogates of tensor low-rank assumption on latent tensor ring factors, which makes it possible for the Schatten norm regularization based models to be solved at much smaller scale. We propose two algorithms which apply different structured Schatten norms on tensor ring factors respectively. By the alternating direction method of multipliers (ADMM) scheme, the tensor ring factors and the predicted tensor can be optimized simultaneously. The experiments on synthetic data and real-world data show the high performance and efficiency of the proposed approach.

1805.08095 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML (updated version)

Small steps and giant leaps: Minimal Newton solvers for Deep Learning

João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi

Affiliations: Visual Geometry Group, University of Oxford

AI summary: This paper proposes a fast second-order method that can serve as a drop-in replacement for existing deep learning solvers. It needs only two additional forward-mode automatic differentiation operations per iteration, at a cost comparable to two standard forward passes, and is easy to implement. It addresses a long-standing problem of existing second-order solvers by avoiding the costly, noise-sensitive inversion of an approximate Hessian matrix.

Abstract

We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it only requires two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable to two standard forward passes and is easy to implement. Our method addresses long-standing issues with current second-order solvers, which invert an approximate Hessian matrix every iteration exactly or by conjugate-gradient methods, a procedure that is both costly and sensitive to noise. Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration. This estimate has the same size and is similar to the momentum variable that is commonly used in SGD. No estimate of the Hessian is maintained. We first validate our method, called CurveBall, on small problems with known closed-form solutions (noisy Rosenbrock function and degenerate 2-layer linear networks), where current deep learning solvers seem to struggle. We then train several large models on CIFAR and ImageNet, including ResNet and VGG-f networks, where we demonstrate faster convergence with no hyperparameter tuning. Code is available.

1711.09220 2026-06-04 cs.LG cs.SY eess.SY math.OC (updated version)

Fitting Jump Models

A. Bemporad, V. Breschi, D. Piga, S. Boyd

Affiliations: IMT School for Advanced Studies Lucca; Dalle Molle Institute for Artificial Intelligence Research - USI/SUPSI; Department of Electrical Engineering, Stanford University

AI summary: This paper proposes a new framework for fitting jump models to sequential data. It alternates between minimizing a loss function to fit the parameters of multiple models and determining which parameter set is active at each data point, and it applies to mainstream models such as hidden Markov models.

Comments Accepted for publication in Automatica

1804.01825 2026-06-04 cs.LG econ.GN q-fin.EC stat.ML (updated version)

Evaluating Hospital Case Cost Prediction Models Using Azure Machine Learning Studio

Alexei Botchkarev

Affiliations: Microsoft Azure Machine Learning Studio

AI summary: This paper presents a tool built in Azure Machine Learning Studio for rapid assessment of multiple regression models, and reports an advantage for robust regression, boosted decision tree regression, and decision forest regression in hospital case cost prediction.

Abstract

Ability for accurate hospital case cost modelling and prediction is critical for efficient health care financial management and budgetary planning. A variety of regression machine learning algorithms are known to be effective for health care cost predictions. The purpose of this experiment was to build an Azure Machine Learning Studio tool for rapid assessment of multiple types of regression models. The tool offers environment for comparing 14 types of regression models in a unified experiment: linear regression, Bayesian linear regression, decision forest regression, boosted decision tree regression, neural network regression, Poisson regression, Gaussian processes for regression, gradient boosted machine, nonlinear least squares regression, projection pursuit regression, random forest regression, robust regression, robust regression with mm-type estimators, support vector regression. The tool presents assessment results arranged by model accuracy in a single table using five performance metrics. Evaluation of regression machine learning models for performing hospital case cost prediction demonstrated advantage of robust regression model, boosted decision tree regression and decision forest regression. The operational tool has been published to the web and openly available for experiments and extensions.

1702.04837 2026-06-04 stat.ML cs.LG cs.NA math.NA (updated version)

Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

Shusen Wang, Alex Gittens, Michael W. Mahoney

Affiliations: International Computer Science Institute and Department of Statistics, University of California at Berkeley; Computer Science Department, Rensselaer Polytechnic Institute

AI summary: This paper studies, from optimization and statistical perspectives, the effects of the classical sketch and the Hessian sketch on matrix ridge regression. It finds that the classical sketch recovers nearly optimal solutions whereas the Hessian sketch carries no such guarantee, and shows theoretically and empirically that model averaging greatly reduces the gap between the risks of the true and sketched solutions.

Comments To appear in Journal of Machine Learning Research, 2018. A short version has appeared in International Conference on Machine Learning (ICML), 2017

Journal ref: Journal of Machine Learning Research, 19, pp1-50, 2018

Abstract

We address the statistical and optimization impacts of the classical sketch and Hessian sketch used to approximately solve the Matrix Ridge Regression (MRR) problem. Prior research has quantified the effects of classical sketch on the strictly simpler least squares regression (LSR) problem. We establish that classical sketch has a similar effect upon the optimization properties of MRR as it does on those of LSR: namely, it recovers nearly optimal solutions. By contrast, Hessian sketch does not have this guarantee, instead, the approximation error is governed by a subtle interplay between the "mass" in the responses and the optimal objective value. For both types of approximation, the regularization in the sketched MRR problem results in significantly different statistical properties from those of the sketched LSR problem. In particular, there is a bias-variance trade-off in sketched MRR that is not present in sketched LSR. We provide upper and lower bounds on the bias and variance of sketched MRR, these bounds show that classical sketch significantly increases the variance, while Hessian sketch significantly increases the bias. Empirically, sketched MRR solutions can have risks that are higher by an order-of-magnitude than those of the optimal MRR solutions. We establish theoretically and empirically that model averaging greatly decreases the gap between the risks of the true and sketched solutions to the MRR problem. Thus, in parallel or distributed settings, sketching combined with model averaging is a powerful technique that quickly obtains near-optimal solutions to the MRR problem while greatly mitigating the increased statistical risk incurred by sketching.
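
The difference between the two sketches and the effect of model averaging can be illustrated with a minimal sketch. It assumes a single scalar feature, the normalization w = x'y / (x'x + n*gamma), and uniform row sampling as the sketching matrix; the function names are hypothetical and this is not the authors' implementation.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ridge_exact(x, y, gamma):
    """Exact 1-D ridge solution: w = x'y / (x'x + n*gamma)."""
    n = len(x)
    return dot(x, y) / (dot(x, x) + n * gamma)


def classical_sketch(x, y, gamma, idx):
    """Classical sketch: both x and y are replaced by the sampled rows idx,
    rescaled by n/s so the sketched sums are unbiased."""
    n, s = len(x), len(idx)
    scale = n / s
    sxx = scale * sum(x[i] * x[i] for i in idx)
    sxy = scale * sum(x[i] * y[i] for i in idx)
    return sxy / (sxx + n * gamma)


def hessian_sketch(x, y, gamma, idx):
    """Hessian sketch: only the quadratic term x'x is sketched;
    the linear term x'y is kept exact."""
    n, s = len(x), len(idx)
    sxx = (n / s) * sum(x[i] * x[i] for i in idx)
    return dot(x, y) / (sxx + n * gamma)


def averaged_classical(x, y, gamma, s, g, seed=0):
    """Model averaging: mean of g independent classical-sketch solutions,
    each built from s rows sampled uniformly with replacement."""
    rng = random.Random(seed)
    n = len(x)
    sols = [
        classical_sketch(x, y, gamma, [rng.randrange(n) for _ in range(s)])
        for _ in range(g)
    ]
    return sum(sols) / g
```

In a parallel setting each of the g sketched problems could be solved on a separate worker, with only the scalar (or matrix) solutions averaged at the end.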

1804.07323 2026-06-04 cs.LG cs.SY eess.SY stat.ML (updated version)

Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems

Alec Koppel, Ekaterina Tolstaya, Ethan Stump, Alejandro Ribeiro

Affiliations: University of Pennsylvania; U.S. Army Research Laboratory

AI summary: This paper proposes a nonparametric stochastic compositional gradient descent method for Q-learning in continuous Markov decision problems. It reformulates Bellman's optimality equation as a nested non-convex stochastic optimization problem, parameterizes it over a reproducing kernel Hilbert space, and proves that the algorithm converges with probability 1 to a stationary point of the problem.

Abstract

We consider Markov Decision Problems defined over continuous state and action spaces, where an autonomous agent seeks to learn a map from its states to actions so as to maximize its long-term discounted accumulation of rewards. We address this problem by considering Bellman's optimality equation defined over action-value functions, which we reformulate into a nested non-convex stochastic optimization problem defined over a Reproducing Kernel Hilbert Space (RKHS). We develop a functional generalization of stochastic quasi-gradient method to solve it, which, owing to the structure of the RKHS, admits a parameterization in terms of scalar weights and past state-action pairs which grows proportionately with the algorithm iteration index. To ameliorate this complexity explosion, we apply Kernel Orthogonal Matching Pursuit to the sequence of kernel weights and dictionaries, which yields a controllable error in the descent direction of the underlying optimization method. We prove that the resulting algorithm, called KQ-Learning, converges with probability 1 to a stationary point of this problem, yielding a fixed point of the Bellman optimality operator under the hypothesis that it belongs to the RKHS. Under constant learning rates, we further obtain convergence to a small Bellman error that depends on the chosen learning rates. Numerical evaluation on the Continuous Mountain Car and Inverted Pendulum tasks yields convergent parsimonious learned action-value functions, policies that are competitive with the state of the art, and exhibit reliable, reproducible learning behavior.

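1804.07010 2026-06-04 stat.ML cs.LG cs.SY eess.SY math.AP math.OC (updated version)

Forward-Backward Stochastic Neural Networks: Deep Learning of High-dimensional Partial Differential Equations

Maziar Raissi

Affiliations: Division of Applied Mathematics, Brown University

AI summary: This paper proposes a method for solving high-dimensional partial differential equations that exploits the connection between deep neural networks and stochastic differential equations, avoiding the limits of numerical discretization and addressing the curse of dimensionality.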

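1804.06114 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML (updated version)

A Support Tensor Train Machine

Cong Chen, Kim Batselier, Ching-Yun Ko, Ngai Wong

Affiliations: The Department of Electrical and Electronic Engineering, The University of Hong Kong

AI summary: This paper proposes the support tensor train machine, which replaces the rank-one tensor in the traditional support tensor machine with a tensor train to increase model expressiveness; experiments indicate that it outperforms SVM and STM.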

Comments 7 pages

1804.04696 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY (updated version)

Efficient Model Identification for Tensegrity Locomotion

Shaojun Zhu, David Surovik, Kostas E. Bekris, Abdeslam Boularias

Affiliations: Department of Computer Science, Rutgers University

AI summary: This paper proposes an efficient method that uses a physics engine and a Bayesian optimization framework to identify unknown mechanical parameters of high-dimensional, compliant tensegrity robots, improving the accuracy of locomotion control.

1804.02884 2026-06-04 cs.AI cs.LG cs.MA cs.NE cs.SY eess.SY (updated version)

Policy Gradient With Value Function Approximation For Collective Multiagent Planning

Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau

Affiliations: School of Information Systems, Singapore Management University

AI summary: This paper proposes an improved actor-critic method for collective multiagent planning problems that speeds up convergence by decomposing the approximate action-value function, and validates it on a synthetic task and on taxi fleet optimization.

1708.09630 2026-06-04 cs.MA cs.LG cs.SY eess.SY (updated version)

Resilient Autonomous Control of Distributed Multi-agent Systems in Contested Environments

Rohollah Moghadam, Hamidreza Modares

AI summary: This paper proposes a resilient learning-based control protocol that synchronizes multi-agent systems in the presence of attacks and uncertain system dynamics, improving resilience through a distributed H_infinity controller and a trust-confidence mechanism.

Abstract

An autonomous and resilient controller is proposed for leader-follower multi-agent systems under uncertainties and cyber-physical attacks. The leader is assumed non-autonomous with a nonzero control input, which allows changing the team behavior or mission in response to environmental changes. A resilient learning-based control protocol is presented to find optimal solutions to the synchronization problem in the presence of attacks and system dynamic uncertainties. An observer-based distributed H_infinity controller is first designed to prevent propagating the effects of attacks on sensors and actuators throughout the network, as well as to attenuate the effect of these attacks on the compromised agent itself. Non-homogeneous game algebraic Riccati equations are derived to solve the H_infinity optimal synchronization problem and off-policy reinforcement learning is utilized to learn their solution without requiring any knowledge of the agent's dynamics. A trust-confidence based distributed control protocol is then proposed to mitigate attacks that hijack the entire node and attacks on communication links. A confidence value is defined for each agent based solely on its local evidence. The proposed resilient reinforcement learning algorithm employs the confidence value of each agent to indicate the trustworthiness of its own information and broadcast it to its neighbors to put weights on the data they receive from it during and after learning. If the confidence value of an agent is low, it employs a trust mechanism to identify compromised agents and remove the data it receives from them from the learning process. Simulation results are provided to show the effectiveness of the proposed approach.

1612.07139 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY (updated version)

A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation

Lei Tai, Jingwei Zhang, Ming Liu, Joschka Boedecker, Wolfram Burgard

Affiliations: University of Freiburg

AI summary: This paper surveys deep learning for learning control in robotics. It covers the two main paradigms, deep reinforcement learning and imitation learning, their use in navigation and manipulation tasks, and the reality-gap challenge.

Comments 19 pages, 1 figure

Abstract

Deep learning techniques have been widely applied, achieving state-of-the-art results in various fields of study. This survey focuses on deep learning solutions that target learning control policies for robotics applications. We carry out our discussions on the two main paradigms for learning control with deep networks: deep reinforcement learning and imitation learning. For deep reinforcement learning (DRL), we begin from traditional reinforcement learning algorithms, showing how they are extended to the deep context and effective mechanisms that could be added on top of the DRL algorithms. We then introduce representative works that utilize DRL to solve navigation and manipulation tasks in robotics. We continue our discussion on methods addressing the challenge of the reality gap for transferring DRL policies trained in simulation to real-world scenarios, and summarize robotics simulation platforms for conducting DRL research. For imitation learning, we go through its three main categories, behavior cloning, inverse reinforcement learning and generative adversarial imitation learning, by introducing their formulations and their corresponding robotics applications. Finally, we discuss the open challenges and research frontiers.

1610.02967 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML (updated version)

Distributed Convex Optimization with Many Convex Constraints

Joachim Giesen, Sören Laue

Affiliations: Friedrich-Schiller-University Jena

AI summary: This paper proposes an extended ADMM method for distributed convex optimization problems with many convex constraints; it inherits the convergence guarantees of ADMM and the augmented Lagrangian method.

1711.07038 2026-06-04 math.NA cs.LG cs.NA (updated version)

A Coordinate-wise Optimization Algorithm for Sparse Inverse Covariance Selection

Ganzhao Yuan, Haoxian Tan, Wei-Shi Zheng

AI summary: This paper proposes a coordinate-wise optimization algorithm for sparse inverse covariance selection that is guaranteed to converge to a coordinate-wise minimum point and outperforms existing methods on synthetic and real-world data.

Abstract

Sparse inverse covariance selection is a fundamental problem for analyzing dependencies in high dimensional data. However, such a problem is difficult to solve since it is NP-hard. Existing solutions are primarily based on convex approximation and iterative hard thresholding, which only lead to sub-optimal solutions. In this work, we propose a coordinate-wise optimization algorithm to solve this problem which is guaranteed to converge to a coordinate-wise minimum point. The algorithm iteratively and greedily selects one variable or swaps two variables to identify the support set, and then solves a reduced convex optimization problem over the support set to achieve the greatest descent. As a side contribution of this paper, we propose a Newton-like algorithm to solve the reduced convex sub-problem, which is proven to always converge to the optimal solution with global linear convergence rate and local quadratic convergence rate. Finally, we demonstrate the efficacy of our method on synthetic data and real-world data sets. As a result, the proposed method consistently outperforms existing solutions in terms of accuracy.

1710.10781 2026-06-04 math.NA cs.CV cs.LG cs.NA stat.ML (updated version)

Stochastic variance reduced multiplicative update for nonnegative matrix factorization

Hiroyuki Kasai

Affiliations: Graduate School of Informatics and Engineering, The University of Electro-Communications

AI summary: This paper proposes a stochastic variance reduced multiplicative update algorithm that improves the convergence speed of nonnegative matrix factorization, with numerical experiments on several datasets supporting the improvement.

Comments IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2018)

1804.00684 2026-06-04 cs.LG cs.NA math.NA stat.ML (updated version)

Graph-Based Deep Modeling and Real Time Forecasting of Sparse Spatio-Temporal Data

Bao Wang, Xiyang Luo, Fangbo Zhang, Baichuan Yuan, Andrea L. Bertozzi, P. Jeffrey Brantingham

Affiliations: Dept of Anthropology, UCLA; Dept of Math, UCLA

AI summary: This paper proposes a general framework for modeling, analyzing, and forecasting sparse spatio-temporal data. It combines self-exciting point processes with graph-structured recurrent neural networks to model macro and micro scales jointly and to forecast in real time.

Comments 9 pages, 19 figures

1803.11411 2026-06-04 eess.SY cs.LG cs.MA cs.SY (updated version)

Observer-based Adaptive Optimal Output Containment Control problem of Linear Heterogeneous Multi-agent Systems with Relative Output Measurements

Majid Mazouchi, Mohammad Bagher Naghibi-Sistani, Seyed Kamal Hosseini Sani, Farzaneh Tatari, Hamidreza Modares

Affiliations: Department of Electrical Engineering, Ferdowsi University of Mashhad, Mashhad, Iran; Department of Electrical Engineering, University of Semnan, Semnan, Iran; Missouri University of Science and Technology

AI summary: This paper proposes an optimal solution, based on relative output feedback, to the containment control problem for linear heterogeneous multi-agent systems. A distributed optimal control protocol keeps the followers' outputs within the convex hull of the leaders' outputs while optimizing transient performance, and distributed observers estimate the unavailable states and the convex hull.

Abstract

With an increasing use of data-driven models to control robotic systems, it has become important to develop a methodology for validating such models before they can be deployed to design a controller for the actual system. Specifically, it must be ensured that the controller designed for a learned model would perform as expected on the actual physical system. We propose a context-specific validation framework to quantify the quality of a learned model based on a distance measure between the closed-loop actual system and the learned model. We then propose an active sampling scheme to compute a probabilistic upper bound on this distance in a sample-efficient manner. The proposed framework validates the learned model against only those behaviors of the system that are relevant for the purpose for which we intend to use this model, and does not require any a priori knowledge of the system dynamics. Several simulations illustrate the practicality of the proposed framework for validating the models of real-world systems, and consequently, for controller synthesis.

1803.07661 2026-06-04 cs.LG cs.NA math.NA stat.ML (updated version)

Efficient Recurrent Neural Networks using Structured Matrices in FPGAs

Zhe Li, Shuo Wang, Caiwen Ding, Qinru Qiu, Yanzhi Wang, Yun Liang

Affiliations: Department of Electrical Engineering and Computer Science, Syracuse University, USA; Center for Energy-efficient Computing and Applications (CECA), Peking University, China

AI summary: This paper proposes implementing RNNs on FPGAs with block-circulant matrices for model compression and acceleration; experiments report a 35.7x energy-efficiency improvement over ESE.

Comments To appear in International Conference on Learning Representations 2018 Workshop Track

1711.02271 2026-06-04 math.NA cs.LG cs.NA (updated version)

High-order Tensor Completion for Data Recovery via Sparse Tensor-train Optimization

Longhao Yuan, Qibin Zhao, Jianting Cao

Affiliations: Graduate School of Engineering, Saitama Institute of Technology, Japan; Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan; School of Automation, Guangdong University of Technology, China; School of Computer Science and Technology, Hangzhou Dianzi University, China

AI summary: This paper proposes a sparse tensor-train optimization algorithm that treats incomplete data as a sparse tensor and uses first-order optimization to find the tensor-train decomposition factors. It improves completion of both low-order and high-order tensors, especially at high missing rates.

Comments 5 pages (include 1 page of reference) ICASSP 2018

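1707.03770 2026-06-04 eess.SY cs.LG cs.SY math.OC (updated version)

Fastest Convergence for Q-learning

Adithya M. Devraj, Sean P. Meyn

Affiliations: University of Florida; University of California, Berkeley

AI summary: This paper proposes the Zap Q-learning algorithm, whose matrix-gain design achieves optimal asymptotic variance. An ODE analysis shows that its transient behavior closely matches a deterministic Newton-Raphson method, and experiments show fast convergence even in non-ideal parameterized settings.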

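1703.02660 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY (updated version)

Towards Generalization and Simplicity in Continuous Control

Aravind Rajeswaran, Kendall Lowrey, Emanuel Todorov, Sham Kakade

Affiliations: University of Washington

AI summary: This paper shows that policies with simple linear and RBF parameterizations can solve a variety of continuous control tasks with performance competitive with more complex networks, and that diverse initialization improves generalization.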

Comments NIPS 2017, Project page: https://sites.google.com/view/simple-pol

1803.06989 2026-06-04 math.ST cs.LG cs.NA math.NA stat.ML stat.TH (updated version)

Numerical Integration on Graphs: where to sample and how to weigh

George C. Linderman, Stefan Steinerberger

Affiliations: Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA; Department of Mathematics, Yale University, New Haven, CT 06511, USA

AI summary: This paper studies numerical integration on graphs, recasts it as a geometric problem of optimally packing heat balls, proposes sampling strategies and weighting methods, and verifies their efficiency.

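1803.05026 2026-06-04 cs.LG cs.CV cs.IT cs.NA math.IT math.NA (updated version)

Principal Component Analysis with Tensor Train Subspace

Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron

Affiliations: Purdue University

AI summary: This paper proposes the TT-PCA algorithm, which estimates a structured tensor train subspace while preserving low-rank tensor structure. It is reported to be more robust than PCA and Tucker-PCA, and experiments support its effectiveness.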

1703.02419 2026-06-04 stat.CO cs.LG cs.SY eess.SY (updated version)

Probabilistic learning of nonlinear dynamical systems using sequential Monte Carlo

Thomas B. Schön, Andreas Svensson, Lawrence Murray, Fredrik Lindsten

Affiliations: Department of Information Technology, Uppsala University

AI summary: This paper presents a sequential Monte Carlo approach to learning probabilistic nonlinear state-space models, using the particle Metropolis-Hastings algorithm to explore the parameter space, and illustrates it on dynamical system modeling.

Comments Thomas B. Schön, Andreas Svensson, Lawrence Murray and Fredrik Lindsten, 2018. Probabilistic learning of nonlinear dynamical systems using sequential Monte Carlo. In Mechanical Systems and Signal Processing, Volume 104, pp. 866-883

DOI: 10.1016/j.ymssp.2017.10.033

Abstract

Probabilistic modeling provides the capability to represent and manipulate uncertainty in data, models, predictions and decisions. We are concerned with the problem of learning probabilistic models of dynamical systems from measured data. Specifically, we consider learning of probabilistic nonlinear state-space models. There is no closed-form solution available for this problem, implying that we are forced to use approximations. In this tutorial we will provide a self-contained introduction to one of the state-of-the-art methods---the particle Metropolis--Hastings algorithm---which has proven to offer a practical approximation. This is a Monte Carlo based method, where the particle filter is used to guide a Markov chain Monte Carlo method through the parameter space. One of the key merits of the particle Metropolis--Hastings algorithm is that it is guaranteed to converge to the "true solution" under mild assumptions, despite being based on a particle filter with only a finite number of particles. We will also provide a motivating numerical example illustrating the method using a modeling language tailored for sequential Monte Carlo methods. The intention of modeling languages of this kind is to open up the power of sophisticated Monte Carlo methods---including particle Metropolis--Hastings---to a large group of users without requiring them to know all the underlying mathematical details.
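
The idea of a particle filter guiding a Markov chain through the parameter space can be sketched as follows. This is a minimal illustration, not the tutorial's own example: it assumes a toy linear Gaussian model x_t = a*x_{t-1} + v_t, y_t = x_t + e_t, a bootstrap particle filter, a flat prior on a in (-1, 1), and a Gaussian random-walk proposal; all function and parameter names are hypothetical.

```python
import math
import random


def bootstrap_pf_loglik(a, ys, n_particles, rng, q=1.0, r=1.0):
    """Particle-filter estimate of log p(y_{1:T} | a) for the assumed model
    x_t = a * x_{t-1} + N(0, q),  y_t = x_t + N(0, r)."""
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    loglik = 0.0
    for y in ys:
        # Propagate every particle through the state dynamics.
        particles = [a * x + rng.gauss(0.0, math.sqrt(q)) for x in particles]
        # Log-weights from the Gaussian measurement density.
        logw = [-0.5 * math.log(2.0 * math.pi * r) - 0.5 * (y - x) ** 2 / r
                for x in particles]
        m = max(logw)
        w = [math.exp(lw - m) for lw in logw]
        # Log of the average weight, computed stably.
        loglik += m + math.log(sum(w) / n_particles)
        # Multinomial resampling proportional to the weights.
        particles = rng.choices(particles, weights=w, k=n_particles)
    return loglik


def particle_mh(ys, n_iters=200, n_particles=100, step=0.1, seed=0):
    """Random-walk Metropolis-Hastings over a, with the likelihood replaced
    by its particle-filter estimate."""
    rng = random.Random(seed)
    a = 0.0
    ll = bootstrap_pf_loglik(a, ys, n_particles, rng)
    chain = []
    for _ in range(n_iters):
        a_prop = a + rng.gauss(0.0, step)
        # Flat prior on (-1, 1): proposals outside the interval are rejected.
        if abs(a_prop) < 1.0:
            ll_prop = bootstrap_pf_loglik(a_prop, ys, n_particles, rng)
            # 1 - random() lies in (0, 1], so the logarithm is always defined.
            if math.log(1.0 - rng.random()) < ll_prop - ll:
                a, ll = a_prop, ll_prop
        chain.append(a)
    return chain
```

Note that the stored log-likelihood estimate of the current state is reused rather than recomputed at each iteration, which is what the pseudo-marginal argument behind particle Metropolis-Hastings relies on.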

1803.02553 2026-06-04 cs.LG cs.SY eess.SY stat.ML (updated version)

Graph Learning from Filtered Signals: Graph System and Diffusion Kernel Identification

Hilmi E. Egilmez, Eduardo Pavez, Antonio Ortega

Affiliations: Department of Electrical Engineering, University of Southern California

AI summary: This paper proposes a new graph signal processing framework for building graph-based models from classes of filtered signals. It casts graph modeling as a graph system identification problem that learns a weighted graph (a graph Laplacian matrix) and a graph-based filter (a function of graph Laplacian matrices). The algorithm jointly identifies the graph and the filter from multiple signal observations, applies to learning diffusion kernels, and is demonstrated on a real climate dataset.

Comments Submitted to IEEE Trans. on Signal and Information Processing over Networks (13 pages)

Abstract

This paper introduces a novel graph signal processing framework for building graph-based models from classes of filtered signals. In our framework, graph-based modeling is formulated as a graph system identification problem, where the goal is to learn a weighted graph (a graph Laplacian matrix) and a graph-based filter (a function of graph Laplacian matrices). In order to solve the proposed problem, an algorithm is developed to jointly identify a graph and a graph-based filter (GBF) from multiple signal/data observations. Our algorithm is valid under the assumption that GBFs are one-to-one functions. The proposed approach can be applied to learn diffusion (heat) kernels, which are popular in various fields for modeling diffusion processes. In addition, for specific choices of graph-based filters, the proposed problem reduces to a graph Laplacian estimation problem. Our experimental results demonstrate that the proposed algorithm outperforms the current state-of-the-art methods. We also implement our framework on a real climate dataset for modeling of temperature signals.

1709.04407 2026-06-04 cs.RO cs.LG cs.SY eess.SY (updated version)

An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots with Non-Minimum Phase Dynamics

Siqi Zhou, Mohamed K. Helwa, Angela P. Schoellig

Affiliations: Dynamic Systems Lab, Institute for Aerospace Studies, University of Toronto; Cairo University

AI summary: This paper proposes a learning-based approach to improve impromptu trajectory tracking for non-minimum phase systems by learning a stable, approximate inverse directly from input-output data, and demonstrates its stability and high-accuracy tracking.

Comments Accepted for publication in the IEEE Robotics and Automation Letters (RA-L), July 2018

DOI: 10.1109/LRA.2018.2801471

Abstract

This paper presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used pre-actuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input-output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.

1803.01626 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs

考虑平均回报准则下未知离散马尔可夫决策过程（MDP）中的强化学习的方差意识后悔界

Mohammad Sadegh Talebi, Odalric-Ambrym Maillard

发表机构 * KTH Royal Institute of Technology（皇家理工学院）； INRIA Lille – Nord Europe（里尔-北欧洲研究所）

AI总结本文基于平均回报准则，重新审视未知离散MDP中的强化学习问题，通过引入局部方差代替MDP直径，改进KL-UCRL算法的后悔界，提供更优的性能保证。

Comments To appear in Proceedings of the 29th International Conference on Algorithmic Learning Theory (ALT 2018)

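Details

AI abstract (translated from Chinese)

We consider reinforcement learning in an unknown, discrete Markov decision process (MDP) when the learner interacts with the system in a single stream of observations, starting from an initial state. We revisit the minimax lower bound for this problem by introducing the local variance of the bias function in place of the diameter of the MDP. We also provide a novel analysis of the KL-UCRL algorithm, establishing a high-probability regret bound scaling as $\widetilde {\mathcal O}\Bigl({\textstyle \sqrt{S\sum_{s,a}{\bf V}^\star_{s,a}T}}\Bigr)$ for ergodic MDPs. This improves on the previously known bound $\widetilde {\mathcal O}(DS\sqrt{AT})$, where $A$ and $D$ denote the maximum number of actions per state and the diameter of the MDP, respectively. Finally, we compare the leading terms of the two bounds on some benchmark MDPs, showing that the derived bound can yield an order-of-magnitude improvement in some cases. Our analysis uses novel variants of the transportation lemma combined with KL concentration inequalities, which we believe are of independent interest.

English abstract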

The problem of reinforcement learning in an unknown and discrete Markov Decision Process (MDP) under the average-reward criterion is considered, when the learner interacts with the system in a single stream of observations, starting from an initial state without any reset. We revisit the minimax lower bound for that problem by making the local variance of the bias function appear in place of the diameter of the MDP. Furthermore, we provide a novel analysis of the KL-UCRL algorithm establishing a high-probability regret bound scaling as $\widetilde {\mathcal O}\Bigl({\textstyle \sqrt{S\sum_{s,a}{\bf V}^\star_{s,a}T}}\Bigr)$ for this algorithm for ergodic MDPs, where $S$ denotes the number of states and where ${\bf V}^\star_{s,a}$ is the variance of the bias function with respect to the next-state distribution following action $a$ in state $s$. The resulting bound improves upon the best previously known regret bound $\widetilde {\mathcal O}(DS\sqrt{AT})$ for that algorithm, where $A$ and $D$ respectively denote the maximum number of actions (per state) and the diameter of the MDP. We finally compare the leading terms of the two bounds in some benchmark MDPs, indicating that the derived bound can provide an order of magnitude improvement in some cases. Our analysis leverages novel variations of the transportation lemma combined with Kullback-Leibler concentration inequalities, which we believe to be of independent interest.

1803.00491 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

The Power Mean Laplacian for Multilayer Graph Clustering

Pedro Mercado, Antoine Gautier, Francesco Tudisco, Matthias Hein

Affiliations: Department of Mathematics and Computer Science, Saarland University; Department of Mathematics and Statistics, University of Strathclyde

AI summary: This paper proposes a parameterized matrix power mean for merging the Laplacians of a multilayer graph, analyzes its expected performance under the stochastic block model, and verifies on real data that it recovers the ground-truth clusters across different settings.

Comments 19 pages, 3 figures. Accepted in Artificial Intelligence and Statistics (AISTATS), 2018

1802.10275 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Solving for high dimensional committor functions using artificial neural networks

Yuehaw Khoo, Jianfeng Lu, Lexing Ying

Affiliations: Department of Mathematics, Stanford University; Department of Mathematics, Department of Chemistry and Department of Physics, Duke University; Department of Mathematics and ICME, Stanford University

AI summary: This paper proposes a method based on artificial neural networks for studying state transitions governed by stochastic processes. Using a variational formulation and a neural-network parameterization, it obtains numerical solutions for the committor function of the high-dimensional Fokker-Planck equation and demonstrates that moderate accuracy is achievable in high-dimensional problems.

Comments 12 pages, 6 figures

1802.00930 2026-06-04 cs.NE cs.LG cs.NA math.NA version update

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesus Corbal, Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov

Affiliations: Parallel Computing Lab; Intel Labs, India; Product Architecture Group; Intel Labs, SC Intel, OR; Software Services Group; Intel, OR

AI summary: This paper proposes a mixed-precision training method based on integer operations, trains SOTA networks such as ResNet-50 and GoogLeNet-v1 on the ImageNet-1K dataset, and achieves higher training throughput than FP32 along with the highest reported accuracy at the same precision.

Comments Published as a conference paper at ICLR 2018

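Details

AI abstract (translated from Chinese)

The state of the art (SOTA) in mixed-precision training is dominated by low-precision floating-point operations, such as variants of FP16 accumulating into FP32. In low- and mixed-precision integer training, however, existing work either targets non-SOTA networks (for example, only AlexNet on ImageNet-1K) or relatively small datasets (such as CIFAR-10). This paper trains SOTA visual understanding networks on general-purpose hardware using integer operations. It focuses on the integer fused multiply-and-accumulate (FMA) operation, which takes INT16 operands and produces an INT32 output. We propose a shared-exponent representation of tensors and develop a dynamic fixed point (DFP) scheme suited to common neural network operations. We examine the details of developing an efficient integer convolution kernel, including methods for handling overflow of the INT32 accumulator. We implement CNN training for ResNet-50, GoogLeNet-v1, VGG-16, and AlexNet; these networks match or exceed SOTA FP32 accuracy within the same number of iterations, with no change in hyperparameters and a 1.8x improvement in end-to-end training throughput. To our knowledge, these are the first INT16 training results on general-purpose hardware for SOTA CNNs on the ImageNet-1K dataset, and they achieve the highest reported accuracy.

English abstract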

The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular, FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand, while a lot of research has also happened in the domain of low and mixed-precision Integer training, these works either present results for non-SOTA networks (for instance only AlexNet for ImageNet-1K), or relatively small datasets (like CIFAR-10). In this work, we train state-of-the-art visual understanding neural networks on the ImageNet-1K dataset, with Integer operations on General Purpose (GP) hardware. In particular, we focus on Integer Fused-Multiply-and-Accumulate (FMA) operations which take two pairs of INT16 operands and accumulate results into an INT32 output. We propose a shared exponent representation of tensors and develop a Dynamic Fixed Point (DFP) scheme suitable for common neural network operations. The nuances of developing an efficient integer convolution kernel are examined, including methods to handle overflow of the INT32 accumulator. We implement CNN training for ResNet-50, GoogLeNet-v1, VGG-16 and AlexNet; and these networks achieve or exceed SOTA accuracy within the same number of iterations as their FP32 counterparts without any change in hyper-parameters and with a 1.8X improvement in end-to-end training throughput. To the best of our knowledge these results represent the first INT16 training results on GP hardware for the ImageNet-1K dataset using SOTA CNNs and achieve the highest reported accuracy using half-precision
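
The shared-exponent idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' kernel: the helper names `to_dfp16` and `dfp_dot` are hypothetical, the rounding and exponent-selection rules are assumptions, and the accumulation uses a 64-bit integer to sidestep the INT32 overflow handling that the paper treats explicitly.

```python
import numpy as np

INT16_MAX = 32767


def to_dfp16(x):
    """Quantize a float array to int16 mantissas that share one power-of-two exponent."""
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return np.zeros(x.shape, dtype=np.int16), 0
    # Smallest exponent for which every scaled value fits in the int16 range.
    exp = int(np.ceil(np.log2(max_abs / INT16_MAX)))
    mantissa = np.round(x / 2.0 ** exp).astype(np.int16)
    return mantissa, exp


def dfp_dot(a, b):
    """Dot product computed on integer mantissas, rescaled by the shared exponents."""
    qa, ea = to_dfp16(a)
    qb, eb = to_dfp16(b)
    # Each int16 x int16 product fits in int32; the running sum may not,
    # so this sketch accumulates in int64 instead of handling overflow.
    products = qa.astype(np.int32) * qb.astype(np.int32)
    acc = int(np.sum(products, dtype=np.int64))
    return acc * 2.0 ** (ea + eb)


rng = np.random.default_rng(0)
a = rng.random(1024)  # assumed test data: uniform values in [0, 1)
b = rng.random(1024)
exact = float(a @ b)
approx = dfp_dot(a, b)
rel_err = abs(approx - exact) / exact
print(f"exact={exact:.6f} approx={approx:.6f} rel_err={rel_err:.2e}")
```

Using one exponent per tensor keeps the inner loop purely integer; the price is that small entries lose relative precision when one large entry sets the exponent.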

1702.03258 2026-06-04 cs.RO cs.LG cs.SY eess.SY version update

Adaptive and Resilient Soft Tensegrity Robots

John Rieffel, Jean-Baptiste Mouret

Affiliations: Union College; Inria Nancy Grand - Est; CNRS, Loria, UMR 7503

AI summary: This paper presents an easy-to-assemble tensegrity-based soft robot that produces highly dynamic locomotive gaits and shows structural and behavioral resilience under physical damage, with a machine learning algorithm enabling effective gait discovery.

Comments video: https://youtu.be/SuLQDhrk9tQ

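Details

AI abstract (translated from Chinese)

Living organisms combine soft materials (such as muscle) with hard ones (such as bone), giving them an intrinsic flexibility and resilience that conventional rigid robots often lack. The emerging field of soft robotics seeks to exploit these properties to build resilient machines. The nature of soft materials, however, poses major challenges for design, fabrication, and control, and to date most soft-robot gaits have been hand-designed through empirical trial and error. This paper describes an easy-to-assemble tensegrity-based soft robot that produces highly dynamic locomotive gaits and shows structural and behavioral resilience in the face of physical damage. This is made possible by a machine learning algorithm that discovers effective gaits with a minimal number of physical trials. The results lend further support to soft-robotic approaches that exploit the interaction of complex material dynamics to generate rich dynamical behaviors.

English abstract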

Living organisms intertwine soft (e.g., muscle) and hard (e.g., bones) materials, giving them an intrinsic flexibility and resiliency often lacking in conventional rigid robots. The emerging field of soft robotics seeks to harness these same properties in order to create resilient machines. The nature of soft materials, however, presents considerable challenges to aspects of design, construction, and control -- and up until now, the vast majority of gaits for soft robots have been hand-designed through empirical trial-and-error. This manuscript describes an easy-to-assemble tensegrity-based soft robot capable of highly dynamic locomotive gaits and demonstrating structural and behavioral resilience in the face of physical damage. Enabling this is the use of a machine learning algorithm able to discover effective gaits with a minimal number of physical trials. These results lend further credence to soft-robotic approaches that seek to harness the interaction of complex material dynamics in order to generate a wealth of dynamical behaviors.

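1705.08435 2026-06-04 cs.LG cs.CR cs.DC cs.SY eess.SY stat.ML version update

Personalized and Private Peer-to-Peer Machine Learning

Aurélien Bellet, Rachid Guerraoui, Mahsa Taziki, Marc Tommasi

Affiliations: INRIA; EPFL; Université de Lille

AI summary: This paper proposes an efficient algorithm for decentralized, asynchronous personalized machine learning that guarantees convergence under strong privacy requirements. It protects data with differential privacy and analyzes the trade-off between privacy and utility. Experiments show that it outperforms previous methods in the non-private case and can significantly improve model performance under privacy constraints.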

Comments 20 pages, to appear in the Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018)

1708.07827 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML version update

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney

Affiliations: Institute for Computational and Mathematical Engineering, Stanford University; School of Mathematics and Physics, University of Queensland; International Computer Science Institute, Berkeley, USA; International Computer Science Institute and Department of Statistics, University of California at Berkeley

AI summary: This empirical study evaluates Newton-type methods on non-convex machine learning problems. It finds that, relative to hand-tuned SGD with momentum, they give comparable or better generalization performance, are highly robust to hyperparameter settings, and escape flat regions and saddle points effectively.

Comments 21 pages, 11 figures. Restructure the paper and add experiments

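Details

AI abstract (translated from Chinese)

Although first-order optimization methods such as stochastic gradient descent (SGD) are widely used in machine learning (ML), they suffer from slow convergence, sensitivity to hyperparameter settings, stagnation at high training error, and difficulty escaping flat regions and saddle points. These problems are especially pronounced in highly non-convex settings such as neural networks. Motivated by this, recent work has turned to second-order methods, which aim to alleviate these shortcomings by capturing curvature information. This paper reports a detailed empirical evaluation of a class of Newton-type methods for non-convex ML problems, namely sub-sampled variants of the trust region (TR) and adaptive regularization with cubics (ARC) algorithms. We show that these methods are not only computationally competitive with hand-tuned SGD with momentum, with comparable or better generalization performance, but also highly robust to hyperparameter settings. Moreover, in contrast to SGD with momentum, the way these Newton-type methods use curvature information allows them to escape flat regions and saddle points seamlessly.

English abstract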

While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively-slow convergence, sensitivity to the settings of hyper-parameters such as learning rate, stagnation at high training errors, and difficulty in escaping flat regions and saddle points. These issues are particularly acute in highly non-convex settings such as those arising in neural networks. Motivated by this, there has been recent interest in second-order methods that aim to alleviate these shortcomings by capturing curvature information. In this paper, we report detailed empirical evaluations of a class of Newton-type methods, namely sub-sampled variants of trust region (TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex ML problems. In doing so, we demonstrate that these methods not only can be computationally competitive with hand-tuned SGD with momentum, obtaining comparable or better generalization performance, but also they are highly robust to hyper-parameter settings. Further, in contrast to SGD with momentum, we show that the manner in which these Newton-type methods employ curvature information allows them to seamlessly escape flat regions and saddle points.

1605.09232 2026-06-04 math.NA cs.LG cs.NA cs.NE math.OC stat.ML version update

Tradeoffs between Convergence Speed and Reconstruction Accuracy in Inverse Problems

Raja Giryes, Yonina C. Eldar, Alex M. Bronstein, Guillermo Sapiro

Affiliations: School of Electrical Engineering, Tel Aviv University; Electrical Engineering Department, Technion - IIT; Computer Science Department, Technion - IIT; Electrical and Computer Engineering Department, Duke University

AI summary: This study explores whether iterative algorithms for inverse problems can be modified to converge faster while maintaining reconstruction accuracy. Drawing on recovery techniques for low-dimensional sets, it analyzes how a coarse estimate of the set affects convergence speed.

Comments To appear in IEEE Transactions on Signal Processing

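Details

AI abstract (translated from Chinese)

Solving inverse problems with iterative algorithms is popular for large data. Because of time constraints, the number of iterations is usually limited, which may affect the achievable accuracy. Given an acceptable error, an important question is whether the original iterations can be modified to converge faster to a minimizer that attains the allowed error, without substantially increasing the computational cost of each iteration. Building on recent recovery techniques developed for signals in certain low-dimensional sets, we show that using a coarse estimate of the set may speed up convergence at the cost of additional reconstruction error. Our theory relates to recent advances in sparse recovery, compressed sensing, and deep learning. In particular, it may offer an explanation for the success of neural networks whose layers represent iterations in approximating the l1-minimization solution, as practiced in the learned iterative shrinkage-thresholding algorithm (LISTA).

English abstract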

Solving inverse problems with iterative algorithms is popular, especially for large data. Due to time constraints, the number of possible iterations is usually limited, potentially affecting the achievable accuracy. Given an error one is willing to tolerate, an important question is whether it is possible to modify the original iterations to obtain faster convergence to a minimizer achieving the allowed error without increasing the computational cost of each iteration considerably. Relying on recent recovery techniques developed for settings in which the desired signal belongs to some low-dimensional set, we show that using a coarse estimate of this set may lead to faster convergence at the cost of an additional reconstruction error related to the accuracy of the set approximation. Our theory ties to recent advances in sparse recovery, compressed sensing, and deep learning. Particularly, it may provide a possible explanation to the successful approximation of the l1-minimization solution by neural networks with layers representing iterations, as practiced in the learned iterative shrinkage-thresholding algorithm (LISTA).
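
For context on the LISTA remark, the following minimal sketch shows plain ISTA for the l1-regularized least-squares problem; LISTA, as described in the abstract, uses network layers to represent such iterations. The problem sizes, the regularization weight `lam`, the random test data, and the function names are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def objective(A, y, x, lam):
    """0.5 * ||A x - y||^2 + lam * ||x||_1."""
    r = A @ x - y
    return 0.5 * float(r @ r) + lam * float(np.sum(np.abs(x)))


def ista(A, y, lam, n_iter):
    """Run n_iter ISTA steps from x = 0; record the objective at the start and after each step."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2  # 1 / Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    history = [objective(A, y, x, lam)]
    for _ in range(n_iter):
        grad = A.T @ (A @ x - y)
        x = soft_threshold(x - step * grad, step * lam)
        history.append(objective(A, y, x, lam))
    return x, history


# Assumed toy problem: a 5-sparse signal observed through a random 40 x 100 matrix.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100))
x_true = np.zeros(100)
x_true[rng.choice(100, size=5, replace=False)] = rng.standard_normal(5)
y = A @ x_true

x_hat, history = ista(A, y, lam=0.1, n_iter=200)
print(f"objective: {history[0]:.4f} -> {history[-1]:.4f}")
```

Stopping after a fixed, small number of iterations is exactly the regime the abstract studies: the iterate is cheaper to compute but carries a larger reconstruction error.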

1802.03981 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

Spectral Filtering for General Linear Dynamical Systems

Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang

Affiliations: Department of Computer Science, Princeton University; Google Brain; Department of Mathematics, Princeton University

AI summary: This paper gives a polynomial-time algorithm for learning linear dynamical systems with hidden state, without system identification and without assumptions on the spectral radius of the transition matrix. The algorithm extends spectral filtering, using a new convex relaxation to identify phases efficiently.

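1605.00031 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML version update

Deep Convolutional Neural Networks on Cartoon Functions

Philipp Grohs, Thomas Wiatowski, Helmut Bölcskei

Affiliations: Dept. Math., ETH Zurich, Switzerland

AI summary: This paper studies the deformation stability of deep convolutional neural networks on cartoon functions and presents new results that account for structural properties, applicable to signals with sharp and curved discontinuities.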

Comments This is a slightly updated version of the paper published in the ISIT proceedings. Specifically, we corrected errors in the arguments on the volume of tubes. Note that this correction does not affect the main statements of the paper

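Details

DOI: 10.1109/ISIT.2016.7541482
Journal ref: Proc. of IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, pp. 1163-1167, July 2016

AI abstract (translated from Chinese)

Wiatowski and Bölcskei, 2015, proved that the deformation stability and vertical translation invariance of feature extractors based on deep convolutional neural networks are guaranteed by the network structure itself rather than by the specific convolution kernels and non-linearities. While the translation invariance result applies to square-integrable functions, the deformation stability bound holds only for band-limited functions. However, many signals of practical relevance (such as natural images) exhibit sharp and curved discontinuities and are therefore not band-limited. The main contribution of this paper is to establish deformation stability bounds for the class of cartoon functions introduced by Donoho, 2001.

English abstract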

Wiatowski and Bölcskei, 2015, proved that deformation stability and vertical translation invariance of deep convolutional neural network-based feature extractors are guaranteed by the network structure per se rather than the specific convolution kernels and non-linearities. While the translation invariance result applies to square-integrable functions, the deformation stability bound holds for band-limited functions only. Many signals of practical relevance (such as natural images) exhibit, however, sharp and curved discontinuities and are, hence, not band-limited. The main contribution of this paper is a deformation stability result that takes these structural properties into account. Specifically, we establish deformation stability bounds for the class of cartoon functions introduced by Donoho, 2001.

1705.07364 2026-06-04 cs.LG cs.CV cs.NA math.NA version update

Stabilizing Adversarial Nets With Prediction Methods

Abhay Yadav, Sohil Shah, Zheng Xu, David Jacobs, Tom Goldstein

Affiliations: University of Maryland, College Park

AI summary: This paper proposes a modified stochastic gradient descent method that stabilizes the training of adversarial networks so that it converges more reliably to saddle points, improving training stability and efficiency.

Comments Accepted at ICLR 2018

1709.07089 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

On the Design of LQR Kernels for Efficient Controller Learning

Alonso Marco, Philipp Hennig, Stefan Schaal, Sebastian Trimpe

Affiliations: Max Planck Institute for Intelligent Systems; Computational Learning and Motor Control Lab

AI summary: This paper proposes two kernels based on the structure of LQR to improve controller learning with Bayesian optimization, and shows in simulations of linear and nonlinear systems that they outperform conventional GP approaches.

Comments 8 pages, 5 figures, to appear in 56th IEEE Conference on Decision and Control (CDC 2017)

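Details

DOI: 10.1109/CDC.2017.8264429

AI abstract (translated from Chinese)

Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for tuning controllers directly from experimental trials. To select the next query point and find the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). This paper shows that GPs with a common kernel choice can lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically exploit the structure of the well-known Linear Quadratic Regulator (LQR) while retaining the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems show that the LQR kernels yield better learning performance than conventional GP approaches.

English abstract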

Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.
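
To make the structure that the kernels exploit more concrete, the sketch below evaluates the infinite-horizon quadratic cost of a scalar system as a function of the feedback gain, which is the kind of objective Bayesian optimization would model in this setting. It is only an illustration under assumptions: a discrete-time, noise-free, first-order system with hypothetical parameters `a`, `b`, `q`, `r`. The paper's actual system class and kernel constructions are not reproduced here.

```python
import numpy as np

# Hypothetical first-order system x[t+1] = a*x[t] + b*u[t] with stage cost q*x^2 + r*u^2.
a, b, q, r = 1.2, 1.0, 1.0, 1.0


def lqr_cost(k, x0=1.0):
    """Infinite-horizon cost of the controller u = -k*x (infinite if unstable)."""
    closed_loop = a - b * k
    if abs(closed_loop) >= 1.0:
        return np.inf
    # Geometric series: sum over t of (q + r*k^2) * closed_loop^(2t) * x0^2.
    return (q + r * k ** 2) * x0 ** 2 / (1.0 - closed_loop ** 2)


def lqr_gain(n_iter=500):
    """Optimal gain from fixed-point iteration on the scalar Riccati equation."""
    p = q
    for _ in range(n_iter):
        p = q + a ** 2 * p - (a * b * p) ** 2 / (r + b ** 2 * p)
    return a * b * p / (r + b ** 2 * p)


k_star = lqr_gain()
gains = np.linspace(0.25, 2.15, 1901)  # grid inside the stabilizing range of gains
costs = np.array([lqr_cost(k) for k in gains])
k_grid = gains[np.argmin(costs)]
print(f"Riccati gain: {k_star:.4f}, best gain on grid: {k_grid:.4f}")
```

The cost is a rational function of the gain that blows up at the stability boundary, which gives some intuition for why a generic smooth kernel may model it poorly.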

1707.09676 2026-06-04 cs.LG cs.SY eess.SY math.OC version update

Model-Free Renewable Scenario Generation Using Generative Adversarial Networks

Yize Chen, Yishen Wang, Daniel Kirschen, Baosen Zhang

Affiliations: University of Washington

AI summary: This paper proposes a data-driven scenario generation method based on generative adversarial networks for efficiently generating renewable energy production patterns with spatiotemporal correlations.

Comments Accepted to IEEE Transactions on Power Systems; code available at https://github.com/chennnnnyize/Renewables_Scenario_Gen_GAN

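Details

AI abstract (translated from Chinese)

Scenario generation is an important step in the operation and planning of power systems with high renewable penetration. This paper proposes a data-driven approach based on generative adversarial networks, which consist of two interconnected deep neural networks. Compared with methods based on probabilistic models, our approach captures renewable energy production patterns in both the temporal and spatial dimensions for a large number of correlated resources. Validation on wind and solar time-series data from the NREL integration data sets shows that the proposed method generates realistic wind and photovoltaic power profiles with diverse behaviors. We also show how to generate scenarios for different conditions of interest by using labeled data during training. For example, scenarios can be conditioned on weather events (such as a high-wind day) or the time of year (such as solar generation for a day in July). Because of the feedforward nature of the neural networks, scenarios can be generated efficiently without sophisticated sampling techniques.

English abstract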

Scenario generation is an important step in the operation and planning of power systems with high renewable penetrations. In this work, we propose a data-driven approach for scenario generation using generative adversarial networks, which are based on two interconnected deep neural networks. Compared with existing methods based on probabilistic models that are often hard to scale or sample from, our method is data-driven, and captures renewable energy production patterns in both temporal and spatial dimensions for a large number of correlated resources. For validation, we use wind and solar time-series data from NREL integration data sets. We demonstrate that the proposed method is able to generate realistic wind and photovoltaic power profiles with full diversity of behaviors. We also illustrate how to generate scenarios based on different conditions of interest by using labeled data during training. For example, scenarios can be conditioned on weather events (e.g., a high wind day) or time of the year (e.g., solar generation for a day in July). Because of the feedforward nature of the neural networks, scenarios can be generated extremely efficiently without sophisticated sampling techniques.

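1704.00227 2026-06-04 cs.LG cs.NA math.NA version update

Online and Stable Learning of Analysis Operators

Michael Sandbichler, Karin Schnass

Affiliations: Department of Mathematics, University of Innsbruck

AI summary: This paper presents four iterative algorithms for online learning of analysis operators, built on the optimization principle underlying Analysis K-SVD and Analysis SimCO. Using projected gradient descent, an implicit Euler scheme, and a singular-value strategy, they achieve better recovery rates and faster convergence on synthetic and image data.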

Comments 21 pages, 12 figures, 6 tables

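Details

AI abstract (translated from Chinese)

This paper presents four iterative algorithms for learning analysis operators. They are built on the same optimization principle that underlies Analysis K-SVD and Analysis SimCO. The FAOL and SAOL algorithms are based on projected gradient descent with optimally chosen step size; the IAOL algorithm is inspired by the implicit Euler scheme and requires no step size; the SVAOL algorithm uses a strategy similar to Analysis K-SVD while avoiding its high computational cost. All algorithms are proven to decrease or preserve the target function in each step, and a characterization of their stationary points is provided. Tests on synthetic and image data show better recovery rates and faster decay of the objective function than Analysis SimCO. In a final denoising experiment, the proposed algorithms perform similarly to or better than the state-of-the-art ASimCO algorithm.

English abstract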

In this paper, four iterative algorithms for learning analysis operators are presented. They are built upon the same optimisation principle underlying both Analysis K-SVD and Analysis SimCO. The Forward and Sequential Analysis Operator Learning (FAOL and SAOL) algorithms are based on projected gradient descent with optimally chosen step size. The Implicit AOL (IAOL) algorithm is inspired by the implicit Euler scheme for solving ordinary differential equations and does not require choosing a step size. The fourth algorithm, Singular Value AOL (SVAOL), uses a strategy similar to that of Analysis K-SVD while avoiding its high computational cost. All algorithms are proven to decrease or preserve the target function in each step, and a characterisation of their stationary points is provided. Further, they are tested on synthetic and image data and compared to Analysis SimCO, and are found to give better recovery rates and faster decay of the objective function, respectively. In a final denoising experiment, the presented algorithms are again shown to perform similarly to or better than the state-of-the-art algorithm ASimCO.

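1801.09657 2026-06-04 math.NA cs.LG cs.NA stat.ME version update

Matrix Completion for Structured Observations

Denali Molitor, Deanna Needell

Affiliations: University of California, Los Angeles

AI summary: This paper proposes a modified nuclear norm minimization that accounts for structural differences between observed and unobserved entries, improving matrix completion.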

1705.07252 2026-06-04 cs.LG cs.NA math.NA version update

SVM via Saddle Point Optimization: New Bounds and Distributed Algorithms

Yifei Jin, Lingxiao Huang, Jian Li

Affiliations: Tsinghua University; EPFL

AI summary: This paper proposes new algorithms based on saddle point optimization that solve hard-margin SVM and ν-SVM in nearly linear time and communicate efficiently in distributed settings.

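Details

AI abstract (translated from Chinese)

We study two important SVM variants: hard-margin SVM (for linearly separable cases) and $ν$-SVM (for linearly non-separable cases). We propose new algorithms from the perspective of saddle point optimization. Our algorithms achieve $(1-ε)$-approximations for both variants with running time $\tilde{O}(nd+n\sqrt{d / ε})$, where $n$ is the number of points and $d$ is the dimensionality. The current best algorithm for $ν$-SVM is based on quadratic programming and requires $Ω(n^2 d)$ time in the worst case~\cite{joachims1998making,platt199912}. This paper provides the first nearly linear time algorithm for $ν$-SVM. The current best algorithm for hard-margin SVM, achieved by the Gilbert algorithm~\cite{gartner2009coresets}, requires $O(nd / ε)$ time. Our algorithm improves the running time by a factor of $\sqrt{d}/\sqrt{ε}$. Moreover, our algorithms can be implemented naturally in distributed settings. We prove that they require $\tilde{O}(k(d +\sqrt{d/ε}))$ communication cost, where $k$ is the number of clients, which almost matches the theoretical lower bound. Numerical experiments support our theory and show that our algorithms converge faster than previous methods on high-dimensional, large, and dense data sets.

English abstract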

We study two important SVM variants: hard-margin SVM (for linearly separable cases) and $ν$-SVM (for linearly non-separable cases). We propose new algorithms from the perspective of saddle point optimization. Our algorithms achieve $(1-ε)$-approximations with running time $\tilde{O}(nd+n\sqrt{d / ε})$ for both variants, where $n$ is the number of points and $d$ is the dimensionality. To the best of our knowledge, the current best algorithm for $ν$-SVM is based on a quadratic programming approach which requires $Ω(n^2 d)$ time in the worst case~\cite{joachims1998making,platt199912}. In this paper, we provide the first nearly linear time algorithm for $ν$-SVM. The current best algorithm for hard-margin SVM, achieved by the Gilbert algorithm~\cite{gartner2009coresets}, requires $O(nd / ε)$ time. Our algorithm improves the running time by a factor of $\sqrt{d}/\sqrt{ε}$. Moreover, our algorithms can be implemented naturally in distributed settings. We prove that our algorithms require $\tilde{O}(k(d +\sqrt{d/ε}))$ communication cost, where $k$ is the number of clients, which almost matches the theoretical lower bound. Numerical experiments support our theory and show that our algorithms converge faster on high-dimensional, large, and dense data sets, as compared to previous methods.

1705.07262 2026-06-04 cs.LG cs.AI cs.NE cs.SY eess.SY version update

Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing

Affiliations: Technical University of Munich, Department of Informatics; Siemens AG, Corporate Technology

AI summary: This paper studies the Particle Swarm Optimization Policy (PSO-P) on the Industrial Benchmark and demonstrates its effectiveness in realistic application settings; compared with established methods, PSO-P stands out in performance and robustness.

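Details

DOI: 10.1109/IJCNN.2017.7966389
Journal ref: 2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, 2017, pp. 4214-4221

AI abstract (translated from Chinese)

The Particle Swarm Optimization Policy (PSO-P) was recently introduced and shown to produce remarkable results on academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate its properties and feasibility in real-world applications, this paper studies PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims to be realistic by including a variety of aspects found in industrial applications, such as continuous state and action spaces, a high-dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared with those of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). The experiments show that PSO-P is of interest not only for academic benchmarks but also for real-world industrial applications, since it also yielded the best-performing policy in our IB setting. Compared with other established RL techniques, PSO-P stands out in performance and robustness while requiring relatively little effort to find suitable parameters or make complex design decisions.

English abstract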

The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results on interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate the properties and feasibility on real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions.

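1801.06637 2026-06-04 stat.ML cs.LG cs.NA math.AP math.NA version update

Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations

Maziar Raissi

Affiliations: Division of Applied Mathematics, Brown University

AI summary: This paper proposes a deep learning approach for discovering nonlinear partial differential equations from scattered and potentially noisy observations. It approximates the unknown solution and the nonlinear dynamics with two deep neural networks and validates the approach on benchmark problems from several scientific domains.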

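Details

AI abstract (translated from Chinese)

A long-standing problem at the intersection of artificial intelligence and applied mathematics is to devise algorithms that turn observed data into predictive mathematical models of the physical world. In an era of abundant data and advanced machine learning capabilities, the natural question is: how can we automatically uncover the laws of physics from high-dimensional experimental data? This paper puts forth a deep learning approach for discovering nonlinear partial differential equations from scattered and potentially noisy observations in space and time. Specifically, we approximate the unknown solution and the nonlinear dynamics by two deep neural networks. The first network acts as a prior on the unknown solution and essentially lets us avoid numerical differentiation, which is inherently ill-conditioned and unstable. The second network represents the nonlinear dynamics and helps us distill the mechanisms that govern the evolution of a given spatiotemporal data set. We test the approach on several benchmark problems spanning a number of scientific domains and show how the proposed framework helps us accurately learn the underlying dynamics and forecast future states of the system. In particular, we study the Burgers', Korteweg-de Vries (KdV), Kuramoto-Sivashinsky, nonlinear Schrödinger, and Navier-Stokes equations.

English abstract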

A long-standing problem at the interface of artificial intelligence and applied mathematics is to devise an algorithm capable of achieving human level or even superhuman proficiency in transforming observed data into predictive mathematical models of the physical world. In the current era of abundance of data and advanced machine learning capabilities, the natural question arises: How can we automatically uncover the underlying laws of physics from high-dimensional data generated from experiments? In this work, we put forth a deep learning approach for discovering nonlinear partial differential equations from scattered and potentially noisy observations in space and time. Specifically, we approximate the unknown solution as well as the nonlinear dynamics by two deep neural networks. The first network acts as a prior on the unknown solution and essentially enables us to avoid numerical differentiations which are inherently ill-conditioned and unstable. The second network represents the nonlinear dynamics and helps us distill the mechanisms that govern the evolution of a given spatiotemporal data-set. We test the effectiveness of our approach for several benchmark problems spanning a number of scientific domains and demonstrate how the proposed framework can help us accurately learn the underlying dynamics and forecast future states of the system. In particular, we study the Burgers', Korteweg-de Vries (KdV), Kuramoto-Sivashinsky, nonlinear Schrödinger, and Navier-Stokes equations.

1801.05894 2026-06-04 math.HO cs.LG cs.NA math.NA stat.ML version update

Deep Learning: An Introduction for Applied Mathematicians

Catherine F. Higham, Desmond J. Higham

Affiliations: School of Computing Science, University of Glasgow, UK; Department of Mathematics and Statistics, University of Strathclyde, UK

AI summary: This article introduces the basic concepts of deep learning from an applied mathematics perspective for postgraduate and undergraduate mathematics students, using MATLAB code and an image classification example to show how neural networks work and how they are trained.

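Details

AI abstract (translated from Chinese)

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. The concepts at the heart of the deep learning revolution come from applied and computational mathematics, including calculus, approximation theory, optimization, and linear algebra. This article gives applied mathematicians an introduction to the basics of deep learning. The target audience is postgraduate and final-year undergraduate students in mathematics, and the article may also be useful to mathematics instructors who wish to bring deep learning applications into their classes. It focuses on three core questions: What is a deep neural network? How is a network trained? What is the stochastic gradient method? MATLAB code illustrates how a network is set up and trained, and state-of-the-art software is demonstrated on a large-scale image classification problem. The article closes with references to the current literature.

English abstract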

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus, approximation theory, optimization and linear algebra. This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective. Our target audience includes postgraduate and final year undergraduate students in mathematics who are keen to learn about the area. The article may also be useful for instructors in mathematics who wish to enliven their classes with references to the application of deep learning techniques. We focus on three fundamental questions: what is a deep neural network? how is a network trained? what is the stochastic gradient method? We illustrate the ideas with a short MATLAB code that sets up and trains a network. We also show the use of state-of-the-art software on a large scale image classification problem. We finish with references to the current literature.

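1801.04492 2026-06-04 math.OC cs.LG cs.NA math.NA version update

An Explicit Convergence Rate for Nesterov's Method from SDP

Sam Safavi, Bikash Joshi, Guilherme França, José Bento

AI summary: By solving an SDP, this paper derives a new explicit upper bound on the convergence rate of Nesterov's accelerated method, optimizes its parameters, and improves the convergence analysis for strongly convex functions.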

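1706.09993 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT math.PR math.ST stat.TH version update

Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees

Yan Shuo Tan, Roman Vershynin

Affiliations: Department of Mathematics, University of Michigan; Department of Mathematics, University of California, Irvine

AI summary: This paper gives theoretical guarantees for the randomized Kaczmarz method in phase retrieval. It proves that a number of Gaussian measurements proportional to the dimension suffices for convergence, introduces a sufficient condition on the measurement set, and uses VC dimension and metric entropy to show that Gaussian sampling vectors satisfy it.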

Comments Revised after comments from referees

1402.6964 2026-06-04 cs.LG cs.DC cs.NA math.NA stat.ML version update

Scalable methods for nonnegative matrix factorizations of near-separable tall-and-skinny matrices

Austin R. Benson, Jason D. Lee, Bartek Rajwa, David F. Gleich

Affiliations: Stanford University; Purdue University; Purdue University Institute for Computational and Bindley Biosciences Center; Computer Science Mathematical Engineering

AI summary: This paper proposes efficient algorithms for nonnegative matrix factorization of tall-and-skinny matrices that use orthogonal transformations to preserve separability and are suited to streaming data and distributed computing environments.

1710.09668 2026-06-04 math.NA cs.LG cs.NA cs.NE stat.ML version update

PDE-Net: Learning PDEs from Data

Zichao Long, Yiping Lu, Xianzhong Ma, Bin Dong

Affiliations: School of Mathematical Sciences; School of Mathematical Sciences, Peking University; Peking University, Beijing, China; Beijing Computational Science Research Center; Beijing International Center for Mathematical Research, Peking University; Center for Data Science, Peking University; Beijing Institute of Big Data Research

AI summary: This paper proposes PDE-Net, which learns convolution kernels to obtain differential operators while approximating the unknown nonlinear response, flexibly uncovering the dynamics of complex systems and the hidden PDE models.

Comments 15 pages, 13 figures

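Details

AI abstract (translated from Chinese)

This paper proposes a new feed-forward deep network, PDE-Net, that aims to accurately predict the dynamics of complex systems and uncover the hidden PDE models at the same time. PDE-Net obtains differential operators by learning convolution kernels and approximates the unknown nonlinear responses with neural networks or other machine learning methods. Compared with existing approaches, ours offers the most flexibility by learning both the differential operators and the nonlinear responses. A special feature of PDE-Net is that all filters are properly constrained, which lets us easily identify the governing PDE models while maintaining the expressive and predictive power of the network. These constraints are carefully designed by fully exploiting the relation between the orders of differential operators and the orders of sum rules of filters (an important concept originating in wavelet theory). We also discuss how PDE-Net relates to some existing networks in computer vision, such as Network-In-Network (NIN) and Residual Neural Network (ResNet). Numerical experiments show that PDE-Net has the potential to uncover the hidden PDE of the observed dynamics and to predict the dynamical behavior over a relatively long time in a noisy environment.

English abstract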

In this paper, we present an initial attempt to learn evolution PDEs from data. Inspired by the latest development of neural network designs in deep learning, we propose a new feed-forward deep network, called PDE-Net, to fulfill two objectives at the same time: to accurately predict dynamics of complex systems and to uncover the underlying hidden PDE models. The basic idea of the proposed PDE-Net is to learn differential operators by learning convolution kernels (filters), and apply neural networks or other machine learning methods to approximate the unknown nonlinear responses. Compared with existing approaches, which either assume the form of the nonlinear response is known or fix certain finite difference approximations of differential operators, our approach has the most flexibility by learning both differential operators and the nonlinear responses. A special feature of the proposed PDE-Net is that all filters are properly constrained, which enables us to easily identify the governing PDE models while still maintaining the expressive and predictive power of the network. These constraints are carefully designed by fully exploiting the relation between the orders of differential operators and the orders of sum rules of filters (an important concept originating from wavelet theory). We also discuss relations of the PDE-Net with some existing networks in computer vision such as Network-In-Network (NIN) and Residual Neural Network (ResNet). Numerical experiments show that the PDE-Net has the potential to uncover the hidden PDE of the observed dynamics, and predict the dynamical behavior for a relatively long time, even in a noisy environment.
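
The link between convolution filters and differential operators that PDE-Net relies on can be seen in a one-dimensional toy case. The sketch below is not PDE-Net itself: the filter is fixed rather than learned, and the grid, the test function, and the helper name `filter_moments` are assumptions chosen for illustration. It checks that the stencil [1, -2, 1] has vanishing moments of order 0 and 1, presumably the kind of moment condition the abstract's sum rules refer to, and that convolving with it and dividing by h^2 approximates the second derivative.

```python
import math

import numpy as np


def filter_moments(kernel, max_order):
    """Moments sum_j kernel[j] * j**m / m! with offsets j centered on the filter."""
    half = len(kernel) // 2
    offsets = np.arange(-half, half + 1)
    return [float(np.sum(kernel * offsets ** m)) / math.factorial(m)
            for m in range(max_order + 1)]


n = 200
x = np.linspace(0.0, 2.0 * np.pi, n)
h = x[1] - x[0]
u = np.sin(x)  # assumed test function with known second derivative -sin(x)

kernel = np.array([1.0, -2.0, 1.0])
# Moments (0, 0, 1): the filter annihilates constants and linear terms and
# responds to the quadratic term, so kernel / h**2 acts like d^2/dx^2.
moments = filter_moments(kernel, 2)

# "valid" keeps interior points only; the kernel is symmetric, so the flip
# performed by np.convolve has no effect.
u_xx = np.convolve(u, kernel, mode="valid") / h ** 2
max_err = float(np.max(np.abs(u_xx - (-np.sin(x[1:-1])))))
print(f"moments={moments}, max error={max_err:.2e}")
```

Constraining the moments of a learnable filter in this way fixes which derivative it represents while leaving the remaining degrees of freedom trainable.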

1712.09999 2026-06-04 math.NA cs.LG cs.NA 版本更新

Parallel Active Subspace Decomposition for Scalable and Efficient Tensor Robust Principal Component Analysis

并行主动子空间分解用于可扩展和高效的张量鲁棒主成分分析

Jonathan Q. Jiang, Michael K. Ng

发表机构 * Department of Mathematics, Hong Kong Baptist University（香港浸会大学数学系）

AI总结本文提出并行主动子空间分解方法，通过将张量展开的每个模式分解为正交矩阵和小矩阵，降低核范数最小化问题的规模，提升张量鲁棒主成分分析的效率和精度。

Comments 19 pages, 2 figures, 2 tables

详情

AI中文摘要

张量鲁棒主成分分析（TRPCA）在多个领域受到广泛关注。现有方法通常依赖张量核范数最小化，但每次迭代都需要多个奇异值分解（SVD），导致计算成本高昂。为克服这一缺点，我们提出了一种可扩展且高效的方法，称为并行主动子空间分解（PASD），该方法将张量展开的每个模式分解为列正交矩阵（主动子空间）和另一个小矩阵。这种变换导致了一个非凸优化问题，其中核范数最小化的规模通常比原始问题小得多。此外，我们引入交替方向乘子法（ADMM）来解决改写的问题，并提供其收敛性和次优性的严格分析。在合成和真实数据上的实验结果表明，我们的算法比最先进的方法更准确，并且快了多个数量级。

英文摘要

Tensor robust principal component analysis (TRPCA) has received a substantial amount of attention in various fields. Most existing methods, which normally rely on tensor nuclear norm minimization, incur a high computational cost due to multiple singular value decompositions (SVDs) at each iteration. To overcome this drawback, we propose a scalable and efficient method, named Parallel Active Subspace Decomposition (PASD), which divides the unfolding along each mode of the tensor into a columnwise orthonormal matrix (active subspace) and another small-size matrix in parallel. Such a transformation leads to a nonconvex optimization problem in which the scale of nuclear norm minimization is generally much smaller than that in the original problem. Furthermore, we introduce an alternating direction method of multipliers (ADMM) to solve the reformulated problem and provide rigorous analyses for its convergence and suboptimality. Experimental results on synthetic and real-world data show that our algorithm is more accurate than the state-of-the-art approaches, and is orders of magnitude faster.

1707.07342 2026-06-04 eess.SY cs.LG cs.SY 版本更新

An Online Learning Approach to Buying and Selling Demand Response

面向购买和销售需求响应的在线学习方法

Kia Khezeli, Eilyan Bitar

发表机构 * School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.（电气与计算机工程系，康奈尔大学，纽约州伊萨卡市，14853，美国）

AI总结本文提出一种在线学习方法，用于协调聚合商在固定居民客户群中购买需求削减，并在双结算批发市场中销售总体需求削减。研究通过动态定价和合同策略，在未知需求模型下最大化预期利润。

详情

AI中文摘要

我们从聚合商的视角出发：聚合商需要协调其向固定的居民电力客户群购买需求削减，与其在双结算批发能源市场中出售总需求削减这两项活动。聚合商向客户提供统一价格，按其用电量相对于预定基准的削减量付费，从而获得需求削减。在总需求削减实现之前，聚合商还必须确定向双结算能源市场出售多少电能。在日前市场中，聚合商承诺一份远期合同，该合同要求在实时市场交付电能。总需求曲线刻画了总需求削减与聚合商报价之间的关系，假设其为仿射函数，并受不可观测的随机冲击影响。假设聚合商最初既不知道需求曲线的参数，也不知道随机冲击的分布，我们研究聚合商能在多大程度上动态调整其报价和远期合同，以最大化$T$天时间窗口内的期望利润。具体而言，我们设计了一种动态定价与合同报价策略，以平衡聚合商学习未知需求模型的需要与其最大化累积期望利润的目标。特别地，所提出的定价策略被证明在$T$天内产生的遗憾不超过$O(\log(T)\sqrt{T})$。

英文摘要

We adopt the perspective of an aggregator, which seeks to coordinate its purchase of demand reductions from a fixed group of residential electricity customers, with its sale of the aggregate demand reduction in a two-settlement wholesale energy market. The aggregator procures reductions in demand by offering its customers a uniform price for reductions in consumption relative to their predetermined baselines. Prior to its realization of the aggregate demand reduction, the aggregator must also determine how much energy to sell into the two-settlement energy market. In the day-ahead market, the aggregator commits to a forward contract, which calls for the delivery of energy in the real-time market. The underlying aggregate demand curve, which relates the aggregate demand reduction to the aggregator's offered price, is assumed to be affine and subject to unobservable, random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the aggregator, we investigate the extent to which the aggregator might dynamically adapt its offered prices and forward contracts to maximize its expected profit over a time window of $T$ days. Specifically, we design a dynamic pricing and contract offering policy that resolves the aggregator's need to learn the unknown demand model with its desire to maximize its cumulative expected profit over time. In particular, the proposed pricing policy is proven to incur a regret over $T$ days that is no greater than $O(\log(T)\sqrt{T})$.

1710.10737 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Linearly convergent stochastic heavy ball method for minimizing generalization error

用于最小化泛化误差的线性收敛随机重球法

Nicolas Loizou, Peter Richtárik

发表机构 * University of Edinburgh, United Kingdom（爱丁堡大学，英国）； KAUST, Kingdom of Saudi Arabia（阿卜杜拉国王科技大学，沙特阿拉伯王国）

AI总结本文首次证明了随机重球法（stochastic heavy ball method）的线性收敛性：该方法在固定步长的SGD步骤上叠加重球动量项，并专注于最小化期望损失而非有限和最小化。

Comments NIPS 2017, Workshop on Optimization for Machine Learning (camera ready version)
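
The update described in the summary above, a fixed-stepsize stochastic gradient step plus a heavy ball momentum term, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the problem is assumed to be a small consistent linear system, the stochastic gradient is taken to be a Kaczmarz-type projection step on one uniformly sampled equation, and the function name, stepsize, momentum value, and test system are all hypothetical choices.

```python
import random

def stochastic_heavy_ball(rows, rhs, stepsize=1.0, momentum=0.03, iters=2000, seed=0):
    """Stochastic heavy ball on a consistent linear system A x = b.

    Each iteration samples one equation, takes a stepsize-scaled projection
    (Kaczmarz-type stochastic gradient) step, and adds the momentum term
    momentum * (x_k - x_{k-1}).
    """
    rng = random.Random(seed)
    dim = len(rows[0])
    x_prev = [0.0] * dim
    x = [0.0] * dim
    for _ in range(iters):
        i = rng.randrange(len(rows))
        a, b = rows[i], rhs[i]
        residual = sum(aj * xj for aj, xj in zip(a, x)) - b
        norm_sq = sum(aj * aj for aj in a)
        x_next = [
            xj - stepsize * residual / norm_sq * aj + momentum * (xj - xpj)
            for xj, xpj, aj in zip(x, x_prev, a)
        ]
        x_prev, x = x, x_next
    return x

# Illustrative system with exact solution (1, -1).
solution = stochastic_heavy_ball([[2.0, 1.0], [1.0, 3.0]], [1.0, -2.0])
```

The momentum value is deliberately small here; larger values often work in practice but are a tuning choice this sketch makes no claim about.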

1610.06781 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY 版本更新

Modular Deep Q Networks for Sim-to-real Transfer of Visuo-motor Policies

模块化深度Q网络用于视觉-运动策略的仿真到现实迁移

Fangyi Zhang, Jürgen Leitner, Michael Milford, Peter Corke

发表机构 * Australian Centre for Robotic Vision (ACRV)（澳大利亚机器人视觉中心）； Queensland University of Technology (QUT)（昆士兰理工大学）

AI总结本文提出模块化深度强化学习方法，通过在感知与控制之间引入瓶颈，实现仿真到现实的迁移，提升机器人视觉-运动协调能力。

Comments Australasian Conference on Robotics and Automation (ACRA) 2017, Student Paper Award Finalist

详情

Journal ref: The proceedings of the Australasian Conference on Robotics and Automation (ACRA) 2017

AI中文摘要

尽管深度学习得益于丰富的视觉数据而在计算机视觉中取得显著成功，但为机器人学习收集足够大的现实世界数据集成本较高。为提高这些技术在真实机器人上的实用性，我们提出了一种模块化深度强化学习方法，能够将仿真训练的模型迁移到现实世界机器人任务中。我们在感知与控制之间引入了瓶颈，使各网络能够独立训练，然后以端到端方式合并和微调，以进一步提高手眼协调性。在经典的平面视觉引导机器人到达（reaching）任务中，微调后的精度达到1.6像素，显著优于直接迁移（17.5像素），显示出在更复杂和更广泛的应用中的潜力。我们的方法为真实机器人系统提供了一种更高效地学习和迁移视觉-运动策略的技术，无需完全依赖大规模现实世界机器人数据集。

英文摘要

While deep learning has had significant successes in computer vision thanks to the abundance of visual data, collecting sufficiently large real-world datasets for robot learning can be costly. To increase the practicality of these techniques on real robots, we propose a modular deep reinforcement learning method capable of transferring models trained in simulation to a real-world robotic task. We introduce a bottleneck between perception and control, enabling the networks to be trained independently, but then merged and fine-tuned in an end-to-end manner to further improve hand-eye coordination. On a canonical, planar, visually guided robot reaching task, a fine-tuned accuracy of 1.6 pixels is achieved, a significant improvement over naive transfer (17.5 pixels), showing the potential for more complicated and broader applications. Our method provides a technique for more efficient learning and transfer of visuo-motor policies for real robotic systems without relying entirely on large real-world robot datasets.

1712.06577 2026-06-04 cs.LG cs.AI cs.NA math.NA 版本更新

Parallel Complexity of Forward and Backward Propagation

前向和反向传播的并行复杂度

Maxim Naumov

发表机构 * NVIDIA

AI总结研究前向和反向传播作为三角方程组解的并行计算复杂度，提出直接和迭代并行算法，并展示FNN和RNN的反向传播可并行处理。

Comments 18 pages

1609.01387 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Learning Model Predictive Control for iterative tasks. A Data-Driven Control Framework

为迭代任务学习模型预测控制：一种数据驱动的控制框架

Ugo Rosolia, Francesco Borrelli

AI总结本文提出一种无参考模型预测控制器，通过学习前次迭代提升性能，利用安全集和终端成本函数保证递归可行性，通过状态和输入轨迹递归构建终端集和终端成本函数，仿真验证了控制逻辑的有效性。

1712.04612 2026-06-04 q-fin.CP cs.AI cs.CE cs.LG cs.SY eess.SY 版本更新

Inverse Reinforcement Learning for Marketing

营销中的逆强化学习

Igor Halperin

发表机构 * NYU Tandon School of Engineering（纽约大学坦顿工程学院）

AI总结本文提出利用逆强化学习研究动态消费者需求，通过最大熵方法构建易于处理（tractable）的模型，并展示观测噪声可能被误认为消费者异质性。

Comments 18 pages, 5 figures

1708.01413 2026-06-04 cs.LG cs.DC cs.NA math.NA 版本更新

Distributed Solution of Large-Scale Linear Systems via Accelerated Projection-Based Consensus

通过加速的基于投影的共识算法分布式求解大规模线性方程组

Navid Azizan-Ruhi, Farshad Lahouti, Salman Avestimehr, Babak Hassibi

发表机构 * California Institute of Technology（加州理工学院）； University of Southern California（南加州大学）

AI总结本文提出加速分布式共识算法，用于求解大规模线性方程组：每次迭代中各机器更新其解，任务主节点对解进行带动量的平均；该算法优于分布式梯度下降、Nesterov加速梯度下降、重球法、块Cimmino方法和ADMM等方法。

详情

AI中文摘要

求解大规模线性方程组是机器学习、科学计算等领域中许多算法的核心步骤。当问题维度较大时，计算和/或内存限制使得以分布式方式执行任务变得可取，甚至必要。本文考虑一种常见情形：任务主节点希望通过将方程组的子集分配给多个计算机器/核心来求解大规模线性方程组。我们提出了一种加速的分布式共识算法：在每次迭代中，每台机器将误差信号在其方程组零空间上的投影按比例缩放后加到其当前解上，任务主节点则对各解进行带动量的平均。我们详细分析了所提算法的收敛行为，并从理论上证明其收敛速度优于其他分布式方法，即分布式梯度下降、分布式Nesterov加速梯度下降、分布式重球法、块Cimmino方法和ADMM。在随机选择的线性系统以及真实世界数据集上，所提方法相对于上述所有方法都有显著加速。最后，我们的分析给出了分布式重球法的一种新变体，它采用特定的分布式预处理，达到了与所提基于共识的方法相同的理论收敛速度。

英文摘要

Solving a large-scale system of linear equations is a key step at the heart of many algorithms in machine learning, scientific computing, and beyond. When the problem dimension is large, computational and/or memory constraints make it desirable, or even necessary, to perform the task in a distributed fashion. In this paper, we consider a common scenario in which a taskmaster intends to solve a large-scale system of linear equations by distributing subsets of the equations among a number of computing machines/cores. We propose an accelerated distributed consensus algorithm, in which at each iteration every machine updates its solution by adding a scaled version of the projection of an error signal onto the nullspace of its system of equations, and where the taskmaster conducts an averaging over the solutions with momentum. The convergence behavior of the proposed algorithm is analyzed in detail and analytically shown to compare favorably with the convergence rate of alternative distributed methods, namely distributed gradient descent, distributed versions of Nesterov's accelerated gradient descent and heavy-ball method, the block Cimmino method, and ADMM. On randomly chosen linear systems, as well as on real-world data sets, the proposed method offers significant speed-up relative to all the aforementioned methods. Finally, our analysis suggests a novel variation of the distributed heavy-ball method, which employs a particular distributed preconditioning, and which achieves the same theoretical convergence rate as the proposed consensus-based method.

1710.01719 2026-06-04 eess.SY cs.LG cs.SY math.DS math.OC 版本更新

Decomposition of Nonlinear Dynamical Systems Using Koopman Gramians

利用Koopman格拉姆矩阵分解非线性动力系统

Zhiyuan Liu, Soumya Kundu, Lijun Chen, Enoch Yeung

发表机构 * Pacific Northwest National Laboratory（太平洋西北国家实验室）

AI总结本文提出了一种新的Koopman算子方法，用于利用Koopman格拉姆矩阵分解非线性动力系统，介绍了输入-Koopman算子，并展示了如何将其用于将非线性系统转换为经典状态空间形式，以及输入和状态可观测函数分离的条件。

Comments 8 pages, submitted to IEEE 2018 ACC

详情

AI中文摘要

在本文中，我们提出了一种新的Koopman算子方法，用于利用Koopman格拉姆矩阵分解非线性动力系统。我们引入了输入-Koopman算子的概念，并展示了如何利用输入-Koopman算子将非线性系统转换为经典状态空间形式，并确定输入和状态可观测函数分离的条件。然后，我们扩展了现有的动态模式分解方法，用于从数据中学习Koopman算子，称为深度动态模式分解，以适用于具有控制或扰动的系统。我们通过学习一个输入-状态分离的Koopman算子来演示该方法的准确性，即使底层系统表现出混合的状态-输入项。我们接下来介绍了一种基于Koopman格拉姆矩阵的非线性分解算法，该算法最大化内部子系统的可观测性，并从其他子系统的噪声中减少扰动。我们推导了基于Koopman格拉姆矩阵和多维分区的放松方法，用于解决由此产生的NP难分解问题。最后，我们用IEEE 39节点系统的摆动动力学来演示所提出的算法。

英文摘要

In this paper we propose a new Koopman operator approach to the decomposition of nonlinear dynamical systems using Koopman Gramians. We introduce the notion of an input-Koopman operator, and show how input-Koopman operators can be used to cast a nonlinear system into the classical state-space form, and identify conditions under which input and state observable functions are well separated. We then extend an existing method of dynamic mode decomposition for learning Koopman operators from data known as deep dynamic mode decomposition to systems with controls or disturbances. We illustrate the accuracy of the method in learning an input-state separable Koopman operator for an example system, even when the underlying system exhibits mixed state-input terms. We next introduce a nonlinear decomposition algorithm, based on Koopman Gramians, that maximizes internal subsystem observability and disturbance rejection from unwanted noise from other subsystems. We derive a relaxation based on Koopman Gramians and multi-way partitioning for the resulting NP-hard decomposition problem. We lastly illustrate the proposed algorithm with the swing dynamics for an IEEE 39-bus system.

1306.4080 2026-06-04 cs.LG cs.NA math.NA 版本更新

Parallel Coordinate Descent Newton Method for Efficient $\ell_1$-Regularized Minimization

并行坐标下降牛顿法用于高效ℓ1正则化最小化

An Bian, Xiong Li, Yuncai Liu, Ming-Hsuan Yang

发表机构 * ETH Zurich（苏黎世联邦理工学院）； CNCERT（国家计算机网络应急中心）； Shanghai Jiao Tong University（上海交通大学）； University of California, Merced（加州大学默塞德分校）

AI总结本文提出并行坐标下降牛顿算法（PCDN），通过将Hessian矩阵的非对角线元素设为零以实现并行化，有效解决大规模优化问题中的并行性与收敛性问题。

详情

AI中文摘要

近年来，大规模优化问题中的并行算法取得了进展。尽管已有算法在并行化特征方面表现出色，但通常在高并行性下受限于发散问题或需要数据预处理来缓解这些问题。本文提出了一种并行坐标下降牛顿算法，使用多维近似牛顿步（PCDN），将Hessian矩阵的非对角线元素设为零以实现并行化。它随机将特征集分成$b$个束/子集，每个束的大小为$P$，并按顺序处理每个束，首先并行计算每个特征的下降方向，然后进行$P$维线搜索以获得步长。我们证明：(1) PCDN在增加并行性的情况下仍能保证全局收敛；(2) PCDN在有限的迭代次数$T_ε$内收敛到指定的精度$ε$，且$T_ε$随着并行性（束大小$P$）的增加而减少。通过维护中间量的实现技术，我们最小化了$P$维线搜索的数据传输和同步成本。为具体起见，所提出的PCDN算法应用于ℓ1正则化逻辑回归和ℓ2损失支持向量机。在六个基准数据集上的实验评估显示，所提出的PCDN算法有效利用了并行性，并在不损失精度的情况下优于最先进的方法。

英文摘要

Recent years have witnessed advances in parallel algorithms for large-scale optimization problems. Notwithstanding demonstrated success, existing algorithms that parallelize over features are usually limited by divergence issues under high parallelism or require data preprocessing to alleviate these problems. In this work, we propose a Parallel Coordinate Descent Newton algorithm using multidimensional approximate Newton steps (PCDN), where the off-diagonal elements of the Hessian are set to zero to enable parallelization. It randomly partitions the feature set into $b$ bundles/subsets of size $P$, and sequentially processes each bundle by first computing the descent directions for each feature in parallel and then conducting a $P$-dimensional line search to obtain the step size. We show that: (1) PCDN is guaranteed to converge globally despite increasing parallelism; (2) PCDN converges to the specified accuracy $ε$ within a bounded number of iterations $T_ε$, and $T_ε$ decreases with increasing parallelism (bundle size $P$). Using the implementation technique of maintaining intermediate quantities, we minimize the data transfer and synchronization cost of the $P$-dimensional line search. For concreteness, the proposed PCDN algorithm is applied to $\ell_1$-regularized logistic regression and $\ell_2$-loss SVM. Experimental evaluations on six benchmark datasets show that the proposed PCDN algorithm exploits parallelism well and outperforms the state-of-the-art methods in speed without losing accuracy.
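
The per-feature direction computation within one bundle can be sketched as follows. With the off-diagonal Hessian entries set to zero, each feature's direction minimizes a one-dimensional $\ell_1$-regularized quadratic model, which has a closed form. This is a hedged illustration only: the function names and inputs are hypothetical, the closed form is the standard one used in coordinate-descent Newton methods (assumed, not confirmed by the abstract, to match PCDN's), and the $P$-dimensional line search is omitted.

```python
def newton_direction_1d(grad, hess, weight, lam):
    """Minimize grad*d + 0.5*hess*d**2 + lam*|weight + d| over d (requires hess > 0)."""
    if grad + lam <= hess * weight:
        return -(grad + lam) / hess
    if grad - lam >= hess * weight:
        return -(grad - lam) / hess
    return -weight  # the step lands exactly on zero

def bundle_directions(grads, hessians, weights, lam):
    """Directions for one bundle; entries are independent, so this loop could run in parallel."""
    return [newton_direction_1d(g, h, w, lam) for g, h, w in zip(grads, hessians, weights)]

# Hypothetical gradient, diagonal-Hessian, and weight values for a bundle of size 3.
directions = bundle_directions([1.0, -5.0, 0.1], [2.0, 2.0, 1.0], [3.0, -3.0, 0.2], lam=0.5)
```

Because no direction depends on another feature's update, the only synchronization point in a bundle is the shared line search that follows.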

1712.00634 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY math.OC 版本更新

PFAx: Predictable Feature Analysis to Perform Control

PFAx：可预测特征分析用于控制

Stefan Richthofer, Laurenz Wiskott

AI总结 PFAx通过整合补充信息提升预测性能，并透明展示补充信息对特征选择的影响，应用于强化学习环境中的智能体控制优化。

详情

AI中文摘要

可预测特征分析（PFA）（Richthofer, Wiskott, ICMLA 2015）是一种对高维输入信号进行降维的算法，提取最可预测的子信号。本文扩展了PFA，考虑补充信息以提高预测。补充信息不参与特征提取，特征仅从主输入中提取。PFAx透明地展示补充信息如何提升预测质量，并可生成补充信息以实现主信号的特定目标。该方法应用于强化学习环境，使智能体局部优化状态，接近目标。后续论文将扩展此方法以实现全局优化。

英文摘要

Predictable Feature Analysis (PFA) (Richthofer, Wiskott, ICMLA 2015) is an algorithm that performs dimensionality reduction on a high-dimensional input signal. It extracts those subsignals that are most predictable according to a certain prediction model. We refer to these extracted signals as predictable features. In this work we extend the notion of PFA to take supplementary information into account for improving its predictions. Such information can be a multidimensional signal like the main input to PFA, but is regarded as external. That means it won't participate in the feature extraction: no features get extracted from or composed of it. Features will be exclusively extracted from the main input such that they are most predictable based on themselves and the supplementary information. We refer to this enhanced PFA as PFAx (PFA extended). Even more important than improving prediction quality is to observe the effect of supplementary information on feature selection. PFAx transparently provides insight into how the supplementary information adds to prediction quality and whether it is valuable at all. Finally, we show how to invert that relation and generate the supplementary information such that it would yield a certain desired outcome of the main signal. We apply this to a setting inspired by reinforcement learning and let the algorithm learn how to control an agent in an environment. With this method it is feasible to locally optimize the agent's state, i.e. reach a certain goal that is near enough. We are preparing a follow-up paper that extends this method such that global optimization is also feasible.

1711.02213 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks

Flexpoint：一种适应性数值格式，用于高效训练深度神经网络

Urs Köster, Tristan J. Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William H. Constable, Oğuz H. Elibol, Scott Gray, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby J. Pai, Naveen Rao

发表机构 * Artificial Intelligence Products Group, Intel Corporation（英特尔人工智能产品部）

AI总结 Flexpoint是一种适应性数值格式，旨在高效训练深度神经网络，通过动态调整指数来减少溢出并最大化动态范围，实验证明其在训练AlexNet、残差网络和生成对抗网络时性能接近32位浮点数。

Comments 14 pages, 5 figures, accepted in Neural Information Processing Systems 2017

详情

AI中文摘要

深度神经网络通常在32位浮点格式下开发和训练。通过在训练和推理中使用优化于深度学习的数值格式，可以实现性能和能效的显著提升。尽管近年来在有限精度推理方面取得了进展，但以低比特宽度训练神经网络仍是一个挑战。本文提出了Flexpoint数据格式，旨在完全取代32位浮点格式的训练和推理，支持现代深度网络拓扑而不需修改。Flexpoint张量具有共享的指数，该指数动态调整以最小化溢出并最大化可用动态范围。我们通过使用neon深度学习框架实现的模拟器验证了Flexpoint，证明在训练AlexNet、深度残差网络和生成对抗网络时，16位Flexpoint在不需调整模型超参数的情况下，性能接近32位浮点数。我们的结果表明，Flexpoint是一种有前途的数值格式，适用于未来用于训练和推理的硬件。

英文摘要

Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network and a generative adversarial network, using a simulator implemented with the neon deep learning framework. We demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters. Our results suggest Flexpoint as a promising numerical format for future hardware for training and inference.
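
The idea of a tensor-wide shared exponent can be sketched as follows. This is a simplified illustration under stated assumptions, not the Flexpoint algorithm: the exponent here is recomputed from the current maximum magnitude, whereas the abstract only says it is adjusted dynamically, and the helper names and 16-bit signed mantissa layout are hypothetical choices.

```python
import math

MANTISSA_BITS = 16
MAX_MANTISSA = 2 ** (MANTISSA_BITS - 1) - 1  # 32767 for 16-bit signed integers

def to_shared_exponent(values):
    """Encode floats as integer mantissas with one shared power-of-two exponent."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    # Smallest exponent for which the largest magnitude still fits the mantissa.
    exponent = math.ceil(math.log2(max_abs / MAX_MANTISSA))
    scale = 2.0 ** exponent
    # Clamp to guard against rounding at the edge of the representable range.
    mantissas = [max(-MAX_MANTISSA, min(MAX_MANTISSA, round(v / scale))) for v in values]
    return mantissas, exponent

def from_shared_exponent(mantissas, exponent):
    """Decode integer mantissas back to floats."""
    scale = 2.0 ** exponent
    return [m * scale for m in mantissas]

tensor = [0.5, -0.25, 0.1, 0.003]
mantissas, exponent = to_shared_exponent(tensor)
decoded = from_shared_exponent(mantissas, exponent)
```

Choosing the exponent from the largest entry avoids overflow but leaves small entries with few significant bits, which is the trade-off between overflow and dynamic range that the abstract refers to.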

1711.11165 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Safe Exploration for Identifying Linear Systems via Robust Optimization

通过鲁棒优化安全探索以识别线性系统

Tyler Lu, Martin Zinkevich, Craig Boutilier, Binz Roy, Dale Schuurmans

发表机构 * Google（谷歌）

AI总结本文研究如何在未知动态系统中安全识别参数，通过鲁棒优化方法逐步扩展安全动作区域，提升安全探索的样本效率。

详情

AI中文摘要

安全探索未知动态系统对在物理系统中部署强化学习（RL）至关重要，特别是在可能产生灾难性后果的情况下。在了解动态系统很少的情况下，需要多样化的转移数据来应用基于模型或无模型的RL。受谷歌数据中心冷却的启发，我们研究如何在给定准确性和置信度水平的情况下安全识别系统参数。特别是，在假设最初已知一个名义安全动作的情况下，学习未知线性系统并假设存在高斯噪声。将安全性定义为在轨迹整个跨度内满足特定线性约束（例如过程变量要求），并给定概率近似正确（PAC）风格的模型参数估计误差界限，我们展示如何通过逐步扩展围绕名义安全动作的球体来计算安全动作空间区域。可以应用任何从此类安全区域中选择动作的探索策略。在数据中心冷却动态的简化模型上的实验展示了如何通过计算适当的安全区域来提高安全探索的样本效率。

英文摘要

Safely exploring an unknown dynamical system is critical to the deployment of reinforcement learning (RL) in physical systems where failures may have catastrophic consequences. In scenarios where one knows little about the dynamics, diverse transition data covering relevant regions of state-action space is needed to apply either model-based or model-free RL. Motivated by the cooling of Google's data centers, we study how one can safely identify the parameters of a system model with a desired accuracy and confidence level. In particular, we focus on learning an unknown linear system with Gaussian noise assuming only that, initially, a nominal safe action is known. Defining safety as satisfying specific linear constraints on the state space (e.g., requirements on process variables) that must hold over the span of an entire trajectory, and given a Probably Approximately Correct (PAC) style bound on the estimation error of model parameters, we show how to compute safe regions of action space by gradually growing a ball around the nominal safe action. One can apply any exploration strategy where actions are chosen from such safe regions. Experiments on a stylized model of data center cooling dynamics show how computing proper safe regions can increase the sample efficiency of safe exploration.

1711.10566 2026-06-04 cs.AI cs.LG cs.NA math.AP math.NA stat.ML 版本更新

Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations

物理引导的深度学习（第二部分）：非线性偏微分方程的数据驱动发现

Maziar Raissi, Paris Perdikaris, George Em Karniadakis

发表机构 * Division of Applied Mathematics, Brown University（应用数学系，布朗大学）

AI总结本文提出物理引导的神经网络，用于在尊重物理定律的前提下解决监督学习任务。第二部分聚焦于数据驱动发现偏微分方程的问题，区分了连续时间和离散时间模型，并通过数学物理中的多个基准问题验证了方法的有效性。

1711.10561 2026-06-04 cs.AI cs.LG cs.NA math.DS math.NA stat.ML 版本更新

Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations

物理引导的深度学习（第一部分）：非线性偏微分方程的数据驱动求解

Maziar Raissi, Paris Perdikaris, George Em Karniadakis

发表机构 * Division of Applied Mathematics, Brown University（应用数学系，布朗大学）

AI总结本文提出物理引导的神经网络，用于在满足物理定律的前提下解决监督学习问题。第一部分介绍了如何利用这些网络推断偏微分方程的解，并构建可微的物理引导替代模型。

详情

AI中文摘要

我们引入了物理引导的神经网络——一种在解决监督学习任务时尊重由一般非线性偏微分方程描述的物理定律的神经网络。在本两部分论述中，我们围绕解决两类主要问题展开：数据驱动求解和数据驱动发现偏微分方程。根据可用数据的性质和安排，我们设计了两种不同的算法类别，即连续时间和离散时间模型。所得到的神经网络形成了一种新的数据高效通用函数逼近器类别，能够自然地将任何底层物理定律作为先验信息编码。在本第一部分中，我们展示了这些网络如何用于推断偏微分方程的解，并获得完全可微的物理引导替代模型，该模型对所有输入坐标和自由参数均可微分。

英文摘要

We introduce physics informed neural networks -- neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this two part treatise, we present our developments in the context of solving two main classes of problems: data-driven solution and data-driven discovery of partial differential equations. Depending on the nature and arrangement of the available data, we devise two distinct classes of algorithms, namely continuous time and discrete time models. The resulting neural networks form a new class of data-efficient universal function approximators that naturally encode any underlying physical laws as prior information. In this first part, we demonstrate how these networks can be used to infer solutions to partial differential equations, and obtain physics-informed surrogate models that are fully differentiable with respect to all input coordinates and free parameters.

1702.06463 2026-06-04 q-bio.NC cs.LG cs.NE cs.SY eess.SY 版本更新

Predicting non-linear dynamics by stable local learning in a recurrent spiking neural network

通过在循环脉冲神经网络中稳定的局部学习预测非线性动力学

Aditya Gilra, Wulfram Gerstner

AI总结本文提出了一种监督学习方案，用于循环脉冲神经网络中前馈和循环连接的权重学习，通过反馈误差和局部学习规则实现稳定学习，展示了线性、非线性及混沌动力学的学习能力。

详情

DOI: 10.7554/eLife.28295
Journal ref: eLife 2017;6:e28295

AI中文摘要

大脑需要预测身体对运动指令的反应。问题在于如何通过脉冲神经元网络中的局部、在线和稳定学习规则，使网络学习复现由运动指令引起的非线性身体动力学。本文提出了一种监督学习方案，用于循环脉冲神经网络中前馈和循环连接的权重学习。误差通过固定随机连接和负增益反馈，使网络跟随期望动力学，同时在线局部规则改变权重。反馈基于在线局部学习权重（FOLLOW）规则在局部意义上，即权重变化取决于突触前活动和误差信号投影到突触后神经元。我们提供了学习线性、非线性及混沌动力学以及双臂动力学的例子。通过李雅普诺夫方法，在合理假设和近似下，证明FOLLOW学习是均匀稳定的，误差渐近趋于零。

英文摘要

Brains need to predict how the body reacts to motor commands. It is an open question how networks of spiking neurons can learn to reproduce the non-linear body dynamics caused by motor commands, using local, online and stable learning rules. Here, we present a supervised learning scheme for the feedforward and recurrent connections in a network of heterogeneous spiking neurons. The error in the output is fed back through fixed random connections with a negative gain, causing the network to follow the desired dynamics, while an online and local rule changes the weights. The rule for Feedback-based Online Local Learning Of Weights (FOLLOW) is local in the sense that weight changes depend on the presynaptic activity and the error signal projected onto the postsynaptic neuron. We provide examples of learning linear, non-linear and chaotic dynamics, as well as the dynamics of a two-link arm. Using the Lyapunov method, and under reasonable assumptions and approximations, we show that FOLLOW learning is stable uniformly, with the error going to zero asymptotically.

1711.08833 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Deep Learning for Real-Time Crime Forecasting and its Ternarization

深度学习在实时犯罪预测中的应用及其三元化

Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin

发表机构 * Department of Mathematics, UCLA（UCLA数学系）； Department of Anthropology, UCLA（UCLA人类学系）； Department of Mathematics, UCI（UCI数学系）

AI总结本文提出了一种基于时空残差网络的犯罪预测模型，并通过三元化技术解决实际部署中的资源消耗问题，提升了预测精度。

Comments 14 pages, 7 figures

1711.08135 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Contracting Nonlinear Observers: Convex Optimization and Learning from Data

收缩非线性观测器：凸优化与数据学习

Ian R. Manchester

AI总结本文提出通过构造凸集设计非线性观测器的新方法，利用凸优化最小化状态估计误差界，验证了在模拟噪声数据集上的有效性。

Comments conference submission

1705.07112 2026-06-04 math.NA cs.LG cs.NA 版本更新

Fast Singular Value Shrinkage with Chebyshev Polynomial Approximation Based on Signal Sparsity

基于信号稀疏性的快速奇异值收缩与切比雪夫多项式逼近

Masaki Onuki, Shunsuke Ono, Keiichiro Shirai, Yuichi Tanaka

AI总结本文提出利用切比雪夫多项式逼近方法快速处理奇异值收缩，通过信号稀疏性减少计算成本，提升矩阵秩最小化在图像处理中的效率和精度。

Comments This is a journal paper

详情

DOI: 10.1109/TSP.2017.2745444

AI中文摘要

我们提出一种利用切比雪夫多项式逼近（CPA）进行奇异值阈值处理的近似方法。许多信号处理问题需要迭代应用奇异值分解（SVD）以最小化给定数据矩阵的秩，这称为矩阵秩最小化。在矩阵秩最小化中，通过硬阈值、软阈值或加权软阈值收缩矩阵的奇异值。然而，SVD的计算成本通常过高，难以处理高维信号如图像；因此，在这种情况下，矩阵秩最小化需要巨大的计算时间。本文利用CPA来（近似）操作奇异值，而无需计算奇异值和向量。奇异值的阈值处理通过特定矩阵的乘法表达，该乘法源于CPA的特性。该乘法也利用信号的稀疏性高效计算。结果表明，计算成本显著降低。实验结果通过基于矩阵秩最小化与核范数松弛的图像处理应用，展示了方法在计算时间和近似精度方面的有效性。

英文摘要

We propose an approximation method for thresholding of singular values using Chebyshev polynomial approximation (CPA). Many signal processing problems require iterative application of singular value decomposition (SVD) for minimizing the rank of a given data matrix with other cost functions and/or constraints, which is called matrix rank minimization. In matrix rank minimization, singular values of a matrix are shrunk by hard-thresholding, soft-thresholding, or weighted soft-thresholding. However, the computational cost of SVD is generally too expensive to handle high dimensional signals such as images; hence, in this case, matrix rank minimization requires enormous computation time. In this paper, we leverage CPA to (approximately) manipulate singular values without computing singular values and vectors. The thresholding of singular values is expressed by a multiplication of certain matrices, which is derived from a characteristic of CPA. The multiplication is also efficiently computed using the sparsity of signals. As a result, the computational cost is significantly reduced. Experimental results suggest the effectiveness of our method through several image processing applications based on matrix rank minimization with nuclear norm relaxation in terms of computation time and approximation precision.
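
The three shrinkage rules named in the abstract act on singular values as follows. This minimal sketch assumes the singular values are already available as a list, which is exactly the computation the paper aims to avoid; it does not attempt the Chebyshev polynomial approximation itself, and the function names are hypothetical.

```python
def hard_threshold(singular_values, tau):
    """Keep singular values above tau and zero out the rest."""
    return [s if s > tau else 0.0 for s in singular_values]

def soft_threshold(singular_values, tau):
    """Shrink every singular value toward zero by tau."""
    return [max(s - tau, 0.0) for s in singular_values]

def weighted_soft_threshold(singular_values, taus):
    """Soft-thresholding with a separate threshold per singular value."""
    return [max(s - t, 0.0) for s, t in zip(singular_values, taus)]

sigma = [5.0, 2.0, 0.5]
hard = hard_threshold(sigma, 1.0)
soft = soft_threshold(sigma, 1.0)
weighted = weighted_soft_threshold(sigma, [0.5, 1.0, 2.0])
```

Each rule is a scalar function applied to the spectrum, which is what makes a polynomial approximation of that function applicable to the matrix without an explicit SVD.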

1701.03974 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

An Online Convex Optimization Approach to Dynamic Network Resource Allocation

一种面向动态网络资源分配的在线凸优化方法

Tianyi Chen, Qing Ling, Georgios B. Giannakis

AI总结本文提出MOSP方案，解决对抗性损失和约束下的动态优化问题，实现子线性动态遗憾和适应性，应用于网络资源分配并优于现有方法。

详情

DOI: 10.1109/TSP.2017.2750109

AI中文摘要

现有在线凸优化方法进行顺序单时段决策，导致可能的对抗性损失，其性能通过遗憾衡量。本文研究对抗性损失和约束的在线凸优化问题，约束在决策后揭示，可容忍瞬时违反但需长期满足。算法性能通过动态遗憾和动态适应性评估。本文提出改进的在线鞍点方案（MOSP），证明其在累积变化子线性增长时同时获得子线性动态遗憾和适应性。MOSP应用于动态网络资源分配任务，并与已知的随机对偶梯度方法比较。在各种场景中，数值实验展示了MOSP相对于现有方法的性能优势。

英文摘要

Existing approaches to online convex optimization (OCO) make sequential one-slot-ahead decisions, which lead to (possibly adversarial) losses that drive subsequent decision iterates. Their performance is evaluated by the so-called regret, which measures the difference in losses between the online solution and the best fixed overall solution in hindsight. The present paper deals with online convex optimization involving adversarial loss functions and adversarial constraints, where the constraints are revealed after making decisions, and can tolerate instantaneous violations but must be satisfied in the long term. Performance of an online algorithm in this setting is assessed by: i) the difference of its losses relative to the best dynamic solution with one-slot-ahead information of the loss function and the constraint (that is here termed dynamic regret); and, ii) the accumulated amount of constraint violations (that is here termed dynamic fit). In this context, a modified online saddle-point (MOSP) scheme is developed, and proved to simultaneously yield sub-linear dynamic regret and fit, provided that the accumulated variations of per-slot minimizers and constraints are sub-linearly growing with time. MOSP is also applied to the dynamic network resource allocation task, and it is compared with the well-known stochastic dual gradient method. Under various scenarios, numerical experiments demonstrate the performance gain of MOSP relative to the state-of-the-art.

1612.08461 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Randomized Block Frank-Wolfe for Convergent Large-Scale Learning

随机块Frank-Wolfe用于收敛大规模学习

Liang Zhang, Gang Wang, Daniel Romero, Georgios B. Giannakis

AI总结本文提出随机块Frank-Wolfe方法，通过灵活选择每次迭代更新的块数，确保收敛性和可行性，并扩展了收敛分析以涵盖非凸目标。

详情

DOI: 10.1109/TSP.2017.2755597

AI中文摘要

由于其低复杂度迭代，Frank-Wolfe（FW）求解器适用于各种大规模学习任务。当存在块可分离约束时，随机块FW（RB-FW）通过每次迭代仅更新部分坐标块进一步降低复杂度。为克服现有方法的限制，本文开发了RB-FW的步长，使每次迭代可灵活选择更新的块数，同时保证收敛性和可行性。通过建立关于对偶间隙和原问题次优性的计算界，推导了RB-FW的收敛速率。新界扩展了现有收敛分析，后者仅适用于不保证可行性迭代的步长序列。此外，还提出了两类保证迭代可行性的步长序列，以增强衰减率选择的灵活性。新收敛结果扩展到非凸目标，并证明精确线搜索的RB-FW以$\mathcal{O}(1/\sqrt{t})$速率达到临界点。在电动汽车充电和结构支持向量机应用中，展示了不同步长和块数的RB-FW性能。广泛模拟测试显示，RB-FW相比现有随机单块FW方法有性能提升。

英文摘要

Owing to their low-complexity iterations, Frank-Wolfe (FW) solvers are well suited for various large-scale learning tasks. When block-separable constraints are present, randomized block FW (RB-FW) has been shown to further reduce complexity by updating only a fraction of coordinate blocks per iteration. To circumvent the limitations of existing methods, the present work develops step sizes for RB-FW that enable a flexible selection of the number of blocks to update per iteration while ensuring convergence and feasibility of the iterates. To this end, convergence rates of RB-FW are established through computational bounds on a primal sub-optimality measure and on the duality gap. The novel bounds extend the existing convergence analysis, which only applies to a step-size sequence that does not generally lead to feasible iterates. Furthermore, two classes of step-size sequences that guarantee feasibility of the iterates are also proposed to enhance flexibility in choosing decay rates. The novel convergence results are markedly broadened to encompass also nonconvex objectives, and further assert that RB-FW with exact line-search reaches a stationary point at rate $\mathcal{O}(1/\sqrt{t})$. Performance of RB-FW with different step sizes and number of blocks is demonstrated in two applications, namely charging of electrical vehicles and structural support vector machines. Extensive simulated tests demonstrate the performance improvement of RB-FW relative to existing randomized single-block FW methods.
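
For orientation, the basic Frank-Wolfe iteration that RB-FW builds on can be sketched as follows. This is the classical single-block method with the standard $2/(t+2)$ step size on a toy problem, not the randomized block variant or the step sizes proposed in the paper; the objective, the simplex constraint, and the function name are illustrative assumptions.

```python
def frank_wolfe_simplex(target, iters=2000):
    """Minimize 0.5 * ||x - target||^2 over the probability simplex with Frank-Wolfe."""
    dim = len(target)
    x = [1.0 / dim] * dim  # feasible starting point
    for t in range(iters):
        grad = [xi - yi for xi, yi in zip(x, target)]
        # Linear minimization oracle: the simplex vertex with the smallest gradient entry.
        vertex = min(range(dim), key=lambda i: grad[i])
        gamma = 2.0 / (t + 2.0)
        # Convex combination of feasible points, so the iterate stays feasible.
        x = [(1.0 - gamma) * xi for xi in x]
        x[vertex] += gamma
    return x

target = [0.2, 0.3, 0.5]  # already in the simplex, so it is the minimizer
x_fw = frank_wolfe_simplex(target)
```

Each iteration needs only a linear minimization over the constraint set rather than a projection; with block-separable constraints that oracle splits per block, which is what allows a method like RB-FW to update only some blocks per iteration.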

1711.04683 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Tensor Decompositions for Modeling Inverse Dynamics

张量分解用于逆动力学建模

Stephan Baier, Volker Tresp

AI总结本文提出利用张量分解方法建模逆动力学，通过处理位置、速度和加速度的三重交互，实现对高非线性函数的近似，并在SARCOS机械臂数据集上验证了其优越性。

Abstract

Modeling inverse dynamics is crucial for accurate feedforward robot control. The model computes the joint torques necessary to perform a desired movement. The highly non-linear inverse function of the dynamical system can be approximated using regression techniques. As a regression method, we propose a tensor decomposition model that exploits the inherent three-way interaction of positions x velocities x accelerations. Most work in tensor factorization has addressed the decomposition of dense tensors. In this paper, we build upon the decomposition of sparse tensors, which have only a small number of nonzero entries. The decomposition of sparse tensors has successfully been used in relational learning, e.g., the modeling of large knowledge graphs. Recently, the approach has been extended to multi-class classification with discrete input variables. Representing the data in high-dimensional sparse tensors enables the approximation of complex, highly non-linear functions. In this paper, we show how the decomposition of sparse tensors can be applied to regression problems. Furthermore, we extend the method to continuous inputs by learning a mapping from the continuous inputs to the latent representations of the tensor decomposition, using basis functions. We evaluate our proposed model on a dataset with trajectories from a seven-degrees-of-freedom SARCOS robot arm. Our experimental results show superior performance of the proposed functional tensor model compared to challenging state-of-the-art methods.

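1711.04518 2026-06-04 eess.SY cs.AI cs.HC cs.LG cs.NE cs.SY Version update

A Supervised Learning Concept for Reducing User Interaction in Passenger Cars

Marius Stärk, Damian Backes, Christian Kehl

AI summary: This paper proposes a supervised-learning-based automation system that reduces interaction complexity in human-machine interfaces, applied to set-point selection for a multi-modal automotive thermal conditioning system.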

Comments 4 pages, 9 figures, concept only

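1705.08551 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY Version update

Safe Model-based Reinforcement Learning with Stability Guarantees

Felix Berkenkamp, Matteo Turchetta, Angela P. Schoellig, Andreas Krause

AI summary: This paper presents a safety-aware reinforcement learning algorithm that builds on Lyapunov stability verification and statistical models of the dynamics to obtain high-performance control policies with provable stability, and shows that it can safely optimize a neural network policy on a simulated inverted pendulum.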

Comments Proc. of Neural Information Processing Systems (NIPS), 2017

Abstract

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety, defined in terms of stability guarantees. Specifically, we extend control-theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

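1711.03906 2026-06-04 cs.LG cs.DC cs.NI cs.RO cs.SY eess.SY Version update

D-SLATS: Distributed Simultaneous Localization and Time Synchronization

Amr Alanwar, Henrique Ferraz, Kevin Hsieh, Rohit Thazhath, Paul Martin, Joao Hespanha, Mani Srivastava

AI summary: This paper proposes D-SLATS, a framework that jointly solves time synchronization and localization using distributed extended Kalman filters and optimization techniques, achieving up to 3-microsecond synchronization accuracy and 30 cm localization error.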

DOI: 10.1145/3084041.3084049

Abstract

Over the last decade, we have witnessed a surge of Internet of Things (IoT) devices, and with that a greater need to choreograph their actions across both time and space. Although these two problems, namely time synchronization and localization, have many aspects in common, they are traditionally treated separately or combined in centralized approaches that result in an inefficient use of resources, or in solutions that are not scalable in terms of the number of IoT devices. Therefore, we propose D-SLATS, a framework comprising three different and independent algorithms that jointly solve time synchronization and localization problems in a distributed fashion. The first two algorithms are based mainly on the distributed Extended Kalman Filter (EKF), whereas the third one uses optimization techniques. No fusion center is required, and the devices only communicate with their neighbors. The proposed methods are evaluated on a custom Ultra-Wideband communication testbed and a quadrotor, representing a network of both static and mobile nodes. Our algorithms achieve up to three microseconds time synchronization accuracy and 30 cm localization error.

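1701.08585 2026-06-04 cs.LG cs.SI cs.SY eess.SY math.OC Version update

Variational Policy for Guiding Point Processes

Yichen Wang, Grady Williams, Evangelos Theodorou, Le Song

AI summary: This paper proposes a convex optimization framework, based on an optimal measure and variational inference, for designing optimal control policies for point processes so that the system state can be steered more efficiently and accurately.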

Comments ICML 2017

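1711.03398 2026-06-04 eess.SY cs.LG cs.SY Version update

Data Fusion and Machine Learning Integration for Transformer Loss of Life Estimation

Mohsen Mahoor, Amin Khodaei

AI summary: This paper combines data fusion and machine learning with the IEEE standard to estimate transformer loss of life using ANFIS and RBF networks, and improves accuracy with OWA and a sequential Kalman filter.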

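1711.02877 2026-06-04 eess.SY cs.AI cs.LG cs.SY math.OC Version update

Un résultat intrigant en commande sans modèle (An intriguing result in model-free control)

Cédric Join, Emmanuel Delaleau, Michel Fliess, Claude H. Moog

AI summary: Using the Routh-Hurwitz criterion, this paper shows that in model-free control an intelligent proportional controller (iP) may be harder to tune than an intelligent proportional-derivative controller (iPD), and illustrates the advantages of the iPD through simulations.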

Comments in French, https://www.openscience.fr/Un-resultat-intrigant-en-commande-sans-modele

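1711.02857 2026-06-04 cs.LG cs.AI cs.CV cs.NA math.NA stat.ML Version update

Learning Sparse Visual Representations with Leaky Capped Norm Regularizers

Jianqiao Wangni, Dahua Lin

AI summary: This paper proposes the leaky capped norm regularizer for learning over-complete visual representations, proves convergence on the 3D shape recovery problem, and reports better results than ℓ1 and other non-convex regularizers.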

Abstract

Sparsity-inducing regularization is an important part of learning over-complete visual representations. Despite the popularity of $\ell_1$ regularization, in this paper we investigate the use of non-convex regularizations for this problem. Our contribution consists of three parts. First, we propose the leaky capped norm regularization (LCNR), which allows model weights below a certain threshold to be regularized more strongly than those above it, and therefore imposes strong sparsity while introducing only controllable estimation bias. We propose a majorization-minimization algorithm to optimize the joint objective function. Second, in our study of monocular 3D shape recovery and neural networks, LCNR outperforms $\ell_1$ and other non-convex regularizations, achieving state-of-the-art performance and faster convergence. Third, we prove a theoretical global convergence speed on the 3D recovery problem. To the best of our knowledge, this is the first convergence analysis of the 3D recovery problem.

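1711.00946 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML Version update

Learning Linear Dynamical Systems via Spectral Filtering

Elad Hazan, Karan Singh, Cyril Zhang

AI summary: This paper presents an efficient algorithm for online prediction in linear dynamical systems via overparameterization, using spectral filtering to obtain near-optimal regret guarantees.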

Comments Published as a conference paper at NIPS 2017

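1702.06861 2026-06-04 stat.ML cs.LG cs.NA math.NA Version update

On the Power of Truncated SVD for General High-rank Matrix Estimation Problems

Simon S. Du, Yining Wang, Aarti Singh

AI summary: This paper shows that, given an estimate close in spectral norm to a high-rank positive semi-definite matrix, truncated SVD yields a multiplicative approximation in Frobenius norm, with applications to high-rank matrix completion, denoising, and high-dimensional covariance estimation.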

Comments Accepted by NIPS 2017. Add gap-dependent bounds

Abstract

We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max\{\epsilon^{-4},k^2\}\mu_0^2\|A\|_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$, where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability. (2) High-rank matrix de-noising: Let $\widehat{A}=A+E$, where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}\nu)$. (3) Low-rank estimation of high-dimensional covariance: Given $N$ i.i.d. samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max\{\epsilon^{-4},k^2\}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.
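
The de-noising setting (2) in this abstract can be tried numerically. The sketch below builds a high-rank PSD matrix, adds Gaussian noise with per-entry variance $\nu^2/n$, and compares the Frobenius error of the truncated SVD of the noisy matrix against the best rank-$k$ error. The matrix size, the spectrum `1/i`, and the values of `k` and `nu` are arbitrary assumptions chosen for illustration; the code checks nothing about the constants in the stated bounds.

```python
import numpy as np


def truncated_svd(M, k):
    """Rank-k truncation of M: keep the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k, :]


rng = np.random.default_rng(0)
n, k, nu = 60, 5, 0.5

# High-rank PSD matrix A = Q diag(eigs) Q^T with a slowly decaying spectrum.
Qmat, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = 1.0 / np.arange(1, n + 1)
A = (Qmat * eigs) @ Qmat.T

# Noisy estimate: Gaussian noise with variance nu^2 / n on each entry.
E = rng.standard_normal((n, n)) * nu / np.sqrt(n)
A_hat = A + E

A_k = truncated_svd(A, k)          # best rank-k approximation of A
A_hat_k = truncated_svd(A_hat, k)  # truncated SVD of the noisy estimate

err_best = np.linalg.norm(A - A_k, "fro")
err_est = np.linalg.norm(A - A_hat_k, "fro")
print("best rank-k error:", err_best)
print("truncated-SVD-of-estimate error:", err_est)
print("ratio:", err_est / err_best)
```

By the Eckart-Young theorem the ratio printed at the end is at least 1; the abstract's claim concerns how close to 1 it stays as the noise level and spectrum vary.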

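1710.10532 2026-06-04 eess.SY cs.AI cs.LG cs.SY Version update

Interpretable Apprenticeship Learning with Temporal Logic Specifications

Daniel Kasenberg, Matthias Scheutz

AI summary: This paper infers LTL specifications from behavior trajectories in MDPs via multi-objective optimization, defines state-based and action-based objective functions using a notion of violation cost, and validates the approach on simple domains with genetic algorithms.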

Comments Accepted to the 56th IEEE Conference on Decision and Control (CDC 2017)

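1710.09854 2026-06-04 cs.LG cs.NA math.NA stat.ML Version update

Gradient Sparsification for Communication-Efficient Distributed Optimization

Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang

AI summary: This paper uses a convex optimization formulation to reduce gradient communication cost, designs efficient algorithms for gradient sparsification, and validates them on logistic regression, support vector machines, and convolutional neural networks.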

1710.09657 2026-06-04 cs.LG cs.SY eess.SY stat.ML Version update

Segment Parameter Labelling in MCMC Mean-Shift Change Detection

Alireza Ahrabian, Shirin Enshaeifar, Clive Cheong-Took, Payam Barnaghi

AI summary: This paper proposes a Bayesian mean-shift change detection algorithm that exploits the recurrence of segment parameters to improve performance.

1710.08883 2026-06-04 cs.DC cs.LG cs.NA math.NA math.OC Version update

Avoiding Communication in Proximal Methods for Convex Optimization Problems

Saeed Soori, Aditya Devarakonda, James Demmel, Mert Gurbuzbalaban, Maryam Mehri Dehnavi

AI summary: This paper reformulates FISTA to communicate once every k iterations, reducing data transfer and improving performance on distributed architectures; experiments show average speedups of 3-10x across several benchmarks.

Abstract

The fast iterative soft thresholding algorithm (FISTA) is used to solve convex regularized optimization problems in machine learning. Distributed implementations of the algorithm have become popular since they enable the analysis of large datasets. However, existing formulations of FISTA communicate data at every iteration, which reduces its performance on modern distributed architectures. The communication costs of FISTA, including bandwidth and latency costs, are closely tied to the mathematical formulation of the algorithm. This work reformulates FISTA to communicate data once every k iterations, reducing data communication when operating on large data sets. We formulate the algorithm for two different optimization methods on the Lasso problem and show that the latency cost is reduced by a factor of k while bandwidth and floating-point operation costs remain the same. The convergence rates and stability properties of the reformulated algorithms are similar to the standard formulations. The performance of communication-avoiding FISTA and Proximal Newton methods is evaluated on 1 to 1024 nodes for multiple benchmarks and demonstrates average speedups of 3-10x with scaling properties that outperform the classical algorithms.
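
For orientation, the following is a minimal single-machine sketch of standard FISTA on the Lasso problem, i.e. the baseline that the abstract says communicates at every iteration. It is not the communication-avoiding reformulation from the paper. The synthetic data, the regularization weight, and all function names are assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)


def lasso_objective(X, y, w, lam):
    return 0.5 * np.sum((X @ w - y) ** 2) + lam * np.sum(np.abs(w))


def fista_lasso(X, y, lam, n_iters=300):
    """Standard FISTA for min_w 0.5 * ||Xw - y||^2 + lam * ||w||_1."""
    L = np.linalg.norm(X, 2) ** 2  # Lipschitz constant of the smooth gradient
    w = np.zeros(X.shape[1])
    z = w.copy()
    t = 1.0
    for _ in range(n_iters):
        # With rows of X spread over nodes, this gradient is the step that
        # would need a reduction across nodes in every iteration.
        grad = X.T @ (X @ z - y)
        w_next = soft_threshold(z - grad / L, lam / L)
        t_next = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t * t))
        z = w_next + ((t - 1.0) / t_next) * (w_next - w)  # momentum step
        w, t = w_next, t_next
    return w


rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
w_true = np.zeros(20)
w_true[:3] = [2.0, -3.0, 1.5]
y = X @ w_true + 0.01 * rng.standard_normal(50)
lam = 0.1 * np.max(np.abs(X.T @ y))

w_hat = fista_lasso(X, y, lam)
print("objective at zero:", lasso_objective(X, y, np.zeros(20), lam))
print("objective at w_hat:", lasso_objective(X, y, w_hat, lam))
```

The one gradient evaluation per iteration is the natural place to look when reasoning about latency: unrolling k iterations so that the required inner products are exchanged in one round is the kind of restructuring the abstract describes.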

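1710.08530 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML Version update

Stability Analysis of Optimal Adaptive Control using Value Iteration with Approximation Errors

Ali Heydari

AI summary: This paper analyzes the stability of adaptive optimal control using value iteration in the presence of approximation errors, and provides an estimate of the region of attraction such that trajectories starting inside it remain valid.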

Comments A part of this paper is based on preliminary results presented in arXiv:1412.5675

1709.03153 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY Version update

MBMF: Model-Based Priors for Model-Free Reinforcement Learning

Somil Bansal, Roberto Calandra, Kurtland Chua, Sergey Levine, Claire Tomlin

AI summary: This paper combines model-based and model-free reinforcement learning by using a learned probabilistic dynamics model as a prior, improving data efficiency and cost effectiveness.

Comments After we submitted the paper for consideration in CoRL 2017 we found a paper published in the recent past with a similar method (see related work for a discussion). Considering the similarities between the two papers, we have decided to retract our paper from CoRL 2017

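1607.03081 2026-06-04 math.NA cs.LG cs.NA math.OC stat.ML Version update

Adaptive multi-penalty regularization based on a generalized Lasso path

Markus Grasmair, Timo Klock, Valeriya Naumova

AI summary: This paper proposes a framework for adaptive parameter choice in multi-penalty regularization that constructs regions of structurally similar solutions to achieve correct support recovery, and combines them with a model selection criterion for data-adaptive parameter choice, improving robustness and performance on compressed sensing problems.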

Abstract

For many algorithms, parameter tuning remains a challenging and critical task, which becomes tedious and infeasible in a multi-parameter setting. Multi-penalty regularization, successfully used for solving underdetermined sparse regression problems of unmixing type, where signal and noise are additively mixed, is one such example. In this paper, we propose a novel algorithmic framework for an adaptive parameter choice in multi-penalty regularization with a focus on correct support recovery. Building upon the theory of regularization paths and algorithms for single-penalty functionals, we extend these ideas to a multi-penalty framework by providing an efficient procedure for the construction of regions containing structurally similar solutions, i.e., solutions with the same sparsity and sign pattern, over the whole range of parameters. Combining this with a model selection criterion, we can choose regularization parameters in a data-adaptive manner. Another advantage of our algorithm is that it provides an overview of the solution stability over the whole range of parameters. This can be further exploited to obtain additional insights into the problem of interest. We provide a numerical analysis of our method and compare it to state-of-the-art single-penalty algorithms for compressed sensing problems in order to demonstrate the robustness and power of the proposed algorithm.

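1502.02860 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SY Version update

Gaussian Processes for Data-Efficient Learning in Robotics and Control

Marc Peter Deisenroth, Dieter Fox, Carl Edward Rasmussen

AI summary: This paper learns a non-parametric Gaussian process transition model to extract more information from data, which speeds up learning, reduces the effect of model errors, and enables data-efficient autonomous learning.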

Comments 20 pages, 29 figures; fixed a typo in equation on page 8

DOI: 10.1109/TPAMI.2013.218
Journal ref: IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, issue no 2, pages 408-423, February 2015

Abstract

Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this article, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.

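1710.02242 2026-06-04 cs.LG cs.NA math.NA Version update

Solving differential equations with unknown constitutive relations as recurrent neural networks

Tobias Hagge, Panos Stinis, Enoch Yeung, Alexandre M. Tartakovsky

AI summary: This paper uses recurrent neural networks to learn unknown reaction-rate terms, embedding the discretized ordinary differential equations in the training problem so that differential equations can be solved when measurements are available for only some state variables, with an application to fed-batch bioreactor simulation.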

Comments 19 pages, 8 figures

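1710.00032 2026-06-04 eess.SY cs.LG cs.SY Version update

Learning the Exact Topology of Undirected Consensus Networks

Saurav Talukdar, Deepjyoti Deka, Sandeep Attree, Donatello Materassi, Murti V. Salapaka

AI summary: This paper proposes a non-invasive method that uses multivariate Wiener filtering to learn the interaction topology of undirected consensus networks, identifying spurious links from the frequency response to recover the exact interaction structure among nodes.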

Comments 6 pages

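1708.02276 2026-06-04 math.NA cs.LG cs.NA Version update

Parallelizing Over Artificial Neural Network Training Runs with Multigrid

Jacob B. Schroder

AI summary: This paper proposes a multigrid reduction in time (MGRIT) algorithm that parallelizes over neural network training runs to address the training bottleneck, by treating training as an evolution equation.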

Comments Version 2: - Added more complete references to basic neural network literature - Corrected typos - Condensed results in Section 3 to be more concise - 22 pages

Abstract

Artificial neural networks are a popular and effective machine learning technique. Great progress has been made parallelizing the expensive training phase of an individual network, leading to highly specialized pieces of hardware, many based on GPU-type architectures, and more concurrent algorithms such as synthetic gradients. However, the training phase continues to be a bottleneck, where the training data must be processed serially over thousands of individual training runs. This work considers a multigrid reduction in time (MGRIT) algorithm that is able to parallelize over the thousands of training runs and converge to the exact same solution as traditional training would provide. MGRIT was originally developed to provide parallelism for time evolution problems that serially step through a finite number of time-steps. This work recasts the training of a neural network similarly, treating neural network training as an evolution equation that evolves the network weights from one step to the next. Thus, this work concerns distributed computing approaches for neural networks, but is distinct from other approaches which seek to parallelize only over individual training runs. The work concludes with supporting numerical results for two model problems.

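1709.09578 2026-06-04 cs.LG cs.NA math.NA Version update

Neural networks for topology optimization

Ivan Sosnovik, Ivan Oseledets

AI summary: This paper proposes a deep-learning-based method for accelerating topology optimization that casts the layout problem as an image segmentation task and uses a convolutional encoder-decoder architecture; experiments show a significant speedup and good generalization.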

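1709.08830 2026-06-04 eess.SY cs.CR cs.LG cs.SY Version update

Catching Anomalous Distributed Photovoltaics: An Edge-based Multi-modal Anomaly Detection

Devu Manikantan Shilay, Kin Gwn Lorey, Tianshu Weiz, Teems Lovetty, Yu Cheng

AI summary: This paper proposes an edge-based multi-modal anomaly detection method that identifies anomalous behavior of devices such as distributed photovoltaics by fusing multiple sources of time-series data, improving detection for power grid security.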

Abstract

A significant challenge in energy system cyber security is the current inability to detect cyber-physical attacks targeting and originating from distributed grid-edge devices such as photovoltaic (PV) panels, smart flexible loads, and electric vehicles. We address this concern by designing and developing a distributed, multi-modal anomaly detection approach that can sense the health of the device and the electric power grid from the edge. This is realized by exploiting unsupervised machine learning algorithms on multiple sources of time-series data, fusing these multiple local observations, and flagging anomalies when a deviation from the normal behavior is observed. We particularly focus on the cyber-physical threats to distributed PVs that have the potential to cause local disturbances or grid instabilities by creating supply-demand mismatch, reverse power flow conditions, etc. We use an open source power system simulation tool called GridLAB-D, loaded with real smart home and solar datasets, to simulate the smart grid scenarios and to illustrate the impact of PV attacks on the power system. Various attacks targeting PV panels that create voltage fluctuations, reverse power flow, etc. were designed and performed. We observe that while individual unsupervised learning algorithms such as OCSVMs, Corrupt RF and PCA excel at identifying particular attack types, PCA with Convex Hull outperforms all algorithms in identifying all designed attacks, with a true positive rate of 83.64% and an accuracy of 95.78%. Our key insight is that, due to the heterogeneous nature of the distribution grid and the uncertainty in the type of attack being launched, relying on a single mode of information for defense can lead to increased false alarm and missed detection rates, as one can design attacks to hide within those uncertainties and remain stealthy.

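1703.09260 2026-06-04 eess.SY cs.LG cs.SY Version update

Goal-Driven Dynamics Learning via Bayesian Optimization

Somil Bansal, Roberto Calandra, Ted Xiao, Sergey Levine, Claire J. Tomlin

AI summary: This paper proposes a Bayesian-optimization-based active learning framework that iteratively learns a locally linear dynamics model to improve control performance, applied to quadrotor control tasks.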

Comments This is the extended version of the CDC'17 paper titled "Goal-Driven Dynamics Learning via Bayesian Optimization."

1703.01250 2026-06-04 cs.RO cs.LG cs.SY eess.SY Version update

Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

Alonso Marco, Felix Berkenkamp, Philipp Hennig, Angela P. Schoellig, Andreas Krause, Stefan Schaal, Sebastian Trimpe

AI summary: This paper speeds up reinforcement learning by combining cheap but inaccurate information from simulations with expensive but accurate physical experiments.

Comments 7 pages, 6 figures, to appear in IEEE 2017 International Conference on Robotics and Automation (ICRA)

DOI: 10.1109/ICRA.2017.7989186

Abstract

In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

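1605.01950 2026-06-04 cs.RO cs.LG cs.SY eess.SY Version update

Automatic LQR Tuning Based on Gaussian Process Global Optimization

Alonso Marco, Philipp Hennig, Jeannette Bohg, Stefan Schaal, Sebastian Trimpe

AI summary: This paper proposes an automatic controller tuning framework that combines linear optimal control with Bayesian optimization to improve controller parameters against a performance objective evaluated from experimental data, demonstrated on a seven-degree-of-freedom robot arm balancing an inverted pole.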

Comments 8 pages, 5 figures, to appear in IEEE 2016 International Conference on Robotics and Automation. Video demonstration of the experiments available at https://am.is.tuebingen.mpg.de/publications/marco_icra_2016

DOI: 10.1109/ICRA.2016.7487144

Abstract

This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. Thus, this framework shall yield improved controllers with fewer evaluations compared to alternative approaches. A seven-degree-of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Results of two- and four-dimensional tuning problems highlight the method's potential for automatic controller tuning on robotic platforms.

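1709.06080 2026-06-04 cs.LG cs.AI cs.NA math.NA Version update

Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form

Maxim Naumov

AI summary: This paper studies the linear algebra theory behind feedforward and recurrent neural networks, derives exact expressions for the Hessian, and presents the weight gradients and the Hessian in matrix form.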

Comments 23 pages, 4 figures

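1709.06011 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY stat.ML Version update

Guided Deep Reinforcement Learning for Swarm Systems

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

AI summary: This paper studies how to learn to control cooperative agents with limited sensing capabilities, such as robot swarms, and proposes a guided reinforcement learning framework in which a central critic has access to the global state to simplify policy evaluation, using deep reinforcement learning to approximate the Q-function and the policy.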

Comments 15 pages, 8 figures, accepted at the AAMAS 2017 Autonomous Robots and Multirobot Systems (ARMS) Workshop

Abstract

In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each other and 2) locating a target.

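1602.04436 2026-06-04 cs.LG cs.SY eess.SY stat.ML Version update

Autoregressive Moving Average Graph Filtering

Elvin Isufi, Andreas Loukas, Andrea Simonetto, Geert Leus

AI summary: This paper proposes autoregressive moving average graph filters that can approximate any desired graph frequency response and give exact solutions for graph signal denoising and interpolation. The approach applies to static and time-varying settings, handling time-varying graph signals with a two-dimensional filter.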

DOI: 10.1109/TSP.2016.2614793
Journal ref: IEEE Transactions on Signal Processing, vol. 67 (2), pages 274 - 288, 2017

Abstract

Graph filters, direct analogues of classical filters but intended for signals defined on graphs, are one of the cornerstones of the field of signal processing on graphs. This work brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which (i) are able to approximate any desired graph frequency response, and (ii) give exact solutions for tasks such as graph signal denoising and interpolation. The design philosophy, which allows us to design the ARMA coefficients independently from the underlying graph, renders the ARMA graph filters suitable in static and, particularly, time-varying settings. The latter occur when the graph signal and/or graph are changing over time. We show that in the case of a time-varying graph signal our approach extends naturally to a two-dimensional filter, operating concurrently in the graph and regular time domains. We also derive sufficient conditions for filter stability when the graph and signal are time-varying. The analytical and numerical results presented in this paper illustrate that ARMA graph filters are practically appealing for static and time-varying settings, as predicted by theoretical derivations.
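
To illustrate what an ARMA recursion on a graph can look like, here is a minimal sketch of a first-order recursion $y_{t+1} = \psi M y_t + \varphi x$ on a small path graph, compared against its steady state $\varphi (I - \psi M)^{-1} x$. The specific recursion form, the translated Laplacian `M`, the coefficient names `psi` and `phi`, and the graph are assumptions made for illustration and are not taken from the abstract; they should be checked against the paper before being relied on.

```python
import numpy as np

n = 6
# Path graph: node i is connected to node i + 1.
Adj = np.zeros((n, n))
for i in range(n - 1):
    Adj[i, i + 1] = Adj[i + 1, i] = 1.0
Lap = np.diag(Adj.sum(axis=1)) - Adj

# Translated Laplacian with spectrum in [-lam_max / 2, lam_max / 2].
lam_max = np.linalg.eigvalsh(Lap)[-1]
M = 0.5 * lam_max * np.eye(n) - Lap


def arma1_filter(M, x, psi, phi, n_iters):
    """Iterate y <- psi * M y + phi * x; M @ y only mixes neighboring nodes."""
    y = np.zeros_like(x)
    for _ in range(n_iters):
        y = psi * (M @ y) + phi * x
    return y


psi, phi = 0.4, 1.0  # |psi| * lam_max / 2 < 1, so the recursion converges
rng = np.random.default_rng(0)
x = rng.standard_normal(n)

y_iter = arma1_filter(M, x, psi, phi, n_iters=200)
y_closed = phi * np.linalg.solve(np.eye(n) - psi * M, x)
print("max deviation from steady state:", np.max(np.abs(y_iter - y_closed)))
```

In the graph frequency domain the steady state corresponds to a rational response $\varphi / (1 - \psi \mu)$ in each eigenvalue $\mu$ of `M`, which is why such recursions can realize denoising-type filters exactly rather than approximately.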


1709.04073 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging

线性随机逼近：固定步长和迭代平均

Chandrashekar Lakshminarayanan, Csaba Szepesvári

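AI summary: This paper studies linear stochastic approximation algorithms with a constant step-size and Polyak-Ruppert averaging, bounds their mean squared error as a function of the iteration count, and examines when a constant step-size can be chosen uniformly across data distributions, together with a heuristic tuning method.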

Comments 16 pages, 2 figures, was submitted to NIPS 2017

详情

AI中文摘要

本文研究了具有固定步长和Polyak-Ruppert（PR）迭代平均的$d$维线性随机逼近算法（LSAs）。LSAs广泛应用于机器学习和强化学习（RL）中，其目标是利用噪声数据和每个迭代$O(d)$次更新来计算合适的$θ_*∈\mathbb{R}^d$（即最优解或固定点）。本文受RL中从经验回放中评估策略的问题启发，探讨了属于时间差分（TD）类学习算法的LSAs。对于具有固定步长和PR平均的LSAs，我们提供了$t$次迭代后的均方误差（MSE）的界限。我们假设数据是独立同分布且具有有限方差（底层分布为$P$）且期望动力学是Hurwitz的。对于给定的LSA与PR平均，以及满足上述假设的数据分布$P$，我们证明存在一个常数步长范围，使得其MSE衰减为$O(1/t)$。我们还探讨了在数据分布$\mathcal{P}$中选择统一常数步长的条件，并证明并非所有数据分布都允许这样的统一常数步长。此外，我们建议一种启发式步长调整算法，用于为给定的数据分布$P$选择LSA的常数步长。我们还比较了我们的结果与相关工作，并讨论了我们的结果在TD算法作为LSAs的上下文中的意义。

英文摘要

We consider $d$-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and the so-called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate $θ_{*} \in \mathbb{R}^d$ (that is, an optimum or a fixed point) using noisy data and $O(d)$ updates per iteration. In this paper, we are motivated by the problem (in RL) of policy evaluation from experience replay using the temporal difference (TD) class of learning algorithms that are also LSAs. For LSAs with a constant step-size, and PR averaging, we provide bounds for the mean squared error (MSE) after $t$ iterations. We assume that data is i.i.d. with finite variance (underlying distribution being $P$) and that the expected dynamics is Hurwitz. For a given LSA with PR averaging, and data distribution $P$ satisfying the said assumptions, we show that there exists a range of constant step-sizes such that its MSE decays as $O(\frac{1}{t})$. We examine the conditions under which a constant step-size can be chosen uniformly for a class of data distributions $\mathcal{P}$, and show that not all data distributions 'admit' such a uniform constant step-size. We also suggest a heuristic step-size tuning algorithm to choose a constant step-size of a given LSA for a given data distribution $P$. We compare our results with related work and also discuss the implication of our results in the context of TD algorithms that are LSAs.
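
The setting described above can be sketched in a few lines. This is a minimal illustration, assuming the common LSA update $θ_{t+1} = θ_t + α (b_t - A_t θ_t)$ with i.i.d. noisy pairs $(A_t, b_t)$; the update form, the 2-dimensional problem, the noise level, and the step-size below are assumptions chosen for the example and are not taken from the paper.

```python
import numpy as np

def lsa_pr_average(sample, theta0, step_size, num_iters):
    """Constant step-size LSA with Polyak-Ruppert averaging of iterates."""
    theta = np.array(theta0, dtype=float)
    avg = np.zeros_like(theta)
    for t in range(1, num_iters + 1):
        A_t, b_t = sample()                      # one noisy i.i.d. observation
        theta = theta + step_size * (b_t - A_t @ theta)
        avg += (theta - avg) / t                 # running mean of the iterates
    return theta, avg

# Hypothetical problem: the eigenvalues of A_mean have positive real part,
# so the expected dynamics theta -> theta - alpha * A_mean @ theta is stable.
rng = np.random.default_rng(0)
A_mean = np.array([[2.0, 0.5],
                   [0.0, 1.0]])
b_mean = np.array([1.0, -1.0])
theta_star = np.linalg.solve(A_mean, b_mean)     # fixed point A theta = b

def sample():
    A_t = A_mean + 0.1 * rng.standard_normal((2, 2))
    b_t = b_mean + 0.1 * rng.standard_normal(2)
    return A_t, b_t

last_iterate, averaged_iterate = lsa_pr_average(
    sample, theta0=np.zeros(2), step_size=0.05, num_iters=20000
)
error_of_average = np.linalg.norm(averaged_iterate - theta_star)
```

With a constant step-size the last iterate keeps fluctuating around `theta_star`, while the averaged iterate smooths those fluctuations out; the paper's result concerns how fast the error of that average shrinks with $t$.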


1701.02440 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Machine Learning of Linear Differential Equations using Gaussian Processes

利用高斯过程学习线性微分方程

Maziar Raissi, George Em. Karniadakis

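AI summary: Drawing on recent advances in probabilistic machine learning, this paper uses Gaussian process priors to discover conservation laws expressed by parametric linear equations, including ordinary differential, partial differential, integro-differential, and fractional-order operators.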

1709.01672 2026-06-04 cs.NI cs.LG cs.NE cs.SY eess.SY 版本更新

Throughput Optimal Decentralized Scheduling of Multi-Hop Networks with End-to-End Deadline Constraints: II Wireless Networks with Interference

多跳网络中端到端截止期限约束下的吞吐量最优去中心化调度：II 无线网络与干扰

Rahul Singh, P. R. Kumar, Eytan Modiano

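AI summary: Studies decentralized scheduling under end-to-end deadline constraints in multi-hop wireless networks, proposes routing-scheduling policies based on a packet's location and age that maximize timely throughput, and highlights the key difference from classical stability-oriented policies.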

详情

AI中文摘要

考虑一个多跳无线网络，服务于多个流，其中无线链路干扰约束由链路干扰图描述。针对此类网络，设计路由调度策略以最大化网络的端到端及时吞吐量。流f的及时吞吐量定义为平均到达其目的地节点df内的包率。我们的策略具有几个令人惊讶的特点。首先，我们证明了单个包在无线节点i∈V处的最优路由调度决策仅取决于其位置和年龄，因此无线节点i无需了解全局网络状态即可最大化及时吞吐量。相比之下，在回压路由策略下，节点i仅需了解邻居队列长度以保证最大稳定性，因此是去中心化的。关键差异在于，在我们的设置中，一旦包的年龄超过其截止期限，其效用将丧失，这使得优化及时吞吐量比确保网络稳定性更具挑战性。当然，由于这一关键差异，最大化及时吞吐量的决策过程也比确保网络内队列稳定化更复杂。因此，我们的结果有些令人惊讶。

英文摘要

Consider a multihop wireless network serving multiple flows in which wireless link interference constraints are described by a link interference graph. For such a network, we design routing-scheduling policies that maximize the end-to-end timely throughput of the network. Timely throughput of a flow $f$ is defined as the average rate at which packets of flow $f$ reach their destination node $d_f$ within their deadline. Our policy has several surprising characteristics. Firstly, we show that the optimal routing-scheduling decision for an individual packet that is present at a wireless node $i\in V$ is solely a function of its location and "age". Thus, a wireless node $i$ does not require knowledge of the "global" network state in order to maximize the timely throughput. We notice that, in comparison, under the backpressure routing policy, a node $i$ requires only the knowledge of its neighbours' queue lengths in order to guarantee maximal stability, and hence is decentralized. The key difference arises from the fact that in our set-up the packets lose their utility once their "age" has crossed their deadline, thus making the task of optimizing timely throughput much more challenging than that of ensuring network stability. Of course, due to this key difference, the decision process involved in maximizing the timely throughput is also much more complex than that involved in ensuring network-wide queue stabilization. In view of this, our results are somewhat surprising.


1709.02555 2026-06-04 eess.SY cs.AI cs.LG cs.LO cs.SY 版本更新

Causality-Aided Falsification

因果辅助的反驳

Takumi Akazaki, Yoshihiro Kumazawa, Ichiro Hasuo

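AI summary: This paper proposes using causal information to make falsification more efficient in the quality assurance of heterogeneous systems, with a Bayesian network used to reshape the cost function so that input values are searched efficiently.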

Comments In Proceedings FVAV 2017, arXiv:1709.02126

1709.02435 2026-06-04 cs.AI cs.LG cs.SE cs.SY eess.SY 版本更新

An Analysis of ISO 26262: Using Machine Learning Safely in Automotive Software

ISO 26262分析：在汽车软件中安全使用机器学习

Rick Salay, Rodrigo Queiroz, Krzysztof Czarnecki

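AI summary: This paper analyzes how using machine learning in automotive software affects the ISO 26262 safety lifecycle and recommends ways to adapt the standard to accommodate machine learning.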

Comments 6 pages, 3 figures

1709.01237 2026-06-04 cs.CV cs.LG cs.NA math.NA 版本更新

Newton-type Methods for Inference in Higher-Order Markov Random Fields

牛顿型方法在高阶马尔可夫随机场推断中的应用

Hariprasad Kannan, Nikos Komodakis, Nikos Paragios

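AI summary: This paper studies the benefit of Newton-type methods for solving the Lagrangian dual in inference for higher-order Markov random fields, and proposes a provably convergent and efficient framework comprising a trade-off between computational complexity and precision in constructing the Hessian, a damping strategy, a truncation strategy coupled with a generic preconditioner, and efficient sum-product computation for sparse clique potentials.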

Comments 10 pages, 3 figures, 3 tables, CVPR 2017

详情

Journal ref: Poster at IEEE International Conference on Computer Vision and Pattern Recognition 2017

AI中文摘要

线性规划松弛是离散马尔可夫随机场MAP推断中的核心方法。正确求解拉格朗日对偶问题的能力是此类方法的关键组成部分。本文研究了使用牛顿型方法求解平滑版本问题的拉格朗日对偶问题的益处。我们探讨了其在实现更优收敛行为和更好地处理公式中的病态性质方面的能力，与一阶方法相比。我们证明了确实可以高效地应用信任区域牛顿方法，以解决广泛MAP推断问题。本文提出了一种可证收敛且高效的框架，包括（i）在Hessian矩阵构建方面计算复杂度和精度之间的良好平衡，（ii）一种有助于高效优化的阻尼策略，（iii）一种与通用共轭梯度预条件器结合的截断策略，（iv）稀疏团势能的高效求和-乘积计算。高阶马尔可夫随机场的结果展示了这种方法的潜力。

英文摘要

Linear programming relaxations are central to MAP inference in discrete Markov Random Fields. The ability to properly solve the Lagrangian dual is a critical component of such methods. In this paper, we study the benefit of using Newton-type methods to solve the Lagrangian dual of a smooth version of the problem. We investigate their ability to achieve superior convergence behavior and to better handle the ill-conditioned nature of the formulation, as compared to first order methods. We show that it is indeed possible to efficiently apply a trust region Newton method for a broad range of MAP inference problems. In this paper we propose a provably convergent and efficient framework that includes (i) excellent compromise between computational complexity and precision concerning the Hessian matrix construction, (ii) a damping strategy that aids efficient optimization, (iii) a truncation strategy coupled with a generic pre-conditioner for Conjugate Gradients, (iv) efficient sum-product computation for sparse clique potentials. Results for higher-order Markov Random Fields demonstrate the potential of this approach.


1607.03428 2026-06-04 cs.LG cs.SY eess.SY quant-ph stat.ML 版本更新

Learning in Quantum Control: High-Dimensional Global Optimization for Noisy Quantum Dynamics

量子控制中的学习：用于噪声量子动力学的高维全局优化

Pantita Palittapongarnpim, Peter Wittek, Ehsan Zahedinejad, Shakib Vedaie, Barry C. Sanders

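AI summary: This paper uses differential evolution to tackle non-convex optimization in high-dimensional quantum systems, improves control fidelity, introduces heuristics that cut computational cost, and reports superior performance on quantum phase estimation and quantum gate design.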

Comments 32 pages, 4 figures, extension of proceedings in ESANN 2016 conference submitted to Neurocomputing

详情

DOI: 10.1016/j.neucom.2016.12.087
Journal ref: Neurocomputing 268 (2017) 116-126

AI中文摘要

量子控制在多种量子技术中具有重要价值，如通用量子计算中的高保真门、自适应量子增强计量和超冷原子操控。尽管监督学习和强化学习广泛用于优化经典系统的控制参数，但量子参数优化主要通过基于梯度的贪心算法进行。虽然量子适应度景观通常与贪心算法兼容，但在高维量子系统中贪心算法可能表现不佳。本文采用差分进化算法克服非凸优化的停滞问题，通过平均目标函数提升噪声系统中的量子控制保真度。为减少计算成本，引入了运行终止的启发式方法和自适应搜索子空间选择。我们的实现是大规模并行和向量化的，以进一步减少运行时间。通过量子相位估计和量子门设计两个示例，我们展示了在保真度和可扩展性方面优于贪心算法的结果。

英文摘要

Quantum control is valuable for various quantum technologies such as high-fidelity gates for universal quantum computing, adaptive quantum-enhanced metrology, and ultra-cold atom manipulation. Although supervised machine learning and reinforcement learning are widely used for optimizing control parameters in classical systems, quantum control for parameter optimization is mainly pursued via gradient-based greedy algorithms. Although the quantum fitness landscape is often compatible with greedy algorithms, sometimes greedy algorithms yield poor results, especially for large-dimensional quantum systems. We employ differential evolution algorithms to circumvent the stagnation problem of non-convex optimization. We improve quantum control fidelity for noisy systems by averaging over the objective function. To reduce computational cost, we introduce heuristics for early termination of runs and for adaptive selection of search subspaces. Our implementation is massively parallel and vectorized to reduce run time even further. We demonstrate our methods with two examples, namely quantum phase estimation and quantum gate design, for which we achieve better fidelity and scalability than those obtained using greedy algorithms.
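
The combination of differential evolution with an averaged noisy objective can be sketched as follows. This is only an illustration under stated assumptions: it uses the textbook rand/1/bin variant, minimizes a toy noisy sphere function rather than maximizing a quantum fidelity, and leaves out the early-termination and subspace-selection heuristics the abstract mentions. All names and parameter values (`F`, `CR`, `repeats`, population size) are hypothetical choices, not the authors' settings.

```python
import numpy as np

def averaged(noisy_f, x, repeats, rng):
    """Average several noisy evaluations to reduce the objective's variance."""
    return np.mean([noisy_f(x, rng) for _ in range(repeats)])

def differential_evolution(noisy_f, bounds, rng, pop_size=20, F=0.7, CR=0.9,
                           generations=100, repeats=5):
    low, high = bounds
    dim = len(low)
    pop = rng.uniform(low, high, size=(pop_size, dim))
    fitness = np.array([averaged(noisy_f, p, repeats, rng) for p in pop])
    for _ in range(generations):
        for i in range(pop_size):
            # Pick three distinct population members other than i.
            others = [j for j in range(pop_size) if j != i]
            a, b, c = pop[rng.choice(others, size=3, replace=False)]
            mutant = np.clip(a + F * (b - c), low, high)
            # Binomial crossover; force at least one coordinate from the mutant.
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True
            trial = np.where(cross, mutant, pop[i])
            f_trial = averaged(noisy_f, trial, repeats, rng)
            if f_trial <= fitness[i]:            # greedy one-to-one selection
                pop[i], fitness[i] = trial, f_trial
    best = np.argmin(fitness)
    return pop[best], fitness[best]

def noisy_sphere(x, rng):
    """Toy stand-in for a noisy cost: sum of squares plus Gaussian noise."""
    return float(np.sum(x ** 2) + 0.01 * rng.standard_normal())

rng = np.random.default_rng(1)
bounds = (np.full(3, -2.0), np.full(3, 2.0))
best_x, best_f = differential_evolution(noisy_sphere, bounds, rng)
```

Raising `repeats` trades extra evaluations for a less noisy comparison between a trial vector and its parent, which is the basic cost that heuristics such as early termination aim to offset.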


1504.01982 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Adaptive Diffusion Schemes for Heterogeneous Networks

异构网络中的自适应扩散方案

Jesus Fernandez-Bes, Jerónimo Arenas-García, Magno T. M. Silva, Luis A. Azpicueta-Ruiz

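AI summary: For distributed estimation in heterogeneous diffusion networks, this paper proposes an adaptive combination strategy that decouples the adaptation and combination phases so that nodes keep purely local estimates, and adapts the combiners to minimize the network mean-square error; experiments show it outperforms existing techniques.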

Comments To appear in IEEE Transactions on Signal Processing. URL: http://ieeexplore.ieee.org/document/8010454/

详情

DOI: 10.1109/TSP.2017.2740199
Journal ref: IEEE Transactions on Signal Processing (Volume: 65, Issue: 21, Nov. 1, 2017)

AI中文摘要

本文研究了异构网络中的扩散策略，提出了一种解耦适应与组合阶段的自适应方案，通过最小化网络均方误差优化组合器，与传统Adapt-then-Combine方案相比，实验表明该方法在异构网络中表现更优。

英文摘要

In this paper, we deal with distributed estimation problems in diffusion networks with heterogeneous nodes, i.e., nodes that either implement different adaptive rules or differ in some other aspect such as the filter structure or length, or step size. Although such heterogeneous networks have been considered from the first works on diffusion networks, obtaining practical and robust schemes to adaptively adjust the combiners in different scenarios is still an open problem. In this paper, we study a diffusion strategy specially designed and suited to heterogeneous networks. Our approach is based on two key ingredients: 1) the adaptation and combination phases are completely decoupled, so that network nodes keep purely local estimations at all times; and 2) combiners are adapted to minimize estimates of the network mean-square-error. Our scheme is compared with the standard Adapt-then-Combine scheme and theoretically analyzed using energy conservation arguments. Several experiments involving networks with heterogeneous nodes show that the proposed decoupled Adapt-then-Combine approach with adaptive combiners outperforms other state-of-the-art techniques, becoming a competitive approach in these scenarios.
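
For orientation, the standard Adapt-then-Combine (ATC) baseline that the abstract compares against can be sketched as below. This is a minimal illustration of ATC diffusion LMS with fixed combination weights, not the decoupled scheme with adaptive combiners proposed in the paper. The three-node network, the uniform combination matrix `C`, and the per-node step sizes and noise levels are hypothetical and only serve to mimic heterogeneous nodes.

```python
import numpy as np

def atc_diffusion_lms(U, D, C, step_sizes, dim):
    """Standard ATC diffusion LMS.

    U[t, k] is the regressor of node k at time t, D[t, k] its measurement,
    and C[l, k] the weight node k gives to node l (columns of C sum to 1).
    """
    num_steps, num_nodes = D.shape
    W = np.zeros((num_nodes, dim))
    for t in range(num_steps):
        # Adapt: each node runs one LMS step on its own data.
        errors = D[t] - np.einsum("kd,kd->k", U[t], W)
        Psi = W + (step_sizes * errors)[:, None] * U[t]
        # Combine: each node averages the intermediate estimates it receives.
        W = C.T @ Psi
    return W

rng = np.random.default_rng(0)
num_steps, num_nodes, dim = 3000, 3, 2
w_true = np.array([1.0, -0.5])
U = rng.standard_normal((num_steps, num_nodes, dim))
noise_std = np.array([0.05, 0.1, 0.3])           # heterogeneous noise levels
D = U @ w_true + noise_std * rng.standard_normal((num_steps, num_nodes))
C = np.full((num_nodes, num_nodes), 1.0 / num_nodes)
W = atc_diffusion_lms(U, D, C, step_sizes=np.array([0.01, 0.05, 0.1]), dim=dim)
max_error = np.max(np.abs(W - w_true))
```

In this baseline the combine step overwrites each node's estimate with a mixture, so a poorly tuned node affects its neighbours directly; the abstract's point 1) is that the proposed scheme keeps the local estimates separate from the combined ones.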


1708.09342 2026-06-04 eess.SY cs.LG cs.RO cs.SY math.OC 版本更新

Optimal and Learning Control for Autonomous Robots

自主机器人最优与学习控制

Jonas Buchli, Farbod Farshidian, Alexander Winkler, Timothy Sandy, Markus Giftthaler

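AI summary: Building on optimal control and reinforcement learning, this paper treats closed-loop control of autonomous robots from a unified viewpoint, using a common notation and comparing terminology to help readers relate methods from the different fields.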

Comments Lecture Notes, 101 pages

1708.09165 2026-06-04 math.NA cs.LG cs.NA 版本更新

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

张量网络用于降维和大规模优化。第二部分：应用与未来展望

A. Cichocki, A-H. Phan, Q. Zhao, N. Lee, I. V. Oseledets, M. Sugiyama, D. Mandic

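AI summary: Part 2 of this monograph covers the use of tensor networks for super-compressed higher-order representations of data/parameters and related cost functions, focusing on the tensor train (TT) and Hierarchical Tucker (HT) decompositions and their physical interpretations, and showing their potential in machine learning and data analytics.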

Comments 232 pages

详情

DOI: 10.1561/2200000067
Journal ref: Foundations and Trends in Machine Learning: Vol. 9: No. 6, pp 431-673, 2017

AI中文摘要

本专著第二部分基于第一部分介绍的张量网络及其操作，聚焦于张量网络模型在数据/参数的超压缩高阶表示及相关成本函数中的应用，概述其在机器学习和数据分析中的应用。特别强调张量列车（TT）和分层张量（HT）分解及其具有物理意义的解释，反映张量网络方法的可扩展性。通过图示方法，阐述了通过底层低秩张量近似和核心张量的复杂收缩，张量网络能够执行分布式计算，从而缓解或消除维度灾难。该概念在广义回归和分类（支持张量机、典型相关分析、高阶偏最小二乘）、广义特征值分解、黎曼优化和深度神经网络优化等多个应用领域中得到了验证。本工作的第一部分和第二部分可以作为单独的文本使用，也可以作为低秩张量网络和张量分解这一激动人心领域的综合综述。

英文摘要

Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.


1708.08552 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

An inexact subsampled proximal Newton-type method for large-scale machine learning

一种用于大规模机器学习的近似子采样近端牛顿型方法

Xuanqing Liu, Cho-Jui Hsieh, Jason D. Lee, Yuekai Sun

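AI summary: This paper proposes a fast proximal Newton-type algorithm that builds the Newton subproblem by subsampling, improving efficiency for large-scale optimization; experiments confirm its advantage on ℓ₁-regularized logistic regression.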

1708.07850 2026-06-04 cs.LG cs.CV cs.NA math.NA 版本更新

Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications

结构低秩矩阵分解：全局最优性、算法与应用

Benjamin D. Haeffele, Rene Vidal

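AI summary: This paper presents a matrix factorization technique suited to large datasets that captures additional structure through a particular form of regularization, proves that when the factors are large enough local minimizers are global minimizers, and shows its advantages in neural calcium imaging video segmentation and hyperspectral compressed recovery.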

详情

AI中文摘要

近年来，低秩矩阵分解问题凸形式在机器学习中受到广泛关注。然而，此类形式往往需要求解与数据矩阵同样大小的矩阵，难以应用于大规模数据集。此外，在许多应用中，数据可能表现出超越单纯低秩的结构，例如图像和视频呈现复杂的时空结构，而标准低秩方法大多忽略这些结构。本文研究了一种适用于大规模数据集的矩阵分解技术，通过特定形式的正则化捕捉额外结构，该正则化包括总变分和核范数等已知正则化器作为特例。尽管所得优化问题非凸，我们证明在因子规模足够时，若满足某些条件，则因子的任何局部极小值即为全局极小值。此外，本文还提供了几种实用算法来解决矩阵分解问题，并推导了近似解到全局最优解距离的界。神经钙成像视频分割和高光谱压缩恢复的示例展示了该方法在高维数据集中的优势。

英文摘要

Recently, convex formulations of low-rank matrix factorization problems have received considerable attention in machine learning. However, such formulations often require solving for a matrix of the size of the data matrix, making it challenging to apply them to large scale datasets. Moreover, in many applications the data can display structures beyond simply being low-rank, e.g., images and videos present complex spatio-temporal structures that are largely ignored by standard low-rank methods. In this paper we study a matrix factorization technique that is suitable for large datasets and captures additional structure in the factors by using a particular form of regularization that includes well-known regularizers such as total variation and the nuclear norm as particular cases. Although the resulting optimization problem is non-convex, we show that if the size of the factors is large enough, under certain conditions, any local minimizer for the factors yields a global minimizer. A few practical algorithms are also provided to solve the matrix factorization problem, and bounds on the distance from a given approximate solution of the optimization problem to the global optimum are derived. Examples in neural calcium imaging video segmentation and hyperspectral compressed recovery show the advantages of our approach on high-dimensional datasets.


1610.05984 2026-06-04 cs.NE cs.AI cs.LG cs.SY eess.SY 版本更新

Particle Swarm Optimization for Generating Interpretable Fuzzy Reinforcement Learning Policies

粒子群优化用于生成可解释的模糊强化学习策略

Daniel Hein, Alexander Hentschel, Thomas Runkler, Steffen Udluft

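AI summary: This paper proposes fuzzy particle swarm reinforcement learning (FPSRL), which produces interpretable fuzzy RL policies by training their parameters on world models that simulate the real system dynamics, targeting domains where online learning is not possible.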

详情

DOI: 10.1016/j.engappai.2017.07.005
Journal ref: Engineering Applications of Artificial Intelligence, Volume 65C, October 2017, Pages 87-98

AI中文摘要

模糊控制器是用于连续状态和动作空间的有效且可解释的系统控制器。到目前为止，此类控制器要么是手动构建的，要么是通过使用专家生成的问题特定成本函数或结合详细的最优控制策略知识自动训练的。在大多数现实世界的强化学习（RL）问题中，这两种要求都不存在。在这些应用中，由于在线学习需要在策略训练期间探索问题的动力学，因此通常禁止在线学习。我们引入了一种模糊粒子群强化学习（FPSRL）方法，该方法仅通过在模拟真实系统动态的世界模型上训练参数来构建模糊RL策略。这些世界模型是通过使用之前生成的转换样本的自主机器学习技术创建的。据我们所知，这种方法是首次将自组织模糊控制器与基于模型的批量RL相关联的。因此，FPSRL旨在解决那些禁止在线学习、系统动态相对容易从先前生成的默认策略转换样本中建模，并且预计存在相对易于解释的控制策略的领域的问题。通过使用三个标准RL基准，即山车、平衡小车和小车摆起，证明了所提出方法在这些领域中的效率。我们的实验结果展示了高性能且可解释的模糊策略。

英文摘要

Fuzzy controllers are efficient and interpretable system controllers for continuous state and action spaces. To date, such controllers have been constructed manually or trained automatically either using expert-generated problem-specific cost functions or incorporating detailed knowledge about the optimal control strategy. Both requirements for automatic training processes are not found in most real-world reinforcement learning (RL) problems. In such applications, online learning is often prohibited for safety reasons because online learning requires exploration of the problem's dynamics during policy training. We introduce a fuzzy particle swarm reinforcement learning (FPSRL) approach that can construct fuzzy RL policies solely by training parameters on world models that simulate real system dynamics. These world models are created by employing an autonomous machine learning technique that uses previously generated transition samples of a real system. To the best of our knowledge, this approach is the first to relate self-organizing fuzzy controllers to model-based batch RL. Therefore, FPSRL is intended to solve problems in domains where online learning is prohibited, system dynamics are relatively easy to model from previously generated default policy transition samples, and it is expected that a relatively easily interpretable control policy exists. The efficiency of the proposed approach with problems from such domains is demonstrated using three standard RL benchmarks, i.e., mountain car, cart-pole balancing, and cart-pole swing-up. Our experimental results demonstrate high-performing, interpretable fuzzy policies.


1601.08068 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

System Identification through Online Sparse Gaussian Process Regression with Input Noise

通过在线稀疏高斯过程回归进行系统辨识

Hildo Bijl, Thomas B. Schön, Jan-Willem van Wingerden, Michel Verhaegen

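AI summary: This paper proposes an online sparse Gaussian process regression algorithm that addresses the limitations of Gaussian process regression in computational efficiency, online updating, and handling noisy inputs; experiments show strong performance in modeling nonlinear black-box systems.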

1708.03366 2026-06-04 cs.LG cs.AI cs.CR cs.SY eess.SY 版本更新

Resilient Linear Classification: An Approach to Deal with Attacks on Training Data

鲁棒线性分类：一种应对训练数据攻击的方法

Sangdon Park, James Weimer, Insup Lee

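AI summary: This paper proposes a resilient linear classification method that adds a majority constraint to improve resilience to attacks on training data, and confirms that traditional algorithms are vulnerable under such attacks.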

Comments Accepted as a conference paper at ICCPS17

详情

DOI: 10.1145/3055004.3055006

AI中文摘要

数据驱动技术用于控制自动驾驶车辆、处理能源管理的需求响应以及建模人体生理学用于医疗设备。这些技术从训练数据中提取模型，其性能通常基于训练数据中的随机误差进行分析。然而，如果训练数据被攻击者恶意篡改，这些攻击对数据驱动CPS底层学习算法的影响尚未被考虑。本文分析了分类算法对训练数据攻击的鲁棒性。具体而言，提出了一种通用度量标准，用于衡量分类算法对训练数据最坏情况篡改的鲁棒性。使用该度量标准，我们显示传统线性分类算法在受限条件下具有鲁棒性。为克服这些限制，我们提出了一种具有多数约束的线性分类算法，并证明其比传统算法更鲁棒。在合成数据和一个现实世界的回顾性心律失常医疗案例研究中的评估显示，传统算法对篡改的训练数据易受攻击，而所提算法更具鲁棒性（以最坏情况篡改衡量）。

英文摘要

Data-driven techniques are used in cyber-physical systems (CPS) for controlling autonomous vehicles, handling demand responses for energy management, and modeling human physiology for medical devices. These data-driven techniques extract models from training data, where their performance is often analyzed with respect to random errors in the training data. However, if the training data is maliciously altered by attackers, the effect of these attacks on the learning algorithms underpinning data-driven CPS has yet to be considered. In this paper, we analyze the resilience of classification algorithms to training data attacks. Specifically, a generic metric is proposed that is tailored to measure resilience of classification algorithms with respect to worst-case tampering of the training data. Using the metric, we show that traditional linear classification algorithms are resilient under restricted conditions. To overcome these limitations, we propose a linear classification algorithm with a majority constraint and prove that it is strictly more resilient than the traditional algorithms. Evaluations on both synthetic data and a real-world retrospective arrhythmia medical case-study show that the traditional algorithms are vulnerable to tampered training data, whereas the proposed algorithm is more resilient (as measured by worst-case tampering).


1610.05261 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

A probabilistic model for the numerical solution of initial value problems

初值问题数值解的概率模型

Michael Schober, Simo Särkkä, Philipp Hennig

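AI summary: This paper casts the solution of initial value problems as inference on a latent path, connecting general linear methods, Runge-Kutta methods, and Nordsieck methods, and exposing the implicit assumptions and treatment of uncertainty in classical methods.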

Comments 23 pages, 11 figures

1610.02067 2026-06-04 cs.GT cs.IT cs.LG cs.SY eess.SY math.IT 版本更新

Stochastic Games for Smart Grid Energy Management with Prospect Prosumers

面向智能电网的能量管理的随机博弈：考虑前景理论的产消者

Seyed Rasoul Etesami, Walid Saad, Narayan Mandayam, H. Vincent Poor

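AI summary: This paper studies smart grid energy management under stochastic dynamics, modeling prosumer behavior as a stochastic game with payoff functions built on prospect theory, and proposes a distributed algorithm that converges to an ε-Nash equilibrium.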

详情

AI中文摘要

本文研究了在随机动态下智能电网能量管理问题。在所考虑的模型中，假设消费者可以作为拥有可再生能源的产消者，既能生产也能消费能源。由于产消者决策与可再生能源的随机性耦合，产消者间的交互被建模为随机博弈，其中每个产消者通过控制其能源消费和需求来最大化收益。特别地，利用前景理论，将产消者的主观行为显式地反映到其收益函数中。对于基于前景的随机博弈，证明总存在一个平稳纳什均衡，其中产消者的交易策略与时间及其历史无关。此外，提出了一种无需产消者间信息共享的新型分布式算法，并证明其收敛于ε-纳什均衡。另一方面，在供给侧，电力公司与产消者之间的交互被建模为在线优化问题，其中电力公司的目标是学习其最优能源分配规则。对于这种情况，证明该优化问题具有无遗憾算法，即无论产消者间的实际游戏结果如何，电力公司都可以遵循一种策略，以减少其分配成本，如同事先知道整个需求市场一样。仿真结果展示了所提算法收敛到预测结果的能力，并揭示了前景理论带来的新见解，有助于更高效的智能电网能量管理。

英文摘要

In this paper, the problem of smart grid energy management under stochastic dynamics is investigated. In the considered model, at the demand side, it is assumed that customers can act as prosumers who own renewable energy sources and can both produce and consume energy. Due to the coupling between the prosumers' decisions and the stochastic nature of renewable energy, the interaction among prosumers is formulated as a stochastic game, in which each prosumer seeks to maximize its payoff, in terms of revenues, by controlling its energy consumption and demand. In particular, the subjective behavior of prosumers is explicitly reflected into their payoff functions using prospect theory, a powerful framework that allows modeling real-life human choices. For this prospect-based stochastic game, it is shown that there always exists a stationary Nash equilibrium where the prosumers' trading policies in the equilibrium are independent of the time and their histories of the play. Moreover, a novel distributed algorithm with no information sharing among prosumers is proposed and shown to converge to an $ε$-Nash equilibrium. On the other hand, at the supply side, the interaction between the utility company and the prosumers is formulated as an online optimization problem in which the utility company's goal is to learn its optimal energy allocation rules. For this case, it is shown that such an optimization problem admits a no-regret algorithm meaning that regardless of the actual outcome of the game among the prosumers, the utility company can follow a strategy that mitigates its allocation costs as if it knew the entire demand market a priori. Simulation results show the convergence of the proposed algorithms to their predicted outcomes and present new insights resulting from prospect theory that contribute toward more efficient energy management in the smart grids.


1605.01278 2026-06-04 stat.ML cs.LG cs.SY eess.SY math.DS math.PR 版本更新

A Bayesian Approach to Policy Recognition and State Representation Learning

基于贝叶斯方法的策略识别与状态表示学习

Adrian Šošić, Abdelhak M. Zoubir, Heinz Koeppl

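AI summary: This paper proposes a Bayesian approach that learns arbitrary stochastic expert policies without assuming the expert behaves optimally, and infers the complexity of the state representation used by the expert together with task-appropriate partitionings of the state space.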

Comments 17 pages, 8 figures; ### Version 4 ### to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

详情

DOI: 10.1109/TPAMI.2017.2711024

AI中文摘要

学习从示范（LfD）是通过专家提供的示范构建任务行为模型的过程。这些模型可用于系统控制，通过泛化专家示范到未曾遇到的情况。然而，大多数LfD方法假设专家行为的确定性最优地面真实策略或需要直接监控专家的控制，限制了其在一般系统识别框架中的实际应用。本文考虑了更一般性的LfD问题，允许任意随机专家策略，而不考虑示范的最优性。采用贝叶斯方法，我们建模了能够解释所提供示范数据的全部可能专家控制器的后验分布。此外，我们展示了该方法可以应用于非参数上下文，以推断专家使用的状态表示复杂度，并学习任务相关的系统状态空间划分。

英文摘要

Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used e.g. for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g. they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the expert's controls, which limits their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data. Moreover, we show that our methodology can be applied in a nonparametric context to infer the complexity of the state representation used by the expert, and to learn task-appropriate partitionings of the system state space.


1707.09428 2026-06-04 math.NA cs.LG cs.NA 版本更新

A unified method for super-resolution recovery and real exponential-sum separation

超分辨率恢复与实指数和分离的统一方法

Charles K. Chui, Hrushikesh N. Mhaskar

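AI summary: This paper proposes a unified method for the multivariate super-resolution problem and the blind-source separation of real-valued exponential sums, with applications such as fluorescence microscopy, observational astronomy, and magnetic resonance spectroscopy.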

详情

AI中文摘要

本文受衍射传播光波的启发，提出一个简单的数学模型，用于多变量超分辨率问题和实值指数和的盲源分离问题。该模型促进了本文中两个问题的统一理论和统一解决方案的发展。超分辨率问题的研究旨在应用于荧光显微镜和天文观测，而第二个问题的动机是当前需要从磁共振波谱学中提取多变量指数特征，以帮助神经科医生和放射科医生，并为核化学中的同位素分离提供数学工具。本文介绍的统一方法可通过处理有限数量的数据实现，这些数据在非预先指定的位置采样，计算方案仅包括矩阵-向量乘法、峰值寻找和聚类。

英文摘要

In this paper, motivated by diffraction of traveling light waves, a simple mathematical model is proposed, both for the multivariate super-resolution problem and the problem of blind-source separation of real-valued exponential sums. This model facilitates the development of a unified theory and a unified solution of both problems in this paper. Our consideration of the super-resolution problem is aimed at applications to fluorescence microscopy and observational astronomy, and the motivation for our consideration of the second problem is the current need to extract multivariate exponential features in magnetic resonance spectroscopy (MRS) for the neurologist and radiologist as well as to provide a mathematical tool for isotope separation in Nuclear Chemistry. The unified method introduced in this paper can be easily realized by processing only finitely many data, sampled at locations that are not necessarily prescribed in advance, with a computational scheme consisting only of matrix-vector multiplication, peak finding, and clustering.


1706.03369 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO 版本更新

On the Sampling Problem for Kernel Quadrature

关于核二次求积的采样问题

Francois-Xavier Briol, Chris J. Oates, Jon Cockayne, Wilson Ye Chen, Mark Girolami

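AI summary: This paper examines how the sampling distribution affects the rate constant in kernel quadrature and proposes an automatic approach based on adaptive tempering and sequential Monte Carlo that substantially reduces integration error.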

Comments To appear at Thirty-fourth International Conference on Machine Learning (ICML 2017)

详情

Journal ref: Proceedings of the 34th International Conference on Machine Learning, PMLR 70:586-595, 2017

AI中文摘要

标准的核二次求积方法在随机点集下（也称为贝叶斯蒙特卡罗）以根均方误差收敛，其收敛速率由$s/d$比值决定，其中$s$和$d$分别表示被积函数的光滑性和维度。然而，实证研究显示速率常数$C$对随机点分布高度敏感。与标准蒙特卡罗积分不同，对于核二次求积，使$C$最小的采样分布无闭合形式。本文认为采样分布的实用选择是一个重要开放问题。一种解决方案是基于自适应温度调节和序列蒙特卡罗的自动方法。实证结果表明，该方法可使积分误差降低多达4个数量级。

英文摘要

The standard Kernel Quadrature method for numerical integration with random point sets (also called Bayesian Monte Carlo) is known to converge in root mean square error at a rate determined by the ratio $s/d$, where $s$ and $d$ encode the smoothness and dimension of the integrand. However, an empirical investigation reveals that the rate constant $C$ is highly sensitive to the distribution of the random points. In contrast to standard Monte Carlo integration, for which optimal importance sampling is well-understood, the sampling distribution that minimises $C$ for Kernel Quadrature does not admit a closed form. This paper argues that the practical choice of sampling distribution is an important open problem. One solution is considered: a novel automatic approach based on adaptive tempering and sequential Monte Carlo. Empirical results demonstrate that a reduction in integration error of up to 4 orders of magnitude can be achieved with the proposed method.
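
As background for the sampling question raised above, the basic Kernel Quadrature estimator with random points can be sketched as follows. This is a minimal one-dimensional illustration under assumptions that are not in the abstract: a Gaussian kernel with unit length scale, a standard normal integration measure (for which the kernel mean has a closed form), points drawn from that same measure, and a small `jitter` term for numerical stability. It does not implement the adaptive tempering and sequential Monte Carlo approach the paper proposes.

```python
import numpy as np

def gaussian_kernel(x, y, length_scale):
    diff = x[:, None] - y[None, :]
    return np.exp(-0.5 * (diff / length_scale) ** 2)

def kernel_mean_standard_normal(x, length_scale):
    """Closed form of the integral of k(., x_i) against N(0, 1)."""
    s2 = length_scale ** 2
    return np.sqrt(s2 / (s2 + 1.0)) * np.exp(-0.5 * x ** 2 / (s2 + 1.0))

def kernel_quadrature(f, points, length_scale=1.0, jitter=1e-8):
    """Estimate the integral of f against N(0, 1) with weights w = K^{-1} z."""
    K = gaussian_kernel(points, points, length_scale)
    K += jitter * np.eye(len(points))            # stabilise the linear solve
    z = kernel_mean_standard_normal(points, length_scale)
    weights = np.linalg.solve(K, z)
    return weights @ f(points)

def integrand(x):
    # Smooth test function; its integral against N(0, 1) is 1 / sqrt(2).
    return np.exp(-0.5 * x ** 2)

rng = np.random.default_rng(0)
points = rng.standard_normal(30)                 # sampling distribution = N(0, 1)
estimate = kernel_quadrature(integrand, points)
exact = 1.0 / np.sqrt(2.0)
```

Swapping the line that draws `points` for a different sampling distribution changes the estimator's error without changing the rest of the code, which is the degree of freedom the paper investigates.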


1707.09319 2026-06-04 stat.OT cs.LG cs.NA math.NA 版本更新

A Fourier-invariant method for locating point-masses and computing their attributes

一种用于定位点质量及其属性的傅里叶不变方法

Charles K. Chui, Hrushikesh N. Mhaskar

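AI summary: This paper proposes an effective method for counting point-masses, determining their spatial locations, and computing their attributes, based on Fourier-invariant Hermite moment computations and applicable to both spatial and Fourier data in any dimension.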

1707.08689 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Multi-Robot Transfer Learning: A Dynamical System Perspective

多机器人迁移学习：动态系统视角

Mohamed K. Helwa, Angela P. Schoellig

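AI summary: From a dynamical systems perspective, this paper studies the properties of optimal transfer maps in multi-robot transfer learning and proposes an algorithm that needs no detailed knowledge of the dynamics; experiments on transfer between quadrotor platforms show a 60-70% reduction in transfer learning error.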

Comments 7 pages, 6 figures, accepted at the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems

详情

AI中文摘要

多机器人迁移学习允许一个机器人利用第二个相似机器人生成的数据来改进自身行为。潜在优势是减少训练时间并降低训练阶段不可避免的风险。迁移学习算法旨在找到不同机器人之间的最优转移映射。本文通过单输入单输出（SISO）系统的理论研究，探讨了此类最优转移映射的性质。我们首先证明最优迁移学习映射通常是一个动态系统。本文的主要贡献是提供一种确定该最优动态映射性质的算法，包括其阶数和回归器（即它所依赖的变量）。所提出的算法不需要详细的机器人动力学知识，但依赖于通过简单实验测试可获得的基本系统属性。我们通过两个不同四旋翼平台间的迁移学习示例验证了所提算法。实验结果表明，通过我们的算法获得的最优动态映射在减少迁移学习误差方面比直接转移数据或使用最优静态映射的情况减少了60-70%。

英文摘要

Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots' dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.


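1707.08369 2026-06-04 cs.LG cs.NA math.NA version update

Updating Singular Value Decomposition for Rank One Matrix Perturbation

Ratnik Gandhi, Amoli Rajgor

AI summary: This paper proposes an efficient algorithm that updates the singular value decomposition of a rank-one perturbed matrix in O(n² log(1/ε)) time, using the fast multipole method to update singular vectors in O(n log(1/ε)) time.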

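1502.02609 2026-06-04 eess.SY cs.LG cs.SY math.OC version update

Efficient model-based reinforcement learning for approximate online optimal

Rushikesh Kamalapurkar, Joel A. Rosenfeld, Warren E. Dixon

AI summary: This paper proposes an online approximate optimal control strategy based on a state-following kernel method, approximating the value function over a small local neighborhood to achieve efficient control with stability and optimality.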

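1610.06283 2026-06-04 cs.RO cs.LG cs.NE cs.SY eess.SY version update

Deep Neural Networks for Improved, Impromptu Trajectory Tracking of Quadrotors

Qiyang Li, Jingxing Qian, Zining Zhu, Xuchan Bao, Mohamed K. Helwa, Angela P. Schoellig

AI summary: This paper proposes a deep-neural-network-based algorithm that improves the trajectory tracking performance of a classical feedback controller by supplying tailored reference inputs; experiments show that the method effectively reduces tracking error and suits real-time trajectory tracking applications.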

Comments 7 pages, 8 figures. Accepted final version. To appear in the proc. of the 2017 IEEE International Conference on Robotics and Automation

Details

Abstract

Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive "fly-as-you-draw" application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method's potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs' capability of generalizing knowledge.


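1508.03332 2026-06-04 math.NA cs.LG cs.MA cs.NA math.DS stat.ML version update

Dimensionality Reduction of Collective Motion by Principal Manifolds

Kelum Gajamannage, Sachit Butail, Maurizio Porfiri, Erik M. Bollt

AI summary: This paper proposes constructing a two-dimensional principal manifold with cubic smoothing splines for dimensionality reduction of collective motion data; the approach preserves the original structure and outperforms existing nonlinear dimensionality reduction methods.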

Comments 19 pages, 13 figures, journal article

Details

DOI: 10.1016/j.physd.2014.09.009
Journal ref: Physica-D : Nonlinear Phenomena, Volume 291, 15 January 2015, Pages 62-73

Abstract

While the existence of low-dimensional embedding manifolds has been shown in patterns of collective motion, the current battery of nonlinear dimensionality reduction methods is not amenable to the analysis of such manifolds. This is mainly due to the necessary spectral decomposition step, which limits control over the mapping from the original high-dimensional space to the embedding space. Here, we propose an alternative approach that demands a two-dimensional embedding which topologically summarizes the high-dimensional data. In this sense, our approach is closely related to the construction of one-dimensional principal curves that minimize orthogonal error to data points subject to smoothness constraints. Specifically, we construct a two-dimensional principal manifold directly in the high-dimensional space using cubic smoothing splines, and define the embedding coordinates in terms of geodesic distances. Thus, the mapping from the high-dimensional data to the manifold is defined in terms of local coordinates. Through representative examples, we show that compared to existing nonlinear dimensionality reduction methods, the principal manifold retains the original structure even in noisy and sparse datasets. The principal manifold finding algorithm is applied to configurations obtained from a dynamical system of multiple agents simulating a complex maneuver called predator mobbing, and the resulting two-dimensional embedding is compared with that of a well-established nonlinear dimensionality reduction method.


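1707.05828 2026-06-04 cs.LG cs.NA math.NA version update

A deep learning approach to diabetic blood glucose prediction

H. N. Mhaskar, S. V. Pereverzyev, M. D. van der Walt

AI summary: This paper proposes using deep learning to predict the blood glucose of diabetic patients 30 minutes ahead; the model is trained on data from a subset of patients, the results support the advantage of deep learning on this task, and the paper shows how domain knowledge can be used to build a parsimonious deep representation.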

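1605.07246 2026-06-04 cs.LG cs.AI cs.NA math.NA version update

Adaptive ADMM with Spectral Penalty Parameter Selection

Zheng Xu, Mario A. T. Figueiredo, Tom Goldstein

AI summary: This paper proposes an adaptive ADMM algorithm that automatically tunes the penalty parameter to achieve fast convergence, improving robustness and ease of use.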

Comments AISTATS 2017

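1611.03220 2026-06-04 math.NA cs.DS cs.LG cs.NA version update

Faster Kernel Ridge Regression Using Sketching and Preconditioning

Haim Avron, Kenneth L. Clarkson, David P. Woodruff

AI summary: This paper proposes accelerating the solution of the kernel ridge regression linear system with random feature maps and preconditioning; an effective preconditioner improves computational efficiency on large-scale datasets.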

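1707.03340 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

Deep Learning for Real Time Crime Forecasting

Bao Wang, Duo Zhang, Duanhao Zhang, P. Jeffery Brantingham, Andrea L. Bertozzi

AI summary: This paper proposes a deep-learning-based spatio-temporal forecasting model that uses spatial and temporal regularization and a residual convolutional structure to improve the accuracy of predicted crime distributions in Los Angeles.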

Comments 4 pages, 6 figures, NOLTA, 2017

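1707.03092 2026-06-04 eess.SY cs.LG cs.RO cs.SY version update

A Separation-Based Design to Data-Driven Control for Large-Scale Partially Observed Systems

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

AI summary: This paper studies partially observed stochastic optimal control problems whose state dynamics are governed by partial differential equations (PDEs); it solves an open-loop deterministic trajectory optimization problem using a black-box simulation model and designs a linear quadratic Gaussian controller from input-output experimental data.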

Comments 3 pages, 6 figures, In Robotics: Science and Systems (RSS) 2017 Workshop of "POMDPs in Robotics: State of The Art, Challenges, and Opportunities"

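1707.02670 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA stat.ML version update

Accelerated Stochastic Power Iteration

Christopher De Sa, Bryan He, Ioannis Mitliagkas, Christopher Ré, Peng Xu

AI summary: This paper proposes a variant of power iteration with a momentum term that attains optimal sample and iteration complexity; the resulting stochastic PCA algorithms work in both online and offline settings and accelerate the iteration complexity to O(1/√Δ).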

Comments 37 pages, 5 figures

Details

Abstract

Principal component analysis (PCA) is one of the most powerful tools in machine learning. The simplest method for PCA, the power iteration, requires $\mathcal{O}(1/\Delta)$ full-data passes to recover the principal component of a matrix with eigen-gap $\Delta$. Lanczos, a significantly more complex method, achieves an accelerated rate of $\mathcal{O}(1/\sqrt{\Delta})$ passes. Modern applications, however, motivate methods that only ingest a subset of available data, known as the stochastic setting. In the online stochastic setting, simple algorithms like Oja's iteration achieve the optimal sample complexity $\mathcal{O}(\sigma^2/\Delta^2)$. Unfortunately, they are fully sequential, and also require $\mathcal{O}(\sigma^2/\Delta^2)$ iterations, far from the $\mathcal{O}(1/\sqrt{\Delta})$ rate of Lanczos. We propose a simple variant of the power iteration with an added momentum term that achieves both the optimal sample and iteration complexity. In the full-pass setting, standard analysis shows that momentum achieves the accelerated rate, $\mathcal{O}(1/\sqrt{\Delta})$. We demonstrate empirically that naively applying momentum to a stochastic method does not result in acceleration. We perform a novel, tight variance analysis that reveals the "breaking-point variance" beyond which this acceleration does not occur. By combining this insight with modern variance reduction techniques, we construct stochastic PCA algorithms, for the online and offline setting, that achieve an accelerated iteration complexity $\mathcal{O}(1/\sqrt{\Delta})$. Due to the embarrassingly parallel nature of our methods, this acceleration translates directly to wall-clock time if deployed in a parallel environment. Our approach is very general, and applies to many non-convex optimization problems that can now be accelerated using the same technique.
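
The full-pass momentum idea in this abstract can be illustrated with a small sketch. It is not the authors' code: the function names, the 2x2 test matrix, and the choice `beta=0.25` are assumptions made for illustration (a momentum weight near λ₂²/4, with λ₂ the second-largest eigenvalue, is the usual tuning for this kind of recurrence, and here λ₂ = 1). The recurrence used is w_{t+1} = A w_t − β w_{t−1}, with both iterates rescaled by the same factor at each step. The stochastic, variance-reduced variants described in the abstract are not shown.

```python
import math

def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def power_iteration_momentum(A, beta, w0, iters=100):
    """Run w_{t+1} = A w_t - beta * w_{t-1}, rescaling both iterates each step."""
    w_prev = [0.0] * len(w0)
    w = list(w0)
    for _ in range(iters):
        Aw = matvec(A, w)
        w_next = [a - beta * p for a, p in zip(Aw, w_prev)]
        scale = math.sqrt(sum(v * v for v in w_next))
        # Dividing both iterates by the same factor keeps the recurrence linear.
        w_prev = [v / scale for v in w]
        w = [v / scale for v in w_next]
    return w

def rayleigh_quotient(A, w):
    """Eigenvalue estimate w^T A w / w^T w."""
    Aw = matvec(A, w)
    return sum(a * b for a, b in zip(Aw, w)) / sum(v * v for v in w)

# Hypothetical test matrix with eigenvalues 3 and 1; top eigenvector is (1, 1)/sqrt(2).
A = [[2.0, 1.0], [1.0, 2.0]]
w = power_iteration_momentum(A, beta=0.25, w0=[1.0, 0.0])
top_eigenvalue = rayleigh_quotient(A, w)
```

Setting `beta=0.0` recovers plain power iteration, which makes it easy to compare the two on matrices with a small eigen-gap.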


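1707.02515 2026-06-04 cs.AI cs.LG cs.SY eess.SY version update

A Fast Integrated Planning and Control Framework for Autonomous Driving via Imitation Learning

Liting Sun, Cheng Peng, Wei Zhan, Masayoshi Tomizuka

AI summary: This paper proposes a two-layer framework combining learning- and optimization-based methods: a neural network learns the long-term optimal policy and a short-term optimization-based controller improves the safety and efficiency of autonomous driving.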

Details

Abstract

For safe and efficient planning and control in autonomous driving, we need a driving policy which can achieve desirable driving quality over a long-term horizon with guaranteed safety and feasibility. Optimization-based approaches, such as Model Predictive Control (MPC), can provide such optimal policies, but their computational complexity is generally unacceptable for real-time implementation. To address this problem, we propose a fast integrated planning and control framework that combines learning- and optimization-based approaches in a two-layer hierarchical structure. The first layer, defined as the "policy layer", is established by a neural network which learns the long-term optimal driving policy generated by MPC. The second layer, called the "execution layer", is a short-term optimization-based controller that tracks the reference trajectories given by the "policy layer" with guaranteed short-term safety and feasibility. Moreover, with efficient and highly-representative features, a small-size neural network is sufficient in the "policy layer" to handle many complicated driving scenarios. This enables online imitation learning with Dataset Aggregation (DAgger), so that the performance of the "policy layer" can be improved rapidly and continuously online. Several example driving scenarios are demonstrated to verify the effectiveness and efficiency of the proposed framework.


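1707.02201 2026-06-04 cs.RO cs.LG cs.SY eess.SY version update

Learning human behaviors from motion capture by adversarial imitation

Josh Merel, Yuval Tassa, Dhruva TB, Sriram Srinivasan, Jay Lemmon, Ziyu Wang, Greg Wayne, Nicolas Heess

AI summary: This paper uses generative adversarial imitation learning to train neural network policies that produce humanlike movement patterns from limited, partially observed state features; even when the demonstrations come from a body with different physical parameters, tasks can be solved through sub-skill policies.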

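1707.01945 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Simple Classification using Binary Data

Deanna Needell, Rayan Saab, Tina Woolf

AI summary: This paper studies classification from binary data, proposes a framework with low computational and resource cost, and validates its effectiveness through experiments and theoretical analysis.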

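1707.01322 2026-06-04 cs.LG cs.LO cs.SY eess.SY version update

Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes

Elizabeth Polgreen, Viraj Wijesuriya, Sofie Haesaert, Alessandro Abate

AI summary: This paper proposes a new method for statistical verification using parametric models and experimental data: parameter synthesis determines the set of feasible parameters, experiments are actively synthesized to make the data more informative, and the information is propagated to obtain a verification result.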

Comments QEST 2017, 18 pages, 7 figures

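1609.00932 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.PR physics.data-an version update

Spectral learning of dynamic systems from nonequilibrium data

Hao Wu, Frank Noé

AI summary: This paper studies the properties of spectral learning without assuming identically distributed data, shows that imposing an equilibrium constraint allows a system's equilibrium dynamics to be extracted from nonequilibrium observation data, and proposes a binless extension for continuous data that achieves consistent estimation with linear complexity.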

Details

Journal ref: Proceedings of the 29th conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016, pp. 4179-4187

Abstract

Observable operator models (OOMs) and related models are one of the most important and powerful tools for modeling and analyzing stochastic systems. They exactly describe dynamics of finite-rank systems and can be efficiently and consistently estimated through spectral learning under the assumption of identically distributed data. In this paper, we investigate the properties of spectral learning without this assumption due to the requirements of analyzing large-time scale systems, and show that the equilibrium dynamics of a system can be extracted from nonequilibrium observation data by imposing an equilibrium constraint. In addition, we propose a binless extension of spectral learning for continuous data. In comparison with the other continuous-valued spectral algorithms, the binless algorithm can achieve consistent estimation of equilibrium dynamics with only linear complexity.


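1706.02869 2026-06-04 cs.LG cs.NA cs.SY eess.SY math.NA version update

Adaptive Consensus ADMM for Distributed Optimization

Zheng Xu, Gavin Taylor, Hao Li, Mario Figueiredo, Xiaoming Yuan, Tom Goldstein

AI summary: This paper proposes an adaptive consensus ADMM method that tunes parameters for each node individually to improve distributed optimization performance, and proves an O(1/k) convergence rate.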

Comments ICML 2017

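1602.07764 2026-06-04 cs.AI cs.LG cs.NA math.NA math.OC stat.ML version update

Reinforcement Learning of POMDPs using Spectral Methods

Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar

AI summary: This paper proposes a reinforcement learning algorithm for POMDPs based on spectral decomposition: it learns the model parameters from trajectories, uses an optimization oracle to obtain the optimal memoryless policy, and proves an order-optimal regret bound with respect to the optimal memoryless policy together with efficient scaling in the dimensionality of the observation and action spaces.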

Details

Journal ref: 29th Annual Conference on Learning Theory, PMLR 49:193-256, 2016

Abstract

We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDP) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive) latent variable models such as hidden Markov models, POMDPs are more challenging since the learner interacts with the environment and possibly changes the future observations in the process. We devise a learning algorithm running through episodes, in each episode we employ spectral techniques to learn the POMDP parameters from a trajectory generated by a fixed policy. At the end of the episode, an optimization oracle returns the optimal memoryless planning policy which maximizes the expected reward based on the estimated POMDP model. We prove an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces.


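1608.05754 2026-06-04 math.NA cs.LG cs.NA version update

Fast estimation of approximate matrix ranks using spectral densities

Shashanka Ubaru, Yousef Saad, Abd-Krim Seghouane

AI summary: This paper proposes two inexpensive methods that estimate the approximate rank of large data matrices from spectral densities, one based on Chebyshev polynomials and one on the Lanczos algorithm; a plot of the spectral density is used to locate the gap between noise-related and relevant eigenvalues, and the methods are evaluated on matrices from typical applications.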

Details

DOI: 10.1162/NECO_a_00951
Journal ref: Neural Computation, Vol. 29, No. 5, pp. 1317-1351 (May 2017)

Abstract

In many machine learning and data related applications, it is required to have the knowledge of approximate ranks of large data matrices at hand. In this paper, we present two computationally inexpensive techniques to estimate the approximate ranks of such large matrices. These techniques exploit approximate spectral densities, popular in physics, which are probability density distributions that measure the likelihood of finding eigenvalues of the matrix at a given point on the real line. Integrating the spectral density over an interval gives the eigenvalue count of the matrix in that interval. Therefore the rank can be approximated by integrating the spectral density over a carefully selected interval. Two different approaches are discussed to estimate the approximate rank, one based on Chebyshev polynomials and the other based on the Lanczos algorithm. In order to obtain the appropriate interval, it is necessary to locate a gap between the eigenvalues that correspond to noise and the relevant eigenvalues that contribute to the matrix rank. A method for locating this gap and selecting the interval of integration is proposed based on the plot of the spectral density. Numerical experiments illustrate the performance of these techniques on matrices from typical applications.
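
The idea of counting eigenvalues in an interval via a polynomial expansion can be sketched as follows. This is an illustrative toy rather than the paper's method: it assumes a symmetric matrix whose spectrum already lies in [-1, 1], expands the indicator function of [a, b] in Chebyshev polynomials without any damping, and computes the trace exactly with unit vectors instead of the random probe vectors that would be used on large matrices. The function name, the test matrix, and the interval [0.4, 1.0] are all hypothetical.

```python
import math

def matvec(A, x):
    """Multiply a dense matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def chebyshev_eig_count(A, a, b, degree=200):
    """Estimate how many eigenvalues of symmetric A lie in [a, b].

    Assumes the spectrum of A is contained in [-1, 1] and -1 <= a < b <= 1.
    """
    n = len(A)
    ta, tb = math.acos(a), math.acos(b)
    # Chebyshev coefficients of the indicator function of [a, b].
    gamma = [(ta - tb) / math.pi] + [
        2.0 * (math.sin(k * ta) - math.sin(k * tb)) / (math.pi * k)
        for k in range(1, degree + 1)
    ]
    total = 0.0
    for i in range(n):
        v = [1.0 if j == i else 0.0 for j in range(n)]
        t_prev = v                    # T_0(A) v
        t_curr = matvec(A, v)         # T_1(A) v
        total += gamma[0] * v[i] + gamma[1] * t_curr[i]
        for k in range(2, degree + 1):
            # Three-term recurrence: T_k(A) v = 2 A T_{k-1}(A) v - T_{k-2}(A) v
            At = matvec(A, t_curr)
            t_next = [2.0 * x - y for x, y in zip(At, t_prev)]
            total += gamma[k] * t_next[i]   # e_i^T T_k(A) e_i
            t_prev, t_curr = t_curr, t_next
    return total

# Hypothetical matrix: three "relevant" eigenvalues and three near-zero "noise" ones.
eigs = [0.9, 0.8, 0.7, 0.02, 0.01, -0.01]
A = [[eigs[i] if i == j else 0.0 for j in range(6)] for i in range(6)]
rank_estimate = chebyshev_eig_count(A, 0.4, 1.0)
```

The left endpoint 0.4 is placed by hand inside the gap between the two eigenvalue groups; the abstract describes choosing this point from the plot of the spectral density, which the sketch does not attempt.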


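1405.6341 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY version update

Efficient Model Learning for Human-Robot Collaborative Tasks

Stefanos Nikolaidis, Keren Gu, Ramya Ramakrishnan, Julie Shah

AI summary: This paper proposes a framework that learns human user models from joint-action demonstrations so that a robot can automatically compute a robust collaboration policy. Unsupervised learning clusters the action sequences, inverse reinforcement learning recovers a reward function for each cluster, and the learned model is used in a mixed observability Markov decision process to infer the type of a new user and compute a policy.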

Details

DOI: 10.1145/2696454.2696455
Journal ref: Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI 2015)

Abstract

We present a framework for learning human user models from joint-action demonstrations that enables the robot to compute a robust policy for a collaborative task with a human. The learning takes place completely automatically, without any human intervention. First, we describe the clustering of demonstrated action sequences into different human types using an unsupervised learning algorithm. These demonstrated sequences are also used by the robot to learn a reward function that is representative for each type, through the employment of an inverse reinforcement learning algorithm. The learned model is then used as part of a Mixed Observability Markov Decision Process formulation, wherein the human type is a partially observable variable. With this framework, we can infer, either offline or online, the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this new user and will be robust to deviations of the human actions from prior demonstrations. Finally we validate the approach using data collected in human subject experiments, and conduct proof-of-concept demonstrations in which a person performs a collaborative task with a small industrial robot.


1706.04097 2026-06-04 cs.LG cs.DS cs.NA math.NA stat.ML version update

Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations

Yuanzhi Li, Yingyu Liang

AI summary: This paper proposes a simple alternating gradient descent algorithm, proves that it recovers the ground-truth feature matrix under strong correlations, and shows its robustness to noise.

Comments Accepted to the International Conference on Machine Learning (ICML), 2017

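1702.07944 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.OC stat.ML version update

Stochastic Variance Reduction Methods for Policy Evaluation

Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

AI summary: This paper addresses policy evaluation with linear function approximation by reformulating the empirical policy evaluation problem as a quadratic convex-concave saddle-point problem, and designs a primal-dual batch gradient method and two stochastic variance reduction algorithms that scale linearly and converge linearly.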

Comments Accepted by ICML 2017

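1706.00241 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Krylov Subspace Recycling for Fast Iterative Least-Squares in Machine Learning

Filip de Roos, Philipp Hennig

AI summary: This paper studies Krylov subspace recycling to speed up the solution of symmetric positive definite linear problems in machine learning, iteratively refining a low-rank approximation to trade off computational cost against numerical precision.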

Details

Abstract

Solving symmetric positive definite linear problems is a fundamental computational task in machine learning. The exact solution, famously, is cubically expensive in the size of the matrix. To alleviate this problem, several linear-time approximations, such as spectral and inducing-point methods, have been suggested and are now in wide use. These are low-rank approximations that choose the low-rank space a priori and do not refine it over time. While this allows linear cost in the data-set size, it also causes a finite, uncorrected approximation error. Authors from numerical linear algebra have explored ways to iteratively refine such low-rank approximations, at a cost of a small number of matrix-vector multiplications. This idea is particularly interesting in the many situations in machine learning where one has to solve a sequence of related symmetric positive definite linear problems. From the machine learning perspective, such deflation methods can be interpreted as transfer learning of a low-rank approximation across a time-series of numerical tasks. We study the use of such methods for our field. Our empirical results show that, on regression and classification problems of intermediate size, this approach can interpolate between low computational cost and numerical precision.


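1705.10596 2026-06-04 math.NA cs.LG cs.NA version update

Approximation learning methods of Harmonic Mappings in relation to Hardy Spaces

Zhulin Liu, C. L. Philip Chen

AI summary: This paper proposes a Hardy space approach to Dirichlet-type problems based on Tikhonov regularization and reproducing kernel Hilbert spaces; exploiting the reproducing property of functions in the Hardy space simplifies the optimization operator and yields an efficient algorithm, and the paper also discusses the distinctive properties and effectiveness of harmonic mappings in applications such as image processing.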

Comments 2016 3rd International Conference on Informative and Cybernetics for Computational Social Systems (ICCSS)

Details

DOI: 10.1109/ICCSS.2016.7586421

Abstract

This paper discusses a new Hardy space approach to Dirichlet-type problems based on Tikhonov regularization and reproducing kernel Hilbert spaces; the problem turns out to be a typical extremal problem on the upper half of the complex plane. When it is considered in the Hardy space, the optimization operator is greatly simplified and an efficient algorithm becomes possible. This is achieved mainly through the reproducing property of functions in the Hardy space of the upper half-plane, and the detailed algorithm is proposed. Moreover, harmonic mappings, an important class of geometric transformations, are commonly used in applications such as image processing, since they describe energy-minimizing mappings between individual manifolds. In particular, for planar mappings between two Euclidean planar regions, the harmonic mapping exists and is unique, which is guaranteed by the existence of harmonic functions. This property is attractive, and simulation results are presented to demonstrate its use in applications such as planar shape distortion and surface registration.


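1705.10152 2026-06-04 math.OC cs.LG cs.NA math.AG math.NA version update

Tangent Cones to TT Varieties

Benjamin Kutschan

AI summary: This paper studies a parametrization of the Bouligand tangent cone of TT varieties, discusses its generalization to binary hierarchical formats, and gives an orthogonal sum parametrization of the tangent cone together with an implicit description by a system of polynomial equations.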

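1705.09761 2026-06-04 eess.SY cs.LG cs.SY version update

Stochastic Feedback Control of Systems with Unknown Nonlinear Dynamics

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

AI summary: This paper studies the stochastic optimal control problem for systems with unknown dynamics: an open-loop deterministic trajectory optimization is combined with an LQG controller that keeps the state close to the optimal trajectory, and the trajectory-dependent linearized system is identified from input-output experimental data.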

Comments 7 pages, 7 figures, submitted to 56th IEEE Conference on Decision and Control (CDC), 2017

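1607.01231 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML version update

Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization

Xiao Wang, Shiqian Ma, Donald Goldfarb, Wei Liu

AI summary: This paper studies stochastic quasi-Newton methods for nonconvex stochastic optimization: it proposes a general framework, proves convergence, analyzes the worst-case iteration complexity, introduces a stochastic damped L-BFGS method, combines it with the SVRG technique, and reports numerical results on nonconvex binary and multiclass classification problems.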

Comments published in SIAM Journal on Optimization

Details

Abstract

In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that noisy information about the gradients of the objective function is available via a stochastic first-order oracle (SFO). We propose a general framework for such methods, for which we prove almost sure convergence to stationary points and analyze its worst-case iteration complexity. When a randomly chosen iterate is returned as the output of such an algorithm, we prove that in the worst-case, the SFO-calls complexity is $O(ε^{-2})$ to ensure that the expectation of the squared norm of the gradient is smaller than the given accuracy tolerance $ε$. We also propose a specific algorithm, namely a stochastic damped L-BFGS (SdLBFGS) method, that falls under the proposed framework. Moreover, we incorporate the SVRG variance reduction technique into the proposed SdLBFGS method, and analyze its SFO-calls complexity. Numerical results on a nonconvex binary classification problem using SVM, and a multiclass classification problem using neural networks are reported.


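1705.05475 2026-06-04 cs.LG cs.NA cs.NE math.NA q-bio.NC version update

Sparse Coding by Spiking Neural Networks: Convergence Theory and Computational Results

Ping Tak Peter Tang, Tsung-Han Lin, Mike Davies

AI summary: This paper proposes a spiking neural network model and proves that it reliably solves the sparse coding problem, providing theoretical guarantees for computers with non-von Neumann architectures.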

Comments 13 pages, 3 figures

1705.05116 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY version update

Tuning Modular Networks with Weighted Losses for Hand-Eye Coordination

Fangyi Zhang, Jürgen Leitner, Michael Milford, Peter I. Corke

AI summary: This paper proposes an end-to-end fine-tuning method that uses weighted losses to improve the hand-eye coordination of a modular deep visuo-motor policy on a planar grasping task.

Comments 2 pages, to appear in the Deep Learning for Robotic Vision (DLRV) Workshop in CVPR 2017

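1702.02453 2026-06-04 cs.LG cs.RO cs.SY eess.SY version update

Preparing for the Unknown: Learning a Universal Policy with Online System Identification

Wenhao Yu, Jie Tan, C. Karen Liu, Greg Turk

AI summary: This paper proposes a method for learning a universal policy: online system identification and a large number of training examples make the policy robust under unknown dynamic models, so it applies across a range of dynamic models and environmental changes.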

Comments Accepted as a conference paper at RSS 2017

Details

Abstract

We present a new method of learning control policies that successfully operate under unknown dynamic models. We create such policies by leveraging a large number of training examples that are generated using a physical simulator. Our system is made of two components: a Universal Policy (UP) and a function for Online System Identification (OSI). We describe our control policy as universal because it is trained over a wide array of dynamic models. These variations in the dynamic model may include differences in mass and inertia of the robots' components, variable friction coefficients, or unknown mass of an object to be manipulated. By training the Universal Policy with this variation, the control policy is prepared for a wider array of possible conditions when executed in an unknown environment. The second part of our system uses the recent state and action history of the system to predict the dynamics model parameters mu. The value of mu from the Online System Identification is then provided as input to the control policy (along with the system state). Together, UP-OSI is a robust control policy that can be used across a wide range of dynamic models, and that is also responsive to sudden changes in the environment. We have evaluated the performance of this system on a variety of tasks, including the problem of cart-pole swing-up, the double inverted pendulum, locomotion of a hopper, and block-throwing of a manipulator. UP-OSI is effective at these tasks across a wide range of dynamic models. Moreover, when tested with dynamic models outside of the training range, UP-OSI outperforms the Universal Policy alone, even when UP is given the actual value of the model dynamics. In addition to the benefits of creating more robust controllers, UP-OSI also holds out promise of narrowing the Reality Gap between simulated and real physical systems.


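1702.04077 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Mutual Kernel Matrix Completion

Tsuyoshi Kato, Rachelle Rivero

AI summary: This paper proposes the Mutual Kernel Matrix Completion algorithm, which combines data fusion and kernel matrix completion to better recover missing kernel matrix entries for classification tasks on biological data.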

Comments 10 pages, 4 figures

Details

Abstract

With the huge influx of various data nowadays, extracting knowledge from them has become an interesting but tedious task among data scientists, particularly when the data come in heterogeneous form and have missing information. Many data completion techniques have been introduced, especially with the advent of kernel methods. However, among the many data completion techniques available in the literature, studies about mutually completing several incomplete kernel matrices have not been given much attention yet. In this paper, we present a new method, called the Mutual Kernel Matrix Completion (MKMC) algorithm, that tackles this problem of mutually inferring the missing entries of multiple kernel matrices by combining the notions of data fusion and kernel matrix completion, applied to biological data sets to be used for a classification task. We first introduce an objective function that is minimized by exploiting the EM algorithm, which in turn results in an estimate of the missing entries of the kernel matrices involved. The completed kernel matrices are then combined to produce a model matrix that can be used to further improve the obtained estimates. An interesting result of our study is that the E-step and the M-step are given in closed form, which makes our algorithm efficient in terms of time and memory. After completion, the (completed) kernel matrices are then used to train an SVM classifier to test how well the relationships among the entries are preserved. Our empirical results show that the proposed algorithm bested the traditional completion techniques in preserving the relationships among the data points, and in accurately recovering the missing kernel matrix entries. At present, MKMC offers a promising solution to the problem of mutual estimation of a number of relevant incomplete kernel matrices.


1705.02891 2026-06-04 stat.CO cs.LG cs.NA hep-lat math.NA stat.ML 版本更新

Geometry and Dynamics for Markov Chain Monte Carlo

马尔可夫链蒙特卡洛的几何与动力学

Alessandro Barp, Francois-Xavier Briol, Anthony D. Kennedy, Mark Girolami

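AI summary: This review surveys the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians and machine learners, and discusses the most recent advances in the field.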

Comments Submitted to "Annual Review of Statistics and Its Applications"

详情

AI中文摘要

马尔可夫链蒙特卡洛方法已革新了数学计算，并在许多以前无法处理的模型中实现了统计推断。在此背景下，哈密顿动力学被提出作为高效构建链的方法，以高效探索概率密度。该方法源自物理和几何，并且这些联系已通过一系列作者三十年的研究被广泛研究。然而，目前用户对方法的直觉和知识与我们对这些理论基础的深入理解之间存在差距。本文的目的是为统计学家、机器学习者及其他方法使用者提供一个全面的介绍，使他们能够在仅具备基本蒙特卡洛方法知识的情况下理解这些几何工具。这将通过讨论该领域最近期的进展来补充，我们相信这些进展将对应用科学家越来越相关。

英文摘要

Markov Chain Monte Carlo methods have revolutionised mathematical computation and enabled statistical inference within many previously intractable models. In this context, Hamiltonian dynamics have been proposed as an efficient way of building chains which can explore probability densities efficiently. The method emerges from physics and geometry and these links have been extensively studied by a series of authors through the last thirty years. However, there is currently a gap between the intuitions and knowledge of users of the methodology and our deep understanding of these theoretical foundations. The aim of this review is to provide a comprehensive introduction to the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians, machine learners and other users of the methodology with only a basic understanding of Monte Carlo methods. This will be complemented with some discussion of the most recent advances in the field which we believe will become increasingly relevant to applied scientists.


1605.00609 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA stat.ML 版本更新

Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions

高维空间中包含交互项的稀疏加法模型的学习算法

Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner, Andreas Krause

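AI summary: This paper proposes algorithms for learning additive models with sparse interaction terms in high dimensions, using compressed-sensing methods to recover the model structure with guaranteed error bounds.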

Comments To appear in Information and Inference: A Journal of the IMA. Made following changes after review process: (a) Corrected typos throughout the text. (b) Corrected choice of sampling distribution in Section 5, see eqs. (5.2), (5.3). (c) More detailed comparison with existing work in Section 8. (d) Added Section B in appendix on roots of cubic equation

详情

AI中文摘要

一个函数$f: \mathbb{R}^d \rightarrow \mathbb{R}$是稀疏加法模型（SPAM），如果其形式为$f(\mathbf{x}) = \sum_{l \in \mathcal{S}}ϕ_{l}(x_l)$，其中$\mathcal{S} \subset [d]$，且$|\mathcal{S}| \ll d$。假设$ϕ$和$\mathcal{S}$未知，已有大量工作致力于从样本中估计$f$。本文考虑了一种广义的SPAMs，允许存在少量的二次交互项。对于某些$\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$，其中$|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$，函数$f$现在被假设为形式：$\sum_{p \in \mathcal{S}_1}ϕ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}ϕ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$。假设我们能够任意查询$f$的域内任意点，我们推导出高效的算法，能够以有限样本界证明恢复$\mathcal{S}_1,\mathcal{S}_2$。我们的分析涵盖了无噪声设置，即获得精确的$f$样本，也扩展到有噪声设置，其中查询被噪声污染。特别是对于有噪声设置，我们考虑了两种噪声模型：独立同分布高斯噪声和任意但有界的噪声。我们的主要方法依赖于稀疏Hessian矩阵的估计，为此我们提供了两种新的压缩感知方案。一旦$\mathcal{S}_1, \mathcal{S}_2$已知，我们展示了如何通过额外的$f$查询估计个体组件$ϕ_p$, $ϕ_{(l,l^{\prime})}$，并保证均匀误差界。最后，我们通过合成数据的模拟结果验证了我们的理论发现。

英文摘要

A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}ϕ_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $ϕ$'s, $\mathcal{S}$ to be unknown, there exists extensive work for estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs, that also allows for the presence of a sparse number of second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, with $|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$, the function $f$ is now assumed to be of the form: $\sum_{p \in \mathcal{S}_1}ϕ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}ϕ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $\mathcal{S}_1,\mathcal{S}_2$ with finite sample bounds. Our analysis covers the noiseless setting where exact samples of $f$ are obtained, and also extends to the noisy setting where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models namely: i.i.d Gaussian noise and arbitrary but bounded noise. Our main methods for identification of $\mathcal{S}_2$ essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed sensing based schemes. Once $\mathcal{S}_1, \mathcal{S}_2$ are known, we show how the individual components $ϕ_p$, $ϕ_{(l,l^{\prime})}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.
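
To illustrate why Hessian estimation can identify $\mathcal{S}_2$: for a function of the form above, the mixed partial derivative with respect to $x_l$ and $x_{l^{\prime}}$ vanishes identically unless $(l,l^{\prime}) \in \mathcal{S}_2$. The sketch below checks this by brute-force finite differences on a toy function. The toy `f`, the query point, the step `h`, and the helper `mixed_partial` are all assumptions made for illustration; exhaustive pairwise differencing is not the compressed-sensing scheme the abstract describes, and a single query point could in principle miss an interaction whose mixed partial happens to vanish there.

```python
import math

def f(x):
    # Toy model (hypothetical): S_1 = {0, 3}, S_2 = {(1, 2)}
    return math.sin(x[0]) + x[3] ** 2 + x[1] * x[2]

def mixed_partial(func, x, i, j, h=1e-3):
    """Forward finite-difference estimate of d^2 func / (dx_i dx_j) at x."""
    def shifted(*idx):
        y = list(x)
        for k in idx:
            y[k] += h
        return func(y)
    return (shifted(i, j) - shifted(i) - shifted(j) + func(x)) / (h * h)

point = [0.3, -0.7, 1.2, 0.5]
d = len(point)
# Keep the coordinate pairs whose estimated mixed partial is clearly nonzero
detected = [(i, j) for i in range(d) for j in range(i + 1, d)
            if abs(mixed_partial(f, point, i, j)) > 1e-3]
print(detected)
```

This costs on the order of $d^2$ queries, which is what a sparse-recovery approach to the Hessian is meant to avoid when $|\mathcal{S}_2| \ll d^2$.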


1704.07669 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Single-Pass PCA of Large High-Dimensional Data

大规模高维数据的单次PCA处理

Wenjian Yu, Yu Gu, Jian Li, Shenghua Liu, Yaohang Li

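AI summary: This paper proposes a single-pass randomized algorithm for PCA of large high-dimensional data, suited to data held in slow storage or generated as a stream. Experiments confirm its accuracy: its error is several orders of magnitude smaller than that of existing algorithms, and it can compute 50 principal components within 24 minutes.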

Comments IJCAI 2017, 16 pages, 6 figures

1608.04773 2026-06-04 stat.ML cs.DS cs.LG cs.NA math.NA math.OC 版本更新

Faster Principal Component Regression and Stable Matrix Chebyshev Approximation

更快的主成分回归与稳定的矩阵切比雪夫逼近

Zeyuan Allen-Zhu, Yuanzhi Li

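AI summary: This paper proposes an algorithm for principal component regression that reduces the number of black-box calls, achieves 1+γ accuracy, and does not construct the principal components explicitly, making it suitable for large-scale data. It also develops a stable recurrence for matrix Chebyshev polynomials and an optimal polynomial approximation of the matrix sign function.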

Comments title changed and minor revisions

1704.06803 2026-06-04 cs.LG cs.IR cs.NA math.NA stat.ML 版本更新

Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks

基于循环多图神经网络的几何矩阵补全

Federico Monti, Michael M. Bronstein, Xavier Bresson

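AI summary: This paper uses geometric deep learning to improve matrix completion, combining graph convolutional networks with recurrent neural networks to learn graph-structured patterns and a nonlinear diffusion process. It improves recommender-system performance with a number of parameters that is independent of the matrix size.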

1704.05249 2026-06-04 cs.LG cs.NI cs.SY eess.SY 版本更新

Hot or not? Forecasting cellular network hot spots using sector performance indicators

热点与否？利用扇区性能指标预测蜂窝网络热点

Joan Serrà, Ilias Leontiadis, Alexandros Karatzoglou, Konstantina Papagiannaki

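AI summary: This paper studies the spatio-temporal patterns of hot spot scores in cellular networks and forecasts hot spots with tree-based machine learning models. Compared to the best baseline, these models deliver up to 14% better forecasts for regular hot spots and 153% better forecasts for non-regular hot spots.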

Comments Accepted for publication at ICDE 2017 - Industrial Track

详情

AI中文摘要

为管理维护大规模蜂窝网络，运营商需了解何时哪些扇区表现不佳。为此，他们使用所谓的热点评分，即多种网络测量的组合结果，反映单个扇区的即时整体性能。尽管运营商对网络当前性能和整体趋势有良好理解，但预测每个扇区随时间的变化却极具挑战性，因为其受常规和非常规事件影响，由人类行为和硬件故障触发。本文研究热点评分的时空模式，揭示其规律性。基于观察，我们探索利用近期测量历史预测未来热点的可能性。为此，我们考虑基于树的机器学习模型，并研究其性能随时间、历史数据量和预测时间跨度的变化。结果表明，与最佳基线相比，树模型在预测常规热点时可提升14%，在预测非常规热点时可提升153%。后者为中等时间跨度内预测孤立、非常规行为的热点提供了有力证据。整体而言，本文为蜂窝扇区动态及其可预测性提供了见解，并为更具前瞻性的网络运营和更长的预测时间跨度铺平了道路。

英文摘要

To manage and maintain large-scale cellular networks, operators need to know which sectors underperform at any given time. For this purpose, they use the so-called hot spot score, which is the result of a combination of multiple network measurements and reflects the instantaneous overall performance of individual sectors. While operators have a good understanding of the current performance of a network and its overall trend, forecasting the performance of each sector over time is a challenging task, as it is affected by both regular and non-regular events, triggered by human behavior and hardware failures. In this paper, we study the spatio-temporal patterns of the hot spot score and uncover its regularities. Based on our observations, we then explore the possibility to use recent measurements' history to predict future hot spots. To this end, we consider tree-based machine learning models, and study their performance as a function of time, amount of past data, and prediction horizon. Our results indicate that, compared to the best baseline, tree-based models can deliver up to 14% better forecasts for regular hot spots and 153% better forecasts for non-regular hot spots. The latter brings strong evidence that, for moderate horizons, forecasts can be made even for sectors exhibiting isolated, non-regular behavior. Overall, our work provides insight into the dynamics of cellular sectors and their predictability. It also paves the way for more proactive network operations with greater forecasting horizons.


1611.05317 2026-06-04 cs.LG cs.SY eess.SY 版本更新

A Learning Scheme for Microgrid Islanding and Reconnection

微电网孤岛与重新连接的学习方案

Carter Lassetter, Eduardo Cotilla-Sanchez, Jinsub Kim

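AI summary: This paper proposes a learning scheme that uses real-time data to predict the stability of reconnecting a microgrid to the main grid, combining a support vector machine with a dynamics simulator that generates its training data.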

Comments 10 pages, 5 figures

详情

AI中文摘要

本文介绍了一种潜在的学习方案，能够动态预测子网络重新连接到主电网的稳定性。随着电力系统趋向智能化和绿色化，自给自足的微电网部署变得更为可能。微电网可能独立运行或与主电网同步，因此控制方法需考虑孤岛和重新连接。目前，最优且安全的重新连接能力尚不明确，目前仅限于连接点之间的简单同步。本文提出一种利用实时数据从相量测量单元（PMUs）的支持向量机（SVM）来预测子网络重新连接是否会导致稳定或不稳定。通过动态模拟器生成训练数据，用于在不同运行状态下训练SVM。分类器在多种情况下进行测试以确保多样性。在大多数条件下，动态预测的准确率约为85%。

英文摘要

This paper introduces a potential learning scheme that can dynamically predict the stability of the reconnection of sub-networks to a main grid. As the future electrical power systems tend towards smarter and greener technology, the deployment of self sufficient networks, or microgrids, becomes more likely. Microgrids may operate on their own or synchronized with the main grid, thus control methods need to take into account islanding and reconnecting of said networks. The ability to optimally and safely reconnect a portion of the grid is not well understood and, as of now, limited to raw synchronization between interconnection points. A support vector machine (SVM) leveraging real-time data from phasor measurement units (PMUs) is proposed to predict in real time whether the reconnection of a sub-network to the main grid would lead to stability or instability. A dynamics simulator fed with pre-acquired system parameters is used to create training data for the SVM in various operating states. The classifier was tested on a variety of cases and operating points to ensure diversity. Accuracies of approximately 85% were observed throughout most conditions when making dynamic predictions of a given network.


1607.07837 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

First Efficient Convergence for Streaming k-PCA: a Global, Gap-Free, and Near-Optimal Rate

流式k-PCA的首次高效收敛：全局、无间隙且近最优速率

Zeyuan Allen-Zhu, Yuanzhi Li

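AI summary: This paper studies streaming PCA and proposes Oja++, a modified variant of Oja's algorithm. It achieves global and gap-free convergence in O(dk) space and matches the information-theoretic lower bound.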

Comments REMARK: v4 adds discussions and polishes writing; v3 contains a stronger Theorem 2, a new lower bound Theorem 6, as well as new Oja++ results Theorem 4 and Theorem 5

详情

AI中文摘要

我们研究流式主成分分析（PCA），即在O(dk)空间内找到d×d隐藏矩阵Σ的前k个特征向量。我们为Oja算法提供了全局收敛性，该算法在实践中常用但缺乏理论支持。我们还提出改进的Oja++变体，其运行速度比Oja更快。我们的结果在误差、特征间隙、秩k和维度d的依赖关系上匹配信息理论下限，至多多项式对数因子。此外，我们的收敛速率可做到无间隙，即与近似误差成正比，不依赖特征间隙。相比之下，在一般秩k情况下，在O(dk)空间内设计具有高效全局收敛速率的算法之前未有解决方案；并且在O(dk)空间内设计具有（甚至局部）无间隙收敛速率的算法之前也未有解决方案。

英文摘要

We study streaming principal component analysis (PCA), that is to find, in $O(dk)$ space, the top $k$ eigenvectors of a $d\times d$ hidden matrix $\bf Σ$ with online vectors drawn from covariance matrix $\bf Σ$. We provide $\textit{global}$ convergence for Oja's algorithm which is popularly used in practice but lacks theoretical understanding for $k>1$. We also provide a modified variant $\mathsf{Oja}^{++}$ that runs $\textit{even faster}$ than Oja's. Our results match the information theoretic lower bound in terms of dependency on error, on eigengap, on rank $k$, and on dimension $d$, up to poly-log factors. In addition, our convergence rate can be made gap-free, that is proportional to the approximation error and independent of the eigengap. In contrast, for general rank $k$, before our work (1) it was open to design any algorithm with efficient global convergence rate; and (2) it was open to design any algorithm with (even local) gap-free convergence rate in $O(dk)$ space.
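
As a rough illustration of the baseline the abstract analyzes, the sketch below implements a textbook form of Oja's algorithm for $k>1$: keep $k$ orthonormal columns ($O(dk)$ memory), nudge them toward each incoming vector, and re-orthonormalize. The step-size schedule `eta0 / (t + t0)`, the Gaussian test stream, and all function names are assumptions chosen for this example. It is not the $\mathsf{Oja}^{++}$ variant, and it makes no claim about the rates proved in the paper.

```python
import random

def orthonormalize(cols):
    """Gram-Schmidt on a list of column vectors (each a list of floats)."""
    basis = []
    for v in cols:
        w = list(v)
        for b in basis:
            proj = sum(wi * bi for wi, bi in zip(w, b))
            w = [wi - proj * bi for wi, bi in zip(w, b)]
        norm = sum(wi * wi for wi in w) ** 0.5
        basis.append([wi / norm for wi in w])
    return basis

def oja_streaming_pca(stream, d, k, rng, eta0=1.0, t0=10.0):
    """One pass over `stream`; stores only k vectors of length d."""
    Q = orthonormalize([[rng.gauss(0, 1) for _ in range(d)] for _ in range(k)])
    for t, x in enumerate(stream):
        eta = eta0 / (t + t0)  # decaying step size (an assumed schedule)
        updated = []
        for q in Q:
            coeff = sum(xi * qi for xi, qi in zip(x, q))  # x^T q
            updated.append([qi + eta * coeff * xi for qi, xi in zip(q, x)])
        Q = orthonormalize(updated)  # Q <- orth(Q + eta * x x^T Q)
    return Q

rng = random.Random(0)
scales = [3.0, 2.0, 0.5, 0.5, 0.5]  # per-coordinate std devs: variances 9, 4, 0.25, ...
stream = ([s * rng.gauss(0, 1) for s in scales] for _ in range(5000))
Q = oja_streaming_pca(stream, d=5, k=2, rng=rng)

# Fraction of each learned direction lying in the true top-2 coordinate plane
captured = [q[0] ** 2 + q[1] ** 2 for q in Q]
print(captured)
```

The covariance here is diagonal, so the top-2 eigenspace is simply the span of the first two coordinates, which makes the learned subspace easy to inspect.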


1605.07367 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Riemannian stochastic variance reduced gradient on Grassmann manifold

黎曼流形上的随机方差缩减梯度算法

Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra

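AI summary: This paper proposes a Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm to compact manifold search spaces, developed on the Grassmann manifold. It addresses the averaging, addition, and subtraction of multiple gradients and analyzes convergence under different step-size rules.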

详情

AI中文摘要

随机方差缩减算法近年来在最小化大量但有限的损失函数的平均值方面变得流行。本文提出了一种新颖的黎曼扩展欧几里得随机方差缩减梯度算法（R-SVRG）到紧凑流形搜索空间。为此，我们展示了在格拉斯曼流形上的发展。通过在格拉斯曼流形上引入对数映射和向量的平行翻译来解决多个梯度的平均、加法和减法的关键挑战。我们展示了所提出算法在衰减步长下的全局收敛性分析，并在固定步长下在某些自然假设下进行了局部收敛率分析。所提出算法被应用于格拉斯曼流形上的多个问题，如主成分分析、低秩矩阵补全和Karcher均值计算。在所有这些情况下，所提出算法都优于标准的黎曼随机梯度下降算法。

英文摘要

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite, number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a compact manifold search space. To this end, we show the developments on the Grassmann manifold. The key challenges of averaging, addition, and subtraction of multiple gradients are addressed with notions like logarithm mapping and parallel translation of vectors on the Grassmann manifold. We present a global convergence analysis of the proposed algorithm with decay step-sizes and a local convergence rate analysis under fixed step-size with some natural assumptions. The proposed algorithm is applied on a number of problems on the Grassmann manifold like principal components analysis, low-rank matrix completion, and the Karcher mean computation. In all these cases, the proposed algorithm outperforms the standard Riemannian stochastic gradient descent algorithm.


1610.05792 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Big Batch SGD: Automated Inference using Adaptive Batch Sizes

大批次SGD：利用自适应批次大小进行自动化推断

Soham De, Abhay Yadav, David Jacobs, Tom Goldstein

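AI summary: This paper proposes big batch SGD, which adaptively grows the batch size to maintain the signal-to-noise ratio of the gradient approximation. It converges efficiently without requiring convexity and supports automated learning-rate selection.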

Comments A preliminary version of this paper appears in AISTATS 2017 (International Conference on Artificial Intelligence and Statistics)

1704.01265 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT math.OC 版本更新

Geometry of Factored Nuclear Norm Regularization

因子核范数正则化的几何学

Qiuwei Li, Zhihui Zhu, Gongguo Tang

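AI summary: This paper studies factored nuclear norm regularization, which arises in machine learning and signal processing, and analyzes its geometric structure to support more efficient optimization algorithms.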

详情

AI中文摘要

本文研究了非凸重构的几何结构，该重构用于最小化一般凸损失函数$f(X)$并采用矩阵核范数$\|X\|_*$进行正则化。核范数正则化的矩阵反问题在机器学习、信号处理和控制领域中占据核心地位。文献中广泛研究了核范数正则化的统计性能，使用凸分析技术进行分析。尽管其最优性能，当使用标准或甚至定制的快速凸求解器求解时，所得到的优化问题计算复杂度较高。为了开发更快且更可扩展的算法，我们遵循Burer-Monteiro的建议，将矩阵变量$X$分解为两个较小的矩形矩阵$X=UV^T$的乘积，并且将核范数$\|X\|_*$替换为$(\|U\|_F^2+\|V\|_F^2)/2$。尽管分解后的公式是非凸的，但我们证明当凸损失函数$f(X)$是$(2r,4r)$-受限良好条件时，分解问题的每个临界点要么对应于原始凸优化的最优解$X^\star$，要么是一个严格鞍点，其中Hessian矩阵有一个严格负的特征值。这种分解后的几何结构允许许多局部搜索算法在随机初始化下收敛到全局最优解。

英文摘要

This work investigates the geometry of a nonconvex reformulation of minimizing a general convex loss function $f(X)$ regularized by the matrix nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at the heart of many applications in machine learning, signal processing, and control. The statistical performance of nuclear norm regularization has been studied extensively in literature using convex analysis techniques. Despite its optimal performance, the resulting optimization has high computational complexity when solved using standard or even tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer-Monteiro to factor the matrix variable $X$ into the product of two smaller rectangular matrices $X=UV^T$ and also replace the nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation allows many local search algorithms to converge to the global optimum with random initializations.
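
The replacement of $\|X\|_*$ by $(\|U\|_F^2+\|V\|_F^2)/2$ rests on a standard variational characterization of the nuclear norm, sketched below. The regularization weight $\lambda$ and the SVD symbols $L$, $\Sigma$, $R$ are notation assumed here for illustration and do not come from the abstract. The identity requires $U$ and $V$ to have at least $\operatorname{rank}(X)$ columns.

```latex
\|X\|_* \;=\; \min_{U,V:\; X = UV^T} \tfrac{1}{2}\left(\|U\|_F^2 + \|V\|_F^2\right),
\qquad \text{attained at } U = L\Sigma^{1/2},\; V = R\Sigma^{1/2}
\text{ for an SVD } X = L\Sigma R^T,
% since \|L\Sigma^{1/2}\|_F^2 = \operatorname{tr}(\Sigma) = \|X\|_* and likewise for V.

% Convex problem and its factored (nonconvex) counterpart:
\min_{X} \; f(X) + \lambda \|X\|_*
\quad\longrightarrow\quad
\min_{U,V} \; f(UV^T) + \tfrac{\lambda}{2}\left(\|U\|_F^2 + \|V\|_F^2\right).
```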


1703.09800 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Disruptive Event Classification using PMU Data in Distribution Networks

利用PMU数据在配电网中进行扰动事件分类

Iman Niazazari, Hanif Livani

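AI summary: This paper proposes a PMU-data-driven framework for distinguishing disruptive events in distribution networks. It classifies events with satisfactory accuracy using PCA with an SVM and an autoencoder with a softmax classifier.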

Comments 5 pages, 5 figures, conference

详情

AI中文摘要

随着高级计量设备在配电网中普及，如微量程测量单元（μPMU），为广域监控和诊断应用提供了前所未有的潜力，例如态势感知和配电网资产健康监测。意外的扰动事件会中断配电网资产的正常运行，最终导致永久性故障和昂贵的更换成本。因此，扰动事件分类为配电网资产的预防性维护提供了有用信息。本文提出了一种基于PMU数据的框架，用于配电网中扰动事件的分类。考虑并区分了两种扰动事件：即故障的电容器组切换和故障的调节器负载调节变换器（OLTC）切换，与配电网中的正常突发负载变化。通过模拟IEEE 13节点配电网中的事件验证了所提框架的性能。事件分类使用了两种不同的算法：i）主成分分析（PCA）与多类支持向量机（SVM），以及ii）自动编码器与softmax分类器。结果展示了所提算法的有效性以及满意的分类准确率。

英文摘要

The proliferation of advanced metering devices with high sampling rates in distribution grids, e.g., micro-phasor measurement units (μPMUs), provides unprecedented potential for wide-area monitoring and diagnostic applications, e.g., situational awareness and health monitoring of distribution assets. Unexpected disruptive events interrupting the normal operation of assets in distribution grids can eventually lead to permanent failure with expensive replacement cost over time. Therefore, disruptive event classification provides useful information for preventive maintenance of the assets in distribution networks. Preventive maintenance provides a wide range of benefits in terms of time, avoidance of unexpected outages, maintenance crew utilization, and equipment replacement cost. In this paper, a PMU-data-driven framework is proposed for classification of disruptive events in distribution networks. Two disruptive events, malfunctioned capacitor bank switching and malfunctioned regulator on-load tap changer (OLTC) switching, are considered and distinguished from normal abrupt load changes in distribution grids. The performance of the proposed framework is verified using simulation of the events in the IEEE 13-bus distribution network. The event classification is formulated using two different algorithms: i) principal component analysis (PCA) together with a multi-class support vector machine (SVM), and ii) an autoencoder along with a softmax classifier. The results demonstrate the effectiveness of the proposed algorithms and satisfactory classification accuracies.


1701.00573 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Robust method for finding sparse solutions to linear inverse problems using an L2 regularization

使用L2正则化求解稀疏解的稳健方法

Gonzalo H Otazu

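AI summary: This paper proposes a robust method, based on a biologically inspired algorithm, that uses L2 regularization to find sparse solutions in overcomplete dictionaries and is highly robust to noise.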

Comments 13 pages, 6 figures. Code available

1605.06848 2026-06-04 cs.CC cs.LG cs.NA math.NA 版本更新

Nonnegative Matrix Factorization Requires Irrationality

非负矩阵分解需要无理数

Dmitry Chistikov, Stefan Kiefer, Ines Marušić, Mahsa Shirmohammadi, James Worrell

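AI summary: This work proves that in nonnegative matrix factorization, even when the input matrix is rational, the factor matrices may require irrational entries, overturning an earlier assumption.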

Comments Journal version, to appear in the SIAM Journal on Applied Algebra and Geometry (SIAGA)

1703.06327 2026-06-04 stat.ML cs.DS cs.LG cs.NA math.NA 版本更新

Spectrum Estimation from a Few Entries

从少量条目中估计谱

Ashish Khetan, Sewoong Oh

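AI summary: This paper studies recovering spectral properties of a matrix from a subset of its entries. It proposes estimating Schatten norms and then applying Chebyshev approximation or moment matching in Wasserstein distance to recover the singular values. Theoretical analysis shows this needs fewer samples than low-rank matrix recovery.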

Comments 52 pages; 15 figures

详情

AI中文摘要

矩阵数据的奇异值提供了数据结构、有效维数和高阶数据分析工具超参数选择的见解。然而，在协同过滤和网络分析等实际应用中，我们只能获取部分观测。本文考虑从矩阵条目采样中恢复底层矩阵谱性质的基本问题。我们特别关注直接恢复奇异值集合以及样本高效恢复谱总和函数的方法。首先估计矩阵的Schatten k-范数，然后应用Chebyshev逼近谱总和函数或在Wasserstein距离中进行矩匹配以恢复奇异值。主要技术挑战是准确估计Schatten范数。我们引入基于图中小结构计数的无偏估计器，并提供与实测性能相匹配的保证。理论分析表明，Schatten范数可以从比恢复低秩矩阵所需更少的样本中准确恢复。数值实验表明，我们显著优于使用矩阵补全方法的竞争对手方法。

英文摘要

Singular values of a data in a matrix form provide insights on the structure of the data, the effective dimensionality, and the choice of hyper-parameters on higher-level data analysis tools. However, in many practical applications such as collaborative filtering and network analysis, we only get a partial observation. Under such scenarios, we consider the fundamental problem of recovering spectral properties of the underlying matrix from a sampling of its entries. We are particularly interested in directly recovering the spectrum, which is the set of singular values, and also in sample-efficient approaches for recovering a spectral sum function, which is an aggregate sum of the same function applied to each of the singular values. We propose first estimating the Schatten $k$-norms of a matrix, and then applying Chebyshev approximation to the spectral sum function or applying moment matching in Wasserstein distance to recover the singular values. The main technical challenge is in accurately estimating the Schatten norms from a sampling of a matrix. We introduce a novel unbiased estimator based on counting small structures in a graph and provide guarantees that match its empirical performance. Our theoretical analysis shows that Schatten norms can be recovered accurately from strictly smaller number of samples compared to what is needed to recover the underlying low-rank matrix. Numerical experiments suggest that we significantly improve upon a competing approach of using matrix completion methods.


1611.07305 2026-06-04 cs.LG cs.DS cs.NA math.NA 版本更新

Correlation Clustering with Low-Rank Matrices

基于低秩矩阵的相关聚类

Nate Veldt, Anthony Wirth, David F. Gleich

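AI summary: This paper studies exact methods for correlation clustering when the data are represented by a low-rank matrix. It proves that the problem is solvable in polynomial time for positive semidefinite low-rank matrices but remains NP-hard when negative eigenvalues are present, and proposes an efficient algorithm based on zonotope vertex enumeration.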

1703.05486 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Using Reinforcement Learning for Demand Response of Domestic Hot Water Buffers: a Real-Life Demonstration

使用强化学习进行住宅热水缓冲的负荷响应：一项现实生活的演示

Oscar De Somer, Ana Soares, Tristan Kuijpers, Koen Vossen, Koen Vanthournout, Fred Spiessens

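AI summary: This paper proposes a data-driven control approach for demand response in real-life residential buildings, optimizing the heating cycles of domestic hot water buffers to maximize self-consumption of local photovoltaic production.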

Comments Submitted to IEEE ISGT Europe 2017

1502.00182 2026-06-04 math.NA cs.DS cs.LG cs.NA stat.ML 版本更新

High Dimensional Low Rank plus Sparse Matrix Decomposition

高维低秩加稀疏矩阵分解

Mostafa Rahmani, George Atia

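AI summary: This paper proposes a scalable subspace-pursuit approach that turns the matrix decomposition problem into a subspace learning problem and carries out the decomposition on a small data sketch. Adaptive sampling algorithms make it more efficient on structured data.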

Comments IEEE Transactions on Signal Processing

详情

DOI: 10.1109/TSP.2017.2649482

AI中文摘要

本文关注大数据中的低秩加稀疏矩阵分解问题。传统算法使用全部数据提取低秩和稀疏成分，基于复杂度随数据维度增长的优化问题，限制了可扩展性。现有随机方法多依赖均匀随机采样，对于具有额外结构的数据（如聚类）效率低下。本文提出一种可扩展的子空间追猎方法，将分解问题转化为子空间学习问题。通过采样列/行形成的小数据草稿进行分解。即使均匀随机采样，所需采样列/行数约为O(rμ)，其中μ是相干参数，r是低秩成分的秩。此外，提出适应性采样算法以解决结构化数据的列/行采样问题。提供适应性采样方法的分析，证明适应性采样使所需采样列/行数对数据分布不变。所提方法适用于在线实现，并提出在线方案。

英文摘要

This paper is concerned with the problem of low-rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on optimization problems with complexity that scales with the dimension of the data, which limits their scalability. Furthermore, existing randomized approaches mostly rely on uniform random sampling, which is quite inefficient for many real-world data matrices that exhibit additional structure (e.g., clustering). In this paper, a scalable subspace-pursuit approach that transforms the decomposition problem into a subspace learning problem is proposed. The decomposition is carried out using a small data sketch formed from sampled columns/rows. Even when the data is sampled uniformly at random, it is shown that the sufficient number of sampled columns/rows is roughly O(rμ), where μ is the coherency parameter and r is the rank of the low-rank component. In addition, adaptive sampling algorithms are proposed to address the problem of column/row sampling from structured data. We provide an analysis of the proposed method with adaptive sampling and show that adaptive sampling makes the required number of sampled columns/rows invariant to the distribution of the data. The proposed approach is amenable to online implementation, and an online scheme is proposed.


1703.04550 2026-06-04 cs.RO cs.LG cs.NE cs.SY eess.SY 版本更新

Sensor Fusion for Robot Control through Deep Reinforcement Learning

通过深度强化学习实现机器人控制的传感器融合

Steven Bohez, Tim Verbelen, Elias De Coninck, Bert Vankeirsbilck, Pieter Simoens, Bart Dhoedt

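AI summary: This paper proposes fusing robot sensor information through deep reinforcement learning, improving robustness and performance on search and pick-up tasks.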

Comments 6 pages, 6 figures, submitted to IROS 2017

1703.04219 2026-06-04 cs.LG cs.NA math.NA 版本更新

SPARTan: Scalable PARAFAC2 for Large & Sparse Data

SPARTan：适用于大规模稀疏数据的可扩展PARAFAC2

Ioakeim Perros, Evangelos E. Papalexakis, Fei Wang, Richard Vuduc, Elizabeth Searles, Michael Thompson, Jimeng Sun

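AI summary: This paper proposes SPARTan, a method for efficiently computing the PARAFAC2 decomposition of large, sparse data. It improves speed and memory efficiency and is validated on real medical data.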

详情

AI中文摘要

在探索性张量挖掘中，一个常见问题是如何分析一组变量在一组受试者中的观测数据，这些观测数据在自然上并不对齐。例如，当建模一组患者中的医疗特征时，治疗的次数和持续时间可能差异很大，在时间点上无法有意义地对齐临床记录。为处理此类数据，最先进的张量模型是所谓的PARAFAC2，它能产生可解释且稳健的输出，并能自然处理稀疏数据。然而，其主要限制在于缺乏能够处理大规模数据集的高效算法。在本文中，我们通过开发一种可扩展的方法来计算大规模稀疏数据集的PARAFAC2分解，称为SPARTan。我们的方法利用PARAFAC2内部的特殊结构，导致一种新颖的算法重述，该方法在绝对时间上更快且比先前工作更节省内存。我们评估了SPARTan在合成和真实数据集上的表现，显示其性能比最佳先前实现提高了22倍，并且能够处理基线方法无法处理的更大问题实例。此外，我们还能够将SPARTan应用于真实和医学复杂的儿科患者数据中的时间演变表型挖掘。在这一过程中的表型的临床意义以及在多个患者中的时间演变已得到临床专家的认可。

英文摘要

In exploratory tensor mining, a common problem is how to analyze a set of variables across a set of subjects whose observations do not align naturally. For example, when modeling medical features across a set of patients, the number and duration of treatments may vary widely in time, meaning there is no meaningful way to align their clinical records across time points for analysis purposes. To handle such data, the state-of-the-art tensor model is the so-called PARAFAC2, which yields interpretable and robust output and can naturally handle sparse data. However, its main limitation up to now has been the lack of efficient algorithms that can handle large-scale datasets. In this work, we fill this gap by developing a scalable method to compute the PARAFAC2 decomposition of large and sparse datasets, called SPARTan. Our method exploits special structure within PARAFAC2, leading to a novel algorithmic reformulation that is both fast (in absolute time) and more memory-efficient than prior work. We evaluate SPARTan on both synthetic and real datasets, showing 22X performance gains over the best previous implementation and also handling larger problem instances for which the baseline fails. Furthermore, we are able to apply SPARTan to the mining of temporally-evolving phenotypes on data taken from real and medically complex pediatric patients. The clinical meaningfulness of the phenotypes identified in this process, as well as their temporal evolution over time for several patients, have been endorsed by clinical experts.


1703.02899 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

基于模型的策略搜索用于多变量PID控制器的自动调优

Andreas Doerr, Duy Nguyen-Tuong, Alonso Marco, Stefan Schaal, Sebastian Trimpe

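AI summary: This paper proposes a model-based policy search framework for automatically tuning multivariate PID controllers, taking a data-driven approach to controller tuning for complex systems.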

Comments Accepted final version to appear in 2017 IEEE International Conference on Robotics and Automation (ICRA)

1703.02810 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

An Integrated and Scalable Platform for Proactive Event-Driven Traffic Management

主动事件驱动交通管理的集成可扩展平台

Alain Kibangou, Alexander Artikis, Evangelos Michelioudakis, Georgios Paliouras, Marius Schmitt, John Lygeros, Chris Baber, Natan Morar, Fabiana Fournier, Inna Skarbovsky

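AI summary: This paper presents an integrated platform that takes an event-driven approach to forecasting congestion, improving the efficiency of traffic management.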

1605.06432 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data

深度变分贝叶斯滤波器：从原始数据中无监督学习状态空间模型

Maximilian Karl, Maximilian Soelch, Justin Bayer, Patrick van der Smagt

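AI summary: This paper proposes Deep Variational Bayes Filters, which use variational inference to handle intractable inference and learn state space models from raw data without supervision, improving long-term prediction.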

Comments Published as a conference paper at ICLR 2017

1703.00847 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Exact Topology Reconstruction of Radial Dynamical Systems with Applications to Distribution System of the Power Grid

径向动态系统精确拓扑重建及其在电力分配系统中的应用

Saurav Talukdar, Deepjyoti Deka, Donatello Materassi, Murti V. Salapaka

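AI summary: This paper proposes a method for reconstructing the interconnections among dynamically related stochastic processes, using multivariate Wiener filtering to eliminate spurious links. It gives a three-stage network reconstruction procedure for tree topologies and validates it on power distribution systems.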

Comments 6 pages

1703.00663 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC stat.ML 版本更新

Introduction to Nonnegative Matrix Factorization

非负矩阵因子分解简介

Nicolas Gillis

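AI summary: This paper introduces nonnegative matrix factorization, covering its applications, the geometry and uniqueness of its solutions, its complexity, and algorithms, and discusses its connection to extended formulations of polyhedra.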

Comments 18 pages, 4 figures

1703.00084 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Multi-Sensor Data Pattern Recognition for Multi-Target Localization: A Machine Learning Approach

多传感器数据模式识别用于多目标定位：一种机器学习方法

Kasthurirengan Suresh, Samuel Silva, Johnathan Votion, Yongcan Cao

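AI summary: This paper proposes a new learning approach to target localization that processes multi-sensor data with algorithms such as clustering and SVMs to improve the accuracy of multi-target localization.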

Comments submitted for conference publication

1702.07834 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Efficient coordinate-wise leading eigenvector computation

高效坐标-wise 主特征向量计算

Jialei Wang, Weiran Wang, Dan Garber, Nathan Srebro

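AI summary: This paper proposes and analyzes efficient coordinate-wise methods for finding the leading eigenvector, in which each step involves only a vector-vector product. The methods are no worse than the Lanczos method in global convergence and running time, and perform better when the spectrum decays slowly.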

1702.06166 2026-06-04 stat.ML cs.LG cs.NA math.NA q-bio.GN q-bio.QM stat.ME 版本更新

Bayesian Boolean Matrix Factorisation

贝叶斯布尔矩阵分解

Tammo Rukat, Chris C. Holmes, Michalis K. Titsias, Christopher Yau

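AI summary: This paper proposes a Boolean matrix factorization method based on a probabilistic generative model, with efficient posterior inference via a Metropolised Gibbs sampler. It outperforms existing methods on real and simulated data and improves interpretability and practical usefulness.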

详情

AI中文摘要

布尔矩阵分解旨在将二进制数据矩阵分解为两个低秩二进制矩阵的近似布尔乘积：一个包含有意义的模式，另一个量化如何将观测表示为这些模式的组合。本文引入OrMachine，一种概率生成模型，推导出Metropolised Gibbs采样器以实现高效的后验推断。在真实和模拟数据上，我们的方法优于现有方法，首次提供完整的后验推断，适用于协作过滤中的假阳性控制，并提升推断模式的可解释性。所提算法可扩展至大规模数据集，如通过分析11,000个基因在130万只小鼠脑细胞中的单细胞基因表达数据，在商用硬件上实现。

英文摘要

Boolean matrix factorisation aims to decompose a binary data matrix into an approximate Boolean product of two low rank, binary matrices: one containing meaningful patterns, the other quantifying how the observations can be expressed as a combination of these patterns. We introduce the OrMachine, a probabilistic generative model for Boolean matrix factorisation and derive a Metropolised Gibbs sampler that facilitates efficient parallel posterior inference. On real world and simulated data, our method outperforms all currently existing approaches for Boolean matrix factorisation and completion. This is the first method to provide full posterior inference for Boolean Matrix factorisation which is relevant in applications, e.g. for controlling false positive rates in collaborative filtering and, crucially, improves the interpretability of the inferred patterns. The proposed algorithm scales to large datasets as we demonstrate by analysing single cell gene expression data in 1.3 million mouse brain cells across 11 thousand genes on commodity hardware.
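
For readers unfamiliar with the Boolean product that the factorisation approximates, here is a minimal sketch: entry $(n, d)$ of the product is 1 if at least one pattern that is switched on for observation $n$ contains feature $d$. The function name and the tiny matrices are hypothetical and chosen only for illustration. This shows the deterministic product, not the OrMachine model or its sampler.

```python
def boolean_product(U, V):
    """Boolean matrix product: out[n][d] = OR over l of (U[n][l] AND V[l][d])."""
    return [[int(any(u and v for u, v in zip(row, col))) for col in zip(*V)]
            for row in U]

# Two hypothetical patterns over four features (rows of V)
V = [[1, 1, 0, 0],
     [0, 1, 1, 1]]
# Three observations, each switching patterns on or off (rows of U)
U = [[1, 0],
     [0, 1],
     [1, 1]]

X = boolean_product(U, V)
for row in X:
    print(row)
```

The third observation uses both patterns, and the feature they share stays at 1 rather than summing to 2. This is the difference between the Boolean product and the ordinary matrix product.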


1602.07046 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

An Improved Gap-Dependency Analysis of the Noisy Power Method

改进的噪声幂方法的间隙依赖性分析

Maria Florina Balcan, Simon S. Du, Yining Wang, Adams Wei Yu

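AI summary: This paper improves the dependence of the noisy power method on the spectral gap by introducing an intermediate parameter q, which tightens the sample complexity and noise tolerance bounds, with applications to distributed private PCA and memory-efficient streaming PCA.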

详情

AI中文摘要

我们考虑了在机器学习和统计中广泛应用的噪声幂方法，尤其是在资源受限下的主成分分析（PCA）中。现有分析显示噪声幂方法对输入数据矩阵的连续谱间隙(σ_k-σ_{k+1})存在不满意的依赖性，这可能非常小，从而限制了算法的应用。本文提出了一种新的噪声幂方法分析，实现了样本复杂度和噪声容忍度界限的改进依赖性。具体而言，我们将对(σ_k-σ_{k+1})的依赖性改进为对(σ_k-σ_{q+1})的依赖性，其中q是一个中间算法参数，可能远大于目标秩k。我们的证明基于对两个子空间接近性的新特征化，这不同于之前工作中分析的canonical angle特征化。最后，我们将改进的界限应用于分布式隐私PCA和内存高效的流PCA，并获得了优于现有文献结果的界限。

英文摘要

We consider the noisy power method algorithm, which has wide applications in machine learning and statistics, especially those related to principal component analysis (PCA) under resource (communication, memory or privacy) constraints. Existing analysis of the noisy power method shows an unsatisfactory dependency over the "consecutive" spectral gap $(σ_k-σ_{k+1})$ of an input data matrix, which could be very small and hence limits the algorithm's applicability. In this paper, we present a new analysis of the noisy power method that achieves improved gap dependency for both sample complexity and noise tolerance bounds. More specifically, we improve the dependency over $(σ_k-σ_{k+1})$ to dependency over $(σ_k-σ_{q+1})$, where $q$ is an intermediate algorithm parameter and could be much larger than the target rank $k$. Our proofs are built upon a novel characterization of proximity between two subspaces that differ from canonical angle characterizations analyzed in previous works. Finally, we apply our improved bounds to distributed private PCA and memory-efficient streaming PCA and obtain bounds that are superior to existing results in the literature.

1701.08074 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Model-Free Control of Thermostatically Controlled Loads Connected to a District Heating Network

无模型控制连接到区域供热网络的自适应负载

Bert J. Claessens, Dirk Vanhoudt, Johan Desmedt, Frederik Ruelens

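AI summary: This paper proposes a model-free control approach that combines reinforcement learning with a market-based multi-agent system to control thermostatically controlled loads connected to a district heating network, achieving a significant performance improvement within a practical learning time.

Comments Under review at Elsevier: Energy and buildings 2017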

AI中文摘要

连接到区域供热网络的自适应负载的最优控制被视为在不确定性下的顺序决策问题。直接基于模型的方法在实践中受到两个挑战的限制，即由于问题的大维度性导致的可扩展性问题以及系统识别所需的准确模型识别。为缓解这些问题，本文利用了强化学习的最新发展，并结合基于市场的多智能体系统，以获得一个可扩展的解决方案，该方案在实际学习时间内实现了显著的性能提升。控制方法应用于一个包含100个连接到辐射状区域供热网络的自适应负载的场景，该网络由中央联合热电联产厂供电。无论是能源套利还是削峰目标，该控制方法需要60天才能使性能达到理论成本下界65%以内。

英文摘要

Optimal control of thermostatically controlled loads connected to a district heating network is considered a sequential decision-making problem under uncertainty. The practicality of a direct model-based approach is compromised by two challenges, namely scalability due to the large dimensionality of the problem and the system identification required to identify an accurate model. To help mitigate these problems, this paper leverages recent developments in reinforcement learning in combination with a market-based multi-agent system to obtain a scalable solution that achieves a significant performance improvement in a practical learning time. The control approach is applied to a scenario comprising 100 thermostatically controlled loads connected to a radial district heating network supplied by a central combined heat and power plant. For both an energy arbitrage and a peak shaving objective, the control approach requires 60 days to obtain a performance within 65% of a theoretical lower bound on the cost.

1611.03993 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Riemannian Tensor Completion with Side Information

Riemannian张量补全与侧信息

Tengfei Zhou, Hui Qian, Zebang Shen, Congfu Xu

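AI summary: This paper proposes a new Riemannian model that integrates the original model with side information to improve the efficiency and accuracy of low-rank tensor completion.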

1702.05548 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Bi-Level Online Control without Regret

双层在线控制无遗憾

Andrey Bernstein

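AI summary: This paper proposes a bi-level discrete-time control framework that combines online convex optimization with real-time control, using algorithms with small dynamic regret to address frequency control in power grids.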

1702.01228 2026-06-04 cs.LG cs.SY eess.SY 版本更新

A Learning-Based Approach for Lane Departure Warning Systems with a Personalized Driver Model

基于学习的车道偏离预警系统个性化驾驶员模型方法

Wenshuo Wang, Ding Zhao, Junqiang Xi, Wei Han

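AI summary: This paper proposes a learning-based lane-departure warning method that builds a personalized driver model by combining a Gaussian mixture model with a hidden Markov model, predicting driver behavior and reducing the false-warning rate.

Comments 12 pages, 13 figures, Journal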

AI中文摘要

驾驶员纠正行为的误解是车道偏离预测系统误报的主要原因。本文提出一种基于学习的方法，用于预测意外车道偏离行为（LDB）和驾驶员将车辆带回车道的可能性。首先，通过结合高斯混合模型和隐马尔可夫模型建立个性化驾驶员模型，用于车道偏离和车道保持行为。其次，基于该模型，开发了一种基于模型的在线预测算法，用于预测车辆轨迹并判断驾驶员将表现出LDB还是DCB。此外，还开发了一种基于模型预测算法的预警策略，使车道偏离预警系统能根据预测轨迹被驾驶员接受。此外，通过密歇根大学安全飞行员模型部署计划收集了10名驾驶员的自然驾驶数据，用于训练个性化驾驶员模型并验证该方法。我们比较了所提出的方法与基本时间到车道 crossing（TLC）方法和TLC-方向序列的分段横向斜率（TLC-DSPLS）方法。结果表明，所提出的方法可将误报率降至3.07%。

英文摘要

Misunderstanding of driver correction behaviors (DCB) is the primary reason for false warnings of lane-departure-prediction systems. We propose a learning-based approach to predicting unintended lane-departure behaviors (LDB) and the chance for drivers to bring the vehicle back to the lane. First, in this approach, a personalized driver model for lane-departure and lane-keeping behavior is established by combining the Gaussian mixture model and the hidden Markov model. Second, based on this model, we develop an online model-based prediction algorithm to predict the forthcoming vehicle trajectory and judge whether the driver will demonstrate an LDB or a DCB. We also develop a warning strategy based on the model-based prediction algorithm that allows the lane-departure warning system to be acceptable for drivers according to the predicted trajectory. In addition, the naturalistic driving data of 10 drivers is collected through the University of Michigan Safety Pilot Model Deployment program to train the personalized driver model and validate this approach. We compare the proposed method with a basic time-to-lane-crossing (TLC) method and a TLC-directional sequence of piecewise lateral slopes (TLC-DSPLS) method. The results show that the proposed approach can reduce the false-warning rate to 3.07%.

1702.01205 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

Traffic Lights with Auction-Based Controllers: Algorithms and Real-World Data

带拍卖机制的交通灯控制器：算法与现实数据

Shumeet Baluja, Michele Covell, Rahul Sukthankar

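AI summary: This paper proposes an auction-based traffic light controller that uses micro-auctions to incorporate traffic sensor information. It improves road capacity and mean travel time, outperforming static-program lights and longer-term planning approaches.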

AI中文摘要

实时优化交通流解决重要实际问题：减少驾驶员空闲时间、提高城市效率、减少气体排放和改善空气质量。当前交通灯优化研究多依赖扩展交通灯与其他交通设施的通信能力，但在此类能力普及前，可通过现有部署基础设施更响应当前交通状况来改进交通灯。本文介绍一种利用微拍卖进行竞价的交通灯控制器，无需其他外部信息源。我们在旧金山山景城和芝加哥river north社区的Android用户数月收集的大规模数据上训练和测试交通灯控制器。学习得到的拍卖机制控制器在两个城市中均在道路容量和平均出行时间等相关指标上超越了现有部署的交通灯、优化静态程序灯和长期规划方法，通过真实用户驾驶数据测量。

英文摘要

Real-time optimization of traffic flow addresses important practical problems: reducing a driver's wasted time, improving city-wide efficiency, reducing gas emissions and improving air quality. Much of the current research in traffic-light optimization relies on extending the capabilities of traffic lights to either communicate with each other or communicate with vehicles. However, before such capabilities become ubiquitous, opportunities exist to improve traffic lights by being more responsive to current traffic situations within the current, already deployed, infrastructure. In this paper, we introduce a traffic light controller that employs bidding within micro-auctions to efficiently incorporate traffic sensor information; no other outside sources of information are assumed. We train and test traffic light controllers on large-scale data collected from opted-in Android cell-phone users over a period of several months in Mountain View, California and the River North neighborhood of Chicago, Illinois. The learned auction-based controllers surpass (in both the relevant metrics of road-capacity and mean travel time) the currently deployed lights, optimized static-program lights, and longer-term planning approaches, in both cities, measured using real user driving data.

1701.08757 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Bayesian Learning of Consumer Preferences for Residential Demand Response

贝叶斯学习消费者对住宅需求响应的偏好

Mikhail V. Goubko, Sergey O. Kuznetsov, Alexey A. Neznanov, Dmitry I. Ignatov

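AI summary: This paper proposes a Bayesian learning algorithm that estimates consumer comfort functions from historical appliance usage data to enable energy savings. It outperforms conventional regression analysis and can be extended to the control of heating and cooling systems.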

1701.06652 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Convex Parameterizations and Fidelity Bounds for Nonlinear Identification and Reduced-Order Modelling

凸参数化与非线性识别和降阶建模的保真度界限

Mark M. Tobenkin, Ian R. Manchester, Alexandre Megretski

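AI summary: This paper proposes convex-optimization-based methods for nonlinear system identification that use Lagrangian relaxation, dissipation inequalities, and semidefinite programming to address model instability and long-term prediction, with applications to model reduction of electronic circuits and identification of a pneumatic actuator.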

Comments Conditionally accepted to IEEE TAC

1607.03463 2026-06-04 math.NA cs.DS cs.LG cs.NA math.OC stat.ML 版本更新

LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain

LazySVD：即使更快的SVD分解也无需痛苦

Zeyuan Allen-Zhu, Yuanzhi Li

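AI summary: This paper proposes the LazySVD framework, which improves on recent breakthrough methods for k-SVD. It yields faster gap-free methods as well as the first accelerated and stochastic methods, outperforming existing algorithms in certain parameter regimes.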

Comments first circulated on May 20, 2016; this newer version improves writing

1610.00681 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Team-Optimal Distributed MMSE Estimation in General and Tree Networks

一般和树状网络中的团队最优分布式最小均方误差估计

Muhammed O. Sayin, Suleyman S. Kozat, Tamer Başar

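AI summary: This paper proposes distributed MMSE estimation algorithms that achieve team-optimal learning performance over a finite horizon. The problem is formulated over arbitrary network topologies, and recursive algorithms reach the optimal performance by exchanging local estimates.

Comments Submitted to Digital Signal Processing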

AI中文摘要

我们构建了用于有限时间范围均方误差（MSE）状态估计的分布式网络团队最优估计算法。这里，我们有分布式处理和协作能力的代理，通过线性模型观察到目标状态的噪声样本，并通过相互交互来学习该状态。尽管这个问题在机器学习和信号处理等领域受到广泛关注，但所有已知的策略在有限时间范围的MSE意义上均无法实现团队最优学习性能。为此，我们制定了在没有披露信息大小限制的情况下，即在任意网络拓扑上实现有限时间范围的分布式最小MSE（MMSE）。随后，我们表明仅交换局部估计足以在某些网络拓扑上实现Oracle性能。通过检查这些网络结构，我们提出了通过披露局部估计实现Oracle性能的递归算法。对于实际应用，我们还提供了通过时间窗口化观测来降低算法复杂度的方法。最后，在数值示例中，我们展示了所提出算法在有限时间范围MSE意义上由于最优估计而表现出的优越性能。

英文摘要

We construct team-optimal estimation algorithms over distributed networks for state estimation in the finite-horizon mean-square error (MSE) sense. Here, we have a distributed collection of agents with processing and cooperation capabilities. These agents observe noisy samples of a desired state through a linear model and seek to learn this state by interacting with each other. Although this problem has attracted significant attention and been studied extensively in fields including machine learning and signal processing, all the well-known strategies do not achieve team-optimal learning performance in the finite-horizon MSE sense. To this end, we formulate the finite-horizon distributed minimum MSE (MMSE) when there is no restriction on the size of the disclosed information, i.e., oracle performance, over an arbitrary network topology. Subsequently, we show that exchange of local estimates is sufficient to achieve the oracle performance only over certain network topologies. By inspecting these network structures, we propose recursive algorithms achieving the oracle performance through the disclosure of local estimates. For practical implementations we also provide approaches to reduce the complexity of the algorithms through the time-windowing of the observations. Finally, in the numerical examples, we demonstrate the superior performance of the introduced algorithms in the finite-horizon MSE sense due to optimal estimation.

1701.00757 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Clustering Signed Networks with the Geometric Mean of Laplacians

利用拉普拉斯矩阵几何均值对带符号网络进行聚类

Pedro Mercado, Francesco Tudisco, Matthias Hein

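AI summary: This paper proposes spectral clustering of signed networks based on the geometric mean of Laplacians, addressing cases where the conventional arithmetic-mean approach fails to cluster accurately even when the positive and negative network structure is noise-free.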

Comments 14 pages, 5 figures. Accepted in Neural Information Processing Systems (NIPS), 2016

1612.09158 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

The interplay between system identification and machine learning

系统辨识与机器学习之间的相互作用

Gianluigi Pillonetto

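AI summary: This paper examines the connection between system identification and machine learning, introduces RKHSs of dynamic systems, simplifies the derivation of stability conditions, and proves that the regularized estimator converges to the optimal predictor.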

AI中文摘要

从例子中学习是科学和工程中的关键问题，涉及从有限直接和噪声样本中重建函数。在 reproducing kernel Hilbert spaces (RKHSs) 中的正则化被广泛用于解决此任务，包括强大的估计器如正则化网络。最近的成就包括证明这些基于内核的方法的统计一致性。同时，许多不同的系统辨识技术已被开发，但与机器学习的互动仍不强烈。原因之一是机器学习中通常使用的RKHS不嵌入动态系统的信息，例如BIBO稳定性。此外，在系统辨识中，机器学习中通常采用的独立数据假设在实践中并不成立。本文提供了新的结果，加强系统辨识与机器学习之间的联系。我们的起点是引入动态系统的RKHS。它们包含在系统输入定义的空间上的函数，允许将系统辨识解释为从例子中学习。在线性和非线性设置中，证明这种视角允许以相对简单的方式推导RKHS稳定性条件（即只包含BIBO稳定系统或预测器的性质），也促进了系统辨识新内核的设计。此外，我们证明在动态系统典型条件下，正则化估计器收敛于最优预测器。

英文摘要

Learning from examples is one of the key problems in science and engineering. It deals with function reconstruction from a finite set of direct and noisy samples. Regularization in reproducing kernel Hilbert spaces (RKHSs) is widely used to solve this task and includes powerful estimators such as regularization networks. Recent achievements include the proof of the statistical consistency of these kernel-based approaches. Parallel to this, many different system identification techniques have been developed but the interaction with machine learning does not appear so strong yet. One reason is that the RKHSs usually employed in machine learning do not embed the information available on dynamic systems, e.g. BIBO stability. In addition, in system identification the independent data assumptions routinely adopted in machine learning are never satisfied in practice. This paper provides new results which strengthen the connection between system identification and machine learning. Our starting point is the introduction of RKHSs of dynamic systems. They contain functionals over spaces defined by system inputs and allow system identification to be interpreted as learning from examples. In both linear and nonlinear settings, it is shown that this perspective makes it possible to derive, in a relatively simple way, conditions on RKHS stability (i.e. the property of containing only BIBO stable systems or predictors), also facilitating the design of new kernels for system identification. Furthermore, we prove the convergence of the regularized estimator to the optimal predictor under conditions typical of dynamic systems.

1308.4757 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Online and stochastic Douglas-Rachford splitting method for large scale machine learning

在线和随机Douglas-Rachford分裂方法用于大规模机器学习

Ziqiang Shi, Rujie Liu

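AI summary: This paper proposes online and stochastic Douglas-Rachford splitting methods for large-scale optimization problems, proves their convergence in the online and stochastic settings, and validates their effectiveness experimentally.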

1612.04933 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY 版本更新

Dynamical Kinds and their Discovery

动力学种类及其发现

Benjamin C. Jantzen

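AI summary: This paper proposes an algorithm that classifies causal systems sharing a common structure without explicitly constructing a dynamical model or relying on prior knowledge of the system dynamics, and shows its value for developing and validating dynamical models.

Comments Accepted for the proceedings of the Causation: Foundation to Application Workshop, UAI 2016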

AI中文摘要

我们展示了将因果系统分类为共享相同结构的种类的可能性，无需首先构建显式动态模型或使用系统动力学的先验知识。该算法能够确定任意系统是否由相同形式的因果关系支配，具有在动态模型开发和验证中的重要应用价值。从理论上看，这也是科学推理中从实证数据中推导定律的关键阶段。所提出的算法基于动态对称性方法来处理动态种类。时间对称性是指对系统的一个或多个变量进行干预，该干预与系统的时间演化过程可交换。动态种类是共享一组动态对称性的系统类。所提出的算法通过直接比较系统展示的对称性来分类确定性、时间依赖性的因果系统。使用来自多种非线性系统的模拟、噪声数据，我们证明该算法能够正确地将系统分类为动态种类。该算法在显著的采样误差下具有鲁棒性，对采样误差的非正态性不敏感，并在动态相似性增加时表现良好。所展示的算法是首个针对自动化科学发现这一方面的算法。

英文摘要

We demonstrate the possibility of classifying causal systems into kinds that share a common structure without first constructing an explicit dynamical model or using prior knowledge of the system dynamics. The algorithmic ability to determine whether arbitrary systems are governed by causal relations of the same form offers significant practical applications in the development and validation of dynamical models. It is also of theoretical interest as an essential stage in the scientific inference of laws from empirical data. The algorithm presented is based on the dynamical symmetry approach to dynamical kinds. A dynamical symmetry with respect to time is an intervention on one or more variables of a system that commutes with the time evolution of the system. A dynamical kind is a class of systems sharing a set of dynamical symmetries. The algorithm presented classifies deterministic, time-dependent causal systems by directly comparing their exhibited symmetries. Using simulated, noisy data from a variety of nonlinear systems, we show that this algorithm correctly sorts systems into dynamical kinds. It is robust under significant sampling error, is immune to violations of normality in sampling error, and fails gracefully with increasing dynamical similarity. The algorithm we demonstrate is the first to address this aspect of automated scientific discovery.
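The definition of a dynamical symmetry (an intervention that commutes with time evolution) can be checked numerically on textbook systems. The following Python sketch is an illustration under stated assumptions, not the paper's classification algorithm: the two growth models, the scaling intervention, and all function names are chosen here as examples. Scaling the state commutes with exponential growth but not with logistic growth, so the two systems do not share this symmetry.

```python
import math

def evolve_exponential(x0, t, r=0.5):
    """Closed-form time evolution of dx/dt = r * x."""
    return x0 * math.exp(r * t)

def evolve_logistic(x0, t, r=0.5, K=10.0):
    """Closed-form time evolution of dx/dt = r * x * (1 - x / K)."""
    growth = math.exp(r * t)
    return K * x0 * growth / (K + x0 * (growth - 1.0))

def scale(x, factor=2.0):
    """Candidate intervention: multiply the state by a constant."""
    return factor * x

def commutes(evolve, intervention, x0, t, tol=1e-9):
    """True if intervening then evolving equals evolving then intervening."""
    return abs(evolve(intervention(x0), t) - intervention(evolve(x0, t))) < tol

exponential_has_symmetry = commutes(evolve_exponential, scale, x0=1.0, t=3.0)
logistic_has_symmetry = commutes(evolve_logistic, scale, x0=1.0, t=3.0)
```

A single initial condition and time are used here for brevity; a more careful check would sample several of each.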

1612.02739 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Controlling Robot Morphology from Incomplete Measurements

从不完整测量中控制机器人形态

Martin Pecka, Karel Zimmermann, Michal Reinštein, Tomáš Svoboda

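AI summary: For robots with complex morphology that must traverse terrain in urban search and rescue missions, this paper proposes an autonomous control method that handles incomplete measurements while ensuring safety.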

Comments Accepted into IEEE Transactions to Industrial Electronics, Special Section on Motion Control for Novel Emerging Robotic Devices and Systems

1606.07315 2026-06-04 cs.LG cs.NA math.NA 版本更新

Nearly-optimal Robust Matrix Completion

近优鲁棒矩阵补全

Yeshwanth Cherapanamjeri, Kartik Gupta, Prateek Jain

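AI summary: This paper proposes a simple projected gradient descent method that estimates a low-rank matrix by alternating projected gradient steps with hard thresholding to clean up corruptions. It achieves robust matrix completion with a nearly optimal number of observations and corruptions, and also improves the time complexity of low-rank matrix completion.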

1612.01600 2026-06-04 math.OC cs.LG cs.MA cs.SY eess.SY stat.ML 版本更新

Distributed Gaussian Learning over Time-varying Directed Graphs

时变有向图上的分布式高斯学习

Angelia Nedić, Alex Olshevsky, César A. Uribe

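AI summary: This paper proposes a distributed non-Bayesian learning algorithm for parameter estimation under Gaussian noise. It updates the parameters of Gaussian beliefs explicitly and establishes a convergence rate and almost sure convergence.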

1612.00221 2026-06-04 econ.GN cs.LG nlin.AO q-fin.EC 版本更新

The Coconut Model with Heterogeneous Strategies and Learning

带有异质策略和学习的椰子模型

Sven Banisch, Eckehard Olbrich

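AI summary: This paper develops an agent-based version of the Diamond search equilibrium model, examines how heterogeneous and adaptive expectations affect non-equilibrium trajectories, and shows that the system converges to a stable fixed point of the original system.

Comments Accepted for publication in the Journal of Artificial Societies and Social Simulation (JASSS)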

AI中文摘要

在本文中，我们开发了Diamond搜索均衡模型的基于代理的版本，也称为椰子模型。在该模型中，代理面临需要根据对未来产生的实体效用的预期来评估的生产决策，而该效用又通过交易机制依赖于全球生产水平。虽然原始动力学系统设定假设无限多个同质适应代理遵循强理性条件，基于代理的设定允许讨论异质性和适应性预期的影响，并能够分析非均衡轨迹。从匹配原始模型渐近行为的基础实现出发，我们展示了如何在总体动力学方程中考虑代理异质性。然后我们展示当代理通过简单的时差学习方案调整策略时，系统收敛到原系统的固定点。系统性模拟揭示这唯一稳定的均衡解。

英文摘要

In this paper, we develop an agent-based version of the Diamond search equilibrium model, also called the Coconut Model. In this model, agents are faced with production decisions that have to be evaluated based on their expectations about the future utility of the produced entity, which in turn depends on the global production level via a trading mechanism. While the original dynamical systems formulation assumes an infinite number of homogeneously adapting agents obeying strong rationality conditions, the agent-based setting allows us to discuss the effects of heterogeneous and adaptive expectations and enables the analysis of non-equilibrium trajectories. Starting from a baseline implementation that matches the asymptotic behavior of the original model, we show how agent heterogeneity can be accounted for in the aggregate dynamical equations. We then show that when agents adapt their strategies by a simple temporal difference learning scheme, the system converges to one of the fixed points of the original system. Systematic simulations reveal that this is the only stable equilibrium solution.

1611.08372 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

A Unified Convex Surrogate for the Schatten-$p$ Norm

一种统一的凸替代项用于Schatten-p范数

Chen Xu, Zhouchen Lin, Hongbin Zha

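AI summary: This paper proposes a unified convex surrogate for the Schatten-p norm. Using an equivalence based on matrix factorization, the norms of the factor matrices can be optimized as convex functions, which improves performance on matrix completion.

Comments The paper is accepted by AAAI-17. We show that multi-factor matrix factorization enjoys superiority over the traditional two-factor case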

AI中文摘要

Schatten-p范数（0<p<1）已被广泛用于替代核范数以更好地近似秩函数。然而，现有方法要么由于每次迭代依赖奇异值分解（SVD）而不适用于大规模问题，要么局限于特定的p值，如1/2和2/3。本文表明，对于任何p、p1和p2>0满足1/p=1/p1+1/p2，单矩阵的Schatten-p范数与两个因子矩阵的Schatten-p1和Schatten-p2范数之间存在等价性。我们进一步将等价性扩展到多个因子矩阵，并证明所有因子范数对于任何p>0均可凸和光滑。相比之下，原始Schatten-p范数对于0<p<1是非凸和非光滑的。作为示例，我们进行了矩阵补全实验。为了利用因子矩阵范数的凸性，我们采用了加速近端交替线性化最小化算法，并建立了其序列收敛性。在合成和真实数据集上的实验显示其优于现有方法，速度也极具竞争力。

英文摘要

The Schatten-$p$ norm ($0<p<1$) has been widely used to replace the nuclear norm for better approximating the rank function. However, existing methods are either 1) not scalable for large scale problems due to relying on singular value decomposition (SVD) in every iteration, or 2) specific to some $p$ values, e.g., $1/2$, and $2/3$. In this paper, we show that for any $p$, $p_1$, and $p_2 >0$ satisfying $1/p=1/p_1+1/p_2$, there is an equivalence between the Schatten-$p$ norm of one matrix and the Schatten-$p_1$ and the Schatten-$p_2$ norms of its two factor matrices. We further extend the equivalence to multiple factor matrices and show that all the factor norms can be convex and smooth for any $p>0$. In contrast, the original Schatten-$p$ norm for $0<p<1$ is non-convex and non-smooth. As an example we conduct experiments on matrix completion. To utilize the convexity of the factor matrix norms, we adopt the accelerated proximal alternating linearized minimization algorithm and establish its sequence convergence. Experiments on both synthetic and real datasets exhibit its superior performance over the state-of-the-art methods. Its speed is also highly competitive.
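The exponent condition $1/p=1/p_1+1/p_2$ is easy to explore with exact arithmetic. The Python sketch below only evaluates that relation; the function name `combined_exponent` is hypothetical, and applying the same harmonic-sum rule to three or more factors is an assumption about how the multi-factor extension mentioned in the abstract is parameterized.

```python
from fractions import Fraction

def combined_exponent(*factor_exponents):
    """Return p with 1/p = sum_i 1/p_i for positive factor exponents p_i."""
    if any(p <= 0 for p in factor_exponents):
        raise ValueError("all exponents must be positive")
    return 1 / sum(Fraction(1) / Fraction(p) for p in factor_exponents)

# Two nuclear-norm (Schatten-1) factors give p = 1/2.
p_two_nuclear = combined_exponent(1, 1)
# One nuclear-norm and one Frobenius-norm (Schatten-2) factor give p = 2/3.
p_nuclear_frobenius = combined_exponent(1, 2)
# Assumed multi-factor rule: three Frobenius-norm factors also give p = 2/3.
p_three_frobenius = combined_exponent(2, 2, 2)
```

The first two cases recover the specific values $1/2$ and $2/3$ that the abstract cites as the ones earlier methods were limited to, using only factor norms with exponents of at least 1.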

1611.05977 2026-06-04 cs.LG cs.NA math.NA stat.AP stat.ML 版本更新

Robust and Scalable Column/Row Sampling from Corrupted Big Data

鲁棒且可扩展的列/行采样从受腐蚀的大数据

Mostafa Rahmani, George Atia

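AI summary: This paper proposes new sampling algorithms that can locate informative columns under severe data corruption, and develops scalable randomized designs that are robust to both sparse corruption and outliers. Experiments show that they outperform existing robust sampling algorithms.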

1602.05703 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Adaptive Least Mean Squares Estimation of Graph Signals

自适应最小均方图信号估计

Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti

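AI summary: This paper proposes an adaptive method for estimating graph signals that uses a least mean squares strategy to reconstruct and track band-limited graph signals. Theoretical analysis and numerical experiments validate the method, and an online-adaptive graph sampling strategy is also proposed.

Comments Submitted to IEEE Transactions on Signal and Information Processing over Networks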

DOI: 10.1109/TSIPN.2016.2613687

AI中文摘要

本文旨在提出一种最小均方（LMS）策略，用于自适应估计定义在图上的信号。假设图信号在已知带宽下带限，该方法能够在有限观测下实现保证均方误差性能的重建与跟踪。详细的均方分析提供了所提方法的性能，并导致了设计有用的图信号采样策略的若干见解。数值结果验证了我们的理论发现，并展示了所提方法的性能。此外，为应对带宽未知的情况，我们提出了一种在图频域中进行稀疏在线估计信号支持的方法，从而实现了图采样策略的在线适应。最后，我们应用所提方法在认知网络环境中构建给定操作区域的功率空间密度制图。

英文摘要

The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of signals defined over graphs. Assuming the graph signal to be band-limited, over a known bandwidth, the method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking from a limited number of observations over a subset of vertices. A detailed mean square analysis provides the performance of the proposed method, and leads to several insights for designing useful sampling strategies for graph signals. Numerical results validate our theoretical findings, and illustrate the performance of the proposed method. Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method that performs a sparse online estimation of the signal support in the (graph) frequency domain, which enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build the power spatial density cartography of a given operational region in a cognitive network environment.

1511.05261 2026-06-04 cs.CV cs.LG cs.NA math.NA stat.ML 版本更新

Robust PCA via Nonconvex Rank Approximation

通过非凸秩近似实现鲁棒PCA

Zhao Kang, Chong Peng, Qiang Cheng

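AI summary: This paper proposes a nonconvex rank approximation to overcome the limitations of the nuclear norm in robust PCA, together with an efficient algorithm that improves both accuracy and efficiency.

Comments IEEE International Conference on Data Mining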

DOI: 10.1109/ICDM.2015.15

AI中文摘要

在数据挖掘和机器学习中，许多应用需要恢复低秩矩阵。鲁棒主成分分析（RPCA）是处理此类问题的通用框架。RPCA中核范数作为秩函数的凸替代物被广泛研究。在某些假设下，它可以以高概率恢复底层低秩矩阵。然而，这些假设可能在实际应用中不成立。由于核范数通过将所有奇异值相加来近似秩，即本质上是奇异值的ℓ1范数，因此产生的近似误差并不 trivial，导致最终的矩阵估计器可能有显著偏差。为寻求更接近的近似并缓解核范数的上述限制，我们提出了一种非凸秩近似。这种对矩阵秩的近似比核范数更紧密。为了解决相关的非凸最小化问题，我们开发了高效的增广拉格朗日乘子优化算法。实验结果表明，我们的方法在准确性和效率上均优于当前最先进的算法。

英文摘要

Numerous applications in data mining and machine learning require recovering a matrix of minimal rank. Robust principal component analysis (RPCA) is a general framework for handling this kind of problem. The nuclear-norm-based convex surrogate of the rank function in RPCA has been widely investigated. Under certain assumptions, it can recover the underlying true low rank matrix with high probability. However, those assumptions may not hold in real-world applications. Since the nuclear norm approximates the rank by adding all singular values together, which is essentially an $\ell_1$-norm of the singular values, the resulting approximation error is not trivial and thus the resulting matrix estimator can be significantly biased. To seek a closer approximation and to alleviate the above-mentioned limitations of the nuclear norm, we propose a nonconvex rank approximation. This approximation to the matrix rank is tighter than the nuclear norm. To solve the associated nonconvex minimization problem, we develop an efficient augmented Lagrange multiplier based optimization algorithm. Experimental results demonstrate that our method outperforms current state-of-the-art algorithms in both accuracy and efficiency.
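The bias of the nuclear norm as a rank surrogate can be seen directly on a list of singular values. In the Python sketch below, `gamma_surrogate` is one possible saturating nonconvex surrogate chosen for illustration; the abstract does not state the exact function used in the paper, so its form, the parameter `gamma`, and the example values are all assumptions.

```python
def rank_from_singular_values(sigmas, tol=1e-12):
    """Rank = number of singular values that are numerically nonzero."""
    return sum(1 for s in sigmas if s > tol)

def nuclear_norm(sigmas):
    """Nuclear norm = l1 norm of the singular values."""
    return sum(sigmas)

def gamma_surrogate(sigmas, gamma=0.01):
    """Illustrative nonconvex surrogate: each term saturates near 1 for large s."""
    return sum((1 + gamma) * s / (gamma + s) for s in sigmas)

singular_values = [10.0, 5.0, 0.0]      # hypothetical rank-2 matrix
true_rank = rank_from_singular_values(singular_values)
convex_value = nuclear_norm(singular_values)
nonconvex_value = gamma_surrogate(singular_values)
```

Here the nuclear norm grows with the magnitude of the singular values, while the saturating surrogate stays close to the count of nonzero singular values, which is the sense in which a nonconvex approximation can be tighter.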

1611.05095 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Learning Dexterous Manipulation Policies from Experience and Imitation

从经验与模仿中学习灵巧操作策略

Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine

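AI summary: This paper studies learning feedback control, from experience and imitation, for a dexterous five-finger hand performing non-prehensile manipulation. It builds local controllers via trajectory optimization and generalizes them with deep learning and nearest neighbors, showing that small amounts of training data suffice and that the neural network policy can operate blind.

Comments Initial draft for a journal submission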

AI中文摘要

我们探索了基于学习的反馈控制方法，用于控制执行非抓取操作的灵巧五指手。首先，我们学习了能够从预定义初始状态开始执行任务的局部控制器。这些控制器是通过轨迹优化构建的，基于从传感器数据直接学习到的局部线性时变模型。在某些情况下，我们使用通过虚拟环境中的遥控收集的人类示范来初始化优化器。我们证明，这些控制器在模拟和物理平台上都能在初始条件的有限范围内稳健地执行任务。然后，我们考虑了两种泛化方法：深度学习和最近邻。我们发现最近邻方法性能更高。然而，神经网络也有其优势：它仅使用触觉和本体感觉反馈，而没有关于物体的视觉反馈（即盲操作），并且学习了一个时间不变的策略。相比之下，最近邻方法根据运动捕捉感知的初始物体状态切换时间变化的局部控制器。尽管两种泛化方法仍有改进空间，我们的工作表明（i）复杂的非抓取操作任务的局部轨迹控制器可以从惊人的少量训练数据中构建，（ii）此类控制器的集合可以插值形成更全局的控制器。结果总结在补充视频中：https://youtu.be/E0wmO6deqjo

英文摘要

We explore learning-based approaches for feedback control of a dexterous five-finger hand performing non-prehensile manipulation. First, we learn local controllers that are able to perform the task starting at a predefined initial state. These controllers are constructed using trajectory optimization with respect to locally-linear time-varying models learned directly from sensor data. In some cases, we initialize the optimizer with human demonstrations collected via teleoperation in a virtual environment. We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state. We then consider two interpolation methods for generalizing to a wider range of initial conditions: deep learning, and nearest neighbors. We find that nearest neighbors achieve higher performance. Nevertheless, the neural network has its advantages: it uses only tactile and proprioceptive feedback but no visual feedback about the object (i.e. it performs the task blind) and learns a time-invariant policy. In contrast, the nearest neighbors method switches between time-varying local controllers based on the proximity of initial object states sensed via motion capture. While both generalization methods leave room for improvement, our work shows that (i) local trajectory-based controllers for complex non-prehensile manipulation tasks can be constructed from surprisingly small amounts of training data, and (ii) collections of such controllers can be interpolated to form more global controllers. Results are summarized in the supplementary video: https://youtu.be/E0wmO6deqjo

1508.07933 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Coordinate Dual Averaging for Decentralized Online Optimization with Nonseparable Global Objectives

协调双平均法用于具有非分离全局目标的去中心化在线优化

Soomin Lee, Angelia Nedić, Maxim Raginsky

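AI summary: This paper proposes two decentralized variants, ODA-C and ODA-PS, for decentralized online convex optimization, and uses dual averaging to achieve sublinear regret growth.

Comments 10 pages; accepted for publication in IEEE Transactions on Control of Network Systems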

DOI: 10.1109/TCNS.2016.2573639

AI中文摘要

我们考虑了一个网络中代理的去中心化在线凸优化问题，每个代理仅控制全局决策向量的一个坐标（或部分）。针对此类问题，我们提出了两种去中心化变体（ODA-C和ODA-PS）的Nesterov原始-对偶算法变体，带有双平均。在ODA-C中，为缓解对偶向量更新的分歧，代理在静态无向图上实现了一种最近由Li和Marden提出的局部信息交换动态的泛化。在ODA-PS中，代理在时间变化的均匀连接有向图序列上实现基于广播的推送-求和动态。我们证明在步长形式为1/√t且目标函数为Lipschitz连续凸函数且具有Lipschitz梯度时，两种情况下的悔度界具有O(√T)的次线性增长，其中T为时间跨度。我们还将在传感器网络上实现所提出算法以补充我们的理论分析。

英文摘要

We consider a decentralized online convex optimization problem in a network of agents, where each agent controls only a coordinate (or a part) of the global decision vector. For such a problem, we propose two decentralized variants (ODA-C and ODA-PS) of Nesterov's primal-dual algorithm with dual averaging. In ODA-C, to mitigate the disagreements on the primal-vector updates, the agents implement a generalization of the local information-exchange dynamics recently proposed by Li and Marden over a static undirected graph. In ODA-PS, the agents implement the broadcast-based push-sum dynamics over a time-varying sequence of uniformly connected digraphs. We show that the regret bounds in both cases have sublinear growth of $O(\sqrt{T})$, with the time horizon $T$, when the stepsize is of the form $1/\sqrt{t}$ and the objective functions are Lipschitz-continuous convex functions with Lipschitz gradients. We also implement the proposed algorithms on a sensor network to complement our theoretical analysis.
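As background for the $1/\sqrt{t}$ stepsize, here is a minimal single-agent dual averaging loop in Python. It is a centralized, one-dimensional sketch and does not implement ODA-C or ODA-PS: there is no network, no coordinate ownership, and no information exchange. The quadratic regularizer, the interval constraint, and the fixed quadratic loss are assumptions made for illustration.

```python
import math

def dual_averaging(grad, T, lo=-1.0, hi=1.0):
    """Single-agent dual averaging on the interval [lo, hi] with step 1/sqrt(t)."""
    z = 0.0          # dual variable: running sum of observed gradients
    x = 0.0          # primal iterate; 0 minimizes the x^2 / 2 regularizer
    iterates = []
    for t in range(1, T + 1):
        iterates.append(x)
        z += grad(x, t)                      # observe the gradient of f_t at x_t
        step = 1.0 / math.sqrt(t)
        # x_{t+1} = argmin over [lo, hi] of z * x + x^2 / (2 * step)
        x = min(hi, max(lo, -step * z))
    return iterates

target = 0.5
# Hypothetical loss f_t(x) = (x - target)^2 / 2 at every round.
iterates = dual_averaging(lambda x, t: x - target, T=2000)
final_x = iterates[-1]
```

With a fixed loss the iterates drift toward the minimizer at roughly a $1/\sqrt{t}$ rate; the decentralized variants in the abstract additionally have to reconcile the agents' differing views of the dual and primal vectors.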

1412.7215 2026-06-04 math.OC cs.DS cs.LG cs.MA cs.SY eess.SY 版本更新

Online Distributed Optimization on Dynamic Networks

动态网络上的在线分布式优化

Saghar Hosseini, Airlie Chapman, Mehran Mesbahi

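AI summary: This paper proposes a distributed optimization scheme for settings with cost uncertainty and switching communication topologies. Agents cooperatively minimize a cost function using a dual subgradient averaging algorithm, and the effect of network topology on the convergence rate is analyzed.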

Comments Submitted to The IEEE Transactions on Automatic Control, 2014

1410.7057 2026-06-04 cs.LG cs.DC cs.SY eess.SY stat.ML 版本更新

Sparse Distributed Learning via Heterogeneous Diffusion Adaptive Networks

稀疏分布式学习 via 异质扩散自适应网络

Bijit Kumar Das, Mrityunjoy Chakraborty, Jerónimo Arenas-García

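AI summary: This paper proposes distributed estimation of sparse parameter vectors over heterogeneous diffusion adaptive networks, applying convex regularization only at selected nodes to reduce computational overhead while retaining optimal performance.

Comments 4 pages, 1 figure, conference, submitted to IEEE ISCAS 2015, Lisbon, Portugal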

DOI: 10.1109/ISCAS.2015.7168664

AI中文摘要

近年来，关于通过扩散LMS策略在网内进行稀疏参数向量分布式估计的研究已有所涉及。在所有现有工作中，每个网络节点都使用了一些凸正则化方法，以实现优于简单扩散LMS的整体网络性能，尽管这导致了计算开销的增加。本文提供了分析和实验结果，表明凸正则化可以仅应用于某些选定的节点，其余节点保持稀疏性无感知，同时仍能实现与在所有节点上部署凸正则化相同最优行为。由于在部分节点中采用无正则化学习，所提出的方法需要更少的计算成本。我们还提供了一条选择稀疏感知节点的指南和最优正则化参数的闭式表达式。

英文摘要

In-network distributed estimation of sparse parameter vectors via diffusion LMS strategies has been studied and investigated in recent years. In all the existing works, some convex regularization approach has been used at each node of the network in order to achieve an overall network performance superior to that of the simple diffusion LMS, albeit at the cost of increased computational overhead. In this paper, we provide analytical as well as experimental results which show that the convex regularization can be selectively applied only to some chosen nodes keeping rest of the nodes sparsity agnostic, while still enjoying the same optimum behavior as can be realized by deploying the convex regularization at all the nodes. Due to the incorporation of unregularized learning at a subset of nodes, less computational cost is needed in the proposed approach. We also provide a guideline for selection of the sparsity aware nodes and a closed form expression for the optimum regularization parameter.
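A rough picture of mixing sparsity-aware and sparsity-agnostic nodes is given by the Python sketch below. It is a two-node adapt-then-combine toy with an $\ell_1$ (zero-attracting) penalty at only one node; the update form, uniform averaging, noise-free data, and all parameter values are assumptions for illustration and are not the paper's algorithm or its optimal regularization parameter.

```python
def sign(v):
    """Sign of v as -1, 0, or 1."""
    return (v > 0) - (v < 0)

def lms_step(w, u, d, mu, rho=0.0):
    """One LMS update; rho > 0 adds a zero-attracting l1 term, rho = 0 is plain LMS."""
    error = d - sum(wi * ui for wi, ui in zip(w, u))
    return [wi + mu * error * ui - rho * sign(wi) for wi, ui in zip(w, u)]

w_true = [1.0, 0.0, 0.0]        # hypothetical sparse parameter vector
rhos = [1e-3, 0.0]              # node 0 is sparsity-aware, node 1 is sparsity-agnostic
estimates = [[0.5, 0.5, 0.5] for _ in rhos]

for n in range(90):
    # Deterministic regressor that cycles through the coordinate directions.
    u = [1.0 if i == n % 3 else 0.0 for i in range(3)]
    d = sum(wt * ui for wt, ui in zip(w_true, u))    # noise-free measurement
    # Adapt: each node takes its own LMS step with its own regularization.
    adapted = [lms_step(w, u, d, mu=0.5, rho=rho) for w, rho in zip(estimates, rhos)]
    # Combine: every node replaces its estimate with the network average.
    combined = [sum(col) / len(adapted) for col in zip(*adapted)]
    estimates = [list(combined) for _ in rhos]

final_estimate = estimates[0]
```

Because of the combination step, the agnostic node still benefits from the pull toward zero applied at the aware node, while only one node pays for the extra regularization term.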

1610.05838 2026-06-04 cs.LG cs.NA math.NA 版本更新

CuMF_SGD: Fast and Scalable Matrix Factorization

CuMF_SGD：快速且可扩展的矩阵分解

Xiaolong Xie, Wei Tan, Liana L. Fong, Yun Liang

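AI summary: This paper proposes CuMF_SGD, which uses the high memory bandwidth and fast intra-node connections of GPUs to accelerate large-scale matrix factorization. With batch-Hogwild! and wavefront-update scheduling and optimized kernels, a single GPU runs 3.1X-28.2X as fast as CPU solutions, and the method scales to multiple GPUs.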

AI中文摘要

矩阵分解（MF）已广泛应用于推荐系统、主题建模和词嵌入等领域。随机梯度下降（SGD）因其能处理大数据集和易于增量学习而流行。我们发现SGD用于MF是内存受限的。单节点CPU系统带缓存仅适用于小数据集；分布式系统具有更高的聚合内存带宽但网络连接相对较慢。这一观察启发我们通过利用GPU的高内存带宽和快速节点连接来加速MF。我们提出了cuMF_SGD，一种基于CUDA的SGD解决方案用于大规模MF问题。在单个CPU上，我们设计了两种工作负载调度方案，即批量Hogwild!和波前更新，充分利用大量核心。特别是，批量Hogwild!作为Hogwild!的向量版本克服了内存不连续的问题。我们还开发了高度优化的SGD更新内核，利用缓存、 warp-shuffle指令和半精度浮点数。我们还设计了分区方案以利用多个GPU，同时解决SGD并行化时的收敛问题。在仅使用一个Maxwell或Pascal GPU的三个数据集上，cuMF_SGD相比1-64个CPU节点的最新CPU解决方案快3.1X-28.2X。评估还显示cuMF_SGD在大数据集上能良好扩展到多个GPU。

英文摘要

Matrix factorization (MF) has been widely used in, e.g., recommender systems, topic modeling and word embedding. Stochastic gradient descent (SGD) is popular in solving MF problems because it can deal with large data sets and makes incremental learning easy. We observed that SGD for MF is memory bound. Meanwhile, single-node CPU systems with caching perform well only for small data sets; distributed systems have higher aggregated memory bandwidth but suffer from relatively slow network connections. This observation inspires us to accelerate MF by utilizing GPUs' high memory bandwidth and fast intra-node connection. We present cuMF_SGD, a CUDA-based SGD solution for large-scale MF problems. On a single GPU, we design two workload scheduling schemes, i.e., batch-Hogwild! and wavefront-update, that fully exploit the massive number of cores. In particular, batch-Hogwild!, as a vectorized version of Hogwild!, overcomes the issue of memory discontinuity. We also develop highly optimized kernels for the SGD update, leveraging cache, warp-shuffle instructions and half-precision floats. We also design a partition scheme to utilize multiple GPUs while addressing the well-known convergence issue when parallelizing SGD. On three data sets with only one Maxwell or Pascal GPU, cuMF_SGD runs 3.1X-28.2X as fast as state-of-the-art CPU solutions on 1-64 CPU nodes. Evaluations also show that cuMF_SGD scales well on multiple GPUs in large data sets.
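For readers unfamiliar with the per-rating update that cuMF_SGD parallelizes, a serial reference version might look like the Python sketch below. It is a plain single-threaded illustration, assuming an L2-regularized squared loss; it contains none of the GPU scheduling (batch-Hogwild!, wavefront-update) described in the abstract, and the toy ratings, initial factors, learning rate, and regularization weight are made up.

```python
def sgd_update(p, q, rating, lr=0.05, reg=0.01):
    """One SGD step on a single observed rating with user factors p and item factors q."""
    err = rating - sum(pi * qi for pi, qi in zip(p, q))
    new_p = [pi + lr * (err * qi - reg * pi) for pi, qi in zip(p, q)]
    new_q = [qi + lr * (err * pi - reg * qi) for pi, qi in zip(p, q)]
    return new_p, new_q, err

def rmse(P, Q, ratings):
    """Root-mean-square error of the factor model on the observed ratings."""
    sq = [(r - sum(pi * qi for pi, qi in zip(P[u], Q[v]))) ** 2 for u, v, r in ratings]
    return (sum(sq) / len(sq)) ** 0.5

# Hypothetical toy data: (user, item, rating) triples and small initial factors.
ratings = [(0, 0, 5.0), (0, 1, 3.0), (1, 0, 4.0), (1, 1, 2.0)]
P = [[0.1, 0.2], [0.2, 0.1]]
Q = [[0.1, 0.1], [0.2, 0.3]]

rmse_before = rmse(P, Q, ratings)
for epoch in range(200):
    for u, v, r in ratings:
        P[u], Q[v], _ = sgd_update(P[u], Q[v], r)
rmse_after = rmse(P, Q, ratings)
```

Each update reads and writes only one row of `P` and one row of `Q` and does very little arithmetic per byte moved, which is consistent with the abstract's observation that SGD for MF is memory bound.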

1407.1537 2026-06-04 cs.DS cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent

线性耦合：梯度下降与镜像下降的终极统一

Zeyuan Allen-Zhu, Lorenzo Orecchia

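AI summary: This paper proposes linear coupling, which unifies gradient descent and mirror descent by combining them. It gives a new interpretation of Nesterov's accelerated gradient method and extends to settings where Nesterov's method does not apply.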

Comments A new section added; polished writing

1611.01142 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Fast Bayesian Non-negative Matrix Factorisation and Tri-factorisation

快速的贝叶斯非负矩阵分解与三因子分解

Thomas Brouwer, Jes Frellsen, Pietro Lio'

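AI summary: This paper proposes a fast variational Bayesian algorithm for non-negative matrix factorisation and tri-factorisation. Compared with Gibbs sampling and non-probabilistic approaches, it converges faster both per iteration and in wall-clock time, and it needs no additional samples to estimate the posterior.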

Comments NIPS 2016 Workshop on Advances in Approximate Bayesian Inference

1606.00119 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

Contextual Bandits with Latent Confounders: An NMF Approach

Rajat Sen, Karthikeyan Shanmugam, Murat Kocaoglu, Alexandros G. Dimakis, Sanjay Shakkottai

AI summary: This paper proposes an NMF-based ε-greedy algorithm that balances learning the low-dimensional structure against selecting the best arm, giving regret guarantees for online matrix completion; it is suited to high-dimensional data settings.

Comments 37 pages, 2 figures

Abstract

Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit. The observed context influences the reward obtained through a latent confounder variable with cardinality $m$ ($m \ll L,K$). The arm choice and the latent confounder causally determine the reward, while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for each context in $[L]$ and each arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times m$) and $\mathbf{W}$ ($m \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms) that achieves a balance between learning this low-dimensional structure and selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L\,\mathrm{poly}(m, \log K) \log T \right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known Statistical RIP condition. We further propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(Km\log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback when the rank is greater than one. We further compare the performance of our algorithm with the state of the art on synthetic and real-world data sets.
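
The low-rank reward structure and the ε-greedy choice described above can be illustrated with a small sketch. It is not the paper's NMF-Bandit algorithm: the helper `epsilon_greedy_arm`, the sizes `L`, `K`, `m` and the random factors are assumptions chosen only to show that $\mathbf{U} = \mathbf{A}\mathbf{W}$ has rank at most $m$ and how an arm would be picked from an estimate of $\mathbf{U}$.

```python
import numpy as np

def epsilon_greedy_arm(U_hat, context, eps, rng):
    """With probability eps explore a uniformly random arm, otherwise exploit the estimate."""
    n_arms = U_hat.shape[1]
    if rng.random() < eps:
        return int(rng.integers(n_arms))
    return int(np.argmax(U_hat[context]))

L, K, m = 6, 5, 2                # contexts, arms, confounder cardinality (illustrative sizes)
rng = np.random.default_rng(0)
A = rng.random((L, m))           # non-negative factor, L x m
W = rng.random((m, K))           # non-negative factor, m x K
U = A @ W                        # mean reward matrix, L x K, rank at most m
```

The factorization needs only `L * m + m * K` numbers instead of `L * K`, which is the intuition behind replacing the `LK` term in the regret with a term that depends on `L` and a polynomial in `m` and `log K`.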

1610.07722 2026-06-04 cs.LG cs.NA math.NA version update

Sparse Hierarchical Tucker Factorization and its Application to Healthcare

Ioakeim Perros, Robert Chen, Richard Vuduc, Jimeng Sun

AI summary: This paper proposes Sparse Hierarchical Tucker factorization for sparse, high-order tensor data. It uses a nested sampling technique to address the scalability problem of the classical Hierarchical Tucker method, improving efficiency and accuracy, and validates its performance on a healthcare data set.

Comments This is an extended version of a paper presented at the 15th IEEE International Conference on Data Mining (ICDM 2015)

Abstract

We propose a new tensor factorization method, called the Sparse Hierarchical-Tucker (Sparse H-Tucker), for sparse and high-order data tensors. Sparse H-Tucker is inspired by its namesake, the classical Hierarchical Tucker method, which aims to compute a tree-structured factorization of an input data set that may be readily interpreted by a domain expert. However, Sparse H-Tucker uses a nested sampling technique to overcome a key scalability problem in Hierarchical Tucker, which is the creation of an unwieldy intermediate dense core tensor; the result of our approach is a faster, more space-efficient, and more accurate method. We extensively test our method on a real healthcare dataset, which is collected from 30K patients and results in an 18th order sparse data tensor. Unlike competing methods, Sparse H-Tucker can analyze the full data set on a single multi-threaded machine. It can also do so more accurately and in less time than the state-of-the-art: on a 12th order subset of the input data, Sparse H-Tucker is 18x more accurate and 7.5x faster than a previously state-of-the-art method. Even for analyzing low order tensors (e.g., 4-order), our method requires close to an order of magnitude less time and over two orders of magnitude less memory, as compared to traditional tensor factorization methods such as CP and Tucker. Moreover, we observe that Sparse H-Tucker scales nearly linearly in the number of non-zero tensor elements. The resulting model also provides an interpretable disease hierarchy, which is confirmed by a clinical expert.

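1508.00506 2026-06-04 math.OC cs.LG cs.SY eess.SY math.PR math.ST stat.TH version update

A variational approach to path estimation and parameter inference of hidden diffusion processes

Tobias Sutter, Arnab Ganguly, Heinz Koeppl

AI summary: This paper proposes a variational approach for estimating the paths of hidden diffusion processes and inferring their parameters, using an efficient inference scheme to improve the accuracy of parameter estimates for stochastic differential equations.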

Comments 37 pages, 2 figures, revised

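1610.07520 2026-06-04 eess.SY cs.LG cs.SY version update

Nonlinear Adaptive Algorithms on Rank-One Tensor Models

Felipe C. Pinheiro, Cassio G. Lopes

AI summary: This paper proposes low-complexity nonlinear models based on decomposable Volterra kernels, derives an exact gradient-type algorithm, and from it develops an LMS filter and the TRUE-LMS algorithm, whose superior performance in nonlinear processing is verified by simulation.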

1509.05009 2026-06-04 cs.NE cs.LG cs.NA math.NA stat.ML version update

On the Expressive Power of Deep Learning: A Tensor Analysis

Nadav Cohen, Or Sharir, Amnon Shashua

AI summary: This paper analyzes the expressive power of deep learning through tensor factorization theory, proving that functions realized by a polynomial-size deep network require an exponential-size shallow network to approximate.

Journal ref: 29th Annual Conference on Learning Theory, pp. 698-728, 2016

Abstract

It has long been conjectured that hypotheses spaces suitable for data that is compositional in nature, such as text or images, may be more efficiently represented with deep hierarchical networks than with shallow ones. Despite the vast empirical evidence supporting this belief, theoretical justifications to date are limited. In particular, they do not account for the locality, sharing and pooling constructs of convolutional networks, the most successful deep learning architecture to date. In this work we derive a deep network architecture based on arithmetic circuits that inherently employs locality, sharing and pooling. An equivalence between the networks and hierarchical tensor factorizations is established. We show that a shallow network corresponds to CP (rank-1) decomposition, whereas a deep network corresponds to Hierarchical Tucker decomposition. Using tools from measure theory and matrix algebra, we prove that besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require exponential size in order to be realized (or even approximated) by a shallow network. Since log-space computation transforms our networks into SimNets, the result applies directly to a deep learning architecture demonstrating promising empirical performance. The construction and theory developed in this paper shed new light on various practices and ideas employed by the deep learning community.

1610.04042 2026-06-04 eess.SY cs.LG cs.SY version update

Generalized Online Transfer Learning for Climate Control in Residential Buildings

Thomas Grubinger, Georgios Chasparis, Thomas Natschlaeger

AI summary: This paper proposes an online transfer learning framework for improving temperature prediction in residential buildings. It combines target-domain and source-domain predictors in the generalized online transfer learning algorithm (GOTL), which guarantees convergence to the best weighted predictor, and uses Transfer Component Analysis (TCA) to transfer knowledge from multiple source domains; experiments show it can yield significant energy savings.

Abstract

This paper presents an online transfer learning framework for improving temperature predictions in residential buildings. In transfer learning, prediction models trained on the available data from a target domain (e.g., a house with limited data) can be improved through the use of data generated from similar source domains (e.g., houses with rich data). Given also the need for prediction models that can be trained online (e.g., as part of a model-predictive-control implementation), this paper introduces the generalized online transfer learning algorithm (GOTL). It employs a weighted combination of the available predictors (i.e., the target and source predictors) and guarantees convergence to the best weighted predictor. Furthermore, the use of Transfer Component Analysis (TCA) allows for using more than one source domain, since it may facilitate the fit of a single model on several source domains (houses). This allows GOTL to transfer knowledge from more than one source domain. We further validate our results through experiments in climate control for residential buildings and show that GOTL may lead to non-negligible energy savings for given comfort levels.

1610.03518 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY version update

Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

Paul Christiano, Zain Shah, Igor Mordatch, Jonas Schneider, Trevor Blackwell, Joshua Tobin, Pieter Abbeel, Wojciech Zaremba

AI summary: This paper proposes learning a deep inverse dynamics model to transfer control policies from simulation to the real world, addressing the performance degradation caused by discrepancies between simulation and reality.

Abstract

Developing control policies in simulation is often more practical and safer than directly running experiments in the real world. This applies to policies obtained from planning and optimization, and even more so to policies obtained from reinforcement learning, which is often very data demanding. However, a policy that succeeds in simulation often doesn't work when deployed on a real robot. Nevertheless, the overall gist of what the policy does in simulation often remains valid in the real world. In this paper we investigate such settings, where the sequence of states traversed in simulation remains reasonable for the real world, even if the details of the controls are not, as could be the case when the key differences lie in detailed friction, contact, mass and geometry properties. During execution, at each time step our approach computes what the simulation-based control policy would do, but then, rather than executing these controls on the real robot, our approach computes what the simulation expects the resulting next state(s) will be, and then relies on a learned deep inverse dynamics model to decide which real-world action is most suitable to achieve those next states. Deep models are only as good as their training data, and we also propose an approach for data collection to (incrementally) learn the deep inverse dynamics model. Our experiments show that our approach compares favorably with various baselines that have been developed for dealing with simulation to real world model discrepancy, including output error control and Gaussian dynamics adaptation.

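1604.08382 2026-06-04 cs.LG cs.SY eess.SY version update

Convolutional Neural Networks For Automatic State-Time Feature Extraction in Reinforcement Learning Applied to Residential Load Control

Bert J. Claessens, Peter Vrancx, Frederik Ruelens

AI summary: This paper proposes using convolutional neural networks to extract hidden state-time features, mitigating the curse of partial observability. The state-action value function is estimated in the supervised learning step of fitted Q-iteration, and the method's effectiveness is verified on residential load control.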

Comments Submitted to Transactions on Smart Grid

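1509.01404 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC stat.ML version update

Coordinate Descent Methods for Symmetric Nonnegative Matrix Factorization

Arnaud Vandaele, Nicolas Gillis, Qi Lei, Kai Zhong, Inderjit Dhillon

AI summary: This paper proposes efficient coordinate descent methods for symmetric nonnegative matrix factorization that are applicable to large sparse matrices, and demonstrates their effectiveness experimentally on synthetic and real-world data sets.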

Comments 25 pages, 5 figures, 7 tables. Main changes: comparison with another symNMF algorithm (namely, BetaSNMF), and correction of an error in the convergence proof

1411.7245 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML version update

Heuristics for Exact Nonnegative Matrix Factorization

Arnaud Vandaele, Nicolas Gillis, François Glineur, Daniel Tuyttens

AI summary: This paper proposes two heuristics for exact nonnegative matrix factorization, one based on simulated annealing and one on the greedy randomized adaptive search procedure, shows their advantages on several classes of nonnegative matrices, and discusses the behavior of the nonnegative rank.

Comments 32 pages, 2 figures, 16 tables

DOI: 10.1007/s10898-015-0350-z
Journal ref: Journal of Global Optimization 65 (2), pp 369-400, 2016

Abstract

The exact nonnegative matrix factorization (exact NMF) problem is the following: given an $m$-by-$n$ nonnegative matrix $X$ and a factorization rank $r$, find, if possible, an $m$-by-$r$ nonnegative matrix $W$ and an $r$-by-$n$ nonnegative matrix $H$ such that $X = WH$. In this paper, we propose two heuristics for exact NMF, one inspired from simulated annealing and the other from the greedy randomized adaptive search procedure. We show that these two heuristics are able to compute exact nonnegative factorizations for several classes of nonnegative matrices (namely, linear Euclidean distance matrices, slack matrices, unique-disjointness matrices, and randomly generated matrices) and as such demonstrate their superiority over standard multi-start strategies. We also consider a hybridization between these two heuristics that allows us to combine the advantages of both methods. Finally, we discuss the use of these heuristics to gain insight on the behavior of the nonnegative rank, i.e., the minimum factorization rank such that an exact NMF exists. In particular, we disprove a conjecture on the nonnegative rank of a Kronecker product, propose a new upper bound on the extension complexity of generic $n$-gons and conjecture the exact value of (i) the extension complexity of regular $n$-gons and (ii) the nonnegative rank of a submatrix of the slack matrix of the correlation polytope.
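
The definition of an exact NMF given above can be checked numerically with a short sketch. The function name `is_exact_nmf` and the tolerance `tol` are assumptions for illustration; the heuristics proposed in the paper search for such a pair $(W, H)$, whereas this code only verifies a candidate.

```python
import numpy as np

def is_exact_nmf(X, W, H, tol=1e-9):
    """Return True if W and H are entrywise nonnegative and W @ H equals X up to tol."""
    X, W, H = np.asarray(X), np.asarray(W), np.asarray(H)
    # W must be m-by-r and H must be r-by-n for an m-by-n matrix X.
    if W.shape[1] != H.shape[0] or (W.shape[0], H.shape[1]) != X.shape:
        return False
    nonneg = bool((W >= 0).all() and (H >= 0).all())
    return nonneg and bool(np.allclose(W @ H, X, rtol=0.0, atol=tol))
```

The nonnegative rank of $X$ is then the smallest inner dimension $r$ for which some pair $(W, H)$ passes this check.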

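1609.05587 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT version update

Tensor Completion by Alternating Minimization under the Tensor Train (TT) Model

Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron

AI summary: This paper proposes an alternating minimization algorithm for tensor completion based on the tensor train decomposition. By alternately optimizing the matrices (tensors) in the MPS representation, it outperforms existing methods in computational complexity and numerical performance.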

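1609.09681 2026-06-04 cs.LG cs.SY eess.SY version update

Predicting the consequence of action in digital control state spaces

Emmanuel Daucé

AI summary: This paper discusses the obstacles to learning control laws in continuous state spaces and proposes borrowing the end-effector control principle from neuroscience, rather than the traditional displacement control principle, to learn actions more effectively.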

1609.09660 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

On Identification of Sparse Multivariable ARX Model: A Sparse Bayesian Learning Approach

J. Jin, Y. Yuan, W. Pan, D. L. T. Pham, C. J. Tomlin, A. Webb, J. Goncalves

AI summary: This paper proposes a sparse Bayesian learning method for identifying the Boolean structure and inter-node dynamics of sparse multivariable ARX models without prior knowledge, using maximum a posteriori estimation with penalties on complexity and group sparsity.

Abstract

This paper begins by considering the identification of sparse linear time-invariant networks described by multivariable ARX models. Such models possess a relatively simple structure and are thus used as a benchmark to promote further research. With identifiability of the network guaranteed, this paper presents an identification method that infers both the Boolean structure of the network and the internal dynamics between nodes. Identification is performed directly from data without any prior knowledge of the system, including its order. The proposed method solves the identification problem using maximum a posteriori (MAP) estimation but with inseparable penalties for complexity, both in terms of element sparsity (order of nonzero connections) and group sparsity (network topology). Such an approach is widely applied in Compressive Sensing (CS) and is known as Sparse Bayesian Learning (SBL). We then propose a novel scheme that combines sparse Bayesian and group sparse Bayesian learning to solve the problem efficiently. The resulting algorithm has a form similar to the standard Sparse Group Lasso (SGL), and with known noise variance it simplifies to exact re-weighted SGL. The method and the developed toolbox can be applied to infer networks from a wide range of fields, including systems biology applications such as signaling and genetic regulatory networks.

1609.03240 2026-06-04 stat.ML cs.IT cs.LG cs.NA math.IT math.NA math.OC version update

Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach

Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, Sujay Sanghavi

AI summary: Under a restricted isometry property (RIP) assumption, this paper studies the non-square matrix sensing problem and shows, via a non-convex approach, that matrix factorization introduces no spurious local minima under the RIP condition.

Comments 14 pages, no figures

1609.06942 2026-06-04 stat.ML cs.LG cs.SY eess.SY math.PR math.ST stat.TH version update

Randomized Independent Component Analysis

Matan Sela, Ron Kimmel

AI summary: This paper proposes the randomized generalized variance and the randomized canonical correlation, both based on random features, as alternative measures that reduce computational complexity and make ICA decomposition more efficient.

Comments Accepted to ICSEE 2016

Abstract

Independent component analysis (ICA) is a method for recovering statistically independent signals from observations of unknown linear combinations of the sources. Some of the most accurate ICA decomposition methods require searching for the inverse transformation that minimizes different approximations of the mutual information, a measure of statistical independence of random vectors. Two such approximations are the Kernel Generalized Variance and the Kernel Canonical Correlation, which have been shown to reach the highest performance among ICA methods. However, the computational effort necessary just for computing these measures is cubic in the sample size. Hence, optimizing them becomes even more computationally demanding, in terms of both space and time. Here, we propose a couple of alternative novel measures based on randomized features of the samples: the Randomized Generalized Variance and the Randomized Canonical Correlation. The computational complexity of calculating the proposed alternatives is linear in the sample size, and they provide a controllable approximation of their kernel-based non-random versions. We also show that optimizing the proposed measures yields a comparable separation error an order of magnitude faster than kernel-based measures.

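1602.02164 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

A Note on Alternating Minimization Algorithm for the Matrix Completion Problem

David Gamarnik, Sidhant Misra

AI summary: This paper analyzes two variants of the alternating minimization algorithm for low-rank matrix completion, proves that when the matrix has rank 1 and certain conditions hold the algorithm approximately reconstructs the matrix in polynomial time, and shows by simulation that the second variant, based on message-passing-type updates, performs better.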

Comments 8 pages, 2 figures

1609.04167 2026-06-04 math.NA cs.CV cs.IT cs.LG cs.NA math.IT math.OC version update

Proceedings of the third "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'16)

V. Abrol, O. Absil, P. -A. Absil, S. Anthoine, P. Antoine, T. Arildsen, N. Bertin, F. Bleichrodt, J. Bobin, A. Bol, A. Bonnefoy, F. Caltagirone, V. Cambareri, C. Chenot, V. Crnojević, M. Daňková, K. Degraux, J. Eisert, J. M. Fadili, M. Gabrié, N. Gac, D. Giacobello, A. Gonzalez, C. A. Gomez Gonzalez, A. González, P. -Y. Gousenbourger, M. Græsbøll Christensen, R. Gribonval, S. Guérit, S. Huang, P. Irofti, L. Jacques, U. S. Kamilov, S. Kitić, M. Kliesch, F. Krzakala, J. A. Lee, W. Liao, T. Lindstrøm Jensen, A. Manoel, H. Mansour, A. Mohammad-Djafari, A. Moshtaghpour, F. Ngolè, B. Pairet, M. Panić, G. Peyré, A. Pižurica, P. Rajmic, M. Roblin, I. Roth, A. K. Sao, P. Sharma, J. -L. Starck, E. W. Tramel, T. van Waterschoot, D. Vukobratovic, L. Wang, B. Wirth, G. Wunder, H. Zhang

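AI summary: These proceedings explore the interactions between sparse models and technology, covering data sensing, non-convex inverse problems, probabilistic inference, machine learning and related areas, and aim to foster international scientific collaboration through talks and discussions.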

Comments 69 pages, 22 extended abstracts, iTWIST'16 website: http://www.itwist16.es.aau.dk

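Identifying Distribution Network Topology Based on Smart Meter Data

Jayadev P Satya, Nirav Bhatt, Ramkrishna Pasumarthy, Aravind Rajeswaran

AI summary: This paper proposes a data-driven approach that uses smart meter energy consumption data, together with principal component analysis and a graph-theoretic interpretation, to identify distribution network topology and load phase connectivity.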

Comments Submitted to IEEE transaction on smart grid

1504.05854 2026-06-04 cs.LG cs.NA math.NA math.OC version update

On-the-fly Approximation of Multivariate Total Variation Minimization

Jordan Frecon, Nelly Pustelnik, Patrice Abry, Laurent Condat

AI summary: This paper proposes an on-the-fly algorithm for multivariate total variation minimization that locally validates the KKT conditions of the dual problem, yielding high-quality approximate solutions that balance accuracy against computational cost.

DOI: 10.1109/TSP.2016.2516962

Abstract

In the context of change-point detection, addressed by Total Variation minimization strategies, an efficient on-the-fly algorithm has been designed that leads to exact solutions for univariate data. In this contribution, an extension of such an on-the-fly strategy to multivariate data is investigated. The proposed algorithm relies on the local validation of the Karush-Kuhn-Tucker conditions on the dual problem. Showing that the non-local nature of the multivariate setting precludes obtaining an exact on-the-fly solution, we devise an on-the-fly algorithm delivering an approximate solution, whose quality is controlled by a practitioner-tunable parameter acting as a trade-off between quality and computational cost. Performance assessment shows that high-quality solutions are obtained on the fly, at computational costs several orders of magnitude lower than those of standard iterative procedures. The proposed algorithm thus provides practitioners with an efficient on-the-fly procedure for multivariate change-point detection.

1511.04695 2026-06-04 math.NA cs.LG cs.NA version update

An Iterative Reweighted Method for Tucker Decomposition of Incomplete Multiway Tensors

Linxiao Yang, Jun Fang, Hongbin Li, Bing Zeng

AI summary: This paper proposes an iterative reweighted method based on a group log-sum penalty for low-rank decomposition of incomplete multiway tensors; it obtains a compact representation via multilinear operations and determines the multilinear rank automatically.

DOI: 10.1109/TSP.2016.2572047

Abstract

We consider the problem of low-rank decomposition of incomplete multiway tensors. Since many real-world data lie on an intrinsically low dimensional subspace, tensor low-rank decomposition with missing entries has applications in many data analysis problems such as recommender systems and image inpainting. In this paper, we focus on Tucker decomposition which represents an Nth-order tensor in terms of N factor matrices and a core tensor via multilinear operations. To exploit the underlying multilinear low-rank structure in high-dimensional datasets, we propose a group-based log-sum penalty functional to place structural sparsity over the core tensor, which leads to a compact representation with smallest core tensor. The method for Tucker decomposition is developed by iteratively minimizing a surrogate function that majorizes the original objective function, which results in an iterative reweighted process. In addition, to reduce the computational complexity, an over-relaxed monotone fast iterative shrinkage-thresholding technique is adapted and embedded in the iterative reweighted process. The proposed method is able to determine the model complexity (i.e. multilinear rank) in an automatic way. Simulation results show that the proposed algorithm offers competitive performance compared with other existing algorithms.
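
The Tucker representation mentioned above, an Nth-order tensor expressed through N factor matrices and a core tensor via multilinear operations, can be sketched as follows. This shows only the reconstruction step, not the paper's iterative reweighted method; the function name `tucker_reconstruct` is an assumption for illustration.

```python
import numpy as np

def tucker_reconstruct(core, factors):
    """Multiply the core tensor by one factor matrix along each mode."""
    out = core
    for mode, U in enumerate(factors):
        # Contract the columns of U with the current mode, then move the new axis back in place.
        out = np.moveaxis(np.tensordot(U, out, axes=(1, mode)), 0, mode)
    return out
```

The shape of `core` is the multilinear rank, so a method that shrinks the core, as the group log-sum penalty aims to do, directly yields a more compact representation.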

1510.05237 2026-06-04 cs.LG cs.NA cs.SI math.NA version update

Large Enforced Sparse Non-Negative Matrix Factorization

Brendan Gavin, Vijay Gadepally, Jeremy Kepner

AI summary: This paper proposes a modification of NMF that enforces sparse intermediate and output matrices, improving memory and compute performance while preserving or improving topic-model accuracy and the algorithm's convergence rate.

Comments 9 pages

DOI: 10.1109/IPDPSW.2016.58

Abstract

Non-negative matrix factorization (NMF) is a common method for generating topic models from text data. NMF is widely accepted for producing good results despite its relative simplicity of implementation and ease of computation. One challenge with applying NMF to large datasets is that intermediate matrix products often become dense, stressing the memory and compute elements of a system. In this article, we investigate a simple but powerful modification of a common NMF algorithm that enforces the generation of sparse intermediate and output matrices. This method enables the application of NMF to large datasets through improved memory and compute performance. Further, we demonstrate empirically that this method of enforcing sparsity in the NMF either preserves or improves both the accuracy of the resulting topic model and the convergence rate of the underlying algorithm.

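1607.08012 2026-06-04 cs.LG cs.NA math.NA math.OC version update

Learning of Generalized Low-Rank Models: A Greedy Approach

Quanming Yao, James T. Kwok

AI summary: This paper proposes a flexible greedy algorithm for optimizing generalized low-rank models that supports smooth or non-smooth and general convex or strongly convex objectives, with low time complexity and fast convergence; experiments show it is faster than existing methods with comparable or better prediction performance.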

1607.05962 2026-06-04 eess.SY cs.LG cs.SY version update

Indoor occupancy estimation from carbon dioxide concentration

Chaoyang Jiang, Mustafa K. Masood, Yeng Chai Soh, Hua Li

AI summary: This paper proposes an indoor occupancy estimator based on carbon dioxide measurements, uses an improved FS-ELM algorithm to raise estimation accuracy, and introduces the x-tolerance criterion to assess performance; experiments in an office environment reach 94% accuracy.

Comments 11 pages, 7 figures

Abstract

This paper presents an indoor occupancy estimator with which we can estimate the number of real-time indoor occupants based on the carbon dioxide (CO2) measurement. The estimator is actually a dynamic model of the occupancy level. To identify the dynamic model, we propose the Feature Scaled Extreme Learning Machine (FS-ELM) algorithm, which is a variation of the standard Extreme Learning Machine (ELM) but is shown to perform better for the occupancy estimation problem. The measured CO2 concentration suffers from serious spikes. We find that pre-smoothing the CO2 data can greatly improve the estimation accuracy. In real applications, however, we cannot obtain the real-time globally smoothed CO2 data. We provide a way to use the locally smoothed CO2 data instead, which is real-time available. We introduce a new criterion, i.e. $x$-tolerance accuracy, to assess the occupancy estimator. The proposed occupancy estimator was tested in an office room with 24 cubicles and 11 open seats. The accuracy is up to 94 percent with a tolerance of four occupants.
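
The $x$-tolerance accuracy criterion can be illustrated with a short sketch. The abstract does not give the formal definition, so this code assumes it is the fraction of time steps at which the estimated occupancy is within $x$ occupants of the true count; the function name `x_tolerance_accuracy` is hypothetical.

```python
import numpy as np

def x_tolerance_accuracy(true_counts, estimated_counts, x):
    """Fraction of samples whose absolute estimation error is at most x occupants (assumed definition)."""
    true_counts = np.asarray(true_counts)
    estimated_counts = np.asarray(estimated_counts)
    return float(np.mean(np.abs(true_counts - estimated_counts) <= x))
```

For example, with true counts `[10, 12, 0, 5]`, estimates `[8, 17, 0, 5]` and `x = 4`, the absolute errors are 2, 5, 0 and 0, so three of the four samples are within tolerance and the accuracy is 0.75.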

1402.1298 2026-06-04 math.NA cond-mat.stat-mech cs.IT cs.LG cs.NA math.IT stat.ML version update

Phase transitions and sample complexity in Bayes-optimal matrix factorization

Yoshiyuki Kabashima, Florent Krzakala, Marc Mézard, Ayaka Sakata, Lenka Zdeborová

AI summary: This paper studies phase transitions and sample complexity in Bayes-optimal matrix factorization, using statistical mechanics methods to analyze the achievability and computational tractability of inference, and examines the minimal mean-squared error and the performance of an efficient approximate message passing algorithm.

Comments 50 pages, 10 figures

DOI: 10.1109/TIT.2016.2556702
Journal ref: IEEE Transactions on Information Theory (Volume: 62, Issue: 7, Pages: 4228-4265) 2016

Abstract

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.

1606.02193 2026-06-04 cs.NI cs.LG cs.SY eess.SY version update

Adapting Sampling Interval of Sensor Networks Using On-Line Reinforcement Learning

Gabriel Martins Dias, Maddalena Nurchis, Boris Bellalta

AI summary: This paper proposes a reinforcement-learning-based scheme for dynamically adapting the sampling rate, adjusting sensors' sampling intervals on the fly to reduce energy consumption while maintaining data quality.

Comments 6 pages, 2 figures, submitted to the IEEE World Forum on Internet of Things 2016

Abstract

Monitoring Wireless Sensor Networks (WSNs) are composed of sensor nodes that report temperature, relative humidity, and other environmental parameters. The time between two successive measurements is a critical parameter to set during the WSN configuration because it can impact the WSN's lifetime, the wireless medium contention and the quality of the reported data. As trends in monitored parameters can vary significantly between scenarios and over time, identifying a sampling interval suitable for several cases is also challenging. In this work, we propose a dynamic sampling rate adaptation scheme based on reinforcement learning, able to tune sensors' sampling intervals on the fly, according to environmental conditions and application requirements. The primary goal is to set the sampling interval to the best value possible so as to avoid oversampling and save energy, while not missing environmental changes that can be relevant for the application. In simulations, our mechanism could reduce the total number of transmissions by up to 73% compared to a fixed strategy and, simultaneously, maintain the average quality of information provided by the WSN. The inherent flexibility of the reinforcement learning algorithm facilitates its use in several scenarios, so as to exploit the broad scope of the Internet of Things.


1307.4847 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization

在确定性系统中通过价值函数泛化实现高效的强化学习

Zheng Wen, Benjamin Van Roy

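AI summary: This paper proposes the optimistic constraint propagation (OCP) algorithm for efficient exploration and value function generalization. OCP selects optimal actions in finite-horizon deterministic systems and comes with guarantees on efficiency and asymptotic behavior.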

1607.00345 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Convergence Rate of Frank-Wolfe for Non-Convex Objectives

非凸目标函数下Frank-Wolfe算法的收敛速度

Simon Lacoste-Julien

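AI summary: This paper proves that the Frank-Wolfe algorithm converges at a rate of O(1/√t) on non-convex objectives. The analysis is affine invariant and is the first to give a rate similar to that of projected gradient methods, though under a different measure of stationarity.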

Comments 6 pages

1607.00514 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Approximate Joint Matrix Triangularization

近似联合矩阵三角化

Nicolo Colombo, Nikos Vlassis

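AI summary: This paper studies approximate joint triangularization of noisy jointly diagonalizable matrices, derives perturbation bounds based on both theoretical and observable quantities, and discusses applications to tensor decomposition.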

Comments 19 pages

1602.04621 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Deep Exploration via Bootstrapped DQN

通过Bootstrap DQN进行深度探索

Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy

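AI summary: This paper proposes Bootstrapped DQN, which explores efficiently through randomized value functions and improves learning speed and performance in complex environments, notably on Atari games.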

1606.09383 2026-06-04 cs.LG cs.SY eess.SY 版本更新

On Approximate Dynamic Programming with Multivariate Splines for Adaptive Control

基于多变量样条的近似动态规划在自适应控制中的应用

Willem Eerland, Coen de Visser, Erik-Jan van Kampen

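AI summary: This paper proposes an SDP framework based on the RLSTD algorithm and multivariate simplex splines, introduces a local forget factor that preserves spline continuity, and shows experimentally that SDP helps in tracking time-varying systems and improving control performance.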

Comments 23 pages

详情

AI中文摘要

我们定义了一个基于RLSTD算法和多变量简单样条的SDP框架。我们引入了一个局部遗忘因子，能够保持简单样条的连续性。该局部遗忘因子与RLSTD算法结合，产生了一种能够跟踪时变系统的修改RLSTD算法。我们进行了两个数值实验，一个验证了SDP并将其与NDP进行比较，另一个展示了修改后的RLSTD算法在系统参数改变时的恢复速度优势。尽管SDP每时间步需要更多的计算，但实验表明，在相同的功能近似器参数量下，SDP在稳定性和学习率方面优于NDP。第二个实验表明，SDP结合修改后的RLSTD算法在系统参数改变时比原始RLSTD算法恢复得更快，为自适应高性能非线性控制方法铺平了道路。

英文摘要

We define an SDP framework based on the RLSTD algorithm and multivariate simplex B-splines. We introduce a local forget factor capable of preserving the continuity of the simplex splines. This local forget factor is integrated with the RLSTD algorithm, resulting in a modified RLSTD algorithm that is capable of tracking time-varying systems. We present the results of two numerical experiments, one validating SDP and comparing it with NDP, and another showing the advantages of the modified RLSTD algorithm over the original. While SDP requires more computations per time-step, the first experiment shows that for the same number of function approximator parameters, there is an increase in performance in terms of stability and learning rate compared to NDP. The second experiment shows that SDP in combination with the modified RLSTD algorithm allows for faster recovery compared to the original RLSTD algorithm when system parameters are altered, paving the way for an adaptive high-performance non-linear control method.


1606.09333 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Dimension-Free Iteration Complexity of Finite Sum Optimization Problems

有限求和优化问题的无维度迭代复杂性

Yossi Arjevani, Ohad Shamir

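AI summary: This paper presents dimension-free lower bounds that extend the framework of Arjevani et al. to cover standard finite-sum optimization methods and randomized coordinate descent methods, removing the restrictions on iteration count and dimension in existing lower bounds.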

1606.07149 2026-06-04 cs.NE cs.AI cs.CE cs.LG cs.SY eess.SY 版本更新

An Approach to Stable Gradient Descent Adaptation of Higher-Order Neural Units

一种高阶神经单元稳定梯度下降适应的方法

Ivo Bukovsky, Noriyasu Homma

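AI summary: This paper proposes a spectral-radius-based method for evaluating the stability of the weight-update system of higher-order neural units (HONUs) adapted by gradient descent, for both feedforward and recurrent HONUs. Keeping every adaptation step stable ensures stable adaptation of the whole neural architecture to the target data.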

Comments 2016, 13 pages

详情

DOI: 10.1109/TNNLS.2016.2572310
Journal ref: IEEE Transactions on Neural Networks and Learning Systems,ISSN: 2162-237X,2016

AI中文摘要

本文介绍了用于评估高阶神经单元（HONUs）权重更新系统稳定性的方法，该系统采用多项式聚合神经输入（也称为多项式神经网络类别）进行适应，通过梯度下降方法实现前馈和递归HONUs的适应。该方法的核心基于权重更新系统的谱半径，允许在每次适应步骤中监控和维持稳定性。确保权重更新系统的稳定性（在每次单独的适应步骤中）自然导致整个神经架构适应目标数据的稳定性。此外，所用方法强调HONU的权重优化是一个线性问题，因此所提出的方法可以一般扩展到任何其可调整参数为线性的神经架构。

英文摘要

Stability evaluation of a weight-update system of higher-order neural units (HONUs) with polynomial aggregation of neural inputs (also known as classes of polynomial neural networks) for adaptation of both feedforward and recurrent HONUs by a gradient descent method is introduced. An essential core of the approach is based on spectral radius of a weight-update system, and it allows stability monitoring and its maintenance at every adaptation step individually. Assuring stability of the weight-update system (at every single adaptation step) naturally results in adaptation stability of the whole neural architecture that adapts to target data. As an aside, the used approach highlights the fact that the weight optimization of HONU is a linear problem, so the proposed approach can be generally extended to any neural architecture that is linear in its adaptable parameters.


1411.0728 2026-06-04 cs.LG cs.GT cs.SY eess.SY math.OC 版本更新

Approachability in Stackelberg Stochastic Games with Vector Costs

在向量成本的Stackelberg随机博弈中可接近性的研究

Dileep Kalathil, Vivek Borkar, Rahul Jain

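AI summary: Motivated by multi-objective optimization in dynamically changing environments, this paper proposes an approachability strategy for Stackelberg stochastic games with vector costs, and gives a computationally tractable algorithm and a reinforcement learning method.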

Comments 18 Pages, Submitted to Dynamic Games and Applications

详情

AI中文摘要

本文引入了Blackwell [1]在向量值重复博弈中的可接近性概念。著名的Blackwell可接近性定理规定了一种策略，即无论其他参与者的策略如何，都能将给定参与者的平均成本导向给定的目标集。在本文中，受动态变化环境中多目标优化/决策问题的启发，我们研究了具有向量值成本函数的Stackelberg随机博弈的可接近性问题。我们做出了两项主要贡献。首先，我们为Stackelberg随机博弈提供了一种简单且计算上可行的可接近性策略，沿Blackwell的思路。其次，我们提出了一种强化学习算法，用于在转移核未知的情况下学习可接近的策略。我们还作为副产品恢复了Blackwell在凸集情况下可接近性的必要和充分条件，从而实现了完全表征。我们还给出了非凸集的充分条件。

英文摘要

The notion of approachability was introduced by Blackwell [1] in the context of vector-valued repeated games. The famous Blackwell's approachability theorem prescribes a strategy for approachability, i.e., for `steering' the average cost of a given agent towards a given target set, irrespective of the strategies of the other agents. In this paper, motivated by the multi-objective optimization/decision making problems in dynamically changing environments, we address the approachability problem in Stackelberg stochastic games with vector valued cost functions. We make two main contributions. Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's. Secondly, we give a reinforcement learning algorithm for learning the approachable strategy when the transition kernel is unknown. We also recover as a by-product Blackwell's necessary and sufficient condition for approachability for convex sets in this set up and thus a complete characterization. We also give sufficient conditions for non-convex sets.


1602.02990 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Self-organized control for musculoskeletal robots

肌骨机器人中的自组织控制

Ralf Der, Georg Martius

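AI summary: This paper proposes a self-organized control approach in which a controller devoid of any functionality of its own drives the dynamic interaction between robot and environment. It demonstrates self-organized behavior in a muscle-tendon driven arm-shoulder system, including resonance with the dynamics of attached objects.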

Comments 11 pages, 4 figures, 1 table

详情

AI中文摘要

随着机器人技术的快速发展，最优控制成为研究核心。传统方法中，控制器基于传感器历史数据和预设目标进行动作决策。然而，弹性驱动机器人面临严重挑战。本文提出自组织控制新范式，采用无自身功能的固定函数控制器，基于传感器历史数据。在Myorobotics工具包的肌肉驱动臂肩系统中，观察到多样化的自组织行为：当系统独处时，臂部产生伪随机姿态序列，也可被操控为确定性运动模式。最有趣的是，当附加物体后，控制器与物体内部动态产生共振：给半满瓶时，系统自发摇晃瓶身以产生最大水动态响应；附加摆锤时，控制器使其进入圆周模式。本文还讨论了该控制器范式在意图驱动行为生成中的应用前景。

英文摘要

With the accelerated development of robot technologies, optimal control becomes one of the central themes of research. In traditional approaches, the controller, by its internal functionality, finds appropriate actions on the basis of the history of sensor values, guided by the goals, intentions, objectives, learning schemes, and so on planted into it. The idea is that the controller controls the world---the body plus its environment---as reliably as possible. However, in elastically actuated robots this approach faces severe difficulties. This paper advocates for a new paradigm of self-organized control. The paper presents a solution with a controller that is devoid of any functionalities of its own, given by a fixed, explicit and context-free function of the recent history of the sensor values. When applying this controller to a muscle-tendon driven arm-shoulder system from the Myorobotics toolkit, we observe a vast variety of self-organized behavior patterns: when left alone, the arm realizes pseudo-random sequences of different poses but one can also manipulate the system into definite motion patterns. But most interestingly, after attaching an object, the controller gets in a functional resonance with the object's internal dynamics: when given a half-filled bottle, the system spontaneously starts shaking the bottle so that maximum response from the dynamics of the water is being generated. After attaching a pendulum to the arm, the controller drives the pendulum into a circular mode. In this way, the robot discovers dynamical affordances of objects its body is interacting with. We also discuss perspectives for using this controller paradigm for intention driven behavior generation.


1605.09444 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A Novel Fault Classification Scheme Based on Least Square SVM

一种基于最小二乘支持向量机的新型故障分类方案

Harishchandra Dubey, A. K. Tiwari, Nandita, P. K. Ray, S. R. Mohanty, Nand Kishor

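AI summary: This paper proposes a novel fault classification scheme based on least squares SVM. It takes the current signal from one quarter cycle after the fault as input and uses four binary classifiers for phase identification and ground detection. Accuracy and reliability are verified under noise.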

Comments 5 Pages, 6 Figures, 3 Tables

1605.09049 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Recycling Randomness with Structure for Sublinear time Kernel Expansions

利用结构回收随机性以实现子线性时间核展开

Krzysztof Choromanski, Vikas Sindhwani

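AI summary: This paper proposes approximating a variety of kernel functions with structured matrices, extending the Fastfood construction, and validates the effectiveness of structured matrices for improving kernel methods through theoretical analysis and experiments.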

1512.01110 2026-06-04 math.NA cs.AI cs.LG cs.NA 版本更新

Bayesian Matrix Completion via Adaptive Relaxed Spectral Regularization

基于自适应放松谱正则化的贝叶斯矩阵补全

Yang Song, Jun Zhu

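AI summary: This paper proposes a Bayesian matrix completion method based on spectral regularization. By relaxing the orthogonality constraints on the singular vectors, it obtains an adaptive spectral regularization suited to Bayesian inference, which infers the number of latent factors automatically without parameter tuning and performs well on sparse matrices.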

Comments Accepted to AAAI 2016

1510.08896 2026-06-04 cs.DS cs.LG cs.NA math.NA math.OC 版本更新

Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

鲁棒的移位-倒置预处理：更快且更样本高效的特征向量计算算法

Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

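AI summary: This paper presents faster and more sample-efficient algorithms for approximating the top eigenvector of a matrix. They improve on the classical power and Lanczos methods and on prior work based on fast subspace embeddings and stochastic optimization, with better dependence on the stable rank and ε.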

Comments Manuscript outdated. Updated version at arxiv:1605.08754

详情

AI中文摘要

我们提供了更快的算法和改进的样本复杂度，用于近似矩阵的顶部特征向量。在离线设置中，给定一个n×d矩阵A，我们展示了如何在时间~O([nnz(A) + (d·sr(A))/gap²]·log1/ε)和~O([(nnz(A))^{3/4}(d·sr(A))^{1/4}/√gap]·log1/ε)内计算ε近似顶部特征向量。这里sr(A)是稳定秩，gap是乘法特征值间隙。通过将gap依赖从nnz(A)中分离，我们改进了经典幂方法和Lanczos方法。我们还利用快速子空间嵌入和随机优化改进了先前工作，显著提升了sr(A)和ε的依赖性。我们的第二运行时间在nnz(A) ≤ (d·sr(A))/gap²时进一步改进。在在线设置中，给定一个分布D，其协方差矩阵为Σ，以及一个O(gap)近似顶部特征向量x₀，我们展示了如何通过~O(v(D)/gap² + v(D)/(gap·ε))个样本从D中进行细化。这里v(D)是一个自然的方差度量。结合我们的算法与先前工作来初始化x₀，我们获得了改进的样本复杂度和运行时间结果。对于一般分布，我们实现了随着样本数增加时的渐近最优准确性。我们的结果围绕经典移位-倒置预处理方法的鲁棒分析，将特征向量计算减少为近似求解一系列线性系统。我们然后应用快速SVRG基于的近似系统求解器来实现我们的结论。我们相信我们的结果表明了基于移位-倒置方法的广泛有效性，并暗示在实践中可能进一步获得计算增益。

英文摘要

We provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix. Offline Setting: Given an $n \times d$ matrix $A$, we show how to compute an $ε$ approximate top eigenvector in time $\tilde O ( [nnz(A) + \frac{d \cdot sr(A)}{gap^2}]\cdot \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}]\cdot \log1/ε)$. Here $sr(A)$ is the stable rank and $gap$ is the multiplicative eigenvalue gap. By separating the $gap$ dependence from $nnz(A)$ we improve on the classic power and Lanczos methods. We also improve prior work using fast subspace embeddings and stochastic optimization, giving significantly improved dependencies on $sr(A)$ and $ε$. Our second running time improves this further when $nnz(A) \le \frac{d\cdot sr(A)}{gap^2}$. Online Setting: Given a distribution $D$ with covariance matrix $Σ$ and a vector $x_0$ which is an $O(gap)$ approximate top eigenvector for $Σ$, we show how to refine to an $ε$ approximation using $\tilde O(\frac{v(D)}{gap^2} + \frac{v(D)}{gap \cdot ε})$ samples from $D$. Here $v(D)$ is a natural variance measure. Combining our algorithm with previous work to initialize $x_0$, we obtain a number of improved sample complexity and runtime results. For general distributions, we achieve asymptotically optimal accuracy as a function of sample size as the number of samples grows large. Our results center around a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast SVRG based approximate system solvers to achieve our claims. We believe our results suggest the general effectiveness of shift-and-invert based approaches and imply that further computational gains may be reaped in practice.


1605.08754 2026-06-04 cs.DS cs.LG cs.NA math.NA math.OC 版本更新

Faster Eigenvector Computation via Shift-and-Invert Preconditioning

通过移位和倒置预条件化加速特征向量计算

Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

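AI summary: This paper gives faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix Σ. Separating the gap dependence from the number of nonzeros improves on the classical power and Lanczos methods, and in the online setting the sample complexity is expressed through a variance measure of the distribution.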

Comments Appearing in ICML 2016. Combination of work in arXiv:1509.05647 and arXiv:1510.08896

详情

AI中文摘要

我们给出了更快的算法和改进的样本复杂度，用于估计矩阵Σ的顶部特征向量，即计算一个单位向量x，使得x^TΣx≥(1-ε)λ₁(Σ)：离线特征向量估计：给定显式矩阵A∈R^{n×d}，其中Σ=A^TA，我们展示了如何在时间~O([nnz(A)+d*sr(A)/gap²]*log1/ε)和~O([nnz(A)^{3/4}(d*sr(A))^{1/4}/sqrt(gap)]*log1/ε)内计算ε近似顶部特征向量。通过将gap依赖性从nnz(A)项中分离，我们的首次运行时间优于经典幂方法和Lanczos方法。它也改进了使用快速子空间嵌入[AC09, CW13]和随机优化[Sha15c]的先前工作，给出了对sr(A)和ε显著更好的依赖性。我们的第二次运行时间在nnz(A)≤d*sr(A)/gap²时进一步改进这些结果。在线特征向量估计：给定具有协方差矩阵Σ的分布D和一个初始向量x₀，它是O(gap)近似的顶部特征向量，我们展示了如何通过O(var(D)/(gap*ε))个样本从D中细化到ε近似。结合我们的算法与先前工作初始化x₀，我们获得了在各种D假设下的改进样本复杂度和运行时间结果。我们通过一个通用框架实现了这些结果，该框架我们认为具有独立兴趣。我们给出了经典移位和倒置预条件化方法的稳健分析，将特征向量计算减少为近似求解一系列线性系统。然后应用快速随机方差减少梯度（SVRG）基于系统求解器来实现我们的结论。

英文摘要

We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $Σ$ -- i.e. computing a unit vector $x$ such that $x^T Σx \ge (1-ε)λ_1(Σ)$: Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $Σ= A^TA$, we show how to compute an $ε$ approximate top eigenvector in time $\tilde O([nnz(A) + \frac{d*sr(A)}{gap^2} ]* \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d*sr(A))^{1/4}}{\sqrt{gap}} ] * \log 1/ε)$. Here $nnz(A)$ is the number of nonzeros in $A$, $sr(A)$ is the stable rank, $gap$ is the relative eigengap. By separating the $gap$ dependence from the $nnz(A)$ term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on $sr(A)$ and $ε$. Our second running time improves these further when $nnz(A) \le \frac{d*sr(A)}{gap^2}$. Online Eigenvector Estimation: Given a distribution $D$ with covariance matrix $Σ$ and a vector $x_0$ which is an $O(gap)$ approximate top eigenvector for $Σ$, we show how to refine to an $ε$ approximation using $ O(\frac{var(D)}{gap*ε})$ samples from $D$. Here $var(D)$ is a natural notion of variance. Combining our algorithm with previous work to initialize $x_0$, we obtain improved sample complexity and runtime results under a variety of assumptions on $D$. We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims.
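
The reduction described in this abstract, from eigenvector computation to a sequence of linear systems, can be illustrated with a minimal sketch. It rests on assumptions that are not in the paper: each system is solved exactly with `np.linalg.solve` rather than with the SVRG-based approximate solvers, and the shift is picked from eigenvalues computed up front purely for demonstration. The name `shift_invert_top_eigvec` is hypothetical.

```python
import numpy as np

def shift_invert_top_eigvec(sigma, shift, iters=50, seed=0):
    """Power iteration on (shift * I - sigma)^{-1}.

    Assumes shift > lambda_1(sigma), so the shifted matrix is positive
    definite and its inverse has the same top eigenvector as sigma.
    """
    d = sigma.shape[0]
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    shifted = shift * np.eye(d) - sigma
    for _ in range(iters):
        # One linear system per step; solved exactly here (an assumption),
        # where the abstract uses fast approximate solvers instead.
        x = np.linalg.solve(shifted, x)
        x /= np.linalg.norm(x)
    return x

rng = np.random.default_rng(1)
a = rng.standard_normal((200, 5))
sigma = a.T @ a                      # Sigma = A^T A as in the offline setting
lam = np.linalg.eigvalsh(sigma)      # ascending eigenvalues, used only to pick a shift
shift = lam[-1] + 0.1 * (lam[-1] - lam[-2])  # illustrative shift just above lambda_1
x = shift_invert_top_eigvec(sigma, shift)
rayleigh = x @ sigma @ x             # should approach lambda_1
print(rayleigh, lam[-1])
```

Placing the shift close to the top eigenvalue makes the inverse's leading eigenvalue dominate the rest, which is why far fewer iterations are needed than with plain power iteration on `sigma`.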


1605.08527 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Stochastic Optimization for Large-scale Optimal Transport

大规模最优传输的随机优化

Genevay Aude, Marco Cuturi, Gabriel Peyré, Francis Bach

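AI summary: This paper proposes new stochastic optimization algorithms for large-scale optimal transport. They handle arbitrary distributions through sampling, avoid discretization, and come with convergence guarantees, which suits high-dimensional learning settings.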

详情

AI中文摘要

最优传输（OT）定义了一个强大的框架，以几何忠实的方式比较概率分布。然而，由于计算负担限制了其实际应用。本文提出了一类新的随机优化算法，能够处理任意分布（离散或连续）只需能从中抽样，这通常是高维学习问题的典型设置。这减轻了对这些密度的离散化需求，同时提供可证明收敛的方法，输出正确的距离而不引入离散化误差。这些算法依赖两个主要思想：（a）对偶OT问题可以重新表述为期望的最大化；（b）对偶OT问题的熵正则化导致平滑的对偶优化问题，可以使用具有可证明更快收敛性的算法解决。我们将在三种不同设置中实例化这些思想：（i）当比较两个离散分布时，我们显示增量随机优化方案可以超越Sinkhorn算法，当前最先进的有限维OT求解器；（ii）当比较离散分布和连续密度时，对偶程序的半离散化改写适用于平均随机梯度下降，导致比通过离散化近似解决问题更好的性能；（iii）当处理两个连续密度时，我们提出在再生核希尔伯特空间（RKHS）上进行随机梯度下降。这目前是唯一已知解决此问题的方法，除了在有限样本上计算OT。我们通过一组离散、半离散和连续的基准问题验证这些主张。

英文摘要

Optimal transport (OT) defines a powerful framework to compare probability distributions in a geometrically faithful way. However, the practical impact of OT is still limited because of its computational burden. We propose a new class of stochastic optimization algorithms to cope with large-scale problems routinely encountered in machine learning applications. These methods can manipulate arbitrary distributions (either discrete or continuous) and require only the ability to draw samples from them, which is the typical setup in high-dimensional learning problems. This alleviates the need to discretize these densities, while giving access to provably convergent methods that output the correct distance without discretization error. These algorithms rely on two main ideas: (a) the dual OT problem can be recast as the maximization of an expectation; (b) entropic regularization of the primal OT problem results in a smooth dual optimization problem which can be addressed with algorithms that have a provably faster convergence. We instantiate these ideas in three different setups: (i) when comparing a discrete distribution to another, we show that incremental stochastic optimization schemes can beat Sinkhorn's algorithm, the current state-of-the-art finite dimensional OT solver; (ii) when comparing a discrete distribution to a continuous density, a semi-discrete reformulation of the dual program is amenable to averaged stochastic gradient descent, leading to better performance than approximately solving the problem by discretization; (iii) when dealing with two continuous densities, we propose a stochastic gradient descent over a reproducing kernel Hilbert space (RKHS). This is currently the only known method to solve this problem, apart from computing OT on finite samples. We back up these claims on a set of discrete, semi-discrete and continuous benchmark problems.
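
For context on the baseline named in setup (i), here is a minimal sketch of Sinkhorn's algorithm for entropy-regularized OT between two discrete distributions. It is not the stochastic method proposed in the paper. The regularization strength `eps`, the iteration count, and the toy point clouds are illustrative assumptions.

```python
import numpy as np

def sinkhorn(mu, nu, cost, eps=0.2, iters=1000):
    """Entropy-regularized OT plan between discrete mu and nu (Sinkhorn scaling)."""
    kernel = np.exp(-cost / eps)          # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu / (kernel.T @ u)           # rescale to match column marginals
        u = mu / (kernel @ v)             # rescale to match row marginals
    return u[:, None] * kernel * v[None, :]

# Toy problem (assumed for illustration): uniform weights on two 1-D grids.
x = np.linspace(0.0, 1.0, 5)
y = np.linspace(0.0, 1.0, 4)
cost = (x[:, None] - y[None, :]) ** 2     # squared Euclidean cost
mu = np.full(5, 1 / 5)
nu = np.full(4, 1 / 4)
plan = sinkhorn(mu, nu, cost)
print(plan.sum(axis=1), plan.sum(axis=0)) # marginals should match mu and nu
```

Each Sinkhorn iteration touches the full kernel matrix, which is the per-iteration cost that incremental stochastic schemes aim to avoid on large problems.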


1506.02159 2026-06-04 math.NA cs.LG cs.NA math.OC 版本更新

Riemannian preconditioning for tensor completion

Riemannian预处理用于张量补全

Hiroyuki Kasai, Bamdev Mishra

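AI summary: This paper proposes a novel Riemannian preconditioning approach for tensor completion that exploits the least-squares structure and the symmetry of the Tucker decomposition to build an efficient nonlinear conjugate gradient algorithm. Experiments show strong performance across datasets.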

Comments Supplementary material included in the paper. An extension of the paper is in arXiv:1605.08257

1605.08257 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Low-rank tensor completion: a Riemannian manifold preconditioning approach

低秩张量补全：黎曼流形预条件方法

Hiroyuki Kasai, Bamdev Mishra

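AI summary: This paper proposes a Riemannian manifold preconditioning approach for tensor completion with rank constraints. A new Riemannian metric exploits the least-squares structure and the symmetry of the Tucker decomposition, leading to preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms that outperform existing methods across datasets.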

Comments The 33rd International Conference on Machine Learning (ICML 2016). arXiv admin note: substantial text overlap with arXiv:1506.02159

1511.03722 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ME stat.ML 版本更新

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning

强化学习中的双重鲁棒离策略价值评估

Nan Jiang, Lihong Li

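AI summary: This paper proposes a doubly robust estimator for off-policy value evaluation that combines unbiasedness with low variance, and validates it on benchmark problems.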

Comments 14 pages; 4 figures; ICML 2016

1605.06968 2026-06-04 math.NA cs.LG cs.NA math.OC 版本更新

A Riemannian gossip approach to decentralized matrix completion

基于黎曼流形的去中心化矩阵补全方法

Bamdev Mishra, Hiroyuki Kasai, Atul Saroop

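AI summary: This paper proposes a novel Riemannian gossip algorithm for low-rank decentralized matrix completion. It completes local matrices while reaching asymptotic consensus on the global low-rank factors, and it is scalable and parallelizable.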

Comments Under review

1605.02196 2026-06-04 eess.SY cs.CV cs.LG cs.RO cs.SY 版本更新

All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles

全天候感知：面向自主地面车辆的数据关联、跟踪与分类的联合解决方案

Peter Radecki, Mark Campbell, Kevin Matzen

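AI summary: This paper proposes a novel probabilistic perception algorithm for data association, object tracking, and classification for autonomous ground vehicles in all-weather conditions. The algorithm extends an existing Rao-Blackwellized particle filter with multiple model tracking for classification, and experiments on Cornell's upgraded AGV examine how state-of-the-art vision algorithms hold up in adverse weather.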

Comments 35 pages, 21 figures, 14 tables

详情

AI中文摘要

本文提出了一种新颖的概率感知算法，作为实时联合解决方案，用于自主地面车辆在全天候条件下的数据关联、目标跟踪和目标分类。该算法扩展了最初使用粒子滤波进行数据关联和卡尔曼滤波进行多目标跟踪的 Rao-Blackwellized 粒子滤波器（Miller 等，2011a），现已包含多模型跟踪用于分类。此外，还实现了一种最先进的视觉检测算法，该算法包含方向信息，适用于自主地面车辆（AGV）应用。Cornell 的 AGV 从 DARPA 城市挑战中被升级并用于实验，以检验先进视觉算法能否补充或替代激光雷达和雷达传感器。在恶劣天气和光照条件下，传感器和算法性能得到测试。实验评估显示，在联合概率感知算法中，摄像头、激光雷达和雷达传感器能够实现稳健的全天候数据关联、跟踪和分类。

英文摘要

A novel probabilistic perception algorithm is presented as a real-time joint solution to data association, object tracking, and object classification for an autonomous ground vehicle in all-weather conditions. The presented algorithm extends a Rao-Blackwellized Particle Filter originally built with a particle filter for data association and a Kalman filter for multi-object tracking (Miller et al. 2011a) to now also include multiple model tracking for classification. Additionally a state-of-the-art vision detection algorithm that includes heading information for autonomous ground vehicle (AGV) applications was implemented. Cornell's AGV from the DARPA Urban Challenge was upgraded and used to experimentally examine if and how state-of-the-art vision algorithms can complement or replace lidar and radar sensors. Sensor and algorithm performance in adverse weather and lighting conditions is tested. Experimental evaluation demonstrates robust all-weather data association, tracking, and classification where camera, lidar, and radar sensors complement each other inside the joint probabilistic perception algorithm.


1507.00333 2026-06-04 math.NA cs.IR cs.LG cs.NA 版本更新

Notes on Low-rank Matrix Factorization

关于低秩矩阵分解的笔记

Yuan Lu, Jie Yang

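AI summary: This paper reviews variants of low-rank matrix factorization, including basic, non-negative, and orthogonal non-negative MF, discusses their use in dimension reduction, clustering, and matrix completion, and considers extensions to sparse matrix completion and semi-supervised learning.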

详情

AI中文摘要

低秩矩阵分解（MF）是数据科学中的重要技术。MF的核心思想是数据中存在潜在结构，通过揭示这些结构可以获得数据的压缩表示。通过将原始矩阵分解为低秩矩阵，MF提供了一种统一的降维、聚类和矩阵补全方法。本文回顾了MF的几个重要变体，包括基本MF、非负MF和正交非负MF。非负MF和正交非负MF是基本MF的变体，分别带有非负性和/或正交性约束。这些约束在特定场景中非常有用。本文第一部分介绍了每种模型的应用场景、独特性质和优化方法。通过适当适应MF，可以超越聚类和矩阵补全的问题。第二部分将扩展MF到稀疏矩阵补全，利用各种正则化方法增强矩阵补全，并通过引入潜在空间强化和变换来利用MF进行（半）监督学习。我们将看到MF不仅是一个有用的模型，也是一个灵活的框架，适用于各种预测问题。

英文摘要

Low-rank matrix factorization (MF) is an important technique in data science. The key idea of MF is that there exist latent structures in the data, and by uncovering them we can obtain a compressed representation of the data. By factorizing an original matrix into low-rank matrices, MF provides a unified method for dimension reduction, clustering, and matrix completion. In this article we review several important variants of MF, including basic MF, non-negative MF, and orthogonal non-negative MF. As their names suggest, non-negative MF and orthogonal non-negative MF are variants of basic MF with non-negativity and/or orthogonality constraints. Such constraints are useful in specific scenarios. In the first part of this article, we introduce, for each of these models, the application scenarios, the distinctive properties, and the optimization method. By properly adapting MF, we can go beyond the problems of clustering and matrix completion. In the second part of this article, we extend MF to sparse matrix completion, enhance matrix completion using various regularization methods, and make use of MF for (semi-)supervised learning by introducing latent space reinforcement and transformation. We will see that MF is not only a useful model but also a flexible framework applicable to various prediction problems.


1605.00716 2026-06-04 cs.LG cs.NI cs.SY eess.SY 版本更新

Radio Transformer Networks: Attention Models for Learning to Synchronize in Wireless Systems

无线系统中的无线电变换网络：用于学习同步的注意力模型

Timothy J O'Shea, Latha Pemula, Dhruv Batra, T. Charles Clancy

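AI summary: This paper applies spatial transformer networks with new transformations adapted to the radio domain, introducing a learned attention model that improves modulation recognition accuracy. Signal synchronization and normalization are learned by optimizing classification accuracy, sparse representation, and regularization.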

Comments 5 pages, 8 figures

1509.02604 2026-06-04 cs.DC cs.LG cs.SY eess.SY 版本更新

Asynchronous Distributed ADMM for Large-Scale Optimization- Part II: Linear Convergence Analysis and Numerical Performance

异步分布式ADMM用于大规模优化-第二部分：线性收敛性分析和数值性能

Tsung-Hui Chang, Wei-Cheng Liao, Mingyi Hong, Xiangfeng Wang

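AI summary: This paper studies the conditions for linear convergence of the asynchronous distributed ADMM and its efficiency on large-scale logistic regression.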

Comments submitted for publication, 28 pages

1509.02597 2026-06-04 cs.DC cs.LG cs.SY eess.SY 版本更新

Asynchronous Distributed ADMM for Large-Scale Optimization- Part I: Algorithm and Convergence Analysis

异步分布式ADMM用于大规模优化-第一部分：算法与收敛性分析

Tsung-Hui Chang, Mingyi Hong, Wei-Cheng Liao, Xiangfeng Wang

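AI summary: This paper proposes an asynchronous distributed ADMM for large-scale learning problems, which are formulated as consensus problems and solved in parallel over a star network, overcoming the efficiency bottleneck of synchronized computation in heterogeneous networks.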

Comments 37 pages

详情

DOI: 10.1109/TSP.2016.2537271

AI中文摘要

本文旨在解决大规模学习问题，研究基于交替方向乘子法（ADMM）的分布式优化方法。通过将学习问题建模为共识问题，ADMM可以在全平行方式下在具有星型拓扑的计算机网络上求解共识问题。然而，传统同步计算在问题规模扩大时效率低下，因为算法速度受限于最慢的工人。在异构网络中，计算节点经历不同的计算和通信延迟，这一问题尤为突出。本文提出异步分布式ADMM（AD-ADMM），以有效提高分布式优化的时间效率。我们的主要兴趣在于分析AD-ADMM的收敛条件，基于流行的部分异步模型，该模型基于网络的最大可容忍延迟定义。具体而言，通过考虑一般且可能非凸的成本函数，我们证明只要算法参数根据网络延迟适当选择，AD-ADMM就保证收敛到KKT点集。我们进一步说明，ADMM的异步性必须谨慎处理，因为对AD-ADMM实现的轻微修改可能会破坏算法收敛性，即使在标准凸设置下也是如此。

英文摘要

Aiming at solving large-scale learning problems, this paper studies distributed optimization methods based on the alternating direction method of multipliers (ADMM). By formulating the learning problem as a consensus problem, the ADMM can be used to solve the consensus problem in a fully parallel fashion over a computer network with a star topology. However, traditional synchronized computation does not scale well with the problem size, as the speed of the algorithm is limited by the slowest workers. This is particularly true in a heterogeneous network where the computing nodes experience different computation and communication delays. In this paper, we propose an asynchronous distributed ADMM (AD-ADMM) which can effectively improve the time efficiency of distributed optimization. Our main interest lies in analyzing the convergence conditions of the AD-ADMM, under the popular partially asynchronous model, which is defined based on a maximum tolerable delay of the network. Specifically, by considering general and possibly non-convex cost functions, we show that the AD-ADMM is guaranteed to converge to the set of Karush-Kuhn-Tucker (KKT) points as long as the algorithm parameters are chosen appropriately according to the network delay. We further illustrate that the asynchrony of the ADMM has to be handled with care, as slightly modifying the implementation of the AD-ADMM can jeopardize the algorithm convergence, even under a standard convex setting.
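
As a point of reference for the consensus formulation, here is a minimal sketch of the standard synchronous consensus ADMM on a least-squares problem split across workers. It is the synchronized baseline the abstract contrasts against, not the asynchronous AD-ADMM. The problem data, the penalty `rho`, and the function name `consensus_admm` are illustrative assumptions.

```python
import numpy as np

def consensus_admm(a_blocks, b_blocks, rho=10.0, iters=300):
    """Synchronous consensus ADMM for min_x sum_i 0.5 * ||A_i x - b_i||^2."""
    n = len(a_blocks)
    d = a_blocks[0].shape[1]
    z = np.zeros(d)                            # master (consensus) variable
    xs = [np.zeros(d) for _ in range(n)]       # worker copies
    lams = [np.zeros(d) for _ in range(n)]     # dual variables
    for _ in range(iters):
        # Worker step: each node solves its local regularized least squares.
        for i in range(n):
            lhs = a_blocks[i].T @ a_blocks[i] + rho * np.eye(d)
            rhs = a_blocks[i].T @ b_blocks[i] + rho * z - lams[i]
            xs[i] = np.linalg.solve(lhs, rhs)
        # Master step: waits for ALL workers (the synchronization bottleneck).
        z = sum(x + lam / rho for x, lam in zip(xs, lams)) / n
        # Dual update on each worker.
        for i in range(n):
            lams[i] = lams[i] + rho * (xs[i] - z)
    return z

rng = np.random.default_rng(0)
a_blocks = [rng.standard_normal((20, 3)) for _ in range(4)]
x_true = np.array([1.0, -2.0, 0.5])
b_blocks = [a @ x_true + 0.01 * rng.standard_normal(20) for a in a_blocks]
z = consensus_admm(a_blocks, b_blocks)
x_ls = np.linalg.lstsq(np.vstack(a_blocks), np.concatenate(b_blocks), rcond=None)[0]
print(z, x_ls)  # the consensus variable should match the centralized solution
```

In this loop the master update uses every worker's fresh result, so one slow worker stalls the whole iteration. An asynchronous variant would let the master proceed with a subset of workers under a bounded delay.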


1512.00984 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Fast Low-Rank Matrix Learning with Nonconvex Regularization

快速低秩矩阵学习与非凸正则化

Quanming Yao, James T. Kwok, Wenliang Zhong

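AI summary: This paper proposes a fast method for learning low-rank matrices with nonconvex regularization. Thresholding the singular values with a derived cutoff and using the power method improves efficiency and yields more accurate matrix recovery.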

Comments Long version of conference paper appeared ICDM 2015

详情

AI中文摘要

低秩建模在机器学习、计算机视觉和社会网络分析中有广泛应用。尽管核范数常用于近似矩阵秩，但非凸低秩正则化在恢复性能上更优。然而，由此产生的优化问题更具挑战性。最近的最先进方法基于近端梯度算法，但需要每次近端步骤进行昂贵的完整SVD。本文表明，对于许多常用非凸低秩正则化器，可以推导出截断，自动阈值化由近端算子获得的奇异值。这使得可以高效地用幂方法近似SVD。此外，近端算子可以简化为一个较小矩阵在该主子空间上的投影。可以保证O(1/T)的收敛率，其中T是迭代次数。在矩阵补全和鲁棒主成分分析上进行了广泛实验。所提方法在最先进方法上实现了显著加速。此外，获得的矩阵解比传统核范数正则化器更准确且秩更低。

英文摘要

Low-rank modeling has a lot of important applications in machine learning, computer vision and social network analysis. While the matrix rank is often approximated by the convex nuclear norm, the use of nonconvex low-rank regularizers has demonstrated better recovery performance. However, the resultant optimization problem is much more challenging. A very recent state-of-the-art is based on the proximal gradient algorithm. However, it requires an expensive full SVD in each proximal step. In this paper, we show that for many commonly-used nonconvex low-rank regularizers, a cutoff can be derived to automatically threshold the singular values obtained from the proximal operator. This allows the use of power method to approximate the SVD efficiently. Besides, the proximal operator can be reduced to that of a much smaller matrix projected onto this leading subspace. Convergence, with a rate of O(1/T) where T is the number of iterations, can be guaranteed. Extensive experiments are performed on matrix completion and robust principal component analysis. The proposed method achieves significant speedup over the state-of-the-art. Moreover, the matrix solution obtained is more accurate and has a lower rank than that of the traditional nuclear norm regularizer.


1509.03917 2026-06-04 stat.ML cs.DS cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Dropping Convexity for Faster Semi-definite Optimization

放弃凸性以加快半定规划优化

Srinadh Bhojanapalli, Anastasios Kyrillidis, Sujay Sanghavi

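AI summary: This paper studies minimizing a convex function over the set of positive semi-definite matrices by running factored gradient descent (FGD) on the resulting non-convex problem. It provides a step-size rule, an initialization procedure, and convergence guarantees for general convex functions.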

Comments 40 pages

详情

AI中文摘要

我们研究了在n×n半正定矩阵集上最小化凸函数f(X)的问题，但当问题转换为min_U g(U) := f(UU^T)，其中U∈R^{n×r}且r≤n时，我们研究了梯度下降在g上的性能，即因子梯度下降（FGD）。我们提供了一个选择步长的规则，并证明在该选择下，FGD的局部收敛率与标准梯度下降在原始f上的收敛率相同：即经过k步后，误差为O(1/k)对于光滑的f，当f是（受限）强凸时，误差呈指数级减小。此外，我们提供了一种初始化FGD的程序，适用于（受限）强凸目标函数，并且当只能通过一阶oracle访问f时。对于多个问题实例，适当的初始化导致全局收敛保证。FGD和类似程序在实践中广泛用于可表述为矩阵分解的问题。据我们所知，这是首次为一般凸函数在标准凸假设下提供精确的收敛率保证的论文。

英文摘要

We study the minimization of a convex function $f(X)$ over the set of $n\times n$ positive semi-definite matrices, but when the problem is recast as $\min_U g(U) := f(UU^\top)$, with $U \in \mathbb{R}^{n \times r}$ and $r \leq n$. We study the performance of gradient descent on $g$---which we refer to as Factored Gradient Descent (FGD)---under standard assumptions on the original function $f$. We provide a rule for selecting the step size and, with this choice, show that the local convergence rate of FGD mirrors that of standard gradient descent on the original $f$: i.e., after $k$ steps, the error is $O(1/k)$ for smooth $f$, and exponentially small in $k$ when $f$ is (restricted) strongly convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives and when one only has access to $f$ via a first-order oracle; for several problem instances, such proper initialization leads to global convergence guarantees. FGD and similar procedures are widely used in practice for problems that can be posed as matrix factorization. To the best of our knowledge, this is the first paper to provide precise convergence rate guarantees for general convex functions under standard convex assumptions.
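
The recast problem $\min_U g(U) := f(UU^\top)$ can be illustrated with a minimal sketch of gradient descent on the factor $U$. The objective $f(X) = \tfrac{1}{2}\|X - M\|_F^2$, the small random initialization, and the fixed step size are illustrative assumptions. In particular, the step size here is a hand-picked constant and not the selection rule from the paper.

```python
import numpy as np

def factored_gradient_descent(grad_f, u0, step, iters=1000):
    """Plain gradient descent on g(U) = f(U U^T)."""
    u = u0.copy()
    for _ in range(iters):
        g = grad_f(u @ u.T)
        # Chain rule: grad g(U) = (grad f(X) + grad f(X)^T) U with X = U U^T.
        u = u - step * (g + g.T) @ u
    return u

rng = np.random.default_rng(0)
q, _ = np.linalg.qr(rng.standard_normal((6, 2)))
m = q @ np.diag([2.0, 1.0]) @ q.T          # rank-2 PSD target (assumed toy data)

def grad_f(x):
    """Gradient of f(X) = 0.5 * ||X - M||_F^2."""
    return x - m

u0 = 0.1 * rng.standard_normal((6, 2))     # small random start, r = 2
u = factored_gradient_descent(grad_f, u0, step=0.05)
print(np.linalg.norm(u @ u.T - m))         # residual should be close to zero
```

Working with the $n \times r$ factor keeps the iterate positive semi-definite by construction, so no projection onto the PSD cone is needed. The price is that $g$ is non-convex in $U$ even though $f$ is convex in $X$.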


1604.04026 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Fast Parallel Randomized Algorithm for Nonnegative Matrix Factorization with KL Divergence for Large Sparse Datasets

快速并行随机算法用于具有KL散度的非负矩阵分解以处理大规模稀疏数据集

Duy Khuong Nguyen, Tu Bao Ho

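AI summary: This paper proposes a fast parallel randomized coordinate descent algorithm for non-negative matrix factorization with KL divergence on large sparse datasets, yielding sparse models and sparse representations more efficiently.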

详情

AI中文摘要

非负矩阵分解（NMF）与KL散度（NMF-KL）是最重要的NMF问题之一，等价于概率潜在语义索引（PLSI），已在许多应用中成功应用。对于稀疏计数数据，泊松分布和KL散度提供稀疏模型和稀疏表示，比正态分布和Frobenius范数更能描述随机变化。特别地，稀疏模型能更简洁地理解潜在组件上的属性出现，而稀疏表示能更简洁地解释潜在组件对实例的贡献。然而，最小化NMF与KL散度比最小化NMF与Frobenius范数要困难得多；稀疏模型、稀疏表示以及大规模稀疏数据集的快速算法仍然是NMF-KL的挑战。在本文中，我们提出了一种快速并行随机坐标下降算法，用于大规模稀疏数据集，以实现稀疏模型和稀疏表示。所提出算法在该问题上的实验结果优于当前研究的成果。

英文摘要

Nonnegative Matrix Factorization (NMF) with Kullback-Leibler Divergence (NMF-KL) is one of the most significant NMF problems and is equivalent to Probabilistic Latent Semantic Indexing (PLSI), which has been successfully applied in many applications. For sparse count data, a Poisson distribution and KL divergence provide sparse models and sparse representation, which describe the random variation better than a normal distribution and Frobenius norm. Specifically, sparse models provide a more concise understanding of the appearance of attributes over latent components, while sparse representation provides concise interpretability of the contribution of latent components over instances. However, minimizing NMF with KL divergence is much more difficult than minimizing NMF with Frobenius norm, and sparse models, sparse representation, and fast algorithms for large sparse datasets are still challenges for NMF with KL divergence. In this paper, we propose a fast parallel randomized coordinate descent algorithm with fast convergence for large sparse datasets to achieve sparse models and sparse representation. In experiments, the proposed algorithm outperforms current methods on this problem.


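1508.02087 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML (version update)

A Linearly-Convergent Stochastic L-BFGS Algorithm

Philipp Moritz, Robert Nishihara, Michael I. Jordan

AI summary: This paper proposes a new stochastic L-BFGS algorithm, proves its linear convergence for strongly convex and smooth functions, and demonstrates its efficiency on large-scale convex optimization problems.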

Comments 10 pages, 3 figures in International Conference on Artificial Intelligence and Statistics, 2016

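1604.03912 2026-06-04 cs.AI cs.LG cs.SY eess.SY stat.ML (version update)

Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics

Michael Herman, Tobias Gindele, Jörg Wagner, Felix Schmitt, Wolfram Burgard

AI summary: This paper proposes a gradient-based inverse reinforcement learning method that simultaneously estimates the system dynamics and the reward function, improving sample efficiency and estimation accuracy.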

Comments accepted to appear in AISTATS 2016

Continuous Deep Q-Learning with Model-based Acceleration

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

AI summary: This paper proposes NAF, a continuous deep Q-learning algorithm, together with a model-based acceleration method, to improve sample efficiency and learning speed on continuous control tasks.

Abstract

Model-free reinforcement learning has been successfully applied to a range of challenging problems, and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms, particularly when using high-dimensional function approximators, tends to limit their applicability to physical systems. In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks. We propose two complementary techniques for improving the efficiency of such algorithms. First, we derive a continuous variant of the Q-learning algorithm, which we call normalized advantage functions (NAF), as an alternative to the more commonly used policy gradient and actor-critic methods. NAF representation allows us to apply Q-learning with experience replay to continuous tasks, and substantially improves performance on a set of simulated robotic control tasks. To further improve the efficiency of our approach, we explore the use of learned models for accelerating model-free reinforcement learning. We show that iteratively refitted local linear models are especially effective for this, and demonstrate substantially faster learning on domains where such models are applicable.
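The abstract does not spell out the NAF parameterization. A minimal sketch, assuming the quadratic form $Q(x, u) = V(x) - \frac{1}{2}(u - \mu(x))^\top P(x)\,(u - \mu(x))$ with $P = LL^\top$ and $L$ lower triangular, is shown below. The numeric values of `V`, `mu`, and `L` are hypothetical stand-ins for network outputs at one state.

```python
def naf_q_value(state_value, mu, L, action):
    """Q(x, u) = V(x) - 0.5 * (u - mu)^T P (u - mu), with P = L L^T."""
    n = len(mu)
    diff = [a - m for a, m in zip(action, mu)]
    # (u - mu)^T L L^T (u - mu) equals ||L^T (u - mu)||^2, so P is never formed.
    y = [sum(L[i][j] * diff[i] for i in range(n)) for j in range(n)]
    advantage = -0.5 * sum(v * v for v in y)
    return state_value + advantage


# Hypothetical network outputs for one state (illustrative numbers only).
V = 1.5
mu = [0.2, -0.1]
L = [[1.0, 0.0], [0.5, 2.0]]   # lower triangular with positive diagonal

q_greedy = naf_q_value(V, mu, L, mu)            # advantage is zero at u = mu
q_other = naf_q_value(V, mu, L, [1.2, -0.1])    # any other action scores lower
```

Under this assumed form the advantage term is never positive, so the maximizing action is `mu` in closed form. That is what makes Q-learning with a max over continuous actions tractable.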


1603.00427 2026-06-04 eess.SY cs.LG cs.SY (version update)

A Nonlinear Adaptive Filter Based on the Model of Simple Multilinear Functionals

Felipe C. Pinheiro, Cássio G. Lopes

AI summary: This paper proposes a nonlinear adaptive filter model based on simple multilinear functionals, using the product of the outputs of K FIR linear filters as the nonlinear model, solving the optimization problem by gradient descent, and validating its convergence and computational complexity in system identification.

Comments 5 pages, one of references, plus extra page attached

Abstract

Nonlinear adaptive filtering allows for modeling of some additional aspects of a general system and usually relies on highly complex algorithms, such as those based on the Volterra series. Through the use of the Kronecker product and some basic facts of tensor algebra, we propose a simple model of nonlinearity, one that can be interpreted as a product of the outputs of K FIR linear filters, and compute its cost function together with its gradient, which allows for some analysis of the optimization problem. We use these results in a stochastic gradient framework, from which we derive an LMS-like algorithm and investigate the problems of multi-modality in the mean-square error surface and the choice of adequate initial conditions. Its computational complexity is calculated. The new algorithm is tested in a system identification setup and is compared with other polynomial algorithms from the literature, presenting favorable convergence and/or computational complexity.
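The model in this abstract, a product of the outputs of K FIR filters, lends itself to a short sketch of a stochastic-gradient update. The code below is derived only from that one-sentence description and is not the authors' algorithm. It assumes every filter sees the same regressor `x`, and the function names, step size, and numbers are illustrative.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


def multilinear_output(filters, x):
    """Product of the outputs of K FIR filters applied to the regressor x."""
    y = 1.0
    for w in filters:
        y *= dot(w, x)
    return y


def lms_like_step(filters, x, desired, mu):
    """One stochastic-gradient step on the instantaneous cost e^2 / 2."""
    outputs = [dot(w, x) for w in filters]
    y = 1.0
    for o in outputs:
        y *= o
    e = desired - y
    new_filters = []
    for k, w in enumerate(filters):
        # d y / d w_k = (product of the other filters' outputs) * x
        others = 1.0
        for j, o in enumerate(outputs):
            if j != k:
                others *= o
        new_filters.append([wi + mu * e * others * xi for wi, xi in zip(w, x)])
    return new_filters, e


filters = [[1.0, 0.0], [0.5, 0.1]]   # K = 2 filters of length 2, nonzero init
x, d = [1.0, 2.0], 2.0               # one regressor and its desired output
err_before = d - multilinear_output(filters, x)
filters, _ = lms_like_step(filters, x, d, mu=0.01)
err_after = d - multilinear_output(filters, x)
```

In this sketch, starting from all-zero filters with K ≥ 2 makes every gradient vanish. This gives one concrete reason why the choice of initial conditions matters for such a model.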


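1602.08800 2026-06-04 math.NA cs.IR cs.LG cs.NA (version update)

Iterative Aggregation Method for Solving Principal Component Analysis Problems

Vitaly Bulgakov

AI summary: This paper proposes a two-level aggregation method for efficiently solving principal component analysis problems, solving the eigenvalue problem with an iterative power method and testing it on large text datasets.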

1406.3587 2026-06-04 math.NA cs.LG cs.NA (version update)

Quaternion Gradient and Hessian

Dongpo Xu, Danilo P. Mandic

AI summary: This paper proposes new definitions of the quaternion gradient and Hessian based on the generalized HR calculus, simplifying the derivation of quaternion optimization algorithms, making optimization directly in the quaternion field possible, and improving the efficiency of algorithm design and evaluation.

Comments 23 pages

DOI: 10.1109/TNNLS.2015.2440473
Journal ref: IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(2):249-261

Abstract

The optimization of real scalar functions of quaternion variables, such as the mean square error or array output power, underpins many practical applications. Solutions often require the calculation of the gradient and Hessian; however, real functions of quaternion variables are essentially non-analytic. To address this issue, we propose new definitions of quaternion gradient and Hessian, based on the novel generalized HR (GHR) calculus, thus making possible efficient derivation of optimization algorithms directly in the quaternion field, rather than transforming the problem to the real domain, as is current practice. In addition, unlike the existing quaternion gradients, the GHR calculus allows for the product and chain rule, and for a one-to-one correspondence of the proposed quaternion gradient and Hessian with their real counterparts. Properties of the quaternion gradient and Hessian relevant to numerical applications are elaborated, and the results illuminate the usefulness of the GHR calculus in greatly simplifying the derivation of the quaternion least mean squares algorithm, and of the quaternion least squares and Newton algorithms. The proposed gradient and Hessian are also shown to enable the same generic forms as the corresponding real- and complex-valued algorithms, further illustrating the advantages in algorithm design and evaluation.


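1602.04434 2026-06-04 cs.LG cs.SY eess.SY stat.ML (version update)

Frequency Analysis of Temporal Graph Signals

Andreas Loukas, Damien Foucard

AI summary: This paper introduces frequency analysis of temporal graph signals, unifying time-frequency and graph-frequency analysis, and uses a joint time-frequency transform to design distributed filters for interference cancellation.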

Comments 5 pages, 3 figures

1506.01326 2026-06-04 math.NA cs.AI cs.LG cs.NA stat.CO stat.ML (version update)

Probabilistic Numerics and Uncertainty in Computations

Philipp Hennig, Michael A Osborne, Mark Girolami

AI summary: This paper calls for probabilistic numerical methods, which improve algorithms for linear algebra, integration, optimization, and solving differential equations by returning uncertainties in their computations, and highlights their value in fields such as climate science and astronomy.

Comments Author Generated Postprint. 17 pages, 4 Figures, 1 Table

DOI: 10.1098/rspa.2015.0142

Abstract

We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data has led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimisers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations.


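1602.04847 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA (version update)

Black-box optimization with a politician

Sébastien Bubeck, Yin-Tat Lee

AI summary: This paper proposes a new framework for black-box convex optimization suited to settings where gradient computation is expensive, combining convex optimization concepts with analytic centers; experiments show it outperforms algorithms such as BFGS.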

Comments 19 pages

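1402.0635 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY (version update)

Generalization and Exploration via Randomized Value Functions

Ian Osband, Benjamin Van Roy, Zheng Wen

AI summary: This paper proposes randomized least-squares value iteration (RLSVI), which achieves efficient exploration and generalization with linearly parameterized value functions, and proves near-optimal performance when learning without prior knowledge.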

Comments arXiv admin note: text overlap with arXiv:1307.4847

1511.07837 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML (version update)

Generalized Conjugate Gradient Methods for $\ell_1$ Regularized Convex Quadratic Programming with Finite Convergence

Zhaosong Lu, Xiaojun Chen

AI summary: This paper proposes generalized conjugate gradient methods for $\ell_1$-regularized convex quadratic programming that reach an optimal solution in finitely many iterations. The methods choose the step type by comparing the magnitudes of subgradient components and combine an exact line search with a CG subroutine, giving low computational complexity.

Comments 36 pages, 2 tables

Abstract

The conjugate gradient (CG) method is an efficient iterative method for solving large-scale strongly convex quadratic programming (QP). In this paper we propose some generalized CG (GCG) methods for solving the $\ell_1$-regularized (possibly not strongly) convex QP that terminate at an optimal solution in a finite number of iterations. At each iteration, our methods first identify a face of an orthant and then either perform an exact line search along the direction of the negative projected minimum-norm subgradient of the objective function or execute a CG subroutine that conducts a sequence of CG iterations until a CG iterate crosses the boundary of this face or an approximate minimizer over this face or a subface is found. We determine which type of step should be taken by comparing the magnitude of some components of the minimum-norm subgradient of the objective function to that of its remaining components. Our analysis on finite convergence of these methods makes use of an error bound result and some key properties of the aforementioned exact line search and the CG subroutine. We also show that the proposed methods are capable of finding an approximate solution of the problem by allowing some inexactness on the execution of the CG subroutine. The overall arithmetic operation cost of our GCG methods for finding an $ε$-optimal solution depends on $ε$ in $O(\log(1/ε))$, which is superior to the accelerated proximal gradient method [2,23] that depends on $ε$ in $O(1/\sqrt{ε})$. In addition, our GCG methods can be extended straightforwardly to solve box-constrained convex QP with finite convergence. Numerical results demonstrate that our methods are very favorable for solving ill-conditioned problems.


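1602.02523 2026-06-04 stat.ML cs.LG cs.SY eess.SY (version update)

Data-Efficient Reinforcement Learning in Continuous-State POMDPs

Rowan McAllister, Carl Edward Rasmussen

AI summary: This paper proposes a data-efficient reinforcement learning algorithm that is robust to observation noise, extending PILCO to POMDPs and using filtering to improve policy evaluation, which yields better nonlinear control on the cartpole swing-up task.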

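1406.5311 2026-06-04 math.OC cs.AI cs.LG cs.NA math.NA stat.ML (version update)

Towards A Deeper Geometric, Analytic and Algorithmic Understanding of Margins

Aaditya Ramdas, Javier Peña

AI summary: This paper studies the margin of a matrix A as a condition measure for the difficulty of linear feasibility problems, extends margin theory through geometric, analytic, and algorithmic approaches, and relates perceptron convergence rates to the margin.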

Comments 18 pages, 3 figures

1601.07721 2026-06-04 math.NA cs.LG cs.NA (version update)

Distributed Low Rank Approximation of Implicit Functions of a Matrix

David P. Woodruff, Peilin Zhong

AI summary: Studies distributed low-rank approximation of implicitly represented matrices, proposing efficient algorithms and validating them on softmax, Gaussian kernel, and robust approximation applications.

Abstract

We study distributed low rank approximation in which the matrix to be approximated is only implicitly represented across the different servers. For example, each of $s$ servers may have an $n \times d$ matrix $A^t$, and we may be interested in computing a low rank approximation to $A = f(\sum_{t=1}^s A^t)$, where $f$ is a function which is applied entrywise to the matrix $\sum_{t=1}^s A^t$. We show for a wide class of functions $f$ it is possible to efficiently compute a $d \times d$ rank-$k$ projection matrix $P$ for which $\|A - AP\|_F^2 \leq \|A - [A]_k\|_F^2 + \varepsilon \|A\|_F^2$, where $AP$ denotes the projection of $A$ onto the row span of $P$, and $[A]_k$ denotes the best rank-$k$ approximation to $A$ given by the singular value decomposition. The communication cost of our protocols is $d \cdot (sk/\varepsilon)^{O(1)}$, and they succeed with high probability. Our framework allows us to efficiently compute a low rank approximation to an entry-wise softmax, to a Gaussian kernel expansion, and to $M$-Estimators applied entrywise (i.e., forms of robust low rank approximation). We also show that our additive error approximation is best possible, in the sense that any protocol achieving relative error for these problems requires significantly more communication. Finally, we experimentally validate our algorithms on real datasets.


1506.00438 2026-06-04 cs.LG cs.DM cs.SY eess.SY stat.ME (version update)

Network Topology Identification using PCA and its Graph Theoretic Interpretations

Aravind Rajeswaran, Shankar Narasimhan

AI summary: This paper estimates linear relationships with PCA and uses f-cut-sets and f-circuits to identify network topology, presenting a method for recovering network structure from steady-state data and its graph-theoretic interpretation.

Comments Structure of paper is changed to improve presentation. Methods and results are unchanged. A more detailed literature survey has been added

Abstract

We solve the problem of identifying (reconstructing) network topology from steady state network measurements. Concretely, given only a data matrix $\mathbf{X}$ where the $X_{ij}$ entry corresponds to flow in edge $i$ in configuration (steady-state) $j$, we wish to find a network structure for which flow conservation is obeyed at all the nodes. This models many network problems involving conserved quantities like water, power, and metabolic networks. We show that identification is equivalent to learning a model $\mathbf{A_n}$ which captures the approximate linear relationships between the different variables comprising $\mathbf{X}$ (i.e. of the form $\mathbf{A_n X \approx 0}$) such that $\mathbf{A_n}$ is full rank (highest possible) and consistent with a network node-edge incidence structure. The problem is solved through a sequence of steps like estimating approximate linear relationships using Principal Component Analysis, obtaining f-cut-sets from these approximate relationships, and graph realization from f-cut-sets (or equivalently f-circuits). Each step and the overall process is polynomial time. The method is illustrated by identifying topology of a water distribution network. We also study the extent of identifiability from steady-state data.


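1510.06895 2026-06-04 cs.LG cs.CV cs.NA math.NA (version update)

Nonconvex Nonsmooth Low-Rank Minimization via Iteratively Reweighted Nuclear Norm

Canyi Lu, Jinhui Tang, Shuicheng Yan, Zhouchen Lin

AI summary: This paper solves nonconvex nonsmooth low-rank minimization with an iteratively reweighted nuclear norm algorithm, using nonconvex surrogate functions to approximate the rank function and improve low-rank matrix recovery.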

DOI: 10.1109/TIP.2015.2511584

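Reinforcement Control with Hierarchical Backpropagated Adaptive Critics

John W. Jameson

AI summary: This paper proposes a hierarchical backpropagated adaptive critic architecture that addresses long-term credit assignment with a two-level hierarchy and introduces response-induced learning to improve control stability and robustness.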

Comments 16 pages, 5 figures

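1512.01927 2026-06-04 math.NA cs.CV cs.LG cs.NA (version update)

Fast Optimization Algorithm on Riemannian Manifolds and Its Application in Low-Rank Representation

Haoran Chen, Yanfeng Sun, Junbin Gao, Yongli Hu

AI summary: This paper proposes FOA, a new first-order optimization algorithm with a fast convergence rate, and applies an FOA-based fast subspace pursuit method to the low-rank representation model; experiments show it outperforms other methods in convergence speed and accuracy.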

1509.08581 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML (version update)

Optimization over Sparse Symmetric Sets via a Nonmonotone Projected Gradient Method

Zhaosong Lu

AI summary: This paper proposes a nonmonotone projected gradient method for optimization over sparse symmetric sets, introduces a stronger optimality condition, and proves global or local optimality.

Comments 30 pages

Abstract

We consider the problem of minimizing a Lipschitz differentiable function over a class of sparse symmetric sets that has wide applications in engineering and science. For this problem, it is known that any accumulation point of the classical projected gradient (PG) method with a constant stepsize $1/L$ satisfies the $L$-stationarity optimality condition that was introduced in [3]. In this paper we introduce a new optimality condition that is stronger than the $L$-stationarity optimality condition. We also propose a nonmonotone projected gradient (NPG) method for this problem by incorporating some support-changing and coordinate-swapping strategies into a projected gradient method with variable stepsizes. It is shown that any accumulation point of NPG satisfies the new optimality condition and moreover it is a coordinatewise stationary point. Under some suitable assumptions, we further show that it is a global or a local minimizer of the problem. Numerical experiments are conducted to compare the performance of PG and NPG. The computational results demonstrate that NPG has substantially better solution quality than PG, and moreover, it is at least comparable to, but sometimes can be much faster than PG in terms of speed.
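The classical PG baseline mentioned in this abstract can be sketched for the simplest case, assumed here, where the feasible set is the plain cardinality constraint $\{x : \|x\|_0 \le s\}$. The toy objective and all names are illustrative. The support-changing and coordinate-swapping strategies of NPG are not implemented.

```python
def project_sparse(x, s):
    """Keep the s largest-magnitude entries of x and zero the rest."""
    keep = set(sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)[:s])
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]


def projected_gradient(grad, x0, s, L, iters):
    """Classical PG with constant stepsize 1/L over {x : ||x||_0 <= s}."""
    x = list(x0)
    for _ in range(iters):
        g = grad(x)
        x = project_sparse([xi - gi / L for xi, gi in zip(x, g)], s)
    return x


# Toy objective f(x) = 0.5 * sum_i a_i * (x_i - b_i)^2, whose gradient is
# Lipschitz with constant L = max(a).
a = [1.0, 2.0, 1.0, 4.0]
b = [3.0, -1.0, 0.5, 2.0]
grad = lambda x: [ai * (xi - bi) for ai, xi, bi in zip(a, x, b)]

x_hat = projected_gradient(grad, [0.0] * 4, s=2, L=max(a), iters=200)
```

On this separable toy problem PG happens to settle on the best 2-sparse support. In general it only guarantees $L$-stationarity, which is the gap the stronger optimality condition in the paper is meant to address.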


1511.08062 2026-06-04 math.OC cs.LG cs.NA math.NA (version update)

Relaxed Majorization-Minimization for Non-smooth and Non-convex Optimization

Chen Xu, Zhouchen Lin, Zhenyu Zhao, Hongbin Zha

AI summary: This paper proposes a new relaxed majorization-minimization method for non-smooth and non-convex optimization that subsumes existing MM methods. By weakening the conditions, it allows surrogate functions that directly approximate the non-smooth objective, and it shows advantages on robust matrix factorization.

Comments AAAI16

Abstract

We propose a new majorization-minimization (MM) method for non-smooth and non-convex programs, which is general enough to include the existing MM methods. Besides the local majorization condition, we only require that the difference between the directional derivatives of the objective function and its surrogate function vanishes when the number of iterations approaches infinity, which is a very weak condition. So our method can use a surrogate function that directly approximates the non-smooth objective function. In comparison, all the existing MM methods construct the surrogate function by approximating the smooth component of the objective function. We apply our relaxed MM methods to the robust matrix factorization (RMF) problem with different regularizations, where our locally majorant algorithm shows advantages over the state-of-the-art approaches for RMF. This is the first algorithm for RMF ensuring, without extra assumptions, that any limit point of the iterates is a stationary point.


1509.03044 2026-06-04 cs.LG cs.AI cs.SY eess.SY (version update)

Recurrent Reinforcement Learning: A Hybrid Approach

Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He

AI summary: This paper proposes a hybrid model combining supervised learning and reinforcement learning to learn state representations for partially observable tasks, effective with minimal domain knowledge.

Comments 11 pages, 6 figures

Abstract

Successful applications of reinforcement learning in real-world problems often require dealing with partially observable states. It is in general very challenging to construct and infer hidden states as they often depend on the agent's entire interaction history and may require substantial domain knowledge. In this work, we investigate a deep-learning approach to learning the representation of states in partially observable tasks, with minimal prior knowledge of the domain. In particular, we propose a new family of hybrid models that combines the strength of both supervised learning (SL) and reinforcement learning (RL), trained in a joint fashion: The SL component can be a recurrent neural network (RNN) or its long short-term memory (LSTM) version, which is equipped with the desired property of being able to capture long-term dependency on history, thus providing an effective way of learning the representation of hidden states. The RL component is a deep Q-network (DQN) that learns to optimize the control for maximizing long-term rewards. Extensive experiments in a direct mailing campaign problem demonstrate the effectiveness and advantages of the proposed approach, which performs the best among a set of previous state-of-the-art methods.


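1511.05133 2026-06-04 math.OC cs.LG cs.NA math.NA (version update)

Fast Proximal Linearized Alternating Direction Method of Multiplier with Parallel Splitting

Canyi Lu, Huan Li, Zhouchen Lin, Shuicheng Yan

AI summary: This paper proposes a fast proximal augmented Lagrangian method and a fast proximal ADMM with parallel splitting, improving convergence rates and reducing computational complexity; experiments on synthetic and real data show both outperform conventional PALM and ADMM.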

Comments AAAI 2016

1407.0753 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML (version update)

Global convergence of splitting methods for nonconvex composite optimization

Guoyin Li, Ting Kei Pong

AI summary: This paper studies nonconvex composite optimization, analyzes the convergence of the alternating direction method of multipliers and the proximal gradient algorithm, proves that under certain conditions the sequence converges to a stationary point, and gives sufficient conditions for the sequence to be bounded.

Comments To appear in SIOPT

DOI: 10.1137/140998135

Abstract

We consider the problem of minimizing the sum of a smooth function $h$ with a bounded Hessian, and a nonsmooth function. We assume that the latter function is a composition of a proper closed function $P$ and a surjective linear map $\cal M$, with the proximal mappings of $τP$, $τ> 0$, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we examine two types of splitting methods for solving this nonconvex optimization problem: the alternating direction method of multipliers and the proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that, if the penalty parameter is chosen sufficiently large and the sequence generated has a cluster point, then it gives a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under an additional assumption that the functions $h$ and $P$ are semi-algebraic. Furthermore, we give simple sufficient conditions to guarantee boundedness of the sequence generated. These conditions can be satisfied for a wide range of applications including the least squares problem with the $\ell_{1/2}$ regularization. Finally, when $\cal M$ is the identity so that the proximal gradient algorithm can be efficiently applied, we show that any cluster point is stationary under a slightly more flexible constant step-size rule than what is known in the literature for a nonconvex $h$.


1405.4980 2026-06-04 math.OC cs.CC cs.LG cs.NA math.NA stat.ML (version update)

Convex Optimization: Algorithms and Complexity

Sébastien Bubeck

AI summary: This monograph covers the complexity theorems of convex optimization and their algorithms, spanning the theory and methods of black-box, structural, and stochastic optimization, with emphasis on core algorithms such as FISTA, dual averaging, and interior point methods.

Comments A previous version of the manuscript was titled "Theory of Convex Optimization for Machine Learning"

Journal ref: In Foundations and Trends in Machine Learning, Vol. 8: No. 3-4, pp 231-357, 2015

Abstract

This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms. Starting from the fundamental theory of black-box optimization, the material progresses towards recent advances in structural optimization and stochastic optimization. Our presentation of black-box optimization, strongly influenced by Nesterov's seminal book and Nemirovski's lecture notes, includes the analysis of cutting plane methods, as well as (accelerated) gradient descent schemes. We also pay special attention to non-Euclidean settings (relevant algorithms include Frank-Wolfe, mirror descent, and dual averaging) and discuss their relevance in machine learning. We provide a gentle introduction to structural optimization with FISTA (to optimize a sum of a smooth and a simple non-smooth term), saddle-point mirror prox (Nemirovski's alternative to Nesterov's smoothing), and a concise description of interior point methods. In stochastic optimization we discuss stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms. We also briefly touch upon convex relaxation of combinatorial problems and the use of randomness to round solutions, as well as random walks based methods.


1502.06800 2026-06-04 cs.LG cs.NA math.NA stat.ML (version update)

On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions

Francis Bach

AI summary: Shows that kernel quadrature rules are a special case of random feature expansions, derives through theoretical analysis how the required number of samples depends on the eigenvalues of the integral operator, extends the results to function approximation, and improves generalization guarantees for learning with random features.

Abstract

We show that kernel-based quadrature rules for computing integrals can be seen as a special case of random feature expansions for positive definite kernels, for a particular decomposition that always exists for such kernels. We provide a theoretical analysis of the number of required samples for a given approximation error, leading to both upper and lower bounds that are based solely on the eigenvalues of the associated integral operator and match up to logarithmic terms. In particular, we show that the upper bound may be obtained from independent and identically distributed samples from a specific non-uniform distribution, while the lower bound is valid for any set of points. Applying our results to kernel-based quadrature, while our results are fairly general, we recover known upper and lower bounds for the special cases of Sobolev spaces. Moreover, our results extend to the more general problem of full function approximations (beyond simply computing an integral), with results in $L_2$- and $L_\infty$-norm that match known results for special cases. Applying our results to random features, we show an improvement of the number of random features needed to preserve the generalization guarantees for learning with Lipschitz-continuous losses.
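As background for the random feature expansions discussed in this abstract, here is a minimal sketch of the standard random Fourier feature construction for a Gaussian kernel. It draws frequencies from the plain Gaussian density, not from the optimized non-uniform distribution analyzed in the paper. The dimension, feature count, seed, and test points are arbitrary choices.

```python
import math
import random


def gaussian_kernel(x, y):
    """k(x, y) = exp(-||x - y||^2 / 2)."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))


def sample_features(dim, num_features, rng):
    """Draw (w, b) pairs with w ~ N(0, I) and b ~ Uniform[0, 2*pi]."""
    return [([rng.gauss(0.0, 1.0) for _ in range(dim)], rng.uniform(0.0, 2.0 * math.pi))
            for _ in range(num_features)]


def feature_map(x, params):
    """Random Fourier feature vector with entries sqrt(2/D) * cos(w^T x + b)."""
    scale = math.sqrt(2.0 / len(params))
    return [scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b) for w, b in params]


rng = random.Random(0)
params = sample_features(dim=2, num_features=10000, rng=rng)
x, y = [0.3, -0.2], [0.1, 0.4]
# The inner product of the feature vectors is a Monte Carlo estimate of k(x, y).
approx = sum(p * q for p, q in zip(feature_map(x, params), feature_map(y, params)))
exact = gaussian_kernel(x, y)
```

The estimate's error shrinks roughly like one over the square root of the number of features. How many features are needed for a given accuracy is the kind of question the bounds in the abstract address.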


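1504.05477 2026-06-04 cs.DS cs.LG cs.NA math.NA (version update)

Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition

Cameron Musco, Christopher Musco

AI summary: This paper proposes a randomized block Krylov method that outperforms existing randomized simultaneous power iteration both in theory and in experiments, delivering better low-rank approximations in fewer iterations and addressing the reliance of traditional Krylov subspace methods on singular value gaps.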

Comments Neural Information Processing Systems 2015

On Data Preconditioning for Regularized Loss Minimization

Tianbao Yang, Rong Jin, Shenghuo Zhu, Qihang Lin

AI summary: Studies data preconditioning to speed up the convergence of first-order methods for regularized loss minimization, analyzes how the problem's condition number affects convergence, and proposes a random sampling approach for efficient preconditioning.

Abstract

In this work, we study data preconditioning, a well-known and long-existing technique, for boosting the convergence of first-order methods for regularized loss minimization. It is well understood that the condition number of the problem, i.e., the ratio of the Lipschitz constant to the strong convexity modulus, has a harsh effect on the convergence of the first-order optimization methods. Therefore, minimizing a small regularized loss for achieving good generalization performance, yielding an ill-conditioned problem, becomes the bottleneck for big data problems. We provide a theory on data preconditioning for regularized loss minimization. In particular, our analysis exhibits an appropriate data preconditioner and characterizes the conditions on the loss function and on the data under which data preconditioning can reduce the condition number and therefore boost the convergence for minimizing the regularized loss. To make the data preconditioning practically useful, we endeavor to employ and analyze a random sampling approach to efficiently compute the preconditioned data. The preliminary experiments validate our theory.


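1509.06458 2026-06-04 cs.LG cs.NA math.NA (version update)

Harmonic Extension

Zuoqiang Shi, Jian Sun, Minghao Tian

AI summary: This paper proposes the point integral method (PIM) and the volume constraint method (VCM) for the harmonic extension problem, remedying shortcomings of the traditional graph Laplacian approach, and reports the best performance in semi-supervised learning.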

Comments 10 pages, 2 figures

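1509.03946 2026-06-04 cs.LG cs.NA math.NA (version update)

Parametric Maxflows for Structured Sparse Learning with Convex Relaxations of Submodular Functions

Yoshinobu Kawahara, Yutaro Yamaguchi

AI summary: This paper uses parametric maxflow optimization for structured sparse learning with convex relaxations of submodular functions, showing that existing structured penalties satisfy the required conditions so that regularized learning can be solved quickly.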

1509.02730 2026-06-04 eess.SY cs.DC cs.IT cs.LG cs.SY math.IT (version update)

Finite Dictionary Variants of the Diffusion KLMS Algorithm

Rangeet Mitra, Vimal Bhatia

AI summary: This paper proposes two finite-dictionary variants of the diffusion KLMS algorithm that reduce storage requirements while maintaining convergence performance.

Abstract

The diffusion based distributed learning approaches have been found to be a viable solution for learning over linearly separable datasets over a network. However, approaches till date are suitable for linearly separable datasets and need to be extended to scenarios in which we need to learn a non-linearity. In such scenarios, the recently proposed diffusion kernel least mean squares (KLMS) has been found to be performing better than diffusion least mean squares (LMS). The drawback of diffusion KLMS is that it requires infinite storage for observations (also called dictionary). This paper formulates the diffusion KLMS in a fixed budget setting such that the storage requirement is curtailed while maintaining appreciable performance in terms of convergence. Simulations have been carried out to validate the two newly proposed algorithms named as quantised diffusion KLMS (QDKLMS) and fixed budget diffusion KLMS (FBDKLMS) against KLMS, which indicate that both the proposed algorithms deliver better performance as compared to the KLMS while reducing the dictionary size storage requirement.
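How quantisation bounds the dictionary can be sketched for a single node. The code below is a generic quantised KLMS with a Gaussian kernel, given only as an illustration: it is not the diffusion QDKLMS or FBDKLMS algorithm of the paper, and the function names, step size, kernel width, quantisation radius and toy data are all assumptions.

```python
import math


def gaussian_kernel(x, c, width):
    return math.exp(-((x - c) ** 2) / (2.0 * width ** 2))


def quantized_klms(samples, step=0.5, width=0.5, quant=0.2):
    """Single-node quantised KLMS: merge an input into its nearest centre when close enough."""
    centers, coeffs, errors = [], [], []
    for x, y in samples:
        prediction = sum(a * gaussian_kernel(x, c, width) for a, c in zip(coeffs, centers))
        err = y - prediction
        errors.append(err)
        if centers:
            nearest = min(range(len(centers)), key=lambda i: abs(x - centers[i]))
            if abs(x - centers[nearest]) <= quant:
                # input falls inside an existing quantisation cell: update that coefficient only
                coeffs[nearest] += step * err
                continue
        # otherwise the dictionary grows by one centre
        centers.append(x)
        coeffs.append(step * err)
    return centers, coeffs, errors


# toy regression task: learn y = sin(2x) on [0, 3)
inputs = [(0.37 * t) % 3.0 for t in range(300)]
samples = [(x, math.sin(2.0 * x)) for x in inputs]
centers, coeffs, errors = quantized_klms(samples)
early_mse = sum(e * e for e in errors[:30]) / 30
late_mse = sum(e * e for e in errors[-30:]) / 30
print(len(centers), early_mse, late_mse)
```

Because stored centres are always more than `quant` apart, the dictionary size is bounded by the size of the input domain rather than by the number of observations.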


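1409.0553 2026-06-04 eess.SY cs.LG cs.SY version update

Sampling-based Approximations with Quantitative Performance for the Probabilistic Reach-Avoid Problem over General Markov Processes

Sofie Haesaert, Robert Babuska, Alessandro Abate

AI summary: This paper proposes an approximate computational scheme based on the Fitted Value Iteration algorithm for the probabilistic reach-avoid problem over general Markov processes, with computable error bounds that make it suitable for safety-critical applications.

Details

AI abstract (translated from Chinese)

This article studies stochastic processes with the Markov property that evolve over general (uncountable) state spaces and depend on a non-deterministic quantity (a control input) that affects the probabilistic dynamics. It addresses the computation of maximal reach-avoid specifications and the synthesis of the corresponding optimal controllers. A reach-avoid specification assesses the probability that a finite-horizon trajectory of the model enters a given goal set while avoiding a given set of undesired states. The article proposes an approximate computational scheme built on the Fitted Value Iteration algorithm and random sampling, and provides formal probabilistic error bounds that can be computed a priori, so that the output of the numerical scheme is quantitatively assessed and meaningful for safety-critical applications. It also provides tighter, sample-based probabilistic error bounds. The overall scheme is related to alternative approximation algorithms in the literature, and its performance is finally assessed on a benchmark case study.

English abstract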

This article deals with stochastic processes endowed with the Markov (memoryless) property and evolving over general (uncountable) state spaces. The models further depend on a non-deterministic quantity in the form of a control input, which can be selected to affect the probabilistic dynamics. We address the computation of maximal reach-avoid specifications, together with the synthesis of the corresponding optimal controllers. The reach-avoid specification deals with assessing the likelihood that any finite-horizon trajectory of the model enters a given goal set, while avoiding a given set of undesired states. This article newly provides an approximate computational scheme for the reach-avoid specification based on the Fitted Value Iteration algorithm, which hinges on random sample extractions, and gives a-priori computable formal probabilistic bounds on the error made by the approximation algorithm: as such, the output of the numerical scheme is quantitatively assessed and thus meaningful for safety-critical applications. Furthermore, we provide tighter probabilistic error bounds that are sample-based. The overall computational scheme is put in relationship with alternative approximation algorithms in the literature, and finally its performance is practically assessed over a benchmark case study.
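The reach-avoid specification can be made concrete on a small finite chain, where the maximal probability follows from an exact backward recursion. This is only a sketch of the quantity being computed: it does not use sampling or Fitted Value Iteration, and the chain, the slip probability and all names are hypothetical.

```python
def reach_avoid_values(n_states, target, unsafe, horizon, p_success=0.8):
    """Maximal probability of reaching `target` within `horizon` steps while avoiding `unsafe`."""

    def move(x, direction):
        return min(max(x + direction, 0), n_states - 1)   # walls at both ends

    # zero steps left: success only if the state is already in the target set
    values = [1.0 if x in target else 0.0 for x in range(n_states)]
    for _ in range(horizon):
        new_values = []
        for x in range(n_states):
            if x in target:
                new_values.append(1.0)        # goal reached, trajectory succeeds
            elif x in unsafe:
                new_values.append(0.0)        # undesired state entered, trajectory fails
            else:
                best = 0.0
                for direction in (-1, 1):     # control input: try to move left or right
                    intended = values[move(x, direction)]
                    slipped = values[move(x, -direction)]
                    best = max(best, p_success * intended + (1.0 - p_success) * slipped)
                new_values.append(best)
        values = new_values
    return values


one_step = reach_avoid_values(5, target={4}, unsafe={0}, horizon=1)
five_steps = reach_avoid_values(5, target={4}, unsafe={0}, horizon=5)
print(one_step)
print(five_steps)
```

On an uncountable state space this exact table is unavailable, which is where a fitted, sample-based approximation of the same recursion and its error bounds come in.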


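1404.5009 2026-06-04 cs.CV cs.LG cs.NA math.NA version update

Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference

Peng Wang, Chunhua Shen, Anton van den Hengel, Philip Torr

AI summary: This paper proposes an efficient branch-and-cut method for general MAP-MRF inference problems that combines scalable semidefinite programming with a cutting-plane method to handle constraints efficiently, and achieves the best results when the model is densely connected or the unary costs are relatively low.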

Comments 21 pages

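1509.01352 2026-06-04 cs.LG cs.DC cs.IT cs.SY eess.SY math.IT version update

Diffusion-KLMS Algorithm and its Performance Analysis for Non-Linear Distributed Networks

Rangeet Mitra, Vimal Bhatia

AI summary: This paper proposes a diffusion-KLMS algorithm for non-linear distributed environments, shows in simulations that it converges better than algorithms of the same genre, and introduces a technique for predicting its transient and steady-state behaviour; the approach can be extended to cooperative spectrum sensing and massive MIMO receiver design in 5G communication systems.

Details

AI abstract (translated from Chinese)

In a distributed network environment, the diffusion least mean squares (LMS) algorithm converges faster than the original LMS algorithm. Diffusion-LMS has also been observed to generally outperform other distributed LMS algorithms such as spatial LMS and incremental LMS. However, neither the original LMS nor diffusion-LMS is applicable in non-linear environments, where the data may not be linearly separable. A variant of LMS called kernel-LMS (KLMS) has been proposed in the literature to handle such non-linearities. This paper proposes a kernelised version of diffusion-LMS for non-linear distributed environments. Simulations show that the proposed approach converges better than algorithms of the same genre. The paper also introduces a technique for predicting the transient and steady-state behaviour of the proposed algorithm. The proposed techniques (or algorithms of the same genre) can readily be extended to distributed parameter estimation applications such as cooperative spectrum sensing and massive multiple-input multiple-output (MIMO) receiver design, which are potential components of 5G communication systems.

English abstract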

In a distributed network environment, the diffusion-least mean squares (LMS) algorithm gives faster convergence than the original LMS algorithm. It has also been observed that the diffusion-LMS generally outperforms other distributed LMS algorithms like spatial LMS and incremental LMS. However, both the original LMS and diffusion-LMS are not applicable in non-linear environments where data may not be linearly separable. A variant of LMS called kernel-LMS (KLMS) has been proposed in the literature for such non-linearities. In this paper, we propose a kernelised version of diffusion-LMS for non-linear distributed environments. Simulations show that the proposed approach has superior convergence as compared to algorithms of the same genre. We also introduce a technique to predict the transient and steady-state behaviour of the proposed algorithm. The techniques proposed in this work (or algorithms of the same genre) can be easily extended to distributed parameter estimation applications like cooperative spectrum sensing and massive multiple input multiple output (MIMO) receiver design, which are potential components for 5G communication systems.


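1508.07416 2026-06-04 cs.CE cs.LG cs.NA math.NA version update

Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data

Guoxu Zhou, Qibin Zhao, Yu Zhang, Tülay Adalı, Shengli Xie, Andrzej Cichocki

AI summary: This paper reviews matrix-based component analysis methods for the joint analysis of multi-block data and extends them to multi-block multiway tensor data, focusing on how the multiway nature of the data can be used to extract common and individual features for biomedical data analysis.

Comments 20 pages, 11 figures, Proceedings of the IEEE, 2015

Details

DOI: 10.1109/JPROC.2015.2474704

AI abstract (translated from Chinese)

With the spread of sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that must be analyzed jointly to explore their latent connections. Component analysis methods play an increasingly important role in analyzing such coupled data. This paper first briefly reviews existing matrix-based (two-way) component analysis methods for the joint analysis of such data, with a focus on biomedical applications. It then discusses important extensions and generalizations of these methods to multi-block multiway (tensor) data. It shows how constrained multi-block tensor decomposition methods, by incorporating the multiway nature of the data, can extract similar or statistically dependent common features shared by all blocks. Special emphasis is placed on flexible common and individual feature analysis of multi-block data, which aims to extract common and individual latent components with desired properties and types of diversity simultaneously. Illustrative examples demonstrate its effectiveness for biomedical data analysis.

English abstract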

With the increasing availability of various sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that need to be jointly analyzed to explore their latent connections. Various component analysis methods have played an increasingly important role for the analysis of such coupled data. In this paper, we first provide a brief review of existing matrix-based (two-way) component analysis methods for the joint analysis of such data with a focus on biomedical applications. Then, we discuss their important extensions and generalization to multi-block multiway (tensor) data. We show how constrained multi-block tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data. Special emphasis is given to the flexible common and individual feature analysis of multi-block data with the aim to simultaneously extract common and individual latent components with desired properties and types of diversity. Illustrative examples are given to demonstrate their effectiveness for biomedical data analysis.


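1502.04390 2026-06-04 cs.LG cs.NA math.NA version update

Equilibrated adaptive learning rates for non-convex optimization

Yann N. Dauphin, Harm de Vries, Yoshua Bengio

AI summary: This paper proposes the ESGD algorithm, which improves adaptive learning rates for non-convex optimization through an equilibration preconditioner; experiments show that it converges faster than RMSProp.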

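1406.2082 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC stat.AP version update

Fast and Flexible ADMM Algorithms for Trend Filtering

Aaditya Ramdas, Ryan J. Tibshirani

AI summary: This paper proposes a fast and robust algorithm for trend filtering that addresses its computational difficulties on large-scale data, and shows that it extends to sparse trend filtering and isotonic trend filtering.

Comments 22 pages, 10 figures; published in Journal of Computational and Graphical Statistics, 2015

Details

DOI: 10.1080/10618600.2015.1054033

AI abstract (translated from Chinese)

This paper presents a fast and robust algorithm for trend filtering, a recently developed nonparametric regression tool. It has been shown that, for estimating functions whose derivatives are of bounded variation, trend filtering achieves the minimax optimal error rate, whereas other popular methods such as smoothing splines and kernels do not. What has limited wider practical adoption, however, is the lack of scalable and numerically stable algorithms for fitting trend filtering estimates. This paper presents a highly efficient, specialized ADMM routine for trend filtering. The algorithm is competitive with the specialized interior point methods currently in use, yet is far more numerically robust. In addition, the proposed ADMM implementation is very simple and, importantly, flexible enough to extend to many interesting related problems, such as sparse trend filtering and isotonic trend filtering. Software for the method is freely available in both C and R.

English abstract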

This paper presents a fast and robust algorithm for trend filtering, a recently developed nonparametric regression tool. It has been shown that, for estimating functions whose derivatives are of bounded variation, trend filtering achieves the minimax optimal error rate, while other popular methods like smoothing splines and kernels do not. Standing in the way of a more widespread practical adoption, however, is a lack of scalable and numerically stable algorithms for fitting trend filtering estimates. This paper presents a highly efficient, specialized ADMM routine for trend filtering. Our algorithm is competitive with the specialized interior point methods that are currently in use, and yet is far more numerically robust. Furthermore, the proposed ADMM implementation is very simple, and importantly, it is flexible enough to extend to many interesting related problems, such as sparse trend filtering and isotonic trend filtering. Software for our method is freely available, in both the C and R languages.
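For orientation, the problem and one ADMM splitting can be written out. This is a sketch under assumptions rather than a quotation from the paper: it assumes evenly spaced inputs, writes $D^{(j)}$ for the $j$-th order discrete difference operator, and uses the scaled dual variable $u$ with penalty parameter $\rho > 0$. The abstract does not state the splitting, so the choice $\alpha = D^{(k)}\beta$ is an assumption.

$$
\hat{\beta} = \arg\min_{\beta \in \mathbb{R}^n} \; \tfrac{1}{2}\lVert y - \beta \rVert_2^2 + \lambda \lVert D^{(k+1)} \beta \rVert_1
$$

Since $D^{(k+1)} = D^{(1)} D^{(k)}$, introducing $\alpha = D^{(k)}\beta$ gives the iterations

$$
\begin{aligned}
\beta &\leftarrow \bigl(I + \rho\, (D^{(k)})^{\top} D^{(k)}\bigr)^{-1} \bigl(y + \rho\, (D^{(k)})^{\top} (\alpha + u)\bigr), \\
\alpha &\leftarrow \arg\min_{\alpha} \; \tfrac{1}{2} \lVert D^{(k)}\beta - u - \alpha \rVert_2^2 + \tfrac{\lambda}{\rho} \lVert D^{(1)} \alpha \rVert_1, \\
u &\leftarrow u + \alpha - D^{(k)}\beta .
\end{aligned}
$$

Under this splitting the $\beta$-update is a banded linear solve and the $\alpha$-update is a one-dimensional fused lasso problem, both of which can be solved in time linear in $n$.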


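1507.03194 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

A Review of Nonnegative Matrix Factorization Methods for Clustering

Ali Caner Türkmen

AI summary: This paper reviews nonnegative matrix factorization methods applied to clustering, discussing a range of variants and their clustering interpretations.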

1508.05873 2026-06-04 math.NA cs.LG cs.NA version update

Stochastic Behavior of the Nonnegative Least Mean Fourth Algorithm for Stationary Gaussian Inputs and Slow Learning

Jingen Ni, Jian Yang, Jie Chen, Cédric Richard, José Carlos M. Bermudez

AI summary: This paper studies the stochastic behavior of the nonnegative least mean fourth algorithm for stationary Gaussian inputs and slow learning, analyzes its performance, and validates the theoretical results.

Comments 11 pages, 8 figures, submitted for publication

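1508.05514 2026-06-04 stat.ML cs.CV cs.LG cs.RO cs.SY eess.SY version update

Gaussian Mixture Reduction Using Reverse Kullback-Leibler Divergence

Tohid Ardeshiri, Umut Orguner, Emre Özkan

AI summary: This paper proposes a greedy mixture reduction algorithm that prunes and merges mixture components based on the Kullback-Leibler divergence, improves computational efficiency through analytical approximations, and outperforms existing methods on simulated and real data.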

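1508.04467 2026-06-04 cs.CV cs.IT cs.LG cs.NA math.IT math.NA stat.ML version update

Robust Subspace Clustering via Smoothed Rank Approximation

Zhao Kang, Chong Peng, Qiang Cheng

AI summary: This paper proposes a log-determinant rank approximation for subspace clustering that improves accuracy and handles errors and noise effectively.

Comments Journal, code is available

Details

DOI: 10.1109/LSP.2015.2460737
Journal ref: IEEE Signal Processing Letters, 22(2015)2088-2092

AI abstract (translated from Chinese)

Minimizing the rank of a matrix subject to affine constraints arises in many application areas, from signal processing to machine learning. The nuclear norm is a convex relaxation of this problem that can recover the rank exactly under some restricted and theoretically interesting conditions. For many real-world applications, however, the nuclear norm approximation to the rank function can only produce a result far from the optimum. To seek a more accurate solution than the nuclear norm provides, this paper proposes a rank approximation based on the log-determinant and applies it to subspace clustering. The framework can model different kinds of errors and noise. An effective optimization strategy is developed, with a theoretical guarantee of convergence to a stationary point. The proposed method gives promising results on face clustering and motion segmentation tasks compared with state-of-the-art subspace clustering algorithms.

English abstract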

Matrix rank minimizing subject to affine constraints arises in many application areas, ranging from signal processing to machine learning. Nuclear norm is a convex relaxation for this problem which can recover the rank exactly under some restricted and theoretically interesting conditions. However, for many real-world applications, nuclear norm approximation to the rank function can only produce a result far from the optimum. To seek a solution of higher accuracy than the nuclear norm, in this paper, we propose a rank approximation based on Logarithm-Determinant. We consider using this rank approximation for subspace clustering application. Our framework can model different kinds of errors and noise. Effective optimization strategy is developed with theoretical guarantee to converge to a stationary point. The proposed method gives promising results on face clustering and motion segmentation tasks compared to the state-of-the-art subspace clustering algorithms.
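The gap between the nuclear norm and a log-determinant surrogate can be seen directly on singular values. The sketch below assumes the surrogate has the form $\log\det(I + X^{\top}X) = \sum_i \log(1 + \sigma_i^2)$; the abstract does not give the exact formula, so that form, the function name and the toy spectra are assumptions.

```python
import math


def rank_surrogates(singular_values, tol=1e-8):
    """Return (rank, nuclear norm, log-det surrogate) for a list of singular values."""
    rank = sum(1 for s in singular_values if s > tol)
    nuclear = sum(singular_values)
    # log det(I + X^T X) equals the sum of log(1 + sigma_i^2)
    logdet = sum(math.log(1.0 + s * s) for s in singular_values)
    return rank, nuclear, logdet


# two rank-2 spectra that differ only in scale
small = rank_surrogates([1.0, 1.0, 0.0])
large = rank_surrogates([100.0, 100.0, 0.0])
print(small)
print(large)
```

Both spectra have rank 2, yet the nuclear norm grows linearly with the scale of the singular values while the log-det term grows only logarithmically, so large singular values are penalized far less heavily.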


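1412.8293 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO version update

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels

Haim Avron, Vikas Sindhwani, Jiyan Yang, Michael Mahoney

AI summary: This paper uses quasi-Monte Carlo methods to improve random Fourier feature maps, speeding up training and testing of kernel methods on large datasets, with low-discrepancy sequences reducing the integration error.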

Comments A short version of this paper has been presented in ICML 2014

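1507.08847 2026-06-04 cs.LG cs.CV cs.NA math.NA version update

A novel multivariate performance optimization method based on sparse coding and hyper-predictor learning

Jiachen Yang, Zhiyong Ding, Fei Guo, Huogen Wang, Nick Hughes

AI summary: This paper proposes a new method for optimizing multivariate performance measures through sparse coding and hyper-predictor learning, using a joint optimization problem that minimizes the reconstruction error, the sparsity, and an upper bound on the complex loss function.

Details

DOI: 10.1016/j.neunet.2015.07.011

AI abstract (translated from Chinese)

This paper studies the optimization of multivariate performance measures and proposes a new algorithm for it. Unlike traditional machine learning methods, it studies how to learn an effective hyper-predictor for a tuple of data points so as to minimize a complex loss function corresponding to a multivariate performance measure. The tuple of data points is converted into a tuple of sparse codes via a dictionary, and a linear function is then applied to compare the sparse codes against a given candidate class label. To learn the dictionary, the sparse codes, and the parameters of the linear function, a joint optimization problem is posed in which the reconstruction error and sparsity of the sparse codes are minimized together with an upper bound on the complex loss function. The upper bound of the loss function is approximated through the sparse codes and the linear function parameters. To solve the problem, an iterative algorithm based on gradient descent is developed that learns the sparse codes and the hyper-predictor parameters alternately. Experimental results on several benchmark data sets show that the proposed method outperforms other state-of-the-art algorithms.

English abstract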

In this paper, we investigate the problem of optimizing multivariate performance measures and propose a novel algorithm for it. Unlike traditional machine learning methods, which optimize simple loss functions to learn a prediction function, the problem studied in this paper is how to learn an effective hyper-predictor for a tuple of data points, so that a complex loss function corresponding to a multivariate performance measure can be minimized. We propose to represent the tuple of data points as a tuple of sparse codes via a dictionary, and then apply a linear function to compare a sparse code against a given candidate class label. To learn the dictionary, the sparse codes, and the parameter of the linear function, we propose a joint optimization problem. In this problem, both the reconstruction error and the sparsity of the sparse codes, and the upper bound of the complex loss function, are minimized. Moreover, the upper bound of the loss function is approximated by the sparse codes and the linear function parameter. To optimize this problem, we develop an iterative algorithm based on gradient descent methods to learn the sparse codes and the hyper-predictor parameter alternately. Experimental results on some benchmark data sets show the advantage of the proposed method over other state-of-the-art algorithms.


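1409.2848 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML version update

A Stochastic PCA and SVD Algorithm with an Exponential Convergence Rate

Ohad Shamir

AI summary: Proposes the VR-PCA algorithm, which converges quickly using cheap stochastic iterations, addressing the slow convergence or heavy computational cost of traditional methods.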

Comments Fixed a minor bug in the proof of lemma 1 (which does not affect the result)

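1507.04396 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

Parallel MMF: a Multiresolution Approach to Matrix Computation

Risi Kondor, Nedelina Teneva, Pramod K. Mudrakarta

AI summary: This paper proposes a parallel MMF algorithm for multiscale structure analysis and matrix compression, and demonstrates experimentally its effectiveness for sparse matrix compression and preconditioning.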

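1406.1102 2026-06-04 math.NA cs.LG cs.NA stat.CO stat.ML version update

Linear Convergence of Variance-Reduced Stochastic Gradient without Strong Convexity

Pinghua Gong, Jieping Ye

AI summary: This paper introduces the Prox-SVRG and VRPSG algorithms and proves that, without strong convexity, they converge linearly on constrained and regularized problems; a Semi-Strongly Convex inequality is the key theoretical contribution.

Comments 18 pages

Details

AI abstract (translated from Chinese)

Stochastic gradient algorithms estimate the gradient from only one or a few samples and therefore have a low computational cost per iteration. They are widely used in large-scale optimization problems. However, because of the inherent variance in the gradient computation, stochastic gradient algorithms usually converge slowly, at sub-linear rates. To accelerate convergence, variance-reduced stochastic gradient algorithms such as the proximal stochastic variance-reduced gradient (Prox-SVRG) algorithm have recently been proposed for strongly convex problems. Under strong convexity, these algorithms achieve a linear convergence rate. Many machine learning problems, however, are convex but not strongly convex. This paper introduces Prox-SVRG and its projected variant, the Variance-Reduced Projected Stochastic Gradient (VRPSG) algorithm, to solve a class of non-strongly convex optimization problems widely used in machine learning. As its main technical contribution, the paper shows that both VRPSG and Prox-SVRG achieve a linear convergence rate without strong convexity. A key ingredient of the proof is a Semi-Strongly Convex (SSC) inequality, the first to be rigorously proved for a class of non-strongly convex problems in both constrained and regularized settings. The SSC inequality is independent of the algorithms and may be used to analyze other stochastic gradient algorithms, which may be of independent interest. To the best of the authors' knowledge, this is the first work to establish a linear convergence rate for variance-reduced stochastic gradient algorithms on constrained and regularized problems without strong convexity.

English abstract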

Stochastic gradient algorithms estimate the gradient based on only one or a few samples and enjoy low computational cost per iteration. They have been widely used in large-scale optimization problems. However, stochastic gradient algorithms are usually slow to converge and achieve sub-linear convergence rates, due to the inherent variance in the gradient computation. To accelerate the convergence, some variance-reduced stochastic gradient algorithms, e.g., proximal stochastic variance-reduced gradient (Prox-SVRG) algorithm, have recently been proposed to solve strongly convex problems. Under the strongly convex condition, these variance-reduced stochastic gradient algorithms achieve a linear convergence rate. However, many machine learning problems are convex but not strongly convex. In this paper, we introduce Prox-SVRG and its projected variant called Variance-Reduced Projected Stochastic Gradient (VRPSG) to solve a class of non-strongly convex optimization problems widely used in machine learning. As the main technical contribution of this paper, we show that both VRPSG and Prox-SVRG achieve a linear convergence rate without strong convexity. A key ingredient in our proof is a Semi-Strongly Convex (SSC) inequality which is the first to be rigorously proved for a class of non-strongly convex problems in both constrained and regularized settings. Moreover, the SSC inequality is independent of algorithms and may be applied to analyze other stochastic gradient algorithms besides VRPSG and Prox-SVRG, which may be of independent interest. To the best of our knowledge, this is the first work that establishes the linear convergence rate for the variance-reduced stochastic gradient algorithms on solving both constrained and regularized problems without strong convexity.
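The variance-reduction idea can be sketched with plain SVRG on a small least-squares problem. This is only an illustration: it has no proximal or projection step, so it is not Prox-SVRG or VRPSG, the toy problem happens to be strongly convex, and the data, step size and function names are assumptions.

```python
import random


def svrg_least_squares(features, targets, epochs=50, step=0.05, seed=0):
    """Plain SVRG for f(w) = (1/n) * sum_i 0.5 * (a_i . w - b_i)**2."""
    rng = random.Random(seed)
    n, dim = len(features), len(features[0])

    def grad_i(w, i):
        residual = sum(a * wj for a, wj in zip(features[i], w)) - targets[i]
        return [residual * a for a in features[i]]

    def loss(w):
        return sum(
            0.5 * (sum(a * wj for a, wj in zip(row, w)) - b) ** 2
            for row, b in zip(features, targets)
        ) / n

    w = [0.0] * dim
    history = [loss(w)]
    for _ in range(epochs):
        snapshot = list(w)
        # full gradient at the snapshot, computed once per epoch
        per_sample = [grad_i(snapshot, i) for i in range(n)]
        full_grad = [sum(g[j] for g in per_sample) / n for j in range(dim)]
        for _ in range(2 * n):
            i = rng.randrange(n)
            g_now = grad_i(w, i)
            # variance-reduced estimate: stochastic gradient corrected by the snapshot
            w = [wj - step * (gn - gs + mu)
                 for wj, gn, gs, mu in zip(w, g_now, per_sample[i], full_grad)]
        history.append(loss(w))
    return w, history


features = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
true_w = (2.0, -1.0)
targets = [sum(a * t for a, t in zip(row, true_w)) for row in features]
w, history = svrg_least_squares(features, targets)
print(w, history[0], history[-1])
```

The corrected estimate stays unbiased while its variance shrinks as the iterate and the snapshot approach the minimizer, which is what allows a constant step size and a linear rate.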


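1507.00567 2026-06-04 eess.SY cs.AI cs.DC cs.LG cs.SE cs.SY version update

Self-Learning Cloud Controllers: Fuzzy Q-Learning for Knowledge Evolution

Pooyan Jamshidi, Amir Sharifloo, Claus Pahl, Andreas Metzger, Giovani Estrada

AI summary: This paper proposes FQL4KE, a self-learning fuzzy cloud controller that learns and modifies fuzzy rules at runtime, so that users specify a controller by adjusting priority weights instead of writing complex adaptation rules; experiments show that it outperforms conventional controllers.

Details

AI abstract (translated from Chinese)

Cloud controllers aim to respond to application demands by automatically scaling compute resources at runtime, so as to meet performance guarantees and minimize resource costs. Existing cloud controllers often rely on a predefined set of adaptation rules, but for a cloud provider the applications running on top of the infrastructure are black boxes, which makes it difficult to define optimal or pre-emptive adaptation rules at design time. The burden of adaptation decisions is therefore often shifted to the cloud application. In most cases, however, application developers have limited knowledge of the cloud infrastructure. This paper proposes learning adaptation rules at runtime. To this end it introduces FQL4KE, a self-learning fuzzy cloud controller that learns and modifies fuzzy rules at runtime. The benefit is that designing a cloud controller no longer has to rely solely on precise design-time knowledge, which may be difficult to acquire. FQL4KE lets users specify cloud controllers by simply adjusting weights that represent priorities among system goals, rather than by specifying complex adaptation rules. The applicability of FQL4KE has been assessed experimentally as part of the cloud application framework ElasticBench. The results indicate that FQL4KE outperforms both the authors' previously developed fuzzy controller without learning mechanisms and native Azure auto-scaling.

English abstract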

Cloud controllers aim at responding to application demands by automatically scaling the compute resources at runtime to meet performance guarantees and minimize resource costs. Existing cloud controllers often resort to scaling strategies that are codified as a set of adaptation rules. However, for a cloud provider, applications running on top of the cloud infrastructure are more or less black-boxes, making it difficult at design time to define optimal or pre-emptive adaptation rules. Thus, the burden of taking adaptation decisions often is delegated to the cloud application. Yet, in most cases, application developers in turn have limited knowledge of the cloud infrastructure. In this paper, we propose learning adaptation rules during runtime. To this end, we introduce FQL4KE, a self-learning fuzzy cloud controller. In particular, FQL4KE learns and modifies fuzzy rules at runtime. The benefit is that for designing cloud controllers, we do not have to rely solely on precise design-time knowledge, which may be difficult to acquire. FQL4KE empowers users to specify cloud controllers by simply adjusting weights representing priorities in system goals instead of specifying complex adaptation rules. The applicability of FQL4KE has been experimentally assessed as part of the cloud application framework ElasticBench. The experimental results indicate that FQL4KE outperforms our previously developed fuzzy controller without learning mechanisms and the native Azure auto-scaling.


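1507.00564 2026-06-04 eess.SY cs.LG cs.SY version update

Regularized linear system identification using atomic, nuclear and kernel-based norms: the role of the stability constraint

Gianluigi Pillonetto, Tianshi Chen, Alessandro Chiuso, Giuseppe De Nicolao, Lennart Ljung

AI summary: This paper compares different regularization methods for system identification, finds that stable spline kernels perform better because they encode stability and smoothness, and proposes a new class of regularizers.

Details

AI abstract (translated from Chinese)

Inspired by the machine learning literature, new regularization techniques have been introduced in linear system identification. All the adopted estimators solve a regularized least squares problem and differ in the type of penalty term assigned to the impulse response. Popular choices include atomic and nuclear norms applied to Hankel matrices, as well as norms induced by the so-called stable spline kernels. This paper reports a comparative study of estimators based on these different regularizers. The findings show that stable spline kernels outperform approaches based on atomic and nuclear norms because they suitably embed information on the stability and smoothness of the impulse response. This point is illustrated using the Bayesian interpretation of regularization. The paper also designs a new class of regularizers defined by "integral" versions of stable spline/TC kernels. Under quite realistic experimental conditions, the new estimators outperform classical prediction error methods even when the latter are equipped with an oracle for model order selection.

English abstract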

Inspired by ideas taken from the machine learning literature, new regularization techniques have been recently introduced in linear system identification. In particular, all the adopted estimators solve a regularized least squares problem, differing in the nature of the penalty term assigned to the impulse response. Popular choices include atomic and nuclear norms (applied to Hankel matrices) as well as norms induced by the so called stable spline kernels. In this paper, a comparative study of estimators based on these different types of regularizers is reported. Our findings reveal that stable spline kernels outperform approaches based on atomic and nuclear norms since they suitably embed information on impulse response stability and smoothness. This point is illustrated using the Bayesian interpretation of regularization. We also design a new class of regularizers defined by "integral" versions of stable spline/TC kernels. Under quite realistic experimental conditions, the new estimators outperform classical prediction error methods also when the latter are equipped with an oracle for model order selection.


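1507.00438 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

DC Proximal Newton for Non-Convex Optimization Problems

Alain Rakotomamonjy, Remi Flamary, Gilles Gasso

AI summary: This paper proposes a new non-convex optimization algorithm that handles non-convex loss and regularization functions with a proximal Newton method; theoretical analysis shows that its limit points are stationary points of the DC objective, and experiments show that it is more efficient for high-dimensional transductive learning.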

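1507.00421 2026-06-04 math.NA cs.LG cs.NA math.ST stat.ML stat.TH version update

Categorical Matrix Completion

Yang Cao, Yao Xie

AI summary: This paper extends one-bit matrix completion to matrices with categorical-valued entries, maximizing the likelihood ratio under a nuclear norm constraint, establishing theoretical error bounds, and demonstrating the method's advantage on the MovieLens dataset.

Comments Submitted

Details

AI abstract (translated from Chinese)

We consider the problem of completing a matrix with categorical-valued entries from partial observations, by extending the formulation and theory of one-bit matrix completion. A low-rank matrix $X$ is recovered by maximizing the likelihood ratio subject to a constraint on the nuclear norm of $X$, where the observations are mapped from entries of $X$ through multiple link functions. We establish theoretical upper and lower bounds on the recovery error, which match up to a constant factor $\mathcal{O}(K^{3/2})$, where $K$ is the fixed number of categories. The upper bound depends on the number of categories implicitly, through a maximization of terms that involve the smoothness of the link functions. In contrast to one-bit matrix completion, our bounds are optimal up to a factor on the order of the square root of the number of categories, which is consistent with the intuition that the problem becomes harder as the number of categories increases. By comparing our method with the conventional matrix completion method on the MovieLens dataset, we demonstrate its advantage.

English abstract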

We consider the problem of completing a matrix with categorical-valued entries from partial observations. This is achieved by extending the formulation and theory of one-bit matrix completion. We recover a low-rank matrix $X$ by maximizing the likelihood ratio with a constraint on the nuclear norm of $X$, and the observations are mapped from entries of $X$ through multiple link functions. We establish theoretical upper and lower bounds on the recovery error, which meet up to a constant factor $\mathcal{O}(K^{3/2})$ where $K$ is the fixed number of categories. The upper bound in our case depends on the number of categories implicitly through a maximization of terms that involve the smoothness of the link functions. In contrast to one-bit matrix completion, our bounds for categorical matrix completion are optimal up to a factor on the order of the square root of the number of categories, which is consistent with an intuition that the problem becomes harder when the number of categories increases. By comparing the performance of our method with the conventional matrix completion method on the MovieLens dataset, we demonstrate the advantage of our method.


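1506.08187 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA version update

A geometric alternative to Nesterov's accelerated gradient descent

Sébastien Bubeck, Yin Tat Lee, Mohit Singh

AI summary: This paper proposes a new method for unconstrained optimization of smooth, strongly convex functions that attains the optimal convergence rate of Nesterov's accelerated gradient descent; its geometric interpretation is inspired by the ellipsoid method.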

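1506.07540 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

Global Optimality in Tensor Factorization, Deep Learning, and Beyond

Benjamin D. Haeffele, Rene Vidal

AI summary: This paper proposes a general framework for analyzing non-convex factorization problems, proves conditions under which local minima are global minima, and offers guidance on deep network architectures and regularization strategies that make optimization more efficient.

Details

AI abstract (translated from Chinese)

Techniques involving factorization have enjoyed significant empirical success in a wide range of applications, but the associated optimization problems are typically non-convex because of a multilinear form or some other convexity-destroying transformation. Building on ideas from convex relaxations of matrix factorizations, this paper presents a general framework for analyzing non-convex factorization problems, including matrix factorization, tensor factorization, and deep neural network training. It derives sufficient conditions that guarantee a local minimum of the non-convex optimization problem is a global minimum, and shows that if the size of the factorized variables is large enough, a global minimizer can be found from any initialization using a purely local descent algorithm. The framework also gives a partial theoretical justification for the widespread use of ReLUs in deep neural networks and offers guidance that facilitates efficient optimization.

English abstract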

Techniques involving factorization are found in a wide range of applications and have enjoyed significant empirical success in many fields. However, common to a vast majority of these problems is the significant disadvantage that the associated optimization problems are typically non-convex due to a multilinear form or other convexity destroying transformation. Here we build on ideas from convex relaxations of matrix factorizations and present a very general framework which allows for the analysis of a wide range of non-convex factorization problems - including matrix factorization, tensor factorization, and deep neural network training formulations. We derive sufficient conditions to guarantee that a local minimum of the non-convex optimization problem is a global minimum and show that if the size of the factorized variables is large enough then from any initialization it is possible to find a global minimizer using a purely local descent algorithm. Our framework also provides a partial theoretical justification for the increasingly common use of Rectified Linear Units (ReLUs) in deep neural networks and offers guidance on deep network architectures and regularization strategies to facilitate efficient optimization.


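1306.4905 2026-06-04 math.NA cs.LG cs.NA version update

From-Below Approximations in Boolean Matrix Factorization: Geometry and New Algorithm

Radim Belohlavek, Martin Trnecka

AI summary: This paper presents new results and a new algorithm for Boolean matrix factorization, emphasizes the importance of from-below approximations of the input matrix, and verifies experimentally the algorithm's advantages in coverage and factorization efficiency.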

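1408.6141 2026-06-04 eess.SY cs.LG cs.SY version update

Recursive Total Least-Squares Algorithm Based on Inverse Power Method and Dichotomous Coordinate-Descent Iterations

Reza Arablouei, Kutluyıl Doğançay, Stefan Werner

AI summary: This paper proposes a recursive total least-squares algorithm based on the inverse power method and dichotomous coordinate-descent iterations; compared with conventional methods it has lower computational complexity and is asymptotically unbiased and stable, and the paper derives a lower bound on the forgetting factor and a theoretical value for the steady-state mean-square deviation.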

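1502.02251 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SY version update

From Pixels to Torques: Policy Learning with Deep Dynamical Models

Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth

AI summary: This paper proposes a data-efficient reinforcement learning algorithm that uses a deep dynamical model to learn closed-loop control policies directly from pixel information, addressing data-efficient learning in continuous state-action spaces with high-dimensional observations.

Comments 9 pages

Details

AI abstract (translated from Chinese)

Data-efficient learning in continuous state-action spaces from very high-dimensional observations remains a key challenge in developing fully autonomous systems. This paper considers one instance of this challenge, the pixels-to-torques problem, in which an agent must learn a closed-loop control policy from pixel information only. It introduces a data-efficient, model-based reinforcement learning algorithm that learns such closed-loop policies directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in that low-dimensional feature space. Joint learning ensures that dynamic as well as static properties of the data are taken into account. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy used for closed-loop control. Compared with state-of-the-art reinforcement learning methods for continuous states and actions, the approach learns quickly, scales to high-dimensional state spaces, and is an important step toward fully autonomous learning from pixels to torques.

English abstract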

Data-efficient learning in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. In this paper, we consider one instance of this challenge, the pixels to torques problem, where an agent must learn a closed-loop control policy from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning ensures that not only static but also dynamic properties of the data are accounted for. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art reinforcement learning methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces and is an important step toward fully autonomous learning from pixels to torques.


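1310.0865 2026-06-04 stat.ML cs.LG cs.SY eess.SY version update

Electricity Market Forecasting via Low-Rank Multi-Kernel Learning

Vassilis Kekatos, Yu Zhang, Georgios B. Giannakis

AI summary: This paper forecasts electricity markets with a low-rank kernel learning method, using nuclear norm regularization to select kernels across pricing nodes and hours, which improves prediction accuracy and computational efficiency.

Comments 10 pages

Details

DOI: 10.1109/JSTSP.2014.2336611

AI abstract (translated from Chinese)

The smart grid vision entails advanced information technology and data analytics to improve the efficiency, sustainability, and economics of the power grid infrastructure. This paper applies modern statistical learning tools to electricity market inference. Day-ahead price forecasting is cast as a low-rank kernel learning problem. Uniquely exploiting the market clearing process, congestion patterns are modeled as rank-one components in the matrix of spatio-temporally varying prices. Through a novel nuclear norm-based regularization, kernels across pricing nodes and hours can be selected systematically. Although market-wide forecasting is beneficial from a learning perspective, it involves processing high-dimensional market data. This becomes feasible after devising a block-coordinate descent algorithm for the non-convex optimization problem involved. The algorithm uses results from block-sparse vector recovery and is guaranteed to converge to a stationary point. Numerical tests on real data from the Midwest ISO (MISO) market corroborate the prediction accuracy, computational efficiency, and interpretative merits of the developed approach.

English abstract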

The smart grid vision entails advanced information technology and data analytics to enhance the efficiency, sustainability, and economics of the power grid infrastructure. Aligned to this end, modern statistical learning tools are leveraged here for electricity market inference. Day-ahead price forecasting is cast as a low-rank kernel learning problem. Uniquely exploiting the market clearing process, congestion patterns are modeled as rank-one components in the matrix of spatio-temporally varying prices. Through a novel nuclear norm-based regularization, kernels across pricing nodes and hours can be systematically selected. Even though market-wide forecasting is beneficial from a learning perspective, it involves processing high-dimensional market data. The latter becomes possible after devising a block-coordinate descent algorithm for solving the non-convex optimization problem involved. The algorithm utilizes results from block-sparse vector recovery and is guaranteed to converge to a stationary point. Numerical tests on real data from the Midwest ISO (MISO) market corroborate the prediction accuracy, computational efficiency, and the interpretative merits of the developed approach over existing alternatives.


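1404.0466 2026-06-04 cs.LG cs.NA math.NA version update

piCholesky: Polynomial Interpolation of Multiple Cholesky Factors for Efficient Approximate Cross-Validation

Da Kuang, Alex Gittens, Raffay Hamid

AI summary: By polynomially interpolating the Cholesky factors of the Hessian matrix across multiple regularization parameters, the method performs efficient approximate cross-validation, reducing computational cost and providing an error-bound analysis.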

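1506.02649 2026-06-04 math.NA cs.LG cs.NA version update

Faster SGD Using Sketched Conditioning

Alon Gonen, Shai Shalev-Shwartz

AI summary: This paper accelerates stochastic optimization algorithms through sketching, improving the efficiency of SGD by constructing a cheap conditioner, and validates the approach on deep learning.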

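1406.6603 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

A scaled gradient projection method for Bayesian learning in dynamical systems

Silvia Bonettini, Alessandro Chiuso, Marco Prato

AI summary: This paper proposes a scaled gradient projection algorithm for the non-convex optimization problem arising in Bayesian learning, solved efficiently through a careful design of the scaling matrix and the steplength parameter.

Details

DOI: 10.1137/140973529
Journal ref: SIAM Journal on Scientific Computing 37 (2015), A1297-A1318

AI abstract (translated from Chinese)

Selecting the most appropriate model class is a crucial task in system identification and is classically addressed through cross-validation or asymptotic arguments. As recently suggested in the literature, the problem can be addressed in a Bayesian framework, where model complexity is regulated by a few hyperparameters that can be estimated by maximizing the marginal likelihood. Designing effective optimization methods is therefore of primary importance. If the unknown impulse response is modeled as a Gaussian process with a suitable kernel, maximizing the marginal likelihood leads to a challenging non-convex optimization problem that requires a stable and effective solution strategy. This paper addresses the problem with a scaled gradient projection algorithm, in which the scaling matrix and the steplength parameter are crucial for obtaining a meaningful solution in a computational time comparable with second-order methods. In particular, it proposes a generalization of the split gradient approach for designing the scaling matrix in the presence of box constraints, together with an effective implementation of the gradient and the objective function. Extensive numerical experiments on several test problems show that the method delivers, within a few tenths of a second, solutions whose accuracy is comparable with state-of-the-art approaches. The flexibility of the strategy also makes it easy to adapt to a wider range of problems in other areas.

English abstract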

A crucial task in system identification problems is the selection of the most appropriate model class, which is classically addressed by resorting to cross-validation or using asymptotic arguments. As recently suggested in the literature, this can be addressed in a Bayesian framework, where model complexity is regulated by a few hyperparameters, which can be estimated via marginal likelihood maximization. It is thus of primary importance to design effective optimization methods to solve the corresponding optimization problem. If the unknown impulse response is modeled as a Gaussian process with a suitable kernel, the maximization of the marginal likelihood leads to a challenging nonconvex optimization problem, which requires a stable and effective solution strategy. In this paper we address this problem by means of a scaled gradient projection algorithm, in which the scaling matrix and the steplength parameter play a crucial role in providing a meaningful solution in a computational time comparable with second-order methods. In particular, we propose both a generalization of the split gradient approach to design the scaling matrix in the presence of box constraints, and an effective implementation of the gradient and objective function. The extensive numerical experiments carried out on several test problems show that our method is very effective in providing, in a few tenths of a second, solutions of the problems with accuracy comparable with state-of-the-art approaches. Moreover, the flexibility of the proposed strategy makes it easily adaptable to a wider range of problems arising in different areas of machine learning, signal processing and system identification.


1506.02312 2026-06-04 cs.AI cs.LG cs.RO cs.SY eess.SY 版本更新

A Framework for Constrained and Adaptive Behavior-Based Agents

一种用于约束和自适应行为基 agent 的框架

Renato de Pontes Pereira, Paulo Martins Engel

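AI summary: This paper proposes a framework that integrates reinforcement learning nodes into behavior trees to give constrained agents the ability to learn, relates the approach to options in hierarchical reinforcement learning, and ensures convergence of nested learning nodes.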

Comments 2015; 15 pages

1504.00905 2026-06-04 math.OC cs.CV cs.LG cs.SY eess.SY 版本更新

Robust Anomaly Detection Using Semidefinite Programming

使用半定规划进行鲁棒异常检测

Jose A. Lopez, Octavia Camps, Mario Sznaier

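AI summary: This paper proposes a new anomaly detection method based on polynomial optimization and the method of moments. It requires only the statistical moments of the normal-state features, outperforms methods such as Parzen windows and one-class SVMs, and gives a compact description of the normal state that simplifies anomaly detection on high-dimensional data sets.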

Comments 13 pages, 11 figures

1406.5286 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

Enhancing Pure-Pixel Identification Performance via Preconditioning

通过预条件化增强纯像素识别性能

Nicolas Gillis, Wing-Kin Ma

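AI summary: This paper analyzes different preconditionings for improving the robustness of pure-pixel search algorithms. For the SPA algorithm, it gives a robustness analysis that covers approximate solutions, and it examines the robustness and efficiency of pre-whitening and SPA-based preconditioning.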

Comments 25 pages, 3 figures

详情

DOI: 10.1137/140994915
Journal ref: SIAM J. on Imaging Sciences 8 (2), pp. 1161-1186, 2015

AI中文摘要

在本文中，我们分析了不同预条件化方法以增强纯像素搜索算法的鲁棒性，这些算法用于盲超谱解混，并等价于近可分离的非负矩阵分解算法。我们的分析聚焦于 successive projection algorithm (SPA)，一种简单、高效且可证明鲁棒的纯像素算法。最近，Gillis和Vavasis（arXiv:1310.2273）提出了一种可证明鲁棒的预条件化方法，该方法需要求解一个半正定规划（SDP）以找到包含数据点的最小体积椭球。由于在高精度下求解SDP可能耗时，我们扩展了鲁棒性分析以适用于SDP的近似解，即目标函数值与最优值相差某些乘法因子的解。证明了高精度解对鲁棒性并不关键，这为更快的预条件化方法（例如基于一阶优化方法的）铺平了道路。这一贡献也使我们能够为另外两种预条件化方法提供鲁棒性分析。第一种是预白化，可以解释为同一SDP的最优解加上额外约束。我们分析了预白化的鲁棒性，以表征其在某些情况下与基于SDP的预条件化方法具有竞争力的情况。第二种基于SPA本身，可以解释为SDP松弛的最优解。它在多个合成数据集上与基于SDP的预条件化方法竞争。

英文摘要

In this paper, we analyze different preconditionings designed to enhance robustness of pure-pixel search algorithms, which are used for blind hyperspectral unmixing and which are equivalent to near-separable nonnegative matrix factorization algorithms. Our analysis focuses on the successive projection algorithm (SPA), a simple, efficient and provably robust algorithm in the pure-pixel algorithm class. Recently, a provably robust preconditioning was proposed by Gillis and Vavasis (arXiv:1310.2273) which requires the resolution of a semidefinite program (SDP) to find a data points-enclosing minimum volume ellipsoid. Since solving the SDP in high precisions can be time consuming, we generalize the robustness analysis to approximate solutions of the SDP, that is, solutions whose objective function values are some multiplicative factors away from the optimal value. It is shown that a high accuracy solution is not crucial for robustness, which paves the way for faster preconditionings (e.g., based on first-order optimization methods). This first contribution also allows us to provide a robustness analysis for two other preconditionings. The first one is pre-whitening, which can be interpreted as an optimal solution of the same SDP with additional constraints. We analyze robustness of pre-whitening which allows us to characterize situations in which it performs competitively with the SDP-based preconditioning. The second one is based on SPA itself and can be interpreted as an optimal solution of a relaxation of the SDP. It is extremely fast while competing with the SDP-based preconditioning on several synthetic data sets.
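
The successive projection algorithm (SPA) that these preconditionings are meant to strengthen can be sketched in a few lines. The following is a minimal illustration, assuming a data matrix whose columns are the pixels and a known number `r` of pure pixels. The function name `spa` and its arguments are hypothetical and not taken from the paper, and none of the preconditioning steps (SDP-based, pre-whitening, SPA-based) are reproduced here.

```python
import numpy as np

def spa(M, r):
    """Sketch of the successive projection algorithm (hypothetical helper).

    M : (m, n) array whose columns are data points (pixels).
    r : number of columns (pure pixels) to extract; assumed <= rank(M).
    Returns the list of selected column indices.
    """
    R = np.array(M, dtype=float)  # residual matrix, updated in place of M
    selected = []
    for _ in range(r):
        # Pick the column of the residual with the largest Euclidean norm.
        norms = np.sum(R ** 2, axis=0)
        j = int(np.argmax(norms))
        u = R[:, j] / np.linalg.norm(R[:, j])
        # Project every column onto the orthogonal complement of that column.
        R = R - np.outer(u, u @ R)
        selected.append(j)
    return selected
```

In the noiseless separable case (every column is a convex combination of `r` linearly independent pure columns), the selected indices are the pure columns; a preconditioning would be applied to `M` before this call.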


1406.4802 2026-06-04 math.NA cs.LG cs.NA 版本更新

Homotopy based algorithms for $\ell_0$-regularized least-squares

基于同伦的 $\ell_0$-正则化最小二乘算法

Charles Soussen, Jérôme Idier, Junbo Duan, David Brie

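AI summary: This paper proposes two heuristic search algorithms for the $\ell_0$-homotopy problem, which address regularization in sparse signal restoration through an adapted homotopy-path approach, and demonstrates their use on inverse problems.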

Comments 38 pages

详情

DOI: 10.1109/TSP.2015.2421476
Journal ref: IEEE Transactions on Signal Processing, vol. 63, no. 13, Jul. 2015, pp. 3301-3316

AI中文摘要

稀疏信号恢复通常被表述为最小化二次成本函数 $\|y-Ax\|_2^2$，其中 A 是字典，x 是未知的稀疏向量。众所周知，施加 $\ell_0$ 约束会导致 NP 难的最小化问题。凸松弛方法受到广泛关注，其中 $\ell_0$-范数被 $\ell_1$-范数替代。在许多高效的 $\ell_1$ 解决方案中，同伦算法最小化 $\|y-Ax\|_2^2+λ\|x\|_1$ 关于 x 对于连续的 $λ$ 的情况。它受到 $\ell_1$-正则化路径的分段正则性的启发，也称为同伦路径。在本文中，我们处理 $\|y-Ax\|_2^2+λ\|x\|_0$ 对于连续 $λ$ 的最小化问题，并提出两种启发式搜索算法用于 $\ell_0$-同伦。继续单最佳替换是扩展单最佳替换算法的前向-后向贪心策略，之前提出用于给定 $λ$ 的 $\ell_0$-最小化。$λ$ 值的自适应搜索受到 $\ell_1$-同伦的启发。$\ell_0$ 正则化路径下降是一种更复杂的算法，利用 $\ell_0$-正则化路径的结构特性，该路径对 $λ$ 是分段常数的。两种算法都对困难的逆问题进行了实证评估，涉及病态字典。最后，我们展示它们可以轻松地与常规的模型阶选择方法结合。

英文摘要

Sparse signal restoration is usually formulated as the minimization of a quadratic cost function $\|y-Ax\|_2^2$, where A is a dictionary and x is an unknown sparse vector. It is well-known that imposing an $\ell_0$ constraint leads to an NP-hard minimization problem. The convex relaxation approach has received considerable attention, where the $\ell_0$-norm is replaced by the $\ell_1$-norm. Among the many efficient $\ell_1$ solvers, the homotopy algorithm minimizes $\|y-Ax\|_2^2+λ\|x\|_1$ with respect to x for a continuum of $λ$'s. It is inspired by the piecewise regularity of the $\ell_1$-regularization path, also referred to as the homotopy path. In this paper, we address the minimization problem $\|y-Ax\|_2^2+λ\|x\|_0$ for a continuum of $λ$'s and propose two heuristic search algorithms for $\ell_0$-homotopy. Continuation Single Best Replacement is a forward-backward greedy strategy extending the Single Best Replacement algorithm, previously proposed for $\ell_0$-minimization at a given $λ$. The adaptive search of the $λ$-values is inspired by $\ell_1$-homotopy. $\ell_0$ Regularization Path Descent is a more complex algorithm exploiting the structural properties of the $\ell_0$-regularization path, which is piecewise constant with respect to $λ$. Both algorithms are empirically evaluated for difficult inverse problems involving ill-conditioned dictionaries. Finally, we show that they can be easily coupled with usual methods of model order selection.


1505.04123 2026-06-04 cs.LG cs.AI cs.NA math.NA math.OC 版本更新

Margins, Kernels and Non-linear Smoothed Perceptrons

边距、核与非线性平滑感知机

Aaditya Ramdas, Javier Peña

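AI summary: This paper studies the problem of finding a non-linear classification function in an RKHS, proposes an accelerated smoothed algorithm whose convergence behavior resembles that of the classical kernelized Perceptron, and proves a separation theorem for the case in which no classifier exists.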

Comments 17 pages, published in the proceedings of the International Conference on Machine Learning, 2014

详情

Journal ref: Ramdas, Aaditya, and Javier Pena. "Margins, kernels and non-linear smoothed perceptrons." Proceedings of the 31st International Conference on Machine Learning (ICML-14). 2014

AI中文摘要

我们关注在RKHS中寻找非线性分类函数的问题，从原问题和对偶问题两个角度出发，特别关注感知机和冯-诺依曼算法的推广。我们将问题转化为在RKHS中最大化正则化归一化硬边距(ρ)，并利用表示定理将其转换为与核的（归一化和带符号）Gram矩阵相关的马哈拉诺斯基点积/半范数。我们推导出一种加速平滑算法，具有收敛率为√(log n)/ρ的特性，给定n个可分离点。当不存在此类分类器时，我们证明了RKHS版本的戈尔丹分离定理，并重新解释了负边距。这使得我们能够为原对偶算法提供保证，该算法在存在可行原问题时，可在min{√n/|ρ|, √n/ε}次迭代中找到RKHS中的完美分离器，或在无可行原问题时提供一个对偶ε-不可行性证书。

英文摘要

We focus on the problem of finding a non-linear classification function that lies in a Reproducing Kernel Hilbert Space (RKHS) both from the primal point of view (finding a perfect separator when one exists) and the dual point of view (giving a certificate of non-existence), with special focus on generalizations of two classical schemes - the Perceptron (primal) and Von-Neumann (dual) algorithms. We cast our problem as one of maximizing the regularized normalized hard-margin ($ρ$) in an RKHS and use the Representer Theorem to rephrase it in terms of a Mahalanobis dot-product/semi-norm associated with the kernel's (normalized and signed) Gram matrix. We derive an accelerated smoothed algorithm with a convergence rate of $\tfrac{\sqrt {\log n}}ρ$ given $n$ separable points, which is strikingly similar to the classical kernelized Perceptron algorithm whose rate is $\tfrac1{ρ^2}$. When no such classifier exists, we prove a version of Gordan's separation theorem for RKHSs, and give a reinterpretation of negative margins. This allows us to give guarantees for a primal-dual algorithm that halts in $\min\{\tfrac{\sqrt n}{|ρ|}, \tfrac{\sqrt n}ε\}$ iterations with a perfect separator in the RKHS if the primal is feasible or a dual $ε$-certificate of near-infeasibility.
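
For orientation, the classical kernelized Perceptron that the abstract uses as its point of comparison can be sketched as follows. This is only the textbook baseline, not the accelerated smoothed algorithm of the paper; the function name `kernel_perceptron`, the precomputed Gram matrix `K`, and the `max_epochs` cap are assumptions made for this illustration.

```python
import numpy as np

def kernel_perceptron(K, y, max_epochs=100):
    """Textbook kernel Perceptron (illustrative baseline, hypothetical helper).

    K : (n, n) symmetric Gram matrix, K[i, j] = k(x_i, x_j).
    y : (n,) labels in {-1, +1}.
    Returns the mistake counts alpha; the classifier is
    f(x) = sum_i alpha[i] * y[i] * k(x_i, x).
    """
    n = len(y)
    alpha = np.zeros(n)
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(n):
            # Score of point i under the current dual representation.
            score = np.dot(alpha * y, K[:, i])
            if y[i] * score <= 0:
                alpha[i] += 1  # mistake-driven update
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified with positive margin
    return alpha
```

On separable data the loop stops after finitely many mistakes, which is the $\tfrac1{ρ^2}$ behavior mentioned above; the `max_epochs` cap only guards against non-separable inputs.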


1412.6095 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

Theoretical and Numerical Analysis of Approximate Dynamic Programming with Approximation Errors

近似动态规划中近似误差的理论与数值分析

Ali Heydari

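AI summary: This paper studies how errors in the iterations of approximate dynamic programming affect the final result, analyzes the convergence of value iteration for deterministic nonlinear optimal control problems, and derives sufficient conditions for stability together with an estimate of the region of attraction.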

Comments This study is the counterpart of another work of the author (arXiv:1412.5675) which was for value iterations with initial stabilizing guess (with overlaps on Theorem 1 and Lemma 1). As for the revision on this work, some steps of proofs are updated and an explanation about the approximation error is included. Initial submission date: 12/18/2014

详情

AI中文摘要

本文旨在回答近似动态规划（ADP）每次迭代中的近似误差如何影响最终结果的问题。研究了在考虑每次迭代中的误差影响下，确定性非线性最优控制问题中价值迭代方案的收敛性。通过已知的一般最优控制问题中的量和可验证的假设，获得了围绕最优解的有界性。此外，由于近似误差导致结果偏离最优性，推导了在有限次价值迭代后获得的结果所操作系统的稳定性充分条件，以及其吸引区域的估计。最后，通过轨道机动问题的实现过程验证了理论发展的假设，并应用充分条件以保证稳定性和近优性。

英文摘要

This study is aimed at answering the famous question of how the approximation errors at each iteration of Approximate Dynamic Programming (ADP) affect the quality of the final results considering the fact that errors at each iteration affect the next iteration. To this goal, convergence of Value Iteration scheme of ADP for deterministic nonlinear optimal control problems with undiscounted cost functions is investigated while considering the errors existing in approximating respective functions. The boundedness of the results around the optimal solution is obtained based on quantities which are known in a general optimal control problem and assumptions which are verifiable. Moreover, since the presence of the approximation errors leads to the deviation of the results from optimality, sufficient conditions for stability of the system operated by the result obtained after a finite number of value iterations, along with an estimation of its region of attraction, are derived in terms of a calculable upper bound of the control approximation error. Finally, the process of implementation of the method on an orbital maneuver problem is investigated through which the assumptions made in the theoretical developments are verified and the sufficient conditions are applied for guaranteeing stability and near optimality.


1312.7651 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Petuum: A New Platform for Distributed Machine Learning on Big Data

Petuum：一种用于大数据上分布式机器学习的新平台

Eric P. Xing, Qirong Ho, Wei Dai, Jin Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, Yaoliang Yu

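AI summary: This paper proposes a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale machine learning. It builds on the observation that many machine learning programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions, which enables efficient system design.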

Comments 15 pages, 10 figures, final version in KDD 2015 under the same title

详情

AI中文摘要

什么是系统化的方法，能够高效地将广泛先进的机器学习程序应用于工业级问题，使用大数据（高达数百亿参数）上的大数据（高达太字节或拍字节）？现代并行化策略采用细粒度操作和调度，超越经典批量同步处理范式，如MapReduce流行化，甚至专门的基于图的执行，依赖于机器学习程序的图表示。各种方法倾向于将系统和算法设计引向不同方向，难以找到适用于广泛机器学习程序的通用平台。我们提出一种通用框架，系统解决大规模机器学习中的数据和模型并行挑战，通过观察许多机器学习程序本质上是优化导向的，并允许容错、迭代收敛的算法解决方案。这为集成系统设计提供了独特机会，如受限误差网络同步和基于机器学习程序结构的动态调度。我们展示了这些系统设计相对于现代机器学习算法知名实现的有效性，使机器学习程序能够在较小的计算集群上以更少的时间和更大的模型规模运行。

英文摘要

What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized graph-based execution that relies on graph representations of ML programs. The variety of approaches tends to pull systems and algorithms design in different directions, and it remains difficult to find a universal platform applicable to a wide range of ML programs at scale. We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML, by observing that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions. This presents unique opportunities for an integrative system design, such as bounded-error network synchronization and dynamic scheduling based on ML program structure. We demonstrate the efficacy of these system designs versus well-known implementations of modern ML algorithms, allowing ML programs to run in much less time and at considerably larger model sizes, even on modestly-sized compute clusters.


1503.05214 2026-06-04 cs.DC cs.LG cs.NA math.NA 版本更新

Analysis of PCA Algorithms in Distributed Environments

分布式环境中PCA算法的分析

Tarek Elgamal, Mohamed Hefeeda

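AI summary: This paper analyzes the performance of PCA algorithms in distributed environments, compares their time and communication complexity, and discusses the scalability bottlenecks and suitable use cases of the different algorithms.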

1311.2854 2026-06-04 cs.LG cs.NA math.NA 版本更新

Spectral Clustering via the Power Method -- Provably

通过幂法进行谱聚类--可证明的

Christos Boutsidis, Alex Gittens, Prabhanjan Kambadur

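AI summary: This paper uses the power method to compute approximate eigenvectors for spectral clustering and proves that a small number of iterations suffices to obtain near-optimal partitionings.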

Comments ICML 2015, to appear

详情

AI中文摘要

通过幂法进行谱聚类--可证明的。谱聚类是数据挖掘和机器智能中最重要的算法之一；然而，其计算复杂性限制了其在真正大规模数据分析中的应用。谱聚类的计算瓶颈在于计算图表示数据的（归一化）拉普拉斯矩阵的几个顶部特征向量。一种加速计算这些特征向量的方法是使用数值线性代数文献中的“幂法”。尽管幂法已被经验性地用于加速谱聚类，但这种方法的理论基础，据我们所知，尚未被探索。本文提供了首次严谨的理论证明，认为少量的幂迭代足以通过幂法获得的近似特征向量获得近优划分。具体而言，我们证明在通过幂法获得的近似特征向量上求解k均值聚类问题，可以得到在最优特征向量上求解k均值问题的加法误差近似。

英文摘要

Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One way to speed up the computation of these eigenvectors is to use the "power method" from the numerical linear algebra literature. Although the power method has been empirically used to speed up spectral clustering, the theory behind this approach, to the best of our knowledge, remains unexplored. This paper provides the first such rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings using the approximate eigenvectors. Specifically, we prove that solving the $k$-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the $k$-means problem on the optimal eigenvectors.
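
The idea of replacing an exact eigendecomposition with a few power iterations can be sketched as below. This is a minimal illustration under stated assumptions: the iterations are applied to the normalized adjacency matrix $D^{-1/2} A D^{-1/2}$ with a random Gaussian start, and the helper name `power_method_embedding` is hypothetical. The exact matrix, number of iterations, and post-processing analyzed in the paper may differ.

```python
import numpy as np

def power_method_embedding(A, k, num_iters=10, seed=0):
    """Approximate the top-k eigenvector subspace by power iterations.

    A : (n, n) symmetric nonnegative adjacency matrix with positive degrees.
    k : number of clusters / eigenvectors wanted.
    Returns an (n, k) matrix with orthonormal columns whose rows can be
    passed to any k-means routine.
    """
    d = A.sum(axis=1)
    B = A / np.sqrt(np.outer(d, d))  # normalized adjacency D^{-1/2} A D^{-1/2}
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((A.shape[0], k))
    for _ in range(num_iters):
        Y = B @ Y  # one power iteration on the whole block
    Q, _ = np.linalg.qr(Y)  # orthonormal basis of the iterated block
    return Q
```

The rows of the returned matrix play the role of the spectral embedding; running $k$-means on them is the step whose approximation quality the abstract refers to.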


1505.02343 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Bayesian Sparse Tucker Models for Dimension Reduction and Tensor Completion

基于贝叶斯稀疏Tucker模型的降维与张量补全

Qibin Zhao, Liqing Zhang, Andrzej Cichocki

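AI summary: This paper proposes a probabilistic generative Tucker model that performs tensor dimension reduction and completion through structural sparsity over the multilinear latent space, automatically adapting model complexity and improving generalization performance.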

详情

AI中文摘要

Tucker分解是现代张量数据分析机器学习的核心，广泛应用于多向特征提取、压缩感知和张量补全。最具有挑战性的问题是确定模型复杂度（即多线性秩），尤其是在存在噪声和缺失数据时。此外，现有方法无法考虑潜在因子的不确定性信息，导致泛化性能低下。为了解决这些问题，我们提出了一类概率生成Tucker模型，用于张量分解与补全，具有结构稀疏性。为了利用结构稀疏建模，我们引入了两种组稀疏诱导先验，通过Laplace和学生t分布的分层表示，从而实现完全后验推断。对于模型学习，我们推导了所有模型（超）参数上的变分贝叶斯推断，并开发了基于多线性操作的高效可扩展算法。我们的方法可以自动适应模型复杂度，并通过模型证据的最大下界原理推断最优多线性秩。在合成、化学计量学和神经影像数据上的实验结果和比较显示，我们的模型在恢复多线性秩和缺失条目方面表现出色。

英文摘要

Tucker decomposition is the cornerstone of modern machine learning on tensorial data analysis, which has attracted considerable attention for multiway feature extraction, compressive sensing, and tensor completion. The most challenging problem is related to determination of model complexity (i.e., multilinear rank), especially when noise and missing data are present. In addition, existing methods cannot take into account uncertainty information of latent factors, resulting in low generalization performance. To address these issues, we present a class of probabilistic generative Tucker models for tensor decomposition and completion with structural sparsity over multilinear latent space. To exploit structural sparse modeling, we introduce two group sparsity inducing priors by hierarchical representation of Laplace and Student-t distributions, which facilitates full posterior inference. For model learning, we derived variational Bayesian inferences over all model (hyper)parameters, and developed efficient and scalable algorithms based on multilinear operations. Our methods can automatically adapt model complexity and infer an optimal multilinear rank by the principle of maximum lower bound of model evidence. Experimental results and comparisons on synthetic, chemometrics and neuroimaging data demonstrate remarkable performance of our models for recovering the ground truth of the multilinear rank and missing entries.


1505.00314 2026-06-04 cs.LG cs.SY eess.SY stat.ME 版本更新

Deconstructing Principal Component Analysis Using a Data Reconciliation Perspective

从数据协调视角解构主成分分析

Shankar Narasimhan, Nirav Bhatt

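AI summary: This paper examines principal component analysis from a data reconciliation perspective, reveals the close relationship between the two techniques, builds a unified framework, and shows how they can be used together to process data.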

详情

DOI: 10.1016/j.compchemeng.2015.03.016
Journal ref: Computers and Chemical Engineering 77 (2015) 74-84

AI中文摘要

数据协调（DR）和主成分分析（PCA）是过程工业中两种流行的数据分析技术。数据协调用于从错误测量中获得准确且一致的变量和参数估计。PCA主要用于减少高维数据的维度并作为去噪预处理技术。这两种技术曾被独立开发和部署。本文的主要目的是阐明这两种看似不同的技术之间的密切关系。这导致了PCA和DR的统一框架。进一步，我们展示了如何将这两种技术以协作和一致的方式应用于数据处理。该框架已扩展以处理部分测量系统，并纳入关于过程模型的部分知识。

英文摘要

Data reconciliation (DR) and Principal Component Analysis (PCA) are two popular data analysis techniques in process industries. Data reconciliation is used to obtain accurate and consistent estimates of variables and parameters from erroneous measurements. PCA is primarily used as a method for reducing the dimensionality of high dimensional data and as a preprocessing technique for denoising measurements. These techniques have been developed and deployed independently of each other. The primary purpose of this article is to elucidate the close relationship between these two seemingly disparate techniques. This leads to a unified framework for applying PCA and DR. Further, we show how the two techniques can be deployed together in a collaborative and consistent manner to process data. The framework has been extended to deal with partially measured systems and to incorporate partial knowledge available about the process model.


1402.5284 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality

关于通过Łojasiewicz不等式在低秩矩阵流形上投影线搜索方法收敛性的结果

Reinhold Schneider, André Uschmajew

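AI summary: Using the Łojasiewicz inequality, this paper studies the convergence of projected line-search methods on the variety of low-rank matrices, analyzes how these methods avoid the difficulties caused by nonclosedness and unbounded curvature, and gives a theoretical justification for assuming that critical points lie on the smooth part of the variety.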

详情

AI中文摘要

本文旨在推导在实代数流形$\mathcal{M}_{\le k}$上投影线搜索方法的收敛性结果，该流形由实$m \times n$矩阵构成，其秩至多为$k$。此类方法扩展了用于光滑流形$\mathcal{M}_k$上的黎曼优化方法，通过在切锥中沿梯度相关方向取步长，然后投影回$\mathcal{M}_{\le k}$。考虑此类方法可避免$\mathcal{M}_k$的非闭合性和无界曲率带来的困难。基于实解析函数的点收敛性，利用投影反梯度到切锥的Łojasiewicz不等式获得收敛性结果。如果极限点位于$\mathcal{M}_{\le k}$的光滑部分，即$\mathcal{M}_k$中，则结果退化为已知结果，但能通过极限点在光滑流形上获得渐近收敛速率估计，无需先验曲率界。同时，可以给出假设临界点位于$\mathcal{M}_k$的理论依据：若$X$是$f$在$\mathcal{M}_{\le k}$上的临界点，则$X$的秩为$k$或$\nabla f(X) = 0$。

英文摘要

The aim of this paper is to derive convergence results for projected line-search methods on the real-algebraic variety $\mathcal{M}_{\le k}$ of real $m \times n$ matrices of rank at most $k$. Such methods extend Riemannian optimization methods, which are successfully used on the smooth manifold $\mathcal{M}_k$ of rank-$k$ matrices, to its closure by taking steps along gradient-related directions in the tangent cone, and afterwards projecting back to $\mathcal{M}_{\le k}$. Considering such a method circumvents the difficulties which arise from the nonclosedness and the unbounded curvature of $\mathcal{M}_k$. The pointwise convergence is obtained for real-analytic functions on the basis of a Łojasiewicz inequality for the projection of the antigradient to the tangent cone. If the derived limit point lies on the smooth part of $\mathcal{M}_{\le k}$, i.e. in $\mathcal{M}_k$, this boils down to more or less known results, but with the benefit that asymptotic convergence rate estimates (for specific step-sizes) can be obtained without an a priori curvature bound, simply from the fact that the limit lies on a smooth manifold. At the same time, one can give a convincing justification for assuming critical points to lie in $\mathcal{M}_k$: if $X$ is a critical point of $f$ on $\mathcal{M}_{\le k}$, then either $X$ has rank $k$, or $\nabla f(X) = 0$.


1504.05517 2026-06-04 cs.NI cs.LG cs.SY eess.SY 版本更新

Online Learning Algorithm for Time Series Forecasting Suitable for Low Cost Wireless Sensor Networks Nodes

适用于低成本无线传感器网络节点的时间序列预测在线学习算法

Juan Pardo, Francisco Zamora-Martinez, Paloma Botella-Rocamora

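AI summary: This paper proposes an online learning algorithm based on backpropagation for time series forecasting on low-cost systems-on-chip, with the aim of improving the energy efficiency of HVAC systems in smart homes.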

Comments 28 pages, Published 21 April 2015 at MDPI's journal "Sensors"

详情

DOI: 10.3390/s150409277
Journal ref: Sensors 2015, 15(4), 9277-9304

Decentralized Learning for Wireless Communications and Networking

Georgios B. Giannakis, Qing Ling, Gonzalo Mateos, Ioannis D. Schizas, Hao Zhu

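AI summary: This paper presents a decentralized learning algorithm for in-network processing of graph-valued data, obtaining distributed optimization through the alternating direction method of multipliers, with case studies on wireless communication and networking tasks.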

Comments Contributed chapter to appear in Splitting Methods in Communication and Imaging, Science and Engineering, R. Glowinski, S. Osher, and W. Yin, Editors, New York, Springer, 2015

1503.08639 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Sparse plus low-rank autoregressive identification in neuroimaging time series

神经影像时间序列中的稀疏加低秩自回归识别

Raphaël Liégeois, Bamdev Mishra, Mattia Zorzi, Rodolphe Sepulchre

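AI summary: This paper uses the alternating direction method of multipliers to solve the multivariate autoregressive sparse plus low-rank graphical model problem, scales the approach to the size of neuroimaging data, and highlights the ability of the low-rank structure to capture spatio-temporal structure.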

Comments 6 pages paper submitted to CDC 2015

1503.06468 2026-06-04 cs.LG cs.CR cs.SY eess.SY 版本更新

Machine Learning Methods for Attack Detection in the Smart Grid

智能电网中攻击检测的机器学习方法

Mete Ozay, Inaki Esnaola, Fatos T. Yarman Vural, Sanjeev R. Kulkarni, H. Vincent Poor

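AI summary: This paper proposes machine learning methods for detecting attacks in the smart grid. It combines batch and online learning algorithms with feature-level and decision-level fusion and analyzes the statistical and geometric properties of attack vectors to improve detection performance.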

Comments 14 pages, 11 Figures

详情

DOI: 10.1109/TNNLS.2015.2404803
Journal ref: A version of the manuscript was published in IEEE Transactions on Neural Networks and Learning Systems, 19 March 2015

AI中文摘要

智能电网中的攻击检测被提出为统计学习问题，针对不同的攻击场景，测量数据以批量或在线方式观测。在此方法中，机器学习算法用于将测量数据分类为安全或受攻击。提供了一个攻击检测框架，以利用系统中任何可用的先验知识，并克服所提出方法中稀疏结构带来的限制。已知的批量和在线学习算法（监督和半监督）结合决策和特征级融合，用于建模攻击检测问题。分析攻击场景中使用的攻击向量的统计和几何特性与学习算法之间的关系，以利用统计学习方法检测不可观测的攻击。所提出算法在各种IEEE测试系统上进行了检验。实验分析显示，机器学习算法在攻击检测性能上优于采用状态向量估计方法的攻击检测算法。

英文摘要

Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semi-supervised) are employed with decision and feature level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than the attack detection algorithms which employ state vector estimation methods in the proposed attack detection framework.


1503.06394 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Large-scale Log-determinant Computation through Stochastic Chebyshev Expansions

通过随机Chebyshev展开实现大规模对数行列式计算

Insu Han, Dmitry Malioutov, Jinwoo Shin

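AI summary: This paper proposes a linear-time randomized algorithm that combines stochastic trace approximation with Chebyshev polynomial expansions to efficiently compute log-determinants of large-scale positive definite and general non-singular matrices, with high accuracy and speed.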

详情

AI中文摘要

对数行列式在机器学习中广泛应用，如高斯图模型、高斯过程模型、离散图模型的配分函数、最小体积椭球、度量学习和核学习。对数行列式计算通常涉及Cholesky分解，其计算复杂度为变量数的三次方，不适用于大规模应用。本文提出一种线性时间随机算法，通过随机迹近似（Hutchinson方法）和Chebyshev多项式展开，结合高效的矩阵向量乘法，实现对大规模正定和一般非奇异矩阵的对数行列式近似计算。我们建立了依赖输入矩阵条件数的严格加法和乘法近似误差界。实验表明，该算法在速度上比Cholesky分解和Schur补快多个数量级，并能计算包含数百万变量的矩阵的对数行列式。

英文摘要

Logarithms of determinants of large positive definite matrices appear ubiquitously in machine learning applications including Gaussian graphical and Gaussian process models, partition functions of discrete graphical models, minimum-volume ellipsoids, metric learning and kernel learning. Log-determinant computation involves the Cholesky decomposition at a cost cubic in the number of variables, i.e., the matrix dimension, which makes it prohibitive for large-scale applications. We propose a linear-time randomized algorithm to approximate log-determinants for very large-scale positive definite and general non-singular matrices using a stochastic trace approximation, called the Hutchinson method, coupled with Chebyshev polynomial expansions that both rely on efficient matrix-vector multiplications. We establish rigorous additive and multiplicative approximation error bounds depending on the condition number of the input matrix. In our experiments, the proposed algorithm provides very high accuracy solutions orders of magnitude faster than the Cholesky decomposition and Schur completion, and enables us to compute log-determinants of matrices involving tens of millions of variables.
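
The combination of the Hutchinson trace estimator with a Chebyshev expansion of the logarithm can be sketched as follows, using $\log\det A = \operatorname{tr}(\log A)$. This is a minimal illustration for a symmetric positive definite matrix, assuming that bounds `lam_min` and `lam_max` on its eigenvalues are known. The function name, the default degree and probe count, and the use of `numpy.polynomial.chebyshev.chebinterpolate` to obtain the coefficients are choices made here, not details taken from the paper.

```python
import numpy as np
from numpy.polynomial.chebyshev import chebinterpolate

def logdet_chebyshev(A, lam_min, lam_max, degree=30, num_probes=50, seed=0):
    """Estimate log det(A) for symmetric positive definite A (sketch).

    Assumes all eigenvalues of A lie in [lam_min, lam_max] with lam_min > 0.
    Only matrix-vector products with A are used inside the loop.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    # Affine map x = scale * lam + shift sending [lam_min, lam_max] to [-1, 1].
    scale = 2.0 / (lam_max - lam_min)
    shift = -(lam_max + lam_min) / (lam_max - lam_min)
    # Chebyshev coefficients of x -> log(lam(x)) on [-1, 1].
    coeffs = chebinterpolate(lambda x: np.log((x - shift) / scale), degree)

    def matvec(v):
        # Product with the rescaled matrix scale * A + shift * I.
        return scale * (A @ v) + shift * v

    total = 0.0
    for _ in range(num_probes):
        v = rng.choice([-1.0, 1.0], size=n)  # Rademacher probe vector
        t_prev, t_curr = v, matvec(v)        # T_0(B) v and T_1(B) v
        acc = coeffs[0] * (v @ t_prev) + coeffs[1] * (v @ t_curr)
        for c in coeffs[2:]:
            # Three-term recurrence T_{j+1}(B) v = 2 B T_j(B) v - T_{j-1}(B) v.
            t_prev, t_curr = t_curr, 2.0 * matvec(t_curr) - t_prev
            acc += c * (v @ t_curr)
        total += acc
    return total / num_probes
```

The cost is `num_probes * degree` matrix-vector products, which is the source of the linear-time behavior for sparse matrices; the accuracy depends on the condition number through the polynomial degree, in line with the error bounds mentioned in the abstract.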


1310.0807 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.ST stat.ML stat.TH 版本更新

Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming

通过凸优化从二次采样中获得精确且稳定的协方差估计

Yuxin Chen, Yuejie Chi, Andrea Goldsmith

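AI summary: This paper studies convex programming methods for extracting the covariance structure of high-dimensional data from quadratic sampling, considers structural assumptions such as low rank, Toeplitz low rank, and sparsity, and shows that accurate covariance estimation is achievable in the absence of noise.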

Comments accepted to IEEE Transactions on Information Theory, 2015

详情

AI中文摘要

高维数据的统计推断和信息处理往往需要高效且准确的二阶统计量估计。在数据快速变化、处理能力和存储有限的情况下，提取协方差结构需要单次数据遍历和少量存储测量。本文探讨了一种二次（或秩一）测量模型，该模型在采样过程中具有最小的内存需求和低计算复杂性，并在保持各种低维协方差结构方面被证明是最优的。具体而言，研究了四种流行的协方差矩阵结构假设，即低秩、Toeplitz低秩、稀疏性以及联合秩一和稀疏结构，通过针对相应结构的凸松弛范式实现恢复。所提出的二次采样框架具有多种潜在应用，包括流数据处理、高频无线通信、相空间断层扫描和光学相位恢复，以及非相干子空间检测。我们的方法在无噪声情况下，只要测量数量超过信息论极限即可实现普遍准确的协方差估计。我们还展示了该方法在噪声和不完美结构假设下的鲁棒性。我们的分析基于一种新的概念，称为混合范数限制等距性质（RIP-ℓ₂/ℓ₁），以及传统的RIP-ℓ₂/ℓ₂用于近各向同性和有界测量。此外，我们的结果在使用PhaseLift进行相位恢复（包括密集和稀疏信号）的已知最佳保证方面，采用了一种显著更简单的方法。

英文摘要

Statistical inference and information processing of high-dimensional data often require efficient and accurate estimation of their second-order statistics. With rapidly changing data, limited processing power and storage at the acquisition devices, it is desirable to extract the covariance structure from a single pass over the data and a small number of stored measurements. In this paper, we explore a quadratic (or rank-one) measurement model which imposes minimal memory requirements and low computational complexity during the sampling process, and is shown to be optimal in preserving various low-dimensional covariance structures. Specifically, four popular structural assumptions of covariance matrices, namely low rank, Toeplitz low rank, sparsity, jointly rank-one and sparse structure, are investigated, while recovery is achieved via convex relaxation paradigms for the respective structure. The proposed quadratic sampling framework has a variety of potential applications including streaming data processing, high-frequency wireless communication, phase space tomography and phase retrieval in optics, and non-coherent subspace detection. Our method admits universally accurate covariance estimation in the absence of noise, as soon as the number of measurements exceeds the information theoretic limits. We also demonstrate the robustness of this approach against noise and imperfect structural assumptions. Our analysis is established upon a novel notion called the mixed-norm restricted isometry property (RIP-$\ell_{2}/\ell_{1}$), as well as the conventional RIP-$\ell_{2}/\ell_{2}$ for near-isotropic and bounded measurements. In addition, our results improve upon the best-known phase retrieval (including both dense and sparse signals) guarantees using PhaseLift with a significantly simpler approach.


1503.02828 2026-06-04 cs.LG cs.NA math.NA 版本更新

Scalable Nuclear-norm Minimization by Subspace Pursuit Proximal Riemannian Gradient

通过子空间追求算法实现可扩展的核范数最小化

Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi

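AI summary: This paper proposes a subspace pursuit proximal Riemannian gradient method for efficiently solving nuclear-norm regularized problems. It avoids large-rank SVDs and improves performance on tasks such as matrix completion and subspace clustering.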

1503.03903 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA stat.ML 版本更新

Approximating Sparse PCA from Incomplete Data

从不完整数据近似稀疏PCA

Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail

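AI summary: This paper studies how to recover the sparse principal components of a data matrix from a sketch formed from a small number of its elements. It proves that a near-optimal solution is obtained when the sketch is close to the original matrix, proposes a sparse PCA algorithm, demonstrates its effectiveness on data from several domains, and improves running time.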

1503.03355 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.AP 版本更新

Automatic Unsupervised Tensor Mining with Quality Assessment

自动无监督张量挖掘与质量评估

Evangelos E. Papalexakis

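AI summary: This paper proposes the AutoTen algorithm, which achieves automatic unsupervised tensor mining through improved heuristics, and validates its performance on synthetic and real data, offering a new approach to automated tensor mining.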

1503.01793 2026-06-04 cs.LO cs.GT cs.LG cs.SY eess.SY 版本更新

Correct-by-synthesis reinforcement learning with temporal logic constraints

通过时序逻辑约束的正确性合成强化学习

Min Wen, Ruediger Ehlers, Ufuk Topcu

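AI summary: This paper proposes a correct-by-synthesis reinforcement learning method with temporal logic constraints. It combines a maximally permissive strategy with the maximin-Q learning algorithm so that the system satisfies its specification while optimizing an unknown performance criterion.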

Comments 8 pages, 3 figures, 2 tables, submitted to IROS 2015

详情

AI中文摘要

我们考虑了一个问题，即合成反应控制器，该控制器在与未受控环境交互时，优化一些事先未知的性能标准，同时确保系统满足给定的时序逻辑规范。我们将问题分解为两个子问题。首先，我们提取一个（最大许可）策略，该策略编码了系统对对抗性环境的多种（可能全部）反应方式，以满足规范。然后，我们将事先未知的性能标准量化为（仍然未知）的奖励函数，并在允许的运行范围内通过所谓的maximin-Q学习算法计算系统的最优策略。我们为时序逻辑规范的一个片段建立了这种两步技术的正确性和最优性。对于超过该片段的规范，正确性仍可保持，但学习的策略可能不最优。我们提出了解决整体问题的算法，并在一组机器人运动规划示例上展示了其使用和计算要求。

英文摘要

We consider the problem of synthesizing reactive controllers that optimize some a priori unknown performance criterion while interacting with an uncontrolled environment such that the system satisfies a given temporal logic specification. We decouple the problem into two subproblems. First, we extract a (maximally) permissive strategy for the system, which encodes multiple (possibly all) ways in which the system can react to the adversarial environment and satisfy the specifications. Then, we quantify the a priori unknown performance criterion as a (still unknown) reward function and compute an optimal strategy for the system within the operating envelope allowed by the permissive strategy by using the so-called maximin-Q learning algorithm. We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications. For specifications beyond this fragment, correctness can still be preserved, but the learned strategy may be sub-optimal. We present an algorithm for the overall problem, and demonstrate its use and computational requirements on a set of robot motion planning examples.


1402.5180 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-$1$ Updates

通过交替秩-1更新保证非正交张量分解

Animashree Anandkumar, Rong Ge, Majid Janzamin

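AI summary: This paper provides local and global convergence guarantees for CP tensor decomposition via alternating rank-1 updates adapted to asymmetric tensors, and shows that overcomplete decompositions can be recovered under specific rank conditions.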

Comments We have added an additional sub-algorithm to remove the (approximate) residual error left after the tensor power iteration

详情

AI中文摘要

在本文中，我们为恢复CP（Candecomp/Parafac）张量分解提供了局部和全局收敛性保证。所提出算法的主要步骤是一个简单的交替秩-1更新，这是针对非对称张量的张量幂迭代的交替版本。对于秩为k的第三阶张量，在d维空间中，当k=o(d^{1.5})且张量组件不相干时，建立了局部收敛性保证，从而可以恢复过完备张量分解。我们还通过简单的初始化过程，强化了结果，通过使用随机张量切片的顶部奇异向量进行初始化，从而在更严格的秩条件下（k ≤ βd，对于任意常数β>1）实现了全局收敛性保证。此外，还提供了p阶张量的近似局部收敛性保证，条件为k=o(d^{p/2})。这些保证还包括在存在噪声张量时的紧扰动分析。

英文摘要

In this paper, we provide local and global convergence guarantees for recovering CP (Candecomp/Parafac) tensor decomposition. The main step of the proposed algorithm is a simple alternating rank-$1$ update which is the alternating version of the tensor power iteration adapted for asymmetric tensors. Local convergence guarantees are established for third order tensors of rank $k$ in $d$ dimensions, when $k=o \bigl( d^{1.5} \bigr)$ and the tensor components are incoherent. Thus, we can recover overcomplete tensor decomposition. We also strengthen the results to global convergence guarantees under stricter rank condition $k \le βd$ (for arbitrary constant $β> 1$) through a simple initialization procedure where the algorithm is initialized by top singular vectors of random tensor slices. Furthermore, the approximate local convergence guarantees for $p$-th order tensors are also provided under rank condition $k=o \bigl( d^{p/2} \bigr)$. The guarantees also include tight perturbation analysis given noisy tensor.
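
The core alternating rank-$1$ update for a third-order asymmetric tensor can be sketched as follows. This is a minimal illustration only: the helper name `alternating_rank1` is hypothetical, the factors are updated one after another using the most recent values (an ordering chosen here for simplicity), and the initialization from singular vectors of random slices, the deflation over multiple components, and the residual-removal step described in the abstract and comments are not reproduced.

```python
import numpy as np

def alternating_rank1(T, num_iters=20, seed=0):
    """Alternating rank-1 power updates on a third-order tensor (sketch).

    T : (d1, d2, d3) array.
    Returns (weight, a, b, c) with unit vectors a, b, c such that
    weight * a (x) b (x) c approximates one rank-1 component of T.
    """
    rng = np.random.default_rng(seed)
    _, d2, d3 = T.shape
    b = rng.standard_normal(d2)
    b /= np.linalg.norm(b)
    c = rng.standard_normal(d3)
    c /= np.linalg.norm(c)
    for _ in range(num_iters):
        # Contract T along two modes and normalize to update the third factor.
        a = np.einsum('ijk,j,k->i', T, b, c)
        a /= np.linalg.norm(a)
        b = np.einsum('ijk,i,k->j', T, a, c)
        b /= np.linalg.norm(b)
        c = np.einsum('ijk,i,j->k', T, a, b)
        c /= np.linalg.norm(c)
    weight = np.einsum('ijk,i,j,k->', T, a, b, c)
    return weight, a, b, c
```

For an exactly rank-1 input the factors are recovered up to sign after the first sweep; the convergence guarantees in the abstract concern the much harder incoherent, possibly overcomplete setting.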

1502.04689 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Exact tensor completion using t-SVD

利用t-SVD进行精确张量补全

Zemin Zhang, Shuchin Aeron

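AI summary: This paper proposes a t-SVD-based tensor completion method that minimizes the tensor nuclear norm by convex optimization, guaranteeing recovery with high probability and showing order-wise optimality under random sampling.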

Comments 16 pages, 5 figures, 2 tables

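AI abstract

This paper addresses the completion of multidimensional arrays (tensors) from limited sampling. The approach builds on the recently proposed tensor singular value decomposition (t-SVD), which yields a notion of tensor rank, the tensor tubal rank, with optimality properties similar to those of the matrix rank derived from the SVD. As shown in [2], some multidimensional data, such as panning video sequences, have low tensor tubal rank, and we consider completing such data under random sampling of the data cube. We show that solving a convex optimization problem that minimizes the tensor nuclear norm, the convex relaxation of the tensor tubal rank, guarantees recovery with overwhelming probability once the number of observed samples is proportional to the degrees of freedom in the t-SVD. In this sense the results are order-wise optimal. The conditions for this result closely resemble the incoherence conditions for matrix completion, although incoherence is defined here in the algebraic framework of the t-SVD. We also show the algorithm's performance on real data sets and compare it with other methods based on tensor flattening and Tucker decomposition.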

English abstract

In this paper we focus on the problem of completion of multidimensional arrays (also referred to as tensors) from limited sampling. Our approach is based on a recently proposed tensor-Singular Value Decomposition (t-SVD) [1]. Using this factorization one can derive a notion of tensor rank, referred to as the tensor tubal rank, which has optimality properties similar to those of matrix rank derived from SVD. As shown in [2], some multidimensional data, such as panning video sequences, exhibit low tensor tubal rank, and we look at the problem of completing such data under random sampling of the data cube. We show that by solving a convex optimization problem, which minimizes the tensor nuclear norm obtained as the convex relaxation of tensor tubal rank, one can guarantee recovery with overwhelming probability as long as samples in proportion to the degrees of freedom in t-SVD are observed. In this sense our results are order-wise optimal. The conditions under which this result holds are very similar to the incoherency conditions for matrix completion, although we define incoherency under the algebraic set-up of t-SVD. We show the performance of the algorithm on some real data sets and compare it with other existing approaches based on tensor flattening and Tucker decomposition.

1309.2168 2026-06-04 math.OC cs.LG cs.NA math.NA version update

Large-scale optimization with the primal-dual column generation method

大规模优化中的对偶-原问题列生成方法

Jacek Gondzio, Pablo González-Brevis, Pedro Munari

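AI summary: This paper studies how the primal-dual column generation method performs on large-scale convex optimization problems, using applications in data analysis, decision-making under uncertainty, and telecommunication networks to show that it is competitive in iteration count and CPU time.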

Comments 28 pages, 1 figure, minor revision, scaled CPU times

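AI abstract

The primal-dual column generation method (PDCGM) is a general-purpose column generation technique that relies on a primal-dual interior point method to solve the restricted master problems. This interior point variant yields suboptimal, well-centered dual solutions, which naturally stabilizes column generation. Compared with standard column generation, PDCGM typically needs fewer oracle calls and less CPU time, but those results were obtained on relatively small problems arising from linear relaxations of combinatorial applications. This paper examines how PDCGM behaves on large-scale convex optimization problems, with applications in data analysis, decision-making under uncertainty, and telecommunication networks. The numerical experiments use publicly available benchmark instances to compare PDCGM with recent results for other methods in the literature. The analysis indicates that PDCGM remains competitive on large-scale problems and offers an attractive alternative to specialized methods.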

English abstract

The primal-dual column generation method (PDCGM) is a general-purpose column generation technique that relies on the primal-dual interior point method to solve the restricted master problems. The use of this interior point method variant allows to obtain suboptimal and well-centered dual solutions which naturally stabilizes the column generation. As recently presented in the literature, reductions in the number of calls to the oracle and in the CPU times are typically observed when compared to the standard column generation, which relies on extreme optimal dual solutions. However, these results are based on relatively small problems obtained from linear relaxations of combinatorial applications. In this paper, we investigate the behaviour of the PDCGM in a broader context, namely when solving large-scale convex optimization problems. We have selected applications that arise in important real-life contexts such as data analysis (multiple kernel learning problem), decision-making under uncertainty (two-stage stochastic programming problems) and telecommunication and transportation networks (multicommodity network flow problem). In the numerical experiments, we use publicly available benchmark instances to compare the performance of the PDCGM against recent results for different methods presented in the literature, which were the best available results to date. The analysis of these results suggests that the PDCGM offers an attractive alternative over specialized methods since it remains competitive in terms of number of iterations and CPU times even for large-scale optimization problems.

1501.05740 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

Bayesian Learning for Low-Rank matrix reconstruction

基于贝叶斯学习的低秩矩阵重建

Martin Sundin, Cristian R. Rojas, Magnus Jansson, Saikat Chatterjee

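AI summary: This paper proposes Bayesian learning methods based on latent variable models for completing and reconstructing low-rank matrices from linear measurements; model parameters are learned by evidence approximation and expectation maximization, and reconstruction is demonstrated when the rank and noise power are unknown.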

Comments Submitted to IEEE Transactions on Signal Processing

1501.03975 2026-06-04 cs.NE cs.LG cs.SY eess.SY version update

Stochastic Gradient Based Extreme Learning Machines For Online Learning of Advanced Combustion Engines

基于随机梯度的极端学习机用于先进燃烧发动机的在线学习

Vijay Manikandan Janakiraman, XuanLong Nguyen, Dennis Assanis

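AI summary: This paper proposes SG-ELM, a stochastic-gradient-based online learning algorithm for identifying nonlinear dynamic systems. A Lyapunov argument establishes asymptotic stability of the estimation error and stability of the estimated parameters, and the algorithm lowers computational demand. It is applied to system identification and operating-envelope identification for an HCCI engine.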

Comments This paper was written as an extract from my PhD thesis (July 2013) and so references may not be up to date as of this submission (Jan 2015). The article is in review and contains 10 figures, 35 references

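AI abstract

This paper proposes an online learning algorithm for extreme learning machines based on stochastic gradients (SG-ELM). A stability criterion constructed with a Lyapunov approach proves asymptotic stability of the estimation error and stability of the estimated parameters, making the algorithm suitable for identifying nonlinear dynamic systems. Compared with OS-ELM, which is based on recursive least squares, SG-ELM guarantees stability while reducing computational demand. To assess the algorithm in a real-world setting, an advanced combustion engine identification problem is considered. The algorithm is applied to two case studies: online regression learning for an HCCI engine system, and online classification learning with class imbalance for identifying the dynamic operating envelope of the HCCI engine. The results indicate that SG-ELM matches the accuracy of state-of-the-art methods while adding stability and reducing computational cost.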

English abstract

In this article, a stochastic gradient based online learning algorithm for Extreme Learning Machines (ELM) is developed (SG-ELM). A stability criterion based on Lyapunov approach is used to prove both asymptotic stability of estimation error and stability in the estimated parameters suitable for identification of nonlinear dynamic systems. The developed algorithm not only guarantees stability, but also reduces the computational demand compared to the OS-ELM approach based on recursive least squares. In order to demonstrate the effectiveness of the algorithm on a real-world scenario, an advanced combustion engine identification problem is considered. The algorithm is applied to two case studies: An online regression learning for system identification of a Homogeneous Charge Compression Ignition (HCCI) Engine and an online classification learning (with class imbalance) for identifying the dynamic operating envelope of the HCCI Engine. The results indicate that the accuracy of the proposed SG-ELM is comparable to that of the state-of-the-art but adds stability and a reduction in computational effort.
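
The idea of a stochastic gradient update for an extreme learning machine can be sketched as below: the hidden layer is random and fixed, and only the output weights are adjusted one sample at a time. This is an illustrative sketch under stated assumptions, not the paper's SG-ELM: the class name `SGELM`, the tanh activation, the uniform weight initialization, and the constant step size are all assumptions, and the Lyapunov-based stability conditions from the paper are not reproduced.

```python
import math
import random


class SGELM:
    """Single-output ELM whose output weights are trained by per-sample
    stochastic gradient steps on the squared prediction error."""

    def __init__(self, n_inputs, n_hidden, step=0.05, seed=0):
        rng = random.Random(seed)
        # Random hidden-layer weights and biases stay fixed after creation.
        self.W = [[rng.uniform(-1, 1) for _ in range(n_inputs)]
                  for _ in range(n_hidden)]
        self.bias = [rng.uniform(-1, 1) for _ in range(n_hidden)]
        self.beta = [0.0] * n_hidden  # output weights, the only learned part
        self.step = step

    def hidden(self, x):
        """Hidden-layer activations for input vector x."""
        return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                for row, b in zip(self.W, self.bias)]

    def predict(self, x):
        return sum(bj * hj for bj, hj in zip(self.beta, self.hidden(x)))

    def update(self, x, y):
        """One gradient step; returns the prediction error before the step."""
        h = self.hidden(x)
        error = y - sum(bj * hj for bj, hj in zip(self.beta, h))
        self.beta = [bj + self.step * error * hj
                     for bj, hj in zip(self.beta, h)]
        return error


model = SGELM(n_inputs=1, n_hidden=10)
first_error = model.update([0.3], 1.0)
```

Each update costs time linear in the number of hidden units, whereas a recursive least squares update such as the one in OS-ELM maintains a matrix that is quadratic in that number, which is consistent with the reduced computational demand noted in the abstract.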

1310.5715 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC stat.ML version update

Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm

随机梯度下降、加权采样与随机化Kaczmarz算法

Deanna Needell, Nathan Srebro, Rachel Ward

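AI summary: This paper improves the linear convergence guarantee for stochastic gradient descent on smooth, strongly convex objectives from a quadratic to a linear dependence on the condition number, examines how weighted sampling affects convergence, and connects the randomized Kaczmarz algorithm to SGD, proving its exponential convergence to the solution of a weighted least squares problem.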

Comments 22 pages, 6 figures

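AI abstract

We obtain an improved finite-sample guarantee for stochastic gradient descent on smooth, strongly convex objectives, improving the dependence of the linear convergence rate on the conditioning from quadratic, $(L/μ)^2$ (where $L$ is a bound on the smoothness and $μ$ a bound on the strong convexity), to linear in $L/μ$. We also show how reweighting the sampling distribution (importance sampling) further improves convergence and yields a linear dependence on the average smoothness, dominating previous results. We discuss importance sampling for SGD in other scenarios as well. The results rest on a connection between SGD and the randomized Kaczmarz algorithm, which lets ideas move between the two bodies of literature. In particular, we recast the randomized Kaczmarz algorithm as an instance of SGD and apply our results to prove its exponential convergence, though to the solution of a weighted least squares problem rather than the original one. We then present a modified Kaczmarz algorithm with partially biased sampling that converges to the original least squares solution at the same exponential rate.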

English abstract

We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving from a quadratic dependence on the conditioning $(L/μ)^2$ (where $L$ is a bound on the smoothness and $μ$ on the strong convexity) to a linear dependence on $L/μ$. Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) is necessary in order to further improve convergence, and obtain a linear dependence in the average smoothness, dominating previous results. We also discuss importance sampling for SGD more broadly and show how it can improve convergence also in other scenarios. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods. In particular, we recast the randomized Kaczmarz algorithm as an instance of SGD, and apply our results to prove its exponential convergence, but to the solution of a weighted least squares problem rather than the original least squares problem. We then present a modified Kaczmarz algorithm with partially biased sampling which does converge to the original least squares solution with the same exponential convergence rate.
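
The connection between SGD and the randomized Kaczmarz algorithm can be made concrete with a short sketch. Each iteration below picks row i with probability proportional to its squared norm and projects the iterate onto the hyperplane of that equation, which can be read as an importance-weighted stochastic gradient step on the least squares objective. The function name and the example system are hypothetical, and the sketch covers only the classical fully weighted sampling, not the partially biased variant proposed in the paper.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Approximately solve a consistent linear system A x = b.

    A is a list of rows, b a list of right-hand sides. Rows are sampled
    with probability proportional to their squared Euclidean norm.
    """
    rng = random.Random(seed)
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.choices(range(len(A)), weights=row_norms_sq)[0]
        residual = b[i] - sum(a * xj for a, xj in zip(A[i], x))
        step = residual / row_norms_sq[i]
        # Orthogonal projection of x onto {z : <A[i], z> = b[i]}.
        x = [xj + step * a for xj, a in zip(x, A[i])]
    return x


# Hypothetical consistent system with exact solution x = (1, 2).
A = [[1.0, 2.0], [3.0, 1.0], [1.0, -1.0]]
b = [5.0, 5.0, -1.0]
x = randomized_kaczmarz(A, b)
```

When the system is inconsistent, this weighting biases the limit toward a weighted least squares solution, which is the issue the abstract's modified algorithm is designed to address.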

1304.4610 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA stat.ML version update

Spectral Compressed Sensing via Structured Matrix Completion

通过结构矩阵补全的谱压缩感知

Yuxin Chen, Yuejie Chi

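AI summary: This paper proposes enhanced matrix completion (EMaC), an algorithm based on structured matrix completion that recovers spectrally sparse objects from a small number of time-domain samples. Nuclear norm minimization yields perfect recovery near the information-theoretic limit, and the method is robust to noise and applicable to super-resolution.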

Comments accepted to International Conference on Machine Learning (ICML 2013)

Journal ref: Journal of Machine Learning Research, W&CP 28 (3) :414-422, 2013

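AI abstract

This paper studies the recovery of a spectrally sparse object from a small number of time-domain samples. Specifically, the object of interest, with ambient dimension $n$, is assumed to be a mixture of $r$ complex multi-dimensional sinusoids whose underlying frequencies can take any value in the unit disk. Conventional compressed sensing paradigms suffer from basis mismatch when a discrete dictionary is imposed on the Fourier representation. To address this, we develop a nonparametric algorithm, enhanced matrix completion (EMaC), based on structured matrix completion. The algorithm first arranges the data into a low-rank enhanced form with multi-fold Hankel structure and then attempts recovery by nuclear norm minimization. Under mild incoherence conditions, EMaC achieves perfect recovery once the number of samples exceeds the order of $\mathcal{O}(r\log^{2} n)$. We also show that, in many instances, a low-rank multi-fold Hankel matrix can be completed accurately when the number of observed entries is proportional to the information-theoretic limit (up to a logarithmic gap). Numerical experiments further demonstrate EMaC's robustness to bounded noise and its applicability to super-resolution.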

English abstract

The paper studies the problem of recovering a spectrally sparse object from a small number of time domain samples. Specifically, the object of interest with ambient dimension $n$ is assumed to be a mixture of $r$ complex multi-dimensional sinusoids, while the underlying frequencies can assume any value in the unit disk. Conventional compressed sensing paradigms suffer from the basis mismatch issue when imposing a discrete dictionary on the Fourier representation. To address this problem, we develop a novel nonparametric algorithm, called enhanced matrix completion (EMaC), based on structured matrix completion. The algorithm starts by arranging the data into a low-rank enhanced form with multi-fold Hankel structure, then attempts recovery via nuclear norm minimization. Under mild incoherence conditions, EMaC allows perfect recovery as soon as the number of samples exceeds the order of $\mathcal{O}(r\log^{2} n)$. We also show that, in many instances, accurate completion of a low-rank multi-fold Hankel matrix is possible when the number of observed entries is proportional to the information theoretical limits (except for a logarithmic gap). The robustness of EMaC against bounded noise and its applicability to super resolution are further demonstrated by numerical experiments.
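
The low-rank structure that EMaC relies on can be checked numerically in the simplest setting. The sketch below builds a one-dimensional signal from r undamped complex sinusoids, arranges it into a Hankel matrix, and confirms that the matrix has numerical rank r regardless of where the frequencies fall. It is only an illustration: the function names are hypothetical, the paper treats multi-dimensional signals with multi-fold Hankel structure, and the nuclear norm minimization step is not shown.

```python
import cmath


def spectrally_sparse_signal(freqs, amps, n):
    """Samples x[t] = sum_l amps[l] * exp(2*pi*i*freqs[l]*t), t = 0..n-1."""
    return [sum(a * cmath.exp(2j * cmath.pi * f * t) for f, a in zip(freqs, amps))
            for t in range(n)]


def hankel(x, rows):
    """Hankel matrix H[i][j] = x[i + j] with the given number of rows."""
    cols = len(x) - rows + 1
    return [[x[i + j] for j in range(cols)] for i in range(rows)]


def numerical_rank(M, tol=1e-8):
    """Rank via Gaussian elimination with partial pivoting."""
    M = [row[:] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            continue  # no usable pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [mr - factor * mp for mr, mp in zip(M[r], M[rank])]
        rank += 1
    return rank


# Two off-grid frequencies, 11 samples, arranged as a 6 x 6 Hankel matrix.
x = spectrally_sparse_signal([0.1, 0.37], [1.0, 0.5], n=11)
H = hankel(x, rows=6)
rank = numerical_rank(H)
```

Because the rank depends only on the number of sinusoids and not on a frequency grid, completing the Hankel matrix sidesteps the basis mismatch that a discrete Fourier dictionary would introduce.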

1501.00125 2026-06-04 cs.LG cs.NA cs.SY eess.SY math.NA physics.data-an version update

Analysis of a Reduced-Communication Diffusion LMS Algorithm

Reza Arablouei, Stefan Werner, Kutluyıl Doğançay, Yih-Fang Huang

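AI summary: This paper analyzes a reduced-communication diffusion least mean-square (LMS) algorithm, examines its trade-off between estimation performance and communication cost, proves stability and convergence in the mean and mean-square senses, and computes the steady-state mean-square deviation.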

1406.5429 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC version update

Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

双模互动：解决大规模优化问题的最新对偶方法综述

Nikos Komodakis, Jean-Christophe Pesquet

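AI summary: This paper surveys recent primal-dual approaches for large-scale optimization, discusses the use of duality in signal processing, computer vision, and machine learning, and highlights the advantages of primal-dual algorithms for convex and discrete problems.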

1411.6081 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

PU Learning for Matrix Completion

矩阵补全的PU学习

Cho-Jui Hsieh, Nagarajan Natarajan, Inderjit S. Dhillon

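AI summary: This paper studies matrix completion by PU (positive-unlabeled) learning when only one-bit measurements are observed, proposing two methods together with error bounds and sample complexity.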

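AI abstract

This paper considers matrix completion when the observations are one-bit measurements of an underlying matrix M, in particular when the observed samples contain only ones and no zeros. The problem is motivated by modern applications such as recommender systems and social networks, where only "likes" or "friendships" are observed.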

English abstract

In this paper, we consider the matrix completion problem when the observations are one-bit measurements of some underlying matrix M, and in particular the observed samples consist only of ones and no zeros. This problem is motivated by modern applications such as recommender systems and social networks where only "likes" or "friendships" are observed. The problem of learning from only positive and unlabeled examples, called PU (positive-unlabeled) learning, has been studied in the context of binary classification. We consider the PU matrix completion problem, where an underlying real-valued matrix M is first quantized to generate one-bit observations and then a subset of positive entries is revealed. Under the assumption that M has bounded nuclear norm, we provide recovery guarantees for two different observation models: 1) M parameterizes a distribution that generates a binary matrix, 2) M is thresholded to obtain a binary matrix. For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix. Both methods yield strong error bounds: if M is n by n, the Frobenius error is bounded as O(1/((1-rho)n)), where 1-rho denotes the fraction of ones observed. This implies a sample complexity of O(n log n) ones to achieve a small error, when M is dense and n is large. We extend our methods and guarantees to the inductive matrix completion problem, where rows and columns of M have associated features. We provide efficient and scalable optimization procedures for both methods and demonstrate the effectiveness of the proposed methods for link prediction (on real-world networks consisting of over 2 million nodes and 90 million links) and semi-supervised clustering tasks.

1408.4551 2026-06-04 math.OC cs.LG cs.SY eess.SY version update

Dimensionality Reduction of Affine Variational Inequalities Using Random Projections

使用随机投影进行仿射变分不等式的降维

Bharat Prabhakar, Ankur A. Kulkarni

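AI summary: This paper proposes a randomized algorithm based on the Johnson-Lindenstrauss lemma that computes approximate solutions after dimensionality reduction, and shows that it is effective in the lower dimension and saves time when computing exact solutions.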

Comments Submitted to Mathematical Programming Series A. Edited some typos from the previous version. Also added a bound on the lower dimension

1411.1087 2026-06-04 math.NA cs.DS cs.IT cs.LG cs.NA math.IT stat.ML version update

Fast Exact Matrix Completion with Finite Samples

快速精确矩阵补全与有限样本

Prateek Jain, Praneeth Netrapalli

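AI summary: This paper presents a fast iterative algorithm that achieves exact matrix completion from O(nr^5 log^3 n) observed entries with running time O(nr^7 log^3 n log 1/ε), the first near-linear-time method whose sample complexity is independent of the accuracy.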

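AI abstract

Matrix completion is the problem of recovering a low-rank matrix from a small fraction of its entries. Several recent works have proposed fast iterative algorithms based on non-convex optimization, but their sample complexity remains sub-optimal in its dependence on the rank, the condition number, and the desired accuracy. This paper presents a fast iterative algorithm that solves matrix completion from O(nr^5 log^3 n) observed entries with running time O(nr^7 log^3 n log 1/ε), which is near linear in the matrix dimension. The algorithm is based on the well-known projected gradient descent method, with projection onto the non-convex set of low-rank matrices. There are two key ideas: 1) the argument uses an ℓ∞-norm potential function rather than the spectral norm, together with a new way to obtain perturbation bounds for it; 2) a natural extension of the Davis-Kahan theorem gives perturbation bounds on the best low-rank approximation of matrices with a good eigen-gap. Both ideas may be of independent interest.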

English abstract

Matrix completion is the problem of recovering a low rank matrix by observing a small fraction of its entries. A series of recent works [KOM12,JNS13,HW14] have proposed fast non-convex optimization based iterative algorithms to solve this problem. However, the sample complexity in all these results is sub-optimal in its dependence on the rank, condition number and the desired accuracy. In this paper, we present a fast iterative algorithm that solves the matrix completion problem by observing $O(nr^5 \log^3 n)$ entries, which is independent of the condition number and the desired accuracy. The run time of our algorithm is $O(nr^7\log^3 n\log 1/ε)$ which is near linear in the dimension of the matrix. To the best of our knowledge, this is the first near linear time algorithm for exact matrix completion with finite sample complexity (i.e. independent of $ε$). Our algorithm is based on a well known projected gradient descent method, where the projection is onto the (non-convex) set of low rank matrices. There are two key ideas in our result: 1) our argument is based on a $\ell_{\infty}$ norm potential function (as opposed to the spectral norm) and provides a novel way to obtain perturbation bounds for it. 2) we prove and use a natural extension of the Davis-Kahan theorem to obtain perturbation bounds on the best low rank approximation of matrices with good eigen-gap. Both of these ideas may be of independent interest.

1406.6474 2026-06-04 math.OC cs.DM cs.DS cs.LG cs.NA math.NA version update

On the Convergence Rate of Decomposable Submodular Function Minimization

可分解亚模函数最小化的收敛速度研究

Robert Nishihara, Stefanie Jegelka, Michael I. Jordan

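AI summary: This paper studies the convergence rate of algorithms for decomposable submodular function minimization, using geometric and spectral graph theory arguments to give upper and lower bounds on the rate.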

Comments 17 pages, 3 figures

1411.0024 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML version update

Robust sketching for multiple square-root LASSO problems

多重平方根LASSO问题的鲁棒抽样

Vu Pham, Laurent El Ghaoui, Arturo Fernandez

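AI summary: This paper proposes a robust framework that uses low-rank approximation to solve many similar problems efficiently, reducing computation and improving statistical performance.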

1406.4905 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML version update

Variational Gaussian Process State-Space Models

变分高斯过程状态空间模型

Roger Frigola, Yutian Chen, Carl E. Rasmussen

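AI summary: This paper proposes a variational Bayesian learning method based on sparse Gaussian processes for efficiently learning nonlinear state-space models, giving a tractable posterior over nonlinear dynamical systems. Compared with conventional parametric models, it trades off model capacity against computational cost flexibly and avoids overfitting.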

1410.7550 2026-06-04 stat.ML cs.LG cs.NE cs.SY eess.SY version update

Learning deep dynamical models from image pixels

从图像像素学习深度动态模型

Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth

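AI summary: This paper combines deep learning with system identification to learn, from high-dimensional image pixels, an embedding and a predictive transition model for nonlinear dynamical systems.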

Comments 10 pages, 11 figures

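AI abstract

Modeling dynamical systems is important in many fields, such as control, robotics, and neurotechnology. The state of these systems usually cannot be observed directly and is available only through noisy, possibly high-dimensional observations. In that setting, system identification, that is, finding the measurement mapping and the transition mapping (system dynamics) in latent space, is challenging. Efficient solutions exist for linear dynamics and measurement mappings, but in practical applications the linearity assumptions do not hold and non-linear system identification is required. When the observations are high-dimensional (for example, images), non-linear system identification is inherently hard. To address this, we combine recent advances in deep learning and system identification. In particular, we use deep auto-encoders to jointly learn a low-dimensional embedding of the observations and a predictive transition model in that low-dimensional space. We demonstrate that the model can learn good predictive models of dynamical systems from pixel information alone.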

English abstract

Modeling dynamical systems is important in many disciplines, e.g., control, robotics, or neurotechnology. Commonly the state of these systems is not directly observed, but only available through noisy and potentially high-dimensional observations. In these cases, system identification, i.e., finding the measurement mapping and the transition mapping (system dynamics) in latent space can be challenging. For linear system dynamics and measurement mappings, efficient solutions for system identification are available. However, in practical applications, the linearity assumptions do not hold, requiring non-linear system identification techniques. If additionally the observations are high-dimensional (e.g., images), non-linear system identification is inherently hard. To address the problem of non-linear system identification from high-dimensional observations, we combine recent advances in deep learning and system identification. In particular, we jointly learn a low-dimensional embedding of the observation by means of deep auto-encoders and a predictive transition model in this low-dimensional space. We demonstrate that our model enables learning good predictive models of dynamical systems from pixel information only.

1410.3463 2026-06-04 cs.OS cs.LG cs.SY eess.SY version update

Mining Block I/O Traces for Cache Preloading with Sparse Temporal Non-parametric Mixture of Multivariate Poisson

通过稀疏时间非参数混合多维泊松模型挖掘块I/O轨迹以进行缓存预加载

Lavanya Sita Tekumalla, Chiranjib Bhattacharyya

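AI summary: This paper proposes a sparse temporal non-parametric mixture of multivariate Poisson model for mining long-range patterns in block I/O traces for cache preloading; sparse modeling makes inference more efficient, and experiments show markedly higher hit rates on benchmark traces.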

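AI abstract

Existing caching strategies in the storage domain exploit short-range spatio-temporal patterns well but cannot use long-range patterns to improve hit rates. We therefore investigate new Bayesian non-parametric (BNP) modeling techniques that capture long-range correlations for cache preloading. Such traces consist of a sequence of memory accesses that can be aggregated into sequences of high-dimensional, sparse, correlated count vectors. Although several state-of-the-art BNP algorithms exist for clustering, along with temporal extensions for prediction, none has been applied to correlated count vectors. Our first contribution is a DP-based mixture model of multivariate Poisson (DP-MMVP) and its temporal extension (HMM-DP-MMVP), which capture the full covariance structure of multivariate count data. Modeling the full covariance structure of count vectors is computationally expensive, however, particularly in high dimensions. We therefore exploit the sparsity of the count vectors and, as our main contribution, introduce the sparse DP mixture of multivariate Poisson (Sparse-DP-MMVP), which generalizes DP-MMVP and leads to more efficient inference. We then discuss a temporal extension of the model for cache preloading. This is a first step toward mining historical data to capture long-range patterns in storage traces for cache preloading. Experiments show a marked improvement in hit rates on benchmark traces and lay the groundwork for further research on data mining techniques that reduce latency in the storage domain.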

English abstract

Existing caching strategies in the storage domain, though well suited to exploit short-range spatio-temporal patterns, are unable to leverage long-range motifs for improving hit rates. Motivated by this, we investigate novel Bayesian non-parametric (BNP) modeling techniques for count vectors, to capture long-range correlations for cache preloading, by mining block I/O traces. Such traces consist of a sequence of memory accesses that can be aggregated into high-dimensional sparse correlated count vector sequences. While there are several state-of-the-art BNP algorithms for clustering and their temporal extensions for prediction, there has been no work on exploring these for correlated count vectors. Our first contribution addresses this gap by proposing a DP-based mixture model of multivariate Poisson (DP-MMVP) and its temporal extension (HMM-DP-MMVP) that captures the full covariance structure of multivariate count data. However, modeling the full covariance structure for count vectors is computationally expensive, particularly for high-dimensional data. Hence, we exploit sparsity in our count vectors and, as our main contribution, introduce the sparse DP mixture of multivariate Poisson (Sparse-DP-MMVP), generalizing our DP-MMVP mixture model and also leading to more efficient inference. We then discuss a temporal extension to our model for cache preloading. We take the first step towards mining historical data to capture long-range patterns in storage traces for cache preloading. Experimentally, we show a dramatic improvement in hit rates on benchmark traces and lay the groundwork for further research in the storage domain to reduce latencies using data mining techniques to capture long-range motifs.

1410.2786 2026-06-04 cs.LG cs.NA math.NA version update

New SVD based initialization strategy for Non-negative Matrix Factorization

基于SVD的非负矩阵分解的新型初始化策略

Hanli Qiao

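AI summary: This paper proposes an SVD-based initialization for nonnegative matrix factorization: the rank is set from the number of main components and the algorithm is initialized from the SVD factors; experiments indicate it outperforms NNDSVD.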

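AI abstract

This paper addresses two problems in nonnegative matrix factorization (NMF): choosing a suitable factorization rank and providing a good initialization. It aims to solve both using the singular value decomposition (SVD). First, the rank is set by extracting the number of main components, an approach inspired by [1, 2]. Second, the singular values and vectors are used to initialize the NMF algorithm. In 2008, Boutsidis and Gallopoulos [3] proposed the NNDSVD method to enhance the initialization of NMF algorithms. They extracted the positive section of the unit matrices $\{C^{(j)}\}_{j=1}^{k}$ and its singular triplet information. The strategy uses the positive section to cope with negative elements of the singular vectors, but experiments show that simply replacing negative elements by their absolute values gives better results than NNDSVD. We therefore propose an SVD-based initialization for NMF algorithms (SVD-NMF). Numerical experiments on two face databases, ORL and YALE, show that the method outperforms NNDSVD.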

English abstract

There are two problems that need to be dealt with for Non-negative Matrix Factorization (NMF): choosing a suitable rank for the factorization and providing a good initialization method for NMF algorithms. This paper aims to solve these two problems using Singular Value Decomposition (SVD). First, we extract the number of main components as the rank; this method is inspired by [1, 2]. Second, we use the singular values and their vectors to initialize the NMF algorithm. In 2008, Boutsidis and Gallopoulos [3] provided the method titled NNDSVD to enhance initialization of NMF algorithms. They extracted the positive section and respective singular triplet information of the unit matrices $\{C^{(j)}\}_{j=1}^{k}$, which were obtained from singular vector pairs. This strategy aims to use the positive section to cope with negative elements of the singular vectors, but in experiments we found that even replacing negative elements by their absolute values could give better results than NNDSVD. Hence, we give another SVD-based method to initialize NMF algorithms (SVD-NMF). Numerical experiments on two face databases, ORL and YALE [16, 17], show that our method is better than NNDSVD.

1410.0719 2026-06-04 math.NA cs.CV cs.IT cs.LG cs.NA math.IT math.OC math.ST stat.TH version update

Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

第二届‘国际稀疏模型与技术相互作用’研讨会论文集（iTWIST'14）

L. Jacques, C. De Vleeschouwer, Y. Boursier, P. Sudhakar, C. De Mol, A. Pizurica, S. Anthoine, P. Vandergheynst, P. Frossard, C. Bilen, S. Kitic, N. Bertin, R. Gribonval, N. Boumal, B. Mishra, P. -A. Absil, R. Sepulchre, S. Bundervoet, C. Schretter, A. Dooms, P. Schelkens, O. Chabiron, F. Malgouyres, J. -Y. Tourneret, N. Dobigeon, P. Chainais, C. Richard, B. Cornelis, I. Daubechies, D. Dunson, M. Dankova, P. Rajmic, K. Degraux, V. Cambareri, B. Geelen, G. Lafruit, G. Setti, J. -F. Determe, J. Louveaux, F. Horlin, A. Drémeau, P. Heas, C. Herzet, V. Duval, G. Peyré, A. Fawzi, M. Davies, N. Gillis, S. A. Vavasis, C. Soussen, L. Le Magoarou, J. Liang, J. Fadili, A. Liutkus, D. Martina, S. Gigan, L. Daudet, M. Maggioni, S. Minsker, N. Strawn, C. Mory, F. Ngole, J. -L. Starck, I. Loris, S. Vaiter, M. Golbabaee, D. Vukobratovic

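AI summary: iTWIST'14 focuses on the theory and applications of the sparsity paradigm, fostering international collaboration through talks, posters, and discussions on topics including sparse data sensing, unions of subspaces, and nonlinear inverse problems.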

Comments 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist14

1401.1549 2026-06-04 cs.LG cs.AI cs.SY eess.SY version update

Optimal Demand Response Using Device Based Reinforcement Learning

基于设备的强化学习的最优需求响应

Zheng Wen, Daniel O'Neill, Hamid Reza Maei

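AI summary: This paper proposes a new EMS framework that casts demand response as a reinforcement learning problem, solves the scheduling problem by decomposing over device clusters, requires no explicit model of user dissatisfaction, and improves computational efficiency.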

1409.8327 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

Bayesian and regularization approaches to multivariable linear system identification: the role of rank penalties

基于贝叶斯和正则化方法的多变量线性系统辨识：秩惩罚的作用

Giulia Prando, Alessandro Chiuso, Gianluigi Pillonetto

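AI summary: This paper proposes an impulse response estimator combining ℓ2-type regularization with a rank penalty to account for coupling between input-output channels in MIMO systems; hyperparameters are estimated by optimizing the marginal likelihood, after which the estimate is available in closed form.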

Comments to appear in IEEE Conference on Decision and Control, 2014

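AI abstract

Recent developments in linear system identification have proposed non-parametric methods that rely on regularization to handle the bias/variance trade-off. This paper introduces an impulse response estimator that uses an ℓ2-type regularization together with a rank penalty derived from the log-det heuristic as a smooth approximation of the rank function. This makes it possible to account for different properties of the estimated impulse response (such as smoothness and stability) while penalizing high-complexity models. It also makes it possible to account for and enforce coupling between different input-output channels in MIMO systems. Following the Bayesian paradigm, the parameters defining the relative weight of the two regularization terms and the structure of the rank penalty are estimated by optimizing the marginal likelihood. Once these hyperparameters are estimated, the impulse response estimate is available in closed form. Experiments show that the proposed method outperforms estimators that rely on classic ℓ2 regularization alone or on atomic and nuclear norms.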

English abstract

Recent developments in linear system identification have proposed the use of non-parametric methods, relying on regularization strategies, to handle the so-called bias/variance trade-off. This paper introduces an impulse response estimator which relies on an $\ell_2$-type regularization including a rank-penalty derived using the log-det heuristic as a smooth approximation to the rank function. This makes it possible to account for different properties of the estimated impulse response (e.g. smoothness and stability) while also penalizing high-complexity models. It also makes it possible to account for and enforce coupling between different input-output channels in MIMO systems. According to the Bayesian paradigm, the parameters defining the relative weight of the two regularization terms as well as the structure of the rank penalty are estimated by optimizing the marginal likelihood. Once these hyperparameters have been estimated, the impulse response estimate is available in closed form. Experiments show that the proposed method is superior to the estimator relying on the "classic" $\ell_2$-regularization alone as well as those based on atomic and nuclear norms.

1409.8276 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

A Bayesian Tensor Factorization Model via Variational Inference for Link Prediction

通过变分推断的贝叶斯张量分解模型用于链接预测

Beyza Ermis, A. Taylan Cemgil

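AI summary: This paper proposes a tensor factorization model based on variational Bayesian inference for link prediction, which outperforms maximum likelihood approaches on large-scale datasets.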

Comments arXiv admin note: substantial text overlap with arXiv:1409.8083

1409.5671 2026-06-04 cs.AI cs.CE cs.LG cs.LO cs.SY eess.SY version update

A Formal Methods Approach to Pattern Synthesis in Reaction Diffusion Systems

反应扩散系统模式合成的正式方法方法

Ebru Aydin Gol, Ezio Bartocci, Calin Belta

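AI summary: This paper proposes a technique for detecting and generating patterns based on a spatial superposition logic, combining model checking with particle swarm optimization to synthesize parameters that produce desired patterns in reaction-diffusion systems.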

1409.4018 2026-06-04 cs.LG cs.NA math.NA version update

EquiNMF: Graph Regularized Multiview Nonnegative Matrix Factorization

EquiNMF：图正则化多视图非负矩阵分解

Daniel Hidru, Anna Goldenberg

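AI summary: This paper proposes EquiNMF, which integrates data through graph-regularized multiview nonnegative matrix factorization and sets its parameters automatically to improve clustering; experiments show it outperforms other single-view and multiview methods.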

1409.2579 2026-06-04 math.NA cs.CV cs.LG cs.NA version update

A theoretical contribution to the fast implementation of null linear discriminant analysis method using random matrix multiplication with scatter matrices

对利用散射矩阵进行随机矩阵乘法实现null线性判别分析方法的理论贡献

Ting-ting Feng, Gang Wu

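AI summary: This paper gives a theoretical approach in which a suitably chosen random matrix guarantees full column rank in null LDA, avoiding loss of information and improving computational efficiency.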

Comments 7 pages

1408.2054 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Non-Convex Rank Minimization via an Empirical Bayesian Approach

通过经验贝叶斯方法实现非凸秩最小化

David Wipf

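AI summary: This paper proposes an empirical Bayesian method built on a variational approximation for non-convex rank minimization. It retains the same globally minimizing point estimate as the rank function while marginalization overcomes limitations of conventional convex relaxations, and it performs especially well on robust principal component analysis.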

Comments Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

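AI abstract

In applications that require minimum-rank matrix solutions, the underlying cost function is non-convex, which makes the optimization problem intractable. The nuclear norm is therefore often used as a surrogate penalty for matrix rank. In many practical scenarios, however, there is no guarantee that the generative low-rank matrix is estimated correctly, theoretical special cases aside. This paper proposes an empirical Bayesian method built on a variational approximation that, under many useful constraints, retains the same globally minimizing point estimate as the rank function. Marginalization smooths away locally minimizing solutions, so the algorithm can succeed where standard convex relaxations fail completely. Although the method applies to a wide range of low-rank applications, the paper focuses on robust principal component analysis (RPCA), which estimates an unknown low-rank matrix with unknown sparse corruptions. Theoretical and empirical evidence indicates that the method is potentially superior to related MAP-based approaches, of which the convex principal component pursuit (PCP) algorithm (Candes et al., 2011) can be viewed as a special case.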

English abstract

In many applications that require matrix solutions of minimal rank, the underlying cost function is non-convex leading to an intractable, NP-hard optimization problem. Consequently, the convex nuclear norm is frequently used as a surrogate penalty term for matrix rank. The problem is that in many practical scenarios there is no longer any guarantee that we can correctly estimate generative low-rank matrices of interest, theoretical special cases notwithstanding. Consequently, this paper proposes an alternative empirical Bayesian procedure built upon a variational approximation that, unlike the nuclear norm, retains the same globally minimizing point estimate as the rank function under many useful constraints. However, locally minimizing solutions are largely smoothed away via marginalization, allowing the algorithm to succeed when standard convex relaxations completely fail. While the proposed methodology is generally applicable to a wide range of low-rank applications, we focus our attention on the robust principal component analysis problem (RPCA), which involves estimating an unknown low-rank matrix with unknown sparse corruptions. Theoretical and empirical evidence is presented to show that our method is potentially superior to related MAP-based approaches, for which the convex principal component pursuit (PCP) algorithm (Candes et al., 2011) can be viewed as a special case.

1408.0838 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML version update

Estimating Maximally Probable Constrained Relations by Mathematical Programming

通过数学规划估计最大可能的约束关系

Lizhen Qu, Bjoern Andres

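AI summary: This paper contributes a family of probability measures that jointly abstracts multi-label classification, correlation clustering, and ranking, and suggests a mathematical programming approach to semi-supervised learning.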

Comments 16 pages

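AI abstract

Estimating a constrained relation is a fundamental problem in machine learning. Special cases include classification, clustering, and ranking. This paper contributes a family of probability measures on the set of all relations between two finite, non-empty sets, which offers a joint abstraction of multi-label classification, correlation clustering, and ranking. Given a training set of related and unrelated pairs, estimating a maximally probable measure is a convex optimization problem. Given a measure, estimating a maximally probable relation is a 01-linear program, which is solved in linear time for maps and is NP-hard for equivalence relations and linear orders. Experiments show practical solutions for all three cases. Finally, jointly estimating a maximally probable measure and relation is posed as a mixed-integer nonlinear program. This formulation suggests a mathematical programming approach to semi-supervised learning.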

English abstract

Estimating a constrained relation is a fundamental problem in machine learning. Special cases are classification (the problem of estimating a map from a set of to-be-classified elements to a set of labels), clustering (the problem of estimating an equivalence relation on a set) and ranking (the problem of estimating a linear order on a set). We contribute a family of probability measures on the set of all relations between two finite, non-empty sets, which offers a joint abstraction of multi-label classification, correlation clustering and ranking by linear ordering. Estimating (learning) a maximally probable measure, given (a training set of) related and unrelated pairs, is a convex optimization problem. Estimating (inferring) a maximally probable relation, given a measure, is a 01-linear program. It is solved in linear time for maps. It is NP-hard for equivalence relations and linear orders. Practical solutions for all three cases are shown in experiments with real data. Finally, estimating a maximally probable measure and relation jointly is posed as a mixed-integer nonlinear program. This formulation suggests a mathematical programming approach to semi-supervised learning.

1404.1592 2026-06-04 math.OC cs.LG cs.SY eess.SY version update

The Power of Online Learning in Stochastic Network Optimization

在线学习在随机网络优化中的力量

Longbo Huang, Xin Liu, Xiaohong Hao

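AI summary: This paper studies the role of online learning in stochastic network optimization with unknown system statistics and proposes the OLAC and OLAC2 algorithms, proving near-optimal utility-delay trade-offs and convergence-time guarantees.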

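AI abstract

This paper examines the role of online learning in stochastic network optimization with unknown system statistics, studies how information and learning can be incorporated efficiently into system control, and proposes two online learning-aided control techniques, OLAC and OLAC2. Both use past system information through a procedure called dual learning. OLAC and OLAC2 are proved to achieve a near-optimal utility-delay trade-off, and OLAC2 has an $O(ε^{-2/3})$ convergence time. Simulation results confirm the algorithms' superior performance. To the best of the authors' knowledge, this is the first work to incorporate online learning explicitly into stochastic network optimization and to demonstrate its power in both theory and practice.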

English abstract

In this paper, we investigate the power of online learning in stochastic network optimization with unknown system statistics a priori. We are interested in understanding how information and learning can be efficiently incorporated into system control techniques, and what the fundamental benefits of doing so are. We propose two Online Learning-Aided Control techniques, OLAC and OLAC2, that explicitly utilize the past system information in current system control via a learning procedure called dual learning. We prove strong performance guarantees of the proposed algorithms: OLAC and OLAC2 achieve the near-optimal $[O(ε), O([\log(1/ε)]^2)]$ utility-delay tradeoff and OLAC2 possesses an $O(ε^{-2/3})$ convergence time. OLAC and OLAC2 are probably the first algorithms that simultaneously possess explicit near-optimal delay guarantee and sub-linear convergence time. Simulation results also confirm the superior performance of the proposed algorithms in practice. To the best of our knowledge, our attempt is the first to explicitly incorporate online learning into stochastic network optimization and to demonstrate its power in both theory and practice.

1407.7299 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization

非负矩阵分解的算法、初始化和收敛性

Amy N. Langville, Carl D. Meyer, Russell Albright, James Cox, David Duling

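AI summary: This paper studies how initialization affects the convergence speed and accuracy of nonnegative matrix factorization algorithms, proposes two new alternating least squares algorithms, and discusses the choice of convergence criteria.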

1407.0286 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

DC approximation approaches for sparse optimization

基于DC近似方法的稀疏优化

Hoai An Le Thi, Tao Pham Dinh, Hoai Minh Le, Xuan Thanh Vo

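AI summary: Working within the DC framework, this paper studies nonconvex approximation approaches for sparse optimization, analyzes the consistency between solutions of the approximate and original problems, and develops four DCA schemes for zero-norm and sparse optimization problems.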

Comments 35 pages

AI中文摘要

稀疏优化是指目标或约束中包含零范数的优化问题。本文从DC（差分凸函数）编程框架出发，研究了稀疏优化的非凸近似方法。考虑了包含所有标准稀疏诱导惩罚函数的零范数的常见DC近似，研究了近似问题与原问题的全局最小值（或局部最小值）的一致性。证明了在某些情况下，近似问题的某些全局最小值（或局部最小值）也是原问题的。利用DC编程中的精确惩罚技术，证明了某些特定近似方法在合适参数下与原问题等价。对几种稀疏诱导惩罚函数的效率进行了全面分析。开发了四种DCA（DC算法）方案，涵盖了非凸稀疏近似方法中的所有标准算法作为特殊版本。这些算法可以视为ℓ₁扰动算法/加权ℓ₁算法。本文提供了一种统一的非凸近似方法，结合了坚实的理论工具和基于DC编程和DCA的高效算法，以解决零范数和稀疏优化问题。作为应用，我们实现了我们的方法用于SVM（支持向量机）问题的特征选择，并在各种近似函数上进行了实证比较数值实验。

英文摘要

Sparse optimization refers to an optimization problem involving the zero-norm in the objective or constraints. In this paper, nonconvex approximation approaches for sparse optimization are studied from a unifying point of view in the DC (Difference of Convex functions) programming framework. Considering a common DC approximation of the zero-norm that includes all standard sparsity-inducing penalty functions, we study the consistency between global minima (resp. local minima) of the approximate and original problems. We show that, in several cases, some global minimizers (resp. local minimizers) of the approximate problem are also minimizers of the original problem. Using exact penalty techniques in DC programming, we prove stronger results for some particular approximations, namely that the approximate problem, with suitable parameters, is equivalent to the original problem. The efficiency of several sparsity-inducing penalty functions is analyzed in full. Four DCA (DC Algorithm) schemes are developed that cover all standard algorithms in nonconvex sparse approximation approaches as special versions. They can be viewed as an $\ell_{1}$-perturbed algorithm or a reweighted-$\ell_{1}$ algorithm. We offer a unifying nonconvex approximation approach, with solid theoretical tools as well as efficient algorithms based on DC programming and DCA, to tackle the zero-norm and sparse optimization. As an application, we implemented our methods for feature selection in the SVM (Support Vector Machine) problem and performed empirical comparative numerical experiments on the proposed algorithms with various approximation functions.
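
The link between DC programming and reweighted-$\ell_1$ algorithms mentioned in the abstract can be illustrated with a standard textbook instance. This is a sketch under stated assumptions, not one of the paper's four DCA schemes: $f$ is assumed convex, $\varphi$ is an assumed concave, nondecreasing penalty on $[0,\infty)$, and $\lambda$, $\varepsilon$ are illustrative parameters.

```latex
% Assumed model: approximate the zero-norm by a separable concave penalty.
\min_{x \in \mathbb{R}^n} \; f(x) + \lambda \sum_{i=1}^{n} \varphi(|x_i|)

% Concavity of \varphi gives the linear majorant at the current iterate x^k:
\varphi(|x_i|) \le \varphi(|x_i^k|) + w_i^k \bigl(|x_i| - |x_i^k|\bigr),
\qquad w_i^k = \varphi'(|x_i^k|)

% Minimizing the majorant is a weighted \ell_1 problem (one reweighting step):
x^{k+1} \in \arg\min_{x} \; f(x) + \lambda \sum_{i=1}^{n} w_i^k \, |x_i|

% Example: the logarithmic penalty \varphi(t) = \log(t + \varepsilon) yields
w_i^k = \frac{1}{|x_i^k| + \varepsilon}
```

Each step solves a convex problem, and small entries of $x^k$ receive large weights, which pushes them toward zero in the next iterate.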

1405.7910 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Optimal CUR Matrix Decompositions

最优的CUR矩阵分解

Christos Boutsidis, David P. Woodruff

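AI summary: This paper proposes methods for computing optimal CUR decompositions, including input-sparsity-time and deterministic algorithms, which select a small number of columns and rows to build a low-rank approximation of the original matrix.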

Comments small revision in lemma 4.2

1407.2676 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

A New Optimal Stepsize For Approximate Dynamic Programming

近似动态规划的一种新最优步长

Ilya O. Ryzhov, Peter I. Frazier, Warren B. Powell

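AI summary: This paper proposes a new optimal stepsize rule that improves the short-term performance of approximate dynamic programming algorithms by optimizing prediction error. It has a single, relatively insensitive tunable parameter, adapts to the noise level of the problem, and speeds up convergence in numerical experiments.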

Comments Matlab files are included with the paper source

1407.1399 2026-06-04 math.NA cs.LG cs.NA 版本更新

Generalized Higher-Order Tensor Decomposition via Parallel ADMM

通过并行ADMM实现广义高阶张量分解

Fanhua Shang, Yuanyuan Liu, James Cheng

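AI summary: This paper proposes a parallel trace-norm-regularized tensor decomposition method whose optimization scheme determines the number of factors in each mode automatically, addressing the challenges traditional methods face in model selection, gross corruptions, and computational efficiency.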

Comments 9 pages, 5 figures, AAAI 2014

AI中文摘要

高阶张量在计算机视觉、社交网络分析、数据挖掘和神经科学等领域变得普遍。传统张量分解方法面临三个主要挑战：模型选择、粗腐损和计算效率。为解决这些问题，我们首先提出一种并行迹范数正则化的张量分解方法，并将其公式化为凸优化问题。该方法不需要事先指定每个模式的秩，可通过我们的优化方案自动确定每个模式的因子数。通过考虑观测张量的低秩结构，我们分析了低秩张量与其核心张量之间的迹范数等价关系。然后，我们将非凸张量分解模型转换为多个更小规模矩阵迹范数最小化的加权组合。最后，我们开发了两种并行交替方向乘子法（ADMM）来解决这些问题。实验结果验证了我们的正则化公式有效，且我们的方法对噪声或异常值具有鲁棒性。

英文摘要

Higher-order tensors are becoming prevalent in many scientific areas such as computer vision, social network analysis, data mining and neuroscience. Traditional tensor decomposition approaches face three major challenges: model selection, gross corruptions and computational efficiency. To address these problems, we first propose a parallel trace norm regularized tensor decomposition method, and formulate it as a convex optimization problem. This method does not require the rank of each mode to be specified beforehand, and can automatically determine the number of factors in each mode through our optimization scheme. By considering the low-rank structure of the observed tensor, we analyze the equivalence relationship of the trace norm between a low-rank tensor and its core tensor. Then, we cast a non-convex tensor decomposition model into a weighted combination of multiple much smaller-scale matrix trace norm minimization problems. Finally, we develop two parallel alternating direction methods of multipliers (ADMM) to solve our problems. Experimental results verify that our regularized formulation is effective, and our methods are robust to noise or outliers.

1407.0449 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Classification-based Approximate Policy Iteration: Experiments and Extended Discussions

基于分类的近似策略迭代：实验与扩展讨论

Amir-massoud Farahmand, Doina Precup, André M. S. Barreto, Mohammad Ghavamzadeh

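AI summary: This paper proposes a classification-based approximate policy iteration framework that exploits regularities of both the value function and the policy space to improve performance, and validates it on tasks such as HIV control.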

1407.0013 2026-06-04 math.NA cs.LG cs.NA math.ST stat.TH 版本更新

Relevance Singular Vector Machine for low-rank matrix sensing

相关性奇异向量机用于低秩矩阵感知

Martin Sundin, Saikat Chatterjee, Magnus Jansson, Cristian R. Rojas

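AI summary: This paper proposes a new Bayesian inference method for low-rank matrix reconstruction, the Relevance Singular Vector Machine (RSVM), which promotes low rank by placing suitable priors on the singular vectors of the underlying matrix and speeds up computation with a numerically efficient approximation.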

Comments International Conference on Signal Processing and Communications (SPCOM), 5 pages

1310.7529 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation

递进非负投影算法用于鲁棒非负盲源分离

Nicolas Gillis

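AI summary: This paper proposes a fast and robust recursive algorithm for near-separable nonnegative matrix factorization. The algorithm, called the successive nonnegative projection algorithm (SNPA), uses the nonnegativity constraint to improve robustness and applies to a broader class of nonnegative matrices.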

Comments 31 pages, 7 figures, 4 tables. Main changes: new numerical experiments on column-rank-deficient matrices, typos corrected, discussion on the comparison with XRAY

1406.4619 2026-06-04 math.NA cs.LG cs.NA cs.NE 版本更新

A Generalized Markov-Chain Modelling Approach to $(1,λ)$-ES Linear Optimization: Technical Report

一种通用马尔可夫链建模方法用于(1,λ)-ES线性优化：技术报告

Alexandre Chotard, Martin Holena

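AI summary: This paper proposes a generalized Markov chain modelling approach for the (1,λ)-ES in linear optimization and examines the conditions under which a constant step-size (1,λ)-ES succeeds on a simple linearly constrained problem.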

1406.0554 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Universal Convexification via Risk-Aversion

通过风险厌恶实现通用凸化

Krishnamurthy Dvijotham, Maryam Fazel, Emanuel Todorov

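AI summary: This paper proposes a general convexification framework, analyzes the suboptimality of the convexified problem relative to the nonconvex one and the convergence of the resulting algorithms, and demonstrates applications to supervised learning and discrete-time dynamical systems.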

AI中文摘要

我们开发了一个框架，用于将广泛类别的优化问题凸化。在额外假设下，我们分析了凸化问题解相对于原始非凸问题的次优性和证明了加性近似保证。然后我们基于随机梯度方法开发了求解相应优化问题的算法，并展示了收敛率的界限。我们展示了该框架在监督学习中的简单应用，其中可以显式进行整合，并可以使用标准（非随机）优化算法获得更好的收敛保证。然后我们将该框架扩展到一般类离散时间动力系统。在此背景下，我们的凸化方法属于已研究的风险敏感马尔可夫决策过程范式。我们推导了首个已知的基于模型和无模型的策略梯度优化算法，保证收敛到最优解。最后，我们展示了在不同应用中的数值结果以验证我们的公式。

英文摘要

We develop a framework for convexifying a fairly general class of optimization problems. Under additional assumptions, we analyze the suboptimality of the solution to the convexified problem relative to the original nonconvex problem and prove additive approximation guarantees. We then develop algorithms based on stochastic gradient methods to solve the resulting optimization problems and show bounds on convergence rates. We show a simple application of this framework to supervised learning, where one can perform integration explicitly and can use standard (non-stochastic) optimization algorithms with better convergence guarantees. We then extend this framework to apply to a general class of discrete-time dynamical systems. In this context, our convexification approach falls under the well-studied paradigm of risk-sensitive Markov Decision Processes. We derive the first known model-based and model-free policy gradient optimization algorithms with guaranteed convergence to the optimal solution. Finally, we present numerical results validating our formulation in different applications.

1310.5035 2026-06-04 math.NA cs.LG cs.NA math.OC stat.ML 版本更新

Linearized Alternating Direction Method with Parallel Splitting and Adaptive Penalty for Separable Convex Programs in Machine Learning

线性化交替方向法与并行分裂及自适应惩罚用于机器学习中的可分离凸程序

Zhouchen Lin, Risheng Liu, Huan Li

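AI summary: This paper proposes LADMPSAP for efficiently solving multi-block separable convex programs. It improves on traditional ADM through parallel splitting and an adaptive penalty, yields stronger convergence results and faster convergence, and suits sparse representation and low-rank recovery problems.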

Comments Preliminary version published on Asian Conference on Machine Learning 2013

AI中文摘要

许多机器学习和其他领域的问题可以重新公式化为具有线性约束的可分离凸程序。在大多数情况下，存在多个变量块。然而，传统的交替方向法（ADM）及其线性化版本（LADM，通过线性化二次惩罚项获得）仅适用于两块情况，无法简单推广到多块情况。因此，扩展基于ADM的方法以处理多块情况有巨大需求。本文提出LADMPSAP以高效求解多块可分离凸程序。当所有组件目标函数具有有界子梯度时，我们获得了比ADM和LADM更强的收敛结果，例如允许惩罚参数无界，并证明了全局收敛的充分必要条件。我们进一步提出一个简单的最优性度量，并揭示了LADMPSAP在测度意义上的收敛速度。对于具有额外凸集约束的程序，通过精细的参数估计，我们设计了一个实用的LADMPSAP变种以加快收敛速度。最后，我们通过线性化部分目标函数来推广LADMPSAP，以处理更困难的目标函数程序。LADMPSAP特别适用于稀疏表示和低秩恢复问题，因为其子问题有闭合形式解，迭代过程中的稀疏性和低秩性可以得到保持。它也高度并行化，因此适合并行或分布式计算。数值实验验证了LADMPSAP在速度和数值精度方面的优势。

英文摘要

Many problems in machine learning and other fields can be (re)formulated as linearly constrained separable convex programs. In most cases, there are multiple blocks of variables. However, the traditional alternating direction method (ADM) and its linearized version (LADM, obtained by linearizing the quadratic penalty term) are designed for the two-block case and cannot be naively generalized to the multi-block case. So there is great demand for extending ADM-based methods to the multi-block case. In this paper, we propose LADM with parallel splitting and adaptive penalty (LADMPSAP) to solve multi-block separable convex programs efficiently. When all the component objective functions have bounded subgradients, we obtain convergence results that are stronger than those of ADM and LADM, e.g., allowing the penalty parameter to be unbounded and proving the sufficient and necessary conditions for global convergence. We further propose a simple optimality measure and reveal the convergence rate of LADMPSAP in an ergodic sense. For programs with extra convex set constraints, with refined parameter estimation we devise a practical version of LADMPSAP for faster convergence. Finally, we generalize LADMPSAP to handle programs with more difficult objective functions by linearizing part of the objective function as well. LADMPSAP is particularly suitable for sparse representation and low-rank recovery problems because its subproblems have closed-form solutions and the sparsity and low-rankness of the iterates can be preserved during the iteration. It is also highly parallelizable and hence well suited to parallel or distributed computing. Numerical experiments testify to the advantages of LADMPSAP in speed and numerical accuracy.

1402.0562 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Online Stochastic Optimization under Correlated Bandit Feedback

在线随机优化中的相关多臂反馈

Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

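AI summary: This paper proposes the HCT algorithm for online stochastic optimization of locally smooth functions. It handles the challenge of correlated rewards, improves memory requirements and smoothness assumptions, and is applied to policy search in reinforcement learning.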

1310.1502 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Randomized Approximation of the Gram Matrix: Exact Computation and Probabilistic Bounds

随机近似Gram矩阵：精确计算与概率界

John T. Holodnak, Ilse C. F. Ipsen

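AI summary: This paper studies randomized approximation of the Gram matrix and presents probabilistic error bounds based on the stable rank that hold for matrices of small dimension and for high success probabilities.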

Comments Update to title in third version. Major revisions in second version including new bounds and a more detailed experimental section. Submitted to SIMAX

1311.6107 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

Off-policy reinforcement learning for $ H_\infty $ control design

为H∞控制设计的非策略强化学习

Biao Luo, Huai-Ning Wu, Tingwen Huang

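AI summary: This paper proposes an off-policy reinforcement learning method for the $H_\infty$ control problem of nonlinear systems with an unknown internal model. It learns the solution of the HJI equation from real system data, proves convergence, and is applied to an F16 aircraft plant and a rotational/translational actuator system.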

Comments Accepted by IEEE Transactions on Cybernetics. IEEE Transactions on Cybernetics, Online Available, 2014

DOI: 10.1109/TCYB.2014.2319577

AI中文摘要

针对具有未知内部系统模型的非线性系统H∞控制设计问题，本文考虑将非线性H∞控制问题转化为求解所谓的哈密顿-雅可比-伊萨克斯（HJI）方程，该方程是一种通常无法解析求解的非线性偏微分方程。更糟糕的是，当准确系统模型不可用或获取成本高时，基于模型的方法无法近似求解HJI方程。为克服这些困难，本文引入了一种非策略强化学习（RL）方法，从真实系统数据而不是数学系统模型中学习HJI方程的解，并证明其收敛性。在非策略RL方法中，系统数据可以使用任意策略生成，而不是评估策略，这对于实际系统至关重要且具有前景。出于实施目的，采用基于神经网络（NN）的actor-critic结构，并基于残差加权方法推导出最小二乘NN权重更新算法。最后，所开发的基于NN的非策略RL方法在线性F16飞机植物上进行测试，并进一步应用于旋转/平移执行器系统。

英文摘要

The $H_\infty$ control design problem is considered for nonlinear systems with an unknown internal system model. It is known that the nonlinear $H_\infty$ control problem can be transformed into solving the so-called Hamilton-Jacobi-Isaacs (HJI) equation, which is a nonlinear partial differential equation that is generally impossible to solve analytically. Even worse, model-based approaches cannot be used for approximately solving the HJI equation when an accurate system model is unavailable or costly to obtain in practice. To overcome these difficulties, an off-policy reinforcement learning (RL) method is introduced to learn the solution of the HJI equation from real system data instead of a mathematical system model, and its convergence is proved. In the off-policy RL method, the system data can be generated with arbitrary policies rather than the evaluating policy, which is extremely important and promising for practical systems. For implementation purposes, a neural network (NN) based actor-critic structure is employed and a least-squares NN weight update algorithm is derived based on the method of weighted residuals. Finally, the developed NN-based off-policy RL method is tested on a linear F16 aircraft plant, and further applied to a rotational/translational actuator system.

1404.7073 2026-06-04 eess.SY cs.LG cs.LO cs.RO cs.SY 版本更新

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

在时序逻辑约束下进行近似正确马尔可夫决策过程的学习与控制

Jie Fu, Ufuk Topcu

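AI summary: This paper learns near-optimal control policies that satisfy temporal logic specifications in unknown stochastic environments using a PAC-MDP approach, producing a near-optimal policy with high probability in polynomial time and space.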

Comments 9 pages, 5 figures, Accepted by 2014 Robotics: Science and Systems (RSS)

AI中文摘要

我们考虑在未知随机环境中合成控制策略，以最大化满足给定时序逻辑规范的概率。我们将系统与环境的交互建模为具有初始未知转移概率的马尔可夫决策过程（MDP）。所开发的解决方案基于所谓的基于模型的近似正确马尔可夫决策过程（PAC-MDP）方法。该算法通过样本（即观测）、时间和空间，以多项式复杂度与MDP大小、自动机构造时序逻辑规范的大小、1/ε、1/δ和有限时间 horizon 相关，生成一个ε-近优策略，概率为1-δ。在此方法中，系统维护初始未知MDP的模型，并基于其学习模型和规范自动机构造产品MDP。在执行过程中，策略通过观察系统所采取的转移进行迭代更新。迭代在有限步骤内终止。以高概率，所生成的策略使得任何状态下，该策略满足规范的概率与最优策略之间的差异在预定义范围内。

英文摘要

We consider synthesis of control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities. The solution we develop builds on the so-called model-based probably approximately correct Markov decision process (PAC-MDP) methodology. The algorithm attains an $\varepsilon$-approximately optimal policy with probability $1-δ$ using samples (i.e. observations), time and space that grow polynomially with the size of the MDP, the size of the automaton expressing the temporal logic specification, $\frac{1}{\varepsilon}$, $\frac{1}δ$ and a finite time horizon. In this approach, the system maintains a model of the initially unknown MDP, and constructs a product MDP based on its learned model and the specification automaton that expresses the temporal logic constraints. During execution, the policy is iteratively updated using observation of the transitions taken by the system. The iteration terminates in finitely many steps. With high probability, the resulting policy is such that, for any state, the difference between the probability of satisfying the specification under this policy and the optimal one is within a predefined bound.

1404.5899 2026-06-04 math.NA cs.LG cs.NA 版本更新

A Comparison of Clustering and Missing Data Methods for Health Sciences

健康科学中聚类和缺失数据方法的比较

Ran Zhao, Deanna Needell, Christopher Johansen, Jerry L. Grenard

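AI summary: This paper compares clustering and missing-data methods for health behavior research, proposes combining compressed-sensing matrix completion with spectral clustering, and reports experiments in which the combination performs better on both clustering and missing-data handling.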

1404.1377 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Orthogonal Rank-One Matrix Pursuit for Low Rank Matrix Completion

正交秩一矩阵追迹法用于低秩矩阵补全

Zheng Wang, Ming-Jun Lai, Zhaosong Lu, Wei Fan, Hasan Davulcu, Jieping Ye

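AI summary: This paper proposes an efficient, scalable low-rank matrix completion algorithm that extends orthogonal matching pursuit to the matrix case and introduces a new weight update rule to reduce computation and storage. It converges linearly, has a single tunable parameter, and suits large-scale learning problems.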

AI中文摘要

本文提出了一种高效的低秩矩阵补全算法，其核心思想是将正交匹配追踪方法从向量扩展到矩阵领域。我们进一步提出了一种经济版本的算法，通过引入新的权重更新规则来降低时间和存储复杂度。两种版本在每次矩阵追迹迭代中计算成本低廉，且在几次迭代中就能获得满意的结果。我们提出的算法的另一个优点是仅有一个可调参数，即秩，这使得用户易于理解和使用，特别是在大规模学习问题中尤为重要。此外，我们严格证明了两种版本都实现了线性收敛速度，这比之前已知的结果显著更好。我们还通过多个真实世界数据集，包括大规模推荐数据集Netflix以及MovieLens数据集，经验性地比较了所提算法与几种最先进的矩阵补全算法。数值结果表明，所提算法在效率上优于竞争算法，同时在预测性能上达到相似或更好的水平。

英文摘要

In this paper, we propose an efficient and scalable low-rank matrix completion algorithm. The key idea is to extend the orthogonal matching pursuit method from the vector case to the matrix case. We further propose an economic version of our algorithm by introducing a novel weight updating rule to reduce the time and storage complexity. Both versions are computationally inexpensive for each matrix pursuit iteration, and find satisfactory results in a few iterations. Another advantage of our proposed algorithm is that it has only one tunable parameter, which is the rank. This makes it easy to understand and use, which becomes especially important in large-scale learning problems. In addition, we rigorously show that both versions achieve a linear convergence rate, which is significantly better than previously known results. We also empirically compare the proposed algorithms with several state-of-the-art matrix completion algorithms on many real-world datasets, including the large-scale recommendation dataset Netflix as well as the MovieLens datasets. Numerical results show that our proposed algorithm is more efficient than competing algorithms while achieving similar or better prediction performance.

1403.7737 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Sharpened Error Bounds for Random Sampling Based $\ell_2$ Regression

随机采样基于ℓ₂回归的误差界改进

Shusen Wang

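AI summary: This paper discusses two random-sampling methods for more efficient ℓ₂ regression, sharpens the error bound to c = O(d log d + d/ε) samples for 1+ε accuracy, and shows that uniform sampling attains a 2+ε bound under certain conditions.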

Comments unpublished manuscript

AI中文摘要

给定数据矩阵X∈R^{n×d}和响应向量y∈R^n，当n>d时，求解最小二乘回归（LSR）问题需要O(n d²)时间与O(n d)空间。当n和d均较大时，精确求解非常昂贵。当n>>d时，将y和X的所有列随机嵌入到较小的子空间R^c中，可将LSR问题的行数减少，从而以O(c d²)时间和O(c d)空间求解。本文讨论了两种随机采样方法以更高效地求解LSR。先前工作表明基于杠杆分数采样的LSR在c≥O(d ε^{-2} log d)时可达到1+ε精度。本文改进此误差界，证明当c=O(d log d + d/ε)时即可实现1+ε精度。此外，当c≥O(μd ε^{-2} log d)时，均匀采样基于LSR以正概率达到2+ε的界。

英文摘要

Given a data matrix $X \in R^{n\times d}$ and a response vector $y \in R^{n}$ with $n>d$, solving the least squares regression (LSR) problem costs $O(n d^2)$ time and $O(n d)$ space. When $n$ and $d$ are both large, solving the LSR problem exactly is very expensive. When $n \gg d$, one feasible approach to speeding up LSR is to randomly embed $y$ and all columns of $X$ into a smaller subspace $R^c$; the induced LSR problem has the same number of columns but far fewer rows, and it can be solved in $O(c d^2)$ time and $O(c d)$ space. In this paper we discuss two random-sampling-based methods for solving LSR more efficiently. Previous work showed that leverage-score-sampling-based LSR achieves $1+ε$ accuracy when $c \geq O(d ε^{-2} \log d)$. We sharpen this error bound, showing that $c = O(d \log d + d ε^{-1})$ is enough to achieve $1+ε$ accuracy. We also show that when $c \geq O(μd ε^{-2} \log d)$, uniform-sampling-based LSR attains a $2+ε$ bound with positive probability.
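
To make the row-sampling idea concrete, the following is a minimal pure-Python sketch of generic leverage-score sampling for least squares. It is an illustration under stated assumptions, not the paper's exact procedure: the helper names (`gram_schmidt_q`, `solve_normal_equations`, `leverage_sampled_lsr`) are hypothetical, the leverage scores are computed exactly via Gram-Schmidt (which itself costs $O(n d^2)$, so the sketch shows the sampling step rather than a speed-up), and the toy data is noise-free so that the sampled problem recovers the exact coefficients.

```python
import math
import random

def gram_schmidt_q(X):
    """Return Q (same shape as X) whose orthonormal columns span the columns of X."""
    n, d = len(X), len(X[0])
    cols = [[X[i][j] for i in range(n)] for j in range(d)]
    q_cols = []
    for v in cols:
        for q in q_cols:  # modified Gram-Schmidt: subtract projections one at a time
            proj = sum(a * b for a, b in zip(q, v))
            v = [a - proj * b for a, b in zip(v, q)]
        norm = math.sqrt(sum(a * a for a in v))
        q_cols.append([a / norm for a in v])
    return [[q_cols[j][i] for j in range(d)] for i in range(n)]

def solve_normal_equations(X, y):
    """Least squares via X^T X w = X^T y, using Gaussian elimination with pivoting."""
    d = len(X[0])
    M = [[sum(r[i] * r[j] for r in X) for j in range(d)]
         + [sum(r[i] * t for r, t in zip(X, y))] for i in range(d)]
    for k in range(d):
        p = max(range(k, d), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, d):
            f = M[r][k] / M[k][k]
            M[r] = [a - f * b for a, b in zip(M[r], M[k])]
    w = [0.0] * d
    for k in reversed(range(d)):
        w[k] = (M[k][d] - sum(M[k][j] * w[j] for j in range(k + 1, d))) / M[k][k]
    return w

def leverage_sampled_lsr(X, y, c, seed=0):
    """Sample c rows with probability proportional to leverage, rescale, and solve."""
    sampler = random.Random(seed)
    d = len(X[0])
    Q = gram_schmidt_q(X)
    scores = [sum(q * q for q in row) for row in Q]  # leverage scores, summing to d
    probs = [s / d for s in scores]
    idx = sampler.choices(range(len(X)), weights=probs, k=c)
    scale = [1.0 / math.sqrt(c * probs[i]) for i in idx]  # standard importance rescaling
    Xs = [[s * v for v in X[i]] for i, s in zip(idx, scale)]
    ys = [s * y[i] for i, s in zip(idx, scale)]
    return solve_normal_equations(Xs, ys)

# Toy noise-free problem: 200 rows, 3 columns, only 30 sampled rows are solved.
data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1) for _ in range(3)] for _ in range(200)]
w_true = [2.0, -1.0, 0.5]
y = [sum(a * w for a, w in zip(row, w_true)) for row in X]
w_hat = leverage_sampled_lsr(X, y, c=30)
```

With noisy responses the sampled solution is only approximately optimal, and the sample sizes quoted in the abstract describe how large `c` must be for a $1+ε$ guarantee.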

1311.4468 2026-06-04 cs.LG cs.SY eess.SY physics.data-an stat.ML 版本更新

Stochastic processes and feedback-linearisation for online identification and Bayesian adaptive control of fully-actuated mechanical systems

随机过程与反馈线性化用于完全驱动机械系统的在线识别和贝叶斯自适应控制

Jan-Peter Calliess, Antonis Papachristodoulou, Stephen J. Roberts

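AI summary: This paper proposes a new method that combines probabilistic identification and control. It uses stochastic process priors together with structural knowledge from Lagrangian mechanics, and applies feedback linearisation to enable flexible nonparametric Bayesian learning for fully-actuated mechanical systems.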

1403.7429 2026-06-04 math.OC cs.DC cs.LG cs.SY eess.SY 版本更新

Distributed Reconstruction of Nonlinear Networks: An ADMM Approach

非线性网络的分布式重建：一种ADMM方法

Wei Pan, Aivar Sootla, Guy-Bart Stan

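AI summary: This paper proposes a distributed algorithm for reconstructing large-scale nonlinear networks. It uses ADMM to decompose the problem into subproblems in order to identify nonlinear functional forms and parameters from time-series data.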

Comments To appear in the Preprints of 19th IFAC World Congress 2014

AI中文摘要

本文提出了一种分布式算法用于大规模非线性网络的重建。特别是，我们关注从时间序列数据中识别大规模非线性网络的非线性函数形式及其相关参数。最近，非线性网络重建问题被表述为一个非凸优化问题，基于边际似然最大化过程与稀疏诱导先验的结合。使用凸-凹过程（CCCP），推导出迭代加权lasso算法来解决初始非凸优化问题。通过利用该加权lasso算法的目标函数结构，可以设计出分布式算法。为此，我们应用交替方向乘子法（ADMM）将原问题分解为几个子问题。为了说明所提方法的有效性，我们使用我们的方法来识别具有不同网络规模（500~100,000个节点）的相互连接库克振子网络。

英文摘要

In this paper, we present a distributed algorithm for the reconstruction of large-scale nonlinear networks. In particular, we focus on the identification from time-series data of the nonlinear functional forms and associated parameters of large-scale nonlinear networks. Recently, a nonlinear network reconstruction problem was formulated as a nonconvex optimisation problem based on the combination of a marginal likelihood maximisation procedure with sparsity-inducing priors. Using a convex-concave procedure (CCCP), an iterative reweighted lasso algorithm was derived to solve the initial nonconvex optimisation problem. By exploiting the structure of the objective function of this reweighted lasso algorithm, a distributed algorithm can be designed. To this end, we apply the alternating direction method of multipliers (ADMM) to decompose the original problem into several subproblems. To illustrate the effectiveness of the proposed methods, we use our approach to identify networks of interconnected Kuramoto oscillators of different sizes (500 to 100,000 nodes).
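
For orientation, the standard scaled-form ADMM iteration for a single weighted lasso subproblem is written out below. This is a textbook sketch and not the paper's distributed decomposition: the symbols $A$, $b$, $\lambda$, the weights $w_i$ (as would come from one reweighting step), and the penalty parameter $\rho$ are assumptions introduced for illustration.

```latex
% Assumed weighted lasso with the variable split x = z:
\min_{x,\,z} \; \tfrac{1}{2}\,\|Ax - b\|_2^2 + \lambda \sum_{i} w_i\,|z_i|
\quad \text{subject to} \quad x = z

% Scaled-form ADMM updates with penalty parameter \rho > 0 and scaled dual u:
x^{k+1} = \bigl(A^{\top}A + \rho I\bigr)^{-1}\bigl(A^{\top}b + \rho\,(z^{k} - u^{k})\bigr)
z_i^{k+1} = S_{\lambda w_i/\rho}\bigl(x_i^{k+1} + u_i^{k}\bigr)
u^{k+1} = u^{k} + x^{k+1} - z^{k+1}

% Soft-thresholding operator used in the z-update:
S_{\kappa}(a) = \operatorname{sign}(a)\,\max\bigl(|a| - \kappa,\; 0\bigr)
```

The $z$-update separates across coordinates, which is the kind of structure that makes splitting the work across nodes natural.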

1312.7292 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Two Timescale Convergent Q-learning for Sleep--Scheduling in Wireless Sensor Networks

双时间尺度收敛Q学习用于无线传感器网络中的睡眠调度

Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar

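AI summary: This paper studies how to optimize sleep times in wireless sensor networks so as to extend network lifetime while minimizing tracking error. It proposes a two-timescale convergent Q-learning algorithm that combines function approximation with policy gradient updates to improve sleep scheduling.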

AI中文摘要

本文考虑无线传感器网络中的入侵检测应用，研究如何调度单个传感器的睡眠时间以最大化网络寿命并最小化跟踪误差。我们将该问题建模为部分可观测马尔可夫决策过程（POMDP），具有连续状态-动作空间。我们提出了一种双时间尺度收敛的Q学习算法，用于处理POMDP的维度灾难问题。该算法结合了一种策略梯度更新方法，使用单次模拟的同时扰动随机逼近（SPSA）估计在较快的时间尺度上进行更新，而Q值参数则在较慢的时间尺度上通过类似于时序差分（TD）算法的方式更新。特征选择方案管理能量和跟踪组件，以帮助寻找最优的睡眠调度策略。我们还开发了Q学习算法的函数近似类比，但该算法没有理论收敛保证。此外，我们还调整了算法以包含随机迭代估计方案，用于估计入侵者的移动模型。在二维网络设置下的仿真结果表明，我们的算法在仅增加少量传感器的情况下，相比最近的先前工作实现了更好的跟踪精度。

英文摘要

In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work.

1403.4267 2026-06-04 math.NA cs.LG cs.NA 版本更新

Balancing Sparsity and Rank Constraints in Quadratic Basis Pursuit

在二次基追踪中平衡稀疏性和秩约束

Cagdas Bilen, Gilles Puy, Rémi Gribonval, Laurent Daudet

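AI summary: This paper studies methods that enforce sparsity and low-rank structure simultaneously, proposes a new way to analyze the tradeoff between the sparsity and low-rank constraints, and validates it through simulations.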

1403.1863 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Statistical Structure Learning, Towards a Robust Smart Grid

统计结构学习，迈向稳健的智能电网

Hanie Sedghi, Edmond Jonckheere

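AI summary: This paper proposes a decentralized false data injection detection scheme based on the Markov graph of the bus phase angles. It learns the grid structure with the conditional covariance test, shows via the DC power flow model that under normal conditions the grid graph determines the Markov graph of the phase angles, raises an alarm when the computed graph and the learned structure differ, and detects sophisticated attacks without additional hardware.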

AI中文摘要

电网的稳健控制和维护依赖于准确的数据。PMUs和状态估计器都容易受到虚假数据注入攻击。因此，必须有一个机制来快速准确地检测恶意代理篡改数据的行为——这不仅对防止可能导致停电的攻击至关重要，也对当前和未来电网的常规监控和控制任务至关重要。我们提出了一种基于公交相角马尔可夫图的去中心化虚假数据注入检测方案。我们利用条件协方差测试（CCT）来学习电网的结构。使用直流功率流模型，我们证明在正常情况下，由于电网图的行走可求和性，电网图可确定电压角的马尔可夫图。因此，计算的马尔可夫图与学习结构之间的差异应触发警报。本地电网拓扑结构可以从保护系统在线获取，并利用它来检查不匹配。如果检测到不匹配，我们使用相关异常分数来检测受攻击的节点集合。我们的方法可以检测假设了解系统公交-支路模型并能欺骗状态估计器、损害电力网络观测、控制、监控、需求响应和定价方案的最近期隐秘欺骗攻击。具体而言，在隐秘欺骗攻击下，相角马尔可夫图发生变化。除了检测攻击状态外，我们的方法还可以检测受攻击的节点集合。据我们所知，我们的方法是首次全面检测这种复杂攻击，并且不需要额外硬件。此外，我们的检测方案无论受攻击子集的大小都能成功。各种电力网络的模拟验证了我们的主张。

英文摘要

Robust control and maintenance of the grid rely on accurate data. Both PMUs and state estimators are prone to false data injection attacks. Thus, it is crucial to have a mechanism for fast and accurate detection of an agent maliciously tampering with the data, both for preventing attacks that may lead to blackouts and for routine monitoring and control tasks of current and future grids. We propose a decentralized false data injection detection scheme based on the Markov graph of the bus phase angles. We utilize the Conditional Covariance Test (CCT) to learn the structure of the grid. Using the DC power flow model, we show that under normal circumstances, and because of walk-summability of the grid graph, the Markov graph of the voltage angles can be determined by the power grid graph. Therefore, a discrepancy between the calculated Markov graph and the learned structure should trigger an alarm. Local grid topology is available online from the protection system and we exploit it to check for mismatch. Should a mismatch be detected, we use a correlation anomaly score to detect the set of attacked nodes. Our method can detect the most recent stealthy deception attack on the power grid, which assumes knowledge of the bus-branch model of the system and is capable of deceiving the state estimator and damaging power network observation, control, monitoring, demand response and pricing schemes. Specifically, under the stealthy deception attack, the Markov graph of phase angles changes. In addition to detecting a state of attack, our method can detect the set of attacked nodes. To the best of our knowledge, our remedy is the first to comprehensively detect this sophisticated attack and it does not need additional hardware. Moreover, our detection scheme is successful no matter the size of the attacked subset. Simulation of various power networks confirms our claims.

1309.1369 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO 版本更新

Semistochastic Quadratic Bound Methods

半随机二次界方法

Aleksandr Y. Aravkin, Anna Choromanska, Tony Jebara, Dimitri Kanevsky

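AI summary: This paper proposes semistochastic quadratic bound methods for maximum likelihood inference through optimization involving the partition function. It proves global convergence under weak assumptions and linear convergence under stronger ones, and improves efficiency and stability through inexact subproblem solves and a batch-size selection scheme.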

Comments 11 pages, 1 figure

1312.0516 2026-06-04 cs.LG cs.SY eess.SY stat.AP stat.ML 版本更新

Grid Topology Identification using Electricity Prices

利用电价识别电网拓扑

Vassilis Kekatos, Georgios B. Giannakis, Ross Baldick

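AI summary: This paper studies the potential of recovering grid topology from public market data. It proposes an LMP-based regularized maximum likelihood estimator that exploits low-rank and sparse structure to recover the grid Laplacian, and validates the method on data for the IEEE 14-bus benchmark.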

Comments PES General Meeting 2014 submission

AI中文摘要

本文探讨了仅使用公开市场数据恢复电网拓扑的潜力。在当代批发电力市场中，实时电价通常通过求解受网络约束的经济调度问题确定。在线性直流模型下，位置边际价格（LMP）对应于所涉及线性规划的拉格朗日乘数。有趣的是，时空变化的LMP矩阵具有以下性质：一旦与加权电网拉普拉斯矩阵相乘，就会得到低秩且稀疏的矩阵。利用这一丰富结构，开发了一种正则化的最大似然估计器（MLE）来从LMP中恢复电网拉普拉斯矩阵。所提出的凸优化问题包含促进低秩和稀疏性的正则化项，并通过可扩展的算法求解。在为IEEE 14节点基准生成的价格上进行的数值测试提供了令人鼓舞的拓扑恢复结果。

英文摘要

The potential of recovering the topology of a grid using solely publicly available market data is explored here. In contemporary wholesale electricity markets, real-time prices are typically determined by solving the network-constrained economic dispatch problem. Under a linear DC model, locational marginal prices (LMPs) correspond to the Lagrange multipliers of the linear program involved. The interesting observation here is that the matrix of spatiotemporally varying LMPs exhibits the following property: once premultiplied by the weighted grid Laplacian, it yields a low-rank and sparse matrix. Leveraging this rich structure, a regularized maximum likelihood estimator (MLE) is developed to recover the grid Laplacian from the LMPs. The convex optimization problem formulated includes low-rank- and sparsity-promoting regularizers, and it is solved using a scalable algorithm. Numerical tests on prices generated for the IEEE 14-bus benchmark provide encouraging topology recovery results.

1306.3343 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Relaxed Sparse Eigenvalue Conditions for Sparse Estimation via Non-convex Regularized Regression

松弛的稀疏特征值条件用于非凸正则化回归中的稀疏估计

Zheng Pan, Changshui Zhang

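AI summary: This paper studies relaxed sparse eigenvalue conditions for sparse estimation via non-convex regularized regression, establishes the effectiveness of non-convex regularization for sparse estimation, and shows how coordinate descent can be used to obtain approximate global solutions.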

1306.0308 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics

微分方程的概率解及其在黎曼统计学中的应用

Philipp Hennig, Søren Hauberg

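AI summary: This paper proposes a probabilistic numerical method for initial and boundary value problems that returns a Gaussian process posterior over the solution. The method is useful for statistics on Riemannian manifolds, handles non-analytic ordinary differential equations, makes statistics more robust by marginalising over numerical uncertainty, and leads to new Riemannian algorithms for mean computation and principal geodesic analysis.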

Comments 11 pages (9-page conference paper, plus supplements)

Journal ref: Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS) 2014, Reykjavik, Iceland. Journal of Machine Learning Research: W&CP volume 33

AI中文摘要

我们研究了一种概率数值方法，用于求解初值和边界值问题，该方法返回解的联合高斯过程后验。此类方法在黎曼流形统计中具有实际价值，因为几乎所有计算都涉及非解析常微分方程。概率公式允许对数值解的不确定性进行边际化，使统计结果对不准确性更不敏感。这导致了新的黎曼算法用于均值计算和主地理分析。边际化也意味着结果可能不如点估计精确，从而在状态-of-the-art方法上实现显著加速。我们的方法是关于更广泛观点的论据，即数值计算引起的不确定性应在机器学习算法的整个管道中进行跟踪。

英文摘要

We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.

1401.2288 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Extension of Sparse Randomized Kaczmarz Algorithm for Multiple Measurement Vectors

稀疏随机Kaczmarz算法的扩展：多测量向量问题

Hemant Kumar Aggarwal, Angshul Majumdar

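AI summary: This paper proposes a modified randomized Kaczmarz algorithm for the multiple measurement vector problem, exploiting common sparse support to achieve better recovery and convergence.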

AI中文摘要

Kaczmarz算法因迭代求解超定线性方程组而流行。传统算法在几次遍历中可近似解，但随机版本能指数收敛且与方程数量无关。最近提出基于加权随机Kaczmarz算法的稀疏解算法，但仅适用于单测量向量问题。本文通过修改随机Kaczmarz算法解决多测量向量问题，将视频人脸识别建模为该问题并应用所提技术。在真实和合成数据集上，所提算法在公平约束下优于状态最新型谱投影梯度算法，蒙特卡洛模拟证实其恢复和收敛速率更优。

英文摘要

The Kaczmarz algorithm is popular for iteratively solving an overdetermined system of linear equations. The traditional Kaczmarz algorithm can approximate the solution in a few sweeps through the equations, but a randomized version of the Kaczmarz algorithm was shown to converge exponentially, at a rate independent of the number of equations. Recently, an algorithm for finding a sparse solution to a linear system of equations has been proposed based on the weighted randomized Kaczmarz algorithm. These algorithms solve the single measurement vector problem; however, there are applications where multiple measurements are available. In this work, the objective is to solve a multiple measurement vector problem with common sparse support by modifying the randomized Kaczmarz algorithm. We have also modeled the problem of face recognition from video as a multiple measurement vector problem and solved it using our proposed technique. We have compared the proposed algorithm with the state-of-the-art spectral projected gradient algorithm for multiple measurement vectors on both real and synthetic datasets. The Monte Carlo simulations confirm that our proposed algorithm has better recovery and convergence rates than the MMV version of the spectral projected gradient algorithm under fairness constraints.
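
As background for the abstract, here is a minimal pure-Python sketch of the plain randomized Kaczmarz iteration for a single measurement vector. It is not the paper's sparse or multiple-measurement extension; the function name `randomized_kaczmarz`, the iteration count, and the toy system are assumptions chosen for illustration, and the system is assumed consistent with no all-zero rows.

```python
import random

def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Approximately solve a consistent system A x = b by random row projections."""
    rng = random.Random(seed)
    n_cols = len(A[0])
    # Rows are sampled with probability proportional to their squared norm.
    row_norms_sq = [sum(a * a for a in row) for row in A]
    x = [0.0] * n_cols
    for _ in range(iters):
        i = rng.choices(range(len(A)), weights=row_norms_sq, k=1)[0]
        row = A[i]
        residual = b[i] - sum(a * xj for a, xj in zip(row, x))
        step = residual / row_norms_sq[i]
        # Project the current iterate onto the hyperplane <row, x> = b[i].
        x = [xj + step * a for xj, a in zip(x, row)]
    return x

# Toy overdetermined but consistent system: 4 equations, 2 unknowns.
A = [[1.0, 2.0], [3.0, -1.0], [0.5, 0.5], [2.0, 1.0]]
x_true = [1.0, -2.0]
b = [sum(a * xj for a, xj in zip(row, x_true)) for row in A]
x_hat = randomized_kaczmarz(A, b)
```

Each step touches a single equation, which is why the per-iteration cost does not depend on the number of rows. Extending this to jointly sparse multiple measurement vectors, as the paper does, requires additional structure not shown here.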

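1401.3198 2026-06-04 math.OC cs.LG cs.SY eess.SY version update

Online Markov decision processes with Kullback-Leibler control cost

Peng Guan, Maxim Raginsky, Rebecca Willett

AI summary: This paper studies an online control problem in which decisions are made through a discrete-time random walk on a finite state space, with the KL divergence as the control cost. It proposes a computationally efficient strategy with small regret and validates its performance on a target-tracking task.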

Comments to appear in IEEE Transactions on Automatic Control

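1401.1842 2026-06-04 stat.ML cs.IT cs.LG cs.NA math.IT math.NA version update

Robust Large Scale Non-negative Matrix Factorization using Proximal Point Algorithm

Jason Gejie Liu, Shuchin Aeron

AI summary: This paper proposes a robust algorithm for large-scale non-negative matrix factorization that improves on linear-programming approaches by introducing a reduced set of constraints. It does not require the factorization rank to be known in advance and suits cases where the number of extreme rays or topics far exceeds the dimension of the data vectors.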

Comments Appeared in IEEE GlobalSIP, 2013, TX, Austin

1401.0159 2026-06-04 math.NA cs.LG cs.NA version update

Speeding-Up Convergence via Sequential Subspace Optimization: Current State and Future Directions

Michael Zibulevsky

AI summary: This paper reviews the Sequential Subspace Optimization framework for large-scale unconstrained optimization, discusses its combination with Parallel Coordinate Descent, and considers how merging it with other methods can improve solution efficiency.

Abstract

This is an overview paper written in the style of a research proposal. In recent years we introduced a general framework for large-scale unconstrained optimization, Sequential Subspace Optimization (SESOP), and demonstrated its usefulness for sparsity-based signal/image denoising, deconvolution, compressive sensing, computed tomography, diffraction imaging, and support vector machines. We explored its combination with Parallel Coordinate Descent and Separable Surrogate Function methods, obtaining state-of-the-art results in the above-mentioned areas. There are several methods that are faster than plain SESOP under specific conditions: the trust region Newton method, for problems with an easily invertible Hessian matrix; the truncated Newton method, when fast multiplication by the Hessian is available; stochastic optimization methods, for problems with large stochastic-type data; and multigrid methods, for problems with nested multilevel structure. Each of these methods can be further improved by merging it with SESOP. One can also accelerate the Augmented Lagrangian method for constrained optimization problems and the Alternating Direction Method of Multipliers for problems with a separable objective function and non-separable constraints.

1312.6872 2026-06-04 math.NA cs.LG cs.NA version update

Matrix recovery using Split Bregman

Anupriya Gogna, Ankita Shukla, Angshul Majumdar

AI summary: This paper proposes a Split Bregman algorithm for low-rank matrix recovery that improves convergence speed and success rate and yields better reconstruction accuracy, particularly when few measurements are available.

Abstract

In this paper we address the problem of recovering a matrix with inherent low-rank structure from its lower-dimensional projections. This problem is frequently encountered in a wide range of areas, including pattern recognition, wireless sensor networks, control systems, recommender systems, and image/video reconstruction. Both in theory and in practice, the most effective way to solve the low-rank matrix recovery problem is via nuclear norm minimization. In this paper, we propose a Split Bregman algorithm for nuclear norm minimization. The use of the Bregman technique improves the convergence speed of our algorithm and gives a higher success rate. Also, the accuracy of reconstruction is much better even when only a small number of linear measurements is available. Our claim is supported by empirical results obtained using our algorithm and by its comparison with other existing methods for matrix recovery. The algorithms are compared on the basis of NMSE, execution time, and success rate for varying ranks and sampling ratios.
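
Nuclear norm minimization solvers commonly rely on singular value thresholding, the proximal operator of the nuclear norm. The sketch below illustrates only that building block on a synthetic low-rank matrix; it is not the authors' Split Bregman algorithm. The function name `singular_value_threshold`, the threshold `tau`, and the test data are assumptions made for this example.

```python
import numpy as np


def singular_value_threshold(M, tau):
    """Proximal operator of tau * (nuclear norm): soft-threshold the singular values."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    s_shrunk = np.maximum(s - tau, 0.0)  # shrink every singular value by tau, clip at zero
    return (U * s_shrunk) @ Vt           # rebuild the matrix from the shrunken spectrum


# Hypothetical test data: a rank-3 matrix plus small dense noise.
rng = np.random.default_rng(0)
low_rank = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
noisy = low_rank + 0.01 * rng.standard_normal((30, 20))

denoised = singular_value_threshold(noisy, tau=1.0)
rank = np.linalg.matrix_rank(denoised)
print("rank before:", np.linalg.matrix_rank(noisy), "rank after:", rank)
```

Thresholding removes the small singular values contributed by the noise while keeping the dominant ones, which is how the nuclear norm penalty promotes low-rank solutions.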

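1312.6182 2026-06-04 cs.MS cs.LG cs.NA math.NA stat.ML version update

Large-Scale Paralleled Sparse Principal Component Analysis

W. Liu, H. Zhang, D. Tao, Y. Wang, K. Lu

AI summary: This paper proposes an efficient GPU-based parallel method for sparse principal component analysis that parallelizes four optimization formulations of the generalized power method, substantially improving computational efficiency. Experiments demonstrate its practicality on real-world datasets.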

Comments submitted to Multimedia Tools and Applications

1306.2861 2026-06-04 stat.ML cs.LG cs.SY eess.SY version update

Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

Roger Frigola, Fredrik Lindsten, Thomas B. Schön, Carl E. Rasmussen

AI summary: This paper proposes a fully Bayesian approach to inference and learning in nonlinear, nonparametric state-space models, placing a Gaussian process prior on the state transition dynamics and using particle MCMC for efficient inference.

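1310.3556 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

Identifying Influential Entries in a Matrix

Abhisek Kundu, Srinivas Nambirajan, Petros Drineas

AI summary: This paper proposes a probability distribution for identifying the most influential entries of a matrix and proves that the matrix can be reconstructed exactly after sampling a small number of entries, without assuming incoherence of the matrix.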

Comments There is a bug in the proof of Lemma 5, which we are currently working to fix

1312.2132 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

Robust Subspace System Identification via Weighted Nuclear Norm Optimization

Dorsa Sadigh, Henrik Ohlsson, S. Shankar Sastry, Sanjit A. Seshia

AI summary: This paper proposes a robust subspace system identification method based on weighted nuclear norm optimization that handles outliers by trading off fit, rank, and sparsity.

Comments Submitted to the IFAC World Congress 2014

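1312.1613 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

Max-Min Distance Nonnegative Matrix Factorization

Jim Jing-Yan Wang

AI summary: This paper proposes a supervised non-negative matrix factorization algorithm that uses class labels to split data pairs into within-class and between-class pairs. It minimizes the maximum distance between within-class pairs in the new space while maximizing the minimum distance between between-class pairs, improving the discriminative power of the representation.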

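Variance-Adjusted Actor-Critic Algorithms

Aviv Tamar, Shie Mannor

AI summary: This paper proposes an actor-critic framework for MDPs whose objective is the variance-adjusted expected return. Using linear function approximation and an extension of the compatible-features concept, it presents an episodic algorithm and proves that it converges almost surely to a local optimum of the objective.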

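1309.2375 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO version update

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

Shai Shalev-Shwartz, Tong Zhang

AI summary: This paper proposes an accelerated stochastic dual coordinate ascent method for regularized loss minimization, improving its efficiency through an inner-outer iteration procedure. It improves the theoretical results for key machine learning optimization problems such as support vector machines and logistic regression.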

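1309.5803 2026-06-04 cs.LG cs.DC cs.SY eess.SY math.OC version update

Scalable Anomaly Detection in Large Homogenous Populations

Henrik Ohlsson, Tianshi Chen, Sina Khoshfetrat Pakazad, Lennart Ljung, S. Shankar Sastry

AI summary: This paper proposes an optimization-based approach to anomaly detection in large homogeneous populations, recasting the problem as a convex optimization problem that can be solved in a distributed manner and guaranteeing exact solutions under certain conditions.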

1309.0866 2026-06-04 cs.LO cs.AI cs.LG cs.SY eess.SY version update

On the Robustness of Temporal Properties for Stochastic Models

Ezio Bartocci, Luca Bortolussi, Laura Nenzi, Guido Sanguinetti

AI summary: This paper studies the robustness of temporal properties for stochastic models, proposes robustness measures, and combines them with the satisfaction probability to optimize system design.

Comments In Proceedings HSB 2013, arXiv:1308.5724

DOI: 10.4204/EPTCS.125.1
Journal ref: EPTCS 125, 2013, pp. 3-19

Abstract

Stochastic models such as Continuous-Time Markov Chains (CTMC) and Stochastic Hybrid Automata (SHA) are powerful formalisms to model and to reason about the dynamics of biological systems, due to their ability to capture the stochasticity inherent in biological processes. A classical question in formal modelling with clear relevance to biological modelling is the model checking problem, i.e., calculating the probability that a behaviour, expressed for instance in terms of a certain temporal logic formula, may occur in a given stochastic process. However, one may be interested not only in the notion of satisfiability, but also in the capacity of a system to maintain a particular emergent behaviour unaffected by perturbations caused, e.g., by extrinsic noise or by possible small changes in the model parameters. To address this issue, researchers from the verification community have recently proposed several notions of robustness for temporal logic, providing suitable definitions of distance between a trajectory of a (deterministic) dynamical system and the boundaries of the set of trajectories satisfying the property of interest. The contributions of this paper are twofold. First, we extend the notion of robustness to stochastic systems, showing that this naturally leads to a distribution of robustness scores. By discussing two examples, we show how to approximate the distribution of the robustness score and its key indicators: the average robustness and the conditional average robustness. Second, we show how to combine these indicators with the satisfaction probability to address the system design problem, where the goal is to optimize some control parameters of a stochastic model in order to maximize the robustness of the desired specifications.
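
The indicators named in the abstract can be illustrated with a small Monte Carlo experiment. The sketch below assumes a toy mean-reverting process and the property "x(t) stays above a threshold at all times", whose robustness score is taken to be the smallest margin along the trajectory. The model, its parameters, and all names are hypothetical choices for illustration and do not come from the paper.

```python
import numpy as np


def robustness_always_above(trajectory, threshold):
    """Score for 'always x(t) > threshold': positive iff satisfied, size = worst margin."""
    return float(np.min(trajectory) - threshold)


rng = np.random.default_rng(0)
n_runs, n_steps = 2000, 50

# Toy stochastic model (assumption): noisy process that reverts towards the level 5.
x = np.empty((n_runs, n_steps))
x[:, 0] = 5.0
for t in range(1, n_steps):
    drift = 0.2 * (5.0 - x[:, t - 1])
    x[:, t] = x[:, t - 1] + drift + 0.3 * rng.standard_normal(n_runs)

# One robustness score per simulated trajectory gives an empirical distribution.
scores = np.array([robustness_always_above(traj, 4.0) for traj in x])

satisfaction_probability = float(np.mean(scores > 0))
average_robustness = float(np.mean(scores))
satisfied = scores[scores > 0]
conditional_average_robustness = (
    float(np.mean(satisfied)) if satisfied.size else float("nan")
)

print("satisfaction probability:", satisfaction_probability)
print("average robustness:", average_robustness)
print("average robustness given satisfaction:", conditional_average_robustness)
```

Because every trajectory yields its own score, a stochastic model produces a distribution of scores, and the three printed quantities are summary statistics of that distribution.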

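1308.5329 2026-06-04 cs.LO cs.LG cs.SY eess.SY version update

Monitoring with uncertainty

Ezio Bartocci, Radu Grosu

AI summary: This paper examines how, under mechanisms that control monitoring overhead, statistical models can be used to learn application behavior and fill gaps in the monitoring data in order to estimate the probability that a property is violated.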

Comments In Proceedings HAS 2013, arXiv:1308.4904

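1308.3558 2026-06-04 cs.LG cs.NA math.NA version update

Fast Stochastic Alternating Direction Method of Multipliers

Leon Wenliang Zhong, James T. Kwok

AI summary: This paper proposes a new stochastic ADMM algorithm that incrementally approximates the full gradient in linearized ADMM, improving the convergence rate on convex problems to O(1/T) and matching the rate of batch ADMM without having to access all samples.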

1308.2853 2026-06-04 cs.LG cs.IR cs.NA math.NA math.ST stat.ML stat.TH version update

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

Animashree Anandkumar, Daniel Hsu, Majid Janzamin, Sham Kakade

AI summary: This paper studies the identifiability of overcomplete topic models from observable moments of a given order and shows that structured sparsity constraints yield uniqueness of tensor Tucker decompositions.

Abstract

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabulary. While general overcomplete topic models are not identifiable, we establish generic identifiability under a constraint referred to as topic persistence. Our sufficient conditions for identifiability involve a novel set of "higher order" expansion conditions on the topic-word matrix or the population structure of the model. This set of higher-order expansion conditions allows for overcomplete models and requires the existence of a perfect matching from latent topics to higher-order observed words. We establish that random structured topic models are identifiable with high probability in the overcomplete regime. Our identifiability results allow for general (non-degenerate) distributions for modeling the topic proportions, and thus we can handle arbitrarily correlated topics in our framework. Our identifiability results imply uniqueness of a class of tensor decompositions with structured sparsity which is contained in the class of Tucker decompositions, but is more general than the Candecomp/Parafac (CP) decomposition.

1306.2665 2026-06-04 cs.IT cs.LG cs.SY eess.SY math.IT math.OC stat.ML version update

Precisely Verifying the Null Space Conditions in Compressed Sensing: A Sandwiching Algorithm

Myung Cho, Weiyu Xu

AI summary: This paper proposes new algorithms for verifying the null space conditions in compressed sensing; by computing α_k efficiently, they improve on the complexity and accuracy of traditional methods.

Comments 30 pages

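Conditions for Convergence in Regularized Machine Learning Objectives

Patrick Hop, Xinghao Pan

AI summary: This paper studies methods for analyzing the convergence rates of modern convex optimization algorithms, examines how nonlinear delays in distributed computation affect convergence, and gives results on the existence of convergence and lower bounds on the convergence rate.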

Comments 3 Pages

1305.0395 2026-06-04 math.NA cs.LG cs.NA q-bio.NC stat.ML version update

Tensor Decompositions: A New Concept in Brain Data Analysis?

Andrzej Cichocki

AI summary: This paper surveys new models and approaches for tensor decompositions in multiway BSS/ICA, feature extraction, classification, and multiway PLS regression, covering constrained Tucker and CP models and penalized tensor decompositions.

Journal ref: Control Measurement, and System Integration (SICE), special issue; Measurement of Brain Functions and Bio-Signals, 7, 507-517, (2011)

Abstract

Matrix factorizations and their extensions to tensor factorizations and decompositions have become prominent techniques for linear and multilinear blind source separation (BSS), especially multiway Independent Component Analysis (ICA), Nonnegative Matrix and Tensor Factorization (NMF/NTF), Smooth Component Analysis (SmoCA) and Sparse Component Analysis (SCA). Moreover, tensor decompositions have many other potential applications beyond multilinear BSS, especially feature extraction, classification, dimensionality reduction and multiway clustering. In this paper, we briefly overview new and emerging models and approaches for tensor decompositions in applications to group and linked multiway BSS/ICA, feature extraction, classification and Multiway Partial Least Squares (MPLS) regression problems. Keywords: Multilinear BSS, linked multiway BSS/ICA, tensor factorizations and decompositions, constrained Tucker and CP models, Penalized Tensor Decompositions (PTD), feature extraction, classification, multiway PLS and CCA.

1304.7710 2026-06-04 eess.SY cs.LG cs.SY physics.soc-ph version update

Learning Geo-Temporal Non-Stationary Failure and Recovery of Power Distribution

Yun Wei, Chuanyi Ji, Floyd Galvan, Stephen Couvillon, George Orellana, James Momoh

AI summary: This paper studies the failure and recovery behavior of power distribution networks in non-stationary environments, proposes a new modeling approach, and validates the learning of model parameters on real-world cases.

Comments 12 pages, 12 figures, Accepted with minor revisions by TNNLS, Special Issue on Learning in Nonstationary and Evolving Environments. arXiv admin note: text overlap with arXiv:1202.4720

Abstract

The smart energy grid is an emerging area for new applications of machine learning in a non-stationary environment. Such a non-stationary environment emerges when large-scale failures occur in power distribution networks due to external disturbances such as hurricanes and severe storms. Power distribution networks lie at the edge of the grid and are especially vulnerable to external disruptions. Quantifiable approaches for learning the non-stationary behaviors of large-scale failure and recovery of power distribution are lacking and needed. This work studies such non-stationary behaviors in three aspects. First, a novel formulation is derived for the entire life cycle of large-scale failure and recovery of power distribution. Second, spatial-temporal models of failure and recovery of power distribution are developed as geo-location based multivariate non-stationary GI(t)/G(t)/Infinity queues. Third, the non-stationary spatial-temporal models identify a small number of parameters to be learned. Learning is applied to two real-life examples of large-scale disruptions. One is from Hurricane Ike, where data from an operational network is exact on failures and recoveries. The other is from Hurricane Sandy, where aggregated data is used for inferring failure and recovery processes in one of the impacted areas. Model parameters are learned using real data. Two findings emerge as results of learning: (a) failure rates behave similarly in the two different provider networks for the two different hurricanes but differently across geographical regions; (b) both rapid and slow recovery are present for Hurricane Ike, but only slow recovery is shown for a regional distribution network from Hurricane Sandy.
