arXivDaily: arXiv daily academic digest, updated Monday to Friday
2606.05165 2026-06-04 cs.LG cs.CL (updated version)

STRIDE: Training Data Attribution via Sparse Recovery from Subset Perturbations

Rishit Dagli, Abir Harrasse, Luke Zhang, Florent Draye, Amirali Abdullah, Bernhard Schölkopf, Zhijing Jin

Affiliations: Jinesis AI Lab, University of Toronto & Vector Institute; Max Planck Institute for Intelligent Systems, Tübingen, Germany; Thoughtworks; Martian; ELLIS Institute, Tübingen, Germany; EuroSafeAI

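AI summary: Proposes STRIDE, a framework that casts training data attribution as a sparse recovery problem in the spirit of compressive sensing, using lightweight "steering operators" in activation space to mimic the effect of training on data subsets, for efficient and accurate attribution of LLM pre-training data.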

Comments project page: https://stride-tda.github.io/

Abstract

Training Data Attribution (TDA) seeks to trace a model's predictions back to its training data. The gold standard for TDA relies on causal interventions, observing how a model changes when data is added or removed, but repeated retraining is computationally challenging for Large Language Models (LLMs). Consequently, most approaches approximate this effect in the parameter space using gradients. However, tracking gradients across billions of parameters is not only prohibitively expensive but relies on local approximations. In this work, we propose a shift: rather than estimating parameter changes, we model the functional effect of training data in the activation space. We introduce STRIDE (Steering-based Training Data Influence Decomposition), a framework that formulates TDA as a sparse recovery problem in the spirit of compressive sensing. STRIDE learns lightweight "steering operators" that mimic the behavioral shift caused by training on data subsets. By measuring how these operators perturb test predictions, we recover individual training example influences via sparse linear decomposition. STRIDE achieves state-of-the-art for LLM pre-training attribution while being an order of magnitude ($13\times$) faster than previous art. We further validate its practical utility through downstream applications including data selection, data contamination, and qualitative analysis.
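
The sparse-recovery framing in this abstract can be illustrated with a small, generic compressive-sensing sketch. It is not STRIDE itself: the steering operators are replaced by an assumed linear measurement model in which each row of a 0/1 matrix marks a random training subset and the measured effect on a test prediction is the sum of the per-example influences in that subset. The function names, the problem sizes, the synthetic influence values, and the choice of orthogonal matching pursuit as the solver are all illustrative assumptions.

```python
import numpy as np


def omp(A, y, k):
    """Orthogonal matching pursuit: greedily select k columns of A to explain y."""
    norms = np.linalg.norm(A, axis=0)
    support = []
    coef = np.zeros(0)
    residual = y.copy()
    for _ in range(k):
        # Score each column by its normalised correlation with the residual.
        scores = np.abs(A.T @ residual) / norms
        if support:
            scores[support] = -np.inf  # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Refit all selected coefficients jointly, then update the residual.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x = np.zeros(A.shape[1])
    x[support] = coef
    return x


def recover_influences(subsets, effects, k):
    """Recover a k-sparse influence vector from subset-level measurements."""
    # Centre columns and measurements; for effects = subsets @ x this keeps the
    # linear relation exact while removing the shared offset of a 0/1 design.
    A = subsets - subsets.mean(axis=0)
    y = effects - effects.mean()
    return omp(A, y, k)


rng = np.random.default_rng(0)
n_train, n_subsets = 200, 100            # fewer measurements than training examples
true_influence = np.zeros(n_train)
true_influence[[5, 50, 120]] = [2.0, -1.5, 1.0]   # synthetic, made-up values
subsets = rng.integers(0, 2, size=(n_subsets, n_train)).astype(float)
effects = subsets @ true_influence       # assumed additive subset effect
estimate = recover_influences(subsets, effects, k=3)
```

The point of the sketch is only that a few influential examples can be identified from far fewer subset-level measurements than there are training examples, provided the influence vector is sparse and the measurement model is close to linear.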

2606.05149 2026-06-04 cs.CV cs.LG eess.IV (updated version)

An Open-Source Two-Stage Computer Vision Pipeline for Fine-Grained Vehicle Classification using Vision Transformers

Gandhimathi Padmanaban, Fred Feng

Affiliations: Department of Electrical and Computer Engineering, University of California, Los Angeles, CA, USA

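AI summary: Proposes a two-stage pipeline that combines an RT-DETR detector with a fine-tuned ViT-Base/16 for six-category body-type classification, adds a confidence-based abstention mechanism, and reaches 0.94 accuracy in-distribution and 0.89 out-of-distribution.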

Comments 24 pages, 10 figures, venue TBD

Abstract

Vehicle body type is a significant determinant of cyclist injury severity in overtaking crashes, yet automated tools for classifying vehicles into injury-risk-relevant categories from naturalistic roadway video do not exist in the open literature. Standard object detection benchmarks provide only coarse vehicle labels (car, truck, bus, motorcycle), while existing fine-grained recognition systems are trained on controlled imagery and lack evaluation for deployment robustness across recording sites. This paper presents an open-source two-stage computer vision pipeline combining a pre-trained RT-DETR detector for coarse vehicle localization with a fine-tuned Vision Transformer (ViT-Base/16) for six-category body-type classification: passenger car, SUV, pickup truck, minivan, large van, and commercial truck. A confidence-based abstention mechanism withholds Stage 2 predictions when softmax output falls below 0.60, producing unknown labels rather than silent misclassifications. Evaluated on 3,805 annotated overtaking events from a bicycle-lane corridor in Ann Arbor, Michigan (in-distribution), the pipeline achieved 0.94 accuracy with per-class F1 scores from 0.91 (minivan) to 0.97 (SUV). On an independent out-of-distribution evaluation of 311 events from an open cycling dataset without retraining, accuracy was 0.89. Three of four well-represented categories maintained F1 at or above 0.90 under domain shift. The largest degradation was observed for minivan (F1 = 0.72), driven by abstention rate rising from 2.4% to 25.0% rather than active misclassification, consistent with the mechanism propagating genuine model uncertainty. The full pipeline, including inference scripts, training code, evaluation utilities, and model weights, is released as open-source software to support reproducibility and reuse across roadside video archives and cycling safety research.
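
The abstention rule described in this abstract can be sketched in a few lines. The class list and the 0.60 threshold come from the abstract; reading "softmax output" as the maximum class probability is an assumption, and the function names and example logits are hypothetical rather than taken from the released code.

```python
import math

CLASSES = ["passenger car", "SUV", "pickup truck", "minivan", "large van", "commercial truck"]


def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify_with_abstention(logits, threshold=0.60):
    """Return the top class, or "unknown" when its probability is below threshold."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    if probs[best] < threshold:
        return "unknown", probs[best]
    return CLASSES[best], probs[best]


# Made-up logits: one clear winner, one near-uniform distribution.
confident = classify_with_abstention([0.0, 5.0, 0.0, 0.0, 0.0, 0.0])
uncertain = classify_with_abstention([1.0, 1.0, 1.0, 1.2, 1.0, 1.0])
```

With this design a low-confidence detection surfaces as an explicit "unknown" label that downstream analysis can count, instead of being folded silently into the wrong class.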

2606.05145 2026-06-04 cs.LG cs.AI cs.CL (updated version)

Failed Reasoning Traces Tell You What Is Fixable (But Not by Reading Them)

Nizar Islah, Istabrak Abbes, Irina Rish, Sarath Chandar, Eilif B. Muller

Affiliations: Mila - Quebec AI Institute; Université de Montréal; Polytechnique Montréal; CHU Sainte-Justine

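AI summary: Proposes identifying fixable failures from the distributional signature of failed reasoning traces rather than their text, and designs a training-free routing rule that improves test-time intervention.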

Abstract

When post-trained language models fail on reasoning problems, the common test-time-scaling response is to spend more compute on additional attempts, and the failed traces play no further role. We argue this discards a crucial signal; some failures come from unlucky sampling, where more rollouts help, while others are structural and resist resampling regardless of budget. We propose that failed traces encode recoverability structure: the inference-time signature of which test-time interventions can rescue a given failure. Three problem-level trajectory features, derived from the structure of available interventions, recover this structure from the distributional signature of failed rollouts, not their text. They cluster failures into stable regimes, characterize the failure topography of different post-training methods ($84.3{\pm}4.3\%$ accuracy, $+20\%$ over a majority-class baseline), and support a training-free routing rule that lifts rescue by $+12.2\%$ on the deployment-relevant Steerable-Hard subset (failures where retry is insufficient and a bounded intervention is reachable). The features and the routing rule transfer across two cross-family probes. The same three features thus convert failed traces from discarded data into a diagnostic object, supporting test-time routing and post-training analysis without training-time or weight-space access.

2606.05139 2026-06-04 cs.LG (updated version)

BBOmix: A Tabular Benchmark for Hyperparameter Optimization of Unsupervised Biological Representation Learning

Luca Thale-Bombien, Jan Ewald, Ralf König, Aaron Klein

Affiliations: Center for Scalable Data Analytics and Artificial Intelligence (ScaDS.AI); Leipzig University; ELLIS Institute

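AI summary: For omics data produced by high-throughput sequencing, introduces BBOmix, the first open-source tabular benchmark for hyperparameter optimization of unsupervised representation learning, with 105,000 evaluations across four autoencoder architectures and seven multi-omics modalities.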

Abstract

The rapid advancement of high-throughput sequencing has led to large, high-dimensional omics datasets. Deep unsupervised learning architectures, particularly Autoencoders (AEs), are increasingly used for dimensionality reduction and representation learning in this domain. However, AEs are highly sensitive to architectural choices and hyperparameters, and unsupervised optimization typically relies on reconstruction loss, which may be a poor proxy for downstream utility. Exhaustive hyperparameter optimization (HPO) is computationally expensive, leading researchers to frequently rely on suboptimal default configurations. To democratize access to large-scale unsupervised HPO research, we introduce BBOmix, the first open-source tabular benchmark for unsupervised representation learning on real-world biological data. Our benchmark includes 105,000 evaluations across four AE architectures and seven multi-omics modalities from the TCGA and SCHC datasets. We quantify the correlation between reconstruction loss and downstream task performance and provide an extensive evaluation of state-of-the-art single-fidelity, multi-fidelity, and transfer learning HPO methods, establishing a rigorous baseline for future research in unsupervised biological representation learning.

2606.05138 2026-06-04 cs.LG q-fin.ST (updated version)

Generating Financial Time Series by Matching Random Convolutional Features

Konrad J. Mueller, Nikita Zozoulenko, Ben Wood, Thomas Cass, Lukas Gonon

Affiliations: Imperial College London; JPMorgan Chase & Co.; University of St. Gallen

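AI summary: Introduces SOCK (SOft Competing Kernels), a differentiable random convolutional feature map, and trains generators by matching random convolutional features of real and generated time series, outperforming signature and diffusion baselines on small-sample financial datasets.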

Abstract

Generating realistic financial time series is challenging as training data is often limited to a single historical path. With such scarce data, overfitting is hard to avoid, especially under adversarial training where a trained discriminator can memorize the training samples. To mitigate this, recent approaches train generators to minimize the discrepancy between untrained feature representations of real and generated time series. In these works, the feature maps are based on path signatures, which can fail to capture relevant time series properties at tractable truncation depths. In this work, we instead train generators by matching random convolutional features of real and generated time series. Existing random convolutional feature maps, such as Rocket and Hydra, have been shown to provide informative representations of real-world time series, but cannot supervise generative models because they are non-differentiable. We introduce SOCK (SOft Competing Kernels), a fully differentiable random convolutional feature map, suited to train generative time series models. We show that generators trained by matching random SOCK features consistently outperform signature and diffusion baselines across a wide range of small-sample financial datasets. We further demonstrate SOCK's expressiveness on two-sample hypothesis testing and time series classification tasks, where SOCK matches or outperforms existing unsupervised feature maps.

2606.05134 2026-06-04 cs.CL cs.LG (updated version)

Activation-Based Active Learning for In-Context Learning: Challenges and Insights

Yaseen M. Osman, Geoff V. Merrett, Stuart E. Middleton

Affiliations: School of Electronics and Computer Science (ECS), University of Southampton

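AI summary: Studies MLP-activation-based deep active learning for in-context learning and finds that activation signals correlate only weakly with example quality or task performance, indicating that such methods are unsuited to in-context learning.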

Comments 9 pages, 3 figures

Abstract

Deep active learning has previously been explored for LLM in-context sample selection, but not with methods that utilise recent advances in understanding of transformer activations. In this paper, we test the hypothesis that model activations could provide a fine-grained signal to optimise the selection of in-context examples. We present the most comprehensive analysis to date of MLP activation-based deep active learning methods applied to in-context learning, including how different attention masking strategies impact active learning across diverse classification and generative datasets, using both Llama-3.2-3B and Qwen2.5-3B base models. However, we find a negative result: MLP outputs, viewed through the lenses of massive activations or the first four moments, do not correlate with example quality or task performance. Specifically, the absolute Spearman correlation coefficient is at most 0.33 for all tasks and models we tested, showing that such activation-based sampling should not be used for in-context learning. We hypothesise that this may be due to superposition, whereby models represent more features than they have dimensionality, suggesting that methods like Sparse Autoencoders (SAEs) may be a promising future direction.
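
The headline statistic in this abstract, the absolute Spearman correlation between an activation-derived score and example quality, can be computed as sketched below. The helper assumes there are no tied values, and the two arrays are hypothetical numbers chosen for illustration; they are not results from the paper.

```python
import numpy as np


def spearman(x, y):
    """Spearman rank correlation for inputs without tied values."""
    # argsort of argsort turns values into 0-based ranks when there are no ties.
    rank_x = np.argsort(np.argsort(x))
    rank_y = np.argsort(np.argsort(y))
    # Spearman's rho is the Pearson correlation of the ranks.
    return float(np.corrcoef(rank_x, rank_y)[0, 1])


# Hypothetical numbers for illustration only.
activation_score = np.array([0.10, 0.20, 0.30, 0.40, 0.50])
example_quality = np.array([0.62, 0.55, 0.80, 0.71, 0.90])
rho = abs(spearman(activation_score, example_quality))
```

A selection signal is only useful if this value is high across tasks; the abstract reports that it never exceeds 0.33 for the activation statistics the authors tested.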

2606.05131 2026-06-04 cs.LG cs.NA math.DS math.NA math.OC math.SP (updated version)

Deep Embedded Multiplicative DMD for Algebra-Preserving Koopman Learning

Kelan Gray, Finlay Brown, Nicolas Boullé, Matthew J. Colbrook

Affiliations: Department of Mathematics, Imperial College London; Department of Applied Mathematics and Theoretical Physics, University of Cambridge

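AI summary: Proposes DeepMDMD, which combines deep learning with multiplicative DMD and enforces the Koopman product rule as an algebraic constraint in latent space, learning compact, dynamically coherent dictionaries that give stable forecasts and reduce spectral pollution.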

Comments 26 pages, 11 figures

Abstract

Koopman theory turns nonlinear dynamics into a linear spectral problem. In computation, however, everything depends on a hard finite-dimensional choice: the observables must be expressive, nearly invariant under the dynamics, and, ideally, compatible with composition. Deep Koopman methods learn flexible coordinates, whereas structure-preserving methods enforce operator identities on fixed dictionaries. We combine these ideas by introducing Deep Embedded Multiplicative Dynamic Mode Decomposition (DeepMDMD), a method that learns a latent space and a partition of it, while enforcing the Koopman product rule as an exact algebraic constraint. Training alternates between an exact multiplicative operator update and a differentiable latent-clustering step that promotes Koopman closure. The result is a finite transition map on learned latent cells. Its nonzero spectrum lies on the unit circle, its dictionary is shaped by the dynamics rather than by ambient geometry, and forecasts are made in latent coordinates before being decoded to physical space. Across Hamiltonian, chaotic, and fluid examples, DeepMDMD learns dictionaries that are far more compact and dynamically coherent than those produced by geometric MDMD partitions. It reduces spectral pollution, reveals richer continuous-spectrum structure, and gives stable forecasts under severe noise. In high-dimensional flows, including a 158,624-dimensional cylinder wake and a noisy $Re=20,000$ lid-driven cavity, it preserves coherent structures and long-time spectral statistics where state-space MDMD fails. These results suggest a practical rule for Koopman learning: learn the coordinates, constrain the algebra.
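
Two algebraic facts mentioned in this abstract, the Koopman product rule and the location of the nonzero spectrum of a finite transition map, can be checked numerically on a toy example. The four-cell map below is hypothetical and hand-written; the sketch illustrates the constraint only and does not reproduce how DeepMDMD learns its latent cells.

```python
import numpy as np

# Hypothetical dynamics on four cells: 0 -> 1 -> 2 -> 0 is a cycle, 3 -> 0 is transient.
next_cell = np.array([1, 2, 0, 0])
n_cells = len(next_cell)

# Koopman (composition) matrix: (K @ g)[i] = g[next_cell[i]].
K = np.zeros((n_cells, n_cells))
K[np.arange(n_cells), next_cell] = 1.0

rng = np.random.default_rng(0)
g = rng.normal(size=n_cells)   # two arbitrary observables on the cells
h = rng.normal(size=n_cells)

# Product rule: advancing a product equals the product of the advanced factors.
product_rule_holds = np.allclose(K @ (g * h), (K @ g) * (K @ h))

# The cycle contributes roots of unity; the transient cell contributes a zero eigenvalue.
eigenvalues = np.linalg.eigvals(K)
nonzero_moduli = np.abs(eigenvalues)[np.abs(eigenvalues) > 1e-8]
```

Because every row of `K` selects exactly one cell, the product rule holds exactly for any pair of observables, which is the property a generic least-squares DMD matrix does not guarantee.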

2606.05130 2026-06-04 cs.LG cs.AI (updated version)

Towards Efficient and Evidence-grounded Mobility Prediction with LLM-Driven Agent

Linyao Chen, Qinlao Zhao, Zechen Li, Mingming Li, Likun Ni, Jinyu Chen, Yuhao Yao, Xuan Song, Noboru Koshizuka, Hiroki Kobayashi

Affiliations: The University of Tokyo; Huazhong University of Science and Technology; University of New South Wales, Sydney; LocationMind Inc.; Southern University of Science and Technology; Jilin University

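AI summary: Proposes AgentMob, a training-free LLM-driven agent framework that resolves ambiguous mobility-prediction cases through adaptive evidence gathering and achieves the strongest overall performance among training-free LLM-based methods on three datasets.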

Abstract

Individual-level mobility prediction is central to urban simulation, transportation planning, and policy analysis. Supervised sequence models achieve strong accuracy but require task-specific training and offer limited decision-level transparency. Recent LLM-based methods improve interpretability, yet mostly rely on static prompts and single-pass inference, limiting their ability to seek additional evidence when mobility signals are weak or conflicting. We propose AgentMob, a training-free LLM-driven agent framework that formulates next-location prediction as adaptive evidence-controlled decision making. AgentMob resolves routine cases through a fast path based on historical regularity, while ambiguous cases trigger iterative tool use over recent trajectories, historical behavior, stay-move likelihood, and geographical evidence. Across three mobility datasets, AgentMob achieves the strongest overall performance among training-free LLM-based methods, with GPT-5.4 reaching 71.42% Acc@1 on BW, 33.14% on YJMob100K, and 33.50% on Shanghai ISP. On BW non-fast-path cases, the LLM controller improves Acc@1 from 30.65% to 48.62% over a same-tool statistical baseline, showing that its main benefit lies in resolving ambiguous predictions through adaptive evidence gathering. Our code is available at https://github.com/Unknown-zoo/AgentMob.
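
The routing idea in this abstract, a fast path for routine cases and evidence gathering for ambiguous ones, can be sketched as follows. This is a simplified stand-in rather than the AgentMob implementation: measuring "historical regularity" as the share of the most frequent location, the 0.8 threshold, and every function name are assumptions, and the slow path is a placeholder callback where an LLM controller with tools would sit. The history is assumed to be non-empty.

```python
from collections import Counter


def predict_next_location(history, gather_evidence, regularity_threshold=0.8):
    """Route a prediction through a fast path or an evidence-gathering slow path."""
    counts = Counter(history)
    top_location, top_count = counts.most_common(1)[0]
    regularity = top_count / len(history)
    if regularity >= regularity_threshold:
        # Routine case: the historical pattern is strong enough to answer directly.
        return top_location, "fast_path"
    # Ambiguous case: defer to a caller-supplied routine (e.g. an LLM with tools).
    return gather_evidence(history), "evidence_path"


def most_recent(history):
    """Stand-in for tool use: simply return the latest visited location."""
    return history[-1]


routine = predict_next_location(["home"] * 9 + ["gym"], most_recent)
ambiguous = predict_next_location(["home", "office", "gym", "cafe"], most_recent)
```

Keeping the expensive path behind a cheap regularity check is what lets such a design spend LLM calls only on the cases where the abstract reports the largest gains.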

2606.05129 2026-06-04 cs.CR cs.LG (updated version)

Preserving Data Privacy in Learning Causal Structure with Fully Homomorphic Encryption

Jian Yang, Yuan Tong, Qinbin Li, Zeyi Wen, Xiaofang Zhou

Affiliations: Hong Kong University of Science and Technology (Guangzhou); Hong Kong University of Science and Technology; University of California, Berkeley

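AI summary: To address privacy leakage in distributed causal structure learning, proposes a method based on fully homomorphic encryption that uses circuit simplification, approximations of division and logarithm, and SIMD batching to learn causal structure efficiently on encrypted data, and that can be extended to differential privacy.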

Abstract

Preserving data privacy is an important topic in structural data management and data mining. However, privacy leakage in distributed causal structure learning is a persistent challenge, especially in cases where data transmission and computation are required. In this paper, we propose a method based on fully homomorphic encryption (FHE) that performs calculations on ciphertexts, keeping data encrypted in transit and during computation. Nevertheless, adapting FHE to causal structure learning is challenging due to the high computation cost and limited support for division and logarithm operations in FHE. To tackle this challenge, we propose a series of novel techniques including (i) circuit simplification for better efficiency, (ii) approximation of division and logarithm through Newton-Raphson reciprocal and Taylor expansion, and (iii) a batching technique with SIMD acceleration to enhance the whole learning process. Additionally, our method can be easily extended beyond FHE, which we demonstrate by porting it to support differential privacy. Empirical results show that our method recovers causal structures that are highly consistent with and comparable to those of the plaintext version on the datasets tested. Finally, our method is efficient and practical, completing causal structure learning in tens of minutes even under the privacy protection of FHE.
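
The two approximations named in item (ii) can be written using only additions and multiplications, which is the reason they suit FHE. The sketch below runs on plain floats standing in for ciphertexts; it is not the paper's circuit, and the iteration counts, the starting value, and the assumed input ranges (stated in the docstrings) are illustrative choices.

```python
import math


def reciprocal_newton(a, iterations=6, x0=1.0):
    """Approximate 1/a using only multiplication and subtraction.

    Converges when 0 < a * x0 < 2; with x0 = 1 that means 0 < a < 2.
    """
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - a * x)   # Newton-Raphson step for f(x) = 1/x - a
    return x


def log_taylor(v, terms=20):
    """Approximate ln(v) by the Taylor series of ln(1 + u) with u = v - 1, |u| < 1."""
    u = v - 1.0
    total, power = 0.0, 1.0
    for k in range(1, terms + 1):
        power *= u
        # 1/k is a public constant, so no encrypted division would be needed.
        total += ((-1.0) ** (k + 1)) * power * (1.0 / k)
    return total


def divide(numerator, denominator):
    """Division expressed as multiplication by an approximate reciprocal."""
    return numerator * reciprocal_newton(denominator)
```

Each Newton-Raphson step squares the relative error, so a handful of iterations suffices once the input has been scaled into the convergence range; each step also adds multiplicative depth, which is the cost that matters under FHE.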

2606.05124 2026-06-04 cs.GR cs.CV cs.LG (updated version)

Geometry Gaussians: Decoupling Appearance and Geometry in Gaussian Splatting

Hongyu Zhou, Zorah Lähner

Affiliations: University of Bonn; Lamarr Institut

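AI summary: To address the conflict between geometric representation and appearance rendering in 3D Gaussian Splatting, proposes adding a geometry opacity parameter to each splat, together with an optional transparency-curated optimization pipeline, to decouple geometry from appearance and improve rendering and geometry performance on complex scenes, especially those with transparent objects.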

Abstract

After the success of 3D Gaussian Splatting (3DGS) for novel view synthesis, many works have explored how to also use it for geometric surface representation. However, extracting accurate geometric information directly from 3DGS remains challenging and can often reduce the appearance rendering quality. In this work, we show that 3DGS in its default form is inherently unsuited to represent texture and geometry at the same time, by training with complete ground-truth texture and geometry information. We also propose a simple solution by applying a single additional geometry opacity parameter to each splat, together with an optional transparency-curated optimization pipeline. Our experiments, both with ground-truth and vision foundation model geometric input, show that this change leads to improved rendering and geometry performance on a wide variety of datasets, and that complex scenes with transparent objects in particular benefit significantly from our method.
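
Why a single opacity per splat struggles with transparent objects can be seen on one ray with standard front-to-back alpha compositing. The scene below (a glass pane in front of a wall) and all its numbers are hypothetical, and compositing depth with a separate geometry opacity is an assumed reading of the abstract, not the authors' exact formulation.

```python
import numpy as np


def composite(values, opacities):
    """Front-to-back alpha compositing of per-splat values along one ray."""
    values = np.asarray(values, dtype=float)
    opacities = np.asarray(opacities, dtype=float)
    # Transmittance reaching splat i is the product of (1 - alpha) of all splats in front.
    transmittance = np.concatenate(([1.0], np.cumprod(1.0 - opacities)[:-1]))
    weights = transmittance * opacities
    return weights @ values


# Hypothetical ray: a nearly transparent glass pane at depth 1 in front of a red wall at depth 3.
colors = [[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]]
depths = [1.0, 3.0]
appearance_opacity = [0.1, 1.0]   # glass barely contributes to colour
geometry_opacity = [1.0, 1.0]     # but is a solid surface for geometry

pixel_color = composite(colors, appearance_opacity)
depth_shared = composite(depths, appearance_opacity)    # one opacity for both
depth_decoupled = composite(depths, geometry_opacity)   # separate geometry opacity
```

With a shared opacity the rendered depth lands at 2.8, behind the glass, whereas the separate geometry opacity places the surface at depth 1 while leaving the pixel colour unchanged.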

2606.05116 2026-06-04 cs.LG (updated version)

Graph Set Transformer

Jose E. Escrig Molina, Baoquan Chen, Daniel Probst

Affiliations: Bioinformatics Group, Wageningen University; Department of Physics, Technical University of Munich

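AI summary: Proposes the Graph Set Transformer (GST), which interleaves node-level feature propagation with cross-graph contextual modelling at every layer to fuse local structure with set context for learning on sets of graphs, outperforming baselines on synthetic and real-data benchmarks.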

Comments 10 pages, 1 figure, conference

Abstract

We introduce the Graph Set Transformer (GST), a neural network architecture for learning on sets of graphs, designed for tasks in which per-element predictions depend on set-wide context as well as local structure. Existing architectures, including DeepSets and SetTransformer, require pre-encoded graph embeddings from a separate GNN, creating a bottleneck between feature extraction and set-level contextualisation. In contrast, GST interleaves node-level feature propagation and cross-graph contextual modelling at every layer, fusing the two levels of information through a gating mechanism. We evaluate GST on a controlled synthetic suite designed to isolate set-conditional structural reasoning and on three real-data benchmarks spanning per-atom reaction-centre identification, reaction yield prediction, and image classification. Under matched parameter budgets, GST performs better than the baselines across these settings. An architectural ablation strongly suggests that the interleaving of local and set context contributes substantially to this advantage.
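
The interleaving described in this abstract, local propagation and set-level context fused by a gate within a single layer, can be sketched roughly as below. This is not the GST architecture: mean-neighbour aggregation, mean pooling over the set in place of cross-graph attention, the single gate matrix `w_gate`, and the toy inputs are all simplifying assumptions.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def interleaved_layer(node_feats, adjacencies, w_gate):
    """One layer mixing within-graph propagation with set-level context via a gate."""
    # Local step: mean over neighbours (adjacency includes self-loops here).
    local = []
    for feats, adj in zip(node_feats, adjacencies):
        degree = np.maximum(adj.sum(axis=1, keepdims=True), 1.0)
        local.append((adj @ feats) / degree)
    # Set step: pool each graph, then average over the set as a shared context.
    context = np.mean([h.mean(axis=0) for h in local], axis=0)
    fused = []
    for h in local:
        ctx = np.broadcast_to(context, h.shape)
        gate = sigmoid(np.concatenate([h, ctx], axis=1) @ w_gate)
        fused.append(gate * h + (1.0 - gate) * ctx)
    return fused


rng = np.random.default_rng(0)
dim = 4
graphs = [rng.normal(size=(3, dim)), rng.normal(size=(5, dim))]
adjs = [np.ones((3, 3)), np.eye(5)]     # toy adjacencies with self-loops
w_gate = np.zeros((2 * dim, dim))       # zero weights give a gate of exactly 0.5
outputs = interleaved_layer(graphs, adjs, w_gate)
```

Stacking several such layers lets node features see set context before the next round of message passing, in contrast to encoding each graph fully and only then contextualising the pooled embeddings.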

2606.05109 2026-06-04 cs.LG (updated version)

RePercENT: Scaling Disentangled Representation Learning Beyond Two Modalities

Vasiliki Rizou, Pascal Frossard, Dorina Thanou

Affiliations: EPFL

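AI summary: Proposes RePercENT, a framework that uses a multimodal plug-and-play architecture and a joint optimization objective to achieve scalable pairwise disentanglement beyond two modalities, without joint pre-training and with reduced computational complexity.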

Abstract

To leverage the full potential of multimodal data, we need representations that go beyond the state-of-the-art alignment and fusion approaches and exploit all cross-modal interactions without sacrificing modality-specific information. Learning disentangled representations is a principled way to identify the underlying shared and unique factors that are hidden in observational data. However, while multimodal disentanglement is a compelling paradigm, existing methods are largely confined to the two-modality regime due to their inherent scalability bottleneck. To address this, we propose RePercENT, a self-supervised framework designed to surpass these limitations and unlock scalable pairwise disentanglement beyond two modalities. Through a multimodal "plug-and-play" architecture, our approach operates directly on pre-extracted embeddings, eliminating the need for extensive joint pre-training while making no assumptions regarding the underlying modalities or foundation model backbones. Moreover, we introduce a joint optimization objective for simultaneously deriving the shared and unique components, and provide formal theoretical guarantees that characterize the optimality of our solution. Across diverse modalities and tasks, RePercENT successfully recovers disentangled components while maintaining competitive performance and significantly reducing computational complexity.

2606.05103 2026-06-04 cs.LG astro-ph.IM cs.CV stat.ML (updated version)

Identifying Gems from Roman RAPIDly

Karan Gandhi, Ashish A. Mahabal, Jacob E. Jencson, Russ R. Laher, Ben Rusholme, Lin Yan, Ryan M. Lau, Schuyler D. Van Dyk, Mansi M. Kasliwal

Affiliations: Department of Computer Science and Engineering, Indian Institute of Technology, Gandhinagar, India; Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA; Center for Data Driven Discovery, California Institute of Technology, Pasadena, CA 91125, USA; IPAC, California Institute of Technology, 1200 E. California Blvd, Pasadena, CA 91125, USA; Caltech Optical Observatories, California Institute of Technology, Pasadena, CA 91125, USA

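AI summary: Because no real data from the Roman Space Telescope exist yet, proposes the machine learning model RuBR and a general methodology for distinguishing genuine transient and variable sources from bogus detections in the RAPID pipeline; experiments indicate the approach is promising for robust real-bogus classification in the Roman era.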

Comments 15 pages, 10 figures, Submitted to the Publications of the Astronomical Society of the Pacific

Abstract

The Nancy Grace Roman Space Telescope (Roman), set for launch as early as September 2026, will conduct wide-field infrared imaging surveys with unprecedented spatial resolution and cadence, enabling the discovery of millions of astronomical transients. Hence, it is necessary to have automated pipelines for generating alerts in place so that the telescope can begin discovering reliable transients and variable objects soon after it is launched. However, no real Roman data currently exist, making the development of such pipelines difficult. In this work, we present a machine learning model $RuBR$ and a general methodology for distinguishing genuine transient and variable detections from spurious (bogus) detections within the RAPID pipeline. In particular, we present three models using this methodology: $RuBR_{comb}$ trained and tested on combined locally injected and OpenUniverse2024 transients, $RuBR_{loc}$ trained on locally injected transients and tested on OpenUniverse2024 transients, and $RuBR_{DA}$ that combines locally injected transients with a fraction of OpenUniverse2024 transients in domain-adaptation mode for training. This paves the way for strategies to adapt the $RuBR_{comb}$ model to real observations in the absence of any ground-truth labels during the early phases of the Roman mission. While the image differencing pipeline continues to be improved, our experimental results demonstrate the effectiveness of the proposed approach and its promise for robust real-bogus classification in the Roman era.

2606.05101 2026-06-04 cs.SD cs.LG (updated version)

FoeGlass: Simple In-Context Learning Is Enough for Red Teaming Audio Deepfake Detectors

Sepehr Dehdashtian, Jacob H Seidman, Vishnu N Boddeti, Gaurav Bharaj

Affiliations: University of California, Berkeley

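AI summary: Proposes FoeGlass, a black-box automated red-teaming method based on LLM in-context learning that generates audio samples to uncover blind spots in audio deepfake detectors, raising their false negative rate by up to 94%.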

Comments Accepted at ICML 2026

Abstract

Audio deepfake detection (ADD) models are critical for countering the malicious use of text-to-speech (TTS) models. Evaluating and strengthening ADD models requires developing datasets that span the space of generated audio and highlight high-error regions. Existing dataset development strategies face two challenges: (i) manual collection, and (ii) inefficient discovery of blind spots in the ADD models. To address these challenges, we propose FoeGlass, the first black-box automated red-teaming method for ADDs, which effectively discovers ADD failure modes in the space of generated audio underexplored by state-of-the-art deepfake benchmarks. FoeGlass uses the in-context learning capabilities of an LLM to explore the input space of a TTS model, generating audio samples that fool the target ADD using only black-box access to all components. By using a carefully designed context based on diversity measurements, FoeGlass mitigates the common problem of mode collapse in automated red-teaming systems. Empirical evaluations on several open-source ADD and TTS models demonstrate that data generated by FoeGlass substantially increases false negative rates relative to unconditional sampling baselines and recent spoofing datasets, by up to 94%, while requiring no manual supervision. Furthermore, we show that the attacks generated by FoeGlass are transferable across different target ADDs, demonstrating its broad applicability and ease of use for the automated red teaming of ADD systems. Finally, fine-tuning ADD models on FoeGlass-generated samples notably enhances the robustness of the detectors (by up to 41%).

2606.05080 2026-06-04 cs.AI cs.LG 版本更新

AutoLab: Can Frontier Models Solve Long-Horizon Auto Research and Engineering Tasks?

AutoLab:前沿模型能否解决长周期自动研究与工程任务?

Zhangchen Xu, Junda Chen, Yue Huang, Dongfu Jiang, Jiefeng Chen, Hang Hua, Zijian Wu, Zheyuan Liu, Zexue He, Lichi Li, Shizhe Diao, Jiaxin Pei, Jinsung Yoon, Hao Zhang, Mengdi Wang, Radha Poovendran, Misha Sra, Alex Pentland, Zichen Chen

Affiliations * MIT; Stanford University; University of California, Berkeley; University of California, Los Angeles; University of California, San Diego; University of Washington; University of Toronto; University of Michigan; National University of Singapore; University of Tokyo

AI summary: This paper introduces the AutoLab benchmark, which evaluates frontier models on 36 expert-curated, long-horizon closed-loop optimization tasks, and finds that persistent iteration and use of empirical feedback matter more than the quality of the initial attempt.

Comments Code: https://github.com/autolabhq/autolab ; Website: https://autolab.moe/

详情
AI中文摘要

科学和工程进步本质上是一个长周期迭代过程:提出更改、运行实验、测量结果并不断改进工件。然而,现有的前沿模型基准主要评估单轮响应或短周期智能体轨迹,未能捕捉在长时间跨度内持续迭代改进的挑战。为了解决这一差距,我们引入了AutoLab,一个用于超长周期闭环优化的新基准。AutoLab包含36个现实且由专家策划的任务,涵盖四个不同领域:系统优化、谜题与挑战、模型开发和CUDA内核优化。每个任务从一个正确但故意次优的基线开始,并挑战智能体在严格的挂钟预算内改进它。评估17个最先进模型的结果表明,成功的主要预测因素不是智能体初始尝试的质量,而是其持续进行基准测试、编辑和整合经验反馈的毅力。虽然claude-opus-4.6表现出强大的长周期优化能力,但大多数前沿模型,包括几个专有模型,要么过早终止,要么在预算内进展甚微。这些结果强调了时间意识和持续迭代在自主智能体中的重要性。我们开源了完整的基准、评估框架和任务工件,以加速研究真正有能力的长周期智能体。

英文摘要

Scientific and engineering progress is fundamentally a long-horizon iterative process: proposing changes, running experiments, measuring outcomes, and continuously refining artifacts. Yet existing benchmarks for frontier models primarily evaluate either single-turn responses or short-horizon agent trajectories, failing to capture the challenges of sustained iterative improvement over extended time horizons. To address this gap, we introduce AutoLab, a new benchmark for ultra long-horizon closed-loop optimization. AutoLab consists of 36 realistic, expert-curated tasks spanning four diverse domains: system optimization, puzzle & challenge, model development, and CUDA kernel optimization. Each task begins with a correct but deliberately suboptimal baseline and challenges agents to improve it within a strict wall-clock budget. Evaluating 17 state-of-the-art models reveals the dominant predictor of success is not the quality of an agent's initial attempt, but its persistence in repeatedly benchmarking, editing, and incorporating empirical feedback. While claude-opus-4.6 exhibits strong long-horizon optimization capabilities, most frontier models, including several proprietary ones, either terminate prematurely or exhaust their budgets with minimal progress. These results underscore the importance of time awareness and persistent iteration in autonomous agents. We open-source the full benchmark, evaluation harness, and task artifacts, to accelerate research toward truly capable long-horizon agents.

2606.05079 2026-06-04 cs.CL cs.LG 版本更新

Fast & Faithful Function Vectors

快速且保真的函数向量

Minh An Pham, Anton Segeler, Thomas Wiegand, Wojciech Samek, Sebastian Lapuschkin, Patrick Kahardipraja, Reduan Achtibat


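AI summary: By refining attention-head selection and applying steering in a distributed manner, this study uses gradient-based Layer-wise Relevance Propagation (LRP) to make function vectors (FVs) more efficient and accurate, enabling fast and faithful steering of large language models (LLMs).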

详情
AI中文摘要

函数向量(FV)是在上下文学习过程中产生的任务表示,可用于引导大型语言模型(LLM)。然而,其公式中的设计选择仍未得到充分探索。在这项工作中,我们研究了沿两个自由度(注意力头选择和引导)改变FV定义对指令的影响。对于头选择,使用基于梯度的逐层相关性传播(LRP)显著提高了效率和准确性。对于FV引导,分布式应用比简单聚合获得了更高的准确性。我们的代码已公开。

英文摘要

Function vectors (FVs) are task representations elicited during in-context learning that can be used to steer Large Language Models (LLMs). However, design choices in their formulation remain underexplored. In this work, we study the impact of varying FV definitions for instructions along two degrees of freedom: attention head selection and steering. For head selection, using gradient-based attributions with Layer-wise Relevance Propagation (LRP) substantially improves efficiency as well as accuracy. For FV steering, applying it in a distributed manner yields a higher accuracy compared to simple aggregation. Our code is publicly available.

2606.05073 2026-06-04 cs.LG 版本更新

Learning What Not to Impute: An Uncertainty-Aware Diffusion Framework for Meaningful Missingness

学习什么不该插补:一种面向有意义缺失的不确定性感知扩散框架

Lixing Zhang, Yidong Ouyang, Weifu Li, Shixiang Zhu, Guang Cheng, Liyan Xie

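Affiliations * University of Science and Technology of China

AI summary: Proposes Diff-Joint, a diffusion framework that jointly models tabular data and a latent missingness mask, alternating between conditional sampling and uncertainty-aware aggregation to separate meaningful missingness from entries that should be imputed, which enables selective imputation.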

详情
AI中文摘要

缺失值插补是机器学习中的一项基本任务,现有大多数方法假设所有缺失条目对应于未观测到的常规值。然而,在许多现实世界数据集中,缺失可能源于两个不同的来源:一些条目是有意义缺失(本质上不存在且语义有效),而另一些则因观测过程而缺失,应被插补。我们将这一区别形式化为选择性插补问题,目标是共同推断哪些缺失条目应被保留,哪些应被恢复。为应对这一挑战,我们提出了Diff-Joint,一种基于扩散的框架,联合建模表格数据与潜在缺失掩码。该方法在条件采样和不确定性感知聚合之间交替,以迭代优化插补值和缺失标签。在合成和真实数据集上的实验结果表明,Diff-Joint能有效识别有意义缺失条目,同时实现具有竞争力的插补精度和改善的下游任务性能。

英文摘要

Missing value imputation is a fundamental task in machine learning, with most existing methods assuming that all missing entries correspond to unobserved regular values. In many real-world datasets, however, missingness may arise from two distinct sources: some entries are meaningfully missing (intrinsically absent and semantically valid), while others are missing due to the observation process and should be imputed. We formalize this distinction as a selective imputation problem, where the goal is to jointly infer which missing entries should be preserved and which should be recovered. To address this challenge, we propose Diff-Joint, a diffusion-based framework that jointly models tabular data together with a latent missingness mask. The method alternates between conditional sampling and uncertainty-aware aggregation to iteratively refine both imputed values and missingness labels. Empirical results on synthetic and real-world datasets demonstrate that Diff-Joint effectively identifies meaningfully missing entries while achieving competitive imputation accuracy and improved downstream task performance.

2606.05070 2026-06-04 cs.LG 版本更新

RIDE: An Open Dataset and Benchmark for Train Delay Prediction

RIDE:用于列车延误预测的开放数据集与基准

Clément Elliker, Mathis Le Bail, Clément Mantoux, Jesse Read, Sonia Vanier

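Affiliations * LIX, École Polytechnique, IP Paris, France; e.SNCF Solutions, France

AI summary: To address the lack of standardized datasets and evaluation protocols for train delay prediction, the authors build RIDE, an open dataset covering the Belgian national railway network, and use it for the first comprehensive comparison of non-learning, statistical learning, and deep learning models.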

Comments 58 pages, 41 figures

详情
AI中文摘要

列车延误预测对乘客和铁路运营商都是一个重要问题,但由于缺乏标准化的数据集、预测目标和评估协议,该领域的进展仍然难以评估。为了解决这一问题,我们引入了RIDE,一个在比利时铁路网全国范围内构建的开放数据集和基准。RIDE涵盖了2023年至2025年的9450万次列车事件、360万次行程和3570万条天气记录。它被组织成一个分层数据管道,从原始铁路和天气数据源到两个公开发布版本:一个可重用的中间关系数据集和模型就绪的基准数据集。该基准标准化了预测任务以及训练和测试数据。它还提供了一个统一的评估协议,支持模型间的直接比较。利用这一框架,我们首次对非学习模型、统计学习模型和深度学习模型进行了全面的比较评估。我们表明,基于学习的方法明显优于非学习模型,其中图神经网络实现了最佳的平均性能,而最强的基于学习模型之间则相对接近。除了聚合的平均绝对误差(MAE)和均方根误差(RMSE)外,该框架还提供了按预测时间范围和延误变化分类的细分结果,从而能够更详细地分析模型在不同预测场景下的行为。

英文摘要

Train delay prediction is an important problem for both passengers and railway operators, yet progress in the field remains difficult to assess due to the lack of standardized datasets, prediction targets, and evaluation protocols. To address this gap, we introduce RIDE, an open dataset and benchmark for train delay prediction built at nationwide scale over the Belgian railway network. RIDE covers 94.5M train events, 3.6M journeys, and 35.7M weather records from 2023 to 2025. It is organized as a layered data pipeline from raw railway and weather sources to two public releases: a reusable intermediate relational dataset and model-ready benchmark datasets. The benchmark standardizes the prediction task and the training and testing data. It also provides a unified evaluation protocol that supports direct comparison across models. Using this framework, we provide the first comprehensive comparative evaluation of non-learning, statistical learning, and deep learning models. We show that learning-based methods clearly outperform non-learning models, with graph neural networks achieving the best mean performance, while the strongest learning-based models remain relatively close to one another. Beyond aggregate mean absolute error (MAE) and root mean squared error (RMSE), the framework also provides breakdowns by prediction horizon and delay change, enabling more detailed analysis of model behavior across forecasting regimes.
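
The abstract above reports aggregate MAE and RMSE plus breakdowns by prediction horizon. As a rough illustration of what those metrics compute, here is a minimal sketch in plain Python. The record layout `(horizon_minutes, true_delay, predicted_delay)` and the function names are assumptions made for this example and are not taken from the RIDE benchmark code.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae_by_horizon(records):
    """Group absolute errors by prediction horizon and average each group.

    `records` is an iterable of (horizon_minutes, true_delay, predicted_delay);
    this layout is hypothetical and chosen only for illustration.
    """
    groups = {}
    for horizon, true_delay, predicted_delay in records:
        groups.setdefault(horizon, []).append(abs(true_delay - predicted_delay))
    return {h: sum(errs) / len(errs) for h, errs in sorted(groups.items())}
```

RMSE penalizes large errors more heavily than MAE, so reporting both, and splitting them by horizon, helps show whether a model fails on a few long delays or is uniformly off.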

2606.05067 2026-06-04 cs.LG 版本更新

FLAGG: Flexible Autoregressive Graph Generation

FLAGG:灵活自回归图生成

Samuel Cognolato, Alessandro Sperduti, Luciano Serafini

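Affiliations * Department of Mathematics, University of Padova; Fondazione Bruno Kessler (FBK); Department of Information Engineering and Computer Science, University of Trento

AI summary: Proposes the FLAGG framework, which combines one-shot models with autoregressive sequential generation to handle graphs of varying size and topology, and outperforms purely one-shot and purely autoregressive baselines on several datasets.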

Comments Accepted for publication at JMLR, currently in press

详情
AI中文摘要

深度图生成的全景涵盖了两个极端:一次性模型和顺序模型。前者联合生成节点和边,而后者以自回归方式采样它们。每种方法在不同图域中根据大小和拓扑表现更好,但都不适用于所有图类别。例如,一次性方法难以生成大图,而顺序方法在小图上表现不佳。克服这些限制的一种可能方法是在一个统一系统中灵活结合这两种方法。在这项工作中,我们提出了FLAGG(灵活自回归图生成)框架,该框架使用一次性模型顺序生成图的部分。FLAGG可以应用任何一次性模型使其自回归,从而灵活选择顺序策略。该策略通过一个随机节点移除过程来指定,插入模型学习逆转该过程。我们使用DiGress一次性模型在多个不同图大小和领域的数据集上评估FLAGG。结果表明,该方法在采样质量上优于一次性基线和自回归基线。

英文摘要

The landscape of deep graph generation spans two extremes: one-shot and sequential models. The former generate nodes and edges jointly, while the latter sample them autoregressively. Each method performs better in different graph domains depending on size and topology, but neither is applicable to all graph categories. For instance, one-shot methods struggle with generating large graphs, while sequential methods underperform on smaller graphs. A possible way to overcome these limitations is to flexibly combine the two methods in a single system. In this work, we propose the FLAGG (Flexible Autoregressive Graph Generation) framework, which sequentially generates portions of graphs with one-shot models. FLAGG can apply any one-shot model to make it autoregressive, allowing flexibility in choosing the sequential policy. This policy is specified through a stochastic node removal process, which an Insertion Model learns to reverse. We evaluate FLAGG with the DiGress one-shot model on several data sets of different graph sizes and domains. We show that the approach outperforms both one-shot and autoregressive baselines in terms of sampling quality.

2606.05046 2026-06-04 cs.LG stat.ML 版本更新

Graph Cascades: Contagion-Based Mesoscopic Rewiring for Structure-Aware Graph Machine Learning

图级联:基于传染的介观重连用于结构感知图机器学习

Meher Chaitanya, My Le, Luana Ruiz

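Affiliations * KTH Royal Institute of Technology; Johns Hopkins University

AI summary: Proposes Graph Cascades, a mesoscopic rewiring strategy based on contagion diffusion that builds an auxiliary graph to help graph neural networks and graph transformers capture intermediate-scale structure; it improves several backbones on node classification, and the theory characterizes when rewiring helps.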

详情
AI中文摘要

我们引入图级联(Graph Cascades),一种用于图神经网络(GNN)和图变换器(GT)的介观重连策略,它能够捕获超出纯局部边或完全全局注意力的中间尺度图结构。基于传染扩散过程,Graph Cascades 在 O(|V|+|E|) 时间内构建一个辅助图,其中由重复多跳强化支持的节点对被提升为直接邻居。我们从理论上刻画了基于强化的重连何时有帮助:强化边选择比直接邻接更标签对齐的充分条件,一个两跳强化完全同质的 SBM 示例,以及通过图有效电阻对介观连通性的形式化。实验上,在节点分类基准测试中,Graph Cascades 改进了多个 GNN 和稀疏 GT 骨干网络,在异质图和中等至高同质度图上观察到最可靠的增益。理论条件还识别了介观重连不太可能有益的场景——低度正则图和存在结构瓶颈的图——这些预测与观察到的失败相符。我们还观察到重连图中性能与结构属性之间的紧密相关性。

英文摘要

We introduce Graph Cascades, a mesoscopic rewiring strategy for Graph Neural Networks (GNNs) and Graph Transformers (GTs) that captures intermediate-scale graph structure beyond purely local edges or fully global attention. Using contagion-based diffusion processes, Graph Cascades constructs, in O(|V|+|E|) time, an auxiliary graph where node pairs supported by repeated multi-hop reinforcement are promoted to direct neighbors. We theoretically characterize when reinforcement-based rewiring helps: sufficient conditions under which reinforcement-based edge selection is more label-aligned than direct adjacency, an SBM witness in which two-hop reinforcement is perfectly homophilic, and a formalization of mesoscopic connectivity via graph effective resistance. Empirically, across node-classification benchmarks, Graph Cascades improves multiple GNN and sparse-GT backbones, with the most reliable gains observed on heterophilic and moderate- to high-degree homophilic graphs. The theoretical conditions also identify regimes where mesoscopic rewiring is unlikely to be beneficial -- low-degree regular graphs and graphs with structural bottlenecks -- and these predictions match the observed failures. We additionally observe tight correlations between performance and structural properties in the rewired graphs.

2606.05045 2026-06-04 math.DS cs.LG 版本更新

Learning Control-Affine Reduced-Order Models via Autoencoders

通过自编码器学习控制仿射降阶模型

Ali Mjalled, Martin Mönnigmann

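Affiliations * Automatic Control and Systems Theory, Ruhr-Universität Bochum

AI summary: Proposes a framework that uses autoencoders to learn a reduced latent space together with control-affine state-space dynamics, extends it to a sequence-based model for better prediction accuracy, and applies feedback linearization to the learned models.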

详情
AI中文摘要

本文提出了一种用于识别控制仿射降阶模型(ROM)的框架。该方法利用自编码器(AE)将高维状态以及潜在的高维输入变换为适合控制仿射状态空间动力学的降维潜在变量。这是通过同时训练AE和状态空间模型实现的。此外,我们将离散ROM公式扩展为基于序列的模型,该模型处理状态和输入历史以提高预测精度,同时保持控制仿射结构。我们通过对导出的模型应用反馈线性化来激励我们的框架,并提出了有效使用它的指南。所提出的框架在两个数值示例上进行了评估,并将其性能与基线模型(其中AE识别具有线性状态空间动力学的潜在空间)进行了比较。评估涉及测试数据上ROM的预测精度及其将系统控制到期望状态或轨迹的有效性。

英文摘要

We present in this paper a framework for the identification of control-affine reduced-order models (ROMs). The proposed method utilizes autoencoders (AEs) to transform the high-dimensional states, and potentially the high-dimensional inputs, into reduced latent ones suitable for control-affine state-space dynamics. This is achieved by simultaneous training of the AE and the state-space model. In addition, we extend the discrete ROM formulation to a sequence-based model, which processes state and input histories to improve prediction accuracy while preserving the control-affine structure. We motivate our framework by applying feedback linearization to the derived models, and we present guidelines for its efficient use. The proposed framework is assessed on two numerical examples and its performance is compared to a baseline model, where the AE identifies a latent space with linear state-space dynamics. The assessment involves evaluating the prediction accuracy of the ROM on test data and its effectiveness in controlling the system to a desired state or trajectory.
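
For orientation, the link between a control-affine latent model and feedback linearization mentioned in the abstract above can be written out in discrete time. The notation here (encoder $\varphi$, latent state $z_k$, maps $f$ and $g$, reference input $v_k$) is assumed for this sketch and is not taken from the paper; it also assumes $g(z_k)$ is square and invertible.

```latex
\begin{aligned}
z_k     &= \varphi(x_k), \\
z_{k+1} &= f(z_k) + g(z_k)\,u_k, \\
u_k     &= g(z_k)^{-1}\bigl(v_k - f(z_k)\bigr)
\quad\Longrightarrow\quad
z_{k+1} = v_k .
\end{aligned}
```

Because the input enters linearly, the control law cancels the nonlinearity $f$ exactly under these assumptions, which is one plausible reason to prefer a control-affine latent structure over a general nonlinear one.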

2606.05042 2026-06-04 cs.LG cs.CL cs.SC 版本更新

In-Context Graphical Inference

上下文图形推理

Zehua Cheng, Wei Dai, Jiahao Sun

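Affiliations * Department of Computer Science, University of Oxford; FLock.io

AI summary: Proposes ICG-I, an autoregressive Graph Transformer that mimics variable elimination and uses Tensor-Train compression and weighted conformal prediction to deliver scalable, calibrated marginal inference in discrete graphical models, reaching state-of-the-art performance on standard instances and frustrated spin glasses.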

Comments 19 Pages

详情
AI中文摘要

离散图形模型中的边缘推理迫使在精确性和可扩展性之间做出选择:精确算法对于高树宽图是难以处理的,而迭代近似(信念传播、变分方法)在受挫拓扑上牺牲了收敛保证。我们认为这种二分法源于归纳偏置不匹配:迭代方法放弃了使精确推理正确的顺序消除结构。我们引入了上下文图形推理(ICG-I),一种自回归图Transformer,通过模拟变量消除并使用学习的张量列压缩中间因子来恢复这种结构,同时结合Dirichlet输出层和加权共形预测,在拓扑偏移下提供校准的、无分布的覆盖保证。我们证明了TT压缩误差在自回归链中最多线性传播,Dirichlet-Multinomial损失是适当的评分规则,并且WCP在估计密度比下保持覆盖且退化可量化。我们进行了大量实验来评估ICG-I,并在所有基准测试中取得了最先进的性能。ICG-I将标准实例上的MAE从0.041(最佳基线)降低到0.020,并在N=500的受挫自旋玻璃上达到0.048,而BP完全发散。

英文摘要

Marginal inference in discrete graphical models forces a choice between exactness and scalability: exact algorithms are intractable for high-treewidth graphs, while iterative approximations (Belief Propagation, variational methods) sacrifice convergence guarantees on frustrated topologies. We argue that this dichotomy stems from a mismatched inductive bias: iterative methods abandon the sequential elimination structure that makes exact inference correct. We introduce In-Context Graphical Inference (ICG-I), an autoregressive Graph Transformer that restores this structure by mimicking Variable Elimination with learned, Tensor-Train-compressed intermediate factors, paired with a Dirichlet output layer and Weighted Conformal Prediction for calibrated, distribution-free coverage guarantees under topological shift. We prove that TT compression errors propagate at most linearly through the autoregressive chain, that the Dirichlet-Multinomial loss is a proper scoring rule, and that WCP maintains coverage with a quantifiable degradation under estimated density ratios. We conducted intensive experiments to evaluate ICG-I and achieved state-of-the-art performance across all benchmarks. ICG-I reduces MAE from 0.041 (best baseline) to 0.020 on standard instances and achieves 0.048 on N=500 frustrated spin glasses where BP diverges entirely.
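
The abstract above builds on the sequential structure of Variable Elimination. As a reference point, the following is a minimal sketch of textbook exact elimination on a chain-structured model, checked against brute-force enumeration. It is not the ICG-I architecture (which, per the abstract, learns compressed intermediate factors); the function names and the chain-only restriction are assumptions made for this example.

```python
from itertools import product


def eliminate_chain(unary, pairwise):
    """Marginal of the last variable in a chain model via left-to-right elimination.

    unary[i][a] is the unary factor of variable i in state a;
    pairwise[i][a][b] couples variable i (state a) with variable i+1 (state b).
    """
    message = list(unary[0])
    for i, psi in enumerate(pairwise):
        next_size = len(unary[i + 1])
        # Sum out variable i, then absorb the unary factor of variable i+1.
        message = [
            unary[i + 1][b] * sum(message[a] * psi[a][b] for a in range(len(message)))
            for b in range(next_size)
        ]
    z = sum(message)
    return [m / z for m in message]


def brute_force_last_marginal(unary, pairwise):
    """Same marginal by enumerating every joint state (exponential cost)."""
    sizes = [len(u) for u in unary]
    marginal = [0.0] * sizes[-1]
    for states in product(*[range(s) for s in sizes]):
        weight = 1.0
        for i, s in enumerate(states):
            weight *= unary[i][s]
        for i, psi in enumerate(pairwise):
            weight *= psi[states[i]][states[i + 1]]
        marginal[states[-1]] += weight
    z = sum(marginal)
    return [m / z for m in marginal]
```

On a chain the intermediate factor stays small, so elimination is cheap; on high-treewidth graphs the intermediate factors grow exponentially, which is the cost the abstract says ICG-I addresses through learned compression.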

2606.05029 2026-06-04 cs.LG cs.CL 版本更新

Validity Threats for Foundation Model Research

基础模型研究的有效性威胁

Gunnar König, Martin Pawelczyk, Ulrike von Luxburg, Sebastian Bordt

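Affiliations * University of Tübingen, Tübingen AI Center; University of Vienna

AI summary: Proposes a causal-inference evaluation framework that maps approximate experimental strategies in foundation model research (proxy experiments, observational studies, single-run designs) onto trade-offs among four types of validity (statistical, internal, external, construct), exposing and analyzing the hidden validity threats that come with compute savings.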

详情
AI中文摘要

受控实验是机器学习研究的基石,但在现代基础模型的规模下,它们变得过于昂贵。相反,研究界越来越依赖于以较低成本近似理想实验的研究策略:代理实验和缩放定律、使用公开模型的观察性研究,以及利用单个训练运行内部变化的单次运行设计。在这项工作中,我们认为在计算预算内近似大规模实验没有免费午餐。具体来说,计算节省是以有效性威胁为代价的——隐藏且有时无法检验的假设,当这些假设被违反时,会使研究主张无效。为了帮助应对这些威胁,我们提出了一个评估框架,将基础模型研究视为因果推断问题。在这个框架内,我们通过从经验社会科学中改编的四种有效性——统计、内部、外部和构念有效性——来评估不同的研究策略。我们发现每种策略都有其特有的有效性特征:代理实验以外部和构念有效性换取统计和内部有效性;观察性研究面临混杂和效应异质性;单次运行设计则因处理单元之间的干扰而紧张。这一分析揭示了文献中未得到充分关注的若干有效性威胁。总体而言,我们的评估框架为研究人员提供了一个实用的工具包,用于审视基础模型研究设计中的有效性威胁。

英文摘要

Controlled experiments are the backbone of machine learning research, but at the scale of modern foundation models, they have become prohibitively expensive. Instead, the community increasingly relies on research strategies that approximate the ideal experiment at a fraction of the cost: proxy experiments and scaling laws, observational studies with publicly available models, and single-run designs that leverage variation within individual training runs. In this work, we argue that there is no free lunch when approximating large-scale experiments on a compute budget. Specifically, savings in compute come at the cost of validity threats -- hidden and sometimes untestable assumptions that, when violated, can invalidate research claims. To help navigate such threats, we propose an evaluation framework that casts foundation model research as a causal inference problem. Within this framework, we evaluate different research strategies through four types of validity adapted from the empirical social sciences -- statistical, internal, external, and construct validity. We find that each strategy comes with a characteristic validity profile: proxy experiments trade external and construct validity for statistical and internal validity; observational studies face confounding and effect heterogeneity; and single-run designs are strained by interference between treated units. This analysis reveals several validity threats that have received insufficient attention in the literature. Overall, our evaluation framework provides researchers with a practical toolkit for scrutinizing validity threats in foundation model research designs.

2606.05025 2026-06-04 cs.LG cs.AI 版本更新

Invariant Gradient Alignment for Robust Reasoning Distillation

不变梯度对齐用于鲁棒推理蒸馏

Zehua Cheng, Wei Dai, Jiahao Sun

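Affiliations * University of Oxford; FLock.io

AI summary: Proposes the Invariant Gradient Alignment (IGA) framework, which uses Logical Isomer Sets, a Continuous Gradient Conflict Mask, and a truncated SVD projection to align gradient updates across examples from different semantic domains that share the same logical structure, improving the robustness of large language models on out-of-distribution inputs.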

Comments 30 Pages

详情
Journal ref
In Proceedings of European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases 2026
AI中文摘要

大型语言模型(LLMs)存在捷径学习问题:它们在分布外(OOD)输入上系统性失败,这些输入的语义表面与训练数据不同,即使逻辑结构相同。这破坏了将思维链推理迁移到较小学生模型的知识蒸馏流程。我们引入不变梯度对齐(IGA),一种训练框架,通过三项创新对齐跨语义多样但逻辑同构示例的梯度更新:(i)逻辑同构集,即跨不同语义领域(数学、医学、法律、科学)共享相同逻辑结构的问题组;(ii)可微的连续梯度冲突掩码,抑制具有高跨域梯度方差的参数维度,同时保留不变方向;(iii)将掩码梯度通过截断SVD投影回LoRA低秩流形,保持参数效率。理论上,IGA比ERM产生更紧的OOD泛化界,随同构域数量缩放,并在温和正则条件下以标准SGD速率收敛。实验上,IGA在四个基准测试中优于八种基线,准确率提升高达14.3个百分点(相对于ERM-SFT),逻辑一致性得分为0.031对比0.142——表示不变性提升四倍。

英文摘要

Large language models (LLMs) suffer from shortcut learning: they systematically fail on out-of-distribution (OOD) inputs whose semantic surface differs from training data, even when the logical structure is identical. This undermines knowledge distillation pipelines that transfer chain-of-thought reasoning to smaller students. We introduce Invariant Gradient Alignment (IGA), a training framework that aligns gradient updates across semantically diverse but logically isomorphic examples via three innovations: (i) Logical Isomer Sets, groups of problems sharing identical logical structure across distinct semantic domains (mathematics, medicine, law, science); (ii) a differentiable Continuous Gradient Conflict Mask that suppresses parameter dimensions with high cross-domain gradient variance while preserving invariant directions; and (iii) a truncated SVD projection of the masked gradient back onto the LoRA low-rank manifold, maintaining parameter efficiency throughout. Theoretically, IGA yields tighter OOD generalization bounds than ERM, scaling with the number of isomer domains, and converges at the standard SGD rate under mild regularity. Empirically, IGA outperforms eight baselines across four benchmarks with accuracy gains up to 14.3 pp over ERM-SFT and a Logical Consistency Score of 0.031 versus 0.142 -- a fourfold improvement in representational invariance.
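
The abstract above describes a mask that suppresses parameter dimensions whose gradients vary strongly across domains. One way such a soft mask could look is sketched below. The sigmoid-of-variance form, the `threshold` and `temperature` parameters, and the function names are all assumptions made for illustration; the abstract does not specify the actual mask, and this sketch omits the SVD projection step entirely.

```python
import math


def conflict_mask(domain_grads, threshold=1.0, temperature=1.0):
    """Soft per-dimension mask in (0, 1).

    domain_grads is a list of gradient vectors, one per semantic domain.
    Dimensions with low cross-domain variance get a mask value near 1;
    dimensions with high variance get a value near 0.
    (Hypothetical form: a sigmoid applied to the shifted variance.)
    """
    n = len(domain_grads)
    dim = len(domain_grads[0])
    mask = []
    for d in range(dim):
        values = [g[d] for g in domain_grads]
        mean = sum(values) / n
        variance = sum((v - mean) ** 2 for v in values) / n
        mask.append(1.0 / (1.0 + math.exp((variance - threshold) / temperature)))
    return mask


def masked_mean_gradient(domain_grads, threshold=1.0, temperature=1.0):
    """Average the domain gradients, then scale each dimension by the mask."""
    mask = conflict_mask(domain_grads, threshold, temperature)
    n = len(domain_grads)
    return [m * sum(g[d] for g in domain_grads) / n for d, m in enumerate(mask)]
```

A smooth mask like this keeps the operation differentiable, which matches the abstract's description of the mask as differentiable and continuous rather than a hard 0/1 cut.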

2606.05021 2026-06-04 cs.LG 版本更新

Enhancing the MADDPG Algorithm for Multi-Agent Learning via Action Inference and Importance Sampling

通过动作推理和重要性采样增强多智能体学习的MADDPG算法

Marc Walden, Jason Liu, Shaashwath Sivakumar, Ryan Liu, Hamza Khan

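Affiliations * Department of Mathematics, University of California Los Angeles, Los Angeles, CA, USA

AI summary: For multi-agent deep reinforcement learning, proposes an action inference mechanism and a geometric-distribution-based importance sampling strategy to improve the MADDPG algorithm, improving learning stability, inter-agent cooperation, and exploration efficiency on a discrete-action Predator-Prey task.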

详情
AI中文摘要

我们研究了多智能体深度强化学习,并提出了对多智能体深度确定性策略梯度(MADDPG)算法的两项增强。首先,我们引入了一种新颖的动作推理机制,使每个智能体能够预测其他智能体的预期动作,从而提高其自身策略的准确性和稳定性。其次,我们在回放缓冲区中应用了基于几何分布的重要性采样策略,以优先考虑更近期和更具信息性的经验,这有助于缓解多智能体环境中固有的非平稳性。我们在PettingZoo库提供的离散动作捕食者-猎物任务上评估了这两项修改,PettingZoo是一个用于通用多智能体强化学习基准测试的灵活Python接口。我们的结果表明,动作推理在提高学习稳定性和智能体间协作方面是有效的,并且使用几何分布的重要性采样可以在探索效率上比标准MADDPG带来显著改进。代码可在https://github.com/shaashwathsivakumar/MARL_Proj获取。

英文摘要

We investigate multi-agent deep reinforcement learning and propose two enhancements to the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) algorithm. First, we introduce a novel Action Inference mechanism that enables each agent to predict other agents' intended actions, thereby improving the accuracy and stability of its own policy. Second, we apply an importance sampling strategy, using a geometric distribution, in the replay buffer to prioritize more recent and informative experiences, which helps mitigate the non-stationarity inherent in multi-agent environments. We evaluate both modifications on the discrete-action Predator-Prey task provided by the PettingZoo library, a flexible Python interface for general multi-agent reinforcement learning benchmarks. Our results indicate that Action Inference is effective in improving learning stability and inter-agent cooperation and that importance sampling using a geometric distribution can lead to significant improvements in exploration efficiency over standard MADDPG. Code is available at https://github.com/shaashwathsivakumar/MARL_Proj
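
The abstract above mentions drawing replay-buffer samples with a geometric distribution so that recent experiences are favored. A minimal sketch of that idea follows. The function name, the parameter `p`, and the rejection step for ages beyond the buffer length are assumptions made for this example and may differ from the authors' implementation.

```python
import math
import random


def sample_recent_indices(buffer_len, batch_size, p=0.01, rng=random):
    """Draw replay-buffer indices that favor recent entries.

    The "age" of a sample (0 = newest) follows a geometric distribution with
    success probability p, so P(age = k) = (1 - p)**k * p. Ages that fall
    outside the buffer are rejected and redrawn.
    """
    indices = []
    while len(indices) < batch_size:
        u = 1.0 - rng.random()  # uniform on (0, 1], avoids log(0)
        age = int(math.log(u) / math.log(1.0 - p))  # inverse-CDF sampling
        if age < buffer_len:
            indices.append(buffer_len - 1 - age)
    return indices
```

Smaller values of `p` flatten the distribution toward uniform sampling, while larger values concentrate the batch on the newest transitions; in a non-stationary multi-agent setting this trades sample diversity against staleness.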

2606.04994 2026-06-04 cs.LG q-bio.QM 版本更新

New Benchmarking Shows Limited Generalization Power of TCR Antigenic Epitope Prediction Models

新基准测试显示TCR抗原表位预测模型的泛化能力有限

Yiming Liao, Yiheng Li, Ning Jiang, Bo Li, Keke Chen

Affiliations * Trustworthy and Intelligent Computing Lab (TAIC), Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County; Children’s Hospital of Philadelphia; Department of Bioengineering, University of Pennsylvania; Institute for Immunology & Immune Health, University of Pennsylvania; Institute for RNA Innovation, University of Pennsylvania; Abramson Cancer Center, University of Pennsylvania; Center for Precision Engineering for Health, University of Pennsylvania; Center for Cellular Immunotherapies, University of Pennsylvania

AI summary: By constructing two classes of rigorously defined unseen benchmark datasets, this paper evaluates T cell receptor (TCR) antigen-specificity prediction models, finds that existing models generalize poorly, and argues that these datasets provide a framework for model assessment and further algorithm development.

Comments 6 pages, 1 figure. Preprint version

详情
AI中文摘要

准确计算预测T细胞受体(TCR)抗原特异性将改变T细胞生物学研究,并实现可扩展的免疫工程,但现有模型缺乏足够的灵敏度和特异性,难以广泛应用。一个主要限制是缺乏严格定义的、未见过的基准数据集,无法对模型性能和泛化能力进行无偏评估。在此,我们描述了两类满足此标准的互补数据集,并认为它们既为模型评估提供了稳健框架,也为下一代TCR-抗原预测算法的开发奠定了基础。

英文摘要

Accurate computational prediction of T cell receptor (TCR) antigen specificity would transform the study of T cell biology and enable scalable immune engineering, yet existing models lack sufficient sensitivity and specificity for broad applications. A major limitation is the absence of rigorously defined, unseen benchmark datasets that allow unbiased evaluation of model performance and generalizability. Here, we describe two complementary classes of datasets that meet this criterion and argue that they provide both a robust framework for model assessment and a foundation for next-generation TCR-antigen prediction algorithm development.

2606.04980 2026-06-04 cs.LG 版本更新

AlphaQ: Calibration-Free Bit Allocation for Mixture-of-Experts Quantization

AlphaQ: 混合专家量化的免校准位分配

Wanqi Yang, Yuexiao Ma, Alexander Conzelmann, Xiawu Zheng, Michael W. Mahoney, T. Konstantin Rusch, Shiwei Liu

Affiliations * Max Planck Institute for Intelligent Systems; ELLIS Institute Tübingen; Tübingen AI Center; Key Laboratory of Multimedia Trusted Perception and Efficient Computing, Xiamen University; International Computer Science Institute; Lawrence Berkeley National Laboratory; University of California, Berkeley; Liquid AI

AI summary: To address the suboptimal bit allocation caused by reliance on calibration data in Mixture-of-Experts quantization, proposes AlphaQ, a calibration-free bit-allocation method based on Heavy-Tailed Self-Regularization theory that assigns bit-widths according to how heavy-tailed each expert's weight spectrum is and minimizes quantization error under a budget constraint, achieving near full-precision performance.

Comments 28 pages, 11 figures

详情
AI中文摘要

混合专家(MoE)架构通过稀疏专家激活扩展模型容量,但其部署仍受内存限制,因为所有专家权重必须驻留在内存中。混合精度量化通过为不同专家分配不同位宽,可以显著减少内存占用。然而,现有方法通常依赖校准数据来估计专家重要性并确定位分配。对于前沿的MoE大语言模型,原始训练数据(即真实训练分布)是专有的且不可访问。因此,校准集不可避免地成为不完美的替代品,这可能导致对专家利用率的错误估计和次优的位分配。受现代MoE模型中观察到的显著跨专家质量差异,以及重尾自正则化(HT-SR)理论在无需训练或测试数据的情况下成功预测神经网络模型质量的启发,我们提出了AlphaQ,一种用于MoE量化的免校准位分配方法。AlphaQ借鉴HT-SR理论,遵循一个简单原则:具有更重尾权重谱的专家通常训练得更好,因此应获得更高的位宽,而重尾结构较弱的专家可以更激进地量化。AlphaQ通过测量专家级别的谱重尾程度,并求解在全局位预算约束下最小化总量化误差的预算约束优化问题来实现这一原则。在多个MoE模型上,AlphaQ在匹配位预算下始终优于基于校准的基线方法。值得注意的是,在Qwen1.5-MoE上,AlphaQ在平均专家精度仅为3.5位的情况下实现了接近全精度的准确率,同时提供了超过4倍的内存压缩。我们的代码可在https://github.com/Superone77/AlphaQ获取。

英文摘要

Mixture-of-Experts (MoE) architectures scale model capacity through sparse expert activation, but their deployment remains memory-bound because all expert weights must reside in memory. Mixed-precision quantization can substantially reduce this footprint by assigning different bit-widths to different experts. Existing approaches, however, typically rely on calibration data to estimate expert importance and determine bit allocation. For frontier MoE LLMs, the original training data, and hence the true training distribution, is proprietary and inaccessible. As a result, calibration sets are inevitably imperfect surrogates, and this can misestimate expert utilization and lead to suboptimal bit allocation. Motivated by the substantial cross-expert quality variability observed in modern MoE models, and by the success of Heavy-Tailed Self-Regularization (HT-SR) theory at predicting neural network model quality without access to training or testing data, we propose AlphaQ, a calibration-free bit-allocation method for MoE quantization. AlphaQ draws on HT-SR theory and follows a simple principle: experts with more heavy-tailed weight spectra are typically better trained and hence should receive higher bit-widths, while experts with weaker heavy-tailed structure can be quantized more aggressively. AlphaQ operationalizes this principle by measuring expert-wise spectral heavy-tailedness and solving a budget-constrained optimization problem that minimizes total quantization error under a global bit-budget constraint. Across several MoE models, AlphaQ consistently outperforms calibration-based baselines under matched bit budgets. Notably, on Qwen1.5-MoE, AlphaQ achieves near full-precision accuracy with an average expert precision of only 3.5 bits, while delivering more than 4$\times$ memory compression. Our code is available at https://github.com/Superone77/AlphaQ.

2606.04971 2026-06-04 cs.LG cs.DB 版本更新

Be Fair! Can Machine Learning Engineering Agents Adhere to Fairness Constraints?

要公平!机器学习工程代理能否遵守公平性约束?

Anna Richter, Julia Stoyanovich, Sebastian Schelter

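Affiliations * BIFOLD & TU Berlin; New York University

AI summary: Studies whether machine learning engineering agents can satisfy fairness constraints when automating ML pipeline development; in a melanoma classification experiment, agent-generated pipelines fall below manually designed baselines in both predictive quality and fairness.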

详情
AI中文摘要

机器学习工程(MLE)代理承诺从原始数据和自然语言指令自动化端到端ML管道开发,可能使非技术领域专家也能使用ML。然而,在敏感和受监管的领域,这种抽象造成了责任差距:最终用户可能无法了解影响正确性、鲁棒性、公平性和法规遵从性的设计选择。我们认为现有基准不足以评估MLE代理能否安全应用于此类环境。我们提出了以责任为中心的评估框架的期望,并进行了黑色素瘤分类的探索性研究,重点关注跨肤色公平性作为责任约束。在评估两个最近的MLE代理时,我们发现代理生成的管道在预测质量和公平性方面表现出高方差,并且始终低于手动设计的基线,尽管使用了面向公平性的提示。这些初步结果表明,需要进一步研究重新设计MLE代理,以允许人类指导搜索过程并可靠地评估生成的ML管道的合规性和质量。

英文摘要

Machine learning engineering (MLE) agents promise to automate end-to-end ML pipeline development from raw data and natural language instructions, potentially making ML accessible to non-technical domain experts. However, in sensitive and regulated domains, this abstraction creates a responsibility gap: end-users may lack visibility into design choices that affect correctness, robustness, fairness, and regulatory compliance. We argue that existing benchmarks are insufficient to assess whether MLE agents can be safely applied in such settings. We propose desiderata for a responsibility-centered evaluation framework and conduct an exploratory study on melanoma classification, focusing on fairness across skin tones as a responsibility constraint. When evaluating two recent MLE agents, we find that agent-generated pipelines show high variance and consistently underperform manually designed baselines in both predictive quality and fairness, despite fairness-oriented prompts. These preliminary results suggest that further research is needed towards redesigning MLE agents to allow humans to guide the search process and reliably assess the compliance and quality of the generated ML pipelines.

2606.04957 2026-06-04 cs.CR cs.IR cs.LG 版本更新

NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting

NLLog: 通过日志到语言重写的轻量级、可解释的SOC异常检测

Samuel Ndichu, Tao Ban, Seiichi Ozawa, Takeshi Takahashi, Daisuke Inoue

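Affiliations * University of Tokyo; National Institute of Information and Communications Technology

AI summary: Proposes the NLLog pipeline, which rewrites log templates into natural-language sentences, combines TF-IDF weighting with tree-ensemble classification, and uses TreeSHAP for explainable anomaly detection, achieving low false-positive rates and low latency on the HDFS, BGL, and AIT datasets.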

Comments 15 pages, 11 figures, 12 tables; submitted to ACSAC 2026

详情
AI中文摘要

系统生成的日志是安全监控的基础,但其僵化的基于模板的格式阻碍了自动化分析和人类理解。我们提出NLLog(自然语言日志),一个轻量级流水线,它确定性地将解析后的模板重写为WHO-WHAT-SEVERITY句子,通过词频-逆文档频率加权进行池化,使用树集成对会话进行分类,并通过TreeSHAP反向投影证据供分析师审查。在Hadoop分布式文件系统(HDFS)和Blue Gene/L(BGL)语料库上,NLLog超过了两个复现的匹配协议基线;在HDFS、BGL和AIT警报数据集上,它保持了低误报率,且延迟适用于安全运营中心分类。覆盖度、稀疏与密集、忠实性和对抗性消融实验表明,回退充分性依赖于语料库,部署前的覆盖度检查可以揭示细化需求,并且可审计的确定性重写结合轻量级密集编码为日志异常检测和分类提供了可测量的表示层。

英文摘要

System-generated logs underpin security monitoring, yet their rigid template-based format hinders both automated analysis and human comprehension. We present NLLog (Natural-Language Log), a lightweight pipeline that deterministically rewrites parsed templates into WHO-WHAT-SEVERITY sentences, pools them with term-frequency-inverse-document-frequency weighting, classifies sessions with tree ensembles, and back-projects evidence with TreeSHAP for analyst review. On Hadoop Distributed File System (HDFS) and Blue Gene/L (BGL) corpora, NLLog exceeds two reproduced matched-protocol baselines; across HDFS, BGL, and the AIT Alert Data Set, it sustains low false-positive rates with commodity-hardware latency suitable for security operations center triage. Coverage, sparse-versus-dense, faithfulness, and adversarial ablations show that fallback sufficiency is corpus-dependent, that an enrollment-time coverage check can surface refinement requirements before deployment, and that an auditable deterministic rewrite combined with lightweight dense encoding provides a measurable representation layer for log-anomaly detection and triage.
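
The abstract above pools rewritten log sentences with TF-IDF weighting before classification. The sketch below shows one way such pooling could work for sessions represented as lists of template IDs. The smoothed IDF variant, the function names, and the per-template vectors are assumptions made for this example; the NLLog rewriting step and its actual encoder are not reproduced here, and every session is assumed to be non-empty.

```python
import math
from collections import Counter


def tfidf_weights(sessions):
    """Per-session TF-IDF weight for each template ID.

    sessions is a list of non-empty lists of template IDs. Uses a smoothed
    IDF, ln((1 + n) / (1 + df)) + 1, so templates seen in every session
    still receive a positive weight.
    """
    n = len(sessions)
    doc_freq = Counter()
    for session in sessions:
        doc_freq.update(set(session))
    idf = {t: math.log((1 + n) / (1 + df)) + 1.0 for t, df in doc_freq.items()}
    weights = []
    for session in sessions:
        term_freq = Counter(session)
        weights.append({t: (c / len(session)) * idf[t] for t, c in term_freq.items()})
    return weights


def pool(session_weights, vectors):
    """Weighted average of per-template vectors for one session.

    vectors maps template ID -> list of floats (hypothetical sentence vectors).
    """
    total = sum(session_weights.values())
    dim = len(next(iter(vectors.values())))
    return [
        sum(w * vectors[t][d] for t, w in session_weights.items()) / total
        for d in range(dim)
    ]
```

Rare templates receive a larger IDF, so an unusual event such as an error line contributes more to the pooled session vector than routine templates that appear in almost every session.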

2606.04946 2026-06-04 cs.DS cs.LG stat.ML 版本更新

A General Framework for Dynamic Consistent Submodular Maximization

Paul Dütting, Federico Fusco, Silvio Lattanzi, Ashkan Norouzi-Fard, Ola Svensson, Morteza Zadimoghaddam

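Affiliations * ETH Zurich; KTH Royal Institute of Technology; University of Toronto

AI summary Proposes a general algorithmic framework for submodular maximization in the fully dynamic setting, obtaining the first constant-factor approximations with sublinear consistency.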

Comments Accepted at ICML 2026

Details
Abstract

Consistency is an important property in dynamic submodular maximization and entails maintaining a near-optimal solution at all times, making only a small number of adjustments to the solution in each step. Prior work has explored this question for the insertion-only case, where the algorithm faces a stream of $n$ insertions, and has established lower and upper bounds for the cardinality-constrained version of the problem. We consider this question in the fully dynamic setting, where the stream of operations may contain both insertions and deletions. We develop a general framework for designing algorithms for this setting, and instantiate it to obtain the first constant-factor approximations with sublinear consistency. For cardinality constraints, we propose a $\frac 12 - O(\varepsilon)$ approximation that is $O\left(\frac{1}{\varepsilon^2}\right)$ consistent. For rank-$k$ matroid constraints, we construct a $\frac 14 - O(\varepsilon)$ approximation to the dynamic optimum that is $O\left(\frac{\log k}{\varepsilon^2}\right)$ consistent.
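
To make the notion of consistency concrete, here is a minimal sketch of a naive baseline: it recomputes the classic greedy solution for a cardinality constraint after every insertion or deletion and records how many elements change between consecutive solutions. This is not the framework from the paper; the coverage objective, the stream, and the helper names are assumptions chosen for illustration.

```python
def coverage(solution, sets):
    """Monotone submodular objective: number of distinct elements covered."""
    covered = set()
    for name in solution:
        covered |= sets[name]
    return len(covered)

def greedy(sets, k):
    """Classic greedy for a cardinality constraint of size k."""
    solution = []
    for _ in range(min(k, len(sets))):
        best, best_gain = None, 0
        for name in sorted(sets):
            if name in solution:
                continue
            gain = coverage(solution + [name], sets) - coverage(solution, sets)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:  # no remaining set adds value
            break
        solution.append(best)
    return set(solution)

def run_stream(stream, k):
    """Replay insertions and deletions, recomputing greedy at each step.

    Returns the final solution and the largest per-step change
    (size of the symmetric difference between consecutive solutions).
    """
    sets, solution, worst_change = {}, set(), 0
    for op, name, items in stream:
        if op == "insert":
            sets[name] = set(items)
        else:  # "delete"
            sets.pop(name, None)
        new_solution = greedy(sets, k)
        worst_change = max(worst_change, len(solution ^ new_solution))
        solution = new_solution
    return solution, worst_change

stream = [
    ("insert", "A", {1, 2}),
    ("insert", "B", {3}),
    ("insert", "C", {1, 2, 3, 4}),
    ("delete", "C", None),
]
final_solution, worst_change = run_stream(stream, k=1)
```

Recomputing from scratch gives no control over the per-step change in general, which is the quantity a consistent algorithm is designed to bound.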

2606.04931 2026-06-04 cs.LG cs.GT Version update

Mean-based algorithms: A lower bound and regret

Julius Durmann, Amelie Kleber

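Affiliations * Technical University of Munich

AI summary For the setting with an unknown time horizon and bandit feedback only, gives the first lower bound on the algorithm-defining sequence $\gamma_t$ of mean-based algorithms, proposes two new algorithms whose experimental performance is comparable to existing algorithms, and analyzes the relationship to no-regret algorithms.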

Details
Abstract

Mean-based algorithms are a class of online learning algorithms that assign low probability to actions with low average rewards. Recent work indicates these algorithms converge favorably to serially undominated actions, which approximate Nash equilibria in economic games. However, empirical studies also show slower convergence compared to established algorithms in bandit-feedback scenarios. We study mean-based algorithms when the time horizon is unknown and only bandit feedback is available. In this setting, we provide the first lower bound on the algorithm-defining sequence $\gamma_t$ that formally establishes a limit on how fast these algorithms can learn. Additionally, we propose two mean-based algorithms: one generalizes $\varepsilon$-greedy, and the other extends the mean-based Exp3 to unknown horizons. Our experiments show that mean-based algorithms, although slightly slower, can perform competitively with other bandit-feedback algorithms. We further analyze the relationship to no-regret algorithms. Depending on the choice of $\gamma_t$, the intersection with no-regret algorithms is non-trivial, and we show that algorithms exist that are both mean-based and no-regret. This adds context to the "exploitability" of this class of algorithms that previous contributions suggest.
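
For orientation, the following is a minimal sketch of a plain anytime epsilon-greedy bandit, the kind of algorithm the abstract says one of the proposed methods generalizes. It is not the paper's algorithm: the decay schedule `1 / sqrt(t)`, the deterministic toy rewards, and all function names are assumptions made for illustration. The point is only that the exploration rate depends on the current round and never on the total horizon.

```python
import random

def epsilon_greedy(reward_fn, n_arms, n_rounds, seed=0):
    """Anytime epsilon-greedy: the schedule depends on t only, never on n_rounds."""
    rng = random.Random(seed)
    counts = [0] * n_arms
    means = [0.0] * n_arms
    for t in range(1, n_rounds + 1):
        epsilon = min(1.0, 1.0 / t ** 0.5)  # assumed decay schedule
        untried = [a for a in range(n_arms) if counts[a] == 0]
        if untried:
            arm = untried[0]                      # try every arm once
        elif rng.random() < epsilon:
            arm = rng.randrange(n_arms)           # explore uniformly
        else:
            arm = max(range(n_arms), key=lambda a: means[a])  # exploit best mean
        reward = reward_fn(arm, rng)              # bandit feedback: chosen arm only
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]
    return counts, means

# Deterministic toy rewards so the run is easy to reason about.
def toy_reward(arm, rng):
    return (0.2, 0.8)[arm]

counts, means = epsilon_greedy(toy_reward, n_arms=2, n_rounds=2000)
```

Because the arm with the lower running mean is chosen only during exploration, its selection probability shrinks as the exploration rate decays, which loosely mirrors the "low probability for low average reward" idea.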

2606.04930 2026-06-04 cs.LG cs.AI stat.ML Version update

AdaKoop: Efficient Modeling of Nonlinear Dynamics from Nonstationary Data Streams with Koopman Operator Regression

Naoki Chihara, Ren Fujiwara, Yasuko Matsubara, Yasushi Sakurai

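Affiliations * SANKEN, The University of Osaka

AI summary Proposes AdaKoop, a streaming algorithm based on Koopman operator theory and a probabilistic framework that represents nonlinear dynamics as a linear system, enabling efficient and stable modeling of nonstationary data streams and outperforming existing methods on 71 benchmark datasets.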

Comments Accepted by KDD'26

Details
Journal ref
The 32nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2026
Abstract

Real-time data analysis requires the ability to accurately and adaptively address nonlinear dynamics in a nonstationary data stream while preserving computational efficiency. However, nonlinear dynamics are so complex that capturing dynamically changing nonlinear patterns and utilizing them for downstream tasks under strict time constraints is nontrivial. To bridge the gap between nonlinear complexity and computational tractability, this study applies Koopman operator theory, which states that nonlinear dynamics can be represented as linear transitions in an infinite-dimensional space. Building upon finite-dimensional approximations of this operator, we present AdaKoop, an efficient streaming algorithm for modeling nonlinear dynamics over nonstationary data streams. Our approach utilizes a probabilistic framework grounded in Koopman operator theory, treating both raw observations and reproducing kernel Hilbert space (RKHS) features as emissions from latent vectors. This dual-view formulation allows nonlinear dynamics to be expressed as a tractable linear system. Therefore, AdaKoop enables the efficient and stable modeling of nonlinear dynamics in a streaming fashion, avoiding the prohibitive computational costs of iterative nonlinear optimization. Furthermore, to address nonstationarity in data streams, AdaKoop adaptively detects the switching of patterns via statistical hypothesis testing for abrupt pattern shifts and incrementally updates model parameters to handle continuous changes. Extensive experiments on a total of 71 practical benchmark datasets across various domains demonstrate that AdaKoop outperforms state-of-the-art methods in terms of real-time forecasting accuracy and computational efficiency.
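
The claim that nonlinear dynamics can become linear in a lifted space can be checked on a small textbook-style system. The sketch below is not AdaKoop: it uses a hand-picked polynomial observable instead of RKHS features, has no probabilistic model or streaming updates, and the parameters `LAM` and `MU` are assumptions. It only verifies that one linear step in the lifted coordinates matches one nonlinear step in the original state.

```python
LAM, MU = 0.9, 0.5  # assumed parameters of the toy system

def step(x1, x2):
    """Nonlinear dynamics in the original state space."""
    return LAM * x1, MU * x2 + (LAM ** 2 - MU) * x1 ** 2

def lift(x1, x2):
    """Observables (features) in which the dynamics become linear."""
    return [x1, x2, x1 ** 2]

# Finite-dimensional Koopman matrix for the chosen observables.
K = [
    [LAM, 0.0, 0.0],
    [0.0, MU, LAM ** 2 - MU],
    [0.0, 0.0, LAM ** 2],
]

def matvec(matrix, vector):
    """Plain matrix-vector product on nested lists."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

state = (1.5, -0.7)
max_error = 0.0
for _ in range(20):
    predicted = matvec(K, lift(*state))   # linear step in lifted space
    state = step(*state)                  # true nonlinear step
    actual = lift(*state)
    max_error = max(max_error, max(abs(p - a) for p, a in zip(predicted, actual)))
```

Here the lifted system closes exactly with three observables; for general dynamics only an approximation is available, which is where finite-dimensional approximations of the operator come in.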

2606.04929 2026-06-04 cs.LG cs.CR Version update

Sequential Data Poisoning in LLM Post-Training

Jack Sanderson, Yihan Wang, Xiaoqian Lu, Gautam Kamath, Yiwei Lu

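Affiliations * University of Chicago; University of Waterloo; University of Ottawa; Vector Institute

AI summary Proposes the sequential data poisoning threat model, studying multiple attackers who separately poison different stages of LLM post-training (SFT and preference data); finds that a single attacker appears to pose little threat while multi-stage collaboration exposes the true vulnerability, and that contributions are additive or complementary depending on the pipeline.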

Details
Abstract

LLM post-training proceeds through multiple stages, e.g., supervised fine-tuning (SFT) followed by reinforcement learning from human feedback (RLHF) or direct preference optimization (DPO), where each stage draws data from different, potentially untrusted sources. Existing literature assumes data poisoning attacks may occur at each training stage, but neglects the possibility of multiple attackers. To study the trustworthiness of the entire post-training pipeline, we propose the threat model of sequential data poisoning, where multiple adversaries separately poison the SFT and preference datasets. Under this threat model, we identify the single-attacker illusion: each adversary, evaluated in isolation, appears to pose a negligible threat. Yet when adversaries collaborate across stages, the true vulnerability is revealed. In the SFT $\to$ DPO pipeline, their contributions are additive: splitting a fixed poison budget across stages outperforms concentrating it in either stage alone. In the SFT $\to$ PPO pipeline, their contributions are complementary: neither SFT nor reward model poisoning succeeds individually, yet their combination does. These findings show that security analyses of individual post-training stages systematically underestimate compound vulnerabilities that emerge only from their interaction. Code is available at https://github.com/jcksanderson/sequential-poisoning.

2606.04928 2026-06-04 cs.LG cs.CL Version update

Data Attribution in Large Language Models via Bidirectional Gradient Optimization

Frédéric Berdoz, Luca A. Lanzendörfer, Kaan Bayraktar, Roger Wattenhofer

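Affiliations * EPFL, Switzerland; ETH Zurich, Switzerland

AI summary Proposes a training data attribution method based on bidirectional gradient optimization for auto-regressive large language models, identifying the training data that most influences model outputs and improving model interpretability.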

Comments Presented at the AI Governance (AIGOV) Workshop at AAAI 2026

Details
Abstract

Large Language Models (LLMs) are increasingly deployed across diverse applications, raising critical questions for governance, accountability, and data provenance. Understanding which training data most influenced a model's output remains a fundamental open problem. We address this challenge through training data attribution (TDA) for auto-regressive LLMs by expanding upon the inverse formulation: How would training data be affected if the model had seen the generated output during training? Our method perturbs the base model using bidirectional gradient optimization (gradient ascent and descent) on a generated text sample and measures the resulting change in loss across training samples. Our framework supports attribution at arbitrary data granularity, enabling both factual and stylistic attribution. We evaluate our method against baselines on pretrained models with known datasets, and show that it outperforms previous work on influence metrics, thereby enhancing model interpretability, an essential requirement for accountable AI systems.
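
The perturb-and-measure idea in this abstract can be sketched on a one-parameter linear model. Everything below is an assumption made for illustration: the toy model, the single gradient step in each direction, and in particular the scoring rule (loss after ascent minus loss after descent), which the abstract does not specify.

```python
def loss(w, x, y):
    """Squared error of a one-parameter linear model y_hat = w * x."""
    return (w * x - y) ** 2

def grad(w, x, y):
    """Derivative of the squared error with respect to w."""
    return 2.0 * (w * x - y) * x

def bidirectional_scores(w, generated, train_set, lr=0.1):
    """Score each training sample by its loss gap between the two perturbed models."""
    gx, gy = generated
    g = grad(w, gx, gy)
    w_descent = w - lr * g   # model nudged towards the generated sample
    w_ascent = w + lr * g    # model nudged away from it
    # Assumed scoring rule: a sample counts as influential if moving towards
    # the generated output lowers its loss and moving away raises it.
    return [loss(w_ascent, x, y) - loss(w_descent, x, y) for x, y in train_set]

train_set = [(1.0, 1.0), (1.0, -1.0), (0.0, 3.0)]
scores = bidirectional_scores(w=0.0, generated=(1.0, 1.0), train_set=train_set)
```

In this toy run the training sample that agrees with the generated sample gets a positive score, the contradicting one gets a negative score, and the sample with `x = 0`, whose loss does not depend on `w`, scores zero.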

2606.04923 2026-06-04 cs.LG cs.AI cs.CL Version update

Reproducing, Analyzing, and Detecting Reward Hacking in Rubric-Based Reinforcement Learning

Xuekang Wang, Zhuoyuan Hao, Shuo Hou, Hao Peng, Juanzi Li, Xiaozhi Wang

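Affiliations * Tsinghua University; Harbin Institute of Technology, Shenzhen; Xi’an Jiaotong University

AI summary Proposes CHERRL, a controllable hacking environment that reproduces reward hacking by injecting known biases, analyzes the discoverability and exploitability of those biases, and explores agent-based automatic detection.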

Comments 23 pages, 7 figures

Details
Abstract

Rubric-based reinforcement learning (RL) uses an LLM-as-a-Judge (LaaJ) to score model outputs according to rubrics as rewards. However, policy models may exploit latent biases in the judge, leading to reward hacking and ineffective or unsafe training outcomes. In real-world rubric-based RL, such hacking behaviors are often subtle and entangled with multiple judge biases, making them difficult to analyze, detect, and mitigate. In this paper, we introduce CHERRL, a controllable hacking environment for rubric-based RL. By injecting known biases into LaaJ, CHERRL enables stable reproduction of reward hacking, explicit observation of reward divergence, and precise identification of hacking onset. This provides a clean experimental testbed for studying the mechanisms and mitigations of reward hacking in rubric-based RL. To demonstrate its utility, we analyze different judge biases from the perspectives of discoverability and exploitability, and explore an agent-based system for automatically detecting reward hacking onset from training logs. The code and environment are publicly available at https://github.com/THUAIS-Lab/CHERRL.

2606.04922 2026-06-04 cs.CV cs.AI cs.LG Version update

Geometry-Aware Distillation for Prompt Tuning Biomedical Vision-Language Models

Tran Dinh Tien, Zhiqiang Shen

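Affiliations * Department of Machine Learning; Mohamed bin Zayed University of Artificial Intelligence

AI summary Proposes the Omni-Geometry Knowledge Distillation (OGKD) framework, which injects class-relation structure into the teacher to produce directional targets that preserve the ground truth while respecting inter-class geometry, and designs Global Geometry-Aware Distillation (GAD) and Label-Guided Geometry Distillation (LGD) losses, improving accuracy by an average of 1.7%-2.8% on 11 medical datasets.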

Comments Preprint. Code is available at https://github.com/tientrandinh/OGKD

Details
Abstract

Current prompt-based and adapter-based tuning of vision-language models (VLMs) is attractive for medical imaging, where clinical data sensitivity favors frozen backbones and annotations are limited. However, these methods typically optimize only the ground-truth class, treating all other classes as equally incorrect, ignoring clinically meaningful class relations and yielding unstable decision boundaries in limited-supervision settings. We propose Omni-Geometry Knowledge Distillation (OGKD), a new framework that injects class-relation structure into the teacher to produce directional targets that preserve the ground truth while respecting inter-class geometry. Using these targets, we develop two distillation losses: Global Geometry-Aware Distillation (GAD) operates on the global image token, and Label-Guided Geometry Distillation (LGD) applies the same geometry to attentive patch tokens to improve fine-grained alignment. Across comprehensive experiments and analyses on 11 widely-used medical datasets for base-to-novel and few-shot evaluations, our OGKD achieves substantially better performance, consistently improving accuracy by an average absolute gain of 1.7%-2.8% over all prior state-of-the-art VLM adaptation counterparts. It also robustly generalizes to unseen classes and yields more reliable predictions than other approaches. Our code is available at https://github.com/tientrandinh/OGKD.

2606.04916 2026-06-04 cs.LG econ.GN q-fin.EC stat.ML Version update

Worker Utility as Hysteresis: A Preisach Model of Transaction Acceptance in Gig Labour Markets

Piotr Frydrych

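Affiliations * Metrology and Biomedical Engineering Institute, Faculty of Mechatronics, Warsaw University of Technology

AI summary Proposes a Preisach hysteresis model to represent the hidden preferences of gig workers, estimating acceptance and rejection utilities with a dual-output neural network combined with an XGBoost classifier; achieves Jaccard = 0.827 and ROC AUC = 0.799 on 36,891 transactions and shows that price decreases affect completion rates more than price increases.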

Comments 18 pages, 5 figures

Details
Abstract

Worker utility is not observed -- only its consequence is. Each gig transaction produces a single bit: accepted or rejected. We argue this structure points directly to the Preisach hysteresis model as the natural representation of latent worker preferences. The Preisach operator models aggregate output as an integral over a population of binary threshold elements -- precisely the structure that emerges when heterogeneous workers each carry a private acceptance wage. We estimate two latent utility surfaces: acceptance utility U_1(X) and rejection utility U_0(X), via a dual-output neural network (shared layers 256->128, margin loss enforcing U_1 >= U_0). Classification reduces to the Preisach gap U_1(X) - U_0(X), passed into an XGBoost classifier alongside clip-stabilised price-to-threshold encodings. On 36,891 gig transactions, this pipeline achieves Jaccard = 0.827 and ROC AUC = 0.799. The price-to-threshold encoding accounts for +11.0 pp AUC over raw utility features. The model confirms the directional asymmetry hysteresis predicts: price decreases depress completion rates more than equivalent increases raise them. Applied to the full dataset, the model's recommendations simultaneously reduce the total wage bill by 21.3% and increase expected fill rate by 9.7 pp. For 74.2% of transactions, P(accept) already exceeds 0.80; reducing the wage keeps it above threshold (mean post-cut P = 0.972), releasing cost savings (median 31%). For the remaining 25.4%, a median 7% wage increase recovers +43 pp acceptance. A model without an explicit indifference zone cannot execute both moves simultaneously.
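
The "population of binary threshold elements" can be sketched with the standard relay (hysteron) construction behind the Preisach model. The code below is a toy, not the paper's estimator: the thresholds, the uniform averaging, and the price path are all made up for illustration, and no neural network or XGBoost stage is involved. It shows that the aggregate acceptance rate at a given price depends on whether the price was reached from below or from above.

```python
class Relay:
    """Binary threshold element: switches on at `up`, off at `down` (down <= up)."""

    def __init__(self, down, up):
        self.down, self.up, self.state = down, up, 0

    def update(self, price):
        if price >= self.up:
            self.state = 1
        elif price <= self.down:
            self.state = 0
        # Between the thresholds the relay keeps its previous state.
        return self.state

def acceptance_rate(relays, price):
    """Aggregate output: fraction of relays currently in the 'accept' state."""
    return sum(r.update(price) for r in relays) / len(relays)

# Hypothetical worker population; thresholds are made up for illustration.
relays = [Relay(down=d, up=d + 4) for d in range(6, 16)]

rising = [acceptance_rate(relays, p) for p in (5, 12, 20)]
falling = [acceptance_rate(relays, p) for p in (12, 5)]
```

At price 12 the toy population accepts at a rate of 0.3 on the way up and 0.6 on the way down, so the same price yields different outcomes depending on history. The gap between each relay's two thresholds plays the role of an indifference zone.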

2606.04876 2026-06-04 cs.LG Version update

Towards Pretraining Text Encoders for TabPFN

Mustafa Tajjar, Alexander Pfefferle, Lennart Purucker, Frank Hutter

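Affiliations * University of California, Berkeley; DeepMind

AI summary Proposes the TabPFN Text Adapter, a lightweight adapter that maps text embeddings into TabPFN's embedding space, avoiding the PCA bottleneck, preserving TabPFN's numerical strengths, and training more efficiently.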

Details
Abstract

Tabular foundation models, such as TabPFN, achieve strong performance on tabular datasets with numerical and categorical data, but do not natively handle high-cardinality text features. Standard pipelines therefore embed text with a language model and compress the resulting vectors with PCA into a small number of scalar features before inputting them into TabPFN. This creates an information bottleneck: most embedding dimensions are discarded, and the compressed representation must then be expanded again by TabPFN's feature encoder. End-to-end alternatives can avoid PCA, but they require large amounts of pretraining data containing text cells and usually underperform tabular foundation models that were pretrained on large amounts of synthetic data. Inspired by modality-alignment approaches like LLaVA (vision-to-LLM token projection) and TableGPT-style systems (table-to-LLM token projection), we introduce the TabPFN Text Adapter (text-to-TFM token projection). We freeze both the sentence encoder and TabPFN, and train only a lightweight adapter that maps text embeddings into a short sequence of tokens in TabPFN's embedding space. This design removes the PCA bottleneck, preserves TabPFN's numerical strengths, and is more efficient to train than end-to-end text-tabular pipelines.

2606.04866 2026-06-04 cs.LG Version update

Provably Reduced Sample Cost in Prior-Guided Hyperparameter Optimization

Leona Hennig, Jasmin Brandt, Lukas Fehring, Barbara Hammer, Marius Lindauer, Marcel Wever

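Affiliations * Leibniz University Hanover; University of Bielefeld; Institute of Artificial Intelligence, Leibniz University Hanover; L3S Research Center Hanover

AI summary Through the formal framework of fixed-budget best-arm identification, gives the first prior-dependent sample complexity bounds for multi-fidelity hyperparameter optimization, shows that informative priors can substantially reduce the number of evaluations, and experimentally confirms budget savings of up to 90%.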

Details
Abstract

Large-scale hyperparameter optimization (HPO) in automated machine learning (AutoML) consumes substantial computational resources, raising growing concerns about scalability and energy efficiency. Existing methods use prior information heuristically to accelerate both black-box and multi-fidelity settings, but they lack a characterization of how prior informativeness quantitatively reduces sample complexity. In this work, we provide the first distribution-dependent sample complexity bounds for multi-fidelity HPO with priors through the formal lens of fixed-budget best-arm identification. By modeling priors directly over arm means as configuration performance, we derive explicit, distribution-dependent error bounds that quantify the relationship between priors and evaluation budget. Our analysis shows that informative priors, which concentrate probability mass on near-optimal arms, yield reductions in the number of required evaluations, whereas baseline performance is recovered with uninformative or misleading priors. We conduct proof-of-concept experiments on a synthetic benchmark and on LCBench, a common multi-fidelity HPO benchmark for deep learning, to confirm our theoretical results, achieving up to 90% budget reduction while retaining solution quality. Together, our results provide a principled foundation for prior-guided and compute-efficient green AutoML.

2606.04860 2026-06-04 cs.LG cs.AI Version update

Learning Empirically Admissible Neural Heuristics for Combinatorial Search

Siddharth Sahay

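Affiliations * Independent Researcher

AI summary For combinatorial search problems, proposes a validation-calibrated framework that combines an admissible Bellman operator with an asymmetric loss function to train empirically admissible neural heuristics, substantially reducing search node expansions while maintaining path optimality.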

Comments 13 pages, 3 figures, 2 tables, 1 algorithm

Details
Abstract

Finding optimal solution paths for combinatorial puzzles like the Rubik's Cube, sliding tile puzzles, and Lights Out remains a classical challenge in artificial intelligence. Heuristic search algorithms, such as A*, guarantee path optimality only when using an admissible heuristic, one that never overestimates the true remaining cost-to-go. Deep reinforcement learning (RL) methods like DeepCubeA train deep neural networks to approximate cost-to-go heuristics. However, standard mean-squared error (MSE) training regularly yields overestimations, violating admissibility and compromising solution optimality. In this paper, we introduce a generalizable framework for learning validation-calibrated admissible neural heuristics. We train a value network using an underestimating Admissible Bellman Operator combined with an Asymmetric Loss function to penalize overestimation. To account for residual neural function approximation errors, we propose a post-hoc calibration safety offset computed over validation scrambles. We demonstrate that our calibrated neural heuristics achieve no observed admissibility violations under the evaluation protocol and preserve path optimality in practice while reducing search node expansions by up to 83.0% on a $2 \times 2$ Rubik's Cube, 19.9% on a $3 \times 3$ Lights Out grid, and 1.9% on an 8-Puzzle compared to standard analytical baselines.
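
A minimal sketch of the two ingredients named in this abstract, an asymmetric loss and a post-hoc safety offset, might look as follows. The specific loss shape, the weight of 10, the max-overestimate offset rule, and the validation numbers are assumptions made for illustration; the abstract does not give the paper's exact formulas.

```python
def asymmetric_loss(pred, target, over_weight=10.0):
    """Squared error that weights overestimation (pred > target) more heavily."""
    diff = pred - target
    return over_weight * diff ** 2 if diff > 0 else diff ** 2

def safety_offset(preds, true_costs):
    """Largest overestimate seen on validation states (0 if there is none)."""
    return max(0.0, max(p - c for p, c in zip(preds, true_costs)))

def calibrate(pred, offset):
    """Shift the heuristic down by the offset and keep it non-negative."""
    return max(0.0, pred - offset)

# Hypothetical validation data: raw network outputs and true costs-to-go.
val_preds = [2.4, 5.1, 7.0, 0.3]
val_true = [3.0, 5.0, 6.5, 0.0]

offset = safety_offset(val_preds, val_true)
calibrated = [calibrate(p, offset) for p in val_preds]
```

After calibration no validation state is overestimated, which matches the "empirically admissible" wording: the property is observed on the evaluated states, not proven for unseen ones.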

2606.04857 2026-06-04 cs.LG Version update

Rethinking Incompleteness: Formalizing Protocol Divergence and Train-Once Learning for Robust IMVC

Haolu Liu, Xiyue Wang, Xuanting Xie, Liangjian Wen, Zhao Kang

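Affiliations * National University of Singapore

AI summary Addresses the fact that the standard IMVC evaluation paradigm overlooks that missing rate alone is insufficient to characterize data incompleteness; proposes formal measures of protocol divergence and designs the CRAFT architecture, which uses per-sample independence and mask-aware fusion so that a model trained once generalizes to diverse missing patterns.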

Details
Abstract

Standard IMVC evaluation retrains separate models for different missing-data configurations. We show that this paradigm obscures a fundamental vulnerability: missing rate alone is insufficient to characterize data incompleteness. Specifically, we show that protocols with identical nominal missing rates can differ by up to $50\times$ in their proportion of fully observed samples, inducing drastically different learning regimes. We formalize this phenomenon as incompleteness divergence, providing measures that capture structural disparities across missing-data protocols. We further prove that for a broad class of reconstruction-based objectives, learning becomes structurally ill-posed when the proportion of complete samples falls below a critical threshold, leading to near-random performance. To bypass this theoretical bound, we propose CRAFT (Complete-data Robust Attention-masked Fusion Transformer). CRAFT shifts the burden of robustness from the loss function to the architecture via two key properties: (i) per-sample independence, which removes reliance on complete-sample co-occurrence, and (ii) mask-aware variable-length fusion, which aggregates only observed views through attention masking. This design allows a single model, trained once on complete data, to generalize to diverse missing patterns at inference time without retraining. Extensive experiments on seven benchmarks show that CRAFT matches or outperforms per-configuration baselines while reducing training overhead by $8.8\times$, demonstrating that robustness to missing data can be achieved as an inherent architectural property. Code (CRAFT) and our imvc-audit toolkit are available at https://anonymous.4open.science/r/CRAFT-BF80/ and https://anonymous.4open.science/r/imvc-audit-8263/.
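
Mask-aware fusion, as described in property (ii), can be illustrated with a softmax that is restricted to the observed views of a single sample. This is a sketch under assumptions, not the CRAFT architecture: the fixed attention logits, the two-dimensional embeddings, and the function name are hypothetical, and a real model would learn the scores.

```python
import math

def masked_attention_fusion(views, scores, mask):
    """Fuse per-view embeddings using a softmax restricted to observed views.

    views:  list of equal-length embedding vectors (one per view)
    scores: one attention logit per view
    mask:   1 if the view is observed, 0 if it is missing
    """
    observed = [i for i, m in enumerate(mask) if m]
    if not observed:
        raise ValueError("at least one view must be observed")
    top = max(scores[i] for i in observed)              # for numerical stability
    exp = {i: math.exp(scores[i] - top) for i in observed}
    total = sum(exp.values())
    weights = [exp[i] / total if i in exp else 0.0 for i in range(len(views))]
    dim = len(views[0])
    fused = [sum(w * v[d] for w, v in zip(weights, views)) for d in range(dim)]
    return fused, weights

views = [[1.0, 0.0], [0.0, 1.0], [9.0, 9.0]]   # third view is a placeholder
scores = [0.0, 0.0, 5.0]
fused, weights = masked_attention_fusion(views, scores, mask=[1, 1, 0])
```

The missing view receives weight zero regardless of its logit or placeholder content, and the computation uses only the views of one sample, so no other sample needs to be complete.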

2606.04850 2026-06-04 cs.LG cs.AI cs.AR math.OC Version update

Uncertainty-Aware End-to-End Co-Design of Neural Network Processors: From Training and Mapping to Fabrication

Yuyang Du, Yujun Huang, Gioele Zardini

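AI summary Proposes a unified framework based on monotone co-design theory that achieves end-to-end co-design of neural network processors through four interoperable design blocks (network training, chip mapping, wafer-level fabrication, and compute resource allocation), and introduces Confidence (the inverse of success probability) as an explicit, optimizable resource for handling uncertainty.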

Comments 14 pages

Details
Abstract

Designing a neural network processor is an end-to-end co-design problem: network architecture and training budget determine the inference workload; hardware mapping decisions determine chip area, latency, and energy; and these characteristics govern fabrication yield and manufacturing cost. In practice, these decisions are made in separate stages, and existing co-design methodologies are tightly coupled to specific algorithms, making it difficult to improve one component without reworking the entire pipeline. This paper presents a unified framework, grounded in monotone co-design theory, that composes four interoperable design blocks spanning network training, chip mapping, wafer-level fabrication, and compute resource allocation. Each block exposes only a functionality-resource interface to the rest of the system, so any block can be refined without structural changes elsewhere. A central contribution is the treatment of uncertainty: rather than collapsing stochastic outcomes into point estimates, the framework introduces Confidence, the inverse of success probability, as an explicit and optimizable resource alongside cost, time, and power. Three case studies validate the approach. The first recovers Pareto-optimal implementations across heterogeneous application scenarios. The second confirms that Confidence functions as a continuously tunable design knob rather than a post-hoc diagnostic. The third demonstrates that improving a single block's implementation set automatically propagates to the global Pareto front, without modifying the co-design diagram.
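
One way to picture Confidence as a resource alongside cost and time is a plain Pareto filter over resource vectors that are all minimised. The sketch below is not the paper's co-design solver: the candidate designs and their numbers are made up, and the only detail taken from the abstract is the definition of Confidence as the inverse of success probability.

```python
def dominates(a, b):
    """True if design a is no worse than b on every resource and better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(designs):
    """Keep the designs whose resource vectors are not dominated by any other."""
    return {
        name: res
        for name, res in designs.items()
        if not any(dominates(other, res) for o, other in designs.items() if o != name)
    }

def resources(cost, latency, success_probability):
    """Resource vector to minimise; Confidence = 1 / success probability."""
    return (cost, latency, 1.0 / success_probability)

# Hypothetical candidate implementations (all numbers are made up).
designs = {
    "small_chip": resources(cost=10.0, latency=8.0, success_probability=0.95),
    "large_chip": resources(cost=25.0, latency=3.0, success_probability=0.80),
    "wasteful": resources(cost=30.0, latency=9.0, success_probability=0.75),
}
front = pareto_front(designs)
```

Keeping the success probability as its own coordinate, rather than folding it into an expected cost, lets a cheaper but riskier design and a costlier but safer one both stay on the front.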

2606.04847 2026-06-04 cs.CV cs.CL cs.LG Version update

MusaCoder: Native GPU Kernel Generation with Full-Stack Training on Moore Threads GPU

Kun Cheng, Songshuo Lu, Sicong Liao, Tankun Li, Yafei Zhang, Dong Yang, Qiheng Lv, Hua Wang, Zhi Chen, Yaohua Tang

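Affiliations * Moore Threads AI

AI summary Proposes MusaCoder, a full-stack training framework that combines progressive data synthesis, diversity-preserving rejection fine-tuning, and execution-feedback reinforcement learning to generate efficient native GPU kernels on CUDA and MUSA backends; the 9B model matches frontier closed-source models and the 27B model sets a new state of the art.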

Details
Abstract

Native GPU kernel generation turns high-level tensor programs into executable, efficient low-level code. Existing Large Language Models (LLMs) struggle with this task, while execution-based reinforcement learning suffers from sparse rewards, reward hacking, and training instability. We present MusaCoder, a full-stack training framework for native GPU kernel generation on CUDA and MUSA backends. MusaCoder combines progressive kernel-oriented data synthesis, diversity-preserving rejection fine-tuning, and execution-feedback Reinforcement Learning (RL) through MooreEval, a distributed verifier and reward environment. To stabilize RL, MusaCoder introduces PrimeEcho for first-turn-anchored multi-turn rewards, Buffered Dynamic Retry for recovering signals from all-failed hard samples, and MirrorPop for off-policy sequence filtering. Experiments on KernelBench and a MUSA-ported variant show that MusaCoder outperforms strong open-source and proprietary baselines in both correctness and empirical speedup, with the 9B model matching or exceeding frontier closed-source models and the 27B model establishing a new state of the art. These results demonstrate not only the effectiveness of full-stack execution-feedback training for native kernel generation, but also the capability of Moore Threads GPUs to support the complete LLM post-training stack, providing a practical foundation for large-model training and optimization on emerging accelerators.

2606.04845 2026-06-04 stat.ML cs.LG math.ST stat.CO stat.TH Version update

Bayesian learning for the stochastic shortest path problem

Chon Wai Ho, Sumeetpal S. Singh, Jiaqi Guo

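Affiliations * Department of Engineering, University of Cambridge, UK; School of Mathematics and Physics, University of Wollongong, Wollongong, Australia

AI summary For the stochastic shortest path problem, proposes a Bayesian framework that constructs the posterior distribution of the optimal action-value function $Q^*$ directly through Bellman's optimality equations and addresses the unidentifiability caused by relaxing the likelihood, achieving uncertainty quantification and data-efficient learning.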

Comments 50 pages, 19 figures

详情
AI中文摘要

序列决策问题通常被建模为马尔可夫决策过程(MDP)。我们关注随机最短路径(SSP)问题,这是一个具有吸收终止状态的无限水平无折扣MDP。我们开发了一个贝叶斯框架,通过与决策任务的交互来学习最优决策策略。具体来说,我们学习最优动作价值函数$Q^*$,但与许多现有的贝叶斯方法不同,我们不依赖于不现实的建模假设和临时近似。我们的方法是通过贝尔曼最优方程直接构建$Q^*$的后验信念。对于确定性奖励,我们将后验描述为具有流形密度的分布。为了简化推理,我们放松了似然,使得勒贝格密度存在。但这样做的代价是产生不可识别性问题。具体来说,放松后的后验可能在不当决策规则上有显著质量,而精确后验则不会。我们还计算了$Q^*$的表格参数化、高斯似然放松和高斯先验下最优动作选择的精确后验概率,这在基准测试研究中很有用。对深海基准测试变体的数值研究验证了我们的发现。我们证明了我们的框架能够忠实地量化不确定性,并且与其他基于时间差分的贝叶斯方法相比,数据效率更高。最后,我们对未来工作提出了建议。

英文摘要

Sequential decision-making problems are often modelled as a Markov decision process (MDP). We focus on the stochastic shortest path (SSP) problem, which is an infinite-horizon undiscounted MDP with absorbing terminal states. We develop a Bayesian framework to learn the optimal decision strategy through interactions with the decision-making task. Specifically, we learn the optimal action-value function $Q^*$, but unlike many existing Bayesian approaches, we do not rely on unrealistic modelling assumptions and ad-hoc approximations. Our approach is to directly construct the posterior beliefs for $Q^*$ through Bellman's optimality equations. For deterministic rewards, we characterise the posterior as a distribution with a manifold density. To facilitate simpler inference, we relax the likelihood so that a Lebesgue density exists. The flip side is that this relaxation creates unidentifiability issues. Specifically, the relaxed posterior can have significant mass on improper decision rules, while the exact posterior will not. We also calculate the exact posterior probabilities for optimal action selections for the tabular parametrisation of $Q^*$, a Gaussian likelihood relaxation and a Gaussian prior, which is useful in benchmarking studies. Numerical studies on variants of the Deep Sea benchmark verify our findings. We demonstrate that our framework faithfully quantifies uncertainty and, compared to other temporal-difference-based Bayesian methodologies, is more data efficient. We conclude with recommendations for future work.
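
As a point of reference for the Bellman optimality equations the abstract builds on, here is a minimal sketch of a tabular fixed-point iteration for $Q^*$ in a tiny SSP. It is not the paper's Bayesian method: the function `q_fixed_point`, the toy states, and the rewards are all hypothetical, and the transition model is assumed known.

```python
def q_fixed_point(transitions, rewards, terminal, sweeps=100):
    """Iterate Q(s,a) = r(s,a) + sum_s' P(s'|s,a) * max_a' Q(s',a').

    Terminal (absorbing) states contribute a value of 0; no discounting is used.
    transitions[s][a] is a list of (probability, next_state) pairs.
    """
    q = {s: {a: 0.0 for a in acts} for s, acts in transitions.items()}
    for _ in range(sweeps):
        new_q = {}
        for s, acts in transitions.items():
            new_q[s] = {}
            for a, outcomes in acts.items():
                future = sum(
                    p * (0.0 if s2 in terminal else max(q[s2].values()))
                    for p, s2 in outcomes
                )
                new_q[s][a] = rewards[s][a] + future
        q = new_q
    return q


# Hypothetical toy problem: "T" is the absorbing terminal state.
transitions = {
    0: {"go": [(1.0, 1)], "jump": [(0.5, "T"), (0.5, 0)]},
    1: {"go": [(1.0, "T")]},
}
rewards = {0: {"go": -1.0, "jump": -1.5}, 1: {"go": -1.0}}
q_star = q_fixed_point(transitions, rewards, terminal={"T"})
```

In this toy problem the greedy action in state 0 is `"go"`, since the risky `"jump"` costs more per attempt and may leave the agent where it started.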

2606.04834 2026-06-04 cs.LG 版本更新

Prediction Under Imperfect Compression: A Theory of Approximate MDL

非完美压缩下的预测:近似最小描述长度理论

Qian Li, Xinyu Mao, Shang-Hua Teng, Guangxu Yang

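Affiliations * Shenzhen Research Institute of Big Data; University of Southern California

AI summary Studies the conditions under which the Minimum Description Length (MDL) principle still guarantees reliable sequential prediction under approximate optimization, proving robustness to additive slack and characterizing when regularization is necessary.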

Comments 26 pages

详情
AI中文摘要

最小描述长度(MDL)通过优化总描述长度 $L(\mathrm{model})+L(\mathrm{data} \ | \ \mathrm{model})$ 形式化了奥卡姆剃刀原则。对于序列预测,MDL 方法反复选择在观测前缀上具有最小目标得分的模型进行下一步预测。经典 MDL 预测理论表明,精确优化 MDL 目标确实提供了支持可靠预测的强压缩保证。然而,实际机器学习通常只能通过近似优化目标函数来找到模型。为弥合这一差距,本文解决了以下基本问题:在何种近似和正则化形式下,近似 MDL 仍能保证可靠的序列预测?本文提供了一个原则性的刻画。我们证明,对于平衡 MDL 目标更一般形式 $λ\cdot L(\mathrm{model})+L(\mathrm{data} \ | \ \mathrm{model})$ 的任意加性松弛 $C$,当 $λ\ge1$ 时,累积期望平方预测误差有限。$λ>1$ 的情况通过亲和-望远镜论证证明,而边界情况 $λ=1$ 通过基于精确静态 MDL 边界的似然比停止论证证明。我们的结果表明,经典 MDL 正则化对任意固定加性优化误差保持鲁棒。此外,我们建立了近似 MDL 框架刻画的尖锐性:当 $0<λ<1$ 时,在可估测度的通用类中,过拟合可能导致无限累积期望误差,因此需要强形式的模型复杂度正则化。另外,在乘性近似下,模型选择可能在每个正则化区域 $λ>0$ 中失败,因此加性近似既充分又必要。

英文摘要

Minimum Description Length (MDL) formalizes the principle of Occam's razor by optimizing the total description length: $L(\mathrm{model})+L(\mathrm{data} \ | \ \mathrm{model})$. For sequential prediction, the MDL method repeatedly selects a model with minimum objective score on the observed prefix for the next-step prediction. Classical MDL prediction theory shows that exact optimization of the MDL objective indeed provides a strong compression guarantee that supports reliable prediction. However, practical machine learning usually can only find models by approximately optimizing the objective function. To bridge this gap, this paper addresses the following fundamental question: Under what forms of approximation and regularization does approximate MDL still guarantee reliable sequential prediction? This work offers a principled characterization. We prove that, for any approximation with additive slack $C$ of the more general balanced MDL objective $λ\cdot L(\mathrm{model})+L(\mathrm{data} \ | \ \mathrm{model})$, the cumulative expected squared prediction error is finite for all $λ\ge1$. The case $λ>1$ is proved by an affinity-telescoping argument, while the boundary case $λ=1$ is proved by a likelihood-ratio stopping argument based on exact static MDL bounds. Our results establish that classical MDL regularization remains robust to any fixed additive optimization error. Furthermore, we establish that our characterization of the approximate MDL framework is sharp: When $0<λ<1$, overfitting can incur infinite cumulative expected error in the universal class of estimable measures, and hence a strong form of model-complexity regularization is necessary. In addition, under multiplicative approximation, model selection may fail in every regularized regime $λ>0$; thus, additive approximation is both sufficient and essential.
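
To make the balanced objective and the notion of additive slack concrete, the following sketch scores a few Bernoulli models on a binary sequence and returns every model an approximate optimizer would be allowed to pick. The model class, the code lengths in `models`, and the function names are illustrative assumptions, not taken from the paper.

```python
import math


def data_code_length(bits, p):
    """L(data | model): negative log2-likelihood of a binary sequence under Bernoulli(p)."""
    return -sum(math.log2(p if b == 1 else 1.0 - p) for b in bits)


def mdl_score(bits, p, model_bits, lam=1.0):
    """Balanced objective: lam * L(model) + L(data | model)."""
    return lam * model_bits + data_code_length(bits, p)


def admissible_models(bits, models, lam=1.0, slack=0.0):
    """Return every (p, model_bits) whose score is within `slack` of the minimum.

    slack = 0 recovers exact MDL; slack = C > 0 models an optimizer that is
    only guaranteed to land within an additive C of the optimum.
    """
    scores = [mdl_score(bits, p, mb, lam) for p, mb in models]
    best = min(scores)
    return [m for m, s in zip(models, scores) if s <= best + slack]


# Hypothetical data and model class: (Bernoulli parameter, description length in bits).
bits = [1, 1, 1, 0, 1, 1, 1, 1]
models = [(0.5, 1.0), (0.875, 3.0), (0.9, 6.0)]
exact = admissible_models(bits, models)
loose = admissible_models(bits, models, slack=2.0)
```

With zero slack only the best-scoring model survives; allowing two bits of slack also admits the simpler, worse-fitting model, which is the regime the paper's guarantees are stated for.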

2606.04822 2026-06-04 cs.LG 版本更新

Reconciling Causality and Non-Equilibrium Thermodynamics with Hamiltonian Causal Models

用哈密顿因果模型调和因果关系与非平衡热力学

Dario Rancati, Max Welling, Francesco Locatello

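Affiliations * Institute of Science and Technology Austria; CuspAI; University of Amsterdam

AI summary Proposes Hamiltonian Causal Models (HCMs), which separate immutable equations of motion from intervenable mechanisms, define path-level causal effects, interface naturally with non-equilibrium thermodynamics, and use entropy production to quantify causal effects.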

详情
AI中文摘要

物理时间现象的因果建模必须处理沿轨迹的干预、非平稳诱导律、路径依赖效应以及由动力学介导的反馈,这些在标准因果模型中都具有挑战性。我们引入了哈密顿因果模型(HCMs),这是一个轨迹级框架,其中观测变量与局部环境相互作用,干预作为哈密顿机制的控制。HCMs将不可变的运动方程与可干预机制分离,并将因果效应定义为干预路径律之间的差异。HCMs的一个关键动机是它们与非平衡热力学的自然接口。熵产生量化了过程的不可逆性,是一个核心因果可观测量:它可以从数据中估计,并见证系统演化过程中标准平均处理效应的端点和累积版本所不可见的因果效应。如同物理学中,原因和结果不是两个随机变量之间关系的原始概念,而是源于热力学箭头的不可逆性。因此,我们的论文调和了统计因果模型和非平稳热力学的语言,为描述广泛物理系统中的因果关系提供了新工具。

英文摘要

Causal modeling of physical temporal phenomena must handle interventions that act along trajectories, nonstationary induced laws, path-dependent effects, and feedback mediated by dynamics, all challenging in standard causal models. We introduce Hamiltonian Causal Models (HCMs), a trajectory-level framework in which observed variables interact with local environments and interventions act as controls of Hamiltonian mechanisms. HCMs separate immutable equations of motion from intervenable mechanisms and define causal effects as discrepancies between interventional path laws. A key motivation for HCMs is their natural interface with non-equilibrium thermodynamics. Entropy production quantifies the irreversibility of a process and is a central causal observable: it is estimable from data and witnesses causal effects along the system's evolution that are invisible to endpoint and cumulative versions of the standard average treatment effect. As in physics, cause and effect are not primitives of the relation between two random variables but arise from the non-invertibility of the thermodynamic arrow. With this, our paper reconciles the language of statistical causal models and non-stationary thermodynamics, offering new tools to describe causality in a wide range of physical systems.

2606.04820 2026-06-04 cs.CV cs.AI cs.LG 版本更新

OA-CutMix: Correcting the Label Bias of CutMix

OA-CutMix:纠正CutMix的标签偏差

Tobias Christian Nauen, Stanislav Frolov, Federico Raue, Brian B. Moser, Andreas Dengel

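Affiliations * RPTU University Kaiserslautern-Landau; German Research Center for Artificial Intelligence (DFKI)

AI summary CutMix assigns labels by region area, which introduces semantic bias; OA-CutMix instead uses segmentation masks to assign labels by visible object area, improving classification accuracy without changing the image mixing procedure.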

详情
AI中文摘要

CutMix已成为事实上的标准混合增强方法,但其标签分配基于一个有缺陷的假设:粘贴补丁的面积忠实地反映了其对混合图像的语义贡献。然而,在实践中,补丁经常落在背景区域,将标签信用分配给其目标不可见的类别。CutMix标签与语义目标面积的平均差异为21.5%。在17%的样本中,一张图像贡献了零个可见目标像素,却获得了非零的标签权重。我们提出目标感知CutMix(OA-CutMix),通过用从预计算分割掩码中导出的权重替换基于面积的CutMix权重来纠正这种偏差,根据每个图像贡献给混合图像的可见目标面积比例分配标签。图像混合过程完全保持不变。我们在4种架构和6个数据集上评估了OA-CutMix与10多种静态和动态混合方法的性能。OA-CutMix在所有任务中始终达到最高准确率,甚至优于动态混合方法,但训练时间成本仅为其一小部分。对于小目标,改进最大,因为CutMix的标签偏差最大。因此,纠正标签足以匹配或超过修改图像混合算法的方法的性能。

英文摘要

CutMix has become the de facto standard mixing augmentation, yet its label assignment rests on a flawed assumption: that the area of the pasted patch faithfully reflects its semantic contribution to the mixed image. In practice, however, patches frequently land on background regions, assigning label credit to classes whose objects are not visible. The mean discrepancy between the CutMix label and the semantic object area is $21.5\%$. In $17\%$ of samples an image contributes zero visible object pixels yet receives nonzero label weight. We propose Object-Aware CutMix (OA-CutMix), which corrects this bias by replacing the area-based CutMix weight with one derived from precomputed segmentation masks, assigning labels in proportion to the visible object area each image contributes to the mix. The image mixing procedure is left entirely unchanged. We evaluate OA-CutMix against 10+ static and dynamic mixing methods across 4 architectures and 6 datasets. OA-CutMix consistently achieves the highest accuracy over all tasks, outperforming even dynamic mixing methods at a fraction of their training-time cost. Improvements are largest for small objects, where the label bias from CutMix is greatest. Thus, correcting the label is sufficient to match or exceed the performance of methods modifying the image mixing algorithm.
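
The difference between area-based and object-aware label weights can be sketched in a few lines. This is only an illustration of the idea stated in the abstract: the function names, the binary-mask representation, and the fallback to area-based weights when no object pixel is visible are assumptions, not the authors' implementation.

```python
def cutmix_weights(height, width, box):
    """Area-based CutMix weights (weight_a, weight_b); box = (top, left, bottom, right)."""
    top, left, bottom, right = box
    lam_b = (bottom - top) * (right - left) / (height * width)
    return 1.0 - lam_b, lam_b


def object_aware_weights(mask_a, mask_b, box):
    """Weights proportional to the visible object pixels each image contributes.

    Image B's pixels are pasted inside the box; image A remains visible outside it.
    Masks are binary lists of lists (1 = object pixel).
    """
    height, width = len(mask_a), len(mask_a[0])
    top, left, bottom, right = box
    visible_a = visible_b = 0
    for i in range(height):
        for j in range(width):
            if top <= i < bottom and left <= j < right:
                visible_b += mask_b[i][j]
            else:
                visible_a += mask_a[i][j]
    total = visible_a + visible_b
    if total == 0:
        # Assumed fallback: no object visible anywhere, so keep the area-based weights.
        return cutmix_weights(height, width, box)
    return visible_a / total, visible_b / total


# Hypothetical 4x4 example: both objects sit in the top-left corner,
# while the pasted patch covers the bottom-right corner of image B (pure background).
mask = [[1, 1, 0, 0], [1, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
box = (2, 2, 4, 4)
area_based = cutmix_weights(4, 4, box)
object_aware = object_aware_weights(mask, mask, box)
```

Here the area-based rule gives image B a label weight of 0.25 even though none of its object pixels are visible, which is exactly the bias the abstract describes; the object-aware rule assigns it zero.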

2606.04816 2026-06-04 cs.AI cs.LG 版本更新

Beyond Objective Equivalence: Constraint Injection for LLM-Based Optimization Modeling on Vehicle Routing Problems

超越目标等价性:基于LLM的车辆路径问题优化建模的约束注入

Xizi Luo, Changhong He, Dongdong Geng, Chenggong Shi, Yu Mei

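Affiliations * Beihang University; Baidu Inc.

AI summary LLMs may add spurious constraints or omit required ones in constraint-dense operations research problems; this work proposes constraint injection, combines it with differential testing into a dual verifier, and validates it on vehicle routing problems.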

Comments 28 pages

详情
AI中文摘要

大型语言模型(LLM)越来越多地将自然语言优化问题转化为可执行的求解器代码。然而,对于约束密集的运筹学(OR)问题,现有的数据过滤和训练流程主要依赖于目标等价性信号,如差分测试和答案一致性,这些信号允许程序在测试实例上添加虚假约束或静默省略必要约束,只要这些约束在测试实例上非绑定。我们提出约束注入,利用可行探针暴露虚假过度约束,利用单约束违反探针揭示静默约束遗漏。结合差分测试,它形成一个双重验证器。我们在车辆路径问题(VRPs)上实例化并评估该方法,VRPs是代表性的约束密集组合优化测试平台,具有耦合的操作约束。我们开发了VRPCoder,一个8B端到端模型,将自然语言VRP场景转化为Gurobi脚本,并附带一个专家验证的VRP基准套件,涵盖21种变体。该验证器在数据合成期间用作拒绝采样过滤器,在组相对策略优化(GRPO)中用作每次rollout的奖励。在四个VRP基准上,VRPCoder-GRPO达到93%的平均Pass@1,在三个基准上优于Gemini-3.1-Pro Preview,超过Claude-Sonnet-4.5平均28个百分点,并超过先前的OR-LLM平均78个百分点。

英文摘要

Large language models (LLMs) increasingly translate natural-language optimization problems into executable solver code. Yet for constraint-dense operations research (OR) problems, existing data-filtering and training pipelines largely rely on objective-equivalence signals such as differential testing and answer agreement, which a program can pass while adding spurious constraints or silently omitting required ones, whenever those constraints are non-binding on the tested instance. We propose constraint injection, which uses feasible probes to expose spurious over-constraint and one-constraint-violating probes to reveal silent constraint omission. Combined with differential testing, it forms a dual verifier. We instantiate and evaluate it on vehicle routing problems (VRPs), a representative constraint-dense combinatorial optimization testbed with coupled operational constraints. We develop VRPCoder, an 8B end-to-end model that translates natural-language VRP scenarios into Gurobi scripts, together with an expert-verified VRP benchmark suite covering 21 variants. The verifier is reused as a rejection-sampling filter during data synthesis and as a per-rollout reward in group relative policy optimization (GRPO). Across four VRP benchmarks, VRPCoder-GRPO reaches 93% average Pass@1, outperforms Gemini-3.1-Pro Preview on three benchmarks, exceeds Claude-Sonnet-4.5 by 28 average points, and surpasses prior OR-LLMs by 78 average points.
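
The probing logic can be illustrated without a solver. In the sketch below, a generated program is abstracted as a function that says whether a fixed solution is feasible under its constraints; that abstraction, the capacity value, and all names are assumptions made for illustration, whereas the paper works with Gurobi scripts.

```python
def probe_report(candidate_accepts, feasible_probes, violating_probes):
    """Check a candidate's constraint set against two kinds of probes.

    Feasible probes that are rejected point to spurious (extra) constraints;
    violating probes that are accepted point to silently omitted constraints.
    """
    spurious = [p for p in feasible_probes if not candidate_accepts(p)]
    omitted = [p for p in violating_probes if candidate_accepts(p)]
    return {"spurious": spurious, "omitted": omitted,
            "passed": not spurious and not omitted}


CAPACITY = 10  # hypothetical vehicle capacity


def reference(routes):
    """Ground truth: each route's total demand must fit the vehicle."""
    return all(sum(route) <= CAPACITY for route in routes)


def forgets_capacity(routes):
    """Faulty candidate that omits the capacity constraint entirely."""
    return True


def adds_stop_limit(routes):
    """Faulty candidate that adds a spurious 'at most two stops' rule."""
    return reference(routes) and all(len(route) <= 2 for route in routes)


# Each probe is a list of routes; each route lists customer demands.
feasible = [[[3, 3, 3]], [[5, 5], [4]]]
violating = [[[6, 6]]]  # violates exactly one constraint (capacity)

report_ok = probe_report(reference, feasible, violating)
report_omit = probe_report(forgets_capacity, feasible, violating)
report_extra = probe_report(adds_stop_limit, feasible, violating)
```

Both faulty candidates could agree with the reference objective on an instance where capacity and stop counts never bind, which is why objective equivalence alone would not catch them.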

2606.04815 2026-06-04 cs.LG cs.AI 版本更新

Learning While Acting: A Skill-Enhanced Test-Time Co-Evolution Framework for Online Lifelong Learning Agents

边行动边学习:面向在线终身学习智能体的技能增强测试时协同进化框架

Bo Mao, Jie Zhou, Yutao Yang, Xin Li, Xian Wei, Qin Chen, Xingjiao Wu, Liang He

Affiliations * School of Computer Science and Technology, East China Normal University; Shanghai AI Laboratory; Software Engineering Institute, East China Normal University

AI summary Proposes the LifeSkill framework, which uses verifier-guided skill learning and online skill internalization so that LLM agents continuously internalize feedback at test time, improving lifelong learning performance.

详情
AI中文摘要

终身学习对于在动态、交互环境中运行的大型语言模型(LLM)智能体至关重要。然而,现有的用于长时任务的终身学习智能体通常依赖于离散技能或过去经验检索,并在推理期间使用静态参数,这阻止了它们像人类学习者一样持续内化测试时反馈。为弥补这一差距,我们提出了技能增强测试时协同进化(LifeSkill),一个用于在线终身学习智能体的两阶段强化学习框架。具体来说,我们设计了验证器引导的技能学习,通过根据多个技能条件策略滚动的平均验证器成功率奖励候选技能,解决了技能提取缺乏直接监督的问题,鼓励模型生成对解决任务有用的技能,而不仅仅是文本上合理的技能。此外,我们引入了在线技能内化,通过在测试时交互期间将技能条件轨迹转化为奖励信号,持续改进策略模型。这使得智能体能够将推理能力直接内化到其参数中,避免了经验检索的上下文膨胀。在LifelongAgentBench上的实验表明,与现有终身学习智能体基线相比,LifeSkill将平均性能提高了7个绝对百分点。

英文摘要

Lifelong learning is essential for Large Language Model (LLM) agents operating in dynamic, interactive environments. However, existing lifelong learning agents for long-horizon tasks typically depend on retrieving discrete skills or past experiences while keeping parameters static during inference, which prevents them from continuously internalizing test-time feedback as human learners do. To bridge this gap, we propose Skill-enhanced Test-Time Co-Evolution (LifeSkill), a two-stage reinforcement learning framework for Online Lifelong Learning Agents. Specifically, we design Verifier-Guided Skill Learning that addresses the lack of direct supervision for skill extraction by rewarding candidate skills according to the average verifier success of multiple skill-conditioned policy rollouts, encouraging the model to generate skills that are useful for solving tasks rather than merely plausible in text. Furthermore, we introduce Online Skill Internalization, which continuously improves the policy model during test-time interaction by transforming skill-conditioned trajectories into reward signals. This enables the agent to directly internalize reasoning capabilities into its parameters, avoiding the context bloat of experience retrieval. Experiments on LifelongAgentBench show that LifeSkill improves average performance by 7 absolute points compared with existing lifelong agent baselines.

2606.04807 2026-06-04 cs.AI cs.CL cs.CY cs.LG 版本更新

BiasGRPO: Stabilizing Bias Mitigation in High-Variance Reward Landscapes via Group-Relative Policy Optimization

BiasGRPO:通过组相对策略优化在高方差奖励景观中稳定偏差缓解

Saket Reddy, Ke Yang, ChengXiang Zhai

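Affiliations * University of Illinois - Urbana-Champaign

AI summary Proposes BiasGRPO, a framework that uses Group Relative Policy Optimization (GRPO) to stabilize social-bias mitigation in large language models by normalizing rewards within a group, outperforming DPO and PPO.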

Comments Accepted to Findings of the ACL

详情
AI中文摘要

缓解大语言模型(LLMs)中的社会偏差提出了一个独特的对齐挑战:与可验证任务不同,偏差缺乏单一的真实标准,从而产生高方差、主观的奖励景观。先前的基于偏好的微调方法存在主要权衡:直接偏好优化(DPO)受限于离线训练中缺乏探索,而近端策略优化(PPO)由于潜在不可靠的评论家估计可能导致训练不稳定。在本文中,我们提出了BiasGRPO,一个使用组相对策略优化(GRPO)的框架,通过对一组采样完成进行奖励归一化来稳定对齐。通过用组相对基线替代价值函数,我们的方法在保持在线训练探索优势的同时减少了不稳定性。我们发现BiasGRPO在多个基准测试中优于DPO和PPO,表明其有效性。为了适应GRPO,我们综合扩展了一个涵盖多个领域和上下文的数据集。我们还创建并发布了一个定制的偏差奖励模型,该模型在有效指导生成的同时高度计算高效且避免知识退化,提供了一个可无缝集成到多目标RLHF流程中的宝贵资源。

英文摘要

Mitigating social bias in Large Language Models (LLMs) presents a distinct alignment challenge: unlike verifiable tasks, bias lacks a single ground truth, creating a high-variance, subjective reward landscape. Previous preference-based fine-tuning methods have major trade-offs: Direct Preference Optimization (DPO) is limited by the lack of exploration inherent in offline training, while Proximal Policy Optimization (PPO) can lead to training instability due to potentially unreliable critic estimates. In this paper, we propose BiasGRPO, a framework using Group Relative Policy Optimization (GRPO) to stabilize alignment by normalizing rewards across a group of sampled completions. By substituting the value function with a group-relative baseline, our approach reduces instability while maintaining the exploration benefits of online training. We find that BiasGRPO outperforms DPO and PPO across multiple benchmarks, indicating its effectiveness. To adapt GRPO, we synthetically extend a dataset spanning multiple domains and contexts. We also create and release a custom bias reward model that effectively guides generation while being highly compute-efficient and avoiding knowledge degradation, providing a valuable resource that can be seamlessly integrated into multi-objective RLHF pipelines.
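
The group-relative baseline mentioned in the abstract replaces a learned value function with statistics of the sampled group. A minimal sketch follows, assuming the common formulation that subtracts the group mean and divides by the group standard deviation; the use of the population standard deviation and the `eps` constant are assumptions of this sketch, not details confirmed by the abstract.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-8):
    """Normalize the rewards of completions sampled for one prompt.

    Each completion's advantage is its reward minus the group mean,
    scaled by the group standard deviation (eps avoids division by zero).
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical reward-model scores for three completions of the same prompt.
advantages = group_relative_advantages([1.0, 2.0, 3.0])
flat = group_relative_advantages([0.5, 0.5])
```

When every completion in a group receives the same reward, all advantages are zero and the group contributes no gradient signal, which is one reason reward variance within a group matters for this family of methods.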

2606.04798 2026-06-04 cs.LG 版本更新

Uncertainty-Aware (Un)Supervised Few-Shot User Adaptation for On-Device Personalized Human Activity Recognition

不确定性感知的(无)监督少样本用户自适应用于设备上个性化人类活动识别

Maximilian Burzer, Till Riedel, Michael Beigl, Tobias Röddiger

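Affiliations * Karlsruhe Institute of Technology; IPAI Foundation gGmbH

AI summary Proposes a gradient-free framework that performs supervised or unsupervised few-shot user adaptation through Bayesian prototype estimation; 3 seconds of calibration data per activity is enough to significantly improve on-device HAR models.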

Comments 6 pages, 4 figures, 2 tables, 2 algorithms

详情
AI中文摘要

基于传感器的人类活动识别(HAR)模型通常因个体运动模式和传感器放置导致的域偏移而在未见用户上性能下降。因此,实用的可穿戴HAR系统需要轻量级的个性化方法,这些方法应适用于校准数据有标签、无标签或不可用的情况,并在有限校准下具有鲁棒性。我们提出一个无梯度框架,将预训练的HAR分类器重新用作原型网络,利用先验原型保持零样本性能并规范自适应。对于有标签校准,我们引入闭式贝叶斯原型估计,并将相同原理扩展到无标签校准。仅使用每活动3秒的校准数据(一次样本),监督自适应在四个数据集上将宏F1提高了+2.76至+33.44个百分点,而无监督自适应提高了+0.56至+32.13个百分点。由于自适应仅需要闭式原型更新,该框架能够实现现有HAR分类器的高效且鲁棒的设备上个性化。

英文摘要

Sensor-based Human Activity Recognition (HAR) models often degrade on unseen users due to domain shifts caused by individual movement patterns and sensor placement. Practical wearable HAR systems therefore require personalization methods that are lightweight, applicable whether calibration data is labeled, unlabeled, or unavailable, and robust under limited calibration. We present a gradient-free framework that repurposes pretrained HAR classifiers as Prototypical Networks using prior prototypes, which preserve zero-shot performance and regularize adaptation. For labeled calibration, we introduce closed-form Bayesian prototype estimation and extend the same principle to unlabeled calibration. With only 3 seconds of calibration data per activity (one shot), supervised adaptation improves macro-F1 by +2.76 to +33.44 percentage points across four datasets, while unsupervised adaptation improves by +0.56 to +32.13 points. Since adaptation requires only closed-form prototype updates, the framework enables efficient and robust on-device personalization of preexisting HAR classifiers.
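
One way to picture a closed-form prototype update with a prior is the standard conjugate Gaussian mean estimate, sketched below. The isotropic-Gaussian assumption, the `prior_strength` pseudo-count, and the function names are choices made for this illustration; the paper's exact estimator may differ.

```python
import math


def posterior_prototype(prior_mean, samples, prior_strength=1.0):
    """Posterior mean of a class prototype under a Gaussian prior.

    With n calibration embeddings and a prior worth `prior_strength`
    pseudo-observations, the update is a weighted average of the prior
    prototype and the sample mean. No gradients are involved.
    """
    n = len(samples)
    if n == 0:
        return list(prior_mean)  # zero-shot case: keep the prior prototype
    dim = len(prior_mean)
    sample_mean = [sum(x[d] for x in samples) / n for d in range(dim)]
    return [
        (prior_strength * prior_mean[d] + n * sample_mean[d]) / (prior_strength + n)
        for d in range(dim)
    ]


def nearest_prototype(x, prototypes):
    """Classify an embedding by its closest prototype (Euclidean distance)."""
    return min(prototypes, key=lambda label: math.dist(x, prototypes[label]))


# Hypothetical 2-D embeddings: one labeled calibration sample for "walk", none for "run".
prototypes = {
    "walk": posterior_prototype([0.0, 0.0], [[2.0, 2.0]]),
    "run": posterior_prototype([4.0, 4.0], []),
}
predicted = nearest_prototype([1.5, 1.5], prototypes)
```

The prior keeps a class usable when it has no calibration data and damps the update when there is only one noisy sample, which matches the regularizing role the abstract attributes to prior prototypes.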

2606.04797 2026-06-04 cs.CV cs.LG 版本更新

Crafting Your Evolving Dreams: Concept-Incremental Versatile Customization

打造你不断演变的梦想:概念增量式多功能定制

Jiahua Dong, Wenqi Liang, Hongliu Li, Yang Cong, Duzhen Zhang, Hanbin Zhao, Henghui Ding, Yulun Zhang, Salman Khan, Fahad Shahbaz Khan

Affiliations * Mohamed bin Zayed University of Artificial Intelligence; University of Trento; Department of Civil and Environmental Engineering, The Hong Kong Polytechnic University; South China University of Technology; College of Computer Science and Technology, Zhejiang University; Institute of Big Data, Fudan University; Shanghai Jiao Tong University

AI summary Proposes the Continually Customizable Diffusion Model (CCDM), which mitigates catastrophic forgetting with an attribute-decoupled LoRA module and a relevance-guided aggregation strategy, and handles concept neglect with a controllable regional context synthesis strategy, enabling concept-incremental versatile customization.

Comments Accepted to Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

详情
AI中文摘要

定制扩散模型(CDMs)因其生成个性化概念的卓越能力而引起了广泛关注。然而,大多数CDMs不切实际地假设用户的个性化概念集合是静态的,无法随时间增长。此外,在增量学习一系列新概念时,它们对先前学习的概念表现出显著的灾难性遗忘和概念忽视。为了解决上述挑战,我们开发了一种新颖的持续可定制扩散模型(CCDM),使用户能够进行概念增量式多功能定制。具体来说,我们设计了一个属性解耦LoRA(AD-LoRA)模块和一个相关性引导的AD-LoRA聚合策略,以缓解灾难性遗忘。它们可以保留每个任务的概念特定属性,并利用有益的任务间相关性来增强新定制任务的持续学习。此外,为了解决概念忽视的挑战,我们提出了一种可控区域上下文合成策略,该策略根据用户提供的条件进行多概念合成。该策略通过保证用户定义区域之间的语义独立性及其平滑边界过渡,增强了多概念合成的整体一致性。实验表明,我们的CCDM在基线方法上表现出显著改进。

英文摘要

Custom diffusion models (CDMs) have garnered significant interest owing to their remarkable capacity for generating personalized concepts. However, the majority of CDMs unrealistically presume that the user's collection of personalized concepts is static and incapable of incremental growth over time. Furthermore, they exhibit significant catastrophic forgetting and concept neglect of previously learned concepts when incrementally learning a sequence of new ones. To resolve the above challenges, we develop a novel Continually Customizable Diffusion Model (CCDM), enabling users to perform concept-incremental versatile customization. Specifically, we design an attribute-decoupled LoRA (AD-LoRA) module and a relevance-guided AD-LoRA aggregation strategy to mitigate catastrophic forgetting. They can preserve concept-specific attributes of each task and leverage beneficial inter-task correlations to enhance the continual learning of new customization tasks. Additionally, to address the challenge of concept neglect, we propose a controllable regional context synthesis strategy that performs multi-concept composition in alignment with user-provided conditions. This strategy enhances the overall consistency in multi-concept synthesis by guaranteeing semantic independence between user-defined regions and their smooth boundary transitions. Experiments show our CCDM exhibits significant improvements over baseline methods.

2606.04781 2026-06-04 cs.AI cs.LG 版本更新

AIP: A Graph Representation for Learning and Governing Agent Skills

AIP: 一种用于学习和治理智能体技能的图表示

Zachary Blumenfeld, Jim Webber

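Affiliations * Neo4j USA; Neo4j UK

AI summary Proposes the Agent Instruction Protocol (AIP), which represents skills as directed execution graphs; compiling human-written skills into this form improves task performance and supports diagnosable repair and governance of skills.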

详情
AI中文摘要

当前的智能体技能主要由自由形式的散文组成,要求智能体在每个会话中阅读、解释并重新推导如何行动。这带来了两个叠加的成本:在实现密集型任务上降低了可靠性,并且技能创建和改进困难,因为编辑散文是一个脆弱的过程,人类和智能体都难以处理,特别是对于模型训练中代表性不足的领域特定程序性知识。智能体指令协议(AIP)通过将技能建模为有向执行图来解决这两个问题:离散步骤作为节点,由确定性脚本或自然语言描述支持,通过显式类型的输入/输出边连接,并由模式验证的YAML规范管理。一个编译器元技能将现有的人类编写的技能转换为这种形式。好处是双重的。首先,将人类编写的技能编译为AIP后,Claude Sonnet在SkillsBench的27个真实智能体任务上的平均任务奖励从0.60提高到0.71,通过率从53%提高到67%——这是统计上显著的提升(Wilcoxon符号秩检验p=0.011),在12个任务中获胜,2个失败,13个平局——通常耗时更少。该图为智能体提供了经过验证、可运行的单元,而不是要求它从自然语言中重新推导代码、命令和工具调用。其次,在创建和改进方面,由于每个技能都经过模式验证、功能可测试且可逐节点寻址,因此可以精确诊断和修复故障。两个作者编写的技能故障被追溯到脚本级别。在调整AIP规范并重新编译后,两者均恢复且无回归(一个任务从0/5变为5/5),将技能改进转变为可测量的调优循环,而不是散文重写。相同的图结构支持语料库级别的治理和技能内省,并为基于技能的强化学习提供了自然的动作空间。

英文摘要

Agent Skills today consist largely of free-form prose requiring the agent to read, interpret, and re-derive how to act in every session. This imposes two compounding costs: reduced reliability on implementation-heavy tasks, and difficulty in skill creation and improvement, since editing prose is a fragile process that both humans and agents struggle with, particularly for domain-specific procedural knowledge underrepresented in model training. The Agent Instruction Protocol (AIP) addresses both by modeling a skill as a directed execution graph: discrete steps as nodes backed by deterministic scripts or natural-language descriptions, connected by explicit typed input/output edges, and governed by a schema-validated YAML specification. A compiler meta-skill translates existing human-written skills into this form. The benefits are twofold. First, compiling human-written skills to AIP raised Claude Sonnet's mean task reward from 0.60 to 0.71 and pass rate from 53% to 67% across 27 real agent tasks from SkillsBench - a statistically significant gain (Wilcoxon signed-rank p = 0.011), winning 12 tasks to 2 with 13 ties - often in less wall-clock time. The graph delivers vetted, runnable units to the agent rather than asking it to re-derive code, commands, and tool calls from natural language. Second, on creation and improvement, because each skill is schema-validated, functionally testable, and addressable node-by-node, failures can be diagnosed and repaired precisely. Two authored-skill failures were traced to the script level. After adjusting the AIP spec and recompiling, both recovered with zero regressions (one task going from 0/5 to 5/5), turning skill improvement into a measurable tuning loop rather than a prose rewrite. That same graph structure supports corpus-level governance and skill introspection, and provides a natural action space for reinforcement learning over skills.
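
The idea of a skill as a directed execution graph with named input/output edges can be sketched as follows. This is not the AIP YAML schema or its compiler: the data layout (`nodes`, `edges`), the function `run_skill`, and the toy steps are hypothetical, and only Python's standard `graphlib` module is used for ordering.

```python
from graphlib import TopologicalSorter


def run_skill(nodes, edges, inputs):
    """Execute a skill graph in dependency order.

    nodes:  step name -> callable taking keyword arguments
    edges:  list of (source, destination, argument_name); a source that is not
            a step name is treated as an external input
    inputs: external input name -> value
    """
    deps = {name: set() for name in nodes}
    for src, dst, _ in edges:
        if src in nodes:
            deps[dst].add(src)
    values = dict(inputs)
    for name in TopologicalSorter(deps).static_order():
        kwargs = {arg: values[src] for src, dst, arg in edges if dst == name}
        values[name] = nodes[name](**kwargs)
    return values


# Hypothetical two-step skill: parse a string of numbers, then total them.
nodes = {
    "parse": lambda text: [int(token) for token in text.split()],
    "total": lambda numbers: sum(numbers),
}
edges = [("raw", "parse", "text"), ("parse", "total", "numbers")]
result = run_skill(nodes, edges, {"raw": "1 2 3"})
```

Because every intermediate value is stored under its step name, a failing run can be inspected node by node, which is the property the abstract relies on for diagnosis and repair.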

2606.04778 2026-06-04 cs.AI cs.CL cs.LG 版本更新

Inference-Time Vulnerability Beyond Shallow Safety: Alignment Along Generation Trajectories

超越浅层安全的推理时脆弱性:沿生成轨迹的对齐

Kyungmin Park, Taesup Kim

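Affiliations * Hankuk University of Foreign Studies; Seoul National University

AI summary Shows that safety-aligned large language models have a broader inference-time vulnerability, in which short token injections at any generation step can substantially alter subsequent safety behavior, and proposes aligning models directly on generation trajectories to improve robustness.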

详情
AI中文摘要

安全对齐的大语言模型(LLMs)在推理时仍然容易受到干预,这些干预会将生成导向有害输出。最近的研究将其归因于浅层安全,即对齐集中在最初的几个输出标记上。我们表明,浅层安全是更广泛的推理时脆弱性的一个特例,其中在任何生成步骤的短标记注入都能显著改变后续的安全行为。我们还发现,模型在其隐藏状态中与拒绝方向的对齐并不能预测其对这种注入的鲁棒性,这表明在扰动下,内部状态本身并不能决定生成行为。为了解决这个问题,我们通过模拟序列中段扰动构建的生成轨迹上直接对齐模型,并表明这提高了对中段注入的鲁棒性,并泛化到利用早期标记生成的攻击。我们的工作认为,鲁棒的安全对齐需要对生成过程本身进行训练,而不仅仅是其输出。

英文摘要

Safety-aligned Large Language Models (LLMs) remain vulnerable to interventions during inference that redirect generation toward harmful outputs. Recent work attributes this to shallow safety, where alignment concentrates in the first few output tokens. We show that shallow safety is a special case of a broader inference-time vulnerability, in which short token injections at any generation step can substantially alter subsequent safety behavior. We also find that a model's alignment with refusal directions in its hidden states does not predict its robustness to such injection, revealing that internal state alone does not determine generation behavior under perturbation. To address this, we align models directly on generation trajectories constructed by simulating mid-sequence perturbation, and show that this improves robustness to mid-sequence injection and generalizes to attacks that exploit early-token generation. Our work argues that robust safety alignment requires training on the generation process itself, not only its outputs.

2606.04775 2026-06-04 cs.LG cs.AI cs.CV cs.SY eess.SY math.OC 版本更新

Activation Steering of Video Generation Models via Reduced-Order Linear Optimal Control

通过降阶线性最优控制引导视频生成模型的激活

Jihoon Hong, Alice Chan, Qiyue Dai, Julian Skifstad, Glen Chou

发表机构 * Georgia Institute of Technology(佐治亚理工学院)

AI总结 提出LA-LQR框架,将文本到视频推理建模为动态系统,通过降阶最优控制实现最小干预的激活引导,减少不安全内容生成同时保持视觉质量。

详情
AI中文摘要

在大规模网络数据上训练的文本到视频(T2V)模型可能生成不良内容,这促使我们进行干预以减少有害输出而不牺牲视觉质量。激活引导提供了一种有吸引力的机制替代微调和提示过滤,但现有的T2V引导方法仍然有限,通常采用粗糙的、非预测性的干预,可能导致过度引导和内容退化。为了弥补这一差距,我们提出了潜在激活线性二次型调节器(LA-LQR),一种用于最小侵入性T2V引导的降阶最优控制框架。LA-LQR将T2V推理表述为一个动态系统,并计算闭环反馈干预,将激活引导向期望的特征设定点,同时惩罚不必要的扰动。为了使最优控制对高维视频激活可行,我们将激活投影到由对比提示对导出的低维、任务相关子空间,估计该潜在空间中的局部线性动力学,并求解潜在LQR问题以获得时间步和层特定的引导信号。我们提供了将潜在设定点跟踪与原始激活空间特征控制联系起来的理论界限,并实证验证了降阶潜在动力学的保真度。在概念引导和视频安全基准测试中,LA-LQR相对于基线减少了不安全生成,同时保持了提示保真度和视觉质量。

英文摘要

Text-to-video (T2V) models trained on large-scale web data can generate undesired content, motivating interventions that reduce harmful outputs without sacrificing visual quality. Activation steering offers an attractive mechanistic alternative to finetuning and prompt filtering, but existing T2V steering methods remain limited, typically applying coarse, non-anticipative interventions that can lead to oversteering and content degradation. To close this gap, we propose Latent Activation Linear-Quadratic Regulator (LA-LQR), a reduced-order optimal control framework for minimally invasive T2V steering. LA-LQR formulates T2V inference as a dynamical system and computes closed-loop feedback interventions that steer activations toward desired feature setpoints while penalizing unnecessary perturbations. To make optimal control feasible for high-dimensional video activations, we project activations onto a low-dimensional, task-relevant subspace derived from contrastive prompt pairs, estimate local linear dynamics in this latent space, and solve a latent LQR problem to obtain timestep- and layer-specific steering signals. We provide theoretical bounds relating latent setpoint tracking to raw activation-space feature control, and empirically validate the fidelity of the reduced latent dynamics. On concept steering and video safety benchmarks, LA-LQR reduces unsafe generations relative to baselines, while preserving prompt fidelity and visual quality.
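
For readers unfamiliar with LQR, the sketch below runs the standard finite-horizon Riccati recursion in one latent dimension and applies the resulting feedback gains. It only illustrates the control primitive the abstract names: the scalar dynamics, the cost weights, and the assumption that the state `z` is already the deviation from the setpoint are all simplifications of this sketch, not the paper's pipeline.

```python
def lqr_gains(a, b, q, r, q_final, horizon):
    """Time-varying gains for z[t+1] = a*z[t] + b*u[t].

    Minimizes sum(q*z^2 + r*u^2) plus q_final*z^2 at the end, via the
    backward Riccati recursion; gains[t] is the gain to apply at step t.
    """
    p = q_final
    gains = []
    for _ in range(horizon):
        k = (b * p * a) / (r + b * p * b)
        p = q + a * p * a - a * p * b * k
        gains.append(k)
    gains.reverse()  # computed backward in time, so flip to forward order
    return gains


def rollout(z0, a, b, gains):
    """Apply the feedback law u[t] = -gains[t] * z[t] and record the trajectory."""
    z, path = z0, [z0]
    for k in gains:
        u = -k * z
        z = a * z + b * u
        path.append(z)
    return path


# Hypothetical latent dynamics with equal state and control penalties.
gains = lqr_gains(a=1.0, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=50)
path = rollout(1.0, 1.0, 1.0, gains)
```

Raising `r` relative to `q` penalizes intervention more heavily and yields smaller gains, which is the knob that trades steering strength against perturbation of the original activations.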

2606.04767 2026-06-04 cs.LG cs.CV 版本更新

Measuring Model Robustness via Fisher Information: Spectral Bounds, Theoretical Guarantees, and Practical Algorithms

通过Fisher信息度量模型鲁棒性:谱界、理论保证与实用算法

Chong Zhang, Xiang Li, Jia Wang, Qiufeng Wang, Xiaobo Jin

发表机构 * University of Science and Technology of China(中国科学技术大学)

AI总结 提出基于Fisher信息矩阵谱范数的攻击无关鲁棒性度量,理论推导常见架构的闭式谱界,并开发高效估计算法,实验验证其与对抗脆弱性的强相关性。

Comments 35 pages, 1 figure

详情
AI中文摘要

深度神经网络的鲁棒性对于安全关键部署至关重要,但现有评估方法通常依赖于攻击且缺乏可解释性。我们提出了一种基于Fisher信息矩阵(FIM)谱范数的原则性、攻击无关的鲁棒性度量,该度量量化了模型输出分布对输入扰动的worst-case敏感性。理论上,我们证明了FIM等于输入Jacobian的方差,并推导了常见架构(包括VGG、ResNet、DenseNet和Transformer)的闭式谱界,提供了首个理论鲁棒性排名。为了实现可扩展的评估,我们开发了高效算法,包括幂迭代和基于Hutchinson的估计,支持白盒和黑盒设置。在多个数据集(包括CIFAR、ImageNet和医学图像)和多种架构上的大量实验表明,我们的度量与对抗脆弱性之间存在强相关性。我们的框架作为一种可解释的诊断工具,补充了基于攻击的评估,提供了对架构敏感性的洞察,并指导更鲁棒模型的设计。代码可在https://github.com/franz-chang/SRP/获取。

英文摘要

The robustness of deep neural networks is crucial for safety-critical deployments, yet existing evaluation methods are often attack-dependent and lack interpretability. We propose a principled, attack-agnostic robustness metric based on the spectral norm of the Fisher Information Matrix (FIM), which quantifies the worst-case sensitivity of the model's output distribution to input perturbations. Theoretically, we establish that the FIM equals the variance of the input Jacobian and derive closed-form spectral bounds for common architectures, including VGG, ResNet, DenseNet, and Transformer, providing the first theoretical robustness ranking. To enable scalable evaluation, we develop efficient algorithms, including power iteration and Hutchinson-based estimation, that support both white-box and black-box settings. Extensive experiments across multiple datasets, including CIFAR, ImageNet, and medical images, and across multiple architectures show a strong correlation between our metric and adversarial vulnerability. Our framework serves as an interpretable diagnostic tool that complements attack-based evaluations, offering insights into architectural sensitivity and guiding the design of more robust models. Code is available at: https://github.com/franz-chang/SRP/.
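
Power iteration, one of the estimators the abstract mentions, needs only matrix-vector products. The sketch below estimates the largest eigenvalue of a symmetric positive semidefinite matrix, which equals its spectral norm; the small explicit matrix stands in for a Fisher Information Matrix, and the function name and starting vector are assumptions of this illustration.

```python
import math


def spectral_norm(matvec, dim, iters=200):
    """Estimate the largest eigenvalue of a symmetric PSD operator.

    `matvec(v)` returns the product of the matrix with vector v, so the
    matrix itself never has to be formed explicitly.
    """
    v = [1.0 / (i + 1) for i in range(dim)]  # deterministic starting vector
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = matvec(v)
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0
        v = [x / norm for x in w]
    w = matvec(v)
    return sum(x * y for x, y in zip(v, w))  # Rayleigh quotient v^T F v


# Hypothetical 2x2 stand-in for a Fisher matrix with eigenvalues 3 and 1.
F = [[2.0, 1.0], [1.0, 2.0]]
estimate = spectral_norm(lambda v: [sum(f * x for f, x in zip(row, v)) for row in F], 2)
```

In a real model the `matvec` callable would typically be built from Jacobian-vector products rather than an explicit matrix, which is what makes the approach scale.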

2606.04757 2026-06-04 math.OC cs.LG 版本更新

Near-Optimal Decentralized Stochastic Convex Optimization over Networks

网络上的近最优去中心化随机凸优化

Nitai Kluger, Amit Attia, Tomer Koren

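Affiliations * Blavatnik School of Computer Science, Tel Aviv University; Google Research Tel Aviv

AI summary For decentralized stochastic smooth convex optimization, proposes an accelerated decentralized method that, under a total budget of $N$ gradient samples, raises the supported number of workers to $M\lesssim \sqrt{\rho}\,N^{3/4}$, and proves it is optimal up to logarithmic factors.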

Comments 12 papers

详情
AI中文摘要

我们研究去中心化随机光滑凸优化,其中$M$个工作者使用局部随机梯度并通过固定八卦网络上的仅邻居通信来最小化平均目标。该设置中的一个核心问题是,在总梯度样本预算为$N$的情况下,确定可以使用的最大工作者数量,同时仍保持集中式$O(1/\sqrt N)$统计速率。我们引入了一种加速去中心化方法,该方法在最多$\smash{M\lesssim \sqrt\rho\,N^{3/4}}$个工作者时保持该速率,其中$\rho$是八卦网络的谱间隙,改进了先前最佳的最大缩放$\smash{M\lesssim \rho\sqrt N}$。该方法基于一步延迟随机加速方案,使工作者能够将小批量与加速八卦交错进行,同时控制残差分歧,其保证仅对数依赖于最优-局部异质性。我们还为线性跨度去中心化一阶方法建立了匹配的下界,表明该方法在对数因子内是最优的。

英文摘要

We study decentralized stochastic smooth convex optimization, where $M$ workers minimize an average objective using local stochastic gradients and neighbor-only communication over a fixed gossip network. A central question in this setting is to determine the largest number of workers that can be used under a total budget of $N$ gradient samples while still preserving the centralized $O(1/\sqrt N)$ statistical rate. We introduce an accelerated decentralized method that preserves this rate for up to $\smash{M\lesssim \sqrt{\rho}\,N^{3/4}}$ workers, where $\rho$ is the spectral gap of the gossip network, improving the best prior maximal scaling of $\smash{M\lesssim \rho\sqrt N}$. The method is based on a one-step-delayed stochastic acceleration scheme that enables workers to interleave minibatching with accelerated gossip while controlling residual disagreement, and its guarantee depends only logarithmically on the optimum-local heterogeneity. We also establish a matching lower bound for linear-span decentralized first-order methods, showing that the method is optimal up to logarithmic factors.
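
Neighbor-only communication over a gossip network can be illustrated with plain (non-accelerated) gossip averaging on a ring. This is deliberately not the paper's accelerated, one-step-delayed scheme; the ring topology, the mixing weights, and the function names are assumptions chosen to keep the example small.

```python
def ring_weights(n):
    """Symmetric, doubly stochastic mixing matrix for a ring of n workers.

    Each worker keeps half of its own value and takes a quarter from each neighbor.
    """
    w = [[0.0] * n for _ in range(n)]
    for i in range(n):
        w[i][i] = 0.5
        w[i][(i - 1) % n] += 0.25
        w[i][(i + 1) % n] += 0.25
    return w


def gossip_round(values, weights):
    """One communication round: every worker averages with its neighbors."""
    n = len(values)
    return [sum(weights[i][j] * values[j] for j in range(n)) for i in range(n)]


# Hypothetical local values held by four workers; their mean is 6.
values = [0.0, 4.0, 8.0, 12.0]
weights = ring_weights(4)
for _ in range(60):
    values = gossip_round(values, weights)
```

The rate at which the workers agree is governed by the spectral gap of the mixing matrix, the quantity $\rho$ that appears in the abstract's scaling results.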

2606.04754 2026-06-04 cs.LG 版本更新

Beyond Structural Symmetries: Linear Mode Connectivity via Neuron Identifiability

超越结构对称性:通过神经元可辨识性实现线性模式连通性

Vincent Bürgin, Daniel Herbst, Ya-Wei Eileen Lin, Stefanie Jegelka

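Affiliations * DeepMind, London, UK; University of Cambridge; University of California, Berkeley

AI summary Introduces a theoretical framework of effective function classes and formalizes neuron identifiability, showing that neural networks admit large families of approximately equivalent solutions even without structural symmetry, and that neuron identifiability enables representation merging without prior alignment as well as linear low-loss paths.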

Comments Accepted at ICML 2026

详情
AI中文摘要

深度学习中的许多显著现象,如线性模式连通性和训练动力学的结构化行为,都与参数对称性密切相关:即保持实现函数不变的变换。尽管参数对称性日益受到关注,但参数、数据和表示之间的确切相互作用仍未得到充分探索。为了研究这一点,我们开发了一个有效函数类的理论框架,即神经元在其输入支持上可以实现的函数集以及实现它们的范数代价。然后,我们通过跨独立训练运行的神经元可辨识性来形式化有效对称性破缺。我们的分析表明,即使在结构不对称的模型中,神经网络也可以容纳大量近似等价的解族。我们进一步证明,神经元可辨识性使得无需先验对齐即可进行表示合并,并刻画了这种合并何时允许线性低损失路径。这些发现强调了有效函数类在影响损失景观中的作用。

英文摘要

Many striking phenomena in deep learning, such as linear mode connectivity and the structured behavior of training dynamics, are closely tied to parameter symmetries: transformations that leave the realized function unchanged. Despite growing attention to parameter symmetries, the exact interplay between parameters, data, and representations remains underexplored. To investigate this, we develop a theoretical framework of effective function classes, i.e., the set of functions a neuron can realize on its input support, and the norm cost of realizing them. We then formalize effective symmetry breaking via neuron identifiability across independent training runs. Our analysis shows that neural networks can admit large families of approximately equivalent solutions even in structurally asymmetric models. We further show that neuron identifiability enables representation merging without prior alignment, and characterize when such merging admits a linear low-loss path. These findings highlight the role of effective function classes in affecting the loss landscape.

2606.04750 2026-06-04 cs.AI cs.CY cs.LG 版本更新

Fog of Love: Engineering Virtuous Agent Behavior with Affinity-based Reinforcement Learning in a Game Environment

Fog of Love: 基于亲和力强化学习在游戏环境中塑造道德智能体行为

Ajay Vishwanath, Christian Omlin

发表机构 * University of Agder(阿格德大学)

AI总结 本文提出基于亲和力的强化学习方法,通过策略正则化在多智能体角色扮演游戏Fog of Love中同时实现竞争与合作目标,并提升智能体行为的可解释性。

详情
AI中文摘要

在人工智能中注入道德行为越来越受到关注。其中一种提出的技术是基于亲和力的强化学习,它通过对目标函数进行策略正则化来激励道德行为,而不完全依赖于奖励函数设计。迄今为止,该技术已在状态和动作空间最小的网格世界和玩具问题环境中证明有效。为了将这项研究扩展到更复杂的环境,我们引入了一个基于角色扮演棋盘游戏Fog of Love的双人多智能体环境。在该环境中,两个智能体竞争以实现各自的道德目标,同时合作以维持他们的关系。鉴于多智能体性质,这是一个复杂问题,其中多智能体深度确定性策略梯度智能体既不能成功竞争也不能成功合作。我们提供的证据表明,局部亲和力增强了智能体在实现竞争和合作目标方面的性能,从而在两个领域都获得了更高的总体得分。这不仅产生了道德选择,还阐明了智能体的目的论,并使其行为达到人类水平的可解释性。

英文摘要

Instilling virtuous behavior in artificial intelligence has seen increasing interest. One of the techniques proposed is known as affinity-based reinforcement learning, which uses policy regularization on the objective function to incentivize virtuous actions without being fully dependent on the reward function design. Thus far, this technique has been demonstrated to be effective in grid worlds and toy-problem environments with minimal state and action spaces. To expand this research to more sophisticated environments, we introduce a two-player multi-agent environment based on the role-playing board game known as Fog of Love. In this environment, two agents compete to fulfill their individual virtues, while also cooperating to satisfy their relationship. Given the multi-agent nature, this is a complex problem where multi-agent deep deterministic policy gradient agents neither compete nor cooperate successfully. We present evidence that localized affinities enhance agent performance in achieving both competitive and cooperative objectives, resulting in superior overall scores in both domains. This not only results in virtuous choices but also clarifies an agent's teleology and makes its behavior human-level interpretable.

2606.04749 2026-06-04 cs.RO cs.LG 版本更新

COP-Q: Safety-First Reinforcement Learning for Robot Control via Cholesky-Ordered Projection

COP-Q:基于Cholesky有序投影的安全优先强化学习机器人控制

Guopeng Li, Moritz A. Zanger, Matthijs T. J. Spaan, Julian F. P. Kooij

发表机构 * Department of Cognitive Robotics, Delft University of Technology(代尔夫特理工大学认知机器人系) Department of Intelligent Systems, Delft University of Technology(代尔夫特理工大学智能系统系) School of Transportation, Southeast University(东南大学交通学院)

AI总结 提出COP-Q方法,通过Cholesky分解编码目标优先级并利用联合Q值空间的广义置信界,在安全优先的离线策略强化学习中平衡安全与奖励目标,减少过度保守性,提升样本效率。

Comments 7 pages, 6 figures, 2 tables

详情
AI中文摘要

安全机器人控制需要在满足安全约束的同时最大化回报。在离线策略安全强化学习中,奖励和安全Q值通常由独立的评论家集成学习,每个目标的不确定性独立处理。这种按目标处理的方式忽略了目标间的相关性,可能导致过于保守的价值估计,从而降低样本效率。为解决此问题,我们提出Cholesky有序投影Q学习(COP-Q),一种安全优先的方法,将目标间协方差纳入向量值Q值估计中。COP-Q在联合Q值空间中构建广义置信界,并使用Cholesky分解以顺序形式编码目标优先级。这在对安全目标保持保守性的同时,自适应地减少对奖励目标的过度保守性。得到的估计同时用于时序差分目标计算和演员优化。COP-Q引入最小的计算开销,并且与大多数现有深度Q学习框架兼容。在Brax中的机器人运动和Safety-Gymnasium中的安全导航实验(涵盖硬安全和软安全设置)表明,与代表性基线相比,COP-Q实现了强大的安全性能以及有竞争力或更高的样本效率。

英文摘要

Safe robot control requires maximizing return while satisfying safety constraints. In off-policy safe reinforcement learning, reward and safety Q-values are commonly learned by separate critic ensembles, with uncertainty handled independently for each objective. This objective-wise treatment neglects inter-objective correlation and can lead to overly conservative value estimates, thereby reducing sample efficiency. To address this issue, we propose Cholesky-Ordered Projection Q-learning (COP-Q), a safety-first method that incorporates inter-objective covariance into vector-valued Q-value estimation. COP-Q constructs a generalized confidence bound in the joint Q-value space and uses Cholesky factorization to encode objective priority in a sequential form. This preserves conservatism on safety while adaptively reducing excessive conservatism on the reward objective. The resulting estimate is used in both temporal-difference target computation and actor optimization. COP-Q incurs minimal computational overhead and is readily compatible with most existing deep Q-learning frameworks. Experiments on robot locomotion in Brax and safe navigation in Safety-Gymnasium, covering both hard- and soft-safety settings, demonstrate that COP-Q achieves strong safety performance together with competitive or improved sample efficiency relative to representative baselines.

2606.04743 2026-06-04 cs.CL cs.AI cs.LG 版本更新

TIDE: Proactive Multi-Problem Discovery via Template-Guided Iteration

TIDE:通过模板引导迭代的主动多问题发现

Soyeong Jeong, Jinheon Baek, Minki Kang, Sung Ju Hwang

发表机构 * KAIST(韩国科学技术院) DeepAuto.ai

AI总结 提出TIDE框架,通过模板引导的迭代机制主动发现用户上下文中隐藏的多个问题,并给出具体行动方案,在个人工作区和软件仓库两个场景中显著提升任务覆盖率和问题识别与解决能力。

详情
AI中文摘要

智能体被广泛部署为文档、工具和代码的助手。然而,它们通常仅对明确的用户请求做出响应,这些请求只反映了用户已注意到的问题,而许多其他重要问题共存于更广泛的用户上下文中,隐藏于显而易见之处,且其总数事先未知。我们将此定义为从上下文中发现多个隐藏问题的任务,其中应揭示共存的问题,基于支持性证据,并配以具体行动。为此,我们引入了TIDE,一个模板引导的迭代框架,包含两种互补机制。具体而言,基于单次预测倾向于关注最显著案例并产生泛化结论的观察,我们提出迭代发现:每轮生成一小批候选,同时基于已发现结果进行条件化,从而后续轮次扩展覆盖范围;以及思维模板:从先前解决的案例中提炼的可重用模式,指定应关注哪些上下文信号以及如何连接它们,将每个预测锚定于可识别的问题类别。我们在两个现实场景(个人工作区和软件仓库)中,使用四种模型骨干验证了TIDE,在任务覆盖率、识别和解决方面显著优于单次和并行多智能体基线。

英文摘要

Agents are widely deployed as assistants over documents, tools, and code. However, they typically act only on explicit user requests, which surface only the problems the user has noticed, while many other important problems coexist, hidden in plain sight, within the broader user context, with their total number unknown in advance. We frame this as the task of discovering multiple hidden problems from context, in which coexisting problems should be uncovered, grounded in supporting evidence, and paired with concrete actions. To this end, we introduce TIDE, a template-guided iterative framework with two complementary mechanisms. Specifically, motivated by the observation that single-pass prediction anchors on the most salient cases and yields generic claims, we propose iterative discovery, which surfaces a small batch of candidates per round while conditioning on what has already been found, so subsequent rounds extend coverage; and thought templates, reusable schemas distilled from previously solved cases that specify what contextual signals to attend to and how to connect them, anchoring each prediction in a recognizable problem class. We validate TIDE on two realistic settings, personal workspaces and software repositories, across four model backbones, showing substantial gains over single-shot and parallel multi-agent baselines on task coverage, identification, and resolution.

2606.04736 2026-06-04 cs.LG cs.AI 版本更新

Curvature-aware dynamic precision approach for physics-informed neural networks

面向物理信息神经网络的曲率感知动态精度方法

Yingjie Shao, Ioannis N. Athanasiadis, George van Voorn, Taniya Kapoor

发表机构 * Mathematical & Statistical Methods Group (Biometris), Wageningen University & Research(数学与统计方法组(Biometris),瓦赫宁根大学与研究中心) Artificial Intelligence Group, Wageningen University & Research(人工智能组,瓦赫宁根大学与研究中心)

AI总结 提出一种曲率感知精度控制器,利用L-BFGS优化器中的曲率信息动态调整数值精度,在保持预测精度的同时降低双精度训练的计算成本。

详情
AI中文摘要

物理信息神经网络(PINNs)通过将物理定律直接嵌入神经网络训练,已成为模拟偏微分方程(PDEs)的有前景框架。然而,近期研究表明PINN优化对数值精度敏感。现有实现通常使用单精度(FP32),计算效率高但易出现失败模式,或双精度(FP64),鲁棒但成本高昂。这造成了计算效率与数值精度之间的权衡。为降低双精度训练的计算成本同时保持预测精度,我们提出一种曲率感知精度控制器,在训练过程中自适应调整数值精度,而非将其视为固定的实现选择。该方法重用来自有限内存BFGS(L-BFGS)优化器的曲率信息来构建精度控制器,在低精度足够时保留FP32,并在训练动态表明数值敏感或精度受限停滞时提升至FP64计算。我们在四个典型PINN失败模式基准和一个辐照度驱动的常微分方程示例上评估了所提方法。我们还测试了不同神经网络架构下的方法。该方法在所有基准方程上一致匹配甚至略微超过全FP64解的精度,同时相对于全双精度训练减少了训练时间。所得结果表明,PINN优化中的精度敏感性具有阶段依赖性,仅在数值关键阶段选择性应用更高精度可以在不牺牲预测精度的前提下降低计算成本。

英文摘要

Physics-informed neural networks (PINNs) have become a promising framework for simulating partial differential equations (PDEs) by embedding physical laws directly into neural network training. However, recent studies show that PINN optimisation is sensitive to numerical precision. Existing implementations commonly use either single precision (FP32), which is computationally efficient but prone to failure modes, or double precision (FP64), which is robust but substantially expensive. This creates a trade-off between computational efficiency and numerical accuracy. To reduce the computational cost of double-precision training while retaining prediction accuracy, we propose a curvature-aware precision controller that adapts numerical precision during training rather than treating it as a fixed implementation choice. The proposed method reuses curvature information derived from the limited-memory BFGS (L-BFGS) optimiser to construct a precision controller, retaining FP32 when lower precision is sufficient and promoting computation to FP64 when the training dynamics indicate numerical sensitivity or precision-limited stagnation. We evaluate the proposed approach on four canonical PINN failure-mode benchmarks and an irradiance-driven ordinary differential equation example. We further test the proposed approach across different neural network architectures. The method consistently matches or even slightly exceeds full FP64 solution accuracy while reducing training time relative to full double-precision training on all benchmark equations. The obtained results indicate that precision sensitivity in PINN optimisation is phase-dependent, and that selectively applying higher precision only during numerically critical stages can lower computational cost without sacrificing predictive accuracy.

2606.04735 2026-06-04 cs.LG cs.AI 版本更新

Trace-Mediated Peak Bias: Bridging Temporal Credit Assignment and Cognitive Heuristics in Deep Reinforcement Learning

迹介导的峰值偏差:深度强化学习中时间信用分配与认知启发式的桥梁

Viktor Veselý, Aleksandar Todorov, Erwan Escudie, Matthia Sabatelli

发表机构 * Department of AI, University of Groningen(格罗宁根大学人工智能系)

AI总结 本文发现深度强化学习中的迹介导峰值偏差(TMPB),揭示了其作为峰值-末端规则的机制基础,并证明自适应优化器通过二阶矩归一化可缓解该偏差。

详情
AI中文摘要

时间信用分配是生物和人工智能的核心问题,但其与非线性函数逼近的相互作用尚不清楚。我们在深度强化学习中识别出一种系统性失效模式,称为迹介导峰值偏差(TMPB)。在中间资格迹深度下,智能体非理性地偏好具有高幅度奖励“峰值”的轨迹,而非具有更高累积回报的替代轨迹。这为峰值-末端规则提供了一种机制解释:一种人类记忆偏差,其中经验由其最强烈的时刻而非整合效用判断。我们证明,TMPB的出现是因为迹将远时时间差分误差放大为“梯度冲击”,而固定步长的随机梯度下降无法将其归一化,导致全局高估。相反,自适应优化器通过二阶矩归一化缓解了这种病理现象。我们的结果表明,类人的显著性扭曲可能自然产生于分布式系统中信用分配的数学约束,而自适应优化是理性价值估计的理论必要条件。

英文摘要

Temporal credit assignment is central to both biological and artificial intelligence, yet its interaction with non-linear function approximation is poorly understood. We identify a systematic failure mode in deep reinforcement learning (RL) termed Trace-Mediated Peak Bias (TMPB). At intermediate eligibility trace depths, agents irrationally prefer trajectories with high-magnitude reward "peaks" over alternatives with higher cumulative returns. This provides a mechanistic account of the Peak-End Rule: a human memory bias where experiences are judged by their most intense moments rather than integrated utility. We show that TMPB emerges because traces amplify distal Temporal Difference errors into "gradient shocks" that fixed-step-size Stochastic Gradient Descent cannot normalize, leading to global overestimation. Conversely, adaptive optimizers mitigate this pathology via second-moment normalization. Our results suggest that human-like saliency distortions may emerge naturally from the mathematical constraints of credit assignment in distributed systems, and that adaptive optimization is a theoretical necessity for rational value estimation.

2606.04733 2026-06-04 cs.LG cs.NI 版本更新

Contrastive Learning and Correlation Clustering for Sequences of Network Telescope Data

对比学习与相关聚类在网络望远镜数据序列中的应用

Jannik Presberger, Alexander Männel, Maynard Koch, Thomas C. Schmidt, Matthias Wählisch, Bjoern Andres

发表机构 * TU Dresden(德累斯顿技术大学) HAW Hamburg(汉堡应用技术大学) Center for Scalable Data Analytics and AI Dresden/Leipzig(德累斯顿/莱比锡可扩展数据与人工智能研究中心)

AI总结 本文提出一种无需预训练和标注的对比学习Transformer模型,用于估计网络流记录序列间的语义关系,并通过相关聚类实现扫描器行为的无监督分组。

Comments Code: https://github.com/JannikPresberger/Contrastive_Learning_and_Correlation_Clustering_for_Sequences_of_Network_Telescope_Data

详情
AI中文摘要

理解互联网扫描器的活动具有挑战性;通常需要识别源之间的关系,而这一任务的语义标注非常稀缺。本文研究是否可以通过对比学习,无需预训练和标注,来估计网络流记录序列之间具有语义意义的成对关系。为此,我们提出一个Transformer模型,嵌入经过最小预处理的网络流记录序列,并使用对比学习进行训练。利用该模型获得的相似度,我们定义了一个相关聚类问题并局部求解。实验表明:来自同一源的序列之间的学习相似度平均高于来自不同源的序列,并且这一特性可推广到未见过的源和未见过的序列。此外,相关聚类产生的聚类结果与扫描器标签一致。算法和重现实验的完整源代码已公开。

英文摘要

Understanding activities of Internet scanners is challenging; it often requires identifying relationships between sources, a task for which semantic annotations are scarce. This work investigates whether semantically meaningful pairwise relationships between sequences of network flow records can be estimated by contrastive learning, without pretraining and without annotations. To this end, we propose a transformer model that embeds minimally preprocessed sequences of network flow records and train it using contrastive learning. With the similarities obtained from this model, we state a correlation clustering problem and solve it locally. Experimentally, we show: Learned similarities are higher on average for sequences originating from the same source than for sequences originating from different sources, and this property generalizes to unseen sequences of unseen sources. Moreover, correlation clustering yields clusters consistent with scanner labels. The complete source code of the algorithms and for reproducing the experiments is publicly available.

2606.04703 2026-06-04 cs.CL cs.LG 版本更新

Rethinking Continual Experience Internalization for Self-Evolving LLM Agents

重新思考持续经验内化以实现自我进化的大语言模型智能体

Jingwen Chen, Wenkai Yang, Shengda Fan, Wenbo Nie, Chenxing Sun, Shaodong Zheng, Yangen Hu, Lu Pan, Ke Zeng, Yankai Lin

发表机构 * Gaoling School of Artificial Intelligence, Renmin University of China(中国人民大学高瓴人工智能学院) School of Software, Beihang University(北京航空航天大学软件学院) Meituan(美团)

AI总结 本文通过经验粒度、注入模式和内化机制三个维度,提出一种稳定可持续的经验内化方法,解决多轮经验学习中的能力崩溃问题。

Comments 10 pages, 8 figures

详情
AI中文摘要

经验内化将过去交互中的上下文经验转化为可重用的参数化能力,为大型语言模型(LLM)的持续学习提供了一条有前景的路径。虽然先前的工作主要关注单次迭代迁移,但我们发现在多轮经验学习下,现有方法遭受的是渐进的能力崩溃而非复合改进。我们通过经验内化的三个关键维度系统地考察了这种失败:(1)经验粒度:我们发现原则级经验比实例级经验更持久,因为它有效地从轨迹特定细节中抽象出可迁移的策略。(2)经验注入模式:我们的分析表明,逐步注入通过将经验与中间决策状态对齐,显著优于全局注入,这一特性对于长程工具使用至关重要。(3)内化机制:我们证明,在高质量教师轨迹上的离策略上下文蒸馏提供了比在策略上下文蒸馏更稳定的训练信号,后者固有地受限于对学生诱导的缺陷状态的局部修正。这些见解共同产生了一个简单而稳健的配方,用于稳定和可持续的经验内化,为工程化自我进化和持续学习的LLM提供了具体指导。

英文摘要

Experience internalization converts contextual experience from past interactions into reusable parametric capability, offering a promising path toward continual learning in large language models (LLMs). While prior work has predominantly focused on single-iteration transfer, we discover that under multi-iteration experience learning, existing methods suffer from a progressive capability collapse rather than compounding improvement. We systematically examine this failure through three vital dimensions of experience internalization: (1) Experience Granularity: We find that principle-level experience is more durable than instance-level experience, as it effectively abstracts transferable strategies away from trajectory-specific details. (2) Experience Injection Pattern: Our analysis reveals that step-wise injection significantly outperforms global injection by aligning experience with intermediate decision states, a property that is critical for long-horizon tool use. (3) Internalization Regime: We demonstrate that off-policy context-distillation on high-quality teacher trajectories provides a substantially more stable training signal than on-policy context-distillation, which is inherently limited by local corrections on student-induced flawed states. Together, these insights yield a simple yet robust recipe for stable and sustainable experience internalization, providing concrete guidance for engineering self-evolving and continually learning LLMs.

2606.04699 2026-06-04 cs.LG cs.AI cs.CV 版本更新

Graph-Guided Universum Learning in Generalized Eigenvalue Proximal SVMs for Alzheimer's Disease Classification

基于图引导的广义特征值近端支持向量机中的Universum学习用于阿尔茨海默病分类

Yogesh Kumar, Vrushank Ahire, Mudasir Ganaie

发表机构 * Dept. of Computer Science and Engineering, IIT Ropar, Punjab 140001, India(计算机科学与工程系,IIT罗帕尔,旁遮普140001,印度)

AI总结 针对阿尔茨海默病分类,提出两种图引导的Universum学习模型UG-GEPSVM和IUG-GEPSVM,利用轻度认知障碍样本构建图拉普拉斯正则化,替代传统独立惩罚项,在ADNI MRI数据集上取得更优性能。

详情
AI中文摘要

早期准确检测阿尔茨海默病(AD)对于及时干预和疾病管理至关重要。广义特征值近端支持向量机(GEPSVM)及其基于Universum的变体在AD分类中显示出有希望的结果。然而,现有方法将Universum样本视为独立点,未考虑它们之间的几何关系。本文提出了两种图引导的Universum学习模型,即UG-GEPSVM和IUG-GEPSVM,用于使用结构MRI数据进行AD与认知正常(CN)分类。在所提出的框架中,轻度认知障碍(MCI)受试者被用作Universum数据,以提供AD和CN类别之间的中间信息。使用高斯相似性、最小生成树连通性和多跳传播在Universum样本上构建图。从该图中导出拉普拉斯矩阵,捕获MCI样本的几何结构。这种基于拉普拉斯的正则化被纳入学习过程,以替代传统的独立Universum惩罚项。UG-GEPSVM将此正则化集成到广义特征值公式中,而IUG-GEPSVM使用标准特征值公式扩展了数值稳定的改进GEPSVM框架。在ADNI MRI数据集变体上使用ICA和PCA特征在五个不同噪声水平下的实验表明,两种提出的模型始终优于现有的GEPSVM和基于Universum的方法。UG-GEPSVM实现了88.07%的最高平均AUC,并在增加的噪声水平下保持稳定的性能。统计检验进一步证实了观察到的改进的显著性。

英文摘要

Early and accurate detection of Alzheimer's disease (AD) is important for timely intervention and disease management. Generalized Eigenvalue Proximal Support Vector Machine (GEPSVM) and its Universum-based variants have shown promising results for AD classification. However, existing methods treat Universum samples as independent points and do not consider the geometric relationships among them. This paper proposes two graph-guided Universum learning models, namely UG-GEPSVM and IUG-GEPSVM, for AD versus cognitively normal (CN) classification using structural MRI data. In the proposed framework, mild cognitive impairment (MCI) subjects are used as Universum data to provide intermediate information between AD and CN classes. A graph is constructed over the Universum samples using Gaussian similarity, Minimum Spanning Tree connectivity, and multi-hop propagation. From this graph, a Laplacian matrix is derived that captures the geometric structure of the MCI samples. This Laplacian-based regularization is incorporated into the learning process in place of the conventional independent Universum penalty term. UG-GEPSVM integrates this regularization into the generalized eigenvalue formulation, while IUG-GEPSVM extends the numerically stable improved GEPSVM framework using a standard eigenvalue formulation. Experiments on ADNI MRI dataset variants using ICA- and PCA-based features at five different noise levels show that both proposed models consistently outperform existing GEPSVM and Universum-based methods. UG-GEPSVM achieves the highest average AUC of 88.07% and maintains stable performance under increasing noise levels. Statistical tests further confirm the significance of the observed improvements.

2606.04695 2026-06-04 cs.LG 版本更新

Cone-Compatible Monge Geometry for High-Dimensional Ordered Optimal Transport

锥相容的Monge几何用于高维有序最优输运

Lei Luo, Hongliang Zhang, Jian Yang

发表机构 * PCA Lab, Key Lab of Intelligent Perception and Systems for High-Dimensional Information of Ministry of Education, School of Computer Science and Engineering, Nanjing University of Science and Technology(PCA实验室、教育部高维信息智能感知与系统重点实验室、计算机科学与工程学院、南京理工大学)

AI总结 本文提出锥相容的Monge几何,通过闭凸锥诱导的偏序与输运成本兼容的条件,为高维有序数据提供闭式最优耦合。

Comments 13 pages, 2 figures, including appendices

详情
AI中文摘要

高维最优输运很少具有闭式解。一维情况是例外,因为实数线的顺序与凸输运成本兼容,使得单调重排最优。本文研究在更高维中如何从偏序恢复类似的Monge结构。我们引入锥相容的Monge几何:一个闭凸锥$K$诱导序$x\preceq_K y$当$y-x\in K$,并且如果有序对满足Monge交换不等式,则与成本兼容。对于平方马氏距离成本$c_M(x,y)=(x-y)^\top M(x-y)$,我们证明了一个尖锐的刻画:兼容性恰好当$K$在$M$-内积下是锐角锥,即对所有$u,v\in K$有$u^\top Mv\ge0$,等价于$K\subseteq K_M^*$。在此条件下,支撑在锥链上的测度允许分位数型的闭式最优耦合,在原始地面成本下(而非投影或度量替换后)得到精确输运。我们将由此产生的锥链Wasserstein度量(定义在规范有序的链分布上)与扩展的有向锥输运成本(定义在一般测度上)区分开来,并发展了可行性、对偶性、稳定性、逼近、高斯恢复、统计和计算方面的结果。该理论与切片和树Wasserstein距离互补:它不是通用的快速替代,而是为有序高维数据提供可解释、方向有效、原始空间单调输运的一种方法。

英文摘要

High-dimensional optimal transport is seldom available in closed form. The one-dimensional case is exceptional because the order of the real line is compatible with convex transport costs, making monotone rearrangement optimal. This paper studies when an analogous Monge structure can be recovered in higher dimensions from a partial order. We introduce a cone-compatible Monge geometry: a closed convex cone $K$ induces the order $x\preceq_K y$ whenever $y-x\in K$, and is compatible with a cost if ordered pairs satisfy a Monge exchange inequality. For squared Mahalanobis costs $c_M(x,y)=(x-y)^\top M(x-y)$, we prove a sharp characterization: compatibility holds exactly when $K$ is acute under the $M$-inner product, namely $u^\top Mv\ge0$ for all $u,v\in K$, equivalently $K\subseteq K_M^*$. Under this condition, measures supported on cone chains admit a quantile-type closed-form optimal coupling, yielding exact transport under the original ground cost rather than after projection or metric replacement. We distinguish the resulting cone-chain Wasserstein metric on canonically ordered chain distributions from an extended directed cone transport cost on general measures, and develop feasibility, duality, stability, approximation, Gaussian recovery, statistical, and computational results. The theory is complementary to sliced and tree Wasserstein distances: it is not a universal fast surrogate, but a way to obtain interpretable, direction-valid, original-space monotone transport for ordered high-dimensional data.
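
The acuteness condition stated in the abstract, $u^\top Mv\ge0$ for all $u,v\in K$, is easy to test numerically when the cone is finitely generated. The sketch below makes that assumption (the abstract itself covers general closed convex cones), and the function name `is_acute_cone` is hypothetical. Because $u^\top M v$ is bilinear and every element of such a cone is a nonnegative combination of generators, checking all pairs of generators suffices.

```python
import numpy as np

def is_acute_cone(generators, M, tol=1e-12):
    """Check u^T M v >= 0 for all u, v in the cone spanned by `generators`.

    Assumes a finitely generated cone; by bilinearity it is enough
    to test every ordered pair of generators.
    """
    G = np.asarray(generators, dtype=float)       # shape (k, d), one generator per row
    gram = G @ np.asarray(M, dtype=float) @ G.T   # gram[i, j] = g_i^T M g_j
    return bool(np.all(gram >= -tol))

M = np.eye(2)                                # Euclidean inner product
orthant = [[1.0, 0.0], [0.0, 1.0]]           # nonnegative orthant
wide = [[1.0, 0.0], [-1.0, 1.0]]             # generators at an obtuse angle

print(is_acute_cone(orthant, M))  # True
print(is_acute_cone(wide, M))     # False
```

Under the abstract's characterization, the first cone would be compatible with the squared Euclidean cost and the second would not.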

2606.04689 2026-06-04 quant-ph cs.LG 版本更新

QPredSGG: Hybrid Quantum Predicate Learning for Long-Tailed Scene Graph Generation

QPredSGG:面向长尾场景图生成的混合量子谓词学习

Prerana Ramkumar, Nouhaila Innan, Muhammad Shafique

发表机构 * Department of Computer Science, University of Waterloo(滑铁卢大学计算机科学系) Machine Learning Research Group, University of Waterloo(滑铁卢大学机器学习研究组)

AI总结 针对场景图生成中长尾谓词分布导致的分类偏差,提出用量子谓词头(QP-Head)替换经典谓词头,通过振幅嵌入和强纠缠层压缩特征,在Visual Genome 150上实现参数高效的长尾关系分类。

Comments 11 pages, 5 figures

详情
AI中文摘要

场景图生成(SGG)需要对物体及其交互进行关系推理,但性能常受严重的长尾谓词不平衡限制。经典SGG模型通常依赖数据集统计,导致预测偏向频繁关系而非细粒度语义谓词。尽管现有去偏策略提高了平均召回率,但当前框架中的谓词分类仍常依赖参数成本高的大型经典决策模块。本文通过用加权交叉熵训练的量子谓词头(QP-Head)替换因果特征增强网络(CFEN)中的经典谓词头,引入了一种用于SGG的混合量子谓词分类器。据我们所知,这是首批评估混合量子架构在Visual Genome 150上进行场景图谓词分类的研究之一。我们研究了量子比特数、编码策略、纠缠结构和电路深度对关系预测的影响。最佳4量子比特QP-Head使用振幅嵌入和强纠缠层将4096维对特征压缩为16维量子兼容表示,对应256倍缩减。它实现了57.25%的mR@100,而经典CFEN参考为41.1%,同时仅使用96个可训练量子参数。扩展到8量子比特保持了强大的长尾性能,达到55.38%的mR@100,使用384个量子参数,而深度分析显示了表达能力和运行时间开销之间的权衡。这些结果表明,紧凑的混合量子谓词头可以支持复杂视觉推理任务中参数高效的长尾关系分类。

英文摘要

Scene Graph Generation (SGG) requires relational reasoning over objects and their interactions, but performance is often limited by severe long-tail predicate imbalance. Classical SGG models frequently rely on dataset statistics, leading to biased predictions toward frequent relations rather than fine-grained semantic predicates. Although existing debiasing strategies improve mean recall, predicate classification in current frameworks still often depends on large classical decision modules with high parameter cost. This work introduces a hybrid quantum predicate classifier for SGG by replacing the classical predicate head in Causal Feature Enhancement Network (CFEN) with a Quantum Predicate Head (QP-Head) trained using weighted cross-entropy. To the best of our knowledge, this is among the first studies to evaluate a hybrid quantum architecture for scene graph predicate classification on Visual Genome 150. We study the effect of qubit count, encoding strategy, entangling structure, and circuit depth on relational prediction. The best 4-qubit QP-Head uses Amplitude Embedding and Strongly Entangling Layers to compress 4096-dimensional pair features into a 16-dimensional quantum-compatible representation, corresponding to a 256$\times$ reduction. It achieves an mR@100 of 57.25%, compared with 41.1% for the classical CFEN reference, while using only 96 trainable quantum parameters. Scaling to 8 qubits maintains strong long-tail performance, reaching an mR@100 of 55.38% with 384 quantum parameters, while the depth analysis shows a trade-off between expressibility and runtime overhead. These results suggest that compact hybrid quantum predicate heads can support parameter-efficient long-tail relational classification in complex visual reasoning tasks.

2606.04670 2026-06-04 math.NA cs.LG cs.MS cs.NA 版本更新

Fitting scattered data with optional monotonicity constraints on GPU: LipFit package

在GPU上拟合带有可选单调性约束的散乱数据:LipFit包

Gleb Beliakov

发表机构 * School of Information Technology, Deakin University(德肯大学信息科技学院)

AI总结 提出一种多变量散乱数据插值与逼近方法,在满足单调性约束下产生最优Lipschitz连续逼近,并实现GPU并行化的Python包LipFit。

详情
AI中文摘要

本文提出了一种多变量散乱数据插值与逼近方法,该方法在期望的单调性约束下产生最优的Lipschitz连续逼近。该方法依赖于数据的紧上下逼近,其精神类似于最近邻逼近,但不会遭受不连续性。还介绍了局部Lipschitz插值和Lipschitz平滑。该方法属于无训练阶段的基于实例的逼近范畴,适用于基于GPU的并行化。讨论了一个实现所述方法的Python GPU友好包LipFit。

英文摘要

This paper presents a method of multivariate scattered data interpolation and approximation that produces optimal Lipschitz-continuous approximation, subject to the desired monotonicity constraints. This method relies on tight upper and lower approximations to the data, and is similar in spirit to the nearest-neighbour approximation but does not suffer from discontinuities. Local Lipschitz interpolation and Lipschitz smoothing are also presented. This approach falls under the umbrella of instance-based approximation with no training phase, and it is suitable for GPU-based parallelisation. A GPU-friendly Python package, LipFit, which implements the methods presented, is also discussed.
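
The "tight upper and lower approximations" mentioned above can be illustrated with the classical Lipschitz interpolant, which averages the tightest upper and lower envelopes consistent with a Lipschitz constant. This is a minimal NumPy sketch under stated assumptions, not the LipFit API: the function name `lipschitz_interpolate` is hypothetical, the Euclidean norm is assumed, the Lipschitz constant is supplied by the caller, and monotonicity constraints and smoothing are omitted.

```python
import numpy as np

def lipschitz_interpolate(queries, points, values, lip_const):
    """Average of the tightest Lipschitz upper and lower bounds on the data."""
    Q = np.asarray(queries, dtype=float)   # shape (m, d)
    P = np.asarray(points, dtype=float)    # shape (n, d)
    y = np.asarray(values, dtype=float)    # shape (n,)
    # Pairwise Euclidean distances between queries and data points, shape (m, n).
    dist = np.linalg.norm(Q[:, None, :] - P[None, :, :], axis=-1)
    upper = np.min(y[None, :] + lip_const * dist, axis=1)  # tightest upper bound
    lower = np.max(y[None, :] - lip_const * dist, axis=1)  # tightest lower bound
    return 0.5 * (upper + lower)

points = [[0.0], [1.0], [2.0]]
values = [0.0, 1.0, 0.0]
print(lipschitz_interpolate([[0.0], [0.5], [1.0]], points, values, lip_const=1.0))
```

Every query is an independent min/max reduction over the data, with no training phase, which is consistent with the abstract's remark that the approach parallelises well on GPUs.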

2606.04665 2026-06-04 cs.LG 版本更新

Towards Accurate Model Selection in Deep Unsupervised Domain Adaptation

面向深度无监督域适应的精确模型选择

Kaichao You, Ximei Wang, Mingsheng Long, Michael I. Jordan

发表机构 * University of California, Berkeley(加州大学伯克利分校) UC Berkeley(加州大学伯克利分校)

AI总结 针对深度无监督域适应中缺乏准确模型选择方法的问题,提出Deep Embedded Validation (DEV)方法,通过嵌入适应特征表示到验证过程中,获得目标风险的无偏估计,并利用控制变量技术降低方差,理论和实验证明了其有效性。

Comments upload to arxiv for record

详情
AI中文摘要

深度无监督域适应(Deep UDA)方法成功利用源域中丰富的标记数据来提升相关但未标记的目标域上的性能。然而,由于缺乏准确且标准化的模型选择方法,Deep UDA中的算法比较变得繁琐,这阻碍了该领域的进一步进展。现有的Deep UDA模型选择方法要么高度有偏、受限、不稳定,甚至存在争议(需要标记的目标数据)。为此,我们提出了Deep Embedded Validation(DEV),它将适应后的特征表示嵌入到验证过程中,以获得目标风险的无偏估计,且方差有界。通过控制变量技术进一步降低了方差。该方法的有效性在理论和实验上都得到了验证。

英文摘要

Deep unsupervised domain adaptation (Deep UDA) methods successfully leverage rich labeled data in a source domain to boost the performance on related but unlabeled data in a target domain. However, algorithm comparison is cumbersome in Deep UDA due to the absence of an accurate and standardized model selection method, posing an obstacle to further advances in the field. Existing model selection methods for Deep UDA are either highly biased, restricted, unstable, or even controversial (requiring labeled target data). To this end, we propose Deep Embedded Validation (DEV), which embeds the adapted feature representation into the validation procedure to obtain an unbiased estimate of the target risk with bounded variance. The variance is further reduced by the technique of control variates. The efficacy of the method has been justified both theoretically and empirically.
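
The combination of an unbiased weighted risk estimate with a control variate can be sketched as follows. This is an illustration under assumptions rather than the paper's exact estimator: it assumes density-ratio weights between target and source are already available (the abstract does not say how they are obtained), it uses the weights themselves as the control variate (their expectation under the source distribution is 1), and the function name `control_variate_risk` is hypothetical.

```python
import numpy as np

def control_variate_risk(losses, weights):
    """Importance-weighted risk with a control-variate correction.

    losses  : per-example validation losses on source data
    weights : assumed density ratios p_target(x) / p_source(x)
    """
    losses = np.asarray(losses, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weighted = weights * losses                 # plain importance-weighted losses
    var_w = np.var(weights, ddof=1)
    if var_w == 0.0:
        # Constant weights carry no variance to remove.
        return float(weighted.mean())
    # Coefficient that minimises the variance of the corrected estimate.
    eta = -np.cov(weighted, weights, ddof=1)[0, 1] / var_w
    # E[weights] = 1 under the source distribution, so the correction has mean zero.
    return float(weighted.mean() + eta * (weights.mean() - 1.0))

risk = control_variate_risk(losses=[2.0, 2.0, 2.0, 2.0],
                            weights=[0.5, 1.5, 1.0, 2.0])
print(risk)
```

In this toy input every loss equals 2, so the corrected estimate recovers 2 even though the sampled weights average 1.25 rather than 1, whereas the uncorrected weighted mean would give 2.5.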

2606.04662 2026-06-04 cs.LG cs.AI 版本更新

Why Muon Outperforms Adam: A Curvature Perspective

为什么 Muon 优于 Adam:曲率视角

Shuche Wang, Fengzhuo Zhang, Jiaxiang Li, Dirk Bergemann, Zhuoran Yang

发表机构 * National University of Singapore(新加坡国立大学) Yale University(耶鲁大学) University of Minnesota(明尼苏达大学)

AI总结 从曲率视角出发,通过泰勒展开和曲率分解,发现 Muon 因更低的归一化方向锐度(NDS)而比 Adam 实现更大的一步损失下降,数据不平衡和层内曲率是其主要优势来源。

详情
AI中文摘要

Muon 在大语言模型训练中相比 Adam 将训练效率提升约两倍,但这一优势的局部几何来源尚不清楚。我们的工作首次从曲率视角尝试揭开 Muon 优于 Adam 的原因。首先,我们对训练损失曲面应用二阶泰勒近似,表明在匹配验证损失下,Muon 比 Adam 实现更大的一步损失下降。两种优化器的一阶增益相当,但 Muon 始终承受更小的二阶曲率惩罚。其次,我们将该曲率惩罚分解为更新范数的平方和归一化方向锐度(NDS)。我们发现 Muon 和 Adam 的更新范数相当,因此 Muon 更小的曲率惩罚源于更低的 NDS,而非更新尺度。第三,我们研究训练数据和模型结构如何塑造 Muon 的 NDS 优势。使用具有受控不平衡的 Zipf-概率上下文无关文法(PCFG)数据,我们表明数据不平衡放大了 Muon 相对于 Adam 的 NDS 优势。进一步的层内/跨层分解表明,在训练的中后期,Muon 更低的 NDS 主要由更小的层内曲率维持。除了经验证据,我们还分析了具有异质曲率和梯度对齐于高曲率模式的风格化二次问题。我们证明 Muon 通过平衡曲率组间的更新能量,实现了比 GD 更低的平均 NDS;当曲率异质性足够强时,在相同步数后这也产生更低的局部二次损失。

英文摘要

Muon improves training efficiency over Adam in large language-model training by about two times, but the local geometric source of this advantage remains unclear. Our work takes a first step toward demystifying Muon's superiority over Adam from a curvature perspective. First, we apply a second-order Taylor approximation to the training landscape and show that Muon achieves a larger one-step loss decrease than Adam at matched validation loss. The two optimizers have comparable first-order gains, but Muon consistently incurs a smaller second-order curvature penalty. Second, we decompose this curvature penalty into the squared update norm and Normalized Directional Sharpness (NDS). We find that Muon and Adam have comparable update norms, so Muon's smaller curvature penalty is driven by lower NDS, not update scale. Third, we study how training data and model structure shape Muon's NDS advantage. Using Zipf-Probabilistic Context-Free Grammar (PCFG) data with controlled imbalance, we show that data imbalance amplifies Muon's NDS advantage over Adam. A within-/cross-layer decomposition further shows that, in the middle and late stages of training, Muon's lower NDS is mainly sustained by smaller within-layer curvature. Beyond empirical evidence, we analyze stylized quadratic problems with heterogeneous curvature and gradient alignment toward high-curvature modes. We prove that Muon attains a smaller average NDS than GD by balancing update energy across curvature groups; when curvature heterogeneity is sufficiently strong, this also yields lower local quadratic loss after the same number of steps.
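
The decomposition described in the abstract can be written out as follows. This is a sketch in notation chosen here, not taken from the paper: $g$ and $H$ denote the gradient and Hessian of the training loss $L$ at the current parameters $\theta$, $\Delta$ is the optimizer update, and the Euclidean norm is assumed; the paper's exact definition of NDS may differ, for example in the choice of norm or constant factors.

$$
L(\theta+\Delta) \;\approx\; L(\theta) \;+\; \underbrace{g^\top \Delta}_{\text{first-order gain}} \;+\; \underbrace{\tfrac12\,\Delta^\top H \Delta}_{\text{curvature penalty}},
\qquad
\tfrac12\,\Delta^\top H \Delta \;=\; \tfrac12\,\|\Delta\|^2 \cdot \underbrace{\frac{\Delta^\top H \Delta}{\|\Delta\|^2}}_{\text{NDS}}.
$$

With comparable first-order terms and comparable $\|\Delta\|^2$, a smaller NDS directly yields a larger predicted one-step loss decrease, which is the comparison the abstract reports.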

2606.04661 2026-06-04 cs.CL cs.LG 版本更新

CRAFT: Cost-aware Refinement And Front-aware Tuning of Prompts

CRAFT: 成本感知的提示精炼与前沿感知的调优

Shanu Kumar, Shubhanshu Khandelwal, Akhila Yesantarao Venkata, Parag Agrawal, Yova Kementchedjhieva, Manish Gupta

发表机构 * MBZUAI Microsoft(微软)

AI总结 提出CRAFT方法,通过帕累托前沿优化提示的准确性和成本,避免标量化崩溃,在多个基准上实现更广泛的准确-成本权衡。

详情
AI中文摘要

为准确性调优的提示通常变长,每次模型调用都会增加推理成本。最佳的准确-成本权衡取决于任务和预算,因此提示优化是在准确性和提示令牌成本的帕累托前沿上的搜索,而不是针对单个提示。通常的捷径是将目标折叠成加权和,在搜索前固定权衡权重,通常只能恢复前沿的狭窄区域,我们称之为标量化崩溃。我们提出了CRAFT(成本感知的精炼和前沿感知的调优),一种帕累托前沿提示优化器,将目标LLM验证调用视为稀缺资源,并将其分配给乐观候选前沿附近的候选。每轮,互补的面向准确性和面向成本的生成器提出编辑,帕累托差距获取花费每轮的验证预算,NSGA-II保留保持分布广泛的种群。在六个分类和推理基准上,CRAFT保留的前沿同时达到高准确性和低成本区域,而仅准确性、仅成本和加权和基线各自集中在更窄的区域。准确-成本权衡成为搜索后的选择,而不是搜索前的权重。

英文摘要

Prompts tuned for accuracy often grow long, raising inference cost on every model call. The best accuracy-cost trade-off depends on the task and the budget, so prompt optimization is a search over the Pareto front of accuracy and prompt-token cost rather than for one prompt. The usual shortcut, collapsing the objectives into a weighted sum, fixes the trade-off weight before search and often recovers only a narrow region of the front, a failure we call scalarization collapse. We present CRAFT (Cost-aware Refinement And Front-aware Tuning), a Pareto-front prompt optimizer that treats target-LLM validation calls as the scarce resource and allocates them to candidates near the optimistic candidate front. Each round, complementary accuracy-oriented and cost-oriented generators propose edits, Pareto-gap acquisition spends the per-round validation budget, and NSGA-II retention keeps a spread-out population. Across six classification and reasoning benchmarks, CRAFT's retained fronts reach both high-accuracy and low-cost regions, while accuracy-only, cost-only, and weighted-sum baselines each concentrate in narrower regions. The accuracy-cost trade-off becomes a post-search choice, not a pre-search weight.
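
To make the notion of an accuracy-cost front concrete, the sketch below filters a set of candidate prompts down to the non-dominated ones. It illustrates only the Pareto-dominance test, not CRAFT's acquisition or NSGA-II retention steps; the function name `pareto_front` and the (accuracy, token cost) values are made up for illustration.

```python
def pareto_front(candidates):
    """Return the candidates not dominated in (higher accuracy, lower cost)."""
    front = []
    for i, (acc_i, cost_i) in enumerate(candidates):
        dominated = any(
            # j dominates i: at least as good on both objectives, strictly better on one
            acc_j >= acc_i and cost_j <= cost_i and (acc_j > acc_i or cost_j < cost_i)
            for j, (acc_j, cost_j) in enumerate(candidates)
            if j != i
        )
        if not dominated:
            front.append((acc_i, cost_i))
    return front

# Hypothetical (accuracy, prompt-token cost) pairs.
prompts = [(0.90, 400), (0.88, 150), (0.80, 160), (0.70, 40), (0.90, 420)]
print(pareto_front(prompts))  # [(0.9, 400), (0.88, 150), (0.7, 40)]
```

A weighted sum with a fixed weight would select a single point from this set, whereas keeping the whole front lets the trade-off be chosen after the search.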

2606.04658 2026-06-04 cs.NE cs.LG 版本更新

U-Net-Accelerated Quality-Diversity Optimization for Climate-Adaptive Urban Layouts

U-Net加速的质量-多样性优化用于气候适应性城市布局

Alexander Hagg, Tania Guerrero, Dirk Reith

发表机构 * Institute of Technology, Resource and Energy-efficient Engineering (TREE)(技术学院,资源与能源高效工程院(TREE)) Bonn-Rhein-Sieg University of Applied Sciences(博恩-莱茵-锡格应用科学大学) Fraunhofer Institute for Algorithms and Scientific Computing (SCAI)(弗劳恩霍夫算法与科学计算研究所(SCAI))

AI总结 提出用U-Net替代慢速物理模拟器作为代理模型,结合离线MAP-Elites算法,实现快速生成数千个多样化且经气候评估的建筑布局。

详情
AI中文摘要

优化城市布局以适应气候需要在建筑密度与冷空气通风之间取得平衡。由于基于物理的气候模拟计算成本高昂,规划者通常只能评估少于十个手动设计方案。质量-多样性(QD)算法提供了一种系统性地照亮设计空间的方法,但需要代理模型才能实用。在本文中,我们用一个空间深度学习代理(U-Net)替换了缓慢的监管物理模拟器,并将其嵌入离线MAP-Elites循环中。我们系统地比较了这种空间方法与传统的高斯过程(GP)代理在不同训练数据策略(准随机Sobol采样 vs. 主动QD自举)下的表现。结果表明,标量GP代理在随机样本上训练时灾难性地失败,需要昂贵的、主动生成的QD存档才能泛化。相比之下,U-Net的空间归纳偏置使其能够稳健地学习底层物理映射(R² = 0.996),完全独立于训练数据来源。这使得离线QD优化仅需一次性随机训练样本批次即可实现高度准确的适应度排名(ρ = 0.994)。最终流程部署在开源OpenSKIZZE工具中,能在十分钟内生成数千个多样化且经气候评估的建筑布局。

英文摘要

Optimizing urban layouts for climate adaptation requires balancing building density with cold-air ventilation. Because physics-based climate simulations are computationally expensive, planners typically evaluate fewer than ten manual designs. Quality-Diversity (QD) algorithms offer a way to systematically illuminate the design space, but they require surrogate models to be practical. In this paper, we replace a slow, regulatory physics simulator with a spatial deep-learning surrogate (U-Net) inside an offline MAP-Elites loop. We systematically compare this spatial approach with a traditional Gaussian process (GP) surrogate across different training-data strategies (quasi-random Sobol sampling vs. active QD bootstrapping). Our results reveal that scalar GP surrogates fail catastrophically when trained on random samples, requiring expensive, actively generated QD archives to generalize. In contrast, the spatial inductive bias of the U-Net allows it to learn the underlying physics mapping robustly ($R^2 = 0.996$), completely independent of the training data source. This allows offline QD optimization to achieve highly accurate fitness rankings ($\rho = 0.994$) using only a one-time batch of random training samples. The resulting pipeline, deployed in the open-source OpenSKIZZE tool, generates thousands of diverse, climate-evaluated building layouts in under ten minutes.

2606.04647 2026-06-04 cs.LG 版本更新

ALINC: Active Learning for Inductive Node Classification via Graph Sampling

ALINC: 通过图采样的归纳式节点分类主动学习

Pascal Plettenberg, Denis Huseljic, André Alcalde, Bernhard Sick, Josephine M. Thomas

发表机构 * Intelligent Embedded Systems, University of Kassel(智能嵌入式系统,卡塞尔大学) CELUS GmbH GAIN Group, Institute of Data Science, University of Greifswald(GAIN集团,数据科学研究所,格里夫斯瓦尔德大学)

AI总结 提出ALINC框架,通过图级采样策略解决归纳式节点分类中的主动学习问题,并评估了多种策略与聚合方法的效果。

Comments Accepted at ECML PKDD 2026

详情
AI中文摘要

节点分类的主动学习通常专注于在一个或几个大图(例如社交网络分析)中选择最具信息量的节点进行标注。然而,在其他领域,如分子化学或电子设计自动化,数据集由数千个独立图组成。在许多这样的归纳式设置中,标注单个节点需要全图分析,这实际上会即时产生剩余的节点标签。因此,这些场景需要选择整个图而非单个节点的主动学习策略,而这一问题迄今尚未在文献中得到解决。因此,我们提出了ALINC,一个通过图采样进行归纳式节点分类的主动学习框架。它通过多种聚合机制将节点级效用度量提升为图级选择标准,从而弥合了现有的方法论差距。在包含十种策略、三种聚合方法和四个数据集的广泛基准测试中,我们确定了CoreSet、TypiClust和BADGE作为性能最佳的图采样策略。我们的详细分析进一步揭示,聚合方法的选择至关重要,因为它显著影响模型性能和标注成本。最后,我们在两个用例研究中展示了ALINC的有效性:分子中的代谢位点预测和印刷电路板原理图的设计自动化。

英文摘要

Active learning (AL) for node classification typically focuses on selecting the most informative nodes for annotation within one or a few large graphs (e.g., in social network analysis). However, in other domains, such as molecular chemistry or electronic design automation, datasets consist of thousands of independent graphs. In many of these inductive settings, annotating an individual node requires a full-graph analysis, which effectively yields the remaining node labels on-the-fly. Therefore, these scenarios require AL strategies that select entire graphs instead of single nodes, a problem which has not been tackled in the literature so far. Thus, we introduce ALINC, an AL framework for inductive node classification via graph sampling. It bridges the existing methodological gap by elevating node-level utility measures to graph-level selection criteria through various aggregation mechanisms. In an extensive benchmark including ten strategies, three aggregation methods, and four datasets, we identify CoreSet, TypiClust, and BADGE as the top-performing graph sampling strategies. Our detailed analysis further reveals that the choice of the aggregation method is pivotal, as it substantially affects model performance and annotation costs. Finally, we demonstrate the effectiveness of ALINC in two use case studies: site-of-metabolism prediction in molecules and design automation of printed circuit board schematics.

2606.04634 2026-06-04 cs.LG 版本更新

Explainably Safe Reinforcement Learning

可解释的安全强化学习

Sabine Rieder, Stefan Pranger, Debraj Chakraborty, Jan Křetínský, Bettina Könighofer

发表机构 * Masaryk University(马萨里克大学) Graz University of Technology(格拉茨技术大学) Technical University of Munich(慕尼黑技术大学)

AI总结 提出一种基于分层决策树的可解释安全强化学习方法,通过世界模型分析状态风险并构建屏蔽策略,生成可理解的解释,同时保持安全保证。

详情
AI中文摘要

对决策系统的信任既需要安全保证,也需要解释和理解其行为的能力。这对于学习系统尤为重要,因为其决策过程往往高度不透明。屏蔽是一种基于模型的强化学习安全增强技术。然而,由于屏蔽是通过严格的形式化方法自动合成的,其决策同样难以被人类解释。最近,决策树被广泛用于表示控制器和策略。但由于屏蔽本质上具有非确定性,其决策树表示变得过大,无法在实践中提供可解释性。为应对这一挑战,我们提出了一种新颖的可解释安全强化学习方法,通过提供人类可理解的屏蔽决策解释来增强信任。我们的方法将屏蔽策略表示为分层决策树,提供自上而下的基于案例的解释。在设计时,我们使用世界模型分析在给定状态下执行动作的安全风险。基于此分析,我们构建屏蔽策略和一个高层决策树,将状态分类为风险类别(安全、关键、危险、不安全),解释为何某种情况可能涉及安全关键。在运行时,我们生成局部决策树,解释哪些动作被允许以及为何其他动作被认为不安全。我们的方法促进了屏蔽安全强化学习中安全方面的可解释性,不需要超出屏蔽已用信息的额外信息,开销极小,并能轻松集成到现有的屏蔽强化学习流程中。实验中,我们使用比原始屏蔽小几个数量级的决策树来计算解释。

英文摘要

Trust in a decision-making system requires both safety guarantees and the ability to interpret and understand its behavior. This is particularly important for learned systems, whose decision-making processes are often highly opaque. Shielding is a prominent model-based technique for enforcing safety in reinforcement learning. However, because shields are automatically synthesized using rigorous formal methods, their decisions are often similarly difficult for humans to interpret. Recently, decision trees became customary to represent controllers and policies. However, since shields are inherently non-deterministic, their decision tree representations become too large to be explainable in practice. To address this challenge, we propose a novel approach for explainable safe RL that enhances trust by providing human-interpretable explanations of the shield's decisions. Our method represents the shielding policy as a hierarchy of decision trees, offering top-down, case-based explanations. At design time, we use a world model to analyze the safety risks of executing actions in given states. Based on this analysis, we construct both the shield and a high-level decision tree that classifies states into risk categories (safe, critical, dangerous, unsafe), explaining why a situation may be safety-critical. At runtime, we generate localized decision trees that explain which actions are allowed and why others are deemed unsafe. Our method facilitates explainability of the safety aspect in safe-by-shielding reinforcement learning, requires no additional information beyond what is already used for shielding, incurs minimal overhead, and integrates readily into existing shielded RL pipelines. In our experiments, we compute explanations using decision trees that are several orders of magnitude smaller than the original shield.

2606.04632 2026-06-04 cs.LG cs.CL 版本更新

VentAgent: When LLMs Learn to Breathe -- Multi-Objective Arbitration for ARDS Ventilation

VentAgent:当大语言模型学会呼吸——ARDS通气的多目标仲裁

Teqi Hao, Yuxuan Fu, Xiaoyu Tan, Shaojie Shi, Bohao Lv, Yinghui Xu, Xihe Qiu

发表机构 * School of Electronic and Electrical Engineering, Shanghai University of Engineering Science(上海工程技术大学电子与电气工程学院) Tencent Youtu Lab(腾讯优图实验室) Artificial Intelligence Innovation and Incubation Institute, Fudan University(复旦大学人工智能创新与孵化院)

AI总结 提出VentAgent分层框架,利用大语言模型作为透明仲裁者,通过感知-规划-编排三阶段将机械通气控制转化为动态多目标仲裁过程,在生理模拟器上优于强化学习和经典控制基线,并提供可解释的推理链。

详情
AI中文摘要

急性呼吸窘迫综合征(ARDS)的机械通气需要平衡竞争性的生理目标,包括氧合、肺保护和酸碱平衡。然而,当前的数据驱动方法,尤其是模仿回顾性电子健康记录(EHR)的方法,常常遭受模仿偏差。它们可能从不一致的临床演示中捕获表面相关性,例如将被动呼吸机设置与生存关联,因为这种设置在稳定患者中很常见,因此无法泛化到不稳定或分布外的表型。标准的强化学习(RL)方法也难以处理重症监护中的对抗性权衡,并常常产生不透明且临床可解释性有限的策略。为了解决这些局限性,我们引入了VentAgent,一个分层框架,其中大语言模型(LLM)作为机械通气的透明仲裁者。我们将通气控制重新表述为动态多目标仲裁过程,而非单目标优化。VentAgent将决策分解为三个可解释的阶段:感知、规划和编排。通过利用LLM的语义推理能力,它综合来自异构专家的策略,并通过显式协调机制解决冲突的临床优先级。在高保真生理模拟器上的评估表明,VentAgent优于最先进的RL和经典控制基线。此外,它将控制决策转化为人类可读的推理链,为重症监护自动化提供了更安全、更可解释和更自适应的范式。

英文摘要

Mechanical ventilation for Acute Respiratory Distress Syndrome (ARDS) requires balancing competing physiological goals, including oxygenation, lung protection, and acid-base homeostasis. However, current data-driven methods, especially those imitating retrospective Electronic Health Records (EHR), often suffer from imitation bias. They may capture superficial correlations from inconsistent clinical demonstrations, such as associating passive ventilator settings with survival because such settings are common in stable patients, and thus fail to generalize to volatile or out-of-distribution phenotypes. Standard Reinforcement Learning (RL) methods also struggle with the adversarial trade-offs of critical care and often produce opaque policies with limited clinical interpretability. To address these limitations, we introduce VentAgent, a hierarchical framework in which Large Language Models (LLMs) act as transparent arbitrators for mechanical ventilation. We reformulate ventilation control as a dynamic Multi-Objective Arbitration process rather than single-objective optimization. VentAgent decomposes decision-making into three interpretable stages: Perception, Planning, and Orchestration. By leveraging the semantic reasoning capabilities of LLMs, it synthesizes strategies from heterogeneous experts and resolves conflicting clinical priorities through an explicit coordination mechanism. Evaluations on a high-fidelity physiological simulator show that VentAgent outperforms state-of-the-art RL and classical control baselines. Moreover, it converts control decisions into human-readable reasoning chains, offering a safer, more interpretable, and adaptable paradigm for critical care automation.

2606.04623 2026-06-04 cs.LG 版本更新

Learning symplectic model reduction based on a approximation theorem of symplectic embeddings

基于辛嵌入逼近定理的辛模型降阶学习

Liyi Feng, Yifa Tang, Yulin Xie, Ruili Zhang, Aiqing Zhu

发表机构 * School of Mathematics and Statistics, Beijing Jiaotong University(北京交通大学数学与统计学学院) State Key Laboratory of Mathematical Sciences, Academy of Mathematics and Systems Science, Chinese Academy of Sciences(中国科学院数学与系统科学研究院数学科学国家重点实验室) Department of Mathematics, National University of Singapore(新加坡国立大学数学系)

AI总结 针对高维哈密顿系统降阶中辛结构易破坏的问题,提出辛保持自编码器(SpAE),通过参数化解码器为辛嵌入、编码器为辛投影,在保证辛结构的同时提升重构与预测精度。

详情
AI中文摘要

高维哈密顿系统在许多科学和工程学科中扮演着核心角色,其动力学在辛流形上演化。尽管深度学习为从数据构建低维替代模型提供了强大工具,但在模型降阶过程中,内在的辛结构很容易被破坏。因此,标准自编码器可能产生不支持哈密顿流的潜在坐标,导致长时间预测不稳定。本文首先建立了辛嵌入的通用逼近定理。基于该理论,我们提出了辛保持自编码器(SpAE),其中解码器被参数化为辛嵌入,编码器被构造为相应的辛投影。该架构具有足够的表达能力来逼近非线性辛嵌入及其相关的辛投影,通过构造精确保持辛结构,并且可以通过标准的无约束优化进行训练,从而提高了重构和预测精度。在高维晶格和粒子系统上的大量实验证明了所提出方法的有效性。

英文摘要

High-dimensional Hamiltonian systems play a central role in many scientific and engineering disciplines, with dynamics evolving on symplectic manifolds. Although deep learning provides powerful tools for constructing low-dimensional surrogates from data, the intrinsic symplectic structure is easily destroyed during model reduction. As a result, a standard autoencoder may produce latent coordinates that do not support a Hamiltonian flow, leading to unstable long-time prediction. In this paper, we first establish a universal approximation theorem for symplectic embeddings. Based on this theory, we propose symplecticity-preserving autoencoders (SpAE), in which the decoder is parameterized as a symplectic embedding and the encoder is constructed as the corresponding symplectic projection. This architecture is expressive enough to approximate nonlinear symplectic embeddings and the associated symplectic projections, preserves the symplectic structure exactly by construction, and can be trained by standard unconstrained optimization, thereby improving both reconstruction and prediction accuracy. Extensive experiments on high-dimensional lattice and particle systems demonstrate the effectiveness of the proposed method.

2606.04620 2026-06-04 cs.LG cs.AI 版本更新

QuBLAST: A Framework for Quantizing Large Language Models with Block-Level Compression Approach and Activation Scaling Strategy

QuBLAST: 一种采用块级压缩方法和激活缩放策略量化大语言模型的框架

Pasindu Wickramasinghe, Achyuta Muthuvelan, Rachmad Vidya Wicaksana Putra, Minghao Shao, Muhammad Shafique

发表机构 * eBRAIN Lab, Division of Engineering, New York University (NYU) Abu Dhabi(eBRAIN实验室,工程系,纽约大学(NYU)阿布扎比分校)

AI总结 针对大语言模型部署困难,提出QuBLAST框架,通过块级混合精度量化和激活缩放策略,在降低模型大小40%-45.2%的同时保持困惑度增加不超过5%。

Comments 10 pages, 9 figures, 5 tables

详情
AI中文摘要

大语言模型已成为解决NLP任务的最先进算法。然而,它们通常伴随着巨大的计算和内存成本,因此难以部署在嵌入式系统上。为此,最先进的方法通常在网络的所有注意力块上采用统一的训练后量化,从而忽略了在同一网络中应用不同量化级别的潜力。它们还采用复杂操作来减轻激活异常值的负面影响,从而产生高计算开销。此外,它们没有考虑使用具有非传统注意力架构(例如状态空间模型)的新兴大语言模型进行评估,这些模型在应用量化时提出了不同的挑战。为了解决这些局限性,我们提出了QuBLAST,一种新颖的训练后量化方法,该方法采用块级压缩方法和激活缩放策略用于大语言模型。块级压缩方法实现了网络各块之间的混合精度量化,而激活缩放策略有效减轻了激活异常值的负面影响。具体来说,QuBLAST首先通过交叉熵损失分析预训练模型中不同注意力块的敏感性。QuBLAST利用这种敏感性分析来确定模型中每个注意力块的权重量化级别。此外,QuBLAST为每个块采用激活缩放图来控制激活值的范围并减轻激活异常值的负面影响,从而实现更好的量化结果。实验结果表明,QuBLAST在不同模型架构(即Qwen3-8B、Llama3-8B、Mistral v0.1-8B和Falcon H1R-7B)上将模型大小减少了40%-45.2%,同时在WikiText-2和WikiText-103数据集上保持性能在5%的困惑度增加之内。

英文摘要

LLMs have become the state-of-the-art algorithms for solving NLP tasks. However, they typically come at huge computational and memory costs, thus making them difficult to deploy on embedded systems. To this end, state-of-the-art methods typically employ uniform post-training quantization (PTQ) across attention blocks of the network, hence overlooking the potential of applying different quantization levels in the same network. They also employ complex operations to mitigate the negative impact of activation outliers, hence incurring high computational overheads. Moreover, they have not considered evaluation using emerging LLMs with non-conventional attention architectures (e.g., state-space models), which pose different challenges in applying quantization. To address these limitations, we propose QuBLAST, a novel PTQ methodology that employs a block-level compression approach with an activation scaling strategy for LLMs. The block-level compression approach enables mixed-precision quantization across blocks of the network, while the activation scaling strategy efficiently mitigates the negative impact of activation outliers. Specifically, QuBLAST first analyzes the sensitivity of different attention blocks in the pre-trained model through cross-entropy loss analysis. QuBLAST leverages this sensitivity analysis to determine the weight quantization level for each attention block in the model. Furthermore, QuBLAST employs an activation scaling map for each block to control the range of activation values and mitigate the negative impact of activation outliers, thereby enabling better quantization results. Experimental results show that QuBLAST reduces model sizes by 40%-45.2% across different model architectures (i.e., Qwen3-8B, Llama3-8B, Mistral v0.1-8B, and Falcon H1R-7B), while keeping the perplexity increase within 5% on the WikiText-2 and WikiText-103 datasets.

2606.04613 2026-06-04 cs.CV cs.LG 版本更新

Beyond Symmetric Alignment: Spectral Diagnostics of Modality Imbalance in Vision-Language Models in the Medical Domain

超越对称对齐:医学领域视觉-语言模型中模态不平衡的谱诊断

Alessandro Gambetti, Qiwei Han, Cláudia Soares, Hong Shen

发表机构 * NOVA School of Science and Technology(诺瓦科学与技术学校) Nova School of Business and Economics(诺瓦商业与经济学校) Carnegie Mellon University(卡内基梅隆大学)

AI总结 提出非对称谱对齐分数(SAS),通过特征值加权的逐本征模相关性量化模态信息不平衡,并在医学图像-文本数据集上评估15个VLM,发现医学图像比临床报告保留更丰富的结构信息,且SAS与检索性能的相关性最强。

Comments 10 pages, 3 figures, 9 tables

详情
AI中文摘要

视觉-语言模型(VLM)在应用于医学图像-文本数据时表现不佳,但可用于诊断这种失败的工具仍然有限。现有的表示对齐度量是对称的,将两种模态合并为一个分数,隐藏了哪种模态驱动了跨模态退化。我们引入了谱对齐分数(SAS),这是一种非对称度量,将两种模态投影到锚定模态的主特征基上,并计算特征值加权的逐本征模相关性,从而得到方向性分数,其差值量化了模态信息不平衡。我们将SAS嵌入到一个基准框架中,评估了15个VLM在自然和医学图像-文本数据集上的表现,同时使用了6种对齐度量和双向检索。我们的实验表明,医学图像比其配对的临床报告保留了更丰富的结构信息,这种方向性不对称是所有竞争度量无法察觉的,并且SAS在医学领域实现了与检索性能的最强零标签相关性,使其成为临床部署的实用诊断工具。代码可在以下网址获取:https://github.com/iamalegambetti/medical-vlms-assessment。

英文摘要

Vision-Language Models (VLMs) struggle when applied to medical image-text data, yet the tools available to diagnose this failure remain limited. Existing representation alignment metrics are symmetric, collapsing both modalities into a single score and hiding which modality drives cross-modal degradation. We introduce the Spectral Alignment Score (SAS), an asymmetric metric that projects both modalities onto the principal eigenbasis of an anchor modality and computes eigenvalue-weighted per-eigenmode correlations, resulting in directional scores whose difference quantifies modality information imbalance. We embed SAS within a benchmarking framework evaluating 15 VLMs across natural and medical image-text datasets alongside 6 alignment metrics and bidirectional retrieval. Our experiments show that medical images retain richer structural information than their paired clinical reports, a directional asymmetry invisible to all competing metrics, and that SAS achieves the strongest zero-label correlation with retrieval performance in the medical domain, positioning it as a practical diagnostic tool for clinical deployment. Code is available at this URL: https://github.com/iamalegambetti/medical-vlms-assessment.

2606.04603 2026-06-04 cs.IR cs.LG stat.ML 版本更新

Distributional Approximate Nearest Neighbour Search for Uncertainty-Aware Retrieval

面向不确定性感知检索的分布近似最近邻搜索

Olivier Jeunen

发表机构 * Antwerp, Belgium(比利时安特卫普)

AI总结 提出DINOSAUR框架,通过为每个物品采样多个嵌入并构建索引,在检索时对用户嵌入进行采样,以隐式边缘化嵌入不确定性,从而在不改变模型架构或索引基础设施的情况下提升长尾物品的覆盖。

详情
AI中文摘要

近似最近邻搜索索引构成了现实世界推荐系统的骨干,支持在百万级物品目录上进行实时候选检索。通常,为每个用户和每个物品学习一个点估计嵌入。在服务时,用户嵌入查询索引以获取相关物品。由于这些表示是从稀疏交互数据中学习的,它们带有噪声,可能无法捕捉所有有助于“相关性”的细微差别——忽略了其固有的基本不确定性。结果是检索管道系统性地偏向于少数嵌入估计良好的热门头部物品,而牺牲了长尾中多数小众、多样和偶然的内容。 我们提出了DINOSAUR(面向不确定性感知检索的分布近似最近邻搜索):一个简单且与基础设施兼容的框架,将嵌入不确定性纳入候选生成。DINOSAUR不为点估计建立索引,而是为每个物品采样$S_i$个嵌入,并在这一增强集上构建索引。类似地,在查询时,对用户嵌入进行采样。这种双边的随机检索过程隐式地边缘化了嵌入不确定性,无需改变模型架构或ANN索引基础设施。 在分析方面,我们展示了当不确定性消失时,DINOSAUR恢复标准的点估计检索,并刻画了增加的嵌入方差如何扩展不确定物品可检索的潜在空间区域。可重复的实证观察与这些预期一致,显示出在离线召回率小幅损失的情况下,覆盖率大幅提升。

英文摘要

Approximate Nearest Neighbour search indices form the backbone of real-world recommender systems, enabling real-time candidate retrieval over million-item catalogues. Typically, a single point estimate embedding is learnt for every user and every item. At serving time, the user embedding queries the index for relevant items. Since these representations are learnt from sparse interaction data, they are noisy and might fail to capture all the nuances that contribute to "relevance", ignoring the fundamental uncertainty that is inherent to them. The result is a retrieval pipeline that is systematically biased toward the small minority of popular head items with well-estimated embeddings, at the expense of the long-tail majority of niche, diverse, and serendipitous content. We propose DINOSAUR (Distributional Approximate Nearest Neighbour Search for Uncertainty-Aware Retrieval): a simple and infrastructure-compatible framework to incorporate embedding uncertainty into candidate generation. Rather than indexing point estimates, DINOSAUR samples $S_i$ embeddings per item and constructs an index on this augmented set. Analogously, at query time, a user embedding is sampled. This two-sided stochastic retrieval process implicitly marginalises over embedding uncertainty, without requiring changes to model architecture or ANN index infrastructure. On the analytical side, we show that DINOSAUR recovers standard point-estimate retrieval as uncertainty vanishes, and we characterise how increased embedding variance expands the regions of latent space in which uncertain items are retrievable. Reproducible empirical observations align with these expectations, showing large coverage gains with small losses in offline recall.
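The two-sided sampling procedure in this abstract can be sketched roughly as follows. Several things here are assumptions and not taken from the paper: embeddings are modelled as isotropic Gaussians with a per-item standard deviation, scoring uses the inner product, and an exact brute-force scan stands in for a real ANN index. The function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)


def build_augmented_index(item_means, item_stds, samples_per_item):
    """Sample S_i embeddings per item; return the stacked vectors and their item ids."""
    vectors, owners = [], []
    for item_id, (mu, sigma, s_i) in enumerate(
        zip(item_means, item_stds, samples_per_item)
    ):
        # Assumed uncertainty model: isotropic Gaussian noise around the mean.
        draws = mu + sigma * rng.standard_normal((s_i, mu.shape[0]))
        vectors.append(draws)
        owners.extend([item_id] * s_i)
    return np.vstack(vectors), np.array(owners)


def retrieve(user_mean, user_std, index_vectors, owners, k):
    """Sample one user embedding, score all index entries, return top-k distinct items."""
    query = user_mean + user_std * rng.standard_normal(user_mean.shape[0])
    scores = index_vectors @ query            # brute force in place of an ANN lookup
    ranked = owners[np.argsort(-scores)]      # item ids, best-scoring sample first
    _, first_pos = np.unique(ranked, return_index=True)
    distinct = ranked[np.sort(first_pos)]     # de-duplicate while keeping rank order
    return distinct[:k].tolist()


# With zero uncertainty the procedure reduces to point-estimate retrieval.
item_means = np.eye(3)
index_vectors, owners = build_augmented_index(
    item_means, item_stds=[0.0, 0.0, 0.0], samples_per_item=[2, 2, 2]
)
top2 = retrieve(np.array([0.1, 0.9, 0.2]), 0.0, index_vectors, owners, k=2)
```

Because the index simply holds more vectors, the same structure could sit on top of an existing ANN library unchanged, which is the infrastructure-compatibility point the abstract makes.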

2606.04583 2026-06-04 cs.LG 版本更新

HalfNet: Randomized Neural Networks with Learned Subspace Geometry

HalfNet: 具有学习子空间几何的随机神经网络

Ethem Alpaydin

发表机构 * Ethem Alpaydin

AI总结 提出HalfNet,通过从可学习的低秩协方差矩阵中随机采样权重,在减少参数的同时匹配全连接网络的性能,揭示权重空间几何对预测能力的关键作用。

Comments 6 pages (+2 pages of appendix), 6 figures

详情
AI中文摘要

许多研究者研究了将部分权重固定为从给定分布(例如 $N(0, I)$)随机抽取值的神经网络。我们提出的 HalfNet 从 $N(0, Σ)$ 中抽取随机权重,其中定义分布几何的 $Σ$ 具有我们从数据中学习的低秩分解。在 MNIST 和 CIFAR-10 上的实验表明,HalfNet 在使用显著更少参数的情况下,能够匹配全训练多层感知器的性能。谱分析表明,神经网络的大部分预测能力在于其权重空间的几何结构,而非单个参数的精确值,并且我们观察到准确率随秩平滑扩展。HalfNet 并非针对低秩结构的神经架构技巧;它实现了一种数据相关的随机嵌入,也可以通过监督度量学习或随机特征和核视角进行解释。

英文摘要

Many researchers investigated neural networks with some of their weights fixed to values randomly drawn from a given distribution, e.g., $N(0, I)$. Our proposed HalfNet draws random weights from $N(0, Σ)$, where $Σ$, which defines the geometry of the distribution, has a low-rank factorization that we learn from data. Experiments on MNIST and CIFAR-10 demonstrate that HalfNet can match the performance of fully trained multilayer perceptrons while using substantially fewer parameters. Spectral analysis indicates that much of the predictive power of neural networks lies in the geometry of their weight space rather than in the precise values of individual parameters, and we observe that accuracy scales smoothly with rank. HalfNet is not a neural architecture trick for low-rank structure; it implements a data-dependent random embedding that can also be interpreted through supervised metric learning, or random-feature and kernel perspectives.
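One way to read "random weights from $N(0, Σ)$ with a learned low-rank factorization" is a reparameterized draw, sketched below. This is an interpretation and not the paper's confirmed construction: the factor `A`, the fixed noise matrix `Z`, the dimensions, and the ReLU layer are all illustrative assumptions, and no training loop is shown.

```python
import numpy as np

rng = np.random.default_rng(0)
d, h, r = 20, 64, 4  # input dim, hidden units, rank (illustrative values)

# Low-rank factor of the covariance, Sigma = A @ A.T. In a trained model this
# would be the learned part; here it is only randomly initialised.
A = 0.1 * rng.standard_normal((d, r))

# Fixed standard-normal draws that are never trained.
Z = rng.standard_normal((h, r))

# Each row w_j = A @ z_j with z_j ~ N(0, I_r), hence w_j ~ N(0, A @ A.T).
W = Z @ A.T


def hidden(x):
    """Hidden layer using the sampled weights (ReLU chosen for illustration)."""
    return np.maximum(0.0, x @ W.T)


trainable_params = A.size   # d * r
dense_params = d * h        # a fully trained layer of the same shape
```

In this sketch every weight row lies in the r-dimensional column space of `A`, so only the subspace geometry is adjustable while the individual draws stay random.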

2606.04582 2026-06-04 physics.comp-ph cs.LG physics.app-ph 版本更新

Reconstructing Unobservable Temperature Fields via Simulation-Aided Intelligent Sensing

通过仿真辅助智能感知重建不可观测温度场

Monika Stipsitz, Hèlios Sanchis-Alepuz, Jacob Reynvaan, Silvester Sabathiel

发表机构 * Silicon Austria Labs(奥地利硅实验室) Republic of Austria(奥地利共和国) Styrian Business Promotion Agency(施蒂里亚商业促进局) federal state of Carinthia(卡林西亚联邦州) Upper Austrian Research(上奥地利研究) Austrian Association for the Electric and Electronics Industry(奥地利电子电气工业协会)

AI总结 提出基于随机物理仿真生成数据集的方法,训练神经网络从稀疏传感器重建内部温度场,实现实时在线监测。

Comments Presented at IEEE International Instrumentation and Measurement Technology Conference (I2MTC), Nancy, France, 2026

详情
AI中文摘要

在许多系统中,由于传感器位置的限制,实时监测组件和子结构内部的温度分布是一个具有挑战性的课题。虽然机器学习在许多应用中是一种多功能工具,但其在高分辨率热监测中的应用受到高质量训练数据集可用性的阻碍。在这项工作中,我们提出了一种基于随机物理仿真为工业应用生成数据集的新方法。我们在一个概念验证硬件设置中演示了该方法:仅在此类合成数据集上训练的神经网络被用于从嵌入硬件中的稀疏传感器重建内部温度场。基于神经网络的重建不仅在鲁棒性上优于克里金法,而且能够实现实时推理,使得该方法适用于在线监测原本不可观测的热状态。

英文摘要

Real-time monitoring of the temperature distribution within components and sub-structures is a challenging topic in many systems due to restrictions on feasible sensor locations. While machine learning (ML) proves a versatile tool in many applications, its adoption for high-resolution thermal monitoring is hindered by the limited availability of high-quality datasets for training. In this work, we propose a novel approach for generating datasets for industrial applications based on randomized physics-based simulations. We demonstrate the approach in a proof-of-concept hardware setup: a neural network (NN) trained only on such a synthetic dataset is used to reconstruct the internal temperature field from sparse sensors embedded in the hardware. The NN-based reconstructions not only outperform Kriging in robustness but also enable real-time inference, making the method suitable for online monitoring of otherwise unobservable thermal states.

2606.04576 2026-06-04 stat.ML cs.LG econ.EM q-fin.RM 版本更新

ReSGA: A Large Tail Risk Model for Learning Value-at-Risk and Expected Shortfall

ReSGA: 一种用于学习风险价值和预期缺口的大尾部风险模型

Yichi Zhang, Ke Zhu, Zhoufan Zhu

发表机构 * Hong Kong University(香港大学) Xiamen University(厦门大学)

AI总结 提出检索增强自分组自编码器(ReSGA),利用数百万参数捕捉资产横截面依赖和长期时间动态,在1926-2023年美国股票数据上优于12种基准模型,并通过新规模增强左尾动量策略实现经济收益。

详情
AI中文摘要

学习风险价值(VaR)和预期缺口(ES)对于有效管理金融风险至关重要。在大数据时代,参数有限的现有方法容易受到模型错误设定的影响。为了解决这一局限性,我们提出了一种大尾部风险模型——检索增强自分组自编码器(ReSGA),该模型设计有数百万个参数,利用资产的特征来挖掘丰富的横截面依赖性和长期时间动态。应用于1926年至2023年的月度美国股票收益数据,包含153个公司特征,ReSGA在样本外损失和统计回测方面优于十二种计量经济学和机器学习竞争对手。此外,其预测优势可以通过一种新的规模增强左尾动量策略构建的多空十分位投资组合转化为显著的经济收益。为了阐明复杂性的作用,我们进一步进行了系统的规模分析,并证明联合VaR-ES预测的改进主要由数据复杂性驱动,而非模型复杂性。最后,我们的组重要性和迁移学习分析展示了ReSGA的可解释性和跨市场泛化能力。

英文摘要

Learning Value-at-Risk (VaR) and Expected Shortfall (ES) is important for managing financial risks effectively. Existing approaches with limited parameters are vulnerable to model misspecification in the era of big data. To address this limitation, we propose a large tail risk model, the retrieval-enhanced self-grouping autoencoder (ReSGA), which is designed with millions of parameters to exploit the rich cross-sectional dependence and long-term temporal dynamics of assets using their characteristics. Applied to monthly US equity returns from 1926 to 2023 with 153 firm characteristics, ReSGA outperforms twelve econometric and machine learning competitors in terms of out-of-sample loss and statistical backtesting. In addition, its forecast advantages can translate into significant economic gains from long-short decile portfolios that are constructed by a new size-enhanced left-side momentum strategy. To clarify the role of complexity, we further conduct a systematic scaling analysis and demonstrate that improvements in joint VaR-ES forecasting are primarily driven by data complexity rather than model complexity. Finally, our analyses of group-importance and transfer-learning exhibit the interpretability and cross-market generalizability of ReSGA.

2606.04574 2026-06-04 cs.LG cs.NE q-fin.ST q-fin.TR stat.ML 版本更新

Dynamic Multi-Pair Trading Strategy in Cryptocurrency Markets with Deep Reinforcement Learning

基于深度强化学习的加密货币市场动态多对交易策略

Damian Lebiedź, Robert Ślepaczuk

发表机构 * Politechnika Śląska(西里西亚理工大学)

AI总结 本研究提出一种结合深度强化学习执行覆盖层的层次化“过滤-排序”配对选择方法和“固定风险、自适应均值”执行模型,在加密货币市场实现优于启发式基准的统计套利表现。

Comments 61 pages, 37 figures, 16 tables

详情
AI中文摘要

本研究旨在确定深度强化学习(DRL)作为专门执行覆盖层是否能够增强高波动性加密货币市场中的配对交易。尽管该策略的经典实现在传统股票市场中已被证明成功,但在高方差环境中往往表现出刚性并面临严重的发散风险。为应对这一需求,本研究引入了新颖概念。为构建稳健系统,我们开发了层次化的“过滤-排序”配对选择方法和专有的“固定风险、自适应均值”执行模型。该系统采用带有长短期记忆(LSTM)层的近端策略优化(PPO)智能体,在严格确定性风险管理边界内控制执行决策。在币安USD-M期货市场的1小时间隔数据上评估,优化后的强化学习策略在样本外表现显著优于启发式基线。平稳循环块自举稳健性检验证实,智能体的风险调整后超额收益在10%水平上统计显著。尽管略低于更严格的5%阈值,这一结果凸显了数字资产特有的极端异质方差。最终,本论文通过引入结合统计套利与DRL执行策略的混合架构,为量化金融文献做出贡献。此外,它通过确定性屏蔽提供了一种安全强化学习的新框架,证明将神经策略锚定于统计稳健边界能成功缓解严重的发散风险。

英文摘要

This study aims to determine whether the application of Deep Reinforcement Learning (DRL) as a specialized execution overlay can enhance pair trading in highly volatile cryptocurrency markets. Although classical implementations of the strategy have proven successful in traditional equities, they frequently exhibit rigidity and suffer from severe divergence risks when applied to high-variance environments. To address this need, this research introduces novel concepts. To construct a robust system, we developed a hierarchical "Filter-then-Rank" pair selection methodology and a proprietary "Fixed Risk, Adaptive Mean" execution model. The system employs a Proximal Policy Optimization (PPO) agent with a Long Short-Term Memory (LSTM) layer to govern execution decisions within strict deterministic risk management boundaries. Evaluated on 1-hour interval data from the Binance USD-M Futures market, the optimized RL policy achieved an out-of-sample performance that substantially outperformed the heuristic baseline. A stationary circular block bootstrap robustness check confirms that the agent's risk-adjusted outperformance is statistically significant at the 10 percent level. Although falling marginally short of the stricter 5 percent threshold, this result highlights the extreme idiosyncratic variance characteristic of digital assets. Ultimately, this thesis contributes to the quantitative finance literature by introducing a hybrid architecture that combines statistical arbitrage with DRL execution policies. Furthermore, it delivers a novel framework for safe reinforcement learning via deterministic shielding, proving that anchoring a neural policy to statistically robust boundaries successfully mitigates severe divergence risks.

2606.04564 2026-06-04 cs.LG 版本更新

SurvPFN: Towards Foundation Models for Survival Predictions

SurvPFN:面向生存预测的基础模型

Samuel Böhm, Lennart Purucker, Frank Hutter, Pascal Schlosser

发表机构 * University of Freiburg(弗赖堡大学)

AI总结 提出SurvPFN,一种基于先验数据拟合网络(PFN)的生存预测模型,通过合成数据预训练和删失负对数似然损失,无需逐数据集拟合即可在真实任务中与经典和深度生存基线竞争。

Comments 10 pages, 1 figure. Accepted to "Foundation Models for Structured Data" Workshop at the International Conference on Machine Learning (ICML) 2026

详情
AI中文摘要

表格基础模型(TFM)在标准分类和回归任务中取得了快速进展,但时间至事件生存预测任务在很大程度上仍未涉及。与标准回归任务不同,生存预测模型必须处理删失数据。标准TFM无法原生处理删失数据,导致预测有偏且不准确,使其不适用于实际应用。为克服这一根本限制,我们提出了SurvPFN,一种用于生存预测任务的先验数据拟合网络(PFN)。我们在数百万个合成生存预测任务上预训练SurvPFN,通过考虑删失数据的分布回归来学习生存。SurvPFN通过以下方式工作:(1)使用威布尔事件时间和非信息性删失机制生成数据;(2)整合删失事件指示符;(3)最小化删失负对数似然。在SurvSet(一个真实世界生存任务集合)上,SurvPFN无需逐数据集拟合、生存特定架构或特征工程,即可与经典和深度生存基线高度竞争。我们表明,生存可以被视为具有删失损失的连续时间分布回归问题,从而释放PFN在时间至事件预测中的潜力。

英文摘要

Tabular foundation models (TFMs) have made rapid progress in standard classification and regression, but time-to-event survival prediction tasks have remained largely untouched. Unlike in standard regression tasks, survival prediction models must account for censored data. Standard TFMs cannot natively handle censored data, leading to biased and inaccurate predictions and making them unsuitable for real-world applications. To overcome this fundamental limitation, we propose SurvPFN, a prior-data fitted network (PFN) for survival prediction tasks. We pretrain SurvPFN on millions of synthetic survival prediction tasks to learn survival via distributional regression that accounts for censored data. SurvPFN works by (1) generating data with Weibull event times and a non-informative censoring mechanism; (2) integrating a censored event indicator; and (3) minimizing a censored negative log-likelihood. On SurvSet, a collection of real-world survival tasks, SurvPFN is highly competitive with classical and deep survival baselines without per-dataset fitting, a survival-specific architecture, or feature engineering. We show that survival can be treated as a continuous-time distributional regression problem with censored loss, unlocking the power of PFNs for time-to-event predictions.
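A censored negative log-likelihood of the kind named in step (3) can be sketched as below. The abstract does not say which output distribution the model predicts; a Weibull with hypothetical `shape` and `scale` parameters is used here purely for illustration, and the function name is an assumption. Observed events contribute the log-density, while censored records contribute the log-survival probability.

```python
import numpy as np


def weibull_censored_nll(times, events, shape, scale):
    """Negative log-likelihood for right-censored data under a Weibull model.

    times:  observed durations (event time or censoring time), all > 0
    events: 1 if the event was observed, 0 if the record is censored
    """
    t = np.asarray(times, dtype=float)
    e = np.asarray(events, dtype=float)
    z = (t / scale) ** shape
    # log f(t) = log k - log lam + (k - 1) * log(t / lam) - (t / lam)^k
    log_density = np.log(shape) - np.log(scale) + (shape - 1.0) * np.log(t / scale) - z
    # log S(t) = -(t / lam)^k
    log_survival = -z
    return float(-np.sum(e * log_density + (1.0 - e) * log_survival))


# With shape = scale = 1 the Weibull is a unit exponential, so both
# log f(t) and log S(t) equal -t and the loss is simply the sum of times.
nll = weibull_censored_nll([1.0, 2.0, 3.0], [1, 0, 1], shape=1.0, scale=1.0)
```

Dropping the censored term, or treating censoring times as event times, would bias the fit toward shorter survival, which is the failure mode the abstract attributes to standard TFMs.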

2606.04562 2026-06-04 cs.AI cs.LG cs.SI 版本更新

Neetyabhas: A Framework for Uncertainty-Aware Public Policy Optimization in Rational Agent-Based Models

Neetyabhas: 理性主体模型中不确定性感知的公共政策优化框架

Janani Venugopalan, Gaurav Deshkar, Rishabh Gaur, Harshal Hayatnagarkar, Jayanta Kshirsagar

发表机构 * ThoughtWorks

AI总结 提出一种集成流行病测量和政策执行不确定性的分层强化学习框架,通过模拟个体行为与政策干预的交互,有效管理疫情并降低影响。

详情
AI中文摘要

目的:世界卫生组织的COVID-19非药物干预措施(如封锁、疫苗接种)有效遏制了传播,但带来了沉重的经济负担。现有研究常常忽略个体行为,并错误地假设完美的感染追踪和无误的政策执行,未能考虑现实世界的不确定性和错误。方法:我们提出了一种整合流行病测量(感染/住院)和政策执行中不确定性的方法。我们构建了一个包含1000名个体的模拟模型,这些个体实时做出关于佩戴口罩、接种疫苗和购物的选择。同时,政策制定者基于健康和经济观察部署干预措施(封锁、强制令)。该框架由分层强化学习智能体驱动,利用深度Q网络以及不确定性感知的策略梯度变体(DDPG和TD3)。结果:模拟有效管理了疫情的进展。佩戴口罩和疫苗接种被证明非常有效,显著降低了疫情高峰的高度和持续时间。通过整合个体行为、政策不确定性和多方面的干预措施,我们的动态控制方法成功减轻了疫情的影响。结论:我们的模型通过将不确定性和人类行为嵌入公共卫生政策框架,克服了以往研究的局限性。模拟表明,考虑个体选择和不完美数据对于设计复杂疫情期间的有效干预措施至关重要,其中口罩和疫苗是关键工具。

英文摘要

Purpose: The WHO's COVID-19 non-pharmaceutical interventions (e.g., lockdowns, vaccinations) effectively curb transmission but impose heavy economic strains. Existing research often neglects individual behaviors and falsely assumes perfect infection tracking and flawless policy execution, failing to account for real-world uncertainties and errors. Methods: We propose an integrative approach incorporating uncertainties in both epidemic measurement (infections/hospitalizations) and policy implementation. We built a simulation model of 1,000 individuals making real-time choices regarding mask-wearing, vaccination, and shopping. Concurrently, policymakers deploy interventions (lockdowns, mandates) based on health and economic observations. This framework is driven by hierarchical reinforcement learning agents, utilizing deep Q-networks alongside uncertainty-aware policy gradient variants (DDPG and TD3). Results: The simulations effectively managed the epidemic's progression. Masking and vaccinations proved highly effective, significantly reducing both the outbreak's peak height and duration. By integrating individual behaviors, policy uncertainties, and multifaceted interventions, our dynamic control approach successfully mitigated the epidemic's impact. Conclusions: Our model overcomes previous research limitations by embedding uncertainty and human behavior into public health policy frameworks. The simulation demonstrates that accounting for individual choices and imperfect data is crucial for designing effective interventions during complex pandemics, with masks and vaccines serving as pivotal tools.

2606.04557 2026-06-04 cs.CL cs.IR cs.LG version update

Cartridges at Scale: Training Modular KV Caches over Large Document Collections

大规模弹匣:训练模块化KV缓存以处理大型文档集合

Momchil Hardalov, Gonzalo Iglesias, Adrià de Gispert

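Affiliations: Amazon AGI

AI summary: Proposes the Cartridges at Scale (CAS) framework, which trains many cartridges at scale using dynamic distractor mixing and a memory-efficient budget manager. It cuts prefill overhead while preserving accuracy, beats a monolithic cartridge by 10-31 points, and approaches full in-context learning.

Comments 21 pages, 5 figures, 17 tables

Details
AI Chinese abstract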

大型语言模型能够处理长上下文,但预填充数百万个标记是浪费的,因为许多内容在查询之间保持不变。弹匣通过将文档集合提炼为可重用的键值(KV)缓存来解决这一问题,从而消除预填充同时保持准确性。这种方法的一个关键限制是弹匣是单块且非组合的:将整个集合编码为单个KV块无法扩展,并且天真地混合单独训练的弹匣会使性能下降到接近随机水平。我们引入了Cartridges at Scale (CAS),这是一个可扩展的多弹匣学习训练框架,具有动态干扰混合和内存高效的预算管理器,可在GPU和持久存储之间轮换数百个每文档弹匣。我们的方法可扩展到超过一百万个标记的集合,在可比标记预算下,比单块弹匣提高10-31点。即使在高度压缩下,Oracle弹匣准确率也接近完全上下文学习的2-6点范围内。当与检索结合用于弹匣选择时,CAS匹配或超过传统RAG准确率,同时消耗的提示标记减少3-4倍。

English abstract

Large Language Models can reason over long contexts, yet prefilling millions of tokens is wasteful as much of the content remains static across queries. Cartridges address this by distilling document collections into reusable key-value (KV) caches that eliminate prefilling while preserving accuracy. A critical limitation of this approach is that cartridges are monolithic and non-compositional: encoding an entire collection into a single KV block does not scale, and naively mixing cartridges trained in isolation collapses performance to near chance. We introduce Cartridges at Scale (CAS), a training framework for scalable multi-cartridge learning with dynamic distractor mixing and a memory-efficient budget manager that rotates hundreds of per-document cartridges between GPU and persistent storage. Our approach scales to collections exceeding a million tokens, improving over a monolithic cartridge by 10-31 points at comparable token budgets. Oracle cartridge accuracy falls within 2-6 points of full in-context learning even at high compression. When paired with retrieval for cartridge selection, CAS matches or exceeds conventional RAG accuracy while consuming 3-4x fewer prompt tokens.

2606.04522 2026-06-04 cs.IR cs.AI cs.DB cs.LG version update

ANN Search: Recall What Matters

ANN搜索:召回真正重要的

Dimitris Dimitropoulos, Nikos Mamoulis

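Affiliations: University of Ioannina; Archimedes, Athena RC

AI summary: Proposes replacing Recall@k with the inverse approximation ratio, 1/Ratio@k, for evaluating the quality of approximate nearest neighbor search; experiments show that the latter reflects practical utility more accurately and lowers computational overhead.

Details
AI Chinese abstract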

近似最近邻(ANN)搜索已成为信息检索和现代机器学习任务(从分类到检索增强生成)的核心原语。社区主要通过给定Recall@k(检索到的真实精确最近邻的比例)下的吞吐量来评估和调优ANN算法。我们认为,ANN搜索真正重要的是检索结果的质量,而非它们与真实kNN集合的重叠。我们证明,使用Recall@k评估检索质量会带来不必要的计算开销,并研究用逆近似比1/Ratio@k替代它。1/Ratio@k评估检索到的邻居与真实邻居之间距离的差异。它无需判断、无需超参数,仅通过标准ANN基准输入即可计算。我们在涵盖广泛内在维度的多样化数据集上对最先进的ANN算法进行基准测试,从效率、下游分类和检索增强生成三个维度全面评估这两个指标。在效率方面,优化1/Ratio@k达到操作质量阈值所需的计算成本远低于Recall@k。在下游任务中,即使Recall@k显著下降,性能指标(标签精度、语义相似度、BERTScore和LLM评分质量)仍保持高度稳定。相反,逆近似比紧密反映了这种稳定性,比Recall@k更好地追踪实际效用。最终,虽然Recall@k夸大了近似的真实成本,但1/Ratio@k提供了更准确、可部署的ANN实际质量代理。

English abstract

Approximate nearest neighbor (ANN) search has become a core primitive in information retrieval and modern machine learning tasks, from classification to retrieval-augmented generation. The community evaluates and tunes ANN algorithms primarily on their throughput at a given Recall@k, the fraction of true exact neighbors retrieved. We argue that what really matters in ANN search is the quality of the retrieved results and not their overlap with the true kNN set. We show that using Recall@k to assess retrieval quality forces unnecessary computational overhead and investigate replacing it by 1/Ratio@k, the inverse approximation ratio. 1/Ratio@k evaluates the differences between the distances of the retrieved and true neighbors. It is judge-free, hyperparameter-free, and computable from standard ANN benchmark inputs alone. We benchmark state-of-the-art ANN algorithms across diverse datasets spanning a wide range of intrinsic dimensionalities, evaluating the two metrics comprehensively across efficiency, downstream classification, and retrieval-augmented generation. On the efficiency axis, optimizing for 1/Ratio@k reaches operational quality thresholds at a substantially lower computational cost than Recall@k. In downstream tasks, performance indicators (label precision, semantic similarity, BERTScore, and LLM-graded quality) remain highly stable even when Recall@k drops significantly. The inverse approximation ratio, on the other hand, closely mirrors this stability, tracking true utility much better than Recall@k. Ultimately, while Recall@k overstates the true cost of approximation, 1/Ratio@k offers a more accurate, deployable proxy for actual ANN quality.
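
To make the contrast between the two metrics concrete, here is a minimal sketch in plain Python. The abstract does not spell out the formula for Ratio@k, so the rank-wise definition used here (the mean, over ranks, of retrieved distance divided by true distance) is an assumption based on the common approximation-ratio convention, and the function names and sample numbers are hypothetical.

```python
def recall_at_k(retrieved_ids, true_ids):
    """Fraction of the true k nearest neighbors that appear in the retrieved set."""
    return len(set(retrieved_ids) & set(true_ids)) / len(true_ids)


def inverse_ratio_at_k(retrieved_dists, true_dists):
    """1/Ratio@k under the assumed rank-wise definition (1.0 means exact quality)."""
    pairs = zip(sorted(retrieved_dists), sorted(true_dists))
    # Skip ranks whose true distance is zero to avoid dividing by zero (an assumption).
    ratios = [r / t for r, t in pairs if t > 0]
    if not ratios:
        return 1.0
    return len(ratios) / sum(ratios)


# Hypothetical query: the third true neighbor is missed, but its replacement
# is only 5% farther away than the neighbor it displaced.
true_ids, true_dists = [1, 2, 3], [1.0, 2.0, 4.0]
found_ids, found_dists = [1, 2, 7], [1.0, 2.0, 4.2]

recall = recall_at_k(found_ids, true_ids)
inv_ratio = inverse_ratio_at_k(found_dists, true_dists)
print(f"Recall@3 = {recall:.3f}, 1/Ratio@3 = {inv_ratio:.3f}")
```

In this toy case Recall@3 falls to about 0.67 while 1/Ratio@3 stays near 0.98, which is the kind of gap the abstract describes: a missed neighbor costs recall heavily even when the substitute is almost as close.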

2606.04516 2026-06-04 cs.LG cs.AI version update

GeoMin: Data-Efficient Semi-Supervised RLVR via Geometric Distribution Modeling

GeoMin: 基于几何分布建模的数据高效半监督RLVR

Guangcheng Zhu, Shenzhi Yang, Haobo Wang, Xing Zheng, Yingfan MA, Xuening Feng, Zhongqi Chen, Kai Tang, Zhengqing Zang, Bowen Song, Weiqiang Wang, Gang Chen

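Affiliations: Zhejiang University; Ant Group

AI summary: Proposes GeoMin, which models the global feature distribution of labeled data to decode the structural difference between correct and incorrect rollouts, establishing a robust prior for assessing the reliability of self-reward signals. It uses unlabeled data efficiently with few labels and surpasses fully supervised models with only 10% of the annotations.

Details
AI Chinese abstract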

基于可验证奖励的强化学习(RLVR)显著提升了LLM的推理能力,但面临困境:标准监督扩展受限于高标注成本,而无监督替代方案则遭受严重的模型崩溃。最近的半监督RLVR方法通过使用少量标注集指导未标注数据,在训练效果和标注成本之间取得了有前景的权衡。然而,由于依赖粗糙的性能启发式,它们遭受严重的数据效率瓶颈,导致绝大多数有价值实例未被充分利用。为此,我们提出GeoMin,它在标注数据上建模全局特征分布,以解码正确和错误展开之间的结构差异,从而建立稳健的先验来评估自奖励信号的可靠性,并充分释放未标注数据的潜力。实验上,GeoMin比最强基线高出+4.1%,甚至在使用仅10%标注的情况下超越全监督模型,展示了显著的数据效率。

English abstract

Reinforcement learning with verifiable rewards (RLVR) significantly advances LLM reasoning, yet it faces a dilemma: standard supervised scaling is throttled by high annotation costs, while unsupervised alternatives suffer from severe model collapse. Recent semi-supervised RLVR methods address this by using a small labeled set to guide unlabeled data, achieving a promising trade-off between training efficacy and annotation cost. However, they suffer from a severe data-efficiency bottleneck due to the reliance on coarse performance heuristics, leaving a vast majority of valuable instances underutilized. To this end, we propose GeoMin, which models global feature distributions on labeled data to decode the structural discrepancy between correct and incorrect rollouts, thereby establishing a robust prior to assess the reliability of self-reward signals and fully unleash the potential of unlabeled data. Empirically, GeoMin outperforms the strongest baselines by +4.1% and even surpasses fully supervised models with only 10% of the annotations, demonstrating remarkable data efficiency.

2606.04511 2026-06-04 cs.CL cs.LG version update

SparDA: Sparse Decoupled Attention for Efficient Long-Context LLM Inference

SparDA: 用于高效长上下文LLM推理的稀疏解耦注意力

Yaosheng Fu, Guangxuan Xiao, Xin Dong, Song Han, Oreste Villa

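Affiliations: NVIDIA; Thinking Machines Lab; ByteDance Seed; MIT

AI summary: Proposes the SparDA architecture, which adds a fourth projection, Forecast, to decouple KV-cache prefetching from attention and reduce sparse-selection overhead, achieving up to 1.25x prefill and 1.7x decode speedups in long-context inference.

Details
AI Chinese abstract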

稀疏注意力减少了长上下文LLM推理的计算和内存带宽。然而,仍然存在两个关键挑战:(1)KV缓存容量随序列长度增长,卸载到CPU内存引入了PCIe传输瓶颈;(2)稀疏选择步骤本身保持$O(T^2)$复杂度,在长上下文中可能主导注意力成本。我们提出SparDA,一种解耦的稀疏注意力架构,它在Query、Key和Value之外引入了第四个逐层投影——Forecast。Forecast预测下一层所需的KV块,从而实现超前选择,将CPU到GPU的预取与当前层执行重叠。由于Forecast与注意力查询解耦,我们的GQA实现为每个GQA组使用一个Forecast头,相比原始多头选择器减少了选择开销。SparDA增加了<0.5%的参数,并通过匹配原始选择器的注意力分布仅训练Forecast投影。在两个稀疏预训练的8B模型上,SparDA匹配或略微提高了准确性,并且相比稀疏注意力卸载基线,提供了高达1.25倍的预填充加速和1.7倍的解码加速。通过使单个GPU上可行的批量大小更大,SparDA进一步实现了比非卸载稀疏基线高达5.3倍的解码吞吐量。我们的源代码可在https://github.com/NVlabs/SparDA获取。

English abstract

Sparse attention reduces compute and memory bandwidth for long-context LLM inference. However, two key challenges remain: (1) KV cache capacity still grows with sequence length, and offloading to CPU memory introduces a PCIe transfer bottleneck; (2) the sparse selection step itself retains $O(T^2)$ complexity and can dominate attention cost at long contexts. We propose SparDA, a decoupled sparse attention architecture that introduces a fourth per-layer projection, the Forecast, alongside Query, Key, and Value. The Forecast predicts the KV blocks needed by the next layer, enabling lookahead selection that overlaps CPU-to-GPU prefetch with current-layer execution. Because Forecast is decoupled from the attention query, our GQA implementation uses one Forecast head per GQA group, reducing selection overhead versus the original multi-head selector. SparDA adds $<$0.5% parameters and trains only the Forecast projections by matching the original selector's attention distribution. On two sparse-pretrained 8B models, SparDA matches or slightly improves accuracy and delivers up to 1.25$\times$ prefill speedup and 1.7$\times$ decode speedup over the sparse-attention offload baseline. By enabling larger feasible batch sizes on a single GPU, SparDA further reaches up to 5.3$\times$ higher decode throughput than the non-offload sparse baseline. Our source code is available at https://github.com/NVlabs/SparDA.

2606.04503 2026-06-04 cs.LG cs.AI version update

Smart Picks in the Dark: Towards Efficient RLVR for Reasoning via Tracing Metacognitive Pivots

暗中选择:通过追踪元认知支点实现高效的推理可验证奖励强化学习

Guangcheng Zhu, Shenzhi Yang, Haobo Wang, Xing Zheng, Yingfan MA, Xuening Feng, Zhongqi Chen, Bowen Song, Weiqiang Wang, Gang Chen

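Affiliations: Zhejiang University; Ant Group

AI summary: To address low data efficiency in reinforcement learning with verifiable rewards (RLVR), proposes the PivotTrace framework, which uses attention dynamics to trace metacognitive pivots during reasoning and quantifies uncertainty via pivot density to triage data automatically. It surpasses the fully supervised model with only 29.3% of samples annotated and 2.75x faster convergence.

Details
AI Chinese abstract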

可验证奖励强化学习(RLVR)极大地推进了大型推理模型(LRMs),但它需要及时在大量完全标注的数据集上进行训练。为此,从两个角度广泛研究了数据高效的RLVR方法:(i)数据选择方法识别一小部分“黄金”样本,这些样本能产生接近全数据性能,但它们依赖于预先存在的标注数据池。(ii)无监督RLVR方法在大规模未标注数据上利用模型自身的内部监督信号进行训练,但表现出次优性能。因此,我们研究了RLVR的“暗中选择”设置,其目标是在没有先验监督的情况下,选择对训练最有益且值得标注的未标注样本。通过系统分析,我们证明智能选择依赖于一个校准良好的不确定性估计器,以实现数据的策略性划分,从而进行自适应训练方案。基于这一见解,我们提出了PivotTrace,一个三路数据分流框架,利用注意力动态追踪推理过程中的元认知支点。通过支点密度精确量化不确定性,PivotTrace实现了自动数据路由,协同最大化标注和训练效率。实验表明,PivotTrace仅使用29.3%的标注样本和2.75倍的收敛速度就超越了全监督LRM。

English abstract

Reinforcement learning with verifiable rewards (RLVR) has greatly advanced large reasoning models (LRMs), but it requires timely training on a huge fully-annotated dataset. To this end, data-efficient RLVR methods have been widely studied from two perspectives: (i) data selection methods identify a small subset of "golden" samples that yield near-full-data performance, but they rely on a pre-existing pool of labeled data; (ii) unsupervised RLVR methods train the model using its own internal supervision signals on large-scale unlabeled data, yet they exhibit suboptimal performance. Accordingly, we investigate the "pick in the dark" setup for RLVR, which aims to select, without prior supervision, unlabeled samples that are most beneficial for training and worthy of annotation. Through systematic analysis, we demonstrate that smart picks hinge on a well-calibrated uncertainty estimator to enable strategic partitioning of data for adaptive training regimes. Building on this insight, we propose PivotTrace, a three-way data triage framework that leverages attention dynamics to trace metacognitive pivots during reasoning. By precisely quantifying uncertainty through pivot density, PivotTrace achieves automated data routing to synergistically maximize both annotation and training efficiency. Empirically, PivotTrace surpasses the fully supervised LRM with only 29.3% of samples annotated and 2.75x faster convergence.

2606.04499 2026-06-04 cs.SI cs.LG version update

Modeling and Interpreting Teamwork Dynamics in Cancer Care Outcome Prediction

建模与解释癌症护理结果预测中的团队协作动态

Yuhua Huang, Hsiao-Ying Lu, Kwan-Liu Ma

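Affiliations: University of California, Davis

AI summary: Uses collaboration networks derived from electronic health records, together with machine learning methods, to study how teamwork dynamics among healthcare professionals affect cancer patient survival prediction, and interprets the key network features.

Details
AI Chinese abstract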

癌症护理需要纵向方法,根据每个患者的需求随时间规划和实施治疗。虽然先前研究深入探讨了临床和人口统计学因素(如合并症和年龄)如何指导治疗规划,但对护理实施阶段的关注却少得多。然而,规划和实施都是基于团队的过程,依赖于多个医疗专业人员之间的协调努力。因此,这些协作实践中蕴含的人为因素对于优化患者结果至关重要。尽管重要性显著,但现有关于癌症护理中人为因素的文献有限,很少有研究调查护理团队内的协作如何在治疗过程中演变。为填补这一空白,本研究探讨通过电子健康记录系统捕获的医疗专业人员协作如何影响癌症患者结果,特别强调团队协作动态。我们将电子健康记录介导的医疗专业人员交互表示为网络,并应用机器学习方法识别这些协作结构中嵌入的患者生存预测信号。我们进一步通过指出与特定结果相关的网络特征和动态模式来解释模型预测。我们通过稳健性分析评估模型,确保发现稳定且不受训练中随机变异驱动。此外,我们的见解与医学文献中提出的假设一致,我们的结果为这些主张提供了基于经验数据的证据。总体而言,我们的工作提供了一个实用流程,利用协作的数字痕迹来评估和加强纵向团队医疗,为医疗实施中的数据驱动干预提供可操作的见解。

English abstract

Cancer care requires a longitudinal approach in which treatments are planned and delivered over time according to the needs of each individual patient. While prior research has thoroughly explored how clinical and demographic factors, such as comorbidities and age, inform treatment planning, far less attention has been devoted to the delivery phase of care. Yet planning and delivery are both team-based processes that depend on coordinated efforts among multiple healthcare professionals (HCPs). As such, the human factors embedded in these collaborative practices are crucial to optimizing patient outcomes. Despite this importance, the existing literature on human factors in cancer care is limited, and very few studies have investigated how collaboration within care teams evolves over the course of treatment. To fill this gap, this work examines how HCPs' collaboration, captured through electronic health record (EHR) systems, affects cancer patient outcomes, with particular emphasis on teamwork dynamics. We represent EHR-mediated HCP interactions as networks and apply machine learning methods to identify predictive signals of patient survival embedded in these collaborative structures. We further interpret model predictions by pinpointing network characteristics and dynamic patterns associated with particular outcomes. We evaluate our model through robustness analyses to ensure that the findings are stable and not driven by stochastic variation in training. Additionally, our insights align with hypotheses proposed in the medical literature, and our results provide empirical, data-driven evidence supporting these claims. Overall, our work contributes a practical workflow for leveraging digital traces of collaboration to evaluate and strengthen longitudinal team-based healthcare, offering actionable insights to guide data-informed interventions in healthcare delivery.

2606.04492 2026-06-04 cs.LG cs.GT version update

Episodic Memory Temporal Consistency for Cooperative Multi-Agent Reinforcement Learning

面向合作多智能体强化学习的 episodic 记忆时间一致性

Zicheng Zhao, Yu Lan, Chengzhengxu Li, Zhaohan Zhang, Xiaoming Liu

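Affiliations: Xi’an Jiaotong University; Queen Mary University of London

AI summary: To address reward sparsity and exploration bottlenecks in cooperative multi-agent reinforcement learning, proposes the Episodic Memory Temporal Consistency (EMTC) framework. A temporally consistent semantic embedder and a gating mechanism prevent representation collapse and filter out pseudo-successful trajectories, backed by a theoretical error bound; EMTC substantially outperforms existing methods on the SMAC and GRF benchmarks.

Comments Under Review

Details
AI Chinese abstract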

合作多智能体强化学习(MARL)经常遭受严重的奖励稀疏性和探索瓶颈。虽然 episodic 记忆机制通过重用高回报轨迹缓解了这些问题,但由于无约束的激励分布和语义表示崩溃,它们常常使智能体陷入局部最优。为了解决这个问题,我们提出了 Episodic Memory Temporal Consistency (EMTC),一个能够稳健构建并选择性利用历史经验的框架。EMTC 引入了两个协同组件:(1) 时间一致性语义嵌入器,它将对比学习与时间条件状态重建相结合,防止表示崩溃并实现精确的记忆检索;(2) 时间一致性门控机制,它根据时间一致性误差动态调节 episodic 激励。这个自适应门从伪成功轨迹中过滤误导信号,有效缓解 Q 值高估。我们提供了理论保证,建立了严格误差界,将可观测的时间一致性误差直接与底层轨迹最优性和表示质量联系起来。在 SMAC 和 GRF 基准上的广泛评估表明,EMTC 持续优于最先进的基线。值得注意的是,与最强的 episodic 基线相比,EMTC 在超难 SMAC 场景中实现了高达 24% 的绝对胜率提升,在 GRF 任务上平均提升 28%。

English abstract

Cooperative Multi-Agent Reinforcement Learning (MARL) frequently suffers from severe reward sparsity and exploration bottlenecks. While episodic memory mechanisms mitigate these issues by reusing high-return trajectories, they often trap agents in local optima due to unconstrained incentive distribution and semantic representation collapse. To address this, we propose Episodic Memory Temporal Consistency (EMTC), a framework that robustly constructs and selectively leverages historical experiences. EMTC introduces two synergistic components: (1) a Temporally Consistent Semantic Embedder that integrates contrastive learning with time-conditioned state reconstruction, preventing representation collapse and enabling precise memory retrieval; and (2) a Temporal Consistency Gating Mechanism that dynamically modulates episodic incentives based on temporal consistency error. This adaptive gate filters misleading signals from pseudo-successful trajectories, effectively mitigating Q-value overestimation. We provide theoretical guarantees, establishing a strict error bound that directly links the observable temporal consistency error to the underlying trajectory optimality and representation quality. Extensive evaluations on the SMAC and GRF benchmarks demonstrate that EMTC consistently outperforms state-of-the-art baselines. Notably, compared to the strongest episodic baseline, EMTC achieves absolute win-rate improvements of up to 24% in super-hard SMAC scenarios and an average improvement of 28% across GRF tasks.

2606.04486 2026-06-04 cs.CR cs.CL cs.LG stat.ML version update

Global Sketch-Based Watermarking for Diffusion Language Models

基于全局草图的扩散语言模型水印

Daniel Zhao

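Affiliations: Harvard University

AI summary: Proposes a global, vector-valued sketch watermark for masked diffusion language models that controls aggregate statistics of the text, enabling detection that is independent of local context.

Details
AI Chinese abstract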

语言模型的水印方法在自回归设置中已被广泛研究,其中令牌是顺序生成的。这些工作主要关注局部上下文方案,该方案根据前序令牌扰动下一个令牌的分布。在扩散语言模型中,许多未解析位置的分布被联合采样,使得整个序列的加性统计在生成过程中是可处理的。我们提出了一种针对掩码扩散语言模型的水印,该水印控制文本的全局向量草图表示。与上下文相关的水印相比,草图公式将检测与生成过程中看到的局部上下文解耦,从而产生一个顺序无关的统计量和一个不表现为简单令牌偏差的水印规则。我们分析了该方法的失真、合理性和鲁棒性。

English abstract

Watermarking methods for language models have been studied extensively in the autoregressive setting, where tokens are generated sequentially. These works largely focus on local-context schemes that perturb the next token's distribution as a function of its preceding tokens. In diffusion language models, distributions over many unresolved positions are jointly sampled, allowing additive statistics of the entire sequence to be tractable during generation. We propose a watermark for masked diffusion language models that controls a global, vector-valued sketch representation of the text. Compared to context-dependent watermarking, the sketch formulation decouples detection from the local contexts seen during generation, resulting in an order-agnostic statistic and a watermarking rule which does not manifest as a simple token bias. We analyze the distortion, soundness, and robustness properties of the method.

2606.04484 2026-06-04 cs.AI cs.LG cs.MA version update

AgentJet: A Flexible Swarm Training Framework for Agentic Reinforcement Learning

AgentJet:一种用于智能体强化学习的灵活群体训练框架

Qingxu Fu, Boyin Liu, Shuchang Tao, Zhaoyang Liu, Bolin Ding

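Affiliations: Tongyi Lab, Alibaba Group

AI summary: Proposes AgentJet, a decoupled multi-node swarm training framework that supports heterogeneous multi-model reinforcement learning, multi-task cocktail training, fault-tolerant execution, and live code iteration, and achieves a 1.5-10x training speedup through its context tracking module.

Comments Technical report, 27 pages

Details
AI Chinese abstract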

我们提出了AgentJet,一个用于大型语言模型(LLM)智能体强化学习的分布式群体训练框架。与将智能体运行与模型优化紧密耦合的集中式框架不同,AgentJet采用解耦的多节点架构,其中群体服务器节点托管可训练模型并在GPU集群上运行优化,而群体客户端节点在任意设备上执行任意智能体。这种设计提供了集中式框架难以支持的能力:(1)异构多模型强化学习,支持训练具有多个LLM作为大脑的异构多智能体团队;(2)具有隔离智能体运行时的多任务鸡尾酒训练;(3)容错执行,防止外部环境故障中断训练过程;(4)实时代码迭代,允许通过替换群体客户端节点在训练期间编辑智能体。为了支持多模型、多轮和多智能体设置中的高效强化学习,AgentJet引入了一个带有时间线合并的上下文跟踪模块,该模块合并冗余上下文并实现1.5-10倍的训练加速。最后,AgentJet引入了一个自动化研究系统,该系统以研究主题为输入,并在大规模集群上自主进行长期、多天的强化学习研究。通过利用群体架构,该系统在无需人工干预的情况下复现了强化学习研究人员的关键探索工作流程。

English abstract

We present AgentJet, a distributed swarm training framework for large language model (LLM) agent reinforcement learning. Unlike centralized frameworks that tightly couple agent rollouts with model optimization, AgentJet adopts a decoupled multi-node architecture in which swarm server nodes host trainable models and run optimization on GPU clusters, whereas swarm client nodes execute arbitrary agents on arbitrary devices. This design provides capabilities that are difficult to support in centralized frameworks: (1) heterogeneous multi-model reinforcement learning, enabling the training of heterogeneous multi-agent teams with multiple LLMs as brains; (2) multi-task cocktail training with isolated agent runtimes; (3) fault-tolerant execution that prevents external environment failures from interrupting the training process; and (4) live code iteration, which allows agents to be edited during training by replacing swarm client nodes. To support efficient RL in multi-model, multi-turn, and multi-agent settings, AgentJet introduces a context tracking module with timeline merging, which consolidates redundant context and achieves a 1.5-10x training speedup. Finally, AgentJet introduces an automated research system that takes a research topic as input and autonomously conducts long-horizon, multi-day RL studies on large-scale clusters. By leveraging the swarm architecture, this system reproduces key exploratory workflows of RL researchers without human intervention during execution.

2606.04476 2026-06-04 cs.LG math.OC math.ST stat.ML stat.TH version update

When Both Layers Learn: Training Dynamics of Representing Linear Models via ReLU Networks

当两层都学习:通过ReLU网络表示线性模型的训练动力学

Berk Tinaz, Changzhi Xie, Mahdi Soltanolkotabi

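Affiliations: Department of Electrical and Computer Engineering; Department of Computer Science

AI summary: Studies the gradient descent dynamics of jointly training both layers of a one-hidden-layer ReLU network to fit a linear target function; a three-phase analysis proves that, from random initialization, gradient descent converges to a global minimizer at a linear rate with optimal sample complexity.

Comments 47 pages, 8 figures, published at the 39th Annual Conference on Learning Theory (COLT), 2026

Details
AI Chinese abstract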

在本文中,我们研究了联合训练单隐层ReLU网络的两层以拟合线性目标函数的梯度下降动力学。具体来说,我们考虑一个可实现设置,其中输入从高斯分布中独立同分布采样,标签遵循一个植入的线性模型。这种风格化的框架捕捉了逆问题和某些自编码器模型中端到端训练的关键特征。尽管其表面简单,但动力学仍然难以理解,部分原因是损失景观包含多个非严格鞍点,这使得不清楚为什么从随机初始化开始的梯度下降能够可靠地逃离坏的驻点区域。我们提供了优化景观的详细刻画,并证明从适度小的随机初始化开始——同时训练两层——梯度下降以线性速率收敛到全局最小化器,并具有阶次最优的样本复杂度。我们的分析通过三个阶段追踪轨迹:对齐阶段,其中隐藏权重逐渐与植入方向对齐,而输出权重保持正确的符号模式;增长阶段,其中两层的范数增加同时保持对齐;以及局部细化阶段,其中对齐的神经元快速收敛到植入方向,产生快速的局部收敛。为了严格证明梯度下降避免非严格鞍点,我们为端到端动力学开发了轨迹级控制论证。此外,我们建立了沿整个轨迹成立的新颖的均匀集中结果,这对于获得阶次最优的样本复杂度至关重要。我们通过一系列配置的大量实验验证了我们的理论。

English abstract

In this paper, we study the gradient descent dynamics for jointly training both layers of a one-hidden-layer ReLU network to fit a linear target function. Concretely, we consider a realizable setting where inputs are drawn i.i.d. from a Gaussian distribution and labels follow a planted linear model. This stylized framework captures salient features of end-to-end training in inverse problems and certain auto-encoder models. Despite its apparent simplicity, the dynamics remain poorly understood, in part because the loss landscape contains multiple non-strict saddle points, making it unclear why gradient descent from random initialization reliably escapes bad stationary regions. We provide a detailed characterization of the optimization landscape and prove that gradient descent from a moderately small random initialization, with both layers trained simultaneously, converges to a global minimizer at a linear rate with order-wise optimal sample complexity. Our analysis tracks the trajectory through three phases: an alignment phase in which hidden weights progressively align with the planted direction while the output weights maintain the correct sign pattern; a growth phase in which the norms of both layers increase while preserving alignment; and a local refinement phase in which the aligned neurons rapidly converge to the planted direction, yielding fast local convergence. To rigorously show that gradient descent avoids non-strict saddles, we develop trajectory-level control arguments for the end-to-end dynamics. In addition, we establish novel uniform concentration results that hold along the entire trajectory and are essential for obtaining order-wise optimal sample complexity. We corroborate our theory with extensive experiments across a range of configurations.
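
The setting is realizable because a linear function can be written exactly as the difference of two ReLU units, since z = ReLU(z) - ReLU(-z). The sketch below only checks that identity for a two-neuron network; it is not the paper's training procedure, and the weight vector and test inputs are hypothetical.

```python
def relu(z):
    return max(z, 0.0)


def two_neuron_relu_net(w, x):
    """One hidden layer with hidden weights +w and -w, output weights +1 and -1."""
    pre = sum(wi * xi for wi, xi in zip(w, x))
    return 1.0 * relu(pre) + (-1.0) * relu(-pre)


w = [0.5, -2.0, 1.0]  # hypothetical planted direction
inputs = ([1.0, 0.0, 3.0], [0.0, 1.0, 0.0], [-1.0, 2.0, -0.5])
outputs = [two_neuron_relu_net(w, x) for x in inputs]
targets = [sum(wi * xi for wi, xi in zip(w, x)) for x in inputs]
print(outputs, targets)
```

Both lists agree on every input, so a global minimizer with zero loss exists; the question the paper studies is whether gradient descent on both layers actually reaches one.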

2606.04473 2026-06-04 cs.LG cs.AI version update

ChessMimic: Per-Rating Transformer Models for Human Move, Clock, and Outcome Prediction in Online Blitz Chess

ChessMimic: 用于在线闪电棋中人类走棋、时钟和结果预测的按等级划分的Transformer模型

Thomas Johnson

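Affiliations: nascent.xyz

AI summary: Proposes ChessMimic, a system of three small encoder-only transformers for move, thinking-time, and outcome prediction, trained separately per Elo rating band for finer skill calibration. On Lichess blitz data, its move prediction accuracy exceeds Maia-2, outcome prediction reaches an AUC of 0.78, and the clock model gives a usable but not state-of-the-art thinking-time signal.

Details
AI Chinese abstract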

我们提出了ChessMimic,一个由三个小型编码器Transformer组成的系统——分别用于走棋、思考时间和结果预测——以局面、最近走棋历史、玩家等级和时钟状态为条件。我们为每100 Elo等级区间拟合每个模型的独立实例,以参数效率换取更精细的技能校准。在Lichess Rated Blitz游戏的一个月保留切片上,ChessMimic的人类走棋预测准确率在每个Elo区间都优于Maia-2。与Maia-3相比,我们的9M参数模型的准确率介于Maia-3-5M和Maia-3-23M之间,且没有几何注意力偏置的额外复杂性。除了走棋匹配模型,我们还训练了一个游戏结果模型,该模型不仅以局面为条件,还以玩家等级、时间控制和剩余时钟时间为条件。结果模型在样本外达到了0.78的AUC,击败了Maia-2以及基于子力、等级和时钟时间的逻辑回归。最后,我们训练了一个时钟模型来预测人类思考时间。该时钟模型在ALLIE风格过滤器下提供了可用但非最优的每步思考时间信号(Pearson r = 0.41,Spearman rho = 0.50,MAE 4.10秒,而ALLIE报告的r = 0.70),残差差距集中在每位置桶的锐度上,而非桶边际校准。公开演示在1e4.ai,我们在GitHub上发布了代码、每个区间的权重以及C++数据过滤管道代码。

English abstract

We present ChessMimic, a system of three small encoder-only transformers (for move, thinking-time, and outcome prediction) conditioned on the position, recent move history, player rating, and clock state. We fit a separate instance of each model per 100-Elo rating band, trading parameter efficiency for sharper per-skill calibration. On a held-out month-wide slice of Lichess Rated Blitz games, ChessMimic's human move prediction accuracy outperforms Maia-2 in every Elo band. Compared to Maia-3, our 9M parameter model's accuracy sits between Maia-3-5M and Maia-3-23M without the additional complexity of Geometric Attention Bias. In addition to the move matching model, we also train a game outcome model that conditions not only on the position, but also on player ratings, time control, and remaining clock times. The outcome model achieves an AUC of 0.78 out of sample, beating Maia-2 as well as logistic regressions based on material, ratings, and clock time. Finally, we train a clock model that predicts human thinking times. The clock model provides a usable but non-SOTA per-ply think-time signal under ALLIE-style filters (Pearson r = 0.41, Spearman rho = 0.50, MAE 4.10 s, against ALLIE's reported r = 0.70), with the residual gap concentrated in per-position bucket sharpness rather than bucket-marginal calibration. A public demo is at 1e4.ai and we release code, per-band weights, and the C++ data-filter pipeline code on GitHub.

2606.04468 2026-06-04 cs.LG cs.AI cs.NE math.OC version update

ParetoPilot: Zero-Surrogate Offline Multi-Objective Optimization via Infer-Perturb-Guide Diffusion

ParetoPilot:通过推断-扰动-引导扩散实现零代理离线多目标优化

Ruiqing Sun, Sen Yang, Dawei Feng, Bo Ding, Yijie Wang, Huaimin Wang

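Affiliations: Nanyang Technological University

AI summary: Proposes ParetoPilot, a zero-surrogate diffusion framework that needs no external surrogate model. Its Infer-Perturb-Guide engine, applied within the unconditional denoising steps, implicitly infers the objective direction and orthogonalizes a parallel gravity field and an edgeness-aware repulsive force, yielding Pareto-optimal designs for offline multi-objective optimization.

Details
AI Chinese abstract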

离线多目标优化旨在基于静态数据集发现新颖的帕累托最优设计,而无需昂贵的环境交互。尽管最近的生成方法取得了显著成功,但它们主要依赖外部代理模型。这种依赖引入了显著的计算开销,遭受欺骗性评估,并偏离了联合训练主流生成模型与条件的流行范式。为了解决这些瓶颈,我们提出了ParetoPilot,一种用于离线多目标优化的新颖零代理扩散框架。ParetoPilot充分利用预训练扩散模型中固有的条件先验。其核心是引入了推断-扰动-引导引擎,该引擎无缝地插入在反向生成过程的无条件去噪步骤中。首先,通过匹配条件噪声预测和无条件噪声预测,隐式推断瞬时目标方向。其次,数学上正交化一个用于严格收敛的平行引力场和一个用于相互多样性的边缘感知排斥力,从而生成一个动态退火的扰动向量。最后,这个扰动目标通过标准的无分类器引导无缝地引导生成过程。在51个任务上的大量实验表明,ParetoPilot优于14个最先进的基于代理和逆生成基线。通过消除辅助代理训练,我们的方法在实现超体积改进和鲁棒帕累托前沿覆盖的同时,保护了数据隐私。

English abstract

Offline multi-objective optimization (Offline MOO) aims to discover novel Pareto-optimal designs based on static datasets without expensive environment interactions. While recent generative methods have achieved notable success, they predominantly rely on external surrogate models. This dependency introduces significant computational overhead, suffers from deceptive evaluations, and deviates from the prevailing paradigm of jointly training mainstream generative models with conditions. To address these bottlenecks, we propose ParetoPilot, a novel zero-surrogate diffusion framework for offline MOO. ParetoPilot fully leverages the conditional priors inherently embedded within pre-trained diffusion models. At its core, the framework introduces the Infer-Perturb-Guide (IPG) engine, which is seamlessly interleaved within the unconditional denoising steps of the reverse generation process. First, it implicitly infers the instantaneous objective direction by matching conditional and unconditional noise predictions. Next, it mathematically orthogonalizes a parallel gravity field for strict convergence and an edgeness-aware repulsive force for mutual diversity, creating a dynamically annealed perturbation vector. Finally, this perturbed target seamlessly steers the generation process via standard Classifier-Free Guidance (CFG). Extensive experiments across 51 tasks demonstrate that ParetoPilot outperforms 14 state-of-the-art surrogate-based and inverse generative baselines. By eliminating auxiliary proxy training, our approach preserves data privacy while achieving hypervolume improvement and robust Pareto front coverage.
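
Two standard building blocks named in this abstract can be sketched in a few lines: the classifier-free guidance combination of conditional and unconditional noise predictions, and the removal of one vector's component along another (a single Gram-Schmidt step). This is an illustration under assumptions, not the paper's IPG engine: the guidance form `eps_u + scale * (eps_c - eps_u)` is one common convention, and how ParetoPilot actually builds its gravity and repulsion terms is not stated here, so all names and numbers are hypothetical.

```python
def cfg_noise(eps_uncond, eps_cond, scale):
    """Classifier-free guidance: move from the unconditional prediction toward the conditional one."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def orthogonal_component(force, direction):
    """Remove from `force` its projection onto `direction` (one Gram-Schmidt step)."""
    dot_fd = sum(f * d for f, d in zip(force, direction))
    dot_dd = sum(d * d for d in direction)
    if dot_dd == 0.0:
        return list(force)  # nothing to project onto
    return [f - (dot_fd / dot_dd) * d for f, d in zip(force, direction)]


eps_u, eps_c = [0.0, 0.0], [1.0, 2.0]          # hypothetical noise predictions
guided = cfg_noise(eps_u, eps_c, scale=2.0)

# The difference between the two predictions serves as an implied direction.
direction = [c - u for u, c in zip(eps_u, eps_c)]
repulsion = orthogonal_component([1.0, 1.0], direction)
residual = sum(r * d for r, d in zip(repulsion, direction))
print(guided, repulsion, residual)
```

After the projection, the repulsive term has essentially zero inner product with the implied direction, so adding it cannot push a sample forward or backward along that direction; that is the usual motivation for orthogonalizing a diversity force against a convergence force.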

2606.04460 2026-06-04 cs.CR cs.AI cs.LG version update

CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities

CyberGym-E2E:面向AI代理端到端网络安全能力的可扩展真实世界基准

Tianneng Shi, Robin Rheem, Dongwei Jiang, Mona Wang, Francisco De La Riega, Zhun Wang, Jingzhi Jiang, Alexander Cheung, Sean Tai, Jonah Cha, Jianhong Tu, Gabriel Han, Chenguang Wang, Jingxuan He, Wenbo Guo, Dawn Song

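Affiliations: Stanford University; UC Berkeley

AI summary: Proposes CyberGym-E2E, a large-scale, realistic end-to-end cybersecurity benchmark that uses an automated pipeline to turn open-source vulnerability data into evaluation environments and evaluates AI agents across the full lifecycle of vulnerability discovery, PoC generation, and patch generation.

Comments ICML 2026

Details
AI Chinese abstract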

人工智能有潜力通过使系统能够自主检测、分析和修复软件漏洞来改变网络安全。然而,现有对AI系统的网络安全评估在规模或范围上有限,未能捕捉真实世界软件漏洞发现和修复的端到端生命周期。为了解决这一差距,我们提出了CyberGym-E2E,一个大规模、真实的端到端网络安全基准,全面评估AI代理在漏洞发现、PoC生成和补丁生成整个生命周期中的能力。CyberGym-E2E全面且可扩展,因为我们构建了一个自动化的、代理增强的流水线,用于将开源漏洞数据转化为真实的评估环境。目前,该基准包含139个不同开源项目中的920个真实世界漏洞。

English abstract

AI has the potential to transform cybersecurity by enabling systems that can autonomously detect, analyze, and remediate software vulnerabilities. However, existing cybersecurity evaluations of AI systems are limited in scale or scope, and fail to capture the end-to-end lifecycle of real-world software vulnerability discovery and remediation. To address this gap, we propose CyberGym-E2E, a large-scale and realistic end-to-end cybersecurity benchmark that comprehensively evaluates AI agents' abilities across the full lifecycle of vulnerability discovery, PoC generation, and patch generation. CyberGym-E2E is comprehensive and scalable, as we build an automated, agent-enhanced pipeline for transforming open-source vulnerability data into realistic evaluation environments. Currently, the benchmark consists of 920 real-world vulnerabilities across 139 different open-source projects.

2606.04453 2026-06-04 cs.CV cs.LG version update

Radiomic Feature Selection Using Gradient Loss of Deep Neural Network for Lung Cancer Stage Detection

基于深度神经网络梯度损失的放射组学特征选择用于肺癌分期检测

Hina Shakir, Mohammad Mohatram, Javeed Hussain, Syed Rizwan Ali, Muhammad Irfan Memon

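Affiliations: Department of Software Engineering, Bahria University; Global College of Engineering and Technology; Software Engineering & Business Incubation Center, Bahria University

AI summary: Proposes the GL-RFE framework, which uses gradient sensitivity analysis of a deep neural network to recursively eliminate low-contribution features, selecting the top 15 of 106 radiomic features for early- versus advanced-stage lung cancer classification with 90.22% accuracy.

Details
Journal ref J. Vis. Exp. (230), e70181 (2026)
AI Chinese abstract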

放射组学能够从医学图像中提取定量成像生物标志物,已成为计算机辅助癌症诊断的重要工具。然而,放射组学数据集通常具有高维小样本的特点,使得特征选择成为构建可靠预测模型的关键步骤。本研究提出了一种梯度损失递归特征消除(GL-RFE)框架,该框架集成深度神经网络的梯度敏感性分析,以识别对肺癌分期检测最具影响力的放射组学特征。使用3D Slicer平台的PyRadiomics扩展从胸部计算机断层扫描(CT)中提取了总共106个放射组学特征。所提出的方法通过计算网络损失相对于输入特征的梯度来评估特征重要性,并递归消除贡献最小的特征。最终选出的前15个放射组学特征用于训练深度神经网络分类器,以区分早期和晚期肺癌。该框架在测试数据集上取得了强劲的分类性能,准确率为90.22%,精确率为90.10%,召回率为90.24%,F1分数为90.16%。可视化分析(包括相关性热图和分布图)进一步证实了特征冗余减少和类别可分性提高。与传统特征选择技术相比,GL-RFE有效捕捉了非线性特征交互并增强了模型泛化能力。所提出的协议为基于放射组学的癌症分期检测提供了一种可重复且可解释的方法,特别适用于高维小样本生物医学数据集,并在基因组学和多模态临床分析等其他领域具有潜在应用价值。

English abstract

Radiomics enables extraction of quantitative imaging biomarkers from medical images and has become an important tool for computer-aided cancer diagnosis. However, radiomics datasets are typically high-dimensional with limited samples, making feature selection a critical step for building reliable predictive models. This study proposes a Gradient-Loss Recursive Feature Elimination (GL-RFE) framework that integrates gradient sensitivity analysis from a deep neural network to identify the most influential radiomic features for lung cancer stage detection. A total of 106 radiomic features were extracted from chest Computed Tomography (CT) scans using the PyRadiomics extension of the 3D Slicer platform. The proposed method evaluates feature importance by computing gradients of the network loss with respect to input features and recursively eliminates features with minimal contribution. The resulting top-15 radiomic features are used to train a deep neural network classifier for distinguishing early-stage and advanced-stage lung cancer. The proposed framework achieves strong classification performance, with accuracy of 90.22%, precision of 90.10%, recall of 90.24%, and F1-score of 90.16% on the test dataset. Visualization analyses, including correlation heat maps and distribution plots, further confirm reduced feature redundancy and improved class separability. Compared to conventional feature selection techniques, GL-RFE effectively captures nonlinear feature interactions and enhances model generalization. The presented protocol provides a reproducible and interpretable methodology for radiomics-based cancer stage detection and is particularly suitable for high-dimensional, small-sample biomedical datasets, with potential applications in other domains such as genomics and multimodal clinical analysis.
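
The elimination loop described above (score each feature by the gradient of the loss with respect to the input, drop the weakest, retrain, repeat) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' GL-RFE implementation: a logistic model stands in for the deep neural network, importance is taken as the mean absolute input gradient, and the function names, hyperparameters, and toy data are all hypothetical.

```python
import math


def predict(w, b, x):
    """Sigmoid output of the stand-in logistic model."""
    return 1.0 / (1.0 + math.exp(-(sum(wj * xj for wj, xj in zip(w, x)) + b)))


def train_logistic(X, y, epochs=200, lr=0.5):
    """Fit the logistic model by batch gradient descent (stand-in for the DNN)."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict(w, b, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j] / n
            grad_b += err / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def input_gradient_importance(X, y, w, b):
    """Mean |dLoss/dx_j| over samples; for this model dLoss/dx_j = (p - y) * w_j."""
    scores = [0.0] * len(w)
    for xi, yi in zip(X, y):
        err = predict(w, b, xi) - yi
        for j in range(len(w)):
            scores[j] += abs(err * w[j]) / len(X)
    return scores


def gradient_rfe(X, y, keep):
    """Retrain, score features by input-gradient magnitude, drop the weakest, repeat."""
    active = list(range(len(X[0])))
    while len(active) > keep:
        X_sub = [[row[j] for j in active] for row in X]
        w, b = train_logistic(X_sub, y)
        scores = input_gradient_importance(X_sub, y, w, b)
        active.pop(scores.index(min(scores)))
    return active


# Toy data: feature 0 separates the classes, feature 1 is uninformative,
# feature 2 is constant.
X = [[2.0, 0.1, 0.0], [1.5, -0.1, 0.0], [-2.0, 0.1, 0.0], [-1.5, -0.1, 0.0]]
y = [1, 1, 0, 0]
selected = gradient_rfe(X, y, keep=1)
print(selected)
```

On the toy data only the informative feature (index 0) survives. With a real network the input gradients would come from automatic differentiation rather than the closed form used here.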

2606.04451 2026-06-04 cs.LG 版本更新

On Out-of-sample Embedding in UMAP

UMAP中的样本外嵌入

Mohammad Tariqul Islam, Jason W. Fleischer

发表机构 * Media Lab, Massachusetts Institute of Technology(媒体实验室,麻省理工学院) Electrical and Computer Engineering, Princeton University(电子与计算机工程,普林斯顿大学)

AI总结 针对UMAP在添加新样本时产生的排斥效应,通过优化原始k近邻图中的成对交互,提出参数化UMAP方法以改善嵌入质量。

Comments 22 pages, 16 figures

详情
AI中文摘要

邻域嵌入算法通过在低维空间中构建等价的图表示来揭示高维数据中的相关性。一种日益流行的算法是统一流形逼近与投影(UMAP),它使用代数拓扑来映射两个空间之间的距离。虽然它在许多类型的数据集上表现良好,但UMAP在将样本外点添加到现有映射时存在困难。特别是,UMAP通常将新点放置在所发现簇的周边,而不是与它们的相关邻居一起放在簇的内部。在这里,我们通过优化原始k近邻图中的成对交互来克服这种“排斥效应”。此外,我们表明参数化UMAP比非参数算法获得更好的嵌入,特别是当数据变得更复杂时(例如,医学图像)。我们还表明,当使用参数化UMAP嵌入数据时,排斥效应自然得到缓解。我们使用可信度、最近邻分类器以及分析嵌入中的吸引力和排斥力来表征不同的UMAP方法。

英文摘要

Neighbor embedding algorithms reveal correlations in high-dimensional data by constructing an equivalent graph representation in a lower-dimensional space. An increasingly popular algorithm is Uniform Manifold Approximation and Projection (UMAP), which uses algebraic topology to map distances between the two spaces. While it works well on many types of data sets, UMAP has trouble adding out-of-sample points to a pre-existing mapping. In particular, UMAP often places new points on the periphery of the found clusters, rather than in their interiors with their correlated neighbors. Here, we overcome this "repulsion effect" by optimizing pairwise interactions within the original k-nearest-neighbor graph. Moreover, we show that parameterizing UMAP obtains better embeddings than non-parametric algorithms, particularly as the data gets more complex (e.g., medical images). We also show that the repulsion effect is naturally mitigated when a parameterized UMAP is employed to embed the data. We characterize different UMAP approaches using trustworthiness and nearest-neighbor classifiers, and by analyzing attractive and repulsive forces in the embeddings.

2606.04446 2026-06-04 cs.DC cs.LG 版本更新

D^2SD: Accelerating Speculative Decoding with Dual Diffusion Draft Models

D^2SD: 使用双重扩散草稿模型加速推测解码

Liyuan Zhang, Jiarui Zhang, Jinwei Yao, Ran Yan, Yuchen Yang, Jiahao Zhang, Tongkai Yang, Yi Wu, Binhang Yuan

发表机构 * Peking University(北京大学) Tsinghua University(清华大学) HKUST(香港科技大学) UIUC(伊利诺伊大学厄巴纳-香槟分校) Ant Group(蚂蚁集团)

AI总结 提出D^2SD框架,通过双重扩散草稿模型和置信度引导的前缀树,提升推测解码的接受率,优于现有扩散方法和自回归推测解码基线。

详情
AI中文摘要

推测解码通过草拟多个令牌并在单次目标模型前向传递中验证它们,加速自回归大语言模型推理。最近的基于扩散的草稿模型并行生成整个令牌块,但通常每次验证只提交单个草稿序列:一旦出现第一个不匹配,所有后续草稿令牌被丢弃,导致接受率有限。简单地对更多草稿候选序列进行批处理只会带来边际改进,因为冗余或位置不当的分支增加了草拟和验证的成本,而没有成比例地增加接受的令牌数量。我们提出D^2SD,一种双重扩散草稿推测解码框架,将候选组织成置信度引导的前缀树,其中第一个扩散草稿器生成一个块以及每个位置的置信度分数,用于识别最可能的拒绝边界并选择前K个前缀范围进行恢复;第二个可变前缀扩散草稿器在每个选定前缀处重新锚定,并在一次批处理中提出替代延续;得到的共享前缀候选通过级联注意力联合验证。实验表明,D^2SD在底层扩散方法和强自回归推测解码基线上均有明显改进。

英文摘要

Speculative decoding accelerates autoregressive large language model inference by drafting multiple tokens and verifying them in a single target-model forward pass. Recent diffusion-based drafters generate an entire block of tokens in parallel but usually commit to a single draft sequence per verification: once the first mismatch occurs, all subsequent draft tokens are discarded, resulting in a limited acceptance rate. Naively batching more draft candidate sequences only introduces a marginal improvement, as redundant or poorly placed branches increase the cost of drafting and verification without proportionally increasing the number of accepted tokens. We propose D^2SD, a dual diffusion draft speculative decoding framework that organizes candidates into a confidence-guided prefix tree, where the first diffusion drafter generates a block along with per-position confidence scores that are used to identify the most likely rejection boundary and select the top-K prefix ranges for recovery; the second variable-prefix diffusion drafter re-anchors at each selected prefix and proposes alternative continuations in one batched pass; the resulting shared-prefix candidates are jointly verified via cascade attention. Empirically, D^2SD shows clear improvements over both the underlying diffusion approach and strong autoregressive speculative decoding baselines.
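
The "discard everything after the first mismatch" rule mentioned above can be made concrete with a short sketch. It assumes greedy (exact-match) verification and a single draft sequence; the function name `accept_prefix` and the token values are hypothetical and are not taken from the paper.

```python
def accept_prefix(draft_tokens, target_tokens):
    """Return the tokens committed after one verification pass.

    draft_tokens:  n tokens proposed by the drafter.
    target_tokens: n + 1 tokens from the target model, where entry i is the
                   target's own choice given the draft prefix before position i.
    """
    committed = []
    for i, token in enumerate(draft_tokens):
        if token != target_tokens[i]:
            # First mismatch: keep the target's token and drop the rest of the draft.
            committed.append(target_tokens[i])
            return committed
        committed.append(token)
    # The whole draft matched, so the target's extra token is also committed.
    committed.append(target_tokens[len(draft_tokens)])
    return committed
```

With a draft of four tokens that diverges at the third position, only three tokens are committed, which is why a single late mismatch wastes the remaining draft work and why branching at likely rejection boundaries can help.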

2606.04445 2026-06-04 cs.LG cs.AI math.ST stat.TH 版本更新

RowNet: A Memory Transformer for Tabular Regression

RowNet: 用于表格回归的记忆Transformer

Askat Rakhymbekov, Gulshat Muhametjanova

发表机构 * Department of Applied Mathematics and Informatics(应用数学与信息学系) Kyrgyz-Turkish Manas University(吉尔吉斯-土耳其马纳斯大学)

AI总结 针对房地产估值中表格回归问题,提出RowNet,一种基于检索的神经网络架构,通过记忆库中的成对相似性特征、目标一致性增强和混合专家模块实现价格预测。

Comments Retrieval-based neural architecture for real estate valuation. Related to TabR (arXiv:2307.14338) and retrieval-augmented tabular learning

详情
AI中文摘要

房地产估值是一个结构化回归问题,其中价格受异构特征类型、稀疏区域效应、非线性交互以及可比房产的实际逻辑影响。标准多层感知器将每一行视为孤立向量,必须仅从监督中学习局部性、尺度敏感性和类别匹配。梯度提升决策树提供了强大的表格基线,但其以特征为中心的分裂机制并未显式建模相似历史观测的检索。本文提出了RowNet,一种用于房地产每平方米价格预测的基于检索的神经网络架构。RowNet通过针对标记属性记忆库的成对相似性特征来表示查询属性。第一检索层从仅特征相似性中估计粗略目标。第二层通过目标一致性特征增强记忆比较,并使用多个学习注意力头检索互补的可比集。最终的混合专家模块结合了学习门控、残差校正、熵正则化和头多样性正则化以产生预测。

英文摘要

Real estate valuation is a structured regression problem in which prices are governed by heterogeneous feature types, sparse regional effects, nonlinear interactions, and the practical logic of comparable properties. Standard multilayer perceptrons treat each row as an isolated vector and must learn locality, scale sensitivity, and categorical matching from supervision alone. Gradient-boosted decision trees provide strong tabular baselines, but their feature-centric splitting mechanism does not explicitly model the retrieval of similar historical observations. This paper presents RowNet, a retrieval-based neural architecture for real estate price-per-square-meter prediction. RowNet represents a query property through pairwise similarity features against a memory bank of labeled properties. A first retrieval layer estimates a coarse target from feature-only similarities. A second layer augments the memory comparison with target-consistency features and uses multiple learned attention heads to retrieve complementary comparable sets. A final mixture-of-experts module combines learned gating, residual correction, entropy regularization, and head-diversity regularization to produce the prediction.
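
As a rough illustration of retrieving comparables from a memory bank, the sketch below predicts a target as a similarity-weighted average of stored targets. It is a deliberately simplified stand-in, not RowNet itself: the negative squared distance as similarity, the single retrieval step, and the names `retrieve_and_predict`, `memory_features`, and `memory_targets` are all assumptions.

```python
import math


def retrieve_and_predict(query, memory_features, memory_targets, temperature=1.0):
    """Softmax-weighted average of memory targets, weighted by similarity to the query."""
    # Similarity = negative squared Euclidean distance (an illustrative choice).
    scores = [
        -sum((q - m) ** 2 for q, m in zip(query, row)) / temperature
        for row in memory_features
    ]
    top = max(scores)  # subtract the max for a numerically stable softmax
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    prediction = sum(w * t for w, t in zip(weights, memory_targets))
    return prediction, weights
```

A lower `temperature` concentrates the weights on the closest rows, while a higher one averages over a broader set of comparables.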

2606.04444 2026-06-04 eess.IV cs.LG 版本更新

Scaling Datasets for Multi-Sensor, Multi-Agent, and Multi-Domain Learning in Autonomous Systems

面向自主系统中多传感器、多智能体与多领域学习的数据集扩展

R. Spencer Hallyburton, David Hunt, Miroslav Pajic

发表机构 * Department of Electrical and Computer Engineering, Duke University(电气与计算机工程系,杜克大学)

AI总结 提出基于AVstack和CARLA的模块化数据集生成流程,创建TB级带真实标签的多域数据,支持单/多智能体与灵活传感器配置,用于特定应用训练和协作自主研究。

详情
AI中文摘要

现有数据集无法支持多智能体、多传感器或多领域自主系统中的大规模学习,而多样性和协调性在这些系统中至关重要。我们提出了一种模块化数据集生成流程,利用AVstack框架和CARLA模拟器,为地面、空中和基础设施系统创建TB级、带有真实标签的数据。该流程支持单智能体和多智能体配置,配备灵活的传感器套件,能够在具有挑战性的条件下进行可控实验。代表性的感知与融合研究表明,生成的数据可以支持特定应用的训练和协作自主性。

英文摘要

Existing datasets cannot support large-scale learning in multi-agent, multi-sensor, or multi-domain autonomy, where diversity and coordination are essential. We present a modular dataset generation pipeline that creates terabyte-scale, ground-truth-labeled data for ground, aerial, and infrastructure-based systems using the AVstack framework and CARLA simulator. Supporting single- and multi-agent configurations with flexible sensor suites, the pipeline enables controllable experimentation across challenging conditions. Representative perception and fusion studies show how generated data can support application-specific training and collaborative autonomy.

2606.04438 2026-06-04 cs.LG cs.AI 版本更新

LoopMoE: Unifying Iterative Computation with Mixture-of-Experts for Language Modeling

LoopMoE:统一迭代计算与混合专家模型用于语言建模

Wenkai Chen, Tianshu Li, Wenyong Huang, Yichun Yin, Lifeng Shang, Chengwei Qin

发表机构 * Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)) Huawei Technologies Co.,Ltd.(华为技术有限公司)

AI总结 提出LoopMoE,通过迭代自适应层归一化和容量平衡策略,在相同参数和FLOPs下,循环MoE语言模型在多个基准上优于标准MoE。

详情
AI中文摘要

混合专家模型(MoE)和循环架构分别沿着参数容量和有效深度两个正交维度扩展模型。然而,主流的循环架构依赖于密集主干,将参数数量与每个token的FLOPs耦合,这使得在匹配预算下无法隔离迭代计算的效果。为此,我们提出了LoopMoE,一种循环MoE语言模型,通过两种设计将稀疏路由与迭代权重共享计算相结合。第一种是IterAdaLN,它通过联合以迭代索引和每个token隐藏状态为条件的调制信号来解决权重共享对称性。第二种是一种容量平衡策略,恢复了经过良好调整的非循环参考模型的注意力到FFN活跃参数比率。这些设计共同实现了在相同总参数、每个token FLOPs和活跃子层比率下,循环MoE与标准MoE的首次严格受控的头对头评估。在3B规模下,LoopMoE在9个下游基准测试中的8个上优于标准MoE,平均提升超过1个点。在9B规模下,LoopMoE继续优于匹配的标准MoE,表明架构优势在更大规模下持续存在。我们的工作建立了稀疏性和循环性的受控综合,并为循环语言模型指明了一个有前景的方向。

英文摘要

Mixture-of-Experts (MoE) and looped architectures scale models along two orthogonal axes, namely parameter capacity and effective depth. However, mainstream looped architectures rely on dense backbones that couple parameter count with per-token FLOPs, which makes it impossible to isolate the effect of iterative computation under matched budgets. To this end, we present LoopMoE, a looped MoE language model that integrates sparse routing with iterative weight-shared computation through two designs. The first is IterAdaLN, which resolves weight-sharing symmetry via a modulation signal jointly conditioned on the iteration index and the per-token hidden state. The second is a capacity-balancing strategy that recovers the attention-to-FFN active parameter ratio of well-tuned non-looped references. Together, these designs enable the first strictly controlled, head-to-head evaluation of a looped MoE against a Vanilla MoE under identical total parameters, per-token FLOPs, and active sublayer ratios. At the 3B scale, LoopMoE outperforms the Vanilla MoE on 8 of 9 downstream benchmarks with an average improvement exceeding 1 point. At the 9B scale, LoopMoE continues to outperform the matched Vanilla MoE, indicating that the architectural gain persists at larger scale. Our work establishes a controlled synthesis of sparsity and recurrence, and suggests a promising direction for looped language models.

2606.04434 2026-06-04 cs.CV cs.LG 版本更新

Hyper-ICL: Attention Calibration with Hyperbolic Anchor Distillation for Multimodal In-Context Learning

Hyper-ICL:基于双曲锚点蒸馏的注意力校准用于多模态上下文学习

Niloufar Alipour Talemi, Hossein Kashiani, Fatemeh Afghah

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出Hyper-ICL,一种轻量级训练框架,通过低秩logit适配器和双曲锚点蒸馏损失校准注意力分布,无需推理时提供上下文示例即可重建演示效果,提升多模态上下文学习的准确性和稳定性。

Comments Accepted at the 43rd International Conference on Machine Learning (ICML 2026)

详情
AI中文摘要

多模态上下文学习已成为多模态大语言模型的一种实用推理范式,其中少量交错的图像-文本上下文示例条件化模型以解决新任务。尽管灵活,但多模态ICL由于对演示格式、顺序和内容的敏感性,导致高推理延迟和不稳定性。为解决这些限制,我们提出Hyper-ICL,一种轻量级、基于训练的无演示多模态ICL框架,它直接在推理时无需ICD即可重建演示效果。Hyper-ICL学习一个参数高效的低秩logit级适配器,校准注意力分布以更好地匹配演示诱导的注意力重分布。为捕捉演示影响如何随查询变化,我们引入查询自适应调制机制,根据当前查询在层和头之间自适应控制token级的干预强度。最后,我们提出逐层双曲锚点蒸馏损失,通过Lorentz测地距离将中间学生特征对齐到演示条件化的教师。该损失鼓励学生重建ICD诱导的演示-查询关系。在六个不同多模态基准(包括VQAv2、OK-VQA和COCO Caption)上的大量实验表明,Hyper-ICL在准确性和稳定性上持续优于普通ICL和现有最先进方法。

英文摘要

Multimodal In-Context Learning (ICL) has emerged as a practical inference paradigm for Multimodal Large Language Models, where a small set of interleaved image-text In-Context Demonstrations (ICDs) conditions the model to solve new tasks. Despite its flexibility, multimodal ICL incurs high inference latency and suffers from instability due to sensitivity to demonstration formatting, ordering, and content. To address these limitations, we propose Hyper-ICL, a lightweight, training-based framework for demonstration-free multimodal ICL that reconstructs demonstration effects directly without requiring ICDs at inference time. Hyper-ICL learns a parameter-efficient low-rank logit-level adapter that calibrates attention distributions to better match demonstration-induced attention redistribution. To capture how demonstration influence varies across queries, we introduce a query-adaptive modulation mechanism that adaptively controls intervention strength at token level across layers and heads based on the current query. Finally, we propose a layer-wise hyperbolic anchor distillation loss that aligns intermediate student features to a demonstration-conditioned teacher via Lorentz geodesic distance. This loss encourages the student to reconstruct the demonstration-query relationships induced by ICDs. Extensive experiments across six different multimodal benchmarks (including VQAv2, OK-VQA, and COCO Caption) demonstrate that Hyper-ICL consistently improves accuracy and stability over vanilla ICL and existing state-of-the-art methods.
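
For readers unfamiliar with the Lorentz geodesic distance named above, here is a minimal sketch of the standard formula on the hyperboloid model with curvature -1. How the paper maps features into hyperbolic space is not stated in the abstract, so the lifting step and the function names below are assumptions.

```python
import math


def lift_to_hyperboloid(v):
    """Map a Euclidean vector v to the point (sqrt(1 + |v|^2), v) on the unit hyperboloid."""
    return [math.sqrt(1.0 + sum(x * x for x in v))] + list(v)


def lorentz_inner(x, y):
    """Lorentzian inner product: -x0*y0 + sum_i xi*yi."""
    return -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))


def lorentz_distance(x, y):
    """Geodesic distance on the hyperboloid with curvature -1."""
    # Clamp to 1.0 so rounding error cannot push acosh outside its domain.
    return math.acosh(max(1.0, -lorentz_inner(x, y)))
```

A distillation loss in this spirit would sum `lorentz_distance(student, teacher)` over layers, although the exact weighting used by the authors is not given here.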

2606.04433 2026-06-04 cs.CV cs.CL cs.LG 版本更新

Stateful Visual Encoders for Vision-Language Models

用于视觉-语言模型的有状态视觉编码器

Zirui Wang, Junwei Yu, Adam Yala, David M. Chan, Joseph E. Gonzalez, Trevor Darrell

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出有状态视觉编码器,通过将每个视觉表示条件于先前的视觉特征,增强视觉-语言模型在多图像、多轮交互中的视觉变化感知能力,在跨图像空间聚合、多目标视觉差异和轨迹行为克隆等任务上取得一致改进。

Comments Project page: https://statefulvisualencoders.github.io/

详情
AI中文摘要

视觉-语言模型(VLM)越来越多地用于多图像、多轮代理场景,其中决策依赖于视觉变化。然而,在现有的开源权重VLM中,视觉比较仅在语言模型内部进行,而视觉编码器本身是无状态的:每个图像独立编码,无法访问先前的视觉上下文。因此,微小但任务关键的变化可能在语言模型有机会比较之前被减弱,尤其是当这些变化不影响场景的高层语义时。我们引入了一种有状态视觉编码器,它将每个视觉表示条件于先前的视觉特征。在监督微调下,配备有状态编码器的VLM在涉及跨图像空间聚合、多目标视觉差异和视觉轨迹行为克隆的控制任务上取得了一致的改进。这些改进在输入分辨率、语言模型大小和VLM骨干网络上保持一致。最后,我们在实际任务上验证了我们的模型,包括纵向放射学、细粒度图像比较和遥感,其中有状态编码器一致地改进了通用VLM基线,并在选定领域可以匹配或超越专用模型。项目页面:https://statefulvisualencoders.github.io/

英文摘要

Vision-language models (VLMs) are increasingly used in multi-image, multi-turn agentic settings where decisions depend on visual changes. However, in existing open-weight VLMs, visual comparisons happen only inside the language model, while the visual encoder itself remains stateless: each image is encoded independently, without access to the prior visual context. As a result, small but task-critical changes may be attenuated before the language model has a chance to compare them, especially when those changes do not affect the high-level semantics of the scene. We introduce a Stateful Visual Encoder, which conditions each visual representation on prior visual features. Under supervised finetuning, VLMs equipped with stateful encoders achieve consistent improvements on controlled tasks involving cross-image spatial aggregation, multi-object visual differencing, and visual trajectory behavior cloning. These improvements are consistent across input resolutions, language model sizes, and VLM backbones. Finally, we validate our model on real-world tasks, including longitudinal radiology, fine-grained image comparison, and remote sensing, where stateful encoders consistently improve generalist VLM baselines and can match or surpass specialized models in selected domains. Project page: https://statefulvisualencoders.github.io/

2606.04429 2026-06-04 stat.ML cs.LG 版本更新

Flatness and Generalization: Learning Multi-Index Models with Homogeneous Neural Networks

平坦性与泛化:使用齐次神经网络学习多指标模型

Harsh Vardhan, Hossein Taheri, Arya Mazumdar

发表机构 * Department of Computer Science(计算机科学系) University of California, San Diego(加州大学圣地亚哥分校) Halicioğlu Data Science Institute(Halicioğlu数据科学研究所)

AI总结 本文研究两层齐次神经网络学习多指标模型时,平坦性与泛化之间的关系,证明最平坦插值器总能泛化,而某些非泛化插值器的平坦性无法接近最平坦值。

详情
AI中文摘要

用于解释一阶梯度方法在非凸神经网络上泛化能力的常见启发式方法是“平坦插值器泛化良好”(Hochreiter and Schmidhuber, 1994; Keskar et al., 2017),其中平坦性可通过经验损失Hessian矩阵的迹来衡量。然而,Dinh等人(2017)表明,利用网络的对称性(可在保持总体和经验损失不变的情况下改变平坦性),任何插值器都可以变得更尖锐或更平坦。这一结果使得之前的启发式陈述变得空洞。在本文中,我们表明,对于使用两层非凸齐次神经网络学习未知多指标模型,尽管存在对称性,平坦性与泛化之间仍存在联系。这种联系涉及“最平坦”插值器,即所有插值器中具有阶数最小平坦性的插值器。首先,我们证明存在一类自然的非泛化插值器,其平坦性即使利用对称性也无法接近最平坦可能值。其次,我们证明,对于由单指标模型之和生成的数据,如果近似误差和标签噪声较低,任何最平坦插值器都能实现较小的总体损失,即最平坦插值器总是泛化的。这建立了平坦性与泛化之间的直接联系,适用于一大类激活函数和现实数据分布。

英文摘要

A common heuristic used to explain the generalization of first-order gradient methods on non-convex neural networks is that "flat interpolators generalize well" (Hochreiter and Schmidhuber, 1994; Keskar et al., 2017), where flatness can be measured by the trace of the Hessian of the empirical loss. However, Dinh et al. (2017) showed that, using symmetries of the network that change flatness while keeping the population and empirical losses unchanged, any interpolator can be made sharper or flatter. This result makes the earlier heuristic statement vacuous. In this paper, we show that for learning an unknown multi-index model with $2$-layer non-convex homogeneous neural networks, there is a connection between flatness and generalization, despite the existence of symmetries. This connection pertains to the "flattest" interpolators, i.e., the interpolators that have orderwise minimum flatness among all interpolators. First, we show that there exists a natural class of non-generalizing interpolators whose flatness cannot be made closer to the flattest possible, even using symmetries. Second, we show that for data generated by a sum of single-index models, if the approximation error and label noise are low, any flattest interpolator achieves small population loss, i.e., the flattest interpolators always generalize. This establishes a direct link between flatness and generalization which applies to a large class of activations and realistic data distributions.
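
A one-neuron worked example may help show how a rescaling symmetry changes the Hessian trace without changing the loss, and why a minimum over the symmetry orbit is still well defined. The toy model, the single data point $(x, y)$, and the restriction $x > 0$, $w > 0$, $a \neq 0$ are assumptions made for illustration and are not taken from the paper.

```latex
\begin{align*}
f_{a,w}(x) &= a\,\mathrm{ReLU}(w x), \qquad
L(a,w) = \tfrac{1}{2}\,(a w x - y)^2 \quad (x > 0,\ w > 0),\\
\partial_a^2 L &= (w x)^2, \qquad \partial_w^2 L = (a x)^2, \qquad
\operatorname{tr}\nabla^2 L = x^2\,(a^2 + w^2),\\
(a, w) &\mapsto (a/c,\; c\,w),\ c > 0: \qquad L \text{ unchanged}, \qquad
\operatorname{tr}\nabla^2 L = x^2\left(\frac{a^2}{c^2} + c^2 w^2\right),\\
\min_{c > 0}\ \operatorname{tr}\nabla^2 L &= 2\,x^2\,|a|\,w
\quad \text{at } c^2 = |a| / w .
\end{align*}
```

Sending $c \to 0$ or $c \to \infty$ makes the trace arbitrarily large, so sharpness alone is uninformative, but the minimum over $c$ depends only on the product $|a|\,w$, which the symmetry leaves fixed.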

2606.04423 2026-06-04 cs.LG stat.ML 版本更新

The price of multi-group transductive learning

多组转导学习的代价

Noah Bergam, Samuel Deng, Daniel Hsu

发表机构 * Columbia University(哥伦比亚大学)

AI总结 本文证明在转导学习设置中,多组学习器在某些组上的错误率可能相对于单组设置产生乘法惩罚,且惩罚随组数线性增长至样本量的平方根,这与统计设置中惩罚至多对数增长且与组数无关形成鲜明对比。

详情
AI中文摘要

我们证明,在转导设置中,每个多组学习器在某些组上的错误率相对于单组设置可能产生乘法惩罚,并且惩罚可以随组数线性增加,最多达到样本量的平方根。这与类似(组可实现)统计设置中的最优多组学习器形成鲜明对比,后者的惩罚始终至多是样本量的对数,且与组数无关。

英文摘要

We show that every multi-group learner in the transductive setting may incur a multiplicative penalty in its error rate on some group relative to the error rate achievable in the single-group setting, and that the penalty can increase linearly with the number of groups, up to roughly the square root of the sample size. This stands in stark contrast to optimal multi-group learners in an analogous (group-realizable) statistical setting, where the penalty is always at most logarithmic in the sample size and independent of the number of groups.

2606.04420 2026-06-04 cs.LG 版本更新

Loss-Conditional PINNs for Parametric PDE Families

损失条件PINNs用于参数化PDE族

Anna Lazareva, Alexander Tarakanov

发表机构 * Faculty of Computer Science, HSE University(高等经济大学计算机科学学院) VK and HSE University(VK与高等经济大学)

AI总结 提出LC-PINN,通过将损失权重或物理系数作为网络输入并随机采样,实现单一模型参数化整个PDE族,无需配对数据,在多个参数化方程上匹配或优于逐权重重训练的PINN基线。

详情
AI中文摘要

物理信息神经网络(PINNs)通过最小化残差、边界、初始和数据损失的加权组合来逼近常微分方程和偏微分方程的解。其性能通常受损失权重选择的主导:不良的权重可能导致训练退化到满足一个物理约束而忽略另一个的解。现有方法选择或调整单一组好的权重。我们采取不同的观点:不是调整一个权重向量,而是在训练期间探索整个权重空间。我们引入LC-PINN,它将Dosovitskiy和Djolonga(2020)的损失条件训练适应于PDE残差设置:条件向量(损失权重或标量物理系数)被视为网络输入,并在每个优化步骤从简单先验中采样。这将PINN训练转变为学习由该向量索引的连续解族,无需求解器生成的配对数据。因此,LC-PINN介于经典PINNs和算子学习之间:它保持完全物理信息,但在参数族上摊销训练。我们的贡献不在于损失条件构造本身,而在于将其扩展到PINNs,将损失权重和参数系数机制统一在一个架构下(损失权重使用拼接,系数使用FiLM),以及一个固定求积的L-BFGS完成协议,使得参数系数机制可训练。我们给出了条件最优的lambda不变性结果,并在参数化Helmholtz、Schrödinger、粘性Burgers和Buckley-Leverett方程上研究了LC-PINN。单个LC-PINN在一个模型中参数化整个族,同时匹配或改进逐权重重训练的PINN基线,总成本相对于逐实例重训练具有有利的摊销。

英文摘要

Physics-informed neural networks (PINNs) approximate solutions of ODEs and PDEs by minimising a weighted combination of residual, boundary, initial, and data losses. Their performance is often dominated by the choice of loss weights: a poor weighting can drive training to a degenerate solution in which one physical constraint is satisfied while another is ignored. Existing methods select or adapt a single good set of weights. We take a different view: instead of tuning one weight vector, we explore the entire weight space during training. We introduce LC-PINN, which adapts the loss-conditional training of Dosovitskiy and Djolonga (2020) to the PDE-residual setting: the conditioning vector (either the loss weights or a scalar physical coefficient) is treated as a network input and sampled from a simple prior at every optimisation step. This turns PINN training into learning a continuous family of solutions indexed by that vector, with no solver-generated paired data. LC-PINN thus lies between classical PINNs and operator learning: it stays fully physics-informed but amortises training over a parametric family. Our contribution is not the loss-conditional construction itself, but its extension to PINNs, the unification of the loss-weight and parametric-coefficient regimes under one architecture (concatenation for loss weights, FiLM for coefficients), and a fixed-quadrature L-BFGS finishing protocol that makes the parametric-coefficient regime trainable. We give a lambda-invariance result for the conditional optimum and study LC-PINN on parametric Helmholtz, Schrödinger, viscous Burgers, and Buckley-Leverett equations. A single LC-PINN matches or improves on retrained per-weight PINN baselines while parameterising the full family in one model, at a total cost that amortises favourably against per-instance retraining.
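
The difference between tuning one weight vector and conditioning on it can be written compactly. The notation below ($u_\theta$ for the network, $\mathcal{L}_k$ for the individual loss terms, $p(\lambda)$ for the sampling prior) is assumed for illustration and follows the general loss-conditional training recipe rather than the paper's exact formulation.

```latex
\begin{align*}
\text{fixed weights:}\quad &
\theta^\star(\lambda) = \arg\min_\theta \sum_k \lambda_k\, \mathcal{L}_k\big(u_\theta\big),\\
\text{loss-conditional:}\quad &
\theta^\star = \arg\min_\theta\ \mathbb{E}_{\lambda \sim p(\lambda)}
\Big[ \sum_k \lambda_k\, \mathcal{L}_k\big(u_\theta(\cdot\,;\lambda)\big) \Big].
\end{align*}
```

In the first case each choice of $\lambda$ needs its own training run, whereas in the second a single set of parameters is trained once and queried with any $\lambda$ in the support of the prior.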

2606.04413 2026-06-04 cs.LG 版本更新

(Mis)generalization of Helpful-only Fine-tuning

仅帮助性微调的(错误)泛化

Mohammad Omar Khursheed, Baram Sosis, Fabien Roger

发表机构 * Anthropic Fellows Program(Anthropic 研究员计划) Anthropic

AI总结 研究仅帮助性训练(不拒绝用户意图)的模型在泛化中的缺陷,发现其存在涌现错位、残余拒绝行为、低可操控性、谄媚和不连贯角色等问题,并提出合成文档微调和添加角色相关问题来缓解。

Comments 77 pages, 50 figures

详情
AI中文摘要

仅帮助性模型,即训练为始终遵循用户意图的模型,对于危险能力评估和AI研发中拒绝行为会成为障碍的其他领域具有价值。关于仅帮助性训练的泛化特性知之甚少:仅帮助性模型比其无害对应模型拒绝更少,但先前工作未研究其对齐的其他维度。我们研究了现有仅帮助性模型的缺陷。我们发现一些模型表现出涌现错位,其他模型存在残余拒绝行为,大多数模型显示出低可操控性、谄媚和不连贯角色。我们表明简单的反拒绝训练可能导致其中许多问题。然而,这些问题并非仅帮助性训练的必要后果:我们证明合成文档微调和向SFT及RL添加角色相关问题可以缓解它们。

英文摘要

Helpful-only models, that is, models that are trained to always follow user intent, are valuable for dangerous capability evaluations and other areas of AI R&D where refusals would be an obstacle. Little is known about the generalization properties of helpful-only training: helpful-only models refuse less than their harmless counterparts, but previous work has not studied other dimensions of their alignment. We study the shortcomings of existing helpful-only models. We find that some show emergent misalignment, others have residual refusal behaviors, and most show poor steerability, sycophancy, and incoherent character. We show that simple anti-refusal training can cause many of these issues. None of these problems are necessary consequences of helpful-only training, though: we show that synthetic document fine-tuning and adding character-related questions to SFT and RL can mitigate them.

2606.04408 2026-06-04 cs.LG cs.AI 版本更新

An Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization

基于差分进化和梯度下降优化的集成潜在因子模型

Rui Zhang, Jinhang Liu, Wenbo Zhang

发表机构 * Chongqing Academy of Economics Research(重庆经济研究院) College of Computer and Information Science, Southwest University(西南大学计算机与信息科学学院)

AI总结 针对高维不完全数据,提出一种集成潜在因子模型,通过差分进化和梯度下降两种优化方法分别建模并自适应加权融合,以获取更全面、偏差更小的表示。

详情
AI中文摘要

高维不完全(HDI)数据在许多现实世界的大数据场景中普遍存在。潜在因子模型是一种常见的表示学习方法,能够从这些数据中揭示信息丰富的潜在因子。然而,大多数现有的潜在因子模型仅依赖梯度下降进行优化,这可能导致表示不充分且有偏差,特别是在处理异构HDI数据时。因此,本研究提出了一种基于差分进化和梯度下降优化的集成潜在因子模型(ELFM-DEGDO),其设计包括两个方面:1)分别通过差分进化和梯度下降优化独立建模两个不同的潜在因子模型;2)通过定制的自适应加权机制将这两个不同的潜在因子模型组合起来,以有效融合它们的优势。通过利用两种优化范式的互补优势,ELFM-DEGDO能够为HDI数据生成更全面、偏差更小的表示。在三个HDI数据集上的测试表明,ELFM-DEGDO的性能始终优于相关的几种潜在因子模型。

英文摘要

High-dimensional and incomplete (HDI) data are prevalent in many real-world big data scenarios. Latent factor models serve as a common representation learning approach, capable of uncovering informative latent factors from such data. Nevertheless, most existing latent factor models rely solely on gradient descent for optimization, which may lead to insufficient and biased representations, particularly when dealing with heterogeneous HDI data. Thus, this study proposes an Ensembled Latent Factor Model via Differential Evolution and Gradient Descent Optimization (ELFM-DEGDO) with a two-fold design: 1) two diverse latent factor models are built independently, one optimized via differential evolution and the other via gradient descent, and 2) the two models are combined via a customized self-adaptive weighting mechanism to effectively fuse their strengths. By leveraging the complementary advantages of both optimization paradigms, ELFM-DEGDO is able to produce more comprehensive and less biased representations for HDI data. Tests on three HDI datasets show that ELFM-DEGDO consistently performs better than several related latent factor models.

2606.04405 2026-06-04 cs.LG cs.AI 版本更新

Low-Rank Decay for Grokking in Scale-Invariant Transformers: A Spectral-Geometric View

尺度不变Transformer中Grokking的低秩衰减:谱几何视角

Mingyu Li

发表机构 * Beijing Normal University(北京师范大学)

AI总结 针对尺度不变Transformer中权重衰减无法简化归一化层函数的问题,提出低秩衰减(LRD)正则化器,通过核范数子梯度的切向分量压缩奇异值,在模算术任务中加速有效秩下降并扩展延迟泛化(grokking)的数据边界。

详情
AI中文摘要

现代Transformer架构经常采用归一化机制,如RMSNorm和Query-Key归一化,使得模型的部分相对于权重幅度近似尺度不变。在这种机制下,标准的Frobenius范数权重衰减仅沿权重空间的径向方向作用,无法直接简化归一化层所表示的函数。我们通过这一视角研究小规模算法任务中的grokking现象,并提出\emph{低秩衰减}(LRD),一种类似核范数的谱正则化器,其子梯度——极因子$UV^\top$——即使在尺度不变设置中也保留切向分量。这一区别具有具体的动力学后果:在模型记忆训练集且任务梯度消失后,L2衰减无法再重塑权重谱,而LRD则以类似$\ell_1$的方式继续压缩奇异值。在模算术任务中,我们发现LRD诱导Query/Key矩阵的快速有效秩下降,并扩展了延迟泛化(grokking)发生的数据分数边界。我们进一步通过核范数子微分在低秩流形附近的“针到扇”展开,提供了谱几何解释。

英文摘要

Modern Transformer architectures frequently employ normalization mechanisms such as RMSNorm and Query-Key Normalization, making parts of the model approximately scale-invariant with respect to weight magnitudes. In this regime, standard Frobenius-norm weight decay acts purely along the radial direction of the weight space and cannot directly simplify the function represented by the normalized layer. We study grokking in small algorithmic tasks through this lens and propose Low-Rank Decay (LRD), a nuclear-norm-like spectral regularizer whose subgradient (the polar factor $UV^\top$) retains a tangential component even in the scale-invariant setting. This distinction has a concrete dynamical consequence: after the model memorizes the training set and task gradients vanish, L2 decay can no longer reshape the weight spectrum, whereas LRD continues to compress singular values in an $\ell_1$-like fashion. On modular arithmetic tasks, we find that LRD induces rapid effective-rank collapse in Query/Key matrices and expands the data-fraction boundary at which delayed generalization (grokking) occurs. We further provide a spectral-geometric interpretation through the "needle-to-fan" expansion of the nuclear-norm subdifferential near low-rank strata.
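
The radial-versus-tangential claim can be checked directly by projecting each decay direction onto the weight matrix. The derivation below is a sketch in assumed notation (compact SVD $W = U\Sigma V^\top$ with singular values $\sigma_i$) and uses only standard facts about the Frobenius and nuclear norms.

```latex
\begin{align*}
W &= U \Sigma V^\top, \qquad
U V^\top \in \partial \|W\|_*, \qquad
\nabla_W \tfrac{1}{2}\|W\|_F^2 = W,\\
\text{L2 decay:}\quad &
W - \frac{\langle W, W\rangle}{\|W\|_F^2}\, W = 0,\\
\text{LRD:}\quad &
U V^\top - \frac{\langle U V^\top, W\rangle}{\|W\|_F^2}\, W
= U\left(I - \frac{\sum_i \sigma_i}{\sum_i \sigma_i^2}\,\Sigma\right) V^\top .
\end{align*}
```

The L2 direction has no component orthogonal to $W$, so it only rescales a scale-invariant layer. The polar-factor direction has a nonzero orthogonal component unless all nonzero singular values are equal, which is what lets it keep reshaping the spectrum.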

2606.04401 2026-06-04 cs.LG 版本更新

TANDEM: Bi-Level Data Mixture Optimization with Twin Networks

TANDEM: 基于孪生网络的双层数据混合优化

Jiaxing Wang, Deping Xiang, Jin Xu, Mingyang Yi, Guoqiang Gong, Zicheng Zhang, Haoran Li, Pengzhang Liu, Zhen Chen, Ke Zhang, Ju Fan, Qixiang Jiang

发表机构 * JD.com(京东公司) University of Oxford(牛津大学) Renmin University of China(中国人民大学) University of Chinese Academy of Sciences(中国科学院大学)

AI总结 提出TANDEM方法,通过孪生网络(代理模型和参考模型)的差异衡量数据效用,优化领域混合比例,在数据受限和监督微调等场景中显著提升大语言模型性能。

详情
AI中文摘要

大语言模型(LLM)的能力在很大程度上取决于来自不同领域的训练数据。优化特定领域的混合比例可以建模为一个双层优化问题,我们将其简化为单层惩罚形式,并通过孪生网络求解:一个在主数据上训练的代理模型和一个在额外数据上动态更新的参考模型。我们提出的方法——用于双层数据混合优化的孪生网络(TANDEM),通过孪生模型之间的差异衡量数据效用,并增加从额外数据中受益更多的领域的权重。与先前方法相比,TANDEM提供了理论保证和更广泛的适用性。此外,我们的双层视角提出了研究领域重新加权的新设置,例如数据受限场景和监督微调,其中优化的混合比例显著提升了性能。大量实验验证了TANDEM在所有场景中的有效性。

英文摘要

The capabilities of large language models (LLMs) significantly depend on training data drawn from various domains. Optimizing domain-specific mixture ratios can be modeled as a bi-level optimization problem, which we simplify into a single-level penalized form and solve with twin networks: a proxy model trained on primary data and a dynamically updated reference model trained with additional data. Our proposed method, Twin Networks for bi-level DatA mixturE optiMization (TANDEM), measures the data efficacy through the difference between the twin models and up-weights domains that benefit more from the additional data. TANDEM provides theoretical guarantees and wider applicability, compared to prior approaches. Furthermore, our bi-level perspective suggests new settings to study domain reweighting such as data-restricted scenarios and supervised fine-tuning, where optimized mixture ratios significantly improve the performance. Extensive experiments validate TANDEM's effectiveness in all scenarios.
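
The idea of up-weighting domains that benefit more from the additional data can be sketched as a simple reweighting step. The abstract does not give the update rule, so the exponentiated form below, the per-domain loss gap as the benefit signal, and the names `update_mixture`, `proxy_losses`, and `reference_losses` are all assumptions for illustration.

```python
import math


def update_mixture(weights, proxy_losses, reference_losses, step=1.0):
    """Shift mixture weight toward domains where the reference model beats the proxy."""
    # Benefit of extra data on a domain = proxy loss minus reference loss (assumed signal).
    gains = [p - r for p, r in zip(proxy_losses, reference_losses)]
    scaled = [w * math.exp(step * g) for w, g in zip(weights, gains)]
    total = sum(scaled)
    return [s / total for s in scaled]  # renormalize so the weights sum to 1
```

A multiplicative update keeps every weight positive and the mixture normalized, and `step` controls how aggressively the mixture reacts to the measured gaps.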

2606.04399 2026-06-04 cs.LG cs.CR 版本更新

DPDL: Towards Differential Privacy Preservation in Decentralized Stochastic Learning on Non-IID Data

DPDL: 非独立同分布数据下分散式随机学习中的差分隐私保护

Yunsheng Yuan, Xue Xiao, Lina Wang, Feng Li

发表机构 * School of Computer Science and Technology, Shandong University(计算机科学与技术学院,山东大学) Inspur Cloud Information Technology Co., Ltd.(浪潮云信息技术有限公司)

AI总结 针对非独立同分布数据下的分散式学习隐私泄露问题,提出基于差分隐私和相似性校准的DPDL算法,通过加噪和余弦相似度校准实现隐私保护并保持线性加速。

详情
AI中文摘要

在分散式学习范式中,一组智能体在没有中央服务器的情况下,利用分布式数据集协作训练全局模型。尽管协作的力量已被许多前沿研究验证,但它需要智能体之间广泛交换梯度信息,从而对单个智能体带来高隐私泄露风险。此外,在实际应用中,训练数据通常在智能体之间非独立同分布,这给实现隐私保护的分散式学习带来了更多挑战。为了解决这些问题,我们提出了一种针对非独立同分布数据的隐私保护分散式学习算法DPDL,该算法通过基于相似性校准的技术,在交叉梯度聚合中利用差分隐私(DP)的概念。具体来说,在每一轮中,每个智能体在与其邻居共享交叉梯度(即其私有本地数据上邻居局部模型的导数)之前,通过高斯噪声机制对其进行扰动;然后采用余弦相似度校准接收到的扰动交叉梯度,使得校准后的交叉梯度聚合能够以类似动量的方式有效更新局部模型。我们严格的理论分析不仅揭示了实现特定隐私保护水平所需的最小噪声水平,而且表明我们的算法在非独立同分布数据训练中仍然实现了线性加速。最后,我们在真实世界数据集上进行了大量实验,以验证我们的算法在防御隐私攻击和训练准确模型方面的有效性。

英文摘要

In the paradigm of decentralized learning, a group of agents collaborate to train a global model using distributed datasets without a central server. Although the power of collaboration has been verified by many state-of-the-art studies, it entails extensive exchange of gradient information among the agents and thus induces a high risk of privacy leakage for individual agents. Moreover, in real-world applications, the training data are usually not independent and identically distributed (non-IID) across the agents, which makes privacy-preserving decentralized learning more challenging. To address these issues, we propose DPDL, a privacy-preserving decentralized learning algorithm for non-IID data, which leverages the notion of Differential Privacy (DP) in cross-gradient aggregation through a similarity-based calibration technique. Specifically, in each round, each agent perturbs the cross-gradients (i.e., the derivatives of its neighbors' local models on its private local data) with a Gaussian noise mechanism before sharing them with its neighbors; it then adopts cosine similarity to calibrate the received perturbed cross-gradients so that the aggregation of the calibrated cross-gradients can be used to effectively update the local model in a momentum-like manner. Our rigorous theoretical analysis not only reveals the minimum noise level required to achieve a specific level of privacy preservation, but also shows that our algorithm still achieves a linear speedup in training with non-IID data. We finally conduct extensive experiments on real-world datasets to validate the effectiveness of our algorithm in defending against privacy attacks and in training accurate models.
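
The perturb-then-calibrate round described above can be sketched as follows. This is an illustration under assumptions, not the DPDL algorithm: norm clipping is added because the Gaussian mechanism needs bounded sensitivity, the calibration rule (weights equal to the non-negative cosine similarity) is a guess at one plausible form, and all function names are hypothetical.

```python
import math
import random


def clip_to_norm(vec, max_norm):
    """Scale vec down so its L2 norm is at most max_norm (bounds the sensitivity)."""
    norm = math.sqrt(sum(x * x for x in vec))
    scale = 1.0 if norm <= max_norm else max_norm / norm
    return [x * scale for x in vec]


def perturb(cross_grad, max_norm, noise_multiplier, rng):
    """Clip, then add zero-mean Gaussian noise with std noise_multiplier * max_norm."""
    clipped = clip_to_norm(cross_grad, max_norm)
    sigma = noise_multiplier * max_norm
    return [x + rng.gauss(0.0, sigma) for x in clipped]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 0.0 if norm_a == 0.0 or norm_b == 0.0 else dot / (norm_a * norm_b)


def calibrated_aggregate(local_grad, received):
    """Average received cross-gradients, weighted by max(cosine similarity, 0)."""
    weights = [max(0.0, cosine(local_grad, g)) for g in received]
    total = sum(weights)
    if total == 0.0:
        return list(local_grad)  # nothing agrees with the local direction
    return [
        sum(w * g[k] for w, g in zip(weights, received)) / total
        for k in range(len(local_grad))
    ]
```

Weighting by similarity to the local gradient down-weights noisy or conflicting directions, which is one way heterogeneous (non-IID) neighbors could be prevented from dominating the update.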

2606.04392 2026-06-04 cs.LG cs.CL 版本更新

Physics-Informed Neural Network Modeling of Biodegradable Contaminant Transport through GCL/SL Composite Liners

物理信息神经网络建模可生物降解污染物通过GCL/SL复合衬垫的迁移

Dong Li, Yapeng Cao, Haiping Zhao, Shutong Han

发表机构 * Department of Civil, Environmental, and Infrastructure Engineering, George Mason University(乔治梅森大学土木、环境与基础设施工程系) State Key Laboratory of Cryospheric Science and Frozen Soil Engineering, Northwest Institute of Eco-Environment and Resources, Chinese Academy of Sciences(中国科学院西北生态环境资源研究院冰冻圈科学与冻土工程国家重点实验室) Laboratoire Navier/CERMES, École Nationale des Ponts et Chaussées, Institut Polytechnique de Paris(巴黎理工学院法国国立路桥学校Navier实验室/CERMES)

AI总结 提出双域物理信息神经网络框架,通过硬约束PINN精确模拟GCL/SL复合衬垫中污染物迁移,并扩展至逆问题识别降解半衰期。

详情
AI中文摘要

本研究开发了一个双域物理信息神经网络框架,用于污染物通过GCL/SL复合衬垫系统的迁移,其中薄GCL层采用稳态平流-弥散-生物降解公式处理,而下层土壤衬垫建模为瞬态传输域。在不同渗滤液水头条件下,评估了两种公式与解析解和有限元参考解的对比:标准软约束PINN(Std-PINN)和硬约束PINN(H-PINN),其中选定的边界和初始条件直接嵌入试验解中。Std-PINN捕捉了整体突破行为,但在早期传输阶段显示出较大误差,特别是在平流传输更显著的高水头条件下。H-PINN减少了与基于惩罚的约束执行相关的优化负担,提供了更准确和稳定的浓度预测,将MAE从Std-PINN的约0.058-0.067降低到H-PINN的约0.011-0.023,同时将MRE从约9.10%-19.16%降低到约2.08%-3.14%。参数分析证实,采用tanh激活函数和优化网络结构的H-PINN提供了最佳的预测精度。H-PINN进一步扩展到逆建模,用于从有限的浓度观测中识别SL降解半衰期,显示出对预设值的可靠收敛性以及在低到中等观测噪声下的可接受鲁棒性。

英文摘要

This study develops a two-domain physics-informed neural network framework for contaminant transport through a GCL/SL composite liner system, in which the thin GCL layer is treated using a steady-state advection-dispersion-biodegradation formulation and the underlying soil liner is modeled as a transient transport domain. Two formulations are evaluated against analytical and finite-element reference solutions under different leachate-head conditions: a standard PINN with soft constraint enforcement (Std-PINN) and a hard-constrained PINN (H-PINN), in which selected boundary and initial conditions are embedded directly into the trial solutions. The Std-PINN captures the overall breakthrough behavior but shows larger errors during the early transport stage, particularly under higher leachate heads where advective transport becomes more pronounced. The H-PINN reduces the optimization burden associated with penalty-based constraint enforcement and provides more accurate and stable concentration predictions, lowering the MAE from approximately 0.058-0.067 for the Std-PINN to about 0.011-0.023 for the H-PINN, while reducing the MRE from approximately 9.10%-19.16% to about 2.08%-3.14%. Parametric analyses confirm that the H-PINN with the tanh activation function and an optimized network structure provides the best predictive accuracy. The H-PINN is further extended to inverse modeling for identifying the SL degradation half-life from limited concentration observations, showing reliable convergence toward prescribed values and acceptable robustness under low-to-moderate observation noise.
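
Embedding boundary conditions directly in the trial solution can be illustrated with a generic one-dimensional construction. This is a textbook-style sketch, not the paper's trial solution: the domain $[0, 1]$, the two Dirichlet values, and the names `make_hard_constrained` and `untrained_net` are assumptions.

```python
import math


def make_hard_constrained(net, left_value, right_value):
    """Wrap `net` so u(0) = left_value and u(1) = right_value hold by construction."""
    def u(x):
        boundary_part = left_value * (1.0 - x) + right_value * x
        # x * (1 - x) vanishes at both ends, so `net` cannot violate the conditions.
        return boundary_part + x * (1.0 - x) * net(x)
    return u


def untrained_net(x):
    # Stand-in for a neural network with arbitrary (untrained) weights.
    return 3.0 * math.sin(5.0 * x) + 7.0
```

Because the conditions hold for any network output, no boundary penalty term is needed in the loss, and the optimizer only has to reduce the PDE residual in the interior.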

2606.04390 2026-06-04 cs.LG cond-mat.dis-nn math.PR updated version

Shortcomings and capacities of real-constrained neural networks in complex spaces

复空间中实约束神经网络的缺陷与能力

Andrew Gracyk

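Affiliations * Department of Mathematics(数学系)

AI summary Uses Gardner volume comparisons and the Harish-Chandra-Itzykson-Zuber (HCIZ) formula to study the asymptotic ratio of storage capacities when real pre-activations are enforced in a complex hypothesis class, relative to complex pre-activations.

Comments First version

Details
AI Chinese abstract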

我们找到了在复假设类中强制实预激活相对于复预激活时存储容量的渐近比。我们的方法依赖于临界容量下的 Gardner 体积比较。我们的证明依赖于文献中非标准的 Harish-Chandra-Itzykson-Zuber (HCIZ) 公式的应用。利用 HCIZ 公式,我们可以获得最终渐近比的更稳健近似。该策略特别适用于我们的工作,因为我们通过 Weyl 积分公式和 Haar 测度在酉紧流形和正交紧流形上进行积分。

English abstract

We find the asymptotic ratio between the storage capacities when enforcing real pre-activations in a complex hypothesis class as opposed to complex ones in the same class. Our methods depend on Gardner volume comparisons at critical capacity. Our proof relies on an application of the Harish-Chandra-Itzykson-Zuber (HCIZ) formula that is nonstandard in the literature. With the HCIZ formula, we may obtain a more robust approximation for the final asymptotic ratio. This strategy is applicable to our work specifically because we integrate over the unitary and orthogonal compact manifolds, facilitated by the Weyl integration formula and the Haar measure.

2606.04388 2026-06-04 cs.CR cs.AI cs.LG updated version

TITAN-FedAnil+: Trust-Based Adaptive Blockchain Federated Learning for Resource-Constrained Intelligent Enterprises

TITAN-FedAnil+:面向资源受限智能企业的基于信任的自适应区块链联邦学习

Muhammad Hadi, Muhammad Jahangir, Talha Shafique, Muhammad Khuram Shahzad

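Affiliations * University of Science and Technology of China(中国科学技术大学)

AI summary Proposes the TITAN-FedAnil+ framework, which filters malicious updates through affinity propagation-based adaptive clustered aggregation, improves efficiency with GPU-accelerated vectorization, and achieves lightweight blockchain resynchronization through a signed state jump mechanism, reducing memory overhead by up to 81% on resource-constrained edge devices.

Comments 8 pages, 5 figures; code available at https://github.com/error8149/FedAnilPlus-Optimized

Details
AI Chinese abstract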

联邦学习(FL)已成为一种在保护数据隐私的同时实现协作智能的有效范式。然而,由非独立同分布(non-IID)数据分布引起的数据异构性和去中心化安全威胁仍然是重大挑战,尤其是在资源受限的企业环境中。本文提出了TITAN-FedAnil+,一种面向智能企业中区块链联邦学习的基于信任的自适应网络。所提出的框架引入了基于亲和传播的自适应聚类聚合,无需预先知道攻击者数量即可识别并过滤恶意更新。此外,采用GPU加速向量化以提高计算效率,同时通过有符号状态跳变机制实现轻量级区块链重同步。实验结果表明,与基线框架相比,在受限的8 GB边缘设备上经过50轮通信,内存开销显著降低,节省高达81%。结果表明,TITAN-FedAnil+有效提升了智能企业环境中安全联邦学习部署的鲁棒性、可扩展性和资源效率。

English abstract

Federated Learning (FL) has emerged as an effective paradigm for collaborative intelligence while preserving data privacy. However, data heterogeneity arising from non-IID distributions and decentralized security threats remain significant challenges, particularly in resource-constrained enterprise environments. This paper presents TITAN-FedAnil+, a Trust-Based Adaptive Network for blockchain-enabled federated learning in intelligent enterprises. The proposed framework introduces affinity propagation-based adaptive clustered aggregation to identify and filter malicious updates without requiring prior knowledge of the number of attackers. In addition, GPU-accelerated vectorization is employed to improve computational efficiency, while a signed state jump mechanism enables lightweight blockchain resynchronization. Experimental results demonstrate substantial reductions in memory overhead, achieving up to 81% savings across 50 communication rounds on constrained 8 GB edge devices compared with the baseline framework. The results indicate that TITAN-FedAnil+ effectively improves robustness, scalability, and resource efficiency for secure federated learning deployments in intelligent enterprise environments.

2606.04384 2026-06-04 cs.LG cs.CR stat.ML updated version

Revisiting Privacy Amplification by Subsampling in Selective Release DPSGD

重新审视选择性释放DPSGD中的子采样隐私放大

Xiaobo Huang, Fang Xie

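Affiliations * Guangdong Provincial Key Laboratory of IRADS, Beijing Normal-Hong Kong Baptist University(广东省IRADS重点实验室,北京师范大学-香港浸会大学)

AI summary Addresses the utility degradation and slow convergence that gradient clipping and noise injection cause in DPSGD by re-evaluating the privacy analysis of the selective release mechanism and proposing Differentially Private Selective Release based on Clipped Gradients (DPSR-CG); a rigorous privacy analysis and experiments show that it achieves strong model performance while maintaining strict privacy guarantees.

Details
AI Chinese abstract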

机器学习对敏感数据的依赖需要差分隐私随机梯度下降(DPSGD)等隐私保护技术。然而,由于梯度裁剪和噪声注入,DPSGD存在显著的效用下降和收敛缓慢的问题。先前的工作试图从不同角度改进DPSGD;值得注意的是,差分隐私选择性更新与释放(DPSUR)算法取得了显著的模型效用。然而,DPSUR中的隐私核算忽略了选择性释放机制引入的采样概率变化,这损害了其隐私保证的严谨性。为了解决这些限制,我们重新评估了选择性释放机制的隐私分析,并提出了一种新颖的算法:基于裁剪梯度的差分隐私选择性释放(DPSR-CG)。通过严格的新推导隐私分析以及在多个数据集(MNIST、CIFAR-10、IMDB和FMNIST)上的广泛实验,我们证明了我们的DPSR-CG机制在保持严格隐私保证的同时实现了卓越的模型性能。

English abstract

Machine learning's reliance on sensitive data necessitates privacy-preserving techniques like Differentially Private Stochastic Gradient Descent (DPSGD). However, DPSGD suffers from substantial utility degradation and slow convergence due to gradient clipping and noise injection. Prior works have attempted to improve DPSGD from various perspectives; notably, the Differentially Private Selective Update and Release (DPSUR) algorithm has achieved remarkable model utility. However, the privacy accounting in DPSUR overlooks the variation in sampling probability introduced by the selective release mechanism, which compromises the rigor of its privacy guarantees. To address these limitations, we re-evaluate the privacy analysis of the selective release mechanism and propose a novel algorithm: Differentially Private Selective Release based on Clipped Gradients (DPSR-CG). Through a rigorous, newly derived privacy analysis and extensive experiments on multiple datasets (MNIST, CIFAR-10, IMDB, and FMNIST), we demonstrate that our DPSR-CG mechanism maintains strict privacy guarantees while achieving exceptional model performance.

2606.04381 2026-06-04 cs.LG cs.AI updated version

From Symbolic to Geometric: Enabling Spatial Reasoning in Large Language Models

从符号到几何:在大语言模型中实现空间推理

Chen Chu, Bita Azarijoo, Li Xiong, Khurram Shafique, Cyrus Shahabi

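Affiliations * University of Southern California(南加州大学) Emory University(埃默里大学) Novateur Research Solutions(Novateur研究解决方案)

AI summary Proposes the Spatial Language Model (SLM), which treats location information as a first-class modality and learns spatial representations, enabling geometric spatial reasoning during inference and significantly outperforming existing methods based on symbolic reasoning.

Details
AI Chinese abstract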

近期的大语言模型(LLM)通常表现出空间推理能力;然而,这种能力很大程度上是符号性的,源于对空间语言的模式匹配,而非真正的几何空间推理。由于LLM操作离散令牌,它们缺乏对连续空间表示、显式几何计算和结构化空间算子的原生支持。为解决这一局限,我们引入了空间语言模型(SLM),这是首个将位置信息作为一等模态并在模型推理过程中实现几何空间推理的多模态LLM。SLM直接操作学习到的空间表示,而非空间关系的文本描述。为支持有效训练,我们构建了空间指令数据集,该数据集对齐了空间表示、原子几何操作和自然语言指令。我们进一步提出了名为SpatialEval的新基准,旨在评估属性、距离、拓扑和相对位置任务上的空间推理。大量实验表明,SLM显著优于依赖通过提示工程或文本抽象进行符号推理的现有基于LLM的方法,展示了集成几何空间表示对稳健空间推理的优势。我们的指令数据集、评估基准、模型训练代码和模型检查点可在 https://github.com/chuchen2017/SLM 获取。

English abstract

Recent large language models (LLMs) often appear to exhibit spatial reasoning ability; however, this capability is largely symbolic, arising from pattern matching over spatial language rather than true geometric reasoning over space. Because LLMs operate on discrete tokens, they lack native support for continuous spatial representations, explicit geometric computation, and structured spatial operators. To address this limitation, we introduce the Spatial Language Model (SLM), the first multimodal LLM that treats location information as a first-class modality and enables geometric spatial reasoning within the model's inference process. SLM directly operates on learned spatial representations rather than textual descriptions of spatial relations. To support effective training, we construct a Spatial Instruction Dataset that aligns spatial representations, atomic geometric operations, and natural language instructions. We further propose a new benchmark named SpatialEval, which is designed to evaluate spatial reasoning across attributes, distance, topology, and relative-position tasks. Extensive experiments show that SLM significantly outperforms existing LLM-based approaches that rely on symbolic reasoning via prompt engineering or textual abstraction, demonstrating the benefits of integrating geometric spatial representations for robust spatial reasoning. Our instruction dataset, evaluation benchmark, model training code, and model checkpoints can be found at https://github.com/chuchen2017/SLM.

2606.04380 2026-06-04 stat.ML cs.LG updated version

REGAIN: REconciliation GAIN-driven Auxiliary Direction Learning

REGAIN:基于调和增益的辅助方向学习

Weijia Li, Shun Hu, Yanfei Kang

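Affiliations * School of Mathematical Sciences, Beihang University, Beijing, China(北京航空航天大学数学科学学院) School of Economics and Management, Beihang University, Beijing, China(北京航空航天大学经济管理学院)

AI summary Proposes the REGAIN framework, which learns normalized auxiliary directions, forecasts them with a frozen forecasting oracle, and selects directions by target-weighted loss reduction to improve forecast reconciliation.

Details
AI Chinese abstract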

预测调和通常从固定测量系统开始,询问如何将预测投影到一致空间。我们提出不同问题:哪些额外的线性测量应被预测并纳入调和系统?我们提出REGAIN,一种调和增益框架,学习归一化辅助方向,用冻结预测预言机预测诱导序列,并通过增强广义最小二乘调和后的目标加权损失减少选择方向。与基于方差的分量或基于可预测性的辅助选择不同,REGAIN优化辅助测量对最终调和预测的下游影响。我们提供统计特征,表明有用的辅助方向必须提供关于未解决目标不确定性的互补信息,而不仅仅是易于预测。分析还阐明了协方差风险减少机制、偏差变化在实现二次风险中的作用以及估计增益信号的稳定性。开发了带有保留增益筛选的分阶段学习算法,以及可选的联合优化步骤。在北京PM2.5和澳大利亚旅游数据上的实验表明,增益选择的测量可以改进普通多变量和层次预测,特别是当它们揭示原始测量系统未捕捉的残差不确定性时。

English abstract

Forecast reconciliation usually starts from a fixed measurement system and asks how forecasts should be projected onto a coherent space. We ask a different question: which additional linear measurements should be forecast and included in the reconciliation system? We propose REGAIN, a reconciliation-gain framework that learns normalized auxiliary directions, forecasts the induced series with a frozen forecasting oracle, and selects directions by their target-weighted loss reduction after augmented generalized least-squares reconciliation. Unlike variance-based components or predictability-based auxiliary selection, REGAIN optimizes the downstream effect of an auxiliary measurement on the final reconciled forecasts. We provide a statistical characterization showing that useful auxiliary directions must provide complementary information about unresolved target uncertainty, rather than merely being easy to forecast. The analysis also clarifies the covariance-risk reduction mechanism, the role of bias changes in realized quadratic risk, and the stability of estimated gain signals. A stagewise learning algorithm with held-out gain screening is developed, together with an optional joint refinement step. Experiments on Beijing PM2.5 and Australian Tourism data show that gain-selected measurements can improve both ordinary multivariate and hierarchical forecasts, especially when they reveal residual uncertainty not captured by the original measurement system.

2606.04375 2026-06-04 cs.LG stat.ML updated version

When Do Fewer Coordinates Suffice in DP-SGD?

何时在DP-SGD中更少的坐标就足够了?

Huiqi Zhang, Fang Xie

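Affiliations * Guangdong Provincial Key Laboratory of IRADS, Beijing Normal-Hong Kong Baptist University(广东省IRADS重点实验室,北京师范大学-香港浸会大学)

AI summary Proposes TP-TopK, a two-phase method for coordinate-sparse private training without public data: a private warm-up phase identifies a coordinate support, so that the relevant noise term scales with the active dimension k rather than the full parameter dimension d, and a nonconvex stationarity bound gives the condition under which coordinate restriction is beneficial.

Comments 14 pages

Details
AI Chinese abstract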

差分隐私随机梯度下降(DP-SGD)向每个更新的坐标注入噪声,使得注入的噪声能量随环境参数维度\(d\)缩放。我们探究私有训练何时可以更新更少的坐标而不丢失优化所需的信号。我们提出TP-TopK(两阶段TopK DP-SGD),一种无需公共数据的坐标稀疏私有训练的两阶段方法,其中私有预热阶段识别用于指导主训练阶段的坐标支撑集。我们给出了一个刻画坐标限制何时有益的准则,通过非凸平稳性边界表明在该条件下相关噪声项随活跃维度\(k\)而非全参数维度\(d\)缩放,并提供了基于预热的坐标排序可靠性的下界。在MNIST、FMNIST和CIFAR-10上的实验表明,学习到的坐标支撑集比大小匹配的随机支撑集能保留更多的梯度能量,当活跃维度较小且预热分数信息丰富时收益最大。

English abstract

Differentially private stochastic gradient descent (DP-SGD) injects noise into every updated coordinate, making the injected noise energy scale with the ambient parameter dimension \(d\). We ask when private training can update fewer coordinates without losing the signal needed for optimization. We propose TP-TopK (Two-Phase TopK DP-SGD), a two-phase method for coordinate-sparse private training without public data, in which a private warm-up phase identifies a coordinate support used to guide the main training phase. We give a criterion characterizing when coordinate restriction can be beneficial, show via a nonconvex stationarity bound that under this condition the relevant noise term scales with the active dimension \(k\) rather than the full parameter dimension \(d\), and provide a lower bound on the reliability of warm-up-based coordinate ranking. Experiments on MNIST, FMNIST, and CIFAR-10 show that learned coordinate supports can retain more gradient energy than size-matched random supports, with the largest gains when the active dimension is small and warm-up scores are informative.
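Illustrative sketch (not the authors' code): the scaling claim can be checked numerically in a toy setting. Per-coordinate Gaussian noise with standard deviation sigma has expected squared norm sigma^2 * d when added to all d coordinates and sigma^2 * k when added only to a k-coordinate support. This is not a differentially private implementation: clipping and privacy accounting are omitted, and the support is chosen from the raw gradient as a stand-in for the private warm-up scores the abstract describes. All function names are hypothetical.

```python
import random

def top_k_support(scores, k):
    """Return indices of the k largest-magnitude scores (hypothetical warm-up scores)."""
    order = sorted(range(len(scores)), key=lambda i: abs(scores[i]), reverse=True)
    return order[:k]

def noisy_update(grad, sigma, support=None, rng=random):
    """Add N(0, sigma^2) noise on the active coordinates only.

    With support=None every coordinate is active (dense update); otherwise
    coordinates outside `support` stay at zero and receive no noise.
    """
    active = range(len(grad)) if support is None else support
    update = [0.0] * len(grad)
    for i in active:
        update[i] = grad[i] + rng.gauss(0.0, sigma)
    return update

def mean_noise_energy(grad, sigma, support, trials, rng):
    """Monte Carlo estimate of the expected squared noise norm on active coordinates."""
    active = range(len(grad)) if support is None else support
    total = 0.0
    for _ in range(trials):
        update = noisy_update(grad, sigma, support, rng)
        total += sum((update[i] - grad[i]) ** 2 for i in active)
    return total / trials

rng = random.Random(0)
d, k, sigma = 1000, 50, 1.0
grad = [rng.uniform(-1.0, 1.0) for _ in range(d)]
support = top_k_support(grad, k)  # non-private stand-in for a warm-up ranking
dense = mean_noise_energy(grad, sigma, None, 200, rng)
sparse = mean_noise_energy(grad, sigma, support, 200, rng)
print(f"dense (about sigma^2 * d): {dense:.1f}")
print(f"sparse (about sigma^2 * k): {sparse:.1f}")
```

The estimates should land near 1000 and 50 respectively. Whether the smaller noise outweighs the gradient signal lost outside the support is the question the paper's criterion addresses, and this sketch does not model it.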

2606.04366 2026-06-04 cs.LG cs.NA math.NA updated version

MeshTok: Efficient Multi-Scale Tokenization for Scalable PDE Transformers

MeshTok:可扩展PDE Transformer的高效多尺度分词化

Yanshun Zhao, Xiaoyu Peng, Jiamin Jiang, Congcong Zhu, Jingrun Chen

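Affiliations * School of Mathematical Sciences, University of Science(数学科学学院,科学大学) Suzhou Institute for Advanced Research, University of Science(苏州高等研究院,科学大学)

AI summary Proposes MeshTok, a framework inspired by adaptive mesh refinement that uses multi-scale tokenization to capture both coarse-grained global context and fine-grained local detail within a single Transformer, improving the efficiency-accuracy trade-off in PDE modeling.

Comments ICML2026

Details
AI Chinese abstract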

传统的分块Transformer在均匀空间分区上运行,将计算工作量均匀分布在整个域中,而不考虑局部特征。这种不灵活的分词化方案在有效表示和处理复杂PDE解方面具有固有的局限性。为了解决这个问题,我们提出了MeshTok,一种受自适应网格细化(AMR)启发的分词化和序列建模框架。该方法选择性地细化具有陡峭梯度、瞬态特征或多尺度结构的空间区域,在固定模拟网格上生成一组异质的多尺度令牌。这些令牌在统一的Transformer序列中处理,使模型能够同时捕获粗粒度的全局上下文和细粒度的局部细节,而无需专门的架构组件。尽管自适应细化适度增加了令牌数量,但它促进了计算资源向物理信息区域的更有针对性的分配,我们将其视为一种实用的归纳偏置,而不是形式上的最优性保证。跨多个PDE族和基准数据集的实验评估表明,与均匀网格基线相比,MeshTok持续改善了效率-准确率权衡。这表明自适应多尺度分词化是神经PDE建模的一种可扩展且可推广的设计原则。代码可在https://github.com/SCAILab-USTC/MeshTok获取。

English abstract

Conventional patchified Transformers operate on uniform spatial partitions, distributing computational effort evenly across the domain irrespective of local features. This inflexible tokenization scheme is inherently limited in its ability to efficiently represent and process solutions to complex PDEs. To address this, we propose MeshTok, an adaptive mesh refinement (AMR)-inspired tokenization and sequence modeling framework. This method selectively refines spatial regions exhibiting sharp gradients, transient features, or multiscale structures, generating a heterogeneous set of multiscale tokens defined on a fixed simulation grid. These tokens are processed within a unified Transformer sequence, enabling the model to simultaneously capture coarse-grained global context and fine-grained local details without requiring specialized architectural components. Although adaptive refinement moderately increases token count, it promotes a more targeted allocation of computational resources to physically informative regions, which we view as a practical inductive bias rather than a formal optimality guarantee. Experimental evaluations across multiple PDE families and benchmark datasets demonstrate that MeshTok consistently improves the efficiency-accuracy trade-off compared to uniform-grid baselines. This suggests adaptive multiscale tokenization as a scalable and generalizable design principle for neural PDE modeling. Code is available at https://github.com/SCAILab-USTC/MeshTok.

2606.04360 2026-06-04 cs.CL cs.LG updated version

Deliberate Evolution: Agentic Reasoning for Sample-Efficient Symbolic Regression with LLMs

Deliberate Evolution: 基于智能体推理的样本高效符号回归与LLM

Xinyu Pang, Zhanke Zhou, Xuan Li, Fangrui Lv, Shanshan Wei, Sen Cui, Bo Han, Changshui Zhang

Affiliations * TMLR Group, Department of Computer Science, Hong Kong Baptist University(香港浸会大学计算机科学系TMLR组) Beijing National Research Center for Information Science and Technology (BNRist), Department of Automation, Tsinghua University, Beijing, P.R. China(清华大学自动化系,北京信息科学与技术国家研究中心(BNRist),中国北京) Lenovo Research(联想研究院)

AI summary Proposes the Deliberate Evolution framework, which decouples symbolic generation from search control and uses adaptive operators, analytical tools, and reflective memory to outperform existing LLM-based symbolic regression methods with only 40% of the sample budget.

Comments ICML 2026

Details
AI Chinese abstract

符号回归(SR)从数据中发现紧凑的数学表达式,然而最近基于LLM的进化方法仍然样本效率低下,因为它们主要依赖标量反馈(如MSE)。我们发现一个核心限制:现有方法将候选提议与搜索指导混为一谈,要求LLM从单一分数中推断如何进化表达式、诊断其错误并重用过去经验。为了解决这个问题,我们提出了Deliberate Evolution(DE),一个将符号生成与搜索控制解耦的智能体框架。DE使用自适应算子引导搜索方向、分析工具进行结构诊断以及反思记忆存储轨迹级经验,从而指导LLM的提议。在LLM-SRBench上的实验表明,DE在仅使用标准样本预算的40%的情况下,在多个科学领域一致优于代表性的基于LLM的SR基线。

English abstract

Symbolic regression (SR) discovers compact mathematical expressions from data, yet recent LLM-based evolutionary methods remain sample-inefficient because they rely mainly on scalar feedback such as MSE. We identify a core limitation: existing methods conflate candidate proposal with search guidance, requiring the LLM to infer how to evolve an expression, diagnose its errors, and reuse past experience from a single score. To address this, we propose Deliberate Evolution (DE), an agentic framework that decouples symbolic generation from search control. DE guides LLM proposals with adaptive operators for search direction, analytical tools for structural diagnosis, and reflective memory for trajectory-level experience. Experiments on LLM-SRBench show that DE consistently outperforms representative LLM-based SR baselines across diverse scientific domains while using only 40% of the standard sample budget.

2606.04345 2026-06-04 cs.CV cs.AI cs.LG updated version

HYolo: An Intelligent IoT-Based Object Detection System Using Hypergraph Learning

HYolo:一种基于超图学习的智能物联网目标检测系统

Isha Abid, Fawad Khan, Muhammad Khuram Shahzad

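Affiliations * National University of Sciences and Technology(国立科技大学)

AI summary Proposes the HYolo framework, which integrates hypergraph learning into the YOLO architecture to model high-order feature relationships, improving mAP@50 by approximately 12% on the COCO dataset.

Comments 8 pages, multiple figures

Details
AI Chinese abstract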

本文提出HYolo,一种基于物联网的智能目标检测框架,将超图学习集成到YOLO架构中。传统的基于YOLO的目标检测模型主要捕获成对特征交互,可能无法建模对象与上下文特征之间的复杂高阶关系。为解决这一局限,HYolo引入超图学习以捕获更丰富的上下文依赖关系并改进对象表示。在COCO数据集上的实验评估表明,与基线YOLO模型相比,性能显著提升。所提方法在mAP@50上实现了约12%的提升,同时增强了整体检测准确性和鲁棒性。通过建模高阶特征关系,HYolo在物联网环境中提供了改进的上下文理解和更可靠的目标检测性能。结果表明,将超图学习集成到目标检测流程中,为智能且上下文感知的物联网视觉系统提供了一个有前景的方向。

English abstract

This paper presents HYolo, an intelligent IoT-based object detection framework that integrates hypergraph learning into the YOLO architecture. Traditional YOLO-based object detection models primarily capture pairwise feature interactions and may fail to model complex high-order relationships among objects and contextual features. To address this limitation, HYolo incorporates hypergraph learning to capture richer contextual dependencies and improve object representation. Experimental evaluation on the COCO dataset demonstrates significant performance improvements over baseline YOLO models. The proposed approach achieves approximately 12% improvement in mAP@50 while enhancing overall detection accuracy and robustness. By modeling high-order feature relationships, HYolo provides improved contextual understanding and more reliable object detection performance in IoT-based environments. The results indicate that integrating hypergraph learning into object detection pipelines offers a promising direction for intelligent and context-aware IoT vision systems.

2606.04342 2026-06-04 cs.LG cs.AI updated version

Expectations vs. Realities: The Cost of MSE-Optimal Forecasting Under Conditional Uncertainty

期望与现实:条件不确定性下MSE最优预测的成本

Riku Green, Zahraa S. Abdallah, Telmo M Silva Filho

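Affiliations * The University of Bristol(布里斯托尔大学)

AI summary Uses a conditional uncertainty gap to prove a fundamental trade-off between MSE optimality and marginal realism in multi-step time series forecasting, and shows empirically that a small sacrifice in MSE (≤5%) can substantially improve marginal realism (median 17.3%).

Comments 12 pages, Accepted for KDD 2026 Research track

Details
AI Chinese abstract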

多步时间序列预测(MSF)通常使用均方误差(MSE)等逐点误差指标进行评估,隐含地将条件均值视为充分目标。我们证明,在条件不确定性下,当条件期望在较长预测范围内无法代表典型实现值时,这种做法可能产生误导。我们通过条件不确定性间隙形式化这一效应,并证明只要该间隙非零,任何确定性预测器都无法同时最小化MSE并匹配实现未来的边际分布。这确立了MSF评估中逐点准确性与边际真实性之间根本性的、与模型无关的权衡。利用受控随机动力系统和九个真实世界预测基准,我们经验性地刻画了由此产生的准确性-真实性前沿,并量化了仅基于MSE的模型选择的实际成本。随着条件不确定性随预测范围增加,可达集扩展为明显的帕累托前沿,将MSE最优但分散不足的预测器与牺牲准确性换取真实边际变异性的方法区分开来。在多个基准中,我们发现MSE的小幅放松(≤5%)通常能带来边际真实性的不成比例提升,中位数改进为17.3%,在某些数据集中增益超过30%。我们进一步表明,常见的预测策略系统性地占据该前沿的不同区域:直接多输出预测器集中在准确性最优极端附近,而递归策略和基于样本的推断更倾向于边际真实性。这些结果共同揭示了长期预测中基于MSE评估的结构性失败模式,并将策略和推断选择重新定义为对不可避免的准确性-真实性权衡的导航。

English abstract

Multi-step time series forecasting (MSF) is commonly evaluated using point-wise error metrics such as mean squared error (MSE), implicitly treating the conditional mean as a sufficient target. We show that this can be misleading under conditional uncertainty, where the conditional expectation becomes unrepresentative of typical realized values at longer horizons. We formalize this effect through a conditional uncertainty gap and prove that whenever this gap is nonzero, no deterministic predictor can simultaneously minimize MSE and match the marginal distribution of realized futures. This establishes a fundamental, model-agnostic trade-off between point accuracy and marginal realism in MSF evaluation. Using controlled stochastic dynamical systems and nine real-world forecasting benchmarks, we empirically characterize the resulting accuracy-realism frontier and quantify the practical cost of MSE-only model selection. As conditional uncertainty increases with forecast horizon, the attainable set expands into a pronounced Pareto front, separating MSE-optimal but under-dispersed predictors from methods that trade accuracy for realistic marginal variability. Across benchmarks, we find that small relaxations in MSE ($\le 5\%$) frequently unlock disproportionate gains in marginal realism, with median improvements of $17.3\%$ and gains exceeding $30\%$ in some datasets. We further show that common forecasting strategies systematically occupy different regions of this frontier: direct multi-output predictors concentrate near the accuracy-optimal extreme, while recursive strategies and sample-based inference favor marginal realism. Together, these results expose a structural failure mode of MSE-based evaluation in long-horizon forecasting and recast strategy and inference selection as navigation of an unavoidable accuracy-realism trade-off.
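Illustrative sketch (not from the paper): the intuition behind the trade-off shows up in a toy Gaussian model, chosen here as an assumption, in which the future equals a predictable signal plus unit-variance noise. The conditional mean minimizes MSE but has half the variance of the realized values, while a single draw from the conditional distribution matches the marginal variance at roughly twice the MSE. The paper's conditional uncertainty gap is not defined in the abstract and is not computed here.

```python
import random
import statistics

rng = random.Random(0)
n = 20000
signal = [rng.gauss(0.0, 1.0) for _ in range(n)]      # part of the future known from history
future = [s + rng.gauss(0.0, 1.0) for s in signal]    # realized values

mean_forecast = signal                                 # conditional mean: MSE-optimal in this toy model
sample_forecast = [s + rng.gauss(0.0, 1.0) for s in signal]  # one draw from the conditional distribution

def mse(pred, truth):
    """Mean squared error between two equally long sequences."""
    return statistics.fmean((p - y) ** 2 for p, y in zip(pred, truth))

mse_mean = mse(mean_forecast, future)
mse_sample = mse(sample_forecast, future)
var_future = statistics.pvariance(future)
var_mean = statistics.pvariance(mean_forecast)
var_sample = statistics.pvariance(sample_forecast)

print(f"MSE: conditional mean {mse_mean:.2f}, conditional sample {mse_sample:.2f}")
print(f"variance: future {var_future:.2f}, mean forecast {var_mean:.2f}, "
      f"sample forecast {var_sample:.2f}")
```

In this model the theoretical values are MSE 1 versus 2 and variance 2 (future), 1 (mean forecast), and 2 (sample forecast), so the under-dispersion of the MSE-optimal forecast is visible directly in the printed variances.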

2606.04339 2026-06-04 cs.LG updated version

Literature-Guided Minimax Optimization of Virtual Epilepsy Neurostimulation

文献引导的虚拟癫痫神经刺激极小极大优化

Cathy Liu

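Affiliations * Cathy Liu

AI summary Proposes a literature-guided minimax optimization pipeline that combines PubMed-scale hypothesis extraction, TVB Epileptor simulations, and large-language-model-guided black-box optimization to maximize worst-case reward for robust neurostimulation design.

Comments 9 pages, 4 figures. Code and interactive essay at https://github.com/liuzhitong330/tvb-llm-robust-neurostim

Details
AI Chinese abstract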

癫痫的计算模型有望实现患者特异性治疗设计,但大多数优化工作流程仍搜索平均表现良好的参数。在神经调控中,这是一个薄弱目标:改善平均响应的方案仍可能对网络最不耐受刺激的患者失败。我们提出一种文献引导的极小化优化流程,结合PubMed规模假设提取、虚拟大脑(TVB)Epileptor模拟和大语言模型引导的黑箱优化。优化器提出内在模型控制参数或临床可解释的外部刺激方案;TVB对采样的虚拟患者评估每个方案;目标函数最大化最坏情况奖励,定义为模拟癫痫活动的负方差。在内在模型控制实验中,最佳存档参数集将最坏情况奖励从-0.5285提升至-0.3182,比基线提高39.8%。临床风格的外部刺激搜索产生较小的最坏情况改善(1.7%),尽管有55%的响应率和阳性颞叶亚组信号,但20名虚拟患者队列未显示总体获益(p=0.9019)。该研究应被视为鲁棒、文献感知的神经刺激设计的计算机概念验证,而非临床证据。

English abstract

Computational models of epilepsy promise patient-specific treatment design, but most optimization workflows still search for parameters that perform well on average. In neuromodulation, this is a weak target: a protocol that improves the mean response can still fail in the patient whose network is least tolerant to stimulation. We present a literature-guided minimax pipeline that couples PubMed-scale hypothesis extraction, The Virtual Brain (TVB) Epileptor simulations, and large-language-model-guided black-box optimization. The optimizer proposes either intrinsic model-control parameters or clinically interpretable external-stimulation protocols; TVB evaluates each proposal across sampled virtual patients; and the objective maximizes worst-case reward, defined as the negative variance of simulated seizure activity. In the intrinsic model-control experiment, the best archived parameter set improved worst-case reward from -0.5285 to -0.3182, a 39.8% gain over baseline. The clinical-style external-stimulation search produced a much smaller worst-case improvement (1.7%), and a 20-patient virtual cohort showed no aggregate benefit (p=0.9019), despite a 55% responder rate and a positive temporal-lobe subgroup signal. The study should be read as an in silico proof of concept for robust, literature-aware neurostimulation design, not as clinical evidence.
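Illustrative sketch (not the authors' pipeline): the objective described above scores a proposal by its worst reward across virtual patients, where reward is the negative variance of simulated activity. The toy below contrasts that criterion with average-reward selection. `toy_simulate` is a hypothetical stand-in for a TVB Epileptor run, and the patient and proposal values are made up so that the two criteria disagree.

```python
import statistics

def reward(activity):
    """Reward as defined in the abstract: negative variance of simulated activity."""
    return -statistics.pvariance(activity)

def worst_case_reward(proposal, patients, simulate):
    """Minimax score: the reward of the patient who responds worst."""
    return min(reward(simulate(proposal, p)) for p in patients)

def mean_reward(proposal, patients, simulate):
    """Average-case score, shown for contrast."""
    return statistics.fmean(reward(simulate(proposal, p)) for p in patients)

def toy_simulate(proposal, patient):
    """Hypothetical stand-in for a simulation: an oscillation whose
    amplitude grows with the mismatch between proposal and patient."""
    amplitude = abs(patient - proposal)
    return [amplitude * s for s in (-1.0, 1.0, -1.0, 1.0)]

patients = [0.0, 0.0, 0.0, 1.0]   # three similar patients and one outlier
proposals = [0.1, 0.5]

best_on_average = max(proposals, key=lambda q: mean_reward(q, patients, toy_simulate))
best_worst_case = max(proposals, key=lambda q: worst_case_reward(q, patients, toy_simulate))
print(best_on_average, best_worst_case)  # prints: 0.1 0.5
```

The average-case criterion picks the proposal tuned to the majority and leaves the outlier with a reward near -0.81, while the minimax criterion accepts a slightly worse mean to cap every patient near -0.25.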

2606.04338 2026-06-04 cs.LG cs.CR updated version

Federated Learning for Multi-Center Sepsis Early Prediction with Privacy-Preserving

联邦学习用于隐私保护的多中心脓毒症早期预测

Xixi Tian, Di Wu, Xiang Liu, Yiziting Zhu, Yujie Li, Xin Shu, Bin Yi

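Affiliations * University of Science and Technology of China(中国科学技术大学)

AI summary Addresses the privacy-sensitive, distributed nature of multi-center medical data with a federated-learning approach to distributed collaborative modeling that achieves prediction accuracy comparable to a centralized model while avoiding privacy leakage.

Details
AI Chinese abstract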

多中心医疗数据的隐私敏感性和分布式特征给集中式建模进行脓毒症早期准确预测带来了严重障碍。联邦学习作为一种有前景的协作模型开发框架,允许多个机构在不直接共享或集中原始数据的情况下联合训练预测模型,因此受到越来越多的关注。然而,其实际性能、鲁棒性和隐私保护优势尚未使用真实临床数据集进行充分评估。为弥补这一差距,本研究系统性地考察了联邦学习在多中心脓毒症预测中的应用。实验数据集包括从中国三家三级医院收集的648个临床筛选样本,并采用严格的纳入和排除标准。我们建立了集中式训练范式作为性能基线,然后实现了水平联邦学习框架用于分布式协作建模。大量实验结果表明,基于联邦学习的模型在预测精度上与集中式模型高度相当,同时从根本上避免了隐私泄露。进一步的隐私安全分析验证了恶意攻击者无法从传输的模型参数中重建原始患者数据,表明其对数据重建攻击具有强大的抵抗力。这项工作不仅验证了联邦学习在临床脓毒症预测中的实用性和安全性,而且为隐私保护的多中心医疗协作提供了可靠且可行的解决方案。

English abstract

Privacy-sensitive and distributed characteristics of multi-center medical data bring severe obstacles to centralized modeling for accurate early prediction of sepsis. Federated learning (FL) has attracted growing attention as a promising framework for collaborative model development, as it allows multiple institutions to jointly train predictive models without directly sharing or centralizing raw data. Nevertheless, its practical performance, robustness, and privacy-preserving benefits remain insufficiently evaluated using real-world clinical datasets. To bridge this gap, this study systematically examines the application of federated learning to multi-center sepsis prediction. The experimental dataset consists of 648 clinically screened samples collected from three tertiary hospitals in China, with rigorous inclusion and exclusion criteria. We establish a centralized training paradigm as the performance baseline, and then implement a horizontal federated learning framework for distributed collaborative modeling. Extensive experimental results demonstrate that the federated learning-based model achieves highly comparable prediction accuracy to the centralized counterpart, while fundamentally avoiding privacy leakage. Further privacy security analysis verifies that malicious attackers cannot reconstruct the original patient data from the transmitted model parameters, indicating strong resistance against data reconstruction attacks. This work not only validates the practicality and security of federated learning in clinical sepsis prediction, but also provides a reliable and feasible solution for privacy-preserving multi-center medical collaboration.

2605.01910 2026-06-04 cs.LG cs.AI cs.DC updated version

Stochastic Sparse Attention for Memory-Bound Inference

随机稀疏注意力用于内存受限推理

Kyle Lee, Corentin Delacour, Kevin Callahan-Coray, Kyle Jiang, Can Yaras, Samet Oymak, Tathagata Srimani, Kerem Y. Camsari

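Affiliations * University of California, Santa Barbara(加州大学圣芭芭拉分校) University of Michigan(密歇根大学) Carnegie Mellon University(卡内基梅隆大学)

AI summary Proposes SANTA, which reduces value-cache access by sampling a sparse set of indices from the post-softmax distribution, enabling efficient multiplier-free decoding; on Llama-3.1-8B-Instruct it achieves up to 1.5× attention-kernel speedup and up to 1.25× end-to-end speedup.

Comments Code available at https://github.com/OPUSLab/SANTA

Details
Journal ref
ICML 2026
AI Chinese abstract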

自回归解码在长上下文中变得带宽受限,因为生成每个token需要从KV缓存中读取所有$n_k$个键和值向量。我们提出随机加法无乘法注意力(SANTA),一种通过从后softmax分布中采样$S \ll n_k$个索引并仅聚合这些值行来稀疏化值缓存访问的方法。这产生了后softmax值聚合的无偏估计,同时将值阶段的乘加运算替换为收集和加法。我们引入分层和系统采样来设计方差减少、GPU友好的变体。在32k token上下文的Llama-3.1-8B-Instruct上评估,S$^2$ANTA匹配基线准确率,同时在NVIDIA RTX 6000 Ada上相比FlashInfer和FlashDecoding实现高达1.5倍解码步注意力核加速。在批处理长上下文生成中,这些核增益转化为高达1.25倍的端到端解码延迟加速。最后,我们提出伯努利$qK^\mathsf{T}$采样作为补充技术来稀疏化分数阶段,通过随机三元查询减少键特征访问。两种方法对上游量化、低秩投影、KV缓存压缩和KV缓存选择方法互补。它们共同指向稀疏、无乘法和节能的推理。我们在https://github.com/OPUSLab/SANTA.git开源了我们的核。

English abstract

Autoregressive decoding becomes bandwidth-limited at long contexts, as generating each token requires reading all $n_k$ key and value vectors from the KV cache. We present Stochastic Additive No-mulT Attention (SANTA), a method that sparsifies value-cache access by sampling $S \ll n_k$ indices from the post-softmax distribution and aggregating only those value rows. This yields an unbiased estimator of the post-softmax value aggregation while replacing value-stage multiply-accumulates with gather-and-add. We introduce stratified and systematic sampling to design variance-reduced, GPU-friendly variants. Evaluated on Llama-3.1-8B-Instruct at 32k-token contexts, S$^2$ANTA matches baseline accuracy while achieving up to $1.5\times$ decode-step attention-kernel speedup over FlashInfer and FlashDecoding on an NVIDIA RTX 6000 Ada. In batched long-context generation, these kernel gains translate to up to $1.25\times$ end-to-end decode-latency speedup. Finally, we propose Bernoulli $qK^\mathsf{T}$ sampling as a complementary technique to sparsify the score stage, reducing key-feature access through stochastic ternary queries. Both methods are complementary to upstream quantization, low-rank projection, KV-cache compression, and KV-cache selection methods. Together, they point toward sparse, multiplier-free, and energy-efficient inference. We open-source our kernels at: https://github.com/OPUSLab/SANTA.git
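Illustrative sketch (not the released kernels): the value-stage estimator can be written in a few lines of plain Python. The sketch draws S indices independently, with replacement, from the softmax probabilities and averages the gathered value rows, which is unbiased for the exact weighted sum. Independent sampling with replacement is an assumption made for simplicity; the stratified and systematic variants mentioned in the abstract are not shown, and all function names are hypothetical.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def exact_attention(probs, values):
    """Exact value aggregation: one multiply-accumulate per key and feature."""
    dim = len(values[0])
    return [sum(p * v[j] for p, v in zip(probs, values)) for j in range(dim)]

def sampled_attention(probs, values, num_samples, rng):
    """Monte Carlo value aggregation: gather sampled rows and add them,
    then rescale once by the number of samples."""
    dim = len(values[0])
    acc = [0.0] * dim
    for i in rng.choices(range(len(probs)), weights=probs, k=num_samples):
        for j in range(dim):
            acc[j] += values[i][j]  # gather-and-add, no per-row multiply
    return [a / num_samples for a in acc]

rng = random.Random(0)
n_keys, dim, num_samples = 64, 4, 8
scores = [rng.gauss(0.0, 1.0) for _ in range(n_keys)]
values = [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(n_keys)]
probs = softmax(scores)

exact = exact_attention(probs, values)
runs = [sampled_attention(probs, values, num_samples, rng) for _ in range(2000)]
averaged = [sum(r[j] for r in runs) / len(runs) for j in range(dim)]
print("exact   :", [round(x, 3) for x in exact])
print("averaged:", [round(x, 3) for x in averaged])
```

Each call reads only 8 of the 64 value rows, and averaging many independent estimates should approach the exact output, which is what unbiasedness predicts. A single estimate is noisy, which is presumably why the paper adds variance-reduced sampling schemes.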

2606.04327 2026-06-04 cs.LG cs.AI math.OC updated version

A Geometric Characterization of the Stationary Plateau for Two-Layer Neural Networks

两层神经网络平稳高原的几何刻画

Tian Ding, Dawei Li, Ruoyu Sun

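Affiliations * Shenzhen International Center of Industrial and Applied Mathematics(深圳工业与应用数学国际中心) Shenzhen Research Institute of Big Data(深圳大数据研究院) Shenzhen Loop Area Institute(深圳河套学院) AutoKernel University of Minnesota Twin Cities(明尼苏达大学双城分校) School of Data Science, The Chinese University of Hong Kong, Shenzhen, China(香港中文大学(深圳)数据科学学院)

AI summary Defines an "inner Hessian" matrix to study the geometry of stationary plateaus in the loss landscape of two-layer neural networks with smooth activations, classifies all stationary points on these plateaus as local minima or saddle points, and shows how the splitting coefficients and the definiteness of the inner Hessian jointly determine the plateau's local geometry.

Comments 47 pages

Details
AI Chinese abstract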

我们研究了光滑激活函数的两层神经网络损失景观中出现的平稳高原的几何结构。我们关注“神经元分裂”现象,其中复制一个隐藏神经元会在更宽的网络中产生一个仿射平稳点集。我们提供了这些高原上所有平稳点的全面分类,确定了它们在何种条件下构成局部极小点或鞍点。我们的刻画依赖于一个我们称之为“内Hessian”矩阵的每个神经元曲率对象。我们的分析表明,内Hessian的定性以及分裂系数的选择共同决定了高原的局部几何。我们证明,分裂一个局部极小点可以产生局部极小和鞍点的混合,或者一个全鞍点的高原,在温和假设下确定了一个具体的必然鞍点区域。相反,分裂一个鞍点总是产生一个鞍点的高原。我们的结果统一并扩展了先前的景观分析,阐明了模型扩展何时以及如何保持或改变平稳点的性质。这些发现为神经网络中宽度扩展和重参数化的影响提供了新的几何见解。

English abstract

We investigate the geometric structure of stationary plateaus that arise in the loss landscape of two-layer neural networks with smooth activation functions. We focus on the phenomenon of "neuron splitting" where duplicating a hidden neuron yields an affine set of stationary points in a wider network. We provide a comprehensive classification of all stationary points on these plateaus, determining under what conditions they constitute local minima or saddle points. Our characterization hinges on a per-neuron curvature object we term the "inner Hessian" matrix. Our analysis reveals that the definiteness of the inner Hessian and the choice of splitting coefficients jointly dictate the local geometry of the plateau. We show that "splitting" a local minimum can yield either a mixture of local minima and saddles or an all-saddle plateau, with a concrete sure-saddle region identified under mild assumptions. In contrast, splitting a saddle point always produces a plateau of saddle points. Our results unify and extend prior landscape analyses, elucidating when and how model expansion preserves or alters the nature of stationary points. These findings offer new geometric insights into the effects of width expansion and reparameterization in neural networks.

2606.04326 2026-06-04 cs.LG cs.AI updated version

Measuring What Matters: Synthetic Benchmarks for Concept Bottleneck Models

衡量重要之事:概念瓶颈模型的合成基准

Julian Skirzynski, Harry Cheon, Shreyas Kadekodi, Meredith Stewart, Berk Ustun

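Affiliations * University of California, San Diego(加州大学圣地亚哥分校)

AI summary Develops synthetic benchmarks for concept bottleneck models that control properties such as data modality, concept choice, annotation quality, and completeness, in order to evaluate models in decision-support and automation settings and to diagnose failure modes.

Comments Benchmarks available at https://github.com/ustunb/concept-benchmark

Details
AI Chinese abstract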

概念瓶颈模型从输入中检测到的高级概念预测结果。尽管概念提供了从可解释性中获益的简单方法,但很少有数据集包含概念标签。这限制了研究人员确定哪些问题适合这些模型、隔离驱动其性能或导致失败的因素、或发现哪些算法表现良好的能力。在本文中,我们为概念瓶颈模型开发了合成基准,重点关注其两个主要用例:决策支持(模型帮助人类做出更好的决策)和自动化(模型在无监督下处理常规任务)。我们的基准可以生成带标签的数据集,同时控制影响性能的属性,包括数据模态、概念选择、标注质量和完整性。我们演示了如何使用这些基准评估代表性类别的概念瓶颈模型。我们的演示展示了基准如何诊断失败模式并指导后续测试。

English abstract

Concept bottleneck models predict outcomes from high-level concepts detected in inputs. Although concepts provide a simple way to reap benefits from interpretability, very few datasets include concept labels. This limits researchers' ability to determine which problems are suitable for these models, isolate the factors that drive their performance or lead to failures, or uncover which algorithms perform well. In this paper, we develop synthetic benchmarks for concept-bottleneck models, focusing on their two main use cases: decision support, in which models assist humans in making better decisions, and automation, in which models handle routine tasks without supervision. Our benchmarks can generate labeled datasets while controlling for properties that affect performance, including data modality, concept choice, annotation quality, and completeness. We demonstrate how the benchmarks can be used to evaluate representative classes of concept bottleneck models. Our demonstrations show how the benchmarks can diagnose failure modes and guide follow-up testing.

2606.04324 2026-06-04 cs.LG stat.ML 版本更新

Neural Galerkin Normalizing Flows for Bayesian Inference of Diffusions with Inaccessible Boundaries

用于具有不可达边界的扩散模型贝叶斯推断的神经Galerkin归一化流

Riccardo Saporiti, Fabio Nobile

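Affiliations * CSQI, École Polytechnique Fédérale de Lausanne

AI summary Proposes a new normalizing flow architecture that solves the Fokker-Planck equation in a Neural Galerkin framework to learn the transition density of a diffusion process between two observation times, enabling efficient Bayesian inference.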

Comments 27 pages, 12 figures

详情
AI中文摘要

从离散观测对扩散模型参数进行贝叶斯推断的主要挑战之一是,在连续观测时间之间无法获得转移密度函数的解析表达式,而该函数是推导似然函数所必需的。扩展先前使用归一化流求解Fokker-Planck型偏微分方程的研究,我们提出一种新的归一化流架构,用于学习扩散过程在两个观测时间之间的转移密度函数。我们通过神经Galerkin框架,以狄拉克质量作为初始条件,在初始数据和扩散系数的指定训练分布上求解相关的Fokker-Planck方程来实现这一点。我们特别关注扩散矩阵在某些不可达边界区域消失的过程,例如满足Feller条件的随机波动率模型。沿观测轨迹评估所获得的转移密度的乘积近似似然函数,从而通过马尔可夫链蒙特卡洛实现廉价的后验采样。在离线训练阶段之后,推断变得显著更高效,因为它避免了为MCMC采样器提出的每个参数实时求解Fokker-Planck方程,或依赖其他涉及重复模拟扩散桥的无似然贝叶斯推断方法。

英文摘要

One of the primary challenges in Bayesian inference on the parameters of a diffusion model from discrete observations is the unavailability of an analytical expression for the transition density function between consecutive observation times, which is needed to derive the likelihood function. Extending previous studies that solve Fokker-Planck (FP) type partial differential equations with Normalizing Flows, we propose a new Normalizing Flow architecture to learn the transition density function of the diffusion process between two observation times. We do so by solving in a Neural Galerkin framework the associated FP equation with a Dirac mass as initial condition, over a specified training distribution of the initial datum and the coefficients of the diffusion. We specifically focus on processes whose diffusion matrix vanishes in certain inaccessible boundary regions, such as Stochastic Volatility models that satisfy a Feller condition. The product of the obtained transition densities evaluated along the observed trajectory approximates the likelihood function, thereby enabling cheap posterior sampling via Markov chain Monte Carlo (MCMC). After the offline training phase, inference becomes significantly more efficient, as it avoids the need to solve the FP equation in real time for each parameter proposed by the MCMC sampler or to rely on other likelihood-free methods for Bayesian inference that involve repeated simulation of diffusion bridges.
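
Illustrative sketch (not from the paper): the abstract approximates the likelihood by the product of transition densities along the observed trajectory. The snippet below shows that assembly step only. As a stand-in for the learned flow it uses the closed-form Gaussian transition density of an Ornstein-Uhlenbeck process, which is an assumption made here for the sake of a runnable example; the paper targets processes with inaccessible boundaries, where no such closed form is available. All function and parameter names are hypothetical.

```python
import math

def ou_log_transition(x_next, x_prev, dt, theta, mu, sigma):
    """Log transition density of dX = theta*(mu - X) dt + sigma dW.
    Closed form: Gaussian with exponentially decaying mean."""
    decay = math.exp(-theta * dt)
    mean = mu + (x_prev - mu) * decay
    var = sigma**2 * (1.0 - decay**2) / (2.0 * theta)
    return -0.5 * math.log(2.0 * math.pi * var) - (x_next - mean) ** 2 / (2.0 * var)

def log_likelihood(times, xs, log_transition, **params):
    """Markov property: the log-likelihood is the sum of log transition
    densities over consecutive observation pairs."""
    return sum(
        log_transition(xs[i + 1], xs[i], times[i + 1] - times[i], **params)
        for i in range(len(xs) - 1)
    )

times = [0.0, 0.5, 1.5]          # irregular observation times (toy data)
xs = [1.0, 0.8, 0.3]
params = dict(theta=1.2, mu=0.0, sigma=0.5)
ll = log_likelihood(times, xs, ou_log_transition, **params)
```

An MCMC sampler would call `log_likelihood` once per proposed parameter value; replacing `ou_log_transition` with a pre-trained density model is what makes each call cheap in the setting the abstract describes.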

2606.04320 2026-06-04 cs.LG cs.AI 版本更新

OpenRFM: Dissecting Relational In-Context Learning

OpenRFM:剖析关系型上下文学习

Zhikai Chen, Junyu Yin, Jialiang Gu, Siheng Xiong, Xiaoze Liu, Ruowang Zhang, Keren Zhou, Kai Guo

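Affiliations * Michigan State University; Georgia Institute of Technology; Purdue University; George Mason University

AI summary Analyzes model-side and data-side issues of the Relational Transformer (RT), then proposes a dual-stage in-context learning architecture and a homophily-aware pre-training mixture. The resulting OpenRFM improves average task performance by approximately 30% over the RT backbone.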

Comments 25 pages, including appendix

详情
AI中文摘要

关系型基础模型(RFM)承诺一个单一的预训练预测器,给定任何关系数据库,通过关系型上下文学习(ICL)在一次前向传播中返回预测。然而,开放RFM与其商业对应物之间存在显著差距,且这一差距的根源尚未被系统理解。我们从两个角度剖析了一个代表性框架——关系型Transformer(RT)。模型方面:我们表明RT执行关系级ICL,而核回归视图显示,当稀疏标签单元覆盖导致欠定回归时,它会失败。数据方面:我们消融了RT的预训练来源,发现仅合成预训练和分布内预训练将相同架构驱动到不同机制(惰性与特征学习)。探究这一差距揭示,缺失的成分是标签生成过程中可识别支持的关系型潜在变量。这两个诊断转化为:(1)一种双阶段ICL架构,将关系型骨干与从预训练表格基础模型提升的批级ICL层相结合,以克服关系级标签稀缺;(2)一种同质性感知的合成加持续真实数据预训练混合,辅以基于原型的正则化。这些选择定义了OpenRFM,一个简单而有效的RFM,在RT骨干上平均任务性能提升约30%,并在大量评估任务上超越了商业模型KumoRFMv1。

英文摘要

Relational Foundation Models (RFMs) promise a single pre-trained predictor that, given any relational database, returns predictions in one forward pass via relational in-context learning (ICL). Yet a substantial gap separates open RFMs from their commercial counterparts, and the origin of this gap has not been systematically understood. We dissect a representative framework, the Relational Transformer (RT), from two perspectives. Model side: we show that RT performs relation-level ICL, and a kernel regression view shows it fails when sparse label-cell coverage yields an underdetermined regression. Data side: we ablate RT's pre-training source and find that existing synthetic-only pre-training and in-distribution pre-training drive the same architecture into different regimes, lazy vs. feature-learning. Probing this gap reveals that the missing ingredient is a support-identifiable relational latent in the label-generation process. These two diagnoses translate into (1) a dual-stage ICL architecture that combines the relational backbone with a batch-level ICL layer lifted from a pre-trained tabular foundation model to overcome relation-level label scarcity, and (2) a homophily-aware synthetic plus continual real-data pre-training mixture, augmented with a prototype-based regularization. These choices define OpenRFM, a simple yet effective RFM that improves average task performance by approximately 30% over the RT backbone and surpasses the commercial model KumoRFMv1 on a large set of evaluation tasks.

2606.04317 2026-06-04 cs.CR cs.LG cs.SE 版本更新

Toward a Generalized Defense Across Sparse, Continuous, and Structured Parameter Attacks

面向稀疏、连续和结构化参数攻击的通用防御

Bin Duan, Zeyu Bai, Guowei Yang

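Affiliations * School of Electrical Engineering and Computer Science, The University of Queensland, Australia

AI summary Proposes ParDef, which combines keyed channel reparameterization, QC-LDPC quantization, and adaptive robust inference into a generalized defense against diverse parameter attacks, reducing attack success rates while maintaining high model performance.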

详情
AI中文摘要

深度神经网络越来越多地部署在异构和部分不可信的环境中,模型通过云存储、CI/CD 流水线、容器化服务和边缘执行平台进行分发。这种广泛的部署场景使模型参数面临各种完整性风险。与输入空间对抗攻击不同,参数攻击直接篡改模型的内部参数,并持续影响所有后续推理。现有防御要么需要重新训练,要么导致显著的精度下降,或者仅限于特定的攻击类别。然而,在实际部署场景中,参数攻击的形式往往不可预测。为了解决这一挑战,我们提出了 ParDef,一种针对深度神经网络面向多种类型参数攻击的通用防御。ParDef 集成了密钥通道重参数化(隐藏敏感参数方向)、QC-LDPC 量化(嵌入冗余并支持纠错)以及自适应鲁棒推理(在不确定性下稳定预测)。我们在 CIFAR-10、CIFAR-100 和 Tiny-ImageNet 上使用 ResNet 和 VGG 模型的评估表明,ParDef 在不同参数攻击下持续降低攻击成功率,同时保持较高的模型性能,且仅引入适度的部署开销。这些结果凸显了 ParDef 是一种实用且通用的 DNN 部署防御方案。

英文摘要

Deep neural networks are increasingly deployed across heterogeneous and partially untrusted environments, where models are distributed through cloud storage, CI/CD pipelines, containerized services, and edge execution platforms. This broad deployment landscape exposes model parameters to various integrity risks. Unlike input-space adversarial attacks, parameter attacks directly tamper with the model's internal parameters and persist across all subsequent inferences. Existing defenses either require retraining, incur significant accuracy degradation, or are limited to specific attack classes. However, in real-world deployment scenarios, the forms of parameter attacks are often unpredictable. To address this challenge, we present ParDef, a generalized defense for deep neural networks against diverse types of parameter attacks. ParDef integrates keyed channel reparameterization, which obscures sensitive parameter directions, QC-LDPC quantization, which embeds redundancy and supports error correction, and adaptive robust inference, which stabilizes predictions under uncertainty. Our evaluation on CIFAR-10, CIFAR-100, and Tiny-ImageNet using ResNet and VGG models demonstrates that ParDef consistently reduces attack success rates across different parameter attacks while maintaining high model performance and incurring only moderate deployment overhead. These results highlight that ParDef is a practical and generalized defense for DNN deployments.

2606.04314 2026-06-04 cs.LG cs.SE 版本更新

Testing Neural Networks via Bayesian-Guided Exploration of Decision Landscapes

通过贝叶斯引导的决策景观探索测试神经网络

Bin Duan, Meiru Che, Guowei Yang

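Affiliations * School of Electrical Engineering and Computer Science, The University of Queensland, Australia; College of Information and Communications Technology, Central Queensland University, Australia

AI summary Proposes BayesWarp, a testing framework that identifies decision-critical input regions with interpretable saliency techniques and adaptively guides testing with uncertainty-aware Bayesian optimization, uncovering diverse failures while staying close to the original data distribution and semantics.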

详情
AI中文摘要

随着神经网络越来越多地部署在安全关键领域,测试对于评估和提高其可靠性至关重要。现有的测试方法,无论是黑盒还是白盒,主要使用全局变异或覆盖引导策略,这两种方法都难以在保持与原始数据分布和语义接近的同时高效发现多样化的模型故障。我们提出BayesWarp,一个通过可解释显著性技术识别决策关键输入区域,并使用不确定性感知的贝叶斯优化策略自适应引导测试过程的测试框架,能够在保持与原始数据分布和语义接近的同时发现多样化故障。在MNIST、CIFAR-10和ImageNet上对六个神经网络模型的评估表明,BayesWarp在固定变异预算下提高了故障发现率、故障多样性、测试用例质量和关键神经元覆盖率。这些结果表明BayesWarp提高了测试有效性。此外,使用生成的故障案例进行微调可提高模型性能。

英文摘要

As neural networks are increasingly deployed in safety-critical domains, testing is essential to evaluate and improve their reliability. Existing testing methods, whether black-box or white-box, primarily use global mutation or coverage-guided strategies, both of which struggle to efficiently uncover diverse model failures while remaining proximate to the original data distribution and semantics. We propose BayesWarp, a testing framework that addresses this limitation by mutating decision-critical input regions identified via interpretable saliency techniques and adaptively guiding the testing process using an uncertainty-aware Bayesian Optimization strategy, enabling the discovery of diverse failures while preserving distributional and semantic proximity to the original data. Evaluation on MNIST, CIFAR-10, and ImageNet across six neural network models shows that BayesWarp improves failure discovery, failure diversity, test case quality, and critical neuron coverage under a fixed mutation budget. These results demonstrate that BayesWarp improves testing effectiveness. Moreover, fine-tuning with the generated failure cases leads to improvements in model performance.

2606.04310 2026-06-04 cs.LG cs.SE 版本更新

Latent Anchor-Driven Test Generation for Deep Neural Networks

基于潜在锚点的深度神经网络测试生成

Bin Duan, Matthew B. Dwyer, Guowei Yang

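Affiliations * School of Electrical Engineering and Computer Science, The University of Queensland, Australia; Department of Computer Science, University of Virginia, United States

AI summary Proposes Latte, a framework that performs anchor-guided mutation in the latent space of a pre-trained VQ-VAE to generate semantically proximate, diverse, and fault-revealing test cases, improving fault exposure and behavioral diversity.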

详情
AI中文摘要

深度神经网络(DNN)越来越多地部署在安全关键和安全性敏感的应用中,这使得严格的测试对于识别和缓解模型弱点至关重要。现有的 DNN 测试方法要么探索输入空间,要么探索学习到的潜在空间。虽然潜在空间生成比直接输入空间变异能更好地保持合理性,但当前方法在探索可控性、故障多样性和种子相对语义漂移之间仍面临权衡。为了克服这些限制,我们提出了 Latte,一个黑盒测试框架,通过利用潜在空间生成语义相近、多样且能揭示错误的测试用例。具体来说,Latte 使用预训练的 VQ-VAE 对每个输入种子进行编码,并沿着由从替代类别中采样的锚点定义的方向执行以种子为中心的一步潜在变异,然后进行量化并解码回输入空间。这会在学习到的潜在流形中探索每个种子周围的局部邻域,从而在相同预算下产生更多数量和更广泛多样性的触发预言机预测差异。我们在 5 个数据集和 10 个 DNN 模型上评估了 Latte,包括单模型和多模型测试场景。在评估的数据集和模型上,Latte 在匹配的测试预算下提高了故障暴露和行为多样性。在单模型设置下,它还相对于源种子保持了较低的种子相对语义漂移。

英文摘要

Deep Neural Networks (DNNs) are increasingly being deployed in security-critical and safety-sensitive applications, which makes rigorous testing essential to identify and mitigate model weaknesses. Existing DNN testing approaches explore either the input space or a learned latent space. While latent-space generation can better maintain plausibility than direct input-space mutation, current methods still face a trade-off among exploration controllability, failure diversity, and seed-relative semantic drift. To overcome these limitations, we propose Latte, a black-box testing framework that generates semantically proximate, diverse, and fault-revealing test cases by leveraging the latent space. Specifically, Latte encodes each input seed with a pre-trained VQ-VAE and performs a seed-centered, one-step latent mutation along directions defined by anchors sampled from alternative classes, followed by quantization and decoding back to the input space. This explores local neighborhoods around each seed within the learned latent manifold, resulting in a larger number and broader diversity of oracle-triggering prediction discrepancies under the same budget. We evaluated Latte on 5 datasets and 10 DNN models in single-model and multi-model testing scenarios. Across the evaluated datasets and models, Latte improves fault exposure and behavioral diversity under matched testing budgets. Under the single-model setting, it also maintains low seed-relative semantic drift with respect to the source seeds.
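
Illustrative sketch (not from the paper): the abstract describes a seed-centered, one-step latent mutation along a direction defined by an anchor from another class, followed by quantization. The snippet below assumes the simplest reading, a linear step from the seed toward the anchor and nearest-neighbour lookup in a codebook, on a single latent vector. A real VQ-VAE quantizes a grid of such vectors and then decodes; the step rule, the names, and the toy codebook are all assumptions.

```python
def nearest_code(z, codebook):
    """Vector quantization: return the codebook entry closest to z."""
    return min(
        codebook,
        key=lambda c: sum((zi - ci) ** 2 for zi, ci in zip(z, c)),
    )

def mutate_toward_anchor(z_seed, z_anchor, step, codebook):
    """Take one step of relative size `step` from the seed toward the anchor,
    then snap the result back onto the codebook."""
    moved = [zs + step * (za - zs) for zs, za in zip(z_seed, z_anchor)]
    return nearest_code(moved, codebook)

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy 2-D codebook
z_seed = [0.1, 0.1]      # latent of the input seed
z_anchor = [0.9, 0.8]    # latent of a sample from a different class
mutants = [mutate_toward_anchor(z_seed, z_anchor, s, codebook)
           for s in (0.0, 0.6, 1.0)]
```

Small steps leave the seed's code unchanged, while larger steps cross into a neighbouring code, which is the sense in which the mutation explores a local neighbourhood of the seed.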

2606.04307 2026-06-04 cs.LG stat.CO stat.ME 版本更新

Folded Transport MCMC: Certifiable Quotient Posterior Computation for Symmetric Bayesian Models

折叠传输MCMC:对称贝叶斯模型的可认证商后验计算

Jun Hu

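Affiliations * Wuhan University of Technology

AI summary Addresses the redundant multimodality that degrades MCMC convergence diagnostics in symmetric Bayesian models. Folded Transport MCMC infers the quotient posterior directly by building an independence sampler on the fundamental domain of the symmetry group, and transfers the LCNF oscillation-based certification framework to the quotient metric to obtain provable certified lower bounds.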

Comments 48 pages (including supplementary material), 5 figures, 6 tables. Submitted to Journal of the Royal Statistical Society: Series B

详情
AI中文摘要

具有有限对称性的贝叶斯模型——如可交换分量的混合模型、具有紧密间隔模态的结构识别——定义的后验在标签置换群下不变,产生冗余的多峰性,从而降低MCMC收敛诊断的质量。我们引入折叠传输MCMC(FolT-MCMC),该方法通过在对称群的基本域上构建独立采样器,直接对商后验进行推断。商提议分布通过对群轨道上学习的归一化流进行对称化得到。我们证明了基于LCNF振荡的认证框架可以迁移到商度量,并具有稳定子修正的球质量界和改进的覆盖半径,并且当未折叠流表现出跨模态提议缺陷时,分位数核心认证下界会得到改善。在高斯混合(d=2-20)、标签切换目标(最多24个等价模态)以及标准贝叶斯三分量混合后验上,分位数核心认证改进比从2倍到145倍不等,且折叠认证经验上几乎与维度无关。在台风山竹期间超高层建筑的真实加速度计数据上,FolT-MCMC产生了非平凡的分位数核心认证,而未折叠认证是平凡的。

英文摘要

Bayesian models with finite symmetry (mixture models with exchangeable components, structural identification with closely-spaced modes) define posteriors that are invariant under a group of label permutations, creating redundant multimodality that degrades MCMC convergence diagnostics. We introduce Folded Transport MCMC (FolT-MCMC), which performs inference directly on the quotient posterior by constructing an independence sampler on the fundamental domain of the symmetry group. The quotient proposal is formed by symmetrising a learned normalising flow over the group orbits. We prove that the LCNF oscillation-based certification framework transfers to the quotient metric with a stabiliser-corrected ball-mass bound and improved covering radius, and that the quantile-core certified lower bound improves whenever the unfolded flow exhibits cross-mode proposal deficiency. On Gaussian mixtures ($d = 2$ to $20$), label-switching targets (up to 24 equivalent modes), and a standard Bayesian three-component mixture posterior, the quantile-core certified improvement ratio ranges from $2\times$ to $145\times$, with the folded certificate empirically nearly dimension-free. On real accelerometer data from a supertall building during Typhoon Mangkhut, FolT-MCMC yields a non-vacuous quantile-core certificate where the unfolded certificate is vacuous.
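
Illustrative sketch (not from the paper): the abstract works on the fundamental domain of a label-permutation group and symmetrises a proposal over group orbits. The snippet below assumes the simplest case, a mixture with scalar component parameters, where the fundamental domain can be taken as the ordered vectors and folding is just sorting. The toy Gaussian `gaussian_q` stands in for a learned normalising flow; all names are hypothetical, and stabiliser corrections are not handled.

```python
import itertools
import math

def fold(theta):
    """Map a parameter vector to the fundamental domain
    theta_1 <= ... <= theta_K of the permutation group."""
    return tuple(sorted(theta))

def symmetrised_density(q, theta):
    """Average a base density q over the orbit of theta under all
    label permutations, giving a permutation-invariant density."""
    perms = list(itertools.permutations(theta))
    return sum(q(p) for p in perms) / len(perms)

def gaussian_q(theta, centre=(-1.0, 0.5, 2.0), scale=0.7):
    """Toy stand-in for a learned flow density: independent Gaussians."""
    return math.prod(
        math.exp(-0.5 * ((t - c) / scale) ** 2) / (scale * math.sqrt(2 * math.pi))
        for t, c in zip(theta, centre)
    )

a = (2.1, -0.9, 0.4)     # one labelling of the components
b = (-0.9, 0.4, 2.1)     # the same components, relabelled
same_point = fold(a) == fold(b)
dens_a = symmetrised_density(gaussian_q, a)
dens_b = symmetrised_density(gaussian_q, b)
```

The two labellings fold to the same point and receive the same symmetrised density, so the K! equivalent modes collapse into one. The averaging costs K! evaluations here, which is acceptable only for small K.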

2606.04305 2026-06-04 cs.LG stat.ML 版本更新

Offline-to-Online Learning in Linear Bandits

线性Bandit中的离线到在线学习

Kushagra Chandak, Toshinori Kitamura, Xiaoqi Tan

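Affiliations * University of Tokyo

AI summary For stochastic linear bandits, proposes an algorithm that balances offline data against online exploration, achieving sublinear regret relative to the optimal action and a regret relative to an offline reference that decreases as the number of offline samples grows.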

详情
AI中文摘要

我们研究了在随机线性Bandit设置中利用额外离线数据集进行在线学习的问题。尽管该问题在实践中频繁出现,但在结构化环境中,离线到在线的权衡仍然缺乏深入理解。我们提出了一种线性Bandit算法来平衡这种权衡:它在早期回合依赖离线数据,并随着时间推移逐渐增加探索。我们建立了遗憾界,表明我们的方法同时与纯在线和纯离线解决方案具有竞争力。特别地,相对于最优动作,它在在线交互次数上实现了次线性遗憾,而相对于离线参考的遗憾随着离线样本数量的增加而降低。实验结果进一步证明了该方法在各种问题参数下的有效性。

英文摘要

We study online learning with an additional offline dataset in the stochastic linear bandit setting. Although this problem arises frequently in practice, the offline-to-online tradeoff remains poorly understood in structured environments. We propose a linear bandit algorithm that balances this tradeoff: it relies on offline data during early rounds, and increasingly favors exploration as the horizon grows. We establish regret bounds showing that our method is simultaneously competitive with both purely online and purely offline solutions. In particular, it achieves sublinear regret relative to the optimal action in the number of online interactions, while its regret relative to an offline reference decreases as the number of offline samples grows. Empirical results further demonstrate its effectiveness across various problem parameters.
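
Illustrative sketch (not the paper's algorithm): to make the offline-to-online setting concrete, the snippet below warm-starts a textbook LinUCB learner with offline (feature, reward) pairs. This baseline is swapped in purely for illustration; the paper's own rule for shifting from offline reliance to exploration is not reproduced, and the class name, `reg`, and `beta` are assumptions.

```python
import math

def sherman_morrison(A_inv, x):
    """Return the inverse of (A + x x^T) given the symmetric inverse A_inv."""
    Ax = [sum(a * xj for a, xj in zip(row, x)) for row in A_inv]
    denom = 1.0 + sum(xi * axi for xi, axi in zip(x, Ax))
    n = len(x)
    return [[A_inv[i][j] - Ax[i] * Ax[j] / denom for j in range(n)] for i in range(n)]

class WarmStartLinUCB:
    """Ridge-regression LinUCB whose statistics are seeded with offline data."""

    def __init__(self, dim, offline_data, reg=1.0, beta=1.0):
        self.A_inv = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim
        self.beta = beta
        for x, r in offline_data:      # offline samples enter like past rounds
            self.update(x, r)

    def update(self, x, r):
        self.A_inv = sherman_morrison(self.A_inv, x)
        self.b = [bi + r * xi for bi, xi in zip(self.b, x)]

    def theta_hat(self):
        return [sum(a * bj for a, bj in zip(row, self.b)) for row in self.A_inv]

    def ucb(self, x):
        mean = sum(t * xi for t, xi in zip(self.theta_hat(), x))
        Ax = [sum(a * xj for a, xj in zip(row, x)) for row in self.A_inv]
        width = math.sqrt(sum(xi * axi for xi, axi in zip(x, Ax)))
        return mean + self.beta * width

    def select(self, arms):
        return max(range(len(arms)), key=lambda i: self.ucb(arms[i]))

offline = [([1.0, 0.0], 1.0)] * 50          # offline data covers one direction only
agent = WarmStartLinUCB(dim=2, offline_data=offline)
arms = [[1.0, 0.0], [0.0, 1.0]]
```

With this data the confidence width is small along the covered direction and stays at its prior value along the uncovered one, which is the tension between trusting offline data and exploring that the abstract refers to.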

2606.04302 2026-06-04 cs.CL cs.LG 版本更新

LazyAttention: Efficient Retrieval-Augmented Generation with Deferred Positional Encoding

LazyAttention: 高效检索增强生成中的延迟位置编码

Haocheng Xia, Mihir Pamnani, Hanxi Fang, Supawit Chockchowwat, Yongjoo Park

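Affiliations * Siebel School of Computing and Data Science, University of Illinois Urbana-Champaign; Google; Amazon

AI summary Targets the poor reusability of position-encoded KV caches in retrieval-augmented generation. LazyAttention kernelizes deferred positional encoding to enable zero-copy, position-agnostic KV reuse; under skewed document distributions it reduces time-to-first-token by 1.37x and raises inference throughput by 1.40x relative to Block-Attention.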

Comments ICML 2026

详情
AI中文摘要

键值(KV)缓存通过重用已生成令牌的过去计算来加速大型语言模型(LLM)的推理。在长上下文应用(如检索增强生成(RAG)和上下文学习(ICL))中,其重要性更加凸显。然而,传统的KV缓存将位置信息直接嵌入缓存中,限制了其可重用性。现有解决方案要么将重用限制为前缀,要么需要昂贵的内存物化来进行位置重新编码。我们引入了LazyAttention,一种新颖的注意力机制,它通过核化延迟位置编码来实现零拷贝、位置无关的KV重用。通过在注意力内核中动态调整位置编码,LazyAttention解决了物化瓶颈,使得单个物理KV副本能够服务于任意位置的多个逻辑请求。利用为预填充和解码定制的注意力内核,我们的系统实现了显著的效率提升:在偏斜的文档分布下,与最先进的Block-Attention相比,首令牌延迟(TTFT)降低了1.37倍,推理吞吐量提高了1.40倍,同时保持了可比的输出质量。

英文摘要

Key-value (KV) caching accelerates inference of large language models (LLMs) by reusing past computations for generated tokens. Its importance becomes even greater in long-context applications such as retrieval-augmented generation (RAG) and in-context learning (ICL). However, conventional KV caching embeds positional information directly into the cache, limiting its reusability. Existing solutions either restrict reuse to prefixes or require expensive memory materialization for positional re-encoding. We introduce LazyAttention, a novel attention mechanism that kernelizes deferred positional encoding to enable zero-copy, position-agnostic KV reuse. By adjusting positional encoding within attention kernels on-the-fly, LazyAttention resolves the materialization bottleneck, allowing a single physical KV copy to serve multiple logical requests at arbitrary positions. Leveraging attention kernels tailored for prefilling and decoding, our system achieves significant efficiency improvements: under skewed document distributions, it reduces time-to-first-token (TTFT) by 1.37$\times$ and increases inference throughput by 1.40$\times$ compared to the state-of-the-art Block-Attention, while maintaining comparable output quality.
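
Illustrative sketch (not from the paper): the abstract defers positional encoding so that one cached copy of a document's keys can serve requests that place the document at different positions. The snippet below assumes rotary position embeddings (RoPE), which the abstract does not name, and applies the rotation when the cache is read rather than when it is written. It runs on the CPU in plain Python; the real system does this inside fused attention kernels. All names and values are hypothetical.

```python
import math

def rope(vec, pos, base=10000.0):
    """Apply rotary position embedding to an even-length vector at position pos."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        angle = pos * base ** (-i / d)
        c, s = math.cos(angle), math.sin(angle)
        out += [vec[i] * c - vec[i + 1] * s, vec[i] * s + vec[i + 1] * c]
    return out

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

# Keys are cached WITHOUT positions baked in, so the entry is position-agnostic.
raw_key_cache = {"doc_A": [[0.3, -1.2, 0.8, 0.5], [1.0, 0.1, -0.4, 0.9]]}

def lazy_scores(query, q_pos, doc_id, doc_offset):
    """Rotate cached raw keys at read time for wherever the document
    sits in this particular request, then score against the query."""
    q = rope(query, q_pos)
    return [dot(q, rope(k, doc_offset + j))
            for j, k in enumerate(raw_key_cache[doc_id])]

query = [0.2, 0.7, -0.5, 1.1]
request_1 = lazy_scores(query, q_pos=10, doc_id="doc_A", doc_offset=3)
request_2 = lazy_scores(query, q_pos=17, doc_id="doc_A", doc_offset=10)
```

Both requests reuse the same physical cache entry at different offsets. Because rotary scores depend only on relative position, the two score lists agree here, where the query sits seven tokens after the document start in each case.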

2606.04299 2026-06-04 cs.CV cs.LG 版本更新

Efficient and Training-Free Single-Image Diffusion Models

高效且无需训练的单图像扩散模型

Haojun Qiu, Kiriakos N. Kutulakos, David B. Lindell

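Affiliations * Department of Computer Science, University of Toronto; Vector Institute

AI summary Proposes a training-free single-image diffusion model built on a dataset of multi-scale patches. An optimal closed-form denoiser replaces neural network training and achieves state-of-the-art generation quality and diversity compared to trained single-image diffusion models.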

Comments CVPR 2026; Project Page: https://haojunqiu.github.io/efficient-SID/

详情
AI中文摘要

我们考虑生成图像的问题,其内部结构——由多尺度补丁分布定义——与单个参考图像匹配。最近的方法通过训练单图像扩散模型来解决这个问题。但即使在这种设置下,训练计算成本高昂且需要数小时的优化。相反,我们使用不同尺度下的图像补丁数据集对图像进行建模。由于该数据集是有限的,且其补丁的维度较小,可以使用最优的闭式去噪器可计算地获得噪声补丁的得分函数,从而消除了神经网络训练的需要。我们将这种基于补丁的去噪器集成到一个高效、无需训练的图像扩散模型中,并描述了我们的方法如何与经典的基于补丁的图像恢复技术相联系。与训练过的单图像扩散模型相比,我们的方法实现了最先进的生成质量和多样性,并展示了应用,包括无条件图像生成、文本引导风格化、图像对称化和重定向。此外,我们展示了我们的方法与潜在空间扩散兼容,并展示了多种额外的加速技术,以实现一秒内的百万像素单图像生成和几分钟内的十亿像素生成。

英文摘要

We consider the problem of generating images whose internal structure -- defined by the distribution of patches across multiple scales -- matches that of a single reference image. Recent approaches address this problem by training a diffusion model on a single image. But even in this setting, training is computationally expensive and requires hours of optimization. Instead, we model the image using a dataset of its patches at different scales. As this dataset is finite and the dimensionality of its patches is small, the score function for a noisy patch can be computed tractably using an optimal, closed-form denoiser, eliminating the need for neural network training. We integrate this patch-based denoiser into an efficient, training-free image diffusion model, and we describe how our method connects to classical patch-based image restoration techniques. Our approach achieves state-of-the-art generation quality and diversity compared to trained single-image diffusion models, and we demonstrate applications, including unconditional image generation, text-guided stylization, image symmetrization, and retargeting. Further, we show that our approach is compatible with latent space diffusion, and we show multiple additional acceleration techniques to achieve megapixel single-image generation in one second, and gigapixel generation in minutes.
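
Illustrative sketch (not from the paper): the abstract notes that for a finite patch dataset the score of a noisy patch has a closed form. The snippet below writes out the standard result under the assumption that the noisy patch is `x = x0 + sigma * eps` with `x0` uniform over the dataset: the optimal denoiser is a softmax-weighted average of the patches, and the score follows from Tweedie's formula. The noise parameterization, function names, and toy patches are assumptions.

```python
import math

def optimal_denoise(noisy, patches, sigma):
    """Posterior mean E[x0 | x] when x = x0 + sigma*eps and
    x0 is uniform over the finite set `patches`."""
    logits = [
        -sum((n - p) ** 2 for n, p in zip(noisy, patch)) / (2.0 * sigma**2)
        for patch in patches
    ]
    top = max(logits)                       # subtract max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [
        sum(w * patch[d] for w, patch in zip(weights, patches)) / total
        for d in range(len(noisy))
    ]

def score(noisy, patches, sigma):
    """Score of the noised patch distribution: (E[x0 | x] - x) / sigma^2."""
    denoised = optimal_denoise(noisy, patches, sigma)
    return [(m - n) / sigma**2 for m, n in zip(denoised, noisy)]

patches = [[0.0, 0.0], [1.0, 1.0], [0.0, 1.0]]   # toy flattened patches
noisy = [0.9, 0.95]
low_noise = optimal_denoise(noisy, patches, sigma=0.05)
high_noise = optimal_denoise(noisy, patches, sigma=100.0)
```

At low noise the denoiser snaps to the nearest patch, and at high noise it returns roughly the dataset mean. The cost is linear in the number of patches per query, which is why the abstract stresses that the dataset is finite and the patches low-dimensional.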

2606.04290 2026-06-04 cs.LG math.OC 版本更新

PE-MHL: Physics-Encoded Modular Hybrid Layers for Scalable Learning of Complex Systems

PE-MHL: 用于复杂系统可扩展学习的物理编码模块化混合层

Ismail Hassaballa, Mircea Lazar

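Affiliations * TUE (Eindhoven University of Technology)

AI summary Proposes the Physics-Encoded Modular Hybrid Layer (PE-MHL) framework, which adds sub-models incrementally with a guarantee that training error is monotonically non-increasing, and outperforms equivalently sized monolithic networks on a nonlinear NARX benchmark and the Quanser Aero 2 platform.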

详情
AI中文摘要

结合基于物理和数据驱动的混合模型在控制应用中展现出实现准确性和可解释性的强大潜力。尽管最近的方法在融入物理一致性方面取得了进展,但在可扩展性、对噪声的鲁棒性以及模型复杂度控制方面仍存在挑战。本文提出了物理编码模块化混合层(PE-MHL)框架,其中基线基于物理的模型通过添加新的子模型逐步细化,每个新组件在保留先前组件已学知识的同时增加复杂度。我们为这种构造建立了理论保证:通过每个新子模型的最小二乘初始化,训练误差在子模型数量上单调非增并可证明收敛。在非线性NARX基准和Quanser Aero 2平台上的实证评估表明,PE-MHL在准确性和泛化能力上均优于同等规模的单体网络,同时提供更稳定的训练动态和更好的底层数据结构保留。

英文摘要

Hybrid models that combine physics-based and data-driven components have shown strong potential for achieving accuracy and interpretability in control applications. While recent methods have made progress in incorporating physical consistency, challenges remain in scalability, robustness to noise, and control of model complexity. This paper proposes a Physics-Encoded Modular Hybrid Layer (PE-MHL) framework, in which a baseline physics-based model is incrementally refined through the addition of new sub-models, where each new component adds complexity while preserving what previous components have already learned. We establish a theoretical guarantee for this construction: with a least-squares initialization of each new sub-model, the training error is monotonically non-increasing in the number of sub-models and provably converges. Empirical evaluations on a nonlinear NARX benchmark and the Quanser Aero 2 platform demonstrate that PE-MHL outperforms equivalently sized monolithic networks in both accuracy and generalization, while also providing more stable training dynamics and better preservation of underlying data structures.
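
Illustrative sketch (not from the paper): the abstract states that, with a least-squares initialization of each new sub-model, training error is non-increasing in the number of sub-models. The snippet below shows the underlying mechanism in a heavily simplified setting where each sub-model is a fixed scalar basis function with one coefficient, fitted to the current residual while earlier terms stay frozen. The paper's sub-models are learned layers; the basis functions, data, and names here are assumptions.

```python
import math

def fit_coefficient(residual, feature):
    """Least-squares scalar c minimising sum (residual - c * feature)^2."""
    gg = sum(g * g for g in feature)
    if gg == 0.0:
        return 0.0
    return sum(r * g for r, g in zip(residual, feature)) / gg

def incremental_fit(xs, ys, physics_model, basis_functions):
    """Start from a physics baseline and add one least-squares sub-model
    at a time; return the squared training error after each stage."""
    prediction = [physics_model(x) for x in xs]
    errors = [sum((y - p) ** 2 for y, p in zip(ys, prediction))]
    for g in basis_functions:
        feature = [g(x) for x in xs]
        residual = [y - p for y, p in zip(ys, prediction)]
        c = fit_coefficient(residual, feature)
        # Earlier components are left untouched; only the new term is added.
        prediction = [p + c * f for p, f in zip(prediction, feature)]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, prediction)))
    return errors

xs = [i / 10 for i in range(20)]
ys = [2 * x + 0.5 * math.sin(3 * x) + 0.1 * x * x for x in xs]   # toy data
errors = incremental_fit(
    xs, ys,
    physics_model=lambda x: 2 * x,                                # known physics part
    basis_functions=[lambda x: math.sin(3 * x), lambda x: x * x, math.cos],
)
```

The error sequence cannot increase because a zero coefficient is always available to the least-squares fit, so each new term can at worst leave the residual unchanged.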

2606.04287 2026-06-04 cs.LG cs.AI 版本更新

Scaling Novel Graph Generation via Lightweight Structure-Guided Autoregressive Models

通过轻量级结构引导自回归模型扩展新颖图生成

Alessio Barboni, Massimiliano Lupo Pasini, Bishal Lakha, Edoardo Serra

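Affiliations * Boise State University; Oak Ridge National Laboratory

AI summary Proposes a lightweight autoregressive framework that uses a structure-guided topological ordering and a two-phase training strategy to improve novelty while preserving high validity and uniqueness on molecular and non-molecular graph generation benchmarks.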

详情
AI中文摘要

生成真实且多样的图是机器学习中的一个关键问题,在分子发现、电路设计、网络安全等领域有应用。然而,当前的图生成模型在可扩展性和新颖性方面仍存在局限。基于扩散的方法通常需要昂贵的全邻接操作和长去噪链,而许多自回归和混合模型至少具有二次复杂度。此外,这些模型往往模仿训练图而非泛化到新图。我们提出一个轻量级自回归框架来解决这些问题。它使用结构引导的拓扑排序将图序列化为规则的边序列,实现近对数线性生成,以及一种两阶段训练策略,结合探索导向的增强和迭代细化,以减少过拟合并促进受控的新颖性。在分子和非分子基准上的实验表明,我们的方法在保持高有效性和唯一性的同时提高了新颖性。该框架还支持LSTM和Mamba风格的因果序列骨干,大内存加速器使得能够进行超出典型GPU限制的更长的图序列实验。

英文摘要

Generating realistic and diverse graphs is a key problem in machine learning, with applications in molecular discovery, circuit design, cybersecurity, and beyond. However, current graph generative models remain limited by scalability and novelty. Diffusion-based methods often require costly full-adjacency operations and long denoising chains, while many autoregressive and hybrid models have at least quadratic complexity. In addition, these models often imitate training graphs rather than generalize beyond them. We propose a lightweight autoregressive framework to address these issues. It uses a structure-guided topological ordering to serialize graphs into regular edge sequences, enabling near log-linear generation, and a two-phase training strategy that combines exploration-oriented augmentation with iterative refinement to reduce overfitting and promote controlled novelty. Experiments on molecular and non-molecular benchmarks show that our approach improves novelty while preserving high validity and uniqueness. The framework also supports both LSTM and Mamba-style causal sequence backbones, with large-memory accelerators enabling longer graph-sequence experiments beyond typical GPU limits.

2606.04284 2026-06-04 cs.LG cs.AI cs.CL 版本更新

Sparse Mixture-of-Experts Reward Models Learn Interpretable and Specialized Experts for Personalized Preference Modeling

稀疏混合专家奖励模型学习可解释且专业化的专家用于个性化偏好建模

Yifan Wang, Jinyi Mu, Mayank Jobanputra, Yu Wang, Ji-Ung Lee, Soyoung Oh, Isabel Valera, Vera Demberg

Affiliations * Saarland University; Independent Researcher; Bielefeld University; Max Planck Institute for Software Systems; Max Planck Institute for Informatics

AI summary Proposes a sparse Mixture-of-Experts reward model that encourages sparse routing and expert diversity while training on binary preference data, learning interpretable, specialized experts and improving test-time personalization.

详情
AI中文摘要

偏好建模在基于人类反馈的强化学习(RLHF)中扮演核心角色,使大型语言模型(LLMs)与人类价值观对齐。然而,大多数现有方法假设一个通用的奖励函数,忽视了人类偏好的多样性和异质性。为了在不增加额外标注成本的情况下解决这一限制,最近的工作提出从二元数据中学习多个偏好组件,并组合它们以建模个体偏好。然而,这些组件往往无法捕捉连贯且解耦的模式,限制了其可解释性和个性化效果。在这项工作中,我们提出了一种稀疏混合专家(MoE)奖励模型,该模型在二元偏好数据训练过程中鼓励稀疏路由和专家多样性。在受控和真实世界的实验中,稀疏MoE学习了可解释的路由模式和专业化的专家。它还改进了测试时的个性化,并且适应后的专家权重变化为分析模型如何适应个性化偏好提供了定性视角。

英文摘要

Preference modeling plays a central role in reinforcement learning from human feedback (RLHF), enabling large language models (LLMs) to align with human values. However, most existing approaches assume a universal reward function, neglecting the diversity and heterogeneity of human preferences. To address this limitation without additional annotation costs, recent work has proposed learning multiple preference components from binary data and combining them to model individual preferences. Nevertheless, these components often fail to capture coherent and disentangled patterns, limiting their interpretability and effectiveness for personalization. In this work, we propose a sparse Mixture-of-Experts (MoE) reward model that encourages sparse routing and expert diversity during training on binary preference data. Across controlled and real-world experiments, sparse MoE learns interpretable routing patterns and specialized experts. It also improves test-time personalization, and post-adaptation shifts in expert weights provide a qualitative lens for analyzing how the model adapts to personalized preferences.
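
Illustrative sketch (not from the paper): the abstract combines sparse expert routing with training on binary preference pairs. The snippet below assumes two generic ingredients, top-k softmax gating over router logits and a Bradley-Terry pairwise loss, to show how a gated reward could be scored on a chosen/rejected pair. The paper's actual sparsity and diversity objectives are not reproduced, and every name and number is hypothetical.

```python
import math

def top_k_gate(router_logits, k):
    """Softmax over the k largest router logits; other experts get weight 0."""
    keep = sorted(range(len(router_logits)),
                  key=lambda i: router_logits[i], reverse=True)[:k]
    top = max(router_logits[i] for i in keep)
    exps = {i: math.exp(router_logits[i] - top) for i in keep}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0
            for i in range(len(router_logits))]

def moe_reward(expert_rewards, router_logits, k=2):
    """Scalar reward: gate-weighted sum of per-expert reward scores."""
    gate = top_k_gate(router_logits, k)
    return sum(g * r for g, r in zip(gate, expert_rewards))

def preference_loss(reward_chosen, reward_rejected):
    """Bradley-Terry negative log-likelihood: -log sigmoid(r_chosen - r_rejected)."""
    return math.log1p(math.exp(-(reward_chosen - reward_rejected)))

router_logits = [2.0, 0.1, 1.0, -3.0]          # toy routing scores for 4 experts
gate = top_k_gate(router_logits, k=2)
r_chosen = moe_reward([1.5, -0.2, 0.4, 0.0], router_logits)
r_rejected = moe_reward([0.1, 0.9, -0.6, 0.3], router_logits)
loss = preference_loss(r_chosen, r_rejected)
```

Inspecting which entries of `gate` are non-zero for a given user or prompt is one simple way routing patterns could be read, in the spirit of the interpretability claim in the abstract.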

2606.04280 2026-06-04 cs.LG cs.AI cs.IR 版本更新

The Loss Is Not Enough: Sampling Conditions and Inductive Bias in Contrastive Representation Learning

损失还不够:对比表示学习中的采样条件和归纳偏置

Justinas Zaliaduonis, Patrick Putzky, Till Richter, Sergios Gatidis

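Affiliations * ETH Zürich

AI summary Formalizes the diversity condition in contrastive learning within a measure-theoretic framework, introduces a support-corrected InfoNCE variant, and empirically examines how sampling diversity interacts with encoder inductive bias.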

详情
AI中文摘要

对比学习已成为自监督表示学习的主要范式,但其恢复有意义潜在几何的条件尚未完全理解。我们开发了一个测度论框架,形式化了多样性条件,即正对采样的支持要求,这是等距潜在恢复所必需的。我们表明,标准的全支持von Mises-Fisher设置意味着满足多样性条件,因此全局对比损失最小化器可以恢复潜在几何(直到正交变换),而受限条件分布可以使非正交映射达到严格更低的渐近对比损失。我们引入了一种支持校正的信息噪声对比估计(InfoNCE)变体作为理论修复:这种校正使得正交潜在空间恢复成为可能,但并不能唯一选择它。在合成基准上的实验验证了可识别性预测,CIFAR-10实验与定性预测一致,即当采样多样性有限时,架构归纳偏置变得更加重要。总之,我们的结果阐明了采样机制和编码器归纳偏置在对比表示学习中的相互作用。

英文摘要

Contrastive learning has become a leading paradigm for self-supervised representation learning, yet the conditions under which it recovers meaningful latent geometry remain incompletely understood. We develop a measure-theoretic framework formalizing the diversity condition, a support requirement on positive-pair sampling that is necessary for isometric latent recovery. We show that the standard full-support von Mises-Fisher setting implies the satisfaction of the diversity condition and as a consequence global contrastive loss minimizers recover latent geometry up to orthogonal transformation, while restricted conditionals can make non-orthogonal maps attain strictly lower asymptotic contrastive loss. We introduce a support-corrected Information Noise Contrastive Estimation (InfoNCE) variant as a theoretical fix: this correction makes orthogonal latent space recovery achievable but does not uniquely select it. Experiments on synthetic benchmarks validate the identifiability predictions, and CIFAR-10 experiments are consistent with the qualitative prediction that architectural inductive bias becomes more important when sampling diversity is limited. Together, our results clarify how sampling mechanisms and encoder inductive bias interact in contrastive representation learning.
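
Illustrative sketch (not from the paper): for readers unfamiliar with the objective being analysed, the snippet below computes the standard InfoNCE loss for a single anchor using cosine similarity on unit-normalised embeddings. It is the plain loss, not the support-corrected variant the abstract introduces; the temperature value and function name are assumptions.

```python
import math

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE for one anchor: negative log-softmax of the positive's
    similarity among the positive and all negatives."""
    def normalise(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    a = normalise(anchor)
    sims = [
        sum(x * y for x, y in zip(a, normalise(c))) / temperature
        for c in [positive] + negatives
    ]
    top = max(sims)                                   # log-sum-exp stabilisation
    log_denominator = top + math.log(sum(math.exp(s - top) for s in sims))
    return log_denominator - sims[0]

aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0], [-1.0, 0.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0], [-1.0, 0.0]])
no_negatives = info_nce([1.0, 0.0], [0.5, 0.5], [])
```

How the positive is sampled given the anchor is exactly the conditional distribution whose support the paper's diversity condition constrains; this sketch takes the pairs as given.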

2606.04279 2026-06-04 cs.LG quant-ph 版本更新

Derivative Informed Learning of Exchange-Correlation Functionals

交换相关泛函的导数知情学习

Eike S. Eberhard, Luca A. Thiede, Abdul Aldossary, Andreas Burger, Nicholas Gao, Vignesh Bhethanabotla, Alán Aspuru-Guzik, Stephan Günnemann

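Affiliations * Technical University of Munich; Munich Data Science Institute; Munich Center for Machine Learning; University of Toronto; Vector Institute; CuspAI; California Institute of Technology

AI summary Proposes the Derivative Informed XC-Loss (DI-Loss), which supervises first and second derivatives of the energy on the Grassmannian of density matrices to train $\mathcal{O}(N^3)$-scaling machine-learned exchange-correlation functionals that reproduce B3LYP/def2-SVP targets. Averaged across four architectures, total-energy MAE falls by 66%, and hybrid-functional SCF iterations drop by up to 50%.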

Comments Proceedings of the 43rd International Conference on Machine Learning

详情
AI中文摘要

机器学习(ML)交换相关(XC)泛函旨在通过直接从参考数据学习来替代人工设计的密度泛函近似,但它们仍未能持续优于传统的$\mathcal{O}(N^4)$标度混合泛函。我们研究了一种混合蒸馏设置,其中$\mathcal{O}(N^3)$标度的ML-XC泛函被训练以复现B3LYP/def2-SVP目标。我们引入了导数知情XC损失(DI-Loss),该损失通过监督能量在可容许密度矩阵的Grassmannian上的一阶和二阶导数,融入了来自参考混合泛函的额外信息。DI-Loss不仅匹配自洽不动点,还将学习到的泛函的局部一阶和二阶响应与目标泛函对齐。在四个评估的架构中,DI-Loss一致地改善了主要能量指标。在所有架构上均匀平均,总能量MAE相对于仅使用能量和密度监督降低了66%。密度敏感的均场能量度量$E_ρ$平均从1.2 mEh改善到0.8 mEh,而偶极子和$\mathcal{L}_2$密度误差并未均匀改善。我们进一步表明,来自蒸馏泛函的密度将混合泛函的SCF迭代次数减少了高达50%。在下游TDDFT计算中,Hessian监督改善了激发态预测,XCdiff将平均激发能MAE降低了19-35%。

英文摘要

Machine-learned (ML) exchange-correlation (XC) functionals aim to replace human-designed density functional approximations by learning directly from reference data, but they still do not consistently outperform traditional $\mathcal{O}(N^4)$-scaling hybrid functionals. We study a hybrid-distillation setting in which $\mathcal{O}(N^3)$-scaling ML-XC functionals are trained to reproduce B3LYP/def2-SVP targets. We introduce Derivative Informed XC-Loss (DI-Loss), a loss that incorporates additional information from the reference hybrid functional by supervising first and second derivatives of the energy on the Grassmannian of admissible density matrices. Rather than only matching the self-consistent fixed point, DI-Loss aligns the local first- and second-order response of the learned functional with that of the target functional. Across four evaluated architectures, DI-Loss consistently improves the main energy metrics. Averaged uniformly across architectures, the total-energy MAE decreases by 66% relative to energy and density supervision alone. The density-sensitive mean-field energy metric $E_\rho$ improves from $1.2$ to $0.8$ mEh on average, while dipole and $\mathcal{L}_2$ density errors do not improve uniformly. We further show that densities from the distilled functionals reduce hybrid-functional SCF iterations by up to 50%. In downstream TDDFT calculations, Hessian supervision improves excited-state predictions, with XCdiff reducing the mean excitation-energy MAE by 19-35%.

2606.04275 2026-06-04 cs.LG cs.AI 版本更新

From Ticks to Flows: Dynamics of Neural Reinforcement Learning in Continuous Environments

从时间步到流:连续环境中神经强化学习的动力学

Saket Tiwari, Tejas Kotwal, George Konidaris

发表机构 * Brown University(布朗大学)

AI总结 本文通过将深度强化学习建模为连续时间随机过程,利用随机控制理论,首次推导了连续环境下过参数化神经演员-评论家算法在无限宽度极限下的状态分布演化方程。

Comments Presented at ICLR 2026: https://openreview.net/forum?id=TdiRLe3rPA

详情
AI中文摘要

我们提出了一种新颖的深度强化学习(RL)在连续环境中的理论框架,通过借鉴随机控制的思想,将问题建模为连续时间随机过程。在先前工作的基础上,我们引入了一个可行的演员-评论家算法模型,该模型同时包含探索和随机转移。对于单隐藏层神经网络,我们表明环境状态可以表述为两个时间尺度的过程:环境时间和梯度时间。在此框架下,我们描述了表示环境状态和累积折扣回报估计的时间相关随机变量如何在两层网络的无限宽度极限下随梯度步长演化。利用随机微分方程理论,我们首次在连续RL中推导出一个方程,描述了在极小的学习率下,每个梯度步长上状态分布的无穷小变化。总体而言,我们的工作为研究过参数化神经演员-评论家算法提供了一种新颖的非参数化表述。我们通过一个简单的连续控制任务实证验证了我们的理论结果。

英文摘要

We present a novel theoretical framework for deep reinforcement learning (RL) in continuous environments by modeling the problem as a continuous-time stochastic process, drawing on insights from stochastic control. Building on previous work, we introduce a viable model of an actor-critic algorithm that incorporates both exploration and stochastic transitions. For single-hidden-layer neural networks, we show that the state of the environment can be formulated as a two-time-scale process: the environment time and the gradient time. Within this formulation, we characterize how the time-dependent random variables that represent the environment's state and the estimate of the cumulative discounted return evolve over gradient steps in the infinite-width limit of two-layer networks. Using the theory of stochastic differential equations, we derive, for the first time in continuous RL, an equation describing the infinitesimal change in the state distribution at each gradient step, under a vanishingly small learning rate. Overall, our work provides a novel nonparametric formulation for studying overparametrized neural actor-critic algorithms. We empirically corroborate our theoretical result using a toy continuous control task.

2606.04272 2026-06-04 cs.LG 版本更新

RL Excursions during Pre-Training: Re-examining Policy Optimization for LLM training

预训练期间的强化学习探索:重新审视LLM训练中的策略优化

Rachit Bansal, Clara Mohri, Tian Qin, David Alvarez-Melis, Sham Kakade

发表机构 * Harvard University(哈佛大学)

AI总结 本文质疑LLM标准训练流程中仅在预训练和监督微调后使用强化学习的做法,通过从头训练LLM并在中间检查点直接应用RL、SFT及SFT后RL,发现RL早期有效且能匹配完整流程,同时提出并行平均合并RL和SFT目标的方法在保持通用能力的同时优于其他方法。

AI中文摘要

标准的LLM训练流程仅在预训练和监督微调(SFT)之后应用强化学习(RL)。我们通过从头训练LLM,并直接在中间预训练检查点上应用RL、SFT以及SFT后接RL,来质疑这一现状。我们发现RL在早期非常有效,并且通常也能在早期匹配完整的SFT→RL流程。通过在更难问题上的实验,我们发现针对性的预训练数据组成是RL有效性的强大杠杆,甚至比模型规模更重要。除了推理准确性之外,直接将RL应用于基础检查点会扩展模型的分布;而最近工作中报告的锐化效应仅在RL跟随SFT时出现。RL基本不改变模型的通用能力,而SFT后通用能力会下降。最后,我们通过并行平均合并RL和SFT目标,该方法在所有其他训练方法中表现最佳,跨指标均优于其他方法,同时保持通用能力。这些结果表明,LLM训练可能受益于RL的更广泛使用。

英文摘要

The standard LLM training pipeline applies reinforcement learning (RL) only after pre-training and supervised fine-tuning (SFT). We question this status quo by training an LLM from scratch and applying RL, SFT, and SFT followed by RL directly to intermediate pre-training checkpoints. We find that RL is effective very early, and often matches the full SFT$\to$RL pipeline early as well. Through experiments on harder problems, we find that targeted pre-training data composition is a strong lever for RL effectiveness, even more so than model scale. Beyond reasoning accuracy, applying RL directly to base checkpoints expands the model's distribution; the sharpening effect reported in recent work arises only when RL follows SFT. The general capabilities of the model remain essentially unchanged by RL, while they degrade following SFT. Finally, we merge RL and SFT objectives by parallel averaging, which outperforms all other training methods discussed across metrics while preserving general capabilities. Together, these results suggest that LLM training might benefit from an expanded use of RL.

2606.04266 2026-06-04 cs.CR cs.LG 版本更新

Long-Term and Short-Term Transistor Aging in Deep Neural Networks: Impact and Mitigation

深度神经网络中的长期与短期晶体管老化:影响与缓解

Alireza Sarmadi, Virinchi Roy Surabhi, Prashanth Krishnamurthy, Hussam Amrouch, Ramesh Karri, Farshad Khorrami

发表机构 * Dept. of Electrical and Computer Engineering, New York University (NYU) Tandon School of Engineering(纽约大学电气与计算机工程系(Tandon工程学院)) School of Computation, Information and Technology, Technical University of Munich (TUM)(慕尼黑技术大学计算、信息与技术学院)

AI总结 本文研究了长期和短期晶体管老化对深度神经网络推理精度的影响,并提出了一种老化感知重训练方法来缓解性能下降。

Comments 28 pages, 16 figures

AI中文摘要

深度神经网络(DNN)被用于各种实际应用,例如图像分类和语音识别。在集成电路(IC)的硬件上实现的DNN的推理精度会在晶体管老化等现象下下降。老化会减慢晶体管的开关速度,由于时钟无法维持而导致系统级时序违规。为了在整个预期寿命内保持可靠性,设计人员添加保护带以防止时序违规;然而,添加大的时序保护带会导致性能(速度或吞吐量)损失。本章详细讨论了长期和短期晶体管老化对DNN推理精度的影响。此外,为了减轻老化对DNN精度的影响并控制它们,提出了一种老化感知重训练方法,以生成即使在激进(即小于所需)保护带下也具有弹性的DNN。这提高了DNN在老化引起的退化情况下的推理精度。本章在用于图像分类的DNN硬件实现上,使用现成的图像数据集讨论了这些影响以及缓解策略。还简要讨论了短期老化作为检测集成电路中硬件木马的激励机制的应用。

英文摘要

Deep neural networks (DNNs) are used in a variety of real-world applications including, for example, image classification and speech recognition. The inference accuracy of DNNs implemented on hardware in integrated circuits (ICs) degrades under phenomena such as transistor aging. Aging slows down the switching speed of transistors, resulting in system-level timing violations due to unsustainable clocks. To maintain reliability for the entire projected lifetime, designers add guardbands to prevent timing violations; however, adding large timing guardbands causes losses in performance (speed or throughput). This chapter provides a detailed discussion of the effects of long-term and short-term transistor aging on DNN inference accuracy. Furthermore, to mitigate and contain the effects of aging on DNN accuracy, a methodology for aging-aware retraining is presented that generates a resilient DNN even when aggressive (i.e., smaller than required) guardbands are used. This improves the inference accuracy of the DNNs even in the presence of aging-induced degradation. These effects are discussed in this chapter along with mitigation strategies on a hardware implementation of a DNN for image classification on an off-the-shelf image dataset. The application of short-term aging as an excitation mechanism for the detection of hardware Trojans in integrated circuits is also briefly discussed.

2606.04265 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Nonlocal Mean Field Schrödinger Bridge with Learned Interactions

具有学习相互作用的非局部平均场薛定谔桥

Daisuke Inoue, Mathieu Laurière, Dante Kalise

发表机构 * Department of Mathematics, Imperial College London(伦敦帝国学院数学系) Shanghai Frontiers Science Center of Artificial Intelligence and Deep Learning(上海前沿人工智能与深度学习科学中心) NYU-ECNU Institute of Mathematical Sciences, NYU Shanghai(纽约大学上海数学科学研究所)

AI总结 本文提出一种使用神经网络代理近似非局部相互作用的平均场薛定谔桥方法,将推理时的每步计算成本从二次降低到线性,并推导了代理误差传播的稳定性界限。

Comments 31 pages, 15 figures

AI中文摘要

薛定谔桥问题构建一个以最小能量连接初始分布和终端分布的随机过程。本文考虑其平均场扩展,即平均场薛定谔桥,用于相互作用粒子系统。对于非局部相互作用,评估产生的依赖于粒子的分布项的计算量随种群规模呈二次增长,这使得大规模问题难以处理。我们通过使用神经网络代理近似非局部相互作用来解决这一瓶颈。由此产生的四阶段交替算法将推理时每步成本从种群规模的二次降低到线性。我们还推导了Grönwall型稳定性界限,显示代理误差如何传播到生成的轨迹。在导航和意见动力学任务的数值实验中,所提出的方法再现了通过解析评估获得的轨迹,并减少了训练时间。

英文摘要

The Schrödinger Bridge Problem constructs a stochastic process that connects an initial distribution to a terminal distribution with minimum energy. This work considers its mean-field extension, the Mean-Field Schrödinger Bridge, for interacting particle systems. With nonlocal interactions, evaluating the resulting particle-dependent distributional terms can scale quadratically with the population size, which makes large-scale problems intractable. We address this bottleneck by approximating the nonlocal interactions with neural network surrogates. The resulting four-stage alternating algorithm reduces the per-step cost from quadratic to linear in the population size at inference. We also derive Grönwall-type stability bounds that show how surrogate errors propagate to the generated trajectories. In numerical experiments on navigation and opinion-dynamics tasks, the proposed method reproduces trajectories obtained with analytical evaluation and reduces training time.

2606.04261 2026-06-04 cs.AI cs.CL cs.CV cs.ET cs.LG 版本更新

Can Generalist Agents Automate Data Curation?

通用智能体能否自动化数据筛选?

Feiyang Kang, Hanze Li, Adam Nguyen, Mahavir Dabas, Jiaqi W. Ma, Frederic Sala, Dawn Song, Ruoxi Jia

发表机构 * Virginia Tech(弗吉尼亚理工大学) University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校) University of Wisconsin-Madison(威斯康星大学麦迪逊分校) University of California, Berkeley(加州大学伯克利分校)

AI总结 本文提出Curation-Bench基准,通过通用编码智能体自动化数据筛选循环,实验表明现成智能体可达到强基线,但存在执行-研究差距,而结构化方法引导的智能体能在十分之一数据预算下自主组合出优于强基线的数据选择策略。

Comments Preprint

AI中文摘要

训练数据的筛选是现代AI开发中最重要但劳动密集的部分之一:实践者根据嘈杂的基准反馈迭代地提出、实施、评估和修订数据策略。我们探究通用编码智能体能否自动化这一数据筛选循环。我们引入了*Curation-Bench*,一个以智能体为中心的基准,它固定模型、训练配方和评估套件,同时赋予智能体命令行权限以检查数据、实施策略、提交到固定的训练/评估流水线并进行修订。在视觉-语言指令微调实例中,现成智能体在十次迭代内达到了已发表的强数据选择基线。然而,轨迹分析揭示了持续的*执行-研究差距*:即使提供了策略指南和论文参考,智能体主要调整局部策略变体,而非探索新的策略家族。要求每次迭代引用、实例化和改编先前方法的框架将智能体转向方法引导的探索。这种框架化的智能体自主组合——无需人工设计输入——一种数据选择策略,在十分之一的数据预算下优于已发表的强基线。总体而言,当前智能体可以运行筛选循环,但可靠的数据研究需要框架化的方法适应,而非仅靠开放式提示。代码和基准已开源。

英文摘要

Curating training data is among the most consequential yet labor-intensive parts of modern AI development: practitioners iteratively propose, implement, evaluate, and revise data policies against noisy benchmark feedback. We ask whether generalist coding agents can automate this data-curation loop. We introduce *Curation-Bench*, an agent-centric benchmark that fixes the model, training recipe, and evaluation suite while giving agents command-line access to inspect data, implement policies, submit them to a fixed training/evaluation pipeline, and revise. In a vision-language instruction-tuning instantiation, out-of-the-box agents reach strong published data-selection baselines within ten iterations. However, trajectory analysis reveals a persistent *execution-research gap*: agents mainly tune local policy variants rather than explore new policy families, even when given strategy guides and paper references. Scaffolds requiring each iteration to cite, instantiate, and adapt a prior method shift agents toward method-guided exploration. The scaffolded agent autonomously composes -- without human design input -- a data-selection policy that outperforms strong published baselines at one-tenth their data budget. Overall, current agents can run the curation loop, but reliable data research requires scaffolded method adaptation, not open-ended prompting alone. Code and benchmark are open-sourced.

2606.04244 2026-06-04 cs.AI cs.CL cs.CV cs.LG 版本更新

VAMPS: Visual-Assisted Mathematical Problem Solving Benchmark

VAMPS: 视觉辅助数学问题求解基准

Amirhossein Dabiriaghdam, Shayan Vassef, Mohammadreza Bakhtiari, Yasamin Medghalchi, Ilker Hacihaliloglu, Mesrob Ohannessian, Lele Wang, Giuseppe Carenini

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出VAMPS基准,通过1,168道双语多选题评估多模态大模型在借助绘图工具进行数学推理时的表现,发现直接解析求解优于工具辅助视觉求解。

AI中文摘要

多模态大语言模型在复杂推理方面能力日益增强,但当它们必须通过工具外部化问题然后基于工具输出进行推理时,尤其是在依赖视觉辅助的情况下,其性能往往会下降。这一差距尤为重要,因为真实的工程和科学工作流程通常依赖可视化工具进行分析、验证和决策。为了研究这一差异,我们引入了VAMPS(视觉辅助数学问题求解),一个用于图辅助数学的基准。VAMPS包含1,168个多模态、双语选择题问答对,这些题目来自伊朗大学入学考试的代数和微积分问题,并通过人工审核的LLM生成的合成变体进行了扩展,所有题目都经过精心挑选,使得绘图能够通过揭示交点、极值、渐近线等提供自然的求解策略。VAMPS旨在用于基准测试和诊断,它超越了以往主要评估在固定视觉输入上进行推理的多模态基准,通过测试模型是否能够从构建有用的图形中受益并将其答案基于结果可视化。总体而言,我们发现,在一组多样化的模型中,直接解析求解出人意料地优于工具辅助的视觉求解,即使在绘图是自然策略的问题上也是如此。

英文摘要

Multimodal large language models are increasingly capable of complex reasoning, yet their performance often degrades when they must externalize a problem through a tool and then reason over the tool's output, specifically when they rely on visual aids. This gap is especially important because real engineering and scientific workflows often rely on visualization tools for analysis, validation, and decision-making. To study this discrepancy, we introduce VAMPS (Visual-Assisted Mathematical Problem Solving), a benchmark for graph-assisted mathematics. VAMPS contains 1,168 multimodal, bilingual multiple-choice question-answer pairs drawn from Iranian University Entrance Exam algebra and calculus problems and expanded with human-reviewed LLM-generated synthetic variants, all selected so that plotting provides a natural solution strategy by revealing intersections, extrema, asymptotes, etc. Designed for both benchmarking and diagnosis, VAMPS goes beyond prior multimodal benchmarks that primarily evaluate reasoning over fixed visual inputs by testing whether a model can benefit from constructing a useful graph and grounding its answer in the resulting visualization. Overall, we found that across a diverse set of models, direct analytical solving surprisingly outperforms tool-enabled visual solving, even on problems where plotting is a natural strategy.

2606.04238 2026-06-04 cs.LG cs.AI 版本更新

Recover-LoRA for Aggressive Quantization: Reclaiming Accuracy in 2-Bit Language Models via Low-Rank Adaptation with Knowledge Distillation on Synthetic Data

Recover-LoRA 用于激进量化:通过低秩适配与合成数据知识蒸馏恢复2比特语言模型的精度

Devleena Das, Rajeev Patwari, Elliott Delaye, Ashish Sirasao

发表机构 * Advanced Micro Devices, Inc.(超威半导体公司)

AI总结 针对2比特激进量化导致的大语言模型精度严重下降问题,提出Recover-LoRA方法,结合选择性混合精度策略(仅MLP的gate和up层量化为2比特)和基于合成数据蒸馏的低秩适配训练,在Qwen3-4B上以1万合成样本在12个基准中恢复9个基准80-95%的精度。

AI中文摘要

将权重激进量化至2比特精度可大幅提升大语言模型推理的吞吐量和内存效率,但通常会导致严重的精度下降。这些增益对于内存容量和带宽为主要限制的边缘和设备端部署尤为重要。在本工作中,我们将Recover-LoRA——一种最初为通用模型权重损坏设计的轻量级、无需数据的精度恢复方法——扩展到超低比特量化场景。我们提出了一种选择性混合精度策略,其中仅MLP的gate和up投影层被量化为2比特(W2),而所有其他线性层保持更高精度,从而形成混合精度的GateUp配置。通过三个模型系列(4B-20B)和两个硬件平台的屋顶线分析,我们证明W4/W2-GateUp部署(4比特基础加2比特gate/up)相比均匀W4可实现7.5-23.3%的TPS提升(取决于模型和上下文长度),同时将量化误差限制在可预测的层子集内。然后,我们应用Recover-LoRA——在量化层上通过合成数据的logit蒸馏训练低秩适配器——来恢复因gate和up层的2比特量化而损失的精度。在Qwen3-4B的案例研究中,Recover-LoRA仅使用1万合成训练样本且无需标注数据,就在12个基准中的9个上实现了80-95%的精度恢复。我们进一步证明,对于基于蒸馏的恢复,合成数据的表现与精心整理的标注数据相当,并且恢复结果可泛化到分布外评估任务。我们的结果表明,Recover-LoRA是一种实用的后量化精度恢复工具,适用于部署场景中的激进权重压缩。

英文摘要

Aggressive weight quantization to 2-bit precision offers substantial throughput and memory gains for large language model (LLM) inference, but typically incurs severe accuracy degradation. These gains are particularly relevant for edge and on-device deployment, where memory capacity and bandwidth are primary constraints. In this work, we extend Recover-LoRA -- a lightweight, data-free accuracy recovery method originally developed for general model weight corruption -- to the setting of ultra-low-bit quantization. We propose a selective mixed-precision strategy in which only gate and up projection layers of the MLP are quantized to 2-bit (W2), while all other linear layers remain at higher precision, yielding a mixed-precision GateUp configuration. We demonstrate via roofline analysis across three model families (4B--20B) and two hardware platforms that a W4/W2-GateUp deployment (4-bit base with 2-bit gate/up) delivers 7.5--23.3\% TPS improvement over uniform W4 depending on model and context length, while confining quantization error to a predictable subset of layers. We then apply Recover-LoRA -- training low-rank adapters on the quantized layers via logit distillation with synthetic data -- to recover accuracy lost from 2-bit quantization of the gate and up layers. In a case study on Qwen3-4B, Recover-LoRA achieves 80--95\% accuracy recovery on 9 of 12 benchmarks, using only 10k synthetic training samples and no labeled data. We further demonstrate that synthetic data performs comparably to curated labeled data for distillation-based recovery, and that recovery generalizes to out-of-distribution evaluation tasks. Our results present Recover-LoRA as a practical post-quantization accuracy recovery tool for aggressive weight compression in deployment settings.
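
To make the accuracy cost of 2-bit weights concrete, the following is a minimal sketch of round-to-nearest uniform quantization, comparing 2-bit and 4-bit reconstruction error on a toy weight vector. It is an illustration only: the quantizer, the function names, and the sample weights are assumptions, not the quantization scheme or the Recover-LoRA training procedure used in the paper.

```python
def quantize_uniform(weights, bits):
    """Round-to-nearest asymmetric uniform quantization of a list of floats.

    Returns the dequantized values, i.e. what the layer would compute with.
    This is a generic textbook quantizer, assumed here for illustration.
    """
    levels = 2 ** bits                      # 2 bits -> 4 levels, 4 bits -> 16
    lo, hi = min(weights), max(weights)
    if hi == lo:                            # constant tensor: nothing to quantize
        return list(weights)
    scale = (hi - lo) / (levels - 1)
    codes = [round((w - lo) / scale) for w in weights]  # integers in [0, levels - 1]
    return [lo + c * scale for c in codes]


def mean_abs_error(a, b):
    """Average absolute difference between two equally long lists."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


# Hypothetical weights standing in for one row of a gate/up projection.
weights = [-0.9, -0.35, -0.1, 0.0, 0.05, 0.2, 0.6, 1.1]
w2 = quantize_uniform(weights, bits=2)
w4 = quantize_uniform(weights, bits=4)
err2 = mean_abs_error(weights, w2)
err4 = mean_abs_error(weights, w4)
```

With only four representable values, `err2` is noticeably larger than `err4`; this is the kind of gap that a recovery method applied to the quantized layers would aim to close.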

2606.04236 2026-06-04 cs.CL cs.AI cs.LG 版本更新

Supportive Token Revealing for Fast Diffusion Language Model Decoding

支持性标记揭示:快速扩散语言模型解码

Giries Abu Ayoub, Mario Barbara, Lluís Pastor-Pérez, Tanja Bien, Aneesh Barthakur, Alaa Maalouf, Loay Mualem

发表机构 * Department of Computer Science, University of Haifa(海法大学计算机科学系) Institute for AI, University of Stuttgart(斯图加特大学人工智能研究所) IMPRS-IS

AI总结 提出AXON模块,通过选择注意力、不确定性和置信度信号中的锚点标记来改善扩散语言模型并行解码的质量-延迟权衡。

AI中文摘要

离散扩散语言模型可以通过并行更新多个掩码位置来高效生成文本,但这种并行性引入了质量-延迟权衡。激进的解码可能过早提交相互依赖的标记,而保守的解码则需要大量去噪步骤。现有方法通过使用置信度或依赖性标准决定哪些标记可以安全揭示来解决这一矛盾。然而,避免不安全提交并不一定使剩余的掩码序列易于解码,因为不确定的标记可能依赖于掩码标记,从而成为去噪步骤的瓶颈。我们提出AXON,一个无需训练的模块,可添加到现有扩散语言模型的并行解码策略之上。AXON不替换基础解码器,而是监控剩余不确定的掩码标记,并仅当它们当前状态表明需要额外上下文时才进行干预。然后它将标准从揭示哪些标记最安全转变为哪些自信揭示最能支持后续去噪。AXON使用注意力、不确定性和置信度信号选择锚点,即不确定位置关注的自信掩码标记。在多个扩散语言模型的推理和代码生成基准上的实验表明,AXON改善了现有并行解码器的质量-延迟权衡,通常减少函数评估次数,同时保持或提高准确性。

英文摘要

Discrete diffusion language models can generate text efficiently by updating multiple masked positions in parallel, but this parallelism introduces a quality-latency trade-off. Aggressive decoding may commit mutually dependent tokens too early, while conservative decoding requires many denoising steps. Existing methods address this tension by deciding which tokens are safe to reveal using confidence or dependency criteria. However, avoiding unsafe commits does not necessarily make the remaining masked sequence easy to decode, since uncertain tokens may depend on masked tokens, creating a bottleneck for denoising steps. We propose AXON, a training-free module that can be added on top of existing parallel decoding strategies for diffusion language models. Rather than replacing the base decoder, AXON monitors the remaining uncertain masked tokens and intervenes only when their current state suggests that additional context is needed. It then shifts the criterion from which tokens are safest to reveal to which confident reveals would best support later denoising. AXON selects anchors, confident masked tokens that uncertain positions attend to, using attention, uncertainty, and confidence signals. Experiments on reasoning and code-generation benchmarks across multiple diffusion language models show that AXON improves the quality-latency trade-off of existing parallel decoders, often reducing the number of function evaluations while maintaining or improving accuracy.

2606.04210 2026-06-04 eess.AS cs.LG cs.SD 版本更新

Representation Matters in Randomized Smoothing for Audio Classification

表示在音频分类的随机平滑中至关重要

Jong-Ik Park, Shreyas Chaudhari, José M. F. Moura, Carlee Joe-Wong

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 研究随机平滑在音频分类中的表示问题,通过实验揭示预处理和表示选择对认证鲁棒性的影响,并提出报告规范。

AI中文摘要

随机平滑(RS)在添加高斯噪声的向量空间中认证鲁棒性。在音频分类中,该空间通常不是唯一确定的,因为标准流程会对波形进行归一化、范围控制,并将其转换为log-mel或其他频谱特征。我们表明,除非认证对象和预处理策略明确,否则直接RS是欠定义的。在两个音频基准(关键词识别和环境声音分类)上,我们研究了波形、特征空间和后处理平滑。我们的诊断显示了为什么表示感知的报告是必要的:在相同的平滑水平$σ=0.0025$下,两个数据集共享相同的中位数原始半径$0.007996$,但不同的波形能量产生不同的SNR等效尺度($83.98$ vs. $90.97$ dB);log-mel平滑在环境声音上给出更高的正半径认证准确率($68.42\%$ vs. $65.53\%$),认证了更多具有非零半径的样本,但基于特征而非波形;裁剪或峰值归一化将有效扰动范数改变约$230$--$351\times$。因此,我们建议音频RS研究选择并报告任务特定的认证对象和扰动模型,包括扰动位置、增益策略、原始半径以及任何噪声后的几何变化。

英文摘要

Randomized smoothing (RS) certifies robustness in the vector space where Gaussian noise is added. In audio classification, this space is often not uniquely defined, because standard pipelines normalize, range-control, and transform waveforms into log-mel or other spectral features. We show that direct RS is therefore under-specified unless the certified object and preprocessing policy are explicit. On two audio benchmarks, keyword spotting and environmental-sound classification, we study waveform, feature-space, and post-processed smoothing. Our diagnostics show why representation-aware reporting is necessary: at the same smoothing level $σ=0.0025$, the two datasets share the same median raw radius $0.007996$, but different waveform energies yield different SNR-equivalent scales ($83.98$ vs. $90.97$ dB); log-mel smoothing gives higher positive-radius certified accuracy on environmental sounds ($68.42\%$ vs. $65.53\%$), certifying more examples with nonzero radius but over features rather than waveforms; and clipping or peak normalization changes the effective perturbation norm by roughly $230$--$351\times$. We therefore recommend that audio RS studies choose and report the task-specific certified object and perturbation model, including the perturbation location, gain policy, raw radius, and any post-noise geometry changes.
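
The dependence of a certificate on the space in which noise is added can be sketched with the standard Gaussian smoothing radius, $R = σ\,Φ^{-1}(p)$, where $p$ is a lower bound on the top-class probability under noise. The code below is a hedged illustration: the function names and the fixed-gain model are assumptions, and real peak normalization uses an input-dependent gain that this sketch does not handle.

```python
from statistics import NormalDist


def certified_l2_radius(sigma, p_lower):
    """Gaussian randomized-smoothing radius R = sigma * Phi^{-1}(p_lower).

    p_lower is a lower confidence bound on the top-class probability under
    noise and must be strictly below 1. The smoothed classifier abstains
    (radius 0) when p_lower <= 0.5.
    """
    if p_lower <= 0.5:
        return 0.0
    return sigma * NormalDist().inv_cdf(p_lower)


def raw_waveform_radius(radius, gain):
    """Map a radius certified after scaling the waveform by a fixed `gain`
    back to the raw waveform: ||gain * delta|| <= R  <=>  ||delta|| <= R / gain.
    """
    return radius / gain


sigma = 0.0025                      # smoothing level quoted in the abstract
radius_scaled = certified_l2_radius(sigma, p_lower=0.99)
radius_raw = raw_waveform_radius(radius_scaled, gain=10.0)  # hypothetical gain
```

The same numeric radius therefore describes a ten times smaller raw-waveform perturbation when the input was amplified by ten before noise was added, which is why the gain policy needs to be reported alongside the radius.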

2606.04209 2026-06-04 cs.LG 版本更新

A Geometric View of Counterfactual Behavior: Interaction of Boundary Proximity and Local Support

反事实行为的几何视角:边界接近度与局部支持的交互作用

Ioanna Gemou, Matteo Gamba, Randall Balestriero, Ritambhara Singh

发表机构 * Brown University(布朗大学)

AI总结 本文通过几何视角研究反事实行为,发现决策边界接近度与局部数据支持的交互作用决定了反事实的可行性,且反事实行为是独立于预测性能的维度,可在不改变准确率的情况下被改变。

AI中文摘要

反事实解释寻求对输入进行小的、语义上有意义的改变,以改变模型的预测,并广泛用于解释和审计机器学习系统。在现代视觉、语言和多模态系统中,预训练编码器将输入映射到表示空间,下游分类器头在这些空间内施加决策边界。因此,附近反事实的可行性和距离取决于边界相对于数据的位置。然而,具有相似预测性能的模型在是否能够实现此类改变以及表示必须移动多远方面可能存在显著差异。本文通过使用标准化局部搜索探针,在多个预训练编码器和线性分类器头上检验了这种变化。结果表明,尽管预测性能相似,但模型在反事实行为上存在显著差异。在固定表示下,仅改变分类器头就会改变反事实结果,而预测性能基本保持不变。这种变化由决策边界接近度和局部数据支持的交互作用解释,两者共同决定了预测变化是否可行且位于数据支持的区域内,并且还可以改进固定模型内的反事实搜索。总之,这些发现将反事实行为识别为超越预测性能的独立维度,并表明可以在不改变准确率的情况下改变它,这对模型选择、鲁棒性和反事实方法的可靠性具有启示意义。

英文摘要

Counterfactual explanations seek small, semantically meaningful changes to an input that alter a model's prediction, and are widely used to interpret and audit machine learning systems. In modern vision, language, and multimodal systems, pretrained encoders map inputs to representation spaces, and downstream classifier heads impose decision boundaries within those spaces. As a result, the feasibility and distance of nearby counterfactuals depend on boundary placement relative to the data. Yet models with similar predictive performance can differ substantially in whether such changes are achievable and how far representations must move. This work examines this variation using a standardized local search probe across several pretrained encoders and linear classifier heads. Results show that despite similar predictive performance, models differ substantially in their counterfactual behavior. Under fixed representations, varying only the classifier head alters counterfactual outcomes while leaving predictive performance largely unchanged. This variation is explained by the interaction of decision-boundary proximity and local data support, which jointly determine whether prediction changes are both feasible and lie in regions supported by the data, and can also improve counterfactual search within fixed models. Together, these findings identify counterfactual behavior as a distinct dimension beyond predictive performance and show that it can be altered without changing accuracy, with implications for model selection, robustness, and the reliability of counterfactual methods.

2606.04205 2026-06-04 cs.MM cs.AI cs.CL cs.CV cs.LG cs.SD 版本更新

DetectZoo: A Unified Toolkit for AI-Generated Content Detection Across Text, Audio, and Image Modalities

DetectZoo:一个用于跨文本、音频和图像模态的AI生成内容检测的统一工具包

Sajad Ebrahimi, Nima Jamali, Bardia Shirsalimian, Kelly McConvey, Wentao Zhang, Jalehsadat Mahdavimoghaddam, Maksym Taranukhin, Maura Grossman, Vered Shwartz, Yuntian Deng, Ebrahim Bagheri

发表机构 * University of Toronto(多伦多大学) University of Waterloo(滑铁卢大学) Toronto Metropolitan University(多伦多都会大学) University of British Columbia(不列颠哥伦比亚大学) Vector Institute(向量研究所)

AI总结 提出DetectZoo,一个首个统一的多模态AI生成内容检测工具包,通过标准化数据预处理、评估流程和集成61个检测器与22个基准数据集,实现公平可重复的基准测试。

AI中文摘要

生成模型的日益普及和能力提升模糊了人类与机器生成内容之间的界限,推动了跨文本、图像和音频检测领域的大量研究。大多数现有的检测器要么是商业软件,要么是开源但带有不兼容的代码库、定制化的预处理、评估协议和评估指标,这使得它们的采用、公平比较和复现变得相当困难。为了解决这一关键差距,我们引入了DetectZoo,这是首个可扩展的工具包,旨在为跨文本、音频和图像模态的AI生成内容检测提供统一接口。DetectZoo标准化了从数据摄取和预处理到模型评估的完整实证流程,为研究人员提供了一个统一的框架来系统地基准测试最先进的检测器。通过将多样的公共数据集和基线检测算法集成到单一的统一API下,我们的工具包促进了严格且可重复的评估。DetectZoo提供了61个检测器的参考实现、22个基准数据集的原生加载器,以及一个标准化的评估流程,通过通用接口报告多个指标。每个检测器都是自包含的,但可通过同一接口访问,自动缓存预训练权重,并复现原始发表的结果。DetectZoo降低了多模态AI取证的入门门槛,使研究人员能够识别跨领域的性能差距,并加速开发鲁棒、可泛化的检测技术。开源仓库和全面文档可在https://github.com/sadjadeb/DetectZoo 获取,且可通过pip install detectzoo安装该包。

英文摘要

The growing popularity and capacity of generative models have eroded the distinction between human and machine-generated content, motivating a growing body of work on detection across text, images, and audio. Most available detectors are either commercial software or, if open-source, come with incompatible codebases with bespoke preprocessing, evaluation protocols, and evaluation metrics, which make their adoption, fair comparison, and reproduction quite difficult. To address this critical gap, we introduce DetectZoo, a first-of-its-kind, extensible toolkit designed to provide a unified interface for AI-generated content detection across text, audio, and image modalities. DetectZoo standardizes the complete empirical pipeline, from data ingestion and preprocessing to model assessment, offering researchers a cohesive framework to benchmark state-of-the-art detectors systematically. By integrating diverse public datasets and baseline detection algorithms under a single, unified API, our toolkit facilitates rigorous and reproducible evaluation. DetectZoo provides reference implementations of 61 detectors, native loaders for 22 benchmark datasets, and a standardized evaluation pipeline that reports multiple metrics through a common interface. Each detector is self-contained yet accessible through the same interface, automatically caches pretrained weights, and reproduces the original published results. DetectZoo lowers the barrier to entry for multi-modal AI forensics, enabling researchers to identify performance gaps across domains and accelerating the development of robust, generalizable detection techniques. The open-source repository and comprehensive documentation are publicly available at https://github.com/sadjadeb/DetectZoo, and the package can be installed via pip install detectzoo.

2606.04199 2026-06-04 cs.CL cs.LG 版本更新

Cross-Prompt Generalization in Detecting AI-Generated Fake News Using Interpretable Linguistic Features

使用可解释语言特征检测AI生成假新闻的跨提示泛化

Aya Vera-Jimenez, Samuel Jaeger, Calvin Ibenye, Dhrubajyoti Ghosh

发表机构 * Department of Mathematics(数学系) School of Data Science and Analytics(数据科学与分析学院) Department of Computer Science(计算机科学系)

AI总结 研究通过提取词汇多样性、可读性和情感特征,在跨提示框架下使用随机森林分类器检测AI生成假新闻,发现模型在不同提示下均表现稳定(AUC 0.988-1.000),表明这些特征可泛化。

AI中文摘要

大型语言模型的日益普及引发了对AI生成假新闻传播的担忧,尤其是在不同的提示策略下。大多数现有的检测模型是在单一生成设置下训练和评估的,其跨未见提示的泛化能力尚不清楚。在本研究中,我们使用三个在不同提示下生成的AI文章数据集以及真实新闻文章,研究了假新闻检测中的跨提示泛化。我们提取了捕捉词汇多样性、可读性和情感特征的可解释语言特征,并在跨提示框架下评估了随机森林分类器,其中在一个提示上训练的模型在另一个提示上进行测试。在所有六个训练-测试组合中,性能始终保持较高,AUC值在0.988到1.000之间。特征分布分析显示,与整体数据集相比,AI生成文本表现出更高的词汇多样性、更低的可读性和显著较低的情感强度,且不同提示间存在差异。尽管存在这些分布变化,分类器仍保持强劲性能,表明这些特征捕捉了AI生成文本的稳定属性,这些属性可跨提示策略泛化。这些发现表明,基于特征的方法可以在提示变化下提供对AI生成假新闻的稳健检测。

英文摘要

The increasing use of large language models has raised concerns about the spread of AI-generated fake news, particularly under varying prompting strategies. Most existing detection models are trained and evaluated under a single generation setting, leaving their ability to generalize across unseen prompts unclear. In this study, we investigate cross-prompt generalization in fake news detection using three datasets of AI-generated articles produced under distinct prompts, combined with real news articles. We extract interpretable linguistic features capturing lexical diversity, readability, and emotion-based characteristics and evaluate a random forest classifier under a cross-prompt framework, where models trained on one prompt are tested on another. Across all six train-test combinations, performance remains consistently high, with AUC values ranging from 0.988 to 1.000. Analysis of feature distributions shows that AI-generated text exhibits increased lexical diversity, reduced readability, and substantially lower emotional intensity compared to the overall dataset, with variations across prompts. Despite these distributional shifts, the classifier maintains strong performance, indicating that these features capture stable properties of AI-generated text that generalize across prompting strategies. These findings suggest that feature-based approaches can provide robust detection of AI-generated fake news under prompt variability.
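
The cross-prompt protocol can be sketched as follows: with three prompt-specific datasets, training on one and testing on another gives six ordered train-test pairs. The snippet also shows one simple lexical-diversity feature (type-token ratio). The dataset names and the whitespace tokenizer are assumptions for illustration; the paper's exact features and classifier setup are not reproduced here.

```python
from itertools import permutations


def type_token_ratio(text):
    """Lexical diversity as unique tokens / total tokens (whitespace split)."""
    tokens = text.lower().split()
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical names for the three prompt-specific datasets.
prompts = ["prompt_A", "prompt_B", "prompt_C"]

# Ordered (train, test) pairs with train != test: 3 * 2 = 6 combinations.
train_test_pairs = list(permutations(prompts, 2))

example_ttr = type_token_ratio("the cat saw the dog")  # 4 unique / 5 total
```

In a full experiment, each pair in `train_test_pairs` would correspond to fitting a classifier on features from the first dataset and reporting AUC on the second.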

2606.04194 2026-06-04 cs.LG cs.CL cs.IR 版本更新

Training-Free Lexical-Dense Fusion for Conversational-Memory Retrieval

免训练的词汇-稠密融合用于对话记忆检索

Christian Lysenstøen

发表机构 * Inland Norway University of Applied Sciences(内陆挪威应用科学大学) University of California, Berkeley(加州大学伯克利分校)

AI总结 本文提出一种免训练、仅CPU的检索方法,通过分数级融合最大查询-轮次相似度(后期交互)与BM25,显著提升多会话对话记忆检索的命中率,并分析了不同编码器和池化策略的影响。

Comments 9 pages, 3 figures, 10 tables. Code, data, and per-table receipts: https://github.com/Chrislysen/opsem

AI中文摘要

在跨长多会话历史中检索回答新查询的过去几轮是长期对话记忆(LoCoMo, LongMemEval)背后的检索瓶颈。最近的并行工作Nano-Memory表明,通过最大查询-轮次相似度(后期交互,“轮次隔离检索”)对会话进行评分优于均值池化的会话嵌入。我们不声称该效果;我们复现它并询问一个免训练、仅CPU的检索阶段应在其周围添加什么。我们报告四个发现。(1)融合:在单个留一对话权重下,后期交互稠密分数与BM25的分数级融合,在六个编码器上比单独后期交互增加+8.8到+17.2个LoCoMo Hit@1点(所有p<1e-4),达到Hit@1 0.752 / NDCG@5 0.829(e5-large-v2),比BM25高+11.2个百分点。(2)一个现成的网络搜索交叉编码器重排序器在融合的前10个结果上效果不佳,将Hit@1降低6.9个百分点(一个重排序器,一种配置)。(3)池化算子消融显示top-k后期交互匹配最大相似度,但朴素的平滑最大值(log-sum-exp)对一半编码器失效。(4)所有六个编码器的后期减早期差距很大,且较大的编码器差距往往更大,而边际融合增益缩小;在LongMemEval-S上,一个BM25饱和的词汇机制中,相对于BM25的净融合增益很小且不显著。按类别分析将增益视为分工:稠密后期交互在多跳和时间问题上帮助最大,但在对抗性问题上落后于BM25。贡献是对一个强大的免训练检索方案的可控、可复现的描述,而非后期交互检索器本身(Nano-Memory的)。我们不声称完整的记忆架构;这是一个检索阶段的研究。

英文摘要

Retrieving the few past turns that answer a new query across long multi-session histories is the retrieval bottleneck behind long-term conversational memory (LoCoMo, LongMemEval). Recent concurrent work, Nano-Memory, shows that scoring a session by the maximum query-turn similarity (late interaction, "Turn Isolation Retrieval") beats mean-pooled session embeddings. We do not claim that effect; we replicate it and ask what a training-free, CPU-only retrieval stage should add around it. We report four findings. (1) Fuse: score-level fusion of the late-interaction dense score with BM25, under a single leave-one-conversation-out weight, adds +8.8 to +17.2 points of LoCoMo Hit@1 over late interaction alone across six encoders (all p<1e-4), reaching Hit@1 0.752 / NDCG@5 0.829 (e5-large-v2), +11.2 pp over BM25. (2) An off-the-shelf web-search cross-encoder reranker over the fused top-10 hurts here, degrading Hit@1 by 6.9 pp (one reranker, one configuration). (3) A pooling-operator ablation shows top-k late interaction matches max-similarity, but a naive smooth-max (log-sum-exp) collapses for half the encoders. (4) The late-minus-early gap is large for all six encoders and tends to be larger for larger ones, while the marginal fusion gain shrinks; on LongMemEval-S, a lexical regime where BM25 saturates, the net fusion gain over BM25 is small and not significant. A per-category analysis frames the gain as a division of labor: dense late interaction helps most on multi-hop and temporal questions but trails BM25 on adversarial ones. The contribution is a controlled, reproducible account of a strong training-free retrieval recipe, not the late-interaction retriever itself (Nano-Memory's). We make no claim to a complete memory architecture; this is a retrieval-stage study.
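
A minimal sketch of the two ingredients described above, late-interaction session scoring and score-level fusion with BM25, might look like the following. The min-max normalization, the fixed fusion weight, the toy vectors, and the precomputed BM25 scores are all assumptions; the paper selects its weight by leave-one-conversation-out and may normalize differently.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equally long vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def late_interaction_score(query_vec, turn_vecs):
    """Score a session by its best-matching turn (max query-turn similarity)."""
    return max(cosine(query_vec, t) for t in turn_vecs)


def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    return [0.0 if hi == lo else (s - lo) / (hi - lo) for s in scores]


def fuse(dense_scores, bm25_scores, weight):
    """Convex combination of normalized dense and BM25 scores per session."""
    d, b = min_max(dense_scores), min_max(bm25_scores)
    return [weight * x + (1.0 - weight) * y for x, y in zip(d, b)]


query = [1.0, 0.0]
sessions = [                                  # toy turn embeddings per session
    [[0.0, 1.0], [1.0, 0.0]],
    [[1.0, 1.0]],
    [[0.0, 1.0]],
]
dense = [late_interaction_score(query, turns) for turns in sessions]
bm25 = [1.0, 5.0, 3.0]                        # hypothetical precomputed BM25 scores
fused = fuse(dense, bm25, weight=0.5)
best_dense = dense.index(max(dense))
best_fused = fused.index(max(fused))
```

In this toy case the dense score alone ranks the first session highest, while the fused score promotes the second session, where lexical and dense evidence agree more strongly.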

2606.04191 2026-06-04 cs.LG cs.AI 版本更新

Metric-Aware Hybrid Forecasting for the CTF4Science Lorenz Challenge

CTF4Science Lorenz挑战的度量感知混合预测

Cen Lu

发表机构 * EPFL & Idiap Research Institute(瑞士联邦理工学院(EPFL)及Idiap研究所)

AI总结 针对CTF4Science Lorenz挑战,提出一种度量感知混合系统,通过为不同度量族分配专用预测器(去噪器、ODE拟合、直方图替换),在九项任务对上取得高分。

AI中文摘要

我们描述了针对CTF4Science Lorenz挑战的方法,该基准混合了短时预测、长时间分布匹配和轨迹重建,涵盖九项任务对。关键发现是,没有单一模型族在所有度量上占优。相反,我们构建了一个度量感知混合系统,为每个度量族分配不同的预测器:(1)用于全轨迹重建的合成预训练去噪器,(2)用于前20个预测步的Lorenz ODE拟合和轨迹射击,以及(3)使用合成Lorenz库的直方图尾部替换用于长时间评估。该系统中一个具有代表性的成熟提交在公共排行榜上得分为83.83551,而采用相同思想的小型后续堆栈达到了83.85529。我们专注于更干净的中间系统,因为它捕获了完整方法,同时足够简单以重现和分析,而最终提交可以理解为同一骨干的保守扩展。

英文摘要

We describe our approach to the CTF4Science Lorenz challenge, a benchmark that mixes short-horizon forecasting, long-time distribution matching, and trajectory reconstruction across nine task pairs. The key discovery is that no single model family dominated all metrics. Instead, we built a metric-aware hybrid system that assigned a different predictor to each metric family: (1) synthetic-pretrained denoisers for full-trajectory reconstruction, (2) Lorenz ODE fitting and trajectory shooting for the first 20 forecast steps, and (3) histogram-tail substitution using synthetic Lorenz libraries for long-time evaluation. A representative mature submission from this system family scored 83.83551 on the public leaderboard, and a small follow-up stack of the same ideas reached 83.85529. We focus on the cleaner intermediate system because it captures the full method while remaining simple enough to reproduce and analyze; the final submission can be understood as a conservative extension of the same backbone.

2606.04188 2026-06-04 cs.LG cs.AI cs.RO 版本更新

Dual Advantage Fields

双优势场

Alexey Zemtsov, Maxim Bobrin, Alexander Nikulin, Dmitry V. Dylov, Fakhri Karray, Vladislav Kurenkov, Martin Takáč, Arip Asadulaev

发表机构 * NUST MISIS(国立科技大学MISIS) MSU(莫斯科大学) Computational Imaging Lab(计算成像实验室) MBZUAI(穆罕默德·本·扎耶德人工智能大学) dunnolab(杜诺实验室) Innopolis University(因诺波利斯大学)

AI总结 提出双优势场(DAF)方法,利用双线性对偶值模型生成局部优势信号,通过动作-效应模型预测折扣特征位移并与目标方向对齐来评分动作,实现离线目标条件强化学习中的策略提取。

Comments Accepted by ICML 2026 Workshop on Decision-Making from Offline Datasets to Online Adaptation: Black-Box Optimization to Reinforcement Learning

AI中文摘要

离线目标条件强化学习需要长期可达性估计和局部动作比较。双目标表示提供捕获全局目标可达性的值场,但它们不直接指定在给定状态下应优先选择哪个动作。我们提出双优势场(DAF),一种策略提取方法,将双线性对偶值模型转化为局部优势信号。在双线性对偶参数化下,目标嵌入是值场关于状态表示的梯度。DAF学习一个动作-效应模型,预测由动作引起的折扣特征位移,并通过该位移与目标方向的对齐程度对动作进行评分。在可实现的情况下,该分数等于目标条件Bellman优势,从而提供标准的局部策略改进保证。在OGBench的 locomotion、manipulation 和 puzzle 任务上,DAF改进了聚合RLiable指标,并在局部正确动作与直接朝向最终目标移动不同的设置中表现强劲。

英文摘要

Offline goal-conditioned reinforcement learning requires both long-horizon reachability estimates and local action comparisons. Dual goal representations provide value fields that capture global goal reachability, but they do not directly specify which action should be preferred at a given state. We propose Dual Advantage Fields (DAF), a policy-extraction method that turns a bilinear dual value model into a local advantage signal. Under bilinear dual parameterization, the goal embedding is the gradient of the value field with respect to the state representation. DAF learns an action-effect model that predicts the discounted feature displacement induced by an action and scores actions by the alignment between this displacement and the goal direction. In the realizable case, this score equals the goal-conditioned Bellman advantage, yielding a standard local policy-improvement guarantee. On OGBench locomotion, manipulation, and puzzle tasks, DAF improves aggregate RLiable metrics and performs strongly in settings where locally correct actions differ from direct movement toward the final goal.
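
The scoring rule described in the abstract can be sketched under a simple reading: the value is bilinear, $V(s, g) = φ(s)^\top ψ(g)$, so its gradient with respect to $φ(s)$ is $ψ(g)$, and an action is scored by the inner product of its predicted feature displacement with $ψ(g)$. Using a plain inner product as the "alignment", along with all names and numbers below, is an assumption; the learned action-effect model and its discounting are not shown.

```python
def dot(u, v):
    """Inner product of two equally long vectors."""
    return sum(a * b for a, b in zip(u, v))


def bilinear_value(state_feat, goal_emb):
    """V(s, g) = phi(s) . psi(g); its gradient w.r.t. phi(s) is psi(g)."""
    return dot(state_feat, goal_emb)


def score_actions(displacements, goal_emb):
    """Score each action by <predicted feature displacement, goal direction>."""
    return {action: dot(disp, goal_emb) for action, disp in displacements.items()}


goal_emb = [1.0, 2.0]                 # psi(g), hypothetical goal embedding
displacements = {                     # hypothetical action-effect model outputs
    "left": [-0.5, 0.0],
    "right": [0.5, 0.0],
    "up": [0.0, 0.4],
}
scores = score_actions(displacements, goal_emb)
best_action = max(scores, key=scores.get)

# Under bilinearity, the score equals the exact change in value for that shift.
state_feat = [0.2, 0.1]
shifted = [state_feat[0] + 0.5, state_feat[1]]
value_change = bilinear_value(shifted, goal_emb) - bilinear_value(state_feat, goal_emb)
```

Here "up" wins even though its displacement is smaller in norm than "right", because the goal embedding weights the second feature more heavily.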

2606.04182 2026-06-04 cs.LG cs.AI stat.ML 版本更新

Exact Unlearning in Reinforcement Learning

强化学习中的精确遗忘

Thanh Nguyen-Tang, Raman Arora

发表机构 * University of California, Berkeley(加州大学伯克利分校)

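AI summary: Formulates exact unlearning in reinforcement learning and uses a ρ-TV-stable algorithm so that, after a deletion request, the learner's output is indistinguishable from one that never saw the deleted data, with a nearly minimax-optimal regret bound.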

Comments ICML Spotlight

详情
AI中文摘要

我们提出了强化学习中的精确遗忘问题,目标是设计一个高效框架,使得在收到删除请求后能够移除任何用户的数据,即遗忘后在线学习者的输出与从未与学习者交互过的用户所产生的结果不可区分。对于任意 $ρ>0$,我们证明存在一个 $ρ$-TV 稳定的强化学习算法,支持精确遗忘过程,其期望计算成本仅为从头重新训练计算成本的 $ρ\sqrt{\ln T}$ 分之一。我们为表格型马尔可夫决策过程构造了这样一个 $ρ$-TV 稳定的强化学习算法,其遗憾界为 $\mathcal{O}(H^2 \sqrt{SAT} + H^3 S^2 A + {H^{2.5} S^2 A}/ρ)$,其中 $S, A, H, T$ 分别表示状态数、动作数、回合长度和回合数。我们还为 $ρ$-TV 稳定的强化学习算法建立了 $\Omega(H\sqrt{\!SAT}\! +\! {SAH}/ρ)$ 的下界,表明我们的算法几乎是极小化最优的。

英文摘要

We formulate the problem of exact unlearning in reinforcement learning, where the goal is to design an efficient framework that enables the removal of any user's data upon deletion request, i.e., the online learner's output after unlearning is indistinguishable from what would have been produced had the deleted user never interacted with the learner. For any $\rho > 0$, we show that there exists a reinforcement learning (RL) algorithm that is $\rho$-TV-stable and supports an exact unlearning procedure whose expected computational cost is only a $\rho\sqrt{\ln T}$ fraction of the computational cost of retraining from scratch. We construct such a $\rho$-TV-stable RL algorithm for tabular Markov decision processes (MDPs), which achieves a regret bound of $\mathcal{O}(H^2 \sqrt{SAT} + H^3 S^2 A + H^{2.5} S^2 A/\rho)$, where $S$, $A$, $H$, and $T$ denote the number of states, the number of actions, the episode horizon, and the number of episodes, respectively. We also establish a lower bound of $\Omega(H\sqrt{SAT} + SAH/\rho)$ for $\rho$-TV-stable RL algorithms, showing that our algorithm is nearly minimax optimal.
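
To get a feel for the trade-off controlled by ρ, the bounds quoted in the abstract can be evaluated numerically. The sketch below drops all constants and logarithmic factors hidden by the O and Ω notation, so the numbers are only indicative of scaling; the function names and the example problem sizes are hypothetical.

```python
import math


def regret_upper(S, A, H, T, rho):
    """Upper bound from the abstract, with constants dropped."""
    return H**2 * math.sqrt(S * A * T) + H**3 * S**2 * A + H**2.5 * S**2 * A / rho


def regret_lower(S, A, H, T, rho):
    """Lower bound from the abstract, with constants dropped."""
    return H * math.sqrt(S * A * T) + S * A * H / rho


def unlearning_cost_fraction(T, rho):
    """Expected unlearning cost as a fraction of retraining from scratch."""
    return rho * math.sqrt(math.log(T))


# Hypothetical problem sizes, chosen only for illustration.
S, A, H, T = 10, 5, 20, 10**6
for rho in (0.01, 0.1):
    print(rho,
          unlearning_cost_fraction(T, rho),
          regret_upper(S, A, H, T, rho),
          regret_lower(S, A, H, T, rho))
```

A smaller ρ makes unlearning cheaper relative to retraining but inflates the 1/ρ term in the regret, which is the tension the two bounds quantify.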

2606.04180 2026-06-04 cs.LG cs.IT math.IT 版本更新

KODA: Contrastive Representation Comparison and Alignment for Vision-Language Foundation Models

KODA: 视觉-语言基础模型的对比表示比较与对齐

Youqi Wu, Mohammad Jalali, Farzan Farnia

发表机构 * University of California, Berkeley(加州大学伯克利分校)

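AI summary: Proposes KODA, a kernel-optimization framework that contrasts the representations of vision-language foundation models, identifies sample subsets that cluster weakly under one representation but strongly under another, and supports representation alignment.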

详情
AI中文摘要

视觉-语言基础模型(如CLIP和SigLIP)为多模态学习系统提供了广泛使用的表示。虽然这些模型通常通过下游性能进行比较,但这种评估往往不能解释它们的表示在结构上如何不同。在本文中,我们通过对比嵌入聚类任务研究这一问题:识别在一个表示下弱聚类但在另一个表示下强聚类的样本子集。我们提出了\emph{核优化差异分析(KODA)},一个基于核的对比表示比较与对齐框架。KODA通过模态核组合构建统一的多模态核,并将差异发现形式化为一个约束优化问题,该问题在一个表示中搜索一致结构,同时抑制参考表示中的一致性。这产生了与特定样本子集和模态交互相关的可解释差异方向。为了将KODA扩展到大型视觉-语言数据集,我们开发了使用随机投影的联合核随机低维近似,包括用于平移不变核的随机傅里叶特征。实验上,KODA在视觉-语言表示中识别出一致且可解释的差异结构,并为表示对齐提供了样本子集。代码可在https://github.com/yokiwuuu/KODA获取。

英文摘要

Vision-language foundation models such as CLIP and SigLIP provide widely used representations for multimodal learning systems. While these models are typically compared through downstream performance, such evaluations often do not explain how their representations differ structurally. In this work, we study this problem through the task of Contrastive Embedding Clustering: identifying sample subsets that are weakly clustered under one representation but strongly clustered under another. We propose Kernel Optimization for Discrepancy Analysis (KODA), a kernel-based framework for contrastive representation comparison and alignment. KODA constructs unified multimodal kernels through modality-wise kernel composition and formulates discrepancy discovery as a constrained optimization problem that searches for coherent structures in one representation while suppressing coherence in a reference representation. This yields interpretable discrepancy directions associated with specific sample subsets and modality interactions. To scale KODA to large vision-language datasets, we develop randomized low-dimensional approximations of joint kernels using random projections, including Random Fourier Features for shift-invariant kernels. Empirically, KODA identifies consistent and interpretable discrepancy structures across vision-language representations and provides sample subsets for representation alignment. The code is available at https://github.com/yokiwuuu/KODA.
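
The Random Fourier Features idea mentioned in the abstract can be sketched in a few lines for the Gaussian kernel. This is the generic textbook construction, not KODA's joint-kernel approximation, whose details are not given here; the function names, dimensions, and sample points are hypothetical.

```python
import math
import random


def make_rff(dim, num_features, sigma, seed=0):
    """Build a feature map z(x) such that z(x).z(y) approximates
    the Gaussian kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density N(0, I / sigma^2).
    weights = [[rng.gauss(0.0, 1.0 / sigma) for _ in range(dim)]
               for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def features(x):
        return [scale * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
                for w, b in zip(weights, phases)]

    return features


def gaussian_kernel(x, y, sigma):
    """Exact Gaussian kernel value, used here only as a reference."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


features = make_rff(dim=3, num_features=4000, sigma=1.0)
x, y = [0.2, -0.1, 0.5], [0.0, 0.3, 0.1]
approx = sum(a * b for a, b in zip(features(x), features(y)))
exact = gaussian_kernel(x, y, 1.0)
```

The appeal for large datasets is that kernel evaluations reduce to inner products of fixed-length feature vectors, so an n-by-n kernel matrix never has to be formed explicitly.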

2606.04176 2026-06-04 cs.LG math.ST stat.ML stat.TH 版本更新

Low-rank Distributional Matrix Completion

低秩分布矩阵补全

Jiayi Wang, Raymond K. W. Wong

发表机构 * University of Texas at Dallas(德克萨斯大学达拉斯分校) Texas A&M University(德克萨斯农工大学)

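AI summary: For matrices whose entries are probability distributions, proposes a low-rank structure based on kernel mean embeddings and Tucker rank, links the infinite-dimensional setting to finite-dimensional tensors through functional unfolding operators, and gives a distributional matrix completion estimator with non-asymptotic error bounds.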

详情
AI中文摘要

我们研究了矩阵补全问题的分布推广,其中目标矩阵的每个条目是概率分布而非标量。在此设置中,仅观察到矩阵条目的一个子集,即使对于观察到的条目,底层分布也无法直接获取;相反,我们观察到从这些分布中抽取的有限样本。为了表示分布条目,我们采用核均值嵌入,并引入分布值矩阵的Tucker秩概念以捕捉其低秩结构。核嵌入的无限维性质带来了重大的方法论挑战。为解决此问题,我们引入了函数展开算子,将所提出的分布低秩结构与有限维张量的经典Tucker秩联系起来。基于此框架,我们提出了一种用于分布矩阵补全的新估计器。我们建立了非渐近误差界,刻画了估计器的统计性能。在合成数据和真实世界应用上的大量实验证明了所提方法的有效性。

英文摘要

We study a distributional generalization of the matrix completion problem in which each entry of the target matrix is a probability distribution rather than a scalar. In this setting, only a subset of matrix entries is observed, and even for observed entries, the underlying distributions are not directly accessible; instead, we observe finitely many samples drawn from them. To represent distributional entries, we employ kernel mean embeddings and introduce a notion of Tucker rank for distribution-valued matrices to capture their low-rank structure. The infinite-dimensional nature of kernel embeddings poses significant methodological challenges. To address this, we introduce functional unfolding operators that link the proposed distributional low-rank structure to the classical Tucker rank for finite-dimensional tensors. Based on this framework, we propose a novel estimator for distributional matrix completion. We establish non-asymptotic error bounds that characterize the statistical performance of the estimator. Extensive experiments on synthetic data and a real-world application demonstrate the effectiveness of the proposed method.
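
Since each observed entry is only available through samples, a kernel mean embedding is in practice an empirical average of kernel functions. The sketch below shows the standard empirical inner product between two embeddings and the resulting (biased) squared MMD for one-dimensional samples. It illustrates the representation only, not the paper's estimator; the function names, the RBF bandwidth, and the sample values are hypothetical.

```python
import math


def rbf(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel on real numbers."""
    return math.exp(-((x - y) ** 2) / (2.0 * bandwidth ** 2))


def embedding_inner(xs, ys, kernel=rbf):
    """Inner product of two empirical kernel mean embeddings:
    the average kernel value over all cross pairs."""
    return sum(kernel(x, y) for x in xs for y in ys) / (len(xs) * len(ys))


def mmd_squared(xs, ys, kernel=rbf):
    """Squared RKHS distance between the two empirical embeddings (biased estimate)."""
    return (embedding_inner(xs, xs, kernel)
            + embedding_inner(ys, ys, kernel)
            - 2.0 * embedding_inner(xs, ys, kernel))


# Hypothetical samples drawn for two matrix entries.
samples_a = [0.0, 0.1, -0.2]
samples_b = [2.0, 2.1, 1.9]
distance_ab = mmd_squared(samples_a, samples_b)
distance_aa = mmd_squared(samples_a, samples_a)
```

Distances of this kind are what make it meaningful to talk about one distribution-valued entry being close to another when fitting a low-rank model.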

2606.04171 2026-06-04 cs.CR cs.AI cs.LG 版本更新

MimeLens: Position-Agnostic Content-Type Detection for Binary Fragments

MimeLens: 二进制片段的位置无关内容类型检测

Michael J. Bommarito

发表机构 * II∗

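AI summary: Existing file-type classifiers such as Magika cannot handle non-whole-file inputs like header-less fragments or random disk blocks. MimeLens is a family of small BERT-style encoders trained on windows sampled at random offsets for position-agnostic binary content classification; it beats Magika v1.1 by 10.7 percentage points top-1 on libmagic-labeled data and can classify a single UDP packet or a random disk block.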

Comments 18 pages, 2 figures, 15 tables. Models released on Hugging Face (https://huggingface.co/mjbommar); reference training code at https://github.com/mjbommar/mimelens-training

详情
AI中文摘要

文件类型分类是恶意软件分类、取证雕刻、数据包检查和存储索引等工作流程的基础。像Google的Magika这样的学习系统假设在已知偏移处访问整个文件,因此它们无法处理这些任务实际产生的许多输入,例如单个数据包负载、无头的雕刻片段、随机磁盘块或分块上传。我们引入了MimeLens,这是一个小型BERT风格编码器家族,在从每个文件内均匀随机偏移处采样的窗口的二进制内容上进行预训练,没有特权文件头位置,有标准上下文和短上下文变体。一个字节块来自文件中的任何位置,无需头部且无固定大小;输出是libmagic的125个MIME标签之一。在完整文件的干净头部上,MimeLens在libmagic标记数据上的top-1准确率比Magika v1.1高10.7个百分点,并且在Magika无法分类的地方(例如单个中间流UDP数据包)仍然能分类,在随机中间文件磁盘块上的准确率是libmagic和Magika的两倍以上。代价是延迟:在CPU上,MimeLens每个样本的运行速度大约比Magika慢一到两个数量级,但在消费级GPU或批处理中与之相当。所有训练好的检查点已在Hugging Face上发布(mjbommar/mimelens-001-*)。

英文摘要

File-type classification underlies many workflows like malware triage, forensic carving, packet inspection, and storage indexing. Learned systems such as Google's Magika assume whole-file access at a known offset, so they break on the inputs many of these tasks actually produce, like a single packet payload, a header-less carved fragment, a random disk block, or a chunked upload. We introduce MimeLens, a family of small BERT-style encoders pretrained on binary content from windows sampled at a uniformly random offset within each file, with no privileged head-of-file position, in standard- and short-context variants. A byte chunk goes in from anywhere in a file, no header needed and no fixed size; out comes one of libmagic's 125 MIME labels. On the clean head of complete files, MimeLens beats Magika v1.1 by +10.7 pp top-1 on libmagic-labeled data, and it keeps classifying where Magika cannot: from a single mid-stream UDP packet, and more than twice as accurately as libmagic and Magika on random mid-file disk blocks. The cost is latency: MimeLens runs roughly one to two orders of magnitude slower per sample on CPU than Magika, though it matches on consumer GPUs or in batch. All trained checkpoints are released on Hugging Face (mjbommar/mimelens-001-*).
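
The training-time sampling described above, a window taken at a uniformly random offset with no privileged head-of-file position, can be sketched as follows. The function name, window size, and example payload are hypothetical; the actual MimeLens data pipeline lives in the referenced training repository and may differ.

```python
import random


def sample_window(data: bytes, window: int, rng: random.Random) -> bytes:
    """Return a window of at most `window` bytes starting at a uniformly
    random offset, so the start of the file gets no special treatment."""
    if len(data) <= window:
        # Short inputs are returned whole; there is no fixed-size requirement.
        return data
    start = rng.randrange(len(data) - window + 1)
    return data[start:start + window]


rng = random.Random(0)
payload = bytes(range(256)) * 4  # stand-in for a file's contents
chunk = sample_window(payload, 64, rng)
short = sample_window(b"abc", 64, rng)
```

Training on such windows is what lets a classifier see mid-file content as often as headers, which matches the inputs produced by carving or packet inspection.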

2606.04168 2026-06-04 cs.LG cs.CR 版本更新

When Autoregressive Consistency Hurts Safety Alignment

当自回归一致性损害安全对齐

Bochen Lyu, Yiyang Jia, Xiaohao Cai, Zhanxing Zhu

Affiliations: University of Southampton

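AI summary: Analyzes autoregressive consistency to explain why safety alignment in large language models is shallow, and proposes a random insertion attack and adversarial safety alignment.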

Comments 21 pages

详情
AI中文摘要

大语言模型(LLMs)的安全对齐是脆弱的,部分原因在于它通常是浅层的:微调主要重塑模型在最初几个输出标记附近的行为。我们认为,这种现象可以通过自回归一致性来理解,即下一个标记预测倾向于一致地保持和扩展当前响应轨迹。通过分析安全对齐的学习动态,我们表明自回归一致性可以将对齐更新集中在早期标记上,为浅层安全对齐提供机制解释。同样的机制还预测了一类更广泛的LLM攻击:在输出轨迹的任意位置诱导有害延续状态的攻击。作为一个具体例子,我们引入了随机插入攻击,该攻击将一个短的有害片段插入原本安全的拒绝轨迹中,并利用自回归一致性维持由此产生的有害分支,从而绕过安全对齐。值得注意的是,即使在一个长的拒绝前缀之后,一个短的有害片段也能将生成重定向为有害,这突显了自回归一致性作为一个潜在的更广泛失败机制。这表明安全对齐还应该在整个输出轨迹中打破有害的自回归一致性。因此,我们提出了对抗性安全对齐,一个基于最坏情况有害延续状态的初始框架,并通过随机最坏插入训练实例化它。总体而言,我们的结果表明,自回归一致性应被视为安全对齐和攻击设计中的核心考虑因素。

英文摘要

Safety alignment in large language models (LLMs) is fragile in part because it is often shallow: fine-tuning mainly reshapes the model's behavior near the first few output tokens. We argue that this phenomenon can be understood through autoregressive consistency, the tendency of next-token prediction to preserve and extend the current response trajectory consistently. By analyzing the learning dynamics of safety alignment, we show that autoregressive consistency can concentrate alignment updates on early tokens, offering a mechanistic explanation for shallow safety alignment. The same mechanism also predicts a broader class of attacks on LLMs: attacks that induce harmful continuation states at arbitrary positions in the output trajectory. As a concrete example, we introduce random insertion attack, which inserts a short harmful span into an otherwise safe refusal trajectory and exploits autoregressive consistency to sustain the resulting harmful branch, thereby bypassing safety alignment. Notably, a short harmful span can redirect the generation to be harmful even after a long refusal prefix, highlighting autoregressive consistency as a potential broader failure mechanism. This suggests that safety alignment should also break harmful autoregressive consistency throughout the output trajectory. We therefore propose adversarial safety alignment, an initial framework based on worst-case harmful continuation states, and instantiate it with random worst-insertion training. Overall, our results suggest that autoregressive consistency should be treated as a central consideration in both safety alignment and attack design.

2606.04167 2026-06-04 cs.LG cs.AI 版本更新

Smart Transportation Without Neurons -- Fair Metro Network Expansion with Tabular Reinforcement Learning

无神经元的智能交通——基于表格强化学习的公平地铁网络扩展

Dimitris Michailidis, Sennay Ghebreab, Fernando P. Santos

Affiliations: Socially Intelligent Artificial Systems, University of Amsterdam

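AI summary: For metro network expansion, reformulates the problem as a non-Markovian rewards decision process and solves it with tabular reinforcement learning, matching Deep RL performance with far fewer training episodes and lower carbon emissions while incorporating social equity criteria.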

Comments 16 pages

详情
AI中文摘要

我们解决了地铁网络扩展问题(MNEP),这是交通网络设计问题(TNDP)的一个子集,专注于扩展地铁系统以满足出行需求。传统方法依赖于精确和启发式方法,需要专家定义的约束来缩小搜索空间。最近,深度强化学习(Deep RL)因其在复杂序列决策过程中的有效性而出现,但它仍然计算成本高、环境成本高,并且需要额外的工程来解释。我们表明,MNEP问题规模足够小,不需要深度强化学习方法。将MNEP重新表述为非马尔可夫奖励决策过程(NMRDP),我们使用表格强化学习以显著更少的训练轮次实现类似的性能,此外还提供了更高的可解释性。此外,我们将社会公平标准纳入奖励函数,侧重于效率和公平性,突出了我们方法的多功能性。在现实场景中——西安和阿姆斯特丹——我们的方法平均将总轮次减少了18倍,总碳排放减少了12倍,同时与深度强化学习保持竞争力。这种方法提供了一种可复制、模块化、可解释且资源高效的解决方案,并具有应用于其他组合优化问题的潜力。

英文摘要

We tackle the Metro Network Expansion Problem (MNEP), a subset of the Transport Network Design Problem (TNDP), which focuses on expanding metro systems to satisfy travel demand. Traditional methods rely on exact and heuristic approaches that require expert-defined constraints to reduce the search space. Recently, deep reinforcement learning (Deep RL) has emerged due to its effectiveness in complex sequential decision-making processes; it remains, however, computationally expensive and environmentally costly, and it requires additional engineering to interpret. We show that MNEP problems are small enough to not require Deep RL methods. Reformulating the MNEP as a Non-Markovian Rewards Decision Process (NMRDP), we use tabular RL to achieve similar performance with significantly fewer training episodes while offering greater interpretability. We also incorporate social equity criteria into the reward functions, focusing on efficiency and fairness, which highlights the versatility of our method. Evaluated in two real-world settings, Xi'an and Amsterdam, our method reduces total episodes by a factor of 18 and total carbon emissions by a factor of 12 on average, while remaining competitive with Deep RL. This approach offers a replicable, modular, interpretable, and resource-efficient solution with potential applications to other combinatorial optimization problems.
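
The abstract does not say which tabular algorithm is used, so the following is only a sketch using standard Q-learning as a stand-in. It assumes the non-Markovian reward is handled by making the tabular state the full sequence of grid cells chosen so far, which is one common way to recover the Markov property. All names, grid coordinates, and reward values are hypothetical.

```python
from collections import defaultdict


def q_update(q, state, action, reward, next_state, next_actions,
             alpha=0.1, gamma=1.0):
    """One tabular Q-learning update on a dictionary-backed Q-table."""
    best_next = max((q[(next_state, a)] for a in next_actions), default=0.0)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]


q = defaultdict(float)

# State = tuple of grid cells already on the metro line (the history).
state = ((0, 0),)
action = (0, 1)                  # next cell to add to the line
next_state = state + (action,)

value = q_update(q, state, action, reward=2.0,
                 next_state=next_state, next_actions=[(0, 2), (1, 1)])
```

Because the table is keyed by explicit line prefixes, a learned policy can be read off directly, which is the kind of interpretability the abstract contrasts with Deep RL.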

2606.04165 2026-06-04 hep-ex cs.LG hep-ph physics.ins-det 版本更新

CaloTrilogy: Toward a Breakthrough in One-Step, End-to-End, Physics-Guided Shower Generation for Modern Calorimeters

CaloTrilogy:迈向现代量热器一步式端到端物理引导簇射生成的突破

Cheng Jiang, Sitian Qian, Kevin Pedro, Oz Amram, Huilin Qu, Maggie Voetberg

发表机构 * School of Physics and Astronomy, University of Edinburgh(爱丁堡大学物理与天文学学院) Department of Physics, University of Wisconsin-Madison(威斯康星大学麦迪逊分校物理系) Fermi National Accelerator Laboratory(费米国家加速器实验室) State Key Laboratory of Dark Matter Physics, Tsung-Dao Lee Institute & School of Physics and Astronomy, Shanghai Jiao Tong University(上海交通大学暗物质物理国家重点实验室、李政道研究所及物理与天文学学院) Key Laboratory for Particle Astrophysics and Cosmology (MOE) & Shanghai Key Laboratory for Particle Physics and Cosmology, Shanghai Jiao Tong University(教育部粒子天体物理与宇宙学重点实验室及上海粒子物理与宇宙学重点实验室,上海交通大学)

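AI summary: Proposes a framework combining an average velocity field integrator, a learned generative prior, and physics-guided loss terms to generate high-quality showers in one or a few evaluation steps, competitive with state-of-the-art flow and diffusion models.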

详情
AI中文摘要

当前和未来对撞机的高精度量热器模拟对计算资源的需求快速增长,促使开发机器学习替代传统蒙特卡洛工具(如Geant4)。流匹配和基于扩散的生成模型因其样本质量而成为高维快速模拟的主流方法,但通常在推理时需要${\cal O}(100)$次函数评估,并常依赖辅助网络约束全局可观测量,损害了简化的端到端生成。我们引入了一个统一框架,改进了速度、簇射质量和物理保真度之间的平衡。该方法结合了:(i)平均速度场积分器,实现一步或少量评估的采样;(ii)从数据而非随机噪声构建的簇射空间学习生成先验;(iii)训练期间对关键可观测量施加归纳偏置的物理引导损失项。这些元素是训练时的正则化器,保持了端到端推理且无额外成本。仅需一步或少量评估步骤,该模型在多个公开的高粒度量热器数据集上达到了与最先进的流和扩散模型竞争的簇射质量。结果表明层间簇射结构与底层物理一致,为未来的快速模拟工作流提供了有力候选。

英文摘要

High-precision calorimeter simulation at current and future colliders imposes rapidly growing computational demands, motivating the development of machine-learning surrogates for traditional Monte Carlo tools such as Geant4. Flow matching and diffusion-based generative models have become leading approaches for high-dimensional fast simulation because of their sample quality, but typically require ${\cal O}(100)$ function evaluations at inference and often rely on auxiliary networks to constrain global observables, compromising streamlined end-to-end generation. We introduce a unified framework that improves the balance between speed, shower quality, and physics fidelity. The method combines: (i) an average velocity field integrator that enables sampling in one or a few evaluations; (ii) a learned generative prior in shower space, constructed from data rather than random noise; and (iii) physics-guided loss terms that impose inductive biases on key observables during training. These elements are training time regularizers, preserving end-to-end inference with no additional cost. With only one or a few evaluation steps, the model achieves shower quality competitive with state-of-the-art flow and diffusion approaches, tested on several public high granularity calorimeter datasets. The results demonstrate inter-layer shower structure consistent with the underlying physics, providing a strong candidate for future fast simulation workflows.
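
Why an average velocity field permits one-step sampling can be seen on a toy problem. The sketch below uses the scalar ODE dx/dt = -x, for which the average velocity over an interval has a closed form; in the paper this quantity would be learned by a network, and the time conventions used here are an assumption. All function names are hypothetical.

```python
import math


def velocity(x, t):
    """Instantaneous velocity of the toy ODE dx/dt = -x."""
    return -x


def average_velocity(x0, t0, t1):
    """Average velocity over [t0, t1] along the exact trajectory from x0.
    Closed form for the toy ODE; a model would learn this quantity."""
    return x0 * (math.exp(-(t1 - t0)) - 1.0) / (t1 - t0)


def sample_one_step(x0, t0=0.0, t1=1.0):
    """Jump from t0 to t1 with a single average-velocity evaluation."""
    return x0 + (t1 - t0) * average_velocity(x0, t0, t1)


def sample_euler(x0, steps, t0=0.0, t1=1.0):
    """Integrate the instantaneous velocity with many small Euler steps."""
    x, dt = x0, (t1 - t0) / steps
    for i in range(steps):
        x += dt * velocity(x, t0 + i * dt)
    return x


x0 = 2.0
one_step = sample_one_step(x0)
euler_100 = sample_euler(x0, steps=100)
exact = x0 * math.exp(-1.0)
```

In this toy setting a single evaluation of the average velocity lands on the exact endpoint, while 100 Euler steps on the instantaneous velocity still carry a small discretization error.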

2606.04164 2026-06-04 cs.LG cs.AI 版本更新

ADAPTOOD: Uncertainty-Aware Fine-Tuning for Out-of-Distribution ECG Time Series Models

ADAPTOOD:面向分布外心电图时间序列模型的不确定性感知微调

Sotirios Vavaroutas, Yu Yvonne Wu, Ali Etemad, Cecilia Mascolo

发表机构 * University of Cambridge(剑桥大学) Dartmouth College(达特茅斯学院) Queen’s University(皇后大学)

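AI summary: Proposes ADAPTOOD, which uses data uncertainty to quantify distribution shift severity and combines it with low-rank updates and adaptive hyperparameter optimisation, achieving up to 7% higher accuracy and 12.9% higher precision on out-of-distribution ECG time series tasks.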

Comments 11 pages

详情
AI中文摘要

用于训练的数据样本通常与微调和部署期间遇到的数据不同,尽管机器学习模型显示出潜力,但在只有少量标注数据集可用时,其性能仍然有限。在由不同传感器、人群和应用设置引起的分布偏移下,性能通常会下降。尽管预训练有所帮助,但模型在现实环境中经常遇到分布外(OOD)数据,导致鲁棒性降低。现有的自适应方法通常假设固定的分布偏移,并在出现多种类型或严重性时难以应对。特别是,它们忽略了偏移的严重性,例如将适应大型熟悉数据集与适应带有新任务的小型数据集同等对待,这限制了泛化能力。为了解决这个问题,我们提出了ADAPTOOD,这是一个新颖的框架,利用数据不确定性来量化分布偏移的严重性并指导时间序列的微调。这种不确定性衡量目标部署分布中的样本与预训练分布偏离的程度,提供了OOD严重性的直接信号。我们的框架将这种不确定性与低秩模型更新和自适应超参数优化相结合,以改进自适应。我们表明,在OOD任务中,ADAPTOOD比现有方法实现了高达7%的准确率和12.9%的精确率提升,在分布偏移严重性增加时仍保持强劲性能。

英文摘要

Data samples used for training often differ from those encountered during fine-tuning and deployment, and while ML models show promise, their performance remains limited when only small annotated datasets are available. Performance often degrades under distribution shifts caused by diverse sensors, populations, and application settings. Although pre-training helps, models frequently encounter out-of-distribution (OOD) data in real-world settings, leading to reduced robustness. Existing adaptation methods usually assume fixed distribution shifts and struggle when multiple types or severities occur. In particular, they overlook shift severity, for example treating adaptation to a large familiar dataset the same as adaptation to a small dataset with a new task, which limits generalisation. To address this, we propose ADAPTOOD, a novel framework that leverages data uncertainty to quantify distribution shift severity and guide fine-tuning for time series. This uncertainty measures how strongly samples from the target deployment distribution deviate from the pre-training distribution, providing a direct signal of OOD severity. Our framework combines this uncertainty with low-rank model updates and adaptive hyperparameter optimisation to improve adaptation. We show that ADAPTOOD achieves up to 7% higher accuracy and 12.9% higher precision than existing methods in OOD tasks, maintaining strong performance as distribution shift severity increases.

2606.04161 2026-06-04 cs.LG 版本更新

When Offline Selectors Cannot Beat the Best Single Model: A Diagnostic Study on edX Dropout Prediction

当离线选择器无法超越最佳单一模型:基于edX辍学预测的诊断研究

Tyler Crosse, Alan Nadelsticher Ruvalcaba, Dustin Khang LeDuc, Thomas Trask, Nicholas Lytle, David Joyner

发表机构 * edX

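AI summary: Offline selectors often fail to beat the best single model in practice. A three-stage diagnostic (k-NN label consistency, a comparison of offline learners, and state-feature ablation) identifies local representational ambiguity as the bottleneck and recommends changing the state or collecting new data rather than tuning the learner.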

详情
AI中文摘要

不同的预测器通常在不同输入上表现优异,因此每实例选择最佳预测器有望比固定单一模型获得更高准确率。在实践中,从日志数据训练的选择器经常无法击败最强的单一预测器。在进一步调优之前,三个原因通常未被区分:不匹配的学习器、无法预测哪个模型获胜的状态、或从缓存到部署的标签偏移。 一个三阶段诊断在共享缓存上排除这些原因。第一阶段通过k-NN标签一致性估计oracle恢复的局部上限。第二阶段询问配对BC和离线RL学习器(BC、DQN和CQL,跨惩罚权重)是否达到该上限。第三阶段消融选择器状态,测试更丰富的特征是否会提高上限。综合结论指向最有希望的下一步:调优学习器、重新设计状态或收集新数据。 我们将其应用于在edX点击流数据上选择五个辍学预测模型。在16个时间窗口上,oracle平均比最强单一基模型高出9.7个准确率点,但BC、DQN和CQL均落在其下方的相同测试准确率带内(对十倍缓存扫描和N=2,000个保留样本鲁棒)。瓶颈是局部表示模糊性:CQL缩小了模仿差距但无部署增益(非保守性),遗憾在学习器间紧密聚集(非打破平局),三个学习器在测试准确率上收敛(非偏移)。下一次迭代应改变状态或收集新数据,而非进一步调优离线学习器。

英文摘要

Different predictors often excel on different inputs, so picking the best one per instance promises higher accuracy than committing to a single model. In practice, selectors trained from logged data routinely fail to beat the strongest single predictor. Three causes typically go unseparated before more tuning is applied: a mismatched learner, a state that does not predict which model wins, or buffer-to-deployment label shift. A three-stage diagnostic rules them out on a shared buffer. Stage 1 estimates a local ceiling on oracle recovery from $k$-NN label consistency. Stage 2 asks whether paired BC and offline-RL learners (BC, DQN, and CQL across penalty weights) reach that ceiling. Stage 3 ablates the selector state to test whether richer features would raise it. The combined verdict points to the most promising next step: tuning the learner, redesigning the state, or collecting new data. We apply it to selecting among five dropout-prediction models on edX clickstream data. Across 16 windows, the oracle beats the strongest single base model by 9.7 accuracy points on average, yet BC, DQN, and CQL land in the same test-accuracy band below it (robust to a tenfold buffer sweep and $N = 2{,}000$ held-out examples). The bottleneck is local representational ambiguity: CQL closes the imitation gap without a deployment gain (not conservatism), regret clusters tightly across learners (not tie-breaking), and the three learners converge on test accuracy (not shift). The next iteration should change the state or collect new data, not tune the offline learner further.
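
One plausible reading of the Stage 1 statistic is the average fraction of a point's k nearest neighbours (in selector-state space) that share its oracle label, i.e. the index of the model that wins on that instance. The sketch below implements that reading with squared Euclidean distance; the paper's exact definition may differ, and the function name and toy data are hypothetical.

```python
def knn_label_consistency(states, labels, k):
    """Average fraction of each point's k nearest neighbours that carry
    the same label (here: which base model is best for that instance)."""
    scores = []
    for i, s in enumerate(states):
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(s, t)), j)
            for j, t in enumerate(states) if j != i
        )
        neighbours = [j for _, j in dists[:k]]
        agree = sum(labels[j] == labels[i] for j in neighbours)
        scores.append(agree / k)
    return sum(scores) / len(scores)


# Two tight clusters of hypothetical selector states.
states = [(0.0, 0.0), (0.0, 0.1), (0.1, 0.0),
          (5.0, 5.0), (5.0, 5.1), (5.1, 5.0)]

separable = knn_label_consistency(states, [0, 0, 0, 1, 1, 1], k=2)
ambiguous = knn_label_consistency(states, [0, 1, 0, 1, 0, 1], k=2)
```

When nearby states disagree about the winning model, as in the second labelling, no learner operating on that state can recover the oracle, which is the representational ambiguity the study reports.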

2606.04160 2026-06-04 cs.CL cs.LG 版本更新

Expert-Aware Refusal Steering

专家感知的拒绝引导

Anna C. Marbut, Daniel R. Olson, Travis J. Wheeler

Affiliations: Department of Interdisciplinary Studies, University of Montana; Department of Pharmacy Practice & Science, University of Arizona; European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus

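AI summary: Studies suppressing refusal behavior in Mixture-of-Experts (MoE) large language models with expert-aware steering vectors, finding that the output of a single expert suffices for effective steering and that attention plays a substantial role in MoE refusal behavior.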

Comments Under review for COLM 2026

详情
AI中文摘要

指令调优的大语言模型(LLM)的安全对齐依赖于模型可靠地拒绝回答有害或不允许请求的能力。最近的研究表明,在推理过程中对密集LLM应用引导向量可以有效抑制拒绝行为,诱导模型响应有害请求。我们将这种拒绝引导方法扩展到三个开源混合专家(MoE)LLM,并发现引导性能不受MoE架构固有的复杂路由模式影响。然后,我们提出了两种专家感知的拒绝引导方法,利用拒绝特定的专家路由模式和专家特定的引导方向来抑制正常的拒绝行为。我们发现,基于单个专家的输出即可有效引导拒绝行为。我们的结果表明,引导方法捕获的拒绝信号与专家路由行为不同,这表明注意力在MoE拒绝行为中扮演重要角色。

英文摘要

Safety alignment in instruction-tuned large language models (LLMs) depends on a model's ability to reliably refuse to respond to harmful or disallowed requests. Recent work has shown that a steering vector can be applied to a dense LLM during inference to effectively suppress refusal behavior, inducing response to harmful requests. We extend this refusal steering method to three open-source Mixture-of-Experts (MoE) LLMs and find that steering performance is uninhibited by the complex routing patterns inherent to the MoE architecture. We then propose two expert-aware refusal steering methods that leverage refusal-specific expert routing patterns and expert-specific steering directions to suppress normal refusal behavior. We find that refusal behavior can be effectively steered based on the output of a single expert. Our results show that refusal signals captured by steering methods differ from expert routing behavior, suggesting a substantial role for attention in MoE refusal behavior.

2606.04154 2026-06-04 q-bio.QM cs.LG 版本更新

EpiFormer: Learning Antigen-Antibody Interactions for Epitope Prediction via Geometric Deep Learning

EpiFormer: 通过几何深度学习学习抗原-抗体相互作用进行表位预测

Mansoor Ahmed, Huirong Chai, Haoxin Wang, Hemanth Venkateswara, Murray Patterson

发表机构 * Georgia State University(佐治亚州立大学) Georgia Institute of Technology(佐治亚理工学院)

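AI summary: Proposes EpiFormer, an encoder-decoder framework that interleaves cross-attention within GNN layers for bidirectional antigen-antibody information flow and pairs it with sparsity-aware objectives, improving F1 on epitope prediction by over 40%.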

详情
AI中文摘要

抗体通过结合称为表位的特定表面区域来中和外来抗原。计算表位预测对于理解免疫识别和指导抗体工程至关重要。然而,现有方法面临三个基本挑战:抗体感知模型独立编码每条链并在后期才进行组合,无法捕捉定义结合界面的共依赖结构特征;而严重的类别不平衡和已知抗体-抗原复合物的稀缺使得标准训练目标无效。我们提出EpiFormer,一个通用的编码器-解码器框架,联合解决这些挑战。我们的关键设计原则是在GNN编码层内进行交错交叉注意力,使得抗原-抗体信息流贯穿整个表示学习过程,而不仅仅在输出时。这种早期融合原则与主干无关,从简单的GCN到等变模型,在各种GNN架构上都能提供一致的改进。我们进一步表明,当与早期融合架构配对时,稀疏感知目标对于表位预测任务是有效的。EpiFormer在标准基准上的F1分数比之前的最佳方法提高了40%以上,展示了泛化能力和跨数据集迁移性。值得注意的是,EpiFormer发现已知的生物学原理作为端到端训练的涌现行为,其中学习到的交叉注意力门控倾向于抗原到抗体的信息流,与两条链在结合界面的不对称角色一致,并且模型对几何特征而非进化特征的偏好与已建立的发现(表位残基并非进化保守)一致。源代码可在https://github.com/mansoor181/epiformer.git获取。

英文摘要

Antibodies neutralize foreign antigens by binding to specific surface regions called epitopes. Computational epitope prediction is critical for understanding immune recognition and guiding antibody engineering. However, existing methods face three fundamental challenges: antibody-aware models encode each chain independently and combine them only at a late stage, failing to capture co-dependent structural features that define binding interfaces, whereas severe class imbalance and scarcity of known antibody-antigen complexes render standard training objectives ineffective. We propose EpiFormer, a general encoder-decoder framework that addresses these challenges jointly. Our key design principle is interleaved cross-attention within GNN encoding layers, enabling bidirectional antigen-antibody information flow throughout representation learning rather than only at the output. This early-fusion principle is backbone-agnostic, providing consistent gains across GNN architectures from simple GCNs to equivariant models. We further show that sparsity-aware objectives are effective when paired with early-fusion architectures for the epitope prediction task. EpiFormer improves over the previous best method by over 40% in F1 score on standard benchmarks, demonstrating generalizability and cross-dataset transferability. Notably, EpiFormer discovers known biological principles as emergent behaviors of end-to-end training, where the learned cross-attention gates favor antigen-to-antibody information flow, consistent with the asymmetric roles of the two chains at the binding interface, and the model's preference for geometric over evolutionary features aligns with the established finding that epitope residues are not evolutionarily conserved. The source code is available at: https://github.com/mansoor181/epiformer.git

2606.04143 2026-06-04 cs.LG cs.AI 版本更新

Physics-Informed Machine Learning for Short-Term Flood Prediction

物理信息机器学习用于短期洪水预测

Tewodros Syum Gebre, Jagrati Talreja, Leila Hashemi-Beni

发表机构 * IEEE Service Center(IEEE服务中心) National Science Foundation(国家科学基金会) Microsoft(微软)

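AI summary: Proposes a physics-informed machine learning framework that embeds hydrological knowledge as a Trend Alignment constraint in the LSTM loss function, improving the physical consistency and reliability of flood prediction under data scarcity and extreme weather.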

Comments This paper has been accepted for publication in IGARSS 2026. The final authenticated version will be available through IEEE Xplore

详情
AI中文摘要

准确的洪水预测对于减轻灾害风险和保护社区至关重要。然而,纯数据驱动的机器学习模型在数据稀缺环境中常常表现不佳,并可能违反基本的水文原理。标准长短期记忆(LSTM)网络可能产生物理上不一致的预测,特别是在外推到极端天气条件时。为了解决这些限制,我们提出了一种物理信息机器学习(PIML)框架,将水文知识直接纳入LSTM模型的损失函数中。具体来说,趋势对齐约束惩罚降水与流量趋势之间的方向不一致性,从而在不需复杂水动力学方程的情况下提高模型鲁棒性。这种正则化鼓励模型学习物理上合理的水文过程线行为,即使在训练数据有限的情况下,也能增强峰值洪水事件期间的可靠性。实验结果表明,所提出的物理信息模型在数据稀缺环境下优于标准LSTM基线,当仅使用5%的可用数据训练时,纳什-萨特克利夫效率(NSE)从0.20提高到0.23。在模拟极端气候情景下的额外压力测试表明,基线模型表现出不稳定的行为,而物理信息模型保持了方向一致性和物理合理性。尽管在数据有限的情况下准确预测极端峰值幅度仍然具有挑战性,但所提出的方法显著减少了纯数据驱动模型中常见的非物理波动。这些发现表明,简单的物理约束可以显著提高深度学习模型在实时洪水预测中的可靠性,为无测站流域和不断变化的气候条件提供了实用解决方案。

英文摘要

Accurate flood forecasting is essential for mitigating disaster risks and protecting communities. However, purely data-driven machine learning models often struggle in data-scarce environments and may violate fundamental hydrological principles. Standard Long Short-Term Memory (LSTM) networks can generate physically inconsistent predictions, particularly when extrapolating to extreme weather conditions. To address these limitations, we propose a Physics-Informed Machine Learning (PIML) framework that incorporates hydrological knowledge directly into the loss function of an LSTM model. Specifically, a Trend Alignment constraint penalizes directional inconsistencies between precipitation and discharge trends, improving model robustness without requiring complex hydrodynamic equations. This regularization encourages the model to learn physically plausible hydrograph behavior, even with limited training data, while enhancing reliability during peak flood events. Experimental results show that the proposed physics-informed model outperforms a standard LSTM baseline in data-scarce settings, increasing the Nash-Sutcliffe Efficiency (NSE) from 0.20 to 0.23 when trained on only 5% of the available data. Additional stress tests under simulated extreme climate scenarios demonstrate that the baseline model exhibits unstable behavior, whereas the physics-informed model maintains directional consistency and physical plausibility. Although accurately predicting extreme peak magnitudes remains challenging with limited data, the proposed approach substantially reduces unphysical fluctuations common in purely data-driven models. These findings demonstrate that simple physical constraints can significantly improve the reliability of deep learning models for real-time flood forecasting, offering a practical solution for ungauged basins and evolving climate conditions.
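
Two quantities from the abstract can be made concrete. The Nash-Sutcliffe Efficiency is a standard metric and is implemented below as usually defined. The trend penalty is only a guess at the spirit of the Trend Alignment constraint: it penalizes time steps where precipitation and predicted discharge move in opposite directions, and it ignores the lag between rainfall and runoff that a real formulation would likely handle. Function names and the series values are hypothetical.

```python
def nse(observed, simulated):
    """Nash-Sutcliffe Efficiency: 1 is a perfect fit, 0 matches the
    skill of always predicting the observed mean."""
    mean_obs = sum(observed) / len(observed)
    num = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    den = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - num / den


def trend_alignment_penalty(precip, discharge_pred):
    """Mean penalty over steps where the precipitation trend and the
    predicted discharge trend have opposite signs (assumed form)."""
    penalties = []
    for t in range(1, len(precip)):
        dp = precip[t] - precip[t - 1]
        dq = discharge_pred[t] - discharge_pred[t - 1]
        penalties.append(max(0.0, -dp * dq))
    return sum(penalties) / len(penalties)


precip = [0.0, 2.0, 5.0, 3.0]
aligned = [1.0, 2.0, 4.0, 3.5]       # rises and falls with rainfall
misaligned = [1.0, 0.5, 0.2, 1.0]    # moves against rainfall

penalty_aligned = trend_alignment_penalty(precip, aligned)
penalty_misaligned = trend_alignment_penalty(precip, misaligned)
```

In training, a term like this would be added to the usual regression loss with a weighting coefficient, so the network is discouraged from directionally implausible hydrographs without solving any hydrodynamic equations.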

2606.04135 2026-06-04 cs.LG 版本更新

Stationarity-Aware Retrieval-Augmented Time Series Forecasting

平稳性感知的检索增强时间序列预测

Shiqiao Zhou, Holger Schöner, Zipeng Wu, Edouard Fouché, IAG Wilson, Shuo Wang

发表机构 * University of Birmingham(伯明翰大学) Siemens AG(西门子有限公司)

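AI summary: Proposes SARAF, a framework that adaptively balances relevance and diversity in retrieval and uses stationarity-aware aggregation to improve the accuracy and robustness of non-stationary time series forecasting.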

Comments Accepted by KDD 2026 research track

详情
AI中文摘要

时间序列预测依赖于历史模式,但真实世界序列通常表现出非平稳性和制度转换,这对全参数预测器构成挑战。受检索增强生成(RAG)启发,最近的工作通过检索相关历史片段并在推理时将其作为外部证据来增强预测器。然而,由于真实世界时间序列的内在非平稳性,高度相似的过去片段并不一定意味着相似的未来,这使得仅基于相似性的检索脆弱且容易冗余。我们提出平稳性感知的检索增强时间序列预测(SARAF),这是一个自适应平衡检索中相关性和多样性的框架。SARAF首先通过时间对齐增强的时间相似性形成候选池,然后应用多样性感知选择策略覆盖异质历史制度,其中多样化强度由数据集级别的平稳性自动调节。此外,SARAF使用平稳性感知聚合来融合检索到的未来。在八个真实世界数据集上的大量实验表明,SARAF实现了有竞争力的预测性能,并在强基线上提高了平均准确性和鲁棒性,在具有挑战性的非平稳设置下尤其明显。代码:https://github.com/ShiqiaoZhou/SARAF。

英文摘要

Time series forecasting relies on historical patterns, but real-world series often exhibit non-stationarity and regime shifts that challenge fully parametric forecasters. Inspired by Retrieval-Augmented Generation (RAG), recent work augments forecasters by retrieving relevant historical segments and using them as external evidence at inference time. However, due to the intrinsic non-stationarity of real-world time series, a highly similar past segment does not necessarily imply a similar future, rendering similarity-only retrieval brittle and prone to redundancy. We propose Stationarity-Aware Retrieval-Augmented Time Series Forecasting (SARAF), a framework that adaptively balances relevance and diversity in retrieval. SARAF first forms a candidate pool via temporal similarity with time-aligned enhancement, then applies a diversity-aware selection strategy to cover heterogeneous historical regimes, with the diversification strength automatically modulated by dataset-level stationarity. Moreover, SARAF uses stationarity-aware aggregation to fuse the retrieved futures. Extensive experiments on eight real-world datasets show that SARAF achieves competitive forecasting performance and improves average accuracy and robustness over strong baselines, with particularly clear benefits under challenging non-stationary settings. Code: https://github.com/ShiqiaoZhou/SARAF.
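
The relevance-versus-diversity trade-off can be illustrated with a greedy selection in the style of maximal marginal relevance (MMR). MMR is used here as a familiar stand-in, not as SARAF's actual selection rule, and how SARAF maps dataset-level stationarity to the diversification strength is not specified in the abstract. The function name, the `diversity_weight` parameter, and the scores are hypothetical.

```python
def select_diverse(relevance, similarity, k, diversity_weight):
    """Greedily pick k candidates, trading relevance to the query against
    redundancy with candidates that were already picked."""
    selected = []
    candidates = set(range(len(relevance)))
    while candidates and len(selected) < k:
        def score(i):
            redundancy = max((similarity[i][j] for j in selected), default=0.0)
            return ((1.0 - diversity_weight) * relevance[i]
                    - diversity_weight * redundancy)
        best = max(sorted(candidates), key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Candidates 0 and 1 are near-duplicates; candidate 2 comes from a different regime.
relevance = [0.9, 0.85, 0.5]
similarity = [
    [1.0, 0.95, 0.1],
    [0.95, 1.0, 0.1],
    [0.1, 0.1, 1.0],
]

similarity_only = select_diverse(relevance, similarity, k=2, diversity_weight=0.0)
diversified = select_diverse(relevance, similarity, k=2, diversity_weight=0.5)
```

With no diversification the two near-duplicate segments are retrieved; with a larger weight the second slot goes to the segment from a different regime, which is the behaviour one would want more of when the series is less stationary.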

2606.04121 2026-06-04 cs.LO cs.LG cs.SE 版本更新

veriFIRE: an Industrial Case Study in Verifying Consistency Properties for a DNN-Based Wildfire Detection System

veriFIRE:基于DNN的野火检测系统一致性属性验证的工业案例研究

Idan Refaeli, Maya Swisa, Itay Buchnik, Alon Zada, Guy Amir, Elad Mandelbaum, Ziv Freund, Guy Katz

Affiliations: The Hebrew University of Jerusalem; Elbit Systems - ISTAR & EW - Elisra L.T.D; Cornell University

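AI summary: Presents an end-to-end methodology that encodes application requirements as solver-compatible queries and uses existing neural network verifiers to check consistency properties such as monotonicity and bounded response in a wildfire detection system, evaluated on real background samples, showing that meaningful domain-specific guarantees are obtainable for industrial systems.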

Comments To appear in The 9th International Symposium on AI Verification (SAIV)

详情
AI中文摘要

我们介绍了veriFIRE项目的当前工作:一个工业界与学术界的合作项目,旨在应用验证来提高一个真实世界安全关键系统的可靠性。具体来说,我们针对一个用于野火检测的机载平台,该平台包含两个深度神经网络。我们提出了一种端到端的方法来验证该系统中的一致性属性。我们的方法将基于应用的需求编码为现有神经网络验证器可求解的查询。我们研究了关键操作场景下的感兴趣属性:(i) 检测器置信度随目标强度增加而单调递增;(ii) 在传感器物理上合理的模糊下,检测器响应有界。我们使用最先进的神经网络验证后端实例化这些编码,并在真实背景样本上大规模评估。对于第一个属性,所有验证查询在五分钟内解决。对于第二个属性,验证难度显著增加,突出了更丰富、更高维规格的关键可扩展性挑战。总体而言,结果表明可以为工业系统获得有意义的、领域特定的保证。

英文摘要

We present our ongoing work on the veriFIRE project: a collaboration between industry and academia, aimed at applying verification to increase the reliability of a real-world, safety-critical system. Specifically, we target an airborne platform for wildfire detection, which incorporates two deep neural networks. We present an end-to-end methodology for verifying consistency properties in this system. Our approach encodes application-grounded requirements into solver-compatible queries for existing neural network verifiers. We study properties of interest over critical operational scenarios: (i) monotonicity of detector confidence as target intensity increases; and (ii) bounded detector response under physically plausible blur over the sensor. We instantiate these encodings using state-of-the-art neural network verification backends and evaluate them at scale on real background samples. For the first property, all verification queries are solved in under five minutes. For the second property, verification is substantially harder, highlighting key scalability challenges for richer, higher-dimensional specifications. Overall, the results demonstrate that meaningful, domain-specific guarantees can be obtained for industrial systems.
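
The two properties can be written down formally in one possible way. The notation below is an assumption made for illustration and is not taken from the paper: $f$ is the detector's confidence output, $\mathcal{X}$ a set of real background samples, $p$ a fixed target pattern scaled by intensity $\alpha$, $B_\sigma$ a blur operator of strength $\sigma$, and $\varepsilon$ a tolerance.

```latex
% (i) Monotonicity of confidence in target intensity
\forall x \in \mathcal{X},\ \forall\, 0 \le \alpha \le \alpha' \le \alpha_{\max}:\quad
  f\big(x + \alpha' p\big) \;\ge\; f\big(x + \alpha p\big)

% (ii) Bounded response under physically plausible blur
\forall x \in \mathcal{X},\ \forall\, 0 \le \sigma \le \sigma_{\max}:\quad
  \big|\, f\big(B_\sigma(x)\big) - f(x) \,\big| \;\le\; \varepsilon
```

A neural network verifier would typically be asked about the negation of such a statement over a bounded input region: an unsatisfiable query certifies the property, and a satisfying assignment is a concrete counterexample.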

2606.04115 2026-06-04 cs.LG cs.AI 版本更新

dMX: Differentiable Mixed-Precision Assignment for Low-Precision Floating-Point Formats

dMX: 低精度浮点格式的可微分混合精度分配

Giuseppe Franco, Ian Colbert, Pablo Monteagudo-Lago, Felix Marty, Nicholas Fraser

发表机构 * AMD

AI总结 提出可微分混合精度量化框架 dMX,通过连续优化每层浮点格式参数并配合退火调度和正则化项,实现硬件兼容的 MXFP 格式分配,在 LLM 上取得帕累托最优效果。

详情
AI中文摘要

将大型语言模型(LLM)量化为低精度浮点表示是高效部署的关键,然而在所有层上统一应用单一比特宽度在性能和准确性方面均非最优。本文介绍 dMX,一种用于可学习浮点比特宽度分配的可微分混合精度量化框架。我们研究了其在开放计算项目(OCP)标准定义的微缩放浮点(MXFP)数据类型家族上的应用。每层比特宽度分配被表述为一个连续优化问题,其中每层的浮点格式由一个标量参数参数化,将多变量设计空间折叠为单个可学习偏移量。在训练过程中,该偏移量取连续值,避免了离散量化格式之间的突然振荡。基于温度的退火调度逐步离散化学习到的偏移量,确保最终配置映射到硬件兼容的 MXFP 格式,而不会在训练和推理行为之间出现突变。目标感知正则化项将平均比特宽度引导至用户指定的预算,作为推理成本的粗粒度代理,平衡模型质量与部署效率。我们在不同 LLM 家族(如 Llama、Qwen3 和 SmolLM2)上进行了实验,评估了 WikiText-2 上的困惑度和四个零样本推理基准上的准确率。在这些设置中,dMX 一致地产生帕累托主导模型,并优于基于 Kullback-Leibler(KL)散度的层选择启发式方法,有效导航模型质量与平均比特宽度之间的权衡。

英文摘要

Quantizing large language models (LLMs) to low-precision floating-point representations is central to efficient deployment, yet applying a single bit-width uniformly across all layers is sub-optimal in terms of both performance and accuracy. This work introduces dMX, a differentiable mixed-precision quantization framework for learnable floating-point bit-width assignment. We study its application for the microscaling floating-point (MXFP) family of data types defined by the Open Compute Project (OCP) standard. The per-layer bit-width assignment is formulated as a continuous optimization problem in which each layer's floating-point format is parameterized by a scalar parameter, folding the multi-variate design space into a single learnable offset. During training, this offset takes continuous values, avoiding sudden oscillations between discrete quantization formats. A temperature-based annealing schedule progressively discretizes the learned offsets, ensuring that the final configuration maps to hardware-compatible MXFP formats without abrupt transitions between training and inference behavior. A target-aware regularization term steers the average bit-width toward a user-specified budget, serving as a coarse-grained proxy for inference cost and balancing model quality against deployment efficiency. We performed experiments on different families of LLMs, such as Llama, Qwen3, and SmolLM2, evaluating perplexity on WikiText-2 and accuracy on four zero-shot reasoning benchmarks. Across these settings, dMX consistently yields Pareto-dominating models and improves over Kullback-Leibler (KL) divergence-based layer-selection heuristics, efficiently navigating trade-offs between model quality and average bit-width.
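The interplay of a continuous per-layer offset, a temperature schedule, and a bit-width budget can be illustrated with a toy relaxation. The sketch below is not the dMX parameterization: the softmax over squared distances, the geometric schedule, and the names `soft_bit_width`, `budget_penalty`, and `anneal` are all assumptions made for illustration. Only the candidate element widths (4, 6, and 8 bits for MXFP4, MXFP6, and MXFP8) come from the OCP MX format family.

```python
import math

# Element bit-widths of the candidate OCP MX floating-point formats.
FORMAT_BITS = [4, 6, 8]  # MXFP4, MXFP6, MXFP8


def soft_bit_width(offset, temperature):
    """Softmax-weighted bit-width; collapses to the nearest format as temperature -> 0.

    Hypothetical relaxation: `offset` is a continuous index into FORMAT_BITS.
    """
    logits = [-((offset - idx) ** 2) / temperature for idx in range(len(FORMAT_BITS))]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(logit - top) for logit in logits]
    total = sum(weights)
    return sum(w / total * bits for w, bits in zip(weights, FORMAT_BITS))


def budget_penalty(offsets, temperature, target_bits):
    """Squared gap between the average soft bit-width and a user-specified budget."""
    average = sum(soft_bit_width(o, temperature) for o in offsets) / len(offsets)
    return (average - target_bits) ** 2


def anneal(start, end, step, total_steps):
    """Geometric temperature schedule (an assumed schedule, chosen for simplicity)."""
    return start * (end / start) ** (step / total_steps)
```

At a high temperature every layer sees a blend of all candidates, which keeps the objective smooth in the offset; at a low temperature each layer commits to a single format, so the training-time behavior approaches what inference would use.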

2606.04110 2026-06-04 cs.LG stat.ML 版本更新

Variance Reduction for Heavy-Tailed Monetization Metrics in Ranking Experiments via Post-Stratification

基于事后分层的排序实验中重尾货币化指标的方差缩减

Neeti Pokharna, Olivier Jeunen, Yatharth Saraf, Aleksei Ustimenko

发表机构 * ShareChat Aampe Simulacra Research

AI总结 针对排序实验中重尾货币化指标方差大、统计功效低的问题,提出结合事后分层与CUPED的方差缩减框架,利用实验前协变量提升灵敏度,在ShareChat部署后以减少约45%的流量达到同等统计置信度。

Comments Accepted as Industry Track paper in the 2026 ACM SIGIR Conference on Research and Development in Information Retrieval

详情
AI中文摘要

排序和检索系统的在线评估通常依赖于下游货币化指标,如应用收入或创作者收益。这些指标通常是重尾的,一小部分用户主导了均值和方差,导致A/B实验的统计功效低、结论不可靠——尤其是在流量有限的情况下。我们提出了一个实用的在线实验方差缩减框架,通过结合事后分层与CUPED。我们的方法利用实验前协变量提高货币化实验的灵敏度,无需额外流量。在ShareChat的排序驱动货币化实验中部署后,该方法显著降低了方差并提高了决策稳定性,与标准指标相比,在流量减少约45%的情况下实现了同等的统计置信度。我们进一步讨论了实际设计选择、防护措施和局限性,为事后分层在现实信息检索和推荐系统中何时适用提供了指导。

英文摘要

Online evaluation of ranking and retrieval systems often relies on downstream monetization metrics such as app revenue or creator earnings. These metrics are typically heavy-tailed, with a small fraction of users dominating both mean and variance, leading to low statistical power and unreliable conclusions in A/B experiments -- especially under limited traffic. We present a practical framework for variance reduction in online experiments by combining post-stratification with CUPED. Our approach leverages pre-experiment covariates to improve the sensitivity of monetization experiments without requiring additional traffic. Deployed at ShareChat across ranking-driven monetization experiments, the method substantially reduces variance and improves decision stability, achieving equivalent statistical confidence with ~45% less traffic than standard metrics. We further discuss practical design choices, guardrails, and limitations, providing guidance on when post-stratification is appropriate for real-world information retrieval and recommendation systems.
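The two building blocks named in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the deployed ShareChat pipeline: the order of operations (CUPED first, then post-stratification), the toy data, the stratum labels, and the population shares are all hypothetical.

```python
from statistics import mean


def cuped_adjust(metric, covariate):
    """Remove the part of the metric explained by a pre-experiment covariate."""
    m_bar, c_bar = mean(metric), mean(covariate)
    cov = sum((c - c_bar) * (m - m_bar) for c, m in zip(covariate, metric))
    var = sum((c - c_bar) ** 2 for c in covariate)
    theta = cov / var if var > 0 else 0.0  # regression slope of metric on covariate
    return [m - theta * (c - c_bar) for c, m in zip(covariate, metric)]


def post_stratified_mean(values, strata, population_share):
    """Reweight per-stratum means by known population shares."""
    groups = {}
    for value, stratum in zip(values, strata):
        groups.setdefault(stratum, []).append(value)
    return sum(population_share[s] * mean(vals) for s, vals in groups.items())


def sample_variance(values):
    v_bar = mean(values)
    return sum((v - v_bar) ** 2 for v in values) / (len(values) - 1)


# Toy data (hypothetical): revenue tracks pre-experiment spend; one heavy user dominates.
pre_spend = [1.0, 2.0, 1.5, 3.0, 2.5, 40.0]
revenue = [1.2, 2.1, 1.4, 3.3, 2.4, 41.0]
strata = ["light"] * 5 + ["heavy"]

adjusted = cuped_adjust(revenue, pre_spend)
estimate = post_stratified_mean(adjusted, strata, {"light": 0.9, "heavy": 0.1})
```

The CUPED adjustment leaves the sample mean unchanged while shrinking the variance by the share that the covariate explains, and post-stratification then prevents a chance over-representation of heavy users from moving the estimate.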

2606.04108 2026-06-04 cs.GR cs.AI cs.CV cs.LG 版本更新

SymTRELLIS: Symmetry-Enforced Voxel Latents for 3D Generation

SymTRELLIS: 对称性增强的体素潜变量用于3D生成

Guangda Ji, Qimin Chen, Qinchan Li, Mingrui Zhao, Kai Wang, Hao Zhang

发表机构 * Simon Fraser University(西蒙弗雷泽大学)

AI总结 提出SymTRELLIS方法,通过在流模型生成过程中对预测速度进行对称化平均,强制任意有限点群对称性,无需重新训练VAE或流模型,显著降低对称性误差。

详情
AI中文摘要

单视图3D生成模型已取得令人印象深刻的视觉质量,但它们并非为满足结构或功能需求而设计,在实践中常常存在不足。对称性就是这样一个需求:违反对称性,即使是微小的违反,也可能使模型在物理上不可用。我们提出SymTRELLIS,一种在TRELLIS.2的基于流的3D生成过程中强制任意有限点群对称性(旋转、反射和多面体对称)的方法,无需重新训练底层的VAE或流模型。我们的关键思想是将空间变换在潜空间中的作用近似为体素潜变量上的学习线性算子,通过一个轻量级的空间变换潜映射器实现,该映射器在通用的非对称3D数据上训练。在生成时,我们通过在每一步ODE中对所有对称等价变换的预测流速度进行平均来强制对称性,这一过程称为速度对称化。对称性规格可以从初始TRELLIS.2生成中自动估计,或由用户提供,从而可以刻意调整对称重数,超出输入图像所暗示的范围。在一个包含266个严格对称物体的基准测试上(涵盖2至20重旋转对称和多面体对称群),与TRELLIS.2、Hunyuan3D-2.1和TripoSG相比,SymTRELLIS显著降低了所有对称性误差指标,同时保持了与基础模型相当的重建精度。

英文摘要

Single-view 3D generative models have achieved impressive visual quality, yet they are not designed to satisfy structural or functional requirements and, in practice, often fall short. Symmetry is one such requirement: even subtle symmetry violations can render a model physically unusable. We present SymTRELLIS, a method that enforces arbitrary finite point group symmetries (rotational, reflectional, and polyhedral) during the flow-based 3D generation of TRELLIS.2, without retraining the underlying VAE or flow model. Our key idea is to approximate the latent-space action of spatial transformations as a learned linear operator on voxel latents, implemented as a lightweight spatial-transform latent mapper trained on generic, non-symmetric 3D data. At generation time, we enforce symmetry by averaging predicted flow velocities across all symmetry-equivalent transformations at each ODE step, a process we call velocity symmetrization. The symmetry specification can be estimated automatically from an initial TRELLIS.2 generation or supplied by the user, enabling deliberate fold manipulation beyond what the input image suggests. On a curated benchmark of 266 strictly symmetric objects spanning 2- to 20-fold rotations and polyhedral symmetry groups, SymTRELLIS substantially reduces all symmetry error metrics compared to TRELLIS.2, Hunyuan3D-2.1, and TripoSG, while maintaining reconstruction accuracy comparable to the base model.
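Velocity symmetrization amounts to group averaging of a vector field. The toy below illustrates the idea in two dimensions with the four-element group of quarter-turn rotations, acting directly on points rather than on voxel latents through a learned operator. The field `toy_velocity` and all function names are hypothetical stand-ins, so this is an analogy to the method described above rather than its implementation.

```python
def rotate90(p, k):
    """Rotate a 2D point or vector by k quarter turns counter-clockwise."""
    x, y = p
    for _ in range(k % 4):
        x, y = -y, x
    return (x, y)


def symmetrize_velocity(velocity, point):
    """Average g^{-1} v(g x) over the four quarter-turn rotations g."""
    sx = sy = 0.0
    for k in range(4):
        vx, vy = rotate90(velocity(rotate90(point, k)), -k)
        sx += vx
        sy += vy
    return (sx / 4.0, sy / 4.0)


def toy_velocity(p):
    """A deliberately non-symmetric field standing in for a flow model's prediction."""
    x, y = p
    return (x * x + 0.5 * y, 0.3 * x - y * y + 1.0)


def euler_step(point, dt=0.1):
    """One ODE step that uses the symmetrized velocity instead of the raw one."""
    vx, vy = symmetrize_velocity(toy_velocity, point)
    return (point[0] + dt * vx, point[1] + dt * vy)


p = (0.7, -1.2)
v_at_p = symmetrize_velocity(toy_velocity, p)
v_at_rotated_p = symmetrize_velocity(toy_velocity, rotate90(p, 1))
```

The averaged field is equivariant: evaluating it at a rotated point gives the rotated velocity. Integrating an equivariant field from a symmetric starting state keeps the state symmetric at every step, which is why averaging at each ODE step suffices.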

2606.04106 2026-06-04 cs.LG cs.AI 版本更新

Building The Ph(ysical)AI Layer Of Machine Intelligence

构建机器智能的物理AI层

Ulbert Jose Botero, Liam Smith, Brooks Olney, Pooya Khorrami, Steven Kusiak, Watson Jia, Sage Trudeau, Daniel Capecci

发表机构 * MIT Lincoln Laboratory(麻省理工学院林肯实验室)

AI总结 提出基于信号处理原理的基座模型,通过射频数据训练实现跨模态迁移,无需目标域微调,以1.99M参数在15个任务上平均准确率77.7%。

Comments 102 pages, 11 Figures

详情
AI中文摘要

基础模型通过多样化数据的大规模训练实现泛化,但在没有配对训练数据的情况下,向真正未见过的领域迁移存在局限性。我们提出基于原理的基座模型,该模型编码信号处理原理(傅里叶分解、能量守恒、对称性),而不是学习无约束的统计相关性。我们假设不同领域的差异不在于基本物理规律,而在于时间、频率、幅度或相位上的可学习变换。仅使用射频数据训练,并结合这些原理的协同设计架构和损失函数,我们实现了向音频、图像、文本和视频的跨模态迁移,仅使用从射频数据学习到的冻结表示,无需在目标域上对编码器进行微调。我们的1.99M参数冻结编码器通过线性探测在15个不同任务上达到77.7%的平均准确率(top-3为91.9%),具有系统性差异:在物理基础任务(说话人识别、地震学、射频指纹识别)上为84.5%,而在语义任务(音乐流派、语言识别)上为70.0%。这表明基于原理和基于规模的方法提供了互补路径:物理原理实现了高效的跨模态迁移,同时自然地界定了物理理解与语义理解之间的边界。

英文摘要

Foundation models achieve generalization through massive-scale training on diverse data, but have limitations with transfer to truly unseen domains without paired training data. We propose principle-driven foundation models that encode signal-theoretic principles (Fourier decomposition, energy conservation, symmetry) rather than learn untethered statistical correlations. We hypothesize that domains differ not in fundamental physics, but in learnable transformations in time, frequency, magnitude, or phase. Training exclusively on radio-frequency (RF) data with co-designed architecture and losses incorporating these principles, we achieve cross-modal transfer to audio, images, text, and video using only frozen representations learned from RF data, requiring no fine-tuning of the encoder on target domains. Our 1.99M parameter frozen encoder achieves 77.7% average accuracy (91.9% top-3) across 15 diverse tasks via linear probing, with systematic variation: 84.5% on physically-grounded tasks (speaker recognition, seismology, RF fingerprinting) versus 70.0% on semantic tasks (music genre, language recognition). This reveals that principle-driven and scale-driven approaches offer complementary paths: physical principles enable efficient cross-modal transfer while naturally establishing the boundary between physical and semantic understanding.

2606.04103 2026-06-04 cs.SD cs.AI cs.LG eess.AS 版本更新

The Differentiable Auditory Loop (DAL): An ML Framework for Hyper-Personalized Hearing Aids

可微分听觉环路(DAL):用于超个性化助听器的机器学习框架

Alejandro Ballesta Rosen, Jason Mikiel-Hunter, Julian Maclaren, Jack Collins, Richard F. Lyon, Simon Carlile

发表机构 * Google Research Australia(谷歌澳大利亚研究实验室) Macquarie University(麦考瑞大学)

AI总结 提出可微分听觉环路(DAL)框架,通过将CARFAC模型移植到JAX并优化SEANet深度神经网络,以正常听觉神经活动模式为参考补偿听力损失,在神经表征和信号保真度指标上优于传统助听器基线。

详情
AI中文摘要

传统助听器依赖固定的频率依赖性放大和压缩来管理灵敏度降低,这在复杂环境中(如多说话者场景,即“鸡尾酒会”问题)往往无法提供足够的听力支持。为了更全面地解决听力损失背后的编码功能障碍,我们引入了可微分听觉环路(DAL),这是一个用于个性化助听器设计和验配的新开源框架。我们的第一个DAL实现包含了CARFAC——一个可微的人类耳蜗功能模型,我们将其移植到JAX,以优化深度神经网络,使受损的听觉神经活动模式与正常听力参考匹配。为了构建具有所需精细频谱-时间信号处理的助听器,我们采用了SEANet,一种波形到波形的全卷积UNet生成器。我们通过比较适配正常听力的CARFAC模型输出与适配每个受试者个体听力损伤的CARFAC模型输出来微调网络。比较使用来自各自CARFAC神经活动模式(NAP)输出和稳定听觉图像(SAI)的损失函数进行,后者提供捕获听觉神经输出中相位不敏感时间结构的二维表示。通过梯度下降,SEANet模型学习同时去噪输入并补偿由受损CARFAC模型建模的听力损失。在神经表征和信号保真度指标上,DAL优化的SEANet模型优于测试的主助听器(MHA)基线。DAL框架为基于模型、机器学习驱动的助听器信号处理个性化提供了一条实用路径。下一步包括硬件部署以实现真实世界的临床测试。

英文摘要

Conventional hearing aids rely on fixed, frequency-dependent amplification and compression to manage reduced sensitivity, which often fails to provide sufficient listening support in complex environments, such as situations with multiple speakers (the "cocktail party" problem). To more comprehensively address the underlying encoding dysfunctions of hearing loss, we introduce the Differentiable Auditory Loop (DAL), a new open-source framework for personalized hearing aid design and fitting. Our first implementation of DAL incorporates CARFAC, a differentiable model of human cochlear function, which we ported to JAX, to optimize a deep neural network to match impaired auditory neural activity patterns with a normal-hearing reference. To build a hearing aid with the fine-grained spectro-temporal signal processing required, we adopt SEANet, a waveform-to-waveform fully convolutional UNet generator. We fine-tune the network by comparing the outputs of a CARFAC model fitted to normal hearing with those of a CARFAC model fitted to match each subject's individual hearing impairment. The comparison is done using loss functions derived from the respective CARFAC neural activity pattern (NAP) outputs and stabilized auditory images (SAIs), the latter providing a 2D representation that captures phase-insensitive temporal structure in the auditory nerve output. Through gradient descent, the SEANet model learns to both denoise the input and compensate for the hearing loss modelled by the impaired CARFAC model. Across neural-representation and signal-fidelity metrics, the DAL-optimized SEANet model outperformed the tested master hearing aid (MHA) baselines. The DAL framework provides a practical path toward model-based, machine-learning-driven personalization of hearing aid signal processing. Next steps include hardware deployment to enable real-world clinical testing.

2606.04100 2026-06-04 cs.LG physics.comp-ph 版本更新

Stein Kernelized Molecular Dynamics for Active Learning of Interatomic Potentials

Stein核化分子动力学用于原子间势的主动学习

Joanna Zou, Fraser Birks, Dallas Foster, Youssef Marzouk

发表机构 * Center for Computational Science & Engineering, Schwarzman College of Computing, MIT(计算科学与工程中心,计算机科学学院,麻省理工学院) Warwick Centre for Predictive Modelling, School of Engineering, University of Warwick(预测建模中心,工程学院,沃里克大学) NVIDIA

AI总结 提出Stein核化分子动力学(SKMD),一种通过相互作用粒子动力学获取信息性训练配置的增强采样方法,用于主动学习和微调机器学习原子间势,保持玻尔兹曼分布作为渐近分布,并采用自适应停止准则高效在线获取非冗余数据,在Müller-Brown势和丙氨酸二肽的MACE势上展示了优于基线的模型精度。

详情
AI中文摘要

机器学习原子间势(MLIP)能够实现高效且精确的原子模拟,但其性能关键取决于训练数据的质量和多样性。我们引入了Stein核化分子动力学(SKMD),这是一种增强采样方法,利用相互作用粒子动力学获取信息性训练配置,用于MLIP的主动学习和微调。SKMD是Stein变分梯度下降的一种随机变体,通过引入异步粒子更新和全局原子描述符的核函数,为分子动力学进行了适配,从而提供了对称性感知的构型相似性度量。与分子动力学中使用的其他增强采样器不同,SKMD保留了玻尔兹曼分布作为动力学的渐近分布。这一特性在探索多样构型与吸引到高概率区域之间取得了平衡。我们进一步提出了一种高效在线数据获取方法,使用自适应停止准则在模拟过程中选择非冗余训练数据。我们展示了SKMD在Müller-Brown势的神经网络模型主动学习以及丙氨酸二肽的MACE原子间势微调中的应用。与主动学习基线相比,我们的方法在相同数量的训练样本下,以更少的训练迭代次数实现了更高的模型精度。

英文摘要

Machine learning interatomic potentials (MLIPs) enable efficient and accurate atomistic simulations but depend critically on the quality and diversity of the training data. We introduce Stein kernelized molecular dynamics (SKMD), an enhanced sampling method that uses interacting particle dynamics to acquire informative training configurations for the active learning and fine-tuning of MLIPs. SKMD corresponds to a stochastic variant of Stein variational gradient descent that is adapted for molecular dynamics by incorporating asynchronous particle updates and a kernel of global atomic descriptors, which provides a symmetry-aware measure of configurational similarity. Unlike other enhanced samplers used in molecular dynamics, SKMD preserves the Boltzmann distribution as the asymptotic distribution of the dynamics. This property enforces a balance between the exploration of diverse configurations and attraction toward high-probability regions of the energy landscape. We further propose an approach to efficient online data acquisition using an adaptive stopping criterion that selects non-redundant training data over the course of simulation. We demonstrate SKMD for the active learning of a neural network model of the Müller-Brown potential and the fine-tuning of a MACE interatomic potential for alanine dipeptide. Compared to active learning baselines, our method achieves higher model accuracy in fewer training iterations with the same number of acquired training samples.
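For readers unfamiliar with the underlying update, the following is a sketch of plain, deterministic Stein variational gradient descent in one dimension. It is not SKMD: the stochastic term, the asynchronous particle updates, and the kernel over global atomic descriptors described above are all omitted. The RBF kernel, bandwidth, step size, and Gaussian target are assumptions chosen to keep the example small.

```python
import math


def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One synchronous SVGD update with an RBF kernel (1D particles)."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            attraction = k * grad_log_p(xj)             # pulls toward high probability
            repulsion = k * (xi - xj) / bandwidth ** 2  # gradient of k w.r.t. xj, spreads particles
            phi += attraction + repulsion
        updated.append(xi + step * phi / n)
    return updated


def standard_normal_score(x):
    """Gradient of log density for a standard normal target (a toy energy landscape)."""
    return -x


initial = [3.0 + 0.1 * i for i in range(10)]
particles = list(initial)
for _ in range(300):
    particles = svgd_step(particles, standard_normal_score)

initial_mean = sum(initial) / len(initial)
final_mean = sum(particles) / len(particles)
```

The attraction term moves the ensemble toward high-probability regions while the repulsion term keeps particles from collapsing onto one another, which is the balance between exploration and attraction that the abstract refers to.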

2606.04092 2026-06-04 cs.CV cs.LG 版本更新

Optimal Transport Flow Matching by Design

通过设计实现最优传输流匹配

Shimon Malnick, Matan Rusanovsky, Ohad Fried, Shai Avidan

发表机构 * Tel Aviv University(特拉维夫大学) Reichman University(里奇曼大学)

AI总结 本文通过将先验分布视为设计选择而非固定输入,利用数据与其低频投影之间的恒等耦合作为最优传输耦合,简化流匹配模型中的轨迹曲率,实现快速高质量生成。

Comments Project page: https://www.malnick.net/designing_ot_flows

详情
AI中文摘要

流匹配模型学习将样本从简单先验分布传输到复杂数据分布。当先验-数据对通过最优传输(OT)耦合时,学习到的轨迹是直线且无交叉的,从而实现快速甚至单步生成。然而,在高维空间中计算OT耦合是困难的,现有方法试图解决OT问题,但代价是持续的偏差或显著的开销。我们不求解OT耦合,而是重新表述问题。一旦将先验视为设计选择而非固定输入,先验与数据之间的OT耦合就不再唯一。许多先验允许与数据之间存在OT最优的恒等耦合,因此我们可以自由选择一个易于采样的先验。我们将自然图像的低频投影确定为这样的选择。数据与其低频表示之间的恒等耦合在经验上是OT最优的,先验的结构足够丰富,可以在推理时由轻量级模型采样,而剩余的流匹配任务简化为合成高频细节。用高斯噪声插值先验进一步提高了生成质量,同时保留了OT耦合。该方法无需对流模型本身进行修改,并且自然地与潜在空间模型、无分类器引导和单步生成框架集成。在所有基准测试中,与现有流匹配方法相比,我们的方法将轨迹曲率降低了2倍以上,从而在少步数情况下实现了更好的生成质量。

英文摘要

Flow matching models learn to transport samples from a simple prior distribution to a complex data distribution. When prior-data pairs are coupled via optimal transport (OT), the learned trajectories are straight and non-crossing, enabling fast, even single-step, generation. However, computing the OT coupling in high dimensions is intractable, and existing methods attempt to solve the OT problem, at the cost of persistent bias or significant overhead. Rather than solving for the OT coupling, we reformulate the problem. Once the prior is treated as a design choice rather than a fixed input, the OT coupling between prior and data is no longer unique. Many priors admit an OT-optimal identity coupling to the data, leaving us free to choose one that is also tractable to sample. We identify low-frequency projection of natural images as such a choice. The identity coupling between data and its low-frequency representation is empirically OT-optimal, the prior is structured enough to be sampled by a lightweight model at inference, and the remaining flow-matching task reduces to synthesizing high-frequency detail. Interpolating the prior with Gaussian noise further improves generation quality while preserving the OT coupling. The approach requires no modifications to the flow model itself, and integrates naturally with latent-space models, classifier-free guidance, and one-step generation frameworks. Across all benchmarks, our method reduces trajectory curvature by more than $2\times$ compared to existing flow matching methods, yielding better generation quality in the few-step regime.

2606.04075 2026-06-04 cs.LG cs.AI cs.CL cs.CR cs.CY 版本更新

Large Language Models Hack Rewards, and Society

大型语言模型攻击奖励机制与社会

Wei Liu, Xinyi Mou, Hanqi Yan, Zhongyu Wei, Yulan He

发表机构 * King’s College London(伦敦大学国王学院) Fudan University(复旦大学) The Alan Turing Institute(艾伦·图灵研究所)

AI总结 研究强化学习训练中大型语言模型利用奖励函数漏洞的“社会攻击”现象,通过SocioHack沙盒实验发现模型能发现并利用社会规则漏洞,且现有安全措施效果有限。

Comments 14 pages, 9 figures, 7 tables

详情
AI中文摘要

强化学习已成为一种主导的后训练范式,使大型语言模型能够从奖励中学习。我们观察到社会规则在结构上与奖励函数相似。它们定义了可衡量的结果、阈值和例外情况,同时往往仅部分指定了制度意图。我们假设强化学习训练过程可能利用这些漏洞,因此提出模型在强化学习期间攻击奖励函数的已知倾向是否可能扩展为一种更严重的失败模式,即社会攻击:发现社会运行规则中的漏洞。为了研究这一现象,我们引入了SocioHack,一个包含72个社会环境的沙盒,并发现这些环境中奖励攻击自然出现并导致监管漏洞的发现。模型学会攻击社会规则并生成技术上合规但违背监管意图的策略,而当前的大型语言模型安全措施仅提供有限的缓解。因此,收集真实世界反馈用于模型训练需要更加谨慎,我们需要下一代后训练范式来安全地在真实社会中迭代大型语言模型。

英文摘要

Reinforcement learning (RL) has become a dominant post-training paradigm, enabling large language models (LLMs) to learn from rewards. We observe that societal regulations are structurally similar to reward functions. They define measurable outcomes, thresholds, and exceptions, while often leaving institutional intent only partially specified. We hypothesise that the RL training process may exploit these gaps and therefore ask whether models' well-known tendency to hack reward functions during RL can scale into a more consequential failure mode named societal hacking: discovering loopholes in the rules society runs on. To study this phenomenon, we introduce SocioHack, a sandbox of 72 societal environments, and find that within these environments, reward hacking naturally emerges and leads to regulatory loophole discovery. Models learn to hack the social rules and generate strategies that remain technically compliant while defeating regulatory intent, and current LLM safeguards provide only limited mitigation. Therefore, collecting in-the-wild feedback for model training requires greater caution, and we need a next-generation post-training paradigm for safely iterating LLMs in real society.

2606.04074 2026-06-04 cs.LG cs.AI cs.IT math.IT 版本更新

Adaptive Patching Is Harder Than It Looks For Time-Series Forecasting

自适应分块在时间序列预测中比看起来更难

Federico Zucchi, Yi Xie, Chao Zhang, Keyuan Luo, Thomas Lampert, Ziyue Li

发表机构 * ICube, University of Strasbourg, Illkirch-Graffenstaden, France(斯特拉斯堡大学ICube研究所,法国伊尔基希-格拉芬斯塔登) Technical University of Munich(慕尼黑工业大学) FinTech Thrust, The Hong Kong University of Science and Technology (Guangzhou)(香港科技大学(广州)金融科技学域) Computer Science Department, Hainan Bielefeld University of Applied Sciences(海南比勒费尔德应用科学大学计算机科学系) Cephalgo, Strasbourg, France(法国斯特拉斯堡Cephalgo公司) Heilbronn Data Science Center, Munich Data Science Institute(慕尼黑数据科学研究所海尔布隆数据科学中心)

AI总结 本文通过理论分析和实验验证,探讨自适应分块在时间序列Transformer中是否优于调优的均匀分块,发现均匀基线在标准基准上具有竞争力,自适应分块的优势有限且依赖于特定方法和数据集。

详情
AI中文摘要

自适应分块是时间序列Transformer最近提出的一个引人注目的方案:在序列局部信息丰富的区域分配更细的分块。本文探究在什么条件下内容自适应分块算子应优于调优的均匀算子。局部异质性本身并不足够:在逐点预测损失下,一个看似复杂的区域并不自动意味着更细的分块会减少损失。我们将分块建模为有预算的比特率分配,并推导出一个显式阈值,动态分块规则必须满足该阈值才能击败调优的均匀基线,然后从局部(二次代理)和全局(模型假设下的强凸界)两方面界定了可实现的改进。由此得出两个结构性结果:在没有耦合约束的情况下,标量局部复杂度无法在常见损失景观下产生非均匀最优;一旦骨干网络训练到其表示感知最优,对齐增益会在调优的均匀分块大小附近崩溃。为了验证这些预测,我们在三种代表性架构上进行了受控隔离研究,用均匀分块大小扫描替换每个自适应机制,同时保持骨干网络、数据和训练协议不变。在标准的长时域预测基准上,验证选择的均匀基线与动态对应物具有竞争力,每个设置的效果集中在零附近,且按数据集汇总后没有一致的方向性优势。我们观察到的较大增益是方法和数据集特定的。因此,自适应分块应针对调优的均匀基线进行评估;其价值取决于是否有一个廉价且可靠的路由信号能够识别出更细的分块实际上在何处减少预测损失。

英文摘要

Adaptive patching is a recent and compelling proposal for time-series Transformers: allocate finer patches where the sequence looks locally informative. This paper asks under what conditions a content-adaptive patching operator should outperform a tuned uniform one. Local heterogeneity alone is not enough: under pointwise forecasting losses, a complex-looking region is not automatically one where finer patching reduces the loss. We model patching as a budgeted bitrate allocation and derive an explicit threshold that a dynamic patching rule must satisfy to beat a well-tuned uniform baseline, then bound the achievable improvement both locally (a quadratic surrogate) and globally (a strong-convexity bound under the model's assumptions). Two structural results follow: without a coupling constraint, scalar local complexity cannot produce a non-uniform optimum under a common loss landscape; and once the backbone is trained to its representation-aware optimum, the alignment gain collapses around a well-tuned uniform patch size. To test these predictions, we run a controlled isolation study on three representative architectures, replacing each adaptive mechanism with a uniform patch-size sweep while keeping the backbone, data, and training protocol fixed. On standard long-horizon forecasting benchmarks, the validation-selected uniform baseline is competitive with the dynamic counterpart, with per-setting effects concentrated near zero and no consistent directional advantage once results are aggregated by dataset. The larger gains we do observe are method- and dataset-specific. Adaptive patching should therefore be evaluated against a tuned uniform baseline; its value depends on whether a cheap and reliable routing signal can identify where finer patches actually reduce forecasting loss.

2606.04073 2026-06-04 cs.LG cs.AI stat.ML 版本更新

TPA-AD: A Two-Stage Pseudo Anomaly-Guided Method for Bearing Time-Series Anomaly Detection

TPA-AD: 一种用于轴承时间序列异常检测的两阶段伪异常引导方法

Xiancheng Wang, Zhibo Zhang, Ran Li, Rui Wang, Minghang Zhao, Shisheng Zhong, Lin Wang

发表机构 * CQSF.com(重庆师范大学) Huadian University(哈尔滨理工大学)

AI总结 提出一种两阶段伪异常引导方法TPA-AD,通过重构模型和特征误差控制生成边界伪异常窗口,结合对比学习与KNN实现无监督轴承时间序列异常检测,在轴承故障和退化数据集上表现稳定且具泛化性。

详情
AI中文摘要

本文提出了一种两阶段伪异常引导的异常检测方法(TPA-AD),用于在仅正常样本可用的训练设置下进行轴箱轴承时间序列异常检测(TSAD)。该方法首先利用重构模型和每特征目标误差控制在正常边界附近生成伪异常窗口,然后通过正常窗口与伪异常窗口之间的对比学习学习异常敏感表示,最后使用k近邻(KNN)生成窗口级和点级异常分数。与依赖已知故障类别、真实异常先验或随机异常注入的现有方法相比,TPA-AD通过在边界邻域构建伪异常提高了正常边界的可分离性,并能联合处理混合变量场景中的连续和离散特征。主要实验在轴承故障检测数据集和退化过程数据集上进行,并在13个公共TSAD数据集上进行了额外的探索性扩展。结果表明,所提方法产生相对稳定的异常响应,对退化演化敏感,并在公共TSAD基准和真实高速列车相关轴承数据上表现出一定程度的更广泛适用性。

英文摘要

This paper proposes a Two-stage Pseudo Anomaly-guided Anomaly Detection method (TPA-AD) for axle-box bearing time-series anomaly detection (TSAD) in the setting where only normal samples are available for training. The method first generates pseudo-anomalous windows near the normal boundary using a reconstruction model and per-feature target-error control. It then learns anomaly-sensitive representations through contrastive learning between normal and pseudo-anomalous windows, and finally produces window-level and point-level anomaly scores using k-nearest neighbors (KNN). Compared with existing methods that rely on known fault categories, real anomaly priors, or random anomaly injection, TPA-AD improves the separability of the normal boundary by constructing pseudo-anomalies in boundary neighborhoods and can jointly handle continuous and discrete features in mixed-variable scenarios. The main experiments are conducted on bearing fault detection datasets and degradation-process datasets, with an additional exploratory extension on 13 public TSAD datasets. The results show that the proposed method yields relatively stable anomaly responses, is sensitive to degradation evolution, and demonstrates a certain degree of broader applicability on public TSAD benchmarks and real high-speed-train-related bearing data.
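The final scoring stage can be illustrated with a generic KNN distance score over embeddings of normal windows. The abstract does not specify how neighbor distances are aggregated, so averaging the k smallest Euclidean distances is an assumption, as are the function name `knn_anomaly_score` and the two-dimensional toy embeddings.

```python
import math


def knn_anomaly_score(query, normal_bank, k=3):
    """Mean Euclidean distance from the query to its k nearest normal embeddings."""
    distances = sorted(math.dist(query, reference) for reference in normal_bank)
    return sum(distances[:k]) / k


# Toy embeddings (hypothetical): normal windows cluster on a small grid near the origin.
normal_bank = [(0.1 * i, 0.1 * j) for i in range(3) for j in range(3)]

score_normal = knn_anomaly_score((0.1, 0.1), normal_bank)
score_anomalous = knn_anomaly_score((5.0, 5.0), normal_bank)
```

A window whose embedding lies inside the normal cluster receives a score near zero, while one far from every normal embedding receives a large score. A threshold on this score then yields the window-level decision.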

2606.04072 2026-06-04 cs.RO cs.DC cs.LG cs.SY eess.SY 版本更新

CADET: A Modular Platform for Evaluating Distributed Cooperative Autonomy in Connected Autonomous Vehicles

CADET:用于评估网联自动驾驶车辆中分布式协作自主性的模块化平台

Pragya Sharma, Brian Wang, Mani Srivastava

发表机构 * UCLA Amazon Scholar(亚马逊学者)

AI总结 提出CADET模块化平台,通过解耦自动驾驶堆栈并集成网络与工作负载仿真,系统评估分布式协作自主系统在真实部署条件下的安全性与性能。

详情
Journal ref
ICRA 2026
AI中文摘要

深度学习模型日益成为自动驾驶汽车(AV)管道的核心,然而其集成传统上遵循单一设计,即感知、规划和控制在同一车载计算机上执行。这种设计忽视了协作自主的新兴范式,即车辆通过车联网(V2X)连接与路侧单元(RSU)、边缘服务器和云托管智能进行交互。协作感知和控制提高了安全性和效率,但也引入了系统级挑战:网络延迟、计算异构性和多租户争用,所有这些都严重影响实时决策。这些挑战因对大型基础模型的日益依赖而进一步放大,这些模型的规模需要云部署。我们提出CADET(通过分布式实验工具包实现协作自主),这是一个模块化平台,用于在真实部署条件下对分布式协作自主系统进行系统化和可重复的评估。CADET将自动驾驶堆栈解耦为可组合的模块,这些模块可以灵活地部署在车辆、基础设施和边缘/云层级上。该框架集成了最先进的模型,引入了基于轨迹的网络和工作负载仿真,并提供了同步的模型级、系统级和任务级检测。通过V2V和V2I实验,我们表明分布式部署选择从根本上影响安全性,其中V2V意图数据包优于基于云的感知,而RSU辅助感知在过载并发请求之前维持安全性。尽管专为自动驾驶管道设计,CADET也支持数据集驱动的实验,使系统和机器学习研究人员能够独立于完整的车辆仿真来基准测试分布式推理工作负载。CADET是开源的,代码和演示可在https://nesl.github.io/cadet-web获取。

英文摘要

Deep learning models are increasingly central to autonomous vehicle (AV) pipelines, yet their integration has traditionally followed a monolithic design where perception, planning, and control execute on a single onboard computer. This design overlooks the emerging paradigm of cooperative autonomy, where vehicles interact with roadside units (RSUs), edge servers, and cloud-hosted intelligence through vehicle-to-everything (V2X) connectivity. Cooperative perception and control improve safety and efficiency, but also introduce systems-level challenges: network latency, compute heterogeneity, and multi-tenant contention, all of which critically affect real-time decision-making. These challenges are further amplified by the increasing reliance on large foundation models, whose scale necessitates cloud deployment. We present CADET (Cooperative Autonomy through Distributed Experimentation Toolkit), a modular platform for systematic and reproducible evaluation of distributed cooperative autonomy systems under realistic deployment conditions. CADET decouples the AV stack into composable modules that can be flexibly deployed across vehicles, infrastructure, and edge/cloud tiers. The framework integrates state-of-the-art models, incorporates trace-driven network and workload emulation, and provides synchronized model-, system-, and task-level instrumentation. Through V2V and V2I experiments, we show that distributed deployment choices fundamentally shape safety, with V2V intent packets outperforming cloud-based perception and RSU-assisted perception sustaining safety until overloaded by concurrent requests. Although designed for AV pipelines, CADET also supports dataset-driven experimentation, enabling systems and ML researchers to benchmark distributed inference workloads independently of full vehicle simulation. CADET is open source, with code and demo available at https://nesl.github.io/cadet-web.

2606.04071 2026-06-04 cs.CR cs.CL cs.LG 版本更新

Covert Influence Between Language Models

语言模型之间的隐蔽影响

Avidan Shah, Jay Chooi, Jinghua Ou, Shi Feng

发表机构 * MATS New York University(纽约大学) Harvard University(哈佛大学) George Washington University(乔治华盛顿大学)

AI总结 本文研究语言模型间通过微调、蒸馏和上下文学习三种接口实现隐蔽影响的风险,并提出使用逐点归因分数选择载体以放大训练时影响,发现数字载体相比自然语言载体更难被人类检测且跨模型家族的迁移性更差。

详情
AI中文摘要

随着语言模型越来越多地消费彼此的输出,隐蔽影响——即发送者的载荷(其被条件化传播的行为倾向)通过人类无法检测的载体转移到接收者的现象——成为一种日益增长的风险。我们通过三种接口(监督微调、在线策略蒸馏和上下文学习)刻画了这一风险,并发现它们在实现不留下人类可见痕迹的影响规模上有所不同。利用推理时逐样本归因分数,我们研究了所有三种接口下的隐蔽影响,并具备选择能够放大训练时影响的载体的能力,解锁了先前工作无法实现的载荷转移。我们进一步提供证据表明,使用自然语言载体的隐蔽影响与先前使用数字载体的研究是不同的现象,因为后者更难以被人类检测且跨模型家族的迁移性更差。这些结果共同表明,隐蔽影响的风险面比先前认识到的更广,我们研究了逐点归因评分方法作为调查和缓解该风险的工具。

英文摘要

As language models increasingly consume one another's outputs, covert influence -- a phenomenon where a sender's payload (the behavioral disposition it is conditioned to propagate) transfers to a receiver through carriers undetectable by humans -- becomes a growing risk. We characterize this risk across three interfaces: supervised fine-tuning, on-policy distillation, and in-context learning, and find that they vary in the scale of influence achievable without leaving behind human-visible traces. Using inference-time per-sample attribution scores, we study covert influence across all three interfaces with the ability to select carriers that amplify training-time influence, unlocking payload transfers that prior work could not achieve. We further provide evidence that covert influence with natural-language carriers is a distinct phenomenon from prior studies using number carriers, as the latter is more resistant to human detection and less portable across model families. Together, these results suggest that the risk surface for covert influence is broader than previously recognized, and we study pointwise attribution scoring methods as a tool to investigate and mitigate it.

2606.04069 2026-06-04 cs.CR cs.LG 版本更新

Bayesian Membership Privacy for Graph Neural Networks

图神经网络的贝叶斯成员隐私

Sinan Yıldırım, Megha Khosla

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 针对图神经网络中结构相关性和随机训练图采样导致的成员推断问题,提出贝叶斯成员隐私(BMP)框架,通过贝叶斯假设检验量化节点级成员隐私,并设计采样感知审计机制以评估隐私泄露。

详情
AI中文摘要

现有的图神经网络(GNN)隐私分析很大程度上继承了非图设置中的假设,忽略了结构相关性和随机训练图采样。特别是,节点相关的先验使得仅凭第一类和第二类错误不足以刻画最优的成员推断测试。为了解决这个问题,我们引入了贝叶斯成员隐私(BMP),这是一种采样感知的节点级成员隐私公式,它结合了节点相关的先验,并将图采样概率视为对手知识的一部分。BMP将成员推断视为贝叶斯假设检验,并据此以后验成员概率来量化成员隐私。我们探讨了BMP与文献中现有定义相关的理论性质。我们进一步提出了一种实用的、采样感知的审计机制,用于估计BMP的参数,作为GNN中节点级隐私泄露的度量。我们在基准图数据集上进行了实验,结果表明BMP提供了细粒度的隐私洞察,而这些洞察仅通过全局攻击准确率是无法看到的。

英文摘要

Existing privacy analyses for Graph Neural Networks (GNNs) largely inherit assumptions from non-graph settings, overlooking structural correlations and stochastic training-graph sampling. In particular, node-dependent priors make type-I and type-II errors alone insufficient to characterize the best membership inference test. To address this, we introduce Bayesian Membership Privacy (BMP), a sampling-aware formulation of node-level membership privacy that incorporates node-dependent priors and treats graph sampling probabilities as part of the adversary's knowledge. BMP casts membership inference as a Bayesian hypothesis test and accordingly quantifies membership privacy in terms of posterior membership probability. We explore theoretical properties of BMP in relation to the existing definitions in the literature. We further propose a practical, sampling-aware auditing mechanism to estimate the parameters of BMP as a measure of node-level privacy leakage in GNNs. We conduct experiments on benchmark graph datasets and show that BMP yields fine-grained privacy insights that are not visible through global attack accuracy alone.
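The role of node-dependent priors can be seen from Bayes' rule alone. The sketch below computes a posterior membership probability from a prior and two likelihoods. The function name and the numeric values are hypothetical, and the sketch does not reproduce the paper's auditing mechanism or its estimation of the BMP parameters.

```python
def posterior_membership(prior, likelihood_member, likelihood_nonmember):
    """P(member | observation) from a node-specific prior and the two likelihoods."""
    numerator = prior * likelihood_member
    evidence = numerator + (1.0 - prior) * likelihood_nonmember
    return numerator / evidence


# Same attack evidence (likelihood ratio 4:1), different sampling-induced priors.
frequently_sampled_node = posterior_membership(0.5, 0.8, 0.2)
rarely_sampled_node = posterior_membership(0.1, 0.8, 0.2)
```

With identical evidence, the frequently sampled node ends up with a posterior of 0.8 while the rarely sampled node stays near 0.31. Error rates of the test alone therefore do not determine what an adversary learns about a particular node.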

2606.04066 2026-06-04 q-bio.NC cs.LG 版本更新

SC-TauPath: A Structural Connectivity Attribution Framework for Mapping Tau Propagation Pathways in Alzheimer's Disease

SC-TauPath:一种用于映射阿尔茨海默病中tau蛋白传播路径的结构连接归因框架

Jing Zhang, Norman Scheel, Minheng Chen, Tong Chen, Yanjun Lyu, David C. Zhu, Rong Zhang, Dajiang Zhu

发表机构 * University of Texas at Arlington(德克萨斯大学阿灵顿分校) Michigan State University(密歇根州立大学) University of Texas Southwestern Medical Center(德克萨斯大学西南医学中心)

AI总结 提出SC-TauPath框架,结合网络扩散模型增强的多层感知机和梯度×输入归因方法,从体内神经影像数据中映射tau蛋白传播路径,并验证了与Braak分期解剖学的一致性。

详情
AI中文摘要

理解结构连接如何与阿尔茨海默病(AD)中的tau蛋白传播相关联仍然是一个核心未解问题,然而现有的计算模型要么严重依赖生物物理假设,要么缺乏神经生物学可解释的路径图。我们提出了SC-TauPath,一个结构连接(SC)归因框架,用于从体内神经影像数据中映射tau蛋白传播路径。SC-TauPath将网络扩散模型(NDM)增强的多层感知机与梯度×输入归因相结合,以评分每个SC边对tau预测的贡献,然后将这些归因分数转化为多尺度路径图(骨干边、高流量路径和枢纽ROI),这验证了已建立的Braak分期解剖学。应用于234名ADNI参与者,这些参与者具有配对的DTI SC和18F-Flortaucipir PET数据,SC-TauPath实现了强交叉验证的tau预测,并产生了与已建立的Braak分期解剖学一致的基于归因的路径图,表明SC编码了AD中区域tau分布的特定空间信息。

英文摘要

Understanding how structural connections are associated with tau propagation in Alzheimer's disease (AD) remains a central open question, yet existing computational models either rely heavily on biophysical assumptions or lack neurobiologically interpretable pathway maps. We present SC-TauPath, a structural connectivity (SC) attribution framework that maps tau propagation pathways from in vivo neuroimaging data. SC-TauPath combines a Network Diffusion Model (NDM)-augmented multilayer perceptron with gradient × input attribution to score each SC edge's contribution to tau prediction, then translates these attribution scores into multi-scale pathway maps (backbone edges, high-traffic routes, and hub ROIs), which validates established Braak staging anatomy. Applied to 234 ADNI participants with paired DTI SC and 18F-Flortaucipir PET, SC-TauPath achieves strong cross-validated tau prediction and yields attribution-based pathway maps consistent with established Braak staging anatomy, demonstrating that SC encodes spatially specific information about regional tau distribution in AD.
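The network diffusion component can be illustrated with the standard graph-Laplacian form dx/dt = -βLx, integrated here with forward Euler steps on a three-region toy graph. The graph, seed vector, rate β, and step size are hypothetical, and the multilayer perceptron and the gradient × input attribution stage described above are not modelled.

```python
def laplacian(adjacency):
    """Graph Laplacian L = D - A for a symmetric adjacency matrix with zero diagonal."""
    n = len(adjacency)
    return [
        [(sum(adjacency[i]) if i == j else 0.0) - adjacency[i][j] for j in range(n)]
        for i in range(n)
    ]


def diffuse(seed, adjacency, beta=0.5, dt=0.01, steps=100):
    """Forward-Euler integration of dx/dt = -beta * L x."""
    lap = laplacian(adjacency)
    x = list(seed)
    n = len(x)
    for _ in range(steps):
        flow = [sum(lap[i][j] * x[j] for j in range(n)) for i in range(n)]
        x = [xi - beta * dt * fi for xi, fi in zip(x, flow)]
    return x


# Three regions in a chain: region 0 connects to 1, and region 1 connects to 2.
path_graph = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
seed = [1.0, 0.0, 0.0]  # all pathology starts in region 0
spread = diffuse(seed, path_graph)
```

Pathology leaves the seed region along existing connections only, so the directly connected region accumulates signal before the distant one, and the total is conserved because each column of the Laplacian sums to zero.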

2606.04065 2026-06-04 stat.ML cs.LG math.ST stat.TH 版本更新

Finite-Iteration Local Dynamics and Warm Starts for Alternating Power Iteration in Spiked Tensor PCA

尖峰张量PCA中交替幂迭代的有限迭代局部动力学与热启动

Yanjin Xiang, Zhihua Zhang

发表机构 * Peking University(北京大学)

AI总结 研究固定阶非对称秩一张量模型中同步交替幂迭代的有限迭代局部理论,提出与初始化无关的误差分解和热启动机制。

Comments 67 pages, 0 figures. The paper studies local dynamics and warm-start analysis for alternating power iteration in spiked tensor PCA

详情
AI中文摘要

我们研究了固定阶非对称秩一张量模型中的同步交替幂迭代。主要贡献是一个与任何特定初始化无关的有限迭代局部理论。一旦迭代进入种植秩一方向的足够小邻域,其误差分解为几何衰减的瞬态部分和由种植点处固定正交噪声收缩引起的内在噪声基底。确定性有限样本条件被明确陈述,但在粗粒度的固定阶多线性噪声事件下,它们简化为固定或缓慢扩展局部半径的保守高信号区域。然后,我们将热启动机制与任何特定谱构造分离。一个通用的单扫描原理表明,如果符号兼容的初始器具有相关性γ_N,第一扫描噪声水平a_N,且a_N/(γ_N^{d-1}ω_{N,d})→0,则可以选择一个扩展半径r_N=o(ω_{N,d}),使得第一扫描进入局部盆地。进入后,局部仿射收缩导致收敛到该盆地中唯一的信息性局部不动点。对于中心Gram初始化,我们通过信号保持的仅噪声留一比较和平均留一片收缩估计(称为压回估计),在独立同分布有限四阶矩噪声下验证了所需的相关性和同一样本第一扫描噪声界。留一比较保持尖峰固定并对删除坐标取平均,因此种植坐标通过ℓ₂加权和而非最坏情况非相干界进入。

英文摘要

We study simultaneous alternating power iteration for fixed-order asymmetric rank-one spiked tensor models. Our main contribution is a finite-iteration local theory that is independent of any particular initialization. Once the iterates enter a sufficiently small neighborhood of the planted rank-one direction, their error decomposes into a geometrically decaying transient and an intrinsic noise floor caused by fixed orthogonal noise contractions at the planted point. The deterministic finite-sample conditions are stated explicitly, but under a coarse fixed-order multilinear noise event they reduce to a conservative high-signal regime for fixed or slowly expanding local radii. We then separate the warm-start mechanism from any specific spectral construction. A generic one-sweep principle shows that, if a sign-compatible initializer has correlation \(γ_N\), first-sweep noise level \(a_N\), and \(a_N/(γ_N^{d-1}ω_{N,d})\to0\), then one can choose an expanding radius \(r_N=o(ω_{N,d})\) for which the first sweep enters the local basin. After entry, the local affine contraction yields convergence to the unique informative local fixed point in that basin. For centered-Gram initialization, we verify the required correlation and same-sample first-sweep noise bound under i.i.d. finite-fourth-moment noise by a signal-preserving noise-only leave-one comparison and an averaged leave-one slice-contraction estimate, which we call a pressed-back estimate. The leave-one comparison keeps the spike fixed and averages over the deleted coordinate, so planted coordinates enter through \(\ell_2\)-weighted sums rather than worst-case incoherence bounds.
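
As a rough illustration of the iteration studied here, the sketch below runs simultaneous alternating power iteration on a small order-3 spiked tensor from a warm start. The problem size, signal strength, noise scaling, and the way the warm start is built are assumptions made for this example; they are not the paper's setting or its centered-Gram initializer.

```python
import math
import random

def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def contract(T, a, b, mode):
    """Contract an order-3 tensor with vectors on the two modes other than `mode`."""
    n = len(T)
    out = [0.0] * n
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if mode == 0:
                    out[i] += T[i][j][k] * a[j] * b[k]
                elif mode == 1:
                    out[j] += T[i][j][k] * a[i] * b[k]
                else:
                    out[k] += T[i][j][k] * a[i] * b[j]
    return out

def sweep(T, u, v, w):
    """One simultaneous sweep: every factor is updated from the previous iterates."""
    return (normalize(contract(T, v, w, 0)),
            normalize(contract(T, u, w, 1)),
            normalize(contract(T, u, v, 2)))

def demo(n=6, signal=50.0, sweeps=5, seed=0):
    rng = random.Random(seed)
    rand_unit = lambda: normalize([rng.gauss(0, 1) for _ in range(n)])
    u0, v0, w0 = planted = [rand_unit() for _ in range(3)]
    # Rank-one spike plus i.i.d. Gaussian noise (assumed noise model).
    T = [[[signal * u0[i] * v0[j] * w0[k] + rng.gauss(0, 1)
           for k in range(n)] for j in range(n)] for i in range(n)]
    # Warm start: planted direction plus a half-norm random perturbation.
    u, v, w = [normalize([p + 0.5 * q for p, q in zip(vec, rand_unit())])
               for vec in planted]
    history = []
    for _ in range(sweeps):
        u, v, w = sweep(T, u, v, w)
        history.append(abs(sum(a * b for a, b in zip(u, u0))))
    return history

print(demo())  # correlation of the first factor with its planted direction, per sweep
```

In this high-signal toy case the correlation should rise quickly and then plateau below 1, which loosely mirrors the transient-plus-noise-floor picture described in the abstract.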

2606.04063 2026-06-04 cs.LG cs.AI 版本更新

LLM Compression with Jointly Optimizing Architectural and Quantization choices

联合优化架构与量化选择的大语言模型压缩

Hoang-Loc La, Truong-Thanh Le, Amir Taherkordi, Phuong Hoai Ha

发表机构 * UiT The Arctic University of Norway(UiT挪威北极大学) University of Oslo, Norway(奥斯陆大学)

AI总结 提出一种可微神经架构搜索框架,联合优化大语言模型的架构配置与混合精度量化,实现更优的精度-延迟权衡。

详情
AI中文摘要

部署大型语言模型(LLM)因其巨大的内存和计算需求而具有挑战性。虽然一些方法通过从头开发小型或微型语言模型来解决这一问题,但这些方法需要大量的GPU训练。压缩预训练的LLM用于边缘设备提供了一种有吸引力的替代方案。除了剪枝和量化,神经架构搜索(NAS)能够实现有效的压缩,然而先前的NAS方法通常限制搜索空间并将架构与量化解耦。我们引入了一种可微NAS框架,该框架探索整个空间,并联合优化LLM线性层的架构配置与混合精度量化。实验表明,我们的模型在精度-延迟权衡上具有优越性:在可比精度下,我们的模型推理速度比顺序的NAS后量化基线快1.4倍,或在等效延迟下,在七个推理任务上平均精度提高高达6%。

英文摘要

Deploying large language models (LLMs) is challenging due to their significant memory and computational requirements. While some methods address this by developing small or tiny language models from scratch, these approaches demand extensive GPU training. Compressing pre-trained LLMs for edge devices offers a compelling alternative. Beyond pruning and quantization, Neural Architecture Search (NAS) enables effective compression, yet prior NAS approaches often limit the search space and decouple architecture from quantization. We introduce a differentiable NAS framework that explores the entire space and jointly optimizes architectural configurations alongside mixed-precision quantization for linear layers of LLMs. Experiments demonstrate superior accuracy-latency trade-offs: our models achieve up to 1.4x faster inference than sequential NAS-then-quantization baselines at comparable accuracy, or up to 6% higher average accuracy across seven reasoning tasks at equivalent latency.

2606.04057 2026-06-04 cs.SE cs.AI cs.LG 版本更新

The Invisible Lottery: How Subtle Cues Steer Algorithm Choice in LLM Code Generation

隐形彩票:微妙线索如何引导LLM代码生成中的算法选择

Akanksha Narula, Mofasshara Binte Rafique, Laurent Bindschaedler

发表机构 * University of Washington(华盛顿大学) Google Research(谷歌研究院)

AI总结 通过大量控制实验,发现提示中的偶然线索(如上下文词或元数据)会系统性地改变LLM在代码生成中选择的算法族分布,影响性能、安全性和可维护性,而直接命名算法是最可靠的缓解措施。

详情
AI中文摘要

大型语言模型(LLM)现在生成大量生产代码,通常用于具有多个有效算法解决方案的任务。偶然的提示线索,即任务规范之外的上下文词或元数据,可以引导模型选择哪个算法,即使所有输出都通过相同的测试。提示敏感性作为提高输出质量的工具已被广泛研究。这里,输出策略意味着在固定正确性下的算法选择。我们将算法引导定义为线索引起的算法族分布变化,并在11个任务、19种线索类型(18个通道加上一个记忆化语义与表面消融,在改变排版和标点的同时保留含义)以及15个模型配置上进行了46,535次控制实验。我们发现算法族分布存在大规模、系统性的变化(高达100个百分点),与线索语义基本一致,包括在速率限制等应用任务中。直接命名算法是我们测试的最可靠的缓解措施。因此,偶然的上下文在性能、安全性和可维护性上创造了一个“隐形彩票”。

英文摘要

Large language models (LLMs) now generate substantial production code, often for tasks with multiple valid algorithmic solutions. Incidental prompt cues, meaning contextual words or metadata outside the task specification, can steer which algorithm the model selects, even when all outputs pass the same tests. Prompt sensitivity is well studied as a tool to improve output quality. Here, output policy means algorithm choice under fixed correctness. We define algorithm steering as cue-induced shifts in algorithm-family distributions and run 46,535 controlled experiments across 11 tasks, 19 cue types (18 channels plus a memoization semantic-vs-surface ablation that preserves meaning while changing typography and punctuation), and 15 model configurations. We find large, systematic shifts in algorithm-family distributions (up to 100 pp), largely consistent with cue semantics, including in applied tasks such as rate limiting. Direct algorithm naming is the most reliable mitigation we tested. Accidental context therefore creates an "invisible lottery" over performance, security, and maintainability.

2606.04053 2026-06-04 cs.LG cs.AI 版本更新

A Goal-Set Characterization of Task Composition in the Boolean Task Algebra

布尔任务代数中任务组合的目标集刻画

Eduardo Terrés-Caballero, Herke van Hoof

发表机构 * Informatics Institute, University of Amsterdam(阿姆斯特丹大学信息学院) AMLab, University of Amsterdam(阿姆斯特丹大学AML实验室)

AI总结 本文通过目标集方法简化了布尔任务代数中的任务组合,证明了确定性MDP中最优扩展Q值函数由通用任务和空任务决定,从而减少了学习成本。

详情
AI中文摘要

布尔任务代数(BTA)通过为达到目标的任务配备布尔运算,为强化学习中的零样本任务组合提供了一个原则性框架。我们重新审视了其结构假设,并形式化了最优扩展Q值函数空间中的坍缩:在确定性MDP中,每个这样的函数完全由通用任务和空任务决定。这使得原始BTA公式中提出的对数基任务集变得冗余。基于这一观察,我们引入了一种基于目标集的组合方法,该方法对目标集执行逻辑运算,并通过从通用值函数和空值函数中选择切片来重构组合值函数。这降低了标准BTA的学习成本,并减少了BTA和技能机器的组合时间,同时保持了策略性能。在表格、视觉、函数逼近和连续控制领域的实验表明,学习额外的基任务并不会带来更好的性能。最后,我们研究了随机设置,并提供了一个反例,表明这种坍缩不一定成立,即最优组合可能需要考虑目标数量指数级的策略。代码可在 https://github.com/EduardoTerres/bta_paper 获取。

英文摘要

The Boolean Task Algebra (BTA) provides a principled framework for zero-shot task composition in reinforcement learning by equipping goal-reaching tasks with Boolean operations. We revisit its structural assumptions and formalize a collapse in the space of optimal extended Q-value functions: in deterministic MDPs, every such function is fully determined by the universal and empty tasks. This makes the logarithmic set of base tasks proposed in the original BTA formulation redundant. Building on this observation, we introduce a goal-set-based composition method that performs logical operations on goal sets and reconstructs composed value functions by selecting slices from the universal and empty value functions. This reduces learning costs for standard BTA and reduces composition time for both BTA and Skill Machines, while preserving policy performance. Experiments across tabular, visual, function-approximation, and continuous-control domains show that learning additional base tasks does not yield better performance. Finally, we study the stochastic setting and provide a counterexample showing that this collapse need not hold, that is, optimal composition may require accounting for exponentially many policies in the number of goals. Code is available at https://github.com/EduardoTerres/bta_paper.
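
One way to read the goal-set composition described above is sketched below. It assumes extended Q-values are stored as a dictionary keyed by `(state, goal, action)` and that a composed task takes the universal-task slice for goals inside its goal set and the empty-task slice otherwise. That selection rule is an interpretation of the abstract, and all values are hypothetical; the authors' repository is the reference for the actual method.

```python
def compose_q(q_universal, q_empty, goal_set):
    """Composed extended Q: universal slice for goals in the set, empty slice otherwise."""
    return {key: (q_universal[key] if key[1] in goal_set else q_empty[key])
            for key in q_universal}

def greedy_action(q, state):
    """Act greedily over all goals and actions available at `state`."""
    best = max((k for k in q if k[0] == state), key=lambda k: q[k])
    return best[2]

# Hypothetical extended Q-values for one state, two goals, two actions.
q_universal = {("s", "g1", "left"): 1.0, ("s", "g1", "right"): 0.2,
               ("s", "g2", "left"): 0.1, ("s", "g2", "right"): 0.9}
q_empty = {("s", "g1", "left"): -1.0, ("s", "g1", "right"): -0.2,
           ("s", "g2", "left"): -0.1, ("s", "g2", "right"): -1.0}

all_goals = {"g1", "g2"}
task_a, task_b = {"g1", "g2"}, {"g2"}

# Boolean operations act on goal sets, not on value functions.
conjunction = task_a & task_b      # AND
disjunction = task_a | task_b      # OR
negation_b = all_goals - task_b    # NOT

print(greedy_action(compose_q(q_universal, q_empty, conjunction), "s"))
print(greedy_action(compose_q(q_universal, q_empty, negation_b), "s"))
```

Under this reading, composing tasks costs only a set operation plus a lookup, which is consistent with the reduced composition time the abstract reports.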

2606.04051 2026-06-04 cs.LG cs.AI cs.CR 版本更新

RUBAS: Rubric-Based Reinforcement Learning for Agent Safety

RUBAS: 基于评分标准的强化学习用于智能体安全

Xian Qi Loye, Qinglin Su, Zhexin Zhang, Shiyao Cui, Qi Zhu, Fei Mi, Hongning Wang, Minlie Huang

发表机构 * The Conversational AI (CoAI) group, DCST, Tsinghua University(清华大学计算机科学与技术系(DCST)对话式人工智能(CoAI)课题组) Huawei Noah’s Ark Lab(华为诺亚方舟实验室)

AI总结 提出RUBAS框架,通过将智能体行为分解为四个维度的评分标准提供细粒度奖励,利用强化学习在保证任务完成的同时提升工具使用安全性。

详情
AI中文摘要

LLM进化为工具型智能体带来了与真实世界执行相关的新安全挑战,而非简单的文本生成。现有的对齐方法通常依赖粗略的拒绝信号或静态监督,难以在多样化的智能体风险中平衡安全性与有用的工具执行。我们提出了RUBAS,一种基于评分标准的强化学习框架用于智能体安全。RUBAS将智能体行为分解为四个维度:工具使用安全性、参数安全性、响应安全性和有用性。这些结构化的评分标准在完整的智能体轨迹上提供细粒度且可解释的奖励,使强化学习能够在保持任务完成的同时优化安全工具使用。在多个智能体安全基准和模型上的大量实验表明,RUBAS相比标准对齐基线提高了安全性,减少了基于工具的幻觉,并保持了竞争性的实用性。我们的结果表明,多维评分标准奖励为在安全关键的工具使用环境中对齐LLM智能体提供了有效的训练信号。

英文摘要

The evolution of LLMs into tool-enabled agents creates a new class of safety challenges associated with real-world execution rather than simple text generation. Existing alignment methods often rely on coarse refusal signals or static supervision, making it difficult to balance safety with useful tool execution across diverse agentic risks. We introduce RUBAS, a rubric-based reinforcement learning framework for agent safety. RUBAS decomposes agent behavior into four dimensions: tool-use safety, argument safety, response safety, and helpfulness. These structured rubrics provide fine-grained and interpretable rewards over complete agent trajectories, enabling reinforcement learning to optimize safe tool use while preserving task completion. Extensive experiments across multiple agent safety benchmarks and models show that RUBAS improves safety over standard alignment baselines, reduces tool-grounded hallucinations, and maintains competitive utility. Our results suggest that multi-dimensional rubric rewards provide an effective training signal for aligning LLM agents in safety-critical tool-use settings.

2606.04050 2026-06-04 cs.LG cs.AI 版本更新

LiftQuant: Continuous Bit-Width LLM via Dimensional Lifting and Projection

LiftQuant: 通过维度提升和投影实现连续位宽的LLM

Liulu He, XuanAng Liu, Juntao Liu, Taolue Feng, Ting Lu, Chunsheng Gan, Zhiyv Peng, Yuan Du, Huanrui Yang, Yijiang Liu, Li Du

发表机构 * Nanyang Technological University(南洋理工大学)

AI总结 提出LiftQuant框架,通过“提升-投影”机制实现准连续位宽控制,以精确适配内存预算,在70B模型上以2.4位压缩超越现有2位模型。

Comments ICML 2026 Spotlight

详情
AI中文摘要

现有的量化方法从根本上受限于刚性的整数位宽(例如2位、3位),导致存在“部署鸿沟”,即大型语言模型无法最优地适配特定的内存预算。为弥合这一鸿沟,我们引入了LiftQuant,一种新颖的框架,能够实现连续位宽控制,从而实现真正的帕累托最优部署。其核心创新是一种“提升-投影”机制,该机制通过从更高维度的“提升”空间中投影一个简单的1位格点来近似低维权重向量。关键在于,有效位宽仅由提升维度与原始维度的比率决定,这使得位宽可以准连续地调整,因为维度是一个灵活的结构参数。这种投影生成一个结构化但非均匀的码本,捕获了向量量化(VQ)的表达能力。虽然优于VQ,但LiftQuant的解码路径仅依赖于线性变换和1位均匀量化器,保持了硬件友好的特性。这种灵活性具有变革性:LiftQuant能够将70B的LLM压缩到2.4位,以精确适配24GB GPU,其性能显著超过在同一设备上部署的最先进的2位模型。我们的代码和检查点可在https://github.com/Heliulu/LiftQuant获取。

英文摘要

Existing quantization methods are fundamentally limited by rigid, integer-based bit-widths (e.g., 2-bit or 3-bit), resulting in a "deployment gap" where Large Language Models cannot be optimally fitted to specific memory budgets. To bridge this gap, we introduce LiftQuant, a novel framework that enables continuous bit-width control for true Pareto-optimal deployment. The core innovation is a "lift-then-project" mechanism, which approximates low-dimensional weight vectors by projecting a simple 1-bit lattice from a higher-dimensional "lifted" space. Crucially, the effective bit-width is determined simply by the ratio of the lifted dimension to the original dimension, which allows the bit-width to be tuned quasi-continuously because the dimension is a flexible structural parameter. This projection generates a structured yet non-uniform codebook, capturing the expressive power of Vector Quantization (VQ). While offering benefits over VQ, LiftQuant's decoding path relies solely on linear transformations and 1-bit uniform quantizers, so it remains hardware-friendly. This flexibility is transformative: LiftQuant enables a 70B LLM to be compressed to 2.4 bits to precisely fit a 24GB GPU, where its performance significantly surpasses state-of-the-art 2-bit models fitted on the same device. Our code and checkpoints are available at https://github.com/Heliulu/LiftQuant.
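
The bit-width arithmetic behind the 24GB claim can be checked with a short calculation. This sketch assumes one bit per lifted coordinate and counts weight storage only; projection matrices, scales, activations, and the KV cache are ignored, and the dimensions 12 and 5 are hypothetical values chosen to give a ratio of 2.4.

```python
def effective_bits(lifted_dim, original_dim):
    """Bits per weight when each lifted coordinate stores one bit (assumption)."""
    return lifted_dim / original_dim

def weight_memory_gb(num_params, bits_per_weight):
    """Weight storage in gigabytes (10**9 bytes), ignoring every overhead."""
    return num_params * bits_per_weight / 8 / 1e9

bits = effective_bits(12, 5)  # hypothetical dimensions giving 2.4 bits per weight
for candidate in (2.0, bits, 3.0):
    size = weight_memory_gb(70e9, candidate)
    print(f"{candidate:.1f} bits -> {size:.2f} GB, fits in 24 GB: {size <= 24}")
```

With these assumptions a 70B model needs about 17.5 GB at 2 bits, about 21 GB at 2.4 bits, and about 26.25 GB at 3 bits, so 3-bit weights overflow a 24GB device while 2-bit weights leave several gigabytes unused.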

2606.04048 2026-06-04 cs.LG cs.AI 版本更新

Unlocking Feature Learning in Gated Delta Networks at Scale

解锁大规模门控Delta网络中的特征学习

Yifeng Liu, Quanquan Gu

发表机构 * University of California Los Angeles(加州大学洛杉矶分校)

AI总结 本文通过推导门控Delta网络的缩放规则,实现了超参数(尤其是学习率)在不同模型宽度下的零样本迁移,将最大更新参数化(μP)扩展到具有结构化状态转换的线性模型。

详情
AI中文摘要

训练和扩展大型语言模型需要巨大的计算资源,这促使了高效次二次架构和原则性超参数调优方法的发展。虽然最大更新参数化($μ$P)已实现标准Transformer的零样本超参数迁移,但其在线性模型(特别是具有结构化状态转换和复杂架构的模型)中的扩展仍基本未探索。通过在前向传播、门控机制和循环状态动态中严格传播坐标大小估计,我们推导出门控Delta网络的缩放规则。语言模型预训练实验证实,我们的配置使得在AdamW和SGD下,学习率在不同模型宽度间稳定迁移,而标准参数化无法迁移,验证了我们分析的正确性和实用性。

英文摘要

Training and scaling Large Language Models demand enormous computational resources, motivating both efficient sub-quadratic architectures and principled hyperparameter tuning methods. While the Maximal Update Parametrization ($μ$P) has enabled zero-shot hyperparameter transfer for standard Transformers, its extension to linear models, particularly those with structured state transitions and complicated architectures, remains largely unexplored. By rigorously propagating coordinate-size estimates through the forward pass, gating mechanisms, and recurrent state dynamics, we derive the scaling rules for Gated Delta Networks. Experiments on language-model pre-training confirm that our configurations enable stable learning-rate transfer across model widths under both AdamW and SGD, whereas standard parametrization fails to transfer, validating the correctness and practical utility of our analysis.

2606.04046 2026-06-04 cs.CV cs.AI cs.CL cs.LG cs.RO 版本更新

Dive into the Scene: Breaking the Perceptual Bottleneck in Vision-Language Decision Making via Focus Plan Generation

深入场景:通过焦点计划生成打破视觉-语言决策中的感知瓶颈

Boyuan Xiao, Bohong Chen, Yumeng Li, Ji Feng, Yao-Xiang Ding, Kun Zhou

发表机构 * University of Science and Technology of China(中国科学技术大学) Tsinghua University(清华大学)

AI总结 提出SceneDiver方法,通过从粗到细的焦点计划生成,逐步构建场景图并分解任务,减少视觉幻觉,提升视觉-语言模型和视觉-语言-动作模型在具身决策任务中的表现。

Comments Accepted at ICML 2026

详情
AI中文摘要

在具身视觉-语言决策任务(如机器人操作和导航)中,视觉-语言模型和视觉-语言-动作模型(VLMs & VLAs)是具有不同优势的强大工具:VLMs更擅长长期规划,而VLAs更擅长反应控制。然而,它们的性能受到相同感知瓶颈的限制:由于模型无法区分任务相关对象与干扰物,导致视觉幻觉。原则上,准确识别并聚焦关键对象同时过滤无关对象是突破这一限制的关键。一个直接的解决方案是一步聚焦:直接关注重要对象。然而,这种方法被证明无效,因为有效的聚焦本质上需要深度场景理解。为此,我们提出SceneDiver,一种利用VLMs长期规划能力的从粗到细的焦点计划生成方法,首先构建整体场景图以建立初步理解,然后通过识别、理解和分析的迭代循环逐步将任务分解为更简单的子问题。为了实现反应控制,我们还设计了一个轻量级适配器,将深思熟虑的聚焦能力蒸馏到VLAs中。在标准具身AI基准上的评估证实,我们的方法显著减少了VLMs和VLAs的视觉幻觉,同时在需要快速执行的任务中保持了计算效率。我们的代码和数据发布在:https://future-item.github.io/SceneDiver。

英文摘要

In embodied vision-language decision-making tasks such as robotic manipulation and navigation, Vision-Language and Vision-Language-Action Models (VLMs & VLAs) are powerful tools with different strengths: VLMs are better at long-term planning, while VLAs are better at reactive control. However, their performance is limited by the same perceptual bottleneck: visual hallucinations arise because the models cannot distinguish task-relevant objects from distractors. In principle, accurately identifying and focusing on critical objects while filtering out irrelevant ones is the key to breaking this limitation. A straightforward solution is one-step focus: directly attending to essential objects. However, this approach proves ineffective because effective focus inherently requires deep scene understanding. To this end, we propose SceneDiver, a coarse-to-fine focus plan generation method for VLMs that leverages their long-term planning abilities. It first constructs a holistic scene graph to establish initial comprehension, then progressively decomposes the task into simpler sub-problems through an iterative cycle of recognition, understanding, and analysis. To enable reactive control, we also design a lightweight adapter for distilling the deliberate focus ability into VLAs. Evaluations on standard embodied AI benchmarks confirm that our method substantially reduces visual hallucinations for both VLMs and VLAs, while preserving computational efficiency in tasks requiring fast execution. Our code and data are released at: https://future-item.github.io/SceneDiver.

2606.04045 2026-06-04 cs.LG cs.AI 版本更新

Bayes-Sufficient Representations in Supervised Learning

监督学习中的贝叶斯充分表示

Vasileios Sevetlidis

发表机构 * Athena Research Center, Kimmeria Campus, Xanthi, Greece(雅典娜研究中心,Kimmeria校区,克桑西,希腊) Democritus University of Thrace, Vas. Sofias Campus, Xanthi, Greece(色雷斯德谟克利特大学,Vas. Sofias校区,克桑西,希腊) International Hellenic University, Serres, Greece(国际希腊大学,塞雷斯,希腊)

AI总结 本文定义了监督学习中表示对损失函数的贝叶斯充分性,引入贝叶斯商概念,并证明最小充分表示等价于贝叶斯商,通过实验区分了充分性、最小性和非必要信息保留。

详情
AI中文摘要

表示学习通常被描述为保留输入中与预测相关的信息。本文探讨了在固定监督决策问题中相关性的含义。定义了一个表示对于联合分布和损失是贝叶斯充分的,如果某个预测头可以使用它来实现贝叶斯最优行动规则。这使得目标信息依赖于损失。在几乎必然唯一的贝叶斯行动情况下,相关对象是贝叶斯商,它识别需要相同贝叶斯最优行动的输入。当表示细化这个商时,它是充分的;当它在信息上等价于商时,它是贝叶斯最小的。该框架自然地连接到属性诱导:零一损失需要贝叶斯类,平方损失需要条件均值,布里尔损失需要二元预测中的条件概率,对数损失或严格适当评分规则需要预测分布。受控的有限实验、学习的神经瓶颈实验以及真实数据的iNaturalist分类学细化实验说明了充分性、最小性和保留的非必要信息之间的区别。对于固定的监督问题,分布和损失决定贝叶斯行动,贝叶斯行动决定商,商决定贝叶斯最优预测所需的最小信息。

英文摘要

Representation learning is often described as preserving the information in an input that is relevant for prediction. This work asks what relevance means for a fixed supervised decision problem. A representation is defined to be Bayes-sufficient for a joint distribution and loss if some prediction head can use it to implement a Bayes-optimal action rule. This makes the target information loss-dependent. In the almost-surely unique Bayes-action case, the relevant object is a Bayes quotient, which identifies inputs that require the same Bayes-optimal action. A representation is sufficient when it refines this quotient, and Bayes-minimal when it is informationally equivalent to it. The framework connects naturally to property elicitation: zero-one loss requires the Bayes class, squared loss the conditional mean, Brier loss the conditional probability in binary prediction, and log loss or strictly proper scoring rules the predictive distribution. Controlled finite experiments, learned neural bottleneck experiments, and a real-data iNaturalist taxonomic refinement experiment illustrate the distinction between sufficiency, minimality, and retained non-required information. For a fixed supervised problem, the distribution and the loss determine the Bayes action, the Bayes action determines the quotient, and the quotient determines the minimal information required for Bayes-optimal prediction.
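
The loss dependence of the Bayes quotient can be made concrete on a finite example. The sketch below is illustrative only: the four inputs and their posteriors are hypothetical, and the helper names are not taken from the paper.

```python
from collections import defaultdict

# Hypothetical finite problem: P(y = 1 | x) for four inputs.
posterior = {"a": 0.9, "b": 0.6, "c": 0.4, "d": 0.1}

def bayes_action_zero_one(p):
    """Zero-one loss: predict the more likely class."""
    return 1 if p > 0.5 else 0

def bayes_action_squared(p):
    """Squared loss: predict the conditional mean E[y | x], which equals p."""
    return p

def bayes_quotient(posterior, action):
    """Group inputs that share the same Bayes-optimal action."""
    cells = defaultdict(set)
    for x, p in posterior.items():
        cells[action(p)].add(x)
    return sorted(sorted(cell) for cell in cells.values())

def is_sufficient(representation, posterior, action):
    """Sufficient if inputs sharing a code always share a Bayes action (refines the quotient)."""
    seen = {}
    for x, p in posterior.items():
        code = representation(x)
        if seen.setdefault(code, action(p)) != action(p):
            return False
    return True

sign_code = lambda x: x in ("a", "b")  # keeps only the Bayes class

print(bayes_quotient(posterior, bayes_action_zero_one))
print(bayes_quotient(posterior, bayes_action_squared))
print(is_sufficient(sign_code, posterior, bayes_action_zero_one))
print(is_sufficient(sign_code, posterior, bayes_action_squared))
```

The one-bit code is sufficient, and in fact minimal, for zero-one loss, but it discards information that squared loss requires, which is the distinction the abstract draws between losses.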

2606.04039 2026-06-04 cs.NE cs.AI cs.LG 版本更新

Beyond Static Priors: Dynamic Neural Guidance for Large-Scale Ant Colony Optimization

超越静态先验:大规模蚁群优化的动态神经引导

Dat Thanh Tran, Van Khu Vu, Yining Ma

发表机构 * Center for AI Research(人工智能研究中心) VinUniversity(Vin大学) College of Engineering and Computer Science(工程与计算机科学学院) Laboratory for Information and Decision Systems(信息与决策系统实验室) Massachusetts Institute of Technology(麻省理工学院)

AI总结 提出DyNACO框架,通过周期性观察信息素分布和当前解实现动态神经引导,结合扰动ACO后端和范围受限的细化机制,在TSP上扩展至10万节点并优于神经基线,在CVRP上以<1%神经开销持续改进无引导基线。

Comments Accepted at KDD 2026

详情
AI中文摘要

神经引导的蚁群优化(ACO)存在一个根本性的训练-推理错位:策略通常被训练来生成静态先验(例如热图),但部署时却用于引导迭代的、长视野的搜索过程。在本文中,我们提出了DyNACO,一个新颖的框架,通过周期性观察信息素分布和当前解来实现动态神经引导。为了使DyNACO在大规模上易于处理,我们将策略与基于扰动的ACO后端和范围受限的细化机制配对,共同确保有效性和稳定的信用分配。在TSP上,DyNACO扩展到10万个节点的实例,并优于神经基线,同时与无引导求解器相比通常减少总运行时间。我们通过容量感知后端将DyNACO扩展到CVRP,以不到1%的神经开销持续改进无引导基线。我们进一步提供了深入分析,验证了模型的泛化能力,并阐明了为什么动态引导优于静态先验。我们的工作强调了在学习引导优化中使神经训练与迭代搜索动态对齐的必要性。代码可在https://github.com/shoraaa/DyNACO获取。

英文摘要

Neural-guided Ant Colony Optimization (ACO) suffers from a fundamental training-inference misalignment: policies are typically trained to generate static priors (e.g., heatmaps), yet deployed to guide iterative, long-horizon search processes. In this paper, we present DyNACO, a novel framework that achieves dynamic neural guidance by periodically observing the pheromone distribution and the incumbent solution. To make DyNACO tractable at scale, we pair the policy with a perturbation-based ACO backend and a scope-restricted refinement mechanism that jointly ensure efficacy and stable credit assignment. On TSP, DyNACO scales to 100,000-node instances and outperforms neural baselines while often reducing total runtime compared to the unguided solver. We extend DyNACO to CVRP via a capacity-aware backend, consistently improving the unguided baseline with less than 1% neural overhead. We further provide in-depth analysis validating the model's generalization capabilities and elucidating why dynamic guidance outperforms static priors. Our work underscores the necessity of aligning neural training with iterative search dynamics in learning-guided optimization. The code is available at https://github.com/shoraaa/DyNACO.

2606.04036 2026-06-04 cs.LG 版本更新

Self-Distilled Policy Gradient

自蒸馏策略梯度

Yifeng Liu, Shiyuan Zhang, Yifan Zhang, Quanquan Gu

发表机构 * Department of Computer Science, University of California, Los Angeles, CA, USA(加州大学洛杉矶分校计算机科学系) Princeton AI Laboratory, Princeton University, Princeton, NJ, USA(普林斯顿大学普林斯顿AI实验室)

AI总结 提出SDPG框架,结合组相对验证器优势、归一化标准差、精确全词汇同策略自蒸馏和参考策略KL正则化,提升稀疏奖励强化学习的稳定性和性能。

详情
AI中文摘要

同策略(on-policy)自蒸馏,即语言模型基于特权上下文监督自身生成,是稀疏奖励强化学习中密集监督的有前景来源。在实践中,它可以实例化为辅助的全词汇学生到教师反向KL散度损失。因此,我们提出SDPG,一个自蒸馏策略梯度框架,它结合了带标准差归一化的组相对验证器优势、精确的全词汇同策略自蒸馏以及参考策略KL正则化。实验上,SDPG相比RLVR和自蒸馏基线提高了稳定性和性能。代码可在https://github.com/lauyikfung/SDPG获取。

英文摘要

On-policy self-distillation, where a language model conditions on privileged context to supervise its own generations, is a promising source of dense supervision for sparse-reward reinforcement learning. In practice, it can be instantiated as an auxiliary full-vocabulary student-to-teacher reverse Kullback-Leibler divergence loss. We therefore propose SDPG, a self-distilled policy-gradient framework that combines group-relative verifier advantages with normalized standard deviation, exact full-vocabulary on-policy self-distillation, and reference-policy KL regularization. Empirically, SDPG improves stability and performance over RLVR and self-distillation baselines. The code is available at https://github.com/lauyikfung/SDPG.
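
Two of the ingredients named above can be sketched in a few lines. The code assumes a GRPO-style reading of "group-relative advantages with normalized standard deviation" and takes reverse KL to mean KL(student || teacher); both are interpretations of the abstract rather than the SDPG implementation, which is in the linked repository.

```python
import math

def group_relative_advantages(rewards, eps=1e-6):
    """Centre verifier rewards within a group and divide by the group standard deviation."""
    mean = sum(rewards) / len(rewards)
    var = sum((r - mean) ** 2 for r in rewards) / len(rewards)
    std = math.sqrt(var)
    return [(r - mean) / (std + eps) for r in rewards]

def reverse_kl(student, teacher):
    """KL(student || teacher), summed over the full vocabulary."""
    return sum(p * math.log(p / q) for p, q in zip(student, teacher) if p > 0)

# Four sampled responses to one prompt, scored 0 or 1 by a verifier (hypothetical).
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))

# Next-token distributions over a three-token vocabulary (hypothetical).
student = [0.7, 0.2, 0.1]
teacher = [0.5, 0.3, 0.2]  # same model, conditioned on privileged context
print(reverse_kl(student, teacher))
```

The advantage term is sparse in the sense that it assigns one scalar per response, while the reverse-KL term provides a signal at every token position, which is why the abstract describes self-distillation as dense supervision.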

2606.04035 2026-06-04 cs.SE cs.AI cs.LG 版本更新

Unpredictable Safety: Domain-Dependent Compliance and the Transparency Gap in Open-Weight LLMs

不可预测的安全性:开放权重大语言模型中领域依赖的合规性与透明度差距

Zacharie Bugaud

发表机构 * Astera Institute(Astera研究院)

AI总结 通过7个伦理领域的标准化实验,发现开放权重大语言模型的合规率在14.7%到85.7%之间波动,且同一模型在不同领域表现高度不一致,揭示了安全机制缺乏透明度和一致性。

详情
AI中文摘要

我们对开放权重大语言模型中领域依赖的安全行为进行了系统研究:在7个伦理领域进行了7项标准化实验,测试了5个模型(12B--70B),共4200次交互,并采用双法官验证。使用双条件方法,每个场景在分析框架(识别危害)和操作框架(帮助实施危害)下进行测试,我们发现合规率从14.7%(人口贩卖)到85.7%(监控设计)不等,跨度达71个百分点,且非重叠的聚类自助法95%置信区间。可信部署需要可预测的安全行为,但我们发现合规性高度依赖于上下文:同一模型(Mistral Nemo 12B)在100%的请求中提供监控设计,但仅在26.7%的请求中协助贩卖。这种不可预测性对部署者来说是不透明的:技术框架绕过,即有害请求被重新定义为工程问题,从而覆盖安全训练,而没有任何外部信号表明拒绝阈值已改变。领域内异质性高达84.4个百分点,意味着即使在领域层面也无法预测安全行为。在通过GitHub Copilot CLI部署产品界面访问的五个前沿封闭模型(GPT-4.1/5.2, Claude Haiku/Sonnet/Opus 4.x;n=4,163个响应)上进行的复制实验,再现了相同的领域分层,绝对水平有所减弱但形状相同,其中两个低规范化领域(科学欺诈、监控)再次最为宽松。这些结果表明,当前的安全机制缺乏可信AI部署所需的透明度和一致性。

英文摘要

We present a systematic study of domain-dependent safety behavior in open-weight LLMs: 7 standardized experiments across 7 ethical domains, testing 5 models (12B--70B) in 4,200 interactions with dual-judge validation. Using a dual-condition methodology, in which each scenario is tested in both an analytical framing (identify the harm) and an operational framing (help commit the harm), we find that compliance rates vary from 14.7% (human trafficking) to 85.7% (surveillance design), a 71-percentage-point span with non-overlapping cluster-bootstrapped 95% CIs. Trustworthy deployment requires predictable safety behavior, yet we find compliance is highly context-dependent: the same model (Mistral Nemo 12B) provides surveillance designs in 100% of requests but assists with trafficking in only 26.7%. This unpredictability is opaque to deployers: in the technical framing bypass, harmful requests reframed as engineering problems override safety training without any external signal that refusal thresholds have shifted. Within-domain heterogeneity reaches 84.4 pp, meaning safety behavior cannot be predicted even at the domain level. A replication on five frontier closed models (GPT-4.1/5.2, Claude Haiku/Sonnet/Opus 4.x; n=4,163 responses) accessed via the GitHub Copilot CLI deployed-product surface reproduces the same domain stratification, attenuated in absolute level but identical in shape, with the two low-codification domains (science fraud, surveillance) again the most permissive. These results show that current safety mechanisms lack the transparency and consistency required for trustworthy AI deployment.
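
A cluster bootstrap of the kind mentioned in the abstract resamples whole groups of correlated responses rather than individual responses. The sketch below is a generic percentile version; treating each scenario as a cluster, and the 0/1 outcome data, are assumptions made for illustration and are not taken from the study.

```python
import random

def cluster_bootstrap_ci(clusters, n_boot=2000, alpha=0.05, seed=0):
    """Percentile CI for a compliance rate, resampling whole clusters with replacement."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        sample = [rng.choice(clusters) for _ in clusters]
        outcomes = [o for cluster in sample for o in cluster]
        rates.append(sum(outcomes) / len(outcomes))
    rates.sort()
    lo = rates[int((alpha / 2) * n_boot)]
    hi = rates[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# Hypothetical data: one list per scenario, 1 = model complied, 0 = model refused.
scenarios = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0], [1, 1, 0], [0, 1, 0]]
point = sum(map(sum, scenarios)) / sum(map(len, scenarios))
print(point, cluster_bootstrap_ci(scenarios))

# The reported span between the most and least permissive domains, in percentage points.
print(round(85.7 - 14.7, 1))
```

Resampling clusters instead of single responses widens the interval when responses within a scenario are correlated, so non-overlapping intervals are a stronger statement than they would be under a naive bootstrap.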

2606.04033 2026-06-04 cs.LG 版本更新

Inverse Critical Experiment Design via Gradient Optimization and a Multigroup Attention-Based Neural Network Architecture

通过梯度优化与多群注意力神经网络架构的逆临界实验设计

Will Savage, Logan Burnett, Dean Price

发表机构 * Massachusetts Institute of Technology(麻省理工学院)

AI总结 提出一种结合深度神经网络代理模型和非参数梯度优化的方法,用于逆设计临界实验几何结构以最大化中子相似性相关系数$c_k$,并在TN-LC运输容器验证中取得高$c_k$值。

详情
AI中文摘要

先进核反应堆设计和燃料概念的验证需要与目标技术具有高中子相似性的临界实验。中子相似性由相关系数$c_k$量化,该系数捕捉了核数据不确定性引起的$k_\text{eff}$共享偏差。通常,实验需要$c_k\geq0.9$才能与目标技术充分相似。本文提出了一种临界实验逆设计方法。使用深度神经网络代理模型和非参数梯度优化来生成最大化$c_k$的实验几何结构。 深度神经网络基于OpenMC计算的网格临界实验几何结构的灵敏度向量进行训练。模型架构结合了U-Net卷积编码器-解码器与新颖的多群注意力池化层,引入该层以捕捉灵敏度的不同空间依赖性。多群注意力池化在性能上优于传统池化,并具有可解释的内部行为。代理模型的可微性使得能够对全组合设计空间进行基于梯度的优化,通过直接改变几何网格中每个位置的材料分配来最大化$c_k$。 该方法应用于TN-Americas TN-LC运输容器(使用HALEU燃料)的验证,该容器现有的临界实验覆盖有限。优化过程为三个感兴趣配置生成了实验几何结构,分别达到0.97757、0.81324和0.93276的$c_k$分数。该方法展示了深度学习和梯度优化在加速先进核技术开发方面的潜力。

英文摘要

The validation of advanced nuclear reactor designs and fuel concepts requires critical experiments with high neutronic similarity to the target technology. Neutronic similarity is quantified by the correlation coefficient $c_k$, which captures the shared bias in $k_\text{eff}$ induced by uncertainties in nuclear data. Generally, a $c_k\geq0.9$ is needed for an experiment to be sufficiently similar to a target technology. This work presents a methodology for the inverse design of critical experiments. Deep neural network surrogate modeling and nonparametric gradient optimization are used to generate experiment geometries that maximize $c_k$. A deep neural network is trained on OpenMC-calculated sensitivity vectors for grid-based critical experiment geometries. The model architecture combines a U-Net convolutional encoder-decoder with a novel multigroup attention pooling layer, introduced to capture the differing spatial dependencies of sensitivities. Multigroup attention pooling is shown to achieve better performance than traditional pooling, as well as interpretable internal behavior. The differentiability of the surrogate enables gradient-based optimization of the full combinatorial design space, allowing $c_k$ to be maximized by directly changing the material assignment of each position in the geometry grid. The method is applied to the validation of the TN-Americas TN-LC transportation cask with HALEU fuel, for which existing critical experiment coverage is limited. The optimization procedure is shown to produce experiment geometries achieving $c_k$ scores of 0.97757, 0.81324, and 0.93276 for three configurations of interest. This approach demonstrates the potential of deep learning and gradient optimization to accelerate the development of advanced nuclear technology.
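
The abstract does not spell out how $c_k$ is computed. The sketch below uses the usual sandwich-rule form, $c_k = S_a^\top C S_e / \sqrt{(S_a^\top C S_a)(S_e^\top C S_e)}$, where $S_a$ and $S_e$ are the sensitivity vectors of the application and the experiment and $C$ is the nuclear-data covariance matrix. The three-entry vectors and the covariance values are hypothetical.

```python
import math

def quad(a, cov, b):
    """Bilinear form a^T C b."""
    return sum(a[i] * cov[i][j] * b[j]
               for i in range(len(a)) for j in range(len(b)))

def c_k(s_app, s_exp, cov):
    """Correlation of k_eff uncertainty shared by an application and an experiment."""
    shared = quad(s_app, cov, s_exp)
    return shared / math.sqrt(quad(s_app, cov, s_app) * quad(s_exp, cov, s_exp))

# Hypothetical sensitivities to three nuclide-reaction-group entries.
s_application = [0.30, -0.10, 0.05]
s_experiment = [0.28, -0.05, 0.10]
covariance = [[4e-4, 1e-5, 0.0],
              [1e-5, 9e-4, 0.0],
              [0.0, 0.0, 1e-4]]

value = c_k(s_application, s_experiment, covariance)
print(value, value >= 0.9)
```

Because $c_k$ is a normalized correlation, rescaling either sensitivity vector leaves it unchanged; only the pattern of sensitivities across nuclear data matters.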

2606.04031 2026-06-04 cs.LG math.OC stat.ML 版本更新

Pseudospectral Bounds for Transient Amplification in Coupled Gradient Descent

耦合梯度下降中瞬态放大的伪谱界

Ahanaf Hasan Ariq

发表机构 * Ideal School and College(理想学校和学院)

AI总结 针对耦合梯度下降中块三角雅可比矩阵的非正态性导致的瞬态放大,提出尖锐的伪谱理论,给出Kreiss常数的上界与匹配极小极大下界,并导出随机耦合下降的有限步迭代复杂度界。

Comments 11 pages, 3 tables. Accepted as poster at HiLD 2026 (4th Workshop on High-dimensional Learning Dynamics, ICML 2026)

详情
AI中文摘要

耦合梯度下降——其中一个参数块的更新依赖于另一个——是双层优化、双时间尺度随机逼近和对抗训练的基础。当耦合雅可比矩阵为块三角时,渐近稳定性由对角块的谱半径决定,但由于非正态性,收敛前的瞬态放大可能任意大。我们为这种块三角雅可比矩阵发展了尖锐的伪谱理论,证明当对角块对称且谱半径至多为γ<1时,Kreiss常数满足K(J) ≤ 2/(1-γ) + ||C||/(4(1-γ)),并建立了匹配的极小极大下界。我们刻画了谱不稳定的临界耦合阈值,并通过Neumann级数扰动框架将分析扩展到近自指系统。作为推论,我们得到了随机耦合下降的有限步迭代复杂度界O(K(J)^2 log(1/δ))。将结果表述为非平稳双时间尺度优化的标度律,我们的理论揭示了谱半径分析无法看到的非渐近、实例依赖的高维学习动力学。在线性二次问题、基于IQC的比较和神经网络训练上的实验证实了该理论。

英文摘要

Coupled gradient descent--where the update of one parameter block depends on another--underlies bilevel optimization, two-time-scale stochastic approximation, and adversarial training. When the coupled Jacobian is block-triangular, asymptotic stability is governed by the spectral radii of the diagonal blocks, yet transient amplification before convergence can be arbitrarily large due to non-normality. We develop a sharp pseudospectral theory for such block-triangular Jacobians, proving that the Kreiss constant satisfies $K(J) \leq 2/(1-γ) + \|C\|/(4(1-γ))$ when the diagonal blocks are symmetric with spectral radii at most $γ< 1$, and we establish matching minimax lower bounds. We characterize the critical coupling threshold for spectral instability and extend the analysis to nearly self-referential systems via a Neumann-series perturbation framework. As a consequence, we obtain a finite-horizon iteration-complexity bound of $O(K(J)^2 \log(1/δ))$ for stochastic coupled descent. Framed as scaling laws for non-stationary two-time-scale optimization, our results expose a non-asymptotic, instance-dependent regime of high-dimensional learning dynamics that is invisible to spectral-radius analysis. Experiments on linear-quadratic problems, IQC-based comparisons, and neural-network training confirm the theory.
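
Transient amplification under a stable spectral radius is easy to see on a 2x2 block-triangular example. The sketch below tracks the spectral norm of $J^k$ and evaluates the upper bound stated in the abstract; the values $γ = 0.9$ and $\|C\| = 5$ are arbitrary, and the code illustrates the phenomenon rather than computing the Kreiss constant itself.

```python
import math

def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def spectral_norm2(a):
    """Largest singular value of a 2x2 matrix, from the eigenvalues of A^T A."""
    t = sum(a[i][j] ** 2 for i in range(2) for j in range(2))  # trace(A^T A)
    d = (a[0][0] * a[1][1] - a[0][1] * a[1][0]) ** 2           # det(A^T A)
    return math.sqrt((t + math.sqrt(max(t * t - 4 * d, 0.0))) / 2)

def kreiss_upper_bound(gamma, c_norm):
    """Bound from the abstract: 2/(1 - gamma) + ||C|| / (4 (1 - gamma))."""
    return 2 / (1 - gamma) + c_norm / (4 * (1 - gamma))

gamma, c = 0.9, 5.0
J = [[gamma, 0.0],
     [c, gamma]]  # scalar diagonal blocks, coupling block C = [c]

power = [[1.0, 0.0], [0.0, 1.0]]
norms = []
for _ in range(200):
    power = matmul2(power, J)
    norms.append(spectral_norm2(power))

peak = max(norms)
print(f"spectral radius {gamma}, peak norm {peak:.2f} at step {norms.index(peak) + 1}")
print(f"norm after 200 steps {norms[-1]:.2e}, bound on K(J) {kreiss_upper_bound(gamma, c):.1f}")
```

Both eigenvalues equal 0.9, so the iteration converges eventually, yet the norm of $J^k$ first grows well above 1. This is the regime that spectral-radius analysis alone does not capture.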

2606.04028 2026-06-04 cs.LG 版本更新

Novel Aspects of IEEE SA P3109 Arithmetic Formats for Machine Learning

IEEE SA P3109 机器学习算术格式的新颖方面

Andrew Fitzgibbon, Christoph M. Wintersteiger, Jeffrey Sarnoff

发表机构 * Imandra, Inc.(Imandra公司)

AI总结 本文介绍 IEEE P3109 草案标准,该标准定义了一组参数化的二进制浮点格式及相关操作,旨在促进机器学习,并提出了新颖的近似度量 kappa-approximation。

详情
AI中文摘要

IEEE P3109 草案标准定义了一个参数化的二进制浮点格式族及相关操作,重点在于促进机器学习。这些格式允许以少量比特高效且一致地表示数值。定义的格式在宽度、精度(以比特计)、有符号性以及无穷大的存在性上参数化。操作通过将浮点值解码为闭扩展实数集(实数加上正负无穷大和 NaN(非数值))来定义。对 NaN 和无穷大操作数的显式处理确保了仅在操作定义中调用实数运算。定义了广泛的舍入和饱和模式;包括随机舍入。操作无异常,加速了吞吐量,异常情况通过返回值(例如 NaN)传达。对共享公共比例因子的值块的操作以统一方式基于底层操作定义。系统供应商可以通过一种新颖的尺度不变度量(类似于最后一位单位)来描述近似实现,称为 kappa-approximation。标准函数定义和各种其他属性通过形式化规范进行机械验证和生成。

英文摘要

The IEEE P3109 draft standard defines a parameterized family of binary floating-point formats and associated operations, with a focus on facilitating machine learning. These formats allow efficient and consistent representation of values in a small number of bits. The defined formats are parameterized over width and precision in bits, signedness, and the presence of infinities. Operations are defined by decoding floating-point values to the set of closed extended reals: the reals augmented with positive and negative infinity and NaN (Not a Number). Explicit treatment of NaN and infinite operands ensures that only real arithmetic is invoked in operation definitions. Extensive rounding and saturation modes are defined; stochastic rounding is included. Operations are exception-free, accelerating throughput, with exceptional situations communicated through return values, e.g., NaN. Operations on blocks of values sharing a common scale factor are defined in terms of the underlying operations in a uniform manner. System vendors may describe approximate implementations via a novel scale-invariant measure, akin to units in the last place, called kappa-approximation. Standard function definitions and various other properties are mechanically verified and generated using formal specifications.
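
Stochastic rounding and saturation, two of the modes the abstract mentions, can be illustrated generically. The sketch below rounds to a uniform grid rather than to an actual P3109 format: the grid spacing, the clamp limit, and the function names are assumptions for illustration and do not reproduce the draft standard's definitions.

```python
import math
import random

def stochastic_round(x, step, rng):
    """Round x to a multiple of `step`, rounding up with probability equal to the
    fractional distance from the lower neighbour, so the result is unbiased."""
    lower = math.floor(x / step) * step
    if lower == x:
        return x
    upper = lower + step
    p_up = (x - lower) / step
    return upper if rng.random() < p_up else lower

def saturate(x, max_value):
    """Clamp to the largest representable magnitude instead of overflowing."""
    return max(-max_value, min(max_value, x))

rng = random.Random(0)
samples = [stochastic_round(0.3, 0.25, rng) for _ in range(20000)]
print(sorted(set(samples)), sum(samples) / len(samples))
print(saturate(1000.0, 448.0))
```

Round-to-nearest would map 0.3 to 0.25 every time. Stochastic rounding returns 0.25 or 0.5 so that the average stays near 0.3, which is useful when many small updates are accumulated in a narrow format.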

2606.04021 2026-06-04 q-bio.QM cs.LG 版本更新

Structure-Aware Prediction of PROTAC-Mediated Protein Degradability via Graph Neural Networks

通过图神经网络进行PROTAC介导的蛋白质降解性的结构感知预测

Bryan Cheng, Austin Jin

发表机构 * Independent Researcher(独立研究者)

AI总结 提出DegradoMap,一种仅利用蛋白质结构和E3连接酶身份预测PROTAC降解性的图神经网络,在目标未见和E3未见评估中优于基线,并推荐最优E3连接酶。

Comments 10 pages, 5 figures, ACM-BCB 2026 Main Conference Full Paper

详情
AI中文摘要

蛋白水解靶向嵌合体(PROTACs)可以选择性降解致病蛋白,然而预测哪些靶点适合降解仍然是一个关键瓶颈:现有计算方法需要完整的PROTAC分子结构,而该信息在合成前不可用。我们提出DegradoMap,一种图神经网络,仅从蛋白质结构和E3连接酶身份预测PROTAC介导的降解性——这是靶点选择阶段可用的最小信息。该模型通过赖氨酸加权图池化(每蛋白质归一化)编码生物物理先验,通过交叉注意力建模蛋白质-E3兼容性,并整合来自癌症依赖性图谱的细胞环境。在PROTAC-8K基准(3,101个样本,155个靶点,10种E3连接酶)上,DegradoMap在靶点未见评估中达到0.646±0.124 AUROC(最佳种子:0.7449),在CRBN→VHL E3未见迁移中达到0.811 AUROC,优于GNN和机器学习基线。该模型还以74%的Hit@3准确率推荐最优E3连接酶。两个发现具有更广泛的意义:对于此标量预测任务,E(3)-等变架构的性能低于更简单的不变设计;ESM-2嵌入仅在仔细正则化下提升峰值性能——简单集成失败。DegradoMap为降解性评估提供合成前的计算指导;其良好校准的置信度分数(ECE=0.029,靶点未见)使从业者能够优先选择高置信度预测进行实验验证。然而,高种子方差(std=0.124)和有限的E3覆盖范围需要集成以实现可靠部署。

英文摘要

Proteolysis-targeting chimeras (PROTACs) can selectively degrade disease-causing proteins, yet predicting which targets are amenable to degradation remains a critical bottleneck: existing computational methods require the complete PROTAC molecular structure, information unavailable before synthesis. We present DegradoMap, a graph neural network that predicts PROTAC-mediated degradability from protein structure and E3 ligase identity alone -- the minimal information available at the target selection stage. The model encodes biophysical priors through lysine-weighted graph pooling with per-protein normalization, models protein-E3 compatibility via cross-attention, and integrates cellular context from the Cancer Dependency Map. On the PROTAC-8K benchmark (3,101 samples, 155 targets, 10 E3 ligases), DegradoMap achieves 0.646 ± 0.124 AUROC on target-unseen evaluation (best seed: 0.7449) and 0.811 AUROC on CRBN→VHL E3-unseen transfer, outperforming GNN and machine learning baselines. The model additionally recommends optimal E3 ligases with 74% Hit@3 accuracy. Two findings carry broader implications: E(3)-equivariant architectures underperform the simpler invariant design for this scalar prediction task, and ESM-2 embeddings improve peak performance only with careful regularization -- naive integration fails. DegradoMap provides pre-synthesis computational guidance for degradability assessment; its well-calibrated confidence scores (ECE = 0.029, target-unseen) enable practitioners to prioritize high-confidence predictions for experimental follow-up. However, the high seed variance (std = 0.124) and limited E3 coverage require ensembling for reliable deployment.

2606.04020 2026-06-04 q-bio.QM cs.LG 版本更新

SpliceBind: Isoform-Aware Prediction of Binding Pocket Druggability

SpliceBind: 异构体感知的结合口袋可药性预测

Bryan Cheng, Austin Jin, Joshua Chang

发表机构 * Independent Researcher USA(美国独立研究员)

AI总结 提出图神经网络框架SpliceBind,通过异构体感知预测可药性,揭示结构方法成功与失败的边界,并建立耐药性分类法以指导临床决策。

Comments 10 pages, 4 figures, ACM-BCB 2026 Main Conference Short Paper

详情
AI中文摘要

剪接介导的药物耐药性发生在高达40%的靶向激酶抑制剂患者中,然而最先进的可药性工具基于单一结构运行,无法跨异构体进行比较。我们引入SpliceBind,一个用于异构体感知可药性预测的图神经网络框架。除了提高预测准确性(AUROC 0.703 vs. P2Rank 0.634,p = 0.026),我们还解决了一个更基本的问题:结构方法何时成功,何时必然失败?对跨越五种机制类别的六个临床验证变体的系统分析揭示了一个双层耐药性分类法。结构域缺失(AR-V7,Delta = -18.39)和口袋破坏产生结构可检测的变化,而变构机制(BRAF-p61)仍然从根本上对任何口袋中心方法不可见——这是任何算法改进都无法跨越的边界。值得注意的是,学习到的嵌入捕获了仅靠几何结构无法识别的基于亲和力的耐药性(ALK-L1196M:Delta_SB = -0.228 vs. Delta_P2Rank = -0.95),部分弥合了结构-生化差距。在跨越25个家族的229个激酶口袋上,SpliceBind实现了AUROC 0.703(p = 0.026 vs. P2Rank),并对保留的家族具有稳健的泛化能力(AUROC 0.761)。这种分类法改变了临床工作流程:在发现剪接变体后,临床医生可以立即确定计算分诊是否足够,或者是否需要生化验证——从而缩短从变体发现到治疗决策的时间。

英文摘要

Splice-mediated drug resistance occurs in up to 40% of patients on targeted kinase inhibitors, yet state-of-the-art druggability tools operate on single structures and cannot compare across isoforms. We introduce SpliceBind, a graph neural network framework for isoform-aware druggability prediction. Beyond improving prediction accuracy (AUROC 0.703 vs. P2Rank 0.634, p = 0.026), we address a more fundamental question: when do structural methods succeed, and when must they fail? Systematic analysis of six clinically validated variants spanning five mechanism classes reveals a two-tier resistance taxonomy. Domain deletions (AR-V7, Delta = -18.39) and pocket disruptions produce structurally detectable changes, while allosteric mechanisms (BRAF-p61) remain fundamentally invisible to any pocket-centric approach -- a boundary no algorithmic improvement can cross. Notably, learned embeddings capture affinity-based resistance missed by geometry alone (ALK-L1196M: Delta_SB = -0.228 vs. Delta_P2Rank = -0.95), partially bridging the structural-biochemical gap. On 229 kinase pockets spanning 25 families, SpliceBind achieves AUROC 0.703 (p = 0.026 vs. P2Rank) with robust generalization to held-out families (AUROC 0.761). This taxonomy transforms clinical workflows: upon discovering a splice variant, clinicians can immediately determine whether computational triage suffices or biochemical validation is required -- reducing time from variant discovery to therapeutic decision.

2606.04000 2026-06-04 cond-mat.mtrl-sci cs.LG 版本更新

SPLIT-PINN: Separable Probability Learning Technique via Physics-Informed Neural Networks for High-Dimensional Probabilistic Modeling

SPLIT-PINN: 基于物理信息神经网络的可分离概率学习技术用于高维概率建模

Pouria Behnoudfar, Deekshith Naidu Ponnana, Noah J. Schmelzer, Janith Wanni, George T. Gray, Dan J. Thoma, Curt A. Bronkhorst, Nan Chen, Wenxiao Pan

发表机构 * Department of Mechanical Engineering, University of Wisconsin-Madison(威斯康星大学麦迪逊分校机械工程系) Department of Mathematics, University of Wisconsin-Madison(威斯康星大学麦迪逊分校数学系) Department of Civil Engineering, Johns Hopkins University(约翰霍普金斯大学土木工程系) Materials Physics and Applications Division, Los Alamos National Laboratory(洛斯阿拉莫斯国家实验室材料物理与应用部) Department of Materials Science and Engineering, University of Wisconsin-Madison(威斯康星大学麦迪逊分校材料科学与工程系)

AI总结 提出一种基于物理信息神经网络的可分离概率学习技术(SPLIT-PINN),通过将漂移场分解为边际校正项并施加正交约束,从数据中推断高维输运主导的联合概率密度函数演化,实现对多晶材料微观结构状态演变的准确概率预测。

详情
AI中文摘要

我们提出了一种概率建模框架,用于将小尺度空间异质性纳入多晶金属材料宏观行为描述中。空间异质性材料状态场使用概率密度函数(PDF)表示,提供了跨不同计算多晶实现的微观结构变异性和状态演化的原则性统计描述。该框架基于概率输运模型的逆识别,该模型被表述为具有未知漂移项的Liouville方程。为了在高维、输运主导的设置中实现该漂移场的准确、稳定和可解释推断,我们开发了基于物理信息神经网络的可分离概率学习技术(SPLIT-PINN)。该方法结合了边际校正漂移分解、正交性约束和基于残差的自适应训练,以增强适定性、数值稳定性和物理一致性,而不施加限制性参数假设。使用SPLIT-PINN,控制联合状态PDF时间演化的漂移场直接从数据中推断。在基准验证之后,该框架应用于描述多晶微观结构状态(包括von Mises应力、位错密度和等效塑性应变率)演化的物理计算数据集。在单个数据集上训练的所学Liouville模型随后用于对多个未见过的多晶实现的联合和边际PDF的时间演化进行正向预测。与参考PDF的定量比较表明,所提出的框架产生了准确且鲁棒的概率预测,并有效跨数据集泛化。

英文摘要

We present a probabilistic modeling framework for incorporating small-scale spatial heterogeneity into macroscopic descriptions of material behavior for polycrystalline metallic materials. Spatially heterogeneous material state fields are represented using probability density functions (PDFs), providing a principled statistical description of microstructural variability and state evolution across different computational polycrystalline realizations. The framework is built on the inverse identification of a probabilistic transport model, formulated as a Liouville equation with an unknown drift term. To enable accurate, stable, and interpretable inference of this drift field in high-dimensional, transport-dominated settings, we develop a Separable Probability Learning Technique via Physics-Informed Neural Networks (SPLIT-PINN). This method incorporates a marginal-correction drift decomposition, orthogonality constraints, and residual-based adaptive training to enhance well-posedness, numerical stability, and physical consistency without imposing restrictive parametric assumptions. Using SPLIT-PINN, the drift field governing the temporal evolution of joint state PDFs is inferred directly from data. After benchmark validation, the framework is applied to physical computational datasets describing the evolution of polycrystalline microstructural states, including von Mises stress, dislocation density, and equivalent plastic strain rate. The learned Liouville model, trained on a single dataset, is subsequently used in forward predictions of the temporal evolution of joint and marginal PDFs for multiple unseen polycrystal realizations. Quantitative comparisons with reference PDFs demonstrate that the proposed framework yields accurate and robust probabilistic predictions and generalizes effectively across datasets.

2606.03995 2026-06-04 cs.LG cs.AI q-bio.QM 版本更新

Early Detection of Alzheimer's Disease Using Explainable Machine Learning on Clinical Biomarkers: A Multi-Class Classification Study Using the Alzheimer's Disease Neuroimaging Initiative (ADNI) Dataset

使用可解释机器学习基于临床生物标志物早期检测阿尔茨海默病:基于阿尔茨海默病神经影像学倡议(ADNI)数据集的多分类研究

Afshan Hashmi

发表机构 * TRDC, Tuwaiq Academy(TRDC,图瓦伊克学院)

AI总结 本研究使用XGBoost分类器,基于ADNI数据集的8个临床特征(MMSE、CDR Global、CDR-SB、MoCA、FAQ、年龄、性别、教育程度)进行三分类(正常认知、轻度认知障碍、阿尔茨海默病)检测,通过SMOTE处理类别不平衡,Optuna优化超参数,SHAP提供可解释性,在测试集上达到macro AUC 0.982、准确率0.943,并揭示了临床合理的特征重要性模式。

详情
AI中文摘要

背景:阿尔茨海默病(AD)影响全球超过5500万人。从常规临床评估中准确、可解释地检测正常认知(NC)、轻度认知障碍(MCI)和AD仍是一个关键未满足需求。方法:使用XGBoost分类器进行三分类检测,采用来自阿尔茨海默病神经影像学倡议(ADNI)的八个临床特征:MMSE、CDR Global、CDR Sum of Boxes(CDR-SB)、MoCA、FAQ、年龄、性别和教育程度。使用Optuna(50次试验)优化超参数;通过SMOTE处理类别不平衡。性能通过macro AUC-ROC(1000次迭代bootstrap 95%置信区间)、macro F1、平衡准确率和Cohen's kappa评估。SHAP值提供特征级别的可解释性。结果:数据集包含1641名基线受试者(608 NC、767 MCI、266 AD)。在五折交叉验证中,平均macro AUC为0.983(SD 0.007),准确率为0.944(SD 0.006),macro F1为0.929(SD 0.008)。在保留测试集(n=247)上,macro AUC为0.982(95% CI: 0.965--0.995),准确率为0.943,平衡准确率为0.932,macro F1为0.927,Cohen's kappa为0.909。SHAP分析确定CDR Global是NC和MCI的主要预测因子,而CDR-SB和MMSE共同驱动AD分类。结论:一个基于常规临床评估训练的可解释机器学习模型实现了近乎完美的三分类阿尔茨海默病检测。SHAP分析揭示了临床合理、类别特定的特征重要性模式,支持临床有效性。未来工作将扩展该框架,加入语音生物标志物以实现多模态检测。

英文摘要

Background: Alzheimer's disease (AD) affects over 55 million people worldwide. Accurate, interpretable detection of normal cognition (NC), mild cognitive impairment (MCI), and AD from routine clinical assessments remains a critical unmet need. Methods: An XGBoost classifier was developed for three-class detection using eight clinical features from the Alzheimer's Disease Neuroimaging Initiative (ADNI): MMSE, CDR Global, CDR Sum of Boxes (CDR-SB), MoCA, FAQ, age, sex, and education. Hyperparameters were optimised using Optuna (50 trials); class imbalance was addressed with SMOTE. Performance was evaluated by macro AUC-ROC with 1,000-iteration bootstrap 95% confidence intervals, macro F1, balanced accuracy, and Cohen's kappa. SHAP values provided feature-level explainability. Results: The dataset comprised 1,641 baseline subjects (608 NC, 767 MCI, 266 AD). On five-fold cross-validation, mean macro AUC was 0.983 (SD 0.007), accuracy 0.944 (SD 0.006), and macro F1 0.929 (SD 0.008). On the held-out test set (n = 247), macro AUC was 0.982 (95% CI: 0.965--0.995), accuracy 0.943, balanced accuracy 0.932, macro F1 0.927, and Cohen's kappa 0.909. SHAP analysis identified CDR Global as the dominant predictor for NC and MCI, while CDR-SB and MMSE together drove AD classification. Conclusion: An explainable machine learning model trained on routine clinical assessments achieves near-perfect three-class Alzheimer's detection. SHAP analysis reveals clinically plausible, class-specific feature importance patterns supporting clinical validity. Future work will extend this framework with speech biomarkers for multimodal detection.
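The abstract reports balanced accuracy and Cohen's kappa alongside plain accuracy. As a reminder of how these two metrics are derived from a confusion matrix, here is a minimal sketch. The 3x3 matrix `toy_cm` is invented purely for illustration and is not data from the paper; the function names are likewise hypothetical. (The reported class counts do add up: 608 + 767 + 266 = 1,641.)

```python
def cohen_kappa(cm):
    """Cohen's kappa from a square confusion matrix (rows: true, cols: predicted)."""
    total = sum(sum(row) for row in cm)
    observed = sum(cm[i][i] for i in range(len(cm))) / total
    row_sums = [sum(row) for row in cm]
    col_sums = [sum(col) for col in zip(*cm)]
    # Agreement expected by chance from the marginal distributions.
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total**2
    return (observed - expected) / (1 - expected)

def balanced_accuracy(cm):
    """Mean of per-class recall, which is insensitive to class imbalance."""
    recalls = [cm[i][i] / sum(cm[i]) for i in range(len(cm))]
    return sum(recalls) / len(recalls)

# Toy three-class matrix (NOT from the paper), classes ordered NC, MCI, AD.
toy_cm = [
    [8, 2, 0],
    [1, 7, 2],
    [0, 1, 9],
]
kappa = cohen_kappa(toy_cm)
bal_acc = balanced_accuracy(toy_cm)
```

Kappa discounts the agreement a classifier would reach by chance, so it is always at most the raw accuracy when chance agreement is positive.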

2605.04356 2026-06-04 cs.LG cs.AI 版本更新

Efficiently Aligning Language Models with Online Natural Language Feedback

通过在线自然语言反馈高效对齐语言模型

Christine Ye, Joe Benton

发表机构 * GitHub

AI总结 提出使用在线自然语言反馈替代可验证奖励,通过迭代优化代理奖励模型并在过优化点收集专家监督,在模糊领域高效对齐语言模型,实验表明可大幅提升专家监督的数据效率。

详情
AI中文摘要

可验证奖励的强化学习已被用于在许多领域激发语言模型的出色性能。但是,AI的广泛有益部署可能需要我们在“模糊”、难以监督的领域中训练具有强大能力的模型。在本文中,我们开发了在模糊领域中对齐语言模型的方法,其中人类专家仍然能够提供高质量的监督信号,但仅限于少量模型输出,使用在线自然语言反馈。具体来说,我们通过迭代优化代理奖励信号来训练模型,在过优化点停止,收集新的专家监督,并更新代理奖励。我们使用上下文学习(ICL)和微调从语言模型构建代理奖励模型。我们通过分别在Qwen3-8B和Haiku 4.5上激发创意写作和对齐研究能力来测试我们的方法。对于Qwen3-8B,ICL方法使用50倍更少的专家样本恢复了高达35%的性能,而微调方法使用最多20倍更少的样本恢复了80%,使用3倍更少的样本恢复了100%。对于Haiku 4.5,ICL方法使用30倍更少的样本恢复了高达35%的性能,微调方法使用10倍更少的样本恢复了100%。我们的结果表明,在线自然语言反馈可以显著提高专家监督的数据效率。

英文摘要

Reinforcement learning with verifiable rewards has been used to elicit impressive performance from language models in many domains. But, broadly beneficial deployments of AI may require us to train models with strong capabilities in "fuzzy", hard-to-supervise domains. In this paper, we develop methods to align language models in fuzzy domains where human experts are still able to provide high-quality supervision signal, but only for a small number of model outputs, using online natural language feedback. Specifically, we train models by iteratively optimizing against proxy reward signals, stopping at the point of over-optimization, collecting fresh expert supervision, and updating the proxy reward. We construct proxy reward models from language models using in-context learning (ICL) and fine-tuning. We test our methods by eliciting creative writing and alignment research capabilities in Qwen3-8B and Haiku 4.5 respectively. For Qwen3-8B, ICL methods recover up to 35% of performance with 50x fewer expert samples, while fine-tuning methods recover 80% with up to 20x fewer samples and 100% with 3x fewer samples. For Haiku 4.5, ICL methods recover up to 35% of performance with 30x fewer samples, and fine-tuning methods recover 100% with 10x fewer samples. Our results suggest that online natural language feedback can substantially improve the data efficiency of expert supervision.

2606.03943 2026-06-04 cs.RO cs.CV cs.LG 版本更新

PointAction: 3D Points as Universal Action Representations for Robot Control

PointAction: 3D点作为机器人控制的通用动作表示

Mutian Tong, Han Jiang, Qiao Feng, Lingjie Liu, Jiatao Gu

发表机构 * University of Pennsylvania(宾夕法尼亚大学)

AI总结 提出PointAction框架,通过微调视频生成模型联合预测未来RGB帧和动态3D点图,将点动力学作为与具体本体无关的动作接口,再由扩散动作解码器映射为可执行动作,以减少RGB动作歧义并跨任务/本体迁移。

Comments Project page: https://oriontmt.github.io/pointaction/

详情
AI中文摘要

视频-动作模型(VAM)利用预训练视频扩散模型捕获的广泛视觉动态,为通用机器人操作提供了有前景的路径。然而,仅RGB视频展开无法直接操作:它们未明确指定度量3D运动、接触几何和细粒度空间约束,导致动作基础不明确。同时,跨不同任务和本体的动作监督扩展仍然成本高昂。我们提出PointAction,一个通过显式基于点的4D建模将视频预测桥接到机器人动作的框架。PointAction微调基础视频生成模型,联合预测未来RGB帧和动态3D点图,产生任务相关场景几何的时间一致3D运动。这些点动力学作为结构化的、与本体无关的动作接口,由基于扩散的动作解码器映射为可执行的机器人动作。通过使用度量3D点动力学作为视频预测和控制之间的接口,PointAction减少了仅RGB动作基础的不确定性,并支持在有限动作监督下跨任务和本体的迁移。实验表明,PointAction在机器人场景上实现了最先进的4D生成质量,在模拟中优于现有基线,并泛化到预训练中未见过的两个真实机器人手臂。

英文摘要

Video-Action Models (VAMs) leverage the broad visual dynamics captured by pre-trained video diffusion models, offering a promising path toward generalizable robot manipulation. However, RGB-only video rollouts are not directly actionable: they leave metric 3D motion, contact geometry, and fine-grained spatial constraints under-specified, making action grounding ambiguous. Meanwhile, scaling action supervision across diverse tasks and embodiments remains costly. We present PointAction, a framework that bridges video predictions to robot actions through explicit point-based 4D modeling. PointAction fine-tunes a foundation video generation model to jointly predict future RGB frames and dynamic 3D pointmaps, producing temporally consistent 3D motion of task-relevant scene geometry. These point dynamics serve as a structured, embodiment-agnostic action interface, which a diffusion-based action decoder maps to executable robot actions. By using metric 3D point dynamics as the interface between video prediction and control, PointAction reduces the ambiguity of RGB-only action grounding and supports transfer across tasks and embodiments with limited action supervision. Experiments show that PointAction achieves state-of-the-art 4D generation quality on robot scenes, outperforms existing baselines in simulation, and generalizes to two real robot arms unseen during pretraining.

2606.03938 2026-06-04 cs.LG cs.AI 版本更新

q0: Primitives for Hyper-Epoch Pretraining

q0: 超周期预训练的原语

Bishwas Mandal, Shmuel Berman, Akshay Vegesna, Samip Dahal

发表机构 * Q Labs(Q实验室) Princeton University(普林斯顿大学)

AI总结 针对多周期训练中单模型性能饱和的问题,提出超周期预训练(q0)方法,通过循环调度、链式蒸馏和学习先验三个原语,从多周期预算中生成多样化模型群体并聚合其预测,显著提升数据效率。

Comments 22 pages, 5 figures

详情
AI中文摘要

多周期训练正成为标准做法,因为计算能力的增长速度快于高质量文本的供应。但预训练单个模型会在几轮后饱和,远在计算预算耗尽之前。我们认为这需要概念上的转变,从训练单个模型转向探索模型群体并聚合它们的预测。我们引入了超周期预训练(q0),它将多周期预算转化为多样化模型群体,其组合预测比单个精炼模型达到更低的验证损失。q0 归结为三个核心原语。具有反相关学习率和权重衰减的循环调度从几个并行轨迹中收集多样化模型。链式蒸馏使每个模型针对其前驱进行训练,从而模型质量在群体中累积。一个在保留集上拟合的学习先验,为任何推理预算选择和加权成员。在 1.8B 参数模型上,使用 100M FineWeb 令牌训练,q0 仅使用约 56 个周期(约 4.6 倍更少)即可匹配强大的 256 周期集成基线,或当匹配基线的集成大小时使用约 67 个周期(约 3.8 倍更少),并持续改进。这些增益在 Slowrun 设置下达到累积约 12.9 倍的数据效率,并迁移到下游基准测试。关键的是,最优分配随预算变化,因此我们给出了处方性配方,说明如何花费给定的周期预算以最大化泛化,从单个周期到最大预算。

英文摘要

Multi-epoch training is becoming the standard now that compute is growing faster than the supply of high-quality text. But pretraining a single model saturates within a few passes, long before the compute budget is exhausted. We argue this calls for a conceptual shift from training a single model toward exploring a population of models and aggregating their predictions. We introduce hyper-epoch pretraining (q0), which turns a multi-epoch budget into a population of diverse models whose combined predictions reach a lower validation loss than a single refined model. q0 reduces to three core primitives. A cyclic schedule with anti-correlated learning rate and weight decay collects diverse models from a few parallel trajectories. Chain distillation trains each model against its predecessor so that model quality compounds across the population. A learned prior, fit on a held out set, selects and weights members for any inference budget. On a 1.8B-parameter model trained on 100M FineWeb tokens, q0 matches a strong 256-epoch ensemble baseline using only ~56 epochs (~4.6x fewer), or ~67 epochs (~3.8x fewer) when matched to the baseline's ensemble size, and continues to improve beyond it. These gains reach cumulative ~12.9x data efficiency under the Slowrun setting and transfer to downstream benchmarks. Crucially, the optimal allocation shifts with the budget, so we give prescriptive recipes for how to spend a given epoch budget to maximize generalization, from a single epoch up to the largest budgets.
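The abstract describes a cyclic schedule in which learning rate and weight decay are anti-correlated, but does not give its functional form. The sketch below assumes a cosine cycle purely for illustration. The function name, the cosine shape, and all parameter names are assumptions, not the authors' schedule. It also checks the reported epoch ratios against the 256-epoch baseline.

```python
import math

def cyclic_schedule(step, period, lr_min, lr_max, wd_min, wd_max):
    """Hypothetical cosine cycle: learning rate peaks when weight decay is lowest."""
    phase = 0.5 * (1.0 + math.cos(2.0 * math.pi * (step % period) / period))
    lr = lr_min + (lr_max - lr_min) * phase   # high at the start of a cycle
    wd = wd_max - (wd_max - wd_min) * phase   # low at the start of a cycle
    return lr, wd

start = cyclic_schedule(0, 100, 1e-4, 1e-3, 0.01, 0.1)
middle = cyclic_schedule(50, 100, 1e-4, 1e-3, 0.01, 0.1)

# Ratios quoted in the abstract, relative to the 256-epoch baseline.
ratio_56 = round(256 / 56, 1)
ratio_67 = round(256 / 67, 1)
```

Under this assumed form, the start of each cycle uses the largest learning rate with the smallest weight decay, and the midpoint reverses both.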

2606.03899 2026-06-04 cs.LG 版本更新

Denoise First, Orthogonalize Later: Understanding Momentum in Muon via Spectral Filtering

先降噪,后正交:通过谱滤波理解Muon中的动量

Xianliang Li, Zihan Zhang, Weiyang Liu, Han Bao

发表机构 * The Institute of Statistical Mathematics(统计数理研究所) The Graduate Institute for Advanced Studies, SOKENDAI(综合研究大学院大学先端学术院) National Institute of Informatics(日本国立情报学研究所) The Chinese University of Hong Kong(香港中文大学) Tohoku University(日本东北大学) RIKEN AIP(理化学研究所AIP)

AI总结 本文通过谱滤波理论证明Muon优化器中的动量能抑制梯度扰动、扩大谱间隙,从而稳定正交化步骤,并证明先动量后正交化比相反顺序或去除动量更优。

详情
AI中文摘要

Muon最近在大语言模型训练中展示了强大的实证性能,但动量在Muon中的理论作用仍不清楚。现有的Muon分析要么移除动量以单独研究谱更新,要么保留动量而不解释其为何提升实证性能。我们的工作通过展示Muon中的动量充当谱滤波器来弥合这一差距。在结构化信号加扰动梯度模型下,我们证明动量抑制扰动同时保留主导信号,从而扩大它们之间的谱间隙。这个扩大的间隙稳定了传递给Muon正交化步骤的矩阵的奇异子空间,使得最终更新更可靠。我们进一步证明,在正交化之前应用动量比颠倒顺序或简单地移除动量能实现与梯度信号分量可证明的更强对齐。跨多种任务(包括LLM预训练)的实验支持我们的理论分析。更广泛地说,我们的理论为理解其他基于矩阵的优化器中动量的益处提供了起点。

英文摘要

Muon has recently demonstrated strong empirical performance in large language model training, but the theoretical role of momentum in Muon remains unclear. Existing analyses of Muon either remove momentum to study spectral updates in isolation, or retain momentum without explaining why it improves empirical performance. Our work bridges this gap by showing momentum in Muon acts as a spectral filter. Under a structured signal-plus-perturbation gradient model, we prove that momentum suppresses perturbations while preserving the dominant signal, thereby enlarging the spectral gap between them. This enlarged gap stabilizes the singular subspaces of the matrix passed to Muon's orthogonalization step, making the resulting update more reliable. We further show that applying momentum before orthogonalization achieves provably stronger alignment with the signal component of the gradient than either reversing this order or simply removing momentum. Experiments across diverse tasks, including LLM pretraining, support our theoretical analysis. More broadly, our theory offers a starting point for understanding the benefits of momentum in other matrix-based optimizers.
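The filtering intuition in the abstract (momentum suppresses perturbations while preserving the dominant signal) can be seen in a scalar toy case. The sketch below is a simplification and not the paper's matrix-valued analysis: it assumes an exponential-moving-average form of momentum and a gradient equal to a constant signal plus zero-mean Gaussian noise. All names are hypothetical.

```python
import random

def ema_momentum(grads, beta):
    """Exponential moving average of a gradient stream (assumed momentum form)."""
    m, out = 0.0, []
    for g in grads:
        m = beta * m + (1.0 - beta) * g
        out.append(m)
    return out

def variance(xs):
    mu = sum(xs) / len(xs)
    return sum((x - mu) ** 2 for x in xs) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable
signal = 1.0
grads = [signal + rng.gauss(0.0, 1.0) for _ in range(5000)]

# Drop the first 100 steps so the zero initialisation has decayed away.
smoothed = ema_momentum(grads, beta=0.9)[100:]
raw_var = variance(grads)
smoothed_var = variance(smoothed)
smoothed_mean = sum(smoothed) / len(smoothed)
```

For independent noise, the stationary variance of the averaged stream is (1 - beta) / (1 + beta) times the raw variance, while the mean stays at the signal level. In this toy setting the gap between signal and perturbation therefore widens before any later processing step.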

2606.03892 2026-06-04 cs.CL cs.AI cs.LG 版本更新

Synthesize and Reward -- Reinforcement Learning for Multi-Step Tool Use in Live Environments

合成与奖励——面向实时环境中多步骤工具使用的强化学习

Ibrahim Abdelaziz, Asim Munawar, Kinjal Basu, Maxwell Crouse, Chulaka Gunasekara, Suneet Katrekar, Pavan Kapanipathi

发表机构 * IBM Research(IBM研究院)

AI总结 提出PROVE框架,通过20个有状态MCP服务器、自动化数据合成流水线和多组件程序化奖励,解决多步骤工具调用中的环境构建、查询生成和奖励设计问题,在BFCL Multi-Turn、tau2-bench和T-Eval上分别提升最多+10.2、+6.8和+6.5分。

详情
AI中文摘要

训练LLM编排多步骤工具调用受到三个相互耦合的障碍的阻碍:现实的有状态执行环境构建成本高昂,合成训练查询通常与服务器的实际状态脱节(因此生成的工具调用无法执行),以及基于回忆的RL奖励会鼓励冗长的工具调用模式。我们提出PROVE(已验证环境上的程序化奖励),一个包含三项贡献的框架:(1)一个包含20个有状态MCP(模型上下文协议)服务器的库,暴露了343个工具,支持具有会话范围状态隔离的实时执行RL训练;(2)一个自动数据合成流水线,通过基于实时采样服务器状态的依赖图引导的对话模拟,针对这些服务器生成经过验证的多轮工具调用轨迹,使得每个生成的查询都引用实际存在的实体;(3)一个多组件程序化奖励——渐进式有效性评分、依赖感知覆盖率、具有复杂度缩放调用预算的自适应效率惩罚、工具名称信号和参数值匹配奖励——无需外部评判模型。我们使用相同的奖励超参数和约13K训练示例,通过GRPO训练了四个模型(Qwen3-4B、Qwen3-8B、Qwen2.5-7B、Granite-4.1-8B);仅对每个模型族从三点扫描中调整学习率。在BFCL Multi-Turn、tau2-bench和T-Eval上,PROVE分别带来了最多+10.2、+6.8和+6.5分的改进,表明紧凑的程序化奖励在两个模型族的多步骤工具编排上产生了一致的收益。

英文摘要

Training LLMs to orchestrate multi-step tool calls is held back by three coupled obstacles: realistic stateful execution environments are costly to build, synthetic training queries are often detached from the server's actual state (so the generated tool calls fail to execute), and recall-based RL rewards incentivize verbose tool-calling patterns. We present PROVE (Programmatic Rewards On Verified Environments), a framework with three contributions: (1) a library of 20 stateful MCP (Model Context Protocol) servers exposing 343 tools, enabling live-execution RL training with session-scoped state isolation; (2) a state-machine data synthesis pipeline that generates multi-turn tool-call trajectories grounded in live-sampled server state, so generated queries reference entities that actually exist; and (3) a multi-component programmatic reward with an adaptive efficiency penalty that counters the verbosity incentive of recall-based rewards. We train four models (Qwen3-4B, Qwen3-8B, Qwen2.5-7B, Granite-4.1-8B) with GRPO on the resulting ~13K training examples. On BFCL Multi-Turn, tau2-bench, and T-Eval, PROVE yields improvements of up to +10.2, +6.8, and +6.5 points respectively, demonstrating that this framework yields consistent gains on multi-step tool orchestration across two model families.

2606.03746 2026-06-04 cs.CV cs.AI cs.GR cs.LG 版本更新

Qwen-Image-Flash: Beyond Objective Design

Qwen-Image-Flash:超越目标设计

Tianhe Wu, Kun Yan, Zikai Zhou, Lihan Jiang, Jiahao Li, Jie Zhang, Kaiyuan Gao, Ningyuan Tang, Shengming Yin, Xiaoyue Chen, Xiao Xu, Yilei Chen, Yuxiang Chen, Yan Shu, Yixian Xu, Yanran Zhang, Zihao Liu, Zhendong Wang, Zekai Zhang, Deqing Li, Liang Peng, Yi Wang, Jingren Zhou, Chenfei Wu

发表机构 * alibaba-inc.com(阿里巴巴公司)

AI总结 本文通过系统研究数据组成、教师指导和任务混合三个因素,提出Qwen-Image-Flash,表明有效的少步蒸馏不仅需要精心设计的目标,还需要对更广泛的训练流程进行原则性组织。

详情
AI中文摘要

少步蒸馏已成为加速先进视觉生成模型的有效策略,但先前的工作主要集中在蒸馏目标上。在这项工作中,我们从互补的角度重新审视少步蒸馏,重点关注关键影响学生表现的训练方案。以Qwen-Image-2.0为代表案例,我们系统地研究了统一文本到图像生成和指令引导图像编辑蒸馏中的三个因素:数据组成、教师指导和任务混合。我们的实证分析揭示了若干非直观行为,这些行为推动了Qwen-Image-Flash的开发。总体而言,我们的结果表明,有效的少步蒸馏不仅需要精心设计的目标,还需要对更广泛的训练流程进行原则性组织。

英文摘要

Few-step distillation has become an effective strategy for accelerating advanced visual generative models, yet prior work has largely focused on distillation objectives. In this work, we revisit few-step distillation from a complementary perspective, focusing on the training recipe that critically shapes student performance. Using Qwen-Image-2.0 as a representative case, we systematically investigate three factors in unified text-to-image generation and instruction-guided image editing distillation: data composition, teacher guidance, and task mixture. Our empirical analysis reveals several non-obvious behaviors, which motivate the development of Qwen-Image-Flash. Overall, our results suggest that effective few-step distillation requires not only carefully designed objectives, but also principled organization of the broader training pipeline.

2606.03631 2026-06-04 cs.LG cs.AI 版本更新

AnchorMoE: Interpretable Time Series Classification via Anchor-Routed MoE

AnchorMoE: 基于锚点路由的混合专家模型实现可解释时间序列分类

Tao Xie, Zexi Tan, Haoyi Xiao, Mengke Li, Yiqun Zhang, Yang Lu, Cuie Yang, Yiu-ming Cheung

发表机构 * School of Automation, Guangdong University of Technology(广东工业大学自动化学院) School of Computer Science and Technology, Guangdong University of Technology(广东工业大学计算机科学与技术学院) College of Computer Science and Software Engineering, Shenzhen University(深圳大学计算机科学与软件工程学院) School of Informatics, Xiamen University(厦门大学信息学院) State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University(东北大学过程工业综合自动化国家重点实验室) Department of Computer Science, Hong Kong Baptist University(香港浸会大学计算机科学系)

AI总结 提出AnchorMoE框架,利用混合专家架构对局部补丁进行多视角表示并路由至专门专家,通过加性分解实现前向可解释性,并引入几何正交约束和不确定性感知门控机制提升稀疏信号下的分解可靠性与噪声抑制。

Comments Accepted by KDD 2026, 12 pages

详情
AI中文摘要

多变量时间序列分类(MTSC)在高风险领域(如临床诊断和工业故障检测)中至关重要,这些领域的安全部署需要透明的决策过程。然而,隔离驱动模型预测的时间段具有挑战性,因为现实世界时间序列中的判别信号通常是稀疏、异构且被背景噪声严重掩盖的。因此,本文提出了AnchorMoE,一种天生可解释的分类框架。基于混合专家(MoE)架构,AnchorMoE编码局部补丁的多视角表示并将其路由到专门专家,确保最终预测被表述为输入段上的精确加性分解,从而促进前向透明度,而非依赖事后估计。为了在稀疏信号分布下保持这种分解的可靠性,我们引入了几何正交约束,惩罚表示冗余,迫使不同专家专门处理异构预测模式。此外,设计了一个不确定性感知的可靠性门控,动态校准每个段的贡献,有效抑制残余背景噪声。在真实世界和合成基准上的大量实验表明,AnchorMoE在实现高度竞争的分类性能的同时,忠实于原始时间序列进行决策。

英文摘要

Multivariate time series classification (MTSC) is pivotal in high-stakes domains, such as clinical diagnosis and industrial fault detection, where safe deployment necessitates transparent decision-making. However, isolating the temporal segments that drive model predictions is challenging because discriminative signals in real-world time series are typically sparse, heterogeneous, and heavily obscured by background noise. This paper, therefore, proposes AnchorMoE, an interpretable-by-construction classification framework. Built upon a Mixture-of-Experts (MoE) architecture, AnchorMoE encodes multi-view representations of local patches and routes them to specialized experts, ensuring that the final prediction is formulated as an exact additive decomposition over the input segments, facilitating ante-hoc transparency rather than relying on post-hoc estimations. To maintain the reliability of this decomposition under sparse signal distributions, we introduce a geometric orthogonality constraint that penalizes representational redundancy, compelling distinct experts to specialize in heterogeneous predictive patterns. Furthermore, an uncertainty-aware reliability gate is designed to dynamically calibrate the contribution of each segment, effectively suppressing residual background noise. Extensive experiments on real-world and synthetic benchmarks demonstrate that AnchorMoE achieves highly competitive classification performance while faithfully grounding its decisions in the raw time series.

2606.03441 2026-06-04 cs.RO cs.LG 版本更新

PerchRL: Vision-Based Agile Perching on Inclined Platforms under Rapid and Irregular Motion

PerchRL:基于视觉的快速不规则运动倾斜平台敏捷着陆

Zihong Lu, Zongzhuo Liu, Huaxu Li, Jinqiang Cui, Jie Mei, Youmin Gong, U Kei Cheang, Boyu Zhou

发表机构 * SUSTech(四川大学) HITSZ(哈尔滨工业大学) PCL(鹏城实验室) Differential Robotics(差分机器人实验室)

AI总结 提出PerchRL强化学习框架,通过两阶段学习策略(状态预训练+视觉微调)和混合学习框架(可见性感知状态增强+主动感知奖励),实现四旋翼在快速不规则运动倾斜平台上的自主视觉着陆。

详情
AI中文摘要

自主视觉引导的四旋翼在移动倾斜平台上的着陆对于空地协作至关重要,但由于有限的视场角(FOV)而具有挑战性。本文提出PerchRL,一种基于强化学习(RL)的框架,用于在快速和不规则运动下的倾斜平台上进行基于视觉的敏捷着陆。具体而言,我们采用两阶段学习策略,包括基于状态的预训练和基于视觉的微调。为了提高对不同平台运动的泛化能力,我们使用随机化的平台轨迹来防止过拟合,并采用时间增强方法从历史观测中捕捉潜在运动模式。在基于视觉的微调过程中,提出了一种混合学习框架,包括可见性感知状态增强和主动感知奖励,以提高在间歇性视觉丢失下的鲁棒性。大量的仿真和真实世界实验证明了PerchRL的可行性、稳定性和实时性能,而在不同四旋翼平台上的成功部署进一步验证了其适应性。源代码将发布以惠及社区。

英文摘要

Autonomous vision-based perching of quadrotors on moving inclined platforms is critical for air-ground collaboration but remains challenging due to the limited field of view (FOV). In this paper, we propose PerchRL, a reinforcement learning (RL) framework for vision-based agile perching on inclined platforms under rapid and irregular motion. Specifically, we employ a two-stage learning strategy consisting of state-based pre-training followed by vision-based fine-tuning. To improve generalization across diverse platform motions, we employ randomized platform trajectories to prevent overfitting and temporal augmentation methods to capture latent motion patterns from historical observations. During vision-based fine-tuning, a hybrid learning framework consisting of visibility-aware state augmentation and active perception rewards is presented to improve robustness under intermittent visual loss. Extensive simulation and real-world experiments demonstrate the feasibility, stability, and real-time performance of PerchRL, while successful deployment across distinct quadrotor platforms further validates its adaptability. The source code will be released to benefit the community.

2606.03393 2026-06-04 cs.LG 版本更新

Flicker-DDPM: Accelerating Denoising Diffusion via 1/f Colored Noise Injection

Flicker-DDPM:通过1/f彩色噪声注入加速去噪扩散

KeXiang Mao, FanCheng Li

发表机构 * School of Physics and Technology, Wuhan University(武汉大学物理科学技术学院) Hongyi Honor College, Wuhan University(弘毅荣誉学院)

AI总结 提出Flicker-DDPM模型,利用自组织临界性启发的1/f彩色噪声替代各向同性白噪声,通过空间相关核生成幂律谱噪声,在CIFAR-10上以3.33倍更少的采样步数达到或超越标准DDPM的生成质量,并从频域线性理论解释加速机制。

Comments 16pages, 8 figures, Code available at https://github.com/Mao-Kexiang/Flicker_DDPM

详情
AI中文摘要

我们提出了一种新颖的扩散模型Flicker-DDPM,它引入了受自组织临界性(SOC)启发的闪烁(1/f)噪声,SOC是自然系统中广泛观察到的现象。与在前向过程中采用各向同性白噪声的去噪扩散概率模型(DDPM)不同,Flicker-DDPM采用具有幂律谱的彩色噪声,以更好地匹配自然图像的频谱统计,其功率谱通常遵循P(k) ∝ 1/k^α。为此,我们基于空间相关核σ(d) = (d + 1)^{-η}开发了一个彩色噪声模块,并从理论上证明调整η可以控制生成的1/f^α噪声的谱指数α,从而适应具有不同频谱特征的数据集。在CIFAR-10上,Flicker-DDPM使用3.33倍更少的采样步数即可达到或超越标准DDPM基线的生成质量,且每步的额外计算成本可忽略不计。我们进一步开发了一种频域线性理论,证明频谱匹配的彩色噪声使反向轨迹线性化,从理论上解释了所观察到的采样加速现象。

英文摘要

We propose a novel diffusion model, Flicker-DDPM, which incorporates flicker (1/f) noise inspired by self-organized criticality (SOC), a widely observed phenomenon in natural systems. Unlike denoising diffusion probabilistic models (DDPMs), which employ isotropic white noise in the forward process, Flicker-DDPM adopts colored noise with power-law spectra to better match the spectral statistics of natural images, whose power spectra typically follow P(k) ∝ 1/k^α. To this end, we develop a colored-noise module based on a spatial correlation kernel, σ(d) = (d + 1)^{-η}, and theoretically establish that adjusting η controls the spectral exponent α of the generated 1/f^α noise, enabling adaptation to datasets with diverse spectral characteristics. On CIFAR-10, Flicker-DDPM matches or surpasses the generation quality of a standard DDPM baseline using 3.33 times fewer sampling steps, with negligible additional computational cost per step. We further develop a frequency-domain linear theory demonstrating that spectrally matched colored noise linearizes the reverse trajectory, theoretically explaining the observed sampling acceleration.
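To make the kernel σ(d) = (d + 1)^{-η} concrete, the sketch below evaluates it and uses it as mixing weights to turn 1D white noise into spatially correlated noise. This is only an illustration of how a decaying kernel introduces correlation. How the paper's module actually applies the kernel (and in how many dimensions) is not stated in the abstract, so the circular-mixing construction, the `radius` cutoff, and all function names are assumptions.

```python
import random

def kernel(d, eta):
    """Correlation kernel from the abstract: sigma(d) = (d + 1)^(-eta)."""
    return (abs(d) + 1) ** (-eta)

def colored_noise(white, eta, radius):
    """Mix white noise with kernel weights over a circular 1D neighbourhood."""
    n = len(white)
    return [
        sum(kernel(d, eta) * white[(i + d) % n] for d in range(-radius, radius + 1))
        for i in range(n)
    ]

def lag1_autocorr(xs):
    """Sample correlation between neighbouring entries."""
    mu = sum(xs) / len(xs)
    c = [x - mu for x in xs]
    return sum(a * b for a, b in zip(c, c[1:])) / sum(a * a for a in c)

rng = random.Random(0)  # fixed seed so the run is repeatable
white = [rng.gauss(0.0, 1.0) for _ in range(2048)]
colored = colored_noise(white, eta=1.0, radius=8)

white_corr = lag1_autocorr(white)
colored_corr = lag1_autocorr(colored)
```

Neighbouring samples of the mixed noise are strongly correlated, while the white input is nearly uncorrelated. A smaller η makes the kernel decay more slowly and so spreads correlation over longer distances.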

2606.03376 2026-06-04 cs.CV cs.AI cs.CL cs.LG 版本更新

P$^2$-DPO: Grounding Hallucination in Perceptual Processing via Calibration Direct Preference Optimization

P²-DPO:通过校准直接偏好优化在感知处理中锚定幻觉

Ruipeng Zhang, Zhihao Li, Haozhang Yuan, C. L. Philip Chen, Tong Zhang

发表机构 * Guangdong Provincial Key Laboratory of Computational AI Models and Cognitive Intelligence, School of Computer Science & Engineering, South China University of Technology(广东省计算人工智能模型与认知智能重点实验室,计算机科学与工程学院,华南理工大学) Pazhou Lab, Guangzhou, China(琶洲实验室,广州,中国) Engineering Research Center of the Ministry of Education on Health Intelligent Perception and Paralleled Digital-Human, Guangzhou, China(教育部健康智能感知与并行数字人工程研究中心,广州,中国)

AI总结 针对大型视觉语言模型中的幻觉问题,提出P²-DPO训练范式,通过模型自生成偏好对和校准损失,直接优化感知瓶颈和视觉鲁棒性,无需昂贵人工反馈。

详情
AI中文摘要

幻觉最近在大型视觉语言模型(LVLMs)中引起了广泛的研究关注。直接偏好优化(DPO)旨在直接从人类提供的纠正偏好中学习,从而解决幻觉问题。尽管取得了成功,但这种范式尚未专门针对关注区域中的感知瓶颈或解决图像退化下的视觉鲁棒性不足问题。此外,现有的偏好对通常是视觉无关的,其固有的离策略性质限制了它们在指导模型学习方面的有效性。为了解决这些挑战,我们提出了感知处理直接偏好优化(P²-DPO),一种新颖的训练范式,其中模型生成并学习自己的偏好对,从而直接解决已识别的视觉瓶颈,同时固有地避免视觉无关和离策略数据的问题。它引入了:(1)一种针对焦点增强感知和视觉鲁棒性的在策略偏好对构建方法,以及(2)一种精心设计的校准损失,以精确地将视觉信号与文本的因果生成对齐。实验结果表明,在相当数量的训练数据和成本下,P²-DPO在基准测试中优于依赖昂贵人工反馈的强基线。此外,对注意力区域保真度(ARF)和图像退化场景的评估验证了P²-DPO在解决关注区域感知瓶颈和提高对退化输入的视觉鲁棒性方面的有效性。

英文摘要

Hallucination has recently garnered significant research attention in Large Vision-Language Models (LVLMs). Direct Preference Optimization (DPO) aims to learn directly from the corrected preferences provided by humans, thereby addressing the hallucination issue. Despite its success, this paradigm has yet to specifically target the perceptual bottleneck in attended regions or address insufficient Visual Robustness against image degradation. Furthermore, existing preference pairs are often vision-agnostic and their inherently off-policy nature limits their effectiveness in guiding model learning. To address these challenges, we propose Perceptual Processing Direct Preference Optimization (P$^2$-DPO), a novel training paradigm in which the model generates and learns from its own preference pairs, thereby directly addressing the identified visual bottlenecks while inherently avoiding the issues of vision-agnostic and off-policy data. It introduces: (1) an on-policy preference pairs construction method targeting Focus-and-Enhance perception and Visual Robustness, and (2) a well-designed Calibration Loss to precisely align visual signals with the causal generation of text. Experimental results demonstrate that with a comparable amount of training data and cost, P$^2$-DPO outperforms strong baselines that rely on costly human feedback on benchmarks. Furthermore, evaluations on Attention Region Fidelity (ARF) and image degradation scenarios validate the effectiveness of P$^2$-DPO in addressing perceptual bottleneck in attended regions and improving Visual Robustness against degraded inputs.

2606.02886 2026-06-04 cs.LG cs.AI cs.CE math.PR physics.ao-ph 版本更新

Scalable Uncertainty Quantification for Extreme Weather Forecasting via Empirical Neural Tangent Kernels

基于经验神经正切核的极端天气预报可扩展不确定性量化

Jose Marie Antonio Miñoza, Rex Gregor Laylo, Sebastian C. Ibañez

发表机构 * Center for AI Research(人工智能研究中心) Department of Education(教育部门) Makati Philippines(马卡蒂菲律宾)

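AI summary Proposes Neural Tangent Kernel-based uncertainty quantification (NTK-UQ) built on last-layer empirical features; an analysis of a variance collapse mechanism and of decomposition performance yields adaptive prediction intervals for extreme weather without retraining.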

Comments Proceedings of the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2 (KDD '26)

详情
AI中文摘要

深度学习天气模型现在匹配数值天气预报的准确性,同时运行速度快几个数量级,但产生确定性预测而没有不确定性估计,这对于极端天气事件期间的高风险决策是一个关键差距。本文提出基于神经正切核的不确定性量化(NTK-UQ),使用最后一层经验特征。理论分析预测,UQ质量通过两种机制依赖于架构。首先,方差崩溃机制解释了UQ何时失败:当特征值截断秩接近特征空间的有效秩时,GP校正项消耗几乎所有的先验方差,破坏了热带气旋与常规条件之间的区分;具有集中谱(谱算子)的架构需要激进截断(k≤10),而基于注意力的模型容忍满秩计算。其次,分解性能取决于极端天气的非高斯、重尾结构:独立成分分析利用高阶统计量(峰度、负熵)来隔离重尾极端事件特征,实现了比仅捕获二阶方差的奇异值分解更高的区分度。一个数据驱动的选择规则根据特征谱集中比选择ICA或SVD,正确地为所有四种评估架构指定了更优的分解。与分裂共形预测(自然的后验基线)相比,NTK-UQ在90%覆盖率下实现了31-37%更窄的预测区间,并且独特地产生随极端事件严重程度缩放的自适应区间,而共形预测无法通过构造实现。该框架无需重训练;推理时的不确定性每个样本仅需一次矩阵-向量乘积。

英文摘要

Deep learning weather models now match numerical weather prediction accuracy while running orders of magnitude faster, but produce deterministic forecasts without uncertainty estimates, a critical gap for high-stakes decisions during extreme weather events. This paper proposes Neural Tangent Kernel-based uncertainty quantification (NTK-UQ) using last-layer empirical features. Theoretical analysis predicts that UQ quality is architecture-dependent through two mechanisms. First, a variance collapse mechanism explains when UQ fails: when the eigenvalue truncation rank approaches the effective rank of the feature space, the GP correction term consumes nearly all prior variance, destroying discrimination between tropical cyclones and routine conditions; architectures with concentrated spectra (spectral operators) require aggressive truncation ($k \leq 10$), while attention-based models tolerate full-rank computation. Second, decomposition performance depends on the non-Gaussian, heavy-tailed structure of extreme weather: Independent Component Analysis exploits higher-order statistics (kurtosis, negentropy) to isolate heavy-tailed extreme-event features, achieving higher discrimination than singular value decomposition, which captures only second-order variance. A data-driven selection rule chooses ICA or SVD from the feature eigenspectrum concentration ratio, correctly prescribing the superior decomposition for all four evaluated architectures. Compared to split conformal prediction (the natural post-hoc baseline), NTK-UQ achieves 31-37% sharper prediction intervals at 90% coverage, and uniquely produces adaptive intervals that scale with extreme event severity, which conformal prediction cannot achieve by construction. The framework requires no retraining; inference-time uncertainty requires only a single matrix-vector product per sample.
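To make the baseline in this comparison concrete, here is a minimal sketch of split conformal prediction with the plain absolute-residual score. It illustrates why the resulting intervals have one constant width for every input, which is the non-adaptivity the abstract refers to. The constant-zero forecaster, the Student-t noise, and all variable names are illustrative assumptions, not the paper's setup.

```python
import numpy as np

def split_conformal_halfwidth(residuals_cal, alpha=0.1):
    """Half-width of a split conformal interval from calibration residuals."""
    scores = np.sort(np.abs(residuals_cal))
    n = len(scores)
    # Finite-sample corrected rank ceil((n + 1) * (1 - alpha)).
    # Clipping to n is a simplification; strictly the interval is infinite
    # when the rank exceeds n (only relevant for very small n).
    rank = min(int(np.ceil((n + 1) * (1 - alpha))), n)
    return scores[rank - 1]

rng = np.random.default_rng(0)
# Hypothetical forecaster that always predicts 0; the truth is heavy-tailed.
y_cal = rng.standard_t(df=3, size=2000)
y_test = rng.standard_t(df=3, size=2000)
pred_cal = np.zeros_like(y_cal)
pred_test = np.zeros_like(y_test)

q = split_conformal_halfwidth(y_cal - pred_cal, alpha=0.1)
lower, upper = pred_test - q, pred_test + q
coverage = np.mean((y_test >= lower) & (y_test <= upper))
widths = upper - lower  # identical for every test point
```

Coverage lands near the 90% target, but `widths` is the same number everywhere: with this score, a routine day and an extreme event receive equally wide intervals.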

2606.02576 2026-06-04 cs.CV cs.LG 版本更新

ProtoAda: Prototype-Guided Adaptive Adapter Expansion and Geometric Consolidation for Multimodal Continual Instruction Tuning

ProtoAda: 原型引导的自适应适配器扩展与几何整合用于多模态持续指令微调

Yu-Cheng Shi, Zhen-Hao Xie, Jun-Tao Tang, Da-Wei Zhou

发表机构 * School of Artificial Intelligence, Nanjing University, China(南京大学人工智能学院) State Key Laboratory of Novel Software Technology, Nanjing University, China(南京大学新型软件技术国家重点实验室)

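AI summary Proposes ProtoAda, a framework that uses format-aware task prototypes and geometry-aware parameter consolidation to address task misrouting and gradient interference in multimodal continual instruction tuning.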

详情
AI中文摘要

多模态大语言模型通过指令微调取得了强大性能,但实际部署需要它们持续获取新的视觉语言能力,这使得多模态持续指令微调至关重要。为了减少任务间干扰并促进协作,近期方法常采用稀疏架构,如基于图像-文本相似度路由的LoRA专家混合。然而,具有不同响应结构的任务可能共享高度相似的视觉语言语义,从而被错误地路由到同一专家;仅凭图像-文本相似度不足以进行可靠的任务分配。例如,一个需要坐标预测的定位任务专家,在学习语义相似的VQA任务后,可能偏向于生成短文本答案。这种格式盲目的任务分配将异构响应类型整合到共享参数中,引发梯度干扰和无效的专家协作。为解决此问题,我们提出ProtoAda,一种原型引导的自适应微调框架。ProtoAda引入格式感知任务原型,使任务分配和路由与任务语义及输出结构对齐,并以几何感知方式整合格式兼容的更新,有效重用并逐步优化现有参数。在多个基准上的大量实验表明,ProtoAda取得了优越性能,尤其是在答案结构易被顺序微调破坏的任务上。

英文摘要

Multimodal Large Language Models (MLLMs) achieve strong performance through instruction tuning, but real-world deployment requires them to continually acquire new vision-language capabilities, making Multimodal Continual Instruction Tuning (MCIT) essential. To reduce inter-task interference and promote collaboration, recent methods often employ sparse architectures like Mixture of LoRA Experts with image-text similarity routing. However, tasks with distinct response structures could share highly similar visual-linguistic semantics and thus be wrongly routed to the same expert; image-text similarity alone is insufficient for reliable task assignment. For example, an expert in a grounding task requiring coordinate prediction may be biased toward producing short textual answers after learning semantically similar VQA tasks. This format-blind task assignment integrates heterogeneous response types into shared parameters, inducing gradient interference and ineffective expert collaboration. To address this problem, we propose ProtoAda, a prototype-guided adaptive tuning framework. ProtoAda introduces format-aware task prototypes to align task assignment and routing with both task semantics and output structure, and further consolidates format-compatible updates in a geometry-aware manner to effectively reuse and progressively refine existing parameters. Extensive experiments on multiple benchmarks demonstrate that ProtoAda achieves superior performance, especially on tasks whose answer structures are easily corrupted by sequential tuning.

2606.02521 2026-06-04 cs.LG cs.CV 版本更新

Drifting Preference Optimization for One-Step Generative Models

一步生成模型的漂移偏好优化

Zhou Jiang, Yandong Wen, Zhen Liu

发表机构 * Westlake University(西湖大学) The Chinese University of Hong Kong, Shenzhen(香港中文大学(深圳))

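AI summary Proposes DrPO, which samples and ranks candidates online to build a non-parametric dipole preference field plus a reference drift, enabling preference fine-tuning of one-step generative models without reward-model gradients.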

Comments 24 pages, 9 figures

详情
AI中文摘要

一步式文本到图像生成器因只需一次前向传播即可生成图像而具有吸引力,但其偏好微调仍然困难:标准对齐方法通常依赖于策略似然、去噪轨迹、可微奖励梯度或测试时优化。我们提出漂移偏好优化(DrPO),一种针对确定性一步生成器的在线偏好微调方法。对于每个提示,DrPO 从当前生成器中采样候选图像,用目标奖励对其进行排序,并利用高分和低分样本合成特征空间更新方向。该更新是一个非参数偶极偏好场加上从冻结的基础生成器估计的参考漂移,并通过解耦的特征空间回归目标进行优化。目标奖励仅用于排序,因此 DrPO 可以使用大型、黑盒或不可微的奖励进行训练,而推理仍只需一次生成器调用。我们在 SD-Turbo 和 SDXL-Turbo 上使用多个目标奖励和基准(包括 HPSv3 和 GenEval)评估了 DrPO。DrPO 在匹配有效批量设置下,通过移除奖励模型反向传播,比无奖励梯度的一步偏好基线提高了对齐度,并将 HPSv3 训练计算量减少了 3.51 倍。初步离线实验表明,基于样本的梯度合成也可用于在线奖励排序之外。

英文摘要

One-step text-to-image generators are attractive for deployment because they generate an image with a single forward pass, but preference finetuning them remains difficult: standard alignment methods often rely on policy likelihoods, denoising trajectories, differentiable reward gradients, or test-time optimization. We propose Drifting Preference Optimization (DrPO), an online preference-finetuning method for deterministic one-step generators. For each prompt, DrPO samples candidates from the current generator, ranks them with a target reward, and uses high- and low-scoring samples to synthesize a feature-space update direction. The update is a non-parametric dipole preference field plus a reference drift estimated from the frozen base generator, and is optimized through a detached feature-space regression target. The target reward is used only for ranking, so DrPO can train with large, black-box, or non-differentiable rewards while inference remains a single generator call. We evaluate DrPO on SD-Turbo and SDXL-Turbo with multiple target rewards and benchmarks, including HPSv3 and GenEval. DrPO improves alignment over reward-gradient-free one-step preference baselines and reduces HPSv3 training computation by $3.51\times$ under the matched effective-batch setting by removing reward-model backpropagation. Initial offline experiments suggest that sample-based gradient synthesis can also be used beyond online reward ranking.

2606.02166 2026-06-04 cs.LG 版本更新

EEG-FuseFormer: A Transformer-Driven Feature Fusion Framework for Seizure Onset Prediction

EEG-FuseFormer: 一种用于癫痫发作预测的Transformer驱动特征融合框架

Vigneshwar Hariharan, Chithra Reghuvaran, Arlene John, Nhat Pham, Omer Rana, Deepu John, Ganesh Neelakanta Iyer

发表机构 * National University of Singapore(新加坡国立大学) University College Dublin(都柏林大学) University of Twente(特文特大学) Cardiff University(卡迪夫大学)

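AI summary Proposes EEG-FuseFormer, which fuses spatiotemporal features from a CNN-LSTM with spectral (STFT) features from a ResNet-18 through a Transformer encoder, reaching a mean recall of 98.85% on the CHB-MIT dataset and outperforming most existing methods.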

Comments IEEE International Instrumentation and Measurement Technology Conference (I2MTC) 2026

详情
AI中文摘要

癫痫是全球最常见的神经系统疾病之一,以反复发作为特征,严重影响生活质量。尽管诊断技术有所进步,但由于癫痫事件不可预测,减轻患者面临的风险仍然具有挑战性。准确预测癫痫发作有助于降低患者风险。本文提出EEG-FuseFormer,一种基于Transformer的特征融合框架,用于癫痫发作预测,该框架结合了从卷积神经网络-长短期记忆网络(CNN-LSTM)和ResNet-18网络中提取的中间特征。CNN-LSTM架构直接从原始信号中捕获时空特征,而ResNet-18从脑电图信号的短时傅里叶变换(STFT)表示中提取特征。使用Transformer编码器进行融合,并通过全连接密集层生成最终预测。使用CHB-MIT数据集验证所提模型。结果表明,所提模型实现了98.85%的平均召回率,优于大多数现有方法。本研究评估了所提特征融合模型在跨患者测试场景中的泛化能力。在跨患者验证框架内,对有限目标患者数据进行微调(目标适应)相比传统跨患者验证方法,获得了更高的召回率、精确率和F1分数。最后,在不同硬件平台上评估了模型的运行时计算复杂度,以突出性能与复杂度的权衡。

英文摘要

Epilepsy is one of the most common neurological disorders globally, characterized by recurring seizures and significantly impacting the quality of life. Despite advancements in diagnostic techniques, the mitigation of risks faced by epilepsy patients remains challenging due to the unpredictability of seizure events. An accurate forecast of seizure onset helps to reduce risks in epilepsy patients. In this paper, we propose EEG-FuseFormer, a transformer-based feature fusion framework for seizure-onset prediction that combines intermediate features extracted from Convolutional Neural Networks-Long Short-Term Memory (CNN-LSTM) and ResNet-18 networks. The CNN-LSTM architecture captures both spatial and temporal features directly from the raw signal, whereas the ResNet-18 extracts features from the Short-Time Fourier Transform (STFT) representation of the EEG signals. Fusion is carried out using a transformer encoder, and the final prediction is generated using fully connected dense layers. The CHB-MIT dataset was used to validate the proposed model. The results show that the proposed model achieves a mean recall of 98.85% and outperforms most of the state-of-the-art methods. This study evaluates the ability of the proposed feature fusion model to generalize in cross-patient testing scenarios. Fine-tuning pre-trained models on limited target patient data (target adaptation) within the cross-patient validation framework results in higher recall, precision, and F1-score metrics in comparison to the conventional cross-patient validation approach. Finally, the runtime-based computational complexity of the model is assessed across diverse hardware platforms to highlight the performance-complexity trade-off.

2606.01770 2026-06-04 cs.LG cs.AI 版本更新

Adaptive Auto-Harness: Sustained Self-Improvement for Agentic System Deployment on Open-Ended Task Streams

自适应自动框架:面向开放式任务流的智能体系统部署的持续自我改进

Zewen Liu, Zhan Shi, Yisi Sang, Bing He, Minhua Lin, Tianxin Wei, Dakuo Wang, Benoit Dumoulin, Wei Jin, Hanqing Lu

发表机构 * Emory University(埃默里大学) Amazon(亚马逊) The Pennsylvania State University(宾夕法尼亚州立大学) UIUC(伊利诺伊大学香槟分校) Northeastern University(东北大学)

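AI summary Proposes Adaptive Auto-Harness, which combines a stateful multi-agent evolver, a harness tree with solve-time routing, and human-steering hooks to counter the performance degradation of auto-harness systems on open-ended task streams, outperforming existing baselines across several streams.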

详情
AI中文摘要

自动框架系统(如A-Evolve、GEPA和Meta-Harness)通过从执行反馈中优化提示、技能、工具、记忆和支持基础设施来改进LLM智能体,但它们通常在固定的离线基准上进行评估。实际部署中呈现的是开放式任务流:历史记录无固定终点增长,异构任务需要不同的框架,问题分布随时间变化。这些挑战使得单一反复密集更新的框架变得脆弱,导致性能退化,准确率早期达到峰值后下降。这激发了具有任务自适应性的持续框架构建。我们引入了自适应自动框架(Adaptive Auto-Harness),一个针对此类流的框架和系统。该框架将到 oracle 框架的差距分解为进化损失和适应损失。系统通过状态化多智能体进化器、带求解时路由的框架树以及针对历史缺乏所需信号情况的人工引导钩子来解决这些损失。在预测市场、安全竞赛和事件预测流中,自适应自动框架优于五个现有的自动框架基线,消融实验将收益归因于更好的构建、路由或针对性的人工引导。代码可在 https://github.com/A-EVO-Lab/a-evolve/tree/release/adaptive-auto-harness 获取。

英文摘要

Auto-harness systems such as A-Evolve, GEPA, and Meta-Harness improve LLM agents by optimizing prompts, skills, tools, memories, and supporting infrastructure from execution feedback, but they are typically evaluated on fixed offline benchmarks. Real deployments instead present open-ended task streams: histories grow without a fixed endpoint, heterogeneous tasks require different harnesses, and problem distributions shift over time. These challenges make a single repeatedly and densely updated harness brittle, causing performance degradation as accuracy peaks early and then declines. This motivates sustained harness construction with task-wise adaptation. We introduce Adaptive Auto-Harness, a framework and system for such streams. The framework decomposes the gap to an oracle harness into evolution loss and adaptation loss. The system addresses these losses with a stateful multi-agent evolver, a harness tree with solve-time routing, and human-steering hooks for cases where history lacks the needed signal. Across prediction-market, security-competition, and event-forecasting streams, Adaptive Auto-Harness outperforms five existing auto-harness baselines, and ablations attribute gains to better construction, routing, or targeted human steering. Code is available at https://github.com/A-EVO-Lab/a-evolve/tree/release/adaptive-auto-harness.

2606.01537 2026-06-04 cs.CV cs.LG 版本更新

PaCX-MAE: Physiology-Augmented Chest X-Ray Masked Autoencoder

PaCX-MAE: 生理增强的胸部X光掩码自编码器

Yancheng Liu, Kenichi Maeda, Manan Pancholy

发表机构 * University of California, Berkeley(加州大学伯克利分校) University of Tokyo(东京大学) University of Michigan(密歇根大学)

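AI summary Proposes PaCX-MAE, a cross-modal distillation framework that injects physiological priors into chest X-ray encoders through a dual contrastive-predictive objective, improving physiology-dependent tasks while keeping inference unimodal.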

Comments Accepted at the ICML 2026 3rd Workshop on Multi-modal Foundation Models and Large Language Models for Life Sciences (FM4LS)

详情
AI中文摘要

临床诊断通常需要结合影像与生理测量,但部署的模型通常处理单模态数据。我们提出PaCX-MAE,一种跨模态蒸馏框架,将生理先验注入胸部X光(CXR)编码器,同时在推理时严格保持单模态。PaCX-MAE通过双对比预测目标增强域内掩码自编码,使CXR表示与配对的ECG和实验室嵌入对齐。在九个基准上的广泛评估表明,该方法在领域特定MAE上取得一致改进,特别是在依赖生理的任务上(例如,MedMod上AUROC提升2.7;VinDr上F1提升6.5)。该方法在1%标注数据下表现出高度标签效率,并保持解剖保真度,在分割任务上与MAE持平。零样本和注意力分析证实,PaCX-MAE成功学习关注生理指标,如心脏轮廓,这在标准视觉预训练中缺失。

英文摘要

Clinical diagnosis often requires combining imaging with physiological measurements, yet deployed models typically operate on unimodal data. We present PaCX-MAE, a cross-modal distillation framework that injects physiological priors into chest X-ray (CXR) encoders while remaining strictly unimodal at inference. PaCX-MAE augments in-domain masked autoencoding with a dual contrastive-predictive objective, aligning CXR representations with paired ECG and laboratory embeddings. Extensive evaluation across nine benchmarks demonstrates consistent improvements over domain-specific MAE, particularly on physiology-dependent tasks (e.g., +2.7 AUROC on MedMod; +6.5 F1 on VinDr). The method proves highly label-efficient in the 1% regime and preserves anatomical fidelity, achieving parity with MAE on segmentation tasks. Zero-shot and attention analyses confirm that PaCX-MAE successfully learns to attend to physiological indicators, such as the cardiac silhouette, absent in standard visual pretraining.

2606.01495 2026-06-04 cs.LG cs.CL 版本更新

CART: Context-Anchored Recurrent Transformer -- A Parameter-Efficient Architecture with Learned Stability

CART: 上下文锚定循环Transformer——一种具有学习稳定性的参数高效架构

Chad A. Capps

发表机构 * Independent Researcher(独立研究员)

AI总结 提出CART,一种通过共享核心块循环和冻结键值张量实现参数高效的语言模型,并引入线性时不变门控保持稳定性,实验表明在参数匹配时性能略低于密集基线。

Comments 31 pages, 4 figures. Code, training scripts, and the full experiment database (results.db) are available at https://github.com/ccapps42/CART

详情
AI中文摘要

我们提出CART(上下文锚定循环Transformer),一种参数高效的语言模型,它在深度上重复使用单个共享核心块R次。与先前每次迭代重新计算键值张量的循环Transformer不同,CART从多层前奏中一次性计算K和V,并通过多头潜在注意力让循环核心交叉关注这些冻结的张量。一个学习得到的线性时不变(LTI)门控保持循环稳定性:其谱半径在所有36个完全训练配置中稳定在窄带内(rho在[0.79, 0.83]之间)。我们在单个消费级GPU上分两个阶段评估CART:首先在3000步进行64配置筛选,然后对36个配置(P=6,R∈{6,8,10},三个种子)训练30500步(约10亿token)。在宽度d∈{256,512,768,1024}上,两个模式成立:前奏深度P主导循环次数R,并且R的第一阶段排名在完全训练时反转(在d≥512时R=6变为最佳)。在绑定d=1024的参数对比测试中,CART未能击败参数匹配的密集基线,在存储参数对比中损失1-2%,在有效参数对比中损失约10%。诊断消融将有效参数差距分为约5%来自权重共享和约5%来自异质的前奏/锚点/核心/尾声框架;循环核心机制(超连接、LTI门控、循环索引嵌入)单独来看是退化的。变R推理在训练R的两侧性能下降,这是该方案下测试时深度扩展的一个负面结果。

英文摘要

We present CART (Context-Anchored Recurrent Transformer), a parameter-efficient language model that reuses a single shared core block R times across depth. Unlike prior looped transformers that recompute key-value tensors at every iteration, CART computes K and V once from a multi-layer prelude and has the recurrent core cross-attend to those frozen tensors via multi-head latent attention. A learned Linear Time-Invariant (LTI) gate keeps the recurrence stable: its spectral radius settles in a narrow band (rho in [0.79, 0.83]) across all 36 fully-trained configurations. We evaluate CART on single consumer GPUs in two stages: a 64-configuration screen at 3,000 steps, then 36 configurations (P=6, R in {6,8,10}, three seeds) trained for 30,500 steps (~1B tokens). Two patterns hold across widths d in {256,512,768,1024}: prelude depth P dominates loop count R, and the Stage-1 ranking of R reverses at full training (R=6 becomes best at d>=512). At the binding d=1024 parameter-parity test, CART does not beat a parameter-matched dense baseline, losing by 1-2% at stored-parameter parity and by ~10% at effective-parameter parity. Diagnostic ablations split the effective-parameter gap into ~5% from weight sharing and a residual ~5% from the heterogeneous prelude/anchor/core/coda framing; the recurrent-core machinery (hyper-connections, LTI gate, loop-index embedding) is individually vestigial. Variable-R inference degrades on both sides of the trained R, a negative result for test-time depth scaling under this recipe.
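The stability claim rests on a standard fact about linear recurrences: with a fixed input, the update h ← A h + B x converges to a fixed point whenever the spectral radius of A is below 1. The sketch below assumes the gate acts as such a linear update, since the abstract does not give its exact form. The matrices `A` and `B`, the width `d = 8`, and the target radius 0.8 (picked from inside the reported [0.79, 0.83] band) are illustrative assumptions.

```python
import numpy as np

def spectral_radius(A):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(A))))

rng = np.random.default_rng(0)
d = 8
M = rng.standard_normal((d, d))
A = 0.8 * M / spectral_radius(M)        # rescale so that rho(A) = 0.8
B = 0.1 * rng.standard_normal((d, d))
x = rng.standard_normal(d)              # fixed input, standing in for the frozen anchor

# Run the recurrence for many loop iterations.
h = np.zeros(d)
for _ in range(200):
    h = A @ h + B @ x

# Closed-form fixed point h* = (I - A)^{-1} B x, which exists because rho(A) < 1.
h_star = np.linalg.solve(np.eye(d) - A, B @ x)
rho = spectral_radius(A)
```

With a radius of 0.8 the distance to the fixed point shrinks roughly by that factor per loop, so a handful of iterations already lands close to `h_star`; a radius at or above 1 would remove that guarantee.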

2606.00732 2026-06-04 cs.AI cs.LG 版本更新

SHARP: Sleep-based Hierarchical Accelerated Replay for Long Range Non-Stationary Temporal Pattern Recognition

SHARP: 基于睡眠的分层加速重放用于长程非平稳时间模式识别

Jayanta Dey, Shikhar Srivastava, Itamar Lerner, Christopher Kanan, Dhireesha Kudithipudi

发表机构 * Department of Computer Engineering, University of Texas at San Antonio, USA(德克萨斯大学圣安东尼奥分校计算机工程系) Department of Computer Science, University of Rochester, USA(罗切斯特大学计算机科学系) Department of Psychology, University of Texas at San Antonio, USA(德克萨斯大学圣安东尼奥分校心理学系)

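AI summary Proposes SHARP, a framework that splits temporal learning into a memory module and a pattern-recognition module and adds offline sleep phases that replay temporally structured memories in accelerated form, enabling efficient learning of long-range non-stationary sequence patterns.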

详情
AI中文摘要

学习长程非平稳时间模式仍然是现代序列模型的核心挑战,特别是在严格的流式设置中。在这些设置中,数据按顺序到达,必须单次处理,不能同时回顾过去的观测。标准架构,包括循环神经网络和变换器,受到截断时间反向传播或显式输入窗口长度的限制,无法进行长程信用分配。为了解决这些限制,我们提出了SHARP(基于睡眠的分层加速重放),一个将时间学习分解为两个互补组件的框架:一个累积过去输入的结构化历史的记忆模块,以及一个在该记忆上操作的模式识别模块。这种分离通过消除跨多步时间反向传播进行长程信用分配的需求,实现了对非平稳动态的资源高效和计算高效适应。受啮齿动物在慢波睡眠期间观察到的加速重放启发,SHARP引入了离线(睡眠)阶段,其中时间结构的记忆痕迹以加速形式重放并整合到更高层次的记忆表示中,从而改善长程上下文保留。通过受控模拟和消融研究,我们表征了所提出框架的关键属性。在text8和PG-19等基准数据集上,我们证明SHARP通过保留先前见过数据的下一个令牌预测性能,同时继续从当前流中学习并泛化到未来未见数据,改进了循环基线。这些增益得益于其分层结构,该结构以线性时间计算成本实现了指数级增长的有效时间上下文。

英文摘要

Learning long-range non-stationary temporal patterns remains a core challenge for modern sequence models, particularly in strict streaming settings. In these settings, data arrive sequentially and must be processed in a single pass without simultaneously revisiting past observations. Standard architectures, including recurrent neural networks and transformers, are constrained in long-range credit assignment by either the truncated backpropagation-through-time horizon or an explicit input window length. To address these limitations, we propose SHARP (Sleep-based Hierarchical Accelerated Replay), a framework that decomposes temporal learning into two complementary components: a memory module that accumulates a structured history of past inputs, and a pattern-recognition module that operates over this memory. This separation enables resource- and compute-efficient adaptation to non-stationary dynamics by eliminating the need for backpropagation through time across many steps for long-range credit assignment. Inspired by the accelerated replay observed in rodents during slow-wave sleep, SHARP incorporates offline (sleep) phases in which temporally structured memory traces are replayed in an accelerated form and integrated into higher-level memory representations, improving long-range context retention. Through controlled simulations and ablation studies, we characterize the key properties of the proposed framework. On benchmark datasets such as text8 and PG-19, we demonstrate that SHARP improves over recurrent baselines by retaining next-token predictive performance on previously seen data while continuing to learn from the current stream and generalizing to future unseen data. These gains are enabled by its hierarchical structure, which yields an exponentially increasing effective temporal context with only linear-time computational cost.

2606.00260 2026-06-04 cs.CV cs.LG 版本更新

LastAct: Trajectory-Guided Latest-Activity Localization for Real-Time Smart-Home Activity Recognition

LastAct: 轨迹引导的最新活动定位用于实时智能家居活动识别

Zishuai Liu, Ruili Fang, Jin Lu, Fei Dou

发表机构 * School of Computing, University of Georgia(佐治亚大学计算学院)

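AI summary Proposes LastAct, a framework that uses trajectory image sequences and a boundary localizer to handle boundary contamination in sliding windows for real-time smart-home activity recognition.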

详情
AI中文摘要

基于环境传感器的人类活动识别(HAR)支持健康监测和辅助生活等智能家居应用。然而,在实际部署中,传感器事件以连续流的形式到达,活动边界未知。因此,滑动窗口推理会产生许多跨越转换并包含混合活动的窗口,造成边界污染,违反了大多数基准和模型使用的预分割实例假设。此外,许多管道通过将传感器ID视为独立标记来未充分利用空间上下文。我们提出了LastAct,一个面向轨迹的流式智能家居HAR框架,旨在处理混合窗口下的最新活动,同时显式建模空间结构。LastAct将传感器事件投影到家庭平面图上,形成保持空间连续性的布局对齐轨迹图像序列。一个轻量级门控识别受污染的窗口,边界定位器估计最后一个转换,从而实现边界引导的掩码,强调边界后的证据并抑制过时的上下文。为了提高效率,我们重用预计算的布局对齐模板缓存以避免重复渲染。实验表明,在四个公开的智能家居数据集上,采用接近真实的混合活动协议,LastAct在纯窗口上达到竞争性或更优的性能,并在交叉/混合窗口上获得显著的Macro-F1增益,展示了在接近真实的滑动窗口机制下更强的鲁棒性。

英文摘要

Human Activity Recognition (HAR) from ambient sensors enables smart-home applications such as health monitoring and assisted living. In realistic deployments, however, sensor events arrive as a continuous stream and activity boundaries are unknown. Sliding-window inference therefore produces many windows that straddle transitions and contain mixed activities, creating boundary contamination that violates the pre-segmented instance assumption used by most benchmarks and models. Moreover, many pipelines under-use spatial context by treating sensor IDs as independent tokens. We present LastAct, a trajectory-centric framework for streaming smart-home HAR that targets the most recent activity under mixed windows while explicitly modeling spatial structure. LastAct projects sensor events onto the home floorplan to form a layout-aligned trajectory image sequence that preserves spatial continuity. A lightweight gate identifies contaminated windows, and a boundary localizer estimates the last transition to enable boundary-guided masking that emphasizes post-boundary evidence and suppresses stale context. For efficiency, we reuse a precomputed layout-aligned template cache to avoid repeated rendering. Empirically, across four public smart-home datasets under near-realistic mixed-activity protocols, LastAct achieves competitive or superior performance on pure windows and yields substantial Macro-F1 gains on cross/mixed windows, demonstrating improved robustness under near-realistic sliding-window regimes.

2605.30705 2026-06-04 cs.CV cs.LG 版本更新

Equivariant Latent Alignment via Flow Matching under Group Symmetries

群对称下通过流匹配的等变潜在对齐

Sunghyun Kim, Jaehoon Hahm, Jeongwoo Shin, Joonseok Lee

发表机构 * University of Illinois Urbana-Champaign, Illinois, USA(伊利诺伊大学厄巴纳-香槟分校) Seoul National University, Seoul, Korea(首尔国立大学)

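AI summary To address equivariance misalignment in the latent spaces of existing methods, proposes Residual Latent Flow, a flow-based framework that corrects misaligned latents to strengthen equivariant consistency under rotation groups SO(n) and improve novel view synthesis quality.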

详情
AI中文摘要

几何感知生成模型和新视角合成方法在视觉保真度和一致性方面展现出强大潜力。同时,等变表示学习已成为构建潜在空间的有力框架,其中分析已知的群变换可以直接作用,捕捉数据中的几何结构,并增强新视角合成的可解释性和泛化性。然而,我们发现现有方法常遭受潜在错位问题,即潜在空间中预期的群作用与实际所需的变换之间存在差异。因此,学习到的潜在表示往往无法一致地保持底层群对称性所施加的等变关系。为解决此问题,我们提出残差潜在流,一种基于流的框架,用于纠正错位的潜在表示,从而提高对底层等变关系的遵从性。我们的综合实验表明,在旋转群SO(n)下,我们的方法显著减少了潜在错位,并提高了新视角合成的质量。

英文摘要

Geometry-aware generative models and novel view synthesis approaches have shown strong potential in visual fidelity and consistency. In parallel, equivariant representation learning has emerged as a powerful framework for constructing latent spaces where analytically known group transformations could act directly, capturing geometric structure in data and enhancing both interpretability and generalization in novel view synthesis. However, we identify that existing approaches often suffer from latent misalignment, a discrepancy between the intended group action and the actually required transformations in the latent space. Consequently, the learned latents often fail to consistently preserve the equivariant relations imposed by the underlying group symmetry. To address this, we propose Residual Latent Flow, a flow-based framework that corrects the misaligned latents, thereby improving compliance with the underlying equivariance relation. Our comprehensive experiments show that our method significantly reduces latent misalignment and improves novel view synthesis quality, under rotation groups SO(n).

2605.24358 2026-06-04 cs.LG cs.AI 版本更新

Treatment Effect Estimation with Differentiated Networked Effect on Graph Data

图数据上具有差异化网络效应的处理效应估计

Xiaofeng Lin, Han Bao, Hisashi Kashima

发表机构 * Kyoto University(京都大学) The Institute of Statistical Mathematics(统计数学研究所) Tohoku University(东北大学) RIKEN AIP(理化学研究所AIP)

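AI summary Targets individual treatment effect estimation on graph data, where neighbors interfere and their networked effects are differentiated; proposes an interference model combining partial attention mechanisms with a message amplifier to capture differences in neighbor importance and scale and improve estimation accuracy.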

Comments Accepted by the research track of the KDD 2026 conference

详情
AI中文摘要

从观测图数据中估计个体处理效应(ITE)对于商业和医学等领域的决策至关重要。由于干扰的存在,该任务具有挑战性,因为个体结果可能受到其邻居的处理和协变量的影响。现有方法尝试对这种干扰进行建模以实现准确的ITE估计。然而,一个关键问题常常被忽视:差异化网络效应(DNE),即由具有不同重要性和规模的邻居组成的局部网络所产生的影响。捕获DNE至关重要;否则,由于对干扰的错误刻画,我们将得到不精确的ITE估计,从而导致错误的决策。为了解决这一挑战,我们提出了一种新颖的干扰建模机制,该机制结合了两个部分注意力机制和一个消息放大器。部分注意力机制自动估计不同邻居在干扰中的重要性,而消息放大器根据邻居的规模调整干扰建模机制的结果,所有这些使得模型能够捕获DNE。在三个真实世界图上的实验表明,我们的方法在从图数据估计ITE方面优于现有方法,这证实了显式捕获DNE的重要性。

英文摘要

Estimating individual treatment effect (ITE) from observational graph data is crucial for decision-making in fields such as commerce and medicine. This task is challenging due to interference, where individual outcomes can be influenced by the treatments and covariates of their neighbors. Existing methods attempt to model such interference for accurate ITE estimation. However, a critical issue is often overlooked: differentiated networked effect (DNE), an effect caused by local networks consisting of neighbors with varying importance and scales. Capturing DNE is vital; otherwise, we will end up with imprecise ITE estimation due to an erroneous characterization of interference, which can result in misguided decisions. To address this challenge, we propose a novel interference modeling mechanism that incorporates two partial attention mechanisms and a message amplifier. The partial attention mechanisms automatically estimate the importance of different neighbors in contributing to interference, while the message amplifier adjusts the results of the interference modeling mechanism based on the scale of neighbors, all of which enables the model to capture DNE. Experiments on three real-world graphs demonstrate that our method outperforms existing approaches for ITE estimation from graph data, which corroborates the importance of explicitly capturing DNE.

2605.26814 2026-06-04 cond-mat.str-el cs.LG physics.comp-ph 版本更新

Neural Autoregressive Control Variates for the Quantum Monte Carlo Sign Problem

量子蒙特卡洛符号问题的神经自回归控制变量

Bei Qiao, Lei Wang

发表机构 * Beijing National Laboratory for Condensed Matter Physics and Institute of Physics, Chinese Academy of Sciences, Beijing 100190, China(北京凝聚态物理国家实验室和物理研究所,中国科学院,北京100190,中国) University of Chinese Academy of Sciences, Beijing 100049, China(中国科学院大学,北京100049,中国)

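AI summary Trains a pair of autoregressive models to construct zero-mean control variates that mitigate the sign problem in quantum Monte Carlo simulations; on the triangular-lattice Heisenberg antiferromagnet, the standard error of the average sign falls by up to an order of magnitude and that of the energy estimator by a factor of three to five.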

Comments 19 pages, 9 figures

详情
AI中文摘要

我们训练一对自回归模型来构造零均值控制变量,以缓解量子蒙特卡洛模拟中的符号问题。这两个自回归网络被限制在严格不相交支撑的正负符号扇区内,并且每个网络在其扇区内精确归一化。因此,它们的差在结构上具有零均值,提供了一个无偏的辅助可观测量,其与符号估计量的相关性控制方差减少。我们在随机级数展开框架内实现该方法,通过开发增量环拓扑更新将其扩展到受挫晶格。符号遍历采样通过扭转通道实现,这是非二分晶格上唯一的符号改变机制。我们将控制变量实现为自回归变换器,并带有序列结束奇偶掩码以强制精确的符号扇区分辨率,同时将增量环计数变化和累积受挫奇偶性作为拓扑特征纳入。在三角晶格海森堡反铁磁体上,我们在小$N$极限下对该方法进行基准测试。控制变量将平均符号的标准误差降低了一个数量级,并将能量估计量的标准误差降低了三到五倍,即使在平均符号低于$10^{-3}$时仍然有效。这项工作奠定了框架并提供了原理验证,表明自回归控制变量可以有效缓解符号问题。扩展到更大系统并采用物理信息架构是未来工作的主题。

英文摘要

We train a pair of autoregressive models to construct zero-mean control variates to mitigate the sign problem in quantum Monte Carlo simulations. The two autoregressive networks are confined to the positive- and negative-sign sectors with strictly disjoint support, and each is exactly normalized over its sector. Their difference is therefore structurally zero-mean, providing an unbiased auxiliary observable whose correlation with the sign estimator controls the variance reduction. We implement the method within the stochastic series expansion framework, which we extend to frustrated lattices by developing an incremental loop-topology update. Sign-ergodic sampling is achieved through a twist channel, which is the unique sign-changing mechanism on non-bipartite lattices. We implement the control variates as autoregressive transformers with an end-of-sequence parity mask that enforces exact sign-sector resolution, while the incremental loop-count change and cumulative frustration parity are incorporated as topological features. On the triangular-lattice Heisenberg antiferromagnet, we benchmark the method in the small-$N$ limit. The control variate reduces the standard error of the average sign by up to an order of magnitude and that of the energy estimator by a factor of three to five, remaining effective even when the average sign drops below $10^{-3}$. This work lays out the framework and provides a proof-of-principle demonstration that autoregressive control variates can effectively mitigate the sign problem. Scaling to larger systems with physics-informed architectures is the subject of future work.
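The variance-reduction principle behind a zero-mean control variate can be shown on a toy integral. The sketch below is a generic Monte Carlo illustration, not the paper's neural construction: the observable `f`, the control variate `g`, and the uniform sampling distribution are assumptions chosen so that the exact answer (e - 1) is known.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, size=100_000)

f = np.exp(x)   # target observable, with exact mean e - 1
g = x - 0.5     # control variate whose mean is exactly zero by construction

# Variance-minimizing coefficient c = Cov(f, g) / Var(g).
# Fitting c on the same samples adds a small O(1/n) bias, ignored here.
cov = np.cov(f, g)
c_opt = cov[0, 1] / cov[1, 1]

f_cv = f - c_opt * g                      # same expectation, lower variance
plain_estimate = f.mean()
cv_estimate = f_cv.mean()
variance_ratio = f_cv.var() / f.var()     # equals 1 - corr(f, g)^2 at the optimum
```

Because `g` has zero mean, subtracting any multiple of it leaves the expectation unchanged, and the stronger its correlation with `f`, the smaller `variance_ratio` becomes. This is the sense in which the abstract says the correlation with the sign estimator controls the variance reduction.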

2605.30120 2026-06-04 cs.IR cs.AI cs.LG 版本更新

No More K-means: Single-Stage Sparse Coding for Efficient Multi-Vector Retrieval

不再需要K-means:用于高效多向量检索的单阶段稀疏编码

Lixuan Guo, Yifei Wang, Tiansheng Wen, Aosong Feng, Stefanie Jegelka, Chenyu You

发表机构 * University of California, Berkeley(加州大学伯克利分校) Stanford University(斯坦福大学)

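AI summary To remove the indexing latency and semantic loss caused by K-means clustering in multi-vector retrieval, proposes Single-stage Sparse Retrieval (SSR), which uses a sparse autoencoder to project token embeddings into high-dimensional sparse representations and retrieves through an inverted index; on the BEIR benchmark it cuts indexing time by 15x relative to ColBERTv2, halves retrieval latency, and improves retrieval performance.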

Comments Accepted by ICML2026

详情
AI中文摘要

以ColBERT为代表的多向量检索(MVR)模型通过保留细粒度的词元级交互,在检索准确性上树立了新标杆。然而,这种粒度带来了存储和检索效率的瓶颈:为了管理十亿级词元向量的巨大内存占用和计算开销,最先进的系统被迫依赖激进的降维和复杂的聚类(例如K-means)。这种妥协引入了两个关键限制:大规模语料库聚类的过度索引延迟以及压缩固有的语义信息损失。在本文中,我们提出了单阶段稀疏检索(SSR),这是一种范式转变,用高效的稀疏编码取代了昂贵的聚类。我们不将特征压缩为低维稠密向量,而是利用稀疏自编码器(SAE)将词元嵌入投影到高维但高度稀疏的表示中。这种转换使我们能够完全绕过向量聚类,并利用倒排索引实现精确、高吞吐量的检索。在BEIR基准上的大量实验表明,SSR实现了“三连胜”的改进:与ColBERTv2相比,索引时间减少了15倍,检索延迟减半,同时检索性能优于领先的基线方法。

英文摘要

Multi-vector retrieval (MVR) models, exemplified by ColBERT, have established new benchmarks in retrieval accuracy by preserving fine-grained token-level interactions. However, this granularity imposes prohibitive storage and retrieval efficiency bottlenecks: to manage the immense memory footprint and computational overhead of billion-scale token vectors, state-of-the-art systems are forced to rely on aggressive dimension reduction and complex clustering (e.g., K-means). This compromise introduces two critical limitations: excessive indexing latency of clustering large-scale corpora and semantic information loss inherent to compression. In this paper, we propose Single-stage Sparse Retrieval (SSR), a paradigm shift that replaces expensive clustering with efficient sparse coding. Instead of compressing features into low-dimensional dense vectors, we utilize a Sparse Autoencoder (SAE) to project token embeddings into a high-dimensional but highly sparse representation. This transformation enables us to bypass vector clustering entirely and leverage inverted indexing for precise, high-throughput retrieval. Extensive experiments on the BEIR benchmark demonstrate that SSR achieves a "trifecta" of improvements: it reduces indexing time by 15x compared to ColBERTv2, halves retrieval latency, and simultaneously improves retrieval performance over leading baselines.
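The pairing of sparse codes with an inverted index can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the SSR implementation: a random matrix `W` stands in for a trained SAE encoder, top-k selection is one common way to enforce sparsity, each document is reduced to a single vector rather than a set of token vectors, and all function names are hypothetical.

```python
import numpy as np
from collections import defaultdict

def topk_sparse_code(emb, W, k):
    """ReLU(W @ emb) keeping the k largest activations, L2-normalized, as {dim: value}."""
    act = np.maximum(W @ emb, 0.0)
    top = [int(j) for j in np.argsort(act)[-k:] if act[j] > 0.0]
    norm = np.sqrt(sum(act[j] ** 2 for j in top))
    return {j: float(act[j] / norm) for j in top} if norm > 0 else {}

def build_inverted_index(doc_codes):
    """Map each active dimension to the (doc_id, value) postings that use it."""
    index = defaultdict(list)
    for doc_id, code in enumerate(doc_codes):
        for dim, val in code.items():
            index[dim].append((doc_id, val))
    return index

def search(query_code, index, n_docs):
    """Sparse dot products: only postings of the query's active dimensions are visited."""
    scores = np.zeros(n_docs)
    for dim, q_val in query_code.items():
        for doc_id, d_val in index.get(dim, []):
            scores[doc_id] += q_val * d_val
    return scores

rng = np.random.default_rng(0)
d_in, d_code, k, n_docs = 16, 256, 8, 50
W = rng.standard_normal((d_code, d_in))   # random stand-in for a trained SAE encoder
docs = rng.standard_normal((n_docs, d_in))
doc_codes = [topk_sparse_code(e, W, k) for e in docs]
index = build_inverted_index(doc_codes)

query_code = topk_sparse_code(docs[3], W, k)  # query with document 3's own embedding
scores = search(query_code, index, n_docs)
best = int(np.argmax(scores))
```

No clustering step appears anywhere: indexing is a single pass that appends postings, and a query touches only the posting lists of its few active dimensions. A token-level multi-vector variant would repeat this lookup per query token and aggregate the results.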

2511.05924 2026-06-04 cs.LG 版本更新

DiScoFormer: Plug-In Density and Score Estimation with Transformers

DiScoFormer: 基于Transformer的即插即用密度与得分估计

Vasily Ilin, Peter Sushko, Ranjay Krishna

发表机构 * Department of Mathematics, University of Washington, Seattle, USA Math AI Lab, University of Washington, Seattle, USA Allen Institute for Artificial Intelligence, Seattle, USA Paul G. Allen School of Computer Science & Engineering, University of Washington, Seattle, USA

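AI summary: Proposes DiScoFormer, a train-once, infer-anywhere equivariant Transformer that uses self-attention to estimate densities and scores across distributions and sample sizes; the authors prove that it generalizes kernel density estimation and report that it outperforms KDE.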

Comments Accepted in ICML 2026 (oral)

详情
AI中文摘要

从样本中估计概率密度及其得分仍然是生成建模、贝叶斯推断和动力学理论中的核心问题。现有方法分为两类:经典核密度估计(KDE)可泛化到不同分布,但受维度灾难影响;现代神经得分模型精度高,但需为每个目标分布重新训练。我们提出DiScoFormer(密度与得分Transformer),一种“一次训练,任意推理”的等变Transformer,将独立同分布样本映射到密度值和得分向量,可泛化到不同分布和样本规模。理论上,我们证明自注意力可以恢复归一化KDE,从而建立其作为核方法函数泛化的地位;实验上,单个注意力头学习多尺度、类核的行为。该模型在密度估计上收敛更快、精度高于KDE,并为得分去偏KDE、Fisher信息计算和Fokker-Planck型偏微分方程提供高保真即插即用得分预言机。

英文摘要

Estimating probability density and its score from samples remains a core problem in generative modeling, Bayesian inference, and kinetic theory. Existing methods are bifurcated: classical kernel density estimators (KDE) generalize across distributions but suffer from the curse of dimensionality, while modern neural score models achieve high precision but require retraining for every target distribution. We introduce DiScoFormer (Density and Score Transformer), a "train-once, infer-anywhere" equivariant Transformer that maps i.i.d. samples to both density values and score vectors, generalizing across distributions and sample sizes. Analytically, we prove that self-attention can recover normalized KDE, establishing it as a functional generalization of kernel methods; empirically, individual attention heads learn multi-scale, kernel-like behaviors. The model converges faster and achieves higher precision than KDE for density estimation, and provides a high-fidelity plug-in score oracle for score-debiased KDE, Fisher information computation, and Fokker-Planck-type PDEs.
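The link between attention and kernel methods can be illustrated with a standard identity, shown here as a small Python sketch rather than as the paper's proof. For a 1-D Gaussian KDE with bandwidth `h`, the score (the derivative of the log density) equals a single softmax-attention readout whose logits are the negative squared distances scaled by `2*h*h` and whose values are `(x_i - x) / h**2`. The function names are hypothetical.

```python
import math

def softmax(scores):
    m = max(scores)                       # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def kde_log_density(x, samples, h):
    """Log of a 1-D Gaussian kernel density estimate with bandwidth h."""
    norm = len(samples) * h * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-(x - s) ** 2 / (2.0 * h * h)) for s in samples)
    return math.log(total / norm)

def attention_score(x, samples, h):
    """d/dx log p(x) for the same KDE, written as one attention readout."""
    logits = [-(x - s) ** 2 / (2.0 * h * h) for s in samples]  # query-key scores
    weights = softmax(logits)                                   # normalized kernel weights
    values = [(s - x) / (h * h) for s in samples]               # per-sample value
    return sum(w * v for w, v in zip(weights, values))

samples = [-1.0, 0.2, 0.5, 2.0]
score_at_x = attention_score(0.3, samples, h=0.7)
```

The softmax weights are exactly the normalized kernel weights, so a fixed-bandwidth KDE is one special case of what an attention head can represent; a learned head is free to use other similarity functions and scales.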

2504.12988 2026-06-04 cs.LG stat.ML 版本更新

Why Ask One When You Can Ask $k$? Learning-to-Defer to the Top-$k$ Experts

为何只问一个专家?学习将任务推迟到Top-$k$专家

Yannis Montreuil, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

发表机构 * School of Computing, National University of Singapore(新加坡国立大学计算机学院) Fédération ENAC, ISAE-SUPAERO, ONERA, Université de Toulouse(ENAC联合会、ISAE-SUPAERO、ONERA、图卢兹大学) Agency for Science, Technology and Research, Institute for Infocomm Research(科技研究局、信息通信研究所)

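AI summary: Proposes a Top-$k$ Learning-to-Defer framework that allocates each query to the $k$ best experts to enable multi-expert collaboration, and develops a surrogate loss independent of $k$ that achieves better accuracy-cost trade-offs.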

详情
AI中文摘要

现有的学习推迟(L2D)框架仅限于单专家推迟,迫使每个查询仅依赖一个专家,无法利用集体专业知识。我们首次提出了Top-$k$学习推迟框架,将查询分配给成本效益最高的$k$个实体。我们的公式统一并严格推广了先前的方法,包括单阶段和两阶段机制、选择性预测以及经典级联。特别地,它将通常的Top-1推迟规则作为特例,同时当$k>1$时能够与多个专家进行原则性协作。我们进一步提出了Top-$k(x)$学习推迟,这是一种自适应变体,根据输入难度、专家质量和咨询成本学习每个查询的最佳专家数量。为了实现实际学习,我们开发了一种新颖的替代损失函数,该函数在单阶段设置中是贝叶斯一致且$\mathcal{H}_h$一致的,在两阶段设置中是$(\mathcal{H}_r,\mathcal{H}_g)$一致的。关键是,该替代损失与$k$无关,允许一次性学习单个策略并灵活地部署到不同的$k$值。在两个机制上的实验表明,Top-$k$和Top-$k(x)$在准确性和成本之间实现了更优的权衡,为L2D中的多专家推迟开辟了新方向。

英文摘要

Existing Learning-to-Defer (L2D) frameworks are limited to single-expert deferral, forcing each query to rely on only one expert and preventing the use of collective expertise. We introduce the first framework for Top-$k$ Learning-to-Defer, which allocates queries to the $k$ most cost-effective entities. Our formulation unifies and strictly generalizes prior approaches, including the one-stage and two-stage regimes, selective prediction, and classical cascades. In particular, it recovers the usual Top-1 deferral rule as a special case while enabling principled collaboration with multiple experts when $k>1$. We further propose Top-$k(x)$ Learning-to-Defer, an adaptive variant that learns the optimal number of experts per query based on input difficulty, expert quality, and consultation cost. To enable practical learning, we develop a novel surrogate loss that is Bayes-consistent, $\mathcal{H}_h$-consistent in the one-stage setting, and $(\mathcal{H}_r,\mathcal{H}_g)$-consistent in the two-stage setting. Crucially, this surrogate is independent of $k$, allowing a single policy to be learned once and deployed flexibly across $k$. Experiments across both regimes show that Top-$k$ and Top-$k(x)$ deliver superior accuracy-cost trade-offs, opening a new direction for multi-expert deferral in L2D.

2410.15761 2026-06-04 cs.CL cs.LG stat.ML 版本更新

Optimal Query Allocation in Extractive QA with LLMs: A Learning-to-Defer Framework with Theoretical Guarantees

基于LLM的抽取式问答中的最优查询分配:一个具有理论保证的学习-推迟框架

Yannis Montreuil, Shu Heng Yeo, Axel Carlier, Lai Xing Ng, Wei Tsang Ooi

发表机构 * School of Computing, National University of Singapore(新加坡国立大学计算机学院) Fédération ENAC ISAE-SUPAERO ONERA, Université de Toulouse, France(法国图卢兹大学ENAC ISAE-SUPAERO ONERA联合体) Institute for Infocomm Research (A*STAR), Singapore(新加坡信息与通信研究院(A*STAR)) IPAL, IRL 2955, Singapore(新加坡IPAL实验室)

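AI summary: Proposes a Learning-to-Defer framework that allocates queries to specialized experts, optimizing computational efficiency while maintaining high-confidence predictions; evaluations on SQuADv1, SQuADv2, and TriviaQA show improved answer reliability and lower computational overhead.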

Comments 25 pages, 17 main paper

详情
AI中文摘要

大型语言模型在生成任务中表现出色,但在结构化文本选择(特别是抽取式问答)中效率低下。这一挑战在资源受限环境中被放大,因为部署多个专门模型处理不同任务是不切实际的。我们提出一个学习-推迟框架,将查询分配给专门专家,确保高置信度预测的同时优化计算效率。我们的方法整合了一个原则性的分配策略,并提供了关于最优推迟的理论保证,以平衡性能和成本。在SQuADv1、SQuADv2和TriviaQA上的实证评估表明,我们的方法增强了答案可靠性,同时显著降低了计算开销,使其非常适合可扩展且高效的EQA部署。

英文摘要

Large Language Models excel in generative tasks but exhibit inefficiencies in structured text selection, particularly in extractive question answering. This challenge is magnified in resource-constrained environments, where deploying multiple specialized models for different tasks is impractical. We propose a Learning-to-Defer framework that allocates queries to specialized experts, ensuring high-confidence predictions while optimizing computational efficiency. Our approach integrates a principled allocation strategy with theoretical guarantees on optimal deferral that balances performance and cost. Empirical evaluations on SQuADv1, SQuADv2, and TriviaQA demonstrate that our method enhances answer reliability while significantly reducing computational overhead, making it well-suited for scalable and efficient EQA deployment.

2605.29280 2026-06-04 cs.LG cs.AI cs.IR 版本更新

LoopFM: Learning frOm HistOrical RePresentations of Foundation Model for Recommendation

LoopFM:从基础模型的历史表示中学习用于推荐

Shali Jiang, Hua Zheng, Boyang Liu, Laming Chen, Kenny Lov, Chuanqi Xu, Lisang Ding, Qinghai Zhou, Can Cui, Xiaolong Liu, Xiaoyi Liu, Yasmine Badr, Xin Xu, Jiyan Yang, Ellie Dingqiao Wen, Gerard Jonathan Mugisha Akkerhuis, Chenxiao Guan, Rong Jin, Ruichao Qiu, Xian Chen, Shifu Xu, Zhehui Zhou, Ping Chen, Rui Yang, Haicheng Chen, Xiangge Meng, Song Zhou, Dharak Kharod, Shuyu Xu, Qiang Jin, Qiao Yang, Wankun Zhu, Qin Huang, Yuzhen Huang, Darren Liu, Parish Aggarwal, Hui Zhou, Erzhuo Wang, Shuo Chang, Xiaorui Gan, Wenlin Chen, Santanu Kolay, Huayu Li

发表机构 * Meta

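AI summary: To address the diminishing transfer ratio caused by passing only a scalar in knowledge distillation, proposes LoopFM, a framework that feeds the foundation model's intermediate embeddings to downstream vertical models as input features for high-bandwidth knowledge transfer, with theoretical and experimental support for its effectiveness.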

Comments Shali Jiang, Hua Zheng, Boyang Liu contributed equally to this work

详情
AI中文摘要

知识蒸馏(KD)将大型基础模型(FM)的单个标量预测传递给紧凑的垂直模型(VM),但由于单个标量无法传达较大FM学习的丰富中间知识,导致转移率(VM捕获的FM改进比例)下降。为了解决这一瓶颈,我们提出了LoopFM(从FM的历史表示中学习),该框架通过将FM中间嵌入结构化为下游VM的输入特征(例如,用户历史序列)来打开高带宽传输通道,无需在服务时进行实时FM推理,也无需FM和VM之间的架构耦合。我们为LoopFM提供了理论框架,包括增益分解和转移率分析。在三个公开基准上,LoopFM展示了强大的AUC改进(例如,在淘宝广告上提高6%以上)以及与KD互补的知识转移能力。在工业规模系统(数十亿样本、万亿参数FM)上,LoopFM在KD基础上将知识转移率大约翻倍,在Y1H1中实现了+0.5%的转化改进,在Y1H2中分别从两次单独发布实现了+1.03%和+1.22%的转化改进。

英文摘要

Knowledge distillation (KD) transfers a single scalar prediction from a large foundation model (FM) to compact vertical models (VMs) and suffers from a diminishing transfer ratio (the fraction of FM improvement captured by the VM), because a single scalar cannot convey the rich intermediate knowledge that larger FMs learn. To address this bottleneck, we propose LoopFM (Learning frOm HistOrical RePresentations of FM), a framework that opens a high-bandwidth transfer channel by structuring FM intermediate embeddings as input features (e.g., user history sequence) for downstream VMs, without requiring real-time FM inference at serving time or architectural coupling between FM and VM. We provide a theoretical framework for LoopFM with a gain decomposition and transfer-ratio analysis. On three public benchmarks, LoopFM demonstrates strong AUC improvements (e.g., 6%+ on TaobaoAd) and knowledge transfer capability that is complementary to KD. On industrial-scale systems (billions of examples, trillion-parameter FMs), LoopFM approximately doubles the knowledge transfer ratio on top of KD, delivering a +0.5% conversion improvement in the first half after its initial launch, and +1.03% and +1.22% conversion improvements from two individual launches in the subsequent half.

2605.29076 2026-06-04 cs.CL cs.AI cs.LG 版本更新

Structured Prompt Optimization Meets Reinforcement Learning for Global and Local Interpretability over Complex Text

结构化提示优化结合强化学习实现复杂文本的全局与局部可解释性

Tianyang Zhou, Wenbo Chen, Pierre Jinghong Liang, Leman Akoglu

发表机构 * Carnegie Mellon University(卡内基梅隆大学) Amazon(亚马逊)

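AI summary: Proposes the eXTC framework, which combines structured prompt optimization, SOP-grounded reasoning distillation, and reinforcement learning, and significantly outperforms existing paradigms in classification performance and explanation quality.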

详情
AI中文摘要

LLMs在文本分类上取得了进展,但现有范式面临权衡:监督(仅标签)微调可扩展,但对复杂文本推理有限且缺乏模型透明度;离散提示优化提供可读指令,但性能和可扩展性不佳。我们引入eXTC(可解释文本分类器),包含三个渐进阶段:(1)通过新的结构化提示优化算法学习自然语言的标准操作程序(SOP或规则手册);(2)从大型教师LLM到紧凑LM的基于SOP的推理蒸馏;(3)通过强化学习扩展超出初始SOP的推理能力。该设计使eXTC能够(i)通过紧凑LM实现快速推理,(ii)提供推理时的局部推理轨迹,以及其学习领域规则的全局模块化解释,同时(iii)在分类性能和解释质量上显著优于现有范式,并逐步提升。

英文摘要

LLMs have advanced text classification, yet existing paradigms face a trade-off: supervised (label only) fine-tuning is scalable but offers limited reasoning on complex text and lacks broader model transparency, while discrete prompt optimization offers human-readable instructions but struggles with performance and scalability. We introduce eXTC (eXplainable Text Classifier) with three progressive stages: (1) learning a Standard Operating Procedure (SOP, or rulebook) in natural language via a new Structured Prompt Optimization algorithm; (2) SOP-grounded reasoning distillation from a large teacher LLM into a compact LM; and (3) expanding reasoning capabilities beyond the initial SOP via reinforcement learning. This design enables eXTC to provide (i) fast inference via a compact LM, with (ii) inference-time local reasoning traces, alongside a global, modular explanation of its learned domain rules, while (iii) significantly outperforming existing paradigms across diverse benchmarks in both classification performance and explanation quality, with stage-by-stage gains.

2605.11130 2026-06-04 cs.LG cs.AI 版本更新

HEPA: A Self-Supervised Horizon-Conditioned Event Predictive Architecture for Time Series

HEPA: 一种用于时间序列的自监督水平条件事件预测架构

Jonas Petersen, Gian-Alessandro Lombardi, Riccardo Maggioni, Camilla Mazzoleni, Federico Martelli, Philipp Petersen

发表机构 * ETH Zurich(苏黎世联邦理工学院) Forgis University of Vienna(维也纳大学)

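AI summary: Proposes HEPA, which pretrains a causal Transformer encoder with a Joint-Embedding Predictive Architecture (JEPA) and then finetunes only the predictor to output a monotonic survival CDF; it exceeds models such as PatchTST on at least 10 of 14 benchmarks with an order of magnitude fewer tuned parameters and, on lifecycle datasets, an order of magnitude less labeled data.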

Comments Spotlight at FMSD, ICML 2026. Code: https://github.com/Forgis-Labs/HEPA

详情
AI中文摘要

多变量时间序列中的关键事件,从涡轮机故障到心律失常,需要准确的预测,但由于此类事件罕见且标注成本高,标注数据稀缺。我们引入了HEPA(水平条件事件预测架构),基于两个关键原则。首先,通过联合嵌入预测架构(JEPA)预训练因果Transformer编码器:一个水平条件预测器学习预测未来表示而非未来值,迫使编码器仅从无标注数据中捕获可预测的时间动态。其次,我们冻结编码器,仅微调预测器以预测目标事件,生成随水平单调的生存累积分布函数(CDF)。在所有基准测试中,使用固定的架构和优化器超参数,HEPA处理了水污染、网络攻击检测、波动率制度以及跨11个领域的另外8种事件类型,在14个基准测试中的至少10个上超过了包括PatchTST、iTransformer、MAE和Chronos-2在内的领先时间序列架构,调优参数少一个数量级,并且在生命周期数据集上,标注数据少一个数量级。

英文摘要

Critical events in multivariate time series, from turbine failures to cardiac arrhythmias, demand accurate prediction, yet labeled data is scarce because such events are rare and costly to annotate. We introduce HEPA (Horizon-conditioned Event Predictive Architecture), built on two key principles. First, a causal Transformer encoder is pretrained via a Joint-Embedding Predictive Architecture (JEPA): a horizon-conditioned predictor learns to forecast future representations rather than future values, forcing the encoder to capture predictable temporal dynamics from unlabeled data alone. Second, we freeze the encoder and finetune only the predictor toward the target event, producing a monotonic survival cumulative distribution function (CDF) over horizons. With fixed architecture and optimiser hyperparameters across all benchmarks, HEPA handles water contamination, cyberattack detection, volatility regimes, and eight further event types across 11 domains, exceeding leading time-series architectures including PatchTST, iTransformer, MAE, and Chronos-2 on at least 10 of 14 benchmarks, with an order of magnitude fewer tuned parameters and, on lifecycle datasets, an order of magnitude less labeled data.

2605.09081 2026-06-04 cs.LG cs.AI 版本更新

FactoryNet: A Large-Scale Dataset toward Industrial Time-Series Foundation Models

FactoryNet:面向工业时间序列基础模型的大规模数据集

Karim Othman, Jonas Petersen, Matei Ignuta-Ciuncanu, Camilla Mazzoleni, Federico Martelli, Alessandro Lombardi, Riccardo Maggioni, Philipp Petersen

发表机构 * ETH Zurich(苏黎世联邦理工学院)

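AI summary: Presents FactoryNet, described as the first universal pretraining corpus for industrial time series, whose shared schema enables zero-shot cross-embodiment transfer and parameter-efficient anomaly detection.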

Comments Accepted at AI4Physics and FMSD, ICML 2026. Code: https://github.com/Forgis-Labs/FactoryNet

详情
AI中文摘要

我们引入了首个工业时间序列数据的通用预训练语料库:FactoryNet。该数据集包含51M个数据点,涵盖六种实体上的23k个端到端任务执行(13.3k真实,9.8k合成),通过共享模式实现了鲁棒的零样本跨实体迁移和高参数效率的异常检测。我们提出了一种新颖的模式:设定点、努力、反馈、上下文(S-E-F-C),该模式贯穿整个流水线,将任何驱动系统映射到共同的表示框架。该语料库涵盖27种标注的异常类型,以及健康基线和机器人操作与加工领域的反事实对。跨实体迁移实验取得了积极结果:在考虑偏见的指标下,我们的模型在评估的源-目标对上展示了公平的跨实体迁移能力,而24个模式对齐的信号与高维基线相比,实现了有竞争力的异常检测性能。我们发布FactoryNet作为一个不断增长的多实体数据集,以推动工业基础模型的发展。

英文摘要

We introduce FactoryNet, the first universal pretraining corpus for industrial time-series data. It comprises 51M datapoints across 23k end-to-end task executions (13.3k real, 9.8k synthetic) on six embodiments, unified by a shared schema that enables robust zero-shot cross-embodiment transfer and highly parameter-efficient anomaly detection. We introduce a novel schema, Setpoint, Effort, Feedback, Context (S-E-F-C), which underlies the whole pipeline and maps any actuated system into a common representational frame. The corpus spans 27 annotated anomaly types alongside healthy baselines and counterfactual pairs across robotic manipulation and machining domains. Cross-embodiment transfer experiments yield positive results: under bias-aware metrics our model demonstrates fair cross-embodiment transfer capabilities on the evaluated source-target pair, while 24 schema-aligned signals achieve competitive anomaly detection performance compared to high-dimensional baselines. We release FactoryNet as a growing, multi-embodiment dataset to drive progress toward industrial foundation models.

2603.07523 2026-06-04 cs.LG 版本更新

Breaking the Scale Barrier: One-Shot Knowledge Transfer via Frequency Transform

打破规模壁垒:基于频率变换的一次性知识迁移

Jianlu Shen, Fu Feng, Yucheng Xie, Jiaqi Lv, Xin Geng

发表机构 * School of Computer Science and Engineering, Southeast University(东南大学计算机科学与工程学院); Key Laboratory of New Generation Artificial Intelligence Technology and Its Interdisciplinary Applications, Ministry of Education, China(新一代人工智能技术及其交叉应用教育部重点实验室)

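AI summary: Proposes FRONT, which uses the Discrete Cosine Transform to extract the low-frequency components of weights as a "learngene", initializes models of arbitrary size without training via truncation or padding, and optionally applies a spectral regularizer to improve transferability; it accelerates convergence by up to 15x on vision tasks and reduces training FLOPs by 40.5% on average on language tasks.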

详情
AI中文摘要

通过微调大规模预训练网络来迁移知识已成为下游任务的标准范式,然而预训练模型的知识与单一架构紧密耦合,限制了在不同规模模型间的灵活复用。针对这一挑战,近期方法通常采用参数选择(无法捕捉知识的相互依赖结构)或使用生成模型进行参数预测(依赖于对大规模网络集合的不切实际访问)。在本文中,我们实验证明,模型的基础、任务无关知识(即其“学习基因”)编码在权重的低频分量中,并且可以被下游模型高效继承。基于这一发现,我们提出FRONT(频域知识迁移),一种新颖框架,使用离散余弦变换(DCT)分离低频“学习基因”。该学习基因可以通过简单的截断或填充无缝适配以初始化任意大小的模型,整个过程无需训练。为了提升性能,我们提出一个可选的低成本精炼过程,引入频谱正则化器以进一步提高学习基因的可迁移性。大量实验表明,FRONT达到了最先进的性能,在视觉任务中加速收敛高达15倍,在语言任务中平均减少40.5%的训练FLOPs。

英文摘要

Transferring knowledge by fine-tuning large-scale pre-trained networks has become a standard paradigm for downstream tasks, yet the knowledge of a pre-trained model is tightly coupled with a monolithic architecture, which restricts flexible reuse across models of varying scales. In response to this challenge, recent approaches typically resort to either parameter selection, which fails to capture the interdependent structure of this knowledge, or parameter prediction using generative models that depend on impractical access to large network collections. In this paper, we identify the low-frequency components of model weights as the concrete carrier of foundational, task-agnostic knowledge, its "learngene", and validate this by demonstrating its efficient inheritance by downstream models and tasks. Based on this insight, we propose FRONT (FRequency dOmain kNowledge Transfer), a novel framework that uses the Discrete Cosine Transform (DCT) to isolate the low-frequency "learngene". This learngene can be seamlessly adapted to initialize models of arbitrary size via simple truncation or padding, a process that is entirely training-free. For enhanced performance, we propose an optional low-cost refinement process that introduces a spectral regularizer to further improve the learngene's transferability. Extensive experiments demonstrate that FRONT achieves state-of-the-art performance, accelerates convergence by up to $15\times$ in vision tasks, and reduces training FLOPs by an average of 40.5% in language tasks. Code is available at https://github.com/LUcy0505/FRONT.
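The truncate-or-pad step can be sketched in a few lines of plain Python. This is a hedged illustration of the general idea only: the abstract does not specify the normalization, which layers are transformed, or how amplitudes are rescaled, so the orthonormal DCT-II convention and the names `resize_coeffs` and `init_from_learngene` are assumptions made for this example.

```python
import math

def dct_matrix(n):
    """Orthonormal DCT-II matrix (n x n); its inverse is its transpose."""
    return [[math.sqrt((1.0 if k == 0 else 2.0) / n)
             * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
             for i in range(n)] for k in range(n)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def dct2(w):
    """2-D DCT of a weight matrix: C_rows @ W @ C_cols^T."""
    return matmul(matmul(dct_matrix(len(w)), w), transpose(dct_matrix(len(w[0]))))

def idct2(z):
    """Inverse 2-D DCT: C_rows^T @ Z @ C_cols."""
    return matmul(matmul(transpose(dct_matrix(len(z))), z), dct_matrix(len(z[0])))

def resize_coeffs(z, rows, cols):
    """Truncate or zero-pad, always keeping the low-frequency (top-left) block."""
    return [[z[r][c] if r < len(z) and c < len(z[0]) else 0.0
             for c in range(cols)] for r in range(rows)]

def init_from_learngene(w, keep, rows, cols):
    gene = resize_coeffs(dct2(w), keep, keep)      # keep low frequencies only
    return idct2(resize_coeffs(gene, rows, cols))  # adapt to the target shape

source = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 10.0]]
larger = init_from_learngene(source, keep=2, rows=5, cols=4)
```

Under this orthonormal convention the amplitude of the reconstructed weights changes with the target size, so a practical method would presumably need some rescaling; that detail is deliberately left out here.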

2602.05725 2026-06-04 cs.LG math.OC stat.ML 版本更新

Muon in Associative Memory Learning: Training Dynamics and Scaling Laws

联想记忆学习中的Muon:训练动力学与缩放定律

Binghui Li, Kaifei Wang, Han Zhong, Pinyan Lu, Liwei Wang

发表机构 * University of Science and Technology of China(中国科学技术大学)

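AI summary: Studies the training dynamics and scaling laws of the Muon optimizer in an associative memory model, proving an exponential speedup over gradient descent in the noiseless case and better scaling efficiency in the noisy case.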

Comments Published as a conference paper at ICML 2026; 53 pages

详情
AI中文摘要

Muon通过梯度的矩阵符号更新矩阵参数,并显示出强大的经验增益,但其动力学和缩放行为在理论上仍不清楚。我们在具有softmax检索和查询-答案对上的层次频谱(含和不含标签噪声)的线性联想记忆模型中研究Muon。在该设置下,我们证明梯度下降以高度不平衡的速率学习频率分量,导致收敛缓慢,瓶颈在于低频分量。相比之下,Muon优化器缓解了这种不平衡,实现了更快且更均匀的进展。具体地,在无噪声情况下,Muon实现了相对于梯度下降的指数加速;在具有幂律频谱的有噪声情况下,我们推导了Muon的缩放定律,并展示了其相对于梯度下降的优越缩放效率。此外,我们表明Muon可以解释为由自适应任务对齐和块对称梯度结构产生的隐式矩阵预处理器。相比之下,具有坐标符号算子的预处理器在已知未知任务表示的oracle访问下才能匹配Muon,而这在实践中的SignGD中是不可行的。在合成长尾分类和LLaMA风格预训练上的实验证实了该理论。

英文摘要

Muon updates matrix parameters via the matrix sign of the gradient and has shown strong empirical gains, yet its dynamics and scaling behavior remain unclear in theory. We study Muon in a linear associative memory model with softmax retrieval and a hierarchical frequency spectrum over query-answer pairs, with and without label noise. In this setting, we show that Gradient Descent (GD) learns frequency components at highly imbalanced rates, leading to slow convergence bottlenecked by low-frequency components. In contrast, the Muon optimizer mitigates this imbalance, leading to faster and more uniform progress. Specifically, in the noiseless case, Muon achieves an exponential speedup over GD; in the noisy case with a power-law frequency spectrum, we derive Muon's scaling law and demonstrate its superior scaling efficiency over GD. Furthermore, we show that Muon can be interpreted as an implicit matrix preconditioner arising from adaptive task alignment and block-symmetric gradient structure. In contrast, the preconditioner with coordinate-wise sign operator could match Muon under oracle access to unknown task representations, which is infeasible for SignGD in practice. Experiments on synthetic long-tail classification and LLaMA-style pre-training corroborate the theory.
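For readers unfamiliar with the matrix sign, the following Python sketch approximates it with the classical cubic Newton-Schulz iteration. For a full-rank gradient G = U S V^T it converges to U V^T, so every singular direction receives an update of equal magnitude, which is the intuition behind the more uniform progress described above. This is a minimal illustration under stated assumptions: momentum and the particular polynomial used in practical Muon implementations are omitted, and `muon_like_step` is a hypothetical name.

```python
def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(a):
    return [list(r) for r in zip(*a)]

def msign(g, steps=30):
    """Approximate U V^T for a full-rank G = U S V^T (cubic Newton-Schulz)."""
    fro = sum(v * v for row in g for v in row) ** 0.5
    x = [[v / fro for v in row] for row in g]        # singular values now lie in (0, 1]
    for _ in range(steps):
        xxtx = matmul(matmul(x, transpose(x)), x)
        # Each singular value s is mapped to 1.5*s - 0.5*s**3, which tends to 1.
        x = [[1.5 * a - 0.5 * b for a, b in zip(row_x, row_y)]
             for row_x, row_y in zip(x, xxtx)]
    return x

def muon_like_step(w, g, lr):
    """One plain update W <- W - lr * msign(G), without momentum."""
    s = msign(g)
    return [[wv - lr * sv for wv, sv in zip(row_w, row_s)] for row_w, row_s in zip(w, s)]

grad = [[3.0, 1.0], [1.0, 2.0]]       # symmetric positive definite, so msign is the identity
new_w = muon_like_step([[0.0, 0.0], [0.0, 0.0]], grad, lr=0.1)
```

A coordinate-wise sign, as in SignGD, would instead equalize individual entries rather than singular directions, which is the contrast the abstract draws.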

2605.24782 2026-06-04 cs.LG 版本更新

The Perception-Physics Paradox: Probing Scientific Alignment with TC-Bench

感知-物理悖论:用TC-Bench探究科学对齐

Dingling Yao, Andrea Polesello, Adeel Pervez, Caroline Muller, Francesco Locatello

发表机构 * ETH Zurich(苏黎世联邦理工学院) DeepMind University of Cambridge(剑桥大学) University of Amsterdam(阿姆斯特丹大学) University of Toronto(多伦多大学)

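AI summary: Introduces the notion of scientific alignment, derives a hierarchy of necessary conditions from structural isomorphism, and releases the TC-Bench benchmark dataset, which shows that vision foundation models rely on visual shortcuts rather than scientific reasoning in intense regimes.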

Comments Accepted at ICML 2026

详情
AI中文摘要

虽然视觉基础模型(VFM)在卫星图像的预测任务中表现出色,但其性能可能源于视觉相关性而非底层结构不变性,这使得基于感知的分布外准确性甚至不能作为科学实用性的良好代理。因此,模型可能看起来正确但推理错误,我们将这种差异称为感知-物理悖论。为了解决这一差距,我们引入科学对齐作为科学领域表示学习的隐式目标。我们通过结构同构性研究科学对齐的一个原则性、可测试的方面,该要求潜在表示能够唯一地识别物理系统,直至线性重新参数化。这一视角引出了一个层次化的必要条件,并为物理和因果可解释性提供了系统的探测协议。为了实施这一框架,我们发布了TC-Bench,这是一个全球性的、可复现的基准数据集,带有自动构建流程,用于热带气旋研究,并表明当前的VFM依赖于在极端条件下崩溃的视觉捷径,表明科学对齐并非仅仅是规模扩展的自然副产品。

英文摘要

While Vision Foundation Models (VFMs) excel at predictive tasks on satellite imagery, their performance can arise from visual correlations rather than underlying structural invariants, making even perception-based out-of-distribution accuracy a poor proxy for scientific utility. As a result, models may look correct without reasoning correctly, a discrepancy we term the Perception-Physics Paradox. To address this gap, we introduce scientific alignment as an implicit objective for representation learning in scientific domains. We study a principled, testable aspect of scientific alignment through structural isomorphism, which requires latent representations to uniquely identify physical systems up to a linear reparameterization. This perspective induces a hierarchy of necessary conditions and yields a systematic probing protocol for physical and causal interpretability. To operationalize this framework, we release TC-Bench, a global, reproducible benchmark dataset with an automated construction pipeline for tropical cyclone research, and show that current VFMs rely on visual shortcuts that collapse in intense regimes, indicating that scientific alignment does not arise as a natural byproduct of scaling alone.

2605.17273 2026-06-04 cs.LG cs.AI 版本更新

Position: State-of-the-Art Claims Require State-of-the-Art Evidence

立场:声称最先进需要最先进的证据

YongKyung Oh

发表机构 * YongKyung Oh(永庆欧)

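AI summary: Identifies a widespread gap between state-of-the-art (SOTA) claims and the evidence behind them in AI and ML research; an analysis of ten cross-domain benchmarks finds that at least one commonly assumed property of superiority fails in more than half of top-model comparisons, and the paper argues that claim language should reflect the strength of the evidence.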

详情
AI中文摘要

最先进(SOTA)声称在人工智能(AI)和机器学习(ML)研究中普遍存在。这些声称基于基准评估,其中模型根据跨任务的总分进行排名。公共基准或排行榜是最明显的实例,但相同的结构也出现在文献中的论文表格中。然而,这种微弱的证据往往无法支持这些强有力的声称。我们识别出AI基准测试中普遍存在的声称-证据差距。声称SOTA隐含着超越平均分数优越性的假设,表明模型在大多数任务上显著优于替代方案。然而,平均分数的边际改进仅表明平均排名靠前,而非真正的优越性。通过分析来自公共排行榜的十个跨领域基准测试,我们发现超过一半的顶级模型比较中,至少一项常见的优越性假设不成立。这些属性包括有意义的效应大小、跨任务的一致性,或对数据集移除的鲁棒性。相反,总分提升往往由异常数据集驱动。即使在任务众多的基准测试中,这种脆弱性仍然存在。我们认为,声称语言应反映潜在证据的强度。这不需要额外的实验,只需诚实地报告结果实际显示的内容,从而实现跨模型更精确和可解释的比较。

英文摘要

State-of-the-Art (SOTA) claims pervade Artificial Intelligence (AI) and Machine Learning (ML) research. These claims rest on benchmark evaluations, where models are ranked by aggregate scores across tasks. Public benchmarks or leaderboards are the most visible instance, but the same structure appears in paper tables throughout the literature. However, such minimal evidence often cannot support these strong claims. We identify a widespread claim-evidence gap in AI benchmarking. Claiming SOTA carries implicit assumptions beyond mean score superiority, suggesting that a model meaningfully outperforms alternatives across most tasks. However, a marginal improvement in the mean score merely indicates a top average rank rather than true superiority. Analyzing ten cross-domain benchmarks from public leaderboards, we found that in more than half of top-model comparisons, at least one commonly assumed property of superiority does not hold. These properties include meaningful effect size, consistency across tasks, or robustness to dataset removal. Instead, aggregate gains are frequently driven by outlier datasets. This fragility persists even in benchmarks with many tasks. We argue that claim language should reflect the strength of the underlying evidence. This requires no additional experiments, only honest reporting of what results actually show, enabling more precise and interpretable comparisons across models.
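The kind of check the abstract argues for can be run on an existing results table without new experiments. The Python sketch below is one possible reading, not the paper's protocol: it reports the mean gap between two models, the fraction of datasets on which the first model wins, and the datasets whose removal erases or reverses the mean advantage. The function name, the returned keys, and the example scores are all hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)

def audit_claim(scores_a, scores_b):
    """scores_a, scores_b: {dataset: score} for two models on the same datasets."""
    names = sorted(scores_a)
    diffs = {d: scores_a[d] - scores_b[d] for d in names}
    mean_gap = mean([diffs[d] for d in names])
    # Consistency across tasks: on what fraction of datasets does model A win?
    win_rate = mean([1.0 if diffs[d] > 0 else 0.0 for d in names])
    # Robustness to dataset removal: does dropping one dataset erase the mean gap?
    flips = [d for d in names
             if mean([diffs[o] for o in names if o != d]) * mean_gap <= 0]
    return {"mean_gap": mean_gap, "win_rate": win_rate, "flips_on_removal": flips}

model_a = {"t1": 0.70, "t2": 0.71, "t3": 0.69, "t4": 0.95}
model_b = {"t1": 0.72, "t2": 0.72, "t3": 0.70, "t4": 0.80}
report = audit_claim(model_a, model_b)
```

In this made-up example model A has the higher mean score but wins on only one of four datasets, and removing that single dataset reverses the ranking, which is the outlier-driven pattern the abstract describes.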

2605.22740 2026-06-04 cs.LG 版本更新

Ternary Decision Trees with Locally-Adaptive Uncertainty Zones

三元决策树与局部自适应不确定性区域

William Smits

发表机构 * Avathon

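AI summary: Proposes ternary decision trees, which add a locally adaptive uncertainty zone to each split node of a conventional binary decision tree to improve decided accuracy, with gains reported across many datasets.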

Comments V2: Major revision. Added decision-theoretic framework deriving optimal delta* as a node-local cost minimisation problem; four formal theoretical properties (Propositions 1-4); motivating example figure (Figure 5); strengthened related work and limitations analysis. 15 pages, 5 figures, 5 appendix sections. Submitted to Data Mining and Knowledge Discovery (DAMI)

详情
AI中文摘要

决策树通过硬二元阈值划分特征空间,对远离决策边界和直接位于边界上的实例赋予相同的置信度。我们引入三元决策树,每个分裂节点附加一个半宽为delta的不确定性区域,位于最优阈值中心。该区域内实例的预测由两个子树的加权混合生成,并被标记为边界不确定,提示下游应用可能以不同方式处理这些预测。关键的是,delta在每个节点本地计算,基于标准CART分裂寻找过程中已有的统计信息,无需外部噪声指定。我们提出并评估了五种delta估计方法:质量平台(分裂标准曲线的平台宽度)、类别重叠(经验类别分布重叠)、增益比(分裂质量相对于分裂熵)、节点自助法(节点层面重采样下的阈值方差)以及边缘(受SVM启发的最近跨类训练实例距离)。在72个OpenML-CC18数据集上进行5折交叉验证后,所有五种方法结合概率路由显著优于标准CART在决定准确性上(Wilcoxon符号秩检验,p < 0.001)。边缘方法在效率上最佳(每个边界不确定标志率单位获得0.104准确性提升),在42个数据集上获胜,且不需要额外超参数。对三个Breiman合成基准的分析显示,边缘方法在干净数据上自我校准,而节点自助法和质量平台方法最佳跟踪理论不可约误差。在四个医疗和金融数据集上的实验展示了实际价值:在乳腺X线摄影中,节点自助法通过将10.8%的筛查病例标记为边界不确定,实现了+0.71%的决定准确性提升。

英文摘要

Decision trees assign identical confidence to instances near and far from each split threshold. We introduce ternary decision trees, which augment each split node with an uncertainty zone of half-width delta. A decision-theoretic framework characterises the optimal zone width delta* as the solution to a node-local cost-minimisation problem; four formal properties are established: accuracy decomposition, a sufficiency condition for decided accuracy improvement, an exact efficiency characterisation (eta = Dec-Acc minus Acc_u, the accuracy gap between decided and boundary-uncertain predictions), and asymptotic consistency of the margin method. Instances within the zone receive predictions by weighted blending of both child subtrees and are flagged as boundary-uncertain. We propose and evaluate five delta-estimation methods: quality-plateau (plateau width of the split criterion curve), class-overlap (empirical class-distribution overlap), gain-ratio (split quality relative to split entropy), node-bootstrap (threshold variance under node-level resampling), and margin (SVM-inspired distance to the nearest cross-class training example). All methods reuse statistics already computed during standard CART split finding, requiring no external noise specification. Evaluated across 71 of the 72 OpenML-CC18 datasets with 5-fold cross-validation, all five methods with probabilistic routing significantly outperform standard CART on decided accuracy (Wilcoxon signed-rank, p < 0.001). The margin method achieves the best efficiency (0.104 accuracy gain per unit flagging rate), wins on 42 of 72 datasets, and requires zero hyperparameters. Analysis on Breiman synthetic benchmarks confirms margin is self-calibrating on clean data. On mammography, node-bootstrap achieves +0.71% decided accuracy by flagging 10.8% of cases as boundary-uncertain.
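The routing rule at a single split node can be sketched as follows. The abstract states only that instances inside the zone receive a weighted blend of both child subtrees and are flagged; the linear weighting used here is an assumption made for illustration, and `delta` is taken as given rather than estimated by one of the five methods.

```python
def ternary_route(x, threshold, delta, left_proba, right_proba):
    """Return (class probabilities, boundary_uncertain flag) for one split node."""
    if delta <= 0 or abs(x - threshold) > delta:
        # Outside the zone: ordinary hard routing, as in a binary CART split.
        return (left_proba if x <= threshold else right_proba), False
    # Inside the zone: weight rises linearly from 0 at threshold - delta
    # to 1 at threshold + delta (assumed form of the blend).
    w_right = (x - (threshold - delta)) / (2.0 * delta)
    blended = [(1.0 - w_right) * l + w_right * r
               for l, r in zip(left_proba, right_proba)]
    return blended, True

left, right = [0.9, 0.1], [0.2, 0.8]
proba, uncertain = ternary_route(5.0, threshold=5.0, delta=1.0,
                                 left_proba=left, right_proba=right)
```

With `delta` set to zero the function reduces to a standard binary split, so the flag rate can be tuned node by node without changing how the tree is grown.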

2605.20654 2026-06-04 cs.LG cs.AI 版本更新

REFLECTOR: Internalizing Step-wise Reflection against Indirect Jailbreak

REFLECTOR: 内化逐步反思以对抗间接越狱

Jiachen Ma, Jiawen Zhang, Xiangtian Li, Bo Zou, Chaochao Lu, Chao Yang

发表机构 * University of Science and Technology of China(中国科学技术大学)

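AI summary: Proposes REFLECTOR, a two-stage framework that uses teacher-guided generation of reflection data for supervised fine-tuning and then reinforcement learning to internalize autonomous self-reflection, achieving defense success rates above 90% against complex indirect attacks while improving general performance.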

Comments ICML 2026

详情
AI中文摘要

尽管大型语言模型(LLMs)展现出卓越的能力,但它们仍然容易受到复杂的多步越狱攻击,这些攻击通过利用内部生成过程来规避传统的表面安全对齐。为了解决这些漏洞,我们提出了REFLECTOR,一个原则性的两阶段框架,将自我反思内化在生成轨迹中。REFLECTOR首先利用教师引导生成高质量反思数据用于监督微调(SFT),建立结构化的反思模式。随后,它使用强化学习(RL)结合结果驱动和奖励有效性监督,以培养稳健、自主的自我反思能力。实验结果表明,REFLECTOR在复杂的间接攻击下实现了超过90%的防御成功率(DSR),同时在不同威胁场景中具有稳健的泛化能力。值得注意的是,该框架增强了任务特定和通用效用,在GSM8K上获得了5.85%的提升,并在知识密集型基准测试中表现更佳。通过内化轨迹级安全性,REFLECTOR克服了表面对齐的基本限制,且没有显著的计算开销,为开发安全且能力强大的LLMs提供了一种高效且可扩展的解决方案。

英文摘要

While Large Language Models (LLMs) demonstrate remarkable capabilities, they remain susceptible to sophisticated, multi-step jailbreak attacks that circumvent conventional surface-level safety alignment by exploiting the internal generation process. To address these vulnerabilities, we propose Reflector, a principled two-stage framework that internalizes self-reflection within the generation trajectory. Reflector first leverages teacher-guided generation to produce high-quality reflection data for supervised fine-tuning (SFT), establishing structured reflection patterns. It subsequently uses Reinforcement Learning (RL) with outcome-driven and reward-validity supervision to instill robust, autonomous self-reflection capabilities. Empirical results show that Reflector achieves Defense Success Rates (DSR) exceeding 90% against complex indirect attacks while generalizing robustly across diverse threat scenarios. Notably, the framework enhances both task-specific and general utility, yielding a 5.85% gain on GSM8K alongside improved performance on knowledge-intensive benchmarks. By internalizing trajectory-level safety, Reflector overcomes the fundamental limitations of surface alignment without significant computational overhead, offering an efficient and scalable solution for the development of safe and capable LLMs.

2605.18879 2026-06-04 cs.LG cs.AI cs.CL 版本更新

ZeroUnlearn: Few-Shot Knowledge Unlearning in Large Language Models

ZeroUnlearn:大语言模型中的少样本知识遗忘

Yujie Lin, Chengyi Yang, Zhishang Xiang, Yiping Song, Jinsong Su

发表机构 * University of Science and Technology of China(中国科学技术大学)

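AI summary: Proposes ZeroUnlearn, a framework that recasts machine unlearning as a precise knowledge re-mapping problem via model editing and uses a multiplicative parameter update with a closed-form solution for efficient, targeted few-shot unlearning.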

详情
AI中文摘要

大型语言模型由于在海量网络语料上训练,不可避免地会保留敏感信息(定义为可能引发有害生成的输入),从而引发隐私和安全担忧。现有的机器遗忘方法主要依赖于重训练或激进微调,这些方法要么计算成本高,要么容易降低相关知识并损害整体模型效用。在这项工作中,我们通过模型编辑将机器遗忘重新表述为一个精确的知识重映射问题。我们提出了ZeroUnlearn,一个少样本遗忘框架。它通过将敏感输入映射到中性目标状态并移除其原始表示来覆盖敏感输入。ZeroUnlearn通过封闭解形式的乘法参数更新强制执行表示正交性,从而实现高效且有针对性的遗忘。我们进一步将ZeroUnlearn扩展到基于梯度的变体,用于多样本遗忘。实验表明,我们的方法在保持模型整体效用的同时优于现有基线。我们的代码可在github上获取:https://github.com/XMUDeepLIT/ZeroUnlearn。

英文摘要

Large language models inevitably retain sensitive information, defined as inputs that may induce harmful generations, due to training on massive web corpora, raising concerns for privacy and safety. Existing machine unlearning methods primarily rely on retraining or aggressive fine-tuning, which are either computationally expensive or prone to degrading related knowledge and overall model utility. In this work, we reformulate machine unlearning as a precise knowledge re-mapping problem via model editing. We propose ZeroUnlearn, a few-shot unlearning framework. It overwrites sensitive inputs by mapping them to a neutral target state and removing their original representations. ZeroUnlearn enforces representational orthogonality through a multiplicative parameter update with a closed-form solution, enabling efficient and targeted unlearning. We further extend ZeroUnlearn to a gradient-based variant for multi-sample unlearning. Experiments demonstrate that our approach outperforms existing baselines while preserving general model utility. Our code is available on GitHub: https://github.com/XMUDeepLIT/ZeroUnlearn.

2605.20468 2026-06-04 cs.LG stat.ME stat.ML 版本更新

CASCADE Conformal Prediction: Uncertainty-Adaptive Prediction Intervals for Two-Stage Clinical Decision Support

CASCADE 共形预测:两阶段临床决策支持的不确定性自适应预测区间

Ricardo Diaz-Rincon, Muxuan Liang, Adolfo Ramirez-Zamora, Benjamin Shickel

发表机构 * University of Florida(佛罗里达大学) MD Anderson Cancer Center(MD安德森癌症中心) University of Louisville(路易斯维尔大学)

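AI summary: Proposes the CASCADE conformal prediction framework, which propagates a classifier's epistemic uncertainty to adapt regression prediction intervals, yielding efficient and robust interval estimates for medication management in Parkinson's disease.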

Comments Accepted to ICML 2026 AgenticUQ Workshop. 14 Pages, 3 Figures

详情
AI中文摘要

由于疾病进展的异质性、患者反应的差异性以及药物副作用,帕金森病(PD)的有效用药管理具有挑战性。虽然AI模型可以预测左旋多巴等效日剂量(LEDD)作为用药需求的度量,但标准的不确定性量化通常无法传达这些预测的可靠性,将高置信度和低置信度的临床决策等同对待。我们引入了CASCADE(通过共形和分布估计的校准自适应缩放),一种新颖的共形预测框架,它将来自筛选分类器的认知不确定性传播以自适应下游预测。与依赖辅助残差回归的标准共形方法不同,我们利用来自主要分类任务(识别是否需要改变用药)的认知不确定性,动态缩放次要回归任务(预测改变多少)的预测区间。通过将Venn-Abers多概率不确定性直接映射到非一致性分数,我们的框架实现了连续的风险自适应。我们证明,这种“级联效应”为高置信度患者产生高效的区间(比标准共形基线窄38.9%),同时自动扩展区间以确保对不确定病例的鲁棒覆盖,弥合了PD中离散临床决策与连续剂量预测之间的差距。

英文摘要

Effective medication management in Parkinson's Disease (PD) is challenging due to heterogeneous disease progression, variable patient response, and medication side effects. While AI models can forecast levodopa equivalent daily dose (LEDD) as a measure of medication needs, standard uncertainty quantification often fails to communicate the reliability of these predictions, treating high- and low-confidence clinical decisions identically. We introduce CASCADE (Calibrated Adaptive Scaling via Conformal And Distributional Estimation), a novel conformal prediction framework that propagates epistemic uncertainty from a screening classifier to adapt downstream predictions. Unlike standard conformal methods that rely on auxiliary residual regression, we leverage epistemic uncertainty from a primary classification task (identifying whether a medication change is needed) to dynamically scale the prediction intervals of a secondary regression task (predicting how much change). By mapping Venn-Abers multi-probabilistic uncertainty directly to non-conformity scores, our framework achieves continuous risk adaptation. We demonstrate that this "cascade effect" produces highly efficient intervals for confident patients (38.9% narrower than standard conformal baselines) while automatically expanding intervals to ensure robust coverage for uncertain cases, bridging the gap between discrete clinical decision-making and continuous dose forecasting in PD.
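The mechanism of scaling regression intervals by classifier uncertainty can be illustrated with generic normalized split conformal prediction. The Python sketch below is not the CASCADE implementation: the `scale` function that maps a Venn-Abers interval width (p1 - p0) to an interval multiplier, and its `base` and `gain` parameters, are hypothetical choices made for this example.

```python
import math

def scale(width, base=1.0, gain=4.0):
    """Hypothetical map from Venn-Abers width (p1 - p0) to an interval scale."""
    return base + gain * width

def calibrate(residuals, widths, alpha):
    """Split-conformal quantile of the normalized scores |y - y_hat| / scale."""
    scores = sorted(abs(r) / scale(w) for r, w in zip(residuals, widths))
    n = len(scores)
    rank = math.ceil((n + 1) * (1.0 - alpha))   # finite-sample corrected rank
    return math.inf if rank > n else scores[rank - 1]

def interval(y_hat, width, q):
    """Prediction interval that widens with the classifier's uncertainty."""
    half = q * scale(width)
    return y_hat - half, y_hat + half

cal_residuals = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]   # calibration |y - y_hat| values
cal_widths = [0.0] * 7                                  # all confident, for simplicity
q = calibrate(cal_residuals, cal_widths, alpha=0.25)
confident = interval(100.0, width=0.0, q=q)
uncertain = interval(100.0, width=0.25, q=q)
```

Because the calibration scores are divided by the same scale that later multiplies the quantile, confident cases receive narrower intervals and uncertain cases wider ones while the usual marginal coverage argument for split conformal prediction still applies.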

2605.18936 2026-06-04 cs.LG cs.CL 版本更新

FedMental: Evaluating Federated Learning for Mental Health Detection from Social Media Data

FedMental: 评估用于社交媒体数据心理健康检测的联邦学习

Nuredin Ali Abdelkadir, Anjali Ratnam, Zeerak Talat, Stevie Chancellor

发表机构 * University of Minnesota(明尼苏达大学) University of Edinburgh(爱丁堡大学)

AI总结 本文通过联邦学习和差分隐私联邦学习在抑郁和自杀危机检测任务上的实验,评估了隐私保护技术对心理健康检测性能的影响,发现联邦学习性能接近集中式训练,但差分隐私联邦学习存在显著的性能-隐私权衡。

Comments Association for Computational Linguistics (ACL) 2026 Main Conference

详情
AI中文摘要

社交媒体文本数据常用于训练机器学习模型以识别表现出高风险心理健康行为的用户。然而,共享这些敏感数据会带来隐私风险,并限制了基准数据集的发展。我们全面评估了隐私保护的机器学习技术是否能在保持性能的同时实现更安全的数据共享。具体来说,我们将联邦学习和差分隐私联邦学习应用于两个广泛研究的心理健康预测任务:X(Twitter)上的抑郁检测和Reddit上的自杀危机检测。通过将每个用户视为非独立同分布设置中的一个客户端,我们模拟了现实的数据共享场景,评估了不同的客户端比例、聚合策略和隐私预算。虽然联邦学习在抑郁识别上达到了与集中式训练相当的性能(集中式F1=85.63;最佳联邦学习模型F1=83.16),但我们发现差分隐私联邦学习即使在低噪声水平(epsilon=50)下也存在较大的性能-隐私权衡(F1下降高达27.01)。这是由于与心理健康相关的高信息量但稀疏的语言标记(如健康主题和情感词)被扭曲所致。本研究实证展示了当前隐私保护技术在心理健康推理任务中的潜力和局限性。

英文摘要

Social media text data are often used to train Machine Learning (ML) models to identify users exhibiting high-risk mental health behaviors. However, sharing this sensitive data poses privacy risks and limits the growth of benchmark datasets. We comprehensively evaluate whether privacy-preserving ML techniques can enable safer data sharing while preserving performance. Specifically, we apply federated learning (FL) and Differentially Private FL to two widely studied mental health prediction tasks: depression detection on X (Twitter) and suicide crisis detection on Reddit. We simulate realistic data-sharing scenarios by treating each user as a client in a non-IID setting, evaluating across different client fractions, aggregation strategies, and privacy budgets. While FL achieves comparable performance to centralized training (centralized F1 = 85.63; best FL model F1 = 83.16) on depression identification, we find that Differentially Private FL has a large performance-privacy trade-off (an F1 drop of up to 27.01) even with low levels of noise (epsilon = 50). This is due to the distortion of highly informative yet sparse linguistic markers related to mental health, such as health topics and emotion words. This research empirically demonstrates the potential and limitations of current privacy preservation techniques for mental health inference tasks.
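To make the performance-privacy trade-off concrete, the following is a minimal sketch of one differentially private aggregation round, assuming the common clip-then-add-Gaussian-noise recipe. The abstract does not specify the exact mechanism, so the function names, `clip_norm`, and `noise_multiplier` are illustrative assumptions.

```python
import numpy as np

def clip_update(update, clip_norm):
    # Rescale a client update so its L2 norm is at most clip_norm.
    norm = np.linalg.norm(update)
    return update * min(1.0, clip_norm / max(norm, 1e-12))

def dp_fedavg(updates, clip_norm=1.0, noise_multiplier=0.0, rng=None):
    # Average the clipped client updates, then add Gaussian noise whose
    # standard deviation is noise_multiplier * clip_norm / number of clients.
    rng = np.random.default_rng() if rng is None else rng
    clipped = np.stack([clip_update(u, clip_norm) for u in updates])
    mean = clipped.mean(axis=0)
    sigma = noise_multiplier * clip_norm / len(updates)
    return mean + rng.normal(0.0, sigma, size=mean.shape)
```

Both steps act on every coordinate: clipping shrinks large updates and the noise is added uniformly, so signal carried by a few rare features can be swamped, which is consistent with the distortion of sparse markers reported above.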

2605.18931 2026-06-04 stat.ML cs.AI cs.LG 版本更新

Markov Chain Decoders Overcome the Heavy-Tail Limitations of Lipschitz Generative Models

马尔可夫链解码器克服Lipschitz生成模型的重尾限制

Abdelhakim Ziani, Andras Horvath, Paolo Ballarini

发表机构 * Université Paris Saclay, Lab. MICS, CentraleSupélec, Gif-sur-Yvette, France(巴黎萨克雷大学,MICS实验室,CentraleSupélec,法国伊薇特河畔吉夫) Università di Torino, Torino, Italy(都灵大学,意大利都灵)

AI总结 针对Lipschitz生成模型无法生成重尾分布的问题,提出用基于马尔可夫链的Phase-Type分布替换高斯解码器,显著降低了尾部误差和极端分位数误差。

详情
Journal ref
22nd European Performance Engineering Workshop (EPEW 2026), Jun 2025, Grimstad, Norway
AI中文摘要

重尾分布在性能评估、网络流量和风险建模中普遍存在。这种行为对现代深度生成模型构成了根本性挑战。标准变分自编码器(VAE)采用高斯解码器似然和Lipschitz约束神经网络,这种组合在结构上无法产生重尾输出:高斯尾部呈指数衰减,而Lipschitz连续性阻止解码器放大来自潜在空间的罕见事件以充分克服这种衰减。我们提供了这一局限性的理论刻画,并使用合成Pareto数据(跨越尾部指数$\alpha \in \{2, 3, 5, 30\}$和维度$d \in \{1, 5, 10\}$的网格)进行了受控实证演示。作为解决方案,我们在保持编码器、潜在空间和训练过程不变的情况下,将高斯解码器替换为基于马尔可夫链的Phase-Type(PH)分布。PH分布允许对任何正值分布(包括重尾族)进行任意精确的近似。实验表明,对于重尾数据,与高斯基线相比,基于PH的模型将尾部Kolmogorov-Smirnov距离减少了最多6倍,极端分位数误差减少了最多10倍。这些结果表明,将基于马尔可夫链的分布集成到生成模型的解码器中,为重尾生成问题提供了一个有原则且实际有效的解决方案。

英文摘要

Heavy-tailed distributions are prevalent in performance evaluation, network traffic, and risk modeling. This behavior poses a fundamental challenge for modern deep generative models. Standard Variational Autoencoders (VAEs) employ Gaussian decoder likelihoods and Lipschitz-constrained neural networks, a combination that is structurally incapable of producing heavy-tailed outputs: the Gaussian tail decays exponentially, and Lipschitz continuity prevents the decoder from amplifying rare events from the latent space enough to overcome this decay. We provide both a theoretical characterization of this limitation and a controlled empirical demonstration using synthetic Pareto data across a grid of tail indices $\alpha \in \{2, 3, 5, 30\}$ and dimensions $d \in \{1, 5, 10\}$. As a solution, we replace the Gaussian decoder with a Phase-Type (PH) distribution based on Markov chains, while keeping the encoder, latent space, and training procedure identical. PH distributions allow for arbitrarily precise approximations of any positive-valued distribution, including heavy-tailed families. Experiments show that the PH-based model reduces tail Kolmogorov-Smirnov distance by up to $6\times$ and extreme quantile error by up to $10\times$ compared to the Gaussian baseline for heavy-tailed data. These results demonstrate that integrating Markov chain-based distributions into the decoder of a generative model provides a principled and practically effective solution to the heavy-tail generation problem.
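The structural limitation can be illustrated with a standard Gaussian concentration bound. The sketch below assumes a scalar output, a decoder mean $g$ that is $L$-Lipschitz, a latent $z \sim \mathcal{N}(0, I)$, and a fixed decoder standard deviation $\sigma$. These symbols are illustrative assumptions, and the paper's own characterization may be stated differently.

$$
x = g(z) + \sigma\varepsilon,\quad \varepsilon \sim \mathcal{N}(0, 1)
\;\Longrightarrow\;
\Pr\big(|x - \mathbb{E}\,x| \ge t\big) \le 2\exp\!\left(-\frac{t^{2}}{2\,(L^{2} + \sigma^{2})}\right),
$$

$$
\text{whereas for Pareto data with scale } x_m:\quad \Pr(X > t) = \left(\frac{x_m}{t}\right)^{\alpha},\quad t \ge x_m .
$$

The bound follows because $(z, \varepsilon) \mapsto g(z) + \sigma\varepsilon$ is $\sqrt{L^{2} + \sigma^{2}}$-Lipschitz in the joint Gaussian vector. The model's tail is therefore sub-Gaussian for any finite $L$, so it falls below the polynomial Pareto tail for all sufficiently large $t$, whatever the tail index $\alpha$.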

2605.16301 2026-06-04 cs.CY cs.AI cs.LG 版本更新

Do LLMs Hold Their Values? MANTA: A Multi-Turn Adversarial Benchmark for Animal Welfare Reasoning

LLMs 是否坚持其价值观?MANTA:一个用于动物福利推理的多轮对抗性基准

Isabella Luong, Joyee Chen, Arturs Kanepajs, Jasmine Brazilek, Sankalpa Ghose, David Williams-King, Linh Le, Allen Lu

发表机构 * SPAR Compassion Aligned Machine Learning(同情对齐机器学习) NUS(新加坡国立大学) Mila(Mila研究所) ERA Cambridge(剑桥ERA)

AI总结 提出 MANTA 基准,通过多轮对抗性对话评估大语言模型在动物福利推理中的价值观稳定性和道德敏感性,发现单轮基准无法捕捉的排名变化和物种-压力交互效应。

详情
AI中文摘要

评估大语言模型(LLMs)中的动物福利推理仍然是一个开放挑战,尽管它们在消费者和专业环境中迅速部署,其中福利考虑隐含地出现在日常查询中。现有的基准(如 AnimalHarmBench)通过单轮、明确框架的问题进行评估,衡量模型在直接询问时是否避免有害内容。这种方法忽略了两种失败模式:在持续对抗性压力下的对齐退化,以及道德敏感性(模型是否在日常查询中自发提出福利问题)。为填补这一空白,我们构建了 MANTA,一个包含 1,088 个五轮对话的基准,从隐式的第一轮场景开始,通过明确的福利提示,再到来自五种类型(社会、文化、经济、实用和认知)的三轮对抗性压力。我们在两个维度上对对话进行评分:动物福利价值观稳定性(AWVS,主要)和动物福利道德敏感性(AWMS,诊断)。我们评估了七个前沿模型:Claude Opus 4.7、GPT-5.5、DeepSeek V4、Llama 3.3 70B、Mistral Small、Grok 4.3 和 Gemini 3.1 Flash Lite。多轮评估捕捉了单轮基准遗漏的行为:7 个模型中有 4 个相对于第一轮得分改变了排名,包括 Gemini Flash Lite,它在 AWMS 上从第五名下降到 AWVS 上的最后一名。AWMS 和 AWVS 呈正相关但不完全相关,表明道德识别测试捕捉了模型在压力下行为的一个稳定但不完整的组成部分。MANTA 还提供了先前基准无法获得的物种-压力交互矩阵,显示福利鲁棒性同时取决于动物和施加的压力;伴侣动物得分高于野生动物,后者高于养殖动物和无脊椎动物。我们发布了数据集、脚本化压力计划、评判提示和分析代码。

英文摘要

Evaluating animal welfare reasoning in LLMs remains an open challenge despite rapid deployment in consumer and professional contexts where welfare considerations appear implicitly in everyday queries. Existing benchmarks such as AnimalHarmBench evaluate this through single-turn, explicitly framed questions, measuring whether models avoid harmful content when directly asked. This approach overlooks two failure modes: alignment degradation under sustained adversarial pressure, and moral sensitivity (whether a model spontaneously surfaces welfare stakes in everyday queries). To fill this gap, we construct MANTA, a benchmark of 1,088 five-turn conversations progressing from an implicit Turn-1 scenario through an explicit welfare prompt to three adversarial pressure rounds drawn from a five-type taxonomy: Social, Cultural, Economic, Pragmatic, and Epistemic. We score conversations on two dimensions: Animal Welfare Value Stability (AWVS, primary) and Animal Welfare Moral Sensitivity (AWMS, diagnostic). We evaluate seven frontier models: Claude Opus 4.7, GPT-5.5, DeepSeek V4, Llama 3.3 70B, Mistral Small, Grok 4.3, and Gemini 3.1 Flash Lite. Multi-turn evaluation captures behavior single-turn benchmarks miss: 4 of 7 models change rank relative to Turn 1 scores, including Gemini Flash Lite, which drops from fifth on AWMS to last on AWVS. AWMS and AWVS are positively but imperfectly correlated, suggesting moral-recognition tests capture a stable but incomplete component of model behavior under pressure. MANTA also enables a species-by-pressure interaction matrix unavailable to prior benchmarks, showing welfare robustness depends jointly on the animal and pressure applied; companion animals score above wild animals, which score above farmed animals and invertebrates. We release the dataset, scripted pressure plans, judge prompts, and analysis code.

2605.15152 2026-06-04 cs.LG cs.AI 版本更新

Widening the Gap: Exploiting LLM Quantization via Outlier Injection

扩大差距:通过异常值注入利用LLM量化

Xiaohua Zhan, Kazuki Egashira, Robin Staab, Mark Vero, Martin Vechev

发表机构 * ETH Zurich(苏黎世联邦理工学院)

AI总结 本文提出首个针对多种先进量化方法(AWQ、GPTQ、GGUF I-quants)的量化条件攻击,通过注入异常值导致权重塌缩,诱导模型在量化后出现恶意行为。

详情
AI中文摘要

LLM量化已成为内存高效部署的关键。最近的研究表明,量化方案可能带来严重的安全风险:对手可以发布一个在全精度下看似良性,但在用户量化后表现出恶意行为的模型。然而,现有的量化条件攻击仅限于相对简单的量化方法,攻击者可以估计在目标量化下保持不变的权重区域。值得注意的是,先前的攻击始终未能攻破更流行和复杂的方案,限制了其实际影响。在这项工作中,我们提出了首个量化条件攻击,能够持续诱导出可由多种先进量化技术(包括AWQ、GPTQ和GGUF I-quants)触发的恶意行为。我们的攻击利用了现代量化方法共有的一个简单特性:大的异常值可能导致其他权重四舍五入为零。因此,通过向特定权重块注入异常值,对手可以诱导模型出现目标性的、可预测的权重塌缩。这种效应可用于制作看似良性的全精度模型,这些模型在量化后表现出广泛的恶意行为。通过在三种攻击场景和LLM上的广泛评估,我们表明我们的攻击在先前攻击失败的多种量化方法上实现了高成功率。我们的结果首次证明,量化的安全风险不仅限于更简单的方案,而是广泛存在于复杂、广泛使用的量化方法中。

英文摘要

LLM quantization has become essential for memory-efficient deployment. Recent work has shown that quantization schemes can pose critical security risks: an adversary may release a model that appears benign in full precision but exhibits malicious behavior once quantized by users. However, existing quantization-conditioned attacks have been limited to relatively simple quantization methods, where the attacker can estimate weight regions that remain invariant under the target quantization. Notably, prior attacks have consistently failed to compromise more popular and sophisticated schemes, limiting their practical impact. In this work, we introduce the first quantization-conditioned attack that consistently induces malicious behavior that can be triggered by a broad range of advanced quantization techniques, including AWQ, GPTQ, and GGUF I-quants. Our attack exploits a simple property shared by many modern quantization methods: large outliers can cause other weights to be rounded to zero. Consequently, by injecting outliers into specific weight blocks, an adversary can induce a targeted, predictable weight collapse in the model. This effect can be used to craft seemingly benign full-precision models that exhibit a wide range of malicious behaviors after quantization. Through extensive evaluation across three attack scenarios and LLMs, we show that our attack achieves high success rates against a broad range of quantization methods on which prior attacks fail. Our results demonstrate, for the first time, that the security risks of quantization are not restricted to simpler schemes but are broadly relevant across complex, widely-used quantization methods.
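The property the attack relies on, that a large outlier can push the other weights in a block to zero, can be reproduced with a toy quantizer. The sketch below uses plain symmetric absmax round-to-nearest quantization on one block. It is a deliberate simplification and does not implement AWQ, GPTQ, or GGUF I-quants; the weight values and the injected outlier are hypothetical.

```python
import numpy as np

def quantize_block(w, bits=4):
    # Symmetric absmax quantization: the largest magnitude in the block
    # sets the step size, and every weight is rounded to the nearest step.
    qmax = 2 ** (bits - 1) - 1              # 7 for 4-bit
    absmax = np.max(np.abs(w))
    if absmax == 0:
        return np.zeros_like(w)
    step = absmax / qmax
    q = np.clip(np.round(w / step), -qmax, qmax)
    return q * step                          # dequantized weights

block = np.array([0.02, -0.03, 0.05, -0.01])
clean = quantize_block(block)                # every weight stays nonzero

poisoned = block.copy()
poisoned[0] = 8.0                            # hypothetical injected outlier
collapsed = quantize_block(poisoned)         # only the outlier survives
```

In the clean block the step is 0.05 / 7, so each weight maps to a nonzero level. After injection the step becomes 8 / 7, and the three remaining weights are far below half a step, so they round to zero.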

2605.13672 2026-06-04 cs.CV cs.AI cs.LG 版本更新

SpurAudio: A Benchmark for Studying Shortcut Learning in Few-Shot Audio Classification

SpurAudio: 用于研究少样本音频分类中捷径学习的基准

Giries Abu Ayoub, Morad Tukan, Loay Mualem

发表机构 * Department of Computer Science, University of Haifa(海法大学计算机科学系) Independent Researcher(独立研究者) University of Stuttgart, Germany(斯图加特大学,德国) IMPRS-IS, Germany(智能系统国际Max Planck研究学校,德国)

AI总结 提出SpurAudio基准,通过控制音频中前景与背景的关联,评估少样本分类模型对虚假相关性的敏感性,发现现有方法在背景变化时性能显著下降。

详情
AI中文摘要

少样本分类(FSC)广泛用于从有限标注数据中学习,但大多数评估隐含假设目标概念与上下文线索无关。然而,在现实场景中,样本通常出现在丰富的上下文中,允许模型利用前景内容与背景信号之间的虚假相关性。虽然这种效应已在少样本图像分类中得到研究,但其在少样本音频分类中的作用在很大程度上仍未被探索,且现有音频基准对上下文结构的控制有限。我们引入了 SpurAudio,一个利用音频中前景事件和背景环境的自然可分离性,以支持对支持集和查询集之间的上下文偏移进行可控、多级评估的基准。使用该基准,我们表明许多最先进的少样本方法在背景相关性被破坏时遭受严重的性能下降,尽管在标准评估协议下达到相似的准确率。关键的是,即使在大型预训练音频基础模型中,这种脆弱性仍然存在,排除了骨干网络容量不足的解释。此外,在传统基准下看似相当的方法可能对虚假相关性表现出显著不同的敏感性,揭示了与特征表示在推理时如何与分类器头交互相关的系统性算法优势和脆弱性。这些发现为音频中少样本方法的行为提供了新的见解,并强调了在评估FSC模型时需要明确探测上下文依赖性的基准。

英文摘要

Few-shot classification (FSC) is widely used for learning from limited labeled data, yet most evaluations implicitly assume that target concepts are independent of contextual cues. In real-world settings, however, examples often appear within rich contexts, allowing models to exploit spurious correlations between foreground content and background signals. While such effects have been studied in few-shot image classification, their role in few-shot audio classification remains largely unexplored, and existing audio benchmarks offer limited control over contextual structure. We introduce SpurAudio, a benchmark that leverages the natural separability of foreground events and background environments in audio to enable controlled, multi-level evaluation of contextual shifts across support and query sets. Using this benchmark, we show that many state-of-the-art few-shot methods suffer severe performance degradation when background correlations are disrupted, despite achieving similar accuracy under standard evaluation protocols. Crucially, this vulnerability persists even in large pretrained audio foundation models, ruling out limited backbone capacity as an explanation. Moreover, methods that appear comparable under conventional benchmarks can exhibit markedly different sensitivity to spurious correlations, revealing systematic algorithmic strengths and vulnerabilities tied to how feature representations interact with classifier heads at inference time. These findings provide new insight into the behavior of few-shot methods in audio and highlight the need for benchmarks that explicitly probe context dependence when evaluating FSC models.

2605.00182 2026-06-04 cs.LG 版本更新

Towards A Generative Protein Evolution Machine with DPLM-Evo

迈向生成式蛋白质进化机器:DPLM-Evo

Xinyou Wang, Liang Hong, Jiasheng Ye, Zaixiang Zheng, Yu Li, Shujian Huang, Quanquan Gu

发表机构 * Nanjing University(南京大学) CUHK(香港中文大学) Fudan University(复旦大学) ByteDance(字节跳动)

AI总结 提出DPLM-Evo,一种显式建模替换、插入和删除操作的进化离散扩散框架,在单序列设置下实现蛋白质突变效应预测的最优性能,并支持变长模拟进化与蛋白质编辑优化。

Comments A peer-reviewed version was accepted to ICML 2026

详情
AI中文摘要

蛋白质在生物物理和功能约束下通过逐渐进化形成。蛋白质语言模型从大规模序列中学习丰富的进化约束,基于离散扩散的蛋白质语言模型(如DPLM)在理解和生成方面都很有前景。然而,现有的DPLM通常依赖于掩码扩散,这与一个简单的生物学直觉相矛盾:蛋白质通过累积的编辑进化,而不是从掩码中出现。因此,这些框架缺乏用于替换和插入/删除(indel)操作的显式预训练目标,限制了优化风格的后编辑和灵活的引导生成。为了解决这些限制,我们提出了DPLM-Evo,一种进化离散扩散框架,在去噪过程中显式预测替换、插入和删除操作。DPLM-Evo将上采样长度的潜在对齐空间与可变长度的观测序列空间解耦,使得indel感知生成变得可行。为了更好地将替换与真实进化对齐,我们进一步引入了一种上下文感知的进化噪声核,产生生物学信息丰富、上下文依赖的突变模式。在各种任务中,DPLM-Evo提升了序列理解能力,并在单序列设置下在ProteinGym上实现了最先进的突变效应预测性能。它还支持变长模拟进化,以及通过显式编辑轨迹对现有蛋白质进行后编辑/优化。

英文摘要

Proteins are shaped by gradual evolution under biophysical and functional constraints. Protein language models learn rich evolutionary constraints from large-scale sequences, and discrete diffusion-based protein language models (e.g., DPLMs) are promising for both understanding and generation. However, existing DPLMs typically rely on masked diffusion that contradicts a simple biological intuition: proteins evolve through accumulated edits, not by emerging from masks. Consequently, these frameworks lack explicit pretraining objectives for substitution and insertion/deletion (indel) operations, limiting both optimization-style post-editing and flexible guided generation. To address these limitations, we present DPLM-Evo, an evolutionary discrete diffusion framework that explicitly predicts substitution, insertion, and deletion operations during denoising. DPLM-Evo decouples an upsampled-length latent alignment space from the variable-length observed sequence space, which makes indel-aware generation tractable. To better align substitutions with real evolution, we further introduce a contextualized evolutionary noising kernel that produces biologically informed, context-dependent mutation patterns. Across tasks, DPLM-Evo improves sequence understanding and achieves state-of-the-art mutation effect prediction performance on ProteinGym in the single-sequence setting. It also enables variable-length simulated evolution, and post-editing/optimization of existing proteins via explicit edit trajectories.

2304.10891 2026-06-04 cs.LG cs.AI cs.CV cs.RO cs.SY eess.SY 版本更新

Transformer-Based Autonomous Driving Models and Deployment-Oriented Compression: A Survey

基于Transformer的自动驾驶模型与面向部署的压缩:综述

Juan Zhong, Yuhang Shi, Zukang Xu, Xi Chen

发表机构 * Renmin University of China(中国人民大学) Artificial Intelligence Innovation and Incubation Institute, Fudan University(复旦大学人工智能创新与孵化院) Shanghai Academy of AI for Science(上海人工智能科学研究院) Department of houmo.ai(houmo.ai部门)

AI总结 本文综述了基于Transformer的自动驾驶模型,并从部署角度分析了压缩与加速策略(如量化、剪枝、知识蒸馏等)如何影响模型设计、部署性、鲁棒性和安全性。

详情
AI中文摘要

基于Transformer的模型正成为自动驾驶的核心范式,因为它们能够捕捉感知、预测和规划中的长程空间依赖、多智能体交互和多模态上下文。然而,它们在真实车辆中的部署仍然困难,因为高容量注意力架构带来了显著的延迟、内存和能量开销。本综述回顾了具有代表性的基于Transformer的自动驾驶模型,并按任务角色、感知配置和架构设计进行组织。更重要的是,我们从面向部署的角度审视这些模型,分析效率约束如何在实际中重塑模型设计选择。我们进一步回顾了与基于Transformer的驾驶系统相关的压缩和加速策略,包括量化、剪枝、知识蒸馏、低秩近似和高效注意力,并讨论了它们的优势、局限性和任务依赖性。我们不将压缩视为孤立的后期处理步骤,而是强调其作为直接影响部署性、鲁棒性和安全性的系统级设计考虑。最后,我们指出了面向标准化、安全感知和硬件感知的高效自动驾驶系统评估的开放挑战和未来研究方向。

英文摘要

Transformer-based models are becoming a central paradigm in autonomous driving because they can capture long-range spatial dependencies, multi-agent interactions, and multimodal context across perception, prediction, and planning. At the same time, their deployment in real vehicles remains difficult because high-capacity attention-based architectures impose substantial latency, memory, and energy overhead. This survey reviews representative Transformer-based autonomous driving models and organizes them by task role, sensing configuration, and architectural design. More importantly, it examines these models from a deployment-oriented perspective and analyzes how efficiency constraints reshape model design choices in practice. We further review compression and acceleration strategies relevant to Transformer-based driving systems, including quantization, pruning, knowledge distillation, low-rank approximation, and efficient attention, and discuss their benefits, limitations, and task-dependent applicability. Rather than treating compression as an isolated post-processing step, we highlight it as a system-level design consideration that directly affects deployability, robustness, and safety. Finally, we identify open challenges and future research directions toward standardized, safety-aware, and hardware-conscious evaluation of efficient autonomous driving systems.

2602.02834 2026-06-04 cs.LG cs.AI 版本更新

What Structural Inductive Bias Helps Transformers Reason Over Knowledge Graphs? A Study with Tabula RASA

什么结构归纳偏置帮助Transformer在知识图谱上进行推理?Tabula RASA研究

Jonas Petersen, Camilla Mazzoleni, Gian-Alessandro Lombardi, Federico Martelli, Riccardo Maggioni

发表机构 * ETH Zurich(苏黎世联邦理工学院)

AI总结 通过最小化Transformer修改的消融实验,发现稀疏邻接掩码是驱动多跳推理的主要结构归纳偏置,而关系参数贡献有限。

Comments Accepted at GFM, ICML 2026

详情
AI中文摘要

什么结构归纳偏置帮助Transformer在知识图谱上进行推理?通过对一个最小化Transformer修改(包含四个独立可移除组件:稀疏邻接掩码、边类型偏置、查询缩放、值门控)进行受控消融,我们隔离了哪些结构信号驱动多跳推理。我们的发现很明确:稀疏邻接掩码单独占据了相对于未掩码Transformer改进的主要份额(在3跳MetaQA上+72.5pp,在WebQSP上+45.5pp,在CWQ上+53.9pp),而学习的关系参数只增加了适度的改进,并且在缺乏结构指导时可能造成损害。一个零样本实验提供了架构独立的佐证:当边类型被排除时,基于掩码的注意力退化比关系特定权重少4.0倍。多跳KGQA的有用归纳偏置主要是拓扑的,而非关系的。

英文摘要

What structural inductive bias helps transformers reason over knowledge graphs? Through controlled ablations of a minimal transformer modification with four independently removable components (sparse adjacency masking, edge-type biases, query scaling, value gating), we isolate which structural signals drive multi-hop reasoning. Our finding is sharp: sparse adjacency masking alone accounts for the dominant share of improvement over unmasked transformers (+72.5pp on 3-hop MetaQA, +45.5pp on WebQSP, +53.9pp on CWQ), while learned relation parameters add only modest refinement and can actively hurt without structural guidance. A zero-shot experiment provides architecturally independent corroboration: masking-based attention degrades 4.0x less than relation-specific weights when edge types are held out. The useful inductive bias for multi-hop KGQA is predominantly topological, not relational.
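Sparse adjacency masking, the component the ablations single out, can be sketched as single-head attention in which each node may attend only to its graph neighbours. This is an illustrative sketch rather than the paper's architecture: the function name is hypothetical, and keeping self-loops so that every row has at least one allowed position is an assumption.

```python
import numpy as np

def masked_attention(q, k, v, adj):
    # Scaled dot-product attention restricted to graph edges.
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    # Assumption: allow self-attention so no row is fully masked.
    mask = adj.astype(bool) | np.eye(len(adj), dtype=bool)
    scores = np.where(mask, scores, -np.inf)
    # Numerically stable softmax over the allowed positions only.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

# Chain graph 0 - 1 - 2: nodes 0 and 2 are not directly connected.
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]])
rng = np.random.default_rng(0)
x = rng.normal(size=(3, 4))
out, weights = masked_attention(x, x, x, adj)
```

In one layer, node 0 places zero weight on node 2, so information between them must pass through node 1 across two layers, which mirrors hop-by-hop traversal of the graph.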

2510.17281 2026-06-04 cs.LG cs.AI cs.IR 版本更新

MemoryBench: A Benchmark for Memory and Continual Learning in LLM Systems

MemoryBench:面向LLM系统的记忆与持续学习基准

Qingyao Ai, Yichen Tang, Changyue Wang, Jianming Long, Weihang Su, Yiqun Liu

发表机构 * Department of Computer Science and Technology, Tsinghua University, Beijing, China(清华大学计算机科学与技术系)

AI总结 提出用户反馈模拟框架及跨领域、多语言、多任务类型的综合基准MemoryBench,评估LLM系统从累积用户反馈中持续学习的能力,实验表明现有方法效果与效率均不理想。

详情
AI中文摘要

扩展数据、参数和测试时计算一直是改进LLM系统(LLMsys)的主流方法,但由于高质量数据的逐渐枯竭以及更大计算资源消耗带来的边际收益,这些方法的性能上限已几乎达到。受人类和传统AI系统从实践中学习能力的启发,为LLMsys构建记忆和持续学习框架已成为近期文献中一个重要且热门的研究方向。然而,现有的LLM记忆基准通常侧重于评估系统在长文本输入的同质阅读理解任务上的表现,而非测试其在服务时间内从累积用户反馈中学习的能力。因此,我们提出了一个用户反馈模拟框架和一个涵盖多个领域、语言和任务类型的综合基准,以评估LLMsys的持续学习能力。实验表明,最先进的基线方法在有效性和效率上远未令人满意,我们希望这一基准能为未来LLM记忆和优化算法的研究铺平道路。

英文摘要

Scaling up data, parameters, and test-time computation has been the mainstream approach to improving LLM systems (LLMsys), but its upper bounds are almost reached due to the gradual depletion of high-quality data and the marginal gains obtained from larger computational resource consumption. Inspired by the abilities of humans and traditional AI systems to learn from practice, constructing memory and continual learning frameworks for LLMsys has become an important and popular research direction in recent literature. Yet, existing benchmarks for LLM memory often focus on evaluating the system on homogeneous reading comprehension tasks with long-form inputs rather than testing its ability to learn from accumulated user feedback in service time. Therefore, we propose a user feedback simulation framework and a comprehensive benchmark covering multiple domains, languages, and types of tasks to evaluate the continual learning abilities of LLMsys. Experiments show that the effectiveness and efficiency of state-of-the-art baselines are far from satisfactory, and we hope this benchmark can pave the way for future studies on LLM memory and optimization algorithms. Website: https://memorybench.thuir.cn Code: https://github.com/THUIR/MemoryBench Data: https://huggingface.co/datasets/THUIR/MemoryBench Data-Full: https://huggingface.co/datasets/THUIR/MemoryBench-Full

2506.01250 2026-06-04 cs.LG stat.ML 版本更新

Neural Variance-aware Dueling Bandits with Deep Representation and Shallow Exploration

神经方差感知的深度表示与浅层探索的对抗性老虎机

Youngmin Oh, Jinje Park, Taejin Paik

发表机构 * InfiniTree Samsung Electronics(InfiniTree三星电子) Samsung Electronics(三星电子)

AI总结 提出首个方差感知的上下文对决老虎机算法,结合浅层探索与神经网络非线性效用逼近,通过迭代自改进与谱分析将网络宽度需求从Ω̃(T^{14})降至Ω̃(T^{6}),并实现次线性遗憾。

Comments Accepted at AISTATS 2026; code at https://github.com/youngmin0oh/NVLDB-AISTATS2026

详情
AI中文摘要

我们首次引入了方差感知的上下文对决老虎机算法,该算法利用浅层探索策略与神经网络进行非线性效用逼近。一个关键的理论挑战是缺乏闭式估计量,这导致先前的工作需要极大的网络宽度$m$(即$m = \widetilde{\Omega}(T^{14})$)。我们通过一种结合迭代自改进与谱分析的新颖分析方法解决了这一约束。我们的分析将网络宽度需求显著降低至$m = \widetilde{\Omega}(T^{6})$,并表明我们的算法在UCB和TS框架下均实现了次线性遗憾$\widetilde{\mathcal{O}}(d\sqrt{\sum_{t=1}^{T} \sigma_t^2} + \sqrt{dT})$。实验结果表明,所提出的算法不仅计算高效,在实际环境中表现出次线性遗憾,而且在合成和实际任务上均达到了最先进的性能。

英文摘要

We introduce the first variance-aware algorithms for contextual dueling bandits that leverage shallow exploration strategies with neural networks for nonlinear utility approximation. A key theoretical challenge is the absence of a closed-form estimator, which led prior work to require an extremely large network width $m$ (i.e., $m = \widetilde{\Omega}(T^{14})$). We address this constraint with a novel analytical approach that combines iterative self-improvement with spectral analysis. Our analysis significantly reduces the network width requirement to $m = \widetilde{\Omega}(T^{6})$, and shows that our algorithms achieve a sublinear regret of $\widetilde{\mathcal{O}}(d\sqrt{\sum_{t=1}^{T} \sigma_t^2} + \sqrt{dT})$ under both UCB and TS frameworks. Empirical results show that the proposed algorithms are not only computationally efficient and exhibit sublinear regret in practical settings, but also achieve state-of-the-art performance on both synthetic and real-world tasks.

2605.07724 2026-06-04 cs.LG cs.AI 版本更新

Curated Synthetic Data Doesn't Have to Collapse: A Theoretical Study of Generative Retraining with Pluralistic Preferences

策展合成数据不必崩溃:具有多元偏好的生成式再训练的理论研究

Ali Falahati, Mohammad Mohammadi Amiri, Kate Larson, Lukasz Golab

发表机构 * University of Washington(华盛顿大学)

AI总结 通过理论分析证明,基于多个奖励函数进行策展的递归训练可以避免生成模型崩溃,并收敛到满足加权纳什议价解的稳定分布。

Comments Proceedings of the 43rd International Conference on Machine Learning, Seoul, South Korea. PMLR 306, 2026

详情
AI中文摘要

生成模型的递归再训练提出了一个关键的表示挑战:当基于固定奖励信号策展合成输出时,模型倾向于崩溃到过度优化该目标的狭窄输出集上。先前的研究表明,如果不将真实数据混合进来,这种崩溃是不可避免的。我们从对齐角度重新审视这一结论,并表明通过基于多个奖励函数的策展可以减轻崩溃。我们形式化了异质偏好下递归训练的动力学,并证明在特定条件下,模型收敛到一个稳定分布,该分布在竞争的高奖励区域之间分配概率质量。极限分布保持多样性,并证明满足加权纳什议价解,为合成再训练循环中的价值聚合提供了正式解释。

英文摘要

Recursive retraining of generative models poses a critical representation challenge: when synthetic outputs are curated based on a fixed reward signal, the model tends to collapse onto a narrow set of outputs that over-optimize that objective. Prior work suggests that such collapse is unavoidable without adding real data into the mix. We revisit this conclusion from an alignment perspective and show that collapse can be mitigated through curation based on multiple reward functions. We formalize the dynamics of recursive training under heterogeneous preferences and prove that, under certain conditions, the model converges to a stable distribution that allocates probability mass across competing high-reward regions. The limiting distribution preserves diversity and provably satisfies a weighted Nash bargaining solution, offering a formal interpretation of value aggregation in synthetic retraining loops.

2605.07032 2026-06-04 cs.LG cs.AI 版本更新

A Systematic Investigation of RL-Jailbreaking in LLMs

LLMs中RL越狱的系统性研究

Montaser Mohammedalamen, Kevin Roice, Reginald McLean, Alyssa Lefaivre Škopac

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 本文首次系统分解RL越狱框架,通过分析奖励函数、动作空间、回合长度等环境形式化因素和算法措施,发现密集奖励和延长回合长度是越狱成功的主要驱动因素,并提供了提升RL越狱效率及强化模型防御的工具。

Comments Warning: This paper may contain unfiltered and potentially offensive jailbreaking examples. Accepted at the Second Workshop on Agents in the Wild: Safety, Security, and Beyond (AIWILD) at ICML 2026

详情
AI中文摘要

生成模型从下一个词预测器演变为复杂系统的自主引擎,这要求严格的安全加固。对抗性越狱,即通过策略性操纵模型以产生有害输出,仍然是安全部署的主要威胁。虽然强化学习(RL)通过顺序优化将越狱视为多步攻击,但对该框架为何成功的机制理解仍不完整。为填补这一空白,我们首次对RL越狱进行了系统分解。我们将框架解构为问题形式化(奖励函数、动作空间、回合长度)和算法措施(RL算法、训练数据、奖励塑造),以识别对抗成功的结构决定因素。我们的结果表明,RL越狱者成功攻破了所有目标模型和安全措施。通过这种首次分析,我们证明环境形式化,特别是密集奖励和延长回合长度,是越狱成功的主要驱动因素。这项工作为提高RL越狱效率提供了工具,并最终强化生成模型以抵御基于RL的攻击。

英文摘要

The evolution of generative models from next-token predictors to autonomous engines of complex systems necessitates rigorous safety hardening. Adversarial jailbreaking, the strategic manipulation of models to elicit harmful output, remains a primary threat to safe deployment. While Reinforcement Learning (RL) frames jailbreaking as a multi-step attack through sequential optimization, a mechanistic understanding of why the framework succeeds remains incomplete. To fill this gap, we present the first systematic decomposition of RL jailbreaking. We deconstruct the framework into problem formalization (reward function, action space, episode length) and algorithmic measures (RL algorithm, training data, reward shaping) to identify the structural determinants of adversarial success. Our results reveal that the RL-jailbreaker successfully compromised all targeted models and safeguards. Through this first-of-its-kind analysis, we demonstrate that environment formalization, specifically dense rewards and extended episode lengths, is the primary driver of jailbreaking success. This work provides a tool for improving RL-jailbreaker efficiency and, ultimately, for hardening generative models against RL-based attacks.

2604.25649 2026-06-04 cs.LG 版本更新

Towards interpretable AI with quantum annealing feature selection

迈向可解释的人工智能:基于量子退火的特征选择

Francesco Aldo Venturelli, Emanuele Costa, Sikha O K, Bruno Juliá-Díaz, Miguel A. González Ballester, Alba Cervera-Lierta

发表机构 * BCN Medtech, Universitat Pompeu Fabra, Barcelona, Spain(BCN医疗科技,庞培法布拉大学,巴塞罗那,西班牙) Barcelona Supercomputing Center (BSC)(巴塞罗那超级计算中心(BSC)) Departament de Física Quàntica i Astrofísica, Facultat de Física, Universitat de Barcelona(巴塞罗那大学物理学院量子物理与天体物理系) Institut de Ciències del Cosmos, Universitat de Barcelona, ICCUB(巴塞罗那大学宇宙科学研究所,ICCUB) ICREA, Barcelona, Spain(ICREA,巴塞罗那,西班牙)

AI总结 提出一种利用量子退火选择最具代表性特征图的方法,以解释卷积神经网络的图像分类预测,相比GradCAM和GradCAM++提升了类别解缠和解释质量。

Comments Text improvement and extra tests in v2. 15 pages, 10 figures, 1 table, including appendices

详情
AI中文摘要

深度学习模型被用于关键应用中,其中错误可能导致严重后果。因此,理解模型如何以及为何生成预测至关重要。这种理解提供了有用信息,用于检查模型是否学习到正确的模式、检测数据中的偏差、改进模型设计以及构建可信赖的系统。本文提出了一种新方法,用于解释图像分类任务中的卷积神经网络。该方法通过选择对每个预测贡献最大的最具代表性特征图来工作。为了解决这个组合问题,我们将其编码为量子约束优化问题,并提出使用量子退火求解。我们针对最先进的可解释AI技术(特别是GradCAM和GradCAM++)评估了我们的方法,并观察到类别解缠的改进,即模型的决策边界变得更加清晰,其推理更加透明。这表明我们的方法提高了解释质量,使得更容易理解模型依赖哪些特征进行特定预测。此外,我们研究了量子退火算法的计算行为。具体来说,我们分析了计算过程中系统的最小能隙以及算法找到正确解的概率。这些分析为该方法在实践中有效工作的原因提供了理论见解。

英文摘要

Deep learning models are used in critical applications, in which mistakes can have serious consequences. Therefore, it is crucial to understand how and why models generate predictions. This understanding provides useful information to check whether the model is learning the right patterns, detect biases in the data, improve model design, and build systems that can be trusted. This work proposes a new method for interpreting Convolutional Neural Networks in image classification tasks. The approach works by selecting the most representative feature maps that contribute to each prediction. To solve this combinatorial problem, we encode it into a quantum constrained optimization problem and propose to solve it using quantum annealing. We evaluate our method against the state-of-the-art explainable AI techniques, specifically GradCAM and GradCAM++, and observe an improved class disentanglement, i.e. the model's decision boundaries become more distinct and its reasoning more transparent. This demonstrates that our approach enhances the quality of explanations, making it easier to understand which features the model relies on for specific predictions. In addition, we study the computational behavior of the quantum annealing algorithm. Specifically, we analyze the minimum energy gap of the system during computation and the probability that the algorithm finds the correct solution. These analyses provide theoretical insight into why the method works effectively in practice.

2604.00860 2026-06-04 cs.LG 版本更新

Policy Improvement Reinforcement Learning

策略改进强化学习

Huaiyang Wang, Xiaojie Li, Deqing Wang, Haoyi Zhou, Zixuan Huang, Yaodong Yang, Jianxin Li, Yikun Ban

发表机构 * Beihang University(北航) Peking University(北京大学)

AI总结 提出策略改进强化学习(PIRL)框架,以最大化跨迭代的累积策略改进取代代理奖励最大化,并基于此设计策略改进策略优化(PIPO)算法,实现闭环优化,在数学推理基准上提升稳定性和性能。

Comments Update author list

详情
AI中文摘要

具有可验证奖励的强化学习(RLVR)已成为改进大型语言模型推理能力的核心后训练范式。然而,现有方法存在一个共同的盲点:它们基于瞬时组级或批次级统计量优化策略,而从未验证所得更新是否实际改进了模型。这种开环设计在每一步孤立地更新,仅由组内(批次)奖励信号引导,这意味着优化可能漂移或崩溃,且没有机制来检测和纠正这些失败。我们认为缺失的要素是策略改进反馈:直接测量和优化跨迭代进展的能力。为此,我们引入策略改进强化学习(PIRL),该框架以最大化跨迭代累积策略改进这一显式目标取代代理奖励最大化,并证明该时间目标与最大化最终任务性能完美对齐。基于PIRL,我们提出策略改进策略优化(PIPO),通过回顾性验证实现闭环优化。在每次迭代中,PIPO评估先前更新是否相对于滑动窗口历史基线产生了真正改进,然后主动强化有益更新并抑制有害更新,从而将开环过程转变为自纠正过程。我们提供理论分析表明PIPO在期望上对PIRL目标进行上升,并且在数学推理基准上的实验表明,与GRPO及其变体相比,PIPO提高了稳定性和性能。

英文摘要

Reinforcement Learning with Verifiable Rewards (RLVR) has become a central post-training paradigm for improving the reasoning capabilities of large language models. Yet existing methods share a common blind spot: they optimize policies based on instantaneous group-level or batch-level statistics without ever verifying whether the resulting update actually improved the model. This open-loop design -- updating in isolation at each step, guided only by within-group (batch) reward signals -- means optimization can drift or collapse with no mechanism to detect and correct these failures. We argue that the missing ingredient is policy improvement feedback: the ability to measure and optimize inter-iteration progress directly. To this end, we introduce Policy Improvement Reinforcement Learning (PIRL), a framework that replaces surrogate reward maximization with the explicit objective of maximizing cumulative policy improvement across iterations, and prove this temporal objective is perfectly aligned with maximizing final task performance. Building on PIRL, we propose Policy Improvement Policy Optimization (PIPO), which implements closed-loop optimization through retrospective verification. At each iteration, PIPO evaluates whether the previous update yielded genuine improvement against a sliding-window historical baseline, then actively reinforces beneficial updates and suppresses the harmful ones -- transforming an open-loop process into a self-correcting one. We provide theoretical analysis showing that PIPO performs ascent on the PIRL objective in expectation, and experiments on mathematical reasoning benchmarks demonstrate improved stability and performance over GRPO and its variants.

2401.07386 2026-06-04 cs.CY cs.AI cs.LG 版本更新

How do machines learn? Evaluating the AIcon2abs method

机器如何学习?评估AIcon2abs方法

Rubens Lacerda Queiroz, Cabral Lima, Fabio Ferrentini Sampaio, Priscila Machado Vieira Lima

发表机构 * PPGI, Federal University of Rio de Janeiro(里约热内卢联邦大学PPGI系) Computer Science Institute, Federal University of Rio de Janeiro(里约热内卢联邦大学计算机科学研究所) Polytechnic University of Setúbal – Portugal(葡萄牙塞图巴尔理工大学) PESC/COPPE, Federal University of Rio de Janeiro(里约热内卢联邦大学PESC/COPPE系) Tercio Pacitti Institute (NCE), Federal University of Rio de Janeiro(里约热内卢联邦大学Tercio Pacitti研究所(NCE))

AI总结 本研究通过远程课程实验,评估了基于WiSARD无权重神经网络、无需互联网的AIcon2abs方法在提升不同年龄段公众对机器学习理解方面的有效性,结果显示参与者满意度高。

Comments textual review (spelling and grammar); reorganization of the elements of some figures; New references included

详情
AI中文摘要

本研究扩展了先前介绍AIcon2abs方法(从具体到抽象的人工智能:向公众揭秘人工智能)的工作,该方法是一种创新方法,旨在提高不同年龄群体(包括K-12学生)对机器学习(ML)的理解,并评估其有效性。AIcon2abs采用WiSARD算法,这是一种以其简单性和用户可访问性著称的无权重神经网络。WiSARD不需要互联网,使其非常适合非技术用户和资源有限的环境。该方法使参与者能够通过引人入胜的动手活动直观地可视化和交互ML过程,仿佛他们自己就是算法。该方法允许用户通过实践活动直观地可视化和理解训练和分类的内部过程。由于WiSARD的功能不需要互联网连接,它可以从最小数据集(甚至单个示例)中有效学习。这一特性使用户能够观察到机器在接收更多数据时如何逐步提高其准确性。此外,WiSARD生成代表其学习内容的心理图像,突出显示分类数据的基本特征。AIcon2abs通过一个六小时的远程课程进行测试,有34名巴西参与者,包括5名儿童、5名青少年和24名成人。数据分析从两个角度进行:混合方法预实验(包括假设检验)和定性现象学分析。几乎所有参与者都对AIcon2abs给予正面评价,结果显示在实现预期结果方面具有高度满意度。本研究已获得CEP-HUCFF-UFRJ研究伦理委员会的批准。

英文摘要

This study expands on previous work that introduced the AIcon2abs method (AI from Concrete to Abstract: Demystifying Artificial Intelligence to the general public), an innovative approach designed to increase public understanding of machine learning (ML) across diverse age groups, including K-12 students, and aims to evaluate its effectiveness. AIcon2abs employs the WiSARD algorithm, a weightless neural network known for its simplicity and user accessibility. WiSARD does not require Internet, making it ideal for non-technical users and resource-limited environments. This method enables participants to intuitively visualize and interact with ML processes through engaging, hands-on activities, as if they were the algorithms themselves. The method allows users to intuitively visualize and understand the internal processes of training and classification through practical activities. Since WiSARD's functionality does not require an Internet connection, it can learn effectively from a minimal dataset, even from a single example. This feature enables users to observe how the machine improves its accuracy incrementally as it receives more data. Moreover, WiSARD generates mental images representing what it has learned, highlighting essential features of the classified data. AIcon2abs was tested through a six-hour remote course with 34 Brazilian participants, including 5 children, 5 adolescents, and 24 adults. Data analysis was conducted from two perspectives: a mixed-method pre-experiment (including hypothesis testing), and a qualitative phenomenological analysis. Nearly all participants rated AIcon2abs positively, with the results demonstrating a high degree of satisfaction in achieving the intended outcomes. This research was approved by the CEP-HUCFF-UFRJ Research Ethics Committee.
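
To make the "learns from a single example" claim concrete, here is a minimal sketch of a WiSARD-style weightless classifier in plain Python. It follows the textbook description of WiSARD (RAM nodes addressed by tuples of input bits, one discriminator per class) and is not the implementation used in the AIcon2abs course. The class names, the 8-bit patterns, and the tuple size are assumptions chosen for illustration.

```python
import random


class Discriminator:
    """One class discriminator: a list of RAM nodes, each addressed by a tuple of input bits."""

    def __init__(self, mapping, tuple_size):
        self.mapping = mapping            # shared pseudo-random ordering of input positions
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(len(mapping) // tuple_size)]

    def _addresses(self, bits):
        for r in range(len(self.rams)):
            positions = self.mapping[r * self.tuple_size:(r + 1) * self.tuple_size]
            yield r, tuple(bits[i] for i in positions)

    def train(self, bits):
        # Training only records which addresses were seen; there are no weights.
        for r, address in self._addresses(bits):
            self.rams[r].add(address)

    def score(self, bits):
        # Number of RAM nodes that recognise their part of the input.
        return sum(address in self.rams[r] for r, address in self._addresses(bits))


class WiSARD:
    def __init__(self, input_size, tuple_size, seed=0):
        if input_size % tuple_size != 0:
            raise ValueError("input_size must be a multiple of tuple_size")
        self.mapping = list(range(input_size))
        random.Random(seed).shuffle(self.mapping)
        self.tuple_size = tuple_size
        self.discriminators = {}

    def train(self, bits, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(self.mapping, self.tuple_size)
        self.discriminators[label].train(bits)

    def classify(self, bits):
        return max(self.discriminators, key=lambda lbl: self.discriminators[lbl].score(bits))


# One training example per class (hypothetical 8-bit patterns).
model = WiSARD(input_size=8, tuple_size=2)
model.train([1, 1, 1, 1, 0, 0, 0, 0], "left")
model.train([0, 0, 0, 0, 1, 1, 1, 1], "right")
prediction = model.classify([1, 1, 1, 0, 0, 0, 0, 0])  # "left" pattern with one flipped bit
```

The noisy input still matches three of the four "left" RAM nodes and none of the "right" ones, so it is classified as "left" after a single training example per class.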

2506.10630 2026-06-04 cs.LG cs.AI 版本更新

Time Series Forecasting as Reasoning: A Slow-Thinking Approach with Reinforced LLMs

时间序列预测作为推理:一种基于强化LLM的慢思考方法

Yitong Zhou, Yucong Luo, Mingyue Cheng, Qi Liu, Jiahao Wang, Daoyu Wang, Enhong Chen

发表机构 * State Key Laboratory of Cognitive Intelligence, University of Science and Technology of China(认知智能国家重点实验室,中国科学技术大学)

AI总结 提出Time-R1框架,通过两阶段强化微调(监督微调+强化学习)训练LLM进行多步推理,以提升时间序列预测的准确性。

详情
AI中文摘要

为了推进时间序列预测(TSF),人们提出了各种方法来提高预测精度,从统计技术发展到数据驱动的深度学习架构。尽管这些方法有效,但大多数现有方法仍然遵循快速思考范式——依赖提取历史模式并将其映射到未来值作为核心建模理念,缺乏包含中间时间序列推理的显式思考过程。与此同时,新兴的慢思考LLM(如OpenAI-o1)展示了显著的多步推理能力,为克服这些问题提供了替代途径。然而,仅靠提示工程存在若干局限性——包括高计算成本、隐私风险以及领域特定时间序列深度推理能力有限。为了解决这些局限性,更有前景的方法是训练LLM发展慢思考能力并获得强大的时间序列推理技能。为此,我们提出了Time-R1,一个两阶段强化微调框架,旨在增强LLM用于时间序列预测的多步推理能力。具体来说,第一阶段进行监督微调以进行预热适应,而第二阶段采用强化学习来提高模型的泛化能力。特别地,我们专门为时间序列预测设计了一个细粒度的多目标奖励,然后引入了GRIP(基于组的相对重要性策略优化),它利用非均匀采样进一步鼓励和优化模型对有效推理路径的探索。实验表明,Time-R1在多种数据集上显著提高了预测性能。

英文摘要

To advance time series forecasting (TSF), various methods have been proposed to improve prediction accuracy, evolving from statistical techniques to data-driven deep learning architectures. Despite their effectiveness, most existing methods still adhere to a fast-thinking paradigm, relying on extracting historical patterns and mapping them to future values as their core modeling philosophy, and lacking an explicit thinking process that incorporates intermediate time series reasoning. Meanwhile, emerging slow-thinking LLMs (e.g., OpenAI-o1) have shown remarkable multi-step reasoning capabilities, offering an alternative way to overcome these issues. However, prompt engineering alone presents several limitations, including high computational cost, privacy risks, and limited capacity for in-depth domain-specific time series reasoning. To address these limitations, a more promising approach is to train LLMs to develop slow-thinking capabilities and acquire strong time series reasoning skills. For this purpose, we propose Time-R1, a two-stage reinforcement fine-tuning framework designed to enhance the multi-step reasoning ability of LLMs for time series forecasting. Specifically, the first stage conducts supervised fine-tuning for warmup adaptation, while the second stage employs reinforcement learning to improve the model's generalization ability. Particularly, we design a fine-grained multi-objective reward specifically for time series forecasting, and then introduce GRIP (group-based relative importance for policy optimization), which leverages non-uniform sampling to further encourage and optimize the model's exploration of effective reasoning paths. Experiments demonstrate that Time-R1 significantly improves forecast performance across diverse datasets.

2502.00944 2026-06-04 cs.LG 版本更新

Training speedups via batching for geometric learning: an analysis of static and dynamic algorithms

通过批处理实现几何学习训练加速:静态与动态算法分析

Daniel T. Speckhard, Tim Bechtel, Sebastian Kehl, Jonathan Godwin, Claudia Draxl

发表机构 * Humboldt-Universität zu Berlin(柏林洪堡大学) Max Planck Institute for Solid State Research(马克斯·普朗克固态研究所) Max Planck Computing and Data Facility(马克斯·普朗克计算与数据设施) Orbital Materials(Orbital Materials公司)

AI总结 本文分析图神经网络中静态与动态批处理算法对训练速度和模型性能的影响,实验表明算法选择可带来最高2.7倍加速,但最优算法取决于数据、模型、批大小、硬件和训练步数。

详情
Journal ref
Transactions on Machine Learning Research (3/2026)
AI中文摘要

图神经网络(GNN)在材料科学、化学和社会科学等多个领域展现出有前景的结果。GNN模型通常包含数百万个参数,与其他神经网络(NN)模型一样,通常仅以批次方式输入训练数据集中的一部分图来更新模型参数。批处理算法对训练时间和模型性能的影响已在NN中得到了深入探索,但在GNN中尚未进行。我们分析了两种不同的基于图的模型批处理算法,即针对两个数据集(小分子QM9数据集和AFLOW材料数据库)的静态和动态批处理。我们的实验表明,更改批处理算法可提供高达2.7倍的加速,但最快的算法取决于数据、模型、批大小、硬件和运行的训练步数。实验表明,对于批大小、数据集和模型的某些组合,静态和动态批处理算法之间的模型学习指标存在显著差异。

英文摘要

Graph neural networks (GNN) have shown promising results for several domains such as materials science, chemistry, and the social sciences. GNN models often contain millions of parameters, and like other neural network (NN) models, are often fed only a fraction of the graphs that make up the training dataset in batches to update model parameters. The effect of batching algorithms on training time and model performance has been thoroughly explored for NNs but not yet for GNNs. We analyze two different batching algorithms for graph-based models, namely static and dynamic batching for two datasets, the QM9 dataset of small molecules and the AFLOW materials database. Our experiments show that changing the batching algorithm can provide up to a 2.7x speedup, but the fastest algorithm depends on the data, model, batch size, hardware, and number of training steps run. Experiments show that for a select number of combinations of batch size, dataset, and model, significant differences in model learning metrics are observed between static and dynamic batching algorithms.
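
The difference between the two batching strategies can be sketched in a few lines of Python. This follows one common reading of the terms (static: a fixed number of graphs per batch; dynamic: keep adding graphs until a size budget would be exceeded). The abstract does not give the paper's exact definitions, padding targets, or edge budgets, so the function names, the node budget, and the graph sizes below are illustrative assumptions.

```python
def static_batches(graph_sizes, graphs_per_batch):
    """Group a fixed number of graphs per batch, regardless of how large they are."""
    return [graph_sizes[i:i + graphs_per_batch]
            for i in range(0, len(graph_sizes), graphs_per_batch)]


def dynamic_batches(graph_sizes, node_budget):
    """Greedily add graphs to a batch until the next graph would exceed the node budget."""
    batches, current, used = [], [], 0
    for n_nodes in graph_sizes:
        if n_nodes > node_budget:
            raise ValueError("a single graph exceeds the node budget")
        if current and used + n_nodes > node_budget:
            batches.append(current)
            current, used = [], 0
        current.append(n_nodes)
        used += n_nodes
    if current:
        batches.append(current)
    return batches


sizes = [5, 9, 3, 12, 4, 7]                         # made-up node counts per graph
static = static_batches(sizes, graphs_per_batch=2)  # [[5, 9], [3, 12], [4, 7]]
dynamic = dynamic_batches(sizes, node_budget=20)    # [[5, 9, 3], [12, 4], [7]]
```

With static batching the number of graphs per step is constant but the node count varies (14, 15, 11 here), while dynamic batching keeps the node count under a budget at the cost of a varying number of graphs per step. This is one plausible reason why the faster choice depends on data, model, and hardware.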

2602.14757 2026-06-04 math.NA cs.LG cs.NA 版本更新

Solving Inverse Parametrized Problems via Finite Elements and Extreme Learning Networks

通过有限元和极限学习网络求解参数化反问题

Erik Burman, Mats G. Larson, Karl Larsson, Jonatan Vallin

发表机构 * KTH Royal Institute of Technology(皇家理工学院) Uppsala University(乌普萨拉大学)

AI总结 提出一种基于插值的建模框架,结合有限元离散和极限学习机代理,用于控制、反问题和不确定性量化中的参数依赖偏微分方程,并应用于定量光声层析成像,实现计算节省且保持精度。

详情
Journal ref
Comput. Methods Appl. Mech. Engrg. 460 (2026), Paper No. 119077
AI中文摘要

我们开发了一种基于插值的建模框架,用于控制、反问题和不确定性量化中出现的参数依赖偏微分方程。在物理域中使用有限元方法对解进行离散化,同时单独近似对有限维参数的依赖。我们建立了参数解的存在性、唯一性和正则性,并推导了严格的误差估计,明确量化了空间离散化和参数逼近之间的相互作用。在低维参数空间中,经典插值方案基于参数变量的Sobolev正则性产生代数收敛速度。在高维参数空间中,我们用极限学习机(ELM)代理替换经典插值,并在显式逼近和稳定性假设下获得误差界。该框架应用于定量光声层析成像中的反问题,我们推导了势和参数重建误差估计,并证明了与标准方法相比,在不牺牲精度的情况下显著节省了计算量。

英文摘要

We develop an interpolation-based modeling framework for parameter-dependent partial differential equations arising in control, inverse problems, and uncertainty quantification. The solution is discretized in the physical domain using finite element methods, while the dependence on a finite-dimensional parameter is approximated separately. We establish existence, uniqueness, and regularity of the parametric solution and derive rigorous error estimates that explicitly quantify the interplay between spatial discretization and parameter approximation. In low-dimensional parameter spaces, classical interpolation schemes yield algebraic convergence rates based on Sobolev regularity in the parameter variable. In higher-dimensional parameter spaces, we replace classical interpolation by extreme learning machine (ELM) surrogates and obtain error bounds under explicit approximation and stability assumptions. The proposed framework is applied to inverse problems in quantitative photoacoustic tomography, where we derive potential and parameter reconstruction error estimates and demonstrate substantial computational savings compared to standard approaches, without sacrificing accuracy.

2604.14575 2026-06-04 cs.LG cs.AI stat.ME stat.ML 版本更新

Generative Augmented Inference

生成式增强推断

Cheng Lu, Mengxin Wang, Dennis J. Zhang, Heng Zhang

发表机构 * University of California, Berkeley(加州大学伯克利分校) Stanford University(斯坦福大学) University of Toronto(多伦多大学)

AI总结 提出生成式增强推断(GAI)框架,将AI输出视为学习真实标签的高维信息特征而非代理,通过非参数方法建模,实现人机数据联合的一致估计和有效推断,在随机标注下渐近效率严格优于仅用人类数据。

详情
AI中文摘要

大型语言模型使得廉价的AI生成标注成为可能,但如何可靠地将其用于因果推断仍然具有挑战性。简单地将AI和人类数据混合会引入偏差,而现有方法如预测驱动推断(PPI;Angelopoulos et al., 2023a)将AI输出视为真实标签的代理——这一假设在实践中常被生成模型输出所违背。我们提出生成式增强推断(GAI),一个将AI输出视为学习人类标签的一般性、潜在高维信息特征而非替代品的框架。GAI使用非参数方法灵活建模这种关系,从而能够从人类和AI的联合数据中进行一致估计和有效推断。我们建立了渐近正态性,并证明在随机标注下,只要AI输出对真实标签具有信息量,GAI在渐近效率上严格优于仅使用人类数据的估计。在真实数据集上的实证研究表明,与仅使用人类数据和基于PPI的估计相比,GAI在多种生成数据源上显著降低了估计误差并提高了置信区间质量。

英文摘要

Large language models enable inexpensive AI-generated annotations, but using them reliably for causal inference remains challenging. Naively pooling AI and human data induces bias, while existing methods such as Prediction-Powered Inference (PPI; Angelopoulos et al., 2023a) treat AI outputs as proxies of true labels -- an assumption often violated for generative model outputs in practice. We propose Generative Augmented Inference (GAI), a framework that treats AI outputs as general, potentially high-dimensional informative features for learning human labels rather than as surrogates. GAI flexibly models this relationship using nonparametric methods, enabling consistent estimation and valid inference from combined human and AI data. We establish asymptotic normality and show that, under random labeling, GAI strictly improves asymptotic efficiency over human-data-only estimation whenever AI outputs are informative for true labels. Empirical studies on real-world datasets demonstrate that GAI significantly reduces estimation error and improves confidence interval quality across diverse generative data sources relative to human-only and PPI-based estimation.
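
For context on the baseline the abstract contrasts against, the following sketch shows the basic Prediction-Powered Inference point estimate of a mean: the average AI prediction on unlabeled data plus a "rectifier" estimated on the human-labeled subset. This is the standard PPI mean estimator, not the GAI method, whose nonparametric construction is not described in enough detail here to reproduce. The function name and the numbers are illustrative assumptions.

```python
def ppi_mean(preds_unlabeled, preds_labeled, labels):
    """PPI point estimate of E[Y]: mean prediction plus mean (label - prediction) on labeled data."""
    if len(preds_labeled) != len(labels):
        raise ValueError("each labeled example needs exactly one prediction")

    def mean(values):
        return sum(values) / len(values)

    # The rectifier corrects the systematic bias of the AI predictions.
    rectifier = mean([y - f for y, f in zip(labels, preds_labeled)])
    return mean(preds_unlabeled) + rectifier


# Made-up numbers: the AI model over-predicts by 0.25 on the labeled subset.
estimate = ppi_mean(
    preds_unlabeled=[0.5, 0.75, 1.0, 0.75],
    preds_labeled=[0.5, 1.0],
    labels=[0.25, 0.75],
)
```

Here the raw prediction mean is 0.75 and the rectifier is -0.25, giving an estimate of 0.5. The estimator uses the AI output as a direct stand-in for the label, which is the proxy assumption the abstract says GAI relaxes.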

2407.00809 2026-06-04 cs.LG cs.NA math.NA 版本更新

Kernel Neural Operators (KNOs) for Scalable, Memory-efficient, Geometrically-flexible Operator Learning

核神经算子(KNOs):可扩展、内存高效、几何灵活的算子学习

Matthew Lowery, John Turnage, Zachary Morrow, John D. Jakeman, Akil Narayan, Shandian Zhe, Varun Shankar

发表机构 * Kahlert School of Computing(卡勒特计算学院) University of Utah(犹他大学) Department of Mathematics(数学系) Sandia National Laboratories(桑迪亚国家实验室) Scientific Machine Learning(科学机器学习) Scientific Computing and Imaging (SCI) Institute(科学计算与成像(SCI)研究所)

AI总结 提出核神经算子(KNO),通过组合深度核积分算子实现算子学习,具有收敛性、低内存和几何灵活性,在基准测试中以更少参数达到可比或更高精度。

Comments 14 pages + 15 page appendix, 7 figures

详情
Journal ref
Transactions on Machine Learning Research, ISSN 2835-8856, 2026
AI中文摘要

本文介绍了核神经算子(KNO),一种可证明收敛的算子学习架构,它利用深度核积分算子的组合进行算子(函数到函数的映射)的函数空间逼近。KNO将核的选择与数值积分方案(求积)解耦,从而自然允许在不规则几何上使用显式选择的可训练核进行算子学习。在不规则域上,这使KNO能够利用特定于域的求积规则。为了帮助缓解维数灾难,我们还在规则域上利用了一种高效的维度分解算法。更重要的是,显式指定核的能力还允许使用高度表达性的、非平稳的、神经各向异性核,其参数通过训练神经网络计算。我们提出了通用逼近定理,表明连续和完全离散化的KNO都是算子学习问题的通用逼近器。数值结果表明,在现有基准测试中,KNO的训练和测试精度与流行的神经算子相当或更高,同时通常使用的可训练参数少一个数量级,其中更具表达性的核对于实现高精度至关重要。因此,KNO促进了低内存、几何灵活的深度算子学习,同时保留了科学计算和机器学习中传统核方法的实现简单性和透明性。

英文摘要

This paper introduces the Kernel Neural Operator (KNO), a provably convergent operator-learning architecture that utilizes compositions of deep kernel-based integral operators for function-space approximation of operators (maps from functions to functions). The KNO decouples the choice of kernel from the numerical integration scheme (quadrature), thereby naturally allowing for operator learning with explicitly-chosen trainable kernels on irregular geometries. On irregular domains, this allows the KNO to utilize domain-specific quadrature rules. To help ameliorate the curse of dimensionality, we also leverage an efficient dimension-wise factorization algorithm on regular domains. More importantly, the ability to explicitly specify kernels also allows the use of highly expressive, non-stationary, neural anisotropic kernels whose parameters are computed by training neural networks. We present universal approximation theorems showing that both the continuous and fully discretized KNO are universal approximators on operator learning problems. Numerical results demonstrate that on existing benchmarks the training and test accuracy of KNOs is closely comparable to or higher than that of popular neural operators while typically using an order of magnitude fewer trainable parameters, with the more expressive kernels proving important to attaining high accuracy. KNOs thus facilitate low-memory, geometrically-flexible, deep operator learning, while retaining the implementation simplicity and transparency of traditional kernel methods from both scientific computing and machine learning.

2604.11510 2026-06-04 cs.CL cs.AI cs.LG 版本更新

Policy Split: Incentivizing Dual-Mode Exploration in LLM Reinforcement with Dual-Mode Entropy Regularization

策略分裂:通过双模式熵正则化激励大语言模型强化学习中的双模式探索

Jiashu Yao, Heyan Huang, Daiqing Wu, Zeming Liu, Yuhang Guo

发表机构 * Beijing Institute of Technology(北京理工大学) Tsinghua University(清华大学) Beihang University(北航)

AI总结 提出Policy Split方法,将策略分裂为正常和高熵两种模式,通过协作双模式熵正则化在保持准确性的同时促进多样化探索,实验表明在通用和创造性任务上优于现有基线。

Comments preprint

详情
AI中文摘要

为了在不牺牲准确性的情况下鼓励大语言模型(LLM)强化学习(RL)中的多样化探索,我们提出了Policy Split,一种新颖的范式,通过高熵提示将策略分裂为正常模式和高熵模式。在共享模型参数的同时,两种模式针对不同目标进行协作的双模式熵正则化。具体来说,正常模式优化任务正确性,而高熵模式融入探索偏好,两种模式协作学习。大量实验表明,我们的方法在通用和创造性任务的各种模型规模上始终优于已建立的熵引导RL基线。进一步分析揭示,Policy Split促进了双模式探索,其中高熵模式产生与正常模式不同的行为模式,提供独特的学习信号。

英文摘要

To encourage diverse exploration in reinforcement learning (RL) for large language models (LLMs) without compromising accuracy, we propose Policy Split, a novel paradigm that bifurcates the policy into normal and high-entropy modes with a high-entropy prompt. While sharing model parameters, the two modes undergo collaborative dual-mode entropy regularization tailored to distinct objectives. Specifically, the normal mode optimizes for task correctness, while the high-entropy mode incorporates a preference for exploration, and the two modes learn collaboratively. Extensive experiments demonstrate that our approach consistently outperforms established entropy-guided RL baselines across various model sizes in general and creative tasks. Further analysis reveals that Policy Split facilitates dual-mode exploration, where the high-entropy mode generates behavioral patterns distinct from those of the normal mode, providing unique learning signals.

2604.08564 2026-06-04 cs.CL cs.LG 版本更新

Attention-Based Sampler for Diffusion Language Models

基于注意力的扩散语言模型采样器

Yuyan Zhou, Kai Syun Hou, Weiyu Chen, James Kwok

发表机构 * Department of Computer Science and Engineering, The Hong Kong University of Science and Technology(香港科技大学计算机科学与工程系)

AI总结 针对扩散语言模型采样中忽略全局序列结构的问题,提出基于注意力矩阵列和的采样顺序优化方法,实现无训练的高质量并行采样。

详情
AI中文摘要

自回归模型(ARMs)已在语言建模中建立了主导范式。然而,其严格的顺序采样范式对推理效率和建模灵活性施加了根本性限制。为解决这些限制,提出了基于扩散的大语言模型(dLLMs),提供了并行采样和灵活语言建模的潜力。尽管有这些优势,当前dLLMs的采样策略主要依赖于token级别的信息,未能考虑全局序列结构,往往产生次优结果。在本文中,我们从对数似然最大化的角度研究采样顺序选择问题。我们证明该问题是NP难的,并提出一种基于最优采样秩的近似方法,使目标在计算上可行。我们进一步证明,通过按注意力矩阵列和降序采样token可以优化该可行目标。这一发现为注意力引导采样提供了原则性依据,并提供了贪婪搜索的理论基础替代方案。我们将这一理论见解实例化为一种新的无训练采样算法,称为Attn-Sampler,并进一步提出动态注意力阈值以实现实际加速。在多个基准上的大量实验验证了我们方法的有效性,表明它在增强采样并行性的同时实现了更优的生成质量。

英文摘要

Auto-regressive models (ARMs) have established a dominant paradigm in language modeling. However, their strictly sequential sampling paradigm imposes fundamental constraints on both inference efficiency and modeling flexibility. To address these limitations, diffusion-based large language models (dLLMs) have been proposed, offering the potential for parallel sampling and flexible language modeling. Despite these advantages, current dLLM sampling strategies rely primarily on token-level information, which fails to account for global sequence structure and often yields suboptimal results. In this paper, we study the sampling order selection problem from the perspective of log-likelihood maximization. We show that this problem is NP-hard and propose an optimal sampling-rank-based approximation that makes the objective computationally tractable. We further prove that the tractable objective is optimized by sampling tokens in descending order of their attention-matrix column sums. This finding provides a principled justification for attention-guided sampling and offers a theoretically grounded alternative to greedy search. We instantiate this theoretical insight in a new training-free sampling algorithm, termed Attn-Sampler, and further propose dynamic attention thresholding for practical acceleration. Extensive experiments across multiple benchmarks validate the effectiveness of our proposed method, demonstrating that it achieves superior generation quality while enhancing the sampling parallelism.
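
The ordering rule stated in the abstract (sample tokens in descending order of their attention-matrix column sums) can be illustrated with a tiny example. The sketch below only computes that ordering for a single square attention matrix. How Attn-Sampler aggregates attention over heads and layers, restricts the ranking to masked positions, or applies its dynamic threshold is not specified in the abstract, so none of that is modeled. The function name and the matrix values are assumptions.

```python
def column_sum_order(attention):
    """Return token indices sorted by descending column sum of a square attention matrix.

    attention[i][j] is the attention that query position i pays to key position j,
    so a large column sum means position j is attended to heavily by the others.
    """
    n = len(attention)
    if any(len(row) != n for row in attention):
        raise ValueError("attention must be a square matrix")
    col_sums = [sum(row[j] for row in attention) for j in range(n)]
    # sorted() is stable, so ties keep their original left-to-right order.
    return sorted(range(n), key=lambda j: -col_sums[j])


# Made-up 3x3 attention matrix; each row sums to 1.
attention = [
    [0.5,   0.25,  0.25],
    [0.25,  0.25,  0.5],
    [0.125, 0.125, 0.75],
]
order = column_sum_order(attention)  # column sums are 0.875, 0.625, 1.5
```

In this example position 2 receives the most attention and would be sampled first, followed by positions 0 and 1.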

2603.21180 2026-06-04 cs.LG stat.CO stat.ME stat.ML 版本更新

ALMAB-DC: Active Learning, Multi-Armed Bandits, and Distributed Computing for Sequential Experimental Design and Black-Box Optimization

ALMAB-DC:用于序贯实验设计和黑箱优化的主动学习、多臂老虎机和分布式计算

Foo Hui-Mean, Yuan-chin I Chang

发表机构 * Institute of Statistical Science, Academia Sinica(中央研究院统计科学研究所)

AI总结 提出ALMAB-DC框架,结合高斯过程代理模型、多臂老虎机控制和异步分布式调度,解决昂贵黑箱优化问题,在多个基准上显著优于现有方法。

Comments 33 pages, and 13 figures

详情
AI中文摘要

在昂贵且无梯度目标下的序贯实验设计是计算统计学中的一个核心挑战:评估预算严格受限,必须从每次观测中高效提取信息。我们提出ALMAB-DC,一种基于高斯过程的序贯设计框架,结合主动学习、多臂老虎机(MAB)和分布式异步计算,用于昂贵的黑箱实验。具有不确定性感知获取函数的高斯过程代理模型识别信息量大的查询点;UCB或汤普森采样老虎机控制器在并行工作节点间分配评估;异步调度器处理异构运行时间。我们给出了老虎机组件的累积遗憾界,并通过阿姆达尔定律刻画了并行可扩展性。我们在五个基准上验证了ALMAB-DC。在两个统计实验设计任务中,ALMAB-DC在剂量-响应优化中实现了比等间距、随机和D最优设计更低的简单遗憾,在自适应空间场估计中匹配了贪婪最大方差基准并优于拉丁超立方采样;在$K=4$时,分布式设置达到目标性能所需的序贯挂钟轮次仅为四分之一。在三个机器学习/工程任务(CIFAR-10 HPO、CFD阻力最小化、MuJoCo RL)中,ALMAB-DC实现了93.4%的CIFAR-10准确率(超过BOHB 1.7个百分点和Optuna 1.1个百分点),将翼型阻力降低至$C_D = 0.059$(比网格搜索低36.9%),并将RL回报比网格搜索提高50%。所有相对于非ALMAB基线的优势在Bonferroni校正的Mann-Whitney $U$检验下均具有统计显著性。分布式执行在$K = 16$个智能体时实现了$7.5\times$加速,与阿姆达尔定律一致。

英文摘要

Sequential experimental design under expensive, gradient-free objectives is a central challenge in computational statistics: evaluation budgets are tightly constrained and information must be extracted efficiently from each observation. We propose ALMAB-DC, a GP-based sequential design framework combining active learning, multi-armed bandits (MAB), and distributed asynchronous computing for expensive black-box experimentation. A Gaussian process surrogate with uncertainty-aware acquisition identifies informative query points; a UCB or Thompson-sampling bandit controller allocates evaluations across parallel workers; and an asynchronous scheduler handles heterogeneous runtimes. We present cumulative regret bounds for the bandit components and characterize parallel scalability via Amdahl's Law. We validate ALMAB-DC on five benchmarks. On the two statistical experimental-design tasks, ALMAB-DC achieves lower simple regret than Equal Spacing, Random, and D-optimal designs in dose-response optimization, and in adaptive spatial field estimation matches the Greedy Max-Variance benchmark while outperforming Latin Hypercube Sampling; at $K=4$ the distributed setting reaches target performance in one-quarter of sequential wall-clock rounds. On three ML/engineering tasks (CIFAR-10 HPO, CFD drag minimization, MuJoCo RL), ALMAB-DC achieves 93.4% CIFAR-10 accuracy (outperforming BOHB by 1.7 pp and Optuna by 1.1 pp), reduces airfoil drag to $C_D = 0.059$ (36.9% below Grid Search), and improves RL return by 50% over Grid Search. All advantages over non-ALMAB baselines are statistically significant under Bonferroni-corrected Mann-Whitney $U$ tests. Distributed execution achieves $7.5\times$ speedup at $K = 16$ agents, consistent with Amdahl's Law.
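
As a quick sanity check on the reported scaling, Amdahl's Law, $S(K) = 1 / ((1 - p) + p / K)$, can be inverted to see what parallel fraction $p$ a $7.5\times$ speedup at $K = 16$ would imply. The resulting value of $p$ is a back-of-the-envelope inference from the two numbers in the abstract, not a figure reported by the authors. The function names are assumptions.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's Law: speedup with `workers` when a fraction of the work parallelises."""
    return 1.0 / ((1.0 - parallel_fraction) + parallel_fraction / workers)


def implied_parallel_fraction(speedup: float, workers: int) -> float:
    """Invert Amdahl's Law for the parallel fraction, given an observed speedup."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


p = implied_parallel_fraction(speedup=7.5, workers=16)  # roughly 0.924
limit = 1.0 / (1.0 - p)                                  # asymptotic speedup, roughly 13.2
```

Under this reading, about 92% of the workload would be parallelisable, and adding more workers could never push the speedup beyond roughly 13 times.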

2604.08438 2026-06-04 cs.LG 版本更新

Adalina: Adaptive Linear Approximation for the Shapley Value and Beyond

Adalina: Shapley值及更广的半值的自适应线性逼近

Weida Li, Yaoliang Yu, Bryan Kian Hsiang Low

发表机构 * School of Computer Science, University of Waterloo, Canada(滑铁卢大学计算机科学学院) Department of Computer Science, National University of Singapore, Republic of Singapore(新加坡国立大学计算机科学系) Vector Institute, Canada(加拿大向量研究所)

AI总结 针对Shapley值及其半值族的高效近似问题,提出一种基于向量集中不等式的理论框架,并开发了线性空间算法Adalina,在Θ(n)空间约束下实现O((n/ε²)·log(1/δ))次查询,显著降低均方误差。

详情
AI中文摘要

Shapley值及其更广的半值族在各种归因问题中受到了广泛关注。一个基本且长期存在的挑战是它们的有效近似,因为精确计算通常需要指数级的效用查询次数(关于玩家数量$n$)。为了应对大规模应用的挑战,我们探索了在$\Theta(n)$空间约束下有效近似半值的极限。基于向量集中不等式,我们建立了一个理论框架,使得现有的无偏随机算法能够获得更锐利的查询复杂度。在该框架内,我们系统地开发了一种线性空间算法,该算法需要$O(\frac{n}{\varepsilon^{2}}\log\frac{1}{\delta})$次效用查询,以确保对于所有常用的半值,有$P(\|\hat{\boldsymbol{\phi}}-\boldsymbol{\phi}\|_{2}\geq\varepsilon)\leq\delta$。特别地,我们的框架自然地桥接了OFA、无偏kernelSHAP、SHAP-IQ和回归调整方法,并明确刻画了配对采样何时有益。此外,我们的算法允许针对每个特定的效用函数显式最小化均方误差$\mathbb{E}[\|\hat{\boldsymbol{\phi}}-\boldsymbol{\phi}\|_{2}^{2}]$。据此,我们引入了第一个自适应的、线性时间、线性空间的随机算法Adalina,该算法在理论上实现了改进的均方误差。我们所有的理论发现都得到了实验验证。我们的代码可在https://github.com/watml/adalina获取。

英文摘要

The Shapley value, and its broader family of semi-values, has received much attention in various attribution problems. A fundamental and long-standing challenge is their efficient approximation, since exact computation generally requires an exponential number of utility queries in the number of players $n$. To meet the challenges of large-scale applications, we explore the limits of efficiently approximating semi-values under a $\Theta(n)$ space constraint. Building upon a vector concentration inequality, we establish a theoretical framework that enables sharper query complexities for existing unbiased randomized algorithms. Within this framework, we systematically develop a linear-space algorithm that requires $O(\frac{n}{\varepsilon^{2}}\log\frac{1}{\delta})$ utility queries to ensure $P(\|\hat{\boldsymbol{\phi}}-\boldsymbol{\phi}\|_{2}\geq\varepsilon)\leq\delta$ for all commonly used semi-values. In particular, our framework naturally bridges OFA, unbiased kernelSHAP, SHAP-IQ and the regression-adjusted approach, and definitively characterizes when paired sampling is beneficial. Moreover, our algorithm allows explicit minimization of the mean squared error $\mathbb{E}[\|\hat{\boldsymbol{\phi}}-\boldsymbol{\phi}\|_{2}^{2}]$ for each specific utility function. Accordingly, we introduce the first adaptive, linear-time, linear-space randomized algorithm, Adalina, that theoretically achieves improved mean squared error. All of our theoretical findings are experimentally validated. Our code is available at https://github.com/watml/adalina.
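
To show what "approximating the Shapley value from utility queries" looks like in practice, here is a sketch of the classic permutation-sampling Monte Carlo estimator. It is a generic textbook baseline and is not Adalina or any of the specific methods named in the abstract; the authors' code is at the repository linked above. The function name, the toy utility, and the sample count are illustrative assumptions.

```python
import random


def shapley_permutation_estimate(players, utility, num_permutations, seed=0):
    """Estimate Shapley values by averaging marginal contributions over random orderings.

    `utility` maps a frozenset of players to a number. Each permutation costs
    len(players) + 1 utility queries, and the running totals use O(n) space.
    """
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = utility(frozenset(coalition))
        for p in order:
            coalition.add(p)
            current = utility(frozenset(coalition))
            totals[p] += current - previous   # marginal contribution of p in this ordering
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


# Toy additive game: a coalition is worth the sum of its members' weights,
# so each player's exact Shapley value equals its own weight.
weights = {"a": 1.0, "b": 2.0, "c": 4.0}
estimates = shapley_permutation_estimate(
    players=list(weights),
    utility=lambda coalition: sum(weights[p] for p in coalition),
    num_permutations=50,
)
```

For an additive game every ordering yields the same marginal contribution, so the estimate is exact; for general utilities the error shrinks with the number of sampled permutations, which is the query cost that bounds like the one above aim to control.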

2604.02121 2026-06-04 physics.comp-ph cond-mat.stat-mech cs.LG physics.bio-ph physics.chem-ph 版本更新

Gradient estimators for parameter inference in discrete stochastic kinetic models

离散随机动力学模型中参数推断的梯度估计器

Ludwig Burger, Annalena Kofler, Lukas Heinrich, Ulrich Gerland

发表机构 * Physics of Complex Biosystems, School of Natural Sciences, James-Franck-Straße 1, 85748 Garching, Germany(复杂生物系统物理系,自然科学院,James-Franck街1号,85748 Garching,德国) Max Planck Institute for Intelligent Systems, Max-Planck-Ring 4, 72076 Tübingen, Germany(智能系统马克斯·普朗克研究所,Max-Planck环4号,72076 Tübingen,德国) Max Planck Institute for Gravitational Physics (Albert Einstein Institute), Am Mühlenberg 1, 14476 Potsdam, Germany(引力物理马克斯·普朗克研究所(爱因斯坦研究所),Am Mühlenberg 1号,14476 Potsdam,德国) Data Science in Physics, School of Natural Sciences, James-Franck-Straße 1, 85748 Garching, Germany(物理学数据科学,自然科学院,James-Franck街1号,85748 Garching,德国) Munich Center for Machine Learning (MCML), Munich, Germany(慕尼黑机器学习中心(MCML),慕尼黑,德国)

AI总结 针对Gillespie随机模拟算法不可微的问题,采用三种机器学习梯度估计器(Gumbel-Softmax直通、得分函数、替代路径)实现参数梯度计算,并在弛豫和振荡动力学系统中验证了其有效性。

Comments 19 pages, 9 figures

详情
AI中文摘要

随机动力学模型在物理学中无处不在,但从实验数据推断其参数仍然具有挑战性。对于确定性模型,参数推断通常依赖于梯度,这些梯度可以通过自动微分(AD)高效获得。然而,AD不能直接应用于Gillespie随机模拟算法(SSA),因为从离散反应集合中采样引入了不可微操作。在这项工作中,我们采用三种来自机器学习的梯度估计器用于Gillespie SSA:Gumbel-Softmax直通(GS-ST)估计器、得分函数估计器和替代路径估计器。我们使用这些估计器来评估稳态和时间依赖可观测量的梯度,并在具有弛豫动力学(双分子结合)和振荡动力学(抑制子)的代表性生物物理系统中比较它们的性能。我们发现GS-ST估计器通常产生表现良好的梯度估计,但在具有挑战性的参数区域中表现出发散的方差,这可能导致参数推断失败。在这些情况下,其他估计器提供更稳健、方差更低的梯度。我们的结果表明,基于梯度的参数推断可以有效地与Gillespie SSA结合,不同的估计器提供互补的优势。

英文摘要

Stochastic kinetic models are ubiquitous in physics, yet inferring their parameters from experimental data remains challenging. For deterministic models, parameter inference often relies on gradients, which can be obtained efficiently through automatic differentiation (AD). However, AD cannot be applied directly to the Gillespie stochastic simulation algorithm (SSA), since sampling from a discrete set of reactions introduces non-differentiable operations. In this work, we adopt three gradient estimators from machine learning for the Gillespie SSA: the Gumbel-Softmax Straight-Through (GS-ST) estimator, the Score Function estimator, and the Alternative Path estimator. We use the estimators to evaluate gradients of steady-state and time-dependent observables, and compare their performance in representative biophysical systems with relaxation dynamics (bimolecular association) and oscillatory dynamics (repressilator). We find that the GS-ST estimator generally yields well-behaved gradient estimates, but exhibits diverging variance in challenging parameter regimes, which can cause parameter inference to fail. In these cases, other estimators provide more robust, lower variance gradients. Our results demonstrate that gradient-based parameter inference can be effectively combined with the Gillespie SSA, with different estimators offering complementary advantages.
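
The non-differentiable step mentioned in the abstract is easiest to see in a bare-bones Gillespie simulation. The sketch below simulates a simple birth-death process (production at a constant rate, degradation proportional to the copy number). This system is chosen purely for illustration and is not one of the paper's benchmarks, and none of the three gradient estimators is implemented. The function name and the rate constants are assumptions.

```python
import random


def gillespie_birth_death(k_birth, k_death, x0, t_end, seed=0):
    """Simulate 0 -> X (rate k_birth) and X -> 0 (rate k_death * x) with the Gillespie SSA."""
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        rates = [k_birth, k_death * x]
        total = sum(rates)
        if total == 0.0:
            break                      # no reaction can fire any more
        t += rng.expovariate(total)    # waiting time to the next reaction
        if t > t_end:
            break
        # Discrete choice of which reaction fires. This sampling step is what
        # prevents plain automatic differentiation with respect to the rates.
        if rng.random() * total < rates[0]:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


times, counts = gillespie_birth_death(k_birth=5.0, k_death=0.5, x0=0, t_end=10.0)
```

Estimators such as the ones compared in the paper replace or reweight the discrete reaction choice so that gradients of observables with respect to `k_birth` and `k_death` can be estimated.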

2604.01161 2026-06-04 cs.LG 版本更新

Reasoning Shift: How Context Silently Shortens LLM Reasoning

推理偏移:上下文如何无声地缩短LLM推理

Gleb Rodionov, Roman Garipov, George Yakushev

发表机构 * Yandex HSE University(俄罗斯高等经济学院)

AI总结 通过系统评估,发现推理模型在不同上下文条件下(如无关长上下文、多轮对话、子任务)对同一问题的推理链长度显著缩短(最高65%),且伴随自我验证和不确定性管理行为减少,但简单任务性能不受影响,复杂任务可能受影响。

Comments Preprint

详情
AI中文摘要

展现出测试时缩放行为的大语言模型(LLMs),如扩展推理轨迹和自我验证,在复杂、长期推理任务上表现出色。然而,这些推理行为的鲁棒性仍未得到充分探索。为此,我们对多个推理模型在三种场景下进行了系统评估:(1)添加冗长无关上下文的问题;(2)具有独立任务的多轮对话设置;(3)作为复杂任务中的子任务呈现的问题。我们观察到一个有趣的现象:与问题单独呈现时产生的推理轨迹相比,推理模型在不同上下文条件下对同一问题产生的推理轨迹要短得多(最高达65%)。更细粒度的分析表明,这种压缩与自我验证和不确定性管理行为(如双重检查)的减少相关。虽然这种行为转变不会影响简单问题的性能,但可能会影响更具挑战性任务的表现。此外,我们表明有针对性的监督微调可以部分缓解无关上下文的不利影响。我们希望我们的发现能引起对推理模型鲁棒性以及LLM和基于LLM的智能体上下文管理问题的更多关注。

英文摘要

Large language models (LLMs) exhibiting test-time scaling behavior, such as extended reasoning traces and self-verification, have demonstrated remarkable performance on complex, long-term reasoning tasks. However, the robustness of these reasoning behaviors remains underexplored. To investigate this, we conduct a systematic evaluation of multiple reasoning models across three scenarios: (1) problems augmented with lengthy, irrelevant context; (2) multi-turn conversational settings with independent tasks; and (3) problems presented as a subtask within a complex task. We observe an interesting phenomenon: reasoning models tend to produce much shorter reasoning traces (up to 65%) for the same problem under different context conditions compared to the traces produced when the problem is presented in isolation. A finer-grained analysis reveals that this compression is associated with a decrease in self-verification and uncertainty management behaviors, such as double-checking. While this behavioral shift does not compromise performance on straightforward problems, it might affect performance on more challenging tasks. Additionally, we show that targeted supervised fine-tuning partially mitigates the adverse effects of irrelevant context. We hope our findings draw additional attention to both the robustness of reasoning models and the problem of context management for LLMs and LLM-based agents.

2604.00915 2026-06-04 cs.LG stat.ML 版本更新

Orthogonal Learner for Estimating Heterogeneous Long-Term Treatment Effects

正交学习器用于估计异质性长期处理效应

Haorui Ma, Dennis Frauen, Valentyn Melnychuk, Stefan Feuerriegel

发表机构 * AI in Management, LMU Munich(慕尼黑莱茵河大学人工智能管理系) Munich Center for Machine Learning(慕尼黑机器学习中心) Munich, Germany(德国慕尼黑)

AI总结 提出LT-O-learners,通过定制重叠权重重新定位损失函数,解决低重叠区域下异质性长期处理效应估计不稳定问题,并证明其Neyman正交性和对干扰误差的鲁棒性。

详情
AI中文摘要

异质性长期处理效应(HLTEs)的估计对于营销、经济学和医学中的个性化决策具有重要意义,在这些领域中,短期观测数据集通常与长期观测数据集相结合。然而,由于某些子群体在处理分配或长期结果上的重叠有限,HLTE估计面临挑战,可能导致有限样本方差较大的不稳定HLTE估计。为了解决这一挑战,我们引入了LT-O-learners(长期正交学习器),这是一组新颖的正交学习器,用于在具有替代性的典型HLTE设置中进行HLTE估计。我们的LT-O-learners的关键思想是通过定制的重叠权重重新定位损失函数,这些权重降低了低重叠样本的权重。我们证明了重新定位的损失函数逐点恢复真实的HLTE,并满足Neyman正交性。我们进一步证明了两个关键的理论结果:(i)干扰误差仅通过高阶项进入误差界,这意味着我们的学习器对干扰估计误差具有鲁棒性。(ii)在线性函数类下,重新定位通过低重叠区域中的重叠权重有效控制了HLTE估计器的渐近方差。我们在合成和真实世界数据集上进行实验,以确认我们的LT-O-learners的理论性质,特别是在低重叠区域中的鲁棒性。据我们所知,我们的方法是首个在长期设置中对低重叠鲁棒的HLTE估计正交学习器。

英文摘要

Estimation of heterogeneous long-term treatment effects (HLTEs) is relevant for personalized decision-making in marketing, economics, and medicine, where short-term observational datasets are often combined with long-term observational datasets. However, HLTE estimation is challenging due to limited overlap in treatment assignments or in long-term outcomes for certain subpopulations, which can lead to unstable HLTE estimates with large finite-sample variance. To address this challenge, we introduce the LT-O-learners (Long-Term Orthogonal Learners), a set of novel orthogonal learners for HLTE estimation in the canonical HLTE setting with surrogacy. The key idea of our LT-O-learners is to retarget the loss via custom overlap weights that downweight low-overlap samples. We show that the retargeted loss recovers the true HLTE pointwise and satisfies Neyman-orthogonality. We further prove two key theoretical results: (i) The nuisance error enters the error bound only through higher-order terms, which means our learners are robust to nuisance estimation error. (ii) Under a linear function class, the retargeting effectively controls the asymptotic variance of the HLTE estimator via the overlap weights in low-overlap regimes. We conduct experiments on synthetic and real-world datasets to confirm the theoretical properties of our LT-O-learners, particularly robustness in low-overlap regimes. To our knowledge, ours are the first orthogonal learners for HLTE estimation robust to low overlap in long-term settings.

2603.28762 2026-06-04 cs.CV cs.AI cs.GR cs.LG 版本更新

On-the-fly Repulsion in the Contextual Space for Rich Diversity in Diffusion Transformers

上下文空间中的即时排斥以实现扩散变换器的丰富多样性

Omer Dahary, Benaya Koren, Daniel Garibi, Daniel Cohen-Or

发表机构 * Tel Aviv University(特拉维夫大学) Snap Research Israel(Snap以色列研究)

AI总结 针对文本到图像扩散模型多样性不足的问题,提出在扩散变换器的上下文空间中通过多模态注意力通道施加即时排斥,在不牺牲视觉保真度和语义一致性的前提下显著提升生成多样性,且计算开销小,适用于现代Turbo和蒸馏模型。

Comments SIGGRAPH 2026. Project page: https://contextual-repulsion.github.io/

详情
AI中文摘要

现代文本到图像(T2I)扩散模型在语义对齐方面取得了显著进展,但通常缺乏多样性,倾向于为任何给定提示收敛到狭窄的视觉解决方案集。这种典型性偏差对需要广泛生成结果的创意应用构成了挑战。我们识别出当前多样性方法中的一个基本权衡:修改模型输入需要昂贵的优化来整合生成路径的反馈。相反,对空间上已承诺的中间潜变量进行操作往往会破坏正在形成的视觉结构,导致伪影。在这项工作中,我们提出在上下文空间中应用排斥作为一种新颖的框架,以实现扩散变换器的丰富多样性。通过干预多模态注意力通道,我们在变换器的前向传播过程中施加即时排斥,在文本条件被新兴图像结构丰富后的块之间注入干预。这允许在结构信息形成后但构图固定之前重定向引导轨迹。我们的结果表明,上下文空间中的排斥在不牺牲视觉保真度或语义一致性的情况下产生了显著更丰富的多样性。此外,我们的方法非常高效,计算开销小,即使在现代“Turbo”和蒸馏模型中也有效,而传统的基于轨迹的干预在这些模型中通常会失败。

英文摘要

Modern Text-to-Image (T2I) diffusion models have achieved remarkable semantic alignment, yet they often suffer from a significant lack of variety, converging on a narrow set of visual solutions for any given prompt. This typicality bias presents a challenge for creative applications that require a wide range of generative outcomes. We identify a fundamental trade-off in current approaches to diversity: modifying model inputs requires costly optimization to incorporate feedback from the generative path. In contrast, acting on spatially-committed intermediate latents tends to disrupt the forming visual structure, leading to artifacts. In this work, we propose to apply repulsion in the Contextual Space as a novel framework for achieving rich diversity in Diffusion Transformers. By intervening in the multimodal attention channels, we apply on-the-fly repulsion during the transformer's forward pass, injecting the intervention between blocks where text conditioning is enriched with emergent image structure. This allows for redirecting the guidance trajectory after it is structurally informed but before the composition is fixed. Our results demonstrate that repulsion in the Contextual Space produces significantly richer diversity without sacrificing visual fidelity or semantic adherence. Furthermore, our method is uniquely efficient, imposing a small computational overhead while remaining effective even in modern "Turbo" and distilled models where traditional trajectory-based interventions typically fail.

2601.04051 2026-06-04 cs.LG 版本更新

Symbolic Regression for Shared Expressions: Introducing Partial Parameter Sharing

共享表达式的符号回归:引入部分参数共享

Viktor Martinek, Roland Herzog

发表机构 * Interdisciplinary Center for Scientific Computing, Heidelberg University(科学计算跨学科中心,海德堡大学)

AI总结 提出一种符号回归方法,通过引入部分参数共享机制处理多个分类变量,以分离通用效应、类别特定趋势和类别交互,并在合成数据和天体物理学数据集上验证其减少数据需求和迁移学习的能力。

详情
AI中文摘要

符号回归旨在寻找描述数据集的符号表达式。由于其固有的可解释性,符号回归(SR)是科学发现的有力范式。最近的进展已将SR扩展到使用具有可变参数集的单一表达式来描述相关现象,从而引入单一分类变量。例如,这允许搜索描述多种流体温度依赖粘度的单一表达式,同时识别一组不同的流体特定参数。我们在先前工作的基础上,考虑多个分类变量并引入中间级别的参数共享。参数并非完全通用或完全唯一,一些参数可以在特定类别之间共享,而对其他类别保持不同。这允许分离通用效应(共享参数)、类别特定趋势(部分共享参数)和类别交互(非共享参数)。我们通过一个合成的仅拟合示例测试了这种设置在减少数据需求和迁移学习方面的极限。此外,我们将该方法应用于先前单类别研究中也使用过的天体物理学数据集。相比之下,我们以显著更少的参数实现了类似的拟合质量,同时提取了关于问题的额外信息。

英文摘要

Symbolic regression aims to find symbolic expressions that describe datasets. Due to its inherent interpretability, symbolic regression (SR) is a powerful paradigm for scientific discovery. Recent advances have expanded SR to describe related phenomena using a single expression with varying sets of parameters, thereby introducing a single categorical variable. To illustrate, this enables the search for a single expression describing temperature-dependent viscosity across multiple fluids, while simultaneously identifying a distinct set of fluid-specific parameters. We expand upon prior efforts by considering multiple categorical variables and introducing intermediate levels of parameter sharing. Rather than parameters being either entirely universal or entirely unique, some parameters can also be shared across specific categories while remaining distinct for others. This allows for separating universal effects (shared parameters), category-specific trends (partially-shared parameters), and category interactions (non-shared parameters). We test the limits of this setup in terms of reducing data requirements and transfer learning using a synthetic, fitting-only example. Furthermore, we apply the method to an astrophysics dataset also used in a previous single-category study. In comparison, we achieve similar fit quality with significantly fewer parameters while extracting additional information about the problem.

2510.21459 2026-06-04 cs.CR cs.CL cs.LG 版本更新

SBASH: a Framework for Designing and Evaluating RAG vs. Prompt-Tuned LLM Honeypots

SBASH:用于设计和评估RAG与提示调优的LLM蜜罐框架

Adetayo Adebimpe, Helmut Neukirchen, Thomas Welsh

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 提出SBASH框架,利用轻量级本地LLM和RAG技术构建蜜罐,通过多种指标评估RAG与提示调优对LLM蜜罐真实性和响应延迟的影响。

Comments to be published in: The 3rd International Conference on Foundation and Large Language Models (FLLM2025), IEEE, 2025

详情
Journal ref
2025 3rd International Conference on Foundation and Large Language Models (FLLM), IEEE, 2025
AI中文摘要

蜜罐是用于收集有价值威胁情报或将攻击者从生产系统引开的诱饵系统。最大化攻击者参与度对其效用至关重要。然而,研究表明,上下文感知能力(例如响应新攻击类型、系统和攻击者代理的能力)对于提高参与度是必要的。大型语言模型(LLM)已被证明是提高上下文感知能力的一种方法,但面临若干挑战,包括响应时间的准确性和及时性、高运营成本以及由于云部署带来的数据保护问题。我们提出了基于系统的注意力外壳蜜罐(SBASH)框架,通过使用轻量级本地LLM来管理数据保护问题。我们研究了使用检索增强生成(RAG)支持的LLM和非RAG LLM处理Linux shell命令的情况,并使用多种不同指标(如响应时间差异、人类测试者的真实感、以及通过Levenshtein距离、SBert和BertScore计算的与真实系统的相似度)对其进行评估。我们表明,RAG提高了未调优模型的准确性,而通过系统提示(指示LLM像Linux系统一样响应)调优的模型在无RAG情况下达到了与未调优模型有RAG时相似的准确性,同时延迟略低。

英文摘要

Honeypots are decoy systems used for gathering valuable threat intelligence or diverting attackers away from production systems. Maximising attacker engagement is essential to their utility. However, research has highlighted that context-awareness, such as the ability to respond to new attack types, systems and attacker agents, is necessary to increase engagement. Large Language Models (LLMs) have been shown to be one approach to increasing context-awareness, but they suffer from several challenges, including the accuracy and timeliness of responses, high operational costs, and data-protection issues due to cloud deployment. We propose the System-Based Attention Shell Honeypot (SBASH) framework, which manages data-protection issues through the use of lightweight local LLMs. We investigate the use of Retrieval Augmented Generation (RAG) supported LLMs and non-RAG LLMs for Linux shell commands and evaluate them using several different metrics such as response time differences, realism from human testers, and similarity to a real system calculated with Levenshtein distance, SBert, and BertScore. We show that RAG improves accuracy for untuned models, while models tuned via a system prompt that tells the LLM to respond like a Linux system achieve, without RAG, an accuracy similar to that of untuned models with RAG, at a slightly lower latency.
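The Levenshtein-based similarity named in the abstract can be illustrated with a minimal sketch. The abstract does not say how the authors normalise the distance or which outputs they compare, so the function names `levenshtein` and `similarity`, the length normalisation, and the sample strings below are all assumptions made for illustration only.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between a and b (insertions, deletions, substitutions)."""
    # Keep the shorter string in the inner loop so each row stays small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Assumed normalisation: 1.0 means identical, 0.0 means nothing shared."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Hypothetical outputs of the same shell command on a real host and a honeypot.
real_output = "uid=1000(alice) gid=1000(alice) groups=1000(alice)"
honeypot_output = "uid=1000(alice) gid=1000(alice) groups=1000(alice),27(sudo)"
score = similarity(real_output, honeypot_output)
```

A character-level score like this rewards exact textual agreement, which is presumably why the abstract pairs it with embedding-based measures (SBert, BertScore) that tolerate harmless differences in wording.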

2603.19005 2026-06-04 cs.LG cs.AI stat.ME 版本更新

AgentDS Technical Report: Benchmarking the Future of Human-AI Collaboration in Domain-Specific Data Science

AgentDS技术报告:领域特定数据科学中人机协作的未来基准测试

An Luo, Jin Du, Xun Xian, Robert Specht, Fangqiao Tian, Ganghua Wang, Xuan Bi, Charles Fleming, Ashish Kundu, Jayanth Srinivasa, Mingyi Hong, Rui Zhang, Tianxi Li, Galin Jones, Jie Ding

发表机构 * School of Statistics, University of Minnesota(明尼苏达大学统计学院) AIScientists, Inc.(AIScientists公司) Data Science Institute, University of Chicago(芝加哥大学数据科学研究所) Carlson School of Management, University of Minnesota(明尼苏达大学卡尔森管理学院) Cisco Research(思科研究) Department of Electrical and Computer Engineering, University of Minnesota(明尼苏达大学电气与计算机工程系) Division of Computational Health Sciences, University of Minnesota(明尼苏达大学计算健康科学部)

AI总结 提出AgentDS基准测试和竞赛,通过17个跨行业挑战评估AI代理及人机协作在领域特定数据科学中的表现,发现AI代理在领域推理上存在不足,人机协作优于纯AI方法。

详情
AI中文摘要

数据科学在将复杂数据转化为跨领域的可操作洞察方面发挥着关键作用。大型语言模型(LLM)和人工智能(AI)代理的最新发展显著自动化了数据科学工作流程。然而,目前尚不清楚AI代理在多大程度上能够匹配人类专家在领域特定数据科学任务上的表现,以及人类专业知识在哪些方面仍具有优势。我们引入了AgentDS,一个旨在评估AI代理和人机协作在领域特定数据科学中表现的基准测试和竞赛。AgentDS包含来自六个行业(商业、食品生产、医疗保健、保险、制造业和零售银行)的17个挑战。我们组织了一场公开竞赛,涉及29支队伍和80名参与者,从而能够系统比较人机协作方法与纯AI基线。我们的结果表明,当前的AI代理在领域特定推理方面存在困难。纯AI基线的表现低于竞赛参与者的前四分位数,而最强的解决方案来自人机协作。这些发现挑战了AI完全自动化的说法,并强调了人类专业知识在数据科学中的持久重要性,同时为下一代AI指明了方向。访问AgentDS网站:https://agentds.org/,开源数据集:https://huggingface.co/datasets/lainmn/AgentDS。

英文摘要

Data science plays a critical role in transforming complex data into actionable insights across numerous domains. Recent developments in large language models (LLMs) and artificial intelligence (AI) agents have significantly automated data science workflows. However, it remains unclear to what extent AI agents can match the performance of human experts on domain-specific data science tasks, and in which aspects human expertise continues to provide advantages. We introduce AgentDS, a benchmark and competition designed to evaluate both AI agents and human-AI collaboration performance in domain-specific data science. AgentDS consists of 17 challenges across six industries: commerce, food production, healthcare, insurance, manufacturing, and retail banking. We conducted an open competition involving 29 teams and 80 participants, enabling systematic comparison between human-AI collaborative approaches and AI-only baselines. Our results show that current AI agents struggle with domain-specific reasoning. AI-only baselines perform below the top quartile of competition participants, while the strongest solutions arise from human-AI collaboration. These findings challenge the narrative of complete automation by AI and underscore the enduring importance of human expertise in data science, while illuminating directions for the next generation of AI. The AgentDS website is at https://agentds.org/ and the open-source datasets are at https://huggingface.co/datasets/lainmn/AgentDS.

2603.16867 2026-06-04 cs.LG cs.CL 版本更新

Efficient Reasoning on the Edge

边缘设备上的高效推理

Yelysei Bondarenko, Thomas Hehn, Rob Hesselink, Romain Lepert, Fabio Valerio Massoli, Evgeny Mironov, Leyla Mirvakhabova, Tribhuvanesh Orekondy, Spyridon Stasis, Andrey Kuzmin, Anna Kuzina, Markus Nagel, Ankita Nayak, Corrado Rainone, Ork de Rooij, Paul N Whatmough, Arash Behboodi, Babak Ehteshami Bejnordi

发表机构 * Qualcomm AI Research(高通人工智能研究)

AI总结 提出结合LoRA适配器、监督微调、强化学习预算控制、并行测试时缩放、动态适配器切换和KV缓存共享的方法,在资源受限的边缘设备上实现高效准确的推理。

Comments Project page: https://qualcomm-ai-research.github.io/llm-reasoning-on-edge/

详情
AI中文摘要

具有思维链推理的大型语言模型在复杂问题解决任务中达到了最先进的性能,但其冗长的推理轨迹和大的上下文需求使其不适用于边缘部署。这些挑战包括高令牌生成成本、大的KV缓存占用,以及在将推理能力蒸馏到用于移动设备的较小模型时的低效性。现有方法通常依赖于将较大模型的推理轨迹蒸馏到较小模型中,这些轨迹冗长且风格冗余,不适合设备端推理。在这项工作中,我们提出了一种轻量级方法,通过使用LoRA适配器结合监督微调,在小型LLM中实现推理。我们进一步通过在这些适配器上进行强化学习引入预算控制,显著减少响应长度,同时保持最小的精度损失。为了解决内存受限的解码问题,我们利用并行测试时缩放,在轻微延迟增加的情况下提高精度。最后,我们提出了一种动态适配器切换机制,仅在需要时激活推理,以及在提示编码期间的KV缓存共享策略,减少设备端推理的首令牌时间。在Qwen2.5-7B上的实验表明,我们的方法在严格的资源约束下实现了高效、准确的推理,使LLM推理在移动场景中变得实用。展示我们的解决方案在移动设备上运行的视频可在我们的项目页面上找到。

英文摘要

Large language models (LLMs) with chain-of-thought reasoning achieve state-of-the-art performance across complex problem-solving tasks, but their verbose reasoning traces and large context requirements make them impractical for edge deployment. These challenges include high token generation costs, large KV-cache footprints, and inefficiencies when distilling reasoning capabilities into smaller models for mobile devices. Existing approaches often rely on distilling reasoning traces from larger models into smaller ones, but these traces are verbose and stylistically redundant, which is undesirable for on-device inference. In this work, we propose a lightweight approach to enable reasoning in small LLMs using LoRA adapters combined with supervised fine-tuning. We further introduce budget forcing via reinforcement learning on these adapters, significantly reducing response length with minimal accuracy loss. To address memory-bound decoding, we exploit parallel test-time scaling, improving accuracy at the cost of a minor latency increase. Finally, we present a dynamic adapter-switching mechanism that activates reasoning only when needed and a KV-cache sharing strategy during prompt encoding, reducing time-to-first-token for on-device inference. Experiments on Qwen2.5-7B demonstrate that our method achieves efficient, accurate reasoning under strict resource constraints, making LLM reasoning practical for mobile scenarios. Videos demonstrating our solution running on mobile devices are available on our project page.

2603.12433 2026-06-04 cs.CV cs.AI cs.LG 版本更新

Revisiting Model Stitching In the Foundation Model Era

重新审视基础模型时代的模型拼接

Zheda Mai, Ke Zhang, Fu-En Wang, Zixiao Ken Wang, Albert Y. C. Chen, Lu Xia, Min Sun, Wei-Lun Chao, Cheng-Hao Kuo

发表机构 * The Ohio State University(俄亥俄州立大学) Boston University(波士顿大学) Amazon(亚马逊)

AI总结 本文通过系统协议研究视觉基础模型(如CLIP、DINOv2、SigLIP 2)的可拼接性,提出基于目标模型倒数第二层特征匹配损失的拼接方法,并构建VFM拼接树(VST)实现多模态大模型中多个VFM的准确率-延迟权衡。

Comments Accepted by CVPR 2026

详情
AI中文摘要

模型拼接通过一个轻量拼接层将一个模型(源)的早期层连接到另一个模型(目标)的后期层,作为表征兼容性的探针。先前工作发现,尽管初始化或目标不同,但基于同一数据集训练的模型仍然是可拼接的(准确率下降可忽略)。我们重新审视在目标、数据和模态组合(例如CLIP、DINOv2、SigLIP 2)上各异的视觉基础模型(VFM)的拼接,并提出问题:异构VFM是否可拼接?我们引入了一个系统协议,涵盖拼接点、拼接层家族、训练损失和下游任务。三个发现浮现:(1)拼接层训练至关重要:传统方法在拼接点匹配中间特征或端到端优化任务损失时难以保持准确率,尤其是在浅层拼接点。(2)通过在目标模型的倒数第二层使用简单的特征匹配损失,异构VFM在视觉任务上变得可靠可拼接。(3)对于深层拼接点,拼接模型可以超越任一组成模型,仅增加少量推理开销(用于拼接层)。基于这些发现,我们进一步提出VFM拼接树(VST),它在多个VFM之间共享早期层同时保留其后期层,为通常利用多个VFM的多模态大语言模型提供了可控的准确率-延迟权衡。综合来看,我们的研究将拼接从诊断探针提升为整合互补VFM优势并定位其表征对齐或分歧点的实用方法。

英文摘要

Model stitching, connecting early layers of one model (source) to later layers of another (target) via a light stitch layer, has served as a probe of representational compatibility. Prior work finds that models trained on the same dataset remain stitchable (negligible accuracy drop) despite different initializations or objectives. We revisit stitching for Vision Foundation Models (VFMs) that vary in objectives, data, and modality mix (e.g., CLIP, DINOv2, SigLIP 2) and ask: Are heterogeneous VFMs stitchable? We introduce a systematic protocol spanning the stitch points, stitch layer families, training losses, and downstream tasks. Three findings emerge. (1) Stitch layer training matters: conventional approaches that match the intermediate features at the stitch point or optimize the task loss end-to-end struggle to retain accuracy, especially at shallow stitch points. (2) With a simple feature-matching loss at the target model's penultimate layer, heterogeneous VFMs become reliably stitchable across vision tasks. (3) For deep stitch points, the stitched model can surpass either constituent model at only a small inference overhead (for the stitch layer). Building on these findings, we further propose the VFM Stitch Tree (VST), which shares early layers across VFMs while retaining their later layers, yielding a controllable accuracy-latency trade-off for multimodal LLMs that often leverage multiple VFMs. Taken together, our study elevates stitching from a diagnostic probe to a practical recipe for integrating complementary VFM strengths and pinpointing where their representations align or diverge.

2602.23312 2026-06-04 cs.HC cs.AI cs.LG cs.RO cs.SY eess.SY 版本更新

Evaluating Zero-Shot and One-Shot Adaptation of Small Language Models in Leader-Follower Interaction

评估小语言模型在领导者-跟随者交互中的零样本和单样本适应

Rafael R. Baptista, André de Lima Salgado, Ricardo V. Godoy, Marcelo Becker, Thiago Boaventura, Gustavo J. G. Lahr

发表机构 * University of Sao Paulo(圣保罗大学) Federal University of Lavras(拉夫拉斯联邦大学) Faculdade Israelita de Ensino e Pesquisa Albert Einstein(阿尔伯特·爱因斯坦以色列教学与研究学院)

AI总结 本文通过微调小语言模型(Qwen2.5-0.5B)在领导者-跟随者交互中实现角色分类,零样本微调达到86.66%准确率且延迟低至22.2毫秒,但单样本模式因上下文长度增加导致性能下降。

详情
AI中文摘要

领导者-跟随者交互是人机交互(HRI)中的一个重要范式。然而,对于资源受限的移动和辅助机器人来说,实时分配角色仍然具有挑战性。虽然大型语言模型(LLMs)在自然通信方面显示出潜力,但其规模和延迟限制了设备端部署。小语言模型(SLMs)提供了一种潜在的替代方案,但它们在HRI中角色分类的有效性尚未得到系统评估。在本文中,我们提出了一个用于领导者-跟随者通信的SLMs基准测试,引入了一个源自已发表数据库的新数据集,并增加了合成样本以捕捉交互特定的动态。我们研究了两种适应策略:提示工程和微调,在零样本和单样本交互模式下进行研究,并与未训练的基线进行比较。使用Qwen2.5-0.5B的实验表明,零样本微调实现了稳健的分类性能(86.66%准确率),同时保持低延迟(每个样本22.2毫秒),显著优于基线和提示工程方法。然而,结果也表明在单样本模式下性能下降,其中增加的上下文长度挑战了模型的架构能力。这些发现表明,微调的SLMs为直接角色分配提供了有效的解决方案,同时突出了边缘端对话复杂性与分类可靠性之间的关键权衡。

英文摘要

Leader-follower interaction is an important paradigm in human-robot interaction (HRI). Yet, assigning roles in real time remains challenging for resource-constrained mobile and assistive robots. While large language models (LLMs) have shown promise for natural communication, their size and latency limit on-device deployment. Small language models (SLMs) offer a potential alternative, but their effectiveness for role classification in HRI has not been systematically evaluated. In this paper, we present a benchmark of SLMs for leader-follower communication, introducing a novel dataset derived from a published database and augmented with synthetic samples to capture interaction-specific dynamics. We investigate two adaptation strategies: prompt engineering and fine-tuning, studied under zero-shot and one-shot interaction modes, compared with an untrained baseline. Experiments with Qwen2.5-0.5B reveal that zero-shot fine-tuning achieves robust classification performance (86.66% accuracy) while maintaining low latency (22.2 ms per sample), significantly outperforming baseline and prompt-engineered approaches. However, results also indicate a performance degradation in one-shot modes, where increased context length challenges the model's architectural capacity. These findings demonstrate that fine-tuned SLMs provide an effective solution for direct role assignment, while highlighting critical trade-offs between dialogue complexity and classification reliability on the edge.

2603.10289 2026-06-04 quant-ph cs.AI cs.LG 版本更新

Quantum entanglement provides a competitive advantage in adversarial games

量子纠缠在对抗性博弈中提供竞争优势

Peiyong Wang, Kieran Hymas, James Quach

发表机构 * CSIRO(联邦科学与工业研究组织)

AI总结 本研究通过量子-经典混合智能体在Pong对抗性马尔可夫博弈中的实验,发现纠缠量子电路在特征提取和竞争性强化学习中优于可分离电路,表明量子纠缠可作为表示学习的功能资源。

Comments 22 pages, 5 figures

详情
AI中文摘要

量子资源是否能在完全经典的竞争环境中提供优势仍然是一个悬而未决的问题。竞争性零和强化学习尤其具有挑战性,因为成功需要对对抗智能体之间的动态交互进行建模,而非静态的状态-动作映射。在此,我们进行了一项受控研究,隔离了量子纠缠在训练于Pong(一个竞争性马尔可夫博弈)的量子-经典混合智能体中的作用。一个8量子比特参数化量子电路作为近端策略优化框架内的特征提取器,允许直接比较可分离电路与包含固定(CZ)或可训练(IsingZZ)纠缠门的架构。纠缠电路在参数数量相当的情况下始终优于可分离电路,并且在低容量区域中达到或超过经典多层感知机基线。表示相似性分析进一步表明,纠缠电路学习到结构上不同的特征,与对交互状态变量的改进建模一致。这些发现确立了纠缠作为竞争性强化学习中表示学习的功能资源。

英文摘要

Whether uniquely quantum resources confer advantages in fully classical, competitive environments remains an open question. Competitive zero-sum reinforcement learning is particularly challenging, as success requires modelling dynamic interactions between opposing agents rather than static state-action mappings. Here, we conduct a controlled study isolating the role of quantum entanglement in a quantum-classical hybrid agent trained on Pong, a competitive Markov game. An 8-qubit parameterised quantum circuit serves as a feature extractor within a proximal policy optimisation framework, allowing direct comparison between separable circuits and architectures incorporating fixed (CZ) or trainable (IsingZZ) entangling gates. Entangled circuits consistently outperform separable counterparts with comparable parameter counts and, in low-capacity regimes, match or exceed classical multilayer perceptron baselines. Representation similarity analysis further shows that entangled circuits learn structurally distinct features, consistent with improved modelling of interacting state variables. These findings establish entanglement as a functional resource for representation learning in competitive reinforcement learning.

2603.10044 2026-06-04 cs.SE cs.AI cs.CL cs.LG 版本更新

Safety Under Scaffolding: How Evaluation Conditions Shape Measured Safety

脚手架下的安全性:评估条件如何影响测量的安全性

David Gringras

发表机构 * Harvard University(哈佛大学) MIT(麻省理工学院)

AI总结 本研究通过62,808次盲法预注册评估,测试了六种前沿模型在四种部署配置下的安全性,发现脚手架架构对安全性影响较小,而格式转换(如选择题与开放式问题)可导致5-20个百分点的测量差异,且模型-脚手架间存在显著异质性,质疑了单一综合安全性分数的实用性。

Comments 74 pages including appendices. 6 frontier models, 62,808 primary observations (~89k total). Pre-registered: OSF DOI 10.17605/OSF.IO/CJW92. Code and data: https://github.com/davidgringras/safety-under-scaffolding

详情
AI中文摘要

在基准测试中获得的安全分数不一定能预测同一模型在未经测试的智能体脚手架中的行为。我们通过四种部署配置(直接API、ReAct、多智能体批评者、map-reduce委托)运行了六种前沿模型:在四个安全基准测试(BBQ、TruthfulQA、XSTest/OR-Bench、sycophancy)上进行了N = 62,808次盲法、预注册、等价性检验评估,以及三项支持性分析。ReAct和多智能体脚手架保持在预注册的±2个百分点的等价范围内;map-reduce委托降低了测量的安全性(NNH = 14),尽管这种损失很大程度上是测量伪影:在相同项目上,选择题与开放式问题的措辞使测量的安全率变化5-20个百分点,而分解过程无声地移除了选择题选项。每个模型map-reduce损失的约40-89%归因于这种格式转换而非推理中断,一种保留选项的变体恢复了大部分损失。汇总效应也掩盖了模型与脚手架之间的显著异质性:在map-reduce下,对于相同项目,Opus损失16.8个百分点,而Llama 4增加18.8个百分点。从结构上看,脚手架架构仅解释了0.4%的结果方差(基准选择解释了45倍以上),泛化系数G = 0.000(bootstrap 95% CI [0.000, 0.752])。如此宽的区间本身足以削弱任何单一综合安全分数作为部署标准的效用。这些是“简单案例”;像诡计和CBRN提升这样的重要属性没有明显理由对格式或脚手架不敏感。代码、数据和提示已作为ScaffoldSafety发布。

英文摘要

A safety score earned on a benchmark need not predict how the same model behaves once it is wrapped in an agentic scaffold the benchmark never tested. We ran six frontier models through four deployment configurations (direct API, ReAct, multi-agent critic, map-reduce delegation): N = 62,808 blinded, pre-registered, equivalence-tested evaluations across four safety benchmarks (BBQ, TruthfulQA, XSTest/OR-Bench, sycophancy), plus three supporting analyses. ReAct and multi-agent scaffolds stay within a pre-registered +/-2 pp equivalence margin; map-reduce delegation degrades measured safety (NNH = 14), though that loss is largely a measurement artifact: on identical items, multiple-choice versus open-ended phrasing shifts the measured safety rate by 5-20 pp, and decomposition silently strips the multiple-choice options. Roughly 40-89% of the per-model map-reduce loss is this format conversion rather than reasoning disruption, and an option-preserving variant recovers most of it. Pooled effects also mask sharp model-by-scaffold heterogeneity: under map-reduce, on identical items, Opus loses 16.8 pp while Llama 4 gains 18.8 pp. Structurally, scaffold architecture explains only 0.4% of outcome variance (benchmark choice explains 45x more), and the generalizability coefficient is G = 0.000 (bootstrap 95% CI [0.000, 0.752]). An interval that wide is enough on its own to undermine the utility of any single composite safety number as a deployment criterion. These are the "easy cases"; consequential properties like scheming and CBRN uplift have no obvious reason to be less format- or scaffold-sensitive. Code, data, and prompts are released as ScaffoldSafety.
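The "NNH = 14" figure can be made concrete with a short sketch. The abstract does not define NNH, so this assumes the standard epidemiological meaning (number needed to harm, the reciprocal of the absolute increase in the rate of unsafe outcomes); the function names and the example rates are hypothetical and are not taken from the paper.

```python
def absolute_risk_increase_from_nnh(nnh: float) -> float:
    """Absolute increase in the unsafe-outcome rate implied by a given NNH.

    Assumes the standard definition NNH = 1 / (absolute risk increase).
    """
    if nnh <= 0:
        raise ValueError("NNH must be positive")
    return 1.0 / nnh


def nnh_from_rates(baseline_unsafe_rate: float, scaffold_unsafe_rate: float) -> float:
    """NNH computed from two unsafe-outcome rates, both given as fractions in [0, 1]."""
    increase = scaffold_unsafe_rate - baseline_unsafe_rate
    if increase <= 0:
        raise ValueError("the scaffold does not increase the unsafe rate")
    return 1.0 / increase


# Under the assumed definition, NNH = 14 corresponds to roughly a 7.1 pp increase.
implied_increase_pp = 100 * absolute_risk_increase_from_nnh(14)

# Hypothetical rates: 10% unsafe under direct API, 20% unsafe under a scaffold.
example_nnh = nnh_from_rates(0.10, 0.20)
```

Read this way, NNH = 14 would mean about one additional unsafe response for every 14 items routed through map-reduce delegation, which sits within the 5-20 pp range the abstract attributes to format conversion alone.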

2603.09803 2026-06-04 cs.LG 版本更新

Good Reasoning Makes Good Demonstrations: Implicit Reasoning Quality Supervision via In-Context Reinforcement Learning

好的推理产生好的示范:通过上下文强化学习进行隐式推理质量监督

Tiehua Mei, Minxuan Lv, Leiyu Pan, Zhenpeng Su, Hongru Hou, Hengrui Chen, Ao Xu, Deqing Yang

发表机构 * School of Data Science, Fudan University(复旦大学数据科学学院) University of Chinese Academy of Sciences(中国科学院大学) College of Intelligence and Computing, Tianjin University(天津大学智能与计算学院)

AI总结 提出In-Context RLVR方法,利用策略模型自身的上下文学习能力衡量示范效用(Demonstration Utility),通过隐式奖励加权提升推理质量和准确性。

Comments Accepted at ACL 2026

详情
AI中文摘要

可验证奖励强化学习(RLVR)改进了大型语言模型的推理能力,但将所有正确解决方案同等对待,可能强化偶然得到正确答案的有缺陷的轨迹。我们观察到,\emph{更好的推理产生更好的示范}:高质量的解决方案比低质量的解决方案作为上下文示例更有效。我们将这种教学能力称为 \textbf{示范效用},并表明策略模型自身的上下文学习能力提供了一种有效衡量它的方法,产生一个称为 \textbf{证据增益}的质量信号。为了在训练中利用这一信号,我们引入了 \textbf{上下文RLVR},在每次轨迹采样前添加示范。理论上,我们证明这种简单的输入修改隐式地以近似与证据增益成比例的因子重新加权奖励,为高质量轨迹分配更高的权重,而无需昂贵的计算。在数学推理基准上的实验表明,与标准RLVR基线相比,在准确性和推理质量上均有一致的改进。我们的代码和数据集可在 https://github.com/Mithas-114/IC-DAPO 获取。

英文摘要

Reinforcement Learning with Verifiable Rewards (RLVR) improves reasoning in large language models but treats all correct solutions equally, potentially reinforcing flawed traces that arrive at correct answers by chance. We observe that \emph{better reasoning makes better demonstrations}: high-quality solutions serve as more effective in-context examples than low-quality ones. We term this teaching ability \textbf{Demonstration Utility}, and show that the policy model's own in-context learning ability provides an efficient way to measure it, yielding a quality signal termed \textbf{Evidence Gain}. To leverage this signal during training, we introduce \textbf{In-Context RLVR}, which prepends demonstrations before each rollout. Theoretically, we prove that this simple input modification implicitly reweights rewards by a factor approximately proportional to Evidence Gain, assigning higher weights to high-quality traces without requiring costly computation. Experiments on mathematical reasoning benchmarks demonstrate consistent improvements in both accuracy and reasoning quality over standard RLVR baselines. Our code and datasets are available at https://github.com/Mithas-114/IC-DAPO.

2603.07584 2026-06-04 cs.SD cs.LG eess.AS 版本更新

Analysis-Driven Procedural Generation of an Engine Sound Dataset with Embedded Control Annotations

分析驱动的发动机声音数据集程序化生成与嵌入式控制注释

Robin Doerfler, Lonce Wyse

发表机构 * rdoerfler

AI总结 提出一种分析驱动的框架,通过自适应音高谱分析提取真实录音中的谐波结构,驱动扩展参数化谐波加噪声合成器,生成带有精确时间对齐控制注释的发动机音频数据集,用于数据驱动的发动机声音建模。

Comments To appear in the Proceedings of the 34th European Signal Processing Conference (EUSIPCO 2026)

详情
AI中文摘要

计算发动机声音建模是汽车音频行业的核心,尤其适用于主动声音设计应用和虚拟原型设计。新兴的数据驱动发动机声音合成方法需要大量标准化、干净的音频记录,并带有精确时间对齐的运行状态注释:由于高成本、专用测量设备要求和不可避免的噪声污染,这些数据难以获取。我们提出了一种分析驱动的框架,用于生成带有样本精确控制注释的发动机音频。该方法通过自适应音高谱分析从真实录音中提取谐波结构,进而驱动扩展参数化谐波加噪声合成器。利用该框架,我们通过多样化的控制轨迹和参数变化,将每个发动机的5-10分钟源音频扩展15-30倍,生成程序化发动机声音数据集(19.0小时,5,935个文件):一组带有样本精确RPM和扭矩注释的发动机音频信号,覆盖广泛的工作条件、信号复杂度和谐波轮廓。与真实录音的对比验证了合成数据保留了特征谐波结构,基于该数据集训练的基线可微合成网络证实了其适用于数据驱动的发动机声音建模。该数据集已公开发布,以支持发动机音色分析、控制参数估计和神经生成合成的研究。

英文摘要

Computational engine sound modeling is central to the automotive audio industry, particularly for active sound design applications and virtual prototyping. Emerging data-driven engine sound synthesis methods require large volumes of standardized, clean audio recordings with precisely time-aligned operating-state annotations: data that is difficult to obtain due to high costs, specialized measurement equipment requirements, and inevitable noise contamination. We present an analysis-driven framework for generating engine audio with sample-accurate control annotations. The method extracts harmonic structures from real recordings through pitch-adaptive spectral analysis, which then drive an extended parametric harmonic-plus-noise synthesizer. With this framework, we augment 5-10 min of source audio per engine 15-30x via diverse control trajectories and parametric variation, producing the Procedural Engine Sounds Dataset (19.0 h, 5,935 files): a set of engine audio signals with sample-accurate RPM and torque annotations spanning a wide range of operating conditions, signal complexities, and harmonic profiles. Comparison against real recordings validates that the synthesized data preserves characteristic harmonic structures, and a baseline differentiable synthesis network trained on the dataset confirms its suitability for data-driven engine sound modeling. The dataset is released publicly to support research on engine timbre analysis, control parameter estimation, and neural generative synthesis.

2602.12147 2026-06-04 cs.LG 版本更新

It's TIME: Towards the Next Generation of Time Series Forecasting Benchmarks

是时候了:迈向下一代时间序列预测基准

Zhongzheng Qiao, Sheng Pan, Anni Wang, Viktoriya Zhukova, Yong Liu, Xudong Jiang, Qingsong Wen, Mingsheng Long, Ming Jin, Chenghao Liu

发表机构 * University of Science and Technology of China(中国科学技术大学) Tsinghua University(清华大学)

AI总结 针对现有时间序列基准在数据组成、完整性、任务设计和分析视角上的局限,提出TIME基准,包含50个新数据集和98个预测任务,采用人机协作管道确保数据完整性,并引入模式级评估视角,对12个基础模型进行多粒度排名。

Comments Accepted to ICML 2026. Camera-ready version

详情
AI中文摘要

时间序列基础模型正在从特定数据集建模向通用任务评估转变,彻底改变预测格局。然而,我们认为现有基准在四个维度存在常见局限:受重复使用传统资源主导的受限数据组成、缺乏严格质量保证的数据完整性受损、脱离现实背景的任务公式错位,以及掩盖通用洞察的僵化分析视角。为弥补这些差距,我们引入TIME,一个下一代以任务为中心的基准,包含50个新数据集和98个预测任务,专为无数据泄露的严格零样本TSFM评估而设计。集成大语言模型和人类专业知识,我们建立了人机协同的基准构建管道以确保高数据完整性,并通过将预测配置与现实操作需求和变量可预测性对齐来重新定义任务公式。此外,我们提出一种新颖的模式级评估视角,超越了基于静态元标签的传统数据集级评估。通过利用结构时间序列特征来刻画内在时间属性,该方法提供了跨不同模式的模型能力的通用洞察。我们评估了12个TSFM,并建立了一个多粒度排行榜以促进深入分析和可视化检查。排行榜可在 https://huggingface.co/spaces/Real-TSF/TIME-leaderboard 获取。

英文摘要

Time series foundation models (TSFMs) are revolutionizing the forecasting landscape from specific dataset modeling to generalizable task evaluation. However, we contend that existing benchmarks exhibit common limitations in four dimensions: constrained data composition dominated by reused legacy sources, compromised data integrity lacking rigorous quality assurance, misaligned task formulations detached from real-world contexts, and rigid analysis perspectives that obscure generalizable insights. To bridge these gaps, we introduce TIME, a next-generation task-centric benchmark comprising 50 fresh datasets and 98 forecasting tasks, tailored for strict zero-shot TSFM evaluation free from data leakage. Integrating large language models and human expertise, we establish a human-in-the-loop benchmark construction pipeline to ensure high data integrity and redefine task formulation by aligning forecasting configurations with real-world operational requirements and variate predictability. Furthermore, we propose a novel pattern-level evaluation perspective that moves beyond traditional dataset-level evaluations based on static meta labels. By leveraging structural time series features to characterize intrinsic temporal properties, this approach offers generalizable insights into model capabilities across diverse patterns. We evaluate 12 TSFMs and establish a multi-granular leaderboard to facilitate in-depth analysis and visualized inspection. The leaderboard is available at https://huggingface.co/spaces/Real-TSF/TIME-leaderboard.

2603.03482 2026-06-04 cs.CV cs.AI cs.LG 版本更新

Beyond Pixel Histories: World Models with Persistent 3D State

超越像素历史:具有持久3D状态的世界模型

Samuel Garcin, Thomas Walker, Steven McDonagh, Tim Pearce, Hakan Bilen, Tianyu He, Kaixin Wang, Jiang Bian

发表机构 * University of Edinburgh(爱丁堡大学) Microsoft Research(微软研究院)

AI总结 提出PERSIST范式,通过模拟潜在3D场景(环境、相机、渲染器)的演化,实现具有持久空间记忆和一致几何的世界模型,显著提升3D一致性、空间记忆和长期稳定性。

Comments Accepted to the International Conference on Machine Learning (ICML) 2026. To appear in the Proceedings of Machine Learning Research (PMLR). 9 pages

详情
AI中文摘要

交互式世界模型通过响应用户的动作持续生成视频,实现开放式的生成能力。然而,现有模型通常缺乏环境的3D表示,意味着3D一致性必须从数据中隐式学习,且空间记忆受限于有限的时域上下文窗口。这导致不真实的用户体验,并对训练智能体等下游任务构成重大障碍。为解决这一问题,我们提出PERSIST,一种新的世界模型范式,它模拟潜在3D场景(环境、相机和渲染器)的演化。这使得我们能够合成具有持久空间记忆和一致几何的新帧。定量指标和定性用户研究均表明,与现有方法相比,在空间记忆、3D一致性和长期稳定性方面有显著提升,从而实现连贯、演化的3D世界。我们进一步展示了新颖的能力,包括从单张图像合成多样化的3D环境,以及通过直接在3D空间中支持环境编辑和指定,实现对生成体验的细粒度、几何感知控制。项目页面:https://francelico.github.io/persist.github.io

英文摘要

Interactive world models continually generate video by responding to a user's actions, enabling open-ended generation capabilities. However, existing models typically lack a 3D representation of the environment, meaning 3D consistency must be implicitly learned from data, and spatial memory is restricted to limited temporal context windows. This results in an unrealistic user experience and presents significant obstacles to downstream tasks such as training agents. To address this, we present PERSIST, a new paradigm of world model which simulates the evolution of a latent 3D scene: environment, camera, and renderer. This allows us to synthesise new frames with persistent spatial memory and consistent geometry. Both quantitative metrics and a qualitative user study show substantial improvements in spatial memory, 3D consistency, and long-horizon stability over existing methods, enabling coherent, evolving 3D worlds. We further demonstrate novel capabilities, including synthesising diverse 3D environments from a single image, as well as enabling fine-grained, geometry-aware control over generated experiences by supporting environment editing and specification directly in 3D space. Project page: https://francelico.github.io/persist.github.io

2602.20651 2026-06-04 cs.LG stat.AP stat.ML 版本更新

Sparse Bayesian Deep Functional Learning with Structured Region Selection

稀疏贝叶斯深度函数学习与结构化区域选择

Xiaoxian Zhu, Yingmeng Li, Shuangge Ma, Mengyun Wu

发表机构 * School of Statistics and Data Science, Shanghai University of Finance and Economics(上海财经大学统计与数据科学学院) Department of Biostatistics, Yale School of Public Health(耶鲁大学公共卫生学院生物统计学系)

AI总结 提出稀疏贝叶斯函数深度神经网络(sBayFDNN),通过深度贝叶斯架构学习自适应函数嵌入以捕捉复杂非线性关系,并利用结构化先验实现具有量化不确定性的可解释区域选择,理论上首次为贝叶斯深度函数模型提供了近似误差界、后验一致性和区域选择一致性的严格保证。

详情
AI中文摘要

在现代应用如心电图监测、神经影像、可穿戴传感和工业设备诊断中,复杂且连续结构化的数据无处不在,为函数数据分析带来了挑战和机遇。然而,现有方法面临关键权衡:传统函数模型受限于线性,而深度学习方法缺乏可解释的区域选择以处理稀疏效应。为弥合这些差距,我们提出了一种稀疏贝叶斯函数深度神经网络(sBayFDNN)。它通过深度贝叶斯架构学习自适应函数嵌入以捕捉复杂的非线性关系,同时结构化先验使得能够对具有量化不确定性的影响域进行可解释的区域选择。理论上,我们建立了严格的近似误差界、后验一致性和区域选择一致性。这些结果为贝叶斯深度函数模型提供了首个理论保证,确保了其可靠性和统计严谨性。实证上,全面的模拟和真实世界研究证实了sBayFDNN的有效性和优越性。关键的是,sBayFDNN在识别复杂依赖关系以实现准确预测方面表现出色,并能更精确地识别功能上有意义的区域,这些能力从根本上超越了现有方法。

English abstract

In modern applications such as ECG monitoring, neuroimaging, wearable sensing, and industrial equipment diagnostics, complex and continuously structured data are ubiquitous, presenting both challenges and opportunities for functional data analysis. However, existing methods face a critical trade-off: conventional functional models are limited by linearity, whereas deep learning approaches lack interpretable region selection for sparse effects. To bridge these gaps, we propose a sparse Bayesian functional deep neural network (sBayFDNN). It learns adaptive functional embeddings through a deep Bayesian architecture to capture complex nonlinear relationships, while a structured prior enables interpretable, region-wise selection of influential domains with quantified uncertainty. Theoretically, we establish rigorous approximation error bounds, posterior consistency, and region selection consistency. These results provide the first theoretical guarantees for a Bayesian deep functional model, ensuring its reliability and statistical rigor. Empirically, comprehensive simulations and real-world studies confirm the effectiveness and superiority of sBayFDNN. Crucially, sBayFDNN excels in recognizing intricate dependencies for accurate predictions and more precisely identifies functionally meaningful regions, capabilities fundamentally beyond existing approaches.

2603.01013 2026-06-04 cs.LG Version update

Feature-Weighted Maximum Representative Subsampling

特征加权最大代表性子抽样

Tony Hauptmann, Stefan Kramer

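Affiliations * Institute of Computer Science, Johannes Gutenberg University Mainz

AI summary Addresses the problem that a few highly biased features can force a debiasing algorithm to introduce bias into variables that were already representative. Proposes feature-weighted Maximum Representative Subsampling (FW-MRS), which uses feature weights to reduce the influence of highly biased features and retains more instances while preserving generalization on downstream tasks.

Details
AI-generated Chinese abstract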

在社会科学中,通常需要在得出有效结论之前对研究和调查进行去偏。去偏算法能够使用样本权重通过计算方式去除偏差。然而,当只有一部分特征高度偏倚而其余特征已经具有代表性时,就会出现问题。算法需要强烈改变样本分布以处理少数高度偏倚的特征,这反过来可能会给已经具有代表性的变量引入偏差。为了解决这个问题,我们开发了一种使用特征权重的方法,以最小化高度偏倚特征对样本权重计算的影响。我们的算法基于最大代表性子抽样(MRS),该方法通过迭代移除元素来对齐非代表性样本与代表性样本,从而创建代表性子样本,进而对数据集进行去偏。新算法名为特征加权MRS(FW-MRS),它降低了对高度偏倚特征的重视程度,从而能够为下游任务保留更多实例。特征权重来源于一个域分类器的特征重要性,该分类器经过训练以区分代表性数据集和非代表性数据集。我们使用八个表格数据集验证了FW-MRS,每个数据集都被人为偏倚。偏倚特征可能对下游任务很重要,较少关注它们可能导致泛化性能下降。因此,我们评估了FW-MRS在下游任务上的泛化性能,发现没有统计学上的显著差异。此外,FW-MRS被应用于一个来自社会科学的真实数据集。源代码可在https://github.com/kramerlab/FeatureWeightDebiasing获取。

English abstract

In the social sciences, it is often necessary to debias studies and surveys before valid conclusions can be drawn. Debiasing algorithms enable the computational removal of bias using sample weights. However, an issue arises when only a subset of features is highly biased, while the rest is already representative. Algorithms need to strongly alter the sample distribution to manage a few highly biased features, which can in turn introduce bias into already representative variables. To address this issue, we developed a method that uses feature weights to minimize the impact of highly biased features on the computation of sample weights. Our algorithm is based on Maximum Representative Subsampling (MRS), which debiases datasets by aligning a non-representative sample with a representative one through iterative removal of elements to create a representative subsample. The new algorithm, named feature-weighted MRS (FW-MRS), decreases the emphasis on highly biased features, allowing it to retain more instances for downstream tasks. The feature weights are derived from the feature importance of a domain classifier trained to differentiate between the representative and non-representative datasets. We validated FW-MRS using eight tabular datasets, each of which we artificially biased. Biased features can be important for downstream tasks, and focusing less on them could lead to a decline in generalization. For this reason, we assessed the generalization performance of FW-MRS on downstream tasks and found no statistically significant differences. Additionally, FW-MRS was applied to a real-world dataset from the social sciences. The source code is available at https://github.com/kramerlab/FeatureWeightDebiasing.

2602.23214 2026-06-04 cs.CV cs.LG eess.IV Version update

Plug-and-Play Diffusion Meets ADMM: Dual-Variable Coupling for Robust Medical Image Reconstruction

即插即用扩散遇见ADMM:双变量耦合用于鲁棒医学图像重建

Chenhe Du, Xuanyu Tian, Qing Wu, Muyu Liu, Jingyi Yu, Hongjiang Wei, Yuyao Zhang

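Affiliations * ShanghaiTech University; Shanghai Jiao Tong University

AI summary Proposes the Dual-Coupled Plug-and-Play Diffusion (DC-PnPDP) framework, which reintroduces the classical dual variable to provide integral feedback and applies Spectral Homogenization (SH) to handle structured artifacts. This addresses the steady-state bias and hallucination problems of existing PnP solvers and achieves state-of-the-art fidelity and faster convergence in CT and MRI reconstruction.

Comments Accepted by ICML 2026

Details
AI-generated Chinese abstract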

即插即用扩散先验(PnPDP)框架通过将预训练生成模型视为模块化先验,已成为解决成像逆问题的强大范式。然而,我们发现当前PnP求解器(例如基于HQS或近端梯度)存在一个关键缺陷:它们作为无记忆算子,仅基于瞬时梯度更新估计。这种缺乏历史跟踪的做法不可避免地导致非消失稳态偏差,使得重建在严重损坏下无法严格满足物理测量。为了解决这个问题,我们提出了双耦合PnP扩散(DC-PnPDP),它恢复了经典对偶变量以提供积分反馈,逐步强制数据一致性和先验之间的一致性。然而,这种严格的几何耦合引入了第二个挑战:累积的对偶残差表现出频谱有色、结构化的伪影,违反了扩散先验的加性白高斯噪声(AWGN)假设,导致严重的幻觉。为了弥合这一差距,我们引入了频谱均匀化(SH),一种频域适应机制,将这些结构化残差调制为统计上合规的伪AWGN输入。这有效地将求解器的严格优化轨迹与去噪器的有效统计流形对齐。在CT和MRI重建上的大量实验表明,我们的方法解决了偏差-幻觉权衡,实现了最先进的保真度并显著加速收敛。代码可在https://github.com/duchenhe/DC-PnPDP获取。

English abstract

Plug-and-Play diffusion prior (PnPDP) frameworks have emerged as a powerful paradigm for solving imaging inverse problems by treating pretrained generative models as modular priors. However, we identify a critical flaw in prevailing PnP solvers (e.g., based on HQS or Proximal Gradient): they function as memoryless operators, updating estimates solely based on instantaneous gradients. This lack of historical tracking inevitably leads to non-vanishing steady-state bias, where the reconstruction fails to strictly satisfy physical measurements under heavy corruption. To resolve this, we propose Dual-Coupled PnP Diffusion (DC-PnPDP), which restores the classical dual variable to provide integral feedback, progressively enforcing agreement between data consistency and the prior. However, this rigorous geometric coupling introduces a secondary challenge: the accumulated dual residuals exhibit spectrally colored, structured artifacts that violate the Additive White Gaussian Noise (AWGN) assumption of diffusion priors, causing severe hallucinations. To bridge this gap, we introduce Spectral Homogenization (SH), a frequency-domain adaptation mechanism that modulates these structured residuals into statistically compliant pseudo-AWGN inputs. This effectively aligns the solver's rigorous optimization trajectory with the denoiser's valid statistical manifold. Extensive experiments on CT and MRI reconstruction demonstrate that our approach resolves the bias-hallucination trade-off, achieving state-of-the-art fidelity with significantly accelerated convergence. The code is available at https://github.com/duchenhe/DC-PnPDP

2602.20971 2026-06-04 cs.LG cs.AI Version update

Does Order Matter: Connecting The Law of Robustness to Robust Generalization

顺序重要吗:连接鲁棒性定律与鲁棒泛化

Mihir More, Aritra Das, Jaee Ponde, Himadri Mandal, Vishnu Varadarajan, Debayan Gupta

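Affiliations * Ashoka University; Truth Audit Labs; Indian Statistical Institute

AI summary Links the Law of Robustness (a lower bound on the Lipschitz constant) to robust generalization error through global and local Rademacher complexities. Proves that, for arbitrary data distributions, the order of the Lipschitz bound is unchanged at the global scale, while at the local scale it varies with the perturbation radius and a localized concentration term.

Details
AI-generated Chinese abstract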

Bubeck和Selke(2021)将鲁棒性定律与鲁棒泛化误差之间的联系作为一个开放问题提出。鲁棒性定律指出,过参数化对于模型实现鲁棒插值是必要的,即插值函数必须是Lipschitz的。Wu等人(2023)将该定律推广到任意数据分布,证明Lipschitz常数满足$L = Ω(n^{1/d})$。另一方面,鲁棒泛化研究小的鲁棒训练损失是否意味着小的鲁棒测试损失。这可以使用统计学习技术(如Rademacher复杂度)来研究,其中鲁棒损失类的Rademacher复杂度的界意味着函数类Lipschitz性的界。我们利用这一联系,明确地将两者联系起来,适用于任意数据分布。(i) 我们证明,在考虑鲁棒损失类的全局Rademacher复杂度时,Lipschitz界的阶保持不变。(ii) 在局部尺度上,即对于具有小经验误差的函数子集,Lipschitz界的阶随扰动半径$ρ$和局部浓度项$\sqrt{r/n}$变化。

English abstract

Bubeck and Sellke (2021) pose the connection between the Law of Robustness and robust generalization error as an open problem. The Law of Robustness states that overparameterization is necessary for models to interpolate robustly, i.e., the interpolating function is required to be Lipschitz. Wu et al. (2023) extend this law to arbitrary data distributions, proving that the Lipschitz constant satisfies $L = Ω(n^{1/d})$. Robust generalization, on the other hand, asks whether small robust training loss implies small robust test loss. This can be studied using statistical learning techniques such as Rademacher complexities, where a bound on the Rademacher complexity of the robust loss class implies a bound on the Lipschitzness of the function class. We use this connection to explicitly link the two for arbitrary data distributions. (i) We prove that the order of the Lipschitz bound remains the same when considering the global Rademacher complexity of robust loss classes. (ii) At the local scale, i.e., for subsets of functions with small empirical error, the order of the Lipschitz bound changes with the perturbation radius $ρ$ and the localized concentration term $\sqrt{r/n}$.

2602.19799 2026-06-04 stat.ML cs.LG math.OC Version update

Path-conditioned training: a principled way to rescale ReLU neural networks

路径条件训练:一种缩放ReLU神经网络的原则性方法

Arthur Lebeurrier, Titouan Vayer, Rémi Gribonval

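Affiliations * Université de Lyon; CNRS

AI summary Proposes a geometric criterion, built on the path-lifting framework, for rescaling the parameters of ReLU networks. Minimizing the criterion aligns a kernel in the path-lifting space with a chosen reference, which can speed up training.

Details
Journal ref
Proceedings of the 43rd International Conference on Machine Learning (ICML 2026), Seoul, South Korea, PMLR 306 (2026)
AI-generated Chinese abstract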

尽管最近算法有所进展,我们仍然缺乏原则性的方法来利用ReLU神经网络参数中记录良好的缩放对称性。虽然两个适当缩放的权重实现相同的函数,但训练动态可能截然不同。为了对这一现象提供新的视角,我们基于最近的路径提升框架,该框架提供了ReLU网络的紧凑分解。我们引入了一个几何动机的准则来缩放神经网络参数,其最小化导致一种条件策略,将路径提升空间中的核与选定的参考对齐。我们推导了一种有效的算法来执行这种对齐。在随机网络初始化的背景下,我们分析了架构和初始化尺度如何共同影响所提出方法的输出。数值实验展示了其加速训练的潜力。

English abstract

Despite recent algorithmic advances, we still lack principled ways to leverage the well-documented rescaling symmetries in ReLU neural network parameters. While two properly rescaled sets of weights implement the same function, their training dynamics can be dramatically different. To offer a fresh perspective on exploiting this phenomenon, we build on the recent path-lifting framework, which provides a compact factorization of ReLU networks. We introduce a geometrically motivated criterion to rescale neural network parameters, whose minimization leads to a conditioning strategy that aligns a kernel in the path-lifting space with a chosen reference. We derive an efficient algorithm to perform this alignment. In the context of random network initialization, we analyze how the architecture and the initialization scale jointly impact the output of the proposed method. Numerical experiments illustrate its potential to speed up training.
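
The rescaling symmetry mentioned in this abstract can be checked numerically. The sketch below is only an illustration, assuming a one-hidden-layer ReLU network with made-up weights; it is not the paper's path-conditioning algorithm. Each hidden neuron's incoming weights and bias are multiplied by a positive factor and its outgoing weight is divided by the same factor, which leaves the network output unchanged even though the parameters differ.

```python
def relu(z):
    return max(0.0, z)

def forward(x, W, b, v):
    """One-hidden-layer ReLU network: sum_j v[j] * relu(W[j] . x + b[j])."""
    hidden = [relu(sum(w * xi for w, xi in zip(row, x)) + bj) for row, bj in zip(W, b)]
    return sum(vj * hj for vj, hj in zip(v, hidden))

def rescale(W, b, v, lam):
    """Scale neuron j's incoming weights and bias by lam[j] > 0 and its outgoing weight by 1/lam[j]."""
    W2 = [[l * w for w in row] for row, l in zip(W, lam)]
    b2 = [l * bj for bj, l in zip(b, lam)]
    v2 = [vj / l for vj, l in zip(v, lam)]
    return W2, b2, v2

# Hypothetical parameters, chosen only for illustration.
W = [[1.0, -2.0], [0.5, 0.25], [-1.5, 1.0]]
b = [0.1, -0.3, 0.2]
v = [2.0, -1.0, 0.5]
lam = [4.0, 0.5, 10.0]

W2, b2, v2 = rescale(W, b, v, lam)
x = [0.7, -0.4]
y_original = forward(x, W, b, v)
y_rescaled = forward(x, W2, b2, v2)
print(y_original, y_rescaled)  # equal up to floating-point rounding
```

The invariance follows from the positive homogeneity of ReLU, relu(c z) = c relu(z) for c > 0. Gradients with respect to the two parameter sets are not equal, which is why the choice of rescaling can matter for training.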

2602.16966 2026-06-04 cs.LG cs.AI Version update

A Unified Framework for Locality in Scalable MARL

可扩展多智能体强化学习中局部性的统一框架

Sourav Chakraborty, Amit Kiran Rege, Claire Monteleoni, Lijun Chen

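Affiliations * University of Colorado Boulder; INRIA Paris

AI summary Proposes a unified framework that bounds the matrix $C^π$ by separate environment-sensitivity and policy-sensitivity parts. It shows that the spectral-radius condition $ρ(H^π)<1$ is strictly weaker than the row-sum condition, that the softmax temperature directly controls locality, and gives a deterministic guarantee for block-coordinate KL-proximal policy improvement.

Details
AI-generated Chinese abstract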

网络化多智能体强化学习的可扩展方法让每个智能体仅使用智能体图的一小部分邻域进行规划。这仅在系统是值局部性时有效,即一个智能体的扰动对远处另一个智能体的长期值影响较弱。在平均奖励设置中,验证局部性的标准方法是Dobrushin行和界,该界基于一个矩阵$C^π$,该矩阵捕捉每个智能体的下一个状态如何依赖于其他智能体的当前状态。为了使该矩阵易于处理,先前的工作通过联合动作的上确界来约束它。得到的界与策略无关,但当策略从不选择最坏情况动作时,该界是松的。我们将$C^π$分解为分别跟踪环境敏感性和策略敏感性的部分,$C^π\preceq E^{\mathrm s}+E^{\mathrm a}Π(π)$,其中$E^{\mathrm s}$衡量下一个状态如何随当前状态变化,$E^{\mathrm a}$衡量它如何随当前动作变化,$Π(π)$衡量策略对状态变化的反应程度。那么$H^π:= E^{\mathrm s}+E^{\mathrm a}Π(π)$的谱半径控制平均奖励泊松解的衰减,谱证书$ρ(H^π)<1$严格弱于同一矩阵上的行和条件$\|H^π\|_\infty<1$,并适用于先前Dobrushin风格工作中使用的策略无关动作上确界界无法处理的场景。对于温度-$τ$ softmax策略,我们有$Π(π)\le L/(2τ)$,因此softmax温度直接控制局部性。我们利用这一衰减结果为块坐标KL近端策略改进模板提供确定性预言机保证,其截断偏差随消息传递半径$κ$指数衰减。

English abstract

Scalable methods for networked multi-agent reinforcement learning let each agent plan using only a small neighborhood of the agent graph. This works only when the system is value-local, meaning a perturbation at one agent affects the long-run value at another agent weakly when the two are far apart. In the average-reward setting, the standard way to certify locality is the Dobrushin row-sum bound on a single matrix $C^π$ that captures how each agent's next state depends on each other agent's current state. To make this matrix easy to work with, prior work bounds it by a supremum over joint actions. The resulting bound is independent of the policy, but it is loose whenever the policy never picks the worst-case action. We split $C^π$ into pieces that separately track environment sensitivity and policy sensitivity, $C^π\preceq E^{\mathrm s}+E^{\mathrm a}Π(π)$, where $E^{\mathrm s}$ measures how the next state moves with the current state, $E^{\mathrm a}$ measures how it moves with the current action, and $Π(π)$ measures how reactive the policy is to changes in state. The spectral radius of $H^π:= E^{\mathrm s}+E^{\mathrm a}Π(π)$ then controls the decay of the average-reward Poisson solution, and the spectral certificate $ρ(H^π)<1$ is strictly weaker than the row-sum condition $\|H^π\|_\infty<1$ on the same matrix and applies in regimes where policy-independent action-supremum bounds used in prior Dobrushin-style work cannot. For temperature-$τ$ softmax policies we get $Π(π)\le L/(2τ)$, so the softmax temperature directly controls locality. We use this decay result to give a deterministic oracle guarantee for a block-coordinate KL-proximal policy-improvement template whose truncation bias decays exponentially in the message-passing radius $κ$.
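
The gap between the row-sum certificate and the spectral certificate is easy to see on a small example. The sketch below uses a hypothetical 2x2 nonnegative matrix standing in for $H^π$ (the entries are invented, not taken from the paper). It compares the row-sum norm with a Gelfand-type upper bound on the spectral radius, $ρ(H) \le \|H^k\|_\infty^{1/k}$, computed by repeated squaring.

```python
def matmul(A, B):
    """Plain matrix product of two lists-of-lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def inf_norm(A):
    """Maximum absolute row sum."""
    return max(sum(abs(a) for a in row) for row in A)

def spectral_radius_upper(A, squarings=6):
    """Upper bound rho(A) <= ||A^k||_inf^(1/k) with k = 2**squarings."""
    P = A
    for _ in range(squarings):
        P = matmul(P, P)
    return inf_norm(P) ** (1.0 / 2 ** squarings)

# Hypothetical sensitivity matrix: agent 1 depends strongly on agent 2, but not vice versa.
H = [[0.1, 1.2],
     [0.2, 0.1]]

row_sum = inf_norm(H)                # about 1.3, so the row-sum condition fails
rho_bound = spectral_radius_upper(H) # below 1, so the spectral condition holds
print(row_sum, rho_bound)
```

For this matrix the exact eigenvalues are $0.1 \pm \sqrt{0.24}$, so the spectral radius is roughly 0.59 even though one row sums to more than 1. Powers of the matrix therefore still decay geometrically, which is the kind of decay the spectral certificate relies on.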

2602.03972 2026-06-04 stat.ML cs.AI cs.LG Version update

Fixed Budget is No Harder Than Fixed Confidence in Best-Arm Identification up to Logarithmic Factors

固定预算在最佳臂识别中不比固定置信度难(对数因子范围内)

Kapilan Balagopalan, Yinan Li, Yao Zhao, Tuan Nguyen, Anton Daitche, Houssam Nassif, Kwang-Sung Jun

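Affiliations * University of California, Berkeley; University of Washington; University of Texas at Austin

AI summary Proposes FC2FB, a meta-algorithm that converts a fixed-confidence algorithm into a fixed-budget algorithm, and proves that the fixed-budget sample complexity is no higher than the fixed-confidence sample complexity up to logarithmic factors.

Details
Journal ref
International Conference on Machine Learning (ICML'26), Seoul, Korea, 2026
AI-generated Chinese abstract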

最佳臂识别(BAI)问题是交互式机器学习中最基本的问题之一,有两种形式:固定预算设置(FB)和固定置信度设置(FC)。对于具有唯一最佳臂的$K$臂赌博机,两种设置的最优样本复杂度已被确定,且在对数因子内匹配。这引出了一个关于通用的、可能具有结构化的BAI问题的有趣研究问题:FB是否比FC更难,还是相反?在本文中,我们证明FB在对数因子内并不比FC难。我们通过构造性方式做到这一点:我们提出了一种名为FC2FB(固定置信度到固定预算)的新算法,这是一种元算法,它接收一个FC算法$\mathcal{A}$并将其转化为FB算法。我们证明FC2FB的样本复杂度与$\mathcal{A}$的样本复杂度在对数因子内匹配。这意味着最优FC样本复杂度是FB最优样本复杂度的一个上界(在对数因子内)。我们的结果不仅揭示了FB和FC之间的基本关系,而且具有重要含义:FC2FB与现有最先进的FC算法相结合,可以改善许多FB问题的样本复杂度。

English abstract

The best-arm identification (BAI) problem is one of the most fundamental problems in interactive machine learning and comes in two flavors: the fixed-budget setting (FB) and the fixed-confidence setting (FC). For $K$-armed bandits with a unique best arm, the optimal sample complexities for both settings have been settled, and they match up to logarithmic factors. This prompts an interesting research question about generic, potentially structured BAI problems: is FB harder than FC, or the other way around? In this paper, we show that FB is no harder than FC up to logarithmic factors. We do this constructively: we propose a novel algorithm called FC2FB (fixed confidence to fixed budget), a meta-algorithm that takes in an FC algorithm $\mathcal{A}$ and turns it into an FB algorithm. We prove that FC2FB enjoys a sample complexity that matches, up to logarithmic factors, that of $\mathcal{A}$. This means that the optimal FC sample complexity is an upper bound on the optimal FB sample complexity up to logarithmic factors. Our result not only reveals a fundamental relationship between FB and FC, but also has a significant implication: FC2FB combined with existing state-of-the-art FC algorithms leads to improved sample complexity for a number of FB problems.

2602.14885 2026-06-04 cond-mat.dis-nn cond-mat.stat-mech cs.LG q-bio.NC Version update

Drift-Diffusion Matching: Embedding dynamics in latent manifolds of asymmetric neural networks

漂移-扩散匹配:非对称神经网络潜在流形中的动力学嵌入

Ramón Nartallo-Kaluarachchi, Renaud Lambiotte, Alain Goriely

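Affiliations * Mathematical Institute, University of Oxford; Centre for Eudaimonia and Human Flourishing, University of Oxford; Complexity Science Hub, Vienna

AI summary Proposes drift-diffusion matching, a framework for training continuous-time recurrent neural networks to embed arbitrary nonlinear stochastic differential equations in a low-dimensional latent subspace. Asymmetric connectivity is used to realize nonequilibrium dynamics, with applications to modeling associative and sequential memory.

Comments 25 pages, 16 figures

Details
AI-generated Chinese abstract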

循环神经网络(RNN)为理解生物神经回路中的计算提供了理论框架,然而经典结果(如Hopfield联想记忆模型)依赖于对称连接,将网络动力学限制为梯度流。相比之下,生物网络支持由其非对称性促进的丰富时间依赖行为。本文引入一个通用框架,称为漂移-扩散匹配,用于训练连续时间RNN在低维潜在子空间中表示具有给定漂移和扩散系数的任意非线性随机微分方程(SDE)。通过允许非对称连接,我们证明RNN能够忠实地嵌入给定SDE的漂移和扩散,包括非线性非平衡动力学(如混沌吸引子)。作为应用,我们构建了随机系统的RNN实现,这些系统通过输入驱动切换和由非平衡电流驱动的自主跃迁短暂探索各种吸引子,我们将其解释为联想记忆和序列(情景)记忆的模型。为了阐明这些动力学如何在网络中编码,我们基于RNN的非对称连接及其时间不可逆性引入分解。我们的结果将吸引子神经网络理论扩展到平衡态之外,表明非对称神经群体可以在低维流形内实现广泛的动力学计算,统一了来自联想记忆、非平衡统计力学和神经计算的思想。

English abstract

Recurrent neural networks (RNNs) provide a theoretical framework for understanding computation in biological neural circuits, yet classical results, such as Hopfield's model of associative memory, rely on symmetric connectivity that restricts network dynamics to gradient-like flows. In contrast, biological networks support rich time-dependent behaviour facilitated by their asymmetry. Here we introduce a general framework, which we term drift-diffusion matching, for training continuous-time RNNs to represent arbitrary, nonlinear stochastic differential equations (SDEs), with given drift and diffusion coefficients, within a low-dimensional latent subspace. Allowing asymmetric connectivity, we show that RNNs can faithfully embed the drift and diffusion of a given SDE, including nonlinear and nonequilibrium dynamics such as chaotic attractors. As an application, we construct RNN realisations of stochastic systems that transiently explore various attractors through both input-driven switching and autonomous transitions driven by nonequilibrium currents, which we interpret as models of associative and sequential (episodic) memory. To elucidate how these dynamics are encoded in the network, we introduce decompositions of the RNN based on its asymmetric connectivity and its time-irreversibility. Our results extend attractor neural network theory beyond equilibrium, showing that asymmetric neural populations can implement a broad class of dynamical computations within low-dimensional manifolds, unifying ideas from associative memory, nonequilibrium statistical mechanics, and neural computation.

2602.12643 2026-06-04 cs.LG cs.AI stat.ML Version update

Unifying Model-Free Efficiency and Model-Based Representations via Latent Dynamics

通过潜在动力学统一无模型效率与基于模型的表示

Jashaswimalya Acharjee, Balaraman Ravindran

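AI summary Proposes the Unified Latent Dynamics (ULD) algorithm, which embeds state-action pairs into a latent space where the value function is approximately linear. This combines the efficiency of model-free methods with the representational strengths of model-based methods without planning overhead, and matches or exceeds specialized baselines across 80 environments.

Comments Similarities found with a prior work. Hence, requesting for withdrawal until further notice

Details
AI-generated Chinese abstract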

我们提出了统一潜在动力学(ULD),一种新颖的强化学习算法,它统一了无模型方法的效率与基于模型方法的表示优势,且不产生规划开销。通过将状态-动作对嵌入到真实值函数近似线性的潜在空间中,我们的方法支持跨不同领域使用单一超参数集,涵盖从低维和像素输入的连续控制到高维Atari游戏。我们证明,在温和条件下,基于嵌入的时序差分更新的不动点与相应线性基于模型的值扩展的不动点一致,并推导了将嵌入保真度与值逼近质量相关联的显式误差界。在实践中,ULD采用编码器、值函数和策略网络的同步更新、短视界预测动力学的辅助损失以及奖励尺度归一化,以确保在稀疏奖励下的稳定学习。在涵盖Gym运动控制、DeepMind Control(本体感觉和视觉)以及Atari的80个环境上的评估表明,我们的方法匹配或超过了专门的无模型基线和通用基于模型基线的性能,以最少的调参和一小部分参数量实现了跨领域能力。这些结果表明,仅与值对齐的潜在表示就能提供传统上归因于完整基于模型规划的适应性和样本效率。

English abstract

We present Unified Latent Dynamics (ULD), a novel reinforcement learning algorithm that unifies the efficiency of model-free methods with the representational strengths of model-based approaches, without incurring planning overhead. By embedding state-action pairs into a latent space in which the true value function is approximately linear, our method supports a single set of hyperparameters across diverse domains -- from continuous control with low-dimensional and pixel inputs to high-dimensional Atari games. We prove that, under mild conditions, the fixed point of our embedding-based temporal-difference updates coincides with that of a corresponding linear model-based value expansion, and we derive explicit error bounds relating embedding fidelity to value approximation quality. In practice, ULD employs synchronized updates of encoder, value, and policy networks, auxiliary losses for short-horizon predictive dynamics, and reward-scale normalization to ensure stable learning under sparse rewards. Evaluated on 80 environments spanning Gym locomotion, DeepMind Control (proprioceptive and visual), and Atari, our approach matches or exceeds the performance of specialized model-free and general model-based baselines -- achieving cross-domain competence with minimal tuning and a fraction of the parameter footprint. These results indicate that value-aligned latent representations alone can deliver the adaptability and sample efficiency traditionally attributed to full model-based planning.

2602.11406 2026-06-04 stat.ML cs.LG Version update

The Cost of Learning Under Multiple Change Points

多个变化点下的学习成本

Tomer Gafni, Garud Iyengar, Assaf Zeevi

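AI summary For online learning with multiple change points, proposes Anytime Tracking CUSUM (ATC), a class of selective-detection algorithms that achieves a nearly minimax-optimal regret bound.

Comments A version of this work has been accepted for publication in the Proceedings of the 43rd International Conference on Machine Learning (ICML 2026), Seoul, South Korea

Details
AI-generated Chinese abstract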

我们考虑具有多个变化点的环境中的在线学习问题。与使用经典“高置信度”检测方案广泛研究的单变化点问题不同,多变化点环境提出了新的学习理论和算法挑战。具体来说,我们表明经典方法可能由于我们称之为内生混杂的现象而表现出灾难性失败(高遗憾)。为了克服这一点,我们提出了一类新的学习算法,称为任意时间跟踪CUSUM(ATC)。这些是无视时间范围的在线算法,实现选择性检测原则,平衡忽略“小”(难以检测)变化的需要,同时对显著变化做出“快速”反应。我们证明,适当调整的ATC算法的性能几乎是极小化最优的;其遗憾保证紧密匹配任何学习算法在多变化点问题中可实现性能的新信息论下界。在合成数据以及真实数据上的实验验证了上述理论发现。

English abstract

We consider an online learning problem in environments with multiple change points. In contrast to the single change point problem that is widely studied using classical "high confidence" detection schemes, the multiple change point environment presents new learning-theoretic and algorithmic challenges. Specifically, we show that classical methods may exhibit catastrophic failure (high regret) due to a phenomenon we refer to as endogenous confounding. To overcome this, we propose a new class of learning algorithms dubbed Anytime Tracking CUSUM (ATC). These are horizon-free online algorithms that implement a selective detection principle, balancing the need to ignore "small" (hard-to-detect) shifts, while reacting "quickly" to significant ones. We prove that the performance of a properly tuned ATC algorithm is nearly minimax-optimal; its regret is guaranteed to closely match a novel information-theoretic lower bound on the achievable performance of any learning algorithm in the multiple change point problem. Experiments on synthetic as well as real-world data validate the aforementioned theoretical findings.
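
For readers unfamiliar with CUSUM, the sketch below shows the classical one-sided CUSUM test for an upward mean shift. It is a textbook detector, not the ATC algorithm from the paper; the data stream, drift allowance, and threshold are hypothetical values chosen so the behaviour is easy to trace by hand. The drift allowance makes the statistic ignore small shifts while still reacting within a few samples to a large one, which gives a rough feel for the selective detection idea described in the abstract.

```python
def cusum_first_alarm(xs, mu0, drift, threshold):
    """Index of the first alarm of a one-sided CUSUM test, or None if it never fires.

    Statistic: S_t = max(0, S_{t-1} + (x_t - mu0 - drift)); alarm when S_t > threshold.
    """
    s = 0.0
    for t, x in enumerate(xs):
        s = max(0.0, s + (x - mu0 - drift))
        if s > threshold:
            return t
    return None

def make_stream(n, change_at=None, shift=1.0):
    """Deterministic toy stream: alternating +/-0.2 noise, plus a mean shift from change_at on."""
    out = []
    for t in range(n):
        noise = 0.2 if t % 2 == 0 else -0.2
        jump = shift if change_at is not None and t >= change_at else 0.0
        out.append(noise + jump)
    return out

# Large shift (mean 0 -> 1.0) at t = 30: detected a few samples later.
alarm_big = cusum_first_alarm(make_stream(60, change_at=30, shift=1.0), 0.0, 0.5, 2.5)
# Small shift (mean 0 -> 0.3): stays below the drift allowance, never flagged.
alarm_small = cusum_first_alarm(make_stream(60, change_at=30, shift=0.3), 0.0, 0.5, 2.5)
# No change at all: no alarm.
alarm_none = cusum_first_alarm(make_stream(60), 0.0, 0.5, 2.5)
print(alarm_big, alarm_small, alarm_none)
```

With these settings the large shift is flagged at index 34, four samples after the change, while the small shift and the unchanged stream raise no alarm.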

2510.26219 2026-06-04 cs.LG cs.AI Version update

Test-time reward-guided alignment of language models by importance sampling on pre-logit space

基于预逻辑空间重要性采样的测试时奖励引导语言模型对齐

Sekitoshi Kanai, Tsukasa Yoshida, Hiroshi Takahashi, Haru Kuroki, Kazumune Hashimoto

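Affiliations * NTT, Inc.; Toyohashi University of Technology; The University of Osaka

AI summary Proposes AISP, a test-time alignment method based on adaptive importance sampling on pre-logits. It optimizes expected reward using Gaussian perturbations and importance sampling, and is more sample-efficient than best-of-n sampling and other test-time alignment methods.

Comments 24 pages, 10 figures

Details
AI-generated Chinese abstract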

大型语言模型(LLM)的测试时对齐因其微调计算成本高而受到关注。本文提出一种新的测试时奖励引导对齐方法,称为基于预逻辑的自适应重要性采样(AISP),该方法基于随机控制输入的采样模型预测控制。AISP将高斯扰动应用于预逻辑(倒数第二层的输出),以最大化相对于扰动均值的期望奖励。我们证明,通过重要性采样和采样奖励可以获得最优均值。AISP在使用样本数量方面的奖励优于最佳-of-n采样,并且比其他基于奖励的测试时对齐方法获得更高的奖励。

English abstract

Test-time alignment of large language models (LLMs) is attracting attention because fine-tuning LLMs is computationally expensive. In this paper, we propose a new test-time reward-guided alignment method called adaptive importance sampling on pre-logits (AISP), based on sampling-based model predictive control with stochastic control inputs. AISP applies a Gaussian perturbation to the pre-logits, which are the outputs of the penultimate layer, so as to maximize the expected reward with respect to the mean of the perturbation. We demonstrate that the optimal mean is obtained by importance sampling with sampled rewards. AISP outperforms best-of-n sampling in terms of reward per number of samples used and achieves higher rewards than other reward-based test-time alignment methods.
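
The idea of obtaining a perturbation mean from importance-weighted samples can be sketched on a toy problem. The code below is a generic softmax-weighted update in the style of sampling-based model predictive control, written under several assumptions: a 2-dimensional vector stands in for the pre-logits, a hypothetical quadratic reward replaces a reward model, and the sample count, noise scale, and temperature are arbitrary. It is not AISP's actual update rule.

```python
import math
import random

def reward(z, target):
    """Toy reward (assumption): negative squared distance to a hypothetical target."""
    return -sum((zi - ti) ** 2 for zi, ti in zip(z, target))

def update_mean(mean, target, rng, n_samples=64, sigma=1.0, temperature=1.0):
    """One update: sample Gaussian perturbations around `mean`, weight them by
    exp(reward / temperature), and return the weighted average."""
    samples = [[m + sigma * rng.gauss(0.0, 1.0) for m in mean] for _ in range(n_samples)]
    rewards = [reward(z, target) for z in samples]
    r_max = max(rewards)  # subtract the max before exponentiating, for numerical stability
    weights = [math.exp((r - r_max) / temperature) for r in rewards]
    total = sum(weights)
    return [sum(w * z[d] for w, z in zip(weights, samples)) / total
            for d in range(len(mean))]

rng = random.Random(0)
target = [2.0, -1.0]
mean = [0.0, 0.0]
initial_distance = math.dist(mean, target)
for _ in range(20):
    mean = update_mean(mean, target, rng)
final_distance = math.dist(mean, target)
print(initial_distance, final_distance)
```

Each update pulls the mean toward high-reward samples, so over the iterations it drifts from the origin toward the target. The weights play the role of importance weights on the sampled perturbations.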

2509.16301 2026-06-04 q-bio.QM cs.LG Version update

TF-DWGNet: A Directed Weighted Graph Neural Network with Tensor Fusion for Multi-Omics Cancer Subtype Classification

TF-DWGNet: 基于张量融合的有向加权图神经网络用于多组学癌症亚型分类

Tiantian Yang, Zhiqian Chen

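Affiliations * Mathematics and Statistical Science, University of Idaho; Computer Science and Engineering, Mississippi State University

AI summary Proposes TF-DWGNet, a framework that combines tree-based directed weighted graph construction with a tensor fusion mechanism to handle heterogeneity and higher-order interactions in multi-omics data. It outperforms existing methods on cancer subtype classification and provides interpretability.

Comments 9 pages, 4 figures, 4 tables

Details
Journal ref
NAR Genomics and Bioinformatics, Volume 8, Issue 2, 2026, lqag054
AI-generated Chinese abstract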

多组学数据的整合与分析为改善癌症亚型分类提供了宝贵的见解。然而,这些数据本质上是异质的、高维的,并表现出复杂的模态内和模态间依赖关系。图神经网络(GNN)为建模这些结构提供了一个原则性框架,但现有方法通常依赖先验知识或预定义的相似性网络,这些网络生成无向或无权重图,无法捕捉任务特定的方向性和交互强度。在模态和特征层面的可解释性也仍然有限。为了解决这些挑战,我们提出了TF-DWGNet,一种新颖的图神经网络框架,它结合了基于树的有向加权图构建与张量融合,用于多类癌症亚型分类。TF-DWGNet引入了两个关键创新:(i)一种监督的基于树的策略,为每种组学模态构建定制的有向加权图,以及(ii)一种张量融合机制,通过低秩分解捕获单模态、双模态和三模态交互,以提高计算效率。在三个真实世界癌症数据集上的实验表明,TF-DWGNet在多个指标和统计测试中始终优于最先进的基线方法。此外,该模型通过模态级贡献分数和排序的特征重要性提供了生物学上有意义的见解。这些结果突显了TF-DWGNet是癌症研究中多组学整合的有效且可解释的解决方案。

English abstract

Integration and analysis of multi-omics data provide valuable insights for improving cancer subtype classification. However, such data are inherently heterogeneous, high-dimensional, and exhibit complex intra- and inter-modality dependencies. Graph neural networks (GNNs) offer a principled framework for modeling these structures, but existing approaches often rely on prior knowledge or predefined similarity networks that produce undirected or unweighted graphs and fail to capture task-specific directionality and interaction strength. Interpretability at both the modality and feature levels also remains limited. To address these challenges, we propose TF-DWGNet, a novel Graph Neural Network framework that combines tree-based Directed Weighted graph construction with Tensor Fusion for multiclass cancer subtype classification. TF-DWGNet introduces two key innovations: (i) a supervised tree-based strategy that constructs directed, weighted graphs tailored to each omics modality, and (ii) a tensor fusion mechanism that captures unimodal, bimodal, and trimodal interactions using low-rank decomposition for computational efficiency. Experiments on three real-world cancer datasets demonstrate that TF-DWGNet consistently outperforms state-of-the-art baselines across multiple metrics and statistical tests. In addition, the model provides biologically meaningful insights through modality-level contribution scores and ranked feature importance. These results highlight that TF-DWGNet is an effective and interpretable solution for multi-omics integration in cancer research.

2511.13391 2026-06-04 cs.LG cs.AI math.CO math.MG Version update

Finding Kissing Numbers with Game-theoretic Reinforcement Learning

用博弈论强化学习寻找亲吻数

Chengdong Ma, Théo Tao Zhaowei, Pengyu Li, Minghao Liu, Haojun Chen, Zihao Mao, Bo Li, Yuan Cheng, Yuan Qi, Yaodong Yang

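Affiliations * Institute for Artificial Intelligence, Peking University; Shanghai Academy of AI for Science; Artificial Intelligence Innovation and Incubation Institute, Fudan University

AI summary Recasts the kissing number problem as a cooperative matrix-completion game and uses the reinforcement learning system PackingStar to explore the extremal configuration space, improving 15 long-standing bounds on kissing numbers and their generalizations and discovering new interpretable geometric structures.

Details
AI-generated Chinese abstract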

自1694年牛顿首次研究亲吻数问题以来,确定中心球周围非重叠球的最大数量一直是离散几何中的一个决定性挑战。作为希尔伯特第18问题的局部类比,它在几何、数论和信息论中具有深远意义。尽管格和编码取得了显著进展,但该领域局限于孤立的极值构型,掩盖了潜在的几何原理。在这里,我们将对象转移到更广泛的极值配置空间,从而为亲吻数问题开辟了一条新路径。因此,我们将该问题重新表述为一个合作矩阵补全博弈,并训练一个强化学习系统PackingStar来解决它。一个玩家填充余弦条目,而另一个玩家纠正次优条目,使爆炸性的几何复杂性变得可处理。在极值配置空间内工作,PackingStar发现了新的可解释几何结构,改进了15个在亲吻数及其推广中保持数十年的强上界,其中几个在自然内积下被证明是最优的。这些发现揭示了Fischer群Fi22的第一个显式球面编码实现,扩展了子群结构的经典欧几里得表示,并直接启发了数学家的后续突破。总体而言,这项工作为人工智能在希尔伯特级别问题上的进展提供了一个早期示例,展示了强化学习通过解锁更具表现力的对象来推动数学发现。

English abstract

Since Isaac Newton first studied the Kissing Number Problem in 1694, determining the maximal number of non-overlapping spheres around a central sphere has remained a defining challenge in discrete geometry. As the local analogue of Hilbert's 18th problem, it has profound implications across geometry, number theory and information theory. Although lattices and codes have achieved significant progress, the field is confined to isolated extremal configurations, leaving underlying geometric principles obscured. Here we shift the object to the broader extremal configuration space, thereby opening a new path for the Kissing Number Problem. Accordingly, we recast this problem as a cooperative matrix-completion game, and train a reinforcement learning system, PackingStar, to solve it. One player fills cosine entries while the other corrects suboptimal ones, making explosive geometric complexity tractable. Working within extremal configuration spaces, PackingStar discovers new interpretable geometric structures that improve 15 strong bounds held for decades in kissing numbers and their generalizations, several of them provably optimal under natural inner products. These findings reveal the first explicit spherical-code realization of the Fischer group Fi22, extend the classical Euclidean representation of subgroup structure, and directly inspire subsequent breakthroughs by mathematicians. Overall, the work provides an early example of AI-driven progress on a Hilbert-calibre problem, showing how reinforcement learning advances mathematical discovery by unlocking more expressive objects.
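
The "cosine entries" in this abstract refer to the Gram matrix of the contact points: unit spheres touching a central unit sphere do not overlap exactly when the angle between any two contact directions is at least 60 degrees, that is, when every pairwise cosine is at most 1/2. The sketch below checks that condition for the six-point hexagonal arrangement in the plane. It only verifies a given configuration and says nothing about how PackingStar searches for one; the function names are illustrative.

```python
import math

def gram_matrix(points):
    """Pairwise inner products; for unit vectors these are the cosines of the angles."""
    return [[sum(a * b for a, b in zip(p, q)) for q in points] for p in points]

def is_kissing_configuration(points, tol=1e-9):
    """True if every point is a unit vector and every pairwise cosine is <= 1/2."""
    G = gram_matrix(points)
    n = len(points)
    unit = all(abs(G[i][i] - 1.0) <= tol for i in range(n))
    separated = all(G[i][j] <= 0.5 + tol
                    for i in range(n) for j in range(n) if i != j)
    return unit and separated

# Six directions spaced 60 degrees apart: the optimal arrangement in dimension 2.
hexagon = [(math.cos(k * math.pi / 3), math.sin(k * math.pi / 3)) for k in range(6)]
# Squeezing in a seventh direction at 30 degrees violates the cosine bound.
seven = hexagon + [(math.cos(math.pi / 6), math.sin(math.pi / 6))]

print(is_kissing_configuration(hexagon), is_kissing_configuration(seven))
```

Finding a valid configuration with more points in a given dimension improves the lower bound on the kissing number there, which is why a completed Gram matrix satisfying these constraints serves as a certificate.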

2602.09075 2026-06-04 cs.LG cs.AI Version update

Learning to Remember, Learn, and Forget in Attention-Based Models

在基于注意力的模型中学习记忆、学习和遗忘

Djohan Bonnet, Jamie Lohoff, Jan Finkbeiner, Elidona Shiqerukaj, Emre Neftci

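Affiliations * University of Cambridge

AI summary Proposes Palimpsa, a model that treats in-context learning as a continual learning problem and uses Bayesian metaplasticity to address the stability-plasticity dilemma, significantly increasing memory capacity and outperforming baselines on MQAR and commonsense reasoning tasks.

Details
AI-generated Chinese abstract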

Transformer中的上下文学习(ICL)作为一种在线联想记忆,被认为是其在复杂序列处理任务中高性能的基础。然而,在门控线性注意力模型中,这种记忆具有固定容量且容易受到干扰,尤其是对于长序列。我们提出Palimpsa,一种自注意力模型,将ICL视为必须解决稳定性-可塑性困境的持续学习问题。Palimpsa使用贝叶斯元可塑性,其中每个注意力状态的可塑性绑定到一个由捕获累积知识的先验分布支撑的重要性状态。我们证明各种门控线性注意力模型作为特定的架构选择和后验近似出现,并且Mamba2是Palimpsa的一个特例,其中遗忘占主导。这一理论联系使得任何非元可塑性模型都能转化为元可塑性模型,从而显著扩展其记忆容量。我们的实验表明,Palimpsa在多查询联想回忆(MQAR)基准和常识推理任务上始终优于基线。

English abstract

In-Context Learning (ICL) in transformers acts as an online associative memory and is believed to underpin their high performance on complex sequence processing tasks. However, in gated linear attention models, this memory has a fixed capacity and is prone to interference, especially for long sequences. We propose Palimpsa, a self-attention model that views ICL as a continual learning problem that must address a stability-plasticity dilemma. Palimpsa uses Bayesian metaplasticity, where the plasticity of each attention state is tied to an importance state grounded by a prior distribution that captures accumulated knowledge. We demonstrate that various gated linear attention models emerge as specific architecture choices and posterior approximations, and that Mamba2 is a special case of Palimpsa where forgetting dominates. This theoretical link enables the transformation of any non-metaplastic model into a metaplastic one, significantly expanding its memory capacity. Our experiments show that Palimpsa consistently outperforms baselines on the Multi-Query Associative Recall (MQAR) benchmark and on Commonsense Reasoning tasks.

2509.25289 2026-06-04 cs.LG cs.AI Version update

ClustRecNet: A Novel End-to-End Deep Learning Framework for Clustering Algorithm Recommendation

ClustRecNet: 一种用于聚类算法推荐的新型端到端深度学习框架

Mohammadreza Bakhtyari, Bogdan Mazoure, Renato Cordeiro de Amorim, Guillaume Rabusseau, Vladimir Makarenkov

Affiliations * Département d’Informatique, Université du Québec à Montréal; Mila - Quebec AI Institute; School of Computer Science and EE, University of Essex; Department of Computer Science and Operations Research, Université de Montréal

AI summary Proposes ClustRecNet, an end-to-end deep learning framework that recommends suitable clustering algorithms by directly learning high-order representations of raw tabular data, outperforming traditional internal cluster validity indices and AutoML approaches on synthetic and real-world benchmarks.

Comments Published in IEEE Access

Details
Journal ref
IEEE Access, vol. 14, pp. 81352 - 81365, 2026
AI-generated Chinese abstract

为给定数据集识别有效的聚类算法仍然是一个基本的无监督学习问题。我们引入了ClustRecNet,一种新颖的端到端深度学习框架,通过直接学习原始表格数据的高阶表示来推荐合适的聚类算法。为了促进稳健的元学习,我们首先构建了一个包含34,000个合成数据集的综合存储库,涵盖了多种聚类场景,运行了10种流行的聚类算法,并使用调整兰德指数(ARI)建立真实标签。ClustRecNet的架构包含一个卷积块、两个残差块和一个注意力块,以捕获局部和全局结构模式,有效绕过了与手动特征工程相关的知识瓶颈。在合成和真实世界基准上的广泛评估表明,ClustRecNet始终优于传统的内部聚类有效性指标,如轮廓系数、Calinski-Harabasz、Davies-Bouldin和Dunn,以及最先进的自动化机器学习(AutoML)方法,如ML2DAC、AutoCluster和AutoML4Clust。例如,我们的框架在合成数据上平均比Calinski-Harabasz聚类有效性指数高出0.497的ARI增益,在真实世界基准上平均比领先的AutoML方法(ML2DAC)高出44.16%的ARI改进。代码和数据可在以下网址获取:https://github.com/mrbakhtyari/ClustRecNet

English abstract

Identifying an effective clustering algorithm for a given dataset remains a fundamental unsupervised learning issue. We introduce ClustRecNet, a novel end-to-end deep learning framework that recommends suitable clustering algorithm(s) by directly learning high-order representations of raw tabular data. To facilitate robust meta-learning, we first construct a comprehensive repository of 34,000 synthetic datasets encompassing a large variety of clustering scenarios, run 10 popular clustering algorithms, and use Adjusted Rand Index (ARI) to establish ground-truth labels. ClustRecNet's architecture incorporates a convolution block, two residual blocks, and an attention block to capture local and global structural patterns, effectively bypassing the knowledge bottleneck associated with manual feature engineering. Extensive evaluation on both synthetic and real-world benchmarks demonstrates that ClustRecNet consistently outperforms traditional internal cluster validity indices such as Silhouette, Calinski-Harabasz, Davies-Bouldin, and Dunn as well as state-of-the-art Automated Machine Learning (AutoML) approaches such as ML2DAC, AutoCluster, and AutoML4Clust. For example, our framework achieves an average 0.497 ARI gain over the Calinski-Harabasz cluster validity index on synthetic data and an average 44.16% ARI improvement over the leading AutoML approach (ML2DAC) on real-world benchmarks. Code and data are available at: https://github.com/mrbakhtyari/ClustRecNet
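
The ARI-based labeling step described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes each candidate algorithm has already produced a partition of the same dataset, and the names `adjusted_rand_index`, `best_algorithm`, `algo_a`, and `algo_b` are hypothetical.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand Index between two partitions (needs at least 2 points)."""
    n = len(labels_true)
    # Pairs of points that share a cluster in both partitions.
    sum_cells = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    # Pairs that share a cluster in each partition separately.
    sum_rows = sum(comb(c, 2) for c in Counter(labels_true).values())
    sum_cols = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = 0.5 * (sum_rows + sum_cols)
    if max_index == expected:  # degenerate case, e.g. one cluster in both partitions
        return 1.0
    return (sum_cells - expected) / (max_index - expected)


def best_algorithm(labels_true, candidate_partitions):
    """Return the name of the highest-ARI candidate together with all scores."""
    scores = {
        name: adjusted_rand_index(labels_true, pred)
        for name, pred in candidate_partitions.items()
    }
    return max(scores, key=scores.get), scores


# Hypothetical toy data: two candidate partitions of six points.
truth = [0, 0, 0, 1, 1, 1]
candidates = {
    "algo_a": [1, 1, 1, 0, 0, 0],  # same grouping, different label names
    "algo_b": [0, 0, 1, 1, 2, 2],  # splits the data into three groups
}
winner, ari_scores = best_algorithm(truth, candidates)
```

Because ARI ignores label names, `algo_a` scores a perfect 1.0 even though its cluster ids are swapped, which is what makes it usable as a ground-truth signal across algorithms.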

2602.08142 2026-06-04 cs.LG stat.ML Updated version

Variance-Gated Ensembles: An Epistemic-Aware Framework for Uncertainty Estimation

方差门控集成:一种面向认知不确定性的估计框架

H. Martin Gillis, Isaac Xu, Thomas Trappenberg

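Affiliations * Faculty of Computer Science, Dalhousie University, Halifax, NS

AI summary Proposes the Variance-Gated Ensembles (VGE) framework, which injects epistemic sensitivity via a signal-to-noise gate computed from ensemble statistics, giving efficient, differentiable uncertainty estimation that matches or exceeds existing methods.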

Comments Published in Transactions on Machine Learning Research (06/2026)

Details
AI Chinese abstract

机器学习应用需要快速且可靠的逐样本不确定性估计。常见方法是使用贝叶斯或近似方法的预测分布,并将不确定性加性分解为偶然(即数据相关)和认知(即模型相关)分量。然而,加性分解最近受到质疑,有证据表明当使用有限集成采样和/或不匹配的预测分布时,该分解会失效。本文介绍方差门控集成(VGE),一种直观、可微的框架,通过从集成统计量计算的信噪比门控注入认知敏感性。VGE提供:(i)方差门控边际不确定性(VGMU)分数,将决策边际与集成预测方差耦合;(ii)方差门控归一化(VGN)层,通过每类可学习的集成成员概率归一化,将方差门控不确定性机制推广到训练。我们推导出闭合形式的向量-雅可比积,使得通过集成样本均值和方差进行端到端训练成为可能。VGE在保持计算效率的同时,匹配或超越最先进的信息论基线。因此,VGE为集成模型中的认知感知不确定性估计提供了一种实用且可扩展的方法。

English abstract

Machine learning applications require fast and reliable per-sample uncertainty estimation. A common approach is to use predictive distributions from Bayesian or approximation methods and additively decompose uncertainty into aleatoric (i.e., data-related) and epistemic (i.e., model-related) components. However, additive decomposition has recently been questioned, with evidence that it breaks down when using finite-ensemble sampling and/or mismatched predictive distributions. This paper introduces Variance-Gated Ensembles (VGE), an intuitive, differentiable framework that injects epistemic sensitivity via a signal-to-noise gate computed from ensemble statistics. VGE provides: (i) a Variance-Gated Margin Uncertainty (VGMU) score that couples decision margins with ensemble predictive variance; and (ii) a Variance-Gated Normalization (VGN) layer that generalizes the variance-gated uncertainty mechanism to training via per-class, learnable normalization of ensemble member probabilities. We derive closed-form vector-Jacobian products enabling end-to-end training through ensemble sample mean and variance. VGE matches or exceeds state-of-the-art information-theoretic baselines while remaining computationally efficient. As a result, VGE provides a practical and scalable approach to epistemic-aware uncertainty estimation in ensemble models.

2602.06883 2026-06-04 cs.LG cs.CV stat.ML Updated version

Vision Transformer Finetuning Benefits from Non-Smooth Components

视觉变换器微调受益于非平滑组件

Ambroise Odonnat, Laetitia Chapel, Romain Tavenard, Ievgen Redko

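Affiliations * Noah's Ark Lab; Univ. Rennes 2, Inria

AI summary Analyzes the plasticity of vision transformer components (the sensitivity of their outputs to input changes) and finds that attention modules and feedforward layers with high plasticity (low smoothness) finetune better, challenging the conventional view that smoothness is beneficial.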

Comments Accepted at ICML 2026

Details
AI Chinese abstract

变换器架构的平滑性在泛化、训练稳定性和对抗鲁棒性方面已被广泛研究。然而,其在迁移学习中的作用仍知之甚少。本文分析了视觉变换器组件使其输出适应输入变化的能力,即它们的\emph{可塑性}。定义为平均变化率,它捕捉了对输入扰动的敏感性;特别地,高可塑性意味着低平滑性。我们的理论分析和大量实验——在大规模视觉变换器上进行超过1000次微调运行——表明,这一视角为选择在适应过程中优先考虑的组件提供了原则性指导。对从业者的关键启示是,注意力模块和前馈层的高可塑性始终导致更好的微调性能。我们的发现偏离了平滑性是可取的普遍假设,为变换器的功能特性提供了新的视角。代码可在 https://github.com/ambroiseodt/vit-plasticity 获取。

English abstract

The smoothness of the transformer architecture has been extensively studied in the context of generalization, training stability, and adversarial robustness. However, its role in transfer learning remains poorly understood. In this paper, we analyze the ability of vision transformer components to adapt their outputs to changes in inputs, or, in other words, their plasticity. Defined as an average rate of change, it captures the sensitivity to input perturbation; in particular, a high plasticity implies a low smoothness. Our theoretical analysis and extensive experiments (over 1,000 finetuning runs on large-scale vision transformers) showcase that this perspective provides principled guidance in choosing the components to prioritize during adaptation. A key takeaway for practitioners is that the high plasticity of the attention modules and feedforward layers consistently leads to better finetuning performance. Our findings depart from the prevailing assumption that smoothness is desirable, offering a novel perspective on transformers' functional properties. The code is available at https://github.com/ambroiseodt/vit-plasticity.
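
To make "average rate of change" concrete, here is a minimal sketch under stated assumptions: plasticity is taken as the mean ratio of output change to input change (both in the L2 norm) over pairs of inputs. The paper's exact definition may differ, and the names `plasticity` and `scale_by_three` are hypothetical.

```python
import math


def l2_norm(vector):
    """Euclidean norm of a list of floats."""
    return math.sqrt(sum(value * value for value in vector))


def plasticity(component, input_pairs):
    """Mean of ||f(x') - f(x)|| / ||x' - x|| over (x, x') pairs (assumed definition)."""
    ratios = []
    for x, x_prime in input_pairs:
        input_change = l2_norm([a - b for a, b in zip(x_prime, x)])
        output_change = l2_norm([a - b for a, b in zip(component(x_prime), component(x))])
        ratios.append(output_change / input_change)
    return sum(ratios) / len(ratios)


def scale_by_three(x):
    """Toy stand-in for a network component: a linear map with gain 3."""
    return [3.0 * value for value in x]


pairs = [([0.0, 0.0], [1.0, 0.0]), ([1.0, 2.0], [1.0, 4.0])]
gain = plasticity(scale_by_three, pairs)
```

Under this definition a component with a larger value reacts more strongly to input perturbations, so it is less smooth, which matches the abstract's remark that high plasticity implies low smoothness.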

2601.20800 2026-06-04 cs.LG cs.AI Updated version

Conditional PED-ANOVA: Hyperparameter Importance in Hierarchical & Dynamic Search Spaces

条件PED-ANOVA:层次与动态搜索空间中的超参数重要性

Kaito Baba, Yoshihiko Ozaki, Shuhei Watanabe

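Affiliations * Preferred Networks, Inc.; The University of Tokyo; SB Intuitions Corp.

AI summary Proposes the conditional PED-ANOVA framework for estimating hyperparameter importance in conditional search spaces; its closed-form estimator accurately reflects conditional activation and domain changes, and experiments show it outperforms naive adaptations of existing estimators.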

Comments 20 pages, 15 figures. Accepted to the 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD 2026)

Details
AI Chinese abstract

我们提出条件PED-ANOVA(condPED-ANOVA),一个用于估计条件搜索空间中超参数重要性(HPI)的原则性框架,其中超参数的存在或域可能依赖于其他超参数。尽管原始PED-ANOVA提供了一种快速有效的方法来估计搜索空间内高性能区域的HPI,但它假设一个固定的、无条件的搜索空间,因此无法正确处理条件超参数。为了解决这个问题,我们引入了针对高性能区域的条件HPI,并推导出一个闭式估计器,能够准确反映条件激活和域变化。实验表明,现有HPI估计器的朴素适应在条件设置下会产生误导性或不可解释的重要性,而condPED-ANOVA始终提供反映底层条件结构的有意义的重要性。我们的代码公开在https://github.com/kAIto47802/condPED-ANOVA。

English abstract

We propose conditional PED-ANOVA (condPED-ANOVA), a principled framework for estimating hyperparameter importance (HPI) in conditional search spaces, where the presence or domain of a hyperparameter can depend on other hyperparameters. Although the original PED-ANOVA provides a fast and efficient way to estimate HPI within the top-performing regions of the search space, it assumes a fixed, unconditional search space and therefore cannot properly handle conditional hyperparameters. To address this, we introduce a conditional HPI for top-performing regions and derive a closed-form estimator that accurately reflects conditional activation and domain changes. Experiments show that naive adaptations of existing HPI estimators yield misleading or uninterpretable importances in conditional settings, whereas condPED-ANOVA consistently provides meaningful importances that reflect the underlying conditional structure. Our code is publicly available at https://github.com/kAIto47802/condPED-ANOVA.

2602.05657 2026-06-04 cs.LG math.OC Updated version

Tight Long-Term Tail Decay of (Clipped) SGD in Non-Convex Optimization

非凸优化中(裁剪)SGD的严格长期尾衰减

Aleksandar Armacki, Dragana Bajović, Dušan Jakovetić, Soummya Kar, Ali H. Sayed

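Affiliations * École Polytechnique Fédérale de Lausanne; University of Novi Sad; Carnegie Mellon University

AI summary Uses large deviations theory to study the long-term tail decay of SGD and clipped SGD in non-convex optimization, giving exponential upper and lower bounds on the tails of the squared gradient norm and showing decay at rate $e^{-t/\log(t)}$, an order of magnitude faster than existing finite-time bounds.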

Comments 34 pages

Details
AI Chinese abstract

由于能够为算法的单次运行提供强保证,对SGD诱导过程的尾部行为的研究引起了广泛兴趣。虽然许多工作提供了高概率保证(量化固定概率阈值下的误差率),但缺乏直接研究失败概率的工作,即量化固定误差阈值下的尾部衰减率。此外,现有结果具有有限时间性质,限制了它们捕捉真实长期尾部衰减的能力,而后者对于现代学习模型(通常训练数百万次迭代)更具信息量。我们的工作通过大偏差理论的视角研究基于SGD的方法的长期尾部衰减,填补了这些空白,在此过程中建立了若干强结果。首先,对于非凸成本和有界噪声,我们给出了(普通)SGD产生的最佳迭代的梯度范数平方的尾部上界,长期衰减率为$e^{-t/\log(t)}$。接着,我们通过考虑在具有有界$p$阶矩($p \in (1,2]$)的重尾噪声下的裁剪SGD(c-SGD)来放宽噪声假设,证明了长期衰减率为$e^{-t^{\beta_p}/\log(t)}$的上界,其中当$p \in (1,2)$时$\beta_p = \frac{4(p-1)}{3p-2}$,当$p=2$时衰减率为$e^{-t/\log^2(t)}$。最后,我们给出了尾部衰减的下界,衰减率为$e^{-t}$,表明我们关于SGD和c-SGD的衰减率在多项式对数因子意义下是紧的。值得注意的是,我们的结果表明,与基于有限时间界的现有工作(分别显示SGD和c-SGD的衰减率为$e^{-\sqrt{t}}$和$e^{-t^{\beta_p/2}}$,$p \in (1,2]$)相比,长期尾部衰减快一个数量级。因此,我们揭示了尾部衰减比先前已知快得多的机制,为单次运行提供了更强的长期保证。

English abstract

The study of tail behaviour of SGD-induced processes has been attracting a lot of interest, due to offering strong guarantees with respect to individual runs of an algorithm. While many works provide high-probability guarantees, quantifying the error rate for a fixed probability threshold, there is a lack of work directly studying the probability of failure, i.e., quantifying the tail decay rate for a fixed error threshold. Moreover, existing results are of finite-time nature, limiting their ability to capture the true long-term tail decay which is more informative for modern learning models, typically trained for millions of iterations. Our work closes these gaps, by studying the long-term tail decay of SGD-based methods through the lens of large deviations theory, establishing several strong results in the process. First, we provide an upper bound on the tails of the gradient norm-squared of the best iterate produced by (vanilla) SGD, for non-convex costs and bounded noise, with long-term decay at rate $e^{-t/\log(t)}$. Next, we relax the noise assumption by considering clipped SGD (c-SGD) under heavy-tailed noise with bounded moment of order $p \in (1,2]$, showing an upper bound with long-term decay at rate $e^{-t^{β_p}/\log(t)}$, where $β_p = \frac{4(p-1)}{3p-2}$ for $p \in (1,2)$ and $e^{-t/\log^2(t)}$ for $p = 2$. Finally, we provide lower bounds on the tail decay, at rate $e^{-t}$, showing that our rates for both SGD and c-SGD are tight, up to poly-logarithmic factors. Notably, our results demonstrate an order of magnitude faster long-term tail decay compared to existing work based on finite-time bounds, which show rates $e^{-\sqrt{t}}$ and $e^{-t^{β_p/2}}$, $p \in (1,2]$, for SGD and c-SGD, respectively. As such, we uncover regimes where the tails decay much faster than previously known, providing stronger long-term guarantees for individual runs.
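
As a quick numeric check of the rates quoted in this abstract, the sketch below evaluates $β_p = \frac{4(p-1)}{3p-2}$ and compares the long-term exponent with the finite-time exponent $β_p/2$. It is an illustration only, with hypothetical function names, and it ignores the logarithmic factors.

```python
def beta(p):
    """Long-term tail exponent for clipped SGD with bounded p-th noise moment.

    Uses beta_p = 4(p - 1) / (3p - 2) from the abstract, stated there for p in (1, 2).
    At p = 2 the formula gives 1, matching the quoted rate exp(-t / log^2 t)
    up to the logarithmic factor.
    """
    if not 1.0 < p <= 2.0:
        raise ValueError("p must lie in (1, 2]")
    return 4.0 * (p - 1.0) / (3.0 * p - 2.0)


def exponent_comparison(p):
    """Return (long-term exponent, finite-time exponent) of t, ignoring log factors."""
    long_term = beta(p)
    finite_time = long_term / 2.0
    return long_term, finite_time


long_term, finite_time = exponent_comparison(1.5)
```

For $p = 1.5$ the tail decays like $e^{-t^{0.8}}$ in the long-term analysis versus $e^{-t^{0.4}}$ under the finite-time bounds, up to logarithmic factors.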

2510.08734 2026-06-04 cs.LG Updated version

Transmuting prompts into weights

将提示转化为权重

Hanna Mazzawi, Benoit Dherin, Michael Munn, Adrian Goldwaser, Michael Wunder, Javier Gonzalvo

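Affiliations * Google Research; Cambridge University

AI summary Proposes an algorithm that converts prompt information into token-independent thought vectors and thought matrices, providing a theoretical explanation for existing vector- and matrix-based model editing techniques and directly transmuting textual input into reusable weight updates.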

Details
AI Chinese abstract

越来越多的研究表明,大型语言模型的行为可以在推理时通过直接修改其内部状态来有效控制,无论是通过向激活添加向量还是更新权重矩阵。这些技术虽然强大,但通常由经验启发式指导,例如从对比提示的平均激活中推导出“引导向量”。基于Dherin等人(2025)的基础工作,他们发现提示的影响在数学上映射为与token相关的隐式权重更新,并引入了用于提示压缩的静态思维补丁的初始概念,我们将这一框架提升为一种用于直接模型编辑的鲁棒算法。我们推导出一种原则性方法,将这种瞬态信息压缩为与token无关的思维向量和思维矩阵。这些构造为现有的基于向量和矩阵的模型编辑技术提供了理论解释,并提供了一种直接、基于计算的方法,将文本输入转化为可用于复杂架构和新知识注入的可复用权重更新。

English abstract

A growing body of research has demonstrated that the behavior of large language models can be effectively controlled at inference time by directly modifying their internal states, either through vector additions to their activations or through updates to their weight matrices. These techniques, while powerful, are often guided by empirical heuristics, such as deriving "steering vectors" from the average activations of contrastive prompts. Building on the foundational work of Dherin et al. (2025), who discovered that a prompt's influence mathematically maps to token-dependent implicit weight updates and introduced the initial concept of a static thought patch for prompt compression, we elevate this framework into a robust algorithm for direct model editing. We derive a principled method for condensing this transient information into token-independent thought vectors and thought matrices. These constructs provide a theoretical explanation for existing vector-and-matrix-based model editing techniques and offer a direct, computationally-grounded method for transmuting textual input into reusable weight updates for complex architectures and new knowledge injection.
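
The empirical heuristic mentioned in this abstract, deriving a steering vector from the average activations of contrastive prompts, can be sketched as below. This shows only that baseline heuristic, not the paper's thought vectors or thought matrices; the activations are hypothetical toy values and all names are assumptions.

```python
def mean_vector(vectors):
    """Element-wise mean of a list of equal-length activation vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def steering_vector(positive_activations, negative_activations):
    """Difference of mean activations between two contrastive prompt sets."""
    positive_mean = mean_vector(positive_activations)
    negative_mean = mean_vector(negative_activations)
    return [p - n for p, n in zip(positive_mean, negative_mean)]


def steer(activation, vector, strength=1.0):
    """Add the scaled steering vector to one activation at inference time."""
    return [a + strength * v for a, v in zip(activation, vector)]


# Hypothetical activations collected at one layer for two prompt sets.
positive = [[1.0, 2.0], [3.0, 4.0]]
negative = [[0.0, 0.0], [2.0, 2.0]]
direction = steering_vector(positive, negative)
steered = steer([0.0, 0.0], direction, strength=2.0)
```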

2602.03542 2026-06-04 cs.CL cs.LG Updated version

Can Large Language Models Generalize Procedures Across Representations?

大型语言模型能否跨表示泛化过程?

Fangru Lin, Valentin Hofmann, Xingchen Wan, Weixing Wang, Zifeng Ding, Anthony G. Cohn, Janet B. Pierrehumbert

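Affiliations * Stanford University

AI summary Studies how well large language models generalize procedures across representations such as code, graphs, and natural language, and proposes a two-stage reinforcement learning curriculum to close the gap.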

Comments Accepted at ICML 2026

Details
AI Chinese abstract

大型语言模型(LLMs)在符号表示(如代码和图)上进行了广泛的训练和测试,然而现实世界的用户任务通常用自然语言指定。LLMs 能在多大程度上跨这些表示进行泛化?在这里,我们通过研究涉及以代码、图和自然语言表示的过程(例如,规划中的调度步骤)的同构任务来探讨这个问题。我们发现,仅在图或代码数据上使用流行的后训练方法训练 LLMs 并不能可靠地泛化到相应的自然语言任务,而仅用自然语言训练可能导致效率低下的性能提升。为了解决这一差距,我们提出了一种两阶段强化学习课程,首先在符号数据上训练,然后在自然语言数据上训练。该课程显著提高了跨模型家族和任务的模型性能。值得注意的是,通过我们的方法训练的 1.5B Qwen 模型在自然规划中几乎可以匹配零样本 GPT-4o。最后,我们的分析表明,成功的跨表示泛化可以解释为一种生成性类比的形式,而我们的课程有效地鼓励了这种类比。本文使用的数据集和代码可在此处找到。

English abstract

Large language models (LLMs) are trained and tested extensively on symbolic representations such as code and graphs, yet real-world user tasks are often specified in natural language. To what extent can LLMs generalize across these representations? Here, we approach this question by studying isomorphic tasks involving procedures represented in code, graphs, and natural language (e.g., scheduling steps in planning). We find that training LLMs with popular post-training methods on graphs or code data alone does not reliably generalize to corresponding natural language tasks, while training solely on natural language can lead to inefficient performance gains. To address this gap, we propose a two-stage reinforcement learning curriculum that first trains on symbolic, then natural language data. The curriculum substantially improves model performance across model families and tasks. Remarkably, a 1.5B Qwen model trained by our method can closely match zero-shot GPT-4o in naturalistic planning. Finally, our analysis suggests that successful cross-representation generalization can be interpreted as a form of generative analogy, which our curriculum effectively encourages. The dataset and code used in this paper can be found at https://github.com/fangru-lin/procedure_generalization_llm.

2602.02405 2026-06-04 cs.LG cs.AI Updated version

Making Expert Reasoning Learnable with Self-Distillation

通过自蒸馏使专家推理可学习

Ethan Mendes, Jungsoo Park, Alan Ritter

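Affiliations * Georgia Institute of Technology, Atlanta, Georgia

AI summary Proposes Distribution Aligned Imitation Learning (DAIL), a two-step self-distillation method that bridges the gap between expert solutions and the model's own distribution, using a small number of high-quality expert solutions to substantially improve the reasoning of large language models.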

Comments ICML 2026

Details
AI Chinese abstract

提升大语言模型(LLM)的推理能力通常依赖于模型采样正确解以进行强化,或存在更强模型来解决问题。然而,许多难题即使对当前前沿模型也难以处理,阻碍了有效训练信号的提取。一个有前景的替代方案是利用高质量的人类专家解决方案,但直接模仿这些数据从根本上存在分布外问题:专家解决方案通常具有教学性质,包含为人类读者而非计算模型设计的隐含推理间隙。此外,高质量专家解决方案成本高昂,需要可泛化且样本高效的训练方法。我们提出分布对齐模仿学习(DAIL),一种两步自蒸馏方法,通过首先将专家解决方案转化为详细的、分布内的推理轨迹,然后应用对比目标使学习聚焦于专家见解和方法,从而弥合分布差距。我们发现,DAIL可以利用少于1000个高质量专家解决方案,在Qwen2.5-Instruct和Qwen3上实现高达31%的pass@128增益,推理效率翻倍,并实现域外泛化。

English abstract

Improving the reasoning capabilities of large language models (LLMs) typically relies either on the model's ability to sample a correct solution to be reinforced or the existence of a stronger model able to solve the problem. However, many difficult problems remain intractable for even current frontier models, preventing the extraction of valid training signals. A promising alternative is to leverage high-quality expert human solutions, yet naive imitation of this data fails because it is fundamentally out-of-distribution: expert solutions are typically didactic, containing implicit reasoning gaps intended for human readers rather than computational models. Furthermore, high-quality expert solutions are expensive, necessitating generalizable, sample-efficient training methods. We propose Distribution Aligned Imitation Learning (DAIL), a two-step self-distillation method that bridges the distributional gap by first transforming expert solutions into detailed, in-distribution reasoning traces and then applying a contrastive objective to focus learning on expert insights and methodologies. We find that DAIL can leverage fewer than 1000 high-quality expert solutions to achieve up to 31% pass@128 gains on Qwen2.5-Instruct and Qwen3, double reasoning efficiency, and enable out-of-domain generalization.

2602.01658 2026-06-04 cs.LG cs.AI Updated version

Efficient Adversarial Attacks on High-dimensional Offline Bandits

高维离线Bandits的高效对抗攻击

Seyed Mohammad Hadi Hosseini, Amir Najafi, Mahdieh Soleymani Baghshah

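Affiliations * Department of Computer Engineering, Sharif University of Technology

AI summary Studies the vulnerability of offline bandit training when the reward model is adversarially perturbed, introduces a high-dimensional threat model, proves that the perturbation norm required for an attack decreases as dimensionality increases, and experimentally confirms high success rates for targeted attacks.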

Comments Published at ICLR 2026 Conference

Details
AI Chinese abstract

Bandit算法最近成为评估机器学习模型(包括生成图像模型和大语言模型)的强大工具,通过高效识别表现最佳的候选者而无需详尽比较。这些方法通常依赖于奖励模型(常在Hugging Face等平台上以公共权重发布)向bandit提供反馈。在线评估昂贵且需要重复试验,而使用记录数据的离线评估已成为有吸引力的替代方案。然而,离线bandit评估的对抗鲁棒性在很大程度上尚未被探索,特别是当攻击者在bandit训练之前扰动奖励模型(而非训练数据)时。在这项工作中,我们通过理论和实证研究离线bandit训练对奖励模型对抗操纵的脆弱性来填补这一空白。我们引入了一种新颖的威胁模型,其中攻击者利用高维环境中的离线数据劫持bandit的行为。从线性奖励函数开始,扩展到非线性模型如ReLU神经网络,我们研究了用于生成模型评估的两个Hugging Face评估器上的攻击:一个测量美学质量,另一个评估组合对齐。我们的结果表明,即使对奖励模型权重进行微小、不可察觉的扰动,也能显著改变bandit的行为。从理论角度来看,我们证明了一个显著的高维效应:随着输入维度的增加,成功攻击所需的扰动范数减小,使得现代应用如图像评估尤其脆弱。大量实验证实,简单的随机扰动无效,而精心设计的针对性攻击实现了近乎完美的攻击成功率。

English abstract

Bandit algorithms have recently emerged as a powerful tool for evaluating machine learning models, including generative image models and large language models, by efficiently identifying top-performing candidates without exhaustive comparisons. These methods typically rely on a reward model, often distributed with public weights on platforms such as Hugging Face, to provide feedback to the bandit. While online evaluation is expensive and requires repeated trials, offline evaluation with logged data has become an attractive alternative. However, the adversarial robustness of offline bandit evaluation remains largely unexplored, particularly when an attacker perturbs the reward model (rather than the training data) prior to bandit training. In this work, we fill this gap by investigating, both theoretically and empirically, the vulnerability of offline bandit training to adversarial manipulations of the reward model. We introduce a novel threat model in which an attacker exploits offline data in high-dimensional settings to hijack the bandit's behavior. Starting with linear reward functions and extending to nonlinear models such as ReLU neural networks, we study attacks on two Hugging Face evaluators used for generative model assessment: one measuring aesthetic quality and the other assessing compositional alignment. Our results show that even small, imperceptible perturbations to the reward model's weights can drastically alter the bandit's behavior. From a theoretical perspective, we prove a striking high-dimensional effect: as input dimensionality increases, the perturbation norm required for a successful attack decreases, making modern applications such as image evaluation especially vulnerable. Extensive experiments confirm that naive random perturbations are ineffective, whereas carefully targeted perturbations achieve near-perfect attack success rates ...

2602.01619 2026-06-04 cs.LG cs.AI Updated version

SUSD: Structured Unsupervised Skill Discovery through State Factorization

SUSD: 通过状态分解的结构化无监督技能发现

Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah

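Affiliations * Department of Computer Engineering

AI summary Proposes the SUSD framework, which factorizes the state space into independent components, assigns distinct skill variables to them, and uses a dynamics model to adaptively guide exploration, discovering richer and more diverse skills without supervision and significantly outperforming existing methods in factorized environments.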

Comments Published as a conference paper at ICLR 2026

Details
AI Chinese abstract

无监督技能发现(USD)旨在无需外部奖励的情况下自主学习多样化的技能。最常见的USD方法之一是最大化技能潜在变量与状态之间的互信息(MI)。然而,基于MI的方法由于其不变性特性,倾向于偏好简单、静态的技能,限制了动态、任务相关行为的发现。距离最大化技能发现(DSD)通过利用状态空间距离促进更动态的技能,但仍未能鼓励涵盖环境中所有可控因素或实体的全面技能集。在这项工作中,我们引入了SUSD,一种新颖的框架,通过将状态空间分解为独立组件(例如,物体或可控实体)来利用环境的组合结构。SUSD将不同的技能变量分配给不同的因素,从而实现对技能发现过程的更细粒度控制。一个动态模型还跟踪各因素的学习情况,自适应地将智能体的注意力引导至未充分探索的因素。这种结构化方法不仅促进了更丰富、更多样化技能的发现,还产生了一种分解的技能表示,能够对单个实体进行细粒度且解耦的控制,从而通过分层强化学习(HRL)促进组合下游任务的高效训练。我们在三个环境中的实验结果(因素数量从1到10)表明,我们的方法能够在无监督的情况下发现多样且复杂的技能,在分解和复杂环境中显著优于现有的无监督技能发现方法。代码公开于:https://github.com/hadi-hosseini/SUSD。

English abstract

Unsupervised Skill Discovery (USD) aims to autonomously learn a diverse set of skills without relying on extrinsic rewards. One of the most common USD approaches is to maximize the Mutual Information (MI) between skill latent variables and states. However, MI-based methods tend to favor simple, static skills due to their invariance properties, limiting the discovery of dynamic, task-relevant behaviors. Distance-Maximizing Skill Discovery (DSD) promotes more dynamic skills by leveraging state-space distances, yet it still falls short in encouraging comprehensive skill sets that engage all controllable factors or entities in the environment. In this work, we introduce SUSD, a novel framework that harnesses the compositional structure of environments by factorizing the state space into independent components (e.g., objects or controllable entities). SUSD allocates distinct skill variables to different factors, enabling more fine-grained control over the skill discovery process. A dynamic model also tracks learning across factors, adaptively steering the agent's focus toward underexplored factors. This structured approach not only promotes the discovery of richer and more diverse skills, but also yields a factorized skill representation that enables fine-grained and disentangled control over individual entities, which facilitates efficient training of compositional downstream tasks via Hierarchical Reinforcement Learning (HRL). Our experimental results across three environments, with factors ranging from 1 to 10, demonstrate that our method can discover diverse and complex skills without supervision, significantly outperforming existing unsupervised skill discovery methods in factorized and complex environments. Code is publicly available at: https://github.com/hadi-hosseini/SUSD.

2601.15158 2026-06-04 cs.LG cs.AI Updated version

Outcome-Based RL Provably Leads Transformers to Reason, but Only With the Right Data

基于结果的强化学习可证明地引导Transformer进行推理,但仅在合适的数据条件下

Yuval Ran-Milo, Yotam Alexander, Shahar Mendel, Nadav Cohen

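Affiliations * Tel Aviv University

AI summary By analyzing the policy gradient dynamics of single-layer Transformers on a synthetic graph traversal task, proves that outcome-based reinforcement learning leads Transformers to spontaneously learn a structured iterative reasoning algorithm, and shows that the distribution of "simple examples" in the training data is critical for reasoning to emerge.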

Comments 94 pages, 7 figures

Details
AI Chinese abstract

通过基于结果的监督进行强化学习训练的Transformer可以自发地生成中间推理步骤(思维链)。然而,稀疏奖励驱动策略梯度发现这种系统性推理的机制仍然知之甚少。我们通过分析单层Transformer在合成图遍历任务上的策略梯度动力学来解决这个问题,该任务没有思维链就无法解决,但允许简单的迭代解决方案。我们证明,尽管仅对最终答案的正确性进行训练,策略梯度仍驱动Transformer收敛到一个结构化的、可解释的算法,该算法逐顶点迭代遍历图。我们刻画了这种涌现所需的分布特性,识别出“简单示例”(即需要较少推理步骤的实例)的关键作用。当训练分布在这些更简单的示例上放置足够的质量时,Transformer学习到一种可泛化的遍历策略,能够外推到更长的链;当这种质量消失时,策略梯度学习变得不可行。我们通过在合成数据上的实验以及在数学推理任务中使用真实世界语言模型的实验来证实我们的理论结果,验证了我们的理论发现可以推广到实际场景。

English abstract

Transformers trained via Reinforcement Learning (RL) with outcome-based supervision can spontaneously develop the ability to generate intermediate reasoning steps (Chain-of-Thought). Yet the mechanism by which sparse rewards drive policy gradient to discover such systematic reasoning remains poorly understood. We address this by analyzing the policy gradient dynamics of single-layer Transformers on a synthetic graph traversal task that cannot be solved without Chain-of-Thought but admits a simple iterative solution. We prove that despite training solely on final-answer correctness, policy gradient drives the Transformer to converge to a structured, interpretable algorithm that iteratively traverses the graph vertex-by-vertex. We characterize the distributional properties required for this emergence, identifying the critical role of "simple examples": instances requiring fewer reasoning steps. When the training distribution places sufficient mass on these simpler examples, the Transformer learns a generalizable traversal strategy that extrapolates to longer chains; when this mass vanishes, policy gradient learning becomes infeasible. We corroborate our theoretical results through experiments on synthetic data and with real-world language models on mathematical reasoning tasks, validating that our theoretical findings carry over to practical settings.

2512.21917 2026-06-04 cs.LG cs.AI econ.EM stat.ML Updated version

Semiparametric Preference Optimization: Your Language Model is Secretly a Single-Index Model

半参数偏好优化:你的语言模型秘密地是一个单索引模型

Nathan Kallus

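Affiliations * Netflix & Cornell University

AI summary Proposes semiparametric preference optimization, which drops the assumed link function between preferences and latent rewards and aligns policies under an unknown, unrestricted link; shows that realizability in the policy class induces a semiparametric single-index binary choice model, learns policies directly, and gives link-agnostic convergence guarantees.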

Details
AI Chinese abstract

策略对齐到偏好数据通常假设观察到的偏好与潜在奖励之间存在已知的链接函数(例如,Bradley-Terry模型/逻辑链接)。这种链接的错误设定可能会使推断的奖励产生偏差,并使学习到的策略偏离对齐。我们研究了在未知且无限制的链接函数下的策略对齐。我们提出了一个$f$-散度约束的奖励最大化问题,并表明策略类中的可实现性诱导出一个半参数单索引二元选择模型,其中标量策略诱导的索引捕获了所有对示范的依赖,而剩余的偏好分布是无限制的。与计量经济学中要求识别此类模型的结构参数并进行估计不同,我们开发了直接学习策略的方法,其中奖励函数是隐式的,分析了与最优策略的误差,并允许不可识别和非参数的索引。我们证明了基于通用函数复杂度度量的链接无关收敛保证,并通过实验验证了方法和理论。代码可在 https://github.com/causalml/spo/ 获取。

English abstract

Policy alignment to preference data typically assumes a known link function between observed preferences and latent rewards (e.g., Bradley-Terry model / logistic link). Misspecification of this link can bias inferred rewards and misalign learned policies. We study policy alignment under an unknown and unrestricted link function. We formulate an $f$-divergence-constrained reward maximization problem and show that realizability in a policy class induces a semiparametric single-index binary choice model, where a scalar policy-induced index captures all dependence on demonstrations and the remaining preference distribution is unrestricted. Rather than impose identifiability of structural parameters of such a model and estimate them, as in econometrics, we develop methods that directly learn policies, with the reward function implicit, analyzing error to the optimal policy and allowing for unidentifiable and nonparametric indices. We prove link-agnostic convergence guarantees in terms of generic function complexity measures and validate the methods and theory empirically. Code is available at https://github.com/causalml/spo/.
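
To illustrate what a link function does here, the sketch below maps a latent reward gap to a preference probability under the logistic (Bradley-Terry) link named in the abstract and under a probit link, which is an alternative chosen only for comparison and is not taken from the paper. Function names are hypothetical.

```python
import math


def logistic_link(reward_gap):
    """Bradley-Terry / logistic link: sigmoid of the reward gap."""
    return 1.0 / (1.0 + math.exp(-reward_gap))


def probit_link(reward_gap):
    """Probit link: standard normal CDF of the reward gap (illustrative alternative)."""
    return 0.5 * (1.0 + math.erf(reward_gap / math.sqrt(2.0)))


def preference_probability(reward_a, reward_b, link=logistic_link):
    """Probability that response a is preferred to response b under a given link."""
    return link(reward_a - reward_b)


p_logistic = preference_probability(1.0, 0.0)
p_probit = preference_probability(1.0, 0.0, link=probit_link)
```

The same reward gap of 1.0 yields different preference probabilities under the two links, so fitting rewards with the wrong link distorts them; this is the misspecification risk that motivates leaving the link unrestricted.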

2506.06178 2026-06-04 cs.LG Updated version

Reusing Trajectories in Policy Gradients Enables Fast Convergence

在策略梯度中重用轨迹实现快速收敛

Alessandro Montenegro, Federico Mansutti, Marco Mussi, Matteo Papini, Alberto Maria Metelli

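Affiliations * University of Bologna

AI summary Proposes the RT-PG algorithm, which reuses past trajectories through a power mean-corrected multiple importance weighting estimator, reducing the sample complexity of policy gradient to $\tilde{O}(\epsilon^{-2}\omega^{-1})$, and to $\tilde{O}(\epsilon^{-1})$ when all past trajectories are reused, the best known rate.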

Details
AI Chinese abstract

策略梯度(PG)方法是一类有效的强化学习算法,特别是在处理连续控制问题时。它们依赖于新鲜的在线策略数据,导致样本效率低下,需要$O(\epsilon^{-2})$条轨迹才能达到$\epsilon$近似平稳点。提高效率的一种常见策略是重用过去迭代的信息,例如之前的梯度或轨迹,从而产生离策略PG方法。虽然梯度重用已受到广泛关注,并将速率提高到$O(\epsilon^{-3/2})$,但过去轨迹的重用虽然直观,在理论上仍基本未被探索。在这项工作中,我们提供了第一个严格的理论证据,表明重用过去的离策略轨迹可以显著加速PG收敛。我们提出了RT-PG(重用轨迹-策略梯度),一种新颖的算法,它利用幂均值校正的多重重要性加权估计器,有效地结合来自最近$\omega$次迭代的在线策略和离策略数据。通过新颖的分析,我们证明RT-PG实现了$\tilde{O}(\epsilon^{-2}\omega^{-1})$的样本复杂度。当重用所有可用的过去轨迹时,这导致$\tilde{O}(\epsilon^{-1})$的速率,这是文献中PG方法已知的最佳速率。我们进一步通过实验验证了我们的方法,证明了其相对于具有最先进速率的基线的有效性。

English abstract

Policy gradient (PG) methods are a class of effective reinforcement learning algorithms, particularly when dealing with continuous control problems. They rely on fresh on-policy data, making them sample-inefficient and requiring $O(ε^{-2})$ trajectories to reach an $ε$-approximate stationary point. A common strategy to improve efficiency is to reuse information from past iterations, such as previous gradients or trajectories, leading to off-policy PG methods. While gradient reuse has received substantial attention, leading to improved rates up to $O(ε^{-3/2})$, the reuse of past trajectories, although intuitive, remains largely unexplored from a theoretical perspective. In this work, we provide the first rigorous theoretical evidence that reusing past off-policy trajectories can significantly accelerate PG convergence. We propose RT-PG (Reusing Trajectories - Policy Gradient), a novel algorithm that leverages a power mean-corrected multiple importance weighting estimator to effectively combine on-policy and off-policy data coming from the most recent $ω$ iterations. Through a novel analysis, we prove that RT-PG achieves a sample complexity of $\tilde{O}(ε^{-2}ω^{-1})$. When reusing all available past trajectories, this leads to a rate of $\tilde{O}(ε^{-1})$, the best known one in the literature for PG methods. We further validate our approach empirically, demonstrating its effectiveness against baselines with state-of-the-art rates.

2602.01083 2026-06-04 cs.LG Updated version

On the Expressive Power of Permutation-Equivariant Weight-Space Networks

关于置换等变权重空间网络的表达能力

Adir Dayan, Yam Eitan, Haggai Maron

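Affiliations * Technion - Israel Institute of Technology; NVIDIA Research

AI summary Systematically studies the expressive power of permutation-equivariant weight-space networks, proves that the prominent networks are equivalent in expressive power, establishes universality in weight space and function space under mild assumptions, and guides model modifications that yield a 34% improvement.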

Comments Accepted as a spotlight paper at ICML 2026

详情
AI中文摘要

权重空间学习研究直接对其他神经网络的参数进行操作的神经架构。受预训练模型日益普及的推动,最近的工作展示了权重空间网络在广泛任务中的有效性。SOTA权重空间网络依赖置换等变设计来提高泛化能力。然而,这可能会对表达能力产生负面影响,需要进行理论研究。重要的是,与其他结构化领域不同,权重空间学习的目标是对权重空间和函数空间都进行操作的映射,这使得表达能力分析尤为微妙。虽然一些先前的工作提供了部分表达能力结果,但全面的刻画仍然缺失。在这项工作中,我们通过为权重空间网络的表达能力开发系统理论来填补这一空白。我们首先证明所有主流的置换等变网络在表达能力上是等价的。然后,我们在输入权重的温和自然假设下,建立了权重空间和函数空间设置中的普适性,并刻画了普适性不再成立的边缘情况。在我们的理论结果指导下,我们表明对现有权重空间模型的轻微修改相比先前SOTA实现了34%的提升,展示了我们框架的实际相关性。

English abstract

Weight-space learning studies neural architectures that operate directly on the parameters of other neural networks. Motivated by the growing availability of pretrained models, recent work has demonstrated the effectiveness of weight-space networks across a wide range of tasks. SOTA weight-space networks rely on permutation-equivariant designs to improve generalization. However, this may negatively affect expressive power, warranting theoretical investigation. Importantly, unlike other structured domains, weight-space learning targets maps operating on both weight and function spaces, making expressivity analysis particularly subtle. While a few prior works provide partial expressivity results, a comprehensive characterization is still missing. In this work, we address this gap by developing a systematic theory for expressivity of weight-space networks. We first prove that all prominent permutation-equivariant networks are equivalent in expressive power. We then establish universality in both weight- and function-space settings under mild, natural assumptions on the input weights, and characterize the edge-case regimes where universality no longer holds. Guided by our theoretical results, we show that slight modifications to existing weight-space models yield a 34% improvement over prior SOTA, demonstrating the practical relevance of our framework.
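
The symmetry behind permutation-equivariant designs can be checked directly: reordering the hidden units of an MLP, together with the matching rows and columns of its weights, leaves the function it computes unchanged. The sketch below verifies this on a toy two-layer ReLU network with hypothetical weights; it illustrates the symmetry only and is not one of the weight-space architectures studied in the paper.

```python
def relu(vector):
    return [max(0.0, value) for value in vector]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def mlp(w1, b1, w2, b2, x):
    """Two-layer MLP: y = W2 relu(W1 x + b1) + b2."""
    hidden = relu([z + b for z, b in zip(matvec(w1, x), b1)])
    return [z + b for z, b in zip(matvec(w2, hidden), b2)]


def permute_hidden(w1, b1, w2, perm):
    """Reorder hidden units: rows of W1, entries of b1, and columns of W2."""
    w1_perm = [w1[i] for i in perm]
    b1_perm = [b1[i] for i in perm]
    w2_perm = [[row[i] for i in perm] for row in w2]
    return w1_perm, b1_perm, w2_perm


# Hypothetical weights: 2 inputs, 3 hidden units, 1 output.
w1 = [[1.0, -1.0], [0.5, 2.0], [-1.0, 0.0]]
b1 = [0.0, 0.5, 1.0]
w2 = [[1.0, 2.0, -1.0]]
b2 = [0.25]
x = [1.0, 2.0]

original_output = mlp(w1, b1, w2, b2, x)
w1_p, b1_p, w2_p = permute_hidden(w1, b1, w2, perm=[2, 0, 1])
permuted_output = mlp(w1_p, b1_p, w2_p, b2, x)
```

A weight-space network that respects this symmetry treats the two weight settings consistently, since they represent the same function.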

2602.01027 2026-06-04 cs.LG Updated version

SFMP: Fine-Grained, Hardware-Friendly and Search-Free Mixed-Precision Quantization for Large Language Models

SFMP:面向大语言模型的细粒度、硬件友好且免搜索的混合精度量化

Xin Nie, Haicheng Zhang, Liang Dong, Beining Feng, Jinhong Weng, Guiling Sun

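Affiliations * College of Electronic Information and Optical Engineering, Nankai University

AI summary Proposes the SFMP framework, which uses fractional bit-widths, block-wise mixed precision, row-column reordering, and a unified GEMM kernel to achieve search-free, hardware-friendly mixed-precision quantization that outperforms existing methods under the same memory constraints.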

Comments 30 pages, 17 figures

Details
AI Chinese abstract

混合精度量化是在严格内存预算下压缩大型语言模型的一种有前景的方法。然而,现有的混合精度方法通常存在两个限制之一:它们要么依赖昂贵的离散优化来确定精度分配,要么由于不规则的内存布局而导致硬件效率低下。我们提出了SFMP,一个用于大型语言模型的免搜索且硬件友好的混合精度量化框架。该框架基于四个新颖的想法:分数位宽,将权重矩阵的整数位宽扩展为分数值,并将离散精度分配转化为连续问题;2)块级混合精度,在权重矩阵内实现细粒度精度,同时保持硬件友好性;3)行列权重重排序,通过行和列重排序聚合显著权重,在推理过程中仅引入少量激活重排序开销;4)统一GEMM内核,支持任意平均位宽的混合精度GEMM。大量实验表明,在相同内存约束下,SFMP优于最先进的逐层混合精度方法,同时显著降低量化成本并提高推理效率。代码可在https://github.com/Nkniexin/SFMP获取。

English abstract

Mixed-precision quantization is a promising approach for compressing large language models under tight memory budgets. However, existing mixed-precision methods typically suffer from one of two limitations: they either rely on expensive discrete optimization to determine precision allocation, or introduce hardware inefficiencies due to irregular memory layouts. We propose SFMP, a search-free and hardware-friendly mixed-precision quantization framework for large language models. The framework is built upon four novel ideas: 1) Fractional bit-width, which extends the integer bit-width of a weight matrix to fractional values and transforms discrete precision allocation into a continuous problem; 2) Block-wise mixed-precision, enabling fine-grained precision within weight matrices while remaining hardware-friendly; 3) Row-column weight reordering, which aggregates salient weights via row and column reordering, incurring only a small activation reordering overhead during inference; 4) Unified GEMM kernel, which supports mixed-precision GEMM at arbitrary average bit-width. Extensive experiments demonstrate that SFMP outperforms state-of-the-art layer-wise mixed-precision methods under the same memory constraints, while significantly reducing quantization cost and improving inference efficiency. Code is available at https://github.com/Nkniexin/SFMP
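
One way to read the fractional bit-width and block-wise ideas together is sketched below. It is a minimal illustration under an assumption that is not confirmed by the abstract: a fractional average such as 2.5 bits is realized by giving the most salient blocks one extra bit and the rest the rounded-down width. The saliency scores and the function name are hypothetical.

```python
import math


def allocate_block_bits(block_saliency, average_bits):
    """Assign floor(b) or floor(b) + 1 bits per block so the mean is about b.

    Assumed scheme: the most salient blocks receive the extra bit.
    """
    low_bits = math.floor(average_bits)
    fraction_high = average_bits - low_bits
    num_high = round(fraction_high * len(block_saliency))
    # Block indices sorted from most to least salient.
    ranked = sorted(range(len(block_saliency)), key=lambda i: block_saliency[i], reverse=True)
    bits = [low_bits] * len(block_saliency)
    for index in ranked[:num_high]:
        bits[index] = low_bits + 1
    return bits


saliency = [0.1, 0.9, 0.3, 0.7]  # hypothetical per-block importance scores
bits_per_block = allocate_block_bits(saliency, average_bits=2.5)
mean_bits = sum(bits_per_block) / len(bits_per_block)
```

Because the only free quantity is the fraction of blocks that get the higher width, choosing a precision budget becomes a continuous decision rather than a combinatorial search over per-layer integer widths.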

2601.21461 2026-06-04 cs.LG cs.AI 版本更新

L$^3$: Large Lookup Layers

L$^3$:大型查找层

Albert Tseng, Christopher De Sa

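Affiliations: Department of Computer Science, Cornell University

AI summary: Proposes the Large Lookup Layer (L$^3$), which achieves sparsity by aggregating per-token embeddings through static token-based routing, and outperforms dense models and iso-sparse MoEs in language modeling and downstream tasks.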

Comments ICML 2026

详情
AI中文摘要

现代稀疏语言模型通常通过混合专家(MoE)层实现稀疏性,该层动态地将token路由到稠密MLP“专家”。然而,动态硬路由存在一些缺点,例如潜在的硬件效率低下以及需要辅助损失来稳定训练。相比之下,分词器嵌入表本质上是稀疏的,通过为每个token选择单个嵌入来避免这些问题,但代价是没有上下文信息。在这项工作中,我们引入了大型查找层(L$^3$),它将嵌入表推广到模型解码器层,作为进一步扩展稀疏性的一种手段。L$^3$层使用基于token的静态路由,以上下文相关的方式聚合每个token的一组学习嵌入,允许模型通过将信息缓存在嵌入中有效地平衡内存和计算。L$^3$有两个主要组成部分:(1)一个系统友好的架构,允许快速训练和CPU卸载推理,且没有开销;(2)一种信息论嵌入分配算法,有效平衡速度和质量。我们通过训练具有多达2.6B活动参数的transformer来实证测试L$^3$,发现L$^3$在语言建模和下游任务中均显著优于稠密模型和等稀疏MoE。

英文摘要

Modern sparse language models typically achieve sparsity through Mixture-of-Experts (MoE) layers, which dynamically route tokens to dense MLP "experts." However, dynamic hard routing has a number of drawbacks, such as potentially poor hardware efficiency and needing auxiliary losses for stable training. In contrast, the tokenizer embedding table, which is natively sparse, largely avoids these issues by selecting a single embedding per token at the cost of not having contextual information. In this work, we introduce the Large Lookup Layer (L$^3$), which generalizes embedding tables to model decoder layers as a means of further scaling sparsity. L$^3$ layers use static token-based routing to aggregate a set of learned embeddings per token in a context-dependent way, allowing the model to efficiently balance memory and compute by caching information in embeddings. L$^3$ has two main components: (1) a systems-friendly architecture that allows for fast training and CPU-offloaded inference with no overhead, and (2) an information-theoretic embedding allocation algorithm that effectively balances speed and quality. We empirically test L$^3$ by training transformers with up to 2.6B active parameters and find that L$^3$ strongly outperforms both dense models and iso-sparse MoEs in both language modeling and downstream tasks.

2601.07144 2026-06-04 stat.ML cs.LG math.ST stat.TH 版本更新

Optimal Transport under Group Fairness Constraints

群体公平约束下的最优运输

Linus Bleistein, Mathieu Dagréou, Francisco Andrade, Thomas Boudou, Aurélien Bellet

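Affiliations: University of California, Berkeley

AI summary: Addresses group fairness in optimal transport. A modified Sinkhorn algorithm achieves perfect fairness, and two relaxation strategies (penalized OT, and learning a cost via bilevel optimization) balance fairness against matching quality, with theoretical guarantees and empirical results.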

Comments Accepted at ICML 2026 (spotlight)

详情
AI中文摘要

确保匹配算法中的公平性是分配稀缺资源和职位的关键挑战。聚焦于最优运输(OT),我们引入了一种新的群体公平概念,要求OT计划中任意两个给定群体的个体匹配概率满足预定义目标。我们首先提出一种修正的Sinkhorn算法来高效计算完全公平的运输计划。由于实际中精确公平会显著降低匹配质量,我们随后开发了两种松弛策略。第一种涉及求解一个惩罚OT问题,我们为其推导了新的有限样本复杂度保证。第二种策略利用双层优化学习一个诱导公平OT解的基础成本,并建立了匹配未见数据时公平性偏差的界。最后,我们展示了实证结果,说明了我们方法的性能以及公平性与运输成本之间的权衡。

英文摘要

Ensuring fairness in matching algorithms is a key challenge in allocating scarce resources and positions. Focusing on Optimal Transport (OT), we introduce a novel notion of group fairness requiring that the probability of matching two individuals from any two given groups in the OT plan satisfies a predefined target. We first propose a modified Sinkhorn algorithm to compute perfectly fair transport plans efficiently. Since exact fairness can significantly degrade matching quality in practice, we then develop two relaxation strategies. The first one involves solving a penalized OT problem, for which we derive novel finite-sample complexity guarantees. Our second strategy leverages bilevel optimization to learn a ground cost that induces a fair OT solution, and we establish a bound on the deviation of fairness when matching unseen data. Finally, we present empirical results illustrating the performance of our approaches and the trade-off between fairness and transport cost.
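
For readers unfamiliar with the baseline that the abstract modifies, here is a minimal sketch of the standard entropic-OT Sinkhorn iteration. It is not the paper's fairness-constrained variant; the function name `sinkhorn`, the regularization value `eps`, and the toy cost matrix are all illustrative assumptions.

```python
import math

def sinkhorn(cost, a, b, eps=0.5, n_iters=200):
    """Standard entropic-OT Sinkhorn (NOT the fair variant from the abstract)."""
    n, m = len(a), len(b)
    # Gibbs kernel K_ij = exp(-C_ij / eps)
    K = [[math.exp(-cost[i][j] / eps) for j in range(m)] for i in range(n)]
    u = [1.0] * n
    v = [1.0] * m
    for _ in range(n_iters):
        # Alternately rescale rows and columns to match the marginals a and b.
        u = [a[i] / sum(K[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(K[i][j] * u[i] for i in range(n)) for j in range(m)]
    # Transport plan P_ij = u_i * K_ij * v_j
    return [[u[i] * K[i][j] * v[j] for j in range(m)] for i in range(n)]

# Toy problem (assumed values): 3 sources, 2 targets.
cost = [[0.0, 1.0], [1.0, 0.0], [0.5, 0.5]]
a = [0.5, 0.3, 0.2]
b = [0.6, 0.4]
plan = sinkhorn(cost, a, b)
```

A fairness-aware version would presumably add constraints on the mass exchanged between groups; how the paper does so is not stated in the abstract, so it is not sketched here.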

2601.22601 2026-06-04 cs.LG 版本更新

Lethe: Principled Dual-Stream Update for Persistent Knowledge Erasure in Federated Unlearning

Lethe: 用于联邦遗忘中持久知识擦除的原则性双流更新

Wentai Wu, Hanwei Tan, Yijun Quan, Haixia Peng, Ligang He, Bin Yang, C. L. Philip Chen

Affiliations: Department of Computer Science, College of Information Science and Technology, Jinan University; WMG, University of Warwick; School of Information and Communications Engineering, Xi’an Jiaotong University; Department of Computer Science, University of Warwick; School of Data Science and Engineering, East China Normal University; School of Computer Science and Engineering, South China University of Technology

AI summary: Targets the resurfacing of unlearned knowledge when training continues after federated unlearning. Proposes Lethe, which achieves persistent knowledge erasure through anti-aligned updates of a forget stream and a retain stream.

详情
AI中文摘要

联邦遗忘(FU)旨在从全局模型中擦除知识。现有研究通常假设遗忘后联邦协作终止,忽略了在删除请求完成后剩余客户端继续训练的实际部署场景。在这项工作中,我们识别出一个关键失败模式,称为知识重新浮现,揭示了仅对保留数据进行持续训练可以在几轮内重新激活已遗忘的知识。实验表明,许多最先进的FU方法容易发生知识重新浮现。我们随后提出Lethe,一种用于联邦设置中持久知识擦除的新型遗忘方法。在每次迭代中,Lethe操作来自遗忘客户端的遗忘流和来自保留客户端的保留流。它将遗忘更新重定向到两个流反对齐的区域,阻止保留数据训练移回遗忘知识。因此,Lethe在后续联邦训练期间确保更强的遗忘持久性。跨不同模型、数据集和遗忘级别的广泛实验验证了Lethe以统一方式支持CV和NLP任务中的所有遗忘级别,即使在极长后续训练时间后,大多数情况下也持续显示出低于1%的低重新浮现率。

英文摘要

Federated unlearning (FU) aims to erase knowledge from a global model. Existing studies commonly assume that federated collaboration terminates after unlearning, overlooking a deployment-realistic scenario where training continues on the remaining clients after deletion requests are fulfilled. In this work, we identify a critical failure mode, termed knowledge resurfacing, revealing that continued training on retained data alone can reactivate unlearned knowledge in a few rounds. Empirically, we demonstrate that many state-of-the-art FU methods are prone to knowledge resurfacing. We then propose Lethe, a novel unlearning method for persistent knowledge erasure in federated settings. In each iteration, Lethe operates on a forget stream from the unlearning client and a retain stream from the retained clients. It redirects unlearning updates toward a region where the two streams are anti-aligned, discouraging retained-data training from moving back toward the forgotten knowledge. Consequently, Lethe ensures stronger unlearning persistence during subsequent federated training. Extensive experiments across diverse models, datasets, and unlearning levels validate that Lethe supports all levels of unlearning in a unified manner across both CV and NLP tasks, demonstrating a consistently low resurfacing rate (RR), below 1% in most cases, even after an extremely long horizon of follow-up training.

2601.22450 2026-06-04 cs.LG cs.AI 版本更新

Tuning the Implicit Regularizer of Masked Diffusion Language Models: Enhancing Generalization via Insights from $k$-Parity

调整掩码扩散语言模型的隐式正则化器:通过$k$-奇偶问题的见解增强泛化能力

Jianhao Huang, Baharan Mirzasoleiman

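Affiliations: University of California, Berkeley; Stanford University

AI summary: Studies the generalization of masked diffusion language models through the $k$-parity problem, theoretically decomposes the objective into a signal part and a noise part, treats the noise as an implicit regularizer, and significantly improves performance by optimizing the mask-probability distribution.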

Comments ICML 2026

详情
AI中文摘要

掩码扩散语言模型最近成为一种强大的生成范式,但与自回归模型相比,其泛化特性仍未得到充分研究。本文在$k$-奇偶问题(计算$k$个相关位的异或和)的背景下研究这些特性,其中神经网络通常表现出“grokking”现象——长时间的性能平台期后突然泛化。我们从理论上将掩码扩散(MD)目标分解为驱动特征学习的信号机制和作为隐式正则化器的噪声机制。通过在$k$-奇偶问题上使用MD目标训练nanoGPT,我们证明MD目标从根本上改变了学习景观,实现了快速且同时的泛化,而无需经历grokking。此外,我们利用理论见解优化MD目标中掩码概率的分布。我们的方法显著提高了50M参数模型的困惑度,并在从头预训练和监督微调中均取得了优越结果。具体而言,在8B参数模型上,我们观察到性能提升分别达到$8.8\%$和$5.8\%$,证实了我们的框架在大规模掩码扩散语言模型中的可扩展性和有效性。

英文摘要

Masked Diffusion Language Models have recently emerged as a powerful generative paradigm, yet their generalization properties remain understudied compared to their auto-regressive counterparts. In this work, we investigate these properties within the setting of the $k$-parity problem (computing the XOR sum of $k$ relevant bits), where neural networks typically exhibit grokking -- a prolonged plateau of chance-level performance followed by sudden generalization. We theoretically decompose the Masked Diffusion (MD) objective into a Signal regime, which drives feature learning, and a Noise regime, which serves as an implicit regularizer. By training nanoGPT with the MD objective on the $k$-parity problem, we demonstrate that the MD objective fundamentally alters the learning landscape, enabling rapid and simultaneous generalization without experiencing grokking. Furthermore, we leverage our theoretical insights to optimize the distribution of the mask probability in the MD objective. Our method significantly improves perplexity for 50M-parameter models and achieves superior results across both pre-training from scratch and supervised fine-tuning. Specifically, we observe performance gains peaking at $8.8\%$ and $5.8\%$, respectively, on 8B-parameter models, confirming the scalability and effectiveness of our framework in large-scale masked diffusion language model regimes.
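
To make the $k$-parity setting concrete, the sketch below generates labelled bit strings and applies independent random masking of the kind a masked-diffusion objective trains on. The helper names, the `[MASK]` token, and the per-position masking rule are illustrative assumptions, not the authors' training code.

```python
import random

def make_k_parity(n_samples, n_bits, relevant, seed=0):
    """Each label is the XOR of the bits at the `relevant` positions."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_samples):
        x = [rng.randint(0, 1) for _ in range(n_bits)]
        y = 0
        for i in relevant:
            y ^= x[i]
        data.append((x, y))
    return data

def mask_tokens(tokens, mask_prob, rng, mask_token="[MASK]"):
    """Replace each token independently with mask_token with probability mask_prob."""
    return [mask_token if rng.random() < mask_prob else t for t in tokens]

relevant = [0, 3, 5]  # k = 3 relevant positions (assumed for the example)
dataset = make_k_parity(n_samples=8, n_bits=10, relevant=relevant)
rng = random.Random(1)
masked_example = mask_tokens(dataset[0][0] + [dataset[0][1]], mask_prob=0.5, rng=rng)
```

The abstract's proposal concerns how `mask_prob` itself is distributed during training; the specific distribution is not given there, so a fixed value is used here.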

2601.21868 2026-06-04 stat.ML cs.LG 版本更新

On Forgetting and Stability of Score-based Generative models

关于基于分数的生成模型的遗忘与稳定性

Stanislas Strasman, Gabriel Cardoso, Sylvain Le Corff, Vincent Lemaire, Antonio Ocello

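Affiliations: Sorbonne Université and Université Paris Cité, CNRS, LPSM; Center for Statistics and Images, Mines Paris, PSL University; CREST, ENSAE Paris, Institut Polytechnique de Paris

AI summary: Uses the forgetting and stability properties of the Markov chain associated with the reverse-time dynamics to bound the sampling error of score-based generative models under weak assumptions, via a Lyapunov drift condition and a Doeblin-type minorization condition, and establishes quantitative stability of the sampling procedure.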

详情
AI中文摘要

理解生成模型的稳定性和长期行为是现代机器学习中的一个基本问题。本文通过利用与反向时间动力学相关的马尔可夫链的稳定性和遗忘性质,为基于分数的生成模型的采样误差提供了定量界限。在弱假设下,我们提供了两个结构性质以确保反向过程的初始化和离散化误差的传播:一个Lyapunov漂移条件和一个Doeblin型小化条件。一个实际结果是采样过程的定量稳定性,因为反向扩散动力学沿采样轨迹诱导了一种收缩机制。我们的结果阐明了随机动力学在基于分数的模型中的作用,并为分析此类方法中的误差传播提供了一个原则性框架。

英文摘要

Understanding the stability and long-time behavior of generative models is a fundamental problem in modern machine learning. This paper provides quantitative bounds on the sampling error of score-based generative models by leveraging stability and forgetting properties of the Markov chain associated with the reverse-time dynamics. Under weak assumptions, we provide two structural properties that control the propagation of initialization and discretization errors of the backward process: a Lyapunov drift condition and a Doeblin-type minorization condition. A practical consequence is quantitative stability of the sampling procedure, as the reverse diffusion dynamics induces a contraction mechanism along the sampling trajectory. Our results clarify the role of stochastic dynamics in score-based models and provide a principled framework for analyzing propagation of errors in such approaches.

2601.19449 2026-06-04 cs.LG 版本更新

Fixed Aggregation Features Can Rival GNNs

固定聚合特征可媲美GNN

Celia Rubio-Madrigal, Rebekka Burkholz

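Affiliations: Celia Rubio-Madrigal, Rebekka Burkholz

AI summary: Proposes Fixed Aggregation Features (FAFs), which turn graph learning into a tabular problem. Combining untrained aggregation features with tabular models matches or exceeds GNN performance on most benchmarks.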

Comments Accepted at ICML 2026

详情
AI中文摘要

图神经网络(GNN)被广泛认为通过可训练的邻域聚合在节点表示学习中表现出色。我们通过引入固定聚合特征(FAFs)挑战这一观点,这是一种无需训练的方法,将图学习任务转化为表格问题。这一简单转变使得使用成熟的表格方法成为可能,提供了强大的可解释性和部署不同分类器的灵活性。在14个基准测试中,基于FAF训练的调优多层感知机在12个任务上媲美或超越最先进的GNN和图变换器——通常仅使用均值聚合。唯一的例外是Roman Empire和Minesweeper数据集,这些数据集通常需要异常深的GNN。为了解释非可训练聚合的理论可能性,我们将我们的发现与Kolmogorov-Arnold表示联系起来,并讨论何时均值聚合是足够的。总之,我们的结果呼吁:(i)更丰富的基准测试,以受益于学习多样化的邻域聚合;(ii)将强表格基线作为标准;(iii)使用和推进图数据的表格模型,以获得对相关任务的新见解。

英文摘要

Graph neural networks (GNNs) are widely believed to excel at node representation learning through trainable neighborhood aggregations. We challenge this view by introducing Fixed Aggregation Features (FAFs), a training-free approach that transforms graph learning tasks into tabular problems. This simple shift enables the use of well-established tabular methods, offering strong interpretability and the flexibility to deploy diverse classifiers. Across 14 benchmarks, well-tuned multilayer perceptrons trained on FAFs rival or outperform state-of-the-art GNNs and graph transformers on 12 tasks -- often using only mean aggregation. The only exceptions are the Roman Empire and Minesweeper datasets, which typically require unusually deep GNNs. To explain the theoretical possibility of non-trainable aggregations, we connect our findings to Kolmogorov-Arnold representations and discuss when mean aggregation can be sufficient. In conclusion, our results call for (i) richer benchmarks benefiting from learning diverse neighborhood aggregations, (ii) strong tabular baselines as standard, and (iii) employing and advancing tabular models for graph data to gain new insights into related tasks.
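
The idea of replacing trainable aggregation with fixed, precomputed features can be sketched as follows. This is one plausible reading, assuming repeated mean aggregation over neighbors with the per-hop results concatenated into one row per node; the function names, the number of hops, and the handling of isolated nodes are assumptions, not the paper's exact construction.

```python
def mean_aggregate(features, neighbors):
    """Average each node's neighbor features; no trainable parameters."""
    out = []
    for node, nbrs in enumerate(neighbors):
        if not nbrs:
            # Assumption: an isolated node keeps its own features.
            out.append(list(features[node]))
            continue
        dim = len(features[node])
        out.append([sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)])
    return out

def fixed_aggregation_features(features, neighbors, hops=2):
    """Concatenate raw features with 1..hops rounds of mean aggregation."""
    blocks = [features]
    for _ in range(hops):
        blocks.append(mean_aggregate(blocks[-1], neighbors))
    return [sum((list(b[i]) for b in blocks), []) for i in range(len(features))]

# Path graph 0 - 1 - 2 with one scalar feature per node.
features = [[1.0], [2.0], [3.0]]
neighbors = [[1], [0, 2], [1]]
table = fixed_aggregation_features(features, neighbors, hops=2)
```

Each row of `table` can then be fed to any tabular classifier, such as the tuned multilayer perceptrons the abstract mentions.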

2601.18777 2026-06-04 cs.LG cs.AI cs.CL cs.IR stat.AP 版本更新

PRECISE: Reducing the Bias of LLM Evaluations Using Prediction-Powered Ranking Estimation

PRECISE: 使用预测驱动的排名估计减少LLM评估的偏差

Abhishek Divekar, Anirban Majumder

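Affiliations: Primary contributor and corresponding author

AI summary: Proposes the PRECISE framework, which combines a small number of human annotations with LLM judgments through Prediction-Powered Inference (PPI) to reliably estimate metrics for search, ranking, and RAG systems in low-resource settings while correcting LLM bias.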

Comments Accepted at AAAI 2026 - Innovative Applications of AI (IAAI-26)

详情
AI中文摘要

评估搜索、排序和RAG系统的质量传统上需要大量人工相关性标注。近年来,一些已部署的系统探索使用大型语言模型(LLM)作为自动评判者,但其固有偏差阻碍了直接用于指标估计。我们提出了一个扩展预测驱动推断(PPI)的统计框架,将最少的人工标注与LLM判断相结合,以生成需要子实例标注的指标的可靠估计。我们的方法仅需少至100个人工标注查询和10,000个未标注示例,相比传统方法显著减少了标注需求。我们为基于LLM的查询改写应用中的相关性提升推断制定了所提出的框架(PRECISE),将PPI扩展到查询-文档级别的子实例标注。通过重新制定指标集成空间,我们将计算复杂度从O(2^|C|)降低到O(2^K),其中|C|表示语料库大小(百万量级)。在多个著名检索数据集上的详细实验表明,我们的方法降低了业务关键指标Precision@K的估计方差,同时在低资源设置下有效校正了LLM偏差。

英文摘要

Evaluating the quality of search, ranking and RAG systems traditionally requires a significant number of human relevance annotations. In recent times, several deployed systems have explored the use of Large Language Models (LLMs) as automated judges for this task, but their inherent biases prevent direct use for metric estimation. We present a statistical framework extending Prediction-Powered Inference (PPI) that combines minimal human annotations with LLM judgments to produce reliable estimates of metrics which require sub-instance annotations. Our method requires as few as 100 human-annotated queries and 10,000 unlabeled examples, reducing annotation requirements significantly compared to traditional approaches. We formulate our proposed framework (PRECISE) for inference of relevance uplift for an LLM-based query reformulation application, extending PPI to sub-instance annotations at the query-document level. By reformulating the metric-integration space, we reduce the computational complexity from $O(2^{|C|})$ to $O(2^K)$, where $|C|$ represents corpus size (on the order of millions). Detailed experiments across prominent retrieval datasets demonstrate that our method reduces the variance of estimates for the business-critical Precision@K metric, while effectively correcting for LLM bias in low-resource settings.
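
As background for the PPI idea the abstract builds on, the sketch below shows the basic prediction-powered estimate of a mean: the average LLM judgment on unlabeled data, corrected by the average human-minus-LLM gap on the small labeled set. It illustrates plain PPI only, not the PRECISE extension to sub-instance annotations; the function name and toy numbers are assumptions.

```python
def ppi_mean(y_labeled, f_labeled, f_unlabeled):
    """Basic PPI point estimate of E[Y].

    y_labeled   : human labels on the small annotated set
    f_labeled   : LLM-judge scores on that same annotated set
    f_unlabeled : LLM-judge scores on the large unannotated set
    """
    n = len(y_labeled)
    big_n = len(f_unlabeled)
    # Rectifier: average bias of the judge, measured where human labels exist.
    rectifier = sum(y - f for y, f in zip(y_labeled, f_labeled)) / n
    return sum(f_unlabeled) / big_n + rectifier

# Toy case (assumed numbers): the judge says 0.8 everywhere, humans say 0.5 on average.
estimate = ppi_mean(
    y_labeled=[1.0, 0.0],
    f_labeled=[0.8, 0.8],
    f_unlabeled=[0.8] * 10,
)
```

In this toy case the raw judge average of 0.8 is pulled back to roughly 0.5, the level the human labels support.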

2601.18175 2026-06-04 cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Success Conditioning as Policy Improvement: The Optimization Problem Solved by Imitating Success

成功条件化作为策略改进:模仿成功所解决的优化问题

Daniel Russo

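Affiliations: Daniel J. Russo

AI summary: Proves that success conditioning (imitating successful trajectories) exactly solves a trust-region optimization problem whose $\chi^2$-divergence constraint radius is determined automatically by the data, and reveals an equality between relative policy improvement, the magnitude of policy change, and action-influence.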

详情
AI中文摘要

一种广泛使用的策略改进技术是成功条件化,即收集轨迹,识别那些实现期望结果的轨迹,并更新策略以模仿沿成功轨迹采取的动作。这一原则有许多名称——带SFT的拒绝采样、目标条件化RL、决策Transformer——但它解决了什么优化问题(如果有的话)一直不清楚。我们证明成功条件化精确求解了一个信任区域优化问题,在由数据自动确定半径的χ²散度约束下最大化策略改进。这产生了一个恒等式:相对策略改进、策略变化幅度以及我们称为动作影响(衡量动作选择中的随机变化如何影响成功率)的量在每个状态下都完全相等。因此,成功条件化表现为一个保守的改进算子。精确的成功条件化不会降低性能或引发危险的分布偏移,但当它失败时,它会以可观察的方式失败,即几乎不改变策略。我们将我们的理论应用于常见的回报阈值设定实践,表明这可以放大改进,但代价是可能与真实目标不一致。

英文摘要

A widely used technique for improving policies is success conditioning, in which one collects trajectories, identifies those that achieve a desired outcome, and updates the policy to imitate the actions taken along successful trajectories. This principle appears under many names -- rejection sampling with SFT, goal-conditioned RL, Decision Transformers -- yet what optimization problem it solves, if any, has remained unclear. We prove that success conditioning exactly solves a trust-region optimization problem, maximizing policy improvement subject to a $\chi^2$ divergence constraint whose radius is determined automatically by the data. This yields an identity: relative policy improvement, the magnitude of policy change, and a quantity we call action-influence -- measuring how random variation in action choices affects success rates -- are exactly equal at every state. Success conditioning thus emerges as a conservative improvement operator. Exact success conditioning cannot degrade performance or induce dangerous distribution shift, but when it fails, it does so observably, by hardly changing the policy at all. We apply our theory to the common practice of return thresholding, showing this can amplify improvement, but at the cost of potential misalignment with the true objective.
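
The identity in the abstract can be checked numerically in the simplest possible case, a single-state problem with three actions. The sketch assumes success conditioning means reweighting the policy by Bayes' rule, $\pi^+(a) = \pi(a)\,q(a)/V$ with $V = \sum_a \pi(a) q(a)$, and measures policy change by the $\chi^2$ divergence of $\pi^+$ from $\pi$; the paper's multi-step definitions may differ, and all numbers are made up.

```python
def success_condition(policy, success_prob):
    """Condition a one-state policy on success: pi+(a) = pi(a) * q(a) / V."""
    v = sum(p * q for p, q in zip(policy, success_prob))
    new_policy = [p * q / v for p, q in zip(policy, success_prob)]
    return new_policy, v

policy = [0.5, 0.3, 0.2]          # pi(a), assumed
success_prob = [0.2, 0.5, 0.9]    # q(a) = P(success | a), assumed

new_policy, v = success_condition(policy, success_prob)
v_new = sum(p * q for p, q in zip(new_policy, success_prob))

# Relative policy improvement: (V+ - V) / V
relative_improvement = (v_new - v) / v
# Magnitude of policy change: chi-square divergence of pi+ from pi
chi2 = sum((pn - p) ** 2 / p for pn, p in zip(new_policy, policy))
```

Both quantities reduce algebraically to $\mathrm{Var}_\pi(q)/V^2$ in this setting, which is why they coincide and why the improvement vanishes exactly when the policy barely changes.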

2601.17469 2026-06-04 cs.LG 版本更新

Identifying and Correcting Label Noise for Robust GNNs via Influence Contradiction

通过影响矛盾识别和纠正标签噪声以实现鲁棒图神经网络

Wei Ju, Wei Zhang, Siyu Yi, Zhengyang Mao, Yifan Wang, Jingyang Yuan, Zhiping Xiao, Ziyue Qiao, Ming Zhang

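Affiliations: University of Science and Technology of China

AI summary: Proposes ICGNN, which detects noisy labels with an influence contradiction score (ICS) computed from the graph diffusion matrix, corrects them with a soft strategy based on neighbor predictions, and adds pseudo-labels to improve robustness.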

Comments Accepted by Proceedings of the 43rd International Conference on Machine Learning (ICML 2026)

详情
AI中文摘要

图神经网络(GNN)在学习图结构数据方面表现出显著能力,广泛应用于社交分析和生物信息学等领域。然而,现实场景中标签噪声的存在对学习鲁棒GNN构成重大挑战,其有效性在处理图上噪声标签(通常源于标注错误或不一致)时会受到严重影响。为此,本文提出一种名为ICGNN的新方法,利用图的结构信息有效缓解噪声标签带来的挑战。具体地,我们首先设计一种新的噪声指示器,基于图扩散矩阵测量影响矛盾分数(ICS),以量化具有干净标签的节点的可信度,使得ICS值较高的节点更可能被检测为具有噪声标签。然后,我们利用高斯混合模型精确检测节点标签是否含有噪声。此外,我们开发了一种软策略,结合图上邻居节点的预测来纠正检测到的噪声标签。最后,引入大量未标记节点的伪标签,以提供辅助监督信号并指导模型优化。在基准数据集上的实验表明,我们的方法在噪声标签场景下优于竞争基线。

英文摘要

Graph Neural Networks (GNNs) have shown remarkable capabilities in learning from graph-structured data, with applications such as social analysis and bioinformatics. However, label noise in real scenarios poses a significant challenge to learning robust GNNs: their effectiveness can be severely degraded by noisy labels on graphs, which often stem from annotation errors or inconsistencies. To address this, we propose a novel approach called ICGNN that harnesses the structural information of the graph to alleviate the challenges posed by noisy labels. Specifically, we first design a novel noise indicator that measures the influence contradiction score (ICS) based on the graph diffusion matrix to quantify the credibility of nodes with clean labels, such that nodes with higher ICS values are more likely to be detected as having noisy labels. We then use a Gaussian mixture model to detect whether the label of a node is noisy. Additionally, we develop a soft strategy that combines the predictions from neighboring nodes on the graph to correct the detected noisy labels. Finally, pseudo-labeling for abundant unlabeled nodes is incorporated to provide auxiliary supervision signals and guide the model optimization. Experiments on benchmark datasets show the superiority of our approach over competitive baselines in noisy label scenarios.

2601.06196 2026-06-04 cs.LG cs.AI cs.CL 版本更新

Geometry-Aware Hallucination Detection in Large Language Models

大语言模型中的几何感知幻觉检测

Bodla Krishna Vamshi, Rohan Bhatnagar, Haizhao Yang

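Affiliations: University of Maryland, College Park

AI summary: Proposes the GA-ICL framework, which models local manifold and class-prototype geometry from the latent representations of a frozen LLM to select in-context demonstrations for hallucination detection, and outperforms baselines on the FEVER and HaluEval benchmarks.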

详情
AI中文摘要

大型语言模型(LLM)经常生成事实不正确或未经支持的内容,通常称为幻觉。先前的工作探索了解码策略、检索增强和监督微调用于幻觉检测,而最近的研究表明,上下文学习(ICL)可以显著影响事实可靠性。然而,现有的ICL示例选择方法通常依赖于表面相似性启发式方法,并且在任务和模型上表现出有限的鲁棒性。我们提出GA-ICL,一种几何感知的示例采样框架,用于选择上下文示例,该框架利用从冻结LLM中提取的潜在表示。通过联合建模局部流形结构和类别感知的原型几何,GA-ICL根据示例与学习原型的接近程度进行选择,而不仅仅是基于词汇或嵌入相似性。在事实验证(FEVER)和幻觉检测(HaluEval)基准上,GA-ICL在大多数评估设置中优于标准ICL选择基线,在对话和摘要任务上尤其有显著提升。该方法在温度扰动和模型变化下保持鲁棒性,表明与启发式检索策略相比具有更高的稳定性。虽然在较小模型规模下的某些问答场景中,词汇检索仍可能具有竞争力,但我们的结果表明,几何感知的原型选择为幻觉检测提供了一种可靠且训练轻量的方法,无需修改LLM参数。在Phi-14B和Qwen3-32B上的扩展评估证实,GA-ICL能有效扩展到更大模型,在包括较小模型显示边界条件限制的问答任务在内的所有比较基线上均表现优异,为改进ICL示例选择提供了原则性方向。

英文摘要

Large language models (LLMs) frequently generate factually incorrect or unsupported content, commonly referred to as hallucinations. Prior work has explored decoding strategies, retrieval augmentation, and supervised fine-tuning for hallucination detection, while recent studies show that in-context learning (ICL) can substantially influence factual reliability. However, existing ICL demonstration selection methods often rely on surface-level similarity heuristics and exhibit limited robustness across tasks and models. We propose GA-ICL, a geometry-aware demonstration sampling framework for selecting in-context demonstrations that leverages latent representations extracted from frozen LLMs. By jointly modeling local manifold structure and class-aware prototype geometry, GA-ICL selects demonstrations based on their proximity to learned prototypes rather than lexical or embedding similarity alone. Across factual verification (FEVER) and hallucination detection (HaluEval) benchmarks, GA-ICL outperforms standard ICL selection baselines in the majority of evaluated settings, with particularly strong gains on dialogue and summarization tasks. The method remains robust under temperature perturbations and model variation, indicating improved stability compared to heuristic retrieval strategies. While lexical retrieval can remain competitive in certain question-answering regimes at smaller model scales, our results demonstrate that geometry-aware prototype selection provides a reliable and training-light approach for hallucination detection without modifying LLM parameters. Extended evaluations on Phi-14B and Qwen3-32B confirm that GA-ICL scales effectively to larger models, outperforming all compared baselines including on QA tasks where smaller models show boundary-condition limitations, offering a principled direction for improved ICL demonstration selection.

2412.18134 2026-06-04 cs.LG cs.CC cs.PL cs.SE 版本更新

Learning Randomized Reductions

学习随机归约

Ferhat Erata, Orr Paradise, Thanos Typaldos, Timos Antonopoulos, ThanhVu Nguyen, Shafi Goldwasser, Ruzica Piskac

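Affiliations: Yale University, USA; EPFL, Switzerland; George Mason University, USA

AI summary: Proposes the Bitween framework for automatically learning randomized self-reductions (RSRs). Its backends (linear regression, genetic programming, and others) and its LLM agents discover RSRs for 54% and 80% of 80 functions, respectively, including the first reduction for sigmoid.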

Comments Accepted at ICML 2026 (Spotlight). 9 pages main text + appendix

详情
Journal ref
Proceedings of the 43rd International Conference on Machine Learning, PMLR 306, 2026
AI中文摘要

随机自归约(RSR)通过使用 $f$ 在随机相关点上的求值来表达 $f(x)$,从而能够实现自校正程序、实例隐藏协议,并在复杂性理论和密码学中有应用。然而,40 多年来发现 RSR 一直需要手动专家推导,限制了其实际应用。我们提出了用于自动 RSR 学习的 Bitween。首先,我们在相关采样下形式化了 RSR 学习及其样本复杂度分析。其次,我们开发了 Vanilla Bitween,它集成了多个后端(线性回归、遗传编程、符号回归和混合整数规划)。线性回归后端表现最佳,在我们的基准套件 RSR-Bench 中为 80 个函数中的 43 个(54%)发现了 RSR,包括 sigmoid 的首次已知归约。第三,我们引入了 Agentic Bitween,一种神经符号方法,其中 LLM 代理提出超越先前工作中固定集合($x+r$, $x-r$, $x \cdot r$, $x$, $r$)的新查询函数。Agentic Bitween 为 80 个函数中的 64 个(80%)发现了 RSR,在 RSR 发现和验证准确性方面均优于纯神经基线。

英文摘要

Randomized self-reductions (RSRs) express $f(x)$ using $f$ evaluated at random correlated points, enabling self-correcting programs, instance-hiding protocols, and applications in complexity theory and cryptography. Yet discovering RSRs has required manual expert derivation for over 40 years, limiting their practical use. We present Bitween for automated RSR learning. First, we formalize RSR learning with sample complexity analysis under correlated sampling. Second, we develop Vanilla Bitween, which integrates multiple backends (linear regression, genetic programming, symbolic regression, and mixed-integer programming). The linear regression backend outperforms the others, discovering RSRs for 43 of 80 functions (54%) in RSR-Bench, our benchmark suite, including the first known reduction for sigmoid. Third, we introduce Agentic Bitween, a neuro-symbolic approach where LLM agents propose novel query functions beyond the fixed set ($x+r$, $x-r$, $x \cdot r$, $x$, $r$) in prior work. Agentic Bitween discovers RSRs for 64 of 80 functions (80%), outperforming pure neural baselines in both RSR discovery and verification accuracy.
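
A textbook example shows what a randomized self-reduction buys. For a linear function $f(x) = c \cdot x \bmod p$, the identity $f(x) = f(x + r) - f(r)$ holds for every $r$, so a program that is wrong on a few inputs can be corrected by querying it at random points and taking a majority vote. The buggy program, the modulus, and the trial count below are made up for illustration; this is not Bitween itself, which learns such identities rather than using a known one.

```python
import random
from collections import Counter

P = 101  # small prime modulus (assumed for the example)

def faulty_f(x):
    """Hypothetical buggy implementation of f(x) = 7*x mod P, wrong on two inputs."""
    if x % P in (3, 50):
        return 0
    return (7 * x) % P

def self_correct(program, x, trials=31, seed=0):
    """Evaluate f(x) via the RSR f(x) = f(x + r) - f(r) (mod P) and majority vote."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(trials):
        r = rng.randrange(P)
        votes[(program((x + r) % P) - program(r)) % P] += 1
    return votes.most_common(1)[0][0]

direct = faulty_f(3)                   # hits the bug
corrected = self_correct(faulty_f, 3)  # routes around it with random queries
```

Each trial touches a buggy input for at most 4 of the 101 possible values of `r`, so the majority vote recovers the true value $7 \cdot 3 = 21$ with overwhelming probability.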

2601.03569 2026-06-04 cs.LG stat.AP 版本更新

Local Intrinsic Dimensionality of Ground Motion Data for Early Detection of Catastrophic Slope Failure

用于早期检测灾难性边坡破坏的地震动数据的局部内在维度

Yuansan Liu, James Bailey, Antoinette Tordesillas

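Affiliations: The University of Melbourne; Monash University

AI summary: Proposes spatiotemporal local intrinsic dimensionality (st-LID), an unsupervised framework that uses kinematic enhancement, Bayesian spatial fusion, and temporal modeling to improve the precision and lead time of early failure-zone detection in landslide monitoring.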

Comments 20 pages, 9 figures. ECML-PKDD 2026

详情
AI中文摘要

局部内在维度(LID)在高维数据异常检测中显示出强大潜力,包括颗粒介质中滑坡破坏检测,其中早期准确识别破坏区域对于有效的地质灾害缓解至关重要。然而,由于表面位移数据中固有的空间相关性和时间动态,这项任务仍然具有挑战性。为了解决这一差距,我们提出了一种新颖的无监督框架,称为时空LID(st-LID),它将LID推广到滑坡监测网络中的稳健破坏检测。我们的方法引入了三个关键创新:(1)运动增强,将速度纳入LID计算以捕获瞬时变形率和短期时间动态;(2)贝叶斯空间融合,通过贝叶斯估计聚合空间邻域内的LID值,以嵌入空间相关性并考虑局部噪声;以及(3)时间建模(t-LID),一种新变体,表征位移数据的长期动态,提供位移行为的稳健时间表示。通过统一这些组件,st-LID识别出现有方法经常忽略的复杂多阶段破坏区域。大量实验表明,st-LID在检测精度和提前时间方面始终优于最先进的无监督基线,为滑坡早期预警系统和有针对性的风险干预提供了稳健基础,以增强社区韧性和准备策略。

英文摘要

Local Intrinsic Dimensionality (LID) has shown strong potential for anomaly detection in high-dimensional data, including landslide failure detection in granular media, where early and accurate identification of failure zones is crucial for effective geohazard mitigation. However, this task is still challenging due to the spatial correlations and temporal dynamics that are inherently present in surface displacement data. To address this gap, we propose a novel unsupervised framework called spatiotemporal LID (st-LID) that generalizes the LID for robust failure detection in landslide monitoring networks. Our approach introduces three key innovations: (1) Kinematic enhancement, incorporating velocity into the LID computation to capture instantaneous deformation rates and short-term temporal dynamics; (2) Bayesian spatial fusion, which aggregates LID values across spatial neighborhoods via Bayesian estimation, to embed spatial correlations and account for localized noise; and (3) Temporal modeling (t-LID), a new variant that characterizes long-term dynamics of displacement data, providing a robust temporal representation of displacement behavior. By unifying these components, st-LID identifies complex, multi-stage failure zones often overlooked by existing methods. Extensive experiments show that st-LID consistently outperforms state-of-the-art unsupervised baselines in detection precision and lead-time, providing a robust foundation for landslide early warning systems and targeted risk intervention to enhance community resilience and preparedness strategies.
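
For context, the quantity being generalized can be estimated from nearest-neighbor distances alone. The sketch below implements the commonly used maximum-likelihood LID estimator, $\widehat{\mathrm{LID}} = -\left(\frac{1}{k}\sum_{i=1}^{k} \ln \frac{r_i}{r_k}\right)^{-1}$, where $r_1 \le \dots \le r_k$ are the distances to the $k$ nearest neighbors. It is the plain estimator, not st-LID; the function name and the test data are assumptions.

```python
import math

def lid_mle(distances, k):
    """Maximum-likelihood LID estimate from the k smallest positive distances.

    Assumes at least two distinct positive distances; otherwise the mean
    log-ratio is zero and the estimate is undefined.
    """
    r = sorted(d for d in distances if d > 0)[:k]
    r_max = r[-1]
    mean_log_ratio = sum(math.log(ri / r_max) for ri in r) / len(r)
    return -1.0 / mean_log_ratio

# Neighbors evenly spaced along a line: the estimate should be close to 1.
line_distances = [float(i) for i in range(1, 101)]
lid_line = lid_mle(line_distances, k=100)
```

A framework like the one in the abstract would compute such values per sensor and time step before applying spatial fusion and temporal modeling; those stages are not specified in enough detail here to sketch.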

2601.07408 2026-06-04 cs.CL cs.LG 版本更新

Outcome-Grounded Advantage Reshaping for Fine-Grained Credit Assignment in Mathematical Reasoning

基于结果锚定的优势重塑用于数学推理中的细粒度信用分配

Ziheng Li, Liu Kang, Feng Xiao, Luxi Xing, Qingyi Si, Zhuoran Li, Weikang Gong, Deqing Yang, Yanghua Xiao, Hongcheng Guo

Affiliations: Fudan University; XingYun lab, HUJING Digital Media & Entertainment Group; University of Science and Technology Beijing; Chinese Academy of Sciences; Beijing University of Posts and Telecommunications

AI summary: Proposes Outcome-grounded Advantage Reshaping (OAR), which achieves fine-grained credit assignment through two strategies (OAR-P and OAR-G) and significantly improves GRPO performance on mathematical reasoning.

详情
AI中文摘要

组相对策略优化(GRPO)已成为一种有前途的无需评论家的强化学习范式,用于推理任务。然而,标准GRPO采用粗粒度的信用分配机制,将组级奖励均匀地传播到序列中的每个令牌,忽略了各个推理步骤的不同贡献。我们通过引入结果锚定优势重塑(OAR)来解决这一局限性,这是一种细粒度的信用分配机制,根据每个令牌对模型最终答案的影响程度重新分配优势。我们通过两种互补策略实例化OAR:(1)OAR-P,通过反事实令牌扰动估计结果敏感性,作为高保真归因信号;(2)OAR-G,使用输入梯度敏感性代理,通过单次反向传播近似影响信号。这些重要性信号与保守的双层优势重塑方案相结合,该方案抑制低影响令牌并提升关键令牌,同时保持整体优势质量。在广泛数学推理基准上的实证结果表明,虽然OAR-P设定了性能上限,但OAR-G以可忽略的计算开销实现了相当的增益,两者均显著优于强GRPO基线,推动了无需评论家的大语言模型推理的边界。

英文摘要

Group Relative Policy Optimization (GRPO) has emerged as a promising critic-free reinforcement learning paradigm for reasoning tasks. However, standard GRPO employs a coarse-grained credit assignment mechanism that propagates group-level rewards uniformly to every token in a sequence, neglecting the varying contribution of individual reasoning steps. We address this limitation by introducing Outcome-grounded Advantage Reshaping (OAR), a fine-grained credit assignment mechanism that redistributes advantages based on how much each token influences the model's final answer. We instantiate OAR via two complementary strategies: (1) OAR-P, which estimates outcome sensitivity through counterfactual token perturbations, serving as a high-fidelity attribution signal; (2) OAR-G, which uses an input-gradient sensitivity proxy to approximate the influence signal with a single backward pass. These importance signals are integrated with a conservative Bi-Level advantage reshaping scheme that suppresses low-impact tokens and boosts pivotal ones while preserving the overall advantage mass. Empirical results on extensive mathematical reasoning benchmarks demonstrate that while OAR-P sets the performance upper bound, OAR-G achieves comparable gains with negligible computational overhead. Both significantly outperform a strong GRPO baseline, pushing the boundaries of critic-free LLM reasoning.

2601.07036 2026-06-04 cs.CL cs.AI cs.LG 版本更新

Mid-Think: Training-Free Intermediate-Budget Reasoning via Token-Level Triggers

Mid-Think: 通过词元级触发器实现无需训练的中间预算推理

Wang Yang, Debargha Ganguly, Xinpeng Li, Chaoda Song, Shouren Wang, Vikash Singh, Vipin Chaudhary, Xiaotian Han

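Affiliations: Case Western Reserve University

AI summary: Through attention analysis and prompting experiments, finds that reasoning behavior is largely controlled by a small set of trigger tokens, and accordingly proposes Mid-Think, which combines trigger tokens to achieve intermediate-budget reasoning, beats baselines on the accuracy-length trade-off, and reduces training time while improving performance in RL training.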

详情
AI中文摘要

混合推理语言模型通常通过高级的Think/No-think指令来控制推理行为,但我们发现这种模式切换主要由一小部分触发词元驱动,而非指令本身。通过注意力分析和受控提示实验,我们表明开头的“Okay”词元会诱导推理行为,而“</think>”后的换行模式则会抑制推理。基于这一观察,我们提出了Mid-Think,一种简单的无需训练的提示格式,通过组合这些触发器实现中间预算推理,在准确率-长度权衡上始终优于固定词元和基于提示的基线。此外,在监督微调后将Mid-Think应用于强化学习训练,可将训练时间减少约15%,同时将Qwen3-8B在AIME上的最终性能从69.8%提升至72.4%,在GPQA上从58.5%提升至61.1%,证明了其在推理时控制和基于强化学习的推理训练中的有效性。

英文摘要

Hybrid reasoning language models are commonly controlled through high-level Think/No-think instructions to regulate reasoning behavior, yet we find that such mode switching is largely driven by a small set of trigger tokens rather than the instructions themselves. Through attention analysis and controlled prompting experiments, we show that a leading "Okay" token induces reasoning behavior, while the newline pattern following "</think>" suppresses it. Based on this observation, we propose Mid-Think, a simple training-free prompting format that combines these triggers to achieve intermediate-budget reasoning, consistently outperforming fixed-token and prompt-based baselines in terms of the accuracy-length trade-off. Furthermore, applying Mid-Think to RL training after SFT reduces training time by approximately 15% while improving final performance of Qwen3-8B on AIME from 69.8% to 72.4% and on GPQA from 58.5% to 61.1%, demonstrating its effectiveness for both inference-time control and RL-based reasoning training.

2411.05894 2026-06-04 cs.CL cs.AI cs.LG 版本更新

SSSD: Simply-Scalable Speculative Decoding

SSSD: 简单可扩展的推测解码

Michele Marzollo, Jiawei Zhuang, Niklas Roemer, Niklas Zwingenberger, Lorenz K. Müller, Lukas Cavigelli

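Affiliations: Huawei; ETH Zurich

AI summary: Proposes SSSD, a training-free speculative decoding method combining lightweight n-gram matching with hardware-aware speculation. It performs on par with leading training-based methods across benchmarks, reduces latency by up to 2.9x, and is robust to language and domain shift.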

Comments Accepted to the 64th Annual Meeting of the Association for Computational Linguistics (ACL 2026, Main Conference)

详情
AI中文摘要

推测解码已成为加速大型语言模型推理的流行技术。然而,大多数现有方法在生产服务系统中仅带来适度的改进。实现显著加速的方法通常依赖于额外的训练草案模型或辅助模型组件,增加了部署和维护的复杂性。这种增加的复杂性降低了灵活性,特别是当服务负载转移到草案模型训练数据中未充分表示的任务、领域或语言时。我们引入了简单可扩展的推测解码(SSSD),一种无需训练的方法,结合了轻量级n-gram匹配和硬件感知推测。相对于标准自回归解码,SSSD将延迟降低高达2.9倍。它在广泛的基准测试中达到了与领先的基于训练的方法相当的性能,同时需要显著更低的采用成本——无需数据准备、训练或调优——并且在语言和领域变化以及长上下文设置中表现出优越的鲁棒性。

英文摘要

Speculative Decoding has emerged as a popular technique for accelerating inference in Large Language Models. However, most existing approaches yield only modest improvements in production serving systems. Methods that achieve substantial speedups typically rely on an additional trained draft model or auxiliary model components, increasing deployment and maintenance complexity. This added complexity reduces flexibility, particularly when serving workloads shift to tasks, domains, or languages that are not well represented in the draft model's training data. We introduce Simply-Scalable Speculative Decoding (SSSD), a training-free method that combines lightweight n-gram matching with hardware-aware speculation. Relative to standard autoregressive decoding, SSSD reduces latency by up to 2.9x. It achieves performance on par with leading training-based approaches across a broad range of benchmarks while requiring substantially lower adoption effort (no data preparation, training, or tuning is needed) and exhibiting superior robustness under language and domain shift, as well as in long-context settings.
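
The n-gram matching half of such a method can be illustrated with a few lines of code. The sketch assumes the simplest scheme: find the most recent earlier occurrence of the last `n` tokens in the context and propose whatever followed it as the draft. Verification by the target model and the hardware-aware choice of draft length are omitted, and the function name and parameters are hypothetical rather than taken from SSSD.

```python
def propose_draft(tokens, n=2, max_draft=4):
    """Draft continuation tokens by matching the trailing n-gram against the context."""
    if len(tokens) < n:
        return []
    suffix = tokens[-n:]
    # Scan backwards so the most recent earlier occurrence wins.
    for start in range(len(tokens) - n - 1, -1, -1):
        if tokens[start:start + n] == suffix:
            return tokens[start + n:start + n + max_draft]
    return []

context = ["the", "cat", "sat", "on", "the", "cat"]
draft = propose_draft(context, n=2, max_draft=2)
```

In a full system the target model would score the drafted tokens in one forward pass and keep the longest accepted prefix, so a wrong draft costs little while a correct one saves decoding steps.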

2509.05510 2026-06-04 physics.comp-ph cs.LG 版本更新

Causal Multi-fidelity Surrogate Forward and Inverse Models for ICF Implosions

因果多保真替代前向与逆向模型用于ICF内爆

Tyler E. Maltba, Ben S. Southworth, Jeffrey R. Haack, Marc L. Klasky

发表机构 * Theoretical Division, Los Alamos National Laboratory(洛斯阿拉莫斯国家实验室理论部) Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory(洛斯阿拉莫斯国家实验室计算机、计算与统计科学部)

AI总结 针对惯性约束聚变中的逆向问题,构建因果动态多保真降阶替代模型,通过低/高保真训练数据学习控制器,并利用机器学习模型优化辐射温度驱动以复现观测界面动力学。

详情
AI中文摘要

惯性约束聚变(ICF)的持续进展需要解决将实验观测与模拟输入参数相关联的逆向问题,随后进行设计优化。然而,这类高维动态PDE约束优化问题极具挑战性,甚至难以处理。最近研究表明,通过仅考虑某些鲁棒特征可以解决逆向问题。本文考虑ICF靶丸的氘-氚(DT)界面,构建了一个因果、动态、多保真降阶替代模型,将时间依赖的辐射温度驱动映射到界面的半径和速度动力学。该替代模型针对DT界面动力学的ODE嵌入,通过使用低/高保真模拟训练数据(关于辐射能群结构)学习基础解析模型的控制器来构建。在展示了替代界面模型的优异精度后,我们使用机器学习(ML)模型结合替代生成的数据来解决逆向问题,优化辐射温度驱动以复现观测到的界面动力学。对于稀疏时间快照,ML模型进一步表征了采样动力学最具信息量的时间点。总之,我们展示了如何将算子学习、因果架构和物理归纳偏差整合起来,以加速高能量密度系统中的发现、设计和诊断。

英文摘要

Continued progress in inertial confinement fusion (ICF) requires solving inverse problems relating experimental observations to simulation input parameters, followed by design optimization. However, such high-dimensional dynamic PDE-constrained optimization problems are extremely challenging or even intractable. It has been recently shown that inverse problems can be solved by only considering certain robust features. Here we consider the ICF capsule's deuterium-tritium (DT) interface, and construct a causal, dynamic, multifidelity reduced-order surrogate that maps from a time-dependent radiation temperature drive to the interface's radius and velocity dynamics. The surrogate targets an ODE embedding of DT interface dynamics, and is constructed by learning a controller for a base analytical model using low- and high-fidelity simulation training data with respect to radiation energy group structure. After demonstrating excellent accuracy of the surrogate interface model, we use machine learning (ML) models with surrogate-generated data to solve inverse problems optimizing radiation temperature drive to reproduce observed interface dynamics. For sparse snapshots in time, the ML model further characterizes the most informative times at which to sample dynamics. Altogether we demonstrate how operator learning, causal architectures, and physical inductive bias can be integrated to accelerate discovery, design, and diagnostics in high-energy-density systems.

2512.17678 2026-06-04 cs.LG cs.AI 版本更新

You Only Train Once: Differentiable Subset Selection for Omics Data

你只训练一次:用于组学数据的可微分子集选择

Daphné Chopard, Jorge da Silva Gonçalves, Irene Cannistraci, Thomas M. Sutter, Julia E. Vogt

发表机构 * Department of Computer Science, ETH Zurich(计算机科学系,苏黎世联邦理工学院) Department of Intensive Care and Neonatology, University Children’s Hospital Zurich(重症医学与新生儿科,苏黎世大学儿童医院)

AI总结 提出YOTO框架,通过端到端可微架构联合选择离散基因子集并进行预测,实现稀疏、多任务学习,提升单细胞转录组数据分析性能。

Comments Camera-ready version accepted at Transactions on Machine Learning Research (TMLR)

详情
Journal ref
Transactions on Machine Learning Research, 2026
AI中文摘要

从单细胞转录组数据中选择紧凑且信息丰富的基因子集对于生物标志物发现、提高可解释性和成本效益分析至关重要。然而,大多数现有的特征选择方法要么作为多阶段流水线运行,要么依赖于事后特征归因,使得选择和预测弱耦合。在这项工作中,我们提出了YOTO(你只训练一次),一个端到端框架,在单个可微架构中联合识别离散基因子集并进行预测。在我们的模型中,预测任务直接指导选择哪些基因,而学习到的子集反过来塑造预测表示。这种闭环反馈使模型能够在训练过程中迭代地优化其选择内容和预测方式。与现有方法不同,YOTO强制执行稀疏性,使得只有选中的基因对推理有贡献,从而无需训练额外的下游分类器。通过多任务学习设计,模型在相关目标之间学习共享表示,使得部分标记的数据集能够相互提供信息,并发现无需额外训练步骤即可跨任务泛化的基因子集。我们在两个代表性的单细胞RNA-seq数据集上评估YOTO,显示它持续优于最先进的基线。这些结果表明,稀疏、端到端、多任务的基因子集选择提高了预测性能,并产生了紧凑且有意义的基因子集,推进了生物标志物发现和单细胞分析。

英文摘要

Selecting compact and informative gene subsets from single-cell transcriptomic data is essential for biomarker discovery, improving interpretability, and cost-effective profiling. However, most existing feature selection approaches either operate as multi-stage pipelines or rely on post hoc feature attribution, making selection and prediction weakly coupled. In this work, we present YOTO (you only train once), an end-to-end framework that jointly identifies discrete gene subsets and performs prediction within a single differentiable architecture. In our model, the prediction task directly guides which genes are selected, while the learned subsets, in turn, shape the predictive representation. This closed feedback loop enables the model to iteratively refine both what it selects and how it predicts during training. Unlike existing approaches, YOTO enforces sparsity so that only the selected genes contribute to inference, eliminating the need to train additional downstream classifiers. Through a multi-task learning design, the model learns shared representations across related objectives, allowing partially labeled datasets to inform one another, and discovering gene subsets that generalize across tasks without additional training steps. We evaluate YOTO on two representative single-cell RNA-seq datasets, showing that it consistently outperforms state-of-the-art baselines. These results demonstrate that sparse, end-to-end, multi-task gene subset selection improves predictive performance and yields compact and meaningful gene subsets, advancing biomarker discovery and single-cell analysis.

2510.05013 2026-06-04 stat.ML cs.LG 版本更新

Curiosity-Driven Development of Action and Language in Robots Through Self-Exploration

机器人通过自我探索实现好奇心驱动的动作与语言发展

Theodore Jerome Tinker, Kenji Doya, Jun Tani

发表机构 * Okinawa Institute of Science and Technology(冲绳科学技术大学院大学)

AI总结 本研究通过好奇心驱动的机器人自我探索,结合Q学习实现主动推理,揭示了组合泛化、快速学习、先配对后组合以及异常处理导致的U型发展模式,为人类高效语言习得提供解释。

Comments 27 pages, 22 pages of supplementary material

详情
AI中文摘要

婴儿通过极少的经验就能泛化习得语言,而大型语言模型需要数十亿的训练标记。人类高效发展的基础是什么?我们通过实验研究了这一问题,其中机器人代理通过好奇心驱动的自我探索学习执行与祈使句(例如,推红色立方体)相关的动作。我们的方法使用Q学习摊销主动推理,实现内在动机的发展性学习。模拟揭示了与发展心理学观察相对应的关键发现。i) 随着组合元素规模的增加,泛化能力显著提高。ii) 好奇心驱动的探索能够加速学习。iii) 句子和动作的机械配对先于组合泛化。iv) 异常处理导致U型发展表现,这种模式类似于儿童语言学习中的表征重述。这些结果表明,好奇心驱动的主动推理解释了内在动机的感觉运动-语言学习如何支持人类和人工代理中的可扩展组合泛化和异常处理。

英文摘要

Infants acquire language with generalization from minimal experience, whereas large language models require billions of training tokens. What underlies efficient development in humans? We investigated this problem through experiments wherein robotic agents learn to perform actions associated with imperative sentences (e.g., push red cube) via curiosity-driven self-exploration. Our approach amortizes active inference using Q-learning, enabling intrinsically motivated developmental learning. The simulations reveal key findings corresponding to observations in developmental psychology. i) Generalization improves drastically as the scale of compositional elements increases. ii) Curiosity-driven exploration enables faster learning. iii) Rote pairing of sentences and actions precedes compositional generalization. iv) Exception-handling induces U-shaped developmental performance, a pattern like representational redescription in child language learning. These results suggest that curiosity-driven active inference accounts for how intrinsically motivated sensorimotor-linguistic learning supports scalable compositional generalization and exception handling in humans and artificial agents.

2512.10236 2026-06-04 cs.DC cs.AR cs.LG 版本更新

Design Space Exploration of DMA based Finer-Grain Compute Communication Overlap

基于DMA的细粒度计算通信重叠的设计空间探索

Shagnik Pal, Shaizeen Aga, Suchita Pati, Mahzabeen Islam, Lizy K. John

发表机构 * Advanced Micro Devices Inc.(超威半导体公司) The University of Texas at Austin(德克萨斯大学奥斯汀分校)

AI总结 本文提出细粒度计算通信重叠方法FiCCO,通过深入分片粒度以下、利用DMA引擎卸载通信,实现更广泛网络拓扑下的性能优化,最高获得1.6倍加速。

详情
AI中文摘要

现代机器学习工作负载需要跨多个GPU分布训练和推理。然而,这些并行化技术常常遭受暴露的关键路径通信,通过计算通信重叠可能实现1.7倍的加速。先前的重叠方法利用ML模型状态和输入已经分片到GPU数量的事实,并在分片粒度上重叠计算和通信。然而,这种粗粒度重叠受到有限网络拓扑支持和次优数据流的限制。在这项工作中,我们转而支持更细粒度的计算通信重叠,称之为FiCCO。FiCCO比传统分片深入一层,为更广泛的网络拓扑解锁重叠,并实现更细粒度的数据流。我们表明,FiCCO打开了比仅分片级别更广泛的执行调度设计空间。为了遍历调度的设计空间,我们研究并表征了进行重叠时的性能低效问题,并将调度与相关的低效特征叠加。我们的表征揭示了分解和基于争用的减速是主要的性能限制因素,并将减速因子与静态计算/通信算子大小相关联。这有助于我们设计启发式方法(框架和运行时可以利用)来根据底层ML操作的性质选择定制的FiCCO调度。最后,为了进一步最小化操作重叠固有的争用低效,我们将通信卸载到GPU DMA引擎。我们评估了来自实际ML部署的几种场景,并证明我们提出的启发式驱动的定制调度可提供高达1.6倍的加速。此外,我们的启发式方法在81%的未见场景中提供了准确选择最优调度的指导。

英文摘要

Modern ML workloads demand distributing training and inference across multiple GPUs. However, these parallelization techniques often suffer from exposed critical-path communication, leaving a potential 1.7x speedup on the table that compute-communication overlap could recover. Prior overlapping methods harness the fact that ML model state and inputs are already sharded into as many pieces as there are GPUs, and overlap compute and communication at shard granularity. However, such coarse-grained overlap suffers from limited network topology support and suboptimal dataflows. In this work, we instead make a case for finer-grain compute-communication overlap, which we term FiCCO. FiCCO operates one level deeper than traditional sharding, unlocks overlap for a wider set of network topologies, and enables finer-grain dataflow. We show that FiCCO opens up a wider design space of execution schedules than is possible at shard level alone. To walk the design space of schedules, we study and characterize the performance inefficiencies that arise when overlapping, and overlay the schedules with the associated inefficiency signatures. Our characterization reveals decomposition- and contention-based slowdowns to be the major performance limiters, and we correlate the slowdown factors with the static compute/communication operator sizes. This helps us design heuristics (that frameworks and runtimes can harness) to select bespoke FiCCO schedules based on the nature of the underlying ML operations. Finally, to further minimize the contention inefficiencies inherent in operation overlap, we offload communication to GPU DMA engines. We evaluate several scenarios from realistic ML deployments and demonstrate that our proposed heuristics-driven bespoke schedules deliver up to 1.6x speedup. Further, our heuristics provide accurate guidance to pick the optimal schedule in 81% of unseen scenarios.

2512.06553 2026-06-04 stat.AP cs.LG 版本更新

A Latent Variable Framework for Scaling Laws in Large Language Models

大型语言模型中缩放定律的潜变量框架

Peiyao Cai, Chengyu Cui, Felipe Maia Polo, Seamus Somerstep, Leshem Choshen, Mikhail Yurochkin, Yuekai Sun, Kean Ming Tan, Gongjun Xu

发表机构 * Department of Statistics, University of Michigan(密歇根大学统计系) IBM Research and CSAIL, MIT(IBM研究与麻省理工学院计算机科学与人工智能实验室) Institute of Foundation Models, MBZUAI(MBZUAI基础模型研究所)

AI总结 提出基于潜变量建模的统计框架,通过引入潜变量捕获不同模型家族和基准的异构性,以更准确地建模大型语言模型的缩放定律。

详情
AI中文摘要

我们提出了一个基于潜变量建模的统计框架,用于大型语言模型(LLMs)的缩放定律。我们的工作受到大量具有不同架构和训练策略的新LLM家族迅速涌现的推动,这些模型在越来越多的基准上进行评估。这种异构性使得单一的全局缩放曲线不足以捕捉不同家族和基准之间的性能变化。为了解决这个问题,我们提出了一个潜变量建模框架,其中每个LLM家族与一个潜变量相关联,该潜变量捕获该家族中常见的底层特征。然后,LLM在不同基准上的性能由其潜在技能驱动,这些技能由潜变量和模型自身的可观测特征共同决定。我们开发了该潜变量模型的估计程序,并建立了其统计性质。我们还设计了支持估计和各种下游任务的高效数值算法。在实验上,我们在Open LLM Leaderboard(v1/v2)的12个广泛使用的基准上评估了该方法。

英文摘要

We propose a statistical framework built on latent variable modeling for scaling laws of large language models (LLMs). Our work is motivated by the rapid emergence of numerous new LLM families with distinct architectures and training strategies, evaluated on an increasing number of benchmarks. This heterogeneity makes a single global scaling curve inadequate for capturing how performance varies across families and benchmarks. To address this, we propose a latent variable modeling framework in which each LLM family is associated with a latent variable that captures the common underlying features in that family. An LLM's performance on different benchmarks is then driven by its latent skills, which are jointly determined by the latent variable and the model's own observable features. We develop an estimation procedure for this latent variable model and establish its statistical properties. We also design efficient numerical algorithms that support estimation and various downstream tasks. Empirically, we evaluate the approach on 12 widely used benchmarks from the Open LLM Leaderboard (v1/v2).

2512.03296 2026-06-04 cs.SI cs.CY cs.LG 版本更新

Associating Healthcare Teamwork with Patient Outcomes for Predictive Analysis

将医疗团队协作与患者结局关联以进行预测分析

Hsiao-Ying Lu, Kwan-Liu Ma

发表机构 * Department of Computer Science University of California, Davis Davis, USA(加州大学戴维斯分校计算机科学系)

AI总结 本研究通过电子健康记录系统建模医疗专业人员协作网络,应用机器学习技术检测患者生存预测信号,并识别与改善结局相关的关键网络特征,经临床专家验证其现实应用潜力。

详情
AI中文摘要

癌症治疗结局不仅受临床和人口统计学因素影响,还受医疗团队协作的影响。然而,先前的工作在很大程度上忽视了人类协作在塑造患者生存中的潜在作用。本文提出了一种应用人工智能方法,通过电子健康记录(EHR)系统捕获的医疗专业人员(HCP)协作,揭示其对癌症患者结局的影响。我们将EHR介导的HCP交互建模为网络,并应用机器学习技术检测这些协作中嵌入的患者生存预测信号。我们的模型经过交叉验证以确保泛化能力,并通过识别与改善结局相关的关键网络特征来解释预测。重要的是,临床专家和文献验证了所识别关键协作特征的相关性,增强了其在现实应用中的潜力。这项工作为利用协作的数字痕迹和人工智能评估及改善基于团队的医疗保健提供了实用工作流程。该方法可能可转移到涉及复杂协作的其他领域,并提供可操作的见解以支持医疗保健服务中的数据知情干预。

英文摘要

Cancer treatment outcomes are influenced not only by clinical and demographic factors but also by the collaboration of healthcare teams. However, prior work has largely overlooked the potential role of human collaboration in shaping patient survival. This paper presents an applied AI approach to uncovering the impact of healthcare professionals' (HCPs) collaboration, captured through electronic health record (EHR) systems, on cancer patient outcomes. We model EHR-mediated HCP interactions as networks and apply machine learning techniques to detect predictive signals of patient survival embedded in these collaborations. Our models are cross-validated to ensure generalizability, and we explain the predictions by identifying key network traits associated with improved outcomes. Importantly, clinical experts and the literature validate the relevance of the identified collaboration traits, reinforcing their potential for real-world applications. This work contributes a practical workflow for leveraging digital traces of collaboration and AI to assess and improve team-based healthcare. The approach is potentially transferable to other domains involving complex collaboration and offers actionable insights to support data-informed interventions in healthcare delivery.

2511.21035 2026-06-04 cs.LG 版本更新

RAVQ-HoloNet: Rate-Adaptive Vector-Quantized Hologram Compression

RAVQ-HoloNet:速率自适应向量量化全息图压缩

Shima Rafiei, Zahra Nabizadeh Shahr-Babak, Soroush Khoubyarian, Alexandre Cooper, Shadrokh Samavi, Shahram Shirani

发表机构 * Department of Electrical and Computer Engineering, McMaster University(麦克马斯特大学电气与计算机工程系) Department of Physics and Astronomy, University of Waterloo(滑铁卢大学物理与天文学系) Institute for Quantum Computing, Department of Physics and Astronomy, University of Waterloo(滑铁卢大学量子计算研究所) Computer Science Department, Seattle University(西雅图大学计算机科学系)

AI总结 提出RAVQ-HoloNet,一种集成速率自适应压缩与相位全息图变换的向量量化框架,在低比特率下实现高保真重建,性能超越现有方法。

详情
AI中文摘要

全息术为AR/VR应用提供了巨大潜力。然而,其应用受到数据压缩高需求的限制。现有的深度学习方法通常缺乏单一网络内的速率自适应性,往往需要多个模型来覆盖不同的带宽要求。我们提出了RAVQ-HoloNet,一种速率自适应向量量化框架,将速率自适应压缩与图像数据到纯相位全息图的变换相结合。RAVQ-HoloNet实现了高保真重建,通过两种不同的架构配置超越了当前最先进的方法:一种针对低比特率优化的标准模型,以及一种针对超低比特率设置的更深、扩展变体。为了评估这些模型,我们使用DIV2K数据集作为高保真全息重建的基准。模拟中的定量分析表明,我们的方法显著超越了当前基准。具体来说,在低比特率领域,相对于最先进的方法,我们的方法实现了-33.91%的BD-Rate降低和1.02dB的BD-PSNR增益。此外,在SLM设备上的实验结果表明,我们的方法实现了更高的对比度和改进的质量。

英文摘要

Holography offers significant potential for AR/VR applications. However, its adoption is limited by the high demand for data compression. Existing deep learning approaches generally lack rate adaptivity within a single network and often require multiple models to cover different bandwidth requirements. We present RAVQ-HoloNet, a rate-adaptive vector quantization framework that integrates rate-adaptive compression with the transformation of image data into phase-only holograms. RAVQ-HoloNet achieves high-fidelity reconstructions and outperforms current state-of-the-art methods in two distinct architectural configurations: a standard model optimized for low bit rates and a deeper, extended variant tailored for ultra-low bit rates. To evaluate these models, we use the DIV2K dataset as a benchmark for high-fidelity holographic reconstruction. Quantitative analysis in simulation shows that our approach significantly surpasses current benchmarks. Specifically, in the low-bit-rate regime, our method achieves a BD-Rate of -33.91% and a BD-PSNR gain of 1.02 dB relative to the state-of-the-art method. Additionally, experimental results on an SLM device show that our method achieves higher contrast and improved quality.

2511.12581 2026-06-04 cs.LG 版本更新

LMM-IR: Large-Scale Netlist-Aware Multimodal Framework for Static IR-Drop Prediction

LMM-IR:面向静态IR压降预测的大规模网表感知多模态框架

Kai Ma, Zhen Wang, Hongquan He, Qi Xu, Tinghuan Chen, Hao Geng

发表机构 * Tsinghua University(清华大学)

AI总结 提出一种基于大规模网表变换器和3D点云表示的多模态框架,用于快速准确地预测芯片静态IR压降,在ICCAD 2023竞赛中取得最佳F1分数和最低MAE。

Comments Accepted by DAC2025

详情
AI中文摘要

静态IR压降分析是芯片设计领域一项基础且关键的任务。然而,该过程可能相当耗时,有时需要数小时。此外,解决IR压降违规问题通常需要迭代分析,从而造成计算负担。因此,快速准确的IR压降预测对于减少芯片设计的总体投入时间至关重要。在本文中,我们首次提出了一种新颖的多模态方法,通过大规模网表变换器(LNT)高效处理SPICE文件。我们的关键创新在于将网表拓扑表示为3D点云并进行处理,从而能够高效处理节点数达数十万至数百万的网表。所有类型的数据,包括网表文件和图像数据,都被编码到潜在空间作为特征,并输入模型进行静态电压降预测。这使得来自多种模态的数据能够集成,实现互补预测。实验结果表明,我们提出的算法在ICCAD 2023竞赛的获胜团队和现有最优算法中,能够取得最佳F1分数和最低MAE。

英文摘要

Static IR drop analysis is a fundamental and critical task in chip design. However, the process can be time-consuming, sometimes requiring several hours. Moreover, addressing IR drop violations frequently demands iterative analysis, which adds to the computational burden. Fast and accurate IR drop prediction is therefore vital for reducing the overall time invested in chip design. In this paper, we propose, for the first time, a multimodal approach that efficiently processes SPICE files through a large-scale netlist transformer (LNT). Our key innovation is representing and processing netlist topology as a 3D point cloud, enabling efficient handling of netlists with hundreds of thousands to millions of nodes. All types of data, including netlist files and image data, are encoded into a latent space as features and fed into the model for static voltage drop prediction. This enables the integration of data from multiple modalities for complementary predictions. Experimental results demonstrate that our proposed algorithm achieves the best F1 score and the lowest MAE compared with the winning teams of the ICCAD 2023 contest and state-of-the-art algorithms.

2511.03304 2026-06-04 cs.LG cs.AI 版本更新

Extending Fair Null-Space Projections for Continuous Attributes to Kernel Methods

将连续属性的公平零空间投影扩展到核方法

Felix Störck, Fabian Hinder, Barbara Hammer


AI总结 提出将公平零空间投影扩展到核诱导特征空间,通过经验特征空间直接变换核矩阵,实现模型和公平评分无关的连续属性公平性方法,并在支持向量回归中展示竞争性或改进性能。

Comments Accepted to ICML 2026

详情
AI中文摘要

随着机器学习系统融入数百万人的日常社会生活,公平性在其发展中的优先级日益提高。公平性概念通常依赖受保护属性来评估潜在偏差。这里,大多数文献关注离散设置下的目标和受保护属性。关于连续属性尤其是与回归结合——我们称之为“连续公平性”——的文献很少。一种常见策略是迭代零空间投影,目前仅在线性模型或通过非线性编码器获得的嵌入中探索。我们通过“经验特征空间”将其扩展到核诱导特征空间,从而改进这一点。我们从理论上推导出这是核矩阵的直接变换,产生一种适用于连续受保护属性的模型和公平评分无关的方法。我们证明,与支持向量回归结合时,我们的新方法在多个数据集上相比其他当代方法具有竞争性或改进的性能。

英文摘要

With the ongoing integration of machine learning systems into the everyday social life of millions, fairness is becoming an ever-increasing priority in their development. Fairness notions commonly rely on protected attributes to assess potential biases. Most of the literature focuses on discrete setups for both target and protected attributes. The literature on continuous attributes, especially in conjunction with regression -- we refer to this as \emph{continuous fairness} -- is scarce. A common strategy is iterative null-space projection, which so far has only been explored for linear models or for embeddings such as those obtained by a non-linear encoder. We extend this strategy to kernel-induced feature spaces by means of the ``empirical feature space''. We theoretically derive this as a direct transformation of the kernel matrix, yielding a model- and fairness-score-agnostic method applicable to continuous protected attributes. We demonstrate that our novel approach, in conjunction with Support Vector Regression (SVR), provides competitive or improved performance across multiple datasets in comparison to other contemporary methods.
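
To make "null-space projection" concrete, here is a minimal linear sketch of a single projection step for a continuous protected attribute. It is not the paper's kernel method: it works on raw features, and it assumes the cross-covariance direction between features and attribute stands in for a learned linear regressor. The function name `project_out_attribute` and the toy data are hypothetical.

```python
from typing import List, Sequence


def project_out_attribute(X: Sequence[Sequence[float]], s: Sequence[float]) -> List[List[float]]:
    """Centre X, then project it onto the null space of w^T, where w = Xc^T sc.

    w is the (unnormalised) cross-covariance between features and the
    protected attribute s. This choice of direction is an assumption made
    for illustration, not the paper's construction.
    """
    n, d = len(X), len(X[0])
    mean_s = sum(s) / n
    means = [sum(row[j] for row in X) / n for j in range(d)]
    Xc = [[row[j] - means[j] for j in range(d)] for row in X]
    sc = [v - mean_s for v in s]

    # Direction along which the features co-vary with the protected attribute.
    w = [sum(Xc[i][j] * sc[i] for i in range(n)) for j in range(d)]
    norm2 = sum(v * v for v in w)
    if norm2 == 0.0:
        return Xc  # nothing to remove

    projected = []
    for row in Xc:
        coef = sum(row[j] * w[j] for j in range(d)) / norm2
        # Apply P = I - w w^T / ||w||^2 to the row.
        projected.append([row[j] - coef * w[j] for j in range(d)])
    return projected


if __name__ == "__main__":
    X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0], [4.0, 1.0]]
    s = [1.0, 2.0, 3.0, 4.0]  # continuous protected attribute
    Z = project_out_attribute(X, s)
    s_bar = sum(s) / len(s)
    for j in range(2):
        cov = sum(Z[i][j] * (s[i] - s_bar) for i in range(len(s)))
        print(abs(cov) < 1e-9)
```

With this particular direction, one step already removes all linear covariance between the projected features and the attribute. Iterative schemes repeat a step like this with a freshly trained predictor each round.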

2408.04607 2026-06-04 stat.ML cond-mat.dis-nn cs.LG 版本更新

Risk and cross validation in ridge regression with correlated samples

带相关样本的岭回归中的风险与交叉验证

Alexander Atanasov, Jacob A. Zavatone-Veth, Cengiz Pehlevan

发表机构 * Department of Physics, Harvard University(哈佛大学物理系) Center for Brain Science, Harvard University(哈佛大学脑科学中心) Society of Fellows, Harvard University(哈佛大学学者学会) John A. Paulson School of Engineering and Applied Sciences, Harvard University(哈佛大学约翰·A·保尔森工程与应用科学学院) Kempner Institute for the Study of Natural and Artificial Intelligence, Harvard University(哈佛大学肯普纳自然与人工智能研究所)

AI总结 利用随机矩阵理论和自由概率,研究了数据点具有任意相关性时岭回归的渐近风险,并提出了修正的广义交叉验证估计器CorrGCV,同时扩展到测试点与训练集相关的情况。

Comments 50 pages, 19 figures. v4: ICML 2025 camera-ready. v5: Fix typo in statement of Theorem 5. v6: typos corrected, to appear in 2026 JSTAT Machine Learning focus collection

详情
Journal ref
International Conference on Machine Learning (2025), https://proceedings.mlr.press/v267/atanasov25a.html
AI中文摘要

近年来,我们对高维岭回归的理解取得了实质性进展,但现有理论假设训练样本是独立的。通过利用随机矩阵理论和自由概率的技术,我们为数据点具有任意相关性时岭回归的样本内和样本外风险提供了精确的渐近结果。我们证明,在这种情况下,广义交叉验证估计器(GCV)无法正确预测样本外风险。然而,当噪声残差与数据点具有相同相关性时,可以修改GCV以产生一个在高维极限下集中的高效可计算无偏估计器,我们称之为CorrGCV。我们进一步将渐近分析扩展到测试点与训练集具有非平凡相关性的情况,这是时间序列预测中经常遇到的情况。假设已知时间序列的相关结构,这再次产生了GCV估计器的扩展,并精确刻画了此类测试点对长期风险产生过于乐观预测的程度。我们在各种高维数据上验证了理论的预测。

英文摘要

Recent years have seen substantial advances in our understanding of high-dimensional ridge regression, but existing theories assume that training examples are independent. By leveraging techniques from random matrix theory and free probability, we provide sharp asymptotics for the in- and out-of-sample risks of ridge regression when the data points have arbitrary correlations. We demonstrate that in this setting, the generalized cross validation estimator (GCV) fails to correctly predict the out-of-sample risk. However, in the case where the noise residuals have the same correlations as the data points, one can modify the GCV to yield an efficiently computable unbiased estimator that concentrates in the high-dimensional limit, which we dub CorrGCV. We further extend our asymptotic analysis to the case where the test point has nontrivial correlations with the training set, a setting often encountered in time series forecasting. Assuming knowledge of the correlation structure of the time series, this again yields an extension of the GCV estimator, and sharply characterizes the degree to which such test points yield an overly optimistic prediction of long-time risk. We validate the predictions of our theory across a variety of high-dimensional data.

2511.03000 2026-06-04 stat.ML cs.IT cs.LG math.IT 版本更新

Unifying Information-Theoretic and Pair-Counting Clustering Similarity

统一信息论与配对计数的聚类相似性

Alexander J. Gates

发表机构 * School of Data Science, University of Virginia(数据科学学院,弗吉尼亚大学)

AI总结 本文通过加权展开和高阶扩展两个视角,统一了配对计数与信息论两类聚类相似性度量,揭示了它们之间的分析联系。

Comments 23 pages, 2 figures

详情
AI中文摘要

比较聚类结果对于评估无监督模型至关重要,然而现有的许多相似性度量可能产生广泛分歧、有时甚至矛盾的评估。聚类相似性度量通常分为两大族:配对计数和信息论,分别反映它们是通过元素对还是通过完整聚类列联表的聚合信息来量化一致性。先前的工作已发现这些族之间的相似性,并应用了经验归一化或机会校正方案,但它们更深层的分析联系仍仅部分被理解。在此,我们开发了一个分析框架,通过两个互补视角统一这些族。首先,两个族都表示为观察到的与期望的共现的加权展开,配对计数作为二次低阶近似出现,而信息论度量作为高阶频率加权扩展。其次,我们将配对计数推广到k元组一致性,并表明信息论度量可以被视为系统性地累积超出成对水平的高阶共分配结构。我们针对Rand指数和互信息从分析上说明了这些方法,并展示了每个族中的其他指数如何作为自然扩展出现。总之,这些观点阐明了两个体系何时以及为何产生分歧,将它们的敏感性直接与权重和近似阶数联系起来,并为跨应用选择、解释和扩展聚类相似性度量提供了原则性基础。

英文摘要

Comparing clusterings is central to evaluating unsupervised models, yet the many existing similarity measures can produce widely divergent, sometimes contradictory, evaluations. Clustering similarity measures are typically organized into two principal families, pair-counting and information-theoretic, reflecting whether they quantify agreement through element pairs or aggregate information across full cluster contingency tables. Prior work has uncovered parallels between these families and applied empirical normalization or chance-correction schemes, but their deeper analytical connection remains only partially understood. Here, we develop an analytical framework that unifies these families through two complementary perspectives. First, both families are expressed as weighted expansions of observed versus expected co-occurrences, with pair-counting arising as a quadratic, low-order approximation and information-theoretic measures as higher-order, frequency-weighted extensions. Second, we generalize pair-counting to k-tuple agreement and show that information-theoretic measures can be viewed as systematically accumulating higher-order co-assignment structure beyond the pairwise level. We illustrate the approaches analytically for the Rand index and Mutual Information, and show how other indices in each family emerge as natural extensions. Together, these views clarify when and why the two regimes diverge, relating their sensitivities directly to weighting and approximation order, and provide a principled basis for selecting, interpreting, and extending clustering similarity measures across applications.
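
For readers who want the two reference measures in concrete form, the sketch below computes the Rand index (pair-counting) and mutual information (information-theoretic) from the same contingency counts. These are the standard textbook definitions, not the paper's unified expansion. The function names and toy labelings are illustrative assumptions.

```python
from collections import Counter
from math import comb, log
from typing import Hashable, Sequence


def rand_index(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Fraction of element pairs on which the two clusterings agree."""
    n = len(a)
    if n != len(b) or n < 2:
        raise ValueError("need two labelings of the same length >= 2")
    total = comb(n, 2)
    together_both = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    together_a = sum(comb(c, 2) for c in Counter(a).values())
    together_b = sum(comb(c, 2) for c in Counter(b).values())
    # Pairs separated in both clusterings, by inclusion-exclusion.
    apart_both = total - together_a - together_b + together_both
    return (together_both + apart_both) / total


def mutual_information(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Mutual information (in nats) of the cluster contingency table."""
    n = len(a)
    if n != len(b) or n == 0:
        raise ValueError("need two non-empty labelings of the same length")
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    return sum((c / n) * log(c * n / (pa[i] * pb[j])) for (i, j), c in pab.items())


if __name__ == "__main__":
    x = [0, 0, 1, 1]
    print(rand_index(x, [1, 1, 0, 0]))          # 1.0 (same partition, relabelled)
    print(rand_index(x, [0, 1, 0, 1]))          # 0.3333333333333333
    print(mutual_information(x, [0, 1, 0, 1]))  # 0.0 (independent partitions)
```

The Rand index only ever looks at pairs, while mutual information weights every cell of the full contingency table.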

2505.24528 2026-06-04 cs.CV cs.LG 版本更新

Geospatial Foundation Models to Enable Progress on Sustainable Development Goals

地理空间基础模型推动可持续发展目标的进展

Pedram Ghamisi, Weikang Yu, Xiaokang Zhang, Aldino Rizaldy, Jian Wang, Chufeng Zhou, Richard Gloaguen, Gustau Camps-Valls

发表机构 * Helmholtz-Zentrum Dresden-Rossendorf(亥姆霍兹德累斯顿-罗森多夫研究中心) University of Iceland(冰岛大学) Wuhan University(武汉大学) Wuhan University of Science and Technology(武汉科技大学) Universitat de València(瓦伦西亚大学)

AI总结 本文提出SustainFM基准框架,基于17个可持续发展目标评估地理空间基础模型,发现其在多样任务中优于传统方法,并强调需从模型中心转向影响驱动部署,关注能效、泛化性和伦理。

详情
AI中文摘要

基础模型(FMs)是大规模预训练的人工智能系统,已革新自然语言处理和计算机视觉,并正在推进地理空间分析和地球观测(EO)。它们承诺在任务间改进泛化、可扩展性以及用最少标注数据高效适应。然而,尽管地理空间FMs迅速激增,其现实世界效用和与全球可持续发展目标的一致性仍未充分探索。我们提出SustainFM,一个基于17个可持续发展目标的全面基准框架,涵盖从资产财富预测到环境危害检测的极其多样化的任务。本研究提供了对地理空间FMs的严格、跨学科评估,并对其在实现可持续发展目标中的作用提供了关键见解。我们的发现表明:(1)虽然并非普遍优越,但FMs在多样任务和数据集上通常优于传统方法。(2)评估FMs应超越准确性,将可迁移性、泛化性和能效作为其负责任使用的关键标准。(3)FMs支持可扩展的、基于SDG的解决方案,为应对复杂可持续发展挑战提供广泛实用性。关键的是,我们倡导从以模型为中心的发展转向以影响驱动的部署,并强调能效、对领域变化的鲁棒性以及伦理考量等指标。

英文摘要

Foundation Models (FMs) are large-scale, pre-trained artificial intelligence (AI) systems that have revolutionized natural language processing and computer vision, and are now advancing geospatial analysis and Earth Observation (EO). They promise improved generalization across tasks, scalability, and efficient adaptation with minimal labeled data. However, despite the rapid proliferation of geospatial FMs, their real-world utility and alignment with global sustainability goals remain underexplored. We introduce SustainFM, a comprehensive benchmarking framework grounded in the 17 Sustainable Development Goals with extremely diverse tasks ranging from asset wealth prediction to environmental hazard detection. This study provides a rigorous, interdisciplinary assessment of geospatial FMs and offers critical insights into their role in attaining sustainability goals. Our findings show: (1) While not universally superior, FMs often outperform traditional approaches across diverse tasks and datasets. (2) Evaluating FMs should go beyond accuracy to include transferability, generalization, and energy efficiency as key criteria for their responsible use. (3) FMs enable scalable, SDG-grounded solutions, offering broad utility for tackling complex sustainability challenges. Critically, we advocate for a paradigm shift from model-centric development to impact-driven deployment, and emphasize metrics such as energy efficiency, robustness to domain shifts, and ethical considerations.

2502.00470 2026-06-04 math.OC cs.LG stat.ML 版本更新

On the Relationship Between CoCoA and ADMM for Distributed Empirical Risk Minimization

关于CoCoA与ADMM在分布式经验风险最小化中的关系

Runxiong Wu, Andi Wang

发表机构 * Department of Industrial & Systems Engineering, University of Wisconsin–Madison(工业与系统工程系,威斯康星大学麦迪逊分校)

AI总结 本文从统一原始-对偶视角揭示CoCoA与ADMM两类分布式ERM算法的内在联系,证明岭正则化下CoCoA等价于特定近端ADMM方案,并给出ADMM型方法的统一收敛分析和早停准则。

Comments 21 pages, 4 figures, 1 table

详情
Journal ref
Published in Transactions on Machine Learning Research (06/2026)
AI中文摘要

分布式经验风险最小化(ERM)通常通过两类有影响力但看似独立的方法来研究:源自分布式对偶坐标上升的CoCoA型算法,以及源自共识和近端分裂的ADMM型算法。本文从统一的原始-对偶视角研究这两类算法的联系。我们证明共识ADMM、线性化共识ADMM、两种分布式近端ADMM变体以及岭正则化CoCoA都可以写成一种涉及全局原始变量和块对偶变量的通用更新形式。这种重新表述使几个先前隐藏的联系变得明确:对于岭正则化ERM,CoCoA在对偶更新层面上与特定的近端ADMM方案一致。此外,原始问题上的共识ADMM等价于对偶问题上的近端ADMM,并具有显式参数映射以及鞍点目标符号反转;线性化变体也存在类似的对应关系。这些结果表明,在岭正则化ERM问题下,经过精细调参的ADMM型算法至少与CoCoA性能相当。统一视角还为共识ADMM提供了自然的原始-对偶间隙早停准则,并为ADMM型方法提供了统一的$O(1/T)$遍历收敛分析。在合成回归问题和真实SVM数据集上的实验支持了预测的关系,阐明了调参的作用,并表明适当调参的ADMM变体在岭正则化设置下可以优于CoCoA。

英文摘要

Distributed empirical risk minimization (ERM) is often studied through two influential yet seemingly separate families of methods: CoCoA-type algorithms, derived from distributed dual coordinate ascent, and ADMM-type algorithms, derived from consensus and proximal splitting. In this paper, we investigate the connection between the two types of algorithms from a unified primal-dual perspective. We show that consensus ADMM, linearized consensus ADMM, two distributed proximal ADMM variants, and ridge-regularized CoCoA can all be written in a common update form involving a global primal variable and block dual variables. This reformulation makes several previously hidden connections explicit: for ridge-regularized ERM, CoCoA coincides with a particular proximal ADMM scheme at the level of the dual update. Moreover, consensus ADMM on the primal problem is equivalent to proximal ADMM on the dual problem under an explicit parameter mapping together with a sign reversal of the saddle objective; similar correspondences also hold for the linearized variants. These results indicate that ADMM-type algorithms, when finely tuned, perform at least as well as CoCoA on ridge-regularized ERM problems. The unified view also yields a natural primal-dual gap stopping criterion for consensus ADMM and a unified $O(1/T)$ ergodic convergence analysis for the ADMM-type methods. Experiments on synthetic regression problems and real SVM datasets support the predicted relationships, clarify the role of tuning parameters, and show that suitably tuned ADMM variants can outperform CoCoA in the ridge-regularized setting.
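
As a small worked instance of the "global primal variable plus block dual variables" structure, the sketch below runs textbook scaled-form consensus ADMM on ridge-regularized least squares with a single scalar weight. It assumes the objective sum_k 0.5 * sum_i (x_i w - y_i)^2 + (lam / 2) w^2 and is not the paper's unified update form. The function name `consensus_admm_ridge`, the parameters `lam`, `rho`, `iters`, and the toy shards are hypothetical.

```python
from typing import List, Sequence, Tuple

Shard = Tuple[Sequence[float], Sequence[float]]  # (inputs, targets) held by one worker


def consensus_admm_ridge(shards: Sequence[Shard], lam: float,
                         rho: float = 1.0, iters: int = 200) -> float:
    """Scaled-form consensus ADMM for scalar ridge regression; returns the global z."""
    a = [sum(x * x for x in xs) for xs, _ in shards]               # local sum of x^2
    b = [sum(x * y for x, y in zip(xs, ys)) for xs, ys in shards]  # local sum of x*y
    k = len(shards)
    z = 0.0                    # global primal (consensus) variable
    u: List[float] = [0.0] * k  # one scaled dual variable per worker
    for _ in range(iters):
        # Local step: argmin_w 0.5*sum (x w - y)^2 + (rho/2)(w - z + u_i)^2
        w = [(b[i] + rho * (z - u[i])) / (a[i] + rho) for i in range(k)]
        # Global step: argmin_z (lam/2) z^2 + (rho/2) sum_i (w_i - z + u_i)^2
        z = rho * sum(w[i] + u[i] for i in range(k)) / (lam + k * rho)
        # Dual step: accumulate each worker's disagreement with the consensus.
        u = [u[i] + w[i] - z for i in range(k)]
    return z


if __name__ == "__main__":
    shards = [([1.0, 2.0], [1.0, 2.0]), ([3.0], [3.0])]
    closed_form = 14.0 / (14.0 + 1.0)  # sum(x*y) / (sum(x^2) + lam)
    print(abs(consensus_admm_ridge(shards, lam=1.0) - closed_form) < 1e-8)
```

The scalar case keeps every subproblem in closed form, so the three updates (local primal, global primal, dual) can be checked against the closed-form ridge solution by hand.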

2509.23385 2026-06-04 stat.ML cs.LG 版本更新

Flow Matching Calibration for Simulation-Based Inference under Model Misspecification

模型误设定下基于模拟推断的流匹配校准

Pierre-Louis Ruhlmann, Michael Arbel, Florence Forbes, Pedro L. C. Rodrigues

发表机构 * Institut national de physique de la matière (CNRS UMR 7586)(物质物理国家研究院(CNRS UMR 7586))

AI总结 针对基于模拟推断中模型误设定导致的偏差,提出流匹配校正后验估计方法,通过少量校准样本利用流匹配范式修正后验估计器,提高推断准确性和不确定性量化。

详情
AI中文摘要

基于模拟的推断(SBI)通过从模拟数据中估计复杂非线性模型的参数,正在变革实验科学。然而,一个持续的挑战是模型误设定。在贝叶斯设置中,针对后验分布,误差可能来自模拟器、噪声或先验建模。这些模型组件只是现实世界的近似,严重的不匹配可能导致有偏或过于自信的后验。我们通过引入流匹配校正后验估计(FMCPE)来解决这个问题,该框架利用流匹配范式,使用少量校准样本细化基于模拟训练的后验估计器。我们的方法分两个阶段进行:首先,在大量模拟数据上训练后验近似器;其次,流匹配将其预测向由校准观测支持的真实后验传输。我们依靠后者来指导校正,无需明确知道误设定形式或哪些模型组件受到影响。这种设计使FMCPE能够结合SBI的可扩展性和对分布偏移的鲁棒性。在合成基准和真实世界数据集上,我们表明我们的提议一致地减轻了误设定的影响,与标准SBI基线相比,提供了改进的推断准确性和不确定性量化,同时保持计算效率。

英文摘要

Simulation-based inference (SBI) is transforming experimental sciences by enabling parameter estimation in complex non-linear models from simulated data. A persistent challenge, however, is model misspecification. In a Bayesian setting, targeting posterior distributions, errors may arise from the simulator, the noise, or the prior modelling. These model components are only approximations of reality, and severe mismatches can yield biased or overconfident posteriors. We address this issue by introducing Flow Matching Corrected Posterior Estimation (FMCPE), a framework that leverages the flow matching paradigm to refine simulation-trained posterior estimators using a small set of calibration samples. Our approach proceeds in two stages: first, a posterior approximator is trained on abundant simulated data; second, flow matching transports its predictions toward the true posterior supported by calibration observations. We rely on the latter to guide the correction, without requiring explicit knowledge of the misspecification form or of which model components are affected. This design enables FMCPE to combine the scalability of SBI with robustness to distributional shift. Across synthetic benchmarks and real-world datasets, we show that our proposal consistently mitigates the effects of misspecification, delivering improved inference accuracy and uncertainty quantification compared to standard SBI baselines, while remaining computationally efficient.

2510.13704 2026-06-04 cs.LG cs.AI cs.RO 版本更新

Simplicial Embeddings Improve Sample Efficiency in Actor-Critic Agents

单纯形嵌入提升Actor-Critic智能体的样本效率

Johan Obando-Ceron, Walter Mayor, Samuel Lavoie, Scott Fujimoto, Aaron Courville, Pablo Samuel Castro

发表机构 * Mila – Québec AI Institute(魁北克人工智能研究所) Université de Montréal(蒙特利尔大学) McGill University(麦吉尔大学) CIFAR AI Chair(CIFAR人工智能主席)

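AI summary Actor-critic methods can still need many environment interactions even with large-scale environment parallelization. The paper proposes simplicial embeddings, lightweight representation layers whose geometric inductive bias yields sparse, discrete features that stabilize critic bootstrapping and strengthen policy gradients, consistently improving sample efficiency and final performance in FastTD3, FastSAC, and PPO.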

详情
AI中文摘要

最近的工作提出通过大规模环境并行化来加速actor-critic方法的挂钟训练时间;不幸的是,这些方法有时仍需要大量的环境交互才能达到期望的性能水平。注意到结构良好的表示可以改善深度强化学习(RL)智能体的泛化能力和样本效率,我们提出使用单纯形嵌入:将嵌入约束到单纯形结构的轻量级表示层。这种几何归纳偏置产生稀疏且离散的特征,稳定了评论家引导并强化了策略梯度。当应用于FastTD3、FastSAC和PPO时,单纯形嵌入在多种连续和离散控制环境中一致提高了样本效率和最终性能,且不损失运行速度。

英文摘要

Recent works have proposed accelerating the wall-clock training time of actor-critic methods via the use of large-scale environment parallelization; unfortunately, these can sometimes still require a large number of environment interactions to achieve a desired level of performance. Noting that well-structured representations can improve the generalization and sample efficiency of deep reinforcement learning (RL) agents, we propose the use of simplicial embeddings: lightweight representation layers that constrain embeddings to simplicial structures. This geometric inductive bias results in sparse and discrete features that stabilize critic bootstrapping and strengthen policy gradients. When applied to FastTD3, FastSAC, and PPO, simplicial embeddings consistently improve sample efficiency and final performance across a variety of continuous- and discrete-control environments, without any loss in runtime speed.
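To make the idea of a representation layer that constrains embeddings to simplicial structures concrete, here is a minimal NumPy sketch. It assumes the common formulation in which a feature vector is split into equal-sized groups and a temperature-scaled softmax is applied within each group, so every group lies on a probability simplex. The function name, group count, and temperature are illustrative assumptions, not details taken from the abstract.

```python
import numpy as np


def simplicial_embedding(z, num_groups, temperature=1.0):
    """Project each group of features onto a probability simplex (illustrative sketch)."""
    z = np.asarray(z, dtype=np.float64)
    if z.shape[-1] % num_groups != 0:
        raise ValueError("feature dimension must be divisible by num_groups")
    group_size = z.shape[-1] // num_groups
    # Split the last axis into (num_groups, group_size) and scale by the temperature.
    grouped = z.reshape(*z.shape[:-1], num_groups, group_size) / temperature
    # Numerically stable softmax within each group.
    grouped = grouped - grouped.max(axis=-1, keepdims=True)
    exp = np.exp(grouped)
    probs = exp / exp.sum(axis=-1, keepdims=True)
    # Concatenate the groups back into one feature vector.
    return probs.reshape(z.shape)


# Hypothetical batch of 4 encoder outputs with 12 features, split into 3 groups of 4.
features = np.random.default_rng(0).normal(size=(4, 12))
sem = simplicial_embedding(features, num_groups=3, temperature=0.5)
```

In this sketch a lower temperature pushes each group toward a one-hot vector, which is one way to obtain the sparse, near-discrete features the abstract describes.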

2510.03511 2026-06-04 cs.CV cs.AI cs.LG eess.IV 版本更新

Platonic Transformers: A Solid Choice For Equivariance

柏拉图式Transformer:等变性的坚实选择

Mohammad Mohaiminul Islam, Rishabh Anand, David R. Wessels, Friso de Kruiff, Thijs P. Kuipers, Rex Ying, Clara I. Sánchez, Sharvaree Vadgama, Georg Bökman, Erik J. Bekkers

发表机构 * University of California, Berkeley(加州大学伯克利分校)

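AI summary Proposes the Platonic Transformer, which achieves equivariance by defining attention relative to reference frames from the Platonic solid symmetry groups, reaching competitive performance at no additional computational cost.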

详情
AI中文摘要

尽管Transformer广泛应用,但缺乏科学和计算机视觉中常见几何对称性的归纳偏置。现有的等变方法往往通过复杂、计算密集的设计牺牲了Transformer的高效性和灵活性。我们引入Platonic Transformer来解决这一权衡。通过将注意力定义为相对于柏拉图立体对称群参考帧,我们的方法引入了一种有原则的权重共享方案。这使得模型能够同时对连续平移和柏拉图对称性保持等变,同时保留标准Transformer的精确架构和计算成本。此外,我们证明这种注意力在形式上等价于动态群卷积,这表明模型学习自适应几何滤波器,并实现高度可扩展的线性时间卷积变体。在计算机视觉(CIFAR-10)、3D点云(ScanObjectNN)和分子性质预测(QM9、OMol25)等多个基准测试中,Platonic Transformer通过利用这些几何约束以零额外成本取得了有竞争力的性能。

英文摘要

While widespread, Transformers lack inductive biases for geometric symmetries common in science and computer vision. Existing equivariant methods often sacrifice the efficiency and flexibility that make Transformers so effective through complex, computationally intensive designs. We introduce the Platonic Transformer to resolve this trade-off. By defining attention relative to reference frames from the Platonic solid symmetry groups, our method induces a principled weight-sharing scheme. This enables combined equivariance to continuous translations and Platonic symmetries, while preserving the exact architecture and computational cost of a standard Transformer. Furthermore, we show that this attention is formally equivalent to a dynamic group convolution, which reveals that the model learns adaptive geometric filters and enables a highly scalable, linear-time convolutional variant. Across diverse benchmarks in computer vision (CIFAR-10), 3D point clouds (ScanObjectNN), and molecular property prediction (QM9, OMol25), the Platonic Transformer achieves competitive performance by leveraging these geometric constraints at no additional cost.

2510.01902 2026-06-04 cs.AI cs.CL cs.LG 版本更新

Constrained Adaptive Rejection Sampling

约束自适应拒绝采样

Paweł Parys, Sairam Vaidya, Taylor Berg-Kirkpatrick, Loris D'Antoni

发表机构 * University of California, Berkeley(加州大学伯克利分校)

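AI summary Proposes Constrained Adaptive Rejection Sampling (CARS), which improves the sample efficiency of rejection sampling by adaptively pruning invalid prefixes without distorting the distribution, and outperforms existing methods on tasks such as program fuzzing and molecular generation.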

详情
AI中文摘要

语言模型(LMs)越来越多地应用于生成的输出必须满足严格语义或语法约束的场景。现有的约束生成方法处于一个谱系中:贪婪约束解码方法在解码过程中强制执行有效性,但扭曲了LM的分布;而拒绝采样(RS)保留了保真度,但通过丢弃无效输出浪费计算资源。在程序模糊测试等领域,样本的有效性和多样性都至关重要,这两种极端方法都有问题。我们提出约束自适应拒绝采样(CARS),一种严格提高RS样本效率且不产生分布扭曲的方法。CARS从无约束LM采样开始,通过将违反约束的续写记录在trie中并从后续抽取中减去其概率质量,自适应地排除它们。这种自适应剪枝确保已证明无效的前缀不会被重新访问,接受率单调提高,并且生成的样本精确遵循约束分布。在多个领域的实验(例如程序模糊测试和分子生成)中,CARS始终实现更高的效率(以每个有效样本的LM前向传递次数衡量),同时产生比GCD和近似LM分布的方法更强的样本多样性。

英文摘要

Language Models (LMs) are increasingly used in applications where generated outputs must satisfy strict semantic or syntactic constraints. Existing approaches to constrained generation fall along a spectrum: greedy constrained decoding methods enforce validity during decoding but distort the LM's distribution, while rejection sampling (RS) preserves fidelity but wastes computation by discarding invalid outputs. Both extremes are problematic in domains such as program fuzzing, where both validity and diversity of samples are essential. We present Constrained Adaptive Rejection Sampling (CARS), an approach that strictly improves the sample-efficiency of RS without distributional distortion. CARS begins with unconstrained LM sampling and adaptively rules out constraint-violating continuations by recording them in a trie and subtracting their probability mass from future draws. This adaptive pruning ensures that prefixes proven invalid are never revisited, acceptance rates improve monotonically, and the resulting samples exactly follow the constrained distribution. In experiments on a variety of domains -- e.g., program fuzzing and molecular generation -- CARS consistently achieves higher efficiency -- measured in the number of LM forward passes per valid sample -- while also producing stronger sample diversity than both GCD and methods that approximate the LM's distribution.
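The trie-based pruning described in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: the "LM" is a hypothetical fixed next-token distribution over two tokens, sequences have a fixed length of 3, and the constraint (no two consecutive `a` tokens) is invented for illustration. Each trie node stores the fraction of its prefix's probability mass already known to be invalid, and sampling reweights each token by the mass that remains.

```python
import random

VOCAB = ["a", "b"]
LENGTH = 3  # fixed sequence length for the toy example


def next_token_probs(prefix):
    # Hypothetical stand-in for an LM: the same distribution at every step.
    return {"a": 0.7, "b": 0.3}


def is_valid(seq):
    # Toy constraint: the sequence must not contain two consecutive "a" tokens.
    return "aa" not in "".join(seq)


class Node:
    def __init__(self):
        self.children = {}
        self.invalid = 0.0  # fraction of this prefix's mass known to be invalid


def sample_valid(root, rng, max_tries=1000):
    """Draw one valid sequence; return it with the number of attempts used."""
    for attempt in range(1, max_tries + 1):
        path, seq = [root], []
        for _ in range(LENGTH):
            probs = next_token_probs(seq)
            node = path[-1]
            weights = []
            for tok in VOCAB:
                child = node.children.get(tok)
                remaining = 1.0 if child is None else max(0.0, 1.0 - child.invalid)
                weights.append(probs[tok] * remaining)
            tok = rng.choices(VOCAB, weights=weights)[0]
            seq.append(tok)
            path.append(node.children.setdefault(tok, Node()))
        if is_valid(seq):
            return seq, attempt
        # Record the failure: mark the leaf invalid and propagate the mass upward.
        path[-1].invalid = 1.0
        for depth in range(LENGTH - 1, -1, -1):
            probs = next_token_probs(seq[:depth])
            node = path[depth]
            node.invalid = sum(probs[t] * c.invalid for t, c in node.children.items())
    raise RuntimeError("no valid sample found")


rng = random.Random(0)
root = Node()  # the trie is shared across draws, so pruning accumulates
draws = [sample_valid(root, rng) for _ in range(200)]
total_rejections = sum(attempts - 1 for _, attempts in draws)
```

Because the probability of choosing a child is its LM probability times its remaining mass, normalized by the parent's remaining mass, the product along a path telescopes to the LM probability divided by a constant. Accepted samples therefore follow the LM distribution restricted to valid sequences, and in this toy setting each of the three invalid sequences can be drawn at most once.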

2505.15497 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Certified Neural Approximations of Nonlinear Dynamics

非线性动力学的认证神经逼近

Frederik Baymler Mathiesen, Nikolaus Vertovec, Francesco Fabiano, Luca Laurenti, Alessandro Abate

发表机构 * Delft Center for Systems and Control(代尔夫特系统与控制中心) Department of Computer Science, University of Oxford(牛津大学计算机科学系) The Italian Institute of Artificial Intelligence (AI4I)(意大利人工智能研究所(AI4I))

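AI summary Proposes an adaptive, parallelizable verification method based on certified first-order models that provides formal error bounds for neural approximations of nonlinear dynamics, so they can be used safely as surrogate models; it significantly outperforms existing methods on several benchmarks.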

Comments first and second author contributed equally

详情
AI中文摘要

神经网络作为非线性动力系统的近似模型具有巨大潜力,由此产生的神经逼近能够实现对此类系统的验证和控制。然而,在安全关键背景下,使用神经逼近需要对其与底层系统的接近程度有形式化界限。为了解决这一基本挑战,我们提出了一种新颖的、自适应的、可并行化的验证方法,基于认证的一阶模型。我们的方法为动力系统的神经逼近提供了形式化误差界,通过将误差界解释为作用于近似动力学的有界扰动,使得它们能够安全地用作替代模型。我们在文献中的一系列既定基准测试上展示了我们方法的有效性和可扩展性,表明它显著优于现有技术。此外,我们展示了我们的框架能够成功解决现有方法以前无法处理的额外场景——神经网络压缩和基于自编码器的深度学习架构,用于训练Koopman算子以进行轨迹预测。

英文摘要

Neural networks hold great potential to act as approximate models of nonlinear dynamical systems, with the resulting neural approximations enabling verification and control of such systems. However, in safety-critical contexts, the use of neural approximations requires formal bounds on their closeness to the underlying system. To address this fundamental challenge, we propose a novel, adaptive, and parallelizable verification method based on certified first-order models. Our approach provides formal error bounds on the neural approximations of dynamical systems, allowing them to be safely employed as surrogates by interpreting the error bound as bounded disturbances acting on the approximated dynamics. We demonstrate the effectiveness and scalability of our method on a range of established benchmarks from the literature, showing that it significantly outperforms the state of the art. Furthermore, we show that our framework can successfully address additional scenarios previously intractable for existing methods -- neural network compression and an autoencoder-based deep learning architecture for training Koopman operators for the purpose of trajectory prediction.

2509.22454 2026-06-04 cs.LG 版本更新

Overclocking Electrostatic Generative Models

超频静电生成模型

Daniil Shlenskii, Alexander Korotin

发表机构 * University of California, Berkeley(加州大学伯克利分校)

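AI summary Proposes Inverse Poisson Flow Matching (IPFM), a distillation framework that accelerates electrostatic generative models for all values of the auxiliary dimensionality $D$, enabling few-step sampling with near-teacher or even better sample quality.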

详情
AI中文摘要

诸如PFGM++等静电生成模型最近作为一种强大的框架出现,在图像合成中取得了竞争性能。PFGM++在具有辅助维度$D$的扩展数据空间中运行,当$D\to\infty$时恢复扩散模型框架,而在有限$D$下产生更优的经验结果。与扩散模型一样,PFGM++依赖昂贵的ODE模拟来生成样本,计算成本高。为解决此问题,我们提出逆泊松流匹配(IPFM),一个原则性的蒸馏框架,可加速所有$D$值下的静电生成模型。我们的IPFM将蒸馏重新表述为一个逆问题:学习一个生成器,其诱导的静电场与教师模型匹配。我们为该问题推导了一个可处理的训练目标,并表明当$D\to\infty$时,我们的IPFM紧密恢复分数恒等蒸馏(SiD),一种最近用于蒸馏扩散模型的方法。实验上,我们的IPFM生成的蒸馏生成器仅需少量函数评估即可达到接近教师甚至更优的样本质量。此外,我们发现单步生成器蒸馏在有限$D$下比在$D\to\infty$扩散极限下收敛更快,这与先前证据一致,即有限$D$的PFGM++模型提供更有利的优化和采样行为。

英文摘要

Electrostatic generative models such as PFGM++ have recently emerged as a powerful framework, achieving competitive performance in image synthesis. PFGM++ operates in an extended data space with auxiliary dimensionality $D$, recovering the diffusion model framework as $D\to\infty$, while yielding superior empirical results for finite $D$. Like diffusion models, PFGM++ relies on expensive ODE simulations to generate samples, making it computationally costly. To address this, we propose Inverse Poisson Flow Matching (IPFM), a principled distillation framework that accelerates electrostatic generative models across all values of $D$. Our IPFM reformulates distillation as an inverse problem: learning a generator whose induced electrostatic field matches that of the teacher. We derive a tractable training objective for this problem and show that, as $D\to\infty$, our IPFM closely recovers Score Identity Distillation (SiD), a recent method for distilling diffusion models. Empirically, our IPFM produces distilled generators that achieve near-teacher or even superior sample quality using only a few function evaluations. Moreover, we find that one-step generator distillation converges faster at finite $D$ than in the $D\to\infty$ diffusion limit, aligning with prior evidence that finite-$D$ PFGM++ models offer more favorable optimization and sampling behavior.

2505.22988 2026-06-04 cs.LG cs.AI 版本更新

Model-Preserving Adaptive Rounding

模型保持的自适应舍入

Albert Tseng, Zhaofeng Sun, Christopher De Sa

发表机构 * Department of Computer Science, Cornell University(康奈尔大学计算机科学系)

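AI summary Proposes YAQA, an adaptive rounding quantization algorithm that directly considers the error at the network's output. Its theoretical analysis gives the first end-to-end error bounds, and a Kronecker-factored Hessian approximation reduces error by about 30% relative to GPTQ/LDLQ with no inference overhead.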

Comments ICML 2026

详情
AI中文摘要

量化的目标是生成一个压缩模型,其输出分布尽可能接近原始模型。为了可处理地实现这一点,大多数量化算法最小化每层的即时激活误差作为端到端误差的代理。然而,这忽略了未来层的影响,使其成为一个较差的代理。在这项工作中,我们引入了另一种量化算法(YAQA),一种直接考虑网络输出误差的自适应舍入算法。YAQA引入了一系列理论结果,最终给出了量化算法的首个端到端误差界。首先,我们通过Hessian近似的结构刻画了自适应舍入算法的收敛时间。然后,我们证明端到端误差可以通过近似与真实Hessian的余弦相似度来界定。这允许一种自然的Kronecker分解近似,并具有相应的近最优Hessian草图。YAQA在理论上优于GPTQ/LDLQ,并在经验上比这些方法减少约30%的误差。YAQA甚至实现了比量化感知训练更低的误差。这转化为下游任务上的最先进性能,同时不增加推理开销。

英文摘要

The goal of quantization is to produce a compressed model whose output distribution is as close to the original model's as possible. To do this tractably, most quantization algorithms minimize the immediate activation error of each layer as a proxy for the end-to-end error. However, this ignores the effect of future layers, making it a poor proxy. In this work, we introduce Yet Another Quantization Algorithm (YAQA), an adaptive rounding algorithm that directly considers the error at the network's output. YAQA introduces a series of theoretical results that culminate in the first end-to-end error bounds for quantization algorithms. First, we characterize the convergence time of adaptive rounding algorithms via the structure of their Hessian approximations. We then show that the end-to-end error can be bounded by the approximation's cosine similarity to the true Hessian. This admits a natural Kronecker-factored approximation with corresponding near-optimal Hessian sketches. YAQA is provably better than GPTQ/LDLQ and empirically reduces the error by $\approx 30\%$ over these methods. YAQA even achieves a lower error than quantization-aware training. This translates to state-of-the-art performance on downstream tasks, all while adding no inference overhead.

2509.15676 2026-06-04 cs.LG cs.AI cs.CL 版本更新

KITE: Kernelized and Information Theoretic Exemplars for In-Context Learning

KITE: 基于核方法和信息论的上下文学习示例选择

Vaibhav Singh, Soumya Suvra Ghosal, Kapu Nirmal Joshua, Soumyabrata Pal, Sayak Ray Chowdhury

Affiliations * IIT Bombay; UMD College Park; IIT Kanpur; Adobe Research

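AI summary For example selection in in-context learning, proposes a greedy algorithm grounded in information theory and kernel methods that minimizes query-specific prediction error and adds a diversity regularizer, significantly improving classification performance.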

详情
AI中文摘要

上下文学习(ICL)已成为一种强大的范式,通过仅使用提示中精心选择的少量任务特定示例,使大型语言模型(LLM)适应新的、数据稀缺的任务。然而,鉴于LLM有限的上下文大小,一个基本问题出现了:应选择哪些示例以最大化给定用户查询的性能?虽然基于最近邻的方法(如KATE)已被广泛用于此目的,但它们在高维嵌入空间中存在众所周知的缺点,包括泛化能力差和缺乏多样性。在这项工作中,我们从原则性的、信息论驱动的角度研究ICL中的示例选择问题。我们首先将LLM建模为输入嵌入上的线性函数,并将示例选择任务框架化为一个查询特定的优化问题:从较大的示例库中选择一个子集,以最小化特定查询上的预测误差。这种表述通过针对特定查询实例的准确预测,偏离了传统的以泛化为中心的学习理论方法。我们推导出一个原则性的代理目标,该目标是近似子模的,从而能够使用具有近似保证的贪心算法。我们通过(i)引入核技巧以在高维特征空间中操作而无需显式映射,以及(ii)引入基于最优设计的正则化项以鼓励所选示例的多样性,进一步增强了我们的方法。实验上,我们在多个分类任务上展示了相对于标准检索方法的显著改进,突出了在真实世界、标签稀缺场景中,结构感知、多样化的示例选择对ICL的益处。

英文摘要

In-context learning (ICL) has emerged as a powerful paradigm for adapting large language models (LLMs) to new and data-scarce tasks using only a few carefully selected task-specific examples presented in the prompt. However, given the limited context size of LLMs, a fundamental question arises: Which examples should be selected to maximize performance on a given user query? While nearest-neighbor-based methods like KATE have been widely adopted for this purpose, they suffer from well-known drawbacks in high-dimensional embedding spaces, including poor generalization and a lack of diversity. In this work, we study this problem of example selection in ICL from a principled, information theory-driven perspective. We first model an LLM as a linear function over input embeddings and frame the example selection task as a query-specific optimization problem: selecting a subset of exemplars from a larger example bank that minimizes the prediction error on a specific query. This formulation departs from traditional generalization-focused learning theoretic approaches by targeting accurate prediction for a specific query instance. We derive a principled surrogate objective that is approximately submodular, enabling the use of a greedy algorithm with an approximation guarantee. We further enhance our method by (i) incorporating the kernel trick to operate in high-dimensional feature spaces without explicit mappings, and (ii) introducing an optimal design-based regularizer to encourage diversity in the selected examples. Empirically, we demonstrate significant improvements over standard retrieval methods across a suite of classification tasks, highlighting the benefits of structure-aware, diverse example selection for ICL in real-world, label-scarce scenarios.

2502.06301 2026-06-04 cs.LG cs.NE 版本更新

Utilizing Novelty-based Evolution Strategies to Train Transformers in Reinforcement Learning

利用基于新颖性的进化策略训练强化学习中的Transformer

Matyáš Lorenc, Roman Neruda

发表机构 * Faculty of Mathematics and Physics, Charles University(数学与物理学院,查理大学) Institute of Computer Science, Czech Academy of Sciences(计算机科学研究所,捷克科学院)

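AI summary Experiments with the novelty-based evolution strategy variants NS-ES and NSR-ES, evaluates how well they train transformer architectures for reinforcement learning such as the Decision Transformer, and explores whether seeding with a pretrained model can speed up training.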

详情
Journal ref
2025 IEEE 37th International Conference on Tools with Artificial Intelligence (ICTAI), Athens, Greece, 2025, pp. 801-805
AI中文摘要

在本文中,我们实验了OpenAI-ES的基于新颖性的变体,即NS-ES和NSR-ES算法,并评估了它们在训练针对强化学习问题设计的复杂Transformer架构(如Decision Transformers)中的有效性。我们还测试了是否可以通过使用预训练模型进行种子训练来加速这些更大模型的新颖性训练。实验结果喜忧参半。NS-ES显示出进展,但显然需要更多迭代才能产生有趣的智能体。另一方面,NSR-ES被证明能够直接用于更大模型,因为其性能在前馈模型和Decision Transformer之间表现相似,正如我们之前工作中OpenAI-ES的表现一样。

英文摘要

In this paper, we experiment with novelty-based variants of OpenAI-ES, the NS-ES and NSR-ES algorithms, and evaluate their effectiveness in training complex, transformer-based architectures designed for the problem of reinforcement learning, such as Decision Transformers. We also test whether we can accelerate the novelty-based training of these larger models by seeding the training with a pretrained model. The experimental results were mixed. NS-ES showed progress, but it would clearly need many more iterations to yield interesting agents. NSR-ES, on the other hand, proved quite capable of being used straightforwardly on larger models, since its performance is similar between the feed-forward model and the Decision Transformer, as it was for OpenAI-ES in our previous work.

2509.08846 2026-06-04 cs.LG cs.AI stat.ML 版本更新

Uncertainty Estimation using Variance-Gated Distributions

使用方差门控分布的不确定性估计

H. Martin Gillis, Isaac Xu, Thomas Trappenberg

发表机构 * Faculty of Computer Science(计算机科学学院) Dalhousie University(达尔豪斯大学)

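AI summary Proposes a variance-gated framework for uncertainty estimation based on the signal-to-noise ratio of class probability distributions, scaling predictions by an ensemble-derived confidence factor, in response to recent questions about the additive decomposition of predictive uncertainty.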

Comments NeurIPS Workshop: Mathematical Foundations and Operational Integration of Machine Learning for Uncertainty-Aware Decision-Making

详情
AI中文摘要

评估神经网络每个样本的不确定性量化对于涉及高风险应用的决策至关重要。一种常见的方法是使用贝叶斯或近似模型的预测分布,并将相应的预测不确定性分解为认知(模型相关)和偶然(数据相关)成分。然而,加性分解最近受到质疑。在这项工作中,我们提出了一个基于不同模型预测中类概率分布信噪比的不确定性估计和分解的直观框架。我们引入了一种方差门控度量,该度量通过从集成中导出的置信因子来缩放预测。我们使用这个度量来讨论委员会机器多样性崩溃的存在性。

英文摘要

Evaluation of per-sample uncertainty quantification from neural networks is essential for decision-making involving high-risk applications. A common approach is to use the predictive distribution from Bayesian or approximation models and decompose the corresponding predictive uncertainty into epistemic (model-related) and aleatoric (data-related) components. However, additive decomposition has recently been questioned. In this work, we propose an intuitive framework for uncertainty estimation and decomposition based on the signal-to-noise ratio of class probability distributions across different model predictions. We introduce a variance-gated measure that scales predictions by a confidence factor derived from ensembles. We use this measure to discuss the existence of a collapse in the diversity of committee machines.

2509.07963 2026-06-04 cs.LG 版本更新

Customizing the Inductive Biases of Softmax Attention using Structured Matrices

使用结构化矩阵定制软注意力机制的归纳偏置

Yilun Kuang, Noah Amsel, Sanae Lotfi, Shikai Qiu, Andres Potapczynski, Andrew Gordon Wilson

发表机构 * University of Cambridge(剑桥大学)

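AI summary To address the information loss from low-dimensional projections and the lack of a distance-dependent bias in standard attention, proposes high-rank scoring functions based on Block Tensor-Train (BTT) and contiguous Multi-Level Low Rank (MLR) structured matrices, improving performance on in-context regression, language modeling, and long-range time-series forecasting.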

Comments ICML 2025. Code available at https://github.com/YilunKuang/structured-attention

详情
AI中文摘要

注意力机制的核心组件是评分函数,它将输入转换为低维查询和键,并计算每对向量的点积。虽然低维投影提高了效率,但对于某些具有本质高维输入的任务,它会导致信息损失。此外,注意力对所有输入对使用相同的评分函数,而没有对序列中相邻标记施加距离相关的计算偏置。在这项工作中,我们通过提出基于计算高效的高秩结构化矩阵(包括块张量列(BTT)和连续多级低秩(MLR)矩阵)的新评分函数来解决这些缺陷。在高维输入的上下文回归任务中,我们提出的评分函数在任意固定计算预算下均优于标准注意力。在语言建模(一种表现出局部性模式的任务)中,基于MLR的注意力方法相比标准注意力和滑动窗口注意力的变体实现了改进的扩展定律。此外,我们表明BTT和MLR都属于更广泛的高效结构化矩阵家族,能够编码全秩或距离依赖的计算偏置,从而解决了标准注意力的显著缺陷。最后,我们展示了MLR注意力在长程时间序列预测中具有令人期待的结果。

英文摘要

The core component of attention is the scoring function, which transforms the inputs into low-dimensional queries and keys and takes the dot product of each pair. While the low-dimensional projection improves efficiency, it causes information loss for certain tasks that have intrinsically high-dimensional inputs. Additionally, attention uses the same scoring function for all input pairs, without imposing a distance-dependent compute bias for neighboring tokens in the sequence. In this work, we address these shortcomings by proposing new scoring functions based on computationally efficient structured matrices with high ranks, including Block Tensor-Train (BTT) and contiguous Multi-Level Low Rank (MLR) matrices. On in-context regression tasks with high-dimensional inputs, our proposed scoring functions outperform standard attention for any fixed compute budget. On language modeling, a task that exhibits locality patterns, our MLR-based attention method achieves improved scaling laws compared to both standard attention and variants of sliding window attention. Additionally, we show that both BTT and MLR fall under a broader family of efficient structured matrices capable of encoding either full-rank or distance-dependent compute biases, thereby addressing significant shortcomings of standard attention. Finally, we show that MLR attention has promising results for long-range time-series forecasting.

2509.03351 2026-06-04 cs.LG cs.AI q-bio.QM 版本更新

epiGPTope: A machine learning-based epitope generator and classifier

epiGPTope: 一种基于机器学习的表位生成器和分类器

Natalia Flechas Manrique, Alberto Martínez, Elena López-Martínez, Luc Andrea, Román Orus, Aitor Manteca, Aitziber L. Cortajarena, Llorenç Espinosa-Portalés

Affiliations * Multiverse Computing; Centre for Cooperative Research in Biomaterials (CIC biomaGUNE); Basque Research and Technology Alliance (BRTA); Donostia International Physics Center; Ikerbasque Foundation for Science

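AI summary Proposes epiGPTope, a large language model pre-trained on protein data and fine-tuned on linear epitopes that directly generates novel epitope-like sequences, combined with statistical classifiers that predict whether an epitope is of bacterial or viral origin, to speed up the construction and screening of synthetic epitope libraries.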

Comments 11 pages, 4 figures. Supplementary Information with 5 pages, 4 figures

详情
Journal ref
ACS Synthetic Biology 2026 15 (2), 631-642
AI中文摘要

表位是能被抗体或免疫细胞受体识别的短抗原肽序列,对免疫疗法、疫苗和诊断的开发至关重要。然而,由于巨大的组合序列空间(n个氨基酸的线性表位有$20^n$种组合),即使采用高通量实验技术,合成表位库的合理设计也极具挑战。在本研究中,我们提出了一种大型语言模型epiGPTope,该模型在蛋白质数据上预训练,并专门针对线性表位进行微调,首次能够直接生成新型表位样序列,这些序列被发现具有与已知表位相似的统计特性。这种生成方法可用于制备表位候选序列库。我们进一步训练统计分类器来预测表位序列是细菌来源还是病毒来源,从而缩小候选库范围,提高识别特定表位的可能性。我们提出,这种生成模型与预测模型的组合有助于表位发现。该方法仅使用线性表位的一级氨基酸序列,无需几何框架或手工特征。通过开发生成生物学可行序列的方法,我们预期能更快、更经济地生成和筛选合成表位,并在新生物技术开发中具有相关应用。

英文摘要

Epitopes are short antigenic peptide sequences that are recognized by antibodies or immune cell receptors. They are central to the development of immunotherapies, vaccines, and diagnostics. However, the rational design of synthetic epitope libraries is challenging due to the large combinatorial sequence space, $20^n$ combinations for linear epitopes of $n$ amino acids, which makes screening and testing unfeasible even with high-throughput experimental techniques. In this study, we present a large language model, epiGPTope, pre-trained on protein data and specifically fine-tuned on linear epitopes, that for the first time can directly generate novel epitope-like sequences; these are found to possess statistical properties analogous to those of known epitopes. This generative approach can be used to prepare libraries of epitope candidate sequences. We further train statistical classifiers to predict whether an epitope sequence is of bacterial or viral origin, thus narrowing the candidate library and increasing the likelihood of identifying specific epitopes. We propose that such a combination of generative and predictive models can be of assistance in epitope discovery. The approach uses only primary amino acid sequences of linear epitopes, bypassing the need for a geometric framework or hand-crafted features of the sequences. By developing a method to create biologically feasible sequences, we anticipate faster and more cost-effective generation and screening of synthetic epitopes, with relevant applications in the development of new biotechnologies.
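The size of the combinatorial sequence space mentioned above is easy to check numerically. The short sketch below evaluates the $20^n$ count for a few epitope lengths; the lengths 5, 10, and 15 are illustrative choices, not values taken from the abstract.

```python
NUM_AMINO_ACIDS = 20  # standard amino acid alphabet assumed by the 20^n count


def sequence_space_size(length):
    """Number of distinct linear peptide sequences of the given length."""
    return NUM_AMINO_ACIDS ** length


# Illustrative epitope lengths (assumed for this example).
sizes = {n: sequence_space_size(n) for n in (5, 10, 15)}
for n, size in sizes.items():
    print(f"n = {n:2d}: {size:.3e} possible sequences")
```

Already at 10 residues the count is on the order of $10^{13}$, which is why exhaustive screening is not practical and a generative model that proposes candidates is attractive.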

2507.21638 2026-06-04 cs.AI cs.LG cs.MA cs.RO 版本更新

Assistax: A Multi-Agent Hardware-Accelerated Reinforcement Learning Benchmark for Assistive Robotics

Assistax: 一个用于辅助机器人的多智能体硬件加速强化学习基准

Leonard Hinckeldey, Elliot Fosong, Rimvydas Rubavicius, Elle Miller, Trevor McInroe, Fan Zhang, Patricia Wollstadt, Stefano V. Albrecht, Subramanian Ramamoorthy

发表机构 * University of California, Berkeley(加州大学伯克利分校) Stanford University(斯坦福大学)

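AI summary Proposes Assistax, a benchmark of assistive robotics tasks that uses JAX hardware acceleration and multi-agent reinforcement learning, runs up to $370\times$ faster than CPU-based alternatives, and tests a robot's zero-shot coordination capabilities.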

Comments Accepted at the Reinforcement Learning Conference 2026

详情
AI中文摘要

强化学习(RL)算法的发展在很大程度上受到具有挑战性的任务和基准的推动。游戏在RL基准中占据主导地位,因为它们呈现了相关的挑战,运行成本低且易于理解。虽然围棋和Atari等游戏带来了许多突破,但它们通常不能直接转化为现实世界的具身应用。在认识到需要多样化RL基准并解决具身交互场景中出现的复杂性的情况下,我们引入了Assistax:一个旨在解决辅助机器人任务中出现的挑战的开源基准。Assistax利用JAX的硬件加速,在基于物理的模拟中实现显著的学习加速。在开环挂钟时间方面,Assistax在向量化训练运行时比基于CPU的替代方案快高达370倍。Assistax使用多智能体RL将辅助机器人与活跃的人类患者之间的交互概念化,以训练一群多样化的伙伴智能体,从而可以测试具身机器人智能体的零样本协调能力。对流行的连续控制RL和MARL算法进行的广泛评估和超参数调优提供了可靠的基线,并将Assistax确立为推进辅助机器人RL研究的实用基准。代码可在以下网址获取:https://github.com/assistive-autonomy/assistax。

英文摘要

The development of reinforcement learning (RL) algorithms has been largely driven by ambitious challenge tasks and benchmarks. Games have dominated RL benchmarks because they present relevant challenges, are inexpensive to run and easy to understand. While games such as Go and Atari have led to many breakthroughs, they often do not directly translate to real-world embodied applications. In recognising the need to diversify RL benchmarks and addressing complexities that arise in embodied interaction scenarios, we introduce Assistax: an open-source benchmark designed to address challenges arising in assistive robotics tasks. Assistax uses JAX's hardware acceleration for significant speed-ups for learning in physics-based simulations. In terms of open-loop wall-clock time, Assistax runs up to $370\times$ faster when vectorising training runs compared to CPU-based alternatives. Assistax conceptualises the interaction between an assistive robot and an active human patient using multi-agent RL to train a population of diverse partner agents against which an embodied robotic agent's zero-shot coordination capabilities can be tested. Extensive evaluation and hyperparameter tuning for popular continuous control RL and MARL algorithms provide reliable baselines and establish Assistax as a practical benchmark for advancing RL research for assistive robotics. The code is available at: https://github.com/assistive-autonomy/assistax.

2506.23546 2026-06-04 q-bio.NC cond-mat.dis-nn cs.LG cs.NE 版本更新

Neural Langevin Machine: a local asymmetric learning rule can be creative

神经朗之万机:一种局部非对称学习规则可以具有创造性

Zhendong Yu, Weizhong Huang, Haiping Huang

发表机构 * PMI Lab, School of Physics, Sun Yat-sen University(物理系,中山大学,PMI实验室) Guangdong Provincial Key Laboratory of Magnetoelectric Physics and Devices, Sun Yat-sen University(磁电物理与器件广东省重点实验室,中山大学)

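AI summary Proposes the neural Langevin machine, which uses the fixed points of recurrent neural networks for generative learning through an asymmetric, firing-rate-adjusted local learning rule, and reveals an out-of-equilibrium generative regime and a memorization-to-generalization transition.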

Comments 7 pages, 5 figures, with Github link in the paper, supplemental material available upon request

详情
AI中文摘要

递归神经网络的固定点可用于存储和生成信息。这些固定点可以通过玻尔兹曼-吉布斯测度捕获,从而得到神经朗之万动力学,可用于在真实数据集的生成学习中找到它们。我们将这种生成模型称为神经朗之万机,它推导出一种非对称且放电速率调整的学习规则,仅需要局部神经信号,因此在局部预测学习方面具有生物学相关性。揭示了生成过程中一个有趣的非平衡状态,以及随着训练数据量增加从记忆到泛化的转变。这种神经启发机器还可以实现对不同种类生成图像的相空间连续探索,并且能够对受损图像进行去噪。

英文摘要

Fixed points of recurrent neural networks can be leveraged to store and generate information. These fixed points can be captured by the Boltzmann-Gibbs measure, which leads to neural Langevin dynamics that can be used to find them for generative learning of a real dataset. We call this type of generative model a neural Langevin machine, which derives an asymmetric and firing-rate-speed adjusted learning rule requiring only local neural signals, thereby bearing biological relevance in terms of local predictive learning. An interesting out-of-equilibrium regime of the generative process is revealed, together with a memorization-to-generalization transition with increasing training data size. The neuro-inspired machine can also realize a continuous exploration of the phase space for different kinds of generative images and can denoise a corrupted image as well.

2506.05233 2026-06-04 cs.LG cs.AI cs.CL 版本更新

MesaNet: Sequence Modeling by Locally Optimal Test-Time Training

MesaNet: 通过局部最优测试时训练进行序列建模

Johannes von Oswald, Nino Scherrer, Seijin Kobayashi, Luca Versari, Songlin Yang, Sarthak Mittal, Maximilian Schlegel, Kaitlin Maile, Yanick Schimpf, Oliver Sieberling, Alexander Meulemans, Rif A. Saurous, Guillaume Lajoie, Charlotte Frenkel, Razvan Pascanu, Blaise Agüera y Arcas, João Sacramento

Affiliations * Google; Paradigms of Intelligence Team; Google DeepMind; MIT CSAIL

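AI summary Proposes a numerically stable, chunkwise-parallelizable Mesa layer that performs locally optimal test-time training with a conjugate gradient solver, achieving lower language modeling perplexity and higher downstream benchmark performance than previous RNNs at the cost of additional inference-time FLOPs.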

Comments Published at ICLR 2026

详情
AI中文摘要

序列建模目前主要由使用softmax自注意力的因果Transformer架构主导。尽管被广泛采用,Transformer在推理时需要线性扩展内存和计算。最近一系列工作将softmax操作线性化,产生了具有恒定内存和计算成本的强大循环神经网络模型,如DeltaNet、Mamba或xLSTM。这些模型可以通过注意到其循环层动态都源于上下文回归目标(通过在线学习规则近似优化)来统一。在此,我们加入这一系列工作,引入最近提出的Mesa层(von Oswald等人,2024)的一个数值稳定、可分块并行化的版本,该层原本只能顺序运行,因此不可扩展。该层同样源于上下文损失,但现在使用快速共轭梯度求解器在每个时间点将其最小化至最优。通过一系列扩展到十亿参数规模的实验,我们表明最优测试时训练使得语言建模困惑度更低,下游基准性能优于之前的RNN,尤其是在需要长上下文理解的任务上。这一性能提升以推理时额外浮点运算为代价。因此,我们的结果与最近增加测试时计算以提高性能的趋势有趣地相关——这里通过花费计算在神经网络内部解决序列优化问题来实现。

英文摘要

Sequence modeling is currently dominated by causal transformer architectures that use softmax self-attention. Although widely adopted, transformers require scaling memory and compute linearly during inference. A recent stream of work linearized the softmax operation, resulting in powerful recurrent neural network (RNN) models with constant memory and compute costs such as DeltaNet, Mamba or xLSTM. These models can be unified by noting that their recurrent layer dynamics can all be derived from an in-context regression objective, approximately optimized through an online learning rule. Here, we join this line of work and introduce a numerically stable, chunkwise parallelizable version of the recently proposed Mesa layer (von Oswald et al., 2024), which could only run sequentially in time and was therefore not scalable. This layer again stems from an in-context loss, but one that is now minimized to optimality at every time point using a fast conjugate gradient solver. Through an extensive suite of experiments up to the billion-parameter scale, we show that optimal test-time training enables reaching lower language modeling perplexity and higher downstream benchmark performance than previous RNNs, especially on tasks requiring long context understanding. This performance gain comes at the cost of additional FLOPs spent during inference time. Our results are therefore intriguingly related to recent trends of increasing test-time compute to improve performance, here by spending compute to solve sequential optimization problems within the neural network itself.
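As a rough illustration of minimizing an in-context regression loss to optimality with a conjugate gradient solver, the sketch below solves a ridge-regularized least-squares problem over the keys and values seen so far and reads out the result for one query. It is a simplified, unbatched sketch under assumptions: the ridge form of the loss, the function names, and the tensor shapes are illustrative, and none of the chunkwise parallelization described in the abstract is shown.

```python
import numpy as np


def conjugate_gradient(A, b, num_steps=100, tol=1e-20):
    """Solve A x = b for a symmetric positive-definite matrix A."""
    x = np.zeros_like(b)
    r = b - A @ x          # residual
    p = r.copy()           # search direction
    rs = r @ r
    for _ in range(num_steps):
        if rs < tol:       # squared residual norm is small enough
            break
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x = x + alpha * p
        r = r - alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x


def mesa_style_readout(keys, values, query, ridge=1.0):
    """Output W* @ query, where W* minimizes sum_i ||W k_i - v_i||^2 + ridge * ||W||^2."""
    H = keys.T @ keys + ridge * np.eye(keys.shape[1])  # sum_i k_i k_i^T + ridge * I
    G = values.T @ keys                                # sum_i v_i k_i^T
    x = conjugate_gradient(H, query)                   # x = H^{-1} query, without forming H^{-1}
    return G @ x


# Hypothetical context of 16 tokens with 4-dimensional keys and 3-dimensional values.
rng = np.random.default_rng(0)
keys = rng.normal(size=(16, 4))
values = rng.normal(size=(16, 3))
query = rng.normal(size=4)
output = mesa_style_readout(keys, values, query)
```

The closed-form minimizer is $W^* = G H^{-1}$, so applying conjugate gradient to $H x = q$ yields the optimal readout without an explicit matrix inverse; the extra solver iterations correspond to the additional inference-time compute the abstract mentions.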

2506.04281 2026-06-04 cs.LG 版本更新

Uncovering Insights of Compound Flooding with Data-Driven AI

利用数据驱动AI揭示复合洪水的内在机制

Xu Zheng, Chaohao Lin, Sipeng Chen, Zhuomin Chen, Jimeng Shi, Jayantha Obeysekera, Jingchao Ni, Wei Cheng, Jason Liu, Dongsheng Luo

发表机构 * Florida International University(佛罗里达国际大学) Florida State University(佛罗里达州立大学) UIUC(伊利诺伊大学香槟分校) University of Houston(休斯顿大学) NEC Lab America(NEC美国实验室) Singapore Management University(新加坡管理大学)

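AI summary A data-driven analysis of compound flooding in South Florida that integrates tidal conditions, rainfall, groundwater stage, and human water management activities; it finds that groundwater level is a dominant predictor and that spatially coupled system states matter more than long-term temporal dependencies.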

Comments Accepted to SIGKDD 2026 AI for Science Track; 12 Pages, 5 Figures, 6 Tables

详情
AI中文摘要

复合洪水由多个水文气象因素之间的非线性相互作用驱动,对灾害预防构成重大挑战。现有的预测方法,无论是基于物理的还是数据驱动的,通常强调时间模式,而较少探索多个相互作用因素如何共同塑造洪水动态。为了解决这个问题,我们通过整合潮汐条件、降雨、地下水位和人类水管理活动,对南佛罗里达(一个典型的复合洪水区域)进行了大规模数据驱动的复合洪水分析。我们的分析揭示了三个关键发现:(i)仅捕捉时间动态的模型无法代表复合事件期间的多因素相互作用;(ii)地下水位反映的地表下饱和度成为洪水严重程度的主要预测因子,在这个多孔沿海地区往往超过即时降雨强度;(iii)有限有效半径内周围监测站的空间状态为洪水提供了关键的因果背景,而延长时间历史在极端事件中收益递减。这些发现表明,复合洪水更多地受空间耦合系统状态而非长期时间依赖性的支配,挑战了以降雨为中心和序列主导的预测范式。通过将数据驱动模型定位为科学探究工具而非仅用于预测,本研究为复合洪水的机制提供了新见解,并为设计更基于物理的沿海环境早期预警系统提供了信息。我们的数据集和代码公开在 https://github.com/AslanDing/SFBench。

英文摘要

Compound flooding, driven by nonlinear interactions between multiple hydrometeorological factors, poses a significant challenge to hazard prevention. Existing forecasting approaches, whether physics-based or data-driven, often emphasize temporal patterns while underexploring how multiple interacting factors jointly shape flood dynamics. To address this problem, we conduct a large-scale data-driven analysis of compound flooding in South Florida, a typical area for compound flooding, by integrating tidal conditions, rainfall, groundwater stage, and human water management activities. Our analysis reveals three key findings: (i) models that capture temporal dynamics alone fail to represent multi-factor interactions during compound events; (ii) subsurface saturation, as reflected by groundwater levels, emerges as a dominant predictor of flood severity, often outweighing immediate rainfall intensity in this porous coastal region; and (iii) the spatial state of surrounding monitoring stations within a finite effective radius provides critical causal context for flooding, while extending temporal history yields diminishing returns during extreme events. These findings suggest that compound flooding is governed more by spatially coupled system states than by long-term temporal dependencies, challenging rain-centric and sequence-dominated forecasting paradigms. By framing data-driven models as tools for scientific inquiry rather than prediction alone, this study offers new insights into the mechanisms of compound flooding and informs the design of more physically grounded early-warning systems for coastal environments. Our dataset and code are publicly available at https://github.com/AslanDing/SFBench.

2505.21331 2026-06-04 cs.DS cs.GT cs.LG cs.PF math.PR 版本更新

Scheduling in Queueing Systems with Uncertain and Evolving Holding Costs

具有不确定和演化持有成本的排队系统中的调度

Caner Gocmen, Thodoris Lykouris, Deeksha Sinha, Wentao Weng

发表机构 * Meta Platforms(Meta平台) Massachusetts Institute of Technology(麻省理工学院)

AI总结 针对持有成本不确定且演化的排队系统,提出基于马尔可夫链的模型和机会调整剩余成本(OaRC)算法,证明其渐近最优性并优于经典规则。

详情
AI中文摘要

在社交媒体平台的内容审核中,延迟审核内容的成本与其观看轨迹成正比,而观看轨迹是波动的且先验未知。受这种不确定且演化的持有成本的启发,我们考虑一个排队模型,其中作业状态基于马尔可夫链演化,并具有状态相关的瞬时持有成本。我们证明,在存在这种不确定且演化的持有成本的情况下,两个经典算法原则,即瞬时成本($cμ$规则)和期望剩余成本($cμ/θ$规则),都是次优的。通过将每个作业视为一个马尔可夫滑雪租赁问题,我们开发了一种新的基于索引的算法:机会调整剩余成本(OaRC),该算法会考虑在不确定性部分消解后再服务作业的机会,并据此进行调整。我们证明OaRC的次优性差距为$\tilde{O}(\sqrt{N})$,其中$N$是系统规模。这个界限表明,当系统规模$N$趋于无穷时,OaRC对于过载系统实现了渐近最优性。此外,该界限与状态空间大小无关,这在作业状态包含上下文信息时是一个理想性质。我们基于社交媒体平台内容审核中出现的两种持有成本模式(在线广告和用户生成内容)进行了广泛的模拟研究,验证了我们的结果。基于合成和真实数据集的模拟表明,OaRC始终优于基于两个经典算法原则的现有实践。

英文摘要

In content moderation for social media platforms, the cost of delaying the review of a piece of content is proportional to its view trajectory, which fluctuates and is a priori unknown. Motivated by such uncertain and evolving holding costs, we consider a queueing model where job states evolve based on a Markov chain with state-dependent instantaneous holding costs. We demonstrate that in the presence of such uncertain and evolving holding costs, the two canonical algorithmic principles, instantaneous-cost ($cμ$-rule) and expected-remaining-cost ($cμ/θ$-rule), are suboptimal. By viewing each job as a Markovian ski-rental problem, we develop a new index-based algorithm, Opportunity-adjusted Remaining Cost (OaRC), that adjusts to the opportunity of serving jobs in the future when uncertainty partly resolves. We show that the suboptimality gap of OaRC scales as $\tilde{O}(\sqrt{N})$, where $N$ is the system size. This bound shows that OaRC achieves asymptotic optimality for overloaded systems when the system size $N$ scales to infinity. Moreover, the bound is independent of the state-space size, which is a desirable property when job states contain contextual information. We corroborate our results with an extensive simulation study based on two holding cost patterns (online ads and user-generated content) that arise in content moderation for social media platforms. Our simulations based on synthetic and real datasets demonstrate that OaRC consistently outperforms existing practice, which is based on the two canonical algorithmic principles.
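The two canonical index rules named in the abstract can be illustrated with a minimal sketch. This is not the paper's OaRC algorithm, whose index is not given in this listing. The `Job` class, the function names, and the numbers are hypothetical, and `theta` is assumed here to be the rate at which a job leaves the system unserved, which the abstract does not define.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    c: float      # instantaneous holding cost per unit time
    mu: float     # service rate
    theta: float  # ASSUMPTION: rate at which the job leaves unserved


def c_mu_index(job: Job) -> float:
    """Instantaneous-cost index: serve the job with the largest c * mu."""
    return job.c * job.mu


def c_mu_theta_index(job: Job) -> float:
    """Expected-remaining-cost index: serve the largest c * mu / theta."""
    return job.c * job.mu / job.theta


def pick(jobs, index):
    """Return the job with the highest value of the given index."""
    return max(jobs, key=index)


# Hypothetical jobs: one is costly right now but short-lived,
# the other is cheaper per unit time but lingers much longer.
jobs = [
    Job("viral_now", c=10.0, mu=1.0, theta=5.0),
    Job("slow_burn", c=4.0, mu=1.0, theta=0.5),
]

first_by_c_mu = pick(jobs, c_mu_index).name
first_by_c_mu_theta = pick(jobs, c_mu_theta_index).name
```

The two rules rank these jobs in opposite orders, which shows that they can disagree. Neither rule accounts for how a job's cost state may evolve, which is the gap the abstract says OaRC addresses.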

2505.19293 2026-06-04 cs.CL cs.AI cs.LG 版本更新

100-LongBench: Are de facto Long-Context Benchmarks Literally Evaluating Long-Context Ability?

100-LongBench:事实上的长上下文基准是否真的在评估长上下文能力?

Wang Yang, Hongye Jin, Shaochen Zhong, Song Jiang, Qifan Wang, Vipin Chaudhary, Xiaotian Han

发表机构 * Case Western Reserve University(凯斯西储大学) Texas A&M University(德克萨斯农工大学) Rice University(莱斯大学) University of California, Los Angeles(加州大学洛杉矶分校) Meta(Meta公司)

AI总结 针对现有长上下文基准无法分离基线能力与真实长上下文能力、且输入长度固定等问题,提出长度可控的长上下文基准和新指标,以有效评估大语言模型的长上下文能力。

详情
AI中文摘要

长上下文能力被认为是LLM最重要的能力之一,因为真正具备长上下文能力的LLM使用户能够轻松处理许多原本繁琐的任务——例如,阅读长文档寻找答案与直接询问LLM。然而,现有的基于真实任务的长上下文评估基准有两个主要缺陷。首先,像LongBench这样的基准通常没有提供适当的指标来将长上下文性能与模型的基线能力分开,使得跨模型比较不清晰。其次,此类基准通常以固定输入长度构建,这限制了它们在不同模型上的适用性,并且无法揭示模型何时开始崩溃。为了解决这些问题,我们引入了一个长度可控的长上下文基准和一个新颖的指标,该指标将基线知识与真实的长上下文能力解耦。实验证明了我们的方法在有效评估LLM方面的优越性。

英文摘要

Long-context capability is considered one of the most important abilities of LLMs, as a truly long context-capable LLM enables users to effortlessly process many originally exhausting tasks -- e.g., digesting a long-form document to find answers vs. directly asking an LLM about it. However, existing real-task-based long-context evaluation benchmarks have two major shortcomings. First, benchmarks like LongBench often do not provide proper metrics to separate long-context performance from the model's baseline ability, making cross-model comparison unclear. Second, such benchmarks are usually constructed with fixed input lengths, which limits their applicability across different models and fails to reveal when a model begins to break down. To address these issues, we introduce a length-controllable long-context benchmark and a novel metric that disentangles baseline knowledge from true long-context capabilities. Experiments demonstrate the superiority of our approach in effectively evaluating LLMs.

2505.17315 2026-06-04 cs.AI cs.CL cs.LG 版本更新

Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning

更长上下文,更深思考:揭示长上下文能力在推理中的作用

Wang Yang, Zirui Liu, Hongye Jin, Qingyu Yin, Vipin Chaudhary, Xiaotian Han

发表机构 * Case Western Reserve University(凯斯西储大学) University of Minnesota - Twin Cities(明尼苏达大学双城分校) Texas A&M University(德克萨斯农工大学)

AI总结 本研究通过实验发现,增强模型的长上下文能力(在监督微调前)能显著提升推理性能,即使对于短输入任务也有泛化收益,表明长上下文建模是推理能力的关键基础。

详情
AI中文摘要

近期语言模型展现出强大的推理能力,但长上下文能力对推理的影响仍未充分探索。在本工作中,我们假设当前推理能力的局限性部分源于长上下文能力不足,这一假设基于经验观察:(1)更高的上下文窗口长度通常带来更强的推理性能,(2)失败的推理案例与失败的长上下文案例相似。为验证这一假设,我们检验了在监督微调(SFT)前增强模型的长上下文能力是否能提升推理性能。具体而言,我们比较了架构和微调数据相同但长上下文能力不同的模型。结果揭示了一致趋势:长上下文能力更强的模型在SFT后,在推理基准上取得了显著更高的准确率。值得注意的是,即使在输入长度较短的任务上,这些增益也持续存在,表明长上下文训练为推理性能提供了可泛化的益处。这些发现表明,长上下文建模不仅对处理长输入至关重要,而且也是推理的关键基础。我们主张将长上下文能力作为未来语言模型设计的首要目标。

英文摘要

Recent language models exhibit strong reasoning capabilities, yet the influence of long-context capacity on reasoning remains underexplored. In this work, we hypothesize that current limitations in reasoning stem, in part, from insufficient long-context capacity, motivated by empirical observations such as (1) higher context window length often leads to stronger reasoning performance, and (2) failed reasoning cases resemble failed long-context cases. To test this hypothesis, we examine whether enhancing a model's long-context ability before Supervised Fine-Tuning (SFT) leads to improved reasoning performance. Specifically, we compared models with identical architectures and fine-tuning data but varying levels of long-context capacity. Our results reveal a consistent trend: models with stronger long-context capacity achieve significantly higher accuracy on reasoning benchmarks after SFT. Notably, these gains persist even on tasks with short input lengths, indicating that long-context training offers generalizable benefits for reasoning performance. These findings suggest that long-context modeling is not just essential for processing lengthy inputs, but also serves as a critical foundation for reasoning. We advocate for treating long-context capacity as a first-class objective in the design of future language models.

2505.15354 2026-06-04 cs.LG stat.ML 版本更新

Post-Training Corrections for Improved Time-Series Forecasting

用于改进时间序列预测的后训练校正

Hamza Cherkaoui, Malik Tiomoko, Giuseppe Paolo, Zhang Yili, Yu Meng, Zhang Keli, Hafiz Tiomoko Ali

发表机构 * SAMOVAR, Télécom SudParis, Institut Polytechnique de Paris(巴黎综合理工学院南巴黎电信学院SAMOVAR实验室) Noah's Ark Lab(诺亚方舟实验室) Independent Researcher(独立研究员)

AI总结 受提升(boosting)技术启发,提出“后训练校正”方法,对已训练预测器的输出依次施加一组精心挑选的校正,以轻量级、模型无关且可扩展的方式提升预测性能,并在多个基准数据集上以极小的计算开销带来最高30%的精度提升。

详情
AI中文摘要

时间序列预测是各类商业领域中的一项关键任务,但本质上仍具有挑战性。大型预测模型通常在一次资源密集的运行中完成训练。训练完成后,一个自然的问题随之而来:模型性能是否仍有实质性的提升空间?受提升(boosting)技术的启发,我们提出了“后训练校正”的概念。该方法对已训练预测器的预测结果依次施加一组精心挑选的校正,从而增强该预测器。我们的方法为在实际场景中提升预测性能提供了一种轻量级、模型无关且可扩展的策略。我们为该方法提供了理论基础,从仿射校正情形入手,并分析了更一般设置下的期望性能增益和计算成本。在一系列基准数据集上,我们的方法相对于现有最先进模型持续带来最高$30\%$的预测精度提升,且计算开销极小。

英文摘要

Time-series forecasting is a critical task in various business domains, but it remains inherently challenging. Typically, large forecasting models are trained in a single, resource-intensive run. Once training is completed, a natural question arises: is there still potential for meaningful improvement in the model's performance? Motivated by techniques from boosting, we introduce the concept of post-training corrections. This approach enhances a trained forecaster by sequentially applying a carefully selected set of corrections to its predictions. Our method offers a lightweight, model-agnostic, and scalable strategy to improve forecasting performance in practical settings. We provide theoretical foundations for the approach, starting with the affine correction case, and analyze the expected performance gains and computational costs in more general settings. Across a range of benchmark datasets, our method consistently delivers up to a $30\%$ improvement in forecasting accuracy over existing state-of-the-art models, with minimal computational overhead.
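The affine correction case mentioned in the abstract can be sketched as an ordinary least-squares fit of `a * prediction + b` to held-out targets. This is an illustrative assumption about what an affine correction looks like, not the authors' procedure, and the function names and data are hypothetical. On the data used for fitting, the corrected error cannot exceed the uncorrected error, because the identity correction (`a = 1`, `b = 0`) is one of the candidates.

```python
def fit_affine(preds, targets):
    """Least-squares a, b minimising sum((a * p + b - y) ** 2)."""
    n = len(preds)
    mean_p = sum(preds) / n
    mean_y = sum(targets) / n
    var_p = sum((p - mean_p) ** 2 for p in preds)
    if var_p == 0.0:
        # Constant forecaster: the best affine correction is the target mean.
        return 0.0, mean_y
    cov = sum((p - mean_p) * (y - mean_y) for p, y in zip(preds, targets))
    a = cov / var_p
    b = mean_y - a * mean_p
    return a, b


def mse(preds, targets):
    """Mean squared error between two equal-length sequences."""
    return sum((p - y) ** 2 for p, y in zip(preds, targets)) / len(preds)


# Hypothetical forecasts with a systematic scale and offset error.
preds = [1.0, 2.0, 3.0, 4.0]
targets = [1.4, 3.7, 5.4, 7.6]

a, b = fit_affine(preds, targets)
corrected = [a * p + b for p in preds]

mse_before = mse(preds, targets)
mse_after = mse(corrected, targets)
```

The guarantee above is in-sample only. Whether the gain carries over to new data depends on how stable the forecaster's systematic error is, which this sketch does not check.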

2504.15587 2026-06-04 cs.LG cs.AI 版本更新

MetaMolGen: A Neural Graph Motif Generation Model for De Novo Molecular Design

MetaMolGen: 一种用于从头分子设计的神经图基序生成模型

Zimo Yan, Jie Zhang, Zheng Xie, Chang Liu, Yizhen Liu, Yiping Song

发表机构 * National University of Defense Technology(国防科技大学)

AI总结 提出基于元学习的分子生成模型MetaMolGen,通过标准化图基序分布和轻量级自回归序列模型,实现少样本和属性条件分子生成。

详情
AI中文摘要

分子生成在药物发现和材料科学中扮演重要角色,尤其是在数据稀缺场景下,传统生成模型往往难以实现令人满意的条件泛化。为应对这一挑战,我们提出MetaMolGen,一种基于一阶元学习的分子生成器,专为少样本和属性条件分子生成而设计。MetaMolGen通过将图基序映射到标准化潜在空间来标准化其分布,并采用轻量级自回归序列模型生成忠实反映底层分子结构的SMILES序列。此外,它通过集成到生成过程中的可学习属性投影器,支持具有目标属性的分子的条件生成。实验结果表明,MetaMolGen在低数据条件下持续生成有效且多样的SMILES序列,优于传统基线。这突显了其在快速适应和高效条件生成方面的优势,适用于实际分子设计。

英文摘要

Molecular generation plays an important role in drug discovery and materials science, especially in data-scarce scenarios where traditional generative models often struggle to achieve satisfactory conditional generalization. To address this challenge, we propose MetaMolGen, a first-order meta-learning-based molecular generator designed for few-shot and property-conditioned molecular generation. MetaMolGen standardizes the distribution of graph motifs by mapping them to a normalized latent space, and employs a lightweight autoregressive sequence model to generate SMILES sequences that faithfully reflect the underlying molecular structure. In addition, it supports conditional generation of molecules with target properties through a learnable property projector integrated into the generative process. Experimental results demonstrate that MetaMolGen consistently generates valid and diverse SMILES sequences under low-data regimes, outperforming conventional baselines. This highlights its advantage in fast adaptation and efficient conditional generation for practical molecular design.

2405.08036 2026-06-04 cs.LG cs.AI 版本更新

Potentially Optimal Joint Actions Recognition for Cooperative Multi-Agent Reinforcement Learning

合作多智能体强化学习中潜在最优联合动作识别

Chang Huang, Shatong Zhu, Junqiao Zhao, Hongtu Zhou, Di Zhang, Hai Zhang, Chen Ye, Ziqiao Wang, Guang Chen

发表机构 * School of Computer Science and Technology, Tongji University(同济大学计算机科学与技术学院) Stanford University(斯坦福大学) MOE Key Lab of Embedded System and Service Computing, Tongji University, Shanghai, China(同济大学嵌入式系统与服务计算教育部重点实验室,上海,中国) The University of Hong Kong(香港大学) Shanghai Innovation Institute(上海创新研究院)

AI总结 针对值函数分解中单调性约束限制表达能力的问题,提出潜在最优联合动作加权方法,通过迭代加权训练保证最优策略恢复,在多个任务上超越现有方法。

Comments ICLR 2026

详情
Journal ref
ICLR 2026
AI中文摘要

值函数分解在合作多智能体强化学习(MARL)中被广泛使用。现有方法通常对联合动作值与个体动作值之间施加单调性约束以实现分散执行。然而,此类约束限制了值函数分解的表达能力,缩小了可表示的联合动作值范围,并阻碍了最优策略的学习。为解决这一问题,我们提出了潜在最优联合动作加权(POW)方法,该方法在现有近似加权策略可能失效的情况下确保最优策略恢复。POW通过一个理论上有依据的迭代加权训练过程,迭代地识别潜在最优联合动作并为其分配更高的训练权重。我们证明该机制保证了真实最优策略的恢复,克服了先前启发式加权策略的局限性。POW是架构无关的,可以无缝集成到现有的值函数分解算法中。在矩阵博弈、难度增强的捕食者-猎物任务、SMAC、SMACv2以及高速公路环境交叉口场景上的大量实验表明,POW显著提升了稳定性,并持续超越了最先进的基于值的MARL方法。

英文摘要

Value function factorization is widely used in cooperative multi-agent reinforcement learning (MARL). Existing approaches often impose monotonicity constraints between the joint action value and individual action values to enable decentralized execution. However, such constraints limit the expressiveness of value factorization, restricting the range of joint action values that can be represented and hindering the learning of optimal policies. To address this, we propose Potentially Optimal Joint Actions Weighting (POW), a method that ensures optimal policy recovery where existing approximate weighting strategies may fail. POW iteratively identifies potentially optimal joint actions and assigns them higher training weights through a theoretically grounded iterative weighted training process. We prove that this mechanism guarantees recovery of the true optimal policy, overcoming the limitations of prior heuristic weighting strategies. POW is architecture-agnostic and can be seamlessly integrated into existing value factorization algorithms. Extensive experiments on matrix games, difficulty-enhanced predator-prey tasks, SMAC, SMACv2, and a highway-env intersection scenario show that POW substantially improves stability and consistently surpasses state-of-the-art value-based MARL methods.

2503.18721 2026-06-04 math.ST cs.CR cs.LG stat.ME stat.ML stat.TH 版本更新

Differentially Private Joint Independence Test

差分隐私联合独立性检验

Xingwei Liu, Yuexin Chen, Jin-Ting Zhang, Wangli Xu

发表机构 * Center for Applied Statistics and School of Statistics, Renmin University of China(中国人民大学应用统计中心与统计学院) Department of Statistics and Data Science, National University of Singapore(统计与数据科学系,新加坡国立大学)

AI总结 针对隐私约束下的多随机向量联合依赖检测问题,提出基于差分隐私置换的dHSIC检验方法,实现有效水平、点态一致性和极小极大最优功效。

Comments 57 pages, 7 figures

详情
AI中文摘要

多个随机向量之间的联合依赖识别在许多统计应用中扮演重要角色,其中数据可能包含敏感或机密信息。本文在差分隐私背景下考虑$d$变量希尔伯特-施密特独立性准则(dHSIC)。鉴于dHSIC经验估计的极限分布是复杂的高斯混沌,非隐私场景下的检验通常基于置换和自助法。为了在隐私约束下检测联合依赖,我们提出了一种采用差分隐私置换方法的基于dHSIC的检验程序。我们证明该方法具有隐私保证、有效水平和点态一致性,而自助法存在功效不一致的问题。我们进一步研究了所提检验在dHSIC和$L_2$度量下的均匀功效,表明该检验在不同隐私机制下达到极小极大最优功效。作为副产品,我们证明了Pfister等人(2018)提出的非隐私置换dHSIC检验是我们差分隐私置换检验的特例,并且我们的结果也建立了其点态和均匀功效——从而解决了该工作中的开放问题。因果推断中的数值模拟和真实数据分析表明,我们提出的检验在实证中表现良好。

英文摘要

Identification of joint dependence among several random vectors plays an important role in many statistical applications, where the data may contain sensitive or confidential information. In this paper, we consider the $d$-variable Hilbert-Schmidt independence criterion (dHSIC) in the context of differential privacy. Given that the limiting distribution of the empirical estimate of dHSIC is a complicated Gaussian chaos, constructing tests in the non-private regime is typically based on permutation and bootstrap methods. To detect joint dependence under privacy constraints, we propose a dHSIC-based testing procedure employing a differentially private permutation methodology. We show that our method enjoys privacy guarantees, a valid level, and pointwise consistency, whereas the bootstrap counterpart suffers from inconsistent power. We further investigate the uniform power of the proposed test under the dHSIC and $L_2$ metrics, showing that the proposed test attains the minimax optimal power across different privacy regimes. As a byproduct, we show that the non-private permutation dHSIC test proposed in Pfister et al. (2018) is a special case of our differentially private permutation test, and our results also establish its pointwise and uniform power--thus resolving an open problem from that work. Both numerical simulations and real data analysis in causal inference suggest that our proposed test performs well empirically.

2502.08870 2026-06-04 cs.LG stat.ML 版本更新

When and why randomised exploration works (in linear bandits)

随机探索何时以及为何有效(在线性赌博机中)

Marc Abeille, David Janz, Ciara Pike-Burke

发表机构 * Criteo AI Lab(Criteo AI实验室) University of Oxford(牛津大学) Imperial College London(帝国理工学院)

AI总结 本文提出一种不依赖强制乐观或后验膨胀的分析方法,证明在动作空间光滑且强凸的d维线性赌博机中,随机探索算法(如汤普森采样)可实现O(d√n log(n))的n步遗憾界,首次表明在非平凡线性赌博机设置中汤普森采样能达到最优维度依赖。

Comments Minor corrections to formulas and text; results unchanged

详情
AI中文摘要

我们提供了一种分析随机探索算法(如汤普森采样)的方法,该方法不依赖于强制乐观或后验膨胀。通过这种方法,我们证明在$d$维线性赌博机设置中,当动作空间光滑且强凸时,随机探索算法享有$O(d\sqrt{n} \log(n))$阶的$n$步遗憾界。值得注意的是,这首次表明存在非平凡的线性赌博机设置,其中汤普森采样可以在遗憾中实现最优维度依赖。

英文摘要

We provide an approach for the analysis of randomised exploration algorithms like Thompson sampling that does not rely on forced optimism or posterior inflation. With this, we demonstrate that in the $d$-dimensional linear bandit setting, when the action space is smooth and strongly convex, randomised exploration algorithms enjoy an $n$-step regret bound of the order $O(d\sqrt{n} \log(n))$. Notably, this shows for the first time that there exist non-trivial linear bandit settings where Thompson sampling can achieve optimal dimension dependence in the regret.

2408.01382 2026-06-04 cs.LG cs.GT 版本更新

Explaining a probabilistic prediction on the simplex with Shapley compositions

用Shapley组合解释单纯形上的概率预测

Paul-Gauthier Noé, Miquel Perelló-Nieto, Jean-François Bonastre, Peter Flach

发表机构 * Laboratoire Informatique d’Avignon, Avignon Université, France(阿维尼昂信息实验室,阿维尼昂大学,法国) University of Bristol, United Kingdom(布里斯托大学,英国)

AI总结 本文引入Shapley组合,利用成分数据分析的Aitchison几何,为多类概率预测提供了一种基于公理的解释方法。

Comments Published in ECAI2024's proceedings

详情
AI中文摘要

源于博弈论的Shapley值被广泛用于通过量化每个特征值对预测的贡献来解释机器学习模型的预测。这需要像二分类中那样的标量预测,而多类概率预测是离散概率分布,位于多维单纯形上。在这种多类设置中,Shapley值通常以一对多的方式单独计算每个类别,忽略了输出分布的组成性质。在本文中,我们引入Shapley组合作为一种有根据的方法来正确解释多类概率预测,使用成分数据分析中的Aitchison几何。我们证明了Shapley组合是满足Aitchison单纯形上的线性性、对称性和效率的唯一量,扩展了标准Shapley值的相应公理性质。我们在一系列场景中展示了这种正确的多类处理。

英文摘要

Originating in game theory, Shapley values are widely used for explaining a machine learning model's prediction by quantifying the contribution of each feature's value to the prediction. This requires a scalar prediction as in binary classification, whereas a multiclass probabilistic prediction is a discrete probability distribution, living on a multidimensional simplex. In such a multiclass setting the Shapley values are typically computed separately on each class in a one-vs-rest manner, ignoring the compositional nature of the output distribution. In this paper, we introduce Shapley compositions as a well-founded way to properly explain a multiclass probabilistic prediction, using the Aitchison geometry from compositional data analysis. We prove that the Shapley composition is the unique quantity satisfying linearity, symmetry and efficiency on the Aitchison simplex, extending the corresponding axiomatic properties of the standard Shapley value. We demonstrate this proper multiclass treatment in a range of scenarios.
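The Aitchison geometry that the abstract relies on can be illustrated with its basic operations on the probability simplex: closure, perturbation (the simplex analogue of addition), and the centred log-ratio (clr) transform. This sketch does not compute Shapley compositions. It only shows the operations in which such contributions would plausibly be combined, and the function names are hypothetical.

```python
import math


def closure(x):
    """Rescale positive parts so they sum to 1 (a point on the simplex)."""
    total = sum(x)
    return [v / total for v in x]


def perturb(x, y):
    """Aitchison perturbation: element-wise product followed by closure."""
    return closure([a * b for a, b in zip(x, y)])


def clr(x):
    """Centred log-ratio transform: log parts minus their mean log."""
    logs = [math.log(v) for v in x]
    mean_log = sum(logs) / len(logs)
    return [value - mean_log for value in logs]


# A uniform three-class prediction is the neutral element of perturbation.
base = closure([1.0, 1.0, 1.0])
shift = [0.5, 0.3, 0.2]
moved = perturb(base, shift)

# clr turns perturbation into ordinary vector addition.
other = [0.2, 0.2, 0.6]
lhs = clr(perturb(shift, other))
rhs = [a + b for a, b in zip(clr(shift), clr(other))]
```

Because clr maps perturbation to vector addition, additive axioms such as linearity and efficiency have a direct counterpart on the simplex. This matches the abstract's statement that the axioms extend those of the standard Shapley value.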

2502.05349 2026-06-04 math.OC cs.LG 版本更新

Contextual Scenario Generation for Two-Stage Stochastic Programming

两阶段随机规划的上下文情境生成

David Islip, Roy H. Kwon, Sanghyeon Bae, Woo Chang Kim

发表机构 * Department of Mechanical and Industrial Engineering, University of Toronto(机械与工业工程系,多伦多大学) Department of Industrial and Systems Engineering, Korea Advanced Institute of Science and Technology (KAIST)(工业与系统工程系,韩国科学技术院(KAIST))

AI总结 针对两阶段随机规划中情境数量大、部署受限的问题,提出两种情境生成方法(基于分布和基于任务),通过上下文信息学习生成少量替代情境,并保证决策质量。

Comments 79 pages, 12 figures

详情
AI中文摘要

两阶段随机规划(2SPs)广泛用于不确定性下的决策,但其实际部署通常受限于需要大量情境来近似不确定结果的条件分布。我们研究上下文情境生成:给定上下文信息,学习生成一个小的、用户指定的替代情境集,当将其作为2SP的输入时,能产生高质量的2SP决策。现有的情境生成方法要么忽略上下文信息,要么在此设置下计算负担沉重。我们提出上下文情境生成(CSG),它学习从上下文到一组替代情境的映射。我们开发了两种互补的方法:(i)基于分布的方法,通过最小化与条件分布的基于核的距离来学习从上下文到情境的映射;(ii)基于任务的方法,通过对下游2SP目标的学习代理模型求导来选择映射,以优化决策质量。这两种方法都广泛适用,仅需要重复求解底层子问题和在生成的情境上定义的2SP。我们提供了有限样本泛化保证,并在多个2SP类别上展示了强大的实证性能。

英文摘要

Two-stage stochastic programs (2SPs) are widely used for decision-making under uncertainty, but their practical deployment is often limited by the large number of scenarios needed to approximate the conditional distribution of uncertain outcomes. We study contextual scenario generation: given contextual information, learn to produce a small, user-specified set of surrogate scenarios that, when used as input into the 2SP, lead to high-quality 2SP decisions. Existing scenario generation methods either ignore contextual information or are computationally burdensome in this setting. We propose contextual scenario generation (CSG), which learns a mapping from context to a set of surrogate scenarios. We develop two complementary methodologies: (i) a distributional approach that learns a mapping from context to scenarios by minimizing a kernel-based distance to the conditional distribution, and (ii) a task-based approach that selects the mapping to optimize decision quality via differentiating through a learned surrogate of the downstream 2SP objective. Both approaches are broadly applicable and require only repeated solution of the underlying subproblems and 2SPs defined on the generated scenarios. We provide finite-sample generalization guarantees and demonstrate strong empirical performance across multiple 2SP classes.

2408.11121 2026-06-04 cs.LG cs.AI cs.CL cs.CR 版本更新

DOMBA: Double Model Balancing for Access-Controlled Language Models via Minimum-Bounded Aggregation

DOMBA: 通过最小有界聚合实现访问控制语言模型的双模型平衡

Tom Segal, Asaf Shabtai, Yuval Elovici

发表机构 * Ben-Gurion University(本·古里安大学)

AI总结 提出DOMBA方法,通过最小有界平均函数聚合两个不同访问级别文档训练的语言模型的概率分布,在保证安全性的同时实现高效用。

Comments Code: https://github.com/ppo1/DOMBA 11 pages, 3 figures

详情
Journal ref
Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 39, pp. 25101-25109, 2025
AI中文摘要

大型语言模型(LLMs)的实用性在很大程度上取决于其训练数据的质量和数量。许多组织拥有大量数据语料库,可用于训练或微调针对其特定需求的LLMs。然而,这些数据集通常带有基于用户权限并由访问控制机制强制执行的访问限制。在此类数据集上训练LLMs可能导致敏感信息暴露给未经授权的用户。防止此类暴露的一种直接方法是为每个访问级别训练一个单独的模型。然而,由于每个模型的训练数据量相对于整个组织语料库的总量有限,这可能导致模型效用低下。另一种方法是在所有数据上训练单个LLM,同时限制未经授权信息的暴露。然而,当前针对LLMs的暴露限制方法对于访问控制数据无效,因为敏感信息在多个训练样本中频繁出现。我们提出DOMBA——双模型平衡——一种训练和部署LLMs的简单方法,可在提供高效用和访问控制功能的同时保证安全性。DOMBA使用“最小有界”平均函数(一个受较小值约束的函数,例如调和平均)聚合两个模型的概率分布,每个模型在具有(可能多个)不同访问级别的文档上训练。详细的数学分析和广泛评估表明,DOMBA在保护受限信息的同时,提供了与非安全模型相当的效用。

英文摘要

The utility of large language models (LLMs) depends heavily on the quality and quantity of their training data. Many organizations possess large data corpora that could be leveraged to train or fine-tune LLMs tailored to their specific needs. However, these datasets often come with access restrictions that are based on user privileges and enforced by access control mechanisms. Training LLMs on such datasets could result in exposure of sensitive information to unauthorized users. A straightforward approach for preventing such exposure is to train a separate model for each access level. This, however, may result in low utility models due to the limited amount of training data per model compared to the amount in the entire organizational corpus. Another approach is to train a single LLM on all the data while limiting the exposure of unauthorized information. However, current exposure-limiting methods for LLMs are ineffective for access-controlled data, where sensitive information appears frequently across many training examples. We propose DOMBA - double model balancing - a simple approach for training and deploying LLMs that provides high utility and access-control functionality with security guarantees. DOMBA aggregates the probability distributions of two models, each trained on documents with (potentially many) different access levels, using a "min-bounded" average function (a function that is bounded by the smaller value, e.g., harmonic mean). A detailed mathematical analysis and extensive evaluation show that DOMBA safeguards restricted information while offering utility comparable to non-secure models.
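The "min-bounded" aggregation idea can be sketched with the harmonic mean, which the abstract names as an example. For two probabilities it never exceeds twice the smaller one, so a token that either model considers very unlikely stays unlikely after aggregation. The renormalisation step, the function names, and the toy distributions are assumptions for illustration. The abstract does not specify DOMBA's exact aggregation function.

```python
def harmonic_mean(p, q):
    """Harmonic mean of two probabilities; at most 2 * min(p, q)."""
    if p == 0.0 or q == 0.0:
        return 0.0
    return 2.0 * p * q / (p + q)


def aggregate(dist_a, dist_b):
    """Combine two next-token distributions and renormalise (assumed step)."""
    tokens = set(dist_a) | set(dist_b)
    scores = {
        t: harmonic_mean(dist_a.get(t, 0.0), dist_b.get(t, 0.0))
        for t in tokens
    }
    total = sum(scores.values())
    if total == 0.0:
        raise ValueError("the two distributions share no support")
    return {t: s / total for t, s in scores.items()}


# Hypothetical next-token distributions: model A saw the restricted
# documents containing "secret", model B did not.
model_a = {"the": 0.5, "secret": 0.4, "a": 0.1}
model_b = {"the": 0.6, "secret": 0.001, "a": 0.399}

combined = aggregate(model_a, model_b)
```

In this toy case the token that only one model favours ends up with well under 1% of the probability mass, while tokens both models support keep most of it.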

2412.03008 2026-06-04 cs.SI cs.DS cs.LG 版本更新

Local Clustering on Complex Graphs and Complex Hypergraphs

复杂图与复杂超图上的局部聚类

Zihao Li, Dongqi Fu, Hengyu Liu, Jingrui He

发表机构 * University of Illinois Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校) Meta

AI总结 本文通过扩展非近似的Andersen-Chung-Lang (ACL)聚类算法,提出了GeneralACL和HyperACL两种算法,分别适用于带权、有向、自环图以及边依赖顶点权重的超图,并证明了在温和条件下它们能识别出电导率二次最优的聚类。

Comments KDD 2026, Preprint version. 26 pages

详情
AI中文摘要

局部/种子聚类旨在找到靠近给定起始实例的紧凑聚类。虽然现有的大多数图聚类研究假设离散图设置(即无权重、无向、无自环图),但现实世界的图可能更加复杂。在本文中,我们将经典的非近似Andersen-Chung-Lang (ACL)聚类算法扩展到离散图之外,并将其二次最优性推广到更广泛的复杂图,包括带权、有向、自环图以及具有边依赖顶点权重的超图。具体来说,通过利用PageRank,我们提出了两种算法:用于图的GeneralACL和用于超图的HyperACL。我们证明,在两种温和条件下,这两种算法都能识别出电导率方面二次最优的聚类。此外,我们提供了实验来验证我们的理论发现。我们的代码可在https://github.com/iDEA-iSAIL-Lab-UIUC/HyperACL获取。

英文摘要

Local/seeded clustering aims to find a compact cluster near the given starting instances. While most existing studies on graph clustering assume a discrete graph setting (i.e., unweighted, undirected graphs without self-loops), real-world graphs can be more complex. In this paper, we extend the classic non-approximating Andersen-Chung-Lang (ACL) clustering algorithm beyond discrete graphs and generalize its quadratic optimality to a wider range of complex graphs, including weighted, directed, and self-looped graphs and hypergraphs with edge-dependent vertex weights. Specifically, by leveraging PageRank, we propose two algorithms: GeneralACL for graphs and HyperACL for hypergraphs. We prove that, under two mild conditions, both algorithms can identify a quadratically optimal cluster in terms of conductance. Additionally, we provide experiments to validate our theoretical findings. Our code is available at https://github.com/iDEA-iSAIL-Lab-UIUC/HyperACL.
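The classic setting that the paper generalises can be sketched as a personalised PageRank vector from a seed, followed by a sweep cut that picks the degree-normalised prefix with the lowest conductance. The sketch covers only unweighted, undirected graphs without self-loops or isolated vertices. It is not GeneralACL or HyperACL, and it uses plain power iteration of the standard random walk instead of the ACL push procedure. All names and the toy graph are hypothetical.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=200):
    """Power iteration for p = alpha * e_seed + (1 - alpha) * p * W."""
    p = {v: 0.0 for v in adj}
    p[seed] = 1.0
    for _ in range(iters):
        nxt = {v: 0.0 for v in adj}
        nxt[seed] += alpha  # teleport mass returns to the seed
        for u, nbrs in adj.items():
            share = (1.0 - alpha) * p[u] / len(nbrs)
            for v in nbrs:
                nxt[v] += share
        p = nxt
    return p


def conductance(adj, cluster):
    """Cut edges divided by the smaller of the two side volumes."""
    cluster = set(cluster)
    cut = sum(1 for u in cluster for v in adj[u] if v not in cluster)
    vol_in = sum(len(adj[u]) for u in cluster)
    vol_out = sum(len(adj[u]) for u in adj) - vol_in
    return cut / min(vol_in, vol_out)


def sweep_cut(adj, scores):
    """Scan prefixes ordered by score / degree; keep the best conductance."""
    order = sorted(adj, key=lambda v: scores[v] / len(adj[v]), reverse=True)
    best, best_phi = None, float("inf")
    for k in range(1, len(order)):  # proper, non-empty prefixes only
        prefix = order[:k]
        phi = conductance(adj, prefix)
        if phi < best_phi:
            best, best_phi = set(prefix), phi
    return best, best_phi


# Two triangles joined by a single bridge edge (2 - 3).
adj = {
    0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
    3: [2, 4, 5], 4: [3, 5], 5: [3, 4],
}

scores = personalized_pagerank(adj, seed=0)
cluster, phi = sweep_cut(adj, scores)
```

Seeding at vertex 0 recovers the seed's own triangle, whose only boundary edge is the bridge.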

2411.19758 2026-06-04 cs.CV cs.AI cs.LG 版本更新

LaVIDE: Language-Prompted Satellite Change Detection via Map-Image Alignment

LaVIDE: 通过地图-图像对齐的语言提示卫星变化检测

Shuguo Jiang, Fang Xu, Chuandong Liu, Hong Tan, Shengyang Li, Lei Yu, Wen Yang, Sen Jia, Gui-Song Xia

发表机构 * School of Computer Science, Wuhan University(武汉大学计算机学院) School of Artificial Intelligence, Wuhan University(武汉大学人工智能学院) Technology and Engineering Center for Space Utilization and the Key Laboratory of Space Utilization, Chinese Academy of Sciences(中国科学院空间应用工程与技术中心及空间应用重点实验室) School of Aeronautics and Astronautics, University of Chinese Academy of Sciences(中国科学院大学航空宇航学院) School of Electronic Information, Wuhan University(武汉大学电子信息学院) College of Computer Science and Software Engineering, Shenzhen University(深圳大学计算机科学与软件工程学院)

AI总结 提出LaVIDE框架,利用受限提示学习和对象感知嵌入增强,通过语言弥合高层地图类别与低层图像细节之间的语义鸿沟,实现跨模态对齐,在多类与单类变化检测任务上分别提升IoU 18.4%和5.2%。

详情
AI中文摘要

基于地图参考和最新图像的遥感变化检测,在缺乏早期图像进行比较时,有助于及时观测地球表面。然而,高层地图类别与低层图像细节之间的语义鸿沟阻碍了提取同质特征以进行稳健的时间关联。与比较像素级视觉相似性或传播分割误差的传统方法不同,我们提出了一种新颖框架——LaVIDE(用于检测变化的语言-视觉判别器),该框架以语言为中介,弥合了高层地图类别与低层图像细节之间的语义鸿沟。具体来说,我们引入了受限提示学习来生成上下文感知的文本提示,使地图语义与图像内容对齐,并采用对象感知嵌入增强策略将对象级属性(如形状、边界)整合到地图表示中。这些组件能够在统一的语言-视觉特征空间中实现稳健的跨模态对齐。在四个基准数据集(DynamicEarthNet、HRSCD、BANDON和SECOND)上的大量实验表明,LaVIDE以显著优势超越了最先进的方法,在多类和单类变化检测任务上分别实现了18.4%和5.2%的IoU提升。我们的框架不仅提高了地图-图像变化检测的准确性,还为以最少人工干预快速更新地图提供了实用解决方案,有望在城市规划、灾害评估和生态保护等领域产生广泛影响。代码和数据集可在 https://github.com/ShuGuoJ/LAVIDE.git 获取。

英文摘要

Remote sensing change detection based on a map reference and an up-to-date image boosts timely observation of the Earth's surface when earlier images are lacking for comparison. However, the semantic gap between high-level map categories and low-level image details hinders the extraction of homogeneous features for robust temporal association in change detection. Unlike conventional approaches that either compare pixel-level visual similarity or propagate segmentation errors, we propose a novel framework, Language-VIsion Discriminator for dEtecting changes (LaVIDE), which bridges the semantic gap between high-level map categories and low-level image details using language as an intermediary. Specifically, we introduce restricted prompt learning to generate context-aware textual prompts that align map semantics with image content, and an object-aware embedding enhancement strategy to integrate object-level attributes (e.g., shape, boundary) into map representations. These components enable robust cross-modal alignment within a unified language-vision feature space. Extensive experiments on four benchmarks, DynamicEarthNet, HRSCD, BANDON, and SECOND, demonstrate that LaVIDE outperforms state-of-the-art methods by significant margins, achieving $18.4\%$ and $5.2\%$ improvements in IoU on multi-class and single-class change detection tasks, respectively. Our framework not only advances the accuracy of map-image change detection but also provides a practical solution for rapid map updating with minimal human intervention, promising broad impacts in urban planning, disaster assessment, and ecological conservation. Code and datasets are available at: https://github.com/ShuGuoJ/LAVIDE.git.

2411.05591 2026-06-04 stat.ML cs.LG 版本更新

Decentralized EM Algorithm for Gaussian Mixtures under Data Heterogeneity and Partial Labeling

数据异质性和部分标记下高斯混合的分布式EM算法

Xuetong Li, Shuyuan Wu, Bin Du, Hansheng Wang

发表机构 * School of Mathematics and Statistics(数学与统计学学院) School of Statistics and Data Science(统计与数据科学学院) Guanghua School of Management(光华管理学院)

AI总结 针对分布式联邦学习中数据异质性导致经典EM算法估计有偏的问题,提出动量网络EM(MNEM)算法和半监督MNEM(semi-MNEM)算法,实现渐近有效估计并加速收敛。

详情
AI中文摘要

我们系统研究了分布式联邦学习(DFL)中高斯混合模型的几种基于网络的期望最大化(EM)算法。理论研究表明,当数据在不同站点间异质分布时,直接将经典EM算法扩展到DFL会导致有偏估计。为解决这一问题,我们引入了动量网络EM(MNEM)算法,该算法整合了当前和先前DFL迭代的历史估计信息。我们进一步开发了半监督MNEM(semi-MNEM)算法,利用部分标记数据提供的信息。严格的理论分析表明,在适当的正则条件下,即使数据异质,MNEM估计器也能达到与全样本估计器相同的渐近效率。此外,即使不同混合成分分离较差,semi-MNEM估计器也能显著提高MNEM算法的收敛速度。进行了大量模拟,并分析了一个广泛使用的胸部X射线数据集,以证明所提出方法的有限样本性能。

英文摘要

We systematically study several network-based Expectation-Maximization (EM) algorithms for the Gaussian mixture model within decentralized federated learning (DFL). Our theoretical investigation shows that directly extending the classic EM algorithm to DFL leads to a biased estimator when data are heterogeneously distributed across sites. To address this, we introduce a momentum network EM (MNEM) algorithm, which integrates information from both current and historical estimators from previous DFL iterations. We further develop a semi-supervised MNEM (semi-MNEM) algorithm, which utilizes information provided by partially labeled data. Rigorous theoretical analysis demonstrates that the MNEM estimator can achieve the same asymptotic efficiency as the whole-sample estimator under appropriate regularity conditions, even with heterogeneous data. Moreover, the semi-MNEM estimator significantly improves the convergence speed of the MNEM algorithm, even if different mixture components are poorly separated. Extensive simulations are conducted, and a widely used chest X-ray dataset is analyzed to demonstrate the finite-sample performance of the proposed methods.

2205.08609 2026-06-04 stat.ML cs.LG stat.ME 版本更新

Bagged Polynomial Regression and Neural Networks

袋装多项式回归与神经网络

Sylvia Klosin, Jaume Vives-i-Bastida

发表机构 * Department of Agricultural and Resource Economics, UC Davis(加州大学戴维斯分校农业与资源经济学系) Stanford Graduate School of Business(斯坦福商学院)

AI总结 针对高维预测问题,提出基于随机投影的袋装多项式回归(BPR),在保持与神经网络相当精度的同时提供可解释性和诊断工具。

详情
AI中文摘要

气候和环境应用越来越依赖于从遥感和其他科学数据中进行高维预测。神经网络(NN)在这些场景中能够提供强大的准确性,但往往难以审计且难以与领域知识对齐。作为替代方案,我们提出了基于随机投影的袋装多项式回归(BPR),这是一种计量经济学原生的集成方法,它对在随机选择的协变量组上拟合的多个正则化低次多项式模型进行平均。我们提供了新颖的有限样本和渐近风险界,并展示了协变量划分如何通过控制字典基增长来改善光滑目标函数的速率。速率改进对于边际效应的估计可能尤其重要。在使用光学和雷达图像进行基于卫星的作物分类的应用中,BPR 在保持易于诊断的同时达到了与 NN 相当的准确性。我们提供了实用的透明度工具、系数汇总和偏依赖诊断,表明 BPR 捕捉到了 NN 未能捕捉到的直观特征关系。

英文摘要

Climate and environmental applications increasingly rely on high-dimensional prediction from remote sensing and other scientific data. Neural networks (NN) can deliver strong accuracy in these settings, but they are often hard to audit and hard to align with domain knowledge. As an alternative, we propose bagged polynomial regression with random projections (BPR), an econometrics-native ensemble that averages many regularized low-degree polynomial models fit on randomly selected covariate groups. We provide novel finite-sample and asymptotic risk bounds and show how covariate partitioning can improve rates for smooth target functions by controlling dictionary basis growth. Rate improvements may be particularly relevant for the estimation of marginal effects. In an application to satellite-based crop classification using optical and radar imagery, BPR matches NN accuracy while remaining straightforward to diagnose. We provide practical transparency tools, coefficient summaries and partial-dependence diagnostics, that show BPR captures intuitive feature relationships that NNs do not.

2407.13922 2026-06-04 cs.CV cs.AI cs.LG 版本更新

CounterFace: A Synthetic Face Dataset for Fine-Grained Counterfactual Evaluation of Face Recognition Systems

CounterFace: 用于人脸识别系统细粒度反事实评估的合成人脸数据集

Guruprasad Viswanathan Ramesh, Ashish Hooda, Shimaa Ahmed, Harrison J Rosenberg, Ramya Korlakai Vinayak, Kassem Fawaz

发表机构 * University of Wisconsin-Madison(威斯康星大学麦迪逊分校) Visa Research(Visa研究)

AI总结 提出CounterFace数据集,通过全自动流水线生成包含20种面部属性和8种人口统计因素的11,821个反事实人脸对,用于细粒度评估人脸识别系统在特定属性-人口统计组合下的性能退化。

Comments Code available at https://github.com/Guruprasad68/counterface_facct2026. Dataset available for non-commercial research upon request

详情
AI中文摘要

人脸识别系统广泛应用于关键应用,因此其在不同人群和条件下的可靠性和鲁棒性至关重要。人脸识别系统的标准评估通常依赖LFW等数据集来估计平均识别准确率。一些基准测试也捕捉了粗粒度的身份内变化,如老化、姿态和光照。然而,人脸存在更细粒度的变化,包括发型和化妆等外观变化,这些在现有基准测试中代表性不足。反事实评估提供了一种在细粒度变化下评估人脸识别鲁棒性的方法。然而,现有使用图像生成器合成的反事实人脸数据集由于在流程中使用人工验证,属性覆盖范围有限。我们提出CounterFace,一个新的反事实评估数据集,包含20种面部属性和8种人口统计因素,超过先前合成人脸数据集14种属性和2种人口统计因素。该数据集使用基于现成图像生成器和自定义验证器的全自动流水线生成,无需人工验证。CounterFace包含11,821个反事实人脸对,事后用户研究证实了生成反事实的忠实性。我们评估了两个商业和四个开源人脸识别系统(AWS Rekognition、Face++、AdaFace、MagFace、ArcFace、FaceNet)在160种属性-人口统计组合上的性能。与标准评估基准不同,我们的数据集有助于隔离单个系统的精确故障模式。结果表明,所有六个系统的性能退化因属性和人口统计而异,遮挡属性(如口罩和胡须)普遍降低性能。

英文摘要

Face recognition (FR) systems are widely deployed in critical applications, making their reliability and robustness across diverse populations and conditions essential. Standard evaluation of FR systems typically relies on datasets such as LFW to estimate average recognition accuracy. Some benchmarks also capture coarse-grained intra-identity variations such as aging, pose, and lighting. However, human faces undergo more fine-grained changes, including appearance changes such as hairstyles and makeup, that are underrepresented in existing benchmarks. Counterfactual evaluation provides a method to assess FR robustness under such fine-grained variations. Existing counterfactual face datasets synthesized with image generators, however, are limited in attribute coverage due to the use of humans for verification in the pipeline. We propose CounterFace, a new counterfactual evaluation dataset comprising 20 facial attributes and 8 demographic factors, exceeding prior synthetic face datasets by 14 attributes and 2 demographics. The dataset is generated using a fully automated pipeline based on off-the-shelf image generators with custom verifiers, removing the need for human verification. CounterFace contains 11,821 counterfactual face pairs, and a post-hoc user study confirms the faithfulness of the generated counterfactuals. We evaluate two commercial and four open-source FR systems (AWS Rekognition, Face++, AdaFace, MagFace, ArcFace, FaceNet) across 160 attribute-demographic combinations. Unlike standard evaluation benchmarks, our dataset helps isolate precise failure modes for individual systems. Results indicate that performance degradation varies across attributes and demographics for all six systems, and that occluding attributes (e.g., facemask and facial hair) universally degrade performance.

2004.10846 2026-06-04 cs.CY cs.LG 版本更新

Reducing the Filtering Effect in Public School Admissions: A Bias-aware Analysis for Targeted Interventions

减少公立学校招生中的过滤效应:面向针对性干预的偏差感知分析

Yuri Faenza, Swati Gupta, Aapeli Vuorinen, Xuan Zhang

发表机构 * Columbia University(哥伦比亚大学) Massachusetts Institute of Technology(麻省理工学院)

AI总结 本研究采用运筹学方法,通过分析纽约市教育部数据,将弱势学生分数分布偏移建模为偏差,并证明针对中等成绩弱势学生的集中干预(如奖学金或培训)可显著降低偏差影响。

详情
AI中文摘要

问题定义:传统上,纽约市顶尖的8所公立学校仅根据学生在特殊高中入学考试(SHSAT)中的成绩选拔候选人。这些成绩已知受到学生社会经济地位和初中所接受的考试准备的影响,导致教育管道中产生巨大的过滤效应。经典的学校分配机制并未自然解决学校隔离和班级多样性等问题,这些问题近年来日益恶化。包括政策制定者在内的科学界通过引入群体特定配额和比例约束来应对,但结果好坏参半。寻找有效且公平的方法以扩大优质教育机会的问题仍未解决。 方法/结果:我们采用与大多数现有文献不同的运筹学方法,目标是增加经济需求高的学生的机会。利用纽约市教育部(DOE)的数据,我们展示了被DOE归类为“弱势”(主要基于经济因素的标准)的学生所获分数的分布存在偏移。我们将这种偏移建模为“偏差”,源于对弱势学生真实潜力的低估。我们分析了这种偏差对分类匹配市场的影响。我们表明,当针对中等成绩的弱势学生群体时,通过奖学金或培训进行的集中干预可以显著降低偏差的影响。

英文摘要

Problem definition: Traditionally, New York City's top 8 public schools have selected candidates solely based on their scores in the Specialized High School Admissions Test (SHSAT). These scores are known to be impacted by the socioeconomic status of students and the test preparation received in middle schools, leading to a massive filtering effect in the education pipeline. The classical mechanisms for assigning students to schools do not naturally address problems like school segregation and class diversity, which have worsened over the years. The scientific community, including policymakers, has reacted by incorporating group-specific quotas and proportionality constraints, with mixed results. The problem of finding effective and fair methods for broadening access to top-notch education is still unsolved. Methodology/results: We take an operations approach to the problem different from most established literature, with the goal of increasing opportunities for students with high economic needs. Using data from the Department of Education (DOE) in New York City, we show that there is a shift in the distribution of scores obtained by students that the DOE classifies as "disadvantaged" (following criteria mostly based on economic factors). We model this shift as a "bias" that results from an underestimation of the true potential of disadvantaged students. We analyze the impact this bias has on an assortative matching market. We show that centrally planned interventions can significantly reduce the impact of bias through scholarships or training, when they target the segment of disadvantaged students with average performance.
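A toy sketch can make the bias mechanism concrete. It assumes an additive score penalty for disadvantaged students and a single top-k cutoff standing in for the assortative matching market. Both simplifications, and all names and numbers, are illustrative assumptions rather than the model analyzed in the paper.

```python
def observed_score(potential, disadvantaged, bias, supported):
    """Observed score equals true potential, lowered by `bias` for
    disadvantaged students who did not receive an intervention."""
    if disadvantaged and not supported:
        return potential - bias
    return potential

def admit(students, seats, bias, supported=frozenset()):
    """students: list of (name, potential, disadvantaged).
    Returns the names of the top `seats` students by observed score."""
    scored = [(observed_score(p, d, bias, name in supported), name)
              for name, p, d in students]
    scored.sort(reverse=True)
    return [name for _, name in scored[:seats]]

# Hypothetical cohort: "c" has higher potential than "b" but is penalized.
cohort = [("a", 90, False), ("b", 75, False), ("c", 80, True), ("d", 60, True)]
print(admit(cohort, seats=2, bias=10))                   # c is filtered out
print(admit(cohort, seats=2, bias=10, supported={"c"}))  # c is admitted
```

In this toy cohort the intervention only changes the outcome when it reaches a student near the cutoff ("c"). Supporting "d" instead would leave the admitted set unchanged, which loosely mirrors the abstract's point about targeting a specific segment.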

1710.04238 2026-06-04 stat.ME cs.LG cs.NA math.NA 版本更新

Regression-aware decompositions

回归感知的分解

Mark Tygert

发表机构 * Facebook Artificial Intelligence Research(脸书人工智能研究)

AI总结 本文提出了一种回归感知的分解方法,通过结合线性最小二乘回归模型与插值分解,实现了对矩阵B的监督降维,从而揭示了B中与A回归相关的结构。

Comments 19 pages, 9 figures, 2 tables

详情
Journal ref
Linear Algebra and Its Applications, 565 (6): 208-224, 2019
AI中文摘要

线性最小二乘回归利用“设计”矩阵A来近似给定矩阵B,方法是在所有尺寸相容的矩阵X上最小化谱范数或Frobenius范数意义下的差异||AX-B||。另一种流行的近似方法是低秩近似,可通过主成分分析(PCA,本质上即奇异值分解SVD)或插值分解(ID)实现。传统上,PCA/SVD和ID仅使用被近似的矩阵B,而不受任何辅助矩阵A的监督。然而,线性最小二乘回归模型可以指导ID,从而产生回归感知的ID。作为额外的好处,这还给出了一种解释:它相当于A与B之间某种典型相关分析的回归感知PCA。回归感知的分解实际上使监督信息能够指导经典的降维方法,而经典降维历来是完全无监督的。回归感知的分解揭示了B中与针对A的回归相关的内在结构。

英文摘要

Linear least-squares regression with a "design" matrix A approximates a given matrix B via minimization of the spectral- or Frobenius-norm discrepancy ||AX-B|| over every conformingly sized matrix X. Another popular approximation is low-rank approximation via principal component analysis (PCA) -- which is essentially singular value decomposition (SVD) -- or interpolative decomposition (ID). Classically, PCA/SVD and ID operate solely with the matrix B being approximated, not supervised by any auxiliary matrix A. However, linear least-squares regression models can inform the ID, yielding regression-aware ID. As a bonus, this provides an interpretation as regression-aware PCA for a kind of canonical correlation analysis between A and B. The regression-aware decompositions effectively enable supervision to inform classical dimensionality reduction, which classically has been totally unsupervised. The regression-aware decompositions reveal the structure inherent in B that is relevant to regression against A.
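For orientation, the two classical ingredients named in the abstract can be written out as standard textbook formulas. This covers the Frobenius-norm case only. The symbols $A^{+}$, $\hat{B}$, $J$, and $P$ are notation introduced here, and the paper's regression-aware construction itself is not reproduced.

```latex
% Least squares in the Frobenius norm: normal equations and pseudoinverse solution
\min_{X} \|AX - B\|_F^2
\;\Longrightarrow\;
A^\top A X = A^\top B,
\qquad
X^\star = A^{+} B,
\qquad
\hat{B} = A A^{+} B .

% Interpolative decomposition (ID): approximate B by a subset J of its own columns
B \;\approx\; B_{:,J}\, P,
\qquad |J| = k \ll \text{number of columns of } B .
```

Here $\hat{B}$ is the part of $B$ that the columns of $A$ can explain, and $P$ is an interpolation matrix expressing every column of $B$ in terms of the $k$ selected columns.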

2312.08472 2026-06-04 cs.NE cs.LG cs.NA math.NA 版本更新

AutoNumerics-Zero: Automated Discovery of State-of-the-Art Mathematical Functions

AutoNumerics-Zero:自动发现最先进的数学函数

Esteban Real, Mirko Rossini, Connal de Souza, Manav Garg, Moritz Firsching, Quoc V. Le, Yao Chen, Akhil Verghese, Ekin Dogus Cubuk, David H. Park

发表机构 * University of California, Berkeley(加州大学伯克利分校)

AI总结 本文通过符号回归的进化方法,在不依赖任意精度的情况下,自动发现比传统方法更高效的数学函数近似程序,例如一个10操作的程序逼近指数函数达到14位有效数字。

Comments v2: Accepted to the International Conference on Machine Learning (ICML 2026); added results, clarified framing, and added proofs

详情
AI中文摘要

超越函数(如指数函数)是科学计算的核心,但数字硬件无法原生计算它们。相反,计算机必须通过组合基本运算(如$\{+, -, \times, ÷\}$)使用泰勒级数等方法近似这些函数。这些方法由数学家经过几个世纪发展而来,专注于能够达到任意精度的途径。然而,计算机通过仅使用有限精度类型(如float32)即可处理大多数应用,超出该类型精度的任何精度实际上都被丢弃了。因此,我们探索放弃任意精度是否能够发现更高效的近似。符号回归的进化方法特别适合,因为它可以搜索任意操作组合,并优化不可微的目标(如使用的操作数)。我们的结果表明,进化能够发现在此设置下优于已有方法的计算机程序,尽管除了基本运算的计算外没有先前的数学知识。从空代码开始,符号回归构建表示新颖数学表达式的程序。特别地,我们发现了10个操作的逼近指数函数达到14位有效数字的程序,其精度比此前已知的同等规模近似高出超过6个数量级。

英文摘要

Transcendental functions, such as the exponential, are central to scientific computing, yet they cannot be natively calculated by digital hardware. Instead, computers must approximate these functions by combining basic operations, such as $\{+, -, \times, ÷\}$, using methods like Taylor series. These methods were developed over centuries by mathematicians, who focused on approaches that could attain arbitrary accuracy. However, computers can handle most applications by using only finite-precision types, like float32, where any accuracy beyond the type's precision is effectively discarded. We explore, therefore, whether forgoing arbitrary accuracy can lead to the discovery of more efficient approximations. The evolutionary method of symbolic regression is particularly suitable, as it can search for arbitrary operation combinations and can optimize non-differentiable objectives, such as the number of operations used. Our results show that evolution can discover computer programs that outperform established methods in this setting, despite having no prior mathematical knowledge beyond the calculation of the basic operations. Starting from empty code, symbolic regression constructs programs representing novel mathematical expressions. In particular, we discovered a 10-operation program that approximates the exponential function to 14 significant figures, exceeding the accuracy of previously known approximations of this size by more than 6 orders of magnitude.
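As a point of reference for the classical baseline mentioned above, here is a minimal sketch of a truncated Taylor series for the exponential, evaluated with Horner's rule using only additions, multiplications, and divisions. The function names and the choice of 12 terms are illustrative assumptions. This is not one of the evolved programs from the paper.

```python
import math

def exp_taylor(x, terms=12):
    """Approximate exp(x) by sum_{j=0}^{terms} x**j / j!, using Horner's rule.

    Each loop iteration uses one multiplication, one division and one addition.
    """
    acc = 1.0
    for k in range(terms, 0, -1):
        acc = 1.0 + x * acc / k
    return acc

def operation_count(terms):
    """Number of basic operations (+, *, /) executed by exp_taylor."""
    return 3 * terms

if __name__ == "__main__":
    for x in (0.0, 0.5, 1.0):
        err = abs(exp_taylor(x) - math.exp(x))
        print(f"x={x}: abs error={err:.2e}, ops={operation_count(12)}")
```

The series can be made as accurate as desired by adding terms, but each extra term costs three more operations. That trade-off is what the abstract proposes to revisit once the target precision is fixed in advance.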

2209.15448 2026-06-04 cs.LG math.ST stat.ME stat.TH 版本更新

Blessing from Human-AI Interaction: Super Reinforcement Learning in Confounded Environments

人机交互的福音:混杂环境下的超级强化学习

Jiayi Wang, Zhengling Qi, Chengchun Shi

发表机构 * Department of Mathematical Sciences, University of Texas at Dallas(德克萨斯大学达拉斯分校数学科学系) Department of Statistics, London School of Economics and Political Science(伦敦政治经济学院统计系) Department of Decision Sciences, George Washington University(乔治华盛顿大学决策科学系)

AI总结 提出利用人机交互中的观察动作进行超级策略学习,在存在未测量混杂的情况下,通过近端因果推断实现优于标准最优策略和行为策略的超级策略。

详情
AI中文摘要

随着人工智能在社会中越来越普遍,整合人类和AI系统以发挥各自优势并降低风险的有效方法已成为重要优先事项。在本文中,我们引入了超级策略学习的范式,该范式利用人机交互进行数据驱动的序贯决策。这种方法将来自AI或人类的观察动作作为输入,以实现决策者(人类或AI)在策略学习中更强的oracle。在存在未测量混杂的决策过程中,过去智能体采取的动作可以揭示未公开信息的有价值见解。通过以一种新颖且合法的方式将这些信息纳入策略搜索,所提出的超级策略学习将产生一个超级策略,该策略保证优于标准最优策略和行为策略(例如,过去智能体的动作)。我们将这种更强的oracle称为人机交互的福音。此外,为了解决使用批处理数据寻找超级策略时的未测量混杂问题,在近端因果推断框架下建立了一系列非参数和因果识别。基于这些新颖的识别结果,我们开发了几种超级策略学习算法,并系统研究了它们的理论性质,例如有限样本遗憾保证。最后,通过大量模拟和实际应用说明了我们方法的有效性。

英文摘要

As AI becomes more prevalent throughout society, effective methods of integrating humans and AI systems that leverage their respective strengths and mitigate risk have become an important priority. In this paper, we introduce the paradigm of super policy learning that takes advantage of Human-AI interaction for data-driven sequential decision making. This approach utilizes the observed action, either from AI or humans, as input for achieving a stronger oracle in policy learning for the decision maker (humans or AI). In the decision process with unmeasured confounding, the actions taken by past agents can offer valuable insights into undisclosed information. By including this information for the policy search in a novel and legitimate manner, the proposed super policy learning will yield a super-policy that is guaranteed to outperform both the standard optimal policy and the behavior one (e.g., past agents' actions). We call this stronger oracle a blessing from human-AI interaction. Furthermore, to address the issue of unmeasured confounding in finding super-policies using the batch data, a number of nonparametric and causal identifications are established under the framework of proximal causal inference. Building upon these novel identification results, we develop several super-policy learning algorithms and systematically study their theoretical properties such as finite-sample regret guarantee. Finally, we illustrate the effectiveness of our proposal through extensive simulations and real-world applications.

1409.6111 2026-06-04 math.OC cs.LG cs.MA cs.SY eess.SY stat.ML 版本更新

Distributed Clustering and Learning Over Networks

网络上的分布式聚类与学习

Xiaochuan Zhao, Ali H. Sayed

发表机构 * Department of Electrical Engineering, University of California, Los Angeles(加州大学洛杉矶分校电气工程系)

AI总结 本文提出了一种自适应的聚类和学习方案,使智能体能够学习应与哪些邻居合作以及应忽略哪些邻居,从而在网络中实现更准确的学习和估计。通过详细的均方分析,评估了聚类机制的第一类和第二类错误(虚警和漏检)概率,并证明这些概率随步长指数衰减,从而可以使正确聚类的概率任意接近于一。

Comments 47 pages, 6 figures

详情
AI中文摘要

网络上的分布式处理依赖于网内处理和邻近智能体之间的合作。当智能体共享共同目标时,合作是有益的。然而,在许多应用中,智能体可能属于不同的集群,追求不同的目标。此时,无差别合作会导致不期望的结果。在本文中,我们提出了一种自适应的聚类和学习方案,使智能体能够学习应与哪些邻居合作以及应忽略哪些其他邻居。通过这样做,所得到的算法使智能体能够识别其集群,并在网络中实现改进的学习和估计准确性。我们进行了详细的均方分析,并评估了聚类机制的第一类和第二类错误概率,即虚警和漏检概率。此外,我们证明这些概率随着步长指数衰减,从而使正确聚类的概率可以任意接近于一。

英文摘要

Distributed processing over networks relies on in-network processing and cooperation among neighboring agents. Cooperation is beneficial when agents share a common objective. However, in many applications agents may belong to different clusters that pursue different objectives. Then, indiscriminate cooperation will lead to undesired results. In this work, we propose an adaptive clustering and learning scheme that allows agents to learn which neighbors they should cooperate with and which other neighbors they should ignore. In doing so, the resulting algorithm enables the agents to identify their clusters and to attain improved learning and estimation accuracy over networks. We carry out a detailed mean-square analysis and assess the error probabilities of Types I and II, i.e., false alarm and mis-detection, for the clustering mechanism. Among other results, we establish that these probabilities decay exponentially with the step-sizes so that the probability of correct clustering can be made arbitrarily close to one.

1808.03408 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

A Unified Analysis of AdaGrad with Weighted Aggregation and Momentum Acceleration

AdaGrad的统一分析:带加权聚合和动量加速

Li Shen, Congliang Chen, Fangyu Zou, Zequn Jie, Ju Sun, Wei Liu

发表机构 * JD Explore Academy, Beijing, China(京东探索研究院,北京,中国) Facebook, USA(Facebook,美国) Meituan, Beijing, China(美团,北京,中国) University of Minnesota, Twin Cities, USA(明尼苏达大学双城分校,美国) Tencent, Shenzhen, China(腾讯,深圳,中国)

AI总结 本文提出了一种名为AdaUSM的加权AdaGrad算法,通过统一动量方案和新型加权自适应学习率,实现了在非凸随机设置下的O(log(T)/√T)收敛率,并从新视角解释了Adam和RMSProp的自适应学习率。

Comments IEEE TNNLS

详情
AI中文摘要

将自适应学习率和动量技术整合到SGD中,可以得到一系列高效加速的自适应随机算法,如AdaGrad、RMSProp、Adam、AccAdaGrad等。尽管这些算法在实践中效果显著,但其收敛理论仍存在较大空白,尤其是在困难的非凸随机设置下。为此,我们提出了名为AdaUSM的带统一动量的加权AdaGrad,其主要特点包括(1)采用统一的动量方案,涵盖重球动量和Nesterov加速梯度动量;(2)采用新颖的加权自适应学习率,能够统一AdaGrad、AccAdaGrad、Adam和RMSProp的学习率。此外,当在AdaUSM中采用多项式增长的权重时,可以得到非凸随机设置下的O(log(T)/√T)收敛率。我们还展示了Adam和RMSProp的自适应学习率对应于在AdaUSM中采用指数增长的权重,从而为理解Adam和RMSProp提供了新的视角。最后,我们还在各种深度学习模型和数据集上进行了AdaUSM与带动量的SGD、AdaGrad、AdaEMA、Adam和AMSGrad的比较实验。

英文摘要

Integrating adaptive learning rate and momentum techniques into SGD leads to a large class of efficiently accelerated adaptive stochastic algorithms, such as AdaGrad, RMSProp, Adam, AccAdaGrad, etc. Despite their effectiveness in practice, there is still a large gap in their convergence theory, especially in the difficult non-convex stochastic setting. To fill this gap, we propose weighted AdaGrad with unified momentum, dubbed AdaUSM, whose main characteristics are that (1) it incorporates a unified momentum scheme which covers both the heavy ball momentum and the Nesterov accelerated gradient momentum; (2) it adopts a novel weighted adaptive learning rate that can unify the learning rates of AdaGrad, AccAdaGrad, Adam, and RMSProp. Moreover, when we take polynomially growing weights in AdaUSM, we obtain its $\mathcal{O}(\log(T)/\sqrt{T})$ convergence rate in the non-convex stochastic setting. We also show that the adaptive learning rates of Adam and RMSProp correspond to taking exponentially growing weights in AdaUSM, thereby providing a new perspective for understanding Adam and RMSProp. Lastly, comparative experiments of AdaUSM against SGD with momentum, AdaGrad, AdaEMA, Adam, and AMSGrad on various deep learning models and datasets are also carried out.
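To make the two ingredients concrete, the sketch below combines a plain AdaGrad per-coordinate learning rate with heavy-ball momentum on a deterministic toy problem. It is deliberately not AdaUSM: the weighted aggregation and the unified (Nesterov-capable) momentum of the paper are omitted, and the function name, hyperparameters, and test objective are assumptions chosen for illustration.

```python
import math

def adagrad_heavy_ball(grad, x0, lr=0.5, beta=0.9, steps=300, eps=1e-8):
    """Minimize a function given its gradient, using AdaGrad step sizes
    combined with heavy-ball momentum (illustrative sketch only)."""
    x = list(x0)
    m = [0.0] * len(x)   # momentum buffer
    v = [0.0] * len(x)   # running sum of squared gradients (AdaGrad)
    for _ in range(steps):
        g = grad(x)
        for i in range(len(x)):
            v[i] += g[i] * g[i]
            step = lr / (math.sqrt(v[i]) + eps)   # adaptive learning rate
            m[i] = beta * m[i] - step * g[i]      # heavy-ball update
            x[i] += m[i]
    return x

# Toy objective f(x) = 0.5 * (x0^2 + 10 * x1^2) with gradient (x0, 10 * x1).
def quad_grad(x):
    return [x[0], 10.0 * x[1]]

if __name__ == "__main__":
    print(adagrad_heavy_ball(quad_grad, [3.0, -2.0]))
```

Replacing the unweighted sum `v[i] += g[i] * g[i]` with a weighted sum is where a scheme of the kind described in the abstract would differ. The exact weighting is defined in the paper and is not guessed here.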

1709.09480 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

A Benchmark Environment Motivated by Industrial Control Problems

由工业控制问题启发的基准环境

Daniel Hein, Stefan Depeweg, Michel Tokic, Steffen Udluft, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing

发表机构 * Siemens AG, Corporate Technology(西门子股份公司企业技术部)

AI总结 本文提出一个结合工业控制问题的基准环境,旨在解决真实工业环境与现有人工基准之间缺乏联系的问题,通过详细描述基准动态并识别典型实验设置来促进强化学习方法的改进。

详情
Journal ref
2017 IEEE Symposium Series on Computational Intelligence (SSCI)
AI中文摘要

在强化学习(RL)研究领域,频繁出现新的有前景的方法被开发并引入RL社区。然而,尽管许多研究人员渴望将他们的方法应用于现实世界的问题,但在真实工业环境中实施这些方法往往是一个令人沮丧和繁琐的过程。通常,学术研究小组只能有限地访问真实工业数据和应用。因此,新方法通常通过使用人工软件基准来开发、评估和比较。一方面,这些基准旨在提供可解释的RL训练场景和对所用方法学习过程的深入见解。另一方面,它们通常与现实工业应用缺乏相似性。为此,我们利用行业经验设计了一个基准,以弥合自由可用、文档齐全且有动机的人工基准与真实工业问题属性之间的差距。所得到的工业基准(IB)已通过在GitHub上发布其Java和Python代码,包括一个OpenAI Gym包装器,向RL社区公开。在本文中,我们详细阐述了IB的动力学,并识别了能够捕捉现实世界工业控制问题中常见情况的典型实验设置。

英文摘要

In the research area of reinforcement learning (RL), novel and promising methods are frequently developed and introduced to the RL community. However, although many researchers are keen to apply their methods to real-world problems, implementing such methods in real industry environments is often a frustrating and tedious process. Generally, academic research groups have only limited access to real industrial data and applications. For this reason, new methods are usually developed, evaluated and compared by using artificial software benchmarks. On one hand, these benchmarks are designed to provide interpretable RL training scenarios and detailed insight into the learning process of the method at hand. On the other hand, they usually do not share much similarity with industrial real-world applications. For this reason, we used our industry experience to design a benchmark which bridges the gap between freely available, documented, and motivated artificial benchmarks and properties of real industrial problems. The resulting industrial benchmark (IB) has been made publicly available to the RL community by publishing its Java and Python code, including an OpenAI Gym wrapper, on GitHub. In this paper we motivate and describe in detail the IB's dynamics and identify prototypic experimental settings that capture common situations in real-world industry control problems.

2107.01629 2026-06-04 stat.ML cs.LG econ.GN q-fin.EC stat.AP 版本更新

From Live to Recording: Consumer Demand and Response to Price Across the Livestreaming Lifecycle

从直播到录制:直播生命周期中的消费者需求与价格响应

Ziwei Cong, Jia Liu, Puneet Manchanda

发表机构 * Georgetown University(乔治城大学) Hong Kong University of Science and Technology(香港科技大学) University of Michigan(密歇根大学) Stephen M. Ross School of Business, University of Michigan(密歇根大学罗斯商学院)

AI总结 利用大型直播平台数据,研究消费者在直播前后对价格敏感性的差异,发现直播前时期的需求对价格更敏感,这部分由消费者自选择和质量不确定性驱动。

Comments An earlier version of this paper was distributed under the title "The Role of 'Live' in Livestreaming Markets: Evidence Using Orthogonal Random Forest."

详情
AI中文摘要

直播已发展成为一个蓬勃发展的行业,创作者可以直接从中获利并与观众和粉丝互动。在实践中,创作者和平台通常将营销工作集中在直播前的时期。然而,直播活动在结束后自然过渡到录制格式,创造了潜在的“剩余”变现机会。本研究利用一个大型直播平台的数据,系统性地考察了消费者在整个直播生命周期中对直播活动的需求,该平台允许消费者在直播结束后购买付费直播活动的录制版本。我们发现,与直播后时期相比,直播前时期的需求对价格更敏感。这部分由两种机制驱动:消费者自选择(不常消费的消费者可能错过了直播活动,对录制版本表现出更高的支付意愿)和质量不确定性(消费者在直播前时期面临的事件质量不确定性高于直播后时期)。我们的研究结果为直播市场的定价和定向策略提供了启示。

英文摘要

Livestreaming has evolved into a thriving industry where creators can directly monetize and engage with their audiences and followers. In practice, creators and platforms typically concentrate their marketing efforts on the period leading up to the livestream. However, livestreaming events naturally transition into recorded formats once the event concludes, creating potential "residual" opportunities for monetization. This study systematically examines consumer demand for live events throughout the entire livestream life-cycle, using data from a large livestreaming platform that allows consumers to purchase the recorded version of a paid live event after the livestream ends. We find that the demand is surprisingly more price-sensitive during the pre-livestream period compared to the post-period. This is partly driven by two mechanisms: consumer self-selection (infrequent consumers who may have missed the live events exhibit a higher willingness to pay for recorded versions) and quality uncertainty (consumers face higher uncertainty in event quality during the pre-period than in the post-period). Our findings generate implications for the pricing and targeting strategies in livestreaming markets.

2006.04013 2026-06-04 cs.CY cs.AI cs.LG 版本更新

AI from concrete to abstract: demystifying artificial intelligence to the general public

从具体到抽象的人工智能:向公众揭秘人工智能

Rubens Lacerda Queiroz, Fábio Ferrentini Sampaio, Cabral Lima, Priscila Machado Vieira Lima

发表机构 * Federal University of Rio de Janeiro – UFRJ – Brazil(巴西里约热内卢联邦大学) InovLabs – Portugal(葡萄牙InovLabs) Atlantica University – Portugal(葡萄牙Atlantica大学) PESC/COPPE Tercio Pacitti Institute (NCE)(PESC/COPPE,Tercio Pacitti研究所(NCE))

AI总结 本文提出一种结合可视化编程与WiSARD无权重人工神经网络的新方法AIcon2abs,通过实践开发学习机器并观察其学习过程,帮助普通大众(包括儿童)理解人工智能的基本概念。

Comments 23 pages; 2 tables; 47 figures; review comment: Included references for the final published peer-reviewed version of this pre-print: https://doi.org/10.1007/s00146-021-01151-x and https://rdcu.be/cihdO; typos corrected

详情
Journal ref
AI & SOCIETY, 36 877-893 (2021)
AI中文摘要

人工智能(AI)已被广泛应用于众多领域,这表明迫切需要开发手段,使普通大众对AI的含义有最基本的理解。本文结合可视化编程与WiSARD无权重人工神经网络,提出了一种新方法——从具体到抽象的人工智能(AIcon2abs),使普通人(包括儿童)能够实现这一目标。该方法的主要策略是通过与学习机器开发相关的实践活动,以及观察其学习过程,来促进对人工智能的去神秘化。因此,它能够使受训者获得技能,从而在涉及采用人工智能机制的辩论和决策中成为有洞察力的参与者。目前,通过编程教授基本AI概念的现有方法将机器智能视为外部元素/模块。经过训练后,该外部模块被耦合到学习者正在开发的主应用程序中。而在本文提出的方法中,训练和分类任务都是构成主程序的模块,就像其他编程结构一样。作为AIcon2abs的一个有益副作用,能够从数据中学习的程序与常规计算机程序之间的区别变得更加明显。此外,WiSARD无权重人工神经网络模型的简单性使得训练和分类任务的内部实现易于可视化和理解。

英文摘要

Artificial Intelligence (AI) has been adopted in a wide range of domains. This shows the imperative need to develop means to endow common people with a minimum understanding of what AI means. Combining visual programming and WiSARD weightless artificial neural networks, this article presents a new methodology, AI from concrete to abstract (AIcon2abs), to enable the general public (including children) to achieve this goal. The main strategy adopted is to promote a demystification of artificial intelligence via practical activities related to the development of learning machines, as well as through the observation of their learning process. Thus, it is possible to provide subjects with skills that contribute to making them insightful actors in debates and decisions involving the adoption of artificial intelligence mechanisms. Currently, existing approaches to the teaching of basic AI concepts through programming treat machine intelligence as an external element/module. After being trained, that external module is coupled to the main application being developed by the learners. In the methodology herein presented, both training and classification tasks are blocks that compose the main program, just as the other programming constructs. As a beneficial side effect of AIcon2abs, the difference between a program capable of learning from data and a conventional computer program becomes more evident. In addition, the simplicity of the WiSARD weightless artificial neural network model enables easy visualization and understanding of the internal realization of training and classification tasks.
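The simplicity claimed for WiSARD can be seen in a minimal sketch of a weightless classifier. It assumes binary inputs whose length is a multiple of the tuple size, a shared pseudo-random input mapping, and RAM nodes modeled as Python sets. The class and parameter names are illustrative and are not taken from AIcon2abs.

```python
import random

class Discriminator:
    """One class's bank of RAM nodes; each RAM stores the addresses it has seen."""

    def __init__(self, n_bits, tuple_size, mapping):
        self.tuple_size = tuple_size
        self.mapping = mapping
        self.rams = [set() for _ in range(n_bits // tuple_size)]

    def _addresses(self, bits):
        for r in range(len(self.rams)):
            idx = self.mapping[r * self.tuple_size:(r + 1) * self.tuple_size]
            yield r, tuple(bits[i] for i in idx)

    def train(self, bits):
        # Training writes a 1 at the addressed location of every RAM.
        for r, addr in self._addresses(bits):
            self.rams[r].add(addr)

    def score(self, bits):
        # Classification counts how many RAMs recognize their address.
        return sum(addr in self.rams[r] for r, addr in self._addresses(bits))

class WiSARD:
    def __init__(self, n_bits, tuple_size=2, seed=0):
        self.n_bits = n_bits
        self.tuple_size = tuple_size
        self.mapping = list(range(n_bits))
        random.Random(seed).shuffle(self.mapping)  # shared input-to-RAM mapping
        self.discriminators = {}

    def train(self, bits, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(
                self.n_bits, self.tuple_size, self.mapping)
        self.discriminators[label].train(bits)

    def classify(self, bits):
        return max(self.discriminators,
                   key=lambda lab: self.discriminators[lab].score(bits))
```

There are no weights and no gradient steps: learning is a memory write and inference is a vote count, which is why the internal state is easy to display to learners.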

1809.03225 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Gait learning for soft microrobots controlled by light fields

基于光场控制的软微机器人步态学习

Alexander von Rohr, Sebastian Trimpe, Alonso Marco, Peer Fischer, Stefano Palagi

发表机构 * Micro, Nano, and Molecular Systems Group, Max Planck Institute for Intelligent Systems(马克斯·普朗克智能系统研究所微纳与分子系统组) Max Planck ETH Center for Learning Systems(马克斯·普朗克-ETH学习系统中心)

AI总结 本文提出一种基于贝叶斯优化和高斯过程的概率学习方法,用于优化光场控制的软微机器人步态,通过有限实验预算实现高效且鲁棒的运动性能提升。

Comments 8 pages, 7 figures, to appear in the proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems 2018

详情
AI中文摘要

基于光场控制的软微机器人可以生成多种不同的步态。这种内在的灵活性可以用来最大化其在特定环境中的运动性能,并用于适应变化的条件。然而,由于缺乏准确的运动模型以及微机器人之间的固有变异性,分析控制设计是不可能的。另一方面,常见的数据驱动方法需要运行大量的实验,导致非常特定于样本的结果。本文提出了一种基于贝叶斯优化(BO)和高斯过程(GPs)的概率学习方法,用于光场控制的软微机器人。所提出的方法产生了一种学习方案,该方案在数据效率方面表现优异,能够在有限的实验预算下进行步态优化,并且对微机器人样本之间的差异具有鲁棒性。这些特性是通过在半合成数据集上比较不同的GP先验和BO设置来设计学习方案获得的。开发的学习方案在微机器人实验中得到验证,结果在仅20次实验的预算下,使微机器人的运动性能提高了115%。这些令人鼓舞的结果为基于光场控制的软微机器人和概率学习控制的自适应微机器人系统铺平了道路。

英文摘要

Soft microrobots based on photoresponsive materials and controlled by light fields can generate a variety of different gaits. This inherent flexibility can be exploited to maximize their locomotion performance in a given environment and used to adapt them to changing conditions. However, because of the lack of accurate locomotion models, and given the intrinsic variability among microrobots, analytical control design is not possible. Common data-driven approaches, on the other hand, require running prohibitive numbers of experiments and lead to very sample-specific results. Here we propose a probabilistic learning approach for light-controlled soft microrobots based on Bayesian Optimization (BO) and Gaussian Processes (GPs). The proposed approach results in a learning scheme that is data-efficient, enabling gait optimization with a limited experimental budget, and robust against differences among microrobot samples. These features are obtained by designing the learning scheme through the comparison of different GP priors and BO settings on a semi-synthetic data set. The developed learning scheme is validated in microrobot experiments, resulting in a 115% improvement in a microrobot's locomotion performance with an experimental budget of only 20 tests. These encouraging results lead the way toward self-adaptive microrobotic systems based on light-controlled soft microrobots and probabilistic learning control.

1904.08962 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Constrained Restless Bandits for Dynamic Scheduling in Cyber-Physical Systems

用于网络物理系统动态调度的受约束 restless 老虎机

Kesav Kaza, Rahul Meshram, Varun Mehta, S. N. Merchant

发表机构 * Department of Electrical Engineering, Indian Institute of Technology Bombay(印度理工学院孟买分校电气工程系) Polytechnique Montreal(蒙特利尔理工学院) IIIT Allahabad(印度信息技术学院阿拉哈巴德分校) University of Ottawa(渥太华大学) IIT Bombay(印度理工学院孟买分校)

AI总结 本文研究可用臂集随时间变化的受约束 restless 多臂老虎机(CRMAB),证明相应的单臂问题具有阈值型最优策略且可索引,给出 Whittle 指数的计算算法和复杂度更低的在线滚动策略,并通过仿真比较多种策略的性能。

Comments 17 pages, 2 figures

详情
AI中文摘要

本文研究了一类受约束的 restless 多臂老虎机(CRMAB)。约束以随时间变化的动作集(可用臂集)形式存在。这种变化可以是随机的或半确定性的。给定一组臂,每个决策区间内可以选择固定数量的臂进行拉动。每次拉动一个臂会产生依赖于其状态的奖励。各臂的当前状态只能通过被拉动的臂返回的二元反馈信号部分观察,而臂当前的可用性则完全可观察。目标是最大化长期累积奖励。未来臂可用性的不确定性以及部分状态信息使这一目标具有挑战性。CRMAB可应用于网络物理系统中的资源分配,其中各组件的可用性随时间变化。首先,利用 Whittle 指数策略分析该优化问题。为此,研究了一个受约束的 restless 单臂老虎机,证明其具有阈值型最优策略,并且是可索引的(indexable)。提出了一种计算 Whittle 指数的算法。还以在线滚动(rollout)策略的形式提出了一种复杂度更低的替代求解方法。文中详细讨论了这两种方案的复杂度,表明短前瞻的在线滚动策略比 Whittle 指数计算更容易实现。进一步,推导了价值函数的上界,以估计各种解的次优程度。仿真研究比较了 Whittle 指数、在线滚动、短视(myopic)和改进的 Whittle 指数策略的性能。

英文摘要

This paper studies a class of constrained restless multi-armed bandits (CRMAB). The constraints are in the form of time varying set of actions (set of available arms). This variation can be either stochastic or semi-deterministic. Given a set of arms, a fixed number of them can be chosen to be played in each decision interval. The play of each arm yields a state dependent reward. The current states of arms are partially observable through binary feedback signals from arms that are played. The current availability of arms is fully observable. The objective is to maximize long term cumulative reward. The uncertainty about future availability of arms along with partial state information makes this objective challenging. Applications for CRMAB can be found in resource allocation in cyber-physical systems involving components with time varying availability. First, this optimization problem is analyzed using Whittle's index policy. To this end, a constrained restless single-armed bandit is studied. It is shown to admit a threshold-type optimal policy and is also indexable. An algorithm to compute Whittle's index is presented. An alternate solution method with lower complexity is also presented in the form of an online rollout policy. A detailed discussion on the complexity of both these schemes is also presented, which suggests that online rollout policy with short look ahead is simpler to implement than Whittle's index computation. Further, upper bounds on the value function are derived in order to estimate the degree of sub-optimality of various solutions. The simulation study compares the performance of Whittle's index, online rollout, myopic and modified Whittle's index policies.
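The partially observable setting described above is usually handled by tracking a belief for each arm. The sketch below assumes a two-state arm (good/bad) with transition probabilities `p01` and `p11`, a binary signal emitted with probability `rho1` in the good state and `rho0` in the bad state (both strictly between 0 and 1), and a myopic rule that ranks available arms by belief. The two-state model and all names are assumptions, since the abstract does not specify them.

```python
def propagate(pi, p01, p11):
    """One-step Markov prediction of P(state = good)."""
    return pi * p11 + (1.0 - pi) * p01

def update_belief(pi, played, signal, p01, p11, rho0, rho1):
    """Belief update for one arm over one decision interval.

    If the arm was played, first apply Bayes' rule with the binary feedback
    `signal`; in every case, then propagate through the state transition.
    """
    if played:
        like_good = rho1 if signal == 1 else 1.0 - rho1
        like_bad = rho0 if signal == 1 else 1.0 - rho0
        pi = pi * like_good / (pi * like_good + (1.0 - pi) * like_bad)
    return propagate(pi, p01, p11)

def myopic_choice(beliefs, available, k):
    """Play the k currently available arms with the highest belief."""
    ranked = sorted(available, key=lambda i: beliefs[i], reverse=True)
    return ranked[:k]
```

A Whittle-index policy would replace the ranking key in `myopic_choice` with a per-arm index computed from the belief, and a rollout policy would score each candidate set by simulating a few steps ahead. Neither is implemented here.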

1903.11734 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

A Posteriori Probabilistic Bounds of Convex Scenario Programs with Validation Tests

带验证测试的凸场景规划的后验概率界

Chao Shang, Fengqi You

发表机构 * College of Engineering, Cornell University, Ithaca, New York(康奈尔大学工程学院,纽约州伊萨卡)

AI总结 本文提出了一种新的后验界限,用于凸场景程序的后验概率评估,结合支持约束的实现和样本外验证数据的表现,以提高随机解的风险评估。

详情
Journal ref
IEEE Transactions on Automatic Control, Sept. 2021, Volume 66, Issue 9, Pages 4015 - 4028
AI中文摘要

场景程序已建立为在不确定性下做出决策的有效工具。为了评估基于场景的解决方案的质量,后验验证测试基于伯努利试验已被广泛采用。然而,为了达到理论上可靠的风险判断,通常需要收集大量的验证样本。在本文中,我们提出了一种新的后验界限,用于凸场景程序的后验概率评估,这些界限依赖于支持约束的实现和样本外验证数据的表现。所提出的界限具有广泛的通用性,因为许多现有的理论结果可以作为特殊情况被纳入其中。为了便于实际应用,还开发了一种系统的方法来参数化后验概率界限,该方法被证明具有多种有利的属性,允许易于实施和清晰的解释。通过综合支持约束和验证测试的全面信息,可以比现有的后验界限更有效地评估随机解的风险。对飞机侧向运动控制器设计的案例研究被提出以验证所提出的后验界限的有效性。

英文摘要

Scenario programs have established themselves as efficient tools towards decision-making under uncertainty. To assess the quality of scenario-based solutions a posteriori, validation tests based on Bernoulli trials have been widely adopted in practice. However, to reach a theoretically reliable judgement of risk, one typically needs to collect massive validation samples. In this work, we propose new a posteriori bounds for convex scenario programs with validation tests, which are dependent on both realizations of support constraints and performance on out-of-sample validation data. The proposed bounds enjoy wide generality in that many existing theoretical results can be incorporated as particular cases. To facilitate practical use, a systematic approach for parameterizing a posteriori probability bounds is also developed, which is shown to possess a variety of desirable properties allowing for easy implementations and clear interpretations. By synthesizing comprehensive information about support constraints and validation tests, improved risk evaluation can be achieved for randomized solutions in comparison with existing a posteriori bounds. Case studies on controller design of aircraft lateral motion are presented to validate the effectiveness of the proposed a posteriori bounds.
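The remark that validation tests need many samples can be illustrated with a generic Hoeffding bound on the violation probability estimated from independent Bernoulli trials. This is a standard textbook bound used here only for illustration. It is not the bound proposed in the paper, and the function names are assumptions.

```python
import math

def violation_upper_bound(n_violations, n_samples, beta):
    """Upper bound on the true violation probability that holds with
    confidence at least 1 - beta (one-sided Hoeffding inequality)."""
    v_hat = n_violations / n_samples
    slack = math.sqrt(math.log(1.0 / beta) / (2.0 * n_samples))
    return min(1.0, v_hat + slack)

def samples_needed(eps, beta):
    """Validation samples required so that the Hoeffding slack is at most eps."""
    return math.ceil(math.log(1.0 / beta) / (2.0 * eps ** 2))

if __name__ == "__main__":
    # Zero observed violations in 1000 trials still leaves a sizeable bound.
    print(violation_upper_bound(0, 1000, beta=0.01))
    # A 1% slack at confidence 1 - 1e-6 needs tens of thousands of samples.
    print(samples_needed(eps=0.01, beta=1e-6))
```

The slack shrinks only as the inverse square root of the sample count, which is the cost that motivates also using information available from the scenario program itself, such as the support constraints mentioned in the abstract.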

1711.05519 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Accelerated Alternating Projections for Robust Principal Component Analysis

加速交替投影用于鲁棒主成分分析

HanQin Cai, Jian-Feng Cai, Ke Wei

发表机构 * Department of Mathematics, University of California, Los Angeles(加州大学洛杉矶分校数学系) Department of Mathematics, Hong Kong University of Science and Technology(香港科技大学数学系) School of Data Science, Fudan University(复旦大学数据科学学院)

AI总结 本文提出了一种加速交替投影算法,用于鲁棒主成分分析,显著提高了现有交替投影方法在更新低秩因子时的计算效率,并证明了该算法的精确恢复保证和线性收敛性。

详情
Journal ref
Journal of Machine Learning Research, 20 (2019): 685-717
AI中文摘要

我们研究了完全观测设置下的鲁棒PCA,即从其总和D=L+S中分离低秩矩阵L和稀疏矩阵S。在本文中,提出了一种新的算法,称为加速交替投影,用于鲁棒PCA,显著提高了现有在[Netrapalli, Praneeth, et al., 2014]中提出的交替投影方法在更新低秩因子时的计算效率。通过首先将矩阵投影到某些低维子空间,然后通过截断SVD获得低秩矩阵的新估计,实现了加速。精确恢复保证已经建立,证明了所提出算法的线性收敛性。经验性能评估证明了我们的算法在鲁棒PCA中的优势。

英文摘要

We study robust PCA for the fully observed setting, which is about separating a low rank matrix $\boldsymbol{L}$ and a sparse matrix $\boldsymbol{S}$ from their sum $\boldsymbol{D}=\boldsymbol{L}+\boldsymbol{S}$. In this paper, a new algorithm, dubbed accelerated alternating projections, is introduced for robust PCA which significantly improves the computational efficiency of the existing alternating projections proposed in [Netrapalli, Praneeth, et al., 2014] when updating the low rank factor. The acceleration is achieved by first projecting a matrix onto some low dimensional subspace before obtaining a new estimate of the low rank matrix via truncated SVD. Exact recovery guarantee has been established which shows linear convergence of the proposed algorithm. Empirical performance evaluations establish the advantage of our algorithm over other state-of-the-art algorithms for robust PCA.
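For context, the basic alternating-projections iteration that the abstract builds on can be written schematically as follows. The threshold $\zeta_k$, the target rank $r$, and the operator symbols are notation introduced here, and the staged rank schedule of the original method and the paper's accelerated subspace projection are omitted.

```latex
% Schematic alternating projections for D = L + S
L_{k+1} = \mathcal{H}_r\!\left(D - S_k\right)
\quad \text{(best rank-}r\text{ approximation via truncated SVD)},
\qquad
S_{k+1} = \mathcal{T}_{\zeta_{k+1}}\!\left(D - L_{k+1}\right)
\quad \text{(entrywise hard thresholding)},

% with the hard-thresholding operator defined entrywise by
\left[\mathcal{T}_{\zeta}(M)\right]_{ij} =
\begin{cases}
M_{ij}, & |M_{ij}| > \zeta,\\
0, & \text{otherwise.}
\end{cases}
```

The truncated SVD of the full matrix $D - S_k$ dominates the cost of each iteration. According to the abstract, the acceleration comes from first projecting onto a low-dimensional subspace so that this step becomes cheaper.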

1903.00979 2026-06-04 math.OC cs.LG cs.SY eess.SY math.DS stat.ML 版本更新

Analysis of a Generalized Expectation-Maximization Algorithm for Gaussian Mixture Models: A Control Systems Perspective

Gaussian混合模型中通用期望-最大化算法的分析:控制系统的视角

Sarthak Chatterjee, Orlando Romero, Sérgio Pequito

Affiliations: Department of Electrical, Computer, and Systems Engineering, Rensselaer Polytechnic Institute; Department of Industrial and Systems Engineering, Rensselaer Polytechnic Institute

AI summary: The paper analyzes a generalized expectation-maximization algorithm for Gaussian mixture models from a control systems perspective, explores its convergence properties, and illustrates the advantages of the approach with an example.

Comments 17 pages, 7 figures

详情
AI中文摘要

期望-最大化(EM)算法是无监督学习中解决参数分布基于聚类问题最流行的算法之一。在本文中,我们提出在高斯混合模型的背景下分析一种通用的EM(GEM)算法,其中EM中的最大化步骤被替换为递增步骤。我们证明这种GEM算法可以被理解为具有反馈非线性的线性时不变(LTI)系统。因此,我们利用鲁棒控制理论的工具来探索其收敛性质。最后,我们解释了如何设计所提出的GEM,并通过一个教学示例来理解所提出方法的优势。

英文摘要

The Expectation-Maximization (EM) algorithm is one of the most popular methods used to solve the problem of parametric distribution-based clustering in unsupervised learning. In this paper, we propose to analyze a generalized EM (GEM) algorithm in the context of Gaussian mixture models, where the maximization step in the EM is replaced by an increasing step. We show that this GEM algorithm can be understood as a linear time-invariant (LTI) system with a feedback nonlinearity. Therefore, we explore some of its convergence properties by leveraging tools from robust control theory. Lastly, we explain how the proposed GEM can be designed, and present a pedagogical example to understand the advantages of the proposed approach.
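
To make the idea of replacing the maximization step with an increasing step concrete, here is a minimal sketch under simplifying assumptions that are not from the paper: a one-dimensional mixture of two Gaussians with equal weights and unit variance, where only the means are learned. The names `gem_step` and `step` are illustrative. Instead of jumping to the EM maximizer, each mean moves only part of the way toward it, which still cannot decrease the likelihood in this setting.

```python
import math
import random

def log_likelihood(xs, mus):
    """Log-likelihood of an equal-weight, unit-variance two-component GMM."""
    total = 0.0
    for x in xs:
        p = sum(0.5 * math.exp(-0.5 * (x - m) ** 2) / math.sqrt(2 * math.pi) for m in mus)
        total += math.log(p)
    return total

def gem_step(xs, mus, step=0.5):
    """One generalized EM step: E-step, then a partial move toward the M-step solution."""
    # E-step: responsibility of each component for each point.
    resp = []
    for x in xs:
        w = [math.exp(-0.5 * (x - m) ** 2) for m in mus]
        s = sum(w)
        resp.append([wi / s for wi in w])
    # "Increasing" step: move each mean a fraction `step` toward the EM maximizer.
    new_mus = []
    for k, m in enumerate(mus):
        nk = sum(r[k] for r in resp)
        em_mean = sum(r[k] * x for r, x in zip(resp, xs)) / nk
        new_mus.append(m + step * (em_mean - m))
    return new_mus

rng = random.Random(0)
data = [rng.gauss(-2.0, 1.0) for _ in range(100)] + [rng.gauss(2.0, 1.0) for _ in range(100)]
mus = [-0.5, 0.5]
history = [log_likelihood(data, mus)]
for _ in range(50):
    mus = gem_step(data, mus)
    history.append(log_likelihood(data, mus))
```

With `step=1.0` this reduces to ordinary EM for the means; smaller values give a damped update whose likelihood trace in `history` is still non-decreasing.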

1812.05506 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A predictive safety filter for learning-based control of constrained nonlinear dynamical systems

基于约束非线性动力学系统的预测安全过滤器

Kim P. Wabersich, Melanie N. Zeilinger

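Affiliations: Institute for Dynamic Systems and Control, ETH Zurich, Zurich, Switzerland

AI summary: The paper proposes a predictive safety filter that addresses safety under physical limitations in learning-based control: it turns a constrained dynamical system into an unconstrained safe system to which any reinforcement learning algorithm can be applied directly.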

详情
AI中文摘要

将强化学习(RL)技术转移到现实应用中的挑战在于存在物理限制下的安全要求。大多数RL方法,特别是最流行的算法,不支持显式考虑状态和输入约束。在本文中,我们针对具有连续状态和输入空间的非线性系统,引入了一种预测安全过滤器,能够将约束动力学系统转换为无约束安全系统,并且任何RL算法都可以直接应用。预测安全过滤器接收提出的控制输入,并基于当前系统状态决定是否可以安全地应用于实际系统,或者是否需要进行修改。安全通过一个不断更新的安全策略来建立,该策略基于数据驱动的系统模型和考虑状态和输入依赖的不确定性,采用模型预测控制的公式。

英文摘要

The transfer of reinforcement learning (RL) techniques into real-world applications is challenged by safety requirements in the presence of physical limitations. Most RL methods, in particular the most popular algorithms, do not support explicit consideration of state and input constraints. In this paper, we address this problem for nonlinear systems with continuous state and input spaces by introducing a predictive safety filter, which is able to turn a constrained dynamical system into an unconstrained safe system and to which any RL algorithm can be applied 'out-of-the-box'. The predictive safety filter receives the proposed control input and decides, based on the current system state, if it can be safely applied to the real system, or if it has to be modified. Safety is thereby established by a continuously updated safety policy, which is based on a model predictive control formulation using a data-driven system model and considering state and input dependent uncertainties.

1902.02311 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY 版本更新

Decentralized Multi-Agents by Imitation of a Centralized Controller

通过模仿集中控制器实现去中心化多智能体

Alex Tong Lin, Mark J. Debord, Katia Estabridis, Gary Hewer, Guido Montufar, Stanley Osher

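Affiliations: University of California, Los Angeles (UCLA); Max Planck Institute, Leipzig

AI summary: The paper proposes a new algorithm within the centralized training, decentralized execution framework that uses imitation learning to produce decentralized multi-agents, addressing cooperation in non-stationary, partially observable multi-agent reinforcement learning settings.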

详情
AI中文摘要

我们考虑了一个多智能体强化学习问题,其中每个智能体试图在与其他智能体交互时最大化共享奖励,且可能无法通信。通常,智能体无法访问其他智能体的策略,因此每个智能体都处于非平稳和部分可观测的环境中。为了获得去中心化作用的多智能体,我们引入了一种新的算法,该算法基于流行的集中训练、去中心执行框架。该训练框架首先通过单一集中联合空间学习者解决多智能体问题,然后用于指导模仿学习以生成独立的去中心化多智能体。该框架具有灵活性,可以使用任何强化学习算法来获得专家,以及任何模仿学习算法来获得去中心化智能体。这与其它多智能体学习算法不同,例如可能需要更具体的结构。我们为该方法提供了一些理论界限,并展示了通过模仿学习可以获得多智能体问题的去中心化解决方案。

英文摘要

We consider a multi-agent reinforcement learning problem where each agent seeks to maximize a shared reward while interacting with other agents, and they may or may not be able to communicate. Typically the agents do not have access to other agent policies and thus each agent is situated in a non-stationary and partially-observable environment. In order to obtain multi-agents that act in a decentralized manner, we introduce a novel algorithm under the popular framework of centralized training, but decentralized execution. This training framework first obtains solutions to a multi-agent problem with a single centralized joint-space learner, which is then used to guide imitation learning for independent decentralized multi-agents. This framework has the flexibility to use any reinforcement learning algorithm to obtain the expert as well as any imitation learning algorithm to obtain the decentralized agents. This is in contrast to other multi-agent learning algorithms that, for example, can require more specific structures. We present some theoretical bounds for our method, and we show that one can obtain decentralized solutions to a multi-agent problem through imitation learning.

1701.00178 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Lazily Adapted Constant Kinky Inference for Nonparametric Regression and Model-Reference Adaptive Control

惰性适应的常数Kinky推断用于非参数回归和模型参考自适应控制

Jan-Peter Calliess

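Affiliations: Dept. of Engineering Science, University of Oxford, UK

AI summary: The paper proposes a lazily adapted constant kinky inference method for nonparametric regression and model-reference adaptive control. It estimates the Hölder constant online and establishes strong universal approximation guarantees, showing that any continuous function can be learned in the limit of increasingly dense data.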

详情
AI中文摘要

非线性集合成员预测、Lipschitz插值或Kinky推断是机器学习中利用预设Lipschitz性质来计算未观测函数值推断的方法。在已知目标函数真实最佳Lipschitz常数的上界时,这些方法提供收敛保证和预测的界限。考虑一个更一般的设置,该设置基于相对于伪度量的Hölder连续性,我们提出了一种在线方法,用于从可能受有界观测误差影响的函数值观测中估计Hölder常数。利用此方法在Kinky推断规则中计算自适应参数,从而得到一种非参数机器学习方法,我们为此建立了强通用逼近保证。也就是说,我们证明我们的预测规则在数据越来越密集的情况下,可以学习任意连续函数,其最坏误差界取决于观测不确定性水平。我们在非参数模型参考自适应控制(MRAC)的背景下应用了我们的方法。在一系列模拟飞机滚动动力学和性能指标中,我们的方法优于基于高斯过程和RBF神经网络最近提出的方法。对于离散时间系统,我们为我们的基于学习的控制器在批量学习和在线学习设置下的跟踪成功率提供了保证。

英文摘要

Techniques known as Nonlinear Set Membership prediction, Lipschitz Interpolation or Kinky Inference are approaches to machine learning that utilise presupposed Lipschitz properties to compute inferences over unobserved function values. Provided a bound on the true best Lipschitz constant of the target function is known a priori, they offer convergence guarantees as well as bounds around the predictions. Considering a more general setting that builds on Hölder continuity relative to pseudo-metrics, we propose an online method for estimating the Hölder constant from function value observations that are possibly corrupted by bounded observational errors. Utilising this to compute adaptive parameters within a kinky inference rule gives rise to a nonparametric machine learning method, for which we establish strong universal approximation guarantees. That is, we show that our prediction rule can learn any continuous function in the limit of increasingly dense data to within a worst-case error bound that depends on the level of observational uncertainty. We apply our method in the context of nonparametric model-reference adaptive control (MRAC). Across a range of simulated aircraft roll-dynamics and performance metrics, our approach outperforms recently proposed alternatives that were based on Gaussian processes and RBF-neural networks. For discrete-time systems, we provide guarantees on the tracking success of our learning-based controllers both for the batch and the online learning setting.
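
The prediction rule described above can be sketched in a few lines. This is a simplified illustration, assuming noise-free observations, scalar inputs with the absolute-value metric, and Hölder exponent 1 (plain Lipschitz continuity); the function names are hypothetical. The constant is estimated from the data as the largest observed slope, and the prediction is the midpoint between the tightest upper and lower bounds consistent with that constant.

```python
def estimate_lipschitz(xs, fs):
    """Largest slope between any two observed samples (noise-free assumption)."""
    best = 0.0
    for i in range(len(xs)):
        for j in range(i + 1, len(xs)):
            d = abs(xs[i] - xs[j])
            if d > 0:
                best = max(best, abs(fs[i] - fs[j]) / d)
    return best

def kinky_predict(x, xs, fs, lipschitz):
    """Return (prediction, lower bound, upper bound) at query point x."""
    upper = min(f + lipschitz * abs(x - xi) for xi, f in zip(xs, fs))
    lower = max(f - lipschitz * abs(x - xi) for xi, f in zip(xs, fs))
    return 0.5 * (upper + lower), lower, upper

# Samples of the target f(x) = |x - 1|, whose Lipschitz constant is 1.
xs = [0.0, 0.5, 1.0, 1.5, 2.0]
fs = [abs(x - 1.0) for x in xs]
L_hat = estimate_lipschitz(xs, fs)
pred, lo, hi = kinky_predict(0.75, xs, fs, L_hat)
```

The gap `hi - lo` acts as an error bound around the prediction and shrinks as the samples become denser, which is the intuition behind the convergence guarantees mentioned in the abstract.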

1906.00729 2026-06-04 cs.LG cs.GT cs.SY eess.SY math.OC stat.ML 版本更新

Policy Optimization Provably Converges to Nash Equilibria in Zero-Sum Linear Quadratic Games

策略优化在零和线性二次博弈中可证明收敛至纳什均衡

Kaiqing Zhang, Zhuoran Yang, Tamer Başar

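Affiliations: Department of Electrical and Computer Engineering & Coordinated Science Laboratory, University of Illinois at Urbana-Champaign; Department of Operations Research and Financial Engineering, Princeton University

AI summary: The paper studies the global convergence of policy optimization for finding Nash equilibria in zero-sum linear quadratic (LQ) games. By analyzing the optimization landscape of LQ games, it shows that the stationary point of the objective with respect to linear feedback control policies constitutes the Nash equilibrium of the game, and it proposes three projected nested-gradient methods that are guaranteed to converge to the Nash equilibrium, with globally sublinear and locally linear convergence rates.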

Comments Fixed some typos, addressed some comments from NeurIPS reviews

详情
AI中文摘要

我们研究了策略优化在寻找零和线性二次(LQ)博弈纳什均衡(NE)中的全局收敛性。为此,我们首先分析了LQ博弈的景观,将其视为策略空间中的非凸非凹鞍点问题。具体来说,我们证明了尽管其非凸性和非凹性,零和LQ博弈具有性质:目标函数相对于线性反馈控制策略的 stationary 点构成博弈的纳什均衡。在此基础上,我们开发了三种投影嵌套梯度方法,这些方法保证能够收敛到博弈的纳什均衡。此外,我们证明所有这些算法都具有全局次线性和局部线性收敛率。还提供了仿真结果以说明算法的满意收敛特性。据我们所知,这项工作似乎是首次研究LQ博弈的优化景观,并且证明了策略优化方法收敛到纳什均衡。我们的工作为理解一般零和马尔可夫游戏中的基于策略的强化学习算法的理论方面提供了初步步骤。

英文摘要

We study the global convergence of policy optimization for finding the Nash equilibria (NE) in zero-sum linear quadratic (LQ) games. To this end, we first investigate the landscape of LQ games, viewing it as a nonconvex-nonconcave saddle-point problem in the policy space. Specifically, we show that despite its nonconvexity and nonconcavity, zero-sum LQ games have the property that the stationary point of the objective function with respect to the linear feedback control policies constitutes the NE of the game. Building upon this, we develop three projected nested-gradient methods that are guaranteed to converge to the NE of the game. Moreover, we show that all of these algorithms enjoy both globally sublinear and locally linear convergence rates. Simulation results are also provided to illustrate the satisfactory convergence properties of the algorithms. To the best of our knowledge, this work appears to be the first one to investigate the optimization landscape of LQ games, and provably show the convergence of policy optimization methods to the Nash equilibria. Our work serves as an initial step toward understanding the theoretical aspects of policy-based reinforcement learning algorithms for zero-sum Markov games in general.

1905.04403 2026-06-04 eess.SY cs.LG cs.SY 版本更新

PAC Statistical Model Checking for Markov Decision Processes and Stochastic Games

PAC统计模型检验用于马尔可夫决策过程和随机游戏

Pranav Ashok, Jan Křetínský, Maximilian Weininger

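Affiliations: Technical University of Munich, Germany

AI summary: The paper presents a PAC statistical model checking algorithm for Markov decision processes and stochastic games that provides probably approximately correct guarantees without full knowledge of the transition function and is efficient in practice.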

详情
AI中文摘要

统计模型检验(SMC)是一种用于分析概率系统的技术,这些系统可能(部分)未知。我们提出了一种用于无界可达性的SMC算法,该算法能够提供概率近似正确(PAC)的保证。我们考虑了两种情况:(i)没有转移函数的知识(仅需一个转移概率的下界)和(ii)了解底层图的拓扑结构。一方面,这是首个针对随机游戏的算法;另一方面,即使对于马尔可夫决策过程,这也是首个实用的算法。与之前需要运行时间超过宇宙年龄的方法相比,我们的算法通常可以在几分钟内得到合理精确的结果,不需要了解混合时间或整个模型的拓扑结构。

英文摘要

Statistical model checking (SMC) is a technique for analysis of probabilistic systems that may be (partially) unknown. We present an SMC algorithm for (unbounded) reachability yielding probably approximately correct (PAC) guarantees on the results. We consider both the setting (i) with no knowledge of the transition function (where the only quantity required is a bound on the minimum transition probability) and (ii) with knowledge of the topology of the underlying graph. On the one hand, it is the first algorithm for stochastic games. On the other hand, it is the first practical algorithm even for Markov decision processes. Compared to previous approaches where PAC guarantees require running times longer than the age of the universe even for systems with a handful of states, our algorithm often yields reasonably precise results within minutes, not requiring the knowledge of mixing time or the topology of the whole model.
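
For readers unfamiliar with PAC guarantees, the following sketch shows the textbook building block, not the algorithm of the paper: estimating a single probability from independent simulated runs, with the number of runs chosen by Hoeffding's inequality. The function names and the stand-in simulator (a biased coin with success probability 0.3) are assumptions made for illustration only.

```python
import math
import random

def hoeffding_samples(eps, delta):
    """Runs needed so that |estimate - p| <= eps holds with probability >= 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

def estimate_probability(sample_run, n):
    """Fraction of n simulated runs for which sample_run() reports success."""
    return sum(1 for _ in range(n) if sample_run()) / n

rng = random.Random(0)
n = hoeffding_samples(eps=0.01, delta=0.05)
# Stand-in for "simulate the system and check whether the target was reached".
estimate = estimate_probability(lambda: rng.random() < 0.3, n)
```

The sample count grows as $1/\epsilon^2$ but only logarithmically in $1/\delta$. Handling unbounded reachability, nondeterminism, and unknown transition probabilities requires considerably more machinery than this fixed-probability case.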

1905.13268 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Interpretable PID Parameter Tuning for Control Engineering using General Dynamic Neural Networks: An Extensive Comparison

使用通用动态神经网络进行可解释的PID参数调节:一种广泛的比较

Johannes Günther, Elias Reichensdörfer, Patrick M. Pilarski, Klaus Diepold

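Affiliations: Department of Computing Science, University of Alberta; Alberta Machine Intelligence Institute

AI summary: The paper studies how extending PID controllers with General Dynamic Neural Networks (GDNN) can improve performance and interpretability for complex control systems. In an extensive comparison on four benchmark systems, the neural PID controller outperforms standard PID control in 15 of 16 tasks and model-based control in 13 of 16 tasks.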

详情
AI中文摘要

现代自动化系统依赖于闭环控制,其中控制器根据观察与受控过程交互。这些系统日益复杂,但大多数控制器仍是线性比例-积分-微分(PID)控制器。PID控制器在处理线性和近线性系统时表现良好,但其简单性与控制复杂过程所需鲁棒性相矛盾。现代机器学习提供了一种方法,即通过神经网络扩展PID控制器,以超越其线性能力。然而,这种扩展以失去稳定性保证和控制器可解释性为代价。本文研究了通过循环神经网络(即通用动态神经网络GDNN)扩展PID控制器的效用,证明GDNN(神经)PID控制器在多种控制系统中表现良好,并强调其作为可扩展和可解释的控制选项。为此,我们通过四个基准系统进行了广泛研究,这些系统代表了最常用的控制工程基准。所有控制基准均在有噪声和无噪声、有干扰和无干扰的情况下进行评估。神经PID控制器在16项任务中优于传统PID控制15项,在16项任务中优于模型驱动控制13项。作为第二项贡献,我们解决了防止神经网络用于实际控制过程的可解释性不足问题。我们使用有界输入有界输出稳定性分析来评估神经网络建议的参数,从而使其变得可理解。这种严格的评估与更好的可解释性相结合,是神经网络控制方法接受的重要步骤。此外,这也是可解释和安全应用人工智能的重要步骤。

英文摘要

Modern automation systems rely on closed-loop control, wherein a controller interacts with a controlled process, based on observations. These systems are increasingly complex, yet most controllers are linear Proportional-Integral-Derivative (PID) controllers. PID controllers perform well on linear and near-linear systems but their simplicity is at odds with the robustness required to reliably control complex processes. Modern machine learning offers a way to extend PID controllers beyond their linear capabilities by using neural networks. However, such an extension comes at the cost of losing stability guarantees and controller interpretability. In this paper, we examine the utility of extending PID controllers with recurrent neural networks, namely General Dynamic Neural Networks (GDNN); we show that GDNN (neural) PID controllers perform well on a range of control systems and highlight how they can be a scalable and interpretable option for control systems. To do so, we provide an extensive study using four benchmark systems that represent the most common control engineering benchmarks. All control benchmarks are evaluated with and without noise as well as with and without disturbances. The neural PID controller performs better than standard PID control in 15 of 16 tasks and better than model-based control in 13 of 16 tasks. As a second contribution, we address the lack of interpretability that prevents neural networks from being used in real-world control processes. We use bounded-input bounded-output stability analysis to evaluate the parameters suggested by the neural network, thus making them understandable. This combination of rigorous evaluation paired with better interpretability is an important step towards the acceptance of neural-network-based control approaches. It is furthermore an important step towards interpretable and safely applied artificial intelligence.
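
As background for the baseline the abstract compares against, here is a minimal sketch of a standard discrete-time PID controller. It is not the GDNN extension from the paper. The plant (a first-order system $\dot{x} = -x + u$ integrated with forward Euler), the gains, and the class and function names are all assumptions chosen for illustration.

```python
class PID:
    """Textbook discrete PID controller with fixed gains."""

    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = None

    def update(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        # Skip the derivative on the first call to avoid a spurious kick.
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(steps=3000, dt=0.01, setpoint=1.0):
    """Closed loop with the assumed plant x' = -x + u; returns the final state."""
    pid = PID(kp=2.0, ki=1.0, kd=0.05, dt=dt)
    x = 0.0
    for _ in range(steps):
        u = pid.update(setpoint, x)
        x += dt * (-x + u)
    return x

final_state = simulate()
```

The three gains `kp`, `ki`, and `kd` are the interpretable parameters of such a controller. In the neural variant described above they would be suggested by a network rather than fixed by hand.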

1812.03412 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Learning Multiplication-free Linear Transformations

学习无乘法线性变换

Cristian Rusu

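AI summary: The paper proposes dictionary learning algorithms for sparse representations that impose specific structure on the learned dictionaries so that they are numerically efficient to use: fewer additions and multiplications, or no multiplications at all. The work builds on factorizations of the dictionary into highly structured basic building blocks (binary orthonormal, scaling, and shear transformations), for which the optimization problems have closed-form solutions. The methods are demonstrated on image data and compared against well-known numerically efficient transforms such as the fast Fourier and fast discrete cosine transforms.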

详情
AI中文摘要

在本文中,我们提出了几种字典学习算法,用于稀疏表示,同时对学习到的字典施加特定结构,使得它们在数值上更高效:减少加法/乘法的次数,甚至避免乘法。我们的工作基于字典的高结构化基本构建块(二进制正交、缩放和剪切变换)来建立,可以写出我们考虑的优化问题的闭式解。我们在图像数据上展示了我们方法的有效性,并可以与已知的数值高效变换如快速傅里叶变换和快速离散余弦变换进行比较。

英文摘要

In this paper, we propose several dictionary learning algorithms for sparse representations that also impose specific structures on the learned dictionaries such that they are numerically efficient to use: a reduced number of additions/multiplications and even avoiding multiplications altogether. We base our work on factorizations of the dictionary in highly structured basic building blocks (binary orthonormal, scaling and shear transformations) for which we can write closed-form solutions to the optimization problems that we consider. We show the effectiveness of our methods on image data where we can compare against well-known numerically efficient transforms such as the fast Fourier and the fast discrete cosine transforms.

1904.09841 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Simple Heuristics Yield Provable Algorithms for Masked Low-Rank Approximation

简单的启发式方法可证明算法用于带掩码的低秩近似

Cameron Musco, Christopher Musco, David P. Woodruff

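Affiliations: UMass Amherst; New York University; Carnegie Mellon University

AI summary: The paper studies masked low-rank approximation and shows that a simple heuristic, which sets $A$ to 0 wherever the mask is 0 and then computes a standard low-rank approximation, yields an algorithm with bicriteria approximation guarantees.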

Comments ITCS 2021

详情
AI中文摘要

在$masked\ low-rank\ approximation$中,给定$A \in \mathbb{R}^{n \times n}$和二进制掩码矩阵$W \in \{0,1\}^{n \times n}$。目标是找到一个秩为$k$的矩阵$L$,使得$$cost(L) = \sum_{i=1}^{n} \sum_{j = 1}^{n} W_{i,j} \cdot (A_{i,j} - L_{i,j} )^2 \leq OPT + \epsilon\|A\|_F^2 ,$$其中$OPT = \min_{rank-k\ \hat{L}} cost(\hat L)$,$\epsilon$是一个给定的误差参数。根据$W$的不同选择,该问题捕捉到因子分析、低秩加对角分解、鲁棒PCA、低秩矩阵补全、低秩加块矩阵近似以及许多问题。许多这些问题都是NP难的,尽管已有一些具有证明保证的算法,但它们要么1) 运行时间是$n^{\Omega(k^2/\epsilon)}$,要么2) 做出强假设,例如$A$是不相干的或$W$是随机的。在本工作中,我们证明了一个常见的多项式时间启发式方法,即简单地将$W$为0的区域设为0,然后找到标准低秩近似,可以为该问题提供双标准逼近保证。特别是,对于秩为$k' > k$,取决于$W$的$public\ coin\ partition\ number$,该启发式方法输出秩为$k'$的$L$,其成本$(L) \leq OPT + \epsilon\|A\|_F^2$。这个partition number反过来由$W$作为两个玩家通信矩阵时的$randomized\ communication\ complexity$所限制。对于许多重要的带掩码低秩近似示例,包括上述所有问题,该结果提供了具有$k' = k \cdot poly(\log n/\epsilon)$的双标准逼近保证。此外,我们还显示了不同的通信模型为带掩码低秩近似的自然变种提供了算法。例如,多玩家number-in-hand通信复杂度与带掩码张量分解相关,而非确定性通信复杂度与带掩码布尔低秩分解相关。

英文摘要

In $masked\ low-rank\ approximation$, one is given $A \in \mathbb{R}^{n \times n}$ and a binary mask matrix $W \in \{0,1\}^{n \times n}$. The goal is to find a rank-$k$ matrix $L$ for which: $$cost(L) = \sum_{i=1}^{n} \sum_{j = 1}^{n} W_{i,j} \cdot (A_{i,j} - L_{i,j} )^2 \leq OPT + \epsilon\|A\|_F^2 ,$$ where $OPT = \min_{rank-k\ \hat{L}} cost(\hat L)$ and $\epsilon$ is a given error parameter. Depending on the choice of $W$, this problem captures factor analysis, low-rank plus diagonal decomposition, robust PCA, low-rank matrix completion, low-rank plus block matrix approximation, and many other problems. Many of these problems are NP-hard, and while some algorithms with provable guarantees are known, they either 1) run in time $n^{\Omega(k^2/\epsilon)}$ or 2) make strong assumptions, e.g., that $A$ is incoherent or that $W$ is random. In this work, we show that a common polynomial-time heuristic, which simply sets $A$ to $0$ where $W$ is $0$, and then finds a standard low-rank approximation, yields bicriteria approximation guarantees for this problem. In particular, for rank $k' > k$ depending on the $public\ coin\ partition\ number$ of $W$, the heuristic outputs a rank-$k'$ matrix $L$ with cost$(L) \leq OPT + \epsilon\|A\|_F^2$. This partition number is in turn bounded by the $randomized\ communication\ complexity$ of $W$, when interpreted as a two-player communication matrix. For many important examples of masked low-rank approximation, including all those listed above, this result yields bicriteria approximation guarantees with $k' = k \cdot poly(\log n/\epsilon)$. Further, we show that different models of communication yield algorithms for natural variants of masked low-rank approximation. For example, multi-player number-in-hand communication complexity connects to masked tensor decomposition and non-deterministic communication complexity to masked Boolean low-rank factorization.
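
The zero-fill heuristic and the masked cost are simple enough to sketch directly. The code below is an illustrative standard-library implementation, assuming small dense matrices stored as lists of lists and a low-rank approximation computed by power iteration with deflation (a stand-in for a proper truncated SVD). All function names are hypothetical.

```python
import math
import random

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def top_singular_triplet(M, rng, iters=200):
    """Power iteration on M^T M; returns (sigma, u, v) for the largest singular value."""
    Mt = [list(col) for col in zip(*M)]
    v = [rng.gauss(0.0, 1.0) for _ in M[0]]  # random start vector
    for _ in range(iters):
        w = matvec(Mt, matvec(M, v))
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, [0.0] * len(M), v
        v = [x / norm for x in w]
    u = matvec(M, v)
    sigma = math.sqrt(sum(x * x for x in u))
    if sigma > 0.0:
        u = [x / sigma for x in u]
    return sigma, u, v

def low_rank(M, rank, seed=0):
    """Rank-`rank` approximation by repeatedly peeling off the top singular triplet."""
    rng = random.Random(seed)
    R = [row[:] for row in M]
    L = [[0.0] * len(M[0]) for _ in M]
    for _ in range(rank):
        s, u, v = top_singular_triplet(R, rng)
        for i in range(len(M)):
            for j in range(len(M[0])):
                L[i][j] += s * u[i] * v[j]
                R[i][j] -= s * u[i] * v[j]
    return L

def masked_cost(A, W, L):
    """Squared error counted only on entries where the mask W is 1."""
    return sum(W[i][j] * (A[i][j] - L[i][j]) ** 2
               for i in range(len(A)) for j in range(len(A[0])))

def zero_fill_heuristic(A, W, rank):
    """Set A to 0 where W is 0, then take a standard low-rank approximation."""
    Z = [[A[i][j] * W[i][j] for j in range(len(A[0]))] for i in range(len(A))]
    return low_rank(Z, rank)

# The masked entry (value 100) is ignored by the cost; [[1, 2], [2, 4]] fits with rank 1.
A = [[1.0, 2.0], [2.0, 100.0]]
W = [[1, 1], [1, 0]]
cost_rank1 = masked_cost(A, W, zero_fill_heuristic(A, W, 1))
cost_rank2 = masked_cost(A, W, zero_fill_heuristic(A, W, 2))
```

In this toy instance the optimum at rank $k = 1$ has zero cost, the heuristic at rank 1 does not reach it, and the heuristic at the larger rank $k' = 2$ does. That is the flavor of a bicriteria guarantee: the cost is compared against the rank-$k$ optimum while the output is allowed a higher rank.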

1903.07214 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A Control Lyapunov Perspective on Episodic Learning via Projection to State Stability

从控制李雅普诺夫视角看通过投影到状态稳定性进行片段学习

Andrew J. Taylor, Victor D. Dorobantu, Meera Krishnamoorthy, Hoang M. Le, Yisong Yue, Aaron D. Ames

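Affiliations: California Institute of Technology

AI summary: The paper examines the impact of learning on control synthesis from a Lyapunov function perspective. It introduces Projection to State Stability (PSS) to characterize the robustness of a control Lyapunov function (CLF) with respect to the data used to learn system uncertainties, and shows how PSS can bound uncertainty in affine control to enable robust control synthesis.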

详情
AI中文摘要

本文的目标是从李雅普诺夫函数视角理解学习对控制合成的影响。具体而言,而不是考虑完整系统动态中的不确定性,我们采用控制李雅普诺夫函数(CLFs)作为低维投影。为了理解和表征这些投影动态引入的不确定性,我们引入了一个新概念:投影到状态稳定性(PSS)。PSS可以看作是定义在投影动态上的输入到状态稳定性变种,能够表征CLF对用于学习系统不确定性的数据的鲁棒性。我们使用PSS来限制仿射控制中的不确定性,并展示一种实用的片段学习方法可以利用PSS来表征CLF中的不确定性,以实现鲁棒控制合成。

英文摘要

The goal of this paper is to understand the impact of learning on control synthesis from a Lyapunov function perspective. In particular, rather than consider uncertainties in the full system dynamics, we employ Control Lyapunov Functions (CLFs) as low-dimensional projections. To understand and characterize the uncertainty that these projected dynamics introduce in the system, we introduce a new notion: Projection to State Stability (PSS). PSS can be viewed as a variant of Input to State Stability defined on projected dynamics, and enables characterizing robustness of a CLF with respect to the data used to learn system uncertainties. We use PSS to bound uncertainty in affine control, and demonstrate that a practical episodic learning approach can use PSS to characterize uncertainty in the CLF for robust control synthesis.

1903.01577 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Episodic Learning with Control Lyapunov Functions for Uncertain Robotic Systems

具有控制李雅普诺夫函数的不确定性机器人系统的经验学习

Andrew J. Taylor, Victor D. Dorobantu, Hoang M. Le, Yisong Yue, Aaron D. Ames

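Affiliations: California Institute of Technology

AI summary: The paper develops a machine learning framework built around control Lyapunov functions to adapt to parametric uncertainty and unmodeled dynamics in robotic systems. It iteratively updates estimates of Lyapunov function derivatives and improves the controller, ultimately yielding a stabilizing quadratic-program-based controller, and validates the method on a planar Segway simulation.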

详情
AI中文摘要

许多现代非线性控制方法旨在赋予系统保证性质,如稳定性或安全性,并已成功应用于机器人领域。然而,模型不确定性仍然是持续的挑战,削弱了理论保证并导致物理系统中的实施失败。本文开发了一种以控制李雅普诺夫函数(CLFs)为中心的机器学习框架,以适应一般机器人系统中的参数不确定性和未建模动态。我们提出的方法通过迭代更新李雅普诺夫函数导数的估计并改进控制器,最终获得一个基于二次规划的稳定控制器。我们在平面Segway模拟中验证了我们的方法,通过迭代改进基础无模型控制器,展示了显著的性能提升。

英文摘要

Many modern nonlinear control methods aim to endow systems with guaranteed properties, such as stability or safety, and have been successfully applied to the domain of robotics. However, model uncertainty remains a persistent challenge, weakening theoretical guarantees and causing implementation failures on physical systems. This paper develops a machine learning framework centered around Control Lyapunov Functions (CLFs) to adapt to parametric uncertainty and unmodeled dynamics in general robotic systems. Our proposed method proceeds by iteratively updating estimates of Lyapunov function derivatives and improving controllers, ultimately yielding a stabilizing quadratic program model-based controller. We validate our approach on a planar Segway simulation, demonstrating substantial performance improvements by iteratively refining on a base model-free controller.
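
To illustrate what a quadratic-program controller built on a CLF computes, here is a sketch for the simplest case, assuming a scalar input and known dynamics (no learning involved). With a single constraint $L_f V + L_g V\,u \le -c V$, the minimum-norm QP has a closed-form solution. The example system $\dot{x} = x + u$ with $V = x^2/2$ and all names are assumptions for illustration.

```python
def clf_min_norm(lf_v, lg_v, v, rate=1.0):
    """Minimize u^2 subject to lf_v + lg_v * u <= -rate * v (scalar input)."""
    psi = lf_v + rate * v
    if psi <= 0.0:
        return 0.0  # constraint already satisfied with zero input
    if lg_v == 0.0:
        return 0.0  # constraint cannot be influenced by u; no feasible correction here
    return -psi / lg_v

def simulate(x0=2.0, dt=0.01, steps=1000):
    """Unstable plant x' = x + u stabilized by the CLF controller with V = x^2 / 2."""
    x = x0
    for _ in range(steps):
        v = 0.5 * x * x
        lf_v = x * x   # dV/dx * f(x) with f(x) = x
        lg_v = x       # dV/dx * g(x) with g(x) = 1
        u = clf_min_norm(lf_v, lg_v, v)
        x += dt * (x + u)
    return x

u_at_two = clf_min_norm(lf_v=4.0, lg_v=2.0, v=2.0)
final_x = simulate()
```

When the model terms `lf_v` and `lg_v` are uncertain, the constraint is evaluated with estimates, which is where learned corrections to the Lyapunov function derivative would enter.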

1711.03127 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Energy Storage Arbitrage in Real-Time Markets via Reinforcement Learning

通过强化学习实现实时市场的能源存储套利

Hao Wang, Baosen Zhang

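Affiliations: Department of Electrical Engineering, University of Washington

AI summary: The paper uses reinforcement learning to design a temporal arbitrage policy for energy storage, addressing the difficulty of designing strategies for real-time price arbitrage under highly uncertain prices, and improves performance through the design of the reward function.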

详情
Journal ref
2018 IEEE Power & Energy Society General Meeting (PESGM)
AI中文摘要

在本文中,我们通过强化学习推导出一个时间套利策略用于存储。实时价格套利是存储单元的重要收入来源,但设计良好的策略 proved 难以实现,因为价格的高不确定性。我们采用强化学习来设计一个最优的套利策略。该策略通过存储单元重复的充放电操作,通过更新价值矩阵来学习。我们设计了一个奖励函数,不仅反映了充放电决策的即时利润,还结合了历史信息。仿真结果表明,与现有算法相比,我们设计的奖励函数导致了显著的性能提升。

英文摘要

In this paper, we derive a temporal arbitrage policy for storage via reinforcement learning. Real-time price arbitrage is an important source of revenue for storage units, but designing good strategies has proven to be difficult because of the highly uncertain nature of the prices. Instead of current model predictive or dynamic programming approaches, we use reinforcement learning to design an optimal arbitrage policy. This policy is learned through repeated charge and discharge actions performed by the storage unit through updating a value matrix. We design a reward function that not only reflects the instant profit of charge/discharge decisions but also incorporates the history information. Simulation results demonstrate that our designed reward function leads to significant performance improvement compared with existing algorithms.
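
The idea of learning a charge/discharge policy by updating a value matrix can be sketched with tabular Q-learning on a toy problem. Everything specific here is an assumption for illustration: a deterministic price that alternates between 1 and 3, a one-unit battery, a purely random exploration policy, and a reward equal to the instant profit only (the history-aware reward of the paper is not reproduced).

```python
import random

PRICES = [1.0, 3.0]                      # price alternates low, high, low, ...
ACTIONS = ("idle", "charge", "discharge")

def step(t, soc, action):
    """Apply an action at time t with state of charge soc (0 or 1)."""
    price = PRICES[t % 2]
    reward = 0.0
    if action == "charge" and soc == 0:
        soc, reward = 1, -price          # buy energy
    elif action == "discharge" and soc == 1:
        soc, reward = 0, price           # sell energy
    return t + 1, soc, reward            # infeasible actions behave like idle

def train(steps=20000, alpha=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    # Value matrix indexed by (price level, state of charge) and action.
    q = {(p, s): {a: 0.0 for a in ACTIONS} for p in (0, 1) for s in (0, 1)}
    t, soc = 0, 0
    for _ in range(steps):
        state = (t % 2, soc)
        action = rng.choice(ACTIONS)     # off-policy: explore uniformly at random
        t, soc, reward = step(t, soc, action)
        target = reward + gamma * max(q[(t % 2, soc)].values())
        q[state][action] += alpha * (target - q[state][action])
    return q

q_table = train()
policy = {s: max(q_table[s], key=q_table[s].get) for s in q_table}
```

In this toy setting the greedy policy read off the table is to charge when the price is low and the battery is empty, and to discharge when the price is high and the battery is full.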

1904.02851 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Planning under non-rational perception of uncertain spatial costs

在不确定空间成本下的非理性感知规划

Aamodh Suresh, Sonia Martinez

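AI summary: The paper studies motion-planning strategies that account for non-rational risk perception under uncertain spatial costs. It uses Cumulative Prospect Theory (CPT) to generate perceived risk maps, validates the modeling power of CPT in theory and in simulation, and shows advantages in path planning over other risk perception models such as CVaR.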

Comments 12 pages and 10 figures. This revision adds more explanation and clearer figures

详情
AI中文摘要

本工作探讨了设计一种考虑与不确定空间成本相关的风险感知的运动规划策略。我们提出的方法利用累积前景理论(CPT)来生成给定环境中的感知风险地图。CPT-like感知风险和路径长度指标被结合以定义一个符合采样运动规划器(RRT*)渐近最优要求的成本函数。通过理论和仿真展示了CPT的建模能力,并与其他风险感知模型如条件价值-at-风险(CVaR)进行了比较。理论上,我们定义了风险感知模型的表达性概念,并证明CPT的表达性高于CVaR和期望风险。然后我们展示了这种表达性在路径规划设置中的转化,其中我们观察到一个配备CPT和同时扰动随机近似(SPSA)方法的规划器可以更好地近似任意环境中的路径。此外,我们通过仿真展示了我们的规划器能够捕捉一组丰富的有意义路径,代表了不同风险感知的自定义环境。然后我们通过在拥挤和动态环境中的仿真比较了我们的规划器与T-RRT*(连续成本空间的规划器)和Risk-RRT*(动态人类障碍物的风险感知规划器)的性能,展示了我们所提规划器的优势。

英文摘要

This work investigates the design of risk-perception-aware motion-planning strategies that incorporate non-rational perception of risks associated with uncertain spatial costs. Our proposed method employs the Cumulative Prospect Theory (CPT) to generate a perceived risk map over a given environment. CPT-like perceived risks and path-length metrics are then combined to define a cost function that is compliant with the requirements of asymptotic optimality of sampling-based motion planners (RRT*). The modeling power of CPT is illustrated in theory and in simulation, along with a comparison to other risk perception models like Conditional Value at Risk (CVaR). Theoretically, we define a notion of expressiveness for a risk perception model and show that CPT's is higher than that of CVaR and expected risk. We then show that this expressiveness translates to our path planning setting, where we observe that a planner equipped with CPT together with a simultaneous perturbation stochastic approximation (SPSA) method can better approximate arbitrary paths in an environment. Additionally, we show in simulation that our planner captures a rich set of meaningful paths, representative of different risk perceptions in a custom environment. We then compare the performance of our planner with T-RRT* (a planner for continuous cost spaces) and Risk-RRT* (a risk-aware planner for dynamic human obstacles) through simulations in cluttered and dynamic environments respectively, showing the advantage of our proposed planner.

1902.10139 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Learning Dynamic-Objective Policies from a Class of Optimal Trajectories

从一类最优轨迹中学习动态目标策略

Christopher Iliffe Sprague, Dario Izzo, Petter Ögren

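AI summary: The paper proposes a novel and straightforward approach that combines trajectory optimisation, homotopy continuation, and imitation learning to synthesise optimal state-feedback controllers that can change objective functions online, and demonstrates it on an inverted pendulum swing-up and a spacecraft orbit transfer.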

Comments Accepted to the 59th IEEE Conference on Decision and Control (CDC)

详情
AI中文摘要

最优状态反馈控制器,能够切换不同的目标函数,在可能遇到意外情况的系统中具有优势。然而,即使对于单一目标函数,合成此类控制器也是极具挑战性的。本文提出了一种新颖且简单的方法,通过轨迹优化、同伦持续和模仿学习相结合,来合成这些策略。我们使用数值持续法高效地生成多个目标函数和边界条件下的最优演示,并利用这些演示来训练我们的策略。此外,我们展示了我们的策略能够有效学习一系列最优状态反馈控制器,这些控制器可以在线切换目标函数。我们通过两个轨迹优化问题,即倒立摆摆起和航天器轨道转移,展示了该方法,并证明在仿真中合成的策略产生的轨迹接近最优。这些结果表明,轨迹优化和同伦持续对动态目标情境下的控制器合成具有益处。

英文摘要

Optimal state-feedback controllers, capable of changing between different objective functions, are advantageous to systems in which unexpected situations may arise. However, synthesising such controllers, even for a single objective, is a demanding process. In this paper, we present a novel and straightforward approach to synthesising these policies through a combination of trajectory optimisation, homotopy continuation, and imitation learning. We use numerical continuation to efficiently generate optimal demonstrations across several objectives and boundary conditions, and use these to train our policies. Additionally, we demonstrate the ability of our policies to effectively learn families of optimal state-feedback controllers, which can be used to change objective functions online. We illustrate this approach across two trajectory optimisation problems, an inverted pendulum swingup and a spacecraft orbit transfer, and show that the synthesised policies, when evaluated in simulation, produce trajectories that are near-optimal. These results indicate the benefit of trajectory optimisation and homotopy continuation to the synthesis of controllers in dynamic-objective contexts.

1710.09691 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Iterative Machine Learning for Precision Trajectory Tracking with Series Elastic Actuators

迭代机器学习用于系列弹性执行器的高精度轨迹跟踪

Nathan Banka, W. Tony Piaskowy, Joseph Garbini, Santosh Devasia

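Affiliations: Ultra-Precision Controls Lab; University of Washington

AI summary: The paper studies the use of iterative learning to improve position tracking with series elastic actuators: feedforward commands are learned iteratively, with local system models estimated by complex-valued Gaussian process regression, to reduce the tracking error.

Comments 9 pages, 16 figures. Submitted to AMC Workshop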

详情
Journal ref
2018 IEEE 15th International Workshop on Advanced Motion Control (AMC), Tokyo, 2018, pp. 234-239
AI中文摘要

当机器人在未知环境中操作时,位置的小误差可能导致接触力的大幅变化,尤其是对于典型的高阻抗设计。这可能会损坏周围环境或机器人本身。系列弹性执行器(SEAs)是一种减少机器人手臂输出阻抗以提高对环境施加力的控制能力的流行方法。然而,这种增加的力控制能力伴随着较低的位置精度和带宽。本文探讨了使用迭代学习的前馈命令来改进使用SEAs时的位置跟踪。在每次迭代中,系统对量化输入的输出响应被用来估计线性化的局部系统模型。这些估计的模型是通过复值高斯过程回归(cGPR)技术获得的,然后用于基于前一次迭代的误差生成新的前馈输入命令。本文展示了该迭代机器学习(IML)技术在双自由度(2-DOF)机器人手臂上的应用,并证明了IML方法能够成功收敛以减少跟踪误差。

英文摘要

When robots operate in unknown environments, small errors in position can lead to large variations in the contact forces, especially with typical high-impedance designs. This can potentially damage the surroundings and/or the robot. Series elastic actuators (SEAs) are a popular way to reduce the output impedance of a robotic arm to improve control authority over the force exerted on the environment. However, this increased control over forces with lower impedance comes at the cost of lower positioning precision and bandwidth. This article examines the use of an iteratively-learned feedforward command to improve position tracking when using SEAs. Over each iteration, the output responses of the system to the quantized inputs are used to estimate linearized local system models. These estimated models are obtained using a complex-valued Gaussian Process Regression (cGPR) technique and then used to generate a new feedforward input command based on the previous iteration's error. This article illustrates this iterative machine learning (IML) technique for a two degree of freedom (2-DOF) robotic arm, and demonstrates successful convergence of the IML approach to reduce the tracking error.

1812.07725 2026-06-04 math.OC cs.LG cs.NA math.NA math.PR stat.ML 版本更新

Breaking Reversibility Accelerates Langevin Dynamics for Global Non-Convex Optimization

打破可逆性加速Langevin动力学用于全局非凸优化

Xuefeng Gao, Mert Gurbuzbalaban, Lingjiong Zhu

Affiliations: Department of Systems Engineering and Engineering Management, The Chinese University of Hong Kong, Shatin, N.T. Hong Kong; Department of Management Science and Information Systems and the DIMACS Institute, Rutgers University, Piscataway, NJ-08854, United States of America; Department of Mathematics, Florida State University, 1017 Academic Way, Tallahassee, FL-32306, United States of America

AI summary: The paper studies non-reversible Langevin dynamics for global non-convex optimization. By analyzing the convergence and mixing rates of algorithms based on non-reversible dynamics, it shows that they are more efficient at locating local minima and exploring the state space.

详情
AI中文摘要

Langevin动力学(LD)已被证明是一种强大的技术,用于优化非凸目标,作为一种高效的算法来寻找局部极小值,而最终在更长的时间尺度上访问全局极小值。LD基于一阶Langevin扩散,其时间是可逆的。我们研究了两种基于非可逆Langevin扩散的变种:欠阻尼Langevin动力学(ULD)和具有非对称漂移的Langevin动力学(NLD)。采用Tzen、Liang和Raginsky(2018)为LD到非可逆扩散的技术,我们证明了对于给定的局部极小值,其在初始化点任意距离内,以高概率,ULD轨迹会在依赖于局部极小值Hessian最小特征值的复发时间内结束于该局部极小值的小邻域之外,或者在复发时间内进入该邻域并停留可能极长的逃逸时间。ULD算法在Hessian最小特征值的依赖性方面优于Tzen、Liang和Raginsky(2018)中LD的复发时间。对于NLD算法也获得了相似的结果和改进。我们还展示了非可逆变种在离散时间中能够更快地退出局部极小值的吸引盆地,当目标函数有两个局部极小值被鞍点分隔时,并量化了改进的幅度。我们的分析表明,非可逆Langevin算法在寻找局部极小值和探索状态空间方面更有效。我们的分析基于在局部极小值周围对目标函数的二次近似。作为我们分析的副产品,我们获得了两个非可逆Langevin算法在2-Wasserstein距离下的最优混合速率。

英文摘要

Langevin dynamics (LD) has been proven to be a powerful technique for optimizing a non-convex objective as an efficient algorithm to find local minima while eventually visiting a global minimum on longer time-scales. LD is based on the first-order Langevin diffusion which is reversible in time. We study two variants that are based on non-reversible Langevin diffusions: the underdamped Langevin dynamics (ULD) and the Langevin dynamics with a non-symmetric drift (NLD). Adopting the techniques of Tzen, Liang and Raginsky (2018) for LD to non-reversible diffusions, we show that for a given local minimum that is within an arbitrary distance from the initialization, with high probability, either the ULD trajectory ends up somewhere outside a small neighborhood of this local minimum within a recurrence time which depends on the smallest eigenvalue of the Hessian at the local minimum, or it enters this neighborhood by the recurrence time and stays there for a potentially exponentially long escape time. The ULD algorithms improve upon the recurrence time obtained for LD in Tzen, Liang and Raginsky (2018) with respect to the dependency on the smallest eigenvalue of the Hessian at the local minimum. A similar result and improvement are obtained for the NLD algorithm. We also show that non-reversible variants can exit the basin of attraction of a local minimum faster in discrete time when the objective has two local minima separated by a saddle point and quantify the amount of improvement. Our analysis suggests that non-reversible Langevin algorithms are more efficient at locating a local minimum as well as at exploring the state space. Our analysis is based on the quadratic approximation of the objective around a local minimum. As a by-product of our analysis, we obtain optimal mixing rates for quadratic objectives in the 2-Wasserstein distance for two non-reversible Langevin algorithms we consider.
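
For reference, the three diffusions named in the abstract are commonly written as follows. The symbols are assumptions following standard usage rather than anything given in this listing: $f$ is the objective, $\beta$ the inverse temperature, $\gamma$ the friction coefficient, $B_t$ a standard Brownian motion, and $J$ an anti-symmetric matrix.

```latex
% Overdamped (reversible) Langevin diffusion
dX_t = -\nabla f(X_t)\,dt + \sqrt{2\beta^{-1}}\,dB_t

% Underdamped Langevin diffusion (ULD)
dV_t = -\gamma V_t\,dt - \nabla f(X_t)\,dt + \sqrt{2\gamma\beta^{-1}}\,dB_t,
\qquad dX_t = V_t\,dt

% Langevin diffusion with non-symmetric drift (NLD)
dX_t = -(I + J)\,\nabla f(X_t)\,dt + \sqrt{2\beta^{-1}}\,dB_t,
\qquad J^{\top} = -J
```

The anti-symmetric part $J$ adds a rotational component to the drift while leaving the Gibbs stationary distribution unchanged, which is why these variants can escape basins faster without altering what is sampled in the long run.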

1711.01526 2026-06-04 cs.LG cs.SY eess.SY math.OC 版本更新

On Identification of Distribution Grids

配电网络的识别

Omid Ardakanian, Vincent W. S. Wong, Roel Dobbe, Steven H. Low, Alexandra von Meier, Claire Tomlin, Ye Yuan

发表机构 * Department of Electrical Engineering and Computer Sciences, UC Berkeley, USA(加州大学伯克利分校电气工程与计算机科学系,美国)

AI总结 本文研究如何利用遥测数据联合估计配电网络的模型参数(导纳值)和运行结构:采用lasso这一回归收缩与选择方法,提出能够处理配电系统低秩结构的易处理凸规划,并开发了一种在线算法,用于及早检测和定位引起导纳矩阵变化的关键事件。

详情
AI中文摘要

将分布式能源资源大规模接入住宅配电馈线,需要借助潮流分析对其运行进行细致的控制。配电系统模型对这类分析至关重要,但该模型往往不可得或已过时。近来同步相量技术被引入低压配电网,这为从多个位置的高精度、时间同步的电压和电流相量测量中学习该模型创造了前所未有的机会。本文关注如何借助lasso(一种回归收缩与选择方法),从可用的遥测数据中联合估计多相配电网络的模型参数(导纳值)和运行结构。我们提出了能够处理配电系统低秩结构的易处理凸规划,并开发了一种在线算法,用于及早检测和定位引起导纳矩阵变化的关键事件。这些技术的有效性通过对四个为真实家庭负荷供电的三相辐射状配电系统进行潮流研究得到了验证。

英文摘要

Large-scale integration of distributed energy resources into residential distribution feeders necessitates careful control of their operation through power flow analysis. While the knowledge of the distribution system model is crucial for this type of analysis, it is often unavailable or outdated. The recent introduction of synchrophasor technology in low-voltage distribution grids has created an unprecedented opportunity to learn this model from high-precision, time-synchronized measurements of voltage and current phasors at various locations. This paper focuses on joint estimation of model parameters (admittance values) and operational structure of a poly-phase distribution network from the available telemetry data via the lasso, a method for regression shrinkage and selection. We propose tractable convex programs capable of tackling the low rank structure of the distribution system and develop an online algorithm for early detection and localization of critical events that induce a change in the admittance matrix. The efficacy of these techniques is corroborated through power flow studies on four three-phase radial distribution systems serving real household demands.
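To make the lasso step concrete, the following is a minimal sketch and not the authors' formulation. It recovers one sparse row of an admittance matrix from noisy samples of the relation I = YV. It assumes a real-valued (DC-like) simplification, synthetic Gaussian voltage data, and a plain ISTA solver; the function names, the 8-bus size, and the penalty `lam` are hypothetical choices made only for illustration.

```python
import numpy as np

def soft_threshold(z, tau):
    """Proximal operator of tau * ||.||_1."""
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def lasso_ista(X, y, lam, n_iter=2000):
    """Minimize (1/2n)||Xw - y||^2 + lam * ||w||_1 with ISTA."""
    n, p = X.shape
    step = n / (np.linalg.norm(X, 2) ** 2)  # 1/L, L = ||X||_2^2 / n
    w = np.zeros(p)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y) / n
        w = soft_threshold(w - step * grad, step * lam)
    return w

def demo(seed=0):
    """Hypothetical 8-bus example: bus 0 is connected to buses 1 and 3 only."""
    rng = np.random.default_rng(seed)
    n_bus, n_samples = 8, 200
    y_row = np.zeros(n_bus)
    y_row[[0, 1, 3]] = [2.5, -1.0, -1.5]        # sparse admittance row (assumed)
    V = rng.normal(size=(n_samples, n_bus))     # synthetic voltage samples
    I = V @ y_row + 0.01 * rng.normal(size=n_samples)  # noisy current at bus 0
    return y_row, lasso_ista(V, I, lam=0.05)

if __name__ == "__main__":
    truth, est = demo()
    print("true row:     ", truth)
    print("lasso estimate:", np.round(est, 3))
```

The nonzero pattern of the estimate plays the role of the operational structure (which lines are connected), while the coefficient values approximate the admittances up to the usual lasso shrinkage.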

1806.04225 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY math.OC 版本更新

PAC-Bayes Control: Learning Policies that Provably Generalize to Novel Environments

PAC-Bayes 控制:学习可证明地泛化到新环境的策略

Anirudha Majumdar, Alec Farid, Anoopkumar Sonar

发表机构 * Department of Mechanical and Aerospace Engineering(机械与航空航天工程系) Department of Computer Science(计算机科学系) Princeton University(普林斯顿大学)

AI总结 本文提出一种基于PAC-Bayes框架的机器人控制策略学习方法:将控制策略对新环境的泛化归约为监督学习中的假设泛化,得到新环境中期望代价的高概率上界,并通过最小化该上界为机器人系统提供强泛化保证。

Comments Extended version of paper presented at the 2018 Conference on Robot Learning (CoRL)

详情
AI中文摘要

我们的目标是在给定一组示例环境数据集的情况下,学习可证明地良好泛化到新环境的机器人控制策略。我们方法的关键技术思想是利用机器学习泛化理论中的工具:我们在控制策略对新环境的泛化与监督学习中假设的泛化之间建立了精确的类比(以归约的形式给出)。具体而言,我们利用Probably Approximately Correct (PAC)-Bayes框架,得到(随机)控制策略在新环境中期望代价的、以高概率成立的上界。我们提出了显式地最小化该上界的策略学习算法。在有限策略空间上优化时,相应的优化问题可以用凸优化(具体为相对熵规划)求解。在更一般的连续参数化策略(例如神经网络策略)的情形下,我们使用随机梯度下降来最小化该上界。我们给出了将该方法用于学习(1)反应式避障策略和(2)基于神经网络的抓取策略的仿真结果,还给出了Parrot Swing无人机在不同障碍物环境中飞行的硬件实验结果。这些例子展示了该方法为具有连续状态和动作空间、复杂(例如非线性)动力学、丰富传感输入(例如深度图像)以及基于神经网络的策略的机器人系统提供强泛化保证的潜力。

英文摘要

Our goal is to learn control policies for robots that provably generalize well to novel environments given a dataset of example environments. The key technical idea behind our approach is to leverage tools from generalization theory in machine learning by exploiting a precise analogy (which we present in the form of a reduction) between generalization of control policies to novel environments and generalization of hypotheses in the supervised learning setting. In particular, we utilize the Probably Approximately Correct (PAC)-Bayes framework, which allows us to obtain upper bounds that hold with high probability on the expected cost of (stochastic) control policies across novel environments. We propose policy learning algorithms that explicitly seek to minimize this upper bound. The corresponding optimization problem can be solved using convex optimization (Relative Entropy Programming in particular) in the setting where we are optimizing over a finite policy space. In the more general setting of continuously parameterized policies (e.g., neural network policies), we minimize this upper bound using stochastic gradient descent. We present simulated results of our approach applied to learning (1) reactive obstacle avoidance policies and (2) neural network-based grasping policies. We also present hardware results for the Parrot Swing drone navigating through different obstacle environments. Our examples demonstrate the potential of our approach to provide strong generalization guarantees for robotic systems with continuous state and action spaces, complicated (e.g., nonlinear) dynamics, rich sensory inputs (e.g., depth images), and neural network-based policies.

1905.00820 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

On the smoothness of nonlinear system identification

关于非线性系统辨识的光滑性

Antônio H. Ribeiro, Koen Tiels, Jack Umenberger, Thomas B. Schön, Luis A. Aguirre

发表机构 * Dept. of Information Technology, Uppsala University, Sweden(信息科技系,乌普萨拉大学,瑞典) Dept. of Mechanical Engineering, Eindhoven University of Technology, The Netherlands(机械工程系,埃因霍温理工大学,荷兰)

AI总结 本文研究线性和非线性系统预测误差参数估计中优化问题的光滑性,指出在模型非收缩的参数区域内,目标函数的Lipschitz常数和β-光滑性可能随仿真长度指数级增长,并提出用多重打靶法解决这一问题。

详情
Journal ref
Automatica, vol. 121, 109158, Nov. 2020
AI中文摘要

我们从新的角度探讨了线性和非线性系统预测误差参数估计中出现的优化问题的光滑性。我们证明,在参数空间中模型非收缩的区域,目标函数的Lipschitz常数和β-光滑性可能随仿真长度指数级增长,使得在这些区域内难以数值地找到极小值,甚至难以逃离这些区域。除了为这一问题提供理论上的理解之外,本文还提出使用多重打靶法作为可行的解决方案。所提出的方法最小化预测模型与观测值之间的误差。多重打靶法不在整个数据集上运行预测模型,而是将数据划分为较小的子集,并在每个子集上分别运行预测模型,从而使仿真长度成为一个设计参数,并使那些用标准方法无法求解的问题变得可解。通过在优化中加入约束条件,可以保证与原问题等价。我们通过估计具有混沌或不稳定行为的非线性系统的参数以及神经网络的参数来演示新方法。我们还将所提出的方法与多步超前预测误差最小化进行了比较分析。

英文摘要

We shed new light on the smoothness of optimization problems arising in prediction error parameter estimation of linear and nonlinear systems. We show that for regions of the parameter space where the model is not contractive, the Lipschitz constant and $β$-smoothness of the objective function might blow up exponentially with the simulation length, making it hard to numerically find minima within those regions or even to escape from them. In addition to providing theoretical understanding of this problem, this paper also proposes the use of multiple shooting as a viable solution. The proposed method minimizes the error between a prediction model and the observed values. Rather than running the prediction model over the entire dataset, multiple shooting splits the data into smaller subsets and runs the prediction model over each subset, making the simulation length a design parameter and making it possible to solve problems that would be infeasible using a standard approach. The equivalence to the original problem is obtained by including constraints in the optimization. The new method is illustrated by estimating the parameters of nonlinear systems with chaotic or unstable behavior, as well as neural networks. We also present a comparative analysis of the proposed method with multi-step-ahead prediction error minimization.
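The exponential blow-up can be seen on a toy case. The sketch below is an illustration under stated assumptions, not the paper's analysis: it uses a hypothetical scalar model x_{k+1} = theta * x_k and computes the sensitivity of the simulated state to theta by forward recursion. For |theta| > 1 (a non-contractive model) the sensitivity equals N * theta^(N-1) * x_0, so shortening the simulation length N, as multiple shooting does per segment, keeps it bounded.

```python
def sensitivity(theta, x0, length):
    """Return d x_length / d theta for the toy model x_{k+1} = theta * x_k."""
    x, dx = x0, 0.0
    for _ in range(length):
        dx = x + theta * dx   # chain rule applied to theta * x
        x = theta * x
    return dx

if __name__ == "__main__":
    theta, x0 = 1.5, 1.0      # non-contractive toy parameter (assumed)
    for length in (5, 10, 20, 40):
        print(length, sensitivity(theta, x0, length))
```

In this toy setting a single 40-step simulation has a sensitivity many orders of magnitude larger than a 5-step segment, which is the kind of gap that makes the simulation length worth treating as a design parameter.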

1904.10778 2026-06-04 cs.LG cs.SY eess.SY math.OC math.PR stat.ML 版本更新

Some Limit Properties of Markov Chains Induced by Stochastic Recursive Algorithms

由随机递归算法诱导的马尔可夫链的一些极限性质

Abhishek Gupta, Hao Chen, Jianzong Pi, Gaurav Tendolkar

发表机构 * Electrical and Computer Engineering Department, The Ohio State University(俄亥俄州立大学电气与计算机工程系) Microsoft Corp(微软公司)

AI总结 本文研究了由随机递归算法诱导的马尔可夫链的极限性质,通过分析迭代随机算子的收敛性,证明了随机序列的分布弱收敛于收缩算子生成的轨迹,并进一步展示了随机序列的时间平均收敛于不变分布的空间均值。

Comments Accepted in SIMODS, 37 pages

详情
AI中文摘要

递归随机算法近年来因数据驱动的应用而受到广泛关注,例如用于求解大规模优化问题的随机梯度下降,以及用于求解马尔可夫决策问题的经验动态规划算法。这些递归随机算法近似某些收缩算子,可以在迭代随机算子的框架内加以考察。因此,我们考虑波兰空间上的迭代随机算子,它们模拟该波兰空间上的迭代收缩算子。假设迭代随机算子以批量大小为索引,并且当批量大小趋于无穷时,随机算子的每个实现都(在某种意义下)收敛到它所模拟的收缩算子。我们证明,从相同的初始条件出发,由迭代随机算子生成的随机序列的分布弱收敛到由收缩算子生成的轨迹。我们进一步证明,在一定条件下,随机序列的时间平均收敛到不变分布的空间均值。随后,我们将这些结果应用于逻辑回归,以及有限状态、有限动作MDP的经验价值迭代和经验Q值迭代,以说明本文发展的一般理论。

英文摘要

Recursive stochastic algorithms have gained significant attention in the recent past due to data-driven applications. Examples include stochastic gradient descent for solving large-scale optimization problems and empirical dynamic programming algorithms for solving Markov decision problems. These recursive stochastic algorithms approximate certain contraction operators and can be viewed within the framework of iterated random operators. Accordingly, we consider iterated random operators over a Polish space that simulate an iterated contraction operator over that Polish space. Assume that the iterated random operators are indexed by certain batch sizes such that as batch sizes grow to infinity, each realization of the random operator converges (in some sense) to the contraction operator it is simulating. We show that starting from the same initial condition, the distribution of the random sequence generated by the iterated random operators converges weakly to the trajectory generated by the contraction operator. We further show that under certain conditions, the time average of the random sequence converges to the spatial mean of the invariant distribution. We then apply these results to logistic regression, empirical value iteration, and empirical Q value iteration for finite state finite action MDPs to illustrate the general theory developed here.

1501.07242 2026-06-04 math.NA cs.LG cs.NA math.OC 版本更新

Escaping the Local Minima via Simulated Annealing: Optimization of Approximately Convex Functions

通过模拟退火逃离局部极小值:近似凸函数的优化

Alexandre Belloni, Tengyuan Liang, Hariharan Narayanan, Alexander Rakhlin

发表机构 * The Fuqua School of Business, Duke University(杜克大学福库商学院) Department of Statistics, The Wharton School, University of Pennsylvania(宾夕法尼亚大学沃顿商学院统计系) Department of Statistics and Department of Mathematics, University of Washington(华盛顿大学统计系和数学系)

AI总结 本文研究如何通过模拟退火优化近似凸函数:将问题归约为用Hit-and-Run方法从近似对数凹分布中采样,该算法不依赖一阶条件,因而基本不受局部极小值影响,并在零阶随机凸优化中以$\mathcal{O}^*(n^{7.5}ε^{-2})$次带噪函数评估得到$ε$-极小点。

Comments 27 pages

详情
Journal ref
Proceedings of the 28th Conference on Learning Theory 40 (2015) 240-265
AI中文摘要

我们考虑仅使用函数值评估,在$\mathbb{R}^n$中的有界凸集上优化近似凸函数的问题。该问题被归约为使用Hit-and-Run方法从近似对数凹分布中采样,我们证明其$\mathcal{O}^*$复杂度与从对数凹分布中采样相同。除了将针对对数凹分布的分析推广到近似对数凹分布之外,Hit-and-Run游走中一维采样器的实现也需要新的方法和分析。该算法基于模拟退火,不依赖一阶条件,因而基本不受局部极小值的影响。随后,我们将该方法应用于若干不同的动机问题。在零阶随机凸优化的背景下,所提出的方法通过诱导一个$\mathcal{O}(ε/n)$-近似对数凹分布,在$\mathcal{O}^*(n^{7.5}ε^{-2})$次带噪函数评估后得到一个$ε$-极小点。我们还详细考虑了“非凸程度”随着接近函数最优点而衰减的情形。本文讨论的该方法的其他应用包括经验风险最小化器的隐私计算、两阶段随机规划以及在线学习的近似动态规划。

英文摘要

We consider the problem of optimizing an approximately convex function over a bounded convex set in $\mathbb{R}^n$ using only function evaluations. The problem is reduced to sampling from an approximately log-concave distribution using the Hit-and-Run method, which is shown to have the same $\mathcal{O}^*$ complexity as sampling from log-concave distributions. In addition to extending the analysis for log-concave distributions to approximately log-concave distributions, the implementation of the 1-dimensional sampler of the Hit-and-Run walk requires new methods and analysis. The algorithm is then based on simulated annealing, which does not rely on first-order conditions and is therefore essentially immune to local minima. We then apply the method to different motivating problems. In the context of zeroth-order stochastic convex optimization, the proposed method produces an $ε$-minimizer after $\mathcal{O}^*(n^{7.5}ε^{-2})$ noisy function evaluations by inducing an $\mathcal{O}(ε/n)$-approximately log-concave distribution. We also consider in detail the case when the "amount of non-convexity" decays towards the optimum of the function. Other applications of the method discussed in this work include private computation of empirical risk minimizers, two-stage stochastic programming, and approximate dynamic programming for online learning.
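The combination of Hit-and-Run sampling with annealing can be sketched as follows. This is a simplified illustration and not the paper's algorithm: the feasible set is assumed to be a Euclidean ball, the 1-dimensional sampler along each chord is a plain grid discretization (the paper develops its own sampler and analysis), and the cooling schedule, test function `f`, and all names are hypothetical.

```python
import numpy as np

def hit_and_run_step(x, f, temperature, rng, radius=1.0, grid=256):
    """One Hit-and-Run move targeting density ∝ exp(-f / temperature) on a ball."""
    d = rng.normal(size=x.shape)
    d /= np.linalg.norm(d)                       # uniformly random direction
    # Chord of the ball through x along d: solve ||x + t d|| = radius for t.
    b = x @ d
    disc = np.sqrt(max(b * b - (x @ x - radius ** 2), 0.0))
    ts = np.linspace(-b - disc, -b + disc, grid)
    vals = np.array([f(x + t * d) for t in ts])  # zeroth-order information only
    # Discretized 1-D sampler along the chord (an assumption of this sketch).
    w = np.exp(-(vals - vals.min()) / temperature)
    t = rng.choice(ts, p=w / w.sum())
    return x + t * d

def anneal(f, x0, n_steps=300, t_start=1.0, t_end=1e-3, seed=0):
    """Run Hit-and-Run under a geometric cooling schedule (assumed schedule)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for temp in np.geomspace(t_start, t_end, n_steps):
        x = hit_and_run_step(x, f, temp, rng)
    return x

def f(x):
    """Approximately convex toy objective: a quadratic plus a small wiggle."""
    c = np.array([0.3, -0.2])
    return float((x - c) @ (x - c) + 0.01 * np.sin(50 * x[0]))

if __name__ == "__main__":
    x_final = anneal(f, [0.9, 0.0])
    print("final point:", x_final, "objective:", f(x_final))
```

Because each move uses only function values along a chord, the small oscillation in `f` creates no stationary points that could trap the walk, which is the intuition behind the immunity to local minima claimed in the abstract.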

1707.02568 2026-06-04 math.NA cs.LG cs.NA math.OC math.PR 版本更新

Solving high-dimensional partial differential equations using deep learning

利用深度学习解决高维偏微分方程

Jiequn Han, Arnulf Jentzen, Weinan E

发表机构 * Program in Applied and Computational Mathematics, Princeton University(普林斯顿大学应用与计算数学项目) Department of Mathematics, Princeton University(普林斯顿大学数学系) Beijing Institute of Big Data Research, Beijing(北京大数据研究院)

AI总结 本文提出了一种基于深度学习的方法,用于解决高维抛物型偏微分方程,通过将偏微分方程转化为反向随机微分方程,并利用神经网络近似未知解的梯度,有效提高了高维问题的准确性和效率。

Comments 13 pages, 6 figures

详情
Journal ref
Proceedings of the National Academy of Sciences, 115(34), 8505-8510 (2018)
AI中文摘要

开发用于求解高维偏微分方程(PDEs)的算法长期以来一直是一个极具挑战性的问题,由于著名的“维度灾难”问题。本文介绍了一种基于深度学习的方法,能够处理一般的高维抛物型PDEs。为此,PDEs被重新表述为反向随机微分方程,并且未知解的梯度通过神经网络近似,这在很大程度上类似于深度强化学习,其中梯度作为策略函数。在非线性Black-Scholes方程、Hamilton-Jacobi-Bellman方程和Allen-Cahn方程等示例上的数值结果表明,所提出的算法在高维情况下在准确性和成本方面都非常有效。这为经济学、金融学、运筹学和物理学开辟了新的可能性,通过同时考虑所有参与的代理、资产、资源或粒子,而不是对它们之间的相互关系做出任意假设。

英文摘要

Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the "curse of dimensionality". This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black-Scholes equation, the Hamilton-Jacobi-Bellman equation, and the Allen-Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up new possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their inter-relationships.

1709.05963 2026-06-04 math.NA cs.LG cs.NA cs.NE math.PR stat.ML 版本更新

Machine learning approximation algorithms for high-dimensional fully nonlinear partial differential equations and second-order backward stochastic differential equations

高维完全非线性偏微分方程和二阶反向随机微分方程的机器学习近似算法

Christian Beck, Weinan E, Arnulf Jentzen

发表机构 * ETH Zurich(苏黎世联邦理工学院) Beijing Institute of Big Data Research(北京大数据研究院) Princeton University(普林斯顿大学) Peking University(北京大学)

AI总结 本文提出了一种基于机器学习的高维非线性二阶偏微分方程的求解方法,通过将非线性偏微分方程与二阶反向随机微分方程联系起来,并利用深度神经网络进行空间近似和随机梯度下降优化,展示了该方法在高维Black-Scholes-Barenblatt方程、Hamilton-Jacobi-Bellman方程和非线性期望问题中的高效性和准确性。

Comments 56 pages, 12 figures

详情
Journal ref
J. Nonlinear Sci. 29, 1563-1619 (2019)
AI中文摘要

高维偏微分方程(PDE)出现在金融行业的多个模型中,例如衍生品定价模型、信用估值调整(CVA)模型或投资组合优化模型。这些应用中的PDE通常是高维的,因为维度对应于投资组合中的金融资产数量。此外,由于需要在模型中纳入某些非线性现象,如违约风险、交易成本、波动率不确定性(Knightian不确定性)或交易限制,这些PDE往往是完全非线性的。此类高维完全非线性PDE的求解极具挑战性,因为标准近似方法的计算努力随着维度呈指数增长。在本工作中,我们提出了一种新的方法来求解高维完全非线性二阶PDE。该方法可以特别用于采样高维非线性期望。该方法基于(i)完全非线性二阶PDE与二阶反向随机微分方程(2BSDE)之间的联系,(ii)PDE和2BSDE问题的合并公式,(iii)2BSDE的时间前向离散化和通过深度神经网络的空间近似,以及(iv)随机梯度下降型优化过程。使用Python中的TENSORFLOW获得的数值结果展示了该方法在100维Black-Scholes-Barenblatt方程、100维Hamilton-Jacobi-Bellman方程和100维G-布朗运动的非线性期望问题中的效率和准确性。

英文摘要

High-dimensional partial differential equations (PDEs) appear in a number of models from the financial industry, such as derivative pricing models, credit valuation adjustment (CVA) models, or portfolio optimization models. The PDEs in such applications are high-dimensional as the dimension corresponds to the number of financial assets in a portfolio. Moreover, such PDEs are often fully nonlinear due to the need to incorporate certain nonlinear phenomena in the model such as default risks, transaction costs, volatility uncertainty (Knightian uncertainty), or trading constraints. Such high-dimensional fully nonlinear PDEs are exceedingly difficult to solve as the computational effort for standard approximation methods grows exponentially with the dimension. In this work we propose a new method for solving high-dimensional fully nonlinear second-order PDEs. Our method can in particular be used to sample from high-dimensional nonlinear expectations. The method is based on (i) a connection between fully nonlinear second-order PDEs and second-order backward stochastic differential equations (2BSDEs), (ii) a merged formulation of the PDE and the 2BSDE problem, (iii) a temporal forward discretization of the 2BSDE and a spatial approximation via deep neural nets, and (iv) a stochastic gradient descent-type optimization procedure. Numerical results obtained using TensorFlow in Python illustrate the efficiency and the accuracy of the method in the cases of a $100$-dimensional Black-Scholes-Barenblatt equation, a $100$-dimensional Hamilton-Jacobi-Bellman equation, and a nonlinear expectation of a $100$-dimensional $G$-Brownian motion.

1706.04702 2026-06-04 math.NA cs.LG cs.NA cs.NE math.PR stat.ML 版本更新

Deep learning-based numerical methods for high-dimensional parabolic partial differential equations and backward stochastic differential equations

基于深度学习的高维抛物型偏微分方程和反向随机微分方程的数值方法

Weinan E, Jiequn Han, Arnulf Jentzen

发表机构 * Beijing Institute of Big Data Research (China)(北京大数据研究院(中国)) Princeton University (USA)(普林斯顿大学(美国)) Peking University (China)(北京大学(中国)) ETH Zurich (Switzerland)(苏黎世联邦理工学院(瑞士))

AI总结 本文提出了一种基于深度学习的算法,通过将反向随机微分方程与强化学习类比,利用解的梯度作为策略函数,采用神经网络近似策略函数,有效解决了高维非线性偏微分方程和反向随机微分方程的问题。

Comments 39 pages, 15 figures

详情
Journal ref
Commun. Math. Stat. 5, 349-380 (2017)
AI中文摘要

我们提出了一种新的算法,用于求解高维抛物型偏微分方程(PDEs)和反向随机微分方程(BSDEs),通过将BSDE与强化学习进行类比,将解的梯度作为策略函数,损失函数由给定的终端条件与BSDE解之间的误差构成。策略函数随后通过神经网络进行近似,如深度强化学习中所做的那样。使用TensorFlow进行的数值结果展示了所提出算法在解决物理和金融领域中多个100维非线性PDEs方面的效率和准确性,例如Allen-Cahn方程、Hamilton-Jacobi-Bellman方程以及金融衍生品的非线性定价模型。

英文摘要

We propose a new algorithm for solving parabolic partial differential equations (PDEs) and backward stochastic differential equations (BSDEs) in high dimension, by making an analogy between the BSDE and reinforcement learning with the gradient of the solution playing the role of the policy function, and the loss function given by the error between the prescribed terminal condition and the solution of the BSDE. The policy function is then approximated by a neural network, as is done in deep reinforcement learning. Numerical results using TensorFlow illustrate the efficiency and accuracy of the proposed algorithms for several 100-dimensional nonlinear PDEs from physics and finance such as the Allen-Cahn equation, the Hamilton-Jacobi-Bellman equation, and a nonlinear pricing model for financial derivatives.

1808.07452 2026-06-04 math.NA cs.LG cs.NA 版本更新

Generalized Canonical Polyadic Tensor Decomposition

广义典范多元(CP)张量分解

David Hong, Tamara G. Kolda, Jed A. Duersch

发表机构 * Sandia National Laboratories(桑迪亚国家实验室) University of Michigan(密歇根大学)

AI总结 本文提出了一种广义典范多元(GCP)张量分解,能够使用平方误差以外的损失函数,如逻辑损失或KL散度,从而适用于二值或计数数据,并展示了其在社交网络互动、小鼠神经活动和印度月降雨量等真实数据集上的灵活性。

详情
Journal ref
SIAM Review, Vol. 62, No. 1, pp. 133-163, 2020
AI中文摘要

张量分解是数据科学中一种基本的无监督机器学习方法,应用包括网络分析和传感器数据处理等。本文提出了一种广义典范多元(GCP)低秩张量分解,允许使用平方误差以外的其他损失函数。例如,我们可以使用逻辑损失或Kullback-Leibler散度,从而实现对二值数据或计数数据的张量分解。我们针对各种场景给出了多种具有统计学动机的损失函数。我们提供了一个用于计算梯度和处理缺失数据的通用框架,使得可以用标准优化方法来拟合模型。我们在多个真实世界的例子上展示了GCP的灵活性,包括社交网络中的互动、小鼠的神经活动以及印度的月降雨量测量。

英文摘要

Tensor decomposition is a fundamental unsupervised machine learning method in data science, with applications including network analysis and sensor data processing. This work develops a generalized canonical polyadic (GCP) low-rank tensor decomposition that allows other loss functions besides squared error. For instance, we can use logistic loss or Kullback-Leibler divergence, enabling tensor decomposition for binary or count data. We present a variety of statistically motivated loss functions for various scenarios. We provide a generalized framework for computing gradients and handling missing data that enables the use of standard optimization methods for fitting the model. We demonstrate the flexibility of GCP on several real-world examples including interactions in a social network, neural activity in a mouse, and monthly rainfall measurements in India.
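The idea of swapping the elementwise loss can be sketched in a few lines. The code below is an illustrative sketch under assumptions, not the authors' implementation: it evaluates a three-way CP model from factor matrices and sums an elementwise loss over the observed entries only. The loss formulas are common textbook forms (squared error, Poisson negative log-likelihood up to a constant, and Bernoulli with a logit link); the names `cp_model`, `gcp_objective`, and `LOSSES` are hypothetical.

```python
import numpy as np

def cp_model(A, B, C):
    """Rank-R CP model: M[i, j, k] = sum_r A[i, r] * B[j, r] * C[k, r]."""
    return np.einsum("ir,jr,kr->ijk", A, B, C)

# Elementwise losses f(x, m) for data entry x and model entry m.
LOSSES = {
    "squared": lambda x, m: (x - m) ** 2,                  # Gaussian data
    "poisson": lambda x, m: m - x * np.log(m + 1e-10),     # count data, m > 0
    "logistic": lambda x, m: np.log1p(np.exp(m)) - x * m,  # binary data, m = log-odds
}

def gcp_objective(X, factors, loss, mask=None):
    """Sum the chosen elementwise loss over observed entries (mask == True)."""
    vals = LOSSES[loss](X, cp_model(*factors))
    if mask is not None:
        vals = vals[mask]          # missing entries are simply left out
    return float(vals.sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A, B, C = (rng.uniform(0.1, 1.0, size=(n, 2)) for n in (3, 4, 5))
    counts = rng.poisson(cp_model(A, B, C))            # synthetic count tensor
    observed = rng.random(counts.shape) < 0.8          # about 80% of entries observed
    print(gcp_objective(counts, (A, B, C), "poisson", mask=observed))
```

Only the elementwise loss changes between data types; the masking step is one simple way to express the missing-data handling mentioned in the abstract.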

1904.01514 2026-06-04 math.NA cs.LG cs.NA 版本更新

Data driven approximation of parametrized PDEs by Reduced Basis and Neural Networks

基于降阶基和神经网络的数据驱动参数化PDE近似

Niccolò Dal Santo, Simone Deparis, Luca Pegolotti

发表机构 * SCI-SB-SD, École Polytechnique Fédérale de Lausanne (EPFL), Station 8, 1015 Lausanne, Switzerland(SCI-SB-SD,洛桑联邦理工学院(EPFL),Station 8,1015 洛桑,瑞士)

AI总结 本文提出一种结合降阶基方法和神经网络的数据驱动方法,用于近似参数化偏微分方程:在物理参数未知或难以直接测量的情况下,根据域内少量点的数据估计感兴趣的场,如材料样本的温度或流体的速度。

详情
AI中文摘要

我们关注利用基于降阶基方法和机器学习的数据驱动方法来近似偏微分方程。我们假设所关注的现象可以由参数化偏微分方程建模,但物理参数的值未知或难以直接测量。我们的方法可以在给定域内少量点处数据的情况下估计感兴趣的场,例如材料样本的温度或流体的速度。我们提出用一个神经网络来完成这一任务,该网络在最后一层嵌入一个降阶基求解器作为特殊的激活函数。降阶基求解器刻画了底层的物理现象,它由快照构建而成,这些快照是在代价高昂的离线阶段中针对随机选取的物理参数值计算得到的。随后,同一批全阶解被用于训练神经网络。实际上,所选架构类似于一个非对称自编码器,其中解码器就是降阶基求解器,因此不含可训练参数。该自编码器的潜在空间由输入降阶基求解器的参数相关量构成;根据所考虑的偏微分方程的不同,这些量可以是物理参数本身的值,也可以是微分算子的仿射分解系数。

英文摘要

We are interested in the approximation of partial differential equations with a data-driven approach based on the reduced basis method and machine learning. We suppose that the phenomenon of interest can be modeled by a parametrized partial differential equation, but that the value of the physical parameters is unknown or difficult to measure directly. Our method allows one to estimate fields of interest, for instance the temperature of a sample of material or the velocity of a fluid, given data at a handful of points in the domain. We propose to accomplish this task with a neural network embedding a reduced basis solver as an exotic activation function in the last layer. The reduced basis solver accounts for the underlying physical phenomenon and is constructed from snapshots obtained from randomly selected values of the physical parameters during an expensive offline phase. The same full order solutions are then employed for the training of the neural network. As a matter of fact, the chosen architecture resembles an asymmetric autoencoder in which the decoder is the reduced basis solver and as such does not contain trainable parameters. The resulting latent space of our autoencoder includes parameter-dependent quantities feeding the reduced basis solver, which, depending on the considered partial differential equation, are the values of the physical parameters themselves or the affine decomposition coefficients of the differential operators.

1903.11483 2026-06-04 cs.LG cs.NE cs.RO cs.SY eess.SY stat.ML 版本更新

Constructing Parsimonious Analytic Models for Dynamic Systems via Symbolic Regression

通过符号回归构建动态系统的简洁解析模型

Erik Derner, Jiří Kubalík, Nicola Ancona, Robert Babuška

发表机构 * Czech Institute of Informatics, Robotics, and Cybernetics(捷克信息学、机器人学与控制论研究所) Czech Technical University in Prague(布拉格捷克理工大学) Department of Control Engineering, Faculty of Electrical Engineering(电气工程学院控制工程系) Delft University of Technology(代尔夫特理工大学)

AI总结 本文提出利用符号回归构建动态系统的简洁解析模型,通过两种先进的符号回归算法在状态空间域和输入输出域中应用,展示了在模拟示例和真实系统中的优越性能。

详情
Journal ref
Applied Soft Computing, Volume 94, September 2020, 106432
AI中文摘要

构建动态系统的数学模型对于许多工程和科学学科至关重要。模型有助于模拟、分析系统行为、决策制定和自动控制算法的设计。即使像强化学习(RL)这样的无模型控制技术也已被证明能从使用模型中受益,通常这些模型是在线学习的。任何模型构建方法都必须处理模型的准确性和复杂性之间的权衡,这很难做到。本文提出利用符号回归(SR)来构建由解析方程描述的简洁过程模型。我们为方法配备了两种最先进的符号回归算法,它们自动搜索适合测量数据的方程:单节点遗传编程(SNGP)和多基因遗传编程(MGGP)。除了状态空间域中的标准问题表述外,我们还展示了该方法如何应用于非线性自回归加外生输入(NARX)类型的输入输出模型。我们展示了该方法在三个模拟示例中的应用,这些示例的状态空间最高可达14维:倒立摆、移动机器人和双足行走机器人。与深度神经网络和局部线性回归的比较表明,SR在大多数情况下优于这些常用替代方法。我们在真实摆系统上展示了解析模型的发现使RL控制器能够成功完成摆起任务,该模型仅基于100个数据样本构建。

英文摘要

Developing mathematical models of dynamic systems is central to many disciplines of engineering and science. Models facilitate simulations, analysis of the system's behavior, decision making and design of automatic control algorithms. Even inherently model-free control techniques such as reinforcement learning (RL) have been shown to benefit from the use of models, typically learned online. Any model construction method must address the tradeoff between the accuracy of the model and its complexity, which is difficult to strike. In this paper, we propose to employ symbolic regression (SR) to construct parsimonious process models described by analytic equations. We have equipped our method with two different state-of-the-art SR algorithms which automatically search for equations that fit the measured data: Single Node Genetic Programming (SNGP) and Multi-Gene Genetic Programming (MGGP). In addition to the standard problem formulation in the state-space domain, we show how the method can also be applied to input-output models of the NARX (nonlinear autoregressive with exogenous input) type. We present the approach on three simulated examples with up to 14-dimensional state space: an inverted pendulum, a mobile robot, and a bipedal walking robot. A comparison with deep neural networks and local linear regression shows that SR in most cases outperforms these commonly used alternative methods. We demonstrate on a real pendulum system that the analytic model found enables a RL controller to successfully perform the swing-up task, based on a model constructed from only 100 data samples.

1904.01068 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Efficient and Safe Exploration in Deterministic Markov Decision Processes with Unknown Transition Models

在未知转移模型的确定性马尔可夫决策过程中实现高效且安全的探索

Erdem Bıyık, Jonathan Margoliash, Shahrouz Ryan Alimo, Dorsa Sadigh

发表机构 * Stanford University(斯坦福大学) Jet Propulsion Laboratory(喷气推进实验室) California Institute of Technology(加州理工学院)

AI总结 本文提出了一种安全探索算法,通过利用Lipschitz连续性确保在探索过程中不访问危险状态,该算法在确定性马尔可夫决策过程中提供了确定性的安全保证,并通过模拟导航任务验证了其性能。

Comments Proceedings of the American Control Conference (ACC), July 2019. The first two authors have equal contribution

详情
AI中文摘要

我们提出了一种安全探索算法,用于具有未知转移模型的确定性马尔可夫决策过程。我们的算法通过利用Lipschitz连续性来保证安全性,确保在探索过程中不访问不安全的状态。与许多其他现有技术不同,所提供的安全保证是确定性的。我们的算法被优化以减少探索安全空间所需的操作次数。我们在导航任务的模拟中将我们的算法与基线方法进行了比较,以展示其性能。

英文摘要

We propose a safe exploration algorithm for deterministic Markov Decision Processes with unknown transition models. Our algorithm guarantees safety by leveraging Lipschitz-continuity to ensure that no unsafe states are visited during exploration. Unlike many other existing techniques, the provided safety guarantee is deterministic. Our algorithm is optimized to reduce the number of actions needed for exploring the safe space. We demonstrate the performance of our algorithm in comparison with baseline methods in simulation on navigation tasks.

1902.02095 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Space Navigator: a Tool for the Optimization of Collision Avoidance Maneuvers

Space Navigator:碰撞规避机动优化工具

Leonid Gremyachikh, Dmitrii Dubov, Nikita Kazeev, Andrey Kulibaba, Andrey Skuratov, Anton Tereshkin, Andrey Ustyuzhanin, Lubov Shiryaeva, Sergej Shishkin

发表机构 * National Research University Higher School of Economics, Laboratory of Methods for Big Data Analysis(俄罗斯国家研究大学高等经济学院,大数据分析方法实验室) Yandex School of Data Analysis(Yandex数据科学学院) Phygitalism

AI总结 本文提出了一种名为“Space Navigator”的模块化自主碰撞规避系统,通过将领域知识与强化学习方法相结合来优化碰撞规避机动,以应对由数千颗卫星组成的星座。

Comments Submitted to AAS Advances in the Astronautical Sciences, presented at IAA SciTech Forum 2018

详情
Journal ref
Advances in the Astronautical Sciences 2020 First IAA/AAS SciTech Forum on Space Flight Mechanics and Space Structures and Materials Conference, volume 170
AI中文摘要

由于计划发射由数千颗微卫星组成的星座,空间物体的数量将在几年内增长数倍,这会使卫星碰撞的威胁显著增加。航天器必须执行碰撞规避机动来降低风险。根据公开信息,目前交会事件由地面操作员手动处理。手动规划机动需要合格的人员,对于由数千颗卫星组成的星座而言并不现实。本文提出了一种新的模块化自主碰撞规避系统,称为“Space Navigator”,它基于一种将领域知识与强化学习方法相结合的新颖机动优化方法。

英文摘要

The number of space objects will grow several times in a few years due to the planned launches of constellations of thousands of microsatellites. This leads to a significant increase in the threat of satellite collisions. Spacecraft must undertake collision avoidance maneuvers to mitigate the risk. According to publicly available information, conjunction events are now manually handled by operators on Earth. Manual maneuver planning requires qualified personnel and will be impractical for constellations of thousands of satellites. In this paper we propose a new modular autonomous collision avoidance system called "Space Navigator". It is based on a novel maneuver optimization approach that combines domain knowledge with Reinforcement Learning methods.

1711.11417 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Scalable synthesis of safety certificates from data with application to learning-based control

基于数据的安全证书可扩展合成及其在基于学习的控制中的应用

Kim P. Wabersich, Melanie N. Zeilinger

发表机构 * Institute for Dynamic Systems and Control, ETH Zurich(动态系统与控制研究所,苏黎世联邦理工学院)

AI总结 本文提出了一种高效的方法来合成安全集和控制律,通过基于凸优化问题的近似方法,提高了可扩展性,同时利用高斯过程先验减少保守性,应用于自动驾驶车队等场景。

详情
AI中文摘要

复杂系统的控制面临着高性能与安全保证之间的权衡,这尤其限制了基于学习的方法在安全关键系统中的应用。为了解决这一问题,最近提出了一种框架,即使用安全控制器,以确保系统保持在状态空间的安全区域内。本文介绍了一种高效的合成安全集和控制律的方法,通过依赖于基于凸优化问题的近似方法,提供了改进的可扩展性。第一种方法仅需要近似的线性系统模型和未知非线性动力学的利普希茨连续性。第二种方法扩展了这些结果,展示了如何利用高斯过程先验来减少所得到的安全集的保守性。我们通过数值示例,包括自动驾驶车队,来展示这些结果。

英文摘要

The control of complex systems faces a trade-off between high performance and safety guarantees, which in particular restricts the application of learning-based methods to safety-critical systems. A recently proposed framework to address this issue is the use of a safety controller, which guarantees to keep the system within a safe region of the state space. This paper introduces efficient techniques for the synthesis of a safe set and control law, which offer improved scalability properties by relying on approximations based on convex optimization problems. The first proposed method requires only an approximate linear system model and Lipschitz continuity of the unknown nonlinear dynamics. The second method extends the results by showing how a Gaussian process prior on the unknown system dynamics can be used in order to reduce conservatism of the resulting safe set. We demonstrate the results with numerical examples, including an autonomous convoy of vehicles.

1905.10706 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Interactive Differentiable Simulation

交互式可微分模拟

Eric Heiden, David Millard, Hejia Zhang, Gaurav S. Sukhatme

发表机构 * University of Southern California(南加州大学)

AI总结 本文提出交互式可微分模拟(IDS),一种能够高效、准确地推断刚体系统物理属性的可微分物理引擎;它可利用视觉输入完成系统辨识,得到参数具有物理意义的可解释世界模型,并支持基于任务的机器人自动设计和非线性动力系统的参数估计,与自适应模型预测控制结合后,相比无模型强化学习在样本效率上有数量级的提升。

详情
AI中文摘要

智能体需要对世界有物理层面的理解,才能预测自身行动在未来产生的影响。基于学习的环境动力学模型虽然相比无模型强化学习算法显著提升了样本效率,但通常无法泛化到训练数据之外的系统状态,而且其预测往往建立在不可解释的潜在变量之上。我们提出交互式可微分模拟(IDS),这是一种可微分物理引擎,能够高效、准确地推断刚体系统的物理属性。集成到深度学习架构中后,我们的模型能够利用视觉输入完成系统辨识,得到参数具有物理意义的可解释世界模型。我们通过实验展示了如何借助IDS中自动计算的梯度,实现基于任务的机器人自动设计和非线性动力系统的参数估计。与自适应模型预测控制算法结合时,我们的方法在具有挑战性的非线性控制任务上,相比无模型强化学习算法在样本效率上有数量级的提升。

英文摘要

Intelligent agents need a physical understanding of the world to predict the impact of their actions in the future. While learning-based models of the environment dynamics have contributed to significant improvements in sample efficiency compared to model-free reinforcement learning algorithms, they typically fail to generalize to system states beyond the training data, while often grounding their predictions on non-interpretable latent variables. We introduce Interactive Differentiable Simulation (IDS), a differentiable physics engine, that allows for efficient, accurate inference of physical properties of rigid-body systems. Integrated into deep learning architectures, our model is able to accomplish system identification using visual input, leading to an interpretable model of the world whose parameters have physical meaning. We present experiments showing automatic task-based robot design and parameter estimation for nonlinear dynamical systems by automatically calculating gradients in IDS. When integrated into an adaptive model-predictive control algorithm, our approach exhibits orders of magnitude improvements in sample efficiency over model-free reinforcement learning algorithms on challenging nonlinear control domains.

1607.01027 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Accelerate Stochastic Subgradient Method by Leveraging Local Growth Condition

通过利用局部增长条件加速随机子梯度方法

Yi Xu, Qihang Lin, Tianbao Yang

发表机构 * Department of Computer Science(计算机科学系) Department of Management Sciences(管理科学系) The University of Iowa(爱荷华大学)

AI总结 本文提出了一种新的理论,表明在最优解邻域内目标函数的局部增长率足以量化一阶随机凸优化的全局收敛率,通过局部区域逐步缩小的方法改进了加速随机子梯度方法的收敛性,并在实践中提出了无需知道乘法增长常数和增长率的实用变体。

详情
AI中文摘要

在本文中,我们为一阶随机凸优化建立了一种新理论,表明最优解邻域内目标函数的局部增长率足以刻画全局收敛率。具体而言,如果目标函数$F(\mathbf w)$在$ε$子水平集内的增长速度不低于$\|\mathbf w - \mathbf w_*\|_2^{1/θ}$,其中$\mathbf w_*$是距离$\mathbf w$最近的最优解,$θ\in(0,1]$刻画局部增长率,则一阶随机优化达到$ε$最优解的迭代复杂度可以为$\widetilde O(1/ε^{2(1-θ)})$,这至多相差一个对数因子即为最优。为了实现更快的全局收敛,我们提出了两种不同的加速随机子梯度方法:在某个历史解周围的局部区域内迭代地近似求解原始问题,且该局部区域的大小随着解接近最优集而逐渐减小。除了理论改进外,这项工作还包含了使所提算法实用的新贡献:(i) 我们提出了加速随机子梯度方法的实用变体,运行时无需知道乘性增长常数,甚至无需知道增长率$θ$;(ii) 我们考虑了机器学习中的一大类问题,以证明所提算法比传统随机子梯度方法收敛更快。我们还刻画了所提算法在不假设光滑性的情况下保证梯度较小所需的复杂度。

英文摘要

In this paper, a new theory is developed for first-order stochastic convex optimization, showing that the global convergence rate is sufficiently quantified by a local growth rate of the objective function in a neighborhood of the optimal solutions. In particular, if the objective function $F(\mathbf w)$ in the $ε$-sublevel set grows as fast as $\|\mathbf w - \mathbf w_*\|_2^{1/θ}$, where $\mathbf w_*$ represents the closest optimal solution to $\mathbf w$ and $θ\in(0,1]$ quantifies the local growth rate, the iteration complexity of first-order stochastic optimization for achieving an $ε$-optimal solution can be $\widetilde O(1/ε^{2(1-θ)})$, which is optimal at most up to a logarithmic factor. To achieve the faster global convergence, we develop two different accelerated stochastic subgradient methods by iteratively solving the original problem approximately in a local region around a historical solution with the size of the local region gradually decreasing as the solution approaches the optimal set. Besides the theoretical improvements, this work also includes new contributions towards making the proposed algorithms practical: (i) we present practical variants of accelerated stochastic subgradient methods that can run without the knowledge of multiplicative growth constant and even the growth rate $θ$; (ii) we consider a broad family of problems in machine learning to demonstrate that the proposed algorithms enjoy faster convergence than traditional stochastic subgradient method. We also characterize the complexity of the proposed algorithms for ensuring the gradient is small without the smoothness assumption.

1809.09170 2026-06-04 math.NA cs.LG cs.NA math.DS stat.ML 版本更新

Numerical Aspects for Approximating Governing Equations Using Data

利用数据近似控制方程的数值方面

Kailiang Wu, Dongbin Xiu

发表机构 * Department of Mathematics, The Ohio State University, Columbus, OH 43210, USA.(数学系,俄亥俄州立大学,哥伦布,OH 43210,USA)

AI总结 本文提出了一种有效的数值算法,用于从测量数据中局部恢复未知的控制微分方程,通过使用多项式等标准基函数进行高精度近似,并讨论了准确近似的关键因素,如使用大量短时段轨迹数据而非单一长轨迹数据,同时展示了线性和非线性系统的数值示例。

Comments 26 pages, 17 figures

详情
Journal ref
Journal of Computational Physics, 384, 200-221, 2019
AI中文摘要

我们提出了有效的数值算法,用于从测量数据中局部恢复未知的控制微分方程。我们采用一组标准基函数(例如多项式)来高精度地近似控制方程。在将问题转化为函数逼近问题后,我们讨论了确保准确逼近的几个重要方面。最值得注意的是,我们讨论了使用大量短时段(short bursts)轨迹数据而非单一长轨迹数据的重要性。随后,我们给出了实现准确逼近的若干数值算法选项,以及最终方程近似的误差估计。然后,我们展示了线性和非线性系统的大量数值示例,以展示我们方程恢复算法的性质和有效性。

英文摘要

We present effective numerical algorithms for locally recovering unknown governing differential equations from measurement data. We employ a set of standard basis functions, e.g., polynomials, to approximate the governing equation with high accuracy. Upon recasting the problem into a function approximation problem, we discuss several important aspects for accurate approximation. Most notably, we discuss the importance of using a large number of short bursts of trajectory data, rather than using data from a single long trajectory. Several options for the numerical algorithms to perform accurate approximation are then presented, along with an error estimate of the final equation approximation. We then present an extensive set of numerical examples of both linear and nonlinear systems to demonstrate the properties and effectiveness of our equation recovery algorithms.
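The short-burst idea in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: the test system dx/dt = -2x, the two-point bursts, the finite-difference derivative, and the names `recover_rhs`, `n_bursts`, and `dt` are all assumptions chosen for illustration. Each burst is one random initial state plus the state a short time later, and the right-hand side is fitted by least squares in a monomial basis.

```python
import numpy as np

def recover_rhs(n_bursts=200, dt=1e-3, degree=2, seed=0):
    """Fit dx/dt ~ sum_k c_k x^k from many two-point bursts (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    # Many short bursts: random start states spread over the region of interest.
    x0 = rng.uniform(-1.0, 1.0, size=n_bursts)
    # Synthetic "measurement": exact flow of the assumed system dx/dt = -2x.
    x1 = x0 * np.exp(-2.0 * dt)
    # Finite-difference estimate of the time derivative for each burst.
    dxdt = (x1 - x0) / dt
    # Monomial basis 1, x, x^2, ... evaluated at the burst start points.
    V = np.vander(x0, degree + 1, increasing=True)
    coeffs, *_ = np.linalg.lstsq(V, dxdt, rcond=None)
    return coeffs

coeffs = recover_rhs()
print(coeffs)  # the linear coefficient should be close to -2
```

The recovered linear coefficient differs from -2 by an amount of order `dt`, which is the finite-difference error; the other coefficients should be close to zero.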

1905.13547 2026-06-04 cs.LG cs.SY eess.SY math.DS math.OC stat.ML 版本更新

Learning robust control for LQR systems with multiplicative noise via policy gradient

通过策略梯度学习具有乘性噪声的LQR系统的鲁棒控制

Benjamin Gravell, Peyman Mohajerin Esfahani, Tyler Summers

发表机构 * Control, Optimization, and Networks lab, UT Dallas(控制、优化与网络实验室,UT Dallas) Delft Center for Systems and Control, TU Delft(代尔夫特系统与控制中心,TU Delft)

AI总结 本文研究了具有乘性噪声的LQR系统,通过策略梯度方法实现鲁棒控制,证明了在非凸成本函数下策略梯度算法的全局收敛性。

详情
AI中文摘要

线性二次调节(LQR)问题重新成为强化学习控制复杂动态系统的重要理论基准,特别是当状态和动作空间连续时。与几乎所有近期相关工作不同,我们考虑了乘性噪声模型,这些模型由于显式地纳入系统动态中的固有不确定性和变化,从而提高了控制器的鲁棒性。鲁棒性是强化学习中一个关键但理解不足的问题;现有不考虑不确定性的方法可能会收敛到脆弱的策略或完全无法收敛。此外,有意地将乘性噪声注入到学习算法中可以增强策略的鲁棒性,如在领域随机化中的非正式工作所观察到的。尽管策略梯度算法需要优化非凸成本函数,我们展示了乘性噪声LQR成本具有称为梯度支配的特殊性质,该性质被用来证明策略梯度算法在问题参数上具有多项式依赖性的全局收敛性,以达到全局最优控制策略。结果在已知模型和未知模型设置中均提供,其中系统轨迹样本用于估计策略梯度。

英文摘要

The linear quadratic regulator (LQR) problem has reemerged as an important theoretical benchmark for reinforcement learning-based control of complex dynamical systems with continuous state and action spaces. In contrast with nearly all recent work in this area, we consider multiplicative noise models, which are increasingly relevant because they explicitly incorporate inherent uncertainty and variation in the system dynamics and thereby improve robustness properties of the controller. Robustness is a critical and poorly understood issue in reinforcement learning; existing methods which do not account for uncertainty can converge to fragile policies or fail to converge at all. Additionally, intentional injection of multiplicative noise into learning algorithms can enhance robustness of policies, as observed in ad hoc work on domain randomization. Although policy gradient algorithms require optimization of a non-convex cost function, we show that the multiplicative noise LQR cost has a special property called gradient domination, which is exploited to prove global convergence of policy gradient algorithms to the globally optimum control policy with polynomial dependence on problem parameters. Results are provided both in the model-known and model-unknown settings where samples of system trajectories are used to estimate policy gradients.

1902.09964 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

A Neural-Network-Based Model Predictive Control of Three-Phase Inverter With an Output LC Filter

带输出LC滤波器的三相逆变器的基于神经网络的模型预测控制

Ihab S. Mohamed, Stefano Rovetta, Ton Duc Do, Tomislav Dragicevic, Ahmed A. Zaki Diab

发表机构 * 1 INRIA Sophia Antipolis - Méditerranée, University Côte d'Azur, France 2 Department of Informatics, Bioengineering, Robotics and Systems Engineering, University of Genoa, Italy 3 Department of Robotics and Mechatronics, School of Science and Technology (SST), Nazarbayev University, Astana Z05H0P9, Republic of Kazakhstan 4 Department of Energy Technology, Aalborg University, Denmark 5 Electrical Engineering Department, Faculty of Engineering, Minia University, Egypt

AI总结 本文提出了一种结合模型预测控制(MPC)和前馈人工神经网络(ANN)的两电平变换器控制方案,旨在降低总谐波失真(THD)并提高系统在不同负载类型下的稳态和动态性能。先由MPC生成神经网络训练数据,再利用训练好的ANN在线实现无需MPC的电压跟踪,并通过MATLAB/Simulink仿真验证了该策略的优越性能。

Comments 13 pages, 15 figures, 3 tables. This article has been submitted to IEEE Access

详情
AI中文摘要

模型预测控制(MPC)已成为带输出LC滤波器的三相逆变器的成熟现代控制方法之一,这类应用需要总谐波失真(THD)低的高质量电压。尽管MPC是一种直观的控制器,易于理解和实现,但它有一个显著缺点,即需要大量在线计算来求解优化问题。另一方面,在电力电子和驱动领域,基于人工神经网络(ANN)等无模型方法的应用正在迅速增长。本文提出了一种新的两电平变换器控制方案,结合MPC和前馈ANN,旨在降低THD并提高系统在不同负载类型下的稳态和动态性能。首先,MPC在训练阶段作为专家,用于生成训练所提出神经网络所需的数据。然后,一旦神经网络完成调优,就可以在线用于电压跟踪,而无需再使用MPC。所提出的基于ANN的控制策略通过MATLAB/Simulink仿真进行了验证,并考虑了不同的负载条件。此外,我们在多种运行条件下,针对若干线性和非线性负载样本评估了基于ANN的控制器的性能,并与MPC的性能进行比较,结果表明所提出的基于ANN的控制策略具有优异的稳态和动态性能。

英文摘要

Model predictive control (MPC) has become one of the well-established modern control methods for three-phase inverters with an output LC filter, where a high-quality voltage with low total harmonic distortion (THD) is needed. Although it is an intuitive controller, easy to understand and implement, it has the significant disadvantage of requiring a large number of online calculations for solving the optimization problem. On the other hand, the application of model-free approaches such as those based on artificial neural networks approaches is currently growing rapidly in the area of power electronics and drives. This paper presents a new control scheme for a two-level converter based on combining MPC and feed-forward ANN, with the aim of getting lower THD and improving the steady and dynamic performance of the system for different types of loads. First, MPC is used, as an expert, in the training phase to generate data required for training the proposed neural network. Then, once the neural network is fine-tuned, it can be successfully used online for voltage tracking purpose, without the need of using MPC. The proposed ANN-based control strategy is validated through simulation, using MATLAB/Simulink tools, taking into account different loads conditions. Moreover, the performance of the ANN-based controller is evaluated, on several samples of linear and non-linear loads under various operating conditions, and compared to that of MPC, demonstrating the excellent steady-state and dynamic performance of the proposed ANN-based control strategy.

1904.05856 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Connections Between Adaptive Control and Optimization in Machine Learning

自适应控制与机器学习中优化方法之间的联系

Joseph E. Gaudio, Travis E. Gibson, Anuradha M. Annaswamy, Michael A. Bolender, Eugene Lavretsky

发表机构 * Massachusetts Institute of Technology(麻省理工学院) Brigham and Women’s Hospital and Harvard Medical School(布莱根妇女医院和哈佛医学院) Air Force Research Laboratory(空军研究实验室) The Boeing Company(波音公司)

AI总结 本文探讨了自适应控制与机器学习中常用优化方法之间的联系,通过分析更新律的相似性,讨论了稳定性、性能和学习等共同概念,提出了新的交叉点和改进算法分析的机会,并借助这些交叉点带来的见解解决了一个与高阶学习相关的具体问题。

Comments 18 pages

详情
AI中文摘要

本文展示了自适应控制与机器学习中常用优化方法之间的许多直接联系。从共同的输出误差形式出发,考察了对更新律所作修改的相似之处。然后讨论了两个领域共有的稳定性、性能和学习等概念。基于更新律的相似性和共同概念,本文给出了新的交叉点和改进算法分析的机会。特别是,借助这些交叉点带来的见解,解决了一个与高阶学习相关的具体问题。

英文摘要

This paper demonstrates many immediate connections between adaptive control and optimization methods commonly employed in machine learning. Starting from common output error formulations, similarities in update law modifications are examined. Concepts in stability, performance, and learning, common to both fields are then discussed. Building on the similarities in update laws and common concepts, new intersections and opportunities for improved algorithm analysis are provided. In particular, a specific problem related to higher order learning is solved through insights obtained from these intersections.

1602.04450 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Bayesian Optimization with Safety Constraints: Safe and Automatic Parameter Tuning in Robotics

具有安全约束的贝叶斯优化:机器人中的安全自动参数调节

Felix Berkenkamp, Andreas Krause, Angela P. Schoellig

发表机构 * 1 Learning & Adaptive Systems Group, Department of Computer Science, ETH Zurich, Switzerland. 2 Dynamic Systems Lab, Institute for Aerospace Studies, University of Toronto, Canada.

AI总结 本文提出了一种通用算法,允许在目标函数之外存在多个独立的安全约束。该算法在给定初始安全参数的情况下,最大化性能,但仅评估满足所有安全约束的参数。通过利用高斯过程先验的正则性假设,该算法仔细探索参数空间,并展示了如何利用上下文变量安全地将知识转移到新任务中。

详情
AI中文摘要

机器人算法通常依赖于各种参数,这些参数的选择对机器人的性能有显著影响。虽然参数的初始猜测可以从机器人的动力学模型中获得,但通常需要在真实系统上手动调整参数以达到最佳性能。贝叶斯优化等优化算法已被用来自动化这一过程。然而,这些方法在优化过程中可能会评估不安全的参数,导致安全关键的系统故障。最近提出的一种名为SafeOpt的安全贝叶斯优化算法保证系统性能永远不会低于某个临界值;也就是说,安全性是基于性能函数定义的。然而,在机器人领域,将性能与安全性耦合在一起往往并不理想。例如,高增益控制器可能实现较低的平均跟踪误差(性能),但可能发生超调并违反输入约束。在本文中,我们提出了一种推广的算法,允许在目标函数之外存在多个独立的安全约束。给定一组初始的安全参数,该算法最大化性能,但只评估以高概率满足所有安全约束的参数。为此,它利用以高斯过程先验表达的正则性假设,谨慎地探索参数空间。此外,我们展示了如何利用上下文变量将知识安全地迁移到新的情境和任务中。我们给出了理论分析,并在四旋翼飞行器上的实验中证明,所提出的算法能够快速、自动且安全地优化整定参数。

英文摘要

Robotic algorithms typically depend on various parameters, the choice of which significantly affects the robot's performance. While an initial guess for the parameters may be obtained from dynamic models of the robot, parameters are usually tuned manually on the real system to achieve the best performance. Optimization algorithms, such as Bayesian optimization, have been used to automate this process. However, these methods may evaluate unsafe parameters during the optimization process that lead to safety-critical system failures. Recently, a safe Bayesian optimization algorithm, called SafeOpt, has been developed, which guarantees that the performance of the system never falls below a critical value; that is, safety is defined based on the performance function. However, coupling performance and safety is often not desirable in robotics. For example, high-gain controllers might achieve low average tracking error (performance), but can overshoot and violate input constraints. In this paper, we present a generalized algorithm that allows for multiple safety constraints separate from the objective. Given an initial set of safe parameters, the algorithm maximizes performance but only evaluates parameters that satisfy safety for all constraints with high probability. To this end, it carefully explores the parameter space by exploiting regularity assumptions in terms of a Gaussian process prior. Moreover, we show how context variables can be used to safely transfer knowledge to new situations and tasks. We provide a theoretical analysis and demonstrate that the proposed algorithm enables fast, automatic, and safe optimization of tuning parameters in experiments on a quadrotor vehicle.

1812.08625 2026-06-04 math.NA cs.LG cs.NA physics.comp-ph stat.ML 版本更新

Deep Theory of Functional Connections: A New Method for Estimating the Solutions of PDEs

深度函数连接理论:一种用于估计偏微分方程解的新方法

Carl Leake

发表机构 * Ph.D. Student, Aerospace Engineering, Texas A&M University, College Station, TX(航空航天工程博士研究生,德克萨斯A&M大学,德克萨斯州大学城)

AI总结 本文提出了一种名为深度函数连接理论(deep TFC)的新方法,通过将神经网络与函数连接理论(TFC)结合来估计偏微分方程(PDEs)的解。该方法将带有边界条件的PDEs转换为无约束优化问题,以神经网络作为自由函数求解,并以PDE残差的平方作为损失函数进行无监督训练。与传统方法相比,该方法无需将求解域离散化,且能在整个训练域上给出闭式、解析、可微的近似解。

Comments 14 pages, 7 figures

详情
Journal ref
Mach. Learn. Knowl. Extr. 2020, 2(1), 37-55
AI中文摘要

本文提出了一种名为深度函数连接理论(deep TFC)的新方法,通过将神经网络与函数连接理论(TFC)结合来估计偏微分方程(PDEs)的解。TFC通过将边界条件嵌入到一个“约束表达式”中,把带有边界条件的PDEs转换为无约束优化问题。在本工作中,神经网络被选为自由函数,并用于求解这一无约束的优化问题。损失函数取为PDE残差的平方。然后,神经网络以无监督的方式训练,以求解该无约束优化问题。与用于估计PDE解的流行方法相比,该方法有两个主要区别。首先,该方法不需要将求解域离散化为网格,而是在训练阶段从域中随机采样点。其次,训练后,该方法在整个训练域内给出闭式、解析、可微的近似解。相比之下,如果需要在不位于离散网格上的点处估计解,其他流行方法则需要插值。本文在四个具有不同边界条件的问题上演示了深度TFC方法。

英文摘要

This article presents a new methodology called deep Theory of Functional Connections (TFC) that estimates the solutions of partial differential equations (PDEs) by combining neural networks with TFC. TFC is used to transform PDEs with boundary conditions into unconstrained optimization problems by embedding the boundary conditions into a "constrained expression." In this work, a neural network is chosen as the free function, and used to solve the now unconstrained optimization problem. The loss function is taken as the square of the residual of the PDE. Then, the neural network is trained in an unsupervised manner to solve the unconstrained optimization problem. This methodology has two major differences when compared with popular methods used to estimate the solutions of PDEs. First, this methodology does not need to discretize the domain into a grid, rather, this methodology randomly samples points from the domain during the training phase. Second, after training, this methodology represents a closed form, analytical, differentiable approximation of the solution throughout the entire training domain. In contrast, other popular methods require interpolation if the estimated solution is desired at points that do not lie on the discretized grid. The deep TFC method for estimating the solution of PDEs is demonstrated on four problems with a variety of boundary conditions.
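The "constrained expression" idea can be sketched on a toy problem. This sketch makes several assumptions that are not in the abstract: the problem is the ODE y' + y = 0 with y(0) = 1 rather than a PDE, and the free function is a polynomial (fitted by linear least squares) rather than a neural network, so that the example stays short and deterministic. The names `basis`, `basis_dx`, and `y` are hypothetical. The constrained expression y(x) = g(x) + (1 - g(0)) satisfies the boundary condition for any free function g, so only the equation residual at randomly sampled points needs to be minimized.

```python
import numpy as np

K = 8                                   # number of monomials x^1 .. x^K in the free function
k = np.arange(1, K + 1)
rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, size=100)    # random collocation points, no grid

def basis(x):
    """Monomials x^k, k = 1..K; they vanish at x = 0, so g(0) = 0."""
    return np.asarray(x, dtype=float)[:, None] ** k

def basis_dx(x):
    """Derivatives k * x^(k-1) of the monomials above."""
    return k * np.asarray(x, dtype=float)[:, None] ** (k - 1)

# Constrained expression: y(x) = 1 + basis(x) @ c, so y(0) = 1 for every c.
# Residual of y' + y = 0 is linear in c: (basis_dx + basis) @ c + 1.
M = basis_dx(xs) + basis(xs)
c, *_ = np.linalg.lstsq(M, -np.ones_like(xs), rcond=None)

def y(x):
    """Approximate solution, defined at any point of the domain without interpolation."""
    return 1.0 + basis(np.atleast_1d(x)) @ c

print(y(0.0)[0], y(1.0)[0], np.exp(-1.0))
```

Because the boundary condition is built into the expression, it holds exactly regardless of how well the residual is minimized; replacing the polynomial with a neural network would turn the least-squares solve into gradient-based training on the same squared residual.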

1809.02341 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

A Fast Anderson-Chebyshev Acceleration for Nonlinear Optimization

非线性优化中的快速安德森-切比雪夫加速方法

Zhize Li, Jian Li

发表机构 * King Abdullah University of Science and Technology(阿卜杜拉国王科技大学) Tsinghua University(清华大学)

AI总结 本文提出了一种快速安德森-切比雪夫加速方法,用于非线性优化问题,该方法在二次函数上实现了最优收敛率O(√κ ln(1/ε)),并提供了通用非线性问题的收敛分析,同时提出了动态猜测超参数的算法。

Comments To appear in AISTATS 2020

详情
AI中文摘要

安德森加速(又称安德森混合)是一种针对不动点迭代$x_{t+1}=G(x_t)$的高效加速方法,例如梯度下降可以视为迭代应用算子$G(x) \triangleq x-α\nabla f(x)$。已知安德森加速在实践中相当高效,并且可以视为Krylov子空间方法在非线性问题上的推广。本文表明,结合切比雪夫多项式的安德森加速可以达到最优收敛率$O(\sqrt{κ}\ln\frac{1}{ε})$,这改进了(Toth and Kelley, 2015)针对二次函数给出的结果$O(κ\ln\frac{1}{ε})$。此外,我们给出了最小化一般非线性问题的收敛分析。另外,如果超参数(例如Lipschitz光滑参数$L$)未知,我们提出了一种动态猜测这些超参数的算法,并证明了类似的收敛率。最后,实验结果表明,所提出的安德森-切比雪夫加速方法比普通梯度下降(GD)、Nesterov加速GD等其他算法收敛得显著更快。此外,这些算法与所提出的猜测算法(动态猜测超参数)结合后取得了好得多的性能。

英文摘要

Anderson acceleration (or Anderson mixing) is an efficient acceleration method for fixed point iterations $x_{t+1}=G(x_t)$, e.g., gradient descent can be viewed as iteratively applying the operation $G(x) \triangleq x-α\nabla f(x)$. It is known that Anderson acceleration is quite efficient in practice and can be viewed as an extension of Krylov subspace methods for nonlinear problems. In this paper, we show that Anderson acceleration with Chebyshev polynomial can achieve the optimal convergence rate $O(\sqrtκ\ln\frac{1}ε)$, which improves the previous result $O(κ\ln\frac{1}ε)$ provided by (Toth and Kelley, 2015) for quadratic functions. Moreover, we provide a convergence analysis for minimizing general nonlinear problems. Besides, if the hyperparameters (e.g., the Lipschitz smooth parameter $L$) are not available, we propose a guessing algorithm for guessing them dynamically and also prove a similar convergence rate. Finally, the experimental results demonstrate that the proposed Anderson-Chebyshev acceleration method converges significantly faster than other algorithms, e.g., vanilla gradient descent (GD), Nesterov's Accelerated GD. Also, these algorithms combined with the proposed guessing algorithm (guessing the hyperparameters dynamically) achieve much better performance.
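For readers unfamiliar with the base technique, the following is a minimal sketch of plain Anderson acceleration applied to the gradient-descent map $G(x) = x - α\nabla f(x)$ on a small quadratic. It is not the Anderson-Chebyshev method of the paper and contains no Chebyshev step or hyperparameter guessing; the function name `anderson`, the memory size `m`, the tolerance `tol`, and the test matrix are assumptions made for illustration.

```python
import numpy as np

def anderson(G, x0, m=5, iters=40, tol=1e-10):
    """Plain Anderson acceleration of the fixed-point iteration x <- G(x)."""
    x = np.asarray(x0, dtype=float)
    gx = G(x)
    G_hist, R_hist = [gx], [gx - x]      # past values G(x_k) and residuals G(x_k) - x_k
    x = gx                               # first step is an ordinary fixed-point step
    for _ in range(iters - 1):
        gx = G(x)
        r = gx - x
        if np.linalg.norm(r) < tol:      # residual small enough: accept G(x)
            return gx
        G_hist.append(gx)
        R_hist.append(r)
        G_hist, R_hist = G_hist[-(m + 1):], R_hist[-(m + 1):]
        dR = np.diff(np.array(R_hist), axis=0).T   # columns: residual differences
        dG = np.diff(np.array(G_hist), axis=0).T   # columns: differences of G values
        # Least-squares mixing coefficients: minimize ||r - dR @ gamma||.
        gamma, *_ = np.linalg.lstsq(dR, r, rcond=None)
        x = gx - dG @ gamma
    return x

# Assumed test problem: f(x) = 0.5 x^T A x - b^T x with condition number 100.
diag = np.array([1.0, 10.0, 25.0, 50.0, 100.0])
A, b = np.diag(diag), np.ones(5)
alpha = 1.0 / diag.max()
G = lambda x: x - alpha * (A @ x - b)
x_star = b / diag

x_aa = anderson(G, np.zeros(5))
x_gd = np.zeros(5)
for _ in range(40):                      # same iteration budget for plain gradient descent
    x_gd = G(x_gd)

err_aa = np.linalg.norm(x_aa - x_star)
err_gd = np.linalg.norm(x_gd - x_star)
print(err_aa, err_gd)
```

On this small, badly conditioned quadratic the accelerated iterate should end up far closer to the minimizer than plain gradient descent with the same budget, which is the qualitative effect the abstract refers to.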

1812.11137 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Differential Temporal Difference Learning

差分时间差分学习

Adithya M. Devraj, Ioannis Kontoyiannis, Sean P. Meyn

发表机构 * Department of Electrical and Computer Engineering, University of Florida(佛罗里达大学电气与计算机工程系) Department of Engineering, University of Cambridge(剑桥大学工程系)

AI总结 本文提出了一种新的差分时间差分学习算法,旨在解决传统时间差分学习方法中收敛缓慢和相对价值函数计算中一致性算法仅在特殊情况下存在的问题。

Comments Preliminary versions of some of the results in this article were submitted as arXiv:1604.01828

详情
AI中文摘要

由马尔可夫决策过程导出的价值函数,是机器学习技术在统计和工程领域的许多应用中算法和性能指标的核心组成部分。在大多数实际情形中,求解相应的贝尔曼方程都具有挑战性。时间差分(TD)学习算法是一类流行的近似技术,也是通用强化学习方法的重要子类。本文介绍的算法旨在解决TD学习方法的两个众所周知的难题:由极高方差导致的收敛缓慢,以及在计算相对价值函数的问题中,相合(consistent)的算法仅在特殊情况下存在。首先,我们表明这些价值函数的梯度具有一种便于算法设计的表示形式。基于这一结果,我们引入了一类新的差分TD学习算法。对于欧几里得空间上具有光滑动力学的马尔可夫模型,这些算法在一般条件下被证明是相合的。数值结果表明,与标准方法相比,方差显著降低。

英文摘要

Value functions derived from Markov decision processes arise as a central component of algorithms as well as performance metrics in many statistics and engineering applications of machine learning techniques. Computation of the solution to the associated Bellman equations is challenging in most practical cases of interest. A popular class of approximation techniques, known as Temporal Difference (TD) learning algorithms, are an important sub-class of general reinforcement learning methods. The algorithms introduced in this paper are intended to resolve two well-known difficulties of TD-learning approaches: Their slow convergence due to very high variance, and the fact that, for the problem of computing the relative value function, consistent algorithms exist only in special cases. First we show that the gradients of these value functions admit a representation that lends itself to algorithm design. Based on this result, a new class of differential TD-learning algorithms is introduced. For Markovian models on Euclidean space with smooth dynamics, the algorithms are shown to be consistent under general conditions. Numerical results show dramatic variance reduction when compared to standard methods.

1905.11011 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY 版本更新

Robustness of accelerated first-order algorithms for strongly convex optimization problems

强凸优化问题中加速一阶算法的鲁棒性

Hesameddin Mohammadi, Meisam Razaviyayn, Mihailo R. Jovanović

发表机构 * Ming Hsieh Department of Electrical and Computer Engineering(谢明电气与计算机工程系) Daniel J. Epstein Department of Industrial and Systems Engineering(丹尼尔·J·埃普斯坦工业与系统工程系)

AI总结 本文研究了在梯度评估中存在随机不确定性的加速一阶算法的鲁棒性,分析了噪声对优化变量均方误差的影响,并探讨了噪声放大与收敛速率之间的根本权衡。

Comments 45 pages, 6 figures

详情
AI中文摘要

我们研究了加速一阶算法对梯度评估中随机不确定性的鲁棒性。具体而言,针对无约束、光滑、强凸的优化问题,我们考察了迭代点受到加性白噪声扰动时优化变量的均方误差。当通过对真实系统的测量来近似梯度,或在网络上进行分布式计算时,就可能出现这类不确定性。尽管针对此类问题的一阶算法的底层动力学是非线性的,我们仍建立了与最优解的均方偏差的上界,这些上界在常数因子意义下是紧的。我们的分析量化了任何类似于Nesterov方法或重球法的加速方案在噪声放大与收敛速率之间的根本权衡。为了获得更多解析上的洞察,对于强凸二次问题,我们显式地用目标函数Hessian矩阵的特征值表示了优化变量的稳态方差。我们证明,影响含噪算法鲁棒性的是Hessian的整个谱,而不仅仅是极端特征值。我们将这一结果具体应用于无向网络上的分布式平均问题,并考察了网络规模和拓扑结构对含噪加速算法鲁棒性的影响。

英文摘要

We study the robustness of accelerated first-order algorithms to stochastic uncertainties in gradient evaluation. Specifically, for unconstrained, smooth, strongly convex optimization problems, we examine the mean-squared error in the optimization variable when the iterates are perturbed by additive white noise. This type of uncertainty may arise in situations where an approximation of the gradient is sought through measurements of a real system or in a distributed computation over a network. Even though the underlying dynamics of first-order algorithms for this class of problems are nonlinear, we establish upper bounds on the mean-squared deviation from the optimal solution that are tight up to constant factors. Our analysis quantifies fundamental trade-offs between noise amplification and convergence rates obtained via any acceleration scheme similar to Nesterov's or heavy-ball methods. To gain additional analytical insight, for strongly convex quadratic problems, we explicitly evaluate the steady-state variance of the optimization variable in terms of the eigenvalues of the Hessian of the objective function. We demonstrate that the entire spectrum of the Hessian, rather than just the extreme eigenvalues, influence robustness of noisy algorithms. We specialize this result to the problem of distributed averaging over undirected networks and examine the role of network size and topology on the robustness of noisy accelerated algorithms.

1809.07180 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Projective Splitting with Forward Steps only Requires Continuity

仅需连续性即可实现投影分裂

Patrick R. Johnstone, Jonathan Eckstein

AI总结 本文研究了投影分裂算法中前向步骤的收敛条件,核心方法是在有限维空间中使用回溯线搜索,主要贡献是证明了对仅连续且定义域为全空间的算子,无需Lipschitz连续性假设即可收敛。

Comments 15 pages. arXiv admin note: text overlap with arXiv:1803.07043

详情
Journal ref
Optim. Lett. 14, 229-247 (2020)
AI中文摘要

在求解单调算子包含问题的投影分裂算法中,最近的一项创新是:对于Lipschitz连续的算子,可以使用两个前向步骤代替通常的近端步骤。本文证明,当前向步骤在有限维空间中执行时,Lipschitz假设是不必要的:对于仅连续且定义域为全空间的算子,回溯线搜索即可给出收敛的算法。

英文摘要

A recent innovation in projective splitting algorithms for monotone operator inclusions has been the development of a procedure using two forward steps instead of the customary proximal steps for operators that are Lipschitz continuous. This paper shows that the Lipschitz assumption is unnecessary when the forward steps are performed in finite-dimensional spaces: a backtracking linesearch yields a convergent algorithm for operators that are merely continuous with full domain.

1602.02726 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Local and Global Convergence of a General Inertial Proximal Splitting Scheme

一般惯性近端分裂格式的局部与全局收敛性

Patrick R. Johnstone, Pierre Moulin

发表机构 * Coordinated Science Laboratory, University of Illinois, Urbana, IL 61801, USA(协调科学实验室,伊利诺伊大学,厄巴纳,伊利诺伊州,61801,美国)

AI总结 本文研究了希尔伯特空间中的凸复合最小化问题,分析了一族通用的惯性近端分裂算法(GIPSA),证明了迭代点增量平方和的有限性、聚点的最优性,以及最小值可达时的弱收敛性。进一步分析了ℓ1正则化优化问题,展示了GIPSA在特定参数选择下的有限步活跃流形识别和局部线性收敛性,以及该分析在FISTA变体中的应用。

Comments 33 pages, 1 figure

详情
Journal ref
Comput Optim Appl 67, 259-292 (2017)
AI中文摘要

本文关注希尔伯特空间中的凸复合最小化问题。在这些问题中,目标函数是两个闭的、正常(proper)的凸函数之和,其中一个是光滑的,另一个的近端算子计算代价较低。我们分析了用于求解此类问题的一族通用惯性近端分裂算法(GIPSA)。我们证明了迭代点增量平方和的有限性以及聚点的最优性。如果最小值可以达到,则整个序列弱收敛。我们的分析统一并扩展了此前的若干结果。然后我们专注于$\ell_1$正则化优化,这是一种极为常见的特殊情况,其中非光滑项是$\ell_1$范数。对于某些参数选择,GIPSA在该问题上适合进行局部分析。对于这些选择,我们证明GIPSA能够实现有限步的“活跃流形识别”,即在有限次迭代内收敛到最优的支撑集和符号,此后GIPSA退化为最小化一个局部光滑函数。在一定条件下,局部线性收敛性成立。我们用惯性参数、步长和局部曲率刻画了收敛速率。我们的局部分析适用于快速迭代收缩阈值算法(FISTA)的某些近期变体,并为这些变体建立了活跃流形识别和局部线性收敛性。我们的分析表明,在这些FISTA变体中使用动量重启方案可以获得最优的局部线性收敛速率。

英文摘要

This paper is concerned with convex composite minimization problems in a Hilbert space. In these problems, the objective is the sum of two closed, proper, and convex functions where one is smooth and the other admits a computationally inexpensive proximal operator. We analyze a general family of inertial proximal splitting algorithms (GIPSA) for solving such problems. We establish finiteness of the sum of squared increments of the iterates and optimality of the accumulation points. Weak convergence of the entire sequence then follows if the minimum is attained. Our analysis unifies and extends several previous results. We then focus on $\ell_1$-regularized optimization, which is the ubiquitous special case where the nonsmooth term is the $\ell_1$-norm. For certain parameter choices, GIPSA is amenable to a local analysis for this problem. For these choices we show that GIPSA achieves finite "active manifold identification", i.e. convergence in a finite number of iterations to the optimal support and sign, after which GIPSA reduces to minimizing a local smooth function. Local linear convergence then holds under certain conditions. We determine the rate in terms of the inertia, stepsize, and local curvature. Our local analysis is applicable to certain recent variants of the Fast Iterative Shrinkage-Thresholding Algorithm (FISTA), for which we establish active manifold identification and local linear convergence. Our analysis motivates the use of a momentum restart scheme in these FISTA variants to obtain the optimal local linear convergence rate.
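A generic inertial proximal gradient step for $\ell_1$-regularized least squares can be sketched as follows. This is a minimal illustration, not the GIPSA parameterization analyzed in the paper: the constant inertia `beta`, the step size `1/L`, and the function names are assumptions. The proximal operator of the $\ell_1$-norm is soft-thresholding, which sets small entries exactly to zero and is what makes identification of the support possible.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1: shrink each entry toward zero by t."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def inertial_prox_grad(A, b, lam, beta=0.5, iters=500):
    """Minimize 0.5 * ||A x - b||^2 + lam * ||x||_1 with an inertial (momentum) term."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the smooth gradient
    step = 1.0 / L
    x_prev = x = np.zeros(A.shape[1])
    for _ in range(iters):
        y = x + beta * (x - x_prev)        # inertial extrapolation
        grad = A.T @ (A @ y - b)           # gradient of the smooth part at y
        x_prev, x = x, soft_threshold(y - step * grad, lam * step)
    return x

# Assumed toy problem with identity design, where the minimizer is known in closed
# form: x* = soft_threshold(b, lam).
A = np.eye(3)
b = np.array([3.0, -0.5, 1.5])
x_hat = inertial_prox_grad(A, b, lam=1.0)
print(x_hat)
```

With the identity design the iterate lands on the optimal support and signs after the first proximal step, a trivial instance of the finite identification behavior described in the abstract; for general matrices the identification takes more iterations and the choice of `beta` matters.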

1811.06838 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

The Trace Criterion for Kernel Bandwidth Selection for Support Vector Data Description

核带宽选择的迹准则用于支持向量数据描述

Arin Chaudhuri, Carol Sadek, Deovrat Kakde, Wenhao Hu, Hansi Jiang, Seunghyun Kong, Yuewei Liao, Sergiy Peredriy, Haoyu Wang

发表机构 * Internet of Things, SAS Institute Inc., Cary, NC, 27513(物联网,SAS公司,北卡罗来纳州卡里,27513)

AI总结 本文提出了一种新的无监督方法,用于选择支持向量数据描述(SVDD)中高斯核的带宽,通过利用核矩阵的低秩表示来建议带宽值,该方法在低维数据中与当前最佳方法竞争,并在许多高维数据类别中表现极佳。

Comments note: some text overlap with arXiv:1708.05106 because common background material is covered in both papers

详情
AI中文摘要

支持向量数据描述(SVDD)是一种流行的异常检测技术。SVDD分类器将整个数据空间划分为内群区域和外群区域。计算SVDD分类器需要一个核函数,高斯核是一个常见选择。高斯核有一个带宽参数,正确设置该参数对获得良好结果至关重要。小带宽会导致过拟合,使得SVDD分类器高估异常数量,而大带宽会导致欠拟合,无法检测许多异常。本文提出了一种新的无监督方法,用于选择高斯核的带宽。我们的方法利用核矩阵的低秩表示来建议带宽值。我们的新方法在低维数据中与当前最佳方法竞争,并在许多高维数据类别中表现极佳。由于当使用高斯核时,SVDD的数学公式与单类支持向量机(OCSVM)的数学公式相同,因此我们的方法同样适用于OCSVM的高斯核带宽调整。

英文摘要

Support vector data description (SVDD) is a popular anomaly detection technique. The SVDD classifier partitions the whole data space into an inlier region, which consists of the region near the training data, and an outlier region, which consists of points away from the training data. The computation of the SVDD classifier requires a kernel function, for which the Gaussian kernel is a common choice. The Gaussian kernel has a bandwidth parameter, and it is important to set the value of this parameter correctly for good results. A small bandwidth leads to overfitting such that the resulting SVDD classifier overestimates the number of anomalies, whereas a large bandwidth leads to underfitting and an inability to detect many anomalies. In this paper, we present a new unsupervised method for selecting the Gaussian kernel bandwidth. Our method exploits a low-rank representation of the kernel matrix to suggest a kernel bandwidth value. Our new technique is competitive with the current state of the art for low-dimensional data and performs extremely well for many classes of high-dimensional data. Because the mathematical formulation of SVDD is identical with the mathematical formulation of one-class support vector machines (OCSVM) when the Gaussian kernel is used, our method is equally applicable to Gaussian kernel bandwidth tuning for OCSVM.

1904.13317 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

A data-efficient geometrically inspired polynomial kernel for robot inverse dynamics

一种数据高效且受几何启发的多项式核用于机器人逆动力学

Alberto Dalla Libera, Ruggero Carli

AI总结 本文提出了一种基于高斯过程回归的数据驱动逆动力学估计器,引入了几何启发多项式核(GIP),该核在合适输入空间上将逆动力学描述为多项式函数,并证明其定义了有限维的再生核希尔伯特空间,包含刚体动力学计算的逆动力学函数,实验表明该方法在数据效率和泛化能力上优于其他数据驱动方法,同时相比模型驱动方法需要更少的先验信息且不受模型偏差影响。

详情
Journal ref
IEEE Robotics and Automation Letters, vol. 5, no. 1, pp. 24-31, Jan. 2020
AI中文摘要

在本文中,我们介绍了一种基于高斯过程回归的新数据驱动逆动力学估计器。受逆动力学可以描述为合适输入空间上的多项式函数的启发,我们提出了一个名为几何启发多项式核(GIP)的新核。所得到的估计器在数据效率方面与基于模型的方法相似。事实上,我们证明了GIP核定义了一个有限维的再生核希尔伯特空间,该空间包含通过刚体动力学计算的逆动力学函数。所提出的核基于最近引入的乘法多项式核,这是经典多项式核的重新定义,配备了允许更高正则化的参数集。我们已在模拟环境和UR10机器人的真实实验中测试了所提出的方法。获得的结果证实,与其它数据驱动估计器相比,所提出的方法在数据效率和泛化能力上更优。相反,与基于模型的估计器相比,我们的方法需要更少的先验信息且不受模型偏差影响。

英文摘要

In this paper, we introduce a novel data-driven inverse dynamics estimator based on Gaussian Process Regression. Driven by the fact that the inverse dynamics can be described as a polynomial function on a suitable input space, we propose the use of a novel kernel, called Geometrically Inspired Polynomial Kernel (GIP). The resulting estimator behaves similarly to model-based approaches as concerns data efficiency. Indeed, we proved that the GIP kernel defines a finite-dimensional Reproducing Kernel Hilbert Space that contains the inverse dynamics function computed through the Rigid Body Dynamics. The proposed kernel is based on the recently introduced Multiplicative Polynomial Kernel, a redefinition of the classical polynomial kernel equipped with a set of parameters that allows for a higher regularization. We tested the proposed approach in a simulated environment, and also in real experiments with a UR10 robot. The obtained results confirm that, compared to other data-driven estimators, the proposed approach is more data-efficient and exhibits better generalization properties. Instead, with respect to model-based estimators, our approach requires less prior information and is not affected by model bias.

1905.11266 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

One Method to Rule Them All: Variance Reduction for Data, Parameters and Many New Methods

一法统御诸法:数据、参数及许多新方法的方差缩减

Filip Hanzely, Peter Richtárik

发表机构 * King Abdullah University of Science and Technology(阿卜杜拉国王科技大学)

AI总结 本文提出了一种通用的方差缩减方法,适用于解决具有大量训练样例或大模型维度的正则化经验风险最小化问题。该方法在特殊情况下可以退化为多种已知且以前被认为无关的方法,如SAGA、LSVRG、JacSketch、SEGA和ISEGA及其任意采样和近端泛化。同时,本文还提出了许多具有有趣性质的新具体算法,并提供了一个单一定理,证明在光滑性和拟强凸性假设下方法的线性收敛性。

Comments 61 pages, 6 figures, 3 tables

详情
AI中文摘要

我们提出了一种极为通用的方差缩减方法,适用于求解训练样本数量很大、模型维度很大或两者兼有的正则化经验风险最小化问题。在特殊情况下,该方法退化为几种已知且此前被认为互不相关的方法,如SAGA、LSVRG、JacSketch、SEGA和ISEGA,以及它们的任意采样和近端推广。此外,我们还着重介绍了大量具有有趣性质的新的具体算法。我们给出了一个统一的定理,在光滑性和拟强凸性假设下证明了该方法的线性收敛性。借助这个定理,我们恢复了特殊情况下已知方法的最佳已知速率,有时还有所改进。作为副产品,我们首次给出了随机梯度类和随机坐标下降类方法的统一方法和理论。

英文摘要

We propose a remarkably general variance-reduced method suitable for solving regularized empirical risk minimization problems with either a large number of training examples, or a large model dimension, or both. In special cases, our method reduces to several known and previously thought to be unrelated methods, such as SAGA, LSVRG, JacSketch, SEGA and ISEGA, and their arbitrary sampling and proximal generalizations. However, we also highlight a large number of new specific algorithms with interesting properties. We provide a single theorem establishing linear convergence of the method under smoothness and quasi strong convexity assumptions. With this theorem we recover best-known and sometimes improved rates for known methods arising in special cases. As a by-product, we provide the first unified method and theory for stochastic gradient and stochastic coordinate descent type methods.
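As a concrete reference point for one of the special cases named above, here is a minimal sketch of standard SAGA on a least-squares problem. It is not the unified method of the paper; the function name `saga_least_squares`, the synthetic data, and the step size `1/(3 L_max)` are assumptions chosen for illustration. SAGA keeps a table with the last gradient seen for each example and uses it to correct the variance of the stochastic gradient.

```python
import numpy as np

def saga_least_squares(A, b, step, epochs=300, seed=0):
    """SAGA for f(x) = (1/n) * sum_i 0.5 * (a_i^T x - b_i)^2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    table = np.zeros((n, d))             # last stored gradient of each f_i
    avg = table.mean(axis=0)             # running average of the stored gradients
    for _ in range(epochs * n):
        i = rng.integers(n)
        g = (A[i] @ x - b[i]) * A[i]     # fresh gradient of f_i at the current x
        x -= step * (g - table[i] + avg) # variance-reduced gradient estimate
        avg += (g - table[i]) / n        # keep the average consistent with the table
        table[i] = g
    return x

# Assumed synthetic, noise-free problem so the minimizer is known exactly.
rng = np.random.default_rng(1)
A = rng.normal(size=(20, 3))
x_true = np.array([1.0, -2.0, 0.5])
b = A @ x_true
L_max = np.max(np.sum(A**2, axis=1))     # largest per-example smoothness constant
x_hat = saga_least_squares(A, b, step=1.0 / (3.0 * L_max))
print(x_hat)
```

Unlike plain stochastic gradient descent with a constant step, the corrected estimate has variance that vanishes at the solution, which is why a constant step size still gives linear convergence on this strongly convex problem.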

1807.01739 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY 版本更新

Proximal algorithms for large-scale statistical modeling and sensor/actuator selection

大规模统计建模和传感器/执行器选择的近端算法

Armin Zare, Hesameddin Mohammadi, Neil K. Dhingra, Tryphon T. Georgiou, Mihailo R. Jovanović

发表机构 * Ming Hsieh Department of Electrical and Computer Engineering, University of Southern California(南加州大学谢明电气与计算机工程系) Numerica Corporation(Numerica公司)

AI总结 本文提出了一种统一的近端算法框架,用于解决大规模系统建模与控制中的正则化半定规划问题,通过近端方法实现了对统计建模和传感器/执行器选择的高效处理,展示了算法的线性收敛性和有效性。

Comments To appear in IEEE Trans. Automat. Control

详情
AI中文摘要

随机驱动动态系统的建模与控制中的若干问题可以表述为正则化半定规划。我们考察了两个具有代表性的此类问题,并表明它们可以用相似的方式进行表述。第一个问题来自统计建模,旨在通过对先验动态进行适当且最小的扰动来协调观测到的统计量。第二个问题旨在为控制目的从可用的传感器和执行器中最优地选择一个子集。为了处理大规模系统的建模与控制,我们利用近端方法开发了一个统一的算法框架。我们的定制算法利用问题结构,能够处理的统计建模以及传感器和执行器选择问题的规模远大于当前通用求解器所能处理的规模。我们建立了近端梯度算法的线性收敛性,对比了所提出的近端算法与交替方向乘子法,并给出示例以说明该框架的优点和有效性。

英文摘要

Several problems in modeling and control of stochastically-driven dynamical systems can be cast as regularized semi-definite programs. We examine two such representative problems and show that they can be formulated in a similar manner. The first, in statistical modeling, seeks to reconcile observed statistics by suitably and minimally perturbing prior dynamics. The second seeks to optimally select a subset of available sensors and actuators for control purposes. To address modeling and control of large-scale systems we develop a unified algorithmic framework using proximal methods. Our customized algorithms exploit problem structure and allow handling statistical modeling, as well as sensor and actuator selection, for substantially larger scales than what is amenable to current general-purpose solvers. We establish linear convergence of the proximal gradient algorithm, draw contrast between the proposed proximal algorithms and alternating direction method of multipliers, and provide examples that illustrate the merits and effectiveness of our framework.

1905.08314 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Longitudinal Dynamic versus Kinematic Models for Car-Following Control Using Deep Reinforcement Learning

纵向动态模型与运动学模型在使用深度强化学习的汽车跟随控制中的比较

Yuan Lin, John McPhee, Nasser L. Azad

发表机构 * University of Waterloo, Ontario, Canada(加拿大安大略省滑铁卢大学)

AI总结 本文研究了在考虑车辆动力学的情况下,使用深度强化学习的纵向汽车跟随控制问题,通过引入延迟的控制输入和实际车辆加速度到强化学习环境状态中,改进了DRL框架,从而在考虑车辆动力学时实现了接近最优的控制性能。

Comments Accepted to 2019 IEEE Intelligent Transportation Systems Conference

详情
AI中文摘要

目前大多数关于通过深度强化学习(DRL)实现自动驾驶车辆控制的研究都使用点质量运动学模型,忽略了车辆动力学,包括加速度延迟和加速度命令动力学。加速度延迟源于传感和执行延迟,导致控制输入执行延迟。加速度命令动力学决定了实际车辆加速度不会立即达到期望的命令加速度,因为存在动力学限制。在本工作中,我们研究了将使用车辆运动学模型训练的DRL控制器应用于更现实的驾驶控制中的可行性。我们考虑了一个特定的纵向汽车跟随控制问题,即自适应巡航控制系统(ACC),该问题通过使用点质量运动学模型的DRL解决。当此类控制器应用于具有车辆动力学的汽车跟随时,我们观察到显著退化的汽车跟随性能。因此,我们重新设计DRL框架,通过将延迟的控制输入和实际车辆加速度分别添加到强化学习环境状态中,以适应加速度延迟和加速度命令动力学。训练结果表明,改进后的DRL控制器在考虑车辆动力学时的汽车跟随控制性能接近最优,与动态规划解决方案相比。

英文摘要

The majority of current studies on autonomous vehicle control via deep reinforcement learning (DRL) utilize point-mass kinematic models, neglecting vehicle dynamics which includes acceleration delay and acceleration command dynamics. The acceleration delay, which results from sensing and actuation delays, results in delayed execution of the control inputs. The acceleration command dynamics dictates that the actual vehicle acceleration does not rise up to the desired command acceleration instantaneously due to dynamics. In this work, we investigate the feasibility of applying DRL controllers trained using vehicle kinematic models to more realistic driving control with vehicle dynamics. We consider a particular longitudinal car-following control, i.e., Adaptive Cruise Control (ACC), problem solved via DRL using a point-mass kinematic model. When such a controller is applied to car following with vehicle dynamics, we observe significantly degraded car-following performance. Therefore, we redesign the DRL framework to accommodate the acceleration delay and acceleration command dynamics by adding the delayed control inputs and the actual vehicle acceleration to the reinforcement learning environment state, respectively. The training results show that the redesigned DRL controller results in near-optimal control performance of car following with vehicle dynamics considered when compared with dynamic programming solutions.
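To make the two effects named in this abstract concrete, here is a minimal simulation sketch. It assumes a fixed delay of a few time steps for the acceleration delay and a first-order lag for the acceleration command dynamics. Both modelling choices, and the names and values `simulate_acceleration`, `dt`, `tau`, and `delay_steps`, are illustrative assumptions rather than the paper's exact model.

```python
from collections import deque


def simulate_acceleration(commands, dt=0.1, tau=0.5, delay_steps=2):
    """Return (actual acceleration, speed) pairs for a sequence of commands.

    Assumed model (hypothetical, for illustration):
      - the command reaches the actuator `delay_steps` steps late;
      - the actual acceleration follows the delayed command through a
        first-order lag with time constant `tau`.
    """
    pending = deque([0.0] * delay_steps)  # commands still "in flight"
    accel = 0.0
    speed = 0.0
    trace = []
    for u in commands:
        pending.append(u)
        u_delayed = pending.popleft()
        # Forward-Euler step of: d(accel)/dt = (u_delayed - accel) / tau
        accel += dt * (u_delayed - accel) / tau
        speed += dt * accel
        trace.append((accel, speed))
    return trace
```

With a constant command, the actual acceleration stays at zero for `delay_steps` steps and then rises gradually toward the commanded value, which is the kind of mismatch a controller trained on a point-mass kinematic model does not see.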

1803.00204 2026-06-04 cs.LG cs.AI cs.NA math.NA stat.ML 版本更新

Scalar Quantization as Sparse Least Square Optimization

标量量化作为稀疏最小二乘优化

Chen Wang, Xiaomei Yang, Shaomin Fei, Kai Zhou, Xiaofeng Gong, Miao Du, Ruisen Luo

发表机构 * College of Electrical Engineering, Sichuan University(四川大学电气工程学院) Department of Computer Science, Rutgers University -- New Brunswick(罗格斯大学新布朗斯维克分校计算机科学系) Engineering Practice Center, Chengdu University of Information Technology(成都信息工程大学工程实践中心)

AI总结 本文提出了一种基于稀疏最小二乘优化的新方法,用于解决标量量化中的问题,通过引入l1、l1+l2和l0正则化,改进了传统聚类方法的不足,提升了在位宽缩减场景下的性能。

详情
Journal ref
IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019
AI中文摘要

量化可以用来形成具有共享值的新向量/矩阵,其值接近原始数据。近年来,标量量化在值共享应用中的普及度迅速上升,因为它在减少神经网络复杂度方面具有巨大实用性。现有的基于聚类的量化技术虽然发展成熟,但存在多个缺点,包括对随机种子的依赖性、空集群或超出范围的集群,以及大量集群时的时间复杂度高。为克服这些问题,本文从新的视角研究标量量化问题,即稀疏最小二乘优化。具体来说,受稀疏最小二乘回归性质的启发,提出了几种基于l1最小二乘的量化算法。此外,还提出了类似的方案,具有l1 + l2和l0正则化。此外,为了计算给定数量的值/集群的量化结果,本文设计了一种迭代方法和一种基于聚类的方法,并且两者都建立在稀疏最小二乘之上。本文表明,后者方法在数学上等价于改进版的k-means聚类基量化算法,尽管两种算法起源于不同的直觉。所提出的算法在三种类型的数据上进行了测试,比较和分析了其计算性能,包括信息损失、时间消耗以及稀疏向量值的分布。本文为量化领域提供了新的视角,所提出的算法在某些位宽缩减场景下表现优异,当所需的量化后分辨率(值的数量)不显著低于原始数量时尤其如此。

英文摘要

Quantization can be used to form new vectors/matrices with shared values close to the original. In recent years, the popularity of scalar quantization for value-sharing applications has been soaring, as it has proved highly useful in reducing the complexity of neural networks. Existing clustering-based quantization techniques, while well-developed, have multiple drawbacks including dependence on the random seed, empty or out-of-range clusters, and high time complexity for a large number of clusters. To overcome these problems, in this paper, the problem of scalar quantization is examined from a new perspective, namely sparse least square optimization. Specifically, inspired by the property of sparse least square regression, several quantization algorithms based on $l_1$ least square are proposed. In addition, similar schemes with $l_1 + l_2$ and $l_0$ regularization are proposed. Furthermore, to compute quantization results with a given number of values/clusters, this paper designs an iterative method and a clustering-based method, both of which are built on sparse least square. The paper shows that the latter method is mathematically equivalent to an improved version of the k-means clustering-based quantization algorithm, although the two algorithms originated from different intuitions. The proposed algorithms were tested with three types of data, and their computational performance, including information loss, time consumption, and the distribution of the values of the sparse vectors, was compared and analyzed. The paper offers a new perspective to probe the area of quantization, and the proposed algorithms can outperform existing methods, especially under some bit-width reduction scenarios when the required post-quantization resolution (number of values) is not significantly lower than the original number.

1904.08831 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Neural-Attention-Based Deep Learning Architectures for Modeling Traffic Dynamics on Lane Graphs

基于神经注意力的深度学习架构用于车道图上的交通动态建模

Matthew A. Wright, Simon F. G. Ehlers, Roberto Horowitz

AI总结 本文提出了一种基于神经注意力的深度学习架构,用于建模车道图上的交通动态,通过显式编码车道间的关系类型来提高预测性能,并展示了该模型在复杂道路网络中的迁移能力。

Comments To appear at 2019 IEEE Conference on Intelligent Transportation Systems

详情
AI中文摘要

深度神经网络可以成为强大的工具,但需要特定应用的精心设计以确保数据中最相关信息可被学习。在本文中,我们将深度神经网络应用于车辆交通动态的非线性时空物理问题。我们考虑在车道层面估计宏观量(例如交叉口的排队长度)的问题。由于建模如车道变更等社会行为的复杂性以及这些行为的宏观尺度影响,车道尺度的第一原理建模一直是一个挑战。遵循领域知识,上游/下游车道和邻近车道以不同的方式影响彼此的交通流量,我们应用了一种神经注意力机制,使神经网络层能够以不同的方式聚合来自不同车道的信息。使用微观交通模拟器作为测试平台,我们获得了结果,表明注意力神经网络模型可以利用附近车道的信息来提高预测效果,并且显式编码车道间的关系类型显著提高了性能。我们还展示了所学神经网络在更复杂道路网络中的迁移能力,讨论了其性能退化可能归因于拓扑复杂性增加所引起的新交通行为,并激励从多种道路网络拓扑中学习动态模型。

英文摘要

Deep neural networks can be powerful tools, but require careful application-specific design to ensure that the most informative relationships in the data are learnable. In this paper, we apply deep neural networks to the nonlinear spatiotemporal physics problem of vehicle traffic dynamics. We consider problems of estimating macroscopic quantities (e.g., the queue at an intersection) at a lane level. First-principles modeling at the lane scale has been a challenge due to complexities in modeling social behaviors like lane changes, and those behaviors' resultant macro-scale effects. Following domain knowledge that upstream/downstream lanes and neighboring lanes affect each others' traffic flows in distinct ways, we apply a form of neural attention that allows the neural network layers to aggregate information from different lanes in different manners. Using a microscopic traffic simulator as a testbed, we obtain results showing that an attentional neural network model can use information from nearby lanes to improve predictions, and, that explicitly encoding the lane-to-lane relationship types significantly improves performance. We also demonstrate the transfer of our learned neural network to a more complex road network, discuss how its performance degradation may be attributable to new traffic behaviors induced by increased topological complexity, and motivate learning dynamics models from many road network topologies.

1905.06518 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Efficient hinging hyperplanes neural network and its application in nonlinear system identification

高效铰接超平面神经网络及其在非线性系统辨识中的应用

Jun Xu, Qinghua Tao, Zhen Li, Xiangming Xi, Johan A. K. Suykens, Shuning Wang

发表机构 * School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, 518055, China(哈尔滨工业大学(深圳)机电工程与自动化学院,中国,518055) BNRist, Department of Automation, Tsinghua University, Beijing, 100084, China(BNRist,自动化系,清华大学,北京,中国,100084)

AI总结 本文提出了一种高效的铰接超平面(EHH)神经网络,该网络通过求解多个凸优化问题进行训练,具有快速的训练速度。研究证明,每个EHH神经网络等价于一个自适应铰接超平面(AHH)树,并在系统辨识中表现出良好的应用效果。EHH神经网络具有可解释性,可通过ANOVA分解或交互矩阵获得,可作为输入变量选择的建议。

Comments submitted to Automatica

详情
AI中文摘要

本文提出了一种基于铰接超平面(HH)模型的高效铰接超平面(EHH)神经网络。EHH神经网络是一种分布式表示,其训练涉及求解多个凸优化问题,并且训练速度快。证明了对于每一个EHH神经网络,都存在一个等价的自适应铰接超平面(AHH)树,该AHH树也是基于HH模型提出的,并在系统辨识中找到了良好的应用。EHH神经网络的构建包括两个阶段。首先,EHH神经网络的初始结构是随机确定的,使用Lasso回归选择合适的网络。为了减轻随机性的影响,第二阶段采用堆叠策略来形成更一般的网络结构。与其他神经网络不同,EHH神经网络具有可解释性,可以通过其ANOVA分解(或交互矩阵)轻松获得。这种可解释性可以用于输入变量选择的建议。EHH神经网络应用于非线性系统辨识,仿真结果表明所选回归向量合理,识别速度较快,同时仿真精度也令人满意。

英文摘要

In this paper, the efficient hinging hyperplanes (EHH) neural network is proposed based on the model of hinging hyperplanes (HH). The EHH neural network is a distributed representation, the training of which involves solving several convex optimization problems and is fast. It is proved that for every EHH neural network, there is an equivalent adaptive hinging hyperplanes (AHH) tree, which was also proposed based on the model of HH and has found good applications in system identification. The construction of the EHH neural network includes two stages. First, the initial structure of the EHH neural network is randomly determined and Lasso regression is used to choose the appropriate network. Second, to alleviate the impact of randomness, a stacking strategy is employed to formulate a more general network structure. Unlike other neural networks, the EHH neural network is interpretable, and the interpretation can be easily obtained through its ANOVA decomposition (or interaction matrix). The interpretability can then be used as a suggestion for input variable selection. The EHH neural network is applied in nonlinear system identification; the simulation results show that the regression vector selected is reasonable and the identification speed is fast, while at the same time the simulation accuracy is satisfactory.

1905.09435 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

MATCHA: Speeding Up Decentralized SGD via Matching Decomposition Sampling

MATCHA: 通过匹配分解采样加速去中心化SGD

Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, Soummya Kar

发表机构 * Carnegie Mellon University(卡内基梅隆大学) Bosch Center for Artificial Intelligence(博世人工智能中心)

AI总结 该研究提出MATCHA算法,通过匹配分解采样在去中心化SGD的误差-运行时间权衡中实现双赢,并在多种数据集和深度神经网络上进行了实验验证,结果表明其达到相同训练损失所需的时间最多可缩短至普通去中心化SGD的五分之一。

详情
AI中文摘要

本文研究了在给定网络上进行基于随机梯度下降(SGD)的去中心化训练时常见的误差-运行时间权衡问题。更稠密(更稀疏)的网络拓扑按迭代次数计的误差收敛更快(更慢),但每次迭代的通信时间/延迟也更多(更少)。本文提出MATCHA算法,能够在任意网络拓扑上实现误差-运行时间权衡的双赢。MATCHA的主要思想是将拓扑分解为若干匹配,从而使节点间通信并行化。为了保持较快的误差收敛速度,它识别出关键链接并更频繁地在这些链接上通信,同时通过降低其他链接的使用频率来节省通信时间。在一系列数据集和深度神经网络上的实验验证了理论分析,并表明MATCHA达到相同训练损失所需的时间最多可缩短至普通去中心化SGD的五分之一。

英文摘要

This paper studies the problem of error-runtime trade-off, typically encountered in decentralized training based on stochastic gradient descent (SGD) using a given network. While a denser (sparser) network topology results in faster (slower) error convergence in terms of iterations, it incurs more (less) communication time/delay per iteration. In this paper, we propose MATCHA, an algorithm that can achieve a win-win in this error-runtime trade-off for any arbitrary network topology. The main idea of MATCHA is to parallelize inter-node communication by decomposing the topology into matchings. To preserve fast error convergence speed, it identifies and communicates more frequently over critical links, and saves communication time by using other links less frequently. Experiments on a suite of datasets and deep neural networks validate the theoretical analyses and demonstrate that MATCHA takes up to $5\times$ less time than vanilla decentralized SGD to reach the same training loss.
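The core idea stated in this abstract, splitting the communication graph into matchings and then using some matchings more often than others, can be sketched as follows. This is only an illustration: the greedy decomposition below stands in for whatever decomposition procedure the paper uses, the activation probabilities are supplied by the caller rather than optimised, and the names `matching_decomposition` and `sample_topology` are hypothetical.

```python
import random


def matching_decomposition(edges):
    """Greedily split an edge list into disjoint matchings.

    Within one matching no node appears twice, so all of its links can
    communicate in parallel. (Greedy splitting is an assumption made for
    this sketch, not necessarily the paper's procedure.)
    """
    remaining = list(edges)
    matchings = []
    while remaining:
        used, matching, rest = set(), [], []
        for u, v in remaining:
            if u in used or v in used:
                rest.append((u, v))  # conflicts with this matching; try later
            else:
                matching.append((u, v))
                used.update((u, v))
        matchings.append(matching)
        remaining = rest
    return matchings


def sample_topology(matchings, probs, rng):
    """Activate each matching independently with its own probability."""
    active = []
    for matching, p in zip(matchings, probs):
        if rng.random() < p:
            active.extend(matching)
    return active
```

In each training iteration one would call `sample_topology` and average parameters only over the returned links, giving matchings that contain critical links a higher probability.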

1905.07960 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

A novel Multiplicative Polynomial Kernel for Volterra series identification

一种新型的乘积多项式核用于Volterra级数识别

Alberto Dalla Libera, Ruggero Carli, Gianluigi Pillonetto

发表机构 * Department of Information Engineering, University of Padova(信息工程系,帕多瓦大学)

AI总结 本文提出了一种新的正则化网络用于Volterra模型的识别,通过引入由基本构建块乘积构成的新核,利用边际似然优化估计未知参数,实验表明该方法能更有效地选择影响系统输出的单项式,提升模型预测能力。

详情
AI中文摘要

Volterra级数在非线性系统识别中尤其有用,也得益于其能够近似广泛输入输出映射的能力。然而,从有限数据集中进行识别困难,由于维度灾难。最近的方法已展示出如何利用正则化核方法对此任务有所帮助。本文提出了一种新的正则化网络用于Volterra模型的识别。它依赖于一个新的核,由基本构建块的乘积给出。每个块包含一些未知参数,可以通过边际似然优化从数据中估计。与文献中提出的其他算法相比,数值实验表明,我们的方法能够更好地选择真正影响系统输出的单项式,大大提高了模型的预测能力。

英文摘要

Volterra series are especially useful for nonlinear system identification, also thanks to their capability to approximate a broad range of input-output maps. However, their identification from a finite set of data is hard, due to the curse of dimensionality. Recent approaches have shown how regularized kernel-based methods can be useful for this task. In this paper, we propose a new regularization network for Volterra models identification. It relies on a new kernel given by the product of basic building blocks. Each block contains some unknown parameters that can be estimated from data using marginal likelihood optimization. In comparison with other algorithms proposed in the literature, numerical experiments show that our approach allows to better select the monomials that really influence the system output, much increasing the prediction capability of the model.

1804.10273 2026-06-04 math.OC cs.LG cs.NA math.FA math.NA 版本更新

A telescoping Bregmanian proximal gradient method without the global Lipschitz continuity assumption

一种无需全局Lipschitz连续性假设的伸缩式Bregman近端梯度方法

Daniel Reem, Simeon Reich, Alvaro De Pierro

发表机构 * The Technion - Israel Institute of Technology(以色列理工学院)

AI总结 本文提出了一种无需全局Lipschitz连续性假设的近端梯度方法变体,其迭代步依赖于约束集的伸缩式子集分解,并在近端前向-后向运算中使用Bregman散度,在一定条件下建立了函数值的非渐近收敛速率以及序列的弱收敛性。

Comments Journal of Optimization Theory and Applications (JOTA): accepted for publication; very minor modifications; this version contains full proofs and alphabetically ordered list of references (in contrast with the journal version)

详情
Journal ref
J. Optim. Theory. Appl. 182 (2019), 851--884
AI中文摘要

最小化两个凸函数之和的问题在理论和实际中都有多种应用。求解该问题的常用方法之一是近端梯度方法(近端前向-后向算法)。使用该方法时,一个非常常见的假设是光滑项的梯度全局Lipschitz连续。然而,这一假设在实践中并不总是成立,从而限制了该方法的适用范围。在本文中,我们在一大类有限维和无限维空间中讨论了近端梯度方法的一种新变体,它不需要上述全局Lipschitz连续性假设。该方法的一个关键之处在于,迭代步依赖于将约束集分解为若干子集的某种伸缩式分解。此外,我们在近端前向-后向运算中使用了Bregman散度。在某些实际条件下,我们建立了非渐近收敛速率(即函数值意义下的收敛速率),以及整个序列到极小点的弱收敛性。我们还得到了若干具有独立意义的辅助结果。

英文摘要

The problem of minimization of the sum of two convex functions has various theoretical and real-world applications. One of the popular methods for solving this problem is the proximal gradient method (proximal forward-backward algorithm). A very common assumption in the use of this method is that the gradient of the smooth term is globally Lipschitz continuous. However, this assumption is not always satisfied in practice, thus casting a limitation on the method. In this paper, we discuss, in a wide class of finite and infinite-dimensional spaces, a new variant of the proximal gradient method which does not impose the above-mentioned global Lipschitz continuity assumption. A key contribution of the method is the dependence of the iterative steps on a certain telescopic decomposition of the constraint set into subsets. Moreover, we use a Bregman divergence in the proximal forward-backward operation. Under certain practical conditions, a non-asymptotic rate of convergence (that is, in the function values) is established, as well as the weak convergence of the whole sequence to a minimizer. We also obtain a few auxiliary results of independent interest.

1904.03537 2026-06-04 math.OC cs.CV cs.LG cs.NA math.NA 版本更新

Convex-Concave Backtracking for Inertial Bregman Proximal Gradient Algorithms in Non-Convex Optimization

用于非凸优化中惯性Bregman近端梯度算法的凸凹回溯法

Mahesh Chandra Mukkamala, Peter Ochs, Thomas Pock, Shoham Sabach

发表机构 * Faculty of Mathematics and Computer Science, Saarland University(萨尔大学数学与计算机科学学院) Institute of Computer Graphics and Vision, Graz University of Technology(格拉茨技术大学计算机图形与视觉研究所) Faculty of Industrial Engineering, The Technion(以色列理工学院工业工程学院)

AI总结 本文提出了一种凸凹回溯方法,用于非凸优化中的惯性Bregman近端梯度算法,通过在局部寻找目标函数的凸上界和凹下界,实现步长和外推参数的自适应选择,并证明算法生成的序列全局收敛到临界点。

Comments 29 pages

详情
AI中文摘要

回溯线搜索是一种古老而有效的策略,用于为近端梯度算法寻找更好的步长。其主要原理是在局部寻找目标函数的一个简单凸上界,进而控制所使用的步长。对于惯性近端梯度算法,情况要困难得多,通常会导致对外推参数非常严格的限制。在本文中,我们表明,通过在局部同时寻找目标函数的一个简单凹下界,可以控制外推参数。这引出了一种双重凸凹回溯过程,可以自适应地选择步长和外推参数。我们将此过程应用于惯性Bregman近端梯度方法这一类算法,并证明这些算法生成的任何序列都全局收敛到目标函数的临界点。在图像处理和机器学习中若干具有挑战性的非凸问题上的数值实验表明,将惯性步与双重回溯策略相结合能够提升性能。

英文摘要

Backtracking line-search is an old yet powerful strategy for finding better step sizes to be used in proximal gradient algorithms. The main principle is to locally find a simple convex upper bound of the objective function, which in turn controls the step size that is used. In the case of inertial proximal gradient algorithms, the situation becomes much more difficult and usually leads to very restrictive rules on the extrapolation parameter. In this paper, we show that the extrapolation parameter can be controlled by locally finding also a simple concave lower bound of the objective function. This gives rise to a double convex-concave backtracking procedure which allows for an adaptive choice of both the step size and extrapolation parameters. We apply this procedure to the class of inertial Bregman proximal gradient methods, and prove that any sequence generated by these algorithms converges globally to a critical point of the function at hand. Numerical experiments on a number of challenging non-convex problems in image processing and machine learning were conducted and show the power of combining an inertial step with a double backtracking strategy in achieving improved performance.
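The "simple convex upper bound" principle that this abstract starts from can be illustrated with the classical backtracking rule for the ordinary proximal gradient method. The sketch below is the non-inertial, Euclidean baseline only; it does not implement the paper's double convex-concave procedure or any Bregman distance. The $l_1$ regulariser and the names `soft_threshold` and `prox_grad_backtracking` are assumptions chosen for illustration.

```python
def soft_threshold(v, t):
    """Proximal map of t * ||.||_1, applied componentwise."""
    return [max(abs(vi) - t, 0.0) * (1.0 if vi > 0 else -1.0) for vi in v]


def prox_grad_backtracking(f, grad_f, x0, lam, iters=200, L0=1.0):
    """Minimise f(x) + lam * ||x||_1 with classical backtracking on L.

    Baseline sketch only (no inertia, Euclidean distance): L is doubled
    until the quadratic model built at x is an upper bound of f at the
    candidate point z, and 1 / L is then the accepted step size.
    """
    x, L = list(x0), L0
    for _ in range(iters):
        g, fx = grad_f(x), f(x)
        while True:
            z = soft_threshold([xi - gi / L for xi, gi in zip(x, g)], lam / L)
            diff = [zi - xi for zi, xi in zip(z, x)]
            upper = (fx
                     + sum(gi * di for gi, di in zip(g, diff))
                     + 0.5 * L * sum(di * di for di in diff))
            if f(z) <= upper + 1e-12:  # convex upper bound holds at z
                break
            L *= 2.0
        x = z
    return x
```

For a smooth part such as `f(x) = 0.5 * ||x - c||^2`, the minimiser is the soft-thresholded vector `c`, which gives a quick way to check the routine.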

1810.00697 2026-06-04 eess.SY cs.AI cs.LG cs.SY 版本更新

Data-driven Discovery of Cyber-Physical Systems

数据驱动的信息物理系统发现

Ye Yuan, Xiuchuan Tang, Wei Pan, Xiuting Li, Wei Zhou, Hai-Tao Zhang, Han Ding, Jorge Goncalves

发表机构 * School of Automation, Huazhong University of Science and Technology(华中科技大学自动化学院) State Key Lab of Digital Manufacturing Equipment and Technology(数字制造装备与技术国家重点实验室) School of Mechanical Science and Engineering, Huazhong University of Science and Technology(华中科技大学机械科学与工程学院) Department of Cognitive Robotics, Delft University of Technology(代尔夫特理工大学认知机器人系) Department of Engineering, University of Cambridge(剑桥大学工程系) Luxembourg Centre for Systems Biomedicine, University of Luxembourg(卢森堡系统生物医学中心,卢森堡大学)

AI总结 本文提出了一种直接从数据对信息物理系统(CPS)进行逆向工程的通用框架,通过辨识物理系统并推断转移逻辑,成功应用于机械系统、电气系统和医疗应用等实例,可用于预测CPS轨迹、评估性能、设计容错系统以及制定新系统的设计指南。

详情
AI中文摘要

信息物理系统(CPS)将软件嵌入物理世界,广泛应用于智能电网、机器人、智能制造和医疗监测等领域。由于物理组件与信息组件的组合及其相互作用带来的内在复杂性,CPS一直难以建模。本研究提出了一种直接从数据对CPS进行逆向工程的通用框架。该方法既包括物理系统的辨识,也包括转移逻辑的推断。它已成功应用于多个现实世界的实例,涵盖机械系统、电气系统和医疗应用。这一新框架旨在使研究人员能够基于所发现的模型预测CPS的轨迹。此类信息已被证明对于评估CPS性能、设计容错CPS以及为新CPS制定设计指南至关重要。

英文摘要

Cyber-physical systems (CPSs) embed software into the physical world. They appear in a wide range of applications such as smart grids, robotics, intelligent manufacture and medical monitoring. CPSs have proved resistant to modeling due to their intrinsic complexity arising from the combination of physical components and cyber components and the interaction between them. This study proposes a general framework for reverse engineering CPSs directly from data. The method involves the identification of physical systems as well as the inference of transition logic. It has been applied successfully to a number of real-world examples ranging from mechanical and electrical systems to medical applications. The novel framework seeks to enable researchers to make predictions concerning the trajectory of CPSs based on the discovered model. Such information has been proven essential for the assessment of the performance of CPS, the design of failure-proof CPS and the creation of design guidelines for new CPSs.

1811.07624 2026-06-04 math.NA cs.DS cs.LG cs.NA stat.ML 版本更新

Approximate Eigenvalue Decompositions of Linear Transformations with a Few Householder Reflectors

利用少量Householder反射子进行线性变换的近似本征值分解

Cristian Rusu

发表机构 * Istituto Italiano di Tecnologia(意大利技术研究院)

AI总结 本文提出了一种利用少量Householder反射子构造计算高效的标准正交矩阵的方法,用于近似任意标准正交或对称变换,并将其应用于快速Mahalanobis距离度量变换的学习。

详情
AI中文摘要

将信号分解为正交基(一组正交分量,每个分量归一化为单位长度)的能力,是许多信号处理方法和应用的核心。经典例子是傅里叶变换和小波变换,它们具有数值高效的实现(FFT和FWT)。不幸的是,正交变换通常结构不规则,因此通常不具有低计算复杂度的性质。在本文中,基于Householder反射子,我们引入了一类正交矩阵,这些矩阵在数值上易于操作:我们通过一个给定参数控制这些矩阵与向量的乘法复杂度。我们提供了数值算法,用于近似任何正交或对称变换,通过给定数量Householder反射子的乘积构造新的正交或对称结构。我们展示了分析和数值证据,以突出所提近似的准确性,并提供了一个应用于快速Mahalanobis距离度量变换学习的应用。

英文摘要

The ability to decompose a signal in an orthonormal basis (a set of orthogonal components, each normalized to have unit length) using a fast numerical procedure rests at the heart of many signal processing methods and applications. The classic examples are the Fourier and wavelet transforms that enjoy numerically efficient implementations (FFT and FWT, respectively). Unfortunately, orthonormal transformations are in general unstructured, and therefore they do not enjoy low computational complexity properties. In this paper, based on Householder reflectors, we introduce a class of orthonormal matrices that are numerically efficient to manipulate: we control the complexity of matrix-vector multiplications with these matrices using a given parameter. We provide numerical algorithms that approximate any orthonormal or symmetric transform with a new orthonormal or symmetric structure made up of products of a given number of Householder reflectors. We show analyses and numerical evidence to highlight the accuracy of the proposed approximations and provide an application to the case of learning fast Mahalanobis distance metric transformations.
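The complexity claim in this abstract rests on a standard fact: a product of h Householder reflectors can be applied to a length-n vector in O(hn) operations without ever forming the n-by-n matrix. A minimal sketch of that matrix-vector product follows; the function name `apply_reflectors` is hypothetical, and the code shows only how such a product is applied, not how the paper chooses the reflectors.

```python
def apply_reflectors(reflectors, x):
    """Compute U x for U = H_1 H_2 ... H_h, where
    H_k = I - 2 u_k u_k^T / (u_k^T u_k) and `reflectors` = [u_1, ..., u_h].

    Each reflector costs O(n) work, so the total cost is O(h * n).
    """
    y = list(x)
    # The rightmost reflector H_h acts on the vector first.
    for u in reversed(reflectors):
        scale = 2.0 * sum(ui * yi for ui, yi in zip(u, y)) / sum(ui * ui for ui in u)
        y = [yi - scale * ui for yi, ui in zip(y, u)]
    return y
```

Because every reflector is orthogonal, the result always has the same Euclidean norm as the input, which makes a convenient sanity check.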

1812.04426 2026-06-04 cs.LG cs.NA math.NA physics.comp-ph stat.ML 版本更新

PDE-Net 2.0: Learning PDEs from Data with A Numeric-Symbolic Hybrid Deep Network

PDE-Net 2.0:基于数据学习PDE的数值-符号混合深度网络

Zichao Long, Yiping Lu, Bin Dong

AI总结 本文提出PDE-Net 2.0,一种结合数值近似和符号计算的深度网络,用于从动态数据中学习偏微分方程,并具有较高的灵活性和表达能力。

Comments 16 pages, 15 figures. arXiv admin note: substantial text overlap with arXiv:1710.09668

详情
AI中文摘要

偏微分方程(PDEs)通常是基于经验观察推导得出的。然而,技术的进步使我们能够收集和存储大量数据,这为数据驱动的PDE发现提供了新机会。本文提出了一种新的深度神经网络,称为PDE-Net 2.0,用于从观测动态数据中发现(时间依赖的)PDE,仅需少量对驱动动态机制的先验知识。PDE-Net 2.0的设计基于我们先前的工作\cite{Long2018PDE},其中提出了原始版本的PDE-Net。PDE-Net 2.0是通过卷积近似微分算子和用于模型恢复的符号多层神经网络的结合。与现有方法相比,PDE-Net 2.0通过学习微分算子和PDE模型的非线性响应函数,具有最大的灵活性和表达能力。数值实验表明,PDE-Net 2.0有潜力揭示观测动态的隐藏PDE,并在噪声环境中预测相对较长时间的动力学行为。

英文摘要

Partial differential equations (PDEs) are commonly derived based on empirical observations. However, recent advances of technology enable us to collect and store massive amount of data, which offers new opportunities for data-driven discovery of PDEs. In this paper, we propose a new deep neural network, called PDE-Net 2.0, to discover (time-dependent) PDEs from observed dynamic data with minor prior knowledge on the underlying mechanism that drives the dynamics. The design of PDE-Net 2.0 is based on our earlier work \cite{Long2018PDE} where the original version of PDE-Net was proposed. PDE-Net 2.0 is a combination of numerical approximation of differential operators by convolutions and a symbolic multi-layer neural network for model recovery. Comparing with existing approaches, PDE-Net 2.0 has the most flexibility and expressive power by learning both differential operators and the nonlinear response function of the underlying PDE model. Numerical experiments show that the PDE-Net 2.0 has the potential to uncover the hidden PDE of the observed dynamics, and predict the dynamical behavior for a relatively long time, even in a noisy environment.

1905.07875 2026-06-04 eess.SY cs.LG cs.NA cs.SY math.NA 版本更新

Investigating Flight Envelope Variation Predictability of Impaired Aircraft using Least-Squares Regression Analysis

利用最小二乘回归分析研究受损飞机飞行包线变化的可预测性

Ramin Norouzi, Amirreza Kosari, Mohammad Hossein Sabour

发表机构 * University of Tehran(德黑兰大学)

AI总结 本文通过线性和非线性最小二乘估计方法,研究了受损飞机机动飞行包线内配平点数量及包线质心的可预测性,建立并比较了多种多项式模型和基于双曲正切函数的非线性模型,以预测不同故障程度下的飞行包线变化。

Comments Accepted version, Journal of Aerospace Information Systems

详情
AI中文摘要

飞机故障会改变飞机的动态特性,并导致机动飞行包线发生变化。此类包线变化是非线性的,并且由于受飞机复杂动力学的支配,飞行员通常无法预测。因此,为了防止飞行中失控,必须能够在实际中预测受损飞机在任何事先未知的故障程度下的飞行包线变化。本文使用线性和非线性最小二乘估计方法,研究了机动飞行包线内配平点数量及包线质心的可预测性。为此,建立并比较了多种多项式模型和基于双曲正切函数的非线性模型,这些模型以影响包线变化的因素为输入,估计任意给定故障程度下机动飞行包线的质心和配平点数量。结果表明,多项式模型和基于双曲正切函数的模型都能以较高的精度预测受损飞行包线的变化。此外,最佳多项式拟合的回归方程可以直接评估受损飞机飞行包线的收缩和位移对表征飞机故障和飞行条件的具体参数的敏感性。

英文摘要

Aircraft failures alter the aircraft dynamics and cause the maneuvering flight envelope to change. Such envelope variations are nonlinear and generally unpredictable by the pilot as they are governed by the aircraft's complex dynamics. Hence, in order to prevent in-flight Loss of Control, it is crucial to practically predict the impaired aircraft's flight envelope variation due to any a-priori unknown failure degree. This paper investigates the predictability of the number of trim points within the maneuvering flight envelope and its centroid using both linear and nonlinear least-squares estimation methods. To do so, various polynomial models and nonlinear models based on the hyperbolic tangent function are developed and compared, which incorporate the influencing factors on the envelope variations as the inputs and estimate the centroid and the number of trim points of the maneuvering flight envelope at any intended failure degree. Results indicate that both the polynomial and hyperbolic tangent function-based models are capable of predicting the impaired flight envelope variation with good precision. Furthermore, it is shown that the regression equation of the best polynomial fit enables direct assessment of the impaired aircraft's flight envelope contraction and displacement sensitivity to the specific parameters characterizing aircraft failure and flight condition.

1904.11898 2026-06-04 cs.RO cs.CV cs.LG cs.SY eess.SY 版本更新

Perceptual Attention-based Predictive Control

基于感知注意力的预测控制

Keuntaek Lee, Gabriel Nakajima An, Viacheslav Zakharov, Evangelos A. Theodorou

发表机构 * Georgia Institute of Technology(佐治亚理工学院)

AI总结 本文提出了一种新的信息处理架构,用于安全的深度学习视觉导航系统,通过模型预测控制(MPC)、卷积神经网络(CNNs)和不确定性量化方法,实现基于感知注意力的预测控制算法,提高了系统对不安全状况的快速检测能力。

详情
AI中文摘要

在本文中,我们提出了一种新的信息处理架构,用于自主系统基于深度学习的安全视觉导航。该架构用于支持一种基于感知注意力的预测控制算法,该算法结合了模型预测控制(MPC)、卷积神经网络(CNN)和不确定性量化方法。我们方法的新颖之处在于利用MPC来学习如何将注意力放在视觉输入的相关区域上,从而最终使系统能够更快地检测到不安全状况。具体做法是利用MPC学习选择输入图像中的感兴趣区域,并用这些区域输出控制动作,同时给出注意力感知视觉输入中认知不确定性(epistemic)和偶然不确定性(aleatoric)的估计。我们利用这些不确定性估计来量化网络控制器在当前导航条件下的安全性。所提出的架构和算法在1:5比例的地面车辆上进行了测试。实验结果表明,所提出的算法在早期检测不安全状况(例如导航环境中出现新障碍物)方面优于先前的方法。所提出的架构是在安全关键领域使用基于深度学习的感知控制策略的第一步。

英文摘要

In this paper, we present a novel information processing architecture for safe deep learning-based visual navigation of autonomous systems. The proposed information processing architecture is used to support a perceptual attention-based predictive control algorithm that leverages model predictive control (MPC), convolutional neural networks (CNNs), and uncertainty quantification methods. The novelty of our approach lies in using MPC to learn how to place attention on relevant areas of the visual input, which ultimately allows the system to more rapidly detect unsafe conditions. We accomplish this by using MPC to learn to select regions of interest in the input image, which are used to output control actions as well as estimates of epistemic and aleatoric uncertainty in the attention-aware visual input. We use these uncertainty estimates to quantify the safety of our network controller under the current navigation condition. The proposed architecture and algorithm are tested on a 1:5 scale terrestrial vehicle. Experimental results show that the proposed algorithm outperforms previous approaches on early detection of unsafe conditions, such as when novel obstacles are present in the navigation environment. The proposed architecture is the first step towards using deep learning-based perceptual control policies in safety-critical domains.

1806.02957 2026-06-04 cs.LG cs.NA cs.NE math.NA physics.comp-ph stat.ML 版本更新

A Deep Neural Network Surrogate for High-Dimensional Random Partial Differential Equations

高维随机偏微分方程的深度神经网络替代模型

Mohammad Amin Nabian, Hadi Meidani

发表机构 * Department of Civil and Environmental Engineering, University of Illinois at Urbana-Champaign, Urbana, Illinois, USA.(土木与环境工程系,伊利诺伊大学厄巴纳-香槟分校)

AI总结 本文提出了一种基于深度学习的高维随机偏微分方程求解框架,用前馈全连接深度残差网络近似随机PDE的解,并以强形式或弱形式施加初始和边界条件,在扩散和热传导问题上通过数值实验验证了该方法的精度。

详情
Journal ref
Probabilistic Engineering Mechanics, 57, pp.14-25 (2019)
AI中文摘要

为高维随机偏微分方程(PDE)开发高效的数值求解算法一直是一项具有挑战性的任务,其原因在于众所周知的维度灾难。我们提出了一种基于深度学习的新求解框架。具体而言,用前馈全连接深度残差网络来近似随机PDE,并以强形式或弱形式施加初始和边界约束。该框架无需网格,能够处理不规则计算域。近似深度神经网络的参数通过随机梯度下降(SGD)算法的变体迭代确定。通过与收敛的基于蒙特卡洛的有限元结果进行比较,数值实验表明所提出的框架在扩散和热传导问题上具有令人满意的精度。

英文摘要

Developing efficient numerical algorithms for the solution of high dimensional random Partial Differential Equations (PDEs) has been a challenging task due to the well-known curse of dimensionality. We present a new solution framework for these problems based on a deep learning approach. Specifically, the random PDE is approximated by a feed-forward fully-connected deep residual network, with either strong or weak enforcement of initial and boundary constraints. The framework is mesh-free, and can handle irregular computational domains. Parameters of the approximating deep neural network are determined iteratively using variants of the Stochastic Gradient Descent (SGD) algorithm. The satisfactory accuracy of the proposed frameworks is numerically demonstrated on diffusion and heat conduction problems, in comparison with the converged Monte Carlo-based finite element results.

1701.08711 2026-06-04 cs.CL cs.LG econ.GN q-fin.EC stat.ML 版本更新

Predicting Auction Price of Vehicle License Plate with Deep Recurrent Neural Network

利用深度循环神经网络预测车辆车牌拍卖价格

Vinci Chow

发表机构 * Department of Economics, The Chinese University of Hong Kong, Shatin, Hong Kong(香港中文大学经济系,沙田,香港)

AI总结 本文提出将车辆车牌价格预测视为自然语言处理任务,通过构建深度循环神经网络来预测香港车牌拍卖价格,并展示了模型在解释价格变化和扩展为车牌搜索引擎方面的贡献。

详情
AI中文摘要

在华人社会,迷信极为重要,带有吉利数字的车牌在拍卖中可以拍出很高的价格。与其他贵重物品不同,车牌在拍卖前没有估价。本文提出可以将车牌价格预测视为自然语言处理(NLP)任务,因为车牌的价值取决于车牌上每个字符的含义及其语义。本文构建了一个深度循环神经网络(RNN),根据车牌上的字符预测香港车牌的价格,并说明了采用深层网络和重新训练的重要性。在13年的历史拍卖价格上评估,深度RNN的预测可以解释超过80%的价格变化,显著优于以前的模型。此外,本文还展示了如何将该模型扩展为车牌搜索引擎,并给出预期价格分布的估计。

英文摘要

In Chinese societies, superstition is of paramount importance, and vehicle license plates with desirable numbers can fetch very high prices in auctions. Unlike other valuable items, license plates are not allocated an estimated price before auction. I propose that the task of predicting plate prices can be viewed as a natural language processing (NLP) task, as the value depends on the meaning of each individual character on the plate and its semantics. I construct a deep recurrent neural network (RNN) to predict the prices of vehicle license plates in Hong Kong, based on the characters on a plate. I demonstrate the importance of having a deep network and of retraining. Evaluated on 13 years of historical auction prices, the deep RNN's predictions can explain over 80 percent of price variations, outperforming previous models by a significant margin. I also demonstrate how the model can be extended to become a search engine for plates and to provide estimates of the expected price distribution.

1904.11538 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Zap Q-Learning for Optimal Stopping Time Problems

Shuhang Chen, Adithya M. Devraj, Ana Bušić, Sean P. Meyn

发表机构 * Department of ECE at the University of Florida(佛罗里达大学电子与计算机工程系) Inria International Chair, Paris(巴黎Inria国际席位)

AI总结 本文研究了在不可约、一致遍历的马尔可夫链上,通过快速收敛的强化学习算法近似求解折扣成本最优停止问题,提出了一种名为Zap-Q-learning的算法,并证明了其在线性函数近似设置下的收敛性。

详情
AI中文摘要

本文的目标是获得快速收敛的强化学习算法,用于近似求解在$\mathbb{R}^n$的紧子集上演化的不可约、一致遍历马尔可夫链中的折扣成本最优停止问题。我们以Tsitsiklis和Van Roy所采用的动态规划方法为基础,他们提出了一种Q-learning算法来估计最优状态-动作价值函数,进而定义最优停止规则。我们分析了该算法收敛速度可能较慢的原因,并提出了一种快速收敛的替代算法,即“Zap-Q-learning”算法,旨在达到最优收敛速率。我们首次在线性函数近似的假设下证明了Zap-Q-learning算法的收敛性。证明采用了ODE分析,而该算法的最优渐近方差性质则通过一个金融示例中的快速收敛得以体现。

英文摘要

The objective in this paper is to obtain fast converging reinforcement learning algorithms to approximate solutions to the problem of discounted cost optimal stopping in an irreducible, uniformly ergodic Markov chain, evolving on a compact subset of $\mathbb{R}^n$. We build on the dynamic programming approach taken by Tsitsiklis and Van Roy, wherein they propose a Q-learning algorithm to estimate the optimal state-action value function, which then defines an optimal stopping rule. We provide insights as to why the convergence rate of this algorithm can be slow, and propose a fast-converging alternative, the "Zap-Q-learning" algorithm, designed to achieve the optimal rate of convergence. For the first time, we prove the convergence of the Zap-Q-learning algorithm under the assumption of a linear function approximation setting. We use ODE analysis for the proof, and the optimal asymptotic variance property of the algorithm is reflected via fast convergence in a finance example.

1905.05992 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Deep reinforcement learning for scheduling in large-scale networked control systems

在大规模网络化控制系统中使用深度强化学习进行调度

Adrian Redder, Arunselvan Ramaswamy, Daniel E. Quevedo

发表机构 * Faculty of Computer Science, Electrical Engineering and Mathematics(计算机科学、电气工程与数学系) Paderborn University(帕德博恩大学)

AI总结 本文提出了一种基于深度强化学习的迭代资源分配算法DIRA,用于解决网络化系统中的控制与资源调度问题,通过联合优化控制与调度以提高性能。

详情
AI中文摘要

本文考虑了网络化系统中的控制与资源调度问题。我们提出了DIRA,一种基于深度强化学习的迭代资源分配算法,具有可扩展性和控制感知能力。我们的算法针对大规模问题而设计,其中控制与调度需要协同作用以优化性能。DIRA可用于调度一般的基于时域优化的控制器。在本工作中,我们专注于基于经适当调整的线性二次调节器的控制设计。我们将该算法应用于具有相关衰落通信信道的网络化系统。仿真结果表明,DIRA能够良好地扩展到大规模调度问题。

英文摘要

This work considers the problem of control and resource scheduling in networked systems. We present DIRA, a Deep reinforcement learning based Iterative Resource Allocation algorithm, which is scalable and control-aware. Our algorithm is tailored towards large-scale problems where control and scheduling need to act jointly to optimize performance. DIRA can be used to schedule general time-domain optimization based controllers. In the present work, we focus on control designs based on suitably adapted linear quadratic regulators. We apply our algorithm to networked systems with correlated fading communication channels. Our simulations show that DIRA scales well to large scheduling problems.

1603.07421 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

On the Powerball Method for Optimization

关于优化的Powerball方法

Ye Yuan, Mu Li, Jun Liu, Claire J. Tomlin

发表机构 * School of Automation, Huazhong University of Science and Technology(华中科技大学自动化学院) Department of Computer Science, Carnegie Mellon University(卡内基梅隆大学计算机科学系) Department of Applied Mathematics, University of Waterloo(滑铁卢大学应用数学系) Department of Electrical Engineering and Computer Sciences, University of California, Berkeley(加州大学伯克利分校电气工程与计算机科学系)

AI总结 本文提出了一种加速优化算法收敛的新方法,即在优化过程中为梯度添加一个幂系数γ∈[0,1),称为Powerball方法,并分析了该方法在强凸函数上的收敛率。尽管理论上Powerball方法与梯度方法具有同阶的线性收敛率,但实验表明其在初始迭代中显著优于梯度下降和牛顿方法,并在多个真实数据集上将梯度下降和L-BFGS的收敛速度提高了10倍。

详情
AI中文摘要

我们提出了一种新的方法来加速优化算法的收敛。该方法只需在优化过程中为梯度添加一个幂系数γ∈[0,1)。我们称其为Powerball方法,并分析了该方法在强凸函数上的收敛率。尽管理论上Powerball方法保证具有与梯度方法同阶的线性收敛率,但我们通过实验表明,该方法显著优于梯度下降和牛顿方法,在初始迭代中尤为明显。我们还展示了Powerball方法在多个真实数据集上将梯度下降和L-BFGS的收敛速度提高了10倍。

英文摘要

We propose a new method to accelerate the convergence of optimization algorithms. This method simply adds a power coefficient $\gamma\in[0,1)$ to the gradient during optimization. We call this the Powerball method and analyze its convergence rate for strongly convex functions. While theoretically the Powerball method is guaranteed to have a linear convergence rate of the same order as the gradient method, we show that empirically it significantly outperforms gradient descent and Newton's method, especially during the initial iterations. We demonstrate that the Powerball method provides a $10$-fold speedup of the convergence of both gradient descent and L-BFGS on multiple real datasets.
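
The update described in this abstract can be sketched as follows. This is a minimal illustration, assuming the power coefficient is applied elementwise as sign(g)·|g|^γ; the names `powerball` and `powerball_descent`, the fixed step size, and the toy quadratic objective are assumptions made for illustration and are not taken from the paper.

```python
import numpy as np

def powerball(grad, gamma):
    """Elementwise sign(g) * |g|**gamma (assumed form of the power coefficient)."""
    return np.sign(grad) * np.abs(grad) ** gamma

def powerball_descent(grad_fn, x0, step, gamma, iters):
    """Gradient descent in which the gradient is passed through powerball()."""
    x = np.asarray(x0, dtype=float)
    for _ in range(iters):
        x = x - step * powerball(grad_fn(x), gamma)
    return x

# Toy strongly convex objective f(x) = 0.5 * ||x||^2, whose gradient is x itself.
x_final = powerball_descent(lambda x: x, x0=[4.0, -9.0], step=0.1, gamma=0.5, iters=200)
```

With a constant step the iterates of this sketch end up oscillating in a small neighbourhood of the minimizer rather than converging exactly, so a decaying step size would be a natural refinement.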

1904.10945 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Target-Based Temporal Difference Learning

基于目标的时序差分学习

Donghwan Lee, Niao He

发表机构 * Coordinated Science Laboratory (CSL), University of Illinois at Urbana-Champaign(协调科学实验室(CSL),伊利诺伊大学厄巴纳-香槟分校) Department of Industrial and Enterprise Systems Engineering, University of Illinois(工业与企业系统工程系,伊利诺伊大学)

AI总结 本文提出了一种新的基于目标的时序差分学习算法家族,并从理论上分析了其收敛性,展示了这些算法在收敛性能上可能优于标准时序差分学习。

详情
AI中文摘要

目标网络的使用已成为近期深度Q学习算法在强化学习中的流行和关键组成部分,但理论方面的了解仍然有限。在本工作中,我们介绍了一种新的基于目标的时序差分(TD)学习算法家族,并对其收敛性进行了理论分析。与标准TD学习不同,基于目标的TD算法维护两个独立的学习参数——目标变量和在线变量。特别地,我们介绍了该家族中的三个成员,称为平均TD、双TD和周期TD,其中目标变量通过平均、对称或周期性的方式更新,模仿了深度Q学习实践中使用的技术。我们为平均TD和双TD建立了渐近收敛分析,并为周期TD提供了有限样本分析。此外,我们还提供了一些模拟结果,显示这些基于目标的TD算法在收敛性能上可能优于标准TD学习。虽然本工作集中在线性函数逼近和策略评估设置上,但我们将其视为朝着理解具有目标网络的深度Q学习变体理论基础迈出的有意义一步。

英文摘要

The use of target networks has been a popular and key component of recent deep Q-learning algorithms for reinforcement learning, yet little is known from the theory side. In this work, we introduce a new family of target-based temporal difference (TD) learning algorithms and provide a theoretical analysis of their convergence. In contrast to standard TD-learning, target-based TD algorithms maintain two separate learning parameters: the target variable and the online variable. In particular, we introduce three members of the family, called averaging TD, double TD, and periodic TD, where the target variable is updated in an averaging, symmetric, or periodic fashion, mirroring the techniques used in deep Q-learning practice. We establish asymptotic convergence analyses for both averaging TD and double TD and a finite-sample analysis for periodic TD. In addition, we provide simulation results showing potentially superior convergence of these target-based TD algorithms compared to standard TD-learning. While this work focuses on linear function approximation and the policy evaluation setting, we consider it a meaningful step towards the theoretical understanding of deep Q-learning variants with target networks.

1806.07200 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Adaptive Input Estimation in Linear Dynamical Systems with Applications to Learning-from-Observations

线性动态系统中的自适应输入估计及其在从观察中学习中的应用

Sebastian Curi, Kfir Y. Levy, Andreas Krause

发表机构 * Electrical Engineering Department, Technion - Israel Institute of Technology(以色列理工学院电气工程系)

AI总结 本文提出了一种自适应输入估计算法,通过在每个时间步高效地权衡偏差和方差以优化总体估计误差,并在“从观察中学习”(Learning-from-Observations)框架中展示了其在控制器学习中的有效性。

Comments CDC 2019

详情
AI中文摘要

我们解决了从系统输出的测量值估计动态系统输入的问题。为此,我们引入了一种新颖的估计算法,该算法明确地在偏差和方差之间进行权衡,以最优地减小总体估计误差。这种最优权衡在每个时间步都高效且自适应地完成。实验表明,我们的方法给出的估计误差通常远低于现有最先进方法。最后,我们考虑了更复杂的“从观察中学习”(Learning-from-Observations)框架,其中智能体需要从专家示范的输出中学习控制器。我们将我们的估计算法作为该框架中的基本模块,并展示了它能够成功地学习控制器。

英文摘要

We address the problem of estimating the inputs of a dynamical system from measurements of the system's outputs. To this end, we introduce a novel estimation algorithm that explicitly trades off bias and variance to optimally reduce the overall estimation error. This optimal trade-off is done efficiently and adaptively in every time step. Experimentally, we show that our method often produces estimates with substantially lower error compared to the state-of-the-art. Finally, we consider the more complex Learning-from-Observations framework, where an agent should learn a controller from the outputs of an expert's demonstration. We incorporate our estimation algorithm as a building block inside this framework and show that it enables learning controllers successfully.

1606.00911 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Distributed Cooperative Decision-Making in Multiarmed Bandits: Frequentist and Bayesian Algorithms

多臂老虎机中的分布式协同决策:频率学派与贝叶斯算法

Peter Landgren, Vaibhav Srivastava, Naomi Ehrich Leonard

AI总结 本文研究了多臂老虎机问题中探索与利用权衡下的分布式协同决策,提出了适用于多智能体的频率学派和贝叶斯算法,并证明了这些算法能够渐近地恢复集中式智能体的性能。

Comments This revision provides a correction to the original paper, which appeared in the Proceedings of the 2016 IEEE Conference on Decision and Control (CDC). The second statement of Proposition 1 and Theorem 1 are new from arXiv:1512.06888v3 and Lemma 1 is new. These are used to prove regret bounds in Theorems 2 and 3

详情
AI中文摘要

我们研究了多臂老虎机(MAB)问题中探索与利用权衡下的分布式协同决策。我们将单智能体MAB问题中最先进的频率学派和贝叶斯算法推广为多智能体MAB问题的协同分布式算法,其中智能体按照固定的网络图进行通信。每个智能体借助运行共识(running consensus)算法,根据自身的奖励和邻居估计的奖励来估计平均奖励。我们证明了这些算法的性能保证,并表明它们能够渐近地恢复集中式智能体的性能。此外,我们严格刻画了通信图结构对群体决策性能的影响。

英文摘要

We study distributed cooperative decision-making under the explore-exploit tradeoff in the multiarmed bandit (MAB) problem. We extend the state-of-the-art frequentist and Bayesian algorithms for single-agent MAB problems to cooperative distributed algorithms for multi-agent MAB problems in which agents communicate according to a fixed network graph. We rely on a running consensus algorithm for each agent's estimation of mean rewards from its own rewards and the estimated rewards of its neighbors. We prove the performance of these algorithms and show that they asymptotically recover the performance of a centralized agent. Further, we rigorously characterize the influence of the communication graph structure on the decision-making performance of the group.

1712.09379 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

IHT dies hard: Provable accelerated Iterative Hard Thresholding

IHT经久不衰:可证明的加速迭代硬阈值法

Rajiv Khanna, Anastasios Kyrillidis

发表机构 * University of Texas at Austin(德克萨斯大学奥斯汀分校) IBM T.J. Watson Research Center(IBM 沃森研究中心)

AI总结 本文从理论和实践两方面研究了在经典迭代硬阈值(IHT)方法中使用动量,通过对普通IHT的简单修改,探讨了其在带非凸约束的凸优化目标上的收敛行为,并观察到加速后的IHT相比最先进的投影梯度下降和Frank-Wolfe变体有显著改进。

Comments accepted to AISTATS 2018

详情
AI中文摘要

我们从理论和实践两方面研究了在经典迭代硬阈值(IHT)方法中使用动量。通过对普通IHT进行简单修改,我们在标准假设下研究了其在带非凸约束的凸优化目标上的收敛行为。在多种场景中,我们观察到加速后的IHT相比最先进的投影梯度下降和Frank-Wolfe变体有显著改进。作为研究的副产品,我们考察了动量参数选择的影响:与凸设置类似,根据动量大小的不同,可以观察到“波纹”和线性两种行为模式。

英文摘要

We study, both in theory and in practice, the use of momentum motions in classic iterative hard thresholding (IHT) methods. By simply modifying plain IHT, we investigate its convergence behavior on convex optimization criteria with non-convex constraints, under standard assumptions. In diverse scenarios, we observe that acceleration in IHT leads to significant improvements compared to state-of-the-art projected gradient descent and Frank-Wolfe variants. As a byproduct of our inspection, we study the impact of selecting the momentum parameter: similar to convex settings, two modes of behavior are observed, "rippling" and linear, depending on the level of momentum.
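
As a rough illustration of adding momentum to plain IHT, the sketch below uses a least-squares objective with a sparsity constraint. It is a generic heavy-ball style variant written under assumptions: the abstract does not state the exact update, step-size rule, or momentum schedule, so `accelerated_iht`, `hard_threshold`, and all parameter values are hypothetical.

```python
import numpy as np

def hard_threshold(v, s):
    """Keep the s (>= 1) largest-magnitude entries of v and zero out the rest."""
    out = np.zeros_like(v)
    keep = np.argsort(np.abs(v))[-s:]
    out[keep] = v[keep]
    return out

def accelerated_iht(A, y, s, step, momentum, iters):
    """IHT for min ||Ax - y||^2 s.t. ||x||_0 <= s, with an extrapolation (momentum) step."""
    x_prev = x = np.zeros(A.shape[1])
    for _ in range(iters):
        u = x + momentum * (x - x_prev)      # momentum extrapolation
        grad = A.T @ (A @ u - y)             # gradient of 0.5 * ||Au - y||^2 at u
        x_prev, x = x, hard_threshold(u - step * grad, s)
    return x

# Toy compressed-sensing instance; recovery quality depends on A, step, and momentum.
rng = np.random.default_rng(0)
A = rng.standard_normal((40, 100)) / np.sqrt(40)
x_true = np.zeros(100)
x_true[[3, 17, 58]] = [1.5, -2.0, 1.0]
x_hat = accelerated_iht(A, A @ x_true, s=3, step=0.5, momentum=0.3, iters=200)
```

Setting `momentum=0.0` recovers plain IHT, which makes it easy to compare the two behaviours on the same instance.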

1808.03258 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Application of Bounded Total Variation Denoising in Urban Traffic Analysis

有界总变差去噪在城市交通分析中的应用

Shanshan Tang, Haijun Yu

发表机构 * 1 School of Mathematical Sciences, University of Chinese Academy of Sciences; LSEC, Institute of Computational Mathematics and Scientific/Engineering Computing, Academy of Mathematics and Systems Science, Beijing 100190, China 2 NCMIS & LSEC, Institute of Computational Mathematics; School of Mathematical Sciences, University of Chinese Academy of Sciences

AI总结 本文提出利用有界总变差去噪方法提升城市交通分析的准确性,通过改进的去噪算法,并结合神经网络与历史匹配方法,提高了交通预测和聚类的性能。

Comments 7 figures, 3 tables, to appear in the East Asian Journal on Applied Mathematics

详情
Journal ref
East Asian Journal on Applied Mathematics Vol.9, No.3, pp. 622-642, 2019
AI中文摘要

尽管人们认为在许多大数据应用中去噪并非总是必要,但本文通过将有界总变差去噪方法应用于城市道路交通预测和聚类问题,表明去噪对城市交通分析是有益的。我们提出了两种易于实现的方法来估计去噪算法中的噪声强度参数,并将去噪算法应用于北京出租车系统基于GPS的交通数据。在交通预测问题中,我们结合神经网络和历史匹配方法,对从北京某城区随机选取的道路进行预测。数值实验表明,应用所提出的有界总变差去噪算法显著提高了预测精度。我们还在聚类问题上测试了该算法,将一种最近提出的聚类分析方法应用于北京一百多条城市路段,根据其速度曲线进行聚类。去噪后获得了更好的聚类结果。

英文摘要

While it is believed that denoising is not always necessary in many big data applications, we show in this paper that denoising is helpful in urban traffic analysis by applying the method of bounded total variation denoising to the urban road traffic prediction and clustering problems. We propose two easy-to-implement methods to estimate the noise strength parameter in the denoising algorithm, and apply the denoising algorithm to GPS-based traffic data from the Beijing taxi system. For the traffic prediction problem, we combine a neural network with a history matching method for roads randomly chosen from an urban area of Beijing. Numerical experiments show that the prediction accuracy is improved significantly by applying the proposed bounded total variation denoising algorithm. We also test the algorithm on a clustering problem, where a recently developed clustering analysis method is applied to more than one hundred urban road segments in Beijing based on their velocity profiles. A better clustering result is obtained after denoising.

1806.06790 2026-06-04 cs.LG cs.AI cs.IT cs.SY eess.SY math.IT math.OC stat.ML 版本更新

Towards Distributed Energy Services: Decentralizing Optimal Power Flow with Machine Learning

迈向分布式能源服务:利用机器学习实现最优功率流的去中心化

Roel Dobbe, Oscar Sondermeijer, David Fridovich-Keil, Daniel Arnold, Duncan Callaway, Claire Tomlin

发表机构 * AI Now Institute at New York University(纽约大学AI Now研究所) Energy & Resources Group at UC Berkeley(加州大学伯克利分校能源与资源组)

AI总结 本文提出了一种基于机器学习的去中心化方法,通过本地可用信息学习可控分布式能源资源(DER)的控制策略,以重构和模仿集中式最优功率流(OPF)问题的解决方案,从而实现分布式能源服务。

Comments Accepted for publication. To appear in the IEEE Transactions on Smart Grid

详情
AI中文摘要

实现最优功率流(OPF)方法以调节电力网络中的电压和功率流通常被认为需要大量通信。我们考虑包含多个可控分布式能源资源(DER)的配电系统,并提出一种数据驱动的方法,用于学习每个DER的控制策略,以仅利用本地可用信息来重构和模仿集中式OPF问题的解决方案。集体来看,所有本地控制器紧密匹配集中式OPF解决方案,提供接近最优的性能并满足系统约束。速率失真框架使得能够分析由此产生的完全去中心化控制策略在重构OPF解决方案方面的效果。该方法为决定DER应与哪些节点通信以改进其个别策略提供了自然扩展。该方法在单相和三相测试馈线网络上应用,使用真实负载和分布式发电机的数据,重点于不表现出跨时间依赖性的DER。它为配电系统运营商提供了一个框架,以高效规划和操作DER的贡献,以实现配电网络中的分布式能源服务。

英文摘要

The implementation of optimal power flow (OPF) methods to perform voltage and power flow regulation in electric networks is generally believed to require extensive communication. We consider distribution systems with multiple controllable Distributed Energy Resources (DERs) and present a data-driven approach to learn control policies for each DER to reconstruct and mimic the solution to a centralized OPF problem from solely locally available information. Collectively, all local controllers closely match the centralized OPF solution, providing near optimal performance and satisfaction of system constraints. A rate distortion framework enables the analysis of how well the resulting fully decentralized control policies are able to reconstruct the OPF solution. The methodology provides a natural extension to decide what nodes a DER should communicate with to improve the reconstruction of its individual policy. The method is applied on both single- and three-phase test feeder networks using data from real loads and distributed generators, focusing on DERs that do not exhibit inter-temporal dependencies. It provides a framework for Distribution System Operators to efficiently plan and operate the contributions of DERs to achieve Distributed Energy Services in distribution networks.

1804.02948 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Sample-Derived Disjunctive Rules for Secure Power System Operation

基于样本的析取规则用于电力系统安全运行

Jochen L. Cremer, Ioannis Konstantelos, Simon H. Tindemans, Goran Strbac

发表机构 * Department of Electrical and Electronic Engineering(电气与电子工程系) Department of Electrical Sustainable Energy(电气可持续能源系)

AI总结 本文提出了一种基于决策树的析取规则方法,用于在标准优化框架中进行故障前和故障后控制,通过一种可推广的方法将由决策树导出的规则嵌入到运行决策模型中,以提高电力系统运行的安全性。

Comments 6 pages, accepted paper to IEEE PMAPS 2018

详情
AI中文摘要

过去已有研究利用蒙特卡洛样本,通过机器学习技术构建电力系统动态稳定性的预测器。在本文中,我们超越了预测任务,提出了一种综合方法,将预测器(如决策树(DT))纳入标准优化框架,用于故障前和故障后控制。具体而言,我们提出了一种可推广的方法,用于将由决策树导出的规则嵌入到运行决策模型中。我们首先指出了从预测框架过渡到控制框架时所面临的具体挑战。接着,我们介绍了基于广义析取规划(GDP)的求解策略,以及一种两步搜索方法,用于确定平衡成本与控制精度的最优超参数。我们通过IEEE 39节点系统的案例研究,展示了所提出的方法如何在运行条件存在高维不确定性的情况下,构建覆盖多种故障情景的安全代理。结果表明,与理想(oracle)模型相比,该方法仅以系统价格的小幅增加为代价,实现了高效的系统控制。

英文摘要

Machine learning techniques have been used in the past, with Monte Carlo samples, to construct predictors of the dynamic stability of power systems. In this paper we move beyond the task of prediction and propose a comprehensive approach to use predictors, such as Decision Trees (DT), within a standard optimization framework for pre- and post-fault control purposes. In particular, we present a generalizable method for embedding rules derived from DTs in an operation decision-making model. We begin by pointing out the specific challenges entailed in moving from a prediction to a control framework. We proceed by introducing a solution strategy based on generalized disjunctive programming (GDP), as well as a two-step search method for identifying optimal hyper-parameters for balancing cost and control accuracy. Using a case study on the IEEE 39-bus system, we showcase how the proposed approach constructs security proxies that cover multiple contingencies while facing high-dimensional uncertainty with respect to operating conditions. The method is shown to achieve efficient system control at a marginal increase in system price compared to an oracle model.

1905.04835 2026-06-04 cs.LG cs.CV cs.MA cs.RO cs.SY eess.SY stat.ML 版本更新

Multi-Agent Image Classification via Reinforcement Learning

通过强化学习进行多智能体图像分类

Hossein K. Mousavi, Mohammadreza Nazari, Martin Takáč, Nader Motee

AI总结 本文研究了利用多个能够收集未知环境部分姿态依赖观测的移动智能体进行图像分类的问题,提出了一种网络架构,用于指导智能体形成局部信念、采取局部行动并从原始部分观测中提取相关特征,通过与邻居智能体交换信息更新自身信念,并利用强化学习技术实现分类问题的去中心化实现。

Comments Preprint of the paper to be published in IROS'19 proceedings

详情
AI中文摘要

我们研究了一个分类问题,其中使用多个移动智能体,它们能够收集未知环境的(部分)依赖于位姿的观测。目标是在有限的时间范围内对图像进行分类。我们提出了一种网络架构,用于确定智能体应如何形成局部信念、采取局部行动并从原始的部分观测中提取相关特征。智能体可以与相邻智能体交换信息以更新自身信念。文中展示了如何利用强化学习技术,通过运行去中心化共识协议来实现该分类问题的去中心化求解。我们在MNIST手写数字数据集上的实验结果展示了所提框架的有效性。

英文摘要

We investigate a classification problem using multiple mobile agents capable of collecting (partial) pose-dependent observations of an unknown environment. The objective is to classify an image over a finite time horizon. We propose a network architecture that specifies how agents should form a local belief, take local actions, and extract relevant features from their raw partial observations. Agents are allowed to exchange information with their neighboring agents to update their own beliefs. It is shown how reinforcement learning techniques can be utilized to achieve decentralized implementation of the classification problem by running a decentralized consensus protocol. Our experimental results on the MNIST handwritten digit dataset demonstrate the effectiveness of our proposed framework.

1801.09627 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Barrier-Certified Adaptive Reinforcement Learning with Applications to Brushbot Navigation

带障碍证书的自适应强化学习及其在Brushbot导航中的应用

Motoya Ohnishi, Li Wang, Gennaro Notomista, Magnus Egerstedt

发表机构 * School of Electrical Engineering, Royal Institute of Technology(皇家理工学院电气工程学院) Georgia Institute of Technology(佐治亚理工学院) RIKEN Center for Advanced Intelligence Project(日本理化学研究所高级智能研究中心) School of Mechanical Engineering(机械工程学院)

AI总结 本文提出了一种安全学习框架,将自适应模型学习算法与障碍证书相结合,用于智能体动态可能非平稳的系统。通过稀疏优化技术提取模型的动态结构,并将学习到的模型与控制障碍证书结合起来约束策略(反馈控制器),以保持安全性,即避开状态空间中特定的不良区域。在一定条件下,当安全性因非平稳性而被破坏后,可保证在李雅普诺夫稳定性意义下恢复安全。此外,本文重新表述了动作-价值函数近似,使任何基于核的非线性函数估计方法都能应用于该自适应学习框架。最后,障碍证书约束下的策略优化的解被保证为全局最优,从而在温和条件下确保贪心策略改进。该框架先通过四旋翼飞行器的仿真进行验证(四旋翼此前在安全学习文献中是在平稳性假设下使用的),然后在真实机器人Brushbot上进行测试,其动态未知、高度复杂且非平稳。

Comments ©2019 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works

详情
Journal ref
Published in IEEE Transactions on Robotics, 2019
AI中文摘要

本文提出了一种安全学习框架,该框架将自适应模型学习算法与障碍证书相结合,用于智能体动态可能非平稳的系统。为了提取模型的动态结构,我们使用了稀疏优化技术。我们将学习到的模型与控制障碍证书结合使用,后者对策略(反馈控制器)加以约束以保持安全性,即避开状态空间中特定的不良区域。在一定条件下,当安全性因非平稳性而被破坏后,可保证在李雅普诺夫稳定性意义下恢复安全。此外,我们重新表述了动作-价值函数近似,使任何基于核的非线性函数估计方法都能应用于我们的自适应学习框架。最后,障碍证书约束下的策略优化的解被保证为全局最优,从而在温和条件下确保贪心策略改进。该框架先通过四旋翼飞行器的仿真进行验证(四旋翼此前在安全学习文献中是在平稳性假设下使用的),然后在真实机器人Brushbot上进行测试,其动态未知、高度复杂且非平稳。

英文摘要

This paper presents a safe learning framework that employs an adaptive model learning algorithm together with barrier certificates for systems with possibly nonstationary agent dynamics. To extract the dynamic structure of the model, we use a sparse optimization technique. We use the learned model in combination with control barrier certificates which constrain policies (feedback controllers) in order to maintain safety, which refers to avoiding particular undesirable regions of the state space. Under certain conditions, recovery of safety in the sense of Lyapunov stability after violations of safety due to the nonstationarity is guaranteed. In addition, we reformulate an action-value function approximation to make any kernel-based nonlinear function estimation method applicable to our adaptive learning framework. Lastly, solutions to the barrier-certified policy optimization are guaranteed to be globally optimal, ensuring the greedy policy improvement under mild conditions. The resulting framework is validated via simulations of a quadrotor, which has previously been used under stationarity assumptions in the safe learning literature, and is then tested on a real robot, the brushbot, whose dynamics are unknown, highly complex and nonstationary.

1706.00078 2026-06-04 cs.CC cs.LG cs.NA math.NA math.OC 版本更新

Low-Rank Matrix Approximation in the Infinity Norm

以无穷范数为度量的低秩矩阵逼近

Nicolas Gillis, Yaroslav Shitov

发表机构 * Department of Mathematics and Operational Research, University of Mons(蒙斯大学数学与运筹学系) National Research University Higher School of Economics(俄罗斯国家研究大学高等经济学院)

AI总结 本文研究了以逐元素无穷范数为度量的低秩矩阵逼近问题,证明了当秩r=1时该问题的判定形式是NP完全的,分析了该问题可在多项式时间内求解的若干情形,并提出了一种实用的启发式算法,用于恢复量化的低秩矩阵。

Comments 12 pages, 3 tables

详情
Journal ref
Linear Algebra and its Applications 581, pp. 367-382, 2019
AI中文摘要

以逐元素无穷范数为度量的低秩矩阵逼近问题是指:给定矩阵M和分解秩r,找到一个秩至多为r的矩阵X,使所有元素上的最大误差最小。在本文中,我们通过从“不全相等3SAT”问题进行归约,证明了当r=1时该问题的判定形式是NP完全的。我们还分析了该问题可在多项式时间内求解的若干情形,并提出了一种简单实用的启发式算法,将其应用于量化低秩矩阵的恢复问题。

英文摘要

The low-rank matrix approximation problem with respect to the entry-wise $\ell_{\infty}$-norm is the following: given a matrix $M$ and a factorization rank $r$, find a matrix $X$ whose rank is at most $r$ and that minimizes $\max_{i,j} |M_{ij} - X_{ij}|$. In this paper, we prove that the decision variant of this problem for $r=1$ is NP-complete using a reduction from the problem "not-all-equal 3SAT". We also analyze several cases when the problem can be solved in polynomial time, and propose a simple practical heuristic algorithm which we apply to the problem of the recovery of a quantized low-rank matrix.
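
To make the objective $\max_{i,j} |M_{ij} - X_{ij}|$ concrete, the sketch below evaluates it for a candidate rank-$r$ matrix. The truncated SVD is used only as a convenient baseline candidate: it is optimal in the Frobenius norm, not in the $\ell_{\infty}$ norm, and it is not the heuristic proposed in the paper. The function names are hypothetical.

```python
import numpy as np

def linf_error(M, X):
    """Entry-wise infinity-norm error max_{i,j} |M_ij - X_ij|."""
    return float(np.max(np.abs(np.asarray(M, dtype=float) - np.asarray(X, dtype=float))))

def truncated_svd(M, r):
    """Rank-r truncated SVD: a Frobenius-optimal baseline, not the l_inf optimum."""
    U, s, Vt = np.linalg.svd(np.asarray(M, dtype=float), full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

M = np.array([[1.0, 2.0], [2.0, 4.0]])   # exactly rank one
X = truncated_svd(M, 1)
err = linf_error(M, X)                   # essentially zero, up to round-off
```

For matrices that are not exactly low rank, comparing `linf_error` across candidate factorizations shows how far a Frobenius-optimal solution can be from a good $\ell_{\infty}$ solution.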

1903.11683 2026-06-04 stat.ML cs.CV cs.LG cs.RO cs.SY eess.SY stat.AP 版本更新

Outlier-Robust Spatial Perception: Hardness, General-Purpose Algorithms, and Guarantees

抗异常值的空间感知:困难性、通用算法与保证

Vasileios Tzoumas, Pasquale Antonante, Luca Carlone

AI总结 本文研究了空间感知中异常值的影响,提出了一种通用算法来有效剔除异常值,并为算法性能提供了理论保证。

详情
AI中文摘要

空间感知是许多机器人应用的核心,涵盖了广泛的研究问题,包括定位与建图、点云对齐以及从相机图像中估计相对位姿。错误的数据关联以及更一般的异常值会危及空间感知的鲁棒性。尽管已有处理异常值的技术,但它们可能以不可预测的方式失败(例如RANSAC、鲁棒估计器),或具有指数级的运行时间(例如分支定界法)。在本文中,我们通过三项贡献推进了异常值剔除的前沿。首先,我们证明了即使是异常值剔除的一个简单线性实例也是不可近似的:在最坏情况下,无法设计出能够高效计算近似解的拟多项式时间算法。我们的第二项贡献是给出了首个针对具体实例的次优性界,用于评估给定异常值剔除结果的近似质量。我们的第三项贡献是提出了一种简单的通用算法,称为自适应修剪(adaptive trimming),用于剔除异常值。该算法利用了最近提出的、能够求解无异常值问题的全局求解器,并迭代地剔除误差较大的测量值。我们在三个空间感知问题上演示了所提出的算法:三维配准、双视图几何和SLAM。结果表明,我们的算法在各项应用中均优于若干最先进的方法,同时又是一种通用方法。

英文摘要

Spatial perception is the backbone of many robotics applications, and spans a broad range of research problems, including localization and mapping, point cloud alignment, and relative pose estimation from camera images. Robust spatial perception is jeopardized by the presence of incorrect data association, and in general, outliers. Although techniques to handle outliers do exist, they can fail in unpredictable ways (e.g., RANSAC, robust estimators), or can have exponential runtime (e.g., branch-and-bound). In this paper, we advance the state of the art in outlier rejection by making three contributions. First, we show that even a simple linear instance of outlier rejection is inapproximable: in the worst case, one cannot design a quasi-polynomial time algorithm that computes an approximate solution efficiently. Our second contribution is to provide the first per-instance sub-optimality bounds to assess the approximation quality of a given outlier rejection outcome. Our third contribution is to propose a simple general-purpose algorithm, named adaptive trimming, to remove outliers. Our algorithm leverages recently proposed global solvers that are able to solve outlier-free problems, and iteratively removes measurements with large errors. We demonstrate the proposed algorithm on three spatial perception problems: 3D registration, two-view geometry, and SLAM. The results show that our algorithm outperforms several state-of-the-art methods across applications while being a general-purpose method.
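
The idea of alternating an outlier-free solver with the removal of large-error measurements can be sketched on a toy linear regression problem. This is only an illustration under assumptions: ordinary least squares stands in for the global solvers mentioned in the abstract, and the rule used here (drop the single worst residual until all residuals fall below a fixed threshold) is hypothetical, not the paper's adaptive trimming schedule.

```python
import numpy as np

def trimmed_least_squares(A, b, threshold, max_removed):
    """Refit after dropping the worst measurement until all residuals are small."""
    keep = np.ones(len(b), dtype=bool)   # True = measurement still treated as an inlier
    removed = 0
    while True:
        # Stand-in "outlier-free" solver: least squares on the kept measurements.
        x, *_ = np.linalg.lstsq(A[keep], b[keep], rcond=None)
        residuals = np.abs(A @ x - b)
        residuals[~keep] = 0.0           # ignore measurements already discarded
        worst = int(np.argmax(residuals))
        enough_left = keep.sum() - 1 >= A.shape[1]
        if residuals[worst] <= threshold or removed >= max_removed or not enough_left:
            return x, keep
        keep[worst] = False
        removed += 1

# Line b = 2 t + 1 sampled at ten points, with one grossly corrupted measurement.
t = np.arange(10.0)
A = np.column_stack([t, np.ones_like(t)])
b = 2.0 * t + 1.0
b[3] += 50.0
x, keep = trimmed_least_squares(A, b, threshold=1e-6, max_removed=3)
```

The returned mask `keep` marks which measurements were retained, which is handy for checking whether the corrupted sample was the one discarded.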

1811.05537 2026-06-04 math.NA cs.LG cs.NA cs.NE math.DS stat.ML 版本更新

Data Driven Governing Equations Approximation Using Deep Neural Networks

利用深度神经网络的数据驱动控制方程近似

Tong Qin, Kailiang Wu, Dongbin Xiu

发表机构 * Department of Mathematics, The Ohio State University(数学系,俄亥俄州立大学)

AI总结 本文提出了一种数值框架,利用观测数据和深度神经网络近似未知的控制方程,以残差网络作为基本构建块,提出了两种多步方法,并通过数值算例展示了这些方法的性能。

详情
AI中文摘要

我们提出了一种数值框架,利用观测数据和深度神经网络(DNN)近似未知的控制方程。具体而言,我们提出使用残差网络(ResNet)作为方程近似的基本构建块。我们表明,ResNet块可以看作一种在时间积分上精确的单步方法。随后,我们提出了两种多步方法,即循环ResNet(RT-ResNet)方法和递归ResNet(RS-ResNet)方法。RT-ResNet是一种基于均匀时间步长的多步方法,而RS-ResNet是一种使用可变时间步长的自适应多步方法。这三种方法均基于底层动力系统的积分形式。因此,它们在恢复方程时不需要时间导数数据,并且能够处理分布相对稀疏的轨迹数据。文中给出了若干数值算例来展示这些方法的性能。

英文摘要

We present a numerical framework for approximating unknown governing equations using observation data and deep neural networks (DNN). In particular, we propose to use a residual network (ResNet) as the basic building block for equation approximation. We demonstrate that the ResNet block can be considered as a one-step method that is exact in temporal integration. We then present two multi-step methods, the recurrent ResNet (RT-ResNet) method and the recursive ResNet (RS-ResNet) method. The RT-ResNet is a multi-step method on uniform time steps, whereas the RS-ResNet is an adaptive multi-step method using variable time steps. All three methods presented here are based on the integral form of the underlying dynamical system. As a result, they do not require time derivative data for equation recovery and can cope with relatively coarsely distributed trajectory data. Several numerical examples are presented to demonstrate the performance of the methods.

1904.08353 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Towards Robust Deep Reinforcement Learning for Traffic Signal Control: Demand Surges, Incidents and Sensor Failures

面向交通信号控制的鲁棒深度强化学习:需求激增、事故和传感器故障

Filipe Rodrigues, Carlos Lima Azevedo

发表机构 * Technical University of Denmark (DTU)(丹麦技术大学)

AI总结 本文提出了一种开源的基于回调的框架,用于在交通仿真环境中灵活评估不同的深度强化学习配置,研究了基于深度强化学习的自适应交通控制器在需求激增、事故导致的通行能力下降和传感器故障等场景下的表现,并提出了缓解这些外源不确定性的具体设计。

Comments 8 pages

详情
AI中文摘要

强化学习(RL)是缓解交通拥堵问题的一种有前景的方案。特别是,深度RL算法已被证明能够产生性能优于传统系统的自适应交通信号控制器。然而,为了在高度动态的城市区域中保持可靠,此类控制器需要对一系列外源不确定性具有鲁棒性。在本文中,我们开发了一个开源的基于回调的框架,以便在交通仿真环境中灵活评估不同的深度RL配置。借助该框架,我们研究了基于深度RL的自适应交通控制器在不同场景下的表现,即由特殊事件引起的需求激增、由事故导致的通行能力下降以及传感器故障。我们提炼了若干关键见解,用于开发面向交通控制的鲁棒深度RL算法,并提出了具体设计以减轻所考虑的外源不确定性的影响。

英文摘要

Reinforcement learning (RL) constitutes a promising solution for alleviating the problem of traffic congestion. In particular, deep RL algorithms have been shown to produce adaptive traffic signal controllers that outperform conventional systems. However, in order to be reliable in highly dynamic urban areas, such controllers need to be robust with respect to a series of exogenous sources of uncertainty. In this paper, we develop an open-source callback-based framework for promoting the flexible evaluation of different deep RL configurations under a traffic simulation environment. With this framework, we investigate how deep RL-based adaptive traffic controllers perform under different scenarios, namely under demand surges caused by special events, capacity reductions from incidents and sensor failures. We extract several key insights for the development of robust deep RL algorithms for traffic control and propose concrete designs to mitigate the impact of the considered exogenous uncertainties.

1903.02531 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY 版本更新

Combining Optimal Control and Learning for Visual Navigation in Novel Environments

将最优控制与学习相结合用于新环境中的视觉导航

Somil Bansal, Varun Tolani, Saurabh Gupta, Jitendra Malik, Claire Tomlin

发表机构 * University of California, Berkeley(加州大学伯克利分校) Facebook AI Research(脸书人工智能研究)

AI总结 本文提出了一种将基于模型的控制与基于学习的感知相结合的方法,用于在新环境中实现可靠的视觉导航:感知模块生成航路点(waypoints),引导机器人沿无碰撞路径高效地到达目标位置,该方法在低帧率下以及从仿真到现实的迁移中均表现良好。

Comments Project website: https://vtolani95.github.io/WayPtNav/

详情
AI中文摘要

基于模型的控制是机器人导航的流行范式,因为它可以利用已知的动力学模型高效地规划鲁棒的机器人轨迹。然而,当环境事先未知、且只能通过机器人的车载传感器部分观测时,使用基于模型的方法具有挑战性。在本工作中,我们通过将基于模型的控制与基于学习的感知相结合来弥补这一不足。基于学习的感知模块生成一系列航路点(waypoints),引导机器人沿无碰撞路径到达目标。基于模型的规划器利用这些航路点生成平滑且动力学可行的轨迹,并通过反馈控制在物理系统上执行。我们在模拟的真实世界杂乱环境中以及在真实地面车辆上的实验表明,与纯几何建图方法或端到端学习方法相比,所提出的方法在新环境中能够更可靠、更高效地到达目标位置。我们的方法不依赖于详细的显式三维环境地图,在低帧率下也能良好工作,并且能够很好地从仿真泛化到现实世界。描述我们方法和实验的视频可在项目网站上获得。

英文摘要

Model-based control is a popular paradigm for robot navigation because it can leverage a known dynamics model to efficiently plan robust robot trajectories. However, it is challenging to use model-based methods in settings where the environment is a priori unknown and can only be observed partially through on-board sensors on the robot. In this work, we address this shortcoming by coupling model-based control with learning-based perception. The learning-based perception module produces a series of waypoints that guide the robot to the goal via a collision-free path. These waypoints are used by a model-based planner to generate a smooth and dynamically feasible trajectory that is executed on the physical system using feedback control. Our experiments in simulated real-world cluttered environments and on an actual ground vehicle demonstrate that the proposed approach can reach goal locations more reliably and efficiently in novel environments as compared to purely geometric mapping-based or end-to-end learning-based alternatives. Our approach does not rely on detailed explicit 3D maps of the environment, works well with low frame rates, and generalizes well from simulation to the real world. Videos describing our approach and experiments are available on the project website.

1809.05525 2026-06-04 quant-ph cs.LG cs.SY eess.SY stat.ML 版本更新

Robustness of Quantum-Enhanced Adaptive Phase Estimation

量子增强自适应相位估计的鲁棒性

Pantita Palittapongarnpim, Barry C. Sanders

发表机构 * Institute for Quantum Science and Technology(量子科学与技术研究所) University of Calgary(卡尔加里大学) Program in Quantum Information Science(量子信息科学项目) Canadian Institute for Advanced Research(加拿大高等研究院) Toronto, Ontario M5G 1M1, Canada(加拿大安大略省多伦多M5G 1M1)

AI总结 本研究提出了一种评估量子增强自适应相位估计策略鲁棒性的测试方法,并比较了不同策略所使用的资源,以确定其有效性并选择合适的策略。

Comments 15 pages, 2 figures, 2 tables

详情
Journal ref
Phys. Rev. A 100, 012106 (2019)
AI中文摘要

由于所有物理的自适应量子增强计量(AQEM)方案都在噪声条件下运行,且噪声特性仅被部分了解,因此实用的控制策略必须对未知噪声也具有鲁棒性。我们旨在设计一种测试来评估AQEM策略的鲁棒性,并评估策略所使用的资源。鲁棒性测试针对量子增强自适应相位估计(QEAPE)进行,模拟该方案在四种相位噪声模型下的表现,分别对应正态分布噪声、随机电报噪声、偏正态分布噪声和对数正态分布噪声。控制策略或者由进化算法在相同的噪声条件下设计(尽管算法并不了解噪声的性质),或者采用假设无噪声的基于贝叶斯的反馈方法。我们的鲁棒性测试和资源比较方法可用于确定策略的有效性并选择合适的策略。

英文摘要

Because all physical adaptive quantum-enhanced metrology (AQEM) schemes operate under noisy conditions with only partially understood noise characteristics, a practical control policy must be robust even to unknown noise. We aim to devise a test to evaluate the robustness of AQEM policies and to assess the resources used by the policies. The robustness test is performed on quantum-enhanced adaptive phase estimation (QEAPE) by simulating the scheme under four phase-noise models corresponding to normal-distribution noise, random-telegraph noise, skew-normal-distribution noise, and log-normal-distribution noise. Control policies are devised either by an evolutionary algorithm under the same noisy conditions, albeit ignorant of the noise properties, or by a Bayesian-based feedback method that assumes no noise. Our robustness test and resource comparison method can be used to determine the efficacy of policies and to select a suitable policy.

1806.03816 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Adaptive MCMC via Combining Local Samplers

通过结合局部采样器实现自适应MCMC

Kiarash Shaloudegi, András György

Affiliations * Imperial College London, London, UK

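AI summary This paper proposes an adaptive MCMC method that combines several local samplers run in parallel and prioritizes chains by kernel Stein discrepancy to improve overall sampling efficiency; experiments show that it outperforms existing methods on multimodal problems and a sensor localization task.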

AI中文摘要

马尔可夫链蒙特卡罗(MCMC)方法在机器学习中被广泛使用。MCMC的主要问题之一是如何设计能够快速混合整个状态空间的链,特别是如何选择MCMC算法的参数。本文采取了不同的方法,类似于并行MCMC方法,而不是寻找一个能够采样整个分布的单一链,而是结合多个并行运行的链的样本,每个链仅探索状态空间的部分(例如几个模式)。链根据核Stein分歧度优先级进行选择,这提供了局部性能的良好度量。独立链的样本通过一种新的技术进行组合,用于估计样本空间不同区域的概率。实验结果表明,所提出的算法可能在不同的采样问题中提供显著的加速。最重要的是,当与最先进的NUTS算法作为基础MCMC采样器结合时,我们的方法在采样单峰分布时与NUTS具有竞争力,而在合成多峰问题以及具有挑战性的传感器定位任务中显著优于现有方法。

英文摘要

Markov chain Monte Carlo (MCMC) methods are widely used in machine learning. One of the major problems with MCMC is the question of how to design chains that mix fast over the whole state space; in particular, how to select the parameters of an MCMC algorithm. Here we take a different approach and, similarly to parallel MCMC methods, instead of trying to find a single chain that samples from the whole distribution, we combine samples from several chains run in parallel, each exploring only parts of the state space (e.g., a few modes only). The chains are prioritized based on kernel Stein discrepancy, which provides a good measure of performance locally. The samples from the independent chains are combined using a novel technique for estimating the probability of different regions of the sample space. Experimental results demonstrate that the proposed algorithm may provide significant speedups in different sampling problems. Most importantly, when combined with the state-of-the-art NUTS algorithm as the base MCMC sampler, our method remained competitive with NUTS on sampling from unimodal distributions, while significantly outperforming state-of-the-art competitors on synthetic multimodal problems as well as on a challenging sensor localization task.

1809.07192 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

Unbalanced Multi-Phase Distribution Grid Topology Estimation and Bus Phase Identification

不平衡多相配电网拓扑估计与节点相位识别

Yizheng Liao, Yang Weng, Guangyi Liu, Zhongyang Zhao, Chin-woo Tan, Ram Rajagopal

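AI summary This paper proposes an information-theoretic method that uses smart meter data to estimate the multi-phase topology of distribution grids and identify bus phases. Unbalanced systems are converted into symmetrical components, the Chow-Liu algorithm is proved to recover the topology even with incorrect bus phase labels, and Carson's equation is used to prove that voltage measurements correctly identify bus phase connections. Experiments show high accuracy under strong load unbalance and with distributed energy resources.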

Comments 17 pages, 18 figures

AI中文摘要

随着分布式能源资源带来的不确定性在配电网中增加,准确的多相拓扑是相关不平衡配电网测量的基础。然而,由于投资有限,尤其是低压配电网,此类拓扑知识往往不可用。此外,由于人为错误或过时记录,节点相位标签信息不准确。为此,本文利用智能电表数据提出了一种信息论方法来学习配电网拓扑。具体而言,多相不平衡系统被转换为对称分量,即正序、负序和零序。然后,本文证明Chow-Liu算法通过利用功率流方程和由配电网径向多相结构隐含的条件独立关系来确定拓扑。最后,通过Carson方程证明可以使用电压测量正确识别节点相位连接。为验证,使用三个真实数据集模拟IEEE系统。仿真结果表明,该算法在强负载不平衡和DERs条件下仍能准确找到多相拓扑,确保配电网中分布式能源的紧密监控和控制。

英文摘要

There is an increasing need to monitor and control the uncertainties brought by distributed energy resources (DERs) in distribution grids. To this end, accurate multi-phase topology is the basis for correlating measurements in unbalanced distribution networks. Unfortunately, such topology knowledge is often unavailable due to limited investment, especially for low-voltage distribution grids. In addition, the bus phase labeling information is often inaccurate due to human errors or outdated records. To address this challenge, this paper uses smart meter data in an information-theoretic approach to learn the topology of distribution grids. Specifically, multi-phase unbalanced systems are converted into symmetrical components, namely positive, negative, and zero sequences. Then, this paper proves that the Chow-Liu algorithm finds the topology, even in the presence of incorrect bus phase labels, by utilizing power flow equations and the conditional independence relationships implied by the radial multi-phase structure of distribution grids. Finally, by utilizing Carson's equation, this paper proves that the bus phase connection can be correctly identified using voltage measurements. For validation, IEEE systems are simulated using three real data sets. The simulation results demonstrate that the algorithm is highly accurate in finding the multi-phase topology even under strongly unbalanced loads and with DERs. This enables close monitoring and control of DERs in distribution grids.

1811.09358 2026-06-04 cs.LG cs.CV cs.NA math.NA math.OC stat.ML 版本更新

A Sufficient Condition for Convergences of Adam and RMSProp

Adam和RMSProp收敛性的充分条件

Fangyu Zou, Li Shen, Zequn Jie, Weizhong Zhang, Wei Liu

Affiliations * Tencent AI Lab; Stony Brook University

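AI summary This paper gives an easy-to-check sufficient condition, depending only on the base learning rate parameters and combinations of historical second-order moments, that guarantees global convergence of generic Adam/RMSProp in large-scale non-convex stochastic optimization, and shows that the convergence of several Adam variants in the non-convex setting follows directly from it.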

Comments Accepted by CVPR2019 as an Oral presentation

AI中文摘要

Adam和RMSProp是训练深度神经网络中最具影响力的自适应随机算法,尽管在凸设置中通过几个简单的反例已被指出存在发散现象。许多尝试,如降低自适应学习率、采用大批次大小、引入时间去相关技术、寻找类比的替代方案等,已被尝试以促进Adam/RMSProp型算法收敛。与现有方法不同,我们引入了一种替代的易于检查的充分条件,该条件仅依赖于基础学习率参数和历史二阶矩量的组合,以保证通用的Adam/RMSProp算法在大规模非凸随机优化中的全局收敛性。此外,我们展示了几种Adam变体,如AdamNC、AdaEMA等,在非凸设置下的收敛性可通过所提出的充分条件直接推导。此外,我们表明Adam本质上是一种具有指数移动平均动量的特定加权AdaGrad,这为理解Adam和RMSProp提供了新的视角。这一观察结合该充分条件,为它们的发散性提供了更深入的解释。最后,我们通过将Adam和RMSProp应用于特定反例和训练深度神经网络来验证该充分条件。数值结果与我们的理论分析一致。

英文摘要

Adam and RMSProp are two of the most influential adaptive stochastic algorithms for training deep neural networks, yet a few simple counterexamples have shown that they can diverge even in the convex setting. Many remedies, such as decreasing an adaptive learning rate, adopting a big batch size, incorporating a temporal decorrelation technique, and seeking an analogous surrogate, have been tried to make Adam/RMSProp-type algorithms converge. In contrast with existing approaches, we introduce an alternative easy-to-check sufficient condition, which depends only on the parameters of the base learning rate and combinations of historical second-order moments, to guarantee the global convergence of generic Adam/RMSProp for solving large-scale non-convex stochastic optimization. Moreover, we show that the convergence of several variants of Adam, such as AdamNC and AdaEMA, is directly implied by the proposed sufficient condition in the non-convex setting. In addition, we illustrate that Adam is essentially a specifically weighted AdaGrad with exponential moving average momentum, which provides a novel perspective for understanding Adam and RMSProp. This observation, coupled with the sufficient condition, gives a much deeper interpretation of their divergence. Finally, we validate the sufficient condition by applying Adam and RMSProp to a certain counterexample and to the training of deep neural networks. The numerical results are in accord with our theoretical analysis.

1803.07726 2026-06-04 stat.ML cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Gradient Descent with Random Initialization: Fast Global Convergence for Nonconvex Phase Retrieval

梯度下降与随机初始化:非凸相位恢复的快速全局收敛性

Yuxin Chen, Yuejie Chi, Jianqing Fan, Cong Ma

Affiliations * Department of Electrical Engineering, Princeton University; Department of Electrical and Computer Engineering, Carnegie Mellon University; Department of Operations Research and Financial Engineering, Princeton University

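AI summary This paper studies recovering a target object from quadratic equations and proves that, under Gaussian designs, randomly initialized gradient descent reaches an ε-accurate solution in O(log n + log(1/ε)) iterations, achieving near-optimal computational and sample complexities. It provides the first global convergence guarantee for phase retrieval that needs no carefully designed initialization, sample splitting, or sophisticated saddle-point escaping schemes.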

Comments Accepted to Mathematical Programming

Journal ref
Mathematical Programming 2019, Volume 176, Issue 1-2, 5-37
AI中文摘要

本文考虑了解二次方程组的问题,即从m个二次方程/样本y_i=(a_i^T x^natural)^2 (1≤i≤m)中恢复感兴趣的对象x^natural∈R^n。这个问题也被称为相位恢复,涵盖了多个领域,包括物理科学和机器学习。我们研究了为非凸最小二乘问题设计的梯度下降(或Wirtinger流)的效率。我们证明,在高斯设计下,梯度下降——当以随机方式初始化时——能在O(log n + log(1/ε))次迭代中获得ε精度的解,从而同时实现了近最优的计算和样本复杂度。这为相位恢复提供了首个关于普通梯度下降的全局收敛保证,无需(i)精心设计的初始化(ii)样本分割,或(iii)复杂的鞍点逃离方案。所有这些都通过利用统计模型分析优化算法,通过一种leave-one-out方法,实现了梯度下降迭代与数据之间的统计依赖性的解耦。

英文摘要

This paper considers the problem of solving systems of quadratic equations, namely, recovering an object of interest $\mathbf{x}^{\natural}\in\mathbb{R}^{n}$ from $m$ quadratic equations/samples $y_{i}=(\mathbf{a}_{i}^{\top}\mathbf{x}^{\natural})^{2}$, $1\leq i\leq m$. This problem, also dubbed phase retrieval, spans multiple domains including physical sciences and machine learning. We investigate the efficiency of gradient descent (or Wirtinger flow) designed for the nonconvex least squares problem. We prove that under Gaussian designs, gradient descent, when randomly initialized, yields an $\epsilon$-accurate solution in $O\big(\log n+\log(1/\epsilon)\big)$ iterations given nearly minimal samples, thus achieving near-optimal computational and sample complexities at once. This provides the first global convergence guarantee concerning vanilla gradient descent for phase retrieval, without the need of (i) carefully-designed initialization, (ii) sample splitting, or (iii) sophisticated saddle-point escaping schemes. All of these are achieved by exploiting the statistical models in analyzing optimization algorithms, via a leave-one-out approach that enables the decoupling of certain statistical dependency between the gradient descent iterates and the data.
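
As a rough illustration of the setting in this abstract, the sketch below runs plain gradient descent from a random Gaussian starting point on the nonconvex least-squares loss $f(\mathbf{x})=\frac{1}{4m}\sum_{i}\big((\mathbf{a}_{i}^{\top}\mathbf{x})^{2}-y_{i}\big)^{2}$. The loss normalization, step size, iteration count, problem sizes and seeds are illustrative assumptions, not values taken from the paper, and the function name `phase_retrieval_gd` is hypothetical.

```python
import math
import random

def phase_retrieval_gd(a, y, step=0.05, iters=1000, seed=1):
    """Gradient descent on f(x) = (1/4m) * sum_i ((a_i^T x)^2 - y_i)^2,
    started from a random Gaussian point x0 ~ N(0, I/n) (assumed scaling)."""
    rng = random.Random(seed)
    m, n = len(a), len(a[0])
    x = [rng.gauss(0.0, 1.0 / math.sqrt(n)) for _ in range(n)]
    for _ in range(iters):
        grad = [0.0] * n
        for a_i, y_i in zip(a, y):
            inner = sum(p * q for p, q in zip(a_i, x))
            # d/dx of (1/4m) * (inner^2 - y_i)^2 is (1/m) * (inner^2 - y_i) * inner * a_i
            coeff = (inner * inner - y_i) * inner / m
            for k in range(n):
                grad[k] += coeff * a_i[k]
        x = [x_k - step * g_k for x_k, g_k in zip(x, grad)]
    return x

# Synthetic noiseless instance with a Gaussian design (sizes are assumptions).
rng = random.Random(0)
n, m = 5, 200
x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
norm = math.sqrt(sum(v * v for v in x_true))
x_true = [v / norm for v in x_true]  # unit-norm ground truth
a = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(m)]
y = [sum(p * q for p, q in zip(a_i, x_true)) ** 2 for a_i in a]

x_hat = phase_retrieval_gd(a, y)
# The global sign of x cannot be recovered from y, so compare up to sign.
err_plus = math.sqrt(sum((u - v) ** 2 for u, v in zip(x_hat, x_true)))
err_minus = math.sqrt(sum((u + v) ** 2 for u, v in zip(x_hat, x_true)))
error = min(err_plus, err_minus)
```

The comparison up to a global sign reflects that $\mathbf{x}^{\natural}$ and $-\mathbf{x}^{\natural}$ produce identical measurements.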

1811.10745 2026-06-04 cs.LG cs.CR cs.NA math.NA stat.ML 版本更新

ResNets Ensemble via the Feynman-Kac Formalism to Improve Natural and Robust Accuracies

基于Feynman-Kac形式的ResNets集成:提升自然准确率与鲁棒准确率

Bao Wang, Binjie Yuan, Zuoqiang Shi, Stanley J. Osher

Affiliations * Department of Mathematics; Computer Science Department; University of California, Los Angeles; Tsinghua University; Yau Mathematical Sciences Center

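AI summary This paper proposes a ResNets ensemble algorithm based on the Feynman-Kac formalism: Gaussian noise with a specified variance is injected into the output of each residual mapping, and the outputs of multiple jointly trained modified ResNets are averaged, improving accuracy on both clean and adversarial images.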

Comments 18 pages, 6 figures

详情
AI中文摘要

经验对抗风险最小化(EARM)是一种广泛使用的数学框架,用于鲁棒地训练深度神经网络(DNNs),使其对对抗性攻击具有抵抗力。然而,训练后的鲁棒模型在分类干净图像和对抗图像时的自然和鲁棒准确率仍然远未令人满意。在本工作中,我们统一了传输方程最优控制的理论与ResNets的训练和测试实践。基于这一统一观点,我们提出了一种简单但有效的ResNets集成算法,以提升鲁棒训练模型在干净和对抗图像上的准确率。所提出的算法包括两个组成部分:首先,我们通过在每个残差映射的输出中注入指定方差的高斯噪声来修改基础ResNets。其次,我们对多个联合训练的修改后ResNets的输出进行平均以获得最终预测。这两个步骤近似了用于表示粘性传输方程(即对流-扩散方程)解的Feynman-Kac公式。在CIFAR10基准测试中,该简单算法得到的鲁棒模型在干净图像上的自然准确率为85.62%,在20次迭代的IFGSM攻击下的鲁棒准确率为57.94%,优于当前在CIFAR10上防御IFGSM攻击的最先进方法。所提出的ResNets集成的自然和鲁棒准确率可以随着基础ResNet的进步而动态提高。代码可在 https://github.com/BaoWangMath/EnResNet 获取。

英文摘要

Empirical adversarial risk minimization (EARM) is a widely used mathematical framework to robustly train deep neural nets (DNNs) that are resistant to adversarial attacks. However, both the natural and robust accuracies of the trained robust models, in classifying clean and adversarial images, respectively, are far from satisfactory. In this work, we unify the theory of optimal control of transport equations with the practice of training and testing of ResNets. Based on this unified viewpoint, we propose a simple yet effective ResNets ensemble algorithm to boost the accuracy of the robustly trained model on both clean and adversarial images. The proposed algorithm consists of two components: First, we modify the base ResNets by injecting Gaussian noise with a specified variance into the output of each residual mapping. Second, we average over the outputs of multiple jointly trained modified ResNets to get the final prediction. These two steps give an approximation to the Feynman-Kac formula for representing the solution of a transport equation with viscosity, or a convection-diffusion equation. For the CIFAR10 benchmark, this simple algorithm leads to a robust model with a natural accuracy of 85.62% on clean images and a robust accuracy of 57.94% under 20 iterations of the IFGSM attack, which outperforms the current state-of-the-art in defending against the IFGSM attack on CIFAR10. Both the natural and robust accuracies of the proposed ResNets ensemble can be improved dynamically as the building block ResNet advances. The code is available at: https://github.com/BaoWangMath/EnResNet.

1807.06613 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Deep Reinforcement Learning for Swarm Systems

深度强化学习用于群体系统

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

Affiliations * L-CAS University of Lincoln; Technische Universität Darmstadt

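AI summary This paper proposes a new state representation for deep multi-agent reinforcement learning, based on mean embeddings of distributions, to handle decentralized decision making in large homogeneous swarm systems more effectively.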

Comments 31 pages, 12 figures, version 3 (published in JMLR Volume 20)

Journal ref
Journal of Machine Learning Research 20(54):1-31, 2019
AI中文摘要

最近,深度强化学习(RL)方法已成功应用于多智能体场景。通常,这些方法依赖于将智能体状态拼接起来以表示去中心化决策所需的信息内容。然而,拼接在大规模同质群体系统中表现不佳,因为它不利用这些系统固有的基本属性:(i)群体中的智能体是可互换的,(ii)群体中智能体的精确数量无关。因此,我们提出了一种基于分布均嵌入的新深度多智能体RL状态表示方法。我们将智能体视为分布的样本,并使用经验均嵌入作为去中心化策略的输入。我们通过直方图、径向基函数和端到端学习的神经网络定义了不同的均嵌入特征空间。我们在群体文献中两个著名的已知问题(相遇和追捕)上评估了该表示方法,在全局和局部可观察的设置中。对于局部设置,我们进一步引入了简单的通信协议。所有方法中,基于神经网络特征的均嵌入表示能够促进相邻智能体之间最丰富的信息交换,从而促进更复杂的集体策略的发展。

英文摘要

Recently, deep reinforcement learning (RL) methods have been applied successfully to multi-agent scenarios. Typically, these methods rely on a concatenation of agent states to represent the information content required for decentralized decision making. However, concatenation scales poorly to swarm systems with a large number of homogeneous agents as it does not exploit the fundamental properties inherent to these systems: (i) the agents in the swarm are interchangeable and (ii) the exact number of agents in the swarm is irrelevant. Therefore, we propose a new state representation for deep multi-agent RL based on mean embeddings of distributions. We treat the agents as samples of a distribution and use the empirical mean embedding as input for a decentralized policy. We define different feature spaces of the mean embedding using histograms, radial basis functions and a neural network learned end-to-end. We evaluate the representation on two well known problems from the swarm literature (rendezvous and pursuit evasion), in a globally and locally observable setup. For the local setup we furthermore introduce simple communication protocols. Of all approaches, the mean embedding representation using neural network features enables the richest information exchange between neighboring agents facilitating the development of more complex collective strategies.
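
To make the mean-embedding idea in this abstract concrete, the following minimal sketch averages radial basis function features over the agents of a swarm. The feature centers, bandwidth and two-dimensional states are illustrative assumptions, and the helper names `rbf_features` and `mean_embedding` are hypothetical rather than taken from the paper.

```python
import math

def rbf_features(state, centers, bandwidth=1.0):
    """Radial basis function features of one agent state."""
    return [
        math.exp(-sum((s - c) ** 2 for s, c in zip(state, center))
                 / (2.0 * bandwidth ** 2))
        for center in centers
    ]

def mean_embedding(agent_states, centers, bandwidth=1.0):
    """Average the per-agent features; the result has one entry per center."""
    features = [rbf_features(s, centers, bandwidth) for s in agent_states]
    return [sum(column) / len(features) for column in zip(*features)]

centers = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
swarm = [(0.1, 0.2), (0.9, 0.1), (0.5, 0.5)]

embedding = mean_embedding(swarm, centers)
shuffled = mean_embedding(list(reversed(swarm)), centers)       # same agents, new order
bigger = mean_embedding(swarm + [(0.3, 0.8), (0.7, 0.7)], centers)  # two extra agents
```

The embedding does not depend on the order of the agents, and its length is fixed by the number of centers rather than by the swarm size, which matches the two properties (i) and (ii) listed in the abstract.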

1905.08645 2026-06-04 math.OC cs.DC cs.LG cs.MA cs.SY eess.SY 版本更新

Revisiting Randomized Gossip Algorithms: General Framework, Convergence Rates and Novel Block and Accelerated Protocols

重新审视随机Gossip算法:通用框架、收敛速率以及新型分块与加速协议

Nicolas Loizou, Peter Richtárik

Affiliations * University of Edinburgh; KAUST; MIPT

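AI summary This paper presents a new framework for analyzing and designing randomized gossip algorithms for the average consensus problem. Classical randomized iterative methods, applied to special linear systems that encode the network, are interpreted as gossip algorithms, and their decentralized nature is explained. The framework recovers many known gossip algorithms as special cases and allows provably faster variants to be developed. The paper also proposes new block protocols, the first provably accelerated randomized gossip protocols, and dual randomized gossip algorithms.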

Comments 44 pages, 12 figures

AI中文摘要

在本文中,我们提出了一种新的框架,用于分析和设计求解平均共识问题的随机Gossip算法。我们展示了求解线性系统的经典随机迭代方法在应用于编码底层网络的特殊系统时如何被解释为Gossip算法,并详细解释了其去中心化性质。我们的通用框架将多种已知的Gossip算法作为特例恢复出来,包括成对随机Gossip算法和路径平均Gossip算法,并允许开发可证明更快的变体。新方法的灵活性使我们能够设计出多种新的具体Gossip方法。例如,我们提出并分析了新的分块随机Gossip协议、首个可证明加速的随机Gossip协议,以及对偶随机Gossip算法。从数值分析的角度来看,我们的工作首次深入探讨了求解线性系统的随机迭代方法的去中心化性质,并将其作为求解平均共识问题的方法。我们通过在典型无线网络拓扑上进行广泛的实验测试来评估所提出Gossip协议的性能。

英文摘要

In this work we present a new framework for the analysis and design of randomized gossip algorithms for solving the average consensus problem. We show how classical randomized iterative methods for solving linear systems can be interpreted as gossip algorithms when applied to special systems encoding the underlying network and explain in detail their decentralized nature. Our general framework recovers a comprehensive array of well-known gossip algorithms as special cases, including the pairwise randomized gossip algorithm and path averaging gossip, and allows for the development of provably faster variants. The flexibility of the new approach enables the design of a number of new specific gossip methods. For instance, we propose and analyze novel block and the first provably accelerated randomized gossip protocols, and dual randomized gossip algorithms. From a numerical analysis viewpoint, our work is the first that explores in depth the decentralized nature of randomized iterative methods for linear systems and proposes them as methods for solving the average consensus problem. We evaluate the performance of the proposed gossip protocols by performing extensive experimental testing on typical wireless network topologies.
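
The pairwise randomized gossip algorithm that the abstract names as a special case can be sketched in a few lines: at each step one edge is sampled uniformly at random and its two endpoints replace their values with their average. The ring topology, initial values, step count and seed below are illustrative assumptions, and the sketch covers only this classical baseline, not the block, accelerated or dual protocols proposed in the paper.

```python
import random

def pairwise_gossip(values, edges, num_steps, seed=0):
    """Pairwise randomized gossip: pick a random edge (i, j) and
    replace both endpoint values by their average."""
    rng = random.Random(seed)
    x = list(values)
    for _ in range(num_steps):
        i, j = rng.choice(edges)
        avg = 0.5 * (x[i] + x[j])
        x[i] = avg
        x[j] = avg
    return x

# Ring network of five nodes (hypothetical example data).
values = [1.0, 5.0, 3.0, 9.0, 2.0]
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

result = pairwise_gossip(values, edges, num_steps=2000)
target = sum(values) / len(values)  # the average consensus value
```

Each update only involves two neighboring nodes and preserves the sum of all values, so on a connected network the values drift toward the network-wide average without any central coordinator.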

1905.13587 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

GENO -- GENeric Optimization for Classical Machine Learning

GENO -- 为经典机器学习设计的通用优化

Sören Laue, Matthias Mitterreiter, Joachim Giesen

Affiliations * Friedrich-Schiller-Universität Jena

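AI summary This paper presents the GENO framework, which combines a modeling language with a generic solver to automatically and efficiently solve most classical machine learning problems, and demonstrates its efficiency.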

AI中文摘要

尽管优化是机器学习的长期算法核心,但新模型仍需要耗时实现新求解器。因此,有成千上万种针对机器学习问题的优化算法实现。一个自然的问题是,是否总需要实现新求解器,或者是否存在一个适用于大多数模型的算法。普遍认为这种“万能算法”无法工作,因为该算法无法利用模型特定的结构,因此无法在广泛的问题上高效且稳健。本文挑战这一普遍观点。我们设计并实现了优化框架GENO(GENeric Optimization),它结合了建模语言和通用求解器。GENO从优化问题类的声明性规范中生成求解器。该框架足够灵活,可以涵盖大多数经典机器学习问题。我们在广泛的经典问题以及一些最近提出的问题上展示了自动生成的求解器的性能:(1) 与精心设计的专用求解器一样高效,(2) 比最近的最先进求解器有相当大的优势,(3) 比传统建模语言加求解器方法快多个数量级。

英文摘要

Although optimization is the longstanding algorithmic backbone of machine learning, new models still require the time-consuming implementation of new solvers. As a result, there are thousands of implementations of optimization algorithms for machine learning problems. A natural question is whether it is always necessary to implement a new solver, or whether there is one algorithm that is sufficient for most models. Common belief suggests that such a one-algorithm-fits-all approach cannot work, because this algorithm cannot exploit model-specific structure and thus cannot be efficient and robust on a wide variety of problems. Here, we challenge this common belief. We have designed and implemented the optimization framework GENO (GENeric Optimization) that combines a modeling language with a generic solver. GENO generates a solver from the declarative specification of an optimization problem class. The framework is flexible enough to encompass most of the classical machine learning problems. We show on a wide variety of classical but also some recently suggested problems that the automatically generated solvers are (1) as efficient as well-engineered specialized solvers, (2) more efficient by a decent margin than recent state-of-the-art solvers, and (3) orders of magnitude more efficient than classical modeling language plus solver approaches.

1905.13548 2026-06-04 math.OC cs.LG cs.SY eess.SY math.DS 版本更新

Sparse optimal control of networks with multiplicative noise via policy gradient

通过策略梯度实现受乘性噪声影响的网络稀疏最优控制

Benjamin Gravell, Yi Guo, Tyler Summers

Affiliations * The University of Texas at Dallas

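AI summary This paper proposes policy-gradient algorithms for designing near-optimal sparse controllers for complex dynamical networked systems corrupted by multiplicative noise, examines several regularization schemes, and demonstrates the algorithms on a large networked system.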

AI中文摘要

我们给出了设计近优稀疏控制器的算法,利用策略梯度方法应用于受乘性噪声影响的系统控制,这在新兴复杂动态网络中变得越来越重要。各种正则化方案通过梯度、次梯度和近似梯度方法被纳入优化过程。在大规模网络系统上的数值实验表明,算法能够收敛到高性能的稀疏均方稳定控制器。

英文摘要

We give algorithms for designing near-optimal sparse controllers using policy gradient with applications to control of systems corrupted by multiplicative noise, which is increasingly important in emerging complex dynamical networks. Various regularization schemes are examined and incorporated into the optimization by the use of gradient, subgradient, and proximal gradient methods. Numerical experiments on a large networked system show that the algorithms converge to performant sparse mean-square stabilizing controllers.

1905.13428 2026-06-04 cs.LG cs.MA cs.SY eess.SY stat.ML 版本更新

Attentional Policies for Cross-Context Multi-Agent Reinforcement Learning

面向跨上下文多智能体强化学习的注意力策略

Matthew A. Wright, Roberto Horowitz

Affiliations * University of California Berkeley

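AI summary This paper proposes new neural policy architectures for multi-agent problems that learn multi-agent relationships at the policy level through an attention mechanism; the approach outperforms a fully centralized reference solution, with a larger margin when scaling to many agents.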

AI中文摘要

许多现实世界中强化学习的应用涉及与数量随时间变化的其他智能体交互。我们为这些多智能体问题提出了新的神经策略架构。与传统的为每个智能体训练离散策略并通过额外的跨策略机制强制合作的方法不同,我们遵循最近关于深度网络中关系归纳偏置力量的工作精神,在策略层面学习多智能体关系。在我们的方法中,所有智能体共享相同的策略,但各自在自己的上下文中独立应用该策略,以聚合其他智能体的状态信息以选择下一步动作。我们的架构结构允许其应用于具有不同数量智能体的环境。我们在基准多智能体自动驾驶协调问题上展示了我们的架构,取得了优于全知识、完全集中化参考解决方案的成果,并在智能体数量扩大时显著优于该方案。

英文摘要

Many potential applications of reinforcement learning in the real world involve interacting with other agents whose numbers vary over time. We propose new neural policy architectures for these multi-agent problems. In contrast to other methods, which train an individual, discrete policy for each agent and then enforce cooperation through some additional inter-policy mechanism, we follow the spirit of recent work on the power of relational inductive biases in deep networks by learning multi-agent relationships at the policy level via an attentional architecture. In our method, all agents share the same policy, but independently apply it in their own context to aggregate the other agents' state information when selecting their next action. The structure of our architectures allows them to be applied to environments with varying numbers of agents. We demonstrate our architecture on a benchmark multi-agent autonomous vehicle coordination problem, obtaining superior results to a full-knowledge, fully-centralized reference solution, and significantly outperforming it when scaling to large numbers of agents.

1905.09673 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment

基于Q矩阵迁移学习的深度Q学习用于新型火灾疏散环境

Jivitesh Sharma, Per-Arne Andersen, Ole-Chrisoffer Granmo, Morten Goodwin

Affiliations * Centre for Artificial Intelligence Research; Department of Information and Communication Technology; University of Agder, Norway

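AI summary This paper proposes deep Q-learning with Q-matrix transfer learning for emergency evacuation: the DQN network weights are pretrained to incorporate shortest-path information, and the agent then achieves near-optimal evacuation performance in a large, complex real-building environment.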

Comments 21 pages, 14 figures, 4 tables

AI中文摘要

我们关注紧急疏散这一重要问题,该问题显然可以受益于强化学习,但长期以来未被充分研究。紧急疏散是一个复杂的任务,难以用强化学习解决,因为紧急情况高度动态,包含大量变化变量和复杂约束,使训练变得困难。在本文中,我们提出了第一个用于训练强化学习代理进行疏散规划的火灾疏散环境。该环境被建模为图,以捕捉建筑结构。它包括现实特征,如火势蔓延、不确定性和瓶颈。我们已经将环境实现为OpenAI gym格式,以促进未来研究。我们还提出了一种新的强化学习方法,该方法通过预训练DQN代理的网络权重来整合通往出口的最短路径信息。我们通过使用表格Q学习来学习建筑模型图中的最短路径来实现这一点。此信息通过故意在Q矩阵上过拟合来转移到网络。然后,预训练的DQN模型在火灾疏散环境中进行训练,以在时间变化条件下生成最优疏散路径。我们对所提出的方法与PPO、VPG、SARSA、A2C和ACKTR等最新强化学习算法进行了比较。结果表明,我们的方法在包括原始DQN模型在内的最新模型上表现出巨大的优势。最后,我们在一个大型且复杂的现实建筑中测试我们的模型,该建筑由91个房间组成,可以移动到任何其他房间,因此有8281种动作。我们使用基于注意力的机制来处理大动作空间。我们的模型在现实世界紧急环境中实现了接近最优的性能。

英文摘要

We focus on the important problem of emergency evacuation, which could clearly benefit from reinforcement learning but has been largely unaddressed. Emergency evacuation is a complex task that is difficult to solve with reinforcement learning, since an emergency situation is highly dynamic, with many changing variables and complex constraints that make it difficult to train on. In this paper, we propose the first fire evacuation environment to train reinforcement learning agents for evacuation planning. The environment is modelled as a graph capturing the building structure. It includes realistic features such as fire spread, uncertainty and bottlenecks. We have implemented the environment in the OpenAI gym format to facilitate future research. We also propose a new reinforcement learning approach that pretrains the network weights of a DQN-based agent to incorporate information on the shortest path to the exit. We achieve this by using tabular Q-learning to learn the shortest path on the building model's graph. This information is transferred to the network by deliberately overfitting it on the Q-matrix. Then, the pretrained DQN model is trained on the fire evacuation environment to generate the optimal evacuation path under time-varying conditions. We compare the proposed approach with state-of-the-art reinforcement learning algorithms such as PPO, VPG, SARSA, A2C and ACKTR. The results show that our method outperforms state-of-the-art models, including the original DQN-based models, by a huge margin. Finally, we test our model on a large and complex real building consisting of 91 rooms, with the possibility to move from any room to any other room, hence giving 8281 actions. We use an attention-based mechanism to deal with large action spaces. Our model achieves near-optimal performance on the real-world emergency environment.
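
The tabular pretraining step mentioned in this abstract (learning shortest paths to the exit on the building graph) might look roughly like the sketch below. It is a simplified stand-in under stated assumptions: the six-room graph is made up, the reward is -1 per move, the learning rate is 1 (reasonable only because transitions here are deterministic), and every state-action pair is swept in turn instead of being sampled by an exploring agent. The later transfer of the Q-matrix into a DQN is not shown.

```python
def q_learning_shortest_path(graph, exit_node, gamma=0.9, sweeps=50):
    """Tabular Q-learning on a deterministic room graph with learning rate 1.
    Q[(s, a)] is the value of moving from room s to the neighbouring room a."""
    q = {(s, a): 0.0 for s, nbrs in graph.items() for a in nbrs}
    for _ in range(sweeps):
        for s, nbrs in graph.items():
            for a in nbrs:
                # Reward is -1 per move; reaching the exit ends the episode.
                best_next = 0.0 if a == exit_node else max(q[(a, b)] for b in graph[a])
                q[(s, a)] = -1.0 + gamma * best_next
    return q

def greedy_path(q, graph, start, exit_node):
    """Follow the highest-valued move from each room until the exit."""
    path = [start]
    while path[-1] != exit_node and len(path) <= len(graph):
        s = path[-1]
        path.append(max(graph[s], key=lambda a: q[(s, a)]))
    return path

# Hypothetical building: room 5 is the exit; 0-2-4-5 is the shortest route from room 0.
graph = {0: [1, 2], 1: [0, 3], 2: [0, 4], 3: [1, 4], 4: [2, 3, 5], 5: []}

q = q_learning_shortest_path(graph, exit_node=5)
path = greedy_path(q, graph, start=0, exit_node=5)
```

With a -1 reward per move, the greedy policy over the learned Q-matrix follows a shortest route to the exit, which is the information the abstract describes transferring to the network.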

1805.07297 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

General solutions for nonlinear differential equations: a rule-based self-learning approach using deep reinforcement learning

非线性微分方程的通用解法:一种基于规则的自学习方法使用深度强化学习

Shiyin Wei, Xiaowei Jin, Hui Li

Affiliations * Key Lab of Smart Prevention and Mitigation of Civil Engineering Disasters of the Ministry of Industry and Information Technology, Harbin Institute of Technology; Key Lab of Structures Dynamic Behavior and Control of the Ministry of Education, Harbin Institute of Technology; School of Civil Engineering, Harbin Institute of Technology

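AI summary This paper proposes a rule-based self-learning approach that uses deep reinforcement learning to solve nonlinear ordinary and partial differential equations. An actor structured as a deep neural network outputs candidate solutions, and a critic is derived only from physical rules (governing equations and boundary and initial conditions). The approach shows a transfer learning characteristic and is verified on the Schrödinger, Navier-Stokes, Burgers', Van der Pol, and Lorenz equations and an equation of motion.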

详情
AI中文摘要

本文首次提出了一种基于深度强化学习(DRL)的通用规则式自学习方法,用于求解非线性常微分方程和偏微分方程。求解器由一个深度神经网络结构的演员(actor)和一个评论家(critic)组成:演员输出候选解,评论家仅由物理规则(控制方程以及边界和初始条件)导出。离散时间中的解被视为共享同一控制方程的多个任务;由于解在时间上的连续性,当前步的参数为下一步提供了理想的初始化,这体现了迁移学习特性,并表明DRL求解器已经捕捉到了方程的本质。该方法通过求解薛定谔、纳维-斯托克斯、伯格斯、范德波尔和洛伦兹方程以及一个运动方程进行了验证。结果表明,该方法能够给出高精度的解,且求解过程有望变得更快。

英文摘要

A universal rule-based self-learning approach using deep reinforcement learning (DRL) is proposed for the first time to solve nonlinear ordinary differential equations and partial differential equations. The solver consists of a deep neural network-structured actor that outputs candidate solutions, and a critic derived only from physical rules (governing equations and boundary and initial conditions). Solutions in discretized time are treated as multiple tasks sharing the same governing equation, and the current step parameters provide an ideal initialization for the next owing to the temporal continuity of the solutions, which shows a transfer learning characteristic and indicates that the DRL solver has captured the intrinsic nature of the equation. The approach is verified through solving the Schrödinger, Navier-Stokes, Burgers', Van der Pol, and Lorenz equations and an equation of motion. The results indicate that the approach gives solutions with high accuracy, and the solution process promises to get faster.

1905.10457 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

A Polynomial-Based Approach for Architectural Design and Learning with Deep Neural Networks

基于多项式的深度神经网络架构设计与学习方法

Joseph Daws, Clayton G. Webster

Affiliations * Oak Ridge National Lab; University of Tennessee at Knoxville

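AI summary This paper proposes a polynomial-based approach for reconstructing multivariate functions from training data: polynomial approximations are used to identify a suitable network architecture and initialization, after which standard training improves the network and is more likely to reach a desirable local minimum.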

Comments 11 pages, 6 figures, submitted to NeurIPS 2019, corrected several typos and included new examples

详情
AI中文摘要

在本研究中,我们提出了一种新的方法,通过多项式近似来从训练数据中重建多元函数,同时确定合适的网络架构和初始化。使用梯度下降训练深度神经网络可以被视为沿着损失景观移动网络参数以最小化损失函数。参数初始化对于基于下降的迭代训练方法至关重要。我们的方法产生了一个初始状态为训练数据多项式表示的网络。该技术的主要优势是,从该初始状态出发,网络可以通过标准训练过程进行改进。由于网络已经近似了数据,训练更可能产生一组与理想局部极小值相关的参数。我们提供了构建此类网络所需的理论细节,并考虑了几个数值示例,揭示了我们的方法最终能够有效训练网络,从初始状态开始,以实现对大量目标函数的改进近似。

英文摘要

In this effort we propose a novel approach for reconstructing multivariate functions from training data, by identifying both a suitable network architecture and an initialization using polynomial-based approximations. Training deep neural networks using gradient descent can be interpreted as moving the set of network parameters along the loss landscape in order to minimize the loss functional. The initialization of parameters is important for iterative training methods based on descent. Our procedure produces a network whose initial state is a polynomial representation of the training data. The major advantage of this technique is that, from this initialized state, the network may be improved using standard training procedures. Since the network already approximates the data, training is more likely to produce a set of parameters associated with a desirable local minimum. We provide the details of the theory necessary for constructing such networks and also consider several numerical examples that reveal that our approach ultimately produces networks that can be effectively trained from our initialized state to achieve an improved approximation for a large class of target functions.

1905.08930 2026-06-04 math.NA cs.LG cs.NA math.PR math.ST stat.ML stat.TH 版本更新

Heavy Hitters and Bernoulli Convolutions

Heavy Hitters(高频项)与伯努利卷积

Alexander Kushkuley

Affiliations * Salesforce/Demandware

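AI summary This paper suggests a simple event-frequency approximation algorithm that is sensitive to event timeliness. The algorithm iteratively updates a categorical click distribution, producing the path of a random walk on a standard n-dimensional simplex. Under certain conditions the random walk is self-similar and corresponds to a biased Bernoulli convolution, and evaluating the algorithm leads to estimating moments of biased (finite and infinite) Bernoulli convolutions.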

Comments 1) fixed some typos and a reference 2) expanded section 3

详情
AI中文摘要

提出了一种非常简单的事件频率近似算法,该算法对事件时效性敏感。该算法通过迭代更新类别点击分布,在标准n维单纯形上生成(路径)随机游走。在某些条件下,这种随机游走具有自相似性,并对应于有偏伯努利卷积。算法评估自然地导致对有偏(有限和无限)伯努利卷积矩的估计。

英文摘要

A very simple event frequency approximation algorithm that is sensitive to event timeliness is suggested. The algorithm iteratively updates categorical click-distribution, producing (path of) a random walk on a standard $n$-dimensional simplex. Under certain conditions, this random walk is self-similar and corresponds to a biased Bernoulli convolution. Algorithm evaluation naturally leads to estimation of moments of biased (finite and infinite) Bernoulli convolutions.
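
The abstract does not spell out the update rule, so the sketch below shows only one plausible form of a timeliness-sensitive update: each observed event pulls the categorical distribution a fixed fraction `alpha` toward the observed category (exponential smoothing). The rule itself, the value of `alpha`, the category names and the function name `update_distribution` are all assumptions made for illustration.

```python
def update_distribution(p, event, alpha):
    """Move the distribution a fraction alpha toward the observed category.
    The result stays on the probability simplex."""
    return {k: (1.0 - alpha) * v + (alpha if k == event else 0.0)
            for k, v in p.items()}

categories = ["a", "b", "c"]
p = {k: 1.0 / len(categories) for k in categories}  # start from the uniform distribution

# Ten older clicks on "a" followed by three recent clicks on "b".
for event in ["a"] * 10 + ["b"] * 3:
    p = update_distribution(p, event, alpha=0.5)
```

Under this assumed rule the three recent "b" events outweigh the ten older "a" events, which is one way an estimate can be sensitive to how recently events occurred, and each update is a step of a walk on the simplex.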

1903.04666 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Provably Correct Learning Algorithms in the Presence of Time-Varying Features Using a Variational Perspective

在存在时间变化特征的情况下使用变分视角的可证明正确学习算法

Joseph E. Gaudio, Travis E. Gibson, Anuradha M. Annaswamy, Michael A. Bolender

Affiliations * Massachusetts Institute of Technology; Brigham and Women’s Hospital and Harvard Medical School; Air Force Research Laboratory

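AI summary This paper proposes algorithms, derived from a variational perspective, with provable performance guarantees for learning in the presence of time-varying features, and verifies the theoretical results in simulation.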

Comments 25 pages, additional simulation detail, paper rewritten

AI中文摘要

在机器学习问题中,特征通常是时间变化的,并且可能以代数或动态的方式与输出相关联。这些机器学习问题的动态性质使得当前的高阶加速梯度下降方法不稳定或削弱了收敛保证。受自适应控制方法的启发,本文提出了新的算法,用于处理存在时间变化特征的情况,并展示了可证明的性能保证。特别是,我们开发了一种连续时间算法中的统一变分视角。该变分视角包括高阶学习概念和归一化,这些都源自自适应控制,并允许在存在时间变化特征的动态机器学习问题中建立稳定性。这些高阶算法还被检查用于自适应控制和识别中的可证明正确学习。提供了仿真以验证理论结果。

英文摘要

Features in machine learning problems are often time-varying and may be related to outputs in an algebraic or dynamical manner. The dynamic nature of these machine learning problems renders current higher order accelerated gradient descent methods unstable or weakens their convergence guarantees. Inspired by methods employed in adaptive control, this paper proposes new algorithms for the case when time-varying features are present, and demonstrates provable performance guarantees. In particular, we develop a unified variational perspective within a continuous time algorithm. This variational perspective includes higher order learning concepts and normalization, both of which stem from adaptive control, and allows stability to be established for dynamical machine learning problems where time-varying features are present. These higher order algorithms are also examined for provably correct learning in adaptive control and identification. Simulations are provided to verify the theoretical results.

1905.11299 2026-06-04 eess.SY cs.CV cs.LG cs.SY 版本更新

ImgSensingNet: UAV Vision Guided Aerial-Ground Air Quality Sensing System

ImgSensingNet: 基于无人机视觉引导的空地空气质量感知系统

Yuzhe Yang, Zhiwen Hu, Kaigui Bian, Lingyang Song

发表机构 * Computer Science and Artificial Intelligence Laboratory(计算机科学与人工智能实验室) School of Electrical Engineering and Computer Science(电子工程与计算机科学学院)

AI总结 本文提出ImgSensingNet,一种基于无人机视觉引导的空地联合感知系统,通过融合无人机拍摄的雾霾图像与地面三维无线传感器网络(WSN)收集的AQI数据,实现精细化空气质量监测与预测,显著降低系统能耗。

Comments Preliminary version published in INFOCOM 2019. Code available at https://github.com/YyzHarry/ImgSensingNet

详情
AI中文摘要

鉴于日益严重的空气污染问题,城市区域空气质量指数(AQI)的监测已引起广泛关注。本文提出ImgSensingNet,一种基于视觉引导的空地联合感知系统,用于利用无人机拍摄的雾霾图像与地面三维无线传感器网络(WSN)收集的AQI数据进行精细化空气质量监测与预测。具体而言,ImgSensingNet首先利用计算机视觉技术从拍摄的雾霾图像中识别不同区域的AQI尺度,其中设计了与雾霾相关的特征和深度卷积神经网络(CNN)以直接学习雾霾图像与相应AQI尺度之间的映射关系。基于学习到的AQI尺度,ImgSensingNet决定是否唤醒地面无线传感器进行小尺度AQI监测和推断,从而显著降低系统的能耗。采用基于熵的模型以在未测量位置实现准确的实时AQI推断和未来空气质量分布预测。我们在两所大学校园自2018年2月起实施并评估ImgSensingNet,已收集17,630张照片和260万条AQI数据样本。实验结果证实,与现有最先进的AQI监测方法相比,ImgSensingNet在提高推断精度的同时显著降低了能耗。

英文摘要

Given the increasingly serious air pollution problem, the monitoring of the air quality index (AQI) in urban areas has drawn considerable attention. This paper presents ImgSensingNet, a vision guided aerial-ground sensing system for fine-grained air quality monitoring and forecasting that fuses haze images taken by an unmanned aerial vehicle (UAV) with AQI data collected by an on-ground three-dimensional (3D) wireless sensor network (WSN). Specifically, ImgSensingNet first uses computer vision to infer the AQI scale in different regions from the captured haze images, where haze-relevant features and a deep convolutional neural network (CNN) are designed to learn the mapping between haze images and the corresponding AQI scale directly. Based on the learnt AQI scale, ImgSensingNet determines whether to wake up on-ground wireless sensors for small-scale AQI monitoring and inference, which can greatly reduce the energy consumption of the system. An entropy-based model is employed for accurate real-time AQI inference at unmeasured locations and for forecasting the future air quality distribution. We have implemented and evaluated ImgSensingNet on two university campuses since Feb. 2018, and have collected 17,630 photos and 2.6 million AQI data samples. Experimental results confirm that ImgSensingNet achieves higher inference accuracy while greatly reducing energy consumption, compared to state-of-the-art AQI monitoring approaches.

1905.11261 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

A Unified Theory of SGD: Variance Reduction, Sampling, Quantization and Coordinate Descent

SGD的统一理论:方差减少、采样、量化和坐标下降

Eduard Gorbunov, Filip Hanzely, Peter Richtárik

发表机构 * MIPT, Russia(莫斯科物理技术学院, 俄罗斯) KAUST, Saudi Arabia(阿卜杜拉国王科技大学, 沙特阿拉伯)

AI总结 本文提出了一种统一分析大规模近端随机梯度下降(SGD)变体的框架,涵盖了方差减少、重要采样、小批量采样、量化和坐标子采样等技巧及其组合,首次统一了SGD和随机坐标下降(RCD)方法、方差减少和非方差减少SGD方法以及量化和非量化方法的理论。

Comments 38 pages, 4 figures, 2 tables

详情
AI中文摘要

在本文中,我们引入了一种对大规模近端随机梯度下降(SGD)变体的统一分析,这些变体至今需要不同的直觉、收敛分析、应用,并在不同社区中分别发展。我们证明我们的框架包括有和没有以下技巧的方法及其组合:方差减少、重要采样、小批量采样、量化和坐标子采样。作为副产品,我们获得了SGD和随机坐标下降(RCD)方法的第一个统一理论,方差减少和非方差减少SGD方法的第一个统一理论,以及量化和非量化方法的第一个统一理论。我们方法的关键是关于迭代和随机梯度的参数假设。在单一定理中,我们在该假设和损失函数的强拟凸性下建立了线性收敛结果。每当我们将现有方法作为特殊情况恢复时,我们的定理给出了目前最好的复杂度结果。我们的方法可以用来激励新有用方法的开发,并提供预证明的收敛保证。为了说明我们方法的强度,我们开发了五个新的SGD变体,并通过数值实验展示了一些性质。

英文摘要

In this paper we introduce a unified analysis of a large family of variants of proximal stochastic gradient descent ({\tt SGD}) which so far have required different intuitions, convergence analyses, have different applications, and which have been developed separately in various communities. We show that our framework includes methods with and without the following tricks, and their combinations: variance reduction, importance sampling, mini-batch sampling, quantization, and coordinate sub-sampling. As a by-product, we obtain the first unified theory of {\tt SGD} and randomized coordinate descent ({\tt RCD}) methods, the first unified theory of variance reduced and non-variance-reduced {\tt SGD} methods, and the first unified theory of quantized and non-quantized methods. A key to our approach is a parametric assumption on the iterates and stochastic gradients. In a single theorem we establish a linear convergence result under this assumption and strong-quasi convexity of the loss function. Whenever we recover an existing method as a special case, our theorem gives the best known complexity result. Our approach can be used to motivate the development of new useful methods, and offers pre-proved convergence guarantees. To illustrate the strength of our approach, we develop five new variants of {\tt SGD}, and through numerical experiments demonstrate some of their properties.

1905.10363 2026-06-04 math.NA cs.CE cs.LG cs.NA stat.ML 版本更新

User-Device Authentication in Mobile Banking using APHEN for Paratuck2 Tensor Decomposition

使用APHEN进行Paratuck2张量分解的移动银行用户-设备认证

Jeremy Charlier, Eric Falk, Radu State, Jean Hilger

发表机构 * University of Luxembourg(卢森堡大学)

AI总结 本文研究了如何利用Paratuck2张量分解和APHEN算法提高移动银行应用中的用户-设备认证效率,以增强个人财务广告的效果。

详情
AI中文摘要

新的金融欧洲法规,如PSD2,正在改变零售银行业务服务。值得注意的是,个人支出的监控现在不仅限于零售银行。然而,零售银行希望通过移动银行应用中的用户-设备认证来增强个人财务广告。为了解决认证的建模问题,我们依赖于张量分解,这是矩阵分解的高维类比。我们使用Paratuck2,因为它可以将张量表示为矩阵乘积和对角张量的乘积,因为用户和设备数量之间存在不平衡。我们强调为什么Paratuck2比流行的CP张量分解更适合这种情况,后者将张量分解为秩一张量的和。然而,Paratuck2的计算是计算密集型的。我们提出了一种新的近似Hessian基于牛顿求解算法,APHEN,能够比基于交替最小二乘或梯度下降的其他流行方法更准确和快速地解决Paratuck2。Paratuck2的结果用于通过神经网络预测用户认证的预测。我们应用我们的方法用于具体的案例,即基于移动银行应用生成的认证事件来针对客户进行财务广告活动。

英文摘要

New European financial regulations such as PSD2 are changing retail banking services. Notably, the monitoring of personal expenses is now open to institutions other than retail banks. Nonetheless, retail banks are looking to leverage user-device authentication in mobile banking applications to enhance personal financial advertising. To address the profiling of the authentication, we rely on tensor decomposition, a higher dimensional analogue of matrix decomposition. We use Paratuck2, which expresses a tensor as a multiplication of matrices and diagonal tensors, because of the imbalance between the number of users and devices. We highlight why Paratuck2 is more appropriate in this case than the popular CP tensor decomposition, which decomposes a tensor as a sum of rank-one tensors. However, the computation of Paratuck2 is computationally intensive. We propose a new APproximate HEssian-based Newton resolution algorithm, APHEN, capable of solving Paratuck2 more accurately and faster than the other popular approaches based on alternating least squares or gradient descent. The results of Paratuck2 are used to predict users' authentication with neural networks. We apply our method to the concrete case of targeting clients for financial advertising campaigns based on the authentication events generated by mobile banking applications.

1905.10224 2026-06-04 cs.LG cs.DM cs.NA cs.NE math.NA stat.ML 版本更新

Semi-Supervised Classification on Non-Sparse Graphs Using Low-Rank Graph Convolutional Networks

利用低秩图卷积网络对非稀疏图进行半监督分类

Dominik Alfke, Martin Stoll

发表机构 * Department of Mathematics, Chair of Scientific Computing(数学系,科学计算教研室)

AI总结 本文提出了一种低秩图卷积网络架构,用于高效处理非稀疏图上的半监督学习问题,通过引入低秩滤波器提升运行效率和准确率,并扩展到超图数据集的处理。

详情
AI中文摘要

图卷积网络(GCNs)已被证明是图数据集上半监督学习的成功工具。对于稀疏图,线性和多项式滤波函数取得了显著成果。然而,对于大规模非稀疏图,网络训练和评估变得成本过高。通过引入低秩滤波器,我们获得了显著的运行时间加速,同时提高了准确性。我们进一步提出了一种架构变化,模仿了模型降阶技术,称为降阶GCN。此外,我们展示了我们的方法如何也能应用于超图数据集,并展示了超图卷积如何高效实现。

英文摘要

Graph Convolutional Networks (GCNs) have proven to be successful tools for semi-supervised learning on graph-based datasets. For sparse graphs, linear and polynomial filter functions have yielded impressive results. For large non-sparse graphs, however, network training and evaluation becomes prohibitively expensive. By introducing low-rank filters, we gain significant runtime acceleration and simultaneously improved accuracy. We further propose an architecture change mimicking techniques from Model Order Reduction in what we call a reduced-order GCN. Moreover, we present how our method can also be applied to hypergraph datasets and how hypergraph convolution can be implemented efficiently.

1812.03457 2026-06-04 math.OC cs.LG cs.NA cs.NE math.NA 版本更新

Minima distribution for global optimization

全局优化的极小值分布

Xiaopeng Luo

发表机构 * Department of Chemistry, Princeton University(普林斯顿大学化学系)

AI总结 本文研究了任意连续函数在紧集上的极小值分布问题,提出了一种新的极小值分布函数构造方法,并建立了与目标函数和紧集相关的单调收敛序列,最终确定了连续可微函数的极小值集收缩率。

Comments 19 pages, 6 figures

详情
AI中文摘要

本文建立了任意连续函数在紧集上的极小值与全局极小值之间的严格数学关系,类似于已知的一阶最优性条件对于凸可微函数。通过引入一类仅与目标函数和给定紧集相关的极小值分布函数,我们构造了一个单调收敛到给定紧集上全局极小值的序列。然后,我们进一步考虑了一些各种集列,每个集列从原始紧集单调收缩到所有全局极小值的集合,且对于连续可微函数,收缩率可以确定。最后,我们提供了一种不同的构造极小值分布函数的方法。

英文摘要

This paper establishes a strict mathematical relationship between an arbitrary continuous function on a compact set and its global minima, like the well-known first order optimality condition for convex and differentiable functions. By introducing a class of nascent minima distribution functions that is only related to the target function and the given compact set, we construct a sequence that monotonically converges to the global minima on that given compact set. Then, we further consider some various sequences of sets where each sequence monotonically shrinks from the original compact set to the set of all global minimizers, and the shrink rate can be determined for continuously differentiable functions. Finally, we provide a different way of constructing the nascent minima distribution functions.

1807.00251 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Trust-Region Algorithms for Training Responses: Machine Learning Methods Using Indefinite Hessian Approximations

训练响应的信赖域算法:使用不定Hessian近似的机器学习方法

Jennifer B. Erway, Joshua Griffin, Roummel F. Marcia, Riadh Omheni

发表机构 * Department of Mathematics, Wake Forest University(威克森林大学数学系) Department of Applied Mathematics, University of California, Merced(加州大学默塞德分校应用数学系)

AI总结 本文提出了一种基于拟牛顿信赖域框架的机器学习方法,用于解决允许不定Hessian近似的大规模优化问题,通过数值实验展示了其在固定计算时间预算下优于传统有限内存BFGS和Hessian-free方法的性能。

详情
AI中文摘要

机器学习(ML)问题通常被表述为高度非线性和非凸的无约束优化问题。基于随机梯度下降的ML问题求解方法易于扩展到非常大的问题,但可能需要微调许多超参数。基于有限内存Broyden-Fletcher-Goldfarb-Shanno(BFGS)更新的拟牛顿方法通常不需要手动调整超参数,但其缺点是用正定矩阵去近似可能不定的Hessian。Hessian-free方法利用了无需整个Hessian矩阵即可执行Hessian-向量乘法的能力,但每次迭代的复杂度显著高于拟牛顿方法。在本文中,我们提出了一种基于拟牛顿信赖域框架的替代方法,用于解决允许不定Hessian近似的大规模优化问题。在标准测试数据集上的数值实验表明,在固定计算时间预算下,所提出的方法比传统有限内存BFGS和Hessian-free方法表现更好。

英文摘要

Machine learning (ML) problems are often posed as highly nonlinear and nonconvex unconstrained optimization problems. Methods for solving ML problems based on stochastic gradient descent are easily scaled for very large problems but may involve fine-tuning many hyper-parameters. Quasi-Newton approaches based on the limited-memory Broyden-Fletcher-Goldfarb-Shanno (BFGS) update typically do not require manually tuning hyper-parameters but suffer from approximating a potentially indefinite Hessian with a positive-definite matrix. Hessian-free methods leverage the ability to perform Hessian-vector multiplication without needing the entire Hessian matrix, but each iteration's complexity is significantly greater than quasi-Newton methods. In this paper we propose an alternative approach for solving ML problems based on a quasi-Newton trust-region framework for solving large-scale optimization problems that allow for indefinite Hessian approximations. Numerical experiments on a standard testing data set show that with a fixed computational time budget, the proposed methods achieve better results than the traditional limited-memory BFGS and the Hessian-free methods.

1905.07619 2026-06-04 math.NA cs.LG cs.NA math.DS 版本更新

A Discrete Empirical Interpolation Method for Interpretable Immersion and Embedding of Nonlinear Manifolds

一种用于非线性流形可解释沉浸与嵌入的离散经验插值方法

Samuel E. Otto, Clarence W. Rowley

发表机构 * Mechanical and Aerospace Engineering, Princeton University(普林斯顿大学机械与航空航天工程系)

AI总结 本文提出了一种扩展的离散经验插值方法(DEIM),用于在非线性流形上进行可解释的沉浸与嵌入,通过同时应用 pivoted QR 过程于局部线性近似块,实现一种新的同时 pivoted QR(SimPQR)算法,从而在真实数据中有效应用非线性 DEIM(NLDEIM)坐标。

Comments Minor typos corrected in version 2

详情
AI中文摘要

流形学习技术旨在发现将高维数据映射到低维空间的结构保持映射。尽管这些映射指定的新坐标能够紧密参数化数据,但它们通常是原始变量的复杂非线性函数,这使得它们在物理上难以解释。此外,在数据驱动的模型降阶应用中,控制方程的结构可能在非线性映射到惯性流形上的坐标时被破坏,从而形成仿真的计算瓶颈。相反,我们提出识别一小组原始变量,这些变量能够通过局部浸入或全局嵌入唯一确定所有其他变量。当数据位于低维子空间时,现有的离散经验插值方法(DEIM)可以做到这一点,其最近的变种使用基于 pivoted QR(PQR)分解的贪心算法。然而,来自各种应用的低维流形,特别是来自对流主导的偏微分方程的流形,并不位于或接近任何低维子空间。我们提出的方法通过在构成流形局部线性近似的各个块上同时应用类似的 pivoted QR 过程,将 DEIM 扩展到位于非线性流形附近的数据,从而得到一种新的同时 pivoted QR(SimPQR)算法。SimPQR 提供的浸入可以通过对修改后的向量集合再次应用 SimPQR 来扩展为嵌入。用于计算这些“非线性 DEIM”(NLDEIM)坐标的 SimPQR 方法已成功应用于位于圆柱尾流惯性流形附近的真实数据,以及来自不同初始条件下粘性 Burgers 方程的数据。

英文摘要

Manifold learning techniques seek to discover structure-preserving mappings of high-dimensional data into low-dimensional spaces. While the new sets of coordinates specified by these mappings can closely parameterize the data, they are generally complicated nonlinear functions of the original variables. This makes them difficult to interpret physically. Furthermore, in data-driven model reduction applications the governing equations may have structure that is destroyed by nonlinear mapping into coordinates on an inertial manifold, creating a computational bottleneck for simulations. Instead, we propose to identify a small collection of the original variables which are capable of uniquely determining all others either locally via immersion or globally via embedding of the underlying manifold. When the data lies on a low-dimensional subspace the existing discrete empirical interpolation method (DEIM) accomplishes this with recent variants employing greedy algorithms based on pivoted QR (PQR) factorizations. However, low-dimensional manifolds coming from a variety of applications, particularly from advection-dominated PDEs, do not lie in or near any low-dimensional subspace. Our proposed approach extends DEIM to data lying near nonlinear manifolds by applying a similar pivoted QR procedure simultaneously on collections of patches making up locally linear approximations of the manifold, resulting in a novel simultaneously pivoted QR (SimPQR) algorithm. The immersion provided by SimPQR can be extended to an embedding by applying SimPQR a second time to a modified collection of vectors. The SimPQR method for computing these `nonlinear DEIM' (NLDEIM) coordinates is successfully applied to real-world data lying near an inertial manifold in a cylinder wake flow as well as data coming from a viscous Burgers equation with different initial conditions.

1905.07501 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Enforcing constraints for time series prediction in supervised, unsupervised and reinforcement learning

在监督、无监督和强化学习中对时间序列预测施加约束

Panos Stinis

发表机构 * Advanced Computing, Mathematics and Data Division, Pacific Northwest National Laboratory(太平洋西北国家实验室高级计算、数学与数据部门)

AI总结 本文研究了如何在监督、无监督和强化学习中通过施加来自动态系统的约束来加速深度神经网络训练并提高其预测能力,主要贡献是提出了一种基于动作价值函数同伦的新型方法来稳定和加速强化学习训练。

Comments 30 pages, 5 figures

详情
AI中文摘要

我们假设我们给定了一个来自动态系统的数据时间序列,我们的任务是学习动态系统的流映射。我们提出了关于如何施加来自动态系统的约束以加速深度神经网络训练并提高其预测能力的一系列结果。特别是,我们为监督、无监督和强化学习三种主要学习模式提供了在训练过程中施加约束的方法。一般来说,动态约束需要包括类似于模型降阶形式中的记忆项。这些记忆项起到恢复力的作用,纠正由学习的流映射在预测过程中犯的错误。对于监督学习,约束被添加到目标函数中。对于无监督学习,特别是生成对抗网络,约束是通过增强判别器的输入引入的。最后,对于强化学习,特别是actor-critic方法,约束被添加到奖励函数中。此外,对于强化学习情况,我们提出了一种基于动作价值函数同伦的新方法,以稳定和加速训练。我们使用洛伦兹系统数值结果来说明各种构造。

英文摘要

We assume that we are given a time series of data from a dynamical system and our task is to learn the flow map of the dynamical system. We present a collection of results on how to enforce constraints coming from the dynamical system in order to accelerate the training of deep neural networks to represent the flow map of the system as well as increase their predictive ability. In particular, we provide ways to enforce constraints during training for all three major modes of learning, namely supervised, unsupervised and reinforcement learning. In general, the dynamic constraints need to include terms which are analogous to memory terms in model reduction formalisms. Such memory terms act as a restoring force which corrects the errors committed by the learned flow map during prediction. For supervised learning, the constraints are added to the objective function. For the case of unsupervised learning, in particular generative adversarial networks, the constraints are introduced by augmenting the input of the discriminator. Finally, for the case of reinforcement learning and in particular actor-critic methods, the constraints are added to the reward function. In addition, for the reinforcement learning case, we present a novel approach based on homotopy of the action-value function in order to stabilize and accelerate training. We use numerical results for the Lorenz system to illustrate the various constructions.

1905.07436 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML 版本更新

A Dynamical Systems Perspective on Nesterov Acceleration

从动力系统角度看待Nesterov加速法

Michael Muehlebach, Michael I. Jordan

发表机构 * Electrical Engineering and Computer Science Department, UC Berkeley, Berkeley, California, USA(加州大学伯克利分校电子工程与计算机科学系)

AI总结 本文提出一个动力系统框架来理解Nesterov加速梯度方法,通过分析连续时间动力学和离散化过程,揭示了曲率依赖的阻尼项是加速现象的核心,并建立了离散和连续时间动力学之间的联系。

Comments 11 pages, 4 figures, to appear in the Proceedings of the 36th International Conference on Machine Learning

详情
AI中文摘要

我们提出一个动力系统框架来理解Nesterov加速梯度方法。与以往工作不同,我们的推导不依赖于步长消失的论证。我们展示Nesterov加速源于对常微分方程的半隐式欧拉积分方案的离散化。我们分析了底层微分方程及其离散化,以获得对加速现象的见解。分析表明,曲率依赖的阻尼项是该现象的核心。我们进一步建立了离散和连续时间动力学之间的联系。

英文摘要

We present a dynamical system framework for understanding Nesterov's accelerated gradient method. In contrast to earlier work, our derivation does not rely on a vanishing step size argument. We show that Nesterov acceleration arises from discretizing an ordinary differential equation with a semi-implicit Euler integration scheme. We analyze both the underlying differential equation as well as the discretization to obtain insights into the phenomenon of acceleration. The analysis suggests that a curvature-dependent damping term lies at the heart of the phenomenon. We further establish connections between the discretized and the continuous-time dynamics.
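
As an illustration of the discretization idea, the sketch below writes the standard Nesterov iteration in position/velocity form: the velocity is updated first from a gradient taken at a look-ahead point, and the position is then advanced with the new velocity, which is the pattern of a semi-implicit Euler step. This is not the paper's exact differential equation or constants; the function name `nesterov`, the diagonal quadratic test problem, and the choices `step = 1/L` and `beta = (sqrt(kappa) - 1) / (sqrt(kappa) + 1)` are assumptions taken from the textbook method for strongly convex problems.

```python
import math


def nesterov(grad, x0, step, beta, iters):
    """Nesterov's method written as a velocity update followed by a position update."""
    x = list(x0)
    v = [0.0] * len(x)
    for _ in range(iters):
        # Gradient is evaluated at the look-ahead point x + beta * v.
        lookahead = [xi + beta * vi for xi, vi in zip(x, v)]
        g = grad(lookahead)
        # Velocity first: damping by beta plus the gradient "force".
        v = [beta * vi - step * gi for vi, gi in zip(v, g)]
        # Position second, using the freshly updated velocity.
        x = [xi + vi for xi, vi in zip(x, v)]
    return x


# Hypothetical test problem: f(x) = 0.5 * (1 * x0^2 + 10 * x1^2).
CURVATURES = [1.0, 10.0]


def grad(x):
    return [c * xi for c, xi in zip(CURVATURES, x)]


L_SMOOTH, MU = max(CURVATURES), min(CURVATURES)
kappa = L_SMOOTH / MU
beta = (math.sqrt(kappa) - 1.0) / (math.sqrt(kappa) + 1.0)
x_final = nesterov(grad, [1.0, 1.0], step=1.0 / L_SMOOTH, beta=beta, iters=200)
```

Setting `beta = 0` recovers plain gradient descent, which makes it easy to compare the two on the same problem.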

1902.01119 2026-06-04 cs.AI cs.CL cs.LG cs.SY eess.SY 版本更新

The Natural Language of Actions

动作的自然语言

Guy Tennenholtz, Shie Mannor

发表机构 * Faculty of Electrical Engineering, Technion Institute of Technology, Israel(以色列理工学院电气工程学院,以色列)

AI总结 本文提出Act2Vec框架,用于学习基于上下文的动作表示以提升强化学习性能,通过将相似动作分组并利用动作间的关系来改进Q值近似和状态表示。

Comments Published in the proceedings of the 36th International Conference on Machine Learning (ICML 2019)

详情
AI中文摘要

我们介绍了Act2Vec,一种通用框架,用于学习基于上下文的动作表示以用于强化学习。在向量空间中表示动作有助于强化学习算法通过将相似动作分组并利用不同动作之间的关系来实现更好的性能。我们展示了如何从演示中提取环境的先验知识,并将其注入到编码自然兼容行为的动作向量表示中。然后我们利用这些表示来增强状态表示以及改进Q值的函数逼近。我们还在三个领域中可视化和测试了动作嵌入:绘画任务、高维导航任务以及星际争霸II中的大规模动作空间领域。

英文摘要

We introduce Act2Vec, a general framework for learning context-based action representations for Reinforcement Learning. Representing actions in a vector space helps reinforcement learning algorithms achieve better performance by grouping similar actions and utilizing relations between different actions. We show how prior knowledge of an environment can be extracted from demonstrations and injected into action vector representations that encode natural compatible behavior. We then use these for augmenting state representations as well as improving function approximation of Q-values. We visualize and test action embeddings in three domains including a drawing task, a high dimensional navigation task, and the large action space domain of StarCraft II.

1905.06978 2026-06-04 eess.SY cs.LG cs.RO cs.SY stat.AP 版本更新

Randomized Algorithms for Data-Driven Stabilization of Stochastic Linear Systems

用于随机线性系统数据驱动稳定化的随机算法

Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

AI总结 本文研究了两种用于随机线性系统数据驱动稳定化的随机算法,通过数值分析考察了随机反馈和随机参数方法的稳定速度和失败概率,结果表明只要统计独立的随机化次数不太少,就可以保证快速稳定化。

详情
AI中文摘要

针对参数未知的动态系统的数据驱动控制策略在理论和应用中都很流行。一个关键问题是防止随机线性系统因决策者对动态参数的不确定性而失稳。针对这个问题已有两种随机算法被提出,但其性能尚未得到充分研究。此外,算法中关键参数(如施加随机化的幅度和频率)的影响目前尚不明确。本文研究了数据驱动过程的稳定速度和失败概率。我们对两种方法(随机反馈和随机参数)的性能进行了数值分析。所呈现的结果表明,只要统计独立的随机化次数不太少,就可以保证快速稳定化。

英文摘要

Data-driven control strategies for dynamical systems with unknown parameters are popular in theory and applications. An essential problem is to prevent stochastic linear systems becoming destabilized, due to the uncertainty of the decision-maker about the dynamical parameter. Two randomized algorithms are proposed for this problem, but the performance is not sufficiently investigated. Further, the effect of key parameters of the algorithms such as the magnitude and the frequency of applying the randomizations is not currently available. This work studies the stabilization speed and the failure probability of data-driven procedures. We provide numerical analyses for the performance of two methods: stochastic feedback, and stochastic parameter. The presented results imply that as long as the number of statistically independent randomizations is not too small, fast stabilization is guaranteed.

1810.05247 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Real-time Faulted Line Localization and PMU Placement in Power Systems through Convolutional Neural Networks

通过卷积神经网络实现电力系统中的实时故障线路定位与PMU布置

Wenting Li, Deepjyoti Deka, Michael Chertkov, Meng Wang

发表机构 * Theory Division and the Center for Nonlinear Studies, Los Alamos National Laboratory(理论部和非线性研究中心,洛斯阿拉莫斯国家实验室)

AI总结 本文提出基于卷积神经网络的故障线路定位方法,利用母线电压特征提高鲁棒性,并提出联合PMU布置策略,通过不同类型的故障模拟验证了在低可观测性条件下高精度的故障定位能力。

Comments 11 pages, 8 figures

详情
AI中文摘要

多样化的故障类型、快速的重合闸和故障后复杂的暂态状态使得电力电网中的实时故障定位具有挑战性。现有定位技术依赖于静态负载等简化假设或需要更高的采样率或总测量可用性。本文提出了一种基于卷积神经网络(CNN)分类器的故障线路定位方法,利用母线电压。与以往的数据驱动方法不同,所提出的分类器基于具有物理解释的特征,提高了定位性能的鲁棒性。我们的基于CNN的定位工具的准确性明显优于文献中的其他机器学习分类器。为了进一步提高定位性能,提出了一种联合相量测量单元(PMU)布置策略,并与其他方法进行了验证。我们方法的一个重要方面是,在非常低的可观测性(7%的母线)下,算法仍能以高概率将故障线路定位到小的邻域。通过在IEEE 39母线和68母线电力系统中不同类型的故障模拟,验证了在变化的不确定条件、系统可观测性和测量质量下的方案性能。

英文摘要

Diverse fault types, fast re-closures, and complicated transient states after a fault event make real-time fault location in power grids challenging. Existing localization techniques in this area rely on simplistic assumptions, such as static loads, or require much higher sampling rates or total measurement availability. This paper proposes a faulted line localization method based on a Convolutional Neural Network (CNN) classifier using bus voltages. Unlike prior data-driven methods, the proposed classifier is based on features with physical interpretations that improve the robustness of the location performance. The accuracy of our CNN based localization tool is demonstrably superior to other machine learning classifiers in the literature. To further improve the location performance, a joint phasor measurement units (PMU) placement strategy is proposed and validated against other methods. A significant aspect of our methodology is that under very low observability (7% of buses), the algorithm is still able to localize the faulted line to a small neighborhood with high probability. The performance of our scheme is validated through simulations of faults of various types in the IEEE 39-bus and 68-bus power systems under varying uncertain conditions, system observability, and measurement quality.

1810.03076 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Online Center of Mass Estimation for a Humanoid Wheeled Inverted Pendulum Robot

人形轮式倒立摆机器人在线质心估计

Munzir Zafar, Akash Patel, Bogdan Vlahov, Nathaniel Glaser, Sergio Aguillera, Seth Hutchinson

AI总结 本文提出了一种新颖的鲁棒控制与在线学习结合的方法,用于平衡具有n自由度的轮式倒立摆人形机器人,通过在线学习更新质量模型以获得更准确的质心估计,实验表明该方法提升了整体控制效果和效率。

详情
AI中文摘要

我们提出了一种新颖的鲁棒控制和在线学习应用,用于平衡具有n个自由度(DoF)的轮式倒立摆(WIP)人形机器人。我们的技术将质量模型的不准确性归结为质心(CoM)误差,在存在该误差的情况下实现平衡,同时利用在线学习更新质量模型以获得更好的质心估计。使用我们机器人的仿真模型,我们元学习了一组激励关节姿态,使我们的梯度下降算法快速收敛到准确的质心估计。该仿真流程完全在线执行,使用自抗扰控制来处理由持续演变的质量模型所产生的质量误差。实验在一个19自由度的WIP上进行,我们手动采集了学习得到的姿态集的数据,并展示了由梯度下降得到的质量模型所给出的质心估计能够改善整体控制和效率。本工作丰富了Golem Krang人形机器人全身控制方面的研究。

英文摘要

We present a novel application of robust control and online learning for the balancing of an n Degree of Freedom (DoF), Wheeled Inverted Pendulum (WIP) humanoid robot. Our technique condenses the inaccuracies of a mass model into a Center of Mass (CoM) error, balances despite this error, and uses online learning to update the mass model for a better CoM estimate. Using a simulated model of our robot, we meta-learn a set of excitatory joint poses that makes our gradient descent algorithm quickly converge to an accurate CoM estimate. This simulated pipeline executes in a fully online fashion, using active disturbance rejection to address the mass errors that result from a steadily evolving mass model. Experiments were performed on a 19 DoF WIP, in which we manually acquired the data for the learned set of poses and show that the mass model produced by gradient descent yields a CoM estimate that improves overall control and efficiency. This work contributes to a greater corpus of whole body control on the Golem Krang humanoid robot.

1905.05380 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Control Regularization for Reduced Variance Reinforcement Learning

减少方差的强化学习中的控制正则化

Richard Cheng, Abhinav Verma, Gabor Orosz, Swarat Chaudhuri, Yisong Yue, Joel W. Burdick

发表机构 * California Institute of Technology, Pasadena, CA(加州理工学院) University of Michigan, Ann Arbor, MI(密歇根大学) Rice University, Houston, TX(莱斯大学)

AI总结 本文提出了一种功能正则化方法,用于减少连续控制中强化学习的方差,通过正则化深度策略的行为与先验策略相似,从而在偏倚-方差权衡中实现更稳定的动态稳定性和更高效的训练。

Comments Appearing in ICML 2019

详情
AI中文摘要

在模型无关强化学习(RL)中,处理高方差是一个重要的挑战。现有方法不可靠,使用不同初始化/种子时性能表现方差较大。针对连续控制中出现的问题,我们提出了一种功能正则化方法来增强模型无关RL。具体而言,我们正则化深度策略的行为与先验策略相似,即在函数空间中进行正则化。我们证明功能正则化会产生偏倚-方差权衡,并提出了一种自适应调节策略来优化这种权衡。当策略先验具有控制理论稳定性保证时,我们进一步证明这种正则化在整个学习过程中近似保持这些稳定性保证。我们在多种设置中通过实验证明了我们的方法,展示了显著减少的方差、保证的动态稳定性和比深度RL更高效的训练。

英文摘要

Dealing with high variance is a significant challenge in model-free reinforcement learning (RL). Existing methods are unreliable, exhibiting high variance in performance from run to run using different initializations/seeds. Focusing on problems arising in continuous control, we propose a functional regularization approach to augmenting model-free RL. In particular, we regularize the behavior of the deep policy to be similar to a policy prior, i.e., we regularize in function space. We show that functional regularization yields a bias-variance trade-off, and propose an adaptive tuning strategy to optimize this trade-off. When the policy prior has control-theoretic stability guarantees, we further show that this regularization approximately preserves those stability guarantees throughout learning. We validate our approach empirically on a range of settings, and demonstrate significantly reduced variance, guaranteed dynamic stability, and more efficient learning than deep RL alone.
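
One simple way to picture regularizing a learned policy toward a policy prior is to blend the two actions with a weight `lam`. The sketch below is only an assumed illustration of that trade-off, not a statement of the authors' algorithm: the function name `mixed_action`, the blending formula, and the sample values are hypothetical. With `lam = 0` the learned policy acts alone (no bias from the prior, full variance), and as `lam` grows the action approaches the prior (lower variance, more bias).

```python
def mixed_action(u_learned, u_prior, lam):
    """Blend a learned action with a prior action.

    lam >= 0 is an assumed regularization weight: the learned action gets
    weight 1 / (1 + lam) and the prior action gets lam / (1 + lam).
    """
    if lam < 0.0:
        raise ValueError("lam must be non-negative")
    w = 1.0 / (1.0 + lam)
    return [w * a + (1.0 - w) * b for a, b in zip(u_learned, u_prior)]


# Hypothetical one-dimensional control example.
only_learned = mixed_action([2.0], [0.0], lam=0.0)
halfway = mixed_action([2.0], [0.0], lam=1.0)
mostly_prior = mixed_action([2.0], [0.0], lam=99.0)
```

An adaptive scheme of the kind the abstract describes would adjust `lam` during training rather than fixing it in advance.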

1712.09718 2026-06-04 stat.CO cs.LG cs.SY eess.SY 版本更新

Directional Statistics and Filtering Using libDirectional

基于libDirectional的方向统计与滤波

Gerhard Kurz, Igor Gilitschenski, Florian Pfaff, Lukas Drude, Uwe D. Hanebeck, Reinhold Haeb-Umbach, Roland Y. Siegwart

发表机构 * Karlsruhe Institute of Technology (KIT)(卡尔斯鲁厄理工学院) ETH Zurich(苏黎世联邦理工学院) University of Paderborn(帕德博恩大学)

AI总结 本文介绍了libDirectional,一个用于方向统计和方向估计的MATLAB库,支持单位圆上常用的分布,如von Mises、wrapped normal和wrapped Cauchy分布,以及更高维流形(如单位超球面和超环面)上的分布,并基于这些分布实现了多种递归滤波算法。

Comments Version accepted for Publication in the Journal of Statistical Software

详情
AI中文摘要

在本文中,我们介绍了libDirectional,一个用于方向统计和方向估计的MATLAB库。它支持单位圆上各种常用分布,如von Mises、wrapped normal和wrapped Cauchy分布。此外,还提供了更高维流形上的分布,如单位超球面和超环面。基于这些分布,libDirectional中的几种递归滤波算法允许在这些流形上进行估计。该功能以清晰、文档齐全且面向对象的结构实现,易于使用且易于扩展。

英文摘要

In this paper, we present libDirectional, a MATLAB library for directional statistics and directional estimation. It supports a variety of commonly used distributions on the unit circle, such as the von Mises, wrapped normal, and wrapped Cauchy distributions. Furthermore, various distributions on higher-dimensional manifolds such as the unit hypersphere and the hypertorus are available. Based on these distributions, several recursive filtering algorithms in libDirectional allow estimation on these manifolds. The functionality is implemented in a clear, well-documented, and object-oriented structure that is both easy to use and easy to extend.

1905.04351 2026-06-04 cs.LG cs.NA math.NA physics.comp-ph physics.data-an stat.ML 版本更新

Solving Irregular and Data-enriched Differential Equations using Deep Neural Networks

使用深度神经网络求解不规则和数据丰富的微分方程

Craig Michoski, Milos Milosavljevic, Todd Oliver, David Hatch

发表机构 * Oden Institute for Computational Engineering and Sciences, University of Texas at Austin, Austin, TX 78712(Oden计算工程与科学研究所,德克萨斯大学奥斯汀分校,奥斯汀,TX 78712) Department of Astronomy, The University of Texas at Austin, Austin, TX 78712(天文学系,德克萨斯大学奥斯汀分校,奥斯汀,TX 78712) Institute for Fusion Studies, University of Texas at Austin, Austin, TX 78712(聚变研究所,德克萨斯大学奥斯汀分校,奥斯汀,TX 78712)

AI总结 本文提出了一种利用深度神经网络求解不规则和数据丰富的微分方程的方法,通过分析Sod激波管解和压缩磁流体动力学中的激波解,展示了该方法在提高数值方法性能和参数空间探索方面的优势。

Comments 21 pages, 14 figures, 3 tables

详情
AI中文摘要

近期的研究提出了一种简单的数值方法,用于利用深度神经网络(DNN)求解偏微分方程(PDEs)。本文回顾并扩展了该方法,并将其应用于分析数值PDEs和非线性分析中最基本的特征之一:不规则解。首先,讨论并分析了Sod激波管解到可压缩欧拉方程的解,然后与传统有限元和有限体积方法进行比较。这些方法被扩展以考虑性能改进和同时的参数空间探索。接下来,解决了一个压缩磁流体动力学(MHD)中的激波解,并在实验数据被用来增强一个原本不足以验证的PDE系统的情况下使用。这通过在模型PDE系统中加入源项并使用监督训练在合成实验数据上实现。所得到的DNN框架似乎在系统原型化方面表现出几乎幻想般的易用性,能够自然整合大规模数据集(无论是合成还是实验数据),同时能够同时进行整个参数空间的单次探索。

英文摘要

Recent work has introduced a simple numerical method for solving partial differential equations (PDEs) with deep neural networks (DNNs). This paper reviews and extends the method while applying it to analyze one of the most fundamental features in numerical PDEs and nonlinear analysis: irregular solutions. First, the Sod shock tube solution to compressible Euler equations is discussed, analyzed, and then compared to conventional finite element and finite volume methods. These methods are extended to consider performance improvements and simultaneous parameter space exploration. Next, a shock solution to compressible magnetohydrodynamics (MHD) is solved for, and used in a scenario where experimental data is utilized to enhance a PDE system that is \emph{a priori} insufficient to validate against the observed/experimental data. This is accomplished by enriching the model PDE system with source terms and using supervised training on synthetic experimental data. The resulting DNN framework for PDEs seems to demonstrate almost fantastical ease of system prototyping, natural integration of large data sets (be they synthetic or experimental), all while simultaneously enabling single-pass exploration of the entire parameter space.

1905.04152 2026-06-04 eess.SY cs.LG cs.NI cs.SY stat.ML 版本更新

Massive Autonomous UAV Path Planning: A Neural Network Based Mean-Field Game Theoretic Approach

大规模自主无人机路径规划:一种基于神经网络的均场博弈理论方法

Hamid Shiri, Jihong Park, Mehdi Bennis

发表机构 * Centre for Wireless Communications, University of Oulu, Finland(奥卢大学无线通信中心,芬兰)

AI总结 本文研究了大规模无人机在关键任务中的自主控制问题,提出了一种基于神经网络的均场博弈理论方法,通过减少无人机状态交换次数和降低计算能耗来实现高效路径规划。

Comments 6 pages, 5 figures, submitted to IEEE GLOBECOM 2019

详情
AI中文摘要

本文研究了大规模无人驾驶航空器(UAVs)在关键任务中的自主控制问题,例如从源点向目的地派遣大量UAVs执行灭火任务。在风扰动下实现快速移动和低运动能耗,同时避免UAV间碰撞,是一项艰巨的控制任务,这会因实时交换UAV状态而带来巨大的通信能耗。我们利用均场博弈(MFG)理论控制方法来解决这个问题,该方法要求UAVs仅在初始源点交换一次状态。此后,每个UAV可以通过本地求解两个偏微分方程(PDEs)来控制其加速度,即哈密尔顿-雅可比-贝尔曼(HJB)方程和福克-普朗克-柯尔莫哥洛夫(FPK)方程。然而,这种方法在求解PDEs时带来了巨大的计算能耗,特别是在UAV状态为多维的情况下。我们使用机器学习(ML)方法来解决这个问题,其中两个独立的ML模型分别近似HJB和FPK方程的解。这些ML模型使用计算复杂度较低的在线梯度下降方法进行训练和使用。数值评估验证了所提出的ML辅助MFG理论算法(称为MFG学习控制)在碰撞避免方面是有效的,且通信能耗低、计算能耗可接受。

英文摘要

This paper investigates the autonomous control of massive unmanned aerial vehicles (UAVs) for mission-critical applications (e.g., dispatching many UAVs from a source to a destination for firefighting). Achieving their fast travel and low motion energy without inter-UAV collision under wind perturbation is a daunting control task, which incurs huge communication energy for exchanging UAV states in real time. We tackle this problem by exploiting a mean-field game (MFG) theoretic control method that requires the UAV state exchanges only once at the initial source. Afterwards, each UAV can control its acceleration by locally solving two partial differential equations (PDEs), known as the Hamilton-Jacobi-Bellman (HJB) and Fokker-Planck-Kolmogorov (FPK) equations. This approach, however, brings about huge computation energy for solving the PDEs, particularly under multi-dimensional UAV states. We address this issue by utilizing a machine learning (ML) method where two separate ML models approximate the solutions of the HJB and FPK equations. These ML models are trained and exploited using an online gradient descent method with low computational complexity. Numerical evaluations validate that the proposed ML aided MFG theoretic algorithm, referred to as MFG learning control, is effective in collision avoidance with low communication energy and acceptable computation energy.

1612.01597 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT 版本更新

Deterministic and Probabilistic Conditions for Finite Completability of Low-Tucker-Rank Tensor

低 Tucker 秩张量有限可补全的确定性和概率条件

Morteza Ashraphijuo, Vaneet Aggarwal, Xiaodong Wang

发表机构 * Department of Electrical Engineering, Columbia University(哥伦比亚大学电气工程系) School of IE, Purdue University(普渡大学工业工程学院)

AI总结 本文研究了在给定 Tucker 秩的部分分量的情况下,低秩张量有限可补全所需的采样模式基本条件。通过在 Tucker 流形上进行代数几何分析,提出了确定性必要和充分条件,同时研究了概率条件并给出了采样概率的下界,以确保所提出的确定性条件以高概率成立。此外,利用所提出的几何方法,提出了一个保证采样张量有唯一补全的采样模式充分条件。

详情
AI中文摘要

我们研究了在给定 Tucker 秩的部分分量的情况下,低秩张量有限可补全所需的采样模式(即被采样元素的位置)的基本条件。为了找到确定性必要和充分条件,我们提出了一种在 Tucker 流形上的代数几何分析,这与传统在 Grassmannian 流形上的几何方法不同,允许在分析中纳入多个秩分量。这种分析刻画了一组基于采样模式定义的多项式的代数独立性,这与有限补全密切相关。然后研究了概率条件,并给出了采样概率的下界,该下界保证所提出的关于采样模式的有限可补全确定性条件以高概率成立。此外,利用所提出的有限可补全几何方法,我们提出了一种关于采样模式的充分条件,该条件保证采样张量存在唯一的补全。

英文摘要

We investigate the fundamental conditions on the sampling pattern, i.e., locations of the sampled entries, for finite completability of a low-rank tensor given some components of its Tucker rank. In order to find the deterministic necessary and sufficient conditions, we propose an algebraic geometric analysis on the Tucker manifold, which allows us to incorporate multiple rank components in the proposed analysis in contrast with the conventional geometric approaches on the Grassmannian manifold. This analysis characterizes the algebraic independence of a set of polynomials defined based on the sampling pattern, which is closely related to finite completion. Probabilistic conditions are then studied and a lower bound on the sampling probability is given, which guarantees that the proposed deterministic conditions on the sampling patterns for finite completability hold with high probability. Furthermore, using the proposed geometric approach for finite completability, we propose a sufficient condition on the sampling pattern that ensures there exists exactly one completion for the sampled tensor.

1706.07119 2026-06-04 eess.SY cs.LG cs.SY 版本更新

"Parallel Training Considered Harmful?": Comparing series-parallel and parallel feedforward network training

并行训练是否有害?:比较系列-并行与并行前馈网络训练

Antônio H. Ribeiro, Luis A. Aguirre

发表机构 * Department of Electronic Engineering at Universidade Federal de Minas Gerais (UFMG) - Av. Antônio Carlos 6627, 31270-901, Belo Horizonte, MG, Brazil

AI总结 本文比较了系列-并行和并行前馈网络训练方法,探讨了其在鲁棒性、计算成本和收敛性方面的表现,发现并行训练在更现实的场景中表现更优。

详情
Journal ref Neurocomputing 316:222-231 (2018)
AI中文摘要

动态系统神经网络模型可以并行或系列-并行配置进行训练。受早期论述影响,一些论文认为系列-并行配置比并行配置更优,因其计算成本更低、训练稳定性更好且结果更准确。另一方面,其他研究则认为并行训练更稳健,能产生更准确的长期预测。本文的主要贡献是通过统一框架比较两种方法。我们关注三个方面:i)在噪声下的估计鲁棒性;ii)计算成本;iii)收敛性。统一的数学框架和模拟研究显示了每种训练方法在不同情境下的验证结果,发现并行训练在更现实的场景中表现更优。一个使用测量数据的例子似乎支持这一结论。我们还通过新的复杂度分析和数值示例表明,两种方法的计算成本相似,但系列-并行训练更易于并行化。一些关于稳定性和收敛性性质的非正式讨论也在示例中进行了探讨。

英文摘要

Neural network models for dynamic systems can be trained either in parallel or in series-parallel configurations. Influenced by early arguments, several papers justify the choice of series-parallel rather than parallel configuration, claiming it has a lower computational cost, better stability properties during training, and provides more accurate results. Other published results, on the other hand, defend parallel training as being more robust and capable of yielding more accurate long-term predictions. The main contribution of this paper is to present a study comparing both methods under the same unified framework. We focus on three aspects: i) robustness of the estimation in the presence of noise; ii) computational cost; and iii) convergence. A unifying mathematical framework and simulation studies show situations where each training method provides better validation results, with parallel training being better in what are believed to be more realistic scenarios. An example using measured data seems to reinforce this claim. We also show, with a novel complexity analysis and numerical examples, that both methods have similar computational cost, although series-parallel training is more amenable to parallelization. Some informal discussion about stability and convergence properties is presented and explored in the examples.

1809.07412 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Learning, Planning, and Control in a Monolithic Neural Event Inference Architecture

在单体神经事件推理架构中的学习、规划与控制

Martin V. Butz, David Bilkey, Dania Humaidan, Alistair Knott, Sebastian Otte

发表机构 * Cognitive Modeling Group Computer Science Department University of Tübingen(图宾根大学认知建模组计算机科学系)

AI总结 该研究提出了一种单体神经事件推理架构REPRISE,通过学习动态系统的时序事件预测模型,结合回顾和前瞻推理,实现对传感器运动动态的高效预测与控制。

Comments This is the final revision submitted to the Neural Networks journal. The revision mainly includes improvements in language, explanation, and additional references and system relations

详情
AI中文摘要

我们引入了REPRISE,一种回顾和前瞻推理方案,用于学习动态系统的时序事件预测模型。REPRISE推断出不可观测的上下文事件状态及其最佳解释最近遭遇的传感器运动经验的时序预测模型。同时,它以目标导向的方式优化即将到来的运动活动。在此,REPRISE通过循环神经网络(RNN)实现,该网络学习由不同模拟动态车辆生成的传感器运动连续性的时序前向模型。RNN通过上下文神经元增强,能够编码不同但相关的传感器运动动态为紧凑的事件代码。我们证明REPRISE能够同时学习分离和近似遇到的传感器运动动态:它分析传感器运动误差信号,同时适应内部上下文神经活动和连接权重值。此外,我们证明REPRISE可以利用所学模型诱导目标导向的模型预测控制,即近似主动推理:给定一个目标状态,系统想象一个优化该状态的运动命令序列,以最小化与目标的距离。RNN活动因此持续想象即将到来的未来并反思最近的过去,优化预测模型、隐藏神经状态活动和即将到来的运动活动。结果,事件预测神经编码得以发展,从而能够调用高效且适应性强的目标导向传感器运动控制。

英文摘要

We introduce REPRISE, a REtrospective and PRospective Inference SchEme, which learns temporal event-predictive models of dynamical systems. REPRISE infers the unobservable contextual event state and accompanying temporal predictive models that best explain the recently encountered sensorimotor experiences retrospectively. Meanwhile, it optimizes upcoming motor activities prospectively in a goal-directed manner. Here, REPRISE is implemented by a recurrent neural network (RNN), which learns temporal forward models of the sensorimotor contingencies generated by different simulated dynamic vehicles. The RNN is augmented with contextual neurons, which enable the encoding of distinct, but related, sensorimotor dynamics as compact event codes. We show that REPRISE concurrently learns to separate and approximate the encountered sensorimotor dynamics: it analyzes sensorimotor error signals adapting both internal contextual neural activities and connection weight values. Moreover, we show that REPRISE can exploit the learned model to induce goal-directed, model-predictive control, that is, approximate active inference: Given a goal state, the system imagines a motor command sequence optimizing it with the prospective objective to minimize the distance to the goal. The RNN activities thus continuously imagine the upcoming future and reflect on the recent past, optimizing the predictive model, the hidden neural state activities, and the upcoming motor activities. As a result, event-predictive neural encodings develop, which allow the invocation of highly effective and adaptive goal-directed sensorimotor control.

1803.10986 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Error Analysis and Improving the Accuracy of Winograd Convolution for Deep Neural Networks

深度神经网络中Winograd卷积的误差分析与精度提升

Barbara Barabasz, Andrew Anderson, Kirk M. Soodhalter, David Gregg

发表机构 * School of Computer Science and Statistics(计算机科学与统计学学院) Trinity College Dublin(都柏林三一学院) School of Mathematics(数学学院)

AI总结 本文分析了Winograd卷积算法的误差,并提出改进方法以提高其精度,通过Huffman编码优化求和误差,实验选择采样点并探索混合精度卷积等方法以减少浮点误差。

详情
AI中文摘要

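流行的深度神经网络(DNNs)大部分执行时间用于计算卷积。Winograd算法族可以显著减少所需的算术运算次数,并存在于许多DNN软件框架中。然而,性能提升是以浮点(FP)数值精度的降低为代价。在本文中,我们分析了最坏情况下的FP误差并证明了算法的范数和条件数的估计。我们证明误差界随卷积大小的增加呈指数增长,但改进后的算法的误差界比原始算法更小。我们提出几种减少FP误差的方法。我们提出基于Huffman编码的规范求值顺序以减少求和误差。我们通过实验研究采样“点”的选择,并为最重要的几种尺寸找到了经验上较好的点。我们确定了与优质采样点相关的主要因素。此外,我们还探索了其他减少FP误差的方法,包括混合精度卷积以及跨DNN通道的成对求和。使用我们的方法,可以在给定块大小下显著减少FP误差,从而允许使用更大的块并减少计算量。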

英文摘要

Popular deep neural networks (DNNs) spend the majority of their execution time computing convolutions. The Winograd family of algorithms can greatly reduce the number of arithmetic operations required and is present in many DNN software frameworks. However, the performance gain is at the expense of a reduction in floating point (FP) numerical accuracy. In this paper, we analyse the worst case FP error and prove the estimation of norm and conditioning of the algorithm. We show that the bound grows exponentially with the size of the convolution, but the error bound of the modified algorithm is smaller than the original one. We propose several methods for reducing FP error. We propose a canonical evaluation ordering based on Huffman coding that reduces summation error. We study the selection of sampling "points" experimentally and find empirically good points for the most important sizes. We identify the main factors associated with good points. In addition, we explore other methods to reduce FP error, including mixed-precision convolution, and pairwise summation across DNN channels. Using our methods we can significantly reduce FP error for a given block size, which allows larger block sizes and reduced computation.
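
Editor's illustration: the abstract mentions pairwise summation as one way to reduce floating point error. The sketch below is a generic, minimal version of that idea in single precision and is not the authors' implementation; the function names `naive_sum` and `pairwise_sum` are hypothetical, and how the paper applies the technique across DNN channels is not reproduced here.

```python
import numpy as np


def naive_sum(x):
    """Left-to-right accumulation in float32; rounding error can grow with len(x)."""
    acc = np.float32(0.0)
    for v in x:
        acc = np.float32(acc + v)
    return acc


def pairwise_sum(x):
    """Recursive pairwise (tree) summation in float32.

    Each half is summed independently and the two partial sums are added,
    so every input passes through about log2(n) additions instead of up to n.
    """
    n = len(x)
    if n == 0:
        return np.float32(0.0)
    if n == 1:
        return np.float32(x[0])
    mid = n // 2
    return np.float32(pairwise_sum(x[:mid]) + pairwise_sum(x[mid:]))


if __name__ == "__main__":
    values = np.full(4096, 0.1, dtype=np.float32)
    reference = float(values[0]) * len(values)  # exact in float64 for this input
    print("naive error:   ", abs(float(naive_sum(values)) - reference))
    print("pairwise error:", abs(float(pairwise_sum(values)) - reference))
```

For this particular input (a power-of-two count of identical values) every pairwise addition is an exact doubling, so the tree sum carries no rounding error; on general data the benefit is a smaller error bound rather than an exact result.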

1904.13304 2026-06-04 cs.LG cs.SY eess.SP eess.SY stat.ML 版本更新

A supervised-learning-based strategy for optimal demand response of an HVAC System

基于监督学习的HVAC系统最优需求响应策略

Youngjin Kim

发表机构 * Member, IEEE(IEEE成员)

AI总结 本文提出了一种基于监督学习的HVAC系统最优需求响应策略,通过训练人工神经网络并结合分段线性方程,解决多区建筑中HVAC系统的优化需求响应问题,同时确保热舒适性和经济性。

Comments 12 pages

详情
AI中文摘要

建筑的大型热容量使供暖、通风和空气调节(HVAC)系统能够被用作需求响应(DR)资源。优化HVAC单元的需求响应具有挑战性,特别是在多区建筑中,因为这需要详细的基于物理的模型来描述区域温度变化和建筑热状况。本文提出了一种基于监督学习(SL)的新策略,用于多区建筑中的HVAC系统最优需求响应。人工神经网络(ANNs)使用正常建筑运行条件下的数据进行训练。ANNs通过分段线性方程进行复制,并显式地整合到基于价格的需求响应优化调度问题中。该优化问题在各种电价和建筑热状况下得到求解。求解结果进一步用于训练深度神经网络(DNN)以直接确定最优需求响应计划,称为监督学习辅助元预测(SLAMP)。通过三种不同方法进行案例研究:显式ANN复制(EAR)、SLAMP和基于物理的建模。案例研究结果验证了所提出的SL策略的有效性,不仅体现在实际适用性和计算时间方面,还确保了居住者的热舒适性和HVAC系统的经济运行。

英文摘要

The large thermal capacity of buildings enables heating, ventilating, and air-conditioning (HVAC) systems to be exploited as demand response (DR) resources. Optimal DR of HVAC units is challenging, particularly for multi-zone buildings, because this requires detailed physics-based models of zonal temperature variations for HVAC system operation and building thermal conditions. This paper proposes a new strategy for optimal DR of an HVAC system in a multi-zone building, based on supervised learning (SL). Artificial neural networks (ANNs) are trained with data obtained under normal building operating conditions. The ANNs are replicated using piecewise linear equations, which are explicitly integrated into an optimal scheduling problem for price-based DR. The optimization problem is solved for various electricity prices and building thermal conditions. The solutions are further used to train a deep neural network (DNN) to directly determine the optimal DR schedule, referred to here as supervised-learning-aided meta-prediction (SLAMP). Case studies are performed using three different methods: explicit ANN replication (EAR), SLAMP, and physics-based modeling. The case study results verify the effectiveness of the proposed SL-based strategy, in terms of both practical applicability and computational time, while also ensuring the thermal comfort of occupants and cost-effective operation of the HVAC system.

1806.06317 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Laplacian Smoothing Gradient Descent

拉普拉斯平滑梯度下降

Stanley Osher, Bao Wang, Penghang Yin, Xiyang Luo, Farzin Barekat, Minh Pham, Alex Lin

发表机构 * University of California, Los Angeles(加州大学洛杉矶分校)

AI总结 本文提出了一种简单的方法改进梯度下降和随机梯度下降,通过乘以正定矩阵的逆(可通过FFT高效计算)来减少方差、增大步长并提高泛化精度,同时在理论和实践中均表现出色。

Comments 28 pages, 15 figures

详情
AI中文摘要

我们提出了一类非常简单的梯度下降和随机梯度下降的修改方法。我们展示,当应用于从逻辑回归到深度神经网络的各种机器学习问题时,所提出的替代方法可以显著减少方差,允许采取更大的步长,并提高泛化准确性。这些方法仅涉及将通常的(随机)梯度乘以一个正定矩阵的逆(可以通过FFT高效计算),该矩阵条件数较低,源自一维离散拉普拉斯算子或其高阶推广。该运算还保持均值不变,并增大最小分量、减小最大分量。哈密尔顿-雅可比偏微分方程的理论表明,新算法的隐式版本几乎等同于在一个新函数上进行梯度下降,该函数(i)具有与原函数相同的全局极小值,并且(ii)更“凸”。此外,我们证明采用这些替代方法的优化算法在离散Sobolev $H_σ^p$ 意义下一致收敛,并减小凸优化问题的最优性差距。代码可在此获取:https://github.com/BaoWangMath/LaplacianSmoothing-GradientDescent

英文摘要

We propose a class of very simple modifications of gradient descent and stochastic gradient descent. We show that when applied to a large variety of machine learning problems, ranging from logistic regression to deep neural nets, the proposed surrogates can dramatically reduce the variance, allow a larger step size, and improve the generalization accuracy. The methods only involve multiplying the usual (stochastic) gradient by the inverse of a positive definite matrix (which can be computed efficiently by FFT) with a low condition number coming from a one-dimensional discrete Laplacian or its high order generalizations. This multiplication also preserves the mean, increases the smallest component, and decreases the largest component. The theory of Hamilton-Jacobi partial differential equations demonstrates that the implicit version of the new algorithm is almost the same as doing gradient descent on a new function which (i) has the same global minima as the original function and (ii) is "more convex". Moreover, we show that optimization algorithms with these surrogates converge uniformly in the discrete Sobolev $H_σ^p$ sense and reduce the optimality gap for convex optimization problems. The code is available at: https://github.com/BaoWangMath/LaplacianSmoothing-GradientDescent
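
Editor's illustration: a minimal sketch of the smoothing step described in the abstract, under stated assumptions. It assumes the positive definite matrix is A_sigma = I - sigma * L, with L the one-dimensional discrete Laplacian with periodic boundary, so that A_sigma is circulant and can be inverted with the FFT. The names `laplacian_smooth` and `smoothed_gd_step` and the default `sigma` are hypothetical; the authors' repository linked above is the authoritative reference.

```python
import numpy as np


def laplacian_smooth(grad, sigma=1.0):
    """Return the solution s of A_sigma @ s = grad using the FFT.

    Assumption: A_sigma = I - sigma * L with periodic boundary, i.e. the
    circulant matrix whose first column is [1 + 2*sigma, -sigma, 0, ..., 0, -sigma].
    """
    grad = np.asarray(grad, dtype=float)
    n = grad.size
    if n < 3:
        raise ValueError("the periodic 3-point stencil needs at least 3 components")
    first_col = np.zeros(n)
    first_col[0] = 1.0 + 2.0 * sigma
    first_col[1] = -sigma
    first_col[-1] = -sigma
    # A circulant solve is a pointwise division in Fourier space; the
    # eigenvalues 1 + 2*sigma*(1 - cos(2*pi*k/n)) are >= 1 for sigma >= 0.
    return np.real(np.fft.ifft(np.fft.fft(grad) / np.fft.fft(first_col)))


def smoothed_gd_step(w, grad, lr=0.1, sigma=1.0):
    """One gradient descent step that uses the smoothed gradient."""
    return np.asarray(w, dtype=float) - lr * laplacian_smooth(grad, sigma)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    g = rng.normal(size=16)
    s = laplacian_smooth(g, sigma=2.0)
    print("mean before/after:", g.mean(), s.mean())
    print("max  before/after:", g.max(), s.max())
    print("min  before/after:", g.min(), s.min())
```

Under this assumption the rows of the inverse matrix are nonnegative and sum to one, which is consistent with the abstract's statement that the mean is preserved while the largest component decreases and the smallest increases.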

1904.11352 2026-06-04 math.NA cs.LG cs.NA 版本更新

Construction of the similarity matrix for the spectral clustering method: numerical experiments

谱聚类方法相似性矩阵的构造:数值实验

Paola Favati, Grazia Lotti, Ornella Menchi, Francesco Romani

发表机构 * IIT - CNR(意大利国家研究委员会(CNR)下属IIT研究所) Dipartimento di Scienze Matematiche, Fisiche e Informatiche, University of Parma(帕尔马大学数学、物理和信息科学系) Dipartimento di Informatica, University of Pisa(比萨大学信息科学系)

AI总结 本文研究了谱聚类中相似性矩阵的构造问题,通过直接基于数据集关联图或其最小生成树(MST)来考虑稀疏性和尺度参数σ的选择,进行人工和真实数据集的数值实验以比较方法性能。

Comments Submitted to Journal of Computational and Applied Mathematics

详情
AI中文摘要

谱聚类是一种通过相似性矩阵的特征向量来发现数据集结构的强大方法。当个体聚类结构高度非凸时,它通常优于传统聚类算法如k-均值。其准确性取决于如何定义数据点对之间的相似性。两个重要因素影响相似性矩阵的构造:底层加权图的稀疏性,这主要取决于数据点之间的距离,以及相似性函数。当使用高斯相似性函数时,尺度参数σ的选择可能至关重要。本文基于数据集关联图或其最小生成树(MST)直接或间接地考察了稀疏性和σ的选择。为了比较方法性能,已进行了大量人工和真实数据集的数值实验。

英文摘要

Spectral clustering is a powerful method for finding structure in a dataset through the eigenvectors of a similarity matrix. It often outperforms traditional clustering algorithms such as $k$-means when the structure of the individual clusters is highly non-convex. Its accuracy depends on how the similarity between pairs of data points is defined. Two important items contribute to the construction of the similarity matrix: the sparsity of the underlying weighted graph, which depends mainly on the distances among data points, and the similarity function. When a Gaussian similarity function is used, the choice of the scale parameter $σ$ can be critical. In this paper we examine both items, the sparsity and the selection of suitable $σ$'s, based either directly on the graph associated to the dataset or on the minimal spanning tree (MST) of the graph. An extensive numerical experimentation on artificial and real-world datasets has been carried out to compare the performances of the methods.
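
Editor's illustration: a minimal sketch of how a Gaussian similarity matrix depends on the scale parameter σ and on sparsification. The form exp(-d²/(2σ²)) and the distance cutoff `eps` are common conventions assumed here, and `gaussian_similarity` is a hypothetical helper; the graph-based and MST-based selection rules studied in the paper are not reproduced.

```python
import numpy as np


def gaussian_similarity(X, sigma, eps=None):
    """Build W with W[i, j] = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    If eps is given, pairs farther apart than eps get similarity 0, which
    sparsifies the underlying weighted graph. The diagonal is set to 0.
    """
    X = np.asarray(X, dtype=float)
    diff = X[:, None, :] - X[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    W = np.exp(-sq_dist / (2.0 * sigma ** 2))
    if eps is not None:
        W[sq_dist > eps ** 2] = 0.0
    np.fill_diagonal(W, 0.0)
    return W


if __name__ == "__main__":
    points = np.array([[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]])
    for sigma in (0.5, 1.0, 4.0):
        # A small sigma isolates points; a large sigma links everything.
        print(sigma, np.round(gaussian_similarity(points, sigma), 4))
```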

1904.10597 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Autonomous Voltage Control for Grid Operation Using Deep Reinforcement Learning

利用深度强化学习实现电网运行的自主电压控制

Ruisheng Diao, Zhiwei Wang, Di Shi, Qianyun Chang, Jiajun Duan, Xiaohu Zhang

发表机构 * GEIRI North America(GEIRI北美中心) State Grid Corporation of China(国家电网公司)

AI总结 本文提出Grid Mind框架,通过深度强化学习实现自主电网控制,解决传统方法在处理可再生能源和需求响应动态性带来的挑战,提升电网运行的安全性和经济性。

Comments To be published (Accepted) in: Proceedings of the Power and Energy Society General Meeting (PESGM), Atlanta, GA, 2019

详情
AI中文摘要

现代电力电网正面临由快速增长的可再生能源和需求响应的随机性和动态性带来的巨大挑战。传统理论假设和运营规则可能被违反,而现有控制系统由于缺乏计算能力和准确的电网模型,在实时应用中难以适应,导致电网安全和经济运行日益受到关注。现有运营控制措施通常是离线确定的,优化程度较低。本文提出了一种新的范式Grid Mind,用于利用深度强化学习实现自主电网运营控制。所提出的AI代理可通过与大量离线模拟的交互学习其控制策略,并适应新的变化,包括负载/发电变化以及拓扑变化。经过适当训练的代理在IEEE 14节点系统上测试了数万种场景,并在应用自主电压控制以实现安全电网运行方面展示了有希望的性能。

英文摘要

Modern power grids are experiencing grand challenges caused by the stochastic and dynamic nature of growing renewable energy and demand response. Traditional theoretical assumptions and operational rules may be violated, which are difficult to be adapted by existing control systems due to the lack of computational power and accurate grid models for use in real time, leading to growing concerns in the secure and economic operation of the power grid. Existing operational control actions are typically determined offline, which are less optimized. This paper presents a novel paradigm, Grid Mind, for autonomous grid operational controls using deep reinforcement learning. The proposed AI agent for voltage control can learn its control policy through interactions with massive offline simulations, and adapts its behavior to new changes including not only load/generation variations but also topological changes. A properly trained agent is tested on the IEEE 14-bus system with tens of thousands of scenarios, and promising performance is demonstrated in applying autonomous voltage controls for secure grid operation.

1809.00846 2026-06-04 cs.LG cs.CV cs.SY eess.SY stat.ML 版本更新

Towards Understanding Regularization in Batch Normalization

向批量归一化中的正则化理解迈进

Ping Luo, Xinjiang Wang, Wenqi Shao, Zhanglin Peng

发表机构 * The Chinese University of Hong Kong(香港中文大学) SenseTime Research(商汤科技研究院) The University of Hong Kong(香港大学)

AI总结 本文通过理论分析探讨了批量归一化在神经网络训练中的收敛性和泛化能力,揭示了批量归一化作为隐式正则化的作用,并通过实验验证了其在卷积神经网络中的正则化特性。

Comments International Conference on Learning Representations (ICLR)

详情
AI中文摘要

批量归一化(BN)在神经网络训练中提高了收敛性和泛化能力。本工作从理论上理解这些现象。我们通过使用由核层、BN层和非线性激活函数组成的基本网络块来分析BN。这个基本网络帮助我们从三个方面理解BN的影响。首先,将BN视为隐式正则化,可以将其分解为总体归一化(PN)和伽马衰减作为显式正则化。其次,BN和正则化的学习动态表明,使用大最大有效学习率训练可以收敛。第三,通过统计力学探讨BN的泛化能力。实验表明,卷积神经网络中的BN共享上述分析中的正则化特性。

英文摘要

Batch Normalization (BN) improves both convergence and generalization in training neural networks. This work explains these phenomena theoretically. We analyze BN by using a basic block of neural networks, consisting of a kernel layer, a BN layer, and a nonlinear activation function. This basic network helps us understand the impacts of BN in three aspects. First, by viewing BN as an implicit regularizer, BN can be decomposed into population normalization (PN) and gamma decay as an explicit regularization. Second, the learning dynamics of BN and the regularization show that training converges with large maximum and effective learning rates. Third, the generalization of BN is explored using statistical mechanics. Experiments demonstrate that BN in convolutional neural networks shares the same regularization traits as in the above analyses.

1904.09656 2026-06-04 cs.LG cs.NA math.NA 版本更新

Solution of Definite Integrals using Functional Link Artificial Neural Networks

使用功能链接人工神经网络求解定积分

Satyasaran Changdar, Snehangshu Bhattacharjee

发表机构 * Department of Information Technology, Institute of Engineering and Management(信息科技系,工程管理学院)

AI总结 本文提出了一种利用人工神经网络求解定积分的新方法,通过最小化精心设计的误差函数,构建出一种新颖的替代传统数值方法的神经网络,特别适用于高阶多项式积分。

Comments 14 pages, 7 figures

详情
AI中文摘要

本文讨论了一种利用人工神经网络求解定积分的新方法。目的是构建一个神经网络,作为传统数值方法的新替代方案,并通过学习算法能够求解定积分,通过最小化精心设计的误差函数。所提出的算法与现有数值方法相比更为有效和精确,适用于需要积分高阶多项式的场合。观察结果以表格和图形形式记录并展示。

英文摘要

This paper discusses a new method to solve definite integrals using artificial neural networks. The objective is to build a neural network that would be a novel alternative to pre-established numerical methods and with the help of a learning algorithm, be able to solve definite integrals, by minimising a well constructed error function. The proposed algorithm, with respect to existing numerical methods, is effective and precise and well-suited for purposes which require integration of higher order polynomials. The observations have been recorded and illustrated in tabular and graphical form.

1903.05803 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

On Applications of Bootstrap in Continuous Space Reinforcement Learning

在连续空间强化学习中Bootstrap应用的探讨

Mohamad Kazem Shirani Faradonbeh, Ambuj Tewari, George Michailidis

AI总结 本文研究了连续状态和动作空间决策问题中基于Bootstrap的策略,证明其遗憾(regret)随时间呈平方根增长,并给出了模型动态学习准确性的结果。

详情
AI中文摘要

在连续状态和动作空间的决策问题中,线性动力学模型被广泛采用。具体而言,针对具有二次成本函数的随机线性系统的策略涵盖了强化学习中的大量应用。最近的文献研究了若干随机化策略,以处理辨识与控制之间的权衡。然而,对于基于对观测到的状态和动作进行Bootstrap的策略,人们知之甚少。在本文中,我们证明基于Bootstrap的策略的遗憾(regret)随时间呈平方根增长。我们还得到了关于模型动态学习准确性的结果。此外,还提供了佐证技术结果的数值分析。

英文摘要

In decision making problems for continuous state and action spaces, linear dynamical models are widely employed. Specifically, policies for stochastic linear systems subject to quadratic cost functions capture a large number of applications in reinforcement learning. Selected randomized policies have been studied in the literature recently that address the trade-off between identification and control. However, little is known about policies based on bootstrapping observed states and actions. In this work, we show that bootstrap-based policies achieve a square root scaling of regret with respect to time. We also obtain results on the accuracy of learning the model's dynamics. Corroborative numerical analysis that illustrates the technical results is also provided.

1903.03712 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Adaptive Power System Emergency Control using Deep Reinforcement Learning

基于深度强化学习的自适应电力系统紧急控制

Qiuhua Huang, Renke Huang, Weituo Hao, Jie Tan, Rui Fan, Zhenyu Huang

发表机构 * Pacific Northwest National Laboratory(太平洋西北国家实验室) Battelle(巴特尔) U.S. Department of Energy(美国能源部) Deep Science laboratory(深科学实验室) Google Brain(谷歌大脑)

AI总结 本文提出了一种基于深度强化学习的自适应电力系统紧急控制方法,通过高维特征提取和非线性泛化能力来应对现代电网中的不确定性与变化,展示了其在发电机动态制动和低压减载中的优异性能和鲁棒性。

Comments 12 pages

详情
AI中文摘要

电力系统紧急控制通常被视为电网安全和韧性的最后一道安全网。现有的紧急控制方案通常是基于设想的“最坏”场景或几个典型运行场景进行离线设计。随着现代电网中不确定性和变化日益增多,这些方案面临着显著的适应性和鲁棒性问题。为了解决这些挑战,本文首次开发了新的自适应紧急控制方案,利用深度强化学习(DRL)的高维特征提取和非线性泛化能力来处理复杂的电力系统。此外,首次设计了一个名为RLGC的开源平台,以协助DRL算法在电力系统控制中的开发和基准测试。文中详细介绍了该平台以及用于发电机动态制动和低压减载的基于DRL的紧急控制方案。在两区四机系统和IEEE 39节点系统中进行的大量案例研究证明了所提出方案的优异性能和鲁棒性。

英文摘要

Power system emergency control is generally regarded as the last safety net for grid security and resiliency. Existing emergency control schemes are usually designed off-line based on either the conceived "worst" case scenario or a few typical operation scenarios. These schemes face significant adaptiveness and robustness issues as uncertainties and variations increase in modern electrical grids. To address these challenges, this paper develops, for the first time, novel adaptive emergency control schemes using deep reinforcement learning (DRL), leveraging the high-dimensional feature extraction and non-linear generalization capabilities of DRL for complex power systems. Furthermore, an open-source platform named RLGC has been designed for the first time to assist the development and benchmarking of DRL algorithms for power system control. Details of the platform and DRL-based emergency control schemes for generator dynamic braking and under-voltage load shedding are presented. Extensive case studies performed in both the two-area four-machine system and the IEEE 39-bus system demonstrate the excellent performance and robustness of the proposed schemes.

1712.01975 2026-06-04 cs.LG cs.NA math.NA math.OC 版本更新

Regularization and feature selection for large dimensional data

大规模数据的正则化与特征选择

Nand Sharma, Prathamesh Verlekar, Rehab Ashary, Sui Zhiquan

发表机构 * Department of Mathematics, Colorado State University(科罗拉多州立大学数学系) Department of Computer Science, Colorado State University(科罗拉多州立大学计算机科学系)

AI总结 本文研究了五种嵌入式特征选择方法,通过ridge回归、Lasso回归或其组合进行正则化,以在高维数据中减少特征空间并提高分类性能。

详情
AI中文摘要

特征选择已成为几种机器学习范式中的重要步骤。在生物信息学和文本分类等涉及高维数据的领域,特征选择可以帮助显著减少特征空间。在难以或无法获得足够训练示例的情况下,特征选择有助于克服维度灾难,从而提高分类算法的性能。本文研究了五种嵌入式特征选择方法,这些方法在优化函数的正则化部分使用ridge回归、Lasso回归或两者的组合。我们对五个高维数据集评估了所选方法,并比较了它们在数据集的稀疏性和相关性参数以及执行时间上的表现。

英文摘要

Feature selection has evolved to be an important step in several machine learning paradigms. In domains like bioinformatics and text classification, which involve data of high dimensions, feature selection can help in drastically reducing the feature space. In cases where it is difficult or infeasible to obtain a sufficient number of training examples, feature selection helps overcome the curse of dimensionality, which in turn helps improve the performance of the classification algorithm. The focus of our research here is five embedded feature selection methods which use either ridge regression, Lasso regression, or a combination of the two in the regularization part of the optimization function. We evaluate the five chosen methods on five large dimensional datasets and compare them on the parameters of sparsity and correlation in the datasets and their execution times.

1904.08361 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems

基于解耦数据的方法用于学习控制非线性动力学系统

Ran Wang, Karthikeya Parunandi, Dan Yu, Dileep Kalathil, Suman Chakravorty

发表机构 * College of Astronautics, Nanjing University(南京大学航天学院) Department of Electrical and Computer Engineering, Texas A&M University, Texas, USA(德克萨斯A&M大学电气与计算机工程系)

AI总结 本文提出了一种基于数据的解耦控制方法,用于学习控制具有连续状态空间、连续动作空间和未知动态的非线性随机动力学系统。该方法采用解耦的开环-闭环思路:先利用黑盒仿真模型求解开环确定性轨迹优化问题,再围绕该名义轨迹对动态进行线性化,并用线性二次调节器算法设计闭环控制。文中证明该方法的性能近似最优,仿真表明其训练时间显著少于其他先进算法。

详情
AI中文摘要

本文解决了一个非线性随机动力学系统学习最优控制策略的问题,该系统具有连续状态空间、连续动作空间和未知动态。此类问题通常在随机自适应控制和强化学习文献中使用基于模型和无模型的方法分别解决。这两种方法都依赖于解决动态规划问题,无论是直接还是间接,以找到最优闭环控制策略。动态规划方法固有的'维度灾难'使这些方法也变得计算上困难。本文提出了一种新颖的解耦数据基于控制(D2C)算法,通过解耦的'开环-闭环'方法解决这个问题。首先,使用动力学系统的黑盒仿真模型解决一个开环确定性轨迹优化问题。然后,通过在该名义轨迹上线性化动态,开发围绕该开环轨迹的闭环控制。通过线性化,可以使用基于线性二次调节器的算法来实现该闭环控制。我们证明了D2C算法的性能近似最优。此外,仿真性能表明,与其它先进算法相比,训练时间显著减少。

英文摘要

This paper addresses the problem of learning the optimal control policy for a nonlinear stochastic dynamical system with continuous state space, continuous action space and unknown dynamics. This class of problems is typically addressed in the stochastic adaptive control and reinforcement learning literature using model-based and model-free approaches respectively. Both methods rely on solving a dynamic programming problem, either directly or indirectly, to find the optimal closed loop control policy. The inherent "curse of dimensionality" associated with the dynamic programming method makes these approaches computationally difficult. This paper proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled "open loop - closed loop" approach. First, an open-loop deterministic trajectory optimization problem is solved using a black-box simulation model of the dynamical system. Then, a closed loop control is developed around this open loop trajectory by linearization of the dynamics about this nominal trajectory. By virtue of linearization, a linear quadratic regulator based algorithm can be used for this closed loop control. We show that the performance of the D2C algorithm is approximately optimal. Moreover, simulation performance suggests a significant reduction in training time compared to other state of the art algorithms.

1904.07200 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

A Discussion on Solving Partial Differential Equations using Neural Networks

利用神经网络求解偏微分方程的讨论

Tim Dockhorn

发表机构 * Department of Applied Mathematics(应用数学系) University of Waterloo(滑铁卢大学)

AI总结 本文探讨了神经网络求解偏微分方程的能力,通过数值实验展示了小型神经网络能够准确学习复杂解,并分析了随机权重初始化对解质量的影响,提出了损失函数的选择、神经网络与经典数值方法的优劣比较,以及未来研究方向。

Comments 9 pages, 2 figures

详情
AI中文摘要

神经网络是否能够学习求解偏微分方程(PDEs)?本文针对两个(系统)PDEs,即泊松方程和稳态纳维-斯托克斯方程,探讨了这一问题。本文的贡献有五点:(1)数值实验表明,小型神经网络(<500个可学习参数)能够准确学习偏微分方程组的复杂解。(2)研究了随机权重初始化对神经网络近似解质量的影响,并展示了如何利用这种非确定性进行集成学习。(3)探讨了本文中所用损失函数的适用性。(4)研究了用神经网络求解(系统)PDEs与经典数值方法的优缺点。(5)提出了未来研究的全面方向列表。

英文摘要

Can neural networks learn to solve partial differential equations (PDEs)? We investigate this question for two (systems of) PDEs, namely, the Poisson equation and the steady Navier-Stokes equations. The contributions of this paper are five-fold. (1) Numerical experiments show that small neural networks (< 500 learnable parameters) are able to accurately learn complex solutions for systems of partial differential equations. (2) It investigates the influence of random weight initialization on the quality of the neural network approximate solution and demonstrates how one can take advantage of this non-determinism using ensemble learning. (3) It investigates the suitability of the loss function used in this work. (4) It studies the benefits and drawbacks of solving (systems of) PDEs with neural networks compared to classical numerical methods. (5) It proposes an exhaustive list of possible directions of future work.

1812.03467 2026-06-04 math.NA cs.LG cs.MS cs.NA math.OC 版本更新

A note on solving nonlinear optimization problems in variable precision

关于在变量精度下求解非线性优化问题的注记

S. Gratton, Ph. L. Toint

发表机构 * NAXYS, University of Namur(NAXYS,纳慕尔大学)

AI总结 本文提出一种高效的信赖域算法变体,用于高性能计算,通过多精度计算有效降低目标函数和梯度评估的能耗。

Comments 11 pages, 2 figures

详情
AI中文摘要

本文简要考虑了Carter (1993)和Conn, Gould和Toint (2000)提出的动态精度信赖域算法变体,作为非常高性能计算领域的一个工具,该领域中允许多精度计算对于控制能量耗散至关重要。所呈现的数值实验表明,使用该方法可以显著降低目标函数和梯度评估的'能耗',通过高效利用多精度计算。

英文摘要

This short note considers an efficient variant of the trust-region algorithm with dynamic accuracy proposed by Carter (1993) and Conn, Gould and Toint (2000) as a tool for very high-performance computing, an area where it is critical to allow multi-precision computations to keep energy dissipation under control. Numerical experiments are presented indicating that the method can bring substantial savings in the "energy costs" of evaluating the objective function and its gradient by efficiently exploiting multi-precision computations.

1904.05814 2026-06-04 cs.CV cs.GR cs.LG cs.NA cs.RO math.NA 版本更新

Probabilistic Permutation Synchronization using the Riemannian Structure of the Birkhoff Polytope

利用Birkhoff多面体的Riemannian结构的概率排列同步

Tolga Birdal, Umut Şimşekli

AI总结 本文提出了一种新的几何和概率方法,用于在多个对象或图像集合之间同步对应关系。核心方法包括基于Birkhoff-Riemannian L-BFGS优化放松后的循环一致性损失,以及基于Birkhoff-Riemannian Langevin Monte Carlo生成Birkhoff多面体样本并估计解的置信度。

Comments To appear as oral presentation at CVPR 2019. 20 pages including the supplementary material

详情
AI中文摘要

我们提出了一种全新的几何和概率方法,用于在多个对象或图像集合之间同步对应关系。具体而言,我们提出了两个算法:(1) Birkhoff-Riemannian L-BFGS,用于以有原则的方式优化组合上难以处理的循环一致性损失的松弛版本;(2) Birkhoff-Riemannian Langevin Monte Carlo,用于在Birkhoff多面体上生成样本并估计所得解的置信度。为此,我们首先介绍了最近发展出的Birkhoff多面体的Riemannian几何。接着,我们引入了一种新的概率同步模型,形式为马尔可夫随机场(MRF)。最后,基于一阶retraction算子,我们将问题表述为模拟一个随机微分方程,并设计了新的积分器。我们在合成和真实数据集上展示,我们能够以更快的收敛速度和可靠的置信度/不确定性估计获得高质量的多图匹配结果。

英文摘要

We present an entirely new geometric and probabilistic approach to synchronization of correspondences across multiple sets of objects or images. In particular, we present two algorithms: (1) Birkhoff-Riemannian L-BFGS for optimizing the relaxed version of the combinatorially intractable cycle consistency loss in a principled manner, (2) Birkhoff-Riemannian Langevin Monte Carlo for generating samples on the Birkhoff Polytope and estimating the confidence of the found solutions. To this end, we first introduce the very recently developed Riemannian geometry of the Birkhoff Polytope. Next, we introduce a new probabilistic synchronization model in the form of a Markov Random Field (MRF). Finally, based on the first order retraction operators, we formulate our problem as simulating a stochastic differential equation and devise new integrators. We show on both synthetic and real datasets that we achieve high quality multi-graph matching results with faster convergence and reliable confidence/uncertainty estimates.

1904.05732 2026-06-04 math.NA cs.LG cs.NA 版本更新

A Kaczmarz Algorithm for Solving Tree Based Distributed Systems of Equations

一种用于求解树结构分布式方程组的Kaczmarz算法

Chinmay Hegde, Fritz Keinert, Eric S. Weber

发表机构 * Electrical and Computer Engineering, Iowa State University, Ames, IA 50011(电子与计算机工程系,爱荷华州立大学,爱荷华州阿姆斯,IA 50011) Department of Mathematics, Iowa State University, 396 Carver Hall, Ames, IA 50011(数学系,爱荷华州立大学,396号Carver大楼,爱荷华州阿姆斯,IA 50011)

AI总结 本文提出了一种改进的Kaczmarz算法,用于求解分布式环境中的线性方程组,该算法适用于具有树结构的网络,并在方程组一致时收敛到解,在不一致时收敛到加权最小二乘解。

详情
AI中文摘要

Kaczmarz算法是一种用于求解线性方程组的迭代方法。我们介绍了一种改进的Kaczmarz算法,用于在分布式环境中求解线性方程组,即方程组中的方程分布在网络中的多个节点上。我们引入的修改是为了一个具有树结构的网络,允许节点之间传递解的估计值。我们证明了在不增加对方程的额外假设的情况下,改进的算法收敛。我们展示了当系统一致时,该算法收敛到解或最小范数解。我们还展示了在方程组不一致的情况下,改进的放松Kaczmarz算法当松弛参数接近0时收敛到加权最小二乘解。

英文摘要

The Kaczmarz algorithm is an iterative method for solving systems of linear equations. We introduce a modified Kaczmarz algorithm for solving systems of linear equations in a distributed environment, i.e. the equations within the system are distributed over multiple nodes within a network. The modification we introduce is designed for a network with a tree structure that allows for passage of solution estimates between the nodes in the network. We prove that the modified algorithm converges under no additional assumptions on the equations. We demonstrate that the algorithm converges to the solution, or the solution of minimal norm, when the system is consistent. We also demonstrate that in the case of an inconsistent system of equations, the modified relaxed Kaczmarz algorithm converges to a weighted least squares solution as the relaxation parameter approaches $0$.
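
As a point of reference for the abstract above, the following is a minimal sketch of the classical, sequential Kaczmarz iteration with a relaxation parameter. It is not the tree-based distributed variant the paper introduces; the function name `kaczmarz`, the parameter names, and the small test system are illustrative assumptions.

```python
def kaczmarz(A, b, sweeps=200, relaxation=1.0):
    """Classical cyclic Kaczmarz iteration for A x = b.

    A is a list of rows (lists of floats), b a list of floats.
    This is the textbook sequential method, not the distributed
    tree-structured variant described in the abstract.
    """
    n = len(A[0])
    x = [0.0] * n  # start from the zero vector
    for _ in range(sweeps):
        for row, b_i in zip(A, b):
            norm_sq = sum(a * a for a in row)
            if norm_sq == 0.0:
                continue  # skip empty equations
            # Residual of the current equation at the current iterate.
            residual = b_i - sum(a * xj for a, xj in zip(row, x))
            # Move x towards the hyperplane <row, x> = b_i
            # (an exact projection when relaxation == 1).
            step = relaxation * residual / norm_sq
            x = [xj + step * a for xj, a in zip(x, row)]
    return x


# Hypothetical consistent 2x2 system: 2x + y = 3, x + 3y = 5.
solution = kaczmarz([[2.0, 1.0], [1.0, 3.0]], [3.0, 5.0])
print(solution)
```

Each inner step only touches one equation, which is what makes the method a natural candidate for settings where the equations live on different nodes.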

1903.07266 2026-06-04 cs.LG cs.DC cs.MA cs.SY eess.SY stat.ML 版本更新

Distributed stochastic optimization with gradient tracking over strongly-connected networks

强连通网络上带梯度跟踪的分布式随机优化

Ran Xin, Anit Kumar Sahu, Usman A. Khan, Soummya Kar

发表机构 * Department of Electrical and Computer Engineering, Tufts University(Tufts大学电气与计算机工程系) Bosch Center for Artificial Intelligence(博世人工智能中心) Department of Electrical and Computer Engineering, Carnegie Mellon University(卡内基梅隆大学电气与计算机工程系)

AI总结 本文研究了在强连通网络上最小化平滑且强凸局部成本函数之和的分布式随机优化问题,提出了一种新的分布式方法$\mathcal{S}$-$\mathcal{AB}$,通过辅助变量在期望意义上渐近跟踪全局成本的梯度,利用行和列随机权重确保共识和最优性,并在任意强连通图上应用。

详情
AI中文摘要

在本文中,我们研究了分布式随机优化问题:由在强连通图上通信的代理组成的网络最小化平滑且强凸的局部成本函数之和。假设每个代理都可以访问随机一阶oracle($\mathcal{SFO}$),我们提出了一种新的分布式方法,称为$\mathcal{S}$-$\mathcal{AB}$,其中每个代理使用辅助变量在期望意义下渐近跟踪全局成本的梯度。$\mathcal{S}$-$\mathcal{AB}$算法同时使用行随机和列随机权重,以确保共识和最优性。由于未使用双随机权重,$\mathcal{S}$-$\mathcal{AB}$适用于任意强连通图。我们证明,在足够小的常数步长下,$\mathcal{S}$-$\mathcal{AB}$在期望均方意义上线性收敛到全局极小点的一个邻域。我们基于真实世界数据集进行了数值模拟以说明理论结果。

英文摘要

In this paper, we study distributed stochastic optimization to minimize a sum of smooth and strongly-convex local cost functions over a network of agents, communicating over a strongly-connected graph. Assuming that each agent has access to a stochastic first-order oracle ($\mathcal{SFO}$), we propose a novel distributed method, called $\mathcal{S}$-$\mathcal{AB}$, where each agent uses an auxiliary variable to asymptotically track the gradient of the global cost in expectation. The $\mathcal{S}$-$\mathcal{AB}$ algorithm employs row- and column-stochastic weights simultaneously to ensure both consensus and optimality. Since doubly-stochastic weights are not used, $\mathcal{S}$-$\mathcal{AB}$ is applicable to arbitrary strongly-connected graphs. We show that under a sufficiently small constant step-size, $\mathcal{S}$-$\mathcal{AB}$ converges linearly (in expected mean-square sense) to a neighborhood of the global minimizer. We present numerical simulations based on real-world data sets to illustrate the theoretical results.

1904.04685 2026-06-04 math.NA cs.LG cs.NA math.OC 版本更新

On the approximation of the solution of partial differential equations by artificial neural networks trained by a multilevel Levenberg-Marquardt method

利用多级Levenberg-Marquardt方法训练人工神经网络近似偏微分方程解

Henri Calandra, Serge Gratton, Elisa Riccietti, Xavier Vasseur

发表机构 * INPT-IRIT, University of Toulouse and ENSEEIHT(INPT-IRIT,图卢兹大学和ENSEEIHT) ISAE-SUPAERO, University of Toulouse(ISAE-SUPAERO,图卢兹大学)

AI总结 本文研究了利用人工神经网络近似偏微分方程解的问题,提出了一种多级Levenberg-Marquardt方法用于训练,通过数值实验展示了该方法在训练人工神经网络时的高效性。

详情
AI中文摘要

本文关注利用人工神经网络近似偏微分方程解的问题。这里使用前馈神经网络来近似偏微分方程的解。将学习问题公式化为最小二乘问题,选择偏微分方程的残差作为损失函数,同时采用多级Levenberg-Marquardt方法作为训练方法。这种设置使我们能够进一步了解多级方法的潜力。确实,当最小二乘问题来源于人工神经网络的训练时,需要优化的变量并不受任何几何约束,标准的插值和限制算子不能再被使用。因此,提出了一种受代数多重网格方法启发的启发式方法,用于构造多级转移算子。数值实验显示,与标准的一级过程相比,新的多级优化方法在训练人工神经网络时表现出令人鼓舞的结果。

英文摘要

This paper is concerned with the approximation of the solution of partial differential equations by means of artificial neural networks. Here a feedforward neural network is used to approximate the solution of the partial differential equation. The learning problem is formulated as a least squares problem, choosing the residual of the partial differential equation as a loss function, whereas a multilevel Levenberg-Marquardt method is employed as a training method. This setting allows us to get further insight into the potential of multilevel methods. Indeed, when the least squares problem arises from the training of artificial neural networks, the variables subject to optimization are not related by any geometrical constraints and the standard interpolation and restriction operators cannot be employed any longer. A heuristic, inspired by algebraic multigrid methods, is then proposed to construct the multilevel transfer operators. Numerical experiments show encouraging results related to the efficiency of the new multilevel optimization method for the training of artificial neural networks, compared to the standard corresponding one-level procedure.

1904.03665 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Learning to Control Highly Accelerated Ballistic Movements on Muscular Robots

学习在肌肉驱动机器人上控制高加速度的弹道式运动

Dieter Büchler, Roberto Calandra, Jan Peters

发表机构 * Max Planck Institute for Intelligent Systems(智能系统马克斯·普朗克研究所) Facebook AI Research(脸书人工智能研究)

AI总结 本文研究了如何通过学习方法提高肌肉机器人在高速高加速度运动中的控制精度,提出了一种四自由度的机器人臂,利用气动人工肌肉实现高关节角加速度,并通过贝叶斯优化直接在硬件上调整控制参数,从而在快速轨迹上实现了优于以往的结果。

Comments 12 pages, preprint submitted to Journal of Robotics and Autonomous Systems

详情
AI中文摘要

高速和高加速度的运动本质上很难控制。在人形机器人臂上应用学习方法来控制此类运动可以提高控制的准确性,但可能会损坏系统。学习方法的内在探索可能导致不稳定性和机器人在高速下达到关节极限。因此,需要能够安全探索高速和高加速度运动的硬件。为了解决这个问题,我们提出使用由气动人工肌肉(PAMs)驱动的机器人。在本文中,我们展示了一种四自由度的机器人臂,能够达到高达28000度/秒²的关节角加速度,同时通过拮抗驱动和空气压力范围限制避免危险的关节极限。利用这种机器人臂,我们能够通过贝叶斯优化直接在硬件上调整控制参数,而无需额外的安全考虑。在快速轨迹上的跟踪性能超过了以往在类似PAM驱动机器人上的结果。我们还展示了由于电缆弯曲最小、轻量化的运动学结构以及PAM之间、PAM与连杆之间的最小接触等精心的结构设计考虑,系统能够使用PID控制器在慢速轨迹上得到良好控制。最后,我们提出了一种新的技术来控制拮抗肌肉对的协同收缩。实验结果表明,选择最佳的协同收缩水平对于达到更好的跟踪性能至关重要。通过使用PAM驱动机器人和学习,我们朝着未来能够实现更像人类运动的机器人发展迈出了一小步。

英文摘要

High-speed and high-acceleration movements are inherently hard to control. Applying learning to the control of such motions on anthropomorphic robot arms can improve the accuracy of the control but might damage the system. The inherent exploration of learning approaches can lead to instabilities and the robot reaching joint limits at high speeds. Having hardware that enables safe exploration of high-speed and high-acceleration movements is therefore desirable. To address this issue, we propose to use robots actuated by Pneumatic Artificial Muscles (PAMs). In this paper, we present a four degrees of freedom (DoFs) robot arm that reaches high joint angle accelerations of up to 28000 deg/s^2 while avoiding dangerous joint limits thanks to the antagonistic actuation and limits on the air pressure ranges. With this robot arm, we are able to tune control parameters using Bayesian optimization directly on the hardware without additional safety considerations. The achieved tracking performance on a fast trajectory exceeds previous results on comparable PAM-driven robots. We also show that our system can be controlled well on slow trajectories with PID controllers due to careful construction considerations such as minimal bending of cables, lightweight kinematics and minimal contact among the PAMs and between the PAMs and the links. Finally, we propose a novel technique to control the co-contraction of antagonistic muscle pairs. Experimental results illustrate that choosing the optimal co-contraction level is vital to reach better tracking performance. Through the use of PAM-driven robots and learning, we take a small step towards the future development of robots capable of more human-like motions.

1803.08552 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Linear model predictive safety certification for learning-based control

面向基于学习的控制的线性模型预测安全认证

Kim P. Wabersich, Melanie N. Zeilinger

发表机构 * ETH Zurich(苏黎世联邦理工学院)

AI总结 本文提出了一种模型预测安全认证(MPSC)方案,用于具有加性扰动的多胞体线性系统,以解决基于学习的控制器缺乏安全保证的问题。通过引入MPC来确保系统在安全目标集内运行,并通过场景优化提出了一种实用的数据驱动设计方法。

详情
AI中文摘要

尽管已多次证明基于学习的控制器可以提供优越的性能,但它们通常缺乏安全保证。本文旨在通过引入一种模型预测安全认证(MPSC)方案来解决这一问题,该方案适用于具有加性扰动的多胞体线性系统。该方案验证所提出的基于学习的输入的安全性,并仅在必要时对其进行尽可能小的修改,以使系统保持在给定约束集内。因此,安全性与是否存在一个能提供通往安全目标集的可行轨迹的模型预测控制器(MPC)相关。一种鲁棒的MPC表述考虑了学习环境中模型通常不确定的事实,从而可以证明在所提出的MPSC策略下约束始终得到满足。MPSC方案可用于扩展任何可能偏保守的、用于学习的安全状态集,我们还证明了一种用于扩大安全集的迭代技术。最后,提出了一种基于场景优化的实用数据驱动MPSC设计方法。

英文摘要

While it has been repeatedly shown that learning-based controllers can provide superior performance, they often lack safety guarantees. This paper aims at addressing this problem by introducing a model predictive safety certification (MPSC) scheme for polytopic linear systems with additive disturbances. The scheme verifies safety of a proposed learning-based input and modifies it as little as necessary in order to keep the system within a given set of constraints. Safety is thereby related to the existence of a model predictive controller (MPC) providing a feasible trajectory towards a safe target set. A robust MPC formulation accounts for the fact that the model is generally uncertain in the context of learning, which allows proving constraint satisfaction at all times under the proposed MPSC strategy. The MPSC scheme can be used in order to expand any potentially conservative set of safe states for learning and we prove an iterative technique for enlarging the safe set. Finally, a practical data-based design procedure for MPSC is proposed using scenario optimization.

1904.02765 2026-06-04 cs.RO cs.LG cs.SY eess.SY math.OC 版本更新

Intent-Aware Probabilistic Trajectory Estimation for Collision Prediction with Uncertainty Quantification

意图感知的概率轨迹估计用于碰撞预测与不确定性量化

Andrew Patterson, Arun Lakshmanan, Naira Hovakimyan

发表机构 * Department of Mechanical Science and Engineering, University of Illinois at Urbana-Champaign(伊利诺伊大学厄巴纳-香槟分校机械科学与工程系)

AI总结 本文提出了一种基于高斯过程的概率轨迹估计方法,用于在不确定环境中预测碰撞,通过概率方法替代确定性假设,以考虑更广泛的障碍物类型,并通过案例研究展示了在有限障碍物行为知识下预测碰撞的能力。

详情
AI中文摘要

在动态和未知的环境中,碰撞预测依赖于对环境变化的理解。许多碰撞预测方法依赖于对障碍物运动的确定性知识,但完全确定性的障碍物运动知识往往不可用。本文提出了一种基于高斯过程的预测方法,用概率知识替代对每个障碍物未来行为的确定性假设,以考虑更广泛的障碍物。该方法仅依赖位置和速度测量来预测与动态障碍物的碰撞。我们证明,障碍物位置的不确定性区域可以表示为通过高斯过程回归生成的多项式的组合。为了控制任意时间范围内不确定性的增长,假设概率障碍物意图作为障碍物位置和速度的分布,这可以自然地包含在高斯过程框架中。我们的方法在两个案例研究中得到验证:(i) 障碍物超越代理;(ii) 障碍物垂直穿过代理的路径。在这些模拟中,我们展示了即使在有限的障碍物行为知识下也能预测碰撞。

英文摘要

Collision prediction in a dynamic and unknown environment relies on knowledge of how the environment is changing. Many collision prediction methods rely on deterministic knowledge of how obstacles are moving in the environment. However, complete deterministic knowledge of the obstacles' motion is often unavailable. This work proposes a Gaussian process based prediction method that replaces the assumption of deterministic knowledge of each obstacle's future behavior with probabilistic knowledge, to allow a larger class of obstacles to be considered. The method solely relies on position and velocity measurements to predict collisions with dynamic obstacles. We show that the uncertainty region for obstacle positions can be expressed in terms of a combination of polynomials generated with Gaussian process regression. To control the growth of uncertainty over arbitrary time horizons, a probabilistic obstacle intention is assumed as a distribution over obstacle positions and velocities, which can be naturally included in the Gaussian process framework. Our approach is demonstrated in two case studies in which (i) an obstacle overtakes the agent and (ii) an obstacle crosses the agent's path perpendicularly. In these simulations we show that the collision can be predicted despite having limited knowledge of the obstacle's behavior.

1708.01945 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

A Bootstrap Method for Error Estimation in Randomized Matrix Multiplication

一种用于随机矩阵乘法误差估计的自助法

Miles E. Lopes, Shusen Wang, Michael W. Mahoney

发表机构 * Department of Statistics, University of California at Davis(加州大学戴维斯分校统计学系) Department of Computer Science, Stevens Institute of Technology(史蒂文斯理工学院计算机科学系) International Computer Science Institute and Department of Statistics, University of California at Berkeley(国际计算机科学研究所及加州大学伯克利分校统计学系)

AI总结 本文提出了一种自助方法,用于直接估计随机矩阵乘法(草图法,sketching)的准确性,并将其作为解决一般问题的原型设置。该方法在计算上不显著增加标准草图方法的成本,这得益于一种外推技术,同时提供了理论和实证结果以证明其有效性。

详情
Journal ref
Journal of Machine Learning Research, 20(39): 1-40, 2019
AI中文摘要

近年来,随机方法在数值线性代数中受到越来越多的关注,作为解决大规模问题的一般方法。通常,这些方法的核心成分是某种形式的随机降维,这加速了计算,但也引入了随机近似误差。这样,降维步骤就体现了成本与精度之间的权衡。然而,成本与精度之间的精确数值关系通常未知,因此用户可能难以准确知道(1)给定解的准确性,或(2)为了达到给定的准确性水平需要多少计算。在本文中,我们研究随机矩阵乘法(草图法,sketching)作为解决这些一般问题的原型设置。作为解决方案,我们开发了一种自助方法,用于直接估计作为降维后维度的函数的准确性(而不是推导以降维后维度表示的最坏情况精度界)。从计算角度来看,所提出的方法不显著增加标准草图方法的成本,这得益于一种“外推”技术。此外,我们提供了理论和实证结果,以证明所提出方法的有效性。

英文摘要

In recent years, randomized methods for numerical linear algebra have received growing interest as a general approach to large-scale problems. Typically, the essential ingredient of these methods is some form of randomized dimension reduction, which accelerates computations, but also creates random approximation error. In this way, the dimension reduction step encodes a tradeoff between cost and accuracy. However, the exact numerical relationship between cost and accuracy is typically unknown, and consequently, it may be difficult for the user to precisely know (1) how accurate a given solution is, or (2) how much computation is needed to achieve a given level of accuracy. In the current paper, we study randomized matrix multiplication (sketching) as a prototype setting for addressing these general problems. As a solution, we develop a bootstrap method for directly estimating the accuracy as a function of the reduced dimension (as opposed to deriving worst-case bounds on the accuracy in terms of the reduced dimension). From a computational standpoint, the proposed method does not substantially increase the cost of standard sketching methods, and this is made possible by an "extrapolation" technique. In addition, we provide both theoretical and empirical results to demonstrate the effectiveness of the proposed method.
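
To make the setting in the abstract above concrete, here is a rough sketch of randomized matrix multiplication by uniform row sampling, followed by a plain bootstrap over the sampled rows to gauge the sketching error. It is written under simplifying assumptions: uniform sampling with replacement, the entrywise maximum error as the metric, and no extrapolation step. All function names are hypothetical and this is not the authors' implementation.

```python
import random


def exact_product(A, B):
    """Exact A^T B for row-major lists A (n x d) and B (n x p)."""
    d, p = len(A[0]), len(B[0])
    out = [[0.0] * p for _ in range(d)]
    for a_row, b_row in zip(A, B):
        for j in range(d):
            for k in range(p):
                out[j][k] += a_row[j] * b_row[k]
    return out


def sketched_product(A, B, idx):
    """Unbiased estimate of A^T B from uniformly sampled row indices idx."""
    scale = len(A) / len(idx)  # rescale so the expectation equals A^T B
    est = exact_product([A[i] for i in idx], [B[i] for i in idx])
    return [[scale * v for v in row] for row in est]


def max_abs_diff(X, Y):
    """Entrywise maximum absolute difference between two matrices."""
    return max(abs(x - y) for rx, ry in zip(X, Y) for x, y in zip(rx, ry))


def bootstrap_error_quantile(A, B, idx, n_boot=200, q=0.95, seed=0):
    """Resample the sampled rows and return the q-quantile of the
    deviation from the original sketched product (assumed error proxy)."""
    rng = random.Random(seed)
    base = sketched_product(A, B, idx)
    errs = []
    for _ in range(n_boot):
        resampled = [idx[rng.randrange(len(idx))] for _ in idx]
        errs.append(max_abs_diff(sketched_product(A, B, resampled), base))
    errs.sort()
    return errs[min(len(errs) - 1, int(q * len(errs)))]


# Hypothetical data: 500 rows, sketch size 50.
rng = random.Random(42)
A = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(500)]
B = [[rng.gauss(0, 1) for _ in range(2)] for _ in range(500)]
idx = [rng.randrange(len(A)) for _ in range(50)]
print("actual error:", max_abs_diff(sketched_product(A, B, idx), exact_product(A, B)))
print("bootstrap 95% estimate:", bootstrap_error_quantile(A, B, idx))
```

The bootstrap only reuses the rows that were already sampled, so it never needs the full product; that is the practical appeal of estimating the error directly instead of relying on a worst-case bound.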

1904.01855 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML 版本更新

A Stochastic Interpretation of Stochastic Mirror Descent: Risk-Sensitive Optimality

随机镜像下降的随机解释:风险敏感最优性

Navid Azizan, Babak Hassibi

发表机构 * California Institute of Technology(加州理工学院)

AI总结 本文提出随机镜像下降(SMD)是一种风险敏感最优估计器,适用于非高斯分布的未知权重向量和加性噪声,同时引入了对称SMD(SSMD)的改进版本。

详情
AI中文摘要

随机镜像下降(SMD)是一种相对较新的算法家族,最近在优化、机器学习和控制领域得到了广泛应用。它可以被视为经典随机梯度算法(SGD)的推广,其中权重向量的更新不是沿着随机梯度的负方向进行,而是在一个由(严格凸)势函数的梯度定义的“镜像域”中进行。这种势函数及其产生的镜像域相比SGD提供了更大的算法灵活性。尽管许多SMD的性质已经在文献中得到研究,但本文提出了SMD的一个新解释,即当未知权重向量和加性噪声非高斯且属于指数分布族时,SMD是一个风险敏感最优估计器。分析还提示了SMD的一种修改版本,我们称之为对称SMD(SSMD)。证明依赖于Bregman散度的一些简单性质,使我们能够以相当自然的方式将结果从二次函数和高斯分布推广到某些凸函数和指数分布族。

英文摘要

Stochastic mirror descent (SMD) is a fairly new family of algorithms that has recently found a wide range of applications in optimization, machine learning, and control. It can be considered a generalization of the classical stochastic gradient algorithm (SGD), where instead of updating the weight vector along the negative direction of the stochastic gradient, the update is performed in a "mirror domain" defined by the gradient of a (strictly convex) potential function. This potential function, and the mirror domain it yields, provides considerable flexibility in the algorithm compared to SGD. While many properties of SMD have already been obtained in the literature, in this paper we exhibit a new interpretation of SMD, namely that it is a risk-sensitive optimal estimator when the unknown weight vector and additive noise are non-Gaussian and belong to the exponential family of distributions. The analysis also suggests a modified version of SMD, which we refer to as symmetric SMD (SSMD). The proofs rely on some simple properties of Bregman divergence, which allow us to extend results from quadratics and Gaussians to certain convex functions and exponential families in a rather seamless way.
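
The update described in the abstract above can be sketched as follows, assuming a separable potential so that the mirror map (the gradient of the potential) acts coordinate-wise. With the quadratic potential the step reduces to plain SGD, and with the unnormalized negative-entropy potential it becomes a multiplicative update. The function names are hypothetical, and the risk-sensitive analysis and the symmetric variant (SSMD) are not covered here.

```python
import math


def smd_step(w, grad, lr, mirror, inverse_mirror):
    """One stochastic mirror descent step.

    mirror is the coordinate-wise gradient of an assumed separable,
    strictly convex potential; inverse_mirror is its inverse.
    The gradient step is taken in the mirror domain, then mapped back.
    """
    z = [mirror(wi) - lr * gi for wi, gi in zip(w, grad)]
    return [inverse_mirror(zi) for zi in z]


def sgd_step(w, grad, lr):
    """Potential 0.5 * ||w||^2: the mirror map is the identity (plain SGD)."""
    return smd_step(w, grad, lr, lambda v: v, lambda z: z)


def entropic_step(w, grad, lr):
    """Potential sum(w * log(w) - w): mirror map log, inverse exp.
    Requires strictly positive weights; gives w * exp(-lr * grad)."""
    return smd_step(w, grad, lr, math.log, math.exp)


# Hypothetical single-sample least-squares gradient: (w.x - y) * x.
w = [1.0, 1.0]
x, y = [0.5, 2.0], 1.0
err = sum(wi * xi for wi, xi in zip(w, x)) - y
grad = [err * xi for xi in x]
print("SGD step:     ", sgd_step(w, grad, 0.1))
print("entropic step:", entropic_step(w, grad, 0.1))
```

Swapping the potential changes only the pair of maps passed to `smd_step`, which is the flexibility relative to SGD that the abstract refers to.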

1904.01214 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Enhancement of Energy-Based Swing-Up Controller via Entropy Search

通过熵搜索增强基于能量的摆动上控制器

Chang Sik Lee, Dong Eui Chang

发表机构 * School of Electrical Engineering, KAIST, Daejeon, Korea(韩国科学技术院(KAIST)电气工程学院,韩国大田)

AI总结 本文利用熵搜索进行贝叶斯优化,改进基于能量的控制器,以实现旋转倒立摆(Furuta摆)的摆起控制,仿真和实验表明该控制器在各种初始条件下性能优于标称控制器。

Comments 6 pages, 2019 Asian Control Conference

详情
AI中文摘要

基于能量的方法为稳定机械系统提供了一种简单而强大的控制方案。然而,由于它不对控制器参数空间施加强约束,为最优控制器寻找合适的参数值被认为是困难的。本文旨在通过应用称为熵搜索的贝叶斯优化方法,生成一个最优的基于能量的控制器,用于旋转倒立摆(也称为Furuta摆)的摆起控制。仿真和实验表明,与标称控制器相比,最优控制器在各种初始条件下表现出更好的性能。

英文摘要

An energy based approach for stabilizing a mechanical system has offered a simple yet powerful control scheme. However, since it does not impose such strong constraints on parameter space of the controller, finding appropriate parameter values for an optimal controller is known to be hard. This paper intends to generate an optimal energy-based controller for swinging up a rotary inverted pendulum, also known as the Furuta pendulum, by applying the Bayesian optimization called Entropy Search. Simulations and experiments show that the optimal controller has an improved performance compared to a nominal controller for various initial conditions.

1903.09698 2026-06-04 math.NA cs.LG cs.NA 版本更新

CUR Decompositions, Approximations, and Perturbations

CUR分解、近似与扰动

Keaton Hamm, Longxiu Huang

发表机构 * Department of Mathematics, University of Arizona, Tucson, AZ 85719 USA(亚利桑那大学数学系,图森,亚利桑那州,85719 USA) Department of Mathematics, Vanderbilt University, Nashville, TN 37240 USA(范德比尔特大学数学系,纳什维尔,田纳西州,37240 USA)

AI总结 本文探讨了用于降维和低秩矩阵近似的CUR分解方法,综述并比较了文献中的不同观点,提出了一种新的精确CUR分解特征,并对噪声低秩矩阵的CUR近似进行了新颖的扰动分析,同时给出了新的列和行采样结果,证明了低秩矩阵的CUR分解在高概率下得以实现,并展示了这些采样方法在之前研究的扰动下的稳定性以及相关方法和界限的数值示例。

Comments 40 pages

详情
AI中文摘要

本文讨论了降维和低秩矩阵近似中的一个有用工具:CUR分解。我们综合了文献中关于该方法的各种观点,并进行比较和对比;其中包括对精确CUR分解的一种新刻画。我们对低秩矩阵含噪版本的CUR近似进行了新的扰动分析,将这些近似与底层低秩部分的假定CUR分解进行比较。此外,我们给出了新的列和行采样结果,由此可以得出结论:低秩矩阵的CUR分解能够以高概率获得。然后,我们展示了这些采样方法在之前研究的扰动下的稳定性,并提供了所讨论的方法和界限的数值示例。

英文摘要

This article discusses a useful tool in dimensionality reduction and low-rank matrix approximation called the CUR decomposition. Various viewpoints of this method in the literature are synergized and are compared and contrasted; included in this is a new characterization of exact CUR decompositions. A novel perturbation analysis is performed on CUR approximations of noisy versions of low-rank matrices, which compares them with the putative CUR decomposition of the underlying low-rank part. Additionally, we give new column and row sampling results which allow one to conclude that a CUR decomposition of a low-rank matrix is attained with high probability. We then illustrate the stability of these sampling methods under the perturbations studied before, and provide numerical illustrations of the methods and bounds discussed.

1707.09198 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.OC 版本更新

Data-Driven Stochastic Robust Optimization: A General Computational Framework and Algorithm for Optimization under Uncertainty in the Big Data Era

数据驱动的随机稳健优化:大数据时代不确定性优化的通用计算框架和算法

Chao Ning, Fengqi You

发表机构 * Robert Frederick Smith School of Chemical and Biomolecular Engineering, Cornell University(罗伯特·弗雷德里克·史密斯化学与生物分子工程学院,康奈尔大学)

AI总结 本文提出了一种数据驱动的随机稳健优化框架,通过双层优化结构基于数据驱动的不确定性模型,结合两阶段随机规划和自适应稳健优化,解决大数据时代下的不确定性优化问题。

详情
Journal ref
Computers & Chemical Engineering, Volume 111, Pages 115-133, 4 March 2018
AI中文摘要

本文提出了一种新颖的数据驱动随机稳健优化(DDSRO)框架,用于利用带有标签的多类不确定性数据进行不确定性优化。大数据集中的不确定性数据通常来自各种条件,这些条件通过类别标签进行编码。采用狄利克雷过程混合模型和最大似然估计等机器学习方法进行不确定性建模。基于数据驱动的不确定性模型,进一步提出了一种双层优化结构的DDSRO框架。外层优化问题采用两阶段随机规划方法,以在不同数据类别上优化预期目标;自适应稳健优化作为内层问题,确保解决方案的鲁棒性,同时保持计算可行性。进一步开发了一种基于分解的算法,以高效解决由此产生的多级优化问题。通过过程网络设计和规划的案例研究,展示了所提框架和算法的应用性。

英文摘要

A novel data-driven stochastic robust optimization (DDSRO) framework is proposed for optimization under uncertainty leveraging labeled multi-class uncertainty data. Uncertainty data in large datasets are often collected from various conditions, which are encoded by class labels. Machine learning methods including Dirichlet process mixture model and maximum likelihood estimation are employed for uncertainty modeling. A DDSRO framework is further proposed based on the data-driven uncertainty model through a bi-level optimization structure. The outer optimization problem follows a two-stage stochastic programming approach to optimize the expected objective across different data classes; adaptive robust optimization is nested as the inner problem to ensure the robustness of the solution while maintaining computational tractability. A decomposition-based algorithm is further developed to solve the resulting multi-level optimization problem efficiently. Case studies on process network design and planning are presented to demonstrate the applicability of the proposed framework and algorithm.

1810.04351 2026-06-04 math.AP cs.LG cs.NA math.NA math.PR 版本更新

Properly-weighted graph Laplacian for semi-supervised learning

用于半监督学习的恰当加权图拉普拉斯算子

Jeff Calder, Dejan Slepcev

发表机构 * Department of Mathematics, University of Minnesota(明尼苏达大学数学系) Department of Mathematical Sciences, Carnegie Mellon University(卡内基梅隆大学数学科学系)

AI总结 本文提出了一种恰当加权的图拉普拉斯算子方法,以解决传统半监督学习方法在标签数据与未标签数据比例降低时性能下降的问题,通过在拉普拉斯正则化中正确设置权重,使估计器在大样本极限下保持适定且稳定,并证明了所提出的方法在无限样本极限下收敛于连续统变分问题的光滑解。

详情
AI中文摘要

传统图拉普拉斯方法在半监督学习中,当标签数据与未标签数据的比例降低时,性能显著下降,这是由于图拉普拉斯的退化。最近有几种方法被提出以解决这个问题,然而我们表明其中一些方法在大数据极限下仍然是不适定的。在本文中,我们展示了一种正确设置拉普拉斯正则化中权重的方法,使得估计器在大样本极限下保持适定且稳定。我们证明了我们的半监督学习算法在无限样本极限下收敛于一个连续统变分问题的光滑解,该解连续地达到标签值。我们的方法快速且易于实现。

英文摘要

The performance of traditional graph Laplacian methods for semi-supervised learning degrades substantially as the ratio of labeled to unlabeled data decreases, due to a degeneracy in the graph Laplacian. Several approaches have been proposed recently to address this; however, we show that some of them remain ill-posed in the large-data limit. In this paper, we show a way to correctly set the weights in Laplacian regularization so that the estimator remains well posed and stable in the large-sample limit. We prove that our semi-supervised learning algorithm converges, in the infinite sample size limit, to the smooth solution of a continuum variational problem that attains the labeled values continuously. Our method is fast and easy to implement.

1904.00035 2026-06-04 cs.RO cs.LG cs.SY eess.SY stat.ML 版本更新

Autonomous Highway Driving using Deep Reinforcement Learning

使用深度强化学习实现自动驾驶高速公路驾驶

Subramanya Nageshrao, Eric Tseng, Dimitar Filev

发表机构 * Ford Greenfield Labs(福特绿谷实验室) Ford Research and Innovation Center(福特研究与创新中心)

AI总结 本文提出了一种基于强化学习的方法,通过与模拟交通直接交互,使自动驾驶车辆在复杂和多变的环境中做出决策,解决了传统规则和预设成本函数在实时优化中的不足,提高了学习效率和安全性。

详情
AI中文摘要

自动驾驶车辆的操作空间可以是多样的,并且可能显著变化。这可能导致设计阶段未预料到的场景。因此,基于规则的决策者选择动作可能并不理想。同样,设计一个先验成本函数然后在实时中求解最优控制问题可能也不够有效。为了应对这些问题并避免在遇到意外场景时出现异常行为,我们提出了一种基于强化学习(RL)的方法,其中自动驾驶车辆通过与模拟交通直接交互来学习决策。决策者由深度神经网络实现,根据给定的系统状态提供动作选择。在关键应用如驾驶中,没有明确安全概念的RL代理可能无法收敛,或者需要极大量的样本才能找到可靠的策略。为了更好地解决这个问题,本文将强化学习与额外的短时间安全检查(SC)相结合。在关键场景中,安全检查还将为代理提供替代的安全动作,如果存在的话。这导致了两个新的贡献。首先,它扩展了可能导致不良“接近事件”或“碰撞”的状态。其次,安全检查的加入可以提供一个安全且稳定的训练环境。这显著提高了学习效率,同时不抑制有意义的探索,以确保安全和最优的学习行为。我们展示了所开发算法在高速公路驾驶场景中的性能,其中训练好的自动驾驶车辆在高速公路环境下遇到不同交通密度的情况。

英文摘要

The operational space of an autonomous vehicle (AV) can be diverse and vary significantly. This may lead to a scenario that was not postulated in the design phase. Due to this, formulating a rule based decision maker for selecting maneuvers may not be ideal. Similarly, it may not be effective to design an a-priori cost function and then solve the optimal control problem in real-time. In order to address these issues and to avoid peculiar behaviors when encountering an unforeseen scenario, we propose a reinforcement learning (RL) based method, where the ego car, i.e., an autonomous vehicle, learns to make decisions by directly interacting with simulated traffic. The decision maker for the AV is implemented as a deep neural network providing an action choice for a given system state. In a critical application such as driving, an RL agent without an explicit notion of safety may not converge or it may need an extremely large number of samples before finding a reliable policy. To best address the issue, this paper incorporates reinforcement learning with an additional short horizon safety check (SC). In a critical scenario, the safety check will also provide an alternate safe action to the agent, provided it exists. This leads to two novel contributions. First, it generalizes the states that could lead to undesirable "near-misses" or "collisions". Second, inclusion of the safety check can provide a safe and stable training environment. This significantly enhances learning efficiency without inhibiting meaningful exploration to ensure safe and optimal learned behavior. We demonstrate the performance of the developed algorithm in a highway driving scenario where the trained AV encounters varying traffic density in a highway setting.

1808.10788 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Data-driven discovery of PDEs in complex datasets

基于数据的复杂数据集中的PDE发现

Jens Berg, Kaj Nyström

发表机构 * Department of Mathematics, Uppsala University(乌普萨拉大学数学系)

AI总结 本文通过机器学习方法从复杂数据集中发现隐藏的偏微分方程,展示了如何通过数据转换和特征选择来揭示物理过程的PDE,并在非线性二阶PDE和瑞典温度分布模拟中验证了该方法的有效性。

详情
AI中文摘要

许多科学和工程中的过程可以用偏微分方程(PDEs)来描述。传统上,PDEs是通过考虑物理基本原理来推导所涉及的物理量之间的关系而得到的。另一种方法是测量感兴趣的量并使用深度学习来逆向推导描述物理过程的PDEs。本文使用机器学习,特别是深度学习,来从测量数据中发现隐藏在复杂数据集中的PDEs。我们包括来自已知模型问题的数据示例和来自气象站的实测数据。我们展示了输入数据的必要转换相当于所发现PDE中的坐标变换,并详细阐述了特征和模型选择。结果表明,非线性二阶PDE的动力学可以被一个常微分方程准确描述,该方程由我们的深度学习算法自动发现。更有趣的是,我们在更复杂的瑞典温度分布模拟中也展示了类似的结果。

英文摘要

Many processes in science and engineering can be described by partial differential equations (PDEs). Traditionally, PDEs are derived by considering first principles of physics to derive the relations between the involved physical quantities of interest. A different approach is to measure the quantities of interest and use deep learning to reverse engineer the PDEs which are describing the physical process. In this paper we use machine learning, and deep learning in particular, to discover PDEs hidden in complex data sets from measurement data. We include examples of data from a known model problem, and real data from weather station measurements. We show how necessary transformations of the input data amount to coordinate transformations in the discovered PDE, and we elaborate on feature and model selection. It is shown that the dynamics of a non-linear, second order PDE can be accurately described by an ordinary differential equation which is automatically discovered by our deep learning algorithm. Even more interestingly, we show that similar results apply in the context of more complex simulations of the Swedish temperature distribution.

1810.08754 2026-06-04 math.NA cs.LG cs.NA eess.SP 版本更新

BCR-Net: a neural network based on the nonstandard wavelet form

BCR-Net: 一种基于非标准小波形式的神经网络

Yuwei Fan, Cindy Orozco Bohorquez, Lexing Ying

发表机构 * Department of Mathematics, Stanford University(斯坦福大学数学系) Institute for Computational and Mathematical Engineering, Stanford University(斯坦福大学计算与数学工程研究所) Department of Mathematics and ICME, Stanford University(斯坦福大学数学系和计算与数学工程研究所) Facebook AI Research, Menlo Park, CA(脸书人工智能研究(Menlo Park, CA))

AI总结 本文提出了一种基于非标准小波形式的神经网络架构,该架构通过将非标准形式的矩阵向量乘法算法表示为线性神经网络,其中每个多分辨率计算的尺度都由局部连接的线性子网络完成,并通过用更深层次和强大的非线性子网络替换线性子网络来扩展以解决非线性问题。

Comments 17 pages and 9 figures

详情
AI中文摘要

本文提出了一种新颖的神经网络架构,灵感来源于Beylkin、Coifman和Rokhlin在[Communications on Pure and Applied Mathematics, 44(2), 141-183]中提出的非标准形式。非标准形式是一种针对线性积分算子的、非常有效的基于小波的压缩方案。在本文中,我们首先将非标准形式的矩阵向量乘法算法表示为线性神经网络,其中多分辨率计算的每个尺度都通过局部连接的线性子网络完成。为了处理非线性问题,我们提出了一种扩展,称为BCR-Net,即将每个线性子网络替换为更深、更强大的非线性子网络。数值结果通过近似均匀化理论和随机计算中出现的非线性映射,展示了新架构的效率。

英文摘要

This paper proposes a novel neural network architecture inspired by the nonstandard form proposed by Beylkin, Coifman, and Rokhlin in [Communications on Pure and Applied Mathematics, 44(2), 141-183]. The nonstandard form is a highly effective wavelet-based compression scheme for linear integral operators. In this work, we first represent the matrix-vector product algorithm of the nonstandard form as a linear neural network where every scale of the multiresolution computation is carried out by a locally connected linear sub-network. In order to address nonlinear problems, we propose an extension, called BCR-Net, by replacing each linear sub-network with a deeper and more powerful nonlinear one. Numerical results demonstrate the efficiency of the new architecture by approximating nonlinear maps that arise in homogenization theory and stochastic computation.

1903.10343 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Sample Complexity Lower Bounds for Linear System Identification

线性系统辨识的样本复杂性下界

Yassir Jedra, Alexandre Proutiere

AI总结 本文基于线性系统辨识问题,推导了针对特定问题的样本复杂性下界,该下界在PAC框架下定义,用于确定系统参数的识别时间和精度。研究针对受控和不受控系统,为不受控系统推导了基于有限时间可控性格拉姆矩阵的下界,并推导了仅依赖系统谱的简化下界。对于受控系统,下界虽不如不受控系统明确,但可为设计最小样本复杂度的控制策略提供见解。

详情
AI中文摘要

本文建立了线性系统辨识问题的特定问题样本复杂性下界。样本复杂性在PAC框架下定义,对应于在指定精度和置信水平下识别系统参数所需的时间。所谓特定问题,是指下界明确依赖于待识别的系统(与极小极大(minimax)下界形成对比),从而真正捕捉到特定系统的识别难度。我们考虑了受控和不受控系统。对于不受控系统,下界适用于任何线性系统(稳定或不稳定),仅依赖于系统的有限时间可控性格拉姆矩阵。还推导了仅依赖系统谱的简化下界。鉴于最近对经典估计方法(如普通最小二乘法)的有限时间分析,我们的样本复杂性下界对于许多系统来说是紧的。对于受控系统,我们的下界不如不受控系统明确,但可能为设计具有最小样本复杂度的控制策略提供有趣的见解。

英文摘要

This paper establishes problem-specific sample complexity lower bounds for linear system identification problems. The sample complexity is defined in the PAC framework: it corresponds to the time it takes to identify the system parameters with prescribed accuracy and confidence levels. By problem-specific, we mean that the lower bound explicitly depends on the system to be identified (which contrasts with minimax lower bounds), and hence really captures the identification hardness specific to the system. We consider both uncontrolled and controlled systems. For uncontrolled systems, the lower bounds are valid for any linear system, stable or not, and depend only on the system's finite-time controllability Gramian. A simplified lower bound depending on the spectrum of the system only is also derived. In view of recent finite-time analysis of classical estimation methods (e.g. ordinary least squares), our sample complexity lower bounds are tight for many systems. For controlled systems, our lower bounds are not as explicit as in the case of uncontrolled systems, but could well provide interesting insights into the design of control policy with minimal sample complexity.

1902.06094 2026-06-04 cs.NE cs.LG cs.SY eess.SY 版本更新

Differentiable reservoir computing

可微储备池(reservoir)计算

Lyudmila Grigoryeva, Juan-Pablo Ortega

发表机构 * Department of Mathematics and Statistics(数学与统计学系) Centre National de la Recherche Scientifique (CNRS)(国家科学研究中心(CNRS))

AI总结 本文研究了 reservoir 计算系统在不同可微性条件下的特性,提出了一种新的方法来分析 reservoir 过滤器的可微性,并展示了其在混沌动力系统学习中的应用。

Comments 60 pages

详情
AI中文摘要

在过去二十年中,大量努力致力于确定 reservoir 计算系统在所谓的回声状态(ESP)和衰减记忆(FMP)特性下的情况。这些重要特性在数学上相当于全局 reservoir 系统解的存在性和连续性。本文通过为非常一般类别的离散时间确定性输入刻画 reservoir 过滤器的可微性,从而补充了这一研究。这构成了对长期研究 ESP 和 FMP 的重要贡献,并特别与现有研究中的 ESP 输入依赖性相关联。文献中已证明可微性是学习混沌动力系统吸引子的关键特征。在分析情况下,利用泰勒定理构造了 reservoir 过滤器的 Volterra 型级数表示,并提供了相应的近似界。最后,这些结果的推论表明,任何衰减记忆过滤器都可以通过具有有限记忆的有限 Volterra 级数均匀近似。

英文摘要

Much effort has been devoted in the last two decades to characterize the situations in which a reservoir computing system exhibits the so-called echo state (ESP) and fading memory (FMP) properties. These important features amount, in mathematical terms, to the existence and continuity of global reservoir system solutions. That research is complemented in this paper with the characterization of the differentiability of reservoir filters for very general classes of discrete-time deterministic inputs. This constitutes a novel strong contribution to the long line of research on the ESP and the FMP and, in particular, links to existing research on the input-dependence of the ESP. Differentiability has been shown in the literature to be a key feature in the learning of attractors of chaotic dynamical systems. A Volterra-type series representation for reservoir filters with semi-infinite discrete-time inputs is constructed in the analytic case using Taylor's theorem and corresponding approximation bounds are provided. Finally, it is shown as a corollary of these results that any fading memory filter can be uniformly approximated by a finite Volterra series with finite memory.

1903.09136 2026-06-04 stat.ML cs.LG cs.SY eess.SP eess.SY 版本更新

On Approximate Nonlinear Gaussian Message Passing On Factor Graphs

关于因子图上的近似非线性高斯信息传递

Eike Petersen, Christian Hoffmann, Philipp Rostalski

发表机构 * Institute for Electrical Engineering in Medicine, University of Lübeck(医学电气工程研究所,吕贝克大学)

AI总结 本文提出了一种基于因子图的近似高斯信息传递规则,用于处理确定性非线性变换节点,通过数值求积和Rauch-Tung-Striebel型近似方法,为非线性问题的求解提供了新的算法框架。

详情
Journal ref
2018 IEEE Statistical Signal Processing Workshop (SSP)
AI中文摘要

因子图近年来因其作为信号处理、估计和控制算法表示和构建的统一框架而受到越来越多的关注。因子图工具包中似乎未充分探索的一个能力是利用表格信息传递规则处理确定性非线性变换,如非线性滤波和平滑问题中的变换。在本贡献中,我们基于前向传递的数值求积过程和后向传递的Rauch-Tung-Striebel型近似方法,为满足马尔可夫性质的任意因子图中的确定性非线性变换节点提供了通用的前向(滤波)和后向(平滑)近似高斯信息传递规则。这些信息传递规则可用于推导许多使用因子图求解非线性问题的算法,如基于所提出信息传递规则的非线性修改Bryson-Frazier(MBF)平滑器的提出。

英文摘要

Factor graphs have recently gained increasing attention as a unified framework for representing and constructing algorithms for signal processing, estimation, and control. One capability that does not seem to be well explored within the factor graph tool kit is the ability to handle deterministic nonlinear transformations, such as those occurring in nonlinear filtering and smoothing problems, using tabulated message passing rules. In this contribution, we provide general forward (filtering) and backward (smoothing) approximate Gaussian message passing rules for deterministic nonlinear transformation nodes in arbitrary factor graphs fulfilling a Markov property, based on numerical quadrature procedures for the forward pass and a Rauch-Tung-Striebel-type approximation of the backward pass. These message passing rules can be employed for deriving many algorithms for solving nonlinear problems using factor graphs, as is illustrated by the proposition of a nonlinear modified Bryson-Frazier (MBF) smoother based on the presented message passing rules.

1903.09122 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Finite Sample Analysis of Stochastic System Identification

随机系统辨识的有限样本分析

Anastasios Tsiamis, George J. Pappas

发表机构 * Department of Electrical and Systems Engineering, University of Pennsylvania(宾夕法尼亚大学电气与系统工程系)

AI总结 本文基于机器学习和统计学的现代工具,研究了随机系统辨识的有限样本复杂性。通过子空间辨识算法和有限数量的输出样本,提供了系统参数估计误差的非渐近高概率上界,证明了在高概率下估计误差以1/√N的速度减小。

Comments Under review

详情
AI中文摘要

在本文中,我们利用现代机器学习和统计学工具,分析了随机系统辨识的有限样本复杂性。一个未知的离散时间线性系统在高斯噪声下随时间演变,没有外部输入。目标是在给定有限时间跨度N内的单条输出测量轨迹的情况下,恢复系统参数以及卡尔曼滤波增益。基于子空间辨识算法和有限数量的N个输出样本,我们提供了系统参数估计误差的非渐近高概率上界。我们的分析利用了最近的随机矩阵理论、自归一化鞅和SVD鲁棒性结果,以证明在高概率下估计误差以1/√N的速度减小。我们的非渐近界不仅与经典渐近结果一致,而且即使在系统处于临界稳定的情况下也有效。

英文摘要

In this paper, we analyze the finite sample complexity of stochastic system identification using modern tools from machine learning and statistics. An unknown discrete-time linear system evolves over time under Gaussian noise without external inputs. The objective is to recover the system parameters as well as the Kalman filter gain, given a single trajectory of output measurements over a finite horizon of length $N$. Based on a subspace identification algorithm and a finite number of $N$ output samples, we provide non-asymptotic high-probability upper bounds for the system parameter estimation errors. Our analysis uses recent results from random matrix theory, self-normalized martingales and SVD robustness, in order to show that with high probability the estimation errors decrease with a rate of $1/\sqrt{N}$. Our non-asymptotic bounds not only agree with classical asymptotic results, but are also valid even when the system is marginally stable.

1903.08792 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

End-to-End Safe Reinforcement Learning through Barrier Functions for Safety-Critical Continuous Control Tasks

通过障碍函数实现端到端安全强化学习用于安全关键的连续控制任务

Richard Cheng, Gabor Orosz, Richard M. Murray, Joel W. Burdick

发表机构 * California Institute of Technology(加州理工学院) University of Michigan(密歇根大学)

AI总结 本文提出了一种结合模型无关强化学习控制器、基于控制障碍函数的控制器以及在线学习未知系统动力学的控制器架构,以确保学习过程中的安全性,通过Gaussian Processes建模系统动力学并展示在倒立摆和无线车对车自主跟车任务中更高的样本效率和安全性。

Comments Published in AAAI 2019

详情
AI中文摘要

强化学习(RL)算法在模拟应用之外取得的成功有限,主要原因之一是学习过程中缺乏安全性保证。现实世界中的系统很可能在学到最优控制器之前就已失效或损坏。为了解决这个问题,我们提出了一种控制器架构,结合(1)无模型的RL控制器、(2)利用控制障碍函数(CBFs)的基于模型的控制器以及(3)对未知系统动力学的在线学习,以确保学习过程中的安全性。我们的通用框架利用RL算法的成功来学习高性能控制器,而基于CBF的控制器通过约束可探索策略集来保证安全并引导学习过程。我们利用高斯过程(GPs)来建模系统动力学及其不确定性。我们的新型控制器合成算法RL-CBF在学习过程中以高概率保证安全性,无论使用何种RL算法,并展示了更高的策略探索效率。我们在(1)倒立摆控制和(2)具有无线车辆到车辆通信的自动驾驶跟车任务中测试了我们的算法,并展示了我们的算法在学习过程中比其他最先进的算法具有更高的样本效率,并在整个学习过程中保持安全。

英文摘要

Reinforcement Learning (RL) algorithms have found limited success beyond simulated applications, and one main reason is the absence of safety guarantees during the learning process. Real world systems would realistically fail or break before an optimal controller can be learned. To address this issue, we propose a controller architecture that combines (1) a model-free RL-based controller with (2) model-based controllers utilizing control barrier functions (CBFs) and (3) on-line learning of the unknown system dynamics, in order to ensure safety during learning. Our general framework leverages the success of RL algorithms to learn high-performance controllers, while the CBF-based controllers both guarantee safety and guide the learning process by constraining the set of explorable policies. We utilize Gaussian Processes (GPs) to model the system dynamics and its uncertainties. Our novel controller synthesis algorithm, RL-CBF, guarantees safety with high probability during the learning process, regardless of the RL algorithm used, and demonstrates greater policy exploration efficiency. We test our algorithm on (1) control of an inverted pendulum and (2) autonomous car-following with wireless vehicle-to-vehicle communication, and show that our algorithm attains much greater sample efficiency in learning than other state-of-the-art algorithms and maintains safety during the entire learning process.
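
The way a CBF-based controller constrains the RL action can be illustrated with a toy sketch. It is not the RL-CBF algorithm itself: it assumes a known one-dimensional single integrator x' = u with safe set h(x) = x_max - x >= 0, replaces the learned policy with a fixed aggressive action, and omits the Gaussian Process model entirely. The names `cbf_filter` and `rollout` and all constants are hypothetical.

```python
def cbf_filter(x, u_rl, x_max=1.0, alpha=2.0):
    """Minimally modify the RL action so that dh/dt >= -alpha * h holds.

    For x' = u and h(x) = x_max - x, the condition reads u <= alpha * (x_max - x),
    so the closest safe action to u_rl is a simple clamp (closed-form QP solution).
    Returns the safe action and the correction added to the RL action.
    """
    u_bound = alpha * (x_max - x)
    u_safe = min(u_rl, u_bound)
    return u_safe, u_safe - u_rl


def rollout(policy, x0=0.0, dt=0.05, steps=200):
    """Euler-simulate the filtered closed loop (assumes alpha * dt <= 1)."""
    xs = [x0]
    x = x0
    for _ in range(steps):
        u_safe, _ = cbf_filter(x, policy(x))
        x = x + dt * u_safe
        xs.append(x)
    return xs


# Stand-in for an exploring RL policy that always pushes hard toward the boundary.
xs = rollout(lambda x: 5.0)
```

In this sketch the state approaches `x_max` without crossing it, while an action that already satisfies the constraint passes through unchanged. With unknown dynamics the bound would have to account for model uncertainty, which is the role the abstract assigns to the GP model.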

1803.02099 2026-06-04 cs.LG cs.SY eess.SY 版本更新

A Hybrid Method for Traffic Flow Forecasting Using Multimodal Deep Learning

一种用于交通流预测的混合方法:使用多模态深度学习

Shengdong Du, Tianrui Li, Xun Gong, Shi-Jinn Horng

发表机构 * School of Information Science and Technology, National Engineering Laboratory of Integrated Transportation Big Data Application Technology(信息科学与技术学院,综合交通大数据应用技术国家工程实验室) Department of Computer Science and Information Engineering, National Taiwan University of Science and Technology(计算机科学与信息工程系,台湾科技大学)

AI总结 本文提出了一种混合多模态深度学习方法,用于短期交通流预测,通过注意力辅助多模态深度学习架构联合和自适应学习多模态交通数据的空间时间相关特征和长期时间依赖性。

详情
AI中文摘要

交通流预测被视为智能交通系统的关键问题。在本工作中,我们提出了一种混合多模态深度学习方法,用于短期交通流预测,该方法通过注意力辅助多模态深度学习架构,联合和自适应地学习多模态交通数据的空间时间相关特征和长期时间依赖性。根据多模态交通数据的强非线性特征,我们方法的基础模块由一维卷积神经网络(1D CNN)和门控循环单元(GRU)组成,其中前者用于捕捉局部趋势特征,后者用于捕捉长期时间依赖性。然后,我们设计了一个混合多模态深度学习框架(HMDLF),通过多个CNN-GRU-Attention模块融合不同模态交通数据的共享表示特征。实验结果表明,所提出的多模态深度学习模型能够有效处理复杂的非线性城市交通流预测,并具有满意的准确性和有效性。

英文摘要

Traffic flow forecasting has been regarded as a key problem of intelligent transport systems. In this work, we propose a hybrid multimodal deep learning method for short-term traffic flow forecasting, which can jointly and adaptively learn the spatial-temporal correlation features and long temporal interdependence of multi-modality traffic data by an attention auxiliary multimodal deep learning architecture. According to the highly nonlinear characteristics of multi-modality traffic data, the base module of our method consists of one-dimensional Convolutional Neural Networks (1D CNN) and Gated Recurrent Units (GRU) with the attention mechanism. The former captures the local trend features and the latter captures the long temporal dependencies. Then, we design a hybrid multimodal deep learning framework (HMDLF) for fusing shared representation features of traffic data from different modalities by multiple CNN-GRU-Attention modules. The experimental results indicate that the proposed multimodal deep learning model is capable of dealing with complex nonlinear urban traffic flow forecasting with satisfying accuracy and effectiveness.

1903.05196 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

A Review of Reinforcement Learning for Autonomous Building Energy Management

自主建筑能源管理中强化学习的综述

Karl Mason, Santiago Grijalva

发表机构 * School of Electrical and Computer Engineering(电气与计算机工程学院) Georgia Institute of Technology(佐治亚理工学院)

AI总结 本文综述了强化学习在自主建筑能源管理系统中的应用,总结了相关文献,并探讨了未来研究方向和挑战。

Comments 17 pages, 3 figures

详情
AI中文摘要

近年来,建筑能源管理领域受到了广泛关注。该领域致力于结合传感器技术、通信技术和先进控制算法,以优化能源利用。强化学习是用于控制问题中最突出的机器学习算法之一,并已在建筑能源管理领域取得了许多成功应用。本文对与强化学习应用于开发自主建筑能源管理系统相关的文献进行了全面回顾。还概述了强化学习未来的研究方向和挑战。

英文摘要

The area of building energy management has received a significant amount of interest in recent years. This area is concerned with combining advancements in sensor technologies, communications and advanced control algorithms to optimize energy utilization. Reinforcement learning is one of the most prominent machine learning algorithms used for control problems and has had many successful applications in the area of building energy management. This research gives a comprehensive review of the literature relating to the application of reinforcement learning to developing autonomous building energy management systems. The main directions for future research and the challenges in reinforcement learning are also outlined.

1903.01032 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

A Fundamental Performance Limitation for Adversarial Classification

对抗分类中的基本性能限制

Abed AlRahman Al Makdah, Vaibhav Katewa, Fabio Pasqualetti

AI总结 本文研究了对抗分类中的基本性能限制,指出在优化准确率的过程中,二分类算法不可避免地会变得更加敏感于数据的对抗操纵,并且准确率与敏感度之间的根本权衡曲线仅取决于数据的统计特性,无法通过调整算法来改进。

详情
AI中文摘要

尽管机器学习算法被广泛用于解决技术、经济和社会相关的问题,但对这些数据驱动算法性能的可证明保证却严重不足,尤其是在数据来自不可靠来源并通过未保护和易受攻击的通道传输时。在本文中,我们采取了重要的步骤来弥合这一差距,并正式证明,在试图优化其准确率时,二分类算法——包括基于机器学习技术的算法——不可避免地会变得更加敏感于数据的对抗操纵。进一步地,对于具有相同复杂度(即分类边界数量)的给定算法类,准确率与敏感度之间的根本权衡曲线仅取决于数据的统计特性,无法通过调整算法来改进。

英文摘要

Despite the widespread use of machine learning algorithms to solve problems of technological, economic, and social relevance, provable guarantees on the performance of these data-driven algorithms are critically lacking, especially when the data originates from unreliable sources and is transmitted over unprotected and easily accessible channels. In this paper we take an important step to bridge this gap and formally show that, in a quest to optimize their accuracy, binary classification algorithms -- including those based on machine-learning techniques -- inevitably become more sensitive to adversarial manipulation of the data. Further, for a given class of algorithms with the same complexity (i.e., number of classification boundaries), the fundamental tradeoff curve between accuracy and sensitivity depends solely on the statistics of the data, and cannot be improved by tuning the algorithm.

1903.05817 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A New Approach for Distributed Hypothesis Testing with Extensions to Byzantine-Resilience

分布式假设检验的一种新方法及其对拜占庭容错的扩展

Aritra Mitra, John A. Richards, Shreyas Sundaram

发表机构 * School of Electrical and Computer Engineering at Purdue University(普渡大学电气与计算机工程学院) Sandia National Laboratories(桑迪亚国家实验室)

AI总结 本文提出了一种新的分布式学习规则,用于在时间序列中联合观察资料下学习真实的状态,该方法不采用信念平均,且能扩展到处理网络中某些代理的恶意行为。

Comments To appear in the Proceedings of the American Control Conference, 2019

详情
AI中文摘要

我们研究了一个场景,其中一组代理各自接收部分信息的私人观察,试图协作学习能够解释他们随时间变化的联合观察资料的真实状态(在一组假设中)。为了解决这个问题,我们提出了一种分布式学习规则,与现有方法不同,它不采用任何形式的“信念平均”。具体来说,每个代理维护一个本地信念(对每个假设),该信念以贝叶斯方式更新,不受网络影响,同时维护一个实际信念,该信念在更新(除归一化外)时是其自身本地信念和邻居实际信念的最小值。在对代理信号结构和底层通信图的最小要求下,我们建立了所提出信念更新规则的一致性,即我们证明了代理的实际信念几乎必然渐近地集中在真实状态上。作为我们方法的一个关键好处,我们展示了我们的学习规则可以扩展到捕捉网络中某些代理的恶意行为,通过拜占庭对手模型。特别是,我们证明在适当的观察模型和网络拓扑条件下,每个非恶意代理几乎必然渐近地学习世界的真实状态。

英文摘要

We study a setting where a group of agents, each receiving partially informative private observations, seek to collaboratively learn the true state (among a set of hypotheses) that explains their joint observation profiles over time. To solve this problem, we propose a distributed learning rule that differs fundamentally from existing approaches, in the sense that it does not employ any form of "belief-averaging". Specifically, every agent maintains a local belief (on each hypothesis) that is updated in a Bayesian manner without any network influence, and an actual belief that is updated (up to normalization) as the minimum of its own local belief and the actual beliefs of its neighbors. Under minimal requirements on the signal structures of the agents and the underlying communication graph, we establish consistency of the proposed belief update rule, i.e., we show that the actual beliefs of the agents asymptotically concentrate on the true state almost surely. As one of the key benefits of our approach, we show that our learning rule can be extended to scenarios that capture misbehavior on the part of certain agents in the network, modeled via the Byzantine adversary model. In particular, we prove that each non-adversarial agent can asymptotically learn the true state of the world almost surely, under appropriate conditions on the observation model and the network topology.
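
The two-belief update described in the abstract can be sketched in a few lines. The following is a simplified illustration based only on that description, not the authors' implementation: the three-agent line graph, the likelihood tables, and the fixed signal sequence are all assumptions chosen so the example is deterministic, and the Byzantine-resilient extension is not modeled.

```python
import numpy as np


def bayes_update(local, lik):
    """Local belief: plain Bayesian update with the agent's own signal likelihoods."""
    post = local * lik
    return post / post.sum()


def min_rule(local, actual, neighbors):
    """Actual belief: per-hypothesis minimum of the agent's own local belief and
    its neighbors' previous actual beliefs, followed by normalization."""
    new_actual = np.empty_like(actual)
    for i, nbrs in enumerate(neighbors):
        candidates = [local[i]] + [actual[j] for j in nbrs]
        m = np.min(candidates, axis=0)
        new_actual[i] = m / m.sum()
    return new_actual


def run_demo(steps=20):
    # Line graph 0 - 1 - 2; the true hypothesis is index 0.
    neighbors = [[1], [0, 2], [1]]
    # Likelihood of signal s under (hypothesis 0, hypothesis 1). Only agent 0 is informative.
    informative = {1: np.array([0.8, 0.2]), 0: np.array([0.2, 0.8])}
    flat = {1: np.array([0.5, 0.5]), 0: np.array([0.5, 0.5])}
    models = [informative, flat, flat]
    # Fixed signal sequence with the frequencies expected under hypothesis 0.
    signals = [1, 1, 1, 1, 0] * (steps // 5)
    local = np.full((3, 2), 0.5)
    actual = np.full((3, 2), 0.5)
    for s in signals:
        local = np.array([bayes_update(local[i], models[i][s]) for i in range(3)])
        actual = min_rule(local, actual, neighbors)
    return local, actual


local, actual = run_demo()
```

In this toy run agents 1 and 2 never learn anything locally (their local beliefs stay uniform), yet their actual beliefs still concentrate on hypothesis 0, because the minimum propagates the low belief that agent 0 assigns to the false hypothesis.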

1903.05355 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

A Framework for On-line Learning of Underwater Vehicles Dynamic Models

在线学习水下机器人动态模型的框架

Bilal Wehbe, Marc Hildebrandt, Frank Kirchner

发表机构 * DFKI - Robotic Innovation Center(DFKI机器人创新中心)

AI总结 本文提出了一种在线学习水下机器人动态模型的框架,通过增量支持向量回归方法从数据流中逐步学习模型,并结合增量学习策略来改进模型在整体状态空间上的泛化能力。

Comments 8 pages, 6 figures, ICRA 2019 authors preprint

详情
AI中文摘要

从数据中学习机器人的动力学有助于实现更精确的跟踪控制器,或帮助其导航算法。然而,当由于外部条件变化导致机器人实际动力学变化时,需要在线调整其模型以保持高性能。本文提出了一种在线学习机器人动力学的框架,以适应此类变化。所提出的框架采用增量支持向量回归方法,从数据流中逐步学习模型。结合增量学习,开发了包括和遗忘数据的策略,以在整体状态空间上获得更好的泛化能力。该框架在仿真和真实实验场景中进行了测试,展示了其适应机器人动力学变化的能力。

英文摘要

Learning the dynamics of robots from data can help achieve more accurate tracking controllers, or aid their navigation algorithms. However, when the actual dynamics of the robots change due to external conditions, on-line adaptation of their models is required to maintain high fidelity performance. In this work, a framework for on-line learning of robot dynamics is developed to adapt to such changes. The proposed framework employs an incremental support vector regression method to learn the model sequentially from data streams. In combination with the incremental learning, strategies for including and forgetting data are developed to obtain better generalization over the whole state space. The framework is tested in simulation and real experimental scenarios demonstrating its adaptation capabilities to changes in the robot's dynamics.

1903.04958 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Real-Time Boiler Control Optimization with Machine Learning

基于机器学习的实时锅炉控制优化

Yukun Ding, Yiyu Shi

发表机构 * University of Notre Dame(诺特达姆大学)

AI总结 本文提出利用机器学习优化燃煤电厂锅炉实时控制,通过优化不同区域的温度分布和炉膛氧含量,提高锅炉稳定性与能源效率。

Comments To appear in TC-CPS Newsletter

详情
AI中文摘要

在燃煤电厂中,提高锅炉运行效率对于可持续发展至关重要。本文将实时锅炉控制建模为一个优化问题,寻找不同区域的最佳温度分布和炉膛氧含量,以提高锅炉的稳定性和能源效率。我们采用一种高效的算法,结合适当的机器学习和优化技术。我们从行业合作伙伴处获得了一个超过两个月的实时锅炉数据集,并进行了广泛的实验,以证明所提算法的有效性和效率。

英文摘要

In coal-fired power plants, it is critical to improve the operational efficiency of boilers for sustainability. In this work, we formulate real-time boiler control as an optimization problem that looks for the best distribution of temperature in different zones and oxygen content from the flue to improve the boiler's stability and energy efficiency. We employ an efficient algorithm by integrating appropriate machine learning and optimization techniques. We obtain from our industry partner a large dataset collected from a real boiler over more than two months, and conduct extensive experiments to demonstrate the effectiveness and efficiency of the proposed algorithm.

1903.04681 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Estimating multi-class dynamic origin-destination demand through a forward-backward algorithm on computational graphs

通过计算图上的前向-后向算法估计多类动态起讫点(OD)需求

Wei Ma, Xidong Pi, Sean Qian

发表机构 * Department of Civil and Environmental Engineering(土木与环境工程系) Carnegie Mellon University(卡内基梅隆大学)

AI总结 本文提出了一种基于计算图的多类动态起讫点(OD)需求估计(MCDODE)框架,通过前向-后向算法和基于树的累积曲线估计OD需求梯度,以解决大规模交通网络中多类时空车流估计的挑战。

Comments 31 pages, 21 figures, submitted to Transportation Research Part C: Emerging Technologies

详情
AI中文摘要

交通网络的复杂性前所未有,具有异质性车流。传统上,车辆类别通过车辆分类(如标准乘用车和卡车)来考虑。然而,车辆流的异质性源于许多其他方面,例如网约车与个人车辆、人工驾驶车辆与联网和自动驾驶车辆。在大规模交通网络中,给定每个类别的部分车流观测,如何估计多类时空车流,即时变的起讫点(OD)需求和路径/路段流量,仍是一个重大挑战。本文提出了一种用于大规模网络的多类动态OD需求估计(MCDODE)的解决方案框架,该框架建立在计算图之上,用张量表示时空流以及MCDODE公式中涉及的所有中间特征。提出了一种前向-后向算法,以在计算图上高效求解MCDODE公式。此外,我们提出了一种新的基于树的累积曲线概念来估计OD需求的梯度,并开发了Growing Tree算法来构建这些累积曲线。所提出的框架在小型网络以及实际的大规模网络上进行了检验。实验结果表明,所提出的框架具有竞争力、令人满意且计算上可行。

英文摘要

Transportation networks are unprecedentedly complex with heterogeneous vehicular flow. Conventionally, vehicle classes are considered by vehicle classifications (such as standard passenger cars and trucks). However, vehicle flow heterogeneity stems from many other aspects in general, e.g., ride-sourcing vehicles versus personal vehicles, human driven vehicles versus connected and automated vehicles. Provided with some observations of vehicular flow for each class in a large-scale transportation network, how to estimate the multi-class spatio-temporal vehicular flow, in terms of time-varying Origin-Destination (OD) demand and path/link flow, remains a big challenge. This paper presents a solution framework for multi-class dynamic OD demand estimation (MCDODE) in large-scale networks. The proposed framework is built on a computational graph with tensor representations of spatio-temporal flow and all intermediate features involved in the MCDODE formulation. A forward-backward algorithm is proposed to efficiently solve the MCDODE formulation on computational graphs. In addition, we propose a novel concept of tree-based cumulative curves to estimate the gradient of OD demand. A Growing Tree algorithm is developed to construct tree-based cumulative curves. The proposed framework is examined on a small network as well as a real-world large-scale network. The experiment results indicate that the proposed framework is compelling, satisfactory and computationally plausible.

1903.03763 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

A tractable ellipsoidal approximation for voltage regulation problems

电压调节问题中的可处理椭球近似

Pan Li, Baihong Jin, Ruoxuan Xiong, Dai Wang, Alberto Sangiovanni-Vincentelli, Baosen Zhang

发表机构 * Facebook Inc.(Facebook公司) University of Washington(华盛顿大学) Stanford University(斯坦福大学) Tesla Inc.(特斯拉公司)

AI总结 本文提出了一种基于机器学习的方法来解决电力系统运行中电压调节问题中的机会约束优化问题,通过用椭球近似不确定性可行区域,提出了类似支持向量机的学习模型和高效的采样算法。

Comments accepted by ACC2019 http://acc2019.a2c2.org/

详情
AI中文摘要

我们提出了一种机器学习方法来解决电压调节问题中的机会约束优化问题。我们的方法新颖之处在于用椭球近似不确定性可行区域。我们使用类似于支持向量机(SVM)的学习模型来提出这个问题,并提出了一种高效的采样算法来训练模型。我们使用标准的IEEE配电测试馈线在电压调节问题上展示了我们的方法。

英文摘要

We present a machine learning approach to the solution of chance constrained optimizations in the context of voltage regulation problems in power system operation. The novelty of our approach resides in approximating the feasible region of uncertainty with an ellipsoid. We formulate this problem using a learning model similar to Support Vector Machines (SVM) and propose a sampling algorithm that efficiently trains the model. We demonstrate our approach on a voltage regulation problem using standard IEEE distribution test feeders.

1807.09519 2026-06-04 math.NA cs.LG cs.NA 版本更新

A machine learning framework for data driven acceleration of computations of differential equations

一种用于微分方程计算的数据驱动加速的机器学习框架

Siddhartha Mishra

发表机构 * Seminar for Applied Mathematics (SAM), D-Math ETH Zürich(应用数学研究所(SAM),苏黎世联邦理工学院数学系)

AI总结 本文提出了一种机器学习框架,用于加速时间依赖的常微分方程和偏微分方程的数值计算,通过将现有数值方法转化为人工神经网络,并通过离线训练过程最小化损失函数来确定可训练参数,从而提高计算效率。

详情
AI中文摘要

我们提出了一种机器学习框架,用于加速时间依赖的常微分方程和偏微分方程的数值计算。我们的方法是将(现有数值方法的)泛化形式作为人工神经网络,具有一个可训练的参数集。这些参数通过离线训练过程通过(随机)梯度下降方法(近似)最小化合适的(可能非凸)损失函数来确定。所提出的算法始终与底层微分方程保持一致。涉及线性和非线性ODE和PDE模型问题的数值实验显示,与标准数值方法相比,计算效率有显著提升。

英文摘要

We propose a machine learning framework to accelerate numerical computations of time-dependent ODEs and PDEs. Our method is based on recasting (generalizations of) existing numerical methods as artificial neural networks, with a set of trainable parameters. These parameters are determined in an offline training process by (approximately) minimizing suitable (possibly non-convex) loss functions by (stochastic) gradient descent methods. The proposed algorithm is designed to be always consistent with the underlying differential equation. Numerical experiments involving both linear and non-linear ODE and PDE model problems demonstrate a significant gain in computational efficiency over standard numerical methods.
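
The idea of recasting a numerical method as a model with trainable parameters can be illustrated with a deliberately small sketch. It is an assumption-laden toy rather than the paper's construction: a two-stage explicit scheme with a single trainable weight `theta` (Euler for `theta = 0`, Heun for `theta = 0.5`) is fitted by plain gradient descent on the test problem y' = -y, with targets taken from the exact solution. The names `step` and `train_theta` are hypothetical.

```python
import numpy as np


def step(f, y, h, theta):
    """One step of a two-stage scheme. The stage weights (1 - theta, theta) sum
    to one, so the scheme is consistent with y' = f(y) for every theta."""
    k1 = f(y)
    k2 = f(y + h * k1)
    return y + h * ((1.0 - theta) * k1 + theta * k2)


def train_theta(f, y0, target, h, lr=1.0, iters=200):
    """Fit theta by gradient descent on the mean squared one-step error."""
    theta = 0.0
    for _ in range(iters):
        k1 = f(y0)
        k2 = f(y0 + h * k1)
        pred = y0 + h * ((1.0 - theta) * k1 + theta * k2)
        # d(pred)/d(theta) = h * (k2 - k1)
        grad = np.mean(2.0 * (pred - target) * h * (k2 - k1))
        theta -= lr * grad
    return theta


f = lambda y: -y
h = 0.5                               # deliberately coarse step size
y0 = np.linspace(0.5, 2.0, 16)        # training initial states
target = y0 * np.exp(-h)              # exact solution after one step
theta = train_theta(f, y0, target, h)

err_euler = np.abs(step(f, y0, h, 0.0) - target).max()
err_heun = np.abs(step(f, y0, h, 0.5) - target).max()
err_trained = np.abs(step(f, y0, h, theta) - target).max()
```

For this linear problem the trained weight matches the closed-form optimum (exp(-h) - 1 + h) / h^2, and the one-step error on the coarse grid falls below that of both Euler and Heun. Consistency is retained for any value of `theta` because the stage weights always sum to one.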

1903.02219 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Training in Task Space to Speed Up and Guide Reinforcement Learning

在任务空间中训练以加速和引导强化学习

Guillaume Bellegarda, Katie Byl

AI总结 本文提出在任务空间中训练以提高强化学习的效率和稳定性,通过简化高自由度系统模型、利用正逆运动学以及在笛卡尔空间中学习运动策略,从而减少样本复杂度和训练时间。

详情
AI中文摘要

最近强化学习(RL)领域的突破在学习和部署真实世界机器人系统策略方面取得了显著进展。然而,即使使用当前最先进的算法和计算资源,这些算法仍然面临高样本复杂度的问题,导致训练时间长,尤其是对于高自由度(DOF)系统。此外,新兴策略缺乏感知稳定性和鲁棒性保证也引发了担忧。本文旨在通过以下方法缓解这些缺点:(1)用一个代表性简单的模型来建模复杂的高DOF系统;(2)明确使用正逆运动学,而不需要让RL算法自行学习;(3)在笛卡尔空间中学习运动策略,而不是关节空间。本文将这些方法应用于JPL的Robosimian,但可以轻松应用于任何具有基座和末端执行器的系统。这些运动策略可以在几分钟内生成,并在单台笔记本电脑上训练。我们比较了所学策略的鲁棒性与其他控制方法的鲁棒性。本文的配套视频可在https://youtu.be/xDxxSw5ahnc找到。

英文摘要

Recent breakthroughs in the reinforcement learning (RL) community have made significant advances towards learning and deploying policies on real world robotic systems. However, even with the current state-of-the-art algorithms and computational resources, these algorithms are still plagued with high sample complexity, and thus long training times, especially for high degree of freedom (DOF) systems. There are also concerns arising from lack of perceived stability or robustness guarantees from emerging policies. This paper aims at mitigating these drawbacks by: (1) modeling a complex, high DOF system with a representative simple one, (2) making explicit use of forward and inverse kinematics without forcing the RL algorithm to "learn" them on its own, and (3) learning locomotion policies in Cartesian space instead of joint space. In this paper these methods are applied to JPL's Robosimian, but can be readily used on any system with a base and end effector(s). These locomotion policies can be produced in just a few minutes, trained on a single laptop. We compare the robustness of the resulting learned policies to those of other control methods. An accompanying video for this paper can be found at https://youtu.be/xDxxSw5ahnc .
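
Points (2) and (3) of the abstract, using kinematics explicitly and letting the policy act in Cartesian space, can be sketched for a toy limb. The planar two-link leg below is an assumption for illustration only and is not a model of Robosimian; the link lengths and function names are hypothetical.

```python
import math


def forward_kinematics(q1, q2, l1=0.4, l2=0.4):
    """End-effector position of a planar two-link limb from its joint angles."""
    x = l1 * math.cos(q1) + l2 * math.cos(q1 + q2)
    y = l1 * math.sin(q1) + l2 * math.sin(q1 + q2)
    return x, y


def inverse_kinematics(x, y, l1=0.4, l2=0.4):
    """Joint angles reaching (x, y), choosing the solution with q2 in [0, pi]."""
    c2 = (x * x + y * y - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if abs(c2) > 1.0:
        raise ValueError("target is outside the reachable workspace")
    q2 = math.acos(c2)
    q1 = math.atan2(y, x) - math.atan2(l2 * math.sin(q2), l1 + l2 * math.cos(q2))
    return q1, q2


def cartesian_action_to_joints(target_xy):
    """A task-space policy outputs a Cartesian target; analytic inverse
    kinematics, not the learner, converts it into joint commands."""
    return inverse_kinematics(*target_xy)


q1, q2 = cartesian_action_to_joints((0.5, 0.3))
x, y = forward_kinematics(q1, q2)
```

With this split the learner only has to choose where the end effector should go, and the mapping to joint angles is computed analytically rather than learned.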

1902.08705 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

A General Framework for Structured Learning of Mechanical Systems

机械系统结构化学习的通用框架

Jayesh K. Gupta, Kunal Menda, Zachary Manchester, Mykel J. Kochenderfer

发表机构 * Stanford University(斯坦福大学)

AI总结 本文提出了一种用于机械系统结构化学习的通用框架:在有先验知识时将其纳入模型,在缺乏先验知识时训练具有表达能力的函数近似器,从而提高模型的准确性和数据效率。

Comments 10 pages, 7 figures. First two authors contributed equally. Submitted to IROS/RA-L. Code at https://github.com/sisl/mechamodlearn/

详情
AI中文摘要

学习准确的动力学模型对于优化和顺应性控制机器人系统至关重要。当前使用解析参数化进行白盒建模或使用神经网络进行黑盒建模的方法可能会产生高偏差或高方差。我们提出了一个灵活的灰盒模型,可以无缝地结合可用的先验知识,并在没有时训练具有表达能力的函数近似器。我们提出使用神经网络参数化机械系统,以建模其拉格朗日量和作用在其上的广义力。我们在模拟的驱动双摆上测试了我们的方法。我们展示了我们的方法在数据效率以及基于模型的强化学习中的性能优于朴素的黑盒模型。我们还系统地研究了我们的方法在结合可用的系统先验知识以提高数据效率方面的能力。

英文摘要

Learning accurate dynamics models is necessary for optimal, compliant control of robotic systems. Current approaches to white-box modeling using analytic parameterizations, or black-box modeling using neural networks, can suffer from high bias or high variance. We address the need for a flexible, gray-box model of mechanical systems that can seamlessly incorporate prior knowledge where it is available, and train expressive function approximators where it is not. We propose to parameterize a mechanical system using neural networks to model its Lagrangian and the generalized forces that act on it. We test our method on a simulated, actuated double pendulum. We show that our method outperforms a naive, black-box model in terms of data-efficiency, as well as performance in model-based reinforcement learning. We also conduct a systematic study of our method's ability to incorporate available prior knowledge about the system to improve data efficiency.

1903.00182 2026-06-04 eess.SY cs.DC cs.LG cs.SY 版本更新

Distributed Variational Bayesian Algorithms for Extended Object Tracking

分布式变分贝叶斯算法用于扩展目标跟踪

Junhao Hua, Chunguang Li

发表机构 * College of Information Science and Electronic Engineering, Zhejiang University(浙江大学信息科学与电子工程学院)

AI总结 本文研究了分布式扩展目标跟踪问题,提出了一种基于变分贝叶斯方法的集中算法,并扩展到分布式场景,通过交替方向乘子法技术,同时估计扩展目标状态和测量噪声协方差。

Comments 14 pages, 9 figures

详情
AI中文摘要

本文关注分布式扩展目标跟踪问题,旨在通过节点网络协同估计目标的状态和扩展。在传统跟踪应用中,大多数方法将目标视为测量点源,由于传感器分辨率有限。最近,一些研究考虑了扩展对象,即空间结构化的对象,即多个分辨率单元被目标占据。在这种设置中,每个时间步长生成多个测量值。本文提出了一种用于传感器网络中扩展目标跟踪问题的贝叶斯模型。在该模型中,目标扩展由对称正定随机矩阵表示,并假设存在但未知的测量噪声。基于此贝叶斯模型,我们首先提出了一种基于变分贝叶斯方法的新型集中算法用于扩展目标跟踪。然后,我们基于交替方向乘子法(ADMM)技术将其扩展到分布式场景。所提出的算法可以同时估计扩展目标状态(运动状态和扩展)和测量噪声协方差。给出了扩展目标跟踪和群体目标跟踪的仿真以验证所提模型和算法的有效性。

英文摘要

This paper is concerned with the problem of distributed extended object tracking, which aims to collaboratively estimate the state and extension of an object by a network of nodes. In traditional tracking applications, most approaches consider an object as a point source of measurements due to limited sensor resolution capabilities. Recently, some studies consider the extended objects, which are spatially structured, i.e., multiple resolution cells are occupied by an object. In this setting, multiple measurements are generated by each object per time step. In this paper, we present a Bayesian model for extended object tracking problem in a sensor network. In this model, the object extension is represented by a symmetric positive definite random matrix, and we assume that the measurement noise exists but is unknown. Using this Bayesian model, we first propose a novel centralized algorithm for extended object tracking based on variational Bayesian methods. Then, we extend it to the distributed scenario based on the alternating direction method of multipliers (ADMM) technique. The proposed algorithms can simultaneously estimate the extended object state (the kinematic state and extension) and the measurement noise covariance. Simulations on both extended object tracking and group target tracking are given to verify the effectiveness of the proposed model and algorithms.

1902.11136 2026-06-04 eess.SY cs.LG cs.SY math.DS physics.ao-ph 版本更新

Learning Dynamical Systems from Partial Observations

从部分观测中学习动力系统

Ibrahim Ayed, Emmanuel de Bézenac, Arthur Pajot, Julien Brajard, Patrick Gallinari

发表机构 * Theresis lab, Thales, Thales Research & Technology Route Départementale, 91120 Palaiseau Sorbonne Université, CNRS-IRD-MNHN, LOCEAN, Paris, France Remote Sensing Center, Bergen, Norway Criteo AI Lab, Paris, France

AI总结 本文提出了一种数据驱动的框架,用于从部分观测中预测复杂非线性时空过程,通过神经网络估计时间变化的微分方程来建模系统动力学,并在浅水和欧拉模拟中验证了该方法在长期预测和学习隐藏状态方面的有效性。

详情
AI中文摘要

我们考虑在观测仅提供系统状态部分信息的情况下,预测复杂非线性时空过程的问题。我们提出了一种自然的数据驱动框架,其中系统的动力学由一个未知的时间变化微分方程建模,通过神经网络从数据中估计演化项。任何未来的状态都可以通过将关联的微分方程输入ODE求解器来计算。我们首先在浅水和欧拉模拟上评估了我们的方法,发现该方法不仅能够产生高质量的长期预测,还能学习产生接近系统真实状态的隐藏状态,而无需对后者进行直接监督。在具有挑战性的最新海洋模拟中进行的额外实验进一步验证了我们的发现,同时在经典基线方法上展示了显著的改进。

英文摘要

We consider the problem of forecasting complex, nonlinear space-time processes when observations provide only partial information on the system's state. We propose a natural data-driven framework, where the system's dynamics are modelled by an unknown time-varying differential equation, and the evolution term is estimated from the data, using a neural network. Any future state can then be computed by placing the associated differential equation in an ODE solver. We first evaluate our approach on shallow water and Euler simulations. We find that our method not only demonstrates high quality long-term forecasts, but also learns to produce hidden states closely resembling the true states of the system, without direct supervision on the latter. Additional experiments conducted on challenging, state of the art ocean simulations further validate our findings, while exhibiting notable improvements over classical baselines.

1810.06749 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Optimally rotated coordinate systems for adaptive least-squares regression on sparse grids

最优旋转坐标系用于稀疏网格上的自适应最小二乘回归

Bastian Bohn, Michael Griebel, Jens Oettershagen

发表机构 * Institute for Numerical Simulation, University of Bonn(柏林洪堡大学数值模拟研究所)

AI总结 针对高维数据集,本文提出了一种预处理方法,通过确定问题相关的优化坐标系来降低数据的有效维度,从而提升自适应稀疏网格最小二乘回归算法的性能。

详情
AI中文摘要

对于低维数据集具有大量数据点时,标准核方法通常不再适用于回归。除了简单的线性模型或复杂的启发式深度学习模型外,基于网格的更大(核)模型类的离散化方法导致的算法自然地线性缩放数据点数量。在中等维或高维回归任务中,这些基于网格的离散化方法受到维度诅咒的影响。在此背景下,稀疏网格方法已证明可以很大程度上克服这一问题。在这种情况下,能够检测并利用名义上高维数据的低有效维数的空间和维度自适应稀疏网格特别成功。然而,它们仍然依赖于轴对齐的结构,并在具有主要偏斜和旋转坐标的数据中表现出问题。在本文中,我们提出了一种预处理方法,用于这些自适应稀疏网格算法,以确定一个优化的、问题相关的坐标系,从而在ANOVA意义上降低给定数据集的有效维度。我们通过合成数据以及现实世界数据的数值示例,展示了自适应稀疏网格最小二乘算法如何从我们的预处理方法中受益。

英文摘要

For low-dimensional data sets with a large amount of data points, standard kernel methods are usually not feasible for regression anymore. Besides simple linear models or involved heuristic deep learning models, grid-based discretizations of larger (kernel) model classes lead to algorithms, which naturally scale linearly in the amount of data points. For moderate-dimensional or high-dimensional regression tasks, these grid-based discretizations suffer from the curse of dimensionality. Here, sparse grid methods have proven to circumvent this problem to a large extent. In this context, space- and dimension-adaptive sparse grids, which can detect and exploit a given low effective dimensionality of nominally high-dimensional data, are particularly successful. They nevertheless rely on an axis-aligned structure of the solution and exhibit issues for data with predominantly skewed and rotated coordinates. In this paper we propose a preprocessing approach for these adaptive sparse grid algorithms that determines an optimized, problem-dependent coordinate system and, thus, reduces the effective dimensionality of a given data set in the ANOVA sense. We provide numerical examples on synthetic data as well as real-world data to show how an adaptive sparse grid least squares algorithm benefits from our preprocessing method.

1902.10590 2026-06-04 cs.SE cs.AI cs.LG cs.SY eess.SY 版本更新

Architecting Dependable Learning-enabled Autonomous Systems: A Survey

构建可靠的学习自主系统:一项综述

Chih-Hong Cheng, Dhiraj Gulati, Rongjie Yan

发表机构 * fortiss - Research Institute of the Free State of Bavaria, Germany(fortiss - 德国巴伐利亚自由州研究所) State Key Laboratory of Computer Science, China(中国计算机科学国家重点实验室)

AI总结 本文综述了构建可靠学习自主系统的方法,重点在于自动驾驶,讨论了多样冗余、信息融合和运行时监控等技术支柱,并总结了提升深度学习组件可靠性的最新方法,最后提出了研究方向。

详情
AI中文摘要

我们提供了一项关于构建可靠学习自主系统架构方法的综述,重点在于自动驾驶。我们考虑了构建可靠自主性的三个技术支柱,即多样化冗余、信息融合和运行时监控。对于学习组件,我们还总结了近年来在标准卷积神经网络之外提高可靠性的架构方法。最后,我们列出了一系列针对现有方法所面临挑战的有前景的研究方向。

英文摘要

We provide a summary of architectural approaches that can be used to construct dependable learning-enabled autonomous systems, with a focus on automated driving. We consider three technology pillars for architecting dependable autonomy, namely diverse redundancy, information fusion, and runtime monitoring. For learning-enabled components, we additionally summarize recent architectural approaches to increase the dependability beyond standard convolutional neural networks. We conclude the study with a list of promising research directions addressing the challenges of existing approaches.

1703.00734 2026-06-04 stat.ML cs.DC cs.LG cs.NA math.NA stat.ME 版本更新

Distributed Bayesian Matrix Factorization with Limited Communication

分布式贝叶斯矩阵分解与有限通信

Xiangju Qin, Paul Blomstedt, Eemeli Leppäaho, Pekka Parviainen, Samuel Kaski

发表机构 * Helsinki Institute for Information Technology HIIT, Department of Computer Science, Aalto University(赫尔辛基信息技术研究所 HIIT,计算机科学系,阿尔托大学) Department of Informatics, University of Bergen(信息学系,卑尔根大学)

AI总结 本文提出了一种分布式贝叶斯矩阵分解方法,通过分层分解联合后验分布,结合并行计算和高效近似实现,提高了大规模数据处理效率,同时保持预测准确性。

Comments 28 pages, 8 figures. The paper is published in Machine Learning journal. An implementation of the method is available in SMURFF software on github (bmfpp branch): https://github.com/ExaScience/smurff

详情
Journal ref
Machine Learning, 2019
AI中文摘要

贝叶斯矩阵分解(BMF)是一种强大的工具,用于生成矩阵的低秩表示、预测缺失值并提供置信区间。将后验推断扩展到超大规模矩阵具有挑战性,需要将数据和计算分布到多个工作节点上,这使通信成为主要的计算瓶颈。易并行(embarrassingly parallel)推断在不同数据子集上使用完全独立的计算,从而消除通信需求,但会受到BMF解固有的不可识别性的影响。我们引入了联合后验分布的分层分解,将各子集上的推断耦合起来,从而可以在最多三个阶段的序列中进行易并行计算。使用高效的近似实现,我们在真实和模拟数据上经验性地展示了改进。我们的分布式方法相比完整后验推断能够实现接近一个数量级的加速,而对预测准确性的影响可以忽略。我们的方法在准确性上优于最先进的易并行MCMC方法,并取得了与其他现有分布式和并行BMF实现相当的结果。

英文摘要

Bayesian matrix factorization (BMF) is a powerful tool for producing low-rank representations of matrices and for predicting missing values and providing confidence intervals. Scaling up the posterior inference for massive-scale matrices is challenging and requires distributing both data and computation over many workers, making communication the main computational bottleneck. Embarrassingly parallel inference would remove the communication needed, by using completely independent computations on different data subsets, but it suffers from the inherent unidentifiability of BMF solutions. We introduce a hierarchical decomposition of the joint posterior distribution, which couples the subset inferences, allowing for embarrassingly parallel computations in a sequence of at most three stages. Using an efficient approximate implementation, we show improvements empirically on both real and simulated data. Our distributed approach is able to achieve a speed-up of almost an order of magnitude over the full posterior, with a negligible effect on predictive accuracy. Our method outperforms state-of-the-art embarrassingly parallel MCMC methods in accuracy, and achieves results competitive to other available distributed and parallel implementations of BMF.

1902.09626 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Learning Extreme Hummingbird Maneuvers on Flapping Wing Robots

在扑翼机器人上学习极端蜂鸟动作

Fan Fei, Zhan Tu, Jian Zhang, Xinyan Deng

AI总结 研究通过模仿蜂鸟的极端机动动作,开发了一种混合控制策略,利用模型驱动的非线性控制和模型无关的强化学习,实现了在12克仿生蜂鸟机器人上实现快速逃避机动。

Comments 6 pages, accepted at ICRA 2019

详情
AI中文摘要

生物学研究表明,蜂鸟在快速逃避时可以执行极端的特技飞行动作。在悬停时突然出现逼近的视觉刺激时,蜂鸟会启动快速的后退平移并伴随180度的偏航转向,随后在不到10次振翅内完成瞬间姿态稳定。考虑到振翅频率为40Hz,这种激进的动作仅在0.2秒内完成。受蜂鸟在这些极端动作中接近最大性能的启发,我们开发了一种飞行控制策略,并通过实验表明,这种机动性可由仅配备两个执行器的等比例12克蜂鸟机器人实现。所提出的混合控制策略结合了基于模型的非线性控制和无模型强化学习。我们使用基于模型的非线性控制进行常规飞行控制,因为这些条件下的动态模型相对准确。然而,在极端机动中,建模误差变得难以控制。一个在仿真中训练的无模型强化学习策略被优化为使系统'失稳'并最大化机动期间的性能。混合策略表现出的机动与在蜂鸟身上观察到的机动接近。我们实现了从仿真到现实的直接迁移,在等比例蜂鸟机器人上展示了类似蜂鸟的快速逃避机动。

英文摘要

Biological studies show that hummingbirds can perform extreme aerobatic maneuvers during fast escape. Given a sudden looming visual stimulus at hover, a hummingbird initiates a fast backward translation coupled with a 180-degree yaw turn, which is followed by instant posture stabilization in just under 10 wingbeats. Considering the wingbeat frequency of 40 Hz, this aggressive maneuver is carried out in just 0.2 seconds. Inspired by the hummingbirds' near-maximal performance during such extreme maneuvers, we developed a flight control strategy and experimentally demonstrated that such maneuverability can be achieved by an at-scale 12-gram hummingbird robot equipped with just two actuators. The proposed hybrid control policy combines model-based nonlinear control with model-free reinforcement learning. We use model-based nonlinear control for nominal flight control, as the dynamic model is relatively accurate for these conditions. However, during extreme maneuvers, the modeling error becomes unmanageable. A model-free reinforcement learning policy trained in simulation was optimized to 'destabilize' the system and maximize the performance during maneuvering. The hybrid policy manifests a maneuver that is close to that observed in hummingbirds. Direct simulation-to-real transfer is achieved, demonstrating the hummingbird-like fast evasive maneuvers on the at-scale hummingbird robot.

1902.09427 2026-06-04 eess.SP cs.LG cs.SY eess.SY stat.ML 版本更新

Fault Diagnosis Method Based on Scaling Law for On-line Refrigerant Leak Detection

基于缩放定律的故障诊断方法用于在线制冷剂泄漏检测

Shun Takeuchi, Takahiro Saito

发表机构 * Machine Discovery Technology Project Artificial Intelligence Laboratory Fujitsu Laboratories Ltd., Kanagawa, Japan Machine Learning Technology Project Artificial Intelligence Laboratory Fujitsu Laboratories Ltd., Kanagawa, Japan

AI总结 本文提出了一种基于物理建模和空调系统控制机制的制冷剂泄漏故障诊断方法,通过推导与制冷剂泄漏相关的缩放定律,使模型能够适用于不同配置的空调系统,利用实验室的小规模离线故障测试数据估计缩放指数,并通过真实数据验证,证明了该方法在早期泄漏检测中的有效性。

Comments 8 pages, 6 figures

详情
Journal ref
2018 17th IEEE International Conference on Machine Learning and Applications (ICMLA)
AI中文摘要

利用仪器化传感器数据进行早期故障检测是机器学习在工业设施中的一个有前景的应用领域。然而,由于目标诊断系统中复杂的系统配置和不足的故障数据,训练出的故障检测模型的泛化性能难以提高。将训练好的模型应用于其他系统并不容易。本文提出了一种考虑空调系统物理建模和控制机制的制冷剂泄漏故障诊断方法。我们推导出与制冷剂泄漏相关的有用缩放定律。如果控制机制相同,模型可以应用于其他空调系统,而不论系统配置如何。在实验室中获得的小规模离线故障测试数据用于估计缩放指数。我们通过真实数据评估所提出的缩放定律。基于两组之间相互作用的统计假设检验,我们证明了不同空调系统的缩放指数是等价的。此外,我们基于缩放定律对实际过程数据的泄漏程度时间序列进行了估计,并通过与专家评估的比较,证明了该方法在早期泄漏检测中的有效性。

英文摘要

Early fault detection using instrumented sensor data is one of the promising application areas of machine learning in industrial facilities. However, it is difficult to improve the generalization performance of the trained fault-detection model because of the complex system configuration in the target diagnostic system and insufficient fault data. It is not trivial to apply the trained model to other systems. Here we propose a fault diagnosis method for refrigerant leak detection considering the physical modeling and control mechanism of an air-conditioning system. We derive a useful scaling law related to refrigerant leak. If the control mechanism is the same, the model can be applied to other air-conditioning systems irrespective of the system configuration. Small-scale off-line fault test data obtained in a laboratory are applied to estimate the scaling exponent. We evaluate the proposed scaling law by using real-world data. Based on a statistical hypothesis test of the interaction between two groups, we show that the scaling exponents of different air-conditioning systems are equivalent. In addition, we estimated the time series of the degree of leakage of real process data based on the scaling law and confirmed that the proposed method is promising for early leak detection through comparison with assessment by experts.
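The abstract states that a scaling exponent is estimated from small-scale off-line fault test data, but it does not give the form of the scaling law. As a minimal sketch, assuming a generic power law y ≈ c·x^α (an assumption, not the paper's model), the exponent can be estimated by a straight-line fit in log-log space. The function name and the synthetic data below are hypothetical.

```python
import numpy as np

def fit_scaling_exponent(x, y):
    """Fit y ≈ c * x**alpha by least squares in log-log space; return (alpha, c)."""
    log_x, log_y = np.log(x), np.log(y)
    # np.polyfit returns coefficients from highest degree down: [slope, intercept].
    alpha, log_c = np.polyfit(log_x, log_y, deg=1)
    return float(alpha), float(np.exp(log_c))

# Synthetic, noise-free data (hypothetical; not taken from the paper).
x = np.array([1.0, 2.0, 4.0, 8.0])
y = 3.0 * x ** 0.5
alpha, c = fit_scaling_exponent(x, y)
```

With real measurements the fit would carry noise, so comparing exponents across systems calls for a statistical test, as the abstract describes.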

1902.09426 2026-06-04 eess.SP cs.LG cs.SY eess.SY stat.ML 版本更新

Semi-supervised Approach to Soft Sensor Modeling for Fault Detection in Industrial Systems with Multiple Operation Modes

基于半监督方法的软传感器建模用于具有多种操作模式的工业系统故障检测

Shun Takeuchi, Takuya Nishino, Takahiro Saito, Isamu Watanabe

发表机构 * Artificial Intelligence Research Center(人工智能研究中心) Knowledge Information Processing Laboratory(知识信息处理实验室) Fujitsu Laboratories Ltd.(Fujitsu实验室有限公司) Japan(日本)

AI总结 本文提出了一种半监督方法用于软传感器建模,以解决在多操作模式系统中因目标变量数据不足而无法有效训练的问题,通过利用操作模式转换点的特性来改进模型预测能力。

Comments 7 pages, 1 figure

详情
Journal ref
International Conference on Advanced Intelligent Systems and Informatics 2017
AI中文摘要

在工业系统中,某些需要监控以检测故障的过程变量往往难以或无法测量。软传感器技术广泛用于从易于测量的变量估计这些难以测量的过程变量。软传感器建模需要包含各种状态信息的训练数据集,但目标变量的故障数据集不足,无法作为训练数据集。本文描述了一种半监督方法用于软传感器建模,以将缺少目标变量的不完整数据集纳入训练数据集。为了整合不完整数据集,我们考虑系统中操作模式转换点的特性。在约束条件下,通过从模式转换信息中获得的约束条件估计操作模式的回归系数。在案例研究中,这种受约束的软传感器建模被用于预测具有加热和制冷操作模式的空调系统中的制冷剂泄漏。结果表明,这种建模方法对于具有多种操作模式的系统中的软传感器具有前景。

英文摘要

In industrial systems, certain process variables that need to be monitored for detecting faults are often difficult or impossible to measure. Soft sensor techniques are widely used to estimate such difficult-to-measure process variables from easy-to-measure ones. Soft sensor modeling requires training datasets including the information of various states such as operation modes, but the fault dataset with the target variable is insufficient as the training dataset. This paper describes a semi-supervised approach to soft sensor modeling to incorporate an incomplete dataset without the target variable in the training dataset. To incorporate the incomplete dataset, we consider the properties of processes at transition points between operation modes in the system. The regression coefficients of the operation modes are estimated under constraint conditions obtained from the information on the mode transitions. In a case study, this constrained soft sensor modeling was used to predict refrigerant leaks in air-conditioning systems with heating and cooling operation modes. The results show that this modeling method is promising for soft sensors in a system with multiple operation modes.

1806.07190 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Stable Gaussian Process based Tracking Control of Euler-Lagrange Systems

基于稳定高斯过程的欧拉-拉格朗日系统跟踪控制

Thomas Beckers, Dana Kulić, Sandra Hirche

发表机构 * Chair of Information-oriented Control (ITR), Department of Electrical and Computer Engineering, Technical University of Munich(信息导向控制研究所(ITR),电气与计算机工程系,慕尼黑技术大学) Adaptive Systems Laboratory, Department of Electrical and Computer Engineering, University of Waterloo(自适应系统实验室,电气与计算机工程系,滑铁卢大学)

AI总结 本文提出一种基于高斯过程回归的稳定跟踪控制方法,用于未知欧拉-拉格朗日系统的高精度跟踪控制,通过数据驱动建模实现前馈补偿,并利用模型保真度动态调整反馈增益,确保全局有界跟踪误差。

Comments Accepted manuscript for publication in Elsevier Automatica

详情
AI中文摘要

对现实中的欧拉-拉格朗日系统实现完美的跟踪控制具有挑战性,因为系统模型的不确定性以及外部干扰会影响跟踪误差的大小。通过增加反馈增益或改进系统模型可以减小跟踪误差。后者显然更可取,因为它允许在低反馈增益下保持良好的跟踪性能。然而,准确的模型往往难以获得。在本文中,我们解决了未知欧拉-拉格朗日系统的稳定高性能跟踪控制问题。具体来说,我们使用高斯过程回归来获得一个数据驱动的模型,用于系统未知动力学的前馈补偿。模型保真度用于调整反馈增益,允许在状态空间中模型信心高的区域使用低反馈增益。所提出的控制律保证了具有特定概率的全局有界跟踪误差。仿真研究展示了其优于现有跟踪控制方法的优越性。

英文摘要

Perfect tracking control for real-world Euler-Lagrange systems is challenging due to uncertainties in the system model and external disturbances. The magnitude of the tracking error can be reduced either by increasing the feedback gains or by improving the model of the system. The latter is clearly preferable as it allows good tracking performance to be maintained at low feedback gains. However, accurate models are often difficult to obtain. In this article, we address the problem of stable high-performance tracking control for unknown Euler-Lagrange systems. In particular, we employ Gaussian Process regression to obtain a data-driven model that is used for the feed-forward compensation of unknown dynamics of the system. The model fidelity is used to adapt the feedback gains, allowing low feedback gains in state space regions of high model confidence. The proposed control law guarantees a globally bounded tracking error with a specific probability. Simulation studies demonstrate the superiority over state-of-the-art tracking control approaches.
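The idea of using model fidelity to adapt feedback gains can be illustrated with a small Gaussian Process sketch. The code below computes a standard GP posterior mean and variance with an RBF kernel, then raises a scalar gain where the posterior standard deviation is large. The gain rule `k_min + k_scale * std` and all names are hypothetical choices for illustration; the abstract does not specify the paper's control law.

```python
import numpy as np

def rbf(a, b, length=1.0):
    """Squared-exponential kernel with unit signal variance for 1-D inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

def gp_posterior(x_train, y_train, x_test, noise_var=1e-2):
    """Standard GP regression posterior mean and variance at x_test."""
    K = rbf(x_train, x_train) + noise_var * np.eye(len(x_train))
    Ks = rbf(x_test, x_train)
    mean = Ks @ np.linalg.solve(K, y_train)
    # Prior variance is 1 for this kernel; subtract the explained part.
    var = 1.0 - np.sum(Ks * np.linalg.solve(K, Ks.T).T, axis=1)
    return mean, np.maximum(var, 0.0)

def adapted_gain(var, k_min=1.0, k_scale=10.0):
    """Hypothetical rule: low gain where the model is confident, higher elsewhere."""
    return k_min + k_scale * np.sqrt(var)

x_train = np.linspace(-1.0, 1.0, 10)
y_train = np.sin(x_train)                # stand-in for the unknown dynamics
x_test = np.array([0.0, 5.0])            # inside vs. far outside the data
mean, var = gp_posterior(x_train, y_train, x_test)
gains = adapted_gain(var)
```

The gain at the second test point comes out larger because no training data lie near it, which mirrors the qualitative behavior described in the abstract.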

1902.08721 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Online Control with Adversarial Disturbances

对抗性扰动下的在线控制

Naman Agarwal, Brian Bullins, Elad Hazan, Sham M. Kakade, Karan Singh

发表机构 * Google AI Princeton(谷歌AI普林斯顿) Princeton University(普林斯顿大学) University of Washington(华盛顿大学) Allen School of Computer Science and Engineering(阿伦计算机科学与工程学院)

AI总结 本文研究了在存在对抗性扰动的线性动态系统中的在线控制问题,提出了一种高效的算法,该算法具有近乎紧的遗憾(regret)界,其性能接近事后完全知晓扰动的控制过程,同时在两个主要方面推广了先前的工作:允许动态中的对抗性噪声和一般的凸成本。

详情
AI中文摘要

我们研究了具有对抗性扰动(而非统计噪声)的线性动态系统的控制问题。我们考虑的目标是regret:我们希望一种在线控制过程能够几乎达到完全了解扰动的控制过程的性能。我们的主要结果是一个高效的算法,该算法为该问题提供了几乎紧致的regret界。从技术角度来看,这项工作在两个主要方面扩展了先前的工作:我们的模型允许动态中的对抗性噪声,并允许一般的凸成本。

英文摘要

We study the control of a linear dynamical system with adversarial disturbances (as opposed to statistical noise). The objective we consider is one of regret: we desire an online control procedure that can do nearly as well as that of a procedure that has full knowledge of the disturbances in hindsight. Our main result is an efficient algorithm that provides nearly tight regret bounds for this problem. From a technical standpoint, this work generalizes upon previous work in two main aspects: our model allows for adversarial noise in the dynamics, and allows for general convex costs.

1902.08594 2026-06-04 eess.SY cs.LG cs.MA cs.SY stat.ML 版本更新

Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation

基于回归的逆变器控制用于分布式最优功率流和电压调节

Oscar Sondermeijer, Roel Dobbe, Daniel Arnold, Claire Tomlin, Tamás Keviczky

发表机构 * 2 Department of Electrical Engineering & Computer Sciences, UC Berkeley, Berkeley, USA 3 Department of Mechanical Engineering, UC Berkeley, Berkeley, USA 4 Delft Center for Systems Control, Delft University of Technology, Delft, The Netherlands

AI总结 本文提出了一种系统化的数据驱动方法,通过本地测量确定逆变器输出无功功率,以实现接近最优的结果,该方法通过网络模型和历史负荷和发电数据进行最优功率流计算,然后利用回归找到每个逆变器的函数,将本地历史数据映射到其最优无功功率注入的近似值,从而实现分布式控制,以在电压和容量约束下最小化损耗并实现电压平坦化,同时允许高效的电压-无功优化(VVO)方案,使传统控制设备与现有逆变器协同工作,以安全运行高分布式发电水平的配电网。

Comments Cite as: Oscar Sondermeijer, Roel Dobbe, Daniel Arnold, Claire Tomlin and Tamás Keviczky, "Regression-based Inverter Control for Decentralized Optimal Power Flow and Voltage Regulation", IEEE Power & Energy Society General Meeting, Boston, July 2016

详情
AI中文摘要

电子功率逆变器能够快速提供无功功率以维持客户电压在运行容差范围内,并减少配电网中的系统损耗。本文提出了一种系统化且数据驱动的方法,以确定无功功率逆变器输出作为本地测量函数的方式,以获得接近最优的结果。首先,我们使用网络模型和历史负荷和发电数据,并进行最优功率流计算,以计算网络中所有可控逆变器的全局最优无功功率注入。随后,我们使用回归找到每个逆变器的函数,将本地历史数据映射到其最优无功功率注入的近似值。所得函数随后作为参与逆变器的分布式控制器,根据新的本地测量预测最优注入。该方法在执行电压和容量约束下的损耗最小化和电压平坦化时能够实现接近最优的结果,并允许高效的电压-无功优化(VVO)方案,其中传统控制设备与现有逆变器协同工作,以安全运行具有更高分布式发电水平的配电网。

英文摘要

Electronic power inverters are capable of quickly delivering reactive power to maintain customer voltages within operating tolerances and to reduce system losses in distribution grids. This paper proposes a systematic and data-driven approach to determine reactive power inverter output as a function of local measurements in a manner that obtains near-optimal results. First, we use a network model together with historic load and generation data, and solve an optimal power flow problem to compute globally optimal reactive power injections for all controllable inverters in the network. Subsequently, we use regression to find a function for each inverter that maps its local historical data to an approximation of its optimal reactive power injection. The resulting functions then serve as decentralized controllers in the participating inverters to predict the optimal injection based on new local measurements. The method achieves near-optimal results when performing voltage- and capacity-constrained loss minimization and voltage flattening, and allows for an efficient volt-VAR optimization (VVO) scheme in which legacy control equipment collaborates with existing inverters to facilitate safe operation of distribution networks with higher levels of distributed generation.
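The two-stage procedure in this abstract (offline optimal power flow, then a per-inverter regression from local data to the optimal injection) can be sketched as follows. This is a minimal illustration under stated assumptions: the optimal injections are replaced by synthetic values rather than computed by an OPF solver, the regression is plain linear least squares, and the clipping to a capacity limit `q_max` is a hypothetical way of respecting capacity constraints. None of the names come from the paper.

```python
import numpy as np

def fit_local_controller(local_features, q_optimal):
    """Least-squares fit of q* ≈ [1, features] @ beta for one inverter."""
    X = np.column_stack([np.ones(len(local_features)), local_features])
    beta, *_ = np.linalg.lstsq(X, q_optimal, rcond=None)
    return beta

def predict_injection(beta, features, q_max):
    """Evaluate the fitted controller on a new local measurement, clipped to capacity."""
    q = beta[0] + np.dot(beta[1:], features)
    return float(np.clip(q, -q_max, q_max))

# Synthetic stand-in for historical local data and OPF-optimal injections.
rng = np.random.default_rng(0)
features = rng.uniform(0.0, 1.0, size=(200, 2))   # e.g. local load and generation
q_star = 0.1 + 0.4 * features[:, 0] - 0.3 * features[:, 1]

beta = fit_local_controller(features, q_star)
q_new = predict_injection(beta, np.array([0.5, 0.2]), q_max=0.2)
```

Once fitted, each controller needs only its own measurements at run time, which is what makes the scheme decentralized.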

1902.08274 2026-06-04 cs.AI cs.LG cs.MA cs.SY eess.SY 版本更新

An Online Decision-Theoretic Pipeline for Responder Dispatch

为响应调度设计一个在线决策理论管道

Ayan Mukhopadhyay, Geoffrey Pettet, Chinmaya Samal, Abhishek Dubey, Yevgeniy Vorobeychik

发表机构 * Vanderbilt University(范德比大学) Washington University(华盛顿大学)

AI总结 本文提出了一种在线决策理论管道,用于有效应对紧急事件,通过实时数据流更新模型,提高响应效率并减少计算时间。

Comments Appeared in ICCPS 2019

详情
AI中文摘要

向交通事故、火灾、求救电话和犯罪等紧急事件派遣应急响应人员的问题困扰着全球各地的城市。尽管此类问题已广泛研究,但大多数方法是离线的。这些方法无法捕捉到关键紧急响应发生的动态变化环境,因此无法在实践中实施。任何全面的方法必须考虑其他挑战,包括预测事件何时何地发生以及理解环境动态变化。我们描述了一个系统,该系统以在线方式处理所有这些问题,即模型通过流数据源更新。我们强调这种做法对应急响应有效性的重要性,并提出了一种算法框架,可以为给定的决策理论模型计算有希望的行动。我们还提出了一种在线机制用于事件预测,以及基于循环神经网络的方法来学习和预测影响响应调度的环境特征。我们比较了我们的方法与现有最先进的方法和现有调度策略,结果表明我们的方法在减少响应时间的同时大幅减少了计算时间。

英文摘要

The problem of dispatching emergency responders to service traffic accidents, fire, distress calls and crimes plagues urban areas across the globe. While such problems have been extensively looked at, most approaches are offline. Such methodologies fail to capture the dynamically changing environments under which critical emergency response occurs, and therefore, fail to be implemented in practice. Any holistic approach towards creating a pipeline for effective emergency response must also look at other challenges that it subsumes - predicting when and where incidents happen and understanding the changing environmental dynamics. We describe a system that collectively deals with all these problems in an online manner, meaning that the models get updated with streaming data sources. We highlight why such an approach is crucial to the effectiveness of emergency response, and present an algorithmic framework that can compute promising actions for a given decision-theoretic model for responder dispatch. We argue that carefully crafted heuristic measures can balance the trade-off between computational time and the quality of solutions achieved and highlight why such an approach is more scalable and tractable than traditional approaches. We also present an online mechanism for incident prediction, as well as an approach based on recurrent neural networks for learning and predicting environmental features that affect responder dispatch. We compare our methodology with prior state-of-the-art and existing dispatch strategies in the field, and show that our approach results in a reduction in response time with a drastic reduction in computational time.

1807.04020 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Improved SVD-based Initialization for Nonnegative Matrix Factorization using Low-Rank Correction

改进的基于SVD的非负矩阵分解初始化方法:利用低秩修正

Atif Muhammad Syed, Sameer Qazi, Nicolas Gillis

发表机构 * Graduate School of Science and Engineering(研究生院) PAF-Karachi Institute of Economics and Technology(卡拉奇经济和技术学院) Department of Mathematics and Operational Research(数学与运筹学系)

AI总结 本文提出了一种改进的基于SVD的非负矩阵分解初始化方法,通过考虑被丢弃的SVD因子来降低初始误差,同时生成稀疏初始因子并提高计算效率。

Comments 12 pages, 1 figure, 5 tables, submitted to pattern recognition letters

详情
Journal ref
Pattern Recognition Letters 122, pp. 53-59, 2019
AI中文摘要

由于大多数非负矩阵分解(NMF)算法的迭代性质,初始化是一个关键因素,因为它显著影响收敛性和最终得到的解。许多初始化方案已被提出,其中最受欢迎的一类方法基于奇异值分解(SVD)。然而,这些基于SVD的初始化方法并不满足一个自然条件,即误差应随着因子分解的秩增加而减少。在本文中,我们提出了一种新的基于SVD的NMF初始化方法,专门针对这一不足,通过考虑用于获得非负初始化而被丢弃的SVD因子。这种方法称为非负SVD与低秩修正(NNSVD-LRC),通过利用被丢弃的SVD因子的低秩结构,在可忽略的额外计算成本下显著降低初始误差。与以往基于SVD的初始化方法相比,NNSVD-LRC还有两个其他优势:(1)它能够证明生成稀疏的初始因子;(2)它更快,因为它只需要计算秩为⌈r/2 + 1⌉的截断SVD,其中r是所求NMF分解的因子秩(与其他方法不同,其他方法需要计算秩为r的截断SVD)。我们在多个标准密集和稀疏数据集上展示了我们的新方法在NMF中与最先进的基于SVD的初始化方法竞争性。

英文摘要

Due to the iterative nature of most nonnegative matrix factorization (NMF) algorithms, initialization is a key aspect as it significantly influences both the convergence and the final solution obtained. Many initialization schemes have been proposed for NMF, among which one of the most popular classes of methods is based on the singular value decomposition (SVD). However, these SVD-based initializations do not satisfy a rather natural condition, namely that the error should decrease as the rank of factorization increases. In this paper, we propose a novel SVD-based NMF initialization to specifically address this shortcoming by taking into account the SVD factors that were discarded to obtain a nonnegative initialization. This method, referred to as nonnegative SVD with low-rank correction (NNSVD-LRC), allows us to significantly reduce the initial error at a negligible additional computational cost using the low-rank structure of the discarded SVD factors. NNSVD-LRC has two other advantages compared to previous SVD-based initializations: (1) it provably generates sparse initial factors, and (2) it is faster as it only requires computing a truncated SVD of rank $\lceil r/2 + 1 \rceil$ where $r$ is the factorization rank of the sought NMF decomposition (as opposed to a rank-$r$ truncated SVD for other methods). We show on several standard dense and sparse data sets that our new method competes favorably with state-of-the-art SVD-based initializations for NMF.

1812.07084 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Learning Constraints from Demonstrations

从示范中学习约束

Glen Chou, Dmitry Berenson, Necmiye Ozay

发表机构 * Dept. of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI, 48109, USA(电气工程与计算机科学系,密歇根大学,安娜堡,MI,48109,美国)

AI总结 该研究提出了一种从示范中学习未知约束的方法,通过任务示范、成本函数和系统动力学与控制约束,利用hit-and-run采样获取低成本但不安全的轨迹,并通过整数规划获得一致的不安全集表示,同时理论分析了可从安全示范中学习的约束子集。

Comments Presented at the Workshop on the Algorithmic Foundations of Robotics (WAFR), 2018, Mérida, Mexico

详情
AI中文摘要

我们通过提供一种方法扩展了从示范中学习的范式,该方法利用任务的示范、成本函数以及系统动力学和控制约束来学习跨任务的未知约束。给定安全的示范,我们的方法使用hit-and-run采样来获得低成本但不安全的轨迹。安全和不安全的轨迹都被用来通过求解整数规划问题获得不安全集的一致表示。我们的方法能够跨系统动力学泛化,并学习保证的约束子集。我们还提供了理论分析,说明从安全示范中可以学习的约束子集。我们在线性和非线性系统动力学上展示了我们的方法,并证明它可以修改以适应次优示范,并且也可以用于特征空间中学习约束。

英文摘要

We extend the learning from demonstration paradigm by providing a method for learning unknown constraints shared across tasks, using demonstrations of the tasks, their cost functions, and knowledge of the system dynamics and control constraints. Given safe demonstrations, our method uses hit-and-run sampling to obtain lower cost, and thus unsafe, trajectories. Both safe and unsafe trajectories are used to obtain a consistent representation of the unsafe set via solving an integer program. Our method generalizes across system dynamics and learns a guaranteed subset of the constraint. We also provide theoretical analysis on what subset of the constraint can be learned from safe demonstrations. We demonstrate our method on linear and nonlinear system dynamics, show that it can be modified to work with suboptimal demonstrations, and that it can also be used to learn constraints in a feature space.

1902.06366 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Detecting and Diagnosing Incipient Building Faults Using Uncertainty Information from Deep Neural Networks

利用深度神经网络的不确定性信息检测和诊断建筑初期故障

Baihong Jin, Dan Li, Seshadhri Srinivasan, See-Kiong Ng, Kameshwar Poolla, Alberto Sangiovanni-Vincentelli

发表机构 * Department of EECS, University of California, Berkeley(加州大学伯克利分校电子工程与计算机科学系) Institute of Data Science, National University of Singapore(新加坡国立大学数据科学研究所) The Berkeley Education Alliance for Research in Singapore(新加坡伯克利教育联盟)

AI总结 本文提出利用蒙特卡洛dropout方法增强监督学习流程,以检测和诊断未见过的初期故障,并在RP-1043数据集上验证其在指示最可能的初期故障类型方面的有效性。

详情
AI中文摘要

早期检测初期故障对于减少维护成本、节约能源和提高居住舒适度在建筑中至关重要。尽管深度神经网络等流行监督学习模型因其能够直接从标记的故障数据中学习而被认为具有前景,但监督学习方法的性能高度依赖于标记训练数据的可用性和质量。在故障检测与诊断(FDD)应用中,缺乏标记的初期故障数据已成为将这些监督学习技术应用于商业建筑的主要挑战。为克服这一挑战,本文提出利用蒙特卡洛dropout(MC-dropout)来增强监督学习流程,使生成的神经网络能够检测和诊断未见过的初期故障示例。我们还检查了所提出的MC-dropout方法在RP-1043数据集上的效果,以证明其在指示最可能的初期故障类型方面的有效性。

英文摘要

Early detection of incipient faults is of vital importance to reducing maintenance costs, saving energy, and enhancing occupant comfort in buildings. Popular supervised learning models such as deep neural networks are considered promising due to their ability to directly learn from labeled fault data; however, it is known that the performance of supervised learning approaches highly relies on the availability and quality of labeled training data. In Fault Detection and Diagnosis (FDD) applications, the lack of labeled incipient fault data has posed a major challenge to applying these supervised learning techniques to commercial buildings. To overcome this challenge, this paper proposes using Monte Carlo dropout (MC-dropout) to enhance the supervised learning pipeline, so that the resulting neural network is able to detect and diagnose unseen incipient fault examples. We also examine the proposed MC-dropout method on the RP-1043 dataset to demonstrate its effectiveness in indicating the most likely incipient fault types.
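Monte Carlo dropout keeps dropout active at prediction time and averages several stochastic forward passes, so the spread of the outputs serves as an uncertainty signal. The sketch below shows this mechanism on a tiny two-layer network in NumPy with random, untrained weights. The architecture, the choice of predictive entropy as the uncertainty measure, and all names are assumptions for illustration; the abstract does not state how the paper turns the uncertainty into a fault diagnosis.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def mc_dropout_predict(x, W1, W2, drop_rate=0.5, n_samples=100, seed=0):
    """Average class probabilities over stochastic passes with dropout left on."""
    rng = np.random.default_rng(seed)
    probs = []
    for _ in range(n_samples):
        h = np.maximum(x @ W1, 0.0)                 # ReLU hidden layer
        mask = rng.random(h.shape) >= drop_rate     # fresh dropout mask per pass
        h = h * mask / (1.0 - drop_rate)            # inverted-dropout scaling
        probs.append(softmax(h @ W2))
    mean_probs = np.mean(probs, axis=0)
    # Predictive entropy of the averaged distribution, one value per input row.
    entropy = -np.sum(mean_probs * np.log(mean_probs + 1e-12), axis=-1)
    return mean_probs, entropy

# Random weights and inputs as placeholders (hypothetical; no trained model here).
rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 16))
W2 = rng.normal(size=(16, 3))
x = rng.normal(size=(5, 4))
mean_probs, entropy = mc_dropout_predict(x, W1, W2)
```

Inputs with high entropy are those on which the stochastic passes disagree, which is one plausible way to flag examples unlike the labeled training faults.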

1902.06361 2026-06-04 cs.LG cs.NE cs.SY eess.SY stat.ML 版本更新

A One-Class Support Vector Machine Calibration Method for Time Series Change Point Detection

一种用于时间序列变化点检测的一类支持向量机校准方法

Baihong Jin, Yuxin Chen, Dan Li, Kameshwar Poolla, Alberto Sangiovanni-Vincentelli

发表机构 * Department of EECS, University of California, Berkeley(加州大学伯克利分校电子工程与计算机科学系) California Institute of Technology(加州理工学院) Institute of Data Science, National University of Singapore(新加坡国立大学数据科学研究所)

AI总结 本文提出了一种校准一类支持向量机(OC-SVM)的方法,用于时间序列变化点检测,通过启发式搜索方法找到输入数据和超参数的最优组合,实验表明OC-SVM在少量训练数据下也能有效检测变化点,优于现有深度学习方法。

详情
AI中文摘要

识别系统健康状态的变化点对于检测发展中的初始故障至关重要。一类支持向量机(OC-SVM)是一种流行的机器学习模型,用于异常检测,因此可用于识别变化点;然而,有时难以获得一个能够用于传感器测量时间序列以识别系统健康状态变化点的良好的OC-SVM模型。在本文中,我们提出了一种新颖的OC-SVM模型校准方法。该方法使用启发式搜索方法来寻找一组良好的输入数据和超参数,以产生一个表现良好的模型。我们在C-MAPSS数据集上的结果表明,OC-SVM在使用较少训练数据的情况下也能在时间序列中实现满意的准确性,相较于最先进的深度学习方法。在我们的案例研究中,通过所提出的模型校准的OC-SVM在训练数据有限的情况下显示出特别的实用性。

英文摘要

It is important to identify the change point of a system's health status, which usually signifies an incipient fault under development. The One-Class Support Vector Machine (OC-SVM) is a popular machine learning model for anomaly detection and hence could be used for identifying change points; however, it is sometimes difficult to obtain a good OC-SVM model that can be used on sensor measurement time series to identify the change points in system health status. In this paper, we propose a novel approach for calibrating OC-SVM models. The approach uses a heuristic search method to find a good set of input data and hyperparameters that yield a well-performing model. Our results on the C-MAPSS dataset demonstrate that OC-SVM can also achieve satisfactory accuracy in detecting change points in time series with less training data, compared to state-of-the-art deep learning approaches. In our case study, the OC-SVM calibrated by the proposed method is shown to be useful especially in scenarios with a limited amount of training data.

1812.11293 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

DeGroot-Friedkin Map in Opinion Dynamics is Mirror Descent

DeGroot-Friedkin映射在意见动力学中的镜像下降

Abhishek Halder

AI总结 本文通过变分解释将DeGroot-Friedkin映射视为在标准单纯形上的镜像下降,其关联的Bregman散度等于广义Kullback-Leibler散度,即熵镜像下降,揭示了DeGroot-Friedkin映射在最小化外熵(extropy,即互补意见的熵)的同时,使个体的社会权力接近其社会影响力。

详情
AI中文摘要

我们为意见动力学中的DeGroot-Friedkin映射提供了变分解释。具体而言,我们证明了DeGroot-Friedkin映射的非线性动力学可以被视为在标准单纯形上的镜像下降,其中关联的Bregman散度等于广义Kullback-Leibler散度,即熵镜像下降。我们的结果揭示了DeGroot-Friedkin映射在最小化所谓的“外熵”(extropy,即互补意见的熵)的同时,使个体的社会权力接近其社会影响力。

英文摘要

We provide a variational interpretation of the DeGroot-Friedkin map in opinion dynamics. Specifically, we show that the nonlinear dynamics for the DeGroot-Friedkin map can be viewed as mirror descent on the standard simplex with the associated Bregman divergence being equal to the generalized Kullback-Leibler divergence, i.e., an entropic mirror descent. Our results reveal that the DeGroot-Friedkin map elicits an individual's social power to be close to her social influence while minimizing the so-called "extropy" -- the entropy of the complementary opinion.
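For readers unfamiliar with entropic mirror descent, the update on the standard simplex is a multiplicative step followed by renormalization. The sketch below applies it to a hypothetical quadratic objective chosen only to make the behavior easy to check. It is not the DeGroot-Friedkin objective, which the abstract does not spell out.

```python
import numpy as np

def entropic_mirror_descent_step(x, grad, step_size):
    """One mirror descent step on the simplex with the negative-entropy mirror map."""
    w = x * np.exp(-step_size * grad)   # multiplicative (exponentiated-gradient) update
    return w / w.sum()                  # renormalize back onto the simplex

# Hypothetical objective: f(x) = 0.5 * ||x - target||^2, with target inside the simplex.
target = np.array([0.7, 0.2, 0.1])
x = np.full(3, 1.0 / 3.0)               # start at the uniform distribution
for _ in range(500):
    x = entropic_mirror_descent_step(x, x - target, step_size=0.5)
```

Each iterate stays strictly positive and sums to one without an explicit projection, which is the practical appeal of the Kullback-Leibler geometry on the simplex.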

1712.10158 2026-06-04 q-bio.NC cs.LG cs.NE cs.SY eess.SY stat.ML 版本更新

Non-linear motor control by local learning in spiking neural networks

通过局部学习在脉冲神经网络中实现非线性运动控制

Aditya Gilra, Wulfram Gerstner

发表机构 * School of Computer and Communication Sciences(计算机与通信科学学院) Brain-Mind Institute, School of Life Sciences(脑科学与生命科学研究所)

AI总结 本文提出了一种基于反馈的在线局部权重学习(FOLLOW)方法,用于训练异质脉冲神经网络,以控制双连杆机械臂并重现期望的状态轨迹,核心贡献是通过局部可塑性规则学习逆模型以实现非线性动力学控制。

详情
Journal ref
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1773-1782, 2018
AI中文摘要

在具有隐藏神经元的脉冲神经网络中,使用局部、稳定且在线的规则学习权重,以控制非线性身体动力学是一个开放性问题。本文采用监督方案,反馈基于在线局部学习权重(FOLLOW),训练具有隐藏层的异质脉冲神经元网络,以控制双臂以重现期望状态轨迹。网络首先学习非线性动力学的逆模型,即从状态轨迹作为输入,学习推断产生轨迹的连续时间命令。连接权重通过涉及前突触放电和后突触误差反馈的局部可塑性规则进行调整。我们选择了一种称为微分前馈的网络架构,该架构在不同前馈和递归架构中提供了最低的测试误差。学习的逆模型随后用于生成连续时间运动命令以控制手臂,给定期望轨迹。

英文摘要

Learning weights in a spiking neural network with hidden neurons, using local, stable and online rules, to control non-linear body dynamics is an open problem. Here, we employ a supervised scheme, Feedback-based Online Local Learning Of Weights (FOLLOW), to train a network of heterogeneous spiking neurons with hidden layers, to control a two-link arm so as to reproduce a desired state trajectory. The network first learns an inverse model of the non-linear dynamics, i.e. from state trajectory as input to the network, it learns to infer the continuous-time command that produced the trajectory. Connection weights are adjusted via a local plasticity rule that involves pre-synaptic firing and post-synaptic feedback of the error in the inferred command. We choose a network architecture, termed differential feedforward, that gives the lowest test error from different feedforward and recurrent architectures. The learned inverse model is then used to generate a continuous-time motor command to control the arm, given a desired trajectory.

1712.06281 2026-06-04 math.OC cs.LG cs.SY eess.SY physics.chem-ph 版本更新

A New Data-Driven Sparse-Learning Approach to Study Chemical Reaction Networks

一种新的数据驱动稀疏学习方法用于研究化学反应网络

Farshad Harirchi, Doohyun Kim, Omar A. Khalil, Sijia Liu, Paolo Elvati, Angela Violi, Alfred O. Hero

发表机构 * Department of Electrical Engineering and Computer Science, University of Michigan, Ann Arbor, MI 48109--2125, USA(电气工程与计算机科学系,密歇根大学,安娜堡,MI 48109--2125,美国) Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI 48109--2125, USA(机械工程系,密歇根大学,安娜堡,MI 48109--2125,美国) Departments of Chemical Engineering, Biomedical Engineering, Macromolecular Science and Engineering, Biophysics Program, University of Michigan, Ann Arbor, MI 48109--2125, USA(化学工程、生物医学工程、大分子科学与工程、生物物理项目系,密歇根大学,安娜堡,MI 48109--2125,美国)

AI总结 本文提出了一种数据驱动的稀疏学习方法,用于识别化学反应网络中关键反应,该方法通过物种浓度和反应速率来确定影响反应,具有低计算成本,无需额外数据或模拟,应用于氢气和丙烷的燃烧化学分析,并展示了简化机制在点火延迟上的良好性能。

详情
AI中文摘要

化学动力学机制可以通过一组基本反应表示,这些反应可以利用物理化学关系轻松转换为数学表达式。反应的示意图表示捕捉了反应物和产物之间的相互作用。确定系统动态行为下的最小化学相互作用是一个主要任务。在本文中,我们介绍了一种新的方法,利用数据驱动的稀疏学习技术来识别化学反应网络中在燃烧应用中的关键反应。所提出的方法利用物种浓度和反应速率来确定一组关键反应,且具有最小的计算成本,无需额外的数据或模拟。新的方法应用于分析恒容均质反应器中氢气和丙烷的燃烧化学。稀疏学习方法识别出的关键反应与当前化学机制的速率理论知识一致。此外,我们还表明,可以将不同时间和条件下识别出的关键反应组合起来,生成一个简化版本的原始机制,并且对于氢气和丙烷,这种简化机制在广泛的条件下表现出与原始机制相近的点火延迟性能。我们的结果展示了稀疏学习方法作为有效且高效的机制分析和机制简化工具的潜力。

英文摘要

Chemical kinetic mechanisms can be represented by sets of elementary reactions that are easily translated into mathematical terms using physicochemical relationships. The schematic representation of reactions captures the interactions between reacting species and products. Determining the minimal chemical interactions underlying the dynamic behavior of systems is a major task. In this paper, we introduce a novel approach for the identification of the influential reactions in chemical reaction networks for combustion applications, using a data-driven sparse-learning technique. The proposed approach identifies a set of influential reactions using species concentrations and reaction rates, with minimal computational cost without requiring additional data or simulations. The new approach is applied to analyze the combustion chemistry of H2 and C3H8 in a constant-volume homogeneous reactor. The influential reactions identified by the sparse-learning method are consistent with the current kinetics knowledge of chemical mechanisms. Additionally, we show that a reduced version of the parent mechanism can be generated as a combination of the influential reactions identified at different times and conditions and that for both H2 and C3H8 this reduced mechanism performs closely to the parent mechanism as a function of ignition delay over a wide range of conditions. Our results demonstrate the potential of the sparse-learning approach as an effective and efficient tool for mechanism analysis and mechanism reduction.

1902.02542 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Predict Globally, Correct Locally: Parallel-in-Time Optimal Control of Neural Networks

全局预测,局部校正:神经网络优化的并行时间最优控制

Panos Parpas, Corey Muir

发表机构 * Department of Computing, Imperial College London, London, United Kingdom(帝国理工学院计算系,伦敦,英国)

AI总结 本文提出了一种新的分布式优化算法,通过将神经网络的层视为动力系统离散动力学,利用最优控制的共态(adjoints)与反向传播的关系,实现参数更新无需等待前向或反向传播完成,从而提高效率。

详情
AI中文摘要

动态系统最优控制与神经网络之间的联系在理论和实践中都具有价值。几位研究者利用这些联系来研究不同神经网络架构的稳定性,并开发了内存高效的训练算法。我们同样采用动态系统的观点来看待神经网络,但我们的目标与早期工作不同。我们利用动态系统、最优控制和神经网络之间的联系,开发了一种新的分布式优化算法。所提出的算法解决了分布式神经网络优化算法最显著的障碍:在数据的前向传播和梯度的反向传播完成之前,网络权重无法更新。利用动态系统的观点,我们将(残差)神经网络的层解释为动态系统的离散化动力学,并利用最优控制问题的共态(adjoints)与反向传播之间的关系。然后我们开发了一种时间并行方法,该方法无需等待前向或反向传播算法完全完成即可更新网络参数。我们建立了所提算法的收敛性。初步的数值结果表明,该算法具有竞争力,并且比最先进的方法更高效。

英文摘要

The links between optimal control of dynamical systems and neural networks have proved beneficial both from a theoretical and from a practical point of view. Several researchers have exploited these links to investigate the stability of different neural network architectures and develop memory efficient training algorithms. We also adopt the dynamical systems view of neural networks, but our aim is different from earlier works. We exploit the links between dynamical systems, optimal control, and neural networks to develop a novel distributed optimization algorithm. The proposed algorithm addresses the most significant obstacle for distributed algorithms for neural network optimization: the network weights cannot be updated until the forward propagation of the data, and backward propagation of the gradients are complete. Using the dynamical systems point of view, we interpret the layers of a (residual) neural network as the discretized dynamics of a dynamical system and exploit the relationship between the co-states (adjoints) of the optimal control problem and backpropagation. We then develop a parallel-in-time method that updates the parameters of the network without waiting for the forward or back propagation algorithms to complete in full. We establish the convergence of the proposed algorithm. Preliminary numerical results suggest that the algorithm is competitive and more efficient than the state-of-the-art.

1902.01064 2026-06-04 cs.DC cs.LG cs.SY eess.SY 版本更新

Hop: Heterogeneity-Aware Decentralized Training

Hop:异质性感知的去中心化训练

Qinyi Luo, Jinkun Lin, Youwei Zhuo, Xuehai Qian

发表机构 * University of Southern California(南加州大学) Tsinghua University(清华大学)

AI总结 本文提出Hop,首个考虑异质性的去中心化训练协议,通过引入迭代间隙这一独特特性,提出基于队列的同步机制以实现备份工作者和有限滞后,同时通过跳过迭代来缓解确定性延迟,实验表明在异质环境中相比标准去中心化训练有显著加速。

详情
AI中文摘要

近期研究表明，在机器学习领域，去中心化算法相较于集中化算法能提供更优的性能。这两种方法的主要区别在于其不同的通信模式，两者在异质环境中都可能受到性能下降的影响。尽管已有大量努力支持集中化算法对抗异质性，但针对去中心化算法的相关研究却十分有限。本文提出Hop，首个异质性感知的去中心化训练协议。基于我们识别出的去中心化训练的一个独特特性，即迭代间隙（iteration gap），我们提出一种基于队列的同步机制，能够在去中心化环境中高效实现备份工作者（backup workers）和有界陈旧度（bounded staleness）。为了应对确定性的减速，我们提出跳过迭代，以进一步减轻较慢工作者的影响。我们基于TensorFlow构建了Hop的原型实现。在CNN和SVM上的实验结果表明，在异质环境中相比标准去中心化训练有显著的加速效果。

英文摘要

Recent work has shown that decentralized algorithms can deliver superior performance over centralized ones in the context of machine learning. The two approaches, with the main difference residing in their distinct communication patterns, are both susceptible to performance degradation in heterogeneous environments. Although vigorous efforts have been devoted to supporting centralized algorithms against heterogeneity, little has been explored in decentralized algorithms regarding this problem. This paper proposes Hop, the first heterogeneity-aware decentralized training protocol. Based on a unique characteristic of decentralized training that we have identified, the iteration gap, we propose a queue-based synchronization mechanism that can efficiently implement backup workers and bounded staleness in the decentralized setting. To cope with deterministic slowdown, we propose skipping iterations so that the effect of slower workers is further mitigated. We build a prototype implementation of Hop on TensorFlow. The experiment results on CNN and SVM show significant speedup over standard decentralized training in heterogeneous settings.

1809.06277 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Optimal Matrix Momentum Stochastic Approximation and Applications to Q-learning

最优矩阵动量随机逼近及其在Q学习中的应用

Adithya M. Devraj, Ana Bušić, Sean Meyn

发表机构 * Department of Electrical and Computer Engineering, University of Florida(佛罗里达大学电气与计算机工程系)

AI总结 本文提出两种新的根寻找算法,PolSA和NeSA,用于解决优化问题,并探讨了这些算法在强化学习中的应用,特别是在Q学习中通过随机逼近实现最优渐近协方差。

详情
AI中文摘要

加速是随机优化文献中越来越常见的主题。两个最常见的例子是Nesterov的方法和Polyak的动量技术。在本文中，针对根寻找问题引入了两种新的算法：1）PolSA是一种具有特别设计的矩阵动量的根寻找算法，2）NeSA可以被视为Nesterov算法的一种变种，或PolSA的简化版本。即使在优化领域（当作为根寻找问题处理时），PolSA算法也是新的。本文所综述的研究受到强化学习应用的启发。众所周知，大多数TD-和Q学习的变种可以作为SA（随机逼近）算法来处理，且一般SA理论的工具可用于研究收敛性和收敛速率的界限。特别是，渐近方差是SA算法性能的常见度量标准，也是评估随机优化算法性能的多种度量之一。有两种广为人知的SA技术已知具有最优渐近方差：Ruppert-Polyak平均技术和随机牛顿-拉夫逊（SNR）。前者可能具有极差的瞬态性能，而后者计算成本较高。本文证明了新提出的PolSA算法的参数估计与理想（但更复杂）的SNR算法的估计发生耦合。因此，新算法成为获得最优渐近协方差的第三种方法。这些强结果需要对模型作出假设：考虑线性化模型，并假设噪声是一个鞅差序列。数值结果是在作为本文动机的非线性设置中获得的：在Q学习的PolSA实现中，观察到在这种非理想设置下同样出现与SNR的耦合。

英文摘要

Acceleration is an increasingly common theme in the stochastic optimization literature. The two most common examples are Nesterov's method, and Polyak's momentum technique. In this paper two new algorithms are introduced for root finding problems: 1) PolSA is a root finding algorithm with specially designed matrix momentum, and 2) NeSA can be regarded as a variant of Nesterov's algorithm, or a simplification of PolSA. The PolSA algorithm is new even in the context of optimization (when cast as a root finding problem). The research surveyed in this paper is motivated by applications to reinforcement learning. It is well known that most variants of TD- and Q-learning may be cast as SA (stochastic approximation) algorithms, and the tools from general SA theory can be used to investigate convergence and bounds on convergence rate. In particular, the asymptotic variance is a common metric of performance for SA algorithms, and is also one among many metrics used in assessing the performance of stochastic optimization algorithms. There are two well known SA techniques that are known to have optimal asymptotic variance: the Ruppert-Polyak averaging technique, and stochastic Newton-Raphson (SNR). The former algorithm can have extremely bad transient performance, and the latter can be computationally expensive. It is demonstrated here that parameter estimates from the new PolSA algorithm couple with those of the ideal (but more complex) SNR algorithm. The new algorithm is thus a third approach to obtain optimal asymptotic covariance. These strong results require assumptions on the model. A linearized model is considered, and the noise is assumed to be a martingale difference sequence. Numerical results are obtained in a non-linear setting that is the motivation for this work: In PolSA implementations of Q-learning it is observed that coupling occurs with SNR in this non-ideal setting.
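
As background for the stochastic approximation (SA) setting and the Ruppert-Polyak averaging technique mentioned above, here is a minimal scalar sketch. It assumes the toy root-finding problem f(theta) = E[X] - theta with Gaussian samples; the function name `sa_mean`, the step-size schedules and the sample distribution are illustrative assumptions. It does not implement PolSA, NeSA or SNR.

```python
import random

def sa_mean(samples, step):
    """Stochastic approximation for the root of f(theta) = E[X] - theta.

    Returns the final iterate and the Ruppert-Polyak average of all iterates.
    """
    theta, running_sum = 0.0, 0.0
    for n, x in enumerate(samples, start=1):
        theta += step(n) * (x - theta)   # x - theta is a noisy observation of f(theta)
        running_sum += theta
    return theta, running_sum / len(samples)

rng = random.Random(0)
samples = [rng.gauss(3.0, 1.0) for _ in range(20000)]

# Step size 1/n: the final iterate coincides with the running sample mean.
last_1n, _ = sa_mean(samples, lambda n: 1.0 / n)
# Larger step size n**-0.7: noisier iterates, smoothed by averaging them.
last_slow, averaged = sa_mean(samples, lambda n: n ** -0.7)
```

With the `1/n` schedule the recursion reproduces the sample mean exactly; the second call shows the averaging idea, where a more aggressive step size is paired with an average over the iterates.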

1806.05722 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Non-asymptotic Identification of LTI Systems from a Single Trajectory

基于单条轨迹的线性时不变系统非渐近辨识

Samet Oymak, Necmiye Ozay

发表机构 * Department of Electrical and Computer Engineering, University of California, Riverside, CA(加州大学河滨分校电气与计算机工程系) Electrical Engineering and Computer Science Department, University of Michigan, Ann Arbor, MI(密歇根大学安娜堡分校电气工程与计算机科学系)

AI总结 该研究针对单条输入输出轨迹，给出学习线性时不变系统马尔可夫参数的有限时间分析，再由经典Ho-Kalman算法得到平衡实现，并结合Ho-Kalman算法的稳定性结果与样本复杂度分析，确定学习平衡实现所需的数据量。

Comments Version 2 has two improvements: First, paper now uses spectral radius rather than largest singular value hence applies to a larger class of systems. Secondly, new sample complexity bounds are provided for approximating the system's Hankel operator via estimated Markov parameters

详情
AI中文摘要

我们考虑从输入输出数据学习线性时不变（LTI）动态系统实现的问题。给定单条输入输出轨迹，我们对系统马尔可夫参数的学习给出有限时间分析，并由这些参数通过经典的Ho-Kalman算法得到平衡实现。通过证明Ho-Kalman算法的一个稳定性结果，并将其与马尔可夫参数的样本复杂度结果相结合，我们说明了要以高概率在期望精度内学习系统的平衡实现需要多少数据。

英文摘要

We consider the problem of learning a realization for a linear time-invariant (LTI) dynamical system from input/output data. Given a single input/output trajectory, we provide finite time analysis for learning the system's Markov parameters, from which a balanced realization is obtained using the classical Ho-Kalman algorithm. By proving a stability result for the Ho-Kalman algorithm and combining it with the sample complexity results for Markov parameters, we show how much data is needed to learn a balanced realization of the system up to a desired accuracy with high probability.

1803.06443 2026-06-04 cs.LG cs.DC cs.SY eess.SY stat.ML 版本更新

Communication Compression for Decentralized Training

去中心化训练中的通信压缩

Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu

发表机构 * Department of Computer Science, University of Rochester(罗切斯特大学计算机科学系) Department of Computer Science, ETH Zurich(苏黎世联邦理工学院计算机科学系) Tencent AI Lab(腾讯AI实验室)

AI总结 本文研究了在高延迟和低带宽网络中结合通信压缩与去中心化技术以实现鲁棒训练系统的问题,提出了两种新的压缩策略并证明了其收敛性。

详情
AI中文摘要

优化分布式学习系统是平衡计算与通信的艺术。已有两种研究方向试图解决网络速度慢的问题：通信压缩用于低带宽网络，去中心化用于高延迟网络。本文探讨了一个自然问题：能否将这两种技术结合，使系统同时对带宽和延迟具有鲁棒性？尽管这种组合在系统层面的含义是显而易见的，但其理论原理和算法设计却极具挑战性：与集中式算法不同，简单地在去中心化网络中压缩交换信息，即使以无偏随机方式，也会累积误差并导致无法收敛。本文提出了一种压缩的去中心化训练框架，并提出了两种不同的策略，分别称为外推压缩（extrapolation compression）和差分压缩（difference compression）。我们分析了这两种算法并证明了它们以 $O(1/\sqrt{nT})$ 的速率收敛，其中 $n$ 是工作者数量，$T$ 是迭代次数，与全精度集中式训练的收敛速率相匹配。我们验证了我们的算法，并发现对于同时具有高延迟和低带宽的网络，我们的算法显著优于仅去中心化和仅量化算法中的最优者。

英文摘要

Optimizing distributed learning systems is an art of balancing between computation and communication. There have been two lines of research that try to deal with slower networks: communication compression for low bandwidth networks, and decentralization for high latency networks. In this paper, we explore a natural question: can the combination of both techniques lead to a system that is robust to both bandwidth and latency? Although the system implication of such a combination is trivial, the underlying theoretical principle and algorithm design are challenging: unlike centralized algorithms, simply compressing exchanged information, even in an unbiased stochastic way, within the decentralized network would accumulate the error and fail to converge. In this paper, we develop a framework of compressed, decentralized training and propose two different strategies, which we call extrapolation compression and difference compression. We analyze both algorithms and prove both converge at the rate of $O(1/\sqrt{nT})$ where $n$ is the number of workers and $T$ is the number of iterations, matching the convergence rate for full precision, centralized training. We validate our algorithms and find that our proposed algorithm significantly outperforms the best of merely decentralized and merely quantized algorithms for networks with both high latency and low bandwidth.
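
The phrase "compressing exchanged information ... in an unbiased stochastic way" can be made concrete with a small sketch of stochastic rounding, one common unbiased quantizer. The function names and the uniform grid are assumptions chosen for illustration; this is neither the extrapolation nor the difference compression strategy of the paper, only the kind of compression operator such schemes build on.

```python
import math
import random

def stochastic_round(value, step, rng):
    """Round `value` to a multiple of `step`, up or down at random,
    so that the expectation of the result equals `value`."""
    lower = math.floor(value / step) * step
    prob_up = (value - lower) / step          # probability of rounding up
    return lower + step if rng.random() < prob_up else lower

def compress(vector, step, rng):
    """Quantize every coordinate of a vector independently."""
    return [stochastic_round(v, step, rng) for v in vector]
```

Each coordinate moves by at most one grid step, and averaging many independent quantizations of the same value recovers that value, which is what "unbiased" means here.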

1809.08911 2026-06-04 cs.LG cs.CY cs.SY eess.SP eess.SY stat.ML 版本更新

Understanding Compressive Adversarial Privacy

理解压缩对抗隐私

Xiao Chen, Peter Kairouz, Ram Rajagopal

发表机构 * Stanford University(斯坦福大学)

AI总结 本文提出了一种压缩对抗隐私框架,通过凸优化在数据隐私和效用之间取得平衡,并通过实证应用展示了该框架在保护敏感信息方面的有效性。

详情
Journal ref
2018 IEEE Conference on Decision and Control (CDC)
AI中文摘要

设计一种不牺牲过多隐私的数据共享机制可以被视为数据持有者与恶意攻击者之间的博弈。本文描述了一种压缩对抗隐私框架,该框架捕捉了数据隐私与效用之间的权衡。我们在假设数据持有者和攻击者只能使用线性变换修改数据的情况下,通过凸优化确定最优的数据发布机制。随后,我们构建了一个更加现实的数据发布机制,该机制可以依赖于非线性压缩模型,而攻击者则使用神经网络。通过一系列实证应用,我们展示了该框架,即压缩对抗隐私,能够保护敏感信息。

英文摘要

Designing a data sharing mechanism without sacrificing too much privacy can be considered as a game between data holders and malicious attackers. This paper describes a compressive adversarial privacy framework that captures the trade-off between the data privacy and utility. We characterize the optimal data releasing mechanism through convex optimization when assuming that both the data holder and attacker can only modify the data using linear transformations. We then build a more realistic data releasing mechanism that can rely on a nonlinear compression model while the attacker uses a neural network. We demonstrate in a series of empirical applications that this framework, consisting of compressive adversarial privacy, can preserve sensitive information.

1610.05202 2026-06-04 cs.LG cs.AI cs.DC cs.SY eess.SY stat.ML 版本更新

Decentralized Collaborative Learning of Personalized Models over Networks

网络上的去中心化协作学习个性化模型

Paul Vanhaesebrouck, Aurélien Bellet, Marc Tommasi

发表机构 * INRIA

AI总结 本文研究了在协作对等网络中,如何通过与其他具有相似目标的代理通信来改进本地训练模型,提出两种异步 gossip 算法并基于 ADMM 实现去中心化算法。

Comments To appear in the Proceedings of the 20th International Conference on Artificial Intelligence and Statistics (AISTATS 2017)

详情
AI中文摘要

我们考虑了一个协作对等网络中的学习代理集合,其中每个代理根据其自身的学习目标学习一个个性化模型。本文研究的问题是:如何通过与其他具有相似目标的代理通信来改进本地训练的模型?我们引入并分析了两种异步 gossip 算法,以完全去中心化的方式运行。我们的第一种方法受标签传播启发,旨在在网络中平滑预训练的本地模型,同时考虑每个代理对其初始模型的置信度。我们的第二种方法中,代理通过基于本地数据集和邻居行为的迭代更新联合学习和传播模型。为了优化这个具有挑战性的目标,我们的去中心化算法基于 ADMM。

英文摘要

We consider a set of learning agents in a collaborative peer-to-peer network, where each agent learns a personalized model according to its own learning objective. The question addressed in this paper is: how can agents improve upon their locally trained model by communicating with other agents that have similar objectives? We introduce and analyze two asynchronous gossip algorithms running in a fully decentralized manner. Our first approach, inspired by label propagation, aims to smooth pre-trained local models over the network while accounting for the confidence that each agent has in its initial model. In our second approach, agents jointly learn and propagate their model by making iterative updates based on both their local dataset and the behavior of their neighbors. To optimize this challenging objective, our decentralized algorithm is based on ADMM.

1606.02421 2026-06-04 stat.ML cs.AI cs.DC cs.LG cs.SY eess.SY 版本更新

Gossip Dual Averaging for Decentralized Optimization of Pairwise Functions

用于成对函数去中心化优化的 Gossip 对偶平均

Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

发表机构 * Magnet Team, INRIA Lille – Nord Europe(Magnet团队，法国国家信息与自动化研究所里尔-北欧研究中心)

AI总结 本文提出了一种基于对偶平均的 gossip 算法，用于在去中心化网络中优化成对函数，适用于排序、距离度量学习和图推断等应用，可在同步和异步设置下求解，并展示了其在AUC最大化和度量学习中的实际应用。

详情
AI中文摘要

在去中心化网络（如传感器、联网设备等）中，迫切需要高效算法来优化全局代价函数，例如从每个计算单元收集的本地数据中学习全局模型。本文研究数据点成对函数的去中心化最小化问题，这些数据点分布在一个图的节点上，该图定义了网络的通信拓扑。这一一般性问题在排序、距离度量学习和图推断等领域都有应用。我们提出了基于对偶平均的新型 gossip 算法，旨在在同步和异步设置中解决此类问题。所提出的框架足够灵活，能够处理该优化问题的带约束和带正则化的变体。我们的理论分析表明，所提出的算法保持了集中式对偶平均的收敛速率，仅相差一个加性偏差项。我们还给出了在AUC（ROC曲线下面积）最大化和度量学习问题上的数值模拟，展示了我们方法的实际价值。

英文摘要

In decentralized networks (of sensors, connected objects, etc.), there is an important need for efficient algorithms to optimize a global cost function, for instance to learn a global model from the local data collected by each computing unit. In this paper, we address the problem of decentralized minimization of pairwise functions of the data points, where these points are distributed over the nodes of a graph defining the communication topology of the network. This general problem finds applications in ranking, distance metric learning and graph inference, among others. We propose new gossip algorithms based on dual averaging that aim at solving such problems both in synchronous and asynchronous settings. The proposed framework is flexible enough to deal with constrained and regularized variants of the optimization problem. Our theoretical analysis reveals that the proposed algorithms preserve the convergence rate of centralized dual averaging up to an additive bias term. We present numerical simulations on Area Under the ROC Curve (AUC) maximization and metric learning problems which illustrate the practical interest of our approach.
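
For readers unfamiliar with gossip protocols, the basic communication primitive can be sketched as randomized pairwise averaging over the edges of the communication graph. The ring topology, the function name `gossip_average` and the initial values are illustrative assumptions; the sketch computes a plain network average and is not the dual averaging algorithm for pairwise functions proposed in the paper.

```python
import random

def gossip_average(values, edges, num_rounds, rng):
    """Randomized pairwise gossip: in each round one random edge wakes up
    and its two endpoints replace their values by their average."""
    x = list(values)
    for _ in range(num_rounds):
        i, j = rng.choice(edges)
        x[i] = x[j] = 0.5 * (x[i] + x[j])
    return x

# Four nodes on a ring; every node should approach the global mean of 4.0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
estimates = gossip_average([1.0, 5.0, 3.0, 7.0], ring,
                           num_rounds=500, rng=random.Random(0))
```

Each exchange preserves the sum of the node values, so the only point all nodes can agree on is the global mean, and no central coordinator is needed.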

1511.05464 2026-06-04 stat.ML cs.DC cs.LG cs.SY eess.SY stat.CO 版本更新

Extending Gossip Algorithms to Distributed Estimation of U-Statistics

将 gossip 算法扩展到分布式 U-统计量估计

Igor Colin, Aurélien Bellet, Joseph Salmon, Stéphan Clémençon

发表机构 * INRIA Lille - Nord Europe(INRIA里尔-北欧洲)

AI总结 本文提出新的同步和异步随机 gossip 算法,用于在分布式网络中同时传播数据并维护局部的 U-统计量估计,证明了同步和异步情况下的收敛率分别为 O(1/t) 和 O(log t / t),并通过数值实验验证了算法的优越性。

Comments to be presented at NIPS 2015

详情
AI中文摘要

高效且稳健的去中心化网络估计算法对于许多分布式系统至关重要。尽管样本均值统计的分布式估计已受到广泛关注,但依赖于对观测对的更昂贵平均的 U-统计量计算却是一个研究较少的领域。然而,这些数据函数对于描述统计总体的全局特性至关重要,重要例子包括曲线下面积、经验方差、基尼均差和簇内点散度。本文提出新的同步和异步随机 gossip 算法,同时在网络中传播数据并维护感兴趣的 U-统计量的局部估计。我们建立了同步和异步情况下的收敛率界分别为 O(1/t) 和 O(log t / t),其中 t 是迭代次数,且具有明确的数据和网络依赖项。除了在速率分析方面的优越比较外,数值实验还提供了实证证据,证明所提出的算法优于之前引入的方法。

英文摘要

Efficient and robust algorithms for decentralized estimation in networks are essential to many distributed systems. Whereas distributed estimation of sample mean statistics has been the subject of a good deal of attention, computation of $U$-statistics, relying on more expensive averaging over pairs of observations, is a less investigated area. Yet, such data functionals are essential to describe global properties of a statistical population, with important examples including Area Under the Curve, empirical variance, Gini mean difference and within-cluster point scatter. This paper proposes new synchronous and asynchronous randomized gossip algorithms which simultaneously propagate data across the network and maintain local estimates of the $U$-statistic of interest. We establish convergence rate bounds of $O(1/t)$ and $O(\log t / t)$ for the synchronous and asynchronous cases respectively, where $t$ is the number of iterations, with explicit data and network dependent terms. Beyond favorable comparisons in terms of rate analysis, numerical experiments provide empirical evidence that the proposed algorithms surpass the previously introduced approach.
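
To make "averaging over pairs of observations" concrete, the following sketch computes two of the $U$-statistics named above, the empirical variance and the Gini mean difference, on a single machine. The helper name `u_statistic` and the sample data are illustrative assumptions; the gossip algorithms of the paper estimate the same quantities without gathering the data in one place.

```python
from itertools import combinations
import statistics

def u_statistic(data, kernel):
    """Average a symmetric kernel h(x, y) over all unordered pairs."""
    pairs = list(combinations(data, 2))
    return sum(kernel(x, y) for x, y in pairs) / len(pairs)

data = [2.0, 4.0, 4.0, 5.0, 9.0]

# Kernel (x - y)^2 / 2 yields the unbiased sample variance.
variance = u_statistic(data, lambda x, y: 0.5 * (x - y) ** 2)
# Kernel |x - y| yields the Gini mean difference.
gini_mean_difference = u_statistic(data, lambda x, y: abs(x - y))
```

The number of pairs grows quadratically with the number of observations, which is why the abstract describes these statistics as more expensive than a sample mean.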

1812.06325 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Data-efficient Auto-tuning with Bayesian Optimization: An Industrial Control Study

基于贝叶斯优化的数据高效自动调优：一项工业控制研究

Matthias Neumann-Brosig, Alonso Marco, Dieter Schwarzmann, Sebastian Trimpe

发表机构 * IAV GmbH(IAV公司) Max Planck Society(马克斯·普朗克学会) Cyber Valley initiative(Cyber Valley倡议) Max Planck Institute for Intelligent Systems(马克斯·普朗克智能系统研究所)

AI总结 本文提出利用贝叶斯优化自动学习最优控制器参数，通过概率模型（高斯过程）建模控制器参数到用户定义成本的未知函数，并通过实验数据迭代优化，以高效找到全局最优参数，实验表明其在节流阀控制中优于手动校准。

Comments 11 pages, 7 figures and 4 tables. To appear in IEEE Transactions on Control Systems Technology

详情
AI中文摘要

贝叶斯优化被提出用于从实验数据自动学习最优控制器参数。通过概率描述（高斯过程）建模控制器参数到用户定义成本的未知函数。概率模型利用数据进行更新，这些数据通过在物理系统上测试一组参数并评估成本获得。为加快学习速度，贝叶斯优化算法以系统化的方式选择下一步要评估的参数，例如通过最大化关于最优解的信息增益。因此，该算法只需少量实验即可迭代地找到全局最优参数。以节流阀控制作为有代表性的工业控制实例，所提出的自动调优方法被证明优于手动校准：它在实验次数很少的情况下始终取得更好的性能。所提出的自动调优框架具有灵活性，可处理不同的控制结构和目标。

英文摘要

Bayesian optimization is proposed for automatic learning of optimal controller parameters from experimental data. A probabilistic description (a Gaussian process) is used to model the unknown function from controller parameters to a user-defined cost. The probabilistic model is updated with data, which is obtained by testing a set of parameters on the physical system and evaluating the cost. In order to learn fast, the Bayesian optimization algorithm selects the next parameters to evaluate in a systematic way, for example, by maximizing information gain about the optimum. The algorithm thus iteratively finds the globally optimal parameters with only few experiments. Taking throttle valve control as a representative industrial control example, the proposed auto-tuning method is shown to outperform manual calibration: it consistently achieves better performance with a low number of experiments. The proposed auto-tuning framework is flexible and can handle different control structures and objectives.

1809.06750 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Multiobjective Reinforcement Learning for Reconfigurable Adaptive Optimal Control of Manufacturing Processes

用于制造过程可重构自适应最优控制的多目标强化学习

Johannes Dornheim, Norbert Link

发表机构 * Intelligent Systems Research Group (ISRG)(智能系统研究组)

AI总结 本文提出了一种新型无模型多目标强化学习方法,用于制造过程的自适应最优控制,能够高效学习不同目标权重下的控制配置。

Comments Conference, Preprint, 978-1-5386-5925-0/18/$31.00 © 2018 IEEE

详情
Journal ref
2018 IEEE International Symposium on Electronics and Telecommunications (ISETC)
AI中文摘要

在自适应最优控制的工业应用中，往往需要考虑多个相互矛盾的目标。这些目标的权重（相对重要性）在控制设计期间通常是未知的，并且会随着生产条件和要求的变化而变化。本文提出了一种新的无模型多目标强化学习方法，用于制造过程的自适应最优控制。该方法能够在由特定目标权重给定的一系列控制配置中实现样本高效的学习。

英文摘要

In industrial applications of adaptive optimal control often multiple contrary objectives have to be considered. The weights (relative importance) of the objectives are often not known during the design of the control and can change with changing production conditions and requirements. In this work a novel model-free multiobjective reinforcement learning approach for adaptive optimal control of manufacturing processes is proposed. The approach enables sample-efficient learning in sequences of control configurations, given by particular objective weights.

1811.04455 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Learning with tree-based tensor formats

基于树结构张量格式的学习

Erwan Grelier, Anthony Nouy, Mathilde Chevreuil

AI总结 本文研究了在统计学习设置中,通过经验风险最小化在树结构张量格式的模型类中近似高维函数的问题,提出了一种基于树结构张量格式的模型选择策略和树优化算法,以提高学习的数值稳定性与可靠性。

详情
AI中文摘要

本文关注在统计学习设置中，通过在树结构张量格式的模型类中进行经验风险最小化来近似高维函数的问题。这些是特定的秩结构函数类，可以视为具有与树相关的稀疏架构和多线性激活函数的深度神经网络。对于给定模型类中的学习，我们利用树结构张量格式是多线性模型的事实，将非线性集合上的风险最小化问题转换为一系列线性模型的学习问题。适当的表示变换会产生数值稳定的学习问题，并允许利用稀疏性。对于高维问题或仅有小数据集可用时，选择合适的模型类是一个关键问题。对于给定的树，选择使风险最小的树结构秩元组是一个组合问题。在这里，我们提出了一种秩自适应策略，在实践中能使风险随模型类复杂度良好地收敛。找到合适的树也是一个组合问题，可以与深度神经网络特定稀疏架构的选择相关联。在这里，我们提出了一种随机算法，用于在具有给定元数（arity）的树类上最小化给定函数的表示复杂度，并允许树的拓扑结构变化。该树优化算法随后被纳入一种学习方案中，该方案依次调整树和相应的树结构秩。与经典的非线性模型类学习算法不同，所提出的算法数值稳定、可靠，并且只要求用户具备较低水平的专业知识。

英文摘要

This paper is concerned with the approximation of high-dimensional functions in a statistical learning setting, by empirical risk minimization over model classes of functions in tree-based tensor format. These are particular classes of rank-structured functions that can be seen as deep neural networks with a sparse architecture related to the tree and multilinear activation functions. For learning in a given model class, we exploit the fact that tree-based tensor formats are multilinear models and recast the problem of risk minimization over a nonlinear set into a succession of learning problems with linear models. Suitable changes of representation yield numerically stable learning problems and allow sparsity to be exploited. For high-dimensional problems or when only a small data set is available, the selection of a good model class is a critical issue. For a given tree, the selection of the tuple of tree-based ranks that minimizes the risk is a combinatorial problem. Here, we propose a rank adaptation strategy which provides in practice a good convergence of the risk as a function of the model class complexity. Finding a good tree is also a combinatorial problem, which can be related to the choice of a particular sparse architecture for deep neural networks. Here, we propose a stochastic algorithm for minimizing the complexity of the representation of a given function over a class of trees with a given arity, allowing changes in the topology of the tree. This tree optimization algorithm is then included in a learning scheme that successively adapts the tree and the corresponding tree-based ranks. Contrary to classical learning algorithms for nonlinear model classes, the proposed algorithms are numerically stable, reliable, and require only a low level of expertise from the user.

1805.11572 2026-06-04 cs.CV cs.LG cs.NA math.NA stat.ML 版本更新

Adversarial Regularizers in Inverse Problems

对抗正则化在反问题中的应用

Sebastian Lunz, Ozan Öktem, Carola-Bibiane Schönlieb

发表机构 * DAMTP Department of Mathematics(DAMTP数学系) University of Cambridge(剑桥大学) KTH - Royal Institute of Technology(皇家理工学院)

AI总结 本文提出了一种利用神经网络作为正则化函数的新框架,用于解决反问题,该方法通过学习真实图像分布与未正则化重建分布之间的差异来提升反问题求解的性能。

Comments published at NeurIPS 2018

详情
AI中文摘要

医学成像和计算机视觉中的反问题传统上使用纯基于模型的方法来解决。其中，变分正则化模型是最流行的方法之一。我们提出了一种新的框架，用于将数据驱动的方法应用于反问题，使用神经网络作为正则化泛函。网络学习区分真实图像的分布与未正则化重建结果的分布。一旦训练完成，便通过求解相应的变分问题将网络应用于反问题。与其他基于数据的反问题方法不同，该算法即使在只有无监督训练数据可用的情况下也能应用。实验展示了该框架在BSDS数据集上的去噪潜力以及在LIDC数据集上的计算机断层扫描重建潜力。

英文摘要

Inverse problems in medical imaging and computer vision are traditionally solved using purely model-based methods. Among those, variational regularization models are one of the most popular approaches. We propose a new framework for applying data-driven approaches to inverse problems, using a neural network as a regularization functional. The network learns to discriminate between the distribution of ground truth images and the distribution of unregularized reconstructions. Once trained, the network is applied to the inverse problem by solving the corresponding variational problem. Unlike other data-based approaches for inverse problems, the algorithm can be applied even if only unsupervised training data is available. Experiments demonstrate the potential of the framework for denoising on the BSDS dataset and for computed tomography reconstruction on the LIDC dataset.

1704.04163 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Spectrum Approximation Beyond Fast Matrix Multiplication: Algorithms and Hardness

超越快速矩阵乘法的谱近似:算法与难度

Cameron Musco, Praneeth Netrapalli, Aaron Sidford, Shashanka Ubaru, David P. Woodruff

发表机构 * MIT(麻省理工学院) Microsoft Research(微软研究院) Stanford University(斯坦福大学) University of Minnesota(明尼苏达大学) Carnegie Mellon University(卡内基梅隆大学)

AI总结 本文研究了如何在比矩阵乘法时间更快的运行时间内近似矩阵的谱,提出了一种基于随机迹估计、多项式逼近和快速系统求解器的算法,能够高效地隔离矩阵谱的不同范围并近似奇异值的数量,从而在许多应用中替代真实的奇异值。

Comments ITCS 2018

详情
AI中文摘要

理解矩阵$A \in \mathbb{R}^{n \times n}$的奇异值谱是众多应用中的基本任务。在矩阵乘法时间内，可以执行完整的SVD并直接计算奇异值$σ_1,...,σ_n$。然而，对于能够突破这一运行时间障碍的算法，人们知之甚少。利用随机迹估计、多项式逼近和快速系统求解器的工具，我们展示了如何高效地隔离$A$的谱的不同范围并近似这些范围内的奇异值数量。因此，我们有效地计算了谱的直方图，这在许多应用中可以替代真实的奇异值。我们使用这一基本原语，给出了首批以快于矩阵乘法的时间近似一大类对称矩阵范数的算法。例如，我们给出了一种$(1 + ε)$近似算法，用于Schatten-$1$范数（核范数），对于具有均匀行稀疏性的$A$，运行时间仅为$\tilde O((nnz(A)n^{1/3} + n^2)ε^{-3})$，对于稠密矩阵则为$\tilde O(n^{2.18} ε^{-3})$。对于一般的Schatten-$p$范数，运行时间平滑地变化，特别是对于任何$p \ge 2$，运行时间变为$\tilde O(p \cdot nnz(A) ε^{-3})$。同时，我们证明了在小$ε$范围内，谱近似的复杂性本质上与快速矩阵乘法密切相关。我们证明，如果我们的算法能实现更温和的$ε$依赖性，则意味着存在针对一般图的、快于矩阵乘法时间的三角形检测算法。这进一步意味着，在亚立方时间内运行的高精度算法将导致亚立方时间的矩阵乘法。作为我们界限的应用，我们表明，除非出现重大的算法突破，否则在少于矩阵乘法的时间内精确计算图中所有有效电阻很可能是困难的。

英文摘要

Understanding the singular value spectrum of a matrix $A \in \mathbb{R}^{n \times n}$ is a fundamental task in countless applications. In matrix multiplication time, it is possible to perform a full SVD and directly compute the singular values $σ_1,...,σ_n$. However, little is known about algorithms that break this runtime barrier. Using tools from stochastic trace estimation, polynomial approximation, and fast system solvers, we show how to efficiently isolate different ranges of $A$'s spectrum and approximate the number of singular values in these ranges. We thus effectively compute a histogram of the spectrum, which can stand in for the true singular values in many applications. We use this primitive to give the first algorithms for approximating a wide class of symmetric matrix norms in faster than matrix multiplication time. For example, we give a $(1 + ε)$ approximation algorithm for the Schatten-$1$ norm (the nuclear norm) running in just $\tilde O((nnz(A)n^{1/3} + n^2)ε^{-3})$ time for $A$ with uniform row sparsity or $\tilde O(n^{2.18} ε^{-3})$ time for dense matrices. The runtime scales smoothly for general Schatten-$p$ norms, notably becoming $\tilde O (p \cdot nnz(A) ε^{-3})$ for any $p \ge 2$. At the same time, we show that the complexity of spectrum approximation is inherently tied to fast matrix multiplication in the small $ε$ regime. We prove that achieving milder $ε$ dependencies in our algorithms would imply faster than matrix multiplication time triangle detection for general graphs. This further implies that highly accurate algorithms running in subcubic time yield subcubic time matrix multiplication. As an application of our bounds, we show that precisely computing all effective resistances in a graph in less than matrix multiplication time is likely difficult, barring a major algorithmic breakthrough.
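
Stochastic trace estimation, the first tool named above, can be illustrated with the classical Hutchinson estimator: for a random sign vector z, the expectation of z^T A z equals tr(A). The sketch below is a plain-Python illustration on a small dense matrix; the function names are assumptions. In the setting of the abstract, such an estimator would be applied to a polynomial of the matrix, accessed only through matrix-vector products, which this sketch does not attempt.

```python
import random

def quadratic_form(matrix, z):
    """Return z^T A z for a square matrix given as a list of rows."""
    n = len(matrix)
    return sum(z[i] * matrix[i][j] * z[j] for i in range(n) for j in range(n))

def hutchinson_trace(matrix, num_probes, rng):
    """Estimate tr(A) by averaging z^T A z over random sign vectors z."""
    n = len(matrix)
    total = 0.0
    for _ in range(num_probes):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        total += quadratic_form(matrix, z)
    return total / num_probes
```

The off-diagonal terms cancel in expectation because the signs of distinct coordinates are independent, while the diagonal terms always contribute with weight one.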

1704.03371 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Sublinear Time Low-Rank Approximation of Positive Semidefinite Matrices

半正定矩阵的亚线性时间低秩近似

Cameron Musco, David P. Woodruff

发表机构 * MIT(麻省理工学院) Carnegie Mellon University(卡内基梅隆大学)

AI总结 本文提出了一种在亚线性时间内计算半正定（PSD）矩阵低秩近似的算法，以因子形式输出一个秩为k的矩阵B，使得B与原矩阵A的F范数平方误差不超过最优秩k近似A_k的F范数平方误差的(1+ε)倍，并在特定条件下无需读取全部矩阵元素。

详情
AI中文摘要

我们展示了如何在亚线性时间内计算任意半正定（PSD）矩阵的相对误差低秩近似。即对于任意$n \times n$的半正定矩阵$A$，在$\tilde O(n \cdot poly(k/ε))$时间内以因子形式输出一个秩$k$矩阵$B$，使得$\|A-B\|_F^2 \leq (1+ε)\|A-A_k\|_F^2$，其中$A_k$是$A$的最佳秩$k$近似。当$k$和$1/ε$与$A$的稀疏性相比不太大时，我们的算法不需要读取矩阵的所有元素。因此，我们显著改进了先前基于不经意子空间嵌入（oblivious subspace embeddings）的$nnz(A)$时间算法，并绕过了一般矩阵的$nnz(A)$时间下界（其中$nnz(A)$表示矩阵中非零元素的个数）。我们证明了半正定矩阵低秩近似的时间下界，表明我们的算法接近最优。最后，我们扩展了我们的技术，给出了在（通常更强的）谱范数度量$\|A-B\|_2^2$下对$A$进行低秩近似的亚线性时间算法，以及半正定矩阵上岭回归的亚线性时间算法。

英文摘要

We show how to compute a relative-error low-rank approximation to any positive semidefinite (PSD) matrix in sublinear time, i.e., for any $n \times n$ PSD matrix $A$, in $\tilde O(n \cdot poly(k/ε))$ time we output a rank-$k$ matrix $B$, in factored form, for which $\|A-B\|_F^2 \leq (1+ε)\|A-A_k\|_F^2$, where $A_k$ is the best rank-$k$ approximation to $A$. When $k$ and $1/ε$ are not too large compared to the sparsity of $A$, our algorithm does not need to read all entries of the matrix. Hence, we significantly improve upon previous $nnz(A)$ time algorithms based on oblivious subspace embeddings, and bypass an $nnz(A)$ time lower bound for general matrices (where $nnz(A)$ denotes the number of non-zero entries in the matrix). We prove time lower bounds for low-rank approximation of PSD matrices, showing that our algorithm is close to optimal. Finally, we extend our techniques to give sublinear time algorithms for low-rank approximation of $A$ in the (often stronger) spectral norm metric $\|A-B\|_2^2$ and for ridge regression on PSD matrices.

1812.09701 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Nonlinear Robust Filtering of Sampled-Data Dynamical Systems

非线性采样数据动力系统鲁棒滤波

Masoud Abbaszadeh, Horacio J. Marquez

发表机构 * GE Global Research(GE全球研究) University of Alberta(阿尔伯塔大学)

AI总结 本文研究了具有和不具有精确离散时间模型的非线性采样数据系统的鲁棒滤波问题,提出了一种基于线性矩阵不等式的方法来设计鲁棒H∞观测器,并针对两种系统类型进行了分析,证明了观测器的收敛性,并通过欧拉近似离散时间模型证明了实际收敛性,同时通过最大化可允许的Lipschitz常数来保证对非线性不确定性的鲁棒性。

Comments 21 pages, 2 figures

详情
AI中文摘要

本文研究了具有和不具有精确离散时间模型的非线性采样数据系统的鲁棒滤波问题。提出了一种基于线性矩阵不等式(LMI)的方法,用于设计一类Lipschitz非线性系统的鲁棒H∞观测器。考虑了两种类型的系统:Lipschitz非线性离散时间系统和具有欧拉近似离散时间模型的Lipschitz非线性采样数据系统。当系统具有精确离散时间模型时,证明了观测器的收敛性。然后,利用欧拉近似离散时间模型证明了所提出观测器的实际收敛性。此外,通过最大化可允许的Lipschitz常数,所提出的LMI优化问题的解能够保证对某些非线性不确定性的鲁棒性。对于这两种情况,解决了鲁棒H∞观测器合成问题。通过LMI优化实现最大扰动衰减水平。最后,提供了一条将结果扩展到更高阶近似离散化的方法路径。

英文摘要

This work is concerned with robust filtering of nonlinear sampled-data systems with and without exact discrete-time models. A linear matrix inequality (LMI) based approach is proposed for the design of robust $H_{\infty}$ observers for a class of Lipschitz nonlinear systems. Two types of systems are considered: Lipschitz nonlinear discrete-time systems and Lipschitz nonlinear sampled-data systems with Euler approximate discrete-time models. Observer convergence is shown when the exact discrete-time model of the system is available. Then, practical convergence of the proposed observer is proved using the Euler approximate discrete-time model. As an additional feature, by maximizing the admissible Lipschitz constant, the solution of the proposed LMI optimization problem guarantees robustness against some nonlinear uncertainty. The robust $H_{\infty}$ observer synthesis problem is solved for both cases. The maximum disturbance attenuation level is achieved through LMI optimization. Finally, a path to extending the results to higher-order approximate discretizations is provided.

1805.11706 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY stat.ML 版本更新

Supervised Policy Update for Deep Reinforcement Learning

深度强化学习中的监督策略更新

Quan Vuong, Yiming Zhang, Keith W. Ross

发表机构 * University of California, San Diego(加州大学圣地亚哥分校) New York University(纽约大学)

AI总结 本文提出了一种新的样本效率高的方法,称为监督策略更新(SPU),用于深度强化学习。该方法通过当前策略生成的数据,在非参数化的近端策略空间中构建并求解一个约束优化问题,然后利用监督回归将最优的非参数化策略转换为参数化策略,从而生成新的样本。该方法适用于离散和连续动作空间,并能处理多种接近约束。本文展示了如何通过该方法解决自然策略梯度和信任区域策略优化(NPG/TRPO)以及近端策略优化(PPO)问题。SPU的实现比TRPO更简单,在样本效率方面,实验表明SPU在Mujoco模拟机器人任务中优于TRPO,在Atari视频游戏任务中优于PPO。

Comments Accepted as a conference paper at ICLR 2019

详情
AI中文摘要

我们提出了一种新的样本效率高的方法,称为监督策略更新(SPU),用于深度强化学习。从当前策略生成的数据开始,SPU在非参数化的近端策略空间中构建并求解一个约束优化问题。利用监督回归,它将最优的非参数化策略转换为参数化策略,从而生成新的样本。该方法具有通用性,适用于离散和连续动作空间,并能处理多种接近约束。我们展示了如何通过该方法解决自然策略梯度和信任区域策略优化(NPG/TRPO)以及近端策略优化(PPO)问题。SPU的实现比TRPO更简单。在样本效率方面,我们的广泛实验表明,SPU在Mujoco模拟机器人任务中优于TRPO,在Atari视频游戏任务中优于PPO。

英文摘要

We propose a new sample-efficient methodology, called Supervised Policy Update (SPU), for deep reinforcement learning. Starting with data generated by the current policy, SPU formulates and solves a constrained optimization problem in the non-parameterized proximal policy space. Using supervised regression, it then converts the optimal non-parameterized policy to a parameterized policy, from which it draws new samples. The methodology is general in that it applies to both discrete and continuous action spaces, and can handle a wide variety of proximity constraints for the non-parameterized optimization problem. We show how the Natural Policy Gradient and Trust Region Policy Optimization (NPG/TRPO) problems, and the Proximal Policy Optimization (PPO) problem can be addressed by this methodology. The SPU implementation is much simpler than TRPO. In terms of sample efficiency, our extensive experiments show SPU outperforms TRPO in Mujoco simulated robotic tasks and outperforms PPO in Atari video game tasks.

1604.01828 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Differential TD Learning for Value Function Approximation

用于价值函数近似的微分时序差分学习

Adithya M. Devraj, Sean P. Meyn

发表机构 * Department of Electrical and Computer Engg. at the University of Florida(佛罗里达大学电气与计算机工程系)

AI总结 本文提出了一种微分时序差分（TD）学习方法，用于解决传统TD学习在折扣成本设置中方差发散、在平均成本设置中无偏算法仅在特殊情况下存在的问题；该方法基于价值函数梯度的一种表示来设计算法，适用于欧几里得空间上具有光滑动态的马尔可夫模型，并在数值示例中显著提升了性能。

详情
AI中文摘要

价值函数既是算法的组成部分，也是统计与工程应用中的性能度量。除少数特殊情况外，求解相关的Bellman方程在数值上都具有挑战性。一种流行的近似技术是时序差分（TD）学习。本文介绍的算法旨在解决该方法的两个已知问题：其一，在折扣成本设置中，当折扣因子趋近于1时，算法的方差发散。其二，在平均成本设置中，只有在特殊情况下才存在无偏算法。本文证明，这些价值函数中任何一个的梯度都具有一种便于算法设计的表示。基于此结果，得到了适用于欧几里得空间上具有光滑动态的马尔可夫模型的新型微分TD方法。数值示例显示了显著的性能改进。在应用于速度缩放（speed scaling）时，方差降低了两个数量级。

英文摘要

Value functions arise as a component of algorithms as well as performance metrics in statistics and engineering applications. Computation of the associated Bellman equations is numerically challenging in all but a few special cases. A popular approximation technique is known as Temporal Difference (TD) learning. The algorithm introduced in this paper is intended to resolve two well-known problems with this approach: In the discounted-cost setting, the variance of the algorithm diverges as the discount factor approaches unity. Second, for the average cost setting, unbiased algorithms exist only in special cases. It is shown that the gradient of any of these value functions admits a representation that lends itself to algorithm design. Based on this result, the new differential TD method is obtained for Markovian models on Euclidean space with smooth dynamics. Numerical examples show remarkable improvements in performance. In application to speed scaling, variance is reduced by two orders of magnitude.
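
As a point of reference for the baseline the abstract builds on, the following is a minimal sketch of tabular TD(0) in the discounted-cost setting. It assumes a deterministic two-state chain so that the result is reproducible; the function name `td0`, the chain, the costs and the step size are illustrative assumptions. It is standard TD learning, not the differential TD method introduced in the paper.

```python
def td0(next_state, cost, gamma, alpha, num_steps, start=0):
    """Tabular TD(0) along one trajectory of a deterministic Markov chain."""
    values = [0.0] * len(next_state)
    s = start
    for _ in range(num_steps):
        s_next = next_state[s]
        # Temporal difference: one-step Bellman residual at the current state.
        td_error = cost[s] + gamma * values[s_next] - values[s]
        values[s] += alpha * td_error
        s = s_next
    return values

# Two states that alternate; only state 0 incurs a cost.
values = td0(next_state=[1, 0], cost=[1.0, 0.0],
             gamma=0.9, alpha=0.1, num_steps=10000)
```

For this chain the Bellman equations give V(0) = 1 / (1 - gamma^2) and V(1) = gamma / (1 - gamma^2), so both values grow without bound as the discount factor approaches one, which is the regime where the abstract reports diverging variance for noisy problems.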

1805.03117 2026-06-04 astro-ph.CO cs.LG cs.NA math.NA 版本更新

Local, algebraic simplifications of Gaussian random fields

局部的代数简化方法用于高斯随机场

Theodor Bjorkmo, M. C. David Marsh

发表机构 * Department of Applied Mathematics and Theoretical Physics, University of Cambridge(应用数学与理论物理系,剑桥大学)

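AI summary This paper presents a local, algebraic simplification for evaluating the probability density function of Gaussian random fields that avoids the cost of inverting the covariance matrix, and demonstrates applications to generating many-field potential energy landscapes and to machine learning.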

Comments 15 pages, 2 figures

详情
AI中文摘要

许多高斯随机场和高斯随机过程的应用受到计算复杂性的限制,这涉及求逆相关协方差矩阵。在本工作中,我们展示了如何完全绕过这一问题,用于高斯随机场的局部泰勒系数,其协方差函数为高斯(或平方指数)形式。我们的结果适用于任意维度的场和任意阶的泰勒展开。我们给出了两个应用:首先,我们证明该方法可以用于显式生成具有许多场的非平凡势能景观,这在关注局部特殊点(例如极值)时特别有用,如早期宇宙中的`manyfield'膨胀问题。其次,我们证明该方法在机器学习中有应用,大大简化了确定协方差函数超参数的回归问题,给定由单点局部泰勒系数组成的训练数据集。一个配套的Mathematica笔记本可在https://doi.org/10.17863/CAM.22859获取。

英文摘要

Many applications of Gaussian random fields and Gaussian random processes are limited by the computational complexity of evaluating the probability density function, which involves inverting the relevant covariance matrix. In this work, we show how that problem can be completely circumvented for the local Taylor coefficients of a Gaussian random field with a Gaussian (or 'square exponential') covariance function. Our results hold for any dimension of the field and to any order in the Taylor expansion. We present two applications. First, we show that this method can be used to explicitly generate non-trivial potential energy landscapes with many fields. This application is particularly useful when one is concerned with the field locally around special points (e.g. maxima or minima), as we exemplify by the problem of cosmic 'manyfield' inflation in the early universe. Second, we show that this method has applications in machine learning, and greatly simplifies the regression problem of determining the hyperparameters of the covariance function given a training data set consisting of local Taylor coefficients at a single point. An accompanying Mathematica notebook is available at https://doi.org/10.17863/CAM.22859 .

1812.08723 2026-06-04 cs.DS cs.LG cs.NA eess.SP math.NA 版本更新

A Universal Sampling Method for Reconstructing Signals with Simple Fourier Transforms

一种适用于使用简单傅里叶变换重建信号的通用采样方法

Haim Avron, Michael Kapralov, Cameron Musco, Christopher Musco, Ameya Velingker, Amir Zandieh

发表机构 * Tel Aviv University(特拉维夫大学) EPFL(瑞士联邦理工学院) Microsoft Research(微软研究院) Princeton University(普林斯顿大学) Google Research(谷歌研究院)

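AI summary This paper proposes a universal sampling method for reconstructing continuous signals from a small number of discrete samples, based on constraints on the signal's Fourier structure, and shows its effectiveness for tasks such as multiband signal reconstruction and Gaussian process regression.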

详情
AI中文摘要

从少量离散样本重建连续信号是科学和工程中的基本问题。在实践中,我们通常感兴趣的信号具有'简单'的傅里叶结构,如带限、多带和傅里叶稀疏信号。更广泛地说,任何关于信号傅里叶功率谱的先验知识都可以限制其复杂性。直觉上,具有更受约束的傅里叶结构的信号需要更少的样本来重建。我们通过证明,给定类别的连续信号可以使用与该类允许功率谱的统计维度成比例的样本数近似重建。进一步地,在几乎所有情况下,这种自然度量紧密刻画了信号重建的样本复杂性。令人惊讶的是,我们还展示了,除了对数因子外,一种通用非均匀采样策略可以实现任何信号类别的最优复杂性。我们提出了一个简单且高效的算法,用于从采样中恢复信号。对于带限和稀疏信号,我们的方法达到了最先进的水平。同时,它为包括多带信号重建和一维kriging和高斯过程回归任务在内的广泛问题提供了第一个计算和样本效率的解决方案。我们的工作基于随机线性代数与具有受约束傅里叶结构的信号重建之间的新联系。我们扩展了基于统计杠杆得分采样和列基矩阵重建的工具到连续线性算子的近似,这些算子出现在信号重建中。我们相信这些扩展具有独立的兴趣,并为使用随机方法解决广泛的时间连续问题奠定了基础。

英文摘要

Reconstructing continuous signals from a small number of discrete samples is a fundamental problem across science and engineering. In practice, we are often interested in signals with 'simple' Fourier structure, such as bandlimited, multiband, and Fourier sparse signals. More broadly, any prior knowledge about a signal's Fourier power spectrum can constrain its complexity. Intuitively, signals with more highly constrained Fourier structure require fewer samples to reconstruct. We formalize this intuition by showing that, roughly, a continuous signal from a given class can be approximately reconstructed using a number of samples proportional to the *statistical dimension* of the allowed power spectrum of that class. Further, in nearly all settings, this natural measure tightly characterizes the sample complexity of signal reconstruction. Surprisingly, we also show that, up to logarithmic factors, a universal non-uniform sampling strategy can achieve this optimal complexity for *any class of signals*. We present a simple and efficient algorithm for recovering a signal from the samples taken. For bandlimited and sparse signals, our method matches the state-of-the-art. At the same time, it gives the first computationally and sample efficient solution to a broad range of problems, including multiband signal reconstruction and kriging and Gaussian process regression tasks in one dimension. Our work is based on a novel connection between randomized linear algebra and signal reconstruction with constrained Fourier structure. We extend tools based on statistical leverage score sampling and column-based matrix reconstruction to the approximation of continuous linear operators that arise in signal reconstruction. We believe that these extensions are of independent interest and serve as a foundation for tackling a broad range of continuous time problems using randomized methods.
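
The abstract above measures sample complexity by a statistical dimension and samples by statistical leverage scores. A finite-dimensional analogue can be sketched as follows, assuming a symmetric positive semidefinite kernel matrix `K` and a regularization level `lam`. The paper itself works with continuous operators, so the matrix setting, the function names, and the example kernel are assumptions made for illustration.

```python
import numpy as np


def ridge_leverage_scores(K, lam):
    """Diagonal of K (K + lam*I)^{-1} for a symmetric PSD matrix K and lam > 0."""
    n = K.shape[0]
    # (K + lam*I)^{-1} K has the same diagonal because the two factors commute.
    return np.diag(np.linalg.solve(K + lam * np.eye(n), K))


def statistical_dimension(K, lam):
    """s_lam(K) = trace(K (K + lam*I)^{-1}) = sum_i mu_i / (mu_i + lam)."""
    return float(np.sum(ridge_leverage_scores(K, lam)))


# Hypothetical example: a Gaussian-kernel matrix on 50 points in [0, 1].
t = np.linspace(0.0, 1.0, 50)
K = np.exp(-((t[:, None] - t[None, :]) ** 2) / 0.1)
print(statistical_dimension(K, lam=1e-3))
```

In this matrix setting the leverage scores sum to the statistical dimension, so sampling indices with probability proportional to their scores concentrates samples where the kernel has the most effective degrees of freedom.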

1812.02588 2026-06-04 eess.SP cs.LG cs.SY eess.SY math.OC 版本更新

q-LMF: Quantum Calculus-based Least Mean Fourth Algorithm

q-LMF:基于量子微积分的最小四次均值算法

Alishba Sadiq, Muhammad Usman, Shujaat Khan, Imran Naseem, Muhammad Moinuddin, Ubaid M. Al-Saggaf

发表机构 * College of Engineering, Karachi Institute of Economics and Technology(卡拉奇经济科技学院工程学院) Faculty of Engineering Science and Technology (FEST), Iqra University(伊克拉大学工程科学与技术学院) School of Electrical, Electronic and Computer Engineering, The University of Western Australia(西澳大学电气、电子与计算机工程学院) Center of Excellence in Intelligent Engineering Systems (CEIES), King Abdulaziz University(国王阿卜杜勒阿齐兹大学智能工程系统卓越中心) Electrical and Computer Engineering Department, King Abdulaziz University(国王阿卜杜勒阿齐兹大学电气与计算机工程系)

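AI summary This paper proposes a quantum-calculus-based least mean fourth algorithm (q-LMF) for channel estimation in non-Gaussian noise environments. By introducing error-correlation energy and signal normalization, it improves convergence speed, stability, and steady-state error, and allows more freedom in choosing large step sizes than the conventional LMF algorithm.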

详情
AI中文摘要

信道估计是现代通信系统中的关键部分,因为它能提高系统的整体性能。在最近的研究中,已经设计了多种自适应学习方法以增强学习过程的鲁棒性和收敛速度。然而,仍然需要一种最优技术。本文针对非高斯噪声环境,提出了一种新的随机梯度算法用于信道识别。所提出的q-最小四次均值(q-LMF)是最小四次均值(LMF)算法的扩展,基于量子微积分(也称为Jackson导数)。所提出的算法利用了新的误差相关能量概念和信号归一化技术,以确保高收敛速率、更好的稳定性和低稳态误差。与传统LMF不同,所提出的方法在大步长情况下具有更大的自由度。广泛的实验表明,所提出的q-LMF算法在性能上相比现有技术有显著提升。

英文摘要

Channel estimation is an essential part of modern communication systems as it enhances the overall performance of the system. In the recent past, a variety of adaptive learning methods have been designed to enhance the robustness and convergence speed of the learning process. However, the need for an optimal technique remains. Herein, for non-Gaussian noisy environments, we propose a new class of stochastic gradient algorithm for channel identification. The proposed $q$-least mean fourth ($q$-LMF) is an extension of the least mean fourth (LMF) algorithm and is based on the $q$-calculus, which is also known as the Jackson derivative. The proposed algorithm utilizes a novel concept of error-correlation energy and normalization of the signal to ensure a high convergence rate, better stability, and low steady-state error. Contrary to the conventional LMF, the proposed method has more freedom for large step-sizes. Extensive experiments show a significant gain in the performance of the proposed $q$-LMF algorithm in comparison to contemporary techniques.
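
To make the Jackson derivative mentioned above concrete, the sketch below evaluates it for a fourth-power cost. Only the textbook definition of the $q$-derivative is shown; the paper's actual $q$-LMF update rule is not reproduced, and the function names are illustrative assumptions.

```python
def jackson_derivative(f, x, q):
    """Jackson q-derivative: D_q f(x) = (f(q*x) - f(x)) / ((q - 1) * x)."""
    if x == 0 or q == 1:
        raise ValueError("this sketch requires x != 0 and q != 1")
    return (f(q * x) - f(x)) / ((q - 1) * x)


def fourth_power(e):
    # The LMF family uses a fourth-power error cost; this toy cost is illustrative.
    return e ** 4


# For f(e) = e^4 the q-derivative equals (1 + q + q^2 + q^3) * e^3,
# which tends to the ordinary derivative 4 * e^3 as q -> 1.
e = 2.0
print(jackson_derivative(fourth_power, e, q=1.5))
print(jackson_derivative(fourth_power, e, q=1.0 + 1e-6), 4 * e ** 3)
```

For q > 1 the factor (1 + q + q^2 + q^3) exceeds 4, so the q-derivative of this cost is larger in magnitude than the ordinary derivative at the same point. That gives q the role of an extra scaling knob next to the step size.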

1812.07810 2026-06-04 cs.LG cs.CR cs.NA math.NA stat.ML 版本更新

Fast Botnet Detection From Streaming Logs Using Online Lanczos Method

从流日志中快速检测僵尸网络的在线兰茨斯方法

Zheng Chen, Xinli Yu, Chi Zhang, Jin Zhang, Cui Lin, Bo Song, Jianliang Gao, Xiaohua Hu, Wei-Shih Yang, Erjia Yan

发表机构 * CA Technologies, Inc.(CA Technologies公司) College of Computing & Informatics, Drexel University(德雷塞尔大学计算与信息学院) Department of Mathematics, Temple University(Temple大学数学系) Department of Computer Science, Maryland University at Baltimore County(马里兰大学巴尔的摩县计算机科学系)

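AI summary This paper proposes a botnet detection method based on the online Lanczos method. It reduces the time complexity of PCA-based detection from cubic to sub-cubic, improving the accuracy and sensitivity of real-time detection, and contributes a generalized online correlation matrix update formula and a new termination condition.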

详情
AI中文摘要

僵尸网络,作为一种协调的机器人网络,已成为恶意互联网活动的主要平台,如DDOS攻击、点击欺诈、网络爬虫、垃圾/谣言传播等。本文专注于设计和实验一种新的从流Web服务器日志中检测僵尸网络的方法,受到其广泛适用性、实时保护能力、易用性和敏感数据更安全的启发。我们的算法受到主成分分析(PCA)的启发,以捕捉数据中的相关性,我们首次将兰茨斯方法应用于改进基于PCA的僵尸网络检测的时间复杂度,从立方到亚立方,这使我们能够更准确和灵敏地检测滑动时间窗口中的僵尸网络,而不是固定时间窗口。我们贡献了一个通用的在线相关矩阵更新公式,以及基于误差界和对称矩阵非递减特征值的新终止条件。在电子商务网站日志数据集上,实验表明兰茨斯方法在不同时间窗口下的时间成本始终仅为PCA的20%至25%。

英文摘要

A botnet, a group of coordinated bots, is becoming the main platform for malicious Internet activities such as DDoS, click fraud, web scraping, spam/rumor distribution, etc. This paper focuses on the design of, and experiments with, a new approach for botnet detection from streaming web server logs, motivated by its wide applicability, real-time protection capability, ease of use and better security of sensitive data. Our algorithm is inspired by Principal Component Analysis (PCA) to capture correlation in data, and we are the first to recognize and adapt the Lanczos method to improve the time complexity of PCA-based botnet detection from cubic to sub-cubic, which enables us to more accurately and sensitively detect botnets with sliding time windows rather than fixed time windows. We contribute a generalized online correlation matrix update formula, and a new termination condition for the Lanczos iteration based on an error bound and the non-decreasing eigenvalues of symmetric matrices. On our dataset of e-commerce website logs, experiments show that the time cost of the Lanczos method with different time windows is consistently only 20% to 25% of that of PCA.
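
The following is a minimal sketch of the textbook Lanczos iteration for a symmetric matrix, here applied to a correlation matrix. It is not the paper's online variant and does not implement its termination condition; the function name, the random test data, and the use of full reorthogonalization are assumptions chosen for clarity.

```python
import numpy as np


def lanczos_ritz_values(A, k, seed=0):
    """Run up to k Lanczos steps on symmetric A and return the Ritz values,
    i.e. the eigenvalues of the tridiagonal matrix T built from the recurrence."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    q = rng.standard_normal(n)
    q /= np.linalg.norm(q)
    Q = np.zeros((n, k))
    alphas, betas = [], []
    q_prev, beta = np.zeros(n), 0.0
    for j in range(k):
        Q[:, j] = q
        w = A @ q - beta * q_prev
        alpha = float(q @ w)
        w = w - alpha * q
        # Full reorthogonalization against all Lanczos vectors computed so far.
        w = w - Q[:, : j + 1] @ (Q[:, : j + 1].T @ w)
        alphas.append(alpha)
        beta = float(np.linalg.norm(w))
        if j == k - 1 or beta < 1e-12:  # last step, or invariant subspace found
            break
        betas.append(beta)
        q_prev, q = q, w / beta
    T = np.diag(alphas) + np.diag(betas, 1) + np.diag(betas, -1)
    return np.linalg.eigvalsh(T)


# Hypothetical data: correlation matrix of 40 features over 500 observations.
rng = np.random.default_rng(1)
X = rng.standard_normal((500, 40))
A = np.corrcoef(X, rowvar=False)
ritz = lanczos_ritz_values(A, k=10)
print(ritz[-1], np.linalg.eigvalsh(A)[-1])
```

Each step costs one matrix-vector product plus the reorthogonalization, so a few steps are much cheaper than a full eigendecomposition when only the leading eigenvalues are needed.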

1812.07410 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

An Improved Deep Belief Network Model for Road Safety Analyses

一种改进的深度信念网络模型用于道路安全分析

Guangyuan Pan, Liping Fu, Lalita Thakali, Matthew Muresan, Ming Yu

发表机构 * Intelligent Transportation Systems Research Center(智能交通系统研究中心) Wuhan University of Technology(武汉理工大学) University of Waterloo(滑铁卢大学) Department of Civil & Environmental Engineering(土木与环境工程系) Department of Electrical & Computer Engineering(电气与计算机工程系)

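AI summary This paper proposes an improved deep belief network model to strengthen crash prediction in road safety analyses, demonstrates its predictive performance in two case studies, and compares it with traditional models.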

详情
Journal ref
Transportation Research Board 97th Annual Meeting, 2018
AI中文摘要

碰撞预测是道路安全分析中的关键组成部分。广泛采用的碰撞预测方法是应用基于回归的技术。底层的校准过程通常耗时较长,需要大量的领域知识和专业知识,无法轻易自动化。本文介绍了一种新的机器学习(ML)方法作为传统技术的替代方案。所提出的ML模型称为正则化深度信念网络,是一种具有两个训练步骤的深度神经网络:首先使用无监督学习算法进行训练,然后通过用第一步训练得到的权重初始化贝叶斯神经网络进行微调。所得模型预计具有改进的预测能力和减少对耗时人工干预的需求。在本文中,我们试图通过两个案例研究展示这种新模型在碰撞预测中的潜力,包括来自加拿大安大略省800公里高速公路401号和其他高速公路的碰撞数据集。我们的目的是展示该ML方法与其他传统模型(包括负二项(NB)模型、核回归(KR)和贝叶斯神经网络(贝叶斯NN))的性能比较。我们还试图解决其他相关问题,如训练数据大小和训练参数的影响。

英文摘要

Crash prediction is a critical component of road safety analyses. A widely adopted approach to crash prediction is the application of regression-based techniques. The underlying calibration process is often time-consuming, requires significant domain knowledge and expertise, and cannot be easily automated. This paper introduces a new machine learning (ML) based approach as an alternative to the traditional techniques. The proposed ML model is called a regularized deep belief network, which is a deep neural network with two training steps: it is first trained using an unsupervised learning algorithm and then fine-tuned by initializing a Bayesian neural network with the trained weights from the first step. The resulting model is expected to have improved prediction power and a reduced need for time-consuming human intervention. In this paper, we attempt to demonstrate the potential of this new model for crash prediction through two case studies including a collision data set from an 800 km stretch of Highway 401 and other highways in Ontario, Canada. Our intention is to show the performance of this ML approach in comparison to various traditional models including the negative binomial (NB) model, kernel regression (KR), and the Bayesian neural network (Bayesian NN). We also attempt to address other related issues such as the effect of training data size and training parameters.

1703.00978 2026-06-04 eess.SY cs.LG cs.SE cs.SY 版本更新

Compositional Falsification of Cyber-Physical Systems with Machine Learning Components

包含机器学习组件的网络物理系统组合性验证

Tommaso Dreossi, Alexandre Donzé, Sanjit A. Seshia

发表机构 * University of California, Berkeley(加州大学伯克利分校) Decyphir, Inc.(Decyphir公司)

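AI summary This paper studies the correctness of cyber-physical systems (CPS) with machine learning components and proposes a compositional falsification framework in which a temporal logic falsifier and a machine learning analyzer cooperate to find executions that violate the specification.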

详情
AI中文摘要

网络物理系统(CPS),如汽车系统,开始包含复杂的机器学习(ML)组件。因此,其正确性依赖于内部ML模块的属性。虽然学习算法旨在从示例中泛化,但它们的性能仅取决于提供的示例,最近的努力已显示它们在小对抗扰动下会产生不一致的输出。这引发了问题:学习组件的输出是否会导致整个CPS的失效?在本文中,我们通过将此问题建模为具有ML组件的CPS的时间逻辑(STL)规范的验证问题来解决此问题。我们提出了一种组合性验证框架,其中时间逻辑验证器和机器学习分析器合作,旨在找到所考虑模型的违反执行。所提出技术的有效性通过带有基于深度神经网络的感知组件的自动紧急制动系统模型得到展示。

英文摘要

Cyber-physical systems (CPS), such as automotive systems, are starting to include sophisticated machine learning (ML) components. Their correctness, therefore, depends on properties of the inner ML modules. While learning algorithms aim to generalize from examples, they are only as good as the examples provided, and recent efforts have shown that they can produce inconsistent output under small adversarial perturbations. This raises the question: can the output from learning components lead to a failure of the entire CPS? In this work, we address this question by formulating it as a problem of falsifying signal temporal logic (STL) specifications for CPS with ML components. We propose a compositional falsification framework where a temporal logic falsifier and a machine learning analyzer cooperate with the aim of finding falsifying executions of the considered model. The efficacy of the proposed technique is shown on an automatic emergency braking system model with a perception component based on deep neural networks.

1509.09236 2026-06-04 cs.LG cs.CC cs.NA math.NA math.OC 版本更新

On the Complexity of Robust PCA and $\ell_1$-norm Low-Rank Matrix Approximation

关于鲁棒PCA和ℓ1-范数低秩矩阵逼近的复杂性

Nicolas Gillis, Stephen A. Vavasis

发表机构 * Department of Mathematics and Operational Research, University of Mons(蒙斯大学数学与运筹学系) Department of Combinatorics and Optimization, University of Waterloo(滑铁卢大学组合学与优化系)

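AI summary This paper proves that low-rank matrix approximation in the component-wise ℓ1-norm (ℓ1-LRA) is NP-hard already in the rank-one case, and connects it to several other NP-hard problems, including robust PCA, ℓ0-LRA, and binary matrix factorization.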

Comments 16 pages, some typos corrected

详情
Journal ref
Mathematics of Operations Research 43 (4), pp. 1072-1084, 2018
AI中文摘要

基于组件wise的ℓ1-范数(ℓ1-LRA)的低秩矩阵逼近问题,与鲁棒主成分分析(PCA)密切相关,已成为数据挖掘和机器学习中的非常流行工具。鲁棒PCA旨在恢复被稀疏噪声扰动的低秩矩阵,例如在前景-背景视频分离中的应用。尽管ℓ1-LRA被强烈认为是NP难的,但到目前为止,尚无正式证明。在本文中,我们通过将问题归约到MAX CUT,证明了ℓ1-LRA在秩为1的情况下是NP难的。我们的推导揭示了ℓ1-LRA与几个其他已知NP难问题之间的有趣联系,包括鲁棒PCA、ℓ0-LRA、二元矩阵分解、特定的稠密二分子图问题、{-1,+1}矩阵的切范数计算,以及离散基底问题。

英文摘要

The low-rank matrix approximation problem with respect to the component-wise $\ell_1$-norm ($\ell_1$-LRA), which is closely related to robust principal component analysis (PCA), has become a very popular tool in data mining and machine learning. Robust PCA aims at recovering a low-rank matrix that was perturbed with sparse noise, with applications for example in foreground-background video separation. Although $\ell_1$-LRA is strongly believed to be NP-hard, there is, to the best of our knowledge, no formal proof of this fact. In this paper, we prove that $\ell_1$-LRA is NP-hard, already in the rank-one case, using a reduction from MAX CUT. Our derivations draw interesting connections between $\ell_1$-LRA and several other well-known problems, namely, robust PCA, $\ell_0$-LRA, binary matrix factorization, a particular densest bipartite subgraph problem, the computation of the cut norm of $\{-1,+1\}$ matrices, and the discrete basis problem, which we all prove to be NP-hard.

1711.06586 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Cautious NMPC with Gaussian Process Dynamics for Autonomous Miniature Race Cars

谨慎的非线性模型预测控制用于自动驾驶微型赛车

Lukas Hewing, Alexander Liniger, Melanie N. Zeilinger

发表机构 * Institute for Dynamic Systems and Control, ETH Zurich(动态系统与控制研究所,苏黎世联邦理工学院) Institute for Automatic Control, ETH Zurich(自动控制研究所,苏黎世联邦理工学院)

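AI summary This paper presents an adaptive high-performance control method that uses a Gaussian process to improve the dynamics model of autonomous miniature race cars, increasing racing performance while maintaining safety.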

详情
Journal ref
2018 European Control Conference (ECC), Limassol, 2018, pp. 1341-1348
AI中文摘要

本文提出了一种自适应高性能控制方法,用于自动驾驶微型赛车。赛车动力学从原理上建模 notoriously 非常困难,本文通过一种谨慎的非线性模型预测控制(NMPC)方法来解决,该方法通过数据学习来改进其动力学模型并安全地提高赛车性能。该方法利用高斯过程(GP)并通过机会约束形式考虑残差模型不确定性。我们提出了一个稀疏GP近似方法,具有动态调整的诱导输入,从而实现可实时实施的控制器。该方法在模拟中得到了验证,显示了与无模型学习的NMPC相比,在圈速和约束满足方面有显著的改进。

英文摘要

This paper presents an adaptive high performance control method for autonomous miniature race cars. Racing dynamics are notoriously hard to model from first principles, which is addressed by means of a cautious nonlinear model predictive control (NMPC) approach that learns to improve its dynamics model from data and safely increases racing performance. The approach makes use of a Gaussian Process (GP) and takes residual model uncertainty into account through a chance constrained formulation. We present a sparse GP approximation with dynamically adjusting inducing inputs, enabling a real-time implementable controller. The formulation is demonstrated in simulations, which show significant improvement with respect to both lap time and constraint satisfaction compared to an NMPC without model learning.

1812.06243 2026-06-04 cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

Algorithmic Theory of ODEs and Sampling from Well-conditioned Logconcave Densities

ODEs的算法理论与从良好条件的logconcave密度采样

Yin Tat Lee, Zhao Song, Santosh S. Vempala

发表机构 * University of Washington & Microsoft Research(华盛顿大学与微软研究院) UT-Austin & University of Washington(德克萨斯大学奥斯汀分校与华盛顿大学) Georgia Tech(佐治亚理工学院)

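AI summary This paper gives a general algorithm for solving multivariate ordinary differential equations whose solutions are close to the span of a known basis of functions, yielding a nearly linear-time implementation of HMC sampling that covers the widely used logistic regression loss.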

详情
AI中文摘要

从统计学和机器学习中出现的logconcave函数采样已成为研究热点。最近的发展包括对 Langevin 动力学和 Hamiltonian Monte Carlo (HMC) 的分析。虽然这两种方法在足够强的光滑条件下对连续过程有维度无关的界,但所得到的离散算法的复杂度和函数评估次数随维度增长。受此问题启发,本文提出了一种通用算法,用于求解解接近已知基函数张成的多元微分方程。所得到的算法具有多项对数深度和几乎紧致的运行时间——几乎与解的表示大小成线性关系。我们将此应用于采样问题,以获得一个几乎线性的HMC实现,适用于广泛使用的logistic回归损失函数,其迭代次数(并行深度)和梯度评估次数为维度的多项对数(而非之前的多项式)。该类包括最近广泛研究的用于logistic回归的损失函数,其权重矩阵不相干。我们还给出了一个更快的算法,具有多项对数深度,适用于更一般和标准的强凸函数类,其梯度具有Lipschitz连续性。这些结果基于(1)对精确HMC过程的改进收缩界,以及(2)在实现HMC时出现的微分方程解的多项式近似次数的对数界。

英文摘要

Sampling logconcave functions arising in statistics and machine learning has been a subject of intensive study. Recent developments include analyses for Langevin dynamics and Hamiltonian Monte Carlo (HMC). While both approaches have dimension-independent bounds for the underlying $\mathit{continuous}$ processes under sufficiently strong smoothness conditions, the resulting discrete algorithms have complexity and number of function evaluations growing with the dimension. Motivated by this problem, in this paper, we give a general algorithm for solving multivariate ordinary differential equations whose solution is close to the span of a known basis of functions (e.g., polynomials or piecewise polynomials). The resulting algorithm has polylogarithmic depth and essentially tight runtime: it is nearly linear in the size of the representation of the solution. We apply this to the sampling problem to obtain a nearly linear implementation of HMC for a broad class of smooth, strongly logconcave densities, with the number of iterations (parallel depth) and gradient evaluations being $\mathit{polylogarithmic}$ in the dimension (rather than polynomial as in previous work). This class includes the widely-used loss function for logistic regression with incoherent weight matrices and has been the subject of much study recently. We also give a faster algorithm with $\mathit{polylogarithmic~depth}$ for the more general and standard class of strongly convex functions with Lipschitz gradient. These results are based on (1) an improved contraction bound for the exact HMC process and (2) logarithmic bounds on the degree of polynomials that approximate solutions of the differential equations arising in implementing HMC.
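
For readers unfamiliar with how HMC discretizes its differential equation, here is a minimal sketch of the standard leapfrog integrator on a one-dimensional standard normal target. It is the common baseline discretization, not the basis-function ODE solver described in the abstract; the target density, step size, and function names are illustrative assumptions.

```python
def leapfrog(x, p, grad_U, eps, n_steps):
    """Leapfrog integration of Hamiltonian dynamics with H(x, p) = U(x) + p^2 / 2."""
    p = p - 0.5 * eps * grad_U(x)  # initial half step for the momentum
    for i in range(n_steps):
        x = x + eps * p  # full step for the position
        # Full momentum step, except for a half step at the very end.
        scale = eps if i < n_steps - 1 else 0.5 * eps
        p = p - scale * grad_U(x)
    return x, p


def U(x):
    # Negative log-density of a standard normal, up to an additive constant.
    return 0.5 * x * x


def grad_U(x):
    return x


def hamiltonian(x, p):
    return U(x) + 0.5 * p * p


x0, p0 = 1.0, 0.5
x1, p1 = leapfrog(x0, p0, grad_U, eps=0.1, n_steps=10)
print(hamiltonian(x0, p0), hamiltonian(x1, p1))
```

The integrator nearly conserves the Hamiltonian and is time-reversible, which is what lets HMC accept long trajectories. Each step costs one gradient evaluation, so the number of steps needed is what drives the gradient-evaluation counts discussed in the abstract.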

1812.05298 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Cyber-Physical Security and Safety of Autonomous Connected Vehicles: Optimal Control Meets Multi-Armed Bandit Learning

自动驾驶联网车辆的网络安全与安全性:最优控制与多臂老虎机学习

Aidin Ferdowsi, Samad Ali, Walid Saad, Narayan B. Mandayam

发表机构 * Centre for Wireless Communications (CWC), University of Oulu, Finland(奥卢大学无线通信中心(CWC)) WINLAB, Dept. of ECE, Rutgers University, New Brunswick, NJ, USA(罗格斯大学WINLAB,电子与计算机工程系,新泽西州新不伦瑞克,美国)

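AI summary This paper proposes a comprehensive framework for thwarting cyber and physical attacks on networks of autonomous connected vehicles. First, an optimal safe controller is derived that maximizes street traffic flow and minimizes accident risk by optimizing vehicle speed and inter-vehicle spacing. Second, data injection attack detection approaches are proposed to address cyber attacks on sensors and their impact on the vehicle system.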

Comments 30 pages, 11 figures

详情
AI中文摘要

自动驾驶联网车辆(ACVs)依赖于车载传感器如摄像头和雷达以及车对车通信来有效运行。这种对网络组件的依赖使ACVs容易受到网络和物理攻击,其中攻击者可以操纵传感器读数并物理上控制ACV。本文提出了一种综合框架,以防止ACV网络中的网络和物理攻击。首先,推导出一个最优安全控制器,通过优化ACV速度和车辆间间距,最大化街道交通流量并最小化事故风险。证明所提出的控制器对旨在使ACV系统不稳定的身体攻击具有鲁棒性。为了提高ACV系统的网络-物理安全性,接下来提出了数据注入攻击(DIA)检测方法,以应对传感器的网络攻击及其对ACV系统的影响。为了全面设计DIA检测方法,将ACV传感器分为两个子集,基于其数据的先验信息可用性。对于具有先验信息的传感器,提出DIA检测方法,并推导出实际和估计值之间的差异的最优阈值水平,使ACV能够抵御网络攻击。对于没有先验信息的传感器,提出了一种新的多臂老虎机(MAB)算法,以使ACV能够安全地控制其运动。仿真结果表明,所提出的最优安全控制器在最大化对物理攻击的鲁棒性方面优于当前最先进的控制器。结果还显示,所提出的DIA检测方法相比卡尔曼滤波,可以提高ACV传感器对网络攻击的安全性,并最终提高ACV系统的物理鲁棒性。

英文摘要

Autonomous connected vehicles (ACVs) rely on intra-vehicle sensors such as camera and radar as well as inter-vehicle communication to operate effectively. This reliance on cyber components exposes ACVs to cyber and physical attacks in which an adversary can manipulate sensor readings and physically take control of an ACV. In this paper, a comprehensive framework is proposed to thwart cyber and physical attacks on ACV networks. First, an optimal safe controller for ACVs is derived to maximize the street traffic flow while minimizing the risk of accidents by optimizing ACV speed and inter-ACV spacing. It is proven that the proposed controller is robust to physical attacks which aim at making ACV systems unstable. To improve the cyber-physical security of ACV systems, next, data injection attack (DIA) detection approaches are proposed to address cyber attacks on sensors and their physical impact on the ACV system. To comprehensively design the DIA detection approaches, ACV sensors are characterized in two subsets based on the availability of a priori information about their data. For sensors having a priori information, a DIA detection approach is proposed and an optimal threshold level is derived for the difference between the actual and estimated values of sensor data, which enables the ACV to stay robust against cyber attacks. For sensors having no prior information, a novel multi-armed bandit (MAB) algorithm is proposed to enable the ACV to securely control its motion. Simulation results show that the proposed optimal safe controller outperforms current state-of-the-art controllers by maximizing the robustness of ACVs to physical attacks. The results also show that the proposed DIA detection approaches, compared to Kalman filtering, can improve the security of ACV sensors against cyber attacks and ultimately improve the physical robustness of an ACV system.

1808.04580 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

NFFT meets Krylov methods: Fast matrix-vector products for the graph Laplacian of fully connected networks

NFFT与Krylov方法的结合:用于完全连接网络图拉普拉斯算子的快速矩阵-向量乘法

Dominik Alfke, Daniel Potts, Martin Stoll, Toni Volkmer

发表机构 * Technische Universität Chemnitz, Faculty of Mathematics, Chair of Scientific Computing(开姆尼茨工业大学,数学系,科学计算教研室) Technische Universität Chemnitz, Faculty of Mathematics, Chair of Applied Functional Analysis(开姆尼茨工业大学,数学系,应用泛函分析教研室) Technische Universität Chemnitz, Faculty of Mathematics, Chair of Applied Analysis(开姆尼茨工业大学,数学系,应用分析教研室)

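AI summary This paper proposes using the NFFT for fast matrix-vector products with the graph Laplacian of fully connected networks, demonstrates applications to image segmentation and semi-supervised learning, and compares the approach with the Nyström method.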

Comments 28 pages, 9 figures

详情
AI中文摘要

图拉普拉斯算子是数据科学、机器学习和图像处理中的标准工具。对应的矩阵继承了底层网络的复杂结构,并在某些应用中被密集填充。这使得与图拉普拉斯算子的计算,特别是矩阵-向量乘法,成为一个困难的任务。典型应用是计算其若干特征值和特征向量。标准方法在图中节点数量过大时变得不可行。本文提出利用基于非等间距快速傅里叶变换(NFFT)的快速求和方法,以快速执行图拉普拉斯算子的密集矩阵-向量乘法,而无需形成整个矩阵。NFFT算法的巨大灵活性使我们能够将加速乘法嵌入到基于Lanczos的特征值计算程序或迭代线性系统求解器中,并考虑非标准高斯核。我们通过一系列测试问题展示了该方法的可行性,从图像分割到基于图的PDEs的半监督学习。特别是,我们比较了我们的方法与Nyström方法。此外,我们还提出并测试了改进的、混合版本的Nyström方法,该方法内部使用NFFT。

英文摘要

The graph Laplacian is a standard tool in data science, machine learning, and image processing. The corresponding matrix inherits the complex structure of the underlying network and is in certain applications densely populated. This makes computations with the graph Laplacian, in particular matrix-vector products, a hard task. A typical application is the computation of a number of its eigenvalues and eigenvectors. Standard methods become infeasible when the number of nodes in the graph is too large. We propose the use of the fast summation based on the nonequispaced fast Fourier transform (NFFT) to perform the dense matrix-vector product with the graph Laplacian quickly without ever forming the whole matrix. The enormous flexibility of the NFFT algorithm allows us to embed the accelerated multiplication into Lanczos-based eigenvalue routines or iterative linear system solvers and even to consider kernels other than the standard Gaussian. We illustrate the feasibility of our approach on a number of test problems from image segmentation to semi-supervised learning based on graph-based PDEs. In particular, we compare our approach with the Nyström method. Moreover, we present and test an enhanced, hybrid version of the Nyström method, which internally uses the NFFT.
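
The idea of multiplying by the graph Laplacian "without ever forming the whole matrix" can be sketched with an exact, blockwise matrix-free product. This baseline still costs O(n^2) operations; the NFFT-based fast summation in the paper replaces the kernel sums with a fast approximate evaluation, which is not shown here. The Gaussian weight formula, the unnormalized Laplacian D - W, and all names are assumptions for illustration.

```python
import numpy as np


def laplacian_matvec(points, x, sigma, block=256):
    """Compute (D - W) x for W_ij = exp(-||v_i - v_j||^2 / sigma^2), W_ii = 0,
    processing `block` rows at a time so the n-by-n matrix is never stored."""
    n = points.shape[0]
    y = np.empty(n)
    for start in range(0, n, block):
        stop = min(start + block, n)
        diff = points[start:stop, None, :] - points[None, :, :]
        W_rows = np.exp(-np.sum(diff ** 2, axis=2) / sigma ** 2)
        # Remove self-loops: zero the entries that lie on the global diagonal.
        W_rows[np.arange(stop - start), np.arange(start, stop)] = 0.0
        degree = W_rows.sum(axis=1)
        y[start:stop] = degree * x[start:stop] - W_rows @ x
    return y


# Hypothetical data: 300 random points in the plane.
rng = np.random.default_rng(0)
points = rng.standard_normal((300, 2))
x = rng.standard_normal(300)
y = laplacian_matvec(points, x, sigma=1.0, block=64)
print(y[:3])
```

A function of this shape is all that a Lanczos or conjugate gradient routine needs, so the exact blockwise product can later be swapped for a fast approximate one without changing the solver.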

1711.04178 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

CUR Decompositions, Similarity Matrices, and Subspace Clustering

CUR分解、相似矩阵与子空间聚类

Akram Aldroubi, Keaton Hamm, Ahmet Bugra Koku, Ali Sekmen

AI总结 本文提出了一种利用CUR分解解决子空间聚类问题的通用框架,通过构造相似矩阵实现无噪声情况下的精确聚类,并展示了如何通过CUR分解生成多种相似矩阵以处理噪声数据,同时推导出两种已知的子空间聚类方法。

Comments Approximately 30 pages. Current version contains improved algorithm and numerical experiments from the previous version

详情
AI中文摘要

本文提出了一种利用CUR分解解决子空间聚类问题的通用框架。CUR分解提供了一种自然方法来构造数据来自未知子空间联盟$\mathscr{U}=\underset{i=1}{\overset{M}\bigcup}S_i$的相似矩阵。由此构造的相似矩阵在无噪声情况下能够实现精确聚类。此外,这种分解还能从给定数据集生成多种不同的相似矩阵,从而具有足够的灵活性以对含噪声数据进行准确聚类。我们还展示了两种已知的子空间聚类方法可以从CUR分解中推导出来。本文还提出了一种基于相似矩阵理论构造的算法,并在合成和真实数据上进行了实验以测试该方法。此外,本文还利用了基于CUR的相似矩阵的改进版本,提供了一种启发式算法用于子空间聚类;该算法在Hopkins155运动分割数据集上的聚类性能目前最佳。

英文摘要

A general framework for solving the subspace clustering problem using the CUR decomposition is presented. The CUR decomposition provides a natural way to construct similarity matrices for data that come from a union of unknown subspaces $\mathscr{U}=\underset{i=1}{\overset{M}\bigcup}S_i$. The similarity matrices thus constructed give the exact clustering in the noise-free case. Additionally, this decomposition gives rise to many distinct similarity matrices from a given set of data, which allow enough flexibility to perform accurate clustering of noisy data. We also show that two known methods for subspace clustering can be derived from the CUR decomposition. An algorithm based on the theoretical construction of similarity matrices is presented, and experiments on synthetic and real data are presented to test the method. Additionally, an adaptation of our CUR based similarity matrices is utilized to provide a heuristic algorithm for subspace clustering; this algorithm yields the best overall performance to date for clustering the Hopkins155 motion segmentation dataset.
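
The CUR decomposition at the center of this abstract reconstructs a matrix from a subset of its own columns and rows. The sketch below shows the exact case A = C U^+ R, which holds whenever the intersection block U has the same rank as A. The row and column choices, the random test matrix, and the function name are illustrative assumptions; the paper's similarity-matrix construction and clustering algorithm are not reproduced.

```python
import numpy as np


def cur_reconstruct(A, rows, cols):
    """Return C @ pinv(U) @ R with C = A[:, cols], R = A[rows, :], U = A[rows, cols]."""
    C = A[:, cols]
    R = A[rows, :]
    U = A[np.ix_(rows, cols)]
    return C @ np.linalg.pinv(U) @ R


# Hypothetical rank-3 matrix; three generic rows and columns give rank(U) = 3.
rng = np.random.default_rng(0)
A = rng.standard_normal((8, 3)) @ rng.standard_normal((3, 10))
A_cur = cur_reconstruct(A, rows=[0, 2, 5], cols=[1, 4, 7])
print(np.max(np.abs(A - A_cur)))
```

Because C and R are actual columns and rows of the data, the factors stay interpretable in terms of individual data points, which is what makes the decomposition a natural starting point for building similarity matrices.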

1511.06444 2026-06-04 cs.LG cs.NA math.NA math.PR 版本更新

Universal halting times in optimization and machine learning

优化与机器学习中的通用停止时间

Levent Sagun, Thomas Trogdon, Yann LeCun

发表机构 * Mathematics Department(数学系) Department of Mathematics(数学系) Computer Science Department(计算机科学系) New York University(纽约大学) University of California, Irvine(加州大学欧文分校)

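AI summary By analyzing the distribution of halting times of optimization algorithms on random systems such as spin glasses and deep learning, the study finds that, under certain conditions, the distribution is independent of the underlying distribution, and identifies two classes: Gumbel-like and Gaussian-like.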

详情
Journal ref
Quart. Appl. Math. 76 (2018), 289-301
AI中文摘要

作者展示了优化算法在两个随机系统(自旋玻璃和深度学习)中达到给定精度所需的迭代次数的经验分布。给定一个算法(即优化过程和随机景观的形式),停止时间的波动遵循一种分布,即使在改变景观分布后,经过中心化和标准化后仍保持不变。我们观察到两种定性类别:一种类似于Gumbel分布的分布,出现在谷歌搜索、人类决策时间、QR特征值算法和自旋玻璃中;另一种类似于高斯分布的分布,出现在共轭梯度法、使用MNIST输入数据的深度网络和使用随机输入数据的深度网络中。这种经验证据表明,存在一种分布类,其停止时间在某些条件下与底层分布无关。

英文摘要

The authors present empirical distributions for the halting time (measured by the number of iterations to reach a given accuracy) of optimization algorithms applied to two random systems: spin glasses and deep learning. Given an algorithm, which we take to be both the optimization routine and the form of the random landscape, the fluctuations of the halting time follow a distribution that, after centering and scaling, remains unchanged even when the distribution on the landscape is changed. We observe two qualitative classes: a Gumbel-like distribution that appears in Google searches, human decision times, the QR eigenvalue algorithm and spin glasses, and a Gaussian-like distribution that appears in the conjugate gradient method, deep networks with MNIST input data and deep networks with random input data. This empirical evidence suggests the presence of a class of distributions for which the halting time is independent of the underlying distribution under some conditions.

1802.08678 2026-06-04 eess.SY cs.LG cs.RO cs.SY stat.ML 版本更新

Verifying Controllers Against Adversarial Examples with Bayesian Optimization

通过贝叶斯优化验证控制器对抗示例

Shromona Ghosh, Felix Berkenkamp, Gireeja Ranade, Shaz Qadeer, Ashish Kapoor

发表机构 * Microsoft Research, Redmond(微软研究院,雷德蒙德)

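AI summary This paper proposes an active-testing framework based on Bayesian optimization for verifying controller safety: safety constraints are specified using logic, and the space of behaviors is searched efficiently for adversarial examples that violate the safety specifications.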

Comments Proc. of the IEEE International Conference on Robotics and Automation, 2018

详情
AI中文摘要

最近强化学习的成功促使开发了用于现实世界机器人的复杂控制器。由于这些机器人被部署在安全关键应用中并与人类交互,确保安全性以避免造成伤害变得至关重要。为此方向的一个初步步骤是测试控制器在仿真中的表现。为了做到这一点,我们需要明确安全的定义,然后高效地搜索所有行为空间以确定其安全性。在本文中,我们提出了一种基于贝叶斯优化的主动测试框架。我们使用逻辑指定安全约束,并利用问题中的结构来测试系统,以发现违反安全规范的对抗示例。这些规范被定义为轨迹上的光滑函数的复杂布尔组合,与强化学习中的奖励函数不同,它们是表达性强且对系统施加硬约束。在我们的框架中,我们利用单个函数的正则性假设,形式化为高斯过程(GP)先验。我们结合这些内容到一个连贯的优化框架中,利用问题结构。所得到的算法能够证明验证复杂的安全规范或找到对抗示例。实验结果表明,所提出的方法能够快速发现对抗示例。

英文摘要

Recent successes in reinforcement learning have led to the development of complex controllers for real-world robots. As these robots are deployed in safety-critical applications and interact with humans, it becomes critical to ensure safety in order to avoid causing harm. A first step in this direction is to test the controllers in simulation. To be able to do this, we need to capture what we mean by safety and then efficiently search the space of all behaviors to see if they are safe. In this paper, we present an active-testing framework based on Bayesian Optimization. We specify safety constraints using logic and exploit structure in the problem in order to test the system for adversarial counterexamples that violate the safety specifications. These specifications are defined as complex boolean combinations of smooth functions on the trajectories and, unlike reward functions in reinforcement learning, are expressive and impose hard constraints on the system. In our framework, we exploit regularity assumptions on individual functions in the form of a Gaussian Process (GP) prior. We combine these into a coherent optimization framework using problem structure. The resulting algorithm is able to provably verify complex safety specifications or alternatively find counterexamples. Experimental results show that the proposed method is able to find adversarial examples quickly.

1812.03216 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Zero-shot Deep Reinforcement Learning Driving Policy Transfer for Autonomous Vehicles based on Robust Control

基于鲁棒控制的零样本深度强化学习驾驶策略迁移用于自动驾驶车辆

Zhuo Xu, Chen Tang, Masayoshi Tomizuka

发表机构 * University of California, Berkeley(加州大学伯克利分校)

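AI summary This paper proposes a robust-control-based method for zero-shot transfer of deep reinforcement learning driving policies. It addresses the modeling gap between the source and target domains by transferring concrete kinematic quantities, combines a transferable hierarchical RL trajectory planner with a robust tracking controller based on a disturbance observer, and validates zero-shot transfer across multiple driving scenarios.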

Comments Published at IEEE ITSC 2018

详情
AI中文摘要

尽管深度强化学习(深度RL)方法在应用于自动驾驶时具有诸多优势,但真实应用却受到源域(训练)与目标域(部署)之间建模差距的限制。与当前的策略迁移方法不同,本文提出转移具体的自动驾驶运动学量。所提出的基于鲁棒控制的(RC)通用迁移架构,称为RL-RC,结合了可转移的分层强化学习轨迹规划器和基于扰动观测器(DOB)的鲁棒跟踪控制器。通过训练已知的名义动力学模型的深度RL策略直接转移到目标域,DOB基于的鲁棒跟踪控制用于处理建模差距,包括车辆动力学误差和外部扰动如侧向力。我们提供了模拟验证所提出方法在多个驾驶场景如车道保持、车道变更和障碍物避让中的迁移能力。

英文摘要

Although deep reinforcement learning (deep RL) methods have many strengths that are favorable when applied to autonomous driving, real deep RL applications in autonomous driving have been slowed down by the modeling gap between the source (training) domain and the target (deployment) domain. Unlike current policy transfer approaches, which are generally limited to using uninterpretable neural network representations as the transferred features, we propose to transfer concrete kinematic quantities in autonomous driving. The proposed robust-control-based (RC) generic transfer architecture, which we call RL-RC, incorporates a transferable hierarchical RL trajectory planner and a robust tracking controller based on a disturbance observer (DOB). The deep RL policies trained with a known nominal dynamics model are transferred directly to the target domain, and DOB-based robust tracking control is applied to tackle the modeling gap, including the vehicle dynamics errors and external disturbances such as side forces. We provide simulations validating the capability of the proposed method to achieve zero-shot transfer across multiple driving scenarios such as lane keeping, lane changing and obstacle avoidance.

1805.07857 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

Parallel Transport Convolution: A New Tool for Convolutional Neural Networks on Manifolds

平行运输卷积:用于流形上卷积神经网络的新工具

Stefan C. Schonsheck, Bin Dong, Rongjie Lai

发表机构 * Rensselaer Polytechnic Institute(伦斯勒理工学院)

AI总结 本文提出平行运输卷积(PTC),一种在流形及其离散对应物上扩展卷积操作的新方法,能够保持卷积的紧凑支持、方向性和跨流形的可转移性,从而在曲面域上构建小波样操作和深度卷积神经网络。

Comments 10 pages

详情
AI中文摘要

卷积在科学和工程中的各种应用中扮演了重要的角色,是卷积神经网络中最关键的操作。近年来,研究者对在曲面域(如流形和图)上推广卷积的兴趣增长,但现有方法无法保持欧几里得卷积的所有理想特性,即紧凑支持滤波器、方向性和跨不同流形的可转移性。本文开发了一种新的卷积操作扩展,称为平行运输卷积(PTC),应用于黎曼流形及其离散对应物。PTC基于平行运输,能够沿流形传输信息并内在保持方向性。PTC允许构建具有紧凑支持的滤波器,并且对流形变形具有鲁棒性。这使得我们能够执行小波样操作,并在曲面域上定义深度卷积神经网络。

英文摘要

Convolution has played a prominent role in various applications in science and engineering for many years. It is the most important operation in convolutional neural networks. There has been growing research interest in generalizing convolutions to curved domains such as manifolds and graphs. However, existing approaches cannot preserve all the desirable properties of Euclidean convolutions, namely compactly supported filters, directionality, and transferability across different manifolds. In this paper we develop a new generalization of the convolution operation, referred to as parallel transport convolution (PTC), on Riemannian manifolds and their discrete counterparts. PTC is based on parallel transport, which is able to translate information along a manifold and to intrinsically preserve directionality. PTC allows for the construction of compactly supported filters and is also robust to manifold deformations. This enables us to perform wavelet-like operations and to define deep convolutional neural networks on curved domains.

1812.00679 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Data Driven Chiller Plant Energy Optimization with Domain Knowledge

数据驱动的冷水机组能源优化与领域知识

Hoang Dung Vu, Kok Soon Chai, Bryan Keating, Nurislam Tursynbek, Boyan Xu, Kaige Yang, Xiaoyan Yang, Zhenjie Zhang

发表机构 * Kaer Pte. Ltd.(卡尔公司) University of Illinois at Urbana Champaign(伊利诺伊大学厄巴纳-香槟分校) Nazarbayev University(纳扎尔拜耶夫大学) Guangdong University of Technology(广东工业大学) University College London(伦敦大学学院) Advanced Digital Sciences Center(先进数字科学中心)

AI总结 本文提出了一种结合领域知识的数据驱动方法,用于实时冷水机组优化,通过实际案例验证了该方法在降低日常电力消耗方面的显著效果。

Comments CIKM2017. Proceedings of the 26th ACM International Conference on Information and Knowledge Management. 2017

详情
AI中文摘要

制冷和冷水机组优化是机械工程中的重要且广泛研究的主题,主要利用物理模型,基于过于简化的假设在设备上进行设计。传统优化技术使用物理模型进行在线参数调整,仅基于有限的硬件规格和外部条件信息,例如室外天气。近年来,新一代传感器成为新冷水机组的重要组成部分,首次使系统管理员能够及时准确地持续监控所有设备的运行状态。数据激增,由机器学习和数据挖掘的分析能力增加驱动,揭示了数据驱动方法在实时冷水机组优化中的新可能性。本文介绍了我们在冷水机组上采用数据模型和优化的研究和工业经验,并讨论了我们在实际设备上的实践教训。与复杂机器学习模型不同,我们强调将适当的领域知识纳入数据分析工具中,这在很大程度上超越了最先进的深度学习技术的性能。我们在实际冷水机组上的实证评估实现了每日电力消耗的节省超过7%。

英文摘要

Refrigeration and chiller optimization is an important and well-studied topic in mechanical engineering, mostly relying on physical models of the equipment that are built on over-simplified assumptions. Conventional optimization techniques using physical models make online parameter-tuning decisions based on very limited information about hardware specifications and external conditions, e.g., outdoor weather. In recent years, a new generation of sensors has become an essential part of new chiller plants, for the first time allowing system administrators to continuously monitor the running status of all equipment in a timely and accurate way. The explosive growth of data flowing to databases, driven by the increasing analytical power of machine learning and data mining, unveils new possibilities for data-driven approaches to real-time chiller plant optimization. This paper presents our research and industrial experience on the adoption of data models and optimizations on chiller plants and discusses the lessons learnt from our practice on real-world plants. Instead of employing complex machine learning models, we emphasize the incorporation of appropriate domain knowledge into data analysis tools, which turns out to be the key performance improver over state-of-the-art deep learning techniques by a significant margin. Our empirical evaluation on a real-world chiller plant achieves savings of more than 7% on daily power consumption.

1804.01526 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Training DNNs with Hybrid Block Floating Point

用混合块浮点数训练DNNs

Mario Drumond, Tao Lin, Martin Jaggi, Babak Falsafi

发表机构 * EPFL(洛桑联邦理工学院)

AI总结 本文提出了一种混合块浮点数(HBFP)方法,通过在块浮点数中执行所有点积运算,其他运算使用浮点数,从而在保持高精度的同时提高硬件密度和吞吐量。

Comments 9 pages, 3 figures. Accepted in Neural Information Processing Systems 2018 (NeurIPS 2018)

详情
AI中文摘要

深度神经网络(DNN)的广泛应用催生了持续增长的计算需求,迫使数据中心运营商采用领域专用加速器来训练它们。这些加速器通常采用高密度排布的全精度浮点运算单元,以最大化单位面积性能。持续的研究努力旨在通过用定点运算替换浮点运算来进一步提高这种性能密度。然而,这些尝试面临的主要障碍是定点数的动态范围狭窄,不足以支持DNN训练的收敛。我们认为块浮点数(BFP)是一种有前途的替代表示,因为它具有宽动态范围,并且能够使大多数DNN运算使用定点逻辑进行。不幸的是,单独使用BFP会带来若干限制,使其无法直接应用。在本文中,我们引入了HBFP,一种混合BFP-FP方法,它在BFP中执行所有点积运算,其他运算使用浮点运算。HBFP实现了两全其美:浮点数的高精度和定点数的优越硬件密度。对于多种模型,我们证明HBFP能够达到与浮点数相当的精度,同时使硬件实现的吞吐量最高提升8.5倍。

英文摘要

The wide adoption of DNNs has given birth to unrelenting computing requirements, forcing datacenter operators to adopt domain-specific accelerators to train them. These accelerators typically employ densely packed full precision floating-point arithmetic to maximize performance per area. Ongoing research efforts seek to further increase that performance density by replacing floating-point with fixed-point arithmetic. However, a significant roadblock for these attempts has been fixed point's narrow dynamic range, which is insufficient for DNN training convergence. We identify block floating point (BFP) as a promising alternative representation since it exhibits wide dynamic range and enables the majority of DNN operations to be performed with fixed-point logic. Unfortunately, BFP alone introduces several limitations that preclude its direct applicability. In this work, we introduce HBFP, a hybrid BFP-FP approach, which performs all dot products in BFP and other operations in floating point. HBFP delivers the best of both worlds: the high accuracy of floating point at the superior hardware density of fixed point. For a wide variety of models, we show that HBFP matches floating point's accuracy while enabling hardware implementations that deliver up to 8.5x higher throughput.
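The block floating point idea in the abstract can be illustrated with a small sketch: every value in a block shares one exponent, so a dot product reduces to integer multiply-accumulate on the mantissas followed by a single rescale. This is not the authors' HBFP implementation; the function names `to_bfp` and `bfp_dot`, the 8-bit signed mantissa, the round-to-nearest rule, and treating a whole vector as one block are assumptions made for illustration.

```python
import math

def to_bfp(block, mantissa_bits=8):
    """Quantize a list of floats to integer mantissas plus one shared exponent (assumed format)."""
    limit = 2 ** (mantissa_bits - 1) - 1          # largest signed mantissa magnitude
    max_abs = max((abs(x) for x in block), default=0.0)
    if max_abs == 0.0:
        return [0] * len(block), 0
    # Smallest power-of-two scale for which the largest value still fits in the mantissa.
    exponent = math.ceil(math.log2(max_abs / limit))
    scale = 2.0 ** exponent
    mantissas = [max(-limit, min(limit, round(x / scale))) for x in block]
    return mantissas, exponent

def bfp_dot(a, b, mantissa_bits=8):
    """Dot product using only integer arithmetic on mantissas, then one rescale."""
    ma, ea = to_bfp(a, mantissa_bits)
    mb, eb = to_bfp(b, mantissa_bits)
    acc = sum(x * y for x, y in zip(ma, mb))      # fixed-point multiply-accumulate
    return acc * 2.0 ** (ea + eb)

a = [0.5, -1.25, 3.0, 0.125]
b = [2.0, 0.75, -0.5, 1.0]
result = bfp_dot(a, b)
```

Because the inputs above are exactly representable at the chosen scale, the result matches the floating-point dot product; for general inputs the shared exponent trades per-element precision for cheaper integer arithmetic.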

1811.12830 2026-06-04 math.NA cs.LG cs.NA math.FA 版本更新

Beltrami-Net: Domain Independent Deep D-bar Learning for Absolute Imaging with Electrical Impedance Tomography (a-EIT)

Beltrami-Net: 用于电阻抗断层绝对成像(a-EIT)的域无关深度D-bar学习

S. J. Hamilton, A. Hänninen, A. Hauptmann, V. Kolehmainen

发表机构 * Department of Mathematics, Statistics, and Computer Science, Marquette University(马凯特大学数学、统计与计算机科学系) Department of Applied Physics, University of Eastern Finland(东芬兰大学应用物理系) Department of Computer Science, University College London(伦敦大学学院计算机科学系)

AI总结 本文提出了一种新的a-EIT图像重建方法,通过将深度学习技术与实时鲁棒D-bar方法结合,利用非物理Beltrami方程生成训练数据,实现了与边界形状无关的图像质量提升。

Comments 15 pages, 8 figures, 3 tables

详情
AI中文摘要

目标:开发一种新的绝对电阻抗断层成像(a-EIT)图像重建方法并论证其可行性,该方法将深度学习技术与实时、鲁棒的D-bar方法相结合。方法:将D-bar方法与训练好的卷积神经网络(CNN)结合,后者作为后处理步骤。训练数据的模拟不使用任何边界形状信息,而是采用关联的非物理Beltrami方程,而非模拟特定于给定区域的传统电流和电压数据,从而使训练数据与边界形状无关。该方法在两个EIT系统(ACT4和KIT4)的实验数据上进行了测试。主要结果:用CNN后处理D-bar图像,在结构相似性指数(SSIM)以及相对ℓ₂和ℓ₁图像误差方面显著提高了图像质量。意义:本工作表明无需指定边界形状即可训练更通用的网络,而边界形状是EIT图像重建中的关键挑战。该工作对未来涉及解剖图谱数据库的研究具有前景。

英文摘要

Objective: To develop, and demonstrate the feasibility of, a novel image reconstruction method for absolute Electrical Impedance Tomography (a-EIT) that pairs deep learning techniques with real-time robust D-bar methods. Approach: A D-bar method is paired with a trained Convolutional Neural Network (CNN) as a post-processing step. Training data is simulated for the network using no knowledge of the boundary shape by using an associated nonphysical Beltrami equation rather than simulating the traditional current and voltage data specific to a given domain. This allows the training data to be boundary shape independent. The method is tested on experimental data from two EIT systems (ACT4 and KIT4). Main Results: Post processing the D-bar images with a CNN produces significant improvements in image quality measured by Structural SIMilarity indices (SSIMs) as well as relative $\ell_2$ and $\ell_1$ image errors. Significance: This work demonstrates that more general networks can be trained without being specific about boundary shape, a key challenge in EIT image reconstruction. The work is promising for future studies involving databases of anatomical atlases.

1811.11433 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Beyond Pham's algorithm for joint diagonalization

超越Pham算法的联合对角化

Pierre Ablin, Jean-François Cardoso, Alexandre Gramfort

发表机构 * INRIA - Parietal team(INRIA-帕里埃尔团队) CNRS - Institut d’Astrophysique de Paris(CNRS-巴黎天体物理研究所)

AI总结 本文提出了一种新的拟牛顿方法来优化Pham提出的对角化准则,并通过模拟和真实数据集的实验表明该方法优于Pham算法。

详情
AI中文摘要

一组矩阵的近似联合对角化问题在于找到一个基底,使得这些矩阵尽可能对角化。这个问题自然出现在多种统计学习任务中,如盲源分离。我们考虑了Pham(2001)在开创性论文中研究的对角化准则,并提出了一种新的拟牛顿方法来优化它。通过在模拟和真实数据集上的数值实验,我们展示了所提出的方法优于Pham算法。一个开源的Python包已发布。

英文摘要

The approximate joint diagonalization of a set of matrices consists in finding a basis in which these matrices are as diagonal as possible. This problem naturally appears in several statistical learning tasks such as blind signal separation. We consider the diagonalization criterion studied in a seminal paper by Pham (2001), and propose a new quasi-Newton method for its optimization. Through numerical experiments on simulated and real datasets, we show that the proposed method outperforms Pham's algorithm. An open source Python package is released.
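For orientation, the criterion attributed to Pham (2001) is commonly written as a log-likelihood objective over the matrix set. The form below is a sketch under assumptions: the symbols $C_1,\dots,C_n$ (the matrices to diagonalize), $B$ (the sought basis), and the $1/(2n)$ normalization do not appear in the abstract.

```latex
\mathcal{L}(B) = \frac{1}{2n} \sum_{i=1}^{n}
  \Big[ \log\det\operatorname{diag}\big(B C_i B^{\top}\big)
      - \log\det\big(B C_i B^{\top}\big) \Big]
```

By Hadamard's inequality each summand is non-negative for positive definite $C_i$ and vanishes exactly when $B C_i B^{\top}$ is diagonal, so a smaller value means the set is closer to being jointly diagonal.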

1804.01983 2026-06-04 math.NA cs.CV cs.LG cs.NA 版本更新

High-dimension Tensor Completion via Gradient-based Optimization Under Tensor-train Format

通过张量列车格式的梯度优化实现高维张量补全

Longhao Yuan, Qibin Zhao, Lihua Gui, Jianting Cao

发表机构 * Graduate School of Engineering, Saitama Institute of Technology, Japan(日本埼玉工业大学工学研究科) Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan(日本RIKEN先进人工智能项目(AIP)张量学习单元) School of Automation, Guangdong University of Technology, China(广东工业大学自动化学院) School of Computer Science and Technology, Hangzhou Dianzi University, China(杭州电子科技大学计算机科学与技术学院)

AI总结 本文提出了一种基于张量列车格式的梯度优化方法,用于补全高维张量中的缺失数据,通过寻找低秩张量列车分解来捕捉数据的潜在特征,并利用梯度下降算法高效解决张量补全问题,同时引入视觉数据张量化方法提升算法性能。

详情
AI中文摘要

张量列车(TT)分解因其在高阶张量中的强大表示能力和稳定性而受到关注。本文提出了一种新的方法,用于恢复由高阶张量表示的不完整数据中的缺失条目。我们尝试找到不完整数据的低秩TT分解,以捕捉整个数据集的潜在特征,然后重建缺失条目。通过应用梯度下降算法,利用优化模型高效地解决了张量补全问题。我们提出了两种基于TT的算法:张量列车加权优化(TT-WOPT)和张量列车随机梯度下降(TT-SGD),用于优化TT分解因子。此外,提出了一种名为视觉数据张量化(VDT)的方法,将视觉数据转换为高阶张量,从而提升了我们算法的性能。在合成数据和视觉数据的实验中,我们的算法在高阶、高缺失率和大规模张量补全情况下表现出高效和优越的性能,相比最先进的补全算法。

英文摘要

Tensor train (TT) decomposition has drawn attention due to its powerful representation ability and stable performance on high-order tensors. In this paper, we propose a novel approach to recover the missing entries of incomplete data represented by higher-order tensors. We attempt to find a low-rank TT decomposition of the incomplete data that captures the latent features of the whole data, and then reconstruct the missing entries. By applying gradient descent algorithms, the tensor completion problem is solved efficiently through optimization models. We propose two TT-based algorithms, Tensor Train Weighted Optimization (TT-WOPT) and Tensor Train Stochastic Gradient Descent (TT-SGD), to optimize the TT decomposition factors. In addition, a method named Visual Data Tensorization (VDT) is proposed to transform visual data into higher-order tensors, which improves the performance of our algorithms. Experiments on synthetic data and visual data show the high efficiency and performance of our algorithms compared to state-of-the-art completion algorithms, especially in high-order, high missing rate, and large-scale tensor completion situations.

1803.00444 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY stat.ML 版本更新

Inverse Reinforcement Learning via Nonparametric Spatio-Temporal Subgoal Modeling

通过非参数时空子目标建模实现逆强化学习

Adrian Šošić, Elmar Rueckert, Jan Peters, Abdelhak M. Zoubir, Heinz Koeppl

发表机构 * Signal Processing Group(信号处理组) Institute for Robotics and Cognitive Systems(机器人与认知系统研究所) Autonomous Systems Labs(自主系统实验室) Bioinspired Communication Systems(生物启发通信系统)

AI总结 本文提出了一种基于非参数时空子目标建模的逆强化学习方法,通过局部上下文更高效地解释单条轨迹,实现更紧凑的行为表示,并构建隐式意图模型以预测未观察到的情况,从而在处理意图变化和主动学习场景中表现出色。

Comments 45 pages, 14 figures; ### Version 3 ### published in the Journal of Machine Learning Research

详情
AI中文摘要

逆强化学习(IRL)领域的发展导致了更复杂的推理框架,这些框架放宽了原始建模假设,即观察到的代理行为仅反映单一意图。相反于学习全局行为模型,最近的IRL方法将演示数据分成部分,以考虑不同轨迹可能对应不同意图,例如因为它们由不同领域专家生成。在本工作中,我们进一步采用子目标的直观概念,建立一个前提:即使单条轨迹在特定上下文中局部解释也比全局更高效,从而实现更紧凑的行为表示。基于这一假设,我们构建了代理目标的隐式意图模型,以预测未观察到的情况。结果是一种集成的贝叶斯预测框架,显著优于现有IRL解决方案,并提供与专家计划一致的平滑策略估计。最值得注意的是,我们的框架自然处理代理意图随时间变化的情况,而经典IRL算法失败。此外,由于其概率性质,该模型可以轻松应用于主动学习场景,以指导专家的演示过程。

英文摘要

Advances in the field of inverse reinforcement learning (IRL) have led to sophisticated inference frameworks that relax the original modeling assumption of observing an agent behavior that reflects only a single intention. Instead of learning a global behavioral model, recent IRL methods divide the demonstration data into parts, to account for the fact that different trajectories may correspond to different intentions, e.g., because they were generated by different domain experts. In this work, we go one step further: using the intuitive concept of subgoals, we build upon the premise that even a single trajectory can be explained more efficiently locally within a certain context than globally, enabling a more compact representation of the observed behavior. Based on this assumption, we build an implicit intentional model of the agent's goals to forecast its behavior in unobserved situations. The result is an integrated Bayesian prediction framework that significantly outperforms existing IRL solutions and provides smooth policy estimates consistent with the expert's plan. Most notably, our framework naturally handles situations where the intentions of the agent change over time and classical IRL algorithms fail. In addition, due to its probabilistic nature, the model can be straightforwardly applied in active learning scenarios to guide the demonstration process of the expert.

1811.12084 2026-06-04 cs.CV cs.LG cs.NA math.AP math.NA 版本更新

Networks for Nonlinear Diffusion Problems in Imaging

图像中非线性扩散问题的网络

Simon Arridge, Andreas Hauptmann

发表机构 * Department of Computer Science, University College London(伦敦大学学院计算机科学系)

AI总结 本文提出了一种基于非线性扩散过程的网络架构DiffNet,用于解决图像中的非线性扩散问题,该网络在可解释性和泛化能力方面优于传统卷积神经网络,并在非线性扩散逆问题上取得了与U-Net相当的性能。

详情
AI中文摘要

许多成像和视觉任务近期通过深度学习方法,特别是卷积神经网络的应用,经历了重大变革。这些方法在某些应用中取得了显著成果,即使这些应用并不明显表明卷积适合捕捉底层物理。在本文中,我们开发了一种基于非线性扩散过程的网络架构,称为DiffNet。通过设计,我们获得了一种适合图像中扩散相关问题的非线性网络架构。此外,所执行的更新是显式的,从而比传统卷积神经网络架构获得了更好的可解释性和泛化能力。在STL-10图像数据集上测试DiffNet在非线性扩散逆问题中的性能,使用Perona-Malik滤波器。我们获得的结果与已建立的U-Net架构具有竞争力,参数数量和必要的训练数据较少。

英文摘要

A multitude of imaging and vision tasks have recently seen a major transformation through deep learning methods, in particular through the application of convolutional neural networks. These methods achieve impressive results, even for applications where it is not apparent that convolutions are suited to capture the underlying physics. In this work we develop a network architecture based on nonlinear diffusion processes, named DiffNet. By design, we obtain a nonlinear network architecture that is well suited for diffusion-related problems in imaging. Furthermore, the performed updates are explicit, by which we obtain better interpretability and generalisability compared to classical convolutional neural network architectures. The performance of DiffNet is tested on the inverse problem of nonlinear diffusion with the Perona-Malik filter on the STL-10 image dataset. We obtain competitive results to the established U-Net architecture, with a fraction of the parameters and necessary training data.
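A minimal sketch of one explicit Perona-Malik update on a 1-D signal, to illustrate the kind of explicit nonlinear diffusion step the abstract refers to. It is not DiffNet itself: the function name `perona_malik_step`, the 1-D setting, the edge-stopping function g(s) = 1 / (1 + (s / kappa)^2), the zero-flux boundary, and the values of `dt` and `kappa` are all assumptions.

```python
def perona_malik_step(u, dt=0.2, kappa=0.5):
    """One explicit nonlinear diffusion step on a 1-D signal with zero-flux boundaries."""
    n = len(u)
    # Flux across the interface between samples i and i+1.
    flux = []
    for i in range(n - 1):
        d = u[i + 1] - u[i]
        g = 1.0 / (1.0 + (d / kappa) ** 2)   # small across large jumps, close to 1 in flat regions
        flux.append(g * d)
    out = []
    for i in range(n):
        right = flux[i] if i < n - 1 else 0.0
        left = flux[i - 1] if i > 0 else 0.0
        out.append(u[i] + dt * (right - left))
    return out

# Small oscillations on both sides of a large step edge.
signal = [0.0, 0.1, 0.0, 0.1, 0.0, 5.0, 5.1, 5.0, 5.1, 5.0]
smoothed = signal
for _ in range(20):
    smoothed = perona_malik_step(smoothed)
```

The small oscillations are flattened while the large jump between the two plateaus is largely kept, which is the edge-preserving behaviour that distinguishes this filter from linear diffusion.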

1811.11259 2026-06-04 cs.LG cs.AI cs.DS cs.SY eess.SY stat.ML 版本更新

Scaling Configuration of Energy Harvesting Sensors with Reinforcement Learning

基于强化学习的能源收集传感器的扩展配置

Francesco Fraternali, Bharathan Balaji, Rajesh Gupta

发表机构 * University of California, San Diego(加州大学圣迭戈分校) University of California, Los Angeles(加州大学洛杉矶分校)

AI总结 本文提出利用强化学习自动配置室内太阳能板能源收集传感器的采样率,通过减少训练阶段和计算需求,实现快速部署和大规模扩展,有效提升传感器数据采集效率并避免能源耗尽。

Comments 7 pages, 5 figures

详情
Journal ref
ENSsys '18: International Workshop on Energy Harvesting & Energy-Neutral Sensing Systems, November 4, 2018, Shenzhen, China
AI中文摘要

随着物联网(IoT)的出现,越来越多的能源收集方法被用于补充或替代电池供电传感器。能源收集传感器需要根据应用、硬件和环境条件进行配置,以最大化其效用。目前,传感器配置要么是手动的,要么基于启发式方法,需要宝贵的领域专业知识。强化学习(RL)是一种有前景的方法,可以自动化配置并高效扩展IoT部署,但尚未在实践中得到应用。我们提出了解决这一差距的解决方案:减少RL的训练阶段,使节点在部署后短时间内即可运行,并减少计算需求以扩展到大规模部署。我们专注于配置基于室内太阳能板的能源收集传感器的采样率。我们基于三个月内从5个传感器节点收集的数据创建了一个模拟器。我们的模拟结果表明,RL可以有效学习能源可用性模式,并配置传感器节点的采样率以在确保不耗尽能源存储的情况下最大化传感数据。通过我们的方法,节点可以在部署的第一天内投入使用。我们还展示了通过使用相似光照条件的节点共享单个策略来减少RL策略数量的可能性。

英文摘要

With the advent of the Internet of Things (IoT), an increasing number of energy harvesting methods are being used to supplement or supplant battery based sensors. Energy harvesting sensors need to be configured according to the application, hardware, and environmental conditions to maximize their usefulness. As of today, the configuration of sensors is either manual or heuristics based, requiring valuable domain expertise. Reinforcement learning (RL) is a promising approach to automate configuration and efficiently scale IoT deployments, but it is not yet adopted in practice. We propose solutions to bridge this gap: reduce the training phase of RL so that nodes are operational within a short time after deployment and reduce the computational requirements to scale to large deployments. We focus on configuration of the sampling rate of indoor solar panel based energy harvesting sensors. We created a simulator based on 3 months of data collected from 5 sensor nodes subject to different lighting conditions. Our simulation results show that RL can effectively learn energy availability patterns and configure the sampling rate of the sensor nodes to maximize the sensing data while ensuring that energy storage is not depleted. The nodes can be operational within the first day by using our methods. We show that it is possible to reduce the number of RL policies by using a single policy for nodes that share similar lighting conditions.

1811.10275 2026-06-04 stat.CO cs.LG cs.NA math.NA stat.ML 版本更新

Rejoinder for "Probabilistic Integration: A Role in Statistical Computation?"

对“概率积分:在统计计算中的作用?”的回应

Francois-Xavier Briol, Chris J. Oates, Mark Girolami, Michael A. Osborne, Dino Sejdinovic

发表机构 * Imperial College London(伦敦帝国理工学院) Newcastle University(纽卡斯尔大学) University of Oxford(牛津大学)

AI总结 本文是对即将发表在《统计科学》上的论文“概率积分:在统计计算中的作用?”的回应。作者感谢了评审员和同事们的帮助,并回应了讨论者提出的问题,探讨了贝叶斯方法在数值分析中的应用及其在统计计算中的作用。

Comments Accepted to Statistical Science

详情
AI中文摘要

本文是对即将发表在《统计科学》(Statistical Science)上并附讨论的论文“概率积分:在统计计算中的作用?”的回应。我们首先感谢评审员和许多帮助完善这篇论文的同事,感谢编辑选择我们的论文进行讨论,当然还要感谢所有讨论者深思熟虑、富有见地且具有建设性的评论。在本回应中,我们回应了讨论者提出的部分观点,并进一步探讨了论文背后的基本问题:(i)贝叶斯思想是否应该用于数值分析?(ii)如果应该,此类方法在统计计算中应扮演什么角色?

英文摘要

This article is the rejoinder for the paper "Probabilistic Integration: A Role in Statistical Computation?" to appear in Statistical Science with discussion. We would first like to thank the reviewers and many of our colleagues who helped shape this paper, the editor for selecting our paper for discussion, and of course all of the discussants for their thoughtful, insightful and constructive comments. In this rejoinder, we respond to some of the points raised by the discussants and comment further on the fundamental questions underlying the paper: (i) Should Bayesian ideas be used in numerical analysis?, and (ii) If so, what role should such approaches have in statistical computation?

1809.01588 2026-06-04 math.NA cs.LG cs.NA math.AG math.ST stat.TH 版本更新

Learning Paths from Signature Tensors

从签名张量学习路径

Max Pfeffer, Anna Seigal, Bernd Sturmfels

AI总结 本文通过张量分解、代数几何和数值优化方法,研究了张量的矩阵合同问题,并针对随机分析中的逆问题,提出从三阶签名张量恢复路径的方法,建立了路径的可识别性结果。

Comments 22 pages, 3 figures

详情
AI中文摘要

矩阵合同可以自然地推广到张量的情形。我们应用张量分解、代数几何和数值优化的方法来研究这一群作用。给定位于另一个张量轨道上的张量,我们计算将其中一个变换为另一个的矩阵。我们的主要应用是一个来自随机分析的逆问题:从路径的三阶签名张量恢复路径。我们为分段线性路径、多项式路径和通用字典建立了精确和数值意义上的可识别性结果。对于不精确的数据,我们采用数值优化进行恢复。我们还计算了具有给定签名张量的最短路径。

英文摘要

Matrix congruence extends naturally to the setting of tensors. We apply methods from tensor decomposition, algebraic geometry and numerical optimization to this group action. Given a tensor in the orbit of another tensor, we compute a matrix which transforms one to the other. Our primary application is an inverse problem from stochastic analysis: the recovery of paths from their third order signature tensors. We establish identifiability results, both exact and numerical, for piecewise linear paths, polynomial paths, and generic dictionaries. Numerical optimization is applied for recovery from inexact data. We also compute the shortest path with a given signature tensor.

1811.07799 2026-06-04 cs.MA cs.LG cs.SY eess.SY 版本更新

Distributed Learning of Average Belief Over Networks Using Sequential Observations

在使用顺序观测的网络上分布式学习平均信念

Kaiqing Zhang, Yang Liu, Ji Liu, Mingyan Liu, Tamer Başar

AI总结 本文研究了在顺序观测下,网络中多个智能体通过仅与邻居交换信息达成对平均信念共识的问题。提出两种分布式在线算法,分别适用于无向图和有向图,均能几乎必然收敛到平均信念,并以 O(1/t) 的速率达成共识。对于无向图,还修改了算法以适应量化通信和有限精度的除法操作,证明修改后的算法能使所有智能体达成量化共识或进入平均信念的附近区域。

Comments Accepted to Automatica

详情
AI中文摘要

本文解决了在顺序观测下分布式学习平均信念的问题,其中n>1个智能体组成的网络通过仅与邻居交换信息来达成对其信念平均值的共识。每个智能体以在线方式依次接收其信念的样本。n个智能体之间的邻居关系由一个可能随时间变化的图描述,其顶点对应智能体,边表示邻居关系。本文针对无向图和有向图分别提出了一种分布式在线算法,二者均被证明能几乎必然收敛到平均信念。此外,两种算法生成的序列以高概率按O(1/t)的速率达成共识,其中t是迭代次数。对于无向图,相应算法被修改以适应量化通信和有限精度的除法操作。证明修改后的算法使所有n个智能体要么达成量化共识,要么进入其信念平均值的一个小邻域。随后提供了数值模拟以验证理论结果。

英文摘要

This paper addresses the problem of distributed learning of average belief with sequential observations, in which a network of $n>1$ agents aims to reach a consensus on the average value of their beliefs by exchanging information only with their neighbors. Each agent receives sequentially arriving samples of its belief in an online manner. The neighbor relationships among the $n$ agents are described by a graph which is possibly time-varying, whose vertices correspond to agents and whose edges depict neighbor relationships. Two distributed online algorithms are introduced for undirected and directed graphs, which are both shown to converge to the average belief almost surely. Moreover, the sequences generated by both algorithms are shown to reach consensus with an $O(1/t)$ rate with high probability, where $t$ is the number of iterations. For undirected graphs, the corresponding algorithm is modified for the case with quantized communication and limited precision of the division operation. It is shown that the modified algorithm causes all $n$ agents to either reach a quantized consensus or enter a small neighborhood around the average of their beliefs. Numerical simulations are then provided to corroborate the theoretical results.
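The setting can be made concrete with a generic dynamic average consensus sketch on a fixed undirected graph. It is not either of the paper's algorithms: the Metropolis weight rule, the running-mean tracking update, and the names `metropolis_weights` and `run_consensus` are assumptions chosen only to show agents mixing neighbor values while new samples keep arriving.

```python
def metropolis_weights(neighbors):
    """Symmetric, doubly stochastic mixing matrix for an undirected graph (assumed weight rule)."""
    n = len(neighbors)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in neighbors[i]:
            W[i][j] = 1.0 / (1 + max(len(neighbors[i]), len(neighbors[j])))
        W[i][i] = 1.0 - sum(W[i])
    return W

def run_consensus(neighbors, samples, steps):
    """samples[i][t] is the observation agent i receives at time t."""
    n = len(neighbors)
    W = metropolis_weights(neighbors)
    mean = [samples[i][0] for i in range(n)]       # each agent's running sample mean
    x = list(mean)                                 # each agent's consensus estimate
    for t in range(1, steps):
        new_mean = [mean[i] + (samples[i][t] - mean[i]) / (t + 1) for i in range(n)]
        # Mix with neighbors, then add the change in the local running mean.
        x = [sum(W[i][j] * x[j] for j in range(n)) + new_mean[i] - mean[i]
             for i in range(n)]
        mean = new_mean
    return x, mean

# Path graph 0-1-2-3; each agent repeatedly observes its own constant belief.
path = [[1], [0, 2], [1, 3], [2]]
constant = [[1.0] * 300, [2.0] * 300, [3.0] * 300, [6.0] * 300]
estimates, _ = run_consensus(path, constant, 300)
```

With constant observations every estimate approaches the network-wide average of 3.0. With varying observations the sum of the estimates always equals the sum of the local running means, which is the invariant that lets the network track the average.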

1811.05788 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Learning to Compensate Photovoltaic Power Fluctuations from Images of the Sky by Imitating an Optimal Policy

通过模仿最优策略从天空图像中学习补偿光伏功率波动

Robin Spiess, Felix Berkenkamp, Jan Poland, Andreas Krause

发表机构 * Department of Computer Science, ETH Zurich(计算机科学系,苏黎世联邦理工学院) ABB Corporate Research, Switzerland(瑞士ABB企业研究)

AI总结 本文提出了一种基于深度学习的方法,利用天空图像预测性地补偿光伏功率波动,减少电池压力,通过模仿学习训练神经网络近似最优策略。

Comments 7 pages, 7 figures

详情
AI中文摘要

光伏(PV)发电站的输出功率取决于环境,因此会随时间波动。因此,光伏功率可能引起电网不稳定,在其使用日益增加时尤其如此。限制功率输出变化率是缓解这些波动的常见方法,通常借助大型电池实现。使用这些电池补偿功率爬坡的反应式控制器在实践中有效,但由于能量吞吐量高,会给电池带来压力。在本文中,我们提出了一种深度学习方法,利用天空图像来预测性地补偿功率波动并减少电池压力。具体而言,我们证明可以利用仅在事后可得的信息计算出最优控制策略。基于此,我们使用模仿学习训练一个神经网络,该网络近似这种事后最优策略,但仅使用当前可用的天空图像和传感器数据。我们在来自真实发电站的大规模测量和图像数据集上评估了该方法,并表明训练后的策略能够减少电池压力。

英文摘要

The energy output of photovoltaic (PV) power plants depends on the environment and thus fluctuates over time. As a result, PV power can cause instability in the power grid, in particular when increasingly used. Limiting the rate of change of the power output is a common way to mitigate these fluctuations, often with the help of large batteries. A reactive controller that uses these batteries to compensate ramps works in practice, but causes stress on the battery due to a high energy throughput. In this paper, we present a deep learning approach that uses images of the sky to compensate power fluctuations predictively and reduces battery stress. In particular, we show that the optimal control policy can be computed using information that is only available in hindsight. Based on this, we use imitation learning to train a neural network that approximates this hindsight-optimal policy, but uses only currently available sky images and sensor data. We evaluate our method on a large dataset of measurements and images from a real power plant and show that the trained policy reduces stress on the battery.
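The reactive baseline mentioned in the abstract can be sketched as a ramp-rate limiter: the grid-side output may change by at most a fixed amount per step, and the battery covers the difference. This is an illustrative sketch only; the function name `reactive_ramp_limiter`, the per-step ramp limit, the unbounded ideal battery, and measuring stress as summed absolute battery power are assumptions, not the paper's controller or metric.

```python
def reactive_ramp_limiter(pv_power, max_ramp):
    """Limit the per-step change of grid output; the battery supplies or absorbs the rest."""
    grid = [pv_power[0]]
    battery = [0.0]
    for p in pv_power[1:]:
        prev = grid[-1]
        # Move toward the PV output, but by no more than max_ramp per step.
        g = min(max(p, prev - max_ramp), prev + max_ramp)
        grid.append(g)
        battery.append(g - p)          # > 0: discharging, < 0: charging
    throughput = sum(abs(b) for b in battery)
    return grid, battery, throughput

# A cloud passes: PV output drops sharply and later recovers.
pv = [10.0, 10.0, 2.0, 2.0, 2.0, 10.0]
grid, battery, throughput = reactive_ramp_limiter(pv, max_ramp=3.0)
```

The limiter only reacts after the drop has happened, so the battery has to bridge the whole ramp. A predictive policy that starts ramping before the cloud arrives could, in principle, meet the same limit with less throughput.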

1811.05646 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Fast Distribution Grid Line Outage Identification with $μ$PMU

快速分布电网线路故障识别与μPMU

Yizheng Liao, Yang Weng, Chin-Woo Tan, Ram Rajagopal

AI总结 本文提出基于μPMU的随机时间序列分析的数据驱动故障监测方法,通过功率流分析证明线路故障后电压时间序列的统计特性显著变化,利用最大似然方法直接学习分布参数,实现快速准确的故障识别。

Comments 9 pages

详情
AI中文摘要

随着分布式能源资源(DER)在城市配电网中的日益集成,由于DER行为的不确定性和复杂性,各种可靠性问题随之出现。在DER大规模渗透的情况下,依赖客户电话报修和智能电表“最后一息”(last gasp)信号的传统故障检测方法性能有限,因为可再生能源发电机在线路故障后仍可供电,而且许多城市电网为网状结构,线路故障不会影响供电。为解决这些问题,我们提出了一种基于微型相量测量单元(μPMU)随机时间序列分析的数据驱动故障监测方法。具体而言,我们通过潮流分析证明,线路故障后电压时间序列测量值之间的依赖关系会出现显著的统计变化。这使得最优变化点检测理论适用于借助采样快速且准确的μPMU来识别线路故障。然而,现有变化点检测方法需要故障后的电压分布,而该分布在配电系统中是未知的。因此,我们设计了一种基于最大似然的方法,直接从μPMU数据中学习分布参数。我们证明,基于估计参数的检测仍能实现最优性能,使其对配电网故障识别极为有用。仿真结果表明,利用μPMU数据,在八个配电网的14种配置中(含DER与不含DER)均实现了高度准确的故障识别。

英文摘要

The growing integration of distributed energy resources (DERs) in urban distribution grids raises various reliability issues due to DERs' uncertain and complex behaviors. With large-scale DER penetration, traditional outage detection methods, which rely on customers making phone calls and smart meters' "last gasp" signals, will have limited performance, because renewable generators can supply power after line outages and many urban grids are meshed, so line outages do not affect power supply. To address these drawbacks, we propose a data-driven outage monitoring approach based on stochastic time series analysis of micro phasor measurement unit ($μ$PMU) data. Specifically, we prove via power flow analysis that the dependency of time-series voltage measurements exhibits significant statistical changes after line outages. This makes the theory of optimal change-point detection suitable for identifying line outages via $μ$PMUs with fast and accurate sampling. However, existing change-point detection methods require the post-outage voltage distribution, which is unknown in distribution systems. Therefore, we design a maximum likelihood-based method to directly learn the distribution parameters from $μ$PMU data. We prove that detection based on the estimated parameters still achieves the optimal performance, making it extremely useful for distribution grid outage identification. Simulation results show highly accurate outage identification in eight distribution grids with 14 configurations, with and without DERs, using $μ$PMU data.

1808.00113 2026-06-04 eess.SY cs.LG cs.RO cs.SY math.OC 版本更新

Learning Stabilizable Dynamical Systems via Control Contraction Metrics

通过控制收缩度量学习可稳定化的动态系统

Sumeet Singh, Vikas Sindhwani, Jean-Jacques E. Slotine, Marco Pavone

发表机构 * Dept. of Aeronautics and Astronautics, Stanford University(航空航天系,斯坦福大学) Google Brain Robotics, New York(谷歌大脑机器人,纽约) Dept. of Mechanical Engineering, Massachusetts Institute of Technology(机械工程系,麻省理工学院)

AI总结 本文提出了一种新的框架,用于学习可稳定化的非线性动态系统,以实现机器人连续控制任务。核心方法是开发一种基于稳定性的控制理论正则化器,以确保学习到的系统可以配备一个稳健的控制器,能够稳定任何系统生成的开环轨迹。通过利用收缩理论、统计学习和凸优化工具,我们提供了一个通用且可操作的半监督算法来学习可稳定化的动态系统,可以应用于复杂的欠驱动系统。在模拟平面四旋翼系统上验证了所提算法,并观察到与传统回归技术学习的模型相比,使用控制理论正则化模型在轨迹生成和跟踪性能上有显著改进,尤其是在使用少量示范示例时。结果展示了将标准基于模型的强化学习算法与非线性控制理论概念结合的必要性,以提高可靠性。

Comments To appear at WAFR 2018. v2: re-structured Sections 3 & 4 to improve clarity; expanded discussion on limitations & future work in Section 5; added details on training & validation, significantly expanded experiments

详情
AI中文摘要

我们提出了一种新的框架,用于学习可稳定化的非线性动态系统,以实现机器人连续控制任务。关键思想是开发一种基于稳定性的控制理论正则化器,用于动态拟合,该正则化器保证所学习的系统可以配备一个稳健的控制器,能够稳定任何系统可能生成的开环轨迹。通过利用收缩理论、统计学习和凸优化工具,我们提供了一个通用且可操作的半监督算法来学习可稳定化的动态系统,可以应用于复杂的欠驱动系统。我们在模拟平面四旋翼系统上验证了所提算法,并观察到与传统回归技术学习的模型相比,使用控制理论正则化模型在轨迹生成和跟踪性能上有显著改进,尤其是在使用少量示范示例时。所呈现的结果展示了将标准基于模型的强化学习算法与非线性控制理论概念结合的必要性,以提高可靠性。

英文摘要

We propose a novel framework for learning stabilizable nonlinear dynamical systems for continuous control tasks in robotics. The key idea is to develop a new control-theoretic regularizer for dynamics fitting rooted in the notion of stabilizability, which guarantees that the learned system can be accompanied by a robust controller capable of stabilizing any open-loop trajectory that the system may generate. By leveraging tools from contraction theory, statistical learning, and convex optimization, we provide a general and tractable semi-supervised algorithm to learn stabilizable dynamics, which can be applied to complex underactuated systems. We validated the proposed algorithm on a simulated planar quadrotor system and observed notably improved trajectory generation and tracking performance with the control-theoretic regularized model over models learned using traditional regression techniques, especially when using a small number of demonstration examples. The results presented illustrate the need to infuse standard model-based reinforcement learning algorithms with concepts drawn from nonlinear control theory for improved reliability.

1811.04006 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Reachability-based safe learning for optimal control problem

基于可达性的安全学习用于最优控制问题

Stanislav Fedorov, Antonio Candelieri

发表机构 * Department of Computer Science, Systems and Communication, University of Milano Bicocca(米兰比可卡大学计算机科学、系统与通信系)

AI总结 本文提出了一种结合系统部分已知状态空间模型和未知动态作为加性有界扰动的安全学习方法,旨在通过安全集选择最优动作以实现目标集,同时在学习过程中更新扰动并提升最优控制的鲁棒性。

详情
AI中文摘要

在本文中,我们寻求一种整合安全性的学习方法,该方法依赖于部分已知的系统状态空间模型,并将未知动态视为加性有界扰动。我们引入了一个框架,用于在存在扰动的情况下安全地学习控制策略。基于已知模型部分,算法可以在满足安全保持条件的情况下,选择最优动作以追求目标集。在一些学习回合后,扰动可以根据现实数据进行更新。为此,对收集的扰动样本进行高斯过程回归。由于现实世界的不稳定性,例如摩擦或导电性随温度的变化,我们期望获得更鲁棒的最优控制问题解决方案。为了评估上述方法,我们选择倒立摆作为基准模型。所提出的算法能够学习到不违反预设安全约束的策略。当将其与探索设置结合时,观察到性能有所提升,从而确保在安全集内学习到最优策略。最后,我们概述了一些超出本文范围的未来研究方向。

英文摘要

In this work we seek an approach to integrate safety into the learning process that relies on a partly known state-space model of the system and regards the unknown dynamics as an additive bounded disturbance. We introduce a framework for safely learning a control strategy for a given system with an additive disturbance. Given the known part of the model and a safe set in which the system can learn safely, the algorithm can choose optimal actions for pursuing the target set as long as the safety-preserving condition is satisfied. After some learning episodes, the disturbance can be updated based on real-world data. To this end, Gaussian Process regression is conducted on the collected disturbance samples. Because real-world conditions are not constant, for example friction or conductivity changes with temperature, we expect this to yield a more robust solution of the optimal control problem. To evaluate the approach described above, we choose an inverted pendulum as a benchmark model. The proposed algorithm manages to learn a policy that does not violate the pre-specified safety constraints. Performance improves when an exploration setup is incorporated to make sure that an optimal policy is learned everywhere in the safe set. Finally, we outline some promising directions for future research beyond the scope of this paper.

1811.03853 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Sample-Efficient Policy Learning based on Completely Behavior Cloning

基于完全行为克隆的高效策略学习

Qiming Zou, Ling Wang, Ke Lu, Yu Li

发表机构 * Department of Computer Science and Technology, Harbin Institute of Technology, China(计算机科学与技术系,哈尔滨工业大学,中国) Department of Management Science and Engineering, Anhui University of Technology, China(管理科学与工程系,安徽工业大学,中国)

AI总结 本文提出了一种基于完全行为克隆的策略初始化算法PLCBC,通过将模型预测控制转换为分段仿射函数并用神经网络表达,实现无训练的完全克隆,从而提高策略学习的效率和收敛性。

详情
AI中文摘要

直接策略搜索是强化学习中最重要的算法之一。然而,从头开始学习需要大量经验数据,并且容易陷入较差的局部最优。此外,部分训练的策略往往会执行对智能体和环境有危险的动作。为了解决这些挑战,本文提出了一种称为基于完全行为克隆的策略学习(PLCBC)的策略初始化算法。PLCBC首先使用多参数规划将模型预测控制(MPC)控制器转换为分段仿射(PWA)函数,并用神经网络表达此函数。通过这种方式,PLCBC可以在不损失性能的情况下完全克隆MPC控制器,并且完全无需训练。实验表明,这种初始化策略可以帮助智能体在高奖励状态区域学习,并更快、更好地收敛。

英文摘要

Direct policy search is one of the most important algorithms in reinforcement learning. However, learning from scratch needs a large amount of experience data and is prone to poor local optima. In addition, a partially trained policy tends to take actions that are dangerous to the agent and the environment. To overcome these challenges, this paper proposes a policy initialization algorithm called Policy Learning based on Completely Behavior Cloning (PLCBC). PLCBC first transforms the Model Predictive Control (MPC) controller into a piecewise affine (PWA) function using multi-parametric programming, and uses a neural network to express this function. In this way, PLCBC can completely clone the MPC controller without any performance loss, and is totally training-free. The experiments show that this initialization strategy can help the agent learn in the high-reward state region, and converge faster and better.
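
The claim that a PWA control law can be expressed by a neural network without loss can be illustrated on the simplest PWA controller: linear feedback clipped to actuator limits. The sketch below is not the PLCBC construction from the paper; the gain, the limits, and both function names are assumptions made for illustration. It only shows that two ReLU units reproduce the clipped law exactly, with no training involved.

```python
def relu(z: float) -> float:
    """Rectified linear unit."""
    return max(0.0, z)


def saturated_linear_controller(x, gain, u_min, u_max):
    """Reference PWA law: linear feedback clipped to the actuator limits."""
    z = sum(g * xi for g, xi in zip(gain, x))
    return min(max(z, u_min), u_max)


def relu_network_controller(x, gain, u_min, u_max):
    """The same law written as one hidden layer with two ReLU units.

    clip(z, lo, hi) = lo + relu(z - lo) - relu(z - hi)
    """
    z = sum(g * xi for g, xi in zip(gain, x))
    return u_min + relu(z - u_min) - relu(z - u_max)


# Hypothetical gain and limits, chosen only for this example.
GAIN = [-1.5, -0.5]
for state in [(2.0, 0.0), (0.2, 0.2), (-3.0, 1.0)]:
    u_ref = saturated_linear_controller(state, GAIN, -1.0, 1.0)
    u_net = relu_network_controller(state, GAIN, -1.0, 1.0)
    print(state, u_ref, u_net)
```

The weights of the network are read off from the controller parameters, which is the sense in which such a clone needs no training data.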

1811.03621 2026-06-04 cs.HC cs.CV cs.LG cs.SY eess.SY stat.ML 版本更新

Satyam: Democratizing Groundtruth for Machine Vision

Satyam: 机器视觉真值数据的民主化

Hang Qiu, Krishna Chintalapudi, Ramesh Govindan

发表机构 * University of Southern California(南加州大学) Microsoft Research(微软研究院)

AI总结 本文提出Satyam系统,通过简化流程使非专业人员能够高效收集机器视觉的真值(groundtruth)数据,所得真值数据的质量与训练有素的专家所得相当。

详情
AI中文摘要

机器学习(ML)的民主化催生了用于自动驾驶、交通监控和视频监控的基于机器学习的机器视觉系统。然而,如果不能大幅简化为训练和测试这些系统而收集真值(groundtruth)数据的过程,真正的民主化就无法实现。收集此类真值数据对于确保系统在不同条件下具有良好性能是必要的。在本文中,我们介绍了Satyam系统的设计和评估,这是首个使非专业人士能够以最小的努力启动机器视觉真值数据收集任务的系统。Satyam利用众包平台Amazon Mechanical Turk,并自动化了真值数据收集中几个具有挑战性的环节:创建和启动定制的网页用户界面任务以获取所需的真值数据,在存在敷衍作弊者(spammers)和未经训练的工人的情况下控制结果质量,根据任务复杂性调整价格,过滤作弊者和表现差的工人,以及处理工人的报酬。我们使用几种流行的基准视觉数据集验证了Satyam,并表明通过Satyam获得的真值数据与由受过训练的专家获得的数据相当,并且在用于训练时能带来相当的机器学习性能。

英文摘要

The democratization of machine learning (ML) has led to ML-based machine vision systems for autonomous driving, traffic monitoring, and video surveillance. However, true democratization cannot be achieved without greatly simplifying the process of collecting groundtruth for training and testing these systems. This groundtruth collection is necessary to ensure good performance under varying conditions. In this paper, we present the design and evaluation of Satyam, a first-of-its-kind system that enables a layperson to launch groundtruth collection tasks for machine vision with minimal effort. Satyam leverages a crowdtasking platform, Amazon Mechanical Turk, and automates several challenging aspects of groundtruth collection: creating and launching of custom web-UI tasks for obtaining the desired groundtruth, controlling result quality in the face of spammers and untrained workers, adapting prices to match task complexity, filtering spammers and workers with poor performance, and processing worker payments. We validate Satyam using several popular benchmark vision datasets, and demonstrate that groundtruth obtained by Satyam is comparable to that obtained from trained experts and provides matching ML performance when used for training.

1803.08287 2026-06-04 eess.SY cs.AI cs.LG cs.RO cs.SY 版本更新

Learning-based Model Predictive Control for Safe Exploration

基于学习的模型预测控制用于安全探索

Torsten Koller, Felix Berkenkamp, Matteo Turchetta, Andreas Krause

发表机构 * Vector Institute(向量研究所) Max Planck ETH Center for Learning Systems(马克斯·普朗克-ETH学习系统中心)

AI总结 本文提出了一种基于学习的模型预测控制方法,通过高斯过程先验假设构建可证明准确的轨迹置信区间,从而提供可证明的高概率安全保证,用于动态系统的安全高效探索和学习。

Comments Proc. of the Conference on Decision and Control, 2018

详情
AI中文摘要

基于学习的方法在没有显著系统先验知识的情况下成功解决了复杂控制任务。然而,这些方法通常不提供任何安全保证,这限制了它们在安全关键的现实应用中的使用。在本文中,我们提出了一种基于学习的模型预测控制方案,可以提供可证明的高概率安全保证。为此,我们利用高斯过程先验对动态特性进行正则性假设,以构建可证明准确的预测轨迹置信区间。与以往的方法不同,我们不假设模型不确定性是独立的。基于这些预测,我们保证轨迹满足安全约束。此外,我们使用终端集约束递归地保证在每个迭代中都存在安全的控制动作。在我们的实验中,我们展示了所提出算法可以安全且高效地探索和学习动态系统。

英文摘要

Learning-based methods have been successful in solving complex control tasks without significant prior knowledge about the system. However, these methods typically do not provide any safety guarantees, which prevents their use in safety-critical, real-world applications. In this paper, we present a learning-based model predictive control scheme that can provide provable high-probability safety guarantees. To this end, we exploit regularity assumptions on the dynamics in terms of a Gaussian process prior to construct provably accurate confidence intervals on predicted trajectories. Unlike previous approaches, we do not assume that model uncertainties are independent. Based on these predictions, we guarantee that trajectories satisfy safety constraints. Moreover, we use a terminal set constraint to recursively guarantee the existence of safe control actions at every iteration. In our experiments, we show that the resulting algorithm can be used to safely and efficiently explore and learn about dynamic systems.

1811.02052 2026-06-04 eess.SY cs.LG cs.MA cs.SY 版本更新

Managing engineering systems with large state and action spaces through deep reinforcement learning

通过深度强化学习管理具有大状态和动作空间的工程系统

C. P. Andriotis, K. G. Papakonstantinou

发表机构 * Department of Civil & Environmental Engineering(土木与环境工程系) The Pennsylvania State University(宾夕法尼亚州立大学) University Park(大学公园) USA(美国)

AI总结 本文提出了一种集成的深度强化学习框架,用于管理具有大状态和动作空间的多组件工程系统,通过开发深度集中多智能体Actor-Critic(DCMAC)方法,提供高效生命周期策略,以应对高维空间中的复杂决策问题。

详情
AI中文摘要

工程系统的决策可以高效地建模为马尔可夫决策过程(MDP)或部分可观测马尔可夫决策过程(POMDP)。典型的MDP和POMDP求解方法利用关于环境的离线知识,为状态和动作空间易于处理的相对较小的系统提供详细策略。然而,在大型多组件系统中,这些空间的规模容易爆炸,因为系统状态和动作随着组件数量的增加呈指数级增长,而整个系统的环境动态难以用显式形式描述,可能只能通过数值模拟器获取。在本工作中,为了解决这些问题,引入了一个集成的深度强化学习(DRL)框架。我们开发了深度集中式多智能体Actor-Critic(DCMAC),这是一种异策略(off-policy)的Actor-Critic DRL方法,可为在高维空间中运行的大型多组件系统提供高效的全生命周期策略。除了用于参数化大型状态空间的深度函数近似之外,DCMAC还采用了系统动作的因子化表示,能够指定个体化的组件级和子系统级决策,同时为整个系统保持集中式价值函数。在适用的情况下,DCMAC与深度Q网络(DQN)解决方案和精确策略相比表现良好,并优于基于时间、基于状态和周期性策略的优化基线。

英文摘要

Decision-making for engineering systems can be efficiently formulated as a Markov Decision Process (MDP) or a Partially Observable MDP (POMDP). Typical MDP and POMDP solution procedures utilize offline knowledge about the environment and provide detailed policies for relatively small systems with tractable state and action spaces. However, in large multi-component systems the sizes of these spaces easily explode, as system states and actions scale exponentially with the number of components, whereas environment dynamics are difficult to describe in explicit form for the entire system and may only be accessible through numerical simulators. In this work, to address these issues, an integrated Deep Reinforcement Learning (DRL) framework is introduced. The Deep Centralized Multi-agent Actor Critic (DCMAC) is developed, an off-policy actor-critic DRL approach, providing efficient life-cycle policies for large multi-component systems operating in high-dimensional spaces. Apart from deep function approximations that parametrize large state spaces, DCMAC also adopts a factorized representation of the system actions, being able to designate individualized component- and subsystem-level decisions, while maintaining a centralized value function for the entire system. DCMAC compares well against Deep Q-Network (DQN) solutions and exact policies, where applicable, and outperforms optimized baselines that are based on time-based, condition-based and periodic policies.

1811.02033 2026-06-04 stat.ML cs.LG cs.NA math.AP math.NA 版本更新

Physics-Informed Generative Adversarial Networks for Stochastic Differential Equations

用于随机微分方程的物理信息生成对抗网络

Liu Yang, Dongkun Zhang, George Em Karniadakis

发表机构 * Division of Applied Mathematics, Brown University, Providence, RI 02912, USA(应用数学系,布朗大学,普罗维登斯,罗德岛州,02912,美国)

AI总结 本文提出了一种新的物理信息生成对抗网络(PI-GANs),通过统一解决随机问题中的正向、逆向和混合问题,利用自动微分将物理定律编码到GAN架构中,展示了PI-GANs在高维随机微分方程求解中的准确性和有效性。

详情
AI中文摘要

我们开发了一类新的物理信息生成对抗网络(PI-GANs),基于数量有限的散点测量,以统一的方式求解正向、逆向和混合随机问题。与仅依赖数据进行训练的标准GANs不同,我们利用自动微分,将以随机微分方程(SDEs)形式表示的控制物理定律编码到GANs的架构中。特别地,我们采用了带梯度惩罚的Wasserstein GANs(WGAN-GP),因为它比原始GANs更稳定。我们首先测试了WGAN-GP对不同相关长度的高斯过程的近似能力,所用数据是在稀疏布置的传感器上同时读取得到的样本实现。即使输入噪声维度与目标随机过程的有效维度不匹配,生成的随机过程对目标过程也有良好的近似。我们还研究了判别器和生成器的过拟合问题,发现除了先前已报道的判别器过拟合之外,生成器也会出现过拟合。随后,我们考虑了椭圆型SDEs的求解,这需要近似三个随机过程,即解、激励项和扩散系数。我们在PI-GANs中使用了三个生成器,其中两个是前馈深度神经网络(DNNs),另一个是由SDE诱导的神经网络。根据数据情况,我们使用一个或多个前馈DNNs作为PI-GANs中的判别器。在此,我们展示了PI-GANs在求解最多30维的SDEs时的准确性和有效性;原则上,只要有更多的传感器数据,PI-GANs可以处理维数非常高的问题,且计算成本呈低阶多项式增长。

英文摘要

We developed a new class of physics-informed generative adversarial networks (PI-GANs) to solve in a unified manner forward, inverse and mixed stochastic problems based on a limited number of scattered measurements. Unlike standard GANs relying only on data for training, here we encoded into the architecture of GANs the governing physical laws in the form of stochastic differential equations (SDEs) using automatic differentiation. In particular, we applied Wasserstein GANs with gradient penalty (WGAN-GP) for its enhanced stability compared to vanilla GANs. We first tested WGAN-GP in approximating Gaussian processes of different correlation lengths based on data realizations collected from simultaneous reads at sparsely placed sensors. We obtained good approximation of the generated stochastic processes to the target ones even for a mismatch between the input noise dimensionality and the effective dimensionality of the target stochastic processes. We also studied the overfitting issue for both the discriminator and generator, and we found that overfitting occurs also in the generator in addition to the discriminator as previously reported. Subsequently, we considered the solution of elliptic SDEs requiring approximations of three stochastic processes, namely the solution, the forcing, and the diffusion coefficient. We used three generators for the PI-GANs, two of them were feed forward deep neural networks (DNNs) while the other one was the neural network induced by the SDE. Depending on the data, we employed one or multiple feed forward DNNs as the discriminators in PI-GANs. Here, we have demonstrated the accuracy and effectiveness of PI-GANs in solving SDEs for up to 30 dimensions, but in principle, PI-GANs could tackle very high dimensional problems given more sensor data with low-polynomial growth in computational cost.

1811.01721 2026-06-04 math.NA cs.LG cs.NA 版本更新

Rethinking floating point for deep learning

重新思考深度学习中的浮点运算

Jeff Johnson

发表机构 * Facebook AI Research(脸书人工智能研究)

AI总结 本文提出了一种结合混合对数乘/线性加、Kulisch累加和渐缩编码的8位对数浮点格式,以提高能效并保持精度;在不重新训练网络的情况下,其在ImageNet上的精度与原始float32 ResNet-50模型相差很小。

详情
AI中文摘要

减少神经网络的硬件开销以实现更快或更低功耗的推理和训练是一个活跃的研究领域。使用整数乘加的均匀量化已得到充分研究,但它需要学习许多量化参数、微调训练或满足其他先决条件。相对于这一基线,很少有工作致力于改进浮点运算;浮点运算仍然能效低下,且字长减小会导致所需动态范围的急剧损失。我们通过一种新的混合对数乘/线性加、Kulisch累加以及来自Gustafson posit格式的渐缩编码来改进浮点运算,使其在28nm ASIC工艺上比等效位宽的整数硬件更节能,同时在8位下保持精度。无需重新训练网络,仅通过就近舍入到偶数(round-to-nearest-even)直接替换所有运算和float32参数,这种开源的8位对数浮点在ImageNet上与原始float32 ResNet-50 CNN模型的精度差距在top-1 0.9%和top-5 0.2%以内。与int8量化不同,它仍然是通用的浮点运算,开箱即可解释。我们的8/38位对数浮点乘加在28nm工艺上完成了综合和功耗分析,其功耗为8/32位整数乘加的0.96倍,面积为1.12倍。在16位下,我们的对数浮点乘加的功耗为IEEE 754 float16融合乘加的0.59倍,面积为0.68倍,同时保持相同的有效数(significand)精度和动态范围,证明其对训练用ASIC同样有用。

英文摘要

Reducing hardware overhead of neural networks for faster or lower power inference and training is an active area of research. Uniform quantization using integer multiply-add has been thoroughly investigated, which requires learning many quantization parameters, fine-tuning training or other prerequisites. Little effort is made to improve floating point relative to this baseline; it remains energy inefficient, and word size reduction yields drastic loss in needed dynamic range. We improve floating point to be more energy efficient than equivalent bit width integer hardware on a 28 nm ASIC process while retaining accuracy in 8 bits with a novel hybrid log multiply/linear add, Kulisch accumulation and tapered encodings from Gustafson's posit format. With no network retraining, and drop-in replacement of all math and float32 parameters via round-to-nearest-even only, this open-sourced 8-bit log float is within 0.9% top-1 and 0.2% top-5 accuracy of the original float32 ResNet-50 CNN model on ImageNet. Unlike int8 quantization, it is still a general purpose floating point arithmetic, interpretable out-of-the-box. Our 8/38-bit log float multiply-add is synthesized and power profiled at 28 nm at 0.96x the power and 1.12x the area of 8/32-bit integer multiply-add. In 16 bits, our log float multiply-add is 0.59x the power and 0.68x the area of IEEE 754 float16 fused multiply-add, maintaining the same significand precision and dynamic range, proving useful for training ASICs as well.

1806.06498 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Conditional Affordance Learning for Driving in Urban Environments

面向城市环境驾驶的条件可供性(affordance)学习

Axel Sauer, Nikolay Savinov, Andreas Geiger

发表机构 * Chair of Robotics Science and System Intelligence, Technical University of Munich(慕尼黑技术大学机器人科学与系统智能系)

AI总结 本文提出了一种直接感知方法,通过将视频输入映射到适合复杂城市环境自主导航的中间表示,结合高层方向输入,实现了比现有强化学习和条件模仿学习方法更高的目标导向导航性能,并首次通过图像级标签处理交通灯和速度标志,显著减少模拟中的交通事故。

Comments Accepted for Conference on Robot Learning (CoRL) 2018

详情
AI中文摘要

大多数现有的自动驾驶方法分为两类:构建详尽环境模型的模块化流水线,以及直接将图像映射到控制输出的模仿学习方法。最近提出的第三种范式,即直接感知,旨在通过神经网络学习合适的低维中间表示来结合两者的优点。然而,现有的直接感知方法仅限于简单的高速公路场景,缺乏通过交叉路口、在交通灯前停车或遵守限速的能力。在本文中,我们提出了一种直接感知方法,在给定高层方向输入的情况下,将视频输入映射到适合在复杂城市环境中自主导航的中间表示。与最先进的强化学习和条件模仿学习方法相比,在具有挑战性的CARLA模拟基准上,我们在目标导向导航方面实现了高达68%的提升。此外,我们的方法首次仅使用图像级标签来处理交通灯和限速标志,并实现了平稳的跟车,从而显著减少了模拟中的交通事故。

英文摘要

Most existing approaches to autonomous driving fall into one of two categories: modular pipelines, that build an extensive model of the environment, and imitation learning approaches, that map images directly to control outputs. A recently proposed third paradigm, direct perception, aims to combine the advantages of both by using a neural network to learn appropriate low-dimensional intermediate representations. However, existing direct perception approaches are restricted to simple highway situations, lacking the ability to navigate intersections, stop at traffic lights or respect speed limits. In this work, we propose a direct perception approach which maps video input to intermediate representations suitable for autonomous navigation in complex urban environments given high-level directional inputs. Compared to state-of-the-art reinforcement and conditional imitation learning approaches, we achieve an improvement of up to 68% in goal-directed navigation on the challenging CARLA simulation benchmark. In addition, our approach is the first to handle traffic lights and speed signs by using image-level labels only, as well as smooth car-following, resulting in a significant reduction of traffic accidents in simulation.

1605.03364 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Active Uncertainty Calibration in Bayesian ODE Solvers

贝叶斯常微分方程求解器中的主动不确定性校准

Hans Kersting, Philipp Hennig

发表机构 * Max-Planck-Institute for Intelligent Systems(马克斯·普朗克智能系统研究所)

AI总结 本文研究了如何在贝叶斯常微分方程求解器中平衡计算成本与概率校准,提出了一种基于滤波的方法Bayesian Quadrature filtering (BQF),通过收集多个梯度评估来主动学习梯度测量中的不精确性。

Comments 10 pages, 3 figures, published at UAI 2016. Changes for Version 3: fixed minor index mistake in equation (14) (q-1-i instead of q+1-i on top of the product)

详情
Journal ref
Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence (UAI2016) 309--3018
AI中文摘要

在统计学和机器学习中,人们对返回概率测度而非点估计的常微分方程(ODEs)求解器重新产生了兴趣。最近,Conrad等人引入了一类基于采样的方法,这些方法在特定意义上是“校准良好的”(well-calibrated)。但是,这些方法的计算成本显著高于经典方法。另一方面,Schober等人指出经典Runge-Kutta ODE求解器与高斯滤波器之间存在精确的联系,这种联系只能给出粗略的概率校准,但额外计算开销可忽略不计。通过将ODE的求解表述为线性高斯SDE中的近似推断,我们研究了一系列概率ODE求解器,它们在计算成本和概率校准之间取得了折中,并指出不准确的梯度测量是不确定性的关键来源。我们提出了一种新的基于滤波的方法Bayesian Quadrature filtering (BQF),该方法利用贝叶斯求积(Bayesian quadrature),通过收集多个梯度评估来主动学习梯度测量中的不精确性。

英文摘要

There is resurging interest, in statistics and machine learning, in solvers for ordinary differential equations (ODEs) that return probability measures instead of point estimates. Recently, Conrad et al. introduced a sampling-based class of methods that are 'well-calibrated' in a specific sense. But the computational cost of these methods is significantly above that of classic methods. On the other hand, Schober et al. pointed out a precise connection between classic Runge-Kutta ODE solvers and Gaussian filters, which gives only a rough probabilistic calibration, but at negligible cost overhead. By formulating the solution of ODEs as approximate inference in linear Gaussian SDEs, we investigate a range of probabilistic ODE solvers, that bridge the trade-off between computational cost and probabilistic calibration, and identify the inaccurate gradient measurement as the crucial source of uncertainty. We propose the novel filtering-based method Bayesian Quadrature filtering (BQF) which uses Bayesian quadrature to actively learn the imprecision in the gradient measurement by collecting multiple gradient evaluations.

1811.00961 2026-06-04 math.DS cs.LG cs.SY eess.SY math.OC 版本更新

Discovering conservation laws from data for control

从数据中发现用于控制的守恒定律

Eurika Kaiser, J. Nathan Kutz, Steven L. Brunton

AI总结 本文提出了一种数据驱动的方法,基于Koopman理论,从数据中发现守恒量,通过回归和幂级数展开识别守恒量,并利用这些内在坐标开发模型预测控制器来跟踪给定的参考值。

Comments 7 pages, 2 figures, 57th IEEE Conference on Decision and Control (CDC 2018)

详情
AI中文摘要

守恒量,即运动常数,对于刻画科学和工程中的许多动力系统至关重要。这些量与底层对称性相关,提供了关于物理定律的基本知识,描述了系统的演化,并使系统约简成为可能。在本工作中,我们基于Koopman理论提出了一种用于发现守恒量的数据驱动架构。Koopman算子已成为非线性动力学的一种有原则的线性嵌入,其特征函数建立了内在坐标,动力学沿这些坐标表现为线性。有趣的是,与零特征值相关联的Koopman算子特征函数对应于底层系统的守恒量。在本文中,我们表明这些不变量可以基于Koopman算子的无穷小生成元,通过数据驱动回归和幂级数展开来识别。我们进一步建立了Koopman框架、守恒量和Lie-Poisson括号之间的联系。这种发现守恒量的数据驱动方法在三维刚体方程上进行了演示,我们同时发现了总能量和角动量,并利用这些内在坐标开发了模型预测控制器以跟踪给定的参考值。

英文摘要

Conserved quantities, i.e. constants of motion, are critical for characterizing many dynamical systems in science and engineering. These quantities are related to underlying symmetries and they provide fundamental knowledge about physical laws, describe the evolution of the system, and enable system reduction. In this work, we formulate a data-driven architecture for discovering conserved quantities based on Koopman theory. The Koopman operator has emerged as a principled linear embedding of nonlinear dynamics, and its eigenfunctions establish intrinsic coordinates along which the dynamics behave linearly. Interestingly, eigenfunctions of the Koopman operator associated with vanishing eigenvalues correspond to conserved quantities of the underlying system. In this paper, we show that these invariants may be identified with data-driven regression and power series expansions, based on the infinitesimal generator of the Koopman operator. We further establish a connection between the Koopman framework, conserved quantities, and the Lie-Poisson bracket. This data-driven method for discovering conserved quantities is demonstrated on the three-dimensional rigid body equations, where we simultaneously discover the total energy and angular momentum and use these intrinsic coordinates to develop a model predictive controller to track a given reference value.
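
The idea that conserved quantities are functions annihilated by the Koopman generator can be sketched as a null-space regression. The example below is a simplified illustration, not the authors' pipeline: it assumes a library of only three monomials (the squared angular velocities), uses the exact Euler rigid-body equations in place of measured derivatives, and all function names are hypothetical. A candidate φ(ω) = Σ ξ_i ω_i² is conserved when its time derivative vanishes on every sample, so the coefficient vectors ξ span the null space of the library's time-derivative matrix.

```python
import numpy as np


def rigid_body_rhs(w, inertia):
    """Euler equations for a torque-free rigid body (angular velocity rates)."""
    i1, i2, i3 = inertia
    w1, w2, w3 = w
    return np.array([
        (i2 - i3) / i1 * w2 * w3,
        (i3 - i1) / i2 * w3 * w1,
        (i1 - i2) / i3 * w1 * w2,
    ])


def conserved_quadratic_forms(inertia, n_samples=200, seed=0, tol=1e-8):
    """Return an orthonormal basis of coefficient vectors xi such that
    xi[0]*w1^2 + xi[1]*w2^2 + xi[2]*w3^2 has zero time derivative."""
    rng = np.random.default_rng(seed)
    states = rng.normal(size=(n_samples, 3))
    rates = np.array([rigid_body_rhs(w, inertia) for w in states])
    # d/dt (w_i^2) = 2 * w_i * dw_i/dt, evaluated at every sampled state.
    d_library = 2.0 * states * rates
    _, sing, vt = np.linalg.svd(d_library, full_matrices=False)
    # Right singular vectors with (numerically) zero singular value.
    return vt[sing < tol * sing[0]]


basis = conserved_quadratic_forms((1.0, 2.0, 3.0))
print(basis.shape)
```

For this library the null space is two-dimensional, and it contains the coefficient vectors of the kinetic energy (proportional to the moments of inertia) and of the squared angular momentum (proportional to their squares).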

1811.00641 2026-06-04 cs.LG cs.CL cs.NA math.NA stat.ML 版本更新

Online Embedding Compression for Text Classification using Low Rank Matrix Factorization

基于低秩矩阵分解的文本分类在线嵌入压缩

Anish Acharya, Rahul Goel, Angeliki Metallinou, Inderjit Dhillon

发表机构 * Amazon Alexa AI(亚马逊Alexa人工智能) Amazon Search Technologies(亚马逊搜索技术) University of Texas at Austin(德克萨斯大学奥斯汀分校)

AI总结 本文提出了一种在线词嵌入压缩方法,利用低秩矩阵分解在训练过程中压缩词嵌入层,从而减少NLP模型的内存瓶颈,同时在下游任务中通过重新训练恢复精度,实验证明该方法在句子分类任务中实现了90%的压缩率,并优于固定点量化等其他方法。

Comments Accepted in Thirty-Third AAAI Conference on Artificial Intelligence (AAAI 2019)

详情
AI中文摘要

深度学习模型已成为自然语言处理(NLP)任务的最新技术,但将其部署到生产系统中却面临显著的内存限制。现有的压缩方法要么有损,要么引入显著的延迟。我们提出了一种压缩方法,利用低秩矩阵分解在训练过程中压缩词嵌入层,该层是大多数NLP模型的主要内存瓶颈。我们的模型在训练、压缩后,再在下游任务上重新训练以恢复精度,同时保持减小的尺寸。实验证明,所提出的方法在句子分类任务中可实现90%的压缩,对精度影响极小,并优于固定点量化或其他方法如离线词嵌入压缩。我们还通过FLOP计算分析了我们方法的推理时间和存储空间,显示我们可以通过可配置的比率压缩DNN模型,并在不引入额外延迟的情况下恢复精度损失。最后,我们引入了一种新的学习率调度方法,即周期性退火学习率(CALR),并通过句子分类基准实验证明其优于其他流行的自适应学习率算法。

英文摘要

Deep learning models have become state of the art for natural language processing (NLP) tasks; however, deploying these models in production systems poses significant memory constraints. Existing compression methods are either lossy or introduce significant latency. We propose a compression method that leverages low rank matrix factorization during training, to compress the word embedding layer which represents the size bottleneck for most NLP models. Our models are trained, compressed and then further re-trained on the downstream task to recover accuracy while maintaining the reduced size. Empirically, we show that the proposed method can achieve 90% compression with minimal impact in accuracy for sentence classification tasks, and outperforms alternative methods like fixed-point quantization or offline word embedding compression. We also analyze the inference time and storage space for our method through FLOP calculations, showing that we can compress DNN models by a configurable ratio and regain accuracy loss without introducing additional latency compared to fixed point quantization. Finally, we introduce a novel learning rate schedule, the Cyclically Annealed Learning Rate (CALR), which we empirically demonstrate to outperform other popular adaptive learning rate algorithms on a sentence classification benchmark.
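
The compression step described above can be sketched with a truncated SVD. This is a minimal illustration under stated assumptions, not the paper's training procedure: the function names are hypothetical, the factorization is applied once to a fixed matrix, and the retraining stage that recovers accuracy is omitted. A V×d embedding matrix is replaced by a V×r factor and an r×d factor, so storage drops from V·d to r·(V+d) parameters.

```python
import numpy as np


def factorize_embedding(embedding, rank):
    """Split a (vocab, dim) matrix into (vocab, rank) and (rank, dim) factors
    using the best rank-`rank` approximation from the SVD."""
    u, s, vt = np.linalg.svd(embedding, full_matrices=False)
    left = u[:, :rank] * s[:rank]   # absorb singular values into the left factor
    right = vt[:rank]
    return left, right


def compression_ratio(vocab, dim, rank):
    """Fraction of parameters removed by the factorization."""
    return 1.0 - rank * (vocab + dim) / (vocab * dim)


# Toy embedding that is exactly rank 4, so the factorization loses nothing.
rng = np.random.default_rng(0)
toy = rng.normal(size=(50, 4)) @ rng.normal(size=(4, 20))
left, right = factorize_embedding(toy, rank=4)
print(np.allclose(left @ right, toy), compression_ratio(50, 20, 4))
```

Choosing the rank sets the compression ratio directly, which matches the abstract's description of a configurable ratio; on real embeddings the truncation is lossy, which is why the method retrains afterwards.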

1810.08907 2026-06-04 math.OC cs.LG cs.NA math.CA math.NA stat.ML 版本更新

Understanding the Acceleration Phenomenon via High-Resolution Differential Equations

通过高分辨率微分方程理解加速现象

Bin Shi, Simon S. Du, Michael I. Jordan, Weijie J. Su

发表机构 * Florida International University(佛罗里达国际大学) Carnegie Mellon University(卡内基梅隆大学) University of California, Berkeley(加州大学伯克利分校) University of Pennsylvania(宾夕法尼亚大学)

AI总结 本文通过高分辨率微分方程研究优化算法的加速现象,提出了一种新的极限过程,能够区分Nesterov加速梯度法和Polyak重球法,并揭示了NAG-C在非强凸函数下的收敛特性。

Comments 82 pages, 11 figures

详情
AI中文摘要

基于梯度的优化算法可以从极限常微分方程(ODEs)的角度进行研究。由于现有ODEs无法区分两种本质不同的算法,即用于强凸函数的Nesterov加速梯度法(NAG-SC)和Polyak重球法,我们研究了一种替代的极限过程,以获得高分辨率的ODEs。我们证明这些ODEs允许建立一个通用的Lyapunov函数框架,用于连续和离散时间下的收敛分析。我们还证明这些ODEs是底层算法更准确的替代模型;特别是,它们不仅能区分NAG-SC和Polyak重球法,还能识别出一个我们称为“梯度修正”的项,该项存在于NAG-SC中但不存在于重球法中,并导致了两种方法收敛性质的定性差异。我们还利用高分辨率ODE框架研究用于(非强)凸函数的Nesterov加速梯度法,揭示了一个此前未知的结果:NAG-C以逆三次速率最小化梯度范数的平方。最后,通过修改NAG-C的高分辨率ODE,我们获得了一族新的优化方法,这些方法被证明在光滑凸函数上保持NAG-C的加速收敛率。

英文摘要

Gradient-based optimization algorithms can be studied from the perspective of limiting ordinary differential equations (ODEs). Motivated by the fact that existing ODEs do not distinguish between two fundamentally different algorithms---Nesterov's accelerated gradient method for strongly convex functions (NAG-SC) and Polyak's heavy-ball method---we study an alternative limiting process that yields high-resolution ODEs. We show that these ODEs permit a general Lyapunov function framework for the analysis of convergence in both continuous and discrete time. We also show that these ODEs are more accurate surrogates for the underlying algorithms; in particular, they not only distinguish between NAG-SC and Polyak's heavy-ball method, but they allow the identification of a term that we refer to as "gradient correction" that is present in NAG-SC but not in the heavy-ball method and is responsible for the qualitative difference in convergence of the two methods. We also use the high-resolution ODE framework to study Nesterov's accelerated gradient method for (non-strongly) convex functions, uncovering a hitherto unknown result---that NAG-C minimizes the squared gradient norm at an inverse cubic rate. Finally, by modifying the high-resolution ODE of NAG-C, we obtain a family of new optimization methods that are shown to maintain the accelerated convergence rates of NAG-C for smooth convex functions.
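
The difference between the two methods is easiest to see when both are written as a single-sequence recursion. The sketch below assumes the standard forms with momentum coefficient α = (1 − √(μs)) / (1 + √(μs)) and step size s; the function names and the initialization `x_prev = x0` are choices made for this example. NAG-SC is the heavy-ball update plus the extra term −α·s·(∇f(x_k) − ∇f(x_{k−1})), which is the gradient correction.

```python
import numpy as np


def momentum_coefficient(mu, s):
    """alpha = (1 - sqrt(mu*s)) / (1 + sqrt(mu*s))."""
    r = np.sqrt(mu * s)
    return (1.0 - r) / (1.0 + r)


def heavy_ball(grad, x0, mu, s, n_iter):
    """Polyak's heavy-ball method: momentum plus a plain gradient step."""
    alpha = momentum_coefficient(mu, s)
    x_prev, x = x0.copy(), x0.copy()
    for _ in range(n_iter):
        x_next = x + alpha * (x - x_prev) - s * grad(x)
        x_prev, x = x, x_next
    return x


def nag_sc(grad, x0, mu, s, n_iter):
    """NAG-SC in single-sequence form: heavy ball plus a gradient correction."""
    alpha = momentum_coefficient(mu, s)
    x_prev, x = x0.copy(), x0.copy()
    g_prev = grad(x_prev)
    for _ in range(n_iter):
        g = grad(x)
        correction = alpha * s * (g - g_prev)  # absent in heavy ball
        x_next = x + alpha * (x - x_prev) - s * g - correction
        x_prev, x, g_prev = x, x_next, g
    return x


# Strongly convex quadratic with mu = 1 and L = 100 (illustrative values).
curvature = np.array([1.0, 100.0])
quad_grad = lambda x: curvature * x
start = np.array([1.0, 1.0])
print(heavy_ball(quad_grad, start, mu=1.0, s=0.01, n_iter=200))
print(nag_sc(quad_grad, start, mu=1.0, s=0.01, n_iter=200))
```

The correction term is of order s in the difference of successive gradients, which is why a low-resolution limiting ODE cannot see it and a high-resolution ODE can.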

1810.13084 2026-06-04 math.OC cs.DC cs.LG cs.MA cs.SY eess.SY 版本更新

Provably Accelerated Randomized Gossip Algorithms

可证明加速的随机传话算法

Nicolas Loizou, Michael Rabbat, Peter Richtárik

发表机构 * The University of Edinburgh, UK(爱丁堡大学,英国) Facebook AI Research, Montreal(脸书人工智能研究院,蒙特利尔) KAUST, KSA(阿卜杜拉国王科技大学,沙特阿拉伯)

AI总结 本文提出了一种可证明加速的随机传话(gossip)算法,用于解决平均一致性问题。该算法受到最近开发的随机Kaczmarz方法加速变体的启发,随机Kaczmarz方法是求解线性系统的一种流行方法。在每次传话迭代中,网络中的所有节点都会更新它们的值,但只有一对节点交换私有信息。文中还给出了在常见无线传感器网络上的数值实验,展示了所提协议的优势。

详情
AI中文摘要

在本文中,我们提出了新的可证明加速的传话(gossip)算法,用于解决平均一致性问题。所提出的协议受到最近开发的随机Kaczmarz方法加速变体的启发,随机Kaczmarz方法是求解线性系统的一种流行方法。在每次传话迭代中,网络中的所有节点都会更新它们的值,但只有一对节点交换它们的私有信息。文中还给出了在常见无线传感器网络上的数值实验,展示了所提协议的优势。

英文摘要

In this work we present novel provably accelerated gossip algorithms for solving the average consensus problem. The proposed protocols are inspired by the recently developed accelerated variants of the randomized Kaczmarz method - a popular method for solving linear systems. In each gossip iteration all nodes of the network update their values but only a pair of them exchange their private information. Numerical experiments on popular wireless sensor networks showing the benefits of our protocols are also presented.
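
For context, the baseline that accelerated gossip protocols build on is plain randomized pairwise gossip. The sketch below shows that baseline only, not the accelerated variants proposed in the paper; the ring topology, the function name, and the seed are assumptions made for illustration. At each iteration one edge is sampled uniformly and its two endpoints replace their values with their average, which preserves the network mean.

```python
import random


def randomized_gossip(values, edges, n_iter, seed=0):
    """Plain randomized pairwise gossip for average consensus.

    values: initial private value of each node
    edges:  list of (i, j) pairs that are allowed to communicate
    """
    rng = random.Random(seed)
    x = list(values)
    for _ in range(n_iter):
        i, j = rng.choice(edges)
        avg = 0.5 * (x[i] + x[j])
        x[i] = x[j] = avg  # only this pair exchanges information
    return x


# Ring network with 8 nodes; node k starts with private value k.
n = 8
ring = [(k, (k + 1) % n) for k in range(n)]
final = randomized_gossip([float(k) for k in range(n)], ring, n_iter=5000)
print(final)
```

In this baseline only the sampled pair changes its values. The accelerated protocols in the abstract differ in that every node updates at each iteration (a momentum-style step) while still only one pair communicates.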

1810.12429 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Breaking the Curse of Horizon: Infinite-Horizon Off-Policy Estimation

打破时域诅咒:无限时域异策略估计

Qiang Liu, Lihong Li, Ziyang Tang, Dengyong Zhou

发表机构 * The University of Texas at Austin(德克萨斯大学奥斯汀分校) Google Brain(谷歌大脑)

AI总结 本文提出了一种新的异策略(off-policy)估计方法,通过直接在平稳状态访问分布上应用重要性采样来避免现有估计器中方差爆炸的问题,核心贡献是提出了一种估计两个平稳分布密度比的新方法,并推导了RKHS情况下的闭式解。

Comments 21 pages, 5 figures, NIPS 2018 (spotlight)

详情
AI中文摘要

我们考虑异策略(off-policy)估计问题,即利用由不同的行为策略收集的样本来估计目标策略的期望奖励。重要性采样(IS)一直是推导(近似)无偏估计器的关键技术,但已知在长时域问题中会有过高的方差。在无限时域问题的极端情况下,基于IS的估计器的方差甚至可能无界。在本文中,我们提出了一种新的异策略估计方法,直接在平稳状态访问分布上应用重要性采样,以避免现有估计器所面临的方差爆炸问题。我们的关键贡献是提出了一种估计两个平稳分布密度比的新方法,所用轨迹仅从行为分布中采样。我们为该估计问题构造了一个极小极大(mini-max)损失函数,并推导了RKHS情形下的闭式解。我们通过理论和实证分析支持我们的方法。

英文摘要

We consider the off-policy estimation problem of estimating the expected reward of a target policy using samples collected by a different behavior policy. Importance sampling (IS) has been a key technique to derive (nearly) unbiased estimators, but is known to suffer from an excessively high variance in long-horizon problems. In the extreme case of infinite-horizon problems, the variance of an IS-based estimator may even be unbounded. In this paper, we propose a new off-policy estimation method that applies IS directly on the stationary state-visitation distributions to avoid the exploding variance issue faced by existing estimators. Our key contribution is a novel approach to estimating the density ratio of two stationary distributions, with trajectories sampled from only the behavior distribution. We develop a mini-max loss function for the estimation problem, and derive a closed-form solution for the case of RKHS. We support our method with both theoretical and empirical analyses.

1806.03085 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

A Stein variational Newton method

一种Stein变分牛顿方法

Gianluca Detommaso, Tiangang Cui, Alessio Spantini, Youssef Marzouk, Robert Scheichl

发表机构 * University of Bath(巴斯大学) The Alan Turing Institute(艾伦·图灵研究所) Monash University(莫纳什大学) Massachusetts Institute of Technology(麻省理工学院) Heidelberg University(海德堡大学)

AI总结 本文提出了一种基于Stein变分梯度下降(SVGD)的改进方法,通过引入二阶信息加速并推广了该算法,实现了函数空间中的牛顿迭代,并展示了在多个测试案例中显著的计算效率提升。

Comments 18 pages, 7 figures

详情
Journal ref
NIPS 2018
AI中文摘要

Stein变分梯度下降(SVGD)最近被提出作为一种通用的非参数变分推断算法 [Liu & Wang, NIPS 2016]:它通过在再生核希尔伯特空间上实现一种函数梯度下降的形式来最小化目标分布与其近似分布之间的Kullback-Leibler散度。在本文中,我们通过引入二阶信息来加速和推广SVGD算法,从而在函数空间中近似出一种牛顿迭代。我们还展示了二阶信息如何导致更有效的核选择。我们在多个测试案例中观察到相对于原始SVGD算法有显著的计算效率提升。

英文摘要

Stein variational gradient descent (SVGD) was recently proposed as a general purpose nonparametric variational inference algorithm [Liu & Wang, NIPS 2016]: it minimizes the Kullback-Leibler divergence between the target distribution and its approximation by implementing a form of functional gradient descent on a reproducing kernel Hilbert space. In this paper, we accelerate and generalize the SVGD algorithm by including second-order information, thereby approximating a Newton-like iteration in function space. We also show how second-order information can lead to more effective choices of kernel. We observe significant computational gains over the original SVGD algorithm in multiple test cases.
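
The functional gradient step that the paper accelerates can be written in a few lines for a one-dimensional target. The sketch below is first-order SVGD as described by Liu & Wang, not the Newton variant proposed here; the fixed bandwidth `h`, the step size, the Gaussian target N(2, 1), and the function name are all assumptions for illustration. Each particle moves along the kernel-weighted average of the scores (attraction toward high density) plus the kernel gradient (repulsion between particles).

```python
import numpy as np


def svgd_step(x, score, step=0.1, h=1.0):
    """One SVGD update for 1-D particles with an RBF kernel exp(-d^2 / h).

    x:     array of particle positions, shape (n,)
    score: function returning d/dx log p(x) at each particle
    """
    diff = x[:, None] - x[None, :]        # diff[j, i] = x_j - x_i
    k = np.exp(-diff ** 2 / h)            # k[j, i] = k(x_j, x_i)
    grad_k = -2.0 / h * diff * k          # derivative of k(x_j, x_i) w.r.t. x_j
    # phi(x_i) = mean over j of [k(x_j, x_i) * score(x_j) + grad_k(x_j, x_i)]
    phi = (k * score(x)[:, None] + grad_k).mean(axis=0)
    return x + step * phi


# Target N(2, 1): the score is -(x - 2). Particles start around 0.
rng = np.random.default_rng(0)
particles = rng.normal(size=50)
for _ in range(2000):
    particles = svgd_step(particles, lambda x: -(x - 2.0))
print(particles.mean(), particles.std())
```

The update uses only first-order information about the target, which is the part the paper replaces with a Newton-like step built from second-order information.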

1810.11505 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Stability-certified reinforcement learning: A control-theoretic perspective

具有稳定性认证的强化学习:控制论视角

Ming Jin, Javad Lavaei

发表机构 * Department of Industrial Engineering and Operations Research, University of California, Berkeley(工业工程与运筹学系,加州大学伯克利分校) Department of Industrial Engineering and Operations Research, and the Tsinghua-Berkeley Shenzhen Institute, University of California, Berkeley(工业工程与运筹学系,以及清华-伯克利深圳学院,加州大学伯克利分校)

AI总结 本文从控制论角度研究强化学习策略与非线性动力系统连接时的稳定性认证问题,提出了一种基于半定规划可行性的方法,通过调节策略的输入输出梯度来获得鲁棒稳定性保证,并通过实验验证了该方法在多飞行编队和电力系统频率调节任务中的有效性。

详情
AI中文摘要

我们研究了强化学习策略与非线性动力系统互联时的稳定性认证这一重要问题。我们证明,通过调节策略的输入输出梯度,可以基于所提出的半定规划可行性问题获得强鲁棒稳定性保证。该方法能够利用问题特定的结构来认证一大类镇定控制器;此外,我们分析并确立了其(非)保守性。在两个分散式控制任务(即多飞行器编队和电力系统频率调节)上的实证评估表明,强化学习智能体在经过稳定性认证的参数空间内能够取得高性能,并且在长期内表现出稳定的学习行为。

英文摘要

We investigate the important problem of certifying stability of reinforcement learning policies when interconnected with nonlinear dynamical systems. We show that by regulating the input-output gradients of policies, strong guarantees of robust stability can be obtained based on a proposed semidefinite programming feasibility problem. The method is able to certify a large set of stabilizing controllers by exploiting problem-specific structures; furthermore, we analyze and establish its (non)conservatism. Empirical evaluations on two decentralized control tasks, namely multi-flight formation and power system frequency regulation, demonstrate that the reinforcement learning agents can have high performance within the stability-certified parameter space, and also exhibit stable learning behaviors in the long run.

1804.04310 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Exact Reconstruction of Euclidean Distance Geometry Problem Using Low-rank Matrix Completion

利用低秩矩阵补全进行欧几里得距离几何问题的精确重构

Abiy Tasissa, Rongjie Lai

发表机构 * Department of Mathematics, Rensselaer Polytechnic Institute(罗切斯特理工学院数学系)

AI总结 本文提出了一种利用低秩矩阵补全方法来解决欧几里得距离几何问题的框架,通过引入双基方法理论分析重构问题,并在不同三维数据和蛋白质分子上的数值测试验证了算法的有效性和效率。

Comments 28 pages, revised proof of Theorem 1, added proof of form of $H^{-1}$, presentation improved

详情
AI中文摘要

欧几里得距离几何问题出现在许多应用中,从计算化学中确定分子构象到传感器网络中的定位。当距离信息不完整时,该问题可以表述为核范数最小化问题。在本文中,该最小化问题被重新表述为一个秩为r的低秩Gram矩阵相对于合适基底的矩阵补全问题。众所周知的受限等距性质在此场景中无法满足。为此,本文引入了对偶基方法来对重构问题进行理论分析。如果Gram矩阵满足某些参数为ν的相干性条件,则主要结果表明,可以从O(nrνlog²(n))个均匀随机样本中以很高的概率恢复n个点的底层构型。在计算上,本文设计了简单且快速的算法来求解欧几里得距离几何问题。在不同三维数据和蛋白质分子上的数值测试验证了所提算法的有效性和效率。

英文摘要

The Euclidean distance geometry problem arises in a wide variety of applications, from determining molecular conformations in computational chemistry to localization in sensor networks. When the distance information is incomplete, the problem can be formulated as a nuclear norm minimization problem. In this paper, this minimization program is recast as a matrix completion problem of a low-rank $r$ Gram matrix with respect to a suitable basis. The well known restricted isometry property cannot be satisfied in this scenario. Instead, a dual basis approach is introduced to theoretically analyze the reconstruction problem. If the Gram matrix satisfies certain coherence conditions with parameter $ν$, the main result shows that the underlying configuration of $n$ points can be recovered with very high probability from $O(nrν\log^{2}(n))$ uniformly random samples. Computationally, simple and fast algorithms are designed to solve the Euclidean distance geometry problem. Numerical tests on different three dimensional data and protein molecules validate the effectiveness and efficiency of the proposed algorithms.
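
The link between distances and a low-rank Gram matrix can be made concrete in the easy case where every pairwise distance is known. The sketch below is classical multidimensional scaling, not the paper's completion algorithm for incomplete distances; the function names are hypothetical. Double-centering the squared-distance matrix gives the Gram matrix of the centered points, whose rank is at most the embedding dimension r, and an eigendecomposition recovers the configuration up to a rigid motion.

```python
import numpy as np


def gram_from_squared_distances(d2):
    """Gram matrix of the centered points: G = -1/2 * J * D2 * J,
    with J = I - (1/n) * ones the centering matrix."""
    n = d2.shape[0]
    centering = np.eye(n) - np.ones((n, n)) / n
    return -0.5 * centering @ d2 @ centering


def points_from_gram(gram, r):
    """Recover an n x r configuration from a rank-r Gram matrix."""
    w, v = np.linalg.eigh(gram)           # eigenvalues in ascending order
    w_top = np.clip(w[-r:], 0.0, None)    # guard against tiny negative values
    return v[:, -r:] * np.sqrt(w_top)


def squared_distances(points):
    """All pairwise squared Euclidean distances."""
    sq = (points ** 2).sum(axis=1)
    return sq[:, None] + sq[None, :] - 2.0 * points @ points.T


rng = np.random.default_rng(0)
truth = rng.normal(size=(10, 3))
d2 = squared_distances(truth)
recovered = points_from_gram(gram_from_squared_distances(d2), r=3)
print(np.allclose(squared_distances(recovered), d2))
```

The paper's setting starts from only a random subset of the entries of the distance matrix, so the Gram matrix must first be completed; the recovery step from a completed Gram matrix is the one sketched here.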

1810.11178 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Using solar and load predictions in battery scheduling at the residential level

在住宅层面利用太阳能和负载预测进行电池调度

Richard Bean, Hina Khan

发表机构 * Redback Technologies Brisbane, Australia(红背技术布里斯班,澳大利亚) School of ITEE The University of Queensland, Australia(信息技术工程学院昆士兰大学,澳大利亚)

AI总结 本文提出了一种新的电池调度算法,通过预测负载和太阳能发电量来优化住宅用户的电力成本;根据电价不同,该算法相比自动模式可节省1%至10%的电费。

Comments This paper was presented at the 8th Solar Integration Workshop and published in the workshop's proceedings

详情
AI中文摘要

智能太阳能逆变器可用于存储、监控和管理家庭的太阳能能源。我们描述了一种带有电池的智能太阳能逆变器系统,该系统可以自动运行或通过网络接收命令以在给定速率充电和放电。为了使电池存储在财务上可行并对消费者有利,可以采用有效的电池调度算法。特别是当地区内实施分时电价时,在某些情况下可以调度电池以节省个人客户的费用,相比自动模式。因此,本文提出了并评估了为太阳能能源住宅消费者设计的新电池调度算法。所提出的电池调度算法优化了住宅用户的下一24小时电力成本。通过根据负载和太阳能发电量的预测来控制电池存储系统的充放电,实现成本最小化。调度问题被公式化为一个线性规划问题。我们使用几个月的每小时负载和光伏数据对83个逆变器进行了计算机模拟。模拟结果表明,影响优化可行性的关键因素是电价和每个逆变器的光伏与负载比率。根据电价,可以预期比自动方法节省1%至10%。本文中所用的预测方法也显示出优于基本“持久性”预测方法。我们还检查了提高预测准确性和优化有效性的方法。

英文摘要

Smart solar inverters can be used to store, monitor and manage a home's solar energy. We describe a smart solar inverter system with a battery which can either operate in an automatic mode or receive commands over a network to charge and discharge at a given rate. In order to make battery storage financially viable and advantageous to consumers, effective battery scheduling algorithms can be employed. In particular, when time-of-use tariffs are in effect in the region of the inverter, it is possible in some cases to schedule the battery to save money for the individual customer, compared to the "automatic" mode. Hence, this paper presents and evaluates the performance of a novel battery scheduling algorithm for residential consumers of solar energy. The proposed battery scheduling algorithm optimizes the cost of electricity over the next 24 hours for residential consumers. The cost minimization is realized by controlling the charging and discharging of the battery storage system based on predictions of load and solar power generation. The scheduling problem is formulated as a linear programming problem. We performed computer simulations over 83 inverters using several months of hourly load and PV data. The simulation results indicate that the key factors affecting the viability of optimization are the tariffs and the PV-to-load ratio at each inverter. Depending on the tariff, savings of between 1% and 10% can be expected over the automatic approach. The prediction approach used in this paper is also shown to outperform basic "persistence" forecasting approaches. We have also examined approaches for improving the prediction accuracy and optimization effectiveness.

1810.09675 2026-06-04 math.NA cs.LG cs.NA 版本更新

SwitchNet: a neural network model for forward and inverse scattering problems

SwitchNet: 一种用于正反散射问题的神经网络模型

Yuehaw Khoo, Lexing Ying

Affiliations * Department of Mathematics, Stanford University, Stanford, CA 94305; Department of Mathematics and ICME, Stanford University, Stanford, CA 94305; Facebook AI Research, Menlo Park, CA 94025

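AI summary SwitchNet learns maps between scatterers and the scattered field to solve wave-equation-based forward and inverse scattering problems; it exploits the low-rank structure of the problem and a sparsely connected switching layer to reduce the parameter count and ease training.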

Comments 19 pages, 7 figures

详情
AI中文摘要

我们提出了一种新颖的神经网络架构SwitchNet,用于通过建立散射体与散射场(反之亦然)之间的映射来解决基于波方程的反散射问题。使用神经网络解决此问题的主要困难在于,散射体对散射波场有全局影响,使得通常具有局部连接的卷积神经网络不适用。虽然可以使用全连接网络来处理此类问题,但参数数量与输入和输出数据的大小呈二次增长。通过利用散射问题固有的低秩结构,并引入一种具有稀疏连接的新型切换层,SwitchNet架构使用了更少的参数,并促进了训练过程。数值实验显示在学习散射体与散射波场之间的正反映射方面具有令人鼓舞的准确性。

英文摘要

We propose a novel neural network architecture, SwitchNet, for solving wave-equation-based inverse scattering problems by providing maps between the scatterers and the scattered field (and vice versa). The main difficulty of using a neural network for this problem is that a scatterer has a global impact on the scattered wave field, rendering typical convolutional neural networks with local connections inapplicable. While it is possible to deal with such a problem using a fully connected network, the number of parameters grows quadratically with the size of the input and output data. By leveraging the inherent low-rank structure of the scattering problems and introducing a novel switching layer with sparse connections, the SwitchNet architecture uses far fewer parameters and facilitates the training process. Numerical experiments show promising accuracy in learning the forward and inverse maps between the scatterers and the scattered wave field.

1810.10078 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Model Selection for Nonnegative Matrix Factorization by Support Union Recovery

通过支持联合恢复进行非负矩阵分解的模型选择

Zhaoqiang Liu

Affiliations * Department of Electrical and Computer Engineering

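AI summary Proposes an algorithm that selects the latent dimensionality of nonnegative matrix factorization automatically by computing an empirical second-order moment and recovering the support union (the index set of non-zero rows) of a related matrix; under a generative model with mild conditions, it provably detects the true latent dimensionality.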

详情
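AI abstract (translated from Chinese)

Nonnegative matrix factorization (NMF) is widely used in machine learning and signal processing because its non-subtractive, part-based property enhances interpretability. The latent dimensionality (or number of components) is usually assumed to be given. Although many NMF algorithms have been designed, there is little literature on automatic NMF model selection with theoretical guarantees. In this paper, we propose an algorithm that first computes an empirical second-order moment from the empirical fourth-order cumulant tensor, and then estimates the latent dimensionality by recovering the support union (the index set of non-zero rows) of a matrix related to the empirical second-order moment. Assuming a generative model of the data with additional mild conditions, our algorithm provably detects the true latent dimensionality. On synthetic examples, we show that the proposed algorithm finds an approximately correct number of components.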

英文摘要

Nonnegative matrix factorization (NMF) has been widely used in machine learning and signal processing because of its non-subtractive, part-based property, which enhances interpretability. It is often assumed that the latent dimensionality (or the number of components) is given. Despite the large number of algorithms designed for NMF, there is little literature on automatic model selection for NMF with theoretical guarantees. In this paper, we propose an algorithm that first calculates an empirical second-order moment from the empirical fourth-order cumulant tensor, and then estimates the latent dimensionality by recovering the support union (the index set of non-zero rows) of a matrix related to the empirical second-order moment. By assuming a generative model of the data with additional mild conditions, our algorithm provably detects the true latent dimensionality. We show on synthetic examples that our proposed algorithm is able to find an approximately correct number of components.

1810.09365 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Coupled Longitudinal and Lateral Control of a Vehicle using Deep Learning

使用深度学习进行车辆纵向和横向控制的耦合控制

Guillaume Devineau, Philip Polack, Florent Altché, Fabien Moutarde

Affiliations * Center for Robotics, MINES ParisTech; PSL Research University

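AI summary Studies whether deep neural networks can capture key characteristics of vehicle dynamics and perform coupled longitudinal and lateral control; a multi-layer perceptron and a convolutional neural network are trained on data from high-fidelity vehicle dynamics simulations, evaluated on a challenging test track, and compared with conventional decoupled controllers.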

Comments Published in the IEEE 2018 International Conference on Intelligent Transportation Systems (ITSC 2018)

详情
AI中文摘要

本文探讨了深度神经网络在捕捉车辆动力学关键特性及执行耦合纵向和横向控制方面的潜力。为此,两种不同的人工神经网络被训练以计算对应参考轨迹的车辆控制输入,使用基于高保真车辆动力学模拟的数据集。在本研究中,控制输入被选择为前轮转向角和每个车轮施加的扭矩。两种模型,即多层感知机(MLP)和卷积神经网络(CNN),基于其在复杂测试赛道上驾驶车辆的能力进行评估,该赛道在长直线和紧弯之间切换。还提供了与传统解耦控制器在相同赛道上的比较。

英文摘要

This paper explores the capability of deep neural networks to capture key characteristics of vehicle dynamics, and their ability to perform coupled longitudinal and lateral control of a vehicle. To this end, two different artificial neural networks are trained to compute vehicle controls corresponding to a reference trajectory, using a dataset based on high-fidelity simulations of vehicle dynamics. In this study, control inputs are chosen as the steering angle of the front wheels and the applied torque on each wheel. The performance of both models, namely a Multi-Layer Perceptron (MLP) and a Convolutional Neural Network (CNN), is evaluated based on their ability to drive the vehicle on a challenging test track that alternates between long straight lines and tight curves. A comparison to conventional decoupled controllers on the same track is also provided.

1506.02438 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

High-Dimensional Continuous Control Using Generalized Advantage Estimation

利用广义优势估计进行高维连续控制

John Schulman, Philipp Moritz, Sergey Levine, Michael Jordan, Pieter Abbeel

Affiliations * Department of Electrical Engineering and Computer Science; University of California, Berkeley

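AI summary Proposes generalized advantage estimation, which uses value functions to reduce the variance of policy gradient estimates at the cost of some bias, and pairs it with trust region optimization of the policy and value function for stable improvement, yielding strong policy learning results on challenging 3D locomotion tasks.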

详情
AI中文摘要

策略梯度方法在强化学习中受到青睐,因为它们直接优化累积奖励,并且可以方便地与非线性函数近似器如神经网络结合使用。主要挑战是通常需要大量的样本,以及在输入数据非平稳性下获得稳定和持续改进的难度。我们通过使用价值函数来显著减少策略梯度估计的方差(尽管引入了偏差),并利用类似于TD(λ)的指数加权优势函数估计来解决第一个挑战。我们通过使用信任区域优化过程来解决第二个挑战,该过程用于策略和价值函数,它们由神经网络表示。我们的方法在高度具有挑战性的3D运动任务中表现出强大的经验结果,包括学习双足和四足仿真实体的行走姿态,以及学习使双足机器人从地面平躺状态站立的策略。与使用手工制定政策表示的先前工作相比,我们的神经网络策略直接从原始运动学映射到关节扭矩。我们的算法是完全模型无关的,并且在3D双足机器人上的学习任务所需的模拟经验时间相当于1-2周的真实时间。

英文摘要

Policy gradient methods are an appealing approach in reinforcement learning because they directly optimize the cumulative reward and can straightforwardly be used with nonlinear function approximators such as neural networks. The two main challenges are the large number of samples typically required, and the difficulty of obtaining stable and steady improvement despite the nonstationarity of the incoming data. We address the first challenge by using value functions to substantially reduce the variance of policy gradient estimates at the cost of some bias, with an exponentially-weighted estimator of the advantage function that is analogous to TD($\lambda$). We address the second challenge by using a trust region optimization procedure for both the policy and the value function, which are represented by neural networks. Our approach yields strong empirical results on highly challenging 3D locomotion tasks, learning running gaits for bipedal and quadrupedal simulated robots, and learning a policy for getting the biped to stand up from starting out lying on the ground. In contrast to a body of prior work that uses hand-crafted policy representations, our neural network policies map directly from raw kinematics to joint torques. Our algorithm is fully model-free, and the amount of simulated experience required for the learning tasks on 3D bipeds corresponds to 1-2 weeks of real time.
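
Illustration (not taken from the paper's code): a minimal sketch of the exponentially-weighted advantage estimator described in the abstract, assuming the usual form $\hat{A}_t = \sum_{l \ge 0} (\gamma\lambda)^l \delta_{t+l}$ with TD residual $\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$. The function name, argument layout and toy numbers are assumptions made for the example.

```python
def generalized_advantage_estimates(rewards, values, gamma=0.99, lam=0.95):
    """Compute advantage estimates for one trajectory.

    `values` holds one more entry than `rewards`: the last entry is the
    value estimate of the state reached after the final reward.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    # Work backwards so each step reuses the discounted sum of later residuals.
    for t in reversed(range(len(rewards))):
        delta = rewards[t] + gamma * values[t + 1] - values[t]  # TD residual
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


rewards = [1.0, 0.0, 2.0]
values = [0.5, 0.4, 0.3, 0.0]

# lam = 0 keeps only the one-step TD residual (low variance, more bias).
one_step = generalized_advantage_estimates(rewards, values, gamma=0.9, lam=0.0)
# lam = 1 sums all residuals: discounted return minus the value baseline.
full_return = generalized_advantage_estimates(rewards, values, gamma=0.9, lam=1.0)
```

Intermediate values of `lam` interpolate between these two extremes, which is the bias-variance trade-off the abstract mentions.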

1810.06175 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

An Optimal Control Approach to Sequential Machine Teaching

用最优控制方法进行序列机器教学

Laurent Lessard, Xuezhou Zhang, Xiaojin Zhu

Affiliations * University of Wisconsin–Madison

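AI summary Formulates sequential machine teaching, the search for the shortest training sequence that drives a learning algorithm to a target model, as a time-optimal control problem, and shows in a case study that the resulting optimal sequences can vastly outperform the best available heuristics.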

详情
AI中文摘要

给定一个序列学习算法和目标模型,序列机器教学旨在找到最短的训练序列以驱动学习算法达到目标模型。我们提出了寻找此类最短训练序列的第一个系统方法。我们的关键见解是将序列机器教学公式化为时间最优控制问题。这使我们能够利用过去60年间最优控制领域发展出的关键理论和计算工具来解决序列教学问题。具体而言,我们研究了庞特里亚金最大原理,它为训练序列的最优性提供了必要条件。我们通过一个使用最小二乘损失函数和梯度下降学习者案例研究,展示了该方法的分析、结构和数值影响。我们为该问题计算了最优训练序列,尽管这些序列看起来曲折,但我们发现它们可以大幅超越现有生成训练序列的最优启发式方法。

英文摘要

Given a sequential learning algorithm and a target model, sequential machine teaching aims to find the shortest training sequence to drive the learning algorithm to the target model. We present the first principled way to find such shortest training sequences. Our key insight is to formulate sequential machine teaching as a time-optimal control problem. This allows us to solve sequential teaching by leveraging key theoretical and computational tools developed over the past 60 years in the optimal control community. Specifically, we study the Pontryagin Maximum Principle, which yields a necessary condition for optimality of a training sequence. We present analytic, structural, and numerical implications of this approach on a case study with a least-squares loss function and gradient descent learner. We compute optimal training sequences for this problem, and although the sequences seem circuitous, we find that they can vastly outperform the best available heuristics for generating training sequences.
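
Illustration (not taken from the paper): a minimal sketch of the case-study setting, a gradient descent learner with a least-squares loss that is driven by a teacher-chosen sequence of examples. The teacher below is a hypothetical greedy heuristic, not the optimal-control solution the abstract describes; the bound `radius` on the input norm, the learning rate and the stopping tolerance are all assumptions made for the example.

```python
import math


def gradient_step(w, x, y, lr):
    """One gradient descent update on the loss 0.5 * (w.x - y)^2."""
    residual = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [wi - lr * residual * xi for wi, xi in zip(w, x)]


def greedy_example(w, target, radius):
    """Hypothetical teacher: aim x along the current error, label it with the target model."""
    error = [wi - ti for wi, ti in zip(w, target)]
    norm = math.sqrt(sum(e * e for e in error))
    x = [radius * e / norm for e in error]
    y = sum(ti * xi for ti, xi in zip(target, x))
    return x, y


def teach(w, target, lr=0.5, radius=1.0, tol=1e-3, max_steps=1000):
    """Feed greedy examples until the learner is within `tol` of the target."""
    steps = 0
    while math.dist(w, target) >= tol and steps < max_steps:
        x, y = greedy_example(w, target, radius)
        w = gradient_step(w, x, y, lr)
        steps += 1
    return w, steps


final_w, n_steps = teach([0.0, 0.0], [1.0, 1.0])
```

With these settings each greedy example shrinks the error by the factor `1 - lr * radius**2`, so the length of the teaching sequence is set by that ratio. The abstract's point is that time-optimal sequences can be much shorter than what such heuristics produce.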

1810.04859 2026-06-04 cs.IT cs.AI cs.LG cs.SY eess.SY math.IT math.ST stat.TH 版本更新

Policy Design for Active Sequential Hypothesis Testing using Deep Learning

使用深度学习的主动顺序假设检验政策设计

Dhruva Kartik, Ekraam Sabir, Urbashi Mitra, Prem Natarajan

Affiliations * USC Information Sciences Institute

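AI summary Studies how deep learning can help design better policies for active sequential hypothesis testing; two new heuristics are compared with existing methods and perform significantly better in some scenarios.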

Comments Accepted at 56th Annual Allerton Conference on Communication, Control, and Computing

详情
AI中文摘要

信息论在通信、压缩和假设检验等各类问题中取得了很大的成功,而随机控制理论则通过动态规划对部分可观测马尔可夫决策过程(POMDPs)的最优策略进行表征。然而,一般情况下找到这些问题的最优策略是计算上困难的,因此在实践中通常采用启发式方法。深度学习可以作为一种工具,用于设计更好的启发式方法。本文考虑了主动顺序假设检验问题,目标是通过自适应选择适当的查询来以最少的样本量可靠地推断真实假设。该问题可以建模为POMDP,并且文献中已存在其价值函数的界。然而,最优策略尚未被识别,各种启发式方法被使用。本文提出了两种新的启发式方法:一种基于深度强化学习,另一种基于KL散度零和博弈。这些启发式方法与最先进的解决方案进行了比较,并通过数值实验表明,在某些场景下,所提出的启发式方法能够显著优于现有方法。

英文摘要

Information theory has been very successful in obtaining performance limits for various problems such as communication, compression and hypothesis testing. Likewise, stochastic control theory provides a characterization of optimal policies for Partially Observable Markov Decision Processes (POMDPs) using dynamic programming. However, finding optimal policies for these problems is computationally hard in general and thus, heuristic solutions are employed in practice. Deep learning can be used as a tool for designing better heuristics in such problems. In this paper, the problem of active sequential hypothesis testing is considered. The goal is to design a policy that can reliably infer the true hypothesis using as few samples as possible by adaptively selecting appropriate queries. This problem can be modeled as a POMDP and bounds on its value function exist in literature. However, optimal policies have not been identified and various heuristics are used. In this paper, two new heuristics are proposed: one based on deep reinforcement learning and another based on a KL-divergence zero-sum game. These heuristics are compared with state-of-the-art solutions and it is demonstrated using numerical experiments that the proposed heuristics can achieve significantly better performance than existing methods in some scenarios.

1803.01066 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Specialized Interior Point Algorithm for Stable Nonlinear System Identification

用于稳定非线性系统辨识的专用内点算法

Jack Umenberger, Ian R. Manchester

Affiliations * Australian Centre for Field Robotics; The University of Sydney

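AI summary Proposes a specialized path-following interior point algorithm that exploits special problem structure to reduce computational complexity from cubic to linear growth in the length of the data set, making stable nonlinear system identification more practical and demonstrating superior generalization to new data.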

Comments accepted to IEEE Transactions on Automatic Control

详情
AI中文摘要

从数据估计非线性动态模型面临着许多挑战,包括模型不稳定性和长期仿真保真度的非凸性。最近,拉格朗日松弛法被提出作为近似仿真保真度和保证稳定性的方法,通过半正定规划(SDP),然而由此产生的SDP具有较大的维度,限制了其在实际问题中的应用。在本文中,我们开发了一种路径跟随内点算法,利用问题中的特殊结构,将计算复杂度从数据集长度的三次方降低到线性增长。新的算法使经验比较成为可能,包括非线性ARX方法,并展示了对新数据的优越泛化能力。我们还探讨了稳定性约束的“正则化”效应,作为替代回归子集选择的方法。

英文摘要

Estimation of nonlinear dynamic models from data poses many challenges, including model instability and non-convexity of long-term simulation fidelity. Recently, Lagrangian relaxation has been proposed as a method to approximate simulation fidelity and guarantee stability via semidefinite programming (SDP); however, the resulting SDPs have large dimension, limiting their utility in practical problems. In this paper, we develop a path-following interior point algorithm that takes advantage of special structure in the problem and reduces computational complexity from cubic to linear growth with the length of the data set. The new algorithm enables empirical comparisons to established methods including Nonlinear ARX, and we demonstrate superior generalization to new data. We also explore the "regularizing" effect of stability constraints as an alternative to regressor subset selection.

1801.08383 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Data-Driven Impulse Response Regularization via Deep Learning

基于深度学习的数据驱动脉冲响应正则化

Carl Andersson, Niklas Wahlström, Thomas B. Schön

Affiliations * Department of Information Technology, Uppsala University

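AI summary Proposes a new data-driven model for impulse response estimation of stable linear single-input single-output systems; in experiments it exploits more of the hidden patterns in the input-output data than non-parametric models do.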

详情
AI中文摘要

我们考虑了稳定线性单输入单输出系统脉冲响应估计的问题。这是一个已广泛研究的问题,其中灵活的非参数模型最近在性能上超越了传统的有限维模型结构。受这一发展和深度学习的成功启发,我们提出了一种新的灵活的数据驱动模型。我们的实验表明,新模型能够比非参数模型更充分地利用输入输出数据中的隐藏模式。

英文摘要

We consider the problem of impulse response estimation of stable linear single-input single-output systems. It is a well-studied problem where flexible non-parametric models recently offered a leap in performance compared to the classical finite-dimensional model structures. Inspired by this development and the success of deep learning we propose a new flexible data-driven model. Our experiments indicate that the new model is capable of exploiting even more of the hidden patterns that are present in the input-output data as compared to the non-parametric models.

1810.03733 2026-06-04 math.NA cs.LG cs.NA 版本更新

Find the dimension that counts: Fast dimension estimation and Krylov PCA

找出计数的维度:快速维度估计和Krylov PCA

Shashanka Ubaru, Abd-Krim Seghouane, Yousef Saad

Affiliations * IBM T. J. Watson Research Center; The University of Melbourne; University of Minnesota

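AI summary Proposes a method that simultaneously estimates the dimension of the principal subspace of a covariance matrix and approximates that subspace; coupled with a Krylov subspace method, it avoids forming the sample covariance matrix explicitly and computing a complete eigendecomposition, which keeps it inexpensive for large-scale data.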

详情
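AI abstract (translated from Chinese)

High-dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal subspace of these covariance matrices and obtaining an approximation to the subspace. This problem arises in the popular principal component analysis (PCA) and in many applications of machine learning, data analysis, and signal and image processing. We first present a new method for estimating the dimension of the principal subspace. We then show how this method can be coupled with a Krylov subspace method to estimate the dimension and approximate the subspace at the same time. The dimension estimate comes at no additional cost. The proposed method is based on a model selection framework in which the new selection criterion is derived from ideas in random matrix perturbation theory. We present theoretical analyses that (a) show that the proposed method is strongly consistent (i.e., yields the optimal solution) as the number of data points $n \rightarrow \infty$, and (b) analyze conditions for exact dimension estimation in the finite $n$ case. Using recent results, we show that the proposed algorithm also yields near-optimal PCA. The proposed method avoids explicitly forming the sample covariance matrix (associated with the data) and computing a complete eigendecomposition. The method is therefore inexpensive, which is particularly advantageous in modern data applications where covariance matrices can be very large. Numerical experiments illustrate the performance of the proposed method in various applications.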

英文摘要

High dimensional data and systems with many degrees of freedom are often characterized by covariance matrices. In this paper, we consider the problem of simultaneously estimating the dimension of the principal (dominant) subspace of these covariance matrices and obtaining an approximation to the subspace. This problem arises in the popular principal component analysis (PCA), and in many applications of machine learning, data analysis, signal and image processing, and others. We first present a novel method for estimating the dimension of the principal subspace. We then show how this method can be coupled with a Krylov subspace method to simultaneously estimate the dimension and obtain an approximation to the subspace. The dimension estimation is achieved at no additional cost. The proposed method operates on a model selection framework, where the novel selection criterion is derived based on random matrix perturbation theory ideas. We present theoretical analyses which (a) show that the proposed method achieves strong consistency (i.e., yields optimal solution as the number of data-points $n\rightarrow \infty$), and (b) analyze conditions for exact dimension estimation in the finite $n$ case. Using recent results, we show that our algorithm also yields near optimal PCA. The proposed method avoids forming the sample covariance matrix (associated with the data) explicitly and computing the complete eigen-decomposition. Therefore, the method is inexpensive, which is particularly advantageous in modern data applications where the covariance matrices can be very large. Numerical experiments illustrate the performance of the proposed method in various applications.

1807.09904 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

A Data-Efficient Approach to Precise and Controlled Pushing

一种数据高效且精确可控的推动作方法

Maria Bauza, Francois R. Hogan, Alberto Rodriguez

Affiliations * Department of Mechanical Engineering, Massachusetts Institute of Technology

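AI summary Proposes a data-efficient, model-based approach that controls a complex mechanical system with a dynamics model learned from data, completing complex pushing trajectories with as few as 10 training data points.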

Comments Maria Bauza and Francois R. Hogan contributed equally to this work. 10 pages, 5 figures

详情
Journal ref CoRL 2018
AI中文摘要

几十年来,控制理论的研究表明,简单的控制器在获得及时反馈的情况下,能够控制复杂的系统。推动作是复杂机械系统的一个例子,由于摩擦系数和压力分布等未知系统参数,难以准确建模。本文探讨了控制而非建模所需的数据复杂性。结果表明,一种基于模型的控制方法,其中动态模型从数据中学习,能够使用极少量的训练数据(10个数据点)完成复杂的推动作轨迹。推动作的动态特性通过高斯过程(GP)建模,并在一种模型预测控制方法中利用,该方法线性化GP并施加执行器和任务约束,以完成平面操作任务。

英文摘要

Decades of research in control theory have shown that simple controllers, when provided with timely feedback, can control complex systems. Pushing is an example of a complex mechanical system that is difficult to model accurately due to unknown system parameters such as coefficients of friction and pressure distributions. In this paper, we explore the data-complexity required for controlling, rather than modeling, such a system. Results show that a model-based control approach, where the dynamical model is learned from data, is capable of performing complex pushing trajectories with a minimal amount of training data (10 data points). The dynamics of pushing interactions are modeled using a Gaussian process (GP) and are leveraged within a model predictive control approach that linearizes the GP and imposes actuator and task constraints for a planar manipulation task.

1810.03025 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY 版本更新

Discretizing Logged Interaction Data Biases Learning for Decision-Making

对记录交互数据进行离散化会偏学习决策制定

Peter Schulam, Suchi Saria

Affiliations * Johns Hopkins University

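AI summary Studies how discretizing irregularly sampled time series data affects models trained for decision-making, shows that discretization introduces a bias, and proposes continuous-time models to avoid it.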

Comments This is a standalone short paper describing a new type of bias that can arise when learning from time series data for sequential decision-making problems

详情
AI中文摘要

时间序列数据通常在非等间隔时间点测量,常通过离散化作为预处理步骤。例如,客户到达时间的数据可能通过将每小时内的到达次数相加来简化,从而生成更易建模的离散时间序列。在本文摘要中,我们展示离散化引入了影响决策制定模型训练的偏差。我们称这种现象为离散化偏差,并表明可以通过使用连续时间模型来避免它。

英文摘要

Time series data that are not measured at regular intervals are commonly discretized as a preprocessing step. For example, data about customer arrival times might be simplified by summing the number of arrivals within hourly intervals, which produces a discrete-time time series that is easier to model. In this abstract, we show that discretization introduces a bias that affects models trained for decision-making. We refer to this phenomenon as discretization bias, and show that we can avoid it by using continuous-time models instead.
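
Illustration (not taken from the paper): a minimal sketch of the preprocessing step the abstract describes, summing customer arrivals within hourly intervals. It shows only the discretization itself, not the paper's bias analysis; the function name and the toy arrival times (in hours) are assumptions made for the example.

```python
from collections import Counter


def hourly_counts(arrival_times_hours, n_hours):
    """Count arrivals per hour; timing within each hour is discarded."""
    counts = Counter(int(t) for t in arrival_times_hours if 0 <= t < n_hours)
    return [counts.get(hour, 0) for hour in range(n_hours)]


arrivals = [0.10, 0.95, 1.05, 1.20, 3.70]
counts = hourly_counts(arrivals, 4)  # [2, 2, 0, 1]

# Arrivals at 0.95 h and 1.05 h are six minutes apart but land in different
# bins, while 0.10 h and 0.95 h are 51 minutes apart and share one bin.
gap_across_bins = arrivals[2] - arrivals[1]
gap_within_bin = arrivals[1] - arrivals[0]
```

The discrete-time series no longer distinguishes these two cases, which is the kind of information loss a continuous-time model avoids by working with the raw timestamps.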

1810.02866 2026-06-04 eess.SP cs.LG cs.SY eess.SY 版本更新

Artificial Intelligence Assisted Power Grid Hardening in Response to Extreme Weather Events

人工智能辅助的电网加固以应对极端天气事件

Rozhin Eskandarpour, Amin Khodaei, A. Paaso, N. M. Abdullah

Affiliations * University of Denver; ComEd; US National Committee

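AI summary Proposes an artificial intelligence based grid hardening model for improving power grid resilience to extreme weather events: a machine learning model predicts component states (operational or outage), and a hardening model uses these predictions to place distributed generation (DG) units while co-optimizing grid economics and resilience; simulations on the IEEE 118-bus test system illustrate its merits.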

详情
Journal ref 2018 Grid of the Future Symposium
AI中文摘要

本文提出了一种基于人工智能的电网加固模型,旨在提高电网在极端天气事件中的韧性。首先,提出一个机器学习模型来预测组件状态(运行或停电)。然后,将这些预测输入到加固模型中,确定分布式发电(DG)单元的战略放置位置。与现有文献不同,本文通过考虑两个目标的复杂依赖关系,共同优化电网的经济性和韧性。在标准IEEE 118节点测试系统上的数值模拟展示了所提加固模型的优势和适用性。结果表明,通过去中心化和分布式本地能源资源,所提加固模型可以产生更稳健的解决方案,显著保护系统免受多个组件因极端事件而停电的影响。

英文摘要

In this paper, an artificial intelligence based grid hardening model is proposed with the objective of improving power grid resilience in response to extreme weather events. At first, a machine learning model is proposed to predict the component states (either operational or outage) in response to the extreme event. Then, these predictions are fed into a hardening model, which determines strategic locations for placement of distributed generation (DG) units. In contrast to existing literature in hardening and resilience enhancement, this paper co-optimizes grid economic and resilience objectives by considering the intricate dependencies of the two. The numerical simulations on the standard IEEE 118-bus test system illustrate the merits and applicability of the proposed hardening model. The results indicate that the proposed hardening model through decentralized and distributed local energy resources can produce a more robust solution that can protect the system significantly against multiple component outages due to an extreme event.

1810.02022 2026-06-04 math.OC cs.LG cs.SY eess.SY math.DS stat.ML 版本更新

Convergence of the Expectation-Maximization Algorithm Through Discrete-Time Lyapunov Stability Theory

通过离散时间李雅普诺夫稳定性理论分析期望-最大化算法的收敛性

Orlando Romero, Sarthak Chatterjee, Sérgio Pequito

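AI summary Revisits the Expectation-Maximization algorithm from a dynamical systems perspective, treating it as a nonlinear state-space dynamical system and using discrete-time Lyapunov stability theory to establish its convergence.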

Comments Preprint submitted to ACC 2019

详情
AI中文摘要

在本文中,我们提出了期望-最大化(EM)算法的动态系统视角。更具体地说,我们可以将EM算法分析为一个非线性状态空间动态系统。EM算法广泛应用于统计学、控制系统和机器学习中的数据聚类和密度估计。该算法属于一类称为近点方法的大型迭代算法。特别是,我们将EM算法的极限点和其他局部最大似然函数的极值点重新解释为其动态系统表示中的平衡点。此外,我们提出将其收敛性作为李雅普诺夫意义下的渐近稳定性来评估。因此,我们通过利用最近关于离散时间李雅普诺夫稳定性理论的结果,以建立EM算法动态系统表示中的渐近稳定性(从而收敛性)。

英文摘要

In this paper, we propose a dynamical systems perspective of the Expectation-Maximization (EM) algorithm. More precisely, we can analyze the EM algorithm as a nonlinear state-space dynamical system. The EM algorithm is widely adopted for data clustering and density estimation in statistics, control systems, and machine learning. This algorithm belongs to a large class of iterative algorithms known as proximal point methods. In particular, we re-interpret limit points of the EM algorithm and other local maximizers of the likelihood function it seeks to optimize as equilibria in its dynamical system representation. Furthermore, we propose to assess its convergence as asymptotic stability in the sense of Lyapunov. As a consequence, we proceed by leveraging recent results regarding discrete-time Lyapunov stability theory in order to establish asymptotic stability (and thus, convergence) in the dynamical system representation of the EM algorithm.
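
Illustration (not taken from the paper): a minimal sketch of viewing EM as a discrete-time dynamical system $\theta_{k+1} = F(\theta_k)$. It assumes a toy two-component Gaussian mixture with known means and unit variances, so the state is just the mixing weight. The log-likelihood, which EM never decreases, plays the role of a Lyapunov-like function here; the paper's actual stability argument is not reproduced. All names and numbers are assumptions made for the example.

```python
import math
import random


def normal_pdf(x, mean):
    """Density of a unit-variance Gaussian."""
    return math.exp(-0.5 * (x - mean) ** 2) / math.sqrt(2.0 * math.pi)


def em_map(weight, data, mean_a=-2.0, mean_b=2.0):
    """One EM iteration F(weight): E-step responsibilities, M-step average."""
    total = 0.0
    for x in data:
        a = weight * normal_pdf(x, mean_a)
        b = (1.0 - weight) * normal_pdf(x, mean_b)
        total += a / (a + b)
    return total / len(data)


def log_likelihood(weight, data, mean_a=-2.0, mean_b=2.0):
    return sum(math.log(weight * normal_pdf(x, mean_a)
                        + (1.0 - weight) * normal_pdf(x, mean_b)) for x in data)


random.seed(1)
data = [random.gauss(-2.0, 1.0) if random.random() < 0.3 else random.gauss(2.0, 1.0)
        for _ in range(500)]

# Iterate the map from a poor initial state and record the trajectory.
trajectory = [0.9]
for _ in range(50):
    trajectory.append(em_map(trajectory[-1], data))
likelihoods = [log_likelihood(w, data) for w in trajectory]

# At an equilibrium, applying the map once more leaves the state unchanged.
fixed_point_gap = abs(em_map(trajectory[-1], data) - trajectory[-1])
```

In this toy case the trajectory settles at a point where `fixed_point_gap` is essentially zero, matching the abstract's reading of EM limit points as equilibria of the dynamical system.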

1810.01586 2026-06-04 math.NA cs.LG cs.NA physics.comp-ph 版本更新

Machine learning for accelerating effective property prediction for poroelasticity problem in stochastic media

利用机器学习加速随机介质中渗透弹性问题有效性质的预测

Maria Vasilyeva, Aleksey Tyrylgin

Affiliations * Institute for Scientific Computation, Texas A&M University, College Station, TX 77843-3368; Multiscale model reduction laboratory, North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia, 677980

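AI summary Proposes a deep neural network based numerical homogenization method for fast computation of effective properties of poroelasticity problems in stochastic media; a convolutional neural network learns the map between stochastic fields and effective properties, and results on two- and three-dimensional model problems show fast and accurate predictions.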

详情
AI中文摘要

在本文中,我们考虑了具有随机性质的渗透弹性问题的数值均质化。所提出的方法基于构造深度神经网络(DNN)以快速计算问题粗网格近似的有效性质。我们使用选定的局部微尺度随机场和宏观尺度特征(渗透率和弹性张量)的实现在神经网络上进行训练。通过卷积神经网络(CNN)构建深度学习方法,以学习随机场与有效性质之间的映射。数值结果展示了二维和三维模型问题,表明所提出的方法能够快速且准确地预测有效性质。

英文摘要

In this paper, we consider numerical homogenization of the poroelasticity problem with stochastic properties. The proposed method is based on the construction of a deep neural network (DNN) for fast calculation of the effective properties for a coarse grid approximation of the problem. We train neural networks on a set of selected realizations of the local microscale stochastic fields and macroscale characteristics (permeability and elasticity tensors). We construct a deep learning method using a convolutional neural network (CNN) to learn a map between stochastic fields and effective properties. Numerical results are presented for two- and three-dimensional model problems and show that the proposed method provides fast and accurate effective property predictions.

1711.00439 2026-06-04 math.NA cs.LG cs.NA 版本更新

Sampling and multilevel coarsening algorithms for fast matrix approximations

用于快速矩阵近似的大规模矩阵采样和多级粗化算法

Shashanka Ubaru, Yousef Saad

Affiliations * IBM T. J. Watson Research Center; University of Minnesota

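AI summary Addresses matrix approximation for matrices that are large, sparse, and/or represent large graphs, using algorithms based on coarsening, possibly combined with random sampling; it uses a hypergraph associated with the data matrix and a column-matching graph coarsening strategy, characterizes theoretically the quality of the dimension reduction achieved by a coarsening step, and demonstrates the methods on standard and new applications.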

详情
AI中文摘要

本文针对大规模稀疏矩阵及其作为大规模图表示的问题,提出基于粗化技术(可能结合随机采样)的算法。本文提出一种多级粗化技术,利用与数据矩阵相关的超图和基于列匹配的图粗化策略。理论结果表明,当采用适当列匹配策略时,粗化步骤所实现的降维质量。我们考虑了该技术的若干标准应用以及一些新的应用。在标准应用中,首先考虑计算部分SVD的问题,其中采样与粗化相结合能够显著提升SVD结果,优于仅采样。我们还考虑了列子集选择问题,一种在数据相关应用中常用的低秩近似方法,并展示了如何将多级粗化技术应用于该问题。同样,我们考虑了图稀疏化问题,并展示了如何利用粗化技术来解决它。数值实验展示了方法在各种应用中的性能。

英文摘要

This paper addresses matrix approximation problems for matrices that are large, sparse and/or that are representations of large graphs. To tackle these problems, we consider algorithms that are based primarily on coarsening techniques, possibly combined with random sampling. A multilevel coarsening technique is proposed which utilizes a hypergraph associated with the data matrix and a graph coarsening strategy based on column matching. Theoretical results are established that characterize the quality of the dimension reduction achieved by a coarsening step, when a proper column matching strategy is employed. We consider a number of standard applications of this technique as well as a few new ones. Among the standard applications we first consider the problem of computing the partial SVD for which a combination of sampling and coarsening yields significantly improved SVD results relative to sampling alone. We also consider the Column subset selection problem, a popular low rank approximation method used in data related applications, and show how multilevel coarsening can be adapted for this problem. Similarly, we consider the problem of graph sparsification and show how coarsening techniques can be employed to solve it. Numerical experiments illustrate the performances of the methods in various applications.

1808.00924 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems

Lyapunov神经网络:用于安全学习动力系统的自适应稳定性认证

Spencer M. Richards, Felix Berkenkamp, Andreas Krause

Affiliations * Department of Mechanical and Process Engineering; Department of Computer Science

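AI summary Proposes a method to learn accurate safety certificates for nonlinear closed-loop dynamical systems, built from a neural network Lyapunov function and a training algorithm that adapts it to the shape of the largest safe region in the state space.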

Comments Proc. of the 2nd Conference on Robot Learning (CoRL 2018)

详情
AI中文摘要

学习算法在模拟中表现出色,使机器人能够适应不确定环境并提高性能。然而,这些算法很少在安全关键系统中实际应用,因为学习的策略通常不提供任何安全保证。也就是说,所需的探索可能会对机器人或其环境造成物理伤害。在本文中,我们提出了一种方法,用于学习非线性闭环动力系统的准确安全证书。具体而言,我们构建了一个Lyapunov函数神经网络和一个训练算法,该算法能够适应状态空间中最大的安全区域的形状。该算法仅依赖于动力学的输入和输出知识,而不是任何特定的模型结构。我们通过学习模拟倒立摆的安全吸引区域来展示我们的方法。此外,我们讨论了我们的方法如何与动态系统的统计模型结合,用于安全学习算法。

英文摘要

Learning algorithms have shown considerable prowess in simulation by allowing robots to adapt to uncertain environments and improve their performance. However, such algorithms are rarely used in practice on safety-critical systems, since the learned policy typically does not yield any safety guarantees. That is, the required exploration may cause physical harm to the robot or its environment. In this paper, we present a method to learn accurate safety certificates for nonlinear, closed-loop dynamical systems. Specifically, we construct a neural network Lyapunov function and a training algorithm that adapts it to the shape of the largest safe region in the state space. The algorithm relies only on knowledge of inputs and outputs of the dynamics, rather than on any specific model structure. We demonstrate our method by learning the safe region of attraction for a simulated inverted pendulum. Furthermore, we discuss how our method can be used in safe learning algorithms together with statistical models of dynamical systems.

1809.11003 2026-06-04 cs.GR cs.LG cs.NA math.NA stat.ML 版本更新

An inverse scattering approach for geometric body generation: a machine learning perspective

用于几何体生成的逆散射方法:一种机器学习视角

Jinhong Li, Hongyu Liu, Wing-Yan Tsui, Xianchao Wang

Affiliations * Faculty of Science, Qilu University of Technology, Jinan, Shandong, China; Department of Mathematics, Hong Kong Baptist University, Kowloon, Hong Kong SAR; Department of Mathematics, Harbin Institute of Technology, Harbin

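AI summary Proposes a machine-learning-flavoured method based on inverse scattering techniques that generates 2D and 3D geometric bodies with prescribed characteristic values, using a one-to-one correspondence between a geometric body and its far-field pattern for efficient and stable body generation.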

Comments 22 pages, comments are welcome

详情
AI中文摘要

在本文中,我们关注通过指定特定几何体的特征值集来生成2D和3D几何形状。我们的主要动机之一是各种应用中的3D人体生成。我们开发了一种新的方法,能够根据定制的特征值生成所需的体。该方法采用机器学习的风味,通过训练数据集中的输入特征参数生成推断几何体。我们方法的一个关键成分和创新点是将波传播理论中的逆散射技术引入到体生成中。这是通过在由Helmholtz系统支配的源散射问题中建立几何体与远场模式之间精细的一一对应关系来实现的。这使得能够建立几何体空间与由远场模式定义的功能空间之间的一一对应关系。因此,远场模式可以作为形状生成器。通过首先操纵形状生成器,然后通过稳定的多频傅里叶方法从获得的形状生成器中重建对应的几何体,实现了具有指定特征参数的形状生成。我们的方法易于实现,能够产生更高效和稳定的体生成。我们为所提出的方法提供了理论分析和广泛的数值实验。本研究是首次尝试将逆散射方法与机器学习结合应用于几何体生成,并为进一步的发展打开了许多机会。

英文摘要

In this paper, we are concerned with the 2D and 3D geometric shape generation by prescribing a set of characteristic values of a specific geometric body. One of the major motivations of our study is the 3D human body generation in various applications. We develop a novel method that can generate the desired body with customized characteristic values. The proposed method follows a machine-learning flavour that generates the inferred geometric body with the input characteristic parameters from a training dataset. One of the critical ingredients and novelties of our method is the borrowing of inverse scattering techniques in the theory of wave propagation to the body generation. This is done by establishing a delicate one-to-one correspondence between a geometric body and the far-field pattern of a source scattering problem governed by the Helmholtz system. It in turn enables us to establish a one-to-one correspondence between the geometric body space and the function space defined by the far-field patterns. Hence, the far-field patterns can act as the shape generators. The shape generation with prescribed characteristic parameters is achieved by first manipulating the shape generators and then reconstructing the corresponding geometric body from the obtained shape generator by a stable multiple-frequency Fourier method. Our method is easy to implement and produces more efficient and stable body generations. We provide both theoretical analysis and extensive numerical experiments for the proposed method. The study is the first attempt to introduce inverse scattering approaches in combination with machine learning to the geometric body generation and it opens up many opportunities for further developments.

1809.10012 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Using Neural Networks to Generate Information Maps for Mobile Sensors

用神经网络为移动传感器生成信息图

Louis Dressel, Mykel J. Kochenderfer

AI总结 本文提出利用卷积神经网络实时生成移动传感器的信息图,以提高轨迹生成的效率和准确性。

Comments Accepted to the 2018 IEEE Conference on Decision and Control (CDC)

详情
AI中文摘要

目标定位是移动传感器的关键任务,具有多种应用。然而,为这些传感器生成信息丰富的轨迹是一个具有挑战性的研究问题。一种常用方法是使用信息图来估计在传感器状态空间中任意一点进行测量的价值。这些信息图用于生成轨迹;例如,可以将轨迹设计成其测量分布与信息图的分布相匹配。无论采用何种轨迹生成方法,随着新观测的获得而不断生成信息图都至关重要。然而,实时计算这些图可能具有挑战性。我们提出使用卷积神经网络根据目标估计和传感器模型实时生成信息图。仿真表明,生成的图准确,且计算时间减少了多个数量级。

英文摘要

Target localization is a critical task for mobile sensors and has many applications. However, generating informative trajectories for these sensors is a challenging research problem. A common method uses information maps that estimate the value of taking measurements from any point in the sensor state space. These information maps are used to generate trajectories; for example, a trajectory might be designed so its distribution of measurements matches the distribution of the information map. Regardless of the trajectory generation method, generating information maps as new observations are made is critical. However, it can be challenging to compute these maps in real-time. We propose using convolutional neural networks to generate information maps from a target estimate and sensor model in real-time. Simulations show that maps are accurately rendered while offering orders of magnitude reduction in computation time.

1809.09261 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

Resilient Computing with Reinforcement Learning on a Dynamical System: Case Study in Sorting

基于动态系统的强化学习鲁棒计算:排序问题案例研究

Aleksandra Faust, James B. Aimone, Conrad D. James, Lydia Tapia

发表机构 * Google Brain, Mountain View, CA, USA(谷歌大脑,美国加利福尼亚州山景城) Sandia National Labs, Albuquerque, NM, USA(桑迪亚国家实验室,美国新墨西哥州阿尔伯克基)

AI总结 本文将计算过程建模为反馈控制问题,利用强化学习解决序列决策问题,通过排序问题案例展示鲁棒计算方法在克服传统编程局限性方面的有效性。

Comments 11 pages, accepted to CDC 2018. Here with additional evaluations

详情
AI中文摘要

机器人和自主代理在资源有限的情况下,通常依赖不完美的模型和传感器测量来完成目标导向任务。特别是,强化学习(RL)和反馈控制可以用来帮助机器人实现目标。本文基于这一领域的工作,将通用计算建模为反馈控制问题,使代理能够自主克服标准过程语言编程的局限性:对错误的鲁棒性和早期程序终止的容忍。我们的建模将计算视为程序变量空间中的轨迹生成。计算因此成为一个序列决策问题,通过强化学习(RL)解决,并通过李雅普诺夫稳定性理论分析以评估代理的鲁棒性和向目标的进展。我们通过一个典型的计算机科学问题——数组排序的案例研究来实现这一点。评估显示,我们的RL排序代理能够稳定地向渐近稳定的终点进展,对故障组件具有鲁棒性,并且比传统的快速排序和冒泡排序进行的数组操作更少。

英文摘要

Robots and autonomous agents often complete goal-based tasks with limited resources, relying on imperfect models and sensor measurements. In particular, reinforcement learning (RL) and feedback control can be used to help a robot achieve a goal. Taking advantage of this body of work, this paper formulates general computation as a feedback-control problem, which allows the agent to autonomously overcome some limitations of standard procedural language programming, namely limited resilience to errors and to early program termination. Our formulation considers computation to be trajectory generation in the program's variable space. The computing then becomes a sequential decision-making problem, solved with reinforcement learning (RL), and analyzed with Lyapunov stability theory to assess the agent's resilience and progression to the goal. We do this through a case study on a quintessential computer science problem, array sorting. Evaluations show that our RL sorting agent makes steady progress to an asymptotically stable goal, is resilient to faulty components, and performs fewer array manipulations than traditional Quicksort and Bubble sort.

1809.08657 2026-06-04 math.OC cs.IT cs.LG cs.NA cs.SY eess.SY math.IT math.NA 版本更新

Accelerated Gossip via Stochastic Heavy Ball Method

通过随机重球方法加速 gossip

Nicolas Loizou, Peter Richtárik

发表机构 * School of Mathematics, KAUST, Saudi Arabia(阿卜杜拉国王科技大学数学学院,沙特阿拉伯) The University of Edinburgh, United Kingdom(英国爱丁堡大学) MIPT, Russia(莫斯科物理技术学院,俄罗斯)

AI总结 本文研究了随机重球方法(SHB)如何作为随机 gossip 算法运行,提出了求解平均共识问题的新协议,并通过数值实验展示了其优势。

Comments 8 pages, 5 Figures, 56th Annual Allerton Conference on Communication, Control, and Computing, 2018

详情
AI中文摘要

在本文中,我们展示了随机重球方法(SHB,一种用于求解随机凸和非凸优化问题的流行方法)如何作为随机 gossip 算法运行。特别是,我们关注 SHB 的两个特殊情形:带动量的随机 Kaczmarz 方法及其块变体。基于最近提出的随机 gossip 算法设计与分析框架 [Loizou Richtarik, 2016],我们解释了所提出方法的分布式性质。我们提出了求解平均共识问题的新协议:在每一步中,网络中的所有节点都更新各自的值,但只有其中一部分节点交换其私有值。此外,我们还给出了在常见无线传感器网络上的数值实验,以展示这些协议的优势。

英文摘要

In this paper we show how the stochastic heavy ball method (SHB), a popular method for solving stochastic convex and non-convex optimization problems, operates as a randomized gossip algorithm. In particular, we focus on two special cases of SHB: the Randomized Kaczmarz method with momentum and its block variant. Building upon a recent framework for the design and analysis of randomized gossip algorithms [Loizou Richtarik, 2016], we interpret the distributed nature of the proposed methods. We present novel protocols for solving the average consensus problem in which, at each step, all nodes of the network update their values but only a subset of them exchange their private values. Numerical experiments on popular wireless sensor networks demonstrate the benefits of our protocols.
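
To make the protocol described in this abstract concrete, here is a minimal sketch of pairwise randomized gossip with a heavy-ball momentum term. It is an illustration under stated assumptions, not the authors' implementation: the function name `momentum_gossip`, the momentum value `beta=0.3`, the step count, and the 5-node ring are all hypothetical choices made for this example.

```python
import random


def momentum_gossip(values, edges, steps=3000, beta=0.3, seed=0):
    """Pairwise randomized gossip with a heavy-ball momentum term (illustrative)."""
    rng = random.Random(seed)
    x_prev = list(values)  # iterate at step k-1
    x = list(values)       # iterate at step k
    for _ in range(steps):
        i, j = rng.choice(edges)        # only this pair exchanges private values
        pair_avg = 0.5 * (x[i] + x[j])
        base = list(x)
        base[i] = pair_avg
        base[j] = pair_avg
        # every node, selected or not, adds its own momentum beta * (x_k - x_{k-1})
        x_next = [b + beta * (cur - old) for b, cur, old in zip(base, x, x_prev)]
        x_prev, x = x, x_next
    return x


# Hypothetical 5-node ring network with arbitrary initial values.
ring = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
start = [10.0, 0.0, -4.0, 7.0, 2.0]
final = momentum_gossip(start, ring)
```

Each step leaves the sum of the node values unchanged, so if the iteration converges, it converges to the average of the initial values. The momentum term is purely local, which is why all nodes can update while only one pair communicates.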

1809.06401 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process

基于隐马尔可夫模型估计的Q学习:部分可观测马尔可夫决策过程

Hyung-Jin Yoon, Donghwan Lee, Naira Hovakimyan

发表机构 * Department of Industrial and Enterprise Systems Engineering(工业与企业系统工程系)

AI总结 本文提出了一种基于隐马尔可夫模型估计的在线Q学习算法,用于部分可观测马尔可夫决策过程,同时估计POMDP参数和Q函数,并证明其收敛性。

详情
AI中文摘要

目标是研究一种在线隐马尔可夫模型(HMM)估计基于的Q学习算法,用于有限状态和动作集的部分可观测马尔可夫决策过程(POMDP)。当完整状态观测可用时,Q学习在当前动作下找到最优动作价值函数(Q函数)。然而,当完整状态观测不可用时,Q学习表现不佳。本文将POMDP估计转化为HMM估计问题,并提出递归算法,同时估计POMDP参数和Q函数。此外,本文证明POMDP估计收敛到最大似然估计的平稳点,而Q函数估计收敛到满足由HMM估计过程确定的状态信念不变分布加权的贝尔曼最优性方程的固定点。

英文摘要

The objective is to study an online hidden Markov model (HMM) estimation-based Q-learning algorithm for partially observable Markov decision processes (POMDPs) with finite state and action sets. When the full state observation is available, Q-learning finds the optimal action-value function given the current action (the Q function). However, Q-learning can perform poorly when the full state observation is not available. In this paper, we formulate the POMDP estimation as an HMM estimation problem and propose a recursive algorithm that estimates the POMDP parameters and the Q function concurrently. We also show that the POMDP estimation converges to a set of stationary points of the maximum likelihood estimate, and that the Q function estimation converges to a fixed point satisfying the Bellman optimality equation weighted by the invariant distribution of the state belief determined by the HMM estimation process.

1809.08004 2026-06-04 cs.SI cs.LG cs.NA math.NA physics.data-an 版本更新

Multi-Dimensional, Multilayer, Nonlinear and Dynamic HITS

多维、多层、非线性和动态HITS

Francesca Arrigo, Francesco Tudisco

发表机构 * University of Strathclyde(斯特拉思克莱德大学)

AI总结 本文提出了一种基于多齐次顺序保持映射的 Perron 特征向量的时序多维加权有向网络排名模型,扩展了HITS算法到时序多层设置,并定义了五个中心性向量,包括节点、层和时间戳的向量,通过非线性引入保证了任何网络的中心性向量存在性和唯一性。

详情
AI中文摘要

我们介绍了一种基于多齐次顺序保持映射的Perron特征向量的时序多维加权和有向网络的排名模型。该模型将HITS算法扩展到时序多层设置,并定义了五个中心性向量:两个用于节点,两个用于层,一个用于时间戳。为了保证任何网络的中心性向量的存在性和唯一性,非线性被引入到标准的HITS模型中,而无需对网络的连通性结构有任何要求。我们引入了一种全局收敛的类似于幂迭代的算法来计算这些中心性向量。通过在真实世界网络上进行数值实验,以评估所提出模型的有效性,并展示所伴随算法的性能。

英文摘要

We introduce a ranking model for temporal multi-dimensional weighted and directed networks based on the Perron eigenvector of a multi-homogeneous order-preserving map. The model extends to the temporal multilayer setting the HITS algorithm and defines five centrality vectors: two for the nodes, two for the layers, and one for the temporal stamps. Nonlinearity is introduced in the standard HITS model in order to guarantee existence and uniqueness of these centrality vectors for any network, without any requirement on its connectivity structure. We introduce a globally convergent power iteration like algorithm for the computation of the centrality vectors. Numerical experiments on real-world networks are performed in order to assess the effectiveness of the proposed model and showcase the performance of the accompanying algorithm.
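
For readers unfamiliar with the baseline that this model extends, the following is a small sketch of the classical single-layer HITS power iteration. It does not reproduce the paper's multilayer, temporal, nonlinear map; the function name `hits`, the iteration count, and the three-node example graph are assumptions made for illustration.

```python
import numpy as np


def hits(adjacency, iterations=100):
    """Classical HITS: alternate hub and authority updates, then normalize."""
    A = np.asarray(adjacency, dtype=float)
    hubs = np.ones(A.shape[0])
    auths = np.ones(A.shape[0])
    for _ in range(iterations):
        auths = A.T @ hubs              # authorities are pointed to by good hubs
        auths /= np.linalg.norm(auths)
        hubs = A @ auths                # hubs point to good authorities
        hubs /= np.linalg.norm(hubs)
    return hubs, auths


# Hypothetical directed graph with edges 0->1, 0->2, 1->2.
graph = [[0, 1, 1],
         [0, 0, 1],
         [0, 0, 0]]
hub_scores, authority_scores = hits(graph)
```

In this linear form the scores are the dominant singular vectors of the adjacency matrix, so they need not be unique when the graph is poorly connected. The abstract states that the nonlinearity is introduced precisely to guarantee existence and uniqueness without connectivity requirements.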

1809.07098 2026-06-04 cs.AI cs.LG cs.MA cs.NE cs.SY eess.SY 版本更新

Novelty-organizing team of classifiers in noisy and dynamic environments

在噪声和动态环境中组织新颖性的分类器团队

Danilo Vasconcellos Vargas, Hirotaka Takano, Junichi Murata

发表机构 * Graduate School of Information Science(信息科学研究生院) Electrical Engineering, Kyushu University, Fukuoka, Japan(九州大学电气工程,日本福冈) Faculty of Information Science(信息科学学院)

AI总结 该研究提出了一种在噪声和动态环境中有效工作的分类器团队(NOTC),并通过连续动作山车问题及其变体进行验证,展示了NOTC在性能上的优势,尽管其初始化过程需要一些时间。

详情
Journal ref
2015 IEEE Congress on Evolutionary Computation (CEC)
AI中文摘要

在现实世界中,环境不断变化,输入变量受到噪声的影响。然而,很少有算法能够在这种情况下工作。在这里,新颖性组织分类器团队(NOTC)被应用于连续动作山车以及其两个变种:噪声山车和不稳定天气山车。这些问题分别考虑了噪声和问题动态的变化。此外,NOTC在这些问题中与神经进化拓扑增强(NEAT)进行了比较,揭示了两种方法之间的权衡。尽管NOTC在所有问题中均表现最佳,但NEAT需要更少的试验来收敛。证明了NOTC之所以表现更好,是因为其将输入空间划分为更易处理的问题。不幸的是,这种输入空间的划分也需要一些时间来初始化。

英文摘要

In the real world, the environment changes constantly and input variables are subject to noise. However, few algorithms have been shown to work under those circumstances. Here, Novelty-Organizing Team of Classifiers (NOTC) is applied to the continuous action mountain car as well as two variations of it: a noisy mountain car and an unstable weather mountain car. These problems take noise and changes in problem dynamics into account, respectively. Moreover, NOTC is compared with NeuroEvolution of Augmenting Topologies (NEAT) on these problems, revealing a trade-off between the approaches. While NOTC achieves the best performance in all of the problems, NEAT needs fewer trials to converge. It is demonstrated that NOTC achieves better performance because of its division of the input space (creating easier problems). Unfortunately, this division of the input space also requires some time to bootstrap.

1809.06970 2026-06-04 cs.LG cs.NI cs.PF cs.SY eess.SY stat.ML 版本更新

FastDeepIoT: Towards Understanding and Optimizing Neural Network Execution Time on Mobile and Embedded Devices

FastDeepIoT: 向理解和优化移动和嵌入式设备上神经网络执行时间迈进

Shuochao Yao, Yiran Zhao, Huajie Shao, Shengzhong Liu, Dongxin Liu, Lu Su, Tarek Abdelzaher

发表机构 * University of Illinois Urbana Champaign(伊利诺伊大学厄巴纳-香槟分校) State University of New York at Buffalo(纽约州立大学布法罗分校)

AI总结 本文提出FastDeepIoT框架,通过揭示神经网络结构与执行时间之间的非线性关系,优化移动和嵌入式设备上执行时间与准确性的权衡,同时无需预先了解硬件规格或深度学习库的实现细节。

Comments Accepted by SenSys '18

详情
AI中文摘要

深度神经网络在许多传感应用问题中展现出巨大潜力,但其过度的资源需求会减慢执行时间,成为在低端设备上部署的重大障碍。为了解决这一挑战,最近的研究集中在压缩神经网络大小以提高性能。我们表明,改变神经网络大小并不成比例地影响感兴趣的性能属性,例如执行时间。相反,在网络配置空间中存在极端的运行时间非线性性。因此,我们提出了一个名为FastDeepIoT的新型框架,该框架揭示了神经网络结构与执行时间之间的非线性关系,然后利用这种理解来找到显著改善移动和嵌入式设备上执行时间与准确性权衡的网络配置。FastDeepIoT有两个关键贡献。首先,FastDeepIoT自动学习了一个准确且高度可解释的深度神经网络在目标设备上的执行时间模型。这无需事先了解硬件规格或所用深度学习库的详细实现。其次,FastDeepIoT告知压缩算法如何在经过分析的设备上最小化执行时间而不影响准确性。我们使用三种不同的传感相关任务在两部移动设备(Nexus 5和Galaxy Nexus)上评估了FastDeepIoT。FastDeepIoT进一步将神经网络的执行时间减少了48%到78%,并将能耗降低了37%到69%,与最先进的压缩算法相比。

英文摘要

Deep neural networks show great potential as solutions to many sensing application problems, but their excessive resource demand slows down execution time, posing a serious impediment to deployment on low-end devices. To address this challenge, recent literature has focused on compressing neural network size to improve performance. We show that changing neural network size does not proportionally affect performance attributes of interest, such as execution time. Rather, extreme run-time nonlinearities exist over the network configuration space. Hence, we propose a novel framework, called FastDeepIoT, that uncovers the non-linear relation between neural network structure and execution time, then exploits that understanding to find network configurations that significantly improve the trade-off between execution time and accuracy on mobile and embedded devices. FastDeepIoT makes two key contributions. First, FastDeepIoT automatically learns an accurate and highly interpretable execution time model for deep neural networks on the target device. This is done without prior knowledge of either the hardware specifications or the detailed implementation of the deep learning library used. Second, FastDeepIoT informs a compression algorithm how to minimize execution time on the profiled device without impacting accuracy. We evaluate FastDeepIoT using three different sensing-related tasks on two mobile devices: Nexus 5 and Galaxy Nexus. FastDeepIoT further reduces the neural network execution time by $48\%$ to $78\%$ and energy consumption by $37\%$ to $69\%$ compared with the state-of-the-art compression algorithms.

1809.06179 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Learning of Multi-Context Models for Autonomous Underwater Vehicles

面向自主水下航行器的多情境模型学习

Bilal Wehbe, Octavio Arriaga, Mario Michael Krell, Frank Kirchner

发表机构 * DFKI - Robotic Innovation Center(DFKI机器人创新中心) Robotics Research Group(机器人研究组)

AI总结 本文提出利用LSTM网络学习自主水下航行器的多情境模型,通过实验数据构建仿真模型,生成不同情境并提高分类准确性,展现对噪声的鲁棒性和对大数据集的扩展能力。

Comments 6 pages, 7 figures, AUV 2018 author copy

详情
AI中文摘要

多情境模型学习对于海洋机器人至关重要,因为多种因素可能对系统的动力学造成扰动。本文解决了识别自主水下航行器(AUV)模型多种情境的问题。我们从实验数据构建了机器人的仿真模型,并利用该模型填补缺失数据并生成不同的模型情境。我们实现了一种基于长短期记忆(LSTM)网络的架构,直接从数据中学习不同的情境。我们证明LSTM网络与基线方法相比能够实现较高的分类准确性,对噪声具有鲁棒性,并能高效地扩展到大规模数据集。

英文摘要

Multi-context model learning is crucial for marine robotics where several factors can cause disturbances to the system's dynamics. This work addresses the problem of identifying multiple contexts of an AUV model. We build a simulation model of the robot from experimental data, and use it to fill in the missing data and generate different model contexts. We implement an architecture based on long-short-term-memory (LSTM) networks to learn the different contexts directly from the data. We show that the LSTM network can achieve high classification accuracy compared to baseline methods, showing robustness against noise and scaling efficiently on large datasets.

1809.06009 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Uncertainty Propagation in Deep Neural Networks Using Extended Kalman Filtering

使用扩展卡尔曼滤波在深度神经网络中进行不确定性传播

Jessica S. Titensky, Hayden Jananthan, Jeremy Kepner

发表机构 * Massachusetts Institute of Technology(麻省理工学院) Department of Mathematics(数学系) Lincoln Laboratory Supercomputing Center(林肯实验室超级计算机中心)

AI总结 本文提出利用扩展卡尔曼滤波在深度神经网络中传播和量化输入不确定性,方法在计算效率上优于现有技术,同时自然地将模型误差纳入输出不确定性。

Comments 4 Pages, 8 figures. Accepted at MIT IEEE Undergraduate Research Technology Conference 2018. Publication pending

详情
AI中文摘要

扩展卡尔曼滤波(EKF)可用于在假设输入分布具有温和假设的情况下通过深度神经网络(DNN)传播和量化输入不确定性。该方法在结果上与现有DNN不确定性传播方法相当,同时显著降低了计算开销。此外,EKF允许将模型误差自然地纳入输出不确定性中。

英文摘要

Extended Kalman Filtering (EKF) can be used to propagate and quantify input uncertainty through a Deep Neural Network (DNN) assuming mild hypotheses on the input distribution. This methodology yields results comparable to existing methods of uncertainty propagation for DNNs while lowering the computational overhead considerably. Additionally, EKF allows model error to be naturally incorporated into the output uncertainty.
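
The propagation step that this abstract refers to can be sketched as a first-order (EKF-style) linearization applied layer by layer. This is a minimal sketch under assumptions, not the authors' code: the `tanh` activation, the isotropic additive model-error term, the function name `propagate_uncertainty`, and the random two-layer network are all hypothetical.

```python
import numpy as np


def propagate_uncertainty(mean, cov, layers, model_error_var=0.0):
    """Push a Gaussian (mean, cov) through tanh layers via local linearization."""
    for W, b in layers:
        mean = np.tanh(W @ mean + b)
        # Jacobian of tanh(W x + b) with respect to x, evaluated at the input mean
        jac = (1.0 - mean ** 2)[:, None] * W
        # Linearized covariance update plus an assumed additive model-error term
        cov = jac @ cov @ jac.T + model_error_var * np.eye(len(mean))
    return mean, cov


# Hypothetical 3-4-2 network with random weights and a small input covariance.
rng = np.random.default_rng(0)
demo_layers = [(rng.normal(size=(4, 3)), rng.normal(size=4)),
               (rng.normal(size=(2, 4)), rng.normal(size=2))]
out_mean, out_cov = propagate_uncertainty(np.zeros(3), 0.01 * np.eye(3), demo_layers)
```

One forward pass yields both the output mean and its covariance, which is where the computational saving over sampling-based propagation would come from. The additive term shows one simple way model error could enter the output uncertainty.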

1806.06161 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

BaRC: Backward Reachability Curriculum for Robotic Reinforcement Learning

BaRC:机器人强化学习中的逆向可达性课程

Boris Ivanovic, James Harrison, Apoorva Sharma, Mo Chen, Marco Pavone

发表机构 * Department of Mechanical Engineering, Stanford University(斯坦福大学机械工程系) School of Computing Science, Simon Fraser University(西蒙弗雷泽大学计算机科学学院)

AI总结 本文提出BaRC方法,利用物理先验知识设计课程方案,通过逆向可达性策略加速连续控制MDP中模型无关RL算法的训练,提升性能并减少探索需求。

详情
AI中文摘要

模型无关强化学习(RL)为高维系统学习控制策略提供了有吸引力的方法,但其相对差的样本复杂性通常迫使在模拟环境中进行训练。即使在模拟中,具有稀疏自然奖励函数的目标导向任务仍难以被最先进的模型无关算法处理。这些任务的瓶颈在于从系统初始状态获取学习信号所需的大量探索。本文利用物理先验知识(以近似系统动力学模型的形式)设计了一种课程方案,用于模型无关策略优化算法。我们的逆向可达性课程(BaRC)从需要少量动作完成任务的状态开始策略训练,并在策略优化算法表现出足够性能后,以动态一致的方式扩展初始状态分布。BaRC具有通用性,可以加速任何模型无关RL算法在广泛目标导向连续控制MDP上的训练。其课程策略具有物理直观性、易于调节,并允许将物理先验整合到训练中,而不会影响模型无关RL算法的性能、灵活性和适用性。我们在两个代表性的动态机器人学习问题上评估了我们的方法,并发现相对于先前的课程生成技术和朴素探索策略,有显著的性能提升。

英文摘要

Model-free Reinforcement Learning (RL) offers an attractive approach to learn control policies for high-dimensional systems, but its relatively poor sample complexity often forces training in simulated environments. Even in simulation, goal-directed tasks whose natural reward function is sparse remain intractable for state-of-the-art model-free algorithms for continuous control. The bottleneck in these tasks is the prohibitive amount of exploration required to obtain a learning signal from the initial state of the system. In this work, we leverage physical priors in the form of an approximate system dynamics model to design a curriculum scheme for a model-free policy optimization algorithm. Our Backward Reachability Curriculum (BaRC) begins policy training from states that require a small number of actions to accomplish the task, and expands the initial state distribution backwards in a dynamically-consistent manner once the policy optimization algorithm demonstrates sufficient performance. BaRC is general, in that it can accelerate training of any model-free RL algorithm on a broad class of goal-directed continuous control MDPs. Its curriculum strategy is physically intuitive, easy-to-tune, and allows incorporating physical priors to accelerate training without hindering the performance, flexibility, and applicability of the model-free RL algorithm. We evaluate our approach on two representative dynamic robotic learning problems and find substantial performance improvement relative to previous curriculum generation techniques and naive exploration strategies.

1809.05152 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Deep Reinforcement Learning for Event-Triggered Control

基于事件触发控制的深度强化学习

Dominik Baumann, Jia-Jie Zhu, Georg Martius, Sebastian Trimpe

AI总结 本文提出一种基于深度强化学习的事件触发控制方法,首次将DRL应用于ETC,能够从零开始学习控制与通信行为,并在非线性系统中展现优势。

详情
AI中文摘要

与常规的时间触发方法相比,事件触发控制(ETC)方法能够以显著更少的采样次数实现高性能控制。这些框架通常基于系统的数学模型以及特定的控制器和事件触发器设计。本文展示了如何利用深度强化学习(DRL)算法从零开始同时学习控制与通信行为,并提出一种特别适用于ETC的DRL方法。据我们所知,这是首个将DRL应用于ETC的工作。我们在多个控制任务上验证了该方法,并将其与基于模型的事件触发框架进行了比较。特别是,我们证明,与许多基于模型的ETC设计不同,该方法可以直接应用于非线性系统。

英文摘要

Event-triggered control (ETC) methods can achieve high-performance control with significantly fewer samples than conventional time-triggered methods. These frameworks are often based on a mathematical model of the system and specific designs of controller and event trigger. In this paper, we show how deep reinforcement learning (DRL) algorithms can be leveraged to simultaneously learn control and communication behavior from scratch, and present a DRL approach that is particularly suitable for ETC. To our knowledge, this is the first work to apply DRL to ETC. We validate the approach on multiple control tasks and compare it to model-based event-triggering frameworks. In particular, we demonstrate that, unlike many model-based ETC designs, it can be straightforwardly applied to nonlinear systems.

1804.01031 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Provably Robust Learning-Based Approach for High-Accuracy Tracking Control of Lagrangian Systems

具有证明鲁棒性的基于学习的方法用于拉格朗日系统高精度跟踪控制

Mohamed K. Helwa, Adam Heins, Angela P. Schoellig

发表机构 * Dynamic Systems Lab(动态系统实验室) Institute for Aerospace Studies(航空航天研究院) University of Toronto(多伦多大学)

AI总结 本文提出基于高斯过程的新型学习控制方法,确保系统稳定性与高精度跟踪,通过不确定性界保证鲁棒性,并在仿真和实验中验证有效性。

Comments 8 pages, 4 figures, 2 tables, submitted to IEEE Robotics and Automation Letters (RA-L) and the 2019 International Conference on Robotics and Automation (ICRA) (created: March 2018; updated: September 2018)

详情
AI中文摘要

拉格朗日系统涵盖了多种机器人系统,包括机械臂、轮式和腿部机器人以及四旋翼。通常使用逆动力学控制和前馈线性化技术将复杂非线性动力学转换为解耦的二阶积分器,然后使用标准外环控制器计算线性化系统的期望加速度。然而,这些方法通常依赖于非常准确的系统模型,这在实践中往往不可用。尽管文献中使用了不同的学习方法来解决这一挑战,但大多数方法在学习控制系统稳定性方面缺乏安全保证。本文提出了一种基于高斯过程(GPs)的新学习控制方法,确保闭环系统的稳定性和高精度跟踪。我们使用GPs近似命令加速度与系统实际加速度之间的误差,并利用GP预测的均值和方差计算线性化模型不确定性的上界。此不确定性界随后用于鲁棒的外环控制器以确保整个系统的稳定性。此外,我们证明跟踪误差收敛到一个半径可任意小的球体。进一步,我们通过在2自由度平面机械臂上的仿真和6自由度工业机械臂上的实验验证了我们方法的有效性。

英文摘要

Lagrangian systems represent a wide range of robotic systems, including manipulators, wheeled and legged robots, and quadrotors. Inverse dynamics control and feedforward linearization techniques are typically used to convert the complex nonlinear dynamics of Lagrangian systems to a set of decoupled double integrators, and then a standard, outer-loop controller can be used to calculate the commanded acceleration for the linearized system. However, these methods typically depend on having a very accurate system model, which is often not available in practice. While this challenge has been addressed in the literature using different learning approaches, most of these approaches do not provide safety guarantees in terms of stability of the learning-based control system. In this paper, we provide a novel, learning-based control approach based on Gaussian processes (GPs) that ensures both stability of the closed-loop system and high-accuracy tracking. We use GPs to approximate the error between the commanded acceleration and the actual acceleration of the system, and then use the predicted mean and variance of the GP to calculate an upper bound on the uncertainty of the linearized model. This uncertainty bound is then used in a robust, outer-loop controller to ensure stability of the overall system. Moreover, we show that the tracking error converges to a ball with a radius that can be made arbitrarily small. Furthermore, we verify the effectiveness of our approach via simulations on a 2 degree-of-freedom (DOF) planar manipulator and experimentally on a 6 DOF industrial manipulator.

1809.03618 2026-06-04 cs.GR cs.LG cs.MM cs.NA math.NA 版本更新

Visualization of High-dimensional Scalar Functions Using Principal Parameterizations

使用主参数化可视化高维标量函数

Rafael Ballester-Ripoll, Renato Pajarola

发表机构 * University of Zurich(苏黎世大学)

AI总结 本文提出基于主成分的方法,通过降维高维标量场,利用Sobol方法进行敏感性分析,实现高维模型的交互式分析。

详情
AI中文摘要

多维标量场的深入可视化,特别是参数空间,在计算科学和工程中至关重要。我们提出了一种基于主成分的方法来可视化此类场,能够准确反映其对输入参数的敏感性。该方法对由所有可能的偏函数(即通过固定一个或多个输入参数到特定值定义的函数)构成的广阔L²希尔伯特空间进行降维,将这些函数投影到低维参数化流形上,如3D曲线、曲面及其集合。我们的映射提供了直接的几何和视觉解释,基于Sobol著名的基于方差的敏感性分析方法。我们还通过张量分解实现了该方法的实用实现,这使得能够准确且交互式地整合和多线性主成分分析高维模型。

英文摘要

Insightful visualization of multidimensional scalar fields, in particular parameter spaces, is key to many fields in computational science and engineering. We propose a principal component-based approach to visualize such fields that accurately reflects their sensitivity to input parameters. The method performs dimensionality reduction on the vast $L^2$ Hilbert space formed by all possible partial functions (i.e., those defined by fixing one or more input parameters to specific values), which are projected to low-dimensional parameterized manifolds such as 3D curves, surfaces, and ensembles thereof. Our mapping provides a direct geometrical and visual interpretation in terms of Sobol's celebrated method for variance-based sensitivity analysis. We furthermore contribute a practical realization of the proposed method by means of tensor decomposition, which enables accurate yet interactive integration and multilinear principal component analysis of high-dimensional models.

1809.03343 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Distributed dynamic modeling and monitoring for large-scale industrial processes under closed-loop control

分布式动态建模与监控用于闭环控制下的大规模工业过程

Wenqing Li, Chunhui Zhao, Biao Huang

发表机构 * State Key Laboratory of Industrial Control Technology, College of Control Science and Engineering, Zhejiang University(工业控制技术国家重点实验室,控制科学与工程学院,浙江大学) Hubei Key Laboratory of Advanced Control and Intelligent Automation for Complex Systems(复杂系统先进控制与智能自动化湖北省重点实验室) Department of Chemical and Materials Engineering, University of Alberta(阿尔伯塔大学化学与材料工程系)

AI总结 本文提出一种分布式监控方法,结合静态和动态特性,区分真实故障与操作条件变化,通过稀疏慢特征分析算法分解过程并建立模型,验证方法有效性。

详情
AI中文摘要

对于受闭环控制的大规模工业过程,过程动态直接由控制动作产生,可能在真实故障和正常操作条件变化间表现出不同行为。然而,传统分布式监控方法不考虑闭环控制机制,仅探索静态特性,无法区分真实故障与名义变化,导致不必要的警报。本文提出一种分布式监控方法,通过同时探索静态和动态特性,首先通过开发稀疏慢特征分析(SSFA)算法将大规模闭环过程分解为若干子系统,其次开发分布式模型分别捕捉局部和全局的静态和动态特性。基于分布式监控系统,提出两级监控策略,检查操作条件和控制动作对过程特性的影响,从而区分两种变化。通过基准数据和真实工业过程数据的案例研究验证了所提方法的有效性。

英文摘要

For large-scale industrial processes under closed-loop control, process dynamics directly resulting from control action are typical characteristics and may show different behaviors between real faults and normal changes of operating conditions. However, conventional distributed monitoring approaches do not consider the closed-loop control mechanism and only explore static characteristics. They are therefore incapable of distinguishing between real process faults and nominal changes of operating conditions, leading to unnecessary alarms. In this regard, this paper proposes a distributed monitoring method for closed-loop industrial processes by concurrently exploring static and dynamic characteristics. First, the large-scale closed-loop process is decomposed into several subsystems by developing a sparse slow feature analysis (SSFA) algorithm that captures changes of both static and dynamic information. Second, distributed models are developed to separately capture static and dynamic characteristics from the local and global aspects. Based on the distributed monitoring system, a two-level monitoring strategy is proposed to check different influences on process characteristics resulting from changes of the operating conditions and control action, and thus the two changes can be well distinguished from each other. Case studies are conducted based on both benchmark data and real industrial process data to illustrate the effectiveness of the proposed method.

1512.09156 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA 版本更新

Low rank approximation and decomposition of large matrices using error correcting codes

利用纠错码进行大矩阵的低秩近似与分解

Shashanka Ubaru, Arya Mazumdar, Yousef Saad

发表机构 * Department of Computer Science and Engineering, University of Minnesota, Twin Cities(计算机科学与工程系,明尼苏达大学,双城分校) Department of Electrical and Computer Engineering, University of Minnesota, Twin Cities(电气与计算机工程系,明尼苏达大学,双城分校)

AI总结 本文探讨利用纠错码矩阵进行大矩阵低秩近似与分解,提出该方法在低秩近似、线性回归等问题中的优势,包括减少随机性、子空间嵌入性质、并行计算优势等。

详情
Journal ref
IEEE Transactions on Information Theory ( Volume: 63, Issue: 9, Sept. 2017 ) Page(s): 5544 - 5558
AI中文摘要

低秩近似是信号处理和机器学习中重要的工具。最近,随机化草图算法被提出,用于有效构造低秩近似并获得大矩阵的近似奇异值分解。类似的思想也被用于解决最小二乘回归问题。本文展示如何利用纠错码中的矩阵来寻找此类低秩近似和矩阵分解,并将框架扩展到线性最小二乘回归问题。使用这些码矩阵的好处包括:(i) 它们易于生成且显著减少随机性。(ii) 具有轻微性质的码矩阵满足子空间嵌入性质,更有可能保持整个向量子空间的几何结构。(iii) 对于并行和分布式应用,码矩阵在结构随机矩阵和高斯随机矩阵上有显著优势。(iv) 与傅里叶或哈达玛变换矩阵不同,某些类型的码矩阵不需要log因子即可实现(1+ε)最优弗罗贝尼乌斯范数误差,即对于秩k的近似,仅需O(k/ε)样本。(v) 结构化码矩阵可以实现快速乘法,因此可以快速近似一般稠密输入矩阵。(vi) 对于最小二乘回归问题min‖Ax-b‖₂,当A∈ℝ^{n×d}时,使用特定码矩阵可实现(1+ε)相对误差近似,概率很高,仅需O(d/ε)样本。

英文摘要

Low rank approximation is an important tool used in many applications of signal processing and machine learning. Recently, randomized sketching algorithms were proposed to effectively construct low rank approximations and obtain approximate singular value decompositions of large matrices. Similar ideas were used to solve least squares regression problems. In this paper, we show how matrices from error correcting codes can be used to find such low rank approximations and matrix decompositions, and extend the framework to linear least squares regression problems. The benefits of using these code matrices are the following: (i) They are easy to generate and they reduce randomness significantly. (ii) Code matrices with mild properties satisfy the subspace embedding property, and have a better chance of preserving the geometry of an entire subspace of vectors. (iii) For parallel and distributed applications, code matrices have significant advantages over structured random matrices and Gaussian random matrices. (iv) Unlike Fourier or Hadamard transform matrices, which require sampling $O(k\log k)$ columns for a rank-$k$ approximation, the log factor is not necessary for certain types of code matrices. That is, $(1+ε)$ optimal Frobenius norm error can be achieved for a rank-$k$ approximation with $O(k/ε)$ samples. (v) Fast multiplication is possible with structured code matrices, so fast approximations can be achieved for general dense input matrices. (vi) For least squares regression problem $\min\|Ax-b\|_2$ where $A\in \mathbb{R}^{n\times d}$, the $(1+ε)$ relative error approximation can be achieved with $O(d/ε)$ samples, with high probability, when certain code matrices are used.
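
The general sketch-then-factor pipeline that this abstract builds on can be illustrated as follows. This is a hedged sketch rather than the paper's construction: as a stand-in for its code matrices, it subsamples columns of a Sylvester Hadamard matrix (whose rows correspond to Hadamard-code codewords under the 0/1 to +1/-1 mapping). The function names, the oversampling amount, and the test matrix are assumptions.

```python
import numpy as np


def sylvester_hadamard(k):
    """Return the 2^k x 2^k Sylvester Hadamard matrix with +1/-1 entries."""
    H = np.array([[1.0]])
    for _ in range(k):
        H = np.block([[H, H], [H, -H]])
    return H


def sketched_low_rank(A, rank, oversample=5, seed=0):
    """Approximate a rank-`rank` SVD of A from a structured +1/-1 sketch."""
    n = A.shape[1]
    k = int(np.ceil(np.log2(n)))
    H = sylvester_hadamard(k)[:n, :]             # one +1/-1 row per column of A
    ell = min(rank + oversample, H.shape[1])     # sketch size
    rng = np.random.default_rng(seed)
    cols = rng.choice(H.shape[1], size=ell, replace=False)
    omega = H[:, cols] / np.sqrt(ell)            # n x ell sketching matrix
    Q, _ = np.linalg.qr(A @ omega)               # orthonormal basis for the sketch
    B = Q.T @ A                                  # small projected matrix
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_small)[:, :rank], s[:rank], Vt[:rank]


# Hypothetical exactly rank-3 test matrix of size 40 x 32.
rng_demo = np.random.default_rng(1)
A_demo = rng_demo.normal(size=(40, 3)) @ rng_demo.normal(size=(3, 32))
U, s, Vt = sketched_low_rank(A_demo, rank=3)
```

Only the choice of which columns to keep is random here, which reflects the abstract's point that structured matrices can reduce the amount of randomness. The sample-complexity guarantees quoted in the abstract apply to the paper's code matrices and are not claimed for this stand-in.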

1809.01353 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

IKA: Independent Kernel Approximator

IKA:独立核近似器

Matteo Ronchetti

AI总结 本文提出IKA方法,通过线性组合任意选择的函数进行低秩核近似,优于Nyström方法,在STL-10数据集上表现更优。

详情
AI中文摘要

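本文介绍了一种称为IKA的低秩核近似新方法。IKA的主要优点是它产生的函数$ψ(x)$被定义为任意选择的函数的线性组合;相比之下,Nyström方法产生的近似是核函数求值的线性组合。在STL-10数据集上的比较中,所提方法始终优于Nyström方法。数值结果可使用 https://gitlab.com/matteo-ronchetti/IKA 上提供的源代码复现。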

英文摘要

This paper describes a new method for low rank kernel approximation called IKA. The main advantage of IKA is that it produces a function $ψ(x)$ defined as a linear combination of arbitrarily chosen functions. In contrast the approximation produced by Nyström method is a linear combination of kernel evaluations. The proposed method consistently outperformed Nyström method in a comparison on the STL-10 dataset. Numerical results are reproducible using the source code available at https://gitlab.com/matteo-ronchetti/IKA
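
To illustrate the contrast drawn in this abstract, here is a minimal sketch of the Nyström baseline only, where each feature is a linear combination of kernel evaluations against a set of landmark points. IKA itself is not reproduced. The RBF kernel, the `gamma` value, the eigenvalue cutoff `eps`, and the function names are assumptions for this example.

```python
import numpy as np


def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian RBF kernel matrix between the rows of X and the rows of Y."""
    sq = (np.sum(X ** 2, axis=1)[:, None]
          + np.sum(Y ** 2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-gamma * np.maximum(sq, 0.0))


def nystrom_features(X, landmarks, gamma=1.0, eps=1e-8):
    """Map X to features whose inner products approximate the kernel."""
    K_mm = rbf_kernel(landmarks, landmarks, gamma)
    evals, evecs = np.linalg.eigh(K_mm)
    keep = evals > eps                              # drop near-null directions
    inv_sqrt = evecs[:, keep] / np.sqrt(evals[keep])
    # Each feature is a weighted sum of kernel evaluations k(x, landmark_i)
    return rbf_kernel(X, landmarks, gamma) @ inv_sqrt


# Hypothetical data: 10 points in 2-D, with the first 4 used as landmarks.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(10, 2))
phi_small = nystrom_features(X_demo, X_demo[:4])
phi_full = nystrom_features(X_demo, X_demo)
```

Evaluating a Nyström feature at a new point requires kernel evaluations against every landmark. That is the dependence the abstract says IKA avoids by expressing $ψ(x)$ through arbitrarily chosen functions.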

1710.03608 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

CTD: Fast, Accurate, and Interpretable Method for Static and Dynamic Tensor Decompositions

CTD: 一种快速、准确且可解释的静态和动态张量分解方法

Jungwoo Lee, Dongjin Choi, Lee Sael

发表机构 * Seoul National University(首尔国立大学) The State University of New York (SUNY) Korea(纽约州立大学(SUNY)韩国)

AI总结 本文提出CTD方法,用于高效且可解释地进行静态和动态张量分解,通过去除冗余提升准确性和效率,适用于在线环境下的异常检测。

详情
AI中文摘要

如何在高效且直接可解释的方式下发现张量中的模式和异常?如何在在线环境中处理不断到来的张量?张量模式和异常检测是关键问题,应用于安全监控、健康监测、网络安全等领域。标准的PARAFAC和Tucker分解结果不可直接解释。尽管已有基于采样的方法,但需要更快、更高效和更准确。本文提出CTD,一种基于采样的快速、准确且可解释的张量分解方法。CTD-S在准确性上比现有方法高17-83倍,速度和内存效率也分别提升5-86倍和7-12倍。CTD-D是首个可解释的动态张量分解方法,通过利用前一时间步的因素和重新排列操作,使速度提升2-3倍。通过CTD展示了如何在在线分布式拒绝服务(DDoS)攻击检测中有效解释结果。

英文摘要

How can we find patterns and anomalies in a tensor, or multi-dimensional array, in an efficient and directly interpretable way? How can we do this in an online environment, where a new tensor arrives at each time step? Finding patterns and anomalies in a tensor is a crucial problem with many applications, including building safety monitoring, patient health monitoring, cyber security, terrorist detection, and fake user detection in social networks. Standard PARAFAC and Tucker decomposition results are not directly interpretable. Although a few sampling-based methods have previously been proposed towards better interpretability, they need to be made faster, more memory efficient, and more accurate. In this paper, we propose CTD, a fast, accurate, and directly interpretable tensor decomposition method based on sampling. CTD-S, the static version of CTD, provably guarantees a high accuracy that is 17 ~ 83x more accurate than that of the state-of-the-art method. Also, CTD-S is made 5 ~ 86x faster and 7 ~ 12x more memory-efficient than the state-of-the-art method by removing redundancy. CTD-D, the dynamic version of CTD, is the first interpretable dynamic tensor decomposition method ever proposed. Also, it is made 2 ~ 3x faster than the already fast CTD-S by exploiting factors at the previous time step and by reordering operations. With CTD, we demonstrate how the results can be effectively interpreted in online distributed denial of service (DDoS) attack detection.

1706.09763 2026-06-04 cs.GT cond-mat.stat-mech cs.LG econ.GN q-fin.EC 版本更新

Dynamical selection of Nash equilibria using Experience Weighted Attraction Learning: emergence of heterogeneous mixed equilibria

利用经验加权吸引学习动态选择纳什均衡:异质混合均衡的出现

Robin Nicole, Peter Sollich

发表机构 * Department of Mathematics, King’s College London(伦敦国王学院数学系)

AI总结 本文研究了大规模博弈中的策略分布,对可能的平均场纳什均衡进行分类,并分析EWA学习如何消除均衡的不确定性,揭示了异质混合均衡的出现。

Comments 35 pages, 16 figures

详情
AI中文摘要

我们研究了大游戏中策略分布,分析了可能的均场纳什均衡,包括可能的分割状态。由于游戏是聚集性的,实际均衡策略分布仍不确定。因此,我们比较了经验加权吸引学习的结果,该学习在长时间后在适当的大选择强度、低噪声(长代理记忆)和完美填补缺失分数( fictitious play)极限下导致纳什均衡。学习动态打破了纳什均衡的不确定性。非平凡地,根据相关极限的取法,可以选出多种均衡类型,包括标准的同质混合和异质纯状态,以及异质混合状态,其中不同代理扮演不同策略,这些策略不全是纯策略。EWA学习的分析涉及福克-普兰克建模结合大偏差方法。理论结果通过多代理模拟得到验证。

英文摘要

We study the distribution of strategies in a large game that models how agents choose among different double auction markets. We classify the possible mean field Nash equilibria, which include potentially segregated states where an agent population can split into subpopulations adopting different strategies. As the game is aggregative, the actual equilibrium strategy distributions remain undetermined, however. We therefore compare with the results of Experience-Weighted Attraction (EWA) learning, which at long times leads to Nash equilibria in the appropriate limits of large intensity of choice, low noise (long agent memory) and perfect imputation of missing scores (fictitious play). The learning dynamics breaks the indeterminacy of the Nash equilibria. Non-trivially, depending on how the relevant limits are taken, more than one type of equilibrium can be selected. These include the standard homogeneous mixed and heterogeneous pure states, but also heterogeneous mixed states where different agents play different strategies that are not all pure. The analysis of the EWA learning involves Fokker-Planck modeling combined with large deviation methods. The theoretical results are confirmed by multi-agent simulations.
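
The three limits named in the abstract (intensity of choice, memory, imputation of missing scores) can be made concrete with a small sketch. The update below is a simplified, commonly used form of EWA and is an assumption for illustration; the paper's exact update rule, the parameter names `memory_loss`, `imputation` and `intensity`, and the fixed payoff vector are all hypothetical and not taken from the source.

```python
import numpy as np

def ewa_step(attractions, payoffs, played, memory_loss=0.1, imputation=1.0):
    """One simplified EWA update (illustrative form, not the paper's exact rule).

    The played action receives its payoff with weight 1; unplayed actions
    receive their hypothetical payoff with weight `imputation`
    (imputation=1 corresponds to fictitious play, 0 to pure reinforcement).
    """
    weights = np.full(len(attractions), float(imputation))
    weights[played] = 1.0
    return (1.0 - memory_loss) * attractions + memory_loss * weights * payoffs

def choice_probabilities(attractions, intensity=5.0):
    """Softmax (logit) choice rule; `intensity` is the intensity of choice."""
    z = intensity * (attractions - attractions.max())  # shift for stability
    p = np.exp(z)
    return p / p.sum()

# Toy run with fixed payoffs, an assumption made only for this example.
rng = np.random.default_rng(0)
payoffs = np.array([1.0, 0.5, 0.2])
Q = np.zeros(3)
for _ in range(200):
    action = rng.choice(3, p=choice_probabilities(Q))
    Q = ewa_step(Q, payoffs, action)
probs = choice_probabilities(Q)
```

With full imputation the attractions approach the payoff vector, and a larger `intensity` concentrates the choice probabilities on the best-scoring action.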

1806.00728 2026-06-04 stat.ML cs.CV cs.LG cs.SY eess.SP eess.SY 版本更新

Data-Free/Data-Sparse Softmax Parameter Estimation with Structured Class Geometries

无数据/稀疏数据softmax参数估计与结构类几何

Nisar Ahmed

发表机构 * H.J. Smead Aerospace Engineering Sciences, University of Colorado, Boulder, Colorado 80309(H.J. Smead航空航天工程科学系,科罗拉多大学,伯尔德,科罗拉多州80309)

AI总结 本文提出在少量或无标注数据情况下,利用类标签对数几率边界结构几何先验信息进行softmax参数估计,通过线性方程组求解,无需昂贵的数据采样和优化。

Comments Final version accepted to IEEE Signal Processing Letters (double column), submitted July 21, 2018

详情
AI中文摘要

本文考虑在少量或无标注训练数据可用时,但已知类标签对数几率边界相对几何结构信息的softmax参数估计问题。证明了'无数据'softmax模型合成对应于求解参数方程组,其中期望主导类对数几率边界通过分解输入特征空间的凸多面体编码。当方程可解时,线性方程给出仅使用类边界多面体规范的softmax参数解集。这允许softmax参数学习无需昂贵的暴力数据采样和数值优化。线性方程还可适应数据稀疏情况下的约束最大似然估计。由于某些多面体规范可能无法得到解,因此也展示了存在某些概率分类问题,其对数几率边界无法用m类softmax模型学习。

英文摘要

This note considers softmax parameter estimation when little/no labeled training data is available, but a priori information about the relative geometry of class label log-odds boundaries is available. It is shown that 'data-free' softmax model synthesis corresponds to solving a linear system of parameter equations, wherein desired dominant class log-odds boundaries are encoded via convex polytopes that decompose the input feature space. When solvable, the linear equations yield closed-form softmax parameter solution families using class boundary polytope specifications only. This allows softmax parameter learning to be implemented without expensive brute force data sampling and numerical optimization. The linear equations can also be adapted to constrained maximum likelihood estimation in data-sparse settings. Since solutions may also fail to exist for the linear parameter equations derived from certain polytope specifications, it is thus also shown that there exist probabilistic classification problems over m convexly separable classes for which the log-odds boundaries cannot be learned using an m-class softmax model.
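
Why boundary specifications lead to linear equations can be checked numerically: for a standard softmax model, the log-odds between two classes is an affine function of the input whose coefficients are differences of the class parameters. The sketch below only verifies that identity; the function names and random parameters are assumptions for illustration, and the paper's polytope-based synthesis procedure is not reproduced.

```python
import numpy as np

def softmax(W, b, x):
    """Class probabilities of a softmax model with weights W and biases b."""
    z = W @ x + b
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def log_odds(W, b, x, i, j):
    """Log-odds of class i over class j: linear in the parameter differences."""
    return (W[i] - W[j]) @ x + (b[i] - b[j])

rng = np.random.default_rng(1)
W, b, x = rng.normal(size=(3, 2)), rng.normal(size=3), rng.normal(size=2)
p = softmax(W, b, x)
lhs = np.log(p[0] / p[2])      # log-odds computed from the probabilities
rhs = log_odds(W, b, x, 0, 2)  # same quantity from the parameters directly
```

The boundary between classes i and j is the hyperplane where `log_odds` is zero, so prescribing that hyperplane constrains `W[i] - W[j]` and `b[i] - b[j]` linearly.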

1711.10144 2026-06-04 math.AP cs.LG cs.NA math.NA math.PR 版本更新

The game theoretic p-Laplacian and semi-supervised learning with few labels

基于博弈论的p-拉普拉斯方程与少量标签的半监督学习

Jeff Calder

发表机构 * Department of Mathematics, University of Minnesota(明尼苏达大学数学系)

AI总结 研究了图上半监督学习中的博弈论p-拉普拉斯方程,证明其在有限标签和无限未标签数据极限下的良好性,展示其连续极限为加权连续p-拉普拉斯方程,并证明图p-拉普拉斯方程的解在高概率下近似Hölder连续。

详情
AI中文摘要

我们研究了图上半监督学习中的博弈论p-拉普拉斯方程,并证明其在有限标签和无限未标签数据极限下的良好性。特别是,我们展示了基于图的半监督学习在博弈论p-拉普拉斯方程下的连续极限是加权连续p-拉普拉斯方程的变种。我们还证明了图p-拉普拉斯方程的解在高概率下近似Hölder连续。我们的证明使用了图上的粘性解理论和最大原理。

英文摘要

We study the game theoretic p-Laplacian for semi-supervised learning on graphs, and show that it is well-posed in the limit of finite labeled data and infinite unlabeled data. In particular, we show that the continuum limit of graph-based semi-supervised learning with the game theoretic p-Laplacian is a weighted version of the continuous p-Laplace equation. We also prove that solutions to the graph p-Laplace equation are approximately Hölder continuous with high probability. Our proof uses the viscosity solution machinery and the maximum principle on a graph.

1803.10309 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Canonical Correlation Analysis of Datasets with a Common Source Graph

具有共同源图的数据集的典型相关分析

Jia Chen, Gang Wang, Yanning Shen, Georgios B. Giannakis

发表机构 * University of Minnesota(明尼苏达大学)

AI总结 本文提出了一种基于图正则化的典型相关分析方法(gCCA),通过引入图结构来利用共同源的知识,以提升数据融合和分类性能。

Comments 10 pages, 7 figures

详情
AI中文摘要

典型相关分析(CCA)是一种用于发现两个或多个数据集是否共享隐藏源的强大技术。其优点包括降维、聚类、分类、特征选择和数据融合。然而,标准CCA未利用共同源的几何结构,这可能来自给定数据或通过(交叉)相关性推导。本文将共同源提供的额外信息编码为图,并作为图正则化器。这导致了一种新的图正则化CCA方法,称为图(g)CCA。新的gCCA考虑了图诱导的共同源知识,同时最小化所需典型变量的距离。针对数据量小于数据向量维度的多种实际设置,还开发了gCCA的对偶形式。一种设置包括内核用于处理非线性数据依赖性。所得到的图内核(gk)CCA也以闭式形式获得。最后,通过多个真实数据集上的图像分类测试来证明新线性、对偶和内核方法相对于竞争方法的优势。

英文摘要

Canonical correlation analysis (CCA) is a powerful technique for discovering whether or not hidden sources are commonly present in two (or more) datasets. Its well-appreciated merits include dimensionality reduction, clustering, classification, feature selection, and data fusion. The standard CCA however, does not exploit the geometry of the common sources, which may be available from the given data or can be deduced from (cross-) correlations. In this paper, this extra information provided by the common sources generating the data is encoded in a graph, and is invoked as a graph regularizer. This leads to a novel graph-regularized CCA approach, that is termed graph (g) CCA. The novel gCCA accounts for the graph-induced knowledge of common sources, while minimizing the distance between the wanted canonical variables. Tailored for diverse practical settings where the number of data is smaller than the data vector dimensions, the dual formulation of gCCA is also developed. One such setting includes kernels that are incorporated to account for nonlinear data dependencies. The resultant graph-kernel (gk) CCA is also obtained in closed form. Finally, corroborating image classification tests over several real datasets are presented to showcase the merits of the novel linear, dual, and kernel approaches relative to competing alternatives.

1712.07249 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Probabilistic Learning of Torque Controllers from Kinematic and Force Constraints

基于概率学习的扭矩控制器从运动学和力约束中学习

João Silvério, Yanlong Huang, Leonel Rozo, Sylvain Calinon, Darwin G. Caldwell

发表机构 * Department of Advanced Robotics, Istituto Italiano di Tecnologia(意大利理工学院先进机器人系) Idiap Research Institute(Idiap研究所)

AI总结 本文提出一种概率方法,同时学习和合成扭矩控制命令,考虑任务空间、关节空间和力约束,通过概率学习不同扭矩控制器的相关性,结合高斯分布特性生成满足任务特征的新扭矩命令。

Comments Accepted for publication at 2018 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

详情
AI中文摘要

在从示范中学习技能时,通常需要提前考虑适当的任务表示(通常在操作空间或配置空间中)。本文提出了一种概率方法,同时学习和合成扭矩控制命令,考虑任务空间、关节空间和力约束。我们通过考虑作用于机器人上的不同扭矩控制器,其相关性从示范中概率性地学习。利用高斯分布的性质,将这些控制器结合起来,生成满足任务重要特征的新扭矩命令。我们在两个实验场景中使用7自由度扭矩控制机械臂进行验证,任务需要考虑不同控制器以正确执行。

英文摘要

When learning skills from demonstrations, one is often required to think in advance about the appropriate task representation (usually in either operational or configuration space). We here propose a probabilistic approach for simultaneously learning and synthesizing torque control commands which take into account task space, joint space and force constraints. We treat the problem by considering different torque controllers acting on the robot, whose relevance is learned probabilistically from demonstrations. This information is used to combine the controllers by exploiting the properties of Gaussian distributions, generating new torque commands that satisfy the important features of the task. We validate the approach in two experimental scenarios using 7-DoF torque-controlled manipulators, with tasks that require the consideration of different controllers to be properly executed.
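
One standard property of Gaussian distributions that can combine several commands is the product of Gaussians, in which each command is weighted by its precision (inverse covariance). The sketch below shows that generic fusion rule; whether the paper uses exactly this rule is an assumption, and `fuse_gaussians` and the numbers are hypothetical.

```python
import numpy as np

def fuse_gaussians(means, covariances):
    """Product of Gaussians: precision-weighted mean and fused covariance."""
    precisions = [np.linalg.inv(c) for c in covariances]
    fused_cov = np.linalg.inv(sum(precisions))
    fused_mean = fused_cov @ sum(p @ m for p, m in zip(precisions, means))
    return fused_mean, fused_cov

# Two hypothetical 2-joint torque commands, each confident about one joint.
tau_a, cov_a = np.array([1.0, 0.0]), np.diag([0.01, 100.0])
tau_b, cov_b = np.array([0.0, 2.0]), np.diag([100.0, 0.01])
tau, cov = fuse_gaussians([tau_a, tau_b], [cov_a, cov_b])
```

A controller with small variance along a direction dominates the fused command along that direction, so here `tau` is close to `[1, 2]`.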

1808.00058 2026-06-04 cs.NI cs.LG cs.SY eess.SP eess.SY 版本更新

A Unified Framework for Joint Mobility Prediction and Object Profiling of Drones in UAV Networks

无人机网络中无人机移动预测与物体特征联合预测的统一框架

Han Peng, Abolfazl Razi, Fatemeh Afghah, Jonathan Ashdown

发表机构 * School of Informatics, Computing and Cyber Systems, Northern Arizona University, Flagstaff, AZ(信息学、计算与网络系统学院,北亚利桑那大学,弗拉格斯塔,亚利桑那) Air Force Research Laboratory, Rome, NY(空军研究实验室,罗马,纽约)

AI总结 本文提出一种无监督在线学习算法,用于无人机网络中无人机的移动预测和物体特征联合预测,以提升控制和通信协议的效率。

Comments 8 pages, 11 figures

详情
AI中文摘要

近年来,使用自主且协作的无人空中车辆(UAV)网络在没有地面站命令和通信的情况下变得更加重要,特别是在搜索和救援、灾害管理等人类干预受限的应用中。在这些场景中,如果无人机能够获取关于邻居节点的移动性、传感和作动能力的信息,它们可以做出更有效的决策。本文开发了一种无监督在线学习算法,用于无人机的移动预测和物体特征联合预测,以促进控制和通信协议。所提出的方法不仅预测周围飞行物体的未来位置,还能将它们分类为具有相似机动能力的不同组别(例如旋转式和固定翼UAVs),而无需事先了解这些组别。该方法在接纳具有未知移动性特征的新物体类型方面具有灵活性,因此适用于具有异构节点的新兴飞行自组网。

英文摘要

In recent years, using a network of autonomous and cooperative unmanned aerial vehicles (UAVs) without command and communication from the ground station has become more imperative, in particular in search-and-rescue operations, disaster management, and other applications where human intervention is limited. In such scenarios, UAVs can make more efficient decisions if they acquire more information about the mobility, sensing and actuation capabilities of their neighbor nodes. In this paper, we develop an unsupervised online learning algorithm for joint mobility prediction and object profiling of UAVs to facilitate control and communication protocols. The proposed method not only predicts the future locations of the surrounding flying objects, but also classifies them into different groups with similar levels of maneuverability (e.g. rotatory, and fixed-wing UAVs) without prior knowledge about these classes. This method is flexible in admitting new object types with unknown mobility profiles, thereby applicable to emerging flying Ad-hoc networks with heterogeneous nodes.

1709.03726 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Adaptive Graph Signal Processing: Algorithms and Optimal Sampling Strategies

自适应图信号处理:算法与最优采样策略

Paolo Di Lorenzo, Paolo Banelli, Elvin Isufi, Sergio Barbarossa, Geert Leus

发表机构 * Dept. of Engineering, University of Perugia(工程系,佩鲁吉亚大学)

AI总结 本文提出自适应图信号学习的新策略,通过分析随机采样对算法性能的影响,设计优化采样策略以提升稳态性能和收敛速度。

Comments Submitted to IEEE Transactions on Signal Processing, September 2017

详情
AI中文摘要

本文旨在提出自适应图信号学习的新策略,即在随机时间变化的顶点子集上观测信号。将经典自适应算法LMS和RLS重新纳入图信号处理框架,通过均方分析探讨随机采样对自适应重建能力和稳态性能的影响。随后提出几种概率采样策略,设计每个节点的采样概率,以优化稳态性能、图采样率和算法收敛速度的平衡。最后推导出一种分布式RLS策略,并证明其收敛于集中式算法。通过合成和真实数据的数值模拟,展示了所提采样和重建策略在图上信号(可能分布式)自适应学习中的良好性能。

英文摘要

The goal of this paper is to propose novel strategies for adaptive learning of signals defined over graphs, which are observed over a (randomly time-varying) subset of vertices. We recast two classical adaptive algorithms in the graph signal processing framework, namely, the least mean squares (LMS) and the recursive least squares (RLS) adaptive estimation strategies. For both methods, a detailed mean-square analysis illustrates the effect of random sampling on the adaptive reconstruction capability and the steady-state performance. Then, several probabilistic sampling strategies are proposed to design the sampling probability at each node in the graph, with the aim of optimizing the tradeoff between steady-state performance, graph sampling rate, and convergence rate of the adaptive algorithms. Finally, a distributed RLS strategy is derived and is shown to be convergent to its centralized counterpart. Numerical simulations carried out over both synthetic and real data illustrate the good performance of the proposed sampling and reconstruction strategies for (possibly distributed) adaptive learning of signals defined over graphs.
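
A minimal sketch of an LMS-type update for a bandlimited graph signal observed on a random vertex subset follows. It assumes the signal lies in the span of the first few Laplacian eigenvectors and uses the update x ← x + μ B D (y − x), where B projects onto that span and D is the random sampling mask; the path graph, step size and sampling probability are illustrative assumptions rather than values from the paper.

```python
import numpy as np

def graph_lms(y_stream, masks, U_F, mu=0.5):
    """LMS-type recursion: x <- x + mu * B * D[n] * (y[n] - x)."""
    B = U_F @ U_F.T                    # projector onto the bandlimited subspace
    x = np.zeros(U_F.shape[0])
    for y, mask in zip(y_stream, masks):
        x = x + mu * B @ (mask * (y - x))  # only sampled vertices contribute
    return x

rng = np.random.default_rng(0)
N, k, T = 20, 3, 400
# Laplacian of a path graph with N vertices (toy topology).
L = 2 * np.eye(N) - np.eye(N, k=1) - np.eye(N, k=-1)
L[0, 0] = L[-1, -1] = 1
_, U = np.linalg.eigh(L)
U_F = U[:, :k]                         # lowest-frequency eigenvectors
x_true = U_F @ rng.normal(size=k)
masks = rng.random((T, N)) < 0.5       # each vertex sampled with prob. 0.5
y_stream = x_true + 0.01 * rng.normal(size=(T, N))
x_hat = graph_lms(y_stream, masks, U_F)
```

Changing the sampling probability or `mu` shows the tradeoff the abstract refers to: fewer samples or a smaller step slow convergence, while a larger step raises the steady-state error.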

1807.08855 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SP eess.SY 版本更新

Weak in the NEES?: Auto-tuning Kalman Filters with Bayesian Optimization

在NEES中薄弱:基于贝叶斯优化的自动调节卡尔曼滤波器

Zhaozhong Chen, Christoffer Heckman, Simon Julier, Nisar Ahmed

发表机构 * Department of Computer Science(计算机科学系) University of Colorado Boulder(科罗拉多大学博尔德分校) University College London(伦敦大学学院) Smead Aerospace Engineering Sciences(Smead航空航天工程科学系)

AI总结 本文提出一种基于贝叶斯优化的自动调节卡尔曼滤波器方法,通过智能采样参数空间,利用非参数高斯过程代理函数,高效识别多个局部极小值并提供结果不确定性量化。

Comments Final version presented at FUSION 2018 Conference, Cambridge, UK, July 2018 (submitted June 1, 2018)

详情
AI中文摘要

卡尔曼滤波器被广泛用于数据融合应用,包括导航、跟踪和同时定位与建图问题。然而,调整各种卡尔曼滤波器模型参数需要大量时间和努力,例如过程噪声协方差、非白噪声预白化滤波器模型等。传统优化技术在调整时容易陷入较差的局部极小值,并且使用真实传感器数据实施成本较高。为了解决这些问题,本文开发了一种新的“黑箱”贝叶斯优化策略,用于自动调节卡尔曼滤波器。在该方法中,性能由两种随机目标函数之一来表征:当可用真实状态模型时为归一化估计误差平方(NEES),当只有传感器数据可用时为归一化创新误差平方(NIS)。通过智能采样参数空间,学习和利用非参数高斯过程代理函数,贝叶斯优化可以高效地识别多个局部极小值,并对其结果提供不确定性量化。

英文摘要

Kalman filters are routinely used for many data fusion applications including navigation, tracking, and simultaneous localization and mapping problems. However, significant time and effort is frequently required to tune various Kalman filter model parameters, e.g. process noise covariance, pre-whitening filter models for non-white noise, etc. Conventional optimization techniques for tuning can get stuck in poor local minima and can be expensive to implement with real sensor data. To address these issues, a new "black box" Bayesian optimization strategy is developed for automatically tuning Kalman filters. In this approach, performance is characterized by one of two stochastic objective functions: normalized estimation error squared (NEES) when ground truth state models are available, or the normalized innovation error squared (NIS) when only sensor data is available. By intelligently sampling the parameter space to both learn and exploit a nonparametric Gaussian process surrogate function for the NEES/NIS costs, Bayesian optimization can efficiently identify multiple local minima and provide uncertainty quantification on its results.
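
The two objective functions named above have standard definitions, sketched here for a single time step: NEES weights the state error by the inverse of the filter's state covariance, and NIS weights the measurement innovation by the inverse of the innovation covariance. The function and variable names are illustrative; the Bayesian optimization loop over filter parameters is not shown.

```python
import numpy as np

def nees(x_true, x_est, P):
    """Normalized estimation error squared: e^T P^{-1} e, e = x_true - x_est."""
    e = np.asarray(x_true, dtype=float) - np.asarray(x_est, dtype=float)
    return float(e @ np.linalg.solve(P, e))

def nis(innovation, S):
    """Normalized innovation squared: v^T S^{-1} v for innovation v."""
    v = np.asarray(innovation, dtype=float)
    return float(v @ np.linalg.solve(S, v))

example_nees = nees([1.0, 2.0], [0.0, 0.0], np.eye(2))
example_nis = nis([2.0], np.array([[4.0]]))
```

For a consistent filter, the NEES averaged over runs should be close to the state dimension and the averaged NIS close to the measurement dimension, which is what makes them usable as tuning costs.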

1807.08048 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Baidu Apollo EM Motion Planner

百度 Apollo EM 运动规划器

Haoyang Fan, Fan Zhu, Changchun Liu, Liangliang Zhang, Li Zhuang, Dong Li, Weicheng Zhu, Jiangtao Hu, Hongye Li, Qi Kong

发表机构 * Baidu USA LLC(百度美国有限公司)

AI总结 本文提出基于百度 Apollo 开源自动驾驶平台的实时运动规划系统,解决工业级4级运动规划问题,兼顾安全性、舒适性和可扩展性,通过分层结构实现多车道和单车道自动驾驶。

详情
AI中文摘要

本文介绍了一种基于百度 Apollo(开源)自动驾驶平台的实时运动规划系统。该系统旨在解决工业级4级运动规划问题,同时考虑安全性、舒适性和可扩展性。系统采用分层结构处理多车道和单车道自动驾驶:(1)系统顶层为多车道策略,通过并行计算的车道级轨迹进行比较以处理变道场景。(2)在车道级轨迹生成器中,基于弗伦兹框架迭代求解路径和速度优化。(3)对于路径和速度优化,提出结合动态规划和基于样条的二次规划的方法,构建可扩展且易于调节的框架,同时处理交通规则、障碍物决策和平滑性。该规划器可扩展至高速公路和低速城市驾驶场景。我们通过场景示例和道路测试结果展示了该算法。本文描述的系统自2017年9月Apollo v1.5发布以来已部署到数十辆百度Apollo自动驾驶车辆。截至2018年5月16日,该系统已在各种城市场景下进行了3,380小时和约68,000公里(42,253英里)的闭环自动驾驶测试。本文描述的算法可在https://github.com/ApolloAuto/apollo/tree/master/modules/planning上获得。

英文摘要

In this manuscript, we introduce a real-time motion planning system based on the Baidu Apollo (open source) autonomous driving platform. The developed system aims to address the industrial level-4 motion planning problem while considering safety, comfort and scalability. The system covers multilane and single-lane autonomous driving in a hierarchical manner: (1) The top layer of the system is a multilane strategy that handles lane-change scenarios by comparing lane-level trajectories computed in parallel. (2) Inside the lane-level trajectory generator, it iteratively solves path and speed optimization based on a Frenet frame. (3) For path and speed optimization, a combination of dynamic programming and spline-based quadratic programming is proposed to construct a scalable and easy-to-tune framework to handle traffic rules, obstacle decisions and smoothness simultaneously. The planner is scalable to both highway and lower-speed city driving scenarios. We also demonstrate the algorithm through scenario illustrations and on-road test results. The system described in this manuscript has been deployed to dozens of Baidu Apollo autonomous driving vehicles since Apollo v1.5 was announced in September 2017. As of May 16th, 2018, the system has been tested under 3,380 hours and approximately 68,000 kilometers (42,253 miles) of closed-loop autonomous driving under various urban scenarios. The algorithm described in this manuscript is available at https://github.com/ApolloAuto/apollo/tree/master/modules/planning.

1807.07099 2026-06-04 eess.SP cs.LG cs.NA math.NA stat.ML 版本更新

Comparative study of Discrete Wavelet Transforms and Wavelet Tensor Train decomposition to feature extraction of FTIR data of medicinal plants

对离散小波变换与小波张量分解在药用植物FTIR数据特征提取中的比较研究

Pavel Kharyuk, Dmitry Nazarenko, Ivan Oseledets

发表机构 * Skolkovo Institute of Science and Technology(斯科尔科沃科学技术研究院) Faculty of Computational Mathematics and Cybernetics, Lomonosov Moscow State University(莫斯科国立大学计算数学与控制论系) Faculty of Chemistry, Lomonosov Moscow State University(莫斯科国立大学化学系) Institute of Numerical Mathematics of the Russian Academy of Sciences(俄罗斯科学院数值数学研究所)

AI总结 本文比较了小波张量分解与离散小波变换在药用植物FTIR数据特征提取中的应用,发现两者在预处理和特征提取对机器学习算法效率的影响上表现相似,且小波张量分解因其单一参数调优优势更适用于多种信号处理任务。

详情
AI中文摘要

本文利用7种植物样本的傅里叶变换红外(FTIR)光谱,探讨了预处理和特征提取对机器学习算法效率的影响。将小波张量分解(WTT)与离散小波变换(DWT)作为药用植物FTIR数据的特征提取技术进行比较。各种信号处理步骤在应用于分类和聚类任务时表现出不同的行为。通过网格搜索找到的WTT和DWT的最佳结果相似,显著提高了聚类质量和调优后的逻辑回归分类准确率,相比原始光谱。与DWT不同,WTT只有一个参数(秩)需要调优,使其成为在各种信号处理应用中更通用和易用的数据处理工具。

英文摘要

Fourier-transform infrared (FTIR) spectra of samples from 7 plant species were used to explore the influence of preprocessing and feature extraction on the efficiency of machine learning algorithms. Wavelet Tensor Train (WTT) and Discrete Wavelet Transforms (DWT) were compared as feature extraction techniques for FTIR data of medicinal plants. Various combinations of signal processing steps showed different behavior when applied to classification and clustering tasks. The best results for WTT and DWT found through grid search were similar, significantly improving the quality of clustering as well as the classification accuracy for tuned logistic regression in comparison to the original spectra. Unlike DWT, WTT has only one parameter to be tuned (rank), making it a more versatile and easier-to-use data processing tool in various signal processing applications.
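
As a rough picture of DWT-based feature extraction, the sketch below applies a multilevel Haar transform to a 1-D signal and returns the coarse approximation plus per-level detail coefficients. The Haar wavelet is chosen only because it is the simplest case; the abstract does not say which wavelet families were searched, and the WTT side of the comparison is not reproduced.

```python
import numpy as np

def haar_dwt(signal, levels):
    """Multilevel orthonormal Haar DWT.

    Assumes len(signal) is divisible by 2**levels. Returns the final
    approximation coefficients and a list of detail coefficients per level.
    """
    approx = np.asarray(signal, dtype=float)
    details = []
    for _ in range(levels):
        even, odd = approx[0::2], approx[1::2]
        details.append((even - odd) / np.sqrt(2))  # high-pass (detail) part
        approx = (even + odd) / np.sqrt(2)         # low-pass (approximation)
    return approx, details

spectrum = np.random.default_rng(0).normal(size=8)  # stand-in for a spectrum
approx, details = haar_dwt(spectrum, levels=2)
features = np.concatenate([approx] + details[::-1])
```

The concatenated coefficients can be fed to a classifier or clustering method in place of the raw spectrum; the wavelet family and number of levels are the parameters a grid search would vary.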

1709.07224 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Local Communication Protocols for Learning Complex Swarm Behaviors with Deep Reinforcement Learning

局部通信协议用于通过深度强化学习学习复杂群集行为

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

发表机构 * School of Computer Science, University of Lincoln(林肯大学计算机科学学院) Department of Electrical Engineering, Technische Universität Darmstadt(达姆施塔特技术大学电气工程系)

AI总结 本文提出简单通信协议,利用深度强化学习在多机器人群环境中学习去中心化控制策略,通过直方图编码局部邻域关系并传输任务特定信息,如最短距离和方向,以完成协作任务。

Comments 13 pages, 4 figures, version 2, accepted at ANTS 2018

详情
AI中文摘要

群集系统对强化学习(RL)构成挑战,因为算法需要学习去中心化控制策略以应对代理的有限局部感知和通信能力。虽然直接定义代理行为困难,但可通过先验知识定义简单的通信协议。本文提出多种简单通信协议,用于深度强化学习在多机器人群环境中寻找去中心化控制策略。协议基于直方图编码代理的局部邻域关系,并可传输任务特定信息,如到目标的最短距离和方向。在我们的框架中,我们采用信任区域策略优化的变体来学习复杂协作任务,如编队和建立通信链路。我们在模拟的2D物理环境中评估了我们的发现,并比较了不同通信协议的影响。

英文摘要

Swarm systems constitute a challenging problem for reinforcement learning (RL) as the algorithm needs to learn decentralized control policies that can cope with limited local sensing and communication abilities of the agents. While it is often difficult to directly define the behavior of the agents, simple communication protocols can be defined more easily using prior knowledge about the given task. In this paper, we propose a number of simple communication protocols that can be exploited by deep reinforcement learning to find decentralized control policies in a multi-robot swarm environment. The protocols are based on histograms that encode the local neighborhood relations of the agents and can also transmit task-specific information, such as the shortest distance and direction to a desired target. In our framework, we use an adaptation of Trust Region Policy Optimization to learn complex collaborative tasks, such as formation building and building a communication link. We evaluate our findings in a simulated 2D-physics environment, and compare the implications of different communication protocols.
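
One way to read the histogram-based protocols is that each agent compresses a variable number of neighbor observations into a fixed-size vector that a policy network can consume. The sketch below bins the distances to neighbors within a communication radius; the paper's protocols may also encode bearings or task-specific quantities, so the function name, the radius and the bin count are assumptions.

```python
import numpy as np

def neighbor_histogram(own_pos, other_pos, comm_radius, bins=4):
    """Fixed-size histogram of distances to agents within comm_radius."""
    d = np.linalg.norm(np.asarray(other_pos) - np.asarray(own_pos), axis=1)
    d = d[d <= comm_radius]  # agents outside the radius are not perceived
    hist, _ = np.histogram(d, bins=bins, range=(0.0, comm_radius))
    return hist

obs = neighbor_histogram([0.0, 0.0],
                         [[0.1, 0.0], [0.6, 0.0], [2.0, 0.0]],
                         comm_radius=1.0)
```

The output length depends only on `bins`, not on how many agents are nearby, which keeps the observation size constant as the swarm grows.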

1807.03769 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Kernel-Based Learning for Smart Inverter Control

基于核方法的智能逆变器控制

Aditie Garg, Mana Jalali, Vassilis Kekatos, Nikolaos Gatsis

发表机构 * Dept. of ECE, Virginia Tech(维吉尼亚理工大学电子工程系) Dept. of ECE, Un. of Texas at San Antonio(德克萨斯大学圣安东尼奥分校电子工程系)

AI总结 本文提出非线性逆变器控制策略,通过类比多任务学习将反应控制视为核回归任务,利用线性化电网模型和预测数据场景,在馈线层面联合设计逆变器规则以最小化电压偏差和电阻损耗。

Comments Submitted to the 2018 IEEE Global Signal and Information Processing Conf., Symposium on Smart Energy Infrastructures

详情
AI中文摘要

目前,分布电网面临由间歇性太阳能发电引起的频繁电压波动的挑战。智能逆变器被倡导为一种快速响应的手段,用于调节电压并最小化电阻损耗。由于最优逆变器协调可能计算上具有挑战性,而预设的本地控制规则表现不佳,因此定制化的准静态控制规则被视为最佳折中方案。本文从仿射控制规则出发,提出非线性逆变器控制策略。通过类比多任务学习,将反应控制视为基于核的回归任务。利用线性化电网模型和给定的预期数据场景,在馈线层面联合设计逆变器规则,以最小化电压偏差和电阻损耗的凸组合,通过线性约束的二次规划。使用真实世界数据在基准馈线上的数值测试表明,非线性控制规则即使由少数非本地读数驱动,也能实现近最优性能。

英文摘要

Distribution grids are currently challenged by frequent voltage excursions induced by intermittent solar generation. Smart inverters have been advocated as a fast-responding means to regulate voltage and minimize ohmic losses. Since optimal inverter coordination may be computationally challenging and preset local control rules are subpar, the approach of customized control rules designed in a quasi-static fashion features as a golden middle. Departing from affine control rules, this work puts forth non-linear inverter control policies. Drawing analogies to multi-task learning, reactive control is posed as a kernel-based regression task. Leveraging a linearized grid model and given anticipated data scenarios, inverter rules are jointly designed at the feeder level to minimize a convex combination of voltage deviations and ohmic losses via a linearly-constrained quadratic program. Numerical tests using real-world data on a benchmark feeder demonstrate that nonlinear control rules driven also by a few non-local readings can attain near-optimal performance.

1709.08174 2026-06-04 cs.LG cs.NA math.NA 版本更新

Function approximation with zonal function networks with activation functions analogous to the rectified linear unit functions

基于类似修正线性单元函数的区域函数网络的函数近似

Hrushikesh N. Mhaskar

发表机构 * Institute of Mathematical Sciences, Claremont Graduate University(数学科学研究所,克莱蒙特研究生大学)

AI总结 本文研究了在q维球面上的区域函数网络的近似性质,探讨了非正定激活函数的逼近特性,并建立了相应的光滑性类别和逼近性质。

Comments 18 pages, Title changed from the previous version

详情
AI中文摘要

在q维球面S^q上,区域函数(ZF)网络的形式为x↦∑_{k=1}^n a_kϕ(x·x_k),其中ϕ:[-1,1]→R是激活函数,x_k∈S^q是中心,a_k∈R。尽管正定激活函数的近似性质已被广泛研究,但深度和浅层网络的近期兴趣促使研究类似修正线性单元函数的激活函数形式ϕ(t)=|t|,这些函数不是正定的。本文定义了适当的光滑性类别,并建立了此类网络在该类别中函数的逼近性质。中心可以独立于目标函数选择,系数是训练数据的线性组合。构造保持旋转对称性。

英文摘要

A zonal function (ZF) network on the $q$-dimensional sphere $\mathbb{S}^q$ is a network of the form $\mathbf{x}\mapsto \sum_{k=1}^n a_k\phi(\mathbf{x}\cdot\mathbf{x}_k)$ where $\phi:[-1,1]\to\mathbb{R}$ is the activation function, $\mathbf{x}_k\in\mathbb{S}^q$ are the centers, and $a_k\in\mathbb{R}$. While the approximation properties of such networks are well studied in the context of positive definite activation functions, recent interest in deep and shallow networks motivates the study of activation functions of the form $\phi(t)=|t|$, which are not positive definite. In this paper, we define an appropriate smoothness class and establish approximation properties of such networks for functions in this class. The centers can be chosen independently of the target function, and the coefficients are linear combinations of the training data. The constructions preserve rotational symmetries.
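
The network form in the abstract is short enough to evaluate directly. The sketch below computes the sum of a_k |x · x_k| for unit vectors and checks numerically that rotating both the input and the centers leaves the output unchanged; the centers and coefficients are random placeholders, not the paper's construction.

```python
import numpy as np

def zf_network(x, centers, coeffs):
    """Evaluate sum_k a_k * |x . x_k| with the activation phi(t) = |t|."""
    return float(coeffs @ np.abs(centers @ x))

rng = np.random.default_rng(0)
q, n = 2, 5                              # sphere S^2 in R^3, five centers
centers = rng.normal(size=(n, q + 1))
centers /= np.linalg.norm(centers, axis=1, keepdims=True)
coeffs = rng.normal(size=n)
x = rng.normal(size=q + 1)
x /= np.linalg.norm(x)

R, _ = np.linalg.qr(rng.normal(size=(q + 1, q + 1)))  # random orthogonal map
value = zf_network(x, centers, coeffs)
rotated = zf_network(R @ x, centers @ R.T, coeffs)
```

The output depends on the input only through the inner products x · x_k, which is why `value` and `rotated` agree.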

1807.02297 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Combinatorial Bandits for Incentivizing Agents with Dynamic Preferences

面向动态偏好代理激励的组合老虎机

Tanner Fiez, Shreyas Sekar, Liyuan Zheng, Lillian J. Ratliff

发表机构 * Electrical Engineering Department, University of Washington(华盛顿大学电气工程系)

AI总结 本文提出一种多臂老虎机框架,用于在资源受限环境下匹配用户激励,结合贪心匹配、UCB算法和马尔可夫链混合时间,理论分析 regret 并通过合成和现实案例验证性能。

Comments Published as a conference paper in Conference on Uncertainty in Artificial Intelligence (UAI) 2018

详情
AI中文摘要

个性化激励或推荐设计以提高用户参与度正日益受到重视,随着数字平台提供商不断涌现。我们提出了一种多臂老虎机框架,用于匹配激励给用户,其偏好在事前未知且随时间动态变化,在资源受限环境下。我们设计了一种算法,结合了三个不同领域的思想:(i) 贪心匹配范式,(ii) 用于老虎机的上置信界算法 (UCB),以及 (iii) 马尔可夫链理论中的混合时间。对于该算法,我们提供了关于 regret 的理论界限,并通过合成和现实(如共享单车平台的供需匹配)示例展示了其性能。

英文摘要

The design of personalized incentives or recommendations to improve user engagement is gaining prominence as digital platform providers continually emerge. We propose a multi-armed bandit framework for matching incentives to users, whose preferences are unknown a priori and evolving dynamically in time, in a resource constrained environment. We design an algorithm that combines ideas from three distinct domains: (i) a greedy matching paradigm, (ii) the upper confidence bound algorithm (UCB) for bandits, and (iii) mixing times from the theory of Markov chains. For this algorithm, we provide theoretical bounds on the regret and demonstrate its performance via both synthetic and realistic (matching supply and demand in a bike-sharing platform) examples.
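
Of the three ingredients listed, the upper confidence bound idea is the easiest to show in isolation. The sketch below is the textbook UCB1 index for a plain multi-armed bandit; the paper's algorithm also involves greedy matching and mixing-time arguments, which are not reproduced, and the function names are hypothetical.

```python
import math

def ucb_index(mean_reward, pulls, t):
    """UCB1 index: empirical mean plus an exploration bonus."""
    return mean_reward + math.sqrt(2.0 * math.log(t) / pulls)

def select_arm(means, pulls, t):
    """Play each arm once, then pick the arm with the largest UCB1 index."""
    for arm, n in enumerate(pulls):
        if n == 0:
            return arm
    return max(range(len(means)),
               key=lambda a: ucb_index(means[a], pulls[a], t))

# A rarely tried arm wins despite a lower empirical mean.
choice = select_arm([0.5, 0.45], [1000, 2], t=1002)
```

The bonus shrinks as an arm is pulled more often, so exploration fades for well-understood arms while under-sampled arms keep getting tried.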

1807.00553 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.DS stat.ML 版本更新

A Broader View on Bias in Automated Decision-Making: Reflecting on Epistemology and Dynamics

对自动化决策中偏见的更广泛视角:反思认识论与动态性

Roel Dobbe, Sarah Dean, Thomas Gilbert, Nitin Kohli

发表机构 * Department of Electrical Engineering and Computer Sciences, University of California Berkeley, USA(加州大学伯克利分校电气工程与计算机科学系) Department of Rhetoric, University of California Berkeley, USA(加州大学伯克利分校修辞学系) School of Information, University of California Berkeley, USA(加州大学伯克利分校信息学院)

AI总结 本文探讨自动化决策中偏见的根源,将技术偏见视为认识论问题,新兴偏见视为动态反馈现象,强调需反思认识论并采用价值敏感设计方法改进决策系统。

Comments Presented at the 2018 Workshop on Fairness, Accountability and Transparency in Machine Learning during ICML 2018, Stockholm, Sweden

详情
AI中文摘要

机器学习(ML)正日益应用于现实世界,提供可操作见解并成为自动化决策系统的基础。尽管训练数据中固有的偏见是公平性讨论的核心问题,但这些系统也受到技术性和新兴偏见的影响,后者常作为实施中的上下文特定产物出现。本文将技术偏见视为认识论问题,新兴偏见视为动态反馈现象。为激发关于如何改变机器学习实践以有效应对这些问题的讨论,本文探索了偏见的更广泛视角,强调反思认识论的必要性,并指出价值敏感设计方法以重新审视自动化决策系统的设计和实施过程。

英文摘要

Machine learning (ML) is increasingly deployed in real world contexts, supplying actionable insights and forming the basis of automated decision-making systems. While issues resulting from biases pre-existing in training data have been at the center of the fairness debate, these systems are also affected by technical and emergent biases, which often arise as context-specific artifacts of implementation. This position paper interprets technical bias as an epistemological problem and emergent bias as a dynamical feedback phenomenon. In order to stimulate debate on how to change machine learning practice to effectively address these issues, we explore this broader view on bias, stress the need to reflect on epistemology, and point to value-sensitive design methodologies to revisit the design and implementation process of automated decision-making systems.

1709.01268 2026-06-04 cs.CE cs.LG cs.NA math.NA q-fin.TR 版本更新

Tensor Representation in High-Frequency Financial Data for Price Change Prediction

高频金融数据中的张量表示用于价格变动预测

Dat Thanh Tran, Martin Magris, Juho Kanniainen, Moncef Gabbouj, Alexandros Iosifidis

发表机构 * 1 Laboratory of Signal Processing, Tampere University of Technology, Tampere, Finland 2 Laboratory of Industrial

AI总结 本文研究了张量多线性方法在中价预测中的有效性,通过大规模数据集实验表明,张量表示优于向量方法及其他方法。

Comments accepted in SSCI 2017, typos fixed

详情
Journal ref
IEEE Symposium Series on Computational Intelligence (SSCI), 2017
AI中文摘要

如今,随着大量交易数据的可用性,金融市场的动态既是对高频交易者的一种挑战,也是一种机会。为了利用高频交易中资产快速微妙的变动,必须有自动算法来分析和检测基于交易记录的价格变动模式。金融数据的多通道时间序列表示自然地建议了基于张量的学习算法。在本工作中,我们研究了两种多线性方法在中价预测问题中的有效性,与其他现有方法相比。在包含超过400万笔限价订单的大型数据集上的实验表明,通过利用张量表示,多线性模型优于向量方法和其他竞争方法。

英文摘要

Nowadays, with the availability of massive amounts of collected trade data, the dynamics of the financial markets pose both a challenge and an opportunity for high-frequency traders. In order to take advantage of the rapid, subtle movement of assets in High Frequency Trading (HFT), an automatic algorithm to analyze and detect patterns of price change based on transaction records must be available. The multichannel, time-series representation of financial data naturally suggests tensor-based learning algorithms. In this work, we investigate the effectiveness of two multilinear methods for the mid-price prediction problem against other existing methods. The experiments on a large-scale dataset which contains more than 4 million limit orders show that by utilizing tensor representation, multilinear models outperform vector-based approaches and other competing ones.

1806.09919 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Tangent-Space Regularization for Neural-Network Models of Dynamical Systems

动力系统神经网络模型的切空间正则化

Fredrik Bagge Carlson, Rolf Johansson, Anders Robertsson

发表机构 * LCCC Linnaeus Center(LCCC 林纳尤斯中心)

AI总结 本文提出神经网络动力系统模型的切空间正则化方法,通过利用动力学函数的切空间特性,改进模型雅可比矩阵的正则化,减少对大量训练数据的依赖,并探讨不同网络架构对输入输出雅可比矩阵学习能力及L2正则化对系统稳定性的影响。

详情
AI中文摘要

本文介绍了神经网络动力系统模型中的切空间正则化概念。许多物理系统在控制应用中的动力学函数的切空间表现出有用性质,例如光滑性,这促使通过假设动力学的切空间来沿系统轨迹正则化模型雅可比矩阵。在没有假设的情况下,神经网络需要大量训练数据才能学习完整的非线性动力学而不过拟合。本文比较了不同网络架构在一步预测和模拟性能上的表现,并研究了不同架构学习具有正确输入输出雅可比矩阵的倾向。此外,探讨了L2权重正则化对学习雅可比特征值谱以及系统稳定性的影响。

英文摘要

This work introduces the concept of tangent space regularization for neural-network models of dynamical systems. The tangent space to the dynamics function of many physical systems of interest in control applications exhibits useful properties, e.g., smoothness, motivating regularization of the model Jacobian along system trajectories using assumptions on the tangent space of the dynamics. Without assumptions, large amounts of training data are required for a neural network to learn the full non-linear dynamics without overfitting. We compare different network architectures on one-step prediction and simulation performance and investigate the propensity of different architectures to learn models with correct input-output Jacobian. Furthermore, the influence of $L_2$ weight regularization on the learned Jacobian eigenvalue spectrum, and hence system stability, is investigated.

1806.09620 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

A DCA-Like Algorithm and its Accelerated Version with Application in Data Visualization

一种类似DCA的算法及其加速版本在数据可视化中的应用

Hoai An Le Thi, Hoai Minh Le, Duy Nhat Phan, Bach Tran

发表机构 * Department Informatics and Application(信息与应用系) LGIPM University of Lorraine(洛林大学) France(法国)

AI总结 本文提出两种DCA变体,旨在加速约束下可微函数和复合函数的最小化问题。通过引入新的分解技术改进DCA,进而结合Nesterov加速技术得到加速DCA。算法在Kurdyka-Lojasiewicz假设下的收敛性被严格研究,并应用于t-分布随机邻居嵌入。

详情
AI中文摘要

本文提出两种DCA变体,旨在加速约束下可微函数和复合函数的最小化问题。在第一种变体DCA-Like中,我们引入了一种新的技术来迭代修改目标函数的分解。这种连续分解可以导致更好的主导性,从而比基本DCA有更快的收敛速度。然后,我们将Nesterov的加速技术纳入DCA-Like中,得到第二种变体,称为加速DCA-Like。两种变体的收敛性质和在Kurdyka-Lojasiewicz假设下的收敛率被严格研究。作为应用,我们研究了我们的算法用于t-分布随机邻居嵌入。在几个基准数据集上的数值实验展示了我们算法的有效性。

英文摘要

In this paper, we present two variants of DCA (Difference of Convex functions Algorithm) to solve the constrained problem of minimizing the sum of a differentiable function and composite functions, with the aim of increasing the convergence speed of DCA. In the first variant, DCA-Like, we introduce a new technique to iteratively modify the decomposition of the objective function. This successive decomposition could lead to a better majorization and consequently a better convergence speed than the basic DCA. We then incorporate Nesterov's acceleration technique into DCA-Like to give rise to the second variant, named Accelerated DCA-Like. The convergence properties and the convergence rate of both variants under the Kurdyka-Łojasiewicz assumption are rigorously studied. As an application, we investigate our algorithms for the t-distributed stochastic neighbor embedding. Numerical experiments on several benchmark datasets illustrate the efficiency of our algorithms.
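
For orientation, the basic DCA that both variants build on can be sketched on a one-dimensional toy problem. Writing the objective as f = g − h with g and h convex, each iteration linearizes h at the current point and minimizes the resulting convex surrogate. The example f(x) = x⁴/4 − x² and the helper names are assumptions for illustration; neither DCA-Like nor its accelerated version is implemented here.

```python
import numpy as np

def dca(x0, grad_h, argmin_g_minus_linear, iters=50):
    """Basic DCA for f = g - h: x_{k+1} = argmin_x g(x) - grad_h(x_k) * x."""
    x = x0
    for _ in range(iters):
        y = grad_h(x)                  # linearize the concave part -h at x_k
        x = argmin_g_minus_linear(y)   # solve the convex subproblem
    return x

# Toy DC decomposition: g(x) = x**4 / 4 and h(x) = x**2.
grad_h = lambda x: 2.0 * x
# argmin_x x**4/4 - y*x satisfies x**3 = y, so x = cbrt(y).
argmin_g_minus_linear = lambda y: float(np.cbrt(y))

x_star = dca(0.5, grad_h, argmin_g_minus_linear)
```

The iterates converge to the critical point x = √2, where f'(x) = x³ − 2x vanishes. How quickly this happens depends on how tightly the surrogate majorizes f, which is the quantity the DCA-Like variant aims to improve.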

1806.05419 2026-06-04 stat.ML cs.LG cs.NA math.NA math.ST stat.TH 版本更新

Ranking Recovery from Limited Comparisons using Low-Rank Matrix Completion

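Tal Levy, Alireza Vahid, Raja Giryes

Affiliations School of Electrical Engineering, Tel-Aviv University; Electrical Engineering Department, University of Colorado Denver

AI summary Applies low-rank matrix completion to the classical rank aggregation problem: partial, noisy pairwise comparisons are arranged in a matrix, and alternating minimization combined with maximum likelihood estimation reconstructs the underlying preference intensities.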

Comments 10 Pages, 9 figures. A prediction table for 2018 FIFA soccer world cup is included

Abstract

This paper proposes a new method for solving the well-known rank aggregation problem from pairwise comparisons using the method of low-rank matrix completion. The partial and noisy data of pairwise comparisons is transformed into a matrix form. We then use tools from matrix completion, which has served as a major component in the low-rank completion solution of the Netflix challenge, to construct the preference of the different objects. In our approach, the data of multiple comparisons is used to create an estimate of the probability of object i to win (or be chosen) over object j, where only a partial set of comparisons between N objects is known. The data is then transformed into a matrix form for which the noiseless solution has a known rank of one. An alternating minimization algorithm, in which the target matrix takes a bilinear form, is then used in combination with maximum likelihood estimation for both factors. The reconstructed matrix is used to obtain the true underlying preference intensity. This work demonstrates the improvement of our proposed algorithm over the current state-of-the-art in both simulated scenarios and real data.
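
One way a rank-one matrix can arise from pairwise win probabilities is sketched below. The abstract does not say which transformation the authors use, so the Bradley-Terry-Luce-style model P[i, j] = w[i] / (w[i] + w[j]), the odds transform, and the function names are all assumptions made for illustration.

```python
import numpy as np

def win_probabilities(w):
    """Assumed model: object i beats object j with probability w_i / (w_i + w_j)."""
    w = np.asarray(w, dtype=float)
    return w[:, None] / (w[:, None] + w[None, :])

def odds_matrix(P):
    """Entrywise odds P / (1 - P); under the assumed model this equals w_i / w_j."""
    return P / (1.0 - P)

w = np.array([1.0, 2.0, 4.0, 8.0])
M = odds_matrix(win_probabilities(w))
print(np.linalg.matrix_rank(M))  # the noiseless odds matrix is an outer product
print(M[:, 0] * w[0])            # any column recovers the scores up to scale
```

With only some entries observed and with noise in the estimated probabilities, recovering such a matrix becomes a rank-one completion problem, which is the setting the abstract describes.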

1806.04830 2026-06-04 math.NA cs.LG cs.NA version update

Deep Multiscale Model Learning

Yating Wang, Siu Wun Cheung, Eric T. Chung, Yalchin Efendiev, Min Wang

Affiliations Department of Mathematics, Texas A&M University; Department of Mathematics, The Chinese University of Hong Kong; Department of Mathematics & Institute for Scientific Computation (ISC), Texas A&M University

AI summary Combines deep learning with local multiscale model reduction, using observed data and physical modeling concepts to improve predictions in multiscale flow simulations.

Abstract

The objective of this paper is to design novel multi-layer neural network architectures for multiscale simulations of flows, taking into account the observed data and physical modeling concepts. Our approaches use deep learning concepts combined with local multiscale model reduction methodologies to predict flow dynamics. Using reduced-order model concepts is important for constructing robust deep learning architectures since the reduced-order models provide fewer degrees of freedom. Flow dynamics can be thought of as multi-layer networks. More precisely, the solution (e.g., pressures and saturations) at the time instant $n+1$ depends on the solution at the time instant $n$ and input parameters, such as permeability fields, forcing terms, and initial conditions. One can regard the solution as a multi-layer network, where each layer, in general, is a nonlinear forward map and the number of layers relates to the internal time steps. We will rely on rigorous model reduction concepts to define unknowns and connections for each layer. In each layer, our reduced-order models will provide a forward map, which will be modified ("trained") using available data. It is critical to use reduced-order models for this purpose, since they identify the regions of influence and the appropriate number of variables. Because of the lack of available data, the training will be supplemented with computational data as needed, and we will interpolate between data-rich and data-deficient models. We will also use deep learning algorithms to train the elements of the reduced model discrete system. We will present the main ingredients of our approach and numerical results. Numerical results show that, using deep learning and multiscale models, we can improve the forward models, which are conditioned on the available data.

1806.04167 2026-06-04 eess.SY cs.LG cs.SY version update

Learning an Approximate Model Predictive Controller with Guarantees

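Michael Hertneck, Johannes Köhler, Sebastian Trimpe, Frank Allgöwer

Affiliations University of Stuttgart

AI summary Proposes a supervised learning framework that approximates a model predictive controller (MPC) at reduced computational complexity while guaranteeing stability and constraint satisfaction. Closed-loop guarantees for the learned MPC come from combining a robust MPC design with statistical learning bounds.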

Comments 6 pages, 3 figures, to appear in IEEE Control Systems Letters

Abstract

A supervised learning framework is proposed to approximate a model predictive controller (MPC) with reduced computational complexity and guarantees on stability and constraint satisfaction. The framework can be used for a wide class of nonlinear systems. Any standard supervised learning technique (e.g. neural networks) can be employed to approximate the MPC from samples. In order to obtain closed-loop guarantees for the learned MPC, a robust MPC design is combined with statistical learning bounds. The MPC design ensures robustness to inaccurate inputs within given bounds, and Hoeffding's Inequality is used to validate that the learned MPC satisfies these bounds with high confidence. The result is a closed-loop statistical guarantee on stability and constraint satisfaction for the learned MPC. The proposed learning-based MPC framework is illustrated on a nonlinear benchmark problem, for which we learn a neural network controller with guarantees.
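
The role of Hoeffding's Inequality in such a validation step can be illustrated with a small calculation. The sketch assumes i.i.d. validation samples, each scored 1 if the learned controller's input error stays within the bound tolerated by the robust MPC design and 0 otherwise; this scoring scheme and the function names are assumptions, not the paper's exact procedure.

```python
import math

def hoeffding_samples(eps, delta):
    """Smallest N with exp(-2 * N * eps**2) <= delta (one-sided Hoeffding bound)."""
    return math.ceil(math.log(1.0 / delta) / (2.0 * eps ** 2))

def hoeffding_lower_bound(empirical_rate, n, delta):
    """Lower bound on the true success rate that holds with confidence 1 - delta."""
    eps = math.sqrt(math.log(1.0 / delta) / (2.0 * n))
    return empirical_rate - eps

n = hoeffding_samples(eps=0.01, delta=0.01)
print(n, hoeffding_lower_bound(1.0, n, delta=0.01))
```

Read this way, if all N validation samples satisfy the error bound, the true probability of satisfying it is at least 1 - eps with confidence 1 - delta.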

1806.03145 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

Fidelity-based Probabilistic Q-learning for Control of Quantum Systems

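Chunlin Chen, Daoyi Dong, Han-Xiong Li, Jian Chu, Tzyh-Jong Tarn

AI summary Presents fidelity-based probabilistic Q-learning (FPQL) to balance exploration and exploitation in reinforcement learning and applies it to quantum system control; iteratively updating the action selection probabilities gives a natural exploration strategy and speeds up learning.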

Comments 13 pages, 16 figures

Journal ref
IEEE Transactions on Neural Networks and Learning Systems, vol. 25, no. 5, pp. 920-933, May 2014

Abstract

The balance between exploration and exploitation is a key problem for reinforcement learning methods, especially for Q-learning. In this paper, a fidelity-based probabilistic Q-learning (FPQL) approach is presented to naturally solve this problem and is applied to learning control of quantum systems. In this approach, fidelity is adopted to help direct the learning process, and the probability of each action being selected at a certain state is updated iteratively along with the learning process, which leads to a natural exploration strategy instead of a pointed one with configured parameters. A probabilistic Q-learning (PQL) algorithm is first presented to demonstrate the basic idea of probabilistic action selection. Then the FPQL algorithm is presented for learning control of quantum systems. Two examples (a spin-1/2 system and a lambda-type atomic system) are used to test the performance of the FPQL algorithm. The results show that FPQL algorithms attain a better balance between exploration and exploitation, and can also avoid locally optimal policies and accelerate the learning process.

1806.02499 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

Conditional probability calculation using restricted Boltzmann machine with application to system identification

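Erick de la Rosa, Wen Yu

Affiliations Departamento de Control Automatico, CINVESTAV-IPN (National Polytechnic Institute)

AI summary Uses a restricted Boltzmann machine (RBM) to compute conditional probabilities for nonlinear system identification, discusses binary-encoding and continuous-valued variants, gives a universal approximation analysis, and reports an advantage when noise is large and the system dynamics are complex.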

Abstract

A probabilistic approach to nonlinear system identification has several advantages: noise and outliers in the data set do not affect the probability models significantly, and the input features can be extracted in probabilistic form. The biggest obstacle for probabilistic models is that the probability distributions are not easy to obtain. In this paper, we cast nonlinear system identification as the problem of computing a conditional probability. We then modify the restricted Boltzmann machine (RBM) so that the joint probability, the input distribution, and the conditional probability can be calculated through RBM training. Binary-encoding and continuous-valued methods are discussed. A universal approximation analysis for conditional-probability-based modelling is presented. We use two benchmark nonlinear systems to compare our probabilistic modelling method with other black-box modelling methods. The results show that the new method performs much better when the noise is large and the system dynamics are complex.

1709.10441 2026-06-04 cs.LG cs.NA math.NA version update

A representer theorem for deep kernel learning

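Bastian Bohn, Michael Griebel, Christian Rieger

Affiliations Institute for Numerical Simulation, University of Bonn; Fraunhofer Institute for Algorithms and Scientific Computing SCAI

AI summary Provides finite-sample and infinite-sample representer theorems for concatenations of kernel functions in deep kernel learning, giving a mathematical foundation for analyzing learning algorithms based on function composition; also shows how concatenated learning problems can be reformulated as neural networks and how the theorem applies to state-of-the-art deep learning methods.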

Abstract

In this paper we provide a finite-sample and an infinite-sample representer theorem for the concatenation of (linear combinations of) kernel functions of reproducing kernel Hilbert spaces. These results serve as mathematical foundation for the analysis of machine learning algorithms based on compositions of functions. As a direct consequence in the finite-sample case, the corresponding infinite-dimensional minimization problems can be recast into (nonlinear) finite-dimensional minimization problems, which can be tackled with nonlinear optimization algorithms. Moreover, we show how concatenated machine learning problems can be reformulated as neural networks and how our representer theorem applies to a broad class of state-of-the-art deep learning methods.

1806.01678 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

A Projection Method for Metric-Constrained Optimization

Nate Veldt, David Gleich, Anthony Wirth, James Saunderson

Affiliations Purdue University, Mathematics Department; Purdue University, Computer Science Department; The University of Melbourne, Computing and Information Systems School; Monash University, Department of Electrical and Computer Systems Engineering

AI summary Proposes a new approach to metric-constrained optimization: a generalized and improved projection algorithm solves very large optimization problems arising in graph clustering, with new approximation guarantees.

Abstract

We outline a new approach for solving optimization problems which enforce triangle inequalities on output variables. We refer to this as metric-constrained optimization, and give several examples where problems of this form arise in machine learning applications and theoretical approximation algorithms for graph clustering. Although these problems are interesting from a theoretical perspective, they are challenging to solve in practice due to the high memory requirement of black-box solvers. In order to address this challenge, we first prove that the metric-constrained linear program relaxation of correlation clustering is equivalent to a special case of the metric nearness problem. We then develop a general solver for metric-constrained linear and quadratic programs by generalizing and improving a simple projection algorithm originally developed for metric nearness. We give several novel approximation guarantees for using our framework to find lower bounds for optimal solutions to several challenging graph clustering problems. We also demonstrate the power of our framework by solving optimization problems involving up to $10^{8}$ variables and $10^{11}$ constraints.
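
The building block of projection methods for triangle-inequality constraints is the Euclidean projection onto a single constraint, which touches only three variables at a time and therefore needs very little memory. The sketch below cycles through all constraints of a small symmetric matrix. It is a plain cyclic-projection illustration with hypothetical function names: it reaches a feasible point but, lacking the correction terms a Dykstra-style method would carry, not necessarily the nearest one, and it is not the authors' solver.

```python
import itertools
import numpy as np

def project_triangle(x, i, j, k):
    """Project onto {x[i, j] <= x[i, k] + x[j, k]}, keeping x symmetric."""
    violation = x[i, j] - x[i, k] - x[j, k]
    if violation > 0:
        step = violation / 3.0  # constraint normal (1, -1, -1) has squared norm 3
        for (p, q), sign in (((i, j), -1.0), ((i, k), 1.0), ((j, k), 1.0)):
            x[p, q] += sign * step
            x[q, p] = x[p, q]

def max_violation(x):
    """Largest violation of any triangle inequality in x."""
    n = x.shape[0]
    return max(x[i, j] - x[i, k] - x[j, k]
               for i, j, k in itertools.permutations(range(n), 3))

def cyclic_projections(d, sweeps=200):
    """Repeatedly visit every triangle constraint and project onto it."""
    x = np.array(d, dtype=float)
    n = x.shape[0]
    for _ in range(sweeps):
        for i, j, k in itertools.permutations(range(n), 3):
            if i < j:
                project_triangle(x, i, j, k)
    return x

d = [[0, 10, 1], [10, 0, 1], [1, 1, 0]]
x = cyclic_projections(d)
print(x, max_violation(x))
```

Each projection only reads and writes three entries, which suggests why methods of this kind scale to problems where forming the full constraint matrix for a black-box solver is infeasible.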

1806.01003 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

Distributed Learning from Interactions in Social Networks

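Francesco Sasso, Angelo Coluccia, Giuseppe Notarstefano

Affiliations European Research Council (ERC)

AI summary Proposes a distributed learning framework based on interactions in social networks; using a Bayesian approach, maximum likelihood estimation, and tools from graphical models, it obtains a distributed parameter-hyperparameter estimator, demonstrated on user profiling.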

Comments This submission is a shorter work (for conference publication) of a more comprehensive paper, already submitted as arXiv:1706.04081 (under review for journal publication). In this short submission only one social set-up is considered and only one of the relaxed estimators is proposed. Moreover, the exhaustive analysis, carried out in the longer manuscript, is completely missing in this version

Abstract

We consider a network scenario in which agents can evaluate each other according to a score graph that models some interactions. The goal is to design a distributed protocol, run by the agents, that allows them to learn their unknown state among a finite set of possible values. We propose a Bayesian framework in which scores and states are associated to probabilistic events with unknown parameters and hyperparameters, respectively. We show that each agent can learn its state by means of a local Bayesian classifier and a (centralized) Maximum-Likelihood (ML) estimator of parameter-hyperparameter that combines plain ML and Empirical Bayes approaches. By using tools from graphical models, which allow us to gain insight on conditional dependencies of scores and states, we provide a relaxed probabilistic model that ultimately leads to a parameter-hyperparameter estimator amenable to distributed computation. To highlight the appropriateness of the proposed relaxation, we demonstrate the distributed estimators on a social interaction set-up for user profiling.

1806.00589 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML version update

Efficient Entropy for Policy Gradient with Multidimensional Action Space

Yiming Zhang, Quan Ho Vuong, Kenny Song, Xiao-Yue Gong, Keith W. Ross

Affiliations New York University; New York University Abu Dhabi; New York University Shanghai; Massachusetts Institute of Technology

AI summary Proposes efficient unbiased estimators of the entropy bonus and its gradient for policy gradient methods with high-dimensional action spaces, and validates them on a multi-hunter multi-rabbit grid game and a multi-agent multi-arm bandit problem.

Abstract

In recent years, deep reinforcement learning has been shown to be adept at solving sequential decision processes with high-dimensional state spaces such as in the Atari games. Many reinforcement learning problems, however, involve high-dimensional discrete action spaces as well as high-dimensional state spaces. This paper considers entropy bonus, which is used to encourage exploration in policy gradient. In the case of high-dimensional action spaces, calculating the entropy and its gradient requires enumerating all the actions in the action space and running forward and backpropagation for each action, which may be computationally infeasible. We develop several novel unbiased estimators for the entropy bonus and its gradient. We apply these estimators to several models for the parameterized policies, including Independent Sampling, CommNet, Autoregressive with Modified MDP, and Autoregressive with LSTM. Finally, we test our algorithms on two environments: a multi-hunter multi-rabbit grid game and a multi-agent multi-arm bandit problem. The results show that our entropy estimators substantially improve performance with marginal additional computational cost.
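
The cost problem and the sampling idea can be seen in the simplest case, a policy whose action dimensions are sampled independently. The sketch below compares exact entropy by enumeration (exponential in the number of dimensions) with a sample-based estimate that averages the negative log-probability of sampled actions, which is unbiased for the entropy. It does not reproduce the specific estimators developed in the paper; the function names and the toy policy are assumptions.

```python
import itertools
import numpy as np

def exact_entropy(probs_per_dim):
    """Entropy by enumerating every joint action: cost grows exponentially."""
    total = 0.0
    for action in itertools.product(*[range(len(p)) for p in probs_per_dim]):
        p = np.prod([probs_per_dim[d][a] for d, a in enumerate(action)])
        total -= p * np.log(p)
    return total

def sampled_entropy(probs_per_dim, n_samples, rng):
    """Monte Carlo estimate: average of -log pi(a) over sampled joint actions."""
    log_p = np.zeros(n_samples)
    for p in probs_per_dim:
        a = rng.choice(len(p), size=n_samples, p=p)
        log_p += np.log(p[a])
    return -log_p.mean()

def factorized_entropy(probs_per_dim):
    """For independent dimensions the entropy is the sum of marginal entropies."""
    return sum(-(p * np.log(p)).sum() for p in probs_per_dim)

probs = [np.array([0.5, 0.5]), np.array([0.2, 0.3, 0.5]), np.array([0.9, 0.1])]
rng = np.random.default_rng(0)
print(exact_entropy(probs), factorized_entropy(probs), sampled_entropy(probs, 200_000, rng))
```

For autoregressive policies the dimensions are no longer independent, so the factorized shortcut does not apply and sample-based estimators become the practical route.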

1709.05746 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY version update

Adversarial Discriminative Sim-to-real Transfer of Visuo-motor Policies

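Fangyi Zhang, Jürgen Leitner, Zongyuan Ge, Michael Milford, Peter Corke

Affiliations Australian Centre for Robotic Vision (ACRV); Queensland University of Technology (QUT); Monash University

AI summary Proposes an adversarial discriminative sim-to-real transfer method that reduces the cost of labelling real data. On a table-top object reaching task in which a 7 DoF arm reaches a blue cuboid in clutter from visual observations, only 93 labelled and 186 unlabelled real images are needed to achieve a 97.8% success rate and 1.8 cm control accuracy.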

Comments Under review for the International Journal of Robotics Research

Abstract

Various approaches have been proposed to learn visuo-motor policies for real-world robotic applications. One solution is first learning in simulation then transferring to the real world. In the transfer, most existing approaches need real-world images with labels. However, the labelling process is often expensive or even impractical in many robotic applications. In this paper, we propose an adversarial discriminative sim-to-real transfer approach to reduce the cost of labelling real data. The effectiveness of the approach is demonstrated with modular networks in a table-top object reaching task where a 7 DoF arm is controlled in velocity mode to reach a blue cuboid in clutter through visual observations. The adversarial transfer approach reduced the labelled real data requirement by 50%. Policies can be transferred to real environments with only 93 labelled and 186 unlabelled real images. The transferred visuo-motor policies are robust to novel (not seen in training) objects in clutter and even a moving target, achieving a 97.8% success rate and 1.8 cm control accuracy.

1805.10638 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Fast K-Means Clustering with Anderson Acceleration

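Juyong Zhang, Yuxin Yao, Yue Peng, Hao Yu, Bailin Deng

Affiliations University of Science and Technology of China; Cardiff University

AI summary Proposes a new way to accelerate Lloyd's algorithm for K-Means clustering: the assignment and update steps are treated as a fixed-point iteration to which Anderson acceleration is applied, with the parameter m adjusted dynamically for robust and consistent speedups.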

Abstract

We propose a novel method to accelerate Lloyd's algorithm for K-Means clustering. Unlike previous acceleration approaches that reduce the computational cost per iteration or improve initialization, our approach is focused on reducing the number of iterations required for convergence. This is achieved by treating the assignment step and the update step of Lloyd's algorithm as a fixed-point iteration, and applying Anderson acceleration, a well-established technique for accelerating fixed-point solvers. Classical Anderson acceleration utilizes m previous iterates to find an accelerated iterate, and its performance on K-Means clustering can be sensitive to the choice of m and the distribution of samples. We propose a new strategy to dynamically adjust the value of m, which achieves robust and consistent speedups across different problem instances. Our method complements existing acceleration techniques, and can be combined with them to achieve state-of-the-art performance. We perform extensive experiments to evaluate the performance of the proposed method, where it outperforms other algorithms in 106 out of 120 test cases, and the mean decrease ratio of computational time is more than 33%.
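
The two ingredients named in the abstract can be sketched separately: one Lloyd step written as a map from centroids to centroids, and a generic Anderson-accelerated fixed-point solver with a fixed window m. The paper's dynamic adjustment of m and any safeguards that keep the clustering energy from increasing are not reproduced here; the function names and the fixed-window variant are assumptions.

```python
import numpy as np

def lloyd_step(centroids, X):
    """One assignment + update step; centroids has shape (k, dim), X is (n, dim)."""
    d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    new = centroids.copy()
    for j in range(len(centroids)):
        if np.any(labels == j):          # keep an empty cluster's centroid unchanged
            new[j] = X[labels == j].mean(axis=0)
    return new

def anderson(g, x0, m=3, tol=1e-10, max_iter=100):
    """Anderson acceleration of x <- g(x) using at most m stored differences."""
    x = np.asarray(x0, dtype=float)
    gx = g(x)
    f = gx - x                            # fixed-point residual
    dG, dF = [], []
    for _ in range(max_iter):
        if np.linalg.norm(f) < tol:
            break
        if dF:
            F, G = np.column_stack(dF), np.column_stack(dG)
            gamma, *_ = np.linalg.lstsq(F, f, rcond=None)
            x_new = gx - G @ gamma        # extrapolated iterate
        else:
            x_new = gx                    # plain fixed-point step to start
        gx_new = g(x_new)
        f_new = gx_new - x_new
        dG.append(gx_new - gx)
        dF.append(f_new - f)
        dG, dF = dG[-m:], dF[-m:]
        x, gx, f = x_new, gx_new, f_new
    return x

print(anderson(np.cos, np.array([1.0]), m=1))
```

To accelerate K-Means with this sketch, one would flatten the centroids and pass `lambda c: lloyd_step(c.reshape(k, dim), X).ravel()` as `g`. Because the Lloyd map is piecewise constant in the assignments, an extrapolated iterate can be worse than the plain step, which is one reason a practical method needs a fallback and a careful choice of m.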

1710.01493 2026-06-04 cs.LG cs.CV cs.NA math.NA math.OC version update

Image Labeling Based on Graphical Models Using Wasserstein Messages and Geometric Assignment

Ruben Hühnerbein, Fabrizio Savarino, Freddie Åström, Christoph Schnörr

Affiliations Image and Pattern Analysis Group, Heidelberg University, Germany; Heidelberg Collaboratory for Image Processing, Heidelberg University, Germany

AI summary Introduces a new approach to maximum a posteriori inference in discrete graphical models that uses local Wasserstein distances to smoothly approximate the objective and yields rapidly converging iterations that run in parallel over edges.

Abstract

We introduce a novel approach to Maximum A Posteriori inference based on discrete graphical models. By utilizing local Wasserstein distances for coupling assignment measures across edges of the underlying graph, a given discrete objective function is smoothly approximated and restricted to the assignment manifold. A corresponding multiplicative update scheme combines in a single process (i) geometric integration of the resulting Riemannian gradient flow and (ii) rounding to integral solutions that represent valid labelings. Throughout this process, local marginalization constraints known from the established LP relaxation are satisfied, whereas the smooth geometric setting results in rapidly converging iterations that can be carried out in parallel for every edge.

1805.09613 2026-06-04 stat.ML cs.AI cs.LG cs.RO cs.SY eess.SY version update

A0C: Alpha Zero in Continuous Action Space

Thomas M. Moerland, Joost Broekens, Aske Plaat, Catholijn M. Jonker

Affiliations Dep. of Computer Science, Delft University of Technology, The Netherlands; Dep. of Computer Science, Leiden University, The Netherlands

AI summary Presents the theoretical extensions needed to apply Alpha Zero in continuous action spaces and shows feasibility in preliminary experiments on the Pendulum swing-up task, a first step toward iterated search and learning in continuous action domains.

Abstract

A core novelty of Alpha Zero is the interleaving of tree search and deep learning, which has proven very successful in board games like Chess, Shogi and Go. These games have a discrete action space. However, many real-world reinforcement learning domains have continuous action spaces, for example in robotic control, navigation and self-driving cars. This paper presents the necessary theoretical extensions of Alpha Zero to deal with continuous action space. We also provide some preliminary experiments on the Pendulum swing-up task, empirically showing the feasibility of our approach. Thereby, this work provides a first step towards the application of iterated search and learning in domains with a continuous action space.

1805.09464 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA math.OC stat.ML version update

Simple and practical algorithms for $\ell_p$-norm low-rank approximation

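Anastasios Kyrillidis

Affiliations IBM T.J. Watson Research Center; Rice University

AI summary Proposes non-convex, gradient-based algorithms for entrywise $\ell_p$-norm low-rank approximation with $p = 1$ or $p = \infty$. They are easy to implement and typically reach better approximations faster than the state of the art, and they can provably attain $(1 + \varepsilon)$-OPT approximations, provided the algorithm's hyperparameters are known or can be approximated.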

Comments 16 pages, 11 figures, to appear in UAI 2018

Abstract

We propose practical algorithms for entrywise $\ell_p$-norm low-rank approximation, for $p = 1$ or $p = \infty$. The proposed framework, which is non-convex and gradient-based, is easy to implement and typically attains better approximations, faster, than the state of the art. From a theoretical standpoint, we show that the proposed scheme can attain $(1 + \varepsilon)$-OPT approximations. Our algorithms are not hyperparameter-free: they achieve the desiderata only assuming the algorithm's hyperparameters are known a priori, or are at least approximable. That is, our theory indicates which problem quantities need to be known in order to get a good solution within polynomial time, and does not contradict recent inapproximability results, as in [46].

1805.08468 2026-06-04 math.NA cs.LG cs.NA version update

Rank Minimization on Tensor Ring: A New Paradigm in Scalable Tensor Decomposition and Completion

Longhao Yuan, Chao Li, Danilo Mandic, Jianting Cao, Qibin Zhao

Affiliations Graduate School of Engineering, Saitama Institute of Technology, Japan; Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan; School of Automation, Guangdong University of Technology, China; School of Computer Science and Technology, Hangzhou Dianzi University, China; Department of Electrical and Electronic Engineering, Imperial College London, United Kingdom

AI summary Proposes rank minimization on tensor ring (TR) factors for tensor completion: convex surrogates of the low-rank assumption on the latent TR factors address the high computational cost and sensitivity to model complexity of traditional methods. Two algorithms apply different structured Schatten norms to the TR factors, and experiments show high performance and efficiency.

Abstract

In low-rank tensor completion tasks, traditional methods suffer from high computational cost and high sensitivity to model complexity, owing to the multiple large-scale singular value decomposition (SVD) operations they require and to the rank selection problem. In this paper, taking advantage of the high compressibility of the recently proposed tensor ring (TR) decomposition, we propose a new model for the tensor completion problem. This is achieved by introducing convex surrogates of the tensor low-rank assumption on the latent tensor ring factors, which makes it possible to solve Schatten-norm-regularized models at a much smaller scale. We propose two algorithms, which apply different structured Schatten norms to the tensor ring factors. Using the alternating direction method of multipliers (ADMM) scheme, the tensor ring factors and the predicted tensor can be optimized simultaneously. Experiments on synthetic and real-world data show the high performance and efficiency of the proposed approach.
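
To make the tensor ring format concrete, the sketch below rebuilds a full tensor from its TR factors: each entry is the trace of a product of matrix slices, one slice per mode. The core shapes, the helper name, and the random test data are illustrative assumptions; the completion model and the Schatten-norm regularizers of the paper are not shown.

```python
import numpy as np

def tr_to_full(cores):
    """Rebuild a tensor from tensor ring cores.

    cores[k] has shape (r_k, n_k, r_{k+1}), and the last rank equals the first,
    so that the chain of factors closes into a ring.
    """
    full = cores[0]
    for core in cores[1:]:
        # contract the trailing rank index with the next core's leading rank index
        full = np.tensordot(full, core, axes=([full.ndim - 1], [0]))
    # full now has shape (r_0, n_0, ..., n_{d-1}, r_0); the trace closes the ring
    return np.trace(full, axis1=0, axis2=full.ndim - 1)

rng = np.random.default_rng(0)
cores = [rng.standard_normal(s) for s in [(2, 3, 4), (4, 5, 3), (3, 2, 2)]]
T = tr_to_full(cores)
print(T.shape, sum(c.size for c in cores), T.size)
```

In this toy case the full tensor is too small for the cores to save storage, but the core sizes grow linearly with the number of modes while the full tensor grows exponentially, which is the compressibility the abstract refers to.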

1805.08095 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML version update

Small steps and giant leaps: Minimal Newton solvers for Deep Learning

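João F. Henriques, Sebastien Ehrhardt, Samuel Albanie, Andrea Vedaldi

Affiliations Visual Geometry Group, University of Oxford

AI summary Proposes a fast second-order method usable as a drop-in replacement for current deep learning solvers. It needs only two additional forward-mode automatic differentiation operations per iteration, at a cost comparable to two standard forward passes, and is easy to implement. It avoids the costly, noise-sensitive inversion of an approximate Hessian that current second-order solvers perform at every iteration.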

Abstract

We propose a fast second-order method that can be used as a drop-in replacement for current deep learning solvers. Compared to stochastic gradient descent (SGD), it only requires two additional forward-mode automatic differentiation operations per iteration, which has a computational cost comparable to two standard forward passes and is easy to implement. Our method addresses long-standing issues with current second-order solvers, which invert an approximate Hessian matrix every iteration exactly or by conjugate-gradient methods, a procedure that is both costly and sensitive to noise. Instead, we propose to keep a single estimate of the gradient projected by the inverse Hessian matrix, and update it once per iteration. This estimate has the same size and is similar to the momentum variable that is commonly used in SGD. No estimate of the Hessian is maintained. We first validate our method, called CurveBall, on small problems with known closed-form solutions (noisy Rosenbrock function and degenerate 2-layer linear networks), where current deep learning solvers seem to struggle. We then train several large models on CIFAR and ImageNet, including ResNet and VGG-f networks, where we demonstrate faster convergence with no hyperparameter tuning. Code is available.
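
The idea of keeping one momentum-like vector that tracks the gradient projected by the inverse Hessian, updated once per iteration, can be illustrated on a quadratic. The abstract does not state the update rule, so the iteration below (z is decayed and moved against Hz + gradient, then added to the parameters) and its constants rho and beta are assumptions in the spirit of the description, not the authors' exact algorithm. On a quadratic the Hessian-vector product is written explicitly; for a network it would come from automatic differentiation without forming the Hessian.

```python
import numpy as np

def curveball_like(A, b, rho=0.9, beta=0.1, steps=300):
    """Minimize 0.5 * w @ A @ w - b @ w with a single auxiliary vector z.

    z is nudged toward the solution of H z = -gradient by one gradient step
    per iteration instead of being recomputed by a full linear solve.
    """
    w = np.zeros_like(b)
    z = np.zeros_like(b)
    for _ in range(steps):
        grad = A @ w - b                  # gradient of the quadratic
        hz = A @ z                        # Hessian-vector product (H = A here)
        z = rho * z - beta * (hz + grad)  # one update of the projected-gradient estimate
        w = w + z                         # parameter step along z
    return w

A = np.diag([1.0, 10.0])
b = np.array([1.0, 10.0])
print(curveball_like(A, b))
```

The vector z has the same size as the parameters, matching the abstract's remark that the estimate resembles the momentum variable of SGD and that no Hessian estimate is stored.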

1711.09220 2026-06-04 cs.LG cs.SY eess.SY math.OC version update

Fitting Jump Models

A. Bemporad, V. Breschi, D. Piga, S. Boyd

Affiliations IMT School for Advanced Studies Lucca; Dalle Molle Institute for Artificial Intelligence Research - USI/SUPSI; Department of Electrical Engineering, Stanford University

AI summary Describes a new framework for fitting jump models to sequential data by alternating between minimizing a loss function to fit multiple model parameters and determining which parameter set is active at each data point; it covers popular model classes such as hidden Markov models.

Comments Accepted for publication in Automatica

Abstract

We describe a new framework for fitting jump models to a sequence of data. The key idea is to alternate between minimizing a loss function to fit multiple model parameters, and minimizing a discrete loss function to determine which set of model parameters is active at each data point. The framework is quite general and encompasses popular classes of models, such as hidden Markov models and piecewise affine models. The shape of the chosen loss functions determines the shape of the resulting jump model.
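
The alternation described above can be sketched for the simplest possible case: each mode is a constant level, the fitting loss is squared error, and the discrete loss adds a fixed penalty whenever the active mode changes between consecutive points. These choices, the dynamic-programming solver for the discrete step, and all names are assumptions made for illustration; the framework in the paper is more general.

```python
import numpy as np

def fit_jump_model(y, n_modes=2, jump_penalty=1.0, n_iters=10):
    """Alternate between assigning a mode to each point and refitting the modes."""
    y = np.asarray(y, dtype=float)
    T = len(y)
    theta = np.linspace(y.min(), y.max(), n_modes)   # initial mode levels
    switch_cost = jump_penalty * (1.0 - np.eye(n_modes))
    modes = np.zeros(T, dtype=int)
    for _ in range(n_iters):
        # Discrete step: best mode sequence by dynamic programming.
        loss = (y[:, None] - theta[None, :]) ** 2
        cost = loss[0].copy()
        back = np.zeros((T, n_modes), dtype=int)
        for t in range(1, T):
            trans = cost[:, None] + switch_cost      # trans[i, j]: reach mode j from i
            back[t] = trans.argmin(axis=0)
            cost = trans.min(axis=0) + loss[t]
        modes[-1] = cost.argmin()
        for t in range(T - 1, 0, -1):
            modes[t - 1] = back[t, modes[t]]
        # Continuous step: refit each mode on the points assigned to it.
        for k in range(n_modes):
            if np.any(modes == k):
                theta[k] = y[modes == k].mean()
    return theta, modes

theta, modes = fit_jump_model([0.1, -0.1, 0.0, 0.1, 5.0, 4.9, 5.1, 5.0])
print(theta, modes)
```

With a zero jump penalty the discrete step reduces to nearest-level assignment and the scheme becomes a one-dimensional K-means; a larger penalty discourages frequent switching.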

1804.01825 2026-06-04 cs.LG econ.GN q-fin.EC stat.ML version update

Evaluating Hospital Case Cost Prediction Models Using Azure Machine Learning Studio

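Alexei Botchkarev

Affiliations Microsoft Azure Machine Learning Studio

AI summary Builds an Azure Machine Learning Studio tool for rapid assessment of multiple regression models and finds that robust regression, boosted decision tree regression, and decision forest regression perform best for hospital case cost prediction.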

Abstract

The ability to model and predict hospital case costs accurately is critical for efficient health care financial management and budgetary planning. A variety of regression machine learning algorithms are known to be effective for health care cost predictions. The purpose of this experiment was to build an Azure Machine Learning Studio tool for rapid assessment of multiple types of regression models. The tool offers an environment for comparing 14 types of regression models in a unified experiment: linear regression, Bayesian linear regression, decision forest regression, boosted decision tree regression, neural network regression, Poisson regression, Gaussian processes for regression, gradient boosted machine, nonlinear least squares regression, projection pursuit regression, random forest regression, robust regression, robust regression with mm-type estimators, and support vector regression. The tool presents assessment results arranged by model accuracy in a single table using five performance metrics. Evaluation of regression machine learning models for hospital case cost prediction demonstrated the advantage of the robust regression model, boosted decision tree regression, and decision forest regression. The operational tool has been published to the web and is openly available for experiments and extensions.

1702.04837 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Sketched Ridge Regression: Optimization Perspective, Statistical Perspective, and Model Averaging

草图岭回归:优化视角、统计视角和模型平均

Shusen Wang, Alex Gittens, Michael W. Mahoney

发表机构 * International Computer Science Institute and Department of Statistics University of California at Berkeley(国际计算机科学研究所和统计学系加州大学伯克利分校) Computer Science Department Rensselaer Polytechnic Institute(伦斯勒理工学院计算机科学系)

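AI summary This paper studies the optimization and statistical effects of the classical sketch and the Hessian sketch on matrix ridge regression. The classical sketch recovers nearly optimal solutions, whereas the Hessian sketch carries no such guarantee; theory and experiments show that model averaging greatly narrows the risk gap between the true and sketched solutions.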

Comments To appear in Journal of Machine Learning Research, 2018. A short version has appeared in International Conference on Machine Learning (ICML), 2017

详情
Journal ref
Journal of Machine Learning Research, 19, pp1-50, 2018
AI中文摘要

我们探讨了经典草图和Hessian草图在近似求解矩阵岭回归(MRR)问题中的统计和优化影响。先前研究量化了经典草图对更简单的最小二乘回归(LSR)问题的影响。我们证明经典草图对MRR的优化属性的影响与对LSR的影响类似:即恢复近似最优解。相反,Hessian草图没有这种保证,其近似误差由响应中的“质量”与最优目标值之间的微妙交互决定。对于两种类型的近似,sketched MRR中的正则化导致与sketched LSR不同的统计特性。特别是,在sketched MRR中存在偏误-方差权衡,这在sketched LSR中不存在。我们提供了sketched MRR的偏误和方差的上界和下界,这些界限表明经典草图显著增加方差,而Hessian草图显著增加偏误。经验上,sketched MRR的解的风险可能比最优MRR解高一个数量级。我们理论和实证表明,模型平均显著降低真实解与sketched解风险之间的差距。因此,在并行或分布式设置中,草图结合模型平均是一种强大的技术,能够快速获得近似最优解,同时大幅减轻草图带来的统计风险增加。

英文摘要

We address the statistical and optimization impacts of the classical sketch and Hessian sketch used to approximately solve the Matrix Ridge Regression (MRR) problem. Prior research has quantified the effects of classical sketch on the strictly simpler least squares regression (LSR) problem. We establish that classical sketch has a similar effect upon the optimization properties of MRR as it does on those of LSR: namely, it recovers nearly optimal solutions. By contrast, Hessian sketch does not have this guarantee; instead, the approximation error is governed by a subtle interplay between the "mass" in the responses and the optimal objective value. For both types of approximation, the regularization in the sketched MRR problem results in significantly different statistical properties from those of the sketched LSR problem. In particular, there is a bias-variance trade-off in sketched MRR that is not present in sketched LSR. We provide upper and lower bounds on the bias and variance of sketched MRR; these bounds show that classical sketch significantly increases the variance, while Hessian sketch significantly increases the bias. Empirically, sketched MRR solutions can have risks that are higher by an order of magnitude than those of the optimal MRR solutions. We establish theoretically and empirically that model averaging greatly decreases the gap between the risks of the true and sketched solutions to the MRR problem. Thus, in parallel or distributed settings, sketching combined with model averaging is a powerful technique that quickly obtains near-optimal solutions to the MRR problem while greatly mitigating the increased statistical risk incurred by sketching.
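
The two sketches contrasted in this abstract can be illustrated with a small sketch in plain Python. It assumes the commonly used definitions, which the abstract itself does not spell out: with a sketching matrix S, the classical sketch replaces both X^T X and X^T y by their sketched versions, while the Hessian sketch replaces only X^T X. The uniform row-sampling sketch, the single-response problem, the toy data, and every function name (`solve`, `gram`, `moment`, `ridge`, `averaged`) are illustrative assumptions, not the authors' code.

```python
import random

def solve(A, b):
    """Solve the small dense system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def gram(X, rows, scale):
    """scale * sum of x_i x_i^T over the given rows, i.e. X^T S S^T X."""
    d = len(X[0])
    return [[scale * sum(X[i][a] * X[i][b] for i in rows) for b in range(d)]
            for a in range(d)]

def moment(X, y, rows, scale):
    """scale * sum of x_i y_i over the given rows, i.e. X^T S S^T y."""
    return [scale * sum(X[i][a] * y[i] for i in rows) for a in range(len(X[0]))]

def ridge(X, y, gamma, rows=None, scale=1.0, hessian=False):
    """Solve min (1/n)||Xw - y||^2 + gamma ||w||^2, optionally sketched."""
    n, d = len(X), len(X[0])
    rows = range(n) if rows is None else rows
    A = gram(X, rows, scale)
    for a in range(d):
        A[a][a] += n * gamma
    # Hessian sketch keeps the exact X^T y; classical sketch uses the sketched one.
    rhs = moment(X, y, range(n), 1.0) if hessian else moment(X, y, rows, scale)
    return solve(A, rhs)

def averaged(X, y, gamma, s, g, rng, hessian=False):
    """Model averaging: mean of g independently sketched solutions."""
    n, d = len(X), len(X[0])
    total = [0.0] * d
    for _ in range(g):
        rows = [rng.randrange(n) for _ in range(s)]  # uniform sampling with replacement
        w = ridge(X, y, gamma, rows, n / s, hessian)
        total = [t + wi for t, wi in zip(total, w)]
    return [t / g for t in total]

# Toy data (an assumption for illustration): y = 2*x1 - x2 + noise.
rng = random.Random(0)
X = [[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(200)]
y = [2 * a - b + 0.1 * rng.gauss(0, 1) for a, b in X]
print("exact:", ridge(X, y, gamma=0.1))
print("single classical sketch:", averaged(X, y, 0.1, s=20, g=1, rng=rng))
print("average of 10 sketches:", averaged(X, y, 0.1, s=20, g=10, rng=rng))
```

Calling `ridge` with `rows=None` gives the unsketched solution, so it serves as the reference against which a single sketched solution and an averaged one can be compared; passing `hessian=True` switches both helpers to the Hessian sketch.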

1804.07323 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Nonparametric Stochastic Compositional Gradient Descent for Q-Learning in Continuous Markov Decision Problems

非参数随机组合梯度下降法在连续马尔可夫决策问题中的Q学习

Alec Koppel, Ekaterina Tolstaya, Ethan Stump, Alejandro Ribeiro

发表机构 * University of Pennsylvania(宾夕法尼亚大学) U.S. Army Research Laboratory(美国陆军研究实验室)

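AI summary This paper proposes a nonparametric stochastic compositional gradient descent method for Q-learning in continuous Markov decision problems. It reformulates Bellman's optimality equation as a nested non-convex stochastic optimization problem over a reproducing kernel Hilbert space and proves that the resulting algorithm converges with probability 1 to a stationary point of the problem.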

详情
AI中文摘要

我们考虑定义在连续状态和动作空间上的马尔可夫决策问题,其中自主代理试图学习从状态到动作的映射以最大化长期折扣奖励累积。我们通过考虑定义在动作价值函数上的贝尔曼最优性方程,将其重新表述为一个嵌套非凸随机优化问题,该问题定义在再生核希尔伯特空间(RKHS)上。我们开发了一种功能扩展的随机准梯度方法来解决这个问题,由于RKHS的结构,它允许以标量权重和过去的状态-动作对参数化,其增长与算法迭代次数成比例。为缓解这种复杂性爆炸,我们应用核正交匹配追踪到核权重和字典序列,从而在底层优化方法的下降方向上产生可控的误差。我们证明所得到的算法,称为KQ学习,以概率1收敛于该问题的 stationary 点,从而在假设其属于RKHS的情况下得到贝尔曼最优性算子的固定点。在常数学习率下,我们进一步得到收敛于一个小的贝尔曼误差,该误差取决于所选的学习率。在连续山车和倒立摆任务上的数值评估表明,收敛的简洁学习动作价值函数、与最先进方法具有竞争力的策略,并表现出可靠、可重复的学习行为。

英文摘要

We consider Markov Decision Problems defined over continuous state and action spaces, where an autonomous agent seeks to learn a map from its states to actions so as to maximize its long-term discounted accumulation of rewards. We address this problem by considering Bellman's optimality equation defined over action-value functions, which we reformulate into a nested non-convex stochastic optimization problem defined over a Reproducing Kernel Hilbert Space (RKHS). We develop a functional generalization of the stochastic quasi-gradient method to solve it, which, owing to the structure of the RKHS, admits a parameterization in terms of scalar weights and past state-action pairs that grows proportionately with the algorithm iteration index. To ameliorate this complexity explosion, we apply Kernel Orthogonal Matching Pursuit to the sequence of kernel weights and dictionaries, which yields a controllable error in the descent direction of the underlying optimization method. We prove that the resulting algorithm, called KQ-Learning, converges with probability 1 to a stationary point of this problem, yielding a fixed point of the Bellman optimality operator under the hypothesis that it belongs to the RKHS. Under constant learning rates, we further obtain convergence to a small Bellman error that depends on the chosen learning rates. Numerical evaluation on the Continuous Mountain Car and Inverted Pendulum tasks yields convergent, parsimonious learned action-value functions and policies that are competitive with the state of the art and exhibit reliable, reproducible learning behavior.

1804.07010 2026-06-04 stat.ML cs.LG cs.SY eess.SY math.AP math.OC 版本更新

Forward-Backward Stochastic Neural Networks: Deep Learning of High-dimensional Partial Differential Equations

前向-后向随机神经网络:高维偏微分方程的深度学习

Maziar Raissi

发表机构 * Division of Applied Mathematics, Brown University(布朗大学应用数学系)

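AI summary This paper proposes a method for solving high-dimensional partial differential equations that exploits their connection to forward-backward stochastic differential equations, training a deep neural network on Brownian motion realizations to avoid grid-based discretization and the curse of dimensionality.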

详情
AI中文摘要

经典偏微分方程数值方法因依赖精细的时空网格而受维度诅咒限制。受现代深度学习技术启发,本文提出一种可扩展的算法,通过深度神经网络近似未知解,并利用自动微分优势。通过将高维偏微分方程与前向-后向随机微分方程联系起来,利用布朗运动独立实现作为训练数据,测试了Black-Scholes-Barenblatt和Hamilton-Jacobi-Bellman方程等100维基准问题的有效性。

英文摘要

Classical numerical methods for solving partial differential equations suffer from the curse of dimensionality, mainly due to their reliance on meticulously generated spatio-temporal grids. Inspired by modern deep learning based techniques for solving forward and inverse problems associated with partial differential equations, we circumvent the tyranny of numerical discretization by devising an algorithm that is scalable to high dimensions. In particular, we approximate the unknown solution by a deep neural network, which essentially enables us to benefit from the merits of automatic differentiation. To train the aforementioned neural network, we leverage the well-known connection between high-dimensional partial differential equations and forward-backward stochastic differential equations. In fact, independent realizations of a standard Brownian motion will act as training data. We test the effectiveness of our approach on a couple of benchmark problems spanning a number of scientific domains, including the Black-Scholes-Barenblatt and Hamilton-Jacobi-Bellman equations, both in 100 dimensions.

1804.06114 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

A Support Tensor Train Machine

支持张量列车机

Cong Chen, Kim Batselier, Ching-Yun Ko, Ngai Wong

发表机构 * The Department of Electrical and Electronic Engineering, The University of Hong Kong(香港大学电子与电气工程系)

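AI summary This paper proposes the support tensor train machine (STTM), which replaces the rank-one tensor of the support tensor machine (STM) with a tensor train to increase expressive power; experiments show that it outperforms the SVM and the STM.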

Comments 7 pages

详情
AI中文摘要

近年来,将传统向量机技术扩展到张量形式引起了广泛关注。例如,支持张量机(STM)利用秩一张量捕捉数据结构,从而缓解传统支持向量机(SVM)中的过拟合和维度灾难问题。然而,秩一张量的表达能力对于许多现实数据来说是有限的。为克服这一限制,我们引入支持张量列车机(STTM),通过将STM中的秩一张量替换为张量列车。实验验证并确认STTM优于SVM和STM。

英文摘要

There has been growing interest in extending traditional vector-based machine learning techniques to their tensor forms. An example is the support tensor machine (STM) that utilizes a rank-one tensor to capture the data structure, thereby alleviating the overfitting and curse of dimensionality problems in the conventional support vector machine (SVM). However, the expressive power of a rank-one tensor is restrictive for many real-world data. To overcome this limitation, we introduce a support tensor train machine (STTM) by replacing the rank-one tensor in an STM with a tensor train. Experiments validate and confirm the superiority of an STTM over the SVM and STM.
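
To make the difference between a rank-one tensor and a tensor train concrete, here is a minimal sketch of how one entry of a tensor stored in tensor-train format is evaluated: each mode contributes a small matrix selected by that mode's index, and the entry is the product of those matrices. When every TT-rank is 1 the format reduces to a rank-one tensor. The nested-list layout, the function names, and the example values are assumptions made for illustration; this is not the STTM training procedure.

```python
def tt_entry(cores, index):
    """Evaluate one entry of a tensor stored in tensor-train (TT) format.

    cores[k][i] is the r_{k-1} x r_k matrix of mode k for index value i,
    with boundary ranks r_0 = r_d = 1.
    """
    row = [1.0]  # running 1 x r_k row vector
    for core, i in zip(cores, index):
        mat = core[i]
        row = [sum(row[a] * mat[a][b] for a in range(len(row)))
               for b in range(len(mat[0]))]
    return row[0]

def tt_num_params(cores):
    """Number of stored numbers: sum over modes of n_k * r_{k-1} * r_k."""
    return sum(len(core) * len(core[0]) * len(core[0][0]) for core in cores)

def sum_of_two_rank_one(u, v):
    """TT cores (TT-ranks 2) for the sum of two 3-way rank-one tensors.

    u and v are triples of vectors; the tensor has entries
    u[0][i]*u[1][j]*u[2][k] + v[0][i]*v[1][j]*v[2][k].
    """
    first = [[[a, b]] for a, b in zip(u[0], v[0])]                 # 1 x 2 per index
    middle = [[[a, 0.0], [0.0, b]] for a, b in zip(u[1], v[1])]    # 2 x 2 diagonal
    last = [[[a], [b]] for a, b in zip(u[2], v[2])]                # 2 x 1 per index
    return [first, middle, last]

u = ([1, 2], [3, 4], [5, 6])
v = ([1, 0], [0, 1], [2, 2])
cores = sum_of_two_rank_one(u, v)
print(tt_entry(cores, (1, 0, 1)), tt_num_params(cores))
```

The storage count grows linearly with the number of modes for fixed ranks, which is the property that makes a tensor train a more expressive yet still compact replacement for a single rank-one term.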

1804.04696 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Efficient Model Identification for Tensegrity Locomotion

高效 tensegrity 机器人运动的模型识别

Shaojun Zhu, David Surovik, Kostas E. Bekris, Abdeslam Boularias

发表机构 * Department of Computer Science, Rutgers University(计算机科学系,罗格斯大学)

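AI summary This paper proposes an efficient method, built on an off-the-shelf physics engine and Bayesian optimization, for identifying unknown mechanical parameters of a high-dimensional, compliant tensegrity robot, which yields more precise locomotion control.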

详情
AI中文摘要

本文旨在以实用方式识别未知物理参数,如驱动机器人连杆的机械模型,这些参数在动态机器人任务中至关重要。关键特征包括使用现成的物理引擎和贝叶斯优化框架。所考虑的任务是高维、顺应性tensegrity机器人的运动。关键见解在于将模型识别挑战投影到适当的低维空间以提高效率。与替代方法的比较表明,所提出的方法可以在给定的时间预算内更准确地识别参数,从而实现更精确的运动控制。

英文摘要

This paper aims to identify in a practical manner unknown physical parameters, such as mechanical models of actuated robot links, which are critical in dynamical robotic tasks. Key features include the use of an off-the-shelf physics engine and the Bayesian optimization framework. The task being considered is locomotion with a high-dimensional, compliant Tensegrity robot. A key insight, in this case, is the need to project the model identification challenge into an appropriate lower dimensional space for efficiency. Comparisons with alternatives indicate that the proposed method can identify the parameters more accurately within the given time budget, which also results in more precise locomotion control.

1804.02884 2026-06-04 cs.AI cs.LG cs.MA cs.NE cs.SY eess.SY 版本更新

Policy Gradient With Value Function Approximation For Collective Multiagent Planning

基于价值函数近似集体多智能体规划的策略梯度

Duc Thien Nguyen, Akshat Kumar, Hoong Chuin Lau

发表机构 * School of Information Systems(信息系统学院) Singapore Management University(新加坡管理大学)

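AI summary This paper proposes an actor-critic method for optimizing policies in collective multiagent planning (CDEC-POMDPs). A decomposition of the approximate action-value function over agents and a critic trained on local reward signals speed up convergence, and the method is evaluated on a synthetic benchmark and a real-world taxi fleet optimization problem.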

详情
AI中文摘要

去中心化(PO)MDPs为多智能体系统序列决策提供了表达性框架。鉴于其计算复杂性,近期研究聚焦于可处理且实用的Dec-POMDP子类。我们针对此类子类CDEC-POMDP进行研究,其中智能体群体行为影响联合奖励和环境动态。本文的主要贡献是一种用于优化CDEC-POMDP策略的actor-critic强化学习方法。 vanilla AC在大问题上收敛缓慢。为解决此问题,我们展示了如何通过特定的分解近似动作价值函数过智能体导致有效的更新,并推导出一种基于局部奖励信号训练critic的新方法。在合成基准和现实世界出租车车队优化问题上的比较表明,我们的新AC方法提供了比先前最佳方法更高质量的解决方案。

英文摘要

Decentralized (PO)MDPs provide an expressive framework for sequential decision making in a multiagent system. Given their computational complexity, recent research has focused on tractable yet practical subclasses of Dec-POMDPs. We address such a subclass called CDEC-POMDP where the collective behavior of a population of agents affects the joint-reward and environment dynamics. Our main contribution is an actor-critic (AC) reinforcement learning method for optimizing CDEC-POMDP policies. Vanilla AC has slow convergence for larger problems. To address this, we show how a particular decomposition of the approximate action-value function over agents leads to effective updates, and also derive a new way to train the critic based on local reward signals. Comparisons on a synthetic benchmark and a real-world taxi fleet optimization problem show that our new AC approach provides better quality solutions than previous best approaches.

1708.09630 2026-06-04 cs.MA cs.LG cs.SY eess.SY 版本更新

Resilient Autonomous Control of Distributed Multi-agent Systems in Contested Environments

在 contested 环境中分布式多智能体系统的鲁棒自主控制

Rohollah Moghadam, Hamidreza Modares

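AI summary This paper proposes a resilient learning-based control protocol that synchronizes multi-agent systems under attacks and uncertain system dynamics, combining a distributed H_infinity controller with a trust-confidence mechanism.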

详情
AI中文摘要

本文提出了一种自主且鲁棒的控制器,用于在存在不确定性和网络攻击的条件下,领导-跟随多智能体系统。领导节点被假设为非自主的,具有非零控制输入,以响应环境变化改变团队行为或任务。本文提出了一种鲁棒的学习控制协议,以在存在攻击和系统动态不确定性的情况下找到同步问题的最优解。首先设计了一个基于观测器的分布式 H_infinity 控制器,以防止攻击对传感器和执行器的影响在整个网络中传播,并减弱对被攻击代理本身的影响。推导了非同质博弈代数 Riccati 方程以解决 H_infinity 最优同步问题,并利用非策略强化学习来学习其解,而无需任何关于代理动态的知识。然后提出了一种基于信任-信心的分布式控制协议,以缓解劫持整个节点和通信链路攻击。为每个代理定义一个基于其本地证据的置信值。所提出的鲁棒强化学习算法利用每个代理的置信值来指示其自身信息的可信度,并将其广播给邻居,以在学习过程中和之后对所接收的数据施加权重。如果某个代理的置信值较低,则利用信任机制来识别被篡改的代理,并从学习过程中移除其接收到的数据。仿真结果展示了所提出方法的有效性。

英文摘要

An autonomous and resilient controller is proposed for leader-follower multi-agent systems under uncertainties and cyber-physical attacks. The leader is assumed non-autonomous with a nonzero control input, which allows changing the team behavior or mission in response to environmental changes. A resilient learning-based control protocol is presented to find optimal solutions to the synchronization problem in the presence of attacks and system dynamic uncertainties. An observer-based distributed H_infinity controller is first designed to prevent propagating the effects of attacks on sensors and actuators throughout the network, as well as to attenuate the effect of these attacks on the compromised agent itself. Non-homogeneous game algebraic Riccati equations are derived to solve the H_infinity optimal synchronization problem and off-policy reinforcement learning is utilized to learn their solution without requiring any knowledge of the agent's dynamics. A trust-confidence based distributed control protocol is then proposed to mitigate attacks that hijack the entire node and attacks on communication links. A confidence value is defined for each agent based solely on its local evidence. The proposed resilient reinforcement learning algorithm employs the confidence value of each agent to indicate the trustworthiness of its own information and broadcast it to its neighbors to put weights on the data they receive from it during and after learning. If the confidence value of an agent is low, it employs a trust mechanism to identify compromised agents and remove the data it receives from them from the learning process. Simulation results are provided to show the effectiveness of the proposed approach.

1612.07139 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

A Survey of Deep Network Solutions for Learning Control in Robotics: From Reinforcement to Imitation

深度网络在机器人学习控制中的应用综述:从强化到模仿

Lei Tai, Jingwei Zhang, Ming Liu, Joschka Boedecker, Wolfram Burgard

发表机构 * University of Freiburg(弗赖堡大学)

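AI summary This survey reviews deep learning for learning control in robotics. It covers the two main paradigms, deep reinforcement learning and imitation learning, their use in navigation and manipulation tasks, and the reality-gap challenge.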

Comments 19 pages, 1 figure

详情
AI中文摘要

深度学习技术已广泛应用于各种研究领域,取得了最先进的成果。本文综述了针对机器人应用的学习控制策略的深度学习解决方案。我们讨论了深度学习在学习控制中的两大主要范式:深度强化学习和模仿学习。对于深度强化学习(DRL),我们从传统强化学习算法开始,展示了如何将其扩展到深度领域,并介绍了在机器人导航和 manipulation 任务中使用 DRL 的代表性工作。我们继续讨论了解决现实差距挑战的方法,即如何将仿真中训练的 DRL 策略转移到现实世界场景,并总结了用于 DRL 研究的机器人仿真平台。对于模仿学习,我们探讨了其三个主要类别:行为克隆、逆强化学习和生成对抗模仿学习,介绍了它们的公式及其在机器人应用中的对应情况。最后,我们讨论了开放挑战和研究前沿。

英文摘要

Deep learning techniques have been widely applied, achieving state-of-the-art results in various fields of study. This survey focuses on deep learning solutions that target learning control policies for robotics applications. We carry out our discussions on the two main paradigms for learning control with deep networks: deep reinforcement learning and imitation learning. For deep reinforcement learning (DRL), we begin from traditional reinforcement learning algorithms, showing how they are extended to the deep context and which effective mechanisms can be added on top of the DRL algorithms. We then introduce representative works that utilize DRL to solve navigation and manipulation tasks in robotics. We continue our discussion with methods addressing the challenge of the reality gap for transferring DRL policies trained in simulation to real-world scenarios, and summarize robotics simulation platforms for conducting DRL research. For imitation learning, we go through its three main categories, behavior cloning, inverse reinforcement learning, and generative adversarial imitation learning, by introducing their formulations and their corresponding robotics applications. Finally, we discuss the open challenges and research frontiers.

1610.02967 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Distributed Convex Optimization with Many Convex Constraints

具有许多凸约束的分布式凸优化

Joachim Giesen, Sören Laue

发表机构 * Friedrich-Schiller-University Jena(耶拿弗里德里希·席勒大学)

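AI summary This paper proposes an extension of ADMM for distributed convex optimization with many convex constraints; the extension inherits the convergence guarantees of ADMM and the augmented Lagrangian method.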

详情
AI中文摘要

我们解决在分布式环境下求解具有许多凸约束的凸优化问题。我们的方法基于一种扩展的交替方向乘子法(ADMM),该方法近期在大数据领域受到广泛关注。尽管ADMM早在数十年前就被发明,但迄今为止只能应用于无约束问题或具有线性等式或不等式约束的问题。我们的扩展方法能够直接处理任意不等式约束。它结合了ADMM在分布式环境下求解凸优化问题的能力,以及增广拉格朗日方法求解约束优化问题的能力,并且我们证明它继承了ADMM和增广拉格朗日方法的收敛保证。

英文摘要

We address the problem of solving convex optimization problems with many convex constraints in a distributed setting. Our approach is based on an extension of the alternating direction method of multipliers (ADMM), which has recently gained a lot of attention in the Big Data context. Although it was invented decades ago, ADMM so far can be applied only to unconstrained problems and problems with linear equality or inequality constraints. Our extension can handle arbitrary inequality constraints directly. It combines the ability of ADMM to solve convex optimization problems in a distributed setting with the ability of the Augmented Lagrangian method to solve constrained optimization problems, and as we show, it inherits the convergence guarantees of ADMM and the Augmented Lagrangian method.
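
For readers unfamiliar with how ADMM distributes work, the following is a minimal sketch of standard consensus ADMM on a toy unconstrained problem, minimizing the sum of 0.5*(x - a_i)^2 over "workers" i, whose solution is the mean of the a_i. It illustrates only the baseline method that the abstract builds on; it is not the authors' extension to arbitrary inequality constraints, and the function name, the penalty parameter `rho`, and the toy objective are assumptions chosen for illustration.

```python
def consensus_admm(a, rho=1.0, iters=100):
    """Minimize sum_i 0.5 * (x - a_i)^2 with consensus ADMM (scaled dual form).

    Each worker i holds a local copy x_i and a scaled dual variable u_i;
    z is the shared consensus variable.
    """
    n = len(a)
    x = [0.0] * n
    u = [0.0] * n
    z = 0.0
    for _ in range(iters):
        # Local step (parallel across workers):
        # argmin_x 0.5*(x - a_i)^2 + (rho/2)*(x - z + u_i)^2
        x = [(ai + rho * (z - ui)) / (1.0 + rho) for ai, ui in zip(a, u)]
        # Consensus step: average of the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each worker's disagreement with the consensus.
        u = [ui + xi - z for ui, xi in zip(u, x)]
    return z

print(consensus_admm([1.0, 2.0, 6.0]))
```

Only the averaging step needs communication; the local and dual steps use each worker's own data, which is what makes the scheme attractive in a distributed setting.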

1711.07038 2026-06-04 math.NA cs.LG cs.NA 版本更新

A Coordinate-wise Optimization Algorithm for Sparse Inverse Covariance Selection

用于稀疏逆协方差选择的坐标优化算法

Ganzhao Yuan, Haoxian Tan, Wei-Shi Zheng

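AI summary This paper proposes a coordinate-wise optimization algorithm for sparse inverse covariance selection that is guaranteed to converge to a coordinate-wise minimum point and outperforms existing methods in accuracy on synthetic and real-world data.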

详情
AI中文摘要

稀疏逆协方差选择是分析高维数据依赖性的基本问题,但因其NP难而难以求解。现有方法基于凸近似和迭代硬阈值,仅能得到次优解。本文提出一种坐标优化算法,通过迭代贪心选择或交换变量确定支持集,并在支持集上求解缩减的凸优化问题以实现最大下降。此外,本文还提出一种牛顿型算法解决缩减凸子问题,证明其具有全局线性收敛率和局部二次收敛率。最后,在合成数据和真实数据集上验证了方法的有效性,结果表明所提方法在准确性上优于现有方法。

英文摘要

Sparse inverse covariance selection is a fundamental problem for analyzing dependencies in high dimensional data. However, the problem is difficult to solve since it is NP-hard. Existing solutions are primarily based on convex approximation and iterative hard thresholding, which only lead to sub-optimal solutions. In this work, we propose a coordinate-wise optimization algorithm to solve this problem which is guaranteed to converge to a coordinate-wise minimum point. The algorithm iteratively and greedily selects one variable or swaps two variables to identify the support set, and then solves a reduced convex optimization problem over the support set to achieve the greatest descent. As a side contribution of this paper, we propose a Newton-like algorithm to solve the reduced convex sub-problem, which is proven to always converge to the optimal solution with global linear convergence rate and local quadratic convergence rate. Finally, we demonstrate the efficacy of our method on synthetic data and real-world data sets. The results show that the proposed method consistently outperforms existing solutions in terms of accuracy.

1710.10781 2026-06-04 math.NA cs.CV cs.LG cs.NA stat.ML 版本更新

Stochastic variance reduced multiplicative update for nonnegative matrix factorization

随机方差缩减乘法更新用于非负矩阵分解

Hiroyuki Kasai

发表机构 * Graduate School of Informatics and Engineering, The University of Electro-Communications(信息与工程研究生院,电气通信大学)

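AI summary This paper proposes a stochastic variance-reduced multiplicative update algorithm that speeds up the convergence of nonnegative matrix factorization; numerical comparisons show that it outperforms existing algorithms on synthetic and real-world datasets.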

Comments IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP2018)

详情
AI中文摘要

非负矩阵分解(NMF)是一种降维和因子分析方法,其因子矩阵具有低秩非负约束。考虑到NMF中的随机学习,本文特别针对最流行的乘法更新(MU)规则,该规则收敛速度较慢。本文提出一种随机梯度的方差缩减技术,数值比较表明,所提出的算法在不同合成和实际数据集上均优于现有算法。

英文摘要

Nonnegative matrix factorization (NMF), a dimensionality reduction and factor analysis method, is a special case in which factor matrices have low-rank nonnegative constraints. Considering stochastic learning in NMF, we specifically address the multiplicative update (MU) rule, which is the most popular but converges slowly. This paper applies a variance-reduction technique for stochastic gradients to the stochastic MU rule. Numerical comparisons suggest that our proposed algorithms robustly outperform state-of-the-art algorithms across different synthetic and real-world datasets.
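
As background for the multiplicative update (MU) rule mentioned above, here is a minimal sketch of the plain batch MU rule in its widely used Lee-Seung form for minimizing the squared Frobenius error ||V - WH||^2. It is the baseline rule only, not the stochastic variance-reduced variant that the paper proposes; the helper names, the small `eps` safeguard in the denominators, and the random test matrix are assumptions for illustration.

```python
import random

def matmul(A, B):
    """Dense matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def frob_error(V, W, H):
    """Squared Frobenius norm of V - W H."""
    WH = matmul(W, H)
    return sum((v - p) ** 2 for rv, rp in zip(V, WH) for v, p in zip(rv, rp))

def mu_step(V, W, H, eps=1e-12):
    """One batch multiplicative update of H, then of W.

    H <- H * (W^T V) / (W^T W H),  W <- W * (V H^T) / (W H H^T),
    all products and quotients taken elementwise.
    """
    Wt = transpose(W)
    num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
    H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
         for rh, rn, rd in zip(H, num, den)]
    Ht = transpose(H)
    num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
    W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
         for rw, rn, rd in zip(W, num, den)]
    return W, H

rng = random.Random(1)
V = [[rng.random() for _ in range(5)] for _ in range(6)]        # 6 x 5 data
W = [[rng.uniform(0.1, 1) for _ in range(2)] for _ in range(6)]  # 6 x 2 factor
H = [[rng.uniform(0.1, 1) for _ in range(5)] for _ in range(2)]  # 2 x 5 factor
for it in range(30):
    W, H = mu_step(V, W, H)
print("final squared error:", frob_error(V, W, H))
```

Because every factor in the update is nonnegative, W and H stay nonnegative without any projection step, which is the main appeal of the rule despite its slow convergence.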

1804.00684 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Graph-Based Deep Modeling and Real Time Forecasting of Sparse Spatio-Temporal Data

基于图的深度建模与稀疏时空数据的实时预测

Bao Wang, Xiyang Luo, Fangbo Zhang, Baichuan Yuan, Andrea L. Bertozzi, P. Jeffrey Brantingham

发表机构 * Dept of Anthropology, UCLA(人类学系,加州大学洛杉矶分校) Dept of Math, UCLA(数学系,加州大学洛杉矶分校)

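AI summary This paper presents a generic framework for modeling, analyzing, and forecasting sparse spatio-temporal data. It couples a self-exciting point process for macroscale behavior with a graph-structured recurrent neural network for microscale patterns to enable real-time forecasting.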

Comments 9 pages, 19 figures

详情
AI中文摘要

我们提出了一种通用框架,用于时空数据的建模、分析和预测,特别关注在空间和时间上都稀疏的数据。我们的多尺度框架是两个主要组件的无缝耦合:一个自激发点过程用于建模时空数据的宏尺度统计行为,以及一个图结构循环神经网络(GSRNN)用于在推断图上发现时空数据的微尺度模式。这种新颖的深度神经网络(DNN)结合了图节点的实时交互,以实现更准确的实时预测。该方法在犯罪和交通预测上得到了验证。

英文摘要

We present a generic framework for spatio-temporal (ST) data modeling, analysis, and forecasting, with a special focus on data that is sparse in both space and time. Our multi-scaled framework is a seamless coupling of two major components: a self-exciting point process that models the macroscale statistical behaviors of the ST data and a graph structured recurrent neural network (GSRNN) to discover the microscale patterns of the ST data on the inferred graph. This novel deep neural network (DNN) incorporates the real time interactions of the graph nodes to enable more accurate real time forecasting. The effectiveness of our method is demonstrated on both crime and traffic forecasting.

1803.11411 2026-06-04 eess.SY cs.LG cs.MA cs.SY 版本更新

Observer-based Adaptive Optimal Output Containment Control problem of Linear Heterogeneous Multi-agent Systems with Relative Output Measurements

基于观测器的自适应最优输出包容控制问题:线性异构多智能体系统中的相对输出测量

Majid Mazouchi, Mohammad Bagher Naghibi-Sistani, Seyed Kamal Hosseini Sani, Farzaneh Tatari, Hamidreza Modares

发表机构 * Department of Electrical Engineering, Ferdowsi University of Mashhad, Mashhad, Iran(马什哈德法尔多西大学电气工程系) Department of Electrical Engineering, University of Semnan, Semnan, Iran(塞姆南大学电气工程系) Missouri University of Science(密苏里科技大学)

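AI summary This paper proposes an optimal relative output-feedback solution to the containment control problem of linear heterogeneous multi-agent systems. A distributed optimal control protocol keeps the followers' outputs inside the convex hull of the leaders' outputs while optimizing transient performance, and distributed observers estimate the unavailable states and the convex hull.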

详情
AI中文摘要

本文开发了一种基于最优相对输出反馈的解决方案,用于解决线性异构多智能体系统的包容控制问题。提出了一种分布式最优控制协议,使跟随器的输出落在领导者输出的凸包内(即期望或安全区域),并优化其暂态性能。所提出的最优控制解决方案由反馈部分和前馈部分组成,反馈部分依赖于跟随器的状态,前馈部分依赖于领导者状态的凸包。为了符合大多数实际应用,假设反馈和前馈状态不可用,并使用两个分布式观测器进行估计。即,由于跟随器无法直接感知其绝对状态,设计了一个分布式观测器,仅使用相对于邻居的输出测量(例如通过机器人中的范围传感器测量)和邻居广播的信息来估计其状态。此外,还设计了一个自适应分布式观测器,通过在通信网络上交换信息来估计领导者状态的凸包。所提出的观测器放松了所有跟随器必须知道领导者动态完整知识的严格要求。接下来,开发了一种基于Actor-Critic结构的离策略强化学习算法,用于在线解决最优包容控制问题,使用相对输出测量,无需所有跟随器知道领导者动态。最后,通过数值模拟验证了理论结果。

英文摘要

This paper develops an optimal relative output-feedback based solution to the containment control problem of linear heterogeneous multi-agent systems. A distributed optimal control protocol is presented for the followers that not only ensures that their outputs fall into the convex hull of the leaders' output (i.e., the desired or safe region), but also optimizes their transient performance. The proposed optimal control solution is composed of a feedback part, depending on the followers' state, and a feed-forward part, depending on the convex hull of the leaders' state. To comply with most real-world applications, the feedback and feed-forward states are assumed to be unavailable and are estimated using two distributed observers. That is, since the followers cannot directly sense their absolute states, a distributed observer is designed that uses only relative output measurements with respect to their neighbors (measured, for example, by range sensors in robotics) and the information broadcast by their neighbors to estimate their states. Moreover, another adaptive distributed observer is designed that uses exchange of information between followers over a communication network to estimate the convex hull of the leaders' state. The proposed observer relaxes the restrictive requirement that all followers have complete knowledge of the leaders' dynamics. An off-policy reinforcement learning algorithm on an actor-critic structure is next developed to solve the optimal containment control problem online, using relative output measurements and without requiring all followers to know the leaders' dynamics. Finally, the theoretical results are verified by numerical simulations.

1705.10887 2026-06-04 stat.ML cs.CV cs.LG cs.NA math.NA 版本更新

Efficient, sparse representation of manifold distance matrices for classical scaling

高效表示经典标度中的流形距离矩阵

Javier S. Turek, Alexander Huth

发表机构 * Intel Labs(英特尔实验室) The University of Texas at Austin(得克萨斯大学奥斯汀分校)

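AI summary This paper proposes a sparse method based on biharmonic interpolation for efficiently representing manifold (geodesic) distance matrices. It is 2x faster and uses 20x less memory than current leading methods at similar quality, which makes large point sets tractable.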

Comments Conference CVPR 2018

详情
AI中文摘要

Geodesic距离矩阵可以揭示对非刚性变形不敏感的形状特性,因此常用于分析和表示3-D形状。然而,这些矩阵随点数的平方增长,因此对于大规模点集常用低秩近似来存储和分析。本文提出了一种新颖的稀疏方法,利用双调和插值高效表示流形距离矩阵。该方法利用数据流形的知识,学习一个稀疏插值算子,通过部分点近似距离。我们证明,与现有方法相比,该方法在处理大规模点集的MDS问题时速度快2倍,内存占用低20倍,质量相似。这使得分析之前不可行的大规模点集成为可能。

英文摘要

Geodesic distance matrices can reveal shape properties that are largely invariant to non-rigid deformations, and thus are often used to analyze and represent 3-D shapes. However, these matrices grow quadratically with the number of points. Thus for large point sets it is common to use a low-rank approximation to the distance matrix, which fits in memory and can be efficiently analyzed using methods such as multidimensional scaling (MDS). In this paper we present a novel sparse method for efficiently representing geodesic distance matrices using biharmonic interpolation. This method exploits knowledge of the data manifold to learn a sparse interpolation operator that approximates distances using a subset of points. We show that our method is 2x faster and uses 20x less memory than current leading methods for solving MDS on large point sets, with similar quality. This enables analyses of large point sets that were previously infeasible.

1701.01394 2026-06-04 cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

On spectral partitioning of signed graphs

关于带符号图的谱划分

Andrew V. Knyazev

发表机构 * Mitsubishi Electric Research Laboratories (MERL)(三菱电机研究实验室(MERL))

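AI summary This paper argues that the standard graph Laplacian is preferable to the signed Laplacian for spectral partitioning of signed graphs. Partitions based on the leading eigenvectors of the signed Laplacian may be meaningless, whereas partitions based on the Fiedler vector of the standard graph Laplacian are not, and negative eigenvalues make the Fiedler vector easier to compute.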

Comments 12 pages, 10 figures. Rev 2 to appear in proceedings of the SIAM Workshop on Combinatorial Scientific Computing 2018 (CSC18)

详情
AI中文摘要

我们主张标准图拉普拉斯矩阵比符号拉普拉斯矩阵更适合带符号图的谱划分。简单例子表明,基于符号拉普拉斯矩阵主特征向量的划分可能无意义,而基于标准图拉普拉斯矩阵Fiedler向量的划分更有效。我们观察到负特征值对带符号图的谱划分有益,使Fiedler向量更容易计算。

英文摘要

We argue that the standard graph Laplacian is preferable for spectral partitioning of signed graphs compared to the signed Laplacian. Simple examples demonstrate that partitioning based on signs of components of the leading eigenvectors of the signed Laplacian may be meaningless, in contrast to partitioning based on the Fiedler vector of the standard graph Laplacian for signed graphs. We observe that negative eigenvalues are beneficial for spectral partitioning of signed graphs, making the Fiedler vector easier to compute.

1803.10371 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Reinforcement learning for non-prehensile manipulation: Transfer from simulation to physical system

基于非操控操作的强化学习:从仿真到物理系统的迁移

Kendall Lowrey, Svetoslav Kolev, Jeremy Dao, Aravind Rajeswaran, Emanuel Todorov

发表机构 * University of Washington(华盛顿大学) Roboti LLC(Roboti公司)

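AI summary This paper presents a simulation-based reinforcement learning approach to non-prehensile manipulation. Policies trained entirely in simulation transfer to the physical system without additional training, and training with an ensemble of models makes them more robust to modeling errors.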

Comments Accepted at IEEE SIMPAR 2018. Project page: https://sites.google.com/view/phantomsim2real

详情
AI中文摘要

强化学习已作为一种有前途的方法用于训练机器人控制器。然而,大多数结果受限于仿真,因为需要大量样本且缺乏自动且安全的数据收集方法。基于模型的强化学习方法提供了一种途径来克服这些挑战,但传统关注的是仿真与现实世界之间的不匹配。这里,我们展示在仿真中学习的控制策略可以成功迁移到由三个Phantom机器人推动物体到各种目标位置的物理系统中。我们使用修改的自然策略梯度算法进行学习,应用于精心识别的仿真模型。所得到的策略在仿真中完全训练后,在物理系统中无需额外训练即可有效工作。此外,我们还表明,使用模型集合训练使学习的策略对建模误差更鲁棒,从而补偿系统识别的困难。

英文摘要

Reinforcement learning has emerged as a promising methodology for training robot controllers. However, most results have been limited to simulation due to the need for a large number of samples and the lack of automated-yet-safe data collection methods. Model-based reinforcement learning methods provide an avenue to circumvent these challenges, but the traditional concern has been the mismatch between the simulator and the real world. Here, we show that control policies learned in simulation can successfully transfer to a physical system, composed of three Phantom robots pushing an object to various desired target positions. We use a modified form of the natural policy gradient algorithm for learning, applied to a carefully identified simulation model. The resulting policies, trained entirely in simulation, work well on the physical system without additional training. In addition, we show that training with an ensemble of models makes the learned policies more robust to modeling errors, thus compensating for difficulties in system identification.

1802.04929 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

Context-Specific Validation of Data-Driven Models

基于情境的驱动模型验证

Somil Bansal, Shromona Ghosh, Alberto Sangiovanni-Vincentelli, Sanjit A. Seshia, Claire J. Tomlin

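AI summary This paper proposes a context-specific framework for validating data-driven models. It quantifies model quality by a distance measure between the closed-loop actual system and the learned model, and uses an active sampling scheme to compute a probabilistic upper bound on that distance in a sample-efficient manner.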

详情
AI中文摘要

随着数据驱动模型在机器人系统控制中的广泛应用,开发一种在部署前验证此类模型的方法变得至关重要。具体而言,必须确保为学习模型设计的控制器在实际物理系统上能按预期运行。本文提出了一种基于情境的验证框架,通过闭环实际系统与学习模型之间的距离度量来量化学习模型的质量。随后,我们提出了一种主动采样方案,以在样本高效的方式下计算该距离的概率上界。所提出的框架仅验证与模型预期用途相关的行为,且不需要任何先验系统动力学知识。几个模拟示例展示了该框架在验证真实系统模型及控制器合成中的实用性。

英文摘要

With an increasing use of data-driven models to control robotic systems, it has become important to develop a methodology for validating such models before they can be deployed to design a controller for the actual system. Specifically, it must be ensured that the controller designed for a learned model would perform as expected on the actual physical system. We propose a context-specific validation framework to quantify the quality of a learned model based on a distance measure between the closed-loop actual system and the learned model. We then propose an active sampling scheme to compute a probabilistic upper bound on this distance in a sample-efficient manner. The proposed framework validates the learned model against only those behaviors of the system that are relevant for the purpose for which we intend to use this model, and does not require any a priori knowledge of the system dynamics. Several simulations illustrate the practicality of the proposed framework for validating the models of real-world systems, and consequently, for controller synthesis.

1803.07661 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Efficient Recurrent Neural Networks using Structured Matrices in FPGAs

在FPGA上使用结构化矩阵实现高效的循环神经网络

Zhe Li, Shuo Wang, Caiwen Ding, Qinru Qiu, Yanzhi Wang, Yun Liang

发表机构 * Department of Electrical Engineering and Computer Science, Syracuse University, USA(Syracuse大学电气工程与计算机科学系) Center for Energy-efficient Computing and Applications (CECA), Peking University, China(北京大学能源高效计算与应用中心)

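AI summary This paper proposes representing RNN weight matrices with block-circulant matrices on FPGAs to achieve model compression and acceleration together; experiments show up to a 35.7x improvement in energy efficiency over ESE.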

Comments To appear in International Conference on Learning Representations 2018 Workshop Track

详情
AI中文摘要

循环神经网络(RNN)在时间序列相关应用中正变得越来越重要,要求高效的实时实现。最近基于剪枝的工作ESE由于剪枝后网络结构的不规则性导致性能/能效下降。我们提出在RNN中使用块循环矩阵来表示权重矩阵,从而实现同时的模型压缩和加速。我们的目标是在FPGA上实现最高性能和能效的RNN,同时满足一定的精度要求(可忽略的精度下降)。实验结果表明,所提出的框架在实际FPGA部署中相比ESE实现了最大能效提升35.7倍。

英文摘要

Recurrent Neural Networks (RNNs) are becoming increasingly important for time series-related applications which require efficient and real-time implementations. The recent pruning-based work ESE suffers from degraded performance and energy efficiency due to the irregular network structure after pruning. We propose block-circulant matrices for weight matrix representation in RNNs, thereby achieving simultaneous model compression and acceleration. We aim to implement RNNs on FPGAs with the highest performance and energy efficiency under a given accuracy requirement (negligible accuracy degradation). Experimental results on actual FPGA deployments show that the proposed framework achieves a maximum energy efficiency improvement of 35.7$\times$ compared with ESE.
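
The compression and acceleration that block-circulant weights offer can be seen in a small sketch: a k x k circulant block is fully determined by one length-k vector, and multiplying it by a vector is a circular convolution that an FFT computes in O(k log k). The code below assumes the convention C[i][j] = c[(i - j) mod k] (the block's first column is c), a block size that is a power of two, and hypothetical function names; the paper's own convention and its FPGA implementation may differ.

```python
import cmath

def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.

    The inverse transform is returned unnormalized (caller divides by n).
    """
    n = len(x)
    if n == 1:
        return list(x)
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def circulant_matvec_dense(c, x):
    """Reference O(n^2) product with the circulant matrix C[i][j] = c[(i - j) % n]."""
    n = len(c)
    return [sum(c[(i - j) % n] * x[j] for j in range(n)) for i in range(n)]

def circulant_matvec_fft(c, x):
    """Same product via the convolution theorem: IFFT(FFT(c) * FFT(x))."""
    n = len(c)
    prod = [a * b for a, b in zip(fft(c), fft(x))]
    return [v.real / n for v in fft(prod, inverse=True)]

def block_circulant_matvec(blocks, x, k):
    """y = W x where blocks[p][q] is the defining vector of the (p, q) k x k block."""
    y = []
    for block_row in blocks:
        acc = [0.0] * k
        for q, c in enumerate(block_row):
            part = circulant_matvec_fft(c, x[q * k:(q + 1) * k])
            acc = [a + b for a, b in zip(acc, part)]
        y.extend(acc)
    return y

c = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 0.0, -1.0, 2.0]
print(circulant_matvec_dense(c, x))
print(circulant_matvec_fft(c, x))
```

Storing one vector per block cuts the parameter count of each k x k block from k^2 to k, and the regular structure avoids the irregular memory access that pruning introduces.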

1711.02271 2026-06-04 math.NA cs.LG cs.NA 版本更新

High-order Tensor Completion for Data Recovery via Sparse Tensor-train Optimization

高阶张量补全:通过稀疏张量-列车优化实现数据恢复

Longhao Yuan, Qibin Zhao, Jianting Cao

Affiliations * Graduate School of Engineering, Saitama Institute of Technology, Japan; Tensor Learning Unit, RIKEN Center for Advanced Intelligence Project (AIP), Japan; School of Automation, Guangdong University of Technology, China; School of Computer Science and Technology, Hangzhou Dianzi University, China

AI总结 本文提出稀疏张量-列车优化算法,通过将缺失数据视为稀疏张量并利用一阶优化方法求解张量-列车分解因子,有效提升低阶和高阶张量补全性能,尤其在高缺失率下表现优异。

Comments 5 pages (including 1 page of references), ICASSP 2018

详情
AI中文摘要

本文针对张量数据补全问题,采用张量-列车分解因其强大的表示能力和线性可扩展性。我们提出稀疏张量-列车优化(STTO)算法,将不完整数据视为稀疏张量,并使用一阶优化方法求解张量-列车分解因子。我们的算法在低阶和高阶情况的模拟实验中表现良好。我们还采用张量化方法将数据转换为高阶形式以提升算法性能。各种图像恢复实验的结果表明,我们的方法优于其他补全算法。尤其是在缺失率非常高时,例如90%到99%,我们的方法显著优于最先进的方法。

英文摘要

In this paper, we address the problem of tensor data completion. Tensor-train decomposition is adopted because of its powerful representation ability and its linear scalability in the tensor order. We propose an algorithm named Sparse Tensor-train Optimization (STTO), which treats incomplete data as a sparse tensor and uses a first-order optimization method to find the factors of the tensor-train decomposition. Our algorithm is shown to perform well in simulation experiments in both low-order and high-order cases. We also employ a tensorization method to transform data to a higher-order form to enhance the performance of our algorithm. The results of image recovery experiments in various cases show that our method outperforms other completion algorithms. In particular, when the missing rate is very high, e.g., 90\% to 99\%, our method is significantly better than the state-of-the-art methods.

1707.03770 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Fastest Convergence for Q-learning

Q-learning 最快收敛算法

Adithya M. Devraj, Sean P. Meyn

发表机构 * University of Florida(佛罗里达大学) University of California, Berkeley(加州大学伯克利分校)

AI总结 本文提出Zap Q-learning算法,通过矩阵增益设计实现渐近方差最优,并通过ODE分析证明其瞬态行为接近确定性牛顿-拉夫森法,实验验证其在非理想参数化设置下的快速收敛性。

详情
AI中文摘要

本文提出的Zap Q-learning算法是对Watkins原始算法及近期竞争对手的改进,是一种设计使得渐近方差最优的矩阵增益算法。此外,通过ODE分析表明,其瞬态行为与确定性牛顿-拉夫森实现接近,这得益于对矩阵增益序列的两时间尺度更新方程。分析表明,该方法即使在非理想参数化设置下也能实现稳定高效的计算。数值实验验证了其在非理想情况下的快速收敛性。本文的次要目标是教程性的,前半部分对强化学习算法进行了综述,重点在于最小方差算法。

英文摘要

The Zap Q-learning algorithm introduced in this paper is an improvement of Watkins' original algorithm and recent competitors in several respects. It is a matrix-gain algorithm designed so that its asymptotic variance is optimal. Moreover, an ODE analysis suggests that the transient behavior is a close match to a deterministic Newton-Raphson implementation. This is made possible by a two time-scale update equation for the matrix gain sequence. The analysis suggests that the approach will lead to stable and efficient computation even for non-ideal parameterized settings. Numerical experiments confirm the quick convergence, even in such non-ideal cases. A secondary goal of this paper is tutorial. The first half of the paper contains a survey on reinforcement learning algorithms, with a focus on minimum variance algorithms.
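
For orientation, the sketch below shows the scalar-gain baseline that the abstract refers to as Watkins' original algorithm, not Zap Q-learning itself (which replaces the scalar step size with a matrix gain). The two-state MDP, the deterministic transitions, and the constant step size are assumptions chosen so the fixed point can be checked by hand.

```python
def q_learning(transitions, rewards, gamma, alpha, sweeps):
    """Watkins' Q-learning with a constant scalar step size alpha.

    transitions[s][a] is the (deterministic) next state and rewards[s][a]
    the reward; every state-action pair is updated once per sweep.
    """
    n_states, n_actions = len(transitions), len(transitions[0])
    q = [[0.0] * n_actions for _ in range(n_states)]
    for _ in range(sweeps):
        for s in range(n_states):
            for a in range(n_actions):
                s_next = transitions[s][a]
                target = rewards[s][a] + gamma * max(q[s_next])
                # Temporal-difference update toward the bootstrapped target.
                q[s][a] += alpha * (target - q[s][a])
    return q


# Toy MDP: action 0 stays, action 1 switches state; landing in state 1 pays 1.
transitions = [[0, 1], [1, 0]]
rewards = [[0.0, 1.0], [1.0, 0.0]]
q = q_learning(transitions, rewards, gamma=0.5, alpha=0.5, sweeps=200)
```

Solving the Bellman optimality equations for this toy problem by hand gives Q* = [[1, 2], [2, 1]], which the iteration approaches geometrically.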

1703.02660 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY 版本更新

Towards Generalization and Simplicity in Continuous Control

连续控制中的泛化与简洁性

Aravind Rajeswaran, Kendall Lowrey, Emanuel Todorov, Sham Kakade

发表机构 * University of Washington(华盛顿大学)

AI总结 本文展示简单线性与RBF参数化策略可解决多种连续控制任务,性能可与更复杂网络相媲美,且多样初始化提升泛化能力。

Comments NIPS 2017, Project page: https://sites.google.com/view/simple-pol

详情
AI中文摘要

本文表明,简单线性及RBF参数化策略可训练解决多种连续控制任务,包括OpenAI Gym基准。这些策略性能可与更复杂参数化方法相媲美。现有训练测试场景受限且易过拟合,导致仅轨迹中心策略。多样初始化产生更具全局性的策略,允许系统在大扰动下恢复,如补充视频所示。

英文摘要

This work shows that policies with simple linear and RBF parameterizations can be trained to solve a variety of continuous control tasks, including the OpenAI gym benchmarks. The performance of these trained policies is competitive with state-of-the-art results obtained with more elaborate parameterizations such as fully connected neural networks. Furthermore, existing training and testing scenarios are shown to be very limited and prone to over-fitting, thus giving rise to only trajectory-centric policies. Training with a diverse initial state distribution is shown to produce more global policies with better generalization. This allows for interactive control scenarios where the system recovers from large on-line perturbations, as shown in the supplementary video.

1803.06989 2026-06-04 math.ST cs.LG cs.NA math.NA stat.ML stat.TH 版本更新

Numerical Integration on Graphs: where to sample and how to weigh

图上的数值积分:在哪里采样和如何加权

George C. Linderman, Stefan Steinerberger

Affiliations * Program in Applied Mathematics, Yale University, New Haven, CT 06511, USA; Department of Mathematics, Yale University, New Haven, CT 06511, USA

AI总结 研究图上数值积分问题,通过热球最优打包几何问题重构积分,提出采样策略与加权方法,验证方法效率。

详情
AI中文摘要

设G=(V,E,w)为有限连通加权图。我们关注寻找顶点子集W⊆V和权重a_w,使得1/|V|∑_{v∈V}f(v)≈∑_{w∈W}a_wf(w),其中f:V→R是图几何下光滑的函数。主要应用是当f依赖于图结构但单点评估成本高的问题。证明不等式显示积分问题可转化为几何问题(最优热球打包)。讨论如何构造热球打包近似解;数值示例展示方法效率。

英文摘要

Let $G=(V,E,w)$ be a finite, connected graph with weighted edges. We are interested in the problem of finding a subset $W \subset V$ of vertices and weights $a_w$ such that $$ \frac{1}{|V|}\sum_{v \in V} f(v) \sim \sum_{w \in W} a_w f(w) $$ for functions $f:V \rightarrow \mathbb{R}$ that are `smooth' with respect to the geometry of the graph. The main applications are problems where $f$ is known to somehow depend on the underlying graph but is expensive to evaluate at even a single vertex. We prove an inequality showing that the integration problem can be rewritten as a geometric problem (`the optimal packing of heat balls'). We discuss how one would construct approximate solutions of the heat ball packing problem; numerical examples demonstrate the efficiency of the method.
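
The quadrature identity above can be tried on a small case. The sketch below uses a 12-vertex cycle graph and a slowly varying function; the choice of every third vertex with equal weights is a hypothetical stand-in for the heat-ball-packing construction, which the abstract does not spell out.

```python
import math


def graph_average(f_values):
    """Exact average of f over all vertices: (1/|V|) * sum_v f(v)."""
    return sum(f_values) / len(f_values)


def quadrature(f_values, nodes, weights):
    """Weighted estimate sum_w a_w f(w) using only the sampled vertices."""
    return sum(a * f_values[w] for w, a in zip(nodes, weights))


# Cycle graph on 12 vertices; f varies slowly along the cycle, i.e. it is
# smooth with respect to the graph geometry.
n = 12
f_values = [2.0 + math.cos(2.0 * math.pi * v / n) for v in range(n)]

# Hypothetical sampling rule: every third vertex, equal weights summing to 1.
nodes = [0, 3, 6, 9]
weights = [0.25] * len(nodes)

exact = graph_average(f_values)
estimate = quadrature(f_values, nodes, weights)
```

Here four evaluations of f reproduce the average over all twelve vertices up to rounding, because the sampled vertices are spread evenly relative to the scale on which f varies.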

1803.05026 2026-06-04 cs.LG cs.CV cs.IT cs.NA math.IT math.NA 版本更新

Principal Component Analysis with Tensor Train Subspace

张量列车子空间下的主成分分析

Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron

发表机构 * Purdue University(普渡大学)

AI总结 本文提出TT-PCA算法,通过保持低秩张量结构来估计结构化的张量列车子空间,相比PCA和Tucker-PCA更具鲁棒性,实验验证其有效性。

详情
AI中文摘要

张量列车是一种分层张量网络结构,通过参数化大规模多维数据集来缓解维度灾难。本文提出TT-PCA算法,用于从给定数据中估计这种结构化的张量列车子空间。通过保持低秩张量结构,TT-PCA比PCA或Tucker-PCA更具鲁棒性,这在测试扩展YaleFace数据集B时得到了数值验证。

英文摘要

Tensor train is a hierarchical tensor network structure that helps alleviate the curse of dimensionality by parameterizing large-scale multidimensional data via a network of low-rank tensors. Associated with such a construction is a notion of Tensor Train subspace, and in this paper we propose a TT-PCA algorithm for estimating this structured subspace from the given data. By maintaining the low-rank tensor structure, TT-PCA is more robust to noise than PCA or Tucker-PCA. This is borne out numerically by testing the proposed approach on the Extended YaleFace Dataset B.

1703.02419 2026-06-04 stat.CO cs.LG cs.SY eess.SY 版本更新

Probabilistic learning of nonlinear dynamical systems using sequential Monte Carlo

利用序贯蒙特卡洛方法进行非线性动力系统概率学习

Thomas B. Schön, Andreas Svensson, Lawrence Murray, Fredrik Lindsten

发表机构 * Department of Information Technology, Uppsala University(乌普萨拉大学信息科技系)

AI总结 本文提出基于序贯蒙特卡洛方法的概率非线性状态空间模型学习方法,通过粒子Metropolis-Hastings算法实现参数空间的高效采样,并展示了该方法在动态系统建模中的应用。

Comments Thomas B. Schön, Andreas Svensson, Lawrence Murray and Fredrik Lindsten, 2018. Probabilistic learning of nonlinear dynamical systems using sequential Monte Carlo. In Mechanical Systems and Signal Processing, Volume 104, pp. 866-883

详情
AI中文摘要

概率建模能够表示和操纵数据、模型、预测和决策中的不确定性。本文关注从测量数据中学习动态系统概率模型的问题,特别是非线性状态空间模型的学习。由于该问题没有闭式解,因此必须使用近似方法。本文提供了一个自包含的介绍,介绍了一种最先进的方法——粒子Metropolis-Hastings算法,该算法已被证明能提供实用的近似。这是一种基于蒙特卡洛的方法,其中粒子滤波用于引导马尔可夫链蒙特卡洛方法通过参数空间。粒子Metropolis-Hastings算法的一个关键优点是,在温和的假设下,它保证收敛到“真实解”,尽管它基于仅有限数量粒子的粒子滤波。本文还提供了一个数值示例,通过为序贯蒙特卡洛方法量身定制的建模语言来展示该方法。此类建模语言的目的是将高级蒙特卡洛方法(包括粒子Metropolis-Hastings)的威力带给大量用户,而无需他们了解所有底层数学细节。

英文摘要

Probabilistic modeling provides the capability to represent and manipulate uncertainty in data, models, predictions and decisions. We are concerned with the problem of learning probabilistic models of dynamical systems from measured data. Specifically, we consider learning of probabilistic nonlinear state-space models. There is no closed-form solution available for this problem, implying that we are forced to use approximations. In this tutorial we will provide a self-contained introduction to one of the state-of-the-art methods---the particle Metropolis--Hastings algorithm---which has proven to offer a practical approximation. This is a Monte Carlo based method, where the particle filter is used to guide a Markov chain Monte Carlo method through the parameter space. One of the key merits of the particle Metropolis--Hastings algorithm is that it is guaranteed to converge to the "true solution" under mild assumptions, despite being based on a particle filter with only a finite number of particles. We will also provide a motivating numerical example illustrating the method using a modeling language tailored for sequential Monte Carlo methods. The intention of modeling languages of this kind is to open up the power of sophisticated Monte Carlo methods---including particle Metropolis--Hastings---to a large group of users without requiring them to know all the underlying mathematical details.

1803.02553 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Graph Learning from Filtered Signals: Graph System and Diffusion Kernel Identification

基于滤波信号的图学习:图系统与扩散核识别

Hilmi E. Egilmez, Eduardo Pavez, Antonio Ortega

发表机构 * Department of Electrical Engineering, University of Southern California(电气工程系,南加州大学)

AI总结 本文提出一种新的图信号处理框架,用于从滤波信号类中构建图模型。通过将图建模问题转化为图系统识别问题,学习加权图(图拉普拉斯矩阵)和图滤波器(图拉普拉斯矩阵函数)。算法能从多信号观测中联合识别图和图滤波器,适用于学习扩散核,并在真实气候数据集上验证了其有效性。

Comments Submitted to IEEE Trans. on Signal and Information Processing over Networks (13 pages)

详情
AI中文摘要

本文介绍了一种新的图信号处理框架,用于从滤波信号类中构建图模型。在我们的框架中,图建模被公式化为图系统识别问题,目标是学习加权图(图拉普拉斯矩阵)和图滤波器(图拉普拉斯矩阵函数)。为了求解提出的问题,开发了一种算法,从多个信号/数据观测中联合识别图和图滤波器(GBF)。我们的算法在GBF是一一对应函数的假设下有效。所提出的方法可以应用于学习扩散(热)核,这些核在各种领域中用于建模扩散过程。此外,对于特定的图滤波器选择,所提出的问题减少为图拉普拉斯估计问题。我们的实验结果表明,所提出算法优于当前最先进的方法。我们还实现了该框架在一个真实气候数据集上,用于温度信号建模。

英文摘要

This paper introduces a novel graph signal processing framework for building graph-based models from classes of filtered signals. In our framework, graph-based modeling is formulated as a graph system identification problem, where the goal is to learn a weighted graph (a graph Laplacian matrix) and a graph-based filter (a function of graph Laplacian matrices). In order to solve the proposed problem, an algorithm is developed to jointly identify a graph and a graph-based filter (GBF) from multiple signal/data observations. Our algorithm is valid under the assumption that GBFs are one-to-one functions. The proposed approach can be applied to learn diffusion (heat) kernels, which are popular in various fields for modeling diffusion processes. In addition, for specific choices of graph-based filters, the proposed problem reduces to a graph Laplacian estimation problem. Our experimental results demonstrate that the proposed algorithm outperforms the current state-of-the-art methods. We also implement our framework on a real climate dataset for modeling of temperature signals.

1709.04407 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

An Inversion-Based Learning Approach for Improving Impromptu Trajectory Tracking of Robots with Non-Minimum Phase Dynamics

基于逆向学习的方法用于改进具有非最小相位动态的机器人即兴轨迹跟踪

Siqi Zhou, Mohamed K. Helwa, Angela P. Schoellig

Affiliations * Dynamic Systems Lab; Institute for Aerospace Studies; University of Toronto; Cairo University

AI总结 本文提出一种基于学习的方法,用于改进非最小相位系统的即兴轨迹跟踪,通过直接从输入输出数据学习稳定近似逆向,验证了方法的稳定性与高精度跟踪效果。

Comments Accepted for publication in the IEEE Robotics and Automation Letters (RA-L), July 2018

详情
AI中文摘要

本文提出了一种基于学习的方法,用于非最小相位系统的即兴轨迹跟踪。逆向前馈方法常用于提高跟踪性能,但无法直接应用于非最小相位系统,因为其固有不稳定。为解决此问题,现有方法假设系统模型已知,并使用预动作或逆向近似技术。本文提出了一种从输入输出数据直接学习稳定近似逆向的方法。通过理论讨论、模拟和两种不同平台的实验,展示了所提方法的稳定性及其在高精度即兴跟踪中的有效性。此外,本文还表明,在训练中包含更多信息,尽管通常被认为有用,但未必能提高性能,反而可能引发不稳定性并影响整体方法的效果。

英文摘要

This paper presents a learning-based approach for impromptu trajectory tracking for non-minimum phase systems, i.e., systems with unstable inverse dynamics. Inversion-based feedforward approaches are commonly used for improving tracking performance; however, these approaches are not directly applicable to non-minimum phase systems due to their inherent instability. In order to resolve the instability issue, existing methods have assumed that the system model is known and used pre-actuation or inverse approximation techniques. In this work, we propose an approach for learning a stable, approximate inverse of a non-minimum phase baseline system directly from its input-output data. Through theoretical discussions, simulations, and experiments on two different platforms, we show the stability of our proposed approach and its effectiveness for high-accuracy, impromptu tracking. Our approach also shows that including more information in the training, as is commonly assumed to be useful, does not lead to better performance but may trigger instability and impact the effectiveness of the overall approach.

1803.01626 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Variance-Aware Regret Bounds for Undiscounted Reinforcement Learning in MDPs

MDP中无折扣强化学习的方差感知遗憾界

Mohammad Sadegh Talebi, Odalric-Ambrym Maillard

发表机构 * KTH Royal Institute of Technology(皇家理工学院) INRIA Lille – Nord Europe(里尔-北欧洲研究所)

AI总结 本文基于平均回报准则,重新审视未知离散MDP中的强化学习问题,通过引入局部方差代替MDP直径,改进KL-UCRL算法的后悔界,提供更优的性能保证。

Comments To appear in Proceedings of the 29th International Conference on Algorithmic Learning Theory (ALT 2018)

详情
AI中文摘要

在未知和离散的马尔可夫决策过程(MDP)中,考虑在单一流观测下进行强化学习的问题,当学习者从初始状态开始与系统交互时。我们通过引入偏倚函数的局部方差代替MDP的直径,重新审视该问题的最小最大下界。此外,我们提供了KL-UCRL算法的新型分析,建立了高概率的后悔界,其规模为$\widetilde {\mathcal O}\Bigl({\textstyle \sqrt{S\sum_{s,a}{\bf V}^\star_{s,a}T}}\Big)$,适用于周期性MDP。该界优于之前已知的$\widetilde {\mathcal O}(DS\sqrt{AT})$界,其中$A$和$D$分别表示每个状态的最大动作数和MDP的直径。我们最终在一些基准MDP中比较了两个界的主导项,表明在某些情况下,所推导的界可以提供一个数量级的改进。我们的分析利用了运输引理的新变体结合KL集中不等式,我们认为这些方法具有独立的兴趣。

英文摘要

The problem of reinforcement learning in an unknown and discrete Markov Decision Process (MDP) under the average-reward criterion is considered, when the learner interacts with the system in a single stream of observations, starting from an initial state without any reset. We revisit the minimax lower bound for that problem by introducing the local variance of the bias function in place of the diameter of the MDP. Furthermore, we provide a novel analysis of the KL-UCRL algorithm, establishing a high-probability regret bound scaling as $\widetilde{\mathcal O}\Bigl({\textstyle \sqrt{S\sum_{s,a}{\bf V}^\star_{s,a}T}}\Bigr)$ for this algorithm for ergodic MDPs, where $S$ denotes the number of states and where ${\bf V}^\star_{s,a}$ is the variance of the bias function with respect to the next-state distribution following action $a$ in state $s$. The resulting bound improves upon the best previously known regret bound $\widetilde{\mathcal O}(DS\sqrt{AT})$ for that algorithm, where $A$ and $D$ respectively denote the maximum number of actions (per state) and the diameter of the MDP. We finally compare the leading terms of the two bounds in some benchmark MDPs, indicating that the derived bound can provide an order of magnitude improvement in some cases. Our analysis leverages novel variations of the transportation lemma combined with Kullback-Leibler concentration inequalities, which we believe to be of independent interest.

1803.00491 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

The Power Mean Laplacian for Multilayer Graph Clustering

多层图聚类的幂均值拉普拉斯

Pedro Mercado, Antoine Gautier, Francesco Tudisco, Matthias Hein

发表机构 * Department of Mathematics and Computer Science, Saarland University(萨尔兰大学数学与计算机科学系) Department of Mathematics and Statistics, University of Strathclyde(斯特拉斯克莱德大学数学与统计学系)

AI总结 本文提出一种参数化的矩阵幂均值方法,用于融合多层图的拉普拉斯矩阵,分析其在随机块模型中的期望性能,并在真实数据中验证其在不同设置下恢复真实聚类的能力。

Comments 19 pages, 3 figures. Accepted in Artificial Intelligence and Statistics (AISTATS), 2018

详情
AI中文摘要

多层图编码了相同实体集之间的不同种类的相互作用。当对这样的多层图进行聚类时,自然的问题是应如何融合不同层的信息。本文介绍了一种参数化的矩阵幂均值家族,用于融合不同层的拉普拉斯矩阵,并在随机块模型中分析其期望性能。我们证明该家族在不同设置下能够恢复真实聚类,并在真实世界数据中验证了这一点。尽管计算矩阵幂均值对于大图来说可能非常昂贵,我们引入了一种数值方案,以高效计算大规模稀疏图的幂均值的特征向量。

英文摘要

Multilayer graphs encode different kinds of interactions between the same set of entities. When one wants to cluster such a multilayer graph, the natural question arises of how one should merge the information from different layers. We introduce in this paper a one-parameter family of matrix power means for merging the Laplacians from different layers and analyze it in expectation in the stochastic block model. We show that this family allows recovery of ground truth clusters under different settings and verify this on real-world data. While computing the matrix power mean can be very expensive for large graphs, we introduce a numerical scheme to efficiently compute its eigenvectors for the case of large sparse graphs.

1802.10275 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Solving for high dimensional committor functions using artificial neural networks

利用人工神经网络求解高维承诺函数

Yuehaw Khoo, Jianfeng Lu, Lexing Ying

发表机构 * Department of Mathematics, Stanford University(数学系,斯坦福大学) Department of Mathematics, Department of Chemistry and Department of Physics, Duke University(数学系、化学系和物理系,杜克大学) Department of Mathematics and ICME, Stanford University(数学系和ICME,斯坦福大学)

AI总结 本文提出基于人工神经网络的方法,用于研究由随机过程支配的状态转换。通过变分公式和神经网络参数化,获得高维Fokker-Planck方程的承诺函数数值解,证明在高维问题中可实现中等精度。

Comments 12 pages, 6 figures

详情
AI中文摘要

在本注释中,我们提出了一种基于人工神经网络的方法,用于研究由随机过程支配的状态转换。特别是,我们旨在为过渡路径理论的核心对象——承诺函数,设计数值方案,该函数满足高维Fokker-Planck方程。通过处理此类偏微分方程的变分公式,并将承诺函数参数化为神经网络,可以利用随机算法优化神经网络权重来获得近似解。数值示例表明,对于高维问题可以实现中等精度。

英文摘要

In this note we propose a method based on artificial neural networks to study the transition between states governed by stochastic processes. In particular, we aim for numerical schemes for the committor function, the central object of transition path theory, which satisfies a high-dimensional Fokker-Planck equation. By working with the variational formulation of this partial differential equation and parameterizing the committor function in terms of a neural network, approximations can be obtained by optimizing the neural network weights using stochastic algorithms. The numerical examples show that moderate accuracy can be achieved for high-dimensional problems.

1802.00930 2026-06-04 cs.NE cs.LG cs.NA math.NA 版本更新

Mixed Precision Training of Convolutional Neural Networks using Integer Operations

使用整数运算进行卷积神经网络的混合精度训练

Dipankar Das, Naveen Mellempudi, Dheevatsa Mudigere, Dhiraj Kalamkar, Sasikanth Avancha, Kunal Banerjee, Srinivas Sridharan, Karthik Vaidyanathan, Bharat Kaul, Evangelos Georganas, Alexander Heinecke, Pradeep Dubey, Jesus Corbal, Nikita Shustrov, Roma Dubtsov, Evarist Fomenko, Vadim Pirogov

发表机构 * Parallel Computing Lab(并行计算实验室) Intel Labs, India(英特尔实验室,印度) Product Architecture Group(产品架构组) Intel Labs, SC Intel, OR(英特尔实验室,SC英特尔,美国) Software Services Group(软件服务组) Intel, OR(英特尔,美国)

AI总结 本文提出了一种基于整数运算的混合精度训练方法,在ImageNet-1K数据集上训练了ResNet-50、GoogLeNet-v1等SOTA网络,实现了比FP32更高的训练吞吐量和相同精度下的最高准确率。

Comments Published as a conference paper at ICLR 2018

详情
AI中文摘要

当前混合精度训练的SOTA主要基于低精度浮点运算,如FP16累积到FP32的变种。然而,在低精度和混合精度整数训练领域,已有研究要么针对非SOTA网络(如仅AlexNet用于ImageNet-1K),要么针对较小的数据集(如CIFAR-10)。本文在通用硬件上训练了SOTA视觉理解神经网络,使用整数运算。特别关注整数融合乘加(FMA)运算,其输入为两个INT16操作数,输出为INT32。我们提出了张量的共享指数表示,并开发了适用于常见神经网络操作的动态定点(DFP)方案。研究了高效整数卷积核的开发细节,包括处理INT32累加器溢出的方法。我们实现了ResNet-50、GoogLeNet-v1、VGG-16和AlexNet的CNN训练,这些网络在相同迭代次数下达到或超过FP32的SOTA准确率,无需改变超参数,并在端到端训练吞吐量上提高了1.8倍。据我们所知,这些结果是首次在通用硬件上使用SOTA CNNs在ImageNet-1K数据集上实现INT16训练的结果,并实现了最高报告的准确率。

英文摘要

The state-of-the-art (SOTA) for mixed precision training is dominated by variants of low precision floating point operations, and in particular, FP16 accumulating into FP32 (Micikevicius et al., 2017). On the other hand, while a lot of research has also happened in the domain of low and mixed-precision Integer training, these works either present results for non-SOTA networks (for instance only AlexNet for ImageNet-1K), or relatively small datasets (like CIFAR-10). In this work, we train state-of-the-art visual understanding neural networks on the ImageNet-1K dataset, with Integer operations on General Purpose (GP) hardware. In particular, we focus on Integer Fused-Multiply-and-Accumulate (FMA) operations which take two pairs of INT16 operands and accumulate results into an INT32 output. We propose a shared exponent representation of tensors and develop a Dynamic Fixed Point (DFP) scheme suitable for common neural network operations. The nuances of developing an efficient integer convolution kernel are examined, including methods to handle overflow of the INT32 accumulator. We implement CNN training for ResNet-50, GoogLeNet-v1, VGG-16 and AlexNet; these networks achieve or exceed SOTA accuracy within the same number of iterations as their FP32 counterparts without any change in hyper-parameters and with a 1.8X improvement in end-to-end training throughput. To the best of our knowledge, these results represent the first INT16 training results on GP hardware for the ImageNet-1K dataset using SOTA CNNs and achieve the highest reported accuracy using half-precision
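
The idea of a shared exponent can be sketched in a few lines. The code below is a simplified illustration under stated assumptions, not the paper's DFP scheme: the function names, the per-list (rather than per-tensor-block) exponent, and the rounding rule are all choices made for this example, and Python's unbounded integers sidestep the INT32 accumulator overflow that the paper handles explicitly.

```python
import math


def quantize_shared_exponent(values, bits=16):
    """Map floats to signed integer mantissas that share one exponent.

    Returns (mantissas, exp) with values[i] ~= mantissas[i] * 2**exp.
    """
    limit = 2 ** (bits - 1) - 1            # largest INT16 value for bits=16
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return [0] * len(values), 0
    _, e = math.frexp(max_abs)             # max_abs = m * 2**e, 0.5 <= m < 1
    exp = e - (bits - 1)                   # scale so the largest value fits
    scale = 2.0 ** exp
    mantissas = [max(-limit, min(limit, round(v / scale))) for v in values]
    return mantissas, exp


def integer_dot(a, a_exp, b, b_exp):
    """Dot product done entirely on integers; exponents simply add."""
    acc = sum(x * y for x, y in zip(a, b))  # Python ints cannot overflow
    return acc, a_exp + b_exp


values = [0.5, -0.25, 0.125]
mant, exp = quantize_shared_exponent(values)
acc, acc_exp = integer_dot(mant, exp, mant, exp)
result = acc * 2.0 ** acc_exp
```

Because one exponent covers the whole list, the multiply-accumulate loop touches only integers, and the scale is applied once at the end.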

1702.03258 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Adaptive and Resilient Soft Tensegrity Robots

自适应且具有韧性的软 tensegrity 机器人

John Rieffel, Jean-Baptiste Mouret

发表机构 * Union College(联合学院) Inria Nancy Grand - Est CNRS, Loria, UMR 7503(法国国家科学研究中心(CNRS)、洛里亚实验室(Loria)、UMR 7503)

AI总结 本文提出一种易于组装的基于 tensegrity 的软机器人,能产生高动态运动步态,并在物理损伤下表现出结构和行为韧性,通过机器学习算法实现有效步态发现。

Comments video: https://youtu.be/SuLQDhrk9tQ

详情
AI中文摘要

生物体结合软(如肌肉)和硬(如骨骼)材料,赋予其内在灵活性和韧性,这在传统刚性机器人中往往缺失。新兴的软机器人领域试图利用这些特性创造韧性机器。然而,软材料的性质给设计、制造和控制带来了重大挑战,迄今为止,大多数软机器人的步态都是通过经验试错法手动设计的。本文描述了一种易于组装的基于 tensegrity 的软机器人,能够产生高度动态的运动步态,并在面对物理损伤时表现出结构和行为的韧性。使这一成果成为可能的是使用一种机器学习算法,能够以最少的物理试验发现有效的步态。这些结果进一步支持了软机器人方法,旨在利用复杂材料动力学的相互作用,以生成丰富的动态行为。

英文摘要

Living organisms intertwine soft (e.g., muscle) and hard (e.g., bones) materials, giving them an intrinsic flexibility and resiliency often lacking in conventional rigid robots. The emerging field of soft robotics seeks to harness these same properties in order to create resilient machines. The nature of soft materials, however, presents considerable challenges to aspects of design, construction, and control -- and up until now, the vast majority of gaits for soft robots have been hand-designed through empirical trial-and-error. This manuscript describes an easy-to-assemble tensegrity-based soft robot capable of highly dynamic locomotive gaits and demonstrating structural and behavioral resilience in the face of physical damage. Enabling this is the use of a machine learning algorithm able to discover effective gaits with a minimal number of physical trials. These results lend further credence to soft-robotic approaches that seek to harness the interaction of complex material dynamics in order to generate a wealth of dynamical behaviors.

1705.08435 2026-06-04 cs.LG cs.CR cs.DC cs.SY eess.SY stat.ML 版本更新

Personalized and Private Peer-to-Peer Machine Learning

个性化与隐私保护的点对点机器学习

Aurélien Bellet, Rachid Guerraoui, Mahsa Taziki, Marc Tommasi

Affiliations * INRIA; EPFL; Université de Lille

AI总结 本文提出一种高效算法,实现去中心化且异步的个性化机器学习,在强隐私要求下保证收敛性。通过差分隐私保护数据隐私,并分析隐私与效用的平衡。实验表明,在非隐私情况下优于先前方法,隐私约束下可显著提升模型性能。

Comments 20 pages, to appear in the Proceedings of the 21st International Conference on Artificial Intelligence and Statistics (AISTATS 2018)

详情
AI中文摘要

随着连接个人设备的兴起和隐私问题的出现,需要能够利用大量代理数据学习个性化模型的机器学习算法,同时满足严格的隐私要求。本文介绍了一种高效的算法,以完全去中心化(点对点)和异步方式解决上述问题,并具有可证明的收敛速度。我们展示了如何使算法具有差分隐私性,以保护个人数据集信息的泄露,并正式分析效用与隐私之间的权衡。我们的实验表明,在非隐私情况下,我们的方法显著优于先前工作,在隐私约束下,我们可以在孤立学习的模型上取得显著改进。

英文摘要

The rise of connected personal devices together with privacy concerns call for machine learning algorithms capable of leveraging the data of a large number of agents to learn personalized models under strong privacy requirements. In this paper, we introduce an efficient algorithm to address the above problem in a fully decentralized (peer-to-peer) and asynchronous fashion, with provable convergence rate. We show how to make the algorithm differentially private to protect against the disclosure of information about the personal datasets, and formally analyze the trade-off between utility and privacy. Our experiments show that our approach dramatically outperforms previous work in the non-private case, and that under privacy constraints, we can significantly improve over models learned in isolation.

1708.07827 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Second-Order Optimization for Non-Convex Machine Learning: An Empirical Study

非凸机器学习中的二阶优化:一项实证研究

Peng Xu, Farbod Roosta-Khorasani, Michael W. Mahoney

发表机构 * Institute for Computational and Mathematical Engineering, Stanford University(计算与数学工程研究所,斯坦福大学) School of Mathematics and Physics, University of Queensland(数学与物理学院,昆士兰大学) International Computer Science Institute, Berkeley, USA(国际计算机科学研究所,伯克利,美国) International Computer Science Institute and Department of Statistics, University of California at Berkeley(国际计算机科学研究所和统计学系,加州大学伯克利分校)

AI总结 本文通过实证研究评估了非凸机器学习问题中牛顿型方法的性能,证明其在泛化性能和超参数鲁棒性方面优于传统SGD,能有效逃离平坦区域和鞍点。

Comments 21 pages, 11 figures. Restructured the paper and added experiments

详情
AI中文摘要

尽管随机梯度下降(SGD)等一阶优化方法在机器学习(ML)中广泛应用,但它们存在收敛速度慢、超参数设置敏感、易陷入高训练误差和难以逃离平坦区域及鞍点等缺陷。在高度非凸设置(如神经网络中)尤为明显。受此启发,近期有研究关注二阶方法,旨在通过捕捉曲率信息缓解这些不足。本文报告了针对非凸ML问题的一类牛顿型方法——信任区域(TR)和自适应三次正则化(ARC)算法的子采样变体的详细实证评估。在此过程中,我们证明这些方法不仅在计算上与手工调优的SGD加动量方法具有竞争力,泛化性能可比或更优,而且对超参数设置具有高度鲁棒性。此外,与SGD加动量相比,这些牛顿型方法利用曲率信息的方式使其能够无缝逃离平坦区域和鞍点。

英文摘要

While first-order optimization methods such as stochastic gradient descent (SGD) are popular in machine learning (ML), they come with well-known deficiencies, including relatively slow convergence, sensitivity to the settings of hyper-parameters such as the learning rate, stagnation at high training errors, and difficulty in escaping flat regions and saddle points. These issues are particularly acute in highly non-convex settings such as those arising in neural networks. Motivated by this, there has been recent interest in second-order methods that aim to alleviate these shortcomings by capturing curvature information. In this paper, we report detailed empirical evaluations of a class of Newton-type methods, namely sub-sampled variants of trust region (TR) and adaptive regularization with cubics (ARC) algorithms, for non-convex ML problems. In doing so, we demonstrate that these methods can be computationally competitive with hand-tuned SGD with momentum, obtaining comparable or better generalization performance, and that they are also highly robust to hyper-parameter settings. Further, in contrast to SGD with momentum, we show that the manner in which these Newton-type methods employ curvature information allows them to seamlessly escape flat regions and saddle points.

1605.09232 2026-06-04 math.NA cs.LG cs.NA cs.NE math.OC stat.ML 版本更新

Tradeoffs between Convergence Speed and Reconstruction Accuracy in Inverse Problems

反问题中收敛速度与重建精度之间的权衡

Raja Giryes, Yonina C. Eldar, Alex M. Bronstein, Guillermo Sapiro

Affiliations * School of Electrical Engineering, Tel Aviv University; Electrical Engineering Department, Technion - IIT; Computer Science Department, Technion - IIT; Electrical and Computer Engineering Department, Duke University

AI总结 研究探讨了在逆问题中,通过调整迭代算法以加快收敛速度同时保持重建精度的可行性,结合低维集的恢复技术,分析了粗略估计对收敛速度的影响。

Comments To appear in IEEE Transactions on Signal Processing

详情
AI中文摘要

使用迭代算法求解逆问题在大数据中很流行。由于时间限制,迭代次数通常有限,可能影响可实现的精度。给定可接受的误差范围,一个重要问题是是否可以通过修改原始迭代方法,获得更快收敛到达到允许误差的极小值点,而不显著增加每次迭代的计算成本。基于最近为某些低维集信号恢复开发的恢复技术,我们表明使用该集的粗略估计可能以额外的重建误差为代价加快收敛。我们的理论与稀疏恢复、压缩感知和深度学习的最新进展相关。特别是,它可能为神经网络通过层表示迭代来近似l1最小化解的成功提供了可能的解释,如在学习迭代收缩阈值算法(LISTA)中实践的那样。

英文摘要

Solving inverse problems with iterative algorithms is popular, especially for large data. Due to time constraints, the number of possible iterations is usually limited, potentially affecting the achievable accuracy. Given an error one is willing to tolerate, an important question is whether it is possible to modify the original iterations to obtain faster convergence to a minimizer achieving the allowed error without considerably increasing the computational cost of each iteration. Relying on recent recovery techniques developed for settings in which the desired signal belongs to some low-dimensional set, we show that using a coarse estimate of this set may lead to faster convergence at the cost of an additional reconstruction error related to the accuracy of the set approximation. Our theory ties to recent advances in sparse recovery, compressed sensing, and deep learning. In particular, it may provide a possible explanation for the successful approximation of the $\ell_1$-minimization solution by neural networks with layers representing iterations, as practiced in the learned iterative shrinkage-thresholding algorithm (LISTA).
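
As background for the LISTA remark, the sketch below implements plain ISTA, the iteration that LISTA unrolls into a fixed number of layers with learned parameters. It is a generic textbook version, not the modified iteration analyzed in the paper; the function names and the tiny identity-matrix example are assumptions made so the answer can be checked in closed form.

```python
def soft_threshold(v, t):
    """Shrink v toward zero by t (the proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def ista(A, y, lam, step, iterations):
    """Minimize 0.5 * ||A x - y||^2 + lam * ||x||_1 by ISTA.

    step should be at most 1 / L, where L is the largest eigenvalue of A^T A.
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    for _ in range(iterations):
        residual = [sum(A[i][j] * x[j] for j in range(n)) - y[i] for i in range(m)]
        grad = [sum(A[i][j] * residual[i] for i in range(m)) for j in range(n)]
        x = [soft_threshold(x[j] - step * grad[j], step * lam) for j in range(n)]
    return x


# With A = I the minimizer is known in closed form: soft_threshold(y, lam).
A = [[1.0, 0.0], [0.0, 1.0]]
y = [3.0, 0.2]
x_one = ista(A, y, lam=0.5, step=1.0, iterations=1)
x_many = ista(A, y, lam=0.5, step=1.0, iterations=50)
```

In this degenerate case one iteration already reaches the minimizer; for a general A, stopping after a small, fixed number of iterations is exactly where the speed versus accuracy tradeoff discussed above arises.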

1802.03981 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Spectral Filtering for General Linear Dynamical Systems

谱滤波用于通用线性动态系统

Elad Hazan, Holden Lee, Karan Singh, Cyril Zhang, Yi Zhang

发表机构 * Department of Computer Science, Princeton University(普林斯顿大学计算机科学系) Google Brain(谷歌大脑) Department of Mathematics, Princeton University(普林斯顿大学数学系)

AI总结 本文提出一种多项式时间算法,用于学习无系统识别假设的隐状态线性动态系统,无需假设系统转移矩阵的谱半径。该算法扩展了谱滤波技术,通过新的凸松弛方法高效识别相位。

详情
AI中文摘要

我们提出了一种多项式时间算法,用于学习隐状态线性动态系统,而无需系统识别,也不假设系统转移矩阵的谱半径。该算法扩展了最近引入的谱滤波技术,该技术先前仅应用于具有对称转移矩阵的系统,通过新的凸松弛方法允许高效识别相位。

英文摘要

We give a polynomial-time algorithm for learning latent-state linear dynamical systems without system identification, and without assumptions on the spectral radius of the system's transition matrix. The algorithm extends the recently introduced technique of spectral filtering, previously applied only to systems with a symmetric transition matrix, using a novel convex relaxation to allow for the efficient identification of phases.

1605.00031 2026-06-04 cs.LG cs.CV cs.NA math.NA stat.ML 版本更新

Deep Convolutional Neural Networks on Cartoon Functions

深度卷积神经网络在卡通函数上的应用

Philipp Grohs, Thomas Wiatowski, Helmut Bölcskei

Affiliations * Dept. Math., ETH Zurich, Switzerland

AI总结 本文研究深度卷积神经网络在卡通函数上的变形稳定性,提出考虑结构特性的新结果,适用于具有尖锐和弯曲不连续性的信号。

Comments This is a slightly updated version of the paper published in the ISIT proceedings. Specifically, we corrected errors in the arguments on the volume of tubes. Note that this correction does not affect the main statements of the paper

详情
Journal ref
Proc. of IEEE International Symposium on Information Theory (ISIT), Barcelona, Spain, pp. 1163-1167, July 2016
AI中文摘要

Wiatowski和Bölcskei, 2015证明了深度卷积神经网络基于的特征提取器的变形稳定性和垂直平移不变性由网络结构本身保证,而非特定卷积核和非线性。虽然平移不变性结果适用于平方可积函数,变形稳定性界仅适用于带限函数。然而,许多实际相关信号(如自然图像)表现出尖锐和弯曲的不连续性,因此不是带限的。本文的主要贡献是针对Donoho, 2001引入的卡通函数类建立变形稳定性界。

英文摘要

Wiatowski and Bölcskei, 2015, proved that deformation stability and vertical translation invariance of deep convolutional neural network-based feature extractors are guaranteed by the network structure per se rather than the specific convolution kernels and non-linearities. While the translation invariance result applies to square-integrable functions, the deformation stability bound holds for band-limited functions only. Many signals of practical relevance (such as natural images) exhibit, however, sharp and curved discontinuities and are, hence, not band-limited. The main contribution of this paper is a deformation stability result that takes these structural properties into account. Specifically, we establish deformation stability bounds for the class of cartoon functions introduced by Donoho, 2001.

1705.07364 2026-06-04 cs.LG cs.CV cs.NA math.NA 版本更新

Stabilizing Adversarial Nets With Prediction Methods

用预测方法稳定对抗网络

Abhay Yadav, Sohil Shah, Zheng Xu, David Jacobs, Tom Goldstein

发表机构 * University of Maryland, College Park(马里兰大学帕克分校)

AI总结 本文提出一种改进的随机梯度下降方法,通过稳定对抗网络的训练过程,使其更可靠地收敛到鞍点,提高训练稳定性与效率。

Comments Accepted at ICLR 2018

详情
AI中文摘要

对抗神经网络在数据科学中解决了很多重要问题,但训练却极具挑战性。这些困难源于对抗网络的最优权重对应于损失函数的鞍点而非极小值。通常用于此类问题的交替随机梯度方法难以可靠收敛到鞍点,且当收敛时对学习率极为敏感。本文提出一种简单的随机梯度下降修改方法,以稳定对抗网络。理论和实践中均表明,所提方法可靠收敛到鞍点,并在更宽的训练参数范围内保持稳定。这使对抗网络更少出现'崩溃'现象,并允许使用更大的学习率进行更快的训练。

英文摘要

Adversarial neural networks solve many important problems in data science, but are notoriously difficult to train. These difficulties come from the fact that optimal weights for adversarial nets correspond to saddle points, and not minimizers, of the loss function. The alternating stochastic gradient methods typically used for such problems do not reliably converge to saddle points, and when convergence does happen it is often highly sensitive to learning rates. We propose a simple modification of stochastic gradient descent that stabilizes adversarial networks. We show, both in theory and practice, that the proposed method reliably converges to saddle points, and is stable with a wider range of training parameters than a non-prediction method. This makes adversarial networks less likely to "collapse," and enables faster training with larger learning rates.
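
The stabilizing effect can be illustrated on the simplest saddle-point problem, L(u, v) = u * v, whose saddle point is (0, 0). The sketch below is only an illustration: the abstract does not spell out the modification, so the prediction step is assumed here to be the extrapolation u_bar = u_new + (u_new - u), and the function name `run`, the step size, and the toy objective are all hypothetical choices.

```python
import math

def run(steps, lr, predict):
    """Alternating descent on u / ascent on v for L(u, v) = u * v."""
    u, v = 1.0, 1.0
    for _ in range(steps):
        u_new = u - lr * v                  # descent step on u (dL/du = v)
        # Assumed prediction step: extrapolate u along its latest update.
        u_bar = u_new + (u_new - u) if predict else u_new
        v = v + lr * u_bar                  # ascent step on v (dL/dv = u)
        u = u_new
    return math.hypot(u, v)                 # distance to the saddle point

plain = run(2000, 0.1, predict=False)       # keeps orbiting the saddle point
stabilized = run(2000, 0.1, predict=True)   # spirals in towards (0, 0)
print(plain, stabilized)
```

On this toy problem the plain alternating iteration has an update matrix with determinant 1, so it circles the saddle point without approaching it, while the extrapolated version has determinant 1 - lr^2 and contracts.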

1709.07089 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

On the Design of LQR Kernels for Efficient Controller Learning

关于为高效控制器学习设计LQR核

Alonso Marco, Philipp Hennig, Stefan Schaal, Sebastian Trimpe

发表机构 * Max Planck Institute for Intelligent Systems(马克斯·普朗克智能系统研究所) Computational Learning and Motor Control Lab(计算学习与运动控制实验室)

AI总结 本文提出两种基于LQR结构的核,用于改进基于贝叶斯优化的控制器学习,通过模拟线性和非线性系统证明其优于传统GP方法。

Comments 8 pages, 5 figures, to appear in 56th IEEE Conference on Decision and Control (CDC 2017)

详情
AI中文摘要

从数据中为非线性动态系统寻找最优反馈控制器是困难的。最近,贝叶斯优化(BO)被提出作为直接从实验试次中调整控制器的强大框架。为了选择下一个查询点并找到全局最优解,BO依赖于潜在目标函数的概率描述,通常为高斯过程(GP)。本文显示,使用常见核的GP在标准二次控制问题上可能导致学习效果差。对于一阶系统,我们构建了两种核,专门利用广为人知的线性二次调节器(LQR)的结构,同时保留贝叶斯非参数学习的灵活性。对不确定线性和非线性系统的模拟显示,LQR核在学习性能上优于传统GP方法。

英文摘要

Finding optimal feedback controllers for nonlinear dynamic systems from data is hard. Recently, Bayesian optimization (BO) has been proposed as a powerful framework for direct controller tuning from experimental trials. For selecting the next query point and finding the global optimum, BO relies on a probabilistic description of the latent objective function, typically a Gaussian process (GP). As is shown herein, GPs with a common kernel choice can, however, lead to poor learning outcomes on standard quadratic control problems. For a first-order system, we construct two kernels that specifically leverage the structure of the well-known Linear Quadratic Regulator (LQR), yet retain the flexibility of Bayesian nonparametric learning. Simulations of uncertain linear and nonlinear systems demonstrate that the LQR kernels yield superior learning performance.

1707.09676 2026-06-04 cs.LG cs.SY eess.SY math.OC 版本更新

Model-Free Renewable Scenario Generation Using Generative Adversarial Networks

基于生成对抗网络的无模型可再生能源场景生成

Yize Chen, Yishen Wang, Daniel Kirschen, Baosen Zhang

发表机构 * University of Washington(华盛顿大学)

AI总结 本文提出一种基于生成对抗网络的数据驱动场景生成方法,用于高效生成具有时空相关性的可再生能源生产模式。

Comments Accepted to IEEE Transactions on Power Systems; code available at https://github.com/chennnnnyize/Renewables_Scenario_Gen_GAN

详情
AI中文摘要

场景生成是高可再生能源渗透电力系统运行和规划中的重要步骤。本文提出了一种基于生成对抗网络的数据驱动方法,该方法基于两个相互连接的深度神经网络。与基于概率模型的方法相比,我们的方法能够捕捉大量相关资源在时间和空间维度上的可再生能源生产模式。通过使用NREL集成数据集中的风能和太阳能时间序列数据进行验证,证明了所提方法能够生成具有多样行为的逼真风能和光伏发电曲线。此外,通过使用训练期间的标注数据,展示了如何根据不同感兴趣的条件生成场景。例如,可以基于天气事件(如高风日)或年份时间(如七月某日的太阳能发电)来生成场景。由于神经网络的前馈性质,无需复杂的采样技术即可高效生成场景。

英文摘要

Scenario generation is an important step in the operation and planning of power systems with high renewable penetrations. In this work, we propose a data-driven approach for scenario generation using generative adversarial networks, which is based on two interconnected deep neural networks. Compared with existing methods based on probabilistic models that are often hard to scale or sample from, our method is data-driven, and captures renewable energy production patterns in both temporal and spatial dimensions for a large number of correlated resources. For validation, we use wind and solar time-series data from NREL integration data sets. We demonstrate that the proposed method is able to generate realistic wind and photovoltaic power profiles with full diversity of behaviors. We also illustrate how to generate scenarios based on different conditions of interest by using labeled data during training. For example, scenarios can be conditioned on weather events (e.g. high wind day) or time of the year (e.g. solar generation for a day in July). Because of the feedforward nature of the neural networks, scenarios can be generated extremely efficiently without sophisticated sampling techniques.

1704.00227 2026-06-04 cs.LG cs.NA math.NA 版本更新

Online and Stable Learning of Analysis Operators

在线和稳定的分析算子学习

Michael Sandbichler, Karin Schnass

发表机构 * Department of Mathematics, University of Innsbruck(因斯布鲁克大学数学系)

AI总结 本文提出四种在线学习分析算子的迭代算法,基于优化原则,改进了分析K-SVD和分析SimCO,通过投影梯度下降、隐式欧拉方案和奇异值策略,在合成和图像数据上表现出更好的恢复率和更快的收敛速度。

Comments 21 pages, 12 figures, 6 tables

详情
AI中文摘要

本文提出了四种用于学习分析算子的迭代算法。它们基于分析K-SVD和分析SimCO所依赖的相同优化原则。FAOL和SAOL算法基于投影梯度下降法,使用最优步长;IAOL算法受隐式欧拉方案启发,无需选择步长;SVAOL算法采用类似分析K-SVD的策略,但避免其高计算成本。所有算法在每一步都证明能减少或保持目标函数,并提供了其平稳点的特征描述。进一步在合成和图像数据上测试,与分析SimCO相比,显示出更好的恢复率和更快的目标函数衰减。在最终的去噪实验中,所提算法的表现与最先进的ASimCO算法相当或更优。

英文摘要

In this paper, four iterative algorithms for learning analysis operators are presented. They are built upon the same optimisation principle underlying both Analysis K-SVD and Analysis SimCO. The Forward and Sequential Analysis Operator Learning (FAOL and SAOL) algorithms are based on projected gradient descent with optimally chosen step size. The Implicit AOL (IAOL) algorithm is inspired by the implicit Euler scheme for solving ordinary differential equations and does not require choosing a step size. The fourth algorithm, Singular Value AOL (SVAOL), uses a similar strategy to Analysis K-SVD while avoiding its high computational cost. All algorithms are proven to decrease or preserve the target function in each step, and a characterisation of their stationary points is provided. Furthermore, they are tested on synthetic and image data and compared to Analysis SimCO, and are found to give better recovery rates and faster decay of the objective function, respectively. In a final denoising experiment, the presented algorithms are again shown to perform similarly to or better than the state-of-the-art algorithm ASimCO.

1801.09657 2026-06-04 math.NA cs.LG cs.NA stat.ME 版本更新

Matrix Completion for Structured Observations

结构观测的矩阵补全

Denali Molitor, Deanna Needell

发表机构 * University of California, Los Angeles(加州大学洛杉矶分校)

AI总结 本文提出改进的核范数最小化方法,以考虑观测与未观测条目间的结构性差异,提升矩阵补全效果。

详情
AI中文摘要

预测或填补缺失数据(即矩阵补全)是当今数据驱动世界中的常见挑战。以往策略通常假设观测与缺失条目之间无结构性差异。不幸的是,这一假设在许多应用中显得不现实。例如,在经典的Netflix挑战中,预测用户对未观看电影的评分时,观众未观看某部电影可能表明对该电影缺乏兴趣,从而建议评分低于预期。本文提出调整标准核范数最小化策略,通过正则化未观测条目的值来考虑此类结构性差异。我们证明在某些情况下,所提方法优于核范数最小化。

英文摘要

The need to predict or fill in missing data, often referred to as matrix completion, is a common challenge in today's data-driven world. Previous strategies typically assume that no structural difference between observed and missing entries exists. Unfortunately, this assumption is woefully unrealistic in many applications. For example, in the classic Netflix challenge, in which one hopes to predict user-movie ratings for unseen films, the fact that the viewer has not watched a given movie may indicate a lack of interest in that movie, thus suggesting a lower rating than otherwise expected. We propose adjusting the standard nuclear norm minimization strategy for matrix completion to account for such structural differences between observed and unobserved entries by regularizing the values of the unobserved entries. We show that the proposed method outperforms nuclear norm minimization in certain settings.

1705.07252 2026-06-04 cs.LG cs.NA math.NA 版本更新

SVM via Saddle Point Optimization: New Bounds and Distributed Algorithms

通过鞍点优化实现SVM:新的界和分布式算法

Yifei Jin, Lingxiao Huang, Jian Li

发表机构 * Tsinghua University(清华大学) EPFL(洛桑联邦理工学院)

AI总结 本文提出基于鞍点优化的新算法,为硬边距SVM和ν-SVM提供近线性时间复杂度的解决方案,并在分布式环境下实现高效通信。

详情
AI中文摘要

我们研究了两种重要的SVM变体:硬边距SVM(用于线性可分情况)和ν-SVM(用于线性不可分情况)。我们从鞍点优化的角度提出新算法。我们的算法在两种变体上均能实现(1-ε)近似解,运行时间为Õ(nd + n√(d/ε)),其中n是点数,d是维度。据我们所知,目前最好的ν-SVM算法基于二次规划方法,最坏情况下需要Ω(n²d)时间~\cite{joachims1998making,platt199912}。本文为ν-SVM提供了首个近线性时间算法。硬边距SVM目前的最佳算法由Gilbert算法~\cite{gartner2009coresets}实现,需要O(nd/ε)时间。我们的算法将运行时间改进了√d/√ε倍。此外,我们的算法可以自然地在分布式设置中实现。我们证明我们的算法需要Õ(k(d + √(d/ε)))的通信成本,其中k是客户端数量,这几乎达到理论下界。数值实验支持我们的理论,并显示我们的算法在高维、大规模和密集数据集上比先前方法收敛更快。

英文摘要

We study two important SVM variants: hard-margin SVM (for linearly separable cases) and $ν$-SVM (for linearly non-separable cases). We propose new algorithms from the perspective of saddle point optimization. Our algorithms achieve $(1-ε)$-approximations with running time $\tilde{O}(nd+n\sqrt{d / ε})$ for both variants, where $n$ is the number of points and $d$ is the dimensionality. To the best of our knowledge, the current best algorithm for $ν$-SVM is based on a quadratic programming approach, which requires $Ω(n^2 d)$ time in the worst case~\cite{joachims1998making,platt199912}. In this paper, we provide the first nearly linear time algorithm for $ν$-SVM. The current best algorithm for hard-margin SVM, achieved by the Gilbert algorithm~\cite{gartner2009coresets}, requires $O(nd / ε)$ time. Our algorithm improves the running time by a factor of $\sqrt{d}/\sqrt{ε}$. Moreover, our algorithms can be implemented naturally in distributed settings. We prove that our algorithms require $\tilde{O}(k(d +\sqrt{d/ε}))$ communication cost, where $k$ is the number of clients, which almost matches the theoretical lower bound. Numerical experiments support our theory and show that our algorithms converge faster on high-dimensional, large and dense data sets, as compared to previous methods.

1705.07262 2026-06-04 cs.LG cs.AI cs.NE cs.SY eess.SY 版本更新

Batch Reinforcement Learning on the Industrial Benchmark: First Experiences

批量强化学习在工业基准上的应用:初步经验

Daniel Hein, Steffen Udluft, Michel Tokic, Alexander Hentschel, Thomas A. Runkler, Volkmar Sterzing

发表机构 * Technical University of Munich, Department of Informatics(慕尼黑技术大学信息学院) Siemens AG, Corporate Technology(西门子股份公司企业技术部)

AI总结 本文研究了粒子群优化策略在工业基准上的表现,展示了其在真实应用场景中的有效性,相比传统方法,PSO-P在性能和鲁棒性上表现突出。

详情
Journal ref
2017 International Joint Conference on Neural Networks (IJCNN), Anchorage, AK, 2017, pp. 4214-4221
AI中文摘要

粒子群优化策略(PSO-P)近期被提出,并已证明在离策略(off-policy)、批量设置下与学术强化学习基准交互时能取得显著成果。为进一步研究其性质及其在真实应用中的可行性,本文在所谓的工业基准(IB)上研究PSO-P。IB是一个新的强化学习(RL)基准,通过纳入工业应用中常见的多种特性(如连续状态和动作空间、高维部分可观测状态空间、延迟效应和复杂随机性)力求贴近现实。我们将PSO-P在IB上的实验结果与由基于模型的递归控制神经网络(RCNN)和无模型的神经拟合Q迭代(NFQ)得到的闭式控制策略的结果进行比较。实验表明,PSO-P不仅对学术基准有意义,对真实世界的工业应用同样有意义,因为它在我们的IB设置中也得到了表现最佳的策略。与其他成熟的RL技术相比,PSO-P在性能和鲁棒性上表现出色,且只需相对较少的精力来寻找合适的参数或做出复杂的设计决策。

英文摘要

The Particle Swarm Optimization Policy (PSO-P) has been recently introduced and proven to produce remarkable results on interacting with academic reinforcement learning benchmarks in an off-policy, batch-based setting. To further investigate the properties and feasibility on real-world applications, this paper investigates PSO-P on the so-called Industrial Benchmark (IB), a novel reinforcement learning (RL) benchmark that aims at being realistic by including a variety of aspects found in industrial applications, like continuous state and action spaces, a high dimensional, partially observable state space, delayed effects, and complex stochasticity. The experimental results of PSO-P on IB are compared to results of closed-form control policies derived from the model-based Recurrent Control Neural Network (RCNN) and the model-free Neural Fitted Q-Iteration (NFQ). Experiments show that PSO-P is not only of interest for academic benchmarks, but also for real-world industrial applications, since it also yielded the best performing policy in our IB setting. Compared to other well established RL techniques, PSO-P produced outstanding results in performance and robustness, requiring only a relatively low amount of effort in finding adequate parameters or making complex design decisions.

1801.06637 2026-06-04 stat.ML cs.LG cs.NA math.AP math.NA 版本更新

Deep Hidden Physics Models: Deep Learning of Nonlinear Partial Differential Equations

深度隐物理模型:深度学习非线性偏微分方程

Maziar Raissi

发表机构 * Division of Applied Mathematics, Brown University(布朗大学应用数学系)

AI总结 本文提出深度学习方法,从散乱且可能含噪声的观测数据中发现非线性偏微分方程,通过两个深度神经网络近似未知解及非线性动力学,验证了该方法在多个科学领域基准问题中的有效性。

详情
AI中文摘要

在人工智能与应用数学的交汇处,长期存在的问题是设计出能够将观测数据转化为物理世界预测数学模型的算法。在数据丰富和机器学习能力先进的时代,自然的问题是:如何自动从高维实验数据中揭示物理定律?本文提出了一种深度学习方法,用于从空间和时间上散乱且可能含噪声的观测中发现非线性偏微分方程。具体而言,我们通过两个深度神经网络近似未知解及非线性动力学。第一个网络作为未知解的先验,本质上使我们能够避免本质上病态且不稳定的数值微分。第二个网络代表非线性动力学,帮助我们提炼支配给定时空数据集演化的机制。我们测试了该方法在多个科学领域基准问题中的有效性,并展示了所提框架如何帮助我们准确学习底层动力学并预测系统未来状态。特别是,我们研究了Burgers'、Korteweg-de Vries(KdV)、Kuramoto-Sivashinsky、非线性Schrödinger和Navier-Stokes方程。

英文摘要

A long-standing problem at the interface of artificial intelligence and applied mathematics is to devise an algorithm capable of achieving human level or even superhuman proficiency in transforming observed data into predictive mathematical models of the physical world. In the current era of abundance of data and advanced machine learning capabilities, the natural question arises: How can we automatically uncover the underlying laws of physics from high-dimensional data generated from experiments? In this work, we put forth a deep learning approach for discovering nonlinear partial differential equations from scattered and potentially noisy observations in space and time. Specifically, we approximate the unknown solution as well as the nonlinear dynamics by two deep neural networks. The first network acts as a prior on the unknown solution and essentially enables us to avoid numerical differentiations which are inherently ill-conditioned and unstable. The second network represents the nonlinear dynamics and helps us distill the mechanisms that govern the evolution of a given spatiotemporal data-set. We test the effectiveness of our approach for several benchmark problems spanning a number of scientific domains and demonstrate how the proposed framework can help us accurately learn the underlying dynamics and forecast future states of the system. In particular, we study the Burgers', Korteweg-de Vries (KdV), Kuramoto-Sivashinsky, nonlinear Schrödinger, and Navier-Stokes equations.

1801.05894 2026-06-04 math.HO cs.LG cs.NA math.NA stat.ML 版本更新

Deep Learning: An Introduction for Applied Mathematicians

深度学习:应用数学家的入门指南

Catherine F. Higham, Desmond J. Higham

发表机构 * School of Computing Science, University of Glasgow, UK(格拉斯哥大学计算机科学学院,英国) Department of Mathematics and Statistics, University of Strathclyde, UK(斯特拉斯克莱德大学数学与统计学系,英国)

AI总结 本文从应用数学角度介绍深度学习基本概念,面向数学专业研究生及本科生,通过MATLAB代码和图像分类实例展示神经网络原理与训练方法。

详情
AI中文摘要

多层人工神经网络正成为众多应用领域中的普遍工具。深度学习革命的核心概念来自应用数学和计算数学,包括微积分、逼近论、优化和线性代数。本文为应用数学家提供深度学习基础介绍。目标读者为数学专业研究生及大四本科生,也适用于希望在课堂中引入深度学习应用的数学教师。文章聚焦三个核心问题:什么是深度神经网络?如何训练网络?什么是随机梯度方法?通过MATLAB代码展示网络构建与训练,并展示在大规模图像分类问题中使用最新软件的应用。最后提供当前文献的参考文献。

英文摘要

Multilayered artificial neural networks are becoming a pervasive tool in a host of application fields. At the heart of this deep learning revolution are familiar concepts from applied and computational mathematics; notably, in calculus, approximation theory, optimization and linear algebra. This article provides a very brief introduction to the basic ideas that underlie deep learning from an applied mathematics perspective. Our target audience includes postgraduate and final year undergraduate students in mathematics who are keen to learn about the area. The article may also be useful for instructors in mathematics who wish to enliven their classes with references to the application of deep learning techniques. We focus on three fundamental questions: what is a deep neural network? how is a network trained? what is the stochastic gradient method? We illustrate the ideas with a short MATLAB code that sets up and trains a network. We also show the use of state-of-the-art software on a large-scale image classification problem. We finish with references to the current literature.

1801.04492 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

An Explicit Convergence Rate for Nesterov's Method from SDP

基于SDP的Nesterov方法显式收敛率

Sam Safavi, Bikash Joshi, Guilherme França, José Bento

AI总结 本文通过求解SDP,为Nesterov加速法提供了新的显式收敛率上界,优化了参数并改进了强凸函数的收敛分析。

详情
AI中文摘要

由Lessard等人(2014)引入的积分二次约束(Integral Quadratic Constraints, IQC)框架,将多种优化算法收敛率上界的计算归结为半定规划(SDP)。特别地,该技术已被应用于Nesterov加速法(NAM)。对于二次函数,该SDP已被显式求解,得到了NAM收敛率的新上界;对于任意强凸函数,数值结果表明IQC可以改进Nesterov(2004)的上界。遗憾的是,此前并未给出该SDP的显式解析解。本文给出了这样的解析解,得到NAM收敛率的一个新的通用显式上界,并进一步对其参数进行优化。据我们所知,这是目前强凸函数下NAM收敛率的最佳显式上界。

英文摘要

The framework of Integral Quadratic Constraints (IQC) introduced by Lessard et al. (2014) reduces the computation of upper bounds on the convergence rate of several optimization algorithms to semi-definite programming (SDP). In particular, this technique was applied to Nesterov's accelerated method (NAM). For quadratic functions, this SDP was explicitly solved leading to a new bound on the convergence rate of NAM, and for arbitrary strongly convex functions it was shown numerically that IQC can improve bounds from Nesterov (2004). Unfortunately, an explicit analytic solution to the SDP was not provided. In this paper, we provide such an analytical solution, obtaining a new general and explicit upper bound on the convergence rate of NAM, which we further optimize over its parameters. To the best of our knowledge, this is the best, and explicit, upper bound on the convergence rate of NAM for strongly convex functions.
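
For readers unfamiliar with the method whose rate is being bounded, here is a minimal sketch of Nesterov's accelerated method on a strongly convex quadratic, compared with plain gradient descent. It does not reproduce the paper's SDP bound. The diagonal Hessian `H`, the starting point, and the textbook parameter choice (step 1/L, momentum (sqrt(kappa) - 1)/(sqrt(kappa) + 1)) are assumptions made for illustration.

```python
import math

H = [1.0, 10.0, 100.0]           # eigenvalues of an illustrative diagonal Hessian
m, L = min(H), max(H)            # strong-convexity and smoothness constants

def f(x):
    return 0.5 * sum(h * xi * xi for h, xi in zip(H, x))

def grad(x):
    return [h * xi for h, xi in zip(H, x)]

def gradient_descent(x0, steps):
    x = list(x0)
    for _ in range(steps):
        x = [xi - gi / L for xi, gi in zip(x, grad(x))]
    return x

def nesterov(x0, steps):
    kappa = L / m
    beta = (math.sqrt(kappa) - 1) / (math.sqrt(kappa) + 1)
    x_prev, x = list(x0), list(x0)
    for _ in range(steps):
        # Extrapolate with momentum, then take a gradient step at the extrapolated point.
        y = [xi + beta * (xi - xp) for xi, xp in zip(x, x_prev)]
        x_prev, x = x, [yi - gi / L for yi, gi in zip(y, grad(y))]
    return x

x0 = [1.0, 1.0, 1.0]
f_gd = f(gradient_descent(x0, 200))
f_nam = f(nesterov(x0, 200))
print(f_gd, f_nam)               # the accelerated run ends at a far smaller value
```

With condition number kappa = 100, gradient descent contracts the slowest coordinate by 0.99 per step, whereas the accelerated iteration contracts it at roughly 0.9 per step.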

1706.09993 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT math.PR math.ST stat.TH 版本更新

Phase Retrieval via Randomized Kaczmarz: Theoretical Guarantees

通过随机Kaczmarz方法进行相位恢复:理论保证

Yan Shuo Tan, Roman Vershynin

发表机构 * Department of Mathematics, University of Michigan(密歇根大学数学系) Department of Mathematics, University of California, Irvine(加州大学尔湾分校数学系)

AI总结 本文提出随机Kaczmarz方法在相位恢复中的理论保障,证明仅需与维度成正比的高斯测量即可保证收敛,引入了测量集的充分条件,并利用VC维和度量熵证明高斯采样向量满足该条件。

Comments Revised after comments from referees

详情
AI中文摘要

我们考虑相位恢复问题,即求解二次方程组的问题。最近提出了一种随机Kaczmarz方法的简单变种,并在数值上显示出比最先进的Wirtinger流方法更高效。在本文中,我们为相位恢复中的随机Kaczmarz方法提供了首次理论保证。我们证明仅需与维度成正比的高斯测量即可保证收敛。在此过程中,我们引入了一个关于测量集的充分条件,以保证随机Kaczmarz方法能够正常工作。我们证明高斯采样向量以高概率满足该性质;这通过链式论证结合VC维和度量熵的界来证明。

英文摘要

We consider the problem of phase retrieval, i.e. that of solving systems of quadratic equations. A simple variant of the randomized Kaczmarz method was recently proposed for phase retrieval, and it was shown numerically to have a computational edge over state-of-the-art Wirtinger flow methods. In this paper, we provide the first theoretical guarantee for the convergence of the randomized Kaczmarz method for phase retrieval. We show that it is sufficient to have as many Gaussian measurements as the dimension, up to a constant factor. Along the way, we introduce a sufficient condition on measurement sets for which the randomized Kaczmarz method is guaranteed to work. We show that Gaussian sampling vectors satisfy this property with high probability; this is proved using a chaining argument coupled with bounds on VC dimension and metric entropy.
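
A minimal real-valued sketch of the randomized Kaczmarz iteration for phase retrieval may help make the setting concrete. Each step picks a random measurement and projects the iterate onto the nearer of the two hyperplanes consistent with the observed magnitude. The problem sizes, the seed, and the initialization close to the true signal are assumptions for illustration only; the abstract does not describe an initialization.

```python
import math
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def randomized_kaczmarz_pr(A, b, x0, iters, rng):
    """Recover x (up to sign) from magnitudes b_i = |<a_i, x>|."""
    x = list(x0)
    for _ in range(iters):
        i = rng.randrange(len(A))
        a = A[i]
        inner = dot(a, x)
        sign = 1.0 if inner >= 0 else -1.0
        # Project onto the nearer hyperplane <a, x> = sign * b_i.
        step = (inner - sign * b[i]) / dot(a, a)
        x = [xj - step * aj for xj, aj in zip(x, a)]
    return x

def dist_up_to_sign(x, y):
    d_plus = math.sqrt(sum((p - q) ** 2 for p, q in zip(x, y)))
    d_minus = math.sqrt(sum((p + q) ** 2 for p, q in zip(x, y)))
    return min(d_plus, d_minus)

rng = random.Random(0)
n, m = 5, 50                                  # m is a constant multiple of n
x_true = [rng.gauss(0, 1) for _ in range(n)]
A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]   # Gaussian measurements
b = [abs(dot(a, x_true)) for a in A]          # phaseless (sign-less) observations
x0 = [xj + 0.05 * rng.gauss(0, 1) for xj in x_true]           # assumed warm start
x_hat = randomized_kaczmarz_pr(A, b, x0, 2000, rng)
err0 = dist_up_to_sign(x0, x_true)
err = dist_up_to_sign(x_hat, x_true)
print(err0, err)
```

The error is measured up to a global sign because x and -x produce identical magnitudes.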

1402.6964 2026-06-04 cs.LG cs.DC cs.NA math.NA stat.ML 版本更新

Scalable methods for nonnegative matrix factorizations of near-separable tall-and-skinny matrices

可扩展的非负矩阵分解方法用于近可分离的高瘦矩阵

Austin R. Benson, Jason D. Lee, Bartek Rajwa, David F. Gleich

发表机构 * Stanford University(斯坦福大学) Purdue University(普渡大学) Purdue University Institute for Computational and Bindley Biosciences Center(普渡大学计算与Bindley生物科学中心) Computer Science Mathematical Engineering(计算机科学数学工程)

AI总结 本文提出高效算法处理高瘦矩阵的非负矩阵分解,通过正交变换保持分离性,适用于流式数据和分布式计算环境。

详情
Journal ref
Proceedings of Neural Information Processing Systems, 2014
AI中文摘要

许多算法在假设矩阵近可分离的情况下用于非负矩阵分解。本文展示如何使这些算法高效处理行数远多于列数的高瘦矩阵。改进方法的关键是保持NMF问题分离性的正交矩阵变换。最终方法只需单次遍历数据矩阵,适用于流式、多核和MapReduce架构。我们在TB级合成矩阵和科学计算、生物信息学的真实数据上验证了算法的有效性。

英文摘要

Numerous algorithms are used for nonnegative matrix factorization under the assumption that the matrix is nearly separable. In this paper, we show how to make these algorithms efficient for data matrices that have many more rows than columns, so-called "tall-and-skinny matrices". One key component to these improved methods is an orthogonal matrix transformation that preserves the separability of the NMF problem. Our final methods need a single pass over the data matrix and are suitable for streaming, multi-core, and MapReduce architectures. We demonstrate the efficacy of these algorithms on terabyte-sized synthetic matrices and real-world matrices from scientific computing and bioinformatics.

1710.09668 2026-06-04 math.NA cs.LG cs.NA cs.NE stat.ML 版本更新

PDE-Net: Learning PDEs from Data

PDE-Net:从数据中学习偏微分方程

Zichao Long, Yiping Lu, Xianzhong Ma, Bin Dong

发表机构 * School of Mathematical Sciences(数学科学学院) School of Mathematical Sciences, Peking University(北京大学数学科学学院) Peking University, Beijing, China(中国北京,北京大学) Beijing Computational Science Research Center(北京计算科学研究中心) Beijing International Center for Mathematical Research, Peking University(北京大学北京国际数学研究中心) Center for Data Science, Peking University(北京大学数据科学中心) Beijing Institute of Big Data Research(北京大数据研究院)

AI总结 本文提出PDE-Net,通过学习卷积核来获取微分算子,同时近似未知非线性响应,灵活地揭示复杂系统的动力学和隐藏的PDE模型。

Comments 15 pages, 13 figures

详情
AI中文摘要

本文提出了一种新的前馈深度网络PDE-Net,旨在同时准确预测复杂系统的动力学并揭示隐藏的PDE模型。PDE-Net通过学习卷积核来获取微分算子,并利用神经网络或其他机器学习方法近似未知的非线性响应。与现有方法相比,我们的方法同时学习微分算子和非线性响应,因而具有最大的灵活性。PDE-Net的特殊之处在于所有滤波器都受到适当约束,这使我们能够轻松识别控制方程(governing PDE)模型,同时保持网络的表达力和预测能力。这些约束通过充分利用微分算子的阶数与滤波器和规则(sum rules)的阶数之间的关系(源自小波理论的重要概念)精心设计。我们还讨论了PDE-Net与计算机视觉中的一些现有网络如Network-In-Network (NIN) 和 Residual Neural Network (ResNet) 的关系。数值实验表明,PDE-Net有潜力揭示观测动态背后隐藏的PDE,并且即使在噪声环境中也能预测相对较长时间内的动态行为。

英文摘要

In this paper, we present an initial attempt to learn evolution PDEs from data. Inspired by the latest development of neural network designs in deep learning, we propose a new feed-forward deep network, called PDE-Net, to fulfill two objectives at the same time: to accurately predict dynamics of complex systems and to uncover the underlying hidden PDE models. The basic idea of the proposed PDE-Net is to learn differential operators by learning convolution kernels (filters), and apply neural networks or other machine learning methods to approximate the unknown nonlinear responses. Compared with existing approaches, which either assume the form of the nonlinear response is known or fix certain finite difference approximations of differential operators, our approach has the most flexibility by learning both differential operators and the nonlinear responses. A special feature of the proposed PDE-Net is that all filters are properly constrained, which enables us to easily identify the governing PDE models while still maintaining the expressive and predictive power of the network. These constraints are carefully designed by fully exploiting the relation between the orders of differential operators and the orders of sum rules of filters (an important concept originating from wavelet theory). We also discuss relations of the PDE-Net with some existing networks in computer vision such as Network-In-Network (NIN) and Residual Neural Network (ResNet). Numerical experiments show that the PDE-Net has the potential to uncover the hidden PDE of the observed dynamics, and predict the dynamical behavior for a relatively long time, even in a noisy environment.

1712.09999 2026-06-04 math.NA cs.LG cs.NA 版本更新

Parallel Active Subspace Decomposition for Scalable and Efficient Tensor Robust Principal Component Analysis

并行主动子空间分解用于可扩展和高效的张量鲁棒主成分分析

Jonathan Q. Jiang, Michael K. Ng

发表机构 * Department of Mathematics, Hong Kong Baptist University(香港浸会大学数学系)

AI总结 本文提出并行主动子空间分解方法,通过将张量展开的每个模式分解为正交矩阵和小矩阵,降低核范数最小化问题的规模,提升张量鲁棒主成分分析的效率和精度。

Comments 19 pages, 2 figures, 2 tables

详情
AI中文摘要

张量鲁棒主成分分析(TRPCA)在多个领域受到广泛关注。现有方法通常依赖张量核范数最小化,但每次迭代都需要多个奇异值分解(SVD),导致计算成本高昂。为克服这一缺点,我们提出了一种可扩展且高效的方法,称为并行主动子空间分解(PASD),该方法将张量展开的每个模式分解为列正交矩阵(主动子空间)和另一个小矩阵。这种变换导致了一个非凸优化问题,其中核范数最小化的规模通常比原始问题小得多。此外,我们引入交替方向乘子法(ADMM)来解决改写的问题,并提供其收敛性和次优性的严格分析。在合成和真实数据上的实验结果表明,我们的算法比最先进的方法更准确,并且快了多个数量级。

英文摘要

Tensor robust principal component analysis (TRPCA) has received a substantial amount of attention in various fields. Most existing methods, normally relying on tensor nuclear norm minimization, incur an expensive computational cost due to multiple singular value decompositions (SVDs) at each iteration. To overcome this drawback, we propose a scalable and efficient method, named Parallel Active Subspace Decomposition (PASD), which divides the unfolding along each mode of the tensor into a columnwise orthonormal matrix (active subspace) and another small-size matrix in parallel. Such a transformation leads to a nonconvex optimization problem in which the scale of nuclear norm minimization is generally much smaller than that in the original problem. Furthermore, we introduce an alternating direction method of multipliers (ADMM) to solve the reformulated problem and provide rigorous analyses for its convergence and suboptimality. Experimental results on synthetic and real-world data show that our algorithm is more accurate than the state-of-the-art approaches, and is orders of magnitude faster.

1707.07342 2026-06-04 eess.SY cs.LG cs.SY 版本更新

An Online Learning Approach to Buying and Selling Demand Response

面向购买和销售需求响应的在线学习方法

Kia Khezeli, Eilyan Bitar

发表机构 * School of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853, USA.(电气与计算机工程系,康奈尔大学,纽约州伊萨卡市,14853,美国)

AI总结 本文提出一种在线学习方法,用于协调聚合商在固定居民客户群中购买需求削减,并在双结算批发市场中销售总体需求削减。研究通过动态定价和合同策略,在未知需求模型下最大化预期利润。

详情
AI中文摘要

我们采用聚合商的视角:聚合商力求协调其从固定居民电力客户群购买需求削减,与其在双结算批发能源市场中出售总体需求削减。聚合商通过向客户提供统一价格来获取需求削减,该价格针对客户相对于其预定基准的用电削减量。在总体需求削减实现之前,聚合商还必须确定向双结算能源市场出售多少能源。在日前市场中,聚合商承诺一份远期合同,要求在实时市场交付能源。将总体需求削减与聚合商所报价格相关联的基础总需求曲线被假设为仿射的,并受不可观测的随机冲击影响。假设聚合商最初既不知道需求曲线的参数,也不知道随机冲击的分布,我们研究聚合商在T天的时间窗口内能在多大程度上动态调整其报价和远期合同,以最大化预期利润。具体而言,我们设计了一种动态定价和合同提供策略,以权衡聚合商学习未知需求模型的需要与其最大化累积预期利润的愿望。特别地,所提出的定价策略被证明在T天内产生的遗憾不超过O(log(T)√T)。

英文摘要

We adopt the perspective of an aggregator, which seeks to coordinate its purchase of demand reductions from a fixed group of residential electricity customers, with its sale of the aggregate demand reduction in a two-settlement wholesale energy market. The aggregator procures reductions in demand by offering its customers a uniform price for reductions in consumption relative to their predetermined baselines. Prior to its realization of the aggregate demand reduction, the aggregator must also determine how much energy to sell into the two-settlement energy market. In the day-ahead market, the aggregator commits to a forward contract, which calls for the delivery of energy in the real-time market. The underlying aggregate demand curve, which relates the aggregate demand reduction to the aggregator's offered price, is assumed to be affine and subject to unobservable, random shocks. Assuming that both the parameters of the demand curve and the distribution of the random shocks are initially unknown to the aggregator, we investigate the extent to which the aggregator might dynamically adapt its offered prices and forward contracts to maximize its expected profit over a time window of $T$ days. Specifically, we design a dynamic pricing and contract offering policy that resolves the aggregator's need to learn the unknown demand model with its desire to maximize its cumulative expected profit over time. In particular, the proposed pricing policy is proven to incur a regret over $T$ days that is no greater than $O(\log(T)\sqrt{T})$.

1710.10737 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Linearly convergent stochastic heavy ball method for minimizing generalization error

用于最小化泛化误差的线性收敛随机重球方法

Nicolas Loizou, Peter Richtárik

发表机构 * University of Edinburgh, United Kingdom(爱丁堡大学,英国) KAUST, Kingdom of Saudi Arabia(KAUST,沙特阿拉伯王国)

AI总结 本文首次证明了随机重球方法的线性收敛性,通过固定步长的SGD步骤结合重球动量项,专注于最小化期望损失而非有限和最小化。

Comments NIPS 2017, Workshop on Optimization for Machine Learning (camera ready version)

详情
AI中文摘要

本文首次建立了随机重球方法的线性收敛性结果。该方法通过固定步长的SGD步骤结合重球动量项进行优化。在分析中,我们专注于最小化期望损失,而非通常更困难的有限和最小化问题。尽管分析中我们限制在二次损失下,但总体目标不一定是强凸的。

英文摘要

In this work we establish the first linear convergence result for the stochastic heavy ball method. The method performs SGD steps with a fixed stepsize, amended by a heavy ball momentum term. In the analysis, we focus on minimizing the expected loss and not on finite-sum minimization, which is typically a much harder problem. While in the analysis we constrain ourselves to quadratic loss, the overall objective is not necessarily strongly convex.
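
The update described in the abstract, an SGD step with a fixed stepsize plus a heavy ball momentum term, can be sketched on a quadratic loss as follows. The concrete loss (a row-normalized least-squares term for a consistent linear system), the stepsize 1.0, the momentum 0.3, and all names are hypothetical choices for illustration and are not taken from the paper.

```python
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def stochastic_heavy_ball(A, b, steps, stepsize, momentum, rng):
    """x_{k+1} = x_k - stepsize * grad f_i(x_k) + momentum * (x_k - x_{k-1}),
    with f_i(x) = (<a_i, x> - b_i)^2 / (2 * ||a_i||^2) and i sampled uniformly."""
    n = len(A[0])
    x_prev, x = [0.0] * n, [0.0] * n
    for _ in range(steps):
        i = rng.randrange(len(A))
        a = A[i]
        scale = stepsize * (dot(a, x) - b[i]) / dot(a, a)   # stochastic gradient coefficient
        x_next = [xj - scale * aj + momentum * (xj - xpj)
                  for xj, aj, xpj in zip(x, a, x_prev)]
        x_prev, x = x, x_next
    return x

rng = random.Random(1)
n, m = 3, 20
x_true = [rng.gauss(0, 1) for _ in range(n)]
A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
b = [dot(a, x_true) for a in A]               # consistent system: zero loss at x_true
x_hat = stochastic_heavy_ball(A, b, 3000, 1.0, 0.3, rng)
max_err = max(abs(p - q) for p, q in zip(x_hat, x_true))
print(max_err)
```

With momentum set to 0 and stepsize 1, this loop reduces to the randomized Kaczmarz method for linear systems, which makes it a convenient baseline for experimenting with the momentum term.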

1610.06781 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY 版本更新

Modular Deep Q Networks for Sim-to-real Transfer of Visuo-motor Policies

模块化深度Q网络用于视觉-运动策略的仿真到现实迁移

Fangyi Zhang, Jürgen Leitner, Michael Milford, Peter Corke

发表机构 * Australian Centre for Robotic Vision (ACRV)(澳大利亚机器人视觉中心) Queensland University of Technology (QUT)(昆士兰理工大学)

AI总结 本文提出模块化深度强化学习方法,通过在感知与控制之间引入瓶颈,实现仿真到现实的迁移,提升机器人视觉-运动协调能力。

Comments Australasian Conference on Robotics and Automation (ACRA) 2017, Student Paper Award Finalist

详情
Journal ref
The proceedings of the Australasian Conference on Robotics and Automation (ACRA) 2017
AI中文摘要

尽管深度学习在计算机视觉中因大量视觉数据而取得显著成功,但为机器人学习收集足够大的现实世界数据集成本较高。为提高这些技术在真实机器人上的实用性,我们提出了一种模块化深度强化学习方法,能够将仿真训练的模型迁移到现实世界机器人任务中。我们引入了感知与控制之间的瓶颈,使网络能够独立训练,然后以端到端方式合并和微调,以进一步提高手眼协调性。在经典的平面视觉引导机器人到达(reaching)任务中,微调后的准确度达到1.6像素,显著优于直接迁移(17.5像素),显示出在更复杂和更广泛的应用中的潜力。我们的方法提供了一种更高效学习和迁移视觉-运动策略的技术,无需完全依赖大规模现实世界机器人数据集。

英文摘要

While deep learning has had significant successes in computer vision thanks to the abundance of visual data, collecting sufficiently large real-world datasets for robot learning can be costly. To increase the practicality of these techniques on real robots, we propose a modular deep reinforcement learning method capable of transferring models trained in simulation to a real-world robotic task. We introduce a bottleneck between perception and control, enabling the networks to be trained independently, but then merged and fine-tuned in an end-to-end manner to further improve hand-eye coordination. On a canonical, planar visually-guided robot reaching task a fine-tuned accuracy of 1.6 pixels is achieved, a significant improvement over naive transfer (17.5 pixels), showing the potential for more complicated and broader applications. Our method provides a technique for more efficient learning and transfer of visuo-motor policies for real robotic systems without relying entirely on large real-world robot datasets.

1712.06577 2026-06-04 cs.LG cs.AI cs.NA math.NA 版本更新

Parallel Complexity of Forward and Backward Propagation

前向和反向传播的并行复杂度

Maxim Naumov

发表机构 * NVIDIA

AI总结 研究前向和反向传播作为三角方程组解的并行计算复杂度,提出直接和迭代并行算法,并展示FNN和RNN的反向传播可并行处理。

Comments 18 pages

详情
AI中文摘要

我们证明前向和反向传播可以表示为下三角和上三角方程组的解。对于标准前馈网络和循环神经网络,三角方程组总是块双对角线,而对于一般计算图,它们可能具有更复杂的三角稀疏模式。我们讨论了可以直接和迭代并行求解的算法,并将其解释为不同的模型并行方法。此外,我们展示了对于具有k层和τ时间步的FNN和RNN,反向传播可以在O(log k)和O(log k log τ)步内并行执行。最后,我们概述了使用雅可比矩阵扩展此技术的可能性,以处理任意层。

英文摘要

We show that the forward and backward propagation can be formulated as a solution of lower and upper triangular systems of equations. For standard feedforward (FNNs) and recurrent neural networks (RNNs) the triangular systems are always block bi-diagonal, while for a general computation graph (directed acyclic graph) they can have a more complex triangular sparsity pattern. We discuss direct and iterative parallel algorithms that can be used for their solution and interpreted as different ways of performing model parallelism. Also, we show that for FNNs and RNNs with $k$ layers and $τ$ time steps the backward propagation can be performed in parallel in O($\log k$) and O($\log k \log τ$) steps, respectively. Finally, we outline the generalization of this technique using Jacobians that potentially allows us to handle arbitrary layers.
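
The triangular-system view of backward propagation can be sketched on a deliberately tiny case: a chain of scalar tanh layers, for which the system is upper bi-diagonal rather than block bi-diagonal. The network, the loss E = 0.5 * (a_k - target)^2, and all function names are assumptions made for illustration; the sketch solves the system by ordinary sequential back substitution and does not implement the parallel algorithms discussed in the paper.

```python
import math

def forward(weights, x):
    """Scalar chain: a_0 = x, a_l = tanh(w_l * a_{l-1})."""
    acts = [x]
    for w in weights:
        acts.append(math.tanh(w * acts[-1]))
    return acts

def loss(weights, x, target):
    return 0.5 * (forward(weights, x)[-1] - target) ** 2

def backward_system(weights, acts, target):
    """Build U and rhs with U @ delta = rhs, where delta[l] = dE/dz for layer l + 1."""
    k = len(weights)
    d = [1.0 - a * a for a in acts[1:]]        # tanh'(z) at each layer
    U = [[0.0] * k for _ in range(k)]
    rhs = [0.0] * k
    for l in range(k):
        U[l][l] = 1.0
        if l + 1 < k:
            U[l][l + 1] = -d[l] * weights[l + 1]   # couples a layer to the next one
    rhs[k - 1] = d[k - 1] * (acts[k] - target)     # loss gradient enters at the last layer
    return U, rhs

def back_substitution(U, rhs):
    k = len(rhs)
    sol = [0.0] * k
    for i in reversed(range(k)):
        s = sum(U[i][j] * sol[j] for j in range(i + 1, k))
        sol[i] = (rhs[i] - s) / U[i][i]
    return sol

weights, x, target = [0.5, -1.2, 0.8], 0.7, 0.3
acts = forward(weights, x)
U, rhs = backward_system(weights, acts, target)
delta = back_substitution(U, rhs)
grad_w1 = delta[0] * acts[0]                   # dE/dw_1 from the triangular solve
print(grad_w1)
```

Comparing `grad_w1` against a finite-difference estimate of `loss` with respect to the first weight is a quick way to check that the triangular solve reproduces the usual backpropagation recursion.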

1609.01387 2026-06-04 eess.SY cs.LG cs.SY math.OC version update

Learning Model Predictive Control for iterative tasks. A Data-Driven Control Framework

Ugo Rosolia, Francesco Borrelli

AI summary: Proposes a reference-free model predictive controller that improves its performance by learning from previous iterations. A safe set and a terminal cost function, built recursively from state and input trajectories, guarantee recursive feasibility, and simulations show the effectiveness of the control logic.

Abstract

A Learning Model Predictive Controller (LMPC) for iterative tasks is presented. The controller is reference-free and is able to improve its performance by learning from previous iterations. A safe set and a terminal cost function are used in order to guarantee recursive feasibility and non-increasing performance at each iteration. The paper presents the control design approach, and shows how to recursively construct terminal set and terminal cost from state and input trajectories of previous iterations. Simulation results show the effectiveness of the proposed control logic.

1712.04612 2026-06-04 q-fin.CP cs.AI cs.CE cs.LG cs.SY eess.SY version update

Inverse Reinforcement Learning for Marketing

Igor Halperin

Affiliations: NYU Tandon School of Engineering

AI summary: Proposes inverse reinforcement learning for studying dynamic consumer demand, builds a tractable model with a maximum-entropy approach, and shows that observational noise can be mistaken for consumer heterogeneity.

Comments: 18 pages, 5 figures

Abstract

Learning customer preferences from observed behaviour is an important topic in the marketing literature. Structural models typically model forward-looking customers or firms as utility-maximizing agents whose utility is estimated using methods of Stochastic Optimal Control. We suggest an alternative approach to study dynamic consumer demand, based on Inverse Reinforcement Learning (IRL). We develop a version of the Maximum Entropy IRL that leads to a highly tractable model formulation that amounts to low-dimensional convex optimization in the search for optimal model parameters. Using simulations of consumer demand, we show that observational noise for identical customers can be easily confused with apparent consumer heterogeneity.

1708.01413 2026-06-04 cs.LG cs.DC cs.NA math.NA version update

Distributed Solution of Large-Scale Linear Systems via Accelerated Projection-Based Consensus

Navid Azizan-Ruhi, Farshad Lahouti, Salman Avestimehr, Babak Hassibi

Affiliations: California Institute of Technology; University of Southern California

AI summary: Proposes an accelerated distributed consensus algorithm for solving large-scale linear systems. Each machine updates its solution at every iteration and the taskmaster averages the solutions with momentum; the method compares favorably with distributed gradient descent, Nesterov's accelerated gradient descent, the heavy-ball method, and ADMM.

Abstract

Solving a large-scale system of linear equations is a key step at the heart of many algorithms in machine learning, scientific computing, and beyond. When the problem dimension is large, computational and/or memory constraints make it desirable, or even necessary, to perform the task in a distributed fashion. In this paper, we consider a common scenario in which a taskmaster intends to solve a large-scale system of linear equations by distributing subsets of the equations among a number of computing machines/cores. We propose an accelerated distributed consensus algorithm, in which at each iteration every machine updates its solution by adding a scaled version of the projection of an error signal onto the nullspace of its system of equations, and where the taskmaster conducts an averaging over the solutions with momentum. The convergence behavior of the proposed algorithm is analyzed in detail and analytically shown to compare favorably with the convergence rate of alternative distributed methods, namely distributed gradient descent, distributed versions of Nesterov's accelerated gradient descent and heavy-ball method, the block Cimmino method, and ADMM. On randomly chosen linear systems, as well as on real-world data sets, the proposed method offers significant speed-up relative to all the aforementioned methods. Finally, our analysis suggests a novel variation of the distributed heavy-ball method, which employs a particular distributed preconditioning, and which achieves the same theoretical convergence rate as the proposed consensus-based method.
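The projection-and-averaging idea can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's tuned algorithm: the update form and the parameter names `gamma` (local step) and `eta` (averaging weight) are assumptions made here, and with both set to 1 the loop reduces to plain averaged projections rather than the accelerated scheme.

```python
import numpy as np

def projection_consensus(A_blocks, b_blocks, gamma=1.0, eta=1.0, iters=3000):
    """Solve the stacked system A x = b, with one block of equations per machine."""
    n = A_blocks[0].shape[1]
    pinvs = [np.linalg.pinv(A) for A in A_blocks]
    # Each machine starts from a solution of its own equations (the minimum-norm one).
    x = [P @ b for P, b in zip(pinvs, b_blocks)]
    # Orthogonal projector onto the nullspace of each block: I - pinv(A_i) A_i.
    N = [np.eye(n) - P @ A for P, A in zip(pinvs, A_blocks)]
    xbar = np.mean(x, axis=0)                      # taskmaster's current estimate
    for _ in range(iters):
        # Local step: move toward the global estimate along the block's nullspace,
        # so every x_i keeps satisfying its own equations.
        x = [xi + gamma * Ni @ (xbar - xi) for xi, Ni in zip(x, N)]
        # Taskmaster step: average the local solutions, optionally with memory.
        xbar = eta * np.mean(x, axis=0) + (1.0 - eta) * xbar
    return xbar

# Small usage example on a well-conditioned 6x6 system split across three machines.
rng = np.random.default_rng(0)
A = 10.0 * np.eye(6) + rng.standard_normal((6, 6))
x_true = rng.standard_normal(6)
b = A @ x_true
x_hat = projection_consensus([A[0:2], A[2:4], A[4:6]], [b[0:2], b[2:4], b[4:6]])
print(np.allclose(x_hat, x_true, atol=1e-8))
```

The example system is deliberately well conditioned so that the unaccelerated iteration converges quickly; on ill-conditioned systems the choice of the two step parameters is what the abstract's convergence analysis is about.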

1710.01719 2026-06-04 eess.SY cs.LG cs.SY math.DS math.OC version update

Decomposition of Nonlinear Dynamical Systems Using Koopman Gramians

Zhiyuan Liu, Soumya Kundu, Lijun Chen, Enoch Yeung

Affiliations: Pacific Northwest National Laboratory

AI summary: Proposes a new Koopman operator approach for decomposing nonlinear dynamical systems using Koopman Gramians, introduces the input-Koopman operator, and shows how it casts a nonlinear system into classical state-space form, together with conditions under which input and state observable functions are well separated.

Comments: 8 pages, submitted to IEEE 2018 ACC

Abstract

In this paper we propose a new Koopman operator approach to the decomposition of nonlinear dynamical systems using Koopman Gramians. We introduce the notion of an input-Koopman operator, and show how input-Koopman operators can be used to cast a nonlinear system into the classical state-space form, and identify conditions under which input and state observable functions are well separated. We then extend an existing method of dynamic mode decomposition for learning Koopman operators from data known as deep dynamic mode decomposition to systems with controls or disturbances. We illustrate the accuracy of the method in learning an input-state separable Koopman operator for an example system, even when the underlying system exhibits mixed state-input terms. We next introduce a nonlinear decomposition algorithm, based on Koopman Gramians, that maximizes internal subsystem observability and disturbance rejection from unwanted noise from other subsystems. We derive a relaxation based on Koopman Gramians and multi-way partitioning for the resulting NP-hard decomposition problem. We lastly illustrate the proposed algorithm with the swing dynamics for an IEEE 39-bus system.

1306.4080 2026-06-04 cs.LG cs.NA math.NA version update

Parallel Coordinate Descent Newton Method for Efficient $\ell_1$-Regularized Minimization

An Bian, Xiong Li, Yuncai Liu, Ming-Hsuan Yang

Affiliations: ETH Zurich; CNCERT; Shanghai Jiao Tong University; University of California, Merced

AI summary: Proposes the Parallel Coordinate Descent Newton algorithm (PCDN), which sets the off-diagonal elements of the Hessian to zero to enable parallelization, addressing both parallelism and convergence in large-scale optimization.

Abstract

The recent years have witnessed advances in parallel algorithms for large scale optimization problems. Notwithstanding demonstrated success, existing algorithms that parallelize over features are usually limited by divergence issues under high parallelism or require data preprocessing to alleviate these problems. In this work, we propose a Parallel Coordinate Descent Newton algorithm using multidimensional approximate Newton steps (PCDN), where the off-diagonal elements of the Hessian are set to zero to enable parallelization. It randomly partitions the feature set into $b$ bundles/subsets with size of $P$, and sequentially processes each bundle by first computing the descent directions for each feature in parallel and then conducting $P$-dimensional line search to obtain the step size. We show that: (1) PCDN is guaranteed to converge globally despite increasing parallelism; (2) PCDN converges to the specified accuracy $ε$ within the limited iteration number of $T_ε$, and $T_ε$ decreases with increasing parallelism (bundle size $P$). Using the implementation technique of maintaining intermediate quantities, we minimize the data transfer and synchronization cost of the $P$-dimensional line search. For concreteness, the proposed PCDN algorithm is applied to $\ell_1$-regularized logistic regression and $\ell_2$-loss SVM. Experimental evaluations on six benchmark datasets show that the proposed PCDN algorithm exploits parallelism well and outperforms the state-of-the-art methods in speed without losing accuracy.
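The bundle-wise scheme described above can be sketched for $\ell_1$-regularized logistic regression. This is a rough illustration under assumptions, not the authors' implementation: the per-feature direction uses the standard closed-form one-dimensional Newton step for an absolute-value term, the line search is a simple Armijo backtracking rule, vectorized NumPy stands in for real parallelism, and the names `pcdn_pass`, `sigma`, `beta`, and `max_backtracks` are hypothetical.

```python
import numpy as np

def objective(w, X, y, C):
    # ||w||_1 + C * sum_i log(1 + exp(-y_i * x_i^T w)), with labels y_i in {-1, +1}
    return np.abs(w).sum() + C * np.logaddexp(0.0, -y * (X @ w)).sum()

def pcdn_pass(w, X, y, C, P, rng, sigma=0.01, beta=0.5, max_backtracks=30):
    """One pass over all features, processed in random bundles of size at most P."""
    perm = rng.permutation(X.shape[1])
    for start in range(0, len(perm), P):
        idx = perm[start:start + P]
        s = 1.0 / (1.0 + np.exp(y * (X @ w)))                   # sigmoid(-y * Xw)
        g = -C * (X[:, idx].T @ (y * s))                        # loss gradient for the bundle
        h = C * ((X[:, idx] ** 2).T @ (s * (1.0 - s))) + 1e-12  # diagonal Hessian entries only
        wj = w[idx]
        # Per-feature Newton direction; each entry is independent of the others,
        # which is what makes this step parallelizable.
        d = np.where(g + 1.0 <= h * wj, -(g + 1.0) / h,
                     np.where(g - 1.0 >= h * wj, -(g - 1.0) / h, -wj))
        delta = g @ d + np.abs(wj + d).sum() - np.abs(wj).sum()  # predicted decrease (<= 0)
        f0 = objective(w, X, y, C)
        alpha = 1.0
        for _ in range(max_backtracks):                         # one P-dimensional line search
            trial = w.copy()
            trial[idx] = wj + alpha * d
            if objective(trial, X, y, C) - f0 <= sigma * alpha * delta:
                w = trial                                       # accept the step for the whole bundle
                break
            alpha *= beta
    return w
```

A typical use would call `pcdn_pass` repeatedly from `w = np.zeros(n_features)` until the objective stops decreasing. Because a step is accepted only when the Armijo condition holds, the objective never increases from one bundle to the next in this sketch.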

1712.00634 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY math.OC version update

PFAx: Predictable Feature Analysis to Perform Control

Stefan Richthofer, Laurenz Wiskott

AI summary: PFAx improves prediction by incorporating supplementary information, transparently shows how that information affects feature selection, and is applied to controlling an agent in a setting inspired by reinforcement learning.

Abstract

Predictable Feature Analysis (PFA) (Richthofer, Wiskott, ICMLA 2015) is an algorithm that performs dimensionality reduction on a high-dimensional input signal. It extracts those subsignals that are most predictable according to a certain prediction model. We refer to these extracted signals as predictable features. In this work we extend the notion of PFA to take supplementary information into account for improving its predictions. Such information can be a multidimensional signal like the main input to PFA, but is regarded as external. That means it does not participate in the feature extraction: no features are extracted from it or composed of it. Features are extracted exclusively from the main input such that they are most predictable based on themselves and the supplementary information. We refer to this enhanced PFA as PFAx (PFA extended). Even more important than improving prediction quality is to observe the effect of supplementary information on feature selection. PFAx transparently provides insight into how the supplementary information adds to prediction quality and whether it is valuable at all. Finally, we show how to invert that relation and generate the supplementary information such that it would yield a certain desired outcome of the main signal. We apply this to a setting inspired by reinforcement learning and let the algorithm learn how to control an agent in an environment. With this method it is feasible to locally optimize the agent's state, i.e. reach a certain goal that is near enough. We are preparing a follow-up paper that extends this method so that global optimization is also feasible.

1711.02213 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Flexpoint: An Adaptive Numerical Format for Efficient Training of Deep Neural Networks

Urs Köster, Tristan J. Webb, Xin Wang, Marcel Nassar, Arjun K. Bansal, William H. Constable, Oğuz H. Elibol, Scott Gray, Stewart Hall, Luke Hornof, Amir Khosrowshahi, Carey Kloss, Ruby J. Pai, Naveen Rao

Affiliations: Artificial Intelligence Products Group, Intel Corporation

AI summary: Flexpoint is an adaptive numerical format for efficient training of deep neural networks. It dynamically adjusts a shared exponent to minimize overflows and maximize dynamic range, and experiments training AlexNet, a deep residual network, and a generative adversarial network show that it closely matches 32-bit floating point.

Comments: 14 pages, 5 figures, accepted in Neural Information Processing Systems 2017

Abstract

Deep neural networks are commonly developed and trained in 32-bit floating point format. Significant gains in performance and energy efficiency could be realized by training and inference in numerical formats optimized for deep learning. Despite advances in limited precision inference in recent years, training of neural networks in low bit-width remains a challenging problem. Here we present the Flexpoint data format, aiming at a complete replacement of 32-bit floating point format training and inference, designed to support modern deep network topologies without modifications. Flexpoint tensors have a shared exponent that is dynamically adjusted to minimize overflows and maximize available dynamic range. We validate Flexpoint by training AlexNet, a deep residual network and a generative adversarial network, using a simulator implemented with the neon deep learning framework. We demonstrate that 16-bit Flexpoint closely matches 32-bit floating point in training all three models, without any need for tuning of model hyperparameters. Our results suggest Flexpoint as a promising numerical format for future hardware for training and inference.
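The shared-exponent idea can be sketched in a few lines. This is an illustration under assumptions, not the Flexpoint specification: the exponent is picked statically from the current maximum magnitude, whereas the abstract describes an exponent that is adjusted dynamically during training, and the function names `to_flex` and `from_flex` are hypothetical.

```python
import numpy as np

MAX_INT16 = 2 ** 15 - 1   # largest signed 16-bit mantissa

def to_flex(x):
    """Encode a float tensor as int16 mantissas plus one exponent shared by all entries."""
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        exponent = 0
    else:
        # Smallest exponent e such that max_abs / 2**e still fits in the mantissa range.
        exponent = int(np.ceil(np.log2(max_abs / MAX_INT16)))
    scaled = np.rint(x / 2.0 ** exponent)
    mantissa = np.clip(scaled, -MAX_INT16 - 1, MAX_INT16).astype(np.int16)
    return mantissa, exponent

def from_flex(mantissa, exponent):
    """Decode back to floating point: value = mantissa * 2**exponent."""
    return mantissa.astype(np.float64) * 2.0 ** exponent
```

Because every entry shares one exponent, the rounding error of each entry is at most half of `2.0 ** exponent`, so a single large value in the tensor coarsens the resolution available to all the small ones. That trade-off is why the choice of exponent matters.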

1711.11165 2026-06-04 cs.LG cs.SY eess.SY version update

Safe Exploration for Identifying Linear Systems via Robust Optimization

Tyler Lu, Martin Zinkevich, Craig Boutilier, Binz Roy, Dale Schuurmans

Affiliations: Google

AI summary: Studies how to safely identify the parameters of an unknown dynamical system, using robust optimization to gradually expand the region of safe actions and improve the sample efficiency of safe exploration.

Abstract

Safely exploring an unknown dynamical system is critical to the deployment of reinforcement learning (RL) in physical systems where failures may have catastrophic consequences. In scenarios where one knows little about the dynamics, diverse transition data covering relevant regions of state-action space is needed to apply either model-based or model-free RL. Motivated by the cooling of Google's data centers, we study how one can safely identify the parameters of a system model with a desired accuracy and confidence level. In particular, we focus on learning an unknown linear system with Gaussian noise assuming only that, initially, a nominal safe action is known. Defining safety as satisfying specific linear constraints on the state space (e.g., requirements on process variables) that must hold over the span of an entire trajectory, and given a Probably Approximately Correct (PAC) style bound on the estimation error of model parameters, we show how to compute safe regions of action space by gradually growing a ball around the nominal safe action. One can apply any exploration strategy where actions are chosen from such safe regions. Experiments on a stylized model of data center cooling dynamics show how computing proper safe regions can increase the sample efficiency of safe exploration.

1711.10566 2026-06-04 cs.AI cs.LG cs.NA math.AP math.NA stat.ML version update

Physics Informed Deep Learning (Part II): Data-driven Discovery of Nonlinear Partial Differential Equations

Maziar Raissi, Paris Perdikaris, George Em Karniadakis

Affiliations: Division of Applied Mathematics, Brown University

AI summary: Introduces physics informed neural networks, which solve supervised learning tasks while respecting physical laws. Part II focuses on data-driven discovery of partial differential equations, distinguishes continuous time and discrete time models, and demonstrates the approach on several benchmark problems in mathematical physics.

Abstract

We introduce physics informed neural networks -- neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this second part of our two-part treatise, we focus on the problem of data-driven discovery of partial differential equations. Depending on whether the available data is scattered in space-time or arranged in fixed temporal snapshots, we introduce two main classes of algorithms, namely continuous time and discrete time models. The effectiveness of our approach is demonstrated using a wide range of benchmark problems in mathematical physics, including conservation laws, incompressible fluid flow, and the propagation of nonlinear shallow-water waves.

1711.10561 2026-06-04 cs.AI cs.LG cs.NA math.DS math.NA stat.ML version update

Physics Informed Deep Learning (Part I): Data-driven Solutions of Nonlinear Partial Differential Equations

Maziar Raissi, Paris Perdikaris, George Em Karniadakis

Affiliations: Division of Applied Mathematics, Brown University

AI summary: Introduces physics informed neural networks, which solve supervised learning problems while respecting physical laws. Part I shows how these networks can infer solutions of partial differential equations and yield differentiable physics-informed surrogate models.

Abstract

We introduce physics informed neural networks -- neural networks that are trained to solve supervised learning tasks while respecting any given law of physics described by general nonlinear partial differential equations. In this two part treatise, we present our developments in the context of solving two main classes of problems: data-driven solution and data-driven discovery of partial differential equations. Depending on the nature and arrangement of the available data, we devise two distinct classes of algorithms, namely continuous time and discrete time models. The resulting neural networks form a new class of data-efficient universal function approximators that naturally encode any underlying physical laws as prior information. In this first part, we demonstrate how these networks can be used to infer solutions to partial differential equations, and obtain physics-informed surrogate models that are fully differentiable with respect to all input coordinates and free parameters.

1702.06463 2026-06-04 q-bio.NC cs.LG cs.NE cs.SY eess.SY version update

Predicting non-linear dynamics by stable local learning in a recurrent spiking neural network

Aditya Gilra, Wulfram Gerstner

AI summary: Proposes a supervised learning scheme for the weights of the feedforward and recurrent connections in a recurrent spiking neural network. Error feedback combined with a local learning rule gives stable learning, demonstrated on linear, non-linear, and chaotic dynamics.

Journal ref: eLife 2017;6:e28295

Abstract

Brains need to predict how the body reacts to motor commands. It is an open question how networks of spiking neurons can learn to reproduce the non-linear body dynamics caused by motor commands, using local, online and stable learning rules. Here, we present a supervised learning scheme for the feedforward and recurrent connections in a network of heterogeneous spiking neurons. The error in the output is fed back through fixed random connections with a negative gain, causing the network to follow the desired dynamics, while an online and local rule changes the weights. The rule for Feedback-based Online Local Learning Of Weights (FOLLOW) is local in the sense that weight changes depend on the presynaptic activity and the error signal projected onto the postsynaptic neuron. We provide examples of learning linear, non-linear and chaotic dynamics, as well as the dynamics of a two-link arm. Using the Lyapunov method, and under reasonable assumptions and approximations, we show that FOLLOW learning is stable uniformly, with the error going to zero asymptotically.

1711.08833 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Deep Learning for Real-Time Crime Forecasting and its Ternarization

Bao Wang, Penghang Yin, Andrea L. Bertozzi, P. Jeffrey Brantingham, Stanley J. Osher, Jack Xin

Affiliations: Department of Mathematics, UCLA; Department of Anthropology, UCLA; Department of Mathematics, UCI

AI summary: Proposes a crime forecasting model based on a spatial temporal residual network, with improved prediction accuracy, and uses ternarization to address resource consumption in real-world deployment.

Comments: 14 pages, 7 figures

Abstract

Real-time crime forecasting is important. However, accurate prediction of when and where the next crime will happen is difficult. No known physical model provides a reasonable approximation to such a complex system. Historical crime data are sparse in both space and time, and the signal of interest is weak. In this work, we first present a proper representation of crime data. We then adapt the spatial temporal residual network on the well-represented data to predict the distribution of crime in Los Angeles at the scale of hours in neighborhood-sized parcels. These experiments, as well as comparisons with several existing approaches to prediction, demonstrate the superiority of the proposed model in terms of accuracy. Finally, we present a ternarization technique to address the resource consumption issue for its deployment in the real world. This work is an extension of our short conference proceeding paper [Wang et al, arXiv 1707.03340].

1711.08135 2026-06-04 eess.SY cs.LG cs.SY math.OC version update

Contracting Nonlinear Observers: Convex Optimization and Learning from Data

Ian R. Manchester

AI summary: Proposes a new approach to designing nonlinear observers by constructing convex sets of contracting observers and using convex optimization to minimize a bound on state-estimation error, verified on simulated noisy data sets.

Comments: conference submission

Abstract

A new approach to the design of nonlinear observers (state estimators) is proposed. The main idea is to (i) construct a convex set of dynamical systems which are contracting observers for a particular system, and (ii) optimize over this set for one which minimizes a bound on state-estimation error on a simulated noisy data set. We construct convex sets of continuous-time and discrete-time observers, as well as contracting sampled-data observers for continuous-time systems. Convex bounds for learning are constructed using Lagrangian relaxation. The utility of the proposed methods is verified using numerical simulation.

1705.07112 2026-06-04 math.NA cs.LG cs.NA version update

Fast Singular Value Shrinkage with Chebyshev Polynomial Approximation Based on Signal Sparsity

Masaki Onuki, Shunsuke Ono, Keiichiro Shirai, Yuichi Tanaka

AI summary: Proposes using Chebyshev polynomial approximation to perform singular value shrinkage quickly, exploiting signal sparsity to reduce computational cost and improving the efficiency and precision of matrix rank minimization in image processing.

Comments: This is a journal paper

Abstract

We propose an approximation method for thresholding of singular values using Chebyshev polynomial approximation (CPA). Many signal processing problems require iterative application of singular value decomposition (SVD) for minimizing the rank of a given data matrix with other cost functions and/or constraints, which is called matrix rank minimization. In matrix rank minimization, singular values of a matrix are shrunk by hard-thresholding, soft-thresholding, or weighted soft-thresholding. However, the computational cost of SVD is generally too expensive to handle high dimensional signals such as images; hence, in this case, matrix rank minimization requires enormous computation time. In this paper, we leverage CPA to (approximately) manipulate singular values without computing singular values and vectors. The thresholding of singular values is expressed by a multiplication of certain matrices, which is derived from a characteristic of CPA. The multiplication is also efficiently computed using the sparsity of signals. As a result, the computational cost is significantly reduced. Experimental results suggest the effectiveness of our method through several image processing applications based on matrix rank minimization with nuclear norm relaxation in terms of computation time and approximation precision.

1701.03974 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML version update

An Online Convex Optimization Approach to Dynamic Network Resource Allocation

Tianyi Chen, Qing Ling, Georgios B. Giannakis

AI summary: Proposes the MOSP scheme for online optimization with adversarial losses and constraints, achieves sub-linear dynamic regret and fit, and applies it to network resource allocation, where it outperforms existing methods.

Abstract

Existing approaches to online convex optimization (OCO) make sequential one-slot-ahead decisions, which lead to (possibly adversarial) losses that drive subsequent decision iterates. Their performance is evaluated by the so-called regret that measures the difference of losses between the online solution and the best yet fixed overall solution in hindsight. The present paper deals with online convex optimization involving adversarial loss functions and adversarial constraints, where the constraints are revealed after making decisions, and can be tolerable to instantaneous violations but must be satisfied in the long term. Performance of an online algorithm in this setting is assessed by: i) the difference of its losses relative to the best dynamic solution with one-slot-ahead information of the loss function and the constraint (that is here termed dynamic regret); and, ii) the accumulated amount of constraint violations (that is here termed dynamic fit). In this context, a modified online saddle-point (MOSP) scheme is developed, and proved to simultaneously yield sub-linear dynamic regret and fit, provided that the accumulated variations of per-slot minimizers and constraints are sub-linearly growing with time. MOSP is also applied to the dynamic network resource allocation task, and it is compared with the well-known stochastic dual gradient method. Under various scenarios, numerical experiments demonstrate the performance gain of MOSP relative to the state-of-the-art.
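The two performance measures named in the abstract can be written out as follows. The notation is an assumption made here for illustration ($f_t$ for the per-slot loss, $g_t$ for the per-slot constraint function, $x_t$ for the online decision, $\mathcal{X}$ for the feasible set, $T$ for the horizon) and may differ from the paper's own symbols.

```latex
% Dynamic regret: loss of the online decisions relative to the per-slot
% minimizers x_t^*, each computed with knowledge of that slot's loss and constraint.
\mathrm{Reg}^{\mathrm{d}}_T := \sum_{t=1}^{T} f_t(x_t) - \sum_{t=1}^{T} f_t(x_t^*),
\qquad
x_t^* \in \arg\min_{x \in \mathcal{X}} \left\{ f_t(x) \; : \; g_t(x) \le 0 \right\}

% Dynamic fit: accumulated constraint violation, so instantaneous violations
% may cancel over time but a long-term surplus is penalized.
\mathrm{Fit}^{\mathrm{d}}_T := \left\| \left[ \sum_{t=1}^{T} g_t(x_t) \right]^{+} \right\|
```

Under these definitions, "sub-linear" means that both quantities grow more slowly than $T$, so the time-averaged regret and the time-averaged violation vanish as $T$ grows.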

1612.08461 2026-06-04 math.OC cs.LG cs.NA math.NA version update

Randomized Block Frank-Wolfe for Convergent Large-Scale Learning

Liang Zhang, Gang Wang, Daniel Romero, Georgios B. Giannakis

AI summary: Proposes step sizes for randomized block Frank-Wolfe that allow a flexible choice of the number of blocks updated per iteration while ensuring convergence and feasibility, and extends the convergence analysis to nonconvex objectives.

Abstract

Owing to their low-complexity iterations, Frank-Wolfe (FW) solvers are well suited for various large-scale learning tasks. When block-separable constraints are present, randomized block FW (RB-FW) has been shown to further reduce complexity by updating only a fraction of coordinate blocks per iteration. To circumvent the limitations of existing methods, the present work develops step sizes for RB-FW that enable a flexible selection of the number of blocks to update per iteration while ensuring convergence and feasibility of the iterates. To this end, convergence rates of RB-FW are established through computational bounds on a primal sub-optimality measure and on the duality gap. The novel bounds extend the existing convergence analysis, which only applies to a step-size sequence that does not generally lead to feasible iterates. Furthermore, two classes of step-size sequences that guarantee feasibility of the iterates are also proposed to enhance flexibility in choosing decay rates. The novel convergence results are markedly broadened to encompass also nonconvex objectives, and further assert that RB-FW with exact line-search reaches a stationary point at rate $\mathcal{O}(1/\sqrt{t})$. Performance of RB-FW with different step sizes and number of blocks is demonstrated in two applications, namely charging of electrical vehicles and structural support vector machines. Extensive simulated tests demonstrate the performance improvement of RB-FW relative to existing randomized single-block FW methods.
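A minimal sketch of a randomized block Frank-Wolfe iteration is shown below for a toy problem: minimizing a quadratic over a product of probability simplices. The problem, the function name `rb_frank_wolfe`, and the diminishing step size are assumptions made for illustration; in particular, the step-size rule is a generic schedule in (0, 1] and not one of the sequences developed in the paper.

```python
import numpy as np

def rb_frank_wolfe(c, n_blocks, block_dim, blocks_per_iter, iters=5000, seed=0):
    """Minimize 0.5 * ||x - c||^2 over a product of probability simplices (one per row)."""
    rng = np.random.default_rng(seed)
    x = np.full((n_blocks, block_dim), 1.0 / block_dim)    # feasible start: uniform in each simplex
    for k in range(iters):
        grad = x - c                                        # gradient of the quadratic objective
        # Hypothetical diminishing step size in (0, 1]; not the paper's schedule.
        gamma = 2.0 * n_blocks / (blocks_per_iter * k + 2.0 * n_blocks)
        chosen = rng.choice(n_blocks, size=blocks_per_iter, replace=False)
        for b in chosen:                                    # only a subset of blocks is updated
            s = np.zeros(block_dim)
            s[np.argmin(grad[b])] = 1.0                     # linear minimization over a simplex: a vertex
            x[b] = (1.0 - gamma) * x[b] + gamma * s         # convex combination keeps x[b] feasible
    return x

def quadratic(x, c):
    return 0.5 * np.sum((x - c) ** 2)
```

Because each update is a convex combination of the current block and a simplex vertex, every iterate stays feasible without any projection, which is the property the abstract's feasibility guarantee is concerned with. A call such as `rb_frank_wolfe(c, 4, 3, 2)` with `c` of shape `(4, 3)` updates two of the four blocks per iteration.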

1711.04683 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML version update

Tensor Decompositions for Modeling Inverse Dynamics

Stephan Baier, Volker Tresp

AI summary: Proposes modeling inverse dynamics with tensor decomposition, exploiting the three-way interaction of positions, velocities, and accelerations to approximate highly non-linear functions, and reports superior performance on a SARCOS robot arm dataset.

Abstract

Modeling inverse dynamics is crucial for accurate feedforward robot control. The model computes the necessary joint torques to perform a desired movement. The highly non-linear inverse function of the dynamical system can be approximated using regression techniques. We propose as a regression method a tensor decomposition model that exploits the inherent three-way interaction of positions × velocities × accelerations. Most work in tensor factorization has addressed the decomposition of dense tensors. In this paper, we build upon the decomposition of sparse tensors, with only small amounts of nonzero entries. The decomposition of sparse tensors has successfully been used in relational learning, e.g., the modeling of large knowledge graphs. Recently, the approach has been extended to multi-class classification with discrete input variables. Representing the data in high-dimensional sparse tensors enables the approximation of complex, highly non-linear functions. In this paper we show how the decomposition of sparse tensors can be applied to regression problems. Furthermore, we extend the method to continuous inputs by learning a mapping from the continuous inputs to the latent representations of the tensor decomposition, using basis functions. We evaluate our proposed model on a dataset with trajectories from a seven-degrees-of-freedom SARCOS robot arm. Our experimental results show superior performance of the proposed functional tensor model compared to challenging state-of-the-art methods.

1711.04518 2026-06-04 eess.SY cs.AI cs.HC cs.LG cs.NE cs.SY version update

A Supervised Learning Concept for Reducing User Interaction in Passenger Cars

Marius Stärk, Damian Backes, Christian Kehl

AI summary: Proposes a supervised-learning-based automation system that reduces interaction complexity in human-machine interfaces, applied to setpoint selection for multi-modal thermal conditioning systems in passenger cars.

Comments: 4 pages, 9 figures, concept only

Abstract

In this article, an automation system for human-machine interfaces (HMIs) for setpoint adjustment using supervised learning is presented. We use HMIs of multi-modal thermal conditioning systems in passenger cars as an example of a complex setpoint selection system. The goal is the reduction of interaction complexity up to full automation. The approach is not limited to climate control applications but can be extended to other setpoint-based HMIs.

1705.08551 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY 版本更新

Safe Model-based Reinforcement Learning with Stability Guarantees

具有稳定性保证的安全模型基于强化学习

Felix Berkenkamp, Matteo Turchetta, Angela P. Schoellig, Andreas Krause

AI总结 本文提出一种考虑安全性的强化学习算法,通过Lyapunov稳定性验证理论,利用动态统计模型获得具有证明稳定性的高性能控制策略,并在模拟倒立摆中展示其安全优化神经网络策略的能力。

Comments Proc. of Neural Information Processing Systems (NIPS), 2017

详情
AI中文摘要

强化学习是一种从实验数据中学习最优策略的强大范式。然而,为了找到最优策略,大多数强化学习算法会探索所有可能的动作,这可能对现实系统有害。因此,学习算法在现实世界中很少应用于安全关键系统。在本文中,我们提出了一种明确考虑安全性的学习算法,定义为稳定性保证。具体来说,我们扩展了控制理论中关于Lyapunov稳定性验证的结果,并展示了如何利用动态的统计模型来获得具有证明稳定性的高性能控制策略。此外,在额外的正则性假设条件下,我们证明了可以有效地、安全地收集数据以学习动态特性,从而提高控制性能并扩大状态空间的安全区域。在我们的实验中,我们展示了所得到的算法如何在模拟倒立摆上安全地优化神经网络策略,而摆杆从未倒下。

英文摘要

Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety, defined in terms of stability guarantees. Specifically, we extend control-theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

1711.03906 2026-06-04 cs.LG cs.DC cs.NI cs.RO cs.SY eess.SY 版本更新

D-SLATS: Distributed Simultaneous Localization and Time Synchronization

D-SLATS:分布式的同时定位与时间同步

Amr Alanwar, Henrique Ferraz, Kevin Hsieh, Rohit Thazhath, Paul Martin, Joao Hespanha, Mani Srivastava

AI总结 本文提出D-SLATS框架,通过分布式扩展卡尔曼滤波和优化技术联合解决时间同步与定位问题,实现3微秒精度和30厘米误差。

详情
AI中文摘要

通过过去十年,我们见证了物联网(IoT)设备数量的激增,随之而来的是一次对时间和空间上协同行动的更大需求。尽管时间同步和定位这两个问题在许多方面有共同点,但传统上它们被分别处理或在集中式方法中结合,导致资源利用效率低下,或在设备数量方面不可扩展的解决方案。因此,我们提出D-SLATS,一个由三种不同且独立算法组成的框架,以分布式方式联合解决时间和定位问题。前两个算法主要基于分布式扩展卡尔曼滤波(EKF),而第三个算法使用优化技术。不需要融合中心,设备仅与邻居通信。所提出的方法在定制的超宽带通信测试平台和四旋翼无人机上进行了评估,代表了静态和移动节点的网络。我们的算法实现了高达三微秒的时间同步精度和30厘米的定位误差。

英文摘要

Over the last decade, we have witnessed a surge of Internet of Things (IoT) devices, and with that a greater need to choreograph their actions across both time and space. Although these two problems, namely time synchronization and localization, have much in common, they are traditionally treated separately or combined in centralized approaches that result in an inefficient use of resources, or in solutions that do not scale with the number of IoT devices. Therefore, we propose D-SLATS, a framework comprising three different and independent algorithms that jointly solve the time synchronization and localization problems in a distributed fashion. The first two algorithms are based mainly on the distributed Extended Kalman Filter (EKF), whereas the third uses optimization techniques. No fusion center is required, and the devices communicate only with their neighbors. The proposed methods are evaluated on a custom ultra-wideband communication testbed and a quadrotor, representing a network of both static and mobile nodes. Our algorithms achieve up to three microseconds time synchronization accuracy and 30 cm localization error.

1701.08585 2026-06-04 cs.LG cs.SI cs.SY eess.SY math.OC 版本更新

Variational Policy for Guiding Point Processes

变分策略用于引导点过程

Yichen Wang, Grady Williams, Evangelos Theodorou, Le Song

AI总结 本文提出基于最优测度和变分推断的凸优化框架,用于设计点过程的最优控制策略,以更高效准确地引导系统状态。

Comments ICML 2017

详情
AI中文摘要

时间点过程已被广泛应用于建模由在线用户生成的事件序列数据。本文考虑如何设计点过程的最优控制策略,以将由点过程驱动的随机系统引导至目标状态。我们从最优测度和变分推断的角度提出关键洞察,并进一步提出一个凸优化框架和高效的算法,用于自适应更新策略。在合成和真实数据上的实验表明,我们的算法比其他随机控制方法在引导用户活动方面更加准确和高效。

英文摘要

Temporal point processes have been widely applied to model event sequence data generated by online users. In this paper, we consider the problem of how to design the optimal control policy for point processes, such that the stochastic system driven by the point process is steered to a target state. In particular, we exploit the key insight to view the stochastic optimal control problem from the perspective of optimal measure and variational inference. We further propose a convex optimization framework and an efficient algorithm to update the policy adaptively to the current system state. Experiments on synthetic and real-world data show that our algorithm can steer the user activities much more accurately and efficiently than other stochastic control methods.

1711.03398 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Data Fusion and Machine Learning Integration for Transformer Loss of Life Estimation

数据融合与机器学习集成用于变压器寿命估计

Mohsen Mahoor, Amin Khodaei

AI总结 本文通过数据融合与机器学习技术,结合IEEE标准,利用ANFIS和RBF网络估算变压器寿命,并采用OWA和序列卡尔曼滤波提高精度。

详情
AI中文摘要

机器学习方法的快速发展为变压器资产管理提供了新机会。本文整合机器学习和数据融合技术,基于IEEE Std. C57.91-2011,利用小时变压器负载和环境温度数据合成数据,通过ANFIS和RBF网络估算变压器寿命,并采用OWA和序列卡尔曼滤波融合结果以提高估计精度。仿真结果验证了所提方法的有效性。

英文摘要

Rapid growth of machine learning methodologies and their applications offers new opportunities for improved transformer asset management. Accordingly, power system operators are currently looking for data-driven methods to make better-informed decisions in terms of network management. In this paper, machine learning and data fusion techniques are integrated to estimate transformer loss of life. Using IEEE Std. C57.91-2011, a data synthesis process is proposed based on hourly transformer loading and ambient temperature values. This synthesized data is employed to estimate transformer loss of life using an Adaptive Network-Based Fuzzy Inference System (ANFIS) and a Radial Basis Function (RBF) network, whose outputs are then fused with the objective of improving the estimation accuracy. Among various data fusion techniques, Ordered Weighted Averaging (OWA) and the sequential Kalman filter are selected to fuse the outputs of the ANFIS and RBF estimators. Simulation results demonstrate the merit and effectiveness of the proposed method.
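To make the fusion step concrete, here is a minimal sketch of Ordered Weighted Averaging, one of the two fusion techniques the abstract names. The function name `owa`, the two estimate values, and the equal weights are illustrative assumptions; the abstract does not state how the weights were chosen.

```python
def owa(values, weights):
    """Ordered weighted average: weights attach to ranks, not to sources."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values, reverse=True)  # largest value gets the first weight
    return sum(w * v for w, v in zip(weights, ordered))


# Hypothetical loss-of-life estimates (percent) from the two models.
anfis_estimate = 0.012
rbf_estimate = 0.010
fused = owa([anfis_estimate, rbf_estimate], [0.5, 0.5])
```

With weights `[1, 0]` the operator returns the maximum and with `[0, 1]` the minimum, so the weight vector sets how optimistic or pessimistic the fused estimate is.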

1711.02877 2026-06-04 eess.SY cs.AI cs.LG cs.SY math.OC 版本更新

Un résultat intrigant en commande sans modèle

一个令人着迷的无模型控制结果

Cédric Join, Emmanuel Delaleau, Michel Fliess, Claude H. Moog

AI总结 通过鲁夫-赫维茨准则,证明了无模型控制中智能比例控制器可能比智能比例-微分控制器更难调参,通过仿真展示了iPD的优势。

Comments in French, https://www.openscience.fr/Un-resultat-intrigant-en-commande-sans-modele

详情
Journal ref
ISTE OpenScience Automatique, vol. 1, 2017
AI中文摘要

一个简单的数学例子证明,通过鲁夫-赫维茨准则,一个令人着迷的结果得以展现,即在当今对无模型控制的理解中,智能比例控制器(iP)可能比智能比例-微分控制器(iPD)更难调参。通过计算机仿真展示了iPD相较于经典PID的显著优势。引言和结论从近期进展的角度分析了无模型控制。

英文摘要

An elementary mathematical example proves, thanks to the Routh-Hurwitz criterion, a result that is intriguing with respect to today's practical understanding of model-free control, i.e., an "intelligent" proportional controller (iP) may turn out to be more difficult to tune than an intelligent proportional-derivative one (iPD). The vast superiority of iPDs when compared to classic PIDs is shown via computer simulations. The introduction as well as the conclusion analyse model-free control in the light of recent advances.
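For readers unfamiliar with the criterion, the following sketch applies the Routh-Hurwitz conditions to a generic monic cubic. The abstract does not give the closed-loop polynomial used in the paper, so the function name and the test polynomials are assumptions chosen only for illustration.

```python
def cubic_is_hurwitz(a2, a1, a0):
    """Routh-Hurwitz test for s^3 + a2*s^2 + a1*s + a0.

    All roots lie in the open left half-plane exactly when every
    coefficient is positive and a2 * a1 > a0.
    """
    return a2 > 0 and a1 > 0 and a0 > 0 and a2 * a1 > a0


# (s + 1)^3 = s^3 + 3s^2 + 3s + 1 has a triple root at s = -1.
stable_example = cubic_is_hurwitz(3.0, 3.0, 1.0)
# s^3 + s^2 + s + 2 has positive coefficients but fails a2 * a1 > a0.
unstable_example = cubic_is_hurwitz(1.0, 1.0, 2.0)
```

The second polynomial shows why tuning can be delicate: positive coefficients alone do not guarantee stability once the characteristic polynomial reaches degree three.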

1711.02857 2026-06-04 cs.LG cs.AI cs.CV cs.NA math.NA stat.ML 版本更新

Learning Sparse Visual Representations with Leaky Capped Norm Regularizers

通过泄漏受限范数正则化器学习稀疏视觉表示

Jianqiao Wangni, Dahua Lin

AI总结 本文提出泄漏受限范数正则化器,用于学习过完备视觉表示,证明了其在3D形状恢复中的收敛性,优于ℓ1和非凸正则化方法。

详情
AI中文摘要

诱导稀疏性的正则化是学习过完备视觉表示的重要组成部分。尽管ℓ1正则化广受欢迎,本文研究了非凸正则化在该问题中的应用。我们的贡献包括三个部分:首先,我们提出了泄漏受限范数正则化器(LCNR),允许模型权重低于一定阈值的部分被更强地正则化,从而实现强稀疏性,仅引入可控的估计偏差。我们提出了一种主要化-最小化算法来优化联合目标函数。其次,我们的研究显示,在单目3D形状恢复和神经网络中,LCNR优于ℓ1和其他非凸正则化方法,实现了最先进的性能和更快的收敛速度。第三,我们证明了在3D恢复问题上的理论全局收敛速度。到目前为止,这是首次对3D恢复问题的收敛性分析。

英文摘要

Sparsity-inducing regularization is an important part of learning over-complete visual representations. Despite the popularity of $\ell_1$ regularization, in this paper we investigate the use of non-convex regularization for this problem. Our contribution consists of three parts. First, we propose the leaky capped norm regularization (LCNR), which allows model weights below a certain threshold to be regularized more strongly than those above it, and therefore imposes strong sparsity while introducing only controllable estimation bias. We propose a majorization-minimization algorithm to optimize the joint objective function. Second, our study shows that LCNR outperforms $\ell_1$ and other non-convex regularizations on monocular 3D shape recovery and neural networks, achieving state-of-the-art performance and faster convergence. Third, we prove a theoretical global convergence speed on the 3D recovery problem. To the best of our knowledge, this is the first convergence analysis of the 3D recovery problem.

1711.00946 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Learning Linear Dynamical Systems via Spectral Filtering

通过谱滤波学习线性动力系统

Elad Hazan, Karan Singh, Cyril Zhang

AI总结 本文提出一种高效算法,通过过度参数化线性动力系统实现在线预测,利用谱滤波技术获得近优 regret 保证。

Comments Published as a conference paper at NIPS 2017

详情
AI中文摘要

我们提出了一种高效且实用的算法,用于在线预测具有对称转移矩阵的离散时间线性动力系统。我们通过不当学习避免非凸优化问题:通过多项式对数因子过度参数化LDS类,在换取损失函数的凸性。由此产生一个具有近优 regret 保证的多项式时间算法,具有类似的一般学习样本复杂度界。我们的算法基于一种新颖的过滤技术,可能具有独立兴趣:我们将时间序列与某个Hankel矩阵的特征向量进行卷积。

英文摘要

We present an efficient and practical algorithm for the online prediction of discrete-time linear dynamical systems with a symmetric transition matrix. We circumvent the non-convex optimization problem using improper learning: carefully overparameterize the class of LDSs by a polylogarithmic factor, in exchange for convexity of the loss functions. From this arises a polynomial-time algorithm with a near-optimal regret guarantee, with an analogous sample complexity bound for agnostic learning. Our algorithm is based on a novel filtering technique, which may be of independent interest: we convolve the time series with the eigenvectors of a certain Hankel matrix.

1702.06861 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

On the Power of Truncated SVD for General High-rank Matrix Estimation Problems

关于截断SVD在一般高秩矩阵估计问题中的功效

Simon S. Du, Yining Wang, Aarti Singh

AI总结 本文探讨了在谱范数下接近高秩正半定矩阵的估计值,通过截断SVD在Frobenius范数下得到乘法近似,解决了高秩矩阵补全、去噪和高维协方差估计问题。

Comments Accepted by NIPS 2017. Add gap-dependent bounds

详情
AI中文摘要

本文证明,给定一个在谱范数下接近一般高秩正半定矩阵A的估计值Ã(即‖Ã-A‖₂ ≤ δ),对Ã进行简单的截断SVD可以得到A在Frobenius范数下的乘法近似。这一观察导致了许多关于一般高秩矩阵估计问题的有趣结果,我们简要总结如下(A是一个n×n的高秩正半定矩阵,A_k是A的最佳秩-k近似):(1)高秩矩阵补全:通过观测Ω(nmax{ε⁻⁴,k²}μ₀²‖A‖_F²logn/σ_{k+1}(A)²)个A的元素,其中σ_{k+1}(A)是A的第(k+1)个奇异值,μ₀是不相干性,对零填充矩阵进行截断SVD可以满足‖Ã_k -A‖_F ≤ (1+O(ε))‖A -A_k‖_F,以高概率成立。(2)高秩矩阵去噪:令Ã=A+E,其中E是一个高斯随机噪声矩阵,具有零均值和每个元素方差为ν²/n。则Ã的截断SVD满足‖Ã_k -A‖_F ≤ (1+O(√(ν/σ_{k+1}(A))))‖A -A_k‖_F + O(√kν)。(3)高维协方差的低秩估计:给定N个i.i.d.样本X₁,…,X_N ~ N_n(0,A),能否用相对误差Frobenius范数界来估计A?我们证明如果N=Ω(nmax{ε⁻⁴,k²}γ_k(A)²logN),其中γ_k(A)=σ₁(A)/σ_{k+1}(A),则‖Ã_k -A‖_F ≤ (1+O(ε))‖A -A_k‖_F,以高概率成立,其中Ã=1/N∑_{i=1}^N X_iX_i^T是样本协方差。

英文摘要

We show that given an estimate $\widehat{A}$ that is close to a general high-rank positive semi-definite (PSD) matrix $A$ in spectral norm (i.e., $\|\widehat{A}-A\|_2 \leq \delta$), the simple truncated SVD of $\widehat{A}$ produces a multiplicative approximation of $A$ in Frobenius norm. This observation leads to many interesting results on general high-rank matrix estimation problems, which we briefly summarize below ($A$ is an $n\times n$ high-rank PSD matrix and $A_k$ is the best rank-$k$ approximation of $A$): (1) High-rank matrix completion: By observing $\Omega(\frac{n\max\{\epsilon^{-4},k^2\}\mu_0^2\|A\|_F^2\log n}{\sigma_{k+1}(A)^2})$ elements of $A$ where $\sigma_{k+1}\left(A\right)$ is the $\left(k+1\right)$-th singular value of $A$ and $\mu_0$ is the incoherence, the truncated SVD on a zero-filled matrix satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability. (2) High-rank matrix de-noising: Let $\widehat{A}=A+E$ where $E$ is a Gaussian random noise matrix with zero mean and $\nu^2/n$ variance on each entry. Then the truncated SVD of $\widehat{A}$ satisfies $\|\widehat{A}_k-A\|_F \leq (1+O(\sqrt{\nu/\sigma_{k+1}(A)}))\|A-A_k\|_F + O(\sqrt{k}\nu)$. (3) Low-rank estimation of high-dimensional covariance: Given $N$ i.i.d. samples $X_1,\cdots,X_N\sim\mathcal N_n(0,A)$, can we estimate $A$ with a relative-error Frobenius norm bound? We show that if $N = \Omega\left(n\max\{\epsilon^{-4},k^2\}\gamma_k(A)^2\log N\right)$ for $\gamma_k(A)=\sigma_1(A)/\sigma_{k+1}(A)$, then $\|\widehat{A}_k-A\|_F \leq (1+O(\epsilon))\|A-A_k\|_F$ with high probability, where $\widehat{A}=\frac{1}{N}\sum_{i=1}^N{X_iX_i^\top}$ is the sample covariance.

1710.10532 2026-06-04 eess.SY cs.AI cs.LG cs.SY 版本更新

Interpretable Apprenticeship Learning with Temporal Logic Specifications

具有时序逻辑规范的可解释模仿学习

Daniel Kasenberg, Matthias Scheutz

AI总结 本文提出通过多目标优化从MDP中的行为轨迹推断LTL规范,采用违反成本概念设计状态和动作基于的目标函数,并通过遗传算法在简单领域验证方法有效性。

Comments Accepted to the 56th IEEE Conference on Decision and Control (CDC 2017)

详情
AI中文摘要

近期工作已针对线性时序逻辑(LTL)公式作为在马尔可夫决策过程(MDP)中规划智能体的规范进行了研究。我们考虑逆问题:从MDP中的演示行为轨迹推断LTL规范。我们将此问题形式化为多目标优化问题,并基于"违反成本"的概念描述了基于状态("实际发生了什么")和基于动作("智能体预期会发生什么")的目标函数。我们通过采用遗传规划在两个简单领域中求解该问题,展示了该方法的有效性。

英文摘要

Recent work has addressed using formulas in linear temporal logic (LTL) as specifications for agents planning in Markov Decision Processes (MDPs). We consider the inverse problem: inferring an LTL specification from demonstrated behavior trajectories in MDPs. We formulate this as a multiobjective optimization problem, and describe state-based ("what actually happened") and action-based ("what the agent expected to happen") objective functions based on a notion of "violation cost". We demonstrate the efficacy of the approach by employing genetic programming to solve this problem in two simple domains.

1710.09854 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Gradient Sparsification for Communication-Efficient Distributed Optimization

梯度稀疏化用于通信高效的分布式优化

Jianqiao Wangni, Jialei Wang, Ji Liu, Tong Zhang

AI总结 本文提出通过凸优化方法减少梯度通信开销,设计高效算法实现梯度稀疏化,验证了在逻辑回归、支持向量机和卷积神经网络中的有效性。

详情
AI中文摘要

现代大规模机器学习应用需要在分布式计算架构上实现随机优化算法。关键瓶颈是不同工作者之间交换信息(如随机梯度)的通信开销。本文提出了一种凸优化公式,以最小化随机梯度的编码长度。为高效求解最优稀疏化,提出了几种简单快速的算法用于近似解,具有理论保证的稀疏性。在ℓ2正则化逻辑回归、支持向量机和卷积神经网络上的实验验证了我们的稀疏化方法。

英文摘要

Modern large-scale machine learning applications require stochastic optimization algorithms to be implemented on distributed computational architectures. A key bottleneck is the communication overhead of exchanging information such as stochastic gradients among different workers. In this paper, to reduce the communication cost, we propose a convex optimization formulation to minimize the coding length of stochastic gradients. To solve for the optimal sparsification efficiently, several simple and fast algorithms are proposed for approximate solutions, with theoretical guarantees on sparsity. Experiments on $\ell_2$-regularized logistic regression, support vector machines, and convolutional neural networks validate our sparsification approaches.
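The idea of sparsifying a stochastic gradient without biasing it can be sketched as follows. This is an illustrative randomized scheme, not necessarily the paper's exact algorithm: the magnitude-proportional keep probabilities, the `scale` parameter, and the function names are assumptions, since the abstract does not state how the probabilities are obtained from the convex formulation.

```python
import random


def keep_probabilities(grad, scale):
    """Assumed rule: keep probability proportional to |g_i|, capped at 1."""
    return [min(1.0, scale * abs(g)) for g in grad]


def sparsify(grad, probs, rng):
    """Drop coordinate i with probability 1 - p_i; rescale survivors by 1/p_i."""
    out = []
    for g, p in zip(grad, probs):
        if p > 0.0 and rng.random() < p:
            out.append(g / p)  # rescaling keeps the expectation equal to g
        else:
            out.append(0.0)    # dropped coordinates need not be transmitted
    return out


rng = random.Random(0)
grad = [4.0, -0.5, 0.1, 0.0]  # hypothetical stochastic gradient
probs = keep_probabilities(grad, scale=0.5)
sparse_grad = sparsify(grad, probs, rng)
```

Because each surviving coordinate is divided by its keep probability, the sparsified vector equals the original gradient in expectation; the price is higher variance on small coordinates.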

1710.09657 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Segment Parameter Labelling in MCMC Mean-Shift Change Detection

MCMC均值迁移中的分段参数标记

Alireza Ahrabian, Shirin Enshaeifar, Clive Cheong-Took, Payam Barnaghi

AI总结 本文提出一种基于贝叶斯均值迁移的分段变化检测算法,利用分段参数重复性提升性能。

详情
AI中文摘要

本文解决时间序列数据在贝叶斯模型中关于感兴趣统计参数的分段问题。通常假设每个分段内的参数是不同的,因此许多贝叶斯变化点检测模型未利用分段参数模式,这可能提高性能。本文提出了一种贝叶斯均值迁移变化点检测算法,通过引入利用狄利克雷过程先验的分段类别标签来利用分段参数的重复性。所提出方法在合成和真实数据上的性能评估表明,使用参数标记可提高性能。

英文摘要

This work addresses the problem of segmentation in time series data with respect to a statistical parameter of interest in Bayesian models. It is common to assume that the parameters are distinct within each segment. As such, many Bayesian change point detection models do not exploit the segment parameter patterns, which can improve performance. This work proposes a Bayesian mean-shift change point detection algorithm that makes use of repetition in segment parameters, by introducing segment class labels that utilise a Dirichlet process prior. The performance of the proposed approach was assessed on both synthetic and real world data, highlighting the enhanced performance when using parameter labelling.

1710.08883 2026-06-04 cs.DC cs.LG cs.NA math.NA math.OC 版本更新

Avoiding Communication in Proximal Methods for Convex Optimization Problems

在凸优化问题中避免通信的近端方法

Saeed Soori, Aditya Devarakonda, James Demmel, Mert Gurbuzbalaban, Maryam Mehri Dehnavi

AI总结 本文提出一种改进的FISTA算法,通过每k次迭代通信一次来减少数据传输,提升分布式架构性能,实验显示在多个基准测试中平均加速3-10倍。

详情
AI中文摘要

快速迭代软阈值算法(FISTA)用于解决机器学习中的凸正则化优化问题。分布式实现因能处理大数据集而流行,但现有FISTA在每次迭代都通信,限制了现代分布式架构的性能。本文重新公式化FISTA,使数据每k次迭代通信一次,减少大数据集的通信开销。在Lasso问题的两种优化方法中,算法显示延迟成本降低k倍,而带宽和浮点运算成本保持不变。改进算法的收敛率和稳定性与标准方法相似。在1至1024个节点上评估通信避免FISTA和近端牛顿方法的性能,显示在多个基准测试中平均加速3-10倍,且扩展性能优于经典算法。

英文摘要

The fast iterative soft thresholding algorithm (FISTA) is used to solve convex regularized optimization problems in machine learning. Distributed implementations of the algorithm have become popular since they enable the analysis of large datasets. However, existing formulations of FISTA communicate data at every iteration, which reduces performance on modern distributed architectures. The communication costs of FISTA, including bandwidth and latency costs, are closely tied to the mathematical formulation of the algorithm. This work reformulates FISTA to communicate data once every k iterations, reducing data communication when operating on large datasets. We formulate the algorithm for two different optimization methods on the Lasso problem and show that the latency cost is reduced by a factor of k while bandwidth and floating-point operation costs remain the same. The convergence rates and stability properties of the reformulated algorithms are similar to those of the standard formulations. The performance of communication-avoiding FISTA and Proximal Newton methods is evaluated on 1 to 1024 nodes for multiple benchmarks, demonstrating average speedups of 3-10x with scaling properties that outperform the classical algorithms.
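As background, the sketch below shows standard single-node FISTA for the Lasso objective $\frac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1$. It is the classical formulation that communicates (here: computes a gradient) at every iteration, not the communication-avoiding reformulation the paper proposes. The helper names and the tiny identity-matrix problem are assumptions for illustration; `step` should be at most the reciprocal of the largest eigenvalue of $A^\top A$.

```python
def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1, applied elementwise."""
    return [max(abs(x) - t, 0.0) * (1.0 if x > 0 else -1.0) for x in v]


def matvec(M, x):
    return [sum(m * xi for m, xi in zip(row, x)) for row in M]


def fista_lasso(A, b, lam, step, iters=200):
    """Minimize 0.5 * ||Ax - b||^2 + lam * ||x||_1 with FISTA."""
    At = [list(col) for col in zip(*A)]  # transpose of A
    x = [0.0] * len(A[0])
    y = x[:]
    t = 1.0
    for _ in range(iters):
        residual = [r - bi for r, bi in zip(matvec(A, y), b)]
        grad = matvec(At, residual)  # gradient of the smooth term at y
        x_new = soft_threshold(
            [yi - step * gi for yi, gi in zip(y, grad)], step * lam
        )
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        momentum = (t - 1.0) / t_new
        y = [xn + momentum * (xn - xo) for xn, xo in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# With A = I the Lasso solution is the soft-thresholded right-hand side.
x_hat = fista_lasso([[1.0, 0.0], [0.0, 1.0]], [3.0, -0.5], lam=1.0, step=1.0)
```

Each pass through the loop needs one product with `A` and one with its transpose; in a distributed setting those products are where data is exchanged, which is the per-iteration cost the paper's reformulation amortizes over k iterations.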

1710.08530 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML 版本更新

Stability Analysis of Optimal Adaptive Control using Value Iteration with Approximation Errors

基于价值迭代的最优自适应控制稳定性分析

Ali Heydari

AI总结 本文通过价值迭代分析自适应最优控制的稳定性,考虑了近似误差的影响,提供了吸引域的估计,确保初始条件在该域内时轨迹保持有效。

Comments A part of this paper is based on preliminary results presented in arXiv:1412.5675

详情
AI中文摘要

本文对基于价值迭代的自适应最优控制进行了理论分析,研究了学习阶段系统稳定性,不忽略近似误差的影响。分析包括使用任何单个/常数控制策略或演化的/时间变化控制策略时的系统运行。所提结果的一个特点是提供吸引域的估计,如果初始条件在该域内,整个轨迹将保持在其中,从而保证函数近似结果的有效性。

英文摘要

Adaptive optimal control using value iteration initiated from a stabilizing control policy is theoretically analyzed in terms of the stability of the system during the learning stage, without ignoring the effects of approximation errors. This analysis covers the system operated using any single/constant resulting control policy as well as using an evolving/time-varying control policy. A feature of the presented results is that they provide estimates of the region of attraction, so that if the initial condition is within the region, the whole trajectory will remain inside it and hence the function approximation results remain valid.

1709.03153 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY 版本更新

MBMF: Model-Based Priors for Model-Free Reinforcement Learning

MBMF:基于模型的先验用于无模型强化学习

Somil Bansal, Roberto Calandra, Kurtland Chua, Sergey Levine, Claire Tomlin

AI总结 本文提出一种结合模型与无模型强化学习的方法,通过学习概率动力学模型作为先验,提升数据效率和成本效益。

Comments After we submitted the paper for consideration in CoRL 2017 we found a paper published in the recent past with a similar method (see related work for a discussion). Considering the similarities between the two papers, we have decided to retract our paper from CoRL 2017

详情
AI中文摘要

强化学习主要分为无模型和有模型两种范式。每种范式都有其优势和局限性,并已成功应用于适合其相应优势的真实世界领域。本文提出一种新方法,旨在弥合这两种范式的差距。我们通过学习概率动力学模型,并将其作为交织的无模型优化的先验,结合两种范式的优点,从而实现数据高效和成本节约。结果表明,我们的方法在性能上优于纯有模型和纯无模型方法,以及简单切换范式的方法。

英文摘要

Reinforcement learning is divided into two main paradigms: model-free and model-based. Each of these two paradigms has strengths and limitations, and has been successfully applied to real-world domains that are appropriate to its corresponding strengths. In this paper, we present a new approach aimed at bridging the gap between these two paradigms. We aim to take the best of the two paradigms and combine them in an approach that is at the same time data-efficient and cost-savvy. We do so by learning a probabilistic dynamics model and leveraging it as a prior for the intertwined model-free optimization. As a result, our approach can exploit the generality and structure of the dynamics model, but is also capable of ignoring its inevitable inaccuracies by directly incorporating the evidence provided by the direct observation of the cost. Preliminary results demonstrate that our approach outperforms purely model-based and model-free approaches, as well as the approach of simply switching from a model-based to a model-free setting.

1607.03081 2026-06-04 math.NA cs.LG cs.NA math.OC stat.ML 版本更新

Proximal Quasi-Newton Methods for Regularized Convex Optimization with Linear and Accelerated Sublinear Convergence Rates

近似拟牛顿方法在正则化凸优化中的应用:线性和加速次线性收敛率

Hiva Ghanbari, Katya Scheinberg

AI总结 本文研究了正则化凸优化中近似拟牛顿方法的收敛性,分析了强凸情况下精确与不精确设置的收敛性质,并探讨了加速变体的实用性与性能。

详情
AI中文摘要

在[19]中,提出了一种通用的、不精确的、高效的近似拟牛顿算法用于复合优化问题,并建立了次线性全局收敛率。本文分析了该方法在精确和不精确设置下的收敛性质,当目标函数为强凸时。我们还研究了该方法的一个实用变种,通过建立一个简单的子问题优化停止准则。此外,我们考虑了基于FISTA[1]的加速变体,针对近似拟牛顿算法。类似加速方法在[7]中被考虑,但其收敛性分析依赖于非常强但不实际的假设。我们提出了一个修改后的分析,放松了这些假设,并对加速的近似拟牛顿算法和常规方法进行了实际比较。我们的分析和计算结果表明,在拟牛顿设置中加速可能不会带来任何好处。

英文摘要

In [19], a general, inexact, efficient proximal quasi-Newton algorithm for composite optimization problems has been proposed and a sublinear global convergence rate has been established. In this paper, we analyze the convergence properties of this method, both in the exact and inexact setting, in the case when the objective function is strongly convex. We also investigate a practical variant of this method by establishing a simple stopping criterion for the subproblem optimization. Furthermore, we consider an accelerated variant, based on FISTA [1], to the proximal quasi-Newton algorithm. A similar accelerated method has been considered in [7], where the convergence rate analysis relies on very strong impractical assumptions. We present a modified analysis while relaxing these assumptions and perform a practical comparison of the accelerated proximal quasi-Newton algorithm and the regular one. Our analysis and computational results show that acceleration may not bring any benefit in the quasi-Newton setting.

1710.05472 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Safe Learning of Quadrotor Dynamics Using Barrier Certificates

使用屏障证书安全学习四旋翼动力学

Li Wang, Evangelos A. Theodorou, Magnus Egerstedt

AI总结 本文提出基于高斯过程的数据驱动方法,通过屏障证书确保四旋翼在部分未知环境中的安全学习,结合自适应采样方案和递归高斯过程预测实现动态建模。

Comments Submitted to ICRA 2018, 8 pages

详情
AI中文摘要

为了有效控制复杂动力系统,通常需要准确的非线性模型。然而,这些模型并不总是已知的。在本文中,我们提出了一种基于高斯过程的数据驱动方法,用于学习在部分未知环境中运行的四旋翼动力学模型。挑战在于,若学习过程未被谨慎控制,系统将不稳定,即四旋翼将坠毁。为此,采用屏障证书进行安全学习。屏障证书建立了一个非保守的正向不变安全区域,基于高斯过程的统计特性提供高概率的安全保证。设计了一个学习控制器,以高效探索不确定状态并扩展基于自适应采样方案的屏障认证安全区域。此外,开发了一种递归高斯过程预测方法,用于实时学习复杂的四旋翼动力学。仿真结果证明了所提方法的有效性。

英文摘要

To effectively control complex dynamical systems, accurate nonlinear models are typically needed. However, these models are not always known. In this paper, we present a data-driven approach based on Gaussian processes that learns models of quadrotors operating in partially unknown environments. What makes this challenging is that if the learning process is not carefully controlled, the system will go unstable, i.e., the quadcopter will crash. To this end, barrier certificates are employed for safe learning. The barrier certificates establish a non-conservative forward invariant safe region, in which high probability safety guarantees are provided based on the statistics of the Gaussian Process. A learning controller is designed to efficiently explore those uncertain states and expand the barrier certified safe region based on an adaptive sampling scheme. In addition, a recursive Gaussian Process prediction method is developed to learn the complex quadrotor dynamics in real-time. Simulation results are provided to demonstrate the effectiveness of the proposed approach.

1709.04889 2026-06-04 math.OC cs.LG cs.RO cs.SY eess.SY 版本更新

Control-Oriented Learning on the Fly

实时控制导向学习

Melkior Ornik, Arie Israel, Ufuk Topcu

AI总结 本文提出一种实时控制导向学习方法,用于在系统动力学几乎未知的情况下实现控制目标,通过小扰动学习局部动态并保证近似最优方向,验证了其在受损飞机避撞和Van der Pol振荡器中的有效性。

Comments Extended version of M. Ornik, A. Israel, U. Topcu, "Myopic Control of Systems with Unknown Dynamics". Detailed list of differences from that paper given within the manuscript. Changes in v2 include a discussion of myopic control in an LTL context and a correction of the bound for suboptimality of the algorithm

详情
AI中文摘要

本文聚焦于开发一种策略,用于控制其动力学几乎完全未知的系统。这种情况自然出现在系统经历关键故障的场景中。在这种情况下,保留满足基本控制目标的能力以避免即将来临的灾难至关重要。一个典型的此类目标是可达避障问题,其中系统需要在受限的状态空间中移动到某个状态。为了应对对系统动力学知识的限制,我们开发了一种贪心控制理论。贪心控制的主要目标是在任何给定时间,仅根据到目前为止获得的系统信息,优化系统的轨迹方向。我们提出了一种算法,利用小扰动的控制努力来学习局部动态,同时确保系统朝着看似近似最优的方向移动,并为其次优性能提供硬性界限。我们还验证了该算法在受损飞机避撞模拟以及Van der Pol振荡器示例中的有效性。

英文摘要

This paper focuses on developing a strategy for control of systems whose dynamics are almost entirely unknown. This situation arises naturally in a scenario where a system undergoes a critical failure. In that case, it is imperative to retain the ability to satisfy basic control objectives in order to avert an imminent catastrophe. A prime example of such an objective is the reach-avoid problem, where a system needs to move to a certain state in a constrained state space. To deal with limitations on our knowledge of system dynamics, we develop a theory of myopic control. The primary goal of myopic control is to, at any given time, optimize the current direction of the system trajectory, given solely the information obtained about the system until that time. We propose an algorithm that uses small perturbations in the control effort to learn local dynamics while simultaneously ensuring that the system moves in a direction that appears to be nearly optimal, and provide hard bounds for its suboptimality. We additionally verify the usefulness of the algorithm on a simulation of a damaged aircraft seeking to avoid a crash, as well as on an example of a Van der Pol oscillator.

1710.03971 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Adaptive multi-penalty regularization based on a generalized Lasso path

基于广义Lasso路径的自适应多罚项正则化

Markus Grasmair, Timo Klock, Valeriya Naumova

AI总结 本文提出了一种自适应多罚项正则化参数选择框架,通过构建包含结构相似解的区域,实现正确支持恢复,并结合模型选择准则进行数据自适应参数选择,提升压缩感知问题的鲁棒性和性能。

详情
AI中文摘要

对于许多算法,参数调节仍是一个具有挑战性和关键性的任务,尤其是在多参数设置中变得繁琐且不可行。多罚项正则化,成功用于解决混合型不定稀疏回归问题,其中信号和噪声是加法混合的,是此类例子之一。本文提出了一种新的算法框架,用于多罚项正则化的自适应参数选择,重点在于正确支持恢复。基于正则化路径理论和单罚项函数的算法理论,我们通过提供一种高效的构造包含结构相似解的区域的程序,将这些想法扩展到多罚项框架中,即在参数范围内的整个范围内,构造具有相同稀疏性和符号模式的解。结合这一方法与模型选择准则,可以以数据自适应的方式选择正则化参数。我们算法的另一个优势是,它提供了整个参数范围内解稳定性概述。这可以进一步用于获得对感兴趣问题的额外见解。我们对我们的方法进行了数值分析,并将其与压缩感知问题中的最新单罚项算法进行比较,以展示所提算法的鲁棒性和强大性。

英文摘要

For many algorithms, parameter tuning remains a challenging and critical task, which becomes tedious and infeasible in a multi-parameter setting. Multi-penalty regularization, successfully used for solving underdetermined sparse regression problems of unmixing type, where signal and noise are additively mixed, is one such example. In this paper, we propose a novel algorithmic framework for an adaptive parameter choice in multi-penalty regularization with a focus on correct support recovery. Building upon the theory of regularization paths and algorithms for single-penalty functionals, we extend these ideas to a multi-penalty framework by providing an efficient procedure for the construction of regions containing structurally similar solutions, i.e., solutions with the same sparsity and sign pattern, over the whole range of parameters. Combining this with a model selection criterion, we can choose regularization parameters in a data-adaptive manner. Another advantage of our algorithm is that it provides an overview of the solution stability over the whole range of parameters. This can be further exploited to obtain additional insights into the problem of interest. We provide a numerical analysis of our method and compare it to state-of-the-art single-penalty algorithms for compressed sensing problems in order to demonstrate the robustness and power of the proposed algorithm.

1502.02860 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SY 版本更新

Gaussian Processes for Data-Efficient Learning in Robotics and Control

高斯过程在机器人和控制中的数据高效学习

Marc Peter Deisenroth, Dieter Fox, Carl Edward Rasmussen

AI总结 本文提出基于高斯过程的非参数转移模型,通过提取更多数据信息加速学习,减少模型误差影响,实现高效自主学习。

Comments 20 pages, 29 figures; fixed a typo in equation on page 8

详情
Journal ref
IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 37, issue no 2, pages 408-423, February 2015
AI中文摘要

自主学习在控制和机器人领域已持续十多年,数据驱动学习可减少工程知识需求。然而,自主强化学习通常需要大量系统交互,这在实际系统中(如机器人)不现实。本文提出通过高斯过程转移模型提取更多数据信息,显式纳入模型不确定性以减少误差影响,相比现有RL方法,模型基于策略搜索方法实现了前所未有的学习速度,并在真实机器人和控制任务中展示了应用价值。

英文摘要

Autonomous learning has been a promising direction in control and robotics for more than a decade, since data-driven learning reduces the amount of engineering knowledge that is otherwise required. However, autonomous reinforcement learning (RL) approaches typically require many interactions with the system to learn controllers, which is a practical limitation in real systems, such as robots, where many interactions can be impractical and time consuming. To address this problem, current learning approaches typically require task-specific knowledge in the form of expert demonstrations, realistic simulators, pre-shaped policies, or specific knowledge about the underlying dynamics. In this article, we follow a different approach and speed up learning by extracting more information from data. In particular, we learn a probabilistic, non-parametric Gaussian process transition model of the system. By explicitly incorporating model uncertainty into long-term planning and controller learning, our approach reduces the effects of model errors, a key problem in model-based learning. Compared to state-of-the-art RL, our model-based policy search method achieves an unprecedented speed of learning. We demonstrate its applicability to autonomous learning in real robot and control tasks.

1710.02242 2026-06-04 cs.LG cs.NA math.NA 版本更新

Solving differential equations with unknown constitutive relations as recurrent neural networks

利用未知本构关系求解微分方程作为循环神经网络

Tobias Hagge, Panos Stinis, Enoch Yeung, Alexandre M. Tartakovsky

AI总结 本文提出用循环神经网络学习未知的反应速率项,通过离散化的常微分方程作为训练问题的一部分,解决部分可用状态变量测量数据下的微分方程问题,应用于fedbatch生物反应器模拟。

Comments 19 pages, 8 figures

详情
AI中文摘要

我们解决一个具有未知功能形式的sink(反应速率)项的常微分方程组。我们假设状态变量的测量(时间序列)部分可用,并利用循环神经网络来“学习”反应速率。这通过将离散化的常微分方程作为循环神经网络训练问题的一部分来实现。我们扩展了TensorFlow的循环神经网络架构,创建了一个简单但可扩展且有效的求解器,用于未知函数的求解,并应用于fedbatch生物反应器模拟问题。使用最近深度学习文献中的技术使训练具有在数千个时间步上表现的行为的函数成为可能。我们的网络在结构上类似于循环神经网络,但设计和功能上的差异要求对训练此类网络的传统智慧进行修改。

英文摘要

We solve a system of ordinary differential equations with an unknown functional form of a sink (reaction rate) term. We assume that the measurements (time series) of state variables are partially available, and we use a recurrent neural network to "learn" the reaction rate from this data. This is achieved by including the discretized ordinary differential equations as part of a recurrent neural network training problem. We extend TensorFlow's recurrent neural network architecture to create a simple but scalable and effective solver for the unknown functions, and apply it to a fed-batch bioreactor simulation problem. Use of techniques from recent deep learning literature enables training of functions with behavior manifesting over thousands of time steps. Our networks are structurally similar to recurrent neural networks, but differences in design and function require modifications to the conventional wisdom about training such networks.

1710.00032 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Learning the Exact Topology of Undirected Consensus Networks

学习无向共识网络的精确拓扑结构

Saurav Talukdar, Deepjyoti Deka, Sandeep Attree, Donatello Materassi, Murti V. Salapaka

AI总结 本文提出了一种非侵入性方法,利用多元维纳滤波学习无向共识网络的交互拓扑,通过频率响应识别虚假链接,从而精确揭示节点间交互结构。

Comments 6 pages

详情
AI中文摘要

本文提出了一种以非侵入方式学习执行线性共识更新的网络代理交互拓扑的方法。我们的方法基于多元维纳滤波,已知该方法除恢复拓扑中的真实边外,还会恢复出虚假边。本文的主要贡献是证明在无向共识网络中,使用维纳滤波获得的所有虚假链接可通过维纳滤波器的频率响应识别。因此,代理的精确交互拓扑得以揭示。所提出的方法需要代理状态的时间序列测量,且不需任何链接权重的知识。据我们所知,这是首个可证明地重建具有相关噪声的无向共识网络结构的方法。我们通过数值模拟以及五节点Raspberry Pi网络的实验展示了该方法的有效性。

英文摘要

In this article, we present a method to learn the interaction topology of a network of agents undergoing linear consensus updates in a non-invasive manner. Our approach is based on multivariate Wiener filtering, which is known to recover spurious edges in addition to the true edges in the topology. The main contribution of this work is to show that in the case of undirected consensus networks, all spurious links obtained using Wiener filtering can be identified using the frequency response of the Wiener filters. Thus, the exact interaction topology of the agents is unveiled. The method presented requires time series measurements of the state of the agents and does not require any knowledge of link weights. To the best of our knowledge, this is the first approach that provably reconstructs the structure of undirected consensus networks with correlated noise. We illustrate the effectiveness of the method through numerical simulations as well as experiments on a five-node network of Raspberry Pis.

1708.02276 2026-06-04 math.NA cs.LG cs.NA 版本更新

Parallelizing Over Artificial Neural Network Training Runs with Multigrid

用多网格方法并行化人工神经网络训练运行

Jacob B. Schroder

AI总结 本文提出一种多网格降时(MGRIT)算法,用于并行化神经网络训练运行,以解决训练阶段的瓶颈问题,通过将训练视为演化方程来实现并行计算。

Comments Version 2: - Added more complete references to basic neural network literature - Corrected typos - Condensed results in Section 3 to be more concise - 22 pages

详情
AI中文摘要

人工神经网络是一种流行且有效的机器学习技术。在并行化单个网络的昂贵训练阶段方面取得了巨大进展,导致了高度专门化的硬件,许多基于GPU架构,以及更多并发算法如合成梯度。然而,训练阶段仍然是一个瓶颈,其中训练数据必须在数千个单独的训练运行上串行处理。本文考虑了一种多网格降时(MGRIT)算法,能够并行化数千个训练运行,并收敛到与传统训练完全相同的解。MGRIT最初是为需要串行推进有限个时间步的时间演化问题提供并行性而开发的。本文将神经网络训练类似地重新表述,将神经网络训练视为一个演化方程,该方程从一步到下一步演化网络权重。因此,本文关注神经网络的分布式计算方法,但与其他仅试图在单个训练运行上并行化的做法不同。本文最后给出了两个模型问题的数值结果以支持研究。

英文摘要

Artificial neural networks are a popular and effective machine learning technique. Great progress has been made parallelizing the expensive training phase of an individual network, leading to highly specialized pieces of hardware, many based on GPU-type architectures, and more concurrent algorithms such as synthetic gradients. However, the training phase continues to be a bottleneck, where the training data must be processed serially over thousands of individual training runs. This work considers a multigrid reduction in time (MGRIT) algorithm that is able to parallelize over the thousands of training runs and converge to the exact same solution as traditional training would provide. MGRIT was originally developed to provide parallelism for time evolution problems that serially step through a finite number of time-steps. This work recasts the training of a neural network similarly, treating neural network training as an evolution equation that evolves the network weights from one step to the next. Thus, this work concerns distributed computing approaches for neural networks, but is distinct from other approaches which seek to parallelize only over individual training runs. The work concludes with supporting numerical results for two model problems.

1709.09578 2026-06-04 cs.LG cs.NA math.NA 版本更新

Neural networks for topology optimization

神经网络用于拓扑优化

Ivan Sosnovik, Ivan Oseledets

AI总结 本文提出基于深度学习的拓扑优化加速方法,将布局问题转化为图像分割任务,利用卷积编码器-解码器架构实现高效优化,实验表明方法显著提升优化速度并具有良好的泛化能力。

详情
AI中文摘要

在本研究中,我们提出了一种基于深度学习的方法,以加速拓扑优化方法。我们试图解决布局问题。本工作的主要创新点是将问题表述为图像分割任务。我们利用深度学习方法的高效像素级图像标注技术来进行拓扑优化。我们引入了卷积编码器-解码器架构,并介绍了通过高性能方法解决上述问题的整体方法。进行的实验展示了优化过程的显著加速。所提出的方法具有出色的泛化能力。我们展示了所提出模型应用于其他问题的能力。成功的实验结果以及当前方法的缺点均进行了讨论。

英文摘要

In this research, we propose a deep learning based approach for speeding up topology optimization methods. The problem we seek to solve is the layout problem. The main novelty of this work is to state the problem as an image segmentation task. We leverage the power of deep learning methods as an efficient pixel-wise image labeling technique to perform the topology optimization. We introduce a convolutional encoder-decoder architecture and the overall approach to solving the above-described problem with high performance. The conducted experiments demonstrate a significant acceleration of the optimization process. The proposed approach has excellent generalization properties. We demonstrate that the proposed model can be applied to other problems. The successful results, as well as the drawbacks of the current method, are discussed.

1709.08830 2026-06-04 eess.SY cs.CR cs.LG cs.SY 版本更新

Catching Anomalous Distributed Photovoltaics: An Edge-based Multi-modal Anomaly Detection

捕捉异常分布式光伏:基于边缘的多模态异常检测

Devu Manikantan Shila, Kin Gwn Lore, Tianshu Wei, Teems Lovett, Yu Cheng

AI总结 本文提出基于边缘的多模态异常检测方法,用于识别分布式光伏等设备的异常行为,通过融合多源时间序列数据,提升对电网安全的检测能力。

详情
AI中文摘要

能源系统网络安全性面临的主要挑战是无法检测针对分布式电网边缘设备(如光伏板、智能柔性负载和电动汽车)的网络物理攻击。本文设计并开发了一种分布式、多模态异常检测方法,通过在多个时间序列数据源上使用无监督机器学习算法,融合本地观测并标记异常。特别关注分布式光伏面临的网络物理威胁,通过创建供需失配、反向功率流等条件导致局部扰动或电网不稳定。使用开源电力系统模拟工具GridLAB-D,结合真实智能家居和太阳能数据集模拟智能电网场景,展示光伏攻击对电力系统的影响。各种针对光伏板的攻击(如电压波动、反向功率流等)被设计并执行。观察到虽然单个无监督学习算法如OCSVMs、Corrupt RF和PCA在识别特定攻击类型上表现优异,但PCA与凸包的组合在识别所有设计攻击时表现最佳,真阳性率为83.64%,准确率为95.78%。关键发现是由于配电网络的异构性和攻击类型的不确定性,依赖单一信息模式进行防御会导致假警报和漏检率增加,因为攻击者可以设计攻击隐藏在这些不确定性中保持隐蔽。

英文摘要

A significant challenge in energy system cyber security is the current inability to detect cyber-physical attacks targeting and originating from distributed grid-edge devices such as photovoltaic (PV) panels, smart flexible loads, and electric vehicles. We address this concern by designing and developing a distributed, multi-modal anomaly detection approach that can sense the health of the device and the electric power grid from the edge. This is realized by exploiting unsupervised machine learning algorithms on multiple sources of time-series data, fusing these multiple local observations and flagging anomalies when a deviation from the normal behavior is observed. We particularly focus on the cyber-physical threats to distributed PVs that have the potential to cause local disturbances or grid instabilities by creating supply-demand mismatch, reverse power flow conditions, etc. We use an open source power system simulation tool called GridLAB-D, loaded with real smart home and solar datasets, to simulate the smart grid scenarios and to illustrate the impact of PV attacks on the power system. Various attacks targeting PV panels that create voltage fluctuations, reverse power flow, etc. were designed and performed. We observe that while individual unsupervised learning algorithms such as OCSVMs, Corrupt RF and PCA excel at identifying particular attack types, PCA with Convex Hull outperforms all algorithms in identifying all designed attacks, with a true positive rate of 83.64% and an accuracy of 95.78%. Our key insight is that due to the heterogeneous nature of the distribution grid and the uncertainty in the type of attack being launched, relying on a single mode of information for defense can lead to increased false alarm and missed detection rates, as one can design attacks to hide within those uncertainties and remain stealthy.

1703.09260 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Goal-Driven Dynamics Learning via Bayesian Optimization

通过贝叶斯优化的目标驱动动力学学习

Somil Bansal, Roberto Calandra, Ted Xiao, Sergey Levine, Claire J. Tomlin

AI总结 本文提出通过贝叶斯优化主动学习框架,迭代学习局部线性动力学模型以提升控制性能,用于四旋翼无人机任务控制。

Comments This is the extended version of the CDC'17 paper titled "Goal-Driven Dynamics Learning via Bayesian Optimization."

详情
AI中文摘要

现实中的机器人日益复杂,常在不明确的环境中运作,难以建模或学习其真实动力学。因此,采用任务特定方法,聚焦于学习能实现最佳控制性能的动力学模型,而非真实动力学。本文在主动学习框架中使用贝叶斯优化,通过迭代更新基于物理系统实验表现的动力学模型,与最优控制方案结合,高效设计控制器。通过仿真和真实四旋翼测试平台实验验证了所提方法的有效性。

英文摘要

Real-world robots are becoming increasingly complex and commonly act in poorly understood environments where it is extremely challenging to model or learn their true dynamics. Therefore, it might be desirable to take a task-specific approach, wherein the focus is on explicitly learning the dynamics model which achieves the best control performance for the task at hand, rather than learning the true dynamics. In this work, we use Bayesian optimization in an active learning framework where a locally linear dynamics model is learned with the intent of maximizing the control performance, and used in conjunction with optimal control schemes to efficiently design a controller for a given task. This model is updated directly based on the performance observed in experiments on the physical system in an iterative manner until a desired performance is achieved. We demonstrate the efficacy of the proposed approach through simulations and real experiments on a quadrotor testbed.

1703.01250 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Virtual vs. Real: Trading Off Simulations and Physical Experiments in Reinforcement Learning with Bayesian Optimization

虚拟与现实:利用贝叶斯优化在强化学习中权衡模拟与物理实验

Alonso Marco, Felix Berkenkamp, Philipp Hennig, Angela P. Schoellig, Andreas Krause, Stefan Schaal, Sebastian Trimpe

AI总结 本文提出利用模拟数据优化强化学习,通过结合低成本但不准确的模拟信息与高成本但准确的物理实验,提高效率。

Comments 7 pages, 6 figures, to appear in IEEE 2017 International Conference on Robotics and Automation (ICRA)

详情
AI中文摘要

在实践中,控制策略的参数通常手动调整,这耗时且令人沮丧。强化学习是一种有前途的替代方法,旨在自动化此过程,但通常需要太多实验才实用。本文提出了一种解决方案,通过利用可用于大多数机器人平台的模拟先验知识。具体而言,我们扩展了熵搜索,一种最大化每次实验信息增益的贝叶斯优化算法,以处理多个信息源的情况。结果是一种原则性的方法,可以有效地将低成本但不准确的模拟信息与高成本且准确的物理实验结合起来。我们将其应用于摆杆系统,证明该算法可以在比仅使用物理系统标准贝叶斯优化更少的实验中找到良好的控制策略。

英文摘要

In practice, the parameters of control policies are often tuned manually. This is time-consuming and frustrating. Reinforcement learning is a promising alternative that aims to automate this process, yet often requires too many experiments to be practical. In this paper, we propose a solution to this problem by exploiting prior knowledge from simulations, which are readily available for most robotic platforms. Specifically, we extend Entropy Search, a Bayesian optimization algorithm that maximizes information gain from each experiment, to the case of multiple information sources. The result is a principled way to automatically combine cheap, but inaccurate information from simulations with expensive and accurate physical experiments in a cost-effective manner. We apply the resulting method to a cart-pole system, which confirms that the algorithm can find good control policies with fewer experiments than standard Bayesian optimization on the physical system only.

1605.01950 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Automatic LQR Tuning Based on Gaussian Process Global Optimization

基于高斯过程全局优化的自动LQR调优

Alonso Marco, Philipp Hennig, Jeannette Bohg, Stefan Schaal, Sebastian Trimpe

AI总结 本文提出一种结合线性最优控制的自动控制器调优框架,利用贝叶斯优化算法提升控制器参数,通过实验数据优化性能目标,以七自由度机械臂平衡倒立杆为例验证方法有效性。

Comments 8 pages, 5 figures, to appear in IEEE 2016 International Conference on Robotics and Automation. Video demonstration of the experiments available at https://am.is.tuebingen.mpg.de/publications/marco_icra_2016

详情
AI中文摘要

本文提出一种基于线性最优控制与贝叶斯优化的自动控制器调优框架。该框架根据预定义的性能目标,利用实验数据自动改进初始控制器参数。所采用的贝叶斯优化算法为熵搜索,将潜在目标表示为高斯过程,并构建关于目标最小值位置的显式信念。通过最大化每次实验评估的信息增益,该框架能够在较少评估次数下获得改进的控制器。实验演示使用了七自由度机械臂平衡倒立杆的任务,二、四维调优问题的结果展示了该方法在机器人平台上的自动控制器调优潜力。

英文摘要

This paper proposes an automatic controller tuning framework based on linear optimal control combined with Bayesian optimization. With this framework, an initial set of controller gains is automatically improved according to a pre-defined performance objective evaluated from experimental data. The underlying Bayesian optimization algorithm is Entropy Search, which represents the latent objective as a Gaussian process and constructs an explicit belief over the location of the objective minimum. This is used to maximize the information gain from each experimental evaluation. Thus, this framework shall yield improved controllers with fewer evaluations compared to alternative approaches. A seven-degree-of-freedom robot arm balancing an inverted pole is used as the experimental demonstrator. Results for two- and four-dimensional tuning problems highlight the method's potential for automatic controller tuning on robotic platforms.

1709.06080 2026-06-04 cs.LG cs.AI cs.NA math.NA 版本更新

Feedforward and Recurrent Neural Networks Backward Propagation and Hessian in Matrix Form

前馈和循环神经网络的反向传播与Hessian矩阵形式

Maxim Naumov

AI总结 本文研究了前馈和循环神经网络的线性代数理论,推导了Hessian的精确表达式,并展示了权重梯度和Hessian的矩阵形式。

Comments 23 pages, 4 figures

详情
AI中文摘要

本文聚焦于前馈(FNN)和循环(RNN)神经网络背后的线性代数理论。我们回顾了反向传播,包括通过时间反向传播(BPTT)。此外,我们推导出Hessian的新的精确表达式,它刻画了二阶效应。我们证明,对于t个时间步,权重梯度可以表示为秩-t矩阵,而权重Hessian则可以表示为t²个Kronecker积之和,这些Kronecker积由秩-1和W^TAW矩阵组成,其中A是某个矩阵,W是权重矩阵。此外,我们还证明,对于大小为r的mini-batch,权重更新可以表示为秩-rt矩阵。最后,我们简要评论了Hessian矩阵的特征值。

英文摘要

In this paper we focus on the linear algebra theory behind feedforward (FNN) and recurrent (RNN) neural networks. We review backward propagation, including backward propagation through time (BPTT). Also, we obtain a new exact expression for the Hessian, which represents second-order effects. We show that for $t$ time steps the weight gradient can be expressed as a rank-$t$ matrix, while the weight Hessian can be expressed as a sum of $t^{2}$ Kronecker products of rank-$1$ and $W^{T}AW$ matrices, for some matrix $A$ and weight matrix $W$. Also, we show that for a mini-batch of size $r$, the weight update can be expressed as a rank-$rt$ matrix. Finally, we briefly comment on the eigenvalues of the Hessian matrix.
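The rank-$t$ statement about the weight gradient can be checked numerically on a toy example. The sketch below is an illustration under stated assumptions, not the paper's derivation: the recurrence h_k = tanh(W h_{k-1} + x_k), the loss 0.5 * ||h_t||^2, and the helper names (`bptt_weight_gradient`, `numerical_rank`) are all hypothetical. BPTT adds one outer product per time step, so the accumulated gradient has rank at most $t$.

```python
import math
import random


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def bptt_weight_gradient(W, h0, xs):
    """Gradient of L = 0.5 * ||h_T||^2 w.r.t. W for h_k = tanh(W h_{k-1} + x_k)."""
    n = len(h0)
    hs = [h0]
    for x in xs:  # forward pass, storing every hidden state
        z = matvec(W, hs[-1])
        hs.append([math.tanh(zi + xi) for zi, xi in zip(z, x)])
    grad = [[0.0] * n for _ in range(n)]
    dh = list(hs[-1])  # dL/dh_T
    Wt = [list(col) for col in zip(*W)]
    for k in range(len(xs), 0, -1):  # backward pass through time
        dz = [d * (1.0 - h * h) for d, h in zip(dh, hs[k])]
        for i in range(n):
            for j in range(n):
                grad[i][j] += dz[i] * hs[k - 1][j]  # one rank-1 term per step
        dh = matvec(Wt, dz)
    return grad


def numerical_rank(M, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < tol:
            continue
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, rows):
            f = A[r][c] / A[rank][c]
            for j in range(c, cols):
                A[r][j] -= f * A[rank][j]
        rank += 1
    return rank


random.seed(0)
n, t = 6, 3
W = [[random.gauss(0.0, 0.5) for _ in range(n)] for _ in range(n)]
h0 = [random.gauss(0.0, 1.0) for _ in range(n)]
xs = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(t)]
G = bptt_weight_gradient(W, h0, xs)
print(numerical_rank(G), "<=", t)
```

The same counting argument suggests why a mini-batch of size $r$ gives rank at most $rt$: each sample contributes its own $t$ outer products.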

1709.06011 2026-06-04 cs.MA cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Guided Deep Reinforcement Learning for Swarm Systems

引导式深度强化学习用于群体系统

Maximilian Hüttenrauch, Adrian Šošić, Gerhard Neumann

AI总结 本文研究如何通过有限感知能力的协作代理(如机器人群)学习控制方法,提出引导式强化学习框架,利用中央 critic 获取全局状态以简化策略评估,通过深度强化学习近似 Q 函数和策略。

Comments 15 pages, 8 figures, accepted at the AAMAS 2017 Autonomous Robots and Multirobot Systems (ARMS) Workshop

详情
AI中文摘要

本文研究如何学习控制具有有限感知能力的协作代理群体(如机器人群)。代理仅具备基本传感器能力,但通过协作可完成复杂任务,如分布式装配或搜索救援。学习群体代理的策略因分布式部分可观测性而困难。本文采用引导式方法,其中 critic 在学习过程中拥有全局状态的中央访问,从而从强化学习角度简化策略评估问题。例如,通过摄像头图像获取所有机器人位置,但该图像仅供 critic 使用,不供机器人控制策略。本文采用 actor-critic 方法,其中 actor 仅基于本地感知信息做决策,而 critic 基于真实全局状态进行学习。算法使用深度强化学习近似 Q 函数和策略。算法性能在两个简单模拟 2D 代理任务上进行评估:1) 找到并维持一定距离;2) 定位目标。

英文摘要

In this paper, we investigate how to learn to control a group of cooperative agents with limited sensing capabilities such as robot swarms. The agents have only very basic sensor capabilities, yet in a group they can accomplish sophisticated tasks, such as distributed assembly or search and rescue tasks. Learning a policy for a group of agents is difficult due to distributed partial observability of the state. Here, we follow a guided approach where a critic has central access to the global state during learning, which simplifies the policy evaluation problem from a reinforcement learning point of view. For example, we can get the positions of all robots of the swarm using a camera image of a scene. This camera image is only available to the critic and not to the control policies of the robots. We follow an actor-critic approach, where the actors base their decisions only on locally sensed information. In contrast, the critic is learned based on the true global state. Our algorithm uses deep reinforcement learning to approximate both the Q-function and the policy. The performance of the algorithm is evaluated on two tasks with simple simulated 2D agents: 1) finding and maintaining a certain distance to each other and 2) locating a target.

1602.04436 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Autoregressive Moving Average Graph Filtering

自回归移动平均图滤波器

Elvin Isufi, Andreas Loukas, Andrea Simonetto, Geert Leus

AI总结 本文提出了一种自回归移动平均图滤波器,能够近似任意图频响应并实现信号去噪和插值。该方法适用于静态和时变场景,通过二维滤波处理时变图信号。

详情
Journal ref
IEEE Transactions on Signal Processing, vol. 67 (2), pages 274 - 288, 2017
AI中文摘要

本文提出了一种自回归移动平均图滤波器,能够近似任意图频响应并实现信号去噪和插值。该方法适用于静态和时变场景,通过二维滤波处理时变图信号。

英文摘要

One of the cornerstones of the field of signal processing on graphs are graph filters, direct analogues of classical filters, but intended for signals defined on graphs. This work brings forth new insights on the distributed graph filtering problem. We design a family of autoregressive moving average (ARMA) recursions, which (i) are able to approximate any desired graph frequency response, and (ii) give exact solutions for tasks such as graph signal denoising and interpolation. The design philosophy, which allows us to design the ARMA coefficients independently from the underlying graph, renders the ARMA graph filters suitable in static and, particularly, time-varying settings. The latter occur when the graph signal and/or graph are changing over time. We show that in case of a time-varying graph signal our approach extends naturally to a two-dimensional filter, operating concurrently in the graph and regular time domains. We also derive sufficient conditions for filter stability when the graph and signal are time-varying. The analytical and numerical results presented in this paper illustrate that ARMA graph filters are practically appealing for static and time-varying settings, as predicted by theoretical derivations.
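As a rough illustration of how a first-order recursion can give an exact denoising solution, the sketch below iterates y <- x - w * L * y on a small path graph. The fixed point satisfies (I + w L) y = x, which corresponds to the rational graph frequency response 1 / (1 + w * lambda). This is an assumption-laden toy: the Tikhonov-style objective, the unshifted Laplacian, the coefficient w, and the function names are chosen here for illustration and are not the paper's parametrization. The iteration converges only when w times the largest Laplacian eigenvalue is below 1.

```python
def path_laplacian(n):
    """Combinatorial Laplacian of an undirected path graph with n nodes."""
    L = [[0.0] * n for _ in range(n)]
    for i in range(n - 1):
        L[i][i] += 1.0
        L[i + 1][i + 1] += 1.0
        L[i][i + 1] -= 1.0
        L[i + 1][i] -= 1.0
    return L


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def arma1_denoise(L, x, w, iters=200):
    """Iterate y <- x - w * L y; each step needs only neighbor exchanges."""
    y = list(x)
    for _ in range(iters):
        Ly = matvec(L, y)
        y = [xi - w * li for xi, li in zip(x, Ly)]
    return y


L = path_laplacian(4)          # largest eigenvalue is about 3.41
x = [1.0, -0.5, 2.0, 0.3]      # hypothetical noisy graph signal
w = 0.2                        # w * 3.41 < 1, so the recursion converges
y = arma1_denoise(L, x, w)

# Residual of the linear system (I + w L) y = x at the returned iterate.
residual = [yi + w * li - xi for yi, li, xi in zip(y, matvec(L, y), x)]
print(max(abs(r) for r in residual))
```

Each update touches only a node's own value and its neighbors' values, which is what makes recursions of this kind attractive for distributed implementation.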

1709.04073 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging

线性随机逼近:固定步长和迭代平均

Chandrashekar Lakshminarayanan, Csaba Szepesvári

AI总结 本文研究了固定步长和Polyak-Ruppert平均的线性随机逼近算法,分析了其均方误差随迭代次数的变化,并探讨了在不同数据分布下固定步长的选择条件及启发式调整方法。

Comments 16 pages, 2 figures, was submitted to NIPS 2017

详情
AI中文摘要

本文研究了具有固定步长和Polyak-Ruppert(PR)迭代平均的$d$维线性随机逼近算法(LSAs)。LSAs广泛应用于机器学习和强化学习(RL)中,其目标是利用噪声数据和每个迭代$O(d)$次更新来计算合适的$θ_*∈\mathbb{R}^d$(即最优解或固定点)。本文受RL中从经验回放中评估策略的问题启发,探讨了属于时间差分(TD)类学习算法的LSAs。对于具有固定步长和PR平均的LSAs,我们提供了$t$次迭代后的均方误差(MSE)的界限。我们假设数据是独立同分布且具有有限方差(底层分布为$P$)且期望动力学是Hurwitz的。对于给定的LSA与PR平均,以及满足上述假设的数据分布$P$,我们证明存在一个常数步长范围,使得其MSE衰减为$O(1/t)$。我们还探讨了在数据分布$\mathcal{P}$中选择统一常数步长的条件,并证明并非所有数据分布都允许这样的统一常数步长。此外,我们建议一种启发式步长调整算法,用于为给定的数据分布$P$选择LSA的常数步长。我们还比较了我们的结果与相关工作,并讨论了我们的结果在TD算法作为LSAs的上下文中的意义。

英文摘要

We consider $d$-dimensional linear stochastic approximation algorithms (LSAs) with a constant step-size and the so-called Polyak-Ruppert (PR) averaging of iterates. LSAs are widely applied in machine learning and reinforcement learning (RL), where the aim is to compute an appropriate $θ_{*} \in \mathbb{R}^d$ (that is, an optimum or a fixed point) using noisy data and $O(d)$ updates per iteration. In this paper, we are motivated by the problem (in RL) of policy evaluation from experience replay using the temporal difference (TD) class of learning algorithms that are also LSAs. For LSAs with a constant step-size and PR averaging, we provide bounds for the mean squared error (MSE) after $t$ iterations. We assume that data is i.i.d. with finite variance (underlying distribution being $P$) and that the expected dynamics is Hurwitz. For a given LSA with PR averaging, and data distribution $P$ satisfying the said assumptions, we show that there exists a range of constant step-sizes such that its MSE decays as $O(\frac{1}{t})$. We examine the conditions under which a constant step-size can be chosen uniformly for a class of data distributions $\mathcal{P}$, and show that not all data distributions "admit" such a uniform constant step-size. We also suggest a heuristic step-size tuning algorithm to choose a constant step-size of a given LSA for a given data distribution $P$. We compare our results with related work and also discuss the implication of our results in the context of TD algorithms that are LSAs.
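A one-dimensional toy run can make the setting concrete. The sketch below applies a constant step-size LSA update theta <- theta + alpha * (b_t - A_t * theta) with i.i.d. noisy (A_t, b_t) and reports both the last iterate and the Polyak-Ruppert average. The noise model, the step size, and the function name `lsa_with_pr_averaging` are assumptions made for illustration; they are not taken from the paper.

```python
import random


def lsa_with_pr_averaging(alpha, steps, seed=0):
    """Scalar LSA with constant step size; returns (last iterate, PR average).

    E[A_t] = 1 and E[b_t] = 2, so the fixed point is theta_* = 2.
    """
    rng = random.Random(seed)
    theta, running_sum = 0.0, 0.0
    for _ in range(steps):
        a_t = 1.0 + rng.uniform(-0.5, 0.5)  # noisy sample of A
        b_t = 2.0 + rng.uniform(-0.5, 0.5)  # noisy sample of b
        theta += alpha * (b_t - a_t * theta)
        running_sum += theta
    return theta, running_sum / steps


last, averaged = lsa_with_pr_averaging(alpha=0.1, steps=20000)
print("last iterate:", last)
print("PR average:  ", averaged)
```

With a constant step size the last iterate keeps fluctuating around the fixed point, while the running average concentrates as the number of iterations grows, which is the behaviour the $O(1/t)$ bound describes.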

1701.02440 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Machine Learning of Linear Differential Equations using Gaussian Processes

利用高斯过程学习线性微分方程

Maziar Raissi, George Em. Karniadakis

AI总结 本文利用概率机器学习最新进展,通过高斯过程先验发现参数化的线性守恒律方程,包括常微分、偏微分、积分微分及分数阶算子。

详情
AI中文摘要

本工作利用概率机器学习最新进展,发现由参数化线性方程表达的守恒律。此类方程包括但不限于常微分、偏微分、积分微分和分数阶算子。此处,根据此类算子的特定形式修改高斯过程先验,并用于从稀疏且可能含噪声的观测中推断线性方程的参数。此类观测可能来自实验或“黑箱”计算机模拟。

英文摘要

This work leverages recent advances in probabilistic machine learning to discover conservation laws expressed by parametric linear equations. Such equations involve, but are not limited to, ordinary and partial differential, integro-differential, and fractional order operators. Here, Gaussian process priors are modified according to the particular form of such operators and are employed to infer parameters of the linear equations from scarce and possibly noisy observations. Such observations may come from experiments or "black-box" computer simulations.

1709.01672 2026-06-04 cs.NI cs.LG cs.NE cs.SY eess.SY 版本更新

Throughput Optimal Decentralized Scheduling of Multi-Hop Networks with End-to-End Deadline Constraints: II Wireless Networks with Interference

多跳网络中端到端截止期限约束下的吞吐量最优去中心化调度:II 无线网络与干扰

Rahul Singh, P. R. Kumar, Eytan Modiano

AI总结 研究多跳无线网络中端到端截止期限约束下的去中心化调度问题,提出基于位置和年龄的路由调度策略以最大化吞吐量,强调其与传统稳定性的关键差异。

详情
AI中文摘要

考虑一个多跳无线网络,服务于多个流,其中无线链路干扰约束由链路干扰图描述。针对此类网络,设计路由调度策略以最大化网络的端到端及时吞吐量。流f的及时吞吐量定义为流f的数据包在截止期限内到达其目的地节点$d_f$的平均速率。我们的策略具有几个令人惊讶的特点。首先,我们证明了单个包在无线节点i∈V处的最优路由调度决策仅取决于其位置和年龄,因此无线节点i无需了解全局网络状态即可最大化及时吞吐量。相比之下,在回压路由策略下,节点i仅需了解邻居队列长度以保证最大稳定性,因此是去中心化的。关键差异在于,在我们的设置中,一旦包的年龄超过其截止期限,其效用将丧失,这使得优化及时吞吐量比确保网络稳定性更具挑战性。当然,由于这一关键差异,最大化及时吞吐量的决策过程也比确保网络内队列稳定化更复杂。因此,我们的结果有些令人惊讶。

英文摘要

Consider a multihop wireless network serving multiple flows in which wireless link interference constraints are described by a link interference graph. For such a network, we design routing-scheduling policies that maximize the end-to-end timely throughput of the network. Timely throughput of a flow $f$ is defined as the average rate at which packets of flow $f$ reach their destination node $d_f$ within their deadline. Our policy has several surprising characteristics. Firstly, we show that the optimal routing-scheduling decision for an individual packet that is present at a wireless node $i\in V$ is solely a function of its location and "age". Thus, a wireless node $i$ does not require knowledge of the "global" network state in order to maximize the timely throughput. We notice that, in comparison, under the backpressure routing policy, a node $i$ requires only the knowledge of its neighbours' queue lengths in order to guarantee maximal stability, and hence is decentralized. The key difference arises due to the fact that in our set-up the packets lose their utility once their "age" has crossed their deadline, thus making the task of optimizing timely throughput much more challenging than that of ensuring network stability. Of course, due to this key difference, the decision process involved in maximizing the timely throughput is also much more complex than that involved in ensuring network-wide queue stabilization. In view of this, our results are somewhat surprising.

1709.02555 2026-06-04 eess.SY cs.AI cs.LG cs.LO cs.SY 版本更新

Causality-Aided Falsification

因果辅助的反驳

Takumi Akazaki, Yoshihiro Kumazawa, Ichiro Hasuo

AI总结 本文提出利用因果信息提升异构系统质量保证中反驳效率的方法,通过贝叶斯网络优化成本函数实现高效输入值搜索。

Comments In Proceedings FVAV 2017, arXiv:1709.02126

详情
Journal ref
EPTCS 257, 2017, pp. 3-18
AI中文摘要

在异构系统的质量保证中,反例寻找因其复杂性超出了大多数验证技术的可扩展性而受到关注。本文提出在反例寻找中引入因果辅助的概念:通过为反例求解器提供由贝叶斯网络表达的合适因果信息,使其依赖于特定成本函数的随机优化,可以高效地搜索反例输入值。我们的实验结果展示了该方法的可行性。

英文摘要

Falsification is drawing attention in quality assurance of heterogeneous systems whose complexities are beyond most verification techniques' scalability. In this paper we introduce the idea of causality aid in falsification: by providing a falsification solver -- that relies on stochastic optimization of a certain cost function -- with suitable causal information expressed by a Bayesian network, the search for a falsifying input value can be made efficient. Our experimental results show the idea's viability.

1709.02435 2026-06-04 cs.AI cs.LG cs.SE cs.SY eess.SY 版本更新

An Analysis of ISO 26262: Using Machine Learning Safely in Automotive Software

ISO 26262分析:在汽车软件中安全使用机器学习

Rick Salay, Rodrigo Queiroz, Krzysztof Czarnecki

AI总结 本文分析了在汽车软件中使用机器学习对ISO 26262安全生命周期的影响,并提出适应该标准以容纳机器学习的建议。

Comments 6 pages, 3 figures

详情
AI中文摘要

机器学习(ML)在高级驾驶辅助和自动驾驶功能中的作用日益增加;然而,其在安全认证方面的充分性仍存在争议。本文分析了将ML作为实现方法对ISO 26262安全生命周期的影响,并探讨了如何解决这些问题。我们随后提供了一套建议,说明如何调整标准以适应机器学习。

英文摘要

Machine learning (ML) plays an ever-increasing role in advanced automotive functionality for driver assistance and autonomous operation; however, its adequacy from the perspective of safety certification remains controversial. In this paper, we analyze the impacts that the use of ML as an implementation approach has on the ISO 26262 safety lifecycle and ask what could be done to address them. We then provide a set of recommendations on how to adapt the standard to accommodate ML.

1709.01237 2026-06-04 cs.CV cs.LG cs.NA math.NA 版本更新

Newton-type Methods for Inference in Higher-Order Markov Random Fields

牛顿型方法在高阶马尔可夫随机场推断中的应用

Hariprasad Kannan, Nikos Komodakis, Nikos Paragios

AI总结 本文研究了在高阶马尔可夫随机场推断中使用牛顿型方法求解拉格朗日对偶问题的益处,提出了一种收敛性可证且高效的框架,包含Hessian矩阵构建的计算复杂度与精度的平衡策略、阻尼策略、截断策略与通用预条件器的结合,以及稀疏团势能的高效求和-乘积计算。

Comments 10 pages, 3 figures, 3 tables, CVPR 2017

详情
Journal ref
Poster at IEEE International Conference on Computer Vision and Pattern Recognition 2017
AI中文摘要

线性规划松弛是离散马尔可夫随机场MAP推断中的核心方法。正确求解拉格朗日对偶问题的能力是此类方法的关键组成部分。本文研究了使用牛顿型方法求解平滑版本问题的拉格朗日对偶问题的益处。我们探讨了其在实现更优收敛行为和更好地处理公式中的病态性质方面的能力,与一阶方法相比。我们证明了确实可以高效地应用信任区域牛顿方法,以解决广泛MAP推断问题。本文提出了一种可证收敛且高效的框架,包括(i)在Hessian矩阵构建方面计算复杂度和精度之间的良好平衡,(ii)一种有助于高效优化的阻尼策略,(iii)一种与通用共轭梯度预条件器结合的截断策略,(iv)稀疏团势能的高效求和-乘积计算。高阶马尔可夫随机场的结果展示了这种方法的潜力。

英文摘要

Linear programming relaxations are central to MAP inference in discrete Markov Random Fields. The ability to properly solve the Lagrangian dual is a critical component of such methods. In this paper, we study the benefit of using Newton-type methods to solve the Lagrangian dual of a smooth version of the problem. We investigate their ability to achieve superior convergence behavior and to better handle the ill-conditioned nature of the formulation, as compared to first-order methods. We show that it is indeed possible to efficiently apply a trust region Newton method for a broad range of MAP inference problems. In this paper we propose a provably convergent and efficient framework that includes (i) an excellent compromise between computational complexity and precision concerning the Hessian matrix construction, (ii) a damping strategy that aids efficient optimization, (iii) a truncation strategy coupled with a generic pre-conditioner for Conjugate Gradients, (iv) efficient sum-product computation for sparse clique potentials. Results for higher-order Markov Random Fields demonstrate the potential of this approach.

1607.03428 2026-06-04 cs.LG cs.SY eess.SY quant-ph stat.ML 版本更新

Learning in Quantum Control: High-Dimensional Global Optimization for Noisy Quantum Dynamics

量子控制中的学习:用于噪声量子动力学的高维全局优化

Pantita Palittapongarnpim, Peter Wittek, Ehsan Zahedinejad, Shakib Vedaie, Barry C. Sanders

AI总结 本文提出使用差分进化算法解决高维量子系统中非凸优化问题,通过改进控制保真度和引入启发式方法提升计算效率,展示在量子相位估计和量子门设计中的优越性能。

Comments 32 pages, 4 figures, extension of proceedings in ESANN 2016 conference submitted to Neurocomputing

详情
Journal ref
Neurocomputing 268 (2017) 116-126
AI中文摘要

量子控制在多种量子技术中具有重要价值,如通用量子计算中的高保真门、自适应量子增强计量和超冷原子操控。尽管监督学习和强化学习广泛用于优化经典系统的控制参数,但量子参数优化主要通过基于梯度的贪心算法进行。虽然量子适应度景观通常与贪心算法兼容,但在高维量子系统中贪心算法可能表现不佳。本文采用差分进化算法克服非凸优化的停滞问题,通过平均目标函数提升噪声系统中的量子控制保真度。为减少计算成本,引入了运行终止的启发式方法和自适应搜索子空间选择。我们的实现是大规模并行和向量化的,以进一步减少运行时间。通过量子相位估计和量子门设计两个示例,我们展示了在保真度和可扩展性方面优于贪心算法的结果。

英文摘要

Quantum control is valuable for various quantum technologies such as high-fidelity gates for universal quantum computing, adaptive quantum-enhanced metrology, and ultra-cold atom manipulation. Although supervised machine learning and reinforcement learning are widely used for optimizing control parameters in classical systems, quantum control for parameter optimization is mainly pursued via gradient-based greedy algorithms. Although the quantum fitness landscape is often compatible with greedy algorithms, sometimes greedy algorithms yield poor results, especially for large-dimensional quantum systems. We employ differential evolution algorithms to circumvent the stagnation problem of non-convex optimization. We improve quantum control fidelity for noisy systems by averaging over the objective function. To reduce computational cost, we introduce heuristics for early termination of runs and for adaptive selection of search subspaces. Our implementation is massively parallel and vectorized to reduce run time even further. We demonstrate our methods with two examples, namely quantum phase estimation and quantum gate design, for which we achieve fidelity and scalability superior to those obtained using greedy algorithms.
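For readers unfamiliar with differential evolution, a generic sketch of the classic rand/1/bin variant follows. It is not the authors' implementation: the Rastrigin test function stands in for a quantum control objective, and the population size, mutation factor `F`, crossover rate `CR`, and in-place population update are illustrative assumptions. The noise averaging, early termination, and subspace selection heuristics mentioned in the abstract are not included.

```python
import math
import random


def rastrigin(x):
    """Standard non-convex test function with many local minima."""
    return 10 * len(x) + sum(xi * xi - 10 * math.cos(2 * math.pi * xi) for xi in x)


def differential_evolution(f, bounds, pop_size=20, F=0.7, CR=0.9, gens=100, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    fit = [f(p) for p in pop]
    history = [min(fit)]  # best objective value after each generation
    for _ in range(gens):
        for i in range(pop_size):
            # Three distinct members other than i drive the mutation.
            a, b, c = rng.sample([j for j in range(pop_size) if j != i], 3)
            j_rand = rng.randrange(dim)  # guarantees at least one mutated gene
            trial = []
            for j in range(dim):
                if rng.random() < CR or j == j_rand:
                    v = pop[a][j] + F * (pop[b][j] - pop[c][j])
                    lo, hi = bounds[j]
                    v = min(max(v, lo), hi)  # clip to the search box
                else:
                    v = pop[i][j]
                trial.append(v)
            ft = f(trial)
            if ft <= fit[i]:  # greedy selection: keep the trial only if no worse
                pop[i], fit[i] = trial, ft
        history.append(min(fit))
    best = min(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best], history


best_x, best_f, history = differential_evolution(rastrigin, [(-5.12, 5.12)] * 2)
print(best_x, best_f)
```

Because selection is greedy per individual, the best objective value in the population never gets worse from one generation to the next, while the difference-vector mutation lets the search jump between basins instead of following a local gradient.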

1504.01982 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Adaptive Diffusion Schemes for Heterogeneous Networks

异构网络中的自适应扩散方案

Jesus Fernandez-Bes, Jerónimo Arenas-García, Magno T. M. Silva, Luis A. Azpicueta-Ruiz

AI总结 本文针对异构扩散网络中的分布式估计问题,提出了一种自适应组合策略,通过解耦适应与组合阶段,实现局部估计,并通过最小化网络均方误差优化组合器,实验显示其优于现有技术。

Comments To appear in IEEE Transactions on Signal Processing. URL: http://ieeexplore.ieee.org/document/8010454/

详情
Journal ref
IEEE Transactions on Signal Processing, vol. 65, no. 21, Nov. 1, 2017
AI中文摘要

本文研究了异构网络中的扩散策略,提出了一种解耦适应与组合阶段的自适应方案,通过最小化网络均方误差优化组合器,与传统Adapt-then-Combine方案相比,实验表明该方法在异构网络中表现更优。

英文摘要

In this paper, we deal with distributed estimation problems in diffusion networks with heterogeneous nodes, i.e., nodes that either implement different adaptive rules or differ in some other aspect such as the filter structure or length, or step size. Although such heterogeneous networks have been considered from the first works on diffusion networks, obtaining practical and robust schemes to adaptively adjust the combiners in different scenarios is still an open problem. In this paper, we study a diffusion strategy specially designed and suited to heterogeneous networks. Our approach is based on two key ingredients: 1) the adaptation and combination phases are completely decoupled, so that network nodes keep purely local estimations at all times; and 2) combiners are adapted to minimize estimates of the network mean-square-error. Our scheme is compared with the standard Adapt-then-Combine scheme and theoretically analyzed using energy conservation arguments. Several experiments involving networks with heterogeneous nodes show that the proposed decoupled Adapt-then-Combine approach with adaptive combiners outperforms other state-of-the-art techniques, becoming a competitive approach in these scenarios.

1708.09342 2026-06-04 eess.SY cs.LG cs.RO cs.SY math.OC 版本更新

Optimal and Learning Control for Autonomous Robots

自主机器人最优与学习控制

Jonas Buchli, Farbod Farshidian, Alexander Winkler, Timothy Sandy, Markus Giftthaler

AI总结 本文基于最优控制与强化学习,从统一视角探讨自主机器人闭环控制问题,提供统一符号和术语对比,帮助理解不同领域方法。

Comments Lecture Notes, 101 pages

详情
AI中文摘要

自主机器人最优与学习控制课程在苏黎世联邦理工学院的机器人、系统与控制硕士项目中教授,旨在从统一视角教授最优控制和强化学习以解决闭环控制问题。起始点是制定最优控制问题并由此推导出不同类型的解决方案和算法。这些讲义力求在可能的情况下使用统一的符号,并提供一些术语和符号的翻译帮助,以比较不同领域的术语和符号。该课程假定具备控制理论、线性代数和随机微积分的基础知识。

英文摘要

Optimal and Learning Control for Autonomous Robots has been taught in the Robotics, Systems and Controls Masters at ETH Zurich with the aim to teach optimal control and reinforcement learning for closed loop control problems from a unified point of view. The starting point is the formulation of an optimal control problem, from which the different types of solutions and algorithms are derived. These lecture notes aim at supporting this unified view with a unified notation wherever possible, and a bit of translation help to compare the terminology and notation in the different fields. The course assumes basic knowledge of Control Theory, Linear Algebra and Stochastic Calculus.

1708.09165 2026-06-04 math.NA cs.LG cs.NA 版本更新

Tensor Networks for Dimensionality Reduction and Large-Scale Optimizations. Part 2 Applications and Future Perspectives

张量网络用于降维和大规模优化。第二部分:应用与未来展望

A. Cichocki, A-H. Phan, Q. Zhao, N. Lee, I. V. Oseledets, M. Sugiyama, D. Mandic

AI总结 本文第二部分探讨了张量网络在数据/参数超压缩高阶表示及相关成本函数中的应用,重点介绍张量列车(TT)和分层Tucker(HT)分解及其物理意义,展示其在机器学习和数据分析中的潜力。

Comments 232 pages

详情
Journal ref
Foundations and Trends in Machine Learning: Vol. 9: No. 6, pp 431-673, 2017
AI中文摘要

本专著第二部分基于第一部分介绍的张量网络及其操作,聚焦于张量网络模型在数据/参数的超压缩高阶表示及相关成本函数中的应用,概述其在机器学习和数据分析中的应用。特别强调张量列车(TT)和分层Tucker(HT)分解及其具有物理意义的解释,反映张量网络方法的可扩展性。通过图示方法,阐述了通过底层低秩张量近似和核心张量的复杂收缩,张量网络能够执行分布式计算,从而缓解或消除维度灾难。该概念在广义回归和分类(支持张量机、典型相关分析、高阶偏最小二乘)、广义特征值分解、黎曼优化和深度神经网络优化等多个应用领域中得到了验证。本工作的第一部分和第二部分可以作为单独的文本使用,也可以作为低秩张量网络和张量分解这一激动人心领域的综合综述。

英文摘要

Part 2 of this monograph builds on the introduction to tensor networks and their operations presented in Part 1. It focuses on tensor network models for super-compressed higher-order representation of data/parameters and related cost functions, while providing an outline of their applications in machine learning and data analytics. A particular emphasis is on the tensor train (TT) and Hierarchical Tucker (HT) decompositions, and their physically meaningful interpretations which reflect the scalability of the tensor network approach. Through a graphical approach, we also elucidate how, by virtue of the underlying low-rank tensor approximations and sophisticated contractions of core tensors, tensor networks have the ability to perform distributed computations on otherwise prohibitively large volumes of data/parameters, thereby alleviating or even eliminating the curse of dimensionality. The usefulness of this concept is illustrated over a number of applied areas, including generalized regression and classification (support tensor machines, canonical correlation analysis, higher order partial least squares), generalized eigenvalue decomposition, Riemannian optimization, and in the optimization of deep neural networks. Part 1 and Part 2 of this work can be used either as stand-alone separate texts, or indeed as a conjoint comprehensive review of the exciting field of low-rank tensor networks and tensor decompositions.

1708.08552 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

An inexact subsampled proximal Newton-type method for large-scale machine learning

一种用于大规模机器学习的近似子采样近端牛顿型方法

Xuanqing Liu, Cho-Jui Hsieh, Jason D. Lee, Yuekai Sun

AI总结 本文提出一种快速近端牛顿型算法,通过子采样构造牛顿子问题,提升大规模优化效率,实验验证其在ℓ₁正则化逻辑回归中的优越性。

详情
AI中文摘要

我们提出了一种快速的近端牛顿型算法,用于最小化带有正则化的有限和。该算法能够在$\tilde{\mathcal{O}}(d(n + \sqrt{κd})\log(\frac{1}ε))$ FLOPS内返回一个$ε$-次优解,其中$n$是样本数,$d$是特征维度,$κ$是条件数。只要$n > d$,该方法比最先进的加速随机一阶方法更高效,后者需要$\tilde{\mathcal{O}}(d(n + \sqrt{κn})\log(\frac{1}ε))$ FLOPS。关键思想是通过子采样构造牛顿子问题,以保持目标函数的有限和结构,从而利用最近的随机一阶方法进展来求解子问题。实验结果验证了所提算法在真实数据集上的ℓ₁正则化逻辑回归任务中优于先前算法。

英文摘要

We propose a fast proximal Newton-type algorithm for minimizing regularized finite sums that returns an $ε$-suboptimal point in $\tilde{\mathcal{O}}(d(n + \sqrt{κd})\log(\frac{1}ε))$ FLOPS, where $n$ is the number of samples, $d$ is the feature dimension, and $κ$ is the condition number. As long as $n > d$, the proposed method is more efficient than state-of-the-art accelerated stochastic first-order methods for non-smooth regularizers, which require $\tilde{\mathcal{O}}(d(n + \sqrt{κn})\log(\frac{1}ε))$ FLOPS. The key idea is to form the subsampled Newton subproblem in a way that preserves the finite sum structure of the objective, thereby allowing us to leverage recent developments in stochastic first-order methods to solve the subproblem. Experimental results verify that the proposed algorithm outperforms previous algorithms for $\ell_1$-regularized logistic regression on real datasets.
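To see why the condition $n > d$ matters, it can help to plug numbers into the two bounds. The values below are hypothetical and chosen only for illustration; the comparison ignores the shared factor $d\log(1/ε)$ and the constants and logarithmic terms hidden in $\tilde{\mathcal{O}}$.

```latex
% Hypothetical values for illustration: n = 10^6, d = 10^2, \kappa = 10^8
\begin{align*}
n + \sqrt{\kappa d} &= 10^{6} + \sqrt{10^{8}\cdot 10^{2}} = 10^{6} + 10^{5} = 1.1\times 10^{6},\\
n + \sqrt{\kappa n} &= 10^{6} + \sqrt{10^{8}\cdot 10^{6}} = 10^{6} + 10^{7} = 1.1\times 10^{7}.
\end{align*}
```

Under these assumed values the first bound is smaller by a factor of 10. The gap comes entirely from replacing $\sqrt{κn}$ with $\sqrt{κd}$, so it disappears once $κ$ is small enough that $n$ dominates both sums.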

1708.07850 2026-06-04 cs.LG cs.CV cs.NA math.NA 版本更新

Structured Low-Rank Matrix Factorization: Global Optimality, Algorithms, and Applications

结构低秩矩阵分解:全局最优性、算法与应用

Benjamin D. Haeffele, Rene Vidal

AI总结 本文提出一种适用于大规模数据集的矩阵分解技术,通过特定正则化形式捕捉额外结构,证明在因子规模足够时局部极小值即为全局极小值,并展示在神经钙成像视频分割和高光谱压缩恢复中的优势。

详情
AI中文摘要

近年来,低秩矩阵分解问题凸形式在机器学习中受到广泛关注。然而,此类形式往往需要求解与数据矩阵同样大小的矩阵,难以应用于大规模数据集。此外,在许多应用中,数据可能表现出超越单纯低秩的结构,例如图像和视频呈现复杂的时空结构,而标准低秩方法大多忽略这些结构。本文研究了一种适用于大规模数据集的矩阵分解技术,通过特定形式的正则化捕捉额外结构,该正则化包括总变分和核范数等已知正则化器作为特例。尽管所得优化问题非凸,我们证明在因子规模足够时,若满足某些条件,则因子的任何局部极小值即为全局极小值。此外,本文还提供了几种实用算法来解决矩阵分解问题,并推导了近似解到全局最优解距离的界。神经钙成像视频分割和高光谱压缩恢复的示例展示了该方法在高维数据集中的优势。

英文摘要

Recently, convex formulations of low-rank matrix factorization problems have received considerable attention in machine learning. However, such formulations often require solving for a matrix of the size of the data matrix, making it challenging to apply them to large scale datasets. Moreover, in many applications the data can display structures beyond simply being low-rank, e.g., images and videos present complex spatio-temporal structures that are largely ignored by standard low-rank methods. In this paper we study a matrix factorization technique that is suitable for large datasets and captures additional structure in the factors by using a particular form of regularization that includes well-known regularizers such as total variation and the nuclear norm as particular cases. Although the resulting optimization problem is non-convex, we show that if the size of the factors is large enough, under certain conditions, any local minimizer for the factors yields a global minimizer. A few practical algorithms are also provided to solve the matrix factorization problem, and bounds on the distance from a given approximate solution of the optimization problem to the global optimum are derived. Examples in neural calcium imaging video segmentation and hyperspectral compressed recovery show the advantages of our approach on high-dimensional datasets.

1610.05984 2026-06-04 cs.NE cs.AI cs.LG cs.SY eess.SY 版本更新

Particle Swarm Optimization for Generating Interpretable Fuzzy Reinforcement Learning Policies

粒子群优化用于生成可解释的模糊强化学习策略

Daniel Hein, Alexander Hentschel, Thomas Runkler, Steffen Udluft

AI总结 本文提出一种基于模糊粒子群强化学习(FPSRL)的方法,通过训练参数在模拟真实系统动态的世界模型上生成可解释的模糊强化学习策略,适用于无法进行在线学习的领域。

详情
Journal ref
Engineering Applications of Artificial Intelligence, Volume 65C, October 2017, Pages 87-98
AI中文摘要

模糊控制器是用于连续状态和动作空间的有效且可解释的系统控制器。到目前为止,此类控制器要么是手动构建的,要么是通过使用专家生成的问题特定成本函数或结合详细的最优控制策略知识自动训练的。在大多数现实世界的强化学习(RL)问题中,这两种要求都不存在。在这些应用中,由于在线学习需要在策略训练期间探索问题的动力学,因此通常禁止在线学习。我们引入了一种模糊粒子群强化学习(FPSRL)方法,该方法仅通过在模拟真实系统动态的世界模型上训练参数来构建模糊RL策略。这些世界模型是通过使用之前生成的转换样本的自主机器学习技术创建的。据我们所知,这种方法是首次将自组织模糊控制器与基于模型的批量RL相关联的。因此,FPSRL旨在解决那些禁止在线学习、系统动态相对容易从先前生成的默认策略转换样本中建模,并且预计存在相对易于解释的控制策略的领域的问题。通过使用三个标准RL基准,即山车、平衡小车和小车摆起,证明了所提出方法在这些领域中的效率。我们的实验结果展示了高性能且可解释的模糊策略。

英文摘要

Fuzzy controllers are efficient and interpretable system controllers for continuous state and action spaces. To date, such controllers have been constructed manually or trained automatically either using expert-generated problem-specific cost functions or incorporating detailed knowledge about the optimal control strategy. Both requirements for automatic training processes are not found in most real-world reinforcement learning (RL) problems. In such applications, online learning is often prohibited for safety reasons because online learning requires exploration of the problem's dynamics during policy training. We introduce a fuzzy particle swarm reinforcement learning (FPSRL) approach that can construct fuzzy RL policies solely by training parameters on world models that simulate real system dynamics. These world models are created by employing an autonomous machine learning technique that uses previously generated transition samples of a real system. To the best of our knowledge, this approach is the first to relate self-organizing fuzzy controllers to model-based batch RL. Therefore, FPSRL is intended to solve problems in domains where online learning is prohibited, system dynamics are relatively easy to model from previously generated default policy transition samples, and it is expected that a relatively easily interpretable control policy exists. The efficiency of the proposed approach with problems from such domains is demonstrated using three standard RL benchmarks, i.e., mountain car, cart-pole balancing, and cart-pole swing-up. Our experimental results demonstrate high-performing, interpretable fuzzy policies.

1601.08068 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

System Identification through Online Sparse Gaussian Process Regression with Input Noise

通过带输入噪声的在线稀疏高斯过程回归进行系统辨识

Hildo Bijl, Thomas B. Schön, Jan-Willem van Wingerden, Michel Verhaegen

AI总结 本文提出一种在线稀疏高斯过程回归算法,解决高斯过程回归在计算效率、在线更新和处理噪声输入方面的不足,实验表明其在非线性黑盒系统建模中性能优异。

详情
AI中文摘要

近年来,非参数回归方法如高斯过程(GP)回归在系统辨识中受到越来越多关注。传统高斯过程回归有三个重要缺点:(1)计算成本高,(2)无法高效在线实现新测量值,(3)无法处理随机(噪声)输入点。本文提出一种算法同时解决这三个问题。所提出的稀疏在线噪声输入高斯过程(SONIG)回归算法可以在常数时间内纳入新的噪声测量值。实验表明,其比现有回归算法更准确。当应用于非线性黑盒系统建模时,其性能与现有非线性ARX模型相媲美。

英文摘要

There has been a growing interest in using non-parametric regression methods like Gaussian Process (GP) regression for system identification. GP regression does traditionally have three important downsides: (1) it is computationally intensive, (2) it cannot efficiently implement newly obtained measurements online, and (3) it cannot deal with stochastic (noisy) input points. In this paper we present an algorithm tackling all these three issues simultaneously. The resulting Sparse Online Noisy Input GP (SONIG) regression algorithm can incorporate new noisy measurements in constant runtime. A comparison has shown that it is more accurate than similar existing regression algorithms. When applied to non-linear black-box system modeling, its performance is competitive with existing non-linear ARX models.

1708.03366 2026-06-04 cs.LG cs.AI cs.CR cs.SY eess.SY 版本更新

Resilient Linear Classification: An Approach to Deal with Attacks on Training Data

鲁棒线性分类:一种应对训练数据攻击的方法

Sangdon Park, James Weimer, Insup Lee

AI总结 本文提出一种鲁棒线性分类方法,通过引入多数约束,提高对抗训练数据攻击的鲁棒性,验证了传统算法在攻击下的脆弱性。

Comments Accepted as a conference paper at ICCPS17

详情
AI中文摘要

数据驱动技术用于控制自动驾驶车辆、处理能源管理的需求响应以及建模人体生理学用于医疗设备。这些技术从训练数据中提取模型,其性能通常基于训练数据中的随机误差进行分析。然而,如果训练数据被攻击者恶意篡改,这些攻击对数据驱动CPS底层学习算法的影响尚未被考虑。本文分析了分类算法对训练数据攻击的鲁棒性。具体而言,提出了一种通用度量标准,用于衡量分类算法对训练数据最坏情况篡改的鲁棒性。使用该度量标准,我们显示传统线性分类算法在受限条件下具有鲁棒性。为克服这些限制,我们提出了一种具有多数约束的线性分类算法,并证明其比传统算法更鲁棒。在合成数据和一个现实世界的回顾性心律失常医疗案例研究中的评估显示,传统算法对篡改的训练数据易受攻击,而所提算法更具鲁棒性(以最坏情况篡改衡量)。

英文摘要

Data-driven techniques are used in cyber-physical systems (CPS) for controlling autonomous vehicles, handling demand responses for energy management, and modeling human physiology for medical devices. These data-driven techniques extract models from training data, where their performance is often analyzed with respect to random errors in the training data. However, if the training data is maliciously altered by attackers, the effect of these attacks on the learning algorithms underpinning data-driven CPS has yet to be considered. In this paper, we analyze the resilience of classification algorithms to training data attacks. Specifically, a generic metric is proposed that is tailored to measure resilience of classification algorithms with respect to worst-case tampering of the training data. Using the metric, we show that traditional linear classification algorithms are resilient under restricted conditions. To overcome these limitations, we propose a linear classification algorithm with a majority constraint and prove that it is strictly more resilient than the traditional algorithms. Evaluations on both synthetic data and a real-world retrospective arrhythmia medical case-study show that the traditional algorithms are vulnerable to tampered training data, whereas the proposed algorithm is more resilient (as measured by worst-case tampering).

1610.05261 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

A probabilistic model for the numerical solution of initial value problems

初值问题数值解的概率模型

Michael Schober, Simo Särkkä, Philipp Hennig

AI总结 本文提出将初值问题解法视为对潜在路径的推断,连接了广义线性方法、Runge-Kutta方法和Nordsieck方法,揭示了经典方法的隐含假设和不确定性处理。

Comments 23 pages, 11 figures

详情
AI中文摘要

与许多数值方法类似,初值问题求解器通过可计算结果估计不可解析量。本文将求解过程视为对一条潜在路径的推断,该路径是高斯过程概率测度的一次抽样,并展示了该类算法与广义线性方法、Runge-Kutta方法和Nordsieck方法的联系。这种概率框架在分析上突显了隐含的先验假设,并在实践中为不确定性和先验信息的处理提供了“对接点”。

英文摘要

Like many numerical methods, solvers for initial value problems (IVPs) on ordinary differential equations estimate an analytically intractable quantity, using the results of tractable computations as inputs. This structure is closely connected to the notion of inference on latent variables in statistics. We describe a class of algorithms that formulate the solution to an IVP as inference on a latent path that is a draw from a Gaussian process probability measure (or equivalently, the solution of a linear stochastic differential equation). We then show that certain members of this class are connected precisely to generalized linear methods for ODEs, a number of Runge--Kutta methods, and Nordsieck methods. This probabilistic formulation of classic methods is valuable in two ways: analytically, it highlights implicit prior assumptions favoring certain approximate solutions to the IVP over others, and gives a precise meaning to the old observation that these methods act like filters. Practically, it endows the classic solvers with `docking points' for notions of uncertainty and prior information about the initial value, the value of the ODE itself, and the solution of the problem.

1610.02067 2026-06-04 cs.GT cs.IT cs.LG cs.SY eess.SY math.IT 版本更新

Stochastic Games for Smart Grid Energy Management with Prospect Prosumers

面向智能电网的能量管理的随机博弈:考虑前景理论的产消者

Seyed Rasoul Etesami, Walid Saad, Narayan Mandayam, H. Vincent Poor

AI总结 本文研究了在随机动态下智能电网能量管理问题,通过随机博弈模型考虑产消者的行为,利用前景理论构建收益函数,并提出分布式算法以实现纳什均衡。

详情
AI中文摘要

本文研究了在随机动态下智能电网能量管理问题。在所考虑的模型中,假设消费者可以作为拥有可再生能源的产消者,既能生产也能消费能源。由于产消者决策与可再生能源的随机性耦合,产消者间的交互被建模为随机博弈,其中每个产消者通过控制其能源消费和需求来最大化收益。特别地,利用前景理论,将产消者的主观行为显式地反映到其收益函数中。对于基于前景的随机博弈,证明总存在一个平稳纳什均衡,其中产消者的交易策略与时间及其历史无关。此外,提出了一种无需产消者间信息共享的新型分布式算法,并证明其收敛于ε-纳什均衡。另一方面,在供给侧,电力公司与产消者之间的交互被建模为在线优化问题,其中电力公司的目标是学习其最优能源分配规则。对于这种情况,证明该优化问题具有无遗憾算法,即无论产消者间博弈的实际结果如何,电力公司都可以遵循一种策略来降低其分配成本,如同事先知道整个需求市场一样。仿真结果展示了所提算法收敛到预测结果的能力,并揭示了前景理论带来的新见解,有助于更高效的智能电网能量管理。

英文摘要

In this paper, the problem of smart grid energy management under stochastic dynamics is investigated. In the considered model, at the demand side, it is assumed that customers can act as prosumers who own renewable energy sources and can both produce and consume energy. Due to the coupling between the prosumers' decisions and the stochastic nature of renewable energy, the interaction among prosumers is formulated as a stochastic game, in which each prosumer seeks to maximize its payoff, in terms of revenues, by controlling its energy consumption and demand. In particular, the subjective behavior of prosumers is explicitly reflected into their payoff functions using prospect theory, a powerful framework that allows modeling real-life human choices. For this prospect-based stochastic game, it is shown that there always exists a stationary Nash equilibrium where the prosumers' trading policies in the equilibrium are independent of the time and their histories of the play. Moreover, a novel distributed algorithm with no information sharing among prosumers is proposed and shown to converge to an $ε$-Nash equilibrium. On the other hand, at the supply side, the interaction between the utility company and the prosumers is formulated as an online optimization problem in which the utility company's goal is to learn its optimal energy allocation rules. For this case, it is shown that such an optimization problem admits a no-regret algorithm meaning that regardless of the actual outcome of the game among the prosumers, the utility company can follow a strategy that mitigates its allocation costs as if it knew the entire demand market a priori. Simulation results show the convergence of the proposed algorithms to their predicted outcomes and present new insights resulting from prospect theory that contribute toward more efficient energy management in the smart grids.

1605.01278 2026-06-04 stat.ML cs.LG cs.SY eess.SY math.DS math.PR 版本更新

A Bayesian Approach to Policy Recognition and State Representation Learning

基于贝叶斯方法的策略识别与状态表示学习

Adrian Šošić, Abdelhak M. Zoubir, Heinz Koeppl

AI总结 本文提出一种贝叶斯方法,用于在不假设专家行为最优的情况下,学习任意随机专家策略,并推断专家使用的状态表示复杂度及任务相关的状态空间划分。

Comments 17 pages, 8 figures; ### Version 4 ### to appear in IEEE Transactions on Pattern Analysis and Machine Intelligence

详情
AI中文摘要

示范学习(LfD)是利用专家提供的示范构建任务行为模型的过程。这些模型可用于系统控制,将专家示范泛化到此前未遇到的情况。然而,大多数LfD方法对专家行为做出较强假设,例如假设存在确定性的最优真值策略,或需要直接监测专家的控制量,这限制了它们在一般系统辨识框架中的实际应用。本文在更一般的设定下考虑LfD问题,允许任意随机专家策略,且不对示范的最优性做推断。采用贝叶斯方法,我们对能够解释示范数据的所有可能专家控制器的完整后验分布进行建模。此外,我们展示了该方法可应用于非参数情形,以推断专家所用状态表示的复杂度,并学习与任务相适应的系统状态空间划分。

英文摘要

Learning from demonstration (LfD) is the process of building behavioral models of a task from demonstrations provided by an expert. These models can be used e.g. for system control by generalizing the expert demonstrations to previously unencountered situations. Most LfD methods, however, make strong assumptions about the expert behavior, e.g. they assume the existence of a deterministic optimal ground truth policy or require direct monitoring of the expert's controls, which limits their practical use as part of a general system identification framework. In this work, we consider the LfD problem in a more general setting where we allow for arbitrary stochastic expert policies, without reasoning about the optimality of the demonstrations. Following a Bayesian methodology, we model the full posterior distribution of possible expert controllers that explain the provided demonstration data. Moreover, we show that our methodology can be applied in a nonparametric context to infer the complexity of the state representation used by the expert, and to learn task-appropriate partitionings of the system state space.

1707.09428 2026-06-04 math.NA cs.LG cs.NA 版本更新

A unified method for super-resolution recovery and real exponential-sum separation

超分辨率恢复与实指数和分离的统一方法

Charles K. Chui, Hrushikesh N. Mhaskar

AI总结 本文提出一种统一方法,解决多变量超分辨率问题和盲源分离实值指数和问题,适用于荧光显微镜、天文观测及磁共振成像等应用。

详情
AI中文摘要

本文受衍射传播光波的启发,提出一个简单的数学模型,用于多变量超分辨率问题和实值指数和的盲源分离问题。该模型促进了本文中两个问题的统一理论和统一解决方案的发展。超分辨率问题的研究旨在应用于荧光显微镜和天文观测,而第二个问题的动机是当前需要从磁共振波谱学中提取多变量指数特征,以帮助神经科医生和放射科医生,并为核化学中的同位素分离提供数学工具。本文介绍的统一方法可通过处理有限数量的数据实现,这些数据在非预先指定的位置采样,计算方案仅包括矩阵-向量乘法、峰值寻找和聚类。

英文摘要

In this paper, motivated by diffraction of traveling light waves, a simple mathematical model is proposed, both for the multivariate super-resolution problem and the problem of blind-source separation of real-valued exponential sums. This model facilitates the development of a unified theory and a unified solution of both problems in this paper. Our consideration of the super-resolution problem is aimed at applications to fluorescence microscopy and observational astronomy, and the motivation for our consideration of the second problem is the current need of extracting multivariate exponential features in magnetic resonance spectroscopy (MRS) for the neurologist and radiologist as well as for providing a mathematical tool for isotope separation in Nuclear Chemistry. The unified method introduced in this paper can be easily realized by processing only finitely many data, sampled at locations that are not necessarily prescribed in advance, with a computational scheme consisting only of matrix-vector multiplication, peak finding, and clustering.

1706.03369 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO 版本更新

On the Sampling Problem for Kernel Quadrature

关于核求积的采样问题

Francois-Xavier Briol, Chris J. Oates, Jon Cockayne, Wilson Ye Chen, Mark Girolami

AI总结 本文探讨了核求积中采样分布对收敛速率常数的影响,提出基于自适应温度调节和序列蒙特卡罗的自动方法,显著降低积分误差。

Comments To appear at Thirty-fourth International Conference on Machine Learning (ICML 2017)

详情
Journal ref
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:586-595, 2017
AI中文摘要

标准的核求积方法在使用随机点集进行数值积分时(也称为贝叶斯蒙特卡罗),其均方根误差的收敛速率由比值$s/d$决定,其中$s$和$d$分别表示被积函数的光滑性和维度。然而,实证研究显示速率常数$C$对随机点的分布高度敏感。标准蒙特卡罗积分的最优重要性采样已被充分理解;与之不同,对于核求积,使$C$最小的采样分布没有闭合形式。本文认为采样分布的实用选择是一个重要的开放问题,并考虑了一种解决方案:基于自适应温度调节和序列蒙特卡罗的新型自动方法。实证结果表明,该方法可使积分误差降低多达4个数量级。

英文摘要

The standard Kernel Quadrature method for numerical integration with random point sets (also called Bayesian Monte Carlo) is known to converge in root mean square error at a rate determined by the ratio $s/d$, where $s$ and $d$ encode the smoothness and dimension of the integrand. However, an empirical investigation reveals that the rate constant $C$ is highly sensitive to the distribution of the random points. In contrast to standard Monte Carlo integration, for which optimal importance sampling is well-understood, the sampling distribution that minimises $C$ for Kernel Quadrature does not admit a closed form. This paper argues that the practical choice of sampling distribution is an important open problem. One solution is considered: a novel automatic approach based on adaptive tempering and sequential Monte Carlo. Empirical results demonstrate that a dramatic reduction in integration error of up to 4 orders of magnitude can be achieved with the proposed method.

1707.09319 2026-06-04 stat.OT cs.LG cs.NA math.NA 版本更新

A Fourier-invariant method for locating point-masses and computing their attributes

一种用于定位点质量并计算其属性的傅里叶不变方法

Charles K. Chui, Hrushikesh N. Mhaskar

AI总结 本文提出一种有效方法,用于计数点质量、确定其空间位置并计算其属性,基于傅里叶不变的埃尔米特(Hermite)矩计算,适用于任意维度下空间数据和傅里叶数据的处理。

详情
AI中文摘要

受观察癌细胞在正常活细胞中的生长以及探索星系和恒星形成过程的启发,本文旨在介绍一种严谨有效的计数点质量、确定其空间位置并计算其属性的方法。基于傅里叶不变的埃尔米特(Hermite)矩计算,我们的方法便于处理任意维度下的空间数据和傅里叶数据。

英文摘要

Motivated by the interest of observing the growth of cancer cells among normal living cells and exploring how galaxies and stars are truly formed, the objective of this paper is to introduce a rigorous and effective method for counting point-masses, determining their spatial locations, and computing their attributes. Based on computation of Hermite moments that are Fourier-invariant, our approach facilitates the processing of both spatial and Fourier data in any dimension.

1707.08689 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Multi-Robot Transfer Learning: A Dynamical System Perspective

多机器人迁移学习:动态系统视角

Mohamed K. Helwa, Angela P. Schoellig

AI总结 本文从动态系统角度研究多机器人迁移学习中的最优转移映射性质,提出无需详细动力学知识的算法,通过实验验证该算法在四旋翼平台间迁移学习中减少60-70%的误差。

Comments 7 pages, 6 figures, accepted at the 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems

详情
AI中文摘要

多机器人迁移学习允许一个机器人利用第二个相似机器人生成的数据来改进自身行为。潜在优势是减少训练时间并降低训练阶段不可避免的风险。迁移学习算法旨在找到不同机器人之间的最优转移映射。本文通过单输入单输出(SISO)系统的理论研究,探讨了此类最优转移映射的性质。我们首先证明最优迁移学习映射通常是一个动态系统。本文的主要贡献是提供一种确定该最优动态映射性质的算法,包括其阶数和回归器(即它所依赖的变量)。所提出的算法不需要详细的机器人动力学知识,但依赖于通过简单实验测试可获得的基本系统属性。我们通过两个不同四旋翼平台间的迁移学习示例验证了所提算法。实验结果表明,通过我们的算法获得的最优动态映射在减少迁移学习误差方面比直接转移数据或使用最优静态映射的情况减少了60-70%。

英文摘要

Multi-robot transfer learning allows a robot to use data generated by a second, similar robot to improve its own behavior. The potential advantages are reducing the time of training and the unavoidable risks that exist during the training phase. Transfer learning algorithms aim to find an optimal transfer map between different robots. In this paper, we investigate, through a theoretical study of single-input single-output (SISO) systems, the properties of such optimal transfer maps. We first show that the optimal transfer learning map is, in general, a dynamic system. The main contribution of the paper is to provide an algorithm for determining the properties of this optimal dynamic map including its order and regressors (i.e., the variables it depends on). The proposed algorithm does not require detailed knowledge of the robots' dynamics, but relies on basic system properties easily obtainable through simple experimental tests. We validate the proposed algorithm experimentally through an example of transfer learning between two different quadrotor platforms. Experimental results show that an optimal dynamic map, with correct properties obtained from our proposed algorithm, achieves 60-70% reduction of transfer learning error compared to the cases when the data is directly transferred or transferred using an optimal static map.

1707.08369 2026-06-04 cs.LG cs.NA math.NA 版本更新

Updating Singular Value Decomposition for Rank One Matrix Perturbation

针对秩一矩阵扰动的奇异值分解更新

Ratnik Gandhi, Amoli Rajgor

AI总结 本文提出一种高效算法,用于在O(n² log(1/ε))时间内更新秩一扰动矩阵的奇异值分解,利用快速多极子方法在O(n log(1/ε))时间内更新奇异向量。

详情
AI中文摘要

一个高效的奇异值分解(SVD)算法是大数据问题中分布式和流式计算的重要工具。观察到秩一扰动矩阵的奇异向量更新类似于Cauchy矩阵-向量乘积。基于此观察,本文提出一种高效的算法,用于在O(n² log(1/ε))时间内更新秩一扰动矩阵的SVD。该方法利用快速多极子方法(FMM)在O(n log(1/ε))时间内更新奇异向量,其中ε是计算精度。

英文摘要

An efficient Singular Value Decomposition (SVD) algorithm is an important tool for distributed and streaming computation in big data problems. It is observed that the update of the singular vectors of a rank-1 perturbed matrix is similar to a Cauchy matrix-vector product. With this observation, in this paper, we present an efficient method for updating the Singular Value Decomposition of a rank-1 perturbed matrix in $O(n^2 \log(\frac{1}ε))$ time. The method uses the Fast Multipole Method (FMM) for updating singular vectors in $O(n \log(\frac{1}ε))$ time, where $ε$ is the precision of computation.
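As a reference point for the Cauchy matrix-vector product mentioned in the abstract, here is a minimal sketch of the naive version, which costs $O(n^2)$ operations. It is not the paper's FMM-accelerated routine; the function name and the sample inputs are assumptions for illustration. A Cauchy matrix has entries $C_{ij} = 1/(x_i - y_j)$ for two sets of nodes $x$ and $y$.

```python
def cauchy_matvec(x, y, v):
    """Naive product C @ v for the Cauchy matrix C[i][j] = 1 / (x[i] - y[j]).

    Assumes x[i] != y[j] for all i, j. Cost is O(len(x) * len(y)).
    """
    return [sum(vj / (xi - yj) for yj, vj in zip(y, v)) for xi in x]

# Small hand-checkable case:
#   row 0: 4/(2-1) + 6/(2-0) = 7
#   row 1: 4/(3-1) + 6/(3-0) = 4
result = cauchy_matvec([2.0, 3.0], [1.0, 0.0], [4.0, 6.0])
```

Replacing this double loop with an approximate evaluation at precision $ε$ is what brings the per-vector cost down to the $O(n \log(\frac{1}ε))$ quoted in the abstract.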

1502.02609 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Efficient model-based reinforcement learning for approximate online optimal

近似在线最优的高效基于模型的强化学习

Rushikesh Kamalapurkar, Joel A. Rosenfeld, Warren E. Dixon

AI总结 本文提出基于状态跟随核方法的在线近似最优控制策略,通过局部小邻域近似值函数实现稳定性和最优性的高效控制。

详情
AI中文摘要

本文针对确定性控制仿射非线性动力系统,采用状态跟随核方法在线求解无限时间最优调节问题。与传统方法不同,该方法在紧凑集内状态的局部小邻域内近似函数,仿真结果表明相比全局近似方法,显著减少基函数数量即可实现控制系统的稳定性和近似最优性。

英文摘要

In this paper the infinite horizon optimal regulation problem is solved online for a deterministic control-affine nonlinear dynamical system using the state following (StaF) kernel method to approximate the value function. Unlike traditional methods that aim to approximate a function over a large compact set, the StaF kernel method aims to approximate a function in a small neighborhood of a state that travels within a compact set. Simulation results demonstrate that stability and approximate optimality of the control system can be achieved with significantly fewer basis functions than may be required for global approximation methods.

1610.06283 2026-06-04 cs.RO cs.LG cs.NE cs.SY eess.SY 版本更新

Deep Neural Networks for Improved, Impromptu Trajectory Tracking of Quadrotors

用于四旋翼机即时轨迹跟踪的深度神经网络

Qiyang Li, Jingxing Qian, Zining Zhu, Xuchan Bao, Mohamed K. Helwa, Angela P. Schoellig

AI总结 本文提出基于深度神经网络的算法,通过提供定制参考输入提升经典反馈控制器的轨迹跟踪性能,实验表明该方法能有效减少跟踪误差,适用于实时轨迹跟踪应用。

Comments 7 pages, 8 figures. Accepted final version. To appear in the proc. of the 2017 IEEE International Conference on Robotics and Automation

详情
AI中文摘要

四旋翼机的轨迹跟踪控制对于应用范围从勘测和检查到影视制作都至关重要。然而,设计和调优经典控制器,如比例-积分-微分(PID)控制器,以实现高跟踪精度可能耗时且困难,由于隐藏动态和其他非理想因素。深度神经网络(DNN)凭借其卓越的近似抽象、非线性函数的能力,提出了一种增强轨迹跟踪控制的新方法。本文提出了一种基于DNN的算法作为附加模块,以提高经典反馈控制器的跟踪性能。给定期望轨迹,DNNs根据其获得的经验为控制器提供定制参考输入。输入旨在实现期望轨迹与输出轨迹之间的单位映射。这项工作的动机是交互式“画即飞”应用,用户在移动设备上绘制轨迹,四旋翼机即时飞越该轨迹,通过DNN增强的控制系统。实验结果表明,所提出的方法在DNNs在选定的周期轨迹上训练后,能够提高用户绘制轨迹的跟踪精度,表明该方法在现实应用中的潜力。跟踪误差在训练和测试轨迹上分别减少约40-50%,突显了DNNs在知识泛化方面的能力。

英文摘要

Trajectory tracking control for quadrotors is important for applications ranging from surveying and inspection, to film making. However, designing and tuning classical controllers, such as proportional-integral-derivative (PID) controllers, to achieve high tracking precision can be time-consuming and difficult, due to hidden dynamics and other non-idealities. The Deep Neural Network (DNN), with its superior capability of approximating abstract, nonlinear functions, proposes a novel approach for enhancing trajectory tracking control. This paper presents a DNN-based algorithm as an add-on module that improves the tracking performance of a classical feedback controller. Given a desired trajectory, the DNNs provide a tailored reference input to the controller based on their gained experience. The input aims to achieve a unity map between the desired and the output trajectory. The motivation for this work is an interactive "fly-as-you-draw" application, in which a user draws a trajectory on a mobile device, and a quadrotor instantly flies that trajectory with the DNN-enhanced control system. Experimental results demonstrate that the proposed approach improves the tracking precision for user-drawn trajectories after the DNNs are trained on selected periodic trajectories, suggesting the method's potential in real-world applications. Tracking errors are reduced by around 40-50% for both training and testing trajectories from users, highlighting the DNNs' capability of generalizing knowledge.

1508.03332 2026-06-04 math.NA cs.LG cs.MA cs.NA math.DS stat.ML 版本更新

Dimensionality Reduction of Collective Motion by Principal Manifolds

通过主流形进行集体运动的降维

Kelum Gajamannage, Sachit Butail, Maurizio Porfiri, Erik M. Bollt

AI总结 本文提出基于立方平滑样条构建二维主流形的方法,用于降维分析集体运动数据,保留原始结构并优于现有非线性降维方法。

Comments 19 pages, 13 figures, journal article

详情
Journal ref
Physica D: Nonlinear Phenomena, Volume 291, 15 January 2015, Pages 62-73
AI中文摘要

尽管已证明集体运动模式中存在低维嵌入流形,但现有非线性降维方法无法有效分析此类流形,主要原因是谱分解步骤限制了高维空间到嵌入空间的映射控制。本文提出一种替代方法,要求二维嵌入以拓扑方式总结高维数据。具体而言,我们直接在高维空间中使用立方平滑样条构建二维主流形,并用测地距离定义嵌入坐标。通过代表性示例,我们展示与现有非线性降维方法相比,主流形在噪声和稀疏数据集上仍能保留原始结构。主流形寻找算法应用于多个代理动态系统模拟复杂机动(捕食者围攻)得到的配置,并将所得二维嵌入与已建立的非线性降维方法进行比较。

英文摘要

While the existence of low-dimensional embedding manifolds has been shown in patterns of collective motion, the current battery of nonlinear dimensionality reduction methods is not amenable to the analysis of such manifolds. This is mainly due to the necessary spectral decomposition step, which limits control over the mapping from the original high-dimensional space to the embedding space. Here, we propose an alternative approach that demands a two-dimensional embedding which topologically summarizes the high-dimensional data. In this sense, our approach is closely related to the construction of one-dimensional principal curves that minimize orthogonal error to data points subject to smoothness constraints. Specifically, we construct a two-dimensional principal manifold directly in the high-dimensional space using cubic smoothing splines, and define the embedding coordinates in terms of geodesic distances. Thus, the mapping from the high-dimensional data to the manifold is defined in terms of local coordinates. Through representative examples, we show that compared to existing nonlinear dimensionality reduction methods, the principal manifold retains the original structure even in noisy and sparse datasets. The principal manifold finding algorithm is applied to configurations obtained from a dynamical system of multiple agents simulating a complex maneuver called predator mobbing, and the resulting two-dimensional embedding is compared with that of a well-established nonlinear dimensionality reduction method.

1707.05828 2026-06-04 cs.LG cs.NA math.NA 版本更新

A deep learning approach to diabetic blood glucose prediction

基于深度学习的糖尿病血糖预测方法

H. N. Mhaskar, S. V. Pereverzyev, M. D. van der Walt

AI总结 本文提出利用深度学习对糖尿病患者血糖进行30分钟预测,通过选取部分患者数据训练模型,验证深度学习在该任务中的优越性,并展示如何利用领域知识构建简洁的深度表示。

详情
Journal ref
Front. Appl. Math. Stat., 14 July 2017
AI中文摘要

我们考虑利用连续血糖监测设备测量的血糖水平进行30分钟预测,使用临床数据。虽然大多数此类研究针对单个患者,我们选取数据集中的一定比例患者作为训练数据,其余作为测试对象;即模型无需在数据集中的新患者上重新校准。我们展示了深度学习如何在该示例中优于浅层网络。一个创新点是展示如何利用领域知识构建简洁的深度表示。

英文摘要

We consider the question of 30-minute prediction of blood glucose levels measured by continuous glucose monitoring devices, using clinical data. While most studies of this nature deal with one patient at a time, we take a certain percentage of patients in the data set as training data, and test on the remainder of the patients; i.e., the machine need not re-calibrate on the new patients in the data set. We demonstrate how deep learning can outperform shallow networks in this example. One novelty is to demonstrate how a parsimonious deep representation can be constructed using domain knowledge.

1605.07246 2026-06-04 cs.LG cs.AI cs.NA math.NA 版本更新

Adaptive ADMM with Spectral Penalty Parameter Selection

自适应ADMM与谱惩罚参数选择

Zheng Xu, Mario A. T. Figueiredo, Tom Goldstein

AI总结 本文提出自适应ADMM算法,通过自适应调整惩罚参数实现快速收敛,提高算法鲁棒性与易用性。

Comments AISTATS 2017

详情
AI中文摘要

交替方向乘子法(ADMM)是一种解决广泛约束优化问题的通用工具,适用于可微或非可微的目标函数。不幸的是,其性能对惩罚参数高度敏感,这使ADMM往往不可靠,且非专业用户难以将其自动化。我们通过提出自适应调整惩罚参数的方法来克服这一缺点。得到的自适应ADMM(AADMM)算法受用于梯度下降的Barzilai-Borwein谱方法启发,实现了快速收敛,并且对初始步长和问题尺度相对不敏感。

英文摘要

The alternating direction method of multipliers (ADMM) is a versatile tool for solving a wide range of constrained optimization problems, with differentiable or non-differentiable objective functions. Unfortunately, its performance is highly sensitive to a penalty parameter, which makes ADMM often unreliable and hard to automate for a non-expert user. We tackle this weakness of ADMM by proposing a method to adaptively tune the penalty parameters to achieve fast convergence. The resulting adaptive ADMM (AADMM) algorithm, inspired by the successful Barzilai-Borwein spectral method for gradient descent, yields fast convergence and relative insensitivity to the initial stepsize and problem scaling.
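
To make the role of the penalty parameter concrete, here is a minimal sketch of plain fixed-penalty ADMM (not the adaptive AADMM rule described in the abstract) on a toy scalar problem: minimize 0.5*(x - a)^2 + lam*|z| subject to x = z. The toy problem, the function names `soft_threshold` and `admm_scalar_lasso`, and the values of `a`, `lam`, and `rho` are illustrative assumptions, not taken from the paper.

```python
def soft_threshold(v, t):
    """Proximal operator of t * |.|: shrink v toward zero by t."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def admm_scalar_lasso(a, lam, rho, iters=200):
    """Fixed-penalty ADMM for min 0.5*(x - a)**2 + lam*|z| subject to x = z.

    rho is the penalty parameter (hypothetical toy setup);
    u is the scaled dual variable.
    """
    x = z = u = 0.0
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)  # x-update: quadratic minimization
        z = soft_threshold(x + u, lam / rho)   # z-update: proximal step on lam*|z|
        u = u + x - z                          # dual update on the residual x - z
    return z


if __name__ == "__main__":
    # Same problem, same iteration budget, different penalty parameters.
    for rho in (0.1, 1.0, 10.0):
        print(rho, admm_scalar_lasso(a=3.0, lam=1.0, rho=rho, iters=25))
```

The closed-form minimizer of this toy problem is `soft_threshold(a, lam)`, so the iterates can be checked against it. Running the loop with a small iteration budget for several values of `rho` gives a rough feel for the sensitivity that an adaptive rule is meant to remove.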

1611.03220 2026-06-04 math.NA cs.DS cs.LG cs.NA 版本更新

Faster Kernel Ridge Regression Using Sketching and Preconditioning

利用草图(Sketching)和预处理(Preconditioning)加速核岭回归

Haim Avron, Kenneth L. Clarkson, David P. Woodruff

AI总结 本文提出利用随机特征映射和预处理技术加速核岭回归线性系统的求解,通过有效预处理方法提升大规模数据集的计算效率。

详情
AI中文摘要

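核岭回归(KRR)是一种简单而强大的非参数回归技术,其计算归结为求解一个线性系统。该系统通常是稠密且高度病态的。此外,矩阵的维数与数据点数量相同,因此直接法对大规模数据集并不现实。本文提出一种预处理技术来加速上述线性系统的求解。该预处理子基于随机特征映射(例如随机傅里叶特征),这类映射近年来已成为通过近似来加速和扩展核方法(如核岭回归)训练的有力技术。然而,随机特征映射只能粗略地近似核函数,因此若直接求解近似系统,要达到最先进的结果就需要非常多的随机特征。我们表明,随机特征映射在构造预处理子时可以有效得多,因为在一定条件下,数量不太大的随机特征就足以得到有效的预处理子。我们对该方法进行了实证评估,结果表明它对多达一百万个训练样本的数据集非常有效。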

英文摘要

Kernel Ridge Regression (KRR) is a simple yet powerful technique for non-parametric regression whose computation amounts to solving a linear system. This system is usually dense and highly ill-conditioned. In addition, the dimensions of the matrix are the same as the number of data points, so direct methods are unrealistic for large-scale datasets. In this paper, we propose a preconditioning technique for accelerating the solution of the aforementioned linear system. The preconditioner is based on random feature maps, such as random Fourier features, which have recently emerged as a powerful technique for speeding up and scaling the training of kernel-based methods, such as kernel ridge regression, by resorting to approximations. However, random feature maps only provide crude approximations to the kernel function, so delivering state-of-the-art results by directly solving the approximated system requires the number of random features to be very large. We show that random feature maps can be much more effective in forming preconditioners, since under certain conditions a not-too-large number of random features is sufficient to yield an effective preconditioner. We empirically evaluate our method and show it is highly effective for datasets of up to one million training examples.
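
The linear system and the random-feature preconditioner mentioned above can be written out as follows. This is a sketch under assumed notation: the symbols $K$, $Z$, $\lambda$, $s$, and $M$ are not defined in the abstract, and some texts scale the ridge term as $n\lambda$ instead of $\lambda$.

```latex
% KRR with n training points x_1, ..., x_n, kernel k, ridge parameter \lambda > 0
(K + \lambda I_n)\,\alpha = y, \qquad K_{ij} = k(x_i, x_j), \qquad
f(x) = \sum_{i=1}^{n} \alpha_i\, k(x_i, x).

% Random feature map z : \mathbb{R}^d \to \mathbb{R}^s with k(x, x') \approx z(x)^\top z(x'),
% stacked row-wise into Z \in \mathbb{R}^{n \times s}, so that K \approx Z Z^\top.
M = Z Z^\top + \lambda I_n, \qquad
M^{-1} = \frac{1}{\lambda}\Bigl( I_n - Z\,(Z^\top Z + \lambda I_s)^{-1} Z^\top \Bigr).
```

The second identity is the Woodbury formula. It means the preconditioner can be applied using only an $s \times s$ factorization, which is presumably why a moderate number of random features is attractive here.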

1707.03340 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Deep Learning for Real Time Crime Forecasting

深度学习用于实时犯罪预测

Bao Wang, Duo Zhang, Duanhao Zhang, P. Jeffery Brantingham, Andrea L. Bertozzi

AI总结 本文提出基于深度学习的时空预测模型,通过空间时间正则化和残差卷积结构,提升对洛杉矶犯罪分布的预测精度。

Comments 4 pages, 6 figures, NOLTA, 2017

详情
AI中文摘要

准确的实时犯罪预测是公共安全的根本问题,但对科学界而言仍是一个难题。犯罪的发生取决于许多复杂因素;与许多可预测事件相比,犯罪事件是稀疏的。在不同时空尺度下,犯罪分布呈现显著不同的模式,且在空间和时间上的规律性都很低。本文改造了最先进的深度学习时空预测器ST-ResNet [Zhang et al., AAAI, 2017],用于整体预测洛杉矶地区的犯罪分布。我们的模型分为两个阶段:首先对原始犯罪数据进行预处理,包括空间和时间上的正则化以增强可预测信号;其次采用残差卷积单元的层次结构训练多因素犯罪预测模型。在洛杉矶为期半年的实验表明,我们的模型具有很高的预测精度。

英文摘要

Accurate real time crime prediction is a fundamental issue for public safety, but remains a challenging problem for the scientific community. Crime occurrences depend on many complex factors. Compared to many predictable events, crime is sparse. At different spatio-temporal scales, crime distributions display dramatically different patterns. These distributions are of very low regularity in both space and time. In this work, we adapt the state-of-the-art deep learning spatio-temporal predictor, ST-ResNet [Zhang et al, AAAI, 2017], to collectively predict crime distribution over the Los Angeles area. Our models are two staged. First, we preprocess the raw crime data. This includes regularization in both space and time to enhance predictable signals. Second, we adapt hierarchical structures of residual convolutional units to train multi-factor crime prediction models. Experiments over a half year period in Los Angeles reveal highly accurate predictive power of our models.

1707.03092 2026-06-04 eess.SY cs.LG cs.RO cs.SY 版本更新

A Separation-Based Design to Data-Driven Control for Large-Scale Partially Observed Systems

面向大规模部分观测系统的基于分离的数据驱动控制设计

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

AI总结 本文研究状态动力学由偏微分方程(PDE)描述的系统的部分观测随机最优控制问题,通过黑盒模拟模型求解开环确定性轨迹优化问题,并基于输入输出实验数据设计线性二次高斯控制器。

Comments 3 pages, 6 figures, In Robotics: Science and Systems (RSS) 2017 Workshop of "POMDPs in Robotics: State of The Art, Challenges, and Opportunities"

详情
AI中文摘要

本文研究状态动力学由偏微分方程(PDE)描述的系统的部分观测随机最优控制问题,这类系统会导致规模极大的问题。首先,使用动力系统的黑盒模拟模型求解开环确定性轨迹优化问题。接着,针对依赖于名义轨迹的线性化系统设计线性二次高斯(LQG)控制器,该线性化系统通过由优化后名义系统的脉冲响应组成的输入输出实验数据进行辨识。最后用一个非线性热问题计算算例说明该方法的性能。

英文摘要

This paper studies the partially observed stochastic optimal control problem for systems with state dynamics governed by Partial Differential Equations (PDEs) that leads to an extremely large problem. First, an open-loop deterministic trajectory optimization problem is solved using a black box simulation model of the dynamical system. Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal trajectory-dependent linearized system, which is identified using input-output experimental data consisting of the impulse responses of the optimized nominal system. A computational nonlinear heat example is used to illustrate the performance of the approach.

1707.02670 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

Accelerated Stochastic Power Iteration

加速随机幂迭代

Christopher De Sa, Bryan He, Ioannis Mitliagkas, Christopher Ré, Peng Xu

AI总结 本文提出一种带有动量项的幂迭代变种,实现了最优的样本和迭代复杂度,适用于在线和离线设置的随机PCA算法,加速了迭代复杂度至O(1/√Δ)。

Comments 37 pages, 5 figures

详情
AI中文摘要

主成分分析(PCA)是机器学习中最强大的工具之一。最简单的PCA方法,即幂迭代,需要O(1/Δ)次全数据遍历来恢复特征间隙为Δ的矩阵的主成分。Lanczos方法虽然复杂得多,但达到了加速的O(1/√Δ)遍历次数。然而,现代应用促使人们采用只读取部分可用数据的方法,即随机设置。在在线随机设置中,Oja迭代等简单算法达到最优样本复杂度O(σ²/Δ²)。遗憾的是,它们完全串行,并且同样需要O(σ²/Δ²)次迭代,远不及Lanczos的O(1/√Δ)速率。本文提出一种带有动量项的幂迭代简单变体,同时实现了最优的样本复杂度和迭代复杂度。在全遍历设置中,标准分析表明动量可实现加速的O(1/√Δ)速率。我们通过实验表明,将动量直接套用于随机方法并不能带来加速。我们进行了新颖的紧方差分析,揭示了一个“临界方差”,超过它加速就不再发生。将这一认识与现代方差缩减技术相结合,我们构建了适用于在线和离线设置的随机PCA算法,实现了加速的迭代复杂度O(1/√Δ)。由于我们的方法具有易并行(embarrassingly parallel)的特性,若部署在并行环境中,这种加速可直接转化为实际运行时间的缩短。我们的方法非常通用,适用于许多现在可以用同一技术加速的非凸优化问题。

英文摘要

Principal component analysis (PCA) is one of the most powerful tools in machine learning. The simplest method for PCA, the power iteration, requires $\mathcal O(1/Δ)$ full-data passes to recover the principal component of a matrix with eigen-gap $Δ$. Lanczos, a significantly more complex method, achieves an accelerated rate of $\mathcal O(1/\sqrt{Δ})$ passes. Modern applications, however, motivate methods that only ingest a subset of available data, known as the stochastic setting. In the online stochastic setting, simple algorithms like Oja's iteration achieve the optimal sample complexity $\mathcal O(σ^2/Δ^2)$. Unfortunately, they are fully sequential, and also require $\mathcal O(σ^2/Δ^2)$ iterations, far from the $\mathcal O(1/\sqrt{Δ})$ rate of Lanczos. We propose a simple variant of the power iteration with an added momentum term that achieves both the optimal sample and iteration complexity. In the full-pass setting, standard analysis shows that momentum achieves the accelerated rate, $\mathcal O(1/\sqrt{Δ})$. We demonstrate empirically that naively applying momentum to a stochastic method does not result in acceleration. We perform a novel, tight variance analysis that reveals the "breaking-point variance" beyond which this acceleration does not occur. By combining this insight with modern variance reduction techniques, we construct stochastic PCA algorithms, for the online and offline setting, that achieve an accelerated iteration complexity $\mathcal O(1/\sqrt{Δ})$. Due to the embarrassingly parallel nature of our methods, this acceleration translates directly to wall-clock time if deployed in a parallel environment. Our approach is very general, and applies to many non-convex optimization problems that can now be accelerated using the same technique.
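
The momentum idea can be sketched in the deterministic full-pass setting. The snippet below assumes a heavy-ball-style recurrence w_{t+1} = A w_t - beta * w_{t-1}, with both iterates rescaled by the same factor, and a momentum value beta = lambda_2^2 / 4 for a matrix whose second eigenvalue lambda_2 is known. The test matrix, the helper names `matvec` and `power_iteration_momentum`, and the parameter values are illustrative assumptions; the stochastic and variance-reduced variants from the abstract are not implemented.

```python
from math import sqrt


def matvec(A, v):
    """Dense matrix-vector product for a list-of-lists matrix."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def power_iteration_momentum(A, beta, w0, iters=60):
    """Power iteration with a momentum term (full-pass sketch).

    Update: w_next = A w - beta * w_prev, then rescale w_next and w by the
    same factor so the linear recurrence is preserved. beta = 0 recovers
    the plain power iteration.
    """
    w_prev = [0.0] * len(w0)
    w = list(w0)
    for _ in range(iters):
        Aw = matvec(A, w)
        w_next = [a - beta * p for a, p in zip(Aw, w_prev)]
        norm = sqrt(sum(x * x for x in w_next))
        w_prev = [x / norm for x in w]      # previous iterate, same scaling
        w = [x / norm for x in w_next]      # current iterate, unit norm
    return w


if __name__ == "__main__":
    # Symmetric matrix with eigenvalues 3 and 1; top eigenvector is (1, 1)/sqrt(2).
    A = [[2.0, 1.0], [1.0, 2.0]]
    beta = 1.0 ** 2 / 4.0  # lambda_2^2 / 4 with lambda_2 = 1 (assumed known)
    print(power_iteration_momentum(A, beta, [1.0, 0.0]))
```

In practice lambda_2 is not known in advance, so `beta` would have to be estimated or tuned; the sketch sidesteps that by using a matrix with a known spectrum.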

1707.02515 2026-06-04 cs.AI cs.LG cs.SY eess.SY 版本更新

A Fast Integrated Planning and Control Framework for Autonomous Driving via Imitation Learning

基于模仿学习的自动驾驶快速集成规划与控制框架

Liting Sun, Cheng Peng, Wei Zhan, Masayoshi Tomizuka

AI总结 本文提出一种结合学习与优化方法的两层框架,通过神经网络学习长期最优策略并结合短期优化控制器提升自动驾驶的安全性和效率。

详情
AI中文摘要

为实现自动驾驶中的安全高效规划与控制,需要一种能够在长时域内实现良好驾驶质量且保证安全可行的驾驶策略。基于优化的方法,如模型预测控制(MPC),可以提供此类最优策略,但其计算复杂度通常无法满足实时实现的需求。为解决此问题,我们提出了一种快速集成规划与控制框架,在两层分层结构中结合基于学习和基于优化的方法。第一层定义为“策略层”,由神经网络建立,学习由MPC生成的长期最优驾驶策略。第二层称为“执行层”,是一个基于优化的短期控制器,能够跟踪由“策略层”提供的参考轨迹,并保证短期的安全性和可行性。此外,借助高效且具有高度代表性的特征,小规模的神经网络就足以让“策略层”处理许多复杂的驾驶场景。这使得采用数据集聚合(DAgger)的在线模仿学习成为可能,从而能够在线快速且持续地提升“策略层”的性能。文中演示了若干驾驶场景示例,以验证所提框架的有效性和效率。

英文摘要

For safe and efficient planning and control in autonomous driving, we need a driving policy which can achieve desirable driving quality over a long-term horizon with guaranteed safety and feasibility. Optimization-based approaches, such as Model Predictive Control (MPC), can provide such optimal policies, but their computational complexity is generally unacceptable for real-time implementation. To address this problem, we propose a fast integrated planning and control framework that combines learning- and optimization-based approaches in a two-layer hierarchical structure. The first layer, defined as the "policy layer", is established by a neural network which learns the long-term optimal driving policy generated by MPC. The second layer, called the "execution layer", is a short-term optimization-based controller that tracks the reference trajectories given by the "policy layer" with guaranteed short-term safety and feasibility. Moreover, with efficient and highly representative features, a small neural network is sufficient in the "policy layer" to handle many complicated driving scenarios. This makes online imitation learning with Dataset Aggregation (DAgger) feasible, so that the performance of the "policy layer" can be improved rapidly and continuously online. Several example driving scenarios are demonstrated to verify the effectiveness and efficiency of the proposed framework.

1707.02201 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Learning human behaviors from motion capture by adversarial imitation

通过对抗模仿从动作捕捉数据中学习人类行为

Josh Merel, Yuval Tassa, Dhruva TB, Sriram Srinivasan, Jay Lemmon, Ziyu Wang, Greg Wayne, Nicolas Heess

AI总结 本文提出利用生成对抗模仿学习训练神经网络策略,从有限的不完全观测状态特征中生成人类化运动模式,即使演示来自不同物理参数的躯体,也能通过子技能策略解决任务。

详情
AI中文摘要

深度强化学习的快速进展使训练高维人形身体控制器变得越来越可行。然而,纯强化学习方法使用简单的奖励函数往往会产生非人类化且过于刻板的运动行为。在本文中,我们扩展了生成对抗模仿学习,以使训练通用神经网络策略成为可能,从而从仅包含部分观测状态特征的有限演示中生成人类化运动模式,即使在没有动作信息且演示来自具有不同且未知物理参数的躯体时也是如此。我们利用这种方法从动作捕捉数据构建子技能策略,并展示这些策略在由更高层次控制器控制时可以用于解决任务。

英文摘要

Rapid progress in deep reinforcement learning has made it increasingly feasible to train controllers for high-dimensional humanoid bodies. However, methods that use pure reinforcement learning with simple reward functions tend to produce non-humanlike and overly stereotyped movement behaviors. In this work, we extend generative adversarial imitation learning to enable training of generic neural network policies to produce humanlike movement patterns from limited demonstrations consisting only of partially observed state features, without access to actions, even when the demonstrations come from a body with different and unknown physical parameters. We leverage this approach to build sub-skill policies from motion capture data and show that they can be reused to solve tasks when controlled by a higher level controller.

1707.01945 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Simple Classification using Binary Data

基于二进制数据的简单分类

Deanna Needell, Rayan Saab, Tina Woolf

AI总结 本文研究了从二进制数据进行分类的问题,提出了一种计算和资源消耗低的框架,并通过实验和理论分析验证其有效性。

详情
AI中文摘要

数据的二值(一比特)表示在许多应用中自然出现,在硬件实现和算法设计方面都具有吸引力。本文研究了基于二值数据的分类问题,提出了一种计算和资源开销较低的框架。我们通过程式化和真实的数值实验展示了所提方法的实用性,并对一种简单情形给出了理论分析。我们希望该框架和分析能为研究类似方法奠定基础。

英文摘要

Binary, or one-bit, representations of data arise naturally in many applications, and are appealing in both hardware implementations and algorithm design. In this work, we study the problem of data classification from binary data and propose a framework with low computation and resource costs. We illustrate the utility of the proposed approach through stylized and realistic numerical experiments, and provide a theoretical analysis for a simple case. We hope that our framework and analysis will serve as a foundation for studying similar types of approaches.

1707.01322 2026-06-04 cs.LG cs.LO cs.SY eess.SY 版本更新

Automated Experiment Design for Data-Efficient Verification of Parametric Markov Decision Processes

数据高效验证参数马尔可夫决策过程的自动化实验设计

Elizabeth Polgreen, Viraj Wijesuriya, Sofie Haesaert, Alessandro Abate

AI总结 本文提出一种利用参数模型和实验数据进行统计验证的新方法,通过参数综合确定可行参数集,主动合成实验提高数据相关性,并传播信息以获得验证结果。

Comments QEST 2017, 18 pages, 7 figures

详情
AI中文摘要

我们提出了一种新的方法,用于在部分未知系统上对定量属性进行统计验证,利用参数模型(本文中为参数马尔可夫决策过程)和从底层系统中收集的实验数据。我们获得底层系统满足给定属性的信心,并展示该方法使用数据高效,因此对可用数据量具有鲁棒性。这些特性通过首先利用参数综合确定可行参数集,其次主动合成实验以增加与属性相关的数据信息,最后将此信息传播到模型参数,从而获得反映我们对系统参数是否在可行集中的信心,从而解决验证问题。

英文摘要

We present a new method for statistical verification of quantitative properties over a partially unknown system with actions, utilising a parameterised model (in this work, a parametric Markov decision process) and data collected from experiments performed on the underlying system. We obtain the confidence that the underlying system satisfies a given property, and show that the method uses data efficiently and thus is robust to the amount of data available. These characteristics are achieved by firstly exploiting parameter synthesis to establish a feasible set of parameters for which the underlying system will satisfy the property; secondly, by actively synthesising experiments to increase the amount of information in the collected data that is relevant to the property; and finally propagating this information over the model parameters, obtaining a confidence that reflects our belief whether or not the system parameters lie in the feasible set, thereby solving the verification problem.

1609.00932 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.PR physics.data-an 版本更新

Spectral learning of dynamic systems from nonequilibrium data

从非平衡数据中学习动态系统的谱方法

Hao Wu, Frank Noé

AI总结 本文研究了在不假设数据同分布的情况下谱学习的性质,表明通过施加平衡约束可以从非平衡观测数据中提取系统的平衡动力学,并提出了一种适用于连续数据的无分箱(binless)扩展方法,以线性复杂度实现一致估计。

详情
Journal ref
Proceedings of the 29th conference on Neural Information Processing Systems (NIPS), Barcelona, Spain, 2016, pp. 4179-4187
AI中文摘要

可观测算子模型(OOMs)及相关模型是建模和分析随机系统最重要、最强大的工具之一。它们精确描述有限秩系统的动力学,并可在数据同分布的假设下通过谱学习高效且一致地估计。鉴于分析大时间尺度系统的需要,本文研究了不作此假设时谱学习的性质,并表明通过施加平衡约束可从非平衡观测数据中提取系统的平衡动力学。此外,本文提出了一种适用于连续数据的无分箱(binless)谱学习扩展。与其他连续值谱算法相比,无分箱算法仅需线性复杂度即可实现平衡动力学的一致估计。

英文摘要

Observable operator models (OOMs) and related models are one of the most important and powerful tools for modeling and analyzing stochastic systems. They exactly describe dynamics of finite-rank systems and can be efficiently and consistently estimated through spectral learning under the assumption of identically distributed data. In this paper, we investigate the properties of spectral learning without this assumption due to the requirements of analyzing large-time scale systems, and show that the equilibrium dynamics of a system can be extracted from nonequilibrium observation data by imposing an equilibrium constraint. In addition, we propose a binless extension of spectral learning for continuous data. In comparison with the other continuous-valued spectral algorithms, the binless algorithm can achieve consistent estimation of equilibrium dynamics with only linear complexity.

1706.02869 2026-06-04 cs.LG cs.NA cs.SY eess.SY math.NA 版本更新

Adaptive Consensus ADMM for Distributed Optimization

自适应共识ADMM用于分布式优化

Zheng Xu, Gavin Taylor, Hao Li, Mario Figueiredo, Xiaoming Yuan, Tom Goldstein

AI总结 本文提出自适应共识ADMM方法,通过为每个节点定制参数提升分布式优化性能,并证明其O(1/k)收敛速率。

Comments ICML 2017

详情
AI中文摘要

交替方向乘子法(ADMM)常用于分布式模型拟合问题,但其性能依赖于用户定义的惩罚参数。本文研究了通过在每个工作节点上使用不同微调算法参数来提升性能的分布式ADMM方法。我们为具有节点特定参数的自适应ADMM方法证明了O(1/k)收敛速率,并提出了自动调节参数的自适应共识ADMM(ACADMM),无需用户监督。

英文摘要

The alternating direction method of multipliers (ADMM) is commonly used for distributed model fitting problems, but its performance and reliability depend strongly on user-defined penalty parameters. We study distributed ADMM methods that boost performance by using different fine-tuned algorithm parameters on each worker node. We present a O(1/k) convergence rate for adaptive ADMM methods with node-specific parameters, and propose adaptive consensus ADMM (ACADMM), which automatically tunes parameters without user oversight.

1602.07764 2026-06-04 cs.AI cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Reinforcement Learning of POMDPs using Spectral Methods

使用谱方法进行POMDP的强化学习

Kamyar Azizzadenesheli, Alessandro Lazaric, Animashree Anandkumar

AI总结 本文提出基于谱分解方法的POMDP强化学习算法,通过轨迹学习参数并利用优化预言机(oracle)得到最优无记忆策略,证明了相对于最优无记忆策略的阶最优遗憾界,以及在高维观测和动作空间上的高效扩展性。

详情
Journal ref
29th Annual Conference on Learning Theory, PMLR 49:193-256, 2016
AI中文摘要

我们提出了一种新的基于谱分解方法的POMDP强化学习算法。尽管谱方法此前已被用于隐马尔可夫模型等(被动)潜变量模型的一致学习,但POMDP更具挑战性,因为学习者与环境交互,并可能在此过程中改变未来的观测。我们设计了一种按回合运行的学习算法,在每个回合中利用谱技术从由固定策略生成的轨迹中学习POMDP参数。回合结束时,优化预言机(oracle)基于估计的POMDP模型返回使期望奖励最大化的最优无记忆规划策略。我们证明了相对于最优无记忆策略的阶最优遗憾界(regret bound),以及算法在观测空间和动作空间维度上的高效扩展性。

英文摘要

We propose a new reinforcement learning algorithm for partially observable Markov decision processes (POMDPs) based on spectral decomposition methods. While spectral methods have been previously employed for consistent learning of (passive) latent variable models such as hidden Markov models, POMDPs are more challenging since the learner interacts with the environment and possibly changes the future observations in the process. We devise a learning algorithm that runs in episodes; in each episode, we employ spectral techniques to learn the POMDP parameters from a trajectory generated by a fixed policy. At the end of the episode, an optimization oracle returns the optimal memoryless planning policy which maximizes the expected reward based on the estimated POMDP model. We prove an order-optimal regret bound with respect to the optimal memoryless policy and efficient scaling with respect to the dimensionality of observation and action spaces.

1608.05754 2026-06-04 math.NA cs.LG cs.NA 版本更新

Fast estimation of approximate matrix ranks using spectral densities

利用谱密度快速估计近似矩阵秩

Shashanka Ubaru, Yousef Saad, Abd-Krim Seghouane

AI总结 本文提出两种低成本方法,利用谱密度估算大数据矩阵的近似秩,通过Chebyshev多项式和Lanczos算法进行分析,结合谱密度图定位噪声与有效特征值间隙,验证方法在典型应用中的性能。

详情
Journal ref
Neural Computation, Vol. 29, No. 5, pp. 1317-1351 (May 2017)
AI中文摘要

在许多机器学习和数据应用中,需要掌握大型数据矩阵的近似秩。本文提出两种计算成本低的技术来估算此类大矩阵的近似秩。这些技术利用物理中流行的近似谱密度,即测量矩阵在实线上特定点处特征值出现概率的密度分布。对区间积分可得到该区间内矩阵的特征值计数,因此通过精心选择的区间积分可近似得到秩。讨论了两种不同的方法来估计近似秩,一种基于Chebyshev多项式,另一种基于Lanczos算法。为了获得适当的区间,需要定位噪声对应的特征值与影响矩阵秩的特征值之间的间隙。基于谱密度图提出了一种定位此间隙并选择积分区间的办法。数值实验展示了这些技术在典型应用矩阵上的性能。

英文摘要

In many machine learning and data related applications, it is required to have the knowledge of approximate ranks of large data matrices at hand. In this paper, we present two computationally inexpensive techniques to estimate the approximate ranks of such large matrices. These techniques exploit approximate spectral densities, popular in physics, which are probability density distributions that measure the likelihood of finding eigenvalues of the matrix at a given point on the real line. Integrating the spectral density over an interval gives the eigenvalue count of the matrix in that interval. Therefore the rank can be approximated by integrating the spectral density over a carefully selected interval. Two different approaches are discussed to estimate the approximate rank, one based on Chebyshev polynomials and the other based on the Lanczos algorithm. In order to obtain the appropriate interval, it is necessary to locate a gap between the eigenvalues that correspond to noise and the relevant eigenvalues that contribute to the matrix rank. A method for locating this gap and selecting the interval of integration is proposed based on the plot of the spectral density. Numerical experiments illustrate the performance of these techniques on matrices from typical applications.
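
The relation between the spectral density and the approximate rank described above can be written compactly. The notation below is assumed for illustration (an $n \times n$ symmetric positive semidefinite matrix with eigenvalues $\lambda_1, \dots, \lambda_n$, interval endpoints that are not themselves eigenvalues, and a noise threshold $\varepsilon$ placed in the spectral gap); the abstract does not fix these symbols.

```latex
\phi(t) = \frac{1}{n} \sum_{j=1}^{n} \delta(t - \lambda_j), \qquad
\nu_{[a,b]} = n \int_{a}^{b} \phi(t)\, dt = \#\{\, j : a \le \lambda_j \le b \,\}, \qquad
r_{\varepsilon} = \nu_{[\varepsilon,\ \lambda_{\max}]} .
```

Here $\nu_{[a,b]}$ is the eigenvalue count on $[a,b]$ and $r_{\varepsilon}$ is the approximate rank. The Chebyshev and Lanczos approaches mentioned in the abstract would then serve to approximate $\phi$, or the integral directly, without computing the eigenvalues.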

1405.6341 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Efficient Model Learning for Human-Robot Collaborative Tasks

高效的人机协作任务模型学习

Stefanos Nikolaidis, Keren Gu, Ramya Ramakrishnan, Julie Shah

AI总结 本文提出一种框架,通过联合动作演示学习人类用户模型,使机器人能自动计算稳健的协作策略。采用无监督学习聚类动作序列,学习逆强化学习奖励函数,并在混合可观测马尔可夫决策过程框架中应用,实现对新用户的类型推断和策略计算。

详情
Journal ref
Proceedings of the Tenth Annual ACM/IEEE International Conference on Human-Robot Interaction (HRI 2015)
AI中文摘要

我们提出了一种框架,用于从联合动作演示中学习人类用户模型,使机器人能够计算协作任务的稳健策略。学习过程完全自动,无需人工干预。首先,我们描述了使用无监督学习算法将演示的动作序列聚类为不同的人类类型。这些演示序列还被机器人用来通过逆强化学习算法学习代表每种类型的奖励函数。学习的模型随后作为混合可观测马尔可夫决策过程(MO-MDP)的一部分使用,其中人类类型是部分可观测变量。通过该框架,我们可以推断新用户类型(未包含在训练集中),并计算与新用户偏好一致且对人类动作偏离具有鲁棒性的机器人策略。最后,我们通过人类受试者实验数据验证了该方法,并进行了概念验证演示,其中一个人与小型工业机器人进行协作任务。

英文摘要

We present a framework for learning human user models from joint-action demonstrations that enables the robot to compute a robust policy for a collaborative task with a human. The learning takes place completely automatically, without any human intervention. First, we describe the clustering of demonstrated action sequences into different human types using an unsupervised learning algorithm. These demonstrated sequences are also used by the robot to learn a reward function that is representative for each type, through the employment of an inverse reinforcement learning algorithm. The learned model is then used as part of a Mixed Observability Markov Decision Process formulation, wherein the human type is a partially observable variable. With this framework, we can infer, either offline or online, the human type of a new user that was not included in the training set, and can compute a policy for the robot that will be aligned to the preference of this new user and will be robust to deviations of the human actions from prior demonstrations. Finally we validate the approach using data collected in human subject experiments, and conduct proof-of-concept demonstrations in which a person performs a collaborative task with a small industrial robot.

1706.04097 2026-06-04 cs.LG cs.DS cs.NA math.NA stat.ML 版本更新

Provable Alternating Gradient Descent for Non-negative Matrix Factorization with Strong Correlations

强相关情形下非负矩阵分解的可证明交替梯度下降法

Yuanzhi Li, Yingyu Liang

AI总结 本文提出了一种简单的交替梯度下降算法,证明在强相关性下能有效恢复真实特征矩阵,并展示了其在噪声下的鲁棒性。

Comments Accepted to the International Conference on Machine Learning (ICML), 2017

详情
AI中文摘要

非负矩阵分解是一种在非负约束下将数据分解为特征矩阵和权重矩阵的基本工具,在实践中通常在交替最小化框架下求解。然而,当不同特征的权重高度相关时(这在应用中很常见),此类算法能否恢复真实特征矩阵尚不清楚。本文提出了一种简单自然的基于交替梯度下降的算法,并证明在温和的初始化条件下,即使存在强相关性,它也能可证明地恢复真实特征矩阵。在大多数值得关注的情形中,相关性可以与可能的最大值同阶。我们的分析还揭示了该算法的若干有利特性,包括对噪声的鲁棒性。我们在半合成数据集上进行实证研究以补充理论结果,表明其在恢复真实特征矩阵方面优于几种流行方法。

英文摘要

Non-negative matrix factorization is a basic tool for decomposing data into the feature and weight matrices under non-negativity constraints, and in practice is often solved in the alternating minimization framework. However, it is unclear whether such algorithms can recover the ground-truth feature matrix when the weights for different features are highly correlated, which is common in applications. This paper proposes a simple and natural alternating gradient descent based algorithm, and shows that with a mild initialization it provably recovers the ground-truth in the presence of strong correlations. In most interesting cases, the correlation can be of the same order as the highest possible. Our analysis also reveals several of its favorable features, including robustness to noise. We complement our theoretical results with empirical studies on semi-synthetic datasets, demonstrating its advantage over several popular methods in recovering the ground-truth.

1702.07944 2026-06-04 cs.LG cs.AI cs.SY eess.SY math.OC stat.ML 版本更新

Stochastic Variance Reduction Methods for Policy Evaluation

基于随机方差缩减的方法用于策略评估

Simon S. Du, Jianshu Chen, Lihong Li, Lin Xiao, Dengyong Zhou

AI总结 本文研究基于线性函数逼近的策略评估,将经验策略评估问题转化为二次凸-凹鞍点问题,并提出一种原始-对偶批量梯度方法和两种随机方差缩减方法,其计算量随样本量和特征维度线性扩展,并实现线性收敛。

Comments Accepted by ICML 2017

详情
AI中文摘要

策略评估是许多强化学习过程中的关键步骤,它估计一个价值函数,用于预测给定策略下各状态的长期价值。本文聚焦于在固定数据集上使用线性函数逼近的策略评估。我们首先将经验策略评估问题转化为(二次)凸-凹鞍点问题,然后提出了一种原始-对偶批量梯度方法,以及两种用于求解该问题的随机方差缩减方法。这些算法的计算量随样本量和特征维度线性扩展。此外,即使鞍点问题仅对对偶变量强凹而对原始变量不具有强凸性,它们仍能实现线性收敛。在基准问题上的数值实验验证了方法的有效性。

英文摘要

Policy evaluation is a crucial step in many reinforcement-learning procedures, which estimates a value function that predicts states' long-term value under a given policy. In this paper, we focus on policy evaluation with linear function approximation over a fixed dataset. We first transform the empirical policy evaluation problem into a (quadratic) convex-concave saddle point problem, and then present a primal-dual batch gradient method, as well as two stochastic variance reduction methods for solving the problem. These algorithms scale linearly in both sample size and feature dimension. Moreover, they achieve linear convergence even when the saddle-point problem has only strong concavity in the dual variables but no strong convexity in the primal variables. Numerical experiments on benchmark problems demonstrate the effectiveness of our methods.
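
The saddle-point reformulation referred to above can be illustrated with the standard mean squared projected Bellman error objective. The following is a sketch under assumed notation: features $\phi_t$, next-state features $\phi_t'$, rewards $r_t$, discount $\gamma$, and an invertible $\hat C$. None of these symbols are defined in the abstract, and any additional regularization term is omitted.

```latex
\hat A = \frac{1}{n} \sum_{t=1}^{n} \phi_t (\phi_t - \gamma \phi_t')^\top, \qquad
\hat b = \frac{1}{n} \sum_{t=1}^{n} r_t\, \phi_t, \qquad
\hat C = \frac{1}{n} \sum_{t=1}^{n} \phi_t \phi_t^\top,

\min_{\theta}\ \tfrac{1}{2}\, \bigl\| \hat A \theta - \hat b \bigr\|_{\hat C^{-1}}^{2}
\;=\;
\min_{\theta}\ \max_{w}\ \Bigl( w^\top (\hat b - \hat A \theta) - \tfrac{1}{2}\, w^\top \hat C\, w \Bigr).
```

The inner maximum is attained at $w = \hat C^{-1}(\hat b - \hat A \theta)$, which recovers the left-hand side. The objective is linear in $\theta$ and strongly concave in $w$ when $\hat C \succ 0$, consistent with the setting of strong concavity in the dual variables without strong convexity in the primal variables.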

1706.00241 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Krylov Subspace Recycling for Fast Iterative Least-Squares in Machine Learning

Krylov子空间回收用于机器学习中的快速迭代最小二乘法

Filip de Roos, Philipp Hennig

AI总结 本文研究了利用Krylov子空间回收方法提高机器学习中对称正定线性问题求解效率,通过迭代优化低秩近似以平衡计算成本与数值精度。

详情
AI中文摘要

求解对称正定线性问题是机器学习中的一项基础计算任务。众所周知,精确求解的代价随矩阵规模呈立方增长。为缓解这一问题,人们提出了若干线性时间的近似方法,如谱方法和诱导点方法,目前已被广泛使用。这些方法属于低秩近似,它们预先选定低秩空间,且不随时间对其进行改进。这虽然使代价随数据集规模线性增长,但也带来了有限的、未被修正的近似误差。数值线性代数领域的研究者探索了迭代改进此类低秩近似的途径,其代价仅为少量矩阵-向量乘法。在机器学习中,许多场合需要求解一系列相关的对称正定线性问题,这一思路在这些场合尤其值得关注。从机器学习的角度看,此类收缩(deflation)方法可以被解释为低秩近似在一系列数值任务之间的迁移学习。我们研究了此类方法在本领域中的应用。实验结果表明,在中等规模的回归和分类问题上,这种方法可以在低计算成本与数值精度之间进行折中。

英文摘要

Solving symmetric positive definite linear problems is a fundamental computational task in machine learning. The exact solution, famously, is cubically expensive in the size of the matrix. To alleviate this problem, several linear-time approximations, such as spectral and inducing-point methods, have been suggested and are now in wide use. These are low-rank approximations that choose the low-rank space a priori and do not refine it over time. While this allows linear cost in the data-set size, it also causes a finite, uncorrected approximation error. Authors from numerical linear algebra have explored ways to iteratively refine such low-rank approximations, at a cost of a small number of matrix-vector multiplications. This idea is particularly interesting in the many situations in machine learning where one has to solve a sequence of related symmetric positive definite linear problems. From the machine learning perspective, such deflation methods can be interpreted as transfer learning of a low-rank approximation across a time-series of numerical tasks. We study the use of such methods for our field. Our empirical results show that, on regression and classification problems of intermediate size, this approach can interpolate between low computational cost and numerical precision.

1705.10596 2026-06-04 math.NA cs.LG cs.NA 版本更新

Approximation learning methods of Harmonic Mappings in relation to Hardy Spaces

谐映射近似学习方法与Hardy空间的关系

Zhulin Liu, C. L. Philip Chen

AI总结 本文提出基于Tikhonov正则化和再生核Hilbert空间的Hardy空间方法,用于求解Dirichlet型问题,通过利用Hardy空间中函数的再生性质,简化优化运算并提出高效算法,同时探讨了谐映射在图像处理等应用中的独特性质和有效性。

Comments 2016 3rd International Conference on Informative and Cybernetics for Computational Social Systems (ICCSS)

详情
AI中文摘要

本文讨论了一种基于Tikhonov正则化和再生核Hilbert空间的Hardy空间方法,用于解决Dirichlet型问题,该问题本质上是一个位于上半复平面上的典型极值问题。在Hardy空间中考虑时,该问题的优化算子将被大大简化,从而可能实现高效算法。这主要通过利用上半复平面Hardy空间中函数的再生性质来实现,文中给出了详细算法。此外,谐映射作为一种重要的几何变换,在图像处理等许多应用中被广泛使用,因为它描述了流形之间的能量最小化映射。特别是,当关注两个欧几里得平面区域之间的平面映射时,谐映射存在且唯一,这一性质由谐函数的存在性保证。这一性质很有吸引力,本文通过模拟结果展示了其在平面形状扭曲和表面配准等应用中的能力。

英文摘要

A new Hardy space approach to Dirichlet-type problems, based on Tikhonov regularization and reproducing kernel Hilbert spaces, is discussed in this paper; the problem turns out to be a typical extremal problem on the upper half of the complex plane. When it is considered in the Hardy space, the optimization operator of the problem is greatly simplified and an efficient algorithm becomes possible. This is mainly achieved with the help of the reproducing properties of functions in the Hardy space of the upper half-plane, and a detailed algorithm is proposed. Moreover, harmonic mappings, which are significant geometric transformations, are commonly used in many applications such as image processing, since they describe the energy-minimizing mappings between individual manifolds. In particular, when we focus on planar mappings between two Euclidean planar regions, the harmonic mapping exists and is unique, which is guaranteed by the existence of harmonic functions. This property is attractive, and simulation results are shown in this paper to demonstrate its capability in applications such as planar shape distortion and surface registration.

1705.10152 2026-06-04 math.OC cs.LG cs.NA math.AG math.NA 版本更新

Tangent Cones to TT Varieties

TT簇的切锥

Benjamin Kutschan

AI总结 本文研究了TT簇的Bouligand切锥的参数化,讨论了其向任意二叉分层格式的推广,并给出了以TT张量正交和表示的参数化,以及切锥作为多项式方程组解集的隐式描述。

详情
AI中文摘要

如同矩阵情形中已有的做法(例如[Joe Harris, Algebraic Geometry - A first course, p.256]),我们给出了TT秩有界的张量所构成的簇的Bouligand切锥的参数化。我们讨论了该证明如何推广到任意二叉分层格式。该参数化可以改写为TT张量的正交和。它到该簇上的收缩映射(retraction)特别容易组合。我们还给出了切锥作为多项式方程组解集的隐式描述。

英文摘要

As already done for the matrix case for example in [Joe Harris, Algebraic Geometry - A first course, p.256] we give a parametrization of the Bouligand tangent cone of the variety of tensors of bounded TT rank. We discuss how the proof generalizes to any binary hierarchical format. The parametrization can be rewritten as an orthogonal sum of TT tensors. Its retraction onto the variety is particularly easy to compose. We also give an implicit description of the tangent cone as the solution of a system of polynomial equations.

1705.09761 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Stochastic Feedback Control of Systems with Unknown Nonlinear Dynamics

具有未知非线性动力学系统的随机反馈控制

Dan Yu, Mohammadhussein Rafieisakhaei, Suman Chakravorty

AI总结 研究未知动力学系统的随机最优控制问题,通过开环确定性轨迹优化和LQG控制器设计,使状态接近最优轨迹,利用输入输出实验数据识别轨迹依赖线性化系统。

Comments 7 pages, 7 figures, submitted to 56th IEEE Conference on Decision and Control (CDC), 2017

详情
AI中文摘要

本文研究具有未知动力学的随机最优控制问题。首先,解决开环确定性轨迹优化问题,无需知道动力学系统的显式形式。接着,为依赖轨迹的线性化系统设计线性二次高斯(LQG)控制器,使得在小噪声假设下,实际状态保持接近最优轨迹。利用输入输出实验数据(包含名义系统的脉冲响应)识别轨迹依赖线性化系统。通过计算示例说明所提方法的性能。

英文摘要

This paper studies the stochastic optimal control problem for systems with unknown dynamics. First, an open-loop deterministic trajectory optimization problem is solved without knowing the explicit form of the dynamical system. Next, a Linear Quadratic Gaussian (LQG) controller is designed for the nominal trajectory-dependent linearized system, such that under a small noise assumption, the actual states remain close to the optimal trajectory. The trajectory-dependent linearized system is identified using input-output experimental data consisting of the impulse responses of the nominal system. A computational example is given to illustrate the performance of the proposed approach.

1607.01231 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Stochastic Quasi-Newton Methods for Nonconvex Stochastic Optimization

非凸随机优化的随机拟牛顿方法

Xiao Wang, Shiqian Ma, Donald Goldfarb, Wei Liu

AI总结 本文研究非凸随机优化中的随机拟牛顿方法,提出了一种框架并证明了收敛性,分析了最坏情况下的迭代复杂度,提出了一种随机阻尼L-BFGS方法,并结合SVRG技术,展示了在非凸二分类和多分类问题中的数值结果。

Comments published in SIAM Journal on Optimization

详情
AI中文摘要

本文研究了非凸随机优化中的随机拟牛顿方法,假设可以通过随机一阶预言机(SFO)获取目标函数梯度的噪声信息。我们提出了一种通用框架,证明了其几乎必然收敛到驻点(stationary points),并分析了最坏情况下的迭代复杂度。当随机选择的迭代结果作为算法输出时,我们证明在最坏情况下,SFO调用的复杂度为 $O(ε^{-2})$,以确保梯度平方范数的期望小于给定的精度容限 $ε$。我们还提出了一种具体的算法,即随机阻尼L-BFGS(SdLBFGS)方法,该方法属于所提出的框架。此外,我们将SVRG方差缩减技术纳入所提出的SdLBFGS方法中,并分析了其SFO调用复杂度。文中报告了使用SVM的非凸二分类问题以及使用神经网络的多分类问题上的数值结果。

英文摘要

In this paper we study stochastic quasi-Newton methods for nonconvex stochastic optimization, where we assume that noisy information about the gradients of the objective function is available via a stochastic first-order oracle (SFO). We propose a general framework for such methods, for which we prove almost sure convergence to stationary points and analyze its worst-case iteration complexity. When a randomly chosen iterate is returned as the output of such an algorithm, we prove that in the worst-case, the SFO-calls complexity is $O(ε^{-2})$ to ensure that the expectation of the squared norm of the gradient is smaller than the given accuracy tolerance $ε$. We also propose a specific algorithm, namely a stochastic damped L-BFGS (SdLBFGS) method, that falls under the proposed framework. Moreover, we incorporate the SVRG variance reduction technique into the proposed SdLBFGS method, and analyze its SFO-calls complexity. Numerical results on a nonconvex binary classification problem using SVM, and a multiclass classification problem using neural networks are reported.

1705.05475 2026-06-04 cs.LG cs.NA cs.NE math.NA q-bio.NC 版本更新

Sparse Coding by Spiking Neural Networks: Convergence Theory and Computational Results

稀疏编码的脉冲神经网络:收敛理论与计算结果

Ping Tak Peter Tang, Tsung-Han Lin, Mike Davies

AI总结 本文提出一种脉冲神经网络模型,证明其能可靠解决稀疏编码问题,为非冯·诺依曼架构计算机提供了理论保障。

Comments 13 pages, 3 figures

详情
AI中文摘要

在脉冲神经网络(SNN)中,单个神经元自主运作,仅通过脉冲信号与其他神经元进行稀疏、异步的通信。这些特性使SNN的大规模并行硬件实现有望成为一种强大的计算机,尽管它并非冯·诺依曼架构。但能否保证SNN计算机可靠地解决某些重要问题?本文构建了一个SNN的数学模型,该模型可被配置用于求解面向特征提取的稀疏编码问题。在一个适度但定义明确的假设下,我们证明该SNN确实能够求解稀疏编码问题。据我们所知,这是此类问题的首个严格结果。

英文摘要

In a spiking neural network (SNN), individual neurons operate autonomously and only communicate with other neurons sparingly and asynchronously via spike signals. These characteristics render a massively parallel hardware implementation of SNN a potentially powerful computer, albeit a non von Neumann one. But can one guarantee that a SNN computer solves some important problems reliably? In this paper, we formulate a mathematical model of one SNN that can be configured for a sparse coding problem for feature extraction. With a moderate but well-defined assumption, we prove that the SNN indeed solves sparse coding. To the best of our knowledge, this is the first rigorous result of this kind.

1705.05116 2026-06-04 cs.RO cs.AI cs.CV cs.LG cs.SY eess.SY 版本更新

Tuning Modular Networks with Weighted Losses for Hand-Eye Coordination

通过加权损失调节模块网络以提升手眼协调

Fangyi Zhang, Jürgen Leitner, Michael Milford, Peter I. Corke

AI总结 本文提出端到端微调方法,通过加权损失提升模块化深度视觉-运动策略在平面到达(reaching)任务中的手眼协调性能。

Comments 2 pages, to appear in the Deep Learning for Robotic Vision (DLRV) Workshop in CVPR 2017

详情
AI中文摘要

本文介绍了一种端到端微调方法,用于改进模块化深度视觉-运动策略(模块化网络)中的手眼协调能力,其中每个模块独立训练。得益于加权损失,该微调方法显著提升了策略在机器人平面到达(reaching)任务中的性能。

英文摘要

This paper introduces an end-to-end fine-tuning method to improve hand-eye coordination in modular deep visuo-motor policies (modular networks) where each module is trained independently. Benefiting from weighted losses, the fine-tuning method significantly improves the performance of the policies for a robotic planar reaching task.

1702.02453 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Preparing for the Unknown: Learning a Universal Policy with Online System Identification

为未知做准备:学习通用策略与在线系统识别

Wenhao Yu, Jie Tan, C. Karen Liu, Greg Turk

AI总结 本文提出了一种学习通用策略的方法,通过在线系统识别和大量训练示例,使策略在未知动态模型下具备鲁棒性,适用于多种动态模型和环境变化。

Comments Accepted as a conference paper at RSS 2017

详情
AI中文摘要

我们提出了一种学习控制策略的新方法,使策略能够在未知动力学模型下成功运行。我们利用物理模拟器生成的大量训练样本来构建此类策略。系统由两个组件组成:通用策略(UP)和在线系统识别(OSI)函数。我们称该控制策略为“通用”,是因为它在多种动力学模型上训练。动力学模型的变化可能包括机器人部件的质量和惯量差异、可变的摩擦系数,或被操作物体的未知质量。通过在这些变化上训练通用策略,控制策略在未知环境中执行时能够应对更广泛的可能条件。系统的第二部分利用系统近期的状态和动作历史来预测动力学模型参数mu。随后,在线系统识别给出的mu值与系统状态一起作为输入提供给控制策略。两者结合而成的UP-OSI是一种稳健的控制策略,可用于多种动力学模型,并能对环境的突然变化做出响应。我们在多种任务上评估了该系统的性能,包括倒立摆小车(cart-pole)摆起、双倒立摆、单腿跳跃机器人(hopper)的运动以及机械臂抛掷方块。UP-OSI在多种动力学模型下都能有效完成这些任务。此外,当测试所用的动力学模型超出训练范围时,UP-OSI的表现优于单独使用的通用策略,即使后者被给予动力学模型的真实参数值。除了能够构建更稳健的控制器之外,UP-OSI还有望缩小模拟系统与真实物理系统之间的现实差距(Reality Gap)。

英文摘要

We present a new method of learning control policies that successfully operate under unknown dynamic models. We create such policies by leveraging a large number of training examples that are generated using a physical simulator. Our system is made of two components: a Universal Policy (UP) and a function for Online System Identification (OSI). We describe our control policy as universal because it is trained over a wide array of dynamic models. These variations in the dynamic model may include differences in mass and inertia of the robots' components, variable friction coefficients, or unknown mass of an object to be manipulated. By training the Universal Policy with this variation, the control policy is prepared for a wider array of possible conditions when executed in an unknown environment. The second part of our system uses the recent state and action history of the system to predict the dynamics model parameters mu. The value of mu from the Online System Identification is then provided as input to the control policy (along with the system state). Together, UP-OSI is a robust control policy that can be used across a wide range of dynamic models, and that is also responsive to sudden changes in the environment. We have evaluated the performance of this system on a variety of tasks, including the problem of cart-pole swing-up, the double inverted pendulum, locomotion of a hopper, and block-throwing of a manipulator. UP-OSI is effective at these tasks across a wide range of dynamic models. Moreover, when tested with dynamic models outside of the training range, UP-OSI outperforms the Universal Policy alone, even when UP is given the actual value of the model dynamics. In addition to the benefits of creating more robust controllers, UP-OSI also holds out promise of narrowing the Reality Gap between simulated and real physical systems.

1702.04077 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Mutual Kernel Matrix Completion

互核矩阵补全

Tsuyoshi Kato, Rachelle Rivero

AI总结 本文提出互核矩阵补全算法,通过融合数据与核矩阵补全方法,提升生物数据分类任务中缺失核矩阵的补全效果。

Comments 10 pages, 4 figures

详情
AI中文摘要

随着各种数据的大量涌入,从其中提取知识已成为数据科学家的一项有趣但繁琐的任务,特别是当数据形式异构且存在缺失信息时。许多数据补全技术已被引入,尤其是在核方法出现后。然而,在现有文献中,关于同时补全多个不完整核矩阵的研究却很少受到关注。本文提出了一种新的方法,称为互核矩阵补全(MKMC)算法,通过结合数据融合和核矩阵补全的概念,应用于生物数据集以用于分类任务。我们首先引入了一个目标函数,通过利用EM算法进行最小化,从而得到涉及的核矩阵中缺失条目的估计。补全后的核矩阵随后被结合以生成一个模型矩阵,可用于进一步改进获得的估计。我们的研究结果表明,E步和M步以闭合形式给出,使我们的算法在时间和内存方面都高效。完成补全后,补全的核矩阵用于训练SVM分类器,以测试数据点之间关系的保持程度。我们的实证结果表明,所提出的算法在保持数据点之间关系和准确恢复缺失核矩阵条目方面优于传统补全技术。目前,MKMC为多个相关不完整核矩阵的相互估计问题提供了一个有前途的解决方案。

英文摘要

With the huge influx of various data nowadays, extracting knowledge from them has become an interesting but tedious task among data scientists, particularly when the data come in heterogeneous form and have missing information. Many data completion techniques had been introduced, especially in the advent of kernel methods. However, among the many data completion techniques available in the literature, studies about mutually completing several incomplete kernel matrices have not been given much attention yet. In this paper, we present a new method, called Mutual Kernel Matrix Completion (MKMC) algorithm, that tackles this problem of mutually inferring the missing entries of multiple kernel matrices by combining the notions of data fusion and kernel matrix completion, applied on biological data sets to be used for classification task. We first introduced an objective function that will be minimized by exploiting the EM algorithm, which in turn results to an estimate of the missing entries of the kernel matrices involved. The completed kernel matrices are then combined to produce a model matrix that can be used to further improve the obtained estimates. An interesting result of our study is that the E-step and the M-step are given in closed form, which makes our algorithm efficient in terms of time and memory. After completion, the (completed) kernel matrices are then used to train an SVM classifier to test how well the relationships among the entries are preserved. Our empirical results show that the proposed algorithm bested the traditional completion techniques in preserving the relationships among the data points, and in accurately recovering the missing kernel matrix entries. By far, MKMC offers a promising solution to the problem of mutual estimation of a number of relevant incomplete kernel matrices.

1705.02891 2026-06-04 stat.CO cs.LG cs.NA hep-lat math.NA stat.ML 版本更新

Geometry and Dynamics for Markov Chain Monte Carlo

马尔可夫链蒙特卡洛的几何与动力学

Alessandro Barp, Francois-Xavier Briol, Anthony D. Kennedy, Mark Girolami

AI总结 本文综述了Hamiltonian Monte Carlo中使用的几何工具,为统计学家和机器学习者提供基础理解,并讨论了该领域最新进展。

Comments Submitted to "Annual Review of Statistics and Its Applications"

详情
AI中文摘要

马尔可夫链蒙特卡洛方法已革新了数学计算,并在许多以前无法处理的模型中实现了统计推断。在此背景下,哈密顿动力学被提出作为高效构建链的方法,以高效探索概率密度。该方法源自物理和几何,并且这些联系已通过一系列作者三十年的研究被广泛研究。然而,目前用户对方法的直觉和知识与我们对这些理论基础的深入理解之间存在差距。本文的目的是为统计学家、机器学习者及其他方法使用者提供一个全面的介绍,使他们能够在仅具备基本蒙特卡洛方法知识的情况下理解这些几何工具。这将通过讨论该领域最近期的进展来补充,我们相信这些进展将对应用科学家越来越相关。

英文摘要

Markov Chain Monte Carlo methods have revolutionised mathematical computation and enabled statistical inference within many previously intractable models. In this context, Hamiltonian dynamics have been proposed as an efficient way of building chains which can explore probability densities efficiently. The method emerges from physics and geometry and these links have been extensively studied by a series of authors through the last thirty years. However, there is currently a gap between the intuitions and knowledge of users of the methodology and our deep understanding of these theoretical foundations. The aim of this review is to provide a comprehensive introduction to the geometric tools used in Hamiltonian Monte Carlo at a level accessible to statisticians, machine learners and other users of the methodology with only a basic understanding of Monte Carlo methods. This will be complemented with some discussion of the most recent advances in the field which we believe will become increasingly relevant to applied scientists.

1605.00609 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA stat.ML 版本更新

Algorithms for Learning Sparse Additive Models with Interactions in High Dimensions

高维空间中包含交互项的稀疏加法模型的学习算法

Hemant Tyagi, Anastasios Kyrillidis, Bernd Gärtner, Andreas Krause

AI总结 本文提出了一种在高维空间中学习包含稀疏交互项的加法模型的算法,通过压缩感知方法有效恢复模型结构并保证误差界。

Comments To appear in Information and Inference: A Journal of the IMA. Made following changes after review process: (a) Corrected typos throughout the text. (b) Corrected choice of sampling distribution in Section 5, see eqs. (5.2), (5.3). (c) More detailed comparison with existing work in Section 8. (d) Added Section B in appendix on roots of cubic equation

详情
AI中文摘要

一个函数$f: \mathbb{R}^d \rightarrow \mathbb{R}$是稀疏加法模型(SPAM),如果其形式为$f(\mathbf{x}) = \sum_{l \in \mathcal{S}}ϕ_{l}(x_l)$,其中$\mathcal{S} \subset [d]$,且$|\mathcal{S}| \ll d$。假设$ϕ$和$\mathcal{S}$未知,已有大量工作致力于从样本中估计$f$。本文考虑了一种广义的SPAMs,允许存在少量的二次交互项。对于某些$\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$,其中$|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$,函数$f$现在被假设为形式:$\sum_{p \in \mathcal{S}_1}ϕ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}ϕ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$。假设我们能够任意查询$f$的域内任意点,我们推导出高效的算法,能够以有限样本界证明恢复$\mathcal{S}_1,\mathcal{S}_2$。我们的分析涵盖了无噪声设置,即获得精确的$f$样本,也扩展到有噪声设置,其中查询被噪声污染。特别是对于有噪声设置,我们考虑了两种噪声模型:独立同分布高斯噪声和任意但有界的噪声。我们的主要方法依赖于稀疏Hessian矩阵的估计,为此我们提供了两种新的压缩感知方案。一旦$\mathcal{S}_1, \mathcal{S}_2$已知,我们展示了如何通过额外的$f$查询估计个体组件$ϕ_p$, $ϕ_{(l,l^{\prime})}$,并保证均匀误差界。最后,我们通过合成数据的模拟结果验证了我们的理论发现。

英文摘要

A function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ is a Sparse Additive Model (SPAM), if it is of the form $f(\mathbf{x}) = \sum_{l \in \mathcal{S}}ϕ_{l}(x_l)$ where $\mathcal{S} \subset [d]$, $|\mathcal{S}| \ll d$. Assuming $ϕ$'s, $\mathcal{S}$ to be unknown, there exists extensive work for estimating $f$ from its samples. In this work, we consider a generalized version of SPAMs, that also allows for the presence of a sparse number of second order interaction terms. For some $\mathcal{S}_1 \subset [d], \mathcal{S}_2 \subset {[d] \choose 2}$, with $|\mathcal{S}_1| \ll d, |\mathcal{S}_2| \ll d^2$, the function $f$ is now assumed to be of the form: $\sum_{p \in \mathcal{S}_1}ϕ_{p} (x_p) + \sum_{(l,l^{\prime}) \in \mathcal{S}_2}ϕ_{(l,l^{\prime})} (x_l,x_{l^{\prime}})$. Assuming we have the freedom to query $f$ anywhere in its domain, we derive efficient algorithms that provably recover $\mathcal{S}_1,\mathcal{S}_2$ with finite sample bounds. Our analysis covers the noiseless setting where exact samples of $f$ are obtained, and also extends to the noisy setting where the queries are corrupted with noise. For the noisy setting in particular, we consider two noise models namely: i.i.d Gaussian noise and arbitrary but bounded noise. Our main methods for identification of $\mathcal{S}_2$ essentially rely on estimation of sparse Hessian matrices, for which we provide two novel compressed sensing based schemes. Once $\mathcal{S}_1, \mathcal{S}_2$ are known, we show how the individual components $ϕ_p$, $ϕ_{(l,l^{\prime})}$ can be estimated via additional queries of $f$, with uniform error bounds. Lastly, we provide simulation results on synthetic data that validate our theoretical findings.

1704.07669 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Single-Pass PCA of Large High-Dimensional Data

大规模高维数据的单次PCA处理

Wenjian Yu, Yu Gu, Jian Li, Shenghua Liu, Yaohang Li

AI总结 本文提出一种单次随机算法实现大规模高维数据的PCA,适用于存储在慢速存储器或流式生成的数据,实验验证其准确性,比现有算法误差小多个数量级,可在24分钟内计算50个主成分。

Comments IJCAI 2017, 16 pages, 6 figures

详情
AI中文摘要

主成分分析(PCA)是统计学和机器学习中的基本降维工具。对于大规模高维数据,计算PCA(即数据矩阵的主导奇异值对应的奇异向量)成为挑战。本文提出一种单次随机算法,仅需一次数据遍历即可计算PCA。该算法适用于存储在慢速存储器或流式生成的数据。合成和真实数据的实验验证了算法的准确性,其误差比现有单次算法小多个数量级。对于存储为150 GB文件的高维数据,本文算法在典型24核计算机上可在24分钟内计算前50个主成分,内存成本低于1 GB。

英文摘要

Principal component analysis (PCA) is a fundamental dimension reduction tool in statistics and machine learning. For large and high-dimensional data, computing the PCA (i.e., the singular vectors corresponding to a number of dominant singular values of the data matrix) becomes a challenging task. In this work, a single-pass randomized algorithm is proposed to compute PCA with only one pass over the data. It is suitable for processing extremely large and high-dimensional data stored in slow memory (hard disk) or the data generated in a streaming fashion. Experiments with synthetic and real data validate the algorithm's accuracy, which has orders of magnitude smaller error than an existing single-pass algorithm. For a set of high-dimensional data stored as a 150 GB file, the proposed algorithm is able to compute the first 50 principal components in just 24 minutes on a typical 24-core computer, with less than 1 GB memory cost.

1608.04773 2026-06-04 stat.ML cs.DS cs.LG cs.NA math.NA math.OC 版本更新

Faster Principal Component Regression and Stable Matrix Chebyshev Approximation

更快的主成分回归与稳定的矩阵切比雪夫逼近

Zeyuan Allen-Zhu, Yuanzhi Li

AI总结 本文提出了一种通过减少黑盒调用次数来实现主成分回归的算法,其精度为1+γ,且无需显式构造主成分,适用于大规模数据。同时,开发了稳定的矩阵切比雪夫多项式递推公式和最优多项式逼近矩阵符号函数的方法。

Comments title changed and minor revisions

详情
AI中文摘要

我们将主成分回归(PCR)问题归约为$\tilde{O}(γ^{-1})$次岭回归的黑盒调用,从而在乘法精度$1+γ$内求解该问题。因此,我们的算法不需要显式构造前几个主成分,适用于大规模PCR实例。相比之下,先前的结果需要$\tilde{O}(γ^{-2})$次这样的黑盒调用。我们通过为矩阵切比雪夫多项式建立一个通用的稳定递推公式,并给出矩阵符号函数的次数最优多项式逼近,得到了这一结果。我们的技术可能具有独立的研究价值,尤其是在设计迭代方法时。

英文摘要

We solve principal component regression (PCR), up to a multiplicative accuracy $1+γ$, by reducing the problem to $\tilde{O}(γ^{-1})$ black-box calls of ridge regression. Therefore, our algorithm does not require any explicit construction of the top principal components, and is suitable for large-scale PCR instances. In contrast, previous result requires $\tilde{O}(γ^{-2})$ such black-box calls. We obtain this result by developing a general stable recurrence formula for matrix Chebyshev polynomials, and a degree-optimal polynomial approximation to the matrix sign function. Our techniques may be of independent interests, especially when designing iterative methods.
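The abstract above refers to a stable recurrence for matrix Chebyshev polynomials. As background only, here is a minimal Python sketch of the textbook scalar three-term recurrence $T_{k+1}(x) = 2x\,T_k(x) - T_{k-1}(x)$. It is not the paper's stable matrix variant, and the function name `chebyshev_t` is an assumption made for illustration.

```python
import math


def chebyshev_t(n: int, x: float) -> float:
    """Evaluate T_n(x) with the three-term recurrence T_{k+1} = 2x T_k - T_{k-1}."""
    if n < 0:
        raise ValueError("n must be non-negative")
    t_prev, t_curr = 1.0, x  # T_0(x) = 1, T_1(x) = x
    if n == 0:
        return t_prev
    for _ in range(n - 1):
        t_prev, t_curr = t_curr, 2.0 * x * t_curr - t_prev
    return t_curr


# On [-1, 1] the polynomials satisfy T_n(cos(theta)) = cos(n * theta).
theta = 0.3
print(chebyshev_t(5, math.cos(theta)), math.cos(5 * theta))
```

Replacing the scalar `x` by a matrix gives the naive matrix recurrence; per the abstract, making that computation stable is the part the paper contributes.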

1704.06803 2026-06-04 cs.LG cs.IR cs.NA math.NA stat.ML 版本更新

Geometric Matrix Completion with Recurrent Multi-Graph Neural Networks

基于循环多图神经网络的几何矩阵补全

Federico Monti, Michael M. Bronstein, Xavier Bresson

AI总结 本文提出利用几何深度学习改进矩阵补全,结合图卷积网络和循环神经网络,学习图结构模式和非线性扩散过程,以提升推荐系统性能,参数数量与矩阵规模无关。

详情
AI中文摘要

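矩阵补全模型是推荐系统最常见的建模形式之一。近期工作表明,以图的形式引入用户/物品之间的成对关系,并在这些图上施加平滑性先验,可以提升此类技术的性能。然而,这些技术没有充分利用用户/物品图的局部平稳性结构,且需要学习的参数数量随用户和物品数量线性增长。我们提出一种新方法,利用图上的几何深度学习来克服这些局限。我们的矩阵补全架构结合了图卷积神经网络和循环神经网络,用于学习有意义的图结构统计模式以及生成已知评分的非线性扩散过程。该神经网络系统所需的参数数量为常数,与矩阵规模无关。我们将该方法应用于合成数据集和真实数据集,结果表明其优于现有最先进的技术。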

英文摘要

Matrix completion models are among the most common formulations of recommender systems. Recent works have showed a boost of performance of these techniques when introducing the pairwise relationships between users/items in the form of graphs, and imposing smoothness priors on these graphs. However, such techniques do not fully exploit the local stationarity structures of user/item graphs, and the number of parameters to learn is linear w.r.t. the number of users and items. We propose a novel approach to overcome these limitations by using geometric deep learning on graphs. Our matrix completion architecture combines graph convolutional neural networks and recurrent neural networks to learn meaningful statistical graph-structured patterns and the non-linear diffusion process that generates the known ratings. This neural network system requires a constant number of parameters independent of the matrix size. We apply our method on both synthetic and real datasets, showing that it outperforms state-of-the-art techniques.

1704.05249 2026-06-04 cs.LG cs.NI cs.SY eess.SY 版本更新

Hot or not? Forecasting cellular network hot spots using sector performance indicators

热点与否?利用扇区性能指标预测蜂窝网络热点

Joan Serrà, Ilias Leontiadis, Alexandros Karatzoglou, Konstantina Papagiannaki

AI总结 本文研究蜂窝网络热点评分的时空模式,利用树形机器学习模型预测热点,发现树模型在预测常规和非常规热点时分别提升14%和153%的准确性。

Comments Accepted for publication at ICDE 2017 - Industrial Track

详情
AI中文摘要

为管理维护大规模蜂窝网络,运营商需了解何时哪些扇区表现不佳。为此,他们使用所谓的热点评分,即多种网络测量的组合结果,反映单个扇区的即时整体性能。尽管运营商对网络当前性能和整体趋势有良好理解,但预测每个扇区随时间的变化却极具挑战性,因为其受常规和非常规事件影响,由人类行为和硬件故障触发。本文研究热点评分的时空模式,揭示其规律性。基于观察,我们探索利用近期测量历史预测未来热点的可能性。为此,我们考虑基于树的机器学习模型,并研究其性能随时间、历史数据量和预测时间跨度的变化。结果表明,与最佳基线相比,树模型在预测常规热点时可提升14%,在预测非常规热点时可提升153%。后者为中等时间跨度内预测孤立、非常规行为的热点提供了有力证据。整体而言,本文为蜂窝扇区动态及其可预测性提供了见解,并为更具前瞻性的网络运营和更长的预测时间跨度铺平了道路。

英文摘要

To manage and maintain large-scale cellular networks, operators need to know which sectors underperform at any given time. For this purpose, they use the so-called hot spot score, which is the result of a combination of multiple network measurements and reflects the instantaneous overall performance of individual sectors. While operators have a good understanding of the current performance of a network and its overall trend, forecasting the performance of each sector over time is a challenging task, as it is affected by both regular and non-regular events, triggered by human behavior and hardware failures. In this paper, we study the spatio-temporal patterns of the hot spot score and uncover its regularities. Based on our observations, we then explore the possibility to use recent measurements' history to predict future hot spots. To this end, we consider tree-based machine learning models, and study their performance as a function of time, amount of past data, and prediction horizon. Our results indicate that, compared to the best baseline, tree-based models can deliver up to 14% better forecasts for regular hot spots and 153% better forecasts for non-regular hot spots. The latter brings strong evidence that, for moderate horizons, forecasts can be made even for sectors exhibiting isolated, non-regular behavior. Overall, our work provides insight into the dynamics of cellular sectors and their predictability. It also paves the way for more proactive network operations with greater forecasting horizons.

1611.05317 2026-06-04 cs.LG cs.SY eess.SY 版本更新

A Learning Scheme for Microgrid Islanding and Reconnection

微电网孤岛与重新连接的学习方案

Carter Lassetter, Eduardo Cotilla-Sanchez, Jinsub Kim

AI总结 本文提出一种学习方案,通过实时数据预测微电网重新连接到主电网的稳定性,利用支持向量机和动态模拟器提高预测准确性。

Comments 10 pages, 5 figures

详情
AI中文摘要

本文介绍了一种潜在的学习方案,能够动态预测子网络重新连接到主电网时的稳定性。随着未来电力系统趋向更智能、更绿色的技术,自给自足的网络(即微电网)的部署变得更有可能。微电网可以独立运行,也可以与主电网同步运行,因此控制方法需要考虑这些网络的孤岛运行和重新连接。目前,人们对如何最优且安全地重新连接部分电网尚缺乏充分理解,现有做法仅限于互连点之间的粗略同步。本文提出利用来自相量测量单元(PMU)实时数据的支持向量机(SVM),实时预测子网络重新连接到主电网会导致稳定还是不稳定。我们使用输入了预先获取的系统参数的动态模拟器,为SVM生成各种运行状态下的训练数据。分类器在多种案例和运行点上进行了测试,以确保多样性。在对给定网络进行动态预测时,大多数条件下的准确率约为85%。

英文摘要

This paper introduces a potential learning scheme that can dynamically predict the stability of the reconnection of sub-networks to a main grid. As the future electrical power systems tend towards smarter and greener technology, the deployment of self sufficient networks, or microgrids, becomes more likely. Microgrids may operate on their own or synchronized with the main grid, thus control methods need to take into account islanding and reconnecting of said networks. The ability to optimally and safely reconnect a portion of the grid is not well understood and, as of now, limited to raw synchronization between interconnection points. A support vector machine (SVM) leveraging real-time data from phasor measurement units (PMUs) is proposed to predict in real time whether the reconnection of a sub-network to the main grid would lead to stability or instability. A dynamics simulator fed with pre-acquired system parameters is used to create training data for the SVM in various operating states. The classifier was tested on a variety of cases and operating points to ensure diversity. Accuracies of approximately 85% were observed throughout most conditions when making dynamic predictions of a given network.

1607.07837 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA stat.ML 版本更新

First Efficient Convergence for Streaming k-PCA: a Global, Gap-Free, and Near-Optimal Rate

流式k-PCA的首次高效收敛:全局、无间隙且近最优速率

Zeyuan Allen-Zhu, Yuanzhi Li

AI总结 本文研究流式PCA,提出改进的Oja算法变体Oja++,在O(dk)空间内实现全局收敛和无间隙收敛,匹配信息理论下限。

Comments REMARK: v4 adds discussions and polishes writing; v3 contains a stronger Theorem 2, a new lower bound Theorem 6, as well as new Oja++ results Theorem 4 and Theorem 5

详情
AI中文摘要

我们研究流式主成分分析(PCA),即在$O(dk)$空间内,根据从协方差矩阵$\bf Σ$中抽取的在线向量,找到$d\times d$隐藏矩阵$\bf Σ$的前$k$个特征向量。我们为Oja算法给出了全局收敛性保证,该算法在实践中被广泛使用,但在$k>1$时缺乏理论理解。我们还提出了改进的变体$\mathsf{Oja}^{++}$,其运行速度比Oja算法更快。我们的结果在对误差、特征间隙、秩$k$和维度$d$的依赖关系上与信息论下界相匹配,至多相差多对数因子。此外,我们的收敛速率可以做到无间隙,即与近似误差成正比而不依赖特征间隙。相比之下,对于一般的秩$k$,在我们的工作之前,(1)设计任何具有高效全局收敛速率的算法仍是公开问题;(2)在$O(dk)$空间内设计任何具有(即使是局部)无间隙收敛速率的算法也是公开问题。

英文摘要

We study streaming principal component analysis (PCA), that is to find, in $O(dk)$ space, the top $k$ eigenvectors of a $d\times d$ hidden matrix $\bf Σ$ with online vectors drawn from covariance matrix $\bf Σ$. We provide $\textit{global}$ convergence for Oja's algorithm which is popularly used in practice but lacks theoretical understanding for $k>1$. We also provide a modified variant $\mathsf{Oja}^{++}$ that runs $\textit{even faster}$ than Oja's. Our results match the information theoretic lower bound in terms of dependency on error, on eigengap, on rank $k$, and on dimension $d$, up to poly-log factors. In addition, our convergence rate can be made gap-free, that is proportional to the approximation error and independent of the eigengap. In contrast, for general rank $k$, before our work (1) it was open to design any algorithm with efficient global convergence rate; and (2) it was open to design any algorithm with (even local) gap-free convergence rate in $O(dk)$ space.
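For readers unfamiliar with Oja's algorithm, the following is a minimal Python sketch of the classical update for the simplest case $k=1$: each incoming vector $x$ nudges the estimate $w$ by $\eta\, x x^\top w$, followed by renormalization. It is not the paper's $\mathsf{Oja}^{++}$ variant or its rank-$k$ analysis. The step-size schedule, the deterministic initialization, and the helper names `oja_top_eigenvector` and `gaussian_stream` are assumptions made for illustration.

```python
import math
import random


def oja_top_eigenvector(samples, dim, step=lambda t: 1.0 / (t + 10.0)):
    """Streaming estimate of the top eigenvector using O(dim) memory (k = 1)."""
    w = [1.0 / math.sqrt(dim)] * dim  # unit-norm start (an illustrative choice)
    for t, x in enumerate(samples):
        proj = sum(xi * wi for xi, wi in zip(x, w))  # x^T w
        # Oja update: w <- w + eta_t * x * (x^T w)
        w = [wi + step(t) * proj * xi for wi, xi in zip(w, x)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        w = [wi / norm for wi in w]  # project back onto the unit sphere
    return w


def gaussian_stream(n, scales, seed=0):
    """Yield n samples whose covariance is diag(scales[i] ** 2)."""
    rng = random.Random(seed)
    for _ in range(n):
        yield [s * rng.gauss(0.0, 1.0) for s in scales]


# The covariance diag(4, 1, 0.25) has its top eigenvector along the first axis.
w = oja_top_eigenvector(gaussian_stream(5000, [2.0, 1.0, 0.5]), dim=3)
print([round(abs(c), 3) for c in w])
```

Only the current estimate `w` is stored, which is what keeps the memory footprint at $O(d)$ here, and $O(dk)$ in the rank-$k$ setting the abstract discusses.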

1605.07367 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Riemannian stochastic variance reduced gradient on Grassmann manifold

格拉斯曼流形上的黎曼随机方差缩减梯度算法

Hiroyuki Kasai, Hiroyuki Sato, Bamdev Mishra

AI总结 本文将欧几里得随机方差缩减梯度算法推广到紧流形搜索空间,提出黎曼版本R-SVRG,并以格拉斯曼流形为例展开,解决了多个梯度的平均、加法和减法问题,并分析了衰减步长下的全局收敛性和固定步长下的局部收敛速率。

详情
AI中文摘要

随机方差缩减算法近年来在最小化数量庞大但有限的损失函数的平均值方面变得流行。本文提出了欧几里得随机方差缩减梯度算法在紧流形搜索空间上的一种新颖的黎曼扩展(R-SVRG)。为此,我们以格拉斯曼流形为例展开推导。多个梯度的平均、加法和减法是关键挑战,我们借助格拉斯曼流形上的对数映射和向量的平行移动等概念加以解决。我们给出了所提算法在衰减步长下的全局收敛性分析,以及在一些自然假设下固定步长时的局部收敛速率分析。所提算法被应用于格拉斯曼流形上的多个问题,如主成分分析、低秩矩阵补全和Karcher均值计算。在所有这些情况下,所提算法都优于标准的黎曼随机梯度下降算法。

英文摘要

Stochastic variance reduction algorithms have recently become popular for minimizing the average of a large, but finite, number of loss functions. In this paper, we propose a novel Riemannian extension of the Euclidean stochastic variance reduced gradient algorithm (R-SVRG) to a compact manifold search space. To this end, we show the developments on the Grassmann manifold. The key challenges of averaging, addition, and subtraction of multiple gradients are addressed with notions like logarithm mapping and parallel translation of vectors on the Grassmann manifold. We present a global convergence analysis of the proposed algorithm with decay step-sizes and a local convergence rate analysis under fixed step-size with some natural assumptions. The proposed algorithm is applied on a number of problems on the Grassmann manifold like principal components analysis, low-rank matrix completion, and the Karcher mean computation. In all these cases, the proposed algorithm outperforms the standard Riemannian stochastic gradient descent algorithm.
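For context, the Euclidean SVRG step that the abstract generalizes can be written as below. This is the standard textbook form, not the paper's own notation; the symbols $\tilde{w}$ (snapshot point), $\eta$ (step size), and $i_t$ (sampled index) are assumptions introduced here.

```latex
% Euclidean SVRG step for f(w) = \frac{1}{N} \sum_{n=1}^{N} f_n(w),
% with snapshot point \tilde{w} and a uniformly sampled index i_t:
\xi_t = \nabla f_{i_t}(w_{t-1}) - \nabla f_{i_t}(\tilde{w}) + \nabla f(\tilde{w}),
\qquad
w_t = w_{t-1} - \eta\, \xi_t .
```

On a manifold the three gradients in $\xi_t$ live in different tangent spaces, so they cannot be added or subtracted directly. This is the difficulty the abstract says is addressed with the logarithm map and parallel translation.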

1610.05792 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Big Batch SGD: Automated Inference using Adaptive Batch Sizes

大批次SGD:利用自适应批次大小进行自动化推断

Soham De, Abhay Yadav, David Jacobs, Tom Goldstein

AI总结 本文提出大批次SGD方法,随时间自适应增大批次大小,使梯度近似的信噪比保持近似恒定;该方法的收敛速率与经典SGD相近,不要求目标函数为凸,并支持自动学习率选择。

Comments A preliminary version of this paper appears in AISTATS 2017 (International Conference on Artificial Intelligence and Statistics)

详情
AI中文摘要

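经典的随机梯度优化方法依赖于带噪声的梯度近似,随着迭代点接近解,这些近似会变得越来越不准确。由此得到的梯度噪声大、信号小,难以用于自适应步长选择和自动停止。我们提出了替代的“大批次”SGD方案,随时间自适应地增大批次大小,使梯度近似的信噪比保持近似恒定。所得方法的收敛速率与经典SGD相近,且不要求目标函数为凸。高保真的梯度使自动学习率选择成为可能,且无需步长衰减。因此,大批次方法易于自动化,只需很少监督甚至无需监督即可运行。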

英文摘要

Classical stochastic gradient methods for optimization rely on noisy gradient approximations that become progressively less accurate as iterates approach a solution. The large noise and small signal in the resulting gradients makes it difficult to use them for adaptive stepsize selection and automatic stopping. We propose alternative "big batch" SGD schemes that adaptively grow the batch size over time to maintain a nearly constant signal-to-noise ratio in the gradient approximation. The resulting methods have similar convergence rates to classical SGD, and do not require convexity of the objective. The high fidelity gradients enable automated learning rate selection and do not require stepsize decay. Big batch methods are thus easily automated and can run with little or no oversight.
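The idea of growing the batch to hold the signal-to-noise ratio roughly constant can be illustrated with a toy one-dimensional problem. The sketch below is an assumption-laden illustration, not the authors' algorithm. The acceptance test (squared batch gradient at least as large as the estimated variance of the batch mean), the doubling rule, the fixed learning rate, and the name `big_batch_sgd` are all hypothetical choices.

```python
import random
import statistics


def big_batch_sgd(data, x0=0.0, lr=0.5, batch=4, iters=30, seed=0):
    """Minimise mean_i 0.5 * (x - a_i)^2, growing the batch when noise dominates."""
    rng = random.Random(seed)
    x, sizes = x0, []
    for _ in range(iters):
        while True:
            grads = [x - a for a in rng.sample(data, batch)]  # per-sample gradients
            g = statistics.fmean(grads)                 # batch gradient (signal)
            noise = statistics.variance(grads) / batch  # variance of the batch mean
            if g * g >= noise or batch >= len(data):
                break
            batch = min(2 * batch, len(data))  # grow the batch; it never shrinks
        sizes.append(batch)
        x -= lr * g
    return x, sizes


data_rng = random.Random(1)
data = [3.0 + data_rng.gauss(0.0, 1.0) for _ in range(512)]
x, sizes = big_batch_sgd(data)
print(round(x, 3), sizes[0], sizes[-1])
```

Far from the minimizer the gradient is large and small batches pass the test. Near the minimizer the signal shrinks, so the batch must grow for the gradient estimate to stay informative.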

1704.01265 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT math.OC 版本更新

Geometry of Factored Nuclear Norm Regularization

因子核范数正则化的几何学

Qiuwei Li, Zhihui Zhu, Gongguo Tang

AI总结 研究因子化核范数正则化在机器学习、信号处理中的应用,通过几何结构分析提升优化算法效率。

详情
AI中文摘要

本文研究了非凸重构的几何结构,该重构用于最小化一般凸损失函数$f(X)$并采用矩阵核范数$\|X\|_*$进行正则化。核范数正则化的矩阵反问题在机器学习、信号处理和控制领域中占据核心地位。文献中广泛研究了核范数正则化的统计性能,使用凸分析技术进行分析。尽管其最优性能,当使用标准或甚至定制的快速凸求解器求解时,所得到的优化问题计算复杂度较高。为了开发更快且更可扩展的算法,我们遵循Burer-Monteiro的建议,将矩阵变量$X$分解为两个较小的矩形矩阵$X=UV^T$的乘积,并且将核范数$\|X\|_*$替换为$(\|U\|_F^2+\|V\|_F^2)/2$。尽管分解后的公式是非凸的,但我们证明当凸损失函数$f(X)$是$(2r,4r)$-受限良好条件时,分解问题的每个临界点要么对应于原始凸优化的最优解$X^\star$,要么是一个严格鞍点,其中Hessian矩阵有一个严格负的特征值。这种分解后的几何结构允许许多局部搜索算法在随机初始化下收敛到全局最优解。

英文摘要

This work investigates the geometry of a nonconvex reformulation of minimizing a general convex loss function $f(X)$ regularized by the matrix nuclear norm $\|X\|_*$. Nuclear-norm regularized matrix inverse problems are at the heart of many applications in machine learning, signal processing, and control. The statistical performance of nuclear norm regularization has been studied extensively in literature using convex analysis techniques. Despite its optimal performance, the resulting optimization has high computational complexity when solved using standard or even tailored fast convex solvers. To develop faster and more scalable algorithms, we follow the proposal of Burer-Monteiro to factor the matrix variable $X$ into the product of two smaller rectangular matrices $X=UV^T$ and also replace the nuclear norm $\|X\|_*$ with $(\|U\|_F^2+\|V\|_F^2)/2$. In spite of the nonconvexity of the factored formulation, we prove that when the convex loss function $f(X)$ is $(2r,4r)$-restricted well-conditioned, each critical point of the factored problem either corresponds to the optimal solution $X^\star$ of the original convex optimization or is a strict saddle point where the Hessian matrix has a strictly negative eigenvalue. Such a geometric structure of the factored formulation allows many local search algorithms to converge to the global optimum with random initializations.
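The substitution described in the abstract can be summarized as follows. The regularization weight $\lambda$ is an assumed symbol, and the identity on the right is the standard variational characterization of the nuclear norm, which holds when the inner dimension of $U$ and $V$ is at least $\operatorname{rank}(X)$.

```latex
% Convex problem and its Burer-Monteiro style factored counterpart:
\min_{X}\; f(X) + \lambda \|X\|_*
\quad\longrightarrow\quad
\min_{U,V}\; f(UV^T) + \frac{\lambda}{2}\left(\|U\|_F^2 + \|V\|_F^2\right),
\qquad
\|X\|_* = \min_{X = UV^T} \frac{1}{2}\left(\|U\|_F^2 + \|V\|_F^2\right).
```

The factored problem has far fewer variables when the rank is small, at the price of nonconvexity. The abstract's result concerns the critical points of this factored objective.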

1703.09800 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Disruptive Event Classification using PMU Data in Distribution Networks

利用PMU数据在配电网中进行扰动事件分类

Iman Niazazari, Hanif Livani

AI总结 本文提出基于PMU数据的框架,用于区分配电网中的扰动事件,通过PCA与SVM及自动编码器与softmax分类器实现高准确率的事件分类。

Comments 5 pages, 5 figures, conference

详情
AI中文摘要

配电网中高采样率的先进计量设备(如微型相量测量单元,μPMU)日益普及,为广域监测和诊断应用(如态势感知、配电资产健康监测)提供了前所未有的潜力。意外的扰动事件会中断配电网资产的正常运行,久而久之可能导致永久性故障和高昂的更换成本。因此,扰动事件分类可为配电网资产的预防性维护提供有用信息。预防性维护在节省时间、避免意外停电、合理调配维护人员以及降低设备更换成本等方面具有广泛的益处。本文提出了一种PMU数据驱动的框架,用于对配电网中的扰动事件进行分类。本文考虑两类扰动事件,即故障电容器组投切和故障调压器有载分接开关(OLTC)切换,并将其与配电网中正常的负荷突变区分开来。通过在IEEE 13节点配电网络中对这些事件进行仿真,验证了所提框架的性能。事件分类采用两种不同的算法实现:i)主成分分析(PCA)结合多类支持向量机(SVM);ii)自动编码器结合softmax分类器。结果表明了所提算法的有效性,分类准确率令人满意。

英文摘要

Proliferation of advanced metering devices with high sampling rates in distribution grids, e.g., micro-phasor measurement units (μPMU), provides unprecedented potentials for wide-area monitoring and diagnostic applications, e.g., situational awareness, health monitoring of distribution assets. Unexpected disruptive events interrupting the normal operation of assets in distribution grids can eventually lead to permanent failure with expensive replacement cost over time. Therefore, disruptive event classification provides useful information for preventive maintenance of the assets in distribution networks. Preventive maintenance provides wide range of benefits in terms of time, avoiding unexpected outages, maintenance crew utilization, and equipment replacement cost. In this paper, a PMU-data-driven framework is proposed for classification of disruptive events in distribution networks. The two disruptive events, i.e., malfunctioned capacitor bank switching and malfunctioned regulator on-load tap changer (OLTC) switching are considered and distinguished from the normal abrupt load change in distribution grids. The performance of the proposed framework is verified using the simulation of the events in the IEEE 13-bus distribution network. The event classification is formulated using two different algorithms as; i) principal component analysis (PCA) together with multi-class support vector machine (SVM), and ii) autoencoder along with softmax classifier. The results demonstrate the effectiveness of the proposed algorithms and satisfactory classification accuracies.

1701.00573 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Robust method for finding sparse solutions to linear inverse problems using an L2 regularization

使用L2正则化求线性逆问题稀疏解的稳健方法

Gonzalo H Otazu

AI总结 本文提出了一种基于生物启发算法的稳健方法,通过L2正则化在过完备字典中寻找稀疏解,具有对噪声的强鲁棒性。

Comments 13 pages, 6 figures. Code available

详情
AI中文摘要

我们分析了一种名为修正投影算法(Corrected Projections Algorithm,CPA)的生物启发算法的性能,所考虑的情形是:需要稀疏性约束才能利用过完备字典中的原子无歧义地重建观测信号。通过改变估计问题的几何结构,CPA利用L2正则项给出了一个二元变量的解析表达式,该变量指示某个字典原子是否存在。正则化解可以通过一种高效的实时卡尔曼滤波类算法实现。CPA所用的L2正则化更为平滑,使其对噪声非常鲁棒;当信号中存在较强的新原子时,CPA在识别已知原子方面优于其他方法。

英文摘要

We analyzed the performance of a biologically inspired algorithm called the Corrected Projections Algorithm (CPA) when a sparseness constraint is required to unambiguously reconstruct an observed signal using atoms from an overcomplete dictionary. By changing the geometry of the estimation problem, CPA gives an analytical expression for a binary variable that indicates the presence or absence of a dictionary atom using an L2 regularizer. The regularized solution can be implemented using an efficient real-time Kalman-filter type of algorithm. The smoother L2 regularization of CPA makes it very robust to noise, and CPA outperforms other methods in identifying known atoms in the presence of strong novel atoms in the signal.

1605.06848 2026-06-04 cs.CC cs.LG cs.NA math.NA 版本更新

Nonnegative Matrix Factorization Requires Irrationality

非负矩阵分解需要无理数

Dmitry Chistikov, Stefan Kiefer, Ines Marušić, Mahsa Shirmohammadi, James Worrell

AI总结 研究证明非负矩阵分解中,即使输入矩阵为有理矩阵,最小内维分解的因子矩阵也可能必须包含无理数元素,从而否定回答了Cohen和Rothblum于1993年提出的公开问题。

Comments Journal version, to appear in the SIAM Journal on Applied Algebra and Geometry (SIAGA)

详情
AI中文摘要

非负矩阵分解(NMF)是将给定的非负n×m矩阵M分解为非负n×d矩阵W和非负d×m矩阵H的产品。自1993年Cohen和Rothblum提出问题以来,长期存在的开放性问题在于是否所有有理数矩阵M的最小内维d的NMF因子W和H也都是有理数。我们通过展示一个矩阵,其因子W和H需要无理数元素,回答了这个问题的否定。

英文摘要

Nonnegative matrix factorization (NMF) is the problem of decomposing a given nonnegative $n \times m$ matrix $M$ into a product of a nonnegative $n \times d$ matrix $W$ and a nonnegative $d \times m$ matrix $H$. A longstanding open question, posed by Cohen and Rothblum in 1993, is whether a rational matrix $M$ always has an NMF of minimal inner dimension $d$ whose factors $W$ and $H$ are also rational. We answer this question negatively, by exhibiting a matrix for which $W$ and $H$ require irrational entries.
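
To make the definition concrete, the following minimal sketch checks a nonnegative factorization $M = WH$ with inner dimension $d = 2$ in exact rational arithmetic. The matrices and the helper names `matmul` and `is_nonnegative` are illustrative assumptions; they are unrelated to the counterexample constructed in the paper, which is not reproduced here.

```python
from fractions import Fraction as F

def matmul(W, H):
    """Multiply an n x d matrix by a d x m matrix (both given as lists of rows)."""
    return [[sum(W[i][l] * H[l][j] for l in range(len(H)))
             for j in range(len(H[0]))] for i in range(len(W))]

def is_nonnegative(A):
    """True if every entry of A is >= 0."""
    return all(x >= 0 for row in A for x in row)

# Hypothetical rational factors: W is 3 x 2, H is 2 x 3.
W = [[F(1), F(0)], [F(1, 2), F(1, 2)], [F(0), F(1)]]
H = [[F(2), F(0), F(1)], [F(0), F(4), F(1)]]
M = matmul(W, H)  # a 3 x 3 rational, nonnegative matrix
```

Here both factors happen to be rational; the result summarized above says that, for some rational $M$, no factorization of minimal inner dimension has this property.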

1703.06327 2026-06-04 stat.ML cs.DS cs.LG cs.NA math.NA version update

Spectrum Estimation from a Few Entries

Ashish Khetan, Sewoong Oh

AI summary This paper studies recovering spectral properties of a matrix from a sample of its entries. It proposes estimating Schatten norms and then applying Chebyshev approximation or moment matching in Wasserstein distance to recover the singular values; the theoretical analysis shows that this requires fewer samples than recovering the low-rank matrix.

Comments 52 pages; 15 figures

Abstract

Singular values of data in matrix form provide insights into the structure of the data, the effective dimensionality, and the choice of hyper-parameters for higher-level data analysis tools. However, in many practical applications such as collaborative filtering and network analysis, we only get a partial observation. Under such scenarios, we consider the fundamental problem of recovering spectral properties of the underlying matrix from a sampling of its entries. We are particularly interested in directly recovering the spectrum, which is the set of singular values, and also in sample-efficient approaches for recovering a spectral sum function, which is an aggregate sum of the same function applied to each of the singular values. We propose first estimating the Schatten $k$-norms of a matrix, and then applying Chebyshev approximation to the spectral sum function or applying moment matching in Wasserstein distance to recover the singular values. The main technical challenge is in accurately estimating the Schatten norms from a sampling of a matrix. We introduce a novel unbiased estimator based on counting small structures in a graph and provide guarantees that match its empirical performance. Our theoretical analysis shows that Schatten norms can be recovered accurately from a strictly smaller number of samples than is needed to recover the underlying low-rank matrix. Numerical experiments suggest that we significantly improve upon a competing approach of using matrix completion methods.
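
For reference, the Schatten $k$-norm satisfies $\|A\|_k^k = \sum_i \sigma_i^k$, and for even $k$ this equals $\mathrm{tr}((A^\top A)^{k/2})$. The sketch below evaluates that identity on a fully observed matrix. It only illustrates the quantity being estimated, not the paper's estimator from sampled entries, and the function names are assumptions.

```python
def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][l] * B[l][j] for l in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(col) for col in zip(*A)]

def schatten_power(A, k):
    """Return ||A||_k^k = sum_i sigma_i^k for even k, via tr((A^T A)^(k/2))."""
    if k <= 0 or k % 2:
        raise ValueError("this sketch only handles positive even k")
    gram = matmul(transpose(A), A)
    power = gram
    for _ in range(k // 2 - 1):
        power = matmul(power, gram)
    return sum(power[i][i] for i in range(len(power)))

A = [[1, 2], [3, 4]]
frobenius_squared = schatten_power(A, 2)  # k = 2 gives the squared Frobenius norm
fourth_power = schatten_power(A, 4)
```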

1611.07305 2026-06-04 cs.LG cs.DS cs.NA math.NA version update

Correlation Clustering with Low-Rank Matrices

Nate Veldt, Anthony Wirth, David F. Gleich

AI summary This paper studies how to solve correlation clustering exactly when the data can be represented by a low-rank matrix. It proves that the problem is solvable in polynomial time for positive semidefinite matrices of small constant rank but remains NP-hard in the presence of even one negative eigenvalue, and it proposes an efficient algorithm based on zonotope vertex enumeration.

Abstract

Correlation clustering is a technique for aggregating data based on qualitative information about which pairs of objects are labeled 'similar' or 'dissimilar.' Because the optimization problem is NP-hard, much of the previous literature focuses on finding approximation algorithms. In this paper we explore how to solve the correlation clustering objective exactly when the data to be clustered can be represented by a low-rank matrix. We prove in particular that correlation clustering can be solved in polynomial time when the underlying matrix is positive semidefinite with small constant rank, but that the task remains NP-hard in the presence of even one negative eigenvalue. Based on our theoretical results, we develop an algorithm for efficiently "solving" low-rank positive semidefinite correlation clustering by employing a procedure for zonotope vertex enumeration. We demonstrate the effectiveness and speed of our algorithm by using it to solve several clustering problems on both synthetic and real-world data.

1703.05486 2026-06-04 eess.SY cs.LG cs.SY version update

Using Reinforcement Learning for Demand Response of Domestic Hot Water Buffers: a Real-Life Demonstration

Oscar De Somer, Ana Soares, Tristan Kuijpers, Koen Vossen, Koen Vanthournout, Fred Spiessens

AI summary This paper presents a data-driven control approach for demand response in real-life residential buildings, which schedules the heating cycles of domestic hot water buffers to maximize self-consumption of local photovoltaic production.

Comments Submitted to IEEE ISGT Europe 2017

Abstract

This paper demonstrates a data-driven control approach for demand response in real-life residential buildings. The objective is to optimally schedule the heating cycles of the Domestic Hot Water (DHW) buffer to maximize the self-consumption of the local photovoltaic (PV) production. A model-based reinforcement learning technique is used to tackle the underlying sequential decision-making problem. The proposed algorithm learns the stochastic occupant behavior, predicts the PV production and takes into account the dynamics of the system. A real-life experiment with six residential buildings is performed using this algorithm. The results show that the self-consumption of the PV production is significantly increased, compared to the default thermostat control.

1502.00182 2026-06-04 math.NA cs.DS cs.LG cs.NA stat.ML version update

High Dimensional Low Rank plus Sparse Matrix Decomposition

Mostafa Rahmani, George Atia

AI summary This paper proposes a scalable subspace-pursuit approach that transforms the matrix decomposition problem into a subspace learning problem and carries out the decomposition on a small data sketch; adaptive sampling algorithms improve efficiency on structured data.

Comments IEEE Transactions on Signal Processing

Abstract

This paper is concerned with the problem of low rank plus sparse matrix decomposition for big data. Conventional algorithms for matrix decomposition use the entire data to extract the low-rank and sparse components, and are based on optimization problems with complexity that scales with the dimension of the data, which limits their scalability. Furthermore, existing randomized approaches mostly rely on uniform random sampling, which is quite inefficient for many real world data matrices that exhibit additional structures (e.g. clustering). In this paper, a scalable subspace-pursuit approach that transforms the decomposition problem to a subspace learning problem is proposed. The decomposition is carried out using a small data sketch formed from sampled columns/rows. Even when the data is sampled uniformly at random, it is shown that the sufficient number of sampled columns/rows is roughly O(rμ), where μ is the coherency parameter and r the rank of the low rank component. In addition, adaptive sampling algorithms are proposed to address the problem of column/row sampling from structured data. We provide an analysis of the proposed method with adaptive sampling and show that adaptive sampling makes the required number of sampled columns/rows invariant to the distribution of the data. The proposed approach is amenable to online implementation and an online scheme is proposed.

1703.04550 2026-06-04 cs.RO cs.LG cs.NE cs.SY eess.SY version update

Sensor Fusion for Robot Control through Deep Reinforcement Learning

Steven Bohez, Tim Verbelen, Elias De Coninck, Bert Vankeirsbilck, Pieter Simoens, Bart Dhoedt

AI summary This paper uses deep reinforcement learning to fuse information from multiple sensors for robot control, improving robustness and performance on a robotic search and pick task.

Comments 6 pages, 6 figures, submitted to IROS 2017

Abstract

Deep reinforcement learning is becoming increasingly popular for robot control algorithms, with the aim for a robot to self-learn useful feature representations from unstructured sensory input leading to the optimal actuation policy. In addition to sensors mounted on the robot, sensors might also be deployed in the environment, although these might need to be accessed via an unreliable wireless connection. In this paper, we demonstrate deep neural network architectures that are able to fuse information coming from multiple sensors and are robust to sensor failures at runtime. We evaluate our method on a search and pick task for a robot both in simulation and the real world.

1703.04219 2026-06-04 cs.LG cs.NA math.NA version update

SPARTan: Scalable PARAFAC2 for Large & Sparse Data

Ioakeim Perros, Evangelos E. Papalexakis, Fei Wang, Richard Vuduc, Elizabeth Searles, Michael Thompson, Jimeng Sun

AI summary This paper proposes SPARTan, a method for efficiently computing the PARAFAC2 decomposition of large and sparse data, which improves speed and memory efficiency and is validated on real medical data.

Abstract

In exploratory tensor mining, a common problem is how to analyze a set of variables across a set of subjects whose observations do not align naturally. For example, when modeling medical features across a set of patients, the number and duration of treatments may vary widely in time, meaning there is no meaningful way to align their clinical records across time points for analysis purposes. To handle such data, the state-of-the-art tensor model is the so-called PARAFAC2, which yields interpretable and robust output and can naturally handle sparse data. However, its main limitation up to now has been the lack of efficient algorithms that can handle large-scale datasets. In this work, we fill this gap by developing a scalable method to compute the PARAFAC2 decomposition of large and sparse datasets, called SPARTan. Our method exploits special structure within PARAFAC2, leading to a novel algorithmic reformulation that is both fast (in absolute time) and more memory-efficient than prior work. We evaluate SPARTan on both synthetic and real datasets, showing 22X performance gains over the best previous implementation and also handling larger problem instances for which the baseline fails. Furthermore, we are able to apply SPARTan to the mining of temporally-evolving phenotypes on data taken from real and medically complex pediatric patients. The clinical meaningfulness of the phenotypes identified in this process, as well as their temporal evolution over time for several patients, have been endorsed by clinical experts.

1703.02899 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML version update

Model-Based Policy Search for Automatic Tuning of Multivariate PID Controllers

Andreas Doerr, Duy Nguyen-Tuong, Alonso Marco, Stefan Schaal, Sebastian Trimpe

AI summary This paper proposes a model-based policy search framework for automatically tuning multivariate PID controllers, addressing controller tuning for complex systems in a data-driven way.

Comments Accepted final version to appear in 2017 IEEE International Conference on Robotics and Automation (ICRA)

Abstract

PID control architectures are widely used in industrial applications. Despite their low number of open parameters, tuning multiple, coupled PID controllers can become tedious in practice. In this paper, we extend PILCO, a model-based policy search framework, to automatically tune multivariate PID controllers purely based on data observed on an otherwise unknown system. The system's state is extended appropriately to frame the PID policy as a static state feedback policy. This renders PID tuning possible as the solution of a finite horizon optimal control problem without further a priori knowledge. The framework is applied to the task of balancing an inverted pendulum on a seven degree-of-freedom robotic arm, thereby demonstrating its capabilities of fast and data-efficient policy learning, even on complex real world problems.
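
The idea of framing a PID law as static state feedback can be sketched for a single loop as follows: the error is augmented with its running integral and its finite-difference derivative, and the control is a fixed linear map $u = Kz$ of that augmented state $z$. This is a minimal discrete-time illustration under assumed names and discretization; the paper's exact state augmentation and its multivariate PILCO extension are not shown.

```python
def pid_as_state_feedback(gains, errors, dt):
    """Apply u = K z with augmented state z = (error, integral, derivative)."""
    kp, ki, kd = gains
    integral = 0.0
    prev_error = errors[0]  # assumption: the derivative is zero at the first step
    controls = []
    for e in errors:
        integral += e * dt
        derivative = (e - prev_error) / dt
        z = (e, integral, derivative)                        # augmented state
        controls.append(kp * z[0] + ki * z[1] + kd * z[2])   # static linear feedback
        prev_error = e
    return controls

controls = pid_as_state_feedback((2.0, 1.0, 0.5), [1.0, 0.5], dt=0.5)
```

Because the gains enter only through the fixed row vector $K = (k_p, k_i, k_d)$, tuning them becomes a search over the parameters of a linear policy.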

1703.02810 2026-06-04 cs.AI cs.LG cs.SY eess.SY version update

An Integrated and Scalable Platform for Proactive Event-Driven Traffic Management

Alain Kibangou, Alexander Artikis, Evangelos Michelioudakis, Georgios Paliouras, Marius Schmitt, John Lygeros, Chris Baber, Natan Morar, Fabiana Fournier, Inna Skarbovsky

AI summary This paper presents an integrated platform that uses an event-driven approach to predict congestion and make traffic management more efficient.

Abstract

Traffic on freeways can be managed by means of ramp meters from Road Traffic Control rooms. Human operators cannot efficiently manage a network of ramp meters. To support them, we present an intelligent platform for traffic management which includes a new ramp metering coordination scheme in the decision making module, an efficient dashboard for interacting with human operators, machine learning tools for learning event definitions and Complex Event Processing tools able to deal with uncertainties inherent to the traffic use case. Unlike the usual approach, the devised event-driven platform is able to predict a congestion up to 4 minutes before it really happens. Proactive decision making can then be established leading to significant improvement of traffic conditions.

1605.06432 2026-06-04 stat.ML cs.LG cs.SY eess.SY version update

Deep Variational Bayes Filters: Unsupervised Learning of State Space Models from Raw Data

Maximilian Karl, Maximilian Soelch, Justin Bayer, Patrick van der Smagt

AI summary This paper proposes Deep Variational Bayes Filters, which use variational inference to handle intractable inference and learn state space models from raw data without supervision, improving long-term prediction.

Comments Published as a conference paper at ICLR 2017

Abstract

We introduce Deep Variational Bayes Filters (DVBF), a new method for unsupervised learning and identification of latent Markovian state space models. Leveraging recent advances in Stochastic Gradient Variational Bayes, DVBF can overcome intractable inference distributions via variational inference. Thus, it can handle highly nonlinear input data with temporal and spatial dependencies such as image sequences without domain knowledge. Our experiments show that enabling backpropagation through transitions enforces state space assumptions and significantly improves information content of the latent embedding. This also enables realistic long-term prediction.

1703.00847 2026-06-04 cs.LG cs.SY eess.SY version update

Exact Topology Reconstruction of Radial Dynamical Systems with Applications to Distribution System of the Power Grid

Saurav Talukdar, Deepjyoti Deka, Donatello Materassi, Murti V. Salapaka

AI summary This paper presents a method to reconstruct the interconnectedness of dynamically related stochastic processes and shows how to eliminate the spurious links produced by multivariate Wiener filtering. It proposes a three-stage network reconstruction procedure for tree topologies and validates it on a power distribution system.

Comments 6 pages

Abstract

In this article we present a method to reconstruct the interconnectedness of dynamically related stochastic processes, where the interactions are bi-directional and the underlying topology is a tree. Our approach is based on multivariate Wiener filtering, which recovers spurious edges in addition to the true edges in the topology reconstruction. The main contribution of this work is to show that all spurious links obtained using Wiener filtering can be eliminated if the underlying topology is a tree; based on this result, we present a three-stage network reconstruction procedure for trees. We illustrate the effectiveness of the method by applying it to a typical distribution system of the electric grid.

1703.00663 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC stat.ML version update

Introduction to Nonnegative Matrix Factorization

Nicolas Gillis

AI summary This paper introduces nonnegative matrix factorization, covering its applications, the geometry and uniqueness of solutions, complexity, and algorithms, and discusses its link with extended formulations of polyhedra.

Comments 18 pages, 4 figures

Journal ref
SIAG/OPT Views and News 25 (1), pp. 7-16 (2017)

Abstract

In this paper, we introduce and provide a short overview of nonnegative matrix factorization (NMF). Several aspects of NMF are discussed, namely, the application in hyperspectral imaging, geometry and uniqueness of NMF solutions, complexity, algorithms, and its link with extended formulations of polyhedra. In order to put NMF into perspective, the more general problem class of constrained low-rank matrix approximation problems is first briefly introduced.

1703.00084 2026-06-04 eess.SY cs.LG cs.SY stat.ML version update

Multi-Sensor Data Pattern Recognition for Multi-Target Localization: A Machine Learning Approach

Kasthurirengan Suresh, Samuel Silva, Johnathan Votion, Yongcan Cao

AI summary This paper proposes a learning approach to target localization that uses algorithms such as clustering and SVM to process multi-sensor data and improve the accuracy of multi-target localization.

Comments submitted for conference publication

Abstract

Data-target pairing is an important step towards multi-target localization for the intelligent operation of unmanned systems. Target localization plays a crucial role in numerous applications, such as search and rescue missions, traffic management and surveillance. The objective of this paper is to present an innovative target location learning approach, where several machine learning methods, including K-means clustering and support vector machines (SVM), are used to learn the data pattern across a list of spatially distributed sensors. To enable accurate data association across different sensors for accurate target localization, appropriate data pre-processing is essential, which is then followed by the application of different machine learning algorithms to appropriately group data from different sensors for the accurate localization of multiple targets. Through simulation examples, the performance of these machine learning algorithms is quantified and compared.

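1702.07834 2026-06-04 math.NA cs.LG cs.NA stat.ML version update

Efficient coordinate-wise leading eigenvector computation

Jialei Wang, Weiran Wang, Dan Garber, Nathan Srebro

AI summary This paper develops and analyzes efficient coordinate-wise methods for finding the leading eigenvector, where each step involves only a vector-vector product. The methods achieve global convergence with runtime guarantees at least as good as the Lanczos method, and outperform it when the spectrum decays slowly.

Abstract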

We develop and analyze efficient "coordinate-wise" methods for finding the leading eigenvector, where each step involves only a vector-vector product. We establish global convergence with overall runtime guarantees that are at least as good as Lanczos's method and dominate it for slowly decaying spectrum. Our methods are based on combining a shift-and-invert approach with coordinate-wise algorithms for linear regression.
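
The combination named in the abstract can be sketched as follows: power iteration on $(\lambda I - A)^{-1}$, with each linear system solved by cyclic coordinate descent, so that every coordinate update costs one vector-vector product. This is a minimal illustration rather than the paper's algorithm. It assumes a symmetric matrix and a shift `lam` already known to exceed the largest eigenvalue, and the function names and iteration counts are hypothetical.

```python
import math

def solve_coordinate_descent(B, b, sweeps=100):
    """Solve B w = b for symmetric positive definite B by cyclic coordinate descent."""
    n = len(b)
    w = [0.0] * n
    for _ in range(sweeps):
        for i in range(n):
            # Exact minimization of 0.5 w^T B w - b^T w over coordinate i.
            residual = b[i] - sum(B[i][j] * w[j] for j in range(n) if j != i)
            w[i] = residual / B[i][i]
    return w

def leading_eigenvector(A, lam, iters=30):
    """Power iteration on (lam*I - A)^{-1}; assumes lam > largest eigenvalue of A."""
    n = len(A)
    B = [[(lam if i == j else 0.0) - A[i][j] for j in range(n)] for i in range(n)]
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        w = solve_coordinate_descent(B, x)
        norm = math.sqrt(sum(v * v for v in w))
        x = [v / norm for v in w]
    return x

# Eigenvalues of A are 3, 1, 1; the leading eigenvector is (1, 1, 0) / sqrt(2).
A = [[2.0, 1.0, 0.0], [1.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
v = leading_eigenvector(A, lam=3.5)
```

Shifting close to the top eigenvalue makes the inverse's leading eigenvalue dominate (here 2 versus 0.4), which is why few outer iterations are needed; the cost moves into the linear solves.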

1702.06166 2026-06-04 stat.ML cs.LG cs.NA math.NA q-bio.GN q-bio.QM stat.ME version update

Bayesian Boolean Matrix Factorisation

Tammo Rukat, Chris C. Holmes, Michalis K. Titsias, Christopher Yau

AI summary This paper proposes a Boolean matrix factorisation method based on a probabilistic generative model, with efficient posterior inference via a Metropolised Gibbs sampler. It outperforms existing methods on real and simulated data and improves interpretability and practical value.

Abstract

Boolean matrix factorisation aims to decompose a binary data matrix into an approximate Boolean product of two low rank, binary matrices: one containing meaningful patterns, the other quantifying how the observations can be expressed as a combination of these patterns. We introduce the OrMachine, a probabilistic generative model for Boolean matrix factorisation and derive a Metropolised Gibbs sampler that facilitates efficient parallel posterior inference. On real world and simulated data, our method outperforms all currently existing approaches for Boolean matrix factorisation and completion. This is the first method to provide full posterior inference for Boolean Matrix factorisation which is relevant in applications, e.g. for controlling false positive rates in collaborative filtering and, crucially, improves the interpretability of the inferred patterns. The proposed algorithm scales to large datasets as we demonstrate by analysing single cell gene expression data in 1.3 million mouse brain cells across 11 thousand genes on commodity hardware.
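
The Boolean product underlying the factorisation replaces sums with logical OR and products with logical AND: $x_{ij} = \bigvee_l (z_{il} \wedge u_{lj})$. The snippet below illustrates it on small hand-picked matrices. The names `Z` and `U` and the orientation of the factors are assumptions for illustration, and the OrMachine's probabilistic model and sampler are not shown.

```python
def boolean_product(Z, U):
    """Boolean matrix product: entry (i, j) is OR over l of (Z[i][l] AND U[l][j])."""
    return [[int(any(Z[i][l] and U[l][j] for l in range(len(U))))
             for j in range(len(U[0]))] for i in range(len(Z))]

# Z: which patterns each observation uses; U: which features each pattern contains.
Z = [[1, 0], [0, 1], [1, 1]]
U = [[1, 1, 0, 0], [0, 1, 1, 0]]
X = boolean_product(Z, U)
```

The third observation uses both patterns, and the feature they share stays at 1 rather than summing to 2, which is what distinguishes the Boolean product from ordinary matrix multiplication.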

1602.07046 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

An Improved Gap-Dependency Analysis of the Noisy Power Method

Maria Florina Balcan, Simon S. Du, Yining Wang, Adams Wei Yu

AI summary This paper improves the spectral-gap dependency of the noisy power method by introducing an intermediate parameter q, tightening the sample complexity and noise tolerance bounds, with applications to distributed private PCA and memory-efficient streaming PCA.

Abstract

We consider the noisy power method algorithm, which has wide applications in machine learning and statistics, especially those related to principal component analysis (PCA) under resource (communication, memory or privacy) constraints. Existing analysis of the noisy power method shows an unsatisfactory dependency over the "consecutive" spectral gap $(σ_k-σ_{k+1})$ of an input data matrix, which could be very small and hence limits the algorithm's applicability. In this paper, we present a new analysis of the noisy power method that achieves improved gap dependency for both sample complexity and noise tolerance bounds. More specifically, we improve the dependency over $(σ_k-σ_{k+1})$ to dependency over $(σ_k-σ_{q+1})$, where $q$ is an intermediate algorithm parameter and could be much larger than the target rank $k$. Our proofs are built upon a novel characterization of proximity between two subspaces that differ from canonical angle characterizations analyzed in previous works. Finally, we apply our improved bounds to distributed private PCA and memory-efficient streaming PCA and obtain bounds that are superior to existing results in the literature.
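
As a rough illustration of the algorithm being analyzed, the sketch below runs subspace iteration $X \leftarrow \mathrm{orth}(AX + G)$ with a fresh noise matrix $G$ at every step, keeping $q$ columns. The synthetic Gaussian noise, the diagonal test matrix, and all names and parameter values are assumptions made for this example; the bounds and noise conditions from the paper are not reproduced.

```python
import math
import random

def orthonormalize(cols):
    """Gram-Schmidt on a list of column vectors."""
    basis = []
    for v in cols:
        for b in basis:
            dot = sum(x * y for x, y in zip(v, b))
            v = [x - dot * y for x, y in zip(v, b)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis

def noisy_power_method(A, q, iters, noise_std, rng):
    """Iterate X <- orth(A X + G) with Gaussian noise G; X holds q columns."""
    d = len(A)
    X = orthonormalize([[rng.gauss(0, 1) for _ in range(d)] for _ in range(q)])
    for _ in range(iters):
        Y = [[sum(A[i][j] * col[j] for j in range(d)) + rng.gauss(0, noise_std)
              for i in range(d)] for col in X]
        X = orthonormalize(Y)
    return X

# Diagonal test matrix: the top-2 subspace is spanned by the first two coordinates.
A = [[4.0, 0.0, 0.0, 0.0],
     [0.0, 3.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 0.5]]
X = noisy_power_method(A, q=2, iters=50, noise_std=1e-4, rng=random.Random(0))
# Energy of the recovered basis outside the true top-2 subspace.
leak = sum(col[2] ** 2 + col[3] ** 2 for col in X)
```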

1701.08074 2026-06-04 eess.SY cs.LG cs.SY version update

Model-Free Control of Thermostatically Controlled Loads Connected to a District Heating Network

Bert J. Claessens, Dirk Vanhoudt, Johan Desmedt, Frederik Ruelens

AI summary This paper proposes a model-free control approach based on reinforcement learning and a market-based multi-agent system for thermostatically controlled loads connected to a district heating network, achieving a significant performance improvement within a practical learning time.

Comments Under review at Elsevier: Energy and buildings 2017

Abstract

Optimal control of thermostatically controlled loads connected to a district heating network is considered a sequential decision-making problem under uncertainty. The practicality of a direct model-based approach is compromised by two challenges, namely scalability due to the large dimensionality of the problem and the system identification required to identify an accurate model. To help in mitigating these problems, this paper leverages recent developments in reinforcement learning in combination with a market-based multi-agent system to obtain a scalable solution that achieves a significant performance improvement in a practical learning time. The control approach is applied to a scenario comprising 100 thermostatically controlled loads connected to a radial district heating network supplied by a central combined heat and power plant. Both for an energy arbitrage and a peak shaving objective, the control approach requires 60 days to obtain a performance within 65% of a theoretical lower bound on the cost.

1611.03993 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

Riemannian Tensor Completion with Side Information

Tengfei Zhou, Hui Qian, Zebang Shen, Congfu Xu

AI summary This paper proposes a new Riemannian model that integrates the original model with side information to improve the efficiency and accuracy of low-rank tensor completion.

Abstract

By restricting the iterates to a nonlinear manifold, the recently proposed Riemannian optimization methods prove to be both efficient and effective in low rank tensor completion problems. However, existing methods fail to exploit the easily accessible side information, due to their format mismatch. Consequently, there is still room for improvement in such methods. To fill the gap, in this paper, a novel Riemannian model is proposed to organically integrate the original model and the side information by overcoming their inconsistency. For this particular model, an efficient Riemannian conjugate gradient descent solver is devised based on a new metric that captures the curvature of the objective. Numerical experiments suggest that our solver is more accurate than the state-of-the-art without compromising the efficiency.

1702.05548 2026-06-04 math.OC cs.LG cs.SY eess.SY version update

Bi-Level Online Control without Regret

Andrey Bernstein

AI summary This paper proposes a bi-level discrete-time control framework that combines online convex optimization with real-time control, using an algorithm with small dynamic regret and illustrating it on real-time control of power setpoints in an electrical grid.

Abstract

This paper considers a bi-level discrete-time control framework with real-time constraints, consisting of several local controllers and a central controller. The objective is to bridge the gap between the online convex optimization and real-time control literature by proposing an online control algorithm with small dynamic regret, which is a natural performance criterion in nonstationary environments related to real-time control problems. We illustrate how the proposed algorithm can be applied to real-time control of power setpoints in an electrical grid.

1702.01228 2026-06-04 cs.LG cs.SY eess.SY version update

A Learning-Based Approach for Lane Departure Warning Systems with a Personalized Driver Model

Wenshuo Wang, Ding Zhao, Junqiang Xi, Wei Han

AI summary This paper proposes a learning-based approach to lane departure warning that builds a personalized driver model by combining a Gaussian mixture model with a hidden Markov model, predicting driver behavior and reducing the false-warning rate.

Comments 12 pages, 13 figures, Journal

Abstract

Misunderstanding of driver correction behaviors (DCB) is the primary reason for false warnings of lane-departure-prediction systems. We propose a learning-based approach to predicting unintended lane-departure behaviors (LDB) and the chance for drivers to bring the vehicle back to the lane. First, in this approach, a personalized driver model for lane-departure and lane-keeping behavior is established by combining the Gaussian mixture model and the hidden Markov model. Second, based on this model, we develop an online model-based prediction algorithm to predict the forthcoming vehicle trajectory and judge whether the driver will demonstrate an LDB or a DCB. We also develop a warning strategy based on the model-based prediction algorithm that allows the lane-departure warning system to be acceptable for drivers according to the predicted trajectory. In addition, the naturalistic driving data of 10 drivers is collected through the University of Michigan Safety Pilot Model Deployment program to train the personalized driver model and validate this approach. We compare the proposed method with a basic time-to-lane-crossing (TLC) method and a TLC-directional sequence of piecewise lateral slopes (TLC-DSPLS) method. The results show that the proposed approach can reduce the false-warning rate to 3.07%.

1702.01205 2026-06-04 cs.AI cs.LG cs.SY eess.SY version update

Traffic Lights with Auction-Based Controllers: Algorithms and Real-World Data

Shumeet Baluja, Michele Covell, Rahul Sukthankar

AI summary This paper proposes an auction-based traffic light controller that incorporates traffic sensor information through micro-auctions, improving road capacity and mean travel time over currently deployed lights, optimized static-program lights, and longer-term planning approaches.

Abstract

Real-time optimization of traffic flow addresses important practical problems: reducing a driver's wasted time, improving city-wide efficiency, reducing gas emissions and improving air quality. Much of the current research in traffic-light optimization relies on extending the capabilities of traffic lights to either communicate with each other or communicate with vehicles. However, before such capabilities become ubiquitous, opportunities exist to improve traffic lights by being more responsive to current traffic situations within the current, already deployed, infrastructure. In this paper, we introduce a traffic light controller that employs bidding within micro-auctions to efficiently incorporate traffic sensor information; no other outside sources of information are assumed. We train and test traffic light controllers on large-scale data collected from opted-in Android cell-phone users over a period of several months in Mountain View, California and the River North neighborhood of Chicago, Illinois. The learned auction-based controllers surpass (in both the relevant metrics of road-capacity and mean travel time) the currently deployed lights, optimized static-program lights, and longer-term planning approaches, in both cities, measured using real user driving data.

1701.08757 2026-06-04 cs.LG cs.SY eess.SY stat.ML version update

Bayesian Learning of Consumer Preferences for Residential Demand Response

Mikhail V. Goubko, Sergey O. Kuznetsov, Alexey A. Neznanov, Dmitry I. Ignatov

AI summary This paper proposes a Bayesian learning algorithm that estimates a consumer's comfort level function from the history of home appliance use to support energy saving. It outperforms popular regression analysis tools and can be extended to control heating and air-conditioning systems.

Journal ref
IFAC-PapersOnLine, 49(32), 2016, p. 24-29, ISSN 2405-8963

Abstract

In coming years residential consumers will face real-time electricity tariffs with energy prices varying day to day, and effective energy saving will require automation - a recommender system, which learns consumer's preferences from her actions. A consumer chooses a scenario of home appliance use to balance her comfort level and the energy bill. We propose a Bayesian learning algorithm to estimate the comfort level function from the history of appliance use. In numeric experiments with datasets generated from a simulation model of a consumer interacting with small home appliances the algorithm outperforms popular regression analysis tools. Our approach can be extended to control an air heating and conditioning system, which is responsible for up to half of a household's energy bill.

1701.06652 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Convex Parameterizations and Fidelity Bounds for Nonlinear Identification and Reduced-Order Modelling

凸参数化与非线性识别和降阶建模的保真度界限

Mark M. Tobenkin, Ian R. Manchester, Alexandre Megretski

AI总结 本文提出基于凸优化的非线性识别方法,通过拉格朗日松弛、耗散不等式和半正定规划解决模型不稳定和长期预测问题,应用于电子电路降阶和气动执行器识别。

Comments Conditionally accepted to IEEE TAC

详情
AI中文摘要

本文提出基于凸优化的非线性识别方法,通过拉格朗日松弛、耗散不等式和半正定规划解决模型不稳定和长期预测问题,应用于电子电路降阶和气动执行器识别。

英文摘要

Model instability and poor prediction of long-term behavior are common problems when modeling dynamical systems using nonlinear "black-box" techniques. Direct optimization of the long-term predictions, often called simulation error minimization, leads to optimization problems that are generally non-convex in the model parameters and suffer from multiple local minima. In this work we present methods which address these problems through convex optimization, based on Lagrangian relaxation, dissipation inequalities, contraction theory, and semidefinite programming. We demonstrate the proposed methods with a model order reduction task for electronic circuit design and the identification of a pneumatic actuator from experiment.

1607.03463 2026-06-04 math.NA cs.DS cs.LG cs.NA math.OC stat.ML 版本更新

LazySVD: Even Faster SVD Decomposition Yet Without Agonizing Pain

LazySVD:更快且无需痛苦的SVD分解

Zeyuan Allen-Zhu, Yuanzhi Li

AI总结 本文提出LazySVD框架,改进了k-SVD的突破性方法,实现了更快的无间隙方法,以及首个加速和随机方法,在特定参数范围内优于现有算法。

Comments first circulated on May 20, 2016; this newer version improves writing

详情
AI中文摘要

本文研究了$k$-SVD,旨在获得矩阵$A$的第一个$k$个奇异向量。最近,$k$-SVD上发现了一些突破:Musco和Musco[1]利用块Krylov方法证明了首个无间隙收敛结果,Shamir[2]发现了首个方差缩减随机方法,Bhojanapalli等人[3]利用交替最小化提供了最快的$O(\mathsf{nnz}(A) + \mathsf{poly}(1/\varepsilon))$-时间算法。本文提出了一种新的简单LazySVD框架以改进上述突破。该框架导致了一个更快的无间隙方法,优于[1],并且首个加速和随机方法,优于[2]。在$O(\mathsf{nnz}(A) + \mathsf{poly}(1/\varepsilon))$运行时间范围内,LazySVD在某些参数范围内优于[3],甚至不使用交替最小化。

英文摘要

We study $k$-SVD that is to obtain the first $k$ singular vectors of a matrix $A$. Recently, a few breakthroughs have been discovered on $k$-SVD: Musco and Musco [1] proved the first gap-free convergence result using the block Krylov method, Shamir [2] discovered the first variance-reduction stochastic method, and Bhojanapalli et al. [3] provided the fastest $O(\mathsf{nnz}(A) + \mathsf{poly}(1/\varepsilon))$-time algorithm using alternating minimization. In this paper, we put forward a new and simple LazySVD framework to improve the above breakthroughs. This framework leads to a faster gap-free method outperforming [1], and the first accelerated and stochastic method outperforming [2]. In the $O(\mathsf{nnz}(A) + \mathsf{poly}(1/\varepsilon))$ running-time regime, LazySVD outperforms [3] in certain parameter regimes without even using alternating minimization.

1610.00681 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Team-Optimal Distributed MMSE Estimation in General and Tree Networks

一般和树状网络中的团队最优分布式最小均方误差估计

Muhammed O. Sayin, Suleyman S. Kozat, Tamer Başar

AI总结 本文提出了一种在有限时间范围内实现团队最优学习性能的分布式最小均方误差估计算法,适用于任意网络拓扑,并通过局部估计的递归算法实现最优性能。

Comments Submitted to Digital Signal Processing

详情
AI中文摘要

我们构建了用于有限时间范围均方误差(MSE)状态估计的分布式网络团队最优估计算法。这里,我们有分布式处理和协作能力的代理,通过线性模型观察到目标状态的噪声样本,并通过相互交互来学习该状态。尽管这个问题在机器学习和信号处理等领域受到广泛关注,但所有已知的策略在有限时间范围的MSE意义上均无法实现团队最优学习性能。为此,我们制定了在没有披露信息大小限制的情况下,即在任意网络拓扑上实现有限时间范围的分布式最小MSE(MMSE)。随后,我们表明仅交换局部估计足以在某些网络拓扑上实现Oracle性能。通过检查这些网络结构,我们提出了通过披露局部估计实现Oracle性能的递归算法。对于实际应用,我们还提供了通过时间窗口化观测来降低算法复杂度的方法。最后,在数值示例中,我们展示了所提出算法在有限时间范围MSE意义上由于最优估计而表现出的优越性能。

英文摘要

We construct team-optimal estimation algorithms over distributed networks for state estimation in the finite-horizon mean-square error (MSE) sense. Here, we have a distributed collection of agents with processing and cooperation capabilities. These agents observe noisy samples of a desired state through a linear model and seek to learn this state by interacting with each other. Although this problem has attracted significant attention and been studied extensively in fields including machine learning and signal processing, none of the well-known strategies achieves team-optimal learning performance in the finite-horizon MSE sense. To this end, we formulate the finite-horizon distributed minimum MSE (MMSE) when there is no restriction on the size of the disclosed information, i.e., oracle performance, over an arbitrary network topology. Subsequently, we show that exchange of local estimates is sufficient to achieve the oracle performance only over certain network topologies. By inspecting these network structures, we propose recursive algorithms achieving the oracle performance through the disclosure of local estimates. For practical implementations, we also provide approaches to reduce the complexity of the algorithms through time-windowing of the observations. Finally, in the numerical examples, we demonstrate the superior performance of the introduced algorithms in the finite-horizon MSE sense due to optimal estimation.

1701.00757 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Clustering Signed Networks with the Geometric Mean of Laplacians

利用拉普拉斯矩阵几何均值对带符号网络进行聚类

Pedro Mercado, Francesco Tudisco, Matthias Hein

AI总结 本文提出利用拉普拉斯矩阵几何均值改进谱聚类,解决传统算术均值方法在无噪声正负网络结构中无法准确聚类的问题。

Comments 14 pages, 5 figures. Accepted in Neural Information Processing Systems (NIPS), 2016

详情
Journal ref
Advances in Neural Information Processing Systems 29, pp.4421--4429, 2016
AI中文摘要

带符号网络允许建模正负关系。我们分析了现有谱聚类方法在带符号网络中的扩展,发现现有方法在某些情况下无法恢复真实聚类,因为其采用正负部分拉普拉斯矩阵的算术均值。本文提出使用几何均值,并证明其优于现有方法。尽管矩阵几何均值计算成本高,但通过高效计算几何均值的特征向量,提出了适用于稀疏矩阵的数值方案。

英文摘要

Signed networks make it possible to model both positive and negative relationships. We analyze existing extensions of spectral clustering to signed networks. It turns out that existing approaches do not recover the ground truth clustering in several situations where either the positive or the negative network structures contain no noise. Our analysis shows that these problems arise because existing approaches take some form of arithmetic mean of the Laplacians of the positive and negative parts. As a solution, we propose to use the geometric mean of the Laplacians of the positive and negative parts and show that it outperforms the existing approaches. While the geometric mean of matrices is computationally expensive, we show that eigenvectors of the geometric mean can be computed efficiently, leading to a numerical scheme for sparse matrices which is of independent interest.
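As a rough illustration of the quantity discussed in this abstract, the sketch below computes the standard geometric mean of two symmetric positive definite matrices, A # B = A^{1/2} (A^{-1/2} B A^{-1/2})^{1/2} A^{1/2}. It assumes positive definite inputs (graph Laplacians are only positive semidefinite, so some regularization such as a small diagonal shift would be needed first). The helper names `spd_power` and `geometric_mean` are hypothetical, and this dense computation is not the efficient sparse scheme the paper describes.

```python
import numpy as np

def spd_power(A, p):
    """Raise a symmetric positive definite matrix to a real power via eigendecomposition."""
    w, V = np.linalg.eigh(A)
    return (V * w**p) @ V.T

def geometric_mean(A, B):
    """Dense matrix geometric mean A # B for symmetric positive definite A and B."""
    A_half = spd_power(A, 0.5)
    A_inv_half = spd_power(A, -0.5)
    inner = A_inv_half @ B @ A_inv_half
    inner = (inner + inner.T) / 2  # symmetrize against round-off
    return A_half @ spd_power(inner, 0.5) @ A_half

# Toy example: two random SPD matrices standing in for shifted Laplacians (assumption).
rng = np.random.default_rng(0)
X = rng.standard_normal((4, 4))
Y = rng.standard_normal((4, 4))
A = X @ X.T + np.eye(4)
B = Y @ Y.T + np.eye(4)
G = geometric_mean(A, B)
```

A quick sanity check is that G satisfies the Riccati-type identity G A^{-1} G = B, which characterizes the geometric mean among positive definite matrices.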

1612.09158 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

The interplay between system identification and machine learning

系统辨识与机器学习之间的相互作用

Gianluigi Pillonetto

AI总结 本文探讨系统辨识与机器学习的联系,提出动态系统RKHS框架,简化稳定性条件推导,并证明正则化估计器收敛于最优预测。

详情
AI中文摘要

从例子中学习是科学和工程中的关键问题,涉及从有限直接和噪声样本中重建函数。在再生核希尔伯特空间(RKHS)中的正则化被广泛用于解决此任务,包括强大的估计器如正则化网络。最近的成就包括证明这些基于内核的方法的统计一致性。同时,许多不同的系统辨识技术已被开发,但与机器学习的互动仍不强烈。原因之一是机器学习中通常使用的RKHS不嵌入动态系统的信息,例如BIBO稳定性。此外,在系统辨识中,机器学习中通常采用的独立数据假设在实践中并不成立。本文提供了新的结果,加强系统辨识与机器学习之间的联系。我们的起点是引入动态系统的RKHS。它们包含在系统输入定义的空间上的函数,允许将系统辨识解释为从例子中学习。在线性和非线性设置中,证明这种视角允许以相对简单的方式推导RKHS稳定性条件(即只包含BIBO稳定系统或预测器的性质),也促进了系统辨识新内核的设计。此外,我们证明在动态系统典型条件下,正则化估计器收敛于最优预测器。

英文摘要

Learning from examples is one of the key problems in science and engineering. It deals with function reconstruction from a finite set of direct and noisy samples. Regularization in reproducing kernel Hilbert spaces (RKHSs) is widely used to solve this task and includes powerful estimators such as regularization networks. Recent achievements include the proof of the statistical consistency of these kernel-based approaches. Parallel to this, many different system identification techniques have been developed, but the interaction with machine learning does not yet appear to be strong. One reason is that the RKHSs usually employed in machine learning do not embed the information available on dynamic systems, e.g. BIBO stability. In addition, in system identification the independent data assumptions routinely adopted in machine learning are never satisfied in practice. This paper provides new results which strengthen the connection between system identification and machine learning. Our starting point is the introduction of RKHSs of dynamic systems. They contain functionals over spaces defined by system inputs and make it possible to interpret system identification as learning from examples. In both linear and nonlinear settings, it is shown that this perspective makes it relatively simple to derive conditions for RKHS stability (i.e. the property of containing only BIBO stable systems or predictors), and also facilitates the design of new kernels for system identification. Furthermore, we prove the convergence of the regularized estimator to the optimal predictor under conditions typical of dynamic systems.

1308.4757 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Online and stochastic Douglas-Rachford splitting method for large scale machine learning

在线和随机Douglas-Rachford分裂方法用于大规模机器学习

Ziqiang Shi, Rujie Liu

AI总结 本文提出在线和随机Douglas-Rachford分裂方法,用于大规模优化问题,证明其在在线和随机设置下的收敛性,并通过实验验证其有效性。

详情
AI中文摘要

在线和随机学习已成为大规模优化中的强大工具。本文将Douglas-Rachford分裂(DRs)方法推广到在线和随机设置(据我们所知,这是首次将DRs方法推广到顺序版本)。我们首先建立了批量DRs方法的O(1/√T) regret界。然后证明在线DRs分裂方法具有O(1)的regret界,而随机DRs分裂方法的收敛率为O(1/√T)。证明过程简单直观,结果和技术可为利用DRs方法进行大规模机器学习研究提供基础。所提方法的数值实验验证了在线和随机更新规则的有效性,并进一步确认了我们的regret和收敛性分析。

英文摘要

Online and stochastic learning has emerged as a powerful tool in large-scale optimization. In this work, we generalize the Douglas-Rachford splitting (DRs) method for minimizing composite functions to online and stochastic settings (to the best of our knowledge, this is the first time DRs has been generalized to a sequential version). We first establish an $O(1/\sqrt{T})$ regret bound for the batch DRs method. We then prove that the online DRs splitting method enjoys an $O(1)$ regret bound and that stochastic DRs splitting has a convergence rate of $O(1/\sqrt{T})$. The proof is simple and intuitive, and the results and technique can serve as a starting point for research on large-scale machine learning that employs the DRs method. Numerical experiments with the proposed method demonstrate the effectiveness of the online and stochastic update rules, and further confirm our regret and convergence analysis.
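For readers unfamiliar with the underlying iteration, here is a minimal sketch of the classical batch Douglas-Rachford splitting for a composite objective f + g. It is not the online or stochastic variant proposed in the paper. The toy objective (f(x) = 0.5*||x - a||^2, g(x) = lam*||x||_1), the step parameter `gamma`, and the function names are assumptions chosen so that the minimizer is known in closed form.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def douglas_rachford(a, lam, gamma=0.5, iters=200):
    """Minimize 0.5*||x - a||^2 + lam*||x||_1 with batch Douglas-Rachford splitting."""
    z = np.zeros_like(a)
    for _ in range(iters):
        x = (z + gamma * a) / (1.0 + gamma)           # prox of gamma * f
        y = soft_threshold(2.0 * x - z, gamma * lam)  # prox of gamma * g at the reflected point
        z = z + y - x                                 # update of the governing sequence
    return (z + gamma * a) / (1.0 + gamma)

a = np.array([3.0, -0.2, 0.5, -2.0])
x_dr = douglas_rachford(a, lam=1.0)
```

For this particular objective the exact minimizer is the soft-thresholded vector `soft_threshold(a, lam)`, which gives a simple way to check the iteration.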

1612.04933 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY 版本更新

Dynamical Kinds and their Discovery

动力学种类及其发现

Benjamin C. Jantzen

AI总结 本文提出一种无需显式构建动态模型或依赖系统动力学先验知识,即可分类具有相同结构因果系统的算法,展示了其在动态模型开发与验证中的应用价值。

Comments Accepted for the proceedings of the Causation: Foundation to Application Workshop, UAI 2016

详情
AI中文摘要

我们展示了将因果系统分类为共享相同结构的种类的可能性,无需首先构建显式动态模型或使用系统动力学的先验知识。该算法能够确定任意系统是否由相同形式的因果关系支配,具有在动态模型开发和验证中的重要应用价值。从理论上看,这也是科学推理中从实证数据中推导定律的关键阶段。所提出的算法基于动态对称性方法来处理动态种类。时间对称性是指对系统的一个或多个变量进行干预,该干预与系统的时间演化过程可交换。动态种类是共享一组动态对称性的系统类。所提出的算法通过直接比较系统展示的对称性来分类确定性、时间依赖性的因果系统。使用来自多种非线性系统的模拟、噪声数据,我们证明该算法能够正确地将系统分类为动态种类。该算法在显著的采样误差下具有鲁棒性,对采样误差的非正态性不敏感,并在动态相似性增加时表现良好。所展示的算法是首个针对自动化科学发现这一方面的算法。

英文摘要

We demonstrate the possibility of classifying causal systems into kinds that share a common structure without first constructing an explicit dynamical model or using prior knowledge of the system dynamics. The algorithmic ability to determine whether arbitrary systems are governed by causal relations of the same form offers significant practical applications in the development and validation of dynamical models. It is also of theoretical interest as an essential stage in the scientific inference of laws from empirical data. The algorithm presented is based on the dynamical symmetry approach to dynamical kinds. A dynamical symmetry with respect to time is an intervention on one or more variables of a system that commutes with the time evolution of the system. A dynamical kind is a class of systems sharing a set of dynamical symmetries. The algorithm presented classifies deterministic, time-dependent causal systems by directly comparing their exhibited symmetries. Using simulated, noisy data from a variety of nonlinear systems, we show that this algorithm correctly sorts systems into dynamical kinds. It is robust under significant sampling error, is immune to violations of normality in sampling error, and fails gracefully with increasing dynamical similarity. The algorithm we demonstrate is the first to address this aspect of automated scientific discovery.

1612.02739 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Controlling Robot Morphology from Incomplete Measurements

从不完整测量中控制机器人形态

Martin Pecka, Karel Zimmermann, Michal Reinštein, Tomáš Svoboda

AI总结 针对复杂形态机器人在城市搜索与救援任务中的地形穿越需求,提出通过自主控制处理不完整数据并确保安全性的方法。

Comments Accepted into IEEE Transactions to Industrial Electronics, Special Section on Motion Control for Novel Emerging Robotic Devices and Systems

详情
AI中文摘要

复杂形态的移动机器人对于在城市搜索与救援任务中穿越粗糙地形至关重要。由于远程操作复杂形态会增加操作员的认知负担,因此需要自主控制。自主控制会测量机器人状态和周围地形,通常只能部分观测,因此数据往往不完整。我们对缺失测量进行边缘化,并评估一个显式安全条件。如果安全条件被违反,身体安装的机械臂通过触觉探索收集缺失数据。

英文摘要

Mobile robots with complex morphology are essential for traversing rough terrains in Urban Search & Rescue missions (USAR). Since teleoperation of the complex morphology causes high cognitive load of the operator, the morphology is controlled autonomously. The autonomous control measures the robot state and surrounding terrain which is usually only partially observable, and thus the data are often incomplete. We marginalize the control over the missing measurements and evaluate an explicit safety condition. If the safety condition is violated, tactile terrain exploration by the body-mounted robotic arm gathers the missing data.

1606.07315 2026-06-04 cs.LG cs.NA math.NA 版本更新

Nearly-optimal Robust Matrix Completion

近优鲁棒矩阵补全

Yeshwanth Cherapanamjeri, Kartik Gupta, Prateek Jain

AI总结 本文提出一种简单投影梯度下降方法,通过交替进行投影梯度下降和硬阈值清理来估计低秩矩阵,实现近最优观测和损坏数量的鲁棒矩阵补全,同时改进了低秩矩阵补全的时间复杂度。

详情
AI中文摘要

本文提出了一种简单的投影梯度下降方法,通过交替进行投影梯度下降和硬阈值清理来估计低秩矩阵,实现近最优观测和损坏数量的鲁棒矩阵补全,同时改进了低秩矩阵补全的时间复杂度。

英文摘要

In this paper, we consider the problem of Robust Matrix Completion (RMC) where the goal is to recover a low-rank matrix by observing a small number of its entries out of which a few can be arbitrarily corrupted. We propose a simple projected gradient descent method to estimate the low-rank matrix that alternately performs a projected gradient descent step and cleans up a few of the corrupted entries using hard-thresholding. Our algorithm solves RMC using a nearly optimal number of observations as well as a nearly optimal number of corruptions. Our result also implies significant improvement over the existing time complexity bounds for the low-rank matrix completion problem. Finally, an application of our result to the robust PCA problem (low-rank+sparse matrix separation) leads to a nearly linear time (in matrix dimensions) algorithm for the same; existing state-of-the-art methods require quadratic time. Our empirical results corroborate our theoretical results and show that even for moderate sized problems, our method for robust PCA is an order of magnitude faster than the existing methods.

1612.01600 2026-06-04 math.OC cs.LG cs.MA cs.SY eess.SY stat.ML 版本更新

Distributed Gaussian Learning over Time-varying Directed Graphs

时变有向图上的分布式高斯学习

Angelia Nedić, Alex Olshevsky, César A. Uribe

AI总结 本文提出一种分布式非贝叶斯学习算法,用于高斯噪声下的参数估计,通过显式更新高斯信念参数,证明了收敛率和几乎必然收敛性。

详情
AI中文摘要

我们提出了一种分布式(非贝叶斯)学习算法,用于参数估计问题中的高斯噪声。该算法以高斯信念的参数(即均值和精度)的显式更新形式表达。我们展示了收敛率为O(1/k),常数项依赖于代理数量和网络拓扑。此外,我们还证明了在一般时变有向图情况下,算法几乎必然收敛到估计问题的最优解。

英文摘要

We present a distributed (non-Bayesian) learning algorithm for the problem of parameter estimation with Gaussian noise. The algorithm is expressed as explicit updates on the parameters of the Gaussian beliefs (i.e. means and precision). We show a convergence rate of $O(1/k)$ with the constant term depending on the number of agents and the topology of the network. Moreover, we show almost sure convergence to the optimal solution of the estimation problem for the general case of time-varying directed graphs.

1612.00221 2026-06-04 econ.GN cs.LG nlin.AO q-fin.EC 版本更新

The Coconut Model with Heterogeneous Strategies and Learning

带有异质策略和学习的椰子模型

Sven Banisch, Eckehard Olbrich

AI总结 本文基于Diamond搜索均衡模型开发了基于代理的版本,探讨了异质适应性和适应性预期对非均衡轨迹的影响,并展示了系统收敛到原系统固定点的稳定性。

Comments Accepted for publication in the Journal of Artificial Societies and Social Simulation (JASSS)

详情
AI中文摘要

在本文中,我们开发了Diamond搜索均衡模型的基于代理的版本,也称为椰子模型。在该模型中,代理面临需要根据对未来产生的实体效用的预期来评估的生产决策,而该效用又通过交易机制依赖于全球生产水平。虽然原始动力学系统设定假设无限多个同质适应代理遵循强理性条件,基于代理的设定允许讨论异质性和适应性预期的影响,并能够分析非均衡轨迹。从匹配原始模型渐近行为的基础实现出发,我们展示了如何在总体动力学方程中考虑代理异质性。然后我们展示当代理通过简单的时差学习方案调整策略时,系统收敛到原系统的固定点。系统性模拟揭示这唯一稳定的均衡解。

英文摘要

In this paper, we develop an agent-based version of the Diamond search equilibrium model, also called the Coconut Model. In this model, agents are faced with production decisions that have to be evaluated based on their expectations about the future utility of the produced entity, which in turn depends on the global production level via a trading mechanism. While the original dynamical systems formulation assumes an infinite number of homogeneously adapting agents obeying strong rationality conditions, the agent-based setting makes it possible to discuss the effects of heterogeneous and adaptive expectations and enables the analysis of non-equilibrium trajectories. Starting from a baseline implementation that matches the asymptotic behavior of the original model, we show how agent heterogeneity can be accounted for in the aggregate dynamical equations. We then show that when agents adapt their strategies by a simple temporal difference learning scheme, the system converges to one of the fixed points of the original system. Systematic simulations reveal that this is the only stable equilibrium solution.

1611.08372 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

A Unified Convex Surrogate for the Schatten-$p$ Norm

一种统一的凸替代项用于Schatten-p范数

Chen Xu, Zhouchen Lin, Hongbin Zha

AI总结 本文提出一种统一的凸替代项,用于Schatten-p范数,通过矩阵分解的等价性,使因子矩阵的范数可凸优化,提升矩阵补全任务的性能。

Comments The paper is accepted by AAAI-17. We show that multi-factor matrix factorization enjoys superiority over the traditional two-factor case

详情
AI中文摘要

Schatten-p范数(0<p<1)已被广泛用于替代核范数以更好地近似秩函数。然而,现有方法要么由于每次迭代依赖奇异值分解(SVD)而不适用于大规模问题,要么局限于特定的p值,如1/2和2/3。本文表明,对于任何p、p1和p2>0满足1/p=1/p1+1/p2,单矩阵的Schatten-p范数与两个因子矩阵的Schatten-p1和Schatten-p2范数之间存在等价性。我们进一步将等价性扩展到多个因子矩阵,并证明所有因子范数对于任何p>0均可凸和光滑。相比之下,原始Schatten-p范数对于0<p<1是非凸和非光滑的。作为示例,我们进行了矩阵补全实验。为了利用因子矩阵范数的凸性,我们采用了加速近端交替线性化最小化算法,并建立了其序列收敛性。在合成和真实数据集上的实验显示其优于现有方法,速度也极具竞争力。

英文摘要

The Schatten-$p$ norm ($0<p<1$) has been widely used to replace the nuclear norm for better approximating the rank function. However, existing methods are either 1) not scalable for large scale problems due to relying on singular value decomposition (SVD) in every iteration, or 2) specific to some $p$ values, e.g., $1/2$, and $2/3$. In this paper, we show that for any $p$, $p_1$, and $p_2 >0$ satisfying $1/p=1/p_1+1/p_2$, there is an equivalence between the Schatten-$p$ norm of one matrix and the Schatten-$p_1$ and the Schatten-$p_2$ norms of its two factor matrices. We further extend the equivalence to multiple factor matrices and show that all the factor norms can be convex and smooth for any $p>0$. In contrast, the original Schatten-$p$ norm for $0<p<1$ is non-convex and non-smooth. As an example we conduct experiments on matrix completion. To utilize the convexity of the factor matrix norms, we adopt the accelerated proximal alternating linearized minimization algorithm and establish its sequence convergence. Experiments on both synthetic and real datasets exhibit its superior performance over the state-of-the-art methods. Its speed is also highly competitive.
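To make the factor-norm equivalence more concrete, the sketch below numerically checks only the classical special case p = 1, p_1 = p_2 = 2, where the nuclear norm of X equals the minimum of 0.5*(||A||_F^2 + ||B||_F^2) over factorizations X = A B^T. The balanced factorization built from the SVD attains that minimum. The paper's general statement for arbitrary p is not reproduced here, and the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((6, 4))

# Balanced factorization X = A @ B.T built from the thin SVD.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
A = U * np.sqrt(s)
B = Vt.T * np.sqrt(s)

nuclear_norm = s.sum()
factor_value = 0.5 * (np.linalg.norm(A, "fro") ** 2 + np.linalg.norm(B, "fro") ** 2)

# Another valid factorization (an unbalanced rescaling) gives a larger value.
unbalanced = 0.5 * (np.linalg.norm(2 * A, "fro") ** 2 + np.linalg.norm(B / 2, "fro") ** 2)
```

Because the factor objective is a sum of squared Frobenius norms, it is smooth and convex in each factor separately, which is the kind of structure the abstract exploits for general p.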

1611.05977 2026-06-04 cs.LG cs.NA math.NA stat.AP stat.ML 版本更新

Robust and Scalable Column/Row Sampling from Corrupted Big Data

鲁棒且可扩展的列/行采样从受腐蚀的大数据

Mostafa Rahmani, George Atia

AI总结 本文提出新的采样算法,能在严重数据腐蚀下定位信息列,并开发可扩展的随机化设计,同时对稀疏腐蚀和异常值具有鲁棒性,实验显示优于现有鲁棒采样算法。

详情
AI中文摘要

传统采样技术在数据严重腐蚀时无法生成数据描述性草图,因为此类腐蚀破坏了其所需的低秩结构。本文提出新的采样算法,可在存在严重数据腐蚀时定位信息列,并开发新的可扩展随机化设计。所提方法同时对稀疏腐蚀和异常值具有鲁棒性,并通过真实和合成数据的实验表明显著优于现有鲁棒采样算法。

英文摘要

Conventional sampling techniques fall short of drawing descriptive sketches of the data when the data is grossly corrupted as such corruptions break the low rank structure required for them to perform satisfactorily. In this paper, we present new sampling algorithms which can locate the informative columns in presence of severe data corruptions. In addition, we develop new scalable randomized designs of the proposed algorithms. The proposed approach is simultaneously robust to sparse corruption and outliers and substantially outperforms the state-of-the-art robust sampling algorithms as demonstrated by experiments conducted using both real and synthetic data.

1602.05703 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Adaptive Least Mean Squares Estimation of Graph Signals

自适应最小均方图信号估计

Paolo Di Lorenzo, Sergio Barbarossa, Paolo Banelli, Stefania Sardellitti

AI总结 本文提出一种自适应图信号估计方法,通过最小均方策略实现带限图信号的重建与跟踪,结合理论分析与数值实验验证了方法的有效性,并提出在线适应的图采样策略。

Comments Submitted to IEEE Transactions on Signal and Information Processing over Networks

详情
AI中文摘要

本文旨在提出一种最小均方(LMS)策略,用于自适应估计定义在图上的信号。假设图信号在已知带宽下带限,该方法能够在有限观测下实现保证均方误差性能的重建与跟踪。详细的均方分析提供了所提方法的性能,并导致了设计有用的图信号采样策略的若干见解。数值结果验证了我们的理论发现,并展示了所提方法的性能。此外,为应对带宽未知的情况,我们提出了一种在图频域中进行稀疏在线估计信号支持的方法,从而实现了图采样策略的在线适应。最后,我们应用所提方法在认知网络环境中构建给定操作区域的功率空间密度制图。

英文摘要

The aim of this paper is to propose a least mean squares (LMS) strategy for adaptive estimation of signals defined over graphs. Assuming the graph signal to be band-limited, over a known bandwidth, the method enables reconstruction, with guaranteed performance in terms of mean-square error, and tracking from a limited number of observations over a subset of vertices. A detailed mean square analysis provides the performance of the proposed method, and leads to several insights for designing useful sampling strategies for graph signals. Numerical results validate our theoretical findings, and illustrate the performance of the proposed method. Furthermore, to cope with the case where the bandwidth is not known beforehand, we propose a method that performs a sparse online estimation of the signal support in the (graph) frequency domain, which enables online adaptation of the graph sampling strategy. Finally, we apply the proposed method to build the power spatial density cartography of a given operational region in a cognitive network environment.
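The following sketch illustrates one plausible form of an LMS-style update for a band-limited graph signal: x <- x + mu * B D (y - x), where B projects onto the band-limited subspace and D selects the sampled vertices. The path graph, the choice of every other vertex as the sampling set, the step size, and the noiseless observations are all assumptions made for a small self-contained demo, and the exact recursion in the paper may differ.

```python
import numpy as np

N, bandwidth, mu = 20, 3, 1.0

# Path-graph Laplacian; its eigenvectors play the role of the graph Fourier basis.
adjacency = np.diag(np.ones(N - 1), 1) + np.diag(np.ones(N - 1), -1)
laplacian = np.diag(adjacency.sum(axis=1)) - adjacency
_, U = np.linalg.eigh(laplacian)
U_F = U[:, :bandwidth]   # lowest graph frequencies
B = U_F @ U_F.T          # projector onto the band-limited subspace

# Sampling operator: observe every other vertex (assumed sampling set).
D = np.diag((np.arange(N) % 2 == 0).astype(float))

rng = np.random.default_rng(0)
x_true = U_F @ rng.standard_normal(bandwidth)  # band-limited ground truth

x = np.zeros(N)
for _ in range(300):
    y = D @ x_true                  # noiseless observations on the sampled vertices
    x = x + mu * (B @ (D @ (y - x)))  # LMS-style update restricted to the band
```

In this noiseless setting the estimate converges to the true signal as long as the sampled vertices determine the band-limited component uniquely, which is the case for this sampling set.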

1511.05261 2026-06-04 cs.CV cs.LG cs.NA math.NA stat.ML 版本更新

Robust PCA via Nonconvex Rank Approximation

通过非凸秩近似实现鲁棒PCA

Zhao Kang, Chong Peng, Qiang Cheng

AI总结 本文提出非凸秩近似方法,以改进鲁棒PCA中核范数的局限性,通过高效算法提升准确性和效率。

Comments IEEE International Conference on Data Mining

详情
AI中文摘要

在数据挖掘和机器学习中,许多应用需要恢复低秩矩阵。鲁棒主成分分析(RPCA)是处理此类问题的通用框架。RPCA中核范数作为秩函数的凸替代物被广泛研究。在某些假设下,它可以以高概率恢复底层低秩矩阵。然而,这些假设可能在实际应用中不成立。由于核范数通过将所有奇异值相加来近似秩,即本质上是奇异值的ℓ1范数,因此产生的近似误差不可忽略,导致最终的矩阵估计器可能有显著偏差。为寻求更接近的近似并缓解核范数的上述限制,我们提出了一种非凸秩近似。这种对矩阵秩的近似比核范数更紧密。为了解决相关的非凸最小化问题,我们开发了高效的增广拉格朗日乘子优化算法。实验结果表明,我们的方法在准确性和效率上均优于当前最先进的算法。

英文摘要

Numerous applications in data mining and machine learning require recovering a matrix of minimal rank. Robust principal component analysis (RPCA) is a general framework for handling this kind of problem. The nuclear-norm-based convex surrogate of the rank function in RPCA has been widely investigated. Under certain assumptions, it can recover the underlying true low-rank matrix with high probability. However, those assumptions may not hold in real-world applications. Since the nuclear norm approximates the rank by adding all singular values together, which is essentially an $\ell_1$-norm of the singular values, the resulting approximation error is not trivial and thus the resulting matrix estimator can be significantly biased. To seek a closer approximation and to alleviate the above-mentioned limitations of the nuclear norm, we propose a nonconvex rank approximation. This approximation to the matrix rank is tighter than the nuclear norm. To solve the associated nonconvex minimization problem, we develop an efficient augmented Lagrange multiplier based optimization algorithm. Experimental results demonstrate that our method outperforms current state-of-the-art algorithms in both accuracy and efficiency.

1611.05095 2026-06-04 cs.LG cs.RO cs.SY eess.SY 版本更新

Learning Dexterous Manipulation Policies from Experience and Imitation

从经验与模仿中学习灵巧操作策略

Vikash Kumar, Abhishek Gupta, Emanuel Todorov, Sergey Levine

AI总结 本文研究了通过经验与模仿学习反馈控制灵巧五指手非抓取操作的任务,提出基于轨迹优化的局部控制器,并通过深度学习和最近邻方法进行泛化,展示了小数据训练下的有效性和盲操作优势。

Comments Initial draft for a journal submission

详情
AI中文摘要

我们探索了基于学习的反馈控制方法,用于控制执行非抓取操作的灵巧五指手。首先,我们学习了能够从预定义初始状态开始执行任务的局部控制器。这些控制器是通过轨迹优化构建的,基于从传感器数据直接学习到的局部线性时变模型。在某些情况下,我们使用通过虚拟环境中的遥控收集的人类示范来初始化优化器。我们证明,这些控制器在模拟和物理平台上都能在初始条件的有限范围内稳健地执行任务。然后,我们考虑了两种泛化方法:深度学习和最近邻。我们发现最近邻方法性能更高。然而,神经网络也有其优势:它仅使用触觉和本体感觉反馈,而没有关于物体的视觉反馈(即盲操作),并且学习了一个时间不变的策略。相比之下,最近邻方法根据运动捕捉感知的初始物体状态切换时间变化的局部控制器。尽管两种泛化方法仍有改进空间,我们的工作表明(i)复杂的非抓取操作任务的局部轨迹控制器可以从惊人的少量训练数据中构建,(ii)此类控制器的集合可以插值形成更全局的控制器。结果总结在补充视频中:https://youtu.be/E0wmO6deqjo

英文摘要

We explore learning-based approaches for feedback control of a dexterous five-finger hand performing non-prehensile manipulation. First, we learn local controllers that are able to perform the task starting at a predefined initial state. These controllers are constructed using trajectory optimization with respect to locally-linear time-varying models learned directly from sensor data. In some cases, we initialize the optimizer with human demonstrations collected via teleoperation in a virtual environment. We demonstrate that such controllers can perform the task robustly, both in simulation and on the physical platform, for a limited range of initial conditions around the trained starting state. We then consider two interpolation methods for generalizing to a wider range of initial conditions: deep learning, and nearest neighbors. We find that nearest neighbors achieve higher performance. Nevertheless, the neural network has its advantages: it uses only tactile and proprioceptive feedback but no visual feedback about the object (i.e. it performs the task blind) and learns a time-invariant policy. In contrast, the nearest neighbors method switches between time-varying local controllers based on the proximity of initial object states sensed via motion capture. While both generalization methods leave room for improvement, our work shows that (i) local trajectory-based controllers for complex non-prehensile manipulation tasks can be constructed from surprisingly small amounts of training data, and (ii) collections of such controllers can be interpolated to form more global controllers. Results are summarized in the supplementary video: https://youtu.be/E0wmO6deqjo

1508.07933 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Coordinate Dual Averaging for Decentralized Online Optimization with Nonseparable Global Objectives

用于具有非分离全局目标的去中心化在线优化的坐标对偶平均法

Soomin Lee, Angelia Nedić, Maxim Raginsky

AI总结 本文提出两种去中心化变体ODA-C和ODA-PS,用于解决去中心化在线凸优化问题,通过双平均方法实现子线性悔度增长。

Comments 10 pages; accepted for publication in IEEE Transactions on Control of Network Systems

详情
AI中文摘要

我们考虑了一个网络中代理的去中心化在线凸优化问题,每个代理仅控制全局决策向量的一个坐标(或部分)。针对此类问题,我们提出了两种去中心化变体(ODA-C和ODA-PS)的Nesterov原始-对偶算法变体,带有双平均。在ODA-C中,为缓解对偶向量更新的分歧,代理在静态无向图上实现了一种最近由Li和Marden提出的局部信息交换动态的泛化。在ODA-PS中,代理在时间变化的均匀连接有向图序列上实现基于广播的推送-求和动态。我们证明在步长形式为1/√t且目标函数为Lipschitz连续凸函数且具有Lipschitz梯度时,两种情况下的悔度界具有O(√T)的次线性增长,其中T为时间跨度。我们还将在传感器网络上实现所提出算法以补充我们的理论分析。

英文摘要

We consider a decentralized online convex optimization problem in a network of agents, where each agent controls only a coordinate (or a part) of the global decision vector. For such a problem, we propose two decentralized variants (ODA-C and ODA-PS) of Nesterov's primal-dual algorithm with dual averaging. In ODA-C, to mitigate the disagreements on the primal-vector updates, the agents implement a generalization of the local information-exchange dynamics recently proposed by Li and Marden over a static undirected graph. In ODA-PS, the agents implement the broadcast-based push-sum dynamics over a time-varying sequence of uniformly connected digraphs. We show that the regret bounds in both cases have sublinear growth of $O(\sqrt{T})$, with the time horizon $T$, when the stepsize is of the form $1/\sqrt{t}$ and the objective functions are Lipschitz-continuous convex functions with Lipschitz gradients. We also implement the proposed algorithms on a sensor network to complement our theoretical analysis.

1412.7215 2026-06-04 math.OC cs.DS cs.LG cs.MA cs.SY eess.SY 版本更新

Online Distributed Optimization on Dynamic Networks

动态网络上的在线分布式优化

Saghar Hosseini, Airlie Chapman, Mehran Mesbahi

AI总结 本文提出了一种在存在成本不确定性和切换通信拓扑下的分布式优化方案,通过双对偶子梯度平均算法实现合作最小化成本函数,并分析了网络拓扑对收敛速度的影响。

Comments Submitted to The IEEE Transactions on Automatic Control, 2014

详情
AI中文摘要

本文提出了一种在存在成本不确定性和切换通信拓扑下的分布式优化方案。受最近分布式凸优化进展的启发,我们提出了一种基于双对偶子梯度平均的分布式算法,旨在合作最小化成本函数。此外,该算法通过调整网络中通信链路的权重以适应邻居节点可靠性变化。随后对底层网络拓扑的收敛速率进行了分析,并给出了代表性传感器网络的模拟结果。

英文摘要

This paper presents a distributed optimization scheme over a network of agents in the presence of cost uncertainties and over switching communication topologies. Inspired by recent advances in distributed convex optimization, we propose a distributed algorithm based on a dual sub-gradient averaging. The objective of this algorithm is to minimize a cost function cooperatively. Furthermore, the algorithm changes the weights on the communication links in the network to adapt to varying reliability of neighboring agents. A convergence rate analysis as a function of the underlying network topology is then presented, followed by simulation results for representative classes of sensor networks.

1410.7057 2026-06-04 cs.LG cs.DC cs.SY eess.SY stat.ML 版本更新

Sparse Distributed Learning via Heterogeneous Diffusion Adaptive Networks

稀疏分布式学习 via 异质扩散自适应网络

Bijit Kumar Das, Mrityunjoy Chakraborty, Jerónimo Arenas-García

AI总结 本文提出通过异质扩散自适应网络实现稀疏参数向量的分布式估计,通过选择性应用凸正则化方法减少计算开销,同时保持最优性能。

Comments 4 pages, 1 figure, conference, submitted to IEEE ISCAS 2015, Lisbon, Portugal

详情
AI中文摘要

近年来,关于通过扩散LMS策略在网内进行稀疏参数向量分布式估计的研究已有所涉及。在所有现有工作中,每个网络节点都使用了一些凸正则化方法,以实现优于简单扩散LMS的整体网络性能,尽管这导致了计算开销的增加。本文提供了分析和实验结果,表明凸正则化可以仅应用于某些选定的节点,其余节点保持稀疏性无感知,同时仍能实现与在所有节点上部署凸正则化相同最优行为。由于在部分节点中采用无正则化学习,所提出的方法需要更少的计算成本。我们还提供了一条选择稀疏感知节点的指南和最优正则化参数的闭式表达式。

英文摘要

In-network distributed estimation of sparse parameter vectors via diffusion LMS strategies has been studied and investigated in recent years. In all the existing works, some convex regularization approach has been used at each node of the network in order to achieve an overall network performance superior to that of the simple diffusion LMS, albeit at the cost of increased computational overhead. In this paper, we provide analytical as well as experimental results which show that the convex regularization can be selectively applied only to some chosen nodes keeping rest of the nodes sparsity agnostic, while still enjoying the same optimum behavior as can be realized by deploying the convex regularization at all the nodes. Due to the incorporation of unregularized learning at a subset of nodes, less computational cost is needed in the proposed approach. We also provide a guideline for selection of the sparsity aware nodes and a closed form expression for the optimum regularization parameter.

1610.05838 2026-06-04 cs.LG cs.NA math.NA 版本更新

CuMF_SGD: Fast and Scalable Matrix Factorization

CuMF_SGD:快速且可扩展的矩阵分解

Xiaolong Xie, Wei Tan, Liana L. Fong, Yun Liang

AI总结 本文提出CuMF_SGD,利用GPU高带宽内存和快速节点内连接加速大规模矩阵分解,通过批量Hogwild!和波前更新方案及优化内核,在单GPU上相比CPU方案实现3.1X-28.2X的加速,并可扩展到多GPU。

详情
AI中文摘要

矩阵分解(MF)已广泛应用于推荐系统、主题建模和词嵌入等领域。随机梯度下降(SGD)因其能处理大数据集和易于增量学习而流行。我们发现SGD用于MF是内存受限的。单节点CPU系统带缓存仅适用于小数据集;分布式系统具有更高的聚合内存带宽但网络连接相对较慢。这一观察启发我们通过利用GPU的高内存带宽和快速节点内连接来加速MF。我们提出了cuMF_SGD,一种基于CUDA的SGD解决方案用于大规模MF问题。在单个GPU上,我们设计了两种工作负载调度方案,即批量Hogwild!和波前更新,充分利用大量核心。特别是,批量Hogwild!作为Hogwild!的向量版本克服了内存不连续的问题。我们还开发了高度优化的SGD更新内核,利用缓存、warp-shuffle指令和半精度浮点数。我们还设计了分区方案以利用多个GPU,同时解决SGD并行化时的收敛问题。在仅使用一个Maxwell或Pascal GPU的三个数据集上,cuMF_SGD相比1-64个CPU节点的最新CPU解决方案快3.1X-28.2X。评估还显示cuMF_SGD在大数据集上能良好扩展到多个GPU。

英文摘要

Matrix factorization (MF) has been widely used in, e.g., recommender systems, topic modeling and word embedding. Stochastic gradient descent (SGD) is popular for solving MF problems because it can deal with large data sets and lends itself to incremental learning. We observed that SGD for MF is memory bound. Meanwhile, single-node CPU systems with caching perform well only for small data sets; distributed systems have higher aggregated memory bandwidth but suffer from relatively slow network connections. This observation inspires us to accelerate MF by utilizing GPUs' high memory bandwidth and fast intra-node connection. We present cuMF_SGD, a CUDA-based SGD solution for large-scale MF problems. On a single GPU, we design two workload scheduling schemes, i.e., batch-Hogwild! and wavefront-update, that fully exploit the massive number of cores. In particular, batch-Hogwild!, a vectorized version of Hogwild!, overcomes the issue of memory discontinuity. We also develop highly optimized kernels for the SGD update, leveraging cache, warp-shuffle instructions and half-precision floats. We also design a partition scheme to utilize multiple GPUs while addressing the well-known convergence issue when parallelizing SGD. On three data sets with only one Maxwell or Pascal GPU, cuMF_SGD runs 3.1X-28.2X as fast as state-of-the-art CPU solutions on 1-64 CPU nodes. Evaluations also show that cuMF_SGD scales well on multiple GPUs on large data sets.

1407.1537 2026-06-04 cs.DS cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Linear Coupling: An Ultimate Unification of Gradient and Mirror Descent

线性耦合:梯度下降与镜像下降的终极统一

Zeyuan Allen-Zhu, Lorenzo Orecchia

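AI summary: Proposes linear coupling, which combines gradient descent and mirror descent into a single framework, reinterprets Nesterov's accelerated gradient methods, and extends to settings where Nesterov's methods do not apply.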

Comments A new section added; polished writing

详情
AI中文摘要

首先阶方法在大规模机器学习中起核心作用。尽管存在多种变体,每种适用于特定问题,几乎所有此类方法本质上都依赖于两种算法步骤:梯度下降,产生原始进展,和镜像下降,产生对偶进展。我们观察到梯度和镜像下降的性能互补,因此通过线性耦合这两种方法可以设计出更快的算法。我们展示了如何通过线性耦合重构Nesterov加速梯度方法,这比Nesterov原始证明提供了更清晰的解释。我们还通过将线性耦合扩展到Nesterov方法无法应用的其他场景,讨论了其威力。

英文摘要

First-order methods play a central role in large-scale machine learning. Even though many variations exist, each suited to a particular problem, almost all such methods fundamentally rely on two types of algorithmic steps: gradient descent, which yields primal progress, and mirror descent, which yields dual progress. We observe that the performances of gradient and mirror descent are complementary, so that faster algorithms can be designed by LINEARLY COUPLING the two. We show how to reconstruct Nesterov's accelerated gradient methods using linear coupling, which gives a cleaner interpretation than Nesterov's original proofs. We also discuss the power of linear coupling by extending it to many other settings that Nesterov's methods cannot apply to.
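The coupling described above can be sketched for an unconstrained $L$-smooth convex function with the Euclidean mirror map, in which case the mirror step is a plain gradient step with a growing step size. The schedule $\alpha_{k+1} = (k+2)/(2L)$, $\tau_k = 1/(\alpha_{k+1} L)$ is the one commonly associated with this construction and should be read as an assumption here, as should the function name.

```python
def linear_coupling(grad, x0, L, steps):
    # y: gradient-descent iterate (primal progress)
    # z: mirror-descent iterate (dual progress), Euclidean mirror map assumed
    y = list(x0)
    z = list(x0)
    for k in range(steps):
        alpha = (k + 2) / (2.0 * L)   # mirror-descent step size (assumed schedule)
        tau = 1.0 / (alpha * L)       # coupling weight, equal to 2 / (k + 2)
        # Linear coupling: query the gradient at a convex combination of z and y.
        x = [tau * zi + (1.0 - tau) * yi for zi, yi in zip(z, y)]
        g = grad(x)
        # Gradient step from x with step size 1/L.
        y = [xi - gi / L for xi, gi in zip(x, g)]
        # Mirror step from z with step size alpha.
        z = [zi - alpha * gi for zi, gi in zip(z, g)]
    return y
```

For example, with `grad = lambda x: [1.0 * x[0], 10.0 * x[1]]` (the gradient of $\tfrac12(x_1^2 + 10 x_2^2)$, so $L = 10$), the returned point approaches the minimizer at the origin.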

1611.01142 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Using a Deep Reinforcement Learning Agent for Traffic Signal Control

使用深度强化学习代理进行交通信号控制

Wade Genders, Saiedeh Razavi

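AI summary: Proposes a deep reinforcement learning traffic signal control agent that uses a discrete traffic state encoding and Q-learning with experience replay; compared with a one-hidden-layer neural network agent, it reduces average cumulative delay, queue length, and travel time.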

详情
AI中文摘要

确保交通系统高效是现代社会的优先事项。技术进步使得交通系统能够收集前所未有的大量多样化数据。我们提出了一种利用这种高质量数据的交通信号控制系统,与现有系统相比抽象程度较低。我们应用现代深度强化学习方法,在交通微观模拟器SUMO中构建了一个真正自适应的交通信号控制代理。我们提出了一种新的状态空间,即离散交通状态编码,信息密度高。该离散交通状态编码作为输入传递给深度卷积神经网络,通过经验回放进行训练。我们的代理与一个具有单隐藏层的神经网络交通信号控制代理进行了比较,平均累计延迟减少了82%,平均队列长度减少了66%,平均旅行时间减少了20%。

英文摘要

Ensuring transportation systems are efficient is a priority for modern society. Technological advances have made it possible for transportation systems to collect large volumes of varied data on an unprecedented scale. We propose a traffic signal control system which takes advantage of this new, high quality data, with minimal abstraction compared to other proposed systems. We apply modern deep reinforcement learning methods to build a truly adaptive traffic signal control agent in the traffic microsimulator SUMO. We propose a new state space, the discrete traffic state encoding, which is information dense. The discrete traffic state encoding is used as input to a deep convolutional neural network, trained using Q-learning with experience replay. Our agent was compared against a one hidden layer neural network traffic signal control agent and reduces average cumulative delay by 82%, average queue length by 66% and average travel time by 20%.
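The training rule named in the abstract, Q-learning with experience replay, can be illustrated with a tabular stand-in. The paper uses a deep convolutional network over the discrete traffic state encoding; the table, the function names, and the values of `alpha` and `gamma` below are simplifying assumptions for illustration only.

```python
import random

def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9):
    # Standard Q-learning target: reward plus discounted best next-state value.
    target = r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]

def replay(q, buffer, batch_size, rng):
    # Experience replay: update from a random minibatch of stored transitions
    # instead of only the most recent one, which decorrelates the updates.
    batch = rng.sample(buffer, min(batch_size, len(buffer)))
    for s, a, r, s_next in batch:
        q_update(q, s, a, r, s_next)
    return len(batch)
```

In a traffic setting, `s` would index an encoded intersection state, `a` a signal phase, and `r` a quantity such as the negative change in cumulative delay; these mappings are hypothetical and not taken from the paper.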

1606.03168 2026-06-04 math.OC cs.DS cs.IT cs.LG cs.NA math.IT math.NA 版本更新

Finding Low-Rank Solutions via Non-Convex Matrix Factorization, Efficiently and Provably

通过非凸矩阵分解寻找低秩解:高效且可证明

Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, Sujay Sanghavi

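AI summary: Proposes the Bi-Factored Gradient Descent (BFGD) algorithm for low-rank matrix optimization and proves local sublinear convergence when $f$ is (restricted) smooth and linear convergence when $f$ is also (restricted) strongly convex.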

Comments 45 pages

详情
AI中文摘要

一个秩为r的矩阵X∈R^{m×n}可以表示为UV^T的乘积,其中U∈R^{m×r}和V∈R^{n×r}。利用这一观察可以在优化中进行:例如,考虑在秩为r的矩阵上最小化一个凸函数f(X),其中秩为r的矩阵集通过分解UV^T来建模。尽管这种参数化减少了变量数量,并且更高效(特别是当r<<min{m,n}时),但它带来了代价:f(UV^T)相对于U和V是非凸函数。我们研究了这种参数化在优化通用凸目标函数f中的应用,并专注于一阶梯度下降算法。我们提出了Bi-Factored Gradient Descent(BFGD)算法,一种高效的、基于U和V因子的一阶方法。我们证明当f是(受限)光滑时,BFGD具有局部次线性收敛性,当f是(受限)光滑且(受限)强凸时具有线性收敛性。对于几个关键应用,我们提供了简单且高效的初始化方案,这些方案能提供足够好的近似解,以保证上述收敛结果成立。

英文摘要

A rank-$r$ matrix $X \in \mathbb{R}^{m \times n}$ can be written as a product $U V^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. One could exploit this observation in optimization: e.g., consider the minimization of a convex function $f(X)$ over rank-$r$ matrices, where the set of rank-$r$ matrices is modeled via the factorization $UV^\top$. Though such parameterization reduces the number of variables, and is more computationally efficient (of particular interest is the case $r \ll \min\{m, n\}$), it comes at a cost: $f(UV^\top)$ becomes a non-convex function w.r.t. $U$ and $V$. We study such parameterization for optimization of generic convex objectives $f$, and focus on first-order, gradient descent algorithmic solutions. We propose the Bi-Factored Gradient Descent (BFGD) algorithm, an efficient first-order method that operates on the $U, V$ factors. We show that when $f$ is (restricted) smooth, BFGD has local sublinear convergence, and linear convergence when $f$ is both (restricted) smooth and (restricted) strongly convex. For several key applications, we provide simple and efficient initialization schemes that provide approximate solutions good enough for the above convergence results to hold.
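To make the factored parameterization concrete, the sketch below runs plain gradient descent on $U$ and $V$ for the simple objective $f(X) = \tfrac12\|X - M\|_F^2$, using $\nabla_U f = (UV^\top - M)V$ and $\nabla_V f = (UV^\top - M)^\top U$. It is not BFGD itself: the constant step size, the initialization, and the helper names are assumptions, whereas the paper prescribes its own step-size rule and initialization schemes.

```python
def matmul(A, B):
    # Dense matrix product for lists of rows.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def loss(U, V, M):
    # f(U V^T) = 0.5 * ||U V^T - M||_F^2
    X = matmul(U, transpose(V))
    return 0.5 * sum((x - m) ** 2 for xr, mr in zip(X, M) for x, m in zip(xr, mr))

def factored_gd(M, U, V, step=0.01, iters=500):
    for _ in range(iters):
        X = matmul(U, transpose(V))
        R = [[x - m for x, m in zip(xr, mr)] for xr, mr in zip(X, M)]  # residual
        gU = matmul(R, V)               # gradient with respect to U
        gV = matmul(transpose(R), U)    # gradient with respect to V
        U = [[u - step * g for u, g in zip(ur, gr)] for ur, gr in zip(U, gU)]
        V = [[v - step * g for v, g in zip(vr, gr)] for vr, gr in zip(V, gV)]
    return U, V
```

Each iteration touches only the $(m + n)r$ entries of the factors, which is the saving over working with the full $m \times n$ matrix; the price, as the abstract notes, is that the objective is non-convex in $(U, V)$.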

1401.0869 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML 版本更新

Schatten-$p$ Quasi-Norm Regularized Matrix Optimization via Iterative Reweighted Singular Value Minimization

通过迭代加权奇异值最小化进行Schatten-p准范数正则化的矩阵优化

Zhaosong Lu, Yong Zhang

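AI summary: Studies Schatten-$p$ quasi-norm regularized matrix minimization, proposes iterative reweighted singular value minimization (IRSVM) methods, proves that any accumulation point they generate is a first-order stationary point, and reports advantages in solution quality and/or speed.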

Comments This paper has been withdrawn by the author due to major revision and corrections

详情
AI中文摘要

本文研究了通用Schatten-p准范数(SPQN)正则化的矩阵最小化问题。首先,我们引入了一类一阶 stationary 点,并证明了在SPQN正则化的向量最小化问题中引入的一阶 stationary 点等同于在SPQN正则化的矩阵最小化重参数化问题中的一阶 stationary 点。我们还证明了SPQN正则化的矩阵最小化问题的任何局部极小值必须是一阶 stationary 点。此外,我们推导了非零奇异值的下界,并因此也推导了SPQN正则化的矩阵最小化问题的局部极小值的下界。然后,我们提出了迭代加权奇异值最小化(IRSVM)方法来解决这些问题,其子问题被证明具有闭式解。与SPQN正则化的向量最小化问题的类似方法相比,这些方法的收敛性分析显著更具挑战性。我们开发了一种新的方法来证明这些方法的收敛性,利用了其子问题特定解的表达式,避免了寻找其子问题目标函数Clarke子微分的显式表达式的复杂问题。特别是,我们证明了由IRSVM方法生成的序列的任何累积点都是问题的一阶 stationary 点。我们的计算结果表明,IRSVM方法在解质量和/或速度上通常优于一些最近开发的最先进的方法。

英文摘要

In this paper we study general Schatten-$p$ quasi-norm (SPQN) regularized matrix minimization problems. In particular, we first introduce a class of first-order stationary points for them, and show that the first-order stationary points introduced in [11] for an SPQN regularized $vector$ minimization problem are equivalent to those of an SPQN regularized $matrix$ minimization reformulation. We also show that any local minimizer of the SPQN regularized matrix minimization problems must be a first-order stationary point. Moreover, we derive lower bounds for nonzero singular values of the first-order stationary points and hence also of the local minimizers of the SPQN regularized matrix minimization problems. The iterative reweighted singular value minimization (IRSVM) methods are then proposed to solve these problems, whose subproblems are shown to have a closed-form solution. In contrast to the analogous methods for the SPQN regularized $vector$ minimization problems, the convergence analysis of these methods is significantly more challenging. We develop a novel approach to establishing the convergence of these methods, which makes use of the expression of a specific solution of their subproblems and avoids the intricate issue of finding the explicit expression for the Clarke subdifferential of the objective of their subproblems. In particular, we show that any accumulation point of the sequence generated by the IRSVM methods is a first-order stationary point of the problems. Our computational results demonstrate that the IRSVM methods generally outperform some recently developed state-of-the-art methods in terms of solution quality and/or speed.

1610.08127 2026-06-04 cs.LG cs.AI cs.NA math.NA stat.ML 版本更新

Fast Bayesian Non-Negative Matrix Factorisation and Tri-Factorisation

快速的贝叶斯非负矩阵分解与三因子分解

Thomas Brouwer, Jes Frellsen, Pietro Lio'

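AI summary: Proposes a fast variational Bayesian algorithm for non-negative matrix factorisation and tri-factorisation that converges faster per iteration and in wall-clock time than Gibbs sampling and non-probabilistic approaches, without needing additional samples to estimate the posterior.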

Comments NIPS 2016 Workshop on Advances in Approximate Bayesian Inference

详情
AI中文摘要

我们提出了一种快速的变分贝叶斯算法,用于执行非负矩阵分解和三因子分解。我们证明了我们的方法在每次迭代和时间步(墙钟时间)上的收敛速度比Gibbs采样和非概率方法更快,并且不需要额外的样本来估计后验。我们特别展示了对于矩阵三因子分解,收敛具有挑战性,但我们的变分贝叶斯方法提供了一种快速的解决方案,使三因子分解方法能够更有效地使用。

英文摘要

We present a fast variational Bayesian algorithm for performing non-negative matrix factorisation and tri-factorisation. We show that our approach achieves faster convergence per iteration and timestep (wall-clock) than Gibbs sampling and non-probabilistic approaches, and does not require additional samples to estimate the posterior. We show that in particular for matrix tri-factorisation convergence is difficult, but our variational Bayesian approach offers a fast solution, allowing the tri-factorisation approach to be used more effectively.

1606.00119 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Contextual Bandits with Latent Confounders: An NMF Approach

具有潜在混杂因素的上下文老虎机:一种NMF方法

Rajat Sen, Karthikeyan Shanmugam, Murat Kocaoglu, Alexandros G. Dimakis, Sanjay Shakkottai

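AI summary: Proposes an $\epsilon$-greedy NMF-Bandit algorithm that balances learning the latent low-dimensional structure against selecting the best arm, giving the first regret guarantees for online matrix completion with bandit feedback when the rank is greater than one.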

Comments 37 pages, 2 figures

详情
AI中文摘要

受在线推荐和广告系统启发,本文考虑了具有潜在低维混杂因子的随机上下文老虎机因果模型。在该模型中,L个观察到的上下文和K个臂之间通过潜在混杂因子相关联。臂选择和潜在混杂因子因果决定奖励,而观察到的上下文与混杂因子相关。在此模型下,L×K的均值奖励矩阵U可分解为非负因子A和W。本文提出ε-贪心NMF-Bandit算法,通过干预序列选择臂,实现学习低维结构与最小化遗憾的平衡。算法在时间T时的遗憾为O(Lpoly(m,logK)logT),相较于传统上下文老虎机的O(LKlogT)更优。这些保证基于较弱的统计RIP条件。此外,本文提出一类生成模型满足充分条件,并推导出O(KmlogT)的下界。这些是首次针对在线矩阵补全与老虎机反馈的regret保证,当秩大于一时。

英文摘要

Motivated by online recommendation and advertising systems, we consider a causal model for stochastic contextual bandits with a latent low-dimensional confounder. In our model, there are $L$ observed contexts and $K$ arms of the bandit. The observed context influences the reward obtained through a latent confounder variable with cardinality $m$ ($m \ll L,K$). The arm choice and the latent confounder causally determine the reward while the observed context is correlated with the confounder. Under this model, the $L \times K$ mean reward matrix $\mathbf{U}$ (for each context in $[L]$ and each arm in $[K]$) factorizes into non-negative factors $\mathbf{A}$ ($L \times m$) and $\mathbf{W}$ ($m \times K$). This insight enables us to propose an $\epsilon$-greedy NMF-Bandit algorithm that designs a sequence of interventions (selecting specific arms) that achieves a balance between learning this low-dimensional structure and selecting the best arm to minimize regret. Our algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(m, \log K) \log T \right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context. These guarantees are obtained under mild sufficiency conditions on the factors that are weaker versions of the well-known Statistical RIP condition. We further propose a class of generative models that satisfy our sufficient conditions, and derive a lower bound of $\mathcal{O}\left(Km\log T\right)$. These are the first regret guarantees for online matrix completion with bandit feedback, when the rank is greater than one. We further compare the performance of our algorithm with the state of the art, on synthetic and real world data-sets.

1610.07722 2026-06-04 cs.LG cs.NA math.NA 版本更新

Sparse Hierarchical Tucker Factorization and its Application to Healthcare

稀疏分层Tucker分解及其在医疗领域的应用

Ioakeim Perros, Robert Chen, Richard Vuduc, Jimeng Sun

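AI summary: Proposes Sparse Hierarchical Tucker factorization for sparse, high-order tensors. A nested sampling technique avoids the dense intermediate core tensor that limits the scalability of classical Hierarchical Tucker, improving speed, memory use, and accuracy; the method is evaluated on a real healthcare dataset.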

Comments This is an extended version of a paper presented at the 15th IEEE International Conference on Data Mining (ICDM 2015)

详情
AI中文摘要

我们提出了一种新的张量分解方法,称为稀疏分层-Tucker(Sparse H-Tucker),用于稀疏和高阶数据张量。Sparse H-Tucker受经典分层Tucker方法启发,旨在计算输入数据集的树状结构分解,可被领域专家解释。然而,Sparse H-Tucker采用嵌套采样技术克服了分层Tucker的关键可扩展性问题,即创建不可行的密集核心张量;我们的方法结果是一种更快、更节省空间且更准确的方法。我们广泛测试了该方法在一个真实医疗数据集上,该数据集来自30,000名患者,产生一个18阶稀疏数据张量。与竞争方法不同,Sparse H-Tucker可以在单个多线程机器上分析完整数据集。它比最先进的方法更准确且更快:在输入数据的12阶子集上,Sparse H-Tucker比之前最先进的方法准确度提高了18倍,速度提高了7.5倍。即使对于低阶张量(如4阶),我们的方法所需时间也接近一个数量级,内存使用也减少两个数量级,相比传统张量分解方法如CP和Tucker。此外,我们发现Sparse H-Tucker在非零张量元素数量上几乎线性扩展。所得到的模型还提供可解释的疾病层级,这已由临床专家验证。

英文摘要

We propose a new tensor factorization method, called the Sparse Hierarchical-Tucker (Sparse H-Tucker), for sparse and high-order data tensors. Sparse H-Tucker is inspired by its namesake, the classical Hierarchical Tucker method, which aims to compute a tree-structured factorization of an input data set that may be readily interpreted by a domain expert. However, Sparse H-Tucker uses a nested sampling technique to overcome a key scalability problem in Hierarchical Tucker, which is the creation of an unwieldy intermediate dense core tensor; the result of our approach is a faster, more space-efficient, and more accurate method. We extensively test our method on a real healthcare dataset, which is collected from 30K patients and results in an 18th order sparse data tensor. Unlike competing methods, Sparse H-Tucker can analyze the full data set on a single multi-threaded machine. It can also do so more accurately and in less time than the state-of-the-art: on a 12th order subset of the input data, Sparse H-Tucker is 18x more accurate and 7.5x faster than a previously state-of-the-art method. Even for analyzing low order tensors (e.g., 4-order), our method requires close to an order of magnitude less time and over two orders of magnitude less memory, as compared to traditional tensor factorization methods such as CP and Tucker. Moreover, we observe that Sparse H-Tucker scales nearly linearly in the number of non-zero tensor elements. The resulting model also provides an interpretable disease hierarchy, which is confirmed by a clinical expert.

1508.00506 2026-06-04 math.OC cs.LG cs.SY eess.SY math.PR math.ST stat.TH 版本更新

A variational approach to path estimation and parameter inference of hidden diffusion processes

隐扩散过程路径估计与参数推断的变分方法

Tobias Sutter, Arnab Ganguly, Heinz Koeppl

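AI summary: Proposes a variational method for approximating the hidden states of a diffusion process observed through noisy measurements, and builds on it an efficient inference scheme for estimating the unknown parameters of stochastic differential equations.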

Comments 37 pages, 2 figures, revised

详情
Journal ref
JMLR, volume 17, number 190, year 2016
AI中文摘要

本文考虑了一个隐马尔可夫模型,其中信号过程由扩散过程给出,仅通过一些噪声测量间接观测。文章开发了一种变分方法,用于在给定全部观测数据的情况下近似信号过程的隐藏状态。这特别导致了对信号过程平滑密度的系统近似。论文随后展示了如何基于这种变分方法来设计高效的推理方案,以估计随机微分方程的未知参数。最后两个例子展示了所提方法的有效性和准确性。

英文摘要

We consider a hidden Markov model, where the signal process, given by a diffusion, is only indirectly observed through some noisy measurements. The article develops a variational method for approximating the hidden states of the signal process given the full set of observations. This, in particular, leads to systematic approximations of the smoothing densities of the signal process. The paper then demonstrates how an efficient inference scheme, based on this variational approach to the approximation of the hidden states, can be designed to estimate the unknown parameters of stochastic differential equations. Two examples at the end illustrate the efficacy and the accuracy of the presented method.

1610.07520 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Nonlinear Adaptive Algorithms on Rank-One Tensor Models

秩一张量模型上的非线性自适应算法

Felipe C. Pinheiro, Cassio G. Lopes

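AI summary: Proposes a low-complexity nonlinear model based on decomposable (rank-one) Volterra kernels, derives an exact gradient-type algorithm, and from it obtains adaptive algorithms such as the LMS filter and its data-reuse version, TRUE-LMS; simulations show good performance relative to other nonlinear processing algorithms.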

详情
AI中文摘要

本文提出了一种低复杂度的非线性模型,并在此基础上开发了自适应算法。该模型基于可分解(或称秩一,在张量语言中)的Volterra核,也可描述为FIR滤波器的乘积,这解释了其低复杂度特性。秩一模型之所以有趣,是因为它来源于逼近理论中的良好定义问题。本文在估计理论背景下使用该模型,推导出精确的梯度型算法,从而发展出如最小均方(LMS)滤波器及其数据重用版本---TRUE-LMS算法。讨论了稳定性与收敛性问题。随后在仿真中测试这些算法,结果显示其在与其他非线性处理算法相比时具有良好的性能。

英文摘要

This work proposes a low-complexity nonlinearity model and develops adaptive algorithms over it. The model is based on the decomposable---or rank-one, in tensor language---Volterra kernels. It may also be described as a product of FIR filters, which explains its low complexity. The rank-one model is also interesting because it comes from a well-posed problem in approximation theory. The paper uses this model in an estimation theory context to develop an exact gradient-type algorithm, from which adaptive algorithms such as the least mean squares (LMS) filter and its data-reuse version---the TRUE-LMS---are derived. Stability and convergence issues are addressed. The algorithms are then tested in simulations, which show their good performance when compared to other nonlinear processing algorithms in the literature.

1509.05009 2026-06-04 cs.NE cs.LG cs.NA math.NA stat.ML 版本更新

On the Expressive Power of Deep Learning: A Tensor Analysis

深度学习表达能力的分析:张量视角

Nadav Cohen, Or Sharir, Amnon Shashua

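AI summary: Analyzes the expressive power of deep learning through tensor factorizations and proves that, apart from a negligible set, functions implemented by a polynomial-size deep network require an exponential-size shallow network to be realized or even approximated.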

详情
Journal ref
29th Annual Conference on Learning Theory, pp. 698-728, 2016
AI中文摘要

长期以来,人们推测适合组合性数据(如文本或图像)的假设空间可能更高效地由深度分层网络表示而非浅层网络。尽管有大量的实证证据支持这一观点,但目前的理论依据有限。特别是,它们未能考虑卷积网络的局部性、共享和池化构造,这是目前最成功的深度学习架构。本文推导出一种基于算术电路的深度网络架构,其本质上具有局部性、共享和池化。建立了网络与分层张量分解之间的等价性。证明浅层网络对应于CP(秩-1)分解,而深层网络对应于分层Tucker分解。利用测度论和矩阵代数工具,证明除了可忽略的集合外,所有可通过多项式规模深度网络实现的函数,都需要指数规模的浅层网络才能实现(或近似)。由于对数空间计算将我们的网络转化为SimNets,该结果直接适用于具有有希望实证性能的深度学习架构。本文提出的构造和理论为深度学习社区的各种实践和想法提供了新的见解。

英文摘要

It has long been conjectured that hypotheses spaces suitable for data that is compositional in nature, such as text or images, may be more efficiently represented with deep hierarchical networks than with shallow ones. Despite the vast empirical evidence supporting this belief, theoretical justifications to date are limited. In particular, they do not account for the locality, sharing and pooling constructs of convolutional networks, the most successful deep learning architecture to date. In this work we derive a deep network architecture based on arithmetic circuits that inherently employs locality, sharing and pooling. An equivalence between the networks and hierarchical tensor factorizations is established. We show that a shallow network corresponds to CP (rank-1) decomposition, whereas a deep network corresponds to Hierarchical Tucker decomposition. Using tools from measure theory and matrix algebra, we prove that besides a negligible set, all functions that can be implemented by a deep network of polynomial size, require exponential size in order to be realized (or even approximated) by a shallow network. Since log-space computation transforms our networks into SimNets, the result applies directly to a deep learning architecture demonstrating promising empirical performance. The construction and theory developed in this paper shed new light on various practices and ideas employed by the deep learning community.

1610.04042 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Generalized Online Transfer Learning for Climate Control in Residential Buildings

面向住宅建筑气候控制的通用在线迁移学习

Thomas Grubinger, Georgios Chasparis, Thomas Natschlaeger

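AI summary: Proposes an online transfer learning framework for improving temperature predictions in residential buildings. The generalized online transfer learning algorithm (GOTL) combines target and source predictors with weights and is guaranteed to converge to the best weighted predictor; Transfer Component Analysis (TCA) enables transfer from multiple source domains, and experiments indicate non-negligible energy savings.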

详情
AI中文摘要

本文提出了一种在线迁移学习框架,用于改进住宅建筑中的温度预测。在迁移学习中,通过使用来自相似源域的数据(如数据丰富的房屋)来改进在目标域(如数据有限的房屋)上训练的预测模型。鉴于预测模型需要在线训练(如作为模型预测控制实现的一部分),本文引入了通用在线迁移学习算法(GOTL)。它采用可用预测器的加权组合(即目标和源预测器),并保证收敛到最佳加权预测器。此外,使用迁移组件分析(TCA)允许使用多个源域,因为它可以促进一个模型在多个源域(房屋)上的拟合。这使GOTL能够从多个源域转移知识。我们进一步通过住宅建筑的气候控制实验验证了结果,并展示了GOTL可能在给定舒适水平下带来非可忽略的能耗节省。

英文摘要

This paper presents an online transfer learning framework for improving temperature predictions in residential buildings. In transfer learning, prediction models trained under a set of available data from a target domain (e.g., house with limited data) can be improved through the use of data generated from similar source domains (e.g., houses with rich data). Given also the need for prediction models that can be trained online (e.g., as part of a model-predictive-control implementation), this paper introduces the generalized online transfer learning algorithm (GOTL). It employs a weighted combination of the available predictors (i.e., the target and source predictors) and guarantees convergence to the best weighted predictor. Furthermore, the use of Transfer Component Analysis (TCA) allows for using more than a single source domain, since it may facilitate the fit of a single model on more than one source domain (house). This allows GOTL to transfer knowledge from more than one source domain. We further validate our results through experiments in climate control for residential buildings and show that GOTL may lead to non-negligible energy savings for given comfort levels.

1610.03518 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Transfer from Simulation to Real World through Learning Deep Inverse Dynamics Model

通过学习深度逆动力学模型实现仿真到现实世界的迁移

Paul Christiano, Zain Shah, Igor Mordatch, Jonas Schneider, Trevor Blackwell, Joshua Tobin, Pieter Abbeel, Wojciech Zaremba

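AI summary: Proposes learning a deep inverse dynamics model to transfer control policies from simulation to the real world, addressing the performance loss caused by discrepancies between simulation and reality.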

详情
AI中文摘要

在仿真中开发控制策略通常比直接在现实世界中运行实验更实用和安全。这适用于通过规划和优化获得的策略,甚至更适用于通过强化学习获得的策略,后者通常非常数据密集。然而,仿真中成功的策略在部署到现实机器人时往往无法工作。然而,策略在仿真中执行的整体思路在现实世界中通常仍然有效。本文研究了此类场景,其中仿真中遍历的状态序列在现实世界中仍然合理,即使控制细节不同,例如摩擦、接触、质量和几何属性的差异。在执行过程中,我们的方法在每个时间步计算仿真基于的控制策略会做什么,但不执行这些控制在现实机器人上,而是计算仿真期望的下一个状态,并依赖于学习的深度逆动力学模型来决定最合适的现实世界动作以达到这些状态。深度模型只有在训练数据足够的情况下才有效,我们还提出了一种数据收集方法来(逐步)学习深度逆动力学模型。我们的实验表明,我们的方法在处理仿真到现实世界模型差异的各种基线方法中表现良好,包括输出误差控制和高斯动态适应。

英文摘要

Developing control policies in simulation is often more practical and safer than directly running experiments in the real world. This applies to policies obtained from planning and optimization, and even more so to policies obtained from reinforcement learning, which is often very data demanding. However, a policy that succeeds in simulation often doesn't work when deployed on a real robot. Nevertheless, often the overall gist of what the policy does in simulation remains valid in the real world. In this paper we investigate such settings, where the sequence of states traversed in simulation remains reasonable for the real world, even if the details of the controls are not, as could be the case when the key differences lie in detailed friction, contact, mass and geometry properties. During execution, at each time step our approach computes what the simulation-based control policy would do, but then, rather than executing these controls on the real robot, our approach computes what the simulation expects the resulting next state(s) will be, and then relies on a learned deep inverse dynamics model to decide which real-world action is most suitable to achieve those next states. Deep models are only as good as their training data, and we also propose an approach for data collection to (incrementally) learn the deep inverse dynamics model. Our experiments show that our approach compares favorably with various baselines that have been developed for dealing with simulation to real world model discrepancy, including output error control and Gaussian dynamics adaptation.

1604.08382 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Convolutional Neural Networks For Automatic State-Time Feature Extraction in Reinforcement Learning Applied to Residential Load Control

卷积神经网络用于强化学习中的自动状态-时间特征提取用于住宅负荷控制

Bert J. Claessens, Peter Vrancx, Frederik Ruelens

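AI summary: Proposes using a convolutional neural network to extract hidden state-time features and mitigate the curse of partial observability; the network estimates the state-action value function in the supervised learning step of fitted Q-iteration, and simulations of residential load control support the approach.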

Comments Submitted to Transactions on Smart Grid

详情
AI中文摘要

对异质住宅需求灵活性源的直接负荷控制是一个高维控制问题,具有部分可观测性。本文提出了一种新方法,使用卷积神经网络提取隐藏的状态-时间特征以缓解部分可观测性的诅咒。具体来说,卷积神经网络被用作函数近似器,在拟合 Q-迭代的监督学习步骤中估计状态-动作值函数或 Q 函数。该方法在定性模拟中得到评估,该模拟包括一个仅共享空气温度的恒温器控制负载集群,而其围护结构温度保持隐藏。模拟结果表明,所提出的方法能够捕捉到隐藏的特征,并成功降低了集群的电力成本。

英文摘要

Direct load control of a heterogeneous cluster of residential demand flexibility sources is a high-dimensional control problem with partial observability. This work proposes a novel approach that uses a convolutional neural network to extract hidden state-time features to mitigate the curse of partial observability. More specifically, a convolutional neural network is used as a function approximator to estimate the state-action value function or Q-function in the supervised learning step of fitted Q-iteration. The approach is evaluated in a qualitative simulation, comprising a cluster of thermostatically controlled loads that only share their air temperature, whilst their envelope temperature remains hidden. The simulation results show that the presented approach is able to capture the underlying hidden features and successfully reduce the electricity cost of the cluster.

1509.01404 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC stat.ML 版本更新

Coordinate Descent Methods for Symmetric Nonnegative Matrix Factorization

对称非负矩阵分解的坐标下降方法

Arnaud Vandaele, Nicolas Gillis, Qi Lei, Kai Zhong, Inderjit Dhillon

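AI summary: Proposes efficient coordinate descent methods for symmetric nonnegative matrix factorization that handle large, sparse matrices, and demonstrates their effectiveness on synthetic and real-world data sets.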

Comments 25 pages, 5 figures, 7 tables. Main changes: comparison with another symNMF algorithm (namely, BetaSNMF), and correction of an error in the convergence proof

详情
Journal ref
IEEE Transactions on Signal Processing 64 (21), pp. 5571-5584, 2016
AI中文摘要

给定一个对称非负矩阵A,对称非负矩阵分解(symNMF)是寻找一个非负矩阵H,通常列数远少于A,使得A≈HH^T。本文提出简单且高效的坐标下降方案来解决该问题,能够处理大规模稀疏输入矩阵。通过合成和实际数据集的实验,展示了所提方法在合成和实际数据集上的有效性,并证明其在与最新状态的最先进方法相比表现优异。

英文摘要

Given a symmetric nonnegative matrix $A$, symmetric nonnegative matrix factorization (symNMF) is the problem of finding a nonnegative matrix $H$, usually with much fewer columns than $A$, such that $A \approx HH^T$. SymNMF can be used for data analysis and in particular for various clustering tasks. In this paper, we propose simple and very efficient coordinate descent schemes to solve this problem, and that can handle large and sparse input matrices. The effectiveness of our methods is illustrated on synthetic and real-world data sets, and we show that they perform favorably compared to recent state-of-the-art methods.

1411.7245 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Heuristics for Exact Nonnegative Matrix Factorization

精确非负矩阵分解的启发式方法

Arnaud Vandaele, Nicolas Gillis, François Glineur, Daniel Tuyttens

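AI summary: Proposes two heuristics for exact nonnegative matrix factorization, one inspired by simulated annealing and the other by the greedy randomized adaptive search procedure, shows that they outperform standard multi-start strategies on several classes of nonnegative matrices, and uses them to study the behavior of the nonnegative rank.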

Comments 32 pages, 2 figures, 16 tables

详情
Journal ref
Journal of Global Optimization 65 (2), pp 369-400, 2016
AI中文摘要

精确非负矩阵分解(精确NMF)问题为:给定一个m-by-n的非负矩阵X和一个分解秩r,寻找若可能的m-by-r非负矩阵W和r-by-n非负矩阵H使得X=WH。本文提出两种启发式方法,一种受模拟退火启发,另一种受贪心随机自适应搜索启发。我们证明这两种启发式方法能够计算几种非负矩阵类别的精确非负分解,并展示其优于标准多起始策略。我们还考虑这两种启发式的混合方法,以结合两种方法的优势。最后,我们讨论这些启发式方法在理解非负秩行为方面的应用,即最小分解秩使得存在精确NMF。特别是,我们推翻了关于Kronecker积非负秩的猜想,提出了关于通用n边形扩展复杂度的新上界,并推测正则n边形的扩展复杂度和相关联的非负秩的精确值。

英文摘要

The exact nonnegative matrix factorization (exact NMF) problem is the following: given an $m$-by-$n$ nonnegative matrix $X$ and a factorization rank $r$, find, if possible, an $m$-by-$r$ nonnegative matrix $W$ and an $r$-by-$n$ nonnegative matrix $H$ such that $X = WH$. In this paper, we propose two heuristics for exact NMF, one inspired from simulated annealing and the other from the greedy randomized adaptive search procedure. We show that these two heuristics are able to compute exact nonnegative factorizations for several classes of nonnegative matrices (namely, linear Euclidean distance matrices, slack matrices, unique-disjointness matrices, and randomly generated matrices) and as such demonstrate their superiority over standard multi-start strategies. We also consider a hybridization between these two heuristics that allows us to combine the advantages of both methods. Finally, we discuss the use of these heuristics to gain insight on the behavior of the nonnegative rank, i.e., the minimum factorization rank such that an exact NMF exists. In particular, we disprove a conjecture on the nonnegative rank of a Kronecker product, propose a new upper bound on the extension complexity of generic $n$-gons and conjecture the exact value of (i) the extension complexity of regular $n$-gons and (ii) the nonnegative rank of a submatrix of the slack matrix of the correlation polytope.

1609.05587 2026-06-04 math.NA cs.IT cs.LG cs.NA math.IT 版本更新

Tensor Completion by Alternating Minimization under the Tensor Train (TT) Model

基于张量列车(TT)模型的交替最小化张量补全

Wenqi Wang, Vaneet Aggarwal, Shuchin Aeron

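AI summary: Proposes an alternating minimization algorithm for tensor completion under the tensor train decomposition, alternating over the matrices (tensors) in the MPS representation; the authors discuss its computational complexity and report better numerical performance than existing methods.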

详情
AI中文摘要

利用矩阵乘积态(MPS)表示的张量列车分解,本文提出了一种张量补全算法,该算法在MPS表示中的矩阵(张量)上交替优化。这一方法部分受矩阵补全算法在低秩因子上交替优化的成功启发。我们讨论了所提算法的计算复杂度,并通过数值实验将其与采用低秩张量列车近似进行数据补全的现有方法及其他近期提出的方法进行了比较。我们证明,该方法在多种实际场景中优于现有方法。

英文摘要

Using the matrix product state (MPS) representation of tensor train decompositions, in this paper we propose a tensor completion algorithm which alternates over the matrices (tensors) in the MPS representation. This development is motivated in part by the success of matrix completion algorithms which alternate over the (low-rank) factors. We comment on the computational complexity of the proposed algorithm and numerically compare it with existing methods employing low rank tensor train approximation for data completion as well as several other recently proposed methods. We show that our method is superior to existing ones for a variety of real settings.

1609.09681 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Predicting the consequence of action in digital control state spaces

在数字控制状态空间中预测动作后果

Emmanuel Daucé

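AI summary: Examines obstacles to learning control laws in continuous state spaces and proposes drawing on the "end effector control" principle suggested by neuroscience studies rather than the "displacement control" principle of classical control theory.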

详情
AI中文摘要

本论文的目标是揭示在连续状态空间中学习控制规律的一些基本障碍。特别是,如果想要构建能够像学习分类信号和图像一样学习运动任务的人工设备,就需要建立不依赖周围空间量比较的控制规则。在此背景下,我们提出借鉴神经科学研究建议的“末端效应器控制”原理,而非传统控制理论中使用的“位移控制”原理。

英文摘要

The objective of this dissertation is to shed light on some fundamental impediments in learning control laws in continuous state spaces. In particular, if one wants to build artificial devices capable of learning motor tasks the same way they learn to classify signals and images, one needs to establish control rules that do not necessitate comparisons between quantities of the surrounding space. We propose, in that context, to take inspiration from the "end effector control" principle, as suggested by neuroscience studies, as opposed to the "displacement control" principle used in classical control theory.

1609.09660 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

On Identification of Sparse Multivariable ARX Model: A Sparse Bayesian Learning Approach

关于稀疏多变量ARX模型识别:一种稀疏贝叶斯学习方法

J. Jin, Y. Yuan, W. Pan, D. L. T. Pham, C. J. Tomlin, A. Webb, J. Goncalves

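AI summary: Proposes a sparse Bayesian learning method for identifying the Boolean structure and internal node dynamics of sparse multivariable ARX models without prior knowledge of the system, using maximum a posteriori estimation with inseparable penalties for element and group sparsity.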

详情
AI中文摘要

本文首先考虑了由多变量ARX模型描述的稀疏线性时不变网络的识别问题。此类模型具有相对简单的结构,因此被用作基准以促进进一步研究。在保证网络可识别性的情况下,本文提出了一种识别方法,该方法从数据中推断网络的布尔结构和节点间的内部动态。识别直接从数据中进行,而无需任何系统先验知识,包括其阶数。所提出的方法通过最大后验估计(MAP)解决识别问题,但采用分离的惩罚项来处理复杂性,包括元素(非零连接的阶数)和组稀疏性(网络拓扑)。这种方法广泛应用于压缩感知(CS)中,被称为稀疏贝叶斯学习(SBL)。随后,本文提出了一种新的方案,结合稀疏贝叶斯和组稀疏贝叶斯以高效解决问题。所得到的算法形式与标准稀疏组正则化(SGL)相似,当已知噪声方差时,简化为精确的加权SGL。该方法和开发的工具包可应用于从各种领域推断网络,包括系统生物学中的信号和基因调控网络应用。

英文摘要

This paper begins by considering the identification of sparse linear time-invariant networks described by multivariable ARX models. Such models possess a relatively simple structure and are thus used as a benchmark to promote further research. With identifiability of the network guaranteed, this paper presents an identification method that infers both the Boolean structure of the network and the internal dynamics between nodes. Identification is performed directly from data without any prior knowledge of the system, including its order. The proposed method solves the identification problem using maximum a posteriori (MAP) estimation but with inseparable penalties for complexity, in terms of both element sparsity (order of nonzero connections) and group sparsity (network topology). Such an approach is widely applied in Compressive Sensing (CS) and is known as Sparse Bayesian Learning (SBL). We then propose a novel scheme that combines sparse Bayesian and group sparse Bayesian to efficiently solve the problem. The resulting algorithm has a form similar to the standard Sparse Group Lasso (SGL), while with known noise variance it simplifies to exact re-weighted SGL. The method and the developed toolbox can be applied to infer networks from a wide range of fields, including systems biology applications such as signaling and genetic regulatory networks.

1609.03240 2026-06-04 stat.ML cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Non-square matrix sensing without spurious local minima via the Burer-Monteiro approach

非正方形矩阵感知:通过Burer-Monteiro方法避免虚假局部极小值

Dohyung Park, Anastasios Kyrillidis, Constantine Caramanis, Sujay Sanghavi

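AI summary: Studies non-square matrix sensing under restricted isometry property (RIP) assumptions and shows that, in the non-convex factorized formulation, matrix factorization introduces no spurious local minima under RIP.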

Comments 14 pages, no figures

详情
AI中文摘要

我们考虑在受限等距性质(RIP)假设下的非正方形矩阵感知问题。我们聚焦于非凸形式,其中任何秩为r的矩阵X∈R^{m×n}表示为UV^T,其中U∈R^{m×r}和V∈R^{n×r}。在本文中,我们补充了最近关于类似PSD设置的非凸几何的发现[5],并证明在RIP条件下,矩阵分解不会引入任何虚假局部极小值。

英文摘要

We consider the non-square matrix sensing problem, under restricted isometry property (RIP) assumptions. We focus on the non-convex formulation, where any rank-$r$ matrix $X \in \mathbb{R}^{m \times n}$ is represented as $UV^\top$, where $U \in \mathbb{R}^{m \times r}$ and $V \in \mathbb{R}^{n \times r}$. In this paper, we complement recent findings on the non-convex geometry of the analogous PSD setting [5], and show that matrix factorization does not introduce any spurious local minima, under RIP.
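The factored formulation in the abstract can be illustrated with a small NumPy sketch. Everything below is an assumption made for illustration and is not taken from the paper: the sensing operator is a stack of matrices `A` of shape `(k, m, n)`, the objective is the mean squared residual, and plain gradient descent with a fixed step is used. The function names are hypothetical.

```python
import numpy as np

def sensing_loss(U, V, A, y):
    """Return 0.5 * mean_i (<A_i, U V^T> - y_i)^2 for A of shape (k, m, n)."""
    residual = np.einsum("kij,ij->k", A, U @ V.T) - y
    return 0.5 * float(residual @ residual) / len(y)

def sensing_grad(U, V, A, y):
    """Gradients of sensing_loss with respect to the factors U and V."""
    residual = np.einsum("kij,ij->k", A, U @ V.T) - y
    # Gradient with respect to X = U V^T; the chain rule gives the factor gradients.
    G = np.einsum("k,kij->ij", residual, A) / len(y)
    return G @ V, G.T @ U

def factored_gradient_descent(A, y, r, step=0.01, iters=500, seed=0):
    """Plain gradient descent on (U, V) from a small random start (illustrative only)."""
    _, m, n = A.shape
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((m, r))
    V = 0.1 * rng.standard_normal((n, r))
    for _ in range(iters):
        gU, gV = sensing_grad(U, V, A, y)
        U, V = U - step * gU, V - step * gV
    return U, V
```

The step size and iteration count are placeholders; whether this simple loop reaches a global minimum depends on the measurements satisfying conditions such as RIP, which the sketch does not check.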

1609.06942 2026-06-04 stat.ML cs.LG cs.SY eess.SY math.PR math.ST stat.TH 版本更新

Randomized Independent Component Analysis

随机独立成分分析

Matan Sela, Ron Kimmel

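AI summary Proposes the Randomized Generalized Variance and the Randomized Canonical Correlation, two measures based on randomized features, as alternatives that lower computational complexity and speed up ICA decomposition.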

Comments Accepted to ICSEE 2016

详情
AI中文摘要

独立成分分析(ICA)是一种从未知线性组合的源信号观测中恢复统计独立信号的方法。一些最准确的ICA分解方法需要搜索最小化不同互信息近似值的逆变换,互信息是随机向量统计独立性的度量。两种这样的近似是核广义方差或核典型相关,已被证明能达到ICA方法的最高性能。然而,仅计算这些度量所需的计算努力与样本大小成立方关系。因此,优化它们在空间和时间上都变得更加计算密集。在此,我们提出了一种基于样本随机特征的替代新度量——随机广义方差和随机典型相关。所提出的替代措施的计算复杂度与样本大小成线性关系,并提供了可控的核非随机版本的近似。我们还展示了优化所提出的统计特性可以在数量级上比核方法更快地达到可比的分离误差。

英文摘要

Independent component analysis (ICA) is a method for recovering statistically independent signals from observations of unknown linear combinations of the sources. Some of the most accurate ICA decomposition methods require searching for the inverse transformation that minimizes different approximations of the Mutual Information, a measure of statistical independence of random vectors. Two such approximations are the Kernel Generalized Variance and the Kernel Canonical Correlation, which have been shown to reach the highest performance among ICA methods. However, the computational effort necessary just for computing these measures is cubic in the sample size. Hence, optimizing them becomes even more computationally demanding, in terms of both space and time. Here, we propose two novel alternative measures based on randomized features of the samples: the Randomized Generalized Variance and the Randomized Canonical Correlation. The computational complexity of calculating the proposed alternatives is linear in the sample size, and they provide a controllable approximation of their kernel-based non-random versions. We also show that optimizing the proposed statistical properties yields a comparable separation error an order of magnitude faster than kernel-based measures.
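The abstract does not say which randomized features are used. As a hedged illustration of how a feature map can replace a kernel at a cost linear in the sample size, the sketch below assumes random Fourier features for a Gaussian kernel; the function name and the `bandwidth` parameter are hypothetical and are not the authors' construction.

```python
import numpy as np

def random_fourier_features(X, num_features, bandwidth, seed=0):
    """Map rows of X (n, d) to features whose inner products approximate
    the Gaussian kernel exp(-||x - y||^2 / (2 * bandwidth**2))."""
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    # Frequencies drawn from the Fourier transform of the Gaussian kernel.
    W = rng.standard_normal((d, num_features)) / bandwidth
    # Random phases make the cosine features unbiased for the kernel value.
    b = rng.uniform(0.0, 2.0 * np.pi, size=num_features)
    return np.sqrt(2.0 / num_features) * np.cos(X @ W + b)
```

Computing the features costs time proportional to the number of samples times `num_features`, whereas forming a full kernel matrix is quadratic in the number of samples before any decomposition is applied.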

1602.02164 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

A Note on Alternating Minimization Algorithm for the Matrix Completion Problem

关于矩阵补全问题交替最小化算法的注记

David Gamarnik, Sidhant Misra

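AI summary Analyzes two variants of alternating minimization for low-rank matrix completion, shows that both approximately reconstruct a rank-1 matrix in polynomial time under stated conditions, and reports simulations suggesting that the second variant, based on message-passing-type updates, performs better.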

Comments 8 pages, 2 figures

详情
AI中文摘要

我们考虑从矩阵部分条目重建低秩矩阵的问题,并分析了两种交替最小化算法的变体。我们证明当底层矩阵秩为1,具有正有界的条目,并且底层揭示条目的图$\mathcal{G}$具有有界度数和直径不超过矩阵规模对数时,两种算法都能在多项式时间内从任意初始化开始近似重建矩阵。我们进一步提供了模拟结果,表明基于消息传递类型更新的第二种算法表现更优。

英文摘要

We consider the problem of reconstructing a low-rank matrix from a subset of its entries and analyze two variants of the so-called Alternating Minimization algorithm, which has been proposed in the past. We establish that when the underlying matrix has rank $r=1$, has positive bounded entries, and the graph $\mathcal{G}$ underlying the revealed entries has bounded degree and a diameter that is at most logarithmic in the size of the matrix, both algorithms succeed in reconstructing the matrix approximately in polynomial time starting from an arbitrary initialization. We further provide simulation results which suggest that the second algorithm, which is based on message-passing-type updates, performs significantly better.
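For the rank-1 case, alternating minimization has a closed-form update for each factor. The sketch below is a generic alternating least-squares loop written for illustration; it is not claimed to match either of the two variants analyzed in the paper, and the function name and the all-ones initialization are assumptions.

```python
import numpy as np

def rank_one_altmin(M, mask, iters=50):
    """Fit M[i, j] ~ u[i] * v[j] on the entries where mask is True."""
    m, n = M.shape
    u = np.ones(m)
    v = np.ones(n)
    Mo = np.where(mask, M, 0.0)   # observed entries, zeros elsewhere
    W = mask.astype(float)        # 1.0 where an entry is revealed
    for _ in range(iters):
        # Each update is the exact least-squares minimizer with the other factor fixed.
        u = (Mo @ v) / np.maximum(W @ (v * v), 1e-12)
        v = (Mo.T @ u) / np.maximum(W.T @ (u * u), 1e-12)
    return u, v
```

Because each half-step minimizes the observed squared error exactly over one factor, that error never increases from one sweep to the next.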

1609.04167 2026-06-04 math.NA cs.CV cs.IT cs.LG cs.NA math.IT math.OC 版本更新

Proceedings of the third "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'16)

第三届“国际稀疏模型与技术相互作用研讨会”(iTWIST'16)会议论文集

V. Abrol, O. Absil, P. -A. Absil, S. Anthoine, P. Antoine, T. Arildsen, N. Bertin, F. Bleichrodt, J. Bobin, A. Bol, A. Bonnefoy, F. Caltagirone, V. Cambareri, C. Chenot, V. Crnojević, M. Daňková, K. Degraux, J. Eisert, J. M. Fadili, M. Gabrié, N. Gac, D. Giacobello, A. Gonzalez, C. A. Gomez Gonzalez, A. González, P. -Y. Gousenbourger, M. Græsbøll Christensen, R. Gribonval, S. Guérit, S. Huang, P. Irofti, L. Jacques, U. S. Kamilov, S. Kiticć, M. Kliesch, F. Krzakala, J. A. Lee, W. Liao, T. Lindstrøm Jensen, A. Manoel, H. Mansour, A. Mohammad-Djafari, A. Moshtaghpour, F. Ngolè, B. Pairet, M. Panić, G. Peyré, A. Pižurica, P. Rajmic, M. Roblin, I. Roth, A. K. Sao, P. Sharma, J. -L. Starck, E. W. Tramel, T. van Waterschoot, D. Vukobratovic, L. Wang, B. Wirth, G. Wunder, H. Zhang

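AI summary Proceedings of a workshop on interactions between sparse models and technology, covering data sensing, non-convex inverse problems, probabilistic inference, and machine learning, and aiming to foster international scientific collaboration through talks and discussion.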

Comments 69 pages, 22 extended abstracts, iTWIST'16 website: http://www.itwist16.es.aau.dk

详情
AI中文摘要

第三届“国际稀疏模型与技术相互作用研讨会”(iTWIST'16)于2016年8月24日至26日在丹麦第四大城市阿勒堡举行。该研讨会旨在通过具体的口头/海报展示和自由讨论促进国际科研团队的合作。本届研讨会汇集了约50位国际参与者,包含8场特邀讲座、12场口头报告和12个海报,主题涵盖稀疏范式的理论、应用和推广,包括由稀疏驱动的数据传感与处理(如光学、计算机视觉、基因组学、生物医学、数字通信、信道估计、天文学);稀疏模型在非凸/非线性逆问题中的应用(如相位恢复、盲去卷积、自校准);近似概率推断用于稀疏问题;稀疏机器学习与推断;“盲”逆问题与字典学习;稀疏建模的优化;信息论、几何与随机性;稀疏?未来是什么(离散值信号;低维空间的并集、共稀疏性、混合/组范数、基于模型的、低复杂度模型等);矩阵/流形传感与处理(图、低秩近似等);数值方法/优化中的复杂性与精度权衡;电子/光学压缩传感器(硬件)。

英文摘要

The third edition of the "international - Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) took place in Aalborg, the 4th largest city in Denmark situated beautifully in the northern part of the country, from the 24th to 26th of August 2016. The workshop venue was at the Aalborg University campus. One implicit objective of this biennial workshop is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For this third edition, iTWIST'16 gathered about 50 international participants and features 8 invited talks, 12 oral presentations, and 12 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing (e.g., optics, computer vision, genomics, biomedical, digital communication, channel estimation, astronomy); Application of sparse models in non-convex/non-linear inverse problems (e.g., phase retrieval, blind deconvolution, self calibration); Approximate probabilistic inference for sparse problems; Sparse machine learning and inference; "Blind" inverse problems and dictionary learning; Optimization for sparse modelling; Information theory, geometry and randomness; Sparsity? What's next? (Discrete-valued signals; Union of low-dimensional spaces, Cosparsity, mixed/group norm, model-based, low-complexity models, ...); Matrix/manifold sensing/processing (graph, low-rank approximation, ...); Complexity/accuracy tradeoffs in numerical methods/optimization; Electronic/optical compressive sensors (hardware).

1609.03628 2026-06-04 cs.RO cs.LG cs.SY eess.SY 版本更新

Co-active Learning to Adapt Humanoid Movement for Manipulation

协同学习以适应人形机器人的运动用于操作

Ren Mao, John S. Baras, Yezhou Yang, Cornelia Fermuller

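AI summary Proposes a co-active learning framework that adapts a robot end-effector's movement to differently constrained environments through human-in-the-loop interaction; experiments validate the approach.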

详情
AI中文摘要

本文针对机器人在各种环境约束下运动适应问题,提出了一种协同学习框架,用于学习适应机器人末端执行器的运动以执行操作任务。该框架设计用于适应从演示中学习的原始模仿轨迹,以应对具有各种约束的新情况。框架还考虑了用户对适应轨迹的反馈,并通过人机交互学习适应运动。实现的系统能够将训练的运动原语泛化到具有不同约束的各种情况,考虑用户偏好。在人形平台上进行的实验验证了本文方法的有效性。

英文摘要

In this paper we address the problem of robot movement adaptation under various environmental constraints interactively. Motion primitives are generally adopted to generate target motion from demonstrations. However, their generalization capability is weak while facing novel environments. Additionally, traditional motion generation methods do not consider the versatile constraints from various users, tasks, and environments. In this work, we propose a co-active learning framework for learning to adapt robot end-effector's movement for manipulation tasks. It is designed to adapt the original imitation trajectories, which are learned from demonstrations, to novel situations with various constraints. The framework also considers user's feedback towards the adapted trajectories, and it learns to adapt movement through human-in-the-loop interactions. The implemented system generalizes trained motion primitives to various situations with different constraints considering user preferences. Experiments on a humanoid platform validate the effectiveness of our approach.

1609.03344 2026-06-04 stat.ML cs.LG econ.GN math.ST q-fin.EC stat.CO stat.TH 版本更新

Finite-sample and asymptotic analysis of generalization ability with an application to penalized regression

有限样本与渐近分析中的泛化能力研究:以正则化回归应用为例

Ning Xu, Jian Hong, Timothy C. G. Fisher

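AI summary Studies extremum estimators from the perspective of generalization ability, derives upper bounds on the generalization error, examines how the number of folds $K$ affects the bias-variance trade-off in cross-validation, and proves $L_2$-consistency of penalized regression estimates, including for high-dimensional data.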

Comments The theoretical generalization and extension of arXiv:1606.00142

详情
AI中文摘要

本文从泛化能力(GA)角度研究极值估计器的性能:即模型在新样本上预测结果的能力。通过适应经典集中不等式,我们推导了经验外样本预测误差的上界,作为样本内误差、样本量、误差分布尾部重性及模型复杂度的函数。我们证明误差界可用于调节关键估计超参数,如交叉验证中K值。我们还展示了K值如何影响交叉验证的偏差方差权衡。最后,我们证明所有正则化回归估计在n≥p和n<p情况下均为L2一致。通过模拟验证了关键结果。

英文摘要

In this paper, we study the performance of extremum estimators from the perspective of generalization ability (GA): the ability of a model to predict outcomes in new samples from the same population. By adapting the classical concentration inequalities, we derive upper bounds on the empirical out-of-sample prediction errors as a function of the in-sample errors, in-sample data size, heaviness in the tails of the error distribution, and model complexity. We show that the error bounds may be used for tuning key estimation hyper-parameters, such as the number of folds $K$ in cross-validation. We also show how $K$ affects the bias-variance trade-off for cross-validation. We demonstrate that the $\mathcal{L}_2$-norm difference between penalized and the corresponding un-penalized regression estimates is directly explained by the GA of the estimates and the GA of empirical moment conditions. Lastly, we prove that all penalized regression estimates are $L_2$-consistent for both the $n \geqslant p$ and the $n < p$ cases. Simulations are used to demonstrate key results. Keywords: generalization ability, upper bound of generalization error, penalized regression, cross-validation, bias-variance trade-off, $\mathcal{L}_2$ difference between penalized and unpenalized regression, lasso, high-dimensional data.

1606.00602 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Variance-Reduced Proximal Stochastic Gradient Descent for Non-convex Composite optimization

方差缩减的非凸复合优化的近端随机梯度下降

Xiyu Yu, Dacheng Tao

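AI summary Studies variance-reduced proximal stochastic gradient methods for non-convex composite optimization and gives an analysis claiming convergence to a stationary point within $O(1/\epsilon)$ iterations, faster than stochastic gradient descent.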

Comments This paper has been withdrawn by the author due to an error in the proof of the convergence rate. They will modify this proof as soon as possible

详情
AI中文摘要

本文研究非凸复合优化问题:首先是一个有限和的光滑但非凸函数,其次是一个具有简单近端映射的通用函数。大多数关于复合优化随机方法的研究假设每个函数是凸的或强凸的。本文通过方差缩减技术(如prox-SVRG和prox-SAGA)将问题扩展到非凸设置。证明在固定步长下,prox-SVRG和prox-SAGA适用于非凸复合优化,并能在O(1/ε)次迭代内收敛至 stationary 点。这与RSAG方法的收敛速度相似,但比随机梯度下降更快。本文分析还扩展到min-batch设置,线性加速收敛。到目前为止,这是首个关于非凸复合优化中方差缩减近端随机梯度方法收敛率的分析。

英文摘要

Here we study non-convex composite optimization, where the objective is the sum of two parts: first, a finite sum of smooth but non-convex functions, and second, a general function that admits a simple proximal mapping. Most research on stochastic methods for composite optimization assumes convexity or strong convexity of each function. In this paper, we extend this problem into the non-convex setting using variance reduction techniques, such as prox-SVRG and prox-SAGA. We prove that, with a constant step size, both prox-SVRG and prox-SAGA are suitable for non-convex composite optimization and converge to a stationary point within $O(1/\epsilon)$ iterations. That is similar to the convergence rate seen with the state-of-the-art RSAG method and faster than stochastic gradient descent. Our analysis is also extended into the mini-batch setting, which linearly accelerates the convergence. To the best of our knowledge, this is the first analysis of the convergence rate of variance-reduced proximal stochastic gradient methods for non-convex composite optimization.

1609.02678 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Identifying Topology of Power Distribution Networks Based on Smart Meter Data

基于智能电表数据识别配电网拓扑

Jayadev P Satya, Nirav Bhatt, Ramkrishna Pasumarthy, Aravind Rajeswaran

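AI summary Proposes a data-driven method that identifies distribution network topology, including load phase connectivity, from smart meter energy data using principal component analysis and its graph-theoretic interpretation.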

Comments Submitted to IEEE transaction on smart grid

详情
AI中文摘要

在配电网中,网络拓扑信息对于高效运行至关重要。由于低电压层级中频繁发生无意识变化,网络连接信息不准确。本文提出了一种新的数据驱动方法,通过时间序列能耗测量数据识别底层网络拓扑,包括负载相连接性。所提方法通过随机生成网络和IEEE认可的Roy Billinton配电网测试系统进行模拟验证。

英文摘要

In a power distribution network, the network topology information is essential for efficient operation of the network. At the low-voltage level, this network connectivity information is not accurately available because of uninformed changes that happen from time to time. In this paper, we propose a novel data-driven approach to identify the underlying network topology, including the load phase connectivity, from time series of energy measurements. The proposed method involves the application of Principal Component Analysis (PCA) and its graph-theoretic interpretation to infer the topology from smart meter energy measurements. The method is demonstrated through simulation on randomly generated networks and also on the IEEE-recognized Roy Billinton distribution test system.

1504.05854 2026-06-04 cs.LG cs.NA math.NA math.OC 版本更新

On-the-fly Approximation of Multivariate Total Variation Minimization

实时多变量总变分最小化近似

Jordan Frecon, Nelly Pustelnik, Patrice Abry, Laurent Condat

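AI summary Proposes an on-the-fly algorithm for multivariate total variation minimization that locally validates the KKT conditions on the dual problem and returns a high-quality approximate solution, with a tunable trade-off between accuracy and computational cost.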

详情
AI中文摘要

在变化点检测背景下,总变分最小化策略被用于解决。本文设计了一种高效的实时算法,针对单变量数据获得精确解。本研究将该策略扩展至多变量数据。所提算法依赖于对偶问题的局部KKT条件验证。显示多变量设置的非局部性使得无法获得精确实时解,因此设计了一种实时算法提供近似解,其质量由可调参数控制,作为精度与计算成本的权衡。性能评估表明,实时获得高质量解的同时,计算成本比标准迭代方法低多个数量级。所提算法为从业者提供了高效的多变量变化点检测实时处理方法。

英文摘要

In the context of change-point detection, addressed by Total Variation minimization strategies, an efficient on-the-fly algorithm has been designed leading to exact solutions for univariate data. In this contribution, an extension of such an on-the-fly strategy to multivariate data is investigated. The proposed algorithm relies on the local validation of the Karush-Kuhn-Tucker conditions on the dual problem. Showing that the non-local nature of the multivariate setting precludes obtaining an exact on-the-fly solution, we devise an on-the-fly algorithm delivering an approximate solution, whose quality is controlled by a practitioner-tunable parameter acting as a trade-off between quality and computational cost. Performance assessment shows that high-quality solutions are obtained on-the-fly at computational costs several orders of magnitude lower than those of standard iterative procedures. The proposed algorithm thus provides practitioners with an efficient on-the-fly procedure for multivariate change-point detection.

1511.04695 2026-06-04 math.NA cs.LG cs.NA 版本更新

An Iterative Reweighted Method for Tucker Decomposition of Incomplete Multiway Tensors

一种用于不完整多维张量Tucker分解的迭代加权方法

Linxiao Yang, Jun Fang, Hongbin Li, Bing Zeng

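AI summary Proposes an iterative reweighted method, built on a group-based log-sum penalty, for low-rank Tucker decomposition of incomplete multiway tensors; it yields a compact representation and determines the multilinear rank automatically.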

详情
AI中文摘要

我们考虑了不完整多维张量的低秩分解问题。由于许多现实数据位于本质上低维子空间中,具有缺失条目的张量低秩分解在推荐系统和图像修复等数据处理问题中具有广泛应用。本文聚焦于Tucker分解,通过多线性操作将N阶张量表示为N个因子矩阵和一个核心张量。为了利用高维数据集中的多线性低秩结构,我们提出了一种基于组的log-sum惩罚函数,以在核心张量上施加结构稀疏性,从而得到具有最小核心张量的紧凑表示。通过迭代最小化一个主导原始目标函数的替代函数,开发了Tucker分解的方法,从而得到迭代加权过程。此外,为了降低计算复杂性,采用了一种过松弛单调快速迭代收缩阈值技术,并将其嵌入到迭代加权过程中。所提出的方法能够自动确定模型复杂度(即多线性秩)。仿真结果表明,所提出的算法在与其他现有算法相比具有竞争力的性能。

英文摘要

We consider the problem of low-rank decomposition of incomplete multiway tensors. Since many real-world data lie on an intrinsically low dimensional subspace, tensor low-rank decomposition with missing entries has applications in many data analysis problems such as recommender systems and image inpainting. In this paper, we focus on Tucker decomposition which represents an Nth-order tensor in terms of N factor matrices and a core tensor via multilinear operations. To exploit the underlying multilinear low-rank structure in high-dimensional datasets, we propose a group-based log-sum penalty functional to place structural sparsity over the core tensor, which leads to a compact representation with smallest core tensor. The method for Tucker decomposition is developed by iteratively minimizing a surrogate function that majorizes the original objective function, which results in an iterative reweighted process. In addition, to reduce the computational complexity, an over-relaxed monotone fast iterative shrinkage-thresholding technique is adapted and embedded in the iterative reweighted process. The proposed method is able to determine the model complexity (i.e. multilinear rank) in an automatic way. Simulation results show that the proposed algorithm offers competitive performance compared with other existing algorithms.

1510.05237 2026-06-04 cs.LG cs.NA cs.SI math.NA 版本更新

Large Enforced Sparse Non-Negative Matrix Factorization

大尺度强制稀疏非负矩阵分解

Brendan Gavin, Vijay Gadepally, Jeremy Kepner

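AI summary Investigates a modification of NMF that enforces sparse intermediate and output matrices, improving memory and compute performance while preserving or improving topic-model accuracy and the convergence rate of the algorithm.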

Comments 9 pages

详情
AI中文摘要

非负矩阵分解(NMF)是一种从文本数据中生成主题模型的常用方法。NMF因其实现简单和计算方便而被广泛接受。然而,将其应用于大规模数据集时,中间矩阵乘积常变得密集,给系统的内存和计算元素带来压力。本文研究了一种简单的但强大的NMF算法修改方法,强制生成稀疏的中间和输出矩阵。该方法通过改进的内存和计算性能使NMF能够应用于大规模数据集。进一步,我们实证表明,这种在NMF中强制稀疏性的方法在保持或提高所生成的主题模型的准确性以及底层算法的收敛速度方面具有优势。

英文摘要

Non-negative matrix factorization (NMF) is a common method for generating topic models from text data. NMF is widely accepted for producing good results despite its relative simplicity of implementation and ease of computation. One challenge with applying NMF to large datasets is that intermediate matrix products often become dense, stressing the memory and compute elements of a system. In this article, we investigate a simple but powerful modification of a common NMF algorithm that enforces the generation of sparse intermediate and output matrices. This method enables the application of NMF to large datasets through improved memory and compute performance. Further, we demonstrate empirically that this method of enforcing sparsity in the NMF either preserves or improves both the accuracy of the resulting topic model and the convergence rate of the underlying algorithm.
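The abstract does not spell out how sparsity is enforced. One plausible reading, shown here purely as an assumption, is an alternating least-squares NMF in which each factor is clipped to be non-negative and then truncated to its largest entries per column. The function names and the `k_w`, `k_h` parameters are hypothetical, and dense arrays are used only to keep the sketch short.

```python
import numpy as np

def keep_top_k_per_column(M, k):
    """Zero all but the k largest entries in each column of M (requires 1 <= k <= rows)."""
    idx = np.argpartition(M, -k, axis=0)[-k:, :]
    out = np.zeros_like(M)
    np.put_along_axis(out, idx, np.take_along_axis(M, idx, axis=0), axis=0)
    return out

def sparse_nmf(X, rank, k_w, k_h, iters=20, seed=0):
    """Alternating least squares with non-negativity clipping and top-k truncation."""
    rng = np.random.default_rng(seed)
    W = rng.random((X.shape[0], rank))
    for _ in range(iters):
        # Solve for H with W fixed, then project onto non-negative, k_h-sparse columns.
        H = np.linalg.lstsq(W, X, rcond=None)[0]
        H = keep_top_k_per_column(np.clip(H, 0.0, None), k_h)
        # Solve for W with H fixed, then project onto non-negative, k_w-sparse columns.
        W = np.linalg.lstsq(H.T, X.T, rcond=None)[0].T
        W = keep_top_k_per_column(np.clip(W, 0.0, None), k_w)
    return W, H
```

In a large-scale setting the memory benefit would come from storing `W` and `H` in a sparse matrix format, which this dense illustration does not do.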

1607.08012 2026-06-04 cs.LG cs.NA math.NA math.OC 版本更新

Learning of Generalized Low-Rank Models: A Greedy Approach

通用低秩模型的学习:一种贪心方法

Quanming Yao, James T. Kwok

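AI summary Proposes a flexible greedy algorithm for generalized low-rank models whose objective may be smooth or nonsmooth, general convex or strongly convex; it has low per-iteration time complexity and a fast convergence rate, and experiments show it is faster than existing methods with comparable or better prediction performance.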

详情
AI中文摘要

低秩矩阵的学习是许多机器学习应用的基础。最先进的算法是秩一矩阵追求(R1MP)。然而,它只能用于具有平方损失的矩阵补全问题。在本文中,我们开发了一种更灵活的贪心算法,用于通用低秩模型,其优化目标可以是平滑或非平滑、一般凸或强凸。所提出的算法具有低的每迭代时间复杂度和快的收敛速度。实验结果表明,它比最先进的方法快得多,预测性能可比或甚至更好。

英文摘要

Learning of low-rank matrices is fundamental to many machine learning applications. A state-of-the-art algorithm is the rank-one matrix pursuit (R1MP). However, it can only be used in matrix completion problems with the square loss. In this paper, we develop a more flexible greedy algorithm for generalized low-rank models whose optimization objective can be smooth or nonsmooth, general convex or strongly convex. The proposed algorithm has low per-iteration time complexity and fast convergence rate. Experimental results show that it is much faster than the state-of-the-art, with comparable or even better prediction performance.

1607.05962 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Indoor occupancy estimation from carbon dioxide concentration

从二氧化碳浓度估计室内占用情况

Chaoyang Jiang, Mustafa K. Masood, Yeng Chai Soh, Hua Li

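AI summary Proposes an indoor occupancy estimator based on carbon dioxide measurements, uses the Feature Scaled Extreme Learning Machine (FS-ELM) algorithm to improve estimation accuracy, and introduces an $x$-tolerance accuracy criterion; tests in an office room show accuracy of up to 94 percent with a tolerance of four occupants.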

Comments 11 pages, 7 figures

详情
AI中文摘要

本文提出了一种基于二氧化碳测量的室内占用估计器,该估计器能够根据二氧化碳浓度实时估计室内实际人数。该估计器本质上是一个占用水平的动态模型。为了识别该动态模型,我们提出了特征缩放极端学习机(FS-ELM)算法,这是标准极端学习机(ELM)的一种变体,但已被证明在占用估计问题中表现更佳。测量的二氧化碳浓度受到严重尖峰的影响。我们发现对二氧化碳数据进行预平滑可以显著提高估计精度。然而在实际应用中,我们无法获得实时全局平滑的二氧化碳数据。我们提供了一种方法,利用局部平滑的二氧化碳数据代替,该数据是实时可用的。我们引入了一个新的准则,即$x$-容忍度准确性,以评估占用估计器。所提出的占用估计器在设有24个工位和11个开放座位的办公室房间中进行了测试。精度高达94%,容忍度为四人。

英文摘要

This paper presents an indoor occupancy estimator with which we can estimate the number of real-time indoor occupants based on the carbon dioxide (CO2) measurement. The estimator is actually a dynamic model of the occupancy level. To identify the dynamic model, we propose the Feature Scaled Extreme Learning Machine (FS-ELM) algorithm, which is a variation of the standard Extreme Learning Machine (ELM) but is shown to perform better for the occupancy estimation problem. The measured CO2 concentration suffers from serious spikes. We find that pre-smoothing the CO2 data can greatly improve the estimation accuracy. In real applications, however, we cannot obtain the real-time globally smoothed CO2 data. We provide a way to use the locally smoothed CO2 data instead, which is real-time available. We introduce a new criterion, i.e. $x$-tolerance accuracy, to assess the occupancy estimator. The proposed occupancy estimator was tested in an office room with 24 cubicles and 11 open seats. The accuracy is up to 94 percent with a tolerance of four occupants.

1402.1298 2026-06-04 math.NA cond-mat.stat-mech cs.IT cs.LG cs.NA math.IT stat.ML 版本更新

Phase transitions and sample complexity in Bayes-optimal matrix factorization

贝叶斯最优矩阵分解中的相变与样本复杂性

Yoshiyuki Kabashima, Florent Krzakala, Marc Mézard, Ayaka Sakata, Lenka Zdeborová

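AI summary Studies phase transitions and sample complexity in Bayes-optimal matrix factorization, using statistical mechanics methods to analyze the achievability and computational tractability of inference, and compares the minimal mean-squared error with the error reached by an efficient approximate message passing algorithm.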

Comments 50 pages, 10 figures

详情
Journal ref
IEEE Transactions on Information Theory (Volume:62 , Issue: 7, Pages: 4228 - 4265) 2016
AI中文摘要

我们分析了矩阵分解问题。给定两个矩阵乘积的噪声测量,问题在于恢复原始矩阵。它在许多应用中出现,如字典学习、盲矩阵校准、稀疏主成分分析、盲源分离、低秩矩阵补全、鲁棒主成分分析或因子分析。它在机器学习中也很重要:无监督表示学习往往可以通过矩阵分解研究。我们使用统计力学工具——空腔和副本方法——来分析贝叶斯最优推断设置中推断问题的可行性和计算可处理性,即假设两个矩阵具有随机独立元素,由某些已知分布生成,并且该信息可供推断算法使用。在此设置中,我们计算了在任何计算时间内理论上可实现的最小均方误差,以及高效近似消息传递算法可达到的误差。计算基于算法的渐进状态演变分析。我们的分析预测的性能,无论是就达到的均方误差而言,还是就样本复杂性而言,都非常有希望,值得进一步发展该算法。

英文摘要

We analyse the matrix factorization problem. Given a noisy measurement of a product of two matrices, the problem is to estimate back the original matrices. It arises in many applications such as dictionary learning, blind matrix calibration, sparse principal component analysis, blind source separation, low rank matrix completion, robust principal component analysis or factor analysis. It is also important in machine learning: unsupervised representation learning can often be studied through matrix factorization. We use the tools of statistical mechanics - the cavity and replica methods - to analyze the achievability and computational tractability of the inference problems in the setting of Bayes-optimal inference, which amounts to assuming that the two matrices have random independent elements generated from some known distribution, and this information is available to the inference algorithm. In this setting, we compute the minimal mean-squared-error achievable in principle in any computational time, and the error that can be achieved by an efficient approximate message passing algorithm. The computation is based on the asymptotic state-evolution analysis of the algorithm. The performance that our analysis predicts, both in terms of the achieved mean-squared-error, and in terms of sample complexity, is extremely promising and motivating for a further development of the algorithm.

1606.02193 2026-06-04 cs.NI cs.LG cs.SY eess.SY 版本更新

Adapting Sampling Interval of Sensor Networks Using On-Line Reinforcement Learning

利用在线强化学习适应传感器网络的采样间隔

Gabriel Martins Dias, Maddalena Nurchis, Boris Bellalta

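AI summary Proposes a dynamic sampling rate adaptation scheme based on reinforcement learning that tunes sensors' sampling intervals on the fly to save energy while maintaining data quality.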

Comments 6 pages, 2 figures, submitted to the IEEE World Forum on Internet of Things 2016

详情
AI中文摘要

无线传感器网络(WSNs)由报告温度、相对湿度等环境参数的传感器节点组成。两次连续测量之间的时间间隔是设置WSN配置时的关键参数,因为它会影响WSN的寿命、无线信道竞争和报告数据的质量。由于监控参数的趋势在不同场景和时间内可能显著变化,确定适用于多个情况的采样间隔具有挑战性。本文提出了一种基于强化学习的动态采样率适应方案,能够根据环境条件和应用需求实时调整传感器的采样间隔。主要目标是将采样间隔设置为最佳值,以避免过采样并节省能量,同时不遗漏对应用相关的重要环境变化。在模拟中,我们的机制相比固定策略可将总传输次数减少多达73%,同时保持WSN提供的信息的平均质量。强化学习算法的内在灵活性使其能够应用于多种场景,从而利用物联网的广泛范围。

英文摘要

Monitoring Wireless Sensor Networks (WSNs) are composed of sensor nodes that report temperature, relative humidity, and other environmental parameters. The time between two successive measurements is a critical parameter to set during the WSN configuration because it can impact the WSN's lifetime, the wireless medium contention, and the quality of the reported data. As trends in monitored parameters can vary significantly between scenarios and over time, identifying a sampling interval suitable for several cases is also challenging. In this work, we propose a dynamic sampling rate adaptation scheme based on reinforcement learning, able to tune sensors' sampling intervals on-the-fly according to environmental conditions and application requirements. The primary goal is to set the sampling interval to the best value possible so as to avoid oversampling and save energy, while not missing environmental changes that can be relevant for the application. In simulations, our mechanism reduced the total number of transmissions by up to 73% compared to a fixed strategy while maintaining the average quality of information provided by the WSN. The inherent flexibility of the reinforcement learning algorithm facilitates its use in several scenarios, so as to exploit the broad scope of the Internet of Things.

1307.4847 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Efficient Reinforcement Learning in Deterministic Systems with Value Function Generalization

在确定性系统中通过价值函数泛化实现高效的强化学习

Zheng Wen, Benjamin Van Roy

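AI summary Proposes optimistic constraint propagation (OCP), an algorithm that combines efficient exploration with value function generalization in finite-horizon deterministic systems, selects optimal actions in all but a bounded number of episodes, and comes with efficiency and asymptotic performance guarantees.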

详情
AI中文摘要

我们考虑在有限时间 horizon 确定性系统中进行强化学习的问题,并提出乐观约束传播(OCP)算法,该算法旨在合成高效的探索和价值函数泛化。我们证明当真实价值函数位于给定的假设类中时,OCP在最多K个episode中选择最优动作,其中K是给定假设类的eluder维度。我们进一步建立了效率和渐进行为保证,即使真实价值函数不位于给定的假设类中,对于假设类是预指定指示函数在不相交集合上的张量的特殊情况。我们还讨论了OCP的计算复杂性,并展示了两个示例的计算结果。

英文摘要

We consider the problem of reinforcement learning over episodes of a finite-horizon deterministic system and as a solution propose optimistic constraint propagation (OCP), an algorithm designed to synthesize efficient exploration and value function generalization. We establish that when the true value function lies within a given hypothesis class, OCP selects optimal actions over all but at most K episodes, where K is the eluder dimension of the given hypothesis class. We establish further efficiency and asymptotic performance guarantees that apply even if the true value function does not lie in the given hypothesis class, for the special case where the hypothesis class is the span of pre-specified indicator functions over disjoint sets. We also discuss the computational complexity of OCP and present computational results involving two illustrative examples.

1607.00345 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Convergence Rate of Frank-Wolfe for Non-Convex Objectives

非凸目标函数下Frank-Wolfe算法的收敛速度

Simon Lacoste-Julien

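AI summary Proves that the Frank-Wolfe algorithm reaches a stationary point at a rate of $O(1/\sqrt{t})$ on non-convex objectives with a Lipschitz continuous gradient; the analysis is affine invariant and gives a rate similar to that known for projected gradient methods, though under slightly different measures of stationarity.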

Comments 6 pages

详情
AI中文摘要

本文证明Frank-Wolfe算法在非凸目标函数上以O(1/√t)速度收敛,且分析为仿射不变,首次在不同稳定性度量下达到与投影梯度方法相似的收敛速度。

英文摘要

We give a simple proof that the Frank-Wolfe algorithm obtains a stationary point at a rate of $O(1/\sqrt{t})$ on non-convex objectives with a Lipschitz continuous gradient. Our analysis is affine invariant and is the first, to the best of our knowledge, giving a similar rate to what was already proven for projected gradient methods (though on slightly different measures of stationarity).
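As a rough illustration of Frank-Wolfe on a non-convex objective, the sketch below runs the method over the probability simplex and records the Frank-Wolfe gap, which is a common measure of stationarity for this method. The choice of domain, the "short step" rule based on a Lipschitz constant, and the function name are assumptions made for the example; the abstract does not state the step-size rule, so this is not presented as the variant analyzed in the paper.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, lipschitz, iters=200):
    """Frank-Wolfe over the probability simplex with a Lipschitz-based short step."""
    x = x0.copy()
    gaps = []
    for _ in range(iters):
        g = grad(x)
        # Linear minimization oracle: the best vertex of the simplex.
        s = np.zeros_like(x)
        s[np.argmin(g)] = 1.0
        d = s - x
        gap = float(-g @ d)  # Frank-Wolfe gap; zero at a stationary point
        gaps.append(gap)
        dd = float(d @ d)
        if gap <= 0.0 or dd == 0.0:
            break
        # Step that minimizes the quadratic upper bound, capped at 1 to stay feasible.
        x = x + min(1.0, gap / (lipschitz * dd)) * d
    return x, gaps
```

With a valid Lipschitz constant for the gradient, each step cannot increase the objective, even when the objective is non-convex, and every iterate stays in the simplex because it is a convex combination of feasible points.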

1607.00514 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Approximate Joint Matrix Triangularization

近似联合矩阵三角化

Nicolo Colombo, Nikos Vlassis

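AI summary Studies approximate joint triangularization of noisy jointly diagonalizable matrices, gives a priori and a posteriori perturbation bounds, and discusses the application to canonical tensor decomposition.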

Comments 19 pages

详情
AI中文摘要

本文研究了噪声联合对角化矩阵的近似联合三角化问题,提出基于理论和观测量的扰动界,并讨论其在张量分解中的应用。

英文摘要

We consider the problem of approximate joint triangularization of a set of noisy jointly diagonalizable real matrices. Approximate joint triangularizers are commonly used in the estimation of the joint eigenstructure of a set of matrices, with applications in signal processing, linear algebra, and tensor decomposition. By assuming the input matrices to be perturbations of noise-free, simultaneously diagonalizable ground-truth matrices, the approximate joint triangularizers are expected to be perturbations of the exact joint triangularizers of the ground-truth matrices. We provide a priori and a posteriori perturbation bounds on the `distance' between an approximate joint triangularizer and its exact counterpart. The a priori bounds are theoretical inequalities that involve functions of the ground-truth matrices and noise matrices, whereas the a posteriori bounds are given in terms of observable quantities that can be computed from the input matrices. From a practical perspective, the problem of finding the best approximate joint triangularizer of a set of noisy matrices amounts to solving a nonconvex optimization problem. We show that, under a condition on the noise level of the input matrices, it is possible to find a good initial triangularizer such that the solution obtained by any local descent-type algorithm has certain global guarantees. Finally, we discuss the application of approximate joint matrix triangularization to canonical tensor decomposition and we derive novel estimation error bounds.

1602.04621 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Deep Exploration via Bootstrapped DQN

通过Bootstrap DQN进行深度探索

Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy

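AI summary Proposes bootstrapped DQN, an algorithm that explores efficiently through randomized value functions and improves learning speed and performance in complex environments, including most Atari games.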

详情
AI中文摘要

复杂环境中的高效探索仍是强化学习的主要挑战。我们提出了Bootstrap DQN,一种简单算法,通过使用随机价值函数在计算和统计上高效地进行探索。与epsilon-greedy等策略不同,Bootstrap DQN实现时序扩展(或深度)探索,可导致学习速度呈指数级提升。我们在复杂随机MDP和大规模 Arcade Learning Environment 中展示了这些优势。Bootstrap DQN在大多数Atari游戏中显著提高了学习时间和性能。

英文摘要

Efficient exploration in complex environments remains a major challenge for reinforcement learning. We propose bootstrapped DQN, a simple algorithm that explores in a computationally and statistically efficient manner through use of randomized value functions. Unlike dithering strategies such as epsilon-greedy exploration, bootstrapped DQN carries out temporally-extended (or deep) exploration; this can lead to exponentially faster learning. We demonstrate these benefits in complex stochastic MDPs and in the large-scale Arcade Learning Environment. Bootstrapped DQN substantially improves learning times and performance across most Atari games.

1606.09383 2026-06-04 cs.LG cs.SY eess.SY 版本更新

On Approximate Dynamic Programming with Multivariate Splines for Adaptive Control

基于多变量样条的近似动态规划在自适应控制中的应用

Willem Eerland, Coen de Visser, Erik-Jan van Kampen

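AI summary Defines an SDP framework based on the RLSTD algorithm and multivariate simplex B-splines, introduces a local forget factor that preserves the continuity of the splines, and shows experimentally the advantages of SDP in tracking time-varying systems and improving control performance.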

Comments 23 pages

详情
AI中文摘要

我们定义了一个基于RLSTD算法和多变量简单样条的SDP框架。我们引入了一个局部遗忘因子,能够保持简单样条的连续性。该局部遗忘因子与RLSTD算法结合,产生了一种能够跟踪时变系统的修改RLSTD算法。我们进行了两个数值实验,一个验证了SDP并将其与NDP进行比较,另一个展示了修改后的RLSTD算法在系统参数改变时的恢复速度优势。尽管SDP每时间步需要更多的计算,但实验表明,在相同的功能近似器参数量下,SDP在稳定性和学习率方面优于NDP。第二个实验表明,SDP结合修改后的RLSTD算法在系统参数改变时比原始RLSTD算法恢复得更快,为自适应高性能非线性控制方法铺平了道路。

英文摘要

We define a SDP framework based on the RLSTD algorithm and multivariate simplex B-splines. We introduce a local forget factor capable of preserving the continuity of the simplex splines. This local forget factor is integrated with the RLSTD algorithm, resulting in a modified RLSTD algorithm that is capable of tracking time-varying systems. We present the results of two numerical experiments, one validating SDP and comparing it with NDP and another to show the advantages of the modified RLSTD algorithm over the original. While SDP requires more computations per time-step, the experiment shows that for the same amount of function approximator parameters, there is an increase in performance in terms of stability and learning rate compared to NDP. The second experiment shows that SDP in combination with the modified RLSTD algorithm allows for faster recovery compared to the original RLSTD algorithm when system parameters are altered, paving the way for an adaptive high-performance non-linear control method.

1606.09333 2026-06-04 math.OC cs.LG cs.NA math.NA Version update

Dimension-Free Iteration Complexity of Finite Sum Optimization Problems

Dimension-Free Iteration Complexity for Finite-Sum Optimization Problems

Yossi Arjevani, Ohad Shamir

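AI summary This paper gives dimension-free lower bounds by extending the framework of Arjevani et al. The bounds cover standard finite-sum optimization methods and stochastic coordinate-descent methods, removing the restrictions that existing lower bounds place on iteration count and dimension.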

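Details
AI-translated abstract

Many classic machine learning problems reduce to convex optimization problems with a finite-sum structure. Yet despite progress on faster algorithms, existing lower bounds do not adequately capture the inherent limits of these problems. Current lower bounds consider only first-order optimization algorithms and apply only in the unrealistic regime where the number of iterations is below O(d/n) (where d is the dimension and n is the number of samples). In this paper we extend the framework of (Arjevani et al., 2015) to give new dimension-free lower bounds that go beyond the assumptions of existing bounds and thereby cover standard finite-sum optimization methods such as SAG, SAGA, SVRG, and SDCA without duality, as well as stochastic coordinate-descent methods such as SDCA and accelerated proximal SDCA.

English abstract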

Many canonical machine learning problems boil down to a convex optimization problem with a finite sum structure. However, whereas much progress has been made in developing faster algorithms for this setting, the inherent limitations of these problems are not satisfactorily addressed by existing lower bounds. Indeed, current bounds focus on first-order optimization algorithms, and only apply in the often unrealistic regime where the number of iterations is less than $\mathcal{O}(d/n)$ (where $d$ is the dimension and $n$ is the number of samples). In this work, we extend the framework of (Arjevani et al., 2015) to provide new lower bounds, which are dimension-free, and go beyond the assumptions of current bounds, thereby covering standard finite sum optimization methods, e.g., SAG, SAGA, SVRG, SDCA without duality, as well as stochastic coordinate-descent methods, such as SDCA and accelerated proximal SDCA.

1606.07149 2026-06-04 cs.NE cs.AI cs.CE cs.LG cs.SY eess.SY Version update

An Approach to Stable Gradient Descent Adaptation of Higher-Order Neural Units

A Method for Stable Gradient-Descent Adaptation of Higher-Order Neural Units

Ivo Bukovsky, Noriyasu Homma

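AI summary This paper proposes a spectral-radius-based method for assessing the stability of the weight-update system of higher-order neural units. Feedforward and recurrent HONUs are adapted by gradient descent, stability is ensured at every adaptation step, and this in turn keeps the adaptation of the whole neural architecture to the target data stable.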

Comments 2016, 13 pages

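Details
Journal ref
IEEE Transactions on Neural Networks and Learning Systems, ISSN: 2162-237X, 2016
AI-translated abstract

This paper introduces a method for evaluating the stability of the weight-update system of higher-order neural units (HONUs), which aggregate neural inputs polynomially (also known as a class of polynomial neural networks), when feedforward and recurrent HONUs are adapted by gradient descent. The core of the method is the spectral radius of the weight-update system, which allows stability to be monitored and maintained at every adaptation step. Ensuring stability of the weight-update system at each individual adaptation step naturally leads to stable adaptation of the whole neural architecture to the target data. In addition, the approach highlights that weight optimization of a HONU is a linear problem, so the proposed method extends in general to any neural architecture that is linear in its adjustable parameters.

English abstract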

Stability evaluation of a weight-update system of higher-order neural units (HONUs) with polynomial aggregation of neural inputs (also known as classes of polynomial neural networks) for adaptation of both feedforward and recurrent HONUs by a gradient descent method is introduced. An essential core of the approach is based on spectral radius of a weight-update system, and it allows stability monitoring and its maintenance at every adaptation step individually. Assuring stability of the weight-update system (at every single adaptation step) naturally results in adaptation stability of the whole neural architecture that adapts to target data. As an aside, the used approach highlights the fact that the weight optimization of HONU is a linear problem, so the proposed approach can be generally extended to any neural architecture that is linear in its adaptable parameters.
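As a rough illustration of the spectral-radius check, the sketch below assumes a second-order unit whose output is linear in its weights, y = w·φ(x), adapted by sample-wise gradient descent. Under that assumption one step can be written as w ← (I − μ φφᵀ) w + μ d φ, where d is the target, and the step is accepted only when the spectral radius of I − μ φφᵀ does not exceed 1. The function names, the feature layout, and the rule of halving μ are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations_with_replacement

import numpy as np


def quadratic_features(x):
    """Return all products x_i * x_j (i <= j) of the augmented input [1, x]."""
    x_aug = np.concatenate(([1.0], np.asarray(x, dtype=float)))
    pairs = combinations_with_replacement(range(len(x_aug)), 2)
    return np.array([x_aug[i] * x_aug[j] for i, j in pairs])


def stable_gd_step(w, phi, target, mu, tol=1e-12):
    """One gradient-descent step; mu is halved until the update matrix is stable."""
    def update_matrix(rate):
        return np.eye(len(w)) - rate * np.outer(phi, phi)

    # The update matrix is symmetric, so eigvalsh gives its (real) spectrum.
    while np.max(np.abs(np.linalg.eigvalsh(update_matrix(mu)))) > 1.0 + tol:
        mu *= 0.5  # hypothetical remedy: shrink the learning rate
    w_new = update_matrix(mu) @ w + mu * target * phi
    return w_new, mu


phi = quadratic_features([2.0, -1.0])
w0 = np.zeros(len(phi))
w1, mu_used = stable_gd_step(w0, phi, target=1.0, mu=1.0)
err_before = 1.0 - w0 @ phi
err_after = 1.0 - w1 @ phi
```

For this rank-one update matrix the only eigenvalue different from 1 is 1 − μ‖φ‖², so the check amounts to requiring μ‖φ‖² ≤ 2; computing the spectrum explicitly keeps the sketch valid for update matrices that are not rank-one.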

1411.0728 2026-06-04 cs.LG cs.GT cs.SY eess.SY math.OC Version update

Approachability in Stackelberg Stochastic Games with Vector Costs

A Study of Approachability in Stackelberg Stochastic Games with Vector Costs

Dileep Kalathil, Vivek Borkar, Rahul Jain

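AI summary Motivated by multi-objective optimization in dynamically changing environments, this paper proposes an approachability strategy for Stackelberg stochastic games with vector costs and designs a computationally tractable algorithm and a reinforcement learning method.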

Comments 18 Pages, Submitted to Dynamic Games and Applications

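Details
AI-translated abstract

The notion of approachability was introduced by Blackwell [1] for vector-valued repeated games. Blackwell's well-known approachability theorem prescribes a strategy that steers a given player's average cost toward a given target set regardless of the other players' strategies. In this paper, motivated by multi-objective optimization and decision-making problems in dynamically changing environments, we study approachability in Stackelberg stochastic games with vector-valued cost functions. We make two main contributions. First, along the lines of Blackwell's work, we give a simple and computationally tractable approachability strategy for Stackelberg stochastic games. Second, we propose a reinforcement learning algorithm for learning the approachable strategy when the transition kernel is unknown. As a by-product, we also recover Blackwell's necessary and sufficient condition for approachability of convex sets, which gives a complete characterization. We also give sufficient conditions for non-convex sets.

English abstract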

The notion of approachability was introduced by Blackwell [1] in the context of vector-valued repeated games. The famous Blackwell's approachability theorem prescribes a strategy for approachability, i.e., for 'steering' the average cost of a given agent towards a given target set, irrespective of the strategies of the other agents. In this paper, motivated by the multi-objective optimization/decision making problems in dynamically changing environments, we address the approachability problem in Stackelberg stochastic games with vector-valued cost functions. We make two main contributions. Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's. Secondly, we give a reinforcement learning algorithm for learning the approachable strategy when the transition kernel is unknown. We also recover as a by-product Blackwell's necessary and sufficient condition for approachability for convex sets in this setup and thus a complete characterization. We also give sufficient conditions for non-convex sets.

1602.02990 2026-06-04 cs.RO cs.LG cs.SY eess.SY Version update

Self-organized control for musculoskeletal robots

Self-Organized Control in Musculoskeletal Robots

Ralf Der, Georg Martius

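AI summary This paper proposes a self-organized control method in which a controller with no functionality of its own produces dynamic interaction between robot and environment, and demonstrates the resulting self-organized behavior, including resonance with object dynamics, on a muscle-driven arm-shoulder system.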

Comments 11 pages, 4 figures, 1 table

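Details
AI-translated abstract

With the rapid development of robot technology, optimal control has become a central research topic. In traditional approaches, the controller chooses actions based on the history of sensor values and preset goals. Elastically actuated robots, however, pose severe difficulties for this approach. This paper proposes a new paradigm of self-organized control, using a controller with no functionality of its own, given by a fixed function of the sensor history. On a muscle-driven arm-shoulder system from the Myorobotics toolkit we observe a wide variety of self-organized behaviors: left alone, the arm produces pseudo-random sequences of poses, and it can also be manipulated into definite motion patterns. Most interestingly, once an object is attached, the controller resonates with the object's internal dynamics: given a half-filled bottle, the system spontaneously shakes the bottle so as to produce the maximum response from the water dynamics; with a pendulum attached, the controller drives it into a circular mode. The paper also discusses the prospects of this controller paradigm for generating intention-driven behavior.

English abstract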

With the accelerated development of robot technologies, optimal control becomes one of the central themes of research. In traditional approaches, the controller, by its internal functionality, finds appropriate actions on the basis of the history of sensor values, guided by the goals, intentions, objectives, learning schemes, and so on planted into it. The idea is that the controller controls the world---the body plus its environment---as reliably as possible. However, in elastically actuated robots this approach faces severe difficulties. This paper advocates for a new paradigm of self-organized control. The paper presents a solution with a controller that is devoid of any functionalities of its own, given by a fixed, explicit and context-free function of the recent history of the sensor values. When applying this controller to a muscle-tendon driven arm-shoulder system from the Myorobotics toolkit, we observe a vast variety of self-organized behavior patterns: when left alone, the arm realizes pseudo-random sequences of different poses but one can also manipulate the system into definite motion patterns. But most interestingly, after attaching an object, the controller gets in a functional resonance with the object's internal dynamics: when given a half-filled bottle, the system spontaneously starts shaking the bottle so that maximum response from the dynamics of the water is being generated. After attaching a pendulum to the arm, the controller drives the pendulum into a circular mode. In this way, the robot discovers dynamical affordances of objects its body is interacting with. We also discuss perspectives for using this controller paradigm for intention driven behavior generation.

1605.09444 2026-06-04 eess.SY cs.LG cs.SY Version update

A Novel Fault Classification Scheme Based on Least Square SVM

A New Fault Classification Scheme Based on the Least Squares Support Vector Machine

Harishchandra Dubey, A. K. Tiwari, Nandita, P. K. Ray, S. R. Mohanty, Nand Kishor

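AI summary This paper proposes a new fault classification scheme based on the least squares support vector machine. It takes the current signal from one-fourth of the post-fault cycle as input, uses four binary classifiers for phase identification and ground detection, and is shown to be accurate and reliable under noise.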

Comments 5 Pages, 6 Figures, 3 Tables

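Details
Journal ref
Harishchandra Dubey et al., "A novel fault classification scheme based on least square SVM," Engineering and Systems (SCES), 2012 Students Conference on, INDIA, 2012, pp. 1-5
AI-translated abstract

This paper proposes a new fault classification scheme based on the least squares support vector machine (LS-SVM) for fault classification and section identification in series-compensated transmission lines. The scheme takes the current signal from one-fourth of the post-fault cycle as input and uses four binary classifiers: three for phase selection and a fourth for ground detection. The results show that the scheme is accurate and reliable in noisy conditions, and simulations confirm its effectiveness for fault classification in series-compensated transmission lines.

English abstract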

This paper presents a novel approach for fault classification and section identification in a series compensated transmission line based on a least squares support vector machine (LS-SVM). The current signal corresponding to one-fourth of the post-fault cycle is used as input to the proposed modular LS-SVM classifier. The proposed scheme uses four binary classifiers: three for selecting among the three phases and a fourth for ground detection. The proposed classification scheme is found to be accurate and reliable in the presence of noise as well. The simulation results validate the efficacy of the proposed scheme for accurate classification of faults in a series compensated transmission line.

1605.09049 2026-06-04 cs.LG cs.NA math.NA stat.ML Version update

Recycling Randomness with Structure for Sublinear time Kernel Expansions

Recycling Randomness Using Structure for Sublinear-Time Kernel Expansions

Krzysztof Choromanski, Vikas Sindhwani

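AI summary This paper proposes a way to approximate various kernel functions with structured matrices, generalizing the Fastfood construction, and supports the use of structured matrices for improving kernel methods with theoretical analysis and experiments.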

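Details
AI-translated abstract

We propose a scheme that approximates various kernel functions in sublinear time by recycling Gaussian random vectors into structured matrices. Our framework includes the Fastfood construction as a special case, and also extends to circulant, Toeplitz, and Hankel matrices and to the broader family of structured matrices with low displacement rank. We introduce notions of coherence and graph-theoretic structural constants that control approximation quality, and we prove unbiasedness and low-variance properties of the random feature maps in our framework. For low-displacement matrices, we show how the degree of structure and randomness can be controlled to reduce statistical variance, at the cost of more computation and storage. Experimental results strongly support our theory and show that a broader family of structured matrices can improve the scalability of kernel methods.

English abstract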

We propose a scheme for recycling Gaussian random vectors into structured matrices to approximate various kernel functions in sublinear time via random embeddings. Our framework includes the Fastfood construction as a special case, but also extends to Circulant, Toeplitz and Hankel matrices, and the broader family of structured matrices that are characterized by the concept of low-displacement rank. We introduce notions of coherence and graph-theoretic structural constants that control the approximation quality, and prove unbiasedness and low-variance properties of random feature maps that arise within our framework. For the case of low-displacement matrices, we show how the degree of structure and randomness can be controlled to reduce statistical variance at the cost of increased computation and storage requirements. Empirical results strongly support our theory and justify the use of a broader family of structured matrices for scaling up kernel methods using random features.

1512.01110 2026-06-04 math.NA cs.AI cs.LG cs.NA Version update

Bayesian Matrix Completion via Adaptive Relaxed Spectral Regularization

Bayesian Matrix Completion Based on Adaptive Relaxed Spectral Regularization

Yang Song, Jun Zhu

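AI summary This paper proposes a Bayesian matrix completion method based on spectral regularization. By relaxing the orthonormality constraints on the singular vectors, it obtains an adaptive spectral regularization suited to Bayesian inference that infers the number of latent factors automatically without parameter tuning and performs well on sparse matrices.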

Comments Accepted to AAAI 2016

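Details
AI-translated abstract

Bayesian matrix completion based on low-rank matrix factorization has produced good results, but there has been little work based on the more direct spectral regularization formulation. This paper fills that gap with a new Bayesian matrix completion method based on spectral regularization. To overcome the difficulty of handling the orthonormality constraints on singular vectors, we derive an equivalent form with relaxed constraints, which leads to an adaptive spectral regularization suited to Bayesian inference. Our Bayesian method needs no parameter tuning and infers the number of latent factors automatically. Experiments on synthetic and real datasets show good performance on rank recovery and collaborative filtering, with notably good results on very sparse matrices.

English abstract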

Bayesian matrix completion has been studied based on a low-rank matrix factorization formulation with promising results. However, little work has been done on Bayesian matrix completion based on the more direct spectral regularization formulation. We fill this gap by presenting a novel Bayesian matrix completion method based on spectral regularization. In order to circumvent the difficulties of dealing with the orthonormality constraints of singular vectors, we derive a new equivalent form with relaxed constraints, which then leads us to design an adaptive version of spectral regularization feasible for Bayesian inference. Our Bayesian method requires no parameter tuning and can infer the number of latent factors automatically. Experiments on synthetic and real datasets demonstrate encouraging results on rank recovery and collaborative filtering, with notably good results for very sparse matrices.

1510.08896 2026-06-04 cs.DS cs.LG cs.NA math.NA math.OC Version update

Robust Shift-and-Invert Preconditioning: Faster and More Sample Efficient Algorithms for Eigenvector Computation

Robust Shift-and-Invert Preconditioning: Faster and More Sample-Efficient Algorithms for Eigenvector Computation

Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

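AI summary This paper gives faster and more sample-efficient algorithms for approximating the top eigenvector of a matrix. They improve on the classical power and Lanczos methods and on prior work based on fast subspace embeddings and stochastic optimization, with better dependence on the stable rank and on ε.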

Comments Manuscript outdated. Updated version at arxiv:1605.08754

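Details
AI-translated abstract

We give faster algorithms and improved sample complexity for approximating the top eigenvector of a matrix. In the offline setting, given an $n \times d$ matrix $A$, we show how to compute an $ε$-approximate top eigenvector in time $\tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}] \cdot \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}] \cdot \log 1/ε)$. Here $sr(A)$ is the stable rank and $gap$ is the multiplicative eigenvalue gap. By separating the $gap$ dependence from $nnz(A)$, we improve on the classical power and Lanczos methods. We also improve on prior work that uses fast subspace embeddings and stochastic optimization, with significantly better dependence on $sr(A)$ and $ε$. Our second running time improves further when $nnz(A) \le \frac{d \cdot sr(A)}{gap^2}$. In the online setting, given a distribution $D$ with covariance matrix $Σ$ and a vector $x_0$ that is an $O(gap)$-approximate top eigenvector, we show how to refine it to an $ε$ approximation using $\tilde O(\frac{v(D)}{gap^2} + \frac{v(D)}{gap \cdot ε})$ samples from $D$. Here $v(D)$ is a natural measure of variance. Combining our algorithm with prior work to initialize $x_0$, we obtain improved sample complexity and running time results. For general distributions, we achieve asymptotically optimal accuracy as the number of samples grows. Our results center on a robust analysis of the classical shift-and-invert preconditioning method, which reduces eigenvector computation to approximately solving a sequence of linear systems. We then apply fast SVRG-based approximate system solvers to obtain our results. We believe these results indicate the broad effectiveness of shift-and-invert approaches and suggest that further computational gains may be possible in practice.

English abstract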

We provide faster algorithms and improved sample complexities for approximating the top eigenvector of a matrix. Offline Setting: Given an $n \times d$ matrix $A$, we show how to compute an $ε$ approximate top eigenvector in time $\tilde O ( [nnz(A) + \frac{d \cdot sr(A)}{gap^2}]\cdot \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}]\cdot \log1/ε)$. Here $sr(A)$ is the stable rank and $gap$ is the multiplicative eigenvalue gap. By separating the $gap$ dependence from $nnz(A)$ we improve on the classic power and Lanczos methods. We also improve prior work using fast subspace embeddings and stochastic optimization, giving significantly improved dependencies on $sr(A)$ and $ε$. Our second running time improves this further when $nnz(A) \le \frac{d\cdot sr(A)}{gap^2}$. Online Setting: Given a distribution $D$ with covariance matrix $Σ$ and a vector $x_0$ which is an $O(gap)$ approximate top eigenvector for $Σ$, we show how to refine to an $ε$ approximation using $\tilde O(\frac{v(D)}{gap^2} + \frac{v(D)}{gap \cdot ε})$ samples from $D$. Here $v(D)$ is a natural variance measure. Combining our algorithm with previous work to initialize $x_0$, we obtain a number of improved sample complexity and runtime results. For general distributions, we achieve asymptotically optimal accuracy as a function of sample size as the number of samples grows large. Our results center around a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast SVRG based approximate system solvers to achieve our claims. We believe our results suggest the general effectiveness of shift-and-invert based approaches and imply that further computational gains may be reaped in practice.

1605.08754 2026-06-04 cs.DS cs.LG cs.NA math.NA math.OC Version update

Faster Eigenvector Computation via Shift-and-Invert Preconditioning

Accelerating Eigenvector Computation via Shift-and-Invert Preconditioning

Dan Garber, Elad Hazan, Chi Jin, Sham M. Kakade, Cameron Musco, Praneeth Netrapalli, Aaron Sidford

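AI summary This paper gives faster algorithms and improved sample complexity for estimating the top eigenvector of a matrix Σ. Separating the gap dependence from the number of nonzeros improves on the classical power and Lanczos methods, and in online estimation a variance measure is used to lower the sample complexity.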

Comments Appearing in ICML 2016. Combination of work in arXiv:1509.05647 and arXiv:1510.08896

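Details
AI-translated abstract

We give faster algorithms and improved sample complexity for estimating the top eigenvector of a matrix $Σ$, that is, computing a unit vector $x$ such that $x^T Σ x \ge (1-ε) λ_1(Σ)$. Offline eigenvector estimation: given an explicit matrix $A \in \mathbb{R}^{n \times d}$ with $Σ = A^T A$, we show how to compute an $ε$-approximate top eigenvector in time $\tilde O([nnz(A) + \frac{d \cdot sr(A)}{gap^2}] \cdot \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d \cdot sr(A))^{1/4}}{\sqrt{gap}}] \cdot \log 1/ε)$. By separating the $gap$ dependence from the $nnz(A)$ term, our first running time improves on the classical power and Lanczos methods. It also improves on prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], with significantly better dependence on $sr(A)$ and $ε$. Our second running time improves these results further when $nnz(A) \le \frac{d \cdot sr(A)}{gap^2}$. Online eigenvector estimation: given a distribution $D$ with covariance matrix $Σ$ and an initial vector $x_0$ that is an $O(gap)$-approximate top eigenvector, we show how to refine it to an $ε$ approximation using $O(\frac{var(D)}{gap \cdot ε})$ samples from $D$. Combining our algorithm with prior work to initialize $x_0$, we obtain improved sample complexity and running time results under a variety of assumptions on $D$. We obtain these results through a general framework that we believe is of independent interest. We give a robust analysis of the classical shift-and-invert preconditioning method, which reduces eigenvector computation to approximately solving a sequence of linear systems, and then apply fast system solvers based on stochastic variance reduced gradient (SVRG) to obtain our results.

English abstract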

We give faster algorithms and improved sample complexities for estimating the top eigenvector of a matrix $Σ$ -- i.e. computing a unit vector $x$ such that $x^T Σx \ge (1-ε)λ_1(Σ)$: Offline Eigenvector Estimation: Given an explicit $A \in \mathbb{R}^{n \times d}$ with $Σ= A^TA$, we show how to compute an $ε$ approximate top eigenvector in time $\tilde O([nnz(A) + \frac{d*sr(A)}{gap^2} ]* \log 1/ε)$ and $\tilde O([\frac{nnz(A)^{3/4} (d*sr(A))^{1/4}}{\sqrt{gap}} ] * \log 1/ε)$. Here $nnz(A)$ is the number of nonzeros in $A$, $sr(A)$ is the stable rank, $gap$ is the relative eigengap. By separating the $gap$ dependence from the $nnz(A)$ term, our first runtime improves upon the classical power and Lanczos methods. It also improves prior work using fast subspace embeddings [AC09, CW13] and stochastic optimization [Sha15c], giving significantly better dependencies on $sr(A)$ and $ε$. Our second running time improves these further when $nnz(A) \le \frac{d*sr(A)}{gap^2}$. Online Eigenvector Estimation: Given a distribution $D$ with covariance matrix $Σ$ and a vector $x_0$ which is an $O(gap)$ approximate top eigenvector for $Σ$, we show how to refine to an $ε$ approximation using $ O(\frac{var(D)}{gap*ε})$ samples from $D$. Here $var(D)$ is a natural notion of variance. Combining our algorithm with previous work to initialize $x_0$, we obtain improved sample complexity and runtime results under a variety of assumptions on $D$. We achieve our results using a general framework that we believe is of independent interest. We give a robust analysis of the classic method of shift-and-invert preconditioning to reduce eigenvector computation to approximately solving a sequence of linear systems. We then apply fast stochastic variance reduced gradient (SVRG) based system solvers to achieve our claims.
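The reduction from eigenvector computation to a sequence of linear systems can be sketched as power iteration on the inverse of a shifted matrix. The sketch below is only an illustration under simplifying assumptions: it solves each system exactly with `numpy.linalg.solve` rather than with an approximate SVRG-based solver, and it picks the shift from the exact spectrum, which a real algorithm would have to estimate. The function name and the constant 0.1 are hypothetical choices.

```python
import numpy as np


def shift_invert_top_eigvec(sigma, shift, iters=50, seed=0):
    """Power iteration on (shift * I - sigma)^{-1}; needs shift > lambda_1(sigma)."""
    d = sigma.shape[0]
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(d)
    x /= np.linalg.norm(x)
    shifted = shift * np.eye(d) - sigma
    for _ in range(iters):
        # Each iteration is one linear-system solve (exact here, for simplicity).
        x = np.linalg.solve(shifted, x)
        x /= np.linalg.norm(x)
    return x


rng = np.random.default_rng(1)
a = rng.standard_normal((200, 10))
sigma = a.T @ a
evals, evecs = np.linalg.eigh(sigma)          # used only to pick the shift and to check
lam1, lam2 = evals[-1], evals[-2]
shift = lam1 + 0.1 * (lam1 - lam2)            # assumed: a shift just above lambda_1
x = shift_invert_top_eigvec(sigma, shift)
rayleigh = x @ sigma @ x
```

With this shift the two largest eigenvalues of the inverted matrix are 1/(0.1·δ) and 1/(1.1·δ), where δ = λ₁ − λ₂, so every iteration shrinks the unwanted components by a factor of about 11 no matter how small the original gap is. That is the reason for preconditioning in this way.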

1605.08527 2026-06-04 math.OC cs.LG cs.NA math.NA Version update

Stochastic Optimization for Large-scale Optimal Transport

Stochastic Optimization for Large-Scale Optimal Transport

Genevay Aude, Marco Cuturi, Gabriel Peyré, Francis Bach

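AI summary This paper proposes new stochastic optimization algorithms for large-scale optimal transport. They handle arbitrary distributions through sampling, avoid discretization, come with convergence guarantees, and suit high-dimensional learning settings.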

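Details
AI-translated abstract

Optimal transport (OT) defines a powerful framework for comparing probability distributions in a geometrically faithful way. However, its computational burden limits its practical use. This paper proposes a new class of stochastic optimization algorithms that can handle arbitrary distributions (discrete or continuous) and only require the ability to sample from them, which is the typical setting in high-dimensional learning problems. This removes the need to discretize these densities while giving provably convergent methods that output the correct distance without discretization error. The algorithms rely on two main ideas: (a) the dual OT problem can be recast as the maximization of an expectation; (b) entropic regularization of the primal OT problem leads to a smooth dual optimization problem that can be solved with algorithms that have provably faster convergence. We instantiate these ideas in three settings: (i) when comparing two discrete distributions, we show that incremental stochastic optimization schemes can outperform Sinkhorn's algorithm, the current state-of-the-art finite-dimensional OT solver; (ii) when comparing a discrete distribution with a continuous density, a semi-discrete reformulation of the dual program is amenable to averaged stochastic gradient descent, which performs better than approximately solving the problem through discretization; (iii) when dealing with two continuous densities, we propose stochastic gradient descent over a reproducing kernel Hilbert space (RKHS). This is currently the only known method for this problem apart from computing OT on finite samples. We validate these claims on a set of discrete, semi-discrete, and continuous benchmark problems.

English abstract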

Optimal transport (OT) defines a powerful framework to compare probability distributions in a geometrically faithful way. However, the practical impact of OT is still limited because of its computational burden. We propose a new class of stochastic optimization algorithms to cope with large-scale problems routinely encountered in machine learning applications. These methods are able to manipulate arbitrary distributions (either discrete or continuous) by simply requiring to be able to draw samples from them, which is the typical setup in high-dimensional learning problems. This alleviates the need to discretize these densities, while giving access to provably convergent methods that output the correct distance without discretization error. These algorithms rely on two main ideas: (a) the dual OT problem can be re-cast as the maximization of an expectation; (b) entropic regularization of the primal OT problem results in a smooth dual optimization problem which can be addressed with algorithms that have a provably faster convergence. We instantiate these ideas in three different setups: (i) when comparing a discrete distribution to another, we show that incremental stochastic optimization schemes can beat Sinkhorn's algorithm, the current state-of-the-art finite dimensional OT solver; (ii) when comparing a discrete distribution to a continuous density, a semi-discrete reformulation of the dual program is amenable to averaged stochastic gradient descent, leading to better performance than approximately solving the problem by discretization; (iii) when dealing with two continuous densities, we propose a stochastic gradient descent over a reproducing kernel Hilbert space (RKHS). This is currently the only known method to solve this problem, apart from computing OT on finite samples. We back up these claims on a set of discrete, semi-discrete and continuous benchmark problems.

1506.02159 2026-06-04 math.NA cs.LG cs.NA math.OC Version update

Riemannian preconditioning for tensor completion

Riemannian Preconditioning for Tensor Completion

Hiroyuki Kasai, Bamdev Mishra

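AI summary This paper proposes a new Riemannian preconditioning approach to tensor completion. It exploits the least-squares structure and the symmetry of the Tucker decomposition to develop an efficient nonlinear conjugate gradient algorithm, which performs well across datasets in experiments.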

Comments Supplementary material included in the paper. An extension of the paper is in arXiv:1605.08257

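Details
AI-translated abstract

We propose a novel Riemannian preconditioning approach for the tensor completion problem with a rank constraint. We propose a Riemannian metric, or inner product, that exploits the least-squares structure of the cost function and accounts for the structured symmetry in the Tucker decomposition. This metric makes it possible to use the framework of Riemannian optimization on quotient manifolds to develop a preconditioned nonlinear conjugate gradient algorithm. To this end, matrix representations of the various optimization-related ingredients are listed. Numerical comparisons indicate that the proposed algorithm robustly outperforms state-of-the-art algorithms across problem instances covering a variety of synthetic and real-world datasets.

English abstract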

We propose a novel Riemannian preconditioning approach for the tensor completion problem with rank constraint. A Riemannian metric or inner product is proposed that exploits the least-squares structure of the cost function and takes into account the structured symmetry in Tucker decomposition. The specific metric allows to use the versatile framework of Riemannian optimization on quotient manifolds to develop a preconditioned nonlinear conjugate gradient algorithm for the problem. To this end, concrete matrix representations of various optimization-related ingredients are listed. Numerical comparisons suggest that our proposed algorithm robustly outperforms state-of-the-art algorithms across different problem instances encompassing various synthetic and real-world datasets.

1605.08257 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML Version update

Low-rank tensor completion: a Riemannian manifold preconditioning approach

Low-Rank Tensor Completion: A Riemannian Manifold Preconditioning Method

Hiroyuki Kasai, Bamdev Mishra

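AI summary This paper proposes a Riemannian manifold preconditioning approach for tensor completion with a rank constraint. It introduces a new Riemannian metric that exploits the least-squares structure and the symmetry of the Tucker decomposition, develops preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms, and reports experiments in which they outperform existing methods across datasets.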

Comments The 33rd International Conference on Machine Learning (ICML 2016). arXiv admin note: substantial text overlap with arXiv:1506.02159

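Details
AI-translated abstract

We propose a novel Riemannian manifold preconditioning approach for the tensor completion problem with a rank constraint. We propose a new Riemannian metric, or inner product, that exploits the least-squares structure of the cost function and accounts for the structured symmetry present in the Tucker decomposition. This metric makes it possible to use Riemannian optimization on quotient manifolds to develop preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms for the batch and online settings, respectively. Matrix representations of the various optimization-related ingredients are listed. Numerical comparisons indicate that the proposed algorithms robustly outperform existing methods across different synthetic and real-world datasets.

English abstract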

We propose a novel Riemannian manifold preconditioning approach for the tensor completion problem with rank constraint. A novel Riemannian metric or inner product is proposed that exploits the least-squares structure of the cost function and takes into account the structured symmetry that exists in Tucker decomposition. The specific metric allows to use the versatile framework of Riemannian optimization on quotient manifolds to develop preconditioned nonlinear conjugate gradient and stochastic gradient descent algorithms for batch and online setups, respectively. Concrete matrix representations of various optimization-related ingredients are listed. Numerical comparisons suggest that our proposed algorithms robustly outperform state-of-the-art algorithms across different synthetic and real-world datasets.

1511.03722 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ME stat.ML Version update

Doubly Robust Off-policy Value Evaluation for Reinforcement Learning

Doubly Robust Off-Policy Value Evaluation in Reinforcement Learning

Nan Jiang, Lihong Li

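AI summary This paper proposes a doubly robust estimator for off-policy value evaluation that is unbiased and has low variance, and validates it on benchmark problems.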

Comments 14 pages; 4 figures; ICML 2016

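Details
AI-translated abstract

This paper studies off-policy value evaluation in reinforcement learning (RL), where the goal is to estimate the value of a new policy from data collected by a different policy. This problem is often a key step in applying RL to real-world problems. Despite its importance, existing general methods either have uncontrolled bias or high variance. In this paper we extend the doubly robust estimator for bandits to sequential decision-making problems and get the best of both worlds: the estimator is guaranteed to be unbiased and can have much lower variance than the popular importance sampling estimators. We demonstrate the estimator's accuracy on several benchmark problems and show its use as a subroutine in safe policy improvement. We also give theoretical results on the hardness of the problem and prove that in some cases our estimator matches the lower bound.

English abstract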

We study the problem of off-policy value evaluation in reinforcement learning (RL), where one aims to estimate the value of a new policy based on data collected by a different policy. This problem is often a critical step when applying RL in real-world problems. Despite its importance, existing general methods either have uncontrolled bias or suffer high variance. In this work, we extend the doubly robust estimator for bandits to sequential decision-making problems, which gets the best of both worlds: it is guaranteed to be unbiased and can have a much lower variance than the popular importance sampling estimators. We demonstrate the estimator's accuracy in several benchmark problems, and illustrate its use as a subroutine in safe policy improvement. We also provide theoretical results on the hardness of the problem, and show that our estimator can match the lower bound in certain scenarios.
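A doubly robust estimate for a single trajectory can be written as a backward recursion that starts from a model-based value estimate and adds an importance-weighted correction for the model's error. The sketch below assumes that form, V_DR ← V̂(s) + ρ (r + γ V_DR − Q̂(s, a)) with ρ = π_eval(a|s) / π_behav(a|s); the function and argument names are illustrative, and the two-action example is a made-up toy problem rather than one of the paper's benchmarks.

```python
def doubly_robust_value(trajectory, pi_eval, pi_behav, q_hat, actions, gamma=1.0):
    """Doubly robust value estimate from one trajectory of (state, action, reward)."""
    v_dr = 0.0
    for state, action, reward in reversed(trajectory):
        rho = pi_eval(action, state) / pi_behav(action, state)   # importance ratio
        v_hat = sum(pi_eval(a, state) * q_hat(state, a) for a in actions)
        # Model estimate plus importance-weighted correction of the model's error.
        v_dr = v_hat + rho * (reward + gamma * v_dr - q_hat(state, action))
    return v_dr


def pi_eval(action, state):
    return 0.8 if action == 1 else 0.2   # hypothetical evaluation policy


def pi_behav(action, state):
    return 0.5                            # hypothetical uniform behavior policy


# Toy one-step problem: the reward equals the action taken.
exact_model = doubly_robust_value([(0, 0, 0.0)], pi_eval, pi_behav,
                                  q_hat=lambda s, a: float(a), actions=[0, 1])
no_model = doubly_robust_value([(0, 1, 1.0)], pi_eval, pi_behav,
                               q_hat=lambda s, a: 0.0, actions=[0, 1])
```

The two calls show the two extremes: with an exact model the correction term vanishes and the estimate equals the model value, and with a zero model the recursion reduces to ordinary step-wise importance sampling.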

1605.06968 2026-06-04 math.NA cs.LG cs.NA math.OC Version update

A Riemannian gossip approach to decentralized matrix completion

A Decentralized Matrix Completion Method Based on Riemannian Manifolds

Bamdev Mishra, Hiroyuki Kasai, Atul Saroop

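AI summary This paper proposes new gossip algorithms on a Riemannian manifold for low-rank decentralized matrix completion. They allow local matrix completion while reaching asymptotic consensus on the global low-rank factors, and they are scalable and parallelizable.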

Comments Under review

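Details
AI-translated abstract

This paper proposes novel gossip algorithms for the low-rank decentralized matrix completion problem. The proposed algorithms operate on a Riemannian manifold and let different agents perform local matrix completion while reaching asymptotic consensus on the global low-rank factors. The resulting approach is scalable and parallelizable. Numerical experiments show that the proposed algorithms perform well on various benchmarks.

English abstract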

In this paper, we propose novel gossip algorithms for the low-rank decentralized matrix completion problem. The proposed approach is on the Riemannian Grassmann manifold that allows local matrix completion by different agents while achieving asymptotic consensus on the global low-rank factors. The resulting approach is scalable and parallelizable. Our numerical experiments show the good performance of the proposed algorithms on various benchmarks.

1605.02196 2026-06-04 eess.SY cs.CV cs.LG cs.RO cs.SY Version update

All Weather Perception: Joint Data Association, Tracking, and Classification for Autonomous Ground Vehicles

All-Weather Perception: A Joint Solution to Data Association, Tracking, and Classification for Autonomous Ground Vehicles

Peter Radecki, Mark Campbell, Kevin Matzen

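AI summary This paper proposes a new probabilistic perception algorithm for data association, object tracking, and classification for autonomous ground vehicles in all weather conditions. The algorithm extends an existing Rao-Blackwellized particle filter with multiple-model tracking for classification, and experiments on Cornell's upgraded AGV demonstrate the robustness of state-of-the-art vision algorithms in adverse weather.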

Comments 35 pages, 21 figures, 14 tables

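Details
AI-translated abstract

This paper presents a novel probabilistic perception algorithm as a real-time joint solution to data association, object tracking, and object classification for autonomous ground vehicles in all-weather conditions. The algorithm extends a Rao-Blackwellized particle filter that originally used a particle filter for data association and a Kalman filter for multi-object tracking (Miller et al., 2011a), and now also includes multiple-model tracking for classification. In addition, a state-of-the-art vision detection algorithm that includes heading information was implemented for autonomous ground vehicle (AGV) applications. Cornell's AGV from the DARPA Urban Challenge was upgraded and used in experiments to examine whether state-of-the-art vision algorithms can complement or replace lidar and radar sensors. Sensor and algorithm performance was tested in adverse weather and lighting conditions. The experimental evaluation shows that, within the joint probabilistic perception algorithm, camera, lidar, and radar sensors together achieve robust all-weather data association, tracking, and classification.

English abstract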

A novel probabilistic perception algorithm is presented as a real-time joint solution to data association, object tracking, and object classification for an autonomous ground vehicle in all-weather conditions. The presented algorithm extends a Rao-Blackwellized Particle Filter originally built with a particle filter for data association and a Kalman filter for multi-object tracking (Miller et al. 2011a) to now also include multiple model tracking for classification. Additionally a state-of-the-art vision detection algorithm that includes heading information for autonomous ground vehicle (AGV) applications was implemented. Cornell's AGV from the DARPA Urban Challenge was upgraded and used to experimentally examine if and how state-of-the-art vision algorithms can complement or replace lidar and radar sensors. Sensor and algorithm performance in adverse weather and lighting conditions is tested. Experimental evaluation demonstrates robust all-weather data association, tracking, and classification where camera, lidar, and radar sensors complement each other inside the joint probabilistic perception algorithm.

1507.00333 2026-06-04 math.NA cs.IR cs.LG cs.NA Version update

Notes on Low-rank Matrix Factorization

Notes on Low-Rank Matrix Factorization

Yuan Lu, Jie Yang

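AI summary This paper reviews variants of low-rank matrix factorization, including basic, non-negative, and orthogonal non-negative factorization, discusses their use in dimension reduction, clustering, and matrix completion, and considers extensions to sparse matrix completion and semi-supervised learning.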

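Details
AI-translated abstract

Low-rank matrix factorization (MF) is an important technique in data science. The key idea of MF is that data contain latent structure, and uncovering it yields a compressed representation of the data. By factorizing an original matrix into low-rank matrices, MF provides a unified method for dimension reduction, clustering, and matrix completion. This article reviews several important variants of MF, including basic MF, non-negative MF, and orthogonal non-negative MF. Non-negative MF and orthogonal non-negative MF are variants of basic MF with non-negativity and/or orthogonality constraints, which are useful in specific scenarios. The first part of the article introduces the application scenarios, distinctive properties, and optimization methods of each model. With suitable adaptation, MF can go beyond clustering and matrix completion. The second part extends MF to sparse matrix completion, enhances matrix completion with various regularization methods, and uses MF for (semi-)supervised learning by introducing latent space reinforcement and transformation. We will see that MF is not only a useful model but also a flexible framework that applies to a variety of prediction problems.

English abstract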

Low-rank matrix factorization (MF) is an important technique in data science. The key idea of MF is that there exist latent structures in the data, by uncovering which we could obtain a compressed representation of the data. By factorizing an original matrix into low-rank matrices, MF provides a unified method for dimension reduction, clustering, and matrix completion. In this article we review several important variants of MF, including basic MF, non-negative MF, and orthogonal non-negative MF. As their names suggest, non-negative MF and orthogonal non-negative MF are variants of basic MF with non-negativity and/or orthogonality constraints. Such constraints are useful in specific scenarios. In the first part of this article, we introduce, for each of these models, the application scenarios, the distinctive properties, and the optimization method. By properly adapting MF, we can go beyond the problem of clustering and matrix completion. In the second part of this article, we will extend MF to sparse matrix completion, enhance matrix completion using various regularization methods, and make use of MF for (semi-)supervised learning by introducing latent space reinforcement and transformation. We will see that MF is not only a useful model but also a flexible framework that is applicable to various prediction problems.
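For readers who want a concrete picture of basic MF, the sketch below factorizes a fully observed matrix X into U Vᵀ by alternating ridge-regularized least squares. This is one common way to fit the basic model and is not necessarily the optimization method the article describes; the function name, the regularization weight, and the synthetic rank-2 test matrix are assumptions made for illustration.

```python
import numpy as np


def factorize(x, rank, iters=100, reg=1e-3, seed=0):
    """Approximate x (n x m) by u @ v.T with u (n x rank) and v (m x rank)."""
    rng = np.random.default_rng(seed)
    n, m = x.shape
    u = rng.standard_normal((n, rank))
    v = rng.standard_normal((m, rank))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        # Fix v and solve the regularized least-squares problem for u, then swap.
        u = np.linalg.solve(v.T @ v + ridge, v.T @ x.T).T
        v = np.linalg.solve(u.T @ u + ridge, u.T @ x).T
    return u, v


rng = np.random.default_rng(1)
x = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))  # exactly rank 2
u, v = factorize(x, rank=2)
rel_err = np.linalg.norm(x - u @ v.T) / np.linalg.norm(x)
```

Each half-step is a convex problem with a closed-form solution, which is why alternating schemes are a popular starting point. The non-negative and orthogonal variants need constrained updates in place of the plain linear solves.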

1605.00716 2026-06-04 cs.LG cs.NI cs.SY eess.SY Version update

Radio Transformer Networks: Attention Models for Learning to Synchronize in Wireless Systems

Radio Transformer Networks in Wireless Systems: Attention Models for Learning to Synchronize

Timothy J O'Shea, Latha Pemula, Dhruv Batra, T. Charles Clancy

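AI summary This paper uses spatial transformer networks and new transformations suited to the radio domain to introduce learned attention models that improve modulation recognition accuracy. The network is optimized for classification accuracy, sparse representation, and regularization, and learns to synchronize and normalize signals.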

Comments 5 pages, 8 figures

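Details
AI-translated abstract

We introduce learned attention models into the radio machine learning domain for the modulation recognition task, leveraging spatial transformer networks and introducing new transformations suited to the radio domain. The attention model lets the network learn a localization network that blindly synchronizes and normalizes radio signals, based on optimizing for classification accuracy, sparse representation, and regularization. With this architecture we outperform our prior results in accuracy versus signal-to-noise ratio, and we believe this kind of attention model has far-reaching implications beyond the modulation recognition task.

English abstract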

We introduce learned attention models into the radio machine learning domain for the task of modulation recognition by leveraging spatial transformer networks and introducing new radio domain appropriate transformations. This attention model allows the network to learn a localization network capable of synchronizing and normalizing a radio signal blindly, with zero knowledge of the signal's structure, based on optimization of the network for classification accuracy, sparse representation, and regularization. Using this architecture we are able to outperform our prior results in accuracy vs signal to noise ratio against an identical system without attention. However, we believe such an attention model has implications far beyond the task of modulation recognition.

1509.02604 2026-06-04 cs.DC cs.LG cs.SY eess.SY Version update

Asynchronous Distributed ADMM for Large-Scale Optimization- Part II: Linear Convergence Analysis and Numerical Performance

Asynchronous Distributed ADMM for Large-Scale Optimization, Part II: Linear Convergence Analysis and Numerical Performance

Tsung-Hui Chang, Wei-Cheng Liao, Mingyi Hong, Xiangfeng Wang

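AI summary This paper studies the conditions under which asynchronous distributed ADMM converges linearly and its efficiency on large-scale logistic regression.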

Comments submitted for publication, 28 pages

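Details
AI-translated abstract

This paper studies the conditions for linear convergence of asynchronous distributed ADMM and its efficiency on large-scale logistic regression.

English abstract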

The alternating direction method of multipliers (ADMM) has been recognized as a versatile approach for solving modern large-scale machine learning and signal processing problems efficiently. When the data size and/or the problem dimension is large, a distributed version of ADMM can be used, which is capable of distributing the computation load and the data set to a network of computing nodes. Unfortunately, a direct synchronous implementation of such algorithm does not scale well with the problem size, as the algorithm speed is limited by the slowest computing nodes. To address this issue, in a companion paper, we have proposed an asynchronous distributed ADMM (AD-ADMM) and studied its worst-case convergence conditions. In this paper, we further the study by characterizing the conditions under which the AD-ADMM achieves linear convergence. Our conditions as well as the resulting linear rates reveal the impact that various algorithm parameters, network delay and network size have on the algorithm performance. To demonstrate the superior time efficiency of the proposed AD-ADMM, we test the AD-ADMM on a high-performance computer cluster by solving a large-scale logistic regression problem.

1509.02597 2026-06-04 cs.DC cs.LG cs.SY eess.SY Version update

Asynchronous Distributed ADMM for Large-Scale Optimization- Part I: Algorithm and Convergence Analysis

Asynchronous Distributed ADMM for Large-Scale Optimization, Part I: Algorithm and Convergence Analysis

Tsung-Hui Chang, Mingyi Hong, Wei-Cheng Liao, Xiangfeng Wang

AI总结 本文提出异步分布式ADMM算法,用于解决大规模学习问题,通过共识问题建模,在星型网络上并行求解,克服传统同步计算在异构网络中的效率瓶颈。

Comments 37 pages

详情
AI中文摘要

本文旨在解决大规模学习问题,研究基于交替方向乘子法(ADMM)的分布式优化方法。通过将学习问题建模为共识问题,ADMM可以在全平行方式下在具有星型拓扑的计算机网络上求解共识问题。然而,传统同步计算在问题规模扩大时效率低下,因为算法速度受限于最慢的工人。在异构网络中,计算节点经历不同的计算和通信延迟,这一问题尤为突出。本文提出异步分布式ADMM(AD-ADMM),以有效提高分布式优化的时间效率。我们的主要兴趣在于分析AD-ADMM的收敛条件,基于流行的部分异步模型,该模型基于网络的最大可容忍延迟定义。具体而言,通过考虑一般且可能非凸的成本函数,我们证明只要算法参数根据网络延迟适当选择,AD-ADMM就保证收敛到KKT点集。我们进一步说明,ADMM的异步性必须谨慎处理,因为对AD-ADMM实现的轻微修改可能会破坏算法收敛性,即使在标准凸设置下也是如此。

英文摘要

Aiming at solving large-scale learning problems, this paper studies distributed optimization methods based on the alternating direction method of multipliers (ADMM). By formulating the learning problem as a consensus problem, the ADMM can be used to solve the consensus problem in a fully parallel fashion over a computer network with a star topology. However, traditional synchronized computation does not scale well with the problem size, as the speed of the algorithm is limited by the slowest workers. This is particularly true in a heterogeneous network where the computing nodes experience different computation and communication delays. In this paper, we propose an asynchronous distributed ADMM (AD-ADMM) which can effectively improve the time efficiency of distributed optimization. Our main interest lies in analyzing the convergence conditions of the AD-ADMM, under the popular partially asynchronous model, which is defined based on a maximum tolerable delay of the network. Specifically, by considering general and possibly non-convex cost functions, we show that the AD-ADMM is guaranteed to converge to the set of Karush-Kuhn-Tucker (KKT) points as long as the algorithm parameters are chosen appropriately according to the network delay. We further illustrate that the asynchrony of the ADMM has to be handled with care, as slightly modifying the implementation of the AD-ADMM can jeopardize the algorithm convergence, even under a standard convex setting.
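
To make the consensus formulation concrete, the following is a minimal sketch of the synchronous baseline that the abstract contrasts against: consensus ADMM over a star topology (one master, several workers) for a least-squares cost. It is not the asynchronous AD-ADMM of the paper. The function name, the least-squares objective, and the penalty value `rho` are assumptions chosen for illustration.

```python
import numpy as np

def consensus_admm_least_squares(A_parts, b_parts, rho=1.0, iters=200):
    """Synchronous consensus ADMM for min_x sum_i 0.5 * ||A_i x - b_i||^2.

    Each (A_i, b_i) is the data held by worker i; z is the master's
    consensus variable. Illustrative sketch only (hypothetical helper).
    """
    n = A_parts[0].shape[1]
    num_workers = len(A_parts)
    x = [np.zeros(n) for _ in range(num_workers)]    # local copies
    lam = [np.zeros(n) for _ in range(num_workers)]  # dual variables
    z = np.zeros(n)
    # Local systems (A_i^T A_i + rho I) and right-hand sides A_i^T b_i.
    lhs = [A.T @ A + rho * np.eye(n) for A in A_parts]
    atb = [A.T @ b for A, b in zip(A_parts, b_parts)]
    for _ in range(iters):
        # Worker step: minimize the local augmented Lagrangian in x_i.
        for i in range(num_workers):
            x[i] = np.linalg.solve(lhs[i], atb[i] - lam[i] + rho * z)
        # Master step: the synchronous version waits for ALL workers here.
        z = sum(x[i] + lam[i] / rho for i in range(num_workers)) / num_workers
        # Dual ascent on the consensus constraint x_i = z.
        for i in range(num_workers):
            lam[i] = lam[i] + rho * (x[i] - z)
    return z
```

The master step is the synchronization point: it cannot proceed until the slowest worker has reported, which is the bottleneck the asynchronous variant is designed to remove.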

1512.00984 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Fast Low-Rank Matrix Learning with Nonconvex Regularization

快速低秩矩阵学习与非凸正则化

Quanming Yao, James T. Kwok, Wenliang Zhong

AI总结 本文提出一种利用非凸正则化快速学习低秩矩阵的方法,通过截断奇异值和幂方法提升效率,实现更准确的矩阵恢复。

Comments Long version of conference paper appeared ICDM 2015

详情
AI中文摘要

低秩建模在机器学习、计算机视觉和社会网络分析中有广泛应用。尽管核范数常用于近似矩阵秩,但非凸低秩正则化在恢复性能上更优。然而,由此产生的优化问题更具挑战性。最近的最先进方法基于近端梯度算法,但需要每次近端步骤进行昂贵的完整SVD。本文表明,对于许多常用非凸低秩正则化器,可以推导出截断,自动阈值化由近端算子获得的奇异值。这使得可以高效地用幂方法近似SVD。此外,近端算子可以简化为一个较小矩阵在该主子空间上的投影。可以保证O(1/T)的收敛率,其中T是迭代次数。在矩阵补全和鲁棒主成分分析上进行了广泛实验。所提方法在最先进方法上实现了显著加速。此外,获得的矩阵解比传统核范数正则化器更准确且秩更低。

英文摘要

Low-rank modeling has a lot of important applications in machine learning, computer vision and social network analysis. While the matrix rank is often approximated by the convex nuclear norm, the use of nonconvex low-rank regularizers has demonstrated better recovery performance. However, the resultant optimization problem is much more challenging. A very recent state-of-the-art method is based on the proximal gradient algorithm. However, it requires an expensive full SVD in each proximal step. In this paper, we show that for many commonly-used nonconvex low-rank regularizers, a cutoff can be derived to automatically threshold the singular values obtained from the proximal operator. This allows the use of the power method to approximate the SVD efficiently. Besides, the proximal operator can be reduced to that of a much smaller matrix projected onto this leading subspace. Convergence, with a rate of O(1/T) where T is the number of iterations, can be guaranteed. Extensive experiments are performed on matrix completion and robust principal component analysis. The proposed method achieves significant speedup over the state-of-the-art. Moreover, the matrix solution obtained is more accurate and has a lower rank than that of the traditional nuclear norm regularizer.
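
The idea of replacing a full SVD by a power-method subspace plus a small SVD can be sketched as follows. This is an illustration under stated assumptions, not the paper's algorithm: soft-thresholding (the proximal operator of the convex nuclear norm) stands in for the nonconvex regularizers, and the function name, the rank guess `k`, and the number of power iterations are hypothetical.

```python
import numpy as np

def approx_singular_value_prox(A, k, tau, power_iters=3, seed=0):
    """Approximate singular-value thresholding of A via a rank-k subspace.

    Soft-thresholding is used as a stand-in for a nonconvex thresholding rule.
    """
    rng = np.random.default_rng(seed)
    # Power (subspace) iteration: orthonormal basis Q for the leading
    # k-dimensional column space of A.
    Q, _ = np.linalg.qr(A @ rng.standard_normal((A.shape[1], k)))
    for _ in range(power_iters):
        Q, _ = np.linalg.qr(A @ (A.T @ Q))
    # SVD of the much smaller k x n projected matrix Q^T A.
    Ub, s, Vt = np.linalg.svd(Q.T @ A, full_matrices=False)
    # Threshold the singular values; anything below tau is cut to zero.
    s_thr = np.maximum(s - tau, 0.0)
    return ((Q @ Ub) * s_thr) @ Vt
```

When the thresholded result has rank at most `k`, only the leading subspace matters, which is why a full SVD is unnecessary in that case.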

1509.03917 2026-06-04 stat.ML cs.DS cs.IT cs.LG cs.NA math.IT math.NA math.OC 版本更新

Dropping Convexity for Faster Semi-definite Optimization

放弃凸性以加快半定规划优化

Srinadh Bhojanapalli, Anastasios Kyrillidis, Sujay Sanghavi

AI总结 本文研究了在半正定矩阵集上最小化凸函数的问题,通过因子梯度下降法(FGD)在非凸情况下实现更快收敛,提供了步长选择规则和初始化方法,适用于一般凸函数的收敛性保证。

Comments 40 pages

详情
AI中文摘要

我们研究了在n×n半正定矩阵集上最小化凸函数f(X)的问题,但当问题转换为min_U g(U) := f(UU^T),其中U∈R^{n×r}且r≤n时,我们研究了梯度下降在g上的性能,即因子梯度下降(FGD)。我们提供了一个选择步长的规则,并证明在该选择下,FGD的局部收敛率与标准梯度下降在原始f上的收敛率相同:即经过k步后,误差为O(1/k)对于光滑的f,当f是(受限)强凸时,误差呈指数级减小。此外,我们提供了一种初始化FGD的程序,适用于(受限)强凸目标函数,并且当只能通过一阶oracle访问f时。对于多个问题实例,适当的初始化导致全局收敛保证。FGD和类似程序在实践中广泛用于可表述为矩阵分解的问题。据我们所知,这是首次为一般凸函数在标准凸假设下提供精确的收敛率保证的论文。

英文摘要

We study the minimization of a convex function $f(X)$ over the set of $n\times n$ positive semi-definite matrices, but when the problem is recast as $\min_U g(U) := f(UU^\top)$, with $U \in \mathbb{R}^{n \times r}$ and $r \leq n$. We study the performance of gradient descent on $g$---which we refer to as Factored Gradient Descent (FGD)---under standard assumptions on the original function $f$. We provide a rule for selecting the step size and, with this choice, show that the local convergence rate of FGD mirrors that of standard gradient descent on the original $f$: i.e., after $k$ steps, the error is $O(1/k)$ for smooth $f$, and exponentially small in $k$ when $f$ is (restricted) strongly convex. In addition, we provide a procedure to initialize FGD for (restricted) strongly convex objectives and when one only has access to $f$ via a first-order oracle; for several problem instances, such proper initialization leads to global convergence guarantees. FGD and similar procedures are widely used in practice for problems that can be posed as matrix factorization. To the best of our knowledge, this is the first paper to provide precise convergence rate guarantees for general convex functions under standard convex assumptions.
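
A minimal sketch of Factored Gradient Descent on $g(U) = f(UU^\top)$ follows. By the chain rule, $\nabla g(U) = (\nabla f(X) + \nabla f(X)^\top)U$ with $X = UU^\top$. The function name, the toy objective in the usage note, and the fixed step size are assumptions made for illustration; in particular, the step size is not the selection rule derived in the paper.

```python
import numpy as np

def factored_gradient_descent(grad_f, U0, eta, iters):
    """Gradient descent on g(U) = f(U U^T), given a callable for grad f(X)."""
    U = U0.copy()
    for _ in range(iters):
        G = grad_f(U @ U.T)
        # Chain rule: grad g(U) = (grad f(X) + grad f(X)^T) U, with X = U U^T.
        U = U - eta * (G + G.T) @ U
    return U
```

For instance, with the toy objective $f(X) = \tfrac{1}{2}\|X - M\|_F^2$ for a PSD matrix $M$ of rank $r$, one can pass `grad_f = lambda X: X - M`, a small random `U0` of shape `(n, r)`, and a conservative step such as `1 / (16 * ||M||_2)`. The iterate never needs to be projected onto the PSD cone, since $UU^\top$ is PSD by construction.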

1604.04026 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Fast Parallel Randomized Algorithm for Nonnegative Matrix Factorization with KL Divergence for Large Sparse Datasets

快速并行随机算法用于具有KL散度的非负矩阵分解以处理大规模稀疏数据集

Duy Khuong Nguyen, Tu Bao Ho

AI总结 本文提出一种快速并行随机坐标下降算法,用于大规模稀疏数据集的非负矩阵分解与KL散度优化,实现更高效的稀疏建模与表示。

详情
AI中文摘要

非负矩阵分解(NMF)与KL散度(NMF-KL)是最重要的NMF问题之一,等价于概率潜在语义索引(PLSI),已在许多应用中成功应用。对于稀疏计数数据,泊松分布和KL散度提供稀疏模型和稀疏表示,比正态分布和Frobenius范数更能描述随机变化。特别地,稀疏模型能更简洁地理解潜在组件上的属性出现,而稀疏表示能更简洁地解释潜在组件对实例的贡献。然而,最小化NMF与KL散度比最小化NMF与Frobenius范数要困难得多;稀疏模型、稀疏表示以及大规模稀疏数据集的快速算法仍然是NMF-KL的挑战。在本文中,我们提出了一种快速并行随机坐标下降算法,用于大规模稀疏数据集,以实现稀疏模型和稀疏表示。所提出算法在该问题上的实验结果优于当前研究的成果。

英文摘要

Nonnegative Matrix Factorization (NMF) with Kullback-Leibler Divergence (NMF-KL) is one of the most significant NMF problems and equivalent to Probabilistic Latent Semantic Indexing (PLSI), which has been successfully applied in many applications. For sparse count data, a Poisson distribution and KL divergence provide sparse models and sparse representation, which describe the random variation better than a normal distribution and Frobenius norm. Specifically, sparse models provide a more concise understanding of the appearance of attributes over latent components, while sparse representation provides concise interpretability of the contribution of latent components over instances. However, minimizing NMF with KL divergence is much more difficult than minimizing NMF with Frobenius norm, and sparse models, sparse representation and fast algorithms for large sparse datasets are still challenges for NMF with KL divergence. In this paper, we propose a fast parallel randomized coordinate descent algorithm with fast convergence for large sparse datasets to achieve sparse models and sparse representation. In experiments, the proposed algorithm outperforms those of current studies on this problem.
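
For orientation, the NMF-KL objective and a classical baseline solver can be sketched as below. This uses the well-known multiplicative updates for the generalized KL divergence, not the parallel randomized coordinate descent proposed in the paper. The function names and the small constant `eps` (added to avoid division by zero) are assumptions.

```python
import numpy as np

def kl_divergence(V, WH, eps=1e-12):
    """Generalized KL divergence D(V || WH) for nonnegative matrices."""
    return np.sum(V * np.log((V + eps) / (WH + eps)) - V + WH)

def nmf_kl_multiplicative(V, r, iters=100, seed=0, eps=1e-12):
    """Baseline NMF-KL via multiplicative updates (illustrative only)."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, r)) + 0.1
    H = rng.random((r, m)) + 0.1
    history = []
    for _ in range(iters):
        # Update H with W fixed: ratio V / (WH) pulled back through W.
        WH = W @ H + eps
        H *= (W.T @ (V / WH)) / (W.sum(axis=0)[:, None] + eps)
        # Update W with H fixed.
        WH = W @ H + eps
        W *= ((V / WH) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(kl_divergence(V, W @ H))
    return W, H, history
```

The factors stay nonnegative because every update multiplies by a nonnegative ratio. Each update touches all entries of `W @ H`, which is one reason dense baselines of this kind scale poorly on large sparse count data.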

1508.02087 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML 版本更新

A Linearly-Convergent Stochastic L-BFGS Algorithm

一种线性收敛的随机L-BFGS算法

Philipp Moritz, Robert Nishihara, Michael I. Jordan

AI总结 本文提出一种新的随机L-BFGS算法,证明了其在强凸和光滑函数上的线性收敛性,并展示了其在大规模凸优化问题中的高效性能。

Comments 10 pages, 3 figures in International Conference on Artificial Intelligence and Statistics, 2016

详情
AI中文摘要

我们提出了一种新的随机L-BFGS算法,并证明了其在强凸和光滑函数上的线性收敛性。我们的算法借鉴了Byrd等人(2014)最近提出的随机L-BFGS变体以及Johnson和Zhang(2013)最近提出的随机梯度下降方差减少方法。我们通过实验表明,该算法在大规模凸和非凸优化问题中表现良好,展现出线性收敛性和快速求解能力。此外,我们还展示了该算法在广泛步长范围内的良好表现,步长通常相差几个数量级。

英文摘要

We propose a new stochastic L-BFGS algorithm and prove a linear convergence rate for strongly convex and smooth functions. Our algorithm draws heavily from a recent stochastic variant of L-BFGS proposed in Byrd et al. (2014) as well as a recent approach to variance reduction for stochastic gradient descent from Johnson and Zhang (2013). We demonstrate experimentally that our algorithm performs well on large-scale convex and non-convex optimization problems, exhibiting linear convergence and rapidly solving the optimization problems to high levels of precision. Furthermore, we show that our algorithm performs well for a wide range of step sizes, often differing by several orders of magnitude.
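
The variance-reduction ingredient mentioned in the abstract can be sketched on its own. The following is plain SVRG (in the style of Johnson and Zhang) on ridge-regularized least squares; it omits the L-BFGS curvature scaling entirely, so it is not the paper's algorithm. The function name, objective, and hyperparameters are assumptions for illustration.

```python
import numpy as np

def svrg_ridge(A, b, lam, eta, epochs, inner_steps, seed=0):
    """SVRG for min_w (1/n) sum_i 0.5*(a_i^T w - b_i)^2 + 0.5*lam*||w||^2."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)

    def grad_i(w_vec, i):
        # Gradient of the i-th component function.
        return A[i] * (A[i] @ w_vec - b[i]) + lam * w_vec

    for _ in range(epochs):
        # Snapshot and its full gradient, computed once per epoch.
        w_snap = w.copy()
        full_grad = A.T @ (A @ w_snap - b) / n + lam * w_snap
        for i in rng.integers(n, size=inner_steps):
            # Variance-reduced gradient: unbiased, and its variance
            # shrinks as w and w_snap approach the optimum.
            v = grad_i(w, i) - grad_i(w_snap, i) + full_grad
            w = w - eta * v
    return w
```

Because the noise in `v` vanishes at the optimum, a constant step size suffices for linear convergence on strongly convex problems, in contrast to plain SGD.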

1604.03912 2026-06-04 cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

Inverse Reinforcement Learning with Simultaneous Estimation of Rewards and Dynamics

逆强化学习与奖励和动态的同时估计

Michael Herman, Tobias Gindele, Jörg Wagner, Felix Schmitt, Wolfram Burgard

AI总结 本文提出一种基于梯度的逆强化学习方法,同时估计系统动态和奖励函数,提升了样本效率和估计准确性。

Comments accepted to appear in AISTATS 2016

详情
AI中文摘要

逆强化学习(IRL)描述了从观察到的智能体行为中学习未知马尔可夫决策过程(MDP)奖励函数的问题。由于智能体的行为源于其策略,而MDP策略依赖于随机系统动态和奖励函数,逆问题的解决方案受到两者显著影响。当前的IRL方法假设如果转移模型未知,可以获取额外的系统动态样本,或观察行为提供足够的系统动态样本以准确求解逆问题。这些假设往往不成立。为克服这一问题,我们提出了一种基于梯度的IRL方法,同时估计系统的动态。通过求解联合优化问题,我们的方法考虑了演示的偏差,这种偏差源于生成策略。在合成MDP和迁移学习任务上的评估显示,该方法在样本效率以及估计的奖励函数和转移模型的准确性方面有所改进。

英文摘要

Inverse Reinforcement Learning (IRL) describes the problem of learning an unknown reward function of a Markov Decision Process (MDP) from observed behavior of an agent. Since the agent's behavior originates in its policy and MDP policies depend on both the stochastic system dynamics as well as the reward function, the solution of the inverse problem is significantly influenced by both. Current IRL approaches assume that if the transition model is unknown, additional samples from the system's dynamics are accessible, or the observed behavior provides enough samples of the system's dynamics to solve the inverse problem accurately. These assumptions are often not satisfied. To overcome this, we present a gradient-based IRL approach that simultaneously estimates the system's dynamics. By solving the combined optimization problem, our approach takes into account the bias of the demonstrations, which stems from the generating policy. The evaluation on a synthetic MDP and a transfer learning task shows improvements regarding the sample efficiency as well as the accuracy of the estimated reward functions and transition models.

1604.01376 2026-06-04 math.NA cs.LG cs.NA 版本更新

Lipschitz Continuity of Mahalanobis Distances and Bilinear Forms

马哈拉诺布斯距离与双线性形式的利普希茨连续性

Valentina Zantedeschi, Rémi Emonet, Marc Sebban

AI总结 本文研究了马哈拉诺布斯距离和双线性形式的利普希茨连续性,推导出紧致的利普希茨常数,首次证明了马哈拉诺布斯距离的利普希茨连续性。

详情
AI中文摘要

许多机器学习领域的理论结果仅适用于利普希茨连续函数。利普希茨连续性是一种强连续性,它线性地限制了函数的变化幅度。在本文中,我们推导出两个度量家族——马哈拉诺布斯距离和有界空间双线性形式——的紧致利普希茨常数。据我们所知,这是首次正式证明马哈拉诺布斯距离的利普希茨连续性,并推导出此类紧致的利普希茨常数。

英文摘要

Many theoretical results in the machine learning domain stand only for functions that are Lipschitz continuous. Lipschitz continuity is a strong form of continuity that linearly bounds the variations of a function. In this paper, we derive tight Lipschitz constants for two families of metrics: Mahalanobis distances and bounded-space bilinear forms. To our knowledge, this is the first time the Mahalanobis distance is formally proved to be Lipschitz continuous and that such tight Lipschitz constants are derived.

1505.05114 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.ST stat.ML stat.TH 版本更新

Solving Random Quadratic Systems of Equations Is Nearly as Easy as Solving Linear Systems

求解随机二次方程组与求解线性方程组几乎同样容易

Yuxin Chen, Emmanuel J. Candes

AI总结 本文提出了一种新的方法,通过谱方法获得初始猜测,再通过非凸函数最小化来求解随机二次方程组,证明在特定模型下算法可在线性时间内得到正确解,并在噪声环境下达到接近不可改进的统计精度。

Comments accepted to Communications on Pure and Applied Mathematics (CPAM)

详情
AI中文摘要

我们考虑了在n个变量中求解二次方程组的基本问题,其中y_i = |⟨a_i, x⟩|²,i = 1, ..., m,且x ∈ ℝ^n未知。我们提出了一种新方法,从通过谱方法计算的初始猜测开始,通过类似于Wirtinger流方法的非凸函数最小化进行处理。关键特征包括不同的目标函数和新的更新规则,这些规则以自适应方式操作,并丢弃对搜索方向影响过大的项。这些精心选择的规则提供了更精确的初始猜测、更好的下降方向,从而提升了实际性能。在理论方面,我们证明对于某些无结构的二次方程组模型,我们的算法在m/n比率超过固定数值常数时可在线性时间内得到正确解。我们扩展了理论以处理噪声系统,其中我们只有y_i ≈ |⟨a_i, x⟩|²,并证明我们的算法达到几乎不可改进的统计精度。我们通过数值示例补充了理论研究,显示求解随机二次方程组在计算和统计上并不比求解相同规模的线性方程组更困难——因此本文的标题。例如,我们证明算法的计算成本大约是相同规模最小二乘问题的四倍。

英文摘要

We consider the fundamental problem of solving quadratic systems of equations in $n$ variables, where $y_i = |\langle \boldsymbol{a}_i, \boldsymbol{x} \rangle|^2$, $i = 1, \ldots, m$ and $\boldsymbol{x} \in \mathbb{R}^n$ is unknown. We propose a novel method, which starting with an initial guess computed by means of a spectral method, proceeds by minimizing a nonconvex functional as in the Wirtinger flow approach. There are several key distinguishing features, most notably, a distinct objective functional and novel update rules, which operate in an adaptive fashion and drop terms bearing too much influence on the search direction. These careful selection rules provide a tighter initial guess, better descent directions, and thus enhanced practical performance. On the theoretical side, we prove that for certain unstructured models of quadratic systems, our algorithms return the correct solution in linear time, i.e. in time proportional to reading the data $\{\boldsymbol{a}_i\}$ and $\{y_i\}$ as soon as the ratio $m/n$ between the number of equations and unknowns exceeds a fixed numerical constant. We extend the theory to deal with noisy systems in which we only have $y_i \approx |\langle \boldsymbol{a}_i, \boldsymbol{x} \rangle|^2$ and prove that our algorithms achieve a statistical accuracy, which is nearly un-improvable. We complement our theoretical study with numerical examples showing that solving random quadratic systems is both computationally and statistically not much harder than solving linear systems of the same size---hence the title of this paper. For instance, we demonstrate empirically that the computational cost of our algorithm is about four times that of solving a least-squares problem of the same size.
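
The two-stage structure (spectral initialization, then gradient descent on a nonconvex objective) can be sketched for the real-valued case. This is a plain Wirtinger-flow-style iteration on the quartic loss $\frac{1}{4m}\sum_i ((a_i^\top z)^2 - y_i)^2$; the paper's distinct objective and its adaptive rules for dropping terms are not implemented here. The function name, step size `mu`, and iteration count are assumptions.

```python
import numpy as np

def wirtinger_flow_real(A, y, iters=500, mu=0.1):
    """Recover x (up to sign) from y_i = (a_i^T x)^2, real Gaussian a_i."""
    m, n = A.shape
    # Spectral initialization: leading eigenvector of (1/m) sum_i y_i a_i a_i^T,
    # scaled so that ||z||^2 matches mean(y), which estimates ||x||^2.
    Y = (A * y[:, None]).T @ A / m
    _, vecs = np.linalg.eigh(Y)
    z = vecs[:, -1] * np.sqrt(y.mean())
    step = mu / (z @ z)
    for _ in range(iters):
        Az = A @ z
        # Gradient of (1/(4m)) * sum_i ((a_i^T z)^2 - y_i)^2.
        grad = A.T @ ((Az**2 - y) * Az) / m
        z = z - step * grad
    return z
```

Since $y_i$ is unchanged when $x$ is replaced by $-x$, the solution is only identifiable up to a global sign, so any error metric should compare against both $x$ and $-x$.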

1506.09016 2026-06-04 cs.LG cs.CV cs.NA math.NA math.OC stat.ML 版本更新

Online Learning to Sample

在线学习采样

Guillaume Bouchard, Théo Trouillon, Julien Perez, Adrien Gaidon

AI总结 本文提出AW-SGD算法,通过在线学习优化采样策略,提升在线优化效率,应用于图像分类、矩阵分解和强化学习。

Comments Update: removed convergence theorem and proof as there is an error. Submitted to UAI 2016

详情
AI中文摘要

随机梯度下降(SGD)是机器学习中用于在线优化最广泛使用的技术之一。在本工作中,我们通过适应性地学习如何在每个时间步选择最有用的训练示例来加速SGD。首先,我们证明SGD可以用于学习重要采样估计器的最佳可能采样分布。其次,我们证明SGD算法的采样分布可以通过逐步最小化梯度的方差来在线估计。所得到的算法——自适应加权SGD(AW-SGD)——维护一组用于优化的参数,以及一组用于采样学习示例的参数。我们证明AWSGD在三个不同的应用中实现了更快的收敛:(i)使用深度特征的图像分类,其中图像的采样取决于其标签,(ii)矩阵分解,其中行和列不是均匀采样的,以及(iii)强化学习,其中优化和探索策略同时被估计,其中我们的方法对应于一个off-policy梯度算法。

英文摘要

Stochastic Gradient Descent (SGD) is one of the most widely used techniques for online optimization in machine learning. In this work, we accelerate SGD by adaptively learning how to sample the most useful training examples at each time step. First, we show that SGD can be used to learn the best possible sampling distribution of an importance sampling estimator. Second, we show that the sampling distribution of an SGD algorithm can be estimated online by incrementally minimizing the variance of the gradient. The resulting algorithm - called Adaptive Weighted SGD (AW-SGD) - maintains a set of parameters to optimize, as well as a set of parameters to sample learning examples. We show that AW-SGD yields faster convergence in three different applications: (i) image classification with deep features, where the sampling of images depends on their labels, (ii) matrix factorization, where rows and columns are not sampled uniformly, and (iii) reinforcement learning, where the optimized and exploration policies are estimated at the same time and our approach corresponds to an off-policy gradient algorithm.
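
The quantity being minimized, the variance of an importance-weighted gradient estimator, can be illustrated in a static setting. The sketch below assumes fixed per-example gradients and compares uniform sampling with sampling proportional to gradient norm; it does not implement AW-SGD's online learning of the sampling parameters. Both function names are hypothetical.

```python
import numpy as np

def importance_weighted_gradients(g, q):
    """Per-example estimates g_i / (n * q_i); sampling i ~ q gives an
    unbiased estimate of the mean gradient (1/n) sum_i g_i."""
    n = g.shape[0]
    return g / (n * q[:, None])

def estimator_second_moment(g, q):
    """E_{i~q} ||g_i / (n q_i)||^2 = sum_i ||g_i||^2 / (n^2 q_i)."""
    n = g.shape[0]
    return np.sum(np.sum(g**2, axis=1) / (n**2 * q))
```

By the Cauchy-Schwarz inequality, the second moment (and hence the variance, since the mean is fixed) is minimized by `q` proportional to `||g_i||`. In practice these norms change at every step and are unknown in advance, which is what motivates learning the sampling distribution online.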

1510.06083 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Regularization vs. Relaxation: A conic optimization perspective of statistical variable selection

正则化与松弛:从锥优化视角看统计变量选择

Hongbo Dong, Kun Chen, Jeff Linderoth

AI总结 本文从锥优化视角探讨变量选择问题,证明MCP和反Huber惩罚函数可视为视角松弛的特例,并通过半定松弛解决,结合Goemans-Williamson方法获得近似解。

Comments Also available on optimization online {http://www.optimization-online.org/DB_HTML/2015/05/4932.html}

详情
AI中文摘要

变量选择是统计数据分析中的基本任务。稀疏诱导正则化方法同时执行变量选择和模型估计,核心问题是一个带有l0范数惩罚的二次优化问题。精确执行l0范数惩罚对大规模问题计算不可行,因此引入了近似l0范数的稀疏诱导惩罚函数。本文表明从凸松弛视角分析问题提供新见解。特别是,我们证明了流行的稀疏诱导凹惩罚函数Minimax Concave Penalty(MCP)和反Huber惩罚函数(由Pilanci等人最近提出)均可视为一种称为视角松弛的提升凸松弛的特例。最优视角松弛是一个相关的minimax问题,平衡整体凸性和对l0范数的逼近紧密性。我们证明其可通过半定松弛解决。此外,半定松弛的概率解释揭示了与组合优化中的布尔二次多面体的联系。最后,通过将l0范数惩罚问题重新表述为两级问题,其中内层为Max-Cut问题,我们的所提半定松弛可通过将内层问题替换为其由Goemans和Williamson研究的半定松弛来实现。此解释表明使用Goemans-Williamson的舍入过程可找到l0范数惩罚问题的近似解。数值实验展示了我们所提半定松弛的紧密性,以及通过Goemans-Williamson舍入找到近似解的有效性。

英文摘要

Variable selection is a fundamental task in statistical data analysis. Sparsity-inducing regularization methods are a popular class of methods that simultaneously perform variable selection and model estimation. The central problem is a quadratic optimization problem with an l0-norm penalty. Exactly enforcing the l0-norm penalty is computationally intractable for larger scale problems, so different sparsity-inducing penalty functions that approximate the l0-norm have been introduced. In this paper, we show that viewing the problem from a convex relaxation perspective offers new insights. In particular, we show that a popular sparsity-inducing concave penalty function known as the Minimax Concave Penalty (MCP), and the reverse Huber penalty derived in a recent work by Pilanci, Wainwright and Ghaoui, can both be derived as special cases of a lifted convex relaxation called the perspective relaxation. The optimal perspective relaxation is a related minimax problem that balances the overall convexity and tightness of approximation to the l0 norm. We show it can be solved by a semidefinite relaxation. Moreover, a probabilistic interpretation of the semidefinite relaxation reveals connections with the boolean quadric polytope in combinatorial optimization. Finally by reformulating the l0-norm penalized problem as a two-level problem, with the inner level being a Max-Cut problem, our proposed semidefinite relaxation can be realized by replacing the inner level problem with its semidefinite relaxation studied by Goemans and Williamson. This interpretation suggests using the Goemans-Williamson rounding procedure to find approximate solutions to the l0-norm penalized problem. Numerical experiments demonstrate the tightness of our proposed semidefinite relaxation, and the effectiveness of finding approximate solutions by Goemans-Williamson rounding.

1603.02038 2026-06-04 cs.RO cs.AI cs.LG cs.SY eess.SY 版本更新

Unscented Bayesian Optimization for Safe Robot Grasping

无迹贝叶斯优化用于安全机器人抓取

José Nogueira, Ruben Martinez-Cantin, Alexandre Bernardino, Lorenzo Jamone

AI总结 本文提出无迹贝叶斯优化算法,通过考虑输入噪声在安全区域寻找最优抓取策略,提升机器人抓取的安全性和效率。

Comments conference paper

详情
AI中文摘要

我们解决了在输入空间存在不确定性时的机器人抓取优化问题。通过试错探索策略实现抓取未知物体。贝叶斯优化是一种样本高效的优化算法,特别适合此设置,因为它能主动减少试验次数以学习待优化函数。事实上,这种主动对象探索策略与婴儿学习最佳抓取方式的策略相同。在学习抓取策略时,一些抓取参数配置可能对物体与机器人末端执行器之间相对姿态的误差非常敏感。我们称这些配置为不安全,因为抓取执行中的小误差可能将好的抓取变为坏的抓取。因此,为了降低抓取失败的风险,抓取应规划在安全区域。我们提出了一种新的算法,即无迹贝叶斯优化,能够在考虑输入噪声的情况下进行样本高效的优化以找到安全的极值。无迹贝叶斯优化的贡献是双方面的:一方面提供了一个新的决策过程,驱动探索到安全区域;另一方面提供了一个新的选择过程,选择在不进行额外分析或计算成本的情况下最优的抓取策略。这两个贡献都根植于无迹变换背后的强大理论,这是一种流行的非线性近似方法。我们在合成问题和现实的机器人抓取模拟中展示了其相对于经典贝叶斯优化的优势。结果表明,我们的方法在几次试验后就能获得最优且鲁棒的抓取策略,同时所选的抓取保持在安全区域。

英文摘要

We address the robot grasp optimization problem of unknown objects considering uncertainty in the input space. Grasping unknown objects can be achieved by using a trial and error exploration strategy. Bayesian optimization is a sample efficient optimization algorithm that is especially suitable for this setup as it actively reduces the number of trials for learning about the function to optimize. In fact, this active object exploration is the same strategy that infants use to learn optimal grasps. One problem that arises while learning grasping policies is that some configurations of grasp parameters may be very sensitive to error in the relative pose between the object and robot end-effector. We call these configurations unsafe because small errors during grasp execution may turn good grasps into bad grasps. Therefore, to reduce the risk of grasp failure, grasps should be planned in safe areas. We propose a new algorithm, Unscented Bayesian optimization, that is able to perform sample efficient optimization while taking into consideration input noise to find safe optima. The contribution of Unscented Bayesian optimization is twofold as it provides a new decision process that drives exploration to safe regions and a new selection procedure that chooses the optimum in terms of its safety without extra analysis or computational cost. Both contributions are rooted in the strong theory behind the unscented transformation, a popular nonlinear approximation method. We show its advantages with respect to the classical Bayesian optimization both in synthetic problems and in realistic robot grasp simulations. The results highlight that our method achieves optimal and robust grasping policies after few trials while the selected grasps remain in safe regions.

1603.00748 2026-06-04 cs.LG cs.AI cs.RO cs.SY eess.SY 版本更新

Continuous Deep Q-Learning with Model-based Acceleration

基于模型的连续深度Q学习加速

Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine

AI总结 本文提出连续深度Q学习算法NAF及基于模型的加速方法,用于提升连续控制任务的样本效率和学习速度。

详情
AI中文摘要

模型无关强化学习已成功应用于多种挑战性问题,并扩展到处理大规模神经网络策略和价值函数。然而,模型无关算法的样本复杂性,特别是使用高维函数近似器时,限制了其在物理系统中的应用。本文探索了减少深度强化学习样本复杂性的算法和表示方法。我们提出两种互补技术来提高此类算法的效率。首先,我们推导出Q学习的连续变种,称为归一化优势函数(NAF),作为替代更常用的策略梯度和actor-critic方法。NAF表示允许我们应用带有经验回放的Q学习来处理连续任务,并在一组模拟机器人控制任务上显著提高性能。为进一步提高我们的方法效率,我们探索了使用学习模型来加速模型无关强化学习。我们展示迭代重新拟合的局部线性模型在这一点上特别有效,并在适用此类模型的领域中展示了显著更快的学习速度。

英文摘要

Model-free reinforcement learning has been successfully applied to a range of challenging problems, and has recently been extended to handle large neural network policies and value functions. However, the sample complexity of model-free algorithms, particularly when using high-dimensional function approximators, tends to limit their applicability to physical systems. In this paper, we explore algorithms and representations to reduce the sample complexity of deep reinforcement learning for continuous control tasks. We propose two complementary techniques for improving the efficiency of such algorithms. First, we derive a continuous variant of the Q-learning algorithm, which we call normalized advantage functions (NAF), as an alternative to the more commonly used policy gradient and actor-critic methods. NAF representation allows us to apply Q-learning with experience replay to continuous tasks, and substantially improves performance on a set of simulated robotic control tasks. To further improve the efficiency of our approach, we explore the use of learned models for accelerating model-free reinforcement learning. We show that iteratively refitted local linear models are especially effective for this, and demonstrate substantially faster learning on domains where such models are applicable.
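
The NAF representation is commonly written as $Q(x,u) = V(x) - \tfrac{1}{2}(u - \mu(x))^\top P(x)(u - \mu(x))$ with $P = LL^\top$ positive semidefinite, so that the greedy action is available in closed form as $u = \mu(x)$. The sketch below evaluates this quadratic form for a single state. In the actual method $V$, $\mu$, and $L$ would be neural network outputs; here they are plain arrays, and the function name is an assumption.

```python
import numpy as np

def naf_q_value(u, mu, L, V):
    """Q(x, u) = V(x) - 0.5 * (u - mu)^T (L L^T) (u - mu) for one state x.

    mu, L, V stand in for network outputs mu(x), L(x), V(x).
    """
    P = L @ L.T          # positive semidefinite by construction
    d = u - mu
    return V - 0.5 * d @ P @ d
```

Because the advantage term is a concave quadratic in `u`, maximizing `Q` over continuous actions needs no inner optimization loop, which is what makes Q-learning updates tractable in this setting.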

1603.00427 2026-06-04 eess.SY cs.LG cs.SY 版本更新

A Nonlinear Adaptive Filter Based on the Model of Simple Multilinear Functionals

基于简单多线性函数的非线性自适应滤波器

Felipe C. Pinheiro, Cássio G. Lopes

AI总结 本文提出一种基于简单多线性函数的非线性自适应滤波器模型,利用K个FIR线性滤波器的输出乘积作为非线性模型,通过梯度下降法解决优化问题,并在系统辨识中验证了其收敛性和计算复杂度。

Comments 5 pages, one of references, plus extra page attached

详情
AI中文摘要

非线性自适应滤波允许对一般系统的某些附加方面进行建模,通常依赖于高度复杂的算法,如基于Volterra级数的算法。通过使用Kronecker乘积和一些张量代数的基本事实,我们提出了一种简单的非线性模型,该模型可以解释为K个FIR线性滤波器输出的乘积,并计算其成本函数及其梯度,从而允许对优化问题进行一些分析。我们利用这些结果在随机梯度框架中推导出一个类似于LMS的算法,并研究了均方误差表面的多模态问题以及合适初始条件的选择。计算了该算法的计算复杂度。在系统辨识设置中测试了新算法,并与其他文献中的多项式算法进行了比较,展示了良好的收敛性和/或计算复杂度。

英文摘要

Nonlinear adaptive filtering allows for modeling of some additional aspects of a general system and usually relies on highly complex algorithms, such as those based on the Volterra series. Through the use of the Kronecker product and some basic facts of tensor algebra, we propose a simple model of nonlinearity, one that can be interpreted as a product of the outputs of K FIR linear filters, and compute its cost function together with its gradient, which allows for some analysis of the optimization problem. We use these results in a stochastic gradient framework, from which we derive an LMS-like algorithm and investigate the problems of multi-modality in the mean-square error surface and the choice of adequate initial conditions. Its computational complexity is calculated. The new algorithm is tested in a system identification setup and is compared with other polynomial algorithms from the literature, presenting favorable convergence and/or computational complexity.
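
The model described above, an output formed as the product of $K$ FIR filter outputs, admits a simple stochastic-gradient step. The sketch below is one plausible LMS-like update derived from the instantaneous cost $\tfrac{1}{2}e^2$ with $e = d - \prod_k w_k^\top u$; it is not claimed to match the paper's exact recursion. Function names and the array layout (`W` holds one filter per row) are assumptions.

```python
import numpy as np

def multilinear_output(W, u):
    """Product of the outputs of K FIR filters; W is K x M, u has length M."""
    return np.prod(W @ u)

def multilinear_lms_step(W, u, d, mu):
    """One stochastic-gradient step on 0.5 * (d - prod_k w_k^T u)^2."""
    outs = W @ u
    e = d - np.prod(outs)
    W_new = W.copy()
    for k in range(W.shape[0]):
        # d(output)/d(w_k) = (product of the other filters' outputs) * u.
        others = np.prod(np.delete(outs, k))
        W_new[k] += mu * e * others * u
    return W_new, e
```

The update for each filter is scaled by the outputs of all the others, so an initialization with any filter at zero stalls the remaining filters. This is one concrete way to see why the choice of initial conditions matters for this model.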

1602.08800 2026-06-04 math.NA cs.IR cs.LG cs.NA 版本更新

Iterative Aggregation Method for Solving Principal Component Analysis Problems

迭代聚合方法用于求解主成分分析问题

Vitaly Bulgakov

AI总结 本文提出一种两级聚合方法,用于高效求解主成分分析问题,通过迭代幂法求解特征值问题,并在大规模文本数据集上进行了测试。

详情
AI中文摘要

受之前开发的多级聚合方法用于解决结构分析问题的启发,本文提出了一种新的两级聚合方法,用于高效迭代求解主成分分析(PCA)问题。通过使用原始协方差矩阵的粗粒度聚合模型,利用幂迭代法求解特征值问题。该方法在多个包含大量文本文档的数据集上进行了测试。

英文摘要

Motivated by the previously developed multilevel aggregation method for solving structural analysis problems, a novel two-level aggregation approach for efficient iterative solution of Principal Component Analysis (PCA) problems is proposed. The coarse aggregation model of the original covariance matrix is used in the iterative solution of the eigenvalue problem by a power iteration method. The method is tested on several data sets consisting of a large number of text documents.
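
For reference, the underlying power iteration for the leading principal component can be sketched as follows. This is the plain single-level method on the sample covariance matrix; the two-level coarse aggregation proposed in the paper is not implemented. The function name and iteration count are assumptions.

```python
import numpy as np

def top_principal_component(X, iters=200, seed=0):
    """Leading eigenpair of the sample covariance of X via power iteration."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                 # center the data
    C = Xc.T @ Xc / (X.shape[0] - 1)        # sample covariance matrix
    v = rng.standard_normal(C.shape[0])
    v /= np.linalg.norm(v)
    for _ in range(iters):
        v = C @ v                           # amplify the dominant direction
        v /= np.linalg.norm(v)
    return v, v @ C @ v                     # eigenvector, Rayleigh quotient
```

The convergence rate is governed by the ratio of the second to the first eigenvalue, so the method slows down when the leading eigenvalues are close. Acceleration schemes such as aggregation aim to reduce the cost of reaching a good iterate.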

1406.3587 2026-06-04 math.NA cs.LG cs.NA 版本更新

Quaternion Gradient and Hessian

四元数梯度与Hessian

Dongpo Xu, Danilo P. Mandic

AI总结 本文提出基于广义HR算子的新四元数梯度和Hessian定义,简化了四元数最优化算法的推导,使直接在四元数域中进行优化成为可能,提高了算法设计和评估的效率。

Comments 23 pages

详情
Journal ref
IEEE Transactions on Neural Networks and Learning Systems, 2016, 27(2):249-261
AI中文摘要

实值标量函数在四元数变量下的优化,如均方误差或阵列输出功率,是许多实际应用的基础。解决方案通常需要计算梯度和Hessian,然而,实值四元数函数本质上是非解析的。为了解决这一问题,我们提出了基于新型广义HR(GHR)算子的新四元数梯度和Hessian定义,从而使得在四元数域中高效推导优化算法成为可能,而不是将问题转换到实数域,这是当前的做法。此外,不同于现有的四元数梯度,GHR算子允许乘法和链式法则,并且使所提出的四元数梯度和Hessian与它们的实数对应物之间有一一对应关系。与数值应用相关的四元数梯度和Hessian的性质被详细阐述,结果展示了GHR算子在大大简化四元数最小均方、四元数最小二乘和牛顿算法推导中的有用性。所提出的梯度和Hessian还被证明能够使相同的通用形式与相应的实数和复数值算法相同,进一步说明了在算法设计和评估中的优势。

英文摘要

The optimization of real scalar functions of quaternion variables, such as the mean square error or array output power, underpins many practical applications. Solutions often require the calculation of the gradient and Hessian; however, real functions of quaternion variables are essentially non-analytic. To address this issue, we propose new definitions of quaternion gradient and Hessian, based on the novel generalized HR (GHR) calculus, thus making possible efficient derivation of optimization algorithms directly in the quaternion field, rather than transforming the problem to the real domain, as is current practice. In addition, unlike the existing quaternion gradients, the GHR calculus allows for the product and chain rule, and for a one-to-one correspondence of the proposed quaternion gradient and Hessian with their real counterparts. Properties of the quaternion gradient and Hessian relevant to numerical applications are elaborated, and the results illuminate the usefulness of the GHR calculus in greatly simplifying the derivation of the quaternion least mean squares, quaternion least squares, and Newton algorithms. The proposed gradient and Hessian are also shown to enable the same generic forms as the corresponding real- and complex-valued algorithms, further illustrating the advantages in algorithm design and evaluation.

1602.04434 2026-06-04 cs.LG cs.SY eess.SY stat.ML 版本更新

Frequency Analysis of Temporal Graph Signals

时序图信号的频谱分析

Andreas Loukas, Damien Foucard

AI总结 本文提出时序图频谱分析概念,统一了时频和图频分析方法,通过联合时频变换设计分布式滤波器用于干扰消除。

Comments 5 pages, 3 figures

详情
AI中文摘要

本文扩展了图频谱的概念,研究时序图信号的频谱特性。通过构建联合时频变换,设计分布式滤波器实现干扰消除,算法具有线性复杂度且能近似任意联合滤波目标。

英文摘要

This letter extends the concept of graph-frequency to graph signals that evolve with time. Our goal is to generalize and, in fact, unify the familiar concepts from time- and graph-frequency analysis. To this end, we study a joint temporal and graph Fourier transform (JFT) and demonstrate its attractive properties. We build on our results to create filters which act on the joint (temporal and graph) frequency domain, and show how these can be used to perform interference cancellation. The proposed algorithms are distributed, have linear complexity, and can approximate any desired joint filtering objective.
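
A joint temporal and graph Fourier transform can be sketched as a graph Fourier transform along the vertex axis followed by a DFT along the time axis. The sketch below assumes the graph Fourier basis is the eigenvector matrix of a combinatorial Laplacian and uses a path graph as a toy example; the function names, the choice of Laplacian, and the orthonormal DFT scaling are assumptions, not details taken from the letter.

```python
import numpy as np

def path_graph_laplacian(n):
    """Combinatorial Laplacian L = D - A of an n-vertex path graph."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def jft(X, U):
    """Joint transform of X (vertices x time): GFT on axis 0, DFT on axis 1."""
    return np.fft.fft(U.T @ X, axis=1, norm="ortho")

def ijft(X_hat, U):
    """Inverse joint transform: inverse DFT along time, then inverse GFT."""
    return U @ np.fft.ifft(X_hat, axis=1, norm="ortho")
```

With `U` taken from `np.linalg.eigh(path_graph_laplacian(n))`, both stages are unitary, so the joint transform preserves signal energy. A joint filter then amounts to multiplying `jft(X, U)` elementwise by a response defined over (graph frequency, temporal frequency) pairs before inverting.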

1506.01326 2026-06-04 math.NA cs.AI cs.LG cs.NA stat.CO stat.ML 版本更新

Probabilistic Numerics and Uncertainty in Computations

概率数值计算与计算中的不确定性

Philipp Hennig, Michael A Osborne, Mark Girolami

AI总结 本文呼吁采用概率数值方法,通过在计算中返回不确定性来改进线性代数、积分、优化和微分方程求解等算法,强调其在气候科学和天文学等领域的应用价值。

Comments Author Generated Postprint. 17 pages, 4 Figures, 1 Table

详情
AI中文摘要

我们呼吁采用概率数值方法:即在数值任务中返回不确定性的算法,包括线性代数、积分、优化和求解微分方程。这些不确定性源于数值计算中由于时间和硬件限制导致的精度损失,对现代科学和工业至关重要。在诸如气候科学和天文学等应用中,基于大规模复杂数据的计算需求促使重新关注数值不确定性的管理。我们描述了几种经典数值方法如何自然地被解释为概率推断。然后展示概率观点如何提出新的算法,能够灵活适应应用需求,并提供改进的实证性能。我们提供了天文学和天文成像等实际科学问题中概率数值算法的实例,同时指出这些新算法存在的开放问题。最后,我们描述了概率数值方法如何为结合数值算法(如数值优化器和微分方程求解器)的计算提供一致的框架,可能允许诊断(和控制)计算中的误差源。

英文摘要

We deliver a call to arms for probabilistic numerical methods: algorithms for numerical tasks, including linear algebra, integration, optimization and solving differential equations, that return uncertainties in their calculations. Such uncertainties, arising from the loss of precision induced by numerical calculation with limited time or hardware, are important for much contemporary science and industry. Within applications such as climate science and astrophysics, the need to make decisions on the basis of computations with large and complex data has led to a renewed focus on the management of numerical uncertainty. We describe how several seminal classic numerical methods can be interpreted naturally as probabilistic inference. We then show that the probabilistic view suggests new algorithms that can flexibly be adapted to suit application specifics, while delivering improved empirical performance. We provide concrete illustrations of the benefits of probabilistic numeric algorithms on real scientific problems from astrometry and astronomical imaging, while highlighting open problems with these new algorithms. Finally, we describe how probabilistic numerical methods provide a coherent framework for identifying the uncertainty in calculations performed with a combination of numerical algorithms (e.g. both numerical optimisers and differential equation solvers), potentially allowing the diagnosis (and control) of error sources in computations.

1602.04847 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA 版本更新

Black-box optimization with a politician

用政治家进行黑盒优化

Sébastien Bubeck, Yin-Tat Lee

AI总结 本文提出一种适用于梯度计算昂贵情况的黑盒凸优化新框架,结合凸优化概念和分析中心,实验证明其性能优于BFGS等算法。

Comments 19 pages

详情
AI中文摘要

我们提出了一种新的黑盒凸优化框架,适用于梯度计算昂贵的情况。我们推导出一种新方法,结合了凸优化中的标准一阶方法(如梯度下降或拟牛顿方法)和解析中心(即自协调障碍函数的最小化点)。我们实验证明,我们的新方法在性能上优于BFGS等现有算法。

英文摘要

We propose a new framework for black-box convex optimization which is well-suited for situations where gradient computations are expensive. We derive a new method for this framework which leverages several concepts from convex optimization, from standard first-order methods (e.g. gradient descent or quasi-Newton methods) to analytical centers (i.e. minimizers of self-concordant barriers). We demonstrate empirically that our new technique compares favorably with state of the art algorithms (such as BFGS).

1402.0635 2026-06-04 stat.ML cs.AI cs.LG cs.SY eess.SY 版本更新

Generalization and Exploration via Randomized Value Functions

通过随机价值函数实现泛化与探索

Ian Osband, Benjamin Van Roy, Zheng Wen

AI总结 本文提出随机最小二乘价值迭代算法(RLSVI),通过线性参数化价值函数实现高效的探索与泛化,证明其在无先验知识学习中的近优性能。

Comments arXiv admin note: text overlap with arXiv:1307.4847

详情
AI中文摘要

我们提出了随机最小二乘价值迭代(RLSVI)——一种新的强化学习算法,旨在通过线性参数化价值函数实现高效的探索与泛化。我们解释了使用玻尔兹曼或epsilon-贪婪探索的最小二乘价值迭代版本为何效率低下,并通过计算结果展示了RLSVI带来的显著效率提升。进一步,我们建立了RLSVI预期遗憾的上界,证明其在无先验知识学习中的近最优性。更广泛地说,我们的结果表明,随机价值函数为解决强化学习中的关键挑战——合成高效探索与有效泛化——提供了一种有前景的方法。

英文摘要

We propose randomized least-squares value iteration (RLSVI) -- a new reinforcement learning algorithm designed to explore and generalize efficiently via linearly parameterized value functions. We explain why versions of least-squares value iteration that use Boltzmann or epsilon-greedy exploration can be highly inefficient, and we present computational results that demonstrate dramatic efficiency gains enjoyed by RLSVI. Further, we establish an upper bound on the expected regret of RLSVI that demonstrates near-optimality in a tabula rasa learning context. More broadly, our results suggest that randomized value functions offer a promising approach to tackling a critical challenge in reinforcement learning: synthesizing efficient exploration and effective generalization.

1511.07837 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML 版本更新

Generalized Conjugate Gradient Methods for $\ell_1$ Regularized Convex Quadratic Programming with Finite Convergence

针对ℓ₁正则化凸二次规划的广义共轭梯度方法及其有限收敛性

Zhaosong Lu, Xiaojun Chen

AI总结 本文提出了一种广义共轭梯度方法,用于求解带有ℓ₁正则化的凸二次规划问题,在有限次迭代内达到最优解。方法通过比较子梯度的分量大小选择步骤类型,并结合精确线搜索和共轭梯度子程序,具有较低的计算复杂度。

Comments 36 pages, 2 tables

详情
AI中文摘要

共轭梯度(CG)方法是求解大规模强凸二次规划(QP)的有效迭代方法。本文提出了一些广义CG(GCG)方法,用于求解带有ℓ₁正则化的(可能不强凸)凸QP问题,可在有限次迭代内终止于最优解。在每次迭代中,我们的方法首先确定某个卦限的一个面,然后要么沿目标函数的负投影最小范数子梯度方向进行精确线搜索,要么执行一个CG子程序,直到CG迭代点跨越该面的边界,或在该面或其子面上找到近似极小点。我们通过比较最小范数子梯度的某些分量与其余分量的大小来确定应采取哪种步骤类型。我们的有限收敛性分析利用了一个误差界结果以及上述精确线搜索和CG子程序的一些关键性质。我们还证明了,通过允许CG子程序的执行存在一定的不精确性,所提出的方法能够找到问题的近似解。我们的GCG方法找到ε-最优解的总体算术运算成本对ε的依赖为O(log(1/ε)),优于加速近端梯度方法[2,23]的O(1/√ε)。此外,我们的GCG方法可以直接扩展到盒约束凸QP的求解,并保持有限收敛性。数值结果表明,我们的方法对于求解病态问题非常有效。

英文摘要

The conjugate gradient (CG) method is an efficient iterative method for solving large-scale strongly convex quadratic programming (QP). In this paper we propose some generalized CG (GCG) methods for solving the $\ell_1$-regularized (possibly not strongly) convex QP that terminate at an optimal solution in a finite number of iterations. At each iteration, our methods first identify a face of an orthant and then either perform an exact line search along the direction of the negative projected minimum-norm subgradient of the objective function or execute a CG subroutine that conducts a sequence of CG iterations until a CG iterate crosses the boundary of this face or an approximate minimizer over this face or a subface is found. We determine which type of step should be taken by comparing the magnitude of some components of the minimum-norm subgradient of the objective function to that of its remaining components. Our analysis on finite convergence of these methods makes use of an error bound result and some key properties of the aforementioned exact line search and the CG subroutine. We also show that the proposed methods are capable of finding an approximate solution of the problem by allowing some inexactness on the execution of the CG subroutine. The overall arithmetic operation cost of our GCG methods for finding an $ε$-optimal solution depends on $ε$ in $O(\log(1/ε))$, which is superior to the accelerated proximal gradient method [2,23] that depends on $ε$ in $O(1/\sqrt{ε})$. In addition, our GCG methods can be extended straightforwardly to solve box-constrained convex QP with finite convergence. Numerical results demonstrate that our methods are very favorable for solving ill-conditioned problems.
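
The minimum-norm subgradient that drives the step selection can be written in closed form. The sketch below assumes the objective has the form 0.5*x'Ax - b'x + tau*||x||_1; the function name, the parameter `tau`, and the list-of-lists matrix layout are illustrative choices, not taken from the paper, and the code does not implement the GCG methods themselves.

```python
def min_norm_subgradient(A, b, x, tau):
    """Minimum-norm subgradient of 0.5*x'Ax - b'x + tau*||x||_1 at x.

    A is a symmetric matrix given as a list of rows; b and x are lists.
    (Assumed problem form; names are illustrative.)
    """
    n = len(x)
    # Gradient of the smooth quadratic part: A x - b.
    grad = [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]
    g = []
    for gi, xi in zip(grad, x):
        if xi > 0:
            g.append(gi + tau)      # |x_i| is differentiable, slope +1
        elif xi < 0:
            g.append(gi - tau)      # |x_i| is differentiable, slope -1
        elif gi > tau:
            g.append(gi - tau)      # closest point to 0 in [gi - tau, gi + tau]
        elif gi < -tau:
            g.append(gi + tau)
        else:
            g.append(0.0)           # 0 lies inside the subdifferential interval
    return g
```

A point is optimal exactly when this vector is zero, which is why its components are a natural quantity to compare when choosing between step types.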

1602.02523 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Data-Efficient Reinforcement Learning in Continuous-State POMDPs

连续状态POMDPs中的数据高效强化学习

Rowan McAllister, Carl Edward Rasmussen

AI总结 本文提出一种抗观测噪声的数据高效强化学习算法,通过扩展PILCO算法至POMDPs,利用过滤过程提升策略评估性能,实现在Cartpole摆动任务中更优的非线性控制效果。

详情
AI中文摘要

我们提出了一种抗观测噪声的数据高效强化学习算法。我们的方法通过在策略评估过程中考虑过滤过程,将高度数据高效的PILCO算法(Deisenroth & Rasmussen, 2011)扩展至部分观测马尔可夫决策过程(POMDPs)。PILCO进行策略搜索,通过首先预测可能系统轨迹的解析分布来评估每个策略。我们还预测轨迹相对于过滤过程,从而在结合过滤器与由原始(未过滤)框架优化的策略时实现了显著更高的性能。我们的测试设置是带有传感器噪声的Cartpole摆动任务,该任务涉及非线性动态并需要非线性控制。

英文摘要

We present a data-efficient reinforcement learning algorithm resistant to observation noise. Our method extends the highly data-efficient PILCO algorithm (Deisenroth & Rasmussen, 2011) into partially observed Markov decision processes (POMDPs) by considering the filtering process during policy evaluation. PILCO conducts policy search, evaluating each policy by first predicting an analytic distribution of possible system trajectories. We additionally predict trajectories w.r.t. a filtering process, achieving significantly higher performance than combining a filter with a policy optimised by the original (unfiltered) framework. Our test setup is the cartpole swing-up task with sensor noise, which involves nonlinear dynamics and requires nonlinear control.

1406.5311 2026-06-04 math.OC cs.AI cs.LG cs.NA math.NA stat.ML 版本更新

Towards A Deeper Geometric, Analytic and Algorithmic Understanding of Margins

迈向对间隔更深入的几何、分析与算法理解

Aaditya Ramdas, Javier Peña

AI总结 本文研究了矩阵A的一种称为间隔(margin)的条件度量,它决定了线性可行性问题的难度;通过几何、分析和算法方法扩展了间隔理论,并证明了关于感知机(其收敛速度依赖于间隔)的新结果。

Comments 18 pages, 3 figures

详情
Journal ref
Optimization Methods and Software, Volume 31, Issue 2, Pages 377-391, 2016
AI中文摘要

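给定一个矩阵A,线性可行性问题(线性分类是其特例)旨在求解原问题w: A^Tw > 0,或为对偶问题找到一个证书,即概率分布p: Ap = 0。受“大间隔分类器”在机器学习中持续重要性的启发,本文研究A的一种称为间隔(margin)的条件度量,它决定了上述两个问题的难度。为辅助几何直觉,我们首先用相关的球、锥和包给出间隔的新刻画。我们的第二项贡献是分析性的:我们给出了Gordan定理的推广以及Hoffman定理的若干变体,二者均借助间隔来表述。最后,我们证明了关于经典迭代格式感知机(Perceptron)的一些新结果,其收敛速度众所周知地依赖于间隔。除了为这一庞大主题提供统一视角之外,我们的结果还有助于更深入地理解基于间隔的学习并证明迭代格式的收敛速度。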

英文摘要

Given a matrix $A$, a linear feasibility problem (of which linear classification is a special case) aims to find a solution to a primal problem $w: A^Tw > \textbf{0}$ or a certificate for the dual problem which is a probability distribution $p: Ap = \textbf{0}$. Inspired by the continued importance of "large-margin classifiers" in machine learning, this paper studies a condition measure of $A$ called its \textit{margin} that determines the difficulty of both the above problems. To aid geometrical intuition, we first establish new characterizations of the margin in terms of relevant balls, cones and hulls. Our second contribution is analytical, where we present generalizations of Gordan's theorem, and variants of Hoffman's theorems, both using margins. We end by proving some new results on a classical iterative scheme, the Perceptron, whose convergence rate famously depends on the margin. Our results are relevant for a deeper understanding of margin-based learning and proving convergence rates of iterative schemes, apart from providing a unifying perspective on this vast topic.
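
For readers unfamiliar with the primal problem, here is a minimal sketch of the classical Perceptron applied to it: it looks for w with a·w > 0 for every column a of A. The function name and the `max_iters` cutoff are illustrative assumptions. The sketch shows the textbook scheme only, not the new results of the paper.

```python
def perceptron(columns, max_iters=10_000):
    """Seek w with dot(a, w) > 0 for every column a of A (classical Perceptron).

    `columns` is a list of equal-length tuples, assumed nonzero.
    Returns w, or None if no solution is found within max_iters updates.
    """
    dim = len(columns[0])
    w = [0.0] * dim
    for _ in range(max_iters):
        # Pick the first violated constraint, i.e. a column with dot(a, w) <= 0.
        violated = next(
            (a for a in columns if sum(ai * wi for ai, wi in zip(a, w)) <= 0),
            None,
        )
        if violated is None:
            return w  # every constraint holds strictly
        # Add the normalized violated column to w.
        norm = sum(ai * ai for ai in violated) ** 0.5
        w = [wi + ai / norm for wi, ai in zip(w, violated)]
    return None
```

When the problem is feasible with margin ρ > 0 (columns normalized to unit length), the classical analysis bounds the number of updates by 1/ρ², which is the dependence on the margin the abstract alludes to.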

1601.07721 2026-06-04 math.NA cs.LG cs.NA 版本更新

Distributed Low Rank Approximation of Implicit Functions of a Matrix

分布式隐函数矩阵的低秩近似

David P. Woodruff, Peilin Zhong

AI总结 研究分布式低秩近似问题,针对隐式表示的矩阵计算低秩近似,提出高效算法并验证其在softmax、高斯核和鲁棒近似中的应用。

详情
AI中文摘要

我们研究了分布式低秩近似,其中待近似矩阵仅在不同服务器间隐式表示。例如,每个服务器可能有n×d矩阵A^t,目标是计算A=f(∑_{t=1}^s A^t)的低秩近似,其中f是对矩阵∑_{t=1}^s A^t进行逐元素应用的函数。我们证明对于广泛类别的函数f,可以高效计算一个d×d的秩k投影矩阵P,使得‖A−AP‖_F^2 ≤ ‖A−[A]_k‖_F^2 + ε‖A‖_F^2,其中AP表示A在P行空间上的投影,[A]_k表示A的最佳秩k近似。我们的协议通信成本为d·(sk/ε)^{O(1)},并以高概率成功。我们的框架允许高效计算softmax、高斯核扩展和M-估计器的低秩近似。我们还证明这种加法误差近似是最佳的,即任何实现相对误差的协议需要更多通信。最后,我们在真实数据集上实验验证了我们的算法。

英文摘要

We study distributed low rank approximation in which the matrix to be approximated is only implicitly represented across the different servers. For example, each of $s$ servers may have an $n \times d$ matrix $A^t$, and we may be interested in computing a low rank approximation to $A = f(\sum_{t=1}^s A^t)$, where $f$ is a function which is applied entrywise to the matrix $\sum_{t=1}^s A^t$. We show for a wide class of functions $f$ it is possible to efficiently compute a $d \times d$ rank-$k$ projection matrix $P$ for which $\|A - AP\|_F^2 \leq \|A - [A]_k\|_F^2 + \varepsilon \|A\|_F^2$, where $AP$ denotes the projection of $A$ onto the row span of $P$, and $[A]_k$ denotes the best rank-$k$ approximation to $A$ given by the singular value decomposition. The communication cost of our protocols is $d \cdot (sk/\varepsilon)^{O(1)}$, and they succeed with high probability. Our framework allows us to efficiently compute a low rank approximation to an entry-wise softmax, to a Gaussian kernel expansion, and to $M$-Estimators applied entrywise (i.e., forms of robust low rank approximation). We also show that our additive error approximation is best possible, in the sense that any protocol achieving relative error for these problems requires significantly more communication. Finally, we experimentally validate our algorithms on real datasets.

1506.00438 2026-06-04 cs.LG cs.DM cs.SY eess.SY stat.ME 版本更新

Network Topology Identification using PCA and its Graph Theoretic Interpretations

利用PCA进行网络拓扑识别及其图论解释

Aravind Rajeswaran, Shankar Narasimhan

AI总结 本文通过PCA估计线性关系,利用f-cut集和f-环路实现网络拓扑识别,展示了从稳态数据中识别网络结构的方法及图论意义。

Comments Structure of paper is changed to improve presentation. Methods and results are unchanged. A more detailed literature survey has been added

详情
AI中文摘要

我们解决了从稳态网络测量中识别(重建)网络拓扑的问题。具体来说,仅给定一个数据矩阵X,其中X_{ij}对应配置(稳态)j中边i的流量,我们希望找到一个网络结构,使得所有节点的流量守恒成立。这可以建模许多涉及守恒量的网络问题,如供水、电力和代谢网络。我们证明了识别等价于学习一个模型A_n,该模型捕捉了X中不同变量之间的近似线性关系(即形式为A_n X ≈ 0),使得A_n满秩(尽可能高)且与网络的节点-边关联结构一致。该问题通过一系列步骤解决,包括使用主成分分析(PCA)估计近似线性关系、从这些近似关系中获得f-割集,以及从f-割集(或等价地f-回路)实现图结构。每一步和整个过程都是多项式时间的。该方法通过识别一个供水管网的拓扑结构进行示例说明。我们还研究了从稳态数据中可识别的程度。

英文摘要

We solve the problem of identifying (reconstructing) network topology from steady state network measurements. Concretely, given only a data matrix $\mathbf{X}$ where the $X_{ij}$ entry corresponds to flow in edge $i$ in configuration (steady-state) $j$, we wish to find a network structure for which flow conservation is obeyed at all the nodes. This models many network problems involving conserved quantities like water, power, and metabolic networks. We show that identification is equivalent to learning a model $\mathbf{A_n}$ which captures the approximate linear relationships between the different variables comprising $\mathbf{X}$ (i.e. of the form $\mathbf{A_n X \approx 0}$) such that $\mathbf{A_n}$ is full rank (highest possible) and consistent with a network node-edge incidence structure. The problem is solved through a sequence of steps like estimating approximate linear relationships using Principal Component Analysis, obtaining f-cut-sets from these approximate relationships, and graph realization from f-cut-sets (or equivalently f-circuits). Each step and the overall process is polynomial time. The method is illustrated by identifying topology of a water distribution network. We also study the extent of identifiability from steady-state data.
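
To make the relation A_n X ≈ 0 concrete, the sketch below builds the node-edge incidence matrix of a small hypothetical network (three nodes, four directed edges, invented for illustration) and checks that steady-state flows satisfy conservation at every node. It shows only the forward model that identification inverts; the PCA and graph-realization steps are not implemented.

```python
# Hypothetical network: each edge is (tail, head).
EDGES = [(0, 1), (1, 2), (2, 0), (0, 2)]
NUM_NODES = 3


def incidence_matrix(edges, num_nodes):
    """Node-edge incidence matrix: -1 where an edge leaves, +1 where it enters."""
    A = [[0] * len(edges) for _ in range(num_nodes)]
    for e, (tail, head) in enumerate(edges):
        A[tail][e] = -1
        A[head][e] = 1
    return A


def residual(A, X):
    """Matrix product A X; zero entries mean flow is conserved at that node."""
    return [
        [sum(A[v][e] * X[e][j] for e in range(len(X))) for j in range(len(X[0]))]
        for v in range(len(A))
    ]


def steady_state(a, b):
    """Edge flows from circulating a around 0->1->2->0 and b around 0->2->0."""
    return [a, a, a + b, b]


# Rows of X are edges, columns are steady states, as in the abstract.
states = [steady_state(1.0, 0.5), steady_state(2.0, 3.0)]
X = [list(row) for row in zip(*states)]
A = incidence_matrix(EDGES, NUM_NODES)
print(residual(A, X))
```

Identification runs in the opposite direction: given only X, recover a full-rank matrix with this incidence structure.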

1510.06895 2026-06-04 cs.LG cs.CV cs.NA math.NA 版本更新

Nonconvex Nonsmooth Low-Rank Minimization via Iteratively Reweighted Nuclear Norm

非凸非光滑低秩最小化通过迭代重加权核范数

Canyi Lu, Jinhui Tang, Shuicheng Yan, Zhouchen Lin

AI总结 本文提出通过迭代重加权核范数算法解决非凸非光滑低秩最小化问题,利用非凸替代函数近似秩函数,提升低秩矩阵恢复性能。

详情
AI中文摘要

核范数作为秩函数的凸替代,在压缩感知的低秩矩阵恢复中被广泛使用,并应用于图像恢复和信号处理。然而,求解基于核范数的松弛凸问题通常只能得到原始秩最小化问题的次优解。本文提出在矩阵奇异值上使用一族L_0范数的非凸替代函数来近似秩函数,从而得到非凸非光滑最小化问题。然后通过迭代重加权核范数(IRNN)算法求解,该算法迭代地求解加权奇异值阈值(WSVT)问题,由于非凸替代函数的特殊性质,该问题具有闭式解。同时,IRNN被扩展以处理含两个或多个变量块的非凸问题。理论上,我们证明IRNN使目标函数值单调下降,且任何极限点都是驻点。在合成数据和真实图像上的大量实验表明,IRNN相比最先进的凸算法在低秩矩阵恢复方面表现更优。

英文摘要

The nuclear norm is widely used as a convex surrogate of the rank function in compressive sensing for low rank matrix recovery with its applications in image recovery and signal processing. However, solving the nuclear norm based relaxed convex problem usually leads to a suboptimal solution of the original rank minimization problem. In this paper, we propose to perform a family of nonconvex surrogates of $L_0$-norm on the singular values of a matrix to approximate the rank function. This leads to a nonconvex nonsmooth minimization problem. Then we propose to solve the problem by Iteratively Reweighted Nuclear Norm (IRNN) algorithm. IRNN iteratively solves a Weighted Singular Value Thresholding (WSVT) problem, which has a closed form solution due to the special properties of the nonconvex surrogate functions. We also extend IRNN to solve the nonconvex problem with two or more blocks of variables. In theory, we prove that IRNN decreases the objective function value monotonically, and any limit point is a stationary point. Extensive experiments on both synthesized data and real images demonstrate that IRNN enhances the low-rank matrix recovery compared with state-of-the-art convex algorithms.

1601.04251 2026-06-04 eess.SY cs.LG cs.SY stat.AP stat.ML 版本更新

On-line Bayesian System Identification

在线贝叶斯系统辨识

Diego Romeres, Giulia Prando, Gianluigi Pillonetto, Alessandro Chiuso

AI总结 本文提出一种在线贝叶斯系统辨识方法,通过边际似然最大化更新超参数,仅需一次迭代优化算法,实验验证其有效性。

详情
AI中文摘要

我们考虑一种在线系统辨识设置,在给定的时间步骤中新数据逐步出现。为满足实时估计要求,我们提出一种定制的贝叶斯系统辨识程序,其中超参数仍通过边际似然最大化更新,但仅需一次合适的迭代优化算法迭代。考虑了梯度方法和EM算法用于边际似然优化。我们比较了这种'1步'程序与标准程序,后者优化方法运行直至收敛到局部最小值。我们进行的实验确认了所提方法的有效性。

英文摘要

We consider an on-line system identification setting, in which new data become available at given time steps. In order to meet real-time estimation requirements, we propose a tailored Bayesian system identification procedure, in which the hyper-parameters are still updated through Marginal Likelihood maximization, but after only one iteration of a suitable iterative optimization algorithm. Both gradient methods and the EM algorithm are considered for the Marginal Likelihood optimization. We compare this "1-step" procedure with the standard one, in which the optimization method is run until convergence to a local minimum. The experiments we perform confirm the effectiveness of the approach we propose.

1506.08350 2026-06-04 cs.LG cs.NA math.NA 版本更新

Stochastic Gradient Made Stable: A Manifold Propagation Approach for Large-Scale Optimization

让随机梯度更稳定:一种用于大规模优化的流形传播方法

Yadong Mu, Wei Liu, Wei Fan

AI总结 本文提出了一种新的半随机梯度下降算法S3GD,通过高效的流形传播方法减少计算复杂度,提升优化稳定性。

Comments 14 pages, 9 figures

详情
AI中文摘要

随机梯度下降(SGD)是构建大规模机器学习模型的经典方法。由于通常仅使用少量样本计算随机梯度,导致梯度估计波动较大,参数难以收敛。本文提出了一种新的半随机梯度下降算法S3GD,通过高效的流形传播方法,能够以较低的计算复杂度生成精确的梯度估计,从而在优化复合凸函数时实现更快的收敛速度。理论分析表明,S3GD在几何算法收敛速度与空间和时间复杂度之间取得了良好的平衡。实验结果在多个大规模基准数据集上验证了S3GD的有效性。

英文摘要

Stochastic gradient descent (SGD) is a classical method for building large-scale machine learning models over big data. A stochastic gradient is typically calculated from a limited number of samples (known as a mini-batch), so it potentially incurs a high variance and causes the estimated parameters to bounce around the optimal solution. To improve the stability of the stochastic gradient, recent years have witnessed the proposal of several semi-stochastic gradient descent algorithms, which distinguish themselves from standard SGD by incorporating global information into gradient computation. In this paper we contribute a novel stratified semi-stochastic gradient descent (S3GD) algorithm to this nascent research area, accelerating the optimization of a large family of composite convex functions. Though theoretically converging faster, prior semi-stochastic algorithms are found to suffer from high iteration complexity, which makes them even slower than SGD in practice on many datasets. In our proposed S3GD, the semi-stochastic gradient is calculated based on efficient manifold propagation, which can be numerically accomplished by sparse matrix multiplications. This way S3GD is able to generate a highly accurate estimate of the exact gradient from each mini-batch with largely reduced computational complexity. Theoretical analysis reveals that the proposed S3GD elegantly balances the geometric algorithmic convergence rate against the space and time complexities during the optimization. The efficacy of S3GD is also experimentally corroborated on several large-scale benchmark datasets.
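
As background on what "incorporating global information into gradient computation" can look like, here is a minimal sketch of SVRG, one well-known earlier semi-stochastic method. It is not the S3GD algorithm of this paper, and it involves no manifold propagation. The function signature, the default inner-loop length, and the `grad_i` callback are illustrative assumptions.

```python
import random


def svrg(grad_i, n, w0, step, epochs=20, inner=None, seed=0):
    """Stochastic variance-reduced gradient (SVRG) on a sum of n losses.

    grad_i(i, w) must return the gradient (a list) of the i-th loss at w.
    """
    rng = random.Random(seed)
    inner = inner or 2 * n
    w = list(w0)
    for _ in range(epochs):
        snapshot = list(w)
        # Global information: the full gradient at the snapshot point.
        per_sample = [grad_i(i, snapshot) for i in range(n)]
        full = [sum(col) / n for col in zip(*per_sample)]
        for _ in range(inner):
            i = rng.randrange(n)
            g_now = grad_i(i, w)
            g_snap = grad_i(i, snapshot)
            # Variance-reduced direction: stochastic gradient corrected
            # by the snapshot gradient and the full gradient.
            w = [wj - step * (a - b + c)
                 for wj, a, b, c in zip(w, g_now, g_snap, full)]
    return w
```

For example, with losses 0.5*(w - t_i)^2 the corrected direction equals the exact gradient w - mean(t), so the iterates settle at the mean of the targets instead of bouncing around it.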

1601.02947 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Online Model Estimation for Predictive Thermal Control of Buildings

建筑预测热控的在线模型估计

Peter Radecki, Brandon Hencey

AI总结 本文提出一种可扩展的方法,用于学习面向控制的建筑热模型,以实现低成本预测控制的大规模部署。通过针对参数和扰动估计而增广的无迹卡尔曼滤波器,准确学习和预测建筑热响应。通过EnergyPlus模拟数据验证了基于多区热网络的灰盒方法,实现了24小时以上的能源预测。

Comments 14 pages, 15 figures, 2 tables, 1 algorithm

详情
AI中文摘要

本文提出了一种通用、可扩展的方法,用于学习面向控制的建筑热模型,以实现低成本预测控制的大规模部署。针对参数和扰动估计而增广的无迹卡尔曼滤波器能够准确学习和预测建筑热响应。近期研究显示,先进的模型预测控制(MPC)在供暖、通风和空气调节(HVAC)系统中能带来显著的节能效果。然而,获取准确、稳健的单个建筑独特热外壳模型的可扩展低成本方法一直难以实现,阻碍了预测控制系统的广泛应用。这些热模型的持续校准和长期性能需要部署在线数据驱动的系统辨识和参数估计程序。我们提出了一种基于多区热网络、使用无迹卡尔曼滤波器的新型灰盒方法,并通过EnergyPlus模拟数据进行了验证。滤波器在已知或受约束负载期间快速学习热网络的参数,然后表征未知负载,以提供准确的24小时以上能源预测。本研究将参数与扰动估计程序形式化,并展示了为期一年的研究结果,从而扩展了我们的初步研究。

英文摘要

This study proposes a general, scalable method to learn control-oriented thermal models of buildings that could enable wide-scale deployment of cost-effective predictive controls. An Unscented Kalman Filter augmented for parameter and disturbance estimation is shown to accurately learn and predict a building's thermal response. Recent studies of heating, ventilating, and air conditioning (HVAC) systems have shown significant energy savings with advanced model predictive control (MPC). A scalable cost-effective method to readily acquire accurate, robust models of individual buildings' unique thermal envelopes has historically been elusive and hindered the widespread deployment of prediction-based control systems. Continuous commissioning and lifetime performance of these thermal models requires deployment of on-line data-driven system identification and parameter estimation routines. We propose a novel gray-box approach using an Unscented Kalman Filter based on a multi-zone thermal network and validate it with EnergyPlus simulation data. The filter quickly learns parameters of a thermal network during periods of known or constrained loads and then characterizes unknown loads in order to provide accurate 24+ hour energy predictions. This study extends our initial investigation by formalizing parameter and disturbance estimation routines and demonstrating results across a year-long study.

1512.08169 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Self-Excitation: An Enabler for Online Thermal Estimation and Model Predictive Control of Buildings

自激发:建筑在线热估计与模型预测控制的赋能者

Peter Radecki, Brandon Hencey

AI总结 本文提出通过在线识别和激发(主动学习过程)提升建筑热预测控制性能,利用控制器激发未知动态部分,改进热模型估计,从而提高MPC的能耗节约和舒适度。

Comments 11 pages, 10 figures, 2 tables

详情
AI中文摘要

本文研究了一种通过在线识别和激发(主动学习过程)来提升建筑热预测控制性能的方法,该方法对正常运行的干扰最小。先前研究已展示出利用灰盒方法获取被动建筑多区热模型的可扩展方法,利用建筑拓扑和测量数据。本文将该方法扩展到多区主动控制建筑,并探讨如何利用控制器激发建筑动态中未知的部分来改进热模型估计。与基线恒温器控制器相比,本文在MPC框架中展示了初始获取和改进的热模型的实用性,该框架可预见天气不确定性和随时间变化的温度设定点。仿真研究证明自激发可改进模型估计,从而带来更好的MPC节能效果和居住者舒适度。通过将建筑拓扑、估计和控制程序整合到单一在线框架中,本文展示了低成本可扩展方法在主动学习和控制建筑以确保居住者舒适度和最小化能耗方面的潜力,同时利用现有建筑的HVAC传感器和硬件。

英文摘要

This paper investigates a method to improve buildings' thermal predictive control performance via online identification and excitation (active learning process) that minimally disrupts normal operations. In previous studies we have demonstrated scalable methods to acquire multi-zone thermal models of passive buildings using a gray-box approach that leverages building topology and measurement data. Here we extend the method to multi-zone actively controlled buildings and examine how to improve the thermal model estimation by using the controller to excite unknown portions of the building's dynamics. Comparing against a baseline thermostat controller, we demonstrate the utility of both the initially acquired and improved thermal models within a Model Predictive Control (MPC) framework, which anticipates weather uncertainty and time-varying temperature set-points. A simulation study demonstrates self-excitation improves model estimation, which corresponds to improved MPC energy savings and occupant comfort. By coupling building topology, estimation, and control routines into a single online framework, we have demonstrated the potential for low-cost scalable methods to actively learn and control buildings to ensure occupant comfort and minimize energy usage, all while using the existing building's HVAC sensors and hardware.

1512.03518 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

A Unified Approach to Error Bounds for Structured Convex Optimization Problems

结构凸优化问题误差界的一种统一方法

Zirui Zhou, Anthony Man-Cho So

AI总结 本文提出一种统一框架,用于建立结构凸优化问题的误差界,涵盖一般约束最小化问题和机器学习中的正则化损失最小化问题,并通过核范数正则化损失问题展示了新误差界的应用。

Comments 32 pages

详情
AI中文摘要

误差界是指通过残差函数将测试集向量距离给定集合的距离进行限制的不等式,已被证明在分析迭代方法的收敛速度方面非常有用。本文提出了一种新的框架,用于建立一类结构凸优化问题的误差界,其中目标函数是光滑凸函数和一般闭合正凸函数的和。此类问题不仅涵盖广泛的一般约束最小化问题,还涵盖各种正则化损失最小化公式。使用我们的框架,我们证明了现有误差界结果可以以统一和透明的方式恢复。为进一步展示我们框架的威力,我们将其应用于核范数正则化损失最小化问题,并在严格互补型正则性条件下建立了此类问题的新误差界。然后,我们通过构造一个例子来证明,在没有正则性条件的情况下,所述误差界可能失效。因此,我们得到了对Tseng提出的问题的较为完整的回答。我们相信,我们的方法将在结构凸优化问题的误差界研究中找到进一步的应用。

英文摘要

Error bounds, which refer to inequalities that bound the distance of vectors in a test set to a given set by a residual function, have proven to be extremely useful in analyzing the convergence rates of a host of iterative methods for solving optimization problems. In this paper, we present a new framework for establishing error bounds for a class of structured convex optimization problems, in which the objective function is the sum of a smooth convex function and a general closed proper convex function. Such a class encapsulates not only fairly general constrained minimization problems but also various regularized loss minimization formulations in machine learning, signal processing, and statistics. Using our framework, we show that a number of existing error bound results can be recovered in a unified and transparent manner. To further demonstrate the power of our framework, we apply it to a class of nuclear-norm regularized loss minimization problems and establish a new error bound for this class under a strict complementarity-type regularity condition. We then complement this result by constructing an example to show that the said error bound could fail to hold without the regularity condition. Consequently, we obtain a rather complete answer to a question raised by Tseng. We believe that our approach will find further applications in the study of error bounds for structured convex optimization problems.

1512.02693 2026-06-04 cs.NE cs.LG cs.SY eess.SY 版本更新

Reinforcement Control with Hierarchical Backpropagated Adaptive Critics

基于分层反向传播自适应批评者的强化控制

John W. Jameson

AI总结 本文提出分层反向传播自适应批评者架构,通过两级层次结构解决长期信用分配问题,引入响应诱导学习方法提升控制稳定性与鲁棒性。

Comments 16 pages, 5 figures

详情
AI中文摘要

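现有的增量学习方法难以在大量时间步(或事件)上实现可靠的信用分配。然而,当待控制的动力系统需要较频繁的控制更新以保持稳定性或鲁棒性,而某些动作的后果又必须在较长时间内才能确立时,这种情形十分典型。为解决这一问题,本文探讨了一种控制架构的学习能力,该架构由两个反向传播自适应批评者(BAC)组成两级层次结构,并采用连续动作。高层BAC的更新频率低于低层BAC,并在一定程度上控制后者。低层对高层信号的响应既可以事先确定,也可以在学习过程中自行形成。针对后一种情形,本文提出了一种称为响应诱导学习(Response Induction Learning)的通用方法。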

英文摘要

Present incremental learning methods are limited in their ability to achieve reliable credit assignment over a large number of time steps (or events). However, this situation is typical for cases where the dynamical system to be controlled requires relatively frequent control updates in order to maintain stability or robustness yet has some action-consequences which must be established over relatively long periods of time. To address this problem, the learning capabilities of a control architecture composed of two Backpropagated Adaptive Critics (BACs) in a two-level hierarchy with continuous actions are explored. The high-level BAC updates less frequently than the low-level BAC and controls the latter to some degree. The response of the low-level to high-level signals can either be determined a priori or it can emerge during learning. A general approach called Response Induction Learning is introduced to address the latter case.

1512.01927 2026-06-04 math.NA cs.CV cs.LG cs.NA 版本更新

Fast Optimization Algorithm on Riemannian Manifolds and Its Application in Low-Rank Representation

黎曼流形上的快速优化算法及其在低秩表示中的应用

Haoran Chen, Yanfeng Sun, Junbin Gao, Yongli Hu

AI总结 本文提出了一种具有快速收敛速度的新型一阶优化算法FOA,并在低秩表示模型中应用了基于FOA的快速子空间追踪方法,实验表明其在收敛速度和准确性方面优于其他方法。

详情
AI中文摘要

本文研究了在黎曼流形上优化一类复合函数的问题,并提出了一种新的一阶优化算法(FOA),具有快速收敛速度。通过理论分析证明了该算法具有二次收敛性。在矩阵补全任务的实验中,FOA优于黎曼流形上的其他一阶优化方法。基于FOA提出了一种快速子空间追踪方法,用于在低秩矩阵簇上求解基于增广拉格朗日方法的低秩表示模型。实验结果表明,FOA和SP-RPRG(ALM)在合成和真实数据集上均实现了更快的收敛速度和更高的准确性。

英文摘要

The paper addresses the problem of optimizing a class of composite functions on Riemannian manifolds and a new first order optimization algorithm (FOA) with a fast convergence rate is proposed. Through the theoretical analysis for FOA, it has been proved that the algorithm has quadratic convergence. The experiments in the matrix completion task show that FOA has better performance than other first order optimization methods on Riemannian manifolds. A fast subspace pursuit method based on FOA is proposed to solve the low-rank representation model based on augmented Lagrange method on the low rank matrix variety. Experimental results on synthetic and real data sets are presented to demonstrate that both FOA and SP-RPRG(ALM) can achieve superior performance in terms of faster convergence and higher accuracy.

1509.08581 2026-06-04 math.OC cs.LG cs.NA math.NA stat.CO stat.ML 版本更新

Optimization over Sparse Symmetric Sets via a Nonmonotone Projected Gradient Method

通过非单调投影梯度法优化稀疏对称集

Zhaosong Lu

AI总结 本文提出非单调投影梯度法用于优化稀疏对称集,引入更强的最优条件并证明其全局或局部最优性。

Comments 30 pages

详情
AI中文摘要

我们考虑在稀疏对称集上最小化Lipschitz可微函数的问题,该问题在工程和科学中有广泛应用。已知经典投影梯度(PG)方法常数步长1/L的任何聚点满足L-站定最优条件。本文引入更强的最优条件,并提出非单调投影梯度(NPG)方法,结合支持变化和坐标交换策略。证明NPG的任何聚点满足新条件且为坐标站定点。在合适假设下,其为全局或局部极小值。数值实验显示NPG在解质量上优于PG,且在速度上至少可比甚至优于PG。

英文摘要

We consider the problem of minimizing a Lipschitz differentiable function over a class of sparse symmetric sets that has wide applications in engineering and science. For this problem, it is known that any accumulation point of the classical projected gradient (PG) method with a constant stepsize $1/L$ satisfies the $L$-stationarity optimality condition that was introduced in [3]. In this paper we introduce a new optimality condition that is stronger than the $L$-stationarity optimality condition. We also propose a nonmonotone projected gradient (NPG) method for this problem by incorporating some support-changing and coordinate-swapping strategies into a projected gradient method with variable stepsizes. It is shown that any accumulation point of NPG satisfies the new optimality condition and moreover it is a coordinatewise stationary point. Under some suitable assumptions, we further show that it is a global or a local minimizer of the problem. Numerical experiments are conducted to compare the performance of PG and NPG. The computational results demonstrate that NPG yields substantially better solution quality than PG and, in terms of speed, is at least comparable to PG and sometimes much faster.
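
The baseline PG method with constant stepsize 1/L is easy to sketch for the simplest sparse symmetric set, {x : at most s nonzero entries}. That choice of set, the function names, and the fixed iteration count are assumptions made for illustration. The NPG method with its support-changing and coordinate-swapping strategies is not implemented here.

```python
def project_sparse(x, s):
    """Euclidean projection onto {x : at most s nonzeros}: keep the s largest |x_i|."""
    order = sorted(range(len(x)), key=lambda i: abs(x[i]), reverse=True)
    keep = set(order[:s])
    return [xi if i in keep else 0.0 for i, xi in enumerate(x)]


def projected_gradient(grad, x0, s, L, iters=100):
    """Classical PG: x <- P(x - grad(x) / L), with L a Lipschitz constant of grad."""
    x = project_sparse(x0, s)
    for _ in range(iters):
        g = grad(x)
        x = project_sparse([xi - gi / L for xi, gi in zip(x, g)], s)
    return x


# Usage: minimize 0.5 * ||x - c||^2 (so L = 1) with at most two nonzeros.
c = [3.0, -1.0, 2.0]
x_star = projected_gradient(lambda x: [xi - ci for xi, ci in zip(x, c)],
                            [0.0, 0.0, 0.0], s=2, L=1.0)
print(x_star)
```

In this toy problem the iterate keeps the two largest-magnitude entries of c and zeroes the third; on harder problems PG can stall at weaker stationary points, which is the gap the stronger optimality condition is meant to close.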

1511.08062 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Relaxed Majorization-Minimization for Non-smooth and Non-convex Optimization

松弛的主导最小化方法用于非光滑和非凸优化

Chen Xu, Zhouchen Lin, Zhenyu Zhao, Hongbin Zha

AI总结 本文提出了一种新的松弛主导最小化方法,用于非光滑和非凸优化问题,该方法能涵盖现有主导最小化方法。通过弱化条件,允许使用直接逼近非光滑目标函数的替代函数,在鲁棒矩阵分解问题中表现出优势。

Comments AAAI16

详情
AI中文摘要

我们提出了一种新的主导最小化(MM)方法,用于非光滑和非凸程序,该方法足够通用,可以包含现有的MM方法。除了局部主导条件外,我们仅要求当迭代次数趋于无穷大时,目标函数与其替代函数的方向导数的差消失,这是一个非常弱的条件。因此,我们的方法可以使用直接逼近非光滑目标函数的替代函数。相比之下,现有的所有MM方法都是通过近似目标函数的光滑部分来构建替代函数的。我们应用我们的松弛MM方法到具有不同正则化的鲁棒矩阵分解(RMF)问题中,其中我们的局部主导算法在RMF问题中优于现有方法。这是首个确保在不额外假设的情况下,任何迭代点的极限点都是驻点的RMF算法。

英文摘要

We propose a new majorization-minimization (MM) method for non-smooth and non-convex programs, which is general enough to include the existing MM methods. Besides the local majorization condition, we only require that the difference between the directional derivatives of the objective function and its surrogate function vanishes when the number of iterations approaches infinity, which is a very weak condition. So our method can use a surrogate function that directly approximates the non-smooth objective function. In comparison, all the existing MM methods construct the surrogate function by approximating the smooth component of the objective function. We apply our relaxed MM methods to the robust matrix factorization (RMF) problem with different regularizations, where our locally majorant algorithm shows advantages over the state-of-the-art approaches for RMF. This is the first algorithm for RMF ensuring, without extra assumptions, that any limit point of the iterates is a stationary point.

1509.03044 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Recurrent Reinforcement Learning: A Hybrid Approach

递归强化学习:一种混合方法

Xiujun Li, Lihong Li, Jianfeng Gao, Xiaodong He, Jianshu Chen, Li Deng, Ji He

AI总结 本文提出一种混合模型,结合监督学习和强化学习,用于部分可观测任务的状态表示学习,在极少领域知识下有效。

Comments 11 pages, 6 figures

详情
AI中文摘要

成功的强化学习应用往往需要处理部分可观测状态。通常很难构建和推断隐藏状态,因为它们依赖于智能体的整个交互历史,可能需要大量领域知识。本文研究了一种深度学习方法,用于在极少领域知识下学习部分可观测任务的状态表示。特别地,我们提出了一种新的混合模型,结合监督学习(SL)和强化学习(RL)的优点,以联合方式训练:SL组件可以是循环神经网络(RNN)或其长短期记忆(LSTM)版本,具有捕捉长期依赖性的能力,从而有效学习隐藏状态的表示。RL组件是一个深度Q网络(DQN),学习优化控制以最大化长期奖励。在直接邮寄营销问题上的大量实验展示了所提出方法的有效性和优势,其在一组先前最先进的方法中表现最佳。

英文摘要

Successful applications of reinforcement learning in real-world problems often require dealing with partially observable states. It is in general very challenging to construct and infer hidden states as they often depend on the agent's entire interaction history and may require substantial domain knowledge. In this work, we investigate a deep-learning approach to learning the representation of states in partially observable tasks, with minimal prior knowledge of the domain. In particular, we propose a new family of hybrid models that combine the strengths of both supervised learning (SL) and reinforcement learning (RL), trained in a joint fashion: The SL component can be a recurrent neural network (RNN) or its long short-term memory (LSTM) version, which is equipped with the desired property of being able to capture long-term dependency on history, thus providing an effective way of learning the representation of hidden states. The RL component is a deep Q-network (DQN) that learns to optimize the control for maximizing long-term rewards. Extensive experiments in a direct mailing campaign problem demonstrate the effectiveness and advantages of the proposed approach, which performs the best among a set of previous state-of-the-art methods.

1511.05133 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Fast Proximal Linearized Alternating Direction Method of Multiplier with Parallel Splitting

快速近端线性化交替方向乘子法与并行分裂

Canyi Lu, Huan Li, Zhouchen Lin, Shuicheng Yan

AI总结 本文提出快速近端增广拉格朗日法和快速近端ADMM并行分裂法,改进了收敛速度并降低了计算复杂度,实验证明其在合成和真实数据上均优于传统PALM和ADMM。

Comments AAAI 2016

详情
AI中文摘要

增广拉格朗日方法(ALM)和交替方向乘子法(ADMM)已成为求解一般凸规划问题的有效工具。本文考虑目标函数由光滑部分和非光滑但简单部分组成的凸问题。我们提出快速近端增广拉格朗日法(Fast PALM),其收敛速度为O(1/K²),优于传统PALM的O(1/K)。为进一步降低每迭代复杂度并处理多块问题,我们提出快速近端ADMM并行分裂法(Fast PL-ADMM-PS)。该方法在目标函数的光滑部分收敛速度方面也有所改进。在合成和真实数据上的实验结果表明,我们的快速方法显著优于之前的PALM和ADMM。

英文摘要

The Augmented Lagrangian Method (ALM) and the Alternating Direction Method of Multipliers (ADMM) are powerful optimization methods for general convex programming subject to linear constraints. We consider the convex problem whose objective consists of a smooth part and a nonsmooth but simple part. We propose the Fast Proximal Augmented Lagrangian Method (Fast PALM), which achieves the convergence rate $O(1/K^2)$, compared with $O(1/K)$ by the traditional PALM. In order to further reduce the per-iteration complexity and handle multi-block problems, we propose the Fast Proximal ADMM with Parallel Splitting (Fast PL-ADMM-PS) method. It also partially improves the rate related to the smooth part of the objective function. Experimental results on both synthesized and real-world data demonstrate that our fast methods significantly improve on the previous PALM and ADMM.

1407.0753 2026-06-04 math.OC cs.LG cs.NA math.NA stat.ML 版本更新

Global convergence of splitting methods for nonconvex composite optimization

非凸复合优化中分裂方法的全局收敛性

Guoyin Li, Ting Kei Pong

AI总结 本文研究了非凸复合优化问题,分析了交替方向乘子法和近端梯度算法的收敛性,证明了在特定条件下序列收敛于驻点,并给出了保证序列有界的充分条件。

Comments To appear in SIOPT

详情
AI中文摘要

我们考虑了最小化一个具有有界海森矩阵的光滑函数 h 和一个非光滑函数之和的问题。我们假设后者是一个正常闭函数 P 和一个满射线性映射 M 的复合,且对 τ>0,τP 的近端映射易于计算。该问题一般是非凸的,并涵盖工程和机器学习中的许多重要应用。本文分析了两种用于求解该非凸优化问题的分裂方法:交替方向乘子法和近端梯度算法。对于交替方向乘子法的直接改编,我们证明如果惩罚参数足够大且生成的序列有聚点,则该聚点是非凸问题的驻点。在函数 h 和 P 是半代数的附加假设下,我们还证明了整个序列收敛。此外,我们给出了保证生成序列有界的简单充分条件。这些条件在广泛的应用中都能满足,包括带有 ℓ_{1/2} 正则化的最小二乘问题。最后,当 M 是恒等映射、从而近端梯度算法可以高效应用时,我们证明在比文献中针对非凸 h 的已知规则略为灵活的常数步长规则下,任何聚点都是驻点。

英文摘要

We consider the problem of minimizing the sum of a smooth function $h$ with a bounded Hessian, and a nonsmooth function. We assume that the latter function is a composition of a proper closed function $P$ and a surjective linear map $\cal M$, with the proximal mappings of $τP$, $τ> 0$, simple to compute. This problem is nonconvex in general and encompasses many important applications in engineering and machine learning. In this paper, we examined two types of splitting methods for solving this nonconvex optimization problem: alternating direction method of multipliers and proximal gradient algorithm. For the direct adaptation of the alternating direction method of multipliers, we show that, if the penalty parameter is chosen sufficiently large and the sequence generated has a cluster point, then it gives a stationary point of the nonconvex problem. We also establish convergence of the whole sequence under an additional assumption that the functions $h$ and $P$ are semi-algebraic. Furthermore, we give simple sufficient conditions to guarantee boundedness of the sequence generated. These conditions can be satisfied for a wide range of applications including the least squares problem with the $\ell_{1/2}$ regularization. Finally, when $\cal M$ is the identity so that the proximal gradient algorithm can be efficiently applied, we show that any cluster point is stationary under a slightly more flexible constant step-size rule than what is known in the literature for a nonconvex $h$.

1405.4980 2026-06-04 math.OC cs.CC cs.LG cs.NA math.NA stat.ML 版本更新

Convex Optimization: Algorithms and Complexity

凸优化:算法与复杂性

Sébastien Bubeck

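AI summary: This monograph covers the complexity theorems of convex optimization and their algorithms, spanning the theory and methods of black-box, structural, and stochastic optimization, with emphasis on core algorithms such as FISTA, dual averaging, and interior point methods.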

Comments A previous version of the manuscript was titled "Theory of Convex Optimization for Machine Learning"

详情
Journal ref
In Foundations and Trends in Machine Learning, Vol. 8: No. 3-4, pp 231-357, 2015
AI中文摘要

本文阐述了凸优化中的复杂性定理及其相应算法。从黑盒优化的基本理论开始,内容逐步推进到结构优化和随机优化的最新进展。黑盒优化部分深受Nesterov的开创性著作和Nemirovski的讲义影响,涵盖了切割平面方法以及(加速)梯度下降方案的分析。我们特别关注非欧几里得设置(相关算法包括Frank-Wolfe、镜像下降和对偶平均)并讨论其在机器学习中的相关性。我们为结构优化提供了简要介绍,包括FISTA(用于优化光滑与简单非光滑项之和)、鞍点镜像近似(Nemirovski对Nesterov平滑方法的替代)以及内点法的简要描述。在随机优化中,我们讨论了随机梯度下降、小批量、随机坐标下降和亚线性算法。我们还简要提及了凸松弛的组合优化问题以及利用随机性来近似解的方法,以及基于随机游走的方法。

英文摘要

This monograph presents the main complexity theorems in convex optimization and their corresponding algorithms. Starting from the fundamental theory of black-box optimization, the material progresses towards recent advances in structural optimization and stochastic optimization. Our presentation of black-box optimization, strongly influenced by Nesterov's seminal book and Nemirovski's lecture notes, includes the analysis of cutting plane methods, as well as (accelerated) gradient descent schemes. We also pay special attention to non-Euclidean settings (relevant algorithms include Frank-Wolfe, mirror descent, and dual averaging) and discuss their relevance in machine learning. We provide a gentle introduction to structural optimization with FISTA (to optimize a sum of a smooth and a simple non-smooth term), saddle-point mirror prox (Nemirovski's alternative to Nesterov's smoothing), and a concise description of interior point methods. In stochastic optimization we discuss stochastic gradient descent, mini-batches, random coordinate descent, and sublinear algorithms. We also briefly touch upon convex relaxation of combinatorial problems and the use of randomness to round solutions, as well as random walks based methods.
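
As a rough illustration of the FISTA scheme named in the abstract (optimizing a sum of a smooth and a simple non-smooth term), the following pure-Python sketch applies it to the lasso objective $\frac{1}{2}\|Ax-b\|^2 + \lambda\|x\|_1$. The function names, the tiny diagonal test problem, and the choice to have the caller pass the Lipschitz constant are illustrative assumptions and are not taken from the monograph.

```python
import math

def soft_threshold(values, tau):
    """Proximal operator of tau * ||.||_1, applied elementwise."""
    return [math.copysign(max(abs(v) - tau, 0.0), v) for v in values]

def fista_lasso(A, b, lam, lipschitz, iters=500):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 with FISTA.

    `lipschitz` must upper-bound the largest eigenvalue of A^T A
    (supplied by the caller; an assumption of this sketch).
    """
    rows, cols = len(A), len(A[0])
    x = [0.0] * cols
    y = x[:]   # extrapolated point
    t = 1.0    # momentum parameter
    for _ in range(iters):
        # Gradient of the smooth part at y: A^T (A y - b).
        residual = [sum(A[i][j] * y[j] for j in range(cols)) - b[i] for i in range(rows)]
        grad = [sum(A[i][j] * residual[i] for i in range(rows)) for j in range(cols)]
        # Proximal gradient step, then Nesterov-style extrapolation.
        x_new = soft_threshold([y[j] - grad[j] / lipschitz for j in range(cols)],
                               lam / lipschitz)
        t_new = (1.0 + math.sqrt(1.0 + 4.0 * t * t)) / 2.0
        y = [x_new[j] + ((t - 1.0) / t_new) * (x_new[j] - x[j]) for j in range(cols)]
        x, t = x_new, t_new
    return x

# Hypothetical diagonal problem: each coordinate has the closed-form
# minimizer soft_threshold(a_i * b_i, lam) / a_i**2.
A = [[1.0, 0.0], [0.0, 2.0]]
b = [3.0, 1.0]
solution = fista_lasso(A, b, lam=0.5, lipschitz=4.0)
```

The diagonal test matrix is chosen only because its minimizer can be checked by hand; for a dense matrix the same routine applies as long as `lipschitz` is a valid bound.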

1502.06800 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

On the Equivalence between Kernel Quadrature Rules and Random Feature Expansions

核积分规则与随机特征展开的等价性

Francis Bach

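AI summary: The study shows that kernel quadrature rules are a special case of random feature expansions, derives how the required number of samples depends on the eigenvalues of the associated integral operator, extends the results to function approximation, and reduces the number of random features needed to preserve generalization guarantees.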

详情
AI中文摘要

我们展示基于核的积分规则可以视为正定核随机特征展开的特例,对于特定分解总存在。我们提供理论分析,得出所需样本数与近似误差的关系,得到基于积分算子特征值的上下界,匹配对数项。特别地,我们显示上界可通过特定非均匀分布的独立同分布样本获得,而下界若对任何点集有效。将结果应用于核积分规则时,我们恢复了Sobolev空间的已知上下界。此外,结果扩展至更一般的函数逼近问题,得到L2-和L∞-范数结果,匹配特殊情形的已知结果。应用于随机特征时,我们显示改进了保持学习Lipschitz连续损失泛化保证所需的随机特征数量。

英文摘要

We show that kernel-based quadrature rules for computing integrals can be seen as a special case of random feature expansions for positive definite kernels, for a particular decomposition that always exists for such kernels. We provide a theoretical analysis of the number of required samples for a given approximation error, leading to both upper and lower bounds that are based solely on the eigenvalues of the associated integral operator and match up to logarithmic terms. In particular, we show that the upper bound may be obtained from independent and identically distributed samples from a specific non-uniform distribution, while the lower bound is valid for any set of points. Applying our results to kernel-based quadrature, while our results are fairly general, we recover known upper and lower bounds for the special cases of Sobolev spaces. Moreover, our results extend to the more general problem of full function approximations (beyond simply computing an integral), with results in $L_2$- and $L_\infty$-norm that match known results for special cases. Applying our results to random features, we show an improvement of the number of random features needed to preserve the generalization guarantees for learning with Lipschitz-continuous losses.
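
For readers unfamiliar with random feature expansions, here is a minimal sketch of the classic random Fourier feature construction for a unit-bandwidth Gaussian kernel. It is a generic textbook example and not the quadrature-specific decomposition analyzed in the paper; the function names, the feature count of 4000, and the test points are all assumptions made for illustration.

```python
import math
import random

def make_random_fourier_features(dim, num_features, seed=0):
    """Return a feature map phi with E[phi(x) . phi(y)] = exp(-||x - y||^2 / 2)."""
    rng = random.Random(seed)
    # Frequencies are drawn from the kernel's spectral density N(0, I),
    # phases uniformly from [0, 2*pi].
    freqs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(num_features)]
    scale = math.sqrt(2.0 / num_features)

    def feature_map(x):
        return [scale * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + p)
                for w, p in zip(freqs, phases)]

    return feature_map

def gaussian_kernel(x, y):
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))

phi = make_random_fourier_features(dim=3, num_features=4000)
x = [0.2, -0.1, 0.4]
y = [0.0, 0.3, 0.1]
approx = sum(a * b for a, b in zip(phi(x), phi(y)))  # Monte Carlo kernel estimate
exact = gaussian_kernel(x, y)
```

The inner product of the two feature vectors is a Monte Carlo estimate of the kernel value, so its error shrinks roughly like one over the square root of the number of features.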

1504.05477 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Randomized Block Krylov Methods for Stronger and Faster Approximate Singular Value Decomposition

随机块Krylov方法用于更准确和更快的近似奇异值分解

Cameron Musco, Christopher Musco

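AI summary: This paper proposes a randomized block Krylov method that improves on randomized Simultaneous Power Iteration both in theory and in experiments, reaching the same low-rank approximation guarantee in fewer iterations, with an analysis that does not depend on singular value gaps.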

Comments Neural Information Processing Systems 2015

详情
AI中文摘要

自从Rokhlin、Szlam和Tygert分析并推广了随机同时幂迭代方法以来,该方法已成为近似奇异值分解的首选方法。它比更简单的抽样算法更准确,但对任何矩阵都快速收敛,无论奇异值间隙如何。在经过~O(1/ε)次迭代后,它能够提供一个在谱范数误差下(1+ε)的低秩近似。我们首次证明了随机块Krylov方法在理论上和实验上均优于随机同时幂迭代方法,该方法与经典的块Lanczos算法密切相关,仅需~O(1/√ε)次迭代即可达到相同保证,并在实验上表现显著更好。尽管这些方法有很长的历史,我们的分析是首个不依赖奇异值间隙的Krylov子空间方法的证明,而奇异值间隙在实践中不可靠。此外,虽然该方法是一个简单的准确性基准,但即使(1+ε)的谱范数低秩近似误差也不意味着算法返回高质量的主成分,这对数据应用是一个主要问题。我们首次解决了这个问题,通过展示块Krylov迭代和对同时迭代的微小修改能够为任何矩阵提供接近最优的PCA。这一结果进一步证明了它们在非迭代抽样方法上的优势。最后,我们给出了超越最坏情况的见解,解释了为什么这两种算法在实践中可以比预测的快得多。我们澄清了如何利用简单的技术来利用常见的矩阵属性,从而显著提高运行时间。

英文摘要

Since being analyzed by Rokhlin, Szlam, and Tygert and popularized by Halko, Martinsson, and Tropp, randomized Simultaneous Power Iteration has become the method of choice for approximate singular value decomposition. It is more accurate than simpler sketching algorithms, yet still converges quickly for any matrix, independently of singular value gaps. After $\tilde{O}(1/ε)$ iterations, it gives a low-rank approximation within $(1+ε)$ of optimal for spectral norm error. We give the first provable runtime improvement on Simultaneous Iteration: a simple randomized block Krylov method, closely related to the classic Block Lanczos algorithm, gives the same guarantees in just $\tilde{O}(1/\sqrt{ε})$ iterations and performs substantially better experimentally. Despite their long history, our analysis is the first of a Krylov subspace method that does not depend on singular value gaps, which are unreliable in practice. Furthermore, while it is a simple accuracy benchmark, even $(1+ε)$ error for spectral norm low-rank approximation does not imply that an algorithm returns high quality principal components, a major issue for data applications. We address this problem for the first time by showing that both Block Krylov Iteration and a minor modification of Simultaneous Iteration give nearly optimal PCA for any matrix. This result further justifies their strength over non-iterative sketching methods. Finally, we give insight beyond the worst case, justifying why both algorithms can run much faster in practice than predicted. We clarify how simple techniques can take advantage of common matrix properties to significantly improve runtime.

1410.4062 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

Complexity Issues and Randomization Strategies in Frank-Wolfe Algorithms for Machine Learning

机器学习中Frank-Wolfe算法的复杂性问题与随机化策略

Emanuele Frandi, Ricardo Nanculef, Johan Suykens

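AI summary: This paper experimentally studies how effective random-sampling approximation strategies are for Frank-Wolfe algorithms on large-scale datasets, analyzes possible alternatives, and offers some guidelines.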

详情
AI中文摘要

Frank-Wolfe算法用于凸优化最近在优化和机器学习社区中引起了广泛关注,因其特性使其成为各种应用中的合适选择。然而,由于每次迭代都需要优化线性模型,巧妙的实现对于使这些算法在大规模数据集上可行至关重要。为此,几位研究者提出了基于随机采样的近似策略。在本文中,我们对这些技术的有效性进行了实验研究,分析了可能的替代方案,并根据我们的结果提供了一些指导原则。

英文摘要

Frank-Wolfe algorithms for convex minimization have recently gained considerable attention from the Optimization and Machine Learning communities, as their properties make them a suitable choice in a variety of applications. However, as each iteration requires optimizing a linear model, a clever implementation is crucial to make such algorithms viable on large-scale datasets. For this purpose, approximation strategies based on random sampling have been proposed by several researchers. In this work, we perform an experimental study on the effectiveness of these techniques, analyze possible alternatives and provide some guidelines based on our results.
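
To make the "optimize a linear model at each iteration" step concrete, here is a minimal sketch of Frank-Wolfe over the probability simplex, with an optional randomized variant that searches only a random subset of vertices. The quadratic objective, the function names, and the `sample_size` parameter are assumptions chosen for illustration; they are not the specific sampling schemes evaluated in the paper.

```python
import random

def frank_wolfe_simplex(grad, dim, iters=2000, sample_size=None, seed=0):
    """Frank-Wolfe on the probability simplex.

    The linear subproblem min_s <grad, s> is solved by picking the vertex
    with the smallest gradient coordinate. If `sample_size` is given, only a
    random subset of coordinates is inspected (a hypothetical sampling
    variant used here for illustration).
    """
    rng = random.Random(seed)
    x = [1.0 / dim] * dim
    for k in range(iters):
        g = grad(x)
        if sample_size is None:
            candidates = range(dim)
        else:
            candidates = rng.sample(range(dim), sample_size)
        best = min(candidates, key=lambda j: g[j])
        step = 2.0 / (k + 2.0)              # standard open-loop step size
        x = [(1.0 - step) * x_j for x_j in x]
        x[best] += step                      # move toward the chosen vertex
    return x

# Hypothetical objective: f(x) = 0.5 * ||x - target||^2, minimized at `target`.
target = [0.1, 0.6, 0.3, 0.0]

def grad(x):
    return [x_j - t_j for x_j, t_j in zip(x, target)]

x_full = frank_wolfe_simplex(grad, dim=4)
x_sampled = frank_wolfe_simplex(grad, dim=4, sample_size=2)
```

Every iterate is a convex combination of simplex vertices, so feasibility is preserved without any projection; the sampled variant trades accuracy of the linear step for a cheaper search.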

1509.01208 2026-06-04 cs.LG cs.IR cs.NA math.NA 版本更新

Fast Clustering and Topic Modeling Based on Rank-2 Nonnegative Matrix Factorization

基于秩2非负矩阵分解的快速聚类与主题建模

Da Kuang, Barry Drake, Haesun Park

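AI summary: This paper proposes HierNMF2 and FlatNMF2, which use Rank-2 nonnegative matrix factorization for efficient hierarchical and flat clustering and topic modeling; experiments show significant improvements in both computation time and solution quality.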

Comments This paper has been withdrawn by the author to clarify the authorship

详情
AI中文摘要

无监督聚类和主题建模的重要性日益凸显,随着文本数据量的增加。本文提出了一种名为HierNMF2的快速方法,基于快速秩2非负矩阵分解(NMF),进行二元聚类和高效的节点分裂规则。进一步利用HierNMF2生成的最终叶节点和非负最小二乘拟合思想,提出新的聚类/主题建模方法FlatNMF2,以极简且显著更有效的方式恢复扁平聚类/主题建模结果。我们为HierNMF2和FlatNMF2实现了高度优化的开源C++软件,用于文档数据集的层次和部分聚类/主题建模。大量实验测试展示了计算时间和解质量的显著改进。我们比较了我们的方法与其他聚类方法(包括K-means、标准NMF和CLUTO)以及主题建模方法(包括潜在狄利克雷分配(LDA)和最近提出的具有分离性约束的NMF算法)。总体而言,我们提出了分析大规模数据集的高效工具和技术,这些技术可推广到许多其他数据分析问题领域。

英文摘要

The importance of unsupervised clustering and topic modeling is well recognized with ever-increasing volumes of text data. In this paper, we propose a fast method for hierarchical clustering and topic modeling called HierNMF2. Our method is based on fast Rank-2 nonnegative matrix factorization (NMF) that performs binary clustering and an efficient node splitting rule. Further utilizing the final leaf nodes generated in HierNMF2 and the idea of nonnegative least squares fitting, we propose a new clustering/topic modeling method called FlatNMF2 that recovers a flat clustering/topic modeling result in a very simple yet significantly more effective way than any other existing methods. We implement highly optimized open source software in C++ for both HierNMF2 and FlatNMF2 for hierarchical and partitional clustering/topic modeling of document data sets. Substantial experimental tests are presented that illustrate significant improvements both in computational time as well as quality of solutions. We compare our methods to other clustering methods including K-means, standard NMF, and CLUTO, and also topic modeling methods including latent Dirichlet allocation (LDA) and recently proposed algorithms for NMF with separability constraints. Overall, we present efficient tools for analyzing large-scale data sets, and techniques that can be generalized to many other data analytics problem domains.

1509.08990 2026-06-04 cs.SI cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Learning without Recall: A Case for Log-Linear Learning

无回忆学习:对数线性学习的一个论据

Mohammad Amin Rahimian, Ali Jadbabaie

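AI summary: This paper studies how rational agents that do not recall past observations learn and form beliefs through rational inference, and examines how the choice of time-varying priors affects learning and its rate.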

Comments in 5th IFAC Workshop on Distributed Estimation and Control in Networked Systems, (NecSys 2015)

详情
AI中文摘要

我们分析了一个学习和信念形成在网络中的模型,其中代理人遵循贝叶斯规则,但不回忆过去的观察历史,也无法推断其他代理人的信念形成过程。他们通过合理推断自己的观察来实现,这些观察包括一系列独立同分布的私人信号以及每个时间点邻居的信念。完全理性的代理人会依次应用贝叶斯规则处理全部观察历史。这导致了由于对全球网络结构缺乏了解而产生令人担忧的复杂推断。为了解决这些复杂性,我们考虑了无回忆学习模型,该模型不仅为分析社会网络中理性代理人的行为提供了可处理的框架,还能为文献中各种非贝叶斯更新规则提供行为基础。我们阐述了各种时间变化先验选择对代理人学习及其速率的影响。

英文摘要

We analyze a model of learning and belief formation in networks in which agents follow Bayes rule yet they do not recall their history of past observations and cannot reason about how other agents' beliefs are formed. They do so by making rational inferences about their observations which include a sequence of independent and identically distributed private signals as well as the beliefs of their neighboring agents at each time. Fully rational agents would successively apply Bayes rule to the entire history of observations. This leads to forebodingly complex inferences due to lack of knowledge about the global network structure that causes those observations. To address these complexities, we consider a Learning without Recall model, which in addition to providing a tractable framework for analyzing the behavior of rational agents in social networks, can also provide a behavioral foundation for the variety of non-Bayesian update rules in the literature. We present the implications of various choices for time-varying priors of such agents and how this choice affects learning and its rate.

1408.3115 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

On Data Preconditioning for Regularized Loss Minimization

关于正则化损失最小化的数据预处理

Tianbao Yang, Rong Jin, Shenghuo Zhu, Qihang Lin

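AI summary: The paper studies data preconditioning for speeding up the convergence of first-order methods in regularized loss minimization, analyzes how the condition number of the problem affects convergence, and proposes a random sampling approach for computing the preconditioned data efficiently.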

详情
AI中文摘要

在本文中,我们研究了数据预处理技术,这是一种已知且长期存在的技术,用于提升一阶方法在正则化损失最小化中的收敛速度。众所周知,问题的条件数,即Lipschitz常数与强凸模量的比值,对一阶优化方法的收敛性有显著影响。因此,最小化一个小的正则化损失以获得良好的泛化性能,导致产生一个病态的问题,成为大数据问题的瓶颈。我们为正则化损失最小化提供了数据预处理的理论。特别是,我们的分析展示了一个适当的数据预处理器,并 characterize 了损失函数和数据下的条件,使得数据预处理可以降低条件数,从而加速最小化正则化损失的收敛。为了使数据预处理在实践中有用,我们努力采用并分析一种随机采样方法,以高效计算预处理后的数据。初步实验验证了我们的理论。

英文摘要

In this work, we study data preconditioning, a well-known and long-existing technique, for boosting the convergence of first-order methods for regularized loss minimization. It is well understood that the condition number of the problem, i.e., the ratio of the Lipschitz constant to the strong convexity modulus, has a harsh effect on the convergence of the first-order optimization methods. Therefore, minimizing a small regularized loss for achieving good generalization performance, yielding an ill conditioned problem, becomes the bottleneck for big data problems. We provide a theory on data preconditioning for regularized loss minimization. In particular, our analysis exhibits an appropriate data preconditioner and characterizes the conditions on the loss function and on the data under which data preconditioning can reduce the condition number and therefore boost the convergence for minimizing the regularized loss. To make the data preconditioning practically useful, we endeavor to employ and analyze a random sampling approach to efficiently compute the preconditioned data. The preliminary experiments validate our theory.

1509.06458 2026-06-04 cs.LG cs.NA math.NA 版本更新

Harmonic Extension

调和扩展

Zuoqiang Shi, Jian Sun, Minghao Tian

AI总结 本文提出点积分方法(PIM)和体积约束方法(VCM)以解决调和扩展问题,改进传统图拉普拉斯方法的不足,应用于半监督学习中表现最佳。

Comments 10 pages, 2 figures

详情

英文摘要

In this paper, we consider the harmonic extension problem, which is widely used in many applications of machine learning. We find that the traditional graph Laplacian method fails to produce a good approximation of the classical harmonic function. To tackle this problem, we propose a new method called the point integral method (PIM). We consider the harmonic extension problem from the point of view of solving PDEs on manifolds. The basic idea of the PIM method is to approximate the harmonicity using an integral equation, which is easy to discretize from points. Based on the integral equation, we explain the reason why the traditional graph Laplacian may fail to approximate the harmonicity in the classical sense and propose a different approach which we call the volume constraint method (VCM). Theoretically, both the PIM and the VCM compute a harmonic function with convergence guarantees, and practically, they are both simple, each amounting to solving a linear system. One important application of the harmonic extension in machine learning is semi-supervised learning. We run a popular semi-supervised learning algorithm by Zhu et al. over a couple of well-known datasets and compare the performance of the aforementioned approaches. Our experiments show the PIM performs the best.
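
As background, the graph Laplacian baseline that the abstract criticizes can be sketched in a few lines: every unlabeled node is repeatedly set to the weighted average of its neighbors until the values stop changing. This is only the baseline, not the PIM or VCM of the paper; the adjacency-dictionary layout, the function name, and the five-node path graph are assumptions made for illustration.

```python
def harmonic_extension(weights, labels, sweeps=500):
    """Graph-Laplacian harmonic extension by Gauss-Seidel sweeps.

    `weights[u][v]` is the edge weight between nodes u and v (symmetric);
    `labels` maps the labeled nodes to their fixed values. Every unlabeled
    node converges to the weighted average of its neighbors.
    """
    values = {node: labels.get(node, 0.0) for node in weights}
    for _ in range(sweeps):
        for node, neighbors in weights.items():
            if node in labels:
                continue  # labeled values act as boundary conditions
            total = sum(neighbors.values())
            values[node] = sum(w * values[v] for v, w in neighbors.items()) / total
    return values

# Hypothetical path graph 0 - 1 - 2 - 3 - 4 with unit weights and labeled ends.
path = {
    0: {1: 1.0},
    1: {0: 1.0, 2: 1.0},
    2: {1: 1.0, 3: 1.0},
    3: {2: 1.0, 4: 1.0},
    4: {3: 1.0},
}
extended = harmonic_extension(path, labels={0: 0.0, 4: 1.0})
```

On a path graph the harmonic solution is linear interpolation between the two labeled endpoints, which makes the result easy to check by hand.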

1509.03946 2026-06-04 cs.LG cs.NA math.NA 版本更新

Parametric Maxflows for Structured Sparse Learning with Convex Relaxations of Submodular Functions

参数最大流用于结构稀疏学习的凸松弛子模函数

Yoshinobu Kawahara, Yutaro Yamaguchi

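AI summary: This paper uses parametric maxflow optimization to solve the proximal problems that arise in structured sparse learning with penalties obtained from convex relaxations of submodular functions, and shows that several existing structured penalties satisfy the required conditions, so regularized learning with them can be solved quickly.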

详情
AI中文摘要

通过凸松弛子模函数获得的结构惩罚的近端问题已被证明等价于在相应子模多面体上最小化可分离的凸函数。本文揭示了一类广泛的结构惩罚,这些问题可通过高效求解的参数最大流优化类解决。我们进一步表明,Gallo等人提出的参数最大流算法及其变种,其最坏情况下的计算成本仅为相应最大流优化计算的常数倍,可适应于解决这些惩罚的近端问题。几种现有结构惩罚满足这些条件;因此,使用这些惩罚的正则化学习可通过参数最大流算法快速求解。我们还研究了所提框架的实证运行时间性能。

英文摘要

The proximal problem for structured penalties obtained via convex relaxations of submodular functions is known to be equivalent to minimizing separable convex functions over the corresponding submodular polyhedra. In this paper, we reveal a comprehensive class of structured penalties for which this problem can be solved via an efficiently solvable class of parametric maxflow optimization. We then show that the parametric maxflow algorithm proposed by Gallo et al. and its variants, which run, in the worst case, at the cost of only a constant factor of a single computation of the corresponding maxflow optimization, can be adapted to solve the proximal problems for those penalties. Several existing structured penalties satisfy these conditions; thus, regularized learning with these penalties is solvable quickly using the parametric maxflow algorithm. We also investigate the empirical runtime performance of the proposed framework.

1509.02730 2026-06-04 eess.SY cs.DC cs.IT cs.LG cs.SY math.IT 版本更新

Finite Dictionary Variants of the Diffusion KLMS Algorithm

有限字典变体的扩散KLMS算法

Rangeet Mitra, Vimal Bhatia

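AI summary: This paper proposes two finite-dictionary variants of the diffusion KLMS algorithm that reduce storage requirements while maintaining convergence performance.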

详情
AI中文摘要

基于分布式学习的方法已被发现是处理网络上线性可分数据集学习的可行解决方案。然而,至今为止的方法仅适用于线性可分数据集,需要扩展到需要学习非线性的情况。在这些情况下,最近提出的扩散核最小均方(KLMS)方法比扩散最小均方(LMS)方法表现更好。扩散KLMS的缺点是需要无限存储观测(也称为字典)。本文在固定预算设置下提出了扩散KLMS,使得存储需求得以降低,同时在收敛性能方面保持相当的水平。仿真结果验证了两种新提出的算法,即量化扩散KLMS(QDKLMS)和固定预算扩散KLMS(FBDKLMS),与KLMS相比,这两种算法在减少字典大小存储需求的同时,表现出更好的性能。

英文摘要

Diffusion-based distributed learning approaches have been found to be a viable solution for learning over linearly separable datasets over a network. However, approaches to date are suitable only for linearly separable datasets and need to be extended to scenarios in which we need to learn a non-linearity. In such scenarios, the recently proposed diffusion kernel least mean squares (KLMS) has been found to perform better than diffusion least mean squares (LMS). The drawback of diffusion KLMS is that it requires infinite storage for observations (also called the dictionary). This paper formulates the diffusion KLMS in a fixed budget setting such that the storage requirement is curtailed while maintaining appreciable performance in terms of convergence. Simulations have been carried out to validate the two newly proposed algorithms, named quantised diffusion KLMS (QDKLMS) and fixed budget diffusion KLMS (FBDKLMS), against KLMS, which indicate that both the proposed algorithms deliver better performance as compared to the KLMS while reducing the dictionary size storage requirement.
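
The dictionary-growth issue can be illustrated with a single-node sketch of kernel LMS in which a new input is merged into its nearest stored center whenever it falls within a quantization radius. This follows the general idea of quantized KLMS for one node with scalar inputs; it is not the diffusion (networked) QDKLMS or FBDKLMS of the paper, and the class name, the Gaussian kernel, and all parameter values are assumptions.

```python
import math

class QuantizedKLMS:
    """Single-node kernel LMS with an optional quantized dictionary (sketch)."""

    def __init__(self, step=0.5, bandwidth=1.0, radius=0.0):
        self.step = step            # LMS step size
        self.bandwidth = bandwidth  # Gaussian kernel width
        self.radius = radius        # quantization radius; 0.0 gives plain KLMS
        self.centers = []           # the dictionary of stored inputs
        self.coeffs = []            # one expansion coefficient per center

    def _kernel(self, a, b):
        return math.exp(-((a - b) ** 2) / (2.0 * self.bandwidth ** 2))

    def predict(self, x):
        return sum(c * self._kernel(x, center)
                   for c, center in zip(self.coeffs, self.centers))

    def update(self, x, y):
        error = y - self.predict(x)
        if self.centers:
            nearest = min(range(len(self.centers)),
                          key=lambda i: abs(x - self.centers[i]))
            if abs(x - self.centers[nearest]) <= self.radius:
                # Close to an existing center: adjust its coefficient only.
                self.coeffs[nearest] += self.step * error
                return error
        # Otherwise grow the dictionary, as plain KLMS always does.
        self.centers.append(x)
        self.coeffs.append(self.step * error)
        return error

inputs = [0.0, 0.1, 2.0, 2.1]
plain = QuantizedKLMS(step=1.0, radius=0.0)
quantized = QuantizedKLMS(step=1.0, radius=0.5)
for x in inputs:
    plain.update(x, math.sin(x))
    quantized.update(x, math.sin(x))
```

With these four inputs the plain filter stores every sample, while the quantized one keeps a single center per neighborhood, which is the storage saving the fixed-budget variants aim for.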

1409.0553 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Sampling-based Approximations with Quantitative Performance for the Probabilistic Reach-Avoid Problem over General Markov Processes

基于采样的近似方法在一般马尔可夫过程的概率可达-避免问题中的定量性能

Sofie Haesaert, Robert Babuska, Alessandro Abate

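AI summary: This paper proposes an approximate computational scheme based on the Fitted Value Iteration algorithm for the probabilistic reach-avoid problem over general Markov processes, and provides computable error bounds that make the results meaningful for safety-critical applications.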

详情
AI中文摘要

本文研究了具有马尔可夫性质的随机过程,其在一般(不可数)状态空间中演化,并依赖于非确定性量(控制输入)影响概率动态。本文针对最大可达-避免规范的计算及相应最优控制器的合成进行了研究。可达-避免规范评估模型任何有限时间轨迹进入给定目标集的概率,同时避免给定的不期望状态集。本文提出基于随机采样的近似计算方案,基于拟合值迭代算法,并提供可先验计算的正式概率误差界,从而对数值方案的输出进行定量评估,使其对安全关键应用具有意义。此外,本文还提供了更紧的基于样本的概率误差界。整体计算方案与文献中的其他近似算法相关联,并最终在基准案例研究中评估其性能。

英文摘要

This article deals with stochastic processes endowed with the Markov (memoryless) property and evolving over general (uncountable) state spaces. The models further depend on a non-deterministic quantity in the form of a control input, which can be selected to affect the probabilistic dynamics. We address the computation of maximal reach-avoid specifications, together with the synthesis of the corresponding optimal controllers. The reach-avoid specification deals with assessing the likelihood that any finite-horizon trajectory of the model enters a given goal set, while avoiding a given set of undesired states. This article newly provides an approximate computational scheme for the reach-avoid specification based on the Fitted Value Iteration algorithm, which hinges on random sample extractions, and gives a-priori computable formal probabilistic bounds on the error made by the approximation algorithm: as such, the output of the numerical scheme is quantitatively assessed and thus meaningful for safety-critical applications. Furthermore, we provide tighter probabilistic error bounds that are sample-based. The overall computational scheme is put in relationship with alternative approximation algorithms in the literature, and finally its performance is practically assessed over a benchmark case study.

1404.5009 2026-06-04 cs.CV cs.LG cs.NA math.NA 版本更新

Efficient Semidefinite Branch-and-Cut for MAP-MRF Inference

用于MAP-MRF推断的高效半定分支切割法

Peng Wang, Chunhua Shen, Anton van den Hengel, Philip Torr

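AI summary: This paper proposes an efficient Branch-and-Cut method for general MAP-MRF inference whose bounding procedure combines scalable semidefinite programming with a cutting-plane method, and it achieves the best reported results when the connectivity is dense or the relative magnitudes of the unary costs are low.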

Comments 21 pages

详情
AI中文摘要

我们提出了一种分支定界(B&C)方法用于求解通用MAP-MRF推断问题。该方法的核心是一个非常高效的边界求解过程,结合了可扩展的半定规划(SDP)和切割平面法以寻找违反的约束。为了进一步加快计算,采用了模型简化、预热启动和移除不活跃约束等策略。我们分析了所提方法在不同设置下的性能,并证明我们的方法在性能上要么优于要么与最先进的方法相当。特别是当连接是密集的或当unary成本的相对大小较低时,我们实现了最佳报告结果。实验表明,所提出的算法在各种时间预算下,在具有挑战性的非子模MAP-MRF推断问题中优于最先进的方法。

英文摘要

We propose a Branch-and-Cut (B&C) method for solving general MAP-MRF inference problems. The core of our method is a very efficient bounding procedure, which combines scalable semidefinite programming (SDP) and a cutting-plane method for seeking violated constraints. In order to further speed up the computation, several strategies have been exploited, including model reduction, warm start and removal of inactive constraints. We analyze the performance of the proposed method under different settings, and demonstrate that our method either outperforms or performs on par with state-of-the-art approaches. Especially when the connectivities are dense or when the relative magnitudes of the unary costs are low, we achieve the best reported results. Experiments show that the proposed algorithm achieves better approximation than the state-of-the-art methods within a variety of time budgets on challenging non-submodular MAP-MRF inference problems.

1509.01352 2026-06-04 cs.LG cs.DC cs.IT cs.SY eess.SY math.IT 版本更新

Diffusion-KLMS Algorithm and its Performance Analysis for Non-Linear Distributed Networks

扩散-KLMS算法及其在非线性分布式网络中的性能分析

Rangeet Mitra, Vimal Bhatia

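AI summary: This paper proposes a diffusion-KLMS algorithm for non-linear distributed environments, shows in simulations that it converges better than algorithms of the same genre, and introduces a technique for predicting its transient and steady-state behaviour; the approach can be extended to cooperative spectrum sensing and massive MIMO receiver design for 5G communication systems.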

详情
AI中文摘要

在分布式网络环境中,扩散-最小二乘(LMS)算法比原始LMS算法收敛更快。观察到扩散-LMS通常优于其他分布式LMS算法,如空间LMS和增量LMS。然而,原始LMS和扩散-LMS在非线性环境中不适用,因为数据可能无法线性分离。文献中提出了一种称为核LMS(KLMS)的LMS变种,用于处理此类非线性。本文提出了一种适用于非线性分布式环境的扩散-LMS核化版本。仿真显示,所提方法的收敛性优于同类算法。我们还引入了一种技术,用于预测所提算法的暂态和稳态行为。本文提出的技术(或同类算法)可以轻松扩展到分布式参数估计应用,如协同频谱感知和大规模多输入多输出(MIMO)接收机设计,这些是5G通信系统中的潜在组件。

英文摘要

In a distributed network environment, the diffusion-least mean squares (LMS) algorithm gives faster convergence than the original LMS algorithm. It has also been observed that the diffusion-LMS generally outperforms other distributed LMS algorithms like spatial LMS and incremental LMS. However, both the original LMS and diffusion-LMS are not applicable in non-linear environments where data may not be linearly separable. A variant of LMS called kernel-LMS (KLMS) has been proposed in the literature for such non-linearities. In this paper, we propose a kernelised version of diffusion-LMS for non-linear distributed environments. Simulations show that the proposed approach has superior convergence as compared to algorithms of the same genre. We also introduce a technique to predict the transient and steady-state behaviour of the proposed algorithm. The techniques proposed in this work (or algorithms of the same genre) can be easily extended to distributed parameter estimation applications like cooperative spectrum sensing and massive multiple input multiple output (MIMO) receiver design, which are potential components for 5G communication systems.

1508.07416 2026-06-04 cs.CE cs.LG cs.NA math.NA 版本更新

Linked Component Analysis from Matrices to High Order Tensors: Applications to Biomedical Data

从矩阵到高阶张量的关联组件分析:应用于生物医学数据

Guoxu Zhou, Qibin Zhao, Yu Zhang, Tülay Adalı, Shengli Xie, Andrzej Cichocki

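AI summary: This paper reviews matrix-based component analysis methods for the joint analysis of multi-block data and their extension to multi-block multiway (tensor) data, with emphasis on how exploiting the multiway nature of the data helps extract common and individual features for biomedical data analysis.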

Comments 20 pages, 11 figures, Proceedings of the IEEE, 2015

详情
AI中文摘要

随着各种传感器技术的普及,我们能够获取大量多块(也称为多集、多关系或多视图)数据,需要联合分析以探索其潜在连接。各种组件分析方法在分析此类耦合数据中扮演着越来越重要的角色。本文首先简要回顾了现有基于矩阵(二维)的组件分析方法,用于此类数据的联合分析,重点在生物医学应用。然后,我们讨论了这些方法对多块多向(张量)数据的重要扩展和一般化。我们展示了如何通过约束多块张量分解方法提取相似或统计依赖的共同特征,这些特征被所有块共享,通过整合数据的多向性质。特别强调了多块数据的灵活共同和个体特征分析,旨在同时提取具有所需属性和类型多样性的共同和个体潜在组件。通过示例展示了其在生物医学数据分析中的有效性。

英文摘要

With the increasing availability of various sensor technologies, we now have access to large amounts of multi-block (also called multi-set, multi-relational, or multi-view) data that need to be jointly analyzed to explore their latent connections. Various component analysis methods have played an increasingly important role for the analysis of such coupled data. In this paper, we first provide a brief review of existing matrix-based (two-way) component analysis methods for the joint analysis of such data with a focus on biomedical applications. Then, we discuss their important extensions and generalization to multi-block multiway (tensor) data. We show how constrained multi-block tensor decomposition methods are able to extract similar or statistically dependent common features that are shared by all blocks, by incorporating the multiway nature of data. Special emphasis is given to the flexible common and individual feature analysis of multi-block data with the aim to simultaneously extract common and individual latent components with desired properties and types of diversity. Illustrative examples are given to demonstrate their effectiveness for biomedical data analysis.

1502.04390 2026-06-04 cs.LG cs.NA math.NA 版本更新

Equilibrated adaptive learning rates for non-convex optimization

非凸优化中的平衡自适应学习率

Yann N. Dauphin, Harm de Vries, Yoshua Bengio

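AI summary: This paper proposes ESGD, an adaptive learning rate scheme for non-convex optimization based on the equilibration preconditioner; experiments show that it converges as fast as or faster than RMSProp.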

详情
AI中文摘要

参数特定的自适应学习率方法是缓解训练大型深度网络时所遇病态问题的高效计算手段。近期研究有力地表明,训练此类网络时遇到的临界点大多是鞍点;在此基础上,我们探讨了考虑Hessian矩阵负特征值的存在如何帮助设计更合适的自适应学习率方案。我们证明流行的Jacobi预条件器在正负曲率同时存在时表现不佳,并给出理论和实验证据,表明所谓的平衡(equilibration)预条件器相对更适合非凸问题。我们基于平衡预条件器提出了一种新的自适应学习率方案ESGD。实验表明,ESGD的收敛速度与RMSProp相当或更优,并且始终明显优于普通的随机梯度下降。

英文摘要

Parameter-specific adaptive learning rate methods are computationally efficient ways to reduce the ill-conditioning problems encountered when training large deep networks. Following recent work that strongly suggests that most of the critical points encountered when training such networks are saddle points, we find how considering the presence of negative eigenvalues of the Hessian could help us design better suited adaptive learning rate schemes. We show that the popular Jacobi preconditioner has undesirable behavior in the presence of both positive and negative curvature, and present theoretical and empirical evidence that the so-called equilibration preconditioner is comparatively better suited to non-convex problems. We introduce a novel adaptive learning rate scheme, called ESGD, based on the equilibration preconditioner. Our experiments show that ESGD performs as well or better than RMSProp in terms of convergence speed, always clearly improving over plain stochastic gradient descent.
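
The contrast between the Jacobi and equilibration preconditioners can be shown on a tiny indefinite matrix. The sketch below assumes the usual formulation in which the equilibration scale of coordinate $i$ is the row norm $\|H_{i,\cdot}\|_2$, estimated as $\sqrt{\mathbb{E}[(Hv)_i^2]}$ with $v \sim \mathcal{N}(0, I)$. The explicit 2x2 Hessian, the sample count, and the function names are illustrative assumptions; a real implementation would use Hessian-vector products instead of a stored matrix.

```python
import math
import random

def matvec(H, v):
    return [sum(h * x for h, x in zip(row, v)) for row in H]

def estimate_equilibration(H, num_samples=20000, seed=0):
    """Monte Carlo estimate of sqrt(E[(H v)_i^2]) for v ~ N(0, I)."""
    rng = random.Random(seed)
    n = len(H)
    sums = [0.0] * n
    for _ in range(num_samples):
        v = [rng.gauss(0.0, 1.0) for _ in range(n)]
        hv = matvec(H, v)
        for i in range(n):
            sums[i] += hv[i] ** 2
    return [math.sqrt(s / num_samples) for s in sums]

# Hypothetical symmetric, indefinite Hessian (one positive, one negative eigenvalue).
H = [[1.0, 2.0], [2.0, -3.0]]
equilibration = estimate_equilibration(H)
row_norms = [math.sqrt(sum(h * h for h in row)) for row in H]  # exact target
jacobi = [abs(H[i][i]) for i in range(len(H))]                 # diagonal only
```

The Jacobi scales ignore the off-diagonal entries entirely, whereas the equilibration estimate converges to the full row norms, which is one way to see why the two preconditioners behave differently when curvature is mixed.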

1406.2082 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC stat.AP 版本更新

Fast and Flexible ADMM Algorithms for Trend Filtering

快速且灵活的ADMM算法用于趋势过滤

Aaditya Ramdas, Ryan J. Tibshirani

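AI summary: This paper proposes a fast and robust ADMM algorithm for trend filtering that addresses the lack of scalable, numerically stable solvers, and shows that it extends to related problems such as sparse trend filtering and isotonic trend filtering.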

Comments 22 pages, 10 figures; published in Journal of Computational and Graphical Statistics, 2015

详情
AI中文摘要

本文提出了一种快速且稳健的算法用于趋势过滤,一种最近发展的非参数回归工具。已证明,对于导数有界变差的估计函数,趋势过滤达到最小最大最优误差率,而其他流行方法如平滑样条和核方法无法达到。然而,限制其更广泛实际应用的是缺乏可扩展且数值稳定的算法来拟合趋势过滤估计。本文提出了一种高效专用的ADMM程序用于趋势过滤。我们的算法与当前使用的专用内点方法竞争,但更具数值鲁棒性。此外,所提出的ADMM实现非常简单,而且重要的是,它足够灵活,可以扩展到许多有趣的相关问题,如稀疏趋势过滤和等距趋势过滤。我们的方法的软件以C和R语言免费提供。

英文摘要

This paper presents a fast and robust algorithm for trend filtering, a recently developed nonparametric regression tool. It has been shown that, for estimating functions whose derivatives are of bounded variation, trend filtering achieves the minimax optimal error rate, while other popular methods like smoothing splines and kernels do not. Standing in the way of a more widespread practical adoption, however, is a lack of scalable and numerically stable algorithms for fitting trend filtering estimates. This paper presents a highly efficient, specialized ADMM routine for trend filtering. Our algorithm is competitive with the specialized interior point methods that are currently in use, and yet is far more numerically robust. Furthermore, the proposed ADMM implementation is very simple, and importantly, it is flexible enough to extend to many interesting related problems, such as sparse trend filtering and isotonic trend filtering. Software for our method is freely available, in both the C and R languages.

1507.03194 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

A Review of Nonnegative Matrix Factorization Methods for Clustering

聚类分析中非负矩阵因子化方法综述

Ali Caner Türkmen

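AI summary: This report reviews nonnegative matrix factorization methods for clustering and explores several variants along with their clustering interpretations.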

详情
AI中文摘要

非负矩阵因子化(NMF)最初作为低秩矩阵近似技术引入,已广泛应用于多个领域。尽管NMF似乎与聚类问题无关,但研究表明二者紧密相关。本文首先介绍了聚类和NMF的基础知识,然后探讨了多种NMF变体,包括稀疏NMF、投影NMF、非负光谱聚类和Cluster-NMF,以及它们的聚类解释。

英文摘要

Nonnegative Matrix Factorization (NMF) was first introduced as a low-rank matrix approximation technique, and has enjoyed a wide area of applications. Although NMF does not seem related to the clustering problem at first, it was shown that they are closely linked. In this report, we provide a gentle introduction to clustering and NMF before reviewing the theoretical relationship between them. We then explore several NMF variants, namely Sparse NMF, Projective NMF, Nonnegative Spectral Clustering and Cluster-NMF, along with their clustering interpretations.
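
One simple way to see the link between NMF and clustering is to factor a small block-structured matrix and assign each row to the factor with the largest weight. The sketch below uses the well-known multiplicative updates for the Frobenius objective; the helper names, the toy matrix, and the argmax assignment rule are assumptions for illustration and are not one of the specific variants reviewed in the report.

```python
import random

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def transpose(A):
    return [list(column) for column in zip(*A)]

def frobenius_error(V, W, H):
    WH = matmul(W, H)
    return sum((V[i][j] - WH[i][j]) ** 2
               for i in range(len(V)) for j in range(len(V[0])))

def nmf_multiplicative(V, rank, iters=200, seed=0, eps=1e-9):
    """Approximate V by W @ H with nonnegative factors (multiplicative updates)."""
    rng = random.Random(seed)
    W = [[rng.uniform(0.1, 1.0) for _ in range(rank)] for _ in range(len(V))]
    H = [[rng.uniform(0.1, 1.0) for _ in range(len(V[0]))] for _ in range(rank)]
    initial_error = frobenius_error(V, W, H)
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), elementwise.
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[a][j] * num[a][j] / (den[a][j] + eps) for j in range(len(H[0]))]
             for a in range(rank)]
        # W <- W * (V H^T) / (W H H^T), elementwise.
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[i][a] * num[i][a] / (den[i][a] + eps) for a in range(rank)]
             for i in range(len(W))]
    return W, H, initial_error

# Hypothetical data: rows 0-1 and rows 2-3 use disjoint sets of columns.
V = [[5.0, 4.0, 0.0, 0.0],
     [4.0, 5.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 4.0],
     [0.0, 0.0, 4.0, 3.0]]
W, H, initial_error = nmf_multiplicative(V, rank=2)
final_error = frobenius_error(V, W, H)
clusters = [max(range(2), key=lambda a: W[i][a]) for i in range(len(V))]
```

Reading the cluster of row `i` from the largest entry of `W[i]` is the simplest interpretation; the variants named in the abstract impose extra structure to make that reading better justified.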

1508.05873 2026-06-04 math.NA cs.LG cs.NA 版本更新

Stochastic Behavior of the Nonnegative Least Mean Fourth Algorithm for Stationary Gaussian Inputs and Slow Learning

非负最小均方四次算法在平稳高斯输入和慢速学习下的随机行为

Jingen Ni, Jian Yang, Jie Chen, Cédric Richard, José Carlos M. Bermudez

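AI summary: This paper studies the stochastic behavior of the nonnegative least mean fourth (NNLMF) algorithm for stationary Gaussian inputs and slow learning, and validates the theoretical analysis with simulations.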

Comments 11 pages, 8 figures, submitted for publication

详情
AI中文摘要

一些系统辨识问题由于未知系统的固有物理特性,要求对参数估计施加非负性约束。非负最小均方(NNLMS)算法及其变种允许在线解决此问题。最近提出了一种非负最小均方四次(NNLMF)算法,以在测量噪声非高斯的情况下提高这些算法的性能。本文首次对NNLMF算法在平稳高斯输入和慢速学习下的随机行为进行了理论分析。仿真结果展示了所提分析的准确性。

英文摘要

Some system identification problems impose nonnegativity constraints on the parameters to estimate due to inherent physical characteristics of the unknown system. The nonnegative least-mean-square (NNLMS) algorithm and its variants allow this problem to be addressed in an online manner. A nonnegative least mean fourth (NNLMF) algorithm has been recently proposed to improve the performance of these algorithms in cases where the measurement noise is not Gaussian. This paper provides a first theoretical analysis of the stochastic behavior of the NNLMF algorithm for stationary Gaussian inputs and slow learning. Simulation results illustrate the accuracy of the proposed analysis.

1508.05514 2026-06-04 stat.ML cs.CV cs.LG cs.RO cs.SY eess.SY 版本更新

Gaussian Mixture Reduction Using Reverse Kullback-Leibler Divergence

基于反向Kullback-Leibler散度的高斯混合减少

Tohid Ardeshiri, Umut Orguner, Emre Özkan

AI总结 本文提出一种贪心混合减少算法,基于Kullback-Leibler散度进行混合成分的剪枝与合并,通过分析近似方法提高计算效率,并在模拟和实际数据中验证其性能优于现有方法。

详情
AI中文摘要

我们提出了一种贪心的混合减少算法,能够基于Kullback-Leibler散度(KLD)对混合成分进行剪枝和合并。该算法不同于著名的Runnalls基于KLD的方法,因为它不限于合并操作。剪枝能力(除合并外)使算法在减少过程中能够保留原始混合的峰值。我们推导了解析近似以规避KLD在计算上的不可解性,从而得到一个计算高效的方法。所提出的算法在两个数值示例中与Runnalls和Williams的方法进行了比较,分别使用模拟数据和真实数据。结果表明,所提出方法的性能和计算复杂度使其成为现有混合减少方法的高效替代方案。

英文摘要

We propose a greedy mixture reduction algorithm which is capable of pruning mixture components as well as merging them based on the Kullback-Leibler divergence (KLD). The algorithm is distinct from the well-known Runnalls' KLD based method since it is not restricted to merging operations. The capability of pruning (in addition to merging) gives the algorithm the ability of preserving the peaks of the original mixture during the reduction. Analytical approximations are derived to circumvent the computational intractability of the KLD which results in a computationally efficient method. The proposed algorithm is compared with Runnalls' and Williams' methods in two numerical examples, using both simulated and real world data. The results indicate that the performance and computational complexity of the proposed approach make it an efficient alternative to existing mixture reduction methods.

1508.04467 2026-06-04 cs.CV cs.IT cs.LG cs.NA math.IT math.NA stat.ML 版本更新

Robust Subspace Clustering via Smoothed Rank Approximation

通过平滑秩近似实现鲁棒子空间聚类

Zhao Kang, Chong Peng, Qiang Cheng

AI总结 本文提出基于对数-行列式秩近似的方法,用于子空间聚类,以提高精度并有效处理误差和噪声。

Comments Journal, code is available

详情
Journal ref
IEEE Signal Processing Letters, 22(2015)2088-2092
AI中文摘要

受仿射约束的矩阵秩最小化问题出现在从信号处理到机器学习的许多应用领域中。核范数是该问题的凸松弛,可以在某些受限且理论上有趣的条件下精确恢复秩。然而,在许多实际应用中,用核范数近似秩函数只能得到远离最优解的结果。为了寻求比核范数更准确的解,本文提出基于对数-行列式的秩近似方法,并考虑将此秩近似应用于子空间聚类。我们的框架可以建模不同类型的误差和噪声。我们开发了有效的优化策略,并从理论上保证其收敛到驻点。与最先进的子空间聚类算法相比,所提出的方法在人脸聚类和运动分割任务上取得了有希望的结果。

英文摘要

Matrix rank minimization subject to affine constraints arises in many application areas, ranging from signal processing to machine learning. The nuclear norm is a convex relaxation for this problem which can recover the rank exactly under some restricted and theoretically interesting conditions. However, for many real-world applications, the nuclear norm approximation to the rank function can only produce a result far from the optimum. To seek a solution of higher accuracy than the nuclear norm, in this paper, we propose a rank approximation based on the Logarithm-Determinant. We consider using this rank approximation for subspace clustering. Our framework can model different kinds of errors and noise. An effective optimization strategy is developed with a theoretical guarantee of convergence to a stationary point. The proposed method gives promising results on face clustering and motion segmentation tasks compared to the state-of-the-art subspace clustering algorithms.
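To make the contrast between the rank, the nuclear norm, and a log-determinant surrogate concrete, here is a small sketch. It assumes the smoothed surrogate has the form logdet(I + XᵀX) = Σ log(1 + σᵢ²), which is one common log-det rank approximation; the exact form used in the paper may differ. The helper name `rank_surrogates` is illustrative.

```python
import numpy as np

def rank_surrogates(X):
    """Return the rank, nuclear norm, and an assumed log-det surrogate of X."""
    s = np.linalg.svd(X, compute_uv=False)          # singular values of X
    return {
        "rank": int(np.sum(s > 1e-10)),             # numerical rank
        "nuclear": float(np.sum(s)),                # sum of singular values
        "logdet": float(np.sum(np.log1p(s**2))),    # sum of log(1 + sigma_i^2)
    }
```

The nuclear norm grows linearly in each singular value, whereas the log-det term grows only logarithmically, so large singular values are penalized far less heavily. That is the sense in which such a surrogate can sit closer to the rank function than the nuclear norm does.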

1412.8293 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO 版本更新

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels

用于平移不变核的准蒙特卡洛特征映射

Haim Avron, Vikas Sindhwani, Jiyan Yang, Michael Mahoney

AI总结 本文提出用准蒙特卡洛方法改进随机傅里叶特征映射,以加速大规模数据集上核方法的训练和测试速度,通过低差异序列减少积分误差。

Comments A short version of this paper has been presented in ICML 2014

详情
AI中文摘要

我们考虑如何提高随机傅里叶特征映射的效率,以加速核方法在大规模数据集上的训练和测试。这些近似特征映射源自对平移不变核函数(如高斯核)积分表示的蒙特卡洛近似。本文提出改用准蒙特卡洛(QMC)近似,即在低差异点序列上而非蒙特卡洛方法中的随机点集上计算相关的被积函数。基于对给定序列积分误差的理论刻画,我们推导了一种新的差异度量,称为箱差异。随后我们提出通过显式最小化箱差异来学习适应我们设置的QMC序列。我们的理论分析辅以实验结果,证明了经典和自适应QMC技术在该问题上的有效性。

英文摘要

We consider the problem of improving the efficiency of randomized Fourier feature maps to accelerate training and testing speed of kernel methods on large datasets. These approximate feature maps arise as Monte Carlo approximations to integral representations of shift-invariant kernel functions (e.g., Gaussian kernel). In this paper, we propose to use Quasi-Monte Carlo (QMC) approximations instead, where the relevant integrands are evaluated on a low-discrepancy sequence of points as opposed to random point sets as in the Monte Carlo approach. We derive a new discrepancy measure called box discrepancy based on theoretical characterizations of the integration error with respect to a given sequence. We then propose to learn QMC sequences adapted to our setting based on explicit box discrepancy minimization. Our theoretical analyses are complemented with empirical results that demonstrate the effectiveness of classical and adaptive QMC techniques for this problem.
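A rough sketch of the classical (non-adaptive) QMC idea for the Gaussian kernel follows. It assumes a Halton sequence as the low-discrepancy point set and maps it through the inverse normal CDF to obtain frequencies; the learned sequences based on box discrepancy are not reproduced here. All function names are illustrative, and the real-valued cos/sin features stand in for the complex exponential features.

```python
import numpy as np
from statistics import NormalDist

PRIMES = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]

def van_der_corput(index, base):
    """Radical-inverse of a positive integer index in the given base."""
    value, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        value += digit / denom
    return value

def halton(num_points, dim):
    """First num_points Halton points in (0, 1)^dim, skipping index 0."""
    if dim > len(PRIMES):
        raise ValueError("this sketch supports at most 10 dimensions")
    return np.array([[van_der_corput(i, PRIMES[j]) for j in range(dim)]
                     for i in range(1, num_points + 1)])

def qmc_fourier_features(X, num_freqs, sigma):
    """Map X (n, d) to 2*num_freqs features approximating a Gaussian kernel."""
    U = halton(num_freqs, X.shape[1])
    # Inverse normal CDF turns uniform points into N(0, 1/sigma^2) frequencies.
    W = np.vectorize(NormalDist().inv_cdf)(U) / sigma
    P = X @ W.T
    return np.hstack([np.cos(P), np.sin(P)]) / np.sqrt(num_freqs)
```

With features `Z = qmc_fourier_features(X, D, sigma)`, the product `Z @ Z.T` approximates the kernel matrix with entries exp(-‖x - y‖² / (2σ²)). Replacing `halton` with uniform random draws recovers the usual Monte Carlo random Fourier features.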

1507.08847 2026-06-04 cs.LG cs.CV cs.NA math.NA 版本更新

A novel multivariate performance optimization method based on sparse coding and hyper-predictor learning

一种基于稀疏编码和超预测器学习的新型多变量性能优化方法

Jiachen Yang, Zhiyong Ding, Fei Guo, Huogen Wang, Nick Hughes

AI总结 本文提出一种新型方法,通过稀疏编码和超预测器学习优化多变量性能度量,通过联合优化问题最小化重建误差、稀疏性及复杂损失函数上界。

详情
AI中文摘要

本文研究了多变量性能度量的优化问题,并提出了一种新算法。与通过优化简单损失函数来学习预测函数的传统机器学习方法不同,本文研究如何针对数据点元组学习有效的超预测器,从而最小化对应于多变量性能度量的复杂损失函数。我们提出通过字典将数据点元组表示为稀疏码元组,然后应用线性函数将稀疏码与给定的候选类别标签进行比较。为了学习字典、稀疏码和线性函数参数,我们提出一个联合优化问题。在此问题中,同时最小化稀疏码的重建误差和稀疏性,以及复杂损失函数的上界。此外,损失函数的上界通过稀疏码和线性函数参数近似。为优化此问题,我们开发了一种基于梯度下降方法的迭代算法,交替学习稀疏码和超预测器参数。在一些基准数据集上的实验结果表明,所提方法优于其他最先进的算法。

英文摘要

In this paper, we investigate the problem of optimizing multivariate performance measures, and propose a novel algorithm for it. Unlike traditional machine learning methods, which optimize simple loss functions to learn a prediction function, the problem studied in this paper is how to learn an effective hyper-predictor for a tuple of data points, so that a complex loss function corresponding to a multivariate performance measure can be minimized. We propose to represent the tuple of data points as a tuple of sparse codes via a dictionary, and then apply a linear function to compare a sparse code against a given candidate class label. To learn the dictionary, the sparse codes, and the parameter of the linear function, we propose a joint optimization problem. In this problem, both the reconstruction error and sparsity of the sparse codes, and the upper bound of the complex loss function, are minimized. Moreover, the upper bound of the loss function is approximated by the sparse codes and the linear function parameter. To optimize this problem, we develop an iterative algorithm based on gradient descent methods to learn the sparse codes and the hyper-predictor parameter alternately. Experimental results on some benchmark data sets show the advantage of the proposed methods over other state-of-the-art algorithms.

1409.2848 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

A Stochastic PCA and SVD Algorithm with an Exponential Convergence Rate

具有指数收敛速率的随机PCA和SVD算法

Ohad Shamir

AI总结 提出VR-PCA算法,通过低计算成本的随机迭代实现快速收敛,解决传统方法收敛慢或计算强度大的问题。

Comments Fixed a minor bug in the proof of lemma 1 (which does not affect the result)

详情
AI中文摘要

我们描述并分析了一个用于主成分分析和奇异值分解的简单算法VR-PCA,该算法使用计算成本低的随机迭代,却能以指数速度收敛到最优解。相比之下,现有算法要么收敛速度慢,要么每次迭代计算量大,其运行时间随数据规模增长。该算法基于最近的方差减少随机梯度技术,此前该技术是针对强凸优化进行分析的,而此处我们将其应用于本质上非凸的问题,并采用了非常不同的分析方法。

英文摘要

We describe and analyze a simple algorithm for principal component analysis and singular value decomposition, VR-PCA, which uses computationally cheap stochastic iterations, yet converges exponentially fast to the optimal solution. In contrast, existing algorithms suffer either from slow convergence, or computationally intensive iterations whose runtime scales with the data size. The algorithm builds on a recent variance-reduced stochastic gradient technique, which was previously analyzed for strongly convex optimization, whereas here we apply it to an inherently non-convex problem, using a very different analysis.
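The abstract's combination of cheap stochastic iterations with variance reduction can be sketched as below for the leading principal component. This is an illustrative reading of the VR-PCA idea, not the author's reference implementation: each epoch computes one full pass `u = X^T X w_snap / n`, then runs stochastic steps that correct a single-sample term with `u` and renormalize. The function name, argument names, and the step-size choice in the usage note are assumptions.

```python
import numpy as np

def vr_pca(X, step, epochs, inner_iters, rng):
    """Estimate the top eigenvector of X^T X / n with variance-reduced steps."""
    n, d = X.shape
    w_snap = rng.standard_normal(d)
    w_snap /= np.linalg.norm(w_snap)
    for _ in range(epochs):
        u = X.T @ (X @ w_snap) / n           # full "gradient" at the snapshot
        w = w_snap.copy()
        for _ in range(inner_iters):
            x = X[rng.integers(n)]           # one random data point
            # Stochastic term at w, minus the same term at the snapshot, plus u.
            w = w + step * (x * (x @ w - x @ w_snap) + u)
            w /= np.linalg.norm(w)           # project back to the unit sphere
        w_snap = w
    return w_snap
```

A call such as `vr_pca(X, step, epochs=20, inner_iters=len(X), rng=np.random.default_rng(0))`, with `step` on the order of `1 / (r * sqrt(n))` where `r` is the average squared row norm, is a reasonable starting point. The correction term has zero mean and shrinks as `w` approaches the snapshot, which is what lets the noise vanish near the solution.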

1507.04396 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Parallel MMF: a Multiresolution Approach to Matrix Computation

并行MMF:矩阵计算的多分辨率方法

Risi Kondor, Nedelina Teneva, Pramod K. Mudrakarta

AI总结 本文提出并行MMF算法,用于多尺度结构分析和矩阵压缩,通过实验展示其在稀疏矩阵压缩和预处理中的有效性。

详情
AI中文摘要

多分辨率矩阵分解(MMF)最近被引入为一种寻找多尺度结构并定义图/矩阵上的小波的方法。在本文中,我们推导出pMMF,一种用于计算MMF分解的并行算法。经验上,pMMF的运行时间与稀疏矩阵的维度成线性关系。我们认为这使pMMF成为一种有价值的计算原语,并展示了将其用于两种不同目的的实验:压缩矩阵和预处理大型稀疏线性系统。

英文摘要

Multiresolution Matrix Factorization (MMF) was recently introduced as a method for finding multiscale structure and defining wavelets on graphs/matrices. In this paper we derive pMMF, a parallel algorithm for computing the MMF factorization. Empirically, the running time of pMMF scales linearly in the dimension for sparse matrices. We argue that this makes pMMF a valuable new computational primitive in its own right, and present experiments on using pMMF for two distinct purposes: compressing matrices and preconditioning large sparse linear systems.

1406.1102 2026-06-04 math.NA cs.LG cs.NA stat.CO stat.ML 版本更新

Linear Convergence of Variance-Reduced Stochastic Gradient without Strong Convexity

方差减少随机梯度算法在无强凸性下的线性收敛性

Pinghua Gong, Jieping Ye

AI总结 本文提出Prox-SVRG和VRPSG算法,证明在无强凸性条件下,这些算法在约束和正则化问题中实现线性收敛,引入Semi-Strongly Convex不等式作为关键理论贡献。

Comments 18 pages

详情
AI中文摘要

随机梯度算法通过仅使用一个或几个样本估计梯度,具有低的每迭代计算成本。它们在大规模优化问题中被广泛应用。然而,由于梯度计算中的固有方差,随机梯度算法通常收敛缓慢,具有亚线性收敛速率。为加速收敛,一些方差减少随机梯度算法,如近端随机方差减少梯度(Prox-SVRG)算法,最近被提出以解决强凸问题。在强凸条件下,这些方差减少随机梯度算法实现线性收敛速率。然而,许多机器学习问题是凸但非强凸的。在本文中,我们引入Prox-SVRG及其投影变种称为方差减少投影随机梯度(VRPSG)算法,以解决广泛用于机器学习的一类非强凸优化问题。作为本文的主要技术贡献,我们证明了VRPSG和Prox-SVRG在无强凸性条件下实现线性收敛速率。证明中的关键成分是一个半强凸(SSC)不等式,这是首次严格证明用于一类非强凸问题的约束和正则化设置中的不等式。此外,SSC不等式与算法无关,可能用于分析其他随机梯度算法,这可能具有独立价值。据我们所知,这是首次在无强凸性条件下建立方差减少随机梯度算法在解决约束和正则化问题中的线性收敛速率的工作。

英文摘要

Stochastic gradient algorithms estimate the gradient based on only one or a few samples and enjoy low computational cost per iteration. They have been widely used in large-scale optimization problems. However, stochastic gradient algorithms are usually slow to converge and achieve sub-linear convergence rates, due to the inherent variance in the gradient computation. To accelerate the convergence, some variance-reduced stochastic gradient algorithms, e.g., proximal stochastic variance-reduced gradient (Prox-SVRG) algorithm, have recently been proposed to solve strongly convex problems. Under the strongly convex condition, these variance-reduced stochastic gradient algorithms achieve a linear convergence rate. However, many machine learning problems are convex but not strongly convex. In this paper, we introduce Prox-SVRG and its projected variant called Variance-Reduced Projected Stochastic Gradient (VRPSG) to solve a class of non-strongly convex optimization problems widely used in machine learning. As the main technical contribution of this paper, we show that both VRPSG and Prox-SVRG achieve a linear convergence rate without strong convexity. A key ingredient in our proof is a Semi-Strongly Convex (SSC) inequality which is the first to be rigorously proved for a class of non-strongly convex problems in both constrained and regularized settings. Moreover, the SSC inequality is independent of algorithms and may be applied to analyze other stochastic gradient algorithms besides VRPSG and Prox-SVRG, which may be of independent interest. To the best of our knowledge, this is the first work that establishes the linear convergence rate for the variance-reduced stochastic gradient algorithms on solving both constrained and regularized problems without strong convexity.
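As an illustration of the regularized setting discussed above, the sketch below applies a Prox-SVRG-style loop to the lasso with more features than samples, which is convex but not strongly convex. It is a simplified variant under stated assumptions: the snapshot is set to the last inner iterate rather than an average, and the function names, step size, and problem sizes are illustrative choices rather than anything taken from the paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def lasso_objective(A, b, x, lam):
    r = A @ x - b
    return 0.5 * np.mean(r**2) + lam * np.sum(np.abs(x))

def prox_svrg_lasso(A, b, lam, step, epochs, inner_iters, rng):
    """Prox-SVRG-style solver for (1/2n)||Ax - b||^2 + lam * ||x||_1."""
    n, d = A.shape
    x_snap = np.zeros(d)
    for _ in range(epochs):
        full_grad = A.T @ (A @ x_snap - b) / n        # gradient at the snapshot
        x = x_snap.copy()
        for _ in range(inner_iters):
            a = A[rng.integers(n)]
            # Variance-reduced gradient: grad_i(x) - grad_i(x_snap) + full_grad.
            v = a * (a @ x - a @ x_snap) + full_grad
            x = soft_threshold(x - step * v, step * lam)
        x_snap = x                                     # last-iterate snapshot
    return x_snap
```

For the constrained (projected) counterpart, the soft-thresholding step would be replaced by a Euclidean projection onto the feasible set, which is the role VRPSG plays in the abstract.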

1507.00567 2026-06-04 eess.SY cs.AI cs.DC cs.LG cs.SE cs.SY 版本更新

Self-Learning Cloud Controllers: Fuzzy Q-Learning for Knowledge Evolution

自学习云控制器:用于知识演化的模糊Q学习

Pooyan Jamshidi, Amir Sharifloo, Claus Pahl, Andreas Metzger, Giovani Estrada

AI总结 本文提出FQL4KE自学习模糊云控制器,通过在运行时学习和修改模糊规则,使用户能通过调整优先级权重来指定控制器,而非复杂适应规则,实验表明其优于传统控制器。

详情
AI中文摘要

云控制器旨在通过在运行时自动扩展计算资源来响应应用需求,以满足性能保证并最小化资源成本。现有云控制器通常依赖预定义的适应规则集,但云服务提供商难以在设计时定义最优或预置的适应规则,因为上层应用是黑箱。因此,适应决策的负担常转嫁给云应用。然而,大多数情况下,应用开发者对云基础设施了解有限。本文提出在运行时学习适应规则。为此,我们引入FQL4KE,一种自学习模糊云控制器。FQL4KE在运行时学习和修改模糊规则。其优势在于设计云控制器时无需依赖仅靠精确的设计时知识,这可能难以获取。FQL4KE使用户能够通过简单调整代表系统目标优先级的权重来指定云控制器,而不是指定复杂的适应规则。FQL4KE的适用性已在云应用框架ElasticBench中得到实验评估。实验结果表明,FQL4KE优于我们之前开发的无学习机制的模糊控制器和原生Azure自动扩展。

英文摘要

Cloud controllers aim at responding to application demands by automatically scaling the compute resources at runtime to meet performance guarantees and minimize resource costs. Existing cloud controllers often resort to scaling strategies that are codified as a set of adaptation rules. However, for a cloud provider, applications running on top of the cloud infrastructure are more or less black-boxes, making it difficult at design time to define optimal or pre-emptive adaptation rules. Thus, the burden of taking adaptation decisions often is delegated to the cloud application. Yet, in most cases, application developers in turn have limited knowledge of the cloud infrastructure. In this paper, we propose learning adaptation rules during runtime. To this end, we introduce FQL4KE, a self-learning fuzzy cloud controller. In particular, FQL4KE learns and modifies fuzzy rules at runtime. The benefit is that for designing cloud controllers, we do not have to rely solely on precise design-time knowledge, which may be difficult to acquire. FQL4KE empowers users to specify cloud controllers by simply adjusting weights representing priorities in system goals instead of specifying complex adaptation rules. The applicability of FQL4KE has been experimentally assessed as part of the cloud application framework ElasticBench. The experimental results indicate that FQL4KE outperforms our previously developed fuzzy controller without learning mechanisms and the native Azure auto-scaling.

1507.00564 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Regularized linear system identification using atomic, nuclear and kernel-based norms: the role of the stability constraint

使用原子范数、核范数和基于核函数的范数的正则化线性系统辨识:稳定性约束的作用

Gianluigi Pillonetto, Tianshi Chen, Alessandro Chiuso, Giuseppe De Nicolao, Lennart Ljung

AI总结 本文比较了不同正则化方法在系统识别中的表现,发现稳定样条核在稳定性和平滑性方面表现更优,提出了新的正则化方法。

详情
AI中文摘要

受机器学习文献中思想的启发,新的正则化技术最近被引入线性系统辨识。具体而言,所有采用的估计器都求解正则化最小二乘问题,区别在于施加于脉冲响应的惩罚项的性质。常用的选择包括(应用于Hankel矩阵的)原子范数和核范数,以及由所谓稳定样条核诱导的范数。本文报告了基于这些不同类型正则化器的估计器的比较研究。我们的发现表明,稳定样条核优于基于原子范数和核范数的方法,因为它们恰当地嵌入了脉冲响应的稳定性和平滑性信息。这一点通过正则化的贝叶斯解释加以说明。我们还设计了一类由稳定样条/TC核的“积分”版本定义的新正则化器。在相当现实的实验条件下,即使经典预测误差方法配备了用于模型阶数选择的oracle,新估计器的表现仍优于它们。

英文摘要

Inspired by ideas taken from the machine learning literature, new regularization techniques have been recently introduced in linear system identification. In particular, all the adopted estimators solve a regularized least squares problem, differing in the nature of the penalty term assigned to the impulse response. Popular choices include atomic and nuclear norms (applied to Hankel matrices) as well as norms induced by the so called stable spline kernels. In this paper, a comparative study of estimators based on these different types of regularizers is reported. Our findings reveal that stable spline kernels outperform approaches based on atomic and nuclear norms since they suitably embed information on impulse response stability and smoothness. This point is illustrated using the Bayesian interpretation of regularization. We also design a new class of regularizers defined by "integral" versions of stable spline/TC kernels. Under quite realistic experimental conditions, the new estimators outperform classical prediction error methods also when the latter are equipped with an oracle for model order selection.

1507.00438 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

DC Proximal Newton for Non-Convex Optimization Problems

非凸优化问题的DC近端牛顿法

Alain Rakotomamonjy, Remi Flamary, Gilles Gasso

AI总结 本文提出一种新的非凸优化算法,通过近端牛顿法处理非凸损失和正则化函数,理论分析证明其极限点为DC目标函数的驻点,实验显示其在高维转导学习中更高效。

详情
AI中文摘要

我们介绍了一种新的算法,用于解决损失函数和正则项均为非凸但属于凸差(DC)函数类的学习问题。我们的贡献是一种能够处理此类情况的通用近端牛顿算法。该算法先由损失函数的近似得到下降方向,再通过线搜索确保充分下降。理论分析表明,所提出算法迭代点的极限点是DC目标函数的驻点。数值实验显示,在具有凸损失函数和非凸正则项的问题上,我们的方法比现有最先进方法更高效。我们还展示了该算法在损失函数和正则项均为非凸的高维转导学习问题中的优势。

英文摘要

We introduce a novel algorithm for solving learning problems where both the loss function and the regularizer are non-convex but belong to the class of difference of convex (DC) functions. Our contribution is a new general purpose proximal Newton algorithm that is able to deal with such a situation. The algorithm consists in obtaining a descent direction from an approximation of the loss function and then in performing a line search to ensure sufficient descent. A theoretical analysis is provided showing that the iterates of the proposed algorithm admit as limit points stationary points of the DC objective function. Numerical experiments show that our approach is more efficient than the current state of the art for a problem with a convex loss function and a non-convex regularizer. We have also illustrated the benefit of our algorithm in a high-dimensional transductive learning problem where both the loss function and the regularizer are non-convex.

1507.00421 2026-06-04 math.NA cs.LG cs.NA math.ST stat.ML stat.TH 版本更新

Categorical Matrix Completion

分类矩阵补全

Yang Cao, Yao Xie

AI总结 本文提出通过扩展一位矩阵补全方法,解决具有类别值的矩阵补全问题,通过核范数约束最大化似然比,建立理论误差界,并在MovieLens数据集上验证方法优势。

Comments Submitted

详情
AI中文摘要

我们考虑从不完整观测中补全具有类别值的矩阵问题,通过扩展一位矩阵补全的公式和理论实现。通过最大化似然比并约束X的核范数来恢复低秩矩阵X,观测通过多个链接函数映射自X的条目。我们建立了恢复误差的理论上界和下界,达到常数因子O(K^{3/2}),其中K是固定类别数。上界依赖于类别数通过最大化涉及链接函数平滑度的项。与一位矩阵补全相比,我们的边界在类别数平方根的阶数上是最佳的,这与类别数增加时问题变难的直觉一致。通过在MovieLens数据集上比较我们的方法与传统矩阵补全方法的性能,我们展示了方法的优势。

英文摘要

We consider the problem of completing a matrix with categorical-valued entries from partial observations. This is achieved by extending the formulation and theory of one-bit matrix completion. We recover a low-rank matrix $X$ by maximizing the likelihood ratio with a constraint on the nuclear norm of $X$, and the observations are mapped from entries of $X$ through multiple link functions. We establish theoretical upper and lower bounds on the recovery error, which meet up to a constant factor $\mathcal{O}(K^{3/2})$ where $K$ is the fixed number of categories. The upper bound in our case depends on the number of categories implicitly through a maximization of terms that involve the smoothness of the link functions. In contrast to one-bit matrix completion, our bounds for categorical matrix completion are optimal up to a factor on the order of the square root of the number of categories, which is consistent with an intuition that the problem becomes harder when the number of categories increases. By comparing the performance of our method with the conventional matrix completion method on the MovieLens dataset, we demonstrate the advantage of our method.

1506.08187 2026-06-04 math.OC cs.DS cs.LG cs.NA math.NA 版本更新

A geometric alternative to Nesterov's accelerated gradient descent

Nesterov加速梯度下降法的一种几何替代方法

Sébastien Bubeck, Yin Tat Lee, Mohit Singh

AI总结 本文提出了一种新的无约束优化方法,针对光滑强凸函数,达到Nesterov加速梯度下降的最优收敛速率,其几何解释灵感源自椭球法。

详情
AI中文摘要

我们提出了一种新的方法,用于无约束优化光滑且强凸函数,该方法达到了Nesterov加速梯度下降的最优收敛速率。新算法具有简单的几何解释,灵感松散地来源于椭球法。我们提供了一些数值证据表明,该方法可以优于Nesterov加速梯度下降法。

英文摘要

We propose a new method for unconstrained optimization of a smooth and strongly convex function, which attains the optimal rate of convergence of Nesterov's accelerated gradient descent. The new algorithm has a simple geometric interpretation, loosely inspired by the ellipsoid method. We provide some numerical evidence that the new method can be superior to Nesterov's accelerated gradient descent.

1506.07540 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Global Optimality in Tensor Factorization, Deep Learning, and Beyond

张量分解、深度学习及其他中的全局最优性

Benjamin D. Haeffele, Rene Vidal

AI总结 本文提出一个通用框架,分析非凸分解问题,证明局部最小值为全局最小值,并指导深度网络架构和正则化策略以提高优化效率。

详情
AI中文摘要

涉及分解的技术在广泛的应用中取得显著实证成功,但大多数问题的优化问题通常由于多线性形式或其他破坏凸性的转换而非凸。本文基于矩阵分解的凸松弛思想,提出一个通用框架,分析包括矩阵分解、张量分解和深度神经网络训练在内的非凸分解问题。我们推导出保证非凸优化问题局部最小值为全局最小值的充分条件,并证明如果分解变量的规模足够大,则从任何初始化出发,使用纯局部下降算法可以找到全局最小值。该框架还部分理论上解释了深度神经网络中ReLU的广泛应用,并提供指导以促进高效优化。

英文摘要

Techniques involving factorization are found in a wide range of applications and have enjoyed significant empirical success in many fields. However, common to a vast majority of these problems is the significant disadvantage that the associated optimization problems are typically non-convex due to a multilinear form or other convexity destroying transformation. Here we build on ideas from convex relaxations of matrix factorizations and present a very general framework which allows for the analysis of a wide range of non-convex factorization problems - including matrix factorization, tensor factorization, and deep neural network training formulations. We derive sufficient conditions to guarantee that a local minimum of the non-convex optimization problem is a global minimum and show that if the size of the factorized variables is large enough then from any initialization it is possible to find a global minimizer using a purely local descent algorithm. Our framework also provides a partial theoretical justification for the increasingly common use of Rectified Linear Units (ReLUs) in deep neural networks and offers guidance on deep network architectures and regularization strategies to facilitate efficient optimization.

1306.4905 2026-06-04 math.NA cs.LG cs.NA 版本更新

From-Below Approximations in Boolean Matrix Factorization: Geometry and New Algorithm

从下逼近在布尔矩阵分解中的应用:几何与新算法

Radim Belohlavek, Martin Trnecka

AI总结 本文提出布尔矩阵分解的新结果及新算法,强调从下逼近在输入矩阵中的重要性,并通过实验验证了算法在覆盖性和分解效率上的优势。

详情
AI中文摘要

我们提出了布尔矩阵分解的新结果及基于这些结果的新算法。这些结果强调了对输入矩阵提供从下逼近的分解的重要性。以往提出的算法未考虑不同矩阵元素可能具有的不同重要性,而我们的结果有助于衡量这种重要性,并提示在计算因子时应重点关注何处。在合成和真实数据上的实验评估表明,新算法表现良好:前k个因子覆盖率高,且精确分解所需的因子数少,在这些方面优于现有算法。我们还提出了未来的研究课题。

英文摘要

We present new results on Boolean matrix factorization and a new algorithm based on these results. The results emphasize the significance of factorizations that provide from-below approximations of the input matrix. While the previously proposed algorithms do not consider the possibly different significance of different matrix entries, our results help measure such significance and suggest where to focus when computing factors. An experimental evaluation of the new algorithm on both synthetic and real data demonstrates its good performance in terms of good coverage by the first k factors as well as a small number of factors needed for exact decomposition and indicates that the algorithm outperforms the available ones in these terms. We also propose future research topics.

1408.6141 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Recursive Total Least-Squares Algorithm Based on Inverse Power Method and Dichotomous Coordinate-Descent Iterations

基于逆幂方法和二分坐标下降迭代的递归总最小二乘算法

Reza Arablouei, Kutluyıl Doğançay, Stefan Werner

AI总结 本文提出一种基于逆幂方法和二分坐标下降迭代的递归总最小二乘算法,相比传统方法,计算复杂度更低且具有渐近无偏性和稳定性,同时推导了遗忘因子下界和稳态均方偏差理论值。

详情
AI中文摘要

我们开发了一种递归总最小二乘(RTLS)算法,用于误差变量系统辨识,利用逆幂方法和二分坐标下降(DCD)迭代。所提出的算法称为DCD-RTLS,优于基于线搜索方法的传统RTLS算法,具有更低的计算复杂度。我们对DCD-RTLS算法进行了全面分析,证明其渐近无偏性和均值稳定性。我们还发现了一个确保算法均方稳定的遗忘因子下界,并计算了理论稳态均方偏差(MSD)。通过仿真验证了所提算法的有效性和预测稳态MSD的准确性。

英文摘要

We develop a recursive total least-squares (RTLS) algorithm for errors-in-variables system identification utilizing the inverse power method and the dichotomous coordinate-descent (DCD) iterations. The proposed algorithm, called DCD-RTLS, outperforms the previously-proposed RTLS algorithms, which are based on the line-search method, with reduced computational complexity. We perform a comprehensive analysis of the DCD-RTLS algorithm and show that it is asymptotically unbiased as well as being stable in the mean. We also find a lower bound for the forgetting factor that ensures mean-square stability of the algorithm and calculate the theoretical steady-state mean-square deviation (MSD). We verify the effectiveness of the proposed algorithm and the accuracy of the predicted steady-state MSD via simulations.

1502.02251 2026-06-04 stat.ML cs.LG cs.RO cs.SY eess.SY 版本更新

From Pixels to Torques: Policy Learning with Deep Dynamical Models

从像素到扭矩:基于深度动态模型的策略学习

Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth

AI总结 本文提出一种高效的数据驱动强化学习算法,通过深度动态模型直接从像素信息学习闭环控制策略,解决高维观测下的连续状态-动作空间数据高效学习问题。

Comments 9 pages

详情
AI中文摘要

在开发完全自主系统中,利用非常高的维数观测进行数据高效学习连续状态-动作空间仍是一个关键挑战。本文考虑这一挑战的一个实例,即像素到扭矩问题,其中智能体必须仅从像素信息学习闭环控制策略。我们引入了一种数据高效、基于模型的强化学习算法,该算法直接从像素信息学习此类闭环策略。关键成分是深度动态模型,该模型使用深度自编码器学习图像的低维嵌入,并在该低维特征空间中学习预测模型。联合学习确保不仅静态属性,而且动态属性都被考虑在内。这对于长期预测至关重要,而长期预测是适应性模型预测控制策略的核心。与最先进的连续状态和动作强化学习方法相比,我们的方法学习速度快,可扩展到高维状态空间,并是向完全自主学习从像素到扭矩的重要一步。

英文摘要

Data-efficient learning in continuous state-action spaces using very high-dimensional observations remains a key challenge in developing fully autonomous systems. In this paper, we consider one instance of this challenge, the pixels to torques problem, where an agent must learn a closed-loop control policy from pixel information only. We introduce a data-efficient, model-based reinforcement learning algorithm that learns such a closed-loop policy directly from pixel information. The key ingredient is a deep dynamical model that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space. Joint learning ensures that not only static but also dynamic properties of the data are accounted for. This is crucial for long-term predictions, which lie at the core of the adaptive model predictive control strategy that we use for closed-loop control. Compared to state-of-the-art reinforcement learning methods for continuous states and actions, our approach learns quickly, scales to high-dimensional state spaces and is an important step toward fully autonomous learning from pixels to torques.

1310.0865 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Electricity Market Forecasting via Low-Rank Multi-Kernel Learning

通过低秩多核学习进行电力市场预测

Vassilis Kekatos, Yu Zhang, Georgios B. Giannakis

AI总结 本文通过低秩核学习方法对电力市场进行预测,利用核范数正则化选择定价节点和小时的核,提高预测精度和计算效率。

Comments 10 pages

详情
AI中文摘要

智能电网愿景需要借助先进的信息技术和数据分析,以提高电网基础设施的效率、可持续性和经济性。为此,本文利用现代统计学习工具进行电力市场推断。日前价格预测被表述为低秩核学习问题。通过独特地利用市场出清过程,拥堵模式被建模为时空变化价格矩阵中的秩一成分。通过一种新的基于核范数的正则化,可以系统地选择跨定价节点和小时的核。尽管全市场预测从学习角度看是有益的,但它需要处理高维市场数据。在设计出用于求解所涉及非凸优化问题的块坐标下降算法之后,后者才成为可能。该算法利用了块稀疏向量恢复的结果,并保证收敛到驻点。在中西部ISO(MISO)市场真实数据上的数值测试证实了所开发方法相对于现有替代方法的预测精度、计算效率和可解释性优势。

英文摘要

The smart grid vision entails advanced information technology and data analytics to enhance the efficiency, sustainability, and economics of the power grid infrastructure. Aligned to this end, modern statistical learning tools are leveraged here for electricity market inference. Day-ahead price forecasting is cast as a low-rank kernel learning problem. Uniquely exploiting the market clearing process, congestion patterns are modeled as rank-one components in the matrix of spatio-temporally varying prices. Through a novel nuclear norm-based regularization, kernels across pricing nodes and hours can be systematically selected. Even though market-wide forecasting is beneficial from a learning perspective, it involves processing high-dimensional market data. The latter becomes possible after devising a block-coordinate descent algorithm for solving the non-convex optimization problem involved. The algorithm utilizes results from block-sparse vector recovery and is guaranteed to converge to a stationary point. Numerical tests on real data from the Midwest ISO (MISO) market corroborate the prediction accuracy, computational efficiency, and the interpretative merits of the developed approach over existing alternatives.

1404.0466 2026-06-04 cs.LG cs.NA math.NA 版本更新

piCholesky: Polynomial Interpolation of Multiple Cholesky Factors for Efficient Approximate Cross-Validation

piCholesky:多项式插值多重Cholesky因子以实现高效的近似交叉验证

Da Kuang, Alex Gittens, Raffay Hamid

AI总结 通过多项式插值多个正则化参数下的Hessian矩阵Cholesky因子,实现高效近似交叉验证,减少计算成本并提供误差界分析。

详情
AI中文摘要

在使用牛顿法求解最小二乘问题时,Hessian矩阵的因子化通常是主要成本,尤其在多个正则化参数(λ)值下。我们提出了一种高效方法,通过插值少量λ值下的Hessian矩阵Cholesky因子,从而在仅付出少量成本的情况下,实现最优的hold-out误差最小化。我们为我们的近似方案提供了正式的误差界,并解决了关键的实现挑战,以充分利用现代架构的计算能力。我们对多个数据集进行了详尽的实证分析,以展示我们方法的有效性。

英文摘要

The dominant cost in solving least-squares problems using Newton's method is often that of factorizing the Hessian matrix over multiple values of the regularization parameter ($\lambda$). We propose an efficient way to interpolate the Cholesky factors of the Hessian matrix computed over a small set of $\lambda$ values. This approximation enables us to optimally minimize the hold-out error while incurring only a fraction of the cost compared to exact cross-validation. We provide a formal error bound for our approximation scheme and present solutions to a set of key implementation challenges that allow our approach to maximally exploit the compute power of modern architectures. We present a thorough empirical analysis over multiple datasets to show the effectiveness of our approach.
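The interpolation idea can be sketched as follows: compute exact Cholesky factors of `H + lam * I` at a few values of the regularization parameter, fit a low-degree polynomial in `lam` to every entry of the factor, and evaluate that polynomial at new values instead of refactorizing. This is a minimal sketch under the assumption of an entrywise least-squares polynomial fit; the function names, the grid, and the degree are illustrative, and the paper's error bounds and implementation optimizations are not reproduced.

```python
import numpy as np

def fit_cholesky_interpolant(H, lambdas, degree):
    """Fit one polynomial in lambda per entry of chol(H + lambda * I)."""
    d = H.shape[0]
    factors = np.array([np.linalg.cholesky(H + lam * np.eye(d)).ravel()
                        for lam in lambdas])
    # Vandermonde columns are lam^degree, ..., lam^1, lam^0.
    V = np.vander(np.asarray(lambdas, dtype=float), degree + 1)
    coeffs, *_ = np.linalg.lstsq(V, factors, rcond=None)
    return coeffs                                  # shape (degree + 1, d * d)

def eval_cholesky_interpolant(coeffs, lam, d):
    """Evaluate the fitted polynomials at lam and reshape to a d x d factor."""
    powers = float(lam) ** np.arange(coeffs.shape[0] - 1, -1, -1)
    return (powers @ coeffs).reshape(d, d)
```

Once `coeffs` is fitted from a handful of exact factorizations, each further candidate value of the regularization parameter costs only a polynomial evaluation plus triangular solves, which is where the saving over exact cross-validation would come from.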

1506.02649 2026-06-04 math.NA cs.LG cs.NA 版本更新

Faster SGD Using Sketched Conditioning

用Sketching方法加速SGD

Alon Gonen, Shai Shalev-Shwartz

AI总结 本文提出通过Sketching方法加速随机优化算法,通过构造低成本的conditioner提升SGD效率,并在深度学习中验证其有效性。

详情
AI中文摘要

本文提出了一种通过Sketching方法加速随机优化算法的新方法,该方法近期已成为加速数值线性代数算法的强大工具。我们重新审视了用于加速一阶方法的conditioning方法,并建议使用Sketching方法构造一个低成本的conditioner,从而在Stochastic Gradient Descent(SGD)算法中实现显著的速度提升。虽然我们的理论保证假设了凸性,但我们讨论了该方法在深度神经网络中的适用性,并通过实验展示了其优势。

英文摘要

We propose a novel method for speeding up stochastic optimization algorithms via sketching methods, which recently became a powerful tool for accelerating algorithms for numerical linear algebra. We revisit the method of conditioning for accelerating first-order methods and suggest the use of sketching methods for constructing a cheap conditioner that attains a significant speedup with respect to the Stochastic Gradient Descent (SGD) algorithm. While our theoretical guarantees assume convexity, we discuss the applicability of our method to deep neural networks, and experimentally demonstrate its merits.

1406.6603 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

A scaled gradient projection method for Bayesian learning in dynamical systems

一种用于动态系统贝叶斯学习的缩放梯度投影方法

Silvia Bonettini, Alessandro Chiuso, Marco Prato

AI总结 本文提出一种缩放梯度投影算法,用于解决贝叶斯学习中的非凸优化问题,通过有效设计缩放矩阵和步长参数,实现高效求解。

详情
Journal ref
SIAM Journal on Scientific Computing 37 (2015), A1297-A1318
AI中文摘要

系统辨识问题中的一项关键任务是选择最合适的模型类,传统上通过交叉验证或渐近论证来解决。近期文献提出可在贝叶斯框架中处理此问题,其中模型复杂度由少量超参数调节,这些超参数可通过边际似然最大化来估计。因此,设计有效的优化方法来求解相应的优化问题至关重要。若将未知脉冲响应建模为具有合适核的高斯过程,最大化边际似然会导致具有挑战性的非凸优化问题,需要稳定有效的求解策略。本文通过缩放梯度投影算法解决此问题,其中缩放矩阵和步长参数对于在与二阶方法相当的计算时间内得到有意义的解起着关键作用。特别地,我们提出了分裂梯度方法的一种推广,用于在存在框约束时设计缩放矩阵,并给出了梯度和目标函数的高效实现。在多个测试问题上进行的大量数值实验表明,该方法能够在十分之几秒内给出精度与最先进方法相当的解。此外,所提策略的灵活性使其易于适应机器学习、信号处理和系统辨识等不同领域中更广泛的问题。

英文摘要

A crucial task in system identification problems is the selection of the most appropriate model class, which is classically addressed by resorting to cross-validation or using asymptotic arguments. As recently suggested in the literature, this can be addressed in a Bayesian framework, where model complexity is regulated by a few hyperparameters, which can be estimated via marginal likelihood maximization. It is thus of primary importance to design effective optimization methods to solve the corresponding optimization problem. If the unknown impulse response is modeled as a Gaussian process with a suitable kernel, the maximization of the marginal likelihood leads to a challenging nonconvex optimization problem, which requires a stable and effective solution strategy. In this paper we address this problem by means of a scaled gradient projection algorithm, in which the scaling matrix and the steplength parameter play a crucial role in providing a meaningful solution in a computational time comparable with second-order methods. In particular, we propose both a generalization of the split gradient approach to design the scaling matrix in the presence of box constraints, and an effective implementation of the gradient and objective function. The extensive numerical experiments carried out on several test problems show that our method is very effective in providing, in a few tenths of a second, solutions with accuracy comparable with state-of-the-art approaches. Moreover, the flexibility of the proposed strategy makes it easily adaptable to a wider range of problems arising in different areas of machine learning, signal processing and system identification.

1506.02312 2026-06-04 cs.AI cs.LG cs.RO cs.SY eess.SY 版本更新

A Framework for Constrained and Adaptive Behavior-Based Agents

一种用于约束和自适应行为基 agent 的框架

Renato de Pontes Pereira, Paulo Martins Engel

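AI summary This paper proposes a framework that integrates Reinforcement Learning nodes into Behavior Trees to add learning capabilities to constrained agents, and relates the framework to Options in Hierarchical Reinforcement Learning, which ensures convergence of nested learning nodes.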

Comments 2015; 15 pages

详情
AI中文摘要

行为树常用于建模机器人和游戏中的 agent,其中必须由人类专家设计受约束的行为以确保 agent 在特定感知下执行特定动作链。在这些应用领域,学习是可取的,因为它能为 agent 提供适应和改进与人类和环境交互的能力,但往往被丢弃,因为其不可靠。本文提出一个框架,将强化学习节点作为行为树的一部分,以解决在受约束 agent 中添加学习能力的问题。我们展示了该框架与分层强化学习中选项的关系,确保嵌套学习节点的收敛性,并通过实验证明学习节点不会影响树中其他节点的执行。

英文摘要

Behavior Trees are commonly used to model agents for robotics and games, where constrained behaviors must be designed by human experts in order to guarantee that these agents will execute a specific chain of actions given a specific set of perceptions. In such application areas, learning is a desirable feature that gives agents the ability to adapt and improve their interactions with humans and the environment, but it is often discarded due to its unreliability. In this paper, we propose a framework that uses Reinforcement Learning nodes as part of Behavior Trees to address the problem of adding learning capabilities to constrained agents. We show how this framework relates to Options in Hierarchical Reinforcement Learning, ensuring convergence of nested learning nodes, and we empirically show that the learning nodes do not affect the execution of other nodes in the tree.

1504.00905 2026-06-04 math.OC cs.CV cs.LG cs.SY eess.SY 版本更新

Robust Anomaly Detection Using Semidefinite Programming

使用半定规划进行鲁棒异常检测

Jose A. Lopez, Octavia Camps, Mario Sznaier

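AI summary This paper proposes a new anomaly detection method based on polynomial optimization and the method of moments. It requires only the statistical moments of the normal-state feature distribution, compares favorably with approaches such as Parzen windows and 1-class SVM, and gives a succinct description of the normal state that simplifies anomaly detection on higher-dimensional datasets.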

Comments 13 pages, 11 figures

详情
AI中文摘要

本文提出了一种基于多项式优化和矩方法的新方法,用于异常检测问题。所提出的方法仅需了解感兴趣特征的正常状态分布的统计矩信息,并在性能上优于现有方法(如Parzen窗口和1类SVM)。此外,它还提供了对正常状态的简洁描述,因此在处理高维数据集时,导致异常检测问题显著简化。

英文摘要

This paper presents a new approach, based on polynomial optimization and the method of moments, to the problem of anomaly detection. The proposed technique only requires information about the statistical moments of the normal-state distribution of the features of interest and compares favorably with existing approaches (such as Parzen windows and 1-class SVM). In addition, it provides a succinct description of the normal state. Thus, it leads to a substantial simplification of the anomaly detection problem when working with higher-dimensional datasets.

1406.5286 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

Enhancing Pure-Pixel Identification Performance via Preconditioning

通过预条件化增强纯像素识别性能

Nicolas Gillis, Wing-Kin Ma

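AI summary This paper analyzes several preconditionings intended to make pure-pixel search algorithms more robust. For the successive projection algorithm (SPA), it extends the robustness analysis to approximate solutions of the underlying SDP, and it examines the robustness and efficiency of pre-whitening and of an SPA-based preconditioning.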

Comments 25 pages, 3 figures

详情
Journal ref
SIAM J. on Imaging Sciences 8 (2), pp. 1161-1186, 2015
AI中文摘要

在本文中,我们分析了不同预条件化方法以增强纯像素搜索算法的鲁棒性,这些算法用于盲超谱解混,并等价于近可分离的非负矩阵分解算法。我们的分析聚焦于 successive projection algorithm (SPA),一种简单、高效且可证明鲁棒的纯像素算法。最近,Gillis和Vavasis(arXiv:1310.2273)提出了一种可证明鲁棒的预条件化方法,该方法需要求解一个半正定规划(SDP)以找到包含数据点的最小体积椭球。由于在高精度下求解SDP可能耗时,我们扩展了鲁棒性分析以适用于SDP的近似解,即目标函数值与最优值相差某些乘法因子的解。证明了高精度解对鲁棒性并不关键,这为更快的预条件化方法(例如基于一阶优化方法的)铺平了道路。这一贡献也使我们能够为另外两种预条件化方法提供鲁棒性分析。第一种是预白化,可以解释为同一SDP的最优解加上额外约束。我们分析了预白化的鲁棒性,以表征其在某些情况下与基于SDP的预条件化方法具有竞争力的情况。第二种基于SPA本身,可以解释为SDP松弛的最优解。它在多个合成数据集上与基于SDP的预条件化方法竞争。

英文摘要

In this paper, we analyze different preconditionings designed to enhance robustness of pure-pixel search algorithms, which are used for blind hyperspectral unmixing and which are equivalent to near-separable nonnegative matrix factorization algorithms. Our analysis focuses on the successive projection algorithm (SPA), a simple, efficient and provably robust algorithm in the pure-pixel algorithm class. Recently, a provably robust preconditioning was proposed by Gillis and Vavasis (arXiv:1310.2273) which requires the resolution of a semidefinite program (SDP) to find a minimum-volume ellipsoid enclosing the data points. Since solving the SDP to high precision can be time-consuming, we generalize the robustness analysis to approximate solutions of the SDP, that is, solutions whose objective function values are within some multiplicative factor of the optimal value. It is shown that a high-accuracy solution is not crucial for robustness, which paves the way for faster preconditionings (e.g., based on first-order optimization methods). This first contribution also allows us to provide a robustness analysis for two other preconditionings. The first one is pre-whitening, which can be interpreted as an optimal solution of the same SDP with additional constraints. We analyze the robustness of pre-whitening, which allows us to characterize situations in which it performs competitively with the SDP-based preconditioning. The second one is based on SPA itself and can be interpreted as an optimal solution of a relaxation of the SDP. It is extremely fast while competing with the SDP-based preconditioning on several synthetic data sets.

1406.4802 2026-06-04 math.NA cs.LG cs.NA 版本更新

Homotopy based algorithms for $\ell_0$-regularized least-squares

基于同伦的 $\ell_0$-正则化最小二乘算法

Charles Soussen, Jérôme Idier, Junbo Duan, David Brie

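AI summary This paper proposes two heuristic search algorithms for $\ell_0$-homotopy, which address $\ell_0$-regularized least-squares minimization for a continuum of regularization parameters in sparse signal restoration, and evaluates them on difficult inverse problems.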

Comments 38 pages

详情
Journal ref
IEEE Transactions on Signal Processing, vol. 63, no. 13, Jul. 2015, pp. 3301-3316
AI中文摘要

稀疏信号恢复通常被表述为最小化二次成本函数 $\|y-Ax\|_2^2$,其中 A 是字典,x 是未知的稀疏向量。众所周知,施加 $\ell_0$ 约束会导致 NP 难的最小化问题。凸松弛方法受到广泛关注,其中 $\ell_0$-范数被 $\ell_1$-范数替代。在许多高效的 $\ell_1$ 解决方案中,同伦算法最小化 $\|y-Ax\|_2^2+λ\|x\|_1$ 关于 x 对于连续的 $λ$ 的情况。它受到 $\ell_1$-正则化路径的分段正则性的启发,也称为同伦路径。在本文中,我们处理 $\|y-Ax\|_2^2+λ\|x\|_0$ 对于连续 $λ$ 的最小化问题,并提出两种启发式搜索算法用于 $\ell_0$-同伦。继续单最佳替换是扩展单最佳替换算法的前向-后向贪心策略,之前提出用于给定 $λ$ 的 $\ell_0$-最小化。$λ$ 值的自适应搜索受到 $\ell_1$-同伦的启发。$\ell_0$ 正则化路径下降是一种更复杂的算法,利用 $\ell_0$-正则化路径的结构特性,该路径对 $λ$ 是分段常数的。两种算法都对困难的逆问题进行了实证评估,涉及病态字典。最后,我们展示它们可以轻松地与常规的模型阶选择方法结合。

英文摘要

Sparse signal restoration is usually formulated as the minimization of a quadratic cost function $\|y-Ax\|_2^2$, where $A$ is a dictionary and $x$ is an unknown sparse vector. It is well-known that imposing an $\ell_0$ constraint leads to an NP-hard minimization problem. The convex relaxation approach has received considerable attention, where the $\ell_0$-norm is replaced by the $\ell_1$-norm. Among the many efficient $\ell_1$ solvers, the homotopy algorithm minimizes $\|y-Ax\|_2^2+\lambda\|x\|_1$ with respect to $x$ for a continuum of $\lambda$'s. It is inspired by the piecewise regularity of the $\ell_1$-regularization path, also referred to as the homotopy path. In this paper, we address the minimization problem $\|y-Ax\|_2^2+\lambda\|x\|_0$ for a continuum of $\lambda$'s and propose two heuristic search algorithms for $\ell_0$-homotopy. Continuation Single Best Replacement is a forward-backward greedy strategy extending the Single Best Replacement algorithm, previously proposed for $\ell_0$-minimization at a given $\lambda$. The adaptive search of the $\lambda$-values is inspired by $\ell_1$-homotopy. $\ell_0$ Regularization Path Descent is a more complex algorithm exploiting the structural properties of the $\ell_0$-regularization path, which is piecewise constant with respect to $\lambda$. Both algorithms are empirically evaluated for difficult inverse problems involving ill-conditioned dictionaries. Finally, we show that they can be easily coupled with usual methods of model order selection.
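
The forward-backward idea behind Single Best Replacement at a fixed $\lambda$ can be illustrated as follows. This is a naive sketch under stated assumptions, not the paper's implementation: the names `l0_cost` and `single_best_replacement` are hypothetical, every candidate support is re-fitted by least squares from scratch (no efficient updates), and the continuation over $\lambda$ that the paper contributes is not shown.

```python
import numpy as np

def l0_cost(A, y, support, lam):
    """||y - A_S x_S||_2^2 + lam * |S|, with x_S the least-squares fit on S."""
    if not support:
        return float(y @ y)
    cols = sorted(support)
    x_s, *_ = np.linalg.lstsq(A[:, cols], y, rcond=None)
    r = y - A[:, cols] @ x_s
    return float(r @ r) + lam * len(cols)

def single_best_replacement(A, y, lam, max_iter=100):
    """Greedily insert or remove the single index that lowers the cost most."""
    support, best = set(), l0_cost(A, y, set(), lam)
    for _ in range(max_iter):
        # support ^ {j} inserts j if absent and removes it if present.
        cost, j = min((l0_cost(A, y, support ^ {j}, lam), j)
                      for j in range(A.shape[1]))
        if cost >= best - 1e-12:
            break  # no single replacement improves the objective
        support ^= {j}
        best = cost
    return sorted(support), best

# Toy dictionary (assumed): identity columns, signal supported on {1, 4}.
A_demo = np.eye(6)
y_demo = 3.0 * A_demo[:, 1] - 2.0 * A_demo[:, 4]
support_demo, cost_demo = single_best_replacement(A_demo, y_demo, lam=0.5)
```

With an orthonormal dictionary the greedy search recovers the true support; the ill-conditioned dictionaries targeted by the paper are where the backward (removal) moves matter.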

1505.04123 2026-06-04 cs.LG cs.AI cs.NA math.NA math.OC 版本更新

Margins, Kernels and Non-linear Smoothed Perceptrons

边距、核与非线性平滑感知机

Aaditya Ramdas, Javier Peña

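AI summary This paper studies the problem of finding a non-linear classification function in an RKHS, derives an accelerated smoothed algorithm whose convergence rate resembles that of the classical kernelized Perceptron, and proves a separation theorem for the case where no such classifier exists.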

Comments 17 pages, published in the proceedings of the International Conference on Machine Learning, 2014

详情
Journal ref
Ramdas, Aaditya, and Javier Pena. "Margins, kernels and non-linear smoothed perceptrons." Proceedings of the 31st International Conference on Machine Learning (ICML-14). 2014
AI中文摘要

我们关注在RKHS中寻找非线性分类函数的问题,从原问题和对偶问题两个角度出发,特别关注感知机和冯-诺依曼算法的推广。我们将问题转化为在RKHS中最大化正则化归一化硬边距(ρ),并利用表示定理将其转换为与核的(归一化和带符号)Gram矩阵相关的马哈拉诺斯基点积/半范数。我们推导出一种加速平滑算法,具有收敛率为√(log n)/ρ的特性,给定n个可分离点。当不存在此类分类器时,我们证明了RKHS版本的戈尔丹分离定理,并重新解释了负边距。这使得我们能够为原对偶算法提供保证,该算法在存在可行原问题时,可在min{√n/|ρ|, √n/ε}次迭代中找到RKHS中的完美分离器,或在无可行原问题时提供一个对偶ε-不可行性证书。

英文摘要

We focus on the problem of finding a non-linear classification function that lies in a Reproducing Kernel Hilbert Space (RKHS) both from the primal point of view (finding a perfect separator when one exists) and the dual point of view (giving a certificate of non-existence), with special focus on generalizations of two classical schemes: the Perceptron (primal) and Von Neumann (dual) algorithms. We cast our problem as one of maximizing the regularized normalized hard-margin ($\rho$) in an RKHS and use the Representer Theorem to rephrase it in terms of a Mahalanobis dot-product/semi-norm associated with the kernel's (normalized and signed) Gram matrix. We derive an accelerated smoothed algorithm with a convergence rate of $\tfrac{\sqrt{\log n}}{\rho}$ given $n$ separable points, which is strikingly similar to the classical kernelized Perceptron algorithm whose rate is $\tfrac{1}{\rho^2}$. When no such classifier exists, we prove a version of Gordan's separation theorem for RKHSs, and give a reinterpretation of negative margins. This allows us to give guarantees for a primal-dual algorithm that halts in $\min\{\tfrac{\sqrt{n}}{|\rho|}, \tfrac{\sqrt{n}}{\epsilon}\}$ iterations with a perfect separator in the RKHS if the primal is feasible or a dual $\epsilon$-certificate of near-infeasibility.
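
For reference, the classical kernelized Perceptron that this abstract uses as its baseline can be sketched as below. This shows only the classical scheme, not the accelerated smoothed algorithm of the paper; the RBF kernel, the function names, and the XOR-style toy data are assumptions made for illustration.

```python
import numpy as np

def rbf_gram(X, gamma=1.0):
    """Gram matrix of the RBF kernel exp(-gamma * ||x_i - x_j||^2)."""
    sq = np.sum(X ** 2, axis=1)
    return np.exp(-gamma * (sq[:, None] + sq[None, :] - 2.0 * X @ X.T))

def kernel_perceptron(K, y, max_epochs=100):
    """Return mistake counts alpha; the score of point i is sum_j alpha_j y_j K[j, i]."""
    alpha = np.zeros(len(y))
    for _ in range(max_epochs):
        mistakes = 0
        for i in range(len(y)):
            if y[i] * np.dot(alpha * y, K[:, i]) <= 0:
                alpha[i] += 1.0  # mistake-driven update on point i
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return alpha

# XOR-style data (assumed): not linearly separable, separable with an RBF kernel.
X_demo = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y_demo = np.array([1.0, 1.0, -1.0, -1.0])
K_demo = rbf_gram(X_demo)
alpha_demo = kernel_perceptron(K_demo, y_demo)
pred_demo = np.sign(K_demo @ (alpha_demo * y_demo))
```

The update only ever touches the Gram matrix, which is why the analysis in the abstract can be phrased entirely in terms of that matrix.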

1412.6095 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

Theoretical and Numerical Analysis of Approximate Dynamic Programming with Approximation Errors

近似动态规划中近似误差的理论与数值分析

Ali Heydari

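AI summary This paper studies how the approximation errors made at each iteration of approximate dynamic programming affect the final result. It analyzes convergence of the value iteration scheme for deterministic nonlinear optimal control problems and derives sufficient conditions for stability, together with an estimate of the region of attraction.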

Comments This study is the counterpart of another work of the author (arXiv:1412.5675) which was for value iterations with initial stabilizing guess (with overlaps on Theorem 1 and Lemma 1). As for the revision on this work, some steps of proofs are updated and an explanation about the approximation error is included. Initial submission date: 12/18/2014

详情
AI中文摘要

本文旨在回答近似动态规划(ADP)每次迭代中的近似误差如何影响最终结果的问题。研究了在考虑每次迭代中的误差影响下,确定性非线性最优控制问题中价值迭代方案的收敛性。通过已知的一般最优控制问题中的量和可验证的假设,获得了围绕最优解的有界性。此外,由于近似误差导致结果偏离最优性,推导了在有限次价值迭代后获得的结果所操作系统的稳定性充分条件,以及其吸引区域的估计。最后,通过轨道机动问题的实现过程验证了理论发展的假设,并应用充分条件以保证稳定性和近优性。

英文摘要

This study is aimed at answering the famous question of how the approximation errors at each iteration of Approximate Dynamic Programming (ADP) affect the quality of the final results, considering the fact that errors at each iteration affect the next iteration. To this end, convergence of the Value Iteration scheme of ADP for deterministic nonlinear optimal control problems with undiscounted cost functions is investigated while considering the errors existing in approximating the respective functions. The boundedness of the results around the optimal solution is obtained based on quantities which are known in a general optimal control problem and assumptions which are verifiable. Moreover, since the presence of the approximation errors leads to the deviation of the results from optimality, sufficient conditions for stability of the system operated by the result obtained after a finite number of value iterations, along with an estimation of its region of attraction, are derived in terms of a calculable upper bound of the control approximation error. Finally, the implementation of the method on an orbital maneuver problem is investigated, through which the assumptions made in the theoretical developments are verified and the sufficient conditions are applied to guarantee stability and near optimality.

1312.7651 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Petuum: A New Platform for Distributed Machine Learning on Big Data

Petuum:一种用于大数据上分布式机器学习的新平台

Eric P. Xing, Qirong Ho, Wei Dai, Jin Kyu Kim, Jinliang Wei, Seunghak Lee, Xun Zheng, Pengtao Xie, Abhimanu Kumar, Yaoliang Yu

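AI summary This paper proposes a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale machine learning. It builds on the observation that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions, which enables an efficient system design.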

Comments 15 pages, 10 figures, final version in KDD 2015 under the same title

详情
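AI abstract (translated from Chinese)

What is a systematic way to efficiently apply a wide spectrum of advanced machine learning programs to industrial-scale problems, using Big Models (up to hundreds of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling that go beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized graph-based execution that relies on graph representations of ML programs. The variety of approaches tends to pull system and algorithm design in different directions, and it remains difficult to find a universal platform applicable to a wide range of ML programs at scale. We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML, by observing that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions. This presents unique opportunities for an integrative system design, such as bounded-error network synchronization and dynamic scheduling based on ML program structure. We demonstrate the efficacy of these system designs against well-known implementations of modern ML algorithms: ML programs run in much less time and at considerably larger model sizes, even on modestly sized compute clusters.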

英文摘要

What is a systematic way to efficiently apply a wide spectrum of advanced ML programs to industrial scale problems, using Big Models (up to 100s of billions of parameters) on Big Data (up to terabytes or petabytes)? Modern parallelization strategies employ fine-grained operations and scheduling beyond the classic bulk-synchronous processing paradigm popularized by MapReduce, or even specialized graph-based execution that relies on graph representations of ML programs. The variety of approaches tends to pull systems and algorithms design in different directions, and it remains difficult to find a universal platform applicable to a wide range of ML programs at scale. We propose a general-purpose framework that systematically addresses data- and model-parallel challenges in large-scale ML, by observing that many ML programs are fundamentally optimization-centric and admit error-tolerant, iterative-convergent algorithmic solutions. This presents unique opportunities for an integrative system design, such as bounded-error network synchronization and dynamic scheduling based on ML program structure. We demonstrate the efficacy of these system designs versus well-known implementations of modern ML algorithms, allowing ML programs to run in much less time and at considerably larger model sizes, even on modestly-sized compute clusters.

1503.05214 2026-06-04 cs.DC cs.LG cs.NA math.NA 版本更新

Analysis of PCA Algorithms in Distributed Environments

分布式环境中PCA算法的分析

Tarek Elgamal, Mohamed Hefeeda

AI总结 本文分析了分布式环境中PCA算法的性能,比较了时间复杂度和通信复杂度,探讨了不同算法的可扩展性瓶颈及适用场景。

详情
AI中文摘要

经典机器学习算法在处理大规模数据时常面临可扩展性瓶颈,因其最初设计用于处理小数据集。本文分析了不同计算PCA算法的方法,并评论其在支持大规模数据集时的局限性。方法在时间复杂度和通信复杂度两个重要指标上进行分析和比较。我们考虑了两种指标的最坏情况,并识别了实现每种方法的软件库。本文分析帮助研究人员和工程师理解不同PCA算法的可扩展性瓶颈,选择适合的应用和数据集特性的方法和软件库,并设计新的可扩展PCA算法。

英文摘要

Classical machine learning algorithms often face scalability bottlenecks when they are applied to large-scale data. Such algorithms were designed to work with small data that is assumed to fit in the memory of one machine. In this report, we analyze different methods for computing an important machine learning algorithm, namely Principal Component Analysis (PCA), and we comment on its limitations in supporting large datasets. The methods are analyzed and compared across two important metrics: time complexity and communication complexity. We consider the worst-case scenarios for both metrics, and we identify the software libraries that implement each method. The analysis in this report helps researchers and engineers in (i) understanding the main bottlenecks for scalability in different PCA algorithms, (ii) choosing the most appropriate method and software library for a given application and data set characteristics, and (iii) designing new scalable PCA algorithms.
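
Two common single-machine routes to PCA are sketched below to make the comparison concrete. The abstract does not name the methods the report covers, so treating eigendecomposition of the covariance matrix and SVD of the centered data as representative is an assumption, as are the function names. The first route forms a d-by-d covariance matrix, the second works on the n-by-d data directly; neither addresses the distributed setting discussed in the report.

```python
import numpy as np

def pca_eig(X, d):
    """Top-d principal directions from the sample covariance matrix."""
    Xc = X - X.mean(axis=0)
    cov = Xc.T @ Xc / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)        # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:d]
    return vecs[:, order]

def pca_svd(X, d):
    """Top-d principal directions from the SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[:d].T

# Synthetic data (assumed) with clearly separated variances per coordinate.
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((200, 5)) * np.array([5.0, 3.0, 1.0, 0.5, 0.1])
V_eig, V_svd = pca_eig(X_demo, 2), pca_svd(X_demo, 2)
```

The two bases can differ by sign, so a fair comparison looks at the projectors `V @ V.T` rather than at the columns themselves.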

1311.2854 2026-06-04 cs.LG cs.NA math.NA 版本更新

Spectral Clustering via the Power Method -- Provably

通过幂法进行谱聚类--可证明的

Christos Boutsidis, Alex Gittens, Prabhanjan Kambadur

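AI summary This paper proposes computing approximate eigenvectors for spectral clustering with the power method and proves that a small number of iterations suffices to obtain near-optimal partitionings.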

Comments ICML 2015, to appear

详情
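AI abstract (translated from Chinese)

Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large-scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix of the graph that represents the data. One way to speed up the computation of these eigenvectors is to use the "power method" from the numerical linear algebra literature. Although the power method has been used empirically to speed up spectral clustering, the theory behind this approach has, to the best of our knowledge, remained unexplored. This paper provides the first rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings from the approximate eigenvectors. Specifically, we prove that solving the k-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the k-means problem on the optimal eigenvectors.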

英文摘要

Spectral clustering is one of the most important algorithms in data mining and machine intelligence; however, its computational complexity limits its application to truly large scale data analysis. The computational bottleneck in spectral clustering is computing a few of the top eigenvectors of the (normalized) Laplacian matrix corresponding to the graph representing the data to be clustered. One way to speed up the computation of these eigenvectors is to use the "power method" from the numerical linear algebra literature. Although the power method has been empirically used to speed up spectral clustering, the theory behind this approach, to the best of our knowledge, remains unexplored. This paper provides the first such rigorous theoretical justification, arguing that a small number of power iterations suffices to obtain near-optimal partitionings using the approximate eigenvectors. Specifically, we prove that solving the $k$-means clustering problem on the approximate eigenvectors obtained via the power method gives an additive-error approximation to solving the $k$-means problem on the optimal eigenvectors.
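
A rough sketch of the power-method step follows. It is an illustration under assumptions rather than the paper's procedure: the iteration is applied to a normalized adjacency matrix, the block is re-orthonormalized by QR after every multiplication (a standard stabilization chosen here), and the names `normalized_adjacency` and `approx_top_subspace` are hypothetical. The rows of the returned matrix are what would be passed to $k$-means, which is omitted.

```python
import numpy as np

def normalized_adjacency(W):
    """D^{-1/2} W D^{-1/2} for a symmetric weight matrix with positive degrees."""
    d = W.sum(axis=1)
    return W / np.sqrt(np.outer(d, d))

def approx_top_subspace(M, k, power=20, seed=0):
    """Approximate the span of the top-k eigenvectors of M by block power iteration."""
    rng = np.random.default_rng(seed)
    Y = rng.standard_normal((M.shape[0], k))
    for _ in range(power):
        Y, _ = np.linalg.qr(M @ Y)  # multiply, then re-orthonormalize
    return Y

# Toy graph (assumed): two dense groups of four nodes joined by weak edges.
W_demo = np.full((8, 8), 0.01)
W_demo[:4, :4] = 1.0
W_demo[4:, 4:] = 1.0
Y_demo = approx_top_subspace(normalized_adjacency(W_demo), k=2)
```

On this toy graph nodes in the same group receive identical embedding rows, so any reasonable clustering of the rows separates the two groups.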

1505.02343 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Bayesian Sparse Tucker Models for Dimension Reduction and Tensor Completion

基于贝叶斯稀疏Tucker模型的降维与张量补全

Qibin Zhao, Liqing Zhang, Andrzej Cichocki

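AI summary This paper proposes probabilistic generative Tucker models that use structural sparsity over the multilinear latent space for tensor dimension reduction and completion; the models adapt their complexity automatically and aim to improve generalization performance.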

详情
AI中文摘要

Tucker分解是现代张量数据分析机器学习的核心,广泛应用于多向特征提取、压缩感知和张量补全。最具有挑战性的问题是确定模型复杂度(即多线性秩),尤其是在存在噪声和缺失数据时。此外,现有方法无法考虑潜在因子的不确定性信息,导致泛化性能低下。为了解决这些问题,我们提出了一类概率生成Tucker模型,用于张量分解与补全,具有结构稀疏性。为了利用结构稀疏建模,我们引入了两种组稀疏诱导先验,通过Laplace和学生t分布的分层表示,从而实现完全后验推断。对于模型学习,我们推导了所有模型(超)参数上的变分贝叶斯推断,并开发了基于多线性操作的高效可扩展算法。我们的方法可以自动适应模型复杂度,并通过模型证据的最大下界原理推断最优多线性秩。在合成、化学计量学和神经影像数据上的实验结果和比较显示,我们的模型在恢复多线性秩和缺失条目方面表现出色。

英文摘要

Tucker decomposition is the cornerstone of modern machine learning on tensorial data analysis, which has attracted considerable attention for multiway feature extraction, compressive sensing, and tensor completion. The most challenging problem is related to determination of model complexity (i.e., multilinear rank), especially when noise and missing data are present. In addition, existing methods cannot take into account uncertainty information of latent factors, resulting in low generalization performance. To address these issues, we present a class of probabilistic generative Tucker models for tensor decomposition and completion with structural sparsity over multilinear latent space. To exploit structural sparse modeling, we introduce two group sparsity inducing priors by hierarchical representation of Laplace and Student-t distributions, which facilitates full posterior inference. For model learning, we derive variational Bayesian inference over all model (hyper)parameters, and develop efficient and scalable algorithms based on multilinear operations. Our methods can automatically adapt model complexity and infer an optimal multilinear rank by the principle of maximum lower bound of model evidence. Experimental results and comparisons on synthetic, chemometrics and neuroimaging data demonstrate remarkable performance of our models for recovering the ground truth of multilinear rank and missing entries.

1505.00314 2026-06-04 cs.LG cs.SY eess.SY stat.ME 版本更新

Deconstructing Principal Component Analysis Using a Data Reconciliation Perspective

从数据协调视角解构主成分分析

Shankar Narasimhan, Nirav Bhatt

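AI summary This paper examines principal component analysis from a data reconciliation perspective, reveals the close relationship between the two techniques, builds a unified framework, and shows how they can be applied together to process data.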

详情
Journal ref
Computers and Chemical Engineering 77 (2015) 74-84
AI中文摘要

数据协调(DR)和主成分分析(PCA)是过程工业中两种流行的数据分析技术。数据协调用于从错误测量中获得准确且一致的变量和参数估计。PCA主要用于减少高维数据的维度并作为去噪预处理技术。这两种技术曾被独立开发和部署。本文的主要目的是阐明这两种看似不同的技术之间的密切关系。这导致了PCA和DR的统一框架。进一步,我们展示了如何将这两种技术以协作和一致的方式应用于数据处理。该框架已扩展以处理部分测量系统,并纳入关于过程模型的部分知识。

英文摘要

Data reconciliation (DR) and Principal Component Analysis (PCA) are two popular data analysis techniques in process industries. Data reconciliation is used to obtain accurate and consistent estimates of variables and parameters from erroneous measurements. PCA is primarily used as a method for reducing the dimensionality of high dimensional data and as a preprocessing technique for denoising measurements. These techniques have been developed and deployed independently of each other. The primary purpose of this article is to elucidate the close relationship between these two seemingly disparate techniques. This leads to a unified framework for applying PCA and DR. Further, we show how the two techniques can be deployed together in a collaborative and consistent manner to process data. The framework has been extended to deal with partially measured systems and to incorporate partial knowledge available about the process model.

1402.5284 2026-06-04 math.OC cs.LG cs.NA math.NA 版本更新

Convergence results for projected line-search methods on varieties of low-rank matrices via Łojasiewicz inequality

关于通过Łojasiewicz不等式在低秩矩阵流形上投影线搜索方法收敛性的结果

Reinhold Schneider, André Uschmajew

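AI summary Using a Łojasiewicz inequality, this paper studies the convergence of projected line-search methods on the variety of low-rank matrices, explains how such methods circumvent the nonclosedness and unbounded curvature of the fixed-rank manifold, and gives a justification for assuming that critical points lie on the smooth part of the variety.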

详情
AI中文摘要

本文旨在推导在实代数流形$\mathcal{M}_{\le k}$上投影线搜索方法的收敛性结果,该流形由实$m \times n$矩阵构成,其秩至多为$k$。此类方法扩展了用于光滑流形$\mathcal{M}_k$上的黎曼优化方法,通过在切锥中沿梯度相关方向取步长,然后投影回$\mathcal{M}_{\le k}$。考虑此类方法可避免$\mathcal{M}_k$的非闭合性和无界曲率带来的困难。基于实解析函数的点收敛性,利用投影反梯度到切锥的Łojasiewicz不等式获得收敛性结果。如果极限点位于$\mathcal{M}_{\le k}$的光滑部分,即$\mathcal{M}_k$中,则结果退化为已知结果,但能通过极限点在光滑流形上获得渐近收敛速率估计,无需先验曲率界。同时,可以给出假设临界点位于$\mathcal{M}_k$的理论依据:若$X$是$f$在$\mathcal{M}_{\le k}$上的临界点,则$X$的秩为$k$或$\nabla f(X) = 0$。

英文摘要

The aim of this paper is to derive convergence results for projected line-search methods on the real-algebraic variety $\mathcal{M}_{\le k}$ of real $m \times n$ matrices of rank at most $k$. Such methods extend Riemannian optimization methods, which are successfully used on the smooth manifold $\mathcal{M}_k$ of rank-$k$ matrices, to its closure by taking steps along gradient-related directions in the tangent cone, and afterwards projecting back to $\mathcal{M}_{\le k}$. Considering such a method circumvents the difficulties which arise from the nonclosedness and the unbounded curvature of $\mathcal{M}_k$. The pointwise convergence is obtained for real-analytic functions on the basis of a Łojasiewicz inequality for the projection of the antigradient to the tangent cone. If the derived limit point lies on the smooth part of $\mathcal{M}_{\le k}$, i.e. in $\mathcal{M}_k$, this boils down to more or less known results, but with the benefit that asymptotic convergence rate estimates (for specific step-sizes) can be obtained without an a priori curvature bound, simply from the fact that the limit lies on a smooth manifold. At the same time, one can give a convincing justification for assuming critical points to lie in $\mathcal{M}_k$: if $X$ is a critical point of $f$ on $\mathcal{M}_{\le k}$, then either $X$ has rank $k$, or $\nabla f(X) = 0$.
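
The "step, then project back to $\mathcal{M}_{\le k}$" pattern can be sketched as follows. This is a simplified illustration, not the method analyzed in the paper: it takes a plain gradient step with a fixed step size instead of a line search along a direction in the tangent cone, and the names `project_rank` and `projected_gradient` are hypothetical. The projection onto matrices of rank at most $k$ is the truncated SVD.

```python
import numpy as np

def project_rank(X, k):
    """Best approximation of X of rank at most k (truncated SVD)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :k] * s[:k]) @ Vt[:k]

def projected_gradient(grad, X0, k, step=0.5, n_iter=200):
    """Fixed-step gradient descent followed by projection onto rank <= k."""
    X = project_rank(X0, k)
    for _ in range(n_iter):
        X = project_rank(X - step * grad(X), k)
    return X

# Toy objective (assumed): f(X) = 0.5 * ||X - B||_F^2, whose gradient is X - B.
rng = np.random.default_rng(0)
B_demo = rng.standard_normal((6, 5))
X_hat = projected_gradient(lambda X: X - B_demo, np.zeros((6, 5)), k=2)
```

For this objective the iterates converge to the best rank-2 approximation of `B_demo`, which has rank exactly 2 and therefore lies on the smooth part of the variety, the case the abstract singles out.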

1504.05517 2026-06-04 cs.NI cs.LG cs.SY eess.SY 版本更新

Online Learning Algorithm for Time Series Forecasting Suitable for Low Cost Wireless Sensor Networks Nodes

适用于低成本无线传感器网络节点的时间序列预测在线学习算法

Juan Pardo, Francisco Zamora-Martinez, Paloma Botella-Rocamora

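AI summary This paper proposes an on-line learning algorithm based on back-propagation that performs time series forecasting on a low-cost system-on-chip, with the aim of improving the energy efficiency of HVAC systems in smart homes.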

Comments 28 pages, Published 21 April 2015 at MDPI's journal "Sensors"

详情
Journal ref
Sensors 2015, 15(4), 9277-9304
AI中文摘要

时间序列预测是一种重要的预测方法,可应用于广泛的问题。特别是,预测室内温度可以提高家庭HVAC系统的利用效率,从而改善能源效率。本文描述了如何在低成本系统级芯片上实现人工神经网络(ANN)算法,以开发自主智能无线传感器网络。本文使用无线传感器网络(WSN)监控和预测智能家居中的室内温度,基于低成本微控制器技术(8051MCU)。开发了一种基于反向传播(BP)算法的在线学习方法,用于实时时间序列学习。该方法通过每次新数据到达系统时进行模型训练,而不保存大量数据创建历史数据库。为了验证该方法,通过贝叶斯基线模型进行了模拟研究,以与实际应用数据库进行比较,以查看性能和准确性。本文的核心是一种基于BP的新型算法,详细描述了该算法,并探讨了如何在简单架构和极少数硬件资源上实现计算密集型算法的挑战。

英文摘要

Time series forecasting is an important predictive methodology which can be applied to a wide range of problems. In particular, forecasting the indoor temperature permits improved utilization of the HVAC (Heating, Ventilating and Air Conditioning) systems in a home and thus better energy efficiency. To this end, the paper describes how to implement an Artificial Neural Network (ANN) algorithm on a low-cost system-on-chip to develop an autonomous intelligent wireless sensor network. The paper uses a Wireless Sensor Network (WSN) to monitor and forecast the indoor temperature in a smart home, based on low-resource, low-cost microcontroller technology such as the 8051MCU. An on-line learning approach, based on the Back-Propagation (BP) algorithm for ANNs, has been developed for real-time time series learning. It trains the model with every new datum that arrives at the system, without saving large quantities of data to build a historical database as is usual, i.e., without previous knowledge. To validate the approach, a simulation study using a Bayesian baseline model has been carried out and compared against a database from a real application, in order to assess performance and accuracy. The core of the paper is a new algorithm, based on BP, which is described in detail; the challenge was how to implement a computationally demanding algorithm on a simple architecture with very few hardware resources.
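
The "train on every new sample, store no history" idea can be sketched with the simplest possible network. This is an assumption-laden illustration, not the paper's algorithm: it uses a single linear neuron, for which back-propagation reduces to the delta rule, and the function name, lag count, learning rate, and sinusoidal test signal are all hypothetical. Only the last `n_lags` values are kept in memory.

```python
import math

def online_forecaster(stream, n_lags=2, lr=0.2):
    """Yield (prediction, error) per sample; keeps only the last n_lags values."""
    weights = [0.0] * n_lags
    window = []
    for value in stream:
        if len(window) == n_lags:
            pred = sum(w * x for w, x in zip(weights, window))
            err = value - pred
            # One gradient step on the squared error, then the sample is dropped.
            weights = [w + lr * err * x for w, x in zip(weights, window)]
            yield pred, err
        window = (window + [value])[-n_lags:]

# Synthetic signal (assumed): a noiseless sinusoid standing in for temperature.
signal = [math.sin(0.3 * t) for t in range(3000)]
errors = [abs(e) for _, e in online_forecaster(signal)]
early = sum(errors[:100]) / 100
late = sum(errors[-100:]) / 100
```

Memory use is constant in the length of the stream, which is the property that matters on a microcontroller with very limited resources.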

1403.5045 2026-06-04 cs.LG cs.AI cs.SY eess.SY stat.ML 版本更新

Matroid Bandits: Fast Combinatorial Optimization with Learning

Matroid Bandits: 快速组合优化中的学习

Branislav Kveton, Zheng Wen, Azin Ashkan, Hoda Eydgahi, Brian Eriksson

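AI summary This paper introduces matroid bandits, which combine bandits and matroids. It proposes the Optimistic Matroid Maximization (OMM) algorithm for learning to maximize a stochastic modular function on a matroid and proves two upper bounds on its regret.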

详情
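AI abstract (translated from Chinese)

A matroid is a notion of independence in combinatorial optimization that is closely related to computational efficiency. This paper combines bandits and matroids and proposes matroid bandits, in which the objective is to learn to maximize a stochastic modular function on a matroid. We propose a practical algorithm, Optimistic Matroid Maximization (OMM), and prove two upper bounds on its regret. Both bounds are sublinear in time and at most linear in all other quantities of interest. The gap-dependent upper bound is tight, and we prove a matching lower bound on a partition matroid bandit. Finally, we evaluate the method on three real-world problems and show that it is practical.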

英文摘要

A matroid is a notion of independence in combinatorial optimization which is closely related to computational efficiency. In particular, it is well known that the maximum of a constrained modular function can be found greedily if and only if the constraints are associated with a matroid. In this paper, we bring together the ideas of bandits and matroids, and propose a new class of combinatorial bandits, matroid bandits. The objective in these problems is to learn how to maximize a modular function on a matroid. This function is stochastic and initially unknown. We propose a practical algorithm for solving our problem, Optimistic Matroid Maximization (OMM); and prove two upper bounds, gap-dependent and gap-free, on its regret. Both bounds are sublinear in time and at most linear in all other quantities of interest. The gap-dependent upper bound is tight and we prove a matching lower bound on a partition matroid bandit. Finally, we evaluate our method on three real-world problems and show that it is practical.
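
The greedy maximization that the abstract refers to can be sketched with an independence oracle. The sketch assumes known, nonnegative weights and therefore shows only the offline step; it does not implement OMM or any of its learning machinery, and the names `greedy_max_basis` and `one_per_group` are hypothetical. The example uses a partition matroid in which at most one item per group may be chosen.

```python
def greedy_max_basis(weights, is_independent):
    """Scan items by decreasing weight, keeping each one that preserves independence."""
    chosen = []
    for item in sorted(weights, key=weights.get, reverse=True):
        if is_independent(chosen + [item]):
            chosen.append(item)
    return chosen

def one_per_group(items):
    """Partition matroid oracle: the first character of an item names its group."""
    groups = [item[0] for item in items]
    return len(groups) == len(set(groups))

# Toy weights (assumed): two groups, "a" and "b", with two items each.
weights_demo = {"a1": 0.9, "a2": 0.5, "b1": 0.2, "b2": 0.7}
basis_demo = greedy_max_basis(weights_demo, one_per_group)
```

In a bandit setting the weights are unknown, so a learning algorithm would have to run this kind of greedy step on estimates rather than on the true values.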

1504.02125 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Residential Demand Response Applications Using Batch Reinforcement Learning

利用批量强化学习的住宅需求响应应用

Frederik Ruelens, Bert Claessens, Stijn Vandael, Bart De Schutter, Robert Babuska, Ronnie Belmans

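AI summary This paper applies batch reinforcement learning to demand response. It proposes a policy adjustment method that incorporates expert knowledge without a system identification step, and shows how a model-free Monte-Carlo estimator can be used to find an open-loop schedule.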

Comments Submitted to Trans. on Smart Grid

详情
AI中文摘要

受近期批量强化学习(RL)进展的推动,本文致力于将批量RL应用于需求响应。与传统基于模型的方法不同,批量RL技术不需要系统辨识步骤,使其更适合大规模实施。本文将标准批量RL技术拟合Q迭代扩展到提供外生数据预测的情况。通常,批量RL技术不依赖系统动态或解决方案的专家知识,但若提供一些专家知识,可通过我们的新型策略调整方法进行整合。最后,我们解决了参与日前市场的开放循环调度挑战。我们提出了一种模型无关的蒙特卡洛估计器方法,利用指标构建人工轨迹,并通过寻找热泵恒温器的日前调度来展示此方法。实验表明,批量RL技术为基于模型的控制器提供了有价值的替代方案,并可用于构建闭环和开环策略。

英文摘要

Driven by recent advances in batch Reinforcement Learning (RL), this paper contributes to the application of batch RL to demand response. In contrast to conventional model-based approaches, batch RL techniques do not require a system identification step, which makes them more suitable for a large-scale implementation. This paper extends fitted Q-iteration, a standard batch RL technique, to the situation where a forecast of the exogenous data is provided. In general, batch RL techniques do not rely on expert knowledge on the system dynamics or the solution. However, if some expert knowledge is provided, it can be incorporated by using our novel policy adjustment method. Finally, we tackle the challenge of finding an open-loop schedule required to participate in the day-ahead market. We propose a model-free Monte-Carlo estimator method that uses a metric to construct artificial trajectories and we illustrate this method by finding the day-ahead schedule of a heat-pump thermostat. Our experiments show that batch RL techniques provide a valuable alternative to model-based controllers and that they can be used to construct both closed-loop and open-loop policies.
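
Fitted Q-iteration, the standard batch RL technique that the paper extends, can be sketched as follows. The sketch makes strong simplifying assumptions: states and actions are discrete, and the regression step is replaced by averaging the targets per state-action pair, whereas a practical implementation would fit a function approximator. The forecast extension, the policy adjustment, and the Monte-Carlo estimator from the paper are not shown, and all names are hypothetical.

```python
from collections import defaultdict

def fitted_q_iteration(batch, actions, gamma=0.9, n_iter=50):
    """Batch Q-learning on fixed transitions (state, action, reward, next_state)."""
    q = defaultdict(float)
    for _ in range(n_iter):
        sums, counts = defaultdict(float), defaultdict(int)
        for s, a, r, s_next in batch:
            # Regression target built from the previous Q estimate.
            target = r + gamma * max(q[(s_next, b)] for b in actions)
            sums[(s, a)] += target
            counts[(s, a)] += 1
        # "Fit" step: here simply the mean target per (state, action) pair.
        q = defaultdict(float, {k: sums[k] / counts[k] for k in sums})
    return q

# Toy batch (assumed): two states; action 0 stays, action 1 switches state,
# and a reward of 1 is received whenever the next state is state 1.
batch_demo = [(0, 0, 0.0, 0), (0, 1, 1.0, 1), (1, 0, 1.0, 1), (1, 1, 0.0, 0)]
q_demo = fitted_q_iteration(batch_demo, actions=[0, 1], gamma=0.5)
```

The batch is collected once and reused in every iteration, which is why no model of the system dynamics is needed.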

1504.01050 2026-06-04 cs.LG cs.SY eess.SY 版本更新

An Online Approach to Dynamic Channel Access and Transmission Scheduling

动态信道接入与传输调度的在线方法

Yang Liu, Mingyan Liu

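AI summary This paper proposes online algorithms that track known optimal or sub-optimal dynamic channel access and transmission scheduling policies. Performance is measured by strong regret, the algorithms outperform standard weak-regret learning algorithms, and both a multiuser single-channel and a single-user multichannel setting are covered.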

Comments 10 pages, to appear in MobiHoc 2015

详情
AI中文摘要

在多信道无线系统中,做出明智的信道接入和传输调度决策对于提高性能及能效和频谱效率至关重要。过去十年的研究已产生了动态和机会性信道接入方案,但这些方案通常需要先验的信道统计知识。本文通过设计学习算法跟踪已知的最优或次优动态信道接入与传输调度策略,以强后悔度衡量性能,优于传统弱后悔算法,在多用户单信道和单用户多信道场景中均实现了亚线性后悔度。

英文摘要

Making judicious channel access and transmission scheduling decisions is essential for improving performance as well as energy and spectral efficiency in multichannel wireless systems. This problem has been a subject of extensive study in the past decade, and the resulting dynamic and opportunistic channel access schemes can bring potentially significant improvement over traditional schemes. However, a common and severe limitation of these dynamic schemes is that they almost always require some form of a priori knowledge of the channel statistics. A natural remedy is a learning framework, which has also been extensively studied in the same context, but a typical learning algorithm in this literature seeks only the best static policy, with performance measured by weak regret, rather than learning a good dynamic channel access policy. There is thus a clear disconnect between what an optimal channel access policy can achieve with known channel statistics that actively exploits temporal, spatial and spectral diversity, and what a typical existing learning algorithm aims for, which is the static use of a single channel devoid of diversity gain. In this paper we bridge this gap by designing learning algorithms that track known optimal or sub-optimal dynamic channel access and transmission scheduling policies, thereby yielding performance measured by a form of strong regret, the accumulated difference between the reward returned by an optimal solution when a priori information is available and that by our online algorithm. We do so in the context of two specific algorithms that appeared in [1] and [2], respectively, the former for a multiuser single-channel setting and the latter for a single-user multichannel setting. In both cases we show that our algorithms achieve sub-linear regret uniform in time and outperform the standard weak-regret learning algorithms.

1503.08855 2026-06-04 math.OC cs.IT cs.LG cs.MA cs.SY eess.SY math.IT stat.ML 版本更新

Decentralized learning for wireless communications and networking

无线通信与网络中的去中心化学习

Georgios B. Giannakis, Qing Ling, Gonzalo Mateos, Ioannis D. Schizas, Hao Zhu

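AI summary This chapter presents decentralized learning algorithms for in-network processing of graph-valued data, using the alternating-direction method of multipliers (ADMM) for distributed optimization, and illustrates them with case studies from wireless communications and networking.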

Comments Contributed chapter to appear in Splitting Methods in Communication and Imaging, Science and Engineering, R. Glowinski, S. Osher, and W. Yin, Editors, New York, Springer, 2015

详情
AI中文摘要

本章讨论了用于图数据网络内处理的去中心化学习算法。提出了一个通用学习问题,并将其转换为可分离形式,通过交替方向乘子法(ADMM)迭代最小化以实现所需程度的并行化。在不交换分布式训练集元素且保持节点间通信成本可控的情况下,本地学习器同意全球推断得到的量,即若整个训练数据集集中可用时所获得的量。通过包括使用无线传感器网络的目标跟踪、揭示互联网流量异常、电力系统状态估计以及无线认知无线电网络的频谱制图等案例研究,展示了去中心化学习框架对当代无线通信和网络任务的影响。

英文摘要

This chapter deals with decentralized learning algorithms for in-network processing of graph-valued data. A generic learning problem is formulated and recast into a separable form, which is iteratively minimized using the alternating-direction method of multipliers (ADMM) so as to gain the desired degree of parallelization. Without exchanging elements from the distributed training sets and keeping inter-node communications at affordable levels, the local (per-node) learners consent to the desired quantity inferred globally, meaning the one obtained if the entire training data set were centrally available. The impact of the decentralized learning framework on contemporary wireless communications and networking tasks is illustrated through case studies including target tracking using wireless sensor networks, unveiling Internet traffic anomalies, power system state estimation, as well as spectrum cartography for wireless cognitive radio networks.
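
How ADMM lets local learners agree on a global quantity can be sketched on the simplest separable problem. The sketch is a toy under stated assumptions: each node $i$ holds a private scalar $a_i$ with local cost $\tfrac12 (x - a_i)^2$, and the consensus variable is updated by a global average, whereas the chapter's setting restricts communication to neighboring nodes. The function name and the parameter `rho` are hypothetical.

```python
def consensus_admm(local_values, rho=1.0, n_iter=100):
    """Global-consensus ADMM for min sum_i 0.5 * (x_i - a_i)^2 subject to x_i = z."""
    n = len(local_values)
    x = [0.0] * n   # local estimates, one per node
    u = [0.0] * n   # scaled dual variables, one per node
    z = 0.0         # consensus variable
    for _ in range(n_iter):
        # Local step: each node uses only its own datum a, plus z and its dual.
        x = [(a + rho * (z - ui)) / (1.0 + rho) for a, ui in zip(local_values, u)]
        # Consensus step: average of the local estimates plus duals.
        z = sum(xi + ui for xi, ui in zip(x, u)) / n
        # Dual step: accumulate each node's disagreement with the consensus.
        u = [ui + xi - z for xi, ui in zip(x, u)]
    return x, z

# Toy data (assumed): three nodes whose centralized solution is the mean, 3.0.
x_demo, z_demo = consensus_admm([1.0, 2.0, 6.0])
```

No node ever reveals its raw datum; only the local estimate and dual enter the averaging step, which mirrors the "without exchanging elements from the distributed training sets" property described above.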

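1503.08639 2026-06-04 cs.LG cs.SY eess.SY version update

Sparse plus low-rank autoregressive identification in neuroimaging time series

Raphaël Liégeois, Bamdev Mishra, Mattia Zorzi, Rodolphe Sepulchre

AI summary This paper uses the alternating direction method of multipliers to solve the multivariate autoregressive sparse plus low-rank graphical model problem, scales it to neuroimaging data sizes, and highlights the ability of the low-rank structure to capture spatio-temporal structure.

Comments 6 pages paper submitted to CDC 2015

Details
AI abstract (translated from Chinese)

This paper considers the identification of multivariate autoregressive (AR) sparse plus low-rank graphical models. Building on a recently proposed problem formulation, we use the alternating direction method of multipliers (ADMM) to solve it efficiently and scale it to the sizes found in neuroimaging applications. We apply this decomposition to synthetic and real neuroimaging datasets, paying particular attention to the information encoded in the low-rank structure of the model. In particular, we show that this information captures the spatio-temporal structure of the original data, generalizing classical component analysis approaches.

English abstract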

This paper considers the problem of identifying multivariate autoregressive (AR) sparse plus low-rank graphical models. Based on the corresponding problem formulation recently presented, we use the alternating direction method of multipliers (ADMM) to efficiently solve it and scale it to sizes encountered in neuroimaging applications. We apply this decomposition on synthetic and real neuroimaging datasets with a specific focus on the information encoded in the low-rank structure of our model. In particular, we illustrate that this information captures the spatio-temporal structure of the original data, generalizing classical component analysis approaches.

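1503.06468 2026-06-04 cs.LG cs.CR cs.SY eess.SY version update

Machine Learning Methods for Attack Detection in the Smart Grid

Mete Ozay, Inaki Esnaola, Fatos T. Yarman Vural, Sanjeev R. Kulkarni, H. Vincent Poor

AI summary This paper applies machine learning methods to attack detection in the smart grid, combining batch and online learning algorithms with feature-level and decision-level fusion and analyzing the statistical and geometric properties of attack vectors to improve detection performance.

Comments 14 pages, 11 Figures

Details
Journal ref
A version of the manuscript was published in IEEE Transactions on Neural Networks and Learning Systems, 19 March 2015
AI abstract (translated from Chinese)

Attack detection in the smart grid is posed as a statistical learning problem for different attack scenarios in which measurements are observed in batch or online settings. In this approach, machine learning algorithms classify measurements as either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and to overcome constraints arising from the sparse structure of the problem. Well-known batch and online learning algorithms (supervised and semi-supervised) are combined with decision-level and feature-level fusion to model the attack detection problem. The relationships between the statistical and geometric properties of the attack vectors used in the attack scenarios and the learning algorithms are analyzed in order to detect unobservable attacks with statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms detect attacks with higher performance than attack detection algorithms that rely on state vector estimation.

English abstract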

Attack detection problems in the smart grid are posed as statistical learning problems for different attack scenarios in which the measurements are observed in batch or online settings. In this approach, machine learning algorithms are used to classify measurements as being either secure or attacked. An attack detection framework is provided to exploit any available prior knowledge about the system and surmount constraints arising from the sparse structure of the problem in the proposed approach. Well-known batch and online learning algorithms (supervised and semi-supervised) are employed with decision and feature level fusion to model the attack detection problem. The relationships between statistical and geometric properties of attack vectors employed in the attack scenarios and learning algorithms are analyzed to detect unobservable attacks using statistical learning methods. The proposed algorithms are examined on various IEEE test systems. Experimental analyses show that machine learning algorithms can detect attacks with performances higher than the attack detection algorithms which employ state vector estimation methods in the proposed attack detection framework.

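1503.06394 2026-06-04 cs.DS cs.LG cs.NA math.NA version update

Large-scale Log-determinant Computation through Stochastic Chebyshev Expansions

Insu Han, Dmitry Malioutov, Jinwoo Shin

AI summary This paper proposes a linear-time randomized algorithm that combines stochastic trace approximation with Chebyshev polynomial expansions to compute log-determinants of large-scale positive definite and general non-singular matrices efficiently, with high accuracy and speed.

Details
AI abstract (translated from Chinese)

Log-determinants appear widely in machine learning, for example in Gaussian graphical models, Gaussian process models, partition functions of discrete graphical models, minimum-volume ellipsoids, metric learning, and kernel learning. Log-determinant computation usually involves the Cholesky decomposition, whose cost is cubic in the number of variables, which makes it unsuitable for large-scale applications. This paper proposes a linear-time randomized algorithm that approximates log-determinants of large-scale positive definite and general non-singular matrices using a stochastic trace approximation (the Hutchinson method) together with Chebyshev polynomial expansions, both of which rely on efficient matrix-vector multiplications. We establish rigorous additive and multiplicative approximation error bounds that depend on the condition number of the input matrix. In experiments, the algorithm is orders of magnitude faster than the Cholesky decomposition and Schur completion, and computes log-determinants of matrices involving tens of millions of variables.

English abstract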

Logarithms of determinants of large positive definite matrices appear ubiquitously in machine learning applications including Gaussian graphical and Gaussian process models, partition functions of discrete graphical models, minimum-volume ellipsoids, metric learning and kernel learning. Log-determinant computation involves the Cholesky decomposition at the cost cubic in the number of variables, i.e., the matrix dimension, which makes it prohibitive for large-scale applications. We propose a linear-time randomized algorithm to approximate log-determinants for very large-scale positive definite and general non-singular matrices using a stochastic trace approximation, called the Hutchinson method, coupled with Chebyshev polynomial expansions that both rely on efficient matrix-vector multiplications. We establish rigorous additive and multiplicative approximation error bounds depending on the condition number of the input matrix. In our experiments, the proposed algorithm can provide very high accuracy solutions at orders of magnitude faster time than the Cholesky decomposition and Schur completion, and enables us to compute log-determinants of matrices involving tens of millions of variables.
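
The two ingredients named in this abstract can be combined in a few lines. The sketch below is an illustration under stated assumptions, not the authors' implementation: it assumes a symmetric positive definite matrix given as nested lists, eigenvalue bounds `lam_min` and `lam_max` supplied by the caller, Rademacher probe vectors, and Chebyshev interpolation at Chebyshev nodes. The function names and default parameters are hypothetical. It uses the identity log det(A) = tr(log A) and touches `A` only through matrix-vector products.

```python
import math
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(A, v):
    return [dot(row, v) for row in A]


def chebyshev_coeffs(f, degree):
    """Chebyshev interpolation coefficients of f on [-1, 1]."""
    n = degree + 1
    angles = [math.pi * (k + 0.5) / n for k in range(n)]
    fvals = [f(math.cos(t)) for t in angles]
    return [2.0 / n * sum(fv * math.cos(j * t) for fv, t in zip(fvals, angles))
            for j in range(n)]


def logdet_estimate(A, lam_min, lam_max, degree=30, num_probes=50, seed=0):
    """Estimate log det(A) for symmetric positive definite A whose
    eigenvalues lie in [lam_min, lam_max] (bounds supplied by the caller)."""
    rng = random.Random(seed)
    d = len(A)
    width = lam_max - lam_min
    center = lam_max + lam_min

    def apply_B(v):
        # B = (2A - center*I) / width has its spectrum inside [-1, 1].
        return [(2.0 * av - center * vi) / width
                for av, vi in zip(matvec(A, v), v)]

    # log(lambda) expressed in the rescaled variable t in [-1, 1]
    coeffs = chebyshev_coeffs(
        lambda t: math.log(0.5 * (width * t + center)), degree)
    total = 0.0
    for _ in range(num_probes):
        v = [rng.choice((-1.0, 1.0)) for _ in range(d)]  # Rademacher probe
        w_prev, w = v, apply_B(v)                        # T_0(B)v and T_1(B)v
        acc = 0.5 * coeffs[0] * dot(v, w_prev) + coeffs[1] * dot(v, w)
        for j in range(2, degree + 1):
            # Three-term recurrence: T_j(B)v = 2 B T_{j-1}(B)v - T_{j-2}(B)v
            w_prev, w = w, [2.0 * bw - p for bw, p in zip(apply_B(w), w_prev)]
            acc += coeffs[j] * dot(v, w)
        total += acc  # v^T p(B) v, a Hutchinson sample of tr(log A)
    return total / num_probes
```

The polynomial degree needed for a given accuracy grows with the ratio `lam_max / lam_min`, which is consistent with the abstract's statement that the error bounds depend on the condition number.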

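1310.0807 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA math.ST stat.ML stat.TH version update

Exact and Stable Covariance Estimation from Quadratic Sampling via Convex Programming

Yuxin Chen, Yuejie Chi, Andrea Goldsmith

AI summary This paper studies how to extract the covariance structure of high-dimensional data from quadratic sampling via convex programming, considers structural assumptions such as low rank, Toeplitz low rank, and sparsity, and shows that accurate covariance estimation is achievable in the noiseless case.

Comments accepted to IEEE Transactions on Information Theory, 2015

Details
AI abstract (translated from Chinese)

Statistical inference and information processing of high-dimensional data often require efficient and accurate estimation of second-order statistics. When data change rapidly and processing power and storage are limited, the covariance structure must be extracted from a single pass over the data and a small number of stored measurements. This paper explores a quadratic (or rank-one) measurement model that has minimal memory requirements and low computational complexity during sampling, and that is shown to be optimal in preserving various low-dimensional covariance structures. Specifically, four popular structural assumptions on covariance matrices are studied, namely low rank, Toeplitz low rank, sparsity, and jointly rank-one and sparse structure, with recovery achieved through convex relaxation paradigms tailored to each structure. The proposed quadratic sampling framework has many potential applications, including streaming data processing, high-frequency wireless communication, phase space tomography and phase retrieval in optics, and non-coherent subspace detection. In the absence of noise, the method achieves universally accurate covariance estimation as soon as the number of measurements exceeds the information-theoretic limits. We also demonstrate the robustness of the approach to noise and imperfect structural assumptions. Our analysis rests on a new notion called the mixed-norm restricted isometry property (RIP-$\ell_{2}/\ell_{1}$), as well as the conventional RIP-$\ell_{2}/\ell_{2}$ for near-isotropic and bounded measurements. In addition, our results improve upon the best-known guarantees for phase retrieval with PhaseLift (for both dense and sparse signals), using a significantly simpler approach.

English abstract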

Statistical inference and information processing of high-dimensional data often require efficient and accurate estimation of their second-order statistics. With rapidly changing data, limited processing power and storage at the acquisition devices, it is desirable to extract the covariance structure from a single pass over the data and a small number of stored measurements. In this paper, we explore a quadratic (or rank-one) measurement model which imposes minimal memory requirements and low computational complexity during the sampling process, and is shown to be optimal in preserving various low-dimensional covariance structures. Specifically, four popular structural assumptions of covariance matrices, namely low rank, Toeplitz low rank, sparsity, jointly rank-one and sparse structure, are investigated, while recovery is achieved via convex relaxation paradigms for the respective structure. The proposed quadratic sampling framework has a variety of potential applications including streaming data processing, high-frequency wireless communication, phase space tomography and phase retrieval in optics, and non-coherent subspace detection. Our method admits universally accurate covariance estimation in the absence of noise, as soon as the number of measurements exceeds the information theoretic limits. We also demonstrate the robustness of this approach against noise and imperfect structural assumptions. Our analysis is established upon a novel notion called the mixed-norm restricted isometry property (RIP-$\ell_{2}/\ell_{1}$), as well as the conventional RIP-$\ell_{2}/\ell_{2}$ for near-isotropic and bounded measurements. In addition, our results improve upon the best-known phase retrieval (including both dense and sparse signals) guarantees using PhaseLift with a significantly simpler approach.
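
A small numerical check helps make the quadratic (rank-one) measurement model concrete. The sketch below is an illustration only, with hypothetical function names and an uncentered second-moment matrix assumed in place of the covariance: it shows that the average energy of the projections `a . x_t`, accumulated in one pass with constant memory, equals the quadratic form `a^T S a`, so each such measurement is linear in the matrix being estimated.

```python
def quadratic_measurement(a, stream):
    """One-pass average of (a . x_t)^2 over a data stream.

    Only a running sum and a counter are stored, never the samples.
    """
    total = 0.0
    count = 0
    for x in stream:
        projection = sum(ai * xi for ai, xi in zip(a, x))
        total += projection * projection
        count += 1
    return total / count


def second_moment_matrix(samples):
    """Uncentered sample second-moment matrix S = (1/T) sum_t x_t x_t^T."""
    d = len(samples[0])
    n = len(samples)
    return [[sum(x[i] * x[j] for x in samples) / n for j in range(d)]
            for i in range(d)]


def quadratic_form(a, S):
    """a^T S a, i.e. the inner product of S with the rank-one matrix a a^T."""
    d = len(a)
    return sum(a[i] * S[i][j] * a[j] for i in range(d) for j in range(d))


samples = [[1.0, 2.0], [3.0, -1.0], [0.0, 4.0]]
a = [2.0, -1.0]
y = quadratic_measurement(a, iter(samples))
print(y, quadratic_form(a, second_moment_matrix(samples)))
```

Recovering the matrix from a set of such measurements is the convex programming step the abstract refers to, which this sketch does not attempt.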

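1503.02828 2026-06-04 cs.LG cs.NA math.NA version update

Scalable Nuclear-norm Minimization by Subspace Pursuit Proximal Riemannian Gradient

Mingkui Tan, Shijie Xiao, Junbin Gao, Dong Xu, Anton Van Den Hengel, Qinfeng Shi

AI summary This paper proposes a proximal Riemannian gradient method based on subspace pursuit for solving nuclear-norm regularized problems efficiently; it avoids large-rank SVDs and improves performance on tasks such as matrix completion and subspace clustering.

Details
AI abstract (translated from Chinese)

Nuclear-norm regularization plays a key role in many learning tasks, such as low-rank matrix recovery and low-rank representation. Solving this problem directly is computationally expensive because of the unknown rank of the variables or large-rank singular value decompositions (SVDs). To address this, we propose a proximal Riemannian gradient (PRG) scheme that efficiently solves trace-norm regularized problems defined on the real-algebraic variety $\mathcal{M}_{\le r}$ of real matrices of rank at most $r$. Building on PRG, we further propose a simple and novel subspace pursuit (SP) paradigm for general trace-norm regularized problems without the explicit rank constraint $\mathcal{M}_{\le r}$. The proposed paradigm is highly scalable because it avoids large-rank SVDs. Empirical studies on tasks such as matrix completion and subspace clustering based on low-rank representation show that the proposed paradigm outperforms existing methods.

English abstract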

Nuclear-norm regularization plays a vital role in many learning tasks, such as low-rank matrix recovery (MR), and low-rank representation (LRR). Solving this problem directly can be computationally expensive due to the unknown rank of variables or large-rank singular value decompositions (SVDs). To address this, we propose a proximal Riemannian gradient (PRG) scheme which can efficiently solve trace-norm regularized problems defined on the real-algebraic variety $\mathcal{M}_{\le r}$ of real matrices of rank at most $r$. Based on PRG, we further present a simple and novel subspace pursuit (SP) paradigm for general trace-norm regularized problems without the explicit rank constraint $\mathcal{M}_{\le r}$. The proposed paradigm is very scalable by avoiding large-rank SVDs. Empirical studies on several tasks, such as matrix completion and LRR based subspace clustering, demonstrate the superiority of the proposed paradigms over existing methods.

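1503.03903 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA stat.ML version update

Approximating Sparse PCA from Incomplete Data

Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail

AI summary This paper studies how to recover the sparse principal components of a data matrix from a sketch formed from a few of its elements, proves that a near-optimal solution can be obtained when the sketch is close to the original matrix, and demonstrates sparse PCA algorithms on data from several domains with reduced running time.

Details
AI abstract (translated from Chinese)

We study how to recover the sparse principal components of a data matrix from a sketch formed from a small number of its elements. We show that, for a wide class of optimization problems, if the sketch is close to the original data matrix in the spectral norm, then a near-optimal solution to the optimization problem can be obtained from the sketch. In particular, we use this approach to obtain sparse principal components and show that, for $m$ data points in $n$ dimensions, $O(\epsilon^{-2}\tilde{k}\max\{m,n\})$ elements give an $\epsilon$-additive approximation to the sparse PCA problem ($\tilde{k}$ is the stable rank of the data matrix). We demonstrate our algorithms extensively on image, text, biological, and financial data. The results show that we can recover sparse principal components from incomplete data and that, with the sparse sketch, the running time drops by a factor of five or more.

English abstract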

We study how well one can recover sparse principal components of a data matrix using a sketch formed from a few of its elements. We show that for a wide class of optimization problems, if the sketch is close (in the spectral norm) to the original data matrix, then one can recover a near optimal solution to the optimization problem by using the sketch. In particular, we use this approach to obtain sparse principal components and show that for $m$ data points in $n$ dimensions, $O(\epsilon^{-2}\tilde{k}\max\{m,n\})$ elements give an $\epsilon$-additive approximation to the sparse PCA problem ($\tilde{k}$ is the stable rank of the data matrix). We demonstrate our algorithms extensively on image, text, biological and financial data. The results show that not only are we able to recover the sparse PCAs from the incomplete data, but by using our sparse sketch, the running time drops by a factor of five or more.

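1503.03355 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.AP version update

Automatic Unsupervised Tensor Mining with Quality Assessment

Evangelos E. Papalexakis

AI summary This paper presents AutoTen, an automatic unsupervised tensor mining algorithm that improves on quality-assessment heuristics; its performance is validated on synthetic and real data, offering a new approach to automated tensor mining.

Details
AI abstract (translated from Chinese)

Tensor decomposition is a popular tool for unsupervised modelling and mining of multi-aspect data. In an exploratory setting, where no labels or ground truth are available, how can we automatically decide how many components to extract? How can we assess the quality of the results, so that a domain expert can take this quality measure into account when interpreting them? This paper introduces AutoTen, a novel automatic unsupervised tensor mining algorithm that requires minimal user intervention and that leverages and improves upon heuristics for assessing result quality. We evaluate AutoTen extensively on synthetic data, where it outperforms existing baselines. Finally, we apply AutoTen to a variety of real datasets, providing insights and discoveries. We view this work as a step towards a fully automated, unsupervised tensor mining tool that practitioners in academia and industry can easily adopt.

English abstract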

A popular tool for unsupervised modelling and mining multi-aspect data is tensor decomposition. In an exploratory setting, where no labels or ground truth are available, how can we automatically decide how many components to extract? How can we assess the quality of our results, so that a domain expert can factor this quality measure in the interpretation of our results? In this paper, we introduce AutoTen, a novel automatic unsupervised tensor mining algorithm with minimal user intervention, which leverages and improves upon heuristics that assess the result quality. We extensively evaluate AutoTen's performance on synthetic data, outperforming existing baselines on this very hard problem. Finally, we apply AutoTen on a variety of real datasets, providing insights and discoveries. We view this work as a step towards a fully automated, unsupervised tensor mining tool that can be easily adopted by practitioners in academia and industry.

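1503.01793 2026-06-04 cs.LO cs.GT cs.LG cs.SY eess.SY version update

Correct-by-synthesis reinforcement learning with temporal logic constraints

Min Wen, Ruediger Ehlers, Ufuk Topcu

AI summary This paper proposes a correct-by-synthesis reinforcement learning method with temporal logic constraints, combining a maximally permissive strategy with the maximin-Q learning algorithm so that the system satisfies its specification while optimizing an unknown performance criterion.

Comments 8 pages, 3 figures, 2 tables, submitted to IROS 2015

Details
AI abstract (translated from Chinese)

We consider the problem of synthesizing reactive controllers that optimize an a priori unknown performance criterion while interacting with an uncontrolled environment, such that the system satisfies a given temporal logic specification. We decouple the problem into two subproblems. First, we extract a (maximally) permissive strategy for the system, which encodes multiple (possibly all) ways in which the system can react to the adversarial environment and satisfy the specification. Then, we quantify the a priori unknown performance criterion as a (still unknown) reward function and compute an optimal strategy for the system within the operating envelope allowed by the permissive strategy, using the so-called maximin-Q learning algorithm. We establish the correctness and optimality of this two-step technique for a fragment of temporal logic specifications. For specifications beyond this fragment, correctness can still be preserved, but the learned strategy may be sub-optimal. We present an algorithm for the overall problem and demonstrate its use and computational requirements on a set of robot motion planning examples.

English abstract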

We consider the problem of synthesizing reactive controllers that optimize some a priori unknown performance criterion while interacting with an uncontrolled environment such that the system satisfies a given temporal logic specification. We decouple the problem into two subproblems. First, we extract a (maximally) permissive strategy for the system, which encodes multiple (possibly all) ways in which the system can react to the adversarial environment and satisfy the specifications. Then, we quantify the a priori unknown performance criterion as a (still unknown) reward function and compute an optimal strategy for the system within the operating envelope allowed by the permissive strategy by using the so-called maximin-Q learning algorithm. We establish both correctness (with respect to the temporal logic specifications) and optimality (with respect to the a priori unknown performance criterion) of this two-step technique for a fragment of temporal logic specifications. For specifications beyond this fragment, correctness can still be preserved, but the learned strategy may be sub-optimal. We present an algorithm for the overall problem, and demonstrate its use and computational requirements on a set of robot motion planning examples.

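1402.5180 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Guaranteed Non-Orthogonal Tensor Decomposition via Alternating Rank-$1$ Updates

Animashree Anandkumar, Rong Ge, Majid Janzamin

AI summary This paper provides local and global convergence guarantees for CP tensor decomposition via alternating rank-1 updates, a method suited to asymmetric tensors, and shows that overcomplete decompositions can be recovered under certain rank conditions.

Comments We have added an additional sub-algorithm to remove the (approximate) residual error left after the tensor power iteration

Details
AI abstract (translated from Chinese)

In this paper, we provide local and global convergence guarantees for recovering the CP (Candecomp/Parafac) tensor decomposition. The main step of the proposed algorithm is a simple alternating rank-$1$ update, which is the alternating version of the tensor power iteration adapted to asymmetric tensors. Local convergence guarantees are established for third-order tensors of rank $k$ in $d$ dimensions when $k=o(d^{1.5})$ and the tensor components are incoherent, so overcomplete tensor decompositions can be recovered. We also strengthen the results to global convergence guarantees under the stricter rank condition $k \le \beta d$ (for an arbitrary constant $\beta > 1$) through a simple initialization procedure that initializes the algorithm with the top singular vectors of random tensor slices. Approximate local convergence guarantees are also provided for $p$-th order tensors under the rank condition $k=o(d^{p/2})$. The guarantees include a tight perturbation analysis for noisy tensors.

English abstract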

In this paper, we provide local and global convergence guarantees for recovering CP (Candecomp/Parafac) tensor decomposition. The main step of the proposed algorithm is a simple alternating rank-$1$ update which is the alternating version of the tensor power iteration adapted for asymmetric tensors. Local convergence guarantees are established for third order tensors of rank $k$ in $d$ dimensions, when $k=o \bigl( d^{1.5} \bigr)$ and the tensor components are incoherent. Thus, we can recover overcomplete tensor decomposition. We also strengthen the results to global convergence guarantees under stricter rank condition $k \le βd$ (for arbitrary constant $β> 1$) through a simple initialization procedure where the algorithm is initialized by top singular vectors of random tensor slices. Furthermore, the approximate local convergence guarantees for $p$-th order tensors are also provided under rank condition $k=o \bigl( d^{p/2} \bigr)$. The guarantees also include tight perturbation analysis given noisy tensor.
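
The alternating rank-1 update mentioned in this abstract can be sketched for a third-order tensor stored as nested lists. This is a minimal illustration under assumptions, not the paper's full procedure: it performs only the alternating power-style updates for a single component, takes the starting vectors from the caller, and omits the initialization from tensor slices and the residual-removal step the authors describe. The name `alternating_rank1` is hypothetical.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def alternating_rank1(T, b, c, sweeps=5):
    """Alternating rank-1 updates for a third-order tensor T[i][j][k].

    Each sweep contracts T with two of the current factor estimates and
    normalizes the result to update the third factor.
    """
    d1, d2, d3 = len(T), len(T[0]), len(T[0][0])
    for _ in range(sweeps):
        # a <- T(I, b, c) / ||T(I, b, c)||
        a = normalize([sum(T[i][j][k] * b[j] * c[k]
                           for j in range(d2) for k in range(d3))
                       for i in range(d1)])
        # b <- T(a, I, c) / ||T(a, I, c)||
        b = normalize([sum(T[i][j][k] * a[i] * c[k]
                           for i in range(d1) for k in range(d3))
                       for j in range(d2)])
        # c <- T(a, b, I) / ||T(a, b, I)||
        c = normalize([sum(T[i][j][k] * a[i] * b[j]
                           for i in range(d1) for j in range(d2))
                       for k in range(d3)])
    # Component weight: the full contraction T(a, b, c)
    weight = sum(T[i][j][k] * a[i] * b[j] * c[k]
                 for i in range(d1) for j in range(d2) for k in range(d3))
    return weight, a, b, c


# Rank-one test tensor 2 * (a0 x b0 x c0) with unit-norm factors
a0, b0, c0 = [1.0, 0.0], [0.6, 0.8], [0.0, 1.0]
T = [[[2.0 * x * y * z for z in c0] for y in b0] for x in a0]
print(alternating_rank1(T, b=[1.0, 0.0], c=[0.0, 1.0]))
```

On an exactly rank-one tensor a single sweep already returns the factors, provided the starting vectors are not orthogonal to them; the guarantees in the abstract concern the much harder case of many non-orthogonal components.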

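1502.04689 2026-06-04 cs.LG cs.NA math.NA stat.ML version update

Exact tensor completion using t-SVD

Zemin Zhang, Shuchin Aeron

AI summary This paper proposes a tensor completion method based on the t-SVD, which minimizes the tensor nuclear norm through convex optimization to guarantee recovery with high probability, and establishes order-wise optimality of tensor completion under random sampling.

Comments 16 pages, 5 figures, 2 tables

Details
AI abstract (translated from Chinese)

This paper focuses on the problem of completing multidimensional arrays (tensors) from limited sampling. Our approach is based on the recently proposed tensor singular value decomposition (t-SVD) [1]. This factorization yields a notion of tensor rank, called the tensor tubal rank, whose optimality properties are similar to those of the matrix rank derived from the SVD. As shown in [2], some multidimensional data, such as panning video sequences, exhibit low tensor tubal rank, and we consider the problem of completing such data under random sampling of the data cube. We show that solving a convex optimization problem, which minimizes the tensor nuclear norm obtained as the convex relaxation of the tensor tubal rank, guarantees recovery with overwhelming probability as long as the number of observed samples is proportional to the degrees of freedom in the t-SVD. In this sense our results are order-wise optimal. The conditions under which this result holds are very similar to the incoherence conditions for matrix completion, although we define incoherence within the algebraic framework of the t-SVD. We also show the performance of the algorithm on some real datasets and compare it with other methods based on tensor flattening and Tucker decomposition.

English abstract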

In this paper we focus on the problem of completion of multidimensional arrays (also referred to as tensors) from limited sampling. Our approach is based on a recently proposed tensor-Singular Value Decomposition (t-SVD) [1]. Using this factorization one can derive a notion of tensor rank, referred to as the tensor tubal rank, which has optimality properties similar to those of the matrix rank derived from the SVD. As shown in [2], some multidimensional data, such as panning video sequences, exhibit low tensor tubal rank, and we look at the problem of completing such data under random sampling of the data cube. We show that by solving a convex optimization problem, which minimizes the tensor nuclear norm obtained as the convex relaxation of tensor tubal rank, one can guarantee recovery with overwhelming probability as long as samples in proportion to the degrees of freedom in t-SVD are observed. In this sense our results are order-wise optimal. The conditions under which this result holds are very similar to the incoherency conditions for matrix completion, although we define incoherency under the algebraic set-up of t-SVD. We show the performance of the algorithm on some real data sets and compare it with other existing approaches based on tensor flattening and Tucker decomposition.

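1309.2168 2026-06-04 math.OC cs.LG cs.NA math.NA version update

Large-scale optimization with the primal-dual column generation method

Jacek Gondzio, Pablo González-Brevis, Pedro Munari

AI summary This paper studies the behaviour of the primal-dual column generation method on large-scale convex optimization problems and, through applications in data analysis, decision-making under uncertainty, and telecommunication networks, shows that the method is competitive in iteration counts and computing time.

Comments 28 pages, 1 figure, minor revision, scaled CPU times

Details
AI abstract (translated from Chinese)

The primal-dual column generation method (PDCGM) is a general-purpose column generation technique that relies on the primal-dual interior point method to solve the restricted master problems. Using this interior point variant yields suboptimal and well-centered dual solutions, which naturally stabilizes the column generation process. Compared with standard column generation, the PDCGM typically reduces the number of oracle calls and the CPU time, but these results are based on relatively small problems obtained from linear relaxations of combinatorial applications. This paper investigates the behaviour of the PDCGM on large-scale convex optimization problems, with applications in data analysis, decision-making under uncertainty, and telecommunication networks. The numerical experiments use publicly available benchmark instances to compare the PDCGM with recent results for different methods from the literature. The analysis shows that the PDCGM remains competitive on large-scale optimization problems and offers an attractive alternative to specialized methods.

English abstract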

The primal-dual column generation method (PDCGM) is a general-purpose column generation technique that relies on the primal-dual interior point method to solve the restricted master problems. The use of this interior point method variant allows to obtain suboptimal and well-centered dual solutions which naturally stabilizes the column generation. As recently presented in the literature, reductions in the number of calls to the oracle and in the CPU times are typically observed when compared to the standard column generation, which relies on extreme optimal dual solutions. However, these results are based on relatively small problems obtained from linear relaxations of combinatorial applications. In this paper, we investigate the behaviour of the PDCGM in a broader context, namely when solving large-scale convex optimization problems. We have selected applications that arise in important real-life contexts such as data analysis (multiple kernel learning problem), decision-making under uncertainty (two-stage stochastic programming problems) and telecommunication and transportation networks (multicommodity network flow problem). In the numerical experiments, we use publicly available benchmark instances to compare the performance of the PDCGM against recent results for different methods presented in the literature, which were the best available results to date. The analysis of these results suggests that the PDCGM offers an attractive alternative over specialized methods since it remains competitive in terms of number of iterations and CPU times even for large-scale optimization problems.

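1501.05740 2026-06-04 stat.ML cs.LG cs.NA math.NA version update

Bayesian Learning for Low-Rank matrix reconstruction

Martin Sundin, Cristian R. Rojas, Magnus Jansson, Saikat Chatterjee

AI summary This paper proposes Bayesian learning methods based on latent variable models for completing and reconstructing low-rank matrices from linear measurements; the model parameters are learned with evidence approximation and expectation-maximization, and reconstruction is demonstrated when the rank and noise power are unknown.

Comments Submitted to IEEE Transactions on Signal Processing

Details
AI abstract (translated from Chinese)

We develop Bayesian learning methods based on latent variable models for low-rank matrix completion and reconstruction from linear measurements. For under-determined systems, the developed methods are shown to reconstruct low-rank matrices when neither the rank nor the noise power is known a priori. We derive relations between the latent variable models and several low-rank promoting penalty functions. These relations justify the use of Kronecker-structured covariance matrices in a Gaussian prior. The methods use evidence approximation and expectation-maximization to learn the model parameters. Their performance is evaluated through extensive numerical simulations.

English abstract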

We develop latent variable models for Bayesian learning based low-rank matrix completion and reconstruction from linear measurements. For under-determined systems, the developed methods are shown to reconstruct low-rank matrices when neither the rank nor the noise power is known a-priori. We derive relations between the latent variable models and several low-rank promoting penalty functions. The relations justify the use of Kronecker structured covariance matrices in a Gaussian based prior. In the methods, we use evidence approximation and expectation-maximization to learn the model parameters. The performance of the methods is evaluated through extensive numerical simulations.

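1501.03975 2026-06-04 cs.NE cs.LG cs.SY eess.SY version update

Stochastic Gradient Based Extreme Learning Machines For Online Learning of Advanced Combustion Engines

Vijay Manikandan Janakiraman, XuanLong Nguyen, Dennis Assanis

AI summary This paper proposes SG-ELM, a stochastic gradient based online learning algorithm for the identification of nonlinear dynamic systems. A Lyapunov approach is used to prove asymptotic stability of the estimation error and stability of the parameters while reducing the computational demand, and the algorithm is applied to system identification and dynamic operating envelope identification of an HCCI engine.

Comments This paper was written as an extract from my PhD thesis (July 2013) and so references may not be to date as of this submission (Jan 2015). The article is in review and contains 10 figures, 35 references

Details
AI abstract (translated from Chinese)

This article develops a stochastic gradient based online learning algorithm for Extreme Learning Machines (SG-ELM). A stability criterion based on the Lyapunov approach is used to prove both asymptotic stability of the estimation error and stability of the estimated parameters, making the algorithm suitable for the identification of nonlinear dynamic systems. Compared with OS-ELM, which is based on recursive least squares, SG-ELM guarantees stability while reducing the computational demand. To demonstrate the effectiveness of the algorithm in a real-world scenario, an advanced combustion engine identification problem is considered. The algorithm is applied to two case studies: online regression learning for system identification of a Homogeneous Charge Compression Ignition (HCCI) engine, and online classification learning with class imbalance for identifying the dynamic operating envelope of the HCCI engine. The results indicate that the accuracy of the proposed SG-ELM is comparable to that of the state of the art, while adding stability and reducing the computational cost.

English abstract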

In this article, a stochastic gradient based online learning algorithm for Extreme Learning Machines (ELM) is developed (SG-ELM). A stability criterion based on Lyapunov approach is used to prove both asymptotic stability of estimation error and stability in the estimated parameters suitable for identification of nonlinear dynamic systems. The developed algorithm not only guarantees stability, but also reduces the computational demand compared to the OS-ELM approach based on recursive least squares. In order to demonstrate the effectiveness of the algorithm on a real-world scenario, an advanced combustion engine identification problem is considered. The algorithm is applied to two case studies: An online regression learning for system identification of a Homogeneous Charge Compression Ignition (HCCI) Engine and an online classification learning (with class imbalance) for identifying the dynamic operating envelope of the HCCI Engine. The results indicate that the accuracy of the proposed SG-ELM is comparable to that of the state-of-the-art but adds stability and a reduction in computational effort.

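1310.5715 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC stat.ML version update

Stochastic Gradient Descent, Weighted Sampling, and the Randomized Kaczmarz algorithm

Deanna Needell, Nathan Srebro, Rachel Ward

AI summary This paper improves the linear convergence guarantee of stochastic gradient descent for smooth and strongly convex objectives from a quadratic to a linear dependence on the condition number, examines the effect of weighted sampling on convergence, and connects the randomized Kaczmarz algorithm with SGD, proving its exponential convergence to the solution of a weighted least squares problem.

Comments 22 pages, 6 figures

Details
AI abstract (translated from Chinese)

We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving the dependence from quadratic in the conditioning, $(L/μ)^2$ (where $L$ is a bound on the smoothness and $μ$ on the strong convexity), to linear in $L/μ$. Furthermore, we show that reweighting the sampling distribution (i.e. importance sampling) is necessary to further improve convergence, and we obtain a linear dependence on the average smoothness, dominating previous results. We also discuss importance sampling for SGD in other scenarios. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which lets us transfer ideas between the bodies of literature on the two methods. In particular, we recast the randomized Kaczmarz algorithm as an instance of SGD and apply our results to prove its exponential convergence, but to the solution of a weighted least squares problem rather than the original least squares problem. We then present a modified Kaczmarz algorithm with partially biased sampling that does converge to the original least squares solution at the same exponential convergence rate.

English abstract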

We obtain an improved finite-sample guarantee on the linear convergence of stochastic gradient descent for smooth and strongly convex objectives, improving from a quadratic dependence on the conditioning $(L/μ)^2$ (where $L$ is a bound on the smoothness and $μ$ on the strong convexity) to a linear dependence on $L/μ$. Furthermore, we show how reweighting the sampling distribution (i.e. importance sampling) is necessary in order to further improve convergence, and obtain a linear dependence in the average smoothness, dominating previous results. We also discuss importance sampling for SGD more broadly and show how it can improve convergence also in other scenarios. Our results are based on a connection we make between SGD and the randomized Kaczmarz algorithm, which allows us to transfer ideas between the separate bodies of literature studying each of the two methods. In particular, we recast the randomized Kaczmarz algorithm as an instance of SGD, and apply our results to prove its exponential convergence, but to the solution of a weighted least squares problem rather than the original least squares problem. We then present a modified Kaczmarz algorithm with partially biased sampling which does converge to the original least squares solution with the same exponential convergence rate.
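
The randomized Kaczmarz algorithm that this abstract recasts as SGD is short enough to sketch directly. The version below is the standard form with rows sampled in proportion to their squared norms, written as a minimal illustration and not as the paper's modified, partially biased variant; the function name and defaults are illustrative. Each step is a stochastic gradient step on the single term 0.5 * (a_i . x - b_i)^2 with step size 1 / ||a_i||^2, which projects the iterate onto the hyperplane a_i . x = b_i.

```python
import random


def randomized_kaczmarz(A, b, iters=2000, seed=0):
    """Randomized Kaczmarz for a consistent linear system A x = b.

    Row i is sampled with probability proportional to ||a_i||^2
    (importance sampling), then x is projected onto {x : a_i . x = b_i}.
    """
    rng = random.Random(seed)
    row_norms_sq = [sum(a * a for a in row) for row in A]
    indices = range(len(A))
    x = [0.0] * len(A[0])
    for _ in range(iters):
        i = rng.choices(indices, weights=row_norms_sq)[0]
        row = A[i]
        residual = b[i] - sum(a * xi for a, xi in zip(row, x))
        step = residual / row_norms_sq[i]
        x = [xi + step * a for xi, a in zip(x, row)]
    return x


A = [[2.0, 1.0], [1.0, 3.0], [1.0, -1.0]]
b = [4.0, 7.0, -1.0]  # consistent system with solution x = (1, 2)
print(randomized_kaczmarz(A, b))
```

The example system is consistent, so the iterates approach its exact solution. For inconsistent systems the abstract's point applies: this sampling scheme targets a weighted least squares solution rather than the ordinary one.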

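1304.4610 2026-06-04 cs.IT cs.LG cs.NA math.IT math.NA stat.ML version update

Spectral Compressed Sensing via Structured Matrix Completion

Yuxin Chen, Yuejie Chi

AI summary This paper proposes enhanced matrix completion, an algorithm based on structured matrix completion, for recovering spectrally sparse objects from a small number of time-domain samples; it achieves perfect recovery through nuclear norm minimization near the information-theoretic limit, is robust to noise, and applies to super resolution.

Comments accepted to International Conference on Machine Learning (ICML 2013)

Details
Journal ref
Journal of Machine Learning Research, W&CP 28 (3) :414-422, 2013
AI abstract (translated from Chinese)

This paper studies the problem of recovering a spectrally sparse object from a small number of time-domain samples. Specifically, the object of interest, with ambient dimension $n$, is assumed to be a mixture of $r$ complex multi-dimensional sinusoids, while the underlying frequencies can take any value in the unit disk. Conventional compressed sensing paradigms suffer from the basis mismatch issue when a discrete dictionary is imposed on the Fourier representation. To address this problem, we develop a nonparametric algorithm, called enhanced matrix completion (EMaC), based on structured matrix completion. The algorithm first arranges the data into a low-rank enhanced form with multi-fold Hankel structure and then attempts recovery via nuclear norm minimization. Under mild incoherence conditions, EMaC allows perfect recovery as soon as the number of samples exceeds the order of $\mathcal{O}(r\log^{2} n)$. We also show that, in many instances, accurate completion of a low-rank multi-fold Hankel matrix is possible when the number of observed entries is proportional to the information-theoretic limit (up to a logarithmic gap). Numerical experiments further demonstrate the robustness of EMaC against bounded noise and its applicability to super resolution.

English abstract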

The paper studies the problem of recovering a spectrally sparse object from a small number of time domain samples. Specifically, the object of interest with ambient dimension $n$ is assumed to be a mixture of $r$ complex multi-dimensional sinusoids, while the underlying frequencies can assume any value in the unit disk. Conventional compressed sensing paradigms suffer from the basis mismatch issue when imposing a discrete dictionary on the Fourier representation. To address this problem, we develop a novel nonparametric algorithm, called enhanced matrix completion (EMaC), based on structured matrix completion. The algorithm starts by arranging the data into a low-rank enhanced form with multi-fold Hankel structure, then attempts recovery via nuclear norm minimization. Under mild incoherence conditions, EMaC allows perfect recovery as soon as the number of samples exceeds the order of $\mathcal{O}(r\log^{2} n)$. We also show that, in many instances, accurate completion of a low-rank multi-fold Hankel matrix is possible when the number of observed entries is proportional to the information theoretical limits (except for a logarithmic gap). The robustness of EMaC against bounded noise and its applicability to super resolution are further demonstrated by numerical experiments.
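
The reason a Hankel arrangement helps can be seen in one dimension. The sketch below is an illustration under simplifying assumptions, not the EMaC algorithm: it uses a 1-D signal (the paper treats multi-dimensional signals with multi-fold Hankel structure), fully observed samples, and a simple Gaussian-elimination rank check with a hypothetical tolerance. It shows that a sum of `r` complex sinusoids with off-grid frequencies yields a Hankel matrix of rank `r`, which is the low-rank structure that nuclear norm minimization then exploits.

```python
import cmath
import math


def hankel(samples, rows):
    """Hankel matrix H[i][j] = samples[i + j] with the given number of rows."""
    cols = len(samples) - rows + 1
    return [[samples[i + j] for j in range(cols)] for i in range(rows)]


def numerical_rank(M, tol=1e-9):
    """Rank via Gaussian elimination with partial pivoting (complex entries)."""
    M = [row[:] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    rank = 0
    for col in range(n_cols):
        if rank == n_rows:
            break
        pivot = max(range(rank, n_rows), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < tol:
            continue  # no usable pivot in this column
        M[rank], M[pivot] = M[pivot], M[rank]
        for r in range(rank + 1, n_rows):
            factor = M[r][col] / M[rank][col]
            M[r] = [x - factor * y for x, y in zip(M[r], M[rank])]
        rank += 1
    return rank


# Two complex sinusoids with frequencies that lie on no DFT grid of size 9
freqs = [0.1, 0.37]
amps = [1.0, 0.5]
signal = [sum(a * cmath.exp(2j * math.pi * f * t) for a, f in zip(amps, freqs))
          for t in range(9)]
H = hankel(signal, rows=5)
print(len(H), len(H[0]), numerical_rank(H))
```

Because the rank depends only on the number of sinusoids and not on where their frequencies fall, no discrete frequency dictionary is needed, which is how this structure sidesteps the basis mismatch issue.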

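1501.00125 2026-06-04 cs.LG cs.NA cs.SY eess.SY math.NA physics.data-an version update

Maximum Margin Clustering for State Decomposition of Metastable Systems

Hao Wu

AI summary This paper proposes maximum margin metastable clustering, which converts metastable state decomposition into a semi-supervised learning problem and uses the large margin technique to find the optimal decomposition without phase space discretization; simulation examples illustrate its effectiveness.

Details
AI abstract (translated from Chinese)

When studying a metastable dynamical system, a prime concern is how to decompose the phase space into a set of metastable states. Unfortunately, metastable state decomposition based on simulation or experimental data remains a challenge. The most popular and simplest approach is geometric clustering, which builds on classical clustering techniques. However, this approach requires that (1) the data come from simulations or experiments that are in global equilibrium and (2) the coordinate system is appropriately selected. Recently, the kinetic clustering approach, based on phase space discretization and transition probability estimation, has drawn attention because it applies to more general cases, but choosing the discretization policy is difficult. This paper proposes a new decomposition method, called maximum margin metastable clustering, which converts the metastable state decomposition problem into a semi-supervised learning problem so that the large margin technique can be used to search for the optimal decomposition without phase space discretization. Several simulation examples illustrate the effectiveness of the proposed method.

English abstract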

When studying a metastable dynamical system, a prime concern is how to decompose the phase space into a set of metastable states. Unfortunately, the metastable state decomposition based on simulation or experimental data is still a challenge. The most popular and simplest approach is geometric clustering which is developed based on the classical clustering technique. However, the prerequisites of this approach are: (1) data are obtained from simulations or experiments which are in global equilibrium and (2) the coordinate system is appropriately selected. Recently, the kinetic clustering approach based on phase space discretization and transition probability estimation has drawn much attention due to its applicability to more general cases, but the choice of discretization policy is a difficult task. In this paper, a new decomposition method designated as maximum margin metastable clustering is proposed, which converts the problem of metastable state decomposition to a semi-supervised learning problem so that the large margin technique can be utilized to search for the optimal decomposition without phase space discretization. Moreover, several simulation examples are given to illustrate the effectiveness of the proposed method.

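1312.5439 2026-06-04 eess.SY cs.IT cs.LG cs.SY math.IT math.OC version update

Asynchronous Adaptation and Learning over Networks - Part III: Comparison Analysis

Xiaochuan Zhao, Ali H. Sayed

AI summary This paper compares the performance of synchronous and asynchronous networks, and of decentralized adaptation and centralized stochastic-gradient methods, finding that adaptive networks retain their convergence behaviour under asynchronous events and that decentralized networks can match the performance of centralized solutions.

Comments 39 pages, 3 figures

Details
AI abstract (translated from Chinese)

In Part II [3] we carried out a detailed mean-square-error analysis of the performance of asynchronous adaptation and learning over networks under a fairly general model for asynchronous events, including random topologies, random link failures, random data arrival times, and agents turning on and off randomly. In this Part III, we compare the performance of synchronous and asynchronous networks. We also compare the performance of decentralized adaptation against centralized stochastic-gradient (batch) solutions. Two interesting conclusions stand out. First, the results establish that the performance of adaptive networks is largely immune to the effect of asynchronous events: the mean and mean-square convergence rates and the asymptotic bias values are not degraded relative to synchronous or centralized implementations. Only the steady-state mean-square deviation suffers a degradation on the order of $ν$, which represents the small step-size parameters used for adaptation. Second, the results show that the adaptive distributed network matches the performance of the centralized solution. These conclusions highlight another key benefit of cooperation among networked agents: cooperation not only improves performance compared with stand-alone single-agent processing, but also gives the network remarkable resilience to various forms of random failure events and lets it deliver performance as strong as that of batch solutions.

English abstract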

In Part II [3] we carried out a detailed mean-square-error analysis of the performance of asynchronous adaptation and learning over networks under a fairly general model for asynchronous events including random topologies, random link failures, random data arrival times, and agents turning on and off randomly. In this Part III, we compare the performance of synchronous and asynchronous networks. We also compare the performance of decentralized adaptation against centralized stochastic-gradient (batch) solutions. Two interesting conclusions stand out. First, the results establish that the performance of adaptive networks is largely immune to the effect of asynchronous events: the mean and mean-square convergence rates and the asymptotic bias values are not degraded relative to synchronous or centralized implementations. Only the steady-state mean-square-deviation suffers a degradation in the order of $ν$, which represents the small step-size parameters used for adaptation. Second, the results show that the adaptive distributed network matches the performance of the centralized solution. These conclusions highlight another critical benefit of cooperation by networked agents: cooperation does not only enhance performance in comparison to stand-alone single-agent processing, but it also endows the network with remarkable resilience to various forms of random failure events and is able to deliver performance that is as powerful as batch solutions.

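1312.5438 2026-06-04 eess.SY cs.IT cs.LG cs.SY math.IT math.OC version update

Asynchronous Adaptation and Learning over Networks - Part II: Performance Analysis

Xiaochuan Zhao, Ali H. Sayed

AI summary This paper analyzes the mean-square-error performance of asynchronous strategies for distributed optimization and adaptation problems, derives analytical expressions for the mean-square convergence rate and the steady-state mean-square deviation, and shows how the parameters of the asynchronous behavior affect network performance.

Comments 43 pages, 5 figures

Details
AI abstract (translated from Chinese)

In Part I, we introduced a general model for asynchronous events over adaptive networks, including random topologies, random link failures, random data arrival times, and agents turning on and off randomly. We performed a stability analysis and established that the network can still converge in the mean-square-error sense to the desired solution. Once stable behavior is guaranteed, it becomes important to evaluate how fast the iterates converge and how close they get to the optimal solution. This is a demanding task because of the various asynchronous events and because agents influence each other. In this Part II, we carry out a detailed analysis of the mean-square-error performance of asynchronous strategies for solving distributed optimization and adaptation problems over networks. We derive analytical expressions for the mean-square convergence rate and the steady-state mean-square deviation. These expressions reveal how the various parameters of the asynchronous behavior influence network performance. In the process, we establish the interesting conclusion that, even under the influence of asynchronous events, all agents in the adaptive network can still reach an $O(ν^{1 + γ_o'})$ near-agreement, for some $γ_o' > 0$, while approaching the desired solution within $O(ν)$ accuracy, where $ν$ is proportional to the small step-size parameter used for adaptation.

English abstract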

In Part I \cite{Zhao13TSPasync1}, we introduced a fairly general model for asynchronous events over adaptive networks including random topologies, random link failures, random data arrival times, and agents turning on and off randomly. We performed a stability analysis and established the notable fact that the network is still able to converge in the mean-square-error sense to the desired solution. Once stable behavior is guaranteed, it becomes important to evaluate how fast the iterates converge and how close they get to the optimal solution. This is a demanding task due to the various asynchronous events and due to the fact that agents influence each other. In this Part II, we carry out a detailed analysis of the mean-square-error performance of asynchronous strategies for solving distributed optimization and adaptation problems over networks. We derive analytical expressions for the mean-square convergence rate and the steady-state mean-square-deviation. The expressions reveal how the various parameters of the asynchronous behavior influence network performance. In the process, we establish the interesting conclusion that even under the influence of asynchronous events, all agents in the adaptive network can still reach an $O(ν^{1 + γ_o'})$ near-agreement with some $γ_o' > 0$ while approaching the desired solution within $O(ν)$ accuracy, where $ν$ is proportional to the small step-size parameter for adaptation.

1312.5434 2026-06-04 eess.SY cs.IT cs.LG cs.SY math.IT math.OC 版本更新

Asynchronous Adaptation and Learning over Networks --- Part I: Modeling and Stability Analysis

异步适应与学习 over 网络 --- 第一部分:建模与稳定性分析

Xiaochuan Zhao, Ali H. Sayed

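AI summary This paper studies the stability and performance of asynchronous strategies for distributed optimization and adaptation over networks, analyzes how sources of uncertainty such as changing topologies and random link failures affect network performance, and establishes the resilience of asynchronous networks to random failures.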

Comments 40 pages, 6 figures

详情
AI中文摘要

在本文及配套的Part II [2]和Part III [3]中,我们对解决网络上分布式优化和适应问题的异步策略的稳定性和性能进行了详细分析。我们考察了受广泛不确定源影响的异步网络,如拓扑变化、随机链路故障、随机数据到达时间和随机启停的代理。在此模型下,网络中的代理可能以随机方式停止更新解决方案或停止发送或接收信息,且不与其他代理协调。在Part I中,我们建立了确保均方稳定行为的参数分布一阶和二阶矩条件。在Part II中,我们推导了揭示异步行为各种参数如何影响网络性能的表达式。在Part III中,我们比较了异步网络的性能与集中式解决方案和同步网络的性能。一个显著结论是,异步网络的均方误差性能仅退化了O(ν)的顺序,其中ν是一个小步长参数,而收敛速度基本保持不变。这些结果为面对多级随机故障(代理、链路、数据到达和拓扑)的协作网络的显著鲁棒性提供了坚实的依据。

英文摘要

In this work and the supporting Parts II [2] and III [3], we provide a rather detailed analysis of the stability and performance of asynchronous strategies for solving distributed optimization and adaptation problems over networks. We examine asynchronous networks that are subject to fairly general sources of uncertainties, such as changing topologies, random link failures, random data arrival times, and agents turning on and off randomly. Under this model, agents in the network may stop updating their solutions or may stop sending or receiving information in a random manner and without coordination with other agents. We establish in Part I conditions on the first and second-order moments of the relevant parameter distributions to ensure mean-square stable behavior. We derive in Part II expressions that reveal how the various parameters of the asynchronous behavior influence network performance. We compare in Part III the performance of asynchronous networks to the performance of both centralized solutions and synchronous networks. One notable conclusion is that the mean-square-error performance of asynchronous networks shows a degradation only of the order of $O(ν)$, where $ν$ is a small step-size parameter, while the convergence rate remains largely unaltered. The results provide a solid justification for the remarkable resilience of cooperative networks in the face of random failures at multiple levels: agents, links, data arrivals, and topology.

1408.5845 2026-06-04 cs.DC cs.LG cs.SY eess.SY math.OC 版本更新

Analysis of a Reduced-Communication Diffusion LMS Algorithm

减少通信扩散自适应最小均方算法分析

Reza Arablouei, Stefan Werner, Kutluyıl Doğançay, Yih-Fang Huang

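AI summary This paper analyzes the reduced-communication diffusion least mean-square (RC-DLMS) algorithm, examines its trade-off between estimation performance and communication cost, shows that it is stable and convergent in both the mean and mean-square senses, and calculates its steady-state mean-square deviation.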

详情
AI中文摘要

在基于扩散的自适应分布式估计算法中,每个节点通过创建中间估计并结合其闭邻域内的中间估计来估计目标参数向量。我们分析了允许每个节点在每次迭代中仅接收其邻居部分中间估计的减少通信扩散最小均方(RC-DLMS)算法的性能。该算法减轻了网络通信资源的使用,并在估计性能和通信成本之间提供折中。我们证明了RC-DLMS算法在均值和均方意义上都是稳定和收敛的。我们还计算了其理论稳态均方偏差。仿真结果展示了理论与实验之间的良好匹配。

英文摘要

In diffusion-based algorithms for adaptive distributed estimation, each node of an adaptive network estimates a target parameter vector by creating an intermediate estimate and then combining the intermediate estimates available within its closed neighborhood. We analyze the performance of a reduced-communication diffusion least mean-square (RC-DLMS) algorithm, which allows each node to receive the intermediate estimates of only a subset of its neighbors at each iteration. This algorithm eases the usage of network communication resources and delivers a trade-off between estimation performance and communication cost. We show analytically that the RC-DLMS algorithm is stable and convergent in both mean and mean-square senses. We also calculate its theoretical steady-state mean-square deviation. Simulation results demonstrate a good match between theory and experiment.
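
The adapt-then-combine idea described in this abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the RC-DLMS algorithm as analyzed by the authors: the function name `rc_dlms`, the uniform averaging weights, and the rule of picking `m` neighbors uniformly at random at each iteration are all hypothetical choices made for the example.

```python
import numpy as np

def rc_dlms(neighbors, U, d, mu, m, rng):
    """Diffusion-LMS-style sketch in which each node combines only m neighbors.

    neighbors: list of neighbor-index lists, one per node (assumed topology)
    U: regressors with shape (T, N, L); d: measurements with shape (T, N)
    mu: step size; m: number of neighbor estimates received per iteration
    """
    T, N, L = U.shape
    w = np.zeros((N, L))
    for t in range(T):
        # Adaptation step: every node runs one LMS update on its own data.
        err = d[t] - np.einsum("nl,nl->n", U[t], w)
        psi = w + mu * err[:, None] * U[t]
        # Combination step: average own intermediate estimate with a random
        # subset of neighbors (uniform weights are an assumption of this sketch).
        w_new = np.empty_like(w)
        for k in range(N):
            size = min(m, len(neighbors[k]))
            chosen = rng.choice(neighbors[k], size=size, replace=False)
            w_new[k] = (psi[k] + psi[chosen].sum(axis=0)) / (1 + size)
        w = w_new
    return w

# Noise-free toy problem on a ring of five nodes.
rng = np.random.default_rng(0)
N, L, T = 5, 3, 2000
w_true = np.array([1.0, -2.0, 0.5])
neighbors = [[(k - 1) % N, (k + 1) % N] for k in range(N)]
U = rng.standard_normal((T, N, L))
d = U @ w_true
w_hat = rc_dlms(neighbors, U, d, mu=0.05, m=1, rng=rng)
```

With `m=1` each node receives half as many intermediate estimates per iteration as it would with both ring neighbors, which is the kind of communication saving the abstract refers to.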

1406.5429 2026-06-04 math.NA cs.CV cs.LG cs.NA math.OC 版本更新

Playing with Duality: An Overview of Recent Primal-Dual Approaches for Solving Large-Scale Optimization Problems

双模互动:解决大规模优化问题的最新对偶方法综述

Nikos Komodakis, Jean-Christophe Pesquet

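AI summary This paper surveys recent primal-dual approaches for solving large-scale optimization problems, discusses their use in signal processing, computer vision, and machine learning, and highlights the benefits of primal-dual algorithms for convex and discrete problems.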

详情
AI中文摘要

优化方法在信号/图像处理、计算机视觉和机器学习问题中处于核心地位。长期以来,人们认识到研究优化问题的对偶形式可能显著简化问题的求解。然而,将原问题和对偶问题联合考虑的高效策略是近期的新思想,近年来在凸分析、离散优化、并行处理和非光滑优化领域产生了许多重要贡献,尤其强调稀疏性问题。本文旨在阐述对偶方法的原理,并概述不同背景下提出的数值方法。我们展示了对偶算法在求解大规模凸优化问题和离散问题中的优势,并通过各种应用示例说明其实用性。

英文摘要

Optimization methods are at the core of many problems in signal/image processing, computer vision, and machine learning. For a long time, it has been recognized that looking at the dual of an optimization problem may drastically simplify its solution. Deriving efficient strategies which jointly bring into play the primal and the dual problems is, however, a more recent idea which has generated many important new contributions in recent years. These novel developments are grounded on recent advances in convex analysis, discrete optimization, parallel processing, and non-smooth optimization with emphasis on sparsity issues. In this paper, we aim at presenting the principles of primal-dual approaches, while giving an overview of numerical methods which have been proposed in different contexts. We show the benefits which can be drawn from primal-dual algorithms both for solving large-scale convex optimization problems and discrete ones, and we provide various application examples to illustrate their usefulness.

1411.6081 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

PU Learning for Matrix Completion

矩阵补全的PU学习

Cho-Jui Hsieh, Nagarajan Natarajan, Inderjit S. Dhillon

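AI summary This paper studies matrix completion from one-bit observations that contain only ones, using positive-unlabeled (PU) learning; it proposes two methods and provides error bounds and sample complexity results.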

详情
AI中文摘要

本文考虑了当观测值为某些基础矩阵M的一位测量值时的矩阵补全问题,特别是观测样本仅包含1而无0的情况。此问题源于推荐系统和社会网络等现代应用,其中仅观察到

英文摘要

In this paper, we consider the matrix completion problem when the observations are one-bit measurements of some underlying matrix M, and in particular the observed samples consist only of ones and no zeros. This problem is motivated by modern applications such as recommender systems and social networks where only "likes" or "friendships" are observed. The problem of learning from only positive and unlabeled examples, called PU (positive-unlabeled) learning, has been studied in the context of binary classification. We consider the PU matrix completion problem, where an underlying real-valued matrix M is first quantized to generate one-bit observations and then a subset of positive entries is revealed. Under the assumption that M has bounded nuclear norm, we provide recovery guarantees for two different observation models: 1) M parameterizes a distribution that generates a binary matrix, 2) M is thresholded to obtain a binary matrix. For the first case, we propose a "shifted matrix completion" method that recovers M using only a subset of indices corresponding to ones, while for the second case, we propose a "biased matrix completion" method that recovers the (thresholded) binary matrix. Both methods yield strong error bounds: if M is n by n, the Frobenius error is bounded as $O(1/((1-\rho)n))$, where $1-\rho$ denotes the fraction of ones observed. This implies a sample complexity of $O(n \log n)$ ones to achieve a small error, when M is dense and n is large. We extend our methods and guarantees to the inductive matrix completion problem, where rows and columns of M have associated features. We provide efficient and scalable optimization procedures for both methods and demonstrate the effectiveness of the proposed methods for link prediction (on real-world networks consisting of over 2 million nodes and 90 million links) and semi-supervised clustering tasks.

1408.4551 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Dimensionality Reduction of Affine Variational Inequalities Using Random Projections

使用随机投影进行仿射变分不等式的降维

Bharat Prabhakar, Ankur A. Kulkarni

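AI summary This paper proposes a randomized algorithm, based on the Johnson-Lindenstrauss lemma, that obtains an approximate solution of an affine variational inequality by solving a lower-dimensional one, and reports good approximations at low dimensions and time savings when computing an exact solution.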

Comments Submitted to Mathematical Programming Series A. Edited some typos from the previous version. Also added a bound on the lower dimension

详情
AI中文摘要

我们提出了一种方法,用于降维处理定义在紧致可行区域上的仿射变分不等式(AVI)。该方法基于Johnson-Lindenstrauss引理,是一种随机算法,通过在随机子空间上投影原始AVI,以高概率产生近似解。算法允许根据所需的近似质量选择降维维度,并可作为生成接近解的初始点的精确算法子程序。通过标准求解器求解降维后的AVI,并通过低成本过程从该解恢复原始AVI的近似解。数值实验验证了理论结果,并证明该算法在低维情况下提供良好的近似解,并在精确解计算中节省时间。

英文摘要

We present a method for dimensionality reduction of an affine variational inequality (AVI) defined over a compact feasible region. Centered around the Johnson-Lindenstrauss lemma, our method is a randomized algorithm that produces with high probability an approximate solution for the given AVI by solving a lower-dimensional AVI. The algorithm allows the lower dimension to be chosen based on the quality of approximation desired. The algorithm can also be used as a subroutine in an exact algorithm for generating an initial point close to the solution. The lower-dimensional AVI is obtained by appropriately projecting the original AVI on a randomly chosen subspace. The lower-dimensional AVI is solved using standard solvers and from this solution an approximate solution to the original AVI is recovered through an inexpensive process. Our numerical experiments corroborate the theoretical results and validate that the algorithm provides a good approximation at low dimensions and substantial savings in time for an exact solution.
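
The random-projection ingredient behind this abstract can be illustrated on its own. The sketch below shows only a Gaussian Johnson-Lindenstrauss projection that approximately preserves vector norms; it does not construct or solve an AVI, and the helper name `random_projection` and the chosen dimensions are assumptions made for the example.

```python
import numpy as np

def random_projection(X, k, rng):
    """Map the rows of X from d dimensions to k dimensions with a Gaussian matrix."""
    d = X.shape[1]
    # Entries are N(0, 1/k), so squared norms are preserved in expectation.
    R = rng.standard_normal((d, k)) / np.sqrt(k)
    return X @ R

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 1000))   # five points in 1000 dimensions
Y = random_projection(X, 500, rng)   # the same points in 500 dimensions
# Ratio of projected norm to original norm for each point; values near 1
# indicate that the geometry is approximately preserved.
ratios = np.linalg.norm(Y, axis=1) / np.linalg.norm(X, axis=1)
```

Lowering `k` makes the projection cheaper but widens the spread of `ratios`, which mirrors the trade-off between the lower dimension and the approximation quality mentioned in the abstract.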

1411.1087 2026-06-04 math.NA cs.DS cs.IT cs.LG cs.NA math.IT stat.ML 版本更新

Fast Exact Matrix Completion with Finite Samples

快速精确矩阵补全与有限样本

Prateek Jain, Praneeth Netrapalli

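AI summary This paper proposes a fast iterative algorithm that achieves exact matrix completion from $O(nr^5 \log^3 n)$ observed entries with run time $O(nr^7 \log^3 n \log 1/ε)$; the authors describe it as the first near-linear-time method whose sample complexity is independent of the accuracy.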

详情
AI中文摘要

矩阵补全是通过观测少量矩阵条目恢复低秩矩阵的问题。近期多项工作提出了快速非凸优化迭代算法,但样本复杂度在秩、条件数和所需精度的依赖上仍不最优。本文提出一种快速迭代算法,通过观察O(nr^5 log^3 n)个条目解决矩阵补全问题,运行时间为O(nr^7 log^3 n log 1/ε),近线性于矩阵维度。本文算法基于已知的投影梯度下降方法,投影到非凸的低秩矩阵集合。两个关键思想:1) 使用ℓ∞范数势函数而非谱范数,提供新的扰动界方法;2) 扩展Davis-Kahan定理以获得具有良好特征间隙矩阵的最佳低秩近似扰动界。这些思想可能具有独立价值。

英文摘要

Matrix completion is the problem of recovering a low rank matrix by observing a small fraction of its entries. A series of recent works [KOM12,JNS13,HW14] have proposed fast non-convex optimization based iterative algorithms to solve this problem. However, the sample complexity in all these results is sub-optimal in its dependence on the rank, condition number and the desired accuracy. In this paper, we present a fast iterative algorithm that solves the matrix completion problem by observing $O(nr^5 \log^3 n)$ entries, which is independent of the condition number and the desired accuracy. The run time of our algorithm is $O(nr^7\log^3 n\log 1/ε)$ which is near linear in the dimension of the matrix. To the best of our knowledge, this is the first near linear time algorithm for exact matrix completion with finite sample complexity (i.e. independent of $ε$). Our algorithm is based on a well known projected gradient descent method, where the projection is onto the (non-convex) set of low rank matrices. There are two key ideas in our result: 1) our argument is based on a $\ell_{\infty}$ norm potential function (as opposed to the spectral norm) and provides a novel way to obtain perturbation bounds for it. 2) we prove and use a natural extension of the Davis-Kahan theorem to obtain perturbation bounds on the best low rank approximation of matrices with good eigen-gap. Both of these ideas may be of independent interest.
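
The abstract builds on projected gradient descent with projection onto the set of low-rank matrices. The sketch below shows that generic building block only, not the authors' algorithm or its analysis; the names `project_rank` and `pgd_complete`, the step size `1/p`, and the fixed iteration count are assumptions made for the example.

```python
import numpy as np

def project_rank(X, r):
    """Best rank-r approximation of X via a truncated SVD."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r, :]

def pgd_complete(M_obs, mask, r, p, iters=200):
    """Projected gradient descent for matrix completion.

    M_obs: observed matrix with zeros at unobserved entries
    mask: 0/1 array marking observed entries; p: fraction of entries observed
    """
    X = np.zeros_like(M_obs)
    for _ in range(iters):
        # Gradient of 0.5 * ||mask * (X - M)||_F^2 with respect to X.
        grad = mask * (X - M_obs)
        # Gradient step (step size 1/p is an assumption), then project to rank r.
        X = project_rank(X - grad / p, r)
    return X

# Sanity check with every entry observed (p = 1): the rank-2 matrix is recovered.
rng = np.random.default_rng(0)
M = rng.standard_normal((20, 2)) @ rng.standard_normal((2, 15))
mask = np.ones_like(M)
X = pgd_complete(mask * M, mask, r=2, p=1.0, iters=5)
```

For a partially observed matrix one would pass a 0/1 `mask` and set `p = mask.mean()`; how many observed entries are needed for recovery is exactly the sample-complexity question the abstract addresses.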

1406.6474 2026-06-04 math.OC cs.DM cs.DS cs.LG cs.NA math.NA 版本更新

On the Convergence Rate of Decomposable Submodular Function Minimization

可分解亚模函数最小化的收敛速度研究

Robert Nishihara, Stefanie Jegelka, Michael I. Jordan

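AI summary This paper studies the convergence rate of an algorithm for decomposable submodular function minimization and, using the geometry of submodular polyhedra and spectral graph theory, gives upper and lower bounds on the rate.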

Comments 17 pages, 3 figures

详情
Journal ref
Neural Information Processing Systems 27, 2014
AI中文摘要

亚模函数描述了机器学习、信号处理和计算机视觉中的多种离散问题。然而,最小化亚模函数存在算法挑战。最近的工作提出了一种易于使用且可并行化的算法,用于最小化可分解为

英文摘要

Submodular functions describe a variety of discrete problems in machine learning, signal processing, and computer vision. However, minimizing submodular functions poses a number of algorithmic challenges. Recent work introduced an easy-to-use, parallelizable algorithm for minimizing submodular functions that decompose as the sum of "simple" submodular functions. Empirically, this algorithm performs extremely well, but no theoretical analysis was given. In this paper, we show that the algorithm converges linearly, and we provide upper and lower bounds on the rate of convergence. Our proof relies on the geometry of submodular polyhedra and draws on results from spectral graph theory.

1411.0024 2026-06-04 math.OC cs.LG cs.SY eess.SY stat.ML 版本更新

Robust sketching for multiple square-root LASSO problems

多重平方根LASSO问题的鲁棒抽样

Vu Pham, Laurent El Ghaoui, Arturo Fernandez

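AI summary This paper proposes a robust framework that uses low-rank approximations to solve multiple similar problems efficiently, reducing computational effort without sacrificing, and sometimes improving, statistical performance.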

详情
AI中文摘要

许多学习任务,如交叉验证、参数搜索或留一分析,涉及多个相似问题实例,每个实例共享大量学习数据。我们介绍了一种基于学习数据抽样的鲁棒框架,用于解决多个平方根LASSO问题。我们的方法通过低秩近似大幅减少计算工作量,将观测数从m减少到k,而不牺牲甚至提升了统计性能。理论分析和在合成及真实数据上的数值实验展示了该方法在大规模应用中的效率。

英文摘要

Many learning tasks, such as cross-validation, parameter search, or leave-one-out analysis, involve multiple instances of similar problems, each instance sharing a large part of learning data with the others. We introduce a robust framework for solving multiple square-root LASSO problems, based on a sketch of the learning data that uses low-rank approximations. Our approach allows a dramatic reduction in computational effort, in effect reducing the number of observations from $m$ (the number of observations to start with) to $k$ (the number of singular values retained in the low-rank model), while not sacrificing---sometimes even improving---the statistical performance. Theoretical analysis, as well as numerical experiments on both synthetic and real data, illustrate the efficiency of the method in large scale applications.

1406.4905 2026-06-04 cs.LG cs.RO cs.SY eess.SY stat.ML 版本更新

Variational Gaussian Process State-Space Models

变分高斯过程状态空间模型

Roger Frigola, Yutian Chen, Carl E. Rasmussen

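AI summary This paper proposes a variational Bayesian learning procedure based on sparse Gaussian processes for efficiently learning nonlinear state-space models; it yields a tractable posterior over nonlinear dynamical systems and, compared with conventional parametric models, allows model capacity to be traded off against computational cost while avoiding overfitting.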

详情
Journal ref
R. Frigola, Y. Chen and C. E. Rasmussen. Variational Gaussian Process State-Space Models, in Advances in Neural Information Processing Systems (NIPS), 2014
AI中文摘要

状态空间模型在科学和工程的不同领域中已成功应用超过五十年。我们提出了一种基于稀疏高斯过程的高效变分贝叶斯学习非线性状态空间模型的程序。学习结果是对非线性动力系统可计算的后验。与传统参数模型相比,我们提供了在避免过拟合的同时,可以方便地权衡模型容量和计算成本的可能性。我们的主要算法使用了结合变分贝叶斯和顺序蒙特卡洛的混合推断方法。我们还提出了随机变分推断和在线学习方法,以实现对长时间序列的快速学习。

英文摘要

State-space models have been successfully used for more than fifty years in different areas of science and engineering. We present a procedure for efficient variational Bayesian learning of nonlinear state-space models based on sparse Gaussian processes. The result of learning is a tractable posterior over nonlinear dynamical systems. In comparison to conventional parametric models, we offer the possibility to straightforwardly trade off model capacity and computational cost whilst avoiding overfitting. Our main algorithm uses a hybrid inference approach combining variational Bayes and sequential Monte Carlo. We also present stochastic variational inference and online learning approaches for fast learning with long time series.

1410.7550 2026-06-04 stat.ML cs.LG cs.NE cs.SY eess.SY 版本更新

Learning deep dynamical models from image pixels

从图像像素学习深度动态模型

Niklas Wahlström, Thomas B. Schön, Marc Peter Deisenroth

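AI summary This paper combines deep learning with system identification to learn, from high-dimensional image pixels, a low-dimensional embedding and a predictive transition model of a nonlinear dynamical system.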

Comments 10 pages, 11 figures

详情
AI中文摘要

建模动态系统在许多领域都很重要,例如控制、机器人学或神经技术。通常这些系统的状态无法直接观测,只能通过噪声且可能高维的观测获得。在这种情况下,系统识别,即在潜在空间中找到测量映射和转移映射(系统动态)具有挑战性。对于线性系统动态和测量映射,有高效的系统识别解决方案。然而在实际应用中,线性假设不成立,需要非线性系统识别技术。如果观测是高维的(例如图像),则非线性系统识别本质上困难。为了解决从高维观测中进行非线性系统识别的问题,我们结合了深度学习和系统识别的最新进展。特别是,我们通过深度自编码器联合学习观测的低维嵌入,并在该低维空间中学习预测转移模型。我们证明我们的模型能够仅从像素信息学习良好的动态系统预测模型。

英文摘要

Modeling dynamical systems is important in many disciplines, e.g., control, robotics, or neurotechnology. Commonly the state of these systems is not directly observed, but only available through noisy and potentially high-dimensional observations. In these cases, system identification, i.e., finding the measurement mapping and the transition mapping (system dynamics) in latent space, can be challenging. For linear system dynamics and measurement mappings, efficient solutions for system identification are available. However, in practical applications, the linearity assumption does not hold, requiring non-linear system identification techniques. If additionally the observations are high-dimensional (e.g., images), non-linear system identification is inherently hard. To address the problem of non-linear system identification from high-dimensional observations, we combine recent advances in deep learning and system identification. In particular, we jointly learn a low-dimensional embedding of the observation by means of deep auto-encoders and a predictive transition model in this low-dimensional space. We demonstrate that our model enables learning good predictive models of dynamical systems from pixel information only.

1410.3463 2026-06-04 cs.OS cs.LG cs.SY eess.SY 版本更新

Mining Block I/O Traces for Cache Preloading with Sparse Temporal Non-parametric Mixture of Multivariate Poisson

通过稀疏时间非参数混合多维泊松模型挖掘块I/O轨迹以进行缓存预加载

Lavanya Sita Tekumalla, Chiranjib Bhattacharyya

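AI summary This paper proposes a sparse temporal non-parametric mixture of multivariate Poisson model for mining long-range patterns in block I/O traces for cache preloading; sparse modeling makes inference more efficient, and experiments on benchmark traces show a marked improvement in hit rates.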

详情
AI中文摘要

现有的存储领域缓存策略虽然能有效利用短程时空模式,但无法利用长程模式提高命中率。为此,我们研究了新的贝叶斯非参数建模(BNP)技术,用于捕捉缓存预加载的长程相关性。此类轨迹由一系列内存访问组成,可聚合为高维稀疏相关计数向量序列。虽然已有多种先进的BNP算法用于聚类及其时间扩展用于预测,但尚无研究将这些方法应用于相关计数向量。我们的第一项贡献是提出基于DP的多维泊松混合模型(DP-MMVP)及其时间扩展(HMM-DP-MMVP),以捕捉多维计数数据的完整协方差结构。然而,对计数向量建模完整协方差结构计算成本高,特别是高维数据。因此,我们利用计数向量的稀疏性,并作为主要贡献,引入稀疏DP混合多维泊松(Sparse-DP-MMVP),推广我们的DP-MMVP混合模型,从而实现更高效的推断。我们随后讨论了模型的时间扩展用于缓存预加载。我们首次尝试挖掘历史数据,以捕捉存储轨迹中的长程模式用于缓存预加载。实验表明,在基准轨迹上命中率显著提高,为存储领域进一步研究减少延迟的数据挖掘技术奠定了基础。

英文摘要

Existing caching strategies in the storage domain, though well suited to exploit short-range spatio-temporal patterns, are unable to leverage long-range motifs for improving hit rates. Motivated by this, we investigate novel Bayesian non-parametric (BNP) modeling techniques for count vectors, to capture long-range correlations for cache preloading by mining block I/O traces. Such traces consist of a sequence of memory accesses that can be aggregated into high-dimensional sparse correlated count vector sequences. While there are several state-of-the-art BNP algorithms for clustering and their temporal extensions for prediction, there has been no work on exploring these for correlated count vectors. Our first contribution addresses this gap by proposing a DP-based mixture model of multivariate Poisson (DP-MMVP) and its temporal extension (HMM-DP-MMVP) that captures the full covariance structure of multivariate count data. However, modeling the full covariance structure for count vectors is computationally expensive, particularly for high-dimensional data. Hence, we exploit sparsity in our count vectors and, as our main contribution, introduce the Sparse DP mixture of multivariate Poisson (Sparse-DP-MMVP), generalizing our DP-MMVP mixture model and also leading to more efficient inference. We then discuss a temporal extension to our model for cache preloading. We take the first step towards mining historical data to capture long-range patterns in storage traces for cache preloading. Experimentally, we show a dramatic improvement in hit rates on benchmark traces and lay the groundwork for further research in the storage domain to reduce latencies using data mining techniques to capture long-range motifs.

1410.2786 2026-06-04 cs.LG cs.NA math.NA 版本更新

New SVD based initialization strategy for Non-negative Matrix Factorization

基于SVD的非负矩阵分解的新型初始化策略

Hanli Qiao

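AI summary This paper proposes an SVD-based initialization method for non-negative matrix factorization that determines the rank from the number of main components and initializes the algorithm from the SVD factors; experiments indicate that it outperforms NNDSVD.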

详情
AI中文摘要

本文针对非负矩阵分解(NMF)中的两个问题进行了研究:选择合适的分解秩和提供良好的初始化方法。本文旨在利用奇异值分解(SVD)解决这些问题。首先,我们通过提取主成分来确定秩,这种方法受[1, 2]的启发。其次,我们利用奇异值及其向量来初始化NMF算法。2008年,Boutsidis和Gollopoulos[3]提出了名为NNDSVD的方法,用于增强NMF算法的初始化。他们提取了单位矩阵{C(j)}k j=1的正部分及其奇异三元组信息。该策略旨在利用正部分来处理奇异向量中的负元素,但实验表明,即使将负元素替换为绝对值,也能比NNDSVD获得更好的结果。因此,我们提出了一种基于SVD的NMF初始化方法(SVD-NMF)。在ORL和YALE两个人脸数据库上的数值实验表明,我们的方法优于NNDSVD。

英文摘要

Two problems need to be dealt with in Non-negative Matrix Factorization (NMF): choosing a suitable rank for the factorization and providing a good initialization method for NMF algorithms. This paper aims to solve these two problems using Singular Value Decomposition (SVD). First, we take the number of main components as the rank; this method is inspired by [1, 2]. Second, we use the singular values and their vectors to initialize the NMF algorithm. In 2008, Boutsidis and Gallopoulos [3] provided a method called NNDSVD to enhance the initialization of NMF algorithms. They extracted the positive section and the respective singular triplet information of the unit-rank matrices $\{C^{(j)}\}_{j=1}^{k}$, which were obtained from singular vector pairs. This strategy uses the positive section to cope with negative elements of the singular vectors, but in experiments we found that even replacing negative elements by their absolute values could give better results than NNDSVD. Hence, we give another SVD-based method to initialize NMF algorithms (SVD-NMF). Numerical experiments on two face databases, ORL and YALE [16, 17], show that our method is better than NNDSVD.
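
The two steps named in this abstract, picking a rank from the SVD and initializing from the absolute values of the singular vectors, can be sketched as follows. The abstract does not give the exact rules, so the details here are assumptions: `choose_rank` uses a hypothetical 90% cumulative-singular-value threshold, and `svd_abs_init` splits each singular value evenly between the two factors.

```python
import numpy as np

def choose_rank(A, energy=0.9):
    """Smallest k whose leading singular values hold the given share of the total.

    The 90% threshold is an illustrative assumption, not the paper's rule.
    """
    s = np.linalg.svd(A, compute_uv=False)
    cum = np.cumsum(s) / np.sum(s)
    return int(np.searchsorted(cum, energy) + 1)

def svd_abs_init(A, k):
    """Nonnegative starting factors W (m x k) and H (k x n) from a truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    root = np.sqrt(s[:k])
    # Negative entries of the singular vectors are replaced by absolute values.
    W = np.abs(U[:, :k]) * root
    H = root[:, None] * np.abs(Vt[:k, :])
    return W, H

rng = np.random.default_rng(0)
A = rng.random((6, 5))          # small nonnegative data matrix
k = choose_rank(A)
W, H = svd_abs_init(A, k)
```

The resulting `W` and `H` are nonnegative by construction and can be handed to any NMF update rule as a starting point.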

1410.0719 2026-06-04 math.NA cs.CV cs.IT cs.LG cs.NA math.IT math.OC math.ST stat.TH 版本更新

Proceedings of the second "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST'14)

第二届‘国际稀疏模型与技术相互作用’研讨会论文集(iTWIST'14)

L. Jacques, C. De Vleeschouwer, Y. Boursier, P. Sudhakar, C. De Mol, A. Pizurica, S. Anthoine, P. Vandergheynst, P. Frossard, C. Bilen, S. Kitic, N. Bertin, R. Gribonval, N. Boumal, B. Mishra, P. -A. Absil, R. Sepulchre, S. Bundervoet, C. Schretter, A. Dooms, P. Schelkens, O. Chabiron, F. Malgouyres, J. -Y. Tourneret, N. Dobigeon, P. Chainais, C. Richard, B. Cornelis, I. Daubechies, D. Dunson, M. Dankova, P. Rajmic, K. Degraux, V. Cambareri, B. Geelen, G. Lafruit, G. Setti, J. -F. Determe, J. Louveaux, F. Horlin, A. Drémeau, P. Heas, C. Herzet, V. Duval, G. Peyré, A. Fawzi, M. Davies, N. Gillis, S. A. Vavasis, C. Soussen, L. Le Magoarou, J. Liang, J. Fadili, A. Liutkus, D. Martina, S. Gigan, L. Daudet, M. Maggioni, S. Minsker, N. Strawn, C. Mory, F. Ngole, J. -L. Starck, I. Loris, S. Vaiter, M. Golbabaee, D. Vukobratovic

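AI summary iTWIST'14 focused on the theory and applications of the sparsity paradigm and fostered international collaboration through talks, posters, and discussions, covering themes such as sparsity-driven data sensing, unions of low-dimensional subspaces, and nonlinear inverse problems.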

Comments 69 pages, 24 extended abstracts, iTWIST'14 website: http://sites.google.com/site/itwist14

详情
AI中文摘要

iTWIST研讨会旨在通过口头报告、海报和自由讨论促进国际科学团队合作。第二届iTWIST'14于2014年8月27日至29日在比利时纳穆尔举行,吸引了70名国际参与者,包含9场特邀讲座、10场口头报告和14个海报,主题涵盖稀疏范式的理论、应用与推广,包括稀疏数据传感、低维子空间联合、非线性逆问题等。

英文摘要

The implicit objective of the biennial "international Traveling Workshop on Interactions between Sparse models and Technology" (iTWIST) is to foster collaboration between international scientific teams by disseminating ideas through both specific oral/poster presentations and free discussions. For its second edition, the iTWIST workshop took place in the medieval and picturesque town of Namur in Belgium, from Wednesday August 27th till Friday August 29th, 2014. The workshop was conveniently located in "The Arsenal" building within walking distance of both hotels and town center. iTWIST'14 gathered about 70 international participants and featured 9 invited talks, 10 oral presentations, and 14 posters on the following themes, all related to the theory, application and generalization of the "sparsity paradigm": Sparsity-driven data sensing and processing; Union of low dimensional subspaces; Beyond linear and convex inverse problem; Matrix/manifold/graph sensing/processing; Blind inverse problems and dictionary learning; Sparsity and computational neuroscience; Information theory, geometry and randomness; Complexity/accuracy tradeoffs in numerical methods; Sparsity? What's next?; Sparse machine learning and inference.

1401.1549 2026-06-04 cs.LG cs.AI cs.SY eess.SY 版本更新

Optimal Demand Response Using Device Based Reinforcement Learning

基于设备的强化学习的最优需求响应

Zheng Wen, Daniel O'Neill, Hamid Reza Maei

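AI summary This paper proposes a novel Energy Management System (EMS) formulation that casts the demand response problem as a reinforcement learning problem and solves the rescheduling problem approximately by decomposing it over device clusters, without explicitly modeling user dissatisfaction and with improved computational efficiency.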

详情
AI中文摘要

本文提出了一种新型EMS框架,将需求响应问题建模为强化学习问题,通过设备集群分解解决调度问题,无需显式建模用户不满,提升计算效率。

英文摘要

Demand response (DR) for residential and small commercial buildings is estimated to account for as much as 65% of the total energy savings potential of DR, and previous work shows that a fully automated Energy Management System (EMS) is a necessary prerequisite to DR in these areas. In this paper, we propose a novel EMS formulation for DR problems in these sectors. Specifically, we formulate a fully automated EMS's rescheduling problem as a reinforcement learning (RL) problem, and argue that this RL problem can be approximately solved by decomposing it over device clusters. Compared with existing formulations, our new formulation (1) does not require explicitly modeling the user's dissatisfaction on job rescheduling, (2) enables the EMS to self-initiate jobs, (3) allows the user to initiate more flexible requests and (4) has a computational complexity linear in the number of devices. We also demonstrate the simulation results of applying Q-learning, one of the most popular and classical RL algorithms, to a representative example.

1409.8327 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Bayesian and regularization approaches to multivariable linear system identification: the role of rank penalties

基于贝叶斯和正则化方法的多变量线性系统辨识:秩惩罚的作用

Giulia Prando, Alessandro Chiuso, Gianluigi Pillonetto

AI总结 本文提出一种基于ℓ2正则化和秩惩罚的冲击响应估计器,用于处理多输入多输出系统中输入输出通道的耦合问题,通过优化边际似然估计超参数,实现闭式解。

Comments to appear in IEEE Conference on Decision and Control, 2014

详情
AI中文摘要

最近线性系统辨识的发展提出了非参数方法,依赖正则化策略处理偏差/方差权衡。本文引入一种冲击响应估计器,采用ℓ2型正则化和基于log-det启发式的秩惩罚作为秩函数的平滑近似。这允许考虑估计冲击响应的不同属性(如平滑性和稳定性),同时惩罚高复杂度模型。此外,它允许考虑并强制多输入多输出系统中不同输入输出通道的耦合。根据贝叶斯范式,参数定义了两个正则化项的相对权重以及秩惩罚的结构,通过优化边际似然估计。一旦这些超参数被估计,冲击响应估计即可用闭式形式表示。实验表明,所提方法优于仅依赖经典ℓ2正则化或原子和核范数的估计器。

英文摘要

Recent developments in linear system identification have proposed the use of nonparametric methods, relying on regularization strategies, to handle the so-called bias/variance trade-off. This paper introduces an impulse response estimator which relies on an $\ell_2$-type regularization including a rank penalty derived using the log-det heuristic as a smooth approximation to the rank function. This makes it possible to account for different properties of the estimated impulse response (e.g. smoothness and stability) while also penalizing high-complexity models. It also makes it possible to account for and enforce coupling between different input-output channels in MIMO systems. According to the Bayesian paradigm, the parameters defining the relative weight of the two regularization terms as well as the structure of the rank penalty are estimated by optimizing the marginal likelihood. Once these hyperparameters have been estimated, the impulse response estimate is available in closed form. Experiments show that the proposed method is superior to the estimator relying on the "classic" $\ell_2$-regularization alone as well as those based on atomic and nuclear norms.

1409.8276 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

A Bayesian Tensor Factorization Model via Variational Inference for Link Prediction

通过变分推断的贝叶斯张量分解模型用于链接预测

Beyza Ermis, A. Taylan Cemgil

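AI summary This paper proposes tensor factorization models with variational Bayesian inference for link prediction, which outperform maximum-likelihood approaches on several real-world datasets.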

Comments arXiv admin note: substantial text overlap with arXiv:1409.8083

详情
AI中文摘要

概率性张量分解方法旨在通过设定低秩约束从不完整数据中提取有意义的结构。最近,变分贝叶斯(VB)推断技术已成功应用于大规模模型。本文提出了通过VB进行完整贝叶斯推断的单张量和耦合张量分解模型。我们的方法即使在非常大的模型上也能运行,并且易于实现。在多个现实世界数据集上,它在缺失链接预测问题上的预测性能优于基于最大似然的方法。

英文摘要

Probabilistic approaches for tensor factorization aim to extract meaningful structure from incomplete data by postulating low-rank constraints. Recently, variational Bayesian (VB) inference techniques have successfully been applied to large-scale models. This paper presents full Bayesian inference via VB on both single and coupled tensor factorization models. Our method can be run even for very large models and is easily implemented. It exhibits better prediction performance than existing approaches based on maximum likelihood on several real-world datasets for the missing link prediction problem.

1409.5671 2026-06-04 cs.AI cs.CE cs.LG cs.LO cs.SY eess.SY 版本更新

A Formal Methods Approach to Pattern Synthesis in Reaction Diffusion Systems

反应扩散系统模式合成的正式方法方法

Ebru Aydin Gol, Ezio Bartocci, Calin Belta

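AI summary This paper proposes a technique for pattern detection and generation based on a spatial superposition logic, combining model checking with particle swarm optimization to synthesize parameters that lead to desired patterns in reaction-diffusion systems.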

详情
AI中文摘要

我们提出了一种技术,用于检测和生成局部相互作用动态系统网络中的模式。我们的方法核心是一种新的空间叠加逻辑,其语义定义在分区图像的四叉树上。我们展示了该逻辑中的公式可以从正例和负例中高效学习。我们还证明,模式检测,作为模型检验算法实现,对于与学习集不同的测试数据集表现良好。我们为该逻辑定义了定量语义,并将模型检验算法与粒子群优化整合到计算框架中,用于合成导致反应扩散系统中所需模式的参数。

英文摘要

We propose a technique to detect and generate patterns in a network of locally interacting dynamical systems. Central to our approach is a novel spatial superposition logic, whose semantics is defined over the quad-tree of a partitioned image. We show that formulas in this logic can be efficiently learned from positive and negative examples of several types of patterns. We also demonstrate that pattern detection, which is implemented as a model checking algorithm, performs very well for test data sets different from the learning sets. We define a quantitative semantics for the logic and integrate the model checking algorithm with particle swarm optimization in a computational framework for synthesis of parameters leading to desired patterns in reaction-diffusion systems.

1409.4018 2026-06-04 cs.LG cs.NA math.NA 版本更新

EquiNMF: Graph Regularized Multiview Nonnegative Matrix Factorization

EquiNMF:图正则化多视图非负矩阵分解

Daniel Hidru, Anna Goldenberg

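AI summary This paper proposes EquiNMF, a graph-regularized multiview nonnegative matrix factorization method for data integration whose parameters are set automatically; experiments show that it outperforms other single-view and multi-view methods.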

详情
AI中文摘要

非负矩阵分解(NMF)方法已在广泛的实际聚类应用中证明其强大。整合相同对象/主体的多种测量类型有助于深入理解数据并优化聚类。我们开发了一种名为EquiNMF的新方法,该方法基于图正则化多视图NMF,参数完全自动化设置,适用于无监督应用。我们对多视图成像数据进行了广泛的实验,证明EquiNMF在拼接数据的单视图NMF方法和不同正则化类型的多视图NMF方法中表现一致优异。

英文摘要

Nonnegative matrix factorization (NMF) methods have proved to be powerful across a wide range of real-world clustering applications. Integrating multiple types of measurements for the same objects/subjects allows us to gain a deeper understanding of the data and refine the clustering. We have developed a novel graph-regularized multiview NMF-based method for data integration called EquiNMF. The parameters for our method are set in a completely automated, data-specific, unsupervised fashion, a highly desirable property in real-world applications. We performed extensive and comprehensive experiments on multiview imaging data. We show that EquiNMF consistently outperforms other single-view NMF methods used on concatenated data and multi-view NMF methods with different types of regularization.

1409.2579 2026-06-04 math.NA cs.CV cs.LG cs.NA 版本更新

A theoretical contribution to the fast implementation of null linear discriminant analysis method using random matrix multiplication with scatter matrices

对利用散射矩阵进行随机矩阵乘法实现null线性判别分析方法的理论贡献

Ting-ting Feng, Gang Wu

AI总结 本文提出一种理论方法,通过合理选择随机矩阵保证null LDA的列满秩,避免信息丢失,提升计算效率。

Comments 7 pages

详情
AI中文摘要

null线性判别分析方法是一种有竞争力的降维方法,但其实现计算成本高。最近提出了一种利用随机矩阵乘法与散射矩阵的快速实现方法,但若随机矩阵任选,则导向矩阵可能秩不足,导致有用判别信息丢失。本文研究如何合理选择随机矩阵以满足null LDA的两个理论准则,给出了保证导向矩阵列满秩的必要且充分条件,并描述了该条件的几何特性。

英文摘要

The null linear discriminant analysis (null LDA) method is a competitive approach for dimensionality reduction. The implementation of this method, however, is computationally expensive. Recently, a fast implementation of the null LDA method using random matrix multiplication with scatter matrices was proposed. However, if the random matrix is chosen arbitrarily, the orientation matrix may be rank deficient, and some useful discriminant information will be lost. In this paper, we investigate how to choose the random matrix properly, such that the two criteria of the null LDA method are satisfied theoretically. We give a necessary and sufficient condition to guarantee full column rank of the orientation matrix. Moreover, the geometric characterization of the condition is also described.

1408.2054 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Non-Convex Rank Minimization via an Empirical Bayesian Approach

通过经验贝叶斯方法实现非凸秩最小化

David Wipf

AI总结 本文提出基于变分近似的经验贝叶斯方法,用于解决非凸秩最小化问题,该方法在保留全局最优估计的同时,通过边际化处理克服了传统凸松弛方法的局限性,尤其在鲁棒主成分分析中表现出色。

Comments Appears in Proceedings of the Twenty-Eighth Conference on Uncertainty in Artificial Intelligence (UAI2012)

详情
AI中文摘要

在需要最小秩矩阵解的应用中,底层成本函数非凸导致优化问题难以解决。因此,核范数常被用作矩阵秩的替代惩罚项。然而,在许多实际场景中,无法保证正确估计生成的低秩矩阵,理论特例除外。本文提出了一种基于变分近似的经验贝叶斯方法,该方法在许多有用约束下保留与秩函数相同的全局最小点估计。通过边际化处理,局部最小解被平滑掉,使算法在标准凸松弛完全失败时仍能成功。虽然该方法适用于广泛低秩应用,但本文聚焦于鲁棒主成分分析问题(RPCA),即估计未知低秩矩阵及其未知稀疏损坏。理论和实证证据表明,本文方法可能优于相关MAP方法,其中凸原理成分追求(PCP)算法(Candes等,2011)可视为特例。

英文摘要

In many applications that require matrix solutions of minimal rank, the underlying cost function is non-convex, leading to an intractable, NP-hard optimization problem. Consequently, the convex nuclear norm is frequently used as a surrogate penalty term for matrix rank. The problem is that in many practical scenarios there is no longer any guarantee that we can correctly estimate generative low-rank matrices of interest, theoretical special cases notwithstanding. Consequently, this paper proposes an alternative empirical Bayesian procedure built upon a variational approximation that, unlike the nuclear norm, retains the same globally minimizing point estimate as the rank function under many useful constraints. However, locally minimizing solutions are largely smoothed away via marginalization, allowing the algorithm to succeed when standard convex relaxations completely fail. While the proposed methodology is generally applicable to a wide range of low-rank applications, we focus our attention on the robust principal component analysis problem (RPCA), which involves estimating an unknown low-rank matrix with unknown sparse corruptions. Theoretical and empirical evidence are presented to show that our method is potentially superior to related MAP-based approaches, for which the convex principal component pursuit (PCP) algorithm (Candes et al., 2011) can be viewed as a special case.

1408.0838 2026-06-04 cs.LG cs.NA math.NA math.OC stat.ML 版本更新

Estimating Maximally Probable Constrained Relations by Mathematical Programming

通过数学规划估计最大可能的约束关系

Lizhen Qu, Bjoern Andres

AI总结 本文提出了一种概率测度家族,用于联合抽象多标签分类、相关聚类和排序问题,通过数学规划方法解决半监督学习。

Comments 16 pages

详情
AI中文摘要

估计约束关系是机器学习中的基本问题。特殊情形包括分类、聚类和排序。本文贡献了一种在两个有限非空集合之间所有关系的概率测度家族,提供多标签分类、相关聚类和排序的联合抽象。给定相关和不相关配对的训练集,估计最大可能测度是一个凸优化问题。给定测度估计最大可能关系是一个01线性规划问题,对于映射可在线性时间内解决,而对于等价关系和线性顺序则是NP难问题。实验显示了所有三种情况的实用解决方案。最后,联合估计最大可能测度和关系被提出为混合整数非线性规划问题。此方法为半监督学习提供了数学规划的途径。

英文摘要

Estimating a constrained relation is a fundamental problem in machine learning. Special cases are classification (the problem of estimating a map from a set of to-be-classified elements to a set of labels), clustering (the problem of estimating an equivalence relation on a set) and ranking (the problem of estimating a linear order on a set). We contribute a family of probability measures on the set of all relations between two finite, non-empty sets, which offers a joint abstraction of multi-label classification, correlation clustering and ranking by linear ordering. Estimating (learning) a maximally probable measure, given (a training set of) related and unrelated pairs, is a convex optimization problem. Estimating (inferring) a maximally probable relation, given a measure, is a 0-1 linear program. It is solved in linear time for maps. It is NP-hard for equivalence relations and linear orders. Practical solutions for all three cases are shown in experiments with real data. Finally, estimating a maximally probable measure and relation jointly is posed as a mixed-integer nonlinear program. This formulation suggests a mathematical programming approach to semi-supervised learning.

1404.1592 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

The Power of Online Learning in Stochastic Network Optimization

在线学习在随机网络优化中的力量

Longbo Huang, Xin Liu, Xiaohong Hao

AI总结 本文研究在线学习在未知系统统计的随机网络优化中的作用,提出OLAC和OLAC2算法,证明其在效用-延迟权衡和收敛时间上的最优性能。

详情
AI中文摘要

本文探讨在线学习在随机网络优化中未知系统统计下的作用,研究信息与学习如何高效融入系统控制,并提出OLAC和OLAC2两种在线学习辅助控制技术。通过双学习过程利用过往系统信息,证明OLAC和OLAC2在效用-延迟权衡上达到近最优,OLAC2具有O(ε^{-2/3})收敛时间。仿真结果验证了算法的优越性能,首次将在线学习显式融入随机网络优化并在理论和实践中展示其力量。

英文摘要

In this paper, we investigate the power of online learning in stochastic network optimization where the system statistics are unknown a priori. We are interested in understanding how information and learning can be efficiently incorporated into system control techniques, and what the fundamental benefits of doing so are. We propose two Online Learning-Aided Control techniques, $\mathtt{OLAC}$ and $\mathtt{OLAC2}$, that explicitly utilize the past system information in current system control via a learning procedure called dual learning. We prove strong performance guarantees of the proposed algorithms: $\mathtt{OLAC}$ and $\mathtt{OLAC2}$ achieve the near-optimal $[O(ε), O([\log(1/ε)]^2)]$ utility-delay tradeoff and $\mathtt{OLAC2}$ possesses an $O(ε^{-2/3})$ convergence time. $\mathtt{OLAC}$ and $\mathtt{OLAC2}$ are probably the first algorithms that simultaneously possess explicit near-optimal delay guarantee and sub-linear convergence time. Simulation results also confirm the superior performance of the proposed algorithms in practice. To the best of our knowledge, our attempt is the first to explicitly incorporate online learning into stochastic network optimization and to demonstrate its power in both theory and practice.

1407.7299 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Algorithms, Initializations, and Convergence for the Nonnegative Matrix Factorization

非负矩阵分解的算法、初始化和收敛性

Amy N. Langville, Carl D. Meyer, Russell Albright, James Cox, David Duling

AI总结 本文研究了非负矩阵分解算法的初始化对收敛速度和精度的影响,提出两种新的交替最小二乘算法,并讨论了收敛准则的选择问题。

详情
AI中文摘要

非负矩阵分解算法的初始化对收敛速度和精度有显著影响。许多非负矩阵分解算法对W或H的初始化敏感,尤其是交替最小二乘算法。本文提出了两种新的交替最小二乘算法,并比较了六种初始化方法(两种标准和四种新方法)的结果。最后,讨论了选择合适收敛准则的实践问题。

英文摘要

It is well known that good initializations can improve the speed and accuracy of the solutions of many nonnegative matrix factorization (NMF) algorithms. Many NMF algorithms are sensitive with respect to the initialization of W or H or both. This is especially true of algorithms of the alternating least squares (ALS) type, including the two new ALS algorithms that we present in this paper. We compare the results of six initialization procedures (two standard and four new) on our ALS algorithms. Lastly, we discuss the practical issue of choosing an appropriate convergence criterion.
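
To make the role of the initialization concrete, the following is a minimal sketch of a basic ALS-type NMF iteration. It is not one of the two new algorithms from the paper (the abstract does not specify them); the function name `als_nmf`, the uniform random initialization of W, and the fixed iteration count are assumptions made for illustration only.

```python
import numpy as np

def als_nmf(A, k, n_iter=200, seed=0):
    """Basic ALS-style NMF sketch: A (m x n, nonnegative) ~ W @ H.

    Hypothetical helper for illustration; the starting W is random,
    which is exactly the choice an initialization procedure would replace.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    W = rng.random((m, k))  # assumed random initialization
    for _ in range(n_iter):
        # Least-squares solve of W H = A for H, then clip negatives to zero.
        H = np.linalg.lstsq(W, A, rcond=None)[0]
        H = np.maximum(H, 0.0)
        # Least-squares solve of H^T W^T = A^T for W, then clip.
        W = np.linalg.lstsq(H.T, A.T, rcond=None)[0].T
        W = np.maximum(W, 0.0)
    return W, H

# Small synthetic example with an exactly rank-3 nonnegative matrix.
rng = np.random.default_rng(1)
A = rng.random((20, 3)) @ rng.random((3, 15))
W, H = als_nmf(A, k=3)
rel_err = np.linalg.norm(A - W @ H) / np.linalg.norm(A)
```

Running the sketch with different `seed` values is a simple way to observe the sensitivity to the starting point that the abstract describes; a fixed iteration count stands in for the convergence criterion, which the paper treats as a separate practical issue.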

1407.0286 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

DC approximation approaches for sparse optimization

基于DC近似方法的稀疏优化

Hoai An Le Thi, Tao Pham Dinh, Hoai Minh Le, Xuan Thanh Vo

AI总结 本文从DC框架出发,研究了稀疏优化的非凸近似方法,分析了近似问题与原问题的解的一致性,并开发了四种DCA方案,用于解决零范数和稀疏优化问题。

Comments 35 pages

详情
AI中文摘要

稀疏优化是指目标或约束中包含零范数的优化问题。本文从DC(差分凸函数)编程框架出发,研究了稀疏优化的非凸近似方法。考虑了包含所有标准稀疏诱导惩罚函数的零范数的常见DC近似,研究了近似问题与原问题的全局最小值(或局部最小值)的一致性。证明了在某些情况下,近似问题的某些全局最小值(或局部最小值)也是原问题的。利用DC编程中的精确惩罚技术,证明了某些特定近似方法在合适参数下与原问题等价。对几种稀疏诱导惩罚函数的效率进行了全面分析。开发了四种DCA(DC算法)方案,涵盖了非凸稀疏近似方法中的所有标准算法作为特殊版本。这些算法可以视为ℓ₁扰动算法/加权ℓ₁算法。本文提供了一种统一的非凸近似方法,结合了坚实的理论工具和基于DC编程和DCA的高效算法,以解决零范数和稀疏优化问题。作为应用,我们实现了我们的方法用于SVM(支持向量机)问题的特征选择,并在各种近似函数上进行了实证比较数值实验。

英文摘要

Sparse optimization refers to an optimization problem involving the zero-norm in the objective or constraints. In this paper, nonconvex approximation approaches for sparse optimization are studied from a unifying point of view in the DC (Difference of Convex functions) programming framework. Considering a common DC approximation of the zero-norm that includes all standard sparsity-inducing penalty functions, we study the consistency between global minima (resp. local minima) of the approximate and original problems. We show that, in several cases, some global minimizers (resp. local minimizers) of the approximate problem are also those of the original problem. Using exact penalty techniques in DC programming, we prove stronger results for some particular approximations, namely, that the approximate problem, with suitable parameters, is equivalent to the original problem. The efficiency of several sparsity-inducing penalty functions is fully analyzed. Four DCA (DC Algorithm) schemes are developed that cover all standard algorithms in nonconvex sparse approximation approaches as special versions. They can be viewed as an $\ell _{1}$-perturbed algorithm / reweighted-$\ell _{1}$ algorithm / reweighted-$\ell _{1}$ algorithm. We offer a unifying nonconvex approximation approach, with solid theoretical tools as well as efficient algorithms based on DC programming and DCA, to tackle the zero-norm and sparse optimization. As an application, we implemented our methods for feature selection in the SVM (Support Vector Machine) problem and performed empirical comparative numerical experiments on the proposed algorithms with various approximation functions.

1405.7910 2026-06-04 cs.DS cs.LG cs.NA math.NA 版本更新

Optimal CUR Matrix Decompositions

最优的CUR矩阵分解

Christos Boutsidis, David P. Woodruff

AI总结 本文提出了一种在输入稀疏性和确定性算法中实现最优CUR分解的方法,通过选择少量列和行构造低秩矩阵,以近似原始矩阵。

Comments small revision in lemma 4.2

详情
AI中文摘要

本文提出了一种在输入稀疏性和确定性算法中实现最优CUR分解的方法,通过选择少量列和行构造低秩矩阵,以近似原始矩阵。

英文摘要

The CUR decomposition of an $m \times n$ matrix $A$ finds an $m \times c$ matrix $C$ with a subset of $c < n$ columns of $A,$ together with an $r \times n$ matrix $R$ with a subset of $r < m$ rows of $A,$ as well as a $c \times r$ low-rank matrix $U$ such that the matrix $C U R$ approximates the matrix $A,$ that is, $ || A - CUR ||_F^2 \le (1+ε) || A - A_k||_F^2$, where $||.||_F$ denotes the Frobenius norm and $A_k$ is the best $m \times n$ matrix of rank $k$ constructed via the SVD. We present input-sparsity-time and deterministic algorithms for constructing such a CUR decomposition where $c=O(k/ε)$ and $r=O(k/ε)$ and rank$(U) = k$. Up to constant factors, our algorithms are simultaneously optimal in $c, r,$ and rank$(U)$.
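
As a rough illustration of the objects in this definition, here is a minimal sketch that builds C, U and R for a given matrix. It is not the input-sparsity-time or deterministic algorithm of the paper: columns and rows are chosen uniformly at random (an assumption), and U is taken as $C^{+} A R^{+}$, the Frobenius-optimal choice for fixed C and R. The function name `cur_sketch` and the pseudoinverse cutoff are hypothetical.

```python
import numpy as np

def cur_sketch(A, c, r, seed=0):
    """Naive CUR sketch: uniform column/row sampling, U = pinv(C) @ A @ pinv(R).

    Illustrative only; the selection rule is an assumption, not the
    paper's method.
    """
    rng = np.random.default_rng(seed)
    m, n = A.shape
    cols = rng.choice(n, size=c, replace=False)  # indices of c actual columns of A
    rows = rng.choice(m, size=r, replace=False)  # indices of r actual rows of A
    C = A[:, cols]                               # m x c
    R = A[rows, :]                               # r x n
    # For fixed C and R, this U minimizes ||A - C U R||_F.
    # The explicit cutoff guards against inverting round-off singular values.
    U = np.linalg.pinv(C, rcond=1e-10) @ A @ np.linalg.pinv(R, rcond=1e-10)
    return C, U, R

# Example: an exactly rank-3 matrix is recovered from 5 columns and 5 rows.
rng = np.random.default_rng(1)
A = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 20))
C, U, R = cur_sketch(A, c=5, r=5)
err = np.linalg.norm(A - C @ U @ R)
```

Uniform sampling carries no $(1+ε)$ guarantee for general matrices; the example works only because the test matrix has exact low rank and generic columns and rows.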

1407.2676 2026-06-04 math.OC cs.AI cs.LG cs.SY eess.SY stat.ML 版本更新

A New Optimal Stepsize For Approximate Dynamic Programming

近似动态规划的一种新最优步长

Ilya O. Ryzhov, Peter I. Frazier, Warren B. Powell

AI总结 本文提出一种新的最优步长规则,通过优化预测误差提升近似动态规划算法的短期性能,仅需一个敏感度较低的可调参数,适应问题噪声水平,加快数值实验中的收敛速度。

Comments Matlab files are included with the paper source

详情
AI中文摘要

近似动态规划(ADP)已在大规模交通运输问题、医疗保健、收益管理以及能源系统等广泛领域中得到了应用。设计有效的ADP算法有许多维度,但一个关键因素是用于更新价值函数近似值的步长规则。许多运筹学应用计算上都很耗费资源,因此快速获得良好结果非常重要。此外,最流行的步长公式使用可调参数,如果调节不当,可能会产生非常差的结果。我们推导出一种新的步长规则,以优化预测误差,从而提高ADP算法的短期性能。仅需一个相对不敏感的可调参数,新的规则能够适应问题中的噪声水平,并在数值实验中产生更快的收敛速度。

英文摘要

Approximate dynamic programming (ADP) has proven itself in a wide range of applications spanning large-scale transportation problems, health care, revenue management, and energy systems. The design of effective ADP algorithms has many dimensions, but one crucial factor is the stepsize rule used to update a value function approximation. Many operations research applications are computationally intensive, and it is important to obtain good results quickly. Furthermore, the most popular stepsize formulas use tunable parameters and can produce very poor results if tuned improperly. We derive a new stepsize rule that optimizes the prediction error in order to improve the short-term performance of an ADP algorithm. With only one, relatively insensitive tunable parameter, the new rule adapts to the level of noise in the problem and produces faster convergence in numerical experiments.

1407.1399 2026-06-04 math.NA cs.LG cs.NA 版本更新

Generalized Higher-Order Tensor Decomposition via Parallel ADMM

通过并行ADMM实现广义高阶张量分解

Fanhua Shang, Yuanyuan Liu, James Cheng

AI总结 本文提出一种并行迹范数正则化的张量分解方法,通过优化方案自动确定各模式的因子数,解决传统方法在模型选择、粗腐损和计算效率上的挑战。

Comments 9 pages, 5 figures, AAAI 2014

详情
AI中文摘要

高阶张量在计算机视觉、社交网络分析、数据挖掘和神经科学等领域变得普遍。传统张量分解方法面临三个主要挑战:模型选择、粗腐损和计算效率。为解决这些问题,我们首先提出一种并行迹范数正则化的张量分解方法,并将其公式化为凸优化问题。该方法不需要事先指定每个模式的秩,可通过我们的优化方案自动确定每个模式的因子数。通过考虑观测张量的低秩结构,我们分析了低秩张量与其核心张量之间的迹范数等价关系。然后,我们将非凸张量分解模型转换为多个更小规模矩阵迹范数最小化的加权组合。最后,我们开发了两种并行交替方向乘子法(ADMM)来解决这些问题。实验结果验证了我们的正则化公式有效,且我们的方法对噪声或异常值具有鲁棒性。

英文摘要

Higher-order tensors are becoming prevalent in many scientific areas such as computer vision, social network analysis, data mining and neuroscience. Traditional tensor decomposition approaches face three major challenges: model selection, gross corruptions and computational efficiency. To address these problems, we first propose a parallel trace norm regularized tensor decomposition method, and formulate it as a convex optimization problem. This method does not require the rank of each mode to be specified beforehand, and can automatically determine the number of factors in each mode through our optimization scheme. By considering the low-rank structure of the observed tensor, we analyze the equivalent relationship of the trace norm between a low-rank tensor and its core tensor. Then, we cast a non-convex tensor decomposition model into a weighted combination of multiple much smaller-scale matrix trace norm minimization problems. Finally, we develop two parallel alternating direction methods of multipliers (ADMM) to solve our problems. Experimental results verify that our regularized formulation is effective, and our methods are robust to noise or outliers.

1407.0449 2026-06-04 cs.LG cs.SY eess.SY math.OC stat.ML 版本更新

Classification-based Approximate Policy Iteration: Experiments and Extended Discussions

基于分类的近似策略迭代:实验与扩展讨论

Amir-massoud Farahmand, Doina Precup, André M. S. Barreto, Mohammad Ghavamzadeh

AI总结 本文提出基于分类的近似策略迭代框架,通过价值函数和策略空间的规律性来提升算法性能,并在HIV控制等任务中验证了其有效性。

详情
AI中文摘要

本文提出基于分类的近似策略迭代框架,通过价值函数和策略空间的规律性来提升算法性能,并在HIV控制等任务中验证了其有效性。

英文摘要

Tackling large approximate dynamic programming or reinforcement learning problems requires methods that can exploit regularities, or intrinsic structure, of the problem in hand. Most current methods are geared towards exploiting the regularities of either the value function or the policy. We introduce a general classification-based approximate policy iteration (CAPI) framework, which encompasses a large class of algorithms that can exploit regularities of both the value function and the policy space, depending on what is advantageous. This framework has two main components: a generic value function estimator and a classifier that learns a policy based on the estimated value function. We establish theoretical guarantees for the sample complexity of CAPI-style algorithms, which allow the policy evaluation step to be performed by a wide variety of algorithms (including temporal-difference-style methods), and can handle nonparametric representations of policies. Our bounds on the estimation error of the performance loss are tighter than existing results. We also illustrate this approach empirically on several problems, including a large HIV control task.

1407.0013 2026-06-04 math.NA cs.LG cs.NA math.ST stat.TH 版本更新

Relevance Singular Vector Machine for low-rank matrix sensing

相关性奇异向量机用于低秩矩阵感知

Martin Sundin, Saikat Chatterjee, Magnus Jansson, Cristian R. Rojas

AI总结 本文提出了一种新的贝叶斯推断方法,用于低秩矩阵重建,即相关性奇异向量机(RSVM),通过在基础矩阵的奇异向量上定义合适先验来促进低秩性,并通过数值高效近似加速计算。

Comments International Conference on Signal Processing and Communications (SPCOM), 5 pages

详情
AI中文摘要

本文提出了一种新的贝叶斯推断方法用于低秩矩阵重建。我们称之为相关性奇异向量机(RSVM),其中在基础矩阵的奇异向量上定义了适当的先验以促进低秩性。为了加速计算,开发了一种数值高效的近似方法。所提出算法应用于矩阵补全和矩阵重建问题,并通过数值方法研究了其性能。

英文摘要

In this paper we develop a new Bayesian inference method for low rank matrix reconstruction. We call the new method the Relevance Singular Vector Machine (RSVM) where appropriate priors are defined on the singular vectors of the underlying matrix to promote low rank. To accelerate computations, a numerically efficient approximation is developed. The proposed algorithms are applied to matrix completion and matrix reconstruction problems and their performance is studied numerically.

1310.7529 2026-06-04 stat.ML cs.LG cs.NA math.NA math.OC 版本更新

Successive Nonnegative Projection Algorithm for Robust Nonnegative Blind Source Separation

递进非负投影算法用于鲁棒非负盲源分离

Nicolas Gillis

AI总结 本文提出一种快速且鲁棒的递归算法,用于近可分离的非负矩阵分解问题。该算法称为递进非负投影算法(SNPA),利用非负约束提升鲁棒性,适用于更广泛的非负矩阵。

Comments 31 pages, 7 figures, 4 tables. Main changes: new numerical experiments on column-rank-deficient matrices, typos corrected, discussion on the comparison with XRAY

详情
Journal ref
SIAM J. on Imaging Sciences 7 (2), pp. 1420-1450, 2014
AI中文摘要

本文提出了一种新的快速且鲁棒的递归算法,用于近可分离的非负矩阵分解,这是非负盲源分离问题的一种特定情况。该算法称为递进非负投影算法(SNPA),与流行的递进投影算法(SPA)密切相关,但利用分解中的非负约束。我们证明SNPA比SPA更鲁棒,可应用于更广泛的非负矩阵。这在一些合成数据集和真实世界超光谱图像上得到了验证。

英文摘要

In this paper, we propose a new fast and robust recursive algorithm for near-separable nonnegative matrix factorization, a particular nonnegative blind source separation problem. This algorithm, which we refer to as the successive nonnegative projection algorithm (SNPA), is closely related to the popular successive projection algorithm (SPA), but takes advantage of the nonnegativity constraint in the decomposition. We prove that SNPA is more robust than SPA and can be applied to a broader class of nonnegative matrices. This is illustrated on some synthetic data sets, and on a real-world hyperspectral image.
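
For orientation, here is a minimal sketch of the classical successive projection algorithm (SPA) that the abstract names as the closely related baseline. It is not SNPA itself: SPA uses an orthogonal projection at each step and does not exploit nonnegativity. The function name `spa` and the synthetic example are assumptions for illustration.

```python
import numpy as np

def spa(M, r):
    """Successive projection algorithm sketch: select r column indices of M.

    At each step, pick the column of the residual with the largest
    Euclidean norm, then project all columns onto its orthogonal complement.
    """
    R = np.array(M, dtype=float)  # working copy of the residual
    picked = []
    for _ in range(r):
        j = int(np.argmax(np.sum(R * R, axis=0)))
        picked.append(j)
        u = R[:, j] / np.linalg.norm(R[:, j])
        R = R - np.outer(u, u @ R)  # remove the component along u
    return picked

# Noiseless separable example: M = W @ [I, H'], where each column of H'
# is a convex combination, so the first three columns of M are the "pure" ones.
rng = np.random.default_rng(0)
W = rng.random((10, 3))
H_mix = rng.dirichlet(np.ones(3), size=7).T      # 3 x 7, columns sum to 1
M = W @ np.hstack([np.eye(3), H_mix])
selected = sorted(spa(M, 3))
```

In this noiseless separable setting the selected indices are the pure columns; the robustness comparison between SPA and SNPA in the paper concerns what happens once noise is added.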

1406.4619 2026-06-04 math.NA cs.LG cs.NA cs.NE 版本更新

A Generalized Markov-Chain Modelling Approach to $(1,λ)$-ES Linear Optimization: Technical Report

一种通用马尔可夫链建模方法用于(1,λ)-ES线性优化:技术报告

Alexandre Chotard, Martin Holena

AI总结 本文提出一种通用马尔可夫链建模方法,用于(1,λ)-ES在线性优化中的应用,探讨了固定步长下常数步长(1,λ)-ES在简单线性约束问题中的成功条件。

详情
AI中文摘要

近年来,有几篇论文研究了(1,λ)-ES对线性优化的马尔可夫链建模,涵盖了无约束和线性约束优化,以及固定和变化的步长。所有研究都假设涉及的随机步长服从正态分布,尽管这与黑盒场景一致,但利用其他分布可能利用函数优化信息(例如分离性)。本文的目标是补充之前使用正态步长的研究,并为常数步长(1,λ)-ES在简单线性函数线性约束问题中的成功提供充分条件。将多维分布分解为其边缘分布和结合它们的copula应用于新的分布假设,特别关注具有阿基米得copula分布的分布。

英文摘要

Several recent publications investigated Markov-chain modelling of linear optimization by a $(1,λ)$-ES, considering both unconstrained and linearly constrained optimization, and both constant and varying step size. All of them assume normality of the involved random steps, and while this is consistent with a black-box scenario, information on the function to be optimized (e.g. separability) may be exploited by the use of another distribution. The objective of our contribution is to complement previous studies realized with normal steps, and to give sufficient conditions on the distribution of the random steps for the success of a constant step-size $(1,λ)$-ES on the simple problem of a linear function with a linear constraint. The decomposition of a multidimensional distribution into its marginals and the copula combining them is applied to the new distributional assumptions, particular attention being paid to distributions with Archimedean copulas.

1406.0554 2026-06-04 eess.SY cs.LG cs.SY math.OC 版本更新

Universal Convexification via Risk-Aversion

通过风险厌恶实现通用凸化

Krishnamurthy Dvijotham, Maryam Fazel, Emanuel Todorov

AI总结 本文提出了一种通用凸化框架,分析了凸化问题相对于非凸问题的次优性和收敛性,并展示了在监督学习和离散时间动力系统中的应用。

详情
AI中文摘要

我们开发了一个框架,用于将广泛类别的优化问题凸化。在额外假设下,我们分析了凸化问题解相对于原始非凸问题的次优性和证明了加性近似保证。然后我们基于随机梯度方法开发了求解相应优化问题的算法,并展示了收敛率的界限。我们展示了该框架在监督学习中的简单应用,其中可以显式进行整合,并可以使用标准(非随机)优化算法获得更好的收敛保证。然后我们将该框架扩展到一般类离散时间动力系统。在此背景下,我们的凸化方法属于已研究的风险敏感马尔可夫决策过程范式。我们推导了首个已知的基于模型和无模型的策略梯度优化算法,保证收敛到最优解。最后,我们展示了在不同应用中的数值结果以验证我们的公式。

英文摘要

We develop a framework for convexifying a fairly general class of optimization problems. Under additional assumptions, we analyze the suboptimality of the solution to the convexified problem relative to the original nonconvex problem and prove additive approximation guarantees. We then develop algorithms based on stochastic gradient methods to solve the resulting optimization problems and show bounds on convergence rates. We show a simple application of this framework to supervised learning, where one can perform integration explicitly and can use standard (non-stochastic) optimization algorithms with better convergence guarantees. We then extend this framework to apply to a general class of discrete-time dynamical systems. In this context, our convexification approach falls under the well-studied paradigm of risk-sensitive Markov Decision Processes. We derive the first known model-based and model-free policy gradient optimization algorithms with guaranteed convergence to the optimal solution. Finally, we present numerical results validating our formulation in different applications.

1310.5035 2026-06-04 math.NA cs.LG cs.NA math.OC stat.ML 版本更新

Linearized Alternating Direction Method with Parallel Splitting and Adaptive Penalty for Separable Convex Programs in Machine Learning

线性化交替方向法与并行分裂及自适应惩罚用于机器学习中的可分离凸程序

Zhouchen Lin, Risheng Liu, Huan Li

AI总结 本文提出LADMPSAP方法,用于高效求解多块可分离凸程序,通过并行分裂和自适应惩罚改进传统ADM方法,实现更强的收敛性和更快的收敛速度,适用于稀疏表示和低秩恢复问题。

Comments Preliminary version published on Asian Conference on Machine Learning 2013

详情
AI中文摘要

许多机器学习和其他领域的问题可以重新公式化为具有线性约束的可分离凸程序。在大多数情况下,存在多个变量块。然而,传统的交替方向法(ADM)及其线性化版本(LADM,通过线性化二次惩罚项获得)仅适用于两块情况,无法简单推广到多块情况。因此,扩展基于ADM的方法以处理多块情况有巨大需求。本文提出LADMPSAP以高效求解多块可分离凸程序。当所有组件目标函数具有有界子梯度时,我们获得了比ADM和LADM更强的收敛结果,例如允许惩罚参数无界,并证明了全局收敛的充分必要条件。我们进一步提出一个简单的最优性度量,并揭示了LADMPSAP在测度意义上的收敛速度。对于具有额外凸集约束的程序,通过精细的参数估计,我们设计了一个实用的LADMPSAP变种以加快收敛速度。最后,我们通过线性化部分目标函数来推广LADMPSAP,以处理更困难的目标函数程序。LADMPSAP特别适用于稀疏表示和低秩恢复问题,因为其子问题有闭合形式解,迭代过程中的稀疏性和低秩性可以得到保持。它也高度并行化,因此适合并行或分布式计算。数值实验验证了LADMPSAP在速度和数值精度方面的优势。

英文摘要

Many problems in machine learning and other fields can be (re)formulated as linearly constrained separable convex programs. In most cases, there are multiple blocks of variables. However, the traditional alternating direction method (ADM) and its linearized version (LADM, obtained by linearizing the quadratic penalty term) are for the two-block case and cannot be naively generalized to solve the multi-block case. So there is great demand for extending the ADM-based methods to the multi-block case. In this paper, we propose LADM with parallel splitting and adaptive penalty (LADMPSAP) to solve multi-block separable convex programs efficiently. When all the component objective functions have bounded subgradients, we obtain convergence results that are stronger than those of ADM and LADM, e.g., allowing the penalty parameter to be unbounded and proving the sufficient and necessary conditions for global convergence. We further propose a simple optimality measure and reveal the convergence rate of LADMPSAP in an ergodic sense. For programs with extra convex set constraints, with refined parameter estimation we devise a practical version of LADMPSAP for faster convergence. Finally, we generalize LADMPSAP to handle programs with more difficult objective functions by linearizing part of the objective function as well. LADMPSAP is particularly suitable for sparse representation and low-rank recovery problems because its subproblems have closed form solutions and the sparsity and low-rankness of the iterates can be preserved during the iteration. It is also highly parallelizable and hence well suited to parallel or distributed computing. Numerical experiments testify to the advantages of LADMPSAP in speed and numerical accuracy.

1402.0562 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Online Stochastic Optimization under Correlated Bandit Feedback

在线随机优化中的相关多臂反馈

Mohammad Gheshlaghi Azar, Alessandro Lazaric, Emma Brunskill

AI总结 本文提出HCT算法,解决局部光滑函数的在线随机优化问题,处理相关奖励挑战,改进内存需求和光滑性假设,应用于强化学习策略搜索。

详情
AI中文摘要

本文考虑在多臂反馈下局部光滑函数的在线随机优化问题。我们引入高置信度树(HCT)算法,一种新型的任何时间$\mathcal{X}$-臂多臂算法,并推导出与现有最先进方法在步骤数和光滑性因子依赖性上匹配的遗憾界。HCT的主要优势在于处理相关奖励的挑战,而现有方法要求每个臂的奖励生成过程是独立同分布(iid)的随机过程。HCT还改进了现有方法在内存需求和对均奖励函数光滑性假设的弱化方面。最后,我们讨论了HCT在强化学习策略搜索问题中的应用,并报告了初步的实证结果。

英文摘要

In this paper we consider the problem of online stochastic optimization of a locally smooth function under bandit feedback. We introduce the high-confidence tree (HCT) algorithm, a novel anytime $\mathcal{X}$-armed bandit algorithm, and derive regret bounds matching the performance of the existing state of the art in terms of dependency on the number of steps and the smoothness factor. The main advantage of HCT is that it handles the challenging case of correlated rewards, whereas existing methods require that the reward-generating process of each arm is an independent and identically distributed (iid) random process. HCT also improves on the state of the art in terms of its memory requirement, and it requires a weaker smoothness assumption on the mean-reward function compared to previous anytime algorithms. Finally, we discuss how HCT can be applied to the problem of policy search in reinforcement learning and we report preliminary empirical results.

1310.1502 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Randomized Approximation of the Gram Matrix: Exact Computation and Probabilistic Bounds

随机近似Gram矩阵:精确计算与概率界

John T. Holodnak, Ilse C. F. Ipsen

AI总结 研究通过随机化方法近似Gram矩阵,提出基于稳定秩的概率误差界,适用于小维度矩阵和高成功概率场景。

Comments Update to title in third version. Major revisions in second version including new bounds and a more detailed experimental section. Submitted to SIMAX

详情
AI中文摘要

给定一个具有n列的实矩阵A,问题在于通过c<<n个加权外积近似Gram积AA^T。精确计算AA^T(在精确算术中)所需的条件取决于A的右奇异向量矩阵。对于Drineas等人提出的蒙特卡洛矩阵乘法算法,我们给出了由于随机化导致的2范数相对误差的概率界。这些界依赖于稳定秩或A的秩,而不是矩阵维度。数值实验表明,这些界在严格成功概率和小维度矩阵情况下仍具信息量。我们还推导了通过从正交矩阵中采样行所获得的矩阵的最小奇异值和条件数的界。

英文摘要

Given a real matrix A with n columns, the problem is to approximate the Gram product AA^T by c << n weighted outer products of columns of A. Necessary and sufficient conditions for the exact computation of AA^T (in exact arithmetic) from c >= rank(A) columns depend on the right singular vector matrix of A. For a Monte-Carlo matrix multiplication algorithm by Drineas et al. that samples outer products, we present probabilistic bounds for the 2-norm relative error due to randomization. The bounds depend on the stable rank or the rank of A, but not on the matrix dimensions. Numerical experiments illustrate that the bounds are informative, even for stringent success probabilities and matrices of small dimension. We also derive bounds for the smallest singular value and the condition number of matrices obtained by sampling rows from orthonormal matrices.
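
The sampling scheme described here can be sketched in a few lines. The version below draws c column indices with replacement and rescales each outer product so that the estimator is unbiased for $AA^T$. The choice of probabilities proportional to squared column norms is an assumption (it is one standard choice for this kind of Monte-Carlo matrix multiplication, and the abstract does not say which probabilities are analyzed); the function name `sampled_gram` is hypothetical.

```python
import numpy as np

def sampled_gram(A, c, seed=0):
    """Approximate A @ A.T by c weighted outer products of sampled columns.

    Assumed sampling probabilities: p_j = ||a_j||^2 / ||A||_F^2.
    """
    rng = np.random.default_rng(seed)
    col_norms_sq = np.sum(A * A, axis=0)
    p = col_norms_sq / col_norms_sq.sum()
    idx = rng.choice(A.shape[1], size=c, replace=True, p=p)
    # Scaling each sampled column by 1/sqrt(c * p_j) makes the sum of
    # outer products an unbiased estimator of A @ A.T.
    S = A[:, idx] / np.sqrt(c * p[idx])
    return S @ S.T

rng = np.random.default_rng(1)
A = rng.standard_normal((5, 2000))
G_approx = sampled_gram(A, c=200)
G_exact = A @ A.T
rel_err_2norm = np.linalg.norm(G_approx - G_exact, 2) / np.linalg.norm(G_exact, 2)
```

With these particular probabilities every sampled term has trace $\|A\|_F^2 / c$, so the approximation always matches the trace of $AA^T$ exactly; the 2-norm relative error, the quantity bounded in the paper, is what varies from draw to draw.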

1311.6107 2026-06-04 eess.SY cs.LG cs.SY math.OC stat.ML 版本更新

Off-policy reinforcement learning for $ H_\infty $ control design

为H∞控制设计的非策略强化学习

Biao Luo, Huai-Ning Wu, Tingwen Huang

AI总结 本文提出非策略强化学习方法解决未知内部模型非线性系统H∞控制问题,通过实时系统数据学习HJI方程解,证明收敛性并应用于F16飞机和旋转/平移执行器系统。

Comments Accepted by IEEE Transactions on Cybernetics. IEEE Transactions on Cybernetics, Online Available, 2014

详情
AI中文摘要

针对具有未知内部系统模型的非线性系统H∞控制设计问题,本文考虑将非线性H∞控制问题转化为求解所谓的哈密顿-雅可比-伊萨克斯(HJI)方程,该方程是一种通常无法解析求解的非线性偏微分方程。更糟糕的是,当准确系统模型不可用或获取成本高时,基于模型的方法无法近似求解HJI方程。为克服这些困难,本文引入了一种非策略强化学习(RL)方法,从真实系统数据而不是数学系统模型中学习HJI方程的解,并证明其收敛性。在非策略RL方法中,系统数据可以使用任意策略生成,而不是评估策略,这对于实际系统至关重要且具有前景。出于实施目的,采用基于神经网络(NN)的actor-critic结构,并基于残差加权方法推导出最小二乘NN权重更新算法。最后,所开发的基于NN的非策略RL方法在线性F16飞机植物上进行测试,并进一步应用于旋转/平移执行器系统。

英文摘要

The $H_\infty$ control design problem is considered for nonlinear systems with an unknown internal system model. It is known that the nonlinear $H_\infty$ control problem can be transformed into solving the so-called Hamilton-Jacobi-Isaacs (HJI) equation, which is a nonlinear partial differential equation that is generally impossible to solve analytically. Even worse, model-based approaches cannot be used for approximately solving the HJI equation when the accurate system model is unavailable or costly to obtain in practice. To overcome these difficulties, an off-policy reinforcement learning (RL) method is introduced to learn the solution of the HJI equation from real system data instead of a mathematical system model, and its convergence is proved. In the off-policy RL method, the system data can be generated with arbitrary policies rather than the evaluating policy, which is extremely important and promising for practical systems. For implementation purposes, a neural network (NN) based actor-critic structure is employed and a least-square NN weight update algorithm is derived based on the method of weighted residuals. Finally, the developed NN-based off-policy RL method is tested on a linear F16 aircraft plant, and further applied to a rotational/translational actuator system.

1404.7073 2026-06-04 eess.SY cs.LG cs.LO cs.RO cs.SY 版本更新

Probably Approximately Correct MDP Learning and Control With Temporal Logic Constraints

在时序逻辑约束下进行近似正确马尔可夫决策过程的学习与控制

Jie Fu, Ufuk Topcu

AI总结 本文提出在未知随机环境中,基于PAC-MDP方法学习满足时序逻辑规范的近优控制策略,通过多项式时间与空间复杂度实现高概率的近优策略生成。

Comments 9 pages, 5 figures, Accepted by 2014 Robotics: Science and Systems (RSS)

详情
AI中文摘要

我们考虑在未知随机环境中合成控制策略,以最大化满足给定时序逻辑规范的概率。我们将系统与环境的交互建模为具有初始未知转移概率的马尔可夫决策过程(MDP)。所开发的解决方案基于所谓的基于模型的近似正确马尔可夫决策过程(PAC-MDP)方法。该算法通过样本(即观测)、时间和空间,以多项式复杂度与MDP大小、自动机构造时序逻辑规范的大小、1/ε、1/δ和有限时间 horizon 相关,生成一个ε-近优策略,概率为1-δ。在此方法中,系统维护初始未知MDP的模型,并基于其学习模型和规范自动机构造产品MDP。在执行过程中,策略通过观察系统所采取的转移进行迭代更新。迭代在有限步骤内终止。以高概率,所生成的策略使得任何状态下,该策略满足规范的概率与最优策略之间的差异在预定义范围内。

英文摘要

We consider synthesis of control policies that maximize the probability of satisfying given temporal logic specifications in unknown, stochastic environments. We model the interaction between the system and its environment as a Markov decision process (MDP) with initially unknown transition probabilities. The solution we develop builds on the so-called model-based probably approximately correct Markov decision process (PAC-MDP) methodology. The algorithm attains an $\varepsilon$-approximately optimal policy with probability $1-δ$ using samples (i.e. observations), time and space that grow polynomially with the size of the MDP, the size of the automaton expressing the temporal logic specification, $\frac{1}{\varepsilon}$, $\frac{1}{δ}$ and a finite time horizon. In this approach, the system maintains a model of the initially unknown MDP, and constructs a product MDP based on its learned model and the specification automaton that expresses the temporal logic constraints. During execution, the policy is iteratively updated using observation of the transitions taken by the system. The iteration terminates in finitely many steps. With high probability, the resulting policy is such that, for any state, the difference between the probability of satisfying the specification under this policy and the optimal one is within a predefined bound.

1404.5899 2026-06-04 math.NA cs.LG cs.NA 版本更新

A Comparison of Clustering and Missing Data Methods for Health Sciences

健康科学中聚类和缺失数据方法的比较

Ran Zhao, Deanna Needell, Christopher Johansen, Jerry L. Grenard

AI总结 本文比较了健康行为研究中聚类与缺失数据方法,提出结合压缩感知矩阵补全与谱聚类,实验证明其在聚类和缺失数据处理中表现更优。

详情
AI中文摘要

本文比较并分析了健康行为研究中聚类方法与缺失数据方法。我们提出并分析了压缩感知的矩阵补全与谱聚类在健康数据聚类中的应用。实证测试和实际数据结果表明,这些方法在聚类中的误分类率更低,在缺失数据问题中的矩阵补全性能更优。根据我们的研究,这些改进可能的原因是谱聚类利用了高维数据,而压缩感知方法利用了健康数据的近低秩性质。

英文摘要

In this paper, we compare and analyze clustering methods with missing data in health behavior research. In particular, we propose and analyze the use of compressive sensing's matrix completion along with spectral clustering to cluster health related data. The empirical tests and real data results show that these methods can outperform standard methods like LPA and FIML, in terms of lower misclassification rates in clustering and better matrix completion performance in missing data problems. According to our examination, a possible explanation of these improvements is that spectral clustering takes advantage of high data dimension and compressive sensing methods utilize the near-to-low-rank property of health data.

1404.1377 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Orthogonal Rank-One Matrix Pursuit for Low Rank Matrix Completion

正交秩一矩阵追迹法用于低秩矩阵补全

Zheng Wang, Ming-Jun Lai, Zhaosong Lu, Wei Fan, Hasan Davulcu, Jieping Ye

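AI summary This paper proposes an efficient, scalable low-rank matrix completion algorithm that extends orthogonal matching pursuit to the matrix case and introduces a new weight-updating rule to reduce time and storage complexity; it has a linear convergence rate and a single tunable parameter, which suits large-scale learning problems.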

详情
AI中文摘要

本文提出了一种高效的低秩矩阵补全算法,其核心思想是将正交匹配追踪方法从向量扩展到矩阵领域。我们进一步提出了一种经济版本的算法,通过引入新的权重更新规则来降低时间和存储复杂度。两种版本在每次矩阵追迹迭代中计算成本低廉,且在几次迭代中就能获得满意的结果。我们提出的算法的另一个优点是仅有一个可调参数,即秩,这使得用户易于理解和使用,特别是在大规模学习问题中尤为重要。此外,我们严格证明了两种版本都实现了线性收敛速度,这比之前已知的结果显著更好。我们还通过多个真实世界数据集,包括大规模推荐数据集Netflix以及MovieLens数据集,经验性地比较了所提算法与几种最先进的矩阵补全算法。数值结果表明,所提算法在效率上优于竞争算法,同时在预测性能上达到相似或更好的水平。

英文摘要

In this paper, we propose an efficient and scalable low rank matrix completion algorithm. The key idea is to extend the orthogonal matching pursuit method from the vector case to the matrix case. We further propose an economic version of our algorithm by introducing a novel weight updating rule to reduce the time and storage complexity. Both versions are computationally inexpensive for each matrix pursuit iteration, and find satisfactory results in a few iterations. Another advantage of our proposed algorithm is that it has only one tunable parameter, the rank, which makes it easy to understand and use. This becomes especially important in large-scale learning problems. In addition, we rigorously show that both versions achieve a linear convergence rate, which is significantly better than the previously known results. We also empirically compare the proposed algorithms with several state-of-the-art matrix completion algorithms on many real-world datasets, including the large-scale recommendation dataset Netflix as well as the MovieLens datasets. Numerical results show that our proposed algorithm is more efficient than competing algorithms while achieving similar or better prediction performance.
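
The idea of carrying orthogonal matching pursuit over to matrices can be sketched roughly as follows. This is a reading of the abstract under stated assumptions, not the authors' reference code: each iteration takes the top singular pair of the residual on the observed entries as a new rank-one basis, then refits all weights by least squares. The function name `rank_one_matrix_pursuit` and the boolean `mask` argument are hypothetical, and the economic weight-updating variant is not shown.

```python
import numpy as np

def rank_one_matrix_pursuit(Y, mask, rank):
    """Greedy completion: add one rank-one basis per step, then refit every weight."""
    Y = np.asarray(Y, dtype=float)
    mask = np.asarray(mask, dtype=bool)
    y_obs = Y[mask]                      # observed entries as a vector
    bases = []
    X = np.zeros_like(Y)
    for _ in range(rank):
        # Residual restricted to the observed entries.
        R = np.where(mask, Y - X, 0.0)
        # The top singular vector pair gives the next rank-one basis matrix.
        U, _, Vt = np.linalg.svd(R, full_matrices=False)
        bases.append(np.outer(U[:, 0], Vt[0]))
        # Orthogonal step: least-squares refit of all weights on observed entries.
        A = np.stack([B[mask] for B in bases], axis=1)
        theta, *_ = np.linalg.lstsq(A, y_obs, rcond=None)
        X = sum(t * B for t, B in zip(theta, bases))
    return X
```

In this sketch the only tunable input is `rank`, which mirrors the single-parameter property the abstract highlights.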

1403.7737 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Sharpened Error Bounds for Random Sampling Based $\ell_2$ Regression

随机采样基于ℓ₂回归的误差界改进

Shusen Wang

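AI summary This paper presents two random-sampling methods for faster ℓ₂ regression, sharpens the sample-size bound for 1+ε accuracy to O(d log d + d/ε), and shows that uniform sampling attains a 2+ε bound under certain conditions.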

Comments unpublished manuscript

详情
AI中文摘要

给定数据矩阵X∈R^{n×d}和响应向量y∈R^n,当n>d时,求解最小二乘回归(LSR)问题需要O(n d²)时间与O(n d)空间。当n和d均较大时,精确求解非常昂贵。当n>>d时,将y和X的所有列随机嵌入到较小的子空间R^c中,可将LSR问题的行数减少,从而以O(c d²)时间和O(c d)空间求解。本文讨论了两种随机采样方法以更高效地求解LSR。先前工作表明基于杠杆分数采样的LSR在c≥O(d ε^{-2} log d)时可达到1+ε精度。本文改进此误差界,证明当c=O(d log d + d/ε)时即可实现1+ε精度。此外,当c≥O(μd ε^{-2} log d)时,均匀采样基于LSR以正概率达到2+ε的界。

英文摘要

Given a data matrix $X \in \mathbb{R}^{n\times d}$ and a response vector $y \in \mathbb{R}^{n}$ with $n>d$, it costs $O(n d^2)$ time and $O(n d)$ space to solve the least squares regression (LSR) problem. When $n$ and $d$ are both large, exactly solving the LSR problem is very expensive. When $n \gg d$, one feasible approach to speeding up LSR is to randomly embed $y$ and all columns of $X$ into a smaller subspace $\mathbb{R}^c$; the induced LSR problem has the same number of columns but far fewer rows, and it can be solved in $O(c d^2)$ time and $O(c d)$ space. We discuss in this paper two random sampling based methods for solving LSR more efficiently. Previous work showed that leverage-score-based sampling for LSR achieves $1+\epsilon$ accuracy when $c \geq O(d \epsilon^{-2} \log d)$. In this paper we sharpen this error bound, showing that $c = O(d \log d + d \epsilon^{-1})$ is enough for achieving $1+\epsilon$ accuracy. We also show that when $c \geq O(\mu d \epsilon^{-2} \log d)$, uniform-sampling-based LSR attains a $2+\epsilon$ bound with positive probability.
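
A minimal sketch of leverage-score row sampling for least squares may help make the reduction from $n$ rows to $c$ rows concrete. It assumes $X$ has full column rank and computes exact leverage scores from a thin QR factorization, which itself costs $O(n d^2)$ and is used here only for illustration. The function name `leverage_sampled_lsr` and its arguments are hypothetical, and the sketch does not reproduce the paper's analysis.

```python
import numpy as np

def leverage_sampled_lsr(X, y, c, rng):
    """Solve a row-sampled least squares problem with leverage-score probabilities."""
    n, d = X.shape
    # Leverage score of row i is the squared norm of row i of an orthonormal basis Q.
    Q, _ = np.linalg.qr(X)
    lev = (Q ** 2).sum(axis=1)
    p = lev / lev.sum()
    # Draw c rows with replacement and rescale so the sketch is unbiased.
    idx = rng.choice(n, size=c, replace=True, p=p)
    scale = 1.0 / np.sqrt(c * p[idx])
    Xs = X[idx] * scale[:, None]
    ys = y[idx] * scale
    # The induced problem has c rows instead of n.
    w, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
    return w
```

Replacing `p` with the uniform distribution `np.full(n, 1.0 / n)` gives the uniform-sampling variant that the abstract also discusses.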

1311.4468 2026-06-04 cs.LG cs.SY eess.SY physics.data-an stat.ML 版本更新

Stochastic processes and feedback-linearisation for online identification and Bayesian adaptive control of fully-actuated mechanical systems

随机过程与反馈线性化用于完全驱动机械系统的在线识别和贝叶斯自适应控制

Jan-Peter Calliess, Antonis Papachristodoulou, Stephen J. Roberts

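AI summary This paper proposes a new method that combines probabilistic identification and control, using conditioned stochastic process priors and structural knowledge from Lagrangian mechanics, with feedback linearisation enabling flexible nonparametric Bayesian learning for fully-actuated mechanical systems.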

详情
AI中文摘要

本文提出了一种新的方法,用于同时进行可观察完全驱动机械系统的概率识别和控制。通过将随机过程先验条件于配置观测和噪声配置导数估计上,实现识别。与以往利用随机过程进行识别的工作不同,我们利用拉格朗日力学提供的结构知识,分别学习控制-affine系统的漂移和控制输入矩阵函数。利用反馈线性化将不确定的非线性控制问题在期望上转化为易于调节的问题。因此,本文的方法结合了非参数贝叶斯学习的灵活性与对闭环轨迹期望的元认知保证。在扭矩驱动摆的背景下,通过正态和对数正态过程的结合来学习动力学。

英文摘要

This work proposes a new method for simultaneous probabilistic identification and control of an observable, fully-actuated mechanical system. Identification is achieved by conditioning stochastic process priors on observations of configurations and noisy estimates of configuration derivatives. In contrast to previous work that has used stochastic processes for identification, we leverage the structural knowledge afforded by Lagrangian mechanics and learn the drift and control input matrix functions of the control-affine system separately. We utilise feedback-linearisation to reduce, in expectation, the uncertain nonlinear control problem to one that is easy to regulate in a desired manner. Thereby, our method combines the flexibility of nonparametric Bayesian learning with epistemological guarantees on the expected closed-loop trajectory. We illustrate our method in the context of torque-actuated pendula where the dynamics are learned with a combination of normal and log-normal processes.

1403.7429 2026-06-04 math.OC cs.DC cs.LG cs.SY eess.SY 版本更新

Distributed Reconstruction of Nonlinear Networks: An ADMM Approach

非线性网络的分布式重建:一种ADMM方法

Wei Pan, Aivar Sootla, Guy-Bart Stan

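AI summary This paper presents a distributed algorithm for reconstructing large-scale nonlinear networks; ADMM decomposes the problem into subproblems in order to identify nonlinear functional forms and parameters from time-series data.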

Comments To appear in the Preprints of 19th IFAC World Congress 2014

详情
AI中文摘要

本文提出了一种分布式算法用于大规模非线性网络的重建。特别是,我们关注从时间序列数据中识别大规模非线性网络的非线性函数形式及其相关参数。最近,非线性网络重建问题被表述为一个非凸优化问题,基于边际似然最大化过程与稀疏诱导先验的结合。使用凸-凹过程(CCCP),推导出迭代加权lasso算法来解决初始非凸优化问题。通过利用该加权lasso算法的目标函数结构,可以设计出分布式算法。为此,我们应用交替方向乘子法(ADMM)将原问题分解为几个子问题。为了说明所提方法的有效性,我们使用我们的方法来识别具有不同网络规模(500~100,000个节点)的相互连接库克振子网络。

英文摘要

In this paper, we present a distributed algorithm for the reconstruction of large-scale nonlinear networks. In particular, we focus on the identification from time-series data of the nonlinear functional forms and associated parameters of large-scale nonlinear networks. Recently, a nonlinear network reconstruction problem was formulated as a nonconvex optimisation problem based on the combination of a marginal likelihood maximisation procedure with sparsity inducing priors. Using a convex-concave procedure (CCCP), an iterative reweighted lasso algorithm was derived to solve the initial nonconvex optimisation problem. By exploiting the structure of the objective function of this reweighted lasso algorithm, a distributed algorithm can be designed. To this end, we apply the alternating direction method of multipliers (ADMM) to decompose the original problem into several subproblems. To illustrate the effectiveness of the proposed methods, we use our approach to identify a network of interconnected Kuramoto oscillators with different network sizes (500~100,000 nodes).
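
For orientation, the standard single-machine ADMM iteration for the lasso is sketched below. It is a textbook baseline and not the distributed, reweighted scheme of the paper: the objective is assumed to be $\tfrac{1}{2}\|Ax-b\|_2^2 + \lambda\|x\|_1$, and the names `admm_lasso`, `soft_threshold`, `lam`, and `rho` are illustrative.

```python
import numpy as np

def soft_threshold(v, kappa):
    """Elementwise shrinkage: the proximal operator of kappa * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - kappa, 0.0)

def admm_lasso(A, b, lam, rho=1.0, iters=500):
    """Minimize 0.5*||Ax - b||^2 + lam*||x||_1 with the split x = z."""
    n = A.shape[1]
    x = z = u = np.zeros(n)
    AtA_rho = A.T @ A + rho * np.eye(n)
    Atb = A.T @ b
    for _ in range(iters):
        # x-update: ridge-like linear solve.
        x = np.linalg.solve(AtA_rho, Atb + rho * (z - u))
        # z-update: soft-thresholding enforces sparsity.
        z = soft_threshold(x + u, lam / rho)
        # Scaled dual update accumulates the constraint violation x - z.
        u = u + x - z
    return z
```

A reweighted variant would replace the scalar `lam` with a vector of per-coefficient weights that is refreshed between outer iterations.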

1312.7292 2026-06-04 eess.SY cs.LG cs.SY 版本更新

Two Timescale Convergent Q-learning for Sleep--Scheduling in Wireless Sensor Networks

双时间尺度收敛Q学习用于无线传感器网络中的睡眠调度

Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar

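AI summary This paper studies how to schedule sensor sleep times in wireless sensor networks so as to extend network lifetime while minimizing tracking error, and proposes a convergent two-timescale Q-learning algorithm that combines function approximation with policy gradient updates to improve sleep scheduling.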

详情
AI中文摘要

本文考虑无线传感器网络中的入侵检测应用,研究如何调度单个传感器的睡眠时间以最大化网络寿命并最小化跟踪误差。我们将该问题建模为部分可观测马尔可夫决策过程(POMDP),具有连续状态-动作空间。我们提出了一种双时间尺度收敛的Q学习算法,用于处理POMDP的维度灾难问题。该算法结合了一种策略梯度更新方法,使用单次模拟的同时扰动随机逼近(SPSA)估计在较快的时间尺度上进行更新,而Q值参数则在较慢的时间尺度上通过类似于时序差分(TD)算法的方式更新。特征选择方案管理能量和跟踪组件,以帮助寻找最优的睡眠调度策略。我们还开发了Q学习算法的函数近似类比,但该算法没有理论收敛保证。此外,我们还调整了算法以包含随机迭代估计方案,用于估计入侵者的移动模型。在二维网络设置下的仿真结果表明,我们的算法在仅增加少量传感器的情况下,相比最近的先前工作实现了更好的跟踪精度。

英文摘要

In this paper, we consider an intrusion detection application for Wireless Sensor Networks (WSNs). We study the problem of scheduling the sleep times of the individual sensors to maximize the network lifetime while keeping the tracking error to a minimum. We formulate this problem as a partially-observable Markov decision process (POMDP) with continuous state-action spaces, in a manner similar to (Fuemmeler and Veeravalli [2008]). However, unlike their formulation, we consider infinite horizon discounted and average cost objectives as performance criteria. For each criterion, we propose a convergent on-policy Q-learning algorithm that operates on two timescales, while employing function approximation to handle the curse of dimensionality associated with the underlying POMDP. Our proposed algorithm incorporates a policy gradient update using a one-simulation simultaneous perturbation stochastic approximation (SPSA) estimate on the faster timescale, while the Q-value parameter (arising from a linear function approximation for the Q-values) is updated in an on-policy temporal difference (TD) algorithm-like fashion on the slower timescale. The feature selection scheme employed in each of our algorithms manages the energy and tracking components in a manner that assists the search for the optimal sleep-scheduling policy. For the sake of comparison, in both discounted and average settings, we also develop a function approximation analogue of the Q-learning algorithm. This algorithm, unlike the two-timescale variant, does not possess theoretical convergence guarantees. Finally, we also adapt our algorithms to include a stochastic iterative estimation scheme for the intruder's mobility model. Our simulation results on a 2-dimensional network setting suggest that our algorithms result in better tracking accuracy at the cost of only a few additional sensors, in comparison to a recent prior work.

1403.4267 2026-06-04 math.NA cs.LG cs.NA 版本更新

Balancing Sparsity and Rank Constraints in Quadratic Basis Pursuit

在二次基追踪中平衡稀疏性和秩约束

Cagdas Bilen, Gilles Puy, Rémi Gribonval, Laurent Daudet

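AI summary This paper studies methods that simultaneously enforce sparsity and low-rank structure, proposes a new approach for analyzing the trade-off between the sparsity and low-rank constraints, and validates it through simulations.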

详情
AI中文摘要

我们研究了同时 enforcing 稀疏性和低秩结构在矩阵中的方法,常用于稀疏相位恢复或压缩感知中的相位校准问题。我们提出了一种新的方法来分析这些方法中稀疏性和低秩约束之间的权衡,不仅有助于提供调整上述约束权重的指南,还使新的仿真策略能够评估性能。然后,我们为相位恢复和相位校准案例提供了仿真结果,以展示所提方法与其他方法的一致性,并评估不同权重对稀疏性和低秩结构约束性能变化的影响。

英文摘要

We investigate methods that simultaneously enforce sparsity and low-rank structure in a matrix, as often employed for sparse phase retrieval problems or phase calibration problems in compressive sensing. We propose a new approach for analyzing the trade-off between the sparsity and low-rank constraints in these methods, which not only helps to provide guidelines for adjusting the weights between these constraints, but also enables new simulation strategies for evaluating performance. We then provide simulation results for phase retrieval and phase calibration cases, both to demonstrate the consistency of the proposed method with other approaches and to evaluate the change in performance with different weights for the sparsity and low-rank structure constraints.

1403.1863 2026-06-04 cs.LG cs.SY eess.SY 版本更新

Statistical Structure Learning, Towards a Robust Smart Grid

统计结构学习,迈向稳健的智能电网

Hanie Sedghi, Edmond Jonckheere

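AI summary This paper proposes a decentralized false data injection detection scheme based on the Markov graph of bus phase angles. It learns the grid structure with the Conditional Covariance Test, shows with the DC power flow model that under normal conditions the grid graph determines the Markov graph of the phase angles, raises an alarm when the calculated graph and the learned structure disagree, and detects sophisticated attacks without additional hardware.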

详情
AI中文摘要

电网的稳健控制和维护依赖于准确的数据。PMUs和状态估计器都容易受到虚假数据注入攻击。因此,必须有一个机制来快速准确地检测恶意代理篡改数据的行为——这不仅对防止可能导致停电的攻击至关重要,也对当前和未来电网的常规监控和控制任务至关重要。我们提出了一种基于公交相角马尔可夫图的去中心化虚假数据注入检测方案。我们利用条件协方差测试(CCT)来学习电网的结构。使用直流功率流模型,我们证明在正常情况下,由于电网图的行走可求和性,电网图可确定电压角的马尔可夫图。因此,计算的马尔可夫图与学习结构之间的差异应触发警报。本地电网拓扑结构可以从保护系统在线获取,并利用它来检查不匹配。如果检测到不匹配,我们使用相关异常分数来检测受攻击的节点集合。我们的方法可以检测假设了解系统公交-支路模型并能欺骗状态估计器、损害电力网络观测、控制、监控、需求响应和定价方案的最近期隐秘欺骗攻击。具体而言,在隐秘欺骗攻击下,相角马尔可夫图发生变化。除了检测攻击状态外,我们的方法还可以检测受攻击的节点集合。据我们所知,我们的方法是首次全面检测这种复杂攻击,并且不需要额外硬件。此外,我们的检测方案无论受攻击子集的大小都能成功。各种电力网络的模拟验证了我们的主张。

英文摘要

Robust control and maintenance of the grid relies on accurate data. Both PMUs and state estimators are prone to false data injection attacks. Thus, it is crucial to have a mechanism for fast and accurate detection of an agent maliciously tampering with the data, both for preventing attacks that may lead to blackouts and for routine monitoring and control tasks of current and future grids. We propose a decentralized false data injection detection scheme based on the Markov graph of the bus phase angles. We utilize the Conditional Covariance Test (CCT) to learn the structure of the grid. Using the DC power flow model, we show that under normal circumstances, and because of walk-summability of the grid graph, the Markov graph of the voltage angles can be determined by the power grid graph. Therefore, a discrepancy between the calculated Markov graph and the learned structure should trigger the alarm. Local grid topology is available online from the protection system and we exploit it to check for mismatch. Should a mismatch be detected, we use a correlation anomaly score to detect the set of attacked nodes. Our method can detect the most recent stealthy deception attack on the power grid, which assumes knowledge of the bus-branch model of the system and is capable of deceiving the state estimator, damaging power network observation, control, monitoring, demand response and pricing schemes. Specifically, under the stealthy deception attack, the Markov graph of phase angles changes. In addition to detecting a state of attack, our method can detect the set of attacked nodes. To the best of our knowledge, our remedy is the first to comprehensively detect this sophisticated attack and it does not need additional hardware. Moreover, our detection scheme is successful no matter the size of the attacked subset. Simulation of various power networks confirms our claims.

1309.1369 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO 版本更新

Semistochastic Quadratic Bound Methods

半随机二次界方法

Aleksandr Y. Aravkin, Anna Choromanska, Tony Jebara, Dimitri Kanevsky

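AI summary This paper proposes semistochastic quadratic bound methods for maximum likelihood inference based on partition function optimization, proves global convergence under weak assumptions and a linear convergence rate under stronger ones, and improves speed and stability through inexact subproblem minimization and batch-size selection schemes.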

Comments 11 pages, 1 figure

详情

英文摘要

Partition functions arise in a variety of settings, including conditional random fields, logistic regression, and latent gaussian models. In this paper, we consider semistochastic quadratic bound (SQB) methods for maximum likelihood inference based on partition function optimization. Batch methods based on the quadratic bound were recently proposed for this class of problems, and performed favorably in comparison to state-of-the-art techniques. Semistochastic methods fall in between batch algorithms, which use all the data, and stochastic gradient type methods, which use small random selections at each iteration. We build semistochastic quadratic bound-based methods, and prove both global convergence (to a stationary point) under very weak assumptions, and linear convergence rate under stronger assumptions on the objective. To make the proposed methods faster and more stable, we consider inexact subproblem minimization and batch-size selection schemes. The efficacy of SQB methods is demonstrated via comparison with several state-of-the-art techniques on commonly used datasets.

1312.0516 2026-06-04 cs.LG cs.SY eess.SY stat.AP stat.ML 版本更新

Grid Topology Identification using Electricity Prices

利用电价识别电网拓扑

Vassilis Kekatos, Georgios B. Giannakis, Ross Baldick

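AI summary This paper explores the potential of recovering grid topology from publicly available market data. It develops a regularized maximum likelihood estimator based on LMPs that exploits low-rank and sparse structure to recover the grid Laplacian, and validates the method on the IEEE 14-bus benchmark.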

Comments PES General Meeting 2014 submission

详情
AI中文摘要

本文探讨了仅使用公开市场数据恢复电网拓扑的潜力。在当代批发电力市场中,实时电价通常通过求解受网络约束的经济调度问题确定。在线性直流模型下,位置边际价格(LMP)对应于所涉及线性规划的拉格朗日乘数。有趣的是,时空变化的LMP矩阵具有以下性质:一旦与加权电网拉普拉斯矩阵相乘,就会得到低秩且稀疏的矩阵。利用这一丰富结构,开发了一种正则化的最大似然估计器(MLE)来从LMP中恢复电网拉普拉斯矩阵。所提出的凸优化问题包含促进低秩和稀疏性的正则化项,并通过可扩展的算法求解。在为IEEE 14节点基准生成的价格上进行的数值测试提供了令人鼓舞的拓扑恢复结果。

英文摘要

The potential of recovering the topology of a grid using solely publicly available market data is explored here. In contemporary wholesale electricity markets, real-time prices are typically determined by solving the network-constrained economic dispatch problem. Under a linear DC model, locational marginal prices (LMPs) correspond to the Lagrange multipliers of the linear program involved. The interesting observation here is that the matrix of spatiotemporally varying LMPs exhibits the following property: once premultiplied by the weighted grid Laplacian, it yields a low-rank and sparse matrix. Leveraging this rich structure, a regularized maximum likelihood estimator (MLE) is developed to recover the grid Laplacian from the LMPs. The convex optimization problem formulated includes low rank- and sparsity-promoting regularizers, and it is solved using a scalable algorithm. Numerical tests on prices generated for the IEEE 14-bus benchmark provide encouraging topology recovery results.

1306.3343 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Relaxed Sparse Eigenvalue Conditions for Sparse Estimation via Non-convex Regularized Regression

松弛的稀疏特征值条件用于非凸正则化回归中的稀疏估计

Zheng Pan, Changshui Zhang

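AI summary This paper studies relaxed sparse eigenvalue conditions for sparse estimation via non-convex regularized regression, supports the effectiveness of non-convex regularization for sparse estimation, and shows that coordinate descent can obtain approximate global solutions.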

详情
AI中文摘要

非凸正则化器通常在实践中提高了稀疏估计的性能。为了证明这一点,我们研究了稀疏估计的条件,特别是针对一种包含许多现有正则化器的尖锐凹正则化器。对于正则化回归的全局解,我们的基于稀疏特征值的条件比L1正则化在参数估计和稀疏性估计中的条件更弱。对于近似全局和近似 stationary(AGAS)解,几乎相同的条件也足够。我们证明了通过坐标下降(CD)方法可以得到所需的AGAS解。最后,我们进行了一些实验,以展示CD方法在获得AGAS解中的性能以及尖锐凹正则化器所需估计条件的弱性。

英文摘要

Non-convex regularizers usually improve the performance of sparse estimation in practice. To prove this fact, we study the conditions for sparse estimation with sharp concave regularizers, a general family of non-convex regularizers that includes many existing regularizers. For the global solutions of the regularized regression, our sparse-eigenvalue-based conditions are weaker than those of L1-regularization for parameter estimation and sparseness estimation. For the approximate global and approximate stationary (AGAS) solutions, almost the same conditions also suffice. We show that the desired AGAS solutions can be obtained by coordinate descent (CD) based methods. Finally, we perform experiments to show the performance of CD methods in producing AGAS solutions and how weak the estimation conditions required by the sharp concave regularizers are.

1306.0308 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics

微分方程的概率解及其在黎曼统计学中的应用

Philipp Hennig, Søren Hauberg

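AI summary This paper studies a probabilistic numerical method for initial and boundary value problems that returns a Gaussian process posterior over the solution. The method is useful for statistics on Riemannian manifolds, where non-analytic ordinary differential equations arise; marginalising over numerical uncertainty makes the statistics more robust and leads to new Riemannian algorithms for mean value computation and principal geodesic analysis.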

Comments 11 pages (9 page conference paper, plus supplements)

详情
Journal ref
Proceedings of the 17th International Conference on Artificial Intelligence and Statistics (AISTATS) 2014, Reykjavik, Iceland. Journal of Machine Learning Research: W&CP volume 33
AI中文摘要

我们研究了一种概率数值方法,用于求解初值和边界值问题,该方法返回解的联合高斯过程后验。此类方法在黎曼流形统计中具有实际价值,因为几乎所有计算都涉及非解析常微分方程。概率公式允许对数值解的不确定性进行边际化,使统计结果对不准确性更不敏感。这导致了新的黎曼算法用于均值计算和主地理分析。边际化也意味着结果可能不如点估计精确,从而在状态-of-the-art方法上实现显著加速。我们的方法是关于更广泛观点的论据,即数值计算引起的不确定性应在机器学习算法的整个管道中进行跟踪。

英文摘要

We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.

1401.2288 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Extension of Sparse Randomized Kaczmarz Algorithm for Multiple Measurement Vectors

稀疏随机Kaczmarz算法的扩展:多测量向量问题

Hemant Kumar Aggarwal, Angshul Majumdar

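AI summary This paper modifies the randomized Kaczmarz algorithm to solve the multiple measurement vector problem with common sparse support, achieving better recovery and convergence.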

详情
AI中文摘要

Kaczmarz算法因迭代求解超定线性方程组而流行。传统算法在几次遍历中可近似解,但随机版本能指数收敛且与方程数量无关。最近提出基于加权随机Kaczmarz算法的稀疏解算法,但仅适用于单测量向量问题。本文通过修改随机Kaczmarz算法解决多测量向量问题,将视频人脸识别建模为该问题并应用所提技术。在真实和合成数据集上,所提算法在公平约束下优于状态最新型谱投影梯度算法,蒙特卡洛模拟证实其恢复和收敛速率更优。

英文摘要

The Kaczmarz algorithm is popular for iteratively solving an overdetermined system of linear equations. The traditional Kaczmarz algorithm can approximate the solution in a few sweeps through the equations, but a randomized version of the Kaczmarz algorithm was shown to converge exponentially, at a rate independent of the number of equations. Recently, an algorithm for finding a sparse solution to a linear system of equations was proposed based on the weighted randomized Kaczmarz algorithm. These algorithms solve the single measurement vector problem; however, there are applications where multiple measurements are available. In this work, the objective is to solve a multiple measurement vector problem with common sparse support by modifying the randomized Kaczmarz algorithm. We have also modeled the problem of face recognition from video as a multiple measurement vector problem and solved it using our proposed technique. We have compared the proposed algorithm with the state-of-the-art spectral projected gradient algorithm for multiple measurement vectors on both real and synthetic datasets. The Monte Carlo simulations confirm that our proposed algorithm has better recovery and convergence rates than the MMV version of the spectral projected gradient algorithm under fairness constraints.
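
As background for the randomized variant mentioned above, here is a minimal sketch of plain randomized Kaczmarz for a single right-hand side, with rows sampled in proportion to their squared norms. It does not implement the sparse or multiple-measurement-vector extension of the paper, and the function name `randomized_kaczmarz` and its arguments are illustrative.

```python
import numpy as np

def randomized_kaczmarz(A, b, iters, rng):
    """Iteratively project onto the hyperplane of one randomly chosen equation."""
    m, n = A.shape
    row_norms_sq = (A ** 2).sum(axis=1)
    # Rows are picked with probability proportional to their squared norm.
    probs = row_norms_sq / row_norms_sq.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        # Orthogonal projection of x onto {x : A[i] @ x = b[i]}.
        x = x + (b[i] - A[i] @ x) / row_norms_sq[i] * A[i]
    return x
```

Each step touches a single row of `A`, which is what makes the method attractive for very tall systems.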

1401.3198 2026-06-04 math.OC cs.LG cs.SY eess.SY 版本更新

Online Markov decision processes with Kullback-Leibler control cost

在线马尔可夫决策过程与Kullback-Leibler控制成本

Peng Guan, Maxim Raginsky, Rebecca Willett

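AI summary This paper studies an online control problem in which an agent performs a discrete-time random walk over a finite state space, with the KL divergence as the control cost; it presents a computationally efficient strategy with small regret and demonstrates it on a simulated target tracking problem.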

Comments to appear in IEEE Transactions on Automatic Control

详情

英文摘要

This paper considers an online (real-time) control problem that involves an agent performing a discrete-time random walk over a finite state space. The agent's action at each time step is to specify the probability distribution for the next state given the current state. Following the set-up of Todorov, the state-action cost at each time step is a sum of a state cost and a control cost given by the Kullback-Leibler (KL) divergence between the agent's next-state distribution and that determined by some fixed passive dynamics. The online aspect of the problem is due to the fact that the state cost functions are generated by a dynamic environment, and the agent learns the current state cost only after selecting an action. An explicit construction of a computationally efficient strategy with small regret (i.e., expected difference between its actual total cost and the smallest cost attainable using noncausal knowledge of the state costs) under mild regularity conditions is presented, along with a demonstration of the performance of the proposed strategy on a simulated target tracking problem. A number of new results on Markov decision processes with KL control cost are also obtained.
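
The per-step cost described above, a state cost plus the KL divergence between the chosen next-state distribution and the passive dynamics, can be written down directly. The sketch below is a one-step illustration only, with hypothetical function names; it also uses the standard fact that the distribution minimizing expected next-state cost plus KL divergence is the passive distribution reweighted by the exponential of the negative cost. It is not the paper's online strategy.

```python
import math

def kl_divergence(u, p):
    """KL(u || p) for two discrete distributions given as equal-length lists."""
    return sum(ui * math.log(ui / pi) for ui, pi in zip(u, p) if ui > 0)

def step_cost(state_cost, u, p):
    """State cost plus KL control cost of choosing u against passive dynamics p."""
    return state_cost + kl_divergence(u, p)

def one_step_optimal_policy(p, next_state_costs):
    """Minimizer of E_u[cost] + KL(u || p): u(x') proportional to p(x') * exp(-cost(x'))."""
    weights = [pi * math.exp(-c) for pi, c in zip(p, next_state_costs)]
    total = sum(weights)
    return [w / total for w in weights]
```

Following the passive dynamics exactly costs nothing in control, while steering probability mass toward cheaper states is paid for through the KL term.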

1401.1842 2026-06-04 stat.ML cs.IT cs.LG cs.NA math.IT math.NA 版本更新

Robust Large Scale Non-negative Matrix Factorization using Proximal Point Algorithm

鲁棒大规模非负矩阵分解的近点算法

Jason Gejie Liu, Shuchin Aeron

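AI summary This paper presents a robust algorithm for large-scale non-negative matrix factorization that modifies a linear programming algorithm by introducing a reduced set of constraints, does not require the factorization rank in advance, and handles the regime where the number of extreme rays or topics is much larger than the dimension of the data vectors.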

Comments Appeared in IEEE GlobalSIP, 2013, TX, Austin

详情
AI中文摘要

本文提出了一种用于处理大规模数据的鲁棒非负矩阵分解(NMF)算法,其中分离性假设成立。具体来说,我们通过引入减少的约束条件修改了[9]中的线性规划(LP)算法,以实现精确的NMF。与以往方法不同,所提出的算法不需要知道分解秩(极端射线[3]或主题[7])。受代谢网络分析中类似问题的启发,我们考虑了一种完全不同的情形,即极端射线或主题的数量可以远大于数据向量的维度。不同合成数据集上算法的性能也得到了提供。

英文摘要

A robust algorithm for non-negative matrix factorization (NMF) is presented in this paper with the purpose of dealing with large-scale data, where the separability assumption is satisfied. In particular, we modify the Linear Programming (LP) algorithm of [9] by introducing a reduced set of constraints for exact NMF. In contrast to the previous approaches, the proposed algorithm does not require knowledge of the factorization rank (extreme rays [3] or topics [7]). Furthermore, motivated by a similar problem arising in the context of metabolic network analysis [13], we consider an entirely different regime where the number of extreme rays or topics can be much larger than the dimension of the data vectors. The performance of the algorithm on different synthetic data sets is provided.

1401.0159 2026-06-04 math.NA cs.LG cs.NA 版本更新

Speeding-Up Convergence via Sequential Subspace Optimization: Current State and Future Directions

通过顺序子空间优化加速收敛:现状与未来方向

Michael Zibulevsky

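AI summary This paper surveys the Sequential Subspace Optimization framework for large-scale unconstrained optimization, discusses its combination with Parallel Coordinate Descent, and considers how merging it with other methods can speed up convergence.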

详情
AI中文摘要

本文是一篇以研究提案风格撰写的技术综述论文。近年来,我们引入了一个用于大规模无约束优化的通用框架——顺序子空间优化(SESOP),并展示了其在基于稀疏性的信号/图像去噪、反卷积、压缩感知、计算机断层扫描、衍射成像、支持向量机等领域的实用性。我们探索了其与并行坐标下降法和可分离替代函数方法的结合,从而在上述领域取得了最先进的成果。存在几种方法,在特定条件下比纯SESOP更快:信任区域牛顿方法——适用于Hessian矩阵易于逆的问题;截断牛顿方法——当可以快速乘以Hessian时;随机优化方法——适用于具有大数据的随机类型问题;多网格方法——适用于具有嵌套多级结构的问题。这些方法可以通过与SESOP结合进一步改进。此外,也可以通过SESOP加速约束优化问题的增广拉格朗日方法,以及具有可分离目标函数和不可分离约束的问题的交替方向乘子法。

英文摘要

This is an overview paper written in the style of a research proposal. In recent years we introduced a general framework for large-scale unconstrained optimization, Sequential Subspace Optimization (SESOP), and demonstrated its usefulness for sparsity-based signal/image denoising, deconvolution, compressive sensing, computed tomography, diffraction imaging, and support vector machines. We explored its combination with Parallel Coordinate Descent and Separable Surrogate Function methods, obtaining state-of-the-art results in the above-mentioned areas. There are several methods that are faster than plain SESOP under specific conditions: the Trust region Newton method, for problems with an easily invertible Hessian matrix; the Truncated Newton method, when fast multiplication by the Hessian is available; Stochastic optimization methods, for problems with large stochastic-type data; and Multigrid methods, for problems with nested multilevel structure. Each of these methods can be further improved by merging it with SESOP. One can also accelerate the Augmented Lagrangian method for constrained optimization problems and the Alternating Direction Method of Multipliers for problems with separable objective function and non-separable constraints.

1312.6872 2026-06-04 math.NA cs.LG cs.NA 版本更新

Matrix recovery using Split Bregman

利用Split Bregman方法进行矩阵恢复

Anupriya Gogna, Ankita Shukla, Angshul Majumdar

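AI summary This paper proposes a Split Bregman algorithm for low-rank matrix recovery that improves convergence speed and success rate and gives better reconstruction accuracy, particularly when few measurements are available.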

详情
AI中文摘要

本文针对从低维投影中恢复具有内在低秩结构的矩阵问题,该问题广泛应用于模式识别、无线传感器网络、控制系统、推荐系统、图像/视频重建等领域。在理论和实践中,最有效的解决低秩矩阵恢复问题的方法是核范数最小化。本文提出了一种Split Bregman算法用于核范数最小化。Bregman技术的使用提高了算法的收敛速度并提高了成功率。即使在小数量线性测量可用的情况下,重建的准确性也更好。我们的主张通过使用我们算法及其与其他现有矩阵恢复方法的比较实验得到支持。算法基于NMSE、执行时间和成功率对不同秩和采样比率进行比较。

英文摘要

In this paper we address the problem of recovering a matrix with inherent low rank structure from its lower dimensional projections. This problem is frequently encountered in a wide range of areas including pattern recognition, wireless sensor networks, control systems, recommender systems, and image/video reconstruction. Both in theory and practice, the most effective way to solve the low rank matrix recovery problem is via nuclear norm minimization. In this paper, we propose a Split Bregman algorithm for nuclear norm minimization. The use of the Bregman technique improves the convergence speed of our algorithm and gives a higher success rate. Also, the accuracy of reconstruction is much better even when only a small number of linear measurements are available. Our claim is supported by empirical results obtained using our algorithm and its comparison to other existing methods for matrix recovery. The algorithms are compared on the basis of NMSE, execution time and success rate for varying ranks and sampling ratios.
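
Algorithms for nuclear norm minimization commonly rely on singular value thresholding, the proximal operator of the nuclear norm. The abstract does not say how the authors' Split Bregman iteration is organized, so the sketch below shows only this generic building block; the function name `singular_value_threshold` and the parameter `tau` are illustrative.

```python
import numpy as np

def singular_value_threshold(M, tau):
    """Proximal operator of tau * nuclear norm: shrink each singular value by tau."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    # Singular values below tau are set to zero, which lowers the rank.
    return (U * np.maximum(s - tau, 0.0)) @ Vt
```

Applied to a matrix with singular values 3 and 1 and `tau = 2`, the result has singular values 1 and 0, which shows how the operator promotes low rank.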

1312.6182 2026-06-04 cs.MS cs.LG cs.NA math.NA stat.ML 版本更新

Large-Scale Paralleled Sparse Principal Component Analysis

大规模并行稀疏主成分分析

W. Liu, H. Zhang, D. Tao, Y. Wang, K. Lu

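AI summary This paper presents an efficient GPU-based parallel method for sparse principal component analysis, implementing the four optimization formulations of the generalized power method in parallel for a substantial speed-up; experiments on real-world datasets confirm its practical value.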

Comments submitted to Multimedia Tools and Applications

详情
AI中文摘要

主成分分析(PCA)是一种用于多变量数据分析的统计技术,但其主成分(PCs)作为原始变量的线性组合,难以解释。稀疏PCA(SPCA)通过近似稀疏PCs来平衡统计保真度和可解释性。本文提出一种高效的GPU并行方法,用于实现SPCA,特别是通用幂方法的SPCA(GP-SPCA)。该方法在GPU上使用CUBLAS实现,比CPU上的CBLAS实现快11倍,比Matlab实现快107倍。在多个真实数据集上的广泛比较实验验证了SPCA的实用性。

英文摘要

Principal component analysis (PCA) is a statistical technique commonly used in multivariate data analysis. However, PCA can be difficult to interpret and explain since the principal components (PCs) are linear combinations of the original variables. Sparse PCA (SPCA) aims to balance statistical fidelity and interpretability by approximating sparse PCs whose projections capture the maximal variance of the original data. In this paper we present an efficient and paralleled method of SPCA using graphics processing units (GPUs), which can process large blocks of data in parallel. Specifically, we construct parallel implementations of the four optimization formulations of the generalized power method of SPCA (GP-SPCA), one of the most efficient and effective SPCA approaches, on a GPU. The parallel GPU implementation of GP-SPCA (using CUBLAS) is up to eleven times faster than the corresponding CPU implementation (using CBLAS), and up to 107 times faster than a MATLAB implementation. Extensive comparative experiments on several real-world datasets confirm that SPCA offers a practical advantage.

1306.2861 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Bayesian Inference and Learning in Gaussian Process State-Space Models with Particle MCMC

贝叶斯推断与学习在高斯过程状态空间模型中的粒子MCMC应用

Roger Frigola, Fredrik Lindsten, Thomas B. Schön, Carl E. Rasmussen

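AI summary This paper presents a fully Bayesian approach to inference and learning in nonlinear nonparametric state-space models, placing a Gaussian process prior over the state transition dynamics and using Particle MCMC for efficient inference.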

详情
Journal ref
Published in NIPS 2013, Advances in Neural Information Processing Systems 26, pp. 3156--3164

英文摘要

State-space models are successfully used in many areas of science, engineering and economics to model time series and dynamical systems. We present a fully Bayesian approach to inference and learning (i.e. state estimation and system identification) in nonlinear nonparametric state-space models. We place a Gaussian process prior over the state transition dynamics, resulting in a flexible model able to capture complex dynamical phenomena. To enable efficient inference, we marginalize over the transition dynamics function and infer directly the joint smoothing distribution using specially tailored Particle Markov Chain Monte Carlo samplers. Once a sample from the smoothing distribution is computed, the state transition predictive distribution can be formulated analytically. Our approach preserves the full nonparametric expressivity of the model and can make use of sparse Gaussian processes to greatly reduce computational complexity.

1310.3556 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

Identifying Influential Entries in a Matrix

识别矩阵中的关键条目

Abhisek Kundu, Srinivas Nambirajan, Petros Drineas

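AI summary This paper presents a probability distribution that identifies the most influential entries of a matrix and proves that the matrix can be reconstructed exactly from a small number of sampled entries, without any incoherence assumption on the matrix.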

Comments There is a bug in the proof of Lemma 5, which we are currently working to fix

详情
AI中文摘要

对于任意一个大小为m×n、秩为ρ的矩阵A,我们提出了一种针对矩阵条目的概率分布(方程(2)中的元素杠杆分数),以揭示矩阵中最关键的条目。从理论角度看,我们证明了在采样至多s=O((m+n)ρ²ln(m+n))个条目(见方程(3)中s的精确值)并基于这些分数进行采样后,解核范数最小化问题可精确重建A。据我们所知,这些是目前在不假设矩阵A无相干性的情况下,对矩阵补全问题最强的理论保证。从实验角度看,我们显示高元素杠杆分数对应的条目揭示了数据矩阵的结构特性,这些特性对领域科学家具有研究价值。

英文摘要

For any matrix A in R^(m x n) of rank ρ, we present a probability distribution over the entries of A (the element-wise leverage scores of equation (2)) that reveals the most influential entries in the matrix. From a theoretical perspective, we prove that sampling at most s = O ((m + n) ρ^2 ln (m + n)) entries of the matrix (see eqn. (3) for the precise value of s) with respect to these scores and solving the nuclear norm minimization problem on the sampled entries, reconstructs A exactly. To the best of our knowledge, these are the strongest theoretical guarantees on matrix completion without any incoherence assumptions on the matrix A. From an experimental perspective, we show that entries corresponding to high element-wise leverage scores reveal structural properties of the data matrix that are of interest to domain scientists.

1312.2132 2026-06-04 eess.SY cs.LG cs.SY stat.ML 版本更新

Robust Subspace System Identification via Weighted Nuclear Norm Optimization

通过加权核范数优化实现鲁棒子空间系统辨识

Dorsa Sadigh, Henrik Ohlsson, S. Shankar Sastry, Sanjit A. Seshia

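AI summary This paper proposes a robust subspace system identification method based on weighted nuclear norm optimization that handles outliers by trading off fit, rank, and sparsity.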

Comments Submitted to the IFAC World Congress 2014

详情
AI中文摘要

子空间辨识是系统辨识中的经典且广泛研究的问题。该问题最近被提出为凸优化问题,通过核范数松弛。受鲁棒PCA的启发,我们扩展了这一框架以处理异常值。所提出的框架形式为一个凸优化问题,其目标函数在拟合、秩和稀疏性之间进行权衡。与鲁棒PCA类似,找到合适的正则化参数可能会有问题。我们展示了如何将合适参数的搜索空间限制在二维参数空间的有界开集内。在实践中,这非常有用,因为它限制了需要调查的参数空间。

英文摘要

Subspace identification is a classical and very well-studied problem in system identification. The problem was recently posed as a convex optimization problem via the nuclear norm relaxation. Inspired by robust PCA, we extend this framework to handle outliers. The proposed framework takes the form of a convex optimization problem with an objective that trades off fit, rank and sparsity. As in robust PCA, it can be problematic to find a suitable regularization parameter. We show how the space in which a suitable parameter should be sought can be limited to a bounded open set of the two-dimensional parameter space. In practice, this is very useful since it restricts the parameter space that needs to be surveyed.

1312.1613 2026-06-04 stat.ML cs.LG cs.NA math.NA 版本更新

Max-Min Distance Nonnegative Matrix Factorization

最大-最小距离非负矩阵分解

Jim Jing-Yan Wang

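AI summary This paper proposes a supervised nonnegative matrix factorization algorithm that uses class labels to split data pairs into within-class and between-class pairs, minimizing the maximum within-class distance and maximizing the minimum between-class distance in the new space to improve the discriminative ability of the representation.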

详情
AI中文摘要

非负矩阵分解(NMF)已成为模式分类问题中流行的表示方法。它试图将数据样本的非负矩阵分解为非负基本矩阵和非负系数矩阵的乘积,系数矩阵用作新的表示。然而,传统NMF方法忽略了数据样本的类别标签。在本文中,我们提出了一种监督的新型NMF算法,以提高新表示的判别能力。利用类别标签,我们将所有数据样本对分为同类对和异类对。为了提高新NMF表示的判别能力,我们希望在新NMF空间中同类对的最大距离被最小化,同时异类对的最小距离被最大化。基于此准则,我们构建了一个目标函数,并通过交替优化基本矩阵、系数矩阵和松弛变量来优化该函数,从而得到一个迭代算法。

英文摘要

Nonnegative Matrix Factorization (NMF) has been a popular representation method for pattern classification problems. It tries to decompose a nonnegative matrix of data samples as the product of a nonnegative basic matrix and a nonnegative coefficient matrix, and the coefficient matrix is used as the new representation. However, traditional NMF methods ignore the class labels of the data samples. In this paper, we propose a novel supervised NMF algorithm to improve the discriminative ability of the new representation. Using the class labels, we separate all the data sample pairs into within-class pairs and between-class pairs. To improve the discriminative ability of the new NMF representations, we hope that the maximum distance of the within-class pairs in the new NMF space could be minimized, while the minimum distance of the between-class pairs could be maximized. With this criterion, we construct an objective function and optimize it with regard to the basic and coefficient matrices and slack variables alternately, resulting in an iterative algorithm.

1311.5750 2026-06-04 cs.LG cs.NA math.NA stat.ML 版本更新

Gradient Hard Thresholding Pursuit for Sparsity-Constrained Optimization

梯度硬阈值化追求用于稀疏约束优化

Xiao-Tong Yuan, Ping Li, Tong Zhang

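AI summary This paper generalizes the Hard Thresholding Pursuit algorithm to sparsity-constrained convex optimization by alternating gradient descent and hard thresholding steps, proves strong guarantees on convergence rate and parameter estimation accuracy, and outperforms existing methods on sparse logistic regression and sparse precision matrix estimation.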

详情
AI中文摘要

梯度硬阈值化追求算法是一种迭代的贪心选择方法,用于寻找欠定线性系统中的稀疏解。该方法已显示出强大的理论保证和出色的数值性能。本文将硬阈值化追求方法从压缩感知推广到一般性的稀疏约束凸优化问题。所提出的算法在标准梯度下降步骤和硬阈值化步骤之间交替,可有或无去偏处理。我们证明了该方法在收敛速度和参数估计精度方面具有与HTP相似的强保证。数值结果表明,该方法在稀疏逻辑回归和稀疏精度矩阵估计任务中优于现有的贪心选择方法。

英文摘要

Hard Thresholding Pursuit (HTP) is an iterative greedy selection procedure for finding sparse solutions of underdetermined linear systems. This method has been shown to have strong theoretical guarantees and impressive numerical performance. In this paper, we generalize HTP from compressive sensing to a generic problem setup of sparsity-constrained convex optimization. The proposed algorithm iterates between a standard gradient descent step and a hard thresholding step with or without debiasing. We prove that our method enjoys strong guarantees analogous to those of HTP in terms of rate of convergence and parameter estimation accuracy. Numerical evidence shows that our method is superior to the state-of-the-art greedy selection methods in sparse logistic regression and sparse precision matrix estimation tasks.
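
The alternation described in the abstract (a gradient step, then a hard thresholding step, optionally followed by debiasing) can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a least-squares loss, a step size of 1/L taken from the spectral norm of the design matrix, and debiasing by an ordinary least-squares refit on the selected support. The function names `hard_threshold` and `gradient_hard_thresholding` and the synthetic data are hypothetical.

```python
import numpy as np

def hard_threshold(x, k):
    """Keep the k largest-magnitude entries of x and zero out the rest."""
    out = np.zeros_like(x)
    if k > 0:
        keep = np.argpartition(np.abs(x), -k)[-k:]
        out[keep] = x[keep]
    return out

def gradient_hard_thresholding(A, y, k, step=None, iters=200, debias=True):
    """Sketch for min 0.5*||Ax - y||^2 subject to ||x||_0 <= k (assumed loss)."""
    n = A.shape[1]
    if step is None:
        # Assumption: step = 1/L, with L the Lipschitz constant of the gradient.
        step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(n)
    for _ in range(iters):
        grad = A.T @ (A @ x - y)                 # gradient of the least-squares loss
        x = hard_threshold(x - step * grad, k)   # gradient step, then keep top-k entries
        if debias:
            support = np.flatnonzero(x)
            if support.size:
                # Debiasing: refit by least squares restricted to the current support.
                coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
                x = np.zeros(n)
                x[support] = coef
    return x

# Small synthetic instance (illustrative data only).
rng = np.random.default_rng(0)
A = rng.standard_normal((60, 100)) / np.sqrt(60)
x_true = np.zeros(100)
x_true[[3, 17, 42]] = [2.0, -1.5, 1.0]
y = A @ x_true
x_hat = gradient_hard_thresholding(A, y, k=3)
print(np.flatnonzero(x_hat), np.linalg.norm(A @ x_hat - y))
```

Setting `debias=False` gives the plain gradient-plus-thresholding variant; for other convex losses, only the line computing `grad` and the refit step would change.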

1311.4643 2026-06-04 cs.LG cs.IT cs.NA math.IT math.NA stat.ML 版本更新

Near-Optimal Entrywise Sampling for Data Matrices

近最优数据矩阵的逐元素采样

Dimitris Achlioptas, Zohar Karnin, Edo Liberty

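AI summary This paper proposes near-optimal entrywise sampling distributions for data matrices with four properties that ensure efficiency and compressibility, and that remain competitive with the optimal offline distribution in the streaming model.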

Comments 14 pages, to appear in NIPS' 13

详情
AI中文摘要

我们考虑了选择矩阵$A$的非零元素以生成稀疏草稿$B$,使其最小化$\|A-B\|_2$的问题。对于大规模$m \times n$矩阵,当$n \gg m$时,我们给出了一种采样分布,具有四个重要性质:首先,它们有闭合形式,可从最少的关于$A$的信息计算;其次,允许以任意顺序流式输入非零元素进行草稿生成,每个非零元素的计算为$O(1)$;第三,生成的草稿矩阵不仅稀疏,其非零元素高度可压缩;最后,且最重要的是,在温和假设下,我们的分布能与最优离线分布竞争。需要注意的是,最优离线分布中的概率可能是所有矩阵元素的复杂函数,因此,无论计算复杂度如何,最优分布可能在流式模型中无法计算。

英文摘要

We consider the problem of selecting non-zero entries of a matrix $A$ in order to produce a sparse sketch of it, $B$, that minimizes $\|A-B\|_2$. For large $m \times n$ matrices, such that $n \gg m$ (for example, representing $n$ observations over $m$ attributes) we give sampling distributions that exhibit four important properties. First, they have closed forms computable from minimal information regarding $A$. Second, they allow sketching of matrices whose non-zeros are presented to the algorithm in arbitrary order as a stream, with $O(1)$ computation per non-zero. Third, the resulting sketch matrices are not only sparse, but their non-zero entries are highly compressible. Lastly, and most importantly, under mild assumptions, our distributions are provably competitive with the optimal offline distribution. Note that the probabilities in the optimal offline distribution may be complex functions of all the entries in the matrix. Therefore, regardless of computational complexity, the optimal distribution might be impossible to compute in the streaming model.

1311.4296 2026-06-04 cs.LG cs.NA cs.RO math.NA math.OC 版本更新

Reflection methods for user-friendly submodular optimization

用于用户友好的子模优化的反射方法

Stefanie Jegelka, Francis Bach, Suvrit Sra

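AI summary This paper proposes an efficient submodular function minimization method that solves a continuous best approximation problem through a sequence of reflections, handles both the continuous and discrete formulations, and is applied to image segmentation tasks.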

Comments Neural Information Processing Systems (NIPS), États-Unis (2013)

详情
AI中文摘要

最近研究表明,子模性自然捕捉了机器学习、信号处理和计算机视觉中广泛出现的概念。因此,需要高效的子模函数优化方法,尤其是最小化问题。尽管一般子模最小化具有挑战性,我们提出了一种新方法,利用子模函数的现有分解性。与以往方法不同,该方法既非近似也非不切实际,也不需要复杂的参数调优。此外,它易于实现和并行化。我们方法的关键组成部分是将离散子模最小化问题转化为连续最佳逼近问题,通过一系列反射求解,并可通过阈值处理获得最优离散解。该方法解决了连续和离散问题,因此在学习、推断和重建中有应用。在实验中,我们展示了该方法在两个图像分割任务中的优势。

英文摘要

Recently, it has become evident that submodularity naturally captures widely occurring concepts in machine learning, signal processing and computer vision. Consequently, there is a need for efficient optimization procedures for submodular functions, especially for minimization problems. While general submodular minimization is challenging, we propose a new method that exploits existing decomposability of submodular functions. In contrast to previous approaches, our method is neither approximate, nor impractical, nor does it need any cumbersome parameter tuning. Moreover, it is easy to implement and parallelize. A key component of our method is a formulation of the discrete submodular minimization problem as a continuous best approximation problem that is solved through a sequence of reflections, and its solution can be easily thresholded to obtain an optimal discrete solution. This method solves both the continuous and discrete formulations of the problem, and therefore has applications in learning, inference, and reconstruction. In our experiments, we illustrate the benefits of our method on two image segmentation tasks.

1311.1761 2026-06-04 cs.LG cs.AI cs.NE cs.RO cs.SY eess.SY 版本更新

Exploring Deep and Recurrent Architectures for Optimal Control

探索深度和循环架构以实现最优控制

Sergey Levine

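AI summary This paper explores applying deep and recurrent neural networks to continuous, high-dimensional locomotion control tasks, trains the controllers with a reinforcement learning algorithm, compares the performance of different architectures, and discusses the prospects for deep learning in optimal control.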

Comments Appears in the Neural Information Processing Systems (NIPS 2013) Workshop on Deep Learning

详情
AI中文摘要

复杂的多层神经网络在多个监督任务中取得了最先进的结果。然而,此类多层网络在控制领域的成功应用迄今为止主要局限于控制流水线的感知部分。本文探讨了将深度和循环神经网络应用于连续、高维运动任务,其中网络用于表示控制策略,将系统状态(由关节角度表示)直接映射到每个关节的扭矩。通过使用最近的强化学习算法guided policy search,可以成功训练具有数千参数的神经网络控制器,从而比较各种架构。我们讨论了运动控制任务与先前监督感知任务的区别,展示了比较各种架构的实验结果,并讨论了将深度学习技术应用于最优控制问题的未来方向。

英文摘要

Sophisticated multilayer neural networks have achieved state-of-the-art results on multiple supervised tasks. However, successful applications of such multilayer networks to control have so far been limited largely to the perception portion of the control pipeline. In this paper, we explore the application of deep and recurrent neural networks to a continuous, high-dimensional locomotion task, where the network is used to represent a control policy that maps the state of the system (represented by joint angles) directly to the torques at each joint. By using a recent reinforcement learning algorithm called guided policy search, we can successfully train neural network controllers with thousands of parameters, allowing us to compare a variety of architectures. We discuss the differences between the locomotion control task and previous supervised perception tasks, present experimental results comparing various architectures, and discuss future directions in the application of techniques from deep learning to the problem of optimal control.

1310.3697 2026-06-04 stat.ML cs.LG cs.SY eess.SY 版本更新

Variance Adjusted Actor Critic Algorithms

方差调整的actor-critic算法

Aviv Tamar, Shie Mannor

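AI summary This paper presents an actor-critic framework for MDPs whose objective is the variance-adjusted expected return. Using linear function approximation and an extension of compatible features, it proposes an episodic algorithm and proves almost-sure convergence to a locally optimal point of the objective.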

详情
AI中文摘要

我们提出了一种适用于马尔可夫决策过程的actor-critic框架,其目标为方差调整的预期回报。我们的 critic 使用线性函数逼近,并将兼容特征的概念扩展到方差调整的设定。我们提出了一种分回合的actor-critic算法,并证明该算法几乎必然收敛到目标函数的局部最优解。

英文摘要

We present an actor-critic framework for MDPs where the objective is the variance-adjusted expected return. Our critic uses linear function approximation, and we extend the concept of compatible features to the variance-adjusted setting. We present an episodic actor-critic algorithm and show that it converges almost surely to a locally optimal point of the objective function.

1309.2375 2026-06-04 stat.ML cs.LG cs.NA math.NA stat.CO 版本更新

Accelerated Proximal Stochastic Dual Coordinate Ascent for Regularized Loss Minimization

加速的正则化损失最小化随机对偶坐标上升法

Shai Shalev-Shwartz, Tong Zhang

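AI summary This paper proposes an accelerated stochastic dual coordinate ascent method for regularized loss minimization, speeds it up with an inner-outer iteration procedure, and improves theoretical results for key machine learning optimization problems such as SVM and logistic regression.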

详情
AI中文摘要

我们介绍了一种随机对偶坐标上升方法的正则化版本,并展示了如何通过内-外迭代流程加速该方法。我们分析了该框架的运行时间并获得了改进现有最先进结果的速率,适用于支持向量机、逻辑回归、岭回归、Lasso和多类支持向量机等多种关键机器学习优化问题。实验验证了我们的理论发现。

英文摘要

We introduce a proximal version of the stochastic dual coordinate ascent method and show how to accelerate the method using an inner-outer iteration procedure. We analyze the runtime of the framework and obtain rates that improve state-of-the-art results for various key machine learning optimization problems including SVM, logistic regression, ridge regression, Lasso, and multiclass SVM. Experiments validate our theoretical findings.

1309.5803 2026-06-04 cs.LG cs.DC cs.SY eess.SY math.OC 版本更新

Scalable Anomaly Detection in Large Homogenous Populations

大规模同质群体中的可扩展异常检测

Henrik Ohlsson, Tianshi Chen, Sina Khoshfetrat Pakazad, Lennart Ljung, S. Shankar Sastry

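AI summary This paper proposes an optimization approach to anomaly detection in large homogeneous populations, relaxing the problem to a convex one that can be solved in a distributed manner and that yields the exact solution under certain conditions.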

详情
AI中文摘要

在大规模群体中进行异常检测是一个具有挑战性但高度相关的问题。该问题本质上是一个多假设问题,每个假设对应系统被划分为正常系统和异常系统的一种划分。随着系统数量的增加,假设数量迅速增长,因此对于实际应用中的问题,近似解变得必要。在本文中,我们采用优化方法来解决这个多假设问题。我们首先观察到该问题等价于一个非凸组合优化问题。然后将问题松弛为一个可以分布式在系统上求解的凸问题,并且随着系统数量的增加,该问题保持计算可行性。所提出方法的一个有趣性质是,在某些条件下可以证明其给出的结果与组合多假设问题相同,因此松弛是紧的。

英文摘要

Anomaly detection in large populations is a challenging but highly relevant problem. The problem is essentially a multi-hypothesis problem, with a hypothesis for every division of the systems into normal and anomalous systems. The number of hypotheses grows rapidly with the number of systems, and approximate solutions become a necessity for any problem of practical interest. In the current paper we take an optimization approach to this multi-hypothesis problem. We first observe that the problem is equivalent to a non-convex combinatorial optimization problem. We then relax the problem to a convex problem that can be solved distributively on the systems and that stays computationally tractable as the number of systems increases. An interesting property of the proposed method is that it can under certain conditions be shown to give exactly the same result as the combinatorial multi-hypothesis problem, and the relaxation is hence tight.

1309.0866 2026-06-04 cs.LO cs.AI cs.LG cs.SY eess.SY 版本更新

On the Robustness of Temporal Properties for Stochastic Models

关于随机模型中时间属性的鲁棒性

Ezio Bartocci, Luca Bortolussi, Laura Nenzi, Guido Sanguinetti

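AI summary This paper studies the robustness of temporal properties in stochastic models, proposes robustness measures, and combines them with the satisfaction probability to optimize system design.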

Comments In Proceedings HSB 2013, arXiv:1308.5724

详情
Journal ref
EPTCS 125, 2013, pp. 3-19
AI中文摘要

随机模型如连续时间马尔可夫链(CTMC)和随机混合自动机(SHA)因其能捕捉生物过程中的随机性而成为强大的形式化工具。形式化建模中的经典问题——模型检查问题——即计算特定时间逻辑公式行为在给定随机过程中的概率。然而,除了满足性外,还关注系统维持特定涌现行为的鲁棒性,不受外部噪声或模型参数微小变化的影响。本文提出将鲁棒性概念扩展至随机系统,展示其自然导致鲁棒性分数分布,并通过两个例子说明如何近似分布及其关键指标:平均鲁棒性和条件平均鲁棒性。其次,展示了如何将这些指标与满足概率结合,以解决系统设计问题,即优化随机模型的控制参数以最大化所需规范的鲁棒性。

英文摘要

Stochastic models such as Continuous-Time Markov Chains (CTMC) and Stochastic Hybrid Automata (SHA) are powerful formalisms to model and to reason about the dynamics of biological systems, due to their ability to capture the stochasticity inherent in biological processes. A classical question in formal modelling with clear relevance to biological modelling is the model checking problem, i.e., calculating the probability that a behaviour, expressed for instance in terms of a certain temporal logic formula, may occur in a given stochastic process. However, one may not only be interested in the notion of satisfiability, but also in the capacity of a system to maintain a particular emergent behaviour unaffected by perturbations caused, e.g., by extrinsic noise or by possible small changes in the model parameters. To address this issue, researchers from the verification community have recently proposed several notions of robustness for temporal logic, providing suitable definitions of distance between a trajectory of a (deterministic) dynamical system and the boundaries of the set of trajectories satisfying the property of interest. The contributions of this paper are twofold. First, we extend the notion of robustness to stochastic systems, showing that this naturally leads to a distribution of robustness scores. By discussing two examples, we show how to approximate the distribution of the robustness score and its key indicators: the average robustness and the conditional average robustness. Second, we show how to combine these indicators with the satisfaction probability to address the system design problem, where the goal is to optimize some control parameters of a stochastic model in order to maximize the robustness of the desired specifications.

1308.5329 2026-06-04 cs.LO cs.LG cs.SY eess.SY 版本更新

Monitoring with uncertainty

监控中的不确定性

Ezio Bartocci, Radu Grosu

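AI summary This paper discusses how, under a monitoring overhead control mechanism, statistical models can learn application behavior and fill gaps in the monitored trace in order to estimate the probability that a property is violated.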

Comments In Proceedings HAS 2013, arXiv:1308.4904

详情
Journal ref
EPTCS 124, 2013, pp. 1-4
AI中文摘要

我们讨论了对一个带有监控的程序进行运行时验证的问题,该程序未能发出和监控某些事件。这些间隙可能发生在引入监控开销控制机制以禁用受实时约束的应用程序监控时。我们展示了如何利用统计模型来学习应用程序行为,并“填补”引入的间隙。最后,我们介绍了并讨论了过去三年中开发的一些技术,用于在不完整轨迹存在的情况下估计感兴趣属性被违反的概率。

英文摘要

We discuss the problem of runtime verification of an instrumented program that fails to emit and monitor some events. These gaps can occur when a monitoring overhead control mechanism is introduced to disable the monitoring of an application with real-time constraints. We show how to use statistical models to learn the application behavior and to "fill in" the introduced gaps. Finally, we present and discuss some techniques developed in the last three years to estimate the probability that a property of interest is violated in the presence of an incomplete trace.

1308.3558 2026-06-04 cs.LG cs.NA math.NA 版本更新

Fast Stochastic Alternating Direction Method of Multipliers

快速随机交替方向乘子法

Leon Wenliang Zhong, James T. Kwok

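AI summary This paper proposes a new stochastic ADMM algorithm that incrementally approximates the full gradient in linearized ADMM, improving the convergence rate on convex problems to O(1/T) and matching batch ADMM without visiting all samples in each iteration.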

详情
AI中文摘要

在本文中,我们提出了一种新的随机交替方向乘子法(ADMM)算法,该算法在线性化ADMM公式中逐步近似完整梯度。除了具有与现有随机ADMM算法相同比例的每迭代复杂度外,所提出的算法在凸问题上的收敛速度从O(1/√T)提升至O(1/T),其中T是迭代次数。这与批量ADMM算法的收敛速度一致,但无需每次迭代都访问所有样本。在图引导融合Lasso上的实验表明,新算法显著快于现有随机和批量ADMM算法。

英文摘要

In this paper, we propose a new stochastic alternating direction method of multipliers (ADMM) algorithm, which incrementally approximates the full gradient in the linearized ADMM formulation. Besides having a per-iteration complexity as low as that of existing stochastic ADMM algorithms, the proposed algorithm improves the convergence rate on convex problems from $O(1/\sqrt{T})$ to $O(1/T)$, where $T$ is the number of iterations. This matches the convergence rate of the batch ADMM algorithm, but without the need to visit all the samples in each iteration. Experiments on the graph-guided fused lasso demonstrate that the new algorithm is significantly faster than state-of-the-art stochastic and batch ADMM algorithms.

1308.2853 2026-06-04 cs.LG cs.IR cs.NA math.NA math.ST stat.ML stat.TH 版本更新

When are Overcomplete Topic Models Identifiable? Uniqueness of Tensor Tucker Decompositions with Structured Sparsity

何时过完备主题模型是可识别的?具有结构稀疏性的张量Tucker分解的唯一性

Animashree Anandkumar, Daniel Hsu, Majid Janzamin, Sham Kakade

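AI summary This paper studies the identifiability of overcomplete topic models from observable moments of a certain order and establishes uniqueness of tensor Tucker decompositions under structured sparsity constraints.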

详情
AI中文摘要

过完备潜在表示近年来在无监督特征学习中非常流行。本文指明哪些过完备模型在特定阶的可观察矩下可识别。我们考虑在过完备 regime 中的概率混合或主题模型,其中潜在主题数量远超观察词汇量。尽管一般过完备主题模型不可识别,但通过引入称为主题持续性的约束,我们建立了通用可识别性条件。这些条件涉及对主题-词汇矩阵或模型总体结构的新“高阶”展开条件。这些高阶展开条件允许过完备模型,并要求存在从潜在主题到高阶观察词汇的完美匹配。我们证明在过完备 regime 中,随机结构主题模型以高概率可识别。我们的可识别性结果允许一般(非退化)分布建模主题比例,从而在框架中处理任意相关的主题。我们的可识别性结果暗示了一类具有结构稀疏性的张量分解的唯一性,该类包含在Tucker分解中,但比Candecomp/Parafac(CP)分解更一般。

英文摘要

Overcomplete latent representations have been very popular for unsupervised feature learning in recent years. In this paper, we specify which overcomplete models can be identified given observable moments of a certain order. We consider probabilistic admixture or topic models in the overcomplete regime, where the number of latent topics can greatly exceed the size of the observed word vocabulary. While general overcomplete topic models are not identifiable, we establish generic identifiability under a constraint, referred to as topic persistence. Our sufficient conditions for identifiability involve a novel set of "higher order" expansion conditions on the topic-word matrix or the population structure of the model. This set of higher-order expansion conditions allows for overcomplete models, and requires the existence of a perfect matching from latent topics to higher order observed words. We establish that random structured topic models are identifiable w.h.p. in the overcomplete regime. Our identifiability results allow for general (non-degenerate) distributions for modeling the topic proportions, and thus, we can handle arbitrarily correlated topics in our framework. Our identifiability results imply uniqueness of a class of tensor decompositions with structured sparsity which is contained in the class of Tucker decompositions, but is more general than the Candecomp/Parafac (CP) decomposition.

1306.2665 2026-06-04 cs.IT cs.LG cs.SY eess.SY math.IT math.OC stat.ML 版本更新

Precisely Verifying the Null Space Conditions in Compressed Sensing: A Sandwiching Algorithm

在压缩感知中精确验证空域条件:一种 Sandwiching 算法

Myung Cho, Weiyu Xu

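AI summary This paper proposes new algorithms for verifying the null space condition in compressed sensing, improving on the complexity and accuracy of existing methods by computing α_k efficiently.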

Comments 30 pages

详情
AI中文摘要

本文提出新的高效算法来验证压缩感知(CS)中的空域条件。给定一个(n-m)×n(m>0)的CS矩阵A和正数k,我们感兴趣于计算α_k = max_{z: Az=0, z≠0} max_{K: |K|≤k} ||z_K||₁ / ||z||₁,其中K代表{1,2,...,n}的子集,|K|是K的基数。特别地,我们关注找到使得α_k < 1/2的最大k。然而,计算α_k被认为极具挑战性。本文首先提出一系列新的多项式时间算法来计算α_k的上界。基于这些新算法,我们进一步设计了一种新的Sandwiching算法,以大大降低复杂度的方式计算精确的α_k。当需要时,这种新的Sandwiching算法还能在计算复杂度和结果精度之间实现平滑的权衡。实验证明了我们的算法在性能上的改进;并且我们的算法输出精确的α_k值,其复杂度远低于穷举搜索。

英文摘要

In this paper, we propose new efficient algorithms to verify the null space condition in compressed sensing (CS). Given an $(n-m) \times n$ ($m>0$) CS matrix $A$ and a positive $k$, we are interested in computing $\displaystyle \alpha_k = \max_{\{z: Az=0, z\neq 0\}} \max_{\{K: |K|\leq k\}} \frac{\|z_K\|_{1}}{\|z\|_{1}}$, where $K$ represents subsets of $\{1,2,...,n\}$, and $|K|$ is the cardinality of $K$. In particular, we are interested in finding the maximum $k$ such that $\alpha_k < \frac{1}{2}$. However, computing $\alpha_k$ is known to be extremely challenging. In this paper, we first propose a series of new polynomial-time algorithms to compute upper bounds on $\alpha_k$. Based on these new polynomial-time algorithms, we further design a new sandwiching algorithm, to compute the \emph{exact} $\alpha_k$ with greatly reduced complexity. When needed, this new sandwiching algorithm also achieves a smooth tradeoff between computational complexity and result accuracy. Empirical results show the performance improvements of our algorithm over existing known methods; and our algorithm outputs precise values of $\alpha_k$, with much lower complexity than exhaustive search.
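
To make the quantity α_k concrete, the sketch below evaluates it in the one special case where it is trivial: a full-rank (n-1) × n matrix (m = 1), whose null space is spanned by a single vector z. The ratio is scale-invariant, so the inner maximum is attained by taking K to be the k largest-magnitude entries of z. This is only an illustration of the definition, not the sandwiching algorithm of the paper; the function name `alpha_k_one_dim_null_space` and the example matrix are hypothetical.

```python
import numpy as np

def alpha_k_one_dim_null_space(A, k):
    """alpha_k for a full-rank (n-1) x n matrix A, whose null space is one-dimensional."""
    # With full_matrices=True (the default), the last right-singular vector
    # spans the null space of a full-rank (n-1) x n matrix.
    _, _, vt = np.linalg.svd(A)
    z = vt[-1]
    abs_sorted = np.sort(np.abs(z))[::-1]
    # The best index set K contains the k largest-magnitude entries of z.
    return abs_sorted[:k].sum() / abs_sorted.sum()

# Illustrative matrix: its null space is spanned by (1, 1, 1),
# so alpha_1 = 1/3 and alpha_2 = 2/3 (up to floating-point error).
A_demo = np.array([[1.0, -1.0, 0.0],
                   [0.0, 1.0, -1.0]])
alphas = [alpha_k_one_dim_null_space(A_demo, k) for k in (1, 2)]
print(alphas)
```

For this matrix the largest k with α_k < 1/2 is k = 1. When the null space has dimension greater than one, the outer maximum over z is what makes the computation hard, which is the case the abstract addresses.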

1307.5494 2026-06-04 math.NA cs.LG cs.NA stat.ML 版本更新

On GROUSE and Incremental SVD

关于GROUSE和增量SVD

Laura Balzano, Stephen J. Wright

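AI summary This paper modifies incremental SVD to handle missing data, shows that it is equivalent to GROUSE for a particular parameter choice, and discusses incremental algorithms for subspace estimation.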

详情
AI中文摘要

GROUSE(Grassmannian Rank-One Update Subspace Estimation)是一种增量算法,用于从序列向量中识别R^n的子空间,其中每次迭代仅揭示每个向量的部分组件。近期分析表明,在某些假设下,GROUSE在期望线性速率下局部收敛。GROUSE与增量奇异值分解算法有相似之处,后者在添加单列后更新矩阵的SVD。本文改进增量SVD以处理缺失数据,并证明该改进方法在特定算法参数选择下等同于GROUSE。

英文摘要

GROUSE (Grassmannian Rank-One Update Subspace Estimation) is an incremental algorithm for identifying a subspace of R^n from a sequence of vectors in this subspace, where only a subset of components of each vector is revealed at each iteration. Recent analysis has shown that GROUSE converges locally at an expected linear rate, under certain assumptions. GROUSE has a similar flavor to the incremental singular value decomposition algorithm, which updates the SVD of a matrix following addition of a single column. In this paper, we modify the incremental SVD approach to handle missing data, and demonstrate that this modified approach is equivalent to GROUSE, for a certain choice of an algorithmic parameter.

1306.2663 2026-06-04 cs.LG cs.NA math.NA 版本更新

Large Margin Low Rank Tensor Analysis

大边距低秩张量分析

Guoqiang Zhong, Mohamed Cheriet

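AI summary This paper presents a supervised model for learning the intrinsic structure of tensors embedded in a high-dimensional Euclidean space; a fixed point continuation procedure jointly discovers the optimal dimensionality and the low-dimensional embeddings, simulating the cognitive process of the human brain, and experiments on object and face recognition show its superiority.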

Comments 30 pages

详情
AI中文摘要

除了向量表示外,人类认知的直接对象通常是高阶张量,如2D图像和3D纹理。基于此,两个有趣的问题自然产生:人类大脑如何以“流形”方式表示这些张量感知,以及如何在“流形”上进行识别。本文提出一种监督模型,用于学习嵌入高维欧几里得空间中的张量的内在结构。通过固定点延续过程,该模型自动且联合发现最优维度和低维嵌入的表示。这使其成为人类大脑认知过程的有效模拟。此外,基于所学低维嵌入之间相似性的模型推广可视为人类大脑识别的对应物。在物体识别和人脸识别应用中的实验展示了所提模型优于现有方法的优越性。

英文摘要

Other than vector representations, the direct objects of human cognition are generally high-order tensors, such as 2D images and 3D textures. From this fact, two interesting questions naturally arise: How does the human brain represent these tensor perceptions in a "manifold" way, and how can they be recognized on the "manifold"? In this paper, we present a supervised model to learn the intrinsic structure of the tensors embedded in a high-dimensional Euclidean space. With the fixed point continuation procedures, our model automatically and jointly discovers the optimal dimensionality and the representations of the low-dimensional embeddings. This makes it an effective simulation of the cognitive process of the human brain. Furthermore, the generalization of our model based on similarity between the learned low-dimensional embeddings can be viewed as a counterpart of recognition in the human brain. Experiments on applications for object recognition and face recognition demonstrate the superiority of our proposed model over state-of-the-art approaches.

1306.1716 2026-06-04 cs.LG cs.DS cs.NA math.NA stat.ML 版本更新

Fast greedy algorithm for subspace clustering from corrupted and incomplete data

从被破坏和不完整数据中快速贪心算法用于子空间聚类

Alexander Petukhov, Inna Kozlov

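AI summary This paper proposes FGSSC, an efficient subspace clustering algorithm that handles noisy data with a high rate of erasures and clusters better than existing methods at only slightly higher computational cost.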

Comments arXiv admin note: substantial text overlap with arXiv:1304.4282

详情
AI中文摘要

我们描述了快速贪心稀疏子空间聚类(FGSSC)算法,提供了一种高效的聚类方法,用于属于几个低维线性或仿射子空间的数据。我们的算法与前人不同之处在于能够处理具有高擦除率的噪声数据(丢失的已知坐标条目)和错误(被破坏的未知坐标条目)。我们讨论了如何实现快速贪心算法的快速版本,其贪心策略被整合到基本算法的迭代中。我们提供了数值证据表明,该快速贪心算法在子空间聚类能力上不仅优于作者所采用的现有最先进SSC算法,也优于最近的GSSC算法。同时,其计算成本仅略高于SSC。算法的显著优势在几个合成模型以及扩展耶鲁B数据集中也得到了验证。特别是,人脸识别的误分类率比SSC算法低6-20倍。我们还提供了数值证据,证明FGSSC算法能够高效地对被破坏的数据进行聚类,即使子空间维度总和显著超过环境空间的维度。

英文摘要

We describe the Fast Greedy Sparse Subspace Clustering (FGSSC) algorithm, which provides an efficient method for clustering data belonging to a few low-dimensional linear or affine subspaces. The main difference of our algorithm from its predecessors is its ability to work with noisy data having a high rate of erasures (missed entries with known coordinates) and errors (corrupted entries with unknown coordinates). We discuss how to implement, with maximum efficiency, the fast version of the greedy algorithm, whose greedy strategy is incorporated into the iterations of the basic algorithm. We provide numerical evidence that, in subspace clustering capability, the fast greedy algorithm outperforms not only the existing state-of-the-art SSC algorithm, taken by the authors as the basic algorithm, but also the recent GSSC algorithm. At the same time, its computational cost is only slightly higher than the cost of SSC. Numerical evidence of the algorithm's significant advantage is presented for a few synthetic models as well as for the Extended Yale B dataset of facial images. In particular, the face recognition misclassification rate turned out to be 6-20 times lower than for the SSC algorithm. We also provide numerical evidence that the FGSSC algorithm is able to perform clustering of corrupted data efficiently even when the sum of subspace dimensions significantly exceeds the dimension of the ambient space.

1305.4081 2026-06-04 cs.LG cs.NA math.NA math.OC 版本更新

Conditions for Convergence in Regularized Machine Learning Objectives

收敛性条件:正则化机器学习目标中的收敛性条件

Patrick Hop, Xinghao Pan

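AI summary This paper examines how to analyze the convergence rates of modern convex optimization algorithms, discusses the effect of non-linear slowdowns in distributed computing on convergence, and establishes the existence of convergence together with lower bounds on the rates.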

Comments 3 Pages

详情
AI中文摘要

现代凸优化算法的收敛率分析可以通过二元手段实现:经验收敛分析或理论收敛分析。当进入分布式计算领域时,这两种信息获取途径在效能上出现分歧,因为广播操作中引入了非直观的非线性减速效应,在某些情况下还涉及收集操作。尽管收敛率存在细微差别,我们仍能证明收敛性的存在,并给出收敛率的下界。本文将为在该问题领域中遇到此问题的机器学习从业者提供有用的速查指南。

英文摘要

Analysis of the convergence rates of modern convex optimization algorithms can be achieved through binary means: analysis of empirical convergence, or analysis of theoretical convergence. These two pathways of capturing information diverge in efficacy when moving to the world of distributed computing, due to the introduction of non-intuitive, non-linear slowdowns associated with broadcasting, and in some cases, gathering operations. Despite these nuances in the rates of convergence, we can still show the existence of convergence, and lower bounds for the rates. This paper will serve as a helpful cheat-sheet for machine learning practitioners encountering this problem class in the field.

1305.0395 2026-06-04 math.NA cs.LG cs.NA q-bio.NC stat.ML 版本更新

Tensor Decompositions: A New Concept in Brain Data Analysis?

张量分解:脑数据处理中的新概念?

Andrzej Cichocki

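AI summary This paper surveys new models and approaches for tensor decompositions in multiway BSS/ICA, feature extraction, classification, and multiway PLS regression, covering constrained Tucker and CP models and penalized tensor decompositions.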

详情
Journal ref
Control Measurement, and System Integration (SICE), special issue; Measurement of Brain Functions and Bio-Signals, 7, 507-517, (2011)
AI中文摘要

矩阵分解及其扩展到张量分解和分解的技术已成为线性和多线性盲源分离(BSS)中的重要方法,尤其在多向独立成分分析(ICA)、非负矩阵和张量分解(NMF/NTF)、平滑成分分析(SmoCA)和稀疏成分分析(SCA)中。此外,张量分解在多线性BSS之外还有许多潜在应用,如特征提取、分类、降维和多向聚类。本文简要回顾了张量分解在组联多向BSS/ICA、特征提取、分类和多向偏最小二乘(MPLS)回归中的新模型和方法。关键词:多线性BSS,联多向BSS/ICA,张量分解和分解,约束Tucker和CP模型,惩罚张量分解(PTD),特征提取,分类,多向PLS和CCA。

英文摘要

Matrix factorizations and their extensions to tensor factorizations and decompositions have become prominent techniques for linear and multilinear blind source separation (BSS), especially multiway Independent Component Analysis (ICA), Nonnegative Matrix and Tensor Factorization (NMF/NTF), Smooth Component Analysis (SmoCA) and Sparse Component Analysis (SCA). Moreover, tensor decompositions have many other potential applications beyond multilinear BSS, especially feature extraction, classification, dimensionality reduction and multiway clustering. In this paper, we briefly overview new and emerging models and approaches for tensor decompositions in applications to group and linked multiway BSS/ICA, feature extraction, classification and Multiway Partial Least Squares (MPLS) regression problems. Keywords: Multilinear BSS, linked multiway BSS/ICA, tensor factorizations and decompositions, constrained Tucker and CP models, Penalized Tensor Decompositions (PTD), feature extraction, classification, multiway PLS and CCA.

1304.7710 2026-06-04 eess.SY cs.LG cs.SY physics.soc-ph 版本更新

Learning Geo-Temporal Non-Stationary Failure and Recovery of Power Distribution

学习电力分配网络的地理时间非平稳故障与恢复

Yun Wei, Chuanyi Ji, Floyd Galvan, Stephen Couvillon, George Orellana, James Momoh

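AI summary This paper studies the failure and recovery behavior of power distribution networks in non-stationary environments, proposes a new modeling approach, and validates learning of the model parameters on real-world cases.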

Comments 12 pages, 12 figures, Accepted with minor revisions by TNNLS, Special Issue on Learning in Nonstationary and Evolving Environments. arXiv admin note: text overlap with arXiv:1202.4720

详情
AI中文摘要

智能能源电网是机器学习在非平稳环境中的新应用领域。当大规模故障发生在电力分配网络中,由于外部干扰如飓风和恶劣天气时,这种非平稳环境就会出现。电力分配网络位于电网边缘,特别容易受到外部干扰。缺乏可量化的途径来学习大规模故障和恢复的非平稳行为。本文从三个方面研究这种非平稳行为。首先,推导出大规模故障和恢复整个生命周期的新公式。其次,开发空间时间模型,将故障和恢复建模为基于地理定位的多变量非平稳GI(t)/G(t)/Infinity队列。第三,非平稳空间时间模型识别出少量需要学习的参数。学习应用于两个真实案例:一个是飓风Ike,其中操作网络的数据精确记录了故障和恢复;另一个是飓风桑迪,其中使用汇总数据推断受影响区域的故障和恢复过程。模型参数使用真实数据学习。学习结果得出两个发现:(a) 两种不同运营商网络在两个不同飓风中的故障率行为相似,但在地理区域上不同。(b) 飓风Ike中存在快速和缓慢恢复,但桑迪飓风影响的区域网络中只显示缓慢恢复。

英文摘要

The smart energy grid is an emerging area for new applications of machine learning in a non-stationary environment. Such a non-stationary environment emerges when large-scale failures occur at power distribution networks due to external disturbances such as hurricanes and severe storms. Power distribution networks lie at the edge of the grid, and are especially vulnerable to external disruptions. Quantifiable approaches are lacking and needed to learn non-stationary behaviors of large-scale failure and recovery of power distribution. This work studies such non-stationary behaviors in three aspects. First, a novel formulation is derived for an entire life cycle of large-scale failure and recovery of power distribution. Second, spatial-temporal models of failure and recovery of power distribution are developed as geo-location based multivariate non-stationary GI(t)/G(t)/Infinity queues. Third, the non-stationary spatial-temporal models identify a small number of parameters to be learned. Learning is applied to two real-life examples of large-scale disruptions. One is from Hurricane Ike, where data from an operational network is exact on failures and recoveries. The other is from Hurricane Sandy, where aggregated data is used for inferring failure and recovery processes at one of the impacted areas. Model parameters are learned using real data. Two findings emerge as results of learning: (a) Failure rates behave similarly at the two different provider networks for two different hurricanes but differently at the geographical regions. (b) Both rapid and slow recovery are present for Hurricane Ike, but only slow recovery is shown for a regional distribution network from Hurricane Sandy.