arXivDaily: arXiv daily academic digest, updated Monday through Friday
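2606.20557 2026-06-19 cs.LG math.ST stat.ML stat.TH New submission

Optimal Deterministic Multicalibration and Omniprediction

Georgy Noarov, Aaron Roth

Affiliations: University of Pennsylvania

AI summary: This paper gives a deterministic algorithm that attains the minimax-optimal sample complexity for multicalibration and extends it to outcome indistinguishability, settling whether randomization is necessary for optimal sample complexity.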

Details
Abstract

A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for many downstream applications and is a basic desideratum of trustworthy machine learning. Before this work, all predictors known to attain the minimax-optimal $\widetilde O(\varepsilon^{-3})$ sample complexity rate for $\varepsilon$-multicalibration were randomized, while deterministic predictors were known only with substantially worse sample complexity. Whether randomization is necessary for optimal sample complexity in multicalibration was explicitly asked by [CLNR26] and implicitly in several prior works. We resolve this open problem by giving a minimax-optimal multicalibration algorithm that outputs a deterministic predictor. We then generalize the algorithm to produce optimal deterministic predictors that satisfy outcome indistinguishability (OI) with respect to finite or finitely covered collections of tests. As an application, this also gives deterministic omnipredictors and panpredictors with optimal sample complexity, resolving open problems posed by [OKK25] and [BHHLZ25].
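
The definition in the first sentence can be made concrete with a small empirical check. The sketch below is not the paper's algorithm: it only measures, on a finite sample, how far a deterministic predictor is from being calibrated after reweighting by each group weight. The error notion used here (an l1 sum over prediction values, maximized over groups) is one common choice and is an assumption, as are the toy data and all function names.

```python
from collections import defaultdict

def multicalibration_error(samples, predictor, groups):
    """Max over groups g of sum_v |mean_i g(x_i) * (y_i - v) * 1[f(x_i) = v]|."""
    n = len(samples)
    worst = 0.0
    for g in groups:
        bias_by_value = defaultdict(float)
        for x, y in samples:
            v = predictor(x)
            # Accumulate the g-weighted residual of every sample predicted as v.
            bias_by_value[v] += g(x) * (y - v) / n
        worst = max(worst, sum(abs(b) for b in bias_by_value.values()))
    return worst

# Toy data (assumed): outcome rate is 0.75 for odd x and 0.25 for even x.
samples = [(x, 1.0 if k < (3 if x % 2 else 1) else 0.0)
           for x in range(4) for k in range(4)]
groups = [lambda x: 1.0, lambda x: float(x % 2)]  # everyone, and the odd contexts

constant = lambda x: 0.5                       # calibrated overall only
by_parity = lambda x: 0.75 if x % 2 else 0.25  # calibrated on both groups

print(multicalibration_error(samples, constant, groups))
print(multicalibration_error(samples, by_parity, groups))
```

The constant predictor is calibrated overall but biased on the odd group, which is exactly the gap that multicalibration closes.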

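2606.20547 2026-06-19 cs.LG cs.CV cs.GR cs.RO math.DG New submission

The Token Is a Group Element: On Lie-Algebra Attention over Matrix Lie Groups

Przemyslaw Musialski

Affiliations: New Jersey Institute of Technology

AI summary: Proposes Lie-Algebra Attention, which treats tokens as elements of a matrix Lie group and uses the Lie-algebra norm of the relative pose as the attention score; it needs no learned kernel or representation-theoretic machinery and applies to non-compact, non-abelian groups such as the affine full-frame groups.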

Comments preprint, 19 pages, 3 figures

Details
Abstract

We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action $\rho(g)$ carrying it. To our knowledge this is the first attention construction whose tokens are bare matrix Lie group elements: their score is the closed-form algebra norm of the relative pose rather than a learned kernel, and it reaches the affine full-frame groups that every irrep- or surjective-exp-based method must exclude. We call it Lie-Algebra Attention. Once tokens are group elements, the rest follows with none of the usual representation-theoretic machinery. The relative geometry of a pair is canonical, $g_i^{-1} g_j$, so the pairwise invariant $w_{ij} = \log(g_i^{-1} g_j)$ is intrinsic rather than designed; equivariance under the diagonal $G$-action is tautological, and the cocycle condition holds automatically. The attention score is the negative squared algebra norm, $s_{ij} = -\|\log(g_i^{-1} g_j)\|_\lambda^2/\tau$: the canonical proximity kernel under a block-weighted Frobenius inner product, with no irreducible representations, spherical harmonics, Clebsch-Gordan products, or learned kernel. The construction applies to any matrix Lie group on a chosen logarithm chart containing the relative poses, including the non-compact non-abelian affine groups with scale and shear that no vector-token attention method reaches: neither the irrep tradition nor surjective-exp methods. Three sequence-completion experiments, on SE(2), SO(3), and Aff(2), bear this out: the closed-form score matches a learned MLP kernel on the same invariant and outperforms it on SE(2), using 50 to 80x fewer score parameters, while a vector-token baseline breaks invariance by five to twelve orders of magnitude.
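
The score formula $s_{ij} = -\|\log(g_i^{-1} g_j)\|_\lambda^2/\tau$ can be illustrated on SE(2), where the logarithm has a closed form. The sketch below is a minimal illustration, not the authors' implementation: the `(theta, tx, ty)` parametrization, the split of the block weights into `lam_rot` and `lam_trans`, and all function names are assumptions. It also checks the invariance under a common left action that the abstract describes as tautological.

```python
import math

def se2_compose(h, g):
    """Group product h * g for SE(2) elements given as (theta, tx, ty)."""
    th, hx, hy = h
    tg, gx, gy = g
    c, s = math.cos(th), math.sin(th)
    return (th + tg, hx + c * gx - s * gy, hy + s * gx + c * gy)

def se2_relative(gi, gj):
    """Relative pose g_i^{-1} g_j."""
    ti, xi, yi = gi
    tj, xj, yj = gj
    dx, dy = xj - xi, yj - yi
    c, s = math.cos(-ti), math.sin(-ti)
    return (tj - ti, c * dx - s * dy, s * dx + c * dy)

def se2_log(g):
    """Principal logarithm: returns (omega, ux, uy) with omega in (-pi, pi]."""
    theta, tx, ty = g
    omega = math.atan2(math.sin(theta), math.cos(theta))
    if abs(omega) < 1e-9:
        return (omega, tx, ty)
    a = math.sin(omega) / omega
    b = (1.0 - math.cos(omega)) / omega
    det = a * a + b * b
    # Solve V u = t with V = [[a, -b], [b, a]].
    return (omega, (a * tx + b * ty) / det, (-b * tx + a * ty) / det)

def score(gi, gj, lam_rot=1.0, lam_trans=1.0, tau=1.0):
    """Negative squared block-weighted Frobenius norm of log(g_i^{-1} g_j)."""
    omega, ux, uy = se2_log(se2_relative(gi, gj))
    # The 3x3 algebra element has entries -omega, +omega (rotation block) and ux, uy.
    sq_norm = lam_rot * 2.0 * omega ** 2 + lam_trans * (ux ** 2 + uy ** 2)
    return -sq_norm / tau

gi, gj, h = (0.3, 1.0, 2.0), (-0.5, 0.5, -1.0), (1.1, -2.0, 0.7)
print(score(gi, gj), score(se2_compose(h, gi), se2_compose(h, gj)))
```

Because the score depends on the pair only through $g_i^{-1} g_j$, moving both tokens by the same `h` leaves it unchanged up to floating-point error.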

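2606.20443 2026-06-19 eess.SY cs.LG cs.SY math.AT New submission

Topological Data Analysis for High-Dimensional Dynamic Process Monitoring

Angan Mukherjee, Tyler A. Soderstrom, Michael J. Kurtz, Victor M. Zavala

AI summary: Proposes a process-monitoring method that combines topological data analysis with machine learning: multivariate time series are represented as manifolds, topological descriptors summarize their structure, and a neural ordinary differential equation learns how that topological structure evolves, enabling effective event detection.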

Details
Abstract

Real-time process monitoring requires methods that extract actionable information from high-dimensional time-series data. In this work, we present a new approach for process monitoring that combines tools of topological data analysis (TDA) and machine learning. In the proposed approach, we represent multivariate time-series data as manifolds and use topological descriptors to summarize the structure of such data; we then use a neural ordinary differential equation to learn the dynamic evolution of the topological structure of the system. Using real data from an industrial process, we show that this trajectory-based event detection approach is effective at detecting diverse types of events. We contrast this approach against reconstruction-based approaches such as principal component analysis and autoencoders and against a trajectory-based approach that uses Koopman autoencoders.

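2606.20442 2026-06-19 cs.LG cs.NA cs.NE math.NA New submission

Evolutionary Two-Stage Hyperparameter Optimization Strategies for Physics-Informed Neural Networks

Fedor Buzaev, Dmitry Efremenko, Egor Bugaev, Andrei Ermakov, Denis Derkach, Daria Pugacheva, Fedor Ratnikov

Affiliations: HSE University; AXXX

AI summary: To address the unstable training and hyperparameter sensitivity of physics-informed neural networks, proposes a two-stage optimization strategy based on evolutionary algorithms: low-fidelity screening followed by full training, which yields significantly lower error on three PDE problems.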

Comments Equal advising: Daria Pugacheva and Fedor Ratnikov. Accepted to the ICLR 2026 Workshop on AI and PDEs

Details
Abstract

Physics-Informed Neural Networks (PINNs) solve Partial Differential Equations (PDEs) by embedding physical laws into neural network training. However, their performance suffers from unstable convergence, training plateaus, and strong sensitivity to architectural and optimization hyperparameters due to the highly non-convex and multi-term structure of the physics-informed loss. In this setting, the outer-loop hyperparameter search is a noisy and black-box optimization problem over heterogeneous parameters, where classical local or gradient-based strategies are easily trapped in suboptimal regions. Evolutionary algorithms, with their population-based exploration and ability to handle mixed, non-differentiable search spaces, provide a more robust mechanism for discovering promising configurations. We propose and investigate a two-stage approach based on evolutionary algorithms that combines exploration and exploitation parts of PINNs training to improve solution accuracy and robustness under fixed computational budgets. In the first stage, we perform low-fidelity training runs with truncated epochs to rapidly screen candidate configurations, treating hyperparameter selection as a black-box outer-loop problem. In the second stage, only the most promising candidates are fully trained with standard gradient-based optimizers to refine the solution. Evaluated on three popular problems, namely Advection, Klein-Gordon and Helmholtz equations, our method consistently outperforms standard training and achieves significantly lower mean error within constrained computational resources.
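
The two-stage structure described above (cheap truncated runs to screen candidates, then full training for the best few) can be sketched as follows. This is a toy illustration under stated assumptions, not the authors' method: `train` is a hypothetical stand-in for PINN training that returns a synthetic error, and the search space, mutation rule, and budgets are arbitrary choices.

```python
import math
import random

def train(config, epochs):
    """Hypothetical stand-in for PINN training; returns a synthetic error."""
    lr, width = config
    # Arbitrary toy landscape with its minimum near lr = 1e-3, width = 64.
    mismatch = (math.log10(lr) + 3.0) ** 2 + ((width - 64) / 64.0) ** 2
    return mismatch + 1.0 / (1.0 + epochs)

def mutate(config, rng):
    """Perturb the learning rate on a log scale and the width in steps of 16."""
    lr, width = config
    new_lr = min(1e-1, max(1e-5, lr * 10 ** rng.uniform(-0.5, 0.5)))
    new_width = min(256, max(8, width + rng.choice([-16, 0, 16])))
    return (new_lr, new_width)

def two_stage_search(rng, population=8, generations=5,
                     short_epochs=10, full_epochs=1000, top_k=2):
    pop = [(10 ** rng.uniform(-5, -1), rng.choice([8, 16, 32, 64, 128, 256]))
           for _ in range(population)]
    # Stage 1: evolutionary screening with truncated (low-fidelity) training.
    for _ in range(generations):
        ranked = sorted(pop, key=lambda c: train(c, short_epochs))
        parents = ranked[: population // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(population - len(parents))]
        pop = parents + children
    finalists = sorted(pop, key=lambda c: train(c, short_epochs))[:top_k]
    # Stage 2: only the finalists receive the full training budget.
    return min((train(c, full_epochs), c) for c in finalists)

error, config = two_stage_search(random.Random(0))
print(error, config)
```

The point of the design is budget allocation: most configurations only ever cost `short_epochs`, and the expensive `full_epochs` runs are spent on `top_k` candidates.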

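2606.20413 2026-06-19 eess.SP cs.IT math.IT New submission

Hybrid TRP-UE Sensing for Enhanced Target Localization

Necati Kagan Erkek, Marco Di Renzo, Arman Shojaeifard, Yasser Mestrah, Remun Koirala, Mohammad Heggo, Kunjan Shah

AI summary: Proposes a hybrid TRP-UE sensing mechanism that uses UE-assisted sensing to improve network-based sensing, significantly improving target localization in challenging propagation environments such as indoor factories.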

Comments 6 pages

Details
Abstract

Integrated Sensing and Communication (ISAC) refers to the capability for the network to provide communications services whilst also being able to sense the environment in a scalable manner. One of the key functions of ISAC is the accurate localization of passive and mobile sensing targets. This paper introduces a novel hybrid TRP-UE sensing mechanism that improves network-based sensing performance. Evaluation results are provided using 3GPP-compliant ISAC channel models. The results demonstrate the significant benefit in complementing TRP-based sensing with UE-assisted sensing in challenging propagation environments such as indoor factory.

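2606.20394 2026-06-19 cs.RO math.OC New submission

Agentic AutoResearch for Space Autonomy: An Auditable, LLM-Driven Research Agent for Aerospace Control Problems

Amit Jain, Richard Linares

Affiliations: Department of Aeronautics and Astronautics

AI summary: Proposes AutoResearch, a framework in which a large language model acts as an offline research agent that iteratively develops spacecraft control policies, with a built-in credibility layer that audits each reported result against measured seed noise; validated on rendezvous and docking problems.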

Details
Abstract

Spacecraft guidance, navigation, and control functions are increasingly realized as learned policies distilled from expert solvers. Developing such a policy is itself a research process: an investigator selects an architecture and hyperparameters, runs experiments, and must determine whether an apparent improvement is genuine or merely seed noise. This paper presents AutoResearch, a framework in which a large language model autonomously drives that loop for aerospace control problems, coupled with a credibility layer, built into the loop, that certifies each reported result against the problem's own measured seed noise. The language model serves only as the offline research agent that develops the control policy; the trained policy it produces is then deployed onboard the spacecraft, while the model itself never operates the vehicle. At each iteration the agent reads a plain-language problem description and the run history, proposes a single edit to the training script, executes it, and logs the outcome. No reported result is credited until it passes the same three checks: measured per-problem seed noise, reseeded verification of the best configuration, and leave-one-out pruning of the agent's edits. The same loop is applied, unchanged, to two aerospace control problems: a Clohessy-Wiltshire relative rendezvous and a safety-constrained collision-avoidance docking past a keep-out zone, each calibrated against a known optimal control benchmark. In both, the audited policy clears the measured seed noise by many standard deviations; an undirected search over the same parameters does not. On the docking problem the gap becomes categorical: undirected search yields no feasible policy, while the learned policy stays outside the keep-out zone on every seed.
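
The idea of certifying a result against measured seed noise can be sketched in a few lines. This is an illustration only, not the paper's credibility layer: the decision rule (mean improvement must exceed `k` sample standard deviations of the baseline across seeds), the default `k`, and the example numbers are all assumptions, and the leave-one-out pruning of edits is not shown.

```python
import statistics

def seed_noise(baseline_scores):
    """Sample standard deviation of one configuration rerun under different seeds."""
    return statistics.stdev(baseline_scores)

def clears_noise(baseline_scores, candidate_scores, k=3.0):
    """True if the mean improvement (lower is better) exceeds k noise standard deviations."""
    sigma = seed_noise(baseline_scores)
    gain = statistics.mean(baseline_scores) - statistics.mean(candidate_scores)
    return gain > k * sigma

# Made-up losses: the baseline rerun on five seeds, two candidates rerun on fresh seeds.
baseline = [1.00, 1.02, 0.98, 1.01, 0.99]
clear_win = [0.80, 0.82, 0.79]
marginal = [0.98, 0.99, 0.97]

print(clears_noise(baseline, clear_win), clears_noise(baseline, marginal))
```

The `marginal` candidate looks better on average but stays within the spread that reseeding alone produces, so it would not be credited under this rule.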

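2606.20325 2026-06-19 cs.LG cs.SC math.DS New submission

Recurrent neural networks approximate continuous functions

Valentin Abadie, Clemens Hutter, Helmut Bölcskei

AI summary: Proves that for every continuous function on [-1,1] there is a ReLU recurrent neural network with fixed weights and fixed hidden dimension whose time evolution approximates the function uniformly, and gives convergence rates and minimax lower bounds.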

Details
Abstract

Classical approximation theorems ask for a new neural network whenever the target accuracy is improved. This paper studies the opposite possibility: can the network be chosen once and for all, and can accuracy be bought only by letting it run longer? We prove that this is possible for every continuous function on [-1,1]. More precisely, each such function is uniformly approximated by the time evolution of a single ReLU recurrent neural network with fixed weights and fixed hidden dimension. The mechanism behind the construction is a new intermediate model, the Turing machine with neural units (TMNU). This model retains the algorithmic freedom needed to implement polynomial approximation schemes, while remaining rigid enough to be simulated by RNNs with explicit bounds on hidden dimension and weight magnitude. The resulting convergence rates reflect the underlying polynomial approximation rates. We complement the construction with minimax lower bounds showing that runtime is not merely a proof artifact, but an unavoidable resource in this fixed-network approximation paradigm.

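2606.20195 2026-06-19 cs.PF cs.NA math.NA New submission

Randomized Sketching is Robust to Low-Precision Rounding on GPUs

Aryaman Jeendgar, Clément Flint, Hartwig Anzt

AI summary: Studies the performance and accuracy of randomized sketching in low precision on GPUs, focusing on a SparseStack generalization of CountSketch; finds that the FP16 rounding rule has little effect on embedding quality and that the sketch distribution matters more than the quantization rule.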

Comments 14 pages, 3 figures

Details
Abstract

Randomized sketching is a core primitive in randomized numerical linear algebra. On modern hardware architectures, in particular on GPUs, the performance of sparse sketches is limited by memory traffic and atomic accumulation rather than floating-point throughput. This makes sketching a natural target for mixed precision, provided that low-precision accumulation does not degrade the embedding quality. We study mixed-precision GPU implementations of sparse oblivious subspace embeddings, focusing on a SparseStack generalization of the GPU CountSketch kernel of Higgins et al. SparseStack improves embedding quality relative to CountSketch on coherent inputs, but its additional nonzeros per column increase atomic-update contention and reduce throughput. We therefore implement FP16 SparseStack variants using deterministic round-to-nearest, exact stochastic rounding, and dithered rounding, and compare them with FP32 SparseStack, CountSketch, mixed-precision CountSketch, and FlashSketch. Our main empirical finding is that, for the tested regimes, SparseStack embedding quality is insensitive to the FP16 rounding rule. Deterministic, stochastic, and dithered rounding FP16 SparseStack produce nearly identical subspace distortion and sketch-and-solve least-squares accuracy across incoherent, coherent, and adversarial test problems. The dominant accuracy factor is the sketch distribution rather than the quantization rule: SparseStack variants substantially improve distortion on coherent inputs, while all methods behave similarly on incoherent inputs. Since deterministic rounding has the lowest overhead, it provides the best performance--accuracy tradeoff among the FP16 SparseStack variants.

2606.20162 2026-06-19 cs.AI cs.IT cs.NI math.IT New submission

Implicit Semantic-Aware Communication Based on Hypergraph Reasoning

Yiwei Liao, Shurui Tu, Yong Xiao, Yingyu Li, Guangming Shi

Affiliations: China Electric Power Research Institute Co., Ltd; National Key Laboratory for Power Grid Environmental Protection; School of Electronic Information and Communications, Huazhong University of Science and Technology; Peng Cheng Laboratory; Pazhou Laboratory (Huangpu); School of Mechanical Engineering and Electronic Information, China University of Geosciences

AI summary: Proposes HISR, a hypergraph-based implicit semantic reasoning framework that models higher-order multi-entity relations with hypergraphs and makes semantic inference more robust over noisy channels, improving accuracy by up to 36.6%.

Comments This work is accepted at IEEE Transactions on Communications

Details
Abstract

Semantic-aware communication has emerged as a transformative paradigm for next-generation communication systems, shifting the fundamental goal from transmitting bit-level symbols to reliably recovering and understanding the semantic meaning of information. Previous studies have demonstrated that representing the semantic content of source messages as graph-based structures can significantly improve communication efficiency and the accuracy of semantic inference at the receiver. However, existing solutions typically employ graphs that capture only pairwise relationships, thereby neglecting higher-order implicit correlations commonly observed in real-world scenarios, such as group interactions, multi-entity associations, and complex relational contexts. This limitation reduces semantic expressiveness and makes semantic inference susceptible to ambiguity and performance degradation, particularly under noisy or corrupted channel conditions. To address these issues, this paper proposes a novel hypergraph-based implicit semantic reasoning framework, HISR, which leverages hypergraphs to represent complex multi-entity relationships among semantic knowledge entities. In HISR, entities and their associated higher-order relations are mapped into dedicated semantic subspaces tailored to distinct relational contexts. This design not only disentangles diverse semantic interactions to mitigate the over-smoothing effects commonly found in traditional graph embedding methods but also enables robust semantic inference even when partial information loss occurs during transmission. Numerical results show that the proposed HISR achieves up to a 36.6% improvement in implicit semantic interpretation accuracy over the state-of-the-art benchmarks.

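2606.20022 2026-06-19 stat.ML cs.LG math.OC New submission

Stochastic Linear Contextual Bandits with Bounded Noise: A Set-Membership Approach

Haonan Xu, Yingying Li

AI summary: For stochastic linear contextual bandits with bounded reward noise, proposes SME-OFU, an algorithm based on set-membership estimation and the optimism principle, which attains an $O(\log T)$ regret bound, improving on the optimal $\tilde{O}(\sqrt{T})$ bound under sub-Gaussian noise.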

Comments 23 pages, 1 figure

Details
Abstract

This paper considers stochastic linear contextual bandits (SLCB) with bounded reward noise. Existing works typically assume sub-Gaussian reward noise and bounded expected rewards, under which the optimal regret bound scales as $\tilde{O}(\sqrt{T})$ in terms of horizon $T$. However, in many applications, realized/observed rewards are also naturally bounded, implying bounded reward noise. Bounded noise is more informative than the sub-Gaussian condition but has not been leveraged explicitly in the SLCB literature. In this paper, we propose a novel algorithm SME-OFU by utilizing an uncertainty quantification method called set-membership estimation (SME) and applying the principle of optimism in the face of uncertainty (OFU). Our algorithm enjoys an improved regret bound $O(\log T)$. Notice that this does not contradict the existing optimal bound $\tilde{O}(\sqrt{T})$ for sub-Gaussian noise because bounded noise is a stronger condition. Finally, simulations show empirical improvements of SME-OFU over a benchmark algorithm designed for sub-Gaussian noise when the reward noise is bounded.
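
Why bounded noise is more informative than a sub-Gaussian tail can be seen in one dimension: each observation confines the unknown parameter to a hard interval, and the intervals intersect. The sketch below illustrates set-membership estimation for a scalar parameter only; it is not the SME-OFU algorithm, and the prior range `[-10, 10]`, the uniform noise, and the function names are assumptions.

```python
import random

def sme_interval(observations, noise_bound, lo=-10.0, hi=10.0):
    """Intersect the constraints |r - x * theta| <= noise_bound over all (x, r)."""
    for x, r in observations:
        if x == 0:
            continue  # a zero context carries no information about theta
        a, b = (r - noise_bound) / x, (r + noise_bound) / x
        lo, hi = max(lo, min(a, b)), min(hi, max(a, b))
    return lo, hi

def optimistic_value(x, lo, hi):
    """Largest reward x * theta consistent with theta in [lo, hi] (the OFU step)."""
    return max(x * lo, x * hi)

rng = random.Random(0)
theta, w = 1.5, 0.2  # true parameter and noise bound, chosen for the demo
obs = []
for _ in range(200):
    x = rng.uniform(-1.0, 1.0)
    obs.append((x, x * theta + rng.uniform(-w, w)))

print(sme_interval(obs[:20], w))
print(sme_interval(obs, w))
```

The true parameter can never be excluded, since every constraint it violates would require noise larger than the bound, and the feasible set can only shrink as data arrives.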

2606.19909 2026-06-19 stat.CO math.PR stat.ME New submission

Establishing an $\Omega(\sqrt{d})$ complexity lower bound for PDMP samplers and how to break it: a sub-$\sqrt{d}$ algorithm for Gaussian-tailed targets

Augustin Chevallier

AI summary: Proves an $\Omega(\sqrt{d})$ complexity lower bound for piecewise deterministic Markov process samplers in a standard setup and, by relaxing the assumption that the target density is invariant at all continuous times, proposes a new scheme with empirical complexity $O(d^\alpha)$, $\alpha\in[0.2,0.3]$, for Gaussian-tailed targets.

Details
Abstract

Despite the theoretical appeal of their non-reversibility, to date, no Piecewise Deterministic Markov Process (PDMP) samplers have been developed that scale better than $\mathcal{O}(\sqrt{d})$ in computational complexity with respect to the target dimension $d$. We prove that this is a fundamental limitation by establishing an $\Omega(\sqrt{d})$ lower bound on the algorithmic complexity of PDMP samplers in a standard setup. By relaxing the assumption that the target density must remain invariant at all continuous times, we then demonstrate how to bypass this barrier. Specifically, we introduce a novel PDMP sampling scheme and show that it achieves an empirical complexity of $\mathcal{O}(d^\alpha)$, where $\alpha \in [0.2, 0.3]$, for Gaussian-tailed targets. In addition, this PDMP scheme is locally adaptive in both trajectory length and distance between velocity updates.

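2606.19878 2026-06-19 cs.LG math.OC stat.ML New submission

On the Oracle Complexity of Interpolation-Based Gradient Descent

Dongmin Lee, William Lu, Anuran Makur

Affiliations: Purdue University

AI summary: Proposes piecewise polynomial interpolation-based gradient descent (PPI-GD), which queries the first-order oracle at equidistant points of the data domain and builds polynomial interpolants to approximate the full gradient; analyzes its oracle complexity for strongly convex and non-convex losses and shows that it outperforms several GD variants when the data dimension is bounded and the loss is sufficiently smooth.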

Comments 16 pages, 2 figures

Details
Abstract

Recent work on first-order optimizers for empirical risk minimization (ERM) has suggested that smoothness of ERM loss functions in the training data, rather than in the optimization parameters, can be leveraged to improve the oracle complexity of gradient descent (GD) methods. In this paper, we propose an inexact gradient method, piecewise polynomial interpolation-based gradient descent (PPI-GD), which approximates the full gradient in each iteration by querying the first-order oracle at equidistant points in the data domain to construct polynomial interpolants of the resulting gradient samples over appropriately sized patches of the data domain. We analyze the oracle complexity of PPI-GD for strongly convex and non-convex loss functions when the data space dimension is bounded by a polylogarithmic function of the number of training samples, and find it to outperform several GD variants in key regimes when the loss function is sufficiently smooth. Furthermore, our analysis extends several techniques from the error analysis of bicubic spline interpolants to the setting of $d$-variate tensor product polynomial interpolants which may be of independent interest in interpolation analysis.
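
The mechanism in the second sentence (query the gradient oracle at equidistant data points, interpolate over patches, and read the full gradient off the interpolants) can be sketched in one data dimension. This is a toy illustration, not the paper's PPI-GD: the loss $\tfrac12(\theta - \sin x)^2$, the data domain $[0, 1]$, the patch count, and the polynomial degree are all assumptions chosen so the example stays short.

```python
import math

def grad(theta, x):
    """First-order oracle: gradient in theta of the toy loss 0.5 * (theta - sin(x))**2."""
    return theta - math.sin(x)

def lagrange_eval(nodes, values, x):
    """Evaluate the Lagrange interpolant through (nodes, values) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(nodes, values)):
        term = yi
        for j, xj in enumerate(nodes):
            if j != i:
                term *= (x - xj) / (xi - xj)
        total += term
    return total

def ppi_full_gradient(theta, data, patches=4, degree=2):
    """Approximate the mean gradient over data in [0, 1] from a few oracle queries."""
    width = 1.0 / patches
    interpolants = []
    for p in range(patches):
        # Equidistant nodes inside patch p; these are the only oracle queries made.
        nodes = [p * width + k * width / degree for k in range(degree + 1)]
        interpolants.append((nodes, [grad(theta, z) for z in nodes]))
    total = 0.0
    for x in data:
        p = min(int(x / width), patches - 1)
        nodes, values = interpolants[p]
        total += lagrange_eval(nodes, values, x)
    return total / len(data)

data = [i / 999 for i in range(1000)]
exact = sum(grad(0.7, x) for x in data) / len(data)
print(exact, ppi_full_gradient(0.7, data))
```

Here 12 oracle queries replace 1000 per iteration, and the approximation error is governed by how smooth the gradient is as a function of the data, which is the smoothness the abstract refers to.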

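2606.19876 2026-06-19 cs.LG math.OC New submission

Global Convergence of Gradient Descent for Score Matching in Gaussian Mixtures via Reverse Fisher Divergence

Alexander Tyurin

AI summary: Studies gradient descent for fitting Gaussian mixture models under the reverse Fisher divergence; proves global convergence from arbitrary initialization for a single-Gaussian teacher and, under random initialization, convergence of each student component to near its closest teacher component for a mixture teacher, and gives conditions for convergence in total variation distance.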

Details
Abstract

The score matching problem is a central training objective in modern generative modeling, diffusion models, fitting unnormalized statistical models, and inverse problems. A standard approach is to minimize the forward Fisher divergence, where the expectation is taken with respect to the teacher distribution. However, recent results show that even in simple Gaussian mixture model settings, this objective can lead to undesirable and initialization-dependent convergence behavior. In this paper, we study an alternative objective: the reverse Fisher divergence, where the expectation is taken with respect to the student distribution. We analyze gradient descent (GD) for fitting Gaussian mixture models and show that this change in the objective leads to significantly better optimization properties. First, when the teacher distribution is a single Gaussian and the student is a Gaussian mixture model with fixed weights and identity covariances, we prove the global convergence of GD from arbitrary initializations. Second, we extend the analysis to the case where the teacher is also a Gaussian mixture model and prove global convergence guarantees under a global random initialization scheme and a $\widetilde{\Omega}(1)$-separation assumption on the target means. In particular, with high probability, each student component converges near its closest teacher component, and we provide conditions under which the student distribution converges in total variation distance. Our proofs rely on a new Lyapunov-based analysis of the gradient descent dynamics, showing that the reverse Fisher divergence has a much more favorable optimization landscape than the forward Fisher divergence.
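
For orientation, the two objectives contrasted above can be written side by side. The notation is an assumption made for this note ($p$ for the teacher density, $q_\theta$ for the student density), and normalizing constants such as a factor $\tfrac12$ vary between references:

$$
F_{\mathrm{fwd}}(\theta) = \mathbb{E}_{x \sim p}\,\bigl\|\nabla_x \log p(x) - \nabla_x \log q_\theta(x)\bigr\|^2,
\qquad
F_{\mathrm{rev}}(\theta) = \mathbb{E}_{x \sim q_\theta}\,\bigl\|\nabla_x \log p(x) - \nabla_x \log q_\theta(x)\bigr\|^2 .
$$

The integrand is the same squared score difference in both; only the distribution under the expectation changes. In the reverse form that distribution depends on $\theta$, so the objective weights the score mismatch by where the student currently places its mass.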

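2606.19834 2026-06-19 cs.DC cs.IT cs.NI math.IT New submission

Multi-Orientation Edge-Minimum Repair for Non-Redundant Fault-Tolerant Broadcasting in Dense Eisenstein--Jacobi Networks

Bader Albader

AI summary: For dense Eisenstein-Jacobi networks, proposes EJ-MOEM, a multi-orientation edge-minimum repair method that evaluates hexagonal broadcast-tree orientations, selects a fault-aware candidate, contracts the fault-pruned tree into healthy components, and reconnects them with external component-crossing repair edges to rebuild a spanning tree; proves repair depth at most $t+1$ for one fault and $t+2$ for two faults, with all tests through $t=200$ succeeding.

Comments Preprint also available on Zenodo: https://doi.org/10.5281/zenodo.20691537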

Details
Abstract

Dense Eisenstein--Jacobi (EJ) networks are degree-six algebraic interconnection networks whose finite quotient geometry is naturally represented by a hexagonal axial-coordinate ball. This paper studies non-redundant one-to-all broadcast repair in the dense EJ network generated by $\alpha=(t+1)+t\omega$, where $t$ is the network diameter. We propose EJ-MOEM, a multi-orientation edge-minimum repair method that evaluates a constant-size family of hexagonal broadcast-tree orientations, selects a fault-aware candidate, contracts the fault-pruned tree into healthy components, and reconnects these components using external component-crossing repair edges. The resulting structure is a rooted spanning tree of the healthy subgraph: every healthy node receives the message exactly once, no faulty node is used, and the original healthy tree components are preserved. We prove that, for a chosen orientation whose fault-pruned component graph is connected, exactly $c-1$ external repair edges are necessary and sufficient, where $c$ is the number of healthy components. We also prove a depth-certificate theorem for EJ coordinate-reduction trees: every one-fault placement admits a repair of depth at most $t+1$, and every two-fault placement admits a repair of depth at most $t+2$. The proof uses the three-strip representation of EJ hexagons, a sector-suffix attachment lemma, a non-adjacent-sector separation lemma, and a six-direction shielding classification for paired cuts. Extended validation includes exhaustive one- and two-fault enumeration for $t=2,\ldots,12,14,16,18$ (up to $N=1027$ and 525,825 two-fault placements at $t=18$), structured theorem-critical tests through $t=30$, and large random tests through $t=200$, all with 100% success and no violation of the theorem.
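
The sizes quoted for the exhaustive enumeration can be reproduced with a short calculation. The sketch assumes that the node count is the norm $a^2 + ab + b^2$ of the generator with $a = t+1$, $b = t$, and that fault placements are counted among the non-source nodes; both assumptions are inferred from the reported figures rather than stated in the abstract, and the function names are hypothetical.

```python
from math import comb

def dense_ej_nodes(t):
    """Node count for alpha = (t+1) + t*omega, taken as a^2 + a*b + b^2 (= 3t^2 + 3t + 1)."""
    a, b = t + 1, t
    return a * a + a * b + b * b

def fault_placements(t, faults):
    """Ways to choose `faults` faulty nodes, excluding the broadcast source (assumed)."""
    return comb(dense_ej_nodes(t) - 1, faults)

print(dense_ej_nodes(18), fault_placements(18, 1), fault_placements(18, 2))
```

With these conventions, $t=18$ gives 1027 nodes and 525,825 two-fault placements, matching the numbers in the abstract.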

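2606.19833 2026-06-19 cs.DC cs.IT cs.NI math.IT New submission

Fault-Tolerant Shared-Relay Communication in Circulant Interconnection Networks

Bader Albader, Galal Hassan, Mohamed R. Al-Mulla

AI summary: Studies the fault-tolerant two-hop shared-relay problem in directed circulant graphs, builds a network-design framework around a cyclic difference-multiplicity condition, analyzes the trade-off between relay redundancy and degree budget, and shows that generator choice critically determines relay survivability.

Comments Preprint also available on Zenodo: https://doi.org/10.5281/zenodo.20691084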

Details
Abstract

Circulant interconnection networks provide symmetric addressing, compact generator descriptions, and uniform local connectivity. This paper maps a degree--redundancy landscape for a fault-tolerant two-hop primitive in directed circulants: given $n$ nodes and degree budget $m$, how large can the worst-case shared-relay multiplicity $R(n,m)$ be? A node is a shared relay for an ordered terminal pair if it has outgoing links to both terminals; an $f$-relay-fault-tolerant circulant requires at least $f+1$ such relays for every pair. The underlying feasibility condition is a cyclic difference-multiplicity condition, which we use as a mathematical tool rather than claim as a new object. The contribution is the network-design framework around this tool: the parameters $R(n,m)$ and $D_f(n)$, a negative theorem for interval circulants, relay-table preprocessing and lookup algorithms, adversarial and random failure guarantees, load-balance scope, certified upper-bound interpretation of heuristic designs, exact small-$n$ calibration, a software lookup-versus-search microbenchmark, and a reproducible study of 526,539 generator sets. The results show that generator choice critically determines worst-case relay survivability: optimized threshold designs achieve $f$-relay-fault tolerance within about $1.16$--$1.63$ of the counting lower bound, while standard interval generators can fail structurally even at much larger degrees.
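
The shared-relay definition translates directly into a difference count. In a directed circulant on $n$ nodes with generator set $S$ (node $v$ has links to $v+s$ for $s \in S$), node $v$ relays the pair $(a, b)$ exactly when $a - v$ and $b - v$ both lie in $S$, so the number of relays for a pair depends only on how often $b - a$ occurs as a difference of two generators. The sketch below computes the worst case over all pairs for a given generator set; the function names and the small examples are assumptions for illustration and do not reproduce the paper's algorithms.

```python
def relay_multiplicity(n, generators):
    """Worst-case number of shared relays over all ordered pairs of distinct terminals."""
    gens = {g % n for g in generators}
    counts = [0] * n
    for s1 in gens:
        for s2 in gens:
            counts[(s2 - s1) % n] += 1  # one relay for every pair with b - a = s2 - s1
    return min(counts[1:])

def relays_for(n, generators, a, b):
    """All nodes with outgoing links to both terminals a and b."""
    gens = {g % n for g in generators}
    return sorted(v for v in range(n) if (a - v) % n in gens and (b - v) % n in gens)

def is_relay_fault_tolerant(n, generators, f):
    """f-relay-fault tolerance needs at least f + 1 shared relays for every pair."""
    return relay_multiplicity(n, generators) >= f + 1

print(relay_multiplicity(7, [1, 2, 4]), relays_for(7, [1, 2, 4], 0, 1))
print(relay_multiplicity(7, [1, 2, 3]))
```

With the same degree 3 on 7 nodes, the generator set {1, 2, 4} gives every pair a relay while the interval {1, 2, 3} leaves some pairs with none, a small instance of the generator-choice effect the abstract reports.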

2606.19832 2026-06-19 cs.DC cs.IT cs.NI math.IT new submission

Certified Euclidean-Residue Minimal-Alignment Switch Decompositions for Three Edge-Disjoint Hamiltonian Cycles in Eisenstein--Jacobi Networks

Eisenstein-Jacobi网络中三条边不交哈密顿环的认证欧几里得剩余最小对齐交换分解

Bader Albader

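AI summary For non-coprime Eisenstein-Jacobi networks, the paper proposes a minimal-switch decomposition based on a local switch calculus, constructs three edge-disjoint Hamiltonian cycles, and proves correctness through an algebraic complement-incidence argument.

Comments Preprint also available on Zenodo: https://doi.org/10.5281/zenodo.20693870

Details
AI Chinese abstract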

Eisenstein-Jacobi (EJ) 网络是六度商格互连网络。对于生成元 $\alpha=a+b\rho$,设 $N=a^2+ab+b^2$ 和 $d=\gcd(a,b)$。若 $d=1$,三个自然单位方向已给出三条边不交哈密顿环。若 $d>1$,每个单位方向分裂为 $d$ 个环,边不交哈密顿环问题变为环拼接问题。现有的非互质EJ分解通过矩形表示和交换调度证明存在性。本文在自然Cayley几何中发展了一种不同的局部交换演算。前两个哈密顿环各自使用最少可能的 $d-1$ 个组件间交换构建,第三个因子作为未使用的边补集获得。贡献并非对所有非互质EJ网络的新存在性定理,而是针对欧几里得剩余族的一种紧凑、公式驱动、最小交换分解,其补关联通过符号方式证明。证明分离四个要素:组件标签坍缩、锚点取消、提升交换代表的无碰撞性以及连通补关联。本文中没有无限族定理通过有限证据或计算枚举证明。定理范围限定在代数补关联证书已写明的参数范围内。表格和CSV数据仅用于验证和重现公式,从不作为无限族定理的证明。

English abstract

Eisenstein--Jacobi (EJ) networks are degree-six quotient-lattice interconnection networks. For a generator $\alpha=a+b\rho$, let $N=a^2+ab+b^2$ and $d=\gcd(a,b)$. If $d=1$, the three natural unit directions already give three edge-disjoint Hamiltonian cycles. If $d>1$, each unit direction splits into $d$ cycles and the edge-disjoint Hamiltonian cycle (EDHC) problem becomes a cycle-splicing problem. Existing non-coprime EJ decompositions prove existence by using a rectangular representation and exchange schedules. This paper develops a different, local switch calculus in the natural Cayley geometry. The first two Hamiltonian cycles are built using the minimum possible $d-1$ intercomponent switches each, and the third factor is obtained as the unused edge complement. The contribution is deliberately not a new existence theorem for all non-coprime EJ networks; rather, it is a compact, formula-driven, minimal-switch decomposition for Euclidean-residue families whose complement incidence is proved symbolically. The proof separates four ingredients: component-label collapse, anchor cancellation, noncollision of lifted switch representatives, and connected complement incidence. No infinite-family theorem in this manuscript is proved by finite witnesses or by computational enumeration. The theorem scope is stated for the parameter ranges where an algebraic complement-incidence certificate is written down. Tables and CSV data are used only to verify and reproduce the formulas, never as proof of an infinite-family theorem.
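
The claim that each unit direction splits into $d$ cycles when $d>1$ can be checked for small parameters. The sketch below assumes the usual convention $\rho^2=\rho-1$, which is consistent with the norm $N=a^2+ab+b^2$ stated above, and writes $x+y\rho$ as the integer pair (x, y). The ideal generated by $\alpha$ is then the lattice spanned by (a, b) and (-b, a+b). The function name is hypothetical, and only the unit direction 1 is examined.

```python
from math import gcd


def ej_unit_direction_cycles(a, b):
    """Return (N, d, cycle_length, cycle_count) for the unit direction 1 in Z[rho]/(alpha).

    Assumes a, b are nonnegative integers, not both zero, and rho**2 == rho - 1.
    """
    if a < 0 or b < 0 or (a == 0 and b == 0):
        raise ValueError("a and b must be nonnegative and not both zero")
    N = a * a + a * b + b * b
    d = gcd(a, b)
    # Solving x*(a, b) + y*(-b, a + b) == (k, 0) gives x = (a + b)*k/N and y = -b*k/N,
    # so k*1 is zero in the quotient exactly when N divides both (a + b)*k and b*k.
    k = 1
    while ((a + b) * k) % N or (b * k) % N:
        k += 1
    # Every cycle in this direction has length k, so the N nodes split into N // k cycles.
    return N, d, k, N // k
```

For $\alpha = 2+4\rho$ this gives $N=28$, $d=2$, and two cycles of length 14, matching the count of $d$ cycles; for $\alpha = 3+2\rho$ it gives a single cycle through all 19 nodes.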

2606.19761 2026-06-19 cs.LO math.LO new submission

Finishing Oltean's Completeness Proof in Lean 4 for Hybrid Logic $L(\forall)$

在 Lean 4 中完成 Oltean 关于混合逻辑 $L(\forall)$ 的完备性证明

Lars Warren Ericson

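AI summary This paper completes a machine-checked completeness proof in Lean 4 for the hybrid logic $L(\forall)$, handling the generation of fresh names with two tools: structural freshness and an existence-lemma Henkin construction.

Comments 147 pages, 5 figures

Details
AI Chinese abstract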

我们给出了一个在 Lean 4 中机器检查的完备性定理,针对混合逻辑 $L(\forall)$:带有名义词、满足风格绑定器 $\forall$ 和盒子模态的命题模态逻辑。(基本混合逻辑(无绑定器)的机器检查完备性由 Asta Halkjær From 在 Isabelle/HOL 中开创。)我们基于 Alex Oltean 2023 年的 Lean 4 形式化工作,该工作机械化了语法、语义、希尔伯特风格证明系统和可靠性(遵循 Blackburn 的混合完备性(1998)),但留下了不完备的部分。完成它需要在两个结构不同的点上制造新鲜名称,我们的核心发现是它们需要两种不同的工具。(1)通过扩展的 Lindenbaum 构造构建的根可证最大一致集,每一步都需要一个对整个集合新鲜的名义词;正确的工具是结构新鲜性:扩展语言,使得通过构造保留无限的名义词供应。我们调查了设计空间(Oltean 在 $\mathbb{N}$ 内的奇偶编码、Bud Mishra 建议的不交和 $N \oplus \mathbb{N}$ 参数化,以及 From 的合成完备性框架)并解释了我们采用的编码。(2)一个最大一致集的可证 $\Diamond$-后继不能通过这种方式获得:其典范盒子归约可证地提及每个名义词,因此没有保留的名称是新鲜的。这里正确的工具是 Oltean 选择但未完成的:一个存在引理 Henkin 构造,通过一个新鲜状态变量从前驱的可证性中抽取每个见证;我们通过一个携带数据的见证累加器和一个紧致性论证完成了它。定理 $\Gamma \models \varphi \to \Gamma \vdash \varphi$ 被完全形式化:该开发是无 sorry 的,且 #print axioms 仅报告 propext、Classical.choice 和 Quot.sound。我们将开发移植到 Lean v4.30.0 / mathlib v4.30.0。

English abstract

We present a machine-checked completeness theorem, in Lean 4, for the hybrid logic $L(\forall)$: propositional modal logic with nominals, the satisfaction-style binder $\forall$, and the box modality. (Machine-checked completeness for basic hybrid logic, without binders, was pioneered by Asta Halkjær From in Isabelle/HOL.) We build on Alex Oltean's 2023 Lean 4 formalization, which mechanized the syntax, semantics, Hilbert-style proof system, and soundness following Blackburn's Hybrid Completeness (1998), but left completeness unfinished. Finishing it requires manufacturing fresh names at two structurally different points, and our central finding is that they call for two different tools. (1) The root witnessed maximal consistent set, built by an extended Lindenbaum construction, needs at each step a nominal fresh for the whole set; the right tool is structural freshness: extend the language so an infinite supply of nominals is reserved by construction. We survey the design space (Oltean's odd/even encoding inside $\mathbb{N}$, the disjoint-sum $N \oplus \mathbb{N}$ parameterization suggested by Bud Mishra, and From's synthetic-completeness frameworks) and explain the encoding we adopt. (2) The witnessed $\Diamond$-successor of a maximal consistent set cannot be obtained this way: its canonical box-reduct provably mentions every nominal, so no reserved name is fresh. Here the right tool is one Oltean chose but left incomplete: an existence-lemma Henkin construction drawing each witness from the predecessor's witnessedness through a fresh state variable; we complete it with a data-carrying witness accumulator and a compactness argument. The theorem $\Gamma \models \varphi \to \Gamma \vdash \varphi$ is fully formalized: the development is sorry-free, and #print axioms reports only propext, Classical.choice, and Quot.sound. We port the development to Lean v4.30.0 / mathlib v4.30.0.
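
For readers unfamiliar with the axiom audit mentioned above, the following Lean 4 snippet shows how `#print axioms` is used. The theorem `toy_em` is a hypothetical toy statement and is not part of the paper's development; it only illustrates the command on a proof that relies on classical reasoning.

```lean
-- A toy theorem, unrelated to the paper's development, proved by excluded middle.
theorem toy_em (p : Prop) : p ∨ ¬p := Classical.em p

-- Lists every axiom that the proof term of `toy_em` depends on.
#print axioms toy_em
```

A development is usually called sorry-free when this audit, run on the final theorem, does not list `sorryAx`.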

2606.19754 2026-06-19 cs.LG cs.NA math.NA new submission

Learning universal approximations for partial differential equations with Physics-Informed Broad Learning System

基于物理信息广度学习系统的偏微分方程通用逼近学习

Zhiwen Yu, Derong Yang, Liujian Zhang, Kaixiang Yang, Peilin Zhan, Jianmin Lv, Jane You, C. L. Philip Chen

Affiliations * School of Computer Science and Engineering, South China University of Technology; Peng Cheng Laboratory; School of Future Technology, South China University of Technology; School of Computer Science and Technology, Guangdong University of Technology; Department of Industrial and Systems Engineering, The Hong Kong Polytechnic University

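AI summary Proposes the Physics-Informed Broad Learning System (PIBLS), which solves linear and nonlinear PDEs efficiently through backpropagation-free least-squares optimization and is one to three orders of magnitude faster than conventional PINNs with higher accuracy.

Details
AI Chinese abstract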

偏微分方程(PDE)在建模复杂的物理、生物和工程系统中起着核心作用。虽然传统的数值求解器很稳健,但由于网格依赖性,它们常常带来高昂的计算成本,而最近的物理信息神经网络(PINN)提供了一种无网格替代方案,但经常遭受收敛缓慢和优化不稳定的问题。为了弥合这一差距,本文提出了物理信息广度学习系统(PIBLS),一种新颖的无反向传播框架,将PDE求解重新表述为直接的最小二乘优化。我们改进了该框架内的一个算法以高效处理非线性PDE,并提供了严格的数学证明,确立了PIBLS对这些方程的通用逼近性质。在线性和非线性PDE上的实验表明,PIBLS比传统PINN快1到3个数量级,同时实现了显著更高的求解精度。该框架为科学机器学习提供了一种计算高效的范式,为实时仿真和设计优化任务提供了一种实用、高速的替代方案。

English abstract

Partial differential equations (PDEs) play a central role in modeling complex physical, biological, and engineering systems. While traditional numerical solvers are robust, they often incur prohibitive computational costs due to mesh dependencies, whereas recent Physics-Informed Neural Networks (PINNs) offer a mesh-free alternative but frequently suffer from slow convergence and optimization instability. To bridge this gap, this article proposes the Physics-Informed Broad Learning System (PIBLS), a novel backpropagation-free framework that reformulates PDE solving as a direct least-squares optimization. We improve an algorithm within this framework to handle nonlinear PDEs efficiently and provide a rigorous mathematical proof establishing the universal approximation property of PIBLS for these equations. Experiments on linear and nonlinear PDEs demonstrate that PIBLS is one to three orders of magnitude faster than conventional PINNs while achieving significantly higher solution accuracy. This framework provides a computationally efficient paradigm for scientific machine learning, offering a practical, high-speed alternative for real-time simulation and design optimization tasks.

2606.19751 2026-06-19 cs.DB math.OC new submission

DeQL: A Decision Query Language for Prescriptive Analytics over Relational Data

DeQL:一种用于关系数据规范性分析的决策查询语言

Matteo Brucato, Fjodor Kholodkov, Soren Little, Jakob Mayer, Duc Nguyen

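AI summary DeQL extends SQL to support decision queries. Two constructs, CREATE CANDIDATES and DECIDE, define the option space, constraints, and objective, covering decisions such as subset selection, allocation, and scheduling, with extensions for optimization under uncertainty and model scoring.

Details
AI Chinese abstract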

DeQL(决策查询语言)扩展了SQL以表达决策查询:给定从关系数据中提取的选项、策略约束和可测量的目标,DeQL查询计算出最佳行动方案。两个构造实现了这一扩展:CREATE CANDIDATES,定义来自关系源的选项空间;DECIDE,声明决策变量、命名约束以及针对这些变量的目标。该设计遵循SQL的原则:用户说明要优化的内容,而引擎选择如何求解;每个查询消费并产生关系;问题的结构对引擎保持可见。本文档规范了该语言(其设计原则、语法、形式文法及执行模型),并附有涵盖子集选择、分配、指派、调度以及多级聚合决策的示例,以及针对不确定性优化、内联模型评分和时间与质量受限求解的扩展。这是该规范的第一版;该语言正在积极开发中,本版本固定了后续修订将基于的核心构造。

English abstract

DeQL (Decision Query Language) extends SQL to express decision queries: given options drawn from relational data, constraints from policy, and a measurable objective, a DeQL query computes the best course of action. Two constructs carry the extension: CREATE CANDIDATES, which defines the space of options from relational sources, and DECIDE, which declares decision variables, named constraints, and an objective over them. The design follows SQL's principles: the user states what to optimize while the engine chooses how to solve it, every query consumes and produces relations, and the structure of a problem stays visible to the engine. This document specifies the language (its design principles, syntax, formal grammar, and execution model) with examples spanning subset selection, allocation, assignment, scheduling, and decisions at multiple levels of aggregation, and extensions for optimization under uncertainty, inline model scoring, and time- and quality-bounded solving. It is the first version of the specification; the language is under active development, and this version fixes the core constructs on which later revisions will build.

2606.19715 2026-06-19 eess.SP cs.IT math.IT new submission

Generalized Pinching-Antenna Systems: A Radio-Stripe-Based Realization

广义夹捏天线系统:基于无线电条带的实现

Yanqing Xu, Zhiguo Ding, Tsung-Hui Chang

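AI summary This paper proposes a radio-stripe (RS) based generalized pinching-antenna (RS-GPA) framework that uses active antenna processing units for location-flexible wireless access, and develops sparse activation and beamforming algorithms to reduce total power consumption.

Comments 13 pages, 7 figures

Details
AI Chinese abstract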

本文研究无线电条带(RS)作为广义夹捏天线的实际实现,并提出基于RS的广义夹捏天线(RS-GPA)框架。与依赖导波到自由空间被动耦合的介质波导基被动夹捏天线不同,RS采用沿共享电缆部署的主动天线处理单元(APU)进行本地传输、接收和信号处理。这种类似电缆的主动架构提供了灵活的安装和广泛的频率适用性,同时允许选定的APU作为离散且可控的辐射或接收点,实现位置灵活的无线接入。基于所提出的RS-GPA框架,我们通过考虑距离相关的APU-用户信道建立了系统和信道模型。对于下行传输,我们提出了一个电路功率感知的稀疏APU激活和波束成形问题,并开发了一种重加权群稀疏波束成形算法。为了揭示激活原理,我们分析了单用户下行情况,并通过平衡发射功率节省和电路功率成本来刻画何时应激活额外的APU。受此启发,提出了一种几何引导的低复杂度多用户算法。对于上行传输,我们提出了一个联合APU激活和用户功率控制问题,并开发了一种几何引导的稀疏激活设计。数值结果表明,与基准方案相比,所提出的RS-GPA框架显著降低了总功耗,而几何引导算法在运行时间显著降低的情况下实现了与群稀疏设计几乎相同的功耗性能。

English abstract

This paper investigates radio stripes (RSs) as a practical realization of generalized pinching antennas and proposes an RS-based generalized pinching-antenna (RS-GPA) framework. Unlike dielectric-waveguide-based passive pinching antennas that rely on passive coupling from a guided wave into free space, RSs employ active antenna processing units (APUs) deployed along a shared cable for local transmission, reception, and signal processing. This cable-like active architecture offers flexible installation and broad frequency applicability, while allowing selected APUs to act as discrete and controllable radiation or reception points for location-flexible wireless access. Based on the proposed RS-GPA framework, we establish the system and channel models by accounting for the distance-dependent APU-user channels. For downlink transmission, we formulate a circuit-power-aware sparse APU activation and beamforming problem and develop a reweighted group-sparse beamforming algorithm. To reveal the activation principle, we analyze the single-user downlink case and characterize when an additional APU should be activated by balancing transmit-power saving and circuit-power cost. Inspired by this insight, a geometry-guided low-complexity multiuser algorithm is proposed. For uplink transmission, we formulate a joint APU activation and user power control problem and develop a geometry-guided sparse activation design. Numerical results show that the proposed RS-GPA framework substantially reduces the total consumed power compared with benchmark schemes, while the geometry-guided algorithm achieves near-identical consumed-power performance to the group-sparse design with significantly lower runtime.

2606.19695 2026-06-19 eess.SY cs.GT cs.SY math.OC new submission

A Unified Framework for Joint Sensor Placement and Scheduling for Intrusion Detection

入侵检测中联合传感器放置与调度的统一框架

Jayanth Bhargav, Mahsa Ghasemi, Shreyas Sundaram

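AI summary Proposes a unified framework that jointly optimizes sensor placement and orientation scheduling, designing the utility function game-theoretically and exploiting weak submodularity to achieve near-optimal detection performance.

Comments 27 pages, 4 figures

Details
AI Chinese abstract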

我们考虑一个入侵检测任务,其中防御者必须联合优化传感器放置位置和方向,以最小化入侵者穿越受保护环境时被漏检的概率。我们将此问题分解为一个元问题(称为SensorPlacement)和一个嵌入的子问题(称为OrientationScheduling)。对于固定的传感器放置,OrientationScheduling子问题被建模为防御者和入侵者之间的两人零和博弈,其中防御者寻求已部署传感器的方向策略以最小化漏检概率,而入侵者则寻求路径选择策略以最大化该概率。由于防御者的策略空间随传感器数量和方向组合增长,通过标准线性规划求解博弈变得不可行。为此,我们开发了一种迭代且高效的均衡求解算法,该算法利用博弈收益函数的结构,并建立了收敛到博弈纳什均衡(NE)的理论保证。该NE值随后被用作SensorPlacement元问题中的效用度量。我们证明了这个基于博弈值的效用函数在传感器放置集合上是弱子模的,并提出了一个具有近最优性保证的贪婪放置算法。据我们所知,这是第一个将博弈论效用设计与(弱)子模优化相结合的统一框架,实现了传感器放置和方向调度的原则性联合优化。通过大量仿真,我们证明所提出的方法实现了近最优的检测性能,同时与基线相比显著减少了计算时间。

English abstract

We consider an intrusion detection task in which a defender must jointly optimize sensor placement locations and orientations to minimize the probability of missed detection of an intruder traversing a protected environment. We decompose this problem into a meta problem, termed SensorPlacement, and an embedded subproblem, termed OrientationScheduling. The OrientationScheduling subproblem, for a fixed sensor placement, is modeled as a 2-player zero-sum game between the defender and the intruder, where the defender seeks an orientation strategy for the deployed sensors to minimize the probability of missed detection, while the intruder seeks a path selection strategy to maximize it. Since the defender's strategy space grows combinatorially with the number of sensors and orientations, solving the game via standard linear programming becomes prohibitive. To this end, we develop an iterative and efficient equilibrium-seeking algorithm that exploits the structure of the game's payoff function, and we establish theoretical guarantees for convergence to the Nash equilibrium (NE) of the game. This NE value is then used as a utility measure in the SensorPlacement meta problem. We show that this game-value-based utility function is weakly submodular over the set of sensor placements and propose a greedy placement algorithm with near-optimality guarantees. To our knowledge, this is the first unified framework to integrate game-theoretic utility design with (weak) submodular optimization, enabling principled joint optimization of sensor placement and orientation scheduling. Through extensive simulations, we demonstrate that the proposed approach achieves near-optimal detection performance while significantly reducing computation time compared to baselines.

2606.19655 2026-06-19 stat.CO math.ST stat.TH new submission

A Flat Connection: The Pooling Factor and the Geometry of Centring in Hierarchical MCMC

平坦联络:分层MCMC中的汇集因子与中心化几何

Aidan D. Bindoff

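AI summary Investigates whether the centring/non-centring obstruction in hierarchical MCMC has a geometric cause, proves that the connection induced by the Fisher information is flat, traces the obstruction to a statistical quantity, the pooling factor $\pi_j$, and proposes a diagnostic based on it.

Comments 39 pages, 9 figures, accompanying R package

Details
AI Chinese abstract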

标准MCMC诊断($\hat{R}$、有效样本量、发散计数)检测链是否混合,但不检测为何未混合。我们询问分层模型中的中心化/非中心化障碍是否具有度量之外的几何原因。联合参数空间是一个纤维丛(超参数为底,组级参数为纤维),Fisher信息度量诱导一个Ehresmann联络$A = -G_{FF}^{-1}G_{BF}$;自然假设是障碍是其曲率,采样器将其感受为和乐。我们证明这是错误的。对于任何光滑的分层后验,不仅是高斯情况,联络是平坦的,因为其水平叶是纤维得分$\partial_\alpha \log p$的水平集:度量之上没有几何障碍。剩下的障碍是统计的,而非几何的,平坦联络将其识别为一个单一量:纤维对底的条件依赖性,由每组的先验比例$\pi_j$(经典汇集因子)控制。该框架由此恢复了已有图景:先验主导的组混合缓慢,每组的非中心化最优权重有闭式解,并且一项模拟研究通过它们对分层方差的相反依赖性,将这种底-纤维耦合与漏斗(一种不同的底空间病态)区分开来。一项直接归因测试确认NUTS不运输纤维:链级足迹是先验主导组中多余的条件自相关,正如$\pi_j$所预测。真正的、甚至旋转的曲率确实出现,但仅针对由采样器工作度量(固定质量矩阵)构建的联络,此时和乐作为算法现象而非几何现象重新出现。先验比例诊断作为R包fibr分发,几何方法作为附带的复现代码。

English abstract

Standard MCMC diagnostics ($\hat{R}$, effective sample size, divergence counts) detect whether a chain has mixed, but not why it has not. We ask whether the centring/non-centring obstruction in hierarchical models has a geometric cause beyond the metric. The joint parameter space is a fiber bundle (hyperparameters the base, group-level parameters the fibers), and the Fisher information metric induces an Ehresmann connection $A = -G_{FF}^{-1}G_{BF}$; the natural hypothesis is that the obstruction is its curvature, felt by the sampler as holonomy. We prove this false. The connection is flat for any smooth hierarchical posterior, not only the Gaussian case, because its horizontal leaves are the level sets of the fiber score $\partial_\alpha \log p$: there is no geometric obstruction above the metric. What remains is statistical, not geometric, and the flat connection identifies it as a single quantity: the conditional dependence of fiber on base, governed per group by the prior fraction $\pi_j$, the classical pooling factor. From it the framework recovers the established picture, that prior-dominated groups mix slowly and that the optimal per-group non-centring weight follows in closed form, and a simulation study separates this base-fiber coupling from the funnel, a distinct base-space pathology, by their opposite dependence on the hierarchical variance. A direct attribution test confirms that NUTS does not transport the fiber: the chain-level footprint is excess conditional autocorrelation in prior-dominated groups, exactly as $\pi_j$ predicts. Genuine, even rotational, curvature does appear, but only for connections built from a sampler's working metric (a fixed mass matrix), where holonomy re-enters as an algorithmic rather than geometric phenomenon. The prior-fraction diagnostic is distributed as the R package fibr, with the geometric methods as accompanying reproduction code.
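
To make the pooling factor concrete, here is a minimal sketch under an assumed normal-normal model: $y_j \mid \theta_j \sim N(\theta_j, \sigma_j^2)$ and $\theta_j \sim N(\mu, \tau^2)$. In that textbook setting the pooling factor is $\sigma_j^2/(\sigma_j^2+\tau^2)$, the weight the conditional posterior mean places on the prior mean. The paper's $\pi_j$ may be defined more generally, and the function names and the 0.5 cut-off below are illustrative assumptions, not the API of the fibr package.

```python
def pooling_factor(sigma_j, tau):
    """Weight on the prior mean in the assumed normal-normal model."""
    return sigma_j**2 / (sigma_j**2 + tau**2)


def conditional_posterior_mean(y_j, mu, sigma_j, tau):
    """E[theta_j | y_j, mu, tau] = pi_j * mu + (1 - pi_j) * y_j."""
    pi_j = pooling_factor(sigma_j, tau)
    return pi_j * mu + (1.0 - pi_j) * y_j


def prior_dominated(sigmas, tau, threshold=0.5):
    """Indices of groups whose pooling factor exceeds an illustrative threshold."""
    return [j for j, s in enumerate(sigmas) if pooling_factor(s, tau) > threshold]
```

With hypothetical standard errors `[0.5, 1.0, 3.0]` and `tau = 1.0`, only the last group is flagged: its data are noisy relative to the hierarchical spread, so the prior dominates its conditional posterior.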

2606.19521 2026-06-19 cs.LG math.OC new submission

Interactive Pareto navigation for deep multi-task learning

深度多任务学习的交互式帕累托导航

Augustina C. Amakor, Konstantin Sonntag, Sebastian Peitz

Affiliations * Department of Computer Science, TU Dortmund, Dortmund, Germany; Lamarr Institute for Machine Learning and Artificial Intelligence

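AI summary Proposes the Preference Pareto Exploration (PPE) framework, a predictor-corrector method that follows the decision maker's preference along tangent directions of the Pareto manifold and uses a Krylov subspace method to avoid explicit Hessian computations, enabling efficient interactive multi-objective optimization.

Details
AI Chinese abstract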

在多任务学习中,处理越来越多的目标在计算资源和决策者选择适当权衡的能力方面都很快变得具有挑战性。因此,一种广泛使用的方法是通过加权和将各个损失聚合到单个损失函数中。这通常由于帕累托前沿的形状而无法捕捉决策者的偏好,或者需要多次调整和计算,这在深度学习应用中变得过于昂贵。为了解决这些问题,我们引入了一个新颖的框架,偏好帕累托探索(PPE),它在交互式探索过程中强制执行决策者的偏好,同时考虑帕累托集的几何形状。PPE基于预测-校正方法,该方法沿着帕累托最优解流形的切线方向执行预测步骤,遵循决策者的偏好。随后的校正步骤产生反映该偏好的新权衡。为了在表征流形切空间时避免显式的Hessian计算,我们采用了一种仅依赖于矩阵-向量乘积的Krylov子空间方法。这些乘积可以通过自动微分高效获得,确保了整个优化过程的效率和鲁棒性。该方法的有效性和性能通过玩具问题和深度学习示例进行了展示。

English abstract

In multi-task learning, handling an increasing number of objectives can quickly become challenging, both in terms of the computational resources and the decision maker's capacity to choose appropriate trade-offs. A widely used approach is thus to aggregate the individual losses in a single loss function by a weighted sum. Because of the shape of the Pareto front, this often either fails to capture the decision maker's preferences or requires repeated adjustments and computations, which become prohibitively expensive in deep learning applications. To address these issues, we introduce a novel framework, Preference Pareto Exploration (PPE), which enforces the decision maker's preferences while accounting for the geometry of the Pareto set in an interactive exploration process. PPE is based on a predictor-corrector method that performs predictor steps tangential to the manifold of Pareto-optimal solutions, following the decision maker's preference. The subsequent corrector step results in a new trade-off reflecting this preference. To avoid explicit Hessian computations when characterizing the tangent space of the manifold, we employ a Krylov subspace method that relies solely on matrix-vector products. These products can be efficiently obtained via automatic differentiation, ensuring both efficiency and robustness throughout the optimization process. The method's functionality and performance are demonstrated using both toy problems and examples from deep learning.
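
The remark about the shape of the Pareto front can be illustrated with a toy problem that is not taken from the paper. The sketch below assumes two objectives to be minimized whose Pareto front is the quarter circle $f_2=\sqrt{1-f_1^2}$, a non-convex front. Minimizing a weighted sum over this front only ever returns one of the two end points, whatever weight is chosen, so no weight expresses a preference for a balanced trade-off.

```python
import math


def weighted_sum_choices(num_weights=11, num_points=101):
    """Front points selected by weighted-sum minimization over a grid of weights."""
    ts = [i / (num_points - 1) for i in range(num_points)]
    # Every point (t, sqrt(1 - t^2)) is Pareto-optimal: lowering f1 raises f2.
    front = [(t, math.sqrt(1.0 - t * t)) for t in ts]
    chosen = set()
    for k in range(num_weights):
        w = k / (num_weights - 1)
        best = min(front, key=lambda f: w * f[0] + (1.0 - w) * f[1])
        chosen.add(best)
    return chosen
```

The weighted objective is concave along this front, so its minimum always sits at an end point; methods that move along the front itself, as the abstract describes for PPE, do not have this limitation.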

2606.19405 2026-06-19 q-bio.QM math.DS q-bio.PE new submission

Multi-type branching inference on contact trees with application to COVID-19

接触树上的多类型分支推断及其在COVID-19中的应用

Augustine Okolie, Johannes Müller, Eno Akarawakc, Isaac Ajiboye

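AI summary Proposes a likelihood framework that operates directly on transmission trees over contact trees, accounts for contact-degree heterogeneity through a multi-type branching process, infers epidemiological parameters from partially resolved transmission trees, and is applied to COVID-19 contact-tracing data.

Comments 26 pages, 8 figures

Details
AI Chinese abstract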

从传播树推断流行病学参数对于理解传染病动态至关重要。现有的基于树的似然方法,包括最初应用于系统动力学环境中的多类型出生-死亡模型,提供了强大的工具,但大多数假设均匀混合,很少捕捉当个体感染更多接触者时传播潜力的变化。在这项工作中,我们开发了一个直接作用于传播树的似然框架,其中节点是个体,边是报告的传播事件,不涉及序列数据。我们推导了一个在有根接触树上的随机SIR过程的似然,其中每个感染个体由有效接触总数和已感染的下游接触数来刻画。我们得到了一个分支完全未被观察到的概率以及它产生一个处于给定状态的观察(采样)末端的概率密度的闭式常微分方程。对于已知末端状态的有根接触树,可以评估得到的似然,并且我们通过将内部分支时间视为潜在变量,将其扩展到部分解析的树。在模拟爆发上的验证确认了准确的参数恢复和良好校准的不确定性。应用于印度卡纳塔克邦的经验COVID-19接触追踪数据,展示了该框架在实际流行病学环境中的实用性。通过在多类型分支似然中纳入接触度异质性,我们的工作为从完全或部分解析的传播树推断传播动态和接触结构提供了一个原则性的基线,补充而非依赖于基于序列的系统动力学推断。

English abstract

Inferring epidemiological parameters from transmission trees is essential for understanding infectious disease dynamics. Existing tree-based likelihood methods, including the multi-type birth-death models originally applied in phylodynamic settings, provide powerful tools, but most assume homogeneous mixing and rarely capture how transmission potential changes as an individual infects more of their contacts. In this work, we develop a likelihood framework that operates directly on transmission trees, in which nodes are individuals and edges are reported transmission events, with no sequence data involved. We derive a likelihood for a stochastic SIR process on a rooted contact tree in which each infected individual is characterised by the total number of effective contacts and the number of already infected downstream contacts. We obtain closed-form ordinary differential equations for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed (sampled) tip in a given state. The resulting likelihood can be evaluated for a rooted contact tree with known tip states, and we extend it to partially resolved trees by treating internal branching times as latent variables. Validation on simulated outbreaks confirms accurate parameter recovery and well-calibrated uncertainty. Application to empirical COVID-19 contact-tracing data from Karnataka, India, demonstrates the framework's utility for real epidemiological settings. By incorporating contact-degree heterogeneity in a multi-type branching likelihood, our work provides a principled baseline for inferring both transmission dynamics and contact structure from fully or partially resolved transmission trees, complementing rather than relying on sequence-based phylodynamic inference.

2606.19393 2026-06-19 cs.DM cs.DS math.CO new submission

An alternative way of defining finite graphs

定义有限图的另一种方式

Maxim Nazarov

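AI summary Introduces "graph linear notation", a complete graph invariant, as an alternative definition of finite graphs that simplifies drawing graphs with their symmetries and comparing graphs for isomorphism.

Journal ref Prikl. Diskr. Mat., 2015, no. 3(29), 83-94

Details
AI Chinese abstract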

在本文中,我们引入了“图线性符号”——一种完全图不变量——它被定位为有限图的替代定义。该不变量使用类似于寻找图规范形式的算法构建。存储图线性符号而不是常规图,使我们能够极大地简化两个主要问题:考虑可能图对称性的图插图构建,以及两个图的同构比较。我们还展示了诸如着色和图路径等经典图论概念向图线性符号的可转移性。

English abstract

In this paper we introduce "graph linear notation" -- a complete graph invariant -- which is positioned as an alternative definition of finite graphs. This invariant is constructed using an algorithm similar to those for finding canonical forms of graphs. Storing graph linear notation instead of a regular graph greatly simplifies two major problems: constructing illustrations of graphs that respect possible graph symmetries, and comparing two graphs for isomorphism. We also demonstrate that classical graph-theoretic concepts such as colourings and graph paths transfer to graph linear notations.
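
To illustrate what a complete graph invariant provides, the sketch below computes a brute-force canonical form for small undirected graphs: the lexicographically smallest sorted edge list over all vertex relabelings. This is not the paper's graph linear notation, whose construction is not given in the abstract; it is a generic stand-in, it takes factorial time, and it is only practical for a handful of vertices.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Return (n, smallest relabeled edge list) for an undirected graph on vertices 0..n-1.

    Two graphs on n vertices are isomorphic exactly when their canonical forms are equal.
    """
    best = None
    for perm in permutations(range(n)):
        relabeled = tuple(sorted(tuple(sorted((perm[u], perm[v]))) for u, v in edges))
        if best is None or relabeled < best:
            best = relabeled
    return n, best


def isomorphic(n1, edges1, n2, edges2):
    """Isomorphism test by comparing stored canonical forms."""
    return canonical_form(n1, edges1) == canonical_form(n2, edges2)
```

Once canonical forms are stored, an isomorphism check reduces to an equality test, which is the kind of simplification the abstract attributes to storing the notation instead of the graph.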

2606.19361 2026-06-19 cs.LG cs.AI cs.NA math.NA stat.CO stat.ME stat.ML new submission

Computational Identifiability

计算可识别性

Lucius E. J. Bynum, Rajesh Ranganath, Kyunghyun Cho

Affiliations * New York University

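AI summary Proposes a 'computational identifiability' framework in which a finite computational search procedure looks for an empirical estimator within a specified error tolerance, addressing the limits of theoretical identifiability in practical settings such as small finite samples and ambiguous graphical criteria.

Details
AI Chinese abstract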

识别条件描述了目标查询或感兴趣参数作为可用信息类型和数量的函数的可计算性。在因果识别中,这些信息通常以因果图的形式表达,数据是针对图中某些变量子集观测或收集的。目标查询可以是单个效应,也可以是给定模型中的一类效应。识别算法的推导在数学上定义了期望中理论上唯一确定所需因果效应的过程。期望中的可识别性,即“理论可识别性”,通常假设渐近性质、无限数据或其他数学理想化条件。在本文中,我们探讨了这种理论理想化的可识别性与一种受计算限制的替代方案之间的根本区别。我们提出的框架——“计算可识别性”——而是为经验估计量定义一个有限的计算搜索过程。如果该过程在期望的误差容限内经验性地找到了估计量,则满足可识别性,条件取决于搜索的指定假设(即参数上的先验分布)以及搜索过程本身。通过多个实验,我们展示了该框架如何回答细粒度的实际识别问题,例如小有限样本下的识别、模糊图标准下的识别、混合观测-干预数据下的识别,以及跨反事实数据和估计量的识别。代码见 https://github.com/lbynum/metadentify。

English abstract

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of information available. In causal identification, this information is often expressed in the form of a causal graph, and data are observed or collected for some subset of variables in the graph. Target queries may be for a single effect alone or for a class of effects in a given model. The derivation of an identification algorithm then defines mathematically the process by which the desired causal effect(s) can be uniquely determined, theoretically, in expectation. Identifiability in expectation, or 'theoretical identifiability,' generally assumes asymptotic properties, infinite data, or other mathematically idealized conditions. In this paper, we explore a fundamental distinction between this theoretical, idealized notion of identifiability and a proposed alternative that is computation-bound. The framework we propose, 'computational identifiability,' instead defines a finite computational search procedure for an empirical estimator. If this process finds an estimator empirically, within a desired error tolerance, then identifiability is satisfied, conditional on the specified assumptions of the search (i.e., a prior distribution over the parameters) and conditional on the search procedure itself. Through several experiments, we demonstrate how this framework allows us to answer fine-grained, practical identification questions, such as identification with small finite samples, with ambiguous graphical criteria, with mixed observational-interventional data, and across counterfactual data and estimands. Code is available at https://github.com/lbynum/metadentify.

2606.20534 2026-06-19 math.OC new submission

On Second-Order Methods for Bilevel Optimization

关于双层优化的二阶方法

Jiawen Bi, Jiaxiang Li, Mingyi Hong, Shuzhong Zhang

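AI summary For bilevel optimization, this paper proposes a single-loop cubic regularized Newton algorithm that, with a nonconvex upper level and a strongly convex lower level, attains the optimal $\mathcal{O}(\varepsilon^{-1.5})$ total oracle complexity for finding second-order stationary points, the first deterministic single-loop method with this rate.

Details
AI Chinese abstract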

双层优化是现代机器学习和工程设计不可或缺的建模工具。然而,在双层优化中寻找二阶驻点的理论和实践仍然很大程度上未解决。即使对于具有强凸下层问题的双层优化,其诱导的超函数通常是非凸的。尽管三次正则牛顿方法(CRN)在单层优化中实现了最优的$\mathcal{O}(\varepsilon^{-1.5})$ SOSP(二阶驻点)率,但如何控制将二阶方法应用于双层问题时超梯度和超Hessian计算的精度,以使整个过程高效,仍不清楚。在本文中,我们着手回答这个问题。特别地,我们首先制定了一个双循环CRN基线,该基线实现了最优的外层率,但需要重复的下层求解。接下来,我们提出了一种单循环三次正则牛顿算法,该算法将一个下层梯度步与一个用于超梯度的牛顿步相结合,并证明了总体确定性的$\mathcal{O}(\varepsilon^{-1.5})$总预言复杂度,这是最优的。此外,我们说明了一些直观简单的修改可能无法维持收敛结果。据我们所知,这是第一个用于无约束NCSC(非凸上层和强凸下层)双层优化设置的确定性单循环方法,该方法实现了寻找超函数$\varepsilon$-SOSP的$\mathcal{O}(\varepsilon^{-1.5})$最优收敛率。

English abstract

Bilevel optimization is an indispensable modeling tool for modern machine learning and engineering design. However, the theory and practice for finding second-order stationary points in the context of bilevel optimization still remain largely unsettled. Even for bilevel optimization with a strongly convex lower-level problem, the hyperfunction it induces is in general nonconvex. Although Cubic Regularized Newton (CRN) methods famously achieve the optimal $\mathcal{O}(\varepsilon^{-1.5})$ SOSP (second-order stationary point) rate in single-level optimization, it is unclear how to control the accuracy of the hypergradient and hyper-Hessian computations when applying second-order methods to bilevel problems so that the overall process is efficient. In this paper, we set out to answer this question. In particular, we first formulate a double-loop CRN baseline that achieves the optimal outer rate but requires repeated lower-level solves. Next, we propose a single-loop cubic regularized Newton algorithm that combines one lower-level gradient step with one Newton step for the hypergradient, and prove an overall deterministic $\mathcal{O}(\varepsilon^{-1.5})$ total oracle complexity, which is optimal. In addition, we illustrate that some intuitively simple modifications of our method may fail to preserve the convergence result. To the best of our knowledge, this is the first deterministic single-loop method for the unconstrained NCSC (non-convex upper-level and strongly convex lower-level) bilevel optimization setting that achieves the $\mathcal{O}(\varepsilon^{-1.5})$ optimal convergence rate for finding an $\varepsilon$-SOSP of the hyperfunction.
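
As background for the single-level CRN method mentioned above, the following is a minimal one-dimensional sketch; it is not the paper's bilevel algorithm. Each step minimizes the cubic model $g s + \tfrac{1}{2} h s^2 + \tfrac{M}{6}|s|^3$, which in one dimension has a closed-form minimizer. The test function, the constant `M`, and the function names are illustrative assumptions.

```python
import math


def cubic_newton_step(g, h, M):
    """Global minimizer of g*s + h*s**2/2 + M*abs(s)**3/6 over real s, for M > 0."""
    # Step length r >= 0 solves -|g| + h*r + M*r**2/2 = 0 on the descent side.
    r = (-h + math.sqrt(h * h + 2.0 * M * abs(g))) / M
    # Move against the gradient; when g == 0 both directions are equivalent, so pick +.
    return r if g <= 0 else -r


def minimize_1d(grad, hess, x0, M, iters=100):
    """Run a fixed number of cubic regularized Newton steps from x0."""
    x = x0
    for _ in range(iters):
        x += cubic_newton_step(grad(x), hess(x), M)
    return x


def grad(x):
    """Gradient of the toy function f(x) = x**4/4 - x**2/2."""
    return x**3 - x


def hess(x):
    """Second derivative of the same toy function."""
    return 3.0 * x * x - 1.0
```

Starting from `x0 = 0.0`, where the gradient vanishes but the curvature is negative, `minimize_1d(grad, hess, 0.0, M=12.0)` leaves the stationary point and settles at the local minimizer `x = 1`, a second-order stationary point. A plain gradient step would not move from `x0 = 0.0` at all.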

2606.20528 2026-06-19 math.DG new submission

Positive Scalar Curvature Obstructions via Singular Dimension Descent

通过奇异维度下降法的正数量曲率障碍

Yuchen Bi, Jintian Zhu

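AI summary This paper develops a Schoen--Yau type singular dimension descent method for positive scalar curvature obstructions in arbitrary dimensions, proves obstructions to positive scalar curvature on enlargeable manifolds, and establishes the corresponding cubical width inequalities and two-systole estimates.

Comments 51 pages

Details
AI Chinese abstract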

鉴于正质量定理的共形爆破方法的最新进展,包括He--Shi--Yu、Bi--Hao--He--Shi--Zhu和Brendle--Wang的工作,我们发展了Schoen--Yau型奇异维度下降法,用于任意维度的正数量曲率障碍。我们证明了可放大流形上的正数量曲率障碍,并建立了相应的立方宽度不等式和双系统估计。该方法也适用于可放大的AM--PI空间,当奇异集的Assouad余维数大于\(3-2/n\)时,给出了正数量曲率障碍。

English abstract

In light of recent advances in conformal blow-up methods for the positive mass theorem, including He--Shi--Yu, Bi--Hao--He--Shi--Zhu, and Brendle--Wang, we develop a Schoen--Yau type singular dimension descent method for positive scalar curvature obstructions in arbitrary dimensions. We prove obstructions to positive scalar curvature on enlargeable manifolds and establish the corresponding cubical width inequalities and two-systole estimates. The method also applies to enlargeable AM--PI spaces, giving a positive scalar curvature obstruction when the singular set has Assouad codimension greater than \(3-2/n\).

2606.20516 2026-06-19 math.DG cs.CG new submission

Approximation and interactive design with exact 3D elastic curves

精确3D弹性曲线的逼近与交互设计

David Brander, Jens Gravesen, Marc Isern

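AI summary Presents a numerically stable method for recovering the 11 parameters of a given elastic curve segment, which yields a fast and stable approximation of arbitrary space curve segments by 3D elastic curves, with applications to interactive design with exact elastic curves and CAD surface rationalization for robotic hot-blade cutting.

Comments 20 pages

Details
AI Chinese abstract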

弹性空间曲线是在适当约束下弯曲能量的临界点。等价于球摆方程的解析表示,导致3D弹性曲线段空间的11参数描述。我们给出了一种数值稳定的方法,从给定的弹性曲线段恢复这11个参数。利用这一点,我们提供了一种快速稳定的方法来逼近任意空间曲线段为3D弹性曲线。应用包括精确弹性曲线的交互设计和用于机器人热刀切割的CAD曲面合理化。

English abstract

An elastic space curve is a critical point of the bending energy subject to appropriate constraints. An analytic representation, equivalent to the spherical pendulum equation, leads to an 11-parameter description of the space of 3D elastic curve segments. We give a numerically stable method for recovering the 11 parameters from a given elastic curve segment. Using this, we give a fast and stable method to approximate an arbitrary space curve segment by a 3D elastica. Applications include interactive design with exact elastic curves and CAD surface rationalization for robotic hot-blade cutting.

2606.20509 2026-06-19 math.DS new submission

Planar constant piecewise smooth vector fields with large hysteresis

具有大滞后的平面常数分段光滑向量场

Tiago Carvalho, Leonardo Serantola, Bruno de Souza Rangel

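AI summary For hysteresis control systems that are widely used in applications but lack a theoretical account of their limit sets, this paper analyzes the planar case with two linear vector fields and two switching boundaries and classifies the limit sets.

Details
AI Chinese abstract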

在整个工作中,我们将对一类在应用中广泛使用但仍缺乏描述其动力学可能产生的极限集类型的一致理论基础的控制系统进行严格的数学分析。例如,在某些应用中,对某种疾病的治疗会一直进行,直到患病细胞水平低于规定的阈值C1。此时,暂停治疗以使患者机体从其副作用中恢复。随后,当患病细胞水平达到第二个大于C1的阈值C2时,恢复治疗,并重复该方案。据我们所知,目前还没有对此类模型的数学分类。在本文中,我们启动了一项旨在确定此类模型极限集的系统性文献工作。我们从平面情形开始,其中两个线性向量场处于活动状态,并考虑两个切换边界。自然,在未来的发展中,还应考虑更高维度的控制系统,其中包含额外的向量场和更一般的切换流形。

English abstract

In this work, we carry out a rigorous mathematical analysis of a class of control systems that is widely used in applications but still lacks a consistent theoretical foundation for describing the types of limit sets that may arise from its dynamics. In some applications, for example, a treatment for a given disease is administered until the level of diseased cells falls below a prescribed threshold C1. At that point, the treatment is suspended to allow the patient's organism to recover from its side effects. Subsequently, when the level of diseased cells reaches a second threshold C2, larger than C1, the treatment is resumed, and the protocol is repeated. To the best of our knowledge, there is no mathematical classification of such models. In this paper, we initiate what is intended to become a consistent body of literature aimed at determining the limit sets of such models. We begin with the planar case, in which two linear vector fields are active and two switching boundaries are considered. Naturally, in future developments, control systems in higher dimensions, featuring additional vector fields and more general switching manifolds, should also be considered.
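
The on/off protocol described above can be sketched as a scalar toy simulation. This is a deliberately simplified stand-in for the planar systems the paper studies: the level falls at a constant rate while treatment is on and rises at a constant rate while it is off, with switching at the thresholds C1 < C2. The rates, step size, and function name are illustrative assumptions.

```python
def simulate_hysteresis(level0, c1, c2, decay, growth, dt, steps):
    """Euler simulation of a scalar on/off protocol with thresholds c1 < c2.

    Returns the list of levels after each step and the number of switches.
    """
    level, treating = level0, True
    switches, trace = 0, []
    for _ in range(steps):
        # Constant rates, chosen only for illustration.
        rate = -decay if treating else growth
        level += rate * dt
        if treating and level < c1:
            # Level fell below C1: suspend treatment.
            treating, switches = False, switches + 1
        elif not treating and level >= c2:
            # Level reached C2: resume treatment.
            treating, switches = True, switches + 1
        trace.append(level)
    return trace, switches
```

With `level0=1.5, c1=1.0, c2=2.0, decay=1.0, growth=0.5, dt=0.01`, the level settles into a periodic oscillation between the two thresholds, the simplest kind of limit set such a switching rule can produce.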