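arXivDaily: a daily academic digest synced with the full arXiv dataset, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems, and more.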

2605.23681 2026-05-25 math.CO

New invariants for rank metric codes, with applications to the classification of rank two semifields of order 256

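Jack Gilchrist, Stefano Lia, Arani Paul, John Sheekey

AI summary: This paper introduces new invariants for the classification of semifields and combines them with new computational techniques, markedly speeding up the classification of rank two semifields. The focus is a complete classification of semifields of order 256, in particular those whose nucleus has order 16. The invariants and computational methods are offered as tools for further work in the area.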

2605.23679 2026-05-25 math.GT math.DG

Geometrisation of 3-manifolds

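Bruno Martelli

AI summary: The geometrisation of 3-manifolds was conjectured by Thurston in the 1980s and proved by Perelman in the early 2000s. This paper surveys the theory, presenting the core content of the theorem and its applications and consequences in various settings.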

Comments 27 pages, 26 figures

2605.23678 2026-05-25 math.OC

Concentration of measure-valued solutions for semilinear parabolic equations

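Charlie Lebarbé, Émilien Flayac, Michel Fournié, Didier Henrion, Milan Korda

AI summary: This paper studies the concentration of measure-valued solutions of semilinear parabolic equations. The authors prove that scalar semilinear parabolic PDEs of reaction-diffusion type have no relaxation gap, meaning that solutions of the linear measure formulation correspond to classical physical solutions of the original equation. They show that every solution of the measure equation gives rise to an energy measure-valued solution satisfying suitable energy identities, and that such a solution concentrates on the solution of the nonlinear PDE whenever the latter exists and is unique.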

Abstract

The moment-sum-of-squares hierarchy provides a powerful framework for solving non-convex optimal control problems by constructing a sequence of convex semidefinite relaxations. However, when extending these methods to nonlinear partial differential equations (PDEs), a fundamental challenge is the potential existence of a relaxation gap, where the solution to the linear measure formulation using occupation measures fails to correspond to a classical physical solution of the original PDE. In this paper, we prove the absence of a relaxation gap for scalar semilinear parabolic PDEs of the reaction-diffusion type. We do so by showing that each solution to the linear measure equation gives rise to an energy measure-valued (emv) solution in the space of Young measures satisfying suitable energy identities. We then prove that any such emv solution concentrates on the solution to the nonlinear PDE, provided the latter exists and is unique. To the best of our knowledge, this is the first concentration result of this kind for measure-valued solutions of reaction-diffusion PDEs.

2605.23901 2026-05-25 cs.LG cs.AI cs.IT math.IT

LLMs as Noisy Channels: A Shannon Perspective on Model Capacity and Scaling Laws

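Xu Ouyang, Deyi Liu, Yuhang Cai, Jing Liu, Yuan Yang, Chen Zheng, Thomas Hartvigsen, Yiyuan Ma

Affiliations: University of Virginia; University of California, Berkeley

AI summary: Taking a Shannon information-theoretic view, this paper models LLM training as information transmission over a noisy channel and proposes the Shannon Scaling Law to explain non-monotonic phenomena that classical monotonic scaling laws cannot describe, such as catastrophic overtraining and quantization-induced degradation. Mapping model parameters to channel bandwidth and training data to signal power, the theory shows that scaling model size or data without maintaining a sufficient signal-to-noise ratio amplifies noise and produces U-shaped performance degradation. Experiments show that the law fits and extrapolates better than classical scaling laws across multiple tasks and perturbation settings.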

Comments Accepted by ICML 2026

Abstract

Existing scaling laws for Large Language Models (LLMs), predominantly monotonic power laws, fail to explain emerging non-monotonic phenomena such as catastrophic overtraining and quantization-induced degradation, where performance deteriorates despite increased compute. We propose the Shannon Scaling Law, a unified theoretical framework that models LLM training as information transmission over a noisy channel, grounded in the Shannon-Hartley theorem. By mapping model parameters to channel bandwidth and training tokens to signal power, our formulation explicitly captures the interaction between learning signal and intrinsic noise. This perspective reveals a fundamental Shannon capacity for LLMs: scaling model size or data without preserving a sufficient signal-to-noise ratio (SNR) inevitably amplifies noise, inducing a transition from monotonic improvement to U-shaped performance degradation. We validate our theory through experiments on Pythia and OLMo2 under perturbations, including Gaussian noise, quantization and supervised fine-tuning on math, QA and code tasks. The Shannon Scaling Law consistently outperforms classical scaling laws and recent perturbation-aware laws, achieving strong $R^2$ scores and accurately capturing loss basins missed by prior approaches. It also extrapolates: fitted on $\leq$6.9B Pythia models with $\leq$180B tokens, it predicts the unseen 12B model up to 307B tokens at pooled $R^2{=}0.847$, while monotonic baselines collapse.
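
For orientation, the classical Shannon-Hartley theorem that the abstract builds on gives the capacity of a band-limited Gaussian channel as $C = B \log_2(1 + S/N)$. The sketch below only evaluates that textbook formula. The abstract does not state how the paper maps parameters and tokens onto bandwidth and signal power, so the function name `shannon_hartley_capacity` and all numbers here are illustrative assumptions, not the authors' scaling law.

```python
import math


def shannon_hartley_capacity(bandwidth: float, signal_power: float, noise_power: float) -> float:
    """Capacity of an AWGN channel, C = B * log2(1 + S/N), in bits per unit time."""
    if bandwidth < 0 or signal_power < 0 or noise_power <= 0:
        raise ValueError("bandwidth and signal power must be >= 0, noise power must be > 0")
    return bandwidth * math.log2(1.0 + signal_power / noise_power)


# Hypothetical numbers: the same bandwidth and signal power at two noise levels.
clean = shannon_hartley_capacity(bandwidth=2.0, signal_power=3.0, noise_power=1.0)
noisy = shannon_hartley_capacity(bandwidth=2.0, signal_power=3.0, noise_power=3.0)
```

With these toy values, raising the noise power at fixed bandwidth and signal power lowers the capacity, which is the qualitative signal-versus-noise trade-off the abstract appeals to.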

2605.23894 2026-05-25 quant-ph cs.IT math.IT

A Two-Branch Finite-Field Construction for Regular CSS LDPC Bases

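Koki Okada, Kenta Kasai

AI summary: This paper proposes a two-branch finite-field construction for base matrices of regular CSS quantum low-density parity-check codes. Using quotient-coset conditions over a finite field, it turns regularity, CSS orthogonality, and same-type 4-cycle exclusion into explicit conditions, and a normalized exhaustive search yields base matrices for several degree distributions. The finite-length design is split into two stages: the base matrix fixes the degree distribution and the first girth constraints, and a cyclic lift then randomizes edge connections subject to exact algebraic checks. For the detailed $(3,10)$-regular example, the resulting $[[10240,4108,\,10\le d\le32]]$ code reaches a post-processing frame error rate of $1.0\times10^{-7}$ at depolarizing probability $p=0.058$.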

Abstract

This paper develops a two-branch multiplicative-coset construction for regular Calderbank-Shor-Steane (CSS) quantum low-density parity-check base matrices. For a target column weight $J$ and an even row weight $L$, the method reduces regularity, CSS orthogonality, and same-type 4-cycle exclusion to explicit quotient-coset conditions over a finite field. A normalized exhaustive search for these conditions produces base matrices for several $(J,L)$ pairs, so the construction is not tied to a single degree distribution. The construction separates the finite-length design into two stages: the base matrix fixes the degree distribution and the first girth constraints, and a cyclic lift randomizes edge connections subject to exact algebraic checks. As a detailed example, we carry one $(3,10)$-regular base through the lift and decoding stages. For this example, the selected 64-fold lift gives a code whose same-type Tanner graphs have girth at least eight, and it also excludes a specified weight-16 nondegenerate logical-support orbit. The resulting instance is a $[[10240,4108,\,10\le d\le32]]$ CSS code. For decoding, we use joint log-domain belief propagation together with low-complexity deterministic post-processing rules for small residual syndromes, including repairs for residual patterns with two unsatisfied checks. The frame error rate (FER) measurements provide finite-length decoding data for this detailed example; at depolarizing probability $p=0.058$, the post-processing FER is $1.0\times10^{-7}$.
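
The CSS orthogonality requirement mentioned in the abstract is the standard condition $H_X H_Z^{\mathsf T} = 0$ over $\mathrm{GF}(2)$: every X-type check must overlap every Z-type check in an even number of positions. A minimal sketch of that check follows. It uses the well-known Steane code (both check matrices equal to the $[7,4]$ Hamming parity-check matrix) as a small stand-in, not the paper's finite-field construction, and the helper name `css_orthogonal` is hypothetical.

```python
from typing import List

Matrix = List[List[int]]


def css_orthogonal(h_x: Matrix, h_z: Matrix) -> bool:
    """Return True if H_X * H_Z^T = 0 over GF(2), i.e. all row overlaps are even."""
    for row_x in h_x:
        for row_z in h_z:
            if len(row_x) != len(row_z):
                raise ValueError("all rows must have the same length")
            overlap = sum(a & b for a, b in zip(row_x, row_z))
            if overlap % 2 != 0:
                return False
    return True


# Parity-check matrix of the [7,4] Hamming code; the Steane code uses it for both H_X and H_Z.
HAMMING = [
    [1, 0, 1, 0, 1, 0, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [0, 0, 0, 1, 1, 1, 1],
]

steane_ok = css_orthogonal(HAMMING, HAMMING)
# A deliberately broken pair: the two rows overlap in exactly one position.
broken_ok = css_orthogonal([[1, 1, 0]], [[1, 0, 0]])
```

Any candidate pair of lifted check matrices would have to pass the same test. The regularity and 4-cycle conditions from the abstract are separate checks that this sketch does not cover.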

2605.23884 2026-05-25 math.FA math-ph math.CA math.MP math.SP

On almost periodicity in crystalline measures

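Jan Mazáč, Christoph Richard, Nicolae Strungaru

AI summary: This paper studies almost periodicity of crystalline measures viewed as distributions. The authors show that almost periodicity of a crystalline measure is characterised by its translation boundedness, and they construct a counterexample showing that not every crystalline measure is translation bounded as a distribution. The constructed crystalline Fourier eigenmeasure resolves the questions posed by Meyer and Favorov, delineates the boundary of translation boundedness, and exhibits unusual behaviour of crystalline measures beyond the class of Fourier quasicrystals.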

Comments 34 pages

Abstract

Meyer defined crystalline measures as tempered distributions $\mu$ such that both $\mu$ and its Fourier transform $\widehat{\mu}$ are pure-point Radon measures of locally finite support. He conjectured that every crystalline measure is almost periodic as a tempered distribution. Favorov constructed a counterexample and asked whether crystalline measures are at least almost periodic as general distributions. To resolve Favorov's question, we first show that the almost periodicity of a crystalline measure is characterised in terms of its translation boundedness, in any class of Radon measures, tempered distributions, or general distributions. We then construct a crystalline Fourier eigenmeasure that fails to be translation bounded even as a distribution. We finally construct a crystalline measure that fails to be a Fourier quasicrystal (in particular, it fails to be slowly increasing), but it is an almost periodic tempered distribution whose Fourier transform is even a norm almost periodic measure. Our examples fully resolve the questions of Meyer and Favorov and sharply delineate the class boundary of translation boundedness. They also demonstrate the unusual behaviour of crystalline measures beyond the class of Fourier quasicrystals.

2605.23879 2026-05-25 stat.ML cs.CR cs.LG math.ST stat.TH

On the Stability of Spherical Hellinger-Kantorovich Flows and Their Implications for Differential Privacy

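Aratrika Mustafi, Soumya Mukherjee

Affiliations: Department of Statistics, Pennsylvania State University

AI summary: This paper studies the stability of spherical Hellinger-Kantorovich (SHK) gradient flows and its implications for differential privacy. The authors develop a perturbation theory for the flow, compare the dynamics induced by different potentials, and obtain a uniform time-dependent bound on the log-likelihood ratio and Rényi divergence, from which bounds on the KL divergence follow under additional structure. Applied to sampling from the exponential mechanism, the results give pure and approximate differential privacy guarantees for SHK-based samplers and separate the intrinsic suboptimality of the mechanism from finite-time sampling error.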

Abstract

Gradient-flow sampling interprets a Gibbs distribution as the minimizer of an energy functional over probability measures and generates dynamics converging to this target. Under spherical Hellinger-Kantorovich (SHK) geometry, the flow couples transport and reaction and coincides with birth-death Langevin dynamics. In this work, we develop a perturbation theory for SHK gradient flows. For two potentials $V$ and $V^{\prime}$, we compare the associated flows from a common initialization and quantify how potential discrepancies propagate over time. A uniform perturbation bound yields dimension-free, pointwise control of the log-likelihood ratio and Rényi divergence, while additional structure allows us to derive bounds for the KL divergence as well. We apply these results to approximate sampling for the exponential mechanism in differential privacy. The likelihood-ratio control provides explicit time-dependent Pure-DP guarantees for SHK-based samplers, while the KL bound yields Approximate-DP certificates via hockey-stick divergence. We also derive a utility bound separating intrinsic exponential-mechanism suboptimality from finite-time sampling error.
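
As background for the exponential mechanism named in the abstract, here is a minimal sketch of the exact mechanism over a finite candidate set, where an output $r$ is chosen with probability proportional to $\exp(\varepsilon\, u(r) / (2\Delta u))$. This is the textbook definition only. The paper concerns approximate sampling via SHK gradient flows, which this sketch does not implement, and the function names, utilities, and parameter values are illustrative assumptions.

```python
import math
import random
from typing import List, Sequence


def exponential_mechanism_probs(utilities: Sequence[float], epsilon: float, sensitivity: float) -> List[float]:
    """Selection probabilities proportional to exp(epsilon * u / (2 * sensitivity))."""
    if epsilon <= 0 or sensitivity <= 0:
        raise ValueError("epsilon and sensitivity must be positive")
    scale = epsilon / (2.0 * sensitivity)
    top = max(utilities)  # subtract the maximum before exponentiating for numerical stability
    weights = [math.exp(scale * (u - top)) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


def sample_exponential_mechanism(candidates, utilities, epsilon, sensitivity, rng):
    """Draw one candidate according to the exponential-mechanism distribution."""
    probs = exponential_mechanism_probs(utilities, epsilon, sensitivity)
    return rng.choices(candidates, weights=probs, k=1)[0]


# Hypothetical utilities for three candidate outputs.
probs = exponential_mechanism_probs([0.0, 1.0, 2.0], epsilon=1.0, sensitivity=1.0)
choice = sample_exponential_mechanism(["a", "b", "c"], [0.0, 1.0, 2.0], 1.0, 1.0, random.Random(0))
```

On a continuous domain the normalizing constant is usually intractable, which is why approximate samplers, and guarantees that account for their finite-time error, are of interest.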

2605.23872 2026-05-25 cs.LG cs.NA math.NA stat.ML

Training-Free Looped Transformers

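Lizhang Chen, Jonathan Li, Chen Liang, Ni Lao, Qiang Liu

Affiliations: University of Texas at Austin

AI summary: This paper proposes training-free looped transformers: a lightweight inference-time wrapper loops a contiguous block of middle layers of a frozen pretrained model, with no additional fine-tuning or architectural changes. Naively reapplying the block degrades performance, so the authors draw on the forward Euler method for ordinary differential equations and treat looping as a refinement of the same approximation, replacing one large update with smaller damped sub-steps. Across seven model families the method improves benchmark accuracy, for example by +2.64 pp for Qwen3-4B-Instruct on MMLU-Pro.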

Abstract

We introduce training-free looped transformers, in which a lightweight inference-time wrapper loops a contiguous mid-stack block of layers of a frozen checkpoint without additional fine-tuning, continued training, or architectural changes. Unlike prior looped transformer methods that train with the looped structure end-to-end, we retrofit recurrence onto pretrained models at test time. We show that naive block reapplication usually degrades performance, highlighting the importance of the loop application strategy. Motivated by viewing a pre-norm transformer block as a forward Euler step on an ODE, we instead treat looping as a refinement of the same approximation, replacing one large update with smaller damped sub-steps. Across seven dense, sparse MoE, and MLA+MoE model families, our method improves Qwen3-4B-Instruct by +2.64 pp on MMLU-Pro, Qwen3-30B-A3B-Instruct by +1.14 pp on CommonsenseQA, and Moonlight-16B-A3B-Instruct by +1.20 pp on OpenBookQA.
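
The forward-Euler reading of a residual block, $x \leftarrow x + f(x)$, can be illustrated on a toy problem. The sketch below contrasts naive reapplication (two full-size updates) with two damped sub-steps of size $1/2$. It assumes uniform damping $1/K$ and uses a scalar linear map `toy_block` in place of real transformer layers. Both are assumptions made for illustration, since the abstract does not specify the authors' exact damping schedule.

```python
from typing import Callable, List

Vector = List[float]


def residual_step(x: Vector, block: Callable[[Vector], Vector], step: float = 1.0) -> Vector:
    """One residual update x + step * block(x), i.e. a forward Euler step of size `step`."""
    update = block(x)
    return [xi + step * ui for xi, ui in zip(x, update)]


def naive_loop(x: Vector, block: Callable[[Vector], Vector], n_loops: int) -> Vector:
    """Reapply the block n_loops times with full-size updates."""
    for _ in range(n_loops):
        x = residual_step(x, block, 1.0)
    return x


def damped_loop(x: Vector, block: Callable[[Vector], Vector], n_substeps: int) -> Vector:
    """Replace one full update by n_substeps updates of size 1 / n_substeps."""
    h = 1.0 / n_substeps
    for _ in range(n_substeps):
        x = residual_step(x, block, h)
    return x


def toy_block(x: Vector) -> Vector:
    """Hypothetical stand-in for a block's residual branch: f(x) = -0.5 * x."""
    return [-0.5 * xi for xi in x]


single = residual_step([1.0], toy_block)   # one ordinary pass through the block
naive = naive_loop([1.0], toy_block, 2)    # block applied twice at full strength
damped = damped_loop([1.0], toy_block, 2)  # two half-size sub-steps
```

In this toy setting the damped variant stays close to the single-pass output and moves toward the ODE solution $e^{-0.5} \approx 0.61$, whereas naive reapplication overshoots by integrating twice as far.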

2605.23871 2026-05-25 stat.ML cs.LG math.ST stat.TH

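Move on Muon: A Hamiltonian probability gradient flow perspective of Muon optimizer

Aratrika Mustafi, Soumya Mukherjee, Bharath K. Sriperumbudur

AI summary: From the perspective of Hamiltonian probability gradient flows, this paper studies the continuous-time dynamics of the Muon optimizer. It formulates a gradient flow for regularized Muon and relates the update to a Fenchel-dual smoothing of the nuclear norm. Lifting Muon to finite-particle probability objectives, the authors derive the inertial continuous-time limit and a phase-space mean-field equation over probability laws on parameter-momentum pairs, and show that the resulting flow is a damped Hamiltonian probability dynamics whose Hamiltonian energy decreases monotonically. They also analyze convergence of the objective and extend the method to a blockwise Muon probability flow applicable to transformer mixture-of-experts models.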

Abstract

We develop a gradient flow on the space of probability measures defined on matrix-valued parameters induced by regularized Muon, an analytically smoothed version of the idealized Muon optimizer. The key observation is that the regularized orthogonalization map is the gradient of a smooth Fenchel-dual smoothing of the nuclear norm. This identifies the (regularized) Muon update as a mirror/prox step in the update variable, with momentum acting as the dual coordinate. We use this structure to lift Muon from a single matrix parameter to finite-particle probability objectives of the form $J(\rho)=R\left(\int F \, d\rho\right)$, a setting motivated by mean-field descriptions of neural-network training, and derive the inertial continuous-time limit. Using this structure, we derive the finite-particle continuous-time limit under the inertial scaling of step size and momentum, and then pass to a phase-space mean-field equation over probability laws on parameter-momentum pairs. The resulting flow can be shown to be a damped Hamiltonian probability dynamics whose kinetic energy is induced by the regularized Muon mirror potential. We prove an exact Hamiltonian dissipation identity, showing that the Hamiltonian energy decreases monotonically. While the target objective itself need not be monotone along the inertial Muon dynamics, under additional gradient-dominance, bounded-momentum, and curvature/alignment assumptions, we obtain continuous and discrete-time exponential convergence rates for the objective gap. We also study the well-posedness of the mean-field limit equation and establish propagation of chaos guarantees for the interacting particle system. Finally, we extend the formulation to Hilbert-valued feature maps on product matrix spaces, yielding a blockwise Muon probability flow applicable to smooth transformer mixture-of-experts models.

2605.23866 2026-05-25 math.MG cs.DM

Optimal Vector Balancing for Zonotopes

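Victor Reis

AI summary: This paper studies optimal vector balancing for zonotopes, which are linear images of cubes. The author proves that there is a universal constant $C$ such that for any zonotope $Z \subset \mathbb{R}^d$ and vectors $v_1, \dots, v_n \in Z$, there exist signs $x_1, \dots, x_n \in \{-1, 1\}$ with $\sum_{i=1}^n x_i v_i \in C\sqrt{d}\, Z$. This resolves a question posed by Schechtman in 2002 and generalizes Spencer's six standard deviations theorem.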

Comments 24 pages

2605.23865 2026-05-25 math.RA

Images of polynomials with involution on $2\times 2$ matrices

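Lucio Centrone, Thiago Castilho de Mello

AI summary: This paper studies the structure of images of multilinear $*$-polynomials on the algebra of $2\times 2$ matrices equipped with an involution of the first kind. For the transpose involution over the reals, the image is either a proper subspace or contains a basis of the matrix space; for the symplectic involution over quadratically closed fields or the reals, the image is always a vector space. As a byproduct, the authors complete a theorem of Brešar and Klep on the linear span of images of $*$-polynomials on finite-dimensional central simple algebras by settling the previously excluded cases of dimensions 4 and 16, and they classify the Lie skew-ideals of $M_4(\mathbb{F})$ over fields of characteristic zero.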

Comments 17 pages

Abstract

Let $\mathbb{F}$ be a field and let $M_2(\mathbb{F})$ be the algebra of $2\times 2$ matrices endowed with an involution of the first kind. We study the image of multilinear $*$-polynomials evaluated on $M_2(\mathbb{F})$. For the transpose involution over $\mathbb{R}$, we show that the image is either a proper vector subspace or contains a basis of $M_2(\mathbb{R})$. For the symplectic involution over quadratically closed fields or over $\mathbb{R}$, we prove that the image is always a vector space, namely one of $\{0\}$, $\mathbb{F}$, $sl_2(\mathbb{F})$ or $M_2(\mathbb{F})$. As a byproduct, we complete a theorem of Brešar and Klep describing the linear span of the image of a $*$-polynomial on finite dimensional central simple algebras with involution of the first kind. Their result excluded algebras of dimensions 4 and 16; we settle both cases, extending the description to all dimensions greater than 1 (over $\mathbb{R}$ for the transpose involution, and over quadratically closed fields or $\mathbb{R}$ for the symplectic involution). We also classify all Lie skew-ideals of $M_4(\mathbb{F})$ over fields of characteristic zero.

2605.23864 2026-05-25 math.OC cs.SY eess.SY

Harnessing Individual Motivation for Collective Efficiency: A Mechanism-Driven Distributed Optimization Method

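Dongwei Xie, Xuhao Wang, Yujie Tang, Jie Song

AI summary: In industrial settings where multiple agents make decisions collectively, centralized decision-making may be infeasible because individual information is hard to gather centrally and individual interests conflict with global performance. This paper proposes a mechanism-driven distributed optimization method in which incentive mechanisms lead self-interested participants to collaborate. For optimization problems with coupled objective functions and coupled constraints, the authors design a distributed algorithm with convergence guarantees and introduce two incentive mechanisms that ensure participants are willing to collaborate. The algorithm and the mechanisms form a closed loop, and numerical experiments illustrate their effectiveness.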

Abstract

In industrial scenarios involving multi-agent collective decision-making, centralized decision-making may not be admissible due to restrictive access to individual local information, while the conflicts between participants' self-interest and global performance may also impede collaborative distributed decision-making. This paper proposes a mechanism-driven distributed decision-making method, wherein incentives are employed and designed to motivate participants to collaborate in a distributed fashion even though each participant's decision is driven primarily by self-interest. Focusing on optimization problems with coupled objective functions and coupled constraints, we design a distributed optimization algorithm tailored for this class of problems and provide guarantees for its convergence. Furthermore, we design two incentive mechanisms, the shadow pricing mechanism and the Vickrey-Clarke-Groves mechanism, and demonstrate that participants are willing to engage in distributed collaboration under these mechanisms. The mechanism drives the execution of the distributed algorithm, and the optimal result of distributed computation guides the determination of incentives in the mechanism, both of which are interrelated to form a closed loop. Finally, numerical experiments illustrate the effectiveness of the proposed algorithm and mechanisms.

2605.23854 2026-05-25 cs.LG math.ST stat.ML stat.TH

Entrywise Error Bounds for Spectral Ranking with Semi-Random Adversaries

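Dongmin Lee, Anuran Makur, Japneet Singh

Affiliations: Department of Computer Science; Elmore Family School of Electrical and Computer Engineering; Purdue University

AI summary: This paper studies entrywise error bounds for spectral ranking under semi-random adversaries. Against a semi-random adversary that can arbitrarily boost the sampling probabilities of certain edges, the authors analyze the unweighted spectral method and find that its performance depends heavily on the spectral properties of the generated graph. Appropriately reweighting the observed edges to counteract the adversary recovers asymptotic performance approaching that of uniformly sampled graphs. Numerical experiments support the theory.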

Comments 17 pages, 2 figures, 32nd ACM SIGKDD Conference on Knowledge Discovery and Data Mining V.2

Abstract

Bradley-Terry-Luce (BTL) model estimation is a well-established strategy to rank a collection of items given a dataset of pairwise comparisons. Although the theoretical performance of BTL estimation methods, such as spectral and maximum likelihood estimation, is well studied in the regime of uniformly sampled graphs, generalizing such results to a wider class of random graphs has proved challenging. In this work, we investigate the entry-wise error of spectral algorithms against a semi-random adversary that can arbitrarily boost the sampling probabilities of certain edges. We find that the performance of the unweighted spectral method is heavily dependent on the spectral properties of the generated graph. Furthermore, we show that asymptotic performance approaching that of uniformly sampled graphs can be recovered by appropriately reweighting the observed edges to counteract the adversary and restore the spectral gap. Finally, we provide numerical simulations that support our theoretical findings.
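
For readers unfamiliar with spectral BTL estimation, the sketch below implements a common unweighted spectral estimator in the style of Rank Centrality: build a random walk that moves from item $i$ to item $j$ in proportion to how often $j$ beat $i$, and read the scores off its stationary distribution. This is an assumed baseline for illustration, not necessarily the exact estimator or the reweighting scheme analyzed in the paper. The normalization `d = max degree + 1` is a choice made here to keep the chain lazy, and it does not change the stationary distribution.

```python
from typing import List


def spectral_ranking(wins: List[List[float]], iters: int = 2000) -> List[float]:
    """Estimate normalized BTL scores; wins[i][j] counts how often item i beat item j."""
    n = len(wins)
    # Number of distinct opponents each item has been compared with.
    degree = [
        sum(1 for j in range(n) if j != i and wins[i][j] + wins[j][i] > 0)
        for i in range(n)
    ]
    d = max(degree) + 1  # common normalizer; +1 leaves every state a self-loop
    transition = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            total = wins[i][j] + wins[j][i]
            if total > 0:
                # Move from i to j in proportion to the fraction of games j won against i.
                transition[i][j] = (wins[j][i] / total) / d
        transition[i][i] = 1.0 - sum(transition[i])
    # Power iteration for the stationary distribution (left eigenvector for eigenvalue 1).
    pi = [1.0 / n] * n
    for _ in range(iters):
        pi = [sum(pi[i] * transition[i][j] for i in range(n)) for j in range(n)]
    norm = sum(pi)
    return [p / norm for p in pi]


# Idealized data: "wins" set to the exact BTL win probabilities for weights 1, 2, 3.
true_weights = [1.0, 2.0, 3.0]
ideal_wins = [
    [0.0 if i == j else wi / (wi + wj) for j, wj in enumerate(true_weights)]
    for i, wi in enumerate(true_weights)
]
scores = spectral_ranking(ideal_wins)
```

With exact win probabilities the chain satisfies detailed balance with respect to the true weights, so the estimate recovers them up to normalization. The paper's question is how far this degrades when an adversary skews which pairs get compared.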

2605.23853 2026-05-25 math-ph math.MP physics.optics quant-ph

Exact versus tight-binding models in longitudinally modulated $\mathcal{PT}$-symmetric coupled waveguides

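Alonso Contreras-Astorga, José Israel Galindo-Rodríguez

AI summary: This paper examines the differences between the tight-binding (TB) model and exact solutions in longitudinally modulated $\mathcal{PT}$-symmetric coupled waveguides. Comparing an exact continuous model built from supersymmetric transformations with the corresponding discrete TB approximation, it shows that the TB model reproduces spatial intensity distributions well but has limitations in describing the complex oscillatory phase dynamics of the non-Hermitian evolution. The study delineates the validity range of the TB model for this class of systems.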

Abstract

The tight-binding (TB) model is a widely adopted approximation scheme for describing light propagation in waveguide arrays. Despite its success, its validity in $\mathcal{PT}$-symmetric systems characterized by strong longitudinal modulation has not been rigorously benchmarked against exact analytical solutions. In this work, we address this gap by performing a comparative analysis between exact continuous solutions derived from $z$-dependent supersymmetric (SUSY) transformations and their corresponding discrete TB approximations. To achieve this, we develop a theoretical model for two PT-symmetric coupled waveguides subject to longitudinal modulation. We then evaluate the performance of the TB framework against the exact SUSY benchmark. Our results delineate the specific validity range of the TB approximation, demonstrating its proficiency in reproducing spatial intensity distributions. However, we also identify its limitations in accurately capturing the complex oscillatory phase dynamics inherent to this non-Hermitian evolution.

2605.23852 2026-05-25 quant-ph math-ph math.MP

Convexity and non-Markovianity of Weyl Maps

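Wen Xu, Vinayak Jagadish

AI summary: This paper studies how non-Markovian dynamics emerges in finite-dimensional open quantum systems described by Weyl dynamical maps and their convex combinations. Using the Hermite normal form, the authors give a complete classification of the subgroups of the discrete phase space, set up the algebraic framework for Weyl maps, and contrast isotropic Weyl maps, which generate Markovian semigroups, with anisotropic ones, which cannot have the semigroup property. They analyze how convex combinations generate and suppress non-Markovianity, prove that non-Markovianity is not additive under mixing, and identify irreducible Weyl dephasing maps that are eternally non-Markovian without any mixing, extending the theory of non-Markovian dynamics beyond the Pauli framework.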

Comments 17 pages

Abstract

We investigate the emergence of non-Markovian dynamics in finite-dimensional open quantum systems governed by Weyl dynamical maps and their convex combinations. Using the Hermite normal form, we provide a complete classification of the subgroups of the discrete phase space $\mathbb{Z}_d \times \mathbb{Z}_d$, establishing the algebraic framework underlying the Weyl maps. We characterize isotropic Weyl dynamical maps that generate Markovian semigroups and show that anisotropic Weyl maps with nonuniform weight distributions cannot possess the semigroup property. Furthermore, we analyze the role of convexity in the generation and suppression of memory effects. Remarkably, we prove that convex combinations of eternally non-Markovian Weyl dephasing maps can generate Markovian semigroups, demonstrating that non-Markovianity is not additive under mixing. Conversely, we establish a general condition under which convex mixtures of $N$ distinct Weyl semigroups exhibit eternal non-Markovianity. In contrast to the qubit Pauli setting, we further identify the existence of irreducible eternally non-Markovian Weyl dephasing maps, namely, individual dynamical maps that display eternal memory effects without requiring any mixing mechanism. Finally, explicit qutrit examples illustrate the transition among Markovian, non-Markovian and eternally non-Markovian regimes. Our results uncover a fundamental connection among finite phase-space algebra, convex structures, and quantum memory effects, thereby extending the theory of non-Markovian dynamics beyond the Pauli framework.
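
To make the subgroup classification of $\mathbb{Z}_d \times \mathbb{Z}_d$ concrete, the following sketch enumerates all subgroups for small $d$ by brute force, using the fact that every subgroup of this group can be generated by at most two elements. It is not the Hermite-normal-form classification used in the paper, only an independent way to list the same objects for small cases. The function names are hypothetical.

```python
from itertools import product
from typing import FrozenSet, Iterable, Set, Tuple

Element = Tuple[int, int]


def generated_subgroup(gens: Iterable[Element], d: int) -> FrozenSet[Element]:
    """Subgroup of Z_d x Z_d generated by `gens`, found by repeatedly adding generators."""
    gens = list(gens)
    group = {(0, 0)}
    frontier = [(0, 0)]
    while frontier:
        x, y = frontier.pop()
        for gx, gy in gens:
            nxt = ((x + gx) % d, (y + gy) % d)
            if nxt not in group:
                group.add(nxt)
                frontier.append(nxt)
    return frozenset(group)


def subgroups_of_zd_squared(d: int) -> Set[FrozenSet[Element]]:
    """All distinct subgroups of Z_d x Z_d, via every pair of candidate generators."""
    elements = list(product(range(d), repeat=2))
    return {generated_subgroup((a, b), d) for a in elements for b in elements}


counts = {d: len(subgroups_of_zd_squared(d)) for d in (2, 3, 4)}
```

For prime $d$ the count is $d + 3$: the trivial subgroup, the $d + 1$ lines through the origin, and the whole group. The qutrit case $d = 3$ from the abstract therefore has six subgroups.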

2605.23849 2026-05-25 math.AC math.CO

Incidence toric ideals and three-point functions

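Barbara Betti, Sean Grate, Thiago Holleben, Flavio Salizzoni

AI summary: This paper studies, from combinatorial and topological viewpoints, the ideal of algebraic relations among three-point functions. Placing the problem in the framework of incidence toric ideals associated with inclusion relations, it shows that generators of these ideals can be interpreted combinatorially as null designs and topologically as balanced orientable pseudomanifolds. Generators arising from the octahedron play a fundamental role in the structure of these ideals.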

Comments 22 pages, 4 figures

2605.23846 2026-05-25 math.FA

Adjacent cross-sections of the commutant of Hilbert space operators

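László Kérchy

AI summary: This paper studies adjacent cross-sections of the commutant of Hilbert space operators. Using a specific operator model, it examines the structure and transformation properties of compressions of operators in the commutant. A cyclicity property of the commutant yields transitivity of a linear manifold whose 3-dimensional cross-sections and canonical bases were analyzed in earlier work. The paper gives new conditions by studying the matching of adjacent cross-sections.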

Abstract

Applying the techniques that yield the existence of almost invariant half-spaces, similarity models $\widehat{T}$ can be given for upper triangular operator matrices $T= \left[\begin{matrix}A&C\\ 0&B\end{matrix}\right]$. The model $\widehat{T}$ is also an operator matrix, containing two diagonal operators in the general case \cite{ker25}, and the unilateral shift $S$ together with a diagonal operator in the particular case when $A$ is similar to $S$ \cite{ker26}. Well-chosen compressions of operators in the commutant $\{\widehat{T}\}'$ form a linear manifold $\widehat{\mathcal{L}}$ satisfying the condition that every $\widehat{X}\in\widehat{\mathcal{L}}$ is transformed into $\widehat{Y}$ with $\operatorname{rank}\widehat{Y}\le 2$ by a canonical mapping. Furthermore, a cyclicity property of $\{T\}'$ yields transitivity of $\widehat{\mathcal{L}}$. In \cite{ker25} and \cite{ker26} the 3-dimensional cross-sections of $\widehat{\mathcal{L}}$ have been investigated, characterizing the canonical bases occurring in the corresponding subspaces of the matrix algebra $M_3[\mathbb{C}]$. In this paper new conditions are provided by studying the matching of adjacent cross-sections.

2605.23835 2026-05-25 math.AP math.FA

Pointwise Estimates Near Singular Sets for Quasilinear Elliptic Equations

Juan Pablo Alcon Apaza

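AI summary: This paper studies the removability of boundary singular sets for quasilinear elliptic equations on $n$-dimensional Finsler manifolds. Using Lipschitz functions with distance-type properties, the author proves a pointwise estimate for weak solutions near the singular set, with constants that converge as $p^{+} \to 1$, and uses it to show that the singular set is removable. The author also shows that, under variable-exponent conditions, a subsequence of weak solutions converges to a generalized solution of the limiting equation.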

Comments 31 pages

Abstract

In this work, we study the removability of boundary singular sets for certain classes of quasilinear elliptic equations in domains $\Omega$ of an $n$-dimensional Finsler manifold $(\mathcal{M}, F, \vartheta)$. We work with Lipschitz functions $\rho_1$ and $\rho_2$ satisfying distance-type properties; in particular, $F(\cdot, \boldsymbol{\nabla} \rho_1) \leq 1$ and $F(\cdot, \boldsymbol{\nabla} \rho_2) \leq 1$ a.e. in $\mathcal{M}$. The singular set is defined by $\Gamma=\rho_1^{-1}(\{0\})$. The model problem is $-\Delta_{p(x)} u+|u|^{q-1} u=0$ in domains of $\mathbb{R}^n \cong \mathbb{R}^d \times \mathbb{R}^{n-d} \cong \rho_1^{-1}(\{0\}) \times \rho_2^{-1}(\{0\})$, where $\rho_1(x)=|(x_{d+1}, \ldots, x_n)|$ and $\rho_2(x)=|(x_1, \ldots, x_d)|$. The main tool in our analysis is the estimate $$ |u(x)| \leq \mathbf{C} \rho_1(x)^{-\tau} $$ near $\Gamma$ for weak solutions $u \in W_{loc}^{1, p(x)}(\bar{\Omega} \backslash(\Gamma\cup \Sigma) ; \vartheta) \cap L_{loc}^{\infty}(\bar{\Omega} \backslash(\Gamma\cup \Sigma))$, where the constants $\mathbf{C}>0$ and $\tau>0$ converge to positive values as $p^{+} \rightarrow 1$. This estimate is a key ingredient in proving that the singularity at $\Gamma$ is removable. Moreover, in a bounded domain $\Omega$, using this estimate and assuming that, for every variable exponent satisfying $1<p^{-} \leq p^{+}<\min \{2, q+1\}$, there exists a weak solution $u_p \in W_{loc}^{1, p(x)}(\Omega; \vartheta) \cap L_{loc}^{\infty}(\Omega)$ of $$ -\operatorname{div}\left(|\boldsymbol{\nabla} u_p|_F^{p-2} \boldsymbol{\nabla} u_p\right)+|u_p|^{q-1} u_p=0 \quad \text{in } \Omega, $$ we prove that, for every $U \Subset \Omega$, there exists a subsequence $\{u_{p_m}\}$, with $p_m^{+} \rightarrow 1$, that converges to a solution $u \in BV(U ; \vartheta) \cap L^{q+1}(U ; \vartheta)$ of $$ -\Delta_1 u+|u|^{q-1} u=0 \quad \text{in } U. $$

2605.23829 2026-05-25 math.GR math.GT

Outer automorphism groups of hyperbolic groups, bounded extensions, and hierarchical hyperbolicity

Ervin Hadziosmanovic, Giorgio Mangioni

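AI summary: This paper studies the structure of outer automorphism groups of one-ended hyperbolic groups and proves that, when the JSJ decomposition satisfies a certain orientability condition, the outer automorphism group is virtually a hierarchically hyperbolic group (HHG). The result follows by showing that a finite-index subgroup is a central extension of a product of orbifold mapping class groups, with bounded Euler class. The paper also gives a counterexample in which the outer automorphism group is not an HHG, showing that the result is optimal.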

Comments 27 pages, 4 figures. Comments are welcome!

2605.23828 2026-05-25 math.CO

Strong majority colorings of graphs

Rafał Kalinowski, Mateusz Kamyczura, Monika Pilśniak, Mariusz Woźniak

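AI summary: This paper introduces strong majority vertex-colorings and strong majority edge-colorings of graphs, in which each color appears on at most half of the elements adjacent to any given vertex or edge. It defines the strong majority number and the strong majority index, proves that the strong majority number is at most $2\Delta(G)+1$ for graphs without pendant vertices, and conjectures that the strong majority index of every admissible graph is bounded above by the constant 4, confirming the conjecture for several graph classes.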

Comments 13 pages, 2 figures

Abstract

Motivated by majority vertex-colorings of graphs and digraphs and majority edge-colorings of graphs, we introduce two concepts of strong majority colorings. A strong majority vertex-coloring of a graph $G=(V,E)$ is a mapping $c:V\rightarrow C$ such that for every vertex $v\in V$ and every color $\alpha\in C$, at most half of the neighbors of $v$ have color $\alpha$. The strong majority number of $G$, denoted Maj$(G)$, is the least number of colors in such a coloring. We show that Maj$(G)$ can be arbitrarily large and prove a tight upper bound Maj$(G)\le 2\Delta(G)+1$ for every graph $G$ without pendant vertices. A strong majority edge-coloring of a graph $G$ is a mapping $c:E\rightarrow C$ such that for every edge $e\in E$ and every color $\alpha\in C$, at most half of the edges adjacent to $e$ have color $\alpha$. The strong majority index of $G$, denoted Maj'$(G)$, is the least number of colors in such a coloring. We show that there is a constant upper bound for Maj'$(G)$ over all admissible graphs $G$. We conjecture that this constant is as small as 4 and confirm this conjecture for numerous graph classes.
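
The vertex-coloring definition above can be checked mechanically. The following is a minimal sketch, not taken from the paper: the function names `is_strong_majority_coloring` and `cycle` and the adjacency-dictionary representation are assumptions made for illustration. It tests the condition that no color occupies more than half of any vertex's neighborhood.

```python
from collections import Counter


def is_strong_majority_coloring(adj, color):
    """Check the strong majority vertex-coloring condition.

    adj   : dict mapping each vertex to a list of its neighbors (hypothetical format)
    color : dict mapping each vertex to its color
    Returns True if, for every vertex v and every color, at most half of
    the neighbors of v carry that color.
    """
    for v, neighbors in adj.items():
        counts = Counter(color[u] for u in neighbors)
        # Compare 2 * count with the degree to avoid fractions.
        if any(2 * k > len(neighbors) for k in counts.values()):
            return False
    return True


def cycle(n):
    """Adjacency dict of the cycle on vertices 0..n-1 (illustrative helper)."""
    return {i: [(i - 1) % n, (i + 1) % n] for i in range(n)}


if __name__ == "__main__":
    c4 = cycle(4)
    # The two neighbors of each vertex receive different colors: valid.
    print(is_strong_majority_coloring(c4, {0: "a", 1: "a", 2: "b", 3: "b"}))
    # A proper 2-coloring gives every vertex two same-colored neighbors: invalid.
    print(is_strong_majority_coloring(c4, {0: "a", 1: "b", 2: "a", 3: "b"}))
    # A pendant vertex has one neighbor, whose color always exceeds half.
    print(is_strong_majority_coloring({0: [1], 1: [0]}, {0: "a", 1: "b"}))
```

The last call illustrates why the bound in the abstract is stated for graphs without pendant vertices: a vertex of degree 1 can never satisfy the condition, whatever colors are used.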

2605.23822 2026-05-25 math.GR math.GN math.LO

Coarse Structures on Homogeneous Spaces

Carlos Pérez Estrada, Christian Rosendal

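AI summary: For a closed normal subgroup $H$ of a topological group $G$, this paper asks whether the left coarse structure of the quotient group $G/H$ equals the quotient by $H$ of the left coarse structure of $G$. The authors give a counterexample among Polish groups and, under specific conditions, establish equivalent and sufficient conditions involving the lifting of bounded sets, the existence of cross-sections, and the metrizability of the left coarse structure of $G$ relative to $H$.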

2605.23817 2026-05-25 math.PR math.DG

Visibility in the Boolean Model on Harmonic Manifolds

Enkelejd Hashorva, Christoph Thäle

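AI summary: This paper studies visibility in the Boolean model on harmonic manifolds. It proves that, conditional on a point being uncovered, the visibility range in any direction is exponentially distributed; this was known in Euclidean and real hyperbolic space and is extended here to all simply connected noncompact homogeneous harmonic manifolds. The geometric reason is the affine-linear growth of the volume of tubular neighborhoods of geodesic segments. The paper further analyzes the regime in which the expected volume of the visible region is finite and the geometric meaning of the critical threshold, and constructs non-homogeneous Riemannian manifolds that illustrate the close link between the exponential distribution and linear growth of tube volume.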

2605.23813 2026-05-25 math.OC cs.SY eess.SY

Minimum Effort Control Using Variational Methods of Analytical Mechanics: A New Approach for Optimal Control

Ossama Abdelkhalik, Aimar Negrete

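AI summary: This paper proposes a new optimal control method based on the variational methods of analytical mechanics. By treating the control actuator as part of the dynamic system and adding control-energy terms directly to the action functional, it avoids the costates used in conventional optimal control. The control equations and the equations of motion are derived by minimizing an action that includes the control energy, which reduces problem complexity and offers a new theoretical framework for optimal control. A case study demonstrates the method.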

Abstract

Modern optimal control theory involves adjoining the already known equations of motion of a dynamic system to the objective function using dynamic costates; this is done in order to constrain the optimal control solutions to satisfy the equations of motion. The use of costates increases the number of variables and hence increases the complexity of the problem. On the other hand, variational methods of analytical mechanics find the equations of motion by minimizing an action functional of the dynamic system, treating control forces as external inputs to the system. In this paper a new disruptive approach for computing the optimal control is presented. This approach adopts the variational methods of analytical mechanics to derive equations for the control, in addition to the equations of motion. This is achieved by recognizing the control actuator as part of the dynamic system. In addition to the kinetic energy and potential energy, the action functional in this new approach includes additional energy terms that represent the control energy of the system. Two different methods are presented to write the modified action functional. The proposed approach is a significant departure from modern optimal control theory, and it eliminates the need for costates when solving for the control. A case study is presented to demonstrate the new approach.

2605.23810 2026-05-25 math.AP

Asymptotic behavior of solutions for the nonlinear Hartree equation involving the fractional Laplacian

Natalino Borgia, Silvia Cingolani, Minbo Yang, Shunneng Zhao

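AI summary: This paper studies the asymptotic behavior of solutions of the nonlinear Hartree equation involving the fractional Laplacian, focusing on blow-up of positive solutions under a perturbation of the critical nonlinearity. Using the method of moving planes and integral estimates, the authors obtain uniform bounds on solutions away from and near the boundary, and prove that, as the perturbation parameter tends to zero, solutions blow up at exactly one point in the domain; they also determine the location of the blow-up point, the blow-up profile, and the blow-up rate. The results extend to the fractional Brezis-Nirenberg problem with critical Hartree-type nonlinearity.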

Abstract

In this paper, we investigate the nonlocal problem \begin{equation*}\left\lbrace \begin{aligned} &A_{s} u=\left(|x|^{-(n-2s)}\ast u^{2_{s}^{\sharp}-1-\varepsilon}\right)u^{2_{s}^{\sharp}-2-\varepsilon} && \text{in } \Omega,\\ &u>0 && \text{in } \Omega,\\ &u=0 && \text{on } \mathbb{R}^n\setminus\Omega, \end{aligned} \right.\end{equation*} where $\Omega$ is a smooth bounded domain in $\mathbb{R}^n$, $0<s<1$, $n\in(2s,\min\{6s,n+2s\})$, $\varepsilon>0$ is small, $2_{s}^{\sharp}-1=(n+2s)/(n-2s)$, and $A_{s}$ stands for the fractional Laplace operator $(-\Delta)^{s}$ in $\Omega$ with zero exterior Dirichlet boundary condition. The above problem is reduced to the subcritical fractional system $$ A_{s}u=u^{2_{s}^{\sharp}-2-\varepsilon}v,\quad A_{s}v=u^{2_{s}^{\sharp}-1-\varepsilon},\quad u,v>0 \ \text{in } \Omega \quad \text{and} \quad u=(-\Delta)^{s}v=0 \ \text{on } \mathbb{R}^n\setminus\Omega.$$ For a general domain $\Omega$ or domains with convexity, we first prove a uniform $L^1$ bound away from the boundary and a uniform $L^{\infty}$ bound near the boundary for positive solutions to the general fractional Hartree-type PDEs by applying the moving planes method and integral estimates for the convolution term. Among these results, we study the asymptotic behavior of solutions as $\varepsilon\rightarrow0$. These solutions are shown to blow up at exactly one point $x_0$, and the location of this point is characterized. In addition, the shape and exact rates of blow-up are studied. Finally, we also establish the corresponding main results for solutions of the fractional Brezis-Nirenberg problem involving critical Hartree-type nonlinearity.

2605.23806 2026-05-25 math.GR math.LO

The geometrisation problem for topological groups

Christian Rosendal

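AI summary: This paper proposes a framework that equips a topological group with intrinsic geometric structure based on its topological and algebraic structure. Geometrisation is divided into local and large-scale parts, formalised through the local Lipschitz and quasimetric categories respectively, both definable from the group's canonical left uniform and left coarse structures. For Polish groups, the paper gives an intrinsic characterisation of metrisability of the left coarse structure, and introduces minimal and maximal metrics, which determine the local Lipschitz structure and the quasimetric structure respectively; when both exist they combine into a single canonical Lipschitz structure. The framework is applied to examples such as homeomorphism groups, non-Archimedean Polish groups, and automorphism groups of Fraïssé limits.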

Comments Survey article to appear in the KIAS Springer Series in Mathematics. The article is based on lectures given at the Korea Institute for Advanced Studies in June 2023

Abstract

This paper presents a framework for assigning intrinsic geometric structures to topological groups using only the data provided by their topological and algebraic structure. The geometrisation splits into small-scale and large-scale components, formalised respectively through local Lipschitz and quasimetric categories that, in turn, are definable from the canonical left uniform and left coarse structures of the group. For Polish groups, the paper characterises metrisability of the left coarse structure in terms of local boundedness, countable coverings by bounded sets, and the existence of compatible coarsely proper left-invariant metrics. It then introduces minimal metrics, which determine local Lipschitz structure, and maximal metrics, which determine quasimetric structure, and provides intrinsic characterisations of both. When both structures exist, they combine into a single canonical Lipschitz structure. Our framework is subsequently applied to specific examples such as homeomorphism groups, non-Archimedean Polish groups and automorphism groups of Fraïssé limits.

2605.23803 2026-05-25 cond-mat.stat-mech cond-mat.mes-hall math-ph math.MP physics.chem-ph

Chirality-sensitive mobility and dissipation of Brownian motion on a helical landscape

Debankur Bhattacharyya, Abraham Nitzan

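AI summary: This paper studies the Brownian motion and linear response of a particle with inertia in a two-dimensional helical potential imprinted on a cylindrical surface. Using a harmonic approximation of the potential and a Langevin description, it shows that the motion remains separable under isotropic damping, whereas anisotropic damping couples the motion and destroys separability. The model is formulated as a linear Ornstein-Uhlenbeck process in phase space; after projecting out the zero mode, the stationary dynamics in the stable subspace yield the dynamical mobility tensor and time-correlation functions in the time and frequency domains, revealing how the helical geometry affects cross-response and produces an asymmetry in the energy dissipation rate.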

Abstract

We study the Brownian dynamics and linear response of a particle with inertia moving in a 2-dimensional helical landscape imprinted on a cylindrical surface. In the harmonic well approximation, the deterministic motion separates into free propagation along the screw direction and harmonic motion in the transverse screw-normal direction. We show that for isotropic damping this simplification survives in the Langevin description, whereas anisotropic damping along the axial and angular directions couples the stochastic dynamics and destroys separability. The resulting anisotropic model is formulated as a linear Ornstein-Uhlenbeck process in phase space with a zero mode associated with diffusion along the screw coordinate, so that in an infinite system the full phase-space dynamics does not relax to a stationary distribution. To treat transport in this setting, we construct the stationary dynamics in the stable subspace obtained after projecting out the zero mode. This leads to a linear response theory for this system and yields closed analytical expressions for stationary time-correlation functions and the dynamical mobility tensor in both the time and frequency domains. The off-diagonal elements of the mobility tensor describe cross-response between axial forcing and angular motion, and between applied torque and axial transport. Consistent with time-reversal symmetry, these cross mobilities are equal and provide a direct dynamical signature of the helical geometry. In addition, simultaneous driving in both the axial and angular directions reveals an asymmetry in the energy dissipation rate due to the helical landscape.

2605.23798 2026-05-25 math.GR

Algorithms for experimenting with Zariski dense matrix groups over number fields

A. S. Detinko, D. L. Flannery, A. Hulpke

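AI summary: This paper develops algorithms for Zariski dense matrix groups over number fields, including an algorithm that computes congruence quotients of such groups modulo maximal ideals of a finitely generated subring. The method provides a computational counterpart of the strong approximation theorem for finitely generated Zariski dense subgroups. The algorithms are implemented in GAP and illustrated by a series of experiments in degree 2, with particular attention to hyperbolic groups.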
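
To make the idea of a congruence quotient concrete, here is a small sketch in a much simpler setting than the paper's: integer $2 \times 2$ matrices reduced modulo a prime $p$, rather than a finitely generated subring of a number field reduced modulo a maximal ideal. It is not the authors' algorithm and does not use GAP; the helper names `mat_mul` and `congruence_image_order` are hypothetical. The sketch enumerates the image of a finitely generated subgroup of $SL(2,\mathbb{Z})$ in $SL(2,\mathbb{F}_p)$ by closing the generators under multiplication.

```python
def mat_mul(a, b, p):
    """Multiply two 2x2 matrices (tuples of tuples) modulo p."""
    (a11, a12), (a21, a22) = a
    (b11, b12), (b21, b22) = b
    return (
        ((a11 * b11 + a12 * b21) % p, (a11 * b12 + a12 * b22) % p),
        ((a21 * b11 + a22 * b21) % p, (a21 * b12 + a22 * b22) % p),
    )


def congruence_image_order(generators, p):
    """Order of the group generated by the given integer matrices modulo p.

    In a finite group every inverse is a positive power, so closing the
    identity under right multiplication by the generators is enough.
    """
    gens = [tuple(tuple(x % p for x in row) for row in g) for g in generators]
    identity = ((1, 0), (0, 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = mat_mul(g, s, p)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return len(seen)


if __name__ == "__main__":
    # Two unipotent generators of a free subgroup of SL(2, Z).
    a = ((1, 2), (0, 1))
    b = ((1, 0), (2, 1))
    for p in (3, 5, 7):
        # |SL(2, F_p)| = p * (p^2 - 1); the image is the whole group for odd p.
        print(p, congruence_image_order([a, b], p), p * (p * p - 1))
```

For odd $p$ the image is all of $SL(2,\mathbb{F}_p)$, because 2 is invertible modulo $p$ and powers of the two generators give the elementary matrices. This is the kind of surjectivity that strong approximation predicts for all but finitely many primes.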

2605.23784 2026-05-25 math-ph math.MP

Reconstruction methods for inverse scattering problems with phaseless data

John C. Schotland, Shenwen Yu

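AI summary: This paper studies the inverse scattering problem for the Schrödinger equation with phaseless data and proposes reconstruction methods based on the inverse Born series. For three types of phaseless data, it extends the inverse Born series framework and proposes reconstruction strategies based on Fourier analysis and polarization that recover the Fourier coefficients of the scattering potential. Numerical experiments illustrate the proposed methods.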

2605.23773 2026-05-25 math.CO cs.DM

A Balancing Theorem for Spanning Trees of Rectangular Grid Graphs

Jiechen Zhang

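AI summary: This paper studies how the number of spanning trees of a rectangular grid graph varies when the number of vertices is fixed, and proves that the count increases as the side lengths become more balanced. In particular, among all rectangular grid graphs with $n^2$ vertices, the $n \times n$ square grid has the most spanning trees. The proof starts from the Laplacian product formula and uses a hyperbolic change of coordinates to isolate a discretely concave term and a positive decreasing remainder for comparison.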
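
The balancing statement can be checked numerically on small cases. The sketch below does not follow the paper's proof or its Laplacian product formula; instead it counts spanning trees with Kirchhoff's matrix-tree theorem (the determinant of the graph Laplacian with one row and column removed), using exact rational arithmetic. The function names are assumptions made for illustration.

```python
from fractions import Fraction


def grid_laplacian(m, n):
    """Laplacian matrix of the m x n rectangular grid graph."""
    size = m * n
    lap = [[0] * size for _ in range(size)]
    for i in range(m):
        for j in range(n):
            for di, dj in ((0, 1), (1, 0)):  # right and down neighbors
                a, b = i + di, j + dj
                if a < m and b < n:
                    u, v = i * n + j, a * n + b
                    lap[u][u] += 1
                    lap[v][v] += 1
                    lap[u][v] -= 1
                    lap[v][u] -= 1
    return lap


def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    size = len(a)
    det = Fraction(1)
    for c in range(size):
        pivot = next((r for r in range(c, size) if a[r][c] != 0), None)
        if pivot is None:
            return 0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for k in range(c, size):
                a[r][k] -= factor * a[c][k]
    return int(det)


def spanning_trees(m, n):
    """Number of spanning trees of the m x n grid (matrix-tree theorem)."""
    lap = grid_laplacian(m, n)
    reduced = [row[1:] for row in lap[1:]]  # delete first row and column
    return determinant(reduced)


if __name__ == "__main__":
    # All rectangular grids with 16 vertices, from least to most balanced.
    for m, n in ((1, 16), (2, 8), (4, 4)):
        print(f"{m} x {n}: {spanning_trees(m, n)}")
```

For 16 vertices the counts increase from the path ($1 \times 16$) through $2 \times 8$ to the square $4 \times 4$, in line with the theorem described in the summary.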

Comments 10 pages

2605.23761 2026-05-25 math.OC

Quasi-Newton and Krylov Methods for the Solution of Nonconvex Trust-Region Subproblems

Johann Bourhis, Oihan Cordelier, Jean-Pierre Dussault, Oussama Mouhtal, Dominique Orban

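AI summary: This paper studies full- and limited-memory methods for solving symmetric positive-definite linear systems. It makes three contributions. First, it derives new relationships between the conjugate-gradient method and quasi-Newton methods of the Broyden class, clarifying when they generate the same iterates and enjoy quadratic termination, and extends this perspective to limited-memory BFGS. Second, it examines the similarity between LBFGS and DIOM, a limited-memory variant of the full orthogonalization Krylov method. Third, it generalizes LBFGS and DIOM to trust-region subproblems in nonconvex unconstrained optimization. Numerical experiments indicate that the limited-memory methods are more robust than conjugate gradients and often need fewer Hessian-vector products, particularly when high accuracy is required or Hessian operations are expensive.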

DOI: 10.13140/RG.2.2.23735.69281

Abstract

We study the solution of symmetric positive-definite linear systems by way of families of full- and limited-memory methods. Our contributions are threefold. We first derive new relationships between the conjugate-gradient method (CG) and quasi-Newton methods of the Broyden class that refine existing results, and clarify when those methods generate the same iterates and enjoy quadratic termination. We extend this perspective to the limited-memory BFGS (LBFGS) method. Next, we examine how DIOM, a limited-memory variant of the full orthogonalization Krylov method (FOM), is akin to LBFGS in that it provides a memory lever that is critical in practical performance. Finally, we generalize LBFGS and DIOM to the computation of trust-region steps for unconstrained, potentially nonconvex, optimization. We report numerical experience on positive-definite linear systems and unconstrained optimization problems. The results show that memory is a key algorithmic lever: LBFGS and DIOM are consistently more robust than CG and often achieve comparable accuracy with fewer Hessian-vector products. They emerge as viable alternatives to CG when high accuracy is desirable or when operations with the Hessian are at a premium. The limited-memory SR1 (LSR1) method can be competitive in full-memory form, but its limited-memory variant suffers from discarded curvature information.
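
As background for the baseline that the abstract compares against, here is a textbook conjugate-gradient sketch for a symmetric positive-definite system $Ax = b$. It is not the paper's LBFGS or DIOM variant and makes no claim about the authors' implementation; the function names and the dense list-of-lists matrix format are assumptions for illustration. In exact arithmetic CG terminates in at most $n$ iterations on an $n \times n$ system, which is the quadratic-termination property mentioned in the abstract.

```python
def dot(u, v):
    """Euclidean inner product of two vectors given as lists."""
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, v):
    """Dense matrix-vector product."""
    return [dot(row, v) for row in a]


def conjugate_gradient(a, b, tol=1e-12, max_iter=None):
    """Solve a x = b for symmetric positive-definite a, starting from x = 0.

    Returns the approximate solution and the number of iterations performed.
    """
    n = len(b)
    max_iter = n if max_iter is None else max_iter
    x = [0.0] * n
    r = list(b)          # residual b - a x for the zero initial guess
    p = list(r)          # first search direction
    rs = dot(r, r)
    iterations = 0
    for _ in range(max_iter):
        if rs ** 0.5 <= tol:
            break
        ap = matvec(a, p)
        alpha = rs / dot(p, ap)                      # exact line search
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        # New direction is conjugate to the previous ones with respect to a.
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
        iterations += 1
    return x, iterations


if __name__ == "__main__":
    a = [[4.0, 1.0], [1.0, 3.0]]
    b = [1.0, 2.0]
    x, its = conjugate_gradient(a, b)
    print(x, its)  # exact solution is (1/11, 7/11)
```

Each iteration costs one matrix-vector product, which is why the abstract counts Hessian-vector products when comparing CG with the limited-memory alternatives.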
