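arXivDaily: a daily academic digest synced with the full arXiv feed, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and related fields.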

2606.20557 2026-06-19 cs.LG math.ST stat.ML stat.TH New submission

Optimal Deterministic Multicalibration and Omniprediction

Georgy Noarov, Aaron Roth

Affiliations: University of Pennsylvania

AI summary: This paper presents a deterministic algorithm that attains the minimax-optimal sample complexity for multicalibration and extends it to outcome indistinguishability, settling the question of whether randomization is necessary for optimal sample complexity.

Abstract

A model is multicalibrated on a collection of group weights $G$ if it is calibrated -- i.e. unbiased even conditional on its prediction -- not just overall, but also after reweighting contexts by each $g \in G$. It is a useful property for many downstream applications and is a basic desideratum of trustworthy machine learning. Before this work, all predictors known to attain the minimax-optimal $\widetilde O(\varepsilon^{-3})$ sample complexity rate for $\varepsilon$-multicalibration were randomized, while deterministic predictors were known only with substantially worse sample complexity. Whether randomization is necessary for optimal sample complexity in multicalibration was explicitly asked by [CLNR26] and implicitly in several prior works. We resolve this open problem by giving a minimax-optimal multicalibration algorithm that outputs a deterministic predictor. We then generalize the algorithm to produce optimal deterministic predictors that satisfy outcome indistinguishability (OI) with respect to finite or finitely covered collections of tests. As an application, this also gives deterministic omnipredictors and panpredictors with optimal sample complexity, resolving open problems posed by [OKK25] and [BHHLZ25].
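As a rough illustration of the definition above, the following sketch estimates an empirical multicalibration error for a predictor that outputs finitely many values. The function name, the particular error measure (the largest group-weighted bias over all groups and prediction levels), and the toy data are assumptions made for illustration; the paper's exact error notion may differ.

```python
def multicalibration_error(preds, outcomes, group_weights):
    """Largest absolute group-weighted bias over all groups and prediction levels."""
    n = len(preds)
    worst = 0.0
    for g in group_weights:              # g[i] in [0, 1] is the weight of sample i
        for v in set(preds):             # each distinct prediction value is one level
            bias = sum(g[i] * (outcomes[i] - v)
                       for i in range(n) if preds[i] == v) / n
            worst = max(worst, abs(bias))
    return worst


# Toy data (hypothetical): a constant predictor that is unbiased overall.
preds = [0.5, 0.5, 0.5, 0.5]
outcomes = [1, 0, 1, 0]
groups = [[1, 1, 1, 1],   # the whole population
          [1, 0, 1, 0]]   # a subgroup on which every outcome is 1
print(multicalibration_error(preds, outcomes, groups))
```

On this toy data the predictor is calibrated overall but biased on the second group, which is exactly the failure mode that multicalibration rules out.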


2606.20547 2026-06-19 cs.LG cs.CV cs.GR cs.RO math.DG New submission

The Token Is a Group Element: On Lie-Algebra Attention over Matrix Lie Groups

Przemyslaw Musialski

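Affiliations: New Jersey Institute of Technology

AI summary: Proposes Lie-algebra attention, in which tokens are elements of a matrix Lie group and the attention score is the Lie-algebra norm of the relative pose. It needs no learned kernel or representation-theoretic machinery and applies to non-compact, non-abelian groups such as the affine full-frame groups.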

Comments preprint, 19 pages, 3 figures

Abstract

We place the attention token on the group: a token is an element $g_i$ of a matrix Lie group $G$ -- a bare transformation, with no feature payload and no external action $\rho(g)$ carrying it. To our knowledge this is the first attention construction whose tokens are bare matrix Lie group elements: their score is the closed-form algebra norm of the relative pose rather than a learned kernel, and it reaches the affine full-frame groups that every irrep- or surjective-exp-based method must exclude. We call it Lie-Algebra Attention. Once tokens are group elements, the rest follows with none of the usual representation-theoretic machinery. The relative geometry of a pair is canonical, $g_i^{-1} g_j$, so the pairwise invariant $w_{ij} = \log(g_i^{-1} g_j)$ is intrinsic rather than designed; equivariance under the diagonal $G$-action is tautological, and the cocycle condition holds automatically. The attention score is the negative squared algebra norm, $s_{ij} = -\|\log(g_i^{-1} g_j)\|_\lambda^2/\tau$: the canonical proximity kernel under a block-weighted Frobenius inner product, with no irreducible representations, spherical harmonics, Clebsch-Gordan products, or learned kernel. The construction applies to any matrix Lie group on a chosen logarithm chart containing the relative poses, including the non-compact non-abelian affine groups with scale and shear that no vector-token attention method reaches: neither the irrep tradition nor surjective-exp methods. Three sequence-completion experiments, on SE(2), SO(3), and Aff(2), bear this out: the closed-form score matches a learned MLP kernel on the same invariant and outperforms it on SE(2), using 50 to 80x fewer score parameters, while a vector-token baseline breaks invariance by five to twelve orders of magnitude.
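To make the score $s_{ij}$ concrete, here is a minimal sketch for SE(2), where the matrix logarithm has a closed form. The pose encoding `(theta, x, y)`, the function names, and the two block weights `lam_rot` and `lam_trans` are assumptions chosen for illustration, not the paper's implementation; the relative rotation is wrapped to $(-\pi, \pi]$ so that it stays inside one logarithm chart.

```python
import math


def se2_compose(h, g):
    """Group product h * g for SE(2) poses encoded as (theta, x, y)."""
    th, xh, yh = h
    tg, xg, yg = g
    c, s = math.cos(th), math.sin(th)
    return (th + tg, xh + c * xg - s * yg, yh + s * xg + c * yg)


def se2_relative(gi, gj):
    """Relative pose gi^{-1} * gj, with the angle wrapped to (-pi, pi]."""
    ti, xi, yi = gi
    tj, xj, yj = gj
    dx, dy = xj - xi, yj - yi
    c, s = math.cos(ti), math.sin(ti)
    # Rotate the translation difference into the frame of gi.
    rx, ry = c * dx + s * dy, -s * dx + c * dy
    dtheta = math.atan2(math.sin(tj - ti), math.cos(tj - ti))
    return dtheta, rx, ry


def se2_log(theta, x, y):
    """Closed-form logarithm: returns (theta, ux, uy) with V(theta) @ u = (x, y)."""
    if abs(theta) < 1e-9:
        return theta, x, y               # V is the identity to first order
    a = math.sin(theta) / theta
    b = (1.0 - math.cos(theta)) / theta
    det = a * a + b * b
    return theta, (a * x + b * y) / det, (-b * x + a * y) / det


def score(gi, gj, lam_rot=1.0, lam_trans=1.0, tau=1.0):
    """Negative squared block-weighted Frobenius norm of log(gi^{-1} gj), over tau."""
    w, ux, uy = se2_log(*se2_relative(gi, gj))
    # The rotation block [[0, -w], [w, 0]] contributes 2 * w**2 to the Frobenius norm.
    norm_sq = lam_rot * 2.0 * w * w + lam_trans * (ux * ux + uy * uy)
    return -norm_sq / tau
```

Because the score depends only on $g_i^{-1} g_j$, multiplying both poses on the left by the same group element leaves it unchanged, which is the invariance the abstract describes as tautological.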


2606.20443 2026-06-19 eess.SY cs.LG cs.SY math.AT New submission

Topological Data Analysis for High-Dimensional Dynamic Process Monitoring

Angan Mukherjee, Tyler A. Soderstrom, Michael J. Kurtz, Victor M. Zavala

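AI summary: Proposes a method that combines topological data analysis with machine learning: multivariate time series are represented as manifolds, their structure is summarized with topological descriptors, and a neural ordinary differential equation learns how the topological structure evolves, enabling efficient event detection.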

2606.20442 2026-06-19 cs.LG cs.NA cs.NE math.NA New submission

Evolutionary Two-Stage Hyperparameter Optimization Strategies for Physics-Informed Neural Networks

Fedor Buzaev, Dmitry Efremenko, Egor Bugaev, Andrei Ermakov, Denis Derkach, Daria Pugacheva, Fedor Ratnikov

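Affiliations: HSE University; AXXX

AI summary: To address the unstable training and hyperparameter sensitivity of physics-informed neural networks, proposes a two-stage optimization strategy based on evolutionary algorithms: low-fidelity screening followed by full training. It lowers the error significantly on three PDE problems.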

Comments Equal advising: Daria Pugacheva and Fedor Ratnikov. Accepted to the ICLR 2026 Workshop on AI and PDEs

Abstract

Physics-Informed Neural Networks (PINNs) solve Partial Differential Equations (PDEs) by embedding physical laws into neural network training. However, their performance suffers from unstable convergence, training plateaus, and strong sensitivity to architectural and optimization hyperparameters due to the highly non-convex and multi-term structure of the physics-informed loss. In this setting, the outer-loop hyperparameter search is a noisy and black-box optimization problem over heterogeneous parameters, where classical local or gradient-based strategies are easily trapped in suboptimal regions. Evolutionary algorithms, with their population-based exploration and ability to handle mixed, non-differentiable search spaces, provide a more robust mechanism for discovering promising configurations. We propose and investigate a two-stage approach based on evolutionary algorithms that combines exploration and exploitation parts of PINNs training to improve solution accuracy and robustness under fixed computational budgets. In the first stage, we perform low-fidelity training runs with truncated epochs to rapidly screen candidate configurations, treating hyperparameter selection as a black-box outer-loop problem. In the second stage, only the most promising candidates are fully trained with standard gradient-based optimizers to refine the solution. Evaluated on three popular problems, namely Advection, Klein-Gordon and Helmholtz equations, our method consistently outperforms standard training and achieves significantly lower mean error within constrained computational resources.
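The two-stage idea can be sketched as follows. Everything here is a toy stand-in: `train` is a hypothetical surrogate for PINN training (a synthetic error surface, not a real solver), and the search space, mutation rule, and budgets are assumptions chosen only to show the control flow of screening with truncated runs and then fully training the finalists.

```python
import math
import random


def train(config, epochs):
    """Hypothetical stand-in for PINN training; returns an error (lower is better)."""
    lr, width = config
    # Toy error surface with its minimum near lr = 1e-3 and width = 64.
    base = (math.log10(lr) + 3.0) ** 2 + ((width - 64) / 64.0) ** 2
    return base + 1.0 / epochs           # longer training lowers the error


def mutate(config, rng):
    """Perturb the learning rate on a log scale and the width in steps of 16."""
    lr, width = config
    return (lr * 10 ** rng.uniform(-0.5, 0.5),
            max(8, width + rng.choice([-16, 0, 16])))


def two_stage_search(rng, pop_size=12, generations=5,
                     short_epochs=10, full_epochs=1000, top_k=3):
    population = [(10 ** rng.uniform(-5, -1), rng.choice([16, 32, 64, 128]))
                  for _ in range(pop_size)]
    # Stage 1: evolutionary search scored by cheap, truncated training runs.
    for _ in range(generations):
        ranked = sorted(population, key=lambda c: train(c, short_epochs))
        parents = ranked[: pop_size // 2]
        children = [mutate(rng.choice(parents), rng)
                    for _ in range(pop_size - len(parents))]
        population = parents + children
    # Stage 2: only the most promising candidates get the full training budget.
    finalists = sorted(population, key=lambda c: train(c, short_epochs))[:top_k]
    return min(((train(c, full_epochs), c) for c in finalists), key=lambda t: t[0])


best_error, best_config = two_stage_search(random.Random(0))
```

The point of the split is budgetary: most candidates only ever cost `short_epochs`, and the expensive `full_epochs` runs are spent on `top_k` configurations.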


2606.20413 2026-06-19 eess.SP cs.IT math.IT New submission

Hybrid TRP-UE Sensing for Enhanced Target Localization

Necati Kagan Erkek, Marco Di Renzo, Arman Shojaeifard, Yasser Mestrah, Remun Koirala, Mohammad Heggo, Kunjan Shah

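AI summary: Proposes a hybrid TRP-UE sensing mechanism that uses UE-assisted sensing to improve network sensing performance, significantly improving target localization accuracy in complex propagation environments such as indoor factories.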

Comments 6 pages

2606.20394 2026-06-19 cs.RO math.OC New submission

Agentic AutoResearch for Space Autonomy: An Auditable, LLM-Driven Research Agent for Aerospace Control Problems

Amit Jain, Richard Linares

Affiliations: Department of Aeronautics and Astronautics

AI summary: Proposes the AutoResearch framework, in which a large language model acts as an offline research agent that iteratively develops spacecraft control policies, with a built-in credibility layer that audits each result against measured seed noise. It is validated on rendezvous and docking problems.

Abstract

Spacecraft guidance, navigation, and control functions are increasingly realized as learned policies distilled from expert solvers. Developing such a policy is itself a research process: an investigator selects an architecture and hyperparameters, runs experiments, and must determine whether an apparent improvement is genuine or merely seed noise. This paper presents AutoResearch, a framework in which a large language model autonomously drives that loop for aerospace control problems, coupled with a credibility layer, built into the loop, that certifies each reported result against the problem's own measured seed noise. The language model serves only as the offline research agent that develops the control policy; the trained policy it produces is then deployed onboard the spacecraft, while the model itself never operates the vehicle. At each iteration the agent reads a plain-language problem description and the run history, proposes a single edit to the training script, executes it, and logs the outcome. No reported result is credited until it passes the same three checks: measured per-problem seed noise, reseeded verification of the best configuration, and leave-one-out pruning of the agent's edits. The same loop is applied, unchanged, to two aerospace control problems: a Clohessy-Wiltshire relative rendezvous and a safety-constrained collision-avoidance docking past a keep-out zone, each calibrated against a known optimal control benchmark. In both, the audited policy clears the measured seed noise by many standard deviations; an undirected search over the same parameters does not. On the docking problem the gap becomes categorical: undirected search yields no feasible policy, while the learned policy stays outside the keep-out zone on every seed.


2606.20325 2026-06-19 cs.LG cs.SC math.DS New submission

Recurrent neural networks approximate continuous functions

Valentin Abadie, Clemens Hutter, Helmut Bölcskei

AI summary: Proves that for any continuous function on [-1,1] there exists a ReLU recurrent neural network with fixed weights and fixed hidden dimension whose time evolution uniformly approximates the function, and gives convergence rates and minimax lower bounds.

Abstract

Classical approximation theorems ask for a new neural network whenever the target accuracy is improved. This paper studies the opposite possibility: can the network be chosen once and for all, and can accuracy be bought only by letting it run longer? We prove that this is possible for every continuous function on [-1,1]. More precisely, each such function is uniformly approximated by the time evolution of a single ReLU recurrent neural network with fixed weights and fixed hidden dimension. The mechanism behind the construction is a new intermediate model, the Turing machine with neural units (TMNU). This model retains the algorithmic freedom needed to implement polynomial approximation schemes, while remaining rigid enough to be simulated by RNNs with explicit bounds on hidden dimension and weight magnitude. The resulting convergence rates reflect the underlying polynomial approximation rates. We complement the construction with minimax lower bounds showing that runtime is not merely a proof artifact, but an unavoidable resource in this fixed-network approximation paradigm.


2606.20195 2026-06-19 cs.PF cs.NA math.NA New submission

Randomized Sketching is Robust to Low-Precision Rounding on GPUs

Aryaman Jeendgar, Clément Flint, Hartwig Anzt

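AI summary: Studies the performance and accuracy of randomized sketching in low precision on GPUs, focusing on SparseStack, a generalization of CountSketch. It finds that the FP16 rounding rule has little effect on embedding quality and that the sketch distribution matters more than quantization.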

Comments 14 pages, 3 figures

Abstract

Randomized sketching is a core primitive in randomized numerical linear algebra. On modern hardware architectures, in particular on GPUs, the performance of sparse sketches is limited by memory traffic and atomic accumulation rather than floating-point throughput. This makes sketching a natural target for mixed precision, provided that low-precision accumulation does not degrade the embedding quality. We study mixed-precision GPU implementations of sparse oblivious subspace embeddings, focusing on a SparseStack generalization of the GPU CountSketch kernel of Higgins et al. SparseStack improves embedding quality relative to CountSketch on coherent inputs, but its additional nonzeros per column increase atomic-update contention and reduce throughput. We therefore implement FP16 SparseStack variants using deterministic round-to-nearest, exact stochastic rounding, and dithered rounding, and compare them with FP32 SparseStack, CountSketch, mixed-precision CountSketch, and FlashSketch. Our main empirical finding is that, for the tested regimes, SparseStack embedding quality is insensitive to the FP16 rounding rule. Deterministic, stochastic, and dithered rounding FP16 SparseStack produce nearly identical subspace distortion and sketch-and-solve least-squares accuracy across incoherent, coherent, and adversarial test problems. The dominant accuracy factor is the sketch distribution rather than the quantization rule: SparseStack variants substantially improve distortion on coherent inputs, while all methods behave similarly on incoherent inputs. Since deterministic rounding has the lowest overhead, it provides the best performance--accuracy tradeoff among the FP16 SparseStack variants.
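For readers unfamiliar with the primitive, the sketch below shows a plain CPU CountSketch with optional FP16 accumulation using deterministic round-to-nearest. It is not the SparseStack GPU kernel studied in the paper: the function names are hypothetical, Python doubles stand in for FP32, each input row gets a single nonzero, and stochastic and dithered rounding are not shown.

```python
import random
import struct


def to_fp16(x):
    """Round a float to the nearest IEEE half-precision value (round-to-nearest)."""
    # struct's "e" format is binary16; it raises OverflowError beyond the FP16 range.
    return struct.unpack("e", struct.pack("e", x))[0]


def countsketch(rows, sketch_dim, rng, fp16=False):
    """Add each input row, with a random sign, into one random output row."""
    n_cols = len(rows[0])
    out = [[0.0] * n_cols for _ in range(sketch_dim)]
    for row in rows:
        bucket = rng.randrange(sketch_dim)    # hash of the row index
        sign = rng.choice((-1.0, 1.0))        # random sign of the row
        for j, value in enumerate(row):
            acc = out[bucket][j] + sign * value
            # With fp16=True, every accumulation step is rounded to half precision.
            out[bucket][j] = to_fp16(acc) if fp16 else acc
    return out


rows = [[0.1 * i, 0.01 * i * i] for i in range(1, 200)]
full = countsketch(rows, 16, random.Random(1))
half = countsketch(rows, 16, random.Random(1), fp16=True)
```

Using the same seed for both calls keeps the hashing and signs identical, so any difference between `full` and `half` comes from rounding alone, which mirrors the paper's separation of the sketch distribution from the quantization rule.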


2606.20162 2026-06-19 cs.AI cs.IT cs.NI math.IT New submission

Implicit Semantic-Aware Communication Based on Hypergraph Reasoning

Yiwei Liao, Shurui Tu, Yong Xiao, Yingyu Li, Guangming Shi

Affiliations: China Electric Power Research Institute Co., Ltd; National Key Laboratory for Power Grid Environmental Protection; School of Electronic Information and Communications, Huazhong University of Science and Technology; Peng Cheng Laboratory; Pazhou Laboratory (Huangpu); School of Mechanical Engineering and Electronic Information, China University of Geosciences

AI summary: Proposes HISR, a hypergraph-based implicit semantic reasoning framework that models higher-order multi-entity relations with hypergraphs, improving the robustness of semantic inference over noisy channels and raising accuracy by up to 36.6%.

Comments This work is accepted at IEEE Transactions on Communications

Abstract

Semantic-aware communication has emerged as a transformative paradigm for next-generation communication systems, shifting the fundamental goal from transmitting bit-level symbols to reliably recovering and understanding the semantic meaning of information. Previous studies have demonstrated that representing the semantic content of source messages as graph-based structures can significantly improve communication efficiency and the accuracy of semantic inference at the receiver. However, existing solutions typically employ graphs that capture only pairwise relationships, thereby neglecting higher-order implicit correlations commonly observed in real-world scenarios, such as group interactions, multi-entity associations, and complex relational contexts. This limitation reduces semantic expressiveness and makes semantic inference susceptible to ambiguity and performance degradation, particularly under noisy or corrupted channel conditions. To address these issues, this paper proposes a novel hypergraph-based implicit semantic reasoning framework, HISR, which leverages hypergraphs to represent complex multi-entity relationships among semantic knowledge entities. In HISR, entities and their associated higher-order relations are mapped into dedicated semantic subspaces tailored to distinct relational contexts. This design not only disentangles diverse semantic interactions to mitigate the over-smoothing effects commonly found in traditional graph embedding methods but also enables robust semantic inference even when partial information loss occurs during transmission. Numerical results show that the proposed HISR achieves up to a 36.6% improvement in implicit semantic interpretation accuracy over the state-of-the-art benchmarks.


2606.20022 2026-06-19 stat.ML cs.LG math.OC New submission

Stochastic Linear Contextual Bandits with Bounded Noise: A Set-Membership Approach

Haonan Xu, Yingying Li

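AI summary: For stochastic linear contextual bandits with bounded reward noise, proposes the SME-OFU algorithm, based on set-membership estimation and the optimism principle, which achieves an O(log T) regret bound, improving on the optimal bound under sub-Gaussian noise.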

Comments 23 pages, 1 figure

Abstract

This paper considers stochastic linear contextual bandits (SLCB) with bounded reward noise. Existing works typically assume sub-Gaussian reward noise and bounded expected rewards, under which the optimal regret bound scales as $\tilde{O}(\sqrt{T})$ in terms of horizon $T$. However, in many applications, realized/observed rewards are also naturally bounded, implying bounded reward noise. Bounded noise is more informative than the sub-Gaussian condition but has not been leveraged explicitly in the SLCB literature. In this paper, we propose a novel algorithm SME-OFU by utilizing an uncertainty quantification method called set-membership estimation (SME) and applying the principle of optimism in the face of uncertainty (OFU). Our algorithm enjoys an improved regret bound $O(\log T)$. Notice that this does not contradict the existing optimal bound $\tilde{O}(\sqrt{T})$ for sub-Gaussian noise because bounded noise is a stronger condition. Finally, simulations show empirical improvements of SME-OFU over a benchmark algorithm designed for sub-Gaussian noise when the reward noise is bounded.
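The reason bounded noise is more informative can be seen in a one-dimensional sketch of set-membership estimation. With reward $r = \theta x + w$ and $|w| \le W$, each observation confines $\theta$ to an interval, and intersecting these intervals can only shrink the uncertainty set. The scalar setting, the function name, and the prior range $[-10, 10]$ are assumptions for illustration; this is not the paper's SME-OFU algorithm, which works with vector parameters and adds optimistic action selection.

```python
import random


def sme_interval(observations, noise_bound, lo=-10.0, hi=10.0):
    """Intersect the constraints |r - theta * x| <= noise_bound over all (x, r)."""
    for x, r in observations:
        if x == 0:
            continue                          # carries no information about theta
        a = (r - noise_bound) / x
        b = (r + noise_bound) / x
        if a > b:                             # dividing by a negative x flips the bounds
            a, b = b, a
        lo, hi = max(lo, a), min(hi, b)
    return lo, hi


# Simulated data with a hypothetical true parameter and noise bound.
rng = random.Random(0)
theta_true, W = 0.7, 0.1
data = []
for _ in range(50):
    x = rng.uniform(0.5, 2.0) * rng.choice((-1.0, 1.0))
    data.append((x, theta_true * x + rng.uniform(-W, W)))
print(sme_interval(data, W))
```

The true parameter always lies in the returned interval, in contrast to confidence sets under sub-Gaussian noise, which hold only with high probability.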


2606.19909 2026-06-19 stat.CO math.PR stat.ME New submission

Establishing an $\Omega(\sqrt{d})$ complexity lower bound for PDMP samplers and how to break it: a sub-$\sqrt{d}$ algorithm for Gaussian-tailed targets

Augustin Chevallier

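AI summary: Proves that piecewise deterministic Markov process (PDMP) samplers have an $\Omega(\sqrt{d})$ complexity lower bound in the standard setting and, by relaxing the assumption of continuous-time invariance of the target density, proposes a new scheme with empirical complexity $O(d^\alpha)$, $\alpha\in[0.2,0.3]$, for Gaussian-tailed targets.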

2606.19878 2026-06-19 cs.LG math.OC stat.ML New submission

On the Oracle Complexity of Interpolation-Based Gradient Descent

Dongmin Lee, William Lu, Anuran Makur

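Affiliations: Purdue University

AI summary: Proposes piecewise polynomial interpolation-based gradient descent (PPI-GD), which queries the first-order oracle at equidistant points in the data domain and builds polynomial interpolants to approximate the full gradient. It analyzes the oracle complexity for strongly convex and non-convex losses and shows that PPI-GD outperforms several GD variants when the data dimension is bounded and the loss is sufficiently smooth.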

Comments 16 pages, 2 figures

DOI: 10.1109/TAC.2026.3682210

Abstract

Recent work on first-order optimizers for empirical risk minimization (ERM) has suggested that smoothness of ERM loss functions in the training data, rather than in the optimization parameters, can be leveraged to improve the oracle complexity of gradient descent (GD) methods. In this paper, we propose an inexact gradient method, piecewise polynomial interpolation-based gradient descent (PPI-GD), which approximates the full gradient in each iteration by querying the first-order oracle at equidistant points in the data domain to construct polynomial interpolants of the resulting gradient samples over appropriately sized patches of the data domain. We analyze the oracle complexity of PPI-GD for strongly convex and non-convex loss functions when the data space dimension is bounded by a polylogarithmic function of the number of training samples, and find it to outperform several GD variants in key regimes when the loss function is sufficiently smooth. Furthermore, our analysis extends several techniques from the error analysis of bicubic spline interpolants to the setting of $d$-variate tensor product polynomial interpolants which may be of independent interest in interpolation analysis.
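The core idea, replacing one oracle call per training sample with a few calls at equidistant points in the data domain, can be sketched in one dimension. The toy loss, the data domain $[0, 1]$, and the use of piecewise-linear interpolation are simplifying assumptions; the paper uses higher-degree polynomial interpolants on patches of a $d$-dimensional domain.

```python
import math


def grad_oracle(theta, x):
    """Gradient in theta of the toy loss 0.5 * (theta - sin(x)) ** 2."""
    return theta - math.sin(x)


def interpolated_full_gradient(theta, data, num_cells):
    """Approximate the mean gradient over data in [0, 1] from num_cells + 1 oracle calls."""
    h = 1.0 / num_cells
    knots = [grad_oracle(theta, k * h) for k in range(num_cells + 1)]
    total = 0.0
    for x in data:
        k = min(int(x / h), num_cells - 1)    # index of the cell containing x
        t = (x - k * h) / h                   # relative position inside that cell
        total += (1.0 - t) * knots[k] + t * knots[k + 1]
    return total / len(data)


data = [i / 999 for i in range(1000)]
exact = sum(grad_oracle(0.3, x) for x in data) / len(data)    # 1000 oracle calls
approx = interpolated_full_gradient(0.3, data, num_cells=10)  # 11 oracle calls
print(abs(approx - exact))
```

For this toy loss the gradient's second derivative in $x$ is bounded by 1, so the standard linear-interpolation bound $h^2/8$ limits the error to $1.25 \times 10^{-3}$ with ten cells, which is the kind of smoothness-in-the-data trade-off the abstract refers to.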


2606.19876 2026-06-19 cs.LG math.OC New submission

DeQL: A Decision Query Language for Prescriptive Analytics over Relational Data

Matteo Brucato, Fjodor Kholodkov, Soren Little, Jakob Mayer, Duc Nguyen

AI summary: DeQL extends SQL to support decision queries. Two constructs, CREATE CANDIDATES and DECIDE, define the option space, the constraints, and the objective, covering decisions such as subset selection, allocation, and scheduling, with support for optimization under uncertainty and model scoring.

Abstract

DeQL (Decision Query Language) extends SQL to express decision queries: given options drawn from relational data, constraints from policy, and a measurable objective, a DeQL query computes the best course of action. Two constructs carry the extension: CREATE CANDIDATES, which defines the space of options from relational sources, and DECIDE, which declares decision variables, named constraints, and an objective over them. The design follows SQL's principles: the user states what to optimize while the engine chooses how to solve it, every query consumes and produces relations, and the structure of a problem stays visible to the engine. This document specifies the language (its design principles, syntax, formal grammar, and execution model) with examples spanning subset selection, allocation, assignment, scheduling, and decisions at multiple levels of aggregation, and extensions for optimization under uncertainty, inline model scoring, and time- and quality-bounded solving. It is the first version of the specification; the language is under active development, and this version fixes the core constructs on which later revisions will build.


2606.19715 2026-06-19 eess.SP cs.IT math.IT New submission

Generalized Pinching-Antenna Systems: A Radio-Stripe-Based Realization

Yanqing Xu, Zhiguo Ding, Tsung-Hui Chang

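AI summary: Proposes a radio-stripe (RS)-based generalized pinching-antenna (RS-GPA) framework that uses active antenna processing units to provide location-flexible wireless access, and develops sparse activation and beamforming algorithms to reduce total power consumption.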

Comments 13 pages, 7 figures

Abstract

This paper investigates radio stripes (RSs) as a practical realization of generalized pinching antennas and proposes an RS-based generalized pinching-antenna (RS-GPA) framework. Unlike dielectric-waveguide-based passive pinching antennas that rely on passive coupling from a guided wave into free space, RSs employ active antenna processing units (APUs) deployed along a shared cable for local transmission, reception, and signal processing. This cable-like active architecture offers flexible installation and broad frequency applicability, while allowing selected APUs to act as discrete and controllable radiation or reception points for location-flexible wireless access. Based on the proposed RS-GPA framework, we establish the system and channel models by accounting for the distance-dependent APU-user channels. For downlink transmission, we formulate a circuit-power-aware sparse APU activation and beamforming problem and develop a reweighted group-sparse beamforming algorithm. To reveal the activation principle, we analyze the single-user downlink case and characterize when an additional APU should be activated by balancing transmit-power saving and circuit-power cost. Inspired by this insight, a geometry-guided low-complexity multiuser algorithm is proposed. For uplink transmission, we formulate a joint APU activation and user power control problem and develop a geometry-guided sparse activation design. Numerical results show that the proposed RS-GPA framework substantially reduces the total consumed power compared with benchmark schemes, while the geometry-guided algorithm achieves near-identical consumed-power performance to the group-sparse design with significantly lower runtime.


2606.19695 2026-06-19 eess.SY cs.GT cs.SY math.OC New submission

A Unified Framework for Joint Sensor Placement and Scheduling for Intrusion Detection

Jayanth Bhargav, Mahsa Ghasemi, Shreyas Sundaram

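AI summary: Proposes a unified framework that jointly optimizes sensor placement and orientation scheduling, designing a game-theoretic utility function and exploiting weak submodularity to achieve near-optimal detection performance.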

Comments 27 pages, 4 figures

Abstract

We consider an intrusion detection task in which a defender must jointly optimize sensor placement locations and orientations to minimize the probability of missed detection of an intruder traversing a protected environment. We decompose this problem into a meta problem, termed SensorPlacement, and an embedded subproblem, termed OrientationScheduling. The OrientationScheduling subproblem, for a fixed sensor placement, is modeled as a 2-player zero-sum game between the defender and the intruder, where the defender seeks an orientation strategy for the deployed sensors to minimize the probability of missed detection, while the intruder seeks a path selection strategy to maximize it. Since the defender's strategy space grows combinatorially with the number of sensors and orientations, solving the game via standard linear programming becomes prohibitive. To this end, we develop an iterative and efficient equilibrium-seeking algorithm that exploits the structure of the game's payoff function and establishes theoretical guarantees for convergence to the Nash equilibrium (NE) of the game. This NE value is then used as a utility measure in the SensorPlacement meta problem. We show that this game-value-based utility function is weakly submodular over the set of sensor placements and propose a greedy placement algorithm with near-optimality guarantees. To our knowledge, this is the first unified framework to integrate game-theoretic utility design with (weak) submodular optimization, enabling principled joint optimization of sensor placement and orientation scheduling. Through extensive simulations, we demonstrate that the proposed approach achieves near-optimal detection performance while significantly reducing computation time compared to baselines.
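The outer SensorPlacement loop is a standard greedy selection over a set function, which can be sketched as follows. The `utility` used here is a hypothetical coverage count standing in for the paper's utility, which is the Nash equilibrium value of the inner OrientationScheduling game; the site names and coverage sets are invented for the example.

```python
def greedy_placement(candidates, utility, budget):
    """Repeatedly add the candidate with the largest marginal gain in utility."""
    chosen = []
    for _ in range(budget):
        remaining = [c for c in candidates if c not in chosen]
        if not remaining:
            break
        base = utility(chosen)
        best = max(remaining, key=lambda c: utility(chosen + [c]) - base)
        chosen.append(best)
    return chosen


# Hypothetical stand-in utility: the number of distinct cells covered.
coverage = {
    "a": {1, 2, 3, 7},
    "b": {3, 4, 6},
    "c": {5},
}


def utility(placement):
    return len(set().union(*(coverage[site] for site in placement)))


print(greedy_placement(list(coverage), utility, budget=2))
```

Greedy selection is the natural choice here because approximation guarantees for greedy maximization extend from submodular to weakly submodular set functions, which is the property the abstract establishes for the game-value utility.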


2606.19655 2026-06-19 stat.CO math.ST stat.TH New submission

A Flat Connection: The Pooling Factor and the Geometry of Centring in Hierarchical MCMC

Aidan D. Bindoff

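AI summary: Investigates the geometric cause of the centring/non-centring obstruction in hierarchical MCMC. It proves that the connection induced by the Fisher information is flat, so the obstruction stems from a statistical quantity, the pooling factor $\pi_j$, and derives a diagnostic from it.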

Comments 39 pages, 9 figures, accompanying R package

Abstract

Standard MCMC diagnostics ($\hat{R}$, effective sample size, divergence counts) detect whether a chain has mixed, but not why it has not. We ask whether the centring/non-centring obstruction in hierarchical models has a geometric cause beyond the metric. The joint parameter space is a fiber bundle (hyperparameters the base, group-level parameters the fibers), and the Fisher information metric induces an Ehresmann connection $A = -G_{FF}^{-1}G_{BF}$; the natural hypothesis is that the obstruction is its curvature, felt by the sampler as holonomy. We prove this false. The connection is flat for any smooth hierarchical posterior, not only the Gaussian case, because its horizontal leaves are the level sets of the fiber score $\partial_\alpha \log p$: there is no geometric obstruction above the metric. What remains is statistical, not geometric, and the flat connection identifies it as a single quantity: the conditional dependence of fiber on base, governed per group by the prior fraction $\pi_j$, the classical pooling factor. From it the framework recovers the established picture, that prior-dominated groups mix slowly and that the optimal per-group non-centring weight follows in closed form, and a simulation study separates this base-fiber coupling from the funnel, a distinct base-space pathology, by their opposite dependence on the hierarchical variance. A direct attribution test confirms that NUTS does not transport the fiber: the chain-level footprint is excess conditional autocorrelation in prior-dominated groups, exactly as $\pi_j$ predicts. Genuine, even rotational, curvature does appear, but only for connections built from a sampler's working metric (a fixed mass matrix), where holonomy re-enters as an algorithmic rather than geometric phenomenon. The prior-fraction diagnostic is distributed as the R package fibr, with the geometric methods as accompanying reproduction code.
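For orientation, the pooling factor has a simple closed form in the textbook normal-normal hierarchical model. The model and notation below ($y_j$, $\sigma_j$, $\mu$, $\tau$) are the standard Gaussian special case, stated here as an assumption; the paper's definition of $\pi_j$ may be more general.

```latex
y_j \mid \theta_j \sim \mathcal{N}(\theta_j, \sigma_j^2), \qquad
\theta_j \mid \mu, \tau \sim \mathcal{N}(\mu, \tau^2),
\qquad
\mathbb{E}[\theta_j \mid y_j, \mu, \tau] = (1 - \pi_j)\, y_j + \pi_j\, \mu,
\qquad
\pi_j = \frac{\sigma_j^2}{\sigma_j^2 + \tau^2}.
```

In this special case a group is prior-dominated when $\pi_j$ is close to 1, that is, when its observation variance $\sigma_j^2$ is large relative to the hierarchical variance $\tau^2$.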

2606.19521 2026-06-19 cs.LG math.OC New submission

Interactive Pareto navigation for deep multi-task learning

Augustina C. Amakor, Konstantin Sonntag, Sebastian Peitz

Affiliations: Department of Computer Science, TU Dortmund, Dortmund, Germany; Lamarr Institute for Machine Learning and Artificial Intelligence

AI summary: Proposes the Preference Pareto Exploration (PPE) framework, which uses a predictor-corrector method to follow the decision maker's preferences along tangent directions of the Pareto manifold and a Krylov subspace method to avoid Hessian computations, enabling efficient interactive multi-objective optimization.

Abstract

In multi-task learning, handling an increasing number of objectives can quickly become challenging, both in terms of the computational resources and the decision maker's capacity to choose appropriate trade-offs. A widely used approach is thus to aggregate the individual losses into a single loss function by a weighted sum. This often either fails to capture the decision maker's preferences, as a result of the shape of the Pareto front, or requires multiple adjustments and computations, which becomes prohibitively expensive in deep learning applications. To address these issues, we introduce a novel framework, Preference Pareto Exploration (PPE), which enforces the decision maker's preferences while accounting for the geometry of the Pareto set in an interactive exploration process. PPE is based on a predictor-corrector method that performs predictor steps tangential to the manifold of Pareto-optimal solutions, following the decision maker's preference. The subsequent corrector step results in a new trade-off reflecting this preference. To avoid explicit Hessian computations when characterizing the tangent space of the manifold, we employ a Krylov subspace method that relies solely on matrix-vector products. These products can be efficiently obtained via automatic differentiation, ensuring both efficiency and robustness throughout the optimization process. The method's functionality and performance are demonstrated using both toy problems and examples from deep learning.
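
The matrix-free idea in the abstract can be illustrated with a small sketch. It uses conjugate gradient, one well-known Krylov subspace method, to solve a linear system with a Hessian that is only ever accessed through Hessian-vector products. The abstract does not name the specific Krylov method, so conjugate gradient is a stand-in; the products here come from central finite differences of the gradient rather than automatic differentiation, and the quadratic loss and all function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def conjugate_gradient(matvec, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive definite A using only products A @ v."""
    x = [0.0] * len(b)
    r = list(b)          # residual b - A x at the starting point x = 0
    p = list(r)          # first search direction
    rs = dot(r, r)
    for _ in range(max_iter):
        if rs ** 0.5 < tol:
            break
        Ap = matvec(p)
        alpha = rs / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


def grad(w):
    # Gradient of the hypothetical loss f(w) = 2*w0^2 + w0*w1 + 1.5*w1^2,
    # whose Hessian is [[4, 1], [1, 3]].
    return [4.0 * w[0] + w[1], w[0] + 3.0 * w[1]]


def hvp(w, v, eps=1e-4):
    """Hessian-vector product via a central difference of the gradient."""
    plus = grad([wi + eps * vi for wi, vi in zip(w, v)])
    minus = grad([wi - eps * vi for wi, vi in zip(w, v)])
    return [(a - b) / (2 * eps) for a, b in zip(plus, minus)]


w0 = [0.5, -0.2]
solution = conjugate_gradient(lambda v: hvp(w0, v), [1.0, 2.0])
```

The Hessian is never formed. The solver only calls `matvec`, so swapping the finite-difference product for one obtained by automatic differentiation would leave `conjugate_gradient` unchanged.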

2606.19405 2026-06-19 q-bio.QM math.DS q-bio.PE New submission

Multi-type branching inference on contact trees with application to COVID-19

Augustine Okolie, Johannes Müller, Eno Akarawakc, Isaac Ajiboye

AI summary: Proposes a likelihood framework that operates directly on transmission trees over contact trees, accounts for contact-degree heterogeneity through a multi-type branching process, infers epidemiological parameters from partially resolved transmission trees, and is validated on COVID-19 contact-tracing data.

Comments: 26 pages, 8 figures

Abstract

Inferring epidemiological parameters from transmission trees is essential for understanding infectious disease dynamics. Existing tree-based likelihood methods, including the multi-type birth-death models originally applied in phylodynamic settings, provide powerful tools, but most assume homogeneous mixing and rarely capture how transmission potential changes as an individual infects more of their contacts. In this work, we develop a likelihood framework that operates directly on transmission trees, in which nodes are individuals and edges are reported transmission events, with no sequence data involved. We derive a likelihood for a stochastic SIR process on a rooted contact tree in which each infected individual is characterised by the total number of effective contacts and the number of already infected downstream contacts. We obtain closed-form ordinary differential equations for the probability that a clade goes entirely unobserved and for the probability density that it produces an observed (sampled) tip in a given state. The resulting likelihood can be evaluated for a rooted contact tree with known tip states, and we extend it to partially resolved trees by treating internal branching times as latent variables. Validation on simulated outbreaks confirms accurate parameter recovery and well-calibrated uncertainty. Application to empirical COVID-19 contact-tracing data from Karnataka, India, demonstrates the framework's utility for real epidemiological settings. By incorporating contact-degree heterogeneity in a multi-type branching likelihood, our work provides a principled baseline for inferring both transmission dynamics and contact structure from fully or partially resolved transmission trees, complementing rather than relying on sequence-based phylodynamic inference.

2606.19393 2026-06-19 cs.DM cs.DS math.CO New submission

An alternative way of defining finite graphs

Maxim Nazarov

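AI summary: Proposes a complete graph invariant, the "graph linear notation", as an alternative definition of finite graphs, intended to simplify drawing symmetric graph diagrams and comparing graphs for isomorphism.

Journal ref: Prikl. Diskr. Mat., 2015, no. 3(29), 83-94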

2606.19361 2026-06-19 cs.LG cs.AI cs.NA math.NA stat.CO stat.ME stat.ML New submission

Computational Identifiability

Lucius E. J. Bynum, Rajesh Ranganath, Kyunghyun Cho

Affiliations: New York University

AI summary: Proposes the "computational identifiability" framework, in which a finite computational search procedure looks for an empirical estimator within a specified error tolerance, addressing the shortcomings of theoretical identifiability in practical settings such as finite samples and ambiguous graphical criteria.

Abstract

Identification conditions describe the computability of a target query or parameter of interest as a function of the type and amount of information available. In causal identification, this information is often expressed in the form of a causal graph, and data are observed or collected for some subset of variables in the graph. Target queries may be for a single effect alone or for a class of effects in a given model. The derivation of an identification algorithm then defines mathematically the process by which the desired causal effect(s) can be uniquely determined, theoretically, in expectation. Identifiability in expectation, or 'theoretical identifiability,' generally assumes asymptotic properties, infinite data, or other mathematically idealized conditions. In this paper, we explore a fundamental distinction between this theoretical, idealized notion of identifiability and a proposed alternative that is computation-bound. The framework we propose - 'computational identifiability' - is to instead define a finite computational search procedure for an empirical estimator. If this process finds an estimator empirically, within a desired error tolerance, then identifiability is satisfied, conditional on the specified assumptions of the search (i.e., a prior distribution over the parameters) and conditional on the search procedure itself. Through several experiments, we demonstrate how this framework allows us to answer fine-grained, practical identification questions, such as identification with small finite samples, with ambiguous graphical criteria, with mixed observational-interventional data, and across counterfactual data and estimands. Code is available at https://github.com/lbynum/metadentify.
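
A toy version of the idea can be sketched as follows. This is not the authors' procedure and not the code in the linked repository. It assumes a hypothetical linear model with one observed confounder, a hypothetical prior over its coefficients, two hand-picked candidate estimators, and an arbitrary tolerance of 0.1. A candidate counts as passing if its worst error over a finite number of simulated datasets stays within the tolerance.

```python
import random


def simulate(b, a, c, n, rng):
    """Draw n samples with confounder Z: X = a*Z + e1, Y = b*X + c*Z + e2."""
    data = []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = a * z + rng.gauss(0, 1)
        y = b * x + c * z + rng.gauss(0, 1)
        data.append((x, y, z))
    return data


def naive(data):
    """Least-squares slope of Y on X, ignoring Z."""
    sxx = sum(x * x for x, _, _ in data)
    sxy = sum(x * y for x, y, _ in data)
    return sxy / sxx


def adjusted(data):
    """Coefficient of X in a least-squares fit of Y on (X, Z), via 2x2 normal equations."""
    sxx = sum(x * x for x, _, _ in data)
    szz = sum(z * z for _, _, z in data)
    sxz = sum(x * z for x, _, z in data)
    sxy = sum(x * y for x, y, _ in data)
    szy = sum(z * y for _, y, z in data)
    return (sxy * szz - szy * sxz) / (sxx * szz - sxz * sxz)


def search(candidates, tol=0.1, trials=20, n=2000, seed=0):
    """Check each candidate's worst error for the effect b over draws from the prior."""
    rng = random.Random(seed)
    worst = {name: 0.0 for name in candidates}
    for _ in range(trials):
        # Hypothetical prior over the model coefficients.
        b = rng.uniform(-1, 1)
        a = rng.uniform(0.5, 1.5)
        c = rng.uniform(1, 2)
        data = simulate(b, a, c, n, rng)
        for name, estimator in candidates.items():
            worst[name] = max(worst[name], abs(estimator(data) - b))
    return {name: err <= tol for name, err in worst.items()}


result = search({"naive": naive, "adjusted": adjusted})
```

Under these assumptions the adjusted estimator should pass and the naive one should not. The verdict is conditional on the prior, the sample size, and the search budget, which is the kind of dependence the abstract emphasizes.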

2606.20534 2026-06-19 math.OC New submission

On Second-Order Methods for Bilevel Optimization

Jiawen Bi, Jiaxiang Li, Mingyi Hong, Shuzhong Zhang

AI summary: For bilevel optimization with a nonconvex upper level and a strongly convex lower level, this paper proposes a single-loop cubic regularized Newton algorithm with optimal $\mathcal{O}(\varepsilon^{-1.5})$ total oracle complexity, the first deterministic single-loop method to attain the optimal rate for finding second-order stationary points in this setting.

Abstract

Bilevel optimization is an indispensable modeling tool for modern machine learning and engineering design. However, the theory and practice for finding second-order stationary points in the context of bilevel optimization still remain largely unsettled. Even for bilevel optimization with a strongly convex lower-level problem, the hyperfunction it induces is in general nonconvex. Although Cubic Regularized Newton (CRN) methods famously achieve the optimal $\mathcal{O}(\varepsilon^{-1.5})$ SOSP (second-order stationary point) rate in single-level optimization, it is unclear how to control the accuracy of the hypergradient and hyper-Hessian computations when applying second-order methods to bilevel problems so that the overall process is efficient. In this paper, we set out to answer this question. In particular, we first formulate a double-loop CRN baseline that achieves the optimal outer rate but requires repeated lower-level solves. Next, we propose a single-loop cubic regularized Newton algorithm that combines one lower-level gradient step with one Newton step for the hypergradient, and prove an overall deterministic $\mathcal{O}(\varepsilon^{-1.5})$ total oracle complexity, which is optimal. In addition, we illustrate that some intuitively simple modifications of our method may fail to preserve the convergence result. To the best of our knowledge, this is the first deterministic single-loop method for the unconstrained NCSC (non-convex upper-level and strongly convex lower-level) bilevel optimization setting that achieves the $\mathcal{O}(\varepsilon^{-1.5})$ optimal convergence rate for finding an $\varepsilon$-SOSP of the hyperfunction.
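
For orientation, the standard objects behind this abstract can be written out. The notation is assumed here, not taken from the abstract: $f$ is the upper-level objective, $g$ is the lower-level objective (strongly convex in $y$), $\Phi$ is the hyperfunction, and $M > 0$ is a cubic regularization parameter. The hypergradient follows from implicit differentiation of the lower-level optimality condition, and the last line is the usual cubic regularized Newton subproblem. The paper's single-loop algorithm works with inexact versions of these quantities, which this block does not attempt to reproduce.

```latex
\Phi(x) = f\bigl(x, y^*(x)\bigr), \qquad y^*(x) = \arg\min_{y} g(x, y)

\nabla \Phi(x) = \nabla_x f(x, y^*) - \nabla^2_{xy} g(x, y^*) \bigl[\nabla^2_{yy} g(x, y^*)\bigr]^{-1} \nabla_y f(x, y^*)

s_k \in \arg\min_{s} \; \langle \nabla \Phi(x_k), s \rangle + \tfrac{1}{2} \langle \nabla^2 \Phi(x_k)\, s, s \rangle + \tfrac{M}{6} \| s \|^3, \qquad x_{k+1} = x_k + s_k
```

In the common convention, an $\varepsilon$-SOSP of $\Phi$ is a point with $\|\nabla \Phi(x)\| \le \varepsilon$ and $\lambda_{\min}\bigl(\nabla^2 \Phi(x)\bigr) \ge -\sqrt{M \varepsilon}$, up to constants that vary between papers.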

2606.20528 2026-06-19 math.DG New submission

Positive Scalar Curvature Obstructions via Singular Dimension Descent

Yuchen Bi, Jintian Zhu

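AI summary: Develops a Schoen-Yau-type singular dimension descent method for studying positive scalar curvature obstructions in arbitrary dimensions, proves positive scalar curvature obstructions on enlargeable manifolds, and establishes the corresponding cube-width inequalities and "dual-system" estimates.

Comments: 51 pages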

2606.20516 2026-06-19 math.DG cs.CG New submission

Approximation and interactive design with exact 3D elastic curves

David Brander, Jens Gravesen, Marc Isern

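AI summary: Proposes a numerically stable method for recovering the 11 parameters of a given elastic curve segment, enabling fast and stable approximation of arbitrary space curve segments by 3D elastic curves, with applications to interactive design with exact elastic curves and to rationalizing CAD surfaces for robotic hot-blade cutting.

Comments: 20 pages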

2606.20509 2026-06-19 math.DS New submission

Planar constant piecewise smooth vector fields with large hysteresis

Tiago Carvalho, Leonardo Serantola, Bruno de Souza Rangel

AI summary: For hysteresis-based control systems, which are widely used in applications but lack a theoretical foundation for their limit sets, this paper analyzes the planar case with two linear vector fields and two switching boundaries and classifies the limit sets.

Abstract

In this work, we carry out a rigorous mathematical analysis of a class of control systems that is widely used in applications but still lacks a consistent theoretical foundation for describing the types of limit sets that may arise from its dynamics. There are applications in which, for example, a treatment for a given disease is administered until the level of diseased cells falls below a prescribed threshold C1. At that point, the treatment is suspended in order to allow the patient's organism to recover from its side effects. Subsequently, when the level of diseased cells reaches a second threshold C2 greater than C1, the treatment is resumed, and the protocol is repeated. To the best of our knowledge, there is no mathematical classification of such models. In this paper, we initiate what is intended to become a consistent body of literature aimed at determining the limit sets of such models. We begin with the planar case, in which two linear vector fields are active and two switching boundaries are considered. Naturally, in future developments, control systems in higher dimensions, featuring additional vector fields and more general switching manifolds, should also be considered.
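
The two-threshold protocol described above can be simulated in a few lines. The sketch is a scalar toy with constant rates, not the planar systems studied in the paper. The rates, thresholds, step size, and function name are hypothetical, and forward Euler is used only for simplicity.

```python
def simulate_hysteresis(c1, c2, x0, decay, growth, dt=0.01, steps=5000):
    """Euler simulation of a scalar level x under on/off treatment with two thresholds.

    Treatment on:  dx/dt = -decay   (until x falls below c1, then treatment stops)
    Treatment off: dx/dt = +growth  (until x rises above c2, then treatment resumes)
    """
    x, treating = x0, True
    path, switches = [x], 0
    for _ in range(steps):
        x += (-decay if treating else growth) * dt
        if treating and x < c1:
            treating, switches = False, switches + 1
        elif not treating and x > c2:
            treating, switches = True, switches + 1
        path.append(x)
    return path, switches


path, switches = simulate_hysteresis(c1=1.0, c2=2.0, x0=3.0, decay=1.0, growth=0.5)
```

After an initial transient, the level oscillates between the two thresholds indefinitely. This periodic orbit is the simplest instance of the limit sets the abstract proposes to classify.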
