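arXivDaily: a daily academic digest synced with the full arXiv data, with AI summaries and translations, covering artificial intelligence, robotics, computer science, finance, statistics, mathematics, physics, biology, economics, electrical engineering and systems science, and more.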

2212.04382 2026-05-28 stat.ML cs.LG

Structure of Classifier Boundaries: Case Study for a Naive Bayes Classifier

Alan F. Karr, Zac Bowen, Adam A. Porter, Regina Ruane

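AI summary: Studies the structure of Bayesian classifier boundaries when the input space is a graph, measures classification uncertainty through neighborhood similarity, and applies the approach to the problem of assigning DNA reads.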

2512.18444 2026-05-28 cs.GT cs.AI cs.DC cs.MA

Snowveil: A Framework for Decentralised Preference Discovery

Grammateia Kotsialou

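AI summary: For decentralised preference discovery, proposes Snowveil, a gossip-based framework that uses random sampling and local belief updates to converge to the social choice parameter with tunable high probability in finite expected time, and introduces the Constrained Hybrid Borda rule to balance broad consensus with plurality support.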

Abstract

Aggregating subjective preferences in social choice traditionally assumes a trusted central authority. In contrast, this paper formalises Decentralised Preference Discovery (DPD): the reliable identification of a social choice parameter (e.g. the canonical outcome of an aggregation rule applied to the global preference profile) under conditions of partial information, asynchronous interaction, censorship resistance, and no central coordinator. To address DPD, we propose Snowveil, a gossip-based framework where agents repeatedly sample random peer rankings and update local beliefs to converge on the canonical outcome. Using a potential function, submartingale theory, and concentration bounds, we prove the system reaches this stable state with tunable high probability, in finite expected time. This single-winner process can then be iterated to construct a set of winning candidates for multi-winner scenarios. Snowveil is agnostic to specific aggregation rules, requiring only that the rule satisfies axioms such as Positive Responsiveness, thus offering a formal basis for a wider class of DPD protocols. Demonstrating Snowveil's modularity, we introduce the Constrained Hybrid Borda (CHB), an aggregation rule designed to balance broad consensus with plurality support. We provide an axiomatic analysis of CHB and present empirical results via extensive simulation, validating Snowveil's O(n) scalability. Overall, this work provides a foundation for how a stable consensus emerges from subjective, expressive, and diverse preference profiles in large-scale decentralised systems.
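
As a rough illustration of the gossip-style sampling the abstract describes, the sketch below has each agent sample a few random peer rankings and adopt the Borda winner of that sample as its local belief. It is a toy under stated assumptions, not the Snowveil protocol: the plain Borda rule stands in for CHB, the sweep is synchronous, and the names `borda_winner`, `gossip_round`, and the sample size `k` are hypothetical.

```python
import random
from collections import Counter

def borda_winner(rankings):
    """Return the Borda winner of a list of rankings (best candidate first).

    Ties are broken by candidate name so the result is deterministic.
    """
    scores = Counter()
    for ranking in rankings:
        m = len(ranking)
        for position, candidate in enumerate(ranking):
            scores[candidate] += m - 1 - position
    return min(scores, key=lambda c: (-scores[c], c))

def gossip_round(profile, beliefs, k, rng):
    """One sweep: every agent samples k peers and sets its local belief
    to the Borda winner of the sampled rankings."""
    n = len(profile)
    for agent in range(n):
        peers = rng.sample(range(n), k)
        beliefs[agent] = borda_winner([profile[p] for p in peers])
    return beliefs

# Hypothetical profile: 60 agents ranking three candidates.
profile = [("a", "b", "c")] * 35 + [("b", "a", "c")] * 15 + [("c", "a", "b")] * 10
rng = random.Random(0)
beliefs = [ranking[0] for ranking in profile]  # start from each agent's top choice
for _ in range(5):
    beliefs = gossip_round(profile, beliefs, k=15, rng=rng)

global_winner = borda_winner(profile)
share = beliefs.count(global_winner) / len(beliefs)  # agents agreeing with the global outcome
```

Here `share` reports how many local beliefs match the winner computed from the full profile; the paper's guarantees concern its own update rule, which this toy does not reproduce.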

2512.06797 2026-05-28 math.OC cs.AI cs.LG stat.ML

Optimal and Diffusion Transports in Machine Learning

Gabriel Peyré

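AI summary: This survey covers two transport approaches in machine learning, diffusion methods and optimal transport, which design the evolution of probability distributions through a Lagrangian viewpoint, with applications to sampling, neural network optimization, and modeling the dynamics of large language models.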

Comments Proc. 2026 International Congress of Mathematicians

Abstract

Several problems in machine learning are naturally expressed as the design and analysis of time-evolving probability distributions. This includes sampling via diffusion methods, optimizing the weights of neural networks, and analyzing the evolution of token distributions across layers of large language models. While the targeted applications differ (samples, weights, tokens), their mathematical descriptions share a common structure. A key idea is to switch from the Eulerian representation of densities to their Lagrangian counterpart through vector fields that advect particles. This dual view introduces challenges, notably the non-uniqueness of Lagrangian vector fields, but also opportunities to craft density evolutions and flows with favorable properties in terms of regularity, stability, and computational tractability. This survey presents an overview of these methods, with emphasis on two complementary approaches: diffusion methods, which rely on stochastic interpolation processes and underpin modern generative AI, and optimal transport, which defines interpolation by minimizing displacement cost. We illustrate how both approaches appear in applications ranging from sampling, neural network optimization, to modeling the dynamics of transformers for large language models.
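
To make the Lagrangian picture concrete, here is a minimal sketch of optimal transport between two equal-size one-dimensional samples, where the optimal pairing for squared-distance cost is obtained by sorting and each particle moves along a straight line to its target. The function names `ot_map_1d` and `displacement_interpolation` are illustrative choices, and the example assumes equal sample sizes and uniform weights.

```python
import numpy as np

def ot_map_1d(x, y):
    """Optimal transport pairing between two equal-size 1D samples.

    For convex costs such as squared distance, the optimal coupling in 1D
    matches sorted x to sorted y. Returns the target assigned to each x[i].
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    order = np.argsort(x)
    targets = np.empty_like(x)
    targets[order] = np.sort(y)
    return targets

def displacement_interpolation(x, y, t):
    """Lagrangian view: move each particle along a straight line to its target."""
    x = np.asarray(x, dtype=float)
    return (1.0 - t) * x + t * ot_map_1d(x, y)

x = np.array([0.0, 2.0, 1.0])
y = np.array([5.0, 3.0, 4.0])
midpoint = displacement_interpolation(x, y, 0.5)   # particle positions at t = 0.5
w2_squared = np.mean((ot_map_1d(x, y) - x) ** 2)   # squared 2-Wasserstein distance
```

The Eulerian counterpart would track the density of the particles at each time `t` instead of the individual trajectories.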

2511.11896 2026-05-28 cs.CR cs.AI cs.SE

VULPO: Context-Aware Vulnerability Detection via On-Policy LLM Optimization

Youpeng Li, Fuxun Yu, Weiliang Qi, Xinda Wang

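AI summary: Proposes the VULPO framework, which builds ContextVul, a dataset with contextual information and reasoning traces, and combines cold-start supervised fine-tuning with adaptive on-policy optimization to substantially improve LLM vulnerability detection in real-world repositories.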

Abstract

Large language models (LLMs) have recently shown strong potential in vulnerability detection (VD). However, accurately detecting vulnerabilities in real-world repositories requires reasoning over complex contextual interactions. Existing LLM-based VD approaches remain limited because current datasets lack complete contextual information and high-quality reasoning supervision, while existing optimization methods primarily rely on coarse outcome-centric supervision signals that fail to model the vulnerability reasoning process. To address these limitations, we first construct ContextVul, a new dataset that augments high-quality function-level vulnerability benchmarks with repository-level contextual information and curated vulnerability reasoning traces. Building upon ContextVul, we introduce a two-stage optimization framework consisting of lightweight cold-start supervised fine-tuning followed by vulnerability-adaptive on-policy optimization (VULPO). VULPO incorporates multidimensional rewards that jointly evaluate vulnerability identification, vulnerability-relevant localization, and causal reasoning quality, along with difficulty-adaptive reward scaling to mitigate reward hacking and improve RL effectiveness. Extensive experiments demonstrate the superiority of VULPO for context-aware VD. Our VULPO-4B, the first specialized vulnerability reasoning LLM, substantially outperforms existing VD baselines, improving Pairwise Pass@1 by 203% relative to Qwen3-4B and achieving competitive performance against a 150% larger-scale LLM, DeepSeek-V3.1.

2411.13479 2026-05-28 stat.ML cs.LG stat.AP

Conformal Prediction for Hierarchical Data

Guillaume Principato, Gilles Stoltz, Yvenn Amara-Ouali, Yannig Goude, Bachir Hamrouche, Jean-Michel Poggi

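AI summary: For hierarchical data, adds a projection (reconciliation) step to split conformal prediction, obtaining smaller prediction regions under both joint and component-wise coverage, and proves theoretically that the approach is globally better.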

Comments 39 pages, 4 figures
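
For readers unfamiliar with the baseline, the following is a minimal sketch of plain split conformal prediction for a single series, using absolute residuals on a calibration set. It does not include the projection (reconciliation) step that the paper adds for hierarchies, and the name `split_conformal_radius` and the calibration values are hypothetical.

```python
import math
import numpy as np

def split_conformal_radius(cal_residuals, alpha):
    """Radius q such that |y - prediction| <= q holds with probability
    at least 1 - alpha for a new exchangeable point."""
    scores = np.sort(np.abs(np.asarray(cal_residuals, dtype=float)))
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))  # 1-based order statistic
    if rank > n:
        return math.inf  # too few calibration points for this alpha
    return float(scores[rank - 1])

# Hypothetical calibration residuals (observed minus predicted).
cal_residuals = [0.3, -0.1, 0.5, -0.7, 0.2, -0.4, 0.9, -0.6, 0.8, -1.0]
radius_90 = split_conformal_radius(cal_residuals, alpha=0.1)
radius_80 = split_conformal_radius(cal_residuals, alpha=0.2)
```

The prediction interval for a new point is then the point forecast plus or minus the radius.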

2510.22190 2026-05-28 astro-ph.IM astro-ph.CO cs.LG

RGC: a radio AGN classifier based on deep learning. I. A semi-supervised multiclass model for VLA images

M. S. Hossain, M. S. H. Shahal, K. M. B. Asad, P. Saikia, A. Khan, F. Akter, A. Ali, M. A. Amin, D. P. Guha, M. O. B. Jihad, A. Momen, S. Sen, A. K. M. M. Rahman

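AI summary: Proposes the semi-supervised RGC model, which combines BYOL and E2CNN and uses 2060 labelled and 20,000 unlabelled samples to deliver the first multiclass classification of bent radio AGNs (WAT/NAT) and straight Fanaroff-Riley types (sFRI/sFRII); its performance is on par with supervised models, and its Grad-CAM maps focus on morphological structure.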

Comments 12 pages, 8 pages appendix, 7 figures, re-submitted to A&A

Abstract

Bent radio active galactic nuclei (RAGNs) -- wide-angle tails (WATs) and narrow-angle tails (NATs) -- trace dense environments in galaxy groups and clusters, yet no multiclass classifier simultaneously separates them from straight Fanaroff--Riley types (sFRI, sFRII) using visually inspected labels and unlabelled data. We release FIRST-2060, a four-class labelled dataset of 2060 RAGNs (sFRI, sFRII, WAT, NAT) constructed from three publicly available catalogues through multi-tier visual inspection, together with the semi-supervised RGC 1.0 model that leverages 20,000 unlabelled sources. We benchmark RGC against five supervised baselines. FIRST-2060 is provided in two preprocessing variants: $\mathbf{R}_{L1}$, which retains spurious sources, and $\mathbf{R}_{L2}$, from which they are removed. The RGC model integrates the self-supervised framework BYOL (Bootstrap Your Own Latent) with an $E(2)$-equivariant steerable CNN (E2CNN) encoder, pre-trained on the unlabelled data and fine-tuned on the labelled sets. All six models are evaluated with 5-fold cross-validation, Grad-CAM attention analysis, and controlled class-imbalance experiments. ConvNeXT ($M_1$) and RGC ($M_2$) form a top tier at macro-$F_1$ $0.80\pm0.02$ and $0.79\pm0.02$ respectively, a difference within one standard deviation. $M_2$ is the only model whose Grad-CAM contours consistently trace the morphological structure of RAGNs -- lobes, jets, and bends -- rather than defaulting to compact blobs or diffuse patterns. The four-class scheme introduced here enables WAT/NAT-resolved catalogues that can serve as environment probes and progenitor classifications for diffuse cluster radio emission. The complementary strengths of $M_1$ and $M_2$ -- in cross-type and within-type discrimination respectively -- suggest that an ensemble approach may offer a practical framework for survey-scale morphological catalogues.

2505.11638 2026-05-28 math.NA cs.LG cs.NA

Accelerating Natural Gradient Descent for PINNs with Randomized Numerical Linear Algebra

Ivan Bioli, Carlo Marcati, Giancarlo Sangalli

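AI summary: Natural gradient descent (NGD) for physics-informed neural networks (PINNs) is costly because the Gramian matrix is ill-conditioned; this work proposes preconditioning based on randomized numerical linear algebra (RandNLA) to accelerate the conjugate gradient solver, substantially improving optimization efficiency.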

Abstract

Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical use is often limited by the high computational cost of solving linear systems involving the Gramian matrix. While matrix-free NGD methods based on the conjugate gradient (CG) method avoid explicit matrix inversion, the ill-conditioning of the Gramian significantly slows the convergence of CG. In this work, we extend matrix-free NGD to broader classes of problems than previously considered and propose the use of Randomized Numerical Linear Algebra (RandNLA) techniques for efficient preconditioning of the inner CG solver. The resulting algorithm demonstrates substantial performance improvements over existing NGD-based methods and other state-of-the-art optimizers on a range of PDE problems discretized using neural networks.
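
The idea of preconditioning an inner CG solve with a randomized low-rank approximation can be sketched as follows. This is an illustration under assumptions, not the authors' algorithm: a synthetic symmetric positive semidefinite matrix `A` stands in for the Gramian, the damping `mu` and the rank are arbitrary, and `nystrom_preconditioner` is a simple randomized Nyström construction chosen for readability.

```python
import numpy as np

def nystrom_preconditioner(A, rank, mu, rng):
    """Build v -> P^{-1} v from a randomized Nystrom approximation of A.

    A is symmetric positive semidefinite; mu > 0 is the damping in (A + mu I).
    """
    n = A.shape[0]
    omega = rng.standard_normal((n, rank))
    Y = A @ omega                                  # sketch of the range of A
    Q, R = np.linalg.qr(Y)
    core = R @ np.linalg.solve(omega.T @ Y, R.T)   # A_nys = Q core Q^T
    core = 0.5 * (core + core.T)
    eigvals, V = np.linalg.eigh(core)
    eigvals = np.clip(eigvals, 0.0, None)
    U = Q @ V
    scale = (eigvals.min() + mu) / (eigvals + mu)

    def apply(v):
        coeffs = U.T @ v
        # Rescale the captured subspace, leave its complement unchanged.
        return U @ (scale * coeffs) + (v - U @ coeffs)

    return apply

def pcg(matvec, b, precond, tol=1e-8, max_iter=1000):
    """Matrix-free preconditioned conjugate gradient for an SPD system."""
    x = np.zeros_like(b)
    r = b - matvec(x)
    z = precond(r)
    p = z.copy()
    rz = r @ z
    for iteration in range(max_iter):
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, iteration
        Ap = matvec(p)
        step = rz / (p @ Ap)
        x = x + step * p
        r = r - step * Ap
        z = precond(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, max_iter

rng = np.random.default_rng(0)
n, mu = 200, 1e-3
basis, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = (basis * (1.0 / np.arange(1, n + 1) ** 2)) @ basis.T   # fast-decaying spectrum
b = rng.standard_normal(n)
matvec = lambda v: A @ v + mu * v                           # damped system (A + mu I)

x_plain, iters_plain = pcg(matvec, b, lambda r: r)
x_nys, iters_nys = pcg(matvec, b, nystrom_preconditioner(A, rank=20, mu=mu, rng=rng))
```

Comparing `iters_plain` with `iters_nys` gives a feel for how much the preconditioner helps on a spectrum that decays quickly; in a real NGD setting the matrix would only be available through matrix-vector products.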

2510.06970 2026-05-28 eess.SY cs.LG cs.SY

Falsification-driven reinforcement learning for maritime motion planning

Marlon Müller, Florian Finkeldei, Hanna Krasowski, Murat Arcak, Matthias Althoff

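AI summary: Proposes a falsification-driven reinforcement learning approach that generates adversarial training scenarios (violating maritime traffic rules expressed as signal temporal logic specifications) to improve rule compliance of autonomous vessels; experiments show it provides more relevant training scenarios and achieves more consistent rule compliance.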

Comments 11 pages, 9 figures. Code available at https://fdrl-maritime.github.io

DOI: 10.1016/j.oceaneng.2026.125579
Journal ref: Ocean Engineering 361 (2026) 125579

Abstract

Compliance with maritime traffic rules is essential for the safe operation of autonomous vessels, yet training reinforcement learning (RL) agents to adhere to them is challenging. The behavior of RL agents is shaped by the training scenarios they encounter, but creating scenarios that capture the complexity of maritime navigation is non-trivial, and real-world data alone is insufficient. To address this, we propose a falsification-driven RL approach that generates adversarial training scenarios in which the vessel under test violates maritime traffic rules, which are expressed as signal temporal logic specifications. Our experiments on open-sea navigation with two vessels demonstrate that the proposed approach provides more relevant training scenarios and achieves more consistent rule compliance.
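
The notion of a trace violating a signal temporal logic rule can be illustrated with a robustness score. The sketch below uses a deliberately simple, hypothetical rule, "always keep at least `d_min` distance from the other vessel", rather than the traffic rules formalized in the paper; the scenario names and trajectories are made up.

```python
import numpy as np

def always_min_distance_robustness(ego_xy, other_xy, d_min):
    """Robustness of the STL formula G (distance >= d_min) over a finite trace.

    Positive: satisfied with that margin. Negative: violated at some time step.
    """
    ego_xy = np.asarray(ego_xy, dtype=float)
    other_xy = np.asarray(other_xy, dtype=float)
    distances = np.linalg.norm(ego_xy - other_xy, axis=1)
    return float(np.min(distances - d_min))

def pick_falsifying(scenarios, d_min):
    """Keep scenarios whose trace violates the rule (negative robustness)."""
    return [name for name, (ego, other) in scenarios.items()
            if always_min_distance_robustness(ego, other, d_min) < 0.0]

# Hypothetical two-vessel traces: (ego positions, other vessel positions).
scenarios = {
    "head_on": ([[0, 0], [1, 0], [2, 0]], [[4, 0], [3, 0], [2.5, 0]]),
    "parallel": ([[0, 0], [1, 0], [2, 0]], [[0, 5], [1, 5], [2, 5]]),
}
violating = pick_falsifying(scenarios, d_min=1.0)
```

A falsification procedure would search over scenario parameters to drive this robustness below zero; here the filter only selects from a fixed set.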

2509.22553 2026-05-28 stat.ML cs.LG

Linear Causal Representation Learning by Topological Ordering, Pruning, and Disentanglement

Hao Chen, Lin Liu, Yu Guang Wang

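AI summary: Proposes a new algorithm that recovers linear causal representations under weaker assumptions through topological ordering, pruning, and disentanglement, and validates it with synthetic experiments and an interpretability analysis of large language models.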

Abstract

Causal representation learning (CRL) has garnered increasing interest from the causal inference and artificial intelligence communities due to its potential to disentangle complex data-generating mechanisms into causally interpretable latent features by leveraging the heterogeneity of modern datasets. In this paper, we further contribute to the CRL literature by focusing on the stylized linear structural causal model over latent features and assuming a linear mixing function that maps latent features to the observed data or measurements. Existing linear CRL methods often rely on stringent assumptions, such as access to single-node interventional data or restrictive distributional constraints on latent features and/or exogenous measurement noise. However, these prerequisites can be easy to violate in practice. In this work, we propose a novel linear CRL algorithm that, unlike existing methods, operates under weaker assumptions on environment heterogeneity and data-generating distributions while still recovering latent causal features up to an equivalence class. We further validate our new algorithm via synthetic experiments and an interpretability analysis of large language models, demonstrating both its superiority over competing methods in finite samples and its potential in integrating causality into understanding artificial intelligence. The source code is available at https://github.com/utulie/code_for_linear_crl_paper_creator.

2507.13725 2026-05-28 cs.IR cs.AI

Point of Interest Recommendation: Pitfalls and Viable Solutions

Alejandro Bellogín, Linus W. Dietz, Francesco Ricci, Pablo Sánchez

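AI summary: Critically assesses the state of point-of-interest recommendation research, identifies key shortcomings in datasets, algorithms, and evaluation methodologies, and proposes a research agenda that includes multistakeholder design and context awareness.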

DOI: 10.1145/3816430
Journal ref: ACM Transactions on Recommender Systems 2026

Abstract

Point of interest (POI) recommendation can play a pivotal role in enriching tourists' experiences by suggesting context-dependent and preference-matching locations and activities, such as restaurants, landmarks, itineraries, and cultural attractions. Unlike some more common recommendation domains (e.g., music and video), POI recommendation is inherently high-stakes: users invest significant time, money, and effort to search, choose, and consume these suggested POIs. Despite the numerous research works in the area, several fundamental issues remain unresolved, hindering the real-world applicability of the proposed approaches. In this paper, we discuss the current status of the POI recommendation problem and the main challenges we have identified. The first contribution of this paper is a critical assessment of the current state of POI recommendation research and the identification of key shortcomings across three main dimensions: datasets, algorithms, and evaluation methodologies. We highlight persistent issues such as the lack of standardized benchmark datasets, flawed assumptions in the problem definition and model design, and inadequate treatment of biases in the user behavior and system performance. The second contribution is a structured research agenda that, starting from the identified issues, introduces important directions for future work related to multistakeholder design, context awareness, data collection, trustworthiness, novel interactions, and real-world evaluation.

2506.08846 2026-05-28 cs.CY cs.CL cs.SD eess.AS

Addressing Pitfalls in Auditing Practices of Automatic Speech Recognition Technologies: A Case Study of People with Aphasia

Katelyn Xiaoying Mei, Anna Seo Gyeong Choi, Hilke Schellmann, Mona Sloane, Allison Koenecke

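AI summary: Identifies three common pitfalls in standard ASR audits and proposes a holistic auditing framework; a case study finds that ASR systems perform worse for speakers with aphasia.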

Comments Published at the Proceedings of The 2026 ACM Conference on Fairness, Accountability, and Transparency (FAccT '26)

DOI: 10.1145/3805689.3812320

Abstract

Automatic Speech Recognition (ASR) systems' growing use warrants robust auditing approaches to ensure equitable transcription quality, especially for people with speech disorders like aphasia who disproportionately depend on ASR. While academic and industry audits have revealed performance disparities across user populations, standard auditing practices often overlook nuances that risk masking harm to marginalized groups. We identify three common pitfalls in standard ASR audits: (1) adhering to one method of text standardization, which can mask variance in ASR performance and ignore the standardization preferences of marginalized communities; (2) displaying high-level demographic findings without considering performance disparities by nuanced intersectional subgroups, or conditioning on relevant acoustic properties; and (3) reporting only one gold-standard metric (Word Error Rate), which inadequately quantifies common generative AI errors like hallucinations. We propose a holistic auditing framework addressing these pitfalls, and in a case study of six popular ASR systems, find consistently worse ASR performance for speakers with aphasia relative to a control group. We call on practitioners to implement these robust, community-driven ASR auditing practices better suited for the rapidly changing ASR landscape.
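
The first pitfall, sensitivity of Word Error Rate to the choice of text standardization, can be seen in a small sketch. The two normalizers below are hypothetical examples, not the standardization schemes used in the paper, and the sentence pair is invented.

```python
import re

def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            current.append(min(substitution, previous[j] + 1, current[j - 1] + 1))
        previous = current
    return previous[-1] / len(ref)

def normalize_basic(text):
    """Lowercase and strip punctuation, keeping fillers such as "um"."""
    return re.sub(r"[^\w\s']", "", text.lower())

def normalize_no_fillers(text, fillers=("um", "uh")):
    """Same as normalize_basic, but also drops filler words."""
    return " ".join(w for w in normalize_basic(text).split() if w not in fillers)

reference = "Um, I want, uh, coffee."
hypothesis = "I want coffee"
wer_basic = word_error_rate(normalize_basic(reference), normalize_basic(hypothesis))
wer_no_fillers = word_error_rate(
    normalize_no_fillers(reference), normalize_no_fillers(hypothesis)
)
```

The same transcript scores a WER of 0.4 when fillers are kept in the reference and 0.0 when they are removed, which is why reporting a single standardization can hide differences between speaker groups.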

2506.12444 2026-05-28 math.OC cs.LG

Adjusted Shuffling SARAH: Advancing Complexity Analysis via Dynamic Gradient Weighting

Duc Toan Nguyen, Trang H. Tran, Lam M. Nguyen

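AI summary: Proposes the Adjusted Shuffling SARAH algorithm, which integrates shuffling strategies into the recursive SARAH framework through a dynamic weighting mechanism, achieves optimal theoretical guarantees in the strongly convex and non-convex settings, and introduces an inexact mode whose total complexity is independent of the dataset size.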

2409.13058 2026-05-28 cs.HC cs.RO

Mixed Reality Tele-Ultrasound over 750 km: A Feasibility Study

Ryan Yeung, David Black, Patrick B. Chen, Victoria Lessoway, Janice Reid, Sergio Rangel-Suarez, Silvia D. Chang, Septimiu E. Salcudean

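AI summary: Presents and evaluates a human teleoperation tele-ultrasound system based on mixed reality and haptic feedback, in which novice operators perform abdominal ultrasound exams under remote expert control; at a distance of 754 km, 92% of the images met the quality standard.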

Comments 8 pages, 11 figures

DOI: 10.1109/Telepresence66096.2025.11521508

Abstract

To address the lack of access to ultrasound in remote communities, previous work introduced human teleoperation, a mixed reality and haptics-based tele-ultrasound system. In this approach, a novice takes the role of a cognitive robot controlled remotely by an expert through mixed reality. In this manuscript we summarize new developments to this system and describe a feasibility study assessing its use for long-distance remote abdominal ultrasound examinations. To provide simple but effective haptic feedback, we used an ellipsoid model of the patient with its parameters calibrated using our system's position and force sensors. We tested the system in Skidegate, Haida Gwaii, Canada, with the experts positioned 754 km away in Vancouver, Canada. We performed 11 total scans with 10 novices and 2 sonographers. The sonographers were tasked with acquiring 5 target images in the epigastric region. The image acquisition quality was assessed by 2 radiologists. We collected alignment data and the novices completed task load and usability questionnaires. Both the novices and sonographers provided written and verbal feedback to inform future design iterations. 92% of the acquired images had sufficient quality for interpretation by both radiologists. The mean task load reported by the novices was below reference values reported in literature and the usability was unanimously positive. No correlation was found between image quality and the follower's alignment error with the virtual transducer. Overall, we show that human teleoperation enables sonographers to perform remote abdominal ultrasound imaging with high performance, even across large distances and with novice followers. Future work will compare human teleoperation to conventional, robotic and tele-mentored ultrasound.

2506.08311 2026-05-28 cs.SE cs.AI

Understanding Automated Program Repair Agents Through the Lens of Traceability: An Empirical Study

Ira Ceka, Hailie Mitchell, Saurabh Pujar, Luca Buratti, Shyam Ramji, Junfeng Yang, Gail Kaiser, Baishakhi Ray

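AI summary: By tracing the decision-making pipelines of five state-of-the-art automated program repair agents across 500 real-world repair tasks, this paper reveals key limitations in fixing logic-intensive bugs, test generation, and regression test selection, and suggests directions for improvement.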

Comments Accepted for publication (ISSTA '26)

Abstract

Automated Program Repair (APR) agents leverage Large Language Models (LLMs) to autonomously diagnose and fix software bugs through reasoning, planning, and tool use. Despite impressive leaderboard gains on benchmarks such as SWE-bench, little is understood about how these agents take actions, where they fail, and how their behavior compares to that of human developers. This paper presents the first systematic analysis of five state-of-the-art APR agents across 500 real-world repair tasks, tracing their full decision-making pipelines -- from issue description to patch validation. Our study reveals that while agents excel at simple fixes, they struggle with logic-intensive bugs, often producing verbose or overfitted patches that merely satisfy existing tests. We find that test generation and regression test selection remain major bottlenecks, with agents frequently failing to reproduce issues or run relevant regression tests. Moreover, most agents operate with primitive tooling (e.g., bash scripts) and lack access to debuggers or program analyzers, which constrains their reasoning and patch quality. These findings highlight key limitations in current APR systems and motivate a shift-left approach -- emphasizing early, high-quality test generation and validation -- to reduce spurious fixes and improve semantic correctness. We further outline concrete directions for next-generation APR design: (1) richer and more integrated tool ecosystems, (2) diversified agentic architectures that combine complementary strengths, and (3) benchmarks that prioritize semantic repair quality and test generation fidelity over surface-level success metrics.

2502.08695 2026-05-28 stat.ML cs.LG

A Bayesian Nonparametric Perspective on Mahalanobis Distance for Out of Distribution Detection

Randolph W. Linderman, Noah Cowan, Yiran Chen, Scott W. Linderman

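AI summary: Establishes a formal relationship between Bayesian nonparametric models and the relative Mahalanobis distance score (RMDS), proposes Bayesian nonparametric mixture models with hierarchical priors that generalize RMDS, and shows on the OpenOOD benchmark that they outperform existing methods when training classes have different covariance structures and few data points per class.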

Comments 32 pages, 5 figures, code is available at https://github.com/rwl93/bnp4ood
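
As background for the score this paper generalizes, here is a minimal sketch of the relative Mahalanobis distance as it is commonly defined: the class-conditional Mahalanobis distance under a shared covariance, minus the distance under a single global Gaussian, minimized over classes. It is not the Bayesian nonparametric model from the paper, and the synthetic two-class features are an assumption for illustration.

```python
import numpy as np

def fit_rmd(features, labels):
    """Estimate class means, a shared covariance, and a global Gaussian."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    means = np.stack([features[labels == c].mean(axis=0) for c in classes])
    centered = features - means[np.searchsorted(classes, labels)]
    shared_cov = centered.T @ centered / len(features)
    global_mean = features.mean(axis=0)
    global_cov = np.cov(features, rowvar=False, bias=True)
    return classes, means, np.linalg.inv(shared_cov), global_mean, np.linalg.inv(global_cov)

def rmd_score(x, model):
    """Smaller means more in-distribution; larger means more out-of-distribution."""
    _, means, shared_prec, global_mean, global_prec = model
    diffs = means - x
    class_md = np.einsum("kd,de,ke->k", diffs, shared_prec, diffs)
    g = x - global_mean
    background_md = g @ global_prec @ g
    return float(np.min(class_md - background_md))

# Synthetic features: two unit-variance classes centred at (0, 0) and (4, 0).
rng = np.random.default_rng(0)
features = np.vstack([rng.standard_normal((200, 2)),
                      rng.standard_normal((200, 2)) + [4.0, 0.0]])
labels = np.array([0] * 200 + [1] * 200)
model = fit_rmd(features, labels)
score_in = rmd_score(np.array([0.0, 0.0]), model)    # near a class mean
score_out = rmd_score(np.array([2.0, 10.0]), model)  # far from both classes
```

The shared covariance is the assumption the summary above says breaks down when classes have different covariance structures.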

2408.00057 2026-05-28 q-bio.BM cs.LG

GOProteinGNN: Leveraging Protein Knowledge Graphs for Protein Representation Learning

Dan Kalifa, Uriel Singer, Kira Radinsky

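AI summary: Proposes the GOProteinGNN architecture, which augments protein language models with protein knowledge graph information and performs graph learning at both the amino acid and protein levels, achieving state-of-the-art performance on several downstream tasks.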

DOI: 10.1145/3746252.3761500
Journal ref: CIKM 2025: Proceedings of the 34th ACM International Conference on Information and Knowledge Management

Abstract

Proteins play a vital role in biological processes and are indispensable for living organisms. Accurate representation of proteins is crucial, especially in drug development. Recently, there has been a notable increase in interest in utilizing machine learning and deep learning techniques for unsupervised learning of protein representations. However, these approaches often focus solely on the amino acid sequence of proteins and lack factual knowledge about proteins and their interactions, thus limiting their performance. In this study, we present GOProteinGNN, a novel architecture that enhances protein language models by integrating protein knowledge graph information during the creation of amino acid level representations. Our approach allows for the integration of information at both the individual amino acid level and the entire protein level, enabling a comprehensive and effective learning process through graph-based learning. By doing so, we can capture complex relationships and dependencies between proteins and their functional annotations, resulting in more robust and contextually enriched protein representations. Unlike previous methods, GOProteinGNN uniquely learns the entire protein knowledge graph during training, which allows it to capture broader relational nuances and dependencies beyond mere triplets as done in previous work. We perform a comprehensive evaluation on several downstream tasks demonstrating that GOProteinGNN consistently outperforms previous methods, showcasing its effectiveness and establishing it as a state-of-the-art solution for protein representation learning.

2411.18502 2026-05-28 stat.ML cs.AI cs.IR cs.LG stat.ME

Isometry pursuit

Samson Koelle, Marina Meila

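AI summary: Proposes the isometry pursuit algorithm, which uses a novel normalization method and multitask basis pursuit to identify submatrices with orthonormal columns in wide matrices, for discovering isometric embeddings from interpretable dictionaries.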

2410.12035 2026-05-28 stat.ML cs.LG

Learning with Importance Weighted Variational Inference

Kamélia Daudel, François Roueff

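AI summary: Compares, through asymptotic analysis, reparameterized and doubly-reparameterized gradient estimators for the IWAE, VR, and VR-IWAE bounds, revealing a bias-variance tradeoff and establishing the superiority of DREP, and analyzes whether the gradient estimators point in a well-founded direction in challenging regimes.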

Abstract

Several variational bounds involving importance weighting ideas generalize the Evidence Lower BOund (ELBO) for marginal likelihood optimization, such as the Importance-weighted Auto-Encoder (IWAE), Variational Rényi (VR) and VR-IWAE bounds. Yet, it remains unclear how the joint choice of bound and gradient estimator impacts the behavior of the resulting variational inference (VI) algorithms. This paper provides a unified theoretical comparison of reparameterized (REP) and doubly-reparameterized (DREP) gradient estimators tied to the IWAE, VR and VR-IWAE bounds. Through asymptotic analyses of the Signal-to-Noise Ratio as the number of Monte Carlo samples $N$ goes to infinity, we identify a bias-variance tradeoff in these gradient estimators and we formally justify the superiority of DREP over REP in importance-weighted VI. An additional asymptotic analysis for challenging regimes, where both $N$ and the Kullback-Leibler divergence between the variational and posterior densities go to infinity, indicates that importance-weighted VI gradient estimators point in a well-founded direction even when the variational approximation deteriorates. Together, these complementary results characterize the optimization trajectory in importance-weighted VI from poor initialization to final convergence. Importantly, our proof techniques establish general theoretical tools for the study of ratios of sample means; their scope extends beyond VI, and they constitute an independent contribution to the field of Monte Carlo methods.
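
For orientation, the sketch below estimates the IWAE bound and the ELBO from the same set of log importance weights, using the log-sum-exp trick for stability. The toy Gaussian model and the prior-as-proposal choice are assumptions made for illustration, and the gradient estimators (REP, DREP) analyzed in the paper are not implemented here.

```python
import numpy as np

def iwae_bound(log_weights):
    """Monte Carlo estimate of the IWAE bound from N log importance weights:
    log((1/N) * sum_i w_i), computed stably with the log-sum-exp trick."""
    log_weights = np.asarray(log_weights, dtype=float)
    shift = log_weights.max()
    return float(shift + np.log(np.mean(np.exp(log_weights - shift))))

def elbo_estimate(log_weights):
    """Monte Carlo estimate of the ELBO from the same weights: mean of log w_i."""
    return float(np.mean(log_weights))

# Toy model (an assumption): z ~ N(0, 1), x | z ~ N(z, 1), observed x = 1,
# proposal q(z) = N(0, 1), so w = p(x, z) / q(z) = N(x; z, 1).
rng = np.random.default_rng(0)
x = 1.0
z = rng.standard_normal(1000)
log_w = -0.5 * np.log(2 * np.pi) - 0.5 * (x - z) ** 2

iwae = iwae_bound(log_w)
elbo = elbo_estimate(log_w)
log_marginal = -0.5 * np.log(4 * np.pi) - 0.25  # exact log p(x) = log N(1; 0, 2)
```

By Jensen's inequality the IWAE estimate is never below the ELBO estimate computed from the same weights, and in this toy model it sits much closer to the exact log marginal likelihood.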

2405.09586 2026-05-28 eess.IV cs.AI cs.CV

Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

Kang Liu, Zhuoqi Ma, Mengmeng Liu, Zhicheng Jiao, Xiaolu Kang, Qiguang Miao, Kun Xie

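AI summary: Proposes FSE, a two-stage factual serialization enhancement method that uses factuality-guided contrastive learning and evidence-driven report generation to improve the clinical accuracy and natural language quality of chest X-ray report generation.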

Comments code is available at https://github.com/mk-runner/FSE

详情

DOI: 10.1016/j.eswa.2026.132550

AI中文摘要

放射学报告包含呈现式词汇（确保清晰和组织）和事实性词汇（基于可观察发现提供准确客观描述）。手动编写这些报告耗时费力，而自动报告生成提供了一种有前景的替代方案。该过程中的关键步骤是将X光片与其对应报告对齐。然而，现有方法通常依赖完整报告进行对齐，忽略了呈现式词汇的影响。为解决此问题，我们提出FSE，一种两阶段事实序列化增强方法。在第一阶段，我们引入事实引导的对比学习用于视觉表示，通过最大化X光片与对应事实描述之间的语义对应关系。在第二阶段，我们提出证据驱动的报告生成，通过整合来自类似历史病例的结构化事实序列化见解，增强诊断准确性。在MIMIC-CXR和IU X-ray数据集上的实验（涵盖特定和一般场景）表明，FSE在自然语言生成和临床效能指标上均优于最先进方法。消融研究进一步强调了第一阶段和第二阶段中事实序列化的积极作用。代码可在https://github.com/mk-runner/FSE获取。

英文摘要

A radiology report comprises presentation-style vocabulary, which ensures clarity and organization, and factual vocabulary, which provides accurate and objective descriptions based on observable findings. While manually writing these reports is time-consuming and labor-intensive, automatic report generation offers a promising alternative. A critical step in this process is to align radiographs with their corresponding reports. However, existing methods often rely on complete reports for alignment, overlooking the impact of presentation-style vocabulary. To address this issue, we propose FSE, a two-stage Factual Serialization Enhancement method. In Stage 1, we introduce factuality-guided contrastive learning for visual representation by maximizing the semantic correspondence between radiographs and corresponding factual descriptions. In Stage 2, we present evidence-driven report generation that enhances diagnostic accuracy by integrating insights from similar historical cases structured as factual serialization. Experiments on MIMIC-CXR and IU X-ray datasets across specific and general scenarios demonstrate that FSE outperforms state-of-the-art approaches in both natural language generation and clinical efficacy metrics. Ablation studies further emphasize the positive effects of factual serialization in Stage 1 and Stage 2. The code is available at https://github.com/mk-runner/FSE.
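The alignment idea in Stage 1 can be illustrated with a generic contrastive objective. The sketch below is an assumption, not the FSE loss: it is a plain image-to-text InfoNCE loss over cosine similarities, where `image_embs[i]` and `text_embs[i]` stand for the embedding of a radiograph and of its factual description. The function names and the temperature value are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(image_embs, text_embs, temperature=0.07):
    """Mean image-to-text InfoNCE loss; pair i is the positive for image i."""
    total = 0.0
    for i, image in enumerate(image_embs):
        logits = [cosine(image, text) / temperature for text in text_embs]
        largest = max(logits)
        log_denominator = largest + math.log(sum(math.exp(l - largest) for l in logits))
        total += log_denominator - logits[i]  # negative log-softmax of the positive pair
    return total / len(image_embs)
```

The loss is small when each image embedding is closest to its own description and large when the pairing is scrambled, which is the sense in which minimizing it "maximizes semantic correspondence".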


2205.14090 2026-05-28 stat.ML cs.LG

Surrogate modeling for Bayesian optimization beyond a single Gaussian process

超越单一高斯过程的贝叶斯优化代理建模

Qin Lu, Konstantinos D. Polyzos, Bingcong Li, Georgios B. Giannakis

AI总结提出一种基于高斯过程集成（EGP）的自适应代理模型，结合汤普森采样（TS）进行贝叶斯优化，以增强表达能力和并行性，并建立了贝叶斯遗憾分析。

Comments This version added some minor corrections and clarifications to the proofs

详情

AI中文摘要

贝叶斯优化（BO）在优化具有昂贵评估代价的黑盒函数方面具有充分记录的优点。这类函数出现在超参数调优、药物发现和机器人等多样化应用中。BO依赖于贝叶斯代理模型来顺序选择查询点，以平衡搜索空间的探索与利用。大多数现有工作依赖于基于单一高斯过程（GP）的代理模型，其中核函数形式通常使用领域知识预先选择。为了绕过这种设计过程，本文利用GP的集成（E）来自适应地选择实时拟合的代理模型，从而产生具有增强表达能力的GP混合后验。然后，通过汤普森采样（TS）实现使用基于EGP的函数后验获取下一个评估输入，这不需要额外的设计参数。为了赋予函数采样可扩展性，每个GP模型采用基于随机特征的核近似。新颖的EGP-TS易于适应并行操作。为了进一步建立所提出的EGP-TS收敛到全局最优的结论，基于贝叶斯遗憾的概念对顺序和并行设置进行了分析。在合成函数和实际应用上的测试展示了所提出方法的优点。

英文摘要

Bayesian optimization (BO) has well-documented merits for optimizing black-box functions with an expensive evaluation cost. Such functions emerge in applications as diverse as hyperparameter tuning, drug discovery, and robotics. BO hinges on a Bayesian surrogate model to sequentially select query points so as to balance exploration with exploitation of the search space. Most existing works rely on a single Gaussian process (GP) based surrogate model, where the kernel function form is typically preselected using domain knowledge. To bypass such a design process, this paper leverages an ensemble (E) of GPs to adaptively select the surrogate model fit on-the-fly, yielding a GP mixture posterior with enhanced expressiveness for the sought function. Acquisition of the next evaluation input using this EGP-based function posterior is then enabled by Thompson sampling (TS) that requires no additional design parameters. To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model. The novel EGP-TS readily accommodates parallel operation. To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret for both sequential and parallel settings. Tests on synthetic functions and real-world applications showcase the merits of the proposed method.
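One way to picture "adaptively selecting the surrogate model on-the-fly" is a Bayes-rule update of per-model weights followed by sampling a model index, as in Thompson sampling over the ensemble. The sketch below assumes that form; the abstract does not spell out the update, so the functions and their names are hypothetical, and the per-model predictive log-likelihoods are taken as given inputs.

```python
import math
import random


def update_weights(weights, log_likelihoods):
    """Posterior model weights: w_m proportional to w_m * p(y_new | model m).

    Assumes all prior weights are strictly positive.
    """
    log_posterior = [math.log(w) + ll for w, ll in zip(weights, log_likelihoods)]
    largest = max(log_posterior)
    unnormalized = [math.exp(lp - largest) for lp in log_posterior]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def sample_model_index(weights, rng):
    """Draw one ensemble member with probability equal to its weight."""
    return rng.choices(range(len(weights)), weights=weights, k=1)[0]
```

In a full Thompson-sampling step one would then draw a function sample from the selected GP and maximize it to pick the next query point; that part is omitted here.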


2605.28821 2026-05-28 cond-mat.str-el cond-mat.mtrl-sci

Realization of the Ruby Lattice Antiferromagnet in Layered Transition-Metal Fluorides

层状过渡金属氟化物中红宝石晶格反铁磁体的实现

Harald O. Jeschke, Daniel Guterding, Pratyay Ghosh

AI总结本文通过层状过渡金属氟化物CsBaFe$_3$F$_{12}$和CsBaCr$_3$F$_{12}$实现了红宝石晶格反铁磁体，并利用DFT能量映射和经典蒙特卡洛模拟研究了其磁性质。

Comments 11+8 pages, 7+5 figures

详情

AI中文摘要

红宝石晶格上的反铁磁体预计会呈现一系列奇异的涌现现象，但其材料实现一直难以捉摸。这里我们展示，含有Fe$^{3+}$和Cr$^{3+}$离子的层状过渡金属氟化物CsBaFe$_3$F$_{12}$和CsBaCr$_3$F$_{12}$实现了仅轻微畸变的红宝石晶格几何结构，自旋矩分别为$S=5/2$和$S=3/2$。通过DFT能量映射计算的微观哈密顿量主要由红宝石层内的短程反铁磁相互作用主导。经典蒙特卡洛模拟揭示了两种化合物中的强烈阻挫，六边形格点上具有局部奈尔关联，而由较弱的三角形链接主导不同的长程有序趋势。对于CsBaFe$_3$F$_{12}$，计算的热力学行为与实验报告的磁有序尺度一致。对于CsBaCr$_3$F$_{12}$，经典蒙特卡洛和Luttinger-Tisza分析揭示了竞争的低能有序波矢、强烈的有限尺寸敏感性以及向非公度有序的趋势。总体而言，我们的结果确立了这些氟化物作为实验可访问的红宝石晶格反铁磁体，并为未来的中子散射研究提供了定量预测。

英文摘要

The antiferromagnet on the ruby lattice is expected to host a range of exotic emergent phenomena, yet its material realization has remained elusive. Here we show that the layered transition metal fluorides CsBaFe$_3$F$_{12}$ and CsBaCr$_3$F$_{12}$ with Fe$^{3+}$ and Cr$^{3+}$ ions realize only slightly distorted ruby lattice geometries with spin moments $S=5/2$ and $S=3/2$, respectively. Their microscopic Hamiltonians, calculated with DFT energy mapping, are dominated by short-ranged antiferromagnetic interactions within the ruby layers. Classical Monte Carlo simulations reveal strong frustration in both compounds, with local Néel correlations on the hexagonal plaquettes and distinct long-range ordering tendencies governed by weaker triangular links. For CsBaFe$_3$F$_{12}$, the calculated thermodynamic behaviour is consistent with the experimentally reported magnetic ordering scale. For CsBaCr$_3$F$_{12}$, classical Monte Carlo and Luttinger-Tisza analysis reveal competing low-energy ordering wave vectors, strong finite-size sensitivity, and a tendency toward incommensurate order. Overall, our results establish these fluorides as experimentally accessible ruby-lattice antiferromagnets and provide quantitative predictions for future neutron-scattering studies.


2605.28817 2026-05-28 astro-ph.CO

Fewer simulations, sharper covariances: Reducing mock covariance noise with Zeldovich approximation control variates

更少的模拟，更尖锐的协方差：使用Zeldovich近似控制变量减少模拟协方差噪声

Boryana Hadzhiyska, Martin White

AI总结提出一种控制变量方法，通过配对模拟与Zeldovich近似实现，利用已知统计特性减少大尺度结构功率谱协方差矩阵估计的方差，在DESI等巡天中显著提升大尺度估计精度。

Comments 18 pages, 10 figures

详情

AI中文摘要

我们提出了一种控制变量方法，用于减少来自大尺度结构模拟的功率谱协方差矩阵估计的方差。关键思想是将每个模拟与一个共享相同初始条件的廉价Zeldovich近似实现配对，并利用Zeldovich场的已知统计特性从协方差估计器中移除相关的样本方差。在高斯不连通近似下，我们推导了最优控制变量系数$β(k,\ell;k',\ell')$和相应相关系数$ρ(k,\ell;k',\ell')$的完全解析表达式，这些表达式用目标场和控制场的自功率谱和互功率谱表示。在单极子情况下，相关系数具有特别简单的形式$ρ(k,k') = r^2(k) r^2(k')$，其中$r(k)$是目标场与Zeldovich场之间的标准互相关系数，这意味着只要两个场强相关，协方差估计就保持高效。对于类似于暗能量光谱仪器（DESI）的发光红星系的掩蔽红移空间对数正态模拟，我们发现控制变量估计器在大尺度（$k \lesssim 0.05\,h\,{\rm Mpc}^{-1}$）上将协方差矩阵的方差减少了大约一个数量级，而这正是精确协方差估计最具挑战性的尺度。对于更高的$k$，增益较小，但通常将收敛速度加快2-3倍，从而显著降低了当前和未来大尺度结构巡天协方差估计的计算成本。由于其简单性，该方法易于在当前成像和光谱巡天（例如DESI、Euclid、LSST、PFS、SPHEREx）中实现。

英文摘要

We present a control-variate method for reducing the variance of power spectrum covariance matrix estimates from simulations of large-scale structure. The key idea is to pair each mock simulation with a cheap Zeldovich-approximation realization sharing the same initial conditions, and to use the known statistical properties of the Zeldovich field to remove correlated sample variance from the covariance estimator. Under a Gaussian disconnected approximation, we derive fully analytic expressions for both the optimal control-variate coefficient, $β(k,\ell;k',\ell')$, and the corresponding correlation, $ρ(k,\ell;k',\ell')$, in terms of the auto- and cross-power spectra of the target and control fields. In the monopole case, the correlation takes the particularly simple form $ρ(k,k') = r^2(k)\,r^2(k')$, where $r(k)$ is the standard cross-correlation coefficient between the target and Zeldovich fields, implying that covariance estimation remains highly efficient whenever the two fields are strongly correlated. For masked redshift-space lognormal mocks, resembling Luminous Red Galaxies from the Dark Energy Spectroscopic Instrument (DESI), we find that the control-variate estimator reduces the variance of the covariance matrix by approximately an order of magnitude on large scales, $k \lesssim 0.05\,h\,{\rm Mpc}^{-1}$, precisely where accurate covariance estimation is most challenging. The gains are smaller for higher $k$ but typically accelerate convergence by a factor of 2-3, substantially lowering the computational cost of covariance estimation for current and upcoming large-scale structure surveys. Due to its simplicity, this method is readily implementable in current imaging and spectroscopic surveys (e.g., DESI, Euclid, LSST, PFS, SPHEREx).
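The control-variate mechanism itself can be shown on a scalar toy problem. The sketch below is an assumption-laden illustration, not the paper's covariance estimator: an "expensive" quantity `x` is paired with a correlated "cheap" quantity `y` whose mean is known exactly, and the estimator subtracts $β\,(\bar y - \mathbb{E}[y])$ with $β = \mathrm{Cov}(x,y)/\mathrm{Var}(y)$. For the optimal $β$ the variance shrinks by a factor $1-ρ^2$, where $ρ$ is the correlation between `x` and `y`.

```python
import random
import statistics


def control_variate_mean(x, y, y_mean_known):
    """Estimate E[x] using y (with known mean) as a control variate."""
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    cov_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (len(x) - 1)
    beta = cov_xy / statistics.variance(y)  # sample estimate of the optimal coefficient
    return mean_x - beta * (mean_y - y_mean_known), beta


def simulate(n_trials, n_samples, seed=0):
    """Compare the variance of the plain mean and the control-variate mean."""
    rng = random.Random(seed)
    plain, corrected = [], []
    for _ in range(n_trials):
        y = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]      # cheap control, mean 0
        x = [1.0 + yi + 0.3 * rng.gauss(0.0, 1.0) for yi in y]   # target, true mean 1
        plain.append(statistics.fmean(x))
        estimate, _ = control_variate_mean(x, y, 0.0)
        corrected.append(estimate)
    return statistics.variance(plain), statistics.variance(corrected)
```

Calling `simulate(400, 50)` returns the two empirical variances; with the strong correlation built into this toy setup the corrected estimator should be markedly less noisy, mirroring the statement that efficiency stays high whenever target and control are strongly correlated.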


2605.28815 2026-05-28 quant-ph cond-mat.mes-hall cond-mat.str-el

A cryogenic apparatus for coupling two-dimensional materials to a confocal multimode optical cavity

用于将二维材料耦合到共焦多模光学腔的低温装置

Han S. Hiller, Pranav Parakh, Samuel H. Aronson, Kenji Maeda, Di Lao, Julian Stewart, Zengde She, Jierong Wang, Xiaodong Xu, Tony Heinz, Benjamin L. Lev

AI总结本文介绍了一种低温装置，通过可调谐共焦腔增强二维材料中的光-物质耦合，用于研究相干拉曼激发和集体电子相。

Comments 12 pages, 8 figures

详情

AI中文摘要

二维范德华材料展现出多种关联电子相，光学驱动为操控这些相提供了有前景的途径。例如，腔增强连续波拉曼激发被认为是一种通过材料激子相干且超辐射地布居声子或电荷密度波的方法。通过足够强的电子-声子耦合，可以维持稳态声子布居，从而驱动新颖的集体响应。我们描述了一个为满足此类实验要求而构建的装置：即一个超高真空系统，内部包含一个长度可调的共焦法布里-珀罗腔，腔内放置样品，两者均低温冷却并稳定以抵抗振动。四轴纳米定位器用于对准样品，并支持用于样品载流子密度调制和输运测量的电引线。通过多模腔的透射可实现原位样品成像以进行对准；本工作中的样品是过渡金属二硫族化物。在近共焦几何结构下操作可将光场集中到局域超模中，从而显著增强光-物质耦合。尽管腔长在毫米尺度，为样品对准和更换提供了空间，但这种增强仍然得以保持。

英文摘要

Two-dimensional van der Waals materials exhibit a variety of correlated electron phases, and optical driving offers a promising route toward manipulating them. For example, cavity-enhanced, continuous-wave (CW) Raman excitation has been suggested as a way to coherently and superradiantly populate phonons or charge density waves via material excitons. A steady-state phonon population may be sustained with sufficiently strong electron-phonon coupling to drive novel collective response. We describe an apparatus built to meet the requirements of such an experimental program: Namely, an ultrahigh-vacuum system housing a length-tunable confocal Fabry-Pérot cavity with an intracavity sample, both cryogenically cooled and stabilized against vibrations. A four-axis nanopositioner aligns the sample and supports electrical leads for sample carrier density modulation and transport measurements. Transmission through the multimode cavity enables in situ sample imaging for alignment; the sample is a transition metal dichalcogenide in this work. Operating near the confocal geometry concentrates the optical field into a localized supermode that substantially enhances light-matter coupling. This enhancement is preserved despite the millimeter-scale cavity length, which provides room for sample alignment and exchange.


2605.28813 2026-05-28 nucl-th hep-ph nucl-ex

Quantum effects in the quadrupole rotor picture of ultra-relativistic ion-ion collisions

超相对论离子-离子碰撞中四极转子图像的量子效应

Stavros Bofos, Yi Li, Chenrong Ding, Benjamin Bally, Thomas Duguet, Mikael Frosini, Jiangming Yao

AI总结本文通过比较量子四极转子与其经典刚体转子极限，系统评估了超相对论离子-离子碰撞中方位角流对核内禀形变敏感性的经典解释的有效性，发现量子贡献在轻核和球形核中占主导，而在重形变核中低于10%，表明定量解释需超越经典刚体转子范式。

Comments 14 pages, 11 figures

详情

AI中文摘要

在超相对论离子-离子碰撞中观测到的方位角强子流提供了对碰撞核中多体基态关联的灵敏探针。特别是，与核“内禀形变”相关的集体关联预计会在特定的末态可观测量上留下显著印记。然而，尽管原子核本质上是量子系统，这些效应通常是在经典刚体转子图像下解释的。在本快报中，通过比较量子四极转子与其经典刚体转子极限，系统评估了这种解释在整个核素图上的有效性。与核子费米子性质相关的量子贡献被证明在很大程度上独立于壳效应，因此也独立于内禀形变。虽然它们在轻核和/或球形核中几乎占据了量子转子有效四极形变的全部，但在内禀形变良好的重核中降至10%以下。本快报表明，对末态可观测量中核结构效应的定量解释需要超越经典刚体转子范式。除了目前量化的量子贡献外，还必须进一步纳入和表征与集体振动和非集体核子运动相关的关联。

英文摘要

The azimuthal hadronic flow observed in ultra-relativistic ion-ion collisions provides a sensitive probe of many-body ground-state correlations in the colliding nuclei. In particular, collective correlations associated with nuclear "intrinsic deformation" are expected to leave pronounced fingerprints on specific final-state observables. However, such effects are commonly interpreted within a classical rigid-rotor picture, despite the intrinsically quantum nature of nuclei. In this Letter, the validity of this interpretation is assessed systematically across the nuclear chart by comparing the quantum quadrupole rotor with its classical rigid-rotor limit. Quantum contributions associated with the fermionic nature of the nucleons are shown to be largely independent of shell effects, and hence of the intrinsic deformation. While they account for nearly all of the quantum rotor effective quadrupole deformation in light and/or spherical nuclei, they drop below 10% in intrinsically well deformed heavy nuclei. The present letter demonstrates that a quantitative interpretation of nuclear-structure effects in final-state observables requires going beyond the classical rigid-rotor paradigm. Beyond the quantum contributions quantified presently, correlations associated with collective vibrations and with the non-collective nucleonic motion must be further included and characterized.


2605.28808 2026-05-28 quant-ph

Device-Agnostic Microwave Noise Metrology for Nonlinear Cryogenic Quantum Devices

非线性低温量子器件的设备无关微波噪声计量

Andrea Celotto, Alessandro Alocco, Bernardo Galvano, Luca Fasolo, Emanuele Palumbo, Luca Callegaro, Luca Oberto, Patrizia Livreri, Emanuele Enrico

AI总结提出一种基于可控噪声源替代被测器件的原位噪声计量协议，结合普朗克光谱与散射参数校准，实现非线性低温微波器件增益与输入参考附加噪声的便携式表征。

Comments 12 pages, 13 figures

详情

AI中文摘要

能够实现近量子极限信号处理的微波器件是固态量子技术工具箱中的关键组件。通过放大器、混频器、隔离器等对单光子微波信号进行操控和读出，必须在信号完整性方面满足严格要求以确保可靠运行。这些有源微波量子器件在复杂的低温电子装置中工作，这给它们的表征带来了挑战，因为所有相关品质因数必须在其端口的参考平面处表示。尽管低温S参数校准并非易事，但计量方法正趋向于严格的方法。此外，保持信号完整性必须通过被测器件（DUT）端口的绝对噪声水平来量化，这需要绝对功率参考。在这项工作中，我们提出了一种基于可控噪声源替代DUT的原位噪声计量协议。我们通过证明将噪声源放置在DUT输入端会影响校准与DUT特性的可分离性来论证这一选择。我们提出的架构结合了使用可变温度台的普朗克光谱与短路-开路-负载-互易散射参数校准，使得噪声和散射量参考相同的低温参考平面。在这种配置中，读出链的校准与DUT的内部动力学分离。作为一个高要求的用例，我们将该协议应用于约瑟夫森行波参量放大器，并在激活多模非线性行为的泵浦条件下提取其增益和输入参考附加噪声。这说明了我们的设备无关协议如何支持非线性低温微波器件的便携式噪声表征。

英文摘要

Microwave devices capable of near-quantum-limited signal processing are essential components in the toolbox of solid-state quantum technologies. The manipulation and readout of single-photon microwave signals through amplifiers, mixers, isolators, etc. must fulfill strict requirements in terms of signal integrity to ensure reliable operation. These active microwave quantum devices operate in complex cryo-electronic setups. This poses challenges to their characterization, since all relevant figures of merit must be expressed at the reference planes of their ports. Even though cryogenic S-parameter calibration is non-trivial, metrological approaches are converging toward rigorous methods. Furthermore, preserving signal integrity must be quantified via absolute noise levels at the ports of the Device Under Test (DUT), requiring an absolute power reference. In this work, we present an in situ noise metrology protocol based on substituting a controllable noise source for the DUT. We motivate this choice by showing that placing the noise source at the DUT input impacts the separability of the calibration from the DUT characteristics. Our proposed architecture combines Planck spectroscopy using a Variable Temperature Stage with Short-Open-Load-Reciprocal scattering-parameter calibration, so that noise and scattering quantities are referred to the same cryogenic reference planes. In this configuration, the readout-chain calibration is separated from the internal dynamics of the DUT. As a demanding use case, we apply the protocol to a Josephson Traveling Wave Parametric Amplifier and extract its gain and input-referred added noise under pump conditions activating multimode nonlinear behavior. This illustrates how our device-agnostic protocol supports portable noise characterization of nonlinear cryogenic microwave devices.
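The role of a variable-temperature noise source as an absolute power reference can be sketched with the textbook Planck-spectroscopy model. This is an assumption, not the authors' calibration pipeline: a matched load at temperature $T$ emits $n(T) = \tfrac12 \coth\!\big(hf / 2k_BT\big)$ noise photons per unit bandwidth and time, and a linear chain outputs a power spectral density $P = G\,hf\,(n(T) + n_{\rm add})$. A straight-line fit of $P$ against $n(T)$ then yields the chain gain $G$ and its added noise $n_{\rm add}$. Function names are hypothetical.

```python
import math

H = 6.62607015e-34   # Planck constant, J s
K_B = 1.380649e-23   # Boltzmann constant, J / K


def thermal_photons(freq_hz, temp_k):
    """Noise photon number of a matched load, vacuum fluctuations included."""
    return 0.5 / math.tanh(H * freq_hz / (2.0 * K_B * temp_k))


def fit_gain_and_added_noise(freq_hz, temps_k, powers_w_per_hz):
    """Least-squares line P = slope * n(T) + intercept, returned as (gain, n_add)."""
    n = [thermal_photons(freq_hz, t) for t in temps_k]
    n_mean = sum(n) / len(n)
    p_mean = sum(powers_w_per_hz) / len(powers_w_per_hz)
    slope = sum((ni - n_mean) * (pi - p_mean) for ni, pi in zip(n, powers_w_per_hz))
    slope /= sum((ni - n_mean) ** 2 for ni in n)
    intercept = p_mean - slope * n_mean
    gain = slope / (H * freq_hz)   # linear power gain
    n_add = intercept / slope      # added noise in photons, referred to the source plane
    return gain, n_add
```

Because the fit refers both quantities to the plane of the noise source, placing that source where the DUT would sit is what ties the noise reference to the same plane as the scattering-parameter calibration.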


2605.28804 2026-05-28 astro-ph.CO hep-ph

Whispers of Supergravity in Gravitational Wave Backgrounds: Determining the Gravitino Mass from Cosmic Thermal History

引力波背景中的超引力低语：从宇宙热历史确定引力微子质量

Angus Spalding, Stephen F. King

AI总结本文提出通过随机引力波背景的特征频率直接推断引力微子质量及其初始丰度，从而探测远超对撞机实验能标的超引力参数空间。

详情

AI中文摘要

超过电弱能标的引力微子质量为引力微子问题提供了最简单的解决方案，但如此大的质量标度远远超出了对撞机实验的探测范围。我们证明，随机引力波背景提供了直接探测这一原本不可及区域的手段。尽管这些引力微子在太初核合成（BBN）之前就衰变，但它们自然地在早期宇宙中产生了一个早期物质主导时期。这一非标准纪元会在任何原初引力波背景上留下特征印记，由对应于该阶段开始和结束的两个频率表征。我们证明，这些特征可以直接用于推断引力微子质量及其初始丰度，形成直接映射。未来的引力波观测台覆盖了广泛的频率范围，能够探测从BBN约束的$\mathcal{O}(100)\,\text{TeV}$一直到$\mathcal{O}(10^{10})\,\text{TeV}$的引力微子质量，而NANOGrav的最新信号已经探测到$500$-$10^4$ TeV范围内的质量。因此，引力波可观测量探测了巨大的参数空间，远远超出了对撞机实验的能标。我们正进入一个可以通过引力波背景与对撞机实验共同探测超引力的时代。

英文摘要

Gravitino masses above the electroweak scale provide the simplest solution to the gravitino problem, but such large mass scales lie far beyond the reach of collider experiments. We show that the stochastic gravitational wave background offers a direct probe of this otherwise inaccessible regime. Despite decaying before Big-Bang Nucleosynthesis (BBN), these gravitinos naturally generate a period of early matter domination in the early universe. This non-standard epoch leaves a characteristic imprint on any primordial gravitational wave background, characterised by two frequencies corresponding to the onset and end of this phase. We demonstrate that these features can be mapped directly onto both the gravitino mass and its initial abundance. Future gravitational wave observatories span a vast frequency range, enabling sensitivity to gravitino masses from the BBN bound of $\mathcal{O}(100)\,\text{TeV}$ all the way up to $\mathcal{O}(10^{10})\,\text{TeV}$, with the recent NANOGrav signal already probing masses in the range $500$-$10^4$ TeV. Gravitational wave observables therefore probe an enormous region of parameter space, far beyond the reach of collider experiments. We are entering an era in which supergravity can be probed through gravitational wave backgrounds alongside collider experiments.


2605.28801 2026-05-28 math.PR

Microscopic Weak Selection Principle for the Logistic Branching Brownian Motion with selection

带选择的逻辑斯蒂分支布朗运动的微观弱选择原理

F. E. Bravo Lozano, M. C. Fittipaldi

AI总结本文提出逻辑斯蒂分支布朗运动（Log-BBM）作为FKPP型方程的微观模型，证明在大种群极限下其重整化经验测度弱收敛到非局部FKPP方程的解，并展示该模型表现出Brunet-Derrida系统的弱选择原理，即粒子选择最小传播速度。

Comments 40 pages, 1 figure

详情

AI中文摘要

在这项工作中，我们提出了带选择的逻辑斯蒂分支布朗运动（Log-BBM），这是对Groisman等人（2020）定义的N-BBM的修改，其中出生和竞争事件被解耦，以允许种群大小可变，该大小遵循Lambert（2005）定义的逻辑斯蒂增长分支过程。我们研究了Log-BBM作为FKPP型方程的微观模型的表示。在大种群极限下，Log-BBM的重整化经验测度弱收敛到一个概率测度，其密度求解了FKPP方程的非局部版本，而其累积分布函数求解了经典的FKPP方程。我们还表明，该模型表现出Brunet-Derrida系统族的特征行为，特别是所谓的弱选择原理。实际上，我们证明，在Log-BBM中，粒子选择最小传播速度，这与FKPP方程中选定的前沿速度依赖于初始条件的情况形成对比。

英文摘要

In this work, we present the logistic branching Brownian motion with selection (Log-BBM), a modification of the N-BBM defined by Groisman et al. (2020), in which birth and competition events are decoupled to allow for a variable population size that follows the branching process with logistic growth defined by Lambert (2005). We study the representation of the Log-BBM as a microscopic model for an FKPP-type equation. In the large population limit, the renormalised empirical measure of the Log-BBM converges weakly to a probability measure whose density solves a nonlocal version of the FKPP equation, while its cumulative distribution function solves the classical FKPP equation. We also show that this model exhibits behaviour that is characteristic of the Brunet-Derrida family of systems, in particular the so-called weak selection principle. Indeed, we show that, in the Log-BBM, the particles select the minimal propagation speed, in contrast to the FKPP equation, where the selected front speed depends on the initial condition.
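The contrast drawn in the last sentence rests on a standard linear-front computation for the classical FKPP equation. The worked step below assumes the common normalization with diffusion coefficient $\tfrac12$ and unit growth rate; the abstract does not state the normalization, so the constants are assumptions.

```latex
\partial_t u = \tfrac{1}{2}\,\partial_{xx} u + u(1-u),
\qquad
u(t,x) \approx e^{-\lambda (x - c t)} \quad \text{far ahead of the front}
\;\Longrightarrow\;
\lambda c = \tfrac{1}{2}\lambda^{2} + 1,
\qquad
c(\lambda) = \frac{\lambda}{2} + \frac{1}{\lambda},
\qquad
c^{*} = \min_{\lambda > 0} c(\lambda) = c(\sqrt{2}) = \sqrt{2}.
```

Under this normalization, an initial condition whose tail decays like $e^{-\lambda x}$ with $\lambda < \sqrt{2}$ propagates at the faster speed $c(\lambda) > c^{*}$, which is the dependence on the initial condition that the particle system is shown to avoid by selecting $c^{*}$.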


2605.28800 2026-05-28 cond-mat.mtrl-sci cond-mat.mes-hall

A GPU-based Solver for Polarization Dynamics in Ferroelectric Materials

基于GPU的铁电材料极化动力学求解器

Ali Hasan, Edoardo Piccolo, Anna Giordano, Natalya Fedorova, Jorge Íñiguez-González, Davi Rodrigues, Giovanni Finocchio

AI总结提出一个全GPU加速的可扩展数值求解器PETASPIN_microelectrics，利用Ginzburg-Landau形式计算铁电系统的完整极化矢量场，通过优化全静电场计算和并行执行，准确再现相变、磁滞回线及三维混合斯格明子等关键现象。

详情

AI中文摘要

铁电材料可用于开发结合非易失性、小尺寸、低功耗驱动和电可调性的多种器件概念。这种发展需要高效精确的设计工具来描述极化织构。然而，大多数现有的铁电求解器基于CPU，并依赖于简化的静电处理以及极化场的降维表示。这些近似限制了它们捕捉有限尺寸和边界效应的能力，并限制了可实际模拟的畴结构和畴壁的范围。在此，我们提出一个全GPU（图形处理单元）加速的可扩展数值求解器，名为PETASPIN_microelectrics，用于使用Ginzburg-Landau形式计算铁电系统的完整极化矢量场。我们的求解器包含优化且验证过的全静电场计算，并支持多个模拟的并行执行。我们通过几个基准问题系统验证了该求解器，包括BaTiO3中的相变和铁电畴壁轮廓。我们的模拟再现了BaTiO3中温度驱动的滞后相变。我们还再现了磁滞回线，并展示了在PbTiO3/SrTiO3双层系统中三维混合斯格明子的稳定化。我们的结果与解析理论预测及先前的实验研究定量一致。所提出的求解器为铁电材料的大规模模拟（包括拓扑织构的稳定化）提供了一个高效、准确的平台，支持下一代铁电器件设计的预测建模。

英文摘要

Ferroelectric materials can be used for the development of multiple device concepts combining non-volatility, small dimensions, low-power actuation, and electrical tunability. Such development demands efficient and precise design and simulation tools that describe the polarization texture. However, most existing ferroelectric solvers are CPU-based and rely on simplified electrostatic treatments and reduced-dimensional representations of the polarization field. These approximations limit their ability to capture finite-size and boundary effects and restrict the range of domain structures and domain walls that can be realistically simulated. Here, we present a fully GPU (graphics processing unit)-accelerated and scalable numerical solver, named PETASPIN_microelectrics, for computing the full polarization vector field of ferroelectric systems using the Ginzburg-Landau formalism. Our solver incorporates an optimized and validated calculation of the full electrostatic field and enables the parallel execution of multiple simulations. We systematically validated the solver with several benchmark problems, including phase transitions in BaTiO3 and ferroelectric domain wall profiles. Our simulations reproduce temperature-driven hysteretic phase transitions in BaTiO3. We also reproduce hysteresis loops and demonstrate stabilization of a three-dimensional hybrid skyrmion in a PbTiO3/SrTiO3 bilayer system. Our results show quantitative agreement with predictions from an analytical theory and prior experimental studies. The proposed solver provides an efficient, accurate platform for large-scale simulations of ferroelectric materials, including the stabilization of topological textures, supporting predictive modeling for next-generation ferroelectric device design.
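The Ginzburg-Landau dynamics behind such a solver can be illustrated in its simplest form. The sketch below is a zero-dimensional toy, not PETASPIN_microelectrics: it assumes a single uniform polarization $P$, a Landau free energy $F(P) = \tfrac{a}{2}P^2 + \tfrac{b}{4}P^4 - EP$ with $a<0$ and $b>0$, and relaxational dynamics $\dot P = -L\,\partial F/\partial P$ integrated with explicit Euler steps. Gradient and electrostatic terms, which the paper emphasizes, are deliberately left out, and all names and parameter values are hypothetical.

```python
def landau_force(p, a, b, field):
    """Thermodynamic driving force -dF/dP for F = a/2 P^2 + b/4 P^4 - E P."""
    return -(a * p + b * p ** 3 - field)


def relax(p0, a, b, field=0.0, mobility=1.0, dt=0.01, steps=5000):
    """Explicit-Euler relaxation of dP/dt = mobility * (-dF/dP)."""
    p = p0
    for _ in range(steps):
        p += dt * mobility * landau_force(p, a, b, field)
    return p
```

With `a = -1.0` and `b = 1.0` the zero-field minima sit at $P = \pm 1$, so `relax(0.1, -1.0, 1.0)` settles near $+1$ and `relax(-0.1, -1.0, 1.0)` near $-1$. A field above the coercive value of this toy model (about $0.385$ for these parameters) switches a negative state to the positive branch, which is the single-cell origin of a hysteresis loop.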


2605.28799 2026-05-28 cond-mat.mtrl-sci

Synthesis and properties of bulk Mg$_3$WN$_4$ in a wurtzite-derived structure

纤锌矿衍生结构中块体Mg$_3$WN$_4$的合成与性质

Anna A. Berseneva, Christopher L. Rom, Layton Rudolph, Yunseung Kuk, P. Shiv Halasyamani, Rebecca W. Smaha, James R. Neilson, Andriy Zakutayev

AI总结通过固态复分解反应合成纤锌矿衍生结构的块体Mg$_3$WN$_4$，利用原位X射线衍射和非原位合成研究其形成机制、结构及光学性质。

Comments 20 pages, 5 figures

详情

AI中文摘要

对理论预测材料进行实验合成，并控制元素配位环境，可实现有用性质（如离子传输或铁电开关）的实现。其中，Mg-W-N组成空间中的新型三元氮化物是此类材料之一，最近在块体和薄膜形式中预测并合成了几种新的稳定和亚稳化合物。本文首次报道了通过固态复分解反应在纤锌矿衍生晶体结构中块体合成Mg$_3$WN$_4$。原位同步辐射粉末X射线衍射显示离子交换如何从Li$_6$WN$_4$ + 3 MgCl$_2$前驱体进行到Mg$_3$WN$_4$ + 6 LiCl产物，反应在380°C附近缓慢开始，到600°C完成，包括在440°C以上出现竞争的无序岩盐衍生相(Mg,W)N。随后在400°C下进行0.5小时、使用10%过量MgCl$_2$的非原位粉末合成揭示了纤锌矿衍生Mg$_3$WN$_4$结构的阳离子有序性质，并通过二次谐波产生测量确认了极性对称性。光学吸收光谱、化学成分分析和电子显微镜成像表明，块体纤锌矿Mg$_3$WN$_4$容易形成缺陷。总体而言，本研究表明，通过仔细控制反应的热预算，可以根据原位测量选择性地非原位合成相纯三元氮化物，并为纤锌矿Mg$_3$WN$_4$的性质表征铺平了道路。

英文摘要

Experimental synthesis of theoretically predicted materials with controlled elemental coordination environments can lead to realization of useful properties, such as facile ion transport or ferroelectric switching. Among such materials are new ternary nitrides in the Mg-W-N composition space, where several new stable and metastable compounds have been predicted and synthesized recently in bulk and film forms. Here, we report for the first time on the bulk synthesis of Mg$_3$WN$_4$ in a wurtzite-derived crystal structure via a solid-state metathesis reaction. In situ synchrotron powder X-ray diffraction shows how the ion exchange proceeds from Li$_6$WN$_4$ + 3 MgCl$_2$ precursors to Mg$_3$WN$_4$ + 6 LiCl products, with the reaction starting slowly near 380 $^\circ$C and completing by 600 $^\circ$C, including the presence of a competing disordered rocksalt-derived phase (Mg,W)N above 440 $^\circ$C. The follow-up ex situ powder synthesis at 400 $^\circ$C for 0.5 hour with 10% excess MgCl$_2$ reveals the cation-ordered nature of the wurtzite-derived Mg$_3$WN$_4$ structure with polar symmetry confirmed by second harmonic generation measurements. Optical absorption spectra, chemical composition analysis, and electron microscopy imaging suggest that bulk wurtzite Mg$_3$WN$_4$ is prone to defect formation. Overall, this study shows that selective ex situ synthesis of phase-pure ternary nitrides, informed by in situ measurements, is possible by carefully controlling the thermal budget of the reaction, and paves the way towards property characterization of wurtzite Mg$_3$WN$_4$.


2605.28798 2026-05-28 physics.chem-ph cond-mat.mtrl-sci physics.comp-ph

How reproducible are first-principles simulations of liquid water?

液态水的第一性原理模拟的可重复性如何？

Niamh O'Neill, Benjamin X. Shi, William J. Baldwin, Albert P. Bartok, Chris J. Pickard, Angelos Michaelides, Gabor Csanyi, Timothy C. Berkelbach

AI总结通过结合机器学习势和收敛的DFT训练数据，解决了使用revPBE-D3泛函的液态水模拟中存在的显著差异，提供了可靠的基准值。

详情

AI中文摘要

液态水至关重要，其精确的计算机模拟推动了无数方法学的发展。基于密度泛函理论（DFT）力的从头算分子动力学现已成为研究人员广泛使用的标准工具。然而，我们发现，使用相同广泛使用的密度泛函（revPBE-D3）的液态水先前研究之间存在显著差异，扩散系数变化超过20%，密度变化超过10%，这引发了关于可重复性的根本问题。通过结合能够进行稳健统计采样的现代长程机器学习原子间势和仔细收敛的DFT训练数据，我们解决了这些差异，在六个不同的社区代码中达成共识。我们的预测与先前文献显著不同：我们表明，由于基组不完备和赝势不一致，加上统计采样的局限性（在某些情况下），大多数先前结果高估了密度并低估了revPBE-D3水的扩散系数。这些基准值为验证当前和未来基于DFT的从头算分子动力学实现提供了可靠参考。达成一致建立了信心和可信度，并为系统评估新密度泛函和数值近似提供了前提。

英文摘要

Liquid water is fundamentally important, and its accurate computer simulation has been the driving force for myriad methodological developments. Ab initio molecular dynamics with forces obtained from density functional theory (DFT) is now a standard tool widely used by researchers. However, we reveal that previous studies of liquid water using the same widely-used density functional (revPBE-D3) exhibit significant discrepancies with one another, varying by over 20% in the diffusion coefficient and 10% in the density, raising fundamental questions about reproducibility. By combining modern long-range machine-learning interatomic potentials that enable robust statistical sampling with carefully converged DFT training data, we resolve these discrepancies, achieving consensus across six diverse community codes. Our predictions differ markedly from previous literature: we show that most previous results overestimate the density and underestimate the diffusion coefficient of revPBE-D3 water due to basis set incompleteness and pseudopotential inconsistencies, coupled with limitations in statistical sampling (in some cases). These benchmark values provide a reliable reference for validating current and future implementations of DFT-based ab initio molecular dynamics. Reaching agreement establishes confidence and credibility and serves as a prerequisite for the systematic assessment of new density functionals and numerical approximations.
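For readers unfamiliar with how a diffusion coefficient is extracted from a trajectory, the usual route is the Einstein relation, $D = \lim_{t\to\infty} \mathrm{MSD}(t)/6t$ in three dimensions. The sketch below assumes that route (the abstract does not say which estimator the authors use) and checks it on synthetic free Brownian motion rather than on water; the function names are hypothetical, and finite-size corrections that matter in periodic simulations are not included.

```python
import random


def brownian_msd(n_particles, n_steps, d_true, dt, seed=0):
    """Mean-squared displacement of free 3D Brownian particles with known D."""
    rng = random.Random(seed)
    sigma = (2.0 * d_true * dt) ** 0.5  # per-axis displacement std per step
    sums = [0.0] * n_steps
    for _ in range(n_particles):
        x = y = z = 0.0
        for k in range(n_steps):
            x += rng.gauss(0.0, sigma)
            y += rng.gauss(0.0, sigma)
            z += rng.gauss(0.0, sigma)
            sums[k] += x * x + y * y + z * z
    times = [(k + 1) * dt for k in range(n_steps)]
    msd = [s / n_particles for s in sums]
    return times, msd


def diffusion_from_msd(times, msd):
    """Least-squares slope of MSD(t) divided by 6 (Einstein relation in 3D)."""
    t_mean = sum(times) / len(times)
    m_mean = sum(msd) / len(msd)
    slope = sum((t - t_mean) * (m - m_mean) for t, m in zip(times, msd))
    slope /= sum((t - t_mean) ** 2 for t in times)
    return slope / 6.0
```

The statistical scatter of this estimate shrinks only slowly with the number of independent samples, which is one reason the abstract stresses robust statistical sampling when comparing diffusion coefficients across studies.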
