arXivDaily: daily arXiv academic digest, updated Monday to Friday
physics.data-an (Data Analysis): 9 papers
2606.12393 2026-06-11 hep-ph hep-th physics.data-an New submission

The Fundaments of Unity: ${\mathcal O}(1)$ Couplings in Quantum Field Theories

Ben Allanach (Cambridge U., DAMTP)

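AI summary: This paper critically examines the expectation that dimensionless couplings in a fundamental quantum field theory should be of order unity. It proposes the spread (the ratio of the largest to the smallest coupling magnitude) as a measure of this property and finds that, even when the couplings are independent and identically distributed, ratios can be far larger than expected.

Details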
Comments
15 pages, 3 figures
English abstract

We critically examine the expectation that in a fundamental quantum field theory, dimensionless couplings in the Lagrangian density should all be of order unity. We propose a measure to quantify the adherence of a theory to this: the spread (the ratio of the largest to the smallest of the magnitudes) of such dimensionless couplings, obtaining various closed-form results. If we take independent identically distributed (IID) couplings to parameterise our uncertainty on the values of the order unity couplings, ratios of couplings can be much larger than one might naively expect. For a theory with 20 IID unit normal couplings, the probability that the magnitude of the ratio of two of them is greater than 100 is 0.29, for example. Even when the IID couplings have exponentially suppressed tails, the distribution of ratios of order one couplings has fat power-law tails which grow with the number of independent couplings.
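
The spread statistic in the abstract can be explored numerically. The following is a minimal Monte Carlo sketch, not the paper's closed-form method. It assumes the quoted probability refers to the spread (largest over smallest magnitude) of 20 IID unit normal couplings exceeding 100. The function names, trial count, and seed are illustrative choices.

```python
import random


def spread(couplings):
    """Ratio of the largest to the smallest coupling magnitude."""
    magnitudes = [abs(g) for g in couplings]
    return max(magnitudes) / min(magnitudes)


def prob_spread_exceeds(n_couplings, threshold, n_trials=20000, seed=0):
    """Monte Carlo estimate of P(spread > threshold) for IID unit normal couplings."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_trials):
        # Draw one hypothetical "theory": n_couplings IID N(0, 1) values.
        couplings = [rng.gauss(0.0, 1.0) for _ in range(n_couplings)]
        if spread(couplings) > threshold:
            hits += 1
    return hits / n_trials


estimate = prob_spread_exceeds(n_couplings=20, threshold=100.0)
print(f"Estimated P(spread > 100) for 20 couplings: {estimate:.3f}")
```

Under these assumptions the estimate should land close to the 0.29 quoted in the abstract, up to Monte Carlo noise of a few parts in a thousand.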

2606.12157 2026-06-11 physics.comp-ph physics.data-an physics.ins-det New submission

fitPALSpectra: Python fitting of positron annihilation lifetime spectra

Georgios E. Pavlou

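AI summary: Presents fitPALSpectra, an open-source Python workflow that uses an analytically integrated exponential-Gaussian response model, constrained optimization, and least-squares refinement to provide configurable PALS spectrum simulation, fitting, visualization, and reporting. On synthetic spectra it accurately recovers lifetimes, intensities, and other parameters.

Details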
Comments
6 pages, 2 figures
English abstract

Positron annihilation lifetime spectroscopy (PALS) spectra are commonly analyzed by fitting multi-exponential lifetime models convolved with the detector resolution function. In practice, this inverse problem is sensitive to initial parameter choices, parameter bounds, source corrections, and correlations between lifetime and intensity parameters. This paper presents fitPALSpectra, an open-source Python workflow for configurable PALS spectrum simulation, fitting, visualization, and reporting. The implementation uses an analytically integrated exponential-Gaussian response model, configurable source and sample components, constrained optimization, optional least-squares refinement, and machine-readable output of fit results, correlation matrices, and fitted curves. Validation on fully synthetic spectra with known ground-truth parameters shows accurate recovery of the simulated lifetimes, intensities, detector full width at half maximum, prompt shift, and background.
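
To make the model class concrete, here is a minimal sketch of a multi-exponential lifetime model convolved with a Gaussian resolution function, using the standard closed form for an exponentially modified Gaussian. It is not the fitPALSpectra API: the function names and parameter values are hypothetical, and the sketch evaluates the pointwise density rather than the per-channel integral that the abstract describes.

```python
import math


def fwhm_to_sigma(fwhm):
    """Convert a Gaussian full width at half maximum to a standard deviation."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def exp_gauss(t, tau, sigma, t0=0.0):
    """Unit-area exponential decay (lifetime tau, onset t0) convolved with a Gaussian."""
    z = (sigma / tau - (t - t0) / sigma) / math.sqrt(2.0)
    decay = math.exp(sigma**2 / (2.0 * tau**2) - (t - t0) / tau)
    return 0.5 / tau * decay * math.erfc(z)


def spectrum(t, lifetimes, intensities, sigma, t0=0.0, background=0.0):
    """Intensity-weighted sum of lifetime components plus a flat background."""
    components = sum(
        weight * exp_gauss(t, tau, sigma, t0)
        for tau, weight in zip(lifetimes, intensities)
    )
    return background + components


# Hypothetical two-component spectrum, times in ns.
sigma = fwhm_to_sigma(0.2)
dt = 0.001
times = [-2.0 + k * dt for k in range(10001)]
area = sum(spectrum(t, (0.125, 0.4), (0.7, 0.3), sigma) for t in times) * dt
print(f"Integrated area with unit total intensity: {area:.4f}")
```

Because each component is normalized to unit area, the intensities act directly as component fractions, which is a convenient parameterization when bounds are placed on a fit.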

2606.12097 2026-06-11 stat.AP physics.data-an New submission

Weibull-Stationary Stochastic Differential Equations for Conditional Long-Horizon Wind Power Forecasting

Luca Di Persio, Mehrdad Ghadiri

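AI summary: Proposes a probabilistic framework for one-month-ahead wind-power forecasting based on Weibull-stationary SDEs, using a heteroskedastic Kalman filter and three SDE models to produce high-resolution forecasts, with a CRPS of about 1.57 m/s and a power-domain Wasserstein distance below 1.4% of rated capacity.

Details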
English abstract

We present a one-month-ahead conditional probabilistic framework for wind-power forecasting at ten-minute resolution. Monthly Weibull shape and scale parameters are estimated from serially dependent SCADA wind-speed data, corrected through a Godambe covariance, and forecast by a heteroskedastic Kalman filter on a bivariate VAR(1) state-space model. Conditional on the MMSE forecasted Weibull invariant law, we construct and compare three positive wind-speed SDE models: an Ornstein-Uhlenbeck-Weibull transform, a Fokker-Planck drift-first diffusion, and a Fokker-Planck diffusion-first model. The simulated wind-speed ensembles are mapped to power through a calibrated XGBoost power curve. Applied to January 2021 data from a Senvion MM92 turbine at Kelmarsh Wind Farm, the three SDE formulations are statistically indistinguishable in probabilistic accuracy, with mean CRPS values between 1.569 and 1.575 m/s. The diffusion-first model is therefore preferred on computational grounds, reducing runtime by about a factor of seven relative to the OU-Weibull model. In the power domain, the Wasserstein distance between simulated and observed distributions is 26.1-27.6 kW, below $1.4\%$ of rated capacity, while the monthly energy-yield bias is about $-7.3\%$ for the examined month. Exceedance-probability errors remain below 1.6 percentage points over the 0-1500 kW range and about 2.2 percentage points near rated power. These quantities provide decision-relevant probabilistic inputs for downstream operational problems, rather than completed reserve, storage, market, or fatigue-optimization decisions. Full marginalisation over the Kalman predictive law of the Weibull parameters is left as a natural extension.
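
One way to obtain a positive process with a Weibull stationary law is to push a stationary Ornstein-Uhlenbeck process through the normal CDF and the inverse Weibull CDF. The sketch below illustrates that generic construction. It is an assumption that this resembles the paper's Ornstein-Uhlenbeck-Weibull transform, and the function name, mean-reversion rate, and Weibull parameters are hypothetical.

```python
import math
import random


def simulate_weibull_ou(shape, scale, theta, dt, n_steps, seed=0):
    """Simulate wind speeds whose stationary law is Weibull(shape, scale).

    A unit-variance OU process is advanced with its exact discretization,
    then mapped through Phi and the inverse Weibull CDF.
    """
    rng = random.Random(seed)
    decay = math.exp(-theta * dt)
    noise_scale = math.sqrt(1.0 - decay * decay)
    x = rng.gauss(0.0, 1.0)  # start in the stationary N(0, 1) law
    speeds = []
    for _ in range(n_steps):
        x = decay * x + noise_scale * rng.gauss(0.0, 1.0)
        survival = 0.5 * math.erfc(x / math.sqrt(2.0))  # 1 - Phi(x)
        speeds.append(scale * (-math.log(survival)) ** (1.0 / shape))
    return speeds


# Hypothetical values: shape 2, scale 8 m/s, theta 1 per hour, ten-minute steps.
speeds = simulate_weibull_ou(shape=2.0, scale=8.0, theta=1.0, dt=1.0 / 6.0, n_steps=100000)
sample_mean = sum(speeds) / len(speeds)
weibull_mean = 8.0 * math.gamma(1.0 + 1.0 / 2.0)
print(f"sample mean {sample_mean:.2f} m/s, Weibull mean {weibull_mean:.2f} m/s")
```

The transform is monotone, so the temporal ordering of the Gaussian path is preserved while the marginal distribution becomes Weibull.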

2606.12026 2026-06-11 math.SP cs.SI math-ph physics.data-an New submission

Generalizing Perron-Frobenius theory and eigenvector-based centralities to networks with complex edge weights

Yu Tian, Mason A. Porter, Lucas Böttcher

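AI summary: Generalizes the Perron-Frobenius theorem to complex-weighted matrices, establishes connections between the different generalizations, and proposes eigenvector-based centrality measures for analyzing node importance in networks with complex edge weights.

Details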
Comments
34 pages, 9 figures, 1 table
English abstract

A fundamental concept in linear algebra and its applications to network analysis is the Perron-Frobenius (PF) theorem, which underpins eigenvector-based centrality measures such as eigenvector centrality, PageRank, and hubs and authorities. By invoking the PF theorem, we know for strongly connected networks with positive edge weights that the eigenvector corresponding to the largest eigenvalue of the weight matrix yields a well-defined centrality measure (namely, eigenvector centrality). Traditional formulations of the PF theorem and associated centrality measures assume that networks have real-valued weights. However, many networks in areas such as quantum information, quantum chemistry, electrodynamics, and machine learning have complex-valued edge weights. In this paper, we study generalizations of the PF theorem to complex-valued matrices, establish connections between these generalizations, and propose generalized eigenvector-based centrality measures to analyze node importance in networks with complex edge weights. We also prove results about the existence of complex-weighted networks that satisfy generalized PF properties and calculate associated centrality measures for several examples, which we draw from application areas such as electron transport, circuit analysis, mathematical chemistry, and communication networks.
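
For the classical real-valued case that the abstract starts from, eigenvector centrality can be computed by power iteration. The sketch below covers only that case, with nonnegative weights on a connected network that contains an odd cycle (so the iteration converges); it does not implement the paper's complex-weight generalizations. The function name and the example weight matrix are hypothetical.

```python
def eigenvector_centrality(weights, n_iter=1000, tol=1e-12):
    """Power iteration for the leading eigenpair of a nonnegative weight matrix.

    weights[i][j] is the weight of the edge from node j to node i.
    Returns (eigenvalue, centralities), with centralities summing to 1.
    """
    n = len(weights)
    x = [1.0 / n] * n
    eigenvalue = 0.0
    for _ in range(n_iter):
        y = [sum(weights[i][j] * x[j] for j in range(n)) for i in range(n)]
        norm = sum(y)  # x sums to 1, so sum(y) converges to the leading eigenvalue
        new_x = [value / norm for value in y]
        change = max(abs(a - b) for a, b in zip(new_x, x))
        x, eigenvalue = new_x, norm
        if change < tol:
            break
    return eigenvalue, x


# Hypothetical undirected network: a weighted triangle (nodes 0, 1, 2) plus a pendant node 3.
A = [
    [0.0, 2.0, 1.0, 0.0],
    [2.0, 0.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 3.0],
    [0.0, 0.0, 3.0, 0.0],
]
lam, centrality = eigenvector_centrality(A)
print(f"leading eigenvalue: {lam:.4f}")
print("centralities:", [round(c, 4) for c in centrality])
```

All centralities come out strictly positive, as the PF theorem guarantees for this kind of network, and node 2, which carries the largest total edge weight, ranks highest.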

2606.11415 2026-06-11 q-bio.NC cs.LG physics.data-an q-bio.QM New submission

Spatially Masked Regression Reveals Local and Distributed Predictability in Electrophysiological Recordings

Maryam Ostadsharif Memar, Nima Dehghani

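AI summary: Proposes the Spatially Masked Regression (SMR) framework, which quantifies the contributions of local and distributed information to electrode signals by progressively enlarging a masked region. Applied to intracranial and scalp EEG data, it finds that neighboring electrodes contribute substantially but do not account for everything, indicating that the signals contain both local redundancy and global structure.

Details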
English abstract

Neural recordings are often interpreted as local measurements, yet the signal at any one sensor can also reflect structured activity distributed across the broader network. This raises a basic question: to what extent does an electrode's signal reflect local versus distributed information in the underlying system? More specifically, how much of an electrode's activity is carried by its immediate neighborhood, and how much is embedded more broadly across the array? We address this with a Spatially Masked Regression (SMR) framework that reconstructs each electrode's timeseries from the remaining electrodes while excluding a configurable neighborhood around the target. By progressively increasing this mask, spatial locality becomes an experimental control for quantifying how much predictive information survives after nearby channels are withheld. We apply SMR to intracranial EEG with heterogeneous electrode coverage and to scalp EEG with standardized montages over sensorimotor cortex. Using distance correlation between original and reconstructed signals, we find strong within-subject reconstruction in both modalities, substantial residual predictability even when local neighbors are excluded, and markedly stronger cross-subject transfer in EEG than in iEEG. Masking shows that nearby electrodes contribute strongly to reconstruction but do not account for all of it, indicating that individual channels reflect both local redundancy and broader distributed structure. Surrogates that preserve selected marginal or spectral properties while disrupting phase structure or temporal ordering substantially reduce performance, supporting the conclusion that SMR depends on structured temporal and cross-channel organization rather than on marginal statistics alone. These results position SMR as an interpretable framework for quantifying the balance between local and distributed information in recordings.
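
The masking step at the core of SMR can be sketched as a selection of predictor channels. The example below assumes the neighborhood is a Euclidean ball of configurable radius around the target electrode, which is one plausible reading of the abstract rather than the authors' definition. The function name and electrode coordinates are hypothetical, and the regression itself is omitted.

```python
import math


def unmasked_predictors(positions, target, mask_radius):
    """Indices of electrodes allowed to predict `target`.

    Excludes the target itself and every electrode within `mask_radius` of it.
    """
    target_position = positions[target]
    return [
        index
        for index, position in enumerate(positions)
        if index != target and math.dist(position, target_position) > mask_radius
    ]


# Hypothetical strip of five electrodes spaced 1 cm apart.
positions = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (4.0, 0.0)]
for radius in (0.0, 1.0, 2.0):
    kept = unmasked_predictors(positions, target=2, mask_radius=radius)
    print(f"mask radius {radius} cm -> predictors {kept}")
```

Fitting a regression on the kept channels for each radius, then scoring the reconstruction, would trace out how much predictability survives as the mask grows.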

2604.25701 2026-06-11 physics.bio-ph physics.data-an q-bio.BM q-bio.MN q-bio.PE Updated version

Bayesian Rate Inference for Sequence Motif Dynamics in Systems of Reactive Nucleic Acids

Johannes Harth-Kitzerow, Ulrich Gerland, Torsten A. Enßlin

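AI summary: Proposes a Bayesian inference framework that infers the parameters of motif rate equations from ligation count data produced by strand reactor simulations, providing a way to match the simplified model to more complex simulations and a step toward inferring reaction rate constants directly from experimental data.

Details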
Comments
18 pages, 8 figures, pre-submission
English abstract

The RNA world hypothesis suggests a pathway by which life emerged on early Earth. It assumes that life started with RNA-based systems capable of storing, transmitting, and replicating information, and envisions monomers and short RNA oligomers interacting to form longer strands that eventually become catalytically active ribozymes. Key reactions in RNA pools are hybridization, dehybridization, templated ligation, and cleavage. These reactions depend on many environmental parameters and on the wide range of possible configurations among interacting strands. To scan such high-dimensional parameter spaces, efficient descriptions are needed. Motif rate equations project complex strand reactor dynamics onto sequence motif space. Here we present a Bayesian inference framework to infer their parameters from ligation count data produced by strand reactor simulations. This provides a framework to match the simpler motif rate equations to more complex simulations. Additionally, it is a step towards inferring reaction rate constants directly from experimental data, including rigorous uncertainty estimation. This could be an essential procedure to connect theory and experiment, and to deepen our understanding of the essential features necessary for life to emerge.

2604.24662 2026-06-11 physics.data-an cs.AI cs.IT Updated version

Information bottleneck for learning the phase space of dynamics from high-dimensional experimental data

K. Michael Martini, Eslam Abdelaleem, Paarth Gulati, Ilya Nemenman

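AI summary: Proposes DySIB, a method that learns low-dimensional dynamical representations from high-dimensional time-series data without supervision by maximizing predictive mutual information between past and future observation windows while penalizing representation complexity. In a physical pendulum experiment it recovers a two-dimensional representation that matches the true phase space.

Details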
Comments
12 pages including references, 7 figures, 4 appendix pages with 4 appendix figures
English abstract

Identifying the dynamical state variables of a system from high-dimensional observations is a central problem across physical sciences. The challenge is that the state variables are not directly observable and must be inferred from raw high-dimensional data without supervision. Here we introduce DySIB (Dynamical Symmetric Information Bottleneck) as a method to learn low-dimensional representations of time-series data by maximizing predictive mutual information between past and future observation windows while penalizing representation complexity. This objective operates entirely in latent space and avoids reconstruction of the observations. We apply DySIB to an experimental video dataset of a physical pendulum, where the underlying state space is known. The method, with hyperparameters of the learning architecture set self-consistently by the data, recovers a two-dimensional representation that matches the dimensionality, topology, and geometry of the pendulum phase space, with the learned coordinates aligning smoothly with the canonical angle and angular velocity. These results demonstrate, on a well-characterized experimental system, that predictive information in latent space can be used to recover interpretable dynamical coordinates directly from high-dimensional data.

2506.00330 2026-06-11 physics.data-an cs.IT stat.ML Updated version

Accurate Estimation of Mutual Information in High Dimensional Data

Eslam Abdelaleem, K. Michael Martini, Ilya Nemenman

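AI summary: To address the difficulty of estimating mutual information in high-dimensional, undersampled regimes, proposes neural estimators based on low-dimensional latent representations, combined with statistical consistency checks, bias correction, and confidence intervals, and introduces the VSIB family of probabilistic critics, achieving reliable estimates on synthetic and real image data.

Details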
Comments
15 pages main text, 21 pages SI, 12 Figs overall
English abstract

Mutual information (MI) quantifies statistical dependence between variables and is widely used across scientific disciplines, yet accurate estimation from finite data remains notoriously difficult. Common approaches fail in high-dimensional, undersampled regimes ($N \lesssim K$) typical of modern experiments, and no accepted tests exist to detect when neural network-based estimators fail, making them effectively unusable as scientific instruments. We show that neural MI estimators can be made reliable when the statistical dependencies admit a low-dimensional latent representation. Sample complexity is then governed by the latent dimensionality $K_Z \ll K$ rather than the ambient dimension, a regime shift we confirm empirically and ground theoretically via random matrix theory. Building on this insight, we develop a practical protocol that provides neural estimators with explicit statistical consistency checks, bias correction, and confidence intervals. We additionally introduce a new class of probabilistic critics (the VSIB family) that substantially reduce bias and variance at higher MI values where standard estimators break down. We validate the protocol on synthetic benchmarks ($K=500$, $N$ as low as $256$), on the standard 40-dataset benchmark suite of Czyz et al. (2023), on noisy MNIST ($K=784$), and on CIFAR-10/100 ($K=3072$) with a ResNet-20 backbone. Our protocol consistently matches or exceeds existing methods while being the only approach to report confidence intervals and flag unreliable estimates, achieving reliable MI detection well below the ambient pixel dimension on real images.

2309.12017 2026-06-11 physics.atom-ph cond-mat.stat-mech physics.comp-ph physics.data-an

Electron Ptychography Reveals Correlated Lattice Vibrations at Atomic Resolution

Anton Gladyshev, Benedikt Haas, Thomas C. Pekin, Tara M. Boland, Marcel Schloz, Peter Rez, Christoph T. Koch

Details
English abstract

In this paper we introduce an electron ptychography reconstruction framework, CAVIAR (Correlated Atomic Vibration Imaging with sub-Angstrom Resolution), that reveals an entirely new channel of information: spatial correlations in atomic displacements at the atomic scale. We show reconstructions of a symmetric $\Sigma 9$ grain boundary in silicon from realistically simulated data and of hexagonal boron nitride from experimental data. By reconstructing the object as an ensemble of multiple states we are able to observe correlations between movements of atoms in the range of 10-20 pm at room temperature, in agreement with expectation. Moreover, using only the masses of the atomic species and the temperature as input, we obtain average frequencies of $10.8\pm0.1$, $13.6\pm0.6$, $18.0\pm0.2$, $25.5\pm1.5$ THz for the longitudinal and transverse acoustic and optic phonons, respectively, in agreement with inelastic neutron scattering, albeit from a volume of just a few nm$^3$. This ability to spatially resolve correlated atomic motion makes CAVIAR a unique tool to explore atom dynamics at the finest scale, with the potential to be instrumental in the development of phononic devices, in studying phonon-based decoherence in quantum systems, or in other emerging phonon-based applications.