arXivDaily arXiv daily academic digest, updated Monday through Friday
2606.19760 2026-06-19 stat.AP new submission

Covariate-Adjusted Functional Principal Components Analysis for Modeling Hazard Rates of Physical Activity in the US Population

Md Rokibul Hasan, Pratim Guha Niyogi

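AI summary Proposes a hazard-function-based distributional analysis that uses functional principal component analysis (FPCA) to characterize between-individual variation in activity-intensity distributions from wrist-worn accelerometer data, and reports that it outperforms mean-based summaries.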

Abstract

Physical activity plays a vital role in human health, and its entire distribution differs among people. Commonly used summary measures cannot describe this distributional pattern. We present a distribution-based analytical approach to describe physical activity by modeling individual-level activity-intensity patterns through hazard functions derived from wrist-worn accelerometer data. We analyzed minute-level Monitor-Independent Movement Summary (MIMS) data of 4297 adults with seven continuous days of device wear from the 2011-2012 National Health and Nutrition Examination Survey (NHANES). We derived a nonparametric activity-intensity hazard using a survival-based approach for each individual on a common intensity grid, treating the hazard curves from both MIMS and log-transformed MIMS as functional objects. We used functional principal component analysis (FPCA) on both scales of MIMS to characterize dominant modes of variation in activity-intensity distributions. Group-wise mean hazard functions showed little difference at lower intensity levels, while we observed a substantial difference at higher intensity levels. Our results demonstrate that hazard-based functional representations capture differences in physical activity intensity distributions across individuals and offer a flexible and interpretable way to characterize heterogeneity. This approach works better than mean-based summaries and supports principled comparisons of physical activity patterns across population subgroups.
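The hazard construction described in this abstract can be illustrated with a minimal sketch. It assumes a life-table style estimator (minutes whose intensity falls in a bin, divided by minutes at or above the bin's left edge) and an SVD-based FPCA on a shared grid. The function names and the estimator are illustrative assumptions; the abstract does not specify the authors' exact procedure.

```python
import numpy as np

def intensity_hazard(mims, grid):
    """Discrete hazard of activity intensity on a common grid (hypothetical helper).

    h[k] estimates P(grid[k] <= X < grid[k+1] | X >= grid[k]) from minute-level values.
    """
    mims = np.asarray(mims, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Minutes whose intensity falls into each bin of the grid.
    events, _ = np.histogram(mims, bins=grid)
    # Minutes still "at risk" at the left edge of each bin.
    at_risk = np.array([(mims >= g).sum() for g in grid[:-1]])
    return np.divide(events, at_risk, out=np.zeros(len(events)), where=at_risk > 0)

def fpca(curves, n_components=2):
    """FPCA on curves sampled on a shared grid, via SVD of the centered matrix."""
    curves = np.asarray(curves, dtype=float)
    centered = curves - curves.mean(axis=0)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    scores = u[:, :n_components] * s[:n_components]
    return vt[:n_components], scores
```

Each row passed to `fpca` would be one person's hazard curve; the returned components are the dominant modes of variation and the scores place each person along them.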

2606.18544 2026-06-19 stat.AP new submission

Chess Signatures of Play

Christian Turk, Nicholas Polson

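AI summary Uses the signature transform from rough-path theory to extract invariant features that capture the order and interaction of in-game events, then builds a signature-kernel two-sample test and an anytime-valid cheating-detection method that markedly improves detection power while controlling the error rate.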

Abstract

A game of chess is a stream: a time-ordered sequence of moves, each carrying an engine evaluation, a measure of accuracy, a measure of position complexity, and a clock reading. We model a game as a multivariate path and apply the signature transform of rough-path theory to obtain a reparametrization-invariant, graded feature set that records the order and interaction of in-game events without a parametric likelihood. We show that a player's law of play is identifiable from the expected signature up to tree-like equivalence, construct a signature-kernel two-sample test on path space, and recast cheating detection as an anytime-valid sequential test: a signature conformance score becomes an e-process whose error is controlled for every sample size at once by Ville's inequality, with fluctuations calibrated on the moderate-deviation scale. The discriminating information lives in the signature's Lévy areas, which measure whether accuracy rises precisely when positions become hard, the fingerprint of engine assistance that aggregate match-rate statistics discard. In a controlled study the test holds exact type-I control and detection power rises from negligible for subtle assistance to 0.98 for blatant assistance, with a median detection time matching the growth-rate prediction. Calibrated to Magnus Carlsen's documented elite accuracy, the monitor does not flag world-champion-level play; and we exhibit cheating strategies that leave every aggregate statistic, including the best-move-frequency z-score of the Regan system, unchanged yet are caught cleanly by the signature, making precise how an order-aware, anytime-valid test strengthens the prevailing approach to chess anti-cheating.
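The Lévy area mentioned in this abstract is the antisymmetric part of the level-2 signature, and for a piecewise-linear path it can be computed exactly from the sampled points. The sketch below is illustrative only: the two channels are hypothetical stand-ins (for example complexity and accuracy), and the function name is not from the paper.

```python
import numpy as np

def levy_area(x, y):
    """Levy area 0.5 * integral of (x dy - y dx) for a piecewise-linear 2D path.

    Coordinates are taken relative to the starting point, so the value is
    translation invariant. The left-point sum is exact for linear segments.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = np.diff(x), np.diff(y)
    xs, ys = x[:-1] - x[0], y[:-1] - y[0]
    return 0.5 * np.sum(xs * dy - ys * dx)

# Same endpoints, different ordering of events: the sign of the area flips.
x_first = levy_area([0, 1, 1], [0, 0, 1])   # x moves first, then y
y_first = levy_area([0, 0, 1], [0, 1, 1])   # y moves first, then x
straight = levy_area([0, 1], [0, 1])        # both move together
```

Two paths with identical start and end points give opposite areas depending on which channel moves first, which is the kind of order information an aggregate statistic cannot see.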

2606.20240 2026-06-19 econ.EM stat.AP cross-list

Two-Sample IV: Efficient Two-Step Estimation and Tests for Overidentification and Weak-Instruments

Fatima Kasenally, Ruoxi Guan, Frank Windmeijer

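AI summary For two-sample IV estimation, proposes an efficient two-step estimator and an overidentification test that are robust to heteroskedasticity and sample heterogeneity, require only summary statistics from linear regressions, and come with extended weak-instrument tests.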

Abstract

Two-sample IV is a popular estimation method when the outcome and treatment variables are available in different samples, whereas instruments are available in both samples. The standard estimator is the two-sample two-stage least squares estimator, which is efficient under homoskedasticity and homogeneity of the samples. We develop a robust two-step procedure for efficient estimation under general heteroskedasticity and heterogeneity of the samples, and propose a related two-sample Hansen overidentification test. A key feature of our approach is that only summary statistics from the linear regressions of the reduced form and first stage in the two samples are needed. These are the six objects of the estimated coefficient vectors, and the homoskedastic and heteroskedasticity-robust estimated variance matrices. We further show that the first-stage F-statistic in the treatment sample can be used as a test for weak instruments in the standard way under homoskedasticity and homogeneity, with the relative bias here being a proportional bias. We propose an extension of the effective F-statistic of Montiel-Olea and Pflueger (2013) for the heteroskedastic case, following the generalization in Windmeijer (2025). We illustrate the estimators and tests in an application studying the effect of education on voting behavior from Marshall (2019), with cluster-robust inference.
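To make the "summary statistics only" idea concrete, here is a generic two-step minimum-distance sketch for a single treatment. It is not necessarily the authors' estimator: it assumes independent samples, so that the variance of the moment vector is `v_gamma + beta**2 * v_pi`, and every name below is hypothetical.

```python
import numpy as np

def two_step_md(gamma_hat, pi_hat, v_gamma, v_pi):
    """Two-step minimum-distance estimate of beta from gamma = pi * beta.

    gamma_hat: reduced-form coefficients (outcome sample)
    pi_hat:    first-stage coefficients (treatment sample)
    v_gamma, v_pi: their estimated variance matrices
    Returns (beta, J), where J is a Hansen-type overidentification statistic.
    """
    g = np.asarray(gamma_hat, dtype=float)
    p = np.asarray(pi_hat, dtype=float)
    v_gamma = np.asarray(v_gamma, dtype=float)
    v_pi = np.asarray(v_pi, dtype=float)

    def md(weight):
        return (p @ weight @ g) / (p @ weight @ p)

    # Step 1: weight by the inverse reduced-form variance only.
    beta1 = md(np.linalg.inv(v_gamma))
    # Step 2: re-weight with the variance of g - p * beta at the step-1 estimate.
    w2 = np.linalg.inv(v_gamma + beta1**2 * v_pi)
    beta2 = md(w2)
    resid = g - p * beta2
    return beta2, resid @ w2 @ resid
```

With more instruments than treatments, the second return value would be compared against a chi-squared distribution with (number of instruments minus one) degrees of freedom under this sketch's assumptions.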

2606.20341 2026-06-19 stat.ME stat.AP cross-list

Anchors Away: Navigating Unanchored Indirect Comparisons with Multilevel Unanchored Meta-Regression (ML-UMR)

Conor Chandler, Jack Ishak

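AI summary For unanchored treatment comparisons when randomized evidence is unavailable, proposes multilevel unanchored meta-regression (ML-UMR), a Bayesian framework that jointly models individual-level and aggregate data to estimate marginal and conditional effects across multiple treatments, studies, and target populations, while making the identification and transportability assumptions explicit.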

Comments 20 pages (excluding supplementary material), 5 figures

Abstract

Unanchored indirect treatment comparisons using single-arm studies or disconnected evidence are increasingly used in health technology assessment (HTA) when randomized evidence is unavailable. Existing methods, including matching-adjusted indirect comparison (MAIC) and simulated treatment comparison (STC), are generally limited to pairwise settings and typically estimate marginal effects in the comparator study population, which may differ from the decision-relevant population. We propose multilevel unanchored meta-regression (ML-UMR), a Bayesian regression framework for synthesizing individual patient data and aggregate data from fully disconnected evidence. ML-UMR extends multilevel network meta-regression (ML-NMR) to unanchored settings by jointly modeling individual- and aggregate-level data within a unified likelihood, enabling estimation of treatment-specific outcomes and both marginal and conditional effects across multiple treatments, studies, and target populations. ML-UMR distinguishes assumptions required to identify treatment effects from those required to transport results to target populations. As with all unanchored comparisons, valid inference relies on strong and often unverifiable assumptions, including conditional exchangeability, correct specification of the outcome model, and cross-treatment assumptions (e.g., shared prognostic factor assumption (SPFA)). ML-UMR does not lessen these requirements but makes them explicit within a unified framework and facilitates sensitivity analyses. In simulation studies, ML-UMR produced low bias and nominal coverage for comparator-population effects. Transportability to alternative populations depended critically on identifying assumptions: violations of SPFA led to bias under strong effect modification, whereas incorporating subgroup information restored near-unbiased estimation and nominal coverage.

2606.20114 2026-06-19 stat.ME stat.AP cross-list

Community detection in small-sample ordinal regimes: A benchmarking framework for Delphi data

Yuri Calleo, Simone Di Zio, Fabrizio Maturo

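AI summary To address the rank deficiency caused by high-dimensional, small-sample Delphi data, proposes moving from variable-centric covariance models to network-centric connectivity models and using community detection algorithms to identify latent thematic structures, giving structurally stable dimensionality reduction.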

Abstract

The statistical modeling of consensus in Delphi data faces a critical bottleneck: the high dimensionality of questionnaire items relative to the limited sample size of expert panels. This rank deficiency leads traditional latent variable models, such as Principal Component Analysis, to be structurally unstable and prone to overfitting. Addressing this methodological gap, this study proposes a transition from variable-centric covariance models to network-centric connectivity models. By mapping item correlations onto a weighted graph topology, we present a simulation-based benchmark that utilizes community detection algorithms to identify latent thematic structures, effectively addressing the spectral instability and rank deficiency typical of high-dimensional, low-sample-size regimes. The research systematically evaluates the robustness of topological approaches based on structural density, information flow, and spectral partitioning against synthetic datasets designed to replicate the pathological conditions of consensus data, including ordinal scales and systemic noise. The central methodological contribution lies in demonstrating that collinearity among expert judgments, traditionally treated as statistical redundancy to be regularized, can be effectively reinterpreted as a topological signal of cohesion. This framework provides researchers with a structured and automated procedure for dimensionality reduction, ensuring structural stability and psychometric consistency even in small-sample regimes where standard factor analysis breaks down.

2606.19743 2026-06-19 stat.ME stat.AP cross-list

A Bayesian spatio-temporal nearest neighbor Gaussian process model for pooled genetic data

Imke Botha, Tianxiao Hao, Lucinda E. Harrison, Nick Golding, Daniel J. Weiss, Jennifer A. Flegg

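AI summary Proposes a nearest neighbor Gaussian process model combined with a sequential Monte Carlo squared algorithm to infer haplotype frequencies efficiently from pooled genetic data, applied to genetic data on antimalarial drug resistance in Africa.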

Abstract

Large-scale genetic datasets often aggregate the total allele counts of distinct genetic markers. Inferring haplotype frequencies (i.e., the frequencies of multimarker alleles) from these pooled data is a challenge. Previous spatio-temporal modelling in this context has been limited to 3 markers due to the computational cost. In this work, we propose a nearest neighbor Gaussian process (NNGP) model to improve scaling with the number of markers and observations. To infer the parameters of our model, we develop a novel sequential Monte Carlo squared algorithm, which uses particle Gibbs with ancestor sampling to mutate the NNGP function values. The latter has a linear cost in the number of observations and the number of NNGPs, and can be applied to a broad range of NNGP models. As a case study, we analyse genetic data relating to antimalarial drug resistance in Africa, and show our scaling results empirically on datasets with 3 and 6 genetic markers.

2606.20420 2026-06-19 q-fin.CP stat.AP cross-list

Advanced Calibration Analysis and Tools: Identifying Influential Observations in Stochastic Interest Rate Model Calibration

Philipp Mahler, Peter Ruckdeschel

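AI summary Embeds the calibration problem in non-linear regression theory, shows that minimizing RMSRE is equivalent to weighted least squares, and develops a diagnostic framework (weighted hat matrix, influence functions, functional delta method). The empirical analysis finds boundary-dominated leverage, losses of effective dimensionality, and a shift in parameter stability after 2022, and concludes that low RMSRE is not sufficient to validate a calibration.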

Comments 47 pages, 9 figures, 1 table

Abstract

The accurate calibration of interest rate models is central to market-consistent valuation and Economic Scenario Generators (ESGs). Traditional calibration methods for multi-factor models such as the G2++ model often rely on point estimates, neglecting the influence of specific market data and the quantification of estimation uncertainty. This paper develops a diagnostic framework embedding the calibration problem into non-linear regression theory. It shows that the common industry practice of minimizing the Root Mean Squared Relative Error (RMSRE) is equivalent to a Weighted Least Squares (WLS) problem. This equivalence yields the corresponding formulations for diagnostic tools, including the Weighted Hat Matrix for leverage analysis, Influence Functions for local sensitivity diagnostics, and the Functional Delta Method for local, boundary-respecting confidence intervals. The implementation uses an efficient Jacobian factorization that exploits the analytical tractability of At-The-Money (ATM) caps. The framework is applied to a dataset of Euro ATM caps covering the period 2016-2025. Our empirical analysis reveals a boundary-dominated leverage profile, repeated losses of effective dimensionality due to active parameter constraints, and a diagnostic regime shift in local parameter stability around the post-2022 market transition. The resulting message for actuarial model governance is that low RMSRE is not sufficient for calibration validation. We conclude by discussing the framework's applicability to general least-squares problems while highlighting the computational challenges for instruments lacking closed-form gradients, such as swaptions.
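The RMSRE-to-WLS equivalence stated in this abstract follows in one line, because the square root and the factor 1/n are monotone and do not move the minimizer. The notation below is illustrative rather than the paper's, and the hat matrix shown is the textbook form for weighted non-linear least squares, which may differ in detail from the authors' formulation.

```latex
% Illustrative notation: P_i^{mkt} are market cap prices, P_i(\theta) are model prices.
\[
\mathrm{RMSRE}(\theta)
  = \sqrt{\frac{1}{n}\sum_{i=1}^{n}
      \left(\frac{P_i(\theta) - P_i^{\mathrm{mkt}}}{P_i^{\mathrm{mkt}}}\right)^{2}},
\qquad
\arg\min_{\theta}\mathrm{RMSRE}(\theta)
  = \arg\min_{\theta}\sum_{i=1}^{n} w_i\,\bigl(P_i^{\mathrm{mkt}} - P_i(\theta)\bigr)^{2},
\quad w_i = \bigl(P_i^{\mathrm{mkt}}\bigr)^{-2}.
\]
% Textbook leverage for weighted non-linear least squares, with
% J = \partial P / \partial\theta^{\top} evaluated at the estimate and W = \mathrm{diag}(w_i):
\[
H = W^{1/2} J \,(J^{\top} W J)^{-1} J^{\top} W^{1/2},
\qquad \text{leverage of observation } i = H_{ii}.
\]
```

Because the weights scale as the inverse squared market price, cheaply priced instruments carry the largest weights in the fit.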

2606.20451 2026-06-19 stat.ML cs.LG stat.AP stat.CO cross-list

SSH-Net: A Deep Neural Network for Predicting Failure Time Distribution Functions under Competing Risks with Application to GPU Data

Jie Min, Yueyao Wang, Mengkun Chen

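AI summary Proposes the Structured Segmented Hazard deep neural network (SSH-Net), which ties network structure to data structure so that different covariate groups affect predictions through separate sub-networks, and predicts failure time distribution functions under a competing risks framework; simulations and GPU data are used to validate accuracy.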

Abstract

Competing risks are commonly observed in engineering fields and can bring challenges to time-to-event data modeling when the application scenarios are complicated. Recently, deep neural networks have received great attention for prediction with competing risks, due to their flexibility and high learning capability. However, the complexity of neural network structure brings extra difficulty in hyperparameter tuning based on different data inputs. Additionally, when an engineered system has complex physical structures with multiple hierarchical levels, treating all structural levels as a single group of inputs may fail to capture critical information. To address these issues, we propose a Structured Segmented Hazard Deep Neural Network (SSH-Net) for failure time prediction under a cause-specific competing risks framework. Our approach associates neural network structure with data structures, and allows different covariate groups to impact the failure prediction through separate sub-networks. The neural network is constructed based on a cause-specific competing risks model. The SSH-Net outputs cause-specific hazard functions, and utilizes the penalized log-likelihood as the loss function. The prediction accuracy of SSH-Net is validated through simulation studies by evaluating the Brier score, the area under the receiver operating characteristic curve (AUC), and the root mean square error (RMSE) of the predicted cause-specific cumulative incidence function. We further demonstrate the model's ability to predict failure time distribution functions using the Titan GPU failure time data.
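The link between cause-specific hazards and the cumulative incidence functions used for evaluation can be sketched in discrete time. This assumes hazards given per interval, which is a simplifying assumption and not the SSH-Net architecture itself; the function name is hypothetical.

```python
import numpy as np

def cumulative_incidence(hazards):
    """Cause-specific cumulative incidence from discrete cause-specific hazards.

    hazards: array of shape (n_causes, n_intervals), where hazards[k, j] is the
    probability of failing from cause k in interval j given survival to its start.
    """
    h = np.asarray(hazards, dtype=float)
    total = h.sum(axis=0)
    # Probability of surviving all causes up to the start of each interval.
    surv_prev = np.concatenate(([1.0], np.cumprod(1.0 - total)[:-1]))
    # Failure from cause k in interval j requires surviving to j first.
    return np.cumsum(h * surv_prev, axis=1)

cif = cumulative_incidence([[0.1, 0.1], [0.2, 0.2]])
```

Summing the final column over causes gives the overall failure probability, one minus the all-cause survival.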

2606.19775 2026-06-19 cs.SI stat.AP stat.OT cross-list

Rethinking Sampling Strategy in Link Prediction

Yilin Bi, Zhenyu Deng, Xinshan Jiao, Tao Zhou

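AI summary Proposes the β-sampling scheme to study how two-stage sampling affects link prediction performance, finding that the structural characteristics of missing links substantially affect prediction accuracy and that the second-stage sampling strategy matters.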

Comments 19 pages, 5 figures, 3 tables

Abstract

Many real-world networks are incomplete, making link prediction a fundamental challenge in network science. To train parameters and evaluate algorithms, observed links are usually divided into three subsets, namely training, validation, and probe sets. This division implicitly involves two sampling processes: first-stage sampling yields the probe set and second-stage sampling yields the validation set. To date, our understanding of how these two sampling processes affect algorithm performance remains quite limited. To address this issue, we propose a sampling scheme called β-sampling, where the sampling probability of a link is proportional to the product of the degrees of its two endpoints raised to the power of β. Experiments on 45 real-world networks reveal that the structural characteristics of missing links, as simulated via varying probe sets, substantially impact prediction accuracy. When missing links tend to connect high-degree nodes, such links can be predicted accurately with ease. Furthermore, even with a fixed probe set, second-stage sampling still exerts a significant influence on prediction accuracy. Notably, the optimal second-stage sampling strategy differs from random sampling (which randomly selects links to form the validation set) and consistent sampling (which guarantees that links in the validation and probe sets share identical structural characteristics).
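The sampling rule in this abstract translates directly into code. The sketch below assumes degrees are computed on the full observed network and that links are drawn without replacement; both choices, and the function names, are assumptions not confirmed by the abstract.

```python
import numpy as np

def beta_sampling_probs(edges, beta):
    """Probability of sampling each link, proportional to (k_u * k_v) ** beta."""
    edges = np.asarray(edges)
    degree = np.bincount(edges.ravel())          # node degrees in the observed network
    weights = (degree[edges[:, 0]] * degree[edges[:, 1]]).astype(float) ** beta
    return weights / weights.sum()

def sample_probe_set(edges, beta, n_probe, seed=0):
    """Draw n_probe distinct links according to the beta-sampling probabilities."""
    rng = np.random.default_rng(seed)
    edges = np.asarray(edges)
    probs = beta_sampling_probs(edges, beta)
    idx = rng.choice(len(edges), size=n_probe, replace=False, p=probs)
    return edges[idx]

edges = [(0, 1), (0, 2), (0, 3), (1, 2)]
uniform = beta_sampling_probs(edges, 0.0)   # beta = 0 recovers random sampling
hub_biased = beta_sampling_probs(edges, 1.0)
```

Positive β favours links between high-degree nodes, negative β favours links between low-degree nodes, and β = 0 is plain uniform sampling.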

2606.19607 2026-06-19 cs.AI stat.AP cross-list

Which Pairs to Compare for LLM Post-Training?

Jiangze Han, Vineet Goyal, Will Ma

Affiliation Columbia University

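AI summary Studies how to select the most informative comparison pairs in preference-based post-training, proposes comparison curation based on sampling design, derives an optimization criterion from a theoretical analysis of DPO training, and shows experimentally that it improves sample efficiency.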

Abstract

Preference-based post-training has become a central paradigm for aligning language models. A common data-collection strategy is to generate a small set of completions for each prompt and label the resulting comparison pairs. However, human preference labels are often much more expensive than generating additional completions, suggesting a different use of the same labeling budget: generate a larger pool of completions, but label only the most informative comparison pairs. This paper studies which pairs should be compared in preference-based post-training. We formulate comparison curation as a sampling-design problem and evaluate designs by the quality of the final policy under the preference-based post-training objective. We instantiate this framework for Direct Preference Optimization (DPO), analyzing how the choice of labeled pairs propagates through DPO training to downstream policy performance. Our main results provide matching upper and lower bounds on the post-training optimality gap of the DPO-trained policy. The bounds show that comparison selection affects downstream performance through a single design-dependent information matrix, which links label allocation to parameter estimation error and policy suboptimality. This yields an explicit optimization criterion for budgeted comparison curation and motivates practical sampling designs for selecting informative pairs from large generated completion pools. Experiments on synthetic settings and language-model post-training benchmarks show that the proposed designs consistently improve sample efficiency over common comparison-selection heuristics.

2606.19642 2026-06-19 physics.ao-ph stat.AP stat.ML cross-list

Rigorous uncertainty quantification of probabilistic AI weather forecasts with conformal prediction

Anna Asch, Raphael Rossellini, Pedram Hassanzadeh, Rebecca Willett

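AI summary Addresses the insufficient calibration of probabilistic AI weather forecasts, especially for extreme events, with conformal prediction, which guarantees coverage mathematically without distributional assumptions. Applied to temperature and precipitation forecasts from three global models (GenCast, NeuralGCM, AIFS-ENS), it yields calibrated uncertainty without sacrificing other probabilistic metrics.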

Abstract

Probabilistic weather forecasting is undergoing rapid transformation with artificial intelligence (AI). In traditional numerical weather prediction, computing power can limit how well ensemble forecasts approximate the unknown statistical distribution of future states. AI models facilitate larger ensembles and are trained with probabilistic considerations, ideally leading to better uncertainty quantification. Forecasts from these state-of-the-art models are often considered well-calibrated. However, here we show that the statistical coverage of such models, the ultimate measure of calibration, can struggle, especially on extreme events. To address this shortcoming, we employ conformal prediction, a class of statistical methods that mathematically guarantees coverage under no distributional assumptions, unlike previous post-processing techniques. We apply online conformal prediction to temperature and precipitation forecasts (including extremes) of three leading global weather models, GenCast, NeuralGCM, and AIFS-ENS, ensuring calibrated uncertainty at no expense to other probabilistic metrics. This post-processing method can be applied to any forecasting model.
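One simple member of the online conformal family is quantile tracking, where the interval half-width grows after a miss and shrinks slightly after a hit. The abstract does not say which online variant the authors use, so the sketch below is an assumed illustration with hypothetical names and a symmetric interval around a point forecast.

```python
import numpy as np

def online_conformal(pred, obs, alpha=0.1, lr=0.05, q0=1.0):
    """Quantile-tracking online conformal intervals [pred - q, pred + q].

    After each step: q <- q + lr * (miss - alpha), so the long-run miss
    rate is pushed toward alpha without distributional assumptions.
    """
    pred = np.asarray(pred, dtype=float)
    obs = np.asarray(obs, dtype=float)
    q = q0
    widths = np.empty(len(pred))
    covered = np.empty(len(pred), dtype=bool)
    for t in range(len(pred)):
        widths[t] = q                       # half-width issued before seeing obs[t]
        covered[t] = abs(obs[t] - pred[t]) <= q
        miss = 0.0 if covered[t] else 1.0
        q = q + lr * (miss - alpha)
    return widths, covered

# Synthetic check: forecast errors drawn from a standard normal.
rng = np.random.default_rng(0)
truth = rng.normal(size=5000)
widths, covered = online_conformal(np.zeros(5000), truth, alpha=0.1, lr=0.05)
```

The running sum of `miss - alpha` equals the change in `q` divided by `lr`, so as long as `q` stays bounded the empirical miss rate converges to `alpha` at rate 1/T.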

2606.20489 2026-06-19 q-bio.PE nlin.CG physics.bio-ph stat.AP cross-list

West Nile virus outbreak in Italy modelled with the quantum Game of Life

Andrea Fontana, Simone Tambascia, Ciro Di Carluccio, Andrea Esposito, Bernardo Spagnolo, Andrea M. Chiariello

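AI summary Simulates the spread of West Nile virus in Italy during summer 2025 with a quantum Game of Life cellular automaton model, fits local and regional-average cumulative infection curves accurately by optimizing mosquito birth and removal rates, and assesses the impact of environmental changes.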

Abstract

In recent years, an anomalously high spread of West Nile virus (WNV) has been observed in Italy, with particularly high peaks of infections in the southern Lazio, Campania and Veneto regions. The main disease vector for WNV is the Culex pipiens mosquito, which spreads human infections through its bites. Here, we investigate the diffusion of the WNV fever epidemic during the summer 2025 season in Italy through a computational approach based on a quantum version of the Game of Life (GOL) cellular automaton model. Specifically, human dynamics evolve according to the GOL rules, while the stochastic dynamics of disease vectors, i.e., mosquitoes, as well as their interaction with humans, occur simultaneously. We show that this model fits the curves of cumulative infected individuals with high accuracy, at both the local and the average-regional level, with optimization of only the mosquito birth and removal rate parameters. Furthermore, leveraging model flexibility, we show that changes in model parameter values elucidate the system response to environmental variations. For instance, we quantify the impact of measures to contain mosquito spread or of a sudden increase in mosquito abundance due to climatic and ecological changes. Overall, we provide a general, quantitative description of WNV infection spreading in Italy which could represent a supportive tool to test different environmental scenarios and could be useful to devise strategies for decision makers to monitor disease vector dynamics and to control consequent virus diffusion.

2605.15896 2026-06-19 stat.ME stat.AP replacement

A Model-Agnostic Bootstrap for Macro-Level Claims Reserving Under the Conditioning Principle

Robin Van Oirbeek, Tim Verdonck

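AI summary Proposes a bootstrap for macro-level claims reserving that satisfies the conditioning principle, achieves exact calibration through a Dirichlet-Gamma hierarchy, and improves on the coverage error of existing bootstraps.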

Comments 23 pages, v2: correction of the interpretation of the $\kappa$ parameter

Abstract

The correct inferential object in claims reserving is the conditional predictive distribution $p(R \mid \mathcal{D}, \hat{\theta})$, where $\mathcal{D}$ is the observed triangle held fixed. We refer to this as the conditioning principle. All existing bootstraps violate it by resampling functions of $\mathcal{D}$ inside the predictive loop, producing an $O(1)$ coverage error that does not vanish as the triangle grows. The Dirichlet-Gamma hierarchy admits a bootstrap that satisfies the principle exactly: $S^{IBNP}_i = X^{obs}_i (1-W_i)/W_i$ with $W_i \sim \mathrm{Beta}(c\hat{F}_{I-i}, c(1-\hat{F}_{I-i}))$ sampled directly from its predictive distribution. Only the allocation proportion $W_i$ is simulated; the observed triangle is held fixed. It thus inherits calibration from any development-proportion method (Chain-Ladder, Bornhuetter-Ferguson, Cape Cod, or other), making it model-agnostic. The coverage deficit is $O(I^{-1/2})$, independent of the number of development periods. Under compound Poisson data-generating processes the bootstrap is conservative for every $F_{I-i} \in (0,1)$: the predictive standard deviation analytically exceeds the true value by the factor $1/\sqrt{F_{I-i}}$. The ODP bootstrap violates the principle through two mechanisms in opposite directions: re-estimation inflates bootstrap variance under the ODP DGP, while missing accident-year frailty deflates it under frailty DGPs. The resulting coverage discrepancy is $\Omega(1)$ regardless of $I$, providing a structural explanation for the cross-portfolio miscalibration heterogeneity documented by Meyers (2015). Chain-Ladder, Bornhuetter-Ferguson and Cape Cod emerge as credibility estimators under diffuse, informative and pooling priors respectively, with identical structure for counts and amounts. The concentration $c$ serves as a diagnostic: $\hat{c} < 30$ signals non-stationary development.
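The bootstrap formula quoted in this abstract can be simulated in a few lines. The sketch below takes the observed cumulative amounts, the estimated development proportions and the concentration as given inputs; the example figures and the function name are hypothetical, and summing the per-year draws into a total reserve is an assumption about how the simulated values would be used.

```python
import numpy as np

def ibnp_bootstrap(x_obs, f_hat, c, n_sims=10_000, seed=0):
    """Simulate total outstanding amounts with the observed triangle held fixed.

    For each accident year i: W_i ~ Beta(c * F_i, c * (1 - F_i)) and
    S_i = X_obs_i * (1 - W_i) / W_i. Only W_i is random.
    """
    rng = np.random.default_rng(seed)
    x_obs = np.asarray(x_obs, dtype=float)
    f_hat = np.asarray(f_hat, dtype=float)   # each entry must lie strictly in (0, 1)
    w = rng.beta(c * f_hat, c * (1.0 - f_hat), size=(n_sims, len(x_obs)))
    s = x_obs * (1.0 - w) / w
    return s.sum(axis=1)

# Hypothetical inputs: two accident years, 50% and 80% developed.
reserves = ibnp_bootstrap(x_obs=[100.0, 200.0], f_hat=[0.5, 0.8], c=50.0)
interval = np.quantile(reserves, [0.025, 0.975])
```

For a Beta(a, b) draw with a > 1, the mean of (1 - W)/W is b / (a - 1), which gives a quick analytic check on the simulated mean.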

2605.15811 2026-06-19 stat.ME stat.AP replacement

The Negative Binomial Chain-Ladder: A Full Likelihood Model for Claim Count Reserving

Robin Van Oirbeek

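AI summary Proposes a Negative Binomial Chain-Ladder model in which the negative binomial distribution arises naturally from a Poisson-Gamma construction, giving a clearer generative interpretation, unifying the Chain-Ladder family, and validating the model's robustness through simulation.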

Comments 35 pages, 3 figures, v2: correction of the interpretation of the $\kappa$ parameter

Abstract

The Chain-Ladder (CL) method remains the dominant macro-level technique for claims reserving in non-life insurance, yet its classical formulation lacks a coherent probabilistic foundation. Existing stochastic extensions, including the Mack model and the Over-Dispersed Poisson (ODP) framework, provide measures of uncertainty but rely on second-moment assumptions or quasi-likelihood variance structures without clear generative interpretations. This paper develops a Negative Binomial Chain-Ladder (NB-CL) model that embeds the CL method within a full likelihood-based framework. The key contribution is a micro-level derivation showing that the negative binomial distribution arises naturally from a Poisson-Gamma construction: claims arrive according to a Poisson process with Gamma-distributed accident-year heterogeneity, and aggregation yields negative binomial incremental counts. This derivation gives the dispersion parameter $\kappa$ a structural interpretation as accident-year heterogeneity, rather than an ad-hoc overdispersion adjustment. The NB-CL model generalises the Poisson Chain-Ladder model in the limit $\kappa \to \infty$, shares the point estimates of the ODP model while differing in its variance function (quadratic vs. linear), and unifies the Chain-Ladder family within a single probabilistic hierarchy. A parametric bootstrap procedure is developed to incorporate both process and parameter uncertainty. Simulation studies confirm near-nominal coverage under correct specification once the dispersion parameter is bias-corrected, and a controlled degradation under model misspecification. Empirical illustrations on claim count data (Australian motor bodily injury) and paid amounts (Taylor-Ashe) document both the structural reading of $\kappa$ and the working-approximation status of the model in the amounts case.
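The Poisson-Gamma construction referred to in this abstract is a standard mixture result and can be written out briefly. The parameterization below (Gamma with unit mean and shape equal to the dispersion parameter) is one common convention and is an assumption here; the paper's own indexing by accident and development year is omitted.

```latex
% Assumed parameterization: unit-mean Gamma heterogeneity with shape = rate = \kappa.
\[
N \mid \Theta \sim \mathrm{Poisson}(\mu\,\Theta), \qquad
\Theta \sim \mathrm{Gamma}(\text{shape}=\kappa,\ \text{rate}=\kappa)
\ \Longrightarrow\
\Pr(N = n) = \frac{\Gamma(n+\kappa)}{n!\,\Gamma(\kappa)}
  \left(\frac{\kappa}{\kappa+\mu}\right)^{\kappa}
  \left(\frac{\mu}{\kappa+\mu}\right)^{n},
\]
\[
\mathbb{E}[N] = \mu, \qquad
\mathrm{Var}(N) = \mu + \frac{\mu^{2}}{\kappa}
\ \xrightarrow{\ \kappa\to\infty\ }\ \mu .
\]
```

The variance is quadratic in the mean, in contrast to the linear ODP variance function, and the Poisson case is recovered as the heterogeneity vanishes.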

2507.15475 2026-06-19 eess.SP math.PR stat.AP 版本更新

On the Distribution of a Two-Dimensional Random Walk with Restricted Angles

二维受限角度随机游走的分布

Karl-Ludwig Besser

AI总结 研究受限角度二维随机游走的分布,推导两步情形下精确的联合与边缘分布,给出一般步数的数值解及大步数近似,并精确刻画任意步数下的支撑集。

Comments 14 pages, 14 figures

Journal ref IEEE Transactions on Signal Processing, vol. 74, pp. 2316-2330, 2026

详情
AI中文摘要

本文推导了二维(复数)随机游走的分布,其中每一步的角度被限制在圆的一个子集内。这种设置出现在多个领域,例如信号处理中的空中计算。特别地,我们推导了两步情形下精确的联合分布和边缘分布,给出了一般步数下的数值解,并给出了大步数下的近似。此外,我们对任意步数下的支撑集给出了精确刻画。本文的结果可为未来涉及此类问题的研究提供参考。

英文摘要

In this paper, we derive the distribution of a two-dimensional (complex) random walk in which the angle of each step is restricted to a subset of the circle. This setting appears in various domains, such as in over-the-air computation in signal processing. In particular, we derive the exact joint and marginal distributions for two steps, numerical solutions for a general number of steps, and approximations for a large number of steps. Furthermore, we provide an exact characterization of the support for an arbitrary number of steps. The results in this work provide a reference for future work involving such problems.
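
A Monte Carlo sketch can make the setting concrete. It assumes unit-length steps whose angles are uniform on the arc `[-half_width, half_width]`. The abstract does not state the step-length or angle distribution, so both are assumptions, and the function name is hypothetical.

```python
import numpy as np

def restricted_walk(n_steps, half_width, n_walks, seed=0):
    """Simulate endpoints of complex random walks with restricted step angles."""
    rng = np.random.default_rng(seed)
    # Assumption: angles are uniform on the arc [-half_width, half_width].
    angles = rng.uniform(-half_width, half_width, size=(n_walks, n_steps))
    # Each step is a unit-length complex number; the endpoint is their sum.
    return np.exp(1j * angles).sum(axis=1)

endpoints = restricted_walk(n_steps=5, half_width=np.pi / 4, n_walks=10_000)
print(endpoints.real.min(), np.abs(endpoints).max())
```

Under these assumptions every endpoint has modulus at most `n_steps` and real part at least `n_steps * cos(half_width)`, which is a crude outer bound on the support and not the exact characterization derived in the paper.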

2604.03076 2026-06-19 stat.AP 版本更新

Carbon cost pass-through rate in power system: evidence from Italy under the EU ETS

电力系统中碳成本传导率:来自欧盟排放交易体系下意大利的证据

Pierdomenico Duttilo, Francesco Lisi

AI总结 研究欧盟排放交易体系下碳成本在意大利电力市场的传导率,基于2016-2024年数据,采用自回归线性回归模型,发现全国平均传导率约32%,且各市场区域存在显著异质性。

详情
AI中文摘要

本文研究了欧盟排放交易体系(EU ETS)下碳定价对意大利电力市场的影响,重点关注第三和第四阶段(2016-2024年)各市场区域的碳成本传导率(CPTR)。利用日度数据,研究采用基于自回归动态线性回归模型的计量经济学框架,估计碳成本在批发电力价格中的反映程度。进一步通过稳健性检验和分位数回归,评估CPTR在不同燃料价差水平下的变化。结果表明,碳成本正向且显著地传导至电力价格,证实了碳定价作为关键市场驱动因素的相关性。然而,传导不完全,CPTR值始终低于100%。在国家层面,传导率估计约为32%,第三阶段和第四阶段之间无统计显著变化。各市场区域出现显著异质性:在北部、中北部和撒丁岛,第四阶段传导率上升,而在中南部和西西里岛则下降,反映了发电结构、碳强度和市场条件的差异。总体而言,研究结果强调了市场区域因素在塑造电力市场碳定价有效性中的重要性。

英文摘要

This paper investigates the impact of carbon pricing under the EU Emissions Trading System (EU ETS) on the Italian electricity market, focusing on the carbon cost pass-through rate (CPTR) across market zones during Phases 3 and 4 (2016-2024). Using daily data, the study applies an econometric framework based on a linear regression model with autoregressive dynamics to estimate the extent to which carbon costs are reflected in wholesale electricity prices. It further incorporates robustness checks and quantile regression to assess how the CPTR varies across different fuel spread levels. The results show that carbon costs are positively and significantly transmitted to electricity prices, confirming the relevance of carbon pricing as a key market driver. However, pass-through is incomplete, with CPTR values consistently below 100%. At the national level, the pass-through estimate is around 32%, with no statistically significant change between Phase 3 and Phase 4. Substantial heterogeneity emerges across market zones: pass-through increases in the North, Centre-North, and Sardinia during Phase 4, while it declines in the Centre-South and Sicily, reflecting differences in generation mix, carbon intensity, and market conditions. Overall, the findings highlight the importance of zone-specific factors in shaping the effectiveness of carbon pricing in electricity markets.

2410.19333 2026-06-19 econ.GN physics.soc-ph q-fin.EC stat.AP 版本更新

Swiss-system chess tournaments and unfairness

瑞士制国际象棋锦标赛与不公平性

László Csató, Alex Krumer

AI总结 研究瑞士制国际象棋锦标赛中轮次奇偶性导致的不公平性,发现多执白一局的选手得分显著更高,建议采用偶数轮次和平衡颜色分配机制。

Comments 13 pages, 4 tables

详情
AI中文摘要

瑞士制是一种日益流行的比赛形式,因为它提供了比赛场次与排名准确性之间的有利权衡。然而,关于瑞士制国际象棋锦标赛在奇数轮次下潜在的不公平性,尚无实证研究。为了分析这一问题,我们的论文比较了比赛中多执白一局的选手与少执白一局的选手的得分。利用28个高知名度赛事的数据,我们发现多执白一局的选手得分显著更高。特别是在四个Grand Swiss赛事中,这一优势超过了平局的价值。解决这种不公平性的一种潜在方案是组织偶数轮次的瑞士制国际象棋锦标赛,并使用最近提出的配对机制保证所有选手的颜色分配平衡。

英文摘要

The Swiss system is an increasingly popular competition format as it provides a favourable trade-off between the number of matches and ranking accuracy. However, there is no empirical study on the potential unfairness of Swiss-system chess tournaments if an odd number of rounds is played. To analyse this issue, our paper compares the number of points scored in the tournament between players who played one game more with the white pieces and players who played one game fewer with the white pieces. Using data from 28 highly prestigious competitions, we find that players with an extra white game score significantly more points. In particular, the advantage exceeds the value of a draw in the four Grand Swiss tournaments. A potential solution to this unfairness could be organising Swiss-system chess tournaments with an even number of rounds, and guaranteeing a balanced colour assignment for all players using a recently proposed pairing mechanism.

2512.02203 2026-06-19 econ.EM stat.AP 版本更新

Statistical Inference in Large Multi-way Networks

大规模多路网络中的统计推断

Lucas Resende, Guillaume Lecué, Lionel Wilner, Philippe Choné

AI总结 提出一种基于分类任务的多路网络结构参数估计方法,对固定效应的数量与结构均不作假设,避免伴随参数(incidental parameter)问题,在稀疏网络中比 PPML 更快且置信区间更可靠,应用于法国医疗政策因果效应分析。

Comments Working paper

详情
AI中文摘要

我们提出了一种新方法,用于在多路网络中估计结构参数,同时控制丰富的固定效应结构。该方法基于一系列分类任务,对固定效应的数量和结构均不作假设。与完全最大似然方法相比,我们的估计量不会受到伴随参数(incidental parameter)问题的影响。对于稀疏连接的网络,它在计算上也比 PPML 更快。我们提供的经验证据表明,我们的估计量比 PPML 及其偏差修正策略产生更可靠的置信区间。即使在模型误设下,这些改进仍然成立,并且在稀疏设置中更为显著。虽然 PPML 在密集、低维数据中仍具有竞争力,但我们的方法为多路模型提供了一种稳健的替代方案,能够随稀疏性高效扩展。该方法被应用于研究一项政策改革对法国医疗空间可达性的因果效应。

英文摘要

We propose a new method to estimate structural parameters in multi-way networks while controlling for rich structures of fixed effects. The method is based on a series of classification tasks and is agnostic to both the number and structure of fixed effects. In contrast to full maximum likelihood approaches, our estimator does not suffer from the incidental parameter problem. For sparsely connected networks, it is also computationally faster than PPML. We provide empirical evidence that our estimator yields more reliable confidence intervals than PPML and its bias-correction strategies. These improvements hold even under model misspecification and are more pronounced in sparse settings. While PPML remains competitive in dense, low-dimensional data, our approach offers a robust alternative for multi-way models that scales efficiently with sparsity. The method is applied to study the causal effect of a policy reform on spatial accessibility to health care in France.

2506.18808 2026-06-19 stat.AP 版本更新

A Practical Introduction to Regression-based Causal Inference in Meteorology (I): All confounders measured

气象学中基于回归的因果推断实用入门(I):所有混杂因素均已测量

Caren Marzban, Yikun Zhang, Nicholas Bond, Michael Richman

AI总结 介绍在非时间序列场景下,利用匹配方法进行因果推断,提供气象学应用实例和R代码。

详情
AI中文摘要

一个变量是否是另一个变量的原因,或者仅仅与之相关,通常是一个重要的科学问题。因果推断是在统计背景下解决该问题的技术体系。尽管在存在时间信息时评估因果关系相对直接,但在非时间序列场景(本文考虑的情况)下,评估因果效应更为困难。因果推断领域的发展涉及来自众多主题的概念,这限制了它在包括气象学在内的一些领域的应用。然而,因果推断所需的核心知识仅涉及基本概率论和回归,这是大多数气象学家熟悉的主题。通过聚焦这些核心领域,本文及其姊妹篇为气象学界进入(非时间序列)因果推断领域提供了垫脚石。尽管介绍了一些理论基础,但主要目标是将一种称为匹配的特定方法应用于一个气象学问题。应用所用数据为公开数据,并提供了R代码,为气象学学生和研究人员进入该领域铺平了道路。

英文摘要

Whether a variable is the cause of another, or simply associated with it, is often an important scientific question. Causal Inference is the name associated with the body of techniques for addressing that question in a statistical setting. Although assessing causality is relatively straightforward in the presence of temporal information, outside of that setting (the situation considered here) it is more difficult to assess causal effects. The development of the field of causal inference has involved concepts from a wide range of topics, thereby limiting its adoption across some fields, including meteorology. However, at its core, the requisite knowledge for causal inference involves little more than basic probability theory and regression, topics familiar to most meteorologists. By focusing on these core areas, this and a companion article provide a stepping stone for the meteorology community into the field of (non-temporal) causal inference. Although some theoretical foundations are presented, the main goal is the application of a specific method, called matching, to a problem in meteorology. The data for the application are in the public domain, and R code is provided as well, forming an easy path for meteorology students and researchers to enter the field.
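
The idea behind matching with a measured confounder can be sketched on synthetic data. This is an illustrative example only and is unrelated to the article's own R code or meteorological data: it assumes a single confounder `c`, a binary treatment `t`, and one-nearest-neighbour matching with replacement. All names and numbers are hypothetical.

```python
import numpy as np

def nn_match_att(c, t, y):
    """Effect on the treated via 1-nearest-neighbour matching on one confounder."""
    treated = t == 1
    c_ctrl, y_ctrl = c[~treated], y[~treated]
    order = np.argsort(c_ctrl)
    c_sorted, y_sorted = c_ctrl[order], y_ctrl[order]
    # Locate the two control units that bracket each treated unit's confounder.
    pos = np.searchsorted(c_sorted, c[treated])
    left = np.clip(pos - 1, 0, len(c_sorted) - 1)
    right = np.clip(pos, 0, len(c_sorted) - 1)
    use_left = np.abs(c[treated] - c_sorted[left]) <= np.abs(c_sorted[right] - c[treated])
    nearest = np.where(use_left, left, right)
    # Average the treated-minus-matched-control outcome differences.
    return float(np.mean(y[treated] - y_sorted[nearest]))

rng = np.random.default_rng(0)
n = 20_000
c = rng.normal(size=n)                                        # measured confounder
t = (rng.random(n) < 1.0 / (1.0 + np.exp(-c))).astype(int)    # treatment depends on c
y = 1.0 * t + 2.0 * c + rng.normal(size=n)                    # true effect of t is 1

naive = float(y[t == 1].mean() - y[t == 0].mean())
att = nn_match_att(c, t, y)
print(naive, att)
```

The naive difference in means absorbs the confounder's influence and overstates the effect, while the matched estimate compares units with similar values of `c` and lands close to the simulated effect of 1.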

2506.18652 2026-06-19 stat.AP 版本更新

A Practical Introduction to Regression-based Causal Inference in Meteorology (II): Unmeasured confounders

气象学中基于回归的因果推断实用入门(II):未测量的混杂因素

Caren Marzban, Yikun Zhang, Nicholas Bond, Michael Richman

AI总结 介绍在未测量混杂因素存在时,利用工具变量法通过回归估计因果效应,并以气象数据为例说明工具变量选择的重要性。

详情
AI中文摘要

将相关性“提升”为因果关系的障碍之一是混杂现象,即两个变量之间的相关性实际上是由第三个变量(称为混杂因素)引起的。在先前的一篇配套文章中,我们考察了混杂因素被测量的情况。本文表明,即使混杂变量未被测量,在某些条件下,仍然可以通过一种基于回归的方法(利用工具变量的概念)来估计因果效应。使用与姊妹篇类似的气象数据集,比较和对比了因果效应的几种不同估计。结果表明,工具变量估计的因果效应依赖于工具变量的选择,而气象学考虑对于解决这种不确定性至关重要。提供了用于生成所有结果的R代码,并概述了未来工作的许多方向。

英文摘要

One obstacle to "elevating" correlation to causation is the phenomenon of confounding, i.e., when a correlation between two variables exists because both variables are in fact caused by a third variable, called a confounder. The situation where the confounders are measured is examined in an earlier, accompanying article. Here, it is shown that even when the confounding variables are not measured, under certain conditions it is still possible to estimate the causal effect via a regression-based method that uses the notion of instrumental variables. Using a meteorological data set, similar to that in the sister article, a number of different estimates of the causal effect are compared and contrasted. It is shown that the instrumental-variable estimates of causal effect depend on the choice of the instrumental variable, and that meteorological considerations are important in resolving the ambiguity. R code is provided for generating all of the results, and numerous directions for future work are outlined.
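
A small synthetic example shows why an instrumental variable helps when the confounder is not measured. It is a sketch under textbook assumptions and is not the article's R code or data: one instrument `z` that affects the cause `x` and is independent of the unmeasured confounder `u`, with a linear outcome model. In this single-instrument case the two-stage least squares estimate reduces to the ratio `cov(z, y) / cov(z, x)`. All names and coefficients are hypothetical.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a simple regression of y on x."""
    return float(np.cov(x, y)[0, 1] / np.var(x, ddof=1))

def iv_slope(z, x, y):
    """Instrumental-variable (ratio) estimate: cov(z, y) / cov(z, x)."""
    return float(np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1])

rng = np.random.default_rng(0)
n = 50_000
u = rng.normal(size=n)                        # unmeasured confounder
z = rng.normal(size=n)                        # instrument, independent of u
x = z + u + rng.normal(size=n)                # cause, driven by z and u
y = 2.0 * x + 3.0 * u + rng.normal(size=n)    # true causal effect of x is 2

ols = ols_slope(x, y)
iv = iv_slope(z, x, y)
print(ols, iv)
```

The ordinary regression slope is pulled away from 2 by the confounder, whereas the instrumental-variable ratio recovers it. The result relies entirely on the instrument being valid, which is in line with the abstract's point that the estimate depends on the choice of instrument.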

2502.06866 2026-06-19 cs.LG cs.AI econ.EM stat.AP stat.ML 版本更新

Global Ease of Living Index: a machine learning framework for longitudinal analysis of major economies

全球生活便利指数:面向主要经济体纵向分析的机器学习框架

Arun Kumar Selvaraj, Tanay Panat, Rohitash Chandra

发表机构 Transitional Artificial Intelligence Research Group, School of Mathematics and Statistics(过渡人工智能研究组,数学与统计学学院);Centre for Artificial Intelligence and Innovation(人工智能与创新中心);Pingla Institute(Pingla研究所)

AI总结 提出全球生活便利指数,结合社会经济和基础设施因素,利用机器学习处理缺失数据,并通过主成分分析和因子分析降维,为政策制定者提供改善生活质量的可操作工具。

详情
AI中文摘要

全球经济、地缘政治条件以及COVID-19疫情等破坏性事件对生活成本和生活质量产生了巨大影响。理解主要经济体中生活成本和生活质量的长期影响至关重要。一个透明且全面的生活指数必须包含生活条件的多个维度。在本研究中,我们提出了一种通过全球生活便利指数量化生活质量的方法,该指数将各种社会经济和基础设施因素整合为一个单一综合得分。我们的指数利用定义生活水平的经济指标,这有助于针对特定领域进行干预改进。我们提出了一个机器学习框架来处理特定国家某些经济指标的数据缺失问题。然后,我们整理并更新数据,并使用降维方法(主成分分析和因子分析)创建自1970年以来主要经济体的生活便利指数。我们的工作通过为政策制定者提供识别需要改进领域(如医疗系统、就业机会和公共安全)的实用工具,显著丰富了相关文献。我们的方法使用开放数据和代码,易于复现并适用于各种情境,为生活质量评估的持续研究和政策制定提供了透明度和可访问性。

英文摘要

The drastic changes in the global economy, geopolitical conditions, and disruptions such as the COVID-19 pandemic have impacted the cost of living and quality of life. It is essential to comprehend the long-term implications of the cost of living and quality of life in major economies. A transparent and comprehensive living index must include multiple dimensions of living conditions. In this study, we present an approach to quantifying the quality of life through the Global Ease of Living Index that combines various socio-economic and infrastructural factors into a single composite score. Our index utilises economic indicators that define living standards, which could help in targeted interventions to improve specific areas. We present a machine learning framework to address missing data for certain economic indicators in specific countries. We then curate and update the data and use a dimensionality reduction approach (Principal Component Analysis and Factor Analysis) to create the Ease of Living Index for major economies since 1970. Our work significantly adds to the literature by offering a practical tool for policymakers to identify areas needing improvement, such as healthcare systems, employment opportunities, and public safety. Our approach with open data and code can be easily reproduced and applied to various contexts, providing transparency and accessibility for ongoing research and policy development in quality-of-life assessment.
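
The dimensionality-reduction step can be sketched as a first-principal-component composite score. This is a minimal illustration and not the authors' pipeline: it assumes a complete indicator matrix (no missing values), standardized columns, and synthetic data. The function name and the data are hypothetical.

```python
import numpy as np

def pca_index(indicators):
    """Return first-principal-component scores and their explained-variance share."""
    # Standardise each indicator so that units do not dominate the index.
    z = (indicators - indicators.mean(axis=0)) / indicators.std(axis=0)
    _, s, vt = np.linalg.svd(z, full_matrices=False)
    loadings = vt[0]
    # The sign of a principal component is arbitrary; orient it to rise with the indicators.
    if loadings.sum() < 0:
        loadings = -loadings
    scores = z @ loadings
    explained = float(s[0] ** 2 / np.sum(s ** 2))
    return scores, explained

rng = np.random.default_rng(0)
factor = rng.normal(size=500)   # synthetic latent "ease of living" factor
data = np.column_stack([factor + 0.5 * rng.normal(size=500) for _ in range(4)])
scores, explained = pca_index(data)
print(explained, np.corrcoef(scores, factor)[0, 1])
```

On this synthetic data the first component explains most of the variance and tracks the latent factor closely. Real indicator panels would first need the missing-data handling described in the abstract.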

2406.01557 2026-06-19 stat.ME stat.AP 版本更新

Flexible aggregation of compositional predictors with shared effects for microbiome association analysis

具有共享效应的成分预测因子的灵活聚合及其在微生物组关联分析中的应用

Satabdi Saha, Liangliang Zhang, Michele Guindani, Kim-Anh Do, Christine B. Peterson

AI总结 提出BRACE方法,通过尖峰-聚类先验和投影约束高斯先验,实现微生物组数据的自适应聚类和变量选择,识别与结果共享效应的关键特征。

详情
AI中文摘要

微生物组分析的最新进展为微生物群落的分子动态提供了前所未有的见解,激发了揭示微生物组在人类健康中关键作用的兴趣。然而,由于微生物组数据具有高维、稀疏和成分性的特点,识别与临床结果相关的微生物特征仍然具有挑战性。此外,许多微生物分类群虽被归为不同类别,但可能具有相同的功能角色,这使传统的变量选择方法变得复杂。为了克服这些障碍,我们引入了具有聚合成分效应的贝叶斯回归(BRACE)。这是一种新方法,采用尖峰-聚类(spike-and-cluster)先验,该先验由伯努利活动指标、有限活动集上的Ewens可交换划分先验以及聚类效应上的基于投影的约束高斯先验组合而成,用于进行数据自适应聚类和变量选择。我们工作的方法论创新在于如何将Ewens划分先验与聚类原子上的基于投影的约束高斯先验相结合,以强制满足总和为零的约束。BRACE将对结果具有相似效应的微生物分类群归为一组,得到更易解释的模型,同时实现有效的降维。通过综合模拟和一项考察口腔微生物组组成对胰岛素抵抗影响的真实应用,我们证明了BRACE优于现有方法,尤其是在识别对结果具有共享效应的关键特征方面。

英文摘要

Ongoing advancements in microbiome profiling have provided unprecedented insights into the molecular dynamics of microbial communities, sparking a surge of interest in uncovering the microbiome's critical role in human health. Identifying microbial features linked to clinical outcomes, however, remains challenging due to the high-dimensional, sparse, and compositional nature of microbiome data. Additionally, many microbial taxa, although classified as distinct, may share functional roles, complicating traditional variable selection methods. To overcome these obstacles, we introduce Bayesian Regression with Agglomerated Compositional Effects (BRACE), a novel approach using a spike-and-cluster prior combining Bernoulli activity indicators, an Ewens exchangeable partition prior on the finite active set, and a projection-based constrained Gaussian prior on cluster effects to perform data-adaptive clustering and variable selection. The methodological innovation of our work lies in how we combine the Ewens partition prior with a projection-based constrained Gaussian on the cluster atoms to enforce the sum-to-zero constraint. BRACE groups microbial taxa with similar effects on the outcome, yielding more interpretable models while enabling effective dimension reduction. Through comprehensive simulations and a real-world application examining the influence of oral microbiome composition on insulin resistance, we demonstrate BRACE's superior performance over existing methods, particularly in identifying key features with shared effects on outcomes.

2202.03332 2026-06-19 stat.ME econ.EM stat.AP 版本更新

Practical Forecasting of Environmental Maps: A Functional Data Approach

环境地图的实用预测:一种函数型数据方法

Alexander Gleim, Nazarii Salish

AI总结 提出一种基于函数型数据分析的统计方法,用于预测随时间变化的地理区域环境数据,通过整合时空依赖关系生成预测表面,并以德国地面臭氧浓度预测为例验证其有效性。

详情
AI中文摘要

环境问题在社会经济和健康研究中日益受到关注,推动了相关现实过程记录和数据收集的进展。然而,传统数据处理工具往往过于局限,无法考虑此类数据集的丰富特性。本文提出了一种简单的统计视角,用于预测随时间在预定义地理区域上顺序收集的环境数据。我们将此类数据集视为具有可能复杂地理区域的表面(或函数型)时间序列。利用函数型数据分析技术,我们开发了一种预测方法,能够同时考虑地理和时间依赖性。该方法允许整合传统多元技术以提供预测表面。我们通过德国地面臭氧浓度的预测示例展示了我们方法的实用价值,证明了其有效性和广泛应用的潜力。

英文摘要

Environmental problems are receiving increasing attention in socio-economic and health studies, fostering advances in recording and data collection of related real-life processes. However, traditional tools for data processing are often found too restrictive as they do not account for the rich nature of such data sets. In this paper, we propose a simple statistical perspective on forecasting environmental data collected sequentially over time across some predefined geographic region. We treat such a data set as a surface (or functional) time series with a possibly complicated geographical domain. Using techniques from functional data analysis, we develop a forecasting methodology that accounts for both geographic and temporal dependencies. This methodology allows integration of traditional multivariate techniques to provide forecast surfaces. We demonstrate the practical value of our approach with a forecasting example of ground-level ozone concentration across Germany, showcasing its effectiveness and potential for broad application.