arXivDaily: daily arXiv digest, updated Monday to Friday
stat.AP (Statistics Applications): 17 papers
2606.12174 2026-06-11 stat.AP stat.ME New submission

The data-driven extreme value distribution: non-parametric tail estimation with a derived stability criterion

Michael Sandbichler, Tobias Hell

AI summary: Proposes the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that reconstructs the base distribution with a kernel and derives a stability criterion. On precipitation data it gives more stable return levels than GEV fits, and on metallurgical data it matches a GEV fit where the log-normal over-predicts.

Comments
28 pages, 6 figures

Abstract

Quantifying the likelihood of extreme events underpins risk assessment, yet classical Extreme Value Theory relies on asymptotic assumptions that fail in the data-sparse, non-stationary regimes practitioners increasingly face. We introduce the Data-Driven Extreme Value Distribution (DDEVD), a non-parametric estimator that aggregates all observations metastatistically and reconstructs the base distribution with a kernel, removing parametric tail assumptions. We derive its optimal bandwidth and prove a stability law $m < C\,n^{1+\gamma/2}$ relating reliable extrapolation to the extreme value index $\gamma$. In sub-hourly Alpine precipitation, DDEVD recovers stable 100-year return levels from single decades (calibration ratio $0.96$), departing from the full-record reference by over $50\,\%$ in fewer than one window in fifty -- versus one in five for a GEV fit. In metallurgical micrographs, it matches a generalised extreme-value fit on the safety-relevant grain-size tail, where the standard log-normal over-predicts by $58\,\%$ at $1\,\mathrm{cm}^{2}$.
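
To make the scaling in the stability law $m < C\,n^{1+\gamma/2}$ concrete, the following is a minimal sketch that only evaluates the right-hand side. The abstract does not define $m$, $n$, or $C$, so they are treated here as abstract symbols; the function names and the default `c = 1.0` are hypothetical choices for illustration, not values from the paper.

```python
def stability_bound(n: float, gamma: float, c: float = 1.0) -> float:
    """Right-hand side C * n**(1 + gamma/2) of the stated stability law.

    `c` is a placeholder constant; the abstract does not give its value.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    return c * n ** (1.0 + gamma / 2.0)


def growth_factor(scale: float, gamma: float) -> float:
    """Factor by which the bound grows when n is multiplied by `scale`.

    The unknown constant C cancels in this ratio.
    """
    return scale ** (1.0 + gamma / 2.0)


if __name__ == "__main__":
    # Doubling n multiplies the bound by 2**(1 + gamma/2), whatever C is.
    for gamma in (0.0, 0.2, 0.5):
        print(f"gamma={gamma}: doubling n scales the bound by {growth_factor(2.0, gamma):.3f}")
```

Because the constant cancels in `growth_factor`, the ratio is the only quantity here that does not depend on an assumed value: a larger extreme value index makes the bound grow faster in $n$.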

2606.12097 2026-06-11 stat.AP physics.data-an New submission

Weibull-Stationary Stochastic Differential Equations for Conditional Long-Horizon Wind Power Forecasting

Luca Di Persio, Mehrdad Ghadiri

AI summary: Proposes a Weibull-stationary SDE framework for one-month-ahead probabilistic wind-power forecasting that combines a heteroskedastic Kalman filter with three SDE models at ten-minute resolution. Mean CRPS is about 1.57 m/s, and the power-domain Wasserstein distance is below 1.4% of rated capacity.

Abstract

We present a one-month-ahead conditional probabilistic framework for wind-power forecasting at ten-minute resolution. Monthly Weibull shape and scale parameters are estimated from serially dependent SCADA wind-speed data, corrected through a Godambe covariance, and forecast by a heteroskedastic Kalman filter on a bivariate VAR(1) state-space model. Conditional on the MMSE forecasted Weibull invariant law, we construct and compare three positive wind-speed SDE models: an Ornstein-Uhlenbeck-Weibull transform, a Fokker-Planck drift-first diffusion, and a Fokker-Planck diffusion-first model. The simulated wind-speed ensembles are mapped to power through a calibrated XGBoost power curve. Applied to January 2021 data from a Senvion MM92 turbine at Kelmarsh Wind Farm, the three SDE formulations are statistically indistinguishable in probabilistic accuracy, with mean CRPS values between 1.569 and 1.575 m/s. The diffusion-first model is therefore preferred on computational grounds, reducing runtime by about a factor of seven relative to the OU-Weibull model. In the power domain, the Wasserstein distance between simulated and observed distributions is 26.1-27.6 kW, below $1.4\%$ of rated capacity, while the monthly energy-yield bias is about $-7.3\%$ for the examined month. Exceedance-probability errors remain below 1.6 percentage points over the 0-1500 kW range and about 2.2 percentage points near rated power. These quantities provide decision-relevant probabilistic inputs for downstream operational problems, rather than completed reserve, storage, market, or fatigue-optimization decisions. Full marginalisation over the Kalman predictive law of the Weibull parameters is left as a natural extension.
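
The CRPS values quoted above score an ensemble of simulated wind speeds against one observation. As a minimal sketch, assuming the common empirical estimator $\mathrm{CRPS} = \frac{1}{M}\sum_i |x_i - y| - \frac{1}{2M^2}\sum_{i,j} |x_i - x_j|$ (the abstract does not say which estimator the authors use), the score can be computed as follows. The function name and the example numbers are hypothetical.

```python
def ensemble_crps(ensemble, observation):
    """Empirical CRPS of an ensemble forecast for one scalar observation.

    Computes mean|x_i - y| - 0.5 * mean|x_i - x_j| over all member pairs.
    Lower is better; the score has the units of the forecast (e.g. m/s).
    """
    members = list(ensemble)
    m = len(members)
    if m == 0:
        raise ValueError("ensemble must be non-empty")
    # Average distance between ensemble members and the observation.
    term_obs = sum(abs(x - observation) for x in members) / m
    # Average distance between all ordered pairs of members (spread).
    term_spread = sum(abs(a - b) for a in members for b in members) / (m * m)
    return term_obs - 0.5 * term_spread


if __name__ == "__main__":
    simulated_speeds = [6.2, 7.9, 8.4, 9.1, 10.5]  # illustrative values only
    print(ensemble_crps(simulated_speeds, observation=8.0))
```

For a single-member ensemble the score reduces to the absolute error, which is why CRPS is often read as a probabilistic generalisation of MAE.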

2606.12057 2026-06-11 stat.AP New submission

ChargeBD: Character-Aware Heterogeneous Agent Reasoning for Guided Engineering in Battery Development

Rui Huang, Zekun Jiang, Xingyu Niu, Yuqiang Li, Xinying Gu, Tianhang Zhou

AI summary: Proposes ChargeBD, a framework that uses a matrix of MBTI-inspired persona agents with heterogeneous reasoning to address adaptivity in multi-scale, multi-objective redox-flow battery R&D.

Abstract

Redox-flow battery (RFB) research spans molecular design, electrolyte optimization, electrode and membrane materials, stack operation, system management, and safety analysis, making it a constrained, multi-scale, and multi-objective energy-storage R&D problem. Although large language models (LLMs) can support scientific knowledge integration and proposal generation, generic LLM reasoning remains insufficiently adaptive across innovation-oriented exploration, rule-based execution, mechanistic modeling, and system-level trade-offs. Here we introduce ChargeBD, a character-aware heterogeneous-agent reasoning framework for guided engineering in battery development. Starting from a 50-question RFB-specific task set, we construct a 500-question ESS-LLM Benchmark and define MBTI-inspired persona agents as structured cognitive-bias templates rather than psychometric instruments or representations of real personalities. DeepSeek-V3-Plus is selected as the shared base model, and 16 MBTI-inspired persona agents are evaluated to construct a persona capability matrix and a cognitive advantage matrix.

2606.11768 2026-06-11 stat.ME stat.AP New submission

Hierarchical excitatory processes for modelling event-time data in the presence of exogenous stimuli

Francesco Sanna Passino, Nicholas A. Heard, Jeffrey W. Brown, William N. Frost, Vince P. Lyzinski

AI summary: Proposes the Hierarchical Excitatory Process (HEP), which superposes the excitation effects of exogenous stimuli through dynamically evolving kernels to model event-time data under repeated stimulation, and embeds it in a clustering framework to identify latent response patterns.

Abstract

We introduce the Hierarchical Excitatory Process (HEP), a flexible point process model for event-time data observed under repeated external stimuli. The proposed framework models the conditional intensity of a point process as a superposition of excitation effects induced by external stimuli, characterised by kernels with parameters dynamically evolving over time. This hierarchical construction enables modulation of excitation strength across repeated stimuli, providing an interpretable structure. We establish likelihood-based inference for the proposed model and embed HEP within a model-based clustering framework to identify latent groups sharing similar response dynamics. Simulation studies demonstrate the model's ability to recover evolving latent patterns, and an application to spike train recordings from the sea slug Aplysia pedal ganglion illustrates how HEPs are able to characterise stimulus-driven excitability of neurons across repeated stimulation under different experimental conditions.
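
The superposition structure described above can be sketched with a toy intensity function. This is an illustration under stated assumptions, not the HEP model itself: the exponential kernel, the geometric evolution of excitation strength across stimuli (`alpha0 * rho**k`), and all parameter names are hypothetical stand-ins for the kernels and dynamics the paper actually specifies.

```python
import math


def conditional_intensity(t, stimulus_times, baseline, alpha0, rho, beta):
    """Toy conditional intensity at time t under repeated stimuli.

    lambda(t) = baseline + sum over stimuli s_k < t of
                alpha0 * rho**k * exp(-beta * (t - s_k)).

    The k-th stimulus (in time order, starting at k = 0) excites with
    strength alpha0 * rho**k, so rho < 1 mimics a fading response and
    rho > 1 a strengthening one. These dynamics are an assumption.
    """
    total = baseline
    for k, s in enumerate(sorted(stimulus_times)):
        if s < t:  # only stimuli strictly before t contribute
            strength = alpha0 * rho ** k
            total += strength * math.exp(-beta * (t - s))
    return total


if __name__ == "__main__":
    stimuli = [1.0, 2.0, 3.0]
    for t in (0.5, 1.5, 2.5, 3.5):
        print(t, conditional_intensity(t, stimuli, baseline=0.2,
                                       alpha0=1.0, rho=0.5, beta=1.0))
```

Letting the kernel parameters depend on the stimulus index is what allows the excitation strength to be modulated across repeated stimuli in this sketch.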

2606.11651 2026-06-11 cs.LG q-bio.QM stat.AP New submission

DeepRHP: A Hybrid Variational Autoencoder for Designing Random Heteropolymers as Protein Mimics

Shuni Li, Zhiyuan Ruan, Andy Shen, Ivan Jayapurna, Ting Xu, Haiyan Huang

AI summary: Proposes DeepRHP, a hybrid variational autoencoder that combines a feature-based VAE with a classical VAE in a semi-supervised framework. Its latent space captures key chemical features and sequence patterns to guide random heteropolymer design, and its suggestions for stabilizing membrane proteins are cross-validated against published results.

Comments
Oral presentation at AAAI 2023 Workshop on AI to Accelerate Science and Engineering

Abstract

Synthetic random heteropolymers (RHPs), consisting of a predefined set of monomers, offer an approach toward the design of protein-like materials. These RHPs, if designed appropriately, can mimic protein behavior and function. As such, there is a need for computational tools to efficiently guide RHP design. We bridge this gap by developing DeepRHP, a modified variational autoencoder (VAE) model under a semi-supervised framework. By equipping a classical VAE with an additional feature-based VAE, DeepRHP forces the latent space to capture structures of critical chemical features as well as individual RHP sequence patterns. In this sense, our method is versatile by allowing any relevant features to be incorporated in a hybrid manner. We demonstrate the effectiveness of DeepRHP by suggesting potential monomer compositions that stabilize membrane proteins (e.g. Aquaporin Z) in non-native environments and cross-validating our prediction with published results. The concordance between our model and true RHP function suggests strong potential in utilizing hybrid autoencoder architectures to guide RHP design for proteins and other biological compounds.

2606.11405 2026-06-11 stat.ME stat.AP New submission

Bayesian Causal Machine Learning for Cure Models

Antonio R. Linero, F. Javier Rubio, Piyali Basak

AI summary: Because a treatment can affect the cure probability and the survival time of uncured patients differently, proposes BartCure, a Bayesian causal machine learning method that decomposes the causal effect on restricted mean survival time, and applies it to a breast cancer trial.

Abstract

In survival studies, treatments can benefit patients through different mechanisms: a treatment may increase the probability of being cured or delay failure among patients who are not cured. Quantifying which mechanism is dominant, and whether it varies across subpopulations, is clinically important, yet there is limited work in the causal machine learning literature addressing this problem. Standard causal survival learners target finite-horizon survival or restricted mean survival time, while many cure models capture cure structures without estimating causal effects. In this work, we define meaningful causal effects in the presence of a cured subpopulation and introduce BartCure, a Bayesian causal machine learning approach for estimating them. The causal effects we recommend decompose the causal effect on restricted mean survival time into a stochastic cure and stochastic latency component, and we relate these new effects to both stochastic intervention effects and causal effects in principal strata. In simulations, BartCure is competitive for estimating average effects and is especially effective at conservatively detecting the direction of treatment-effect heterogeneity. We apply BartCure to estimate average and subgroup causal effects and to identify treatment effect heterogeneity in the CALGB 40101 breast cancer trial.
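
The idea of splitting an effect on restricted mean survival time (RMST) into a cure part and a latency part can be illustrated with a textbook mixture cure model. This sketch is not BartCure and does not implement the paper's stochastic cure and latency effects: it assumes a survival function $S(t) = \pi + (1-\pi)e^{-\lambda t}$ with cure probability $\pi$ and an exponential latency rate $\lambda$, and it uses one simple sequential ordering (change the cure probability first, then the rate). All names and numbers are hypothetical.

```python
import math


def cure_survival(t, cure_prob, rate):
    """Mixture cure survival: cured patients never fail, others fail at `rate`."""
    return cure_prob + (1.0 - cure_prob) * math.exp(-rate * t)


def rmst(tau, cure_prob, rate):
    """Restricted mean survival time: integral of S(t) from 0 to tau (closed form)."""
    latency = (1.0 - math.exp(-rate * tau)) / rate
    return cure_prob * tau + (1.0 - cure_prob) * latency


def rmst_effect_decomposition(tau, control, treated):
    """Split RMST(treated) - RMST(control) into cure and latency parts.

    `control` and `treated` are (cure_prob, rate) pairs. The cure part
    changes only the cure probability; the latency part then changes only
    the rate. The two parts sum to the total RMST difference.
    """
    p0, r0 = control
    p1, r1 = treated
    cure_part = rmst(tau, p1, r0) - rmst(tau, p0, r0)
    latency_part = rmst(tau, p1, r1) - rmst(tau, p1, r0)
    return cure_part, latency_part


if __name__ == "__main__":
    cure, latency = rmst_effect_decomposition(5.0, control=(0.2, 0.5), treated=(0.3, 0.4))
    print("cure part:", cure, "latency part:", latency)
```

The split depends on the chosen ordering, which is one reason a principled definition of the two components, as the abstract describes, is needed.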

2606.11282 2026-06-11 stat.AP math.PR math.ST New submission

The Statistical Compass

Eliuvish Han Cui

AI summary: Develops probability and stochastic-process ideas as a translation language for statistics, from designed observations and data objects to targets, stability statements, inference, and use, with examples that tie abstract objects to records, mechanisms, and decisions.

Comments
669 pages, 23 figures; textbook/monograph working manuscript

Abstract

This monograph develops probability and stochastic-process ideas as a translation language for statistics: from designed observations and data objects to targets, stability statements, inference, and use. The chapters move from motivating examples and randomization through probability measures, kernels, likelihoods, data objects, weak convergence, empirical fields, functional data, M- and Z-estimation, testing, local approximations, event-time processes, and prediction. Historical and biomedical examples are used to keep abstract objects tied to records, mechanisms, and decisions. The aim is to give readers a common grammar for classical probability, modern data structures, and statistical practice.

2606.11118 2026-06-11 cs.LG math.OC math.PR stat.AP stat.ML Updated version

Data-Driven Dynamic Assortment in Online Platforms: Learning about Two Sides

Rahul Roy, Nur Sunar, Jayashankar M. Swaminathan

AI summary: For a two-sided service platform, proposes a data-driven algorithm that dynamically optimizes the assortment while customer and seller choice parameters are unknown, and proves that its regret grows polylogarithmically over time at the optimal rate.

Abstract

We study a dynamic assortment problem on a two-sided service platform with incomplete information and heterogeneous customers in a discrete-time setting. In each period, a customer arrives seeking service, and the platform chooses an assortment of sellers to display. The customer then proposes a transaction to at most one seller in the assortment according to a multinomial logit choice model. After a fixed number of periods, sellers review the proposals they have received and each chooses at most one customer according to another multinomial logit choice model, after which the cycle repeats. A key challenge is that the platform does not know the choice-model parameters of either customers or sellers in advance. To our knowledge, this is the first study of a dynamic assortment problem in which both sides' choice parameters are unknown. We develop a data-driven algorithm that learns these parameters while optimizing the platform's objective over time. We evaluate performance using regret, which measures revenue loss relative to a clairvoyant benchmark that knows all parameters and customer arrivals in advance. We show that the algorithm's worst-case regret grows polylogarithmically over time, and we derive a matching lower bound, establishing its rate optimality.
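
The customer step above, in which a customer proposes to at most one displayed seller under a multinomial logit (MNL) model, can be sketched as follows. This is a generic MNL with an outside option of utility 0 standing for "no proposal"; the utilities and function name are hypothetical, and the paper's learning algorithm is not reproduced here.

```python
import math


def mnl_proposal_probabilities(utilities):
    """Return (per-seller proposal probabilities, probability of no proposal).

    P(seller i) = exp(u_i) / (1 + sum_j exp(u_j)), where the 1 comes from
    the outside option with utility normalised to 0.
    """
    utilities = list(utilities)
    # Subtract the largest exponent before exponentiating for numerical stability.
    shift = max([0.0] + utilities)
    weights = [math.exp(u - shift) for u in utilities]
    outside = math.exp(-shift)
    total = outside + sum(weights)
    return [w / total for w in weights], outside / total


if __name__ == "__main__":
    probs, no_proposal = mnl_proposal_probabilities([1.0, 0.5, -0.2])
    print(probs, no_proposal)
```

Adding a seller to the assortment lowers every other seller's proposal probability, which is the trade-off an assortment policy has to balance.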

2606.01650 2026-06-11 q-fin.PM q-fin.TR stat.AP stat.ME Updated version

Post Selection Estimation of Sharpe Ratios

Steven E. Pav

AI summary: For an asset selected for having the highest in-sample Sharpe ratio among many, studies estimators based on the polyhedral lemma, James-Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding, and empirical Bayes, and evaluates their bias, root mean square error, and rank correlation in simulations.

Abstract

We consider the problem of estimating the true Sharpe ratio of an asset selected for having the highest observed in-sample Sharpe ratio among many assets. We discuss estimators based on the polyhedral lemma, James-Stein shrinkage, debiasing the expected maximum Sharpe ratio, thresholding, and empirical Bayes. We test these estimators in simulations, computing bias and root mean square error across different values of sample size, number of assets, and spread and shape of population Sharpe ratios. We also compute rank correlation of the estimators against the underlying quantity, simulating how these estimators might be used to compare or rank the output of different teams which perform this selection process. We find that the James-Stein estimator provides the best performance across many different realistic values of the relevant parameters, followed by the GMLEB estimator of Jiang and Zhang. These results are fairly robust to correlation of asset returns, with some caveats.
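
The selection problem that motivates these estimators can be seen in a small simulation: when every asset has the same true Sharpe ratio, the largest in-sample Sharpe ratio is biased upward. The sketch below uses only the Python standard library and assumes independent Gaussian returns with unit volatility; the function names, seed, and simulation sizes are hypothetical, and none of the estimators compared in the paper is implemented.

```python
import random
import statistics


def sample_sharpe(returns):
    """Per-period sample Sharpe ratio: mean return over sample standard deviation."""
    return statistics.fmean(returns) / statistics.stdev(returns)


def max_in_sample_sharpe(n_assets, n_obs, true_sharpe, rng):
    """Largest sample Sharpe ratio among assets that all share `true_sharpe`."""
    best = float("-inf")
    for _ in range(n_assets):
        # Unit volatility, so the mean return equals the true Sharpe ratio.
        returns = [rng.gauss(true_sharpe, 1.0) for _ in range(n_obs)]
        best = max(best, sample_sharpe(returns))
    return best


def mean_selected_sharpe(n_assets, n_obs, true_sharpe=0.0, n_trials=200, seed=0):
    """Average of the selected (maximum) sample Sharpe ratio over repeated trials."""
    rng = random.Random(seed)
    draws = [max_in_sample_sharpe(n_assets, n_obs, true_sharpe, rng)
             for _ in range(n_trials)]
    return statistics.fmean(draws)


if __name__ == "__main__":
    for k in (1, 5, 20):
        print(k, "assets:", mean_selected_sharpe(k, n_obs=100, n_trials=100))
```

With one asset the average stays near the true value of zero, while with more assets the selected value drifts upward, which is the bias a post-selection estimator has to remove.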

2411.10959 2026-06-11 econ.EM cs.LG math.ST stat.AP stat.ME stat.ML Updated version

Program Evaluation with Remotely Sensed Outcomes

Ashesh Rambachan, Rahul Singh, Davide Viviano

AI summary: Studies causal inference in experiments and quasi-experiments where a remotely sensed variable imperfectly measures the economic outcome, and proposes nonparametric identification of the causal parameter by combining experimental and observational data, with $n^{-1/2}$ inference.

Abstract

We study causal inference in experiments and quasi-experiments, where the economic outcome is imperfectly measured by a remotely sensed variable. The remotely sensed variable is low-cost, scalable, and predictive of the economic outcome in observational data; examples include satellite imagery and mobile phone activity. We model the remotely sensed variable as post-outcome: variation in the economic outcome causes variation in the remotely sensed variable. For example, changes in environmental quality cause changes in satellite imagery, not vice versa. Under this assumption, we propose a formula to nonparametrically identify the causal parameter by combining experimental and observational data. We develop a method for $n^{-1/2}$ inference that is robust to misspecification and that does not restrict the algorithms used to process remotely sensed variables.

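2604.23464 2026-06-11 stat.ME stat.AP Updated version

Design-Based Cross-Validation for Comparing Small Area Estimators

Qianyu Dong, Zehang Richard Li

AI summary: Proposes a cross-validation framework for small area estimators that accommodates complex survey designs. Decomposing the cross-validation squared error reveals an identifiable bias and an unidentifiable component, which improves the robustness and interpretability of model comparison.

Comments
Previous title: "On cross-validation for small area estimators"

AI Chinese abstract (translated)

Local public health monitoring often relies on household surveys, but data are sparse at the required spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and using auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. At the core of the framework is a decomposition of the cross-validation squared error that reveals an identifiable bias and an unidentifiable component, the latter of which can be bounded. Our theoretical results and simulation studies show that conventional approaches such as leave-one-area-out cross-validation can produce misleading model rankings, whereas the proposed method provides more robust and interpretable model comparisons with uncertainty quantification. We illustrate the framework by comparing small area estimation models for subnational female literacy rates estimated from the Demographic and Health Surveys in Zambia.

English abstract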

Subnational monitoring of public health and development indicators often relies on household surveys where data are sparse at the desired spatial resolution. Small area estimation (SAE) methods address this challenge by borrowing strength across areas and incorporating auxiliary information. However, comparing these estimators remains difficult in the absence of ground truth. We propose a design-based cross-validation framework for evaluating small area estimators that accommodates complex survey designs. Our approach enables model-agnostic comparisons between area-level and unit-level SAE models. We derive a decomposition of the conditional mean squared error that yields a consistent cross-validation score, show that finite-sample comparisons carry an unidentifiable bias that can be bounded, and use this bound as a principled threshold for ranking models. We further show that leave-one-area-out cross-validation, a popular alternative, targets extrapolation rather than smoothing error and can reverse the correct ranking. We evaluate the framework through extensive design-based simulations. We apply the framework to compare subnational female literacy estimators in Zambia using the 2024 Demographic and Health Survey. The framework applies broadly across prevalence mapping and other SAE problems and is applicable to any small area estimator irrespective of the underlying model class.

2602.00434 2026-06-11 stat.AP Updated version

How should covariates be handled in randomized trials? Empirical evidence from 50 trials and recommendations for practice

Alternate listing title (translated): Benchmarking covariate-adjustment strategies in randomized clinical trials

Yulin Shao, Liangbo Lyu, Menggang Yu, Bingkai Wang

AI summary: Compares covariate-adjustment strategies across randomized clinical trials in a large-scale empirical study, finding that parsimonious regression approaches deliver strong efficiency gains while machine-learning-based methods are less computationally stable for binary outcomes.

Abstract

Background and Objective: Covariate adjustment can improve precision and power in randomized clinical trials and is recommended by major regulatory agencies. However, there is limited empirical evidence on how different adjustment strategies perform across diverse real-world trials, leaving uncertainty about which methods and covariates should be prespecified in statistical analysis plans. We aim to address this gap and provide practical recommendations.

Methods: We conducted a large-scale empirical study using individual-level data from 50 publicly available randomized trials (29,094 participants; 574 treatment-outcome comparisons). We compared commonly used covariate-adjusted estimators, including analysis of covariance, inverse-probability weighting, g-computation, and machine-learning-based approaches, combined with three covariate-selection strategies. Performance was evaluated using precision gains, changes in point estimates, computational reliability, and the probability that covariate adjustment altered statistical significance relative to an unadjusted analysis.

Results: Covariate adjustment improved precision in most settings, with a median variance reduction of 13.3% for continuous outcomes and 4.6% for binary outcomes. Parsimonious regression approaches using a small prespecified set of prognostic covariates performed as well as or better than more complex methods, particularly in small to medium samples. Machine-learning-based estimators did not provide additional precision and were more prone to computational failure for binary outcomes.

Conclusions: Across trials, parsimonious covariate adjustment provided consistent efficiency gains without introducing systematic bias. These findings support routine covariate adjustment in primary trial analyses. All curated datasets and analysis code are openly released to support future clinical research.

2411.12193 2026-06-11 stat.AP cs.LG stat.ML Updated version

Hierarchical Probabilistic Conformal Prediction for Distributed Energy Resources Adoption

Wenbin Zhou, Shixiang Zhu

AI summary: To handle uncertainty and hierarchical grid structure in forecasting distributed energy resource adoption, proposes an uncertainty quantification framework based on a multivariate Hawkes process and split conformal prediction that preserves statistical validity under aggregation and outperforms baselines on Indianapolis data.

Abstract

The rapid growth of distributed energy resources (DERs) presents both opportunities and operational challenges for electric grid management. Accurately predicting DER adoption is critical for proactive infrastructure planning, but the inherent uncertainty and spatial disparity of DER growth complicate traditional forecasting approaches. Moreover, the hierarchical structure of distribution grids demands that predictions satisfy statistical guarantees at both the circuit and substation levels, a non-trivial requirement for reliable decision-making. In this paper, we propose a novel uncertainty quantification framework for DER adoption predictions that ensures validity across hierarchical grid structures. Leveraging a multivariate Hawkes process to model DER adoption dynamics and a tailored split conformal prediction algorithm, we introduce a new nonconformity score that preserves statistical guarantees under aggregation while maintaining prediction efficiency. We establish theoretical validity under mild conditions and demonstrate through empirical evaluation on customer-level solar panel installation data from Indianapolis, Indiana that our method consistently outperforms existing baselines in both predictive accuracy and uncertainty calibration.
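
For readers unfamiliar with split conformal prediction, the calibration step can be sketched in a few lines. This is the standard construction with absolute residuals as the nonconformity score, not the new aggregation-preserving score the paper introduces, and it ignores the Hawkes-process forecaster entirely. The function names and numbers are hypothetical.

```python
import math


def conformal_radius(calibration_scores, alpha):
    """Split-conformal quantile of held-out nonconformity scores.

    Returns the k-th smallest score with k = ceil((n + 1) * (1 - alpha)).
    If k exceeds n, there are too few calibration points for the requested
    coverage and the interval is unbounded.
    """
    n = len(calibration_scores)
    if n == 0:
        raise ValueError("need at least one calibration score")
    k = math.ceil((n + 1) * (1.0 - alpha))
    if k > n:
        return math.inf
    return sorted(calibration_scores)[k - 1]


def prediction_interval(point_forecast, calibration_scores, alpha=0.1):
    """Symmetric interval: forecast plus or minus the conformal radius."""
    radius = conformal_radius(calibration_scores, alpha)
    return point_forecast - radius, point_forecast + radius


if __name__ == "__main__":
    # Absolute residuals |actual - forecast| on a held-out calibration set
    # (illustrative numbers only).
    residuals = [0.4, 1.2, 0.7, 2.5, 0.9, 1.6, 0.3, 1.1, 2.0]
    print(prediction_interval(12.0, residuals, alpha=0.1))
```

Intervals built this way for individual circuits do not automatically remain valid when summed to the substation level, which is the gap the paper's nonconformity score is designed to close.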

2509.04691 2026-06-11 stat.AP Updated version

Inferring Piece Value in Chess and Chess Variants

Steven E. Pav

AI summary: Uses logistic regression on Lichess data to estimate piece values in standard chess and four variants, finding that major piece values relative to the pawn agree with historical valuations, that bishops are worth slightly more than knights, and that absolute values are smaller in Atomic chess and Antichess.

Comments
58 pages

Abstract

We use logistic regression to estimate the value of the pieces in standard chess and several chess variants, namely Chess 960, Atomic chess, Antichess, and Horde chess. We perform our regressions on several years of data from Lichess, the free and open-source internet chess server. We use the published player ratings to control for the confounding effect of differential player skill. We adjust for the attenuation bias in regressions due to the noise in observed ratings. We find that major piece values, relative to the value of a pawn, are fairly consistent with historical valuation systems. However, we find that bishops are worth slightly more than knights. We find that piece values are smaller, in absolute value, in Atomic chess and Antichess than in standard chess. We also present approximate values of the pieces to equalize odds when players of varying skill face off. We briefly consider self-play experiments using the Stockfish engine, which give a contrasting view of piece value.

2503.11683 2026-06-11 stat.AP

MealMeter: Using Multimodal Sensing and Machine Learning for Automatically Estimating Nutrition Intake

Asiful Arefeen, Samantha Fessler, Sayyed Mostafa Mostafavi, Carol S Johnston, Hassan Ghasemzadeh

Abstract

Accurate estimation of meal macronutrient composition is a prerequisite for precision nutrition, metabolic health monitoring, and glycemic management. Traditional dietary assessment methods, such as self-reported food logs or diet recalls, are time-intensive and prone to inaccuracies and biases. Several existing AI-driven frameworks are data intensive. In this study, we propose MealMeter, a machine-learning-driven method that leverages multimodal sensor data from wearable and mobile devices. Data are collected from 12 participants to estimate macronutrient intake. Our approach integrates physiological signals (e.g., continuous glucose, heart rate variability), inertial motion data, and environmental cues to model the relationship between meal intake and metabolic responses. Using lightweight machine learning models trained on a diverse dataset of labeled meal events, MealMeter predicts the composition of carbohydrates, proteins, and fats with high accuracy. Our results demonstrate that multimodal sensing combined with machine learning significantly improves meal macronutrient estimation compared with the baselines, including a foundation model, and achieves average mean absolute errors (MAE) and average root mean squared relative errors (RMSRE) as low as 13.2 grams and 0.37, respectively, for carbohydrates. Therefore, our developed system has the potential to automate meal tracking, enhance dietary interventions, and support personalized nutrition strategies for individuals managing metabolic disorders such as diabetes and obesity.

1910.07712 2026-06-11 stat.AP stat.CO stat.ME Updated version

Estimating Spatially-Smoothed Fiber Orientation Distribution from Diffusion-MRI Experiments

Jilei Yang, Seungyong Hwang, Mengjie Shi, Jie Peng

AI summary: Proposes the Nearest-Neighbor Adaptive Regression Model (NARM), which estimates the fiber orientation distribution (FOD) spatially adaptively through weighted local likelihood estimation over nested spatial neighborhoods, adds voxel-wise rescaling and a data-driven stopping rule to prevent over-smoothing, selects the similarity-smoothing parameter with a configuration-aware strategy, and improves estimation accuracy and reproducibility on simulated and Human Connectome Project data.

Abstract

Diffusion-weighted magnetic resonance imaging (D-MRI) is a noninvasive in vivo technique for probing the microstructural architecture of biological tissues. At each voxel, the fiber orientation distribution (FOD) characterizes local fiber configurations and orientations and is therefore a central object of estimation in D-MRI analysis. We propose the Nearest-Neighbor Adaptive Regression Model (NARM), a spatially adaptive framework for FOD estimation that performs weighted local likelihood estimation over nested spatial neighborhoods, where the weights jointly encode spatial proximity and similarity among neighboring FODs, measured by either the optimal transport or Hellinger distance. To prevent over-smoothing while preserving structural heterogeneity, we introduce a voxel-wise rescaling scheme and a data-driven stopping rule based on minimum nearest-neighbor dissimilarity. We further develop a configuration-aware strategy for selecting the similarity-smoothing parameter, allowing the smoothing strength to adapt to local fiber complexity. Simulation studies demonstrate that NARM improves FOD estimation accuracy relative to voxel-wise methods and the existing spatial smoothing approach PMARM. Application to test-retest data from the Human Connectome Project additionally shows that NARM yields more reproducible FOD estimates. Implementation details and scripts for the simulation and real data analyses are available at https://github.com/DMRIdotL/NARM
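
One of the two similarity measures named above, the Hellinger distance, is easy to sketch for FODs that have been discretised into non-negative weights over a fixed set of directions. The discretisation, the Gaussian-shaped weight, and the `bandwidth` parameter below are assumptions made for illustration; the abstract does not state the weight function NARM actually uses, and the linked repository should be consulted for the real implementation.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two non-negative weight vectors.

    Both vectors are normalised to sum to 1, then
    H(p, q) = sqrt(1 - sum_i sqrt(p_i * q_i)), which lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("p and q must have the same length")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("each vector must have positive total mass")
    affinity = sum(math.sqrt((a / total_p) * (b / total_q)) for a, b in zip(p, q))
    # Clamp at 0 to guard against tiny negative values from rounding.
    return math.sqrt(max(0.0, 1.0 - affinity))


def similarity_weight(p, q, bandwidth):
    """Hypothetical smoothing weight that decays with Hellinger distance."""
    return math.exp(-(hellinger_distance(p, q) / bandwidth) ** 2)


if __name__ == "__main__":
    single_fiber = [0.8, 0.1, 0.1]   # mass concentrated on one direction
    crossing = [0.45, 0.45, 0.1]     # two dominant directions
    print(hellinger_distance(single_fiber, crossing))
    print(similarity_weight(single_fiber, crossing, bandwidth=0.3))
```

Down-weighting dissimilar neighbours in this way is what lets a smoother borrow strength within a fiber bundle without blurring across a boundary between different configurations.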

2305.09455 2026-06-11 stat.AP Updated version

A latent class approach to assess the effects of dynamic adherence to polytherapy in heart failure patients

Nicole Fontana, Laura Savaré, Emanuele Di Angelantonio, Francesca Ieva

AI summary: Proposes a method combining latent Markov models with dynamic adherence modeling to analyze polytherapy adherence patterns in heart failure patients and their effect on rehospitalization risk, finding that high adherence significantly reduces risk.

Abstract

Heart failure (HF) treatment relies heavily on pharmacotherapy, particularly combining multiple therapies as recommended by clinical guidelines. However, non-adherence to prescribed regimens remains a significant challenge, contributing to increased hospitalizations and poorer patient outcomes. This study introduces a novel methodological pipeline that integrates Latent Markov Models (LMM) with dynamic adherence modeling to evaluate adherence behaviors and their impact on HF rehospitalization. Using administrative healthcare data from Lombardy, Italy, we analyzed 6,818 patients hospitalized for HF between July and December 2020. Adherence was assessed monthly over a six-month observation period, and adherence profiles were linked to clinical outcomes using Cox regression. Seven latent behavioral profiles were identified, reflecting varying levels and trajectories of adherence. The findings revealed that higher adherence levels significantly reduced the risk of rehospitalization. Patients with consistently high adherence exhibited a 56% lower risk of HF rehospitalization compared to those with low adherence. Importantly, improving adherence during the observation period was associated with better survival probabilities, highlighting the potential benefits of timely interventions. Additionally, adherence behaviors were influenced by factors such as age, comorbidity burden, and hospitalization during the observation period. This study underscores the importance of dynamic and personalized strategies to monitor and enhance adherence to polytherapy. By linking adherence patterns to clinical outcomes, the proposed approach offers actionable insights for improving patient management and reducing the burden of HF on healthcare systems.