arXivDaily: arXiv daily academic digest, updated Monday through Friday
2606.17332 2026-06-17 eess.SP New submission

Self-Calibrated Indoor Tracking from Backscatter Fiducials under NLOS Transmitter Illumination

Hüseyin Yiğitler, Kalle Ruttik, Jingyi Liao, Alexander Sheverdyaev, Riku Jäntti

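AI summary: For corridor scenarios where the transmitter-to-fiducial links are NLOS, the paper proposes a grid-based penalized-likelihood tracker that jointly estimates the receiver path, the log-distance slope, and the fiducial offsets, then applies residual correction using surrogate calibration paths, achieving indoor localization without measured calibration coordinates.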

Comments 6 pages, 2 figures, 1 table, submitted to 16th International Conference on Indoor Positioning and Indoor Navigation (IPIN 2026)

English abstract

This paper studies indoor tracking from wall-mounted backscatter fiducials in corridor segments outside direct transmitter illumination. In the measured setup, the transmitter-to-fiducial links are NLOS, whereas the fiducial-to-receiver links along the corridor are largely LOS. The main challenge is that the effective fiducial response is deployment-dependent, so a fixed calibrated link budget is not reliable. We therefore use a grid-based penalized-likelihood tracker that profiles the receiver path, a fitted log-distance slope parameter, and fiducial-specific offsets directly from received powers. The resulting paths can then be reused as surrogate calibration coordinates for residual-map correction, while the same correction with measured calibration coordinates is reported only as a reference. On a short four-fiducial corridor segment, the profiled dual-band tracker gives a 0.52 m median error without measured calibration coordinates, and surrogate residual correction improves this to 0.46 m. With measured calibration coordinates, the same correction and a RADAR-style fingerprint reference both reach 0.31 m. The main remaining limitation is therefore the quality of the surrogate calibration paths rather than the structured observation model itself.

2606.17325 2026-06-17 eess.SP New submission

Backscatter Assisted Indoor NLOS Positioning

Kalle Ruttik, Hüseyin Yiğitler, Jingyi Liao, Alexander Sheverdyaev, Riku Jäntti

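AI summary: Uses passive backscatter devices as virtual anchors, with noncoherent power-domain modeling and a corridor-constrained MAP tracker, to achieve sub-meter continuous indoor NLOS positioning.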

Comments 6 pages, 5 figures, accepted by IEEE International Symposium on Personal, Indoor and Mobile Radio Communications (PIMRC) 2026

English abstract

Passive backscatter devices (BDs) can enable indoor non-line-of-sight (NLOS) positioning by serving as virtual anchors whose Doppler-separated signatures are observable in standard channel estimates. This paper studies continuous user-equipment (UE) tracking in corridor environments using a noncoherent power-domain formulation that avoids BD phase synchronization and remains robust to residual carrier offsets and strong multipath. The BD-dependent measurements are modeled by a log-distance law with unknown BD-specific offsets, which allows passive asynchronous devices to be used as anchors without transmit-power calibration. Based on this model, we develop a corridor-constrained maximum a posteriori (MAP) tracker with motion regularization and Huber-robust estimation. In ray-tracing-inspired simulations, the method achieves median positioning errors of 0.23 to 0.27 m with 90th-percentile errors below 0.45 m. In office-corridor measurements with four passive BDs at 866 MHz, it attains an aggregated median error of 0.505 m and outperforms a simple weighted-average baseline. The results show that passive asynchronous BDs can provide practical sub-meter indoor NLOS tracking while remaining compatible with existing channel-estimation pipelines and energy-autonomous BD deployments.
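
The log-distance law with an unknown device-specific offset can be illustrated with a small sketch. It assumes a plain least-squares fit with a known slope and known distances along a candidate path, which is a simplification and not the paper's Huber-robust MAP tracker. The function names and all numeric values are hypothetical.

```python
import math

def predicted_power_db(offset_db, slope, distance_m):
    """Log-distance law: received power in dB at a given distance."""
    return offset_db - 10.0 * slope * math.log10(distance_m)

def profile_offset_db(powers_db, distances_m, slope):
    """Least-squares offset for one device, given a candidate path and slope.

    Under squared error, the best offset is the mean of the residuals
    left after removing the distance-dependent term.
    """
    residuals = [p + 10.0 * slope * math.log10(d)
                 for p, d in zip(powers_db, distances_m)]
    return sum(residuals) / len(residuals)

# Hypothetical noise-free data: offset -40 dB, slope 2.
distances = [1.0, 2.0, 4.0, 8.0]
powers = [predicted_power_db(-40.0, 2.0, d) for d in distances]
estimated_offset = profile_offset_db(powers, distances, slope=2.0)
```

Because the offset is recovered from the data for each candidate path, no transmit-power calibration of the device is needed in this toy setting.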

2606.17311 2026-06-17 eess.SP New submission

Pilot-Aided MIMO Channel Identification and Linear Deconvolution in Correlated Gaussian Noise

Necati Kagan Erkek, Y. Ugur Ozcan

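AI summary: For MIMO systems under spatially correlated Gaussian noise, applies pilot-aided maximum-likelihood/least-squares channel estimation, compares it with the Cramér-Rao bound, then uses the estimated channel for data recovery and analyzes how training-sequence length and regularization affect performance.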

Comments 8 pages

English abstract

This paper presents a pilot-aided study of multiple-input multiple-output (MIMO) channel identification and linear deconvolution under spatially correlated Gaussian noise. A real-valued $4\times4$ baseband model is analyzed for both memoryless and finite-impulse-response channels. The noise process is generated from a Toeplitz covariance matrix, the channel is estimated from pilot symbols through maximum-likelihood/least-squares formulations, and the empirical mean-square error is compared with the Cramér-Rao bound. The estimated channel is then used for data-symbol recovery through maximum-likelihood zero-forcing and linear minimum-mean-square-error deconvolution. The results show that sufficiently long and well-conditioned pilot blocks allow the channel estimator to approach the theoretical lower bound, whereas short training intervals cause rank and conditioning limitations, especially for the four-tap model. The deconvolution experiments further show that MMSE regularization provides a more stable inverse than unregularized zero forcing at low signal-to-noise ratios and for inaccurate channel estimates.
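
The stabilizing effect of MMSE regularization can be seen in a scalar toy case. The sketch below assumes a real-valued single-tap channel (a $1\times1$ stand-in for the paper's $4\times4$ model) and unit symbol variance by default; the function names and values are hypothetical.

```python
def zf_gain(h_hat):
    """Zero-forcing equalizer gain: the plain inverse of the channel estimate."""
    return 1.0 / h_hat

def mmse_gain(h_hat, noise_var, symbol_var=1.0):
    """Linear MMSE equalizer gain: a regularized inverse of the estimate."""
    return h_hat / (h_hat * h_hat + noise_var / symbol_var)

# A weak (or poorly estimated) channel tap at low SNR.
weak_tap = 0.01
zf_weak = zf_gain(weak_tap)                     # large gain amplifies noise
mmse_weak = mmse_gain(weak_tap, noise_var=0.1)  # gain stays bounded

# With no noise, the MMSE gain reduces to the zero-forcing gain.
mmse_noiseless = mmse_gain(0.5, noise_var=0.0)
```

In the matrix case the same idea appears as adding a noise-dependent term to the Gram matrix before inversion.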

2606.17306 2026-06-17 eess.SP New submission

Robust Beamforming Design for Secure Uplink NOMA-ISAC

Azadeh Tabeshnezhad, Milad Tatar Mamaghani, A. Lee Swindlehurst, Tommy Svensson, Erik Ström

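AI summary: Addresses security in uplink NOMA-ISAC systems with an eavesdropper of uncertain location by proposing a robust beamforming scheme that jointly optimizes the users' sum rate and sensing performance, with an alternating optimization algorithm that converges quickly.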

English abstract

Integrated sensing and communication (ISAC) is an important technology for sixth-generation (6G) mobile networks, enabling the joint use of communication and radar sensing within a unified system. While offering significant benefits in terms of spectral efficiency, ISAC introduces new security challenges. In particular, the joint use of resources for sensing and communication can increase vulnerability to eavesdropping and information leakage. In this paper, we study an uplink non-orthogonal multiple access (NOMA) system where the base station (BS) simultaneously receives user data and senses a potential eavesdropper (Eve) with uncertain location. To enhance the physical-layer security, a robust sensing signal is designed to both sense and jam Eve. We formulate a joint optimization problem that aims to maximize the users' sum rate and the BS sensing performance while maintaining security against Eve. Since the resulting optimization problem is non-convex, we develop an iterative alternating optimization (AO) algorithm that decomposes it into two tractable subproblems. In the first subproblem, the receive combining vectors are optimized in closed form using generalized eigenvalue decomposition. In the second subproblem, the transmit beamforming matrices and sensing power are jointly optimized via semidefinite relaxation (SDR) and successive convex approximation (SCA). Simulation results demonstrate the effectiveness of our solution in terms of fast convergence and resource allocation.

2606.17263 2026-06-17 eess.AS New submission

Direction of arrival estimation from distant microphone data using single frequency filtering

Sushmita Thakallapalli, Sudarsana Reddy Kadiri, Nilesh Madhu, Suryakanth V Gangashetty

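AI summary: To address the susceptibility of narrowband DoA estimation to spatial aliasing, proposes cross-correlating speech-present time-frequency regions obtained by single frequency filtering; the method outperforms the existing narrowband method and some broadband methods under various reverberation and noise conditions.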

English abstract

With distant microphones, broadband (BB) methods for direction-of-arrival (DoA) estimation are more suitable than narrowband (NB) methods. Due to the aggregation of their optimization function across all frequency bands, BB estimators are robust to spatial aliasing, a known problem in processing distant microphone data. In NB methods, DoA estimation is performed by utilizing local information in each frequency band and hence the estimation is affected by spatial aliasing. However, unlike BB methods, NB methods exploit frequency sparsity to estimate the DoAs of multiple speakers in a single time frame. In this article, a method to improve the robustness of an NB DoA estimator to spatial aliasing is developed. The proposed method is based on cross-correlation of speech-present time-frequency regions obtained by single frequency filtering (SFF) of the microphone signals. The SFF spectrum is chosen because SFF components have regions of high signal-to-noise ratio both in time and frequency and because speech and non-speech discrimination is robust to degradations in the SFF domain. The proposed NB estimator is compared to four state-of-the-art estimators (one NB and three BB) using detection and accuracy metrics on simulated and real-world data in different reverberation and noise conditions. The results show that in all the environments, the SFF-based NB approach outperforms the state-of-the-art NB approach. Furthermore, the performance of the SFF-based approach is better than some of the BB estimators.

2606.17258 2026-06-17 eess.AS New submission

Single frequency filtering based multi-speaker direction of arrival estimation from stereo recordings

Sushmita Thakallapalli, Sudarsana Reddy Kadiri, Nilesh Madhu, Suryakanth V Gangashetty

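AI summary: Proposes an SFF-based DoA estimator that applies PHAT-weighted cross-correlation to SFF output envelopes, performing better than or comparably to the best GCC method under reverberant, multi-speaker, and noisy conditions.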

English abstract

Robust direction-of-arrival (DoA) estimation from noisy and reverberant microphone signals remains challenging. Conventional estimators such as generalized cross-correlation (GCC) and its variants operate in the short-time Fourier transform (STFT) domain, where spectral features primarily reflect vocal-tract characteristics. Recent single frequency filtering (SFF)-based estimators instead use a time-frequency representation that provides high spectral resolution of harmonics along with high temporal resolution of excitation-source events, such as epoch-like impulses. Since excitation-source features have been shown to be more robust to noise and reverberation than spectral features, this work proposes an improved SFF-based DoA estimator that correlates the envelopes of SFF outputs across microphone channels using PHAT-weighted GCC. We further provide a comprehensive evaluation of SFF-based and state-of-the-art GCC-based estimators using publicly available real-room recordings under challenging reverberant, multi-speaker, and noise-corrupted conditions. Experimental results show that the proposed method and an existing SFF-based estimator achieve detection and accuracy performance that is superior or comparable to the best GCC-based estimator across all test cases. We also demonstrate that using speech-dominant bins improves GCC-PHAT robustness, motivating future incorporation of such weighting strategies into SFF-based DoA estimation.
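
For readers unfamiliar with PHAT weighting, the following is a minimal sketch of plain GCC-PHAT delay estimation between two channels. It operates on raw signals with a naive DFT and circular delays, so it is not the SFF-envelope estimator proposed in the paper; the function names and the impulse test signals are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (adequate for short signals)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out

def gcc_phat_delay(x, y):
    """Circular lag, in samples, by which y is delayed relative to x."""
    n = len(x)
    cross = []
    for a, b in zip(dft(x), dft(y)):
        c = b * a.conjugate()
        mag = abs(c)
        # PHAT weighting: keep only the phase of the cross-spectrum.
        cross.append(c / mag if mag > 1e-12 else 0j)
    corr = [v.real for v in dft(cross, inverse=True)]
    peak = max(range(n), key=lambda i: corr[i])
    return peak if peak <= n // 2 else peak - n

# Two impulses: the second channel lags the first by two samples.
x = [0.0] * 16
y = [0.0] * 16
x[3] = 1.0
y[5] = 1.0
delay = gcc_phat_delay(x, y)
```

With a known microphone spacing and sampling rate, the estimated delay would then be mapped to an arrival angle.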

2606.17254 2026-06-17 eess.AS New submission

Synergizing Zero-Shot Cross-Lingual Alzheimer Detection with Language-Invariant Multimodal Bi-Geometric Adversarial Learning

Girish, Mohd Mujtaba Akhtar, Farhan Sheth, Muskaan Singh, Juliana Gerard, Paula McClean, Kongfatt Wong-Lin

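AI summary: Proposes the ORBIT framework, which achieves zero-shot cross-lingual Alzheimer's disease detection through cross-attentive fusion, multilingual adversaries, and spherical-hyperbolic geometric learning; multimodal fusion outperforms unimodal baselines.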

Comments Accepted to INTERSPEECH 2026

English abstract

In this work, we study zero-shot cross-lingual speech-based Alzheimer's disease detection (SADD). We hypothesize that learning language-invariant multimodal representations by fusing multilingual speech and text pretrained models is essential for reliable transfer to unseen languages, as the two modalities capture complementary acoustic and linguistic markers of cognitive impairment while adversarial learning suppresses language-specific confounds. Empirical results in zero-shot cross-lingual evaluation substantiate the hypothesis, showing that multimodal fusion consistently outperforms unimodal baselines. To this end, we propose ORBIT, a novel framework that combines cross-attentive fusion, multi-tap language adversaries, and complementary spherical-hyperbolic geometric learning with consensus clustering. Across settings, ORBIT achieves the strongest performance compared to unimodal models and simple concatenation-based fusion baselines.

2606.17202 2026-06-17 eess.SP New submission

Gauge Freedom Optimization for Truncation Error Reduction in Inertial Navigation

Yaakov Libero, Itzik Klein

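AI summary: Proposes the u-space method, which generalizes gauge freedom to systems with unknown forcing functions and optimizes the gauge in closed or empirical form, consistently reducing truncation error across a range of simulated and real-world inertial navigation data.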

English abstract

Numerical integration plays a central role in inertial navigation systems, where sensor measurements are propagated through time to obtain orientation, velocity, and position states. The accuracy of this propagation depends on the numerical integrator type, order, and step size. Prior work showed that for second-order systems with known forcing functions, the gauge freedom in the variation-of-parameters technique can be exploited to reduce truncation error without modifying the integrator. However, this approach requires analytical knowledge of the forcing function, limiting its applicability in real-world systems. To address this limitation, we propose the u-space methodology, a novel state mapping that generalizes the gauge freedom to systems with unknown forcing functions. The optimal gauge is derived in closed form for second-order systems and in both closed and empirical form for first-order systems. The proposed approach was evaluated through Monte Carlo simulations across four forcing functions, five sensor grades, and four Adams-Bashforth orders, as well as on a real-world inertial navigation dataset. Results show consistent error reduction across all tested conditions, with the largest gains observed in the full inertial mechanization pipeline, making the approach applicable to high-grade inertial systems, where truncation error constitutes a larger share of the error budget, and to aided low-cost systems with high-rate updates, where propagation spans only short inter-update intervals.
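
As background for the Adams-Bashforth integrators mentioned above, here is a minimal sketch of the two-step variant applied to sampled rates, as one would integrate sensor samples into a state. It shows only the baseline integrator and its truncation error, not the u-space gauge optimization; the function name, the Euler startup step, and the sample values are assumptions for illustration.

```python
def adams_bashforth2(samples, step, initial=0.0):
    """Integrate sampled rates with the two-step Adams-Bashforth rule.

    states[k] approximates the integral of the rate up to sample k.
    The first step uses forward Euler because AB2 needs two past samples.
    """
    states = [initial]
    if len(samples) > 1:
        states.append(initial + step * samples[0])
    for k in range(1, len(samples) - 1):
        increment = step * (1.5 * samples[k] - 0.5 * samples[k - 1])
        states.append(states[-1] + increment)
    return states

# Rate f(t) = 2t sampled every 0.1 s; the exact integral is t^2.
step = 0.1
samples = [2.0 * step * k for k in range(5)]
states = adams_bashforth2(samples, step)
# The exact value at t = 0.4 is 0.16; the result here is 0.15, and the
# 0.01 gap comes entirely from the Euler startup step.
```

For a linear rate the AB2 increments are exact, so the remaining error isolates the startup step, which is one simple way to see where truncation error enters.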

2606.17198 2026-06-17 eess.SP New submission

Precoding Sequence Design for MIMO Sensing with Scatterers Based on Prior Information

Yiming Liu, Wei Yu

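AI summary: To handle the signal-dependent interference caused by scatterers, proposes a precoding sequence design based on minimizing the Bayesian Cramér-Rao lower bound and develops an iterative algorithm under constant-norm constraints.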

Comments The paper has been submitted for possible publication

English abstract

The presence of interfering scatterers fundamentally changes the design principle for MIMO sensing. Unlike the target-only case, where MIMO sensing sequence design reduces to optimizing the transmit sample covariance, this paper shows that scatterer-induced signal-dependent interference makes the Bayesian Fisher information depend on the full temporal precoding sequence. Consequently, for the MIMO sensing problem with scatterers using the Bayesian Cramér-Rao lower bound (BCRLB) as the objective, the entire sensing sequence must be designed explicitly, instead of just the precoding matrix. This paper considers such a precoding sequence design problem under hardware constraints, where MIMO sensing is used to estimate the azimuth angles of multiple targets based on prior information about both the targets and the scatterers. We formulate a worst-case BCRLB minimization across multiple target angles, yielding a max-min fractional program under constant-modulus or constant-norm hardware constraints. We further develop a constant-norm linear transform that converts the ratio objectives into linear forms, leading to an iterative algorithm with closed-form precoder updates. The framework extends to joint precoder-combiner design and multi-stage sensing with adaptive prior refinement. Numerical results demonstrate the effectiveness and the efficiency of the proposed algorithm, revealing sweeping-like beampatterns that illuminate target angular regions while suppressing interference from the scatterers.

2606.18197 2026-06-17 stat.AP stat.ME New submission

A Sensitivity Framework for Identifying Contagion under Latent Homophily for Fixed-in-Time Network Analyses, with an Application to U.S. House Congressional Voting

Duncan A. Clark

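AI summary: To address the difficulty of separating contagion from homophily in fixed-in-time network data, proposes a selection-bias-based sensitivity analysis framework whose nonparametric bounds recast contagion identification as a question about the strength of latent homophily, and applies it to the 2008 U.S. House TARP votes.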

English abstract

Whether connected units are similar because influence spreads across ties or because similar units form ties is a long-standing problem. Contagion or influence is generically unidentified from observational network data. We consider the minimal and common setting of a single network, fixed over time, with two waves of a binary nodal outcome. Rather than positing a parametric model for network formation, we reframe identification of contagion as a selection-bias problem and develop a sensitivity framework. We define a controlled direct effect (CDE) holding a tie present while intervening on an alter's outcome. We show that the gap between the CDE and the observed connected-dyad risk ratio is governed by how strongly a latent homophily variable shifts the composition of connected dyads. Inspired by Smith-style selection-bias sensitivity analysis and the risk-ratio bounding function of Ding and VanderWeele, we develop interpretable nonparametric bounds. This translates the question "is there contagion?" into the question "how strong would latent homophily have to be to explain away the observed contagion?" A simulation study characterizes the bounds' error control and power. We apply the framework to the 2008 U.S. House votes on the Troubled Asset Relief Program, identifying under which assumptions contagion is plausible.
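
The "how strong would it have to be" logic can be illustrated with the classic Ding and VanderWeele bounding factor for an unmeasured variable and the associated E-value. This sketch shows that original risk-ratio bound only; the paper adapts the idea to selection by latent homophily, and its own bounds may take a different form. The function and parameter names are hypothetical.

```python
import math

def bounding_factor(rr_eu, rr_ud):
    """Maximum factor by which an unmeasured variable can bias a risk ratio.

    rr_eu: strength of association between exposure and the latent variable.
    rr_ud: strength of association between the latent variable and outcome.
    Both are risk ratios of at least 1.
    """
    return rr_eu * rr_ud / (rr_eu + rr_ud - 1.0)

def e_value(rr_observed):
    """Smallest common strength (rr_eu = rr_ud) that could fully explain
    away an observed risk ratio greater than 1."""
    return rr_observed + math.sqrt(rr_observed * (rr_observed - 1.0))

# For an observed risk ratio of 2, both associations would need to be
# at least this strong to reduce the ratio to 1.
threshold = e_value(2.0)
explained = bounding_factor(threshold, threshold)
```

Plugging the E-value back into the bounding factor returns the observed risk ratio, which is what makes the threshold interpretable.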

2606.18146 2026-06-17 stat.ME New submission

Spatial Disease Mapping and Disparity Detection Using Generative AI: An Amortized Bayesian Learning Framework

Luca Aiello, Sudipto Banerjee

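AI summary: Proposes an amortized Bayesian framework that approximates the posterior distribution with neural networks, enabling spatial boundary detection across different areal graphs, and validates it on respiratory disease and lung cancer data.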

English abstract

We introduce an amortized Bayesian framework for spatial boundary detection that generalizes posterior inference across areal graphs with varying numbers of regions and diverse adjacency structures. The underlying model couples a Poisson count likelihood with a covariate-driven rule to interrupt smoothing across dissimilar neighboring areas, utilizing a directed acyclic graph autoregressive (DAGAR) prior to capture residual spatial dependence. To approximate the target posterior distribution, a neural engine is trained on simulated maps: a permutation-invariant summary network encodes graph-aware representations of the observed counts, offsets, covariates, and adjacency matrices, while a conditional normalizing flow generates the approximate posterior draws. Simulation studies demonstrate accurate parameter recovery, near-nominal interval coverage, well-calibrated posterior predictive behavior, and informative posterior boundary probabilities. Benchmarking against Markov chain Monte Carlo (MCMC) confirms close agreement regarding primary boundary evidence, and an ablation study validates the inclusion of model-guided graph summaries. Finally, applications to Glasgow respiratory disease and California lung cancer data demonstrate that a single trained neural engine can be seamlessly deployed across real-world maps with distinct graph structures, yielding boundary conclusions consistent with established localized smoothing analyses.

2606.18139 2026-06-17 stat.ME New submission

Bayesian Threshold-Aligned Joint Disease Progression Modeling for Alzheimer's Disease

Rong Wu, Duygu Tosun, Isabella Hausle, Margo Heston, Aaron Wolfe Scheffler

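AI summary: Proposes the Bayesian Threshold-Aligned Joint Disease Progression Model (B-TAJ DPM), a semi-parametric framework that jointly models biomarker trajectories and the cognitive-impairment survival endpoint, anchored at positivity thresholds, to reveal heterogeneous progression patterns.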

English abstract

Alzheimer's disease is characterized by the progressive accumulation of amyloid-$\beta$ and tau, followed years later by cognitive impairment. Despite this established motif, substantial subject-level variability exists in the age of pathological progression and the onset of cognitive symptoms. To understand the source of this variation, subjects must be aligned across heterogeneous disease timelines via frameworks that jointly model disease progression and time to cognitive impairment with reference to landmark positivity thresholds. Existing neurodegenerative disease progression models rely on restrictive parametric forms, fail to anchor disease timelines to positivity thresholds, and decouple biomarker trajectories from cognitive survival endpoints. To address these limitations, we introduce the Bayesian Threshold-Aligned Joint Disease Progression Model (B-TAJ DPM). This generative, semi-parametric framework models multivariate disease progression trajectories over latent disease timelines anchored at landmark positivity thresholds. Crucially, the framework integrates a survival model to link pathological progression to cognitive impairment. Posterior inference and posterior predictions for unseen subjects are carried out in open-source software. Simulation studies demonstrate excellent estimation accuracy and interval coverage. When applied to Alzheimer's Disease Neuroimaging Initiative data, B-TAJ DPM characterizes non-linear progression patterns, quantifies subject-level variation in positivity age, and reveals links between age of tau positivity and acceleration of cognitive impairment.

2606.18113 2026-06-17 stat.ME New submission

Undocumented Behavior in the gsynth R package and its Consequences for Three Published Studies

Beniamino Green, P. M. Aronow

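AI summary: Finds that, under a specific combination of options, an implementation error in the gsynth package causes severe underestimation of standard errors, raising false positive rates and affecting the conclusions of three APSR papers.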

English abstract

Prior to the version 1.3.1 update on CRAN in December 2025, gsynth, a popular R package for estimating Interactive Fixed Effects (IFE) models, could drastically and systematically underestimate standard errors. This underestimation would occur when two estimation options (inference = "parametric", and EM = TRUE) were used together, in which case the package would apply a parametric bootstrap procedure to Gobillon and Magnac (2016)'s IFE-EM estimator. The package ceased supporting this combination in December 2025, and the latest documentation now describes the parametric bootstrap as not suitable for use with the IFE-EM estimator due to a theoretical incompatibility. Our focus is an implementation error we identified in the pre-1.3.1 versions of gsynth: the parametric bootstrap used when EM = TRUE did not match the algorithm proposed in Xu (2017), using in-sample residuals instead of out-of-sample errors. We show that this implementation error alone can cause underestimation by orders of magnitude. We conduct an empirical Monte Carlo study using randomly assigned placebo treatments on a series of state-level panel datasets, and show that gsynth could yield high false positive rates in realistic settings. We identify three papers published in the American Political Science Review that are affected by this behavior. Reanalyzing the relevant sections of these papers, we show that (i) correcting the implementation error renders most findings insignificant, and (ii) using Xu (2017)'s Generalized Synthetic Control method in place of IFE-EM renders every finding insignificant.

2606.18078 2026-06-17 stat.ME New submission

Spatial prediction of environmental processes using random forests: How best to account for spatial dependence?

Duncan Lee, Vinny Davies, Helen R. Savage, Hussein Twabi, Marriott Nliwasa, Peter MacPherson

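AI summary: Compares several approaches for incorporating spatial dependence into random forests and, through simulation and air pollution data experiments, finds that the spatial basis function approach performs consistently well.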

English abstract

Geostatistical spatial prediction for environmental processes is typically undertaken using Gaussian process models via Kriging, while machine learning (ML) algorithms are state-of-the-art for non-spatial prediction. An exciting recent fusion of these ideas imbues traditional ML algorithms with the capacity to deal with spatial autocorrelation, leading to improved predictive performance. A range of approaches have been proposed, including fusion with Gaussian processes, observation-driven correlation structures, spatial basis functions and local geographical fitting. However, there has been no numerical comparison of their relative predictive performances, which is needed to advise environmental scientists on the optimal approach to use. This paper fills this knowledge gap, and focuses on random forests as the ML algorithm because they are more computationally and conceptually straightforward to implement than deep learning algorithms. The results from two studies are presented, the first being a controlled simulation experiment investigating whether any single approach is consistently superior across different spatial autocorrelation types. The second study focuses on the prediction of air pollution concentrations within a tuberculosis prevalence study in Blantyre, Malawi. The results show that whilst no single approach is universally superior, utilising spatial basis functions appears to perform consistently well across both the simulation and real data studies.
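
One way to picture the spatial basis function approach is to expand each location into a vector of basis values and pass those to the random forest as extra covariates. The sketch below assumes Gaussian radial basis functions on a small grid of knots; the abstract does not say which basis the paper uses, and the function name, knot layout, and bandwidth are hypothetical.

```python
import math

def rbf_features(coord, knots, bandwidth):
    """Gaussian radial basis values of one location with respect to each knot."""
    x, y = coord
    return [math.exp(-((x - kx) ** 2 + (y - ky) ** 2) / (2.0 * bandwidth ** 2))
            for kx, ky in knots]

# Four knots on the corners of a unit square (illustrative layout only).
knots = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
features = rbf_features((0.0, 0.0), knots, bandwidth=0.5)
# Each feature is 1 at its own knot and decays with distance, so nearby
# locations receive similar feature vectors.
```

The resulting columns would be appended to the non-spatial covariates before fitting, leaving the random forest algorithm itself unchanged.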

2606.18044 2026-06-17 stat.AP New submission

Model-based clustering of compositional trajectories for the analysis of mobility data

Andrea Panarotto, Manuela Cattelan, Ruggero Bellio

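AI summary: Proposes a state-space-model-based clustering method for compositional time series that represents mobility trajectories from telephonic data as proportions of road types in order to identify urban mobility patterns.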

Comments 36 pages (26 for the main text, 10 in the supplementary), 13 figures (6 in the main text, 7 in the supplementary)

English abstract

Understanding urban mobility patterns is crucial for designing efficient and sustainable transportation systems. Motivated by an application to the municipality of Padova and its surroundings, we propose a novel statistical framework for the analysis and clustering of mobility trajectories derived from telephonic data. We introduce a compositional representation of individual movements that integrates the uncertain device location with information on the surrounding road network, encoding at each time point the proportions of different road types compatible with the observed position. This formulation naturally accounts for measurement uncertainty and yields trajectories evolving in the simplex. To model these data, we develop a state-space framework for compositional time series that captures both the telephonic measurement error and the temporal dynamics of the latent mobility process. Building on this representation, we propose a model-based clustering approach based on mixtures of state-space models to identify groups of trajectories with similar evolution. This allows us to aggregate individual movements into interpretable mobility patterns at the population level. The results of the case study demonstrate the ability of the approach to uncover meaningful mobility behaviors, providing insights that are potentially relevant to policy makers.

2606.17939 2026-06-17 stat.AP stat.ML 新提交

Understanding Long-Term Dynamics of Individual Metro Usage: A Hidden Semi-Markov State Framework with Survival Analysis

理解个体地铁使用的长期动态:基于生存分析的隐半马尔可夫状态框架

Bingxun Wang, Valeria Maria Urbano, Shan He, Yang Chen, Wei Liu, Zhibin Jiang, Piercesare Secchi

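AI summary Proposes a framework that combines hidden semi-Markov models with discrete-time survival analysis; using four years of Shanghai metro smart-card data, it identifies five interpretable mobility states and their transition hierarchy, and shows that exit hazard is state-dependent but duration-independent, whereas re-entry hazard decays sharply with inactivity length.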

详情
AI中文摘要

理解个体地铁使用在多年时间尺度上的演化对于交通规划和乘客留存至关重要。然而,现有方法通常将移动模式表征为静态聚类或短期变化,忽略了交通参与的生命周期动态。本研究提出一个基于状态的生命周期建模框架,将隐半马尔可夫模型(HSMM)与离散时间生存分析相结合,以刻画个体地铁移动性的演化。HSMM推断具有显式持续时间分布的潜在移动状态以及控制状态变迁的转移矩阵,而生存组件通过依赖于移动状态轨迹和行为历史的状态相关风险函数,对退出和重新进入事件进行建模。将该框架应用于上海地铁系统四年(2021-2024)的智能卡数据,能够识别可解释的移动状态,刻画转移动态,并量化状态依赖的退出和重新进入过程。分析揭示了五种稳健的移动状态,具有以偶尔使用网关状态为中心的方向性转移层次,以及控制脱离和回归的根本不同的时间机制:退出风险与状态相关但与持续时间无关,而重新进入风险随不活跃时长急剧衰减。这些发现为面向生命周期的移动性分析提供了方法论基础,并为交通运营商识别风险用户和安排留存干预提供了实践指导。

英文摘要

Understanding how individual metro usage evolves over multi-year horizons is essential for transit planning and passenger retention. However, existing approaches typically characterize mobility patterns as static clusters or short-term variability, leaving the lifecycle dynamics of transit participation underexplored. This study proposes a state-based lifecycle modeling framework that integrates Hidden Semi-Markov Models (HSMM) with discrete-time survival analysis to characterize the evolution of individual metro mobility. The HSMM infers latent mobility states with explicit duration distributions and a transition matrix governing regime changes, while the survival component models exit and re-entry events via state-dependent hazard functions conditioned on mobility-state trajectories and behavioral history. Applied to four years of smart card data from the Shanghai metro system (2021-2024), the framework enables the identification of interpretable mobility states, the characterization of transition dynamics, and the quantification of state-dependent exit and re-entry processes. The analysis reveals five robust mobility states with a directional transition hierarchy centered on an occasional-usage gateway state, and fundamentally different temporal mechanisms governing disengagement and return: exit hazard is state-dependent but duration-independent, whereas re-entry hazard decays sharply with inactivity length. These findings provide a methodological foundation for lifecycle-oriented mobility analysis and practical guidance for transit operators to identify at-risk users and time retention interventions.
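
The contrast between a duration-independent exit hazard and a decaying re-entry hazard can be made concrete with the standard discrete-time identity S(t) = prod_{k<=t} (1 - h_k). The sketch below is illustrative only: the hazard values and the geometric decay rate are hypothetical numbers, not estimates from the paper.

```python
def survival_curve(hazards):
    """Return S(t) = prod_{k<=t} (1 - h_k) for discrete-time hazards h_1, h_2, ..."""
    surv, curve = 1.0, []
    for h in hazards:
        surv *= 1.0 - h
        curve.append(surv)
    return curve


horizon = 12  # number of discrete periods (hypothetical)

# Exit: the same hazard in every period, regardless of time spent in the state.
constant_hazard = [0.05] * horizon
# Re-entry: hazard shrinks geometrically as inactivity lengthens.
decaying_hazard = [0.30 * 0.7 ** k for k in range(horizon)]

stay_active = survival_curve(constant_hazard)      # P(not yet exited by period t)
stay_inactive = survival_curve(decaying_hazard)    # P(not yet re-entered by period t)
```

With a constant hazard the probability of remaining active keeps falling geometrically, while with a decaying hazard the probability of remaining inactive flattens out, which is why the timing of retention interventions matters more for re-entry.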

2606.17923 2026-06-17 stat.ME 新提交

Spatial mixed models for assessing environmental exposure effects on the microbiome

评估环境暴露对微生物组影响的空间混合模型

Sooran Kim, Chan Wang, Soyoung Kwak, Fares Darawshy, Alexander Bain, Leopoldo N. Segal, Jiyoung Ahn, Huilin Li

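AI summary Proposes a spatial mixed-model framework that uses conditional autoregressive priors to handle region-level spatial dependence and taxon-level ecological dependence simultaneously, achieving high detection power and low false positive rates in feature selection; applied to PM2.5 exposure studies, it identifies associated genera.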

详情
AI中文摘要

环境暴露(如空气污染)对人类健康的影响日益受到重视。越来越多的证据表明,微生物组可能介导这些效应,从而解释环境与宿主生物学之间的关系。然而,环境暴露对微生物组的影响尚未完全明确,且该背景下的统计建模面临复杂依赖结构的挑战。具体而言,微生物组数据在采样区域间表现出空间依赖性,以及微生物分类群间的生态相关性,若忽略这些依赖,会显著降低检测能力,导致遗漏真实信号。我们提出了一种新颖的微生物组数据空间混合建模框架,该框架利用条件自回归先验同时考虑区域级空间依赖和分类群级生态依赖。通过模拟,我们证明该框架优于忽略此类依赖的现有方法,在特征选择中实现高检测功率,同时保持低假阳性率并降低估计均方误差。应用于两项真实研究——食品与微生物组纵向调查研究数据和肺微生物组数据集,其中涉及细颗粒物(PM2.5)暴露,我们的模型识别出已知与污染相关健康结果有关的菌属,以及可能介导宿主对空气污染反应的新分类群。这一新颖方法为揭示复杂环境数据中具有生物学意义的关联提供了强大而灵活的工具。

英文摘要

The influence of environmental exposures, such as air pollution, on human health has become increasingly recognized. A growing body of evidence suggests that the microbiome may mediate these effects, explaining the relationship between the environment and host biology. However, the impact of environmental exposures on the microbiome is not yet fully understood, and statistical modeling in this context is challenged by complex dependency structures. In particular, microbiome data exhibit spatial dependencies across sampling regions as well as ecological correlations among microbial taxa, which, if ignored, can substantially reduce detection power, leading to missed true signals. We introduce a novel spatial mixed modeling framework for microbiome data that accounts for both region-level spatial dependency and taxon-level ecological dependency using conditional autoregressive priors. Through simulations, we demonstrate that this framework outperforms existing methods that ignore such dependencies, achieving high detection power in feature selection while maintaining low false positive rates and reduced mean squared error in estimation. Applied to two real studies with fine particulate matter (PM$_{2.5}$) exposures, namely data from the Food and Microbiome Longitudinal Investigation study and a lung microbiome dataset, our model identified genera known to be involved in pollution-related health outcomes, as well as novel taxa that may mediate host responses to air pollution. This novel approach offers a powerful and flexible tool for uncovering biologically meaningful associations in complex environmental data.

2606.17841 2026-06-17 stat.ME 新提交

Subgroup analysis in randomized controlled trials with binary outcomes: dilution and logic-respecting properties

二元结局随机对照试验中的亚组分析:稀释与逻辑一致性性质

Long-Hao Xu, Yang Han, Tim Friede

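AI summary Studies the properties of the odds ratio and the relative response in subgroup analyses of randomized controlled trials with binary outcomes, showing that the odds ratio is inappropriate as an efficacy measure whereas the relative response is appropriate, and clarifying how the two differ in their logic-respecting and dilution properties.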

详情
AI中文摘要

亚组分析在随机对照试验中常规用于检验治疗效果在患者亚组间是否同质或由于治疗效应异质性而不同。本文研究了二元结局亚组分析中比值比和相对响应的性质,通过新的理论见解和方法学发展扩展了先前的工作。我们建立了几个新定理,描述了当两个亚组合并时,总体人群的比值比在大小和方向上如何变化。这些结果进一步证实了比值比不适合作为该亚组设置中的疗效指标,而相对响应是合适的。我们还提出了比值比和相对响应之间的正式关系,并阐明了它们在逻辑一致性性质(即总体疗效是否介于亚组疗效之间)和稀释性质(即混合亚组是否使总体比值比向1移动)方面的差异。尽管比值比通常不具有逻辑一致性,但在某些条件下它可能近似表现为具有逻辑一致性的疗效指标。为了说明我们的发现,我们基于临床试验数据给出了一个说明性示例,并讨论了其对随机对照试验中亚组分析的意义。

英文摘要

Subgroup analysis is routinely used in randomized controlled trials to examine whether treatment effects are homogeneous across patient subgroups or differ because of treatment-effect heterogeneity. In this paper, we investigate the properties of the odds ratio and the relative response in subgroup analyses with binary outcomes, extending previous work with new theoretical insights and methodological developments. We establish several new theorems that characterize how the odds ratio for the overall population changes in both magnitude and direction when two subgroups are combined. These results further confirm that the odds ratio is inappropriate as an efficacy measure in this subgroup setting, whereas the relative response is appropriate. We also present the formal relationship between the odds ratio and the relative response, and clarify their differences in terms of the logic-respecting property, that is, whether the overall efficacy lies between the subgroup efficacies, and the dilution property, that is, whether mixing subgroups moves the overall odds ratio toward 1. Although the odds ratio is generally not logic-respecting, it may behave approximately like a logic-respecting efficacy measure under certain conditions. To illustrate our findings, we present an illustrative example based on clinical trial data and discuss its implications for subgroup analysis in randomized controlled trials.
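
A small numerical sketch can make the logic-respecting and dilution properties tangible. The response probabilities and the equal subgroup shares below are hypothetical; the shares are taken to be identical in both arms, as randomization would give in expectation. "Relative response" is computed here as the ratio of response probabilities, which is an assumption about the paper's definition.

```python
def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


def odds_ratio(p_treat, p_ctrl):
    return odds(p_treat) / odds(p_ctrl)


def relative_response(p_treat, p_ctrl):
    return p_treat / p_ctrl


# ((response prob. under treatment, response prob. under control), subgroup share)
subgroups = [((0.9, 0.5), 0.5), ((0.5, 0.1), 0.5)]

# Overall response probabilities are share-weighted averages within each arm.
p_treat_all = sum(w * pt for (pt, _), w in subgroups)
p_ctrl_all = sum(w * pc for (_, pc), w in subgroups)

sub_or = [odds_ratio(pt, pc) for (pt, pc), _ in subgroups]
sub_rr = [relative_response(pt, pc) for (pt, pc), _ in subgroups]
overall_or = odds_ratio(p_treat_all, p_ctrl_all)
overall_rr = relative_response(p_treat_all, p_ctrl_all)
```

Both subgroups have an odds ratio of about 9, yet the pooled odds ratio is about 5.4: it falls outside the subgroup range and moves toward 1. The pooled relative response, about 2.33, lies between the subgroup values of 1.8 and 5.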

2606.17723 2026-06-17 stat.AP 新提交

Tail Dependence in EU Carbon Markets: Graphical Models of Extremes for EUA Futures

欧盟碳市场中的尾部依赖:EUA期货的极值图模型

Jan Maciejowski, Manuele Leonelli

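AI summary Applies Hüsler-Reiss graphical models of extremes to 20 daily variables over Phases 3 and 4 of the EU ETS, finding that the tail networks are denser than the average-dependence network and organized around different central nodes, with EUA futures most central in the tail networks while equity indices and FX pairs show the opposite pattern.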

详情
AI中文摘要

理解极端价格波动如何在金融和能源市场间传播,对于欧盟排放交易体系(EU ETS)的风险管理和监管设计至关重要。我们将Hüsler-Reiss极值图模型应用于一个包含20个日度变量的系统,这些变量围绕EU ETS第三和第四阶段(2013-2025年)的EUA期货,并以高斯图模型作为平均依赖基线。尾部网络在结构上与平均依赖网络截然不同:密度显著更高,围绕不同的中心节点组织,并受部门内同质性支配,这种同质性比平均依赖水平更紧密地约束了部门边界。EUA期货在标准图模型中处于边缘位置,但在尾部网络中达到最高中心性,而股指和主要外汇对则呈现相反趋势。指数随机图模型确认了所有样本期内尾部网络中股票和外汇的边缘性,并识别出市场下行期间的三角闭合是第三阶段的现象,在第四阶段消失。阶段转变重构了尾部网络而未使其稀疏化:平均依赖急剧收缩,而尾部依赖持续存在,崩溃传染从聚集传播转变为扩散传播。这些发现对合规实体的对冲构建、监管机构的压力测试校准以及EU ETS市场系统性风险监测工具的设计具有直接意义。

英文摘要

Understanding how extreme price movements propagate across financial and energy markets is critical for risk management and regulatory design in the EU Emissions Trading System (EU ETS). We apply Hüsler-Reiss graphical models of extremes to a system of 20 daily variables centred on EU allowances futures across Phases 3 and 4 of the EU ETS (2013-2025), with a Gaussian graphical model as the average-dependence baseline. The tail networks are structurally distinct from the average dependence network: substantially denser, organized around different central nodes, and governed by within-sector homophily that binds sector boundaries more tightly than at the average-dependence level. EU allowances futures are peripheral in the standard graphical model but achieve the highest centrality in the tail networks, while equity indices and major FX pairs follow the opposite trajectory. Exponential random graph models confirm equity and FX peripherality in tail networks across all sample periods and identify triadic closure during market downturns as a Phase 3 phenomenon that vanishes in Phase 4. The phase transition restructures the tail network without thinning it: average dependence contracts sharply while tail dependence persists, and crash contagion shifts from clustered to diffuse propagation. These findings have direct implications for hedge construction by compliance entities, stress-test calibration by regulators, and the design of systemic-risk monitoring tools for EU ETS markets.
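
The difference between average dependence and tail dependence can be illustrated with the empirical tail-dependence coefficient chi(u) = P(F_Y(Y) > u | F_X(X) > u). This is a simple bivariate diagnostic, not the Hüsler-Reiss graphical model used in the paper; the function names and the simulated series are assumptions made for illustration. For downside co-movement one would apply it to negated returns.

```python
import random


def ranks_to_uniform(values):
    """Map observations to (0, 1) pseudo-observations via ranks (ties not handled)."""
    n = len(values)
    order = sorted(range(n), key=values.__getitem__)
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = rank / (n + 1)
    return u


def empirical_chi(x, y, u=0.95):
    """Estimate chi(u): the share of upper-tail days of x that are also upper-tail days of y."""
    ux, uy = ranks_to_uniform(x), ranks_to_uniform(y)
    exceed_x = sum(1 for a in ux if a > u)
    joint = sum(1 for a, b in zip(ux, uy) if a > u and b > u)
    return joint / exceed_x if exceed_x else float("nan")


# Hypothetical pair of series driven by a common shock plus idiosyncratic noise.
rng = random.Random(0)
shock = [rng.gauss(0, 1) for _ in range(2000)]
x = [s + 0.5 * rng.gauss(0, 1) for s in shock]
y = [s + 0.5 * rng.gauss(0, 1) for s in shock]
chi_95 = empirical_chi(x, y, u=0.95)
```

Values of chi(u) that stay well above zero as u approaches 1 indicate that extremes tend to occur together, which is the kind of dependence a tail network is meant to encode.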

2606.17717 2026-06-17 stat.ME stat.AP 新提交

Double zero-inflated spatio-temporal modeling of daily precipitation under detection thresholds

检测阈值下日降水量的双零膨胀时空建模

Juan Marcen-Gutierrez, Jorge Castillo-Mateo, Alan E. Gelfand, Jesús Asín, Ana C. Cebrián

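AI summary To handle the two types of zeros in daily precipitation (dry events and unmeasured precipitation below the detection limit), proposes a multilevel spatio-temporal model that combines probit regression, Gamma regression, and a threshold-censored observation mechanism, uses Gaussian processes to capture spatial dependence, and delivers exact inference within a Bayesian framework.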

Comments 38 pages (+33 pages supplement), 7 figures (+35 figures supplement), 5 tables

详情
AI中文摘要

解释日尺度降水行为对于精细理解降水驱动机制至关重要。然而,由于零值的频繁出现,这一工作具有挑战性。两种类型的零值——作为干旱事件的无降水和由于检测限导致的未测量降水——的公认存在加剧了这一挑战。在这项工作中,我们提出了一个多层时空模型,该模型允许我们区分和解释两种类型的零值,并对高于检测限的正降水进行建模。该方法结合了通过Probit回归建模概率的零处点质量、潜在正降水量的Gamma回归以及受阈值截断影响的观测机制。为了捕捉空间依赖性,在每个回归模型中采用了高斯过程。在贝叶斯框架下工作,我们可以获得具有精确不确定性的丰富推断范围。特别是,我们提供了基于模型的推断工具,以比较和量化真实降水过程与其观测对应物在相关特征上的差异。我们将模型应用于西班牙东北部埃布罗河流域70个站点15年间的春季日观测数据分析。我们的发现表明,阈值强烈影响观测降水的发生,特别是在湿润地区。虽然其对总累积量的影响较小,但它可能对上分位数产生显著影响。

英文摘要

Explaining precipitation behavior at the daily scale is important for a fine-scale understanding of the mechanisms driving precipitation. However, this effort is challenging because of the frequent incidence of zeros. The challenge is amplified by the acknowledged incidence of two types of zeros: absence of precipitation as a dry event and absence of measured precipitation due to detection limits. In this work, we propose a multilevel spatio-temporal model which allows us to distinguish and explain the two types of zeros, as well as to model positive precipitation above the detection limit. The methodology combines a point mass at zero with probability modeled through a probit regression, a Gamma regression for latent positive precipitation amounts, and an observation mechanism subject to threshold-induced censoring. To capture spatial dependencies, Gaussian processes are employed in each regression model. Working within a Bayesian framework, we can obtain a rich range of inference with exact uncertainty. In particular, we provide model-based inference tools to compare and quantify differences between the true precipitation process and its observed counterpart across relevant characteristics. We apply our model to the analysis of daily spring observations at 70 sites over 15 years from the Ebro River Basin in northeastern Spain. Our findings indicate that the threshold strongly affects the occurrence of observed precipitation, especially in humid regions. While its impact on total accumulated amounts is small, it can exert a relevant effect on upper quantiles.
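
The two kinds of zeros can be seen in a stripped-down simulation of the data-generating mechanism. The sketch below assumes a constant wet-day probability and fixed Gamma parameters, whereas the paper models these through probit and Gamma regressions with Gaussian processes; all numerical values and the function name `simulate_day` are hypothetical.

```python
import random


def simulate_day(rng, p_wet, shape, scale, threshold):
    """Return (true_amount, observed_amount) for one site-day."""
    if rng.random() >= p_wet:
        return 0.0, 0.0  # first type of zero: a genuinely dry day
    amount = rng.gammavariate(shape, scale)  # latent positive precipitation
    # Second type of zero: positive precipitation below the detection limit.
    observed = amount if amount >= threshold else 0.0
    return amount, observed


rng = random.Random(42)
days = [
    simulate_day(rng, p_wet=0.3, shape=0.8, scale=5.0, threshold=0.2)
    for _ in range(10_000)
]

dry_zeros = sum(1 for true, _ in days if true == 0.0)
censored_zeros = sum(1 for true, obs in days if true > 0.0 and obs == 0.0)
observed_zeros = sum(1 for _, obs in days if obs == 0.0)
```

In the observed record only `observed_zeros` is visible; separating it into `dry_zeros` and `censored_zeros` is what the latent-variable structure of the model is designed to do.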

2606.17515 2026-06-17 stat.ME stat.ML 新提交

Anytime-valid Optimal Policy Identification

任意有效的最优策略识别

Daniel Molitor

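AI summary For logged contextual bandit data, proposes an anytime-valid framework that constructs a time-indexed set containing the true optimal policy set with high probability, supporting continuous monitoring and adaptive stopping, and derives a sample-complexity bound.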

Comments 15 pages, 3 figures

详情
AI中文摘要

我们开发了一个用于从日志化情境赌博数据中识别最优策略的任意有效框架。在许多应用场景中,分析者希望从候选策略类 $\Pi$ 中选择最优策略,但数据由外部确定的日志策略生成,分析者无法控制。分析者也可能希望连续监测证据,一旦最优策略明确就停止,而不是事先承诺固定样本量。本文通过构建一个时间索引集 $S_t$ 来解决这些挑战,该集合以高概率随时间一致地保留真实最优策略集。由此产生的程序允许分析者监测策略值、消除明显次优策略,并在数据依赖的时间停止而不使推断失效。当最优策略唯一时,我们定义了其识别的停止时间,并推导出样本复杂度界为 $O\!\left(\frac{\log |\Pi|+\log\log(1/\Delta_{\min})}{\Delta_{\min}^2}\right)$,其中 $\Delta_{\min}$ 是最优与次优策略值之间的差距。模拟表明,相对于固定样本量设计,任意有效方法可以节省大量样本。应用于一个减少在线错误信息的大型自适应实验,说明了该方法如何在最优策略证据积累时提供动态视图。

英文摘要

We develop an anytime-valid framework for optimal policy identification from logged contextual bandit data. In many applied settings, the analyst wants to select the optimal policy from a candidate policy class $\Pi$, but data are generated by an externally determined logging policy that they do not control. The analyst may also wish to monitor evidence continuously, stopping once the optimal policy is clear rather than committing to a fixed sample size in advance. This paper addresses these challenges by constructing a time-indexed set $S_t$ that retains the true optimal policy set uniformly over time with high probability. The resulting procedure allows the analyst to monitor policy values, eliminate clearly suboptimal policies, and stop at data-dependent times without invalidating inference. When the optimal policy is unique, we define a stopping time for its identification and derive a sample-complexity bound scaling as $O\!\left(\frac{\log |\Pi|+\log\log(1/\Delta_{\min})}{\Delta_{\min}^2}\right)$, where $\Delta_{\min}$ is the gap between the best and second-best policy values. Simulations demonstrate that the anytime-valid approach can yield substantial sample savings relative to fixed-$N$ designs. An application to a large adaptive experiment on reducing misinformation online illustrates how the method provides a dynamic view as evidence on the optimal policy accumulates.
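
To get a feel for how the stated bound behaves, the expression inside the O(.) can be evaluated directly. The sketch below ignores the unspecified constant, so the numbers are relative scales rather than sample sizes; the function name and the restriction of the gap to (0, 1/e), which keeps the log-log term positive, are assumptions introduced here.

```python
import math


def bound_scaling(n_policies, gap_min):
    """Evaluate (log|Pi| + log log(1/gap)) / gap^2, without the constant hidden in O(.)."""
    if not 0.0 < gap_min < 1.0 / math.e:
        # Assumption of this sketch: keep log log(1/gap) strictly positive.
        raise ValueError("gap_min must lie in (0, 1/e)")
    log_log_term = math.log(math.log(1.0 / gap_min))
    return (math.log(n_policies) + log_log_term) / gap_min ** 2


# Hypothetical policy-class size and gaps.
for gap in (0.2, 0.1, 0.05):
    print(gap, round(bound_scaling(n_policies=50, gap_min=gap)))
```

The dominant effect is the inverse-square dependence on the gap: halving it slightly more than quadruples the scale, while the size of the policy class enters only logarithmically.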

2606.17486 2026-06-17 stat.ME stat.CO 新提交

Improving Linear Regression on Small Datasets via Gaussian Process and Extreme Value Theory-Based Data Augmentation

基于高斯过程和极值理论的数据增强改进小样本线性回归

Ibrahim Salay, Jagath Senarathne

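AI summary To address violations of classical assumptions in small-sample regression, proposes GP-MEVT, a hybrid data augmentation method that combines Gaussian processes with extreme value theory to extend the predictor space while preserving linear structure, and that outperforms standard bootstrap methods on simulated and real data.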

详情
AI中文摘要

小样本量在回归分析中带来显著挑战,常导致正态性、同方差性和残差独立性等经典假设的违背。这些违背损害了参数估计的准确性,降低了统计功效,并限制了结果的泛化能力。本研究引入了基于高斯过程的改进极值定理(GP-MEVT)方法,这是一种新颖的混合数据增强方法,结合了高斯过程与极值理论以解决这些局限性。GP-MEVT方法生成增强观测值,将预测空间扩展到观测范围之外,同时保留底层线性结构,并根据残差变异引入受控变异性。通过在三个方差场景(sigma = 2, 5, 8)和样本量(n = 10, 15, 20)下的全面模拟研究,我们证明GP-MEVT实现了更高的假设满足率,显著优于标准bootstrap和带噪声的bootstrap方法。所提出的方法还表现出合理的参数估计准确性,截距和斜率估计值始终更接近真实参数值,并且在均方根误差衡量下保持竞争性或更优的模型拟合性能。应用于真实世界数据集证实了这些优势,GP-MEVT实现了67.1%的假设满足率,而bootstrap替代方法分别为17.3%和21.2%。这些发现确立了GP-MEVT作为拟合小数据集线性回归模型的稳健可靠框架,为实践者在样本量限制不可避免时提供了一种原则性的统计推断方法。

英文摘要

Small sample sizes pose significant challenges in regression analysis, often leading to violations of classical assumptions such as normality, homoscedasticity, and independence of residuals. These violations compromise parameter estimation accuracy, reduce statistical power, and limit the generalizability of findings. This study introduces the Gaussian Process-based Modified Extreme Value Theorem (GP-MEVT) method, a novel hybrid data augmentation approach that combines Gaussian processes with extreme value theory to address these limitations. The GP-MEVT method generates augmented observations that extend the predictor space beyond the observed range while preserving the underlying linear structure and introducing controlled variability based on residual variation. Through comprehensive simulation studies across three variance scenarios (sigma = 2, 5, 8) and three sample sizes (n = 10, 15, 20), we demonstrate that GP-MEVT achieves a higher rate of assumption satisfaction, substantially outperforming the standard bootstrap and bootstrap-with-noise methods. The proposed method also exhibits reasonable parameter estimation accuracy, with intercept and slope estimates consistently closer to true parameter values, and maintains competitive or superior model fitting performance as measured by root mean square error. Application to a real-world dataset confirms these advantages, with GP-MEVT achieving a 67.1% assumption satisfaction rate compared to 17.3% and 21.2% for the bootstrap alternatives. These findings establish GP-MEVT as a robust and reliable framework for fitting linear regression models to small datasets, offering practitioners a principled approach to statistical inference when sample size limitations are unavoidable.

2606.17424 2026-06-17 stat.ME 新提交

The dangers of using three-number summaries to estimate unknown standard deviations: sensitivity analyses and some possible improvements incorporating shape

使用三数汇总估计未知标准差的风险:敏感性分析及结合形状信息的改进方法

Udara Kumaranathunga, Alysha De Livera, Luke A. Prendergast

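AI summary Shows that three-number summaries (minimum, median, maximum) are insufficient for reliably estimating standard deviations, proposes a new estimator based on the scaled Beta distribution, and develops sensitivity-analysis tools to improve the reliability of inference.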

详情
AI中文摘要

近年来,将三数和五数汇总统计量(即最小值、最大值、中位数和四分位数)转换为均值和标准差的方法取得了很大进展。这在元分析中很常见,其中一些研究报告均值和标准差,而另一些报告分位数汇总。然而,我们表明,最常见的三数汇总不包含足够的信息来可靠地估计标准差。我们证明,这可能导致非常差的估计,从而可能使任何推断无效,并提供了敏感性分析的细节,使研究人员能够对其结果更有信心,或突出潜在的偏差来源。我们进一步探讨了指定额外信息是否能提供关于未知数据形状的足够信息以改进标准差估计,并在此过程中引入了一种使用缩放Beta分布的新估计器。通过模拟和真实数据示例,我们突出了该方法的优缺点。还提供了一个Web应用程序,以帮助研究人员进行敏感性分析。

英文摘要

In recent years, there has been much progress toward the development of methods for converting three- and five-number summary statistics (i.e. minimum, maximum, median, and quartiles) to means and standard deviations (SDs). This is commonly done in the meta-analysis setting, where some studies report means and SDs, while others report quantile summaries. However, we show that three-number summaries, which are the most common, do not contain enough information to reliably estimate SDs. We show that very poor estimates can result, which may invalidate any inference, and we provide details of a sensitivity analysis that can allow researchers to have greater confidence in their results or highlight potential sources of bias. We further explore whether nominating additional information can provide enough information regarding the unknown data shape to improve SD estimates, and in doing so introduce a new estimator using the scaled Beta distribution. Simulations and a real data example are used to highlight the advantages and disadvantages of this approach. A Web application is also provided to help researchers perform sensitivity analyses.
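
The core difficulty can be shown with two toy samples that share the same minimum, median, and maximum but have different shapes. The estimator below is a normal-theory range rule of the kind commonly used in meta-analysis (range divided by twice an approximate expected largest normal order statistic); it is included as an illustration of a conventional conversion, not as the estimator proposed in the paper, and the two data sets are made up.

```python
import statistics
from statistics import NormalDist


def sd_from_range(minimum, maximum, n):
    """Range-based SD estimate that assumes roughly normal data of size n."""
    xi = 2.0 * NormalDist().inv_cdf((n - 0.375) / (n + 0.25))
    return (maximum - minimum) / xi


peaked = [0, 5, 5, 5, 5, 5, 10]    # mass concentrated at the median
spread = [0, 0, 0, 5, 10, 10, 10]  # mass concentrated at the extremes

for data in (peaked, spread):
    summary = (min(data), statistics.median(data), max(data))
    true_sd = statistics.stdev(data)
    estimate = sd_from_range(summary[0], summary[2], len(data))
    print(summary, round(true_sd, 2), round(estimate, 2))
```

Because the estimator sees only the three-number summary and the sample size, it returns the same value for both samples even though their sample SDs differ, which is the kind of sensitivity to unknown shape that the abstract warns about.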

2606.17308 2026-06-17 stat.ME stat.ML 新提交

Kernel-Based Functional Balancing for Causal Inference with Compositional Treatments

基于核的协变量函数平衡法用于成分处理下的因果推断

Sungbum Kim, Jiayi Wang

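AI summary For causal effect estimation with compositional treatments (exposures lying on a simplex), proposes kernel-based covariate functional balancing weights constructed by minimizing the worst-case balancing error over a reproducing kernel Hilbert space, and builds an augmented weighted estimator that achieves $\sqrt{n}$-consistency.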

Comments 40 pages, 3 figures

详情
AI中文摘要

我们研究成分处理下的因果效应估计,其中暴露位于单纯形上,估计量定义在成分上而非标量或二元值。通过考虑平均潜在结果在处理空间上的投影,采用基于核的协变量函数平衡方法进行权重构造。权重通过直接最小化在由处理和协变量联合空间定义的再生核希尔伯特空间(RKHS)上的最坏情况平衡误差获得,而非在处理分配模型下估计。基于这些权重,提出了一个增强加权估计量(AWE),其中结果函数通过核岭回归估计,并与协变量分布的边际增广相结合。尽管所得目标函数结构复杂,但通过表示定理和低秩近似,我们将其转化为有限维凸优化问题。所提出的估计量在不要求权重一致估计或光滑性的情况下实现了√n一致性。建立了围绕样本特定目标的渐近正态性结果。通过模拟研究和真实数据应用展示了经验性能。

英文摘要

We study causal effect estimation with compositional treatments, where the exposure lies on a simplex and the estimand is defined over compositions rather than scalar or binary values. By considering a projection of the average potential outcome onto the treatment space, a kernel-based covariate functional balancing approach is adopted for weight construction. The weights are obtained by directly minimizing a worst-case balancing error over a reproducing kernel Hilbert space (RKHS) defined on the joint space of treatments and covariates, instead of being estimated under a treatment assignment model. Building on these weights, an augmented weighted estimator (AWE) is proposed, where the outcome function is estimated via kernel ridge regression and combined with a marginal augmentation over the covariate distribution. Despite the complex structure of the resulting objective, a finite-dimensional convex optimization problem is formulated via a representer theorem and a low-rank approximation. The proposed estimator achieves $\sqrt{n}$-consistency without requiring consistent estimation or smoothness of the weights. An asymptotic normality result is established around a sample-specific target. Empirical performance is demonstrated through simulation studies and a real data application.

2606.17232 2026-06-17 stat.ME 新提交

Semiparametric Mediation Analysis with Separately Observed Mediator and Outcome under Unmeasured Confounding

存在未测量混杂时基于分别观测的中介变量和结局变量的半参数中介分析

Sijia Li, Ruoyu Wang

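AI summary To handle incomplete data in which the mediator and outcome are never jointly observed, proposes a data fusion framework that uses shared instrumental variables to identify natural direct and indirect effects under a no-interaction condition, and develops semiparametric influence-function-based estimators with multiple robustness.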

Comments 24 pages; 2 figures

详情
AI中文摘要

中介分析被广泛用于解构因果路径,然而在许多实际研究中,中介变量 M 和结局变量 Y 从未被同时观测。这种不完整性破坏了自然直接和间接效应的标准识别策略。我们引入了一种新颖的数据融合框架,通过结合两个不完整的数据源(一个测量 M,另一个测量 Y)来恢复识别。我们的方法利用共享工具变量(IVs)来规避联合观测 (M,Y) 的需求,在无交互条件下对未测量混杂仍然有效,并通过潜在对齐条件适应跨数据源的协变量和暴露偏移。我们建立了两种识别策略:一种适用于已知有效 IV 集合的场景,另一种适用于需要学习有效 IV 的场景。我们进一步开发了具有多重稳健性的半参数影响函数估计器,并提出了一个在适当条件下达到半参数效率界的估计器。我们将我们的框架应用于量化 SNP rs610932 对痴呆风险的影响在多大程度上通过免疫相关基因表达途径中介。

英文摘要

Mediation analysis is widely used to disentangle causal pathways, yet in many real-world studies the mediator M and outcome Y are never jointly observed. This incompleteness breaks the standard identification strategy for natural direct and indirect effects. We introduce a novel data fusion framework that restores identification by combining two incomplete data sources, one measuring M and the other measuring Y. Our approach leverages shared instrumental variables (IVs) to circumvent the need to observe (M,Y) jointly, remains valid under unmeasured confounding via a no-interaction condition, and accommodates covariate and exposure shifts across data sources under a latent alignment condition. We establish two identification strategies, one for settings with a known set of valid IVs, and another for settings where valid IVs must be learned. We further develop semiparametric, influence-function-based estimators with multiple robustness properties, and propose an estimator that attains the semiparametric efficiency bound under appropriate conditions. We apply our framework to quantify the extent to which the effect of SNP rs610932 on dementia risk is mediated through immune-related gene-expression pathways.

2606.17181 2026-06-17 stat.ME stat.AP 新提交

Tropical Viterbi Tubes for Decoding Uncertainty in Hidden Markov Models

热带维特比管:隐马尔可夫模型解码不确定性

Aurélien Nicosia

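AI summary Proposes tropical Viterbi tubes, which use a tolerance threshold to capture the uncertainty of near-optimal paths in hidden Markov models, and provides exact projection algorithms and a calibration method.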

Comments 33 pages, 4 figures; supplementary material included as ancillary file; submitted to The Annals of Applied Statistics

详情
AI中文摘要

隐马尔可夫模型广泛用于从序列数据推断潜在状态序列,但维特比解码仅报告一条最可能的完整路径。当解码状态具有科学意义时,这一单一最大化器可能掩盖由多条近最优轨迹产生的路径不确定性。在拟合的HMM条件下,我们引入热带维特比管:其完整数据对数得分在维特比最优值容忍度内的隐藏轨迹集合。状态、转移和变化状态投影显示哪些局部特征与全局近最优完整路径兼容,为序列分析、生态学、金融、生物医学监测及相关领域的HMM提供了路径不确定性层。该管是完整隐藏路径空间上的后验上水平集,容忍度解释为相对于维特比路径的对数后验几率损失。将容忍度校准到目标后验质量,为完整潜在路径提供了HPD阈值可信区域和保守的同时投影带。我们证明了单调性、阶梯函数行为和确定性稳定性保证,并通过最大加前向-后向递归在O(TK^2)时间内精确计算密集转移的投影管。后验管质量和HPD校正是通过FFBS近似的独立路径计算。在一个公开的蝙蝠追踪应用中,鲁棒觅食管段富含捕食嗡嗡声,而鲁棒通勤管段则缺乏:在eta=0.005时,鲁棒觅食的富集度为2.25,95%自助法区间为(1.73, 2.85);鲁棒通勤的富集度为0.27,区间为(0.16, 0.44)。

英文摘要

Hidden Markov models are widely used to infer latent state sequences from sequential data, but Viterbi decoding reports only one most likely complete path. When decoded states carry scientific meaning, this single maximizer can conceal pathwise uncertainty created by multiple near-optimal trajectories. Conditional on a fitted HMM, we introduce the tropical Viterbi tube: the set of hidden trajectories whose complete-data log-score lies within a tolerance of the Viterbi optimum. State, transition, and change-status projections show which local features remain compatible with globally near-optimal complete paths, giving a pathwise uncertainty layer for HMMs in sequence analysis, ecology, finance, biomedical monitoring, and related domains. The tube is a posterior superlevel set on complete hidden-path space, with tolerance interpreted as a log posterior-odds loss relative to a Viterbi path. Calibrating the tolerance to a target posterior mass gives an HPD-threshold credible region for the complete latent path and conservative simultaneous projected bands. We prove monotonicity, step-function behavior, and deterministic stability guarantees, and compute projected tubes exactly by max-plus forward-backward recursions in O(TK^2) time for dense transitions. Posterior tube mass and HPD calibration are separate pathwise calculations approximated by FFBS. In a public bat-tracking application, robust foraging tube segments are enriched for feeding buzzes, whereas robust commuting segments are depleted: at eta = 0.005, enrichment is 2.25 with 95% bootstrap interval (1.73, 2.85) for robust foraging and 0.27 with interval (0.16, 0.44) for robust commuting.
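
The state projection of such a tube can be sketched with max-plus forward and backward passes, which cost O(TK^2) for dense transitions. The code below is a minimal illustration under stated assumptions: the complete-data log-score is taken to be the log joint probability of path and observations, all probabilities are strictly positive, and the function name, the toy two-state model, and the small floating-point slack are introduced here rather than taken from the paper.

```python
import math


def viterbi_tube_states(log_init, log_trans, log_emit, eta, slack=1e-12):
    """Per time step, return the states that lie on some path within eta of the Viterbi score.

    log_init[k], log_trans[j][k], and log_emit[t][k] are log-probabilities.
    Also returns the Viterbi (maximum) log-score.
    """
    T, K = len(log_emit), len(log_init)
    # Max-plus forward pass: best score of a partial path ending in state k at time t.
    fwd = [[log_init[k] + log_emit[0][k] for k in range(K)]]
    for t in range(1, T):
        fwd.append([
            log_emit[t][k] + max(fwd[t - 1][j] + log_trans[j][k] for j in range(K))
            for k in range(K)
        ])
    # Max-plus backward pass: best score of a continuation after state j at time t.
    bwd = [[0.0] * K for _ in range(T)]
    for t in range(T - 2, -1, -1):
        bwd[t] = [
            max(log_trans[j][k] + log_emit[t + 1][k] + bwd[t + 1][k] for k in range(K))
            for j in range(K)
        ]
    best = max(fwd[T - 1])
    # fwd + bwd is the best complete-path score constrained to visit state k at time t;
    # slack absorbs rounding differences between the two passes.
    tube = [
        {k for k in range(K) if fwd[t][k] + bwd[t][k] >= best - eta - slack}
        for t in range(T)
    ]
    return tube, best


# Hypothetical two-state HMM observed over four time steps.
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)], [math.log(0.4), math.log(0.6)]]
emit = [[0.9, 0.2], [0.5, 0.5], [0.1, 0.8], [0.6, 0.3]]  # P(obs_t | state)
log_emit = [[math.log(p) for p in row] for row in emit]

tube, best = viterbi_tube_states(log_init, log_trans, log_emit, eta=0.5)
```

Time steps where the set contains a single state are those where every near-optimal path agrees; steps with several states flag local ambiguity that a single Viterbi path would hide.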

2606.18087 2026-06-17 econ.GN q-fin.EC 新提交

Environmental Threat and the Nation: Earthquake Risk, Distributive Priority, and Expressive Attachment

环境威胁与国家:地震风险、分配优先级与表达性依恋

Hector Galindo-Silva

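AI summary Using data from 494 regions in 63 countries, finds that long-run earthquake risk strengthens national identity, mainly through the expressive channel (pride, willingness to fight) rather than the distributive channel, with a stronger effect in regions where religious symbolic infrastructure is well developed.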

详情
AI中文摘要

本文研究长期地震风险如何塑造国家认同,区分了分配性边际(国家成员身份作为稀缺资源分配规则)和表达性边际(自豪感、战斗意愿和情感依恋)。将世界价值观调查受访者(1981-2022年;63个国家,494个次国家地区)与次国家地震风险地理数据关联,我发现居住在高风险区域附近的人表现出更强的国家内群体取向:更多的自豪感、更强的战斗意愿,以及在就业稀缺时给予国民更多优先权。家庭依恋和外群体敌意并未上升,而宗教虔诚度同步增加。表达性边际是有条件的:在政教合一且宗教领域凝聚力强的地方,自豪感反应显著,因为这种象征性基础设施将灾难塑造为共同的国家考验;而在缺乏这些条件的地方,自豪感反应与零无显著差异。利用相邻调查波次之间地震的补充设计发现,平均短期反应为零,但检测到的反应集中在年长、对地方有依恋且无法离开的居民中——这与态度追踪长期、不可避免的风险而非单一事件相一致。综合来看,结果指向国家依恋的需求侧起源:当协变量冲击会压倒地方和家庭保险时,人们转向更大的保护与意义共同体——国家和宗教——这一逻辑我在一个简单的社会互动模型中形式化。

英文摘要

This paper studies how long-run earthquake risk shapes national identity, separating a distributive margin (national membership as a rule for allocating scarce resources) from an expressive margin (pride, willingness to fight, and affective attachment). Linking World Values Survey respondents (1981-2022; 63 countries, 494 subnational regions) to subnational seismic-risk geography, I find that people living closer to high-risk zones express stronger national in-group orientation: more pride, more willingness to fight, and more priority for nationals when jobs are scarce. Family attachment and out-group hostility do not rise, while religiosity increases in parallel. The expressive margin is conditional: the pride response is pronounced where state-religion alignment and a cohesive religious field lend the symbolic infrastructure to cast disaster as a shared national ordeal, and indistinguishable from zero where they do not. A complementary design exploiting earthquakes between adjacent survey waves finds no average short-run response, yet the response it does detect concentrates among older, place-attached residents who cannot leave, consistent with attitudes tracking a chronic, inescapable risk rather than single events. Together, the results point to a demand-side origin of national attachment: where a covariate shock would overwhelm local and family insurance, people turn to larger communities of protection and meaning (the nation and religion), a logic I formalize in a simple social-interaction model.

2606.17807 2026-06-17 econ.GN q-fin.EC 新提交

Household coping mechanisms under grid failure: Evidence from a high electrification context in Lebanon

电网故障下的家庭应对机制:黎巴嫩高电气化背景下的证据

Majd Olleik, Haytham M. Dbouk, Anne Neumann, Elsa Bou Gebrael, Sebastian Zwickl-Bernhard

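AI summary Using survey data from 1,000 Lebanese households, studies supply-side coping mechanisms under grid failure, such as diesel generators and PV-battery systems, together with demand-side adaptations, revealing the key role of socioeconomic status in access to coping solutions and in the extent of met demand.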

Comments Submitted to a peer-reviewed journal

详情
AI中文摘要

尽管许多国家实现了近乎普遍的电气化,但电力供应短缺仍然影响着家庭能源使用。本文以黎巴嫩为案例,研究家庭如何适应高电气化、高依赖背景下的慢性电网故障。基于1000户家庭的原始调查数据,我们分析了供给侧应对机制(如柴油发电机和太阳能光伏-电池系统)以及需求侧适应措施(包括负荷转移和需求抑制)。结果揭示了家庭应对的全景图,其中社会经济地位在决定备用解决方案的获取和需求满足程度方面起着核心作用。虽然柴油发电机仍然普遍,但观察到向光伏-电池系统的转变,尤其是在经济能力较强的家庭中。然而,分散式自发电伴随着效率低下,包括大量弃光。在需求侧,家庭表现出用电量减少,导致根据所采用的备用系统类型出现不同的消费模式。这些发现强调了在评估不可靠供应下的能源需求时,区分满足和未满足需求的重要性。本文通过定量描述供应受限的高电气化背景下自发电与需求适应之间的相互作用,为文献做出了贡献。它还提供了包含抑制消费的经验需求曲线,填补了电力系统规划中的一个关键空白。从政策角度来看,结果强调需要核算未满足需求,解决应对技术获取中的不平等问题,并减少分散式系统的低效率。

英文摘要

Despite near-universal electrification in many countries, electricity supply shortages continue to shape household energy use. This paper examines how households adapt to chronic grid failure in high-electrification, high-dependence contexts, using Lebanon as a case study. Drawing on original survey data from 1,000 households, we analyze both supply-side coping mechanisms such as diesel generators and solar photovoltaic (PV)-battery systems, and demand-side adaptations, including load shifting and demand suppression. The results reveal a landscape of household responses, where socioeconomic status plays a central role in determining access to backup solutions and the extent of met demand. While diesel generators remain widespread, a transition toward PV-battery systems is observed, especially among financially capable households. However, decentralized self-generation is associated with inefficiencies, including substantial levels of curtailed solar generation. On the demand side, households exhibit reductions in electricity use, leading to distinct consumption profiles depending on the type of backup system employed. These findings highlight the importance of distinguishing between met and unmet demand when assessing energy needs under unreliable supply. The paper contributes to the literature by providing a quantitative characterization of the interaction between self-generation and demand adaptation in a supply-constrained high-electrification context. It also offers empirical demand profiles that incorporate suppressed consumption, addressing a key gap in electricity system planning. From a policy perspective, the results underscore the need to account for unmet demand, address inequities in access to coping technologies, and reduce inefficiencies in decentralized systems.

2606.17503 2026-06-17 econ.GN q-fin.EC 新提交

What Prediction Markets Can See: Market Formation, Settlement Legibility, and the Geography of Tradable Uncertainty in Africa and Latin America

预测市场能看见什么:市场形成、结算可读性以及非洲和拉丁美洲可交易不确定性的地理分布

Ade Adegbenro

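AI summary Analyzing 6,047 Africa-topic and Latin America-topic contracts on Polymarket and Kalshi, constructs a settlement legibility measure and finds that market formation is selective: sports and election contracts dominate while contracts on salient civic events are scarce, and legibility predicts listing in the expected direction but falls short of the pre-specified criteria.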

Comments 45 pages

详情
AI中文摘要

预测市场通常在其合约存在后通过评估价格预测结果的准确性来评价。我们研究市场形成的制度性前置条件,探究哪些不确定性能够成为可交易合约。利用Polymarket和Kalshi上列出的6047个非洲和拉丁美洲主题合约的审计数据集,我们构建了一个结算可读性的编码度量,即不确定性能够被第三方措辞、引用和可信解决的程度,并在冻结编码本下对451个单元进行验证,独立双重评分在主要维度上达到0.92和0.96的序数可靠性,盲人基准分别达到0.97和0.92。利用这一度量,我们发现市场形成具有选择性,而公众重要性无法解释这种选择性:非洲合约主要集中在足球领域,而显著的公民事件几乎不产生合约;拉丁美洲合约更深,但以委内瑞拉为主,对美国潜在军事行动的关注支撑了数据中最大的公民事件集群。可读性对合约库存进行陡峭排序,体育和选举位于量表顶端,冲突位于底部。在针对外部构建的131个公民事件框架的形成测试中,可读性按预期方向预测上市,但未达到预先指定的接受标准;而在已上市合约中,可读性与交易价值呈负相关,这与选择性上市模型的预测以及我们在估计前的预测一致。因此,预测市场库存衡量的是平台能够结算的内容,而非交易者相信的内容,将其解读为公众兴趣地图会混淆两者。

英文摘要

Prediction markets are usually evaluated after their contracts exist, by asking how well prices forecast outcomes. We study the prior institutional margin of market formation, asking which uncertainties become tradable contracts at all. Using an audited dataset of 6,047 Africa-topic and Latin America-topic contracts listed on Polymarket and Kalshi, we construct a coded measure of settlement legibility, the degree to which an uncertainty can be worded, sourced, and credibly resolved by third parties, and validate it on 451 units under a frozen codebook, where independent double scoring reaches ordinal reliabilities of 0.92 and 0.96 on the primary dimensions and blind human benchmarks reach 0.97 and 0.92. Using this measure, we find that formation is selective in ways that public importance does not explain, with African inventory concentrated overwhelmingly in football while salient civic events produce little or no inventory, and Latin American inventory deeper but dominated by Venezuela, where attention to prospective United States military action sustains the largest civic cluster in the data. Legibility orders the inventory steeply, with sports and elections near the top of the scale and conflict at the bottom. In a formation test against an externally assembled frame of 131 civic events, legibility predicts listing in the expected direction but falls short of pre-specified acceptance criteria, while among listed contracts the relation between legibility and trading value is negative, as a model of selective listing implies and as we predicted before estimation. Prediction-market inventories therefore measure what platforms can settle as much as what traders believe, and reading them as maps of public interest conflates the two.

2606.17423 2026-06-17 q-fin.CP stat.ML New submission

Martingale Doppelgänger-Eval: An Identification Framework for Auditing Candlestick Understanding in Vision-Language Models

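Martingale Doppelgänger Evaluation: An Identification Framework for Auditing How Vision-Language Models Understand Candlestick Charts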

Ziyao Wang

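AI summary Proposes the Martingale Doppelgänger-Eval benchmark, which uses controlled experiments to identify whether VLMs base their judgments on candlestick evidence rather than trend extrapolation, and finds that models either ignore candlestick semantics or respond opposite to the direction those semantics imply.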

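Details
AI-generated Chinese abstract (English translation)

We introduce Martingale Doppelgänger-Eval, a public shadow-market benchmark for auditing whether vision-language models (VLMs) use candlestick evidence rather than extrapolate past trends. The core difficulty is identification: in real market histories, chart evidence and trend are strongly coupled, so an observational score cannot determine whether a fluent technical-analysis narrative is grounded in local visual evidence. We prove this limitation formally. Under strong coupling, no evaluation functional computed from observational chart-label data can distinguish an evidence-grounded responder from a trend-shortcut responder, whereas matched evidence interventions separate the same responders at an exponential rate, and trend-label swaps provide an independent shortcut stress test. The benchmark therefore evaluates frozen VLMs on rendered OHLCV charts under four controlled mechanisms: a martingale-null market, injected-alpha counterfactual pairs, trend-confounder swaps, and regime shifts. A structural behavioral model identifies null-market bias, trend sensitivity, evidence sensitivity, prompt/renderer fragility, and evidence faithfulness; the accompanying statistical toolkit provides minimum detectable effects, block-aware sequential testing for metered APIs, and an overlap-weighted artifact check. Across frozen commercial and open VLMs, the identified regression assigns large positive coefficients to past trend, but the evidence coefficients are zero or opposite to the rule-implied sign. Matched-pair analyses show that models either ignore the injected candlestick semantics or, conditional on responding, move opposite to the rule-implied direction. The benchmark isolates a failure mode that standard observational chart benchmarks cannot detect and provides a reusable audit template for time-series imagery with controllable label mechanisms.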

English abstract

We introduce Martingale Doppelgänger-Eval, a public shadow-market benchmark for auditing whether vision-language models (VLMs) use candlestick evidence rather than extrapolate past trends. The central difficulty is identification: on real market histories, chart evidence and trend are strongly coupled, so an observational score cannot determine whether a fluent technical-analysis narrative is grounded in local visual evidence. We prove this limitation formally: no evaluation functional computed from observational chart–label data can distinguish a grounded responder from a trend-shortcut responder under strong coupling, whereas matched evidence interventions separate the same responders at an exponential rate and trend–label swaps provide an independent shortcut stress test. The benchmark therefore evaluates frozen VLMs on rendered OHLCV charts under four controlled mechanisms: a martingale-null market, injected-alpha counterfactual pairs, trend-confounder swaps, and regime shifts. A structural behavioral model identifies null-market bias, trend sensitivity, evidence sensitivity, prompt/renderer fragility, and evidence faithfulness; the accompanying statistical toolkit provides minimum detectable effects, block-aware sequential testing for metered APIs, and an overlap-weighted artifact check. Across frozen commercial and open VLMs, the identified regression assigns large positive coefficients to past trend but evidence coefficients that are zero or opposite to the rule-implied sign. Matched-pair analyses show that models either ignore injected candlestick semantics or move opposite to the rule-implied direction conditional on responding. The benchmark isolates a failure mode that standard observational chart benchmarks cannot detect and gives a reusable audit template for time-series imagery with controllable label mechanisms.
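To make two of the controlled mechanisms in the abstract concrete, the sketch below generates a zero-drift (martingale) price path aggregated into OHLC candles, then builds a matched pair of charts that share the same history and differ only in the direction of the final candle. The abstract does not describe the benchmark's generator, so everything here is an assumption for illustration: the function names, the uniform tick shocks, the parameter defaults, and the mirroring construction are hypothetical and are not the benchmark's own code.

```python
import random


def simulate_null_candles(n_bars, ticks_per_bar=20, sigma=0.01, seed=0):
    """Zero-drift multiplicative random walk aggregated into OHLC bars.

    Each tick multiplies the price by (1 + e), with e uniform and symmetric
    around zero, so the expected next price equals the current price
    (a martingale). Past trend therefore carries no information.
    """
    rng = random.Random(seed)
    price = 100.0
    candles = []
    for _ in range(n_bars):
        open_ = high = low = price
        for _ in range(ticks_per_bar):
            price *= 1.0 + rng.uniform(-sigma, sigma)
            high = max(high, price)
            low = min(low, price)
        candles.append(
            {"open": open_, "high": high, "low": low, "close": price}
        )
    return candles


def make_matched_pair(candles):
    """Return two charts that differ only in the last candle.

    The last candle is reflected around its open (p -> 2 * open - p), so both
    charts share the same past trend while the final candle flips direction.
    """
    *history, last = candles
    o = last["open"]
    mirrored = {
        "open": o,
        "high": 2 * o - last["low"],   # reflected low becomes the new high
        "low": 2 * o - last["high"],   # reflected high becomes the new low
        "close": 2 * o - last["close"],
    }
    return history + [last], history + [mirrored]


if __name__ == "__main__":
    chart = simulate_null_candles(30, seed=42)
    original, counterfactual = make_matched_pair(chart)
    print(original[-1])
    print(counterfactual[-1])
```

Under this construction, a responder that tracks only the past trend would give the same answer for both charts in a pair, whereas a responder that reads the final candle could answer differently. That contrast is the kind of separation that matched interventions are meant to expose.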