Error Bounds for Importance Sampling with Estimated Proposal Distributions
Cathrine Aeckerle-Willems, Ilja Klebanov, Simon Weissmann
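AI summary: This paper studies importance sampling with data-driven proposal distributions. It derives non-asymptotic error bounds that separate the Monte Carlo error from the proposal approximation error, and it provides quantitative guarantees for KDE-based proposals.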
Details
Importance sampling with data-driven proposal distributions is widely used in practice. A common workflow first generates an auxiliary sample of size $N$ from an approximation of the target distribution, constructs a density estimate $\hat q$ such as a kernel density estimator (KDE), and then draws $n$ importance samples from this learned proposal. Despite its practical relevance, the theoretical properties of this hierarchical procedure remain poorly understood, since classical importance sampling theory assumes a fixed proposal. We address this gap by deriving non-asymptotic error bounds for standard, defensive, and self-normalized importance sampling estimators with random proposals. Our results separate the Monte Carlo error, scaling as $n^{-1/2}$, from the proposal approximation error measured through the mean integrated absolute and squared errors (MIAE and MISE) of $\hat q$. To obtain explicit convergence rates in $(N,n)$, we establish MIAE and MISE bounds for KDEs constructed from geometrically ergodic Markov chains in stationary and non-stationary regimes. Combining these results yields quantitative guarantees for importance sampling with KDE-based proposals. Our theory provides practical guidance for selecting defensive mixture weights in a nonparametric importance sampling framework.
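The two-stage workflow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation, and every concrete choice here is an assumption made for illustration: the toy target (a standard normal), the integrand $f(x) = x^2$ (true value 1), an i.i.d. auxiliary sample in place of the Markov chain samples considered in the paper, a Gaussian KDE with a Silverman-type bandwidth, a wide Gaussian as the defensive component, and the mixture weight `alpha = 0.1`. All function names are hypothetical.

```python
import math
import random
import statistics


def gaussian_pdf(x, mean=0.0, std=1.0):
    """Density of N(mean, std^2) at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def kde_pdf(x, centers, bandwidth):
    """Gaussian KDE q_hat(x) built from the auxiliary sample `centers`."""
    return sum(gaussian_pdf(x, c, bandwidth) for c in centers) / len(centers)


def kde_sample(rng, centers, bandwidth):
    """Draw from q_hat: pick a center uniformly, then add kernel noise."""
    return rng.choice(centers) + rng.gauss(0.0, bandwidth)


def defensive_is(f, target_pdf, centers, bandwidth,
                 safe_pdf, safe_sample, alpha, n, rng):
    """Importance sampling with proposal (1 - alpha) * q_hat + alpha * q_safe.

    Returns (standard estimate, self-normalized estimate, mean weight).
    """
    weights, values = [], []
    for _ in range(n):
        # Sample from the mixture: defensive component with probability alpha.
        if rng.random() < alpha:
            x = safe_sample(rng)
        else:
            x = kde_sample(rng, centers, bandwidth)
        q = (1.0 - alpha) * kde_pdf(x, centers, bandwidth) + alpha * safe_pdf(x)
        weights.append(target_pdf(x) / q)
        values.append(f(x))
    weighted_sum = sum(w * v for w, v in zip(weights, values))
    total_weight = sum(weights)
    return weighted_sum / n, weighted_sum / total_weight, total_weight / n


def run_demo(seed=0, N=200, n=2000, alpha=0.1):
    rng = random.Random(seed)
    # Stage 1 (assumption): i.i.d. draws from a rough approximation of the target.
    aux = [rng.gauss(0.3, 1.2) for _ in range(N)]
    # Silverman-type rule of thumb for the bandwidth (assumption).
    bandwidth = 1.06 * statistics.stdev(aux) * N ** (-0.2)
    # Stage 2: n importance samples from the defensive KDE mixture.
    return defensive_is(
        f=lambda x: x * x,
        target_pdf=gaussian_pdf,                       # toy target N(0, 1)
        centers=aux,
        bandwidth=bandwidth,
        safe_pdf=lambda x: gaussian_pdf(x, 0.0, 2.0),  # wide defensive density
        safe_sample=lambda r: r.gauss(0.0, 2.0),
        alpha=alpha,
        n=n,
        rng=rng,
    )


if __name__ == "__main__":
    standard, self_normalized, mean_weight = run_demo()
    print("standard (defensive) estimate:", standard)
    print("self-normalized estimate:", self_normalized)
    print("mean importance weight:", mean_weight)
```

In this toy setup the ratio of the target density to the defensive density is at most 2, so every importance weight is bounded by `2 / alpha` regardless of how poor the KDE is. A larger `alpha` tightens that bound but moves the proposal further from the learned $\hat q$, which is the kind of trade-off the abstract's guidance on defensive mixture weights concerns.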