Abstract
Count data with excess zeros arise frequently in health economics and epidemiology. The standard Poisson Hurdle Model (PHM) parametrises the underlying Poisson rate directly, so its count-component coefficients are log rate ratios rather than log ratios of the marginal mean. Consequently, the incidence density ratio (IDR) from the PHM is neither exact nor constant across covariate profiles, which complicates applied reporting. We propose the Marginalised Poisson Hurdle Model (MPHM), which reparametrises the count component so that the coefficient vector beta directly governs the marginal mean E[Y]. A nonlinear connector equation links the structural Poisson rate to this parametrised mean. We prove existence and uniqueness of the connector solution, develop a vectorised Brent's-method solver, derive the score equations and the block-diagonal Fisher information, establish asymptotic normality, and prove that exp(beta) is exactly constant across all covariate values. A simulation study with sample sizes n in {100, 250, 500, 1000}, zero proportions pi in {0.2, 0.4, 0.6, 0.8}, and R = 200 replications confirms consistency, near-zero bias, and 95% Wald coverage between 0.905 and 0.975 across all 16 scenarios. Applied to the NMES1988 physician visit data (n = 4,406), the MPHM yields IDR = 1.163 (95% CI: 1.150-1.177) per additional chronic condition, an exact, population-wide effect that cannot be derived from the PHM. The MPHM thus resolves the non-constant IDR problem by parametrising E[Y] directly. The resulting IDR holds for every individual and for the whole population without further marginalisation, which substantially simplifies the reporting of covariate effects in health utilisation research.
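The connector idea can be illustrated with a minimal sketch. The abstract does not state the connector equation, so the form used here is an assumption: the textbook hurdle identity E[Y] = (1 - pi) * lambda / (1 - exp(-lambda)), with a zero probability pi that does not depend on covariates. The function names and numeric values are hypothetical, and plain bisection is used in place of the paper's vectorised Brent's-method solver to keep the example dependency-free.

```python
import math

def truncated_mean(lam):
    """Mean of a zero-truncated Poisson with rate lam > 0: lam / (1 - exp(-lam))."""
    return lam / -math.expm1(-lam)

def solve_rate(mu, pi_zero, tol=1e-12):
    """Solve (1 - pi_zero) * truncated_mean(lam) = mu for lam by bisection.

    Assumed connector form (not taken from the paper). truncated_mean is
    strictly increasing from 1, so a unique root exists when mu > 1 - pi_zero.
    """
    target = mu / (1.0 - pi_zero)
    if target <= 1.0:
        raise ValueError("no solution: requires mu > 1 - pi_zero")
    # Bracket: truncated_mean(lo) is about 1 < target, and truncated_mean(target) > target.
    lo, hi = 1e-12, target
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if truncated_mean(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def marginal_mean(lam, pi_zero):
    """Marginal mean E[Y] of the hurdle model under the assumed identity."""
    return (1.0 - pi_zero) * truncated_mean(lam)

# Hypothetical coefficients: log E[Y] = beta0 + beta1 * x, constant zero probability.
beta0, beta1, pi_zero = 1.0, 0.15, 0.4
lam = [solve_rate(math.exp(beta0 + beta1 * x), pi_zero) for x in (0, 1, 2)]
means = [marginal_mean(l, pi_zero) for l in lam]

mean_ratios = [means[1] / means[0], means[2] / means[1]]  # both equal exp(beta1)
rate_ratios = [lam[1] / lam[0], lam[2] / lam[1]]          # differ from each other
print(mean_ratios, rate_ratios, math.exp(beta1))
```

Under these assumptions, the ratio of marginal means for a one-unit change in x equals exp(beta1) at every x, whereas the ratio of the implied structural rates changes with x. This is the contrast between the marginal and rate scales that the abstract describes.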