Smoothed Rank-Based Regression Estimation Using Wilcoxon Score Functions
Feridun Tasdan
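AI summary: Proposes a Wilcoxon rank regression estimator that uses smoothed ranks in place of integer ranks, approximating the indicator function with a kernel distribution function. The estimator retains robustness while improving efficiency under heavy-tailed errors and handling tied data; a Wald test is derived and asymptotic normality is proved.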
Comments: 17 pages
This article proposes an improved rank-based regression estimator obtained by replacing the ordinary integer ranks in the Wilcoxon rank-score regression procedure with smoothed ranks derived from a smoothed empirical cumulative distribution function. The smoothed ranks are computed via a continuous, nondecreasing kernel distribution function H that provides a differentiable approximation to the classical indicator function used in standard rank regression. Substituting these smoothed ranks into the Wilcoxon score function yields a new estimator for the slope parameter(s) of the simple and multiple linear regression models. We show that the proposed estimator inherits the robustness properties of classical rank regression while providing improved efficiency under heavy-tailed error distributions and better handling of tied observations. A Wald-type hypothesis test for the regression coefficients is derived, and its asymptotic normality is established. A Monte Carlo simulation study compares the new estimator with the ordinary least-squares (OLS) estimator, the classical Wilcoxon rank regression estimator, and the Theil-Sen estimator under several error distributions, including the normal, Laplace, Cauchy, and contaminated normal. The proposed estimator achieves relative efficiencies at or above those of classical rank regression uniformly across all scenarios considered, with notable gains in the presence of outliers and heavy-tailed errors.
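The idea of smoothed ranks in a Wilcoxon score can be illustrated with a minimal sketch for simple linear regression. The abstract does not give the exact definitions, so everything concrete here is an assumption made for illustration: H is taken to be the standard normal CDF, the smoothed rank of residual e_i is 0.5 + Σ_j H((e_i − e_j)/h) with a hypothetical bandwidth h (which reduces to the midrank as h → 0), the Wilcoxon score is φ(u) = √12 (u − 0.5), and the slope is found by bisection on the rank-score estimating equation. The function names are hypothetical and not taken from the paper.

```python
import math


def normal_cdf(t):
    """Standard normal CDF, used here as an assumed choice of the kernel H."""
    return 0.5 * (1.0 + math.erf(t / math.sqrt(2.0)))


def smoothed_ranks(values, bandwidth):
    """Smoothed rank of each value: 0.5 + sum_j H((v_i - v_j) / bandwidth).

    Assumed form: as bandwidth -> 0 this tends to the classical midrank,
    so tied values receive equal, averaged ranks.
    """
    ranks = []
    for v in values:
        total = sum(normal_cdf((v - w) / bandwidth) for w in values)
        ranks.append(0.5 + total)
    return ranks


def wilcoxon_score(u):
    """Wilcoxon (linear) score function phi(u) = sqrt(12) * (u - 0.5)."""
    return math.sqrt(12.0) * (u - 0.5)


def score_statistic(beta, x, y, bandwidth):
    """Rank-score estimating function S(beta) built from smoothed ranks."""
    n = len(x)
    x_bar = sum(x) / n
    residuals = [yi - beta * xi for xi, yi in zip(x, y)]
    ranks = smoothed_ranks(residuals, bandwidth)
    return sum(
        (xi - x_bar) * wilcoxon_score(r / (n + 1)) for xi, r in zip(x, ranks)
    )


def smoothed_wilcoxon_slope(x, y, bandwidth=0.1, tol=1e-8):
    """Solve S(beta) = 0 by bisection (illustrative solver, not the paper's).

    With a symmetric smooth H, S is continuous and nonincreasing in beta,
    positive below the smallest pairwise slope and negative above the
    largest, so those slopes (padded by 1) give a valid bracket.
    """
    n = len(x)
    slopes = [
        (y[j] - y[i]) / (x[j] - x[i])
        for i in range(n)
        for j in range(i + 1, n)
        if x[j] != x[i]
    ]
    lo, hi = min(slopes) - 1.0, max(slopes) + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score_statistic(mid, x, y, bandwidth) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Usage: a clean line y = 1 + 2x with one gross outlier in the last response.
x_demo = [float(i) for i in range(10)]
y_demo = [1.0 + 2.0 * xi for xi in x_demo]
y_demo[9] = 100.0
slope_demo = smoothed_wilcoxon_slope(x_demo, y_demo)
```

In this toy example the estimated slope should stay close to 2 despite the outlier, which is the kind of robustness the abstract attributes to rank-based estimators. The bandwidth value is arbitrary here; how it should be chosen in practice is not stated in the abstract.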