Paper title
Robust adaptive variable selection in ultra-high dimensional linear regression models
Paper authors
Abstract
We consider the problem of simultaneous variable selection and estimation of the corresponding regression coefficients in ultra-high dimensional linear regression models, an extremely important problem in the recent era. Adaptive penalty functions are used in this regard to achieve the oracle variable selection property along with a lighter computational burden. However, the usual adaptive procedures (e.g., adaptive LASSO) based on the squared error loss function are extremely non-robust in the presence of data contamination, which is quite common with large-scale data (e.g., noisy gene expression data, spectra and spectral data). In this paper, we present a regularization procedure for ultra-high dimensional data using a robust loss function based on the popular density power divergence (DPD) measure along with the adaptive LASSO penalty. We theoretically study the robustness and the large-sample properties of the proposed adaptive robust estimators for a general class of error distributions; in particular, we show that the proposed adaptive DPD-LASSO estimator is highly robust, satisfies the oracle variable selection property, and that the corresponding estimators of the regression coefficients are consistent and asymptotically normal under an easily verifiable set of assumptions. Numerical illustrations are provided for the most commonly used normal error density. Finally, the proposal is applied to analyze an interesting spectral dataset, in the field of chemometrics, regarding the electron-probe X-ray microanalysis (EPXMA) of archaeological glass vessels from the 16th and 17th centuries.
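To make the shape of the objective concrete, the following is a minimal sketch of a DPD loss combined with an adaptive LASSO penalty for normal errors. It is an illustration under stated assumptions, not the authors' implementation: the function names, the tuning parameters `alpha` and `lam`, the initial estimate `beta_init`, and the classical adaptive LASSO weights 1/|beta_init_j| are all choices made here, and the error scale `sigma` is treated as given rather than estimated.

```python
import numpy as np


def dpd_loss_normal(beta, sigma, X, y, alpha):
    """Density power divergence loss for a linear model with N(0, sigma^2) errors.

    Assumes alpha > 0. Terms that do not depend on (beta, sigma) are dropped.
    """
    residuals = y - X @ beta
    scale = (2.0 * np.pi) ** (-alpha / 2.0) * sigma ** (-alpha)
    # Each observation enters through exp(-alpha * r^2 / (2 sigma^2)), which
    # tends to 0 for large residuals, so a gross outlier has bounded influence.
    data_term = np.mean(np.exp(-alpha * residuals**2 / (2.0 * sigma**2)))
    return scale / np.sqrt(1.0 + alpha) - (1.0 + alpha) / alpha * scale * data_term


def adaptive_dpd_lasso_objective(beta, sigma, X, y, alpha, lam, beta_init, eps=1e-8):
    """DPD loss plus a weighted L1 penalty (hypothetical helper for illustration).

    The weights 1 / |beta_init_j| are the classical adaptive LASSO choice and
    are an assumption here; eps only guards against division by zero.
    """
    weights = 1.0 / (np.abs(beta_init) + eps)
    penalty = lam * np.sum(weights * np.abs(beta))
    return dpd_loss_normal(beta, sigma, X, y, alpha) + penalty
```

Because the data term lies between 0 and 1, the loss stays bounded however far a single response is moved, whereas the squared error loss grows without bound. Minimizing this objective over `beta` would need a separate optimizer, which is not shown here.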