Paper title
Inference on the Best Policies with Many Covariates
Authors
Abstract
Understanding the impact of the most effective policies or treatments on a response variable of interest is desirable in many empirical works in economics, statistics and other disciplines. Due to the widespread winner's curse phenomenon, conventional statistical inference assuming that the top policies are chosen independent of the random sample may lead to overly optimistic evaluations of the best policies. In recent years, given the increased availability of large datasets, such an issue can be further complicated when researchers include many covariates to estimate the policy or treatment effects in an attempt to control for potential confounders. In this manuscript, to simultaneously address the above-mentioned issues, we propose a resampling-based procedure that not only lifts the winner's curse in evaluating the best policies observed in a random sample, but also is robust to the presence of many covariates. The proposed inference procedure yields accurate point estimates and valid frequentist confidence intervals that achieve the exact nominal level as the sample size goes to infinity for multiple best policy effect sizes. We illustrate the finite-sample performance of our approach through Monte Carlo experiments and two empirical studies, evaluating the most effective policies in charitable giving and the most beneficial group of workers in the National Supported Work program.
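The winner's curse mentioned in the abstract can be illustrated with a small simulation. The sketch below is not the resampling procedure proposed in the paper; it only shows why naive inference on a selected "best" policy is overly optimistic. The function name, the number of policies, the sample sizes, and the Gaussian noise model are all assumptions chosen for illustration. Every policy has a true effect of zero, yet the largest sample estimate is positive on average because the winner is picked using the same data that is used to evaluate it.

```python
import random
import statistics


def simulate_winners_curse(n_policies=10, n_obs=50, n_reps=1000, seed=0):
    """Average naive estimate of the selected best policy.

    Illustrative assumption: every policy has true effect 0 and outcomes
    are i.i.d. standard normal, so any nonzero average is selection bias.
    """
    rng = random.Random(seed)
    best_estimates = []
    for _ in range(n_reps):
        # Sample-mean effect estimate for each policy in this replication.
        estimates = [
            statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(n_obs))
            for _ in range(n_policies)
        ]
        # Naive evaluation: report the estimate of whichever policy looks best.
        best_estimates.append(max(estimates))
    return statistics.fmean(best_estimates)


if __name__ == "__main__":
    bias = simulate_winners_curse()
    print(f"Average naive estimate of the 'best' policy: {bias:.3f} (true effect: 0)")
```

With a single policy (`n_policies=1`) there is no selection step and the average estimate stays close to zero, which isolates selection as the source of the optimism.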