Paper title
Design-based theory for cluster rerandomization
Paper authors
Paper abstract
Complete randomization balances covariates on average, but covariate imbalance often exists in finite samples. Rerandomization can ensure covariate balance in the realized experiment by discarding the undesired treatment assignments. Many field experiments in public health and social sciences assign the treatment at the cluster level due to logistical constraints or policy considerations. Moreover, they are frequently combined with rerandomization in the design stage. We refer to cluster rerandomization as a cluster-randomized experiment compounded with rerandomization to balance covariates at the individual or cluster level. Existing asymptotic theory can only deal with rerandomization with treatments assigned at the individual level, leaving that for cluster rerandomization an open problem. To fill the gap, we provide a design-based theory for cluster rerandomization. Moreover, we compare two cluster rerandomization schemes that use prior information on the importance of the covariates: one based on the weighted Euclidean distance and the other based on the Mahalanobis distance with tiers of covariates. We demonstrate that the former dominates the latter with optimal weights and orthogonalized covariates. Last but not least, we discuss the role of covariate adjustment in the analysis stage and recommend covariate-adjusted procedures that can be conveniently implemented by least squares with the associated robust standard errors.
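The design step described in the abstract (draw a cluster-level assignment, check covariate balance, discard and redraw if the imbalance is too large) can be sketched as follows. This is a minimal illustration, not the paper's procedure: it assumes balance is checked on cluster-level covariates with a plain Mahalanobis distance, without the tiers or weights the abstract mentions, and the function names, the `threshold` value, and the simulated data are all hypothetical.

```python
import numpy as np


def mahalanobis_imbalance(x, z):
    """Mahalanobis distance between treated and control cluster covariate means.

    x: (n_clusters, n_covariates) array of cluster-level covariates (assumed input).
    z: (n_clusters,) 0/1 array, 1 = cluster assigned to treatment.
    """
    n = len(z)
    n1 = int(z.sum())
    n0 = n - n1
    diff = x[z == 1].mean(axis=0) - x[z == 0].mean(axis=0)
    # Covariance of the difference in means under complete randomization of
    # clusters: S^2 * n / (n1 * n0), with S^2 the finite-population covariance.
    s2 = np.atleast_2d(np.cov(x, rowvar=False))
    cov_diff = s2 * n / (n1 * n0)
    return float(diff @ np.linalg.solve(cov_diff, diff))


def cluster_rerandomize(x, n_treated, threshold, rng, max_draws=10_000):
    """Redraw cluster assignments until the imbalance is at most `threshold`."""
    n = x.shape[0]
    for draw in range(1, max_draws + 1):
        z = np.zeros(n, dtype=int)
        z[rng.choice(n, size=n_treated, replace=False)] = 1
        m = mahalanobis_imbalance(x, z)
        if m <= threshold:
            return z, m, draw
    raise RuntimeError("no acceptable assignment found within max_draws")


# Hypothetical example: 20 clusters, 3 cluster-level covariates, 5 units each.
rng = np.random.default_rng(0)
x_cluster = rng.normal(size=(20, 3))
z_cluster, distance, n_draws = cluster_rerandomize(
    x_cluster, n_treated=10, threshold=2.0, rng=rng
)

# Every individual inherits the assignment of its cluster.
cluster_id = np.repeat(np.arange(20), 5)
z_individual = z_cluster[cluster_id]
print(f"accepted after {n_draws} draw(s), distance = {distance:.3f}")
```

A smaller `threshold` enforces tighter balance at the cost of more discarded draws. A weighted Euclidean criterion of the kind the abstract compares against would replace `mahalanobis_imbalance` with a weighted sum of squared mean differences.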