Paper Title
MFL_COVID19: Quantifying Country-based Factors affecting Case Fatality Rate in Early Phase of COVID-19 Epidemic via Regularised Multi-task Feature Learning
Authors
Abstract
The recent outbreak of COVID-19 has led to rapid spread around the world. Many countries have implemented timely and intensive suppression measures to minimize infections, but have nonetheless seen high case fatality rates (CFR) due to critical demand on health resources. Other country-based factors, such as sociocultural issues and ageing populations, have also influenced the practical effectiveness of interventions to reduce mortality in the early phase. Understanding how these factors relate to COVID-19 CFR across different countries is of primary importance in preparing for a potential second wave of COVID-19 infections. In this paper, we propose a novel regularized multi-task learning based factor analysis approach for quantifying country-based factors affecting CFR in the early phase of the COVID-19 epidemic. We formulate the prediction of CFR progression as a machine learning (ML) regression problem with observed CFR and other country-based factors. In this formulation, all CFR-related factors are categorized into 6 sectors with 27 indicators. We propose a hybrid feature selection method combining filter, wrapper and tree-based models to calibrate the initial factors for a preliminary feature interaction. We then adopt two typical single-task models (Ridge and Lasso regression) and one state-of-the-art multi-task feature learning (MTFL) method (fused sparse group Lasso) in our formulation. The fused sparse group Lasso (FSGL) method allows the simultaneous selection of a common set of country-based factors for multiple time points of the COVID-19 epidemic and also incorporates temporal smoothness of each factor over the whole early-phase period. Finally, we propose a novel temporal voting feature selection scheme to balance the weight instability of multiple factors in our MTFL model.
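The abstract does not spell out the FSGL penalty or the voting rule, so the following is only a minimal sketch under stated assumptions. It evaluates the three-term penalty commonly associated with fused sparse group Lasso (an element-wise L1 term, a per-factor group term across time points, and a fused term on consecutive time points), and then applies a simple count-based vote over time points. The function names, the `lam_*` parameters, the `min_votes` threshold, and the toy weights are all hypothetical and are not taken from the paper.

```python
import math

def fsgl_penalty(W, lam_l1, lam_group, lam_fused):
    """Evaluate an FSGL-style penalty for weights W, where W[j][t] is the
    weight of factor j at time point (task) t. Assumed form, not the paper's."""
    # Element-wise sparsity: sum of absolute weights.
    l1 = sum(abs(w) for row in W for w in row)
    # Group sparsity: L2 norm of each factor's weights across all time points.
    group = sum(math.sqrt(sum(w * w for w in row)) for row in W)
    # Temporal smoothness: absolute change between consecutive time points.
    fused = sum(abs(row[t + 1] - row[t]) for row in W for t in range(len(row) - 1))
    return lam_l1 * l1 + lam_group * group + lam_fused * fused

def temporal_vote(W, min_votes, tol=1e-8):
    """Return indices of factors whose weight is non-negligible at no fewer
    than min_votes time points (a hypothetical voting rule)."""
    selected = []
    for j, row in enumerate(W):
        votes = sum(1 for w in row if abs(w) > tol)
        if votes >= min_votes:
            selected.append(j)
    return selected

# Toy weights: 3 hypothetical factors x 4 time points (illustrative values only).
W = [
    [0.5, 0.5, 0.4, 0.4],  # stable factor, active at every time point
    [0.0, 0.0, 0.3, 0.0],  # unstable factor, active at one time point
    [0.0, 0.0, 0.0, 0.0],  # factor dropped at every time point
]
penalty = fsgl_penalty(W, lam_l1=1.0, lam_group=1.0, lam_fused=1.0)
selected = temporal_vote(W, min_votes=3)
```

In this toy case only the first factor survives the vote, which illustrates how counting across time points can filter out a factor whose weight appears at a single time point. A weighted or rank-based vote would be an equally plausible reading of the abstract.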