Paper Title

Benchmarking for Public Health Surveillance tasks on Social Media with a Domain-Specific Pretrained Language Model

Authors

Usman Naseem, Byoung Chan Lee, Matloob Khushi, Jinman Kim, Adam G. Dunn

Abstract

User-generated text on social media enables health workers to keep track of information, identify possible outbreaks, forecast disease trends, monitor emergency cases, and ascertain disease awareness and response to official health correspondence. This exchange of health information on social media has been regarded as an attempt to enhance public health surveillance (PHS). Despite its potential, the technology is still in its early stages and is not ready for widespread application. Advancements in pretrained language models (PLMs) have facilitated the development of several domain-specific PLMs and a variety of downstream applications. However, there are no PLMs for social media tasks involving PHS. We present and release PHS-BERT, a transformer-based PLM, to identify tasks related to public health surveillance on social media. We compared and benchmarked the performance of PHS-BERT on 25 datasets from different social media platforms related to 7 different PHS tasks. Compared with existing PLMs that are mainly evaluated on limited tasks, PHS-BERT achieved state-of-the-art performance on all 25 tested datasets, showing that our PLM is robust and generalizable across common PHS tasks. By making PHS-BERT available, we aim to help the community reduce computational cost and to introduce new baselines for future work across various PHS-related tasks.
