Paper title
How Hate Speech Varies by Target Identity: A Computational Analysis
Paper authors
Paper abstract
This paper investigates how hate speech varies in systematic ways according to the identities it targets. Across multiple hate speech datasets annotated for targeted identities, we find that classifiers trained on hate speech targeting specific identity groups struggle to generalize to other targeted identities. This provides empirical evidence for differences in hate speech by target identity; we then investigate which patterns structure this variation. We find that the targeted demographic category (e.g. gender/sexuality or race/ethnicity) appears to have a greater effect on the language of hate speech than does the relative social power of the targeted identity group. We also find that words associated with hate speech targeting specific identities often relate to stereotypes, histories of oppression, current social movements, and other social contexts specific to identities. These experiments suggest the importance of considering targeted identity, as well as the social contexts associated with these identities, in automated hate speech classification.