Paper title
Accenture at CheckThat! 2020: If you say so: Post-hoc fact-checking of claims using transformer-based models
Paper authors
Paper abstract
We present the strategies used by the Accenture team for Task 1 of the CLEF2020 CheckThat! Lab, in English and Arabic. This shared task evaluated whether a claim in social media text should be professionally fact-checked. To a journalist, a statement that is presented as fact and would interest a large audience requires professional fact-checking before dissemination. We used BERT and RoBERTa models to identify claims in social media text that a professional fact-checker should review, and to rank them in priority order for the fact-checker. For the English challenge, we fine-tuned a RoBERTa model and added an extra mean pooling layer and a dropout layer to improve generalizability to unseen text. For the Arabic task, we fine-tuned Arabic-language BERT models and demonstrated the use of back-translation to amplify the minority class and balance the dataset. The work presented here placed 1st in the English track, and 1st, 2nd, 3rd, and 4th in the Arabic track.
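The three techniques named in the abstract (mean pooling, dropout, and minority-class augmentation through back-translation) can be illustrated with a minimal sketch. This is not the authors' implementation: in the paper the pooling and dropout layers sit on top of a fine-tuned RoBERTa model, whereas the code below works on plain Python lists so that it runs without any model download. The function names, the `minority_label=1` convention, and the `back_translate` callable (a stand-in for a round trip through a machine translation system) are all assumptions made for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def masked_mean_pool(token_vectors: Sequence[Sequence[float]],
                     attention_mask: Sequence[int]) -> List[float]:
    """Average token vectors, ignoring padding positions (mask == 0)."""
    kept = [vec for vec, m in zip(token_vectors, attention_mask) if m]
    if not kept:
        raise ValueError("attention_mask selects no tokens")
    dim = len(kept[0])
    return [sum(vec[i] for vec in kept) / len(kept) for i in range(dim)]


def inverted_dropout(vector: Sequence[float], rate: float,
                     rng: random.Random) -> List[float]:
    """Zero each element with probability `rate` and rescale the survivors.

    Applied at training time only; at inference the vector is used as-is.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    keep = 1.0 - rate
    return [x / keep if rng.random() < keep else 0.0 for x in vector]


def balance_with_back_translation(
        examples: Sequence[Tuple[str, int]],
        back_translate: Callable[[str], str],
        minority_label: int = 1) -> List[Tuple[str, int]]:
    """Append back-translated paraphrases of minority-class examples.

    Assumption: each minority example is paraphrased at most once, so the
    classes end up balanced only if the minority is at least half the size
    of the majority. `back_translate` is a hypothetical callable supplied
    by the caller.
    """
    minority = [text for text, label in examples if label == minority_label]
    majority_count = len(examples) - len(minority)
    needed = max(0, majority_count - len(minority))
    augmented = list(examples)
    for text in minority[:needed]:
        augmented.append((back_translate(text), minority_label))
    return augmented


if __name__ == "__main__":
    # Two real tokens and one padding token, hidden size 2.
    print(masked_mean_pool([[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]], [1, 1, 0]))

    # Toy stand-in for a translation round trip (hypothetical).
    fake_round_trip = lambda s: s + " (paraphrase)"
    data = [("tweet a", 0), ("tweet b", 0), ("tweet c", 1)]
    print(balance_with_back_translation(data, fake_round_trip))
```

Masking before averaging matters because padded positions would otherwise pull the sentence vector toward the padding embedding. In a real pipeline the same idea would be applied to the encoder's final hidden states, with dropout applied to the pooled vector before the classification layer.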