RuNNE-2022 Shared Task: Recognizing Nested Named Entities

Paper title

RuNNE-2022 Shared Task: Recognizing Nested Named Entities

Authors

Ekaterina Artemova, Maxim Zmeev, Natalia Loukachevitch, Igor Rozhkov, Tatiana Batura, Vladimir Ivanov, Elena Tutubalina

Abstract

The RuNNE Shared Task addresses the problem of nested named entity recognition. The annotation schema is designed in such a way that an entity may partially overlap with, or even be nested inside, another entity. For example, the named entity "The Yermolova Theatre" of type "organization" contains another entity, "Yermolova", of type "person". We adopt the Russian NEREL dataset for the RuNNE Shared Task. NEREL comprises news texts written in Russian and collected from the Wikinews portal. The annotation schema includes 29 entity types. The nestedness of named entities in NEREL reaches up to six levels. The RuNNE Shared Task explores two setups. (i) In the general setup, all entities occur with more or less the same frequency. (ii) In the few-shot setup, the majority of entity types occur often in the training set. However, some of the entity types have lower frequency and are thus challenging to recognize. In the test set, the frequency of all entity types is even. This paper reports on the results of the RuNNE Shared Task. Overall, the shared task received 156 submissions from nine teams. Half of the submissions outperform a straightforward BERT-based baseline in both setups. This paper overviews the shared task setup and discusses the submitted systems, uncovering meaningful insights into the problem of nested NER. The links to the evaluation platform and the data from the shared task are available in our GitHub repository: https://github.com/dialogue-evaluation/RuNNE.
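To make the notion of nesting concrete, the sketch below represents the abstract's "The Yermolova Theatre" example as character-offset spans and counts nesting levels. It is only an illustration: the `Span` class, the `contains` and `nesting_depth` helpers, and the lowercase label strings are assumptions introduced here, not the NEREL data format or the shared task's evaluation code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """Hypothetical entity span: character offsets plus a type label."""
    start: int  # inclusive
    end: int    # exclusive
    label: str


def contains(outer: Span, inner: Span) -> bool:
    """True if `inner` lies within `outer` and covers a different range."""
    same_range = (outer.start, outer.end) == (inner.start, inner.end)
    return outer.start <= inner.start and inner.end <= outer.end and not same_range


def nesting_depth(span: Span, spans: list[Span]) -> int:
    """Levels of nesting at and below `span` (1 means no inner entity)."""
    inner = [s for s in spans if contains(span, s)]
    return 1 + max((nesting_depth(s, spans) for s in inner), default=0)


text = "The Yermolova Theatre"
person_start = text.index("Yermolova")

spans = [
    Span(0, len(text), "organization"),
    Span(person_start, person_start + len("Yermolova"), "person"),
]

outer, inner = spans
print(text[inner.start:inner.end], contains(outer, inner), nesting_depth(outer, spans))
```

Under this representation, the "up to six levels" of nestedness mentioned in the abstract would correspond to `nesting_depth` returning 6 for some outermost span.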
