Paper Title
Integrating Boundary Assembling into a DNN Framework for Named Entity Recognition in Chinese Social Media Text
Paper Authors
Paper Abstract
Named entity recognition is a challenging task in natural language processing, especially for informal and noisy social media text. Because Chinese word boundaries are also entity boundaries, named entity recognition for Chinese text can benefit from the word boundary information produced by Chinese word segmentation. Yet Chinese word segmentation poses its own difficulties, because its output depends on several factors, such as the segmentation criteria and the algorithm employed. If handled improperly, segmentation errors can cascade and degrade the quality of the named entity recognition that follows. In this paper, we integrate a boundary assembling method with a state-of-the-art deep neural network model, and incorporate the updated word boundary information into a conditional random field model for named entity recognition. Our method shows a 2% absolute improvement over previous state-of-the-art results.
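As an illustration of how word boundary information can be exposed to a character-level tagger, the following minimal sketch converts a segmented sentence into one boundary tag per character and pairs each character with its tag as a feature. The B/M/E/S tagging scheme, the function names, and the example sentence are assumptions made for illustration; the abstract does not specify the paper's feature encoding, and this sketch does not implement the boundary assembling step itself.

```python
def boundary_tags(words):
    """Return one tag per character: B (begin), M (middle), E (end), S (single).

    `words` is a hypothetical segmenter output: a list of word strings.
    """
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def char_features(words):
    """Pair every character with its boundary tag as a simple feature dict."""
    chars = [ch for word in words for ch in word]
    return [{"char": ch, "seg": tag} for ch, tag in zip(chars, boundary_tags(words))]


# Hypothetical segmentation of "我在北京大学读书" ("I study at Peking University").
words = ["我", "在", "北京", "大学", "读书"]
print(boundary_tags(words))  # ['S', 'S', 'B', 'E', 'B', 'E', 'B', 'E']
```

In this example the entity "北京大学" is split into two words, so its characters carry the tags B, E, B, E rather than B, M, M, E. A sequence labeler that consumes such features sees the entity's outer boundaries but also a spurious internal one, which is one way segmentation choices can propagate into entity recognition.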