Paper Title

Learning Syntactic and Dynamic Selective Encoding for Document Summarization

Authors

Xu, Haiyang, He, Yahao, Han, Kun, Chen, Junwen, Li, Xiangang

Abstract

Text summarization aims to generate a headline or a short summary consisting of the major information of the source text. Recent studies employ the sequence-to-sequence framework to encode the input with a neural network and generate an abstractive summary. However, most studies feed the encoder with semantic word embeddings but ignore the syntactic information of the text. Further, although previous studies proposed the selective gate to control the information flow from the encoder to the decoder, it is static during decoding and cannot differentiate the information based on the decoder states. In this paper, we propose a novel neural architecture for document summarization. Our approach has the following contributions: first, we incorporate syntactic information such as constituency parsing trees into the encoding sequence to learn both the semantic and syntactic information from the document, resulting in more accurate summaries; second, we propose a dynamic gate network to select the salient information based on the context of the decoder state, which is essential to document summarization. The proposed model has been evaluated on the CNN/Daily Mail summarization datasets, and the experimental results show that the proposed approach outperforms baseline approaches.
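
To illustrate the dynamic-gate idea described in the abstract, here is a minimal sketch in PyTorch of a selective gate that is recomputed at every decoding step from the current decoder state, rather than fixed once after encoding. The class and parameter names (DynamicSelectiveGate, W_h, W_s) and tensor shapes are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class DynamicSelectiveGate(nn.Module):
    """Sketch of a dynamic selective gate: at each decoding step the gate is
    recomputed from the current decoder state, so the information passed from
    the encoder to the decoder can change as decoding progresses."""

    def __init__(self, enc_dim: int, dec_dim: int):
        super().__init__()
        self.W_h = nn.Linear(enc_dim, enc_dim, bias=False)  # projects encoder hidden states
        self.W_s = nn.Linear(dec_dim, enc_dim, bias=True)   # projects the decoder state at step t

    def forward(self, enc_hidden: torch.Tensor, dec_state: torch.Tensor) -> torch.Tensor:
        # enc_hidden: (batch, src_len, enc_dim); dec_state: (batch, dec_dim)
        gate = torch.sigmoid(self.W_h(enc_hidden) + self.W_s(dec_state).unsqueeze(1))
        # Element-wise filtering of the encoder representation at this decoding step.
        return gate * enc_hidden
```

A static selective gate, by contrast, would compute the gate once from a fixed sentence vector after encoding and reuse the same filtered representation at every decoding step.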
