Paper Title

On Incorporating Structural Information to improve Dialogue Response Generation

Paper Authors

Nikita Moghe, Priyesh Vijayan, Balaraman Ravindran, Mitesh M. Khapra

Paper Abstract

We consider the task of generating dialogue responses from background knowledge comprising domain-specific resources. Specifically, given a conversation around a movie, the task is to generate the next response based on background knowledge about the movie, such as the plot, reviews, and Reddit comments. This requires capturing structural, sequential, and semantic information from the conversation context and the background resources. This is a new task and has not received much attention from the community. We propose a new architecture that uses the ability of BERT to capture deep contextualized representations in conjunction with explicit structure and sequence information. More specifically, we use (i) Graph Convolutional Networks (GCNs) to capture structural information, (ii) LSTMs to capture sequential information, and (iii) BERT for the deep contextualized representations that capture semantic information. We analyze the proposed architecture extensively. To this end, we propose a plug-and-play Semantics-Sequences-Structures (SSS) framework which allows us to effectively combine such linguistic information. Through a series of experiments, we make some interesting observations. First, we observe that the popular adaptation of the GCN model for NLP tasks, in which structural information (GCNs) is added on top of sequential information (LSTMs), performs poorly on our task. This leads us to explore interesting ways of combining semantic and structural information to improve performance. Second, we observe that while BERT already outperforms other deep contextualized representations such as ELMo, it still benefits from additional structural information explicitly added using GCNs. This is somewhat surprising given recent claims that BERT already captures structural information. Lastly, the proposed SSS framework gives an improvement of 7.95% over the baseline.
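To make the three signals in the abstract concrete, below is a minimal PyTorch sketch of how semantic, sequential, and structural encoders might be wired together. The names GCNLayer and SSSEncoder, the gcn_on_top_of_lstm flag, and the exact wiring (pre-computed BERT embeddings fed to a BiLSTM, with a GCN either stacked on the LSTM states or applied directly to the BERT features) are assumptions for illustration only, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One graph-convolution layer: each node aggregates its neighbours'
    features through a row-normalised adjacency matrix, followed by a
    linear projection and ReLU."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim)

    def forward(self, x, adj):
        # x: (batch, num_nodes, in_dim); adj: (batch, num_nodes, num_nodes)
        return torch.relu(self.proj(torch.bmm(adj, x)))

class SSSEncoder(nn.Module):
    """Hypothetical plug-and-play combination of the three signals:
    pre-computed BERT embeddings (semantics), a BiLSTM (sequences), and a
    GCN over token-level edges such as dependency arcs (structures)."""
    def __init__(self, bert_dim=768, hidden=256, gcn_on_top_of_lstm=True):
        super().__init__()
        self.gcn_on_top_of_lstm = gcn_on_top_of_lstm
        self.lstm = nn.LSTM(bert_dim, hidden, batch_first=True,
                            bidirectional=True)
        self.gcn_seq = GCNLayer(2 * hidden, hidden)   # GCN over LSTM states
        self.gcn_sem = GCNLayer(bert_dim, hidden)     # GCN over BERT features

    def forward(self, bert_emb, adj):
        if self.gcn_on_top_of_lstm:
            # The "popular adaptation" the abstract refers to:
            # structural information stacked on sequential information.
            seq_states, _ = self.lstm(bert_emb)
            return self.gcn_seq(seq_states, adj)
        # An alternative wiring: structure applied directly to semantics.
        return self.gcn_sem(bert_emb, adj)
```

In this sketch, adj would be a row-normalised (batch, seq_len, seq_len) adjacency matrix with one node per token, built from, e.g., dependency parses of the background text; toggling gcn_on_top_of_lstm illustrates the kind of alternative combinations the abstract says the authors explore after the stacked variant performs poorly.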
