IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Paper Title

IMoJIE: Iterative Memory-Based Joint Open Information Extraction

Authors

Keshav Kolluru, Samarth Aggarwal, Vipul Rathore, Mausam, Soumen Chakrabarti

Abstract

While traditional systems for Open Information Extraction were statistical and rule-based, recently neural models have been introduced for the task. Our work builds upon CopyAttention, a sequence generation OpenIE model (Cui et al., 2018). Our analysis reveals that CopyAttention produces a constant number of extractions per sentence, and its extracted tuples often express redundant information. We present IMoJIE, an extension to CopyAttention, which produces the next extraction conditioned on all previously extracted tuples. This approach overcomes both shortcomings of CopyAttention, resulting in a variable number of diverse extractions per sentence. We train IMoJIE on training data bootstrapped from extractions of several non-neural systems, which have been automatically filtered to reduce redundancy and noise. IMoJIE outperforms CopyAttention by about 18 F1 pts, and a BERT-based strong baseline by 2 F1 pts, establishing a new state of the art for the task.
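The idea of producing each extraction conditioned on all previous ones can be illustrated with a toy loop. This is only a minimal sketch under stated assumptions, not the authors' implementation: `iterative_extract`, `toy_generator`, the `[SEP]` separator, the `EndOfExtractions` stop marker, and the `max_steps` cap are all hypothetical names and choices introduced here, and the neural generator is replaced by a canned stub.

```python
from typing import Callable, List

# Hypothetical stop marker: the generator emits it when nothing is left to extract.
END = "EndOfExtractions"


def iterative_extract(sentence: str,
                      generate_next: Callable[[str], str],
                      max_steps: int = 10) -> List[str]:
    """Collect extractions one at a time, feeding earlier ones back as memory."""
    extractions: List[str] = []
    memory = sentence  # the memory starts as the input sentence alone
    for _ in range(max_steps):  # safety cap (an assumption of this sketch)
        nxt = generate_next(memory)
        if nxt == END:
            break
        extractions.append(nxt)
        # Append the new tuple so the next step is conditioned on it.
        memory = memory + " [SEP] " + nxt
    return extractions


def toy_generator(memory: str) -> str:
    """Stand-in for a trained model: replays canned tuples based on memory size."""
    canned = ["(Obama; was born in; Hawaii)", "(Obama; was; president)"]
    produced = memory.count("[SEP]")  # number of tuples already in memory
    return canned[produced] if produced < len(canned) else END


if __name__ == "__main__":
    sent = "Obama, who was born in Hawaii, was president."
    for tup in iterative_extract(sent, toy_generator):
        print(tup)
```

Because the loop stops whenever the generator emits the stop marker, the number of extractions varies per sentence rather than being fixed in advance, which is the behavior the abstract contrasts with CopyAttention.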
