Paper Title

Self-Training for Unsupervised Parsing with PRPN

Paper Authors

Anhad Mohananey, Katharina Kann, Samuel R. Bowman

Paper Abstract

Neural unsupervised parsing (UP) models learn to parse without access to syntactic annotations, while being optimized for another task like language modeling. In this work, we propose self-training for neural UP models: we leverage aggregated annotations predicted by copies of our model as supervision for future copies. To be able to use our model's predictions during training, we extend a recent neural UP architecture, the PRPN (Shen et al., 2018a) such that it can be trained in a semi-supervised fashion. We then add examples with parses predicted by our model to our unlabeled UP training data. Our self-trained model outperforms the PRPN by 8.1% F1 and the previous state of the art by 1.6% F1. In addition, we show that our architecture can also be helpful for semi-supervised parsing in ultra-low-resource settings.
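
The abstract describes a self-training loop: copies of the (semi-supervised) PRPN predict parses, those predictions are aggregated into pseudo-labels, and a later copy is trained on the unlabeled data plus the pseudo-labeled parses. The Python sketch below illustrates that loop under stated assumptions; the callables `train_model`, `predict_parses`, and `aggregate`, as well as the copy and round counts, are hypothetical placeholders and not the authors' released code.

```python
from typing import Callable, List, Tuple

Sentence = List[str]
Parse = str  # e.g. a bracketed tree string


def self_train(
    unlabeled: List[Sentence],
    train_model: Callable[[List[Sentence], List[Tuple[Sentence, Parse]]], object],
    predict_parses: Callable[[object, List[Sentence]], List[Parse]],
    aggregate: Callable[[List[Sentence], List[List[Parse]]], List[Tuple[Sentence, Parse]]],
    num_copies: int = 4,
    num_rounds: int = 2,
) -> object:
    """Bootstrap parse supervision from the model's own aggregated predictions."""
    pseudo_labeled: List[Tuple[Sentence, Parse]] = []
    for _ in range(num_rounds):
        # Train several copies of the semi-supervised PRPN on the unlabeled
        # corpus plus whatever pseudo-labeled parses exist so far.
        copies = [train_model(unlabeled, pseudo_labeled) for _ in range(num_copies)]
        # Aggregate the parses predicted by the copies into new supervision
        # (e.g. keep sentences on which the copies agree).
        predictions = [predict_parses(model, unlabeled) for model in copies]
        pseudo_labeled = aggregate(unlabeled, predictions)
    # The final copy is trained with the aggregated self-labeled parses
    # added to the unlabeled training data.
    return train_model(unlabeled, pseudo_labeled)
```

Passing the training, prediction, and aggregation steps in as callables keeps the sketch independent of any particular PRPN implementation or aggregation rule.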
