Paper Title

Measuring and Reducing Model Update Regression in Structured Prediction for NLP

Authors

Deng Cai, Elman Mansimov, Yi-An Lai, Yixuan Su, Lei Shu, Yi Zhang

Abstract

Recent advances in deep learning have led to the rapid adoption of machine learning-based NLP models in a wide range of applications. Despite the continuous gains in accuracy, backward compatibility is also an important aspect of industrial applications, yet it has received little research attention. Backward compatibility requires that the new model does not regress on cases that were correctly handled by its predecessor. This work studies model update regression in structured prediction tasks. We choose syntactic dependency parsing and conversational semantic parsing as representative examples of structured prediction tasks in NLP. First, we measure and analyze model update regression in different model update settings. Next, we explore and benchmark existing techniques for reducing model update regression, including model ensemble and knowledge distillation. We further propose a simple and effective method, Backward-Congruent Re-ranking (BCR), which takes into account the characteristics of structured prediction. Experiments show that BCR mitigates model update regression better than model ensemble and knowledge distillation approaches.
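To make the two core ideas in the abstract concrete, here is a minimal sketch (not the paper's implementation; all names and the scoring interface are illustrative). The first function computes the regression rate, i.e. the fraction of examples the old model handled correctly but the new model breaks; the second implements the essence of backward-congruent re-ranking: from the new model's n-best candidates, select the one the old model scores highest.

```python
def negative_flip_rate(old_preds, new_preds, gold):
    """Fraction of examples where the old model was correct but the new model regresses."""
    flips = sum(
        1 for o, n, g in zip(old_preds, new_preds, gold)
        if o == g and n != g
    )
    return flips / len(gold)


def backward_congruent_rerank(candidates, old_model_score):
    """Pick, from the new model's n-best list, the candidate the old model scores highest.

    `old_model_score` is any callable mapping a candidate structure to a scalar
    score under the old model (e.g. log-probability of the parse).
    """
    return max(candidates, key=old_model_score)


# Toy illustration: the old model solved examples 0 and 2; the update breaks example 2.
old_preds = ["NP", "VP", "PP"]
new_preds = ["NP", "VP", "ADJP"]
gold      = ["NP", "NN", "PP"]
nfr = negative_flip_rate(old_preds, new_preds, gold)  # 1 flip out of 3 examples
```

The key design point BCR exploits is that structured prediction models naturally produce n-best lists, so the old model can act as a re-ranker without being ensembled or distilled into the new one.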
