Paper Title

Understanding Learning Dynamics for Neural Machine Translation

Authors

Conghui Zhu, Guanlin Li, Lemao Liu, Tiejun Zhao, Shuming Shi

Abstract

Despite the great success of NMT, there remains a severe challenge: it is hard to interpret the internal dynamics during its training process. In this paper, we propose to understand the learning dynamics of NMT by using a recently proposed technique named Loss Change Allocation (LCA)~\citep{lan-2019-loss-change-allocation}. As LCA requires calculating the gradient on an entire dataset for each update, we instead present an approximation to make it practical in the NMT scenario. Our simulated experiment shows that this approximate calculation is efficient and empirically delivers results consistent with the brute-force implementation. In particular, extensive experiments on two standard translation benchmark datasets reveal some valuable findings.
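
As a rough illustration of why exact LCA is expensive, the first-order decomposition below (written in our own notation, not quoted from the paper) allocates each update's change in training loss to individual parameters using the gradient of the full-dataset loss:

\[
\mathcal{L}(\theta_{t+1}; D) - \mathcal{L}(\theta_t; D) \;\approx\; \sum_{i} \nabla_{\theta_i}\mathcal{L}(\theta_t; D)\,\bigl(\theta_{t+1, i} - \theta_{t, i}\bigr),
\]

where $D$ denotes the entire training set and each summand is the loss change allocated to parameter $i$ at update $t$. Because $\nabla\mathcal{L}(\theta_t; D)$ must be evaluated on all of $D$ at every optimizer step, exact LCA is costly for large NMT corpora, which is what the paper's approximation is meant to address.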
