Paper Title
Anticipation-Free Training for Simultaneous Machine Translation
Paper Authors
Paper Abstract
Simultaneous machine translation (SimulMT) speeds up the translation process by starting to translate before the source sentence is completely available. This is difficult due to limited context and word-order differences between languages. Existing methods increase latency or introduce adaptive read-write policies for SimulMT models to handle local reordering and improve translation quality. However, long-distance reordering can cause SimulMT models to learn translation incorrectly: the model may be forced to predict target tokens whose corresponding source tokens have not yet been read. This leads to aggressive anticipation during inference, which manifests as hallucination. To mitigate this problem, we propose a new framework that decomposes the translation process into a monotonic translation step and a reordering step, and we model the latter with an auxiliary sorting network (ASN). The ASN rearranges the hidden states to match the order of the target language, so that the SimulMT model can learn to translate more reasonably. The entire model is optimized end to end and does not rely on external aligners or data. During inference, the ASN is removed to enable streaming. Experiments show that the proposed framework can outperform previous methods with lower latency.
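To make the ASN idea concrete, below is a minimal sketch of one plausible realization: a training-only module that predicts a soft permutation over hidden states via iterative row/column normalization (a Sinkhorn-style relaxation) and applies it to reorder them. The class name, scoring function, and iteration count are illustrative assumptions, not the authors' released implementation.

```python
# Illustrative sketch only: reordering hidden states with a soft permutation.
# All names and hyperparameters here are assumptions for exposition.
import torch
import torch.nn as nn


def sinkhorn(logits: torch.Tensor, n_iters: int = 8) -> torch.Tensor:
    """Approximate a doubly stochastic (soft permutation) matrix by
    alternately normalizing rows and columns in log space."""
    for _ in range(n_iters):
        logits = logits - logits.logsumexp(dim=-1, keepdim=True)  # rows
        logits = logits - logits.logsumexp(dim=-2, keepdim=True)  # columns
    return logits.exp()


class AuxiliarySortingNetwork(nn.Module):
    """Training-only module that rearranges hidden states toward
    target-language order, so the translation model can stay monotonic."""

    def __init__(self, d_model: int):
        super().__init__()
        self.score = nn.Linear(d_model, d_model)

    def forward(self, hidden: torch.Tensor) -> torch.Tensor:
        # hidden: (batch, length, d_model), in monotonic (source) order.
        # Pairwise position scores define a soft permutation matrix.
        logits = self.score(hidden) @ hidden.transpose(-1, -2)
        perm = sinkhorn(logits)        # (batch, length, length)
        return perm @ hidden           # hidden states, softly reordered


if __name__ == "__main__":
    asn = AuxiliarySortingNetwork(d_model=16)
    states = torch.randn(2, 5, 16)
    reordered = asn(states)            # applied only during training
    print(reordered.shape)             # torch.Size([2, 5, 16])
```

Because the soft permutation is differentiable, such a module can be trained end to end with the translation model and, as the abstract states, simply dropped at inference so that decoding remains streaming.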