Paper Title
Noisy Self-Knowledge Distillation for Text Summarization
Paper Authors
Paper Abstract
In this paper we apply self-knowledge distillation to text summarization, which we argue can alleviate problems with maximum-likelihood training on single-reference and noisy datasets. Instead of relying on one-hot annotation labels, our student summarization model is trained with guidance from a teacher that generates smoothed labels to help regularize training. Furthermore, to better model uncertainty during training, we introduce multiple noise signals for both teacher and student models. We demonstrate experimentally on three benchmarks that our framework boosts the performance of both pretrained and non-pretrained summarizers, achieving state-of-the-art results.
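The training objective described in the abstract can be pictured as interpolating the usual maximum-likelihood loss on the single reference with a term that pulls the student toward the teacher's smoothed output distribution, with noise injected into the inputs. The following is a minimal sketch, assuming a PyTorch implementation; the word-dropout noise, the mixing weight `alpha`, and the temperature `tau` are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch of a noisy self-knowledge-distillation loss (assumed PyTorch).
# The noise type, alpha, and tau below are illustrative, not the paper's settings.
import torch
import torch.nn.functional as F

def word_dropout(input_ids: torch.Tensor, unk_id: int, p: float = 0.1) -> torch.Tensor:
    """One possible noise signal: randomly replace input tokens with <unk>."""
    mask = torch.rand_like(input_ids, dtype=torch.float) < p
    return torch.where(mask, torch.full_like(input_ids, unk_id), input_ids)

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      gold_ids: torch.Tensor,
                      alpha: float = 0.5,
                      tau: float = 1.0) -> torch.Tensor:
    """Interpolate NLL on the single reference with a KL term toward the
    teacher's smoothed (temperature-softened) distribution."""
    vocab = student_logits.size(-1)
    # Hard-label term: maximum-likelihood training on the one-hot reference.
    nll = F.cross_entropy(student_logits.view(-1, vocab), gold_ids.view(-1))
    # Soft-label term: KL divergence from the teacher's smoothed labels.
    log_p_student = F.log_softmax(student_logits / tau, dim=-1)
    p_teacher = F.softmax(teacher_logits.detach() / tau, dim=-1)
    kl = F.kl_div(log_p_student, p_teacher, reduction="batchmean") * tau ** 2
    return (1.0 - alpha) * nll + alpha * kl
```

In this sketch the teacher is a copy of the same summarization model (self-distillation), and `word_dropout` stands in for the multiple noise signals applied to teacher and student inputs during training.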