Paper Title
Once is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling
Paper Authors
Paper Abstract
Transformer-based models have achieved great success on sentence pair modeling tasks, such as answer selection and natural language inference (NLI). These models generally perform cross-attention over input pairs, leading to prohibitive computational costs. Recent studies propose dual-encoder and late-interaction architectures for faster computation. However, the balance between the expressiveness of cross-attention and computational speedup still needs better coordination. To this end, this paper introduces a novel paradigm, MixEncoder, for efficient sentence pair modeling. MixEncoder involves a light-weight cross-attention mechanism. It encodes the query only once while modeling the query-candidate interactions in parallel. Extensive experiments conducted on four tasks demonstrate that MixEncoder can speed up sentence pair modeling by over 113x while achieving performance comparable to more expensive cross-attention models.
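To make the encode-once idea concrete, below is a minimal PyTorch sketch of the kind of light-weight cross-attention the abstract describes: the query is encoded a single time, candidate representations are pre-computed offline, and one batched attention call models the query-candidate interaction for all candidates in parallel. All names, dimensions, and the pooling/scoring choices here are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class LightCrossAttention(nn.Module):
    """Hypothetical sketch: query hidden states (computed once) attend to
    cached candidate embeddings, batched over all candidates at once.
    This is an assumed simplification of the abstract's description,
    not the authors' exact MixEncoder design."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)

    def forward(self, query_states: torch.Tensor, cand_embs: torch.Tensor) -> torch.Tensor:
        # query_states: (1, Lq, dim) -- from a single encoder pass over the query
        # cand_embs:    (N, k, dim)  -- k cached context vectors per candidate
        n = cand_embs.size(0)
        # Broadcast the one query encoding across all N candidates, so the
        # interaction is modeled in a single batched attention call.
        q = query_states.expand(n, -1, -1)
        out, _ = self.attn(q, cand_embs, cand_embs)   # (N, Lq, dim)
        return out.mean(dim=1)                        # one vector per candidate

# Usage: score N candidates against one query without re-encoding the query.
dim, Lq, N, k = 256, 32, 1000, 4
query_states = torch.randn(1, Lq, dim)  # encoded once online
cand_embs = torch.randn(N, k, dim)      # pre-computed offline
interactions = LightCrossAttention(dim)(query_states, cand_embs)  # (N, dim)
```

Because the expensive query encoding happens once and the cross-attention operates on only k cached vectors per candidate, the per-candidate cost is a single cheap attention step rather than a full joint encoder pass, which is the source of the speedup the abstract reports.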