Paper Title

Revisiting Distance Metric Learning for Few-Shot Natural Language Classification

Authors

Sosnowski, Witold, Wróblewska, Anna, Seweryn, Karolina, Gawrysiak, Piotr

Abstract

Distance Metric Learning (DML) has attracted much attention in image processing in recent years. This paper analyzes its impact on supervised fine-tuning of language models for Natural Language Processing (NLP) classification tasks under few-shot learning settings. We investigated several DML loss functions in training RoBERTa language models on known SentEval Transfer Tasks datasets. We also analyzed the possibility of using proxy-based DML losses during model inference. Our systematic experiments show that under few-shot learning settings, proxy-based DML losses in particular can positively affect the fine-tuning and inference of a supervised language model. Models tuned with a combination of CCE (categorical cross-entropy loss) and ProxyAnchor loss achieve the best performance on average, outperforming models trained with CCE alone by about 3.27 percentage points, and by up to 10.38 percentage points depending on the training dataset.
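
The joint objective described in the abstract, categorical cross-entropy plus a proxy-based DML term, can be sketched as follows. This is a minimal PyTorch sketch, not the authors' implementation: the ProxyAnchor formulation follows Kim et al. (2020), while the hyperparameters (`alpha`, `delta`, `dml_weight`) and the `logits`/`embeddings` interface are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ProxyAnchorLoss(nn.Module):
    """Proxy Anchor loss (Kim et al., 2020): one learnable proxy per class."""

    def __init__(self, num_classes: int, embed_dim: int,
                 alpha: float = 32.0, delta: float = 0.1):
        super().__init__()
        self.proxies = nn.Parameter(torch.randn(num_classes, embed_dim))
        self.alpha = alpha
        self.delta = delta

    def forward(self, embeddings: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between each embedding and each class proxy: (B, C).
        sim = F.normalize(embeddings, dim=1) @ F.normalize(self.proxies, dim=1).T
        pos_mask = F.one_hot(labels, num_classes=self.proxies.size(0)).bool()

        pos = torch.exp(-self.alpha * (sim - self.delta))  # pulls positives toward their proxy
        neg = torch.exp(self.alpha * (sim + self.delta))   # pushes negatives away from proxies

        # Positive term is averaged only over proxies with positives in the batch.
        has_pos = pos_mask.any(dim=0)
        pos_term = torch.log1p((pos * pos_mask).sum(dim=0))[has_pos].mean()
        neg_term = torch.log1p((neg * ~pos_mask).sum(dim=0)).mean()
        return pos_term + neg_term


def combined_loss(logits, embeddings, labels, proxy_anchor, dml_weight=0.5):
    # Joint objective: CCE on the classifier logits plus a weighted
    # ProxyAnchor term on the pooled encoder embeddings.
    return F.cross_entropy(logits, labels) + dml_weight * proxy_anchor(embeddings, labels)
```

In this sketch, `logits` and `embeddings` stand for the classification-head output and the pooled encoder representation of a fine-tuned RoBERTa model; the class proxies are trained jointly with the encoder, and the same proxies can serve as class anchors (nearest-proxy prediction) at inference time.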
