Paper Title
A SentiWordNet Strategy for Curriculum Learning in Sentiment Analysis
Authors
Abstract
Curriculum Learning (CL) is the idea that training on samples ordered from easy to difficult improves performance over a random ordering. The idea parallels theories in cognitive science of how humans learn: a difficult task becomes easier when it is framed as a sequence of tasks of increasing difficulty. CL has gained considerable traction in machine learning and image processing, and more recently in Natural Language Processing (NLP). In this paper, we apply curriculum learning, driven by SentiWordNet, to sentiment analysis, where the aim is to extract the sentiment or polarity of a given text segment. SentiWordNet is a lexical resource with sentiment polarity annotations. We demonstrate the effectiveness of the proposed strategy by comparing its performance with other curriculum strategies and with no curriculum. Convolutional, recurrent, and attention-based architectures are used to assess the improvement. The models are evaluated on a standard sentiment dataset, the Stanford Sentiment Treebank.
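The easy-to-difficult ordering described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how the paper scores difficulty, so everything here is an assumption made for illustration: `TOY_LEXICON` is a hypothetical stand-in for SentiWordNet with made-up scores, and `difficulty` assumes a sample is "easy" when its lexicon polarity agrees with its gold label.

```python
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> (positive score, negative score).
# It stands in for SentiWordNet-style annotations; the values are made up.
TOY_LEXICON: Dict[str, Tuple[float, float]] = {
    "great": (0.75, 0.0),
    "good": (0.625, 0.0),
    "bad": (0.0, 0.625),
    "awful": (0.0, 0.875),
}

Sample = Tuple[str, int]  # (text, gold label in {-1, +1})


def lexicon_polarity(text: str) -> float:
    """Mean (positive - negative) score over words found in the lexicon."""
    scores = [
        TOY_LEXICON[word][0] - TOY_LEXICON[word][1]
        for word in text.lower().split()
        if word in TOY_LEXICON
    ]
    # Texts with no lexicon words are treated as neutral.
    return sum(scores) / len(scores) if scores else 0.0


def difficulty(text: str, label: int) -> float:
    """Assumed difficulty: gap between the lexicon polarity and the gold label."""
    return abs(label - lexicon_polarity(text))


def curriculum_order(samples: List[Sample]) -> List[Sample]:
    """Sort samples from easy (small gap) to difficult (large gap)."""
    return sorted(samples, key=lambda sample: difficulty(*sample))


samples: List[Sample] = [
    ("not bad at all", 1),   # lexicon disagrees with the label: hard
    ("great movie", 1),      # lexicon agrees with the label: easy
    ("a film", 1),           # no lexicon evidence: in between
    ("awful plot", -1),      # lexicon agrees with the label: easy
]
ordered = curriculum_order(samples)
```

A training loop following this sketch would present `ordered` to the model front to back, or in stages of growing size, instead of shuffling the training set.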