Tessore, Juan Pablo; Esnaola, Leonardo Martín; Ramón, Hugo Dionisio; Lanzarini, Laura; Baldassarri, Sandra
Abstract:
Basic emotion classification is one of the main tasks of Sentiment Analysis, usually performed by using several machine learning techniques. One of the main issues in Sentiment Analysis is the availability of tagged resources to properly train supervised classification algorithms. This is of particular concern in languages other than English, such as Spanish, where scarcity of these resources is the norm. In addition, most basic emotion datasets available in Spanish are rather small, containing a few hundred (or thousand) samples. Usually, the samples only contain a short text (frequently a comment) and a tag (the basic emotion), omitting crucial contextual information that may help to improve the classification task results. In this paper, the impact of using contextual information is measured on a recently published Spanish basic emotion dataset and the baseline architecture proposed in the Semantic Evaluation 2019 competition. This particular dataset has two main advantages for this paper. First, it was compiled using Distant Supervision and as a result it contains several hundred thousand samples. Secondly, the authors included valuable contextual information for each comment. The results show that contextual information, such as news headlines or summaries, helps improve the classification accuracy over a dataset of distantly supervised basic emotion labelled comments.