11 research outputs found

    UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish

    The present study describes our submission to SemEval-2018 Task 1: Affect in Tweets. Our Spanish-only approach aimed to demonstrate that it is beneficial to automatically generate additional training data by (i) translating training data from other languages and (ii) applying a semi-supervised learning method. We find strong support for both approaches, with those models outperforming our regular models in all subtasks. However, creating a stepwise ensemble of different models, as opposed to simply averaging, did not result in an increase in performance. We placed second (EI-Reg), second (EI-Oc), fourth (V-Reg) and fifth (V-Oc) in the four Spanish subtasks we participated in. Comment: Accepted at SemEval 2018.
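    The two data-augmentation ideas described above can be illustrated with a short sketch. This is a minimal, assumed reconstruction in scikit-learn style, not the authors' actual pipeline: the Ridge regressor, the character n-gram features and the self-training loop are illustrative placeholders.

    # Minimal sketch of (i) adding machine-translated training data and
    # (ii) self-training on unlabeled tweets, plus the plain prediction
    # averaging that the stepwise ensemble was compared against.
    # All model choices here are assumptions, not the authors' setup.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    def augment_with_translations(texts, scores, translated_texts, translated_scores):
        # (i) Append training examples translated from other languages.
        return texts + translated_texts, scores + translated_scores

    def self_train(texts, scores, unlabeled_texts, rounds=1):
        # (ii) Label unlabeled tweets with the current model, then retrain on them.
        model = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)), Ridge())
        for _ in range(rounds):
            model.fit(texts, scores)
            pseudo_labels = model.predict(unlabeled_texts)
            texts, scores = texts + unlabeled_texts, scores + list(pseudo_labels)
        return model

    def average_predictions(models, test_texts):
        # Baseline ensemble: plain averaging of each model's predictions.
        return np.mean([m.predict(test_texts) for m in models], axis=0)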

    SemEval-2018 Task 1: Affect in Tweets

    We present SemEval-2018 Task 1: Affect in Tweets, which includes an array of subtasks on inferring the affectual state of a person from their tweet. For each task, we created labeled data from English, Arabic, and Spanish tweets. The individual tasks are: 1. emotion intensity regression, 2. emotion intensity ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. emotion classification. Seventy-five teams (about 200 team members) participated in the shared task. We summarize the methods, resources, and tools used by the participating teams, with a focus on the techniques and resources that are particularly useful. We also analyze systems for consistent bias towards a particular race or gender. The data is made freely available to further improve our understanding of how people convey emotions through language.
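    As a concrete illustration of how systems in the regression-style subtasks (emotion intensity and valence regression) are compared, the sketch below scores predicted intensities against gold intensities with the Pearson correlation. The function name and the toy values are made up for illustration; this is not the official evaluation script.

    # Hedged sketch: rank a system on an intensity/valence regression subtask by
    # the Pearson correlation between its predictions and the gold scores.
    from scipy.stats import pearsonr

    def score_regression(gold_scores, predicted_scores):
        # Higher correlation means a better position on the subtask leaderboard.
        r, _ = pearsonr(gold_scores, predicted_scores)
        return r

    # Toy usage with made-up intensity scores in [0, 1]:
    print(score_regression([0.2, 0.8, 0.5, 0.9], [0.25, 0.7, 0.55, 0.8]))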

    Analysis of the Contribution of Phonemes to the Prediction of Emotional Valence in Tweets in Spanish and English (original title: Análisis de la contribución de los fonemas a la predicción de la valencia emocional en tweets en español e inglés)

    Although it has traditionally been assumed that the sound of words and their meaning are related arbitrarily, a range of empirical findings supports the hypothesis that the basic phonological units of language bear a systematic relationship to semantic aspects, including the affective and attitudinal connotation of words (Adelman, Estes, & Cossu, 2018; Aryani, Conrad, Schmidtke, & Jacobs, 2018; Dingemanse, Blasi, Lupyan, Christiansen, & Monaghan, 2015; Monaghan, Shillcock, Christiansen, & Kirby, 2014; Schmidtke, Conrad, & Jacobs, 2014). Building on these premises, this study sought to determine whether the phonological units of Spanish and English contribute to the prediction of emotional valence in a corpus of tweets. To this end, a set of multiple linear regression models was trained, and their performance was evaluated using the correlation and error metrics computed from the predicted and observed valences on the test datasets provided by the SemEval-2018 shared task (Mohammad, Bravo-Márquez, Salameh, & Kiritchenko, 2018). Adding the phonological features to a set of lexical predictors (a bag-of-words representation of the tweets, normalized with TF-IDF) was found to have a small but consistent effect on the global fit metrics, and in both languages it allows observed valences close to the mid-range, as well as the lower valences associated with negative affective content, to be discriminated more precisely.
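    The modelling setup described above can be sketched briefly: a TF-IDF bag of words, optionally concatenated with a block of phoneme-level features, fed into a multiple linear regression and scored with Pearson correlation and mean absolute error. The phoneme_counts helper below is a hypothetical stand-in; the thesis' actual phonological encoding is not reproduced here.

    # Sketch, under the assumptions stated above: lexical TF-IDF features plus a
    # simple phoneme-count block, multiple linear regression, and
    # correlation/error evaluation on a held-out test set.
    from scipy.sparse import csr_matrix, hstack
    from scipy.stats import pearsonr
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_absolute_error

    def phoneme_counts(texts, inventory):
        # Illustrative phonological features: raw counts of each symbol in `inventory`.
        return csr_matrix([[t.lower().count(p) for p in inventory] for t in texts])

    def fit_and_evaluate(train_texts, y_train, test_texts, y_test, inventory=None):
        tfidf = TfidfVectorizer()  # bag of words normalized with TF-IDF
        X_train, X_test = tfidf.fit_transform(train_texts), tfidf.transform(test_texts)
        if inventory:  # add the phonological block alongside the lexical one
            X_train = hstack([X_train, phoneme_counts(train_texts, inventory)])
            X_test = hstack([X_test, phoneme_counts(test_texts, inventory)])
        model = LinearRegression().fit(X_train, y_train)
        predictions = model.predict(X_test)
        return pearsonr(y_test, predictions)[0], mean_absolute_error(y_test, predictions)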

    Character-based Neural Semantic Parsing

    Humans and computers do not speak the same language. A lot of day-to-day tasks would be vastly more efficient if we could communicate with computers using natural language instead of relying on an interface. It is necessary, then, that the computer does not see a sentence as a collection of individual words, but instead understands the deeper, compositional meaning of the sentence. One way to tackle this problem is to automatically assign each sentence a formal, structured meaning representation, which is easy for computers to interpret. There have been quite a few attempts at this before, but those approaches were usually heavily reliant on predefined rules, word lists or representations of the syntax of the text, which made them complicated to use in general settings. In this thesis we employ an algorithm that can learn to automatically assign meaning representations to texts without using any such external resources. Specifically, we use a type of artificial neural network called a sequence-to-sequence model, in a process often referred to as deep learning. The devil is in the details, but we find that this type of algorithm can produce high-quality meaning representations, with better performance than the more traditional methods. Moreover, a main finding of the thesis is that, counterintuitively, it is often better to represent the text as a sequence of individual characters rather than words. This is likely because characters help the model deal with spelling errors, unknown words and inflections.
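    The character-level input representation that this finding points to can be sketched as follows. This is an assumed, minimal illustration in PyTorch, not the thesis' actual architecture: the sentence is mapped to a sequence of character ids rather than word ids and then fed to an ordinary sequence-to-sequence encoder.

    # Minimal sketch: characters (not words) as the input units of a
    # sequence-to-sequence encoder. The tiny GRU encoder is illustrative only.
    import torch
    import torch.nn as nn

    def char_encode(sentence, char2id):
        # Unknown characters get new ids on the fly; misspelled or inflected
        # forms still share most of their characters with known forms.
        return torch.tensor([char2id.setdefault(c, len(char2id)) for c in sentence])

    class CharEncoder(nn.Module):
        def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

        def forward(self, char_ids):               # char_ids: (batch, seq_len)
            _, hidden = self.rnn(self.embed(char_ids))
            return hidden                           # summary passed to the decoder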