11 research outputs found

    UG18 at SemEval-2018 Task 1: Generating Additional Training Data for Predicting Emotion Intensity in Spanish

    The present study describes our submission to SemEval-2018 Task 1: Affect in Tweets. Our Spanish-only approach aimed to demonstrate that it is beneficial to automatically generate additional training data by (i) translating training data from other languages and (ii) applying a semi-supervised learning method. We find strong support for both approaches, with those models outperforming our regular models in all subtasks. However, creating a stepwise ensemble of different models, as opposed to simply averaging, did not result in an increase in performance. We placed second (EI-Reg), second (EI-Oc), fourth (V-Reg) and fifth (V-Oc) in the four Spanish subtasks we participated in. Comment: Accepted at SemEval 2018.
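    The two data-augmentation ideas described above can be illustrated with a short sketch. This is a minimal, assumed reconstruction in scikit-learn style, not the authors' actual pipeline: the Ridge regressor, the character n-gram features and the self-training loop are illustrative placeholders.

    # Minimal sketch of (i) adding machine-translated training data and
    # (ii) self-training on unlabeled tweets, plus the plain prediction
    # averaging that the stepwise ensemble was compared against.
    # All model choices here are assumptions, not the authors' setup.
    import numpy as np
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import Ridge
    from sklearn.pipeline import make_pipeline

    def augment_with_translations(texts, scores, translated_texts, translated_scores):
        # (i) Append training examples translated from other languages.
        return texts + translated_texts, scores + translated_scores

    def self_train(texts, scores, unlabeled_texts, rounds=1):
        # (ii) Label unlabeled tweets with the current model, then retrain on them.
        model = make_pipeline(TfidfVectorizer(analyzer="char_wb", ngram_range=(1, 4)), Ridge())
        for _ in range(rounds):
            model.fit(texts, scores)
            pseudo_labels = model.predict(unlabeled_texts)
            texts, scores = texts + unlabeled_texts, scores + list(pseudo_labels)
        return model

    def average_predictions(models, test_texts):
        # Baseline ensemble: plain averaging of each model's predictions.
        return np.mean([m.predict(test_texts) for m in models], axis=0)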

    SemEval-2018 Task 1: Affect in Tweets

    We present SemEval-2018 Task 1: Affect in Tweets, which includes an array of subtasks on inferring the affectual state of a person from their tweet. For each task, we created labeled data from English, Arabic, and Spanish tweets. The individual tasks are: 1. emotion intensity regression, 2. emotion intensity ordinal classification, 3. valence (sentiment) regression, 4. valence ordinal classification, and 5. emotion classification. Seventy-five teams (about 200 team members) participated in the shared task. We summarize the methods, resources, and tools used by the participating teams, with a focus on the techniques and resources that are particularly useful. We also analyze systems for consistent bias towards a particular race or gender. The data is made freely available to further improve our understanding of how people convey emotions through language.
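    As a concrete illustration of how systems in the regression-style subtasks (emotion intensity and valence regression) are compared, the sketch below scores predicted intensities against gold intensities with the Pearson correlation. The function name and the toy values are made up for illustration; this is not the official evaluation script.

    # Hedged sketch: rank a system on an intensity/valence regression subtask by
    # the Pearson correlation between its predictions and the gold scores.
    from scipy.stats import pearsonr

    def score_regression(gold_scores, predicted_scores):
        # Higher correlation means a better position on the subtask leaderboard.
        r, _ = pearsonr(gold_scores, predicted_scores)
        return r

    # Toy usage with made-up intensity scores in [0, 1]:
    print(score_regression([0.2, 0.8, 0.5, 0.9], [0.25, 0.7, 0.55, 0.8]))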

    Analysis of the Contribution of Phonemes to the Prediction of Emotional Valence in Tweets in Spanish and English (original title: Análisis de la contribución de los fonemas a la predicción de la valencia emocional en tweets en español e inglés)

    Although it has traditionally been assumed that the sound of words and their meaning are related arbitrarily, a range of empirical findings supports the hypothesis that the basic phonological units of language bear a systematic relationship to semantic aspects, including the affective and attitudinal connotation of words (Adelman, Estes, & Cossu, 2018; Aryani, Conrad, Schmidtke, & Jacobs, 2018; Dingemanse, Blasi, Lupyan, Christiansen, & Monaghan, 2015; Monaghan, Shillcock, Christiansen, & Kirby, 2014; Schmidtke, Conrad, & Jacobs, 2014). Building on these premises, this study sought to determine whether the phonological units of Spanish and English contribute to the prediction of emotional valence in a corpus of tweets. To this end, a set of multiple linear regression models was trained, and their performance was evaluated using the correlation and error metrics computed from the predicted and observed valences on the test datasets provided by the SemEval-2018 shared task (Mohammad, Bravo-Márquez, Salameh, & Kiritchenko, 2018). Adding the phonological features to a set of lexical predictors (a bag-of-words representation of the tweets, normalized with TF-IDF) was found to have a small but consistent effect on the global fit metrics, and in both languages it allows observed valences close to the mid-range, as well as the lower valences associated with negative affective content, to be discriminated more precisely.
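    The modelling setup described above can be sketched briefly: a TF-IDF bag of words, optionally concatenated with a block of phoneme-level features, fed into a multiple linear regression and scored with Pearson correlation and mean absolute error. The phoneme_counts helper below is a hypothetical stand-in; the thesis' actual phonological encoding is not reproduced here.

    # Sketch, under the assumptions stated above: lexical TF-IDF features plus a
    # simple phoneme-count block, multiple linear regression, and
    # correlation/error evaluation on a held-out test set.
    from scipy.sparse import csr_matrix, hstack
    from scipy.stats import pearsonr
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LinearRegression
    from sklearn.metrics import mean_absolute_error

    def phoneme_counts(texts, inventory):
        # Illustrative phonological features: raw counts of each symbol in `inventory`.
        return csr_matrix([[t.lower().count(p) for p in inventory] for t in texts])

    def fit_and_evaluate(train_texts, y_train, test_texts, y_test, inventory=None):
        tfidf = TfidfVectorizer()  # bag of words normalized with TF-IDF
        X_train, X_test = tfidf.fit_transform(train_texts), tfidf.transform(test_texts)
        if inventory:  # add the phonological block alongside the lexical one
            X_train = hstack([X_train, phoneme_counts(train_texts, inventory)])
            X_test = hstack([X_test, phoneme_counts(test_texts, inventory)])
        model = LinearRegression().fit(X_train, y_train)
        predictions = model.predict(X_test)
        return pearsonr(y_test, predictions)[0], mean_absolute_error(y_test, predictions)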

    Character-based Neural Semantic Parsing

    Humans and computers do not speak the same language. A lot of day-to-day tasks would be vastly more efficient if we could communicate with computers using natural language instead of relying on an interface. It is necessary, then, that the computer does not see a sentence as a collection of individual words, but instead understands the deeper, compositional meaning of the sentence. One way to tackle this problem is to automatically assign each sentence a formal, structured meaning representation, which is easy for computers to interpret. There have been quite a few attempts at this before, but those approaches were usually heavily reliant on predefined rules, word lists or representations of the syntax of the text, which made them complicated to use in general settings. In this thesis we employ an algorithm that can learn to automatically assign meaning representations to texts without using any such external resources. Specifically, we use a type of artificial neural network called a sequence-to-sequence model, in a process often referred to as deep learning. The devil is in the details, but we find that this type of algorithm can produce high-quality meaning representations, with better performance than the more traditional methods. Moreover, a main finding of the thesis is that, counterintuitively, it is often better to represent the text as a sequence of individual characters rather than words. This is likely because characters help the model deal with spelling errors, unknown words and inflections.
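    The character-level input representation that this finding points to can be sketched as follows. This is an assumed, minimal illustration in PyTorch, not the thesis' actual architecture: the sentence is mapped to a sequence of character ids rather than word ids and then fed to an ordinary sequence-to-sequence encoder.

    # Minimal sketch: characters (not words) as the input units of a
    # sequence-to-sequence encoder. The tiny GRU encoder is illustrative only.
    import torch
    import torch.nn as nn

    def char_encode(sentence, char2id):
        # Unknown characters get new ids on the fly; misspelled or inflected
        # forms still share most of their characters with known forms.
        return torch.tensor([char2id.setdefault(c, len(char2id)) for c in sentence])

    class CharEncoder(nn.Module):
        def __init__(self, vocab_size, emb_dim=64, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb_dim)
            self.rnn = nn.GRU(emb_dim, hidden_dim, batch_first=True)

        def forward(self, char_ids):               # char_ids: (batch, seq_len)
            _, hidden = self.rnn(self.embed(char_ids))
            return hidden                           # summary passed to the decoder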