58 research outputs found

Evaluation of Distributional Semantic Models of Ancient Greek: Preliminary Results and a Road Map for Future Work

Author: McGillivray Barbara
Nissim Malvina
Pedrazzini Nilo
Peels-Matthey Saskia
Stopponi Silvia
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 08/09/2023

We evaluate four count-based and predictive distributional semantic models of Ancient Greek against AGREE, a composite benchmark of human judgements, to assess their ability to retrieve semantic relatedness. On the basis of the observations deriving from the analysis of the results, we design a procedure for a larger-scale intrinsic evaluation of count-based and predictive language models, including syntactic embeddings. We also propose possible ways of exploiting the different layers of the whole AGREE benchmark (including both human- and machine-generated data) and different evaluation metrics.

Dissertations of the University of Groningen

Proceedings - University of Groningen

ARTS repository - University of Groningen

"What is on your mind?" Automated Scoring of Mindreading in Childhood and Early Adolescence

Author: Aguilera Irene Luque
Devine Rory T.
Kovatchev Venelin
Lee Mark
Smith Phillip
Traynor Imogen Grumley
Publication date: 01/01/2020

In this paper we present the first work on the automated scoring of mindreading ability in middle childhood and early adolescence. We create MIND-CA, a new corpus of 11,311 question-answer pairs in English from 1,066 children aged 7 to 14. We perform machine learning experiments and carry out extensive quantitative and qualitative evaluation. We obtain promising results, demonstrating the applicability of state-of-the-art NLP solutions to a new domain and task. Comment: Accepted in COLING 202

arXiv.org e-Print Archive

Crossref

University of Birmingham Research Portal

Discovering multiword expressions

Author: Aline Villavicencio
Marco Idiart
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/11/2019

In this paper, we provide an overview of research on multiword expressions (MWEs), from a natural language processing perspective. We examine methods developed for modelling MWEs that capture some of their linguistic properties, discussing their use for MWE discovery and for idiomaticity detection. We concentrate on their collocational and contextual preferences, along with their fixedness in terms of canonical forms and their lack of word-for-word translatability. We also discuss a sample of the MWE resources that have been used in intrinsic evaluation setups for these methods.

Crossref

White Rose Research Online

English Machine Reading Comprehension Datasets: A Survey

Author: Dzendzik Daria
Foster Jennifer
Vogel Carl
Publication date: 08/10/2021

This paper surveys 60 English Machine Reading Comprehension datasets, with a view to providing a convenient resource for other researchers interested in this problem. We categorize the datasets according to their question and answer form and compare them across various dimensions including size, vocabulary, data source, method of creation, human performance level, and first question word. Our analysis reveals that Wikipedia is by far the most common data source and that there is a relative lack of why, when, and where questions across datasets. Comment: Will appear at EMNLP 2021. Dataset survey paper: 9 pages, 5 figures, 2 tables + attachmen

arXiv.org e-Print Archive

DCU Online Research Access Service

Creating expert knowledge by relying on language learners : a generic approach for mass-producing language resources by combining implicit crowdsourcing and language learning

Author: Aparaschivei Lavina
Barreiro Anabela
Borg Claudia
Cibej Jaka
Forascu Corina
Fort Karen
HaCohen-Kerner Yaakov
Hassan Umair ul
Holdt Spela Arhar
Katinskaia Anisia
Konig Alexander
Kosem Iztok
Lyding Verena
Millour Alice
Nicholas Lionel
Rodosthenous Christos
Sangati Federico
Zdravkova Katerina
Publication venue: 12th edition of the Language Resources and Evaluation Conference (LREC'20)
Publication date: 01/05/2020

In this paper we introduce a generic approach that combines implicit crowdsourcing and language learning in order to mass-produce language resources (LRs) for any language for which a crowd of language learners can be involved. We present the approach by explaining its core paradigm, which consists in pairing specific types of LRs with specific exercises, by detailing both its strengths and challenges, and by discussing how far these challenges have been addressed at present. Accordingly, we also report on ongoing proof-of-concept efforts aimed at developing the first prototypical implementation of the approach in order to correct and extend an LR called ConceptNet based on the input crowdsourced from language learners. We then present an international network called the European Network for Combining Language Learning with Crowdsourcing Techniques (enetCollect) that provides the context to accelerate the implementation of the generic approach. Finally, we exemplify how it can be used in several language learning scenarios to produce a multitude of NLP resources and how it can therefore alleviate the long-standing NLP issue of the lack of LRs. peer-reviewe

OAR@UM

Emotion Embeddings - Learning Stable and Homogeneous Abstractions from Heterogeneous Affective Datasets

Author: Buechel Sven
Hahn Udo
Publication date: 15/08/2023

Human emotion is expressed in many communication modalities and media formats, and so its computational study is equally diversified into natural language processing, audio signal analysis, computer vision, etc. Similarly, the large variety of representation formats used in previous research to describe emotions (polarity scales, basic emotion categories, dimensional approaches, appraisal theory, etc.) has led to an ever-proliferating diversity of datasets, predictive models, and software tools for emotion analysis. Because of these two distinct types of heterogeneity, at the expressional and representational level, there is a dire need to unify previous work on increasingly diverging data and label types. This article presents such a unifying computational model. We propose a training procedure that learns a shared latent representation for emotions, so-called emotion embeddings, independent of different natural languages, communication modalities, media or representation label formats, and even disparate model architectures. Experiments on a wide range of heterogeneous affective datasets indicate that this approach yields the desired interoperability for the sake of reusability, interpretability and flexibility, without penalizing prediction quality. Code and data are archived under https://doi.org/10.5281/zenodo.7405327. Comment: 18 pages, 6 figure

arXiv.org e-Print Archive

Extracting locations from sport and exercise-related social media messages using a neural network-based bilingual toponym recognition model

Author: Hasanen Elina
Hiippala Tuomo
Koivisto Sonja Maria
Liu Pengyuan
Muukkonen Petteri
Nurmi Marisofia Kaarina
Pyykönen Janne
Salmikangas Anna-Katriina
Simula Mikko
Toivonen Tuuli
Van der Lijn Charlotte Jacoba Cornelia
Vehkakoski Kirsi
Virmasalo Ilkka
Väisänen Tuomas Lauri Aleksanteri
Publication date: 01/01/2022

Sport and exercise contribute to health and well-being in cities. While previous research has mainly focused on activities at specific locations such as sport facilities, "informal sport" that occurs at arbitrary locations across the city has been largely neglected. Such activities are more challenging to observe, but this challenge may be addressed using data collected from social media platforms, because social media users regularly generate content related to sports and exercise at given locations. This allows studying all sport, including the "informal sport" that takes place at arbitrary locations, to better understand sports and exercise-related activities in cities. However, user-generated geographical information available on social media platforms is becoming scarcer and coarser. This places increased emphasis on extracting location information from free-form text content on social media, which is complicated by multilingualism and informal language. To support this effort, this article presents an end-to-end deep learning-based bilingual toponym recognition model for extracting location information from social media content related to sports and exercise. We show that our approach outperforms five state-of-the-art deep learning and machine learning models. We further demonstrate how our model can be deployed in a geoparsing framework to support city planners in promoting healthy and active lifestyles. Peer reviewe

Jyväskylä University Digital Archive

Helsingin yliopiston digitaalinen arkisto

University of St. Andrews - Pure

St Andrews Research Repository