
What a Nerd! Beating Students and Vector Cosine in the ESL and TOEFL Datasets

Author: Chiu Tin-Shing
Huang Chu-Ren
Lenci Alessandro
Lu Qin
Santus Enrico
Publication date: 01/01/2016

In this paper, we claim that Vector Cosine, which is generally considered one of the most efficient unsupervised measures for identifying word similarity in Vector Space Models, can be outperformed by a completely unsupervised measure that evaluates the extent of the intersection among the most associated contexts of two target words, weighting that intersection according to the rank of the shared contexts in the dependency ranked lists. This claim comes from the hypothesis that similar words do not simply occur in similar contexts, but share a larger portion of their most relevant contexts than other related words do. To prove it, we describe and evaluate APSyn, a variant of Average Precision that, independently of the adopted parameters, outperforms Vector Cosine and co-occurrence on the ESL and TOEFL test sets. In the best setting, APSyn reaches 0.73 accuracy on the ESL dataset and 0.70 accuracy on the TOEFL dataset, thereby beating the non-English US college applicants (whose average, as reported in the literature, is 64.50%) and several state-of-the-art approaches. Comment: in LREC 201
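The measure described in this abstract can be illustrated with a minimal Python sketch. It is one plausible reading of the description, not the authors' implementation: the abstract says only that the intersection of the top-ranked contexts is weighted by rank, so the reciprocal of the averaged rank used here is an assumption, as are the function names, the cutoff `n`, and the toy association weights.

```python
import math


def top_contexts(weights, n):
    """Map each of the n strongest contexts to its rank (1 = strongest)."""
    # Ties are broken alphabetically so the ranking is deterministic.
    ranked = sorted(weights, key=lambda c: (-weights[c], c))[:n]
    return {context: rank for rank, context in enumerate(ranked, start=1)}


def apsyn(weights_a, weights_b, n=1000):
    """Rank-weighted overlap of the top-n contexts of two words.

    Assumption: each shared context contributes 1 / (average of its two ranks).
    """
    ranks_a = top_contexts(weights_a, n)
    ranks_b = top_contexts(weights_b, n)
    shared = ranks_a.keys() & ranks_b.keys()
    return sum(1.0 / ((ranks_a[c] + ranks_b[c]) / 2.0) for c in shared)


def cosine(weights_a, weights_b):
    """Vector Cosine over sparse context vectors, for comparison."""
    dot = sum(w * weights_b.get(c, 0.0) for c, w in weights_a.items())
    norm_a = math.sqrt(sum(w * w for w in weights_a.values()))
    norm_b = math.sqrt(sum(w * w for w in weights_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical association weights (e.g. PMI-like scores), for illustration only.
car = {"drive": 5.0, "road": 4.0, "wheel": 3.0, "blue": 1.0}
truck = {"drive": 4.5, "wheel": 4.0, "road": 2.0, "load": 1.5}
banana = {"eat": 5.0, "yellow": 4.0, "blue": 0.5}

score_related = apsyn(car, truck, n=3)
score_unrelated = apsyn(car, banana, n=3)
```

With the cutoff at three contexts, `car` and `truck` share all of their top contexts, while `car` and `banana` share none, even though their full vectors overlap on `blue` and so have a non-zero cosine.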

arXiv.org e-Print Archive

The Hong Kong Polytechnic University Pao Yue-kong Library

Archivio della Ricerca - Università di Pisa

EVALITA4ELG: Italian Benchmark Linguistic Resources, NLP Services and Tools for the ELG Platform

Author: Basile Valerio
Bolioli Andrea
Bosca Alessio
Bosco Cristina
Fell Michael
Patti Viviana
Varvara Rossella
Publication venue: 'OpenEdition'
Publication date: 01/01/2020

Institutional Research Information System University of Turin

Lessons Learned from EVALITA 2020 and Thirteen Years of Evaluation of Italian Language Technology

Author: Basile Valerio
Croce Danilo
Di Maro Maria
Passaro Lucia C.
Publication venue: 'OpenEdition'
Publication date: 01/01/2020

This paper provides a summary of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA2020), which was held online on December 17th due to the 2020 COVID-19 pandemic. The 2020 edition of Evalita included 14 different tasks belonging to five research areas, namely: (i) Affect, Hate, and Stance, (ii) Creativity and Style, (iii) New Challenges in Long-standing Tasks, (iv) Semantics and Multimodality, (v) Time and Diachrony. This paper provides a description of the tasks and the key findings from the analysis of participant outcomes. Moreover, it provides a detailed analysis of the participants and task organizers, which demonstrates the growing interest in this campaign. Finally, a detailed analysis of the evaluation of tasks across the past seven editions is provided; this makes it possible to assess how the research carried out by the Italian Computational Linguistics community has evolved in terms of popular tasks and paradigms during the last 13 years.

Archivio della Ricerca - Università di Pisa

Institutional Research Information System University of Turin

Long-term Social Media Data Collection at the University of Turin

Author: Basile Valerio
Lai Mirko
Sanguinetti Manuela
Publication venue: CEUR-WS
Publication date: 01/01/2018

Institutional Research Information System University of Turin

Learning Greek and Latin Through Digital Annotation: The EuporiaEDU System

Author: Boschetti Federico
Mugelli Gloria
Re Giulia
Taddei Andrea
Publication venue: 'Universitatsbibliothek Kiel'
Publication date: 31/05/2021

Gloria Mugelli, Giulia Re, Andrea Taddei & Federico Boschetti describe the 'EuporiaEDU' system, a resource for digital annotation of ancient texts developed by the Lab. of Anthropology of Ancient Greece (LAMA), the CoPhiLab at the ILC-CNR in Pisa and the Venice Digital and Public Humanities Department. The system allows users to structure textual information by connecting keywords and creating networks of concepts such as ritual actions in Greek Tragedy. It is applicable to all kinds of linguistic or cultural observations, allowing a wide range of collaboration between teachers and students from high school to university.

MACAU: Open Access Repository of Kiel University

Open Access Repository

Keywords of written reflection - a comparison between reflective and descriptive datasets

Author: Ullmann Thomas Daniel
Publication date: 09/10/2015

This study investigates reflection keywords by contrasting two datasets, one of reflective sentences and another of descriptive sentences. The log-likelihood statistic reveals several reflection keywords that are discussed in the context of a model for reflective writing. These keywords are seen as a useful building block for tools that can automatically analyse reflection in texts.
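As a rough illustration of the keyword comparison mentioned in this abstract, the sketch below computes a log-likelihood (G2) keyness score for one word across two corpora. The abstract does not say which variant of the statistic was used, so the two-cell form common in corpus linguistics is an assumption here, and the function name and the counts are hypothetical.

```python
import math


def log_likelihood(freq_a, freq_b, size_a, size_b):
    """G2 keyness of a word seen freq_a times in corpus A and freq_b times in B.

    size_a and size_b are the total token counts of the two corpora
    (assumed positive). Higher values mean the word's relative frequency
    differs more strongly between the corpora.
    """
    total = freq_a + freq_b
    expected_a = size_a * total / (size_a + size_b)
    expected_b = size_b * total / (size_a + size_b)
    g2 = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            g2 += observed * math.log(observed / expected)
    return 2.0 * g2


# Hypothetical counts: a word seen 30 times in 1,000 reflective tokens
# and 5 times in 1,000 descriptive tokens.
keyness = log_likelihood(30, 5, 1000, 1000)
no_difference = log_likelihood(10, 10, 1000, 1000)
```

A score above about 3.84 corresponds to p < 0.05 for one degree of freedom, a conventional threshold for treating a word as a keyword candidate.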

Open Research Online (The Open University)

VerbAtlas: a novel large-scale verbal semantic resource and its application to semantic role labeling

Author: Di Fabio Andrea
Conia Simone
Navigli Roberto
Publication date: 01/01/2019

We present VerbAtlas, a new, hand-crafted lexical-semantic resource whose goal is to bring together all verbal synsets from WordNet into semantically-coherent frames. The frames define a common, prototypical argument structure while at the same time providing new concept-specific information. In contrast to PropBank, which defines enumerative semantic roles, VerbAtlas comes with an explicit, cross-frame set of semantic roles linked to selectional preferences expressed in terms of WordNet synsets, and is the first resource enriched with semantic information about implicit, shadow, and default arguments. We demonstrate the effectiveness of VerbAtlas in the task of dependency-based Semantic Role Labeling and show how its integration into a high-performance system leads to improvements on both the in-domain and out-of-domain test sets of CoNLL-2009. VerbAtlas is available at http://verbatlas.org

Crossref

Archivio della ricerca- Università di Roma La Sapienza

Models to represent linguistic linked data

Author: A. Gómez-Pérez
E. Montiel-Ponsoda
J. Bosque-Gil
J. Gracia
Publication venue: 'Cambridge University Press (CUP)'
Publication date: 01/01/2018

As the interest of the Semantic Web and computational linguistics communities in linguistic linked data (LLD) keeps increasing and the number of contributions that dwell on LLD rapidly grows, scholars (and linguists in particular) interested in the development of LLD resources sometimes find it difficult to determine which mechanism is suitable for their needs and which challenges have already been addressed. This review seeks to present the state of the art on the models, ontologies and their extensions to represent language resources as LLD by focusing on the nature of the linguistic content they aim to encode. Four basic groups of models are distinguished in this work: models to represent the main elements of lexical resources (group 1), vocabularies developed as extensions to models in group 1 and ontologies that provide more granularity on specific levels of linguistic analysis (group 2), catalogues of linguistic data categories (group 3) and other models such as corpora models or service-oriented ones (group 4). Contributions encompassed in these four groups are described, highlighting their reuse by the community and the modelling challenges that are still to be faced.

Crossref

Repositorio Universidad de Zaragoza

Archivo Digital UPM

Not just paper: enhancement of archive cultural heritage

Author: Calamai Silvia
Candeo Giovanni
Monachini Monica
Piccardi Duccio
Pretto Niccolò
Stamuli Maria Francesca
Publication venue: 'Walter de Gruyter GmbH'
Publication date: 01/01/2022

Oral archives and digital technologies have gone hand-in-hand for a very long time. Both sides benefit from this interdisciplinary junction: technology enhances the preservation and diffusion of oral materials, while exploiting them to develop cutting-edge tools for their treatment. This chapter deals with an Italian instantiation of this mutual relationship: the Archivio Vi.Vo. project. Offering innovative solutions concerning metadata, audio restoration, description, and access, Archivio Vi.Vo. aims to build an online platform to host the oral archives from Tuscany. The project is powered by CLARIN-IT, which guarantees its compliance with standards and offers resources for data access and discoverability. Archivio Vi.Vo. has not been built from scratch: it is instead a cross-fertilization of previous initiatives and research projects (e.g., the Gra.fo project). Moreover, the chapter presents the related, contemporary work of a multidisciplinary group striving to synthesize a Vademecum for future generations of oral archive researchers. Lastly, a brief list of tentative ideas for future developments of the Archivio Vi.Vo. platform will be presented.

Archivio della Ricerca - Università degli Studi di Siena