
    Overview of the EVALITA 2016 Part of speech on twitter for Italian task

    The increasing interest in extracting various forms of knowledge from micro-blogs and social media makes it crucial to develop resources and tools for dealing with such texts automatically. PoSTWITA contributes to the advancement of the state of the art for the Italian language by: (a) enriching the community with a previously unavailable collection of data extracted from Twitter and annotated with grammatical categories, to be used as a benchmark for system evaluation; (b) supporting the adaptation of Part-of-Speech tagging systems to this particular text domain.
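Benchmarks like the one described above are typically scored by per-token tagging accuracy against the gold annotation. A minimal sketch of that comparison follows; the tag names are illustrative placeholders, not the task's actual tagset:

```python
def token_accuracy(gold, predicted):
    """Per-token accuracy, the standard PoS-tagging evaluation metric."""
    assert len(gold) == len(predicted), "tag sequences must be aligned"
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)

# Hypothetical gold vs. system tags for a five-token tweet
gold = ["PROPN", "VERB", "DET", "NOUN", "EMO"]
pred = ["PROPN", "VERB", "DET", "ADJ", "EMO"]
print(token_accuracy(gold, pred))  # 0.8
```

A shared annotated test set makes this score directly comparable across participating systems, which is the point of a benchmark.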

    Two Layers of Annotation for Representing Event Mentions in News Stories

    In this paper, we describe our preliminary study of methods for annotating event mentions as part of our research on high-precision models for event extraction from news. We propose a two-layer annotation scheme designed to capture the functional and the conceptual aspects of event mentions separately. We hypothesize that precision can be improved by modeling and extracting the different aspects of news events separately, and then combining the extracted information by leveraging the complementarity of the models. We carry out a preliminary annotation using the proposed scheme and analyze the annotation quality in terms of inter-annotator agreement.
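Inter-annotator agreement for a labelling task like this is commonly reported as Cohen's kappa, which corrects raw agreement for chance. A self-contained sketch (the labels and annotation values below are invented for illustration):

```python
from collections import Counter

def cohens_kappa(ann_a, ann_b):
    """Cohen's kappa between two annotators labelling the same items."""
    assert len(ann_a) == len(ann_b), "annotators must label the same items"
    n = len(ann_a)
    # Observed agreement: fraction of items with identical labels
    observed = sum(a == b for a, b in zip(ann_a, ann_b)) / n
    # Expected chance agreement from each annotator's label distribution
    freq_a, freq_b = Counter(ann_a), Counter(ann_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two annotators deciding whether five spans are event mentions
a = ["EVENT", "EVENT", "OTHER", "EVENT", "OTHER"]
b = ["EVENT", "OTHER", "OTHER", "EVENT", "OTHER"]
print(round(cohens_kappa(a, b), 2))  # 0.62
```

Values near 1 indicate strong agreement; raw agreement here is 0.8, but kappa discounts the portion attributable to chance.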

    Lessons Learned from EVALITA 2020 and Thirteen Years of Evaluation of Italian Language Technology

    This paper provides a summary of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020), which was held online on December 17th due to the COVID-19 pandemic. The 2020 edition of EVALITA included 14 different tasks belonging to five research areas, namely: (i) Affect, Hate, and Stance; (ii) Creativity and Style; (iii) New Challenges in Long-standing Tasks; (iv) Semantics and Multimodality; (v) Time and Diachrony. This paper describes the tasks and the key findings from the analysis of participant outcomes. Moreover, it provides a detailed analysis of the participants and task organizers, which demonstrates the growing interest in this campaign. Finally, a detailed analysis of task evaluation across the past seven editions is provided; this allows us to assess how the research carried out by the Italian Computational Linguistics community has evolved in terms of popular tasks and paradigms over the last 13 years.

    EVALITA Goes Social: Tasks, Data, and Community at the 2016 Edition

    EVALITA, the evaluation campaign of Natural Language Processing and Speech Tools for the Italian language, was organised for the fifth time in 2016. Six tasks, covering both re-runs and completely new tasks, plus an IBM-sponsored challenge, attracted a total of 34 submissions. An innovative aspect of this edition was the focus on social media data, especially Twitter, and the use of shared data across tasks, yielding a test set with layers of annotation covering PoS tags, sentiment information, named entities and linking, and factuality information. Unlike in previous editions, many systems relied on neural architectures, which achieved the best results. From the experience and success of this edition, also in terms of dissemination of information and data and of collaboration between organisers of different tasks, we collected some reflections and suggestions that prospective EVALITA chairs may wish to take into account for future editions.

    The EVALITA 2016 Event Factuality Annotation Task (FactA)

    This report describes the FactA (Event Factuality Annotation) task presented at the EVALITA 2016 evaluation campaign. The task aimed at evaluating systems for the identification of the factuality profile of events. Motivations, datasets, evaluation metrics, and post-evaluation results are presented and discussed.

    Negation and Speculation in NLP: A Survey, Corpora, Methods, and Applications

    Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as opinion mining and information retrieval, especially on biomedical data. In this article, we review the corpora annotated with negation and speculation in various natural languages and domains. Furthermore, we discuss ongoing research into recent rule-based, supervised, and transfer learning techniques for the detection of negated and speculative content. Many English corpora for various domains are now annotated with negation and speculation; moreover, the availability of annotated corpora in other languages has started to increase. However, this growth is insufficient to address these important phenomena in low-resource languages. Cross-lingual models and translation from well-resourced languages are acceptable alternatives. We also highlight the lack of consistent annotation guidelines and the shortcomings of existing techniques, and suggest alternatives that may speed up progress in this research direction. Adding more syntactic features may alleviate the limitations of existing techniques, such as cue ambiguity and the detection of discontinuous scopes. In some NLP applications, including a negation- and speculation-aware component improves performance, yet this step is still often not addressed or not considered essential.
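The rule-based techniques surveyed above typically start from a cue lexicon: a list of words that trigger negation or speculation, whose scope is then resolved by further rules. A minimal sketch of the cue-detection step, assuming a tiny hand-written lexicon (real systems derive their cues from annotated corpora such as BioScope):

```python
import re

# Illustrative cue lexicons; real lexicons are corpus-derived and much larger
NEGATION_CUES = {"not", "no", "never", "without", "n't"}
SPECULATION_CUES = {"may", "might", "possibly", "suggest", "appear"}

def detect_cues(sentence):
    """Return the negation and speculation cue tokens found in a sentence."""
    tokens = re.findall(r"[a-z']+", sentence.lower())
    negation = [t for t in tokens if t in NEGATION_CUES]
    speculation = [t for t in tokens if t in SPECULATION_CUES]
    return negation, speculation

neg, spec = detect_cues("The scan did not reveal lesions, which may suggest recovery.")
print(neg, spec)  # ['not'] ['may', 'suggest']
```

The cue ambiguity mentioned in the abstract shows up immediately in a lexicon approach: a token like "no" can appear in non-negating contexts, which is why supervised cue classifiers usually outperform pure lookup.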