7 research outputs found

    Lessons Learned from EVALITA 2020 and Thirteen Years of Evaluation of Italian Language Technology

    Get PDF
    This paper provides a summary of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA2020) which was held online on December 17th, due to the 2020 COVID-19 pandemic. The 2020 edition of Evalita included 14 different tasks belonging to five research areas, namely: (i) Affect, Hate, and Stance, (ii) Creativity and Style, (iii) New Challenges in Long-standing Tasks, (iv) Semantics and Multimodality, (v) Time and Diachrony. This paper provides a description of the tasks and the key findings from the analysis of participant outcomes. Moreover, it provides a detailed analysis of the participants and task organizers which demonstrates the growing interest with respect to this campaign. Finally, a detailed analysis of the evaluation of tasks across the past seven editions is provided; this allows to assess how the research carried out by the Italian community dealing with Computational Linguistics has evolved in terms of popular tasks and paradigms during the last 13 years

    Norm It! Lexical Normalization for Italian and Its Downstream Effects forDependency Parsing

    Get PDF
    Lexical normalization is the task of translating non-standard social media data to a standard form. Previous work has shown that this is beneficial for many downstream tasks in multiple languages. However, for Italian, there is no benchmark available for lexical normalization, despite the presence of many benchmarks for other tasks involving social media data. In this paper, we discuss the creation of a lexical normalization dataset for Italian. After two rounds of annotation, a Cohen’s kappa score of 78.64 is obtained. During this process, we also analyze the inter-annotator agreement for this task, which is only rarely done on datasets for lexical normalization,and when it is reported, the analysis usually remains shallow. Furthermore, we utilize this dataset to train a lexical normalization model and show that it can be used to improve dependency parsing of social media data. All annotated data and the code to reproduce the results are available at: http://bitbucket.org/robvanderg/normi

    Proceedings of the Seventh Italian Conference on Computational Linguistics CLiC-it 2020

    Get PDF
    On behalf of the Program Committee, a very warm welcome to the Seventh Italian Conference on Computational Linguistics (CLiC-it 2020). This edition of the conference is held in Bologna and organised by the University of Bologna. The CLiC-it conference series is an initiative of the Italian Association for Computational Linguistics (AILC) which, after six years of activity, has clearly established itself as the premier national forum for research and development in the fields of Computational Linguistics and Natural Language Processing, where leading researchers and practitioners from academia and industry meet to share their research results, experiences, and challenges
    corecore