    Lessons Learned from EVALITA 2020 and Thirteen Years of Evaluation of Italian Language Technology

    This paper provides a summary of the 7th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2020), which was held online on December 17th due to the COVID-19 pandemic. The 2020 edition of EVALITA included 14 different tasks belonging to five research areas, namely: (i) Affect, Hate, and Stance; (ii) Creativity and Style; (iii) New Challenges in Long-standing Tasks; (iv) Semantics and Multimodality; (v) Time and Diachrony. This paper describes the tasks and the key findings from the analysis of participant outcomes. Moreover, it provides a detailed analysis of the participants and task organizers, which demonstrates the growing interest in this campaign. Finally, a detailed analysis of task evaluation across the past seven editions is provided; this allows us to assess how the research carried out by the Italian Computational Linguistics community has evolved in terms of popular tasks and paradigms over the last 13 years.

    The ArtiPhon Task at Evalita 2016

    Despite the impressive results achieved by ASR technology in the last few years, state-of-the-art ASR systems can still perform poorly when training and testing conditions differ (e.g., different acoustic environments). This is usually referred to as the mismatch problem. In the ArtiPhon task at Evalita 2016, we set out to evaluate phone recognition systems under mismatched speaking styles. While the training data consisted of read speech, most of the test data consisted of single-speaker hypo- and hyper-articulated speech. A second goal of the task was to investigate whether the use of speech production knowledge, in the form of measured articulatory movements, could help in building ASR systems that are more robust to the mismatch problem. Here I report the results of the task's only entry and of the baseline systems.
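    The phone recognition systems described above are conventionally scored by phone error rate (PER): the Levenshtein (edit) distance between the hypothesized and reference phone sequences, normalized by reference length. The abstract does not specify the task's exact scoring script, so the following is only a minimal sketch of the standard metric:

    ```python
    def phone_error_rate(ref, hyp):
        """Standard PER: edit distance between the reference and
        hypothesized phone sequences, divided by reference length."""
        m, n = len(ref), len(hyp)
        # prev[j] holds the edit distance between ref[:i-1] and hyp[:j]
        prev = list(range(n + 1))
        for i in range(1, m + 1):
            curr = [i] + [0] * n
            for j in range(1, n + 1):
                cost = 0 if ref[i - 1] == hyp[j - 1] else 1
                curr[j] = min(prev[j] + 1,         # deletion
                              curr[j - 1] + 1,     # insertion
                              prev[j - 1] + cost)  # substitution
            prev = curr
        return prev[n] / m

    # One substitution ("z" -> "s") over a 4-phone reference: PER = 0.25
    print(phone_error_rate("k a z a".split(), "k a s a".split()))
    ```

    Because PER counts insertions and deletions as well as substitutions, it can exceed 1.0 for very poor hypotheses, which is exactly the regime mismatched hypo-/hyper-articulated test speech can push a read-speech model into.
    
    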