Overview of the EVALITA 2016 Part of Speech on Twitter for Italian task

Authors: Andrea Bolioli, Cristina Bosco, Fabio Tamburini, Alessandro Mazzei
Publication venue: CEUR Workshop Proceedings (CEUR-WS.org)
Publication date: 01/01/2016

The increasing interest in extracting various forms of knowledge from micro-blogs and social media makes it crucial to develop resources and tools that can process them automatically. PoSTWITA contributes to the advancement of the state of the art for the Italian language by: (a) enriching the community with a previously unavailable collection of data extracted from Twitter and annotated with grammatical categories, to be used as a benchmark for system evaluation; (b) supporting the adaptation of Part of Speech tagging systems to this particular text domain.

Archivio istituzionale della ricerca - Alma Mater Studiorum Università di Bologna

Institutional Research Information System University of Turin

When silver glitters more than gold: Bootstrapping an Italian part-of-speech tagger for Twitter

Authors: Malvina Nissim, Barbara Plank
Publication date: 01/01/2016

We bootstrap a state-of-the-art part-of-speech tagger to tag Italian Twitter data, in the context of the Evalita 2016 PoSTWITA shared task. We show that training the tagger on native Twitter data enriched with small amounts of specifically selected gold data and additional silver-labelled data scraped from Facebook yields better results than using large amounts of manually annotated data from a mix of genres. Comment: Proceedings of the 5th Evaluation Campaign of Natural Language Processing and Speech Tools for Italian (EVALITA 2016)

arXiv.org e-Print Archive

Proceedings - University of Groningen

University of Groningen

ARTS repository - University of Groningen

OpenEdition

Dissertations of the University of Groningen