The Tanl Lemmatizer Enriched with a Sequence of Cascading Filters

Author: A. Loponen
E. Zanchetta
M. Baroni
S. Buchholz
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

The Tanl Lemmatizer Enriched with a Sequence of Cascading Filters

Author: Giuseppe Attardi
S. Dei Rossi
Maria Simi
Publication venue: Berlin
Publication date: 01/01/2012

We have extended an existing lemmatizer, which relies on a lexicon of about 1.2 million forms in which lemmas are indexed by rich PoS tags, with a sequence of cascading filters, each in charge of dealing with specific issues related to out-of-dictionary words. The last two filters are devoted to resolving semantic ambiguities between words of the same syntactic category by querying external resources: an enriched index built on the Italian Wikipedia, and the Google index.
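
The cascade described in the abstract can be pictured with a minimal Python sketch. It is not the Tanl implementation: the toy lexicon, the tag names (`NOUN-M-P`, `PROPN`), and the two fallback filters are hypothetical, and the Wikipedia- and Google-based disambiguation filters are left out. The sketch only assumes that each filter either returns a lemma or passes the word on to the next filter.

```python
from typing import Callable, List, Optional

# Hypothetical toy lexicon: (lower-cased form, PoS tag) -> lemma.
# The real lexicon is reported to hold about 1.2 million forms.
LEXICON = {
    ("gatti", "NOUN-M-P"): "gatto",
    ("mangia", "VERB-3-S"): "mangiare",
}

# A filter returns a lemma, or None to hand the word to the next filter.
Filter = Callable[[str, str], Optional[str]]


def lexicon_lookup(form: str, pos: str) -> Optional[str]:
    """First stage: exact lookup of (form, PoS tag) in the lexicon."""
    return LEXICON.get((form.lower(), pos))


def proper_noun_filter(form: str, pos: str) -> Optional[str]:
    """Illustrative rule: an unknown proper noun is its own lemma."""
    return form if pos == "PROPN" else None


def plural_suffix_filter(form: str, pos: str) -> Optional[str]:
    """Illustrative, deliberately naive rule for unknown masculine plurals."""
    if pos == "NOUN-M-P" and form.endswith("i"):
        return form[:-1] + "o"
    return None


# Order matters: cheaper and more reliable filters come first.
CASCADE: List[Filter] = [lexicon_lookup, proper_noun_filter, plural_suffix_filter]


def lemmatize(form: str, pos: str, filters: List[Filter] = CASCADE) -> str:
    """Return the lemma from the first filter that produces one."""
    for apply_filter in filters:
        lemma = apply_filter(form, pos)
        if lemma is not None:
            return lemma
    # No filter fired: fall back to the surface form.
    return form


if __name__ == "__main__":
    for word, tag in [("gatti", "NOUN-M-P"), ("libri", "NOUN-M-P"), ("Pisa", "PROPN")]:
        print(word, tag, lemmatize(word, tag))
```

In this layout, filters that query external resources would simply be further entries appended to the end of `CASCADE`, so they are consulted only when the cheaper stages have not produced a lemma.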

Archivio della Ricerca - Università di Pisa