Enhancing a Portuguese text classifier using part-of-speech tags

Gonçalves, Teresa; Quaresma, Paulo

Authors: Teresa Gonçalves, Paulo Quaresma
Publication date: 1 January 2005
Publisher: Springer Science and Business Media LLC

Abstract

Support Vector Machines have been applied to text classification with great success. In this paper, we apply and evaluate the impact of using part-of-speech tags (nouns, proper nouns, adjectives and verbs) as a feature selection procedure on a dataset written in European Portuguese, the Portuguese Attorney General’s Office documents. From the results, we can conclude that verbs alone don’t have enough information to produce good learners. On the other hand, we obtain learners with equivalent performance and a reduced number of features (at least half) if we use specific part-of-speech tags instead of all words.
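
To make the feature selection step concrete, the following is a minimal sketch of filtering a bag-of-words vocabulary by part-of-speech tag. It is an illustration only: the two toy documents, the tag names (NOUN, PROPN, ADJ, VERB, DET, ADP) and the helper functions `select_features` and `bag_of_words` are assumptions made for this example, not the authors' dataset, tagger or implementation. The classifier itself is left out; the resulting term counts would be what gets passed to a learner such as a Support Vector Machine.

```python
from collections import Counter

# Hypothetical, hand-tagged toy documents: each token is (word, POS tag).
# The tag names are illustrative assumptions, not the paper's tag set.
DOCS = [
    [("o", "DET"), ("tribunal", "NOUN"), ("decidiu", "VERB"),
     ("sobre", "ADP"), ("a", "DET"), ("pensão", "NOUN"), ("militar", "ADJ")],
    [("a", "DET"), ("Procuradoria", "PROPN"), ("emitiu", "VERB"),
     ("um", "DET"), ("parecer", "NOUN"), ("jurídico", "ADJ")],
]


def select_features(doc, keep_tags):
    """Return the lowercased words of doc whose POS tag is in keep_tags."""
    return [word.lower() for word, tag in doc if tag in keep_tags]


def bag_of_words(docs, keep_tags=None):
    """Build a sorted vocabulary and per-document term counts.

    With keep_tags=None every word is kept (the all-words baseline);
    otherwise only words carrying one of the given tags survive.
    """
    counts = []
    for doc in docs:
        if keep_tags is None:
            words = [word.lower() for word, _ in doc]
        else:
            words = select_features(doc, keep_tags)
        counts.append(Counter(words))
    vocab = sorted(set().union(*counts))
    return vocab, counts


if __name__ == "__main__":
    vocab_all, _ = bag_of_words(DOCS)
    vocab_nominal, _ = bag_of_words(DOCS, {"NOUN", "PROPN", "ADJ"})
    vocab_verbs, _ = bag_of_words(DOCS, {"VERB"})
    print(len(vocab_all), len(vocab_nominal), len(vocab_verbs))
```

Comparing the vocabulary sizes for the different tag sets shows the mechanism by which tag-based selection shrinks the feature space. The numbers from this toy input say nothing about classifier performance on the real dataset.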

Available Versions

Repositório Científico da Universidade de Évora

oai:dspace.uevora.pt:10174/256...