7 research outputs found

    Factors Affecting Part-of-Speech Tagging for Tagalog

    Get PDF
    PACLIC 23 / City University of Hong Kong / 3-5 December 200

    Improving the PoS tagging accuracy of Icelandic text

    Get PDF
    Proceedings of the 17th Nordic Conference of Computational Linguistics NODALIDA 2009. Editors: Kristiina Jokinen and Eckhard Bick. NEALT Proceedings Series, Vol. 4 (2009), 103-110. © 2009 The editors and contributors. Published by Northern European Association for Language Technology (NEALT) http://omilia.uio.no/nealt . Electronically published at Tartu University Library (Estonia) http://hdl.handle.net/10062/9206

    Application of a POS Tagger to a Novel Chronological Division of Early Modern German Text

    Get PDF
    This paper describes the application of a part-of-speech tagger to a particular configuration of historical German documents. Most natural language processing (NLP) is done on contemporary documents, and historical documents can present difficulties for these tools. I compared the performance of a single high-quality tagger on two stages of historical German (Early Modern German) materials. I used the TnT (Trigrams 'n' Tags) tagger, a probabilistic tagger developed by Thorsten Brants in a 2000 paper. I applied this tagger to two subcorpora which I derived from the University of Manchester's GerManC corpus, divided by date of creation of the original document, with each one used for both training and testing. I found that the earlier half, from a period with greater variability in the language, was significantly more difficult to tag correctly. The broader tag categories of punctuation and "other" were overrepresented in the errors.Master of Science in Information Scienc
    corecore