Using Stanford part-of-speech tagger for the morphologically-rich Filipino Language
Authors
Matthew Phillip V. Go
Nicco Nocon
Publication date
1 January 2019
Publisher
Animo Repository
Abstract
This research focuses on the implementation of a Maximum Entropy-based Part-of-Speech (POS) tagger for Filipino. It uses the Stanford POS tagger, a trainable POS tagger that has been trained on English, Chinese, Arabic, and other languages and produces among the highest results in each language. The tagger was trained for Filipino on a 406k-token corpus, taking into account distinctive Filipino linguistic phenomena such as rich morphology and intra-sentential code-switching. The resulting Filipino POS tagger achieved 96.15% tagging accuracy, which is currently the highest among existing POS taggers for Filipino, by a large margin. Copyright © 2017 Matthew Phillip Go and Nicco Noco
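The two ideas the abstract relies on, maximum entropy tagging over morphological features and token-level tagging accuracy, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-tag set, the affix and previous-word features, and the weights are hypothetical, and none of it comes from the paper or from the Stanford POS tagger's actual feature templates. In a real tagger the weights are learned from the annotated corpus.

```python
import math

# Hypothetical tag set and feature weights, chosen only for illustration;
# they are not taken from the paper or from the Stanford POS tagger.
TAGS = ["NOUN", "VERB"]
WEIGHTS = {
    ("prefix=nag", "VERB"): 2.0,
    ("prefix=mag", "VERB"): 1.5,
    ("suffix=an", "NOUN"): 0.5,
    ("prev=ang", "NOUN"): 1.8,
}

def features(word, prev_word):
    """Extract simple affix and context features for one token."""
    w = word.lower()
    return [f"prefix={w[:3]}", f"suffix={w[-2:]}", f"prev={prev_word.lower()}"]

def tag_probabilities(word, prev_word):
    """Maximum entropy model: P(tag | context) is a softmax over summed feature weights."""
    feats = features(word, prev_word)
    scores = {t: sum(WEIGHTS.get((f, t), 0.0) for f in feats) for t in TAGS}
    z = sum(math.exp(s) for s in scores.values())
    return {t: math.exp(s) / z for t, s in scores.items()}

def accuracy(predicted, gold):
    """Token-level tagging accuracy: share of tokens whose predicted tag matches gold."""
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)
```

With these made-up weights, `tag_probabilities("naglaro", "si")` favors `VERB` because of the `nag` prefix, and `tag_probabilities("bata", "ang")` favors `NOUN` because of the preceding `ang`. The `accuracy` function computes the kind of token-level figure that the reported 96.15% refers to.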
Available Versions
Institutional Repositories DataBase (IRDB)
oai:irdb.nii.ac.jp:00835:00037...
Last time updated on 06/09/2020
Animo Repository - De La Salle University Research
oai:animorepository.dlsu.edu.p...
Last time updated on 03/12/2021