The Importance of Automatic Syntactic Features in Vietnamese Named Entity Recognition
This paper presents a state-of-the-art system for Vietnamese Named Entity
Recognition (NER). By incorporating automatic syntactic features with word
embeddings as input for bidirectional Long Short-Term Memory (Bi-LSTM), our
system, although simpler than some deep learning architectures, achieves a much
better result for Vietnamese NER. The proposed method achieves an overall F1
score of 92.05% on the test set of an evaluation campaign, organized in late
2016 by the Vietnamese Language and Speech Processing (VLSP) community. Our
named entity recognition system outperforms the best previous systems for
Vietnamese NER by a large margin.
Comment: 7 pages, 9 tables, 3 figures, accepted to PACLIC 201
Myanmar named entity corpus and its use in syllable-based neural named entity recognition
Myanmar is a low-resource language, which is one of the main reasons why Myanmar natural language processing has lagged behind that of other languages. Currently, there is no publicly available named entity corpus for the Myanmar language. As part of this work, the first manually annotated named entity tagged corpus for the Myanmar language was developed and is proposed to support the evaluation of named entity extraction. At present, our named entity corpus contains approximately 170,000 named entities and 60,000 sentences. This work also contributes the first evaluation of various deep neural network architectures on Myanmar named entity recognition. Experimental results of 10-fold cross-validation revealed that syllable-based neural sequence models without additional feature engineering can give better results than a baseline CRF model. This work also aims to assess the effectiveness of neural network approaches to text processing for the Myanmar language and to promote future research on this understudied language.