Search CORE

1,020 research outputs found

A Novel Neural Network Model for Joint POS Tagging and Graph-based Dependency Parsing

Author: Dras Mark
Johnson Mark
Nguyen Dat Quoc
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017
Field of study

We present a novel neural network model that learns POS tagging and graph-based dependency parsing jointly. Our model uses bidirectional LSTMs to learn feature representations shared for both POS tagging and dependency parsing tasks, thus handling the feature-engineering problem. Our extensive experiments, on 19 languages from the Universal Dependencies project, show that our model outperforms the state-of-the-art neural network-based Stack-propagation model for joint POS tagging and transition-based dependency parsing, resulting in a new state of the art. Our code is open-source and available together with pre-trained models at: https://github.com/datquocnguyen/jPTDPComment: v2: also include universal POS tagging, UAS and LAS accuracies w.r.t gold-standard segmentation on Universal Dependencies 2.0 - CoNLL 2017 shared task test data; in CoNLL 201

arXiv.org e-Print Archive

Crossref

A non-projective greedy dependency parser with bidirectional LSTMs

Author: Gómez-Rodríguez Carlos
Vilares David
Publication venue
Publication date: 01/01/2017
Field of study

The LyS-FASTPARSE team presents BIST-COVINGTON, a neural implementation of the Covington (2001) algorithm for non-projective dependency parsing. The bidirectional LSTM approach by Kipperwasser and Goldberg (2016) is used to train a greedy parser with a dynamic oracle to mitigate error propagation. The model participated in the CoNLL 2017 UD Shared Task. In spite of not using any ensemble methods and using the baseline segmentation and PoS tagging, the parser obtained good results on both macro-average LAS and UAS in the big treebanks category (55 languages), ranking 7th out of 33 teams. In the all treebanks category (LAS and UAS) we ranked 16th and 12th. The gap between the all and big categories is mainly due to the poor performance on four parallel PUD treebanks, suggesting that some `suffixed' treebanks (e.g. Spanish-AnCora) perform poorly on cross-treebank settings, which does not occur with the corresponding `unsuffixed' treebank (e.g. Spanish). By changing that, we obtain the 11th best LAS among all runs (official and unofficial). The code is made available at https://github.com/CoNLL-UD-2017/LyS-FASTPARSEComment: 12 pages, 2 figures, 5 table

arXiv.org e-Print Archive

Crossref

Universal Dependencies Parsing for Colloquial Singaporean English

Author: Chan GuangYong Leonard
Chieu Hai Leong
Wang Hongmin
Yang Jie
Zhang Yue
Publication venue
Publication date: 01/01/2017
Field of study

Singlish can be interesting to the ACL community both linguistically as a major creole based on English, and computationally for information extraction and sentiment analysis of regional social media. We investigate dependency parsing of Singlish by constructing a dependency treebank under the Universal Dependencies scheme, and then training a neural network model by integrating English syntactic knowledge into a state-of-the-art parser trained on the Singlish treebank. Results show that English knowledge can lead to 25% relative error reduction, resulting in a parser of 84.47% accuracies. To the best of our knowledge, we are the first to use neural stacking to improve cross-lingual dependency parsing on low-resource languages. We make both our annotation and parser available for further research.Comment: Accepted by ACL 201

arXiv.org e-Print Archive

Crossref

Character Composition Model with Convolutional Neural Networks for Dependency Parsing on Morphologically Rich Languages

Author: Vu Ngoc Thang
Yu Xiang
Publication venue
Publication date: 01/01/2017
Field of study

We present a transition-based dependency parser that uses a convolutional neural network to compose word representations from characters. The character composition model shows great improvement over the word-lookup model, especially for parsing agglutinative languages. These improvements are even better than using pre-trained word embeddings from extra data. On the SPMRL data sets, our system outperforms the previous best greedy parser (Ballesteros et al., 2015) by a margin of 3% on average.Comment: Accepted in ACL 2017 (Short

arXiv.org e-Print Archive

Crossref

Parsing Thai Social Data: A New Challenge for Thai NLP

Author: Chalothorn Tawunrat
Khampingyot Borirat
Maharattamalai Nattasit
Singkul Sattaya
Taerungruang Supawat
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 06/03/2020
Field of study

Dependency parsing (DP) is a task that analyzes text for syntactic structure and relationship between words. DP is widely used to improve natural language processing (NLP) applications in many languages such as English. Previous works on DP are generally applicable to formally written languages. However, they do not apply to informal languages such as the ones used in social networks. Therefore, DP has to be researched and explored with such social network data. In this paper, we explore and identify a DP model that is suitable for Thai social network data. After that, we will identify the appropriate linguistic unit as an input. The result showed that, the transition based model called, improve Elkared dependency parser outperform the others at UAS of 81.42%.Comment: 7 Pages, 8 figures, to be published in The 14th International Joint Symposium on Artificial Intelligence and Natural Language Processing (iSAI-NLP 2019

arXiv.org e-Print Archive

Crossref