Parsing Thai Social Data: A New Challenge for Thai NLP
Dependency parsing (DP) is a task that analyzes the syntactic structure of text
and the relationships between words. DP is widely used to improve natural language
processing (NLP) applications in many languages such as English. Previous works
on DP are generally applicable to formally written languages. However, they do
not apply to informal languages such as the ones used in social networks.
Therefore, DP has to be researched and explored with such social network data.
In this paper, we explore and identify a DP model that is suitable for Thai
social network data. We then identify the appropriate linguistic unit to use
as input. The results show that the transition-based model, called the
improved Elkared dependency parser, outperforms the others with a UAS of 81.42%.
Comment: 7 pages, 8 figures, to be published in The 14th International Joint
Symposium on Artificial Intelligence and Natural Language Processing
(iSAI-NLP 2019
An improved neural network model for joint POS tagging and dependency parsing
We propose a novel neural network model for joint part-of-speech (POS)
tagging and dependency parsing. Our model extends the well-known BIST
graph-based dependency parser (Kiperwasser and Goldberg, 2016) by incorporating
a BiLSTM-based tagging component to produce automatically predicted POS tags
for the parser. On the benchmark English Penn treebank, our model obtains
strong UAS and LAS scores of 94.51% and 92.87%, respectively, a 1.5+%
absolute improvement over the BIST graph-based parser, while also obtaining a
state-of-the-art POS tagging accuracy of 97.97%. Furthermore, experimental
results on parsing 61 "big" Universal Dependencies treebanks from raw texts
show that our model outperforms the baseline UDPipe (Straka and Straková,
2017) with 0.8% higher average POS tagging score and 3.6% higher average LAS
score. In addition, with our model, we also obtain state-of-the-art downstream
task scores for biomedical event extraction and opinion analysis applications.
Our code is available together with all pre-trained models at:
https://github.com/datquocnguyen/jPTDP
Comment: 11 pages; In Proceedings of the CoNLL 2018 Shared Task: Multilingual
Parsing from Raw Text to Universal Dependencies, to appea
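The UAS and LAS figures quoted in the abstracts above are attachment scores: the percentage of tokens whose predicted head is correct (UAS), and whose predicted head and dependency label are both correct (LAS). The following is a minimal sketch of that arithmetic, not the evaluation script used by any of these papers; the function name `attachment_scores`, the `Arc` type, and the toy sentence are hypothetical, and it assumes gold and predicted tokens are already aligned one-to-one.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 stands for the artificial root. Illustrative type only.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages for one aligned token sequence."""
    if len(gold) != len(pred):
        raise ValueError("gold and predicted sequences must have equal length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    # UAS counts tokens whose head matches, ignoring the label.
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    # LAS counts tokens whose head and label both match.
    labeled_hits = sum(g == p for g, p in zip(gold, pred))
    n = len(gold)
    return 100.0 * head_hits / n, 100.0 * labeled_hits / n


# Toy four-token sentence (made-up data).
gold = [(2, "det"), (3, "nsubj"), (0, "root"), (3, "obj")]
pred = [(2, "det"), (3, "obj"), (0, "root"), (2, "obj")]
uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")
```

Published evaluations typically add steps this sketch omits, such as aligning system tokens with gold tokens when parsing from raw text.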
AMR Dependency Parsing with a Typed Semantic Algebra
We present a semantic parser for Abstract Meaning Representations which
learns to parse strings into tree representations of the compositional
structure of an AMR graph. This allows us to use standard neural techniques for
supertagging and dependency tree parsing, constrained by a linguistically
principled type system. We present two approximative decoding algorithms, which
achieve state-of-the-art accuracy and outperform strong baselines.
Comment: This paper will be presented at ACL 2018 (see
https://acl2018.org/programme/papers/
Latent Tree Language Model
In this paper we introduce the Latent Tree Language Model (LTLM), a novel
approach to language modeling that encodes the syntax and semantics of a given
sentence as a tree of word roles.
The learning phase iteratively updates the trees by moving nodes according to
Gibbs sampling. We introduce two algorithms to infer a tree for a given
sentence. The first one is based on Gibbs sampling. It is fast, but is not
guaranteed to find the most probable tree. The second one is based on dynamic
programming. It is slower, but is guaranteed to find the most probable tree. We
provide a comparison of both algorithms.
We combine LTLM with a 4-gram Modified Kneser-Ney language model via linear
interpolation. Our experiments with English and Czech corpora show significant
perplexity reductions (up to 46% for English and 49% for Czech) compared with
a standalone 4-gram Modified Kneser-Ney language model.
Comment: Accepted to EMNLP 201
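The linear interpolation mentioned in the Latent Tree Language Model abstract mixes the per-word probabilities of two models with a weight between 0 and 1, and perplexity is then computed from the mixed probabilities. The sketch below illustrates only that arithmetic; the function names, the weight `lam = 0.5`, and all probability values are made up for illustration and are not taken from the paper.

```python
import math
from typing import List, Sequence


def interpolate(p_a: Sequence[float], p_b: Sequence[float], lam: float) -> List[float]:
    """Mix two models' per-word probabilities: lam * p_a + (1 - lam) * p_b."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    if len(p_a) != len(p_b):
        raise ValueError("both models must score the same words")
    return [lam * a + (1.0 - lam) * b for a, b in zip(p_a, p_b)]


def perplexity(probs: Sequence[float]) -> float:
    """Perplexity = exp of the negative mean log-probability per word."""
    return math.exp(-sum(math.log(p) for p in probs) / len(probs))


# Hypothetical probabilities each model assigns to the same three words.
p_tree = [0.20, 0.05, 0.10]   # stand-in for a tree-based model
p_ngram = [0.10, 0.10, 0.02]  # stand-in for a 4-gram model
lam = 0.5                     # assumed weight; normally tuned on held-out data

mixed = interpolate(p_tree, p_ngram, lam)
print("n-gram perplexity:      ", round(perplexity(p_ngram), 2))
print("interpolated perplexity:", round(perplexity(mixed), 2))
```

In this toy case the interpolated model has lower perplexity than the n-gram stand-in alone, which is the kind of comparison the abstract reports; the size of the reduction here says nothing about the paper's actual results.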