16 research outputs found
Handling unknown words in statistical latent-variable parsing models for Arabic, English and French
This paper presents a study of the impact of using simple and complex morphological clues to improve the classification of rare and unknown words for parsing. We compare this approach to a language-independent technique
often used in parsers which is based solely on word frequencies. This study is applied to three languages that exhibit different levels of morphological expressiveness: Arabic, French and English. We integrate information
about Arabic affixes and morphotactics into a PCFG-LA parser and obtain stateof-the-art accuracy. We also show that these morphological clues can be learnt automatically
from an annotated corpus
From news to comment: Resources and benchmarks for parsing the language of web 2.0
We investigate the problem of parsing the noisy language of social media. We evaluate four all-Street-Journal-trained statistical parsers (Berkeley, Brown, Malt and MST) on a new dataset containing 1,000 phrase structure trees for sentences from microblogs (tweets) and discussion forum posts. We compare the four parsers on their ability to produce Stanford dependencies for these Web 2.0 sentences. We find that the parsers have a particular problem with tweets and that a substantial part of this problem is related to POS tagging accuracy. We attempt three retraining experiments involving Malt, Brown and an in-house Berkeley-style parser and obtain a statistically significant improvement for all three parsers
Attention Is All You Need
The dominant sequence transduction models are based on complex recurrent or
convolutional neural networks in an encoder-decoder configuration. The best
performing models also connect the encoder and decoder through an attention
mechanism. We propose a new simple network architecture, the Transformer, based
solely on attention mechanisms, dispensing with recurrence and convolutions
entirely. Experiments on two machine translation tasks show these models to be
superior in quality while being more parallelizable and requiring significantly
less time to train. Our model achieves 28.4 BLEU on the WMT 2014
English-to-German translation task, improving over the existing best results,
including ensembles by over 2 BLEU. On the WMT 2014 English-to-French
translation task, our model establishes a new single-model state-of-the-art
BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction
of the training costs of the best models from the literature. We show that the
Transformer generalizes well to other tasks by applying it successfully to
English constituency parsing both with large and limited training data.Comment: 15 pages, 5 figure
Structured Training for Neural Network Transition-Based Parsing
We present structured perceptron training for neural network transition-based
dependency parsing. We learn the neural network representation using a gold
corpus augmented by a large number of automatically parsed sentences. Given
this fixed network representation, we learn a final layer using the structured
perceptron with beam-search decoding. On the Penn Treebank, our parser reaches
94.26% unlabeled and 92.41% labeled attachment accuracy, which to our knowledge
is the best accuracy on Stanford Dependencies to date. We also provide in-depth
ablative analysis to determine which aspects of our model provide the largest
gains in accuracy
Dynamic Self-training Framework for Graph Convolutional Networks
Graph neural networks (GNN) such as GCN, GAT, MoNet have achieved
state-of-the-art results on semi-supervised learning on graphs. However, when
the number of labeled nodes is very small, the performances of GNNs downgrade
dramatically. Self-training has proved to be effective for resolving this
issue, however, the performance of self-trained GCN is still inferior to that
of G2G and DGI for many settings. Moreover, additional model complexity make it
more difficult to tune the hyper-parameters and do model selection. We argue
that the power of self-training is still not fully explored for the node
classification task. In this paper, we propose a unified end-to-end
self-training framework called \emph{Dynamic Self-traning}, which generalizes
and simplifies prior work. A simple instantiation of the framework based on GCN
is provided and empirical results show that our framework outperforms all
previous methods including GNNs, embedding based method and self-trained GCNs
by a noticeable margin. Moreover, compared with standard self-training,
hyper-parameter tuning for our framework is easier.Comment: 11page
Treebank Conversion based Self-training Strategy for Parsing
Abstract In this paper, we propose a novel selftraining strategy for parsing which is based on Treebank conversion (SSPTC). In SSPTC, we make full use of the strong points of Treebank conversion and self-training, and offset their weaknesses with each other. To provide good parse selection strategies which are needed in self-training, we score the automatically generated parse trees with parse trees in source Treebank as a reference. To maintain the constituency between source Treebank and conversion Treebank which is needed in Treebank conversion, we get the conversion trees with the help of self-training. In our experiments, SSPTC strategy is utilized to parse Tsinghua Chinese Treebank with the help of Penn Chinese Treebank. The results significantly outperform the baseline parser