GNN-SL: Sequence Labeling Based on Nearest Examples via GNN
To better handle long-tail cases in the sequence labeling (SL) task, in this
work, we introduce graph neural networks sequence labeling (GNN-SL), which
augments the vanilla SL model output with similar tagging examples retrieved
from the whole training set. Since not all the retrieved tagging examples
benefit the model prediction, we construct a heterogeneous graph, and leverage
graph neural networks (GNNs) to transfer information between the retrieved
tagging examples and the input word sequence. The augmented node, which
aggregates information from its neighbors, is then used for prediction. This strategy
enables the model to directly acquire similar tagging examples and improves the
general quality of predictions. We conduct a variety of experiments on three
typical sequence labeling tasks: Named Entity Recognition (NER), Part of Speech
Tagging (POS), and Chinese Word Segmentation (CWS), to demonstrate the strong
performance of GNN-SL. Notably, GNN-SL achieves SOTA results of 96.9 (+0.2)
on PKU, 98.3 (+0.4) on CITYU, 98.5 (+0.2) on MSR, and 96.9 (+0.2) on AS for the
CWS task, and results comparable to SOTA on the NER and POS datasets.
Comment: preprint
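The retrieval step described in this abstract can be illustrated with a small sketch. It is not the authors' method: the heterogeneous graph and the GNN are replaced here by a plain similarity-weighted vote over the nearest tagged examples, and the names `retrieve_neighbors`, `augment`, the interpolation `weight`, and the toy vectors are all assumptions made for illustration.

```python
import math
from collections import defaultdict

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def retrieve_neighbors(query, datastore, k):
    """Return the k (similarity, tag) pairs most similar to the query vector."""
    scored = [(cosine(query, vec), tag) for vec, tag in datastore]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

def augment(model_probs, neighbors, weight=0.5):
    """Interpolate model tag probabilities with a similarity-weighted vote.

    Assumption: a simple vote stands in for the GNN aggregation.
    """
    votes = defaultdict(float)
    for sim, tag in neighbors:
        votes[tag] += max(sim, 0.0)
    total = sum(votes.values())
    if total == 0:
        return dict(model_probs)  # no usable neighbors: keep the model output
    tags = set(model_probs) | set(votes)
    return {
        t: (1 - weight) * model_probs.get(t, 0.0) + weight * votes[t] / total
        for t in tags
    }

# Toy datastore of (token representation, gold tag) pairs; values are made up.
datastore = [([1.0, 0.0], "B-LOC"), ([0.9, 0.1], "B-LOC"), ([0.0, 1.0], "O")]
query = [1.0, 0.05]
model_probs = {"B-LOC": 0.4, "O": 0.6}  # the base model prefers "O"

neighbors = retrieve_neighbors(query, datastore, k=2)
augmented = augment(model_probs, neighbors)
best = max(augmented, key=augmented.get)
```

In this toy case both retrieved neighbors carry the tag `B-LOC`, so the interpolated distribution overrides the base model's preference for `O`. A fixed vote weights every retrieved example by similarity alone, which is the limitation the abstract points to when it notes that not all retrieved examples help.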
Part of speech tagging of Slovene language using deep neural networks
The thesis deals with part-of-speech tagging of the Slovene language. Part-of-speech tagging is the process of assigning to each sentence in natural language a sequence of suitable tags, which contain information about the parts of speech and morphological properties of its words. Our solution uses a character-level representation of words, unlike typical solutions, which process input sentences as sequences of words. Our part-of-speech tagger is implemented using convolutional and recurrent neural networks. Unlike common approaches that treat this problem as multi-class classification, our solution takes a multi-label classification approach. To improve our results, we implement an ensemble of three part-of-speech taggers. When comparing our solution with existing ones, we find that the proposed solution achieves the best results.
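The difference between multi-class and multi-label targets for morphological tags can be sketched as follows. The abstract does not describe the label scheme, so this is only one possible reading: the attribute inventory `ATTRIBUTES` and the helpers `encode` and `decode` are hypothetical, and a real Slovene tagset is considerably larger.

```python
# Hypothetical attribute inventory (an assumption, not the thesis's tagset).
ATTRIBUTES = {
    "pos": ["noun", "verb", "adjective"],
    "gender": ["masculine", "feminine", "neuter"],
    "number": ["singular", "dual", "plural"],
    "case": ["nominative", "genitive", "accusative"],
}

# One output unit per (attribute, value) pair instead of one per full tag.
LABELS = [(attr, val) for attr, vals in ATTRIBUTES.items() for val in vals]
LABEL_INDEX = {label: i for i, label in enumerate(LABELS)}

def encode(tag):
    """Turn a dict of attribute values into a multi-hot target vector."""
    vec = [0] * len(LABELS)
    for attr, val in tag.items():
        vec[LABEL_INDEX[(attr, val)]] = 1
    return vec

def decode(scores, threshold=0.5):
    """Per attribute, keep the best-scoring value if it clears the threshold."""
    tag = {}
    for attr, vals in ATTRIBUTES.items():
        best = max(vals, key=lambda v: scores[LABEL_INDEX[(attr, v)]])
        if scores[LABEL_INDEX[(attr, best)]] >= threshold:
            tag[attr] = best
    return tag

noun = {"pos": "noun", "gender": "feminine", "number": "dual", "case": "genitive"}
verb = {"pos": "verb", "number": "plural"}  # attributes that do not apply are omitted

noun_target = encode(noun)
verb_target = encode(verb)
```

With this encoding the output layer has one unit per attribute value (12 here), whereas a multi-class formulation needs one class for every full tag combination that occurs. Attributes that do not apply to a word are simply left unset.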
Does the Word Chien Bark? Representation Learning in Neural Machine Translation Encoders
This thesis presents experiments that use representation learning to explore how neural networks learn. Neural networks that take text as input create internal representations of the text during training. Recent work has found that these representations can be used to perform other downstream linguistic tasks, such as part-of-speech (POS) tagging. This demonstrates that the networks learn linguistic information and store it in the representations. We focus on the representations created by neural machine translation (NMT) models and whether they can be used for POS tagging. We train five NMT models, including an auto-encoder. We extract the encoder from each model and use the representations it produces to train a hand-crafted Encoder-Tagger (ET) model to do POS tagging. We explore the impact of various features, including NMT target language, NMT BLEU score, encoder depth, sequence length, token frequency, and the percentage of out-of-vocabulary (OOV) tokens in a sequence. We find that NMT encoder representations contain sufficient linguistic information to perform POS tagging and that there are correlations between several of these features, which helps us better understand the inner workings of neural networks.
The Recurrent Temporal Discriminative Restricted Boltzmann Machines
Classification of sequence data is a topic of interest for both dynamic Bayesian models and Recurrent Neural Networks (RNNs). While the former can explicitly model the temporal dependencies between class variables, the latter are capable of learning representations. Several attempts have been made to improve performance by combining these two approaches or by increasing the processing capability of the hidden units in RNNs. This often results in complex models with a large number of learnable parameters. In this paper, a compact model is proposed which offers both representation learning and temporal inference of class variables by rolling Restricted Boltzmann Machines (RBMs) and class variables over time. We address the key issue of intractability in this variant of RBMs by optimising a conditional distribution instead of a joint distribution. Experiments reported in the paper on melody modelling and optical character recognition show that the proposed model can outperform the state of the art. The experimental results on optical character recognition, part-of-speech tagging and text chunking also demonstrate that our model is comparable to recurrent neural networks with complex memory gates while requiring far fewer parameters.