A Comparative Study of Word Embeddings for Reading Comprehension
The focus of past machine learning research for Reading Comprehension tasks
has been primarily on the design of novel deep learning architectures. Here we
show that seemingly minor choices made in (1) the use of pre-trained word
embeddings and (2) the representation of out-of-vocabulary tokens at test
time can have a larger impact on final performance than architectural
choices. We systematically explore several options for these choices,
and provide recommendations to researchers working in this area.
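To make the second choice concrete, here is a minimal sketch (an illustration, not the paper's code, with a toy stand-in vocabulary) of two common test-time OOV strategies: one shared UNK vector versus a distinct random-but-fixed vector per unseen token.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 50
# toy stand-in for a pre-trained vocabulary such as GloVe (assumption)
vocab = {"the": rng.normal(size=dim), "cat": rng.normal(size=dim)}

unk = np.zeros(dim)   # strategy 1: every OOV token shares one UNK vector
oov_cache = {}        # strategy 2: each OOV token gets its own fixed random vector

def embed(token, strategy="shared_unk"):
    if token in vocab:
        return vocab[token]
    if strategy == "shared_unk":
        return unk
    if token not in oov_cache:  # cached so repeated mentions stay consistent
        oov_cache[token] = rng.normal(scale=0.1, size=dim)
    return oov_cache[token]

# the same unseen token maps to the same vector under strategy 2
v1 = embed("zyzzyva", strategy="per_token_random")
v2 = embed("zyzzyva", strategy="per_token_random")
print(np.allclose(v1, v2))  # True
```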
Explicit Utilization of General Knowledge in Machine Reading Comprehension
To bridge the gap between Machine Reading Comprehension (MRC) models and
human beings, which is mainly reflected in MRC models' hunger for data and
lack of robustness to noise, in this paper we explore how to integrate the
neural networks of MRC models with the general knowledge of human beings. On the one
hand, we propose a data enrichment method, which uses WordNet to extract
inter-word semantic connections as general knowledge from each given
passage-question pair. On the other hand, we propose an end-to-end MRC model
named Knowledge Aided Reader (KAR), which explicitly uses the extracted
general knowledge to assist its attention mechanisms. Based on the data
enrichment method, KAR is comparable in performance to the state-of-the-art
MRC models, and is significantly more robust to noise than they are. When only
a subset (20%-80%) of the training examples is available, KAR outperforms the
state-of-the-art MRC models by a large margin, and is still reasonably robust
to noise.
Comment: ACL 201
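A rough sketch of the kind of WordNet-based connection mining the abstract describes (the linking rule here is an assumption, not KAR's exact extraction procedure): two words are treated as connected if their synsets, or their synsets' hypernyms, overlap.

```python
# requires a one-time nltk.download("wordnet")
from nltk.corpus import wordnet as wn

def synsets_and_hypernyms(word):
    related = set()
    for s in wn.synsets(word):
        related.add(s)
        related.update(s.hypernyms())
    return related

def connected(w1, w2):
    # "connected" if the two words' semantic neighborhoods intersect
    return bool(synsets_and_hypernyms(w1) & synsets_and_hypernyms(w2))

passage_words = ["dog", "barked", "loudly"]
question_words = ["canine", "sound"]
pairs = [(p, q) for p in passage_words for q in question_words if connected(p, q)]
print(pairs)  # e.g. [('dog', 'canine')]
```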
Learning to Compute Word Embeddings On the Fly
Words in natural language follow a Zipfian distribution whereby some words
are frequent but most are rare. Learning representations for words in the "long
tail" of this distribution requires enormous amounts of data. Representations
of rare words trained directly on end tasks are usually poor, requiring us to
pre-train embeddings on external data, or treat all rare words as
out-of-vocabulary words with a unique representation. We provide a method for
predicting embeddings of rare words on the fly from small amounts of auxiliary
data with a network trained end-to-end for the downstream task. We show that
this approach improves results over baselines in which embeddings are trained
directly on the end task, for reading comprehension, recognizing textual
entailment, and language modeling.
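A minimal sketch of the idea (the pooling-plus-projection architecture here is an assumption, not the paper's network): encode a rare word's auxiliary text, e.g. a dictionary definition, into an embedding that can be trained end-to-end with the task.

```python
import torch
import torch.nn as nn

class DefinitionEncoder(nn.Module):
    def __init__(self, vocab_size, dim):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, dim)  # embeddings of common words
        self.proj = nn.Linear(dim, dim)           # maps pooled definition to a word vector

    def forward(self, definition_ids):            # (batch, def_len)
        pooled = self.emb(definition_ids).mean(dim=1)  # simple mean pooling
        return self.proj(pooled)                  # predicted embedding for the rare word

enc = DefinitionEncoder(vocab_size=10_000, dim=100)
definition = torch.randint(0, 10_000, (1, 12))    # toy token ids for a definition
rare_word_vec = enc(definition)                   # used wherever the rare word appears
print(rare_word_vec.shape)                        # torch.Size([1, 100])
```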
Comparing Attention-based Convolutional and Recurrent Neural Networks: Success and Limitations in Machine Reading Comprehension
We propose a machine reading comprehension model based on the
compare-aggregate framework with two-staged attention that achieves
state-of-the-art results on the MovieQA question answering dataset. To
investigate the limitations of our model as well as the behavioral difference
between convolutional and recurrent neural networks, we generate adversarial
examples to confuse the model and compare its performance to that of humans.
Furthermore, we assess the generalizability of our model by analyzing its
differences from human inference.
Comment: CoNLL 201
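A schematic sketch of the compare-aggregate pattern named in the abstract (not the authors' two-staged model; shapes and layer choices are assumptions): attend from context to question, compare element-wise, then aggregate with a convolution.

```python
import torch
import torch.nn.functional as F

def compare_aggregate(question, context):
    # question: (q_len, dim), context: (c_len, dim)
    attn = F.softmax(context @ question.T, dim=-1)  # (c_len, q_len) attention weights
    aligned = attn @ question                       # question summary per context token
    compared = context * aligned                    # element-wise comparison
    # aggregate over the sequence; a real model would define this layer once
    conv = torch.nn.Conv1d(compared.size(1), 64, kernel_size=3, padding=1)
    features = conv(compared.T.unsqueeze(0))        # (1, 64, c_len)
    return features.max(dim=-1).values              # (1, 64) pooled representation

q, c = torch.randn(8, 100), torch.randn(30, 100)
print(compare_aggregate(q, c).shape)  # torch.Size([1, 64])
```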
Comparative Study of Machine Learning Models and BERT on SQuAD
This study aims to provide a comparative analysis of the performance of
several popular machine learning models and the BERT model on the Stanford
Question Answering Dataset (SQuAD). The analysis shows that the BERT model,
which was once state-of-the-art on SQuAD, gives higher accuracy than the other
models. However, BERT requires far more execution time, even when only 100
samples are used, which shows that the gain in accuracy comes at the cost of
more time invested in training. The preliminary machine learning models, by
contrast, have lower execution time on the full data, but their accuracy is
compromised.
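A toy sketch of the accuracy-versus-training-time measurement the study describes (synthetic data and a simple classifier as stand-ins; the study's models and dataset are not reproduced here):

```python
import time
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=100, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for n in (100, len(X_tr)):                # small subsample vs. full training data
    clf = LogisticRegression(max_iter=1000)
    t0 = time.perf_counter()
    clf.fit(X_tr[:n], y_tr[:n])
    elapsed = time.perf_counter() - t0
    print(f"n={n:5d}  time={elapsed:.3f}s  acc={clf.score(X_te, y_te):.3f}")
```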
A Deterministic Algorithm for Bridging Anaphora Resolution
Previous work on bridging anaphora resolution (Poesio et al., 2004; Hou et
al., 2013b) uses syntactic preposition patterns to calculate word relatedness.
However, such patterns only consider NPs' head nouns and hence do not fully
capture the semantics of NPs. Recently, Hou (2018) created word embeddings
(embeddings_PP) to capture associative similarity (i.e., relatedness) between
nouns by exploiting the syntactic structure of noun phrases, but embeddings_PP
only contains word representations for nouns. In this paper, we create new
word vectors by combining embeddings_PP with GloVe. The resulting word
embeddings (embeddings_bridging) are a more general lexical knowledge resource
for bridging and allow us to easily represent the meaning of an NP beyond its
head. We therefore develop a deterministic approach to bridging anaphora
resolution which represents the semantics of an NP based on its head noun and
modifiers. We show that this simple approach achieves results competitive with
those of the best system in Hou et al. (2013b), which uses Markov Logic
Networks to model the problem. Additionally, we further improve the results
for bridging anaphora resolution reported in Hou (2018) by combining our
simple deterministic approach with Hou et al. (2013b)'s best system, MLN II.
Comment: 11 pages
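A small sketch of the two ideas (random stand-in vectors, not the actual GloVe or embeddings_PP resources): combine two embedding spaces by concatenation, then represent an NP by its head noun plus its modifiers and score candidates with cosine similarity.

```python
import numpy as np

rng = np.random.default_rng(1)
words = ("door", "house", "wooden")
glove = {w: rng.normal(size=100) for w in words}    # stand-in for GloVe
emb_pp = {w: rng.normal(size=100) for w in words}   # stand-in for embeddings_PP

def combined(word):
    # embeddings_bridging-style vector: concatenation of the two spaces
    return np.concatenate([glove[word], emb_pp[word]])

def np_vector(head, modifiers):
    vecs = [combined(head)] + [combined(m) for m in modifiers]
    return np.mean(vecs, axis=0)  # head plus modifiers, not just the head

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

anaphor = np_vector("door", ["wooden"])
antecedent = np_vector("house", [])
print(round(float(cosine(anaphor, antecedent)), 3))
```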
Recent Advances in Natural Language Inference: A Survey of Benchmarks, Resources, and Approaches
In the NLP community, recent years have seen a surge of research activity
addressing machines' ability to perform deep language understanding, which
goes beyond what is explicitly stated in text and instead relies on reasoning
and knowledge of the world. Many benchmark tasks and datasets have been created to
support the development and evaluation of such natural language inference
ability. As these benchmarks become instrumental and a driving force for the
NLP research community, this paper aims to provide an overview of recent
benchmarks, relevant knowledge resources, and state-of-the-art learning and
inference approaches in order to support a better understanding of this growing
field.
The emergent algebraic structure of RNNs and embeddings in NLP
We examine the algebraic and geometric properties of a uni-directional GRU
and word embeddings trained end-to-end on a text classification task. A
hyperparameter search over word embedding dimension, GRU hidden dimension, and
a linear combination of the GRU outputs is performed. We conclude that words
naturally embed themselves in a Lie group and that RNNs form a nonlinear
representation of the group. Appealing to these results, we propose a novel
class of recurrent-like neural networks and a word embedding scheme.
Comment: 24 pages, 16 figures
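A bare-bones sketch of the system studied (dimensions and pooling are assumptions, not the paper's exact setup): word embeddings plus a uni-directional GRU trained end-to-end for text classification, with a linear readout of the GRU outputs.

```python
import torch
import torch.nn as nn

class GRUClassifier(nn.Module):
    def __init__(self, vocab_size=5000, emb_dim=64, hid_dim=128, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, n_classes)  # linear combination of GRU outputs

    def forward(self, token_ids):                 # (batch, seq_len)
        outputs, _ = self.gru(self.emb(token_ids))
        return self.out(outputs.mean(dim=1))      # pool over time, then classify

model = GRUClassifier()
logits = model(torch.randint(0, 5000, (4, 20)))
print(logits.shape)  # torch.Size([4, 2])
```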
Doc2Im: document to image conversion through self-attentive embedding
Text classification is a fundamental task in NLP applications. Recent
research in this field has largely been divided into two sub-fields: learning
representations, and learning deeper models, both sequential and
convolutional, which in turn connect back to the representation. We posit
that the stronger the representation, the simpler the classifier model needed
to achieve high performance. In this
paper we propose a completely novel direction for text classification research,
wherein we convert text to a representation very similar to images, such that
any deep network able to handle images is equally able to handle text. We take
a deeper look at the representation of documents as an image and subsequently
utilize very simple convolution-based models taken as-is from the computer
vision domain. This image can be cropped, re-scaled, re-sampled, and augmented
just like any other image, and it works with most state-of-the-art large
convolution-based models designed to handle large image datasets. We show
impressive results on some of the latest benchmarks in the
related fields. We perform transfer learning experiments, both from text to
text domain and also from image to text domain. We believe this is a paradigm
shift from the way document understanding and text classification have
traditionally been done, one that will drive numerous novel research ideas in
the community.
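A speculative sketch of the core conversion (one reading of the idea, not the authors' method; grid size and layout are assumptions): stack token embeddings into a fixed 2-D grid so a document becomes an "image" any standard CNN can consume.

```python
import torch
import torch.nn as nn

def doc_to_image(token_ids, emb, side=32):
    vecs = emb(token_ids)              # (seq_len, dim)
    flat = vecs.flatten()
    canvas = torch.zeros(side * side)
    n = min(flat.numel(), canvas.numel())
    canvas[:n] = flat[:n]              # crop/pad to a fixed square
    return canvas.view(1, side, side)  # 1-channel "image"

emb = nn.Embedding(1000, 16)
img = doc_to_image(torch.randint(0, 1000, (50,)), emb)
cnn = nn.Conv2d(1, 8, kernel_size=3)   # any vision model could go here
print(cnn(img.unsqueeze(0)).shape)     # torch.Size([1, 8, 30, 30])
```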
Semi-Supervised Few-Shot Learning for Dual Question-Answer Extraction
This paper addresses the problem of key phrase extraction from sentences.
Existing state-of-the-art supervised methods require large amounts of annotated
data to achieve good performance and generalization. Collecting labeled data
is, however, often expensive. In this paper, we redefine the problem as
question-answer extraction, and present SAMIE: Self-Asking Model for
Information Extraction, a semi-supervised model which dually learns to ask and
to answer questions by itself. Briefly, given a sentence and an answer, the
model needs to choose the most appropriate question; meanwhile, for the given
sentence and the same question selected in the previous step, the model will
predict an answer. The model can support few-shot
learning with very limited supervision. It can also be used to perform
clustering analysis when no supervision is provided. Experimental results show
that the proposed method outperforms typical supervised methods, especially
when given little labeled data.
Comment: 7 pages, 5 figures, submission to IJCAI1
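A schematic sketch of the dual loop (the dot-product scoring functions and candidate sets are invented for illustration, not SAMIE itself): pick the best question for a (sentence, answer) pair, then predict an answer for that same (sentence, question) pair.

```python
import torch

torch.manual_seed(0)
dim, n_questions, n_answers = 32, 5, 7
sent = torch.randn(dim)
questions = torch.randn(n_questions, dim)  # candidate question templates
answers = torch.randn(n_answers, dim)      # candidate answer spans

def ask(sentence, answer):
    # choose the question that best fits the (sentence, answer) pair
    scores = questions @ (sentence + answer)
    return scores.argmax().item()

def answer_q(sentence, q_idx):
    # predict an answer for the (sentence, question) pair
    scores = answers @ (sentence + questions[q_idx])
    return scores.argmax().item()

q = ask(sent, answers[0])
a_hat = answer_q(sent, q)  # ideally recovers the original answer's index
print(q, a_hat)
```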