A Simple Language Model based on PMI Matrix Approximations
In this study, we introduce a new approach for learning language models by
training them to estimate word-context pointwise mutual information (PMI), and
then deriving the desired conditional probabilities from PMI at test time.
Specifically, we show that with minor modifications to word2vec's algorithm, we
get principled language models that are closely related to the well-established
Noise Contrastive Estimation (NCE) based language models. A compelling aspect
of our approach is that our models are trained with the same simple negative
sampling objective function that is commonly used in word2vec to learn word
embeddings.
Comment: Accepted to EMNLP 201
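The test-time step described above, deriving conditional probabilities from PMI, follows from the definition PMI(w, c) = log(p(w | c) / p(w)), which gives p(w | c) = p(w) exp(PMI(w, c)). The sketch below only illustrates that identity; the function name, the unigram distribution, and the toy numbers are assumptions for illustration, not the paper's implementation. Because a learned PMI estimate is approximate, the sketch renormalises over the vocabulary.

```python
import numpy as np

def conditional_from_pmi(pmi_row, unigram):
    """Turn PMI scores for one context into a distribution p(w | c).

    pmi_row[w] is an (estimated) PMI(w, c) for a fixed context c;
    unigram[w] is p(w). Both names are hypothetical.
    """
    # log p(w | c) = log p(w) + PMI(w, c), up to a normalising constant
    logits = np.log(unigram) + pmi_row
    logits = logits - logits.max()  # subtract the max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()      # renormalise: estimated PMI is only approximate

# Toy check with made-up numbers: exact PMI recovers the true conditional.
unigram = np.array([0.5, 0.3, 0.2])
true_cond = np.array([0.2, 0.2, 0.6])
pmi_row = np.log(true_cond / unigram)
recovered = conditional_from_pmi(pmi_row, unigram)
```

With exact PMI values the renormalisation changes nothing; it matters only when the scores come from a trained model.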
Distributional Inclusion Vector Embedding for Unsupervised Hypernymy Detection
Modeling hypernymy, such as poodle is-a dog, is an important generalization
aid to many NLP tasks, such as entailment, coreference, relation extraction,
and question answering. Supervised learning from labeled hypernym sources, such
as WordNet, limits the coverage of these models, which can be addressed by
learning hypernyms from unlabeled text. Existing unsupervised methods either do
not scale to large vocabularies or yield unacceptably poor accuracy. This paper
introduces distributional inclusion vector embedding (DIVE), a
simple-to-implement unsupervised method of hypernym discovery via per-word
non-negative vector embeddings which preserve the inclusion property of word
contexts in a low-dimensional and interpretable space. In experimental
evaluations more comprehensive than any previous literature of which we are
aware (evaluating on 11 datasets using multiple existing as well as newly
proposed scoring functions), we find that our method provides up to double the
precision of previous unsupervised embeddings, and the highest average
performance, using a much more compact word representation, and yielding many
new state-of-the-art results.
Comment: NAACL 201
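To make the inclusion property concrete: if a hyponym's contexts are largely contained in its hypernym's contexts, then for non-negative vectors an asymmetric overlap score should be high in one direction and lower in the other. The sketch below uses a generic min-overlap ratio as such a score. It is not claimed to be one of the scoring functions evaluated in the paper, and the vectors are made-up toy values.

```python
import numpy as np

def inclusion_score(hypo, hyper):
    """Fraction of the candidate hyponym's mass that the hypernym covers.

    Both inputs are non-negative vectors (hypothetical embeddings).
    Returns a value in [0, 1]; 1.0 means every dimension of `hypo`
    is matched or exceeded by `hyper`.
    """
    hypo = np.asarray(hypo, dtype=float)
    hyper = np.asarray(hyper, dtype=float)
    total = hypo.sum()
    if total == 0.0:
        return 0.0  # an all-zero vector carries no evidence either way
    return float(np.minimum(hypo, hyper).sum() / total)

# Toy vectors (assumed values): "dog" covers every dimension of "poodle".
poodle = np.array([0.0, 2.0, 1.0, 0.0])
dog = np.array([1.0, 3.0, 2.0, 0.5])

forward = inclusion_score(poodle, dog)   # poodle is-a dog
backward = inclusion_score(dog, poodle)  # dog is-a poodle
```

The asymmetry between `forward` and `backward` is what lets a score of this kind indicate the direction of the is-a relation.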
ExplaiNE: An Approach for Explaining Network Embedding-based Link Predictions
Networks are powerful data structures, but are challenging to work with for
conventional machine learning methods. Network Embedding (NE) methods attempt
to resolve this by learning vector representations for the nodes, for
subsequent use in downstream machine learning tasks.
Link Prediction (LP) is one such downstream machine learning task that is an
important use case and popular benchmark for NE methods. Unfortunately, while
NE methods perform exceedingly well at this task, they are lacking in
transparency as compared to simpler LP approaches.
We introduce ExplaiNE, an approach to offer counterfactual explanations for
NE-based LP methods, by identifying existing links in the network that explain
the predicted links. ExplaiNE is applicable to a broad class of NE algorithms.
An extensive empirical evaluation for the NE method 'Conditional Network
Embedding' in particular demonstrates its accuracy and scalability
- …
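The idea of explaining a predicted link through existing links can be illustrated with a brute-force counterfactual: remove each existing link in turn and measure how much the score of the predicted link drops. The sketch below is a stand-in under stated assumptions, not ExplaiNE itself. It uses a simple common-neighbours score rather than a network embedding, and the function names and toy graph are hypothetical.

```python
def build_adjacency(edges, nodes=()):
    """Undirected adjacency sets from an edge list."""
    adj = {n: set() for n in nodes}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def common_neighbors(adj, i, j):
    """A simple link-prediction score (assumed here in place of an NE model)."""
    return len(adj[i] & adj[j])

def explain_link(edges, i, j, score=common_neighbors):
    """Rank existing edges by how much removing each one lowers score(i, j)."""
    base = score(build_adjacency(edges, (i, j)), i, j)
    drops = []
    for k, edge in enumerate(edges):
        reduced = edges[:k] + edges[k + 1:]  # counterfactual graph without this edge
        drop = base - score(build_adjacency(reduced, (i, j)), i, j)
        drops.append((drop, edge))
    # Largest drop first; sort is stable, so ties keep edge-list order.
    drops.sort(key=lambda item: item[0], reverse=True)
    return drops

# Toy graph (made up): the predicted link a-b rests on the shared neighbour c.
edges = [("a", "c"), ("b", "c"), ("a", "d"), ("d", "e")]
ranking = explain_link(edges, "a", "b")
```

Recomputing the score once per edge is only practical on small graphs; it is used here for clarity, not as a claim about how the approach achieves scalability.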