Substitute Based SCODE Word Embeddings in Supervised NLP Tasks
We analyze a word embedding method in supervised tasks. It maps words onto a
sphere such that words co-occurring in similar contexts lie close together.
The similarity of contexts is measured by the distributions of substitutes
that can fill them. We compare these word embeddings with others, including
more recent representations, in Named Entity Recognition (NER), Chunking, and
Dependency Parsing, and we also examine our framework in multilingual
dependency parsing. The results show that the proposed method performs as
well as or better than the other word embeddings in the tasks we investigate,
and it achieves state-of-the-art results in multilingual dependency parsing.
Word embeddings in 7 languages are available for public use.
Comment: 11 pages
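The geometric idea is easy to illustrate: once word vectors live on the unit
sphere, context similarity reduces to a dot product. The Python sketch below
shows only that geometry, not the authors' SCODE training procedure; the
vocabulary, dimensionality, and random vectors are hypothetical stand-ins for
embeddings that would actually be trained on word-substitute co-occurrence
pairs.

    import numpy as np

    # Hypothetical stand-ins for trained embeddings; in the method described
    # above these would come from SCODE training on word-substitute pairs.
    rng = np.random.default_rng(0)
    vocab = ["bank", "river", "money", "loan", "water"]
    dim = 8
    vectors = rng.normal(size=(len(vocab), dim))

    # Project every vector onto the unit sphere, so cosine similarity is a
    # plain dot product and "close on the sphere" reflects similar contexts.
    vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

    def nearest(word, k=2):
        """Return the k words whose sphere points lie closest to `word`."""
        i = vocab.index(word)
        sims = vectors @ vectors[i]  # cosine similarities (unit-norm rows)
        order = np.argsort(-sims)
        return [vocab[j] for j in order if j != i][:k]

    print(nearest("bank"))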
The Role of Context Types and Dimensionality in Learning Word Embeddings
We provide the first extensive evaluation of how using different types of
context to learn skip-gram word embeddings affects performance on a wide range
of intrinsic and extrinsic NLP tasks. Our results suggest that while intrinsic
tasks tend to exhibit a clear preference for particular types of contexts and
higher dimensionality, more careful tuning is required to find the optimal
settings for most of the extrinsic tasks that we considered. Furthermore, for
these extrinsic tasks, we find that once the benefit from increasing the
embedding dimensionality is mostly exhausted, simple concatenation of word
embeddings, learned with different context types, can yield further performance
gains. As an additional contribution, we propose a new variant of the skip-gram
model that learns word embeddings from weighted contexts of substitute words.
Comment: Accepted to NAACL 2016
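The concatenation finding is simple enough to sketch. The snippet below is a
minimal illustration under assumed names: `window_emb` and `dependency_emb`
are hypothetical pre-trained matrices standing in for embeddings learned with
two different context types, here filled with random values.

    import numpy as np

    # Hypothetical pre-trained embeddings learned with two different context
    # types (e.g. window-based vs. dependency-based); random stand-ins here.
    rng = np.random.default_rng(42)
    vocab_size, dim = 1000, 100
    window_emb = rng.normal(size=(vocab_size, dim))
    dependency_emb = rng.normal(size=(vocab_size, dim))

    # Once raising a single embedding's dimensionality stops helping,
    # concatenating the two per-word vectors can yield further gains on
    # extrinsic tasks, at the cost of doubling the input dimensionality.
    combined = np.concatenate([window_emb, dependency_emb], axis=1)
    assert combined.shape == (vocab_size, 2 * dim)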