Linking GloVe with word2vec
The Global Vectors for word representation (GloVe) model, introduced by Jeffrey
Pennington et al., is reported to be an efficient and effective method for
learning vector representations of words. State-of-the-art performance is also
provided by skip-gram with negative sampling (SGNS), implemented in the
word2vec tool. In this note, we explain the similarities between the training
objectives of the two models, and show that the objective of SGNS is similar to
the objective of a specialized form of GloVe, though their cost functions are
defined differently.
Comment: 5 pages, 2 figures
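For context, the two objectives as they are standardly written in the literature (generic notation; these formulas are not quoted from the note itself). GloVe minimizes a weighted least-squares cost over the co-occurrence counts $X_{ij}$, with weighting function $f$:

$$ J_{\text{GloVe}} = \sum_{i,j=1}^{V} f(X_{ij})\left( w_i^{\top}\tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2 $$

SGNS maximizes, for each observed word-context pair $(w, c)$, the log-probability of the pair against $k$ negative samples drawn from a noise distribution $P_n$:

$$ \ell_{\text{SGNS}} = \sum_{(w,c)} \left[ \log \sigma\!\left(v_w^{\top} v_c\right) + k\, \mathbb{E}_{c_N \sim P_n}\!\left[ \log \sigma\!\left(-v_w^{\top} v_{c_N}\right) \right] \right] $$

Both objectives push the inner product of word and context vectors toward a function of their co-occurrence statistics, which is the common ground the note builds on.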
Corpus specificity in LSA and Word2vec: the role of out-of-domain documents
Latent Semantic Analysis (LSA) and Word2vec are among the most widely used
word-embedding techniques. Despite their popularity, the precise mechanisms by
which they acquire new semantic relations between words remain unclear. In the
present article we investigate whether the capacity of LSA and Word2vec to
identify relevant semantic dimensions increases with the size of the corpus.
One intuitive hypothesis is that this capacity should increase as the amount
of data increases. However, if the corpus grows in topics that are not
specific to the domain of interest, the signal-to-noise ratio may weaken. Here
we set out to examine and distinguish these alternative hypotheses. To
investigate the effect of corpus specificity and size on word embeddings, we
study two ways of progressively eliminating documents: eliminating documents
at random vs. eliminating documents unrelated to a specific task. We show that
Word2vec can take advantage of all the documents, obtaining its best
performance when it is trained on the whole corpus. On the contrary,
specialization of the training corpus (the removal of out-of-domain
documents), accompanied by a decrease in dimensionality, can improve LSA
word-representation quality while speeding up processing time. Furthermore, we
show that specialization without the decrease in LSA dimensionality can
produce a strong performance reduction on specific tasks. From a
cognitive-modeling point of view, we note that LSA's word-knowledge
acquisition may not efficiently exploit higher-order co-occurrences and global
relations, whereas Word2vec does.
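A minimal sketch of the ablation protocol described in the abstract, under stated assumptions: the toy documents, the hypothetical `in_domain` labels, and the dimensionalities are placeholders for illustration, not the paper's actual data or settings.

```python
"""Illustrative sketch of random vs. out-of-domain document elimination."""
import random

from gensim.models import Word2Vec
from sklearn.decomposition import TruncatedSVD
from sklearn.feature_extraction.text import TfidfVectorizer


def ablate(docs, keep_fraction, in_domain=None):
    """Shrink the corpus two ways: random elimination (in_domain=None),
    or specialization, which removes only out-of-domain documents."""
    n_keep = int(len(docs) * keep_fraction)
    if in_domain is None:
        return random.sample(docs, n_keep)
    kept = [d for d, rel in zip(docs, in_domain) if rel]
    pool = [d for d, rel in zip(docs, in_domain) if not rel]
    extra = random.sample(pool, min(max(n_keep - len(kept), 0), len(pool)))
    return kept + extra


# Toy corpus; a real experiment would use thousands of documents.
docs = [["cardiac", "arrest", "treatment"],
        ["stock", "market", "crash"],
        ["heart", "disease", "therapy"],
        ["football", "match", "score"]]
in_domain = [True, False, True, False]  # hypothetical domain labels

# Word2vec: per the abstract, it benefits from the *whole* corpus.
w2v = Word2Vec(docs, vector_size=50, window=2, min_count=1,
               sg=1, negative=5, epochs=20)

# LSA: per the abstract, specialization plus *lower* dimensionality helps.
specialized = ablate(docs, 0.5, in_domain)
tfidf = TfidfVectorizer().fit_transform(" ".join(d) for d in specialized)
lsa = TruncatedSVD(n_components=2).fit(tfidf)
print(len(w2v.wv), lsa.explained_variance_ratio_)
```

In a full experiment one would sweep `keep_fraction`, train both models at each step, and score them on the downstream semantic task; the sketch only shows the corpus-reduction logic and the two training pipelines.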
