34,240 research outputs found
Learning Word Representations with Hierarchical Sparse Coding
We propose a new method for learning word representations using hierarchical
regularization in sparse coding inspired by the linguistic study of word
meanings. We show an efficient learning algorithm based on stochastic proximal
methods that is significantly faster than previous approaches, making it
possible to perform hierarchical sparse coding on a corpus of billions of word
tokens. Experiments on various benchmark tasks---word similarity ranking,
analogies, sentence completion, and sentiment analysis---demonstrate that the
method outperforms or is competitive with state-of-the-art methods. Our word
representations are available at
\url{http://www.ark.cs.cmu.edu/dyogatam/wordvecs/}
- …