Hybrid Improved Document-level Embedding (HIDE)
In recent times, word embeddings have taken on a significant role in sentiment
analysis. As generating word embeddings requires huge corpora, many
applications use pretrained embeddings. In spite of this success, word
embeddings suffer from certain drawbacks: they do not capture the sentiment
information of a word, contextual information in terms of part-of-speech tags,
or domain-specific information. In this work we propose HIDE, a Hybrid Improved
Document-level Embedding, which incorporates domain information, part-of-speech
information, and sentiment information into existing word embeddings such as
GloVe and Word2Vec. It combines the improved word embeddings into document-level
embeddings. Further, Latent Semantic Analysis (LSA) is used to represent
documents as vectors. HIDE is generated by combining LSA with the document-level
embeddings computed from the improved word embeddings. We test HIDE on
six different datasets and show considerable improvement over the accuracy of
existing pretrained word vectors such as GloVe and Word2Vec. We further compare
our work with two existing document-level sentiment analysis approaches. HIDE
performs better than the existing systems.
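The hybrid construction the abstract describes (word vectors combined into a document-level embedding, then joined with an LSA document representation) can be sketched roughly as follows. This is a minimal illustration, not the paper's exact method: the random toy vectors stand in for GloVe/Word2Vec, the averaging step is one common way to build a document embedding from word vectors, and concatenation is an assumed way of combining the two representations.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD

docs = ["the movie was great", "the plot was dull", "the acting was fine"]

# Toy 8-dimensional word vectors as a stand-in for pretrained
# GloVe/Word2Vec embeddings (HIDE would further enrich these with
# sentiment, part-of-speech, and domain information).
rng = np.random.default_rng(0)
vocab = {w for d in docs for w in d.split()}
wv = {w: rng.normal(size=8) for w in vocab}

def doc_embedding(doc):
    """Average the word vectors of a document (one simple
    document-level embedding; the paper's combination may differ)."""
    vecs = [wv[w] for w in doc.split() if w in wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(8)

# LSA: TF-IDF term-document matrix followed by truncated SVD.
tfidf = TfidfVectorizer().fit_transform(docs)
lsa = TruncatedSVD(n_components=2, random_state=0).fit_transform(tfidf)

# Hybrid document representation: LSA vector joined with the
# embedding-based document vector (concatenation assumed here).
emb = np.vstack([doc_embedding(d) for d in docs])
hybrid = np.hstack([lsa, emb])
print(hybrid.shape)  # one row per document, LSA dims + embedding dims
```

A downstream sentiment classifier would then be trained on the `hybrid` matrix rather than on either representation alone.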