3,307 research outputs found

    Context-Aware Sentiment Analysis using Tweet Expansion Method

    The vast information space produced by social media platforms in general, and microblogging in particular, has spawned a slew of new applications and prompted the rise and expansion of sentiment analysis research. We propose a sentiment analysis technique that identifies the main parts describing tweet intent and enriches them with relevant words, phrases, or even inferred variables. We follow a state-of-the-art hybrid deep learning model that combines a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network to classify tweets by polarity. To preserve the latent relationships between tweet terms and their expanded representation, sentence encoding and contextualized word embeddings are used. To investigate the performance of tweet embeddings on the sentiment analysis task, we tested several context-free models (Word2Vec, Sentence2Vec, GloVe, and FastText), a dynamic embedding model (BERT), deep contextualized word representations (ELMo), and an entity-based model (Wikipedia). The results show that text enrichment notably improves the accuracy of sentiment polarity classification.

    DICE: Deep intelligent contextual embedding for twitter sentiment analysis

    © 2019 IEEE. Sentiment analysis of short social media text (e.g., Twitter messages) is valuable for many reasons and is increasingly explored in communities such as text analysis, social media analysis, and recommendation. However, it is challenging because tweet-like social media text is often short, informal and noisy, and involves language ambiguity such as polysemy. Existing sentiment analysis approaches mainly target documents and clean textual data. Accordingly, we propose a Deep Intelligent Contextual Embedding (DICE), which enhances tweet quality by handling noise within contexts and then integrates four embeddings to capture polysemy in context, semantics, syntax, and sentiment knowledge of the words in a tweet. DICE is then fed to a Bi-directional Long Short-Term Memory (BiLSTM) network with attention to determine the sentiment of a tweet. The experimental results show that our model outperforms several baselines, including both classic classifiers and combinations of various word embedding models, in the sentiment analysis of airline-related tweets.

    Hybrid Words Representation for the classification of low quality text

    University of Technology Sydney, Faculty of Engineering and Information Technology. Language enables humans to communicate with others. For instance, we talk and give our opinions and suggestions using natural language; to be more precise, we use words while communicating with others. Today, however, we also wish to communicate with computers as we do with humans. This is not an easy task, because humans communicate in an unstructured and informal way, whereas computers need structured and clean data. It is therefore essential for computers to understand and classify text accurately for proper human-computer interaction. For classifying a text, the first question we must address is how to improve low-quality text. The next immediate challenge is to find the best representation so that text can be classified accurately. The way text is organized reflects the polysemy and the semantic and syntactic coupling relationships embedded in its contents. Capturing such content relationships effectively is therefore crucial for a better understanding of text representations. This is especially challenging in environments where text messages are short, informal and noisy, and involve natural language ambiguities. Existing sentiment classification methods mainly target documents and clean textual data, and cannot capture the relationships, attributes and characteristics within tweet messages. Social media analysis, especially the analysis of tweet messages on Twitter, has become increasingly relevant since a significant portion of data is ubiquitous in nature. Social media-based short text is valuable for many reasons and is increasingly explored in text analysis, social media analysis and recommendation. At the same time, a number of challenges need to be addressed in this space.
One of the main issues is that traditional word embeddings are unable to capture polysemy (they assign the same representation to a word irrespective of its context and meaning) or out-of-vocabulary words (they assign a random representation). Furthermore, traditional word embeddings fail to capture the sentiment information of words, so words with opposite polarities can end up with similar vector representations. Ignoring polysemy within the context and the sentiment polarity of words in a tweet therefore reduces tweet classification performance. To address these research challenges and the limitations of word-level representations, this thesis focuses on improving the representation of low-quality text: it improves the unstructured and informal nature of tweets so that their information can be used thoroughly, and it manages natural language ambiguities to build a more robust sentiment classification model. Compared with previous studies, the proposed models can deal with the ubiquitous nature of short text and with polysemy, semantic and syntactic relationships within the content, thereby addressing natural language ambiguity problems. Chapter 4 presents the effects of pre-processing techniques using two different word representation models with machine learning and deep learning classifiers. We then present our recommended combination of pre-processing techniques, which improves low-quality text by performing sentiment-aware tokenization, spelling correction, word segmentation and other techniques to use most of the information hidden in unstructured text. The experimental results show that the proposed combination performs well compared with other combinations. Chapter 5 presents the hybrid words representation. In this chapter, we propose our Deep Intelligent Contextual Embedding for Twitter sentiment analysis.
The proposed model addresses natural language ambiguities and is devised to capture polysemy in context, semantics, syntax and sentiment knowledge of words. A Bi-directional Long Short-Term Memory network with attention is employed to determine the sentiment. We evaluate the proposed model through quantitative and qualitative analysis. The experimental results show that the proposed model outperforms various word embedding models in the sentiment analysis of tweets. The above-mentioned methods can be applied to any social media classification task. The performance of the proposed models is compared with that of different models; the comparison supports the effectiveness of the proposed models and bounds the information loss in their generated high-quality representations.

    A Study on Sentiment Analysis on Airline Quality Services: A Conceptual Paper

    Service quality is crucial for airlines to remain competitive in the industry. The quality of an airline's services must meet customer expectations across all aspects of the overall service experience. The level of service quality may affect satisfaction and loyalty, which in turn may influence customer sentiment. Given the importance of airline service quality, customer sentiment towards the service must be investigated, and one way to analyze it is sentiment analysis. Sentiment analysis is now the tool of choice for analyzing comments or reviews of these services, which may be positive, negative, or neutral. Sentiment analysis not only helps potential customers view the overall sentiment expressed, but also lets organizations use the findings to become more competitive. Thus, this paper reviews several recent works on sentiment analysis as a tool for helping organizations assess the quality of services in the airline industry. As a result, a new framework for assessing quality of service for organizations, especially airline companies, will be proposed.

    Sentiment analysis in the stock market based on Twitter data

    In this dissertation, we discuss how Twitter can help detect public sentiment towards companies listed on the stock market, in particular those in the S&P 500 index (S&P 500). Data is collected through a web scraper that gathers tweets from Twitter, using advanced search features based on queries related to the companies under scrutiny. The content of the tweets is classified as positive, neutral or negative, and the outcome is then compared against stock market prices. To do so, a framework with different Sentiment Analysis (SA) models and Machine Learning (ML) techniques is proposed and implemented. Also, to establish which models are more appropriate for detecting and classifying sentiment, a series of visual representations was created to evaluate and compare results. The results show that an increase in the volume of tweets leads to oscillations in both stock price and trading volume. Furthermore, the data analysis performed for some of the companies in scope shows that using moving averages of sentiment scores makes the analysis clearer and more insightful, which is particularly useful when measuring the strength or weakness of a stock's price. In the end, it can be perceived as a momentum indicator for the stock market.
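The moving-average smoothing of sentiment scores mentioned in this abstract can be illustrated with a minimal sketch. The dissertation's own implementation is not shown in this listing; the function name `moving_average`, the trailing-window choice, and the sample scores below are assumptions made purely for illustration.

```python
def moving_average(scores, window):
    """Trailing simple moving average of a sequence of sentiment scores.

    Hypothetical helper: for position i it averages the last `window`
    scores up to and including i (fewer at the start of the series).
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    for i in range(len(scores)):
        start = max(0, i - window + 1)   # clamp the window at the series start
        chunk = scores[start:i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed


# Illustrative daily scores: +1 = positive, -1 = negative (made-up values).
daily_scores = [1, -1, 1, 1]
print(moving_average(daily_scores, window=2))
```

A longer window gives a smoother curve at the cost of reacting more slowly to sentiment shifts, which is the usual trade-off when such a series is read as a momentum-style indicator.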

    Improving sentiment classification using a RoBERTa-based hybrid model

    Introduction: Several attempts have been made to enhance the performance of text-based sentiment analysis. Classifiers and word embedding models have been among the most prominent of these attempts. This work aims to develop a hybrid deep learning approach that combines the advantages of transformer models and sequence models while eliminating the shortcomings of sequence models. Methods: In this paper, we present a hybrid model based on the transformer model and deep learning models to enhance the sentiment classification process. Robustly optimized BERT (RoBERTa) was selected to produce the representative vectors of the input sentences, and the Long Short-Term Memory (LSTM) model in conjunction with the Convolutional Neural Network (CNN) model was used to improve the suggested model's ability to comprehend the semantics and context of each input sentence. We tested the proposed model on two datasets with different topics. The first dataset is a Twitter review of US airlines and the second is the IMDb movie reviews dataset. We propose using word embeddings in conjunction with the SMOTE technique to overcome the challenge of imbalanced classes in the Twitter dataset. Results: With an accuracy of 96.28% on the IMDb reviews dataset and 94.2% on the Twitter reviews dataset, the suggested hybrid model outperforms the standard methods. Discussion: It is clear from these results that the proposed hybrid RoBERTa-(CNN + LSTM) method is an effective model for sentiment classification.
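The SMOTE technique named in this abstract oversamples a minority class by interpolating between a minority sample and one of its nearest minority neighbours. The sketch below is a simplified, dependency-free illustration of that idea on small numeric vectors; it is not the paper's pipeline. The function name `smote_like`, the parameters `k` and `seed`, and the toy vectors are all assumptions.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points from minority-class vectors.

    Simplified SMOTE-style sketch (hypothetical helper): pick a minority
    point, pick one of its k nearest minority neighbours, and return a
    random point on the line segment between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


# Toy 2-D "embeddings" for an under-represented class (made-up values).
minority_vectors = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
new_points = smote_like(minority_vectors, n_new=5)
print(len(new_points))
```

In practice a maintained implementation, such as the one in the imbalanced-learn package, would normally be preferred over a hand-written version like this.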

    Enhancing Mental Health Awareness through Twitter Analysis: A Comparative Study of Machine Learning and Hybrid Deep Learning Techniques

    This study explores the utilization of social media data, specifically tweets and comments, for gaining insights into individuals' mental health conditions. The objective is to enhance mental health awareness and enable early detection and intervention. Twitter data is collected using depression-related keywords, and two models are employed: a Random Forest model with TF-IDF and a hybrid CNN-LSTM model incorporating word2vec. The performance of the CNN-LSTM model surpasses that of the Random Forest model, achieving an accuracy rate of 89.4%. Furthermore, a user interface is developed to analyze users' Twitter profiles based on their tweets, allowing for potential intervention through automated reply messages. By harnessing social media data and advanced machine learning techniques, this research contributes to improving mental health awareness and timely addressing of mental health concerns

    Inquest of Current Situation in Afghanistan Under Taliban Rule Using Sentiment Analysis and Volume Analysis

    Microblogging websites and social media platforms serve as a potential source for mining public opinions and sentiments on a variety of subjects, including the prevailing situations in war-afflicted countries. In particular, Twitter has a large number of geotagged tweets that make it possible to analyze sentiment across time and space. This study performs volume analysis and sentiment analysis using LDA (Latent Dirichlet Allocation) and text mining over two datasets collected for different periods. To increase the adequacy and efficacy of the sentiment analysis, a hybrid feature engineering approach is proposed that improves the performance of machine learning models. Geotagged tweets are used for volume analysis, which indicates that the highest numbers of tweets originate from India, the US, the UK, Pakistan, and Afghanistan. Analysis of positive and negative tweets reveals that negative tweets mostly originate from India and the US, whereas positive tweets come from Pakistan and Afghanistan. LDA is used for topic modeling on two datasets containing tweets about the situation after the Taliban took control of Afghanistan. Topics extracted through LDA suggest that the majority of the Afghan people seem satisfied with the Taliban's takeover, while the topics from negative tweets are related to US concerns in Afghanistan. Sentiment analysis over the two datasets indicates that sentiment shifted in a positive direction over three weeks.