149 research outputs found

    User Emotion Identification in Twitter Using Specific Features: Hashtag, Emoji, Emoticon, and Adjective Term

    Twitter is a social media application whose content can provide signals for identifying user emotion. Identifying user emotion is useful in the commercial, health, political, and security domains. The difficulty of emotion identification in tweets is that the short, unstructured messages make it hard to determine the main features. In this paper, we propose a new framework for identifying the tendency of user emotions using specific features, i.e. hashtags, emoji, emoticons, and adjective terms. Preprocessing is applied in the first phase, and user emotions are then identified by classification with kNN. The proposed method achieves results close to the ground truth, with an accuracy of 92%.
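
As an illustration of the kNN classification step mentioned in this abstract, here is a minimal sketch. The feature layout (counts of positive hashtags, positive emoji/emoticons, negative emoji/emoticons, and negative adjectives), the toy data, the labels, and the function name `knn_predict` are all assumptions made for the example; the paper's actual features, preprocessing, and distance measure are not given in the abstract.

```python
import math
from collections import Counter

def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train: list of (feature_vector, label) pairs; query: a feature vector.
    """
    # Sort training examples by Euclidean distance to the query vector.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    # Majority vote over the labels of the k nearest neighbours.
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

# Hypothetical toy data, one vector per user:
# (positive hashtags, positive emoji/emoticons,
#  negative emoji/emoticons, negative adjectives)
train = [
    ((3, 4, 0, 0), "happy"),
    ((2, 5, 1, 0), "happy"),
    ((4, 3, 0, 1), "happy"),
    ((0, 0, 4, 3), "sad"),
    ((1, 0, 5, 2), "sad"),
    ((0, 1, 3, 4), "sad"),
]

prediction = knn_predict(train, (2, 3, 0, 1), k=3)
```

With an odd `k` and two classes the vote cannot tie; with more emotion classes a tie-breaking rule would be needed.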

    Mining Twitter Sequences of Product Opinions with Multi-Word Aspect Terms

    Social media platforms have opened doors to users' opinions and perceptions. Text remains the most popular means of contact on social media, despite other means of communication (audio/video and images). Twitter is one such microblogging platform that allows people to express their thoughts within 280 characters per message. This freedom of expression has made it difficult to understand the polarity (Positive, Negative, or Neutral) of the tweets/posts. Given a corpus of microblog texts (e.g., "the new iPhone battery life is good, but camera quality is bad"), mining the aspects (e.g., battery life, camera quality) and opinions (e.g., good, bad) of these products is challenging due to the vast amount of data being generated. Aspect-Based Opinion Mining (ABOM) is thus a combination of aspect extraction and opinion mining that allows an enterprise to analyze the data in detail automatically, saving time and money. Existing systems such as Hate Crime Twitter Sentiment (HCTS) and Microblog Aspect Miner (MAM) have recently been proposed to perform ABOM on Twitter. These systems generally follow a four-step approach: obtaining microblog posts, identifying frequent nouns (candidate aspects), pruning the candidate aspects, and getting opinion polarity. However, they differ in how well they prune their candidate features. HCTS uses Apriori-based association rule mining to find the important aspects (single- and multi-word) of a given product. However, the Apriori-based system generates many candidate sequences, which produces redundant candidate aspects, and HCTS also fails to summarize the category of the aspects (Camera? Battery?). MAM follows an approach similar to that of HCTS for finding the relevant aspects, but it further clusters the frequent nouns (aspects) to obtain the relevant aspects. However, it does not identify multi-word aspects or the aspect category of a product.
This thesis proposes a system called Microblog Aspect Sequence Miner (MASM) as an extension of Microblog Aspect Miner (MAM), replacing the Apriori algorithm with a modified frequent sequential pattern mining algorithm. The system uses the power of sequential pattern mining for aspect extraction in ABOM. The sentiments of the tweets are unknown, so we build our approach in an unsupervised learning manner. The input posts are first classified to separate tweets that contain an opinion (subjective) from those that do not (objective). Then we extract the part-of-speech tags for the explicit aspects to identify the frequent nouns. The novel frequent pattern mining framework (CM-SPAM) is applied to segment the single- and multi-word aspects, which generates fewer sequences than previous approaches. This prior knowledge helps us to operate a topic modeling framework (Latent Dirichlet Allocation) to determine the summary of the most common aspects (Aspect Category) and their sentiments for a product. The findings demonstrate that the MASM model has promising performance in finding relevant aspects, with a reduction in average vector size (cost of candidate/aspect generation) compared with MAM and HCTS on the Sanders Twitter corpus dataset. Experimental results with evaluation metrics of execution time, precision, recall, and F-measure indicate that our approach has higher recall and precision than the existing systems.
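
To make the idea of mining single- and multi-word aspects from frequent noun sequences concrete, here is a deliberately simplified sketch. It is not CM-SPAM and not the thesis's MASM pipeline: it only counts contiguous noun n-grams per post and keeps those that reach a minimum support. The function name `frequent_aspects`, the parameters `min_support` and `max_len`, and the toy input (nouns assumed to be already extracted by a part-of-speech tagger) are all hypothetical.

```python
from collections import Counter

def frequent_aspects(noun_sequences, min_support=2, max_len=2):
    """Count contiguous noun n-grams and keep those found in >= min_support posts."""
    support = Counter()
    for nouns in noun_sequences:
        seen = set()
        for n in range(1, max_len + 1):
            for i in range(len(nouns) - n + 1):
                seen.add(tuple(nouns[i:i + n]))
        # Each post contributes at most once to a pattern's support.
        support.update(seen)
    return {pattern: count for pattern, count in support.items() if count >= min_support}

# Hypothetical input: the nouns of each post, in order of appearance.
posts = [
    ["battery", "life", "camera", "quality"],
    ["battery", "life"],
    ["camera", "quality", "screen"],
    ["battery", "charger"],
]

aspects = frequent_aspects(posts, min_support=2)
```

On this toy input the multi-word candidates `("battery", "life")` and `("camera", "quality")` survive, while the accidental pair `("life", "camera")` is pruned because it occurs in only one post.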

    Social media data analytics to improve supply chain management in food industries

    © 2017 Elsevier Ltd. This paper proposes a big-data analytics-based approach that considers social media (Twitter) data for the identification of supply chain management issues in food industries. In particular, the proposed approach includes text analysis using a support vector machine (SVM) and hierarchical clustering with multiscale bootstrap resampling. The result of this approach included a cluster of words which could inform supply-chain (SC) decision makers about customer feedback and issues in the flow/quality of food products. A case study in the beef supply chain was analysed using the proposed approach, where three weeks of data from Twitter were used.

    Social support, social capital and online community e-loyalty: an empirical study

    In online communities, an essential manifestation of online social relationships, sociality factors (including social support factors and social relationship factors) ought to facilitate the formation of community trust and community satisfaction. However, although the existing literature has explored the underlying mechanisms of online community trust and satisfaction formation, few studies have taken an integrated sociality perspective. In this thesis, we integrate social capital theory and social support theory to consider social capital and social support as important antecedent social factors in forming community trust and community satisfaction, which influence users' trust and satisfaction in online communities. Community trust and satisfaction further promote community loyalty. Specifically, this thesis examines the influence of three levels of social support factors (information support, emotional support, and human-computer network management support) and three kinds of social capital (structure, cognition, and relationship) on online community trust and satisfaction. Based on the proposed research model, survey data from 430 online community users were collected through a questionnaire, and the research model was tested using the partial least squares structural equation modelling method. The results of the thesis suggest that social support factors, including information support, emotional support, and interpersonal network interaction support, and social capital factors, including structural capital, relational capital, and cognitive capital, significantly affect community users' loyalty not only directly but also indirectly, by enhancing community users' trust and satisfaction.
Thus, users' trust and satisfaction with the community are significant mediating variables.

    Inferring the geolocation of tweets at a fine-grained level

    Recently, the use of Twitter data has become important for a wide range of real-time applications, including real-time event detection, topic detection, and disaster and emergency management. These applications need to know the precise location of the tweets for their analysis. However, only approximately 1% of tweets carry fine-grained geotags, which remains insufficient for such applications. To overcome this limitation, predicting the location of non-geotagged tweets, while challenging, can increase the sample of geotagged data to support the applications mentioned above. Nevertheless, existing approaches to tweet geolocalisation mostly focus on the geolocation of tweets at a coarse-grained level of granularity (i.e., city or country level). Thus, geolocalising tweets at a fine-grained level (i.e., street or building level) has arisen as a new open research problem. In this thesis, we investigate the problem of inferring the geolocation of non-geotagged tweets at a fine-grained level of granularity (i.e., at most 1 km error distance). In particular, we aim to predict the geolocation where a given tweet was generated using its text as a source of evidence. This thesis states that the geolocalisation of non-geotagged tweets at a fine-grained level can be achieved by exploiting the characteristics of the 1% of already available individual fine-grained geotagged tweets provided by the Twitter stream. We evaluate state-of-the-art approaches, derive insights into their issues, and propose an evolution of techniques to achieve the geolocalisation of tweets at a fine-grained level. First, we explore the existing approaches in the literature for tweet geolocalisation and derive insights into the problems they exhibit when adapted to work at a fine-grained level. To overcome these problems, we propose a new approach that ranks individual geotagged tweets based on their content similarity to a given non-geotagged tweet.
Our experimental results show significant improvements over previous approaches. Next, we explore the predictability of the location of a tweet at a fine-grained level in order to reduce the average error distance of the predictions. We postulate that, to obtain a fine-grained prediction, a correlation between similarity and geographical distance should exist, and we define the boundaries where fine-grained predictions can be achieved. To do that, we incorporate a majority voting algorithm into the ranking approach that assesses whether such a correlation exists by exploiting the geographical evidence encoded within the Top-N most similar geotagged tweets in the ranking. We report experimental results and demonstrate that, by considering this geographical evidence, we can reduce the average error distance, but at a cost in coverage (the number of tweets for which our approach can find a fine-grained geolocation). Furthermore, we investigate whether the quality of the ranking of the Top-N geotagged tweets affects the effectiveness of fine-grained geolocalisation, and propose a new approach to improve the ranking. To this end, we adopt a learning-to-rank approach that re-ranks geotagged tweets based on their geographical proximity to a given non-geotagged tweet. We test different learning-to-rank algorithms and propose multiple features to model fine-grained geolocalisation. Moreover, we investigate the best-performing combination of features for fine-grained geolocalisation. This thesis also demonstrates the applicability and generalisation of our fine-grained geolocalisation approaches in a practical scenario related to a traffic incident detection task.
We show the effectiveness of using new geolocalised incident-related tweets in detecting the geolocation of real incident reports, and demonstrate that we can improve the overall performance of the traffic incident detection task by enhancing the already available geotagged tweets with new tweets that were geolocalised using our approach. The key contribution of this thesis is the development of effective approaches for geolocalising tweets at a fine-grained level. The thesis provides insights into the main challenges of achieving fine-grained geolocalisation, derived from exhaustive experiments over a ground truth of geotagged tweets gathered from two different cities. Additionally, we demonstrate its effectiveness in a traffic incident detection task by geolocalising new incident-related tweets using our fine-grained geolocalisation approaches.
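
The ranking-plus-majority-voting idea described in this abstract can be sketched roughly as follows. This is a toy illustration under stated assumptions, not the thesis's method: it uses bag-of-words cosine similarity as the content similarity, treats each geotag as a hypothetical grid-cell identifier (`cell_A`, `cell_B`, ...), and abstains when fewer than `min_votes` of the Top-N tweets agree. The function names, parameters, and data are invented for the example.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def geolocate(text, geotagged, top_n=3, min_votes=2):
    """Rank geotagged tweets by similarity to text, then vote over the Top-N cells.

    Returns the winning cell, or None when the evidence is too weak.
    """
    query = Counter(text.lower().split())
    scored = [(cosine(query, Counter(t.lower().split())), cell) for t, cell in geotagged]
    # Keep only tweets that share at least one term with the query.
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)[:top_n]
    if not ranked:
        return None
    cell, votes = Counter(c for _, c in ranked).most_common(1)[0]
    # Abstaining when the Top-N disagree trades coverage for a lower error.
    return cell if votes >= min_votes else None

# Hypothetical geotagged tweets: (text, grid cell id).
geotagged = [
    ("crash on main street near the bridge", "cell_A"),
    ("traffic jam main street bridge closed", "cell_A"),
    ("coffee at the harbour cafe", "cell_B"),
    ("sunset over the harbour", "cell_B"),
    ("concert tonight in the park", "cell_C"),
]

located = geolocate("accident on main street bridge", geotagged)
```

Returning `None` for ambiguous tweets mirrors the trade-off the abstract reports: fewer tweets receive a prediction, but those that do are backed by consistent geographical evidence.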

    Research and Prediction on the Sharing of WeChat Official Accounts’ Articles

    With the development of the mobile Internet, We Media was born. The WeChat Official Account Platform is the largest We Media platform in China. In the WeChat social network, information can spread rapidly only through users’ sharing. This paper takes WeChat official accounts as its object and uses a logistic regression model to explore the factors that influence sharing. A prediction model is then constructed based on logistic regression and a support vector machine. The significance of this study is that it identifies the factors that influence the sharing of WeChat official accounts’ articles and constructs a sharing prediction model.

    Predicting Consumers’ Brand Sentiment Using Text Analysis on Reddit

    With the emergence of data privacy regulations around the world (e.g. GDPR, CCPA), practitioners of Internet marketing, the largest digital marketing channel, face a trade-off between user data protection and advertisement targeting accuracy due to their current reliance on PII-related social media analytics. To address this challenge, this research proposes a predictive model for consumers’ brand sentiment based entirely on textual data from Reddit, i.e. fully compliant with current data privacy regulations. The author uses natural language processing techniques to process all post and comment data from the r/gadgets subreddit community in 2018, extracting frequently discussed brands and products through named entity recognition and generating brand sentiment labels for active users in r/gadgets through sentiment analysis. This research then uses four supervised learning classifiers to predict brand sentiments for four brand clusters (Apple, Samsung, Microsoft and Google) based on the self-identified characteristics of Reddit users. Across all four brand clusters, the predictive model proposed by this research achieved a ROC AUC score above 0.7 (three of the four above 0.8). This research thus shows the predictive power of self-identified user characteristics on brand sentiments and offers a non-PII-required consumer targeting model for digital marketing practitioners.
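
For readers unfamiliar with the ROC AUC score reported in this abstract, the sketch below computes it from its pairwise definition: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are made-up toy values and the function name `roc_auc` is an assumption; nothing here reproduces the study's data or classifiers.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("ROC AUC needs at least one positive and one negative example")
    # Count winning (positive, negative) pairs; a tie counts as half a win.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))

# Hypothetical labels (1 = positive brand sentiment) and classifier scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

A score of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the abstract's thresholds of 0.7 and 0.8 should be read. The pairwise loop is quadratic, so a rank-based formulation would be preferable for large samples.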