9 research outputs found

    I Understand What You Are Saying: Leveraging Deep Learning Techniques for Aspect Based Sentiment Analysis

    Despite widespread use of online reviews in consumer purchase decision making, the potential value of online reviews in facilitating digital collaboration among product/service providers, consumers, and online retailers remains underexplored. One of the significant barriers to realizing this potential lies in the difficulty of understanding online reviews due to their sheer volume and free-text form. To promote digital collaboration, we investigate aspect-based sentiment dynamics of online reviews by proposing a semi-supervised, deep-learning-facilitated analytical pipeline. This method leverages deep learning techniques for text representation and classification. Additionally, building on previous studies that address aspect extraction and sentiment identification in isolation, we address aspect and sentiment analysis simultaneously. Further, this study presents a novel perspective on the dynamics of aspect-based sentiment by analyzing it as a time series. The findings of this study have significant implications with regard to digital collaboration among consumers, product/service providers, and other stakeholders of online reviews.

    TOPIC MODELING FOR EMAIL SUBJECT LINE ANALYSIS

    Email processing is an emerging area in natural language processing and machine learning. Archivists often must make judgements about the relevance and record status of email messages. This study is an attempt to streamline that process by testing subject line and message body analysis using topic modeling. Specifically, using the Enron Corpus and Latent Dirichlet Allocation, this study investigates the extent to which email subject lines can be used to predict the content of email messages to support efficient archival processing. Master of Science in Information Science

    A Gamma-Poisson Mixture Topic Model for Short Text

    Most topic models are constructed under the assumption that documents follow a multinomial distribution. The Poisson distribution is an alternative distribution to describe the probability of count data. For topic modelling, the Poisson distribution describes the number of occurrences of a word in documents of fixed length. The Poisson distribution has been successfully applied in text classification, but its application to topic modelling is not well documented, specifically in the context of a generative probabilistic model. Furthermore, the few Poisson topic models in the literature are admixture models, making the assumption that a document is generated from a mixture of topics. In this study, we focus on short text. Many studies have shown that the simpler assumption of a mixture model fits short text better. With mixture models, as opposed to admixture models, the generative assumption is that a document is generated from a single topic. One topic model that makes this one-topic-per-document assumption is the Dirichlet-multinomial mixture model. The main contributions of this work are a new Gamma-Poisson mixture model, as well as a collapsed Gibbs sampler for the model. The benefit of the collapsed Gibbs sampler derivation is that the model is able to automatically select the number of topics contained in the corpus. The results show that the Gamma-Poisson mixture model performs better than the Dirichlet-multinomial mixture model at selecting the number of topics in labelled corpora. Furthermore, the Gamma-Poisson mixture produces better topic coherence scores than the Dirichlet-multinomial mixture model, thus making it a viable option for the challenging task of topic modelling of short text. Comment: 26 pages, 14 figures, to be published in Mathematical Problems in Engineering
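
    The one-topic-per-document assumption described in the abstract above can be illustrated with a small generative sketch. This is not the paper's model or its collapsed Gibbs sampler: the function names (`sample_poisson`, `generate_corpus`), the uniform choice of topic, and the Gamma shape and scale values are all assumptions made for illustration. It only shows the general idea that each document draws a single topic and then draws every word count from a Poisson distribution whose rate comes from a Gamma prior.

```python
import math
import random


def sample_poisson(rate, rng):
    """Draw one Poisson(rate) count using Knuth's multiplication method.

    Suitable for the small rates used in this sketch.
    """
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def generate_corpus(num_docs, num_topics, vocab_size,
                    shape=1.0, scale=1.0, seed=0):
    """Generate documents under a mixture (one topic per document) assumption.

    Hypothetical parameterization: per-topic, per-word Poisson rates are drawn
    from Gamma(shape, scale), and each document's topic is chosen uniformly.
    """
    rng = random.Random(seed)
    # One Poisson rate per (topic, word) pair, drawn from the Gamma prior.
    rates = [
        [rng.gammavariate(shape, scale) for _ in range(vocab_size)]
        for _ in range(num_topics)
    ]
    docs = []
    for _ in range(num_docs):
        # Mixture assumption: the whole document comes from a single topic.
        topic = rng.randrange(num_topics)
        counts = [sample_poisson(rates[topic][word], rng)
                  for word in range(vocab_size)]
        docs.append((topic, counts))
    return rates, docs


if __name__ == "__main__":
    _, docs = generate_corpus(num_docs=3, num_topics=2, vocab_size=5)
    for topic, counts in docs:
        print(topic, counts)
```

    An admixture model would instead let each word in a document come from a different topic; the sketch deliberately fixes one topic per document to mirror the mixture assumption.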

    Information Diffusion and Summarization in Social Networks

    Social networks are web-based services that allow users to connect and share information. Due to the huge size of the social network graph and the plethora of generated content, it is difficult to diffuse and summarize social media content. This thesis thus addresses the problems of information diffusion and information summarization in social networks. Information diffusion is a process by which information about new opinions, behaviors, conventions, practices, and technologies flows from person to person through a social network. Studies on information diffusion primarily focus on how information diffuses in networks and how to enhance information diffusion. Our aim is to enhance information diffusion in social networks. Many factors affect information diffusion, such as network connectivity, location, posting timestamp, and post content. In this thesis, we analyze the effect of three of the most important factors of information diffusion, namely network connectivity, posting time, and post content. We first study the network factor to enhance information diffusion, and later analyze how time and content factors can diffuse information to a large number of users. The network connectivity of a user determines their ability to disseminate information. A well-connected authoritative user can disseminate information to a wider audience than an ordinary user. We present a novel algorithm to find topic-sensitive authorities in social networks. We use the topic-specific authoritative position of the users to promote a given topic through word-of-mouth (WoM) marketing. Next, the lifetime of social media content is very short, typically a few hours. If content is posted at a time when the targeted audience is not online or is not interested in interacting with the content, the content will not receive high audience reaction. We look at the problem of finding the best posting time(s) to get high information diffusion.
Further, the type of social media content determines the amount of audience interaction it gets. Users react differently to different types of content. If a post is related to a topic that is more arousing or debatable, then it tends to get more comments. We propose a novel method to identify whether a post has high-arousal content or not. Furthermore, the sentiment of post content is also an important factor in garnering users’ attention in social media. The same information conveyed with different sentiments receives different amounts of audience reaction. We examine to what extent the sentiment policies employed in social media have been successful in catching users’ attention. Finally, we study the problem of information summarization in social networks. Social media services generate a huge volume of data every day, which is difficult to search or comprehend. Information summarization is a process of creating a concise, readable summary of this huge volume of unstructured information. We present a novel method to summarize unstructured social media text by generating topics similar to manually created topics. We also show a comprehensive topical summary by grouping semantically related topics.