1,388 research outputs found

    Latent dirichlet markov allocation for sentiment analysis

    Get PDF
    In recent years probabilistic topic models have gained tremendous attention in data mining and natural language processing research areas. In the field of information retrieval for text mining, a variety of probabilistic topic models have been used to analyse content of documents. A topic model is a generative model for documents, it specifies a probabilistic procedure by which documents can be generated. All topic models share the idea that documents are mixture of topics, where a topic is a probability distribution over words. In this paper we describe Latent Dirichlet Markov Allocation Model (LDMA), a new generative probabilistic topic model, based on Latent Dirichlet Allocation (LDA) and Hidden Markov Model (HMM), which emphasizes on extracting multi-word topics from text data. LDMA is a four-level hierarchical Bayesian model where topics are associated with documents, words are associated with topics and topics in the model can be presented with single- or multi-word terms. To evaluate performance of LDMA, we report results in the field of aspect detection in sentiment analysis, comparing to the basic LDA model

    Topic Modeling in Sentiment Analysis: A Systematic Review

    Get PDF
    With the expansion and acceptance of Word Wide Web, sentiment analysis has become progressively popular research area in information retrieval and web data analysis. Due to the huge amount of user-generated contents over blogs, forums, social media, etc., sentiment analysis has attracted researchers both in academia and industry, since it deals with the extraction of opinions and sentiments. In this paper, we have presented a review of topic modeling, especially LDA-based techniques, in sentiment analysis. We have presented a detailed analysis of diverse approaches and techniques, and compared the accuracy of different systems among them. The results of different approaches have been summarized, analyzed and presented in a sophisticated fashion. This is the really effort to explore different topic modeling techniques in the capacity of sentiment analysis and imparting a comprehensive comparison among them

    Aspect discovery from product reviews

    Get PDF

    Self-Supervised and Controlled Multi-Document Opinion Summarization

    Full text link
    We address the problem of unsupervised abstractive summarization of collections of user generated reviews with self-supervision and control. We propose a self-supervised setup that considers an individual document as a target summary for a set of similar documents. This setting makes training simpler than previous approaches by relying only on standard log-likelihood loss. We address the problem of hallucinations through the use of control codes, to steer the generation towards more coherent and relevant summaries.Finally, we extend the Transformer architecture to allow for multiple reviews as input. Our benchmarks on two datasets against graph-based and recent neural abstractive unsupervised models show that our proposed method generates summaries with a superior quality and relevance.This is confirmed in our human evaluation which focuses explicitly on the faithfulness of generated summaries We also provide an ablation study, which shows the importance of the control setup in controlling hallucinations and achieve high sentiment and topic alignment of the summaries with the input reviews.Comment: 18 pages including 5 pages appendi

    A survey of data mining techniques for social media analysis

    Get PDF
    Social network has gained remarkable attention in the last decade. Accessing social network sites such as Twitter, Facebook LinkedIn and Google+ through the internet and the web 2.0 technologies has become more affordable. People are becoming more interested in and relying on social network for information, news and opinion of other users on diverse subject matters. The heavy reliance on social network sites causes them to generate massive data characterised by three computational issues namely; size, noise and dynamism. These issues often make social network data very complex to analyse manually, resulting in the pertinent use of computational means of analysing them. Data mining provides a wide range of techniques for detecting useful knowledge from massive datasets like trends, patterns and rules [44]. Data mining techniques are used for information retrieval, statistical modelling and machine learning. These techniques employ data pre-processing, data analysis, and data interpretation processes in the course of data analysis. This survey discusses different data mining techniques used in mining diverse aspects of the social network over decades going from the historical techniques to the up-to-date models, including our novel technique named TRCM. All the techniques covered in this survey are listed in the Table.1 including the tools employed as well as names of their authors
    corecore