3 research outputs found

    Review on recent advances in information mining from big consumer opinion data for product design

    In this paper, based on more than ten years of study in this research area, a comprehensive review of information mining from big consumer opinion data to assist product design is presented. First, the research background and the essential terminology of online consumer opinion data are introduced. Next, studies on information extraction from, and information utilization of, big consumer opinion data for product design are reviewed. Studies on information extraction are explained from several perspectives, including data acquisition, opinion target recognition, feature identification and sentiment analysis, and opinion summarization and sampling. Studies on information utilization are explored in terms of how to extract critical customer needs from big consumer opinion data, how to connect the voice of the customer with product design, how to compare and rank similar products effectively, and how to identify ever-evolving customer concerns efficiently. Furthermore, significant and practical research trends are highlighted for future studies. This survey will help researchers and practitioners understand the latest developments in studies and applications centered on how big consumer opinion data can be processed, analyzed, and exploited to aid product design.

    Search beyond traditional probabilistic information retrieval

    "This thesis focuses on search beyond probabilistic information retrieval. Three ap- proached are proposed beyond the traditional probabilistic modelling. First, term associ- ation is deeply examined. Term association considers the term dependency using a factor analysis based model, instead of treating each term independently. Latent factors, con- sidered the same as the hidden variables of ""eliteness"" introduced by Robertson et al. to gain understanding of the relation among term occurrences and relevance, are measured by the dependencies and occurrences of term sequences and subsequences. Second, an entity-based ranking approach is proposed in an entity system named ""EntityCube"" which has been released by Microsoft for public use. A summarization page is given to summarize the entity information over multiple documents such that the truly relevant entities can be highly possibly searched from multiple documents through integrating the local relevance contributed by proximity and the global enhancer by topic model. Third, multi-source fusion sets up a meta-search engine to combine the ""knowledge"" from different sources. Meta-features, distilled as high-level categories, are deployed to diversify the baselines. Three modified fusion methods are employed, which are re- ciprocal, CombMNZ and CombSUM with three expanded versions. Through extensive experiments on the standard large-scale TREC Genomics data sets, the TREC HARD data sets and the Microsoft EntityCube Web collections, the proposed extended models beyond probabilistic information retrieval show their effectiveness and superiority.

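    The fusion methods named in the abstract above can be illustrated with a minimal sketch. It shows the standard textbook forms of CombSUM, CombMNZ and reciprocal-rank fusion, not the modified or expanded versions proposed in the thesis, which the abstract does not specify. The function names, the dictionary input format and the constant k=60 are assumptions made for illustration.

```python
from collections import defaultdict


def comb_sum(runs):
    """CombSUM: add up each document's scores across all runs."""
    fused = defaultdict(float)
    for run in runs:
        for doc, score in run.items():
            fused[doc] += score
    return dict(fused)


def comb_mnz(runs):
    """CombMNZ: CombSUM times the number of runs that returned the document."""
    sums = comb_sum(runs)
    hits = defaultdict(int)
    for run in runs:
        for doc in run:
            hits[doc] += 1
    return {doc: sums[doc] * hits[doc] for doc in sums}


def reciprocal_rank(runs, k=60):
    """Reciprocal-rank fusion: sum 1 / (k + rank) over runs (k is an assumed constant)."""
    fused = defaultdict(float)
    for run in runs:
        # Rank documents within each run by descending score, starting at rank 1.
        ranked = sorted(run, key=run.get, reverse=True)
        for rank, doc in enumerate(ranked, start=1):
            fused[doc] += 1.0 / (k + rank)
    return dict(fused)


def ranking(fused):
    """Order documents by fused score, breaking ties by document id."""
    return sorted(fused, key=lambda doc: (-fused[doc], doc))


# Hypothetical scores from two retrieval systems (assumed already normalized).
runs = [{"d1": 0.5, "d2": 0.25}, {"d2": 0.5, "d3": 0.25}]
print(ranking(comb_sum(runs)))
print(ranking(comb_mnz(runs)))
print(ranking(reciprocal_rank(runs)))
```

    In this toy input, d2 is returned by both systems, so all three methods rank it first. CombMNZ additionally rewards it for appearing in more than one run.
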
    Evaluating the impact of social-media on sales forecasting: a quantitative study of worlds biggest brands using Twitter, Facebook and Google Trends

    In the world of digital communication, data from online sources such as social networks might provide additional information about changing consumer interest and significantly improve the accuracy of forecasting models. In this thesis I investigate whether information from Twitter, Facebook and Google Trends can improve daily sales forecasts for companies relative to forecasts based on transactional sales data only. My original contribution to this domain, presented in this thesis, consists of the following main steps: 1. Data collection. I collected Twitter, Facebook and Google Trends data for the period May 2013 to May 2015 for 75 brands. Historical transactional sales data was supplied by Certona Corporation. 2. Sentiment analysis. I introduced a new sentiment classification approach that combines the two standard techniques (lexicon-based and machine-learning-based). The proposed method outperforms the state-of-the-art approach by 7% in F-score. 3. Identification and classification of events. I proposed a framework for event detection and a robust method for clustering Twitter events into different types based on the shape of the Twitter volume and sentiment peaks. This approach makes it possible to capture the varying dynamics of information propagation through the social network. I provide empirical evidence that it is possible to identify types of Twitter events that have significant power to predict spikes in sales. 4. Forecasting next-day sales. I explored linear, non-linear and cointegrating relationships between sales and social-media variables for 18 brands and showed that social-media variables can improve daily sales forecasts for the majority of brands by capturing factors such as consumer sentiment and brand perception. Moreover, I found that social-media data without sales information can be used to predict sales direction with an accuracy of 63%.
    Industry experts consider the results obtained in this thesis to be valuable and useful for decision making and for strategic planning.