363 research outputs found

    Basic tasks of sentiment analysis

    Subjectivity detection is the task of distinguishing objective sentences from subjective ones. Objective sentences are those which do not exhibit any sentiment. It is therefore desirable for a sentiment analysis engine to find and separate out the objective sentences before further analysis, e.g., polarity detection. In subjective sentences, opinions can often be expressed on one or multiple topics. Aspect extraction is a subtask of sentiment analysis that consists of identifying opinion targets in opinionated text, i.e., detecting the specific aspects of a product or service that the opinion holder is either praising or complaining about.

    Analyzing user reviews of messaging Apps for competitive analysis

    Dissertation presented as the partial requirement for obtaining a Master's degree in Data Science and Advanced Analytics, specialization in Data Science. The rise of various messaging apps has resulted in fierce competition, and the era of Web 2.0 enables business managers to gain competitive intelligence from user-generated content (UGC). Text-mining UGC for competitive intelligence has been drawing great interest from researchers. However, relevant studies mostly focus on industries such as hospitality and products, and few studies have applied such techniques to perform competitive analysis for messaging apps. Here, we conducted a competitive analysis based on topic modeling and sentiment analysis by text-mining 27,479 user reviews of four iOS messaging apps, namely Messenger, WhatsApp, Signal and Telegram. The results show that the performance of topic modeling and sentiment analysis is encouraging, and that a combination of the extracted app aspect-based topics and the adjusted sentiment scores can effectively reveal meaningful competitive insights into user concerns, competitive strengths and weaknesses, as well as changes in user sentiment over time. We anticipate that this study will not only advance the existing literature on competitive analysis using text mining techniques for messaging apps but also help existing players and new entrants in the market to sharpen their competitive edge by better understanding their users' needs and the industry trends.

    A comparative analysis of recommender systems based on item aspect opinions extracted from user reviews

    In popular applications such as e-commerce sites and social media, users provide online reviews giving personal opinions about a wide array of items, such as products, services and people. These reviews are usually in the form of free text, and represent a rich source of information about the users’ preferences. Among the information elements that can be extracted from reviews, opinions about particular item aspects (i.e., characteristics, attributes or components) have been shown to be effective for user modeling and personalized recommendation. In this paper, we investigate the aspect-based recommendation problem by separately addressing three tasks, namely identifying references to item aspects in user reviews, classifying the sentiment orientation of the opinions about such aspects in the reviews, and exploiting the extracted aspect opinion information to provide enhanced recommendations. Unlike previous work, we integrate and empirically evaluate several state-of-the-art and novel methods for each of the above tasks. We conduct extensive experiments on standard datasets and several domains, analyzing distinct recommendation quality metrics and characteristics of the datasets, domains and extracted aspects. As a result of our investigation, we not only derive conclusions about which combination of methods is most appropriate according to the above issues, but also provide a number of valuable resources for opinion mining and recommendation purposes, such as domain aspect vocabularies and domain-dependent, aspect-level lexicons. This work was supported by the Spanish Ministry of Economy, Industry and Competitiveness (TIN2016-80630-P).

    Sentiment classification with case-base approach

    The growth of social networks, blogs, and user review sites makes the Internet a huge source of data, especially about how people think, feel, and act toward different issues. These days, people's opinions play an important role in politics, industry, education, etc. Thus governments, large and small industries, academic institutes, companies, and individuals are investigating automatic techniques to extract the information they need from large amounts of data. Sentiment analysis is one answer to this need. It is an application of natural language processing and computational linguistics that uses advanced techniques such as machine learning and language models to capture evaluations such as positive, negative, or neutral, with or without their strength, from plain text. In this thesis we study a cross-domain, case-based approach to sentiment analysis at the document level. Our case-based algorithm generates a binary classifier that uses a set of processed cases and five different sentiment lexicons to extract the polarity, along with the corresponding scores, from the reviews. Since sentiment analysis is inherently a domain-dependent task, which makes it difficult and expensive, we use a cross-domain approach by training our classifier on six different domains instead of limiting it to one domain. To improve the accuracy of the classifier, we add negation detection as part of our algorithm. Moreover, to improve the performance of our approach, some innovative modifications are applied. It is worth mentioning that our approach allows for further development by adding more sentiment lexicons and data sets in the future.
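
The abstract above mentions lexicon-based polarity scores combined with negation detection. The following is only a minimal sketch of that general idea, not the thesis's case-based algorithm: the tiny lexicon, the negator list, the three-token negation window and the function names are all assumptions made for illustration.

```python
import re

# Hypothetical toy lexicon; the thesis itself relies on five real sentiment lexicons.
TOY_LEXICON = {"good": 1.0, "great": 2.0, "bad": -1.0, "terrible": -2.0}
# Assumed negation cues; contractions ending in "n't" are handled separately below.
NEGATORS = {"not", "no", "never"}


def polarity_score(text, lexicon=TOY_LEXICON, window=3):
    """Sum lexicon scores, flipping the sign of words that follow a negator.

    `window` is the assumed number of tokens after a negator that are negated.
    """
    tokens = re.findall(r"[a-z]+(?:'t)?", text.lower())
    score = 0.0
    negate_left = 0  # how many upcoming tokens are still inside a negation scope
    for tok in tokens:
        if tok in NEGATORS or tok.endswith("n't"):
            negate_left = window
            continue
        value = lexicon.get(tok, 0.0)
        if negate_left > 0:
            value = -value
            negate_left -= 1
        score += value
    return score


def classify(text):
    """Binary decision; a score of exactly zero is arbitrarily treated as positive."""
    return "positive" if polarity_score(text) >= 0 else "negative"
```

For example, `classify("This phone is not good")` flips the score of "good" and returns "negative", while `classify("The screen isn't bad")` returns "positive".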

    Aspect-based sentiment analysis for social recommender systems.

    Social recommender systems harness knowledge from social content, experiences and interactions to provide recommendations to users. The retrieval and ranking of products, using similarity knowledge, is central to the recommendation architecture. To enhance recommendation performance, having an effective representation of products is essential. Social content such as product reviews contains experiential knowledge in the form of user opinions centred on product aspects. Making sense of these for recommender systems requires the capability to reason with text. However, Natural Language Processing (NLP) toolkits trained on formal text documents encounter challenges when analysing product reviews, due to their informal nature. This calls for novel methods and algorithms to capitalise on textual content in product reviews together with other knowledge resources. In this thesis, methods to utilise user purchase preference knowledge - inferred from the viewed and purchased product behaviour - are proposed to overcome the challenges encountered in analysing textual content. This thesis introduces three major methods to improve the performance of social recommender systems. First, an effective aspect extraction method that combines strengths of both dependency relations and frequent noun analysis is proposed. Thereafter, this thesis presents how extracted aspects can be used to structure opinionated content, enabling sentiment knowledge to enrich product representations. Second, a novel method to integrate aspect-level sentiment analysis and implicit knowledge extracted from analysis of users' product purchase preferences is presented. The role of sentiment distribution and threshold analysis in the proposed integration method is also explored. Third, this thesis explores the utility of feature selection techniques to rank and select relevant aspects for product representation. For this purpose, this thesis presents how established dimensionality reduction approaches from text classification can be employed to select a subset of aspects for recommendation purposes. Finally, a comprehensive evaluation of all the proposed methods in this thesis is presented using a computational measure of 'better' and Mean Average Precision (MAP) with seven real-world datasets.
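
Mean Average Precision, one of the evaluation measures named above, can be sketched as follows. This is the standard textbook definition rather than the thesis's evaluation code; the function names are illustrative, and each ranked list is assumed to contain no duplicate items.

```python
def average_precision(ranked_items, relevant):
    """Average of precision@k taken at each rank k where a relevant item appears."""
    relevant = set(relevant)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked_items, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    # Dividing by the number of relevant items penalises relevant items never retrieved.
    return precision_sum / len(relevant)


def mean_average_precision(rankings, relevants):
    """Mean of average precision over several queries (or users)."""
    aps = [average_precision(r, rel) for r, rel in zip(rankings, relevants)]
    return sum(aps) / len(aps) if aps else 0.0
```

For a ranking `["a", "b", "c"]` with relevant items `{"a", "c"}`, the precision values at the two hits are 1/1 and 2/3, so the average precision is their mean, 5/6.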

    Investigating and extending the methods in automated opinion analysis through improvements in phrase based analysis

    Opinion analysis is an area of research which deals with the computational treatment of opinion statements and subjectivity in textual data. Opinion analysis has emerged over the past couple of decades as an active area of research, as it provides solutions to the issues raised by information overload. The problem of information overload has emerged with the advancements in communication technologies which gave rise to an exponential growth in user-generated subjective data available online. Opinion analysis has a rich set of applications which are used to enable opportunities for organisations, such as tracking user opinions about products, social issues in communities through to engagement in political participation etc. The opinion analysis area has been highly active in recent years, and research at different levels of granularity has been, and is being, undertaken. However, it is observed that there are limitations in the state-of-the-art, especially as dealing with each level of granularity on its own does not solve current research issues. Therefore a novel sentence level opinion analysis approach utilising clause and phrase level analysis is proposed. This approach uses linguistic and syntactic analysis of sentences to understand the interdependence of words within sentences, and further uses rule based analysis for phrase level analysis to calculate the opinion at each hierarchical structure of a sentence. The proposed opinion analysis approach requires lexical and contextual resources for implementation. In the context of this Thesis the approach is further presented as part of an extended unifying framework for opinion analysis resulting in the design and construction of a novel corpus. The above contributions to the field (approach, framework and corpus) are evaluated within the Thesis and are found to make improvements on existing limitations in the field, particularly with regard to opinion analysis automation. Further work is required in integrating a mechanism for greater word sense disambiguation and in lexical resource development.

    Discovering High-Profit Product Feature Groups by mining High Utility Sequential Patterns from Feature-Based Opinions

    Extracting a group of features together instead of a single feature from the mined opinions, such as “{battery, camera, design} of a smartphone,” may yield higher profit to the manufacturers and higher customer satisfaction, and these can be called High Profit Feature Groups (HPFG). The accuracy of Opinion-Feature Extraction can be improved if more complex sequential patterns of customer reviews are learned and included in the user-behavior analysis to obtain relevant frequent feature groups. Existing Opinion-Feature Extraction systems that use Data Mining techniques with some sequences include those referred to in this thesis as Rashid13OFExt, Rana18OFExt, and HPFG19_HU. The Rashid13OFExt and Rana18OFExt systems use Sequential Pattern Mining, Association Rule Mining, and Class Sequential Rules to obtain frequent product features and opinion words from reviews. However, these systems do not discover the frequent high profit features considering utility values (internal and external) such as cost, profit, quantity, or other user preferences. The HPFG19_HU system uses High Utility Itemset Mining and Aspect-Based Sentiment Analysis to extract High Utility Aspect groups based on feature-opinion sets. It works on transaction databases of itemsets formed using aspects by considering the high utility values (e.g., which are more profitable to the seller) from the extracted frequent patterns from a set of opinion sentences. However, the HPFG19_HU system does not consider the order of occurrences (sequences) of product features formed in customer opinion sentences, which helps distinguish similar users and identify more relevant and related high profit product features. This thesis proposes a system called High Profit Sequential Feature Group based on High Utility Sequences (HPSFG_HUS), which is an extension to the HPFG19_HU system. The proposed system combines Feature-Based Opinion Mining and High Utility Sequential Pattern Mining to extract High Profit Feature Groups from product reviews. The input to the proposed system is the product reviews corpus. The output is the High Profit Sequential Feature Groups in sequence databases that identify sequential patterns in the features extracted from opinions by considering the order of occurrences of features in the review. This method improves on existing systems' accuracy in extracting relevant frequent feature groups. The results on retailer’s graphs of extracted High Profit Sequential Feature Groups show that the proposed HPSFG_HUS system provides more accurate high feature groups, sales profit, and user satisfaction. Experimental results evaluating execution time, accuracy, precision, and comparison show higher revenue than the tested existing systems.
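
To illustrate why the order of features matters, the sketch below computes the utility of an ordered feature pattern over a small set of reviews. It is a simplified illustration of the general notion of sequence utility, not the HPSFG_HUS system: each review is assumed to be an ordered list of hypothetical `(feature, utility)` pairs with one feature per position, the utility of a pattern in a review is taken as the best total over all of its in-order occurrences, and all names and numbers are made up.

```python
def pattern_utility_in_review(review, pattern):
    """Best total utility over all in-order occurrences of `pattern` in `review`.

    `review` is a list of (feature, utility) pairs in order of mention.
    Returns 0.0 when the pattern does not occur as a subsequence.
    """
    neg_inf = float("-inf")
    # best[j] = highest utility of matching the first j pattern items so far
    best = [0.0] + [neg_inf] * len(pattern)
    for feature, utility in review:
        # Go right to left so one review position is used at most once per match.
        for j in range(len(pattern), 0, -1):
            if feature == pattern[j - 1] and best[j - 1] != neg_inf:
                best[j] = max(best[j], best[j - 1] + utility)
    return best[-1] if best[-1] != neg_inf else 0.0


def high_utility_patterns(reviews, candidates, min_utility):
    """Keep candidate patterns whose utility summed over all reviews reaches the threshold."""
    result = {}
    for pattern in candidates:
        total = sum(pattern_utility_in_review(r, pattern) for r in reviews)
        if total >= min_utility:
            result[pattern] = total
    return result
```

With a review `[("battery", 3), ("camera", 5), ("battery", 4), ("design", 2)]`, the pattern `("battery", "design")` has utility 6 (the later, higher-utility "battery" mention is chosen), while the reversed pattern `("design", "battery")` has utility 0 because that order never occurs. An itemset-based view would treat the two patterns as identical.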

    Big data: data analysis and decision making

    This project consists of a study of the alternatives available to assist the process of decision-making for businesses. The study has been scoped to cover as many as possible of the fields considered to be related to the decision-making procedure in companies. The fields chosen for the preliminary study are: Decision Theory, "Business Intelligence" and "Sentiment Analysis". With the intention of developing a solution to assist the decision-making process, I have designed a system that performs the tasks of capture, storage, analysis and display of the results obtained from Twitter data. This system focuses on the analysis of opinion expressed through tweets. The system modules can be executed independently, and perform the tasks of capture or analysis and visualization. The capture module allows us to collect tweets that meet certain criteria and store them in a local file. Analysis and visualization modules analyze data from the file and display the results through a graphical interface. These modules perform different analyses such as evaluating the opinion’s evolution, the evolution of the average opinion, the frequency and degree of opinion and geopositioning positive/negative tweets, among various other tasks.

    Sentiment Classification of Online Customer Reviews and Blogs Using Sentence-level Lexical Based Semantic Orientation Method

    Sentiment analysis is the process of extracting knowledge from people's opinions, appraisals and emotions toward entities, events and their attributes. These opinions greatly influence customers and ease their choices regarding online shopping, events, products and entities. With the rapid growth of online resources, a vast amount of new data in the form of customer reviews and opinions is being generated progressively. Hence, sentiment analysis methods are desirable for developing efficient and effective analyses and classification of customer reviews, blogs and comments. The main motivation for this thesis is to develop a high-performance, domain-independent sentiment classification method. This study focuses on sentiment analysis at the sentence level using a lexical-based method for different types of data such as reviews and blogs. The proposed method is based on general lexicons, i.e., WordNet, SentiWordNet and user-defined lexical dictionaries for sentiment orientation. The relations and glosses of these dictionaries provide a solution to the domain portability problem. The experiments are performed on various data sets such as customer reviews and blog comments. The results show that the proposed method with sentence contextual information is effective for sentiment classification. The proposed method performs better than word and text level corpus-based machine learning methods for semantic orientation. The results highlight that the proposed method achieves an average accuracy of 86% at sentence level and 97% at feedback level for customer reviews. Similarly, it achieves an average accuracy of 83% at sentence level and 86% at feedback level for blog comments.