6 research outputs found

    A Case-Based Approach to Cross Domain Sentiment Classification

    Get PDF
    This paper considers the task of sentiment classification of subjective text across many domains, in particular on scenarios where no in-domain data is available. Motivated by the more general applicability of such methods, we propose an extensible approach to sentiment classification that leverages sentiment lexicons and out-of-domain data to build a case-based system where solutions to past cases are reused to predict the sentiment of new documents from an unknown domain. In our approach the case representation uses a set of features based on document statistics, while the case solution stores sentiment lexicons employed on past predictions allowing for later retrieval and reuse on similar documents. The case-based nature of our approach also allows for future improvements since new lexicons and classification methods can be added to the case base as they become available. On a cross domain experiment our method has shown robust results when compared to a baseline single-lexicon classifier where the lexicon has to be pre-selected for the domain in question

    A Case-Based Approach to Cross Domain Sentiment Classification

    Get PDF
    This paper considers the task of sentiment classification of subjective text across many domains, in particular on scenarios where no in-domain data is available. Motivated by the more general applicability of such methods, we propose an extensible approach to sentiment classification that leverages sentiment lexicons and out-of-domain data to build a case-based system where solutions to past cases are reused to predict the sentiment of new documents from an unknown domain. In our approach the case representation uses a set of features based on document statistics, while the case solution stores sentiment lexicons employed on past predictions allowing for later retrieval and reuse on similar documents. The case-based nature of our approach also allows for future improvements since new lexicons and classification methods can be added to the case base as they become available. On a cross domain experiment our method has shown robust results when compared to a baseline single-lexicon classifier where the lexicon has to be pre-selected for the domain in question

    Sentiment classification with case-base approach

    Get PDF
    L'augmentation de la croissance des réseaux, des blogs et des utilisateurs des sites d'examen sociaux font d'Internet une énorme source de données, en particulier sur la façon dont les gens pensent, sentent et agissent envers différentes questions. Ces jours-ci, les opinions des gens jouent un rôle important dans la politique, l'industrie, l'éducation, etc. Alors, les gouvernements, les grandes et petites industries, les instituts universitaires, les entreprises et les individus cherchent à étudier des techniques automatiques fin d’extraire les informations dont ils ont besoin dans les larges volumes de données. L’analyse des sentiments est une véritable réponse à ce besoin. Elle est une application de traitement du langage naturel et linguistique informatique qui se compose de techniques de pointe telles que l'apprentissage machine et les modèles de langue pour capturer les évaluations positives, négatives ou neutre, avec ou sans leur force, dans des texte brut. Dans ce mémoire, nous étudions une approche basée sur les cas pour l'analyse des sentiments au niveau des documents. Notre approche basée sur les cas génère un classificateur binaire qui utilise un ensemble de documents classifies, et cinq lexiques de sentiments différents pour extraire la polarité sur les scores correspondants aux commentaires. Puisque l'analyse des sentiments est en soi une tâche dépendante du domaine qui rend le travail difficile et coûteux, nous appliquons une approche «cross domain» en basant notre classificateur sur les six différents domaines au lieu de le limiter à un seul domaine. Pour améliorer la précision de la classification, nous ajoutons la détection de la négation comme une partie de notre algorithme. En outre, pour améliorer la performance de notre approche, quelques modifications innovantes sont appliquées. Il est intéressant de mentionner que notre approche ouvre la voie à nouveaux développements en ajoutant plus de lexiques de sentiment et ensembles de données à l'avenir.Increasing growth of the social networks, blogs, and user review sites make Internet a huge source of data especially about how people think, feel, and act toward different issues. These days, people opinions play an important role in the politic, industry, education, etc. Thus governments, large and small industries, academic institutes, companies, and individuals are looking for investigating automatic techniques to extract their desire information from large amount of data. Sentiment analysis is one true answer to this need. Sentiment analysis is an application of natural language processing and computational linguistic that consists of advanced techniques such as machine learning and language model approaches to capture the evaluative factors such as positive, negative, or neutral, with or without their strength, from plain texts. In this thesis we study a case-based approach on cross-domain for sentiment analysis on the document level. Our case-based algorithm generates a binary classifier that uses a set of the processed cases, and five different sentiment lexicons to extract the polarity along the corresponding scores from the reviews. Since sentiment analysis inherently is a domain dependent task that makes it problematic and expensive work, we use a cross-domain approach by training our classifier on the six different domains instead of limiting it to one domain. To improve the accuracy of the classifier, we add negation detection as a part of our algorithm. Moreover, to improve the performance of our approach, some innovative modifications are applied. It is worth to mention that our approach allows for further developments by adding more sentiment lexicons and data sets in the future

    Domain Independent Sentiment Classification with Many Lexicons

    Get PDF
    Sentiment lexicons are language resources widely used in opinion mining and important tools in unsupervised sentiment classification. We present a comparative study of sentiment classification of reviews on six different domains using sentiment lexicons from different sources. Our results highlight the tendency of a lexicon’s performance to be imbalanced towards one class, and indicate lexicon accuracy varies with the target domain. We propose an approach that combines information from different lexicons to make classification decisions and achieve more robust results that consistently improve our baseline across all domains tested. These are further refined by a domain independent score adjustment that mitigates the effect of the recall imbalance seen on some of the result
    corecore