7 research outputs found

    Data Categorization and Review Identification on Twitter Using WordNet Implicit Aspect Sentiment Analysis

    Social media review analysis has developed into an active research field that addresses important public safety issues recognized globally. Sentiment analysis (SA) on Twitter remains a topic of ongoing attention in this field. Tweet datasets for sentiment polarity classification are subjected to aspect-based SA, a method that allows information to be extracted, analyzed, and categorized in order to predict social media evaluations. This paper addresses the aspect identification task for implicit aspects of social media review tweets, that is, aspects implied by adjectives and verbs. A hybrid model is proposed to improve training data for Social media review Implicit Aspect Sentence Detection (IASD) and Social media review Implicit Aspect Identification (IAI). It is based on WordNet (WN) semantic relations and a term-weighting scheme. Three classifiers (Multinomial Naïve Bayes, Support Vector Machine, and Random Forest) are used to evaluate performance on three Twitter social media review datasets. The results show the value of verbs in enhancing training data for social media review IASD and IAI, as well as the efficacy of WN reversal and description relations.

    Automatic opinion extraction from textual comments of students surveys

    Student recruitment and retention are important issues for all higher education institutions. Constant monitoring of student satisfaction levels is therefore crucial. Automatic analysis of student opinions can be carried out using aspect-based sentiment analysis (ABSA). ABSA is a sub-discipline of natural language processing (NLP) that focuses on the identification of sentiments (negative, neutral, positive) and aspects (sentiment targets) in a sentence. The aim of this doctoral dissertation is to propose a system for ABSA of free-text comments from student opinion surveys in the Serbian language. Sentiment analysis was carried out at the finest level of text granularity: the sentence segment (phrase and clause). The proposed system relies on NLP techniques, machine learning models, rules, and dictionaries. The corpus collected and annotated for system development and evaluation comprises students' reviews of teaching staff and study programs at the Faculty of Technical Sciences. The results indicate that positive sentiment can be identified with an F-measure of 0.91 and negative sentiment with an F-measure of 0.97. The F-measures for aspects range from 0.49 to 0.89, depending on their frequency in the corpus. To the best of the authors' knowledge, this is the first study of ABSA carried out at the level of the sentence segment for the Serbian language. The methodology and findings presented in this dissertation provide a much-needed basis for further work on sentiment analysis for Serbian, a language that is under-resourced and under-researched in this area.
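The F-measures quoted in this abstract combine precision and recall into a single score. A minimal sketch of that calculation follows, assuming the balanced F1 variant (the abstract does not state which weighting was used). The function name `f_measure` and the counts are hypothetical and not taken from the dissertation.

```python
def f_measure(tp: int, fp: int, fn: int, beta: float = 1.0) -> float:
    """F-beta score from true positives, false positives, and false negatives."""
    if tp == 0:
        # No correct predictions: precision and recall are both zero.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts for one sentiment class, for illustration only.
score = f_measure(tp=90, fp=10, fn=10)
print(round(score, 2))
```

With `beta=1.0` precision and recall are weighted equally; a larger `beta` favors recall.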

    Discovering and understanding community opinions of neighbourhoods expressed in question answering platforms

    Humans value the opinions of others. In recent years, people have been using social media platforms to both voice and gather opinions. Looking for relevant pieces of information through the huge amount of expressed opinions across several platforms is an overwhelming task. This is why automatically extracting information from such sources has received a great deal of attention in both academia and industry. However, little work in this field has been dedicated to the domain of city neighbourhoods. One reason is that, unlike for many products and services, there are no dedicated review platforms for collecting opinions regarding neighbourhoods. In the absence of dedicated review sites, a great amount of expressed opinions on neighbourhoods and other domains can be found on community question answering (QA) platforms. So far, this data has not been used. This raises the question of what the strengths and limitations of QA data are and what challenges it brings for extracting opinion information expressed about neighbourhoods. In this thesis, we comprehensively investigate these questions, using data from Yahoo! Answers for neighbourhoods of London. First, we investigate how well QA discussions reflect the demographic attributes of neighbourhoods recorded in the census (e.g. age, religion, etc.). Our results show that significant, strong and meaningful correlations exist between text features from QA data and many demographic attributes. For instance, the terms poverty, drug, and rundown are amongst the terms most correlated with the attribute deprivation. We further demonstrate that text features based on Yahoo! Answers discussions can achieve very good accuracy in predicting a wide range of demographic attributes for neighbourhoods. These predictions outperform predictions that are made using Twitter data, a platform that has been used widely in the past for predicting many real-world attributes.
Demographics data provides objective statistics related to the population of neighbourhoods. Many attributes of interest are not reflected in those statistics. For instance, census data does not record whether a neighbourhood is posh, quiet or good for nightlife. Knowing these aspects is complementary to the demographic attributes in forming an understanding of neighbourhoods. We investigate whether text features from QA data can predict such aspects. To do this, we create a dataset of neighbourhoods labeled with these aspects. Our prediction results show that QA data can predict such aspects with higher performance compared to Twitter data in the presence of these labels. Predicting a single value for a characteristic of a neighbourhood cannot provide a complete picture of people's opinions. To provide a fine-grained summary, a popular approach is to extract the sentiments towards different aspects of a given entity from each expressed opinion. Aspect-based sentiment analysis has been studied extensively, but research has always utilised the text from dedicated review platforms where a user usually writes opinions on a single specified entity. In the absence of a review platform for neighbourhoods, we extend the task to process the text from QA platforms where fewer assumptions can be made and the data is noisy. We construct a human-annotated dataset based on text from Yahoo! Answers discussions with high inter-annotator agreement of over 70%, a suitable level for this task. To address this task, we propose methods based on representations of text that are learned sequentially using recurrent neural models or representations that are defined using the traditional bag of n-grams features. Our proposed methods can achieve prediction accuracies on similar levels to the less challenging sentiment analysis tasks.
In summary, the study in this thesis demonstrates the strengths of QA data in predicting the values of real-world entities and for extracting information from opinions, specifically for the domain of city neighbourhoods
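The correlation between a text feature and a demographic attribute mentioned in this abstract can be illustrated with a short sketch. It assumes Pearson's coefficient, which the abstract does not name, and uses invented numbers. The function `pearson` and both data lists are hypothetical.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Invented values: relative frequency of one term in QA posts about five
# neighbourhoods, and a deprivation score for the same five neighbourhoods.
term_frequency = [0.01, 0.02, 0.03, 0.04, 0.05]
deprivation = [10.0, 20.0, 30.0, 40.0, 50.0]
r = pearson(term_frequency, deprivation)
```

A value of `r` close to 1 or -1 indicates a strong linear relationship; ranking terms by `r` against one attribute gives a list of "top correlated terms" of the kind the abstract describes.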

    A Vector Space Approach for Aspect Based Sentiment Analysis

    Vector representations for language have been shown to be useful in a number of Natural Language Processing tasks. In this paper, we aim to investigate the effectiveness of word vector representations for the problem of Aspect Based Sentiment Analysis. In particular, we target three sub-tasks, namely aspect term extraction, aspect category detection, and aspect sentiment prediction. We investigate the effectiveness of vector representations over different text data and evaluate the quality of domain-dependent vectors. We utilize vector representations to compute various vector-based features and conduct extensive experiments to demonstrate their effectiveness. Using simple vector-based features, we achieve F1 scores of 79.91% for aspect term extraction and 86.75% for category detection, and an accuracy of 72.39% for aspect sentiment prediction.
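As an illustration of what a simple vector-based feature can look like, the sketch below averages word vectors over a sentence and picks the aspect category whose vector is closest by cosine similarity. This is a simplified stand-in and not the authors' feature set or classifier. The toy 3-dimensional vectors, the category names, and all function names are invented for the example.

```python
from math import sqrt

# Toy 3-dimensional word vectors, invented for illustration only.
WORD_VECTORS = {
    "pizza": [0.9, 0.1, 0.0],
    "tasty": [0.8, 0.2, 0.1],
    "waiter": [0.1, 0.9, 0.0],
    "rude": [0.2, 0.8, 0.1],
    "food": [1.0, 0.0, 0.0],
    "service": [0.0, 1.0, 0.0],
}


def average_vector(tokens):
    """Mean of the vectors of known tokens, or None if no token is known."""
    known = [WORD_VECTORS[t] for t in tokens if t in WORD_VECTORS]
    if not known:
        return None
    return [sum(dim) / len(known) for dim in zip(*known)]


def cosine(u, v):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def closest_category(sentence, categories=("food", "service")):
    """Category whose word vector is most similar to the sentence average."""
    vec = average_vector(sentence.lower().split())
    if vec is None:
        return None
    return max(categories, key=lambda c: cosine(vec, WORD_VECTORS[c]))


print(closest_category("The pizza was tasty"))
print(closest_category("The waiter was rude"))
```

In practice the vectors would come from a trained embedding model, and the similarity scores would more likely be fed to a classifier as features than used directly as a decision rule.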

    A vector space approach for aspect-based sentiment analysis

    Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. By Abdulaziz Alghunaim. This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. Cataloged from student-submitted PDF version of thesis. Includes bibliographical references (pages 75-82). Vector representations for language have been shown to be useful in a number of Natural Language Processing (NLP) tasks. In this thesis, we aim to investigate the effectiveness of word vector representations for the research problem of Aspect-Based Sentiment Analysis (ABSA), which attempts to capture both semantic and sentiment information encoded in user-generated content such as product reviews. In particular, we target three ABSA sub-tasks: aspect term extraction, aspect category detection, and aspect sentiment prediction. We investigate the effectiveness of vector representations over different text data, and evaluate the quality of domain-dependent vectors. We utilize vector representations to compute various vector-based features and conduct extensive experiments to demonstrate their effectiveness. Using simple vector-based features, we achieve F1 scores of 79.9% for aspect term extraction, 86.7% for category detection, and 72.3% for aspect sentiment prediction.