1,082 research outputs found

    CLOUD-BASED MACHINE LEARNING AND SENTIMENT ANALYSIS

    The role of the data scientist is becoming increasingly ubiquitous as companies and institutions see the need to gain additional insight from data in order to make better decisions and improve the quality of service delivered to customers. This thesis presents three data science projects aimed at improving the tools and techniques used to analyze and evaluate data. In the first study, cloud-based automated machine learning (AutoML) algorithms were applied to a standard cybersecurity dataset to detect vulnerabilities in network traffic data; the performance of the algorithms was measured and compared using standard evaluation metrics. The second study involved text mining of social media, specifically Reddit. We mined up to 100,000 comments across multiple subreddits and tested them for hate speech using a custom-designed version of the Python VADER sentiment analysis package. Our work integrated standard sentiment analysis with Hatebase.org, and we demonstrate that our new method can better detect hate speech in social media. In the third study, following sentiment analysis and hate speech detection, we applied statistical techniques to test for significant differences in text analytics, specifically in the sentiment categories assigned by lexicon-based software and by cloud-based tools. We compared the three major cloud providers (AWS, Azure, and GCP) with the standard Python VADER sentiment analysis library. Statistical analysis showed a significant difference among the cloud platforms and VADER, demonstrating that each platform is unique in its scoring mechanism.
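
    The second study's idea of pairing lexicon-based sentiment scoring with a hate-speech vocabulary can be sketched as follows. This is only an illustration under stated assumptions, not the thesis's method: the valence values, the placeholder flagged terms, the sign-flipping negation rule, and the names `score_comment`, `VALENCE`, and `FLAGGED_TERMS` are all hypothetical. The normalisation x / sqrt(x^2 + 15) and the +/-0.05 label thresholds follow the convention used by VADER.

```python
import math
import re

# Toy valence lexicon (hypothetical values; the real VADER lexicon is far larger).
VALENCE = {"good": 1.9, "great": 3.1, "love": 3.2,
           "bad": -2.5, "awful": -2.0, "hate": -2.7}
# Placeholder tokens standing in for a Hatebase-style vocabulary (hypothetical).
FLAGGED_TERMS = {"slur_a", "slur_b"}
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z_']+", text.lower())


def score_comment(text, alpha=15.0):
    """Return a compound sentiment score, a label, and any flagged terms."""
    tokens = tokenize(text)
    total = 0.0
    for i, tok in enumerate(tokens):
        valence = VALENCE.get(tok, 0.0)
        # Simplified negation: flip the sign if the previous token negates.
        if i > 0 and tokens[i - 1] in NEGATIONS:
            valence = -valence
        total += valence
    # Squash the summed valence into the open interval (-1, 1).
    compound = total / math.sqrt(total * total + alpha)
    if compound >= 0.05:
        label = "positive"
    elif compound <= -0.05:
        label = "negative"
    else:
        label = "neutral"
    flagged = sorted(set(tokens) & FLAGGED_TERMS)
    return {
        "compound": compound,
        "label": label,
        "flagged_terms": flagged,
        # Assumption: any lexicon hit marks the comment for review.
        "hate_candidate": bool(flagged),
    }
```

    Calling `score_comment("this is not good")` yields a negative label because the negation flips the valence of "good". A real pipeline would replace the toy dictionaries with the full lexicons and would likely need context-aware handling of flagged terms.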

    A Machine Learning Approach For Opinion Holder Extraction In Arabic Language

    Opinion mining aims at extracting useful subjective information from reliable amounts of text. Opinion holder recognition is a task that has not yet been addressed for the Arabic language. The task essentially requires a deep understanding of clause structure, and the lack of a robust, publicly available Arabic parser further complicates the research. This paper presents pioneering research on opinion holder extraction in Arabic news that is independent of any lexical parser. We investigate the construction of a comprehensive feature set to compensate for the lack of structural parsing output. The proposed feature set is adapted from previous work on English and combined with our proposed semantic field and named entity features. Our feature analysis is based on Conditional Random Fields (CRF) and semi-supervised pattern recognition techniques. Different models are evaluated in cross-validation experiments, achieving an F-measure of 54.03. We publicly release our corpus and lexicon to the opinion mining community to encourage further research.

    Macro-micro approach for mining public sociopolitical opinion from social media

    During the past decade, we have witnessed the emergence of social media, which has gained prominence as a means for the general public to exchange opinions on a broad range of topics. Furthermore, its social and temporal dimensions make it a rich resource for policy makers and organisations seeking to understand public opinion. In this thesis, we present our research on understanding public opinion on Twitter along three dimensions: sentiment, topics and summary. In the first line of our work, we study how to classify public sentiment on Twitter. We focus on the task of multi-target-specific sentiment recognition on Twitter, and propose an approach that utilises the syntactic information from the parse tree in conjunction with the left-right context of the target. We show state-of-the-art performance on two datasets, including a multi-target Twitter corpus on UK elections which we make publicly available to the research community. Additionally, we conduct two preliminary studies: cross-domain emotion classification on discourse around arts and cultural experiences, and social spam detection to improve the signal-to-noise ratio of our sentiment corpus. Our second line of work focuses on automatic topical clustering of tweets. Our aim is to group tweets into a number of clusters, with each cluster representing a meaningful topic, story, event or reason behind a particular choice of sentiment. We explore various ways of tackling this challenge and propose a two-stage hierarchical topic modelling system that is efficient and effective in achieving our goal. Lastly, in our third line of work, we study the task of summarising tweets on common topics, with the goal of providing informative summaries of real-world events and stories, or explanations underlying the sentiment expressed towards an issue or entity. As most existing tweet summarisation approaches rely on extractive methods, we propose to apply a state-of-the-art neural abstractive summarisation model to tweets. We also tackle the challenge of cross-medium supervised summarisation with no target-medium training resources. To the best of our knowledge, there is no existing work on neural abstractive summarisation of tweets. In addition, we present a system that provides interactive visualisation of topic-entity sentiments and the corresponding summaries in chronological order. Throughout the work presented in this thesis, we conduct experiments to evaluate and verify the effectiveness of our proposed models, comparing them with relevant baseline methods. Most of our evaluations are quantitative; however, we perform qualitative analyses where appropriate. This thesis provides insights and findings that can be used to better understand public opinion in social media.

    Energy-Efficient Joint Resource Allocation Algorithms for MEC-Enabled Emotional Computing in Urban Communities

    This paper considers a mobile edge computing (MEC) system in which the MEC server first collects data from emotion sensors and then computes the emotion of each user. We give the formula for the emotional prediction accuracy. To improve the energy efficiency of the system, we propose resource allocation algorithms. We aim to minimize the total energy consumption of the MEC server and sensors by jointly optimizing the allocation of computing resources and the data transmission time. The formulated problem is non-convex, which is very difficult to solve in general. However, we transform it into convex problems and apply convex optimization techniques to solve them. The optimal solution is given in closed form. Simulation results show that the proposed scheme effectively reduces the total energy consumption of the system compared with the benchmark.

    Multimodal Speech Emotion Recognition

    This work focuses on the Emotion Recognition task, which belongs to the field of Natural Language Processing. The goal of this work was to create machine learning models that recognize emotions from text and audio. The work introduces the reader to the problem, possible emotion representations, available datasets, and existing solutions. It then describes our proposed solutions for the Text Emotion Recognition (TER), Speech Emotion Recognition (SER), and Multimodal Speech Emotion Recognition tasks. Further, we describe the experiments we conducted, present their results, and show our two practical demo applications. Two of our proposed models outperformed the previous state-of-the-art solution from 2018. All experiments and models were programmed in Python.
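
    One common way to combine a text model and an audio model is late fusion, where the class probabilities produced by each modality are averaged with a weight. The abstract does not say which fusion strategy the thesis uses, so the sketch below is an assumption for illustration only; the label set `EMOTIONS`, the function names, and the `text_weight` parameter are hypothetical.

```python
# Hypothetical label set; the thesis's actual emotion classes are not given here.
EMOTIONS = ["angry", "happy", "neutral", "sad"]


def late_fusion(text_probs, audio_probs, text_weight=0.5):
    """Weighted average of two per-class probability vectors."""
    if len(text_probs) != len(audio_probs):
        raise ValueError("probability vectors must have the same length")
    if not 0.0 <= text_weight <= 1.0:
        raise ValueError("text_weight must lie in [0, 1]")
    fused = [text_weight * t + (1.0 - text_weight) * a
             for t, a in zip(text_probs, audio_probs)]
    total = sum(fused)
    # Renormalise so the fused scores sum to one.
    return [p / total for p in fused]


def predict_emotion(text_probs, audio_probs, text_weight=0.5):
    """Return the most probable emotion label and the fused distribution."""
    fused = late_fusion(text_probs, audio_probs, text_weight)
    best = max(range(len(fused)), key=fused.__getitem__)
    return EMOTIONS[best], fused
```

    With equal weights, a confident text prediction can outvote the audio model; lowering `text_weight` shifts the decision towards the audio model. In practice the weight would be tuned on held-out data.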