288 research outputs found

    Sentiment Analysis of Persian Language: Review of Algorithms, Approaches and Datasets

    Sentiment analysis aims to extract people's emotions and opinions from their comments on the web. It is widely used in business to detect sentiment in social data, gauge brand reputation, and understand customers. Most articles in this area concentrate on English, whereas resources for Persian are limited. This review paper collects articles on Persian sentiment analysis published between 2018 and 2022 and explains and analyzes their methods, approaches, and datasets. Almost all of the methods used are based on machine learning and deep learning. The paper examines 40 different approaches to sentiment analysis in Persian, analyzes the datasets along with the accuracy of the algorithms applied to them, and reviews the strengths and weaknesses of each. Among all the methods, transformers such as BERT and recurrent neural networks such as LSTM and Bi-LSTM achieved the highest accuracy. In addition to the methods and approaches, the datasets published between 2018 and 2022 are listed, with details provided for each.

    Towards MLOps in a Startup Company

    Companies that have developed around a product or service built on one or several machine learning applications will reach a point where the complexity of these applications becomes unfeasible to manage manually. To reach a more mature level with their machine learning applications, these companies must start implementing MLOps in their operations, as well as orchestration tooling for building durable and scalable machine learning pipelines. This paper presents how a startup company in the marketing and market research sector has taken its first steps towards machine learning maturity by implementing MLOps practices and orchestration with Apache Airflow. Bologna Master's in Data Analytics for Business.

    Sentiment Analysis for Online Product Reviews and Recommendation Using Deep Learning Based Optimization Algorithm

    With advances in Internet technologies, online shopping has become a popular way for users to buy and consume. User satisfaction can be improved efficiently by carrying out Sentiment Analysis (SA) on large volumes of user reviews on e-commerce platforms. It remains a challenge, however, to predict the precise sentiment polarity of user reviews because of variations in sequence length, complicated logic, and textual order. In this study, we propose a Hybrid-Flash Butterfly Optimization with Deep Learning based Sentiment Analysis (HFBO-DLSA) technique for online product reviews. The HFBO-DLSA technique aims to determine the nature of sentiments expressed in online product reviews. To accomplish this, it first applies data pre-processing to put the reviews into a compatible format. The HFBO-DLSA model then uses a deep belief network (DBN) for classification, and the HFBO algorithm serves as a hyperparameter tuning process to improve the SA performance of the DBN. The HFBO-DLSA method was validated experimentally on a set of datasets. The results show that the HFBO-DLSA approach surpasses recent techniques in terms of SA outcomes. Specifically, when compared with various existing models on the Canon dataset, the HFBO-DLSA technique achieves an accuracy of 97.66%, precision of 98.54%, recall of 94.64%, and an F-score of 96.43%. In the comparative analysis, other approaches such as ACO, SVM, and NN perform worse, while the TextCNN, BiLSTM, and RCNN approaches yield slightly improved SA results.
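
As a quick arithmetic check on the metrics reported in this abstract, the following Python sketch computes the F-score as the harmonic mean of precision and recall. The function name `f_score` is illustrative and does not come from the paper. The abstract does not say how its metrics were averaged, so the small gap between the value computed here (about 96.55%) and the reported 96.43% may simply reflect per-class averaging.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Return the F-beta score; beta=1 is the harmonic mean of precision and recall."""
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        # Both inputs are zero, so the score is defined as zero.
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Precision and recall as reported for the Canon dataset (as fractions).
reported_precision = 0.9854
reported_recall = 0.9464

print(f"F1 from reported precision/recall: {f_score(reported_precision, reported_recall):.4f}")
```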

    Video advertisement mining for predicting revenue using random forest

    Shaken by the threat of the 2008 financial crisis, industries began to work on predictive analytics to control inventory levels efficiently and minimize revenue risks. In this third-generation age of web-connected data, organizations have emphasized the importance of data science and leveraged data mining techniques to gain a competitive edge. Consider the features of Web 3.0, where semantic-oriented interaction between humans and computers can offer a tailored service or product that meets consumers' needs by learning their preferences. In this study, we concentrate on marketing science to demonstrate the correlation between TV commercial advertisements and sales achievement. Using different data mining and machine learning methods, this research develops a concrete and complete predictive framework that clarifies the effects of word of mouth, based on open data sources from YouTube. The distinguishing feature of this predictive model is that it adopts sentiment analysis as one of its predictors. This research offers a preliminary study of unstructured marketing data for further business use.

    Recolha, extração e classificação de opiniões sobre aplicações lúdicas para saúde e bem-estar

    Nowadays, mobile apps are part of the life of anyone who owns a smartphone. As technology evolves, new apps arrive with new features, and users become more demanding when using an application. Moreover, at a time when health and well-being are a priority, more and more apps aim to provide a better user experience, not only in health monitoring but also in entertainment and well-being. However, there are still limitations regarding user experience and usability. Application reviews are what best reflect user satisfaction and experience. Therefore, to gain a picture of the most relevant aspects of current applications, reviews and their respective ratings were collected. This thesis aims to develop a system that presents the most relevant aspects of a given health and wellness application by collecting its reviews, then extracting the aspects and classifying them. For the review collection task, two Python libraries were used, one for the Google Play Store and one for the App Store, which provide methods for extracting data about an application. For the extraction and classification of aspects, the LCF-ATEPC model was chosen, given its performance in aspect-based sentiment analysis studies. Master's in Computer and Telematics Engineering.

    IDENTIFIKASI KEBUTUHAN DASAR DI TEMPAT EVAKUASI SEMENTARA PASCA ERUPSI MERAPI DENGAN SENTIMENT ANALISIS DAN SUPPORT VECTOR MACHINE

    The Mount Merapi eruption in 2010 was the largest since 1872. Its impact was felt by the people living in the surrounding affected areas, so disaster management measures were carried out. One of these measures was the fulfillment of basic needs. This research aims to collect public opinion on the fulfillment of basic needs in the shelters after the Merapi eruption, based on Twitter data. The algorithm used in this research is the Support Vector Machine, which is used to build a classification model over the collected data. The expected result of this study is to identify the basic needs in a shelter. The accuracy obtained with 10-fold cross-validation is 87.96% for the Support Vector Machine and 87.45% for Maximum Entropy. Keywords: Twitter, sentiment analysis, Merapi eruption, support vector machine.
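
The 10-fold cross-validation mentioned in this abstract can be sketched in plain Python as follows. This is a minimal illustration and not the authors' code: the names `k_fold_indices`, `cross_val_accuracy`, and `majority_baseline` are hypothetical, the toy labels are made up, and the majority-class baseline merely stands in for a real classifier such as a Support Vector Machine. In practice the samples would normally be shuffled before splitting.

```python
from collections import Counter
from statistics import mean


def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) for k contiguous, near-equal folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        # The first `extra` folds take one additional sample each.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


def majority_baseline(train_x, train_y, test_x):
    """Stand-in classifier (assumption): always predict the most common training label."""
    most_common = Counter(train_y).most_common(1)[0][0]
    return [most_common for _ in test_x]


def cross_val_accuracy(samples, labels, fit_predict, k=10):
    """Average test accuracy of `fit_predict` over k folds."""
    scores = []
    for train, test in k_fold_indices(len(samples), k):
        predictions = fit_predict(
            [samples[i] for i in train],
            [labels[i] for i in train],
            [samples[i] for i in test],
        )
        correct = sum(p == labels[i] for p, i in zip(predictions, test))
        scores.append(correct / len(test))
    return mean(scores)


# Toy data for illustration only; these are not tweets from the study.
toy_samples = [f"tweet {i}" for i in range(20)]
toy_labels = ["need"] * 15 + ["other"] * 5
print(cross_val_accuracy(toy_samples, toy_labels, majority_baseline, k=10))
```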

    Validation of Twitter opinion trends with national polling aggregates: Hillary Clinton vs Donald Trump

    Measuring and forecasting opinion trends from real-time social media is a long-standing goal of big-data analytics. Despite its importance, there has been no conclusive scientific evidence so far that social media activity can capture the opinion of the general population. Here we develop a method to infer the opinion of Twitter users regarding the candidates of the 2016 US Presidential Election by combining the statistical physics of complex networks with machine learning based on hashtag co-occurrence, which yields an in-domain training set approaching 1 million tweets. We investigate the social networks formed by the interactions among millions of Twitter users and infer each user's support for the presidential candidates. The resulting Twitter trends follow the New York Times National Polling Average, which represents an aggregate of hundreds of independent traditional polls, with remarkable accuracy. Moreover, the Twitter opinion trend precedes the aggregated NYT polls by 10 days, showing that Twitter can be an early signal of global opinion trends. Our analytics unleash the power of Twitter to uncover social trends, from elections and brands to political movements, at a fraction of the cost of national polls.

    Social Data Mining for Crime Intelligence

    With the advancement of the Internet and related technologies, many traditional crimes have made the leap to digital environments. The successes of data mining in a wide variety of disciplines have given birth to crime analysis. Traditional crime analysis focuses mainly on understanding crime patterns; however, it is unsuitable for identifying and monitoring emerging crimes. The true nature of crime remains buried in unstructured content that represents the hidden story behind the data. User feedback leaves valuable traces that can be used to measure the quality of various aspects of products or services, and also to detect, infer, or predict crimes. As in any application of data mining, the data must be of high quality in order to avoid erroneous conclusions. This thesis presents a methodology and practical experiments towards discovering whether (i) user feedback can be harnessed and processed for crime intelligence, (ii) criminal associations, structures, and roles can be inferred among entities involved in a crime, and (iii) methods and standards can be developed for measuring, predicting, and comparing the quality level of social data instances and samples. It contributes to the theory, design, and development of a novel framework for crime intelligence and of an algorithm for estimating social data quality, innovatively adapting methods used for monitoring water contaminants. Several experiments were conducted, and the results revealed the significance of this study for mining social data for crime intelligence and for developing social data quality filters and decision support systems.