
    A Movie Weekly Box-office Revenues Prediction Model Based on Online Reviews

    To predict weekly movie box-office revenues, this paper proposes a new prediction model based on an ensemble machine learning method. First, we extract important features from online movie reviews. Then, because a single machine learning model has limited predictive ability, the ensemble method XGBoost is employed to predict weekly box-office revenues. Finally, we collect online movie reviews from Douban.com and use about 600 movies to verify the performance of the model. The experimental results show the effectiveness and practicability of this model

    An Improved Machine Learning Approach to Analyze the Sentiment of the Movie Reviews Using IMDB dataset

    Sentiment analysis is a sub-domain of opinion mining that focuses on extracting people's emotions and opinions toward a particular topic from structured, semi-structured, or unstructured textual data. In this paper, we focus our sentiment analysis task on the IMDB movie review database. The novel approach in this work is an improved Naïve Bayes algorithm built with the help of TF-IDF (Term Frequency-Inverse Document Frequency). The comparison is carried out on datasets of different sizes, on the basis of parameters such as mean square error, accuracy, precision, recall, and F1 score, and our work has shown better accuracy than other classification algorithms. Keywords: Review, Sentiment Analysis, Modern Information Retrieval, Opinion Mining, Classifier.
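
    The evaluation parameters named in this abstract can be illustrated with a minimal sketch. It assumes binary labels encoded as 0 and 1; the function name and the toy label lists are hypothetical and are not taken from the paper.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute common binary-classification metrics from two label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    # With 0/1 labels, mean square error equals the misclassification rate.
    mse = sum((t - p) ** 2 for t, p in pairs) / len(pairs)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "mse": mse}


# Hypothetical gold labels and predictions for eight reviews.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```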

    Price Prediction: Determining Changes in Stock Pricing through Sentiment Analysis of Online Consumer Reviews

    The rapid growth of technology has changed the dynamics in which consumers socialize and make their purchasing decisions. The volume of online reviews has grown rapidly over the past decade, leading consumers' peer groups to carry a disproportionate weight in the purchasing decision process. The sheer volume of reviews can make it a daunting task for an operator to incorporate them in their analysis. Sentiment analysis allows large volumes of consumer reviews to be processed in a relatively easy and time-sensitive manner. The information contained in these reviews, the sentiment score, is the same feeling hospitality consumers are gathering from other consumers prior to making their purchasing decision. To demonstrate the importance of these reviews, this study will seek to model the directional change of a company’s stock price using the sentiment of the consumers’ reviews as the primary predictor. Support Vector Machines will help to classify a year’s worth of consumer reviews on nine distinct properties of a publicly traded Las Vegas gaming/hotel company. This is then modeled using ARIMA modelling techniques to forecast an out-of-time sample, and the accuracy will be assessed by showing that the probability of the results being due to random chance is minimal. The model is able to accurately predict 28 out of 39 time periods in the out-of-time sample, a result that has a probability of less than .0047 of being due to random chance
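
    The random-chance figure can be approximated with a short sketch. It assumes the test is a one-sided binomial tail under a fair-coin null (each directional call is right with probability 0.5); the abstract does not state which test the authors used, so the setup and the function name are assumptions.

```python
from math import comb


def binomial_tail(successes, trials, p=0.5):
    """Probability of at least `successes` hits in `trials` independent guesses."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 28 correct directional calls out of 39 periods, as reported in the abstract.
p_value = binomial_tail(28, 39)
print(f"P(X >= 28 | n = 39, p = 0.5) = {p_value:.4f}")
```

    Under these assumptions the tail probability is close to the .0047 quoted above; a different null hypothesis or a two-sided test would give a different number.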

    What Is Important When We Evaluate Movies? Insights from Computational Analysis of Online Reviews

    The question of what is important when we evaluate movies is crucial for understanding how lay audiences experience and evaluate entertainment products such as films. In line with this, subjective movie evaluation criteria (SMEC) have been conceptualized as mental representations of important attitudes toward specific film features. Based on exploratory and confirmatory factor analyses of self-report data from online surveys, previous research has found and validated eight dimensions. Given the large-scale evaluative information that is available in online users’ comments in movie databases, it seems likely that what online users write about movies may enrich our knowledge about SMEC. As a first fully exploratory attempt, drawing on an open-source dataset including movie reviews from IMDb, we estimated a correlated topic model to explore the underlying topics of those reviews. In 35,136 online movie reviews, the most prevalent topics tapped into three major categories—Hedonism, Actors’ Performance, and Narrative—and indicated what reviewers mostly wrote about. Although a qualitative analysis of the reviews revealed that users mention certain SMEC, results of the topic model covered only two SMEC: Story Innovation and Light-heartedness. Implications for SMEC and entertainment research are discussed

    Improving productivity in Hollywood with data science: Using emotional arcs of movies to drive product and service innovation in entertainment industries

    Improving productivity in the entertainment industry is a very challenging task as it heavily depends on generating attractive content for the consumers. The consumer-centric design (putting the consumers at the centre of the content development and production) focuses on ways in which businesses can design customized services and products which accurately reflect consumer preferences. We propose a new framework which allows to use data science to optimize content-generation in entertainment and test this framework for the motion picture industry. We use the natural language processing methodology combined with econometric analysis to explore whether and to what extent emotions shape consumer preferences for media and entertainment content, which, in turn, affect revenue streams. By analyzing 6,174 movie scripts, we generate the emotional trajectory of each motion picture. We then combine the obtained mappings into clusters which represent groupings of consumer emotional journeys. These clusters are then plugged into an econometric model to predict overall success parameters of the movies including box office revenues, viewer satisfaction levels (captured by IMDb ratings), awards, as well as the number of viewers’ and critics’ reviews. We find that emotional arcs in movies can be partitioned into 6 basic shapes. The highest box offices are associated with the Man in a Hole shape which is characterized by an emotional fall followed by an emotional rise. This U-shaped emotional arc results in financially successful movies irrespective of genre and production budget. Implications of this analysis for generating on-demand content and improving productivity in entertainment industries are discussed

    Sentiment Analysis of Twitter Data for a Tourism Recommender System in Bangladesh

    The exponentially expanding Digital Universe is generating a huge amount of data containing valuable information. The tourism industry, which is one of the fastest growing economic sectors, can benefit from the myriad of digital data travelers generate in every phase of their travel: planning, booking, traveling, feedback, etc. One application of tourism-related data can be to provide personalized destination recommendations. The primary objective of this research is to facilitate the business development of a tourism recommendation system for Bangladesh called “JatraLog”. Sentiment-based recommendation is one of the features that will be employed in the recommendation system. This thesis aims to address two research goals: firstly, to study Sentiment Analysis as a tourism recommendation tool and secondly, to investigate Twitter as a potential source of valuable tourism-related data for providing recommendations for different countries, specifically Bangladesh. Sentiment Analysis can be defined as a Text Classification problem, where a document or text is classified into two groups: positive or negative, and in some cases a third group, i.e. neutral. For this thesis, two sets of tourism-related English language tweets were collected from Twitter using keywords. The first set contains only the tweets and the second set contains geo-location and timestamp along with the tweets. Then the collected tweets were automatically labeled as positive or negative depending on whether the tweets contained positive or negative emoticons respectively. After they were labeled, 90% of the tweets from the first set were used to train a Naive Bayes Sentiment Classifier and the remaining 10% were used to test the accuracy of the Classifier. The Classifier accuracy was found to be approximately 86.5%. The second set was used to retrieve statistical information required to address the second research goal, i.e. investigating Twitter as a potential source of sentiment data for a destination recommendation system
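
    The emoticon-based labeling and the 90/10 split described in this abstract might look roughly like the sketch below. The emoticon sets, the sample tweets, and the function names are all hypothetical; the thesis does not list the emoticons it used, and discarding tweets with mixed or no emoticons is an assumption.

```python
import random

# Assumed emoticon lists; the thesis does not specify which ones were used.
POSITIVE_EMOTICONS = {":)", ":-)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":-("}


def label_by_emoticon(tweet):
    """Return 'positive', 'negative', or None when the signal is absent or mixed."""
    tokens = tweet.split()
    has_pos = any(tok in POSITIVE_EMOTICONS for tok in tokens)
    has_neg = any(tok in NEGATIVE_EMOTICONS for tok in tokens)
    if has_pos and not has_neg:
        return "positive"
    if has_neg and not has_pos:
        return "negative"
    return None


def split_train_test(items, train_fraction=0.9, seed=0):
    """Shuffle a copy of the items and split it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Made-up tweets for illustration only.
tweets = [f"Great trip number {i} :)" for i in range(5)]
tweets += [f"Delayed flight number {i} :(" for i in range(5)]
tweets += ["No emoticon here", "Mixed feelings :) :("]

labeled = [(t, label_by_emoticon(t)) for t in tweets]
labeled = [(t, lab) for t, lab in labeled if lab is not None]
train, test = split_train_test(labeled)
print(len(train), len(test))
```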

    Predictive Analytics on Emotional Data Mined from Digital Social Networks with a Focus on Financial Markets

    This dissertation is a cumulative dissertation and is comprised of five articles. User-Generated Content (UGC) comprises a substantial part of communication via social media. In this dissertation, UGC that carries and facilitates the exchange of emotions is referred to as “emotional data.” People “produce” emotional data, that is, they express their emotions via tweets, forum posts, blogs, and so on, or they “consume” it by being influenced by expressed sentiments, feelings, opinions, and the like. Decisions often depend on shared emotions and data – which again lead to new data because decisions may change behaviors or results. “Emotional Data Intelligence” ultimately seeks an answer to the question of how all the different emotions expressed in public online sources influence decision-making processes. The overarching research topic of this dissertation follows the question whether network structures and emotional sentiment data extracted from digital social networks contain predictive information or they are just noise. Underlying data was collected from different social media sources, such as Twitter, blogs, message boards, or online news and social networking sites, such as Xing. By means of methodologies of social network analysis (SNA), sentiment analysis, and predictive analysis the individual contributions of this dissertation study whether sentiment data from social media or online social networking structures can predict real-world behaviors. The focus lies on the analysis of emotional data and network structures and its predictive power for financial markets. With the formal construction of the data analyses methodologies introduced in the individual contributions this dissertation contributes to the theories of social network analysis, sentiment analysis, and predictive analytics

    Opinion Mining of Movie Review using Hybrid Method of Support Vector Machine and Particle Swarm Optimization

    Nowadays, online social media is an online discourse where people create content, share it, bookmark it, and network at an impressive rate. One of the fastest and easiest-to-use social media platforms today is Twitter. The messages on Twitter include reviews and opinions on topics such as movies, books, products, politics, and so on. Based on this condition, this research attempts to use Twitter messages to review a movie by means of opinion mining, or sentiment analysis. Opinion mining refers to the application of natural language processing, computational linguistics, and text mining to identify or classify whether a movie is good or not based on the opinions expressed in messages. Support Vector Machine (SVM) is a supervised learning method that analyzes data and recognizes patterns used for classification. This research concerns binary classification into two classes, positive and negative. The positive class indicates a good opinion in the message, while the negative class indicates a bad opinion of a given movie. This justification is based on the accuracy level of SVM, validated using 10-fold cross-validation and a confusion matrix. The hybrid Particle Swarm Optimization (PSO) is used to improve the selection of the best parameters in order to solve the dual optimization problem. The results show an improvement in accuracy from 71.87% to 77%

    Extracting common emotions from blogs based on fine-grained sentiment clustering
