Predicting Movie Success with Machine Learning Techniques: Theoretical and Methodological Approaches to Improve Model Performance

Abstract

Thesis (Master's) -- Seoul National University Graduate School: Department of Business Administration, Management Information Systems major, February 2016. 박진수.

Previous studies on predicting the box-office performance of a movie using machine learning techniques have shown practical levels of predictive accuracy. However, their efforts to improve model accuracy have been limited to the methodological perspective. In this paper, we combine a theory-driven approach and a methodology-driven approach to further increase the accuracy of prediction models. First, we add a new feature derived from the theory of transmedia storytelling. Such theory-driven feature selection not only increases forecast accuracy but also enhances the explanatory power of the prediction model. Second, we use an ensemble approach, which has rarely been adopted in research on predicting box-office performance. As a result, our model, the Cinema Ensemble Model (CEM), outperforms the machine learning prediction models from past studies. We suggest that CEM can be used extensively by industry experts as a powerful tool for improving the decision-making process.

Contents

1. Introduction
2. Related Works
  2.1. Predictive studies in the movie domain
  2.2. The theory of transmedia storytelling
3. Methodology
  3.1. Building an Ensemble Model for Predicting Movie Success
  3.2. Descriptions of Learning Algorithms for Component Models
    3.2.1. Adaptive Tree Boosting
    3.2.2. Gradient Tree Boosting
    3.2.3. Linear Discriminant
    3.2.4. Logistic Regression
    3.2.5. Random Forests
    3.2.6. Support Vector Classifier
  3.3. Discretization of the Movie Success
  3.4. Feature Definition
    3.4.1. Genre
    3.4.2. Sequel
    3.4.3. Number of Plays at the Initial Day of Release
    3.4.4. Movie Buzz before the Release
    3.4.5. Transmedia Storytelling
    3.4.6. Star Buzz (i.e., Star Power)
4. Data Collection
5. Analysis
  5.1. Performance Metrics
  5.2. Candidate-model Performance
  5.3. Cinema Ensemble Model (CEM) Performance
  5.4. Performance Improvement by Transmedia Storytelling Feature
6. Discussion
References
