    Movie Industry Economics: How Data Analytics Can Help Predict Movies’ Financial Success

    Purpose: Data analytics techniques can help to predict movie success, as measured by box office sales or Oscar awards. Revenue prediction of a movie before its theatrical release is also an important indicator for attracting investors. While measures for predicting the success of a movie in box office sales and awards are largely missing, this study uses data analytics techniques to present a new measure for predicting movies' financial success. Methodology: Data were collected by web scraping and text mining. Classification and Regression Tree (CART), Random Forests, Conditional Forests, and Gradient Boosting were used, and a model for predicting movies' financial success is proposed. Content strategy and generating high-profile reviews with complex themes can add to controversy and increase the chance of nomination for major movie awards, including Oscars. Findings/Contribution: Findings show that data analytics is key to predicting the success of movies. Although predicting sales based on data available before the release remains difficult, even with state-of-the-art analytics technologies, it can reduce the risk for investors, studios and other stakeholders by helping them select promising film candidates before the production process starts. The contribution of this study is a model for predicting box office sales and the chance of an Oscar nomination. Practical Implications: Cinema managers and investors can use the proposed model as a guide for predicting movies' financial success.

    Movie’s box office performance prediction: An approach based on movie’s script, text mining and deep learning

    Dissertation presented as the partial requirement for obtaining a Master's degree in Information Management, specialization in Knowledge Management and Business Intelligence. The ability to predict movie box-office performance has been a field of interest for many researchers. However, most of these studies concentrate on variables that are available only in later stages, such as the production and post-production phases of films. The objective of this work is to develop a predictive model to forecast movie box-office performance based only on information in the movie script, using natural language processing techniques, text mining and deep neural networks. This approach aims to optimize the investor's decision-making process at an earlier stage of the project, with special focus on improving the selection process of the Brazilian Film Agency (ANCINE, Agência Nacional do Cinema).

    Influence of social media on performance of movies

    "May 2014." Thesis advisor: Dr. Wenjun Zeng. Includes bibliographical references (pages 51-53).

    Movies, TV programs and Youtube channels

    Doctoral dissertation -- Seoul National University Graduate School: College of Engineering, Department of Industrial Engineering, August 2021. 조성준. The content market, including the video content market, is a high-risk, high-return industry. Because the cost of copying and distributing created video content is very low, large profits can be generated upon success. However, as content is an experience good, its quality cannot be judged before purchase. Hence, marketing has an important role in the content market because of the asymmetry of information between suppliers and consumers. Additionally, content has the characteristics of One Source Multi Use: if it is successful, additional profits can be created through various channels. Therefore, it is important for the content industry to correctly distinguish content with a high probability of success from content without it and to conduct effective marketing activities to familiarize consumers with the product. Herein, we propose a methodology to assist data-based decision-making using machine learning models and to help identify problematic issues in video content markets such as movies, TV programs, and the over-the-top (OTT) market. In the film market, although marketing is very important, decisions are still made based on the intuition of practitioners. We used market research data collected through online and offline surveys to train a model that predicts audience size on the opening-week Saturday, and then used the trained model to propose a method for effective marketing activities. In the TV program market, programming is performed to improve overall viewership by matching TV programs and viewer groups well. We train a model that predicts the audience rating of a program using the characteristics of the program and the audience-rating information of the programs before, after, and at the same time, and use the resulting data to assist decision-making in finding the optimal programming scenario. 
The OTT market is facing a new problem of user perception bias caused by the "recent recommendation" system. In the fields of politics and news particularly, if the user does not have access to different viewpoints because of the recommendation service, it may create or deepen a bias toward a specific political view without the user being aware of it. To compensate for this, it is important that users follow recommended channels while being well aware of what kind of channel each one is. We built a channel network in the news/political field using data extracted from the comments left by users on the videos of each channel. In addition, we propose a method to compensate for the bias by classifying networks into conservative and progressive channel clusters and presenting the topography of the political tendencies of YouTube channels.

Contents:
1 Introduction
2 Prediction of Movie Audience on First Saturday with Decision Trees: 2.1 Background; 2.2 Related work; 2.3 Predictive model construction (2.3.1 Data; 2.3.2 Target variable; 2.3.3 Predictor variable; 2.3.4 Decision Tree and ensemble prediction models); 2.4 Prediction model evaluation; 2.5 Summary
3 Prediction of TV program ratings with Decision Trees: 3.1 Background; 3.2 Related work (3.2.1 Research on the ratings themselves; 3.2.2 Research on broadcasting programming); 3.3 Predictive model construction (3.3.1 Target variable; 3.3.2 Predictor variable; 3.3.3 Prediction Model); 3.4 Prediction model evaluation (3.4.1 Data; 3.4.2 Experimental results); 3.5 Optimization strategy using the predictive model (3.5.1 Broadcasting programming change process; 3.5.2 Case Study); 3.6 Summary
4 Relation detection of YouTube channels: 4.1 Background; 4.2 Related work; 4.3 Method (4.3.1 Channel representation; 4.3.2 Channel clustering with large k and merging clusters by keywords; 4.3.3 Relabeling with RWR; 4.3.4 Isolation score); 4.4 Result (4.4.1 Channel representation; 4.4.2 Channel clustering with large k and merging clusters by keywords; 4.4.3 Relabeling with RWR; 4.4.4 Isolation score); 4.5 Discussion (4.5.1 On the Representativeness of the Channel Preferences of the Users from Their Comments; 4.5.2 On Relabeling with RWR); 4.6 Summary
5 Conclusion: 5.1 Contribution; 5.2 Future Direction
Bibliography
Abstract in Korean
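
As a rough illustration of the channel network described in this abstract, the sketch below links two channels when the sets of users commenting on them overlap. The Jaccard measure, the 0.3 threshold, the function names, and the sample data are all assumptions made for illustration; the abstract does not say how the thesis represents channels or weights edges.

```python
from itertools import combinations


def jaccard(a, b):
    """Share of commenters two channels have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_channel_network(commenters, threshold=0.3):
    """Return {(channel_a, channel_b): weight} for pairs at or above threshold.

    `commenters` maps a channel name to the set of user ids that commented
    on its videos. The threshold value is an arbitrary illustrative choice.
    """
    edges = {}
    for a, b in combinations(sorted(commenters), 2):
        weight = jaccard(commenters[a], commenters[b])
        if weight >= threshold:
            edges[(a, b)] = weight
    return edges


# Hypothetical data: channel name -> users who left comments.
sample = {
    "channel_A": {"u1", "u2", "u3", "u4"},
    "channel_B": {"u2", "u3", "u4", "u5"},
    "channel_C": {"u6", "u7"},
}
network = build_channel_network(sample)
print(network)
```

Channels that share many commenters end up connected, so a clustering step run on such a graph could group channels whose audiences overlap; which clustering and relabeling steps the thesis actually uses is not stated in the abstract.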

    The power of prediction with social media

    Social media provide an impressive amount of data about users and their interactions, thereby offering computer and social scientists, economists, and statisticians, among others, new opportunities for research. Arguably, one of the most interesting lines of work is that of predicting future events and developments from social media data. However, current work is fragmented and lacks widely accepted evaluation approaches. Moreover, since the first techniques emerged rather recently, little is known about their overall potential, limitations and general applicability to different domains. Therefore, better understanding the predictive power and limitations of social media is of utmost importance.

    Open Models of Decision Support Towards a Framework

    This thesis presents a framework for open models of decision support through a compendium of papers that links research on the inward and outward flows of knowledge in organizations with decision support technologies. The framework presents the underlying factors driving new and more open models of decision support. A typology of decision support models is offered, considering the types of problems faced by organizations and by managers charged with decision-making. Thesis essay #1 offers a perspective on the changing landscape of decision support technology and the advancement of new technology for open models of decision support. This study provides insight, from an evolutionary perspective, into the expertise that has shaped the field of decision support technologies. The investigation sets out to reveal the changing landscape of expertise in supporting decision-making using technology and sheds light on the new role that experts will play in organizational decision-making. It suggests that a significant change in how decision-making is supported challenges the traditional roles of experts and non-experts. Finally, this paper explores opportunities for decision support technology integration and the added benefits artificial intelligence can bring to collective intelligence tools. Thesis essay #2 investigates the 'aggregate' typology within the open model decision support framework. A forecasting problem is used to highlight the complexity of demand forecasting in supply-chain management within the film industry and how technology is leveraged for effective supply-chain management decisions. The investigation compares two decision support technologies, expert systems and collective intelligence tools, and illustrates how the film industry uses each in forecasting box-office revenue. Finally, this essay explores the combined benefits of integrating these support technologies for more accurate forecasting. Thesis essay #3 is a longitudinal study over a 10-year period that uses IBM Innovation Jams as a context for large-scale collaboration within the 'platform' typology. This essay investigates the role of innovation jams in organizational change as IBM learned to engage with a new model of organizing innovation. It describes the role innovation jams have played in shaping the practice of open innovation at IBM. This essay uses the musical genre of a "jamband" as a metaphor to describe the emergent development and use of innovation jams as a way to understand organizational change. This longitudinal study brings innovation jam research up to date and shows how innovation jams evolved from a concept to a management tool and finally to a service. The essay concludes with a discussion of the implications of the findings for theorizing about new models of organizing innovation for organizational change. The research conducted in this thesis offers a framework of open models of decision support which suggests that internal and external sources of knowledge can be leveraged beyond product or service innovation, to include decision-making supported by emerging technology. The theoretical contributions of this thesis argue that organizations can no longer rely on decision support technology that focuses solely on bridging the boundary between rational and non-rational aspects of human social behavior but must instead consider the larger dynamic organizational network for decision support. Moreover, the practical implications of this thesis encourage organizations to think strategically about how emerging technology can support decision-making, how the resulting decision support models can be used to navigate the complex environment they work in and, in turn, how to forge stronger links with customers, suppliers, and the wider organizational network.

    Evaluating Copyright Protection in the Data-Driven Era: Centering on Motion Picture's Past and Future

    Since the 1910s, Hollywood has measured audience preferences with rough industry-created methods. In the 1940s, scientific audience research led by George Gallup started to conduct film audience surveys with traditional statistical and psychological methods. However, the quantity, quality, and speed were limited. Things changed dramatically in the internet age. The prevalence of digital data increases the immediacy, convenience, breadth, and depth of collecting audience and content data. Advanced data and AI technologies have also allowed machines to provide filmmakers with ideas or even make human-like expressions. This brings new copyright challenges in the data-driven era. Massive amounts of text and data are the premise of text and data mining (TDM), as well as the admission ticket to machine learning technologies. Given the high and uncertain copyright violation risks in the data-driven creation process, whoever controls the copyrighted film materials can monopolize the data and AI technologies to create motion pictures in the data-driven era. Considering that copyright shall not be the gatekeeper to new technological uses that do not impair the original uses of copyrighted works in the existing markets, this study proposes creating TDM and model-training limitations or exceptions to copyright and recommends the Singapore legislative model. Motion pictures, as public entertainment media, have inherently limited creative choices. Identifying the human original expression components of data-driven works is also challenging. This study proposes establishing a voluntarily negotiated license institution backed up by a compulsory license to enable other filmmakers to reuse film materials in new motion pictures. The film material's degree of human original authorship, certified by film artists' guilds, shall be a crucial factor in deciding the compulsory license's royalty rate and terms to encourage retaining human artists. 
This study argues that international and domestic policymakers should enjoy broad discretion to qualify data-driven work’s copyright protection because data-driven work is a new category of work. It would be too late to wait until ubiquitous data-driven works block human creative freedom and floods of data-driven work copyright litigations overwhelm the judicial systems

    Knowledge Modelling and Learning through Cognitive Networks

    One of the most promising developments in modelling knowledge is cognitive network science, which aims to investigate cognitive phenomena driven by the networked, associative organization of knowledge. For example, investigating the structure of semantic memory via semantic networks has illuminated how memory recall patterns influence phenomena such as creativity, memory search, learning, and more generally, knowledge acquisition, exploration, and exploitation. In parallel, neural network models for artificial intelligence (AI) are also becoming more widespread as inferential models for understanding which features drive language-related phenomena such as meaning reconstruction, stance detection, and emotional profiling. Whereas cognitive networks map explicitly which entities engage in associative relationships, neural networks perform an implicit mapping of correlations in cognitive data as weights, obtained after training over labelled data and whose interpretation is not immediately evident to the experimenter. This book aims to bring together quantitative, innovative research that focuses on modelling knowledge through cognitive and neural networks to gain insight into mechanisms driving cognitive processes related to knowledge structuring, exploration, and learning. The book comprises a variety of publication types, including reviews and theoretical papers, empirical research, computational modelling, and big data analysis. All papers here share a commonality: they demonstrate how the application of network science and AI can extend and broaden cognitive science in ways that traditional approaches cannot

    Price Prediction: Determining Changes in Stock Pricing through Sentiment Analysis of Online Consumer Reviews

    The rapid growth of technology has changed the way consumers socialize and make their purchasing decisions. The volume of online reviews has grown rapidly over the past decade, leading consumers' peer groups to carry a disproportionate weight in the purchasing decision process. The sheer volume of reviews makes it daunting for an operator to incorporate them into their analysis. Sentiment analysis allows large volumes of consumer reviews to be processed in a relatively easy and timely manner. The information contained in these reviews, the sentiment score, reflects the same impression hospitality consumers gather from other consumers prior to making their purchasing decision. To demonstrate the importance of these reviews, this study seeks to model the directional change of a company's stock price using the sentiment of consumer reviews as the primary predictor. Support Vector Machines are used to classify a year's worth of consumer reviews on nine distinct properties of a publicly traded Las Vegas gaming/hotel company. The resulting sentiment is then modeled using ARIMA techniques to forecast an out-of-time sample, and accuracy is assessed by showing that the probability of the results being due to random chance is minimal. The model accurately predicts 28 out of 39 time periods in the out-of-time sample, a result with less than a .0047 probability of being due to random chance.
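
The significance figure in this abstract can be sanity-checked with a short calculation. The sketch below assumes a one-sided binomial test in which each of the 39 directional calls is a fair coin flip under the null hypothesis; the abstract does not state which test the author used, so the test choice and the function name are assumptions.

```python
from math import comb


def binomial_upper_tail(successes, trials, p=0.5):
    """P(X >= successes) for X ~ Binomial(trials, p)."""
    return sum(
        comb(trials, k) * p**k * (1 - p) ** (trials - k)
        for k in range(successes, trials + 1)
    )


# 28 correct directional predictions out of 39 periods, as reported above.
p_value = binomial_upper_tail(28, 39)
print(f"P(at least 28 of 39 correct by chance) = {p_value:.4f}")
```

Under these assumptions the tail probability comes out at roughly 0.0047, which is in line with the figure quoted in the abstract.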