Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of the minds in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the corresponding entry
to the movie in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, including Supporting Information, 7 figures. Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
Advertising and Word-of-Mouth Effects on Pre-launch Consumer Interest and Initial Sales of Experience Products
This study examines how consumers' interest in a new experience product develops as a result of advertising and word-of-mouth activities during the pre-launch period. The empirical settings are the U.S. motion picture and video game industries. The focal variables include weekly ad spend, blog volume, online search volume during pre-launch periods, opening-week sales, and product characteristics. We treat pre-launch search volume of keywords as a measure of pre-launch consumer interest in the related product. To identify probable persistent effects among the pre-launch time-series variables, we apply a vector autoregressive modeling approach. We find that blog postings have permanent, trend-setting effects on pre-launch consumer interest in a new product, while advertising has only temporary effects. In the U.S. motion picture industry, the four-week cumulative elasticity of pre-launch consumer interest is 0.187 to advertising and 0.635 to blog postings. In the U.S. video game industry, the elasticities are 0.093 and 1.306, respectively. We also find long-run co-evolution between blog and search volume, which suggests that consumers' interest in the upcoming product cannot grow without bounds for a given level of blog volume.
Knowledge will Propel Machine Understanding of Content: Extrapolating from Current Examples
Machine Learning has been a big success story during the AI resurgence. One
particular standout success relates to learning from a massive amount of data.
In spite of early assertions of the unreasonable effectiveness of data, there
is increasing recognition of the value of utilizing knowledge whenever it is available or
can be created purposefully. In this paper, we discuss the indispensable role
of knowledge for deeper understanding of content where (i) large amounts of
training data are unavailable, (ii) the objects to be recognized are complex
(e.g., implicit entities and highly subjective content), and (iii) applications
need to use complementary or related data in multiple modalities/media. What
brings us to the cusp of rapid progress is our ability to (a) create relevant
and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP
techniques. Using diverse examples, we seek to foretell unprecedented progress
in our ability for deeper understanding and exploitation of multimodal data and
continued incorporation of knowledge in learning techniques.
Comment: Pre-print of the paper accepted at 2017 IEEE/WIC/ACM International
Conference on Web Intelligence (WI). arXiv admin note: substantial text
overlap with arXiv:1610.0770