Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of mind in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the corresponding entry
to the movie in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
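To make the idea of a minimalistic predictive model concrete, here is a small sketch of a one-predictor least-squares fit. It assumes a single pre-release activity feature (log page views) and a log-revenue target; the function names and the toy numbers are hypothetical and are not taken from the paper or its dataset.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def predict(intercept, slope, x):
    """Predicted target value for a new feature value x."""
    return intercept + slope * x


# Hypothetical toy numbers (assumption, NOT real data):
# log10(page views before release) vs. log10(opening revenue).
log_views = [3.0, 3.5, 4.0, 4.5, 5.0]
log_revenue = [5.1, 5.4, 6.1, 6.4, 7.0]

intercept, slope = fit_line(log_views, log_revenue)
forecast = predict(intercept, slope, 4.2)  # forecast for an unreleased movie
```

A real model of this kind would use several activity features (for example, numbers of edits and editors) measured at a fixed time before release, and would be evaluated on held-out movies.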
A Feature-Based Bayesian Method for Content Popularity Prediction in Edge-Caching Networks
Edge-caching is recognized as an efficient technique for future wireless
cellular networks to improve network capacity and user-perceived quality of
experience. Due to the random content requests and the limited cache memory,
designing an efficient caching policy is a challenge. To enhance the
performance of caching systems, an accurate content request prediction
algorithm is essential. Here, we introduce a flexible model, a Poisson
regressor based on a Gaussian process, for the content request distribution in
stationary environments. Our proposed model can incorporate the content
features as side information for prediction enhancement. In order to learn the
model parameters, which yield the Poisson rates or alternatively content
popularities, we invoke the Bayesian approach which is very robust against
over-fitting.
However, the posterior distribution in the Bayes formula is analytically
intractable to compute. To tackle this issue, we apply a Markov Chain Monte
Carlo (MCMC) method to approximate the posterior distribution. Two types of
predictive distributions are formulated for the requests of existing contents
and for the requests of a newly-added content. Finally, simulation results are
provided to confirm the accuracy of the developed content popularity learning
approach.
Comment: arXiv admin note: substantial text overlap with arXiv:1903.0306
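The approach summarized above (Poisson rates driven by a Gaussian process over content features, with the posterior approximated by MCMC) can be sketched in a few lines. This is only an illustration under stated assumptions: an RBF kernel, a log link, fixed hyperparameters, and a plain random-walk Metropolis sampler, none of which is confirmed by the abstract. All function names are hypothetical.

```python
import math
import random


def rbf_kernel(xs, length_scale=1.0, variance=1.0, jitter=1e-6):
    """RBF covariance matrix over feature vectors (assumed kernel choice)."""
    n = len(xs)
    K = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            d2 = sum((a - b) ** 2 for a, b in zip(xs[i], xs[j]))
            K[i][j] = variance * math.exp(-0.5 * d2 / length_scale ** 2)
        K[i][i] += jitter  # small diagonal term for numerical stability
    return K


def cholesky(K):
    """Lower-triangular L with L * L^T = K."""
    n = len(K)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(K[i][i] - s)
            else:
                L[i][j] = (K[i][j] - s) / L[j][j]
    return L


def log_posterior(z, L, counts):
    """Unnormalized log posterior in whitened coordinates, where f = L z."""
    f = [sum(L[i][k] * z[k] for k in range(i + 1)) for i in range(len(z))]
    # Poisson log-likelihood with rate exp(f_i), dropping the constant log(y_i!)
    log_lik = sum(y * fi - math.exp(fi) for y, fi in zip(counts, f))
    log_prior = -0.5 * sum(v * v for v in z)  # z ~ N(0, I)
    return log_lik + log_prior, f


def metropolis_rates(features, counts, n_samples=4000, burn_in=1000,
                     step=0.15, seed=0):
    """Posterior-mean Poisson rates from random-walk Metropolis sampling."""
    rng = random.Random(seed)
    L = cholesky(rbf_kernel(features))
    z = [0.0] * len(counts)
    logp, f = log_posterior(z, L, counts)
    sums = [0.0] * len(counts)
    kept = 0
    for t in range(n_samples):
        proposal = [v + rng.gauss(0.0, step) for v in z]
        logp_new, f_new = log_posterior(proposal, L, counts)
        # Accept with probability min(1, posterior ratio)
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            z, logp, f = proposal, logp_new, f_new
        if t >= burn_in:
            sums = [s + math.exp(fi) for s, fi in zip(sums, f)]
            kept += 1
    return [s / kept for s in sums]


# Hypothetical toy input: one feature per content, request counts per unit time.
rates = metropolis_rates([[0.0], [0.1], [3.0]], [20, 22, 1])
```

Sampling the whitened variable z rather than f directly keeps the prior term trivial and avoids inverting the kernel matrix. Because contents with similar features share correlated latent values, the first two toy contents borrow strength from each other.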
Tracking the History and Evolution of Entities: Entity-centric Temporal Analysis of Large Social Media Archives
How did the popularity of the Greek Prime Minister evolve in 2015? How did
the predominant sentiment about him vary during that period? Were there any
controversial sub-periods? What other entities were related to him during these
periods? To answer these questions, one needs to analyze archived documents and
data about the query entities, such as old news articles or social media
archives. In particular, user-generated content posted in social networks, like
Twitter and Facebook, can be seen as a comprehensive documentation of our
society, and thus meaningful analysis methods over such archived data are of
immense value for sociologists, historians and other interested parties who
want to study the history and evolution of entities and events. To this end, in
this paper we propose an entity-centric approach to analyze social media
archives and we define measures that allow studying how entities were reflected
in social media in different time periods and under different aspects, like
popularity, attitude, controversiality, and connectedness with other entities.
A case study using a large Twitter archive of four years illustrates the
insights that can be gained by such an entity-centric and multi-aspect
analysis.
Comment: This is a preprint of an article accepted for publication in the
International Journal on Digital Libraries (2018
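As a rough illustration of an entity-centric popularity measure over time periods, the sketch below computes, for each period, the fraction of posts that mention a given entity. This definition, the function name, and the input layout (pairs of period label and set of mentioned entities) are assumptions for illustration; the paper's own measures may be defined differently.

```python
from collections import defaultdict


def popularity_by_period(posts, entity):
    """Fraction of posts per period that mention `entity`.

    `posts` is an iterable of (period, entities) pairs, where `entities`
    is the set of entity names recognized in one post (hypothetical layout).
    """
    totals = defaultdict(int)
    mentions = defaultdict(int)
    for period, entities in posts:
        totals[period] += 1
        if entity in entities:
            mentions[period] += 1
    return {p: mentions[p] / totals[p] for p in sorted(totals)}


# Hypothetical toy archive with placeholder entity names.
toy_posts = [
    ("2015-01", {"A", "B"}),
    ("2015-01", {"B"}),
    ("2015-02", {"A"}),
]
trend = popularity_by_period(toy_posts, "A")
```

Other aspects mentioned in the abstract, such as attitude or connectedness, would need sentiment labels or co-occurrence counts in addition to the mention sets used here.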
A Bayesian Poisson-Gaussian Process Model for Popularity Learning in Edge-Caching Networks
Edge-caching is recognized as an efficient technique for future cellular
networks to improve network capacity and user-perceived quality of experience.
To enhance the performance of caching systems, designing an accurate content
request prediction algorithm plays an important role. In this paper, we develop
a flexible model, a Poisson regressor based on a Gaussian process, for the
content request distribution.
The first important advantage of the proposed model is that it encourages the
already existing or seen contents with similar features to be correlated in the
feature space and therefore it acts as a regularizer for the estimation.
Second, it allows predicting the popularities of newly-added or unseen contents
whose statistical data is not available in advance. In order to learn the model
parameters, which yield the Poisson arrival rates or alternatively the content
popularities, we invoke the Bayesian approach, which is robust against
over-fitting.
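One common way to write a Poisson regressor with a Gaussian-process prior is given below as a compact restatement of the model just described. The log link, the kernel symbol $k$, and the notation $\mathbf{x}_i$ (features of content $i$), $y_i$ (request count), and $f$ (latent function) are assumptions for illustration and are not taken from the abstract.

```latex
\begin{align}
f &\sim \mathcal{GP}\bigl(0,\, k(\mathbf{x}, \mathbf{x}')\bigr), \\
\lambda_i &= \exp\bigl(f(\mathbf{x}_i)\bigr), \\
y_i \mid \lambda_i &\sim \mathrm{Poisson}(\lambda_i), \\
p(f \mid \mathbf{y}) &\propto p(f) \prod_{i} \frac{\lambda_i^{y_i} e^{-\lambda_i}}{y_i!}.
\end{align}
```

Under this form the Gaussian prior is not conjugate to the Poisson likelihood, so the normalizing constant of the posterior has no closed form.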
However, the resulting posterior distribution is analytically intractable to
compute. To tackle this, we apply a Markov Chain Monte Carlo (MCMC) method to
approximate this distribution; the approximation is asymptotically exact.
Nevertheless, MCMC is computationally demanding, especially when the number of
contents is large. Thus, we employ the Variational Bayes (VB) method as a
low-complexity alternative. More specifically, the VB method addresses the
approximation of the posterior distribution through an optimization problem.
Subsequently, we present a fast block-coordinate descent algorithm to solve
this optimization problem. Finally, extensive simulation results both on
synthetic and real-world datasets are provided to show the accuracy of our
prediction algorithm and the cache hit ratio (CHR) gain compared to existing
methods from the literature.
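The cache hit ratio (CHR) mentioned above can be illustrated with a short sketch. It assumes a simple policy that caches the contents with the highest predicted popularity and then measures the fraction of requests served from the cache; the function name and the toy values are hypothetical, and the paper's evaluation setup may differ.

```python
def cache_hit_ratio(predicted_popularity, requests, cache_size):
    """Fraction of requests served from a cache holding the top-ranked contents.

    `predicted_popularity` maps content id -> predicted request rate.
    `requests` is the observed sequence of requested content ids.
    """
    ranked = sorted(predicted_popularity,
                    key=predicted_popularity.get, reverse=True)
    cached = set(ranked[:cache_size])
    if not requests:
        return 0.0
    hits = sum(1 for r in requests if r in cached)
    return hits / len(requests)


# Hypothetical toy values: three contents, a cache that holds two of them.
chr_value = cache_hit_ratio({"a": 5.0, "b": 3.0, "c": 0.5},
                            ["a", "b", "c", "a"], cache_size=2)
```

Better popularity predictions change which contents end up in the cache, which is why the CHR serves as an end-to-end measure of prediction quality.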