Will This Video Go Viral? Explaining and Predicting the Popularity of Youtube Videos
What makes content go viral? Why do some videos become popular while others
don't? Such questions have elicited significant attention from both researchers
and industry, particularly in the context of online media. A range of models
have been recently proposed to explain and predict popularity; however, there
is a short supply of practical tools, accessible for regular users, that
leverage these theoretical results. HIPie -- an interactive visualization
system -- is created to fill this gap, by enabling users to reason about the
virality and the popularity of online videos. It retrieves the metadata and the
past popularity series of YouTube videos, it employs the Hawkes Intensity Process,
a state-of-the-art online popularity model for explaining and predicting video
popularity, and it presents videos comparatively in a series of interactive
plots. This system will help both content consumers and content producers in a
range of data-driven inquiries, such as to comparatively analyze videos and
channels, to explain and predict future popularity, to identify viral videos,
and to estimate response to online promotion.
Comment: 4 page
Recurrent Neural Networks for Online Video Popularity Prediction
In this paper, we address the problem of popularity prediction of online
videos shared in social media. We show that this challenging task can be
approached using recently proposed deep neural network architectures. We cast
the popularity prediction problem as a classification task and we aim to solve
it using only visual cues extracted from videos. To that end, we propose a new
method based on a Long-term Recurrent Convolutional Network (LRCN) that
incorporates the sequentiality of the information in the model. Results
obtained on a dataset of over 37,000 videos published on Facebook show that
using our method leads to over 30% improvement in prediction performance over
the traditional shallow approaches and can provide valuable insights for
content creators.
Shallow reading with Deep Learning: Predicting popularity of online content using only its title
With the ever decreasing attention span of contemporary Internet users, the
title of online content (such as a news article or video) can be a major factor
in determining its popularity. To take advantage of this phenomenon, we propose
a new method based on a bidirectional Long Short-Term Memory (LSTM) neural
network designed to predict the popularity of online content using only its
title. We evaluate the proposed architecture on two distinct datasets of news
articles and news videos distributed in social media that contain over 40,000
samples in total. On those datasets, our approach improves the performance over
traditional shallow approaches by a margin of 15%. Additionally, we show that
using pre-trained word vectors in the embedding layer improves the results of
LSTM models, especially when the training set is small. To our knowledge, this
is the first attempt at applying popularity prediction using only textual
information from the title.
When is it Biased? Assessing the Representativeness of Twitter's Streaming API
Twitter has captured the interest of the scientific community not only for
its massive user base and content, but also for its openness in sharing its
data. Twitter shares a free 1% sample of its tweets through the "Streaming
API", a service that returns a sample of tweets according to a set of
parameters set by the researcher. Recently, research has pointed to evidence of
bias in the data returned through the Streaming API, raising concerns about the
integrity of this data service for use in research scenarios. While these
results are important, the methodologies proposed in previous work rely on the
restrictive and expensive Firehose to find the bias in the Streaming API data.
In this work we tackle the problem of finding sample bias without the need for
"gold standard" Firehose data. Namely, we focus on finding time periods in the
Streaming API data where the trend of a hashtag is significantly different from
its trend in the true activity on Twitter. We propose a solution that focuses
on using an open data source to find bias in the Streaming API. Finally, we
assess the utility of the data source in sparse data situations and for users
issuing the same query from different regions.
Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
Use of socially generated "big data" to access information about collective
states of mind in human societies has become a new paradigm in the
emerging field of computational social science. A natural application of this
would be the prediction of the society's reaction to a new product in the sense
of popularity and adoption rate. However, bridging the gap between "real time
monitoring" and "early predicting" remains a big challenge. Here we report on
an endeavor to build a minimalistic predictive model for the financial success
of movies based on collective activity data of online users. We show that the
popularity of a movie can be predicted well before its release by measuring and
analyzing the activity level of editors and viewers of the movie's corresponding
entry in Wikipedia, the well-known online encyclopedia.
Comment: 13 pages, Including Supporting Information, 7 Figures, Download the
dataset from: http://wwm.phy.bme.hu/SupplementaryDataS1.zi
Breaking the News: First Impressions Matter on Online News
A growing number of people are changing the way they consume news, replacing
traditional physical newspapers and magazines with their online versions
and/or weblogs. The interactivity and immediacy of online news
are changing the way news is produced and presented by media corporations.
News websites have to create effective strategies to catch people's attention
and attract their clicks. In this paper we investigate possible strategies used
by online news corporations in the design of their news headlines. We analyze
the content of 69,907 headlines produced by four major global media
corporations during a minimum of eight consecutive months in 2014. In order to
discover strategies that could be used to attract clicks, we extracted features
from the text of the news headlines related to the sentiment polarity of the
headline. We discovered that the sentiment of the headline is strongly related
to the popularity of the news and also to the dynamics of the comments posted
on that particular news item.
Comment: The paper appears in ICWSM 201