Predicting popularity of online videos using Support Vector Regression
In this work, we propose a regression method to predict the popularity of an
online video based on temporal and visual cues. Our method uses Support Vector
Regression with Gaussian Radial Basis Functions. We show that modelling
popularity patterns with this approach provides higher and more stable
prediction results, mainly thanks to the non-linear character of the
proposed method as well as its resistance to overfitting. We compare our
method with the state of the art on datasets containing over 14,000 videos from
YouTube and Facebook. Furthermore, we show that results obtained relying only
on the early distribution patterns can be improved by adding social and visual
metadata.
The Entropy of Attention and Popularity in YouTube Videos
The vast majority of YouTube videos never become popular, languishing in
obscurity with few views, no likes, and no comments. We use information
theoretical measures based on entropy to examine how time series distributions
of common measures of popularity in videos from YouTube's "Trending videos" and
"Most recent" video feeds relate to the theoretical concept of attention. While
most of the videos in the "Most recent" feed are never popular, some 20% of
them have distributions of attention metrics and measures of entropy that are
similar to distributions for "Trending videos". We analyze how the 20% of "Most
recent" videos that become somewhat popular differ from the 80% that do not,
then compare these popular "Most recent" videos to different subsets of
"Trending videos" to try to characterize and compare the attention each
receives.
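
As a rough illustration of the kind of entropy measure this abstract refers to, the sketch below computes a normalized Shannon entropy over a view-count time series. The function name, the normalization, and the sample series are assumptions made for illustration; the paper's actual measures may differ.

```python
import math


def normalized_entropy(counts):
    """Shannon entropy of a count series, scaled to [0, 1].

    Hypothetical helper: each entry is treated as the share of total
    attention received in one time bin.
    """
    total = sum(counts)
    if total == 0 or len(counts) < 2:
        return 0.0
    probs = [c / total for c in counts if c > 0]
    entropy = sum(p * math.log2(1 / p) for p in probs)
    # Divide by the maximum possible entropy (uniform over all bins).
    return entropy / math.log2(len(counts))


# Illustrative, made-up series: all views on one day vs. evenly spread views.
bursty = [1000, 0, 0, 0]
steady = [250, 250, 250, 250]
print(normalized_entropy(bursty), normalized_entropy(steady))
```

Under this assumed definition, a series whose attention is concentrated in a single bin scores at the low end, and an evenly spread series scores at the high end.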
Tracking Large-Scale Video Remix in Real-World Events
Social information networks, such as YouTube, contain traces of both
explicit online interactions (such as liking, leaving a comment, or subscribing
to a video feed), and latent interactions (such as quoting, or remixing parts of
a video). We propose visual memes, or frequently re-posted short video
segments, for tracking such latent video interactions at scale. Visual memes
are extracted by scalable detection algorithms that we develop, with high
accuracy. We further augment visual memes with text, via a statistical model of
latent topics. We model content interactions on YouTube with visual memes,
defining several measures of influence and building predictive models for meme
popularity. Experiments are carried out on over 2 million video shots from
more than 40,000 videos on two prominent news events in 2009: the election in
Iran and the swine flu epidemic. In these two events, a high percentage of
videos contain remixed content, and it is apparent that traditional news media
and citizen journalists have different roles in disseminating remixed content.
We perform two quantitative evaluations for annotating visual memes and
predicting their popularity. The joint statistical model of visual memes and
words outperforms a concurrence model, and the average error is ~2% for
predicting meme volume and ~17% for their lifespan.
Comment: 11 pages, accepted for journal publication
Who Watches (and Shares) What on YouTube? And When? Using Twitter to Understand YouTube Viewership
We combine user-centric Twitter data with video-centric YouTube data to
analyze who watches and shares what on YouTube. The combination of the two data
sets, with 87k Twitter users, 5.6 million YouTube videos and 15 million video
sharing events,
allows rich analysis going beyond what could be obtained with either of the two
data sets individually. For Twitter, we generate user features relating to
activity, interests and demographics. For YouTube, we obtain video features for
topic, popularity and polarization. These two feature sets are combined through
sharing events for YouTube URLs on Twitter. This combination is done in a
user-, a video- and a sharing-event-centric manner. For the user-centric
analysis, we show how Twitter user features correlate both with YouTube
features and with sharing-related features. As two examples, we show that urban
users are quicker to share than rural users and that, for some notions of
"influence", influential users on Twitter share videos with a higher number of
views. For the video-centric analysis, we find a superlinear relation between
initial Twitter shares and the final number of views, showing the correlated
behavior of Twitter. On user impact, we find the total number of followers of
users that shared the video in the first week does not affect its final
popularity. However, aggregated user retweet rates serve as a better predictor
for YouTube video popularity. For the sharing-centric analysis, we reveal the
existence of
correlated behavior concerning the time between video creation and sharing
within certain timescales, showing the time onset for a coherent response, and
the time limit after which collective responses are extremely unlikely. We show
that response times depend on video category, revealing that Twitter sharing of
a video is highly dependent on its content. To the best of our knowledge this
is the first large-scale study combining YouTube and Twitter data.
Comment: 12 pages, 8 figures and 10 tables
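
One common way to check for a superlinear relation like the one described in this abstract is to fit the exponent of a power law on log-log axes; an exponent above 1 indicates superlinear growth. The sketch below is a minimal version under that assumption. The function name and the synthetic data are illustrative and are not taken from the paper.

```python
import math


def power_law_exponent(xs, ys):
    """Least-squares slope of log(y) against log(x).

    Hypothetical helper: assumes y is roughly c * x**alpha with all
    values strictly positive, and returns the fitted alpha.
    """
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    mean_x = sum(log_x) / len(log_x)
    mean_y = sum(log_y) / len(log_y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    var = sum((a - mean_x) ** 2 for a in log_x)
    return cov / var


# Synthetic data generated with exponent 1.5 (not real measurements).
initial_shares = [1, 2, 4, 8, 16]
final_views = [10 * s ** 1.5 for s in initial_shares]
alpha = power_law_exponent(initial_shares, final_views)
print(alpha > 1)  # an exponent above 1 means a superlinear relation
```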
On the Dynamics of Social Media Popularity: A YouTube Case Study
Understanding the factors that impact the popularity dynamics of social media
can drive the design of effective information services, besides providing
valuable insights to content generators and online advertisers. Taking YouTube
as a case study, we analyze how video popularity evolves since upload, extracting
popularity trends that characterize groups of videos. We also analyze the
referrers that lead users to videos, correlating them, features of the video
and early popularity measures with the popularity trend and total observed
popularity the video will experience. Our findings provide fundamental
knowledge about popularity dynamics and its implications for services such as
advertising and search.
Comment: Extended version of a paper published in ACM WSDM 2011. Pre-print of
the paper accepted for publication in the ACM Transactions on Internet
Technology
Deriving Latent Social Impulses to Determine Longevous Videos
Online video websites receive a huge number of videos daily from users all
around the world. How to provide valuable recommendations to viewers is an
important task for both video websites and related third parties, such as
search engines. Previous work conducted numerous analyses of the view counts of
videos, which measure a video's value in terms of popularity. However, the
long-lasting value of an online video, namely longevity, is hidden in the
history of how a video accumulates its "popularity" over time. Generally
speaking, a longevous video tends to constantly draw society's attention.
Focusing on one of the leading video websites, YouTube, this paper proposes a
scoring mechanism quantifying a video's longevity. Evaluating a video's
longevity can not only improve a video recommender system, but also help us to
discover videos having greater advertising value, as well as adjust a video
website's strategy of storing videos to shorten its responding time. In order
to accurately quantify longevity, we introduce the concept of latent social
impulses and how to use them to measure a video's longevity. In order to derive
latent social impulses, we view the video website as a digital signal filter
and formulate the task as a convex minimization problem. The proposed longevity
computation is based on the derived social impulses. Unfortunately, the
information required to derive social impulses is not always public, which
makes a third party unable to directly evaluate every video's longevity. To
solve this problem, we formulate a semi-supervised learning task by using part
of videos having known longevity scores to predict the unknown longevity
scores. We propose a Gaussian Random Markov model with Loopy Belief Propagation
to solve this problem. Experiments conducted on YouTube demonstrate that
the proposed method significantly improves the prediction results compared to
baselines.
Comment: Accepted by WWW '14 as a poster paper
A Survey of Information Cascade Analysis: Models, Predictions, and Recent Advances
The deluge of digital information in our daily life -- from user-generated
content, such as microblogs and scientific papers, to online business, such as
viral marketing and advertising -- offers unprecedented opportunities to
explore and exploit the trajectories and structures of the evolution of
information cascades. Abundant research efforts, both academic and industrial,
have aimed to reach a better understanding of the mechanisms driving the spread
of information and quantifying the outcome of information diffusion. This
article presents a comprehensive review and categorization of information
popularity prediction methods, from feature engineering and stochastic
processes, through graph representation, to deep learning-based approaches.
Specifically, we first formally define different types of information cascades
and summarize the perspectives of existing studies. We then present a taxonomy
that categorizes existing works into the aforementioned three main groups as
well as the main subclasses in each group, and we systematically review
cutting-edge research work. Finally, we summarize the pros and cons of existing
research efforts and outline the open challenges and opportunities in this
field.
Comment: Author version, with 43 pages, 9 figures, and 11 tables
Forecasting Popularity of Videos using Social Media
This paper presents a systematic online prediction method (Social-Forecast)
that is capable of accurately forecasting the popularity of videos promoted by
social media. Social-Forecast explicitly considers the dynamically changing and
evolving propagation patterns of videos in social media when making popularity
forecasts, thereby being situation and context aware. Social-Forecast aims to
maximize the forecast reward, which is defined as a tradeoff between the
popularity prediction accuracy and the timeliness with which a prediction is
issued. The forecasting is performed online and requires no training phase or a
priori knowledge. We analytically bound the prediction performance loss of
Social-Forecast as compared to that obtained by an omniscient oracle and prove
that the bound is sublinear in the number of video arrivals, thereby
guaranteeing its short-term performance as well as its asymptotic convergence
to the optimal performance. In addition, we conduct extensive experiments using
real-world data traces collected from the videos shared in RenRen, one of the
largest online social networks in China. These experiments show that our
proposed method outperforms existing view-based approaches for popularity
prediction (which are not context-aware) by more than 30% in terms of
prediction rewards.
TrendLearner: Early Prediction of Popularity Trends of User Generated Content
We here focus on the problem of predicting the popularity trend of user
generated content (UGC) as early as possible. Taking YouTube videos as a case
study, we propose a novel two-step learning approach that: (1) extracts
popularity trends from previously uploaded objects, and (2) predicts trends for
new content. Unlike previous work, our solution explicitly addresses the
inherent tradeoff between prediction accuracy and remaining interest in the
content after prediction, solving it on a per-object basis. Our experimental
results show great improvements of our solution over alternatives, and its
applicability to improve the accuracy of state-of-the-art popularity prediction
methods.
Comment: To appear at Elsevier Information Sciences Journal
Popularity and Quality in Social News Aggregators: A Study of Reddit and Hacker News
In this paper we seek to understand the relationship between the online
popularity of an article and its intrinsic quality. Prior experimental work
suggests that the relationship between quality and popularity can be very
distorted due to factors like social influence bias and inequality in
visibility. We conduct a study of popularity on two different social news
aggregators, Reddit and Hacker News. We define quality as the relative number
of votes an article would have received if each article were shown, in a
bias-free way, to an equal number of users. We propose a simple Poisson
regression method to estimate this quality metric from time-series voting data.
We validate our methods on data from Reddit and Hacker News, as well as the
experimental data from prior work. This method works well even though the
collected data is subject to common social media biases. Using these estimates,
we find that popularity on Reddit and Hacker News is a stronger reflection of
intrinsic quality than expected.
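
To make the idea of an exposure-adjusted quality estimate concrete, the sketch below uses a deliberately simplified Poisson model: votes in each time window are assumed to be Poisson-distributed with mean equal to quality times exposure, which gives a closed-form maximum-likelihood estimate. The exposure figures, the function name, and the model itself are assumptions for illustration; the abstract does not specify the paper's regression in this detail.

```python
def poisson_rate_mle(votes, exposure):
    """Maximum-likelihood rate for votes[t] ~ Poisson(q * exposure[t]).

    Hypothetical helper: under this assumed model the estimate of q is
    total votes divided by total exposure.
    """
    total_exposure = sum(exposure)
    if total_exposure <= 0:
        raise ValueError("exposure must sum to a positive value")
    return sum(votes) / total_exposure


# Made-up data: article A was shown far more often than article B.
q_a = poisson_rate_mle(votes=[30, 10, 5], exposure=[1000, 500, 250])
q_b = poisson_rate_mle(votes=[5, 5, 5], exposure=[100, 100, 100])
print(q_a, q_b)
```

In this toy example article A collects more votes overall, yet article B has the higher estimated rate once exposure is accounted for, which is the kind of distinction between popularity and quality the abstract draws.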