Will This Video Go Viral? Explaining and Predicting the Popularity of Youtube Videos
What makes content go viral? Why do some videos become popular while others
don't? Such questions have elicited significant attention from both researchers
and industry, particularly in the context of online media. A range of models
has recently been proposed to explain and predict popularity; however,
practical tools that make these theoretical results accessible to regular
users remain in short supply. HIPie, an interactive visualization
system, is designed to fill this gap by enabling users to reason about the
virality and the popularity of online videos. It retrieves the metadata and the
past popularity series of Youtube videos, it employs Hawkes Intensity Process,
a state-of-the-art online popularity model for explaining and predicting video
popularity, and it presents videos comparatively in a series of interactive
plots. This system will help both content consumers and content producers in a
range of data-driven inquiries, such as to comparatively analyze videos and
channels, to explain and predict future popularity, to identify viral videos,
and to estimate the response to online promotion.
Comment: 4 pages
Cross-Partisan Discussions on YouTube: Conservatives Talk to Liberals but Liberals Don't Talk to Conservatives
We present the first large-scale measurement study of cross-partisan
discussions between liberals and conservatives on YouTube, based on a dataset
of 274,241 political videos from 973 channels of US partisan media and 134M
comments from 9.3M users over eight months in 2020. Contrary to a simple
narrative of echo chambers, we find a surprising amount of cross-talk: most
users with at least 10 comments posted at least once on both left-leaning and
right-leaning YouTube channels. Cross-talk, however, was not symmetric. Based
on the user leaning predicted by a hierarchical attention model, we find that
conservatives were much more likely to comment on left-leaning videos than
liberals on right-leaning videos. Secondly, YouTube's comment sorting algorithm
made cross-partisan comments modestly less visible; for example, comments from
conservatives made up 26.3% of all comments on left-leaning videos but only
just over 20% of those in the top 20 positions. Lastly, using
Perspective API's toxicity score as a measure of quality, we find that
conservatives were not significantly more toxic than liberals when users
directly commented on the content of videos. However, when users replied to
comments from other users, we find that cross-partisan replies were more toxic
than co-partisan replies on both left-leaning and right-leaning videos, with
cross-partisan replies being especially toxic on the replier's home turf.
Comment: Accepted into ICWSM 2021; the code and datasets are publicly
available at https://github.com/avalanchesiqi/youtube-crosstal
How to Train Your YouTube Recommender to Avoid Unwanted Videos
YouTube provides features for users to indicate disinterest when presented
with unwanted recommendations, such as the "Not interested" and "Don't
recommend channel" buttons. These buttons are purported to allow the user to
correct "mistakes" made by the recommendation system. Yet, relatively little is
known about the empirical efficacy of these buttons. Neither is much known
about users' awareness of and confidence in them. To address these gaps, we
simulated YouTube users with sock puppet agents. Each agent first executed a
"stain phase", where it watched many videos of one assigned topic; it then
executed a "scrub phase", where it tried to remove recommendations of the
assigned topic. Each agent repeatedly applied a single scrubbing strategy,
either indicating disinterest in one of the videos visited in the stain phase
(disliking it or deleting it from the watch history), or indicating disinterest
in a video recommended on the homepage (clicking the "not interested" or "don't
recommend channel" button or opening the video and clicking the dislike
button). We found that the stain phase significantly increased the fraction of
the recommended videos dedicated to the assigned topic on the user's homepage.
For the scrub phase, using the "Not interested" button worked best,
significantly reducing such recommendations in all topics tested, on average
removing 88% of them. Neither the stain phase nor the scrub phase, however, had
much effect on video-page recommendations. We also ran a survey (N = 300) asking
adult YouTube users in the US whether they were aware of and used these buttons
before, as well as how effective they found these buttons to be. We found that
44% of participants were not aware that the "Not interested" button existed.
However, those who were aware of this button often used it to remove unwanted
recommendations (82.8%) and found it to be modestly effective (3.42 out of 5).
Comment: Accepted into ICWSM 2024; the code is publicly available at
https://github.com/avliu-um/youtube-disinteres
Measuring Collective Attention in Online Content: Sampling, Engagement, and Network Effects
The production and consumption of online content have been increasing rapidly, whereas human attention is a scarce resource. Understanding how the content captures collective attention has become a challenge of growing importance. In this thesis, we tackle this challenge from three fronts -- quantifying sampling effects of social media data; measuring engagement behaviors towards online content; and estimating network effects induced by the recommender systems.
Data sampling is a fundamental problem. To obtain a list of items, one common method is to sample based on item prevalence in social media streams. However, social data is often noisy and incomplete, which may affect subsequent observations. For each item, user behaviors can be conceptualized as two steps: the first step relates to the content's appeal, measured by the number of clicks; the second step relates to the content's quality, measured by post-click metrics such as dwell time, likes, or comments. We therefore categorize online attention (behaviors) into two classes: popularity (clicking) and engagement (watching, liking, or commenting). Moreover, modern platforms use recommender systems to present users with a tailored content display that maximizes satisfaction. A recommendation alters the appeal of an item by changing its ranking, and consequently impacts its popularity.
Our research is enabled by data from the largest video hosting site, YouTube. We use YouTube URLs shared on Twitter as a sampling protocol to obtain a collection of videos, and we track their prevalence from 2015 to 2019. This method yields a longitudinal dataset of more than 5 billion tweets. Although this volume is substantial, we find that Twitter still subsamples the data: our dataset covers about 80% of all tweets with YouTube URLs. We present a comprehensive measurement study of the Twitter sampling effects across different timescales and different subjects. We find that the volume of missing tweets can be estimated from Twitter's rate limit messages, that true entity rankings can be inferred from sampled observations, and that sampling compromises the quality of network and diffusion models.
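The rate-limit estimation mentioned above can be sketched in a few lines. This is a hedged illustration, not the thesis's actual code: the message shape with a cumulative `limit.track` counter follows Twitter's public streaming API documentation, and the function names (`estimate_missing`, `sampling_rate`) are hypothetical.

```python
def estimate_missing(rate_limit_messages):
    """Estimate the number of tweets withheld over one stream connection.

    Each rate limit message looks like {"limit": {"track": N}}, where
    "track" is a cumulative count of withheld tweets since the connection
    opened, so the largest value seen estimates the total missing volume.
    """
    counters = [m["limit"]["track"] for m in rate_limit_messages]
    return max(counters) if counters else 0


def sampling_rate(n_collected, rate_limit_messages):
    """Estimated fraction of matching tweets that were actually delivered."""
    missing = estimate_missing(rate_limit_messages)
    return n_collected / (n_collected + missing)
```

For instance, 3,600 collected tweets alongside a final counter of 900 would imply a sampling rate of about 80%, in line with the ~80% coverage reported above.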
Next, we present the first large-scale measurement study of how users collectively engage with YouTube videos. We study how long, and what percentage of, each video is watched. We propose a duration-calibrated metric, called relative engagement, which is correlated with recognized notions of content quality, stable over time, and predictable even before a video is uploaded.
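A duration-calibrated, percentile-style score in the spirit of relative engagement could be sketched as below. This is an assumption-laden illustration: here we simply rank a video's average watch percentage among videos of similar duration (its "peers"); the thesis's exact calibration may differ, and the function name is hypothetical.

```python
import bisect


def relative_engagement(watch_pct, peer_watch_pcts):
    """Percentile rank of a video's average watch percentage among
    videos of comparable duration. Returns a value in [0, 1]:
    0 means every peer is watched more thoroughly, 1 means none is.
    """
    peers = sorted(peer_watch_pcts)
    return bisect.bisect_left(peers, watch_pct) / len(peers)
```

Calibrating against duration peers matters because short videos are naturally watched to a higher percentage than long ones, so raw watch percentage alone conflates length with quality.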
Lastly, we examine the network effects induced by the YouTube recommender system. We construct the recommendation network for 60,740 music videos from 4,435 professional artists, where an edge indicates that the target video is recommended on the webpage of the source video. We discover a popularity bias: recommendations disproportionately point towards more popular videos. We use the bow-tie structure to characterize the network and find that the largest strongly connected component contains 23.1% of the videos while capturing 82.6% of the attention. We also build models to estimate the latent influence between videos and artists. By taking the network structure into account, we can predict video popularity 9.7% better than competitive baselines.
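The core of the bow-tie analysis, finding the largest strongly connected component of the recommendation network, can be computed with a standard algorithm. Below is a hedged, stdlib-only sketch using Kosaraju's two-pass algorithm; the thesis does not specify its implementation, and `largest_scc` is a hypothetical helper.

```python
from collections import defaultdict


def largest_scc(edges):
    """Largest strongly connected component of a directed graph,
    given as (source, target) edge pairs. Kosaraju's algorithm:
    DFS finish order on the graph, then DFS on the reversed graph.
    """
    g, rg = defaultdict(list), defaultdict(list)
    nodes = set()
    for u, v in edges:
        g[u].append(v)
        rg[v].append(u)
        nodes.update((u, v))

    def dfs(start, adj, seen, out):
        # Iterative DFS appending nodes to `out` in finish order.
        seen.add(start)
        stack = [(start, iter(adj[start]))]
        while stack:
            node, it = stack[-1]
            for v in it:
                if v not in seen:
                    seen.add(v)
                    stack.append((v, iter(adj[v])))
                    break
            else:
                stack.pop()
                out.append(node)

    # Pass 1: record finish order on the original graph.
    order, seen = [], set()
    for u in nodes:
        if u not in seen:
            dfs(u, g, seen, order)

    # Pass 2: DFS on the reversed graph in reverse finish order;
    # each DFS tree found is one strongly connected component.
    seen, best = set(), []
    for u in reversed(order):
        if u not in seen:
            comp = []
            dfs(u, rg, seen, comp)
            if len(comp) > len(best):
                best = comp
    return set(best)
```

With the component in hand, the fraction of videos (and of view attention) inside it gives the bow-tie core statistics reported above.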
Altogether, we explore the collective consumption patterns of human attention towards online content. Methods and findings from this thesis can be used by content producers, hosting sites, and online users alike to improve content production, advertising strategies, and recommender systems. We expect our new metrics, methods, and observations to generalize to other multimedia platforms, such as the music streaming service Spotify.
NPRL2: A New Target In Breast Cancer Treatment
The birth of edge cities in China: measuring the spillover effects of industrial parks
Since its establishment as a high-tech science park in 1988, Zhongguancun has been transformed from a village into China’s “Silicon Valley”. Zhongguancun’s success has led many Chinese local governments to embrace ‘place-based’ investments and to support the building of industrial parks (special economic zones, SEZs). In fact, this is a growing global trend: a recent Economist article reported that there are more than 4,000 SEZs (industrial parks) around the world, ranging from basic export processing zones and science parks to more high-tech economic zones.