Evolution of the Media Web
We present a detailed study of the part of the Web related to media content,
i.e., the Media Web. Using publicly available data, we analyze the evolution of
incoming and outgoing links from and to media pages. Based on our observations,
we propose a new class of models for the appearance of new media content on the
Web where different attractiveness functions of nodes are possible
including ones taken from well-known preferential attachment and fitness
models. We analyze these models theoretically and empirically and show which
ones realistically predict both the incoming degree distribution and the
so-called recency property of the Media Web, something that existing
models did not do well. Finally, we compare these models by estimating the
likelihood of the real-world link graph from our data set given each model and
find that the models we introduce are significantly more likely than previously
proposed ones. One of the most surprising results is that in the Media Web the
probability for a post to be cited is determined, most likely, by its quality
rather than by its current popularity.
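As a rough illustration of the idea in this abstract, the sketch below grows a link graph in which each new page cites older pages with probability proportional to an attractiveness score. The abstract does not give the functional form, so everything here is an assumption made for illustration: the exponential age decay used to mimic the recency property, the parameter `tau`, the uniform quality distribution, and the names `attractiveness` and `grow`. It is not the authors' model.

```python
import math
import random


def attractiveness(quality, in_degree, age, use_degree=False, tau=10.0):
    """Hypothetical attractiveness of a page (assumed form, not from the paper).

    quality    -- fixed intrinsic score of the page (fitness-style term)
    in_degree  -- current number of incoming links (preferential-attachment term)
    age        -- number of steps since the page appeared
    use_degree -- if True, multiply by (in_degree + 1); if False, quality only
    tau        -- assumed decay scale: larger values weaken the recency effect
    """
    base = quality * ((in_degree + 1) if use_degree else 1.0)
    return base * math.exp(-age / tau)


def grow(num_pages, links_per_page=3, use_degree=False, tau=10.0, seed=0):
    """Add pages one at a time; each new page cites earlier pages at random,
    weighted by their attractiveness at the moment the new page appears."""
    rng = random.Random(seed)
    quality, in_degree, edges = [], [], []
    for t in range(num_pages):
        if t > 0:
            weights = [
                attractiveness(quality[j], in_degree[j], t - j, use_degree, tau)
                for j in range(t)
            ]
            # Sampling is with replacement, so repeated links to the same
            # target are possible (a multigraph).
            for j in rng.choices(range(t), weights=weights, k=links_per_page):
                in_degree[j] += 1
                edges.append((t, j))  # (citing page, cited page)
        quality.append(rng.random() + 1e-9)  # strictly positive quality
        in_degree.append(0)
    return quality, in_degree, edges


if __name__ == "__main__":
    q, deg, e = grow(200, use_degree=False)
    print("links:", len(e), "max in-degree:", max(deg))
```

Running `grow` with `use_degree=False` and again with `use_degree=True` gives two synthetic graphs, one driven by quality alone and one by quality and popularity together. Their in-degree distributions and link ages can then be compared.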
A survey of statistical network models
Networks are ubiquitous in science and have become a focal point for
discussion in everyday life. Formal statistical models for the analysis of
network data have emerged as a major topic of interest in diverse areas of
study, and most of these involve a form of graphical representation.
Probability models on graphs date back to 1959. Along with empirical studies in
social psychology and sociology from the 1960s, these early works generated an
active network community and a substantial literature in the 1970s. This effort
moved into the statistical literature in the late 1970s and 1980s, and the past
decade has seen a burgeoning network literature in statistical physics and
computer science. The growth of the World Wide Web and the emergence of online
networking communities such as Facebook, MySpace, and LinkedIn, and a host of
more specialized professional network communities has intensified interest in
the study of networks and network data. Our goal in this review is to provide
the reader with an entry point to this burgeoning literature. We begin with an
overview of the historical development of statistical network modeling and then
we introduce a number of examples that have been studied in the network
literature. Our subsequent discussion focuses on a number of prominent static
and dynamic network models and their interconnections. We emphasize formal
model descriptions, and pay special attention to the interpretation of
parameters and their estimation. We end with a description of some open
problems and challenges for machine learning and statistics.
Comment: 96 pages, 14 figures, 333 references
Characterizing and modeling the dynamics of online popularity
Online popularity has enormous impact on opinions, culture, policy, and
profits. We provide a quantitative, large scale, temporal analysis of the
dynamics of online content popularity in two massive model systems, the
Wikipedia and an entire country's Web space. We find that the dynamics of
popularity are characterized by bursts, displaying characteristic features of
critical systems such as fat-tailed distributions of magnitude and inter-event
time. We propose a minimal model combining the classic preferential popularity
increase mechanism with the occurrence of random popularity shifts due to
exogenous factors. The model recovers the critical features observed in the
empirical analysis of the systems analyzed here, highlighting the key factors
needed in the description of popularity dynamics.
Comment: 5 pages, 4 figures. Modeling part detailed. Final version published
in Physical Review Letters
DancingLines: An Analytical Scheme to Depict Cross-Platform Event Popularity
Nowadays, events usually burst and are propagated online through multiple
modern media like social networks and search engines. Various studies
discuss event dissemination trends on an individual medium, while
few studies focus on event popularity analysis from a cross-platform
perspective. Challenges come from the vast diversity of events and media,
limited access to aligned datasets across different media and a great deal of
noise in the datasets. In this paper, we design DancingLines, an innovative
scheme that captures and quantitatively analyzes event popularity between
pairwise text media. It contains two models: TF-SW, a semantic-aware popularity
quantification model, based on an integrated weight coefficient leveraging
Word2Vec and TextRank; and wDTW-CD, a pairwise event popularity time series
alignment model matching different event phases adapted from Dynamic Time
Warping. We also propose three metrics to interpret event popularity trends
between pairwise social platforms. Experimental results on eighteen real-world
event datasets from an influential social network and a popular search engine
validate the effectiveness and applicability of our scheme. DancingLines is
demonstrated to possess broad application potential for discovering
knowledge of various aspects related to events and different media.
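The abstract says wDTW-CD is adapted from Dynamic Time Warping but does not describe the adaptation. For orientation, here is a minimal sketch of classic DTW only, not of wDTW-CD. The function name `dtw_distance`, the absolute-difference local cost, and the sample series are illustrative assumptions.

```python
def dtw_distance(a, b):
    """Classic Dynamic Time Warping distance between two numeric sequences.

    Uses |x - y| as the local cost and the standard three-way recurrence
    (match, insertion, deletion). This is the textbook algorithm, not the
    wDTW-CD variant named in the abstract.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # a[i-1] aligned to an already-used b[j-1]
                cost[i][j - 1],      # b[j-1] aligned to an already-used a[i-1]
                cost[i - 1][j - 1],  # both sequences advance together
            )
    return cost[n][m]


if __name__ == "__main__":
    # Made-up popularity curves: the second one is delayed by one step.
    social = [0, 1, 5, 3, 1]
    search = [0, 0, 1, 5, 3, 1]
    print(dtw_distance(social, search))  # 0.0: the delay is absorbed by warping
```

Because warping lets one point align with several consecutive points of the other series, two curves with the same shape but shifted timing get a small distance. That property is what makes DTW a natural starting point for matching event phases across platforms.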