Cascading Behavior in Large Blog Graphs
How do blogs cite and influence each other? How do such links evolve? Does
the popularity of old blog posts drop exponentially with time? These are some
of the questions that we address in this work. Our goal is to build a model
that generates realistic cascades, so that it can help us with link prediction
and outlier detection.
Blogs (weblogs) have become an important medium of information because of
their timely publication, ease of use, and wide availability. In fact, they
often make headlines, by discussing and discovering evidence about political
events and facts. Often blogs link to one another, creating a publicly
available record of how information and influence spread through an underlying
social network. Aggregating links from several blog posts creates a directed
graph which we analyze to discover the patterns of information propagation in
blogspace, and thereby understand the underlying social network. Not only are
blogs interesting on their own merit, but our analysis also sheds light on how
rumors, viruses, and ideas propagate over social and computer networks.
Here we report some surprising findings about blog linking and information
propagation structure, based on an analysis of one of the largest available
datasets, with 45,000 blogs and about 2.2 million blog postings. We also present
a simple model that mimics the spread of information on the blogosphere and
produces information cascades very similar to those found in real life.
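As a rough illustration of the aggregation step described in the abstract above (links between individual posts are collapsed into a directed graph over blogs), the following minimal sketch may help. The function name `aggregate_blog_graph`, the toy post and blog identifiers, and the choice to drop links between posts of the same blog are all assumptions made for illustration; they are not taken from the paper.

```python
from collections import Counter


def aggregate_blog_graph(post_to_blog, post_links):
    """Collapse post-to-post links into weighted blog-to-blog edges.

    post_to_blog: dict mapping a post id to the blog that published it.
    post_links:   iterable of (citing_post, cited_post) pairs.
    Returns a Counter keyed by (citing_blog, cited_blog).
    """
    edges = Counter()
    for src_post, dst_post in post_links:
        src_blog = post_to_blog[src_post]
        dst_blog = post_to_blog[dst_post]
        # Assumption: links between posts of the same blog are ignored.
        if src_blog != dst_blog:
            edges[(src_blog, dst_blog)] += 1
    return edges


# Hypothetical toy data: four posts published by three blogs.
post_to_blog = {"a1": "A", "a2": "A", "b1": "B", "c1": "C"}
post_links = [
    ("b1", "a1"),
    ("c1", "a1"),
    ("c1", "b1"),
    ("a2", "a1"),  # same-blog link, dropped under the assumption above
    ("b1", "a2"),
]

blog_graph = aggregate_blog_graph(post_to_blog, post_links)
for (src, dst), weight in sorted(blog_graph.items()):
    print(f"{src} -> {dst} (weight {weight})")
```

The edge weights count how many post-level links were merged into each blog-level edge, which is one simple way to keep some of the information that aggregation would otherwise discard.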
Fried Chicken Bucket Processes
Chinese restaurant processes are useful hierarchical models; however, they make certain assumptions on finiteness that may not be appropriate for modeling some phenomena. Therefore, we introduce fried chicken bucket processes (FCBP) that involve different sampling methods. We also introduce spork notation as a simple way of representing this model.
Structural Analysis of Large Networks: Observations and Applications
Network data (also referred to as relational data, social network data, real graph
data) has become ubiquitous, and understanding patterns in this data has become an
important research problem. We investigate how interactions in social networks are
formed and how these interactions facilitate diffusion, model these behaviors, and
apply these findings to real-world problems.
We examined graphs of size up to 16 million nodes, across many domains, from
academic citation networks to campaign contributions and actor-movie networks.
We also performed several case studies in online social networks such as blogs and
message board communities.
Our major contributions are the following: (a) We discover several surprising
patterns in network topology and interactions, such as Popularity Decay power law
(in-links to a blog post decay with a power law with -1.5 exponent) and the oscillating
size of connected components; (b) We propose generators such as the Butterfly
generator that reproduce both established and new properties found in real networks;
(c) We present several case studies, including a proposed method of detecting misstatements in
accounting data, where using network effects gave a significant boost in detection
accuracy.
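To make the Popularity Decay power law mentioned above more concrete, here is a minimal sketch of how such an exponent could be estimated: if in-links fall off as a power of the number of days since posting, then the logarithm of the in-link count is linear in the logarithm of the day, and the slope of that line is the exponent. The data below are synthetic, generated with an exponent of exactly -1.5, and the function name `fit_power_law_exponent` is hypothetical. A least-squares fit on log-log axes is used only for simplicity; it is not necessarily the estimation method the authors used.

```python
import math


def fit_power_law_exponent(days, inlinks):
    """Return the least-squares slope of log(inlinks) against log(days)."""
    xs = [math.log(d) for d in days]
    ys = [math.log(c) for c in inlinks]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic, noise-free data (an assumption for illustration only):
# in-links on day d are proportional to d ** -1.5.
days = list(range(1, 31))
inlinks = [1000.0 * d ** -1.5 for d in days]

exponent = fit_power_law_exponent(days, inlinks)
print(f"estimated exponent: {exponent:.3f}")
```

On real in-link counts the fit would be noisy, and days with zero in-links would need to be handled before taking logarithms.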