Wedge Sampling for Computing Clustering Coefficients and Triangle Counts on Large Graphs
Graphs are used to model interactions in a variety of contexts, and there is
a growing need to quickly assess the structure of such graphs. Some of the most
useful graph metrics are based on triangles, such as those measuring social
cohesion. Algorithms to compute them can be extremely expensive, even for
moderately-sized graphs with only millions of edges. Previous work has
considered node and edge sampling; in contrast, we consider wedge sampling,
which provides faster and more accurate approximations than competing
techniques. Additionally, wedge sampling enables estimation of local clustering
coefficients, degree-wise clustering coefficients, uniform triangle sampling,
and directed triangle counts. Our methods come with provable and practical
probabilistic error estimates for all computations. We provide extensive
results that show our methods are both more accurate and faster than
state-of-the-art alternatives.
Comment: Full version of SDM 2013 paper "Triadic Measures on Graphs: The Power
of Wedge Sampling" (arxiv:1202.5230)
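As a rough illustration of the wedge-sampling idea described in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes an undirected simple graph stored as a dict mapping each node to a set of neighbors; the function name `wedge_sample_transitivity` and its parameters are hypothetical. It draws wedge centers with probability proportional to their wedge count, picks two distinct neighbors uniformly, and uses the fraction of closed wedges to estimate the global clustering coefficient and the triangle count.

```python
import random


def wedge_sample_transitivity(adj, k, seed=0):
    """Estimate transitivity and triangle count from k uniformly sampled wedges.

    adj: dict mapping node -> set of neighbor nodes (undirected, no self-loops).
    Returns (estimated transitivity, estimated triangle count).
    """
    rng = random.Random(seed)
    # Only nodes of degree >= 2 can be the center of a wedge.
    centers = [v for v in adj if len(adj[v]) >= 2]
    if not centers:
        return 0.0, 0.0
    neighbors = {v: sorted(adj[v]) for v in centers}
    # A node of degree d is the center of d * (d - 1) / 2 wedges.
    weights = [len(adj[v]) * (len(adj[v]) - 1) // 2 for v in centers]
    total_wedges = sum(weights)

    closed = 0
    # Choosing the center proportionally to its wedge count, then a uniform
    # pair of its neighbors, yields a uniformly random wedge.
    for v in rng.choices(centers, weights=weights, k=k):
        a, b = rng.sample(neighbors[v], 2)
        if b in adj[a]:
            closed += 1

    kappa = closed / k
    # Each triangle contains exactly three closed wedges.
    triangles = kappa * total_wedges / 3
    return kappa, triangles


if __name__ == "__main__":
    # Complete graph on four nodes: every wedge is closed.
    k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
    print(wedge_sample_transitivity(k4, k=1000))
```

The sample size `k` controls the accuracy of the estimate; the abstract states that the authors give probabilistic error bounds, which this sketch does not reproduce.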
Discrete Temporal Models of Social Networks
We propose a family of statistical models for social network evolution over
time, which represents an extension of Exponential Random Graph Models (ERGMs).
Many of the methods for ERGMs are readily adapted for these models, including
maximum likelihood estimation algorithms. We discuss models of this type and
their properties, and give examples, as well as a demonstration of their use
for hypothesis testing and classification. We believe our temporal ERG models
represent a useful new framework for modeling time-evolving social networks,
and rewiring networks from other domains such as gene regulation circuitry, and
communication networks.
A kernel-based framework for learning graded relations from data
Driven by a large number of potential applications in areas like
bioinformatics, information retrieval and social network analysis, the problem
setting of inferring relations between pairs of data objects has recently been
investigated quite intensively in the machine learning community. To this end,
current approaches typically consider datasets containing crisp relations, so
that standard classification methods can be adopted. However, relations between
objects like similarities and preferences are often expressed in a graded
manner in real-world applications. A general kernel-based framework for
learning relations from data is introduced here. It extends existing approaches
because both crisp and graded relations are considered, and it unifies existing
approaches because different types of graded relations can be modeled,
including symmetric and reciprocal relations. This framework establishes
important links between recent developments in fuzzy set theory and machine
learning. Its usefulness is demonstrated through various experiments on
synthetic and real-world data.
Comment: This work has been submitted to the IEEE for possible publication.
Copyright may be transferred without notice, after which this version may no
longer be accessible.
Parameter estimators of random intersection graphs with thinned communities
This paper studies a statistical network model generated by a large number of
randomly sized overlapping communities, where any pair of nodes sharing a
community is linked with a given probability via the community. In the special
case where this probability equals one, the model reduces to a random
intersection graph, which is known to generate high levels of transitivity even
in the sparse regime. The link probability
adds a degree of freedom and leads to a parsimonious and analytically
tractable network model with tunable density, transitivity, and degree
fluctuations. We prove that the parameters of this model can be consistently
estimated in the large and sparse limiting regime using moment estimators based
on partially observed densities of links, 2-stars, and triangles.
Comment: 15 pages
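The estimators mentioned in this abstract are built from observed counts of links, 2-stars, and triangles. The sketch below shows only how those three counts can be computed exactly from a graph; it does not reproduce the paper's estimator formulas. The function name `subgraph_counts` and the dict-of-sets graph representation are assumptions made for illustration.

```python
def subgraph_counts(adj):
    """Count links, 2-stars, and triangles in an undirected simple graph.

    adj: dict mapping node -> set of neighbor nodes; nodes must be orderable.
    Returns (links, two_stars, triangles).
    """
    # Each link appears in the adjacency sets of both endpoints.
    links = sum(len(nbrs) for nbrs in adj.values()) // 2
    # A node of degree d is the center of d * (d - 1) / 2 two-stars.
    two_stars = sum(len(n) * (len(n) - 1) // 2 for n in adj.values())
    # For each link, the common neighbors of its endpoints close triangles;
    # summing over links counts every triangle three times.
    closed = 0
    for u in adj:
        for v in adj[u]:
            if u < v:
                closed += len(adj[u] & adj[v])
    return links, two_stars, closed // 3


if __name__ == "__main__":
    k4 = {i: {j for j in range(4) if j != i} for i in range(4)}
    print(subgraph_counts(k4))
```

Dividing three times the triangle count by the 2-star count gives the empirical transitivity, one of the quantities the abstract describes as tunable in the model.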