4,341 research outputs found
Catching the head, tail, and everything in between: a streaming algorithm for the degree distribution
The degree distribution is one of the most fundamental graph properties of
interest for real-world graphs. It has been widely observed in numerous domains
that graphs typically have a tailed or scale-free degree distribution. While
the average degree is usually quite small, the variance is quite high and there
are vertices with degrees at all scales. We focus on the problem of
approximating the degree distribution of a large streaming graph, with small
storage. We design an algorithm headtail, whose main novelty is a new estimator
of infrequent degrees using truncated geometric random variables. We give a
mathematical analysis of headtail and show that it has excellent behavior in
practice. We can process streams will millions of edges with storage less than
1% and get extremely accurate approximations for all scales in the degree
distribution.
We also introduce a new notion of Relative Hausdorff distance between tailed
histograms. Existing notions of distances between distributions are not
suitable, since they ignore infrequent degrees in the tail. The Relative
Hausdorff distance measures deviations at all scales, and is a more suitable
distance for comparing degree distributions. By tracking this new measure, we
are able to give strong empirical evidence of the convergence of headtail
Relative entropy minimizing noisy non-linear neural network to approximate stochastic processes
A method is provided for designing and training noise-driven recurrent neural
networks as models of stochastic processes. The method unifies and generalizes
two known separate modeling approaches, Echo State Networks (ESN) and Linear
Inverse Modeling (LIM), under the common principle of relative entropy
minimization. The power of the new method is demonstrated on a stochastic
approximation of the El Nino phenomenon studied in climate research
- …