On Sampling Edges Almost Uniformly
We consider the problem of sampling an edge almost uniformly from an unknown graph, $G = (V, E)$. Access to the graph is provided via queries of the following types: (1) uniform vertex queries, (2) degree queries, and (3) neighbor queries. We describe a new simple algorithm that returns a random edge $e \in E$ using $\tilde{O}(n/\sqrt{\epsilon m})$ queries in expectation, such that each edge $e$ is sampled with probability $(1 \pm \epsilon)/m$. Here, $n = |V|$ is the number of vertices, and $m = |E|$ is the number of edges. Our algorithm is optimal in the sense that any algorithm that samples an edge from an almost-uniform distribution must perform $\Omega(n/\sqrt{m})$ queries.
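To make the query model concrete, the following is a minimal sketch of a simpler baseline, not the algorithm from the abstract: plain rejection sampling using the same three query types. It assumes an upper bound `d_max` on the maximum degree is known, and the `GraphOracle` class and `sample_edge` function are hypothetical names introduced only for illustration. Each attempt returns a given directed edge with probability $1/(n \cdot d_{max})$, so the output is exactly uniform over directed edges, at the cost of roughly $n \cdot d_{max}/(2m)$ attempts in expectation.

```python
import random


class GraphOracle:
    """Hypothetical query interface: uniform vertex, degree, and neighbor queries."""

    def __init__(self, adj, seed=None):
        self._adj = adj                  # dict: vertex -> list of neighbors
        self._vertices = list(adj)
        self._rng = random.Random(seed)
        self.queries = 0                 # total number of queries issued

    def random_vertex(self):
        self.queries += 1
        return self._rng.choice(self._vertices)

    def degree(self, v):
        self.queries += 1
        return len(self._adj[v])

    def neighbor(self, v, i):
        self.queries += 1
        return self._adj[v][i]


def sample_edge(oracle, d_max, rng):
    """Return a directed edge (u, v) uniformly at random.

    Assumes d_max is at least the maximum degree (an assumption of this
    sketch; the abstract's algorithm does not need such a bound).
    """
    while True:
        u = oracle.random_vertex()
        d = oracle.degree(u)
        # Keep u with probability d / d_max, so vertices survive in
        # proportion to their degree; isolated vertices are never kept.
        if rng.random() * d_max < d:
            # A uniform neighbor of a degree-proportional vertex gives a
            # uniform directed edge.
            return u, oracle.neighbor(u, rng.randrange(d))
```

The `queries` counter on the oracle can be read after sampling to compare the cost of this baseline against the bounds quoted above on graphs where `d_max` is much larger than the average degree.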
Sampling and Counting Crossing-Free Matchings
Sampling of combinatorial structures is an important statistical tool used in applications in a number of areas, ranging from statistical physics and data mining to the biological sciences. Of comparable importance is the computation of the corresponding partition function, which, in the case of the uniform distribution, is equivalent to the problem of counting all such structures. For self-reducible combinatorial structures, once we can produce an almost uniform sample from them, we can approximately count them.
Using a Markov chain Monte Carlo method, this thesis presents polynomial-time algorithms to approximately count and almost uniformly sample crossing-free matchings for certain input classes of graphs. Since the problem in its generality appears to be difficult, we made natural restrictions on the input graphs. Namely, we consider vertices arranged in a grid in the plane, where edges are line segments connecting the vertices and a matching is crossing-free if no two matching edges intersect. For appropriate bounds on the dimensions of the grid and the edge lengths, we show that a natural Markov chain is rapidly mixing and that the problem is self-reducible.
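As a rough illustration of the kind of chain involved, here is a minimal sketch of an add/remove Markov chain on crossing-free matchings of a grid. The abstract does not specify the thesis's chain, so the move set, the function names (`candidate_edges`, `step`, `run_chain`), and the Euclidean length bound `max_len` are all assumptions of this sketch. Because the proposal is symmetric (pick a candidate edge uniformly and toggle it if the result is valid), the uniform distribution over crossing-free matchings is stationary; nothing here says anything about mixing time.

```python
import random
from itertools import combinations


def orient(p, q, r):
    """Sign of the cross product (q - p) x (r - p): -1, 0, or 1."""
    val = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (val > 0) - (val < 0)


def in_box(p, q, r):
    """True if r lies in the bounding box of segment pq (used when collinear)."""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(a, b):
    """True if closed segments a and b share any point, endpoints included."""
    (p1, q1), (p2, q2) = a, b
    o1, o2 = orient(p1, q1, p2), orient(p1, q1, q2)
    o3, o4 = orient(p2, q2, p1), orient(p2, q2, q1)
    if o1 != o2 and o3 != o4:
        return True
    return ((o1 == 0 and in_box(p1, q1, p2)) or (o2 == 0 and in_box(p1, q1, q2))
            or (o3 == 0 and in_box(p2, q2, p1)) or (o4 == 0 and in_box(p2, q2, q1)))


def candidate_edges(width, height, max_len):
    """All segments between grid points with Euclidean length <= max_len."""
    points = [(x, y) for x in range(width) for y in range(height)]
    return [(p, q) for p, q in combinations(points, 2)
            if (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 <= max_len ** 2]


def step(matching, edges, rng):
    """One move: pick a candidate edge uniformly and toggle it if allowed."""
    e = rng.choice(edges)
    if e in matching:
        matching.remove(e)
    # Edges sharing an endpoint also intersect, so this single test enforces
    # both the matching condition and the crossing-free condition.
    elif not any(segments_intersect(e, f) for f in matching):
        matching.add(e)


def run_chain(width, height, max_len, steps, seed=None):
    """Run the chain from the empty matching and return the final matching."""
    rng = random.Random(seed)
    edges = candidate_edges(width, height, max_len)
    matching = set()
    for _ in range(steps):
        step(matching, edges, rng)
    return matching
```

For example, `run_chain(3, 3, 2, steps=2000, seed=0)` returns a set of segments on a 3 x 3 grid in which no two segments touch or cross.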
Estimating and Sampling Graphs with Multidimensional Random Walks
Estimating characteristics of large graphs via sampling is a vital part of
the study of complex networks. Current sampling methods such as (independent)
random vertex and random walks are useful but have drawbacks. Random vertex
sampling may require too many resources (time, bandwidth, or money). Random
walks, which normally require fewer resources per sample, can suffer from large
estimation errors in the presence of disconnected or loosely connected graphs.
In this work we propose a new $m$-dimensional random walk that uses $m$
dependent random walkers. We show that the proposed sampling method, which we
call Frontier sampling, exhibits all of the nice sampling properties of a
regular random walk. At the same time, our simulations over large real world
graphs show that, in the presence of disconnected or loosely connected
components, Frontier sampling exhibits lower estimation errors than regular
random walks. We also show that Frontier sampling is more suitable than random
vertex sampling to sample the tail of the degree distribution of the graph.
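The walk described above can be sketched roughly as follows, assuming an undirected graph stored as adjacency lists. The function name `frontier_sampling`, its parameters, and the choice to start walkers only on non-isolated vertices are assumptions of this sketch rather than details given in the abstract. At each step one walker is chosen with probability proportional to the degree of its current vertex, it moves to a uniformly random neighbor, and the traversed edge is recorded as a sample.

```python
import random


def frontier_sampling(adj, m, steps, seed=None):
    """Sketch of a walk with m dependent walkers; returns the sampled edges.

    adj: dict mapping each vertex to a list of its neighbors (undirected).
    """
    rng = random.Random(seed)
    # Assumption of this sketch: walkers start on uniformly random vertices
    # that have at least one neighbor, since a walker cannot leave an
    # isolated vertex.
    starts = [v for v in adj if adj[v]]
    walkers = [rng.choice(starts) for _ in range(m)]
    sampled = []
    for _ in range(steps):
        # Choose which walker moves, weighted by its current vertex degree.
        weights = [len(adj[u]) for u in walkers]
        k = rng.choices(range(m), weights=weights)[0]
        u = walkers[k]
        v = rng.choice(adj[u])      # uniform random neighbor
        sampled.append((u, v))      # the traversed edge is the sample
        walkers[k] = v              # only the chosen walker moves
    return sampled
```

Because walkers can start in different components, a run such as `frontier_sampling(adj, m=4, steps=500, seed=0)` can return edges from several components of a disconnected graph, which a single random walk cannot do.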
Densest Subgraph in Dynamic Graph Streams
In this paper, we consider the problem of approximating the densest subgraph
in the dynamic graph stream model. In this model of computation, the input
graph is defined by an arbitrary sequence of edge insertions and deletions and
the goal is to analyze properties of the resulting graph given memory that is
sub-linear in the size of the stream. We present a single-pass algorithm that
returns a $(1+\epsilon)$ approximation of the maximum density with high
probability; the algorithm uses $O(\epsilon^{-2} n \polylog n)$ space,
processes each stream update in $\polylog(n)$ time, and uses $\poly(n)$
post-processing time, where $n$ is the number of nodes. The space used by our
algorithm matches the lower bound of Bahmani et al.~(PVLDB 2012) up to a
poly-logarithmic factor for constant $\epsilon$. The best existing results for
this problem were established recently by Bhattacharya et al.~(STOC 2015). They
presented a $(2+\epsilon)$ approximation algorithm using similar space and
another algorithm that both processed each update and maintained a
$(4+\epsilon)$ approximation of the current maximum density in $\polylog(n)$
time per update.
Comment: To appear in MFCS 201
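For readers unfamiliar with the objective, the sketch below computes an approximately densest subgraph offline with Charikar's greedy peeling, a classical 2-approximation. This is not the streaming algorithm from the abstract. It assumes the usual definition of density, $|E(S)|/|S|$ for a vertex set $S$, and the function name `greedy_densest_subgraph` is hypothetical. The procedure repeatedly removes a minimum-degree vertex and keeps the densest intermediate vertex set.

```python
def greedy_densest_subgraph(edges):
    """Greedy peeling on an undirected edge list.

    Returns (density, vertex_set) for the densest vertex set seen while
    peeling, where density is (number of induced edges) / (number of vertices).
    """
    adj = {}
    for u, v in edges:
        if u != v:                      # ignore self-loops
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
    num_edges = sum(len(nb) for nb in adj.values()) // 2
    remaining = set(adj)
    best_density, best_set = 0.0, set()
    while remaining:
        density = num_edges / len(remaining)
        if density > best_density:
            best_density, best_set = density, set(remaining)
        # Peel a vertex of minimum current degree (quadratic time overall,
        # which keeps the sketch short; a bucket queue would be faster).
        u = min(remaining, key=lambda x: len(adj[x]))
        for w in adj[u]:
            adj[w].discard(u)
        num_edges -= len(adj[u])
        remaining.remove(u)
        del adj[u]
    return best_density, best_set
```

On a 4-clique with a two-edge path attached, the peeling removes the path vertices first and reports the clique, whose density is 6/4 = 1.5.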