70,188 research outputs found

    On Sampling Edges Almost Uniformly

    We consider the problem of sampling an edge almost uniformly from an unknown graph, $G = (V, E)$. Access to the graph is provided via queries of the following types: (1) uniform vertex queries, (2) degree queries, and (3) neighbor queries. We describe a new simple algorithm that returns a random edge $e \in E$ using $\tilde{O}(n/\sqrt{\epsilon m})$ queries in expectation, such that each edge $e$ is sampled with probability $(1 \pm \epsilon)/m$. Here, $n = |V|$ is the number of vertices, and $m = |E|$ is the number of edges. Our algorithm is optimal in the sense that any algorithm that samples an edge from an almost-uniform distribution must perform $\Omega(n/\sqrt{m})$ queries.
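    To make the query model concrete, here is a minimal sketch of a much simpler baseline than the algorithm the abstract describes: plain rejection sampling. It assumes a known upper bound `max_degree` on vertex degrees (the abstract's algorithm needs no such bound) and a graph with at least one edge. The `GraphOracle` class and all function names are hypothetical and only illustrate the three query types.

```python
import random


class GraphOracle:
    """Hypothetical query interface: uniform vertex, degree, and neighbor queries."""

    def __init__(self, adj):
        self.adj = adj              # dict: vertex -> list of neighbors
        self.vertices = list(adj)
        self.queries = 0            # total number of queries issued so far

    def random_vertex(self, rng):
        self.queries += 1
        return rng.choice(self.vertices)

    def degree(self, v):
        self.queries += 1
        return len(self.adj[v])

    def neighbor(self, v, i):
        self.queries += 1
        return self.adj[v][i]


def sample_edge_by_rejection(oracle, max_degree, rng):
    """Return a directed edge (u, v), uniform over all 2m directed edges.

    Assumes max_degree >= every vertex degree and that the graph has an edge;
    otherwise the loop would not be uniform or would never terminate.
    """
    while True:
        u = oracle.random_vertex(rng)
        d = oracle.degree(u)
        # Accept u with probability d / max_degree, so each directed edge
        # is returned with probability 1 / (n * max_degree) per iteration.
        if d > 0 and rng.random() < d / max_degree:
            return u, oracle.neighbor(u, rng.randrange(d))


if __name__ == "__main__":
    adj = {0: [1], 1: [0, 2], 2: [1], 3: []}
    oracle = GraphOracle(adj)
    print(sample_edge_by_rejection(oracle, 2, random.Random(0)), oracle.queries)
```

    This baseline is exactly uniform but needs about $n \cdot \texttt{max\_degree} / (2m)$ iterations in expectation, which can be far more than the bound quoted in the abstract when a few vertices have very high degree.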

    Sampling and Counting Crossing-Free Matchings

    Sampling of combinatorial structures is an important statistical tool used in applications in a number of areas, ranging from statistical physics and data mining to the biological sciences. Of comparable importance is the computation of the corresponding partition function, which, in the case of the uniform distribution, is equivalent to the problem of counting all such structures. For self-reducible combinatorial structures, once we can produce an almost uniform sample from them, we can approximately count them. Using a Markov chain Monte Carlo method, this thesis presents polynomial-time algorithms to approximately count and almost uniformly sample crossing-free matchings for certain input classes of graphs. Since the problem in its generality appears to be difficult, we made natural restrictions on the input graphs. Namely, we consider vertices arranged in a grid in the plane, where edges are line segments connecting the vertices and a matching is crossing-free if no two matching edges intersect. For appropriate bounds on the dimensions of the grid and the edge lengths, we show that a natural Markov chain is rapidly mixing and that the problem is self-reducible.
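    The abstract does not spell out its Markov chain, so the following is only a generic illustration: a lazy add/remove chain over crossing-free matchings on a small grid. The candidate edge set (unit edges plus cell diagonals), the function names, and the laziness probability of 1/2 are all assumptions made for this sketch. The chain is symmetric, so its stationary distribution is uniform over crossing-free matchings; the sketch says nothing about how fast it mixes.

```python
import random


def orient(a, b, c):
    """Sign of the cross product (b - a) x (c - a)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def on_segment(a, b, c):
    """For c collinear with a-b: is c inside the bounding box of a-b?"""
    return (min(a[0], b[0]) <= c[0] <= max(a[0], b[0])
            and min(a[1], b[1]) <= c[1] <= max(a[1], b[1]))


def segments_intersect(p, q, r, s):
    """True if closed segments p-q and r-s share any point (endpoints included)."""
    d1, d2 = orient(p, q, r), orient(p, q, s)
    d3, d4 = orient(r, s, p), orient(r, s, q)
    if d1 * d2 < 0 and d3 * d4 < 0:
        return True
    return ((d1 == 0 and on_segment(p, q, r)) or (d2 == 0 and on_segment(p, q, s))
            or (d3 == 0 and on_segment(r, s, p)) or (d4 == 0 and on_segment(r, s, q)))


def grid_candidate_edges(k):
    """Assumed edge set: unit edges and both diagonals of each cell of a k x k grid."""
    edges = []
    for x in range(k):
        for y in range(k):
            if x + 1 < k:
                edges.append(((x, y), (x + 1, y)))
            if y + 1 < k:
                edges.append(((x, y), (x, y + 1)))
            if x + 1 < k and y + 1 < k:
                edges.append(((x, y), (x + 1, y + 1)))
                edges.append(((x + 1, y), (x, y + 1)))
    return edges


def mcmc_step(matching, candidate_edges, rng):
    """One lazy add/remove move; mutates the set `matching` in place."""
    if rng.random() < 0.5:          # lazy step keeps the chain aperiodic
        return
    e = rng.choice(candidate_edges)
    if e in matching:
        matching.remove(e)
        return
    # A shared endpoint also counts as an intersection, so this single test
    # enforces both the matching property and the crossing-free property.
    if any(segments_intersect(e[0], e[1], f[0], f[1]) for f in matching):
        return
    matching.add(e)


if __name__ == "__main__":
    rng = random.Random(0)
    edges = grid_candidate_edges(3)
    m = set()
    for _ in range(1000):
        mcmc_step(m, edges, rng)
    print(sorted(m))
```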

    Estimating and Sampling Graphs with Multidimensional Random Walks

    Estimating characteristics of large graphs via sampling is a vital part of the study of complex networks. Current sampling methods such as (independent) random vertex and random walks are useful but have drawbacks. Random vertex sampling may require too many resources (time, bandwidth, or money). Random walks, which normally require fewer resources per sample, can suffer from large estimation errors in the presence of disconnected or loosely connected graphs. In this work we propose a new $m$-dimensional random walk that uses $m$ dependent random walkers. We show that the proposed sampling method, which we call Frontier sampling, exhibits all of the nice sampling properties of a regular random walk. At the same time, our simulations over large real-world graphs show that, in the presence of disconnected or loosely connected components, Frontier sampling exhibits lower estimation errors than regular random walks. We also show that Frontier sampling is more suitable than random vertex sampling to sample the tail of the degree distribution of the graph.
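    A minimal sketch of how $m$ dependent walkers might be coordinated, based on a common reading of Frontier sampling: at each step one walker is chosen with probability proportional to the degree of its current vertex, and that walker moves to a uniformly random neighbor. The function name, the adjacency-dict input, and the choice to start walkers only on non-isolated vertices are assumptions of this sketch, not details taken from the abstract.

```python
import random


def frontier_sampling(adj, num_walkers, num_steps, rng):
    """Return a list of sampled edges from num_walkers dependent walkers.

    adj: dict mapping each vertex to a list of neighbors (undirected graph).
    """
    # Assumption: start only on vertices with at least one neighbor,
    # so every walker always has a positive selection weight.
    starts = [v for v in adj if adj[v]]
    frontier = [rng.choice(starts) for _ in range(num_walkers)]
    sampled = []
    for _ in range(num_steps):
        # Pick a walker with probability proportional to its vertex degree.
        weights = [len(adj[v]) for v in frontier]
        idx = rng.choices(range(num_walkers), weights=weights, k=1)[0]
        u = frontier[idx]
        v = rng.choice(adj[u])      # uniform neighbor of the chosen walker
        frontier[idx] = v           # only the chosen walker moves
        sampled.append((u, v))
    return sampled


if __name__ == "__main__":
    # Two disconnected triangles: a single walk would stay in one component.
    adj = {0: [1, 2], 1: [0, 2], 2: [0, 1],
           3: [4, 5], 4: [3, 5], 5: [3, 4]}
    print(frontier_sampling(adj, num_walkers=4, num_steps=10, rng=random.Random(0)))
```

    With several walkers started at independent random vertices, different components of a disconnected graph can each receive walkers, which is the situation the abstract highlights as hard for a single random walk.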

    Densest Subgraph in Dynamic Graph Streams

    In this paper, we consider the problem of approximating the densest subgraph in the dynamic graph stream model. In this model of computation, the input graph is defined by an arbitrary sequence of edge insertions and deletions and the goal is to analyze properties of the resulting graph given memory that is sub-linear in the size of the stream. We present a single-pass algorithm that returns a $(1+\epsilon)$ approximation of the maximum density with high probability; the algorithm uses $O(\epsilon^{-2} n \,\mathrm{polylog}\, n)$ space, processes each stream update in $\mathrm{polylog}(n)$ time, and uses $\mathrm{poly}(n)$ post-processing time, where $n$ is the number of nodes. The space used by our algorithm matches the lower bound of Bahmani et al. (PVLDB 2012) up to a poly-logarithmic factor for constant $\epsilon$. The best existing results for this problem were established recently by Bhattacharya et al. (STOC 2015). They presented a $(2+\epsilon)$ approximation algorithm using similar space and another algorithm that both processed each update and maintained a $(4+\epsilon)$ approximation of the current maximum density in $\mathrm{polylog}(n)$ time per update. Comment: To appear in MFCS 201
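    For readers unfamiliar with the objective, the sketch below assumes the standard definition of density, $|E(S)|/|S|$ for a vertex set $S$, and implements the classic offline greedy peeling heuristic, which is known to give a 2-approximation. It is not the streaming algorithm of the abstract: it stores the whole graph and handles no deletions. The function name and the edge-list input are assumptions of this sketch.

```python
def densest_subgraph_peeling(edges):
    """Greedy peeling: repeatedly delete a minimum-degree vertex and
    remember the densest intermediate vertex set.

    Returns (vertex_set, density) with density = |E(S)| / |S|.
    """
    adj = {}
    for u, v in edges:
        if u == v:
            continue                      # ignore self-loops
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    num_edges = sum(len(nb) for nb in adj.values()) // 2
    remaining = set(adj)
    best_set, best_density = set(), 0.0

    while remaining:
        density = num_edges / len(remaining)
        if density > best_density:
            best_set, best_density = set(remaining), density
        # Simple O(n) scan per step; a bucket queue would make this faster.
        u = min(remaining, key=lambda x: len(adj[x]))
        for w in adj[u]:
            adj[w].discard(u)
        num_edges -= len(adj[u])
        adj[u] = set()
        remaining.remove(u)

    return best_set, best_density


if __name__ == "__main__":
    # A 4-clique on {0, 1, 2, 3} with a pendant path 3-4-5 attached.
    edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3), (3, 4), (4, 5)]
    print(densest_subgraph_peeling(edges))
```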