364 research outputs found

    Linear Programming in the Semi-streaming Model with Application to the Maximum Matching Problem

    Get PDF
    In this paper, we study linear programming based approaches to the maximum matching problem in the semi-streaming model. The semi-streaming model has gained attention as a model for processing massive graphs as the importance of such graphs has increased. This is a model where edges are streamed-in in an adversarial order and we are allowed a space proportional to the number of vertices in a graph. In recent years, there has been several new results in this semi-streaming model. However broad techniques such as linear programming have not been adapted to this model. We present several techniques to adapt and optimize linear programming based approaches in the semi-streaming model with an application to the maximum matching problem. As a consequence, we improve (almost) all previous results on this problem, and also prove new results on interesting variants

    Improved approximation guarantees for weighted matching in the semi-streaming model

    Get PDF
    We study the maximum weight matching problem in the semi-streaming model, and improve on the currently best one-pass algorithm due to Zelke (Proc. of STACS2008, pages 669-680) by devising a deterministic approach whose performance guarantee is 4.91+epsilon. In addition, we study preemptive online algorithms, a sub-class of one-pass algorithms where we are only allowed to maintain a feasible matching in memory at any point in time. All known results prior to Zelke's belong to this sub-class. We provide a lower bound of 4.967 on the competitive ratio of any such deterministic algorithm, and hence show that future improvements will have to store in memory a set of edges which is not necessarily a feasible matching

    Bipartite Matching in the Semi-Streaming Model

    Get PDF
    We present the first deterministic 1+eps approximation algorithm for finding a large matching in a bipartite graph in the semi-streaming model which requires only O(1/eps5)O(1/eps^5) passes over the input stream. In this model, the input graph G=(V,E)G=(V,E) is given as a stream of its edges in some arbitrary order, and storage of the algorithm is bounded by O(npolylogn)O(n polylog n) bits, where n=Vn = |V|. The only previously known arbitrarily good approximation for general graphs is achieved by the randomized algorithm of McGregor (2005), which uses Ω((1/eps)(1/eps))\Omega((1/eps)^(1/eps)) passes. We show that even for bipartite graphs, McGregor's algorithm needs Ω(1/eps)Ω(1/eps)\Omega(1/eps)^\Omega(1/eps) passes, thus it is necessarily exponential in the approximation parameter. The design as well as the analysis of our algorithm require the introduction of some new techniques. A novelty of our algorithm is a new deterministic assignment of matching edges to augmenting paths which is responsible for the complexity reduction, and gets rid of randomization. We repeatedly grow an initial matching using augmenting paths up to a length of 2k+1fork=2/eps2k+1 for k=2/eps. We terminate when the number of augmenting paths found in one iteration falls below a certain threshold also depending on kk, that guarantees a 1+eps1+eps approximation. The main challenge is to find those augmenting paths without requiring an excessive number of passes. In each iteration, using multiple passes, we grow a set of alternating paths in parallel, considering multiple passes, we grow a set of alternating paths in parallel, considering each edge as a possible extension as it comes along in the stream. Backtracking is used on paths that fail to grow any further. Crucial are the so-called position limits: when a matching edge is the i-th matching edge in a path and it is then removed by backtracking, it will only be inserted into a path again at a position strictly lesser than i. This rule strikes a balance between terminating quickly on the one hand and giving the procedure enough freedom on the other hand

    Graph Sparsification in the Semi-streaming Model

    Get PDF
    Analyzing massive data sets has been one of the key motivations for studying streaming algorithms. In recent years, there has been significant progress in analysing distributions in a streaming setting, but the progress on graph problems has been limited. A main reason for this has been the existence of linear space lower bounds for even simple problems such as determining the connectedness of a graph. However, in many new scenarios that arise from social and other interaction networks, the number of vertices is significantly less than the number of edges. This has led to the formulation of the semi-streaming model where we assume that the space is (near) linear in the number of vertices (but not necessarily the edges), and the edges appear in an arbitrary (and possibly adversarial) order. In this paper we focus on graph sparsification, which is one of the major building blocks in a variety of graph algorithms. There has been a long history of (non-streaming) sampling algorithms that provide sparse graph approximations and it a natural question to ask if the sparsification can be achieved using a small space, and in addition using a single pass over the data? The question is interesting from the standpoint of both theory and practice and we answer the question in the affirmative, by providing a one pass O~(n/ϵ2)\tilde{O}(n/\epsilon^{2}) space algorithm that produces a sparsification that approximates each cut to a (1+ϵ)(1+\epsilon) factor. We also show that Ω(nlog1ϵ)\Omega(n \log \frac1\epsilon) space is necessary for a one pass streaming algorithm to approximate the min-cut, improving upon the Ω(n)\Omega(n) lower bound that arises from lower bounds for testing connectivity

    Analyzing Massive Graphs in the Semi-streaming Model

    Get PDF
    Massive graphs arise in a many scenarios, for example, traffic data analysis in large networks, large scale scientific experiments, and clustering of large data sets. The semi-streaming model was proposed for processing massive graphs. In the semi-streaming model, we have a random accessible memory which is near-linear in the number of vertices. The input graph (or equivalently, edges in the graph) is presented as a sequential list of edges (insertion-only model) or edge insertions and deletions (dynamic model). The list is read-only but we may make multiple passes over the list. There has been a few results in the insertion-only model such as computing distance spanners and approximating the maximum matching. In this thesis, we present some algorithms and techniques for (i) solving more complex problems in the semi-streaming model, (for example, problems in the dynamic model) and (ii) having better solutions for the problems which have been studied (for example, the maximum matching problem). In course of both of these, we develop new techniques with broad applications and explore the rich trade-offs between the complexity of models (insertion-only streams vs. dynamic streams), the number of passes, space, accuracy, and running time. 1. We initiate the study of dynamic graph streams. We start with basic problems such as the connectivity problem and computing the minimum spanning tree. These problems are trivial in the insertion-only model. However, they require non-trivial (and multiple passes for computing the exact minimum spanning tree) algorithms in the dynamic model. 2. Second, we present a graph sparsification algorithm in the semi-streaming model. A graph sparsification is a sparse graph that approximately preserves all the cut values of a graph. Such a graph acts as an oracle for solving cut-related problems, for example, the minimum cut problem and the multicut problem. Our algorithm produce a graph sparsification with high probability in one pass. 3. Third, we use the primal-dual algorithms to develop the semi-streaming algorithms. The primal-dual algorithms have been widely accepted as a framework for solving linear programs and semidefinite programs faster. In contrast, we apply the method for reducing space and number of passes in addition to reducing the running time. We also present some examples that arise in applications and show how to apply the techniques: the multicut problem, the correlation clustering problem, and the maximum matching problem. As a consequence, we also develop near-linear time algorithms for the bb-matching problems which were not known before

    Depth First Search in the Semi-streaming Model

    Get PDF
    Depth first search (DFS) tree is a fundamental data structure for solving various graph problems. The classical algorithm for building a DFS tree requires O(m+n) time for a given undirected graph G having n vertices and m edges. In the streaming model, an algorithm is allowed several passes (preferably single) over the input graph having a restriction on the size of local space used. Now, a DFS tree of a graph can be trivially computed using a single pass if O(m) space is allowed. In the semi-streaming model allowing O(n) space, it can be computed in O(n) passes over the input stream, where each pass adds one vertex to the DFS tree. However, it remains an open problem to compute a DFS tree using o(n) passes using o(m) space even in any relaxed streaming environment. We present the first semi-streaming algorithms that compute a DFS tree of an undirected graph in o(n) passes using o(m) space. We first describe an extremely simple algorithm that requires at most ceil[n/k] passes to compute a DFS tree using O(nk) space, where k is any positive integer. For example using k=sqrt{n}, we can compute a DFS tree in sqrt{n} passes using O(n sqrt{n}) space. We then improve this algorithm by using more involved techniques to reduce the number of passes to ceil[h/k] under similar space constraints, where h is the height of the computed DFS tree. In particular, this algorithm improves the bounds for the case where the computed DFS tree is shallow (having o(n) height). Moreover, this algorithm is presented in form of a framework that allows the flexibility of using any algorithm to maintain a DFS tree of a stored sparser subgraph as a black box, which may be of an independent interest. Both these algorithms essentially demonstrate the existence of a trade-off between the space and number of passes required for computing a DFS tree. Furthermore, we evaluate these algorithms experimentally which reveals their exceptional performance in practice. For both random and real graphs, they require merely a few passes even when allowed just O(n) space
    corecore