14 research outputs found

    Bipartite Matching in the Semi-Streaming Model

    We present the first deterministic $(1+\epsilon)$-approximation algorithm for finding a large matching in a bipartite graph in the semi-streaming model that requires only $O(1/\epsilon^5)$ passes over the input stream. In this model, the input graph $G=(V,E)$ is given as a stream of its edges in some arbitrary order, and the storage of the algorithm is bounded by $O(n \operatorname{polylog} n)$ bits, where $n = |V|$. The only previously known arbitrarily good approximation for general graphs is achieved by the randomized algorithm of McGregor (2005), which uses $\Omega((1/\epsilon)^{1/\epsilon})$ passes. We show that even for bipartite graphs, McGregor's algorithm needs $\Omega(1/\epsilon)^{\Omega(1/\epsilon)}$ passes, so it is necessarily exponential in the approximation parameter. The design as well as the analysis of our algorithm require the introduction of some new techniques. A novelty of our algorithm is a new deterministic assignment of matching edges to augmenting paths, which is responsible for the complexity reduction and removes the need for randomization. We repeatedly grow an initial matching using augmenting paths up to a length of $2k+1$ for $k = 2/\epsilon$. We terminate when the number of augmenting paths found in one iteration falls below a certain threshold, also depending on $k$, that guarantees a $(1+\epsilon)$-approximation. The main challenge is to find those augmenting paths without requiring an excessive number of passes. In each iteration, using multiple passes, we grow a set of alternating paths in parallel, considering each edge as a possible extension as it comes along in the stream. Backtracking is used on paths that fail to grow any further. Crucial are the so-called position limits: when a matching edge is the $i$-th matching edge in a path and it is then removed by backtracking, it will only be inserted into a path again at a position strictly less than $i$. This rule strikes a balance between terminating quickly on the one hand and giving the procedure enough freedom on the other hand.

    Maximum Matching in Turnstile Streams

    We consider the unweighted bipartite maximum matching problem in the one-pass turnstile streaming model, where the input stream consists of edge insertions and deletions. In the insertion-only model, a one-pass $2$-approximation streaming algorithm can be easily obtained with space $O(n \log n)$, where $n$ denotes the number of vertices of the input graph. We show that no such result is possible if edge deletions are allowed, even if space $O(n^{3/2-\delta})$ is granted, for every $\delta > 0$. Specifically, for every $0 \le \epsilon \le 1$, we show that in the one-pass turnstile streaming model, in order to compute an $O(n^{\epsilon})$-approximation, space $\Omega(n^{3/2 - 4\epsilon})$ is required for constant-error randomized algorithms, and, up to logarithmic factors, space $O(n^{2-2\epsilon})$ is sufficient. Our lower bound result is proved in the simultaneous message model of communication and may be of independent interest.
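
The insertion-only $2$-approximation mentioned above is commonly obtained by greedily maintaining a maximal matching. The following is a minimal sketch under that assumption; the function name `greedy_stream_matching` and the edge representation are illustrative and not taken from the paper.

```python
from typing import Hashable, Iterable, List, Set, Tuple

Edge = Tuple[Hashable, Hashable]


def greedy_stream_matching(edge_stream: Iterable[Edge]) -> List[Edge]:
    """One pass over an insertion-only edge stream (hypothetical helper).

    Keeps an edge whenever both endpoints are still unmatched. The result
    is a maximal matching, which has at least half the size of a maximum
    matching. Only the matched vertices and kept edges are stored, i.e.
    O(n) vertex identifiers, or O(n log n) bits.
    """
    matched: Set[Hashable] = set()
    matching: List[Edge] = []
    for u, v in edge_stream:
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


# A path a1 - b1 - a2 - b2 whose middle edge arrives first: greedy keeps
# one edge, while the maximum matching has two edges.
stream = [("b1", "a2"), ("a1", "b1"), ("a2", "b2")]
print(greedy_stream_matching(stream))
```

This sketch does not handle deletions: once an edge is kept, removing it later would require information the algorithm has already discarded, which is the difficulty the turnstile lower bound formalizes.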

    On the Power of Advice and Randomization for Online Bipartite Matching

    While randomized online algorithms have access to a sequence of uniform random bits, deterministic online algorithms with advice have access to a sequence of advice bits, i.e., bits that are set by an all-powerful oracle prior to the processing of the request sequence. Advice bits are at least as helpful as random bits, but how helpful are they? In this work, we investigate the power of advice bits and random bits for online maximum bipartite matching (MBM). The well-known Karp-Vazirani-Vazirani algorithm is an optimal randomized $(1-\frac{1}{e})$-competitive algorithm for MBM that requires access to $\Theta(n \log n)$ uniform random bits. We show that $\Omega(\log(\frac{1}{\epsilon}) n)$ advice bits are necessary and $O(\frac{1}{\epsilon^5} n)$ sufficient in order to obtain a $(1-\epsilon)$-competitive deterministic advice algorithm. Furthermore, for a large natural class of deterministic advice algorithms, we prove that $\Omega(\log \log \log n)$ advice bits are required in order to improve on the $\frac{1}{2}$-competitiveness of the best deterministic online algorithm, while it is known that $O(\log n)$ bits are sufficient. Last, we give a randomized online algorithm that uses $c n$ random bits, for integers $c \ge 1$, and a competitive ratio that approaches $1-\frac{1}{e}$ very quickly as $c$ increases. For example, if $c = 10$, then the difference between $1-\frac{1}{e}$ and the achieved competitive ratio is less than $0.0002$.
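
For orientation, the Karp-Vazirani-Vazirani algorithm (often called Ranking) can be sketched as follows: draw one uniformly random permutation of the offline vertices, then match each arriving online vertex to its unmatched neighbor of lowest rank. The names `ranking_online_matching`, `offline`, and `arrivals` are assumptions made for this illustration. Drawing a uniform permutation of $n$ vertices is what consumes the $\Theta(n \log n)$ random bits.

```python
import random
from typing import Dict, Hashable, Iterable, Optional, Sequence, Tuple


def ranking_online_matching(
    offline: Sequence[Hashable],
    arrivals: Iterable[Tuple[Hashable, Sequence[Hashable]]],
    seed: Optional[int] = None,
) -> Dict[Hashable, Hashable]:
    """Sketch of Ranking; returns a map online vertex -> offline vertex."""
    rng = random.Random(seed)
    order = list(offline)
    rng.shuffle(order)  # uniformly random permutation of the offline side
    rank = {v: i for i, v in enumerate(order)}

    matched_to: Dict[Hashable, Hashable] = {}  # offline -> online
    for online_vertex, neighbors in arrivals:
        free = [v for v in neighbors if v in rank and v not in matched_to]
        if free:
            # Match to the free neighbor that comes first in the permutation.
            best = min(free, key=rank.__getitem__)
            matched_to[best] = online_vertex
    return {u: v for v, u in matched_to.items()}


arrivals = [("x", ["a", "b"]), ("y", ["a"])]
print(ranking_online_matching(["a", "b"], arrivals, seed=0))
```

In this small instance the outcome depends on the permutation: if `a` is ranked first, `x` takes `a` and `y` stays unmatched; if `b` is ranked first, both online vertices are matched.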

    Parameterized Streaming Algorithms for Vertex Cover

    As graphs continue to grow in size, we seek ways to effectively process such data at scale. The model of streaming graph processing, in which a compact summary is maintained as each edge insertion/deletion is observed, is an attractive one. However, few results are known for optimization problems over such dynamic graph streams. In this paper, we introduce a new approach to handling graph streams, by instead seeking solutions for the parameterized versions of these problems, where we are given a parameter $k$ and the objective is to decide whether there is a solution bounded by $k$. By combining kernelization techniques with randomized sketch structures, we obtain the first streaming algorithms for the parameterized versions of the Vertex Cover problem. We consider the following three models for a graph stream on $n$ nodes: 1. The insertion-only model, where edges can only be added. 2. The dynamic model, where edges can be both inserted and deleted. 3. The promised dynamic model, where we are guaranteed that at each timestamp there is a solution of size at most $k$. In each of these three models we are able to design parameterized streaming algorithms for the Vertex Cover problem. We are also able to show matching lower bounds for the space complexity of our algorithms. (Due to the arXiv limit of 1920 characters for the abstract field, please see the abstract in the paper for a detailed description of our results.)
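
To illustrate why a parameter $k$ helps in the insertion-only model, here is a small sketch based on a standard fact rather than on the paper's own algorithm: every vertex cover must contain at least one endpoint of each edge of a matching, so a matching with more than $k$ edges certifies that no cover of size at most $k$ exists. The function name `matching_certificate` is hypothetical, and the graph is assumed to be simple.

```python
from typing import Hashable, Iterable, Optional, Set, Tuple


def matching_certificate(
    edge_stream: Iterable[Tuple[Hashable, Hashable]], k: int
) -> Optional[Set[Hashable]]:
    """Greedy maximal matching over an insertion-only stream.

    Returns None as soon as the matching has more than k edges, which
    proves that no vertex cover of size <= k exists. Otherwise returns
    the matched endpoints: a vertex cover of size <= 2k, which answers
    the size-k question only approximately.
    """
    matched: Set[Hashable] = set()
    size = 0
    for u, v in edge_stream:
        if u not in matched and v not in matched:
            matched.update((u, v))
            size += 1
            if size > k:
                return None  # more than k disjoint edges: answer is "no"
    return matched


print(matching_certificate([(1, 2), (3, 4)], k=1))          # two disjoint edges
print(matching_certificate([(1, 2), (2, 3), (1, 3)], k=1))  # a triangle
```

The state never exceeds $2k$ vertices, independent of $n$. The triangle shows the limitation: its smallest cover has size 2, yet the matching has only one edge, so deciding the question exactly needs more than this certificate.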

    On Conceptually Simple Algorithms for Variants of Online Bipartite Matching

    We present a series of results regarding conceptually simple algorithms for bipartite matching in various online and related models. We first consider a deterministic adversarial model. The best approximation ratio possible for a one-pass deterministic online algorithm is $1/2$, which is achieved by any greedy algorithm. Dürr et al. recently presented a $2$-pass algorithm called Category-Advice that achieves approximation ratio $3/5$. We extend their algorithm to multiple passes. We prove the exact approximation ratio for the $k$-pass Category-Advice algorithm for all $k \ge 1$, and show that the approximation ratio converges to the inverse of the golden ratio $2/(1+\sqrt{5}) \approx 0.618$ as $k$ goes to infinity. The convergence is extremely fast: the $5$-pass Category-Advice algorithm is already within $0.01\%$ of the inverse of the golden ratio. We then consider a natural greedy algorithm in the online stochastic IID model, MinDegree. This algorithm is an online version of a well-known and extensively studied offline algorithm, MinGreedy. We show that MinDegree cannot achieve an approximation ratio better than $1-1/e$, which is guaranteed by any consistent greedy algorithm in the known IID model. Finally, following the work of Besser and Poloczek, we depart from an adversarial or stochastic ordering and investigate a natural randomized algorithm (MinRanking) in the priority model. Although the priority model allows the algorithm to choose the input ordering in a general but well-defined way, this natural algorithm cannot obtain the approximation of the Ranking algorithm in the ROM model.

    Sample-Based Estimation of Node Similarity in Streaming Bipartite Graphs

    My thesis focuses on analyzing the estimation of node similarity in streaming bipartite graphs. As an important model in many applications of data mining, the bipartite graph represents the relationships between two sets of non-interconnected nodes, e.g., customers and the products/services they buy, users and the events/groups they get involved in, individuals and the diseases that they are subject to, etc. In most of these cases, data naturally streams over time. Node similarity in my thesis mainly refers to neighborhood-based similarity, i.e., the Common Neighbors (CN) measure. We analyze the distributional properties of CN in terms of the CN score, its dense ranks (in which objects of equal weight receive the same rank and ranks are consecutive), and its fraction in the full projection graph, which is also called the similarity graph. We find that, in real-world datasets, the pairs of nodes with a large CN value constitute only a relatively small fraction. With this property, real-world streaming bipartite graphs provide an opportunity for space saving by weighted sampling, which can preferentially select high-weight edges. Therefore, in this thesis, we propose a new one-pass scheme for sampling the projection graphs of streaming bipartite graphs in fixed storage and providing unbiased estimates of the CN similarity weights.
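
As a point of reference for the Common Neighbors measure, the sketch below computes exact CN weights of the projection graph from a stream of bipartite edges. It is an exact baseline that stores the whole graph, not the fixed-storage sampling scheme proposed in the thesis; the function name `common_neighbor_counts` and the `(left, right)` edge format are assumptions for this illustration.

```python
from collections import defaultdict
from itertools import combinations
from typing import Dict, Iterable, Set, Tuple


def common_neighbor_counts(
    edge_stream: Iterable[Tuple[str, str]]
) -> Dict[Tuple[str, str], int]:
    """Exact CN weight for every pair of left-side nodes (hypothetical helper).

    Each (left, right) edge links, e.g., a customer to a product. The CN
    weight of two left nodes is the number of right nodes adjacent to both.
    """
    lefts_of: Dict[str, Set[str]] = defaultdict(set)
    for left, right in edge_stream:
        lefts_of[right].add(left)

    counts: Dict[Tuple[str, str], int] = defaultdict(int)
    for lefts in lefts_of.values():
        # Every pair of left nodes sharing this right node gains one
        # common neighbor; pairs are stored in sorted order.
        for a, b in combinations(sorted(lefts), 2):
            counts[(a, b)] += 1
    return dict(counts)


edges = [("u1", "p1"), ("u2", "p1"), ("u1", "p2"), ("u2", "p2"), ("u3", "p2")]
print(common_neighbor_counts(edges))
```

A right node with $d$ neighbors contributes $d(d-1)/2$ pairs, so the projection graph can be far larger than the input stream, which is the motivation for sampling it in fixed storage.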
