14 research outputs found
Bipartite Matching in the Semi-Streaming Model
We present the first deterministic 1+eps approximation algorithm for finding a large matching in a bipartite graph in the semi-streaming model which requires only passes over the input stream. In this model, the input graph is given as a stream of its edges in some arbitrary order, and storage of the algorithm is bounded by bits, where . The only previously known arbitrarily good approximation for general graphs is achieved by the randomized algorithm of McGregor (2005), which uses passes. We show that even for bipartite graphs, McGregor's algorithm needs passes, thus it is necessarily exponential in the approximation parameter. The design as well as the analysis of our algorithm require the introduction of some new techniques. A novelty of our algorithm is a new deterministic assignment of matching edges to augmenting paths which is responsible for the complexity reduction, and gets rid of randomization. We repeatedly grow an initial matching using augmenting paths up to a length of . We terminate when the number of augmenting paths found in one iteration falls below a certain threshold also depending on , that guarantees a approximation. The main challenge is to find those augmenting paths without requiring an excessive number of passes. In each iteration, using multiple passes, we grow a set of alternating paths in parallel, considering each edge as a possible extension as it comes along in the stream. Backtracking is used on paths that fail to grow any further. Crucial are the so-called position limits: when a matching edge is the i-th matching edge in a path and it is then removed by backtracking, it will only be inserted into a path again at a position strictly less than i. This rule strikes a balance between terminating quickly on the one hand and giving the procedure enough freedom on the other hand.
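The basic step the abstract builds on, enlarging a matching along an augmenting path, can be sketched as follows. This is only a minimal illustration of augmentation itself, not the paper's multi-pass path search with position limits; the function name `augment` and the edge representation (a set of `frozenset` pairs) are assumptions made for this example.

```python
def augment(matching, path):
    """Return the matching obtained by flipping the edges of an augmenting path.

    matching: set of frozenset({u, v}) edges (hypothetical representation).
    path: list of vertices whose two end vertices are unmatched and whose
          edges alternate between non-matching and matching edges.
    """
    path_edges = {frozenset(e) for e in zip(path, path[1:])}
    # Symmetric difference: matching edges on the path are dropped,
    # non-matching edges on the path are added, so the size grows by one.
    return matching ^ path_edges


m = {frozenset({"a2", "b1"})}
m2 = augment(m, ["a1", "b1", "a2", "b2"])
```

Here the path a1, b1, a2, b2 has one matching edge in the middle, so flipping it replaces that edge by the two outer edges.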
Maximum Matching in Turnstile Streams
We consider the unweighted bipartite maximum matching problem in the one-pass
turnstile streaming model where the input stream consists of edge insertions
and deletions. In the insertion-only model, a one-pass -approximation
streaming algorithm can be easily obtained with space , where
denotes the number of vertices of the input graph. We show that no such result
is possible if edge deletions are allowed, even if space is
granted, for every . Specifically, for every , we show that in the one-pass turnstile streaming model, in order to compute
a -approximation, space is
required for constant error randomized algorithms, and, up to logarithmic
factors, space is sufficient. Our lower bound result is
proved in the simultaneous message model of communication and may be of
independent interest.
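For context on the insertion-only setting, a standard one-pass approach is to keep a greedy maximal matching, which is known to contain at least half as many edges as a maximum matching. The sketch below is an illustration under that assumption and is not claimed to be the specific algorithm the abstract refers to; the function name and the tuple-based stream format are hypothetical. Note that it has no way to undo a choice, which hints at why edge deletions make the problem harder.

```python
def greedy_streaming_matching(edge_stream):
    """One pass over an insertion-only edge stream; keeps a maximal matching."""
    matched = set()   # vertices already covered by the matching
    matching = []     # stored edges, at most n/2 of them for n vertices
    for u, v in edge_stream:
        # Keep the edge only if both endpoints are still free.
        if u != v and u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching


stream = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a3", "b3")]
result = greedy_streaming_matching(stream)
```

On this stream the greedy choice of (a1, b1) blocks the two edges that follow it, so the result has two edges while a maximum matching has three.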
On the Power of Advice and Randomization for Online Bipartite Matching
While randomized online algorithms have access to a sequence of uniform
random bits, deterministic online algorithms with advice have access to a
sequence of advice bits, i.e., bits that are set by an all-powerful oracle
prior to the processing of the request sequence. Advice bits are at least as
helpful as random bits, but how helpful are they? In this work, we investigate
the power of advice bits and random bits for online maximum bipartite matching
(MBM).
The well-known Karp-Vazirani-Vazirani algorithm is an optimal randomized
-competitive algorithm for MBM that requires access
to uniform random bits. We show that
advice bits are necessary and
sufficient in order to obtain a
-competitive deterministic advice algorithm. Furthermore, for a
large natural class of deterministic advice algorithms, we prove that
advice bits are required in order to improve on the
-competitiveness of the best deterministic online algorithm, while
it is known that bits are sufficient.
Last, we give a randomized online algorithm that uses random bits, for
integers , and a competitive ratio that approaches
very quickly as is increasing. For example, if , then the difference
between and the achieved competitive ratio is less than
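The Karp-Vazirani-Vazirani algorithm mentioned above is commonly known as Ranking: it draws one random order of the offline vertices up front and matches each arriving online vertex to its highest-ranked free neighbor. The following is a minimal sketch under that description; the function name, the input format, and the use of a seeded `random.Random` instance are assumptions for illustration, not taken from the paper.

```python
import random


def ranking(offline, arrivals, rng):
    """arrivals: iterable of (online_vertex, list_of_offline_neighbors)."""
    order = list(offline)
    rng.shuffle(order)                        # the only place random bits are used
    rank = {v: i for i, v in enumerate(order)}
    match = {}                                # offline vertex -> online vertex
    for u, neighbors in arrivals:
        free = [v for v in neighbors if v not in match]
        if free:
            # Match u to the free neighbor that comes first in the random order.
            match[min(free, key=rank.__getitem__)] = u
    return match


offline = ["b1", "b2", "b3"]
arrivals = [("a1", ["b1", "b2"]), ("a2", ["b1"]), ("a3", ["b2", "b3"])]
matching = ranking(offline, arrivals, random.Random(0))
```

Replacing the shuffled order with one fixed by an oracle turns the same skeleton into a deterministic algorithm with advice, which is the comparison the abstract studies.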
Parameterized Streaming Algorithms for Vertex Cover
As graphs continue to grow in size, we seek ways to effectively process such
data at scale. The model of streaming graph processing, in which a compact
summary is maintained as each edge insertion/deletion is observed, is an
attractive one. However, few results are known for optimization problems over
such dynamic graph streams.
In this paper, we introduce a new approach to handling graph streams, by
instead seeking solutions for the parameterized versions of these problems
where we are given a parameter and the objective is to decide whether there
is a solution bounded by . By combining kernelization techniques with
randomized sketch structures, we obtain the first streaming algorithms for the
parameterized versions of the Vertex Cover problem. We consider the following
three models for a graph stream on nodes:
1. The insertion-only model where the edges can only be added.
2. The dynamic model where edges can be both inserted and deleted.
3. The promised dynamic model where we are guaranteed that at each
timestamp there is a solution of size at most .
In each of these three models we are able to design parameterized streaming
algorithms for the Vertex Cover problem. We are also able to show matching
lower bounds for the space complexity of our algorithms.
(Due to the arXiv limit of 1920 characters for abstract field, please see the
abstract in the paper for detailed description of our results)
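To illustrate how kernelization can work on a stream, here is a minimal sketch for the insertion-only model only. It is an assumption-laden illustration and not the paper's construction: it uses a greedy matching to reject early and a simple degree cap to keep the stored edge set small, it assumes a simple graph with no repeated edges, and all names are hypothetical. The dynamic models, which need sketch structures, are not covered.

```python
from itertools import combinations


def stream_kernel(edge_stream, k):
    """Return a small edge set preserving 'vertex cover of size <= k',
    or None if the stream certainly has no such cover."""
    matched, matching_size = set(), 0
    degree, kernel = {}, []
    for u, v in edge_stream:
        if u not in matched and v not in matched:
            matched.update((u, v))
            matching_size += 1
            if matching_size > k:   # k+1 disjoint edges need k+1 cover vertices
                return None
        # A vertex with more than k stored edges must be in every cover of
        # size <= k, so later edges at it are covered anyway and are dropped.
        if degree.get(u, 0) > k or degree.get(v, 0) > k:
            continue
        kernel.append((u, v))
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return kernel


def has_cover(edges, k):
    """Brute-force check, meant only for the small kernel."""
    vertices = sorted({x for e in edges for x in e})
    return any(all(u in c or v in c for u, v in edges)
               for size in range(min(k, len(vertices)) + 1)
               for c in map(set, combinations(vertices, size)))


stream = [("c", "l1"), ("c", "l2"), ("c", "l3"), ("c", "l4"), ("c", "l5"), ("x", "y")]
kernel = stream_kernel(stream, 2)
```

Every stored edge touches one of at most 2k matched vertices and each vertex keeps at most k+1 edges, so the kernel holds at most 2k(k+1) edges regardless of the stream length.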
On Conceptually Simple Algorithms for Variants of Online Bipartite Matching
We present a series of results regarding conceptually simple algorithms for
bipartite matching in various online and related models. We first consider a
deterministic adversarial model. The best approximation ratio possible for a
one-pass deterministic online algorithm is , which is achieved by any
greedy algorithm. Dürr et al. recently presented a -pass algorithm called
Category-Advice that achieves approximation ratio . We extend their
algorithm to multiple passes. We prove the exact approximation ratio for the
-pass Category-Advice algorithm for all , and show that the
approximation ratio converges to the inverse of the golden ratio
as goes to infinity. The convergence is
extremely fast --- the -pass Category-Advice algorithm is already within
of the inverse of the golden ratio.
We then consider a natural greedy algorithm in the online stochastic IID
model---MinDegree. This algorithm is an online version of a well-known and
extensively studied offline algorithm MinGreedy. We show that MinDegree cannot
achieve an approximation ratio better than , which is guaranteed by any
consistent greedy algorithm in the known IID model.
Finally, following the work in Besser and Poloczek, we depart from an
adversarial or stochastic ordering and investigate a natural randomized
algorithm (MinRanking) in the priority model. Although the priority model
allows the algorithm to choose the input ordering in a general but well defined
way, this natural algorithm cannot obtain the approximation of the Ranking
algorithm in the ROM model.
Sample-Based Estimation of Node Similarity in Streaming Bipartite Graphs
My thesis focuses on the estimation of node similarity in streaming bipartite
graphs. As an important model in many applications of data mining, the bipartite
graph represents the relationships between two sets of non-interconnected nodes, e.g., customers
and the products/services they buy, users and the events/groups they get involved
in, individuals and the diseases that they are subject to, etc. In most of these cases, data
naturally streams in over time.
Node similarity in my thesis mainly refers to neighborhood-based similarity,
i.e., the Common Neighbors (CN) measure. We analyze the distributional properties of CN
in terms of the CN score, its dense ranks, in which equal-weight objects receive the same
rank and ranks are consecutive, and its fraction in the full projection graph, which is also
called the similarity graph. We find that, in real-world datasets, the pairs of nodes with a large
CN value constitute only a relatively small fraction. With this property, real-world
streaming bipartite graphs provide an opportunity for space saving by weighted sampling,
which can preferentially select high-weight edges.
Therefore, in this thesis, we propose a new one-pass scheme for sampling the projection
graphs of streaming bipartite graphs in fixed storage and providing unbiased estimates of
the CN similarity weights.
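As a point of reference for the quantities named above, the sketch below computes exact Common Neighbors weights of the projection graph and their dense ranks from a fully stored edge list. It is an offline baseline written for illustration, not the sampling scheme proposed in the thesis; the function names and the (left node, right node) edge format are assumptions.

```python
from collections import defaultdict
from itertools import combinations


def common_neighbors(edges):
    """edges: iterable of (left_node, right_node) pairs of a bipartite graph.
    Returns {(u, v): number of right nodes adjacent to both left nodes u and v}."""
    by_right = defaultdict(set)
    for left, right in edges:
        by_right[right].add(left)
    cn = defaultdict(int)
    for lefts in by_right.values():
        # Each right node contributes one common neighbor to every pair it links.
        for u, v in combinations(sorted(lefts), 2):
            cn[(u, v)] += 1
    return dict(cn)


def dense_ranks(scores):
    """Equal scores share a rank; ranks are consecutive, 1 for the largest score."""
    distinct = sorted(set(scores.values()), reverse=True)
    rank_of = {s: i + 1 for i, s in enumerate(distinct)}
    return {pair: rank_of[s] for pair, s in scores.items()}


edges = [("u1", "p1"), ("u2", "p1"), ("u3", "p1"),
         ("u1", "p2"), ("u2", "p2"), ("u3", "p3")]
cn = common_neighbors(edges)
ranks = dense_ranks(cn)
```

In this toy graph only the pair (u1, u2) shares two neighbors, which mirrors the observation that high-CN pairs make up a small fraction of the projection graph.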