
    Fat Polygonal Partitions with Applications to Visualization and Embeddings

    Let $\mathcal{T}$ be a rooted and weighted tree, where the weight of any node is equal to the sum of the weights of its children. The popular Treemap algorithm visualizes such a tree as a hierarchical partition of a square into rectangles, where the area of the rectangle corresponding to any node in $\mathcal{T}$ is equal to the weight of that node. The aspect ratio of the rectangles in such a rectangular partition necessarily depends on the weights and can become arbitrarily high. We introduce a new hierarchical partition scheme, called a polygonal partition, which uses convex polygons rather than just rectangles. We present two methods for constructing polygonal partitions, both having guarantees on the worst-case aspect ratio of the constructed polygons; in particular, both methods guarantee a bound on the aspect ratio that is independent of the weights of the nodes. We also consider rectangular partitions with slack, where the areas of the rectangles may differ slightly from the weights of the corresponding nodes. We show that this makes it possible to obtain partitions with constant aspect ratio. This result generalizes to hyper-rectangular partitions in $\mathbb{R}^d$. We use these partitions with slack for embedding ultrametrics into $d$-dimensional Euclidean space: we give a $\mathrm{polylog}(\Delta)$-approximation algorithm for embedding $n$-point ultrametrics into $\mathbb{R}^d$ with minimum distortion, where $\Delta$ denotes the spread of the metric, i.e., the ratio between the largest and the smallest distance between two points. The previously best-known approximation ratio for this problem was polynomial in $n$. This is the first algorithm for embedding a non-trivial family of weighted-graph metrics into a space of constant dimension that achieves a polylogarithmic approximation ratio. Comment: 26 pages
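
    The paper's polygonal and slack-based constructions are not reproduced here. As context for the aspect-ratio issue the abstract describes, the sketch below (Python, with purely illustrative names) implements the classic slice-and-dice treemap layout: each node gets a rectangle whose area is proportional to its weight, and a single skewed weight already produces a thin sliver.

```python
# Minimal sketch of the classic rectangle-based treemap layout referenced in the
# abstract (slice-and-dice), NOT the paper's polygonal or slack-based schemes.
# The Node class and function names are illustrative assumptions.
from dataclasses import dataclass, field

@dataclass
class Node:
    weight: float                 # for internal nodes: sum of the children's weights
    children: list = field(default_factory=list)

def slice_and_dice(node, x, y, w, h, depth=0, out=None):
    """Recursively assign each node a rectangle of area proportional to its weight.

    Cuts alternate between vertical (even depth) and horizontal (odd depth);
    this is exactly what lets the aspect ratio grow with skewed weights.
    """
    if out is None:
        out = []
    out.append((node.weight, (x, y, w, h)))
    offset = 0.0
    for child in node.children:
        frac = child.weight / node.weight
        if depth % 2 == 0:        # vertical cut: split the width
            slice_and_dice(child, x + offset * w, y, frac * w, h, depth + 1, out)
        else:                     # horizontal cut: split the height
            slice_and_dice(child, x, y + offset * h, w, frac * h, depth + 1, out)
        offset += frac
    return out

# One skewed weight already forces a 100:1 sliver inside the unit square.
tree = Node(100.0, [Node(99.0), Node(1.0)])
for weight, rect in slice_and_dice(tree, 0.0, 0.0, 1.0, 1.0):
    print(weight, rect)
```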

    Parallel Algorithms for Geometric Graph Problems

    We give algorithms for geometric graph problems in the modern parallel models inspired by MapReduce. For example, for the Minimum Spanning Tree (MST) problem over a set of points in the two-dimensional space, our algorithm computes a $(1+\epsilon)$-approximate MST. Our algorithms work in a constant number of rounds of communication, while using total space and communication proportional to the size of the data (linear space and near linear time algorithms). In contrast, for general graphs, achieving the same result for MST (or even connectivity) remains a challenging open problem, despite drawing significant attention in recent years. We develop a general algorithmic framework that, besides MST, also applies to Earth-Mover Distance (EMD) and the transportation cost problem. Our algorithmic framework has implications beyond the MapReduce model. For example, it yields a new algorithm for computing EMD cost in the plane in near-linear time, $n^{1+o_\epsilon(1)}$. We note that while recently Sharathkumar and Agarwal developed a near-linear time algorithm for $(1+\epsilon)$-approximating EMD, our algorithm is fundamentally different, and, for example, also solves the transportation (cost) problem, raised as an open question in their work. Furthermore, our algorithm immediately gives a $(1+\epsilon)$-approximation algorithm with $n^{\delta}$ space in the streaming-with-sorting model with $1/\delta^{O(1)}$ passes. As such, it is tempting to conjecture that the parallel models may also constitute a concrete playground in the quest for efficient algorithms for EMD (and other similar problems) in the vanilla streaming model, a well-known open problem.
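
    The MPC framework itself is not sketched here. As a concrete reference point, the following Python snippet (illustrative names only) computes the exact Euclidean MST cost with Prim's algorithm in $O(n^2)$ time, i.e. the quantity that the paper's constant-round parallel algorithm $(1+\epsilon)$-approximates.

```python
# Exact sequential baseline for the geometric MST problem: Prim's algorithm on
# the complete Euclidean graph over 2D points. Purely illustrative; it is not
# the parallel algorithm described in the abstract.
import math

def euclidean_mst_cost(points):
    """Return the exact MST cost of the complete Euclidean graph on `points`."""
    n = len(points)
    if n <= 1:
        return 0.0
    in_tree = [False] * n
    dist = [math.inf] * n          # cheapest connection of each point to the tree
    dist[0] = 0.0
    total = 0.0
    for _ in range(n):
        # pick the cheapest point not yet in the tree
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: dist[i])
        in_tree[u] = True
        total += dist[u]
        ux, uy = points[u]
        for v in range(n):
            if not in_tree[v]:
                d = math.hypot(points[v][0] - ux, points[v][1] - uy)
                dist[v] = min(dist[v], d)
    return total

print(euclidean_mst_cost([(0, 0), (1, 0), (1, 1), (0, 1)]))  # 3.0
```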

    Round Compression for Parallel Matching Algorithms

    For over a decade now we have been witnessing the success of {\em massive parallel computation} (MPC) frameworks, such as MapReduce, Hadoop, Dryad, or Spark. One of the reasons for their success is the fact that these frameworks are able to accurately capture the nature of large-scale computation. In particular, compared to the classic distributed algorithms or PRAM models, these frameworks allow for much more local computation. The fundamental question that arises in this context is: can we leverage this additional power to obtain even faster parallel algorithms? A prominent example here is the {\em maximum matching} problem, one of the most classic graph problems. It is well known that in the PRAM model one can compute a 2-approximate maximum matching in $O(\log n)$ rounds. However, the exact complexity of this problem in the MPC framework is still far from understood. Lattanzi et al. showed that if each machine has $n^{1+\Omega(1)}$ memory, this problem can also be solved 2-approximately in a constant number of rounds. These techniques, as well as the approaches developed in the follow-up work, seem to get stuck in a fundamental way at roughly $O(\log n)$ rounds once we enter the near-linear memory regime. It is thus entirely possible that in this regime, which captures in particular the case of sparse graph computations, the best MPC round complexity matches what one can already get in the PRAM model, without the need to take advantage of the extra local computation power. In this paper, we finally refute that perplexing possibility. That is, we break the above $O(\log n)$ round complexity bound even in the case of {\em slightly sublinear} memory per machine. In fact, our improvement here is {\em almost exponential}: we are able to deliver a $(2+\epsilon)$-approximation to maximum matching, for any fixed constant $\epsilon>0$, in $O((\log \log n)^2)$ rounds.
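
    For reference, the sketch below shows the classical sequential baseline the abstract keeps comparing against: a greedy maximal matching, which is a 2-approximation to maximum matching because every edge of an optimum matching has at least one greedily matched endpoint. This is not the paper's round-compression algorithm; all names are illustrative.

```python
# Greedy maximal matching: a 2-approximation to maximum matching.
def greedy_maximal_matching(edges):
    """Return a maximal matching of the given edge list (pairs of vertex ids)."""
    matched = set()       # vertices already used by the matching
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.add(u)
            matched.add(v)
    return matching

# Example: a path 0-1-2-3; greedy picks (0,1) and (2,3), which here is optimal.
print(greedy_maximal_matching([(0, 1), (1, 2), (2, 3)]))
```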

    Maintaining a large matching and a small vertex cover

    We consider the problem of maintaining a large matching and a small vertex cover in a dynamically changing graph. Each update to the graph is either an edge deletion or an edge insertion. We give the first randomized data structure that simultaneously achieves a constant approximation factor and handles a sequence of $K$ updates in $K \cdot \mathrm{polylog}(n)$ time, where $n$ is the number of vertices in the graph. Previous data structures require a polynomial amount of computation per update.
    Funding: National Science Foundation (U.S.) (Grant 0732334); National Science Foundation (U.S.) (Grant 0728645); Marie Curie International Reintegration Grant (PIRG03-GA-2008-231077); Israel Science Foundation (Grant 1147/09); Israel Science Foundation (Grant 1675/09).
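
    As a toy illustration of the invariant such data structures exploit, and not the paper's randomized structure (which also supports deletions in polylogarithmic amortized time), the insertion-only sketch below maintains a maximal matching whose endpoints form a vertex cover of size at most twice the minimum; all names are hypothetical.

```python
# Insertion-only toy structure: maximal matching + the induced 2-approximate
# vertex cover. Deletions, which the paper's data structure handles, are omitted.
class InsertOnlyMatching:
    def __init__(self):
        self.matched = {}          # vertex -> its mate in the current matching

    def insert_edge(self, u, v):
        # Greedily match the new edge if both endpoints are currently free,
        # which keeps the matching maximal over all edges seen so far.
        if u not in self.matched and v not in self.matched:
            self.matched[u] = v
            self.matched[v] = u

    def matching(self):
        return {tuple(sorted((u, v))) for u, v in self.matched.items()}

    def vertex_cover(self):
        # Matched vertices cover every inserted edge (2-approximate vertex cover).
        return set(self.matched)

ds = InsertOnlyMatching()
for e in [(0, 1), (1, 2), (2, 3)]:
    ds.insert_edge(*e)
print(ds.matching(), ds.vertex_cover())
```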

    On Approximating the Number of $k$-cliques in Sublinear Time

    We study the problem of approximating the number of $k$-cliques in a graph when given query access to the graph. We consider the standard query model for general graphs via (1) degree queries, (2) neighbor queries and (3) pair queries. Let $n$ denote the number of vertices in the graph, $m$ the number of edges, and $C_k$ the number of $k$-cliques. We design an algorithm that outputs a $(1+\varepsilon)$-approximation (with high probability) for $C_k$, whose expected query complexity and running time are $O\left(\frac{n}{C_k^{1/k}}+\frac{m^{k/2}}{C_k}\right)\mathrm{poly}(\log n,1/\varepsilon,k)$. Hence, the complexity of the algorithm is sublinear in the size of the graph for $C_k = \omega(m^{k/2-1})$. Furthermore, we prove a lower bound showing that the query complexity of our algorithm is essentially optimal (up to the dependence on $\log n$, $1/\varepsilon$ and $k$). The previous results in this vein are by Feige (SICOMP 06) and by Goldreich and Ron (RSA 08) for edge counting ($k=2$) and by Eden et al. (FOCS 2015) for triangle counting ($k=3$). Our result matches the complexities of these results. The previous result by Eden et al. hinges on a certain amortization technique that works only for triangle counting and does not generalize to larger cliques. We obtain a general algorithm that works for any $k\geq 3$ by designing a procedure that samples each $k$-clique incident to a given set $S$ of vertices with approximately equal probability. The primary difficulty is in finding cliques incident to purely high-degree vertices, since random sampling within neighbors has a low success probability. This is achieved by an algorithm that samples uniform random high-degree vertices and a careful tradeoff between estimating cliques incident purely to high-degree vertices and those that include a low-degree vertex.
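
    The query-based estimator itself is randomized and is not reproduced here. As a ground-truth reference for the quantity $C_k$ being approximated, the exhaustive counter below (illustrative Python, exponential in $k$) extends partial cliques only toward higher-numbered common neighbors, so each $k$-clique is counted exactly once.

```python
# Exact k-clique counter (the quantity C_k that the sublinear algorithm estimates).
# Purely illustrative brute force; not the paper's sampling procedure.
def count_k_cliques(adj, k):
    """adj: dict mapping vertex -> set of neighbors; returns the number of k-cliques."""
    def extend(clique, candidates):
        if len(clique) == k:
            return 1
        total = 0
        for v in list(candidates):
            # keep only common neighbors that come after v, so no clique is counted twice
            total += extend(clique + [v], {u for u in candidates if u > v and u in adj[v]})
        return total
    return extend([], set(adj))

# K4 on vertices 0..3 contains 4 triangles and exactly one 4-clique.
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
print(count_k_cliques(k4, 3), count_k_cliques(k4, 4))  # 4 1
```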