
    Constant Approximation for k-Median and k-Means with Outliers via Iterative Rounding

    In this paper, we present a new iterative rounding framework for many clustering problems. Using it, we obtain an (α_1 + ε ≤ 7.081 + ε)-approximation algorithm for k-median with outliers, greatly improving upon the large implicit constant approximation ratio of Chen [Chen, SODA 2008]. For k-means with outliers, we give an (α_2 + ε ≤ 53.002 + ε)-approximation, which is the first O(1)-approximation for this problem. The iterative algorithm framework is very versatile; we show how it can be used to give α_1- and (α_1 + ε)-approximation algorithms for the matroid and knapsack median problems respectively, improving upon the previous best approximation ratios of 8 [Swamy, ACM Trans. Algorithms] and 17.46 [Byrka et al., ESA 2015]. The natural LP relaxation for the k-median/k-means with outliers problem has an unbounded integrality gap. In spite of this negative result, our iterative rounding framework shows that we can round an LP solution to an almost-integral solution of small cost, in which at most two facilities are fractionally open. Thus, the LP integrality gap arises from the gap between almost-integral and fully-integral solutions. Then, using a pre-processing procedure, we show how to convert an almost-integral solution to a fully-integral solution while losing only a constant factor in the approximation ratio. By further using a sparsification technique, the additive factor loss incurred by the conversion can be reduced to any ε > 0.
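
    For concreteness, here is a minimal sketch (in assumed notation, not taken from the paper) of the natural LP relaxation for k-median with at most m outliers that the abstract refers to, with facility set F, client set C, connection costs c_ij, opening variables y_i, and assignment variables x_ij:

```latex
\begin{align*}
\min\ & \textstyle\sum_{i \in F}\sum_{j \in C} c_{ij}\, x_{ij} \\
\text{s.t.}\ & \textstyle\sum_{i \in F} x_{ij} \le 1 \quad \forall j \in C
  && \text{(each client assigned at most once)} \\
& x_{ij} \le y_i \quad \forall i \in F,\ j \in C
  && \text{(assign only to open facilities)} \\
& \textstyle\sum_{i \in F} y_i \le k
  && \text{(at most $k$ open facilities)} \\
& \textstyle\sum_{j \in C}\sum_{i \in F} x_{ij} \ge |C| - m
  && \text{(at most $m$ clients left as outliers)} \\
& 0 \le x_{ij},\ y_i \le 1
\end{align*}
```

    An integral solution has every y_i in {0, 1}; the almost-integral solutions the framework produces leave at most two of the y_i fractional. For k-means, c_ij is the squared distance.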

    Placement Algorithms for Hierarchical Cooperative Caching and other . . .

    In a large-scale information system, such as a digital library or the world wide web, a set of distributed caches can improve their effectiveness by cooperating with one another, both in serving each other's requests and in deciding what to store. This dissertation explores the potential of such cooperative caching and provides basic placement algorithms with which the caches can coordinate their storage decisions. The first part of the dissertation focuses on variants of the placement problem involving a single object. The best known of these variants are the facility location problems, which have received considerable attention in the operations research literature due to their widespread applicability. We prove that a simple local search heuristic, proposed about 25 years ago, yields polynomial-time constant-factor approximations for several metric facility location problems. The second part of the dissertation addresses the simultaneous placement of a collection of objects in hierarchical networks. We provide both exact and approximate polynomial-time algorithms for this hierarchical placement problem. Our exact algorithm is based on a reduction to min-cost flow and does not appear to be practical for large problem sizes; hence we are motivated to look for simpler approximation algorithms. Our main result is a simple constant-factor approximation algorithm that admits an efficient distributed implementation.
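
    The local search heuristic in question maintains a set of open facilities and repeatedly applies improving add, drop, or swap moves. A minimal sketch for uncapacitated facility location, with assumed data structures (dist[c][f] for metric distances, fcost[f] for opening costs; names are illustrative, not from the dissertation):

```python
import itertools

def total_cost(open_facs, clients, dist, fcost):
    """Facility-opening cost plus each client's distance to its nearest open facility."""
    conn = sum(min(dist[c][f] for f in open_facs) for c in clients)
    return sum(fcost[f] for f in open_facs) + conn

def local_search_ufl(facilities, clients, dist, fcost):
    """Add/drop/swap local search for uncapacitated facility location."""
    current = {facilities[0]}                        # any nonempty starting set
    best = total_cost(current, clients, dist, fcost)
    improved = True
    while improved:
        improved = False
        # candidate moves: add one facility, drop one, or swap a pair
        adds  = [current | {f} for f in facilities if f not in current]
        drops = [current - {f} for f in current if len(current) > 1]
        swaps = [(current - {f}) | {g}
                 for f in current for g in facilities if g not in current]
        for cand in itertools.chain(adds, drops, swaps):
            c = total_cost(cand, clients, dist, fcost)
            if c < best:                             # take the first improving move
                current, best, improved = cand, c, True
                break
    return current, best
```

    To guarantee polynomial running time, the analysis only accepts moves that improve the cost by at least a (1 - ε) factor; the plain version above may take a long sequence of tiny improving steps.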

    Quasi-Fully Dynamic Algorithms for Two-Connectivity, Cycle Equivalence and Related Problems

    In this paper we introduce a new class of dynamic graph algorithms called quasi-fully dynamic algorithms, which are much more general than backtracking algorithms and much simpler than fully dynamic algorithms. These algorithms are especially suitable for applications in which a certain core connected portion of the graph remains fixed, and fully dynamic updates occur on the remaining edges. We present very simple quasi-fully dynamic algorithms with O(log n) worst-case time per operation for 2-edge connectivity and cycle equivalence; the former is deterministic while the latter is Monte Carlo randomized. For 2-vertex connectivity, we give a randomized Las Vegas algorithm with O(log^4 n) expected amortized time per operation. We introduce the concept of quasi-k-edge-connectivity, a slightly relaxed version of k-edge connectivity, and show that it can be maintained in O(log n) worst-case time per operation. We also analyze the performance of a natural extension of our quasi-fully dynamic algorithms to fully dynamic algorithms. The quasi-fully dynamic algorithm we present for cycle equivalence (which has several applications in optimizing compilers) is of special interest, since the algorithm is quite simple and no special-purpose incremental or backtracking algorithm is known for this problem.
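
    To pin down the model the abstract describes, here is a naive executable sketch of the quasi-fully dynamic interface: the core edge set never changes, non-core edges come and go, and 2-edge-connectivity queries are answered by brute force. This only illustrates the semantics of the operations, not the paper's O(log n) data structures; all names are illustrative and a simple graph is assumed.

```python
from collections import defaultdict

class QuasiDynamic2EC:
    """Naive model of the quasi-fully dynamic setting: a fixed core edge
    set plus fully dynamic non-core edges; queries recompute from scratch."""

    def __init__(self, core_edges):
        self.core = {frozenset(e) for e in core_edges}  # fixed forever
        self.extra = set()                              # dynamic edges

    def insert(self, u, v):
        self.extra.add(frozenset((u, v)))

    def delete(self, u, v):
        e = frozenset((u, v))
        assert e not in self.core, "core edges never change"
        self.extra.discard(e)

    def two_edge_connected(self, s, t):
        """True iff s and t stay connected after removing any one edge."""
        edges = self.core | self.extra
        return (self._connected(s, t, edges) and
                all(self._connected(s, t, edges - {e}) for e in edges))

    @staticmethod
    def _connected(s, t, edges):
        if s == t:
            return True
        adj = defaultdict(set)
        for e in edges:
            u, v = tuple(e)
            adj[u].add(v); adj[v].add(u)
        seen, stack = {s}, [s]
        while stack:
            for y in adj[stack.pop()] - seen:   # depth-first search
                if y == t:
                    return True
                seen.add(y); stack.append(y)
        return False
```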

    Coordinated Placement and Replacement for Large-Scale Distributed Caches

    In a large-scale information system such as a digital library or the web, a set of distributed caches can improve their effectiveness by coordinating their data placement decisions. Using simulation, we examine three practical cooperative placement algorithms, including one that is provably close to optimal, and we compare these algorithms to the optimal placement algorithm and to several cooperative and non-cooperative replacement algorithms. We draw five conclusions from these experiments: (1) cooperative placement can significantly improve performance compared to local replacement algorithms, particularly when the size of individual caches is small compared to the universe of objects; (2) although the Amortized Placement algorithm is only guaranteed to be within 14 times the optimal, in practice it provides an excellent approximation of the optimal; (3) in a cooperative caching scenario, the recent GreedyDual local replacement algorithm performs much better than the other local replacement algorithms; (4) our Hierarchical GreedyDual replacement algorithm yields further improvements over the GreedyDual algorithm, especially when there are idle caches in the system; and (5) a key challenge for coordinated placement algorithms is generating good predictions of access patterns based on past accesses.
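
    Since conclusions (3) and (4) hinge on GreedyDual, here is a minimal sketch of the classic GreedyDual policy for uniform-size objects; the paper's Hierarchical GreedyDual variant for cooperating caches is not reproduced here, and all names below are illustrative:

```python
import heapq

class GreedyDualCache:
    """GreedyDual replacement: each cached object carries a credit H.
    On a hit the credit is reset to L + cost; on eviction the floor L
    rises to the evicted object's credit, aging everything else."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.L = 0.0
        self.H = {}        # object -> current credit
        self.heap = []     # (credit, object) min-heap with lazy deletion

    def access(self, obj, cost):
        """Returns True on a hit, False on a miss (obj is then cached)."""
        hit = obj in self.H
        if not hit and len(self.H) >= self.capacity:
            # evict the object with minimum credit; L becomes that credit
            while True:
                h, victim = heapq.heappop(self.heap)
                if self.H.get(victim) == h:     # skip stale heap entries
                    del self.H[victim]
                    self.L = h
                    break
        self.H[obj] = self.L + cost
        heapq.heappush(self.heap, (self.H[obj], obj))
        return hit
```

    The GreedyDual-Size variant commonly used for web caches charges cost/size instead of cost; the paper's Hierarchical variant extends the idea to a hierarchy of cooperating caches.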

    Server-storage virtualization: Integration and load balancing in data centers

    We describe the design of an agile data center with integrated server and storage virtualization technologies. Such data centers form a key building block for new cloud computing architectures. We also show how to leverage this integrated agility for non-disruptive load balancing in data centers across multiple resource layers: servers, switches, and storage. We propose a novel load balancing algorithm called VectorDot for handling the hierarchical and multi-dimensional resource constraints in such systems. The algorithm, inspired by the successful Toyoda method for multi-dimensional knapsacks, is the first of its kind. We evaluate our system on a range of synthetic and real data center testbeds comprising VMware ESX servers, IBM SAN Volume Controller, and Cisco and Brocade switches. Experiments under varied conditions demonstrate the end-to-end validity of our system and the ability of VectorDot to efficiently remove overloads on server, switch, and storage nodes.
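
    To make the dot-product scoring idea concrete, here is a hedged sketch of VectorDot-style node selection; the published algorithm additionally handles hierarchical path constraints across switches and overload thresholds, which this sketch omits, and all names are illustrative:

```python
import numpy as np

def vectordot_choose(item_req, nodes, caps, loads):
    """Among nodes that can fit the item's multi-dimensional requirement,
    pick the one minimizing the dot product of the item's normalized
    requirement vector with the node's normalized load vector, which
    steers load away from a node's already-hot dimensions.

    item_req: np.array of per-dimension demand (e.g., CPU, memory, I/O)
    caps[n], loads[n]: np.arrays of node n's capacities and current loads
    """
    best, best_score = None, float("inf")
    for n in nodes:
        frac_req  = item_req / caps[n]           # demand per dimension
        frac_load = loads[n] / caps[n]           # current utilization
        if np.any(frac_load + frac_req > 1.0):   # capacity check
            continue
        score = float(np.dot(frac_req, frac_load))
        if score < best_score:
            best, best_score = n, score
    return best
```

    Intuitively, a demand that is heavy in some dimension scores poorly against a node already loaded in that same dimension, so placements spread pressure across dimensions rather than piling onto one.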

    Coupled Placement in Modern Data Centers

    We introduce the coupled placement problem for modern data centers: placing application computation and data among available server and storage resources. While the two have traditionally been addressed independently in data centers, two modern trends make it beneficial to consider them together: (a) the rise of virtualization technologies, which enable applications packaged as VMs to run on any server in the data center with spare compute resources, and (b) the rise of multi-purpose hardware devices in the data center, which provide compute resources of varying capabilities at different proximities from the storage nodes. We present a novel framework called CPA for addressing such coupled placement of application data and computation in modern data centers. Based on two well-studied problems, Stable Marriage and Knapsacks, the CPA framework is simple, fast, and versatile, and automatically enables high-throughput applications to be placed on nearby server and storage node pairs. While a theoretical proof of CPA's worst-case approximation guarantee remains an open question, we use extensive experimental analysis to evaluate CPA on large synthetic data centers, comparing it to Linear Programming based methods and other traditional methods. Experiments show that CPA is consistently, and surprisingly, within 0 to 4% of the Linear Programming based optimal values for various data center topologies and workload patterns. At the same time it is one to two orders of magnitude faster than the LP based methods and scales to much larger problem sizes. The fast running time of CPA makes it highly suitable for large data center environments, where hundreds to thousands of server and storage nodes are common; LP based approaches are prohibitively slow in such environments. CPA is also suitable for fast interactive analysis during consolidation of such environments from physical to virtual resources.
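
    The abstract names Stable Marriage and Knapsacks as the two building blocks; the following is an illustrative deferred-acceptance loop in that spirit (proposals ordered by an assumed affinity score, capacity enforced knapsack-style by evicting the least-preferred tenants). It conveys the flavor only and is not the published CPA algorithm; all names and parameters are illustrative.

```python
def coupled_place(apps, pairs, affinity, demand, capacity):
    """apps: application IDs; pairs: candidate (server, storage) pairs;
    affinity[a][p]: how well app a fits pair p (e.g., proximity/throughput);
    demand[a]: app a's resource demand; capacity[p]: pair p's capacity."""
    prefs = {a: sorted(pairs, key=lambda p: -affinity[a][p]) for a in apps}
    nxt = {a: 0 for a in apps}            # next pair each app will propose to
    placed = {p: [] for p in pairs}       # current tenants of each pair
    free = list(apps)
    while free:
        a = free.pop()
        if nxt[a] >= len(prefs[a]):
            continue                      # a has exhausted its list: unplaced
        p = prefs[a][nxt[a]]; nxt[a] += 1
        placed[p].append(a)
        # over capacity: evict lowest-affinity tenants, who propose again
        placed[p].sort(key=lambda x: affinity[x][p], reverse=True)
        while sum(demand[x] for x in placed[p]) > capacity[p]:
            free.append(placed[p].pop())
    return placed
```

    As in Gale-Shapley, each app proposes down its preference list and evicted apps re-enter the pool, so the loop terminates after at most |apps| x |pairs| proposals.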