General Bounds for Incremental Maximization
We propose a theoretical framework to capture incremental solutions to
cardinality constrained maximization problems. The defining characteristic of
our framework is that the cardinality/support of the solution is bounded by a
value that grows over time, and we allow the solution to be
extended one element at a time. We investigate the best-possible competitive
ratio of such an incremental solution, i.e., the worst ratio over all
cardinalities k between the incremental solution after k steps and an optimum
solution of cardinality k. We define a large class of problems that contains
many important cardinality constrained maximization problems like maximum
matching, knapsack, and packing/covering problems. We provide a general
2.618-competitive incremental algorithm for this class of problems, and show
that no algorithm can have competitive ratio below 2.18 in general.
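As a reading aid (the notation is ours, not the paper's): writing f for the objective, X_k for the incremental solution after k steps, and OPT_k for an optimum solution of cardinality k, the competitive ratio in question is

    rho = max over all k of  f(OPT_k) / f(X_k),

so the results above say that rho <= 2.618 is achievable for the whole class, while no algorithm can guarantee rho below 2.18.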
In the second part of the paper, we focus on the inherently incremental
greedy algorithm that increases the objective value as much as possible in each
step. This algorithm is known to be 1.58-competitive for submodular objective
functions, but it has unbounded competitive ratio for the class of incremental
problems mentioned above. We define a relaxed submodularity condition for the
objective function, capturing problems like maximum (weighted) (b-)matching
and a variant of the maximum flow problem. We show that the greedy algorithm
has competitive ratio (exactly) 2.313 for the class of problems that satisfy
this relaxed submodularity condition.
Note that our upper bounds on the competitive ratios translate to
approximation ratios for the underlying cardinality constrained problems.
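A minimal sketch of the inherently incremental greedy rule analyzed in the second part (the function names and set-function interface are our assumptions, not the paper's code):

# Greedy incremental maximization: grow the solution one element at a
# time, always adding the element that increases the objective the most.
# Sketch only: `f` is a set function on subsets of `universe`; this is
# the generic greedy rule, not the paper's 2.618-competitive algorithm.
def greedy_incremental(universe, f, budget):
    solution = set()
    chain = []  # snapshot after each step, one per cardinality
    candidates = set(universe)
    for _ in range(budget):
        best = max(candidates, key=lambda e: f(solution | {e}), default=None)
        if best is None or f(solution | {best}) <= f(solution):
            break  # no remaining element improves the objective
        solution.add(best)
        candidates.remove(best)
        chain.append(set(solution))
    return chain

The chain of snapshots is what the framework evaluates: for each cardinality k, the k-th snapshot is compared against an optimum solution of cardinality k.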
Designing Networks with Good Equilibria under Uncertainty
We consider the problem of designing network cost-sharing protocols with good
equilibria under uncertainty. The underlying game is a multicast game in a
rooted undirected graph with nonnegative edge costs. A set of k terminal
vertices or players need to establish connectivity with the root. The social
optimum is the Minimum Steiner Tree. We are interested in situations where the
designer has incomplete information about the input. We propose two different
models, the adversarial and the stochastic. In both models, the designer has
prior knowledge of the underlying metric but the requested subset of the
players is not known and is activated either in an adversarial manner
(adversarial model) or is drawn from a known probability distribution
(stochastic model).
In the adversarial model, the designer's goal is to choose a single,
universal protocol that has low Price of Anarchy (PoA) for all possible
requested subsets of players. The main question we address is: to what extent
can prior knowledge of the underlying metric help in the design? We first
demonstrate that there exist graphs (outerplanar) where knowledge of the
underlying metric can dramatically improve the performance of good network
design. Then, in our main technical result, we show that there exist graph
metrics for which knowing the underlying metric does not help: any
universal protocol has PoA of Ω(log k), which is tight. We attack this
problem by developing new techniques that employ powerful tools from extremal
combinatorics, and more specifically Ramsey Theory in high dimensional
hypercubes.
Then we switch to the stochastic model, where each player is independently
activated. We show that there exists a randomized ordered protocol that
achieves constant PoA. By using standard derandomization techniques, we produce
a deterministic ordered protocol with constant PoA.
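To make the notion of an ordered protocol concrete (a sketch under our own naming, not code from the paper): players are ranked by a fixed global order, and each edge's cost is charged entirely to the highest-priority active player whose path uses it.

# Ordered cost-sharing sketch: every edge is paid for by the active
# player of smallest rank among those whose root path uses the edge.
# `paths` maps each active player to the set of edges on its path to
# the root; `order` is the protocol's fixed ranking. Illustrative only.
def ordered_cost_shares(paths, edge_cost, order):
    rank = {player: i for i, player in enumerate(order)}
    shares = {player: 0.0 for player in paths}
    edge_users = {}
    for player, edges in paths.items():
        for e in edges:
            edge_users.setdefault(e, []).append(player)
    for e, users in edge_users.items():
        payer = min(users, key=lambda p: rank[p])  # highest-priority user
        shares[payer] += edge_cost[e]
    return shares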
Systematizing Genome Privacy Research: A Privacy-Enhancing Technologies Perspective
Rapid advances in human genomics are enabling researchers to gain a better
understanding of the role of the genome in our health and well-being,
stimulating hope for more effective and cost-efficient healthcare. However,
this also prompts a number of security and privacy concerns stemming from the
distinctive characteristics of genomic data. To address them, a new research
community has emerged and produced a large number of publications and
initiatives.
In this paper, we rely on a structured methodology to contextualize and
provide a critical analysis of the current knowledge on privacy-enhancing
technologies used for testing, storing, and sharing genomic data, using a
representative sample of the work published in the past decade. We identify and
discuss limitations, technical challenges, and issues faced by the community,
focusing in particular on those that are inherently tied to the nature of the
problem and are harder for the community alone to address. Finally, we report
on the importance and difficulty of the identified challenges based on an
online survey of genome data privacy experts.
Fully Dynamic Consistent Facility Location
We consider classic clustering problems in fully dynamic data streams, where data elements can be both inserted and deleted. In this context, several parameters are of importance: (1) the quality of the solution after each insertion or deletion, (2) the time it takes to update the solution, and (3) how different consecutive solutions are. The question of obtaining efficient algorithms in this context for facility location, k-median, and k-means was raised in a recent paper by Hubert Chan et al. [WWW'18] and also appears as a natural follow-up to the online model with recourse studied by Lattanzi and Vassilvitskii [ICML'17] (i.e., in insertion-only streams).

In this paper, we focus on general metric spaces and mainly on the facility location problem. We give an arguably simple algorithm that maintains a constant factor approximation, with O(n log n) update time and O(n) total recourse. This improves over the naive algorithm, which recomputes a solution at each time step and can take up to O(n^2) update time and O(n^2) total recourse. These bounds are nearly optimal: in a general metric space, inserting a point takes Ω(n) time just to describe its distances to the other points, and we give a simple Ω(n) lower bound on the recourse. Moreover, we generalize this result to the k-median and k-means problems: our algorithm maintains a constant factor approximation in time Õ(n + k^2).

We complement our analysis with experiments showing that, at any time t, the cost of the solution maintained by our algorithm is very close to the cost of a solution obtained by recomputing a solution from scratch at time t, while having a much better running time.
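For intuition on the insertion side only, here is the classic online facility location rule of Meyerson; it is not the fully dynamic algorithm of this paper, and the interface is our own:

import random

# Meyerson's online rule: open a facility at a newly inserted point with
# probability proportional to its distance to the nearest open facility
# (capped at 1); otherwise serve it from that facility. `f` is the
# uniform facility opening cost and `dist` the metric. Sketch only.
def insert_point(point, facilities, f, dist):
    d = min((dist(point, c) for c in facilities), default=float("inf"))
    if random.random() < min(1.0, d / f):
        facilities.append(point)  # open a new facility here
    return facilities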
First-Come-First-Served for Online Slot Allocation and Huffman Coding
Can one choose a good Huffman code on the fly, without knowing the underlying
distribution? Online Slot Allocation (OSA) models this and similar problems:
There are n slots, each with a known cost. There are n items. Requests for
items are drawn i.i.d. from a fixed but hidden probability distribution p.
After each request, if the item, i, was not previously requested, then the
algorithm (knowing the slot costs and the requests so far, but not p) must
place the item in some vacant slot j(i). The goal is to minimize the sum, over
the items, of the probability of the item times the cost of its assigned slot.
The optimal offline algorithm is trivial: put the most probable item in the
cheapest slot, the second most probable item in the second cheapest slot, etc.
The optimal online algorithm is First Come First Served (FCFS): put the first
requested item in the cheapest slot, the second (distinct) requested item in
the second cheapest slot, etc. The optimal competitive ratios for any online
algorithm are 1+H(n-1) ~ ln n for general costs and 2 for concave costs. For
logarithmic costs, the ratio is, asymptotically, 1: FCFS gives cost opt + O(log
opt).
For Huffman coding, FCFS yields an online algorithm (one that allocates
codewords on demand, without knowing the underlying probability distribution)
that guarantees asymptotically optimal cost: at most opt + 2 log(1+opt) + 2.
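A minimal sketch of the FCFS rule described above (interface names are ours):

# First-Come-First-Served slot allocation: the i-th distinct item ever
# requested is placed in the i-th cheapest vacant slot.
def fcfs(requests, slot_costs):
    by_cost = sorted(range(len(slot_costs)), key=lambda j: slot_costs[j])
    assignment = {}  # item -> slot index
    for item in requests:
        if item not in assignment:
            assignment[item] = by_cost[len(assignment)]
    return assignment

For example, fcfs("abacabad", [3, 1, 2, 5]) places a in slot 1 (cost 1), b in slot 2 (cost 2), c in slot 0 (cost 3), and d in slot 3 (cost 5).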
Set Covering with Our Eyes Closed
Given a universe U of n elements and a weighted collection S of m subsets of U, the universal set cover problem is to a priori map each element u in U to a set S(u) in S containing u, such that any subset X of the universe is covered by S(X) = \cup_{u \in X} S(u). The aim is to find a mapping such that the cost of S(X) is as close as possible to the optimal set cover cost for X. (Such problems are also called oblivious or a priori optimization problems.) Unfortunately, for every universal mapping, the cost of S(X) can be Ω(√n) times larger than optimal if the set X is adversarially chosen.

In this paper we study the performance on average, when X is a set of randomly chosen elements from the universe: we show how to efficiently find a universal map whose expected cost is O(log mn) times the expected optimal cost. In fact, we give a slightly improved analysis and show that this is the best possible. We generalize these ideas to weighted set cover and show similar guarantees for (nonmetric) facility location, where we have to balance the facility opening cost against the cost of connecting clients to the facilities. We show applications of our results to universal multicut and disc-covering problems, and show how all these universal mappings give us algorithms for the stochastic online variants of the problems with the same competitive factors.
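One natural universal mapping, essentially greedy-based (this sketch is ours and is illustrative rather than the paper's exact construction): compute a greedy set cover of the whole universe and map every element to the set that first covers it.

# Universal mapping via greedy set cover: repeatedly pick the set with
# the cheapest cost per newly covered element, and map each element to
# the set that first covered it. Assumes every element is coverable.
def universal_mapping(universe, sets, cost):
    # sets: name -> frozenset of elements; cost: name -> positive float
    uncovered = set(universe)
    mapping = {}
    while uncovered:
        best = min(
            (s for s in sets if sets[s] & uncovered),
            key=lambda s: cost[s] / len(sets[s] & uncovered),
        )
        for u in sets[best] & uncovered:
            mapping[u] = best
        uncovered -= sets[best]
    return mapping  # then S(X) = {mapping[u] for u in X}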