Proportional Approval Voting, Harmonic k-median, and Negative Association
We study a generic framework that provides a unified view of two important classes of problems: (i) extensions of the k-median problem where clients are interested in having multiple facilities in their vicinity (e.g., because, with some small probability, the closest facility might be malfunctioning and so unavailable for use), and (ii) finding winners according to some appealing multiwinner election rules, i.e., election systems aimed at choosing representative bodies, such as parliaments, based on the preferences of a population of voters over individual candidates. Each problem in our framework is associated with a vector of weights: we show that the approximability of the problem depends on structural properties of these vectors. We specifically focus on the harmonic sequence of weights, since it results in particularly appealing properties of the considered problem. In particular, the objective function interpreted in a multiwinner election setup corresponds to the well-known Proportional Approval Voting (PAV) rule.
Our main result is that, due to the specific (harmonic) structure of the weights, the problem admits a constant factor approximation. This is surprising since the problem can be interpreted as a variant of the k-median problem in which the connection costs are not assumed to satisfy the triangle inequality. To the best of our knowledge, this is the first constant factor approximation algorithm for a variant of k-median that does not require this assumption. The algorithm we propose is based on dependent rounding [Srinivasan, FOCS'01] applied to the solution of a natural LP-relaxation of the problem. The rounding process is well known to produce distributions over integral solutions satisfying Negative Correlation (NC), which is usually sufficient for the analysis of the approximation guarantees offered by rounding procedures. In our analysis, however, we need the fact that a carefully implemented rounding process satisfies a stronger property, called Negative Association (NA), which allows us to apply standard concentration bounds to conditional random variables.
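To make the dependent-rounding idea concrete, here is a minimal pair-rounding sketch: it repeatedly picks two fractional coordinates and shifts mass between them, preserving the total sum and each marginal. This is an illustrative Srinivasan-style procedure, not the paper's exact algorithm, and the variable names are my own.

```python
import random

def dependent_rounding(x, rng=random.Random(0)):
    """Round fractional x in [0,1]^n to {0,1}^n.
    Each pairwise step preserves the sum of the vector and, in expectation,
    each coordinate, so E[X_i] = x_i (the marginals of the LP solution).
    Illustrative sketch of Srinivasan-style dependent rounding."""
    x = list(x)
    eps = 1e-9
    while True:
        frac = [i for i, v in enumerate(x) if eps < v < 1 - eps]
        if not frac:
            break
        if len(frac) == 1:
            # lone fractional entry (sum was not integral): round it alone,
            # still preserving its marginal
            i = frac[0]
            x[i] = 1.0 if rng.random() < x[i] else 0.0
            continue
        i, j = frac[0], frac[1]
        # largest mass shifts that keep both coordinates inside [0,1]
        a = min(1 - x[i], x[j])   # candidate move: +a to x[i], -a from x[j]
        b = min(x[i], 1 - x[j])   # candidate move: -b from x[i], +b to x[j]
        # choosing the +a move with probability b/(a+b) makes the expected
        # change of each coordinate zero
        if rng.random() < b / (a + b):
            x[i], x[j] = x[i] + a, x[j] - a
        else:
            x[i], x[j] = x[i] - b, x[j] + b
    return [round(v) for v in x]
```

Each step fixes at least one coordinate to 0 or 1, so the loop terminates after at most n pairwise steps; the stronger NC/NA properties concern how the resulting indicator variables covary, which this sketch does not attempt to verify.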
Learning mixtures of separated nonspherical Gaussians
Mixtures of Gaussian (or normal) distributions arise in a variety of
application areas. Many heuristics have been proposed for the task of finding
the component Gaussians given samples from the mixture, such as the EM
algorithm, a local-search heuristic from Dempster, Laird and Rubin [J. Roy.
Statist. Soc. Ser. B 39 (1977) 1-38]. These do not provably run in polynomial
time. We present the first algorithm that provably learns the component
Gaussians in time that is polynomial in the dimension. The Gaussians may have
arbitrary shape, but they must satisfy a ``separation condition'' which places
a lower bound on the distance between the centers of any two component
Gaussians. The mathematical results at the heart of our proof are ``distance
concentration'' results--proved using isoperimetric inequalities--which
establish bounds on the probability distribution of the distance between a pair
of points generated according to the mixture. We also formalize the more
general problem of max-likelihood fit of a Gaussian mixture to unstructured
data.

Comment: Published at http://dx.doi.org/10.1214/105051604000000512 in the Annals of Applied Probability (http://www.imstat.org/aap/) by the Institute of Mathematical Statistics (http://www.imstat.org).
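For contrast with the provable algorithm above, the classical EM heuristic of Dempster, Laird and Rubin is short enough to sketch in full. The following 1-D version, with the two components well separated as in the paper's "separation condition", is illustrative only; the initialization and data are my own choices.

```python
import numpy as np

rng = np.random.default_rng(0)
# two well-separated 1-D Gaussians, mimicking a separation condition
data = np.concatenate([rng.normal(-5, 1, 500), rng.normal(5, 1, 500)])

def em_1d(x, k=2, iters=50):
    """Plain EM for a 1-D Gaussian mixture; a heuristic with no
    polynomial-time guarantee, unlike the algorithm of the paper."""
    mu = np.array([-1.0, 1.0])      # crude initialization
    sigma = np.ones(k)
    w = np.full(k, 1.0 / k)
    for _ in range(iters):
        # E-step: posterior responsibility of each component for each point
        dens = (w * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2)
                / (sigma * np.sqrt(2 * np.pi)))
        resp = dens / dens.sum(axis=1, keepdims=True)
        # M-step: re-estimate weights, means, and variances
        n_k = resp.sum(axis=0)
        w = n_k / len(x)
        mu = (resp * x[:, None]).sum(axis=0) / n_k
        sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)
    return w, mu, sigma

w, mu, sigma = em_1d(data)
```

With centers this far apart, EM recovers the component means quickly; without separation, it can get stuck in poor local optima, which is exactly why it carries no provable guarantee.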
Concentration for independent random variables with heavy tails
If a random variable is not exponentially integrable, it is known that no
concentration inequality holds for an infinite sequence of independent copies.
Under mild conditions, we establish concentration inequalities for finite
sequences of independent copies, with good dependence in
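A standard way to see that heavy-tailed sums can still concentrate is the median-of-means estimator, which works with only a finite variance. The snippet below demonstrates this general theme on Pareto data; it illustrates the phenomenon, not the specific inequalities of the abstract.

```python
import random
import statistics

def median_of_means(xs, k):
    """Split xs into k blocks, average each block, take the median.
    This estimator concentrates around the mean even for heavy-tailed
    data that is not exponentially integrable, provided the variance
    is finite -- an illustrative sketch, not the paper's result."""
    m = len(xs) // k
    block_means = [sum(xs[i * m:(i + 1) * m]) / m for i in range(k)]
    return statistics.median(block_means)

rng = random.Random(0)
# Pareto(alpha=2.5): finite variance, but tails too heavy for
# exponential integrability
alpha = 2.5
true_mean = alpha / (alpha - 1)   # = 5/3 for this distribution
sample = [rng.paretovariate(alpha) for _ in range(10000)]
est = median_of_means(sample, k=20)
```

The median step discards the few blocks corrupted by extreme draws, which is what buys concentration despite the heavy tails.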
Transport Inequalities. A Survey
This is a survey of recent developments in the area of transport
inequalities. We investigate their consequences in terms of concentration and
deviation inequalities and sketch their links with other functional
inequalities and also large deviation theory.

Comment: Proceedings of the conference Inhomogeneous Random Systems 2009; 82 pages.
Dependent randomized rounding for clustering and partition systems with knapsack constraints
Clustering problems are fundamental to unsupervised learning. There is an
increased emphasis on fairness in machine learning and AI; one representative
notion of fairness is that no single demographic group should be
over-represented among the cluster-centers. This, and much more general
clustering problems, can be formulated with "knapsack" and "partition"
constraints. We develop new randomized algorithms targeting such problems, and
study two in particular: multi-knapsack median and multi-knapsack center. Our
rounding algorithms give new approximation and pseudo-approximation algorithms
for these problems. One key technical tool, which may be of independent
interest, is a new tail bound, analogous to that of Feige (2006), for sums of
random variables with unbounded variances. Such bounds are very useful in
inferring properties of large networks using few samples.
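To make the fairness constraint concrete, here is a small greedy sketch of k-center selection with a cap on how many centers each demographic group may contribute. This is a hypothetical illustrative heuristic for the partition-constrained setup, not the LP-rounding algorithm of the paper; the function and data are my own.

```python
import math

def capped_k_center(points, groups, k, cap):
    """Greedy farthest-point k-center respecting a partition constraint:
    at most `cap` centers drawn from each demographic group.
    Illustrative heuristic only -- the paper uses dependent randomized
    rounding of an LP, not this greedy rule."""
    counts = {groups[0]: 1}
    centers = [0]                     # seed with the first point
    while len(centers) < k:
        # pick the farthest point whose group still has capacity
        best, best_d = None, -1.0
        for i, p in enumerate(points):
            if counts.get(groups[i], 0) >= cap:
                continue
            d = min(math.dist(p, points[c]) for c in centers)
            if d > best_d:
                best, best_d = i, d
        if best is None:              # no eligible point remains
            break
        centers.append(best)
        counts[groups[best]] = counts.get(groups[best], 0) + 1
    return centers
```

Note that already-chosen points have distance zero to the center set, so they are never re-selected; the cap check is what encodes the "no group over-represented among cluster-centers" notion from the abstract.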