Center-based Clustering under Perturbation Stability
Clustering under most popular objective functions is NP-hard, even to
approximate well, and so unlikely to be efficiently solvable in the worst case.
Recently, Bilu and Linial \cite{Bilu09} suggested an approach aimed at
bypassing this computational barrier by using properties of instances one might
hope to hold in practice. In particular, they argue that instances in practice
should be stable to small perturbations in the metric space and give an
efficient algorithm for clustering instances of the Max-Cut problem that are
stable to perturbations of size O(n^{1/2}). In addition, they conjecture that
instances stable to as little as O(1) perturbations should be solvable in
polynomial time. In this paper we prove that this conjecture is true for any
center-based clustering objective (such as k-median, k-means, and
k-center). Specifically, we show we can efficiently find the optimal
clustering assuming only stability to factor-3 perturbations of the underlying
metric in spaces without Steiner points, and stability to factor 2+\sqrt{3}
perturbations for general metrics. In particular, we show for such instances
that the popular Single-Linkage algorithm combined with dynamic programming
will find the optimal clustering. We also present NP-hardness results under a
weaker but related condition.
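As a rough illustration of the Single-Linkage step named in this abstract, the following Python sketch merges the two closest clusters repeatedly until k clusters remain. It is a simplification under stated assumptions: the paper combines single linkage with dynamic programming over the resulting merge tree, which is not reproduced here, and the function name `single_linkage`, its parameters, and the sample points are hypothetical.

```python
from itertools import combinations


def single_linkage(points, k, dist):
    """Merge the closest pair of clusters until only k clusters remain.

    Illustrative only: the dynamic-programming step described in the
    abstract is NOT implemented; we simply stop at k clusters.
    """
    parent = list(range(len(points)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Process pairs in order of increasing distance (Kruskal-style).
    edges = sorted(
        combinations(range(len(points)), 2),
        key=lambda e: dist(points[e[0]], points[e[1]]),
    )
    clusters = len(points)
    for i, j in edges:
        if clusters <= k:
            break
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj
            clusters -= 1

    groups = {}
    for i, p in enumerate(points):
        groups.setdefault(find(i), []).append(p)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical one-dimensional data with three well-separated groups.
pts = [0.0, 1.0, 2.0, 10.0, 11.0, 25.0]
result = single_linkage(pts, 3, lambda a, b: abs(a - b))
print(result)
```

On well-separated data such as this, stopping at k clusters already recovers the natural grouping; the point of the abstract is that a stability assumption is what guarantees optimality.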
Differentially Private Data Analysis of Social Networks via Restricted Sensitivity
We introduce the notion of restricted sensitivity as an alternative to global
and smooth sensitivity to improve accuracy in differentially private data
analysis. The definition of restricted sensitivity is similar to that of global
sensitivity except that instead of quantifying over all possible datasets, we
take advantage of any beliefs about the dataset that a querier may have, to
quantify over a restricted class of datasets. Specifically, given a query f and
a hypothesis H about the structure of a dataset D, we show generically how to
transform f into a new query f_H whose global sensitivity (over all datasets
including those that do not satisfy H) matches the restricted sensitivity of
the query f. Moreover, if the belief of the querier is correct (i.e., D is in
H) then f_H(D) = f(D). If the belief is incorrect, then f_H(D) may be
inaccurate.
We demonstrate the usefulness of this notion by considering the task of
answering queries regarding social-networks, which we model as a combination of
a graph and a labeling of its vertices. In particular, while our generic
procedure is computationally inefficient, for the specific definition of H as
graphs of bounded degree, we exhibit efficient ways of constructing f_H using
different projection-based techniques. We then analyze two important query
classes: subgraph counting queries (e.g., number of triangles) and local
profile queries (e.g., number of people who know a spy and a computer-scientist
who know each other). We demonstrate that the restricted sensitivity of such
queries can be significantly lower than their smooth sensitivity. Thus, using
restricted sensitivity we can maintain privacy whether or not D is in H, while
providing more accurate results in the event that H holds true.
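To see why a lower sensitivity bound translates into more accurate answers, the Python sketch below adds Laplace noise with scale sensitivity/epsilon to a query answer. It assumes the transformed query f_H has already been constructed and does not implement any projection from the paper. The triangle-count bounds used (a single edge lies in at most n-2 triangles in an arbitrary n-vertex graph, and in at most d-1 triangles when all degrees are at most d) are an illustrative assumption about edge-level changes, and all names and numbers are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # is Laplace-distributed with location 0 and that scale.
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_answer(true_value, sensitivity, epsilon, rng):
    """Standard Laplace mechanism: noise scale = sensitivity / epsilon."""
    return true_value + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical setting: triangle counting on an n-vertex graph.
n, d, epsilon = 1000, 10, 0.5
worst_case_sensitivity = n - 2   # no degree assumption
restricted_sensitivity = d - 1   # assuming the hypothesis "max degree <= d"

worst_case_scale = worst_case_sensitivity / epsilon
restricted_scale = restricted_sensitivity / epsilon

rng = random.Random(0)
answer = noisy_answer(42, restricted_sensitivity, epsilon, rng)
print(worst_case_scale, restricted_scale, answer)
```

The noise scale drops from 1996 to 18 in this hypothetical setting, which is the kind of accuracy gain the abstract refers to when the hypothesis holds.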
Differentially Private Approximations of a Convex Hull in Low Dimensions
We give the first differentially private algorithms that estimate a variety of geometric features of points in the Euclidean space, such as diameter, width, volume of convex hull, min-bounding box, min-enclosing ball, etc. Our work relies heavily on the notion of Tukey-depth. Instead of (non-privately) approximating the convex-hull of the given set of points P, our algorithms approximate the geometric features of D_P(\kappa), the \kappa-Tukey region induced by P (all points of Tukey-depth \kappa or greater). Moreover, our approximations are all bi-criteria: for any geometric feature \mu our (\alpha,\Delta)-approximation is a value "sandwiched" between (1-\alpha)\mu(D_P(\kappa)) and (1+\alpha)\mu(D_P(\kappa-\Delta)).
Our work is aimed at producing an (\alpha,\Delta)-kernel of D_P(\kappa), namely a set S such that (after a shift) it holds that (1-\alpha)D_P(\kappa) \subseteq CH(S) \subseteq (1+\alpha)D_P(\kappa-\Delta). We show that an analogous notion of a bi-criteria approximation of a directional kernel, as originally proposed by [Pankaj K. Agarwal et al., 2004], fails to give a kernel, and so we resort to subtler notions of approximations of projections that do yield a kernel. First, we give differentially private algorithms that find (\alpha,\Delta)-kernels for a "fat" Tukey-region. Then, based on a private approximation of the min-bounding box, we find a transformation that does turn D_P(\kappa) into a "fat" region but only if its volume is proportional to the volume of D_P(\kappa-\Delta). Lastly, we give a novel private algorithm that finds a depth parameter \kappa for which the volume of D_P(\kappa) is comparable to the volume of D_P(\kappa-\Delta). We hope our work leads to the further study of the intersection of differential privacy and computational geometry.
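For intuition about Tukey depth and the nested Tukey regions, here is a small non-private Python sketch restricted to one dimension, where the depth of a point q is the smaller of the number of data points on either side of it (counting ties). The function names and the parameter name `kappa` for the depth threshold are assumptions made for illustration; the paper works in higher dimensions and with private algorithms, neither of which is reproduced here.

```python
def tukey_depth_1d(points, q):
    """Depth of q: the smaller count of points in the two closed half-lines at q."""
    left = sum(1 for p in points if p <= q)
    right = sum(1 for p in points if p >= q)
    return min(left, right)


def tukey_region_1d(points, kappa):
    """Interval (lo, hi) of all q with depth >= kappa, or None if it is empty."""
    s = sorted(points)
    n = len(s)
    if not 1 <= kappa <= n:
        raise ValueError("kappa must be between 1 and the number of points")
    # Depth >= kappa requires at least kappa points on each side of q.
    lo, hi = s[kappa - 1], s[n - kappa]
    return (lo, hi) if lo <= hi else None


# Hypothetical data set.
P = [1, 2, 4, 7, 9]
print(tukey_depth_1d(P, 4))    # the median is the deepest point
print(tukey_region_1d(P, 2))   # deeper region
print(tukey_region_1d(P, 1))   # shallower region, contains the deeper one
```

Lowering the depth threshold can only enlarge the region, which is why a bi-criteria guarantee can sandwich an estimate between the region at one depth and the region at a slightly smaller depth.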
Graph coloring with no large monochromatic components
For a graph G and an integer t we let mcc_t(G) be the smallest m such that
there exists a coloring of the vertices of G by t colors with no monochromatic
connected subgraph having more than m vertices. Let F be any nontrivial
minor-closed family of graphs. We show that mcc_2(G) = O(n^{2/3}) for any
n-vertex graph G \in F. This bound is asymptotically optimal and it is attained
for planar graphs. More generally, for every such F and every fixed t we show
that mcc_t(G)=O(n^{2/(t+1)}). On the other hand we have examples of graphs G
with no K_{t+3} minor and with mcc_t(G)=\Omega(n^{2/(2t-1)}).
It is also interesting to consider graphs of bounded degrees. Haxell, Szabo,
and Tardos proved mcc_2(G) \leq 20000 for every graph G of maximum degree 5.
We show that there are n-vertex 7-regular graphs G with mcc_2(G)=\Omega(n),
and more sharply, for every \epsilon>0 there exists c_\epsilon>0 and n-vertex
graphs of maximum degree 7, average degree at most 6+\epsilon for all
subgraphs, and with mcc_2(G) \ge c_\epsilon n. For 6-regular graphs it is known only
that the maximum order of magnitude of mcc_2 is between \sqrt n and n.
We also offer a Ramsey-theoretic perspective of the quantity mcc_t(G).
Comment: 13 pages, 2 figures
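The quantity mcc_t(G) can be made concrete with a brute-force Python sketch: for each coloring of the vertices with t colors, find the largest monochromatic connected component, then take the minimum over all colorings. This is exponential in the number of vertices and only meant for tiny graphs; the function names and the adjacency-dictionary representation are assumptions for illustration, not anything from the paper.

```python
from collections import deque
from itertools import product


def largest_mono_component(adj, color):
    """Size of the largest connected set of vertices sharing one color."""
    seen = set()
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for v in adj[u]:
                # Only walk along edges whose endpoints share a color.
                if v not in seen and color[v] == color[u]:
                    seen.add(v)
                    queue.append(v)
        best = max(best, size)
    return best


def mcc(adj, t):
    """Brute-force mcc_t(G): minimum over all t-colorings (tiny graphs only)."""
    nodes = list(adj)
    return min(
        largest_mono_component(adj, dict(zip(nodes, coloring)))
        for coloring in product(range(t), repeat=len(nodes))
    )


# Hypothetical examples: a 4-cycle and the complete graph on 4 vertices.
c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
print(mcc(c4, 2), mcc(k4, 2))
```

The 4-cycle is bipartite, so a proper 2-coloring leaves only singleton components, while any 2-coloring of K_4 must put at least two adjacent vertices in the same color.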