    On the Fixed-Parameter Tractability of Capacitated Clustering

    We study the complexity of the classic capacitated k-median and k-means problems parameterized by the number of centers, k. These problems are notoriously difficult: the best known approximation bound for high-dimensional Euclidean space and general metric spaces is Theta(log k), and it remains a major open problem whether a constant-factor approximation exists. We show that there exists a (3+epsilon)-approximation algorithm for capacitated k-median and a (9+epsilon)-approximation algorithm for capacitated k-means in general metric spaces, with running times f(epsilon,k) n^{O(1)}. For Euclidean inputs of arbitrary dimension, we give a (1+epsilon)-approximation algorithm for both problems with a similar running time. This is a significant improvement over the (7+epsilon)-approximation of Adamczyk et al. for k-median in general metric spaces and the (69+epsilon)-approximation of Xu et al. for Euclidean k-means.

    Parameterized and approximation complexity of the detection pair problem in graphs

    We study the complexity of the problem DETECTION PAIR. A detection pair of a graph $G$ is a pair $(W,L)$ of sets of detectors with $W\subseteq V(G)$, the watchers, and $L\subseteq V(G)$, the listeners, such that for every pair $u,v$ of vertices that are not dominated by a watcher of $W$, there is a listener of $L$ whose distances to $u$ and to $v$ are different. The goal is to minimize $|W|+|L|$. This problem generalizes the two classic problems DOMINATING SET and METRIC DIMENSION, which correspond to the restrictions $L=\emptyset$ and $W=\emptyset$, respectively. DETECTION PAIR was recently introduced by Finbow, Hartnell and Young [A. S. Finbow, B. L. Hartnell and J. R. Young. The complexity of monitoring a network with both watchers and listeners. Manuscript, 2015], who proved it to be NP-complete on trees, a surprising result given that both DOMINATING SET and METRIC DIMENSION are known to be linear-time solvable on trees. It follows from an existing reduction by Hartung and Nichterlein for METRIC DIMENSION that even on bipartite subcubic graphs of arbitrarily large girth, DETECTION PAIR is NP-hard to approximate within a sub-logarithmic factor and W[2]-hard (when parameterized by solution size). We show, using a reduction to SET COVER, that DETECTION PAIR is approximable within a factor logarithmic in the number of vertices of the input graph. Our two main results are a linear-time $2$-approximation algorithm and an FPT algorithm for DETECTION PAIR on trees. Comment: 13 page
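
The definition above can be checked directly on small graphs. The following is a minimal sketch of such a verifier, not the algorithm from the paper. It assumes the usual notion of domination (a watcher dominates itself and its neighbours) and an undirected graph given as an adjacency dictionary; the names `bfs_distances` and `is_detection_pair` are hypothetical.

```python
from collections import deque
from itertools import combinations


def bfs_distances(adj, source):
    """Return hop distances from `source` to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def is_detection_pair(adj, watchers, listeners):
    """Check whether (watchers, listeners) is a detection pair of the graph.

    Assumption: a watcher dominates its closed neighbourhood.
    """
    dominated = set(watchers)
    for w in watchers:
        dominated.update(adj[w])
    undominated = [v for v in adj if v not in dominated]

    # One BFS per listener; unreachable vertices get distance infinity.
    inf = float("inf")
    dist = {l: bfs_distances(adj, l) for l in listeners}

    # Every pair of undominated vertices needs a distinguishing listener.
    for u, v in combinations(undominated, 2):
        if not any(dist[l].get(u, inf) != dist[l].get(v, inf) for l in listeners):
            return False
    return True
```

On a path with five vertices, for example, a single listener at an endpoint already separates all vertices, while a single listener at the middle vertex does not, because its two neighbours are equidistant from it.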

    The One-Way Communication Complexity of Dynamic Time Warping Distance

    We resolve the randomized one-way communication complexity of Dynamic Time Warping (DTW) distance. We show that there is an efficient one-way communication protocol using O~(n/alpha) bits for the problem of computing an alpha-approximation for DTW between strings x and y of length n, and we prove a lower bound of Omega(n/alpha) bits for the same problem. Our communication protocol works for strings over an arbitrary metric of polynomial size and aspect ratio, and we optimize the logarithmic factors depending on properties of the underlying metric, such as when the points are low-dimensional integer vectors equipped with various metrics or have bounded doubling dimension. We also consider linear sketches of DTW, showing that such sketches must have size Omega(n).
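
For readers unfamiliar with the quantity being approximated, here is a minimal sketch of the textbook dynamic program for exact DTW distance. It is not the communication protocol from the paper. The function name `dtw_distance` and the default ground metric (absolute difference on numbers) are assumptions chosen for illustration.

```python
def dtw_distance(x, y, dist=lambda a, b: abs(a - b)):
    """Exact DTW distance between sequences x and y in O(len(x) * len(y)) time.

    `dist` is the ground metric on single elements (an assumption here:
    absolute difference, suitable for numeric sequences).
    """
    n, m = len(x), len(y)
    inf = float("inf")

    # prev[j] holds the DTW cost of aligning the first i-1 elements of x
    # with the first j elements of y; only two rows are kept in memory.
    prev = [inf] * (m + 1)
    prev[0] = 0.0
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = dist(x[i - 1], y[j - 1])
            # Extend the cheapest of: repeat y[j-1], repeat x[i-1], advance both.
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]
```

Because DTW allows elements to be repeated, `dtw_distance([1, 2, 3], [1, 2, 2, 3])` is zero even though the sequences differ in length.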

    A Pseudo-Metric between Probability Distributions based on Depth-Trimmed Regions

    The design of a metric between probability distributions is a longstanding problem motivated by numerous applications in Machine Learning. Focusing on continuous probability distributions on the Euclidean space $\mathbb{R}^d$, we introduce a novel pseudo-metric between probability distributions by leveraging the extension of univariate quantiles to multivariate spaces. Data depth is a nonparametric statistical tool that measures the centrality of any element $x\in\mathbb{R}^d$ with respect to (w.r.t.) a probability distribution or a data set. It is a natural median-oriented extension of the cumulative distribution function (cdf) to the multivariate case. Thus, its upper-level sets -- the depth-trimmed regions -- give rise to a definition of multivariate quantiles. The new pseudo-metric relies on the average of the Hausdorff distance between the depth-based quantile regions w.r.t. each distribution. Its good behavior w.r.t. major transformation groups, as well as its ability to factor out translations, are depicted. Robustness, an appealing feature of this pseudo-metric, is studied through the finite sample breakdown point. Moreover, we propose an efficient approximation method with linear time complexity w.r.t. the size of the data set and its dimension. The quality of this approximation as well as the performance of the proposed approach are illustrated in numerical experiments.
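
The Hausdorff distance mentioned above is the basic building block of the pseudo-metric. The sketch below computes it for two finite point sets by brute force. It illustrates only that building block, under the assumption that each region is represented by a finite sample of points; it is not the depth-based construction or the linear-time approximation described in the abstract, and the name `hausdorff_distance` is hypothetical.

```python
import math


def hausdorff_distance(A, B):
    """Hausdorff distance between two non-empty finite point sets.

    Points are sequences of coordinates of equal length; the ground
    distance is Euclidean. Runs in O(len(A) * len(B)) time.
    """
    def directed(P, Q):
        # Largest distance from a point of P to its nearest point in Q.
        return max(min(math.dist(p, q) for q in Q) for p in P)

    return max(directed(A, B), directed(B, A))
```

The two directed terms are generally different, which is why the maximum of both is taken: with `A = [(0, 0), (1, 0)]` and `B = [(0, 0), (3, 0)]`, every point of `A` is within distance 1 of `B`, but the point `(3, 0)` of `B` is at distance 2 from `A`.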