
    Prioritized Metric Structures and Embedding

    Metric data structures (distance oracles, distance labeling schemes, routing schemes) and low-distortion embeddings provide a powerful algorithmic methodology, which has been successfully applied to approximation algorithms \cite{llr}, online algorithms \cite{BBMN11}, distributed algorithms \cite{KKMPT12} and computing sparsifiers \cite{ST04}. However, this methodology appears to have a limitation: the worst-case performance inherently depends on the cardinality of the metric, and one cannot specify in advance which vertices/points should enjoy a better service (i.e., stretch/distortion, label size/dimension) than that given by the worst-case guarantee. In this paper we alleviate this limitation by devising a suite of {\em prioritized} metric data structures and embeddings. We show that given a priority ranking $(x_1,x_2,\ldots,x_n)$ of the graph vertices (respectively, metric points) one can devise a metric data structure (respectively, embedding) in which the stretch (resp., distortion) incurred by any pair containing a vertex $x_j$ will depend on the rank $j$ of the vertex. We also show that other important parameters, such as the label size and (in some sense) the dimension, may depend only on $j$. In some of our metric data structures (resp., embeddings) we achieve both prioritized stretch (resp., distortion) and label size (resp., dimension) {\em simultaneously}. The worst-case performance of our metric data structures and embeddings is typically asymptotically no worse than that of their non-prioritized counterparts. Comment: To appear at STOC 201

    Algorithms for Constructing Overlay Networks For Live Streaming

    We present a polynomial-time approximation algorithm for constructing an overlay multicast network for streaming live media events over the Internet. The class of overlay networks constructed by our algorithm includes networks used by Akamai Technologies to deliver live media events to a global audience with high fidelity. We construct networks consisting of three stages of nodes. The nodes in the first stage are the entry points that act as sources for the live streams. Each source forwards each of its streams to one or more nodes in the second stage, which are called reflectors. A reflector can split an incoming stream into multiple identical outgoing streams, which are then sent on to nodes in the third and final stage that act as sinks and are located in edge networks near end-users. As the packets in a stream travel from one stage to the next, some of them may be lost. A sink combines the packets from multiple instances of the same stream (by reordering packets and discarding duplicates) to form a single instance of the stream with minimal loss. Our primary contribution is an algorithm that constructs an overlay network that provably satisfies capacity and reliability constraints to within a constant factor of optimal, and minimizes cost to within a logarithmic factor of optimal. Further, in the common case where only the transmission costs are minimized, we show that our algorithm produces a solution whose cost is within a factor of 2 of optimal. We also implement our algorithm and evaluate it on realistic traces derived from Akamai's live streaming network. Our empirical results show that our algorithm can be used to efficiently construct large-scale overlay networks in practice with near-optimal cost.
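The sink behaviour described in the abstract above (reordering packets and discarding duplicates across several copies of the same stream) can be illustrated with a minimal sketch. The packet format `(sequence number, payload)` and the function names `combine_streams` and `missing` are assumptions made for illustration; the abstract does not specify how the real system represents packets.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical packet format: (sequence number, payload).
Packet = Tuple[int, bytes]


def combine_streams(copies: Iterable[Iterable[Packet]]) -> List[Packet]:
    """Merge several lossy copies of one stream into a single ordered stream."""
    seen: Dict[int, bytes] = {}
    for copy in copies:
        for seq, payload in copy:
            # Keep the first payload seen per sequence number; discard duplicates.
            seen.setdefault(seq, payload)
    # Reorder by sequence number.
    return sorted(seen.items())


def missing(combined: List[Packet], total: int) -> List[int]:
    """Sequence numbers in range(total) that no copy delivered."""
    have = {seq for seq, _ in combined}
    return [s for s in range(total) if s not in have]


if __name__ == "__main__":
    via_reflector_a = [(0, b"a"), (2, b"c"), (1, b"b")]  # out of order, lost 3 and 4
    via_reflector_b = [(1, b"b"), (3, b"d")]             # lost 0, 2 and 4
    merged = combine_streams([via_reflector_a, via_reflector_b])
    print(merged)
    print(missing(merged, 5))
```

A packet is lost at the sink only if every copy lost it, which is why receiving the same stream through several reflectors reduces the combined loss.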

    Fast and Deterministic Approximations for k-Cut

    In an undirected graph, a k-cut is a set of edges whose removal breaks the graph into at least k connected components. The minimum weight k-cut can be computed in n^O(k) time, but when k is treated as part of the input, computing the minimum weight k-cut is NP-hard [Goldschmidt and Hochbaum, 1994]. For poly(m,n,k)-time algorithms, the best possible approximation factor is essentially 2 under the small set expansion hypothesis [Manurangsi, 2017]. Saran and Vazirani [1995] showed that a (2 - 2/k)-approximately minimum weight k-cut can be computed via O(k) minimum cuts, which implies an O~(km) randomized running time via the nearly linear time randomized min-cut algorithm of Karger [2000]. Nagamochi and Kamidoi [2007] showed that a (2 - 2/k)-approximately minimum weight k-cut can be computed deterministically in O(mn + n^2 log n) time. These results prompt two basic questions. The first concerns the role of randomization. Is there a deterministic algorithm for 2-approximate k-cuts matching the randomized running time of O~(km)? The second question qualitatively compares minimum cut to 2-approximate minimum k-cut. Can 2-approximate k-cuts be computed as fast as the minimum cut, that is, in O~(m) randomized time? We give a deterministic approximation algorithm that computes (2 + eps)-minimum k-cuts in O(m log^3 n / eps^2) time, via a (1 + eps)-approximation for an LP relaxation of k-cut.
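To make the k-cut definition above concrete, here is a brute-force sketch for very small graphs. It is not the algorithm from the abstract, nor any of the cited approximation algorithms: it enumerates every assignment of vertices to k non-empty parts, which takes exponential time. It relies on the fact that, with non-negative weights, a minimum weight k-cut equals the minimum total weight of edges crossing a partition of the vertices into k non-empty parts. The function name and the edge format are illustrative assumptions.

```python
from itertools import product
from typing import List, Tuple

# Hypothetical edge format: (u, v, weight), vertices numbered 0..n-1.
Edge = Tuple[int, int, float]


def min_k_cut_bruteforce(n: int, edges: List[Edge], k: int) -> float:
    """Exact minimum weight k-cut by exhaustive search (tiny graphs only)."""
    best = float("inf")
    # Try every labelling of the n vertices with one of k part labels.
    for labels in product(range(k), repeat=n):
        if len(set(labels)) < k:
            continue  # some part is empty, so this is not a k-way partition
        # Weight of the edges whose endpoints fall in different parts.
        weight = sum(w for u, v, w in edges if labels[u] != labels[v])
        best = min(best, weight)
    return best


if __name__ == "__main__":
    # Unit-weight cycle on 4 vertices.
    cycle = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 0, 1)]
    for k in (2, 3, 4):
        print(k, min_k_cut_bruteforce(4, cycle, k))
```

The search space has size k^n, which is why the fast approximation algorithms discussed in the abstract matter in practice.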

    The Secure Link Prediction Problem

    Link prediction is an important and well-studied problem for social networks. Given a snapshot of a graph, the link prediction problem asks which new interactions between members are most likely to occur in the near future. As networks grow in size, data owners are forced to store the data on remote cloud servers, which reveals sensitive information about the network. The graphs are therefore stored in encrypted form. We study the link prediction problem on encrypted graphs. To the best of our knowledge, this secure link prediction problem has not been studied before. We use the number of common neighbors for prediction. We present three algorithms for the secure link prediction problem. We design prototypes of the schemes and formally prove their security. We execute our algorithms on real-life datasets. Comment: This has been accepted for publication in Advances in Mathematics of Communications (AMC) journal
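The common-neighbors score mentioned in the abstract above can be sketched on a plaintext graph as follows. This shows only the unencrypted scoring rule; it does not implement any of the three secure schemes, whose details are not given here. The adjacency-dictionary representation and the function name are assumptions made for illustration.

```python
from typing import Dict, List, Set, Tuple


def common_neighbor_scores(adj: Dict[str, Set[str]], v: str) -> List[Tuple[str, int]]:
    """Rank the non-neighbors of v by the number of neighbors they share with v."""
    scores = []
    for u in adj:
        if u == v or u in adj[v]:
            continue  # skip v itself and vertices already linked to v
        scores.append((u, len(adj[v] & adj[u])))
    # Highest score first; ties broken by vertex name for a stable order.
    scores.sort(key=lambda item: (-item[1], item[0]))
    return scores


if __name__ == "__main__":
    graph = {
        "a": {"b", "c"},
        "b": {"a", "d"},
        "c": {"a", "d"},
        "d": {"b", "c"},
        "e": set(),
    }
    print(common_neighbor_scores(graph, "a"))
```

The top-ranked vertex is the predicted next link for `v`; in the example, `d` shares two neighbors with `a` while `e` shares none.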

    Improved Outlier Robust Seeding for k-means

    The $k$-means objective is a popular clustering objective, although it is inherently non-robust and sensitive to outliers. Its popular seeding or initialization, called $k$-means++, uses $D^{2}$ sampling and comes with a provable $O(\log k)$ approximation guarantee \cite{AV2007}. However, in the presence of adversarial noise or outliers, $D^{2}$ sampling is more likely to pick centers from distant outliers instead of inlier clusters, and therefore its approximation guarantees \textit{w.r.t.} the $k$-means solution on inliers do not hold. Assuming that the outliers constitute a constant fraction of the given data, we propose a simple variant of the $D^{2}$ sampling distribution, which makes it robust to the outliers. Our algorithm runs in $O(ndk)$ time, outputs $O(k)$ clusters, discards marginally more points than the optimal number of outliers, and comes with a provable $O(1)$ approximation guarantee. Our algorithm can also be modified to output exactly $k$ clusters instead of $O(k)$ clusters, while keeping its running time linear in $n$ and $d$. This is an improvement over previous results for robust $k$-means based on LP relaxation and rounding \cite{Charikar}, \cite{KrishnaswamyLS18} and \textit{robust $k$-means++} \cite{DeshpandeKP20}. Our empirical results show the advantage of our algorithm over $k$-means++~\cite{AV2007}, uniform random seeding, greedy sampling for $k$-means~\cite{tkmeanspp}, and robust $k$-means++~\cite{DeshpandeKP20}, on standard real-world and synthetic data sets used in previous work. Our proposal is easily amenable to scalable, faster, parallel implementations of $k$-means++ \cite{Bahmani,BachemL017} and is of independent interest for coreset constructions in the presence of outliers \cite{feldman2007ptas,langberg2010universal,feldman2011unified}.
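For reference, the standard $D^{2}$ sampling used by $k$-means++ can be sketched as below. This is the baseline procedure the abstract starts from, not the robust variant it proposes, whose modified sampling distribution is not specified here. The function names and the pure-Python point representation are assumptions made for illustration.

```python
import random
from typing import List, Sequence

Point = Sequence[float]


def sq_dist(p: Point, q: Point) -> float:
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def d2_seeding(points: List[Point], k: int, rng: random.Random) -> List[Point]:
    """Standard k-means++ seeding: sample each new center with probability
    proportional to its squared distance from the nearest chosen center."""
    centers = [rng.choice(points)]  # first center is uniform at random
    d2 = [sq_dist(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(d2) == 0:
            # Every point coincides with a chosen center; fall back to uniform.
            centers.append(rng.choice(points))
        else:
            centers.append(rng.choices(points, weights=d2, k=1)[0])
        # Update each point's squared distance to its nearest center.
        d2 = [min(old, sq_dist(p, centers[-1])) for old, p in zip(d2, points)]
    return centers


if __name__ == "__main__":
    data = [(0.0, 0.0), (0.0, 0.1), (10.0, 10.0), (10.0, 10.1), (500.0, 500.0)]
    print(d2_seeding(data, 2, random.Random(0)))
```

Because the sampling weight grows with squared distance, a far-away point such as `(500.0, 500.0)` in the example is very likely to be picked as a center, which is the outlier sensitivity the abstract describes.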