19 research outputs found

    Labeled Nearest Neighbor Search and Metric Spanners via Locality Sensitive Orderings

    Chan, Har-Peled, and Jones [SICOMP 2020] developed locality-sensitive orderings (LSO) for Euclidean space. A $(\tau,\rho)$-LSO is a collection $\Sigma$ of orderings such that for every $x,y\in\mathbb{R}^d$ there is an ordering $\sigma\in\Sigma$, where all the points between $x$ and $y$ w.r.t. $\sigma$ are in the $\rho$-neighborhood of either $x$ or $y$. In essence, LSO allow one to reduce problems to the $1$-dimensional line. Later, Filtser and Le [STOC 2022] developed LSO's for doubling metrics, general metric spaces, and minor free graphs. For Euclidean and doubling spaces, the number of orderings in the LSO is exponential in the dimension, which made them mainly useful for the low dimensional regime. In this paper, we develop new LSO's for Euclidean, $\ell_p$, and doubling spaces that allow us to trade larger stretch for a much smaller number of orderings. We then use our new LSO's (as well as the previous ones) to construct path reporting low hop spanners, fault tolerant spanners, reliable spanners, and light spanners for different metric spaces. While many nearest neighbor search (NNS) data structures were constructed for metric spaces with implicit distance representations (where the distance between two metric points can be computed using their names, e.g. Euclidean space), for other spaces almost nothing is known. In this paper we initiate the study of the labeled NNS problem, where one is allowed to artificially assign labels (short names) to metric points. We use LSO's to construct efficient labeled NNS data structures in this model.
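
The LSO property stated in the abstract above can be checked by brute force on a small finite point set. The sketch below is only an illustration, not the paper's construction. It assumes that "the $\rho$-neighborhood of $x$ or $y$" means distance at most $\rho\cdot\|x-y\|$ from the nearer endpoint. The helper names `pair_is_covered` and `is_lso` are hypothetical.

```python
from itertools import combinations
import math


def pair_is_covered(order, x, y, rho, points):
    """True if every point strictly between x and y in `order` lies within
    rho * d(x, y) of x or of y (assumed reading of "rho-neighborhood")."""
    i, j = sorted((order.index(x), order.index(y)))
    radius = rho * math.dist(points[x], points[y])
    return all(
        min(math.dist(points[z], points[x]), math.dist(points[z], points[y])) <= radius
        for z in order[i + 1 : j]
    )


def is_lso(orderings, points, rho):
    """Brute-force check: every pair of points is covered by some ordering."""
    return all(
        any(pair_is_covered(order, x, y, rho, points) for order in orderings)
        for x, y in combinations(range(len(points)), 2)
    )


# Four points on a line; orderings are lists of point indices.
points = [(0.0,), (1.0,), (2.0,), (3.0,)]
sorted_order = [0, 1, 2, 3]      # left-to-right order on the line
scrambled_order = [0, 3, 1, 2]   # puts a far point between near ones

print(is_lso([sorted_order], points, rho=0.5))
print(is_lso([scrambled_order], points, rho=0.5))
```

On the line, the sorted order alone passes the check for `rho=0.5`, because any point between two endpoints is within half their distance of the nearer one. The scrambled order fails on the pair of indices 0 and 1, since the point at index 3 sits between them.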

    Online Duet between Metric Embeddings and Minimum-Weight Perfect Matchings

    Low-distortion metric embeddings are a crucial component in the modern algorithmic toolkit. In an online metric embedding, points arrive sequentially and the goal is to embed them into a simple space irrevocably, while minimizing the distortion. Our first result is a deterministic online embedding of a general metric into Euclidean space with distortion $O(\log n)\cdot\min\{\sqrt{\log\Phi},\sqrt{n}\}$ (or, $O(d)\cdot\min\{\sqrt{\log\Phi},\sqrt{n}\}$ if the metric has doubling dimension $d$), solving a conjecture by Newman and Rabinovich (2020), and quadratically improving the dependence on the aspect ratio $\Phi$ from Indyk et al. (2010). Our second result is a stochastic embedding of a metric space into trees with expected distortion $O(d\cdot\log\Phi)$, generalizing previous results (Indyk et al. (2010), Bartal et al. (2020)). Next, we study the online minimum-weight perfect matching problem, where a sequence of $2n$ metric points arrive in pairs, and one has to maintain a perfect matching at all times. We allow recourse (as otherwise the order of arrival determines the matching). The goal is to return a perfect matching that approximates the minimum-weight perfect matching at all times, while minimizing the recourse. Our third result is a randomized algorithm with competitive ratio $O(d\cdot\log\Phi)$ and recourse $O(\log\Phi)$ against an oblivious adversary; this result is obtained via our new stochastic online embedding. Our fourth result is a deterministic algorithm against an adaptive adversary, using $O(\log^2 n)$ recourse, that maintains a matching of weight at most $O(\log n)$ times the weight of the MST, i.e., a matching of lightness $O(\log n)$. We complement our upper bounds with a strategy for an oblivious adversary that, with recourse $r$, establishes a lower bound of $\Omega(\frac{\log n}{r \log r})$ for both competitive ratio and lightness. Comment: 53 pages, 8 figures, to be presented at the ACM-SIAM Symposium on Discrete Algorithms (SODA24)

    Networked Data Analytics: Network Comparison And Applied Graph Signal Processing

    Networked data structures have become large, ubiquitous, and pervasive. As our day-to-day activities become more incorporated with and influenced by the digital world, we rely more on our intuition to provide us with a high-level idea and subconscious understanding of the data we encounter. This thesis aims at translating the qualitative intuitions we have about networked data into quantitative and formal tools by designing rigorous yet reasonable algorithms. In a nutshell, this thesis constructs models to compare and cluster networked data, to simplify a complicated networked structure, and to formalize the notion of smoothness and variation for domain-specific signals on a network. This thesis consists of two interrelated thrusts, which explore both the scenario where networks have intrinsic value and are themselves the object of study, and the scenario where the interest is in signals defined on top of the networks, so that we leverage the information in the network to analyze the signals. Our results suggest that the intuition we have in analyzing huge data can be transformed into rigorous algorithms, and often the intuition results in superior performance, new observations, better complexity, and/or bridging two commonly implemented methods. Even though they differ in the principles they investigate, both thrusts are built on what we think of as a contemporary alteration in data analytics: from building an algorithm and then understanding it, to having an intuition and then building an algorithm around it. We show that, in order to formalize the intuitive idea of measuring the difference between a pair of networks of arbitrary sizes, we can design two algorithms based on the intuition of finding mappings between the node sets or of mapping one network into a subset of another network. Such methods also lead to a clustering algorithm to categorize networked data structures. Besides, we can define the notion of frequencies of a given network by ordering features in the network according to how important they are to the overall information conveyed by the network. These proposed algorithms succeed in comparing collaboration histories of researchers, clustering research communities via their publication patterns, categorizing moving objects from uncertain measurements, and separating networks constructed from different processes. In the context of data analytics on top of networks, we design domain-specific tools by leveraging recent advances in graph signal processing, which formalizes the intuitive notion of smoothness and variation of signals defined on top of networked structures, and generalizes conventional Fourier analysis to the graph domain. Specifically, we show how these tools can be used to better classify cancer subtypes by considering genetic profiles as signals on top of gene-to-gene interaction networks, to gain new insights into the differences between human beings in learning new tasks and switching attention by considering brain activities as signals on top of brain connectivity networks, and to demonstrate how common methods in rating prediction are special graph filters, an observation we then use to design novel recommendation system algorithms.
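
The notion of "smoothness and variation" of a signal on a network, mentioned in the abstract above, is commonly measured in graph signal processing by the Laplacian quadratic form, the sum of $w_{ij}(x_i - x_j)^2$ over edges. The sketch below uses that standard measure as an assumption; the thesis may define variation differently. The function name `laplacian_variation` and the toy graph are hypothetical.

```python
def laplacian_variation(edges, signal):
    """Laplacian quadratic form: sum of w * (x_i - x_j)^2 over weighted
    edges given as (i, j, w) triples. Small values mean a smooth signal."""
    return sum(w * (signal[i] - signal[j]) ** 2 for i, j, w in edges)


# A path graph on four nodes with unit edge weights.
path_edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]

constant_signal = [1.0, 1.0, 1.0, 1.0]       # no change across any edge
alternating_signal = [1.0, -1.0, 1.0, -1.0]  # flips sign across every edge

print(laplacian_variation(path_edges, constant_signal))
print(laplacian_variation(path_edges, alternating_signal))
```

The constant signal has zero variation, while the alternating one changes by 2 across each of the three edges, so it scores much higher. This is the sense in which low and high "frequencies" on a graph are usually distinguished.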

    Bedibe: Datasets and Software Tools for Distributed Bandwidth Prediction

    Being able to predict available bandwidth is a crucial problem for a large number of distributed applications on the Internet. Several solutions have been proposed, but the lack of common implementations and recognized datasets makes it difficult to compare and reproduce results. In this paper, we present bedibe, the combination of bandwidth measurements performed on Planet-Lab and of software that facilitates writing and studying bandwidth prediction algorithms. bedibe includes implementations of the best solutions from the literature, and aims to facilitate the comparison of results obtained by the different teams working on this topic.

    Ramsey-type theorems for metric spaces with applications to online problems

    A nearly logarithmic lower bound on the randomized competitive ratio for the metrical task systems problem is presented. This implies a similar lower bound for the extensively studied k-server problem. The proof is based on Ramsey-type theorems for metric spaces, which state that every metric space contains a large subspace which is approximately a hierarchically well-separated tree (and in particular an ultrametric). These Ramsey-type theorems may be of independent interest. Comment: Fix an error in the metadata. 31 pages, 0 figures. Preliminary version in FOCS '01. To be published in J. Comput. System Sc
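
For readers unfamiliar with the term, an ultrametric satisfies the strong triangle inequality $d(x,z)\le\max\{d(x,y),d(y,z)\}$. The sketch below checks that condition exactly on a small distance matrix. It does not implement the paper's Ramsey-type theorems, which concern subspaces that are only approximately ultrametric. The function name `is_ultrametric` and the example matrices are hypothetical.

```python
from itertools import permutations


def is_ultrametric(d, tol=1e-12):
    """Check the strong triangle inequality d[x][z] <= max(d[x][y], d[y][z])
    for every ordered triple of distinct points in distance matrix `d`."""
    n = len(d)
    return all(
        d[x][z] <= max(d[x][y], d[y][z]) + tol
        for x, y, z in permutations(range(n), 3)
    )


# Two close points and one far point: distances come from a two-level tree.
tree_like = [
    [0, 1, 4],
    [1, 0, 4],
    [4, 4, 0],
]

# Three equally spaced points on a line: a metric, but not an ultrametric.
line = [
    [0, 1, 2],
    [1, 0, 1],
    [2, 1, 0],
]

print(is_ultrametric(tree_like))
print(is_ultrametric(line))
```

The line fails because the two end points are at distance 2 while both are at distance 1 from the middle point.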

    Dynamics of spectral algorithms for distributed routing

    Thesis (Ph. D.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2012. Cataloged from PDF version of thesis. Includes bibliographical references (p. 109-117). In the past few decades distributed systems have evolved from man-made machines to organically changing social, economic and protein networks. This transition has been overwhelming in many ways at once. Dynamic, heterogeneous, irregular topologies have taken the place of static, homogeneous, regular ones. Asynchronous, ad hoc peer-to-peer networks have replaced carefully engineered super-computers, governed by globally synchronized clocks. Modern network scales have demanded distributed data structures in place of traditionally centralized ones. While the core problems of routing remain mostly unchanged, the sweeping changes of the computing environment invoke an altogether new science of algorithmic and analytic techniques. It is these techniques that are the focus of the present work. We address the re-design of routing algorithms in three classical domains: multi-commodity routing, broadcast routing and all-pairs route representation. Beyond their practical value, our results make pleasing contributions to Mathematics and Theoretical Computer Science. We exploit surprising connections to NP-hard approximation, and we introduce new techniques in metric embeddings and spectral graph theory. The distributed computability of "oblivious routes", a core combinatorial property of every graph and a key ingredient in route engineering, opens interesting questions in the natural and experimental sciences as well. Oblivious routes are "universal" communication pathways in networks which are essentially unique. They are magically robust as their quality degrades smoothly and gracefully with changes in topology or blemishes in the computational processes. While we have only recently learned how to find them algorithmically, their power raises the question of whether naturally occurring networks from Biology to Sociology to Economics have their own mechanisms of finding and utilizing these pathways. Our discoveries constitute significant progress towards the design of a self-organizing Internet, whose infrastructure is fueled entirely by its participants on an equal citizen basis. This grand engineering challenge is believed to be a potential technological solution to a long line of pressing social and human rights issues in the digital age. Some prominent examples include non-censorship, fair bandwidth allocation, privacy and ownership of social data, the right to copy information, non-discrimination based on identity, and many others. By Petar Maymounkov. Ph.D.

    Reliable Geometric Spanners
