
    The Metric Nearness Problem

    Metric nearness refers to the problem of optimally restoring metric properties to distance measurements that happen to be nonmetric due to measurement errors or otherwise. Metric data can be important in various settings, for example, in clustering, classification, metric-based indexing, query processing, and graph theoretic approximation algorithms. This paper formulates and solves the metric nearness problem: Given a set of pairwise dissimilarities, find a “nearest” set of distances that satisfy the properties of a metric, principally the triangle inequality. For solving this problem, the paper develops efficient triangle fixing algorithms that are based on an iterative projection method. An intriguing aspect of the metric nearness problem is that a special case turns out to be equivalent to the all pairs shortest paths problem. The paper exploits this equivalence and develops a new algorithm for the latter problem using a primal-dual method. Applications to graph clustering are provided as an illustration. We include experiments that demonstrate the computational superiority of triangle fixing over general purpose convex programming software. Finally, we conclude by suggesting various useful extensions and generalizations to metric nearness.
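
The link to all pairs shortest paths mentioned in the abstract above can be illustrated with a minimal sketch. It assumes the special case in question is the decrease-only variant, in which distances may only be lowered to restore the triangle inequality. It uses plain Floyd-Warshall rather than the paper's primal-dual method, and the function names `decrease_only_nearness` and `is_metric` are hypothetical.

```python
from itertools import product


def decrease_only_nearness(d):
    """Return the shortest-path closure of a symmetric dissimilarity matrix.

    Assumption: entries may only decrease, so the repaired matrix is the
    all-pairs shortest-path matrix of the complete graph weighted by d.
    """
    n = len(d)
    m = [row[:] for row in d]  # copy so the input is left untouched
    # Floyd-Warshall: the intermediate vertex k must be the outermost loop.
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = m[i][k] + m[k][j]
                if via_k < m[i][j]:
                    m[i][j] = via_k
    return m


def is_metric(m, tol=1e-12):
    """Check the triangle inequality for every ordered triple of indices."""
    n = len(m)
    return all(
        m[i][j] <= m[i][k] + m[k][j] + tol
        for i, j, k in product(range(n), repeat=3)
    )


# Illustrative data: the entry 5.0 violates the triangle inequality (5 > 1 + 2).
dissimilarities = [
    [0.0, 1.0, 5.0],
    [1.0, 0.0, 2.0],
    [5.0, 2.0, 0.0],
]
repaired = decrease_only_nearness(dissimilarities)
print(repaired)
```

In this example only the violating entry changes, dropping from 5.0 to 3.0. The general problem, where entries may move in either direction, is what the triangle fixing algorithms address and is not covered by this sketch.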

    Efficient Construction of Probabilistic Tree Embeddings

    In this paper we describe an algorithm that embeds a graph metric $(V,d_G)$ on an undirected weighted graph $G=(V,E)$ into a distribution of tree metrics $(T,D_T)$ such that for every pair $u,v\in V$, $d_G(u,v)\leq d_T(u,v)$ and $\mathbf{E}_{T}[d_T(u,v)]\leq O(\log n)\cdot d_G(u,v)$. Such embeddings have proved highly useful in designing fast approximation algorithms, as many hard problems on graphs are easy to solve on tree instances. For a graph with $n$ vertices and $m$ edges, our algorithm runs in $O(m\log n)$ time with high probability, which improves the previous upper bound of $O(m\log^3 n)$ shown by Mendel et al. in 2009. The key component of our algorithm is a new approximate single-source shortest-path algorithm, which implements the priority queue with a new data structure, the "bucket-tree structure". The algorithm has three properties: it only requires linear time in the number of edges in the input graph; the computed distances have a distance preserving property; and when computing the shortest paths to the $k$-nearest vertices from the source, it only needs to visit these vertices and their edge lists. These properties are essential to guarantee the correctness and the stated time bound. Using this shortest-path algorithm, we show how to generate an intermediate structure, the approximate dominance sequences of the input graph, in $O(m\log n)$ time, and further propose a simple yet efficient algorithm to convert this sequence to a tree embedding in $O(n\log n)$ time, both with high probability. Combining the three subroutines gives the stated time bound of the algorithm. Then we show that this efficient construction can facilitate some applications. We prove that FRT trees (the generated tree embedding) are Ramsey partitions with asymptotically tight bound, so the construction of a series of distance oracles can be accelerated.

    Average Sensitivity of Graph Algorithms

    In modern applications of graph algorithms, where the graphs of interest are large and dynamic, it is unrealistic to assume that an input representation contains the full information of a graph being studied. Hence, it is desirable to use algorithms that, even when only a (large) subgraph is available, output solutions that are close to the solutions output when the whole graph is available. We formalize this idea by introducing the notion of average sensitivity of graph algorithms, which is the average earth mover's distance between the output distributions of an algorithm on a graph and its subgraph obtained by removing an edge, where the average is over the edges removed and the distance between two outputs is the Hamming distance. In this work, we initiate a systematic study of average sensitivity. After deriving basic properties of average sensitivity such as composition, we provide efficient approximation algorithms with low average sensitivities for concrete graph problems, including the minimum spanning forest problem, the global minimum cut problem, the minimum $s$-$t$ cut problem, and the maximum matching problem. In addition, we prove that the average sensitivity of our global minimum cut algorithm is almost optimal, by showing a nearly matching lower bound. We also show that every algorithm for the 2-coloring problem has average sensitivity linear in the number of vertices. One of the main ideas involved in designing our algorithms with low average sensitivity is the following fact: if the presence of a vertex or an edge in the solution output by an algorithm can be decided locally, then the algorithm has a low average sensitivity, allowing us to reuse the analyses of known sublinear-time algorithms and local computation algorithms (LCAs). Using this connection, we show that every LCA for 2-coloring has linear query complexity, thereby answering an open question. Comment: 39 pages, 1 figure
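
The definition of average sensitivity above can be made concrete with a small sketch. It assumes a deterministic algorithm, for which the earth mover's distance between output distributions reduces to the Hamming distance between the two outputs, taken here as the size of the symmetric difference of the edge sets. The exact Kruskal routine below is a stand-in for illustration only and is not one of the paper's low-sensitivity algorithms. The names `kruskal` and `average_sensitivity` are hypothetical.

```python
def kruskal(n, edges):
    """Minimum spanning forest as a set of (weight, u, v) tuples.

    Ties are broken by tuple order, which makes the output deterministic.
    """
    parent = list(range(n))

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    forest = set()
    for w, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            forest.add((w, u, v))
    return forest


def average_sensitivity(n, edges, algorithm):
    """Average, over removed edges e, of |A(G) symmetric-difference A(G - e)|."""
    base = algorithm(n, edges)
    total = 0
    for e in edges:
        remaining = [f for f in edges if f != e]
        total += len(base ^ algorithm(n, remaining))
    return total / len(edges)


# Illustrative input: a 4-cycle with distinct weights, edges as (weight, u, v).
cycle = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0)]
print(average_sensitivity(4, cycle, kruskal))
```

Removing any of the three tree edges forces the heaviest edge into the forest, changing the output in two positions, while removing the heaviest edge changes nothing, so the average over the four edges is 6/4.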

    The Power of Dynamic Distance Oracles: Efficient Dynamic Algorithms for the Steiner Tree

    In this paper we study the Steiner tree problem over a dynamic set of terminals. We consider the model where we are given an $n$-vertex graph $G=(V,E,w)$ with positive real edge weights, and our goal is to maintain a tree which is a good approximation of the minimum Steiner tree spanning a terminal set $S \subseteq V$, which changes over time. The changes applied to the terminal set are either terminal additions (incremental scenario), terminal removals (decremental scenario), or both (fully dynamic scenario). Our task here is twofold. We want to support updates in sublinear $o(n)$ time, and keep the approximation factor of the algorithm as small as possible. We show that we can maintain a $(6+\varepsilon)$-approximate Steiner tree of a general graph in $\tilde{O}(\sqrt{n} \log D)$ time per terminal addition or removal. Here, $D$ denotes the stretch of the metric induced by $G$. For planar graphs we achieve the same running time and the approximation ratio of $(2+\varepsilon)$. Moreover, we show faster algorithms for incremental and decremental scenarios. Finally, we show that if we allow a higher approximation ratio, even more efficient algorithms are possible. In particular we show a polylogarithmic time $(4+\varepsilon)$-approximate algorithm for planar graphs. One of the main building blocks of our algorithms is dynamic distance oracles for vertex-labeled graphs, which are of independent interest. We also improve and use the online algorithms for the Steiner tree problem. Comment: Full version of the paper accepted to STOC'1

    Analyzing the Effect of Objective Correlation on the Efficient Set of MNK-Landscapes

    In multiobjective combinatorial optimization, there exist two main classes of metaheuristics, based either on multiple aggregations or on a dominance relation. As in the single objective case, the structure of the search space can explain the difficulty for multiobjective metaheuristics, and guide the design of such methods. In this work we analyze the properties of multiobjective combinatorial search spaces. In particular, we focus on the features related to the efficient set, and we pay particular attention to the correlation between objectives. Few benchmarks take such objective correlation into account. Here, we define a general method to design multiobjective problems with correlation. As an example, we extend the well-known multiobjective NK-landscapes. By measuring different properties of the search space, we show the importance of considering the objective correlation in the design of metaheuristics. Comment: Learning and Intelligent OptimizatioN Conference (LION 5), Rome, Italy (2011)

    Line-distortion, Bandwidth and Path-length of a graph

    We investigate the minimum line-distortion and the minimum bandwidth problems on unweighted graphs and their relations with the minimum length of a Robertson-Seymour's path-decomposition. The length of a path-decomposition of a graph is the largest diameter of a bag in the decomposition. The path-length of a graph is the minimum length over all its path-decompositions. In particular, we show: - if a graph $G$ can be embedded into the line with distortion $k$, then $G$ admits a Robertson-Seymour's path-decomposition with bags of diameter at most $k$ in $G$; - for every class of graphs with path-length bounded by a constant, there exist an efficient constant-factor approximation algorithm for the minimum line-distortion problem and an efficient constant-factor approximation algorithm for the minimum bandwidth problem; - there is an efficient 2-approximation algorithm for computing the path-length of an arbitrary graph; - AT-free graphs and some intersection families of graphs have path-length at most 2; - for AT-free graphs, there exist a linear time 8-approximation algorithm for the minimum line-distortion problem and a linear time 4-approximation algorithm for the minimum bandwidth problem.
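
The definition of the length of a path-decomposition given above can be checked on a small example. The sketch below assumes a connected unweighted graph stored as an adjacency dictionary and a decomposition supplied as a list of bags. It measures distances in the whole graph rather than inside each bag, and it does not verify that the bags form a valid path-decomposition. The names `bfs_distances` and `decomposition_length` are hypothetical, and this is not the paper's 2-approximation algorithm.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def decomposition_length(adj, bags):
    """Largest distance in the graph between two vertices sharing a bag."""
    dist = {u: bfs_distances(adj, u) for u in adj}
    return max(dist[u][v] for bag in bags for u in bag for v in bag)


# Illustrative input: the 4-cycle 0-1-2-3-0 with a two-bag decomposition.
cycle_adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
cycle_bags = [{0, 1, 3}, {1, 2, 3}]
print(decomposition_length(cycle_adj, cycle_bags))
```

Both bags contain the non-adjacent pair 1 and 3, which are two hops apart, so this decomposition has length 2.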

    NetLSD: Hearing the Shape of a Graph

    Comparison among graphs is ubiquitous in graph analytics. However, it is a hard task in terms of the expressiveness of the employed similarity measure and the efficiency of its computation. Ideally, graph comparison should be invariant to the order of nodes and the sizes of compared graphs, adaptive to the scale of graph patterns, and scalable. Unfortunately, these properties have not been addressed together. Graph comparisons still rely on direct approaches, graph kernels, or representation-based methods, which are all inefficient and impractical for large graph collections. In this paper, we propose the Network Laplacian Spectral Descriptor (NetLSD): the first, to our knowledge, permutation- and size-invariant, scale-adaptive, and efficiently computable graph representation method that allows for straightforward comparisons of large graphs. NetLSD extracts a compact signature that inherits the formal properties of the Laplacian spectrum, specifically its heat or wave kernel; thus, it hears the shape of a graph. Our evaluation on a variety of real-world graphs demonstrates that it outperforms previous works in both expressiveness and efficiency. Comment: KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, August 19--23, 2018, London, United Kingdom