8 research outputs found

    0-1 Integer Linear Programming with a Linear Number of Constraints

    We give an exact algorithm for the 0-1 Integer Linear Programming problem with a linear number of constraints that improves over exhaustive search by an exponential factor. Specifically, our algorithm runs in time $2^{(1-\text{poly}(1/c))n}$, where $n$ is the number of variables and $cn$ is the number of constraints. The key idea for the algorithm is a reduction to the Vector Domination problem and a new algorithm for that subproblem.
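For orientation, the exhaustive-search baseline that the abstract improves on can be sketched as follows. This is only an illustrative sketch, not the paper's algorithm: it assumes the feasibility form "find x in {0,1}^n with A x <= b", and the function name `brute_force_01_ilp` is hypothetical.

```python
from itertools import product


def brute_force_01_ilp(A, b):
    """Return a 0-1 vector x with A x <= b (row-wise), or None if none exists.

    Assumed problem form (not taken from the abstract): pure feasibility.
    All 2**n assignments are tried, so this is a baseline for tiny n only.
    """
    n = len(A[0]) if A else 0
    for x in product((0, 1), repeat=n):
        if all(sum(a * xi for a, xi in zip(row, x)) <= bi
               for row, bi in zip(A, b)):
            return list(x)
    return None


# Example: x1 + x2 >= 1 (written as -x1 - x2 <= -1) together with x1 + x2 <= 1.
A = [[-1, -1], [1, 1]]
b = [-1, 1]
solution = brute_force_01_ilp(A, b)
```

The loop visits up to 2^n assignments and checks all constraints for each one, which is the running time the abstract's exponential-factor improvement is measured against.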

    Decoding Hidden Markov Models Faster Than Viterbi Via Online Matrix-Vector (max, +)-Multiplication

    In this paper, we present a novel algorithm for the maximum a posteriori decoding (MAPD) of time-homogeneous Hidden Markov Models (HMM), improving the worst-case running time of the classical Viterbi algorithm by a logarithmic factor. In our approach, we interpret the Viterbi algorithm as a repeated computation of matrix-vector $(\max, +)$-multiplications. On time-homogeneous HMMs, this computation is online: a matrix, known in advance, has to be multiplied with several vectors revealed one at a time. Our main contribution is an algorithm solving this version of matrix-vector $(\max,+)$-multiplication in subquadratic time, by performing a polynomial preprocessing of the matrix. Employing this fast multiplication algorithm, we solve the MAPD problem in $O(mn^2/ \log n)$ time for any time-homogeneous HMM of size $n$ and observation sequence of length $m$, with an extra polynomial preprocessing cost negligible for $m > n$. To the best of our knowledge, this is the first algorithm for the MAPD problem requiring subquadratic time per observation, under the only assumption -- usually verified in practice -- that the transition probability matrix does not change with time. Comment: AAAI 2016, to appear
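The interpretation of Viterbi as repeated (max, +) matrix-vector products can be illustrated with the classical quadratic-per-observation version. The sketch below works in log-space and uses the naive product, not the subquadratic algorithm from the paper; all names (`max_plus_matvec`, `viterbi_log_score`, the parameter layout) are assumptions made for illustration.

```python
import math


def max_plus_matvec(M, v):
    """(max, +) product: result[j] = max_i (M[j][i] + v[i])."""
    return [max(m_ji + v_i for m_ji, v_i in zip(row, v)) for row in M]


def viterbi_log_score(log_init, log_trans, log_emit, observations):
    """Log-probability of the most likely hidden state path.

    log_trans[i][j]: log-probability of moving from state i to state j.
    log_emit[j][o]:  log-probability of state j emitting symbol o.
    """
    n = len(log_init)
    # Transpose once: the matrix is fixed because the HMM is time-homogeneous.
    trans_t = [[log_trans[i][j] for i in range(n)] for j in range(n)]
    score = [log_init[j] + log_emit[j][observations[0]] for j in range(n)]
    for obs in observations[1:]:
        step = max_plus_matvec(trans_t, score)  # naive O(n^2) per observation
        score = [step[j] + log_emit[j][obs] for j in range(n)]
    return max(score)


# Hypothetical two-state model used only to exercise the function.
log_init = [math.log(0.6), math.log(0.4)]
log_trans = [[math.log(0.7), math.log(0.3)], [math.log(0.4), math.log(0.6)]]
log_emit = [[math.log(0.5), math.log(0.5)], [math.log(0.1), math.log(0.9)]]
best_log_score = viterbi_log_score(log_init, log_trans, log_emit, [0, 1, 1])
```

Because `trans_t` never changes between observations, it is the only place where preprocessing could pay off, which is the online setting the abstract describes.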

    Faster all-pairs shortest paths via circuit complexity

    We present a new randomized method for computing the min-plus product (a.k.a., tropical product) of two $n \times n$ matrices, yielding a faster algorithm for solving the all-pairs shortest path problem (APSP) in dense $n$-node directed graphs with arbitrary edge weights. On the real RAM, where additions and comparisons of reals are unit cost (but all other operations have typical logarithmic cost), the algorithm runs in time $\frac{n^3}{2^{\Omega(\log n)^{1/2}}}$ and is correct with high probability. On the word RAM, the algorithm runs in $n^3/2^{\Omega(\log n)^{1/2}} + n^{2+o(1)}\log M$ time for edge weights in $([0,M] \cap {\mathbb Z})\cup\{\infty\}$. Prior algorithms used either $n^3/(\log^c n)$ time for various $c \leq 2$, or $O(M^{\alpha}n^{\beta})$ time for various $\alpha > 0$ and $\beta > 2$. The new algorithm applies a tool from circuit complexity, namely the Razborov-Smolensky polynomials for approximately representing ${\sf AC}^0[p]$ circuits, to efficiently reduce a matrix product over the $(\min,+)$ algebra to a relatively small number of rectangular matrix products over ${\mathbb F}_2$, each of which is computable using a particularly efficient method due to Coppersmith. We also give a deterministic version of the algorithm running in $n^3/2^{\log^{\delta} n}$ time for some $\delta > 0$, which utilizes the Yao-Beigel-Tarui translation of ${\sf AC}^0[m]$ circuits into "nice" depth-two circuits. Comment: 24 pages. Updated version now has slightly faster running time. To appear in ACM Symposium on Theory of Computing (STOC), 201
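To make the connection between the min-plus product and APSP concrete, here is a minimal sketch of the textbook cubic-time product and repeated squaring. It is not the algorithm of the paper (no polynomial method, no fast rectangular multiplication); the function names and the matrix convention are assumptions for illustration.

```python
INF = float("inf")


def min_plus_product(A, B):
    """Naive cubic-time (min, +) product: C[i][j] = min_k (A[i][k] + B[k][j])."""
    n = len(A)
    return [[min(A[i][k] + B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def apsp_by_squaring(W):
    """All-pairs shortest path distances from the weight matrix W.

    W[i][j] is the weight of edge i -> j (INF if absent) and W[i][i] must be 0.
    Assumes the graph has no negative cycles.
    """
    n = len(W)
    D = [row[:] for row in W]
    length = 1  # D currently covers all paths with at most `length` edges
    while length < n - 1:
        D = min_plus_product(D, D)
        length *= 2
    return D


# Small directed example with one negative edge and no negative cycle.
W = [[0, 3, INF],
     [INF, 0, -2],
     [1, INF, 0]]
distances = apsp_by_squaring(W)
```

Repeated squaring needs about log n products, so any speedup for a single min-plus product carries over to APSP up to that logarithmic factor in this simple scheme.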

    Data structures and dynamic algorithms for planar graphs (Struktury danych i algorytmy dynamiczne dla grafów planarnych)

    Obtaining provably efficient algorithms for the most basic graph problems like finding (shortest) paths or computing maximum matchings, fast enough to handle real-world-scale graphs (i.e., consisting of millions of vertices and edges), is a very challenging task. For example, in a very general regime of strongly-polynomial algorithms (see, e.g., [65]), we still do not know how to compute shortest paths in a real-weighted sparse directed graph significantly faster than in quadratic time, using the classical, but somewhat simple-minded, Bellman-Ford method. One way to circumvent this problem is to consider more restricted computation models for graph algorithms. If, for example, we restrict ourselves to graphs with integral edge weights, we can improve upon the Bellman-Ford algorithm [14, 31]. Although these results are very deep algorithmically, their theoretical efficiency is still very far from the only known trivial linear lower bound on the actual time complexity of the negatively-weighted shortest path problem. Another approach is to develop algorithms specialized for certain graph classes that appear in practice. Planar graphs constitute one of the most important and well-studied such classes. Many of the real-world networks can be drawn on a plane with no or few edge crossings. The examples include not very complex road networks and graphs considered in the domain of VLSI design. Complex road networks, although far from being planar, share with planar graphs some useful properties, like the existence of small separators [20]. Special cases of planar graphs, such as grids, appear often in the area of image processing (e.g., [7]). And indeed, if we restrict ourselves to planar graphs, many of the classical polynomial-time graph problems, in particular computing shortest paths [35, 58] and maximum flows [4, 5, 21] in real-weighted graphs, can be solved either optimally or in nearly-linear time. 
The very rich combinatorial structure of planar graphs often allows breaking barriers that appear in the respective problems for general graphs by using techniques from computational geometry (e.g., [27]), or by applying sophisticated data structures, such as dynamic trees [4, 10, 21, 66]. In this thesis, we focus on the data-structural aspect of planar graph algorithmics. By this, we mean that rather than concentrating on particular planar graph problems, we study more abstract, “low-level” problems. Efficient algorithms for these problems can be used in a black-box manner to design algorithms for multiple specific problems at once. Such an approach allows us to improve upon many known complexity upper bounds for different planar graph problems simultaneously, without going into the specifics of these problems. We also study dynamic algorithms for planar graphs, i.e., algorithms that maintain certain information about a dynamically changing graph (such as “is the graph connected?”) much more efficiently than by recomputing this information from scratch after each update. We consider the edge-update model where the input graph can be modified only by adding or removing single edges. A graph algorithm is called fully-dynamic if it supports both edge insertions and edge deletions, and partially dynamic if it supports either only edge insertions (then we call it incremental) or only edge deletions (then it is called decremental). When designing dynamic graph algorithms, we care about the update time, i.e., the time needed by the algorithm to adapt to an elementary change of the graph, and query time, i.e., the time needed by the algorithm to recompute the requested portion of the maintained information. Sometimes, especially in partially dynamic settings, it is more convenient to measure the total update time, i.e., the total time needed by the algorithm to process any possible sequence of updates.
For some dynamic problems, it is worth focusing on a more restricted explicit maintenance model where the entire maintained information is explicitly updated (so that the user is notified about the update) after each change. In this model the query procedure is trivial and thus we only care about the update time. Note that there is actually no clear distinction between dynamic graph algorithms and graph data structures, since dynamic algorithms are often used as black-boxes to obtain efficient static algorithms (e.g., [26]). For example, the incremental connectivity problem, where one needs to process queries about the existence of a path between given vertices, while the input undirected graph undergoes edge insertions, is actually equivalent to the disjoint-set data structure problem, also called the union-find data structure problem (see, e.g., [15]). We concentrate mostly on the decremental model and obtain very efficient decremental algorithms for problems on unweighted planar graphs related to reachability and connectivity. We also apply our dynamic algorithms to static problems, thus confirming once again the data-structural character of these results. In the following, let G = (V, E) denote the input planar graph with n vertices. For clarity of this summary, assume G is a simple graph. Then, by planarity, it has O(n) edges. When we talk about general graphs, we denote by m the number of edges of the graph.
2 Contracting a Planar Graph
The first part of the thesis is devoted to the data-structural aspect of contracting edges in planar graphs. Edge contraction is one of the fundamental graph operations. Given an undirected graph and its edge e, contracting the edge e consists in removing it from the graph and merging its endpoints. The notion of contraction has been used to describe a number of prominent graph algorithms, including Edmonds’ algorithm for computing maximum matchings [19], or Karger’s minimum cut algorithm [44].
Edge contractions are of particular interest in planar graphs, as a number of planar graph properties can be described using contractions. For example, it is well-known that a graph is planar precisely when it cannot be transformed into K5 or K3,3 by contracting edges, or removing vertices or edges (see e.g., [17]). Moreover, contracting an edge preserves planarity. We would like to have at our disposal a data structure that performs contractions on the input planar graph and still provides access to the most basic information about our graph, such as the sizes of neighbor sets of individual vertices and the adjacency relation. While the contraction operation is conceptually very simple, its efficient implementation is challenging. This is because it is not clear how to represent individual vertices’ adjacency lists so that adjacency list merges, adjacency queries, and neighborhood size queries are all efficient. By using standard data structures (e.g., balanced binary search trees), one can maintain adjacency lists of a graph subject to contractions in polylogarithmic amortized time. However, in many planar graph algorithms this becomes a bottleneck. As an example, consider the problem of computing a 5-coloring of a planar graph. There exists a very simple algorithm based on contractions [53] that only relies on a folklore fact that a planar graph has a vertex of degree no more than 5. However, linear-time algorithms solving this problem use some more involved planar graph properties [23, 53, 60]. For example, the algorithm by Matula et al. [53] uses the fact that every planar graph has either a vertex of degree at most 4 or a vertex of degree 5 adjacent to at least four vertices, each having degree at most 11. Similarly, although there exists a very simple algorithm for computing a minimum spanning tree of a planar graph based on edge contractions, various different methods have been used to implement it efficiently [23, 51, 52].
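The "standard data structures" baseline mentioned above can be sketched roughly as follows. This is a hedged illustration, not the thesis's linear-time structure: it swaps in Python hash sets for balanced binary search trees, merges the smaller neighbor set into the larger one, and makes no claim about the exact amortized bound. The class and method names are hypothetical.

```python
class ContractibleGraph:
    """Simple undirected graph under edge contractions (baseline sketch).

    Adjacency is stored as hash sets. Contracting an edge moves the smaller
    neighbor set into the larger one. Parallel edges collapse and self-loops
    are dropped, so the graph stays simple.
    """

    def __init__(self, edges):
        self.adj = {}
        for u, v in edges:
            self.adj.setdefault(u, set()).add(v)
            self.adj.setdefault(v, set()).add(u)

    def adjacent(self, u, v):
        return v in self.adj.get(u, ())

    def degree(self, u):
        return len(self.adj[u])

    def contract(self, u, v):
        """Contract edge {u, v}; return the name of the surviving vertex."""
        if not self.adjacent(u, v):
            raise ValueError("no such edge")
        # Keep the endpoint with the larger neighborhood.
        keep, gone = (u, v) if len(self.adj[u]) >= len(self.adj[v]) else (v, u)
        for w in self.adj.pop(gone):
            self.adj[w].discard(gone)
            if w != keep:  # the contracted edge itself would become a self-loop
                self.adj[w].add(keep)
                self.adj[keep].add(w)
        return keep


# Triangle 1-2-3 with a pendant vertex 4 attached to 3.
g = ContractibleGraph([(1, 2), (2, 3), (1, 3), (3, 4)])
survivor = g.contract(1, 2)
```

After the contraction, the two edges from the merged vertex to vertex 3 collapse into one, which is exactly the kind of parallel-edge event the thesis's data structure is designed to report.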
The problem of maintaining a planar graph under contractions has been studied before. In their book, Klein and Mozes [46] showed that there exists a slightly more general data structure maintaining a planar graph under edge contractions and deletions, and answering adjacency queries in O(1) worst-case time. The update time is O(log n). This result is based on the work of Brodal and Fagerberg [8], who showed how to maintain a bounded-outdegree orientation of a dynamic planar graph so that the edge set updates are supported in O(log n) amortized time. Gustedt [32] showed an optimal solution to the union-find problem in the case when at any time the actual subsets form disjoint and connected subgraphs of a given planar graph G. In other words, in this problem the allowed unions correspond to the edges of a planar graph and the execution of a union operation can be seen as a contraction of the respective edge.
Our Results
We show a data structure that can efficiently maintain a planar graph subject to edge contractions in linear total time, assuming the standard word-RAM model with word size Ω(log n). It can report groups of parallel edges and self-loops that emerge. It also supports constant-time adjacency queries and maintains the neighbor lists and degrees explicitly. The data structure can be used as a black-box to implement planar graph algorithms that use contractions. As an example, our data structure can be used to give clean and conceptually simple linear-time implementations of algorithms for computing 5-coloring or minimum spanning tree. More importantly, by using our data structure, we give improved algorithms for a few problems in planar graphs. In particular, we obtain optimal algorithms for decremental 2-edge-connectivity (see, e.g., [30]), finding a unique perfect matching [26], and computing maximal 3-edge-connected subgraphs [12].
In order to obtain our result, we first partition the graph into small pieces of roughly logarithmic size (using so-called r-divisions [24]). Then we solve our problem recursively for each of the pieces, and separately using a simple-minded approach for the subgraph induced by o(n) vertices contained in multiple pieces (the so-called boundary vertices). Such an approach proved successful in obtaining optimal data structures for the planar union-find problem [32] and decremental connectivity [50]. In fact, our data-structural problem can be seen as a generalization of the former problem. However, maintaining the status of each edge e of the initial graph G (i.e., whether e has become a self-loop or a parallel edge) subject to edge contractions, and supporting constant-time adjacency queries without resorting to randomization, turn out to be serious technical challenges. Overcoming these difficulties is the main contribution of this part of the thesis.
3 Decremental Reachability
The second part of this thesis is devoted to dynamic reachability problems in planar graphs. In the dynamic reachability problem we are given a (directed) graph G subject to edge updates and the goal is to design a data structure that would allow answering queries about the existence of a path between a pair of query vertices u, v ∈ V. Two variants of dynamic reachability are studied most often. In the all-pairs variant, our data structure has to support queries between arbitrary pairs of vertices. This variant is also called the dynamic transitive closure problem, since a path u → v exists in G if uv is an edge of the transitive closure of G. In the single-source reachability problem, a source vertex s ∈ V is fixed from the very beginning and the only allowed queries are about the existence of a path s → v, where v ∈ V. If we work with undirected graphs, the dynamic reachability problem is called the dynamic connectivity problem.
Note that in the undirected case a path u → v exists in G if and only if a path v → u exists in G.
State of the Art
Dynamic reachability in general directed graphs turns out to be a very challenging problem. First of all, it is computationally much more demanding than its undirected counterpart. For undirected graphs, fully-dynamic all-pairs algorithms with polylogarithmic amortized update and query bounds are known [36, 38, 71]. For directed graphs, on the other hand, in most settings (either single-source or all-pairs, either incremental, decremental or fully-dynamic) the best known algorithm has either polynomial update time or polynomial query time. The only exception is the incremental single-source reachability problem, for which a trivial extension of depth-first search [68] achieves O(1) amortized update time. One of the possible reasons behind such a big gap between the undirected and directed settings is that one needs only linear time to compute the connected components of an undirected graph, and thus there exists an O(n)-space static data structure that can answer connectivity queries in undirected graphs in O(1) time. On the other hand, the best known algorithm for computing the transitive closure runs in $\tilde{O}(\min(n^{\omega}, nm)) = \tilde{O}(n^2)$ time (footnote 1) [11, 59]. So far, the best known bounds for fully-dynamic reachability are as follows. For dynamic transitive closure, there exist a number of algorithms with $O(n^2)$ update time and O(1) query time [16, 61, 64]. These algorithms, in fact, maintain the transitive closure explicitly. There also exist a few fully-dynamic algorithms that are better for sparse graphs, each of which has Ω(n) amortized update time and query time which is o(n) but still polynomial in n [62, 63, 64]. For the single-source variant, the only known non-trivial (i.e., other than recompute-from-scratch) algorithm has $O(n^{1.53})$ update time and O(1) query time [64].
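One plausible reading of the "trivial extension of depth-first search" for incremental single-source reachability is sketched below. It is an assumption-laden illustration, not necessarily the exact procedure of [68]; the class name and interface are hypothetical.

```python
class IncrementalReachability:
    """Vertices reachable from a fixed source, maintained under edge insertions."""

    def __init__(self, source):
        self.source = source
        self.out = {}            # adjacency lists of the current graph
        self.reached = {source}  # vertices known to be reachable from source

    def insert_edge(self, u, v):
        self.out.setdefault(u, []).append(v)
        # Only an edge leaving the reached set into a new vertex changes anything.
        if u in self.reached and v not in self.reached:
            self._explore(v)

    def _explore(self, start):
        # Iterative DFS restricted to vertices that were not reached before.
        self.reached.add(start)
        stack = [start]
        while stack:
            x = stack.pop()
            for y in self.out.get(x, ()):
                if y not in self.reached:
                    self.reached.add(y)
                    stack.append(y)

    def is_reachable(self, v):
        return v in self.reached


r = IncrementalReachability("s")
r.insert_edge("a", "b")   # not yet connected to the source
before = r.is_reachable("b")
r.insert_edge("s", "a")   # now s -> a -> b
after = r.is_reachable("b")
```

Each vertex is explored at most once over the whole insertion sequence, so its outgoing edges are scanned once during exploration; together with the constant work per insertion this gives linear total time, in line with the O(1) amortized bound quoted above.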
Algorithms with O(nm) total update time are known for both incremental [39] and decremental [48, 62] transitive closure. Note that for sparse graphs this bound is only poly-logarithmic factors away from the best known static transitive closure upper bound [11]. All the known partially-dynamic single-source reachability algorithms work in the explicit maintenance model. As mentioned before, for incremental single-source reachability, an optimal (in the amortized sense) algorithm is known. Interestingly, the first algorithms with $O(mn^{1-\epsilon})$ total update time (where $\epsilon > 0$) have been obtained only recently [33, 34]. The best known algorithm to date has $\tilde{O}(m\sqrt{n})$ total update time and is due to Chechik et al. [13]. Dynamic reachability has also been previously studied for planar graphs. Diks and Sankowski [18] showed a fully-dynamic transitive closure algorithm with $\tilde{O}(\sqrt{n})$ update and query times, which works under the assumption that the graph is plane embedded and the inserted edges can only connect vertices sharing some adjacent face. Łącki [48] showed that one can maintain the strongly connected components of a planar graph under edge deletions in $O(n\sqrt{n})$ total time. By known reductions, it follows that there exists a decremental single-source reachability algorithm for planar graphs with $O(n\sqrt{n})$ total update time. Note that this bound matches the recent best known bound for general graphs [13] up to polylogarithmic factors.
Footnote 1: We denote by $\tilde{O}(f(n))$ the order $O(f(n)\,\mathrm{polylog}\,n)$.

    Systems and Algorithms for Dynamic Graph Processing

    Data generated from human and system interactions can be naturally represented as graph data. Several emerging applications rely on graph data, such as the semantic web, social networks, bioinformatics, finance, and trading, among others. These applications require graph querying capabilities, which are often implemented in graph database management systems (GDBMS). Many GDBMSs have capabilities to evaluate one-time versions of recursive or subgraph queries over static graphs, that is, graphs that do not change or a single snapshot of a changing graph. They generally do not support incrementally maintaining queries as graphs change. However, most applications that employ graphs are dynamic in nature, resulting in graphs that change over time, also known as dynamic graphs. This thesis investigates how to build a generic and scalable incremental computation solution that is oblivious to graph workloads. It focuses on two fundamental computations performed by many applications: recursive queries and subgraph queries. Specifically, for subgraph queries, this thesis presents the first approach that (i) performs joins with worst-case optimal computation and communication costs; and (ii) maintains a total memory footprint almost linear in the number of input edges. For recursive queries, this thesis studies optimizations for using differential computation (DC). DC is a general incremental computation technique that can maintain the output of a recursive dataflow computation upon changes. However, it requires a prohibitively large amount of memory because it maintains differences that track changes in query inputs and outputs. The thesis proposes a suite of optimizations that are based on reducing the number of these differences and recomputing them when necessary. The techniques and optimizations in this thesis, for subgraph and recursive computations, represent a proposal for how to build a state-of-the-art generic and scalable GDBMS for dynamic graph data management.

    Graph-based Analysis of Dynamic Systems

    The analysis of dynamic systems provides insights into their time-dependent characteristics. This enables us to monitor, evaluate, and improve systems from various areas. They are often represented as graphs that model the system's components and their relations. The analysis of the resulting dynamic graphs yields great insights into the system's underlying structure, its characteristics, as well as properties of single components. The interpretation of these results can help us understand how a system works and how parameters influence its performance. This knowledge supports the design of new systems and the improvement of existing ones. The main issue in this scenario is the performance of analyzing the dynamic graph to obtain relevant properties. While various approaches have been developed to analyze dynamic graphs, it is not always clear which one performs best for the analysis of a specific graph. The runtime also depends on many other factors, including the size and topology of the graph, the frequency of changes, and the data structures used to represent the graph in memory. While the benefits and drawbacks of many data structures are well-known, their runtime is hard to predict when used for the representation of dynamic graphs. Hence, tools are required to benchmark and compare different algorithms for the computation of graph properties and data structures for the representation of dynamic graphs in memory. Based on deeper insights into their performance, new algorithms can be developed and efficient data structures can be selected. In this thesis, we present four contributions to tackle these problems: A benchmarking framework for dynamic graph analysis, novel algorithms for the efficient analysis of dynamic graphs, an approach for the parallelization of dynamic graph analysis, and a novel paradigm to select and adapt graph data structures. 
In addition, we present three use cases from the areas of social, computer, and biological networks to illustrate the great insights provided by their graph-based analysis. We present a new benchmarking framework for the analysis of dynamic graphs, the Dynamic Network Analyzer (DNA). It provides tools to benchmark and compare different algorithms for the analysis of dynamic graphs as well as the data structures used to represent them in memory. DNA supports the development of new algorithms and the automatic verification of their results. Its visualization component provides different ways to represent dynamic graphs and the results of their analysis. We introduce three new stream-based algorithms for the analysis of dynamic graphs. We evaluate their performance on synthetic as well as real-world dynamic graphs and compare their runtimes to snapshot-based algorithms. Our results show great performance gains for all three algorithms. The new stream-based algorithm StreaM_k, which counts the frequencies of k-vertex motifs, achieves speedups of up to 19,043x for synthetic and 2,882x for real-world datasets. We present a novel approach for the distributed processing of dynamic graphs, called parallel Dynamic Graph Analysis (pDNA). To analyze a dynamic graph, the work is distributed by a partitioner that creates subgraphs and assigns them to workers. They compute the properties of their respective subgraph using standard algorithms. The collator component then merges their results into the properties of the original graph. We evaluate the performance of pDNA for the computation of five graph properties on two real-world dynamic graphs with up to 32 workers. Our approach achieves great speedups, especially for the analysis of complex graph measures. We introduce two novel approaches for the selection of efficient graph data structures.
The compile-time approach estimates the workload of an analysis after an initial profiling phase and recommends efficient data structures based on benchmarking results. It achieves speedups of up to 5.4x over baseline data structure configurations for the analysis of real-world dynamic graphs. The run-time approach monitors the workload during analysis and exchanges the graph representation if it finds a configuration that promises to be more efficient for the current workload. Compared to baseline configurations, it achieves speedups of up to 7.3x for the analysis of a synthetic workload. Our contributions provide novel approaches for the efficient analysis of dynamic graphs and tools to further investigate the trade-offs between different factors that influence the performance.
Contents: 1 Introduction; 2 Notation and Terminology; 3 Related Work; 4 DNA - Dynamic Network Analyzer; 5 Algorithms; 6 Parallel Dynamic Network Analysis; 7 Selection of Efficient Graph Data Structures; 8 Use Cases; 9 Conclusion; A DNA - Dynamic Network Analyzer; B Algorithms; C Selection of Efficient Graph Data Structures; D Parallel Dynamic Network Analysis; E Graph-based Intrusion Detection System; F Molecular Dynamic
