    A general framework for coloring problems: old results, new results, and open problems

    In this survey paper we present a general framework for coloring problems that was introduced in a joint paper which the author presented at WG2003. We show how a number of different types of coloring problems, most of which have been motivated by frequency assignment, fit into this framework. We give a survey of the existing results, mainly based on and strongly biased by joint work of the author with several different groups of coauthors, include some new results, and discuss several open problems for each of the variants.

    Efficient Frequent Subtree Mining Beyond Forests

    A common paradigm in distance-based learning is to embed the instance space into some appropriately chosen feature space equipped with a metric and to define the dissimilarity between instances by the distance of their images in the feature space. If the instances are graphs, then frequent connected subgraphs are a well-suited pattern language to define such feature spaces. Identifying the set of frequent connected subgraphs and subsequently computing embeddings for graph instances, however, is computationally intractable. As a result, existing frequent subgraph mining algorithms either restrict the structural complexity of the instance graphs or require exponential delay between the output of subsequent patterns. Hence distance-based learners lack an efficient way to operate on arbitrary graph data. To resolve this problem, in this thesis we present a mining system that gives up the demand for completeness of the pattern set and instead guarantees a polynomial delay between subsequent patterns. Complementing this, we devise efficient methods to compute the embedding of arbitrary graphs into the Hamming space spanned by our pattern set. As a result, we present a system that allows distance-based learning methods to be applied efficiently to arbitrary graph databases. To overcome the computational intractability of the mining step, we consider only frequent subtrees for arbitrary graph databases. This restriction alone, however, does not suffice to make the problem tractable. We reduce the mining problem from arbitrary graphs to forests by replacing each graph by a polynomially sized forest obtained from a random sample of its spanning trees. This results in an incomplete mining algorithm. However, we prove that the probability of missing a frequent subtree pattern is low. We show empirically that this is true in practice even for very small forests. As a result, our algorithm is able to mine frequent subtrees in a range of graph databases where state-of-the-art exact frequent subgraph mining systems fail to produce patterns in reasonable time or even at all. Furthermore, the predictive performance of our patterns is comparable to that of exact frequent connected subgraphs, where available. The above method considers polynomially many spanning trees for the forest, while many graphs have exponentially many spanning trees. The number of patterns found by our mining algorithm can be negatively influenced by this exponential gap. We hence propose a method that can (implicitly) consider forests of exponential size, while remaining computationally tractable. This results in a higher recall for our incomplete mining algorithm. Furthermore, the methods extend the known positive results on the tractability of exact frequent subtree mining to a novel class of transaction graphs. We conjecture that the next natural extension of our results to a larger transaction graph class is at least as difficult as deciding whether P = NP. Regarding the graph embedding step, we apply a strategy similar to that of the mining step. We represent a novel graph by a forest of its spanning trees and decide whether the frequent trees from the mining step are subgraph isomorphic to this forest. As a result, the embedding computation has one-sided error with respect to the exact subgraph isomorphism test but is computationally tractable. Furthermore, we show that we can leverage a partial order on the pattern set. This structure can be used to reduce the runtime of the embedding computation dramatically. For the special case of Jaccard-similarity between graph embeddings, a further substantial reduction of runtime can be achieved using min-hashing. The Jaccard-distance can be approximated using small sketch vectors that can be computed quickly, again using the partial order on the tree patterns.
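The min-hashing idea mentioned at the end of this abstract can be illustrated with a minimal sketch. It assumes each graph embedding is given as a set of integer pattern identifiers (the indices of the frequent trees contained in the graph). The function names `minhash_sketch` and `estimate_jaccard`, the hash family, and all parameters are hypothetical choices for illustration. This is plain min-hashing over an explicitly computed embedding, so it does not reproduce the thesis's speed-up that exploits the partial order on the tree patterns.

```python
import random

PRIME = 2_147_483_647  # Mersenne prime 2^31 - 1; pattern ids must be smaller


def minhash_sketch(pattern_ids, num_hashes=64, seed=0):
    """Return a min-hash sketch of a set of non-negative integer pattern ids.

    Assumption: the same seed and num_hashes are used for every graph,
    so that sketches of different graphs are comparable.
    """
    rng = random.Random(seed)
    # One (a, b) pair per hash function h(x) = (a * x + b) mod PRIME.
    coeffs = [(rng.randrange(1, PRIME), rng.randrange(0, PRIME))
              for _ in range(num_hashes)]
    if not pattern_ids:
        # Sentinel for an empty embedding: no real hash value equals PRIME.
        return [PRIME] * num_hashes
    return [min((a * x + b) % PRIME for x in pattern_ids) for a, b in coeffs]


def estimate_jaccard(sketch_a, sketch_b):
    """Estimate Jaccard similarity as the fraction of agreeing sketch slots."""
    matches = sum(1 for u, v in zip(sketch_a, sketch_b) if u == v)
    return matches / len(sketch_a)


if __name__ == "__main__":
    g1 = {1, 2, 3, 4}   # hypothetical pattern sets of two graphs
    g2 = {3, 4, 5, 6}   # true Jaccard similarity: 2 / 6
    print(estimate_jaccard(minhash_sketch(g1), minhash_sketch(g2)))
```

The Jaccard distance is one minus the estimated similarity. Longer sketches (larger `num_hashes`) reduce the variance of the estimate at the cost of more hash evaluations per graph.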

    Parallel Approximate Maximum Flows in Near-Linear Work and Polylogarithmic Depth

    We present a parallel algorithm for the $(1-\epsilon)$-approximate maximum flow problem in capacitated, undirected graphs with $n$ vertices and $m$ edges, achieving $O(\epsilon^{-3}\,\text{polylog}\, n)$ depth and $O(m \epsilon^{-3}\,\text{polylog}\, n)$ work in the PRAM model. Although near-linear time sequential algorithms for this problem have been known for almost a decade, no parallel algorithms that simultaneously achieved polylogarithmic depth and near-linear work were known. At the heart of our result is a polylogarithmic depth, near-linear work recursive algorithm for computing congestion approximators. Our algorithm involves a recursive step to obtain a low-quality congestion approximator followed by a "boosting" step to improve its quality which prevents a multiplicative blow-up in error. Similar to Peng [SODA'16], our boosting step builds upon the hierarchical decomposition scheme of Räcke, Shah, and Täubig [SODA'14]. A direct implementation of this approach, however, leads only to an algorithm with $n^{o(1)}$ depth and $m^{1+o(1)}$ work. To get around this, we introduce a new hierarchical decomposition scheme, in which we only need to solve maximum flows on subgraphs obtained by contracting vertices, as opposed to vertex-induced subgraphs used in Räcke, Shah, and Täubig [SODA'14]. In particular, we are able to directly extract congestion approximators for the subgraphs from a congestion approximator for the entire graph, thereby avoiding additional recursion on those subgraphs. Along the way, we also develop a parallel flow-decomposition algorithm that is crucial to achieving polylogarithmic depth and may be of independent interest.

    Polyhedra and algorithms for problems bridging notions of connectivity and independence

    We are interested in finding subgraphs that capture selected models of connectivity and independence. In short: fixed cardinality stable (or independent) sets, stable (or conflict-free) spanning trees, and matchings (or independent edge sets) inducing a connected subgraph. These are combinatorial structures that can be generalized to a number of models across network design in telecommunication and utilities, facility location, and phylogenetics, among many other application domains of operations research and optimization. We argue that the selected structures raise appealing research questions, and seek to contribute an improved mathematical understanding of the structures themselves, as well as improved algorithms to tackle the corresponding combinatorial optimization problems. That is, methods to identify an optimal structure, assuming the elements that form them (vertices or edges in a given graph) have a weight. Our research spans different lines within algorithmics, combinatorics and optimization. Most of the results concern finding better descriptions of the geometric structures (namely, 0/1-polytopes) that represent all feasible solutions to each of the problems. Such improved descriptions translate to linear inequalities in integer programming formulations which, in turn, provide stronger computational results when solving benchmark instances of each problem. We repeatedly stress the importance of sharing an open-source implementation of all algorithms and tools developed when proposing new models and solution methods in integer programming and combinatorial optimization. Our code repositories include full implementations, crafted with efficiency and modular design in mind, thus fostering reuse, further research and new applications in research and development.
    Doctoral thesis
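To make the link between 0/1-polytopes and integer programming formulations concrete, the following is a sketch of the textbook edge formulation for a weighted fixed cardinality stable set problem. The symbols $G=(V,E)$, $w_v$, $k$, and $x_v$ are assumptions introduced here for illustration, as is the choice of minimization. This is not necessarily the formulation studied in the thesis, whose contribution is described as stronger inequalities than a basic model of this kind.

```latex
\begin{align*}
\min \quad & \sum_{v \in V} w_v x_v \\
\text{s.t.} \quad & x_u + x_v \le 1 && \text{for all } \{u,v\} \in E, \\
& \sum_{v \in V} x_v = k, \\
& x_v \in \{0,1\} && \text{for all } v \in V.
\end{align*}
```

Here $x_v = 1$ means vertex $v$ is selected. The convex hull of the feasible 0/1 vectors is the polytope in question, and any additional valid inequality tightens the linear relaxation obtained by replacing $x_v \in \{0,1\}$ with $0 \le x_v \le 1$.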
