    Two lower bounds for $p$-centered colorings

    Given a graph $G$ and an integer $p$, a coloring $f : V(G) \to \mathbb{N}$ is \emph{$p$-centered} if for every connected subgraph $H$ of $G$, either $f$ uses more than $p$ colors on $H$ or there is a color that appears exactly once in $H$. The notion of $p$-centered colorings plays a central role in the theory of sparse graphs. In this note we show two lower bounds on the number of colors required in a $p$-centered coloring. First, we consider monotone classes of graphs whose shallow minors have average degree bounded polynomially in the radius, or equivalently (by a result of Dvo\v{r}\'ak and Norin), admitting strongly sublinear separators. We construct such a class such that $p$-centered colorings require a number of colors super-polynomial in $p$. This is in contrast with a recent result of Pilipczuk and Siebertz, who established a polynomial upper bound in the special case of graphs excluding a fixed minor. Second, we consider graphs of maximum degree $\Delta$. D\k{e}bski, Felsner, Micek, and Schr\"{o}der recently proved that these graphs have $p$-centered colorings with $O(\Delta^{2-1/p} p)$ colors. We show that there are graphs of maximum degree $\Delta$ that require $\Omega(\Delta^{2-1/p} p \ln^{-1/p}\Delta)$ colors in any $p$-centered coloring, thus matching their upper bound up to a logarithmic factor. Comment: v3: final version with journal layout; v2: revised following referees' comment
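
    The definition above can be checked directly on very small graphs. The following is a minimal brute-force sketch, not a method from the note itself: the function names and the adjacency-dictionary representation are assumptions made for illustration. It only inspects induced connected subgraphs, which is enough because the condition depends only on the vertex set of $H$. The running time is exponential in the number of vertices.

```python
from collections import Counter
from itertools import combinations


def is_connected(vertices, adj):
    """Return True if `vertices` induces a connected subgraph of `adj`."""
    vertices = set(vertices)
    start = next(iter(vertices))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v in vertices and v not in seen:
                seen.add(v)
                stack.append(v)
    return seen == vertices


def is_p_centered(adj, coloring, p):
    """Brute-force test of the p-centered condition (hypothetical helper).

    adj: dict mapping each vertex to a list of its neighbours.
    coloring: dict mapping each vertex to its color.
    """
    nodes = list(adj)
    for size in range(1, len(nodes) + 1):
        for subset in combinations(nodes, size):
            if not is_connected(subset, adj):
                continue
            counts = Counter(coloring[v] for v in subset)
            if len(counts) > p:
                continue  # more than p colors on H: condition satisfied
            if 1 not in counts.values():
                return False  # at most p colors and none appears exactly once
    return True


# Path on four vertices, used as a toy example.
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(is_p_centered(path, {0: 1, 1: 2, 2: 1, 3: 2}, 2))
print(is_p_centered(path, {0: 1, 1: 2, 2: 1, 3: 3}, 2))
```

    For $p = 1$ the condition reduces to proper coloring, which the alternating coloring of the path satisfies. For $p = 2$ it fails, because the whole path uses two colors and each appears twice.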

    Graph Isomorphism and the Lasserre Hierarchy

    In this paper we show lower bounds for a certain large class of algorithms solving the Graph Isomorphism problem, even on expander graph instances. Spielman [25] shows an algorithm for isomorphism of strongly regular expander graphs that runs in time exp(O(n^(1/3))) (this bound was recently improved to exp(O(n^(1/5))) [5]). It has since been an open question to remove the requirement that the graph be strongly regular. Recent algorithmic results show that for many problems the Lasserre hierarchy works surprisingly well when the underlying graph has expansion properties. Moreover, recent work of Atserias and Maneva [3] shows that k rounds of the Lasserre hierarchy is a generalization of the k-dimensional Weisfeiler-Lehman algorithm for Graph Isomorphism. These two facts combined make the Lasserre hierarchy a good candidate for solving graph isomorphism on expander graphs. Our main result rules out this promising direction by showing that even Omega(n) rounds of the Lasserre semidefinite program hierarchy fail to solve the Graph Isomorphism problem even on expander graphs. Comment: 22 pages, 3 figures, submitted to CC
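
    For readers unfamiliar with the Weisfeiler-Lehman algorithm mentioned above, the following is a minimal sketch of its 1-dimensional case only (color refinement). It does not implement the k-dimensional algorithm or the Lasserre hierarchy, and the function names and graph representation are assumptions made for illustration. Both graphs are refined together on their disjoint union so that color names are comparable.

```python
from collections import Counter


def refine(adj):
    """Run color refinement on `adj` until the partition stops splitting."""
    colors = {v: 0 for v in adj}
    while True:
        # A vertex's signature is its own color plus the sorted multiset
        # of its neighbours' colors.
        sig = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
               for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {v: palette[sig[v]] for v in adj}
        if len(palette) == len(set(colors.values())):
            return refined  # no class was split: the coloring is stable
        colors = refined


def wl_distinguishes(adj_a, adj_b):
    """Return True if 1-dimensional WL tells the two graphs apart."""
    union = {("a", v): [("a", u) for u in nbrs] for v, nbrs in adj_a.items()}
    union.update(
        {("b", v): [("b", u) for u in nbrs] for v, nbrs in adj_b.items()}
    )
    colors = refine(union)
    hist_a = Counter(c for (side, _), c in colors.items() if side == "a")
    hist_b = Counter(c for (side, _), c in colors.items() if side == "b")
    return hist_a != hist_b


# A 6-cycle versus two disjoint triangles: both are 2-regular.
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_triangles = {0: [1, 2], 1: [0, 2], 2: [0, 1],
                 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_distinguishes(c6, two_triangles))
```

    The 6-cycle and the two triangles are not isomorphic, yet color refinement never splits their vertices, so the 1-dimensional test cannot separate them. Higher-dimensional variants refine colors on tuples of vertices instead.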

    Fast depth-based subgraph kernels for unattributed graphs

    In this paper, we investigate two fast subgraph kernels based on a depth-based representation of graph structure. Both methods gauge depth information through a family of K-layer expansion subgraphs rooted at a vertex [1]. The first method commences by computing a centroid-based complexity trace for each graph, using a depth-based representation rooted at the centroid vertex that has minimum shortest path length variance to the remaining vertices [2]. This subgraph kernel is computed by measuring the Jensen-Shannon divergence between centroid-based complexity entropy traces. The second method, on the other hand, computes a depth-based representation around each vertex in turn. The corresponding subgraph kernel is computed using isomorphism tests to compare the depth-based representation rooted at each vertex in turn. For graphs with n vertices, the time complexities for the two new kernels are O(n^2) and O(n^3), in contrast to O(n^6) for the classic Gärtner graph kernel [3]. Key to achieving this efficiency is that we compute the required Shannon entropy of the random walk for our kernels with O(n^2) operations. This computational strategy enables our subgraph kernels to easily scale up to graphs of reasonably large sizes and thus overcome the size limits arising in state-of-the-art graph kernels. Experiments on standard bioinformatics and computer vision graph datasets demonstrate the effectiveness and efficiency of our new subgraph kernels.
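
    The two quantities named above, the Shannon entropy of a random walk and the Jensen-Shannon divergence, can be sketched as follows. This is an illustration under stated assumptions, not the paper's kernel: it assumes the entropy is taken over the steady-state distribution of a simple random walk (probability proportional to vertex degree), and it computes the divergence between two plain probability vectors instead of the complexity traces used in the paper. All function names are hypothetical.

```python
import math


def shannon_entropy(p):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(x * math.log(x) for x in p if x > 0)


def steady_state_entropy(adj):
    """Entropy of the steady-state random walk on an undirected graph.

    Assumption: the walk visits vertex v with probability
    deg(v) / sum of all degrees.
    """
    degrees = [len(nbrs) for nbrs in adj.values()]
    total = sum(degrees)
    return shannon_entropy([d / total for d in degrees if d > 0])


def jensen_shannon_divergence(p, q):
    """JSD between two probability vectors over the same support."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    return shannon_entropy(m) - (shannon_entropy(p) + shannon_entropy(q)) / 2


# On a 4-cycle every vertex has the same degree, so the steady-state
# distribution is uniform.
c4 = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
print(steady_state_entropy(c4))
print(jensen_shannon_divergence([1.0, 0.0], [0.0, 1.0]))
```

    The divergence is zero for identical distributions and reaches ln 2 for distributions with disjoint support, which makes it a bounded dissimilarity measure.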

    An efficient algorithm for discovering frequent subgraphs

    Over the years, frequent itemset discovery algorithms have been used to find interesting patterns in various application areas. However, as data mining techniques are increasingly applied to non-traditional domains, existing frequent pattern discovery approaches cannot be used. This is because the transaction framework assumed by these algorithms cannot effectively model the datasets in these domains. An alternate way of modeling the objects in these datasets is to represent them using graphs. Within that model, one way of formulating the frequent pattern discovery problem is as that of discovering subgraphs that occur frequently over the entire set of graphs. In this paper we present a computationally efficient algorithm, called FSG, for finding all frequent subgraphs in large graph datasets. We experimentally evaluate the performance of FSG using a variety of real and synthetic datasets. Our results show that despite the underlying complexity associated with frequent subgraph discovery, FSG is effective in finding all frequently occurring subgraphs in datasets containing over 200,000 graph transactions and scales linearly with respect to the size of the dataset. Index Terms: Data mining, scientific datasets, frequent pattern discovery, chemical compound datasets.
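
    To make the notion of a subgraph that occurs frequently over a set of graph transactions concrete, the sketch below counts support for the smallest possible patterns, single labeled edges. It is not FSG itself and says nothing about how FSG generates or counts larger candidates. The transaction format, the function name, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter


def frequent_single_edges(transactions, min_support):
    """Return single-edge patterns contained in at least `min_support` graphs.

    Each transaction is a pair (vertex_labels, edges), where vertex_labels
    maps a vertex id to its label and edges is a list of
    (u, v, edge_label) triples of an undirected labeled graph.
    """
    counts = Counter()
    for vertex_labels, edges in transactions:
        patterns = set()
        for u, v, edge_label in edges:
            # Sort endpoint labels so that (C, O) and (O, C) coincide.
            a, b = sorted((vertex_labels[u], vertex_labels[v]))
            patterns.add((a, edge_label, b))
        # A pattern counts once per transaction, however often it occurs.
        counts.update(patterns)
    return {pat: n for pat, n in counts.items() if n >= min_support}


# Three toy "compound" graphs with atom labels and bond labels.
t1 = ({0: "C", 1: "O", 2: "C"}, [(0, 1, "single"), (0, 2, "double")])
t2 = ({0: "O", 1: "C"}, [(0, 1, "single")])
t3 = ({0: "C", 1: "N"}, [(0, 1, "single")])
print(frequent_single_edges([t1, t2, t3], min_support=2))
```

    Support here is the number of transactions that contain a pattern at least once, which matches the abstract's phrasing of subgraphs occurring frequently over the entire set of graphs.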

    Analysis of Generative Chemistries

    For the modelling of chemistry we use undirected, labelled graphs as explicit models of molecules and graph transformation rules for modelling generalised chemical reactions. This is used to define artificial chemistries on the level of individual bonds and atoms, where formal graph grammars implicitly represent large spaces of chemical compounds. We use a graph rewriting formalism, rooted in category theory, called the Double Pushout approach, which directly expresses the transition state of chemical reactions. Using concurrency theory for transformation rules, we define algorithms for the composition of rewrite rules in a chemically intuitive manner that enable automatic abstraction of the level of detail in chemical pathways. Based on this rule composition we define an algorithmic framework for generation of vast reaction networks for specific spaces of a given chemistry, while still maintaining the level of detail of the model down to the atomic level. The framework also allows for computation with graphs and graph grammars, which is utilised to model non-trivial chemical systems. The graph generation relies on graph isomorphism testing, and we review the general individualisation-refinement paradigm used in the state-of-the-art algorithms for graph canonicalisation, isomorphism testing, and automorphism discovery. We present a model for chemical pathways based on a generalisation of network flows from ordinary directed graphs to directed hypergraphs. The model allows for reasoning about the flow of individual molecules in general pathways, and the introduction of chemically motivated routing constraints. It further provides the foundation for defining specialised pathway motifs, which is illustrated by defining necessary topological constraints for both catalytic and autocatalytic pathways. We also prove that central types of pathway questions are NP-complete, even for restricted classes of reaction networks. 
    The complete pathway model, including constraints for catalytic and autocatalytic pathways, is implemented using integer linear programming. This implementation is used in a tree search method to enumerate both optimal and near-optimal pathway solutions. The formal methods are applied to multiple chemical systems: the enzyme-catalysed beta-lactamase reaction, variations of the glycolysis pathway, and the formose process. In each of these systems we use rule composition to abstract pathways and calculate traces for isotope-labelled carbon atoms. The pathway model is used to automatically enumerate alternative non-oxidative glycolysis pathways, and enumerate thousands of candidates for autocatalytic pathways in the formose process.
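
    As a rough illustration of what a flow on a directed hypergraph means, the sketch below computes the net production of each molecule for a given integer flow on reactions. It is a simplified assumption-laden example, not the thesis's integer linear programming model: the hyperedge encoding (a list of source molecules and a list of target molecules, with repeats for multiplicity), the function name, and the toy reactions are all hypothetical.

```python
from collections import Counter


def net_balance(hyperedges, flow):
    """Net production of each molecule under an integer flow.

    hyperedges: dict mapping a reaction name to (tail, head), two lists
        of molecule names; a molecule listed twice is consumed or
        produced twice per unit of flow.
    flow: dict mapping a reaction name to a non-negative integer.
    """
    balance = Counter()
    for name, (tail, head) in hyperedges.items():
        f = flow.get(name, 0)
        for molecule in tail:
            balance[molecule] -= f
        for molecule in head:
            balance[molecule] += f
    # Molecules with zero balance satisfy flow conservation internally.
    return {m: b for m, b in balance.items() if b != 0}


# Toy network: X is consumed by r1 and regenerated twice over by r2.
reactions = {
    "r1": (["A", "X"], ["Y"]),
    "r2": (["Y"], ["X", "X"]),
}
print(net_balance(reactions, {"r1": 1, "r2": 1}))
```

    In this toy network one unit of flow through each reaction consumes one A and yields one extra X, while the intermediate Y is balanced. A pathway model would additionally constrain which molecules may have a non-zero balance.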