12 research outputs found

    Finite Model Theory and Proof Complexity Revisited: Distinguishing Graphs in Choiceless Polynomial Time and the Extended Polynomial Calculus

    This paper extends prior work on the connections between logics from finite model theory and propositional/algebraic proof systems. We show that if all non-isomorphic graphs in a given graph class can be distinguished in the logic Choiceless Polynomial Time with counting (CPT), then they can also be distinguished in the bounded-degree extended polynomial calculus (EPC), and the refutations have roughly the same size as the resource consumption of the CPT-sentence. This allows lower bounds for EPC to be transferred to CPT and thus constitutes a potential new approach to better understanding the limits of CPT. A super-polynomial EPC lower bound for a Ptime-instance of the graph isomorphism problem would separate CPT from Ptime and thus solve a major open question in finite model theory. Further, using our result, we provide a model-theoretic proof for the separation of bounded-degree polynomial calculus and bounded-degree extended polynomial calculus.

    Lower Bounds for Choiceless Polynomial Time via Symmetric XOR-Circuits

    Revisiting Tree Isomorphism: AHU Algorithm with Primes Numbers

    The AHU algorithm has been the state of the art since the 1970s for determining in linear time whether two unordered rooted trees are isomorphic. However, it has been criticized (by Campbell and Radford) for the way it is written, which requires several (re)readings to be understood and does not facilitate its analysis. In this paper, we propose an alternative version of the AHU algorithm that addresses this issue: it is designed to be clearer to understand and implement, has the same theoretical complexity, and is equally fast in practice. Whereas the key to the linearity of the original algorithm lay in the careful sorting of lists of integers, we replace this step by the multiplication of lists of prime numbers, and prove that this substitution causes no loss in the final complexity of the new algorithm.
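
    The idea of replacing sorted integer lists with products of primes can be illustrated with a minimal sketch. This is not the paper's algorithm and makes no claim to linear running time; it only shows why a product of primes can stand in for a sorted list of child labels (unique factorization makes the product identify the multiset). The names `prime_stream`, `make_canonizer`, and `isomorphic`, and the child-list dictionary representation of trees, are assumptions made for illustration.

```python
from itertools import count


def prime_stream():
    """Yield 2, 3, 5, ... by trial division (adequate for a small sketch)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def make_canonizer():
    """Return a function that maps each subtree to a prime naming its class.

    Hypothetical helper: the table is shared by every tree passed to the
    returned function, so equal primes mean isomorphic subtrees.
    """
    primes = prime_stream()
    prime_of_product = {}  # product of the children's primes -> class prime

    def canon(children, node):
        # Multiply the class primes of the children; by unique factorization
        # the product determines the multiset of child classes, so no sorting
        # of child labels is needed.
        product = 1
        for child in children.get(node, []):
            product *= canon(children, child)
        if product not in prime_of_product:
            prime_of_product[product] = next(primes)
        return prime_of_product[product]

    return canon


def isomorphic(children_a, root_a, children_b, root_b):
    """Test two rooted unordered trees, given as {node: [children]} dicts."""
    canon = make_canonizer()
    return canon(children_a, root_a) == canon(children_b, root_b)


tree_a = {0: [1, 2], 1: [3]}
tree_b = {"r": ["x", "y"], "y": ["z"]}   # same shape, children reordered
tree_c = {0: [1], 1: [2], 2: [3]}        # a path on four nodes
result_ab = isomorphic(tree_a, 0, tree_b, "r")
result_ac = isomorphic(tree_a, 0, tree_c, 0)
```

    The sketch is recursive, so very deep trees would exceed Python's recursion limit; a level-by-level traversal would avoid that.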

    Fine-grained Expressivity of Graph Neural Networks

    Numerous recent works have analyzed the expressive power of message-passing graph neural networks (MPNNs), primarily utilizing combinatorial techniques such as the 1-dimensional Weisfeiler-Leman test (1-WL) for the graph isomorphism problem. However, the graph isomorphism objective is inherently binary, not giving insights into the degree of similarity between two given graphs. This work resolves this issue by considering continuous extensions of both 1-WL and MPNNs to graphons. Concretely, we show that the continuous variant of 1-WL delivers an accurate topological characterization of the expressive power of MPNNs on graphons, revealing which graphs these networks can distinguish and the level of difficulty in separating them. We identify the finest topology where MPNNs separate points and prove a universal approximation theorem. Consequently, we provide a theoretical framework for graph and graphon similarity combining various topological variants of classical characterizations of the 1-WL. In particular, we characterize the expressive power of MPNNs in terms of the tree distance, which is a graph distance based on the concepts of fractional isomorphisms, and substructure counts via tree homomorphisms, showing that these concepts have the same expressive power as the 1-WL and MPNNs on graphons. Empirically, we validate our theoretical findings by showing that randomly initialized MPNNs, without training, exhibit competitive performance compared to their trained counterparts. Moreover, we evaluate different MPNN architectures based on their ability to preserve graph distances, highlighting the significance of our continuous 1-WL test in understanding MPNNs' expressivity.

    Gradual Weisfeiler-Leman: Slow and Steady Wins the Race

    The classical Weisfeiler-Leman algorithm, also known as color refinement, is fundamental for graph learning and central to successful graph kernels and graph neural networks. Originally developed for graph isomorphism testing, the algorithm iteratively refines vertex colors. On many datasets, the stable coloring is reached after a few iterations, and the optimal number of iterations for machine learning tasks is typically even lower. This suggests that the colors diverge too fast, defining a similarity that is too coarse. We generalize the concept of color refinement and propose a framework for gradual neighborhood refinement, which allows a slower convergence to the stable coloring and thus provides a more fine-grained refinement hierarchy and vertex similarity. We assign new colors by clustering vertex neighborhoods, replacing the original injective color assignment function. Our approach is used to derive new variants of existing graph kernels and to approximate the graph edit distance via optimal assignments regarding vertex similarity. We show that in both tasks, our method outperforms the original color refinement with only a moderate increase in running time, advancing the state of the art.
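
    For reference, the classical color refinement that the abstract takes as its starting point can be sketched as follows. This is the standard injective recoloring, not the gradual variant the paper proposes; the function name `color_refinement` and the adjacency-list dictionary input are assumptions made for illustration.

```python
def color_refinement(adj):
    """Classical color refinement (1-WL) on an undirected graph.

    `adj` maps each vertex to a list of its neighbors. Returns a dict
    vertex -> color for the stable coloring.
    """
    colors = {v: 0 for v in adj}  # start with a uniform coloring
    while True:
        # A vertex's signature is its own color plus the sorted multiset
        # of its neighbors' colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Injective recoloring: each distinct signature gets its own color.
        palette = {sig: i for i, sig in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        # Each round refines the previous partition, so an unchanged number
        # of color classes means the coloring is stable.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors


path = {0: [1], 1: [0, 2], 2: [1]}            # path on three vertices
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}  # every vertex looks alike
path_colors = color_refinement(path)
triangle_colors = color_refinement(triangle)
```

    On the path, the two endpoints share a color and the middle vertex gets a different one; on the triangle, all vertices keep the same color.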

    Canonization for Bounded and Dihedral Color Classes in Choiceless Polynomial Time

    In the quest for a logic capturing Ptime, the next natural classes of structures to consider are those with bounded color class size. We present a canonization procedure for graphs with dihedral color classes of bounded size in the logic of Choiceless Polynomial Time (CPT), which then captures Ptime on this class of structures. This is the first result of this form for non-abelian color classes. The first step proposes a normal form which comprises a "rigid assemblage". This roughly means that the local automorphism groups form 2-injective 3-factor subdirect products. Structures with color classes of bounded size can be reduced to normal form in CPT in a canonization-preserving way. In the second step, we show that for graphs in normal form with dihedral color classes of bounded size, the canonization problem can be solved in CPT. We also show the same statement for general ternary structures in normal form if the dihedral groups are defined over odd domains.

    Detection of Common Subtrees with Identical Label Distribution

    Frequent pattern mining is a relevant method to analyse structured data, like sequences, trees or graphs. It consists of identifying characteristic substructures of a dataset. This paper deals with a new type of pattern for tree data: common subtrees with identical label distribution. Their detection is far from obvious since the underlying isomorphism problem is graph isomorphism complete. An elaborate search algorithm is developed and analysed from both theoretical and numerical perspectives. Based on this, the enumeration of patterns is performed through a new lossless compression scheme for trees, called DAG-RW, whose complexity is investigated as well. The method shows very good properties, both in terms of computation times and analysis of real datasets from the literature. Compared to other substructures like topological subtrees and labelled subtrees for which the isomorphism problem is linear, the patterns found provide a more parsimonious representation of the data. Comment: 40 page

    WL meet VC

    Recently, many works studied the expressive power of graph neural networks (GNNs) by linking it to the 1-dimensional Weisfeiler-Leman algorithm (1-WL). Here, the 1-WL is a well-studied heuristic for the graph isomorphism problem, which iteratively colors or partitions a graph's vertex set. While this connection has led to significant advances in understanding and enhancing GNNs' expressive power, it does not provide insights into their generalization performance, i.e., their ability to make meaningful predictions beyond the training set. In this paper, we study GNNs' generalization ability through the lens of Vapnik-Chervonenkis (VC) dimension theory in two settings, focusing on graph-level predictions. First, when no upper bound on the graphs' order is known, we show that the bitlength of GNNs' weights tightly bounds their VC dimension. Further, we derive an upper bound for GNNs' VC dimension using the number of colors produced by the 1-WL. Secondly, when an upper bound on the graphs' order is known, we show a tight connection between the number of graphs distinguishable by the 1-WL and GNNs' VC dimension. Our empirical study confirms the validity of our theoretical findings. Comment: arXiv admin note: text overlap with arXiv:2206.1116