363 research outputs found

    Depth-based Hypergraph Complexity Traces from Directed Line Graphs

    In this paper, we aim to characterize the structure of hypergraphs in terms of a structural complexity measure. Measuring the complexity of a hypergraph directly tends to be elusive, since the hyperedges of a hypergraph may exhibit varying relational orders. We therefore transform a hypergraph into a line graph, which not only accurately reflects the multiple relationships exhibited by the hyperedges but is also easier to manipulate for complexity analysis. To locate the dominant substructure within a line graph, we identify a centroid vertex as the vertex whose shortest path lengths have minimum variance. A family of centroid expansion subgraphs of the line graph is then derived from the centroid vertex. We compute the depth-based complexity traces for the hypergraph by measuring either the directed or undirected entropies of its centroid expansion subgraphs. The resulting complexity traces provide a flexible framework that can be applied to both hypergraphs and graphs. We perform (hyper)graph classification in the principal component space of the complexity trace vectors. Experiments on (hyper)graph datasets abstracted from bioinformatics and computer vision data demonstrate the effectiveness and efficiency of the complexity traces. This work is supported by the National Natural Science Foundation of China (Grant no. 61503422) and by the Open Projects Program of the National Laboratory of Pattern Recognition. Francisco Escolano is supported by project TIN2012-32839 of the Spanish Government. Edwin R. Hancock is supported by a Royal Society Wolfson Research Merit Award.
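
    The centroid step described above is straightforward to prototype. The following minimal Python sketch (using networkx on toy data; the graph, the K-hop expansion, and the entry point are illustrative assumptions, and the paper's directed line-graph construction and entropy measures are not reproduced) picks the centroid vertex as the one whose shortest-path-length distribution has minimum variance, then yields the family of expansion subgraphs:

```python
# Sketch: centroid vertex and expansion subgraphs (assumes networkx; toy data).
import statistics
import networkx as nx

def centroid_vertex(G):
    """Vertex whose shortest-path lengths to all others have minimum variance."""
    best, best_var = None, float("inf")
    for v in G.nodes:
        lengths = nx.single_source_shortest_path_length(G, v)
        var = statistics.pvariance(list(lengths.values()))
        if var < best_var:
            best, best_var = v, var
    return best

def expansion_subgraphs(G, root):
    """Family of subgraphs induced by the K-hop neighbourhoods of the root."""
    lengths = nx.single_source_shortest_path_length(G, root)
    for k in range(1, max(lengths.values()) + 1):
        yield G.subgraph(v for v, d in lengths.items() if d <= k)

if __name__ == "__main__":
    L = nx.line_graph(nx.petersen_graph())  # stand-in for a hypergraph's line graph
    c = centroid_vertex(L)
    for H in expansion_subgraphs(L, c):
        print(H.number_of_nodes(), H.number_of_edges())
```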

    An edge-based matching kernel on commute-time spanning trees


    Information Theoretic Graph Kernels

    This thesis addresses the problems that arise in state-of-the-art structural learning methods for (hyper)graph classification and clustering, focusing on the development of novel information theoretic kernels for graphs. To this end, we commence in Chapter 3 by defining a family of Jensen-Shannon diffusion kernels, i.e., information theoretic kernels, for (un)attributed graphs. We show that our kernels overcome the shortcomings of inefficiency (for the unattributed diffusion kernel) and of discarding non-isomorphic substructures (for the attributed diffusion kernel) that arise in R-convolution kernels. In Chapter 4, we present a novel framework for computing depth-based complexity traces rooted at the centroid vertices of graphs, which can be computed efficiently even for large graphs. We show that our methods characterize a graph in a higher-dimensional complexity feature space than state-of-the-art complexity measures. In Chapter 5, building on the contribution of Chapter 4, we develop a novel unattributed graph kernel by matching the depth-based substructures in graphs. Unlike most existing graph kernels in the literature, which merely enumerate similar substructure pairs of limited sizes, our method incorporates explicit local substructure correspondence into the kernelization process. The new kernel thus overcomes the neglect of structural correspondence that arises in most state-of-the-art graph kernels. The methods developed in Chapters 3, 4, and 5 are restricted to graphs, whereas real-world data often exhibits higher-order relationships that are naturally represented by hypergraphs. To overcome this limitation, in Chapter 6 we present a new hypergraph kernel using substructure isomorphism tests. We show that our kernel limits the tottering that arises in existing walk- and subtree-based (hyper)graph kernels. In Chapter 7, we summarize and analyze the contributions of this thesis, and give suggestions for future work.
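
    As a rough illustration of the Jensen-Shannon idea from Chapter 3, the sketch below computes a kernel between two graphs as log 2 minus the Jensen-Shannon divergence of their degree distributions. This is a simplified stand-in, not the thesis's diffusion-kernel construction; networkx, the degree-based entropy, and the toy graphs are assumptions made for illustration.

```python
# Sketch: a Jensen-Shannon kernel between graphs (illustrative stand-in that
# compares degree histograms, not the thesis's exact composite-entropy form).
import math
import networkx as nx

def degree_distribution(G, max_degree):
    """Normalised degree histogram, padded to a common support."""
    hist = [0.0] * (max_degree + 1)
    for _, d in G.degree():
        hist[d] += 1.0
    n = G.number_of_nodes()
    return [h / n for h in hist]

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def js_kernel(G1, G2):
    """k = log 2 - JSD(P, Q); maximal (log 2) when the distributions agree."""
    m = max(max(d for _, d in G.degree()) for G in (G1, G2))
    p, q = degree_distribution(G1, m), degree_distribution(G2, m)
    mix = [(a + b) / 2 for a, b in zip(p, q)]
    jsd = entropy(mix) - 0.5 * (entropy(p) + entropy(q))
    return math.log(2) - jsd

print(js_kernel(nx.cycle_graph(6), nx.path_graph(6)))
```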

    A nested alignment graph kernel through the dynamic time warping framework

    In this paper, we propose a novel nested alignment graph kernel drawing on depth-based complexity traces and the dynamic time warping framework. Specifically, for a pair of graphs, we commence by computing the depth-based complexity traces rooted at the centroid vertices. The resulting kernel for the graphs is defined by measuring the global alignment kernel, developed through the dynamic time warping framework, between the complexity traces. We show that the proposed kernel not only simultaneously considers the local and global graph characteristics in terms of the complexity traces, but also provides richer statistical measures by incorporating the whole spectrum of alignment costs between these traces. Our experiments demonstrate the effectiveness and efficiency of the proposed kernel.
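
    The global alignment kernel admits a compact dynamic-programming sketch. The following Python/numpy code is a minimal sketch, not the paper's full nested construction: the Gaussian local kernel, its bandwidth, and the toy traces are assumptions. It shows how every monotone alignment contributes to the kernel value, rather than only the single cheapest DTW path:

```python
# Sketch: global alignment kernel between two 1-D complexity traces
# (assumes a Gaussian local kernel; hypothetical trace data).
import numpy as np

def global_alignment_kernel(x, y, sigma=1.0):
    """Sums Gaussian local similarities over *all* monotone alignments,
    unlike plain DTW, which keeps only the single cheapest path."""
    n, m = len(x), len(y)
    M = np.zeros((n + 1, m + 1))
    M[0, 0] = 1.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = np.exp(-((x[i - 1] - y[j - 1]) ** 2) / (2 * sigma**2))
            M[i, j] = local * (M[i - 1, j - 1] + M[i - 1, j] + M[i, j - 1])
    return M[n, m]

trace_a = np.array([0.9, 1.4, 2.1, 2.6])   # hypothetical complexity traces
trace_b = np.array([1.0, 1.5, 2.0, 2.5, 2.7])
print(global_alignment_kernel(trace_a, trace_b))
```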

    Static Generation of UML Sequence Diagrams

    UML sequence diagrams are visual representations of object interactions in a system and can provide valuable information for program comprehension, debugging, maintenance, and software archeology. Sequence diagrams generated from legacy code are independent of existing documentation that may have eroded. We present a framework for static generation of UML sequence diagrams from object-oriented source code. The framework provides a query refinement system to guide the user to interesting interactions in the source code. Our technique involves constructing a hypergraph representation of the source code, traversing the hypergraph with respect to a user-defined query, and generating the corresponding set of sequence diagrams. We implemented our framework as a tool, StaticGen (supporting software: StaticGen), and used it to analyze a corpus of 30 Android applications. We provide experimental results demonstrating the efficacy of our technique. (This work originally appeared in the Proceedings of Fundamental Approaches to Software Engineering, 20th International Conference, FASE 2017, held as part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2017, Uppsala, Sweden, April 22–29, 2017.)
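
    To make the traversal idea concrete, here is a minimal Python sketch under a hypothetical data model: each method maps to the ordered call sites it contains, and a depth-first walk from a queried entry point emits PlantUML-style sequence-diagram messages. StaticGen's actual hypergraph representation and query language are richer; every name below is illustrative.

```python
# Sketch: deriving sequence-diagram messages from a call structure
# (hypothetical data model; StaticGen's actual representation is richer).
from collections import defaultdict

# Each method maps to the ordered list of calls it makes: (receiver, method).
calls = defaultdict(list)
calls["Activity.onCreate"] = [("View", "inflate"), ("Adapter", "bind")]
calls["Adapter.bind"] = [("Model", "load")]

def emit_sequence(entry, caller="User", out=None, depth=0, limit=10):
    """Depth-first traversal emitting PlantUML-style message lines."""
    out = [] if out is None else out
    if depth >= limit:  # guard against recursion in the call structure
        return out
    receiver, method = entry.split(".")
    out.append(f"{caller} -> {receiver}: {method}()")
    for callee_recv, callee_meth in calls.get(entry, []):
        emit_sequence(f"{callee_recv}.{callee_meth}", receiver, out, depth + 1, limit)
    return out

print("\n".join(emit_sequence("Activity.onCreate")))
```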

    Analysis of Generative Chemistries

    For the modelling of chemistry we use undirected, labelled graphs as explicit models of molecules and graph transformation rules for modelling generalised chemical reactions. This is used to define artificial chemistries at the level of individual bonds and atoms, where formal graph grammars implicitly represent large spaces of chemical compounds. We use a graph rewriting formalism rooted in category theory, called the Double Pushout approach, which directly expresses the transition state of chemical reactions. Using concurrency theory for transformation rules, we define algorithms for the composition of rewrite rules in a chemically intuitive manner, enabling automatic abstraction of the level of detail in chemical pathways. Based on this rule composition, we define an algorithmic framework for the generation of vast reaction networks for specific spaces of a given chemistry, while maintaining the level of detail of the model down to the atomic level. The framework also allows for computation with graphs and graph grammars, which is utilised to model non-trivial chemical systems. The graph generation relies on graph isomorphism testing, and we review the general individualisation-refinement paradigm used in the state-of-the-art algorithms for graph canonicalisation, isomorphism testing, and automorphism discovery. We present a model for chemical pathways based on a generalisation of network flows from ordinary directed graphs to directed hypergraphs. The model allows for reasoning about the flow of individual molecules in general pathways and for the introduction of chemically motivated routing constraints. It further provides the foundation for defining specialised pathway motifs, which we illustrate by deriving necessary topological constraints for both catalytic and autocatalytic pathways. We also prove that central types of pathway questions are NP-complete, even for restricted classes of reaction networks. The complete pathway model, including constraints for catalytic and autocatalytic pathways, is implemented using integer linear programming. This implementation is used in a tree search method to enumerate both optimal and near-optimal pathway solutions. The formal methods are applied to multiple chemical systems: the enzyme-catalysed beta-lactamase reaction, variations of the glycolysis pathway, and the formose process. In each of these systems we use rule composition to abstract pathways and to calculate traces for isotope-labelled carbon atoms. The pathway model is used to automatically enumerate alternative non-oxidative glycolysis pathways, and to enumerate thousands of candidates for autocatalytic pathways in the formose process.
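
    To give a flavour of the hyperflow formulation, the sketch below encodes a toy pathway question as an integer linear program in Python, assuming the PuLP library: one integer variable per reaction (hyperedge) counts its firings, internal molecules must balance, and the objective minimises total firings needed to produce a target. The stoichiometry and molecule names are invented for illustration and are not a real chemistry.

```python
# Sketch: pathway flow on a directed hypergraph as an integer linear program
# (assumes the PuLP library; toy stoichiometry, not a real chemistry).
import pulp

# Reactions as hyperedges: molecule -> signed stoichiometry (consumed < 0).
reactions = {
    "r1": {"A": -1, "B": -1, "C": +1},
    "r2": {"C": -2, "D": +1},
}
inputs, target = {"A", "B"}, "D"

prob = pulp.LpProblem("pathway", pulp.LpMinimize)
fire = {r: pulp.LpVariable(r, lowBound=0, cat="Integer") for r in reactions}
prob += pulp.lpSum(fire.values())            # fewest total reaction firings

molecules = {m for stoich in reactions.values() for m in stoich}
for m in molecules:
    net = pulp.lpSum(s * fire[r] for r, stoich in reactions.items()
                     for mol, s in stoich.items() if mol == m)
    if m == target:
        prob += net >= 1                     # must produce the target
    elif m not in inputs:
        prob += net == 0                     # internal molecules conserved

prob.solve()
print({r: int(v.value()) for r, v in fire.items()})
```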

    Dagstuhl Reports : Volume 1, Issue 2, February 2011

    Online Privacy: Towards Informational Self-Determination on the Internet (Dagstuhl Perspectives Workshop 11061): Simone Fischer-Hübner, Chris Hoofnagle, Kai Rannenberg, Michael Waidner, Ioannis Krontiris and Michael Marhöfer
    Self-Repairing Programs (Dagstuhl Seminar 11062): Mauro Pezzé, Martin C. Rinard, Westley Weimer and Andreas Zeller
    Theory and Applications of Graph Searching Problems (Dagstuhl Seminar 11071): Fedor V. Fomin, Pierre Fraigniaud, Stephan Kreutzer and Dimitrios M. Thilikos
    Combinatorial and Algorithmic Aspects of Sequence Processing (Dagstuhl Seminar 11081): Maxime Crochemore, Lila Kari, Mehryar Mohri and Dirk Nowotka
    Packing and Scheduling Algorithms for Information and Communication Services (Dagstuhl Seminar 11091): Klaus Jansen, Claire Mathieu, Hadas Shachnai and Neal E. Young

    Beyond Worst-Case Analysis for Joins with Minesweeper

    We describe a new algorithm, Minesweeper, that is able to satisfy stronger runtime guarantees than previous join algorithms (colloquially, "beyond worst-case guarantees") for data in indexed search trees. Our first contribution is developing a framework to measure this stronger notion of complexity, which we call certificate complexity, that extends notions of Barbay et al. and Demaine et al.; a certificate is a set of propositional formulae that certifies that the output is correct. This notion captures a natural class of join algorithms. In addition, the certificate allows us to define a strictly stronger notion of runtime complexity than traditional worst-case guarantees. Our second contribution is to develop a dichotomy theorem for the certificate-based notion of complexity. Roughly, we show that Minesweeper evaluates β-acyclic queries in time linear in the certificate plus the output size, while for any β-cyclic query there is some instance that takes superlinear time in the certificate (and for which the output is no larger than the certificate size). We also extend our certificate-complexity analysis to queries with bounded treewidth and the triangle query. Comment: This is the full version of our PODS 2014 paper.
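
    Minesweeper itself is beyond a short sketch, but the adaptive flavour of certificate-style guarantees can be illustrated with the galloping set intersection from the Barbay/Demaine line of work that the paper extends. In the Python sketch below (toy data; not the paper's algorithm), the running time tracks how interleaved the inputs are rather than their total size:

```python
# Sketch: adaptive (galloping) intersection of sorted lists, in the spirit of
# the Barbay/Demaine-style adaptive bounds that certificate complexity extends.
import bisect

def gallop(a, lo, key):
    """First index >= lo with a[i] >= key, by doubling steps then bisecting."""
    step = 1
    while lo + step < len(a) and a[lo + step] < key:
        step *= 2
    return bisect.bisect_left(a, key, lo, min(lo + step + 1, len(a)))

def intersect(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i]); i += 1; j += 1
        elif a[i] < b[j]:
            i = gallop(a, i, b[j])   # skip a long run of a in O(log run)
        else:
            j = gallop(b, j, a[i])
    return out

print(intersect([1, 2, 3, 4, 5, 1000], [5, 1000, 2000]))
```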

    Combinatorics

    Combinatorics is a fundamental mathematical discipline that focuses on the study of discrete objects and their properties. The present workshop featured research in such diverse areas as Extremal, Probabilistic and Algebraic Combinatorics, Graph Theory, Discrete Geometry, Combinatorial Optimization, Theory of Computation and Statistical Mechanics. It provided current accounts of exciting developments and challenges in these fields and a stimulating venue for a variety of fruitful interactions. This is a report on the meeting, containing extended abstracts of the presentations and a summary of the problem session.