191,329 research outputs found

    JGraphT -- A Java library for graph data structures and algorithms

    Full text link
    Mathematical software and graph-theoretical algorithmic packages to efficiently model, analyze and query graphs are crucial in an era where large-scale spatial, societal and economic network data are abundantly available. One such package is JGraphT, a programming library which contains very efficient and generic graph data-structures along with a large collection of state-of-the-art algorithms. The library is written in Java with stability, interoperability and performance in mind. A distinctive feature of this library is the ability to model vertices and edges as arbitrary objects, thereby permitting natural representations of many common networks including transportation, social and biological networks. Besides classic graph algorithms such as shortest-paths and spanning-tree algorithms, the library contains numerous advanced algorithms: graph and subgraph isomorphism; matching and flow problems; approximation algorithms for NP-hard problems such as independent set and TSP; and several more exotic algorithms such as Berge graph detection. Due to its versatility and generic design, JGraphT is currently used in large-scale commercial, non-commercial and academic research projects. In this work we describe in detail the design and underlying structure of the library, and discuss its most important features and algorithms. A computational study is conducted to evaluate the performance of JGraphT versus a number of similar libraries. Experiments on a large number of graphs over a variety of popular algorithms show that JGraphT is highly competitive with other established libraries such as NetworkX or the BGL.Comment: Major Revisio

    HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget

    Full text link
    The neighbourhood function N(t) of a graph G gives, for each t, the number of pairs of nodes such that y is reachable from x in less that t hops. The neighbourhood function provides a wealth of information about the graph (e.g., it easily allows one to compute its diameter), but it is very expensive to compute it exactly. Recently, the ANF algorithm (approximate neighbourhood function) has been proposed with the purpose of approximating NG(t) on large graphs. We describe a breakthrough improvement over ANF in terms of speed and scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters and combines them efficiently through broadword programming; our implementation uses overdecomposition to exploit multi-core parallelism. With HyperANF, for the first time we can compute in a few hours the neighbourhood function of graphs with billions of nodes with a small error and good confidence using a standard workstation. Then, we turn to the study of the distribution of the shortest paths between reachable nodes (that can be efficiently approximated by means of HyperANF), and discover the surprising fact that its index of dispersion provides a clear-cut characterisation of proper social networks vs. web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a graph as a new, informative statistics that is able to discriminate between the above two types of graphs. We believe this is the first proposal of a significant new non-local structural index for complex networks whose computation is highly scalable

    Compositional Algorithms on Compositional Data: Deciding Sheaves on Presheaves

    Full text link
    Algorithmicists are well-aware that fast dynamic programming algorithms are very often the correct choice when computing on compositional (or even recursive) graphs. Here we initiate the study of how to generalize this folklore intuition to mathematical structures writ large. We achieve this horizontal generality by adopting a categorial perspective which allows us to show that: (1) structured decompositions (a recent, abstract generalization of many graph decompositions) define Grothendieck topologies on categories of data (adhesive categories) and that (2) any computational problem which can be represented as a sheaf with respect to these topologies can be decided in linear time on classes of inputs which admit decompositions of bounded width and whose decomposition shapes have bounded feedback vertex number. This immediately leads to algorithms on objects of any C-set category; these include -- to name but a few examples -- structures such as: symmetric graphs, directed graphs, directed multigraphs, hypergraphs, directed hypergraphs, databases, simplicial complexes, circular port graphs and half-edge graphs. Thus we initiate the bridging of tools from sheaf theory, structural graph theory and parameterized complexity theory; we believe this to be a very fruitful approach for a general, algebraic theory of dynamic programming algorithms. Finally we pair our theoretical results with concrete implementations of our main algorithmic contribution in the AlgebraicJulia ecosystem.Comment: Revised and simplified notation and improved exposition. The companion code can be found here: https://github.com/AlgebraicJulia/StructuredDecompositions.j

    Ramsey numbers involving a triangle: theory and algorithms

    Get PDF
    Ramsey theory studies the existence of highly regular patterns in large sets of objects. Given two graphs G and H, the Ramsey number R(G, H) is defined to be the smallest integer n such that any graph F with n or more vertices must contain G, or F must contain H. Albeit beautiful, the problem of determining Ramsey numbers is considered to be very difficult. We focus our attention on efficient algorithms for determining Ram sey numbers involving a triangle: R(K3 , G). With the help of theoretical tools, the search space is reduced by using different pruning techniques and linear programming. Efficient operations are also carried out to mathematically glue together small graphs to construct larger critical graphs. Using the algorithms developed in this thesis, we compute all the Ramsey numbers R(Kz,G), where G is any connected graph of order seven. Most of the corresponding critical graphs are also constructed. We believe that the algorithms developed here will have wider applications to other Ramsey-type problems

    Generic Strategies for Chemical Space Exploration

    Full text link
    Computational approaches to exploring "chemical universes", i.e., very large sets, potentially infinite sets of compounds that can be constructed by a prescribed collection of reaction mechanisms, in practice suffer from a combinatorial explosion. It quickly becomes impossible to test, for all pairs of compounds in a rapidly growing network, whether they can react with each other. More sophisticated and efficient strategies are therefore required to construct very large chemical reaction networks. Undirected labeled graphs and graph rewriting are natural models of chemical compounds and chemical reactions. Borrowing the idea of partial evaluation from functional programming, we introduce partial applications of rewrite rules. Binding substrate to rules increases the number of rules but drastically prunes the substrate sets to which it might match, resulting in dramatically reduced resource requirements. At the same time, exploration strategies can be guided, e.g. based on restrictions on the product molecules to avoid the explicit enumeration of very unlikely compounds. To this end we introduce here a generic framework for the specification of exploration strategies in graph-rewriting systems. Using key examples of complex chemical networks from sugar chemistry and the realm of metabolic networks we demonstrate the feasibility of a high-level strategy framework. The ideas presented here can not only be used for a strategy-based chemical space exploration that has close correspondence of experimental results, but are much more general. In particular, the framework can be used to emulate higher-level transformation models such as illustrated in a small puzzle game

    Towards effective exact methods for the Maximum Balanced Biclique Problem in bipartite graphs

    Get PDF
    The Maximum Balanced Biclique Problem (MBBP) is a prominent model with numerous applications. Yet, the problem is NP-hard and thus computationally challenging. We propose novel ideas for designing effective exact algorithms for MBBP in bipartite graphs. First, an Upper Bound Propagation (UBP) procedure to pre-compute an upper bound involving each vertex is introduced. Then we extend a simple Branch-and-Bound (B&B) algorithm by integrating the pre-computed upper bounds. Based on UBP, we also study a new integer linear programming model of MBBP which is more compact than an existing formulation (Dawande, Keskinocak, Swaminathan, & Tayur, 2001). We introduce new valid inequalities induced from the upper bounds to tighten these mathematical formulations for MBBP. Experiments with random bipartite graphs demonstrate the efficiency of the extended B&B algorithm and the valid inequalities generated on demand. Further tests with 30 real-life instances show that, for at least three very large graphs, the new approaches improve the computational time with four orders of magnitude compared to the original B&B
    • …
    corecore