JGraphT -- A Java library for graph data structures and algorithms
Mathematical software and graph-theoretical algorithmic packages to
efficiently model, analyze and query graphs are crucial in an era where
large-scale spatial, societal and economic network data are abundantly
available. One such package is JGraphT, a programming library which contains
very efficient and generic graph data-structures along with a large collection
of state-of-the-art algorithms. The library is written in Java with stability,
interoperability and performance in mind. A distinctive feature of this library
is the ability to model vertices and edges as arbitrary objects, thereby
permitting natural representations of many common networks including
transportation, social and biological networks. Besides classic graph
algorithms such as shortest-paths and spanning-tree algorithms, the library
contains numerous advanced algorithms: graph and subgraph isomorphism; matching
and flow problems; approximation algorithms for NP-hard problems such as
independent set and TSP; and several more exotic algorithms such as Berge graph
detection. Due to its versatility and generic design, JGraphT is currently used
in large-scale commercial, non-commercial and academic research projects. In
this work we describe in detail the design and underlying structure of the
library, and discuss its most important features and algorithms. A
computational study is conducted to evaluate the performance of JGraphT versus
a number of similar libraries. Experiments on a large number of graphs over a
variety of popular algorithms show that JGraphT is highly competitive with
other established libraries such as NetworkX or the BGL.
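The idea of modelling vertices as arbitrary objects can be sketched in a few lines. The example below is written in Python rather than Java and does not use or imitate JGraphT's API; the names `Station` and `ObjectGraph` are hypothetical and exist only for illustration. It shows an undirected weighted graph whose vertices are immutable domain objects, together with a plain Dijkstra shortest-path query.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class Station:
    """Hypothetical domain object used directly as a vertex."""
    name: str
    zone: int


class ObjectGraph:
    """Undirected weighted graph over arbitrary hashable vertices (illustrative only)."""

    def __init__(self):
        self._adj = {}

    def add_edge(self, u, v, weight=1.0):
        # Store the edge in both directions because the graph is undirected.
        self._adj.setdefault(u, {})[v] = weight
        self._adj.setdefault(v, {})[u] = weight

    def shortest_path_length(self, source, target):
        """Dijkstra's algorithm; returns None if target is unreachable."""
        tie = count()  # tie-breaker so the heap never compares vertices
        dist = {source: 0.0}
        heap = [(0.0, next(tie), source)]
        while heap:
            d, _, u = heapq.heappop(heap)
            if u == target:
                return d
            if d > dist.get(u, float("inf")):
                continue  # stale heap entry
            for v, w in self._adj.get(u, {}).items():
                nd = d + w
                if nd < dist.get(v, float("inf")):
                    dist[v] = nd
                    heapq.heappush(heap, (nd, next(tie), v))
        return None


a, b, c = Station("A", 1), Station("B", 1), Station("C", 2)
g = ObjectGraph()
g.add_edge(a, b, 2.0)
g.add_edge(b, c, 2.0)
g.add_edge(a, c, 5.0)
print(g.shortest_path_length(a, c))
```

Because the vertices only need to be hashable, the same structure works unchanged for stations, people, or proteins, which is the kind of flexibility the abstract describes.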
HyperANF: Approximating the Neighbourhood Function of Very Large Graphs on a Budget
The neighbourhood function N_G(t) of a graph G gives, for each t, the number of
pairs of nodes (x, y) such that y is reachable from x in less than t hops. The
neighbourhood function provides a wealth of information about the graph (e.g.,
it easily allows one to compute its diameter), but it is very expensive to
compute it exactly. Recently, the ANF algorithm (approximate neighbourhood
function) has been proposed with the purpose of approximating N_G(t) on large
graphs. We describe a breakthrough improvement over ANF in terms of speed and
scalability. Our algorithm, called HyperANF, uses the new HyperLogLog counters
and combines them efficiently through broadword programming; our implementation
uses overdecomposition to exploit multi-core parallelism. With HyperANF, for
the first time we can compute in a few hours the neighbourhood function of
graphs with billions of nodes with a small error and good confidence using a
standard workstation. Then, we turn to the study of the distribution of the
shortest paths between reachable nodes (that can be efficiently approximated by
means of HyperANF), and discover the surprising fact that its index of
dispersion provides a clear-cut characterisation of proper social networks vs.
web graphs. We thus propose the spid (Shortest-Paths Index of Dispersion) of a
graph as a new, informative statistic that is able to discriminate between the
above two types of graphs. We believe this is the first proposal of a
significant new non-local structural index for complex networks whose
computation is highly scalable.
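For a graph small enough to handle exactly, the two quantities in this abstract can be computed with one breadth-first search per node. The sketch below is not HyperANF: it uses no HyperLogLog counters and does not scale. It rests on three assumptions that the abstract does not fix: the neighbourhood function is tabulated cumulatively as "at most t hops" (an index shift of one relative to "less than t hops"), each pair (x, x) counts at distance 0, and the spid is taken as the variance-to-mean ratio of distances over ordered pairs of distinct, mutually reachable nodes.

```python
from collections import Counter, deque


def distance_counts(adj):
    """Count ordered pairs (x, y), x != y, by shortest-path distance.

    adj maps every node to an iterable of its out-neighbours; every
    neighbour is assumed to appear as a key of adj.
    """
    counts = Counter()
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    counts[dist[v]] += 1
                    queue.append(v)
    return counts


def neighbourhood_function(adj):
    """values[t] = number of pairs (x, y) with y reachable from x in at most t hops."""
    counts = distance_counts(adj)
    total = len(adj)  # assumption: each pair (x, x) counts at distance 0
    values = [total]
    for t in range(1, max(counts, default=0) + 1):
        total += counts[t]
        values.append(total)
    return values


def spid(adj):
    """Variance-to-mean ratio of the shortest-path distance distribution."""
    counts = distance_counts(adj)
    pairs = sum(counts.values())
    if pairs == 0:
        raise ValueError("graph has no pair of distinct reachable nodes")
    mean = sum(d * c for d, c in counts.items()) / pairs
    variance = sum(c * (d - mean) ** 2 for d, c in counts.items()) / pairs
    return variance / mean


# Undirected path 0 - 1 - 2, stored as a symmetric adjacency map.
path = {0: [1], 1: [0, 2], 2: [1]}
print(neighbourhood_function(path))
print(spid(path))
```

The exact computation costs one full traversal per node, which is what makes an approximate, counter-based method attractive for graphs with billions of nodes.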
Compositional Algorithms on Compositional Data: Deciding Sheaves on Presheaves
Algorithmicists are well aware that fast dynamic programming algorithms are
very often the correct choice when computing on compositional (or even
recursive) graphs. Here we initiate the study of how to generalize this
folklore intuition to mathematical structures writ large. We achieve this
horizontal generality by adopting a categorial perspective which allows us to
show that: (1) structured decompositions (a recent, abstract generalization of
many graph decompositions) define Grothendieck topologies on categories of data
(adhesive categories) and that (2) any computational problem which can be
represented as a sheaf with respect to these topologies can be decided in
linear time on classes of inputs which admit decompositions of bounded width
and whose decomposition shapes have bounded feedback vertex number. This
immediately leads to algorithms on objects of any C-set category; these include
-- to name but a few examples -- structures such as: symmetric graphs, directed
graphs, directed multigraphs, hypergraphs, directed hypergraphs, databases,
simplicial complexes, circular port graphs and half-edge graphs.
Thus we initiate the bridging of tools from sheaf theory, structural graph
theory and parameterized complexity theory; we believe this to be a very
fruitful approach for a general, algebraic theory of dynamic programming
algorithms. Finally we pair our theoretical results with concrete
implementations of our main algorithmic contribution in the AlgebraicJulia
ecosystem. Comment: Revised and simplified notation and improved exposition. The
companion code can be found here:
https://github.com/AlgebraicJulia/StructuredDecompositions.j
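The folklore intuition about dynamic programming on decompositions of bounded width can be illustrated on a classical special case: deciding k-colourability along a path decomposition. The sketch below is not the sheaf-theoretic algorithm of the paper and does not use the AlgebraicJulia packages; the function name `k_colourable` is hypothetical. It assumes the caller supplies a valid path decomposition (every edge lies inside some bag, and the bags containing any given vertex are consecutive) and that vertices are sortable.

```python
from itertools import combinations, product


def k_colourable(bags, edges, k):
    """Decide k-colourability by dynamic programming over a path decomposition.

    bags  : list of vertex collections, assumed to form a valid path decomposition
    edges : iterable of 2-element vertex pairs
    k     : number of colours
    """
    edge_set = {frozenset(e) for e in edges}
    previous_bag, previous_states = [], [dict()]
    for bag in bags:
        bag = sorted(set(bag))
        shared = [v for v in bag if v in previous_bag]
        # Colourings of the shared vertices that extend to everything seen so far.
        allowed = {tuple(state[v] for v in shared) for state in previous_states}
        states = []
        for colours in product(range(k), repeat=len(bag)):
            state = dict(zip(bag, colours))
            proper = all(
                state[u] != state[v]
                for u, v in combinations(bag, 2)
                if frozenset((u, v)) in edge_set
            )
            # Keep a local colouring only if it is proper and agrees with
            # some surviving colouring of the previous bag on the overlap.
            if proper and tuple(state[v] for v in shared) in allowed:
                states.append(state)
        if not states:
            return False
        previous_bag, previous_states = bag, states
    return True


# 5-cycle 0-1-2-3-4-0 with a path decomposition of width 2.
cycle_edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
cycle_bags = [{0, 1, 2}, {0, 2, 3}, {0, 3, 4}]
print(k_colourable(cycle_bags, cycle_edges, 2))
print(k_colourable(cycle_bags, cycle_edges, 3))
```

Each bag is processed once and the work per bag depends only on k and the bag size, so for bounded width the running time grows linearly with the number of bags. Local solutions are kept only when they agree on overlaps, which is the gluing behaviour that the sheaf condition abstracts.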
Ramsey numbers involving a triangle: theory and algorithms
Ramsey theory studies the existence of highly regular patterns in large sets of objects. Given two graphs G and H, the Ramsey number R(G, H) is defined to be the smallest integer n such that any graph F with n or more vertices must contain G, or the complement of F must contain H. Albeit beautiful, the problem of determining Ramsey numbers is considered to be very difficult. We focus our attention on efficient algorithms for determining Ramsey numbers involving a triangle: R(K3, G). With the help of theoretical tools, the search space is reduced by using different pruning techniques and linear programming. Efficient operations are also carried out to mathematically glue together small graphs to construct larger critical graphs. Using the algorithms developed in this thesis, we compute all the Ramsey numbers R(K3, G), where G is any connected graph of order seven. Most of the corresponding critical graphs are also constructed. We believe that the algorithms developed here will have wider applications to other Ramsey-type problems.
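The smallest nontrivial case, R(K3, K3), can be verified by exhaustive search, which also shows why pruning is needed for anything larger. The sketch below uses none of the pruning, linear programming, or gluing techniques of the thesis; the function names are hypothetical. It enumerates every graph F on n labelled vertices and checks for a triangle in F or in the complement of F.

```python
from itertools import combinations, product


def every_graph_has_triangle_or_cotriangle(n):
    """True if every graph on n vertices has a triangle in it or in its complement."""
    pairs = list(combinations(range(n), 2))
    triples = list(combinations(range(n), 3))
    for bits in product((False, True), repeat=len(pairs)):
        edge = dict(zip(pairs, bits))  # True: edge in F, False: edge in the complement
        found = any(
            edge[(a, b)] == edge[(a, c)] == edge[(b, c)]
            for a, b, c in triples
        )
        if not found:
            return False  # this F is a counterexample on n vertices
    return True


def ramsey_triangle_triangle(max_n=7):
    """Smallest n <= max_n that forces a triangle in F or its complement, else None."""
    for n in range(1, max_n + 1):
        if every_graph_has_triangle_or_cotriangle(n):
            return n
    return None


print(ramsey_triangle_triangle())
```

The search visits 2^(n(n-1)/2) graphs, which is 32,768 for n = 6 and already more than two million for n = 7, so brute force stops being practical almost immediately.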
Generic Strategies for Chemical Space Exploration
Computational approaches to exploring "chemical universes", i.e., very large,
potentially infinite sets of compounds that can be constructed by a
prescribed collection of reaction mechanisms, in practice suffer from a
combinatorial explosion. It quickly becomes impossible to test, for all pairs
of compounds in a rapidly growing network, whether they can react with each
other. More sophisticated and efficient strategies are therefore required to
construct very large chemical reaction networks.
Undirected labeled graphs and graph rewriting are natural models of chemical
compounds and chemical reactions. Borrowing the idea of partial evaluation from
functional programming, we introduce partial applications of rewrite rules.
Binding substrates to rules increases the number of rules but drastically prunes
the substrate sets to which they might match, resulting in dramatically reduced
resource requirements. At the same time, exploration strategies can be guided,
e.g. based on restrictions on the product molecules to avoid the explicit
enumeration of very unlikely compounds. To this end we introduce here a generic
framework for the specification of exploration strategies in graph-rewriting
systems. Using key examples of complex chemical networks from sugar chemistry
and the realm of metabolic networks we demonstrate the feasibility of a
high-level strategy framework.
The ideas presented here can not only be used for a strategy-based chemical
space exploration that has close correspondence to experimental results, but
are much more general. In particular, the framework can be used to emulate
higher-level transformation models, as illustrated in a small puzzle game.
Towards effective exact methods for the Maximum Balanced Biclique Problem in bipartite graphs
The Maximum Balanced Biclique Problem (MBBP) is a prominent model with numerous applications. Yet, the problem is NP-hard and thus computationally challenging. We propose novel ideas for designing effective exact algorithms for MBBP in bipartite graphs. First, an Upper Bound Propagation (UBP) procedure to pre-compute an upper bound involving each vertex is introduced. Then we extend a simple Branch-and-Bound (B&B) algorithm by integrating the pre-computed upper bounds. Based on UBP, we also study a new integer linear programming model of MBBP which is more compact than an existing formulation (Dawande, Keskinocak, Swaminathan, & Tayur, 2001). We introduce new valid inequalities induced from the upper bounds to tighten these mathematical formulations for MBBP. Experiments with random bipartite graphs demonstrate the efficiency of the extended B&B algorithm and the valid inequalities generated on demand. Further tests with 30 real-life instances show that, for at least three very large graphs, the new approaches improve the computational time by four orders of magnitude compared to the original B&B.
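A simple branch-and-bound for the balanced biclique problem can be sketched as follows. This is not the UBP procedure or the extended B&B of the paper: the bound used here is a plain counting bound chosen for illustration, and the function name is hypothetical. The input is assumed to map each left-side vertex to the set of its right-side neighbours, and the result is the number of vertices on each side of a largest balanced biclique.

```python
def max_balanced_biclique(left_adj):
    """Return k such that the largest balanced biclique has k vertices per side.

    left_adj maps each left vertex to the set of its right neighbours.
    """
    left = list(left_adj)
    best = 0

    def branch(i, chosen, common):
        nonlocal best
        # common: right vertices adjacent to every chosen left vertex
        # (None while no left vertex has been chosen yet).
        size_common = len(common) if common is not None else float("inf")
        best = max(best, min(chosen, size_common))
        remaining = len(left) - i
        # Counting bound: the left side cannot exceed chosen + remaining and
        # the right side cannot exceed the current common neighbourhood.
        if i == len(left) or min(chosen + remaining, size_common) <= best:
            return
        v = left[i]
        neighbours = set(left_adj[v])
        new_common = neighbours if common is None else common & neighbours
        branch(i + 1, chosen + 1, new_common)  # include v
        branch(i + 1, chosen, common)          # exclude v

    branch(0, 0, None)
    return best


example = {"a": {1, 2}, "b": {1, 2}, "c": {3}}
print(max_balanced_biclique(example))
```

Tighter per-vertex bounds, such as those the abstract attributes to UBP, would cut off branches earlier than this counting bound does; the overall search structure would stay the same.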