Maximum Covering Subtrees for Phylogenetic Networks
Tree-based phylogenetic networks, which may be roughly defined as
leaf-labeled networks built by adding arcs only between the original tree
edges, have elegant properties for modeling evolutionary histories. We answer
an open question of Francis, Semple, and Steel about the complexity of
determining how far a phylogenetic network is from being tree-based, including
non-binary phylogenetic networks. We show that a phylogenetic tree
covering the maximum number of nodes in a phylogenetic network can be
computed in polynomial time via an encoding into a minimum-cost maximum flow
problem.
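For readers unfamiliar with the target problem of this encoding, the following is a minimal sketch of a generic minimum-cost maximum flow solver (successive shortest augmenting paths). It is not the encoding used in the paper: the function name `min_cost_max_flow`, the edge format, and the toy graph are assumptions made for illustration only.

```python
from collections import deque


def min_cost_max_flow(n, edges, source, sink):
    """Generic min-cost max-flow on nodes 0..n-1.

    edges: list of (u, v, capacity, cost) with non-negative costs.
    Returns (max_flow_value, total_cost). Illustrative sketch only.
    """
    graph = [[] for _ in range(n)]  # graph[u] = indices of edges leaving u
    to, cap, cost = [], [], []
    for u, v, c, w in edges:
        # Forward edge at an even index, its residual twin at index ^ 1.
        graph[u].append(len(to)); to.append(v); cap.append(c); cost.append(w)
        graph[v].append(len(to)); to.append(u); cap.append(0); cost.append(-w)

    total_flow = total_cost = 0
    while True:
        # Queue-based Bellman-Ford: cheapest path in the residual graph.
        dist = [float("inf")] * n
        dist[source] = 0
        prev_edge = [-1] * n
        in_queue = [False] * n
        queue = deque([source])
        while queue:
            u = queue.popleft()
            in_queue[u] = False
            for e in graph[u]:
                v = to[e]
                if cap[e] > 0 and dist[u] + cost[e] < dist[v]:
                    dist[v] = dist[u] + cost[e]
                    prev_edge[v] = e
                    if not in_queue[v]:
                        in_queue[v] = True
                        queue.append(v)
        if dist[sink] == float("inf"):
            return total_flow, total_cost  # no augmenting path remains

        # Find the bottleneck capacity along the path, then push flow.
        push = float("inf")
        v = sink
        while v != source:
            e = prev_edge[v]
            push = min(push, cap[e])
            v = to[e ^ 1]  # tail of edge e
        v = sink
        while v != source:
            e = prev_edge[v]
            cap[e] -= push
            cap[e ^ 1] += push
            v = to[e ^ 1]
        total_flow += push
        total_cost += push * dist[sink]


# Hypothetical toy network: node 0 is the source, node 3 the sink.
toy_edges = [(0, 1, 2, 1), (0, 2, 1, 2), (1, 2, 1, 1), (1, 3, 1, 3), (2, 3, 2, 1)]
flow_value, flow_cost = min_cost_max_flow(4, toy_edges, 0, 3)
```

Because each augmentation runs in polynomial time and integral capacities bound the number of augmentations, a reduction to this problem is a common route to a polynomial-time result of the kind the abstract claims.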
Sufficient Control of Complex Networks
In this paper, we study the sufficient control of complex networks,
which is to control a sufficiently large portion of the network, where only the
quantity of controllable nodes matters. To the best of our knowledge, this is
the first time that such a problem has been investigated. We prove that the
sufficient controllability problem can be converted into a minimum cost flow
problem, for which an algorithm can be easily devised with polynomial
complexity. Further, we study the problem of minimum-cost sufficient control,
which is to drive a sufficiently large subset of the network nodes to any
predefined state with the minimum cost using a given number of controllers. It
is proved that the problem is NP-hard. We propose an ``extended
$L_0$-norm-constraint-based Projected Gradient Method" (eLPGM)
algorithm which may achieve suboptimal solutions for small- or
medium-sized problems. To tackle large-scale problems, we propose to convert the
control problem into a graph algorithm problem, and devise an efficient
low-complexity ``Evenly Divided Control Paths" (EDCP) algorithm to tackle the
graph problem. Simulation results on both synthetic and real-life networks are
provided, demonstrating the satisfactory performance of the proposed methods.
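As background for the graph-algorithmic view of network control, the sketch below computes the classical minimum number of driver nodes for full structural controllability of a directed network, which equals max(N - |M|, 1) for a maximum matching M. This is the standard full-control baseline, not the paper's sufficient-control conversion to minimum cost flow, nor its eLPGM or EDCP algorithms. The function names and toy graphs are assumptions made for illustration.

```python
def maximum_matching_size(n, arcs):
    """Size of a maximum matching of a directed graph on nodes 0..n-1.

    A matching here is a set of arcs in which no two share a tail and no two
    share a head. Computed with simple augmenting paths (Kuhn's algorithm).
    """
    succ = [[] for _ in range(n)]
    for u, v in arcs:
        succ[u].append(v)
    matched_tail = [-1] * n  # matched_tail[v] = u if arc u -> v is matched

    def try_augment(u, seen):
        for v in succ[u]:
            if v in seen:
                continue
            seen.add(v)
            # Take v if it is free, or if its current tail can be rematched.
            if matched_tail[v] == -1 or try_augment(matched_tail[v], seen):
                matched_tail[v] = u
                return True
        return False

    return sum(1 for u in range(n) if try_augment(u, set()))


def min_driver_nodes(n, arcs):
    """Driver nodes needed for full structural controllability."""
    return max(n - maximum_matching_size(n, arcs), 1)


# Hypothetical examples: a directed path and a directed star on 4 nodes.
path_drivers = min_driver_nodes(4, [(0, 1), (1, 2), (2, 3)])
star_drivers = min_driver_nodes(4, [(0, 1), (0, 2), (0, 3)])
```

The path needs a single driver node while the star needs one per unmatched node, which illustrates why the number of controllable nodes per controller depends strongly on topology.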
Tree comparison: enumeration and application to cheminformatics
Graphs are a well-known data structure used in many application domains that rely on relationships between individual entities. Examples are social networks, where users may be friends with each other, road networks, where one-way or bidirectional roads connect crossings, and work package assignments, where workers are assigned to tasks. In chem- and bioinformatics, molecules are often represented as molecular graphs, where vertices represent atoms, and bonds between them are represented by edges connecting the vertices. Since there is an ever-increasing amount of data that can be treated as graphs, fast algorithms are needed to compare such graphs. A well-researched concept for comparing two graphs is the maximum common subgraph. On the one hand, this allows finding substructures that are common to both input graphs. On the other hand, we can derive a similarity score from the maximum common subgraph. A practical application is rational drug design, which involves molecular similarity searches.
In this thesis, we study the maximum common subgraph problem, which entails finding a largest graph that is isomorphic to subgraphs of two input graphs. We focus on restrictions that allow polynomial-time algorithms with a low exponent. An example is the maximum common subtree of two input trees. We succeed in improving the previously best-known time bound. Additionally, we provide a lower time bound under certain assumptions. We study a generalization of the maximum common subtree problem, the block-and-bridge preserving maximum common induced subgraph problem between outerplanar graphs. This problem is motivated by the application to cheminformatics. First, the vast majority of drugs modeled as molecular graphs are outerplanar, and second, the blocks correspond to the ring structures and the bridges to atom chains or linkers. If we allow disconnected common subgraphs, the problem becomes NP-hard even for trees as input. We propose a second generalization of the maximum common subtree problem, which allows skipping vertices in the input trees while maintaining polynomial running time.
Since a maximum common subgraph is not unique in general, we investigate the problem of enumerating all maximum solutions. We do this for both the maximum common subtree problem and the block-and-bridge preserving maximum common induced subgraph problem between outerplanar graphs. An arising subproblem which we analyze is the enumeration of maximum weight matchings in bipartite graphs. We support a weight function between the vertices and edges for all proposed common subgraph methods in this thesis. Thus the objective is to compute a common subgraph of maximum weight. The weights may be integral or real-valued, including negative values. A special case of using such a weight function is computing common subgraph isomorphisms between labeled graphs, where labels between mapped vertices and edges must be equal. An experimental study evaluates the practical running times and the usefulness of our block-and-bridge preserving maximum common induced subgraph algorithm against state-of-the-art algorithms.
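To make the matching subproblem concrete, here is a minimal brute-force sketch that enumerates all maximum weight matchings of a small complete bipartite graph. It is exponential and is not the enumeration algorithm of the thesis. The function names, the weight-matrix format, and the choice to allow unmatched vertices (so that negative weights can be skipped) are assumptions made for illustration. Ties are detected by exact equality, so integer weights are assumed.

```python
def all_matchings(n_left, n_right):
    """Yield every (possibly partial) matching as a tuple of (left, right) pairs."""
    def extend(i, used, current):
        if i == n_left:
            yield tuple(current)
            return
        # Option 1: leave left vertex i unmatched.
        yield from extend(i + 1, used, current)
        # Option 2: match left vertex i to any free right vertex j.
        for j in range(n_right):
            if j not in used:
                used.add(j)
                current.append((i, j))
                yield from extend(i + 1, used, current)
                current.pop()
                used.remove(j)

    yield from extend(0, set(), [])


def max_weight_matchings(weights):
    """Return (best_weight, list of all matchings attaining it).

    weights[i][j] is the integer weight of pairing left vertex i with right
    vertex j. Brute force over all matchings; small inputs only.
    """
    n_left = len(weights)
    n_right = len(weights[0]) if weights else 0
    best, result = None, []
    for matching in all_matchings(n_left, n_right):
        total = sum(weights[i][j] for i, j in matching)
        if best is None or total > best:
            best, result = total, [matching]
        elif total == best:
            result.append(matching)
    return best, result


# Hypothetical inputs: a tie between two perfect matchings, and a case where
# negative weights make a partial matching optimal.
tie_weight, tie_matchings = max_weight_matchings([[1, 1], [1, 1]])
neg_weight, neg_matchings = max_weight_matchings([[3, -1], [-2, -5]])
```

The first input shows why enumeration matters: two different vertex mappings attain the same maximum weight, so a single optimal solution would hide one of them.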