47 research outputs found

    Incremental complexity of a bi-objective hypergraph transversal problem

    The hypergraph transversal problem has been intensively studied, from both a theoretical and a practical point of view. In particular, its incremental complexity is known to be quasi-polynomial in general and polynomial for bounded hypergraphs. Recent applications in computational biology, however, require solving a generalization of this problem, which we call the bi-objective transversal problem. In this case the instance is a pair of hypergraphs (A, B), and the aim is to find minimal sets that hit all the hyperedges of A while intersecting a minimal set of hyperedges of B. In this paper, we formalize this problem, link it to a problem on monotone boolean $\land$-$\lor$ formulae of depth 3, and study its incremental complexity.

    Minimal Conflicting Sets for the Consecutive Ones Property in ancestral genome reconstruction

    A binary matrix has the Consecutive Ones Property (C1P) if its columns can be ordered in such a way that all 1's in each row are consecutive. A Minimal Conflicting Set is a set of rows that does not have the C1P, but every proper subset of which has the C1P. Such submatrices have been considered in comparative genomics applications, but very little is known about their combinatorial structure or about efficient algorithms to compute them. We first describe an algorithm that detects rows that belong to Minimal Conflicting Sets. This algorithm has polynomial time complexity when the number of 1's in each row of the considered matrix is bounded by a constant. Next, we show that the problem of computing all Minimal Conflicting Sets can be reduced to the joint generation of all minimal true clauses and maximal false clauses for some monotone boolean function. We use these methods on simulated data related to ancestral genome reconstruction to show that computing Minimal Conflicting Sets is useful in discriminating between true positive and false positive ancestral syntenies. We also study a dataset of yeast genomes and address the reliability of an ancestral genome proposal of the Saccharomycetaceae yeasts. Comment: 20 pages, 3 figures
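    The two definitions in this abstract can be illustrated with a small brute-force check. This is only a sketch for tiny matrices, not the algorithm described in the paper: it tries every column order, so it is exponential in the number of columns. The function names `has_c1p` and `is_minimal_conflicting_set` are hypothetical, and rows are assumed to be given as sets of the column indices that hold a 1.

```python
from itertools import permutations


def has_c1p(rows, n_cols):
    """Brute-force C1P test: try every column order (tiny inputs only).

    rows   -- list of sets; each set holds the column indices containing a 1
    n_cols -- total number of columns
    """
    for order in permutations(range(n_cols)):
        pos = {col: i for i, col in enumerate(order)}
        # A row is consecutive if its 1's span exactly len(row) positions.
        if all(
            not row
            or max(pos[c] for c in row) - min(pos[c] for c in row) + 1 == len(row)
            for row in rows
        ):
            return True
    return False


def is_minimal_conflicting_set(rows, n_cols):
    """True if rows lack the C1P but every proper subset has it.

    The C1P is inherited by subsets of rows, so it is enough to check
    the subsets obtained by dropping a single row.
    """
    if has_c1p(rows, n_cols):
        return False
    return all(has_c1p(rows[:i] + rows[i + 1:], n_cols) for i in range(len(rows)))


# Three rows forming a "triangle" on three columns conflict as a whole,
# but any two of them can be made consecutive.
triangle = [{0, 1}, {1, 2}, {0, 2}]
print(is_minimal_conflicting_set(triangle, 3))
```

    Adding a further row to `triangle` keeps the matrix conflicting but breaks minimality, since the triangle alone is already a conflicting subset.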

    Logic Integer Programming Models for Signaling Networks

    We propose a static and a dynamic approach to model biological signaling networks, and show how each can be used to answer relevant biological questions. For this we use two different mathematical tools: Propositional Logic and Integer Programming. The power of discrete mathematics for handling qualitative as well as quantitative data has so far not been exploited in Molecular Biology, which is mostly driven by experimental research and relies on first-order or statistical models. The resulting logic statements and integer programs are analyzed and can be solved with standard software. For a restricted class of problems, the logic models reduce to a polynomial-time solvable satisfiability problem. Additionally, a more dynamic model enables enumeration of possible time resolutions in poly-logarithmic time. Computational experiments are included.

    Discovery of the D-basis in binary tables based on hypergraph dualization

    Discovery of (strong) association rules, or implications, is an important task in data management, and it finds application in artificial intelligence, data mining and the semantic web. We introduce a novel approach for the discovery of a specific set of implications, called the D-basis, that provides a representation for a reduced binary table, based on the structure of its Galois lattice. At the core of the method are the D-relation defined in the lattice theory framework, and the hypergraph dualization algorithm that allows us to effectively produce the set of transversals for a given Sperner hypergraph. The latter algorithm, first developed by specialists from Rutgers Center for Operations Research, has already found numerous applications in solving optimization problems in data base theory, artificial intelligence and game theory. One application of the method is for analysis of gene expression data related to a particular phenotypic variable, and some initial testing is done for the data provided by the University of Hawaii Cancer Center.
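    To make "dualization" concrete, the sketch below computes the minimal transversals of a small hypergraph with the classical sequential (Berge-style) method. This is a swapped-in textbook technique chosen for brevity, not the dualization algorithm the abstract refers to, and it can blow up on larger inputs. The names `minimize` and `minimal_transversals` are hypothetical.

```python
def minimize(sets):
    """Keep only the inclusion-minimal sets of a collection."""
    sets = set(sets)
    return {s for s in sets if not any(t < s for t in sets)}


def minimal_transversals(hyperedges):
    """Berge-style dualization: add hyperedges one at a time.

    hyperedges -- iterable of sets of vertices
    Returns the set of inclusion-minimal vertex sets hitting every hyperedge.
    """
    result = {frozenset()}  # the empty set hits an empty family of edges
    for edge in hyperedges:
        candidates = set()
        for t in result:
            if t & edge:
                # t already hits the new edge, keep it unchanged
                candidates.add(t)
            else:
                # otherwise extend t by each vertex of the new edge
                candidates.update(t | {v} for v in edge)
        result = minimize(candidates)
    return result


# A path on vertices 1-2-3-4, viewed as a hypergraph with three edges.
path = [frozenset({1, 2}), frozenset({2, 3}), frozenset({3, 4})]
dual = minimal_transversals(path)
print(sorted(sorted(t) for t in dual))
```

    For a Sperner hypergraph (no hyperedge contains another), applying `minimal_transversals` to the result returns the original hyperedges, which is the sense in which the transversal hypergraph is a dual.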

    Efficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling

    We devise an enumeration method for inclusion-wise minimal hitting sets in hypergraphs. It has delay $O(m^{k^*+1} \cdot n^2)$ and uses linear space. Here, n is the number of vertices, m the number of hyperedges, and k* the rank of the transversal hypergraph. In particular, on classes of hypergraphs for which the cardinality k* of the largest minimal hitting set is bounded, the delay is polynomial. The algorithm solves the extension problem for minimal hitting sets as a subroutine. We show that the extension problem is W[3]-complete when parameterised by the cardinality of the set which is to be extended. For the subroutine, we give an algorithm that is optimal under the exponential time hypothesis. Despite these lower bounds, we provide empirical evidence showing that the enumeration outperforms the theoretical worst-case guarantee on hypergraphs arising in the profiling of relational databases, namely, in the detection of unique column combinations.
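    The link between hitting sets and unique column combinations can be sketched as follows. For every pair of rows, collect the set of columns in which they differ; a set of columns identifies every row uniquely exactly when it hits all of these difference sets. The code is a brute-force illustration on a toy table, not the enumeration method of the paper, and the names `difference_sets` and `minimal_hitting_sets` as well as the table contents are made up for the example.

```python
from itertools import combinations


def difference_sets(table):
    """Hyperedges: for each pair of rows, the columns where they differ."""
    n_cols = len(table[0])
    return {
        frozenset(c for c in range(n_cols) if r1[c] != r2[c])
        for r1, r2 in combinations(table, 2)
    }


def minimal_hitting_sets(edges, vertices):
    """Brute force over subsets by increasing size (tiny inputs only)."""
    found = []
    for k in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), k):
            s = frozenset(cand)
            # A superset of an already found hitting set is not minimal.
            if any(f <= s for f in found):
                continue
            if all(s & e for e in edges):
                found.append(s)
    return found


# Toy relation with columns 0, 1, 2 (hypothetical data).
table = [
    (1, "a", "x"),
    (1, "b", "x"),
    (2, "a", "y"),
]
edges = difference_sets(table)
uccs = minimal_hitting_sets(edges, range(3))
print([sorted(u) for u in uccs])
```

    If two rows of the table are identical, their difference set is empty, no hitting set exists, and the function returns an empty list, matching the fact that no column combination can then be unique.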

    An average study of hypergraphs and their minimal transversals

    In this paper, we study some average properties of hypergraphs and the average complexity of algorithms applied to hypergraphs under different probabilistic models. Our approach is both theoretical and experimental, since our goal is to obtain a random model that is able to capture the real-data complexity. Starting from a model that generalizes the Erdős-Rényi model [9, 10], we obtain asymptotic estimations on the average number of transversals, minimals and minimal transversals in a random hypergraph. We use those results to obtain an upper bound on the average complexity of algorithms to generate the minimal transversals of a hypergraph. Then we make our random model more complex in order to bring it closer to real data and identify cases where the average number of minimal transversals is at most polynomial, quasi-polynomial or exponential.