47 research outputs found
Incremental complexity of a bi-objective hypergraph transversal problem
The hypergraph transversal problem has been intensively studied, from both a
theoretical and a practical point of view. In particular, its incremental
complexity is known to be quasi-polynomial in general and polynomial for
bounded hypergraphs. Recent applications in computational biology, however,
require solving a generalization of this problem, which we call the bi-objective
transversal problem. The instance is in this case composed of a pair of
hypergraphs (A, B), and the aim is to find minimal sets which hit all the
hyperedges of A while intersecting a minimal set of hyperedges of B. In this
paper, we formalize this problem, link it to a problem on monotone boolean
formulae of depth 3 and study its incremental complexity.
Minimal Conflicting Sets for the Consecutive Ones Property in ancestral genome reconstruction
A binary matrix has the Consecutive Ones Property (C1P) if its columns can be
ordered in such a way that all 1's on each row are consecutive. A Minimal
Conflicting Set is a set of rows that does not have the C1P, but every proper
subset has the C1P. Such submatrices have been considered in comparative
genomics applications, but very little is known about their combinatorial
structure and efficient algorithms to compute them. We first describe an
algorithm that detects rows that belong to Minimal Conflicting Sets. This
algorithm has a polynomial time complexity when the number of 1's in each row
of the considered matrix is bounded by a constant. Next, we show that the
problem of computing all Minimal Conflicting Sets can be reduced to the joint
generation of all minimal true clauses and maximal false clauses for some
monotone boolean function. We use these methods on simulated data related to
ancestral genome reconstruction to show that computing Minimal Conflicting Sets
is useful in discriminating between true positive and false positive ancestral
syntenies. We also study a dataset of yeast genomes and address the reliability
of an ancestral genome proposal of the Saccharomycetaceae yeasts. Comment: 20 pages, 3 figures
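The two definitions in this abstract (the C1P and a Minimal Conflicting Set) can be illustrated with a small brute-force sketch. This is not the algorithm described in the paper: it tries every column order, so it is only usable on tiny matrices, and the function names `has_c1p` and `is_minimal_conflicting_set` are hypothetical names chosen for this illustration.

```python
from itertools import permutations


def _ones_consecutive(row):
    """Return True if all 1's in the row form one contiguous block."""
    ones = [i for i, v in enumerate(row) if v == 1]
    return not ones or ones[-1] - ones[0] + 1 == len(ones)


def has_c1p(rows):
    """Brute-force C1P test: try every column order (exponential time)."""
    if not rows:
        return True
    n_cols = len(rows[0])
    for order in permutations(range(n_cols)):
        if all(_ones_consecutive([row[c] for c in order]) for row in rows):
            return True
    return False


def is_minimal_conflicting_set(rows):
    """Rows lack the C1P, but every proper subset of the rows has it.

    Removing rows never destroys the C1P, so it is enough to check
    the subsets obtained by deleting exactly one row.
    """
    if has_c1p(rows):
        return False
    return all(has_c1p(rows[:i] + rows[i + 1:]) for i in range(len(rows)))


# Three rows that pairwise overlap on three columns: no column order works.
triangle = [[1, 1, 0], [0, 1, 1], [1, 0, 1]]
print(has_c1p(triangle))                     # False
print(is_minimal_conflicting_set(triangle))  # True
```

Adding a fourth row to `triangle` keeps the matrix conflicting but breaks minimality, since the original three rows already conflict on their own.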
Logic Integer Programming Models for Signaling Networks
We propose a static and a dynamic approach to model biological signaling
networks, and show how each can be used to answer relevant biological
questions. For this we use the two different mathematical tools of
Propositional Logic and Integer Programming. The power of discrete mathematics
for handling qualitative as well as quantitative data has so far not been
exploited in Molecular Biology, which is mostly driven by experimental
research, relying on first-order or statistical models. The arising logic
statements and integer programs are analyzed and can be solved with standard
software. For a restricted class of problems the logic models reduce to a
polynomial-time solvable satisfiability algorithm. Additionally, a more dynamic
model enables enumeration of possible time resolutions in poly-logarithmic
time. Computational experiments are included.
Discovery of the D-basis in binary tables based on hypergraph dualization
Discovery of (strong) association rules, or implications, is an important
task in data management, and it finds application in artificial intelligence,
data mining and the semantic web. We introduce a novel approach
for the discovery of a specific set of implications, called the D-basis, that provides
a representation for a reduced binary table, based on the structure of
its Galois lattice. At the core of the method are the D-relation defined in
the lattice theory framework, and the hypergraph dualization algorithm that
allows us to effectively produce the set of transversals for a given Sperner hypergraph.
The latter algorithm, first developed by specialists from Rutgers
Center for Operations Research, has already found numerous applications in
solving optimization problems in database theory, artificial intelligence and
game theory. One application of the method is for analysis of gene expression
data related to a particular phenotypic variable, and some initial testing is
done for the data provided by the University of Hawaii Cancer Center.
Efficiently Enumerating Hitting Sets of Hypergraphs Arising in Data Profiling
We devise an enumeration method for inclusion-wise minimal hitting sets in hypergraphs. It has delay $O(m^{k^*+1} \cdot n^2)$ and uses linear space. Here, n is the number of vertices, m the number of hyperedges, and k* the rank of the transversal hypergraph. In particular, on classes of hypergraphs for which the cardinality k* of the largest minimal hitting set is bounded, the delay is polynomial. The algorithm solves the extension problem for minimal hitting sets as a subroutine. We show that the extension problem is W[3]-complete when parameterised by the cardinality of the set which is to be extended. For the subroutine, we give an algorithm that is optimal under the exponential time hypothesis. Despite these lower bounds, we provide empirical evidence showing that the enumeration outperforms the theoretical worst-case guarantee on hypergraphs arising in the profiling of relational databases, namely, in the detection of unique column combinations.
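To make the objects in this abstract concrete, the following is a minimal brute-force sketch of enumerating inclusion-wise minimal hitting sets. It is not the bounded-delay method of the paper: it scans all vertex subsets by increasing size, so it takes exponential time and stores every result. The function name `minimal_hitting_sets` and the example hypergraph are assumptions made for illustration.

```python
from itertools import combinations


def minimal_hitting_sets(vertices, hyperedges):
    """Enumerate inclusion-wise minimal hitting sets by exhaustive search.

    Candidates are visited in order of increasing size, so a hitting set
    is minimal exactly when it contains no previously found hitting set.
    """
    edges = [frozenset(e) for e in hyperedges]
    found = []
    for size in range(len(vertices) + 1):
        for cand in combinations(sorted(vertices), size):
            s = frozenset(cand)
            if any(h <= s for h in found):
                continue  # contains a smaller hitting set, so not minimal
            if all(s & e for e in edges):
                found.append(s)  # intersects every hyperedge
    return found


# A path-like hypergraph on four vertices with three hyperedges.
result = minimal_hitting_sets({1, 2, 3, 4}, [{1, 2}, {2, 3}, {3, 4}])
print([sorted(s) for s in result])  # [[1, 3], [2, 3], [2, 4]]
```

In this example the largest minimal hitting set has two vertices, which corresponds to k* = 2 in the notation of the abstract.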
An average study of hypergraphs and their minimal transversals
In this paper, we study some average properties of hypergraphs and the average complexity of algorithms applied to hypergraphs under different probabilistic models. Our approach is both theoretical and experimental since our goal is to obtain a random model that is able to capture the real-data complexity. Starting from a model that generalizes the Erdős-Rényi model [9, 10], we obtain asymptotic estimations on the average number of transversals, minimals and minimal transversals in a random hypergraph. We use those results to obtain an upper bound on the average complexity of algorithms to generate the minimal transversals of a hypergraph. Then we make our random model more complex in order to bring it closer to real data and identify cases where the average number of minimal transversals is at most polynomial, quasi-polynomial or exponential.