2,683 research outputs found

    Minimum multiple characterization of biological data using partially defined boolean formulas

    In this paper, we address a characterization problem arising in plant biology. We consider several groups of experiments, each corresponding to the identification of a given bacterium with respect to a given set of characters for diagnostic purposes. We must simultaneously compute a complete minimal set of characterization formulas for each group. We propose two approaches, based on Boolean functions, that allow us to study the satisfiability and the underlying complexity of this problem.

    TaBooN -- Boolean Network Synthesis Based on Tabu Search

    Recent developments in omics technologies have revolutionized the investigation of biology by producing molecular data in multiple dimensions and scales. This breakthrough raises the crucial issue of interpreting these data through modelling. In this undertaking, networks provide a suitable framework for modelling the interactions between molecules. A biological network is composed of nodes, which refer to components such as genes or proteins, and edges or arcs, which formalize the interactions between them. The evolution of the interactions is then modelled by defining a dynamical system. Among the different categories of network, the Boolean network offers a reliable qualitative framework for modelling. Automatically synthesizing a Boolean network from experimental data therefore remains a necessary but challenging issue. In this study, we present TaBooN, an original workflow for synthesizing Boolean networks from biological data. The methodology uses the data in the form of Boolean profiles to infer all the potential local formulas. These formulas combine to form the model space, from which the model most faithful to biological knowledge and experiments must be found. In the TaBooN workflow, the selection of the fittest model is achieved by a tabu search algorithm. TaBooN is an automated method for Boolean network inference from experimental data that can also help evaluate and optimize the dynamic behaviour of biological networks, providing a reliable platform for further modelling and predictions.
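
As a rough illustration of the tabu search idea mentioned in this abstract, the sketch below minimizes a cost over bit vectors using single-bit flips and a short tabu list. It is a generic textbook-style sketch, not the TaBooN implementation: the function name `tabu_search`, the neighbourhood, the `tenure` parameter, and the aspiration rule are all assumptions made for illustration.

```python
from collections import deque


def tabu_search(cost, start, iterations=100, tenure=3):
    """Generic tabu search over bit tuples (illustrative, not TaBooN's code).

    cost   : function mapping a tuple of 0/1 values to a number to minimize
    start  : initial tuple of 0/1 values
    tenure : how many recently flipped positions stay forbidden (assumed value)
    """
    current = tuple(start)
    best, best_cost = current, cost(current)
    tabu = deque(maxlen=tenure)  # positions flipped most recently

    for _ in range(iterations):
        candidates = []
        for i in range(len(current)):
            neighbour = current[:i] + (1 - current[i],) + current[i + 1:]
            c = cost(neighbour)
            # Aspiration: a tabu move is still allowed if it beats the best so far.
            if i not in tabu or c < best_cost:
                candidates.append((c, i, neighbour))
        if not candidates:
            break
        # Move to the best admissible neighbour, even if it is worse than current.
        c, i, current = min(candidates)
        tabu.append(i)
        if c < best_cost:
            best, best_cost = current, c
    return best, best_cost


# Toy usage: recover a hidden bit pattern by minimizing the Hamming distance.
target = (1, 0, 1, 1, 0)
best, best_cost = tabu_search(
    lambda x: sum(a != b for a, b in zip(x, target)), (0, 0, 0, 0, 0)
)
```

In a model-selection setting the bit vector would presumably encode a choice of local formulas and the cost a mismatch with the data, but that encoding is not specified in the abstract.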

    Efficient computational strategies to learn the structure of probabilistic graphical models of cumulative phenomena

    Structural learning of Bayesian Networks (BNs) is an NP-hard problem, which is further complicated by many theoretical issues, such as the I-equivalence among different structures. In this work, we focus on a specific subclass of BNs, named Suppes-Bayes Causal Networks (SBCNs), which include specific structural constraints based on Suppes' probabilistic causation to efficiently model cumulative phenomena. Here we compare, via extensive simulations, the performance of various state-of-the-art search strategies, such as local search techniques and Genetic Algorithms, as well as of distinct regularization methods. The assessment is performed on a large number of simulated datasets from topologies with distinct levels of complexity, various sample sizes and different rates of errors in the data. Among the main results, we show that the introduction of Suppes' constraints dramatically improves the inference accuracy, by reducing the solution space and providing a temporal ordering on the variables. We also report on trade-offs among different search techniques that can be efficiently employed in distinct experimental settings. This manuscript is an extended version of the paper "Structural Learning of Probabilistic Graphical Models of Cumulative Phenomena" presented at the 2018 International Conference on Computational Science.

    Algebraic Geometry Arising from Discrete Models of Gene Regulatory Networks

    Discrete models of gene regulatory networks have gained popularity in computational systems biology over the last dozen years. However, not all discrete network models reflect the behaviors of real biological systems. In this work, we focus on two model selection methods and the algebraic geometry arising from them. The first model selection method involves biologically relevant functions. We begin by introducing k-canalizing functions, a generalization of nested canalizing functions. We extend results on nested canalizing functions and derive a unique extended monomial form of arbitrary Boolean functions. This gives us a stratification of the set of n-variable Boolean functions by canalizing depth. We obtain closed formulas for the number of n-variable Boolean functions with depth k, which simultaneously generalize enumeration formulas for canalizing and nested canalizing functions. We characterize the set of k-canalizing functions as an algebraic variety in $\mathbb{F}_2^{2^n}$. Next, we propose a method for the reverse engineering of networks of k-canalizing functions using techniques from computational algebra, based on our parametrization of k-canalizing functions. We also analyze binary decision diagrams of k-canalizing functions. The second model selection method involves computing minimal polynomial models using Gröbner bases. We build up the connection between staircases and Gröbner bases. We provide a necessary and sufficient condition for the ideal I(V) to have a unique reduced Gröbner basis, using the concept of a basic staircase. We also provide a sufficient combinatorial characterization of $V \subset N^n_p$ that yields a unique reduced Gröbner basis.
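
The notion of canalizing depth in this abstract can be made concrete with a small sketch. It assumes the usual definition, under which a non-constant Boolean function is canalizing in a variable if fixing that variable to some value forces a constant output. The hypothetical helper `canalizing_depth` peels off such variables greedily from a truth table; the names and the truth-table representation are illustrative assumptions, not taken from the thesis.

```python
from itertools import product


def truth_table(f, n):
    """Tabulate an n-argument Boolean function as {input tuple: 0 or 1}."""
    return {x: int(bool(f(*x))) for x in product((0, 1), repeat=n)}


def restrict(table, i, a):
    """Fix input i to value a and drop that coordinate from the keys."""
    return {x[:i] + x[i + 1:]: v for x, v in table.items() if x[i] == a}


def is_constant(table):
    return len(set(table.values())) == 1


def canalizing_depth(table):
    """Greedily count how many canalizing variables can be peeled off."""
    depth = 0
    while not is_constant(table):
        n = len(next(iter(table)))
        for i, a in product(range(n), (0, 1)):
            if is_constant(restrict(table, i, a)):
                # x_i = a forces the output, so continue on the branch x_i != a.
                table = restrict(table, i, 1 - a)
                depth += 1
                break
        else:
            break  # no canalizing variable remains
    return depth


and3 = truth_table(lambda x, y, z: x and y and z, 3)       # nested canalizing
xor2 = truth_table(lambda x, y: x != y, 2)                 # not canalizing
mixed = truth_table(lambda x, y, z: x and (y != z), 3)     # one canalizing layer
```

Under these assumptions, `canalizing_depth` gives 3 for `and3`, 0 for `xor2`, and 1 for `mixed`, which matches the intuition that depth measures how far a function is from being nested canalizing.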

    Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior

    Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers who face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since in this task the number of patterns generated may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of analyzing the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics.
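
To make the idea of a pattern more tangible, the sketch below checks whether a conjunction of literals is a positive pattern in the common LAD sense: it covers at least one positive observation and no negative observation. The encoding of a term as a dictionary from variable index to required value, and the function names, are assumptions for illustration only; the abstract does not describe the authors' data structures or their pattern-selection techniques.

```python
def covers(term, point):
    """True if the binary observation satisfies every literal of the term.

    term  : dict mapping variable index -> required value (0 or 1)
    point : tuple of 0/1 values
    """
    return all(point[i] == v for i, v in term.items())


def is_positive_pattern(term, positives, negatives):
    """A term is a (pure) positive pattern if it covers some positive
    observation and no negative observation."""
    return any(covers(term, p) for p in positives) and not any(
        covers(term, q) for q in negatives
    )


# Toy data: three binary explanatory variables, two classes.
positives = [(1, 0, 1), (1, 1, 1)]
negatives = [(0, 0, 1), (1, 0, 0)]

good = is_positive_pattern({0: 1, 2: 1}, positives, negatives)  # x0 AND x2
bad = is_positive_pattern({0: 1}, positives, negatives)         # x0 alone
```

Here `good` is true and `bad` is false, because the single literal x0 also covers the negative observation (1, 0, 0).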

    An extensive English language bibliography on graph theory and its applications

    Bibliography on graph theory and its application