Minimum multiple characterization of biological data using partially defined boolean formulas
In this paper, we address a characterization problem arising in plant biology. We consider different groups of experiments, each corresponding to the identification of a given bacterium with regard to a given set of characters for diagnostic purposes. We must simultaneously compute a complete minimal set of characterization formulas for each group. We propose two approaches, based on Boolean functions, that allow us to study the satisfiability and the underlying complexity of this problem.
TaBooN -- Boolean Network Synthesis Based on Tabu Search
Recent developments in omics technologies have revolutionized the investigation of
biology by producing molecular data in multiple dimensions and scales. This
breakthrough in biology raises the crucial issue of their interpretation based
on modelling. In this undertaking, networks provide a suitable framework for
modelling the interactions between molecules. Basically, a biological network is
composed of nodes referring to components such as genes or proteins, and
edges/arcs formalizing the interactions between them. The evolution of the
interactions is then modelled by the definition of a dynamical system. Among
the different categories of network, the Boolean network offers a reliable
qualitative framework for the modelling. Automatically synthesizing a Boolean
network from experimental data therefore remains a necessary but challenging
issue. In this study, we present TaBooN, an original workflow for synthesizing
Boolean networks from biological data. The methodology uses data in the
form of Boolean profiles to infer all potential local formulas.
These combine to form the model space from which the model most faithful
to biological knowledge and experiments must be found. In
the TaBooN workflow, the selection of the fittest model is achieved by a
tabu search algorithm. TaBooN is an automated method for Boolean network
inference from experimental data that can also help evaluate and optimize
the dynamic behaviour of biological networks, providing a reliable platform
for further modelling and predictions.
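The selection step described above, picking one candidate formula per node so that the resulting model fits the data, can be illustrated with a generic tabu search. This is a minimal sketch under stated assumptions, not the TaBooN implementation: the function `tabu_search`, the toy two-node network, the candidate formulas, the `mismatch` score, and the `tenure` parameter are all hypothetical choices made for illustration.

```python
from collections import deque

def tabu_search(candidates, score, iterations=50, tenure=3):
    """Minimise score(model), where model holds one candidate index per node."""
    current = [0] * len(candidates)
    best, best_score = list(current), score(current)
    tabu = deque(maxlen=tenure)  # recently abandoned (node, choice) pairs
    for _ in range(iterations):
        moves = []
        for node, options in enumerate(candidates):
            for choice in range(len(options)):
                if choice == current[node]:
                    continue
                neighbour = list(current)
                neighbour[node] = choice
                s = score(neighbour)
                # Aspiration: a tabu move is allowed only if it beats the best so far.
                if (node, choice) in tabu and s >= best_score:
                    continue
                moves.append((s, node, choice, neighbour))
        if not moves:
            break
        s, node, choice, neighbour = min(moves, key=lambda m: m[0])
        tabu.append((node, current[node]))  # forbid undoing this move for a while
        current = neighbour
        if s < best_score:
            best, best_score = list(neighbour), s
    return best, best_score

# Hypothetical toy data: observed (state, next state) pairs of a 2-node network.
transitions = [((0, 0), (1, 0)), ((0, 1), (0, 0)),
               ((1, 0), (1, 1)), ((1, 1), (0, 1))]

# Hypothetical candidate local formulas for each node.
candidates = [
    [lambda s: s[0], lambda s: s[1], lambda s: not s[1]],
    [lambda s: s[1], lambda s: s[0], lambda s: s[0] and s[1]],
]

def mismatch(model):
    """Count node updates that disagree with the observed transitions."""
    return sum(int(bool(candidates[node][choice](s))) != t[node]
               for s, t in transitions
               for node, choice in enumerate(model))

best_model, best_cost = tabu_search(candidates, mismatch)
```

The tabu list here stores the choice a node just left, so the search cannot immediately step back to it; a real workflow would likely use a richer score that also reflects biological knowledge.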
Efficient computational strategies to learn the structure of probabilistic graphical models of cumulative phenomena
Structural learning of Bayesian Networks (BNs) is an NP-hard problem, which is
further complicated by many theoretical issues, such as the I-equivalence among
different structures. In this work, we focus on a specific subclass of BNs,
named Suppes-Bayes Causal Networks (SBCNs), which include specific structural
constraints based on Suppes' probabilistic causation to efficiently model
cumulative phenomena. Here we compare the performance, via extensive
simulations, of various state-of-the-art search strategies, such as local
search techniques and Genetic Algorithms, as well as of distinct regularization
methods. The assessment is performed on a large number of simulated datasets
from topologies with distinct levels of complexity, various sample sizes and
different rates of errors in the data. Among the main results, we show that the
introduction of Suppes' constraints dramatically improves the inference
accuracy, by reducing the solution space and providing a temporal ordering on
the variables. We also report on trade-offs among different search techniques
that can be efficiently employed in distinct experimental settings. This
manuscript is an extended version of the paper "Structural Learning of
Probabilistic Graphical Models of Cumulative Phenomena" presented at the 2018
International Conference on Computational Science.
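To make the idea of pruning the solution space concrete, the sketch below filters candidate edges with two Suppes-style conditions on binary data: temporal priority and probability raising. It rests on assumptions not stated in the abstract: temporal priority is approximated by comparing marginal frequencies (a more frequent event is taken to occur earlier), and the function name `suppes_edges` and the toy samples are hypothetical.

```python
def suppes_edges(samples):
    """Return ordered pairs (c, e) that pass both Suppes-style filters.

    samples: list of equal-length rows of 0/1 values, one column per event.
    """
    n = len(samples)
    m = len(samples[0])
    edges = []
    for c in range(m):
        for e in range(m):
            if c == e:
                continue
            n_c = sum(row[c] for row in samples)
            n_e = sum(row[e] for row in samples)
            n_ce = sum(row[c] and row[e] for row in samples)
            if n_c == 0 or n_c == n:
                continue  # P(e | c) or P(e | not c) is undefined
            # Assumed proxy for temporal priority: P(c) > P(e).
            temporal = n_c > n_e
            # Probability raising, P(e | c) > P(e | not c), cross-multiplied
            # to stay in integer arithmetic.
            raising = n_ce * (n - n_c) > (n_e - n_ce) * n_c
            if temporal and raising:
                edges.append((c, e))
    return edges

# Hypothetical toy data: event 1 only ever occurs together with event 0.
samples = [
    [1, 1, 0],
    [1, 1, 1],
    [1, 0, 0],
    [1, 0, 1],
    [0, 0, 0],
    [0, 0, 1],
]
allowed = suppes_edges(samples)
```

Only the pairs that survive this filter would be offered to the structure search, which is one way to read the claim that the constraints shrink the solution space.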
Algebraic Geometry Arising from Discrete Models of Gene Regulatory Networks
Discrete models of gene regulatory networks have gained popularity in computational systems biology over the last dozen years. However, not all discrete network models reflect the behaviors of real biological systems. In this work, we focus on two model selection methods and the algebraic geometry arising from them. The first model selection method involves biologically relevant functions. We begin by introducing k-canalizing functions, a generalization of nested canalizing functions. We extend results on nested canalizing functions and derive a unique extended monomial form of arbitrary Boolean functions. This gives us a stratification of the set of n-variable Boolean functions by canalizing depth. We obtain closed formulas for the number of n-variable Boolean functions with depth k, which simultaneously generalize enumeration formulas for canalizing and nested canalizing functions. We characterize the set of k-canalizing functions as an algebraic variety in $\mathbb{F}_2^{2^n}$. Next, we propose a method for the reverse engineering of networks of k-canalizing functions using techniques from computational algebra, based on our parametrization of k-canalizing functions. We also analyze binary decision diagrams of k-canalizing functions. The second model selection method involves computing minimal polynomial models using Gröbner bases. We build up the connection between staircases and Gröbner bases. We provide a necessary and sufficient condition for the ideal I(V) to have a unique reduced Gröbner basis, using the concept of a basic staircase. We also provide a sufficient combinatorial characterization of $V \subset \mathbb{N}_p^n$ that yields a unique reduced Gröbner basis.
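The notion of canalizing depth can be illustrated by brute force on small truth tables. The sketch below assumes the usual definition: a variable is canalizing if fixing it to some value forces the output, and the depth counts how many variables can be peeled off this way in sequence. The greedy peeling relies on the depth being independent of the order of peeling, which is what a unique extended monomial form suggests. The helper names `truth_table`, `restrict`, and `canalizing_depth` are hypothetical, and this is not the algebraic method of the work itself.

```python
from itertools import product

def truth_table(func, n):
    """Tabulate a Python predicate on n bits as {input tuple: 0 or 1}."""
    return {x: int(bool(func(*x))) for x in product((0, 1), repeat=n)}

def restrict(table, i, a):
    """Fix coordinate i to value a and drop it from the key tuples."""
    return {x[:i] + x[i + 1:]: y for x, y in table.items() if x[i] == a}

def canalizing_depth(table):
    """Greedily peel canalizing variables and return how many were peeled."""
    depth = 0
    while True:
        if len(set(table.values())) == 1:  # constant: nothing left to peel
            return depth
        n = len(next(iter(table)))
        for i, a in product(range(n), (0, 1)):
            if len(set(restrict(table, i, a).values())) == 1:
                # x_i = a forces the output; continue on the branch x_i != a.
                table = restrict(table, i, 1 - a)
                depth += 1
                break
        else:
            return depth  # no canalizing variable remains

and3 = truth_table(lambda x, y, z: x and y and z, 3)
xor2 = truth_table(lambda x, y: x != y, 2)
mixed = truth_table(lambda x, y, z: x and (y != z), 3)
depths = [canalizing_depth(t) for t in (and3, xor2, mixed)]
```

Here a three-variable AND is nested canalizing (every variable peels), XOR has no canalizing variable, and the mixed function peels exactly one variable before an XOR core remains. The enumeration is exponential in n, so it is only meant for small examples.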
Logical analysis of data as a tool for the analysis of probabilistic discrete choice behavior
Probabilistic Discrete Choice Models (PDCM) have been extensively used to interpret the behavior of heterogeneous decision makers that face discrete alternatives. The classification approach of Logical Analysis of Data (LAD) uses discrete optimization to generate patterns, which are logic formulas characterizing the different classes. Patterns can be seen as rules explaining the phenomenon under analysis. In this work we discuss how LAD can be used as the first phase of the specification of PDCM. Since in this task the number of patterns generated may be extremely large, and many of them may be nearly equivalent, additional processing is necessary to obtain practically meaningful information. Hence, we propose computationally viable techniques to obtain small sets of patterns that constitute meaningful representations of the phenomenon and allow the discovery of significant associations between subsets of explanatory variables and the output. We consider the complex socio-economic problem of analyzing the utilization of the Internet in Italy, using real data gathered by the Italian National Institute of Statistics.
An extensive English language bibliography on graph theory and its applications
Bibliography on graph theory and its application