8,829 research outputs found
Inference of Disease-Related Molecular Logic from Systems-Based Microarray Analysis
Computational analysis of gene expression data from microarrays has been useful for medical diagnosis and prognosis. The ability to analyze such data at the level of biological modules, rather than individual genes, has been recognized as important for improving our understanding of disease-related pathways. It has proved difficult, however, to infer pathways from microarray data by deriving modules of multiple synergistically interrelated genes, rather than individual genes. Here we propose a systems-based approach called Entropy Minimization and Boolean Parsimony (EMBP) that identifies, directly from gene expression data, modules of genes that are jointly associated with disease. Furthermore, the technique provides insight into the underlying biomolecular logic by inferring a logic function connecting the joint expression levels in a gene module with the outcome of disease. Coupled with biological knowledge, this information can be useful for identifying disease-related pathways, suggesting potential therapeutic approaches for interfering with the functions of such pathways. We present an example providing such gene modules associated with prostate cancer from publicly available gene expression data, and we successfully validate the results on additional independently derived data. Our results indicate a link between prostate cancer and cellular damage from oxidative stress combined with inhibition of apoptotic mechanisms normally triggered by such damage
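To illustrate the entropy-minimization idea only, the following is a minimal sketch and not the authors' EMBP implementation. It assumes expression levels have already been binarized, and it scores a candidate gene module by the conditional entropy of the disease label given the joint state of the module. The helper names (`conditional_entropy`, `best_module`) and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations
from math import log2

def conditional_entropy(samples, genes):
    """H(disease | joint binary state of `genes`) in bits."""
    joint = Counter()         # counts of (module state, disease label)
    state_counts = Counter()  # counts of module state alone
    for expr, label in samples:
        state = tuple(expr[g] for g in genes)
        joint[(state, label)] += 1
        state_counts[state] += 1
    n = len(samples)
    h = 0.0
    for (state, _label), c in joint.items():
        h -= (c / n) * log2(c / state_counts[state])
    return h

def best_module(samples, gene_names, size):
    """Exhaustively pick the module of `size` genes with minimal entropy."""
    return min(combinations(gene_names, size),
               key=lambda m: conditional_entropy(samples, m))

# Hypothetical toy data: disease = g1 AND NOT g2; g3 is unrelated.
samples = [
    ({"g1": 0, "g2": 0, "g3": 0}, 0),
    ({"g1": 0, "g2": 1, "g3": 1}, 0),
    ({"g1": 1, "g2": 0, "g3": 1}, 1),
    ({"g1": 1, "g2": 1, "g3": 0}, 0),
    ({"g1": 1, "g2": 0, "g3": 0}, 1),
    ({"g1": 0, "g2": 0, "g3": 1}, 0),
]

module = best_module(samples, ["g1", "g2", "g3"], 2)
```

In this toy data neither `g1` nor `g2` alone determines the label, but the pair does, which is the kind of joint association the abstract describes. Exhaustive search is used here purely for clarity and would not scale to genome-wide data.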
Spatial Analysis of Expression Patterns Predicts Genetic Interactions at the Mid-Hindbrain Boundary
The isthmic organizer mediating differentiation of mid- and hindbrain during vertebrate development is characterized by a well-defined pattern of locally restricted gene expression domains around the mid-hindbrain boundary (MHB). This pattern is established and maintained by a regulatory network between several transcription and secreted factors that is not yet understood in full detail. In this contribution we show that a Boolean analysis of the characteristic spatial gene expression patterns at the murine MHB reveals key regulatory interactions in this network. Our analysis employs techniques from computational logic for the minimization of Boolean functions. This approach also allows us to predict the interplay of the various regulatory interactions. In particular, we predict a maintaining, rather than inducing, effect of Fgf8 on Wnt1 expression, an issue that remained unclear from published data. Using mouse anterior neural plate/tube explant cultures, we provide experimental evidence that Fgf8 in fact only maintains but does not induce ectopic Wnt1 expression in these explants. In combination with previously validated interactions, this finding allows for the construction of a regulatory network between key transcription and secreted factors at the MHB. Analyses of Boolean, differential equation and reaction-diffusion models of this network confirm that it is indeed able to explain the stable maintenance of the MHB as well as time-courses of expression patterns under both wild-type and various knock-out conditions. In conclusion, we demonstrate that spatial expression patterns, like temporal ones, can be used to gain information about the structure of regulatory networks. We show, in particular, that the spatial gene expression patterns around the MHB help us to understand the maintenance of this boundary on a systems level
Soft topographic map for clustering and classification of bacteria
In this work a new method for clustering and building a topographic representation of a bacteria taxonomy is presented. The method is based on the analysis of stable parts of the genome, the so-called “housekeeping genes”. The proposed method generates topographic maps of the bacteria taxonomy, where relations among different type strains can be visually inspected and verified. Two well-known DNA alignment algorithms are applied to the genomic sequences. Topographic maps are optimized to represent the similarity among the sequences according to their evolutionary distances. The experimental analysis is carried out on 147 type strains of the Gammaproteobacteria class by means of the 16S rRNA housekeeping gene. Complete sequences of the gene have been retrieved from the NCBI public database. In the experimental tests the maps show clusters of homologous type strains and present some singular cases potentially due to incorrect classification or erroneous annotations in the database
Revisiting the Training of Logic Models of Protein Signaling Networks with a Formal Approach based on Answer Set Programming
A fundamental question in systems biology is the construction of mathematical
models and their training to data. Logic formalisms have become very popular to model
signaling networks because their simplicity allows us to model large systems
encompassing hundreds of proteins. An approach to train (Boolean) logic models
to high-throughput phospho-proteomics data was recently introduced and solved
using optimization heuristics based on stochastic methods. Here we demonstrate
how this problem can be solved using Answer Set Programming (ASP), a
declarative problem solving paradigm, in which a problem is encoded as a
logical program such that its answer sets represent solutions to the problem.
ASP offers significant improvements over heuristic methods in terms of efficiency
and scalability: it guarantees global optimality of solutions and
provides a complete set of solutions. We illustrate the application of ASP with
in silico cases based on realistic networks and data
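The two properties claimed for the ASP approach (global optimality and a complete set of optimal solutions) can be illustrated on a deliberately tiny problem. The sketch below is not ASP and is not the authors' encoding; it is a brute-force stand-in that enumerates every two-input Boolean gate and returns all gates with minimal mismatch to the data. The function name `fit_gates` and the observations are hypothetical.

```python
from itertools import product

def fit_gates(observations):
    """Return (minimal error, all two-input truth tables achieving it)."""
    inputs = list(product((0, 1), repeat=2))
    best_err, best = None, []
    # Enumerate all 16 possible truth tables for a two-input gate.
    for outputs in product((0, 1), repeat=4):
        table = dict(zip(inputs, outputs))
        err = sum(table[(a, b)] != y for a, b, y in observations)
        if best_err is None or err < best_err:
            best_err, best = err, [table]
        elif err == best_err:
            best.append(table)
    return best_err, best

# Hypothetical readouts: (input a, input b, measured target state).
# The input combination (1, 0) was never observed.
observations = [(0, 0, 0), (0, 1, 1), (1, 1, 1)]

error, models = fit_gates(observations)
```

Because one input combination is unobserved, two gates fit the data equally well, and returning both is what a complete set of solutions means here. Exhaustive enumeration grows doubly exponentially with the number of inputs, which is why a declarative solver is attractive for realistic networks.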
Attribute Exploration of Discrete Temporal Transitions
Discrete temporal transitions occur in a variety of domains, but this work is
mainly motivated by applications in molecular biology: explaining and analyzing
observed transcriptome and proteome time series by literature and database
knowledge. The starting point of a formal concept analysis model is presented.
The objects of a formal context are states of the interesting entities, and the
attributes are the variable properties defining the current state (e.g.
observed presence or absence of proteins). Temporal transitions assign a
relation to the objects, defined by deterministic or non-deterministic
transition rules between sets of pre- and postconditions. This relation can be
generalized to its transitive closure, i.e. states are related if one results
from the other by a transition sequence of arbitrary length. The focus of the
work is the adaptation of the attribute exploration algorithm to such a
relational context, so that questions concerning temporal dependencies can be
asked during the exploration process and be answered from the computed stem
base. Results are given for the abstract example of a game and a small gene
regulatory network relevant to a biomedical question.
Comment: Only the email address and reference have been replaced
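The transitive closure mentioned in this abstract can be sketched in a few lines. This is a minimal illustration under assumptions, not the paper's attribute exploration algorithm: states are modeled as frozensets of the attributes present, the transition relation is a dictionary from each state to its direct successors, and the names `transitive_closure`, `p1`, and `p2` are hypothetical.

```python
def transitive_closure(transitions):
    """Map each state to every state reachable by one or more transitions."""
    closure = {}
    for start in transitions:
        seen, stack = set(), list(transitions[start])
        while stack:
            state = stack.pop()
            if state not in seen:
                seen.add(state)
                stack.extend(transitions.get(state, ()))
        closure[start] = seen
    return closure

# Hypothetical states: which proteins are observed as present.
A = frozenset({"p1"})
B = frozenset({"p1", "p2"})
C = frozenset({"p2"})

# Direct (one-step) transitions, possibly non-deterministic in general.
transitions = {A: {B}, B: {C}, C: set()}

reachable = transitive_closure(transitions)
```

Here `C` is related to `A` only through the closure, since it results from `A` by a transition sequence of length two.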
Dynamic optimization of metabolic networks coupled with gene expression
The regulation of metabolic activity by tuning enzyme expression levels is
crucial to sustain cellular growth in changing environments. Metabolic networks
are often studied at steady state using constraint-based models and
optimization techniques. However, metabolic adaptations driven by changes in
gene expression cannot be analyzed by steady state models, as these do not
account for temporal changes in biomass composition. Here we present a dynamic
optimization framework that integrates the metabolic network with the dynamics
of biomass production and composition, explicitly taking into account enzyme
production costs and enzymatic capacity. In contrast to the established dynamic
flux balance analysis, our approach allows predicting dynamic changes in both
the metabolic fluxes and the biomass composition during metabolic adaptations.
We applied our algorithm in two case studies: a minimal nutrient uptake
network, and an abstraction of core metabolic processes in bacteria. In the
minimal model, we show that the optimized uptake rates reproduce the empirical
Monod growth for bacterial cultures. For the network of core metabolic
processes, the dynamic optimization algorithm predicted commonly observed
metabolic adaptations, such as a diauxic switch with a preference ranking for
different nutrients, re-utilization of waste products after depletion of the
original substrate, and metabolic adaptation to an impending nutrient
depletion. These examples illustrate how dynamic adaptations of enzyme
expression can be predicted solely from an optimization principle
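For reference, the empirical Monod growth law that the minimal model is said to reproduce relates the specific growth rate to the substrate concentration as mu = mu_max * S / (K_s + S). The sketch below only evaluates this standard formula and is not the dynamic optimization framework itself. The function name and parameter values are hypothetical.

```python
def monod_growth_rate(substrate, mu_max, k_s):
    """Specific growth rate under the Monod law.

    substrate: limiting substrate concentration S
    mu_max:    maximal specific growth rate
    k_s:       half-saturation constant (S at which the rate is mu_max / 2)
    """
    return mu_max * substrate / (k_s + substrate)

# Hypothetical parameter values, chosen only for illustration.
rate_at_half_saturation = monod_growth_rate(substrate=0.5, mu_max=1.0, k_s=0.5)
rate_without_substrate = monod_growth_rate(substrate=0.0, mu_max=1.0, k_s=0.5)
```

The rate rises roughly linearly at low substrate concentrations and saturates toward `mu_max` at high concentrations.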
Identification of SNP interactions using logic regression
Interactions of single nucleotide polymorphisms (SNPs) are assumed to be responsible for complex diseases such as sporadic breast cancer. Important goals of studies concerned with such genetic data are thus to identify combinations of SNPs that lead to a higher risk of developing a disease and to measure the importance of these interactions. There are many approaches based on classification methods such as CART and Random Forests that allow measuring the importance of single variables. None of these methods, however, can directly quantify the importance of combinations of variables. In this paper, we show how logic regression can be employed to identify SNP interactions explanatory for the disease status in a case-control study and propose two measures for quantifying the importance of these interactions for classification. These approaches are then applied, on the one hand, to simulated data sets, and on the other hand, to the SNP data of the GENICA study, a study dedicated to the identification of genetic and gene-environment interactions associated with sporadic breast cancer.
Keywords: Single Nucleotide Polymorphism, Feature Selection, Variable Importance Measure, GENICA
Coherent frequentism
By representing the range of fair betting odds according to a pair of
confidence set estimators, dual probability measures on parameter space called
frequentist posteriors secure the coherence of subjective inference without any
prior distribution. The closure of the set of expected losses corresponding to
the dual frequentist posteriors constrains decisions without arbitrarily
forcing optimization under all circumstances. This decision theory reduces to
those that maximize expected utility when the pair of frequentist posteriors is
induced by an exact or approximate confidence set estimator or when an
automatic reduction rule is applied to the pair. In such cases, the resulting
frequentist posterior is coherent in the sense that, as a probability
distribution of the parameter of interest, it satisfies the axioms of the
decision-theoretic and logic-theoretic systems typically cited in support of
the Bayesian posterior. Unlike the p-value, the confidence level of an interval
hypothesis derived from such a measure is suitable as an estimator of the
indicator of hypothesis truth since it converges in sample-space probability to
1 if the hypothesis is true or to 0 otherwise under general conditions.
Comment: The confidence-measure theory of inference and decision is explicitly
extended to vector parameters of interest. The derivation of upper and lower
confidence levels from valid and nonconservative set estimators is formalized