113 research outputs found
Efficient Bayesian estimates for discrimination among topologically different systems biology models
A major effort in systems biology is the development of mathematical models that describe complex biological systems at multiple scales and levels of abstraction. Determining the topology (the set of interactions) of a biological system from observations of the system's behavior is an important and difficult problem. Here we present and demonstrate new methodology for efficiently computing the probability distribution over a set of topologies based on consistency with existing measurements. Key features of the new approach include derivation in a Bayesian framework, incorporation of prior probability distributions of topologies and parameters, and use of an analytically integrable linearization based on the Fisher information matrix that is responsible for large gains in efficiency. The new method was demonstrated on a collection of four biological topologies representing a kinase and phosphatase that operate in opposition to each other with either processive or distributive kinetics, giving 8–12 parameters for each topology. The linearization produced an approximate result very rapidly (CPU minutes) that was highly accurate on its own, as compared to a Monte Carlo method guaranteed to converge to the correct answer but at greater cost (CPU weeks). The Monte Carlo method developed and applied here used the linearization method as a starting point and importance sampling to approach the Bayesian answer in acceptable time. Other inexpensive methods to estimate probabilities produced poor approximations for this system, with likelihood estimation showing its well-known bias toward topologies with more parameters and the Akaike and Schwarz Information Criteria showing a strong bias toward topologies with fewer parameters. These results suggest that this linear approximation may be an effective compromise, providing an answer whose accuracy is near the true Bayesian answer, but at a cost near the common heuristics.
National Cancer Institute (U.S.) (U54 CA112967); National University of Singapore
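The final step of the comparison described in this abstract, turning one evidence value per topology into a probability distribution over topologies, can be illustrated with a minimal sketch. It assumes the log marginal likelihoods have already been obtained by some means (linearization, Monte Carlo, or otherwise). The function names and all numbers are hypothetical and are not taken from the paper. The Akaike and Schwarz criteria mentioned above are included in their standard textbook forms for comparison.

```python
import math

def topology_posterior(log_evidence, prior=None):
    """Posterior P(topology | data) from log marginal likelihoods (Bayes' rule)."""
    n = len(log_evidence)
    prior = prior or [1.0 / n] * n          # assumption: uniform prior if none given
    log_post = [le + math.log(p) for le, p in zip(log_evidence, prior)]
    m = max(log_post)                        # subtract the max for numerical stability
    weights = [math.exp(lp - m) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

def aic(max_log_lik, k):
    """Akaike Information Criterion for a model with k parameters."""
    return 2 * k - 2 * max_log_lik

def bic(max_log_lik, k, n_obs):
    """Schwarz (Bayesian) Information Criterion for k parameters and n_obs data points."""
    return k * math.log(n_obs) - 2 * max_log_lik

# Hypothetical log evidences for four candidate topologies.
print(topology_posterior([-52.1, -50.3, -55.0, -51.7]))
```

Working in log space with the maximum subtracted keeps the normalization stable even when evidences differ by many orders of magnitude, which is typical when models have 8 to 12 parameters.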
Efficient Calculation of Molecular Configurational Entropies Using an Information Theoretic Approximation
Accurate computation of free energy changes upon molecular binding remains a challenging problem, and changes in configurational entropy are especially difficult due to the potentially large numbers of local minima, anharmonicity, and high-order coupling among degrees of freedom. Here we propose a new method to compute molecular entropies based on the maximum information spanning tree (MIST) approximation that we have previously developed. Estimates of high-order couplings using only low-order terms provide excellent convergence properties, and the theory is also guaranteed to bound the entropy. The theory is presented together with applications to the calculation of the entropies of a variety of small molecules and the binding entropy change for a series of HIV protease inhibitors. The MIST framework developed here is demonstrated to compare favorably with results computed using the related mutual information expansion (MIE) approach, and an analysis of similarities between the methods is presented.
National Institutes of Health (U.S.) (GM065418); National Institutes of Health (U.S.) (GM082209); National Science Foundation (U.S.) (0821391)
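To make the tree idea in this abstract concrete, the following is a minimal sketch of a second-order spanning-tree entropy estimate for discrete samples: the sum of the single-variable entropies minus the pairwise mutual informations along a maximum-weight spanning tree. It is an illustration under simplifying assumptions (plug-in estimators, already-discretized variables, second order only), not the authors' implementation. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable outcomes."""
    n = len(samples)
    return -sum((c / n) * math.log(c / n) for c in Counter(samples).values())

def mutual_information(x, y):
    """Plug-in mutual information I(X;Y) = H(X) + H(Y) - H(X,Y)."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def tree_entropy(columns):
    """Sum of marginal entropies minus MI along a maximum spanning tree."""
    d = len(columns)
    h_sum = sum(entropy(c) for c in columns)
    mi = {(i, j): mutual_information(columns[i], columns[j])
          for i, j in combinations(range(d), 2)}
    in_tree = {0}
    tree_mi = 0.0
    while len(in_tree) < d:
        # Prim's algorithm: take the heaviest edge that crosses the current cut.
        (i, j), weight = max(
            ((e, w) for e, w in mi.items() if (e[0] in in_tree) != (e[1] in in_tree)),
            key=lambda item: item[1],
        )
        tree_mi += weight
        in_tree.update((i, j))
    return h_sum - tree_mi

# Toy data: the second variable copies the first; the third is independent.
cols = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
print(tree_entropy(cols), entropy(list(zip(*cols))))
```

In this toy case the tree estimate coincides with the full joint entropy, because the only coupling is pairwise and the tree captures it. With higher-order couplings the estimate is expected to sit at or above the true value, consistent with the bounding property mentioned in the abstract.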
Novel modeling formalisms and simulation tools in computational biosystems
Living organisms are complex systems that emerge
from the fundamental building blocks of life. Systems
Biology is a recent field of science that studies these
complex phenomena at the cellular level (Kitano 2002).
Understanding the mechanisms of the cell is essential
for research and development in several areas such as
drug discovery and biotechnological production. In the
latter, metabolic engineering is used for building mutant
microbial strains with increased productivity of
compounds of industrial interest, such as biofuels
(Stephanopoulos 1998). Using computational models of
cellular metabolism, it is possible to systematically test
and predict the optimal manipulations, such as gene
knockouts, that produce the ideal phenotype for a
specific application. These models are typically built in
an iterative cycle of experiment and refinement, by
multidisciplinary research teams that include biologists,
engineers and computer scientists.
The interconnection between different cellular
processes, such as metabolism and genetic regulation,
reflects the importance of the holistic approach advocated
by the Systems Biology paradigm as a replacement for
traditional reductionist methods. Although most cellular
components have been studied individually, the
behavior of the cell emerges from the network-level
interaction and requires an integrative analysis. Recent
high-throughput methods have generated the so-called
omics data (e.g., genomics, transcriptomics, proteomics,
metabolomics, fluxomics) that have allowed the
reconstruction of biological networks (Palsson 2006).
However, despite the great advances in the area, we are
still far from a whole-cell computational model that is
able to simulate all the components of a living cell. Due
to the enormous size and complexity of intracellular
biological networks, computational cell models tend to
be partial and focused on the application of interest.
Also, due to the multidisciplinarity of the field, these
models are based on several different kinds of
formalisms. Therefore, it is important to develop a
framework with common modeling formalisms, analysis
and simulation methods, that is able to accommodate
different kinds of biological networks, with different types
of entities and their interactions, into genome-scale
integrated models. Cells are composed of thousands of
components that interact in myriad ways. Despite this
intricate interconnection it is usual to divide and classify
these networks according to biological function. The
main types of networks are signaling, gene regulatory
and metabolic. Signal transduction is a process for
cellular communication where the cell receives and
responds to external stimuli through signaling cascades
(Gomperts et al. 2009; Albert and Wang 2009). These
cascades affect gene regulation, which is the method for
controlling gene expression, and consequently several
cellular functions (Schlitt and Brazma 2007;
Karlebach and Shamir 2008). Many genes encode
enzymes which are responsible for catalyzing
biochemical reactions. The complex network of these
reactions forms the cellular metabolism that sustains the
cell’s growth and energy requirements (Steuer and
Junker 2009; Palsson 2006).
The objectives of this work, in the context of a PhD
thesis, consist of the research and selection of an
appropriate modeling formalism to develop a
framework for integration of different biological
networks, with focus on regulatory and metabolic
networks, and the implementation of suitable analysis,
simulation and optimization methods. To achieve these
goals, it is necessary to resolve many modeling issues,
such as the integration of discrete and continuous
events, representation of network topology, support for
different levels of abstraction, lack of parameters and
model complexity. This framework will be used for the
implementation of an integrated model of E. coli, a
widely used organism for industrial application.
The Per2 Negative Feedback Loop Sets the Period in the Mammalian Circadian Clock Mechanism
Processes that repeat in time, such as the cell cycle, the circadian rhythm, and seasonal variations, are prevalent in biology. Mathematical models can represent our knowledge of the underlying mechanisms, and numerical methods can then facilitate analysis, which forms the foundation for a more integrated understanding as well as for design and intervention. Here, the intracellular molecular network responsible for the mammalian circadian clock system was studied. A new formulation of detailed sensitivity analysis is introduced and applied to elucidate the influence of individual rate processes, represented through their parameters, on network functional characteristics. One of four negative feedback loops in the model, the Per2 loop, was uniquely identified as most responsible for setting the period of oscillation; none of the other feedback loops were found to play as substantial a role. The analysis further suggested that the kinases CK1δ and CK1ε were well placed within the network such that they could be instrumental in implementing short-term adjustments to the period in the circadian clock system. The numerical results reported here are supported by previously published experimental data.
Exploring the gap between dynamic and steady-state models of metabolism
Integration of different kinds of biological networks is part of the holistic approach of Systems
Biology. However, looking at metabolic networks only, one already finds a separation between
dynamic [2] and steady-state [3] models of metabolism. This work reviews the differences between
both modeling approaches and explores the gap between them. Common properties of both kinds of
models are studied in detail, using as a case study the central carbon metabolism of E. coli. Steady-state
models are underdetermined and define a space of possible solutions, the so-called flux cone
[4]. On the other hand, the kinetic properties of dynamic models define a specific flux distribution
inside this space of solutions. We explore how this particular solution changes as a function of initial
conditions and the different kinetic parameters. Due to changes in experimental conditions and
experimental measurement error, these parameters can vary over a wide range, changing the flux
distribution around its original value within a kinetically feasible solution space. We perform Monte
Carlo sampling [5] to analyze the solution space of both the dynamic and steady-state models. We
estimate the volume of the kinetically feasible solution space under different restrictions and find it
to be considerably smaller than the volume of the steady-state flux cone. Therefore, it is possible to
cope with the lack of, and uncertainty in, experimental data by defining refined solution spaces that can
be used in constraint-based methods [1] such as Flux Balance Analysis.
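The steady-state side of this comparison can be sketched on a toy example. The block below assumes a hypothetical five-reaction network (not the E. coli model studied in the work), builds a basis for the solutions of S v = 0, and rejection-samples flux vectors that also satisfy simple bounds. The network, the bound `v_max`, and the function names are illustrative assumptions. Rejection sampling is used only because it is easy to read; it does not scale to realistic networks, where random-walk samplers such as hit-and-run are the usual choice.

```python
import numpy as np

# Hypothetical toy network (columns are reactions, rows are metabolites):
# v0: -> A, v1: A -> B, v2: A -> C, v3: B ->, v4: C ->
S = np.array([
    [1, -1, -1,  0,  0],   # A
    [0,  1,  0, -1,  0],   # B
    [0,  0,  1,  0, -1],   # C
], dtype=float)

def null_space_basis(S, tol=1e-10):
    """Orthonormal basis (as columns) of {v : S v = 0}, computed via the SVD."""
    _, sing, vt = np.linalg.svd(S)
    rank = int((sing > tol).sum())
    return vt[rank:].T

def sample_steady_states(S, n_samples, v_max=10.0, seed=0):
    """Rejection-sample flux vectors with S v = 0 and 0 <= v <= v_max."""
    rng = np.random.default_rng(seed)
    basis = null_space_basis(S)
    # The basis is orthonormal, so |coeffs| = |v| <= v_max * sqrt(n_reactions);
    # a box of this half-width therefore contains the whole feasible region.
    radius = v_max * np.sqrt(S.shape[1])
    accepted = []
    while len(accepted) < n_samples:
        coeffs = rng.uniform(-radius, radius, size=basis.shape[1])
        v = basis @ coeffs
        if np.all(v >= 0) and np.all(v <= v_max):
            accepted.append(v)
    return np.array(accepted)

samples = sample_steady_states(S, 20)
print(samples.mean(axis=0))
```

The fraction of proposals accepted gives a crude Monte Carlo estimate of the bounded region's volume relative to the enclosing box. The same counting idea, with additional kinetic feasibility checks, is one way to compare the sizes of two solution spaces.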
Fast Methods for Simulation of Biomolecule Electrostatics
Biomolecular structure and interactions in an aqueous environment are determined by a complicated interplay between physical and chemical forces including solvation, electrostatics, van der Waals forces, the hydrophobic effect and covalent bonding. Among them, electrostatics has been of particular interest due to its long-range nature and the tradeoff between desolvation and interaction effects [1]. In addition, electrostatic interactions play a significant role within a biomolecule as well as between biomolecules, making the balance between the two vital to the understanding of macromolecular systems. As a result, much effort has been devoted to accurate modeling and simulation of biomolecule electrostatics. One important application of this work is to compute the structure of electrostatic interactions for a biomolecule in an electrolyte solution, as well as the potential that the molecule generates in space. There are two valuable uses for these simulations. First, they provide a full picture of the electrostatic energetics of a biomolecular system, improving our understanding of how electrostatics contributes to stability, specificity, function, and molecular interaction [2]. Second, these simulations serve as a tool for molecular design, since electrostatic complementarity is an important feature of interacting molecules. Through examination of the electrostatics and potential field generated by a protein molecule, for example, it may be possible to suggest improvements to other proteins or drug molecules that interact with it, or perhaps even design new interacting molecules de novo [3]. There are two approaches to simulating a protein macromolecule in an aqueous solution with nonzero ionic strength. Discrete/atomistic approaches based on Monte Carlo or molecular dynamics simulations treat the macromolecule and solvent explicitly at the atomic level. Therefore, an enormous number of solvent molecules are required to provide reasonable accuracy, especially when electric fields far away from the macroscopic surface are of interest, leading to computational infeasibility. In this work, we adopt instead an approach based on a continuum description of the macromolecule and solvent. Although the continuum model of biomolecule electrostatics is widely used, the numerical techniques used to evaluate the model do not exploit fast solver approaches developed for analyzing integrated circuit interconnect. I will describe the formulation used for analyzing biomolecule electrostatics, and then derive an integral formulation of the problem that can be rapidly solved with the precorrected-FFT method [4].
Singapore-MIT Alliance (SMA)
Development of dynamic multi-layered bioprocess models
Advances in experimental technologies, in conjunction with the increasing number of sequenced genomes, have made it possible to uncover
new molecular interactions, aiding in the characterization of individual cellular components.
Until recently, in bioprocess engineering, cells were modeled as black box entities responsible for consuming substrates and
producing certain compounds, ignoring the underlying biological mechanisms. Nonetheless, the development of cellular
mechanistic dynamic models has been hampered by the lack of specific experimental data and imprecise knowledge of the
mechanistic rate laws underlying several reactions. This makes it harder to apply engineering concepts to cellular
systems.
Even incomplete cellular models provide valuable insights to help consolidate ongoing efforts in Biotechnology, namely, the
growing tendency in industry to replace chemical synthesis techniques by biotechnological ones. These tendencies are driven
by sustainability and profitability concerns, regarding the production of certain chemical compounds like bulk chemicals and
pharmaceuticals.
It is important to bear in mind that the metabolism of wild-type microorganisms is geared to their survival and reproduction
without engaging in the production of compounds outside this scope. Thus, the metabolism usually has to be modified in
order to meet the desired industrial outcome, typically the overproduction of a target compound.
In this work an optimization algorithm based on Evolutionary Computation approaches was previously developed in order
to enhance the production of a target metabolite based on a dynamic metabolic model.
An extension of a mechanistic Escherichia coli model is currently being developed in order to study how to improve the
production of industrially relevant compounds.
Modeling formalisms in systems biology
Systems Biology has taken advantage of computational tools and high-throughput experimental data to model several biological processes. These include signaling, gene regulatory, and metabolic networks. However, most of these models are specific to each kind of network. Their interconnection demands a whole-cell modeling framework for a complete understanding of cellular systems. We describe the features required by an integrated framework for modeling, analyzing and simulating biological processes, and review several modeling formalisms that have been used in Systems Biology including Boolean networks, Bayesian networks, Petri nets, process algebras, constraint-based models, differential equations, rule-based models, interacting state machines, cellular automata, and agent-based models. We compare the features provided by different formalisms, and discuss recent approaches in the integration of these formalisms, as well as possible directions for the future.
Research supported by grants SFRH/BD/35215/2007 and SFRH/BD/25506/2005 from the Fundacao para a Ciencia e a Tecnologia (FCT) and the MIT-Portugal Program through the project "Bridging Systems and Synthetic Biology for the development of improved microbial cell factories" (MIT-Pt/BS-BB/0082/2008).
Symmetric Signaling by an Asymmetric 1 Erythropoietin: 2 Erythropoietin Receptor Complex
Via sites 1 and 2, erythropoietin binds asymmetrically to two identical receptor monomers, although it is unclear how asymmetry affects receptor activation and signaling. Here we report the design and validation of two mutant erythropoietin receptors that probe the role of individual members of the receptor dimer by selectively binding either site 1 or site 2 on erythropoietin. Ba/F3 cells expressing either mutant receptor do not respond to erythropoietin, but cells co-expressing both receptors respond to erythropoietin by proliferation and activation of the JAK2-Stat5 pathway. A truncated receptor with only one cytosolic tyrosine (Y343) is sufficient for signaling in response to erythropoietin, regardless of the monomer on which it is located. Similarly, only one receptor in the dimer needs a juxtamembrane hydrophobic L253 or W258 residue, essential for JAK2 activation. We conclude that despite asymmetry in the ligand-receptor interaction, both sides are competent for signaling, and appear to signal equally.
National Institutes of Health (U.S.) (Grant P01 HL32262); Amgen Inc. (Research Grant); National Institutes of Health (U.S.) (Grant GM 065418); United States. Dept. of Energy. Computational Science Graduate Fellowship (DE-FG02-97ER25308); National Institutes of Health (U.S.). Ruth L. Kirschstein National Research Service Award (Postdoctoral Fellowship 5F32HL077036)
- …