232 research outputs found
Genomic signatures of population decline in the malaria mosquito Anopheles gambiae
Population genomic features such as nucleotide diversity and linkage disequilibrium are expected to be strongly shaped by changes in population size, and might therefore be useful for monitoring the success of a control campaign. In the Kilifi district of Kenya, there has been a marked decline in the abundance of the malaria vector Anopheles gambiae subsequent to the rollout of insecticide-treated bed nets. To investigate whether this decline left a detectable population genomic signature, simulations were performed to compare the effect of population crashes on nucleotide diversity, Tajima's D, and linkage disequilibrium (as measured by the population recombination parameter ρ). Linkage disequilibrium and ρ were estimated for An. gambiae from Kilifi and compared with values for Anopheles arabiensis and Anopheles merus at the same location, and for An. gambiae in a location 200 km from Kilifi. In the first simulations, ρ changed more rapidly after a population crash than the other statistics, and is therefore a more sensitive indicator of recent population decline. In the empirical data, linkage disequilibrium extends 100-1000 times further, and ρ is 100-1000 times smaller, for the Kilifi population of An. gambiae than for any of the other populations. There were also significant runs of homozygosity in many of the individual An. gambiae mosquitoes from Kilifi. These results support the hypothesis that the recent decline in An. gambiae was driven by the rollout of bed nets. Measuring population genomic parameters in a small sample of individuals before, during and after vector or pest control may be a valuable method of tracking the effectiveness of interventions.
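Two of the statistics named in this abstract, nucleotide diversity and Tajima's D, can be illustrated with a minimal sketch. It assumes the input is a list of 0/1 haplotypes of equal length (one per sampled chromosome); the function name `tajimas_d` and this input format are illustrative choices, not taken from the study, and the formulas are the standard ones from Tajima (1989). Estimating ρ is considerably more involved and is not attempted here.

```python
import math


def tajimas_d(haplotypes):
    """Return (pi, theta_w, D) for a list of equal-length 0/1 haplotypes.

    pi      : nucleotide diversity (mean pairwise differences, summed over sites)
    theta_w : Watterson's estimator based on the number of segregating sites
    D       : Tajima's D (nan when there are no segregating sites)
    """
    n = len(haplotypes)
    if n < 4:
        # The variance term is zero for n = 2 and n = 3, so D is undefined.
        raise ValueError("need at least 4 sampled chromosomes")
    num_sites = len(haplotypes[0])

    # Count the '1' allele at each site and keep only segregating sites.
    counts = [sum(h[j] for h in haplotypes) for j in range(num_sites)]
    seg = [c for c in counts if 0 < c < n]
    s = len(seg)
    if s == 0:
        return 0.0, 0.0, float("nan")

    # Each site contributes 2*c*(n-c) / (n*(n-1)) expected pairwise differences.
    pi = sum(2.0 * c * (n - c) / (n * (n - 1)) for c in seg)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    theta_w = s / a1
    d = (pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return pi, theta_w, d
```

An excess of rare variants makes D negative, while an excess of intermediate-frequency variants makes it positive.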
Gene expression drives the evolution of dominance.
Dominance is a fundamental concept in molecular genetics and has implications for understanding patterns of genetic variation, evolution, and complex traits. However, despite its importance, the degree of dominance in natural populations is poorly quantified. Here, we leverage multiple mating systems in natural populations of Arabidopsis to co-estimate the distribution of fitness effects and dominance coefficients of new amino acid-changing mutations. We find that more deleterious mutations are more likely to be recessive than less deleterious mutations. Further, this pattern holds across gene categories, but varies with the connectivity and expression patterns of genes. Our work argues that dominance arises as a consequence of the functional importance of genes and their optimal expression levels.
Dynamic modeling of gene expression in prokaryotes: application to glucose-lactose diauxie in Escherichia coli
Coexpression of genes or, more generally, similarity in the expression
profiles poses an unsurmountable obstacle to inferring the gene regulatory
network (GRN) based solely on data from DNA microarray time series. Clustering
of genes with similar expression profiles allows for a coarse-grained view of
the GRN and a probabilistic determination of the connectivity among the
clusters. We present a model for the temporal evolution of a gene cluster
network which takes into account interactions of gene products with genes and,
through a non-constant degradation rate, with other gene products. The number
of model parameters is reduced by using polynomial functions to interpolate
temporal data points. In this manner, the task of parameter estimation is
reduced to a system of linear algebraic equations, thus making the computation
time shorter by orders of magnitude. To eliminate irrelevant networks, we test
each GRN for stability with respect to parameter variations, and impose
restrictions on its behavior near the steady state. We apply our model and
methods to DNA microarray time series data collected on Escherichia coli
during glucose-lactose diauxie and infer the most probable cluster network for
different phases of the experiment.
Comment: 20 pages, 4 figures; Systems and Synthetic Biology 5 (2011
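The idea of interpolating the time series with polynomials so that parameter estimation becomes a linear-algebra problem can be sketched as follows. This is a deliberately simplified illustration, not the authors' model: it assumes a purely linear cluster network dx/dt = A x (no non-constant degradation term), and the function name `fit_linear_network` and its arguments are hypothetical.

```python
import numpy as np


def fit_linear_network(t, x, degree=3):
    """Estimate A in dx/dt = A x from a time series (simplified assumption).

    t : array of shape (T,), sampling times
    x : array of shape (T, G), expression level of each of G clusters
    Returns A with shape (G, G); A[i, j] is the estimated effect of cluster j on i.
    """
    t = np.asarray(t, dtype=float)
    x = np.asarray(x, dtype=float)
    smooth = np.empty_like(x)
    deriv = np.empty_like(x)

    for g in range(x.shape[1]):
        # Fit one polynomial per cluster, then differentiate it analytically.
        coeffs = np.polyfit(t, x[:, g], degree)
        smooth[:, g] = np.polyval(coeffs, t)
        deriv[:, g] = np.polyval(np.polyder(coeffs), t)

    # With the derivatives known, smooth @ A.T = deriv is linear in A,
    # so a single least-squares solve replaces iterative ODE fitting.
    a_transposed, *_ = np.linalg.lstsq(smooth, deriv, rcond=None)
    return a_transposed.T
```

Because the derivative comes from the fitted polynomial and not from numerical integration, no ODE solver is called during estimation, which is where the speed-up in this kind of approach comes from.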
What Can Causal Networks Tell Us about Metabolic Pathways?
Graphical models describe the linear correlation structure of data and have been used to establish causal relationships among phenotypes in genetic mapping populations. Data are typically collected at a single point in time. Biological processes, on the other hand, are often non-linear and display time-varying dynamics. The extent to which graphical models can recapitulate the architecture of an underlying biological process is not well understood. We consider metabolic networks with known stoichiometry to address the fundamental question: “What can causal networks tell us about metabolic pathways?” Using data from an Arabidopsis BaySha population and simulated data from dynamic models of pathway motifs, we assess our ability to reconstruct metabolic pathways using graphical models. Our results highlight the necessity of non-genetic residual biological variation for reliable inference. Recovery of the ordering within a pathway is possible, but should not be expected. Causal inference is sensitive to subtle patterns in the correlation structure that may be driven by a variety of factors, which may not emphasize the substrate-product relationship. We illustrate the effects of metabolic pathway architecture, epistasis and stochastic variation on correlation structure and graphical model-derived networks. We conclude that graphical models should be interpreted cautiously, especially if the implied causal relationships are to be used in the design of intervention strategies.
Shape, Size, and Robustness: Feasible Regions in the Parameter Space of Biochemical Networks
The concept of robustness of regulatory networks has received much attention in the last decade. One measure of robustness has been associated with the volume of the feasible region, namely, the region in the parameter space in which the system is functional. In this paper, we show that, in addition to volume, the geometry of this region has important consequences for the robustness and the fragility of a network. We develop an approximation within which we could algebraically specify the feasible region. We analyze the segment polarity gene network to illustrate our approach. The study of random walks in the parameter space and how they exit the feasible region provides us with a rich perspective on the different modes of failure of this network model. In particular, we found that, between two alternative ways of activating Wingless, one is more robust than the other. Our method provides a more complete measure of robustness to parameter variation. As a general modeling strategy, our approach is an interesting alternative to Boolean representation of biochemical networks.
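The random-walk idea in this abstract can be illustrated with a toy sketch: start inside a feasible region, take small random steps in parameter space, and record where and when the walk first leaves. The feasibility test is supplied by the caller, and the function name `steps_until_failure`, the Gaussian step distribution, and the default step size are all assumptions made for illustration, not details from the paper.

```python
import random


def steps_until_failure(is_feasible, start, step_size=0.05,
                        max_steps=10_000, rng=None):
    """Random-walk from `start` until the point leaves the feasible region.

    is_feasible : callable taking a list of parameter values, returning bool
    start       : list of initial parameter values (must be feasible)
    Returns (number_of_steps, exit_point); exit_point is None if the walk
    is still inside the region after max_steps.
    """
    rng = rng or random.Random()
    if not is_feasible(start):
        raise ValueError("starting point must lie inside the feasible region")

    point = list(start)
    for step in range(1, max_steps + 1):
        # Perturb every parameter independently (assumed Gaussian steps).
        point = [p + rng.gauss(0.0, step_size) for p in point]
        if not is_feasible(point):
            return step, point
    return max_steps, None
```

Repeating the walk many times and tabulating which constraint the exit points violate gives a rough picture of the dominant failure modes, and of how region shape matters in addition to volume.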
Inference of population splits and mixtures from genome-wide allele frequency data
Many aspects of the historical relationships between populations in a species
are reflected in genetic data. Inferring these relationships from genetic data,
however, remains a challenging task. In this paper, we present a statistical
model for inferring the patterns of population splits and mixtures in multiple
populations. In this model, the sampled populations in a species are related to
their common ancestor through a graph of ancestral populations. Using
genome-wide allele frequency data and a Gaussian approximation to genetic
drift, we infer the structure of this graph. We applied this method to a set of
55 human populations and a set of 82 dog breeds and wild canids. In both
species, we show that a simple bifurcating tree does not fully describe the
data; in contrast, we infer many migration events. While some of the migration
events that we find have been detected previously, many have not. For example,
in the human data we infer that Cambodians trace approximately 16% of their
ancestry to a population ancestral to other extant East Asian populations. In
the dog data, we infer that both the boxer and basenji trace a considerable
fraction of their ancestry (9% and 25%, respectively) to wolves subsequent to
domestication, and that East Asian toy breeds (the Shih Tzu and the Pekingese)
result from admixture between modern toy breeds and "ancient" Asian breeds.
Software implementing the model described here, called TreeMix, is available at
http://treemix.googlecode.com
Comment: 28 pages, 6 figures in main text. Attached supplement is 22 pages, 15
figures. This is an updated version of the preprint available at
http://precedings.nature.com/documents/6956/version/
Mathematical and Statistical Techniques for Systems Medicine: The Wnt Signaling Pathway as a Case Study
The last decade has seen an explosion in models that describe phenomena in
systems medicine. Such models are especially useful for studying signaling
pathways, such as the Wnt pathway. In this chapter we use the Wnt pathway to
showcase current mathematical and statistical techniques that enable modelers
to gain insight into (models of) gene regulation, and generate testable
predictions. We introduce a range of modeling frameworks, but focus on ordinary
differential equation (ODE) models since they remain the most widely used
approach in systems biology and medicine and continue to offer great potential.
We present methods for the analysis of a single model, comprising applications
of standard dynamical systems approaches such as nondimensionalization, steady
state, asymptotic and sensitivity analysis, and more recent statistical and
algebraic approaches to compare models with data. We present parameter
estimation and model comparison techniques, focusing on Bayesian analysis and
coplanarity via algebraic geometry. Our intention is that this (non-exhaustive)
review may serve as a useful starting point for the analysis of models in
systems medicine.
Comment: Submitted to 'Systems Medicine' as a book chapter
Identification of neutral biochemical network models from time series data
Background: The major difficulty in modeling biological systems from multivariate time series is the identification of parameter sets that endow a model with dynamical behaviors sufficiently similar to the experimental data. Directly related to this parameter estimation issue is the task of identifying the structure and regulation of ill-characterized systems. Both tasks are simplified if the mathematical model is canonical, i.e., if it is constructed according to strict guidelines.
Results: In this report, we propose a method for the identification of admissible parameter sets of canonical S-systems from biological time series. The method is based on a Monte Carlo process that is combined with an improved version of our previous parameter optimization algorithm. The method maps the parameter space into the network space, which characterizes the connectivity among components, by creating an ensemble of decoupled S-system models that imitate the dynamical behavior of the time series with sufficient accuracy. The concept of sloppiness is revisited in the context of these S-system models with an exploration not only of different parameter sets that produce similar dynamical behaviors but also different network topologies that yield dynamical similarity.
Conclusion: The proposed parameter estimation methodology was applied to actual time series data from the glycolytic pathway of the bacterium Lactococcus lactis and led to ensembles of models with different network topologies. In parallel, the parameter optimization algorithm was applied to the same dynamical data upon imposing a pre-specified network topology derived from prior biological knowledge, and the results from both strategies were compared. The results suggest that the proposed method may serve as a powerful exploration tool for testing hypotheses and the design of new experiments.
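For readers unfamiliar with the term, the canonical S-system form referred to in this abstract is usually written as below. The symbols follow the common convention in the biochemical systems theory literature and are not quoted from this paper; the second line shows the standard decoupling step, in which an estimated slope S_i(t_k) replaces the derivative so that each equation can be fitted separately.

```latex
% S-system: one production term and one degradation term per variable
\frac{dX_i}{dt} = \alpha_i \prod_{j=1}^{n} X_j^{\,g_{ij}}
                - \beta_i  \prod_{j=1}^{n} X_j^{\,h_{ij}},
\qquad i = 1, \dots, n

% Decoupled form at each sampling time t_k (slope S_i estimated from data)
S_i(t_k) \approx \alpha_i \prod_{j=1}^{n} X_j(t_k)^{\,g_{ij}}
               - \beta_i  \prod_{j=1}^{n} X_j(t_k)^{\,h_{ij}}
```

Here the rate constants \alpha_i and \beta_i are non-negative, and a non-zero kinetic order g_{ij} or h_{ij} indicates that X_j influences the production or degradation of X_i, which is how a parameter set maps onto a network topology.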
Population Based Model of Human Embryonic Stem Cell (hESC) Differentiation during Endoderm Induction
The mechanisms by which human embryonic stem cells (hESC) differentiate to the endodermal lineage have not been extensively studied. Mathematical models can aid in the identification of mechanistic information. In this work we use a population-based modeling approach to understand the mechanism of endoderm induction in hESC, performed experimentally with exposure to Activin A and Activin A supplemented with growth factors (basic fibroblast growth factor (FGF2) and bone morphogenetic protein 4 (BMP4)). The differentiating cell population is analyzed daily for cellular growth, cell death, and expression of the endoderm proteins Sox17 and CXCR4. The stochastic model starts with a population of undifferentiated cells, from which it evolves in time by assigning each cell a propensity to proliferate, die and differentiate using certain user-defined rules. Twelve alternate mechanisms which might describe the observed dynamics were simulated, and an ensemble parameter estimation was performed on each mechanism. A comparison of the quality of agreement of experimental data with simulations for several competing mechanisms led to the identification of one which adequately describes the observed dynamics under both induction conditions. The results indicate that hESC commitment to endoderm occurs through an intermediate mesendoderm germ layer which further differentiates into mesoderm and endoderm, and that during induction proliferation of the endoderm germ layer is promoted. Furthermore, our model suggests that CXCR4 is expressed in mesendoderm and endoderm, but is not expressed in mesoderm. Comparison between the two induction conditions indicates that supplementing Activin A with FGF2 and BMP4 enhances the kinetics of differentiation relative to Activin A alone. This mechanistic information can aid in the derivation of functional, mature cells from their progenitors.
While applied to initial endoderm commitment of hESC, the model is general enough to be applicable either to a system of adult stem cells or later stages of ESC differentiation.