788 research outputs found

    Hierarchical Cluster-based Partial Least Squares Regression (HC-PLSR) is an efficient tool for metamodelling of nonlinear dynamic models

    Background: Deterministic dynamic models of complex biological systems contain a large number of parameters and state variables, related through nonlinear differential equations with various types of feedback. A metamodel of such a dynamic model is a statistical approximation model that maps variation in parameters and initial conditions (inputs) to variation in features of the trajectories of the state variables (outputs) throughout the entire biologically relevant input space. A sufficiently accurate mapping can be exploited both instrumentally and epistemically. Multivariate regression methodology is a commonly used approach for emulating dynamic models. However, when the input-output relations are highly nonlinear or non-monotone, a standard linear regression approach is prone to give suboptimal results. We therefore hypothesised that a more accurate mapping can be obtained by locally linear or locally polynomial regression. We present here a new method for local regression modelling, Hierarchical Cluster-based PLS regression (HC-PLSR), where fuzzy C-means clustering is used to separate the data set into parts according to the structure of the response surface. We compare the metamodelling performance of HC-PLSR with polynomial partial least squares regression (PLSR) and ordinary least squares (OLS) regression on various systems: six different gene regulatory network models with various types of feedback, a deterministic mathematical model of the mammalian circadian clock and a model of the mouse ventricular myocyte function. Results: Our results indicate that multivariate regression is well suited for emulating dynamic models in systems biology. The hierarchical approach turned out to be superior to both polynomial PLSR and OLS regression in all three test cases. The advantage, in terms of explained variance and prediction accuracy, was largest in systems with highly nonlinear functional relationships and in systems with positive feedback loops. Conclusions: HC-PLSR is a promising approach for metamodelling in systems biology, especially for highly nonlinear or non-monotone parameter to phenotype maps. The algorithm can be flexibly adjusted to suit the complexity of the dynamic model behaviour, inviting automation in the metamodelling of complex systems.
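
The idea that local regression can beat a single global fit on a non-monotone response can be illustrated with a minimal sketch. This is not the HC-PLSR algorithm: as a simplifying assumption it swaps in hard 2-means clustering on a one-dimensional input for fuzzy C-means, and plain OLS lines for PLSR. The toy response `y = |x|` and all function names are hypothetical.

```python
def ols_fit(xs, ys):
    """Least-squares line y = a + b*x for 1-D data; returns (a, b)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def sse(xs, ys, model):
    """Sum of squared residuals of a fitted line on the given data."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def two_means(xs, iters=20):
    """Hard 2-means clustering on 1-D inputs (stand-in for fuzzy C-means)."""
    centers = [min(xs), max(xs)]
    labels = [0] * len(xs)
    for _ in range(iters):
        labels = [0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
                  for x in xs]
        for k in (0, 1):
            members = [x for x, lab in zip(xs, labels) if lab == k]
            if members:
                centers[k] = sum(members) / len(members)
    return labels


# Toy non-monotone input-output map (an assumption for illustration only).
xs = [i / 10 for i in range(-20, 21)]
ys = [abs(x) for x in xs]

# One global line over the whole input range.
global_sse = sse(xs, ys, ols_fit(xs, ys))

# One line per cluster, with errors summed over the clusters.
labels = two_means(xs)
local_sse = 0.0
for k in (0, 1):
    cx = [x for x, lab in zip(xs, labels) if lab == k]
    cy = [y for y, lab in zip(ys, labels) if lab == k]
    local_sse += sse(cx, cy, ols_fit(cx, cy))

print(f"global SSE = {global_sse:.3f}, local SSE = {local_sse:.3g}")
```

On this toy map the clusters fall on either side of the kink, so each local line fits its branch almost exactly while the global line cannot.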

    Exact reconstruction of gene regulatory networks using compressive sensing.

    Background: We consider the problem of reconstructing a gene regulatory network structure from limited time series gene expression data, without any a priori knowledge of connectivity. We assume that the network is sparse, meaning the connectivity among genes is much less than full connectivity. We develop a method for network reconstruction based on compressive sensing, which takes advantage of the network's sparseness. Results: For the case in which all genes are accessible for measurement, and there is no measurement noise, we show that our method can be used to exactly reconstruct the network. For the more general problem, in which hidden genes exist and all measurements are contaminated by noise, we show that our method leads to reliable reconstruction. In both cases, coherence of the model is used to assess the ability to reconstruct the network and to design new experiments. We demonstrate that it is possible to use the coherence distribution to guide biological experiment design effectively. By collecting a more informative dataset, the proposed method helps reduce the cost of experiments. For each problem, a set of numerical examples is presented. Conclusions: The method provides a guarantee on how well the inferred graph structure represents the underlying system, reveals deficiencies in the data and model, and suggests experimental directions to remedy the deficiencies.
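
As a rough illustration of how coherence can indicate whether sparse reconstruction is feasible, the sketch below computes the mutual coherence of a small measurement matrix. It assumes the standard compressive-sensing definition (the largest absolute normalized inner product between distinct columns), which may differ from the coherence measure used in the paper. The matrix and the function name are hypothetical.

```python
import math


def mutual_coherence(columns):
    """Largest absolute normalized inner product between distinct columns."""
    def norm(v):
        return math.sqrt(sum(a * a for a in v))

    mu = 0.0
    for i in range(len(columns)):
        for j in range(i + 1, len(columns)):
            dot = sum(a * b for a, b in zip(columns[i], columns[j]))
            mu = max(mu, abs(dot) / (norm(columns[i]) * norm(columns[j])))
    return mu


# Hypothetical 2 x 4 measurement matrix, given column by column.
columns = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, -1.0)]
mu = mutual_coherence(columns)

# Classical sufficient condition from the compressive-sensing literature:
# a k-sparse vector is uniquely recoverable if k < (1 + 1/mu) / 2.
sparsity_bound = (1 + 1 / mu) / 2

print(f"coherence = {mu:.4f}, guaranteed for sparsity k < {sparsity_bound:.3f}")
```

Lower coherence raises the bound, which is one way to read the abstract's point that a more informative dataset makes reconstruction more reliable.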

    Information visualization for DNA microarray data analysis: A critical review

    Graphical representation may provide effective means of making sense of the complexity and sheer volume of data produced by DNA microarray experiments that monitor the expression patterns of thousands of genes simultaneously. The ability to use "abstract" graphical representation to draw attention to areas of interest, and more in-depth visualizations to answer focused questions, would enable biologists to move from a large amount of data to particular records they are interested in, and therefore gain deeper insights in understanding the microarray experiment results. This paper starts by providing some background knowledge of microarray experiments, then explains how graphical representation can be applied in general to this problem domain, followed by exploring the role of visualization in gene expression data analysis. Having set the problem scene, the paper then examines various multivariate data visualization techniques that have been applied to microarray data analysis. These techniques are critically reviewed so that the strengths and weaknesses of each technique can be tabulated. Finally, several key problem areas as well as possible solutions to them are discussed as being a source for future work.

    Control Theory: On the Way to New Application Fields

    Control theory is an interdisciplinary field that is located at the crossroads of pure and applied mathematics with systems engineering and the sciences. Recently, deep interactions have been emerging with new application areas, such as systems biology, quantum control and information technology. In order to address the challenges posed by these new application disciplines, a special focus of this workshop has been on the interaction between control theory and mathematical systems biology. To complement this more biology-oriented focus, a series of lectures in this workshop was devoted to the control of networks of systems, fundamentals of nonlinear control systems, model reduction and identification, algorithmic aspects in control, as well as open problems in control.

    Systems biology approaches to the dynamics of gene expression and chemical reactions

    Systems biology is an emergent interdisciplinary field of study whose main goal is to understand the global properties and functions of a biological system by investigating its structure and dynamics [74]. This high-level knowledge can be reached only with a coordinated approach involving researchers with different backgrounds in molecular biology, the various omics (like genomics, proteomics, metabolomics), computer science and dynamical systems theory. Systems biology emerged as a distinct discipline in the 1960s and has grown impressively since 2000, driven by the increased accumulation of biological information, the development of high-throughput experimental techniques, the use of powerful computer systems for calculations and database hosting, and the spread of the Internet as the standard medium for information diffusion [77]. In the last few years, our research group has tackled a set of systems biology problems which look quite diverse but share topics such as biological networks and system dynamics, which interest us and are clearly fundamental for this field. The first issue we studied (covered in Part I) was the reverse engineering of large-scale gene regulatory networks. Inferring a gene network is the process of identifying interactions among genes from experimental data (typically microarray expression profiles) using computational methods [6]. Our aim was to compare some of the most popular association network algorithms (the only ones applicable at a genome-wide level) in different conditions. In particular, we verified the predictive power of similarity measures both of direct type (like correlations and mutual information) and of conditional type (partial correlations and conditional mutual information) applied to different kinds of experiments (like data taken at equilibrium or time courses) and to both synthetic and real microarray data (for E. coli and S. cerevisiae). 
In our simulations we saw that all network inference algorithms perform better on data produced with “structural” perturbations (like gene knockouts at steady state) than on data produced with just dynamical perturbations (like time course measurements or changes of the initial expression levels). Moreover, our analysis showed differences in the performance of the algorithms: direct methods are more robust in detecting stable relationships (like belonging to the same protein complex), while conditional methods are better at detecting causal interactions (e.g. transcription factor–binding site interactions), especially in the presence of combinatorial transcriptional regulation. Even if time course microarray experiments are not particularly useful for inferring gene networks, they can give a great amount of information about the dynamical evolution of a biological process, provided that the measurements have a good time resolution. Recently, such a dataset has been published [119] for the yeast metabolic cycle, a well-known process where yeast cells synchronize with respect to oxidative and reductive functions. In that paper, the long-period respiratory oscillations were shown to be reflected in genome-wide periodic patterns in gene expression. As explained in Part II, we analyzed these time series in order to elucidate the dynamical role of post-transcriptional regulation (in particular mRNA stability) in the coordination of the cycle. We found that for periodic genes, arranged in classes according either to expression profile or to function, the pulses of mRNA abundance have phase and width which are directly proportional to the corresponding turnover rates. Moreover, the cascade of events which occurs during the yeast metabolic cycle (and their correlation with mRNA turnover) reflects to a large extent the gene expression program observable in other dynamical contexts such as the response to stresses or stimuli. 
The concepts of network and of system dynamics return as major themes of Part III. There we present a study of some dynamical properties of the so-called chemical reaction networks, which are sets of chemical species among which a certain number of reactions can occur. These networks can be modeled as systems of ordinary differential equations for the species concentrations, and the dynamical evolution of these systems has been theoretically studied since the 1970s [47, 65]. Over time, several independent conditions have been proved concerning the capacity of a reaction network, regardless of the (often poorly known) reaction parameters, to exhibit multiple equilibria. This is a particularly interesting characteristic for biological systems, since it is required for the switch-like behavior observed during processes like intracellular signaling and cell differentiation. Inspired by those works, we developed a new open source software package for MATLAB, called ERNEST, which, by checking these various criteria on the structure of a chemical reaction network, can exclude the multistationarity of the corresponding reaction system. The results of this analysis can be used, for example, for model discrimination: if for a multistable biological process there are multiple candidate reaction models, it is possible to eliminate some of them by proving that they are always monostationary. Finally, we considered the related property of monotonicity for a reaction network. Monotone dynamical systems have the tendency to converge to an equilibrium and do not exhibit chaotic behavior. Most biological systems have the same features, and are therefore considered to be monotone or near-monotone [85, 116]. Using the notion of fundamental cycles from graph theory, we proved some theoretical results in order to determine how distant a given biological network is from being monotone. 
In particular, we showed that the distance to monotonicity of a network is equal to the minimal number of negative fundamental cycles of the corresponding J-graph, a signed multigraph which can be uniquely associated with a dynamical system.

    Arvbarhet og biologisk systemdynamikk

    The concept of heritability is rooted in the observation that relatives resemble one another more than expected by chance. Narrow-sense heritability is defined as the proportion of phenotypic variance that is attributable to additive genetic variation (i.e. where an allele substitution has the same effect irrespective of the rest of the genotype), while broad-sense heritability denotes the proportion of phenotypic variance caused by genetic variation including non-additive effects. Both concepts have been highly instrumental in evolutionary biology, production biology and biomedical research for several decades. However, this successful instrumental use should not be equated with deep understanding of how underlying biology shapes narrow- and broad-sense heritability. Nor does it guarantee that these statistical definitions and associated methodology are optimally suited to deal with the recent floods of biological data. Seeking a deeper understanding of the relationship between narrow- and broad-sense heritability in terms of biological mechanisms, I simulated genetic variation in dynamic models of biological systems. A striking result was that the ratio between narrow-sense and broad-sense heritability depended strongly on the type of regulatory architecture involved. Applying the same approach to an ensemble of gene regulatory network models, I showed that monotonicity features of genotype-to-phenotype maps reveal deep connections between molecular regulatory architecture and heritability aspects; connections that do not materialize from the classical distinction between additive, dominant and epistatic gene actions. Lastly, I addressed why genome-wide association studies (GWAS) have failed to identify much of the genetic variation underlying highly heritable traits. By linking computational physiology to GWAS, one can do GWAS on lower-level phenotypes that are mathematically related to each other through a dynamic model. 
This allows much more precise identification of the causal genetic variation, coupled with understanding of its function. The concept of heritability reflects the fact that relatives generally resemble one another more than they resemble other individuals. Narrow-sense heritability is defined as the proportion of phenotypic variance attributable to additive effects of genetic variation (that is, where an allele substitution has the same effect regardless of the rest of the genotype), while broad-sense heritability denotes the total proportion due to both additive and non-additive effects. Both concepts have proved useful in evolutionary biology, production biology and biomedical research over several decades. This usefulness as a tool, however, does not amount to deep insight into how the two types of heritability are shaped by the underlying biology. Nor is it self-evident that these statistically based definitions and methods are the best suited to meeting today's flood of new biological data. In my doctoral work I have examined how the relationship between narrow- and broad-sense heritability is connected to biological mechanisms, by simulating genetic variation in dynamic models of physiological systems. A striking result was that the regulatory architecture of the system strongly affects the ratio between narrow- and broad-sense heritability. In a similar way, I studied heritability in a collection of gene regulatory network models with varying degrees of monotonicity in the mathematical relationship between genotype and phenotype. This revealed deep links between heritability patterns and molecular regulatory architecture; connections that are not apparent from the classical distinction between additive, dominant and epistatic gene effects. Finally, I addressed weaknesses in current statistical methods for explaining how variation in highly heritable traits is governed by genetic differences between individuals. 
So-called genome-wide association studies (GWAS) often detect many relevant loci with genetic variation, but these nevertheless explain only a small part of the observed heritability of high-level traits such as body height or disease incidence. A more promising approach is to link mathematical physiology to GWAS. I show that by performing GWAS on low-level phenotypes that are mathematically connected through a dynamic model, one can identify the causal genetic variation far more precisely and at the same time increase understanding of its function.
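
The distinction between narrow- and broad-sense heritability can be made concrete with a small worked example. The sketch below is not the dynamic-model simulation described in the thesis. As an assumption, it uses the textbook single-locus model under random mating, with genotypic values `+a`, `d`, `-a` and no environmental variance, so the ratio of additive to total genetic variance equals the ratio of narrow- to broad-sense heritability. The function name and parameter values are hypothetical.

```python
def single_locus_variances(a, d, p):
    """Additive and total genetic variance for one biallelic locus.

    a: half the difference between the two homozygote values
    d: dominance deviation of the heterozygote
    p: frequency of the allele that increases the trait
    """
    q = 1.0 - p
    freqs = (p * p, 2 * p * q, q * q)   # genotype frequencies under random mating
    values = (a, d, -a)                 # genotypic values
    mean = sum(f * g for f, g in zip(freqs, values))
    v_g = sum(f * (g - mean) ** 2 for f, g in zip(freqs, values))
    alpha = a + d * (q - p)             # average effect of an allele substitution
    v_a = 2 * p * q * alpha ** 2        # additive genetic variance
    return v_a, v_g


# Purely additive locus: all genetic variance is additive.
v_a_add, v_g_add = single_locus_variances(a=1.0, d=0.0, p=0.5)

# Complete dominance at equal allele frequencies.
v_a_dom, v_g_dom = single_locus_variances(a=1.0, d=1.0, p=0.5)

ratio_add = v_a_add / v_g_add
ratio_dom = v_a_dom / v_g_dom
print(f"additive: {ratio_add:.3f}, complete dominance: {ratio_dom:.3f}")
```

Even in this simple model the ratio depends on how genotype maps to phenotype, which is the kind of dependence the thesis investigates for regulatory architectures.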