8 research outputs found

    Inferring signalling networks from longitudinal data using sampling based approaches in the R-package 'ddepn'

    Background: Network inference from high-throughput data has become an important means of analysing biological systems. For instance, in cancer research, the functional relationships of cancer-related proteins, summarised into signalling networks, are of central interest for identifying pathways that influence tumour development. Cancer cell lines can be used as model systems to study the cellular response to drug treatments in a time-resolved way. Based on this kind of data, modelling approaches for the signalling relationships are needed that make it possible to generate hypotheses on potential interference points in the networks.
    Results: We present the R-package 'ddepn', which implements our recent approach to network reconstruction from longitudinal data generated after external perturbation of network components. We extend our approach with two novel methods: a Markov Chain Monte Carlo method for sampling network structures with two edge types (activation and inhibition), and an extension of a prior model that penalises deviations from a given reference network while incorporating these two types of edges. Further, as an alternative prior, we include a model that learns signalling networks with the scale-free property.
    Conclusions: The package 'ddepn' is freely available on R-Forge and CRAN (http://ddepn.r-forge.r-project.org, http://cran.r-project.org). It allows convenient network inference from longitudinal high-throughput data using two different sampling-based network structure search algorithms.
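
    As a rough illustration of a prior that penalises deviation from a reference network with two edge types, the following Python sketch scores a candidate network against a reference. The encoding (+1 activation, -1 inhibition, 0 no edge), the function name `log_prior`, the parameter `lam`, and the absolute-difference penalty are all assumptions made for illustration; the abstract does not state the form used in 'ddepn'.

```python
from typing import List

Matrix = List[List[int]]  # entries: +1 activation, -1 inhibition, 0 no edge


def log_prior(candidate: Matrix, reference: Matrix, lam: float = 1.0) -> float:
    """Unnormalised log prior (hypothetical form): more deviation, lower score."""
    if lam <= 0:
        raise ValueError("lam must be positive")
    # Sum of entry-wise absolute differences between the two signed matrices.
    deviation = sum(
        abs(c - r)
        for c_row, r_row in zip(candidate, reference)
        for c, r in zip(c_row, r_row)
    )
    return -deviation / lam


reference = [[0, 1], [-1, 0]]
same = [[0, 1], [-1, 0]]
flipped = [[0, -1], [-1, 0]]  # activation replaced by inhibition
missing = [[0, 0], [-1, 0]]   # activation edge dropped
```

    Under this assumed penalty, turning an activation into an inhibition costs more than dropping the edge, which is one possible way for a prior to take both edge types into account.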

    Systematic analysis of time resolved high-throughput data using stochastic network inference methods

    Breast cancer is the most common cancer in women and is characterised by various deregulations in signalling processes, leading to abnormal proliferation, differentiation or apoptosis. Several treatments for breast cancer exist, including the human monoclonal antibody Trastuzumab and the small molecule erlotinib, both of which target and inhibit receptors of the ERBB receptor network. However, signalling processes in cancers, especially under drug treatment, are not yet completely understood, and methods are needed that learn treatment-specific regulation and signalling patterns from experimental data with a system-wide view. One approach is the reconstruction of interaction networks for genes or proteins under external perturbation, for which many different algorithms have been proposed. These include Boolean networks, Bayesian networks, Dynamic Bayesian networks and differential equation systems, each describing the system at a different level of accuracy and complexity. However, if external perturbation is applied, current algorithms either require the targets of the perturbations to be known, or can learn only the targets of a single perturbation directly from data. In general, dependencies between signalling events at different time points should also be included in the modelling frameworks. This work proposes a novel approach to learning networks from longitudinal and externally perturbed data, called Dynamic Deterministic Effects Propagation Networks (DDEPN). Nodes in the network correspond to genes or proteins selected from a particular biological system, while edges describe the interactions between the nodes. DDEPN models the activity of a node as a Boolean variable (either active or passive) and creates an activity profile of all nodes for the given time frame, depending on a given network structure. The activity profile is assessed by a likelihood score that describes the probability of the measured data given the activity profile. 
The network structure that best fits the measured data is identified by modifying the network such that the likelihood score is optimised. DDEPN is applied to a phosphoproteomic dataset from the ERBB signalling cascade, as well as to gene expression data measuring cell-cycle-related genes. Known signalling cascades from the ERBB and cell cycle networks were successfully reconstructed, and DDEPN also outperformed related network inference approaches. Further, in the ERBB data set, the combined application of the drugs erlotinib and Trastuzumab to the breast cancer cell line HCC1954 resulted in potent inhibition of growth-promoting signalling effects, reflected in the down-regulation of the MAPK and AKT signalling pathways. This suggests that this combination therapy could also be a promising option for the treatment of breast cancer patients.
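
The idea of deriving a Boolean activity profile from a given network structure can be illustrated with a minimal Python sketch. This is not the DDEPN implementation: the update rule (a node is active if at least one active parent activates it and no active parent inhibits it), the function name `propagate`, and the toy network are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

# Signed edges: (source, target) -> +1 (activation) or -1 (inhibition).
Edges = Dict[Tuple[str, str], int]


def propagate(nodes: List[str], edges: Edges, stimuli: List[str],
              steps: int) -> List[Dict[str, bool]]:
    """Return the Boolean activity of every node at each step (hypothetical rule)."""
    state = {n: n in stimuli for n in nodes}
    profile = [dict(state)]
    for _ in range(steps):
        new_state = {}
        for n in nodes:
            if n in stimuli:
                new_state[n] = True  # externally perturbed nodes stay active
                continue
            parents = [(src, sign) for (src, tgt), sign in edges.items() if tgt == n]
            activated = any(state[src] and sign > 0 for src, sign in parents)
            inhibited = any(state[src] and sign < 0 for src, sign in parents)
            new_state[n] = activated and not inhibited
        state = new_state
        profile.append(dict(state))
    return profile


# Toy network (assumed, not taken from the thesis data).
nodes = ["EGF", "EGFR", "ERK", "DRUG"]
edges = {("EGF", "EGFR"): 1, ("EGFR", "ERK"): 1, ("DRUG", "EGFR"): -1}
profile = propagate(nodes, edges, stimuli=["EGF"], steps=2)
```

In this toy example, stimulating only `EGF` activates `ERK` after two steps, whereas adding `DRUG` to the stimuli keeps `EGFR` and `ERK` inactive. A likelihood score such as the one described above would then compare a profile of this kind with the measured data.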

    Network inference : extension of linear programming model for time-series data

    Master's dissertation in Informatics Engineering. With the widespread availability of high-throughput technologies, it is now possible to study the behavior of dozens or even hundreds of genes/proteins in a single experiment. Still, these experiments provide only the gene/protein expression values and reveal nothing about their interactions with each other. To understand these interactions, network inference methods need to be applied. Understanding such interactions can shed new light on biological processes and, in particular, on disease mechanisms of action, providing new insights for drug design: which genes/proteins should be targeted in order to cure or prevent a specific disease. In this thesis, we developed and tested two alternative extensions of a previously developed model based on linear programming. That model infers signal transduction networks from perturbation steady-state data. The extensions developed here take advantage of perturbation time-series data, which further improves the resolution of causal relationships between genes/proteins. In a first phase, we use artificial networks with simulated data to test the performance of both extensions under different conditions. Additionally, we compare their performance to the original model and to a state-of-the-art model for perturbation time-series data, DDEPN. Overall, our second extension exhibits better performance and significantly higher sensitivity. This extension assumes that a given gene/protein can only influence its targets if it is in an active form. In a second phase, we use two experimental datasets related to ERBB signaling and evaluate the resulting networks: 1) by finding literature support for the inferred edges, and 2) by using a network assembled with Ingenuity IPA as the true network for a quantitative assessment. Our results are further compared quantitatively to DDEPN and the original model. 
Quantitatively, our second model extension is shown to perform better than both the original model and DDEPN. Qualitatively, we find literature support for most of the inferred edges in both datasets, while also inferring a few plausible edges for which no literature evidence was found. With the widespread use of high-throughput technologies such as DNA microarrays, it has become common to study dozens or even hundreds of genes/proteins in a single experiment. However, these experiments only allow us to determine the expression of the genes/proteins and tell us nothing about the interactions between them. Network inference methods are therefore needed to study the interactions between genes/proteins. Understanding these interactions makes it possible to better understand not only biological processes in general but also how diseases act, in order to develop new drugs. In this master's thesis, we developed and tested two extensions of a model based on linear programming. This model infers signal transduction networks from RNAi experiments in which measurements are taken after the perturbation, when the network is in steady state. The extensions developed in this thesis make it possible to exploit time-series data from RNAi experiments, which allows causal relationships between proteins to be distinguished. In a first phase, we use artificial networks and simulated data to test the performance of both extensions under different conditions. In addition, we compare them with the original model and with a recent model, DDEPN, which uses time-series data from experiments in which the network to be inferred is perturbed. Overall, our second extension obtains better results, mainly in terms of sensitivity. This extension assumes that only active proteins can influence other proteins. 
In a second phase, we use two experimental datasets and evaluate the results obtained: 1) by searching the literature for support for the inferred edges, and 2) by using a reference network to make a quantitative assessment and to draw comparisons with the original model and DDEPN. Quantitatively, our second extension obtains better results than the original model and DDEPN. Qualitatively, we found literature support for most of the edges inferred by the second extension. We also inferred some quite plausible edges, although we found no support for them.

    Exact Tests for Singular Network Data

    We propose methodology for exact statistical tests of hypotheses for models of network dynamics. The methodology formulates Markovian exponential families, then uses sequential importance sampling to compute expectations within basins of attraction and within level sets of a sufficient statistic for an overdispersion model. Comparisons of hypotheses can be done conditional on basins of attraction. Examples are presented.

    Inference in systems biology: modelling approaches and applications

    The main topic of this thesis is the study of biological regulatory systems using different computational modelling approaches in order to gain new insights into biological processes that are not yet completely understood. In "systems biology", mathematical models are a powerful tool for studying biological processes. Models are abstractions of reality and always include some degree of simplification. An important ingredient of the modelling process, with a major role in suggesting the appropriate level of abstraction and simplification, is the purpose of the model, that is, the question it has to answer. This thesis focuses on analysing how well models of different complexity describe the available data to achieve a given purpose. Such analysis guides the choice of the most appropriate degree of simplification of the system under study, one that allows some aspects to be neglected without compromising the results of the model. Three levels of detail for inference and modelling are analysed in this thesis, depending on the system under consideration. The first level is the network level, where molecules are nodes connected by edges and the interest is in inferring the topology of connections at large scale. At the second level, the network is interpreted as a means to produce qualitative simulations and predictions which can be compared with experimental data. The third level of detail consists of a more mechanistic dynamic description of the system using ordinary differential equations, with the analysis limited to small subsystems. For each level of detail, appropriate approaches have been developed and applied to in silico and real data from different biological systems. Finally, different modelling approaches have been integrated to analyse the insulin signalling pathway at different levels of simplification, using a novel experimental dataset collected specifically for this purpose.

    RNA Interference Data: from a Statistical Analysis to Network Inference

    Viruses are the cause of many severe human diseases such as Hepatitis C, Dengue fever, AIDS, Influenza and even cancer. As a consequence of viral diseases, several million people die every year all over the world. Because viruses evolve rapidly, drug development and treatment are especially difficult. The present work aims at a better understanding of the ongoing signaling processes of certain diseases. To this end, methods for the analysis of RNA interference (RNAi) data and for network inference from such data are presented. Recent biological and technological advances in the field of RNAi enable the knockdown of individual genes in a high-content, high-throughput manner. As a result, perturbation effects on specific phenotypes can be quantified in detail using multiparametric imaging. This in turn allows the identification of genes which are involved in certain biological processes, such as virus-host factors used in viral life-cycles. However, hit lists of already published RNAi screens show only a small overlap, even for studies of the same virus. This may be due to insufficient data analysis, where the potential of microscopic screening data is not fully tapped because individual cell measurements are not taken into account for data normalization and hit scoring. This thesis shows that, for RNAi data studying Hepatitis C and Dengue virus, the phenotypic effect after a perturbation is highly influenced by each cell's population context. Therefore, novel methodologies are proposed which use the individual cell measurements for data analysis and statistical scoring. This results in increased sensitivity and specificity in comparison to existing methods where these factors are disregarded. The method proposed here allows the identification of already known as well as new hit genes which are significantly involved in the respective viral life-cycles. 
The spatial and temporal placement of these hits, however, still remains unknown, and the ongoing signaling processes are only poorly understood. To understand the underlying biology from a system-wide view, it is necessary to infer the signaling cascade of the involved factors in detail. One of the challenges of network inference is that the dimensionality increases exponentially with the number of nodes. The method proposed in this thesis is formulated as a linear optimization problem which can be solved efficiently even for large data sets. The model can incorporate data from single or multiple perturbations at the same time. The aim is to find the network topology which best represents the given data. Based on simulated data for a small artificial five-node example, the robustness of the model against noisy or incomplete data is demonstrated. Furthermore, for this small network as well as for larger networks with 10 to 52 nodes, it is shown that the model achieves better results than random guessing. In addition, for large networks, both the performance and the computation time are better than those of another recently published approach. Moreover, the network inference method presented here has been applied to data measuring the signaling of ErbB proteins. These proteins are associated with the development of many human cancers. The results of the network inference show that already known signaling cascades can be successfully reconstructed from the data. Additionally, newly learned protein-protein interactions indicate that there are several still unknown feedback and feedforward loops. The proteins in these loops may serve as potential targets to control ErbB signaling. Knowledge about these factors is an important step towards the development of new drugs and therefore helps to fight ErbB-related diseases.