    Exploring the Boundaries of Gene Regulatory Network Inference

    To understand how the components of a complex system like the biological cell interact and regulate each other, we need to collect data on how the components respond to system perturbations. Such data can then be used to solve the inverse problem of inferring a network that describes how the pieces influence each other. The work in this thesis deals with modelling the cell regulatory system, often represented as a network, with tools and concepts derived from systems biology. The first investigation focuses on network sparsity and algorithmic biases introduced by penalised network inference procedures. Many contemporary network inference methods rely on a sparsity parameter such as the L1 penalty term used in the LASSO. However, a poor choice of the sparsity parameter can give highly incorrect network estimates. To avoid such poor choices, we devised a method to optimise the sparsity parameter so that it maximises the accuracy of the inferred network. We showed that it is effective on in silico data sets with a reasonable level of informativeness and demonstrated that accurate prediction of network sparsity is key to elucidating the correct network parameters. The second investigation focuses on how knowledge from association networks can be transferred to regulatory network inference procedures. The quality of expression data is commonly inadequate for reliable gene regulatory network inference. Therefore, we constructed an algorithm to incorporate prior knowledge and demonstrated that it increases the accuracy of network inference when the quality of the data is low. The third investigation aimed to understand the influence of system and data properties on network inference accuracy. L1 regularisation methods commonly produce poor network estimates when the data used for inference is ill-conditioned, even when the signal-to-noise ratio is so high that all links in the network can be proven to exist at the given significance level. In this study we elucidated some general principles governing the conditions under which we expect strongly degraded accuracy. Moreover, it allowed us to estimate expected accuracy from the conditions of simulated data, which was used to predict the performance of inference algorithms on biological data. Finally, we built a software package, GeneSPIDER, for solving problems encountered during the previous investigations. The software package supports highly controllable network and data generation as well as data analysis and exploration in the context of network inference. At the time of the doctoral defense, the following paper was unpublished and had the following status: Paper 4: Manuscript.
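
The effect of the sparsity parameter can be illustrated with a minimal sketch. It assumes the special case of an orthonormal design, where the LASSO solution for the objective (1/2)||y - Xb||^2 + penalty * ||b||_1 reduces to soft-thresholding the least-squares estimates. The weights and penalty values are hypothetical, and this is not the optimisation method developed in the thesis.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty; values within [-penalty, penalty] become 0."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def count_links(weights, penalty):
    """Number of links that remain non-zero after applying the L1 penalty."""
    return sum(1 for w in weights if soft_threshold(w, penalty) != 0.0)


# Hypothetical least-squares link weights for the regulators of one gene.
ols_weights = [0.90, -0.45, 0.08, -0.03, 0.60]

# A larger penalty gives a sparser network; too large a penalty removes every link.
for penalty in (0.0, 0.05, 0.5, 1.0):
    print(penalty, count_links(ols_weights, penalty))
```

With these made-up weights the number of surviving links falls from five to zero as the penalty grows, which is why a poorly chosen value can either keep spurious links or discard true ones.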

    Network services and tool support for cloud environments

    Virtualization of servers and network infrastructure is an effective way to reduce hardware costs as well as power consumption. Today, cloud systems are often used to handle virtualization of servers but lack the ability to deploy and configure network equipment. Facilitating network equipment configuration within a cloud environment would make it possible to create complete virtual networks alongside the well-known and proven features of cloud computing today. Our solution provides users with a graphical tool for easy and quick configuration of not only virtual machines but also virtual network equipment within a cloud environment. This makes it possible for a user to create advanced network topologies. Creating complete virtual networks like this using a single tool will speed up configuration and minimize the errors that can occur when manually configuring multiple instances. The implemented software solution consists of two major parts: a graphical user interface (GUI) and a back-end server. The back-end server is responsible for handling communication between the user application and an underlying cloud platform, in this case OpenStack. The graphical user interface lets the user draw networks and launch virtual machines using simple drag-and-drop features. It also monitors all the running virtual instances and physical machines, and alerts the user if a problem occurs. This project is the first step towards supporting global virtual networks spanning multiple data centers. It shows that it is possible to create virtual networks using a cloud environment as a base.
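
As a rough illustration of what a drawn topology might look like before it is handed to a back-end, the sketch below models nodes and links in memory and rejects invalid links. The class, its method names, and the node kinds are all hypothetical; the abstract does not describe the tool's actual data model or its OpenStack calls.

```python
from dataclasses import dataclass, field


@dataclass
class Topology:
    """Hypothetical in-memory model of a network topology drawn by a user."""
    nodes: dict = field(default_factory=dict)  # node name -> kind, e.g. "vm" or "router"
    links: list = field(default_factory=list)  # (name_a, name_b) pairs

    def add_node(self, name, kind):
        if name in self.nodes:
            raise ValueError(f"duplicate node: {name}")
        self.nodes[name] = kind

    def connect(self, a, b):
        # Reject links to nodes that were never added, and self-links.
        for name in (a, b):
            if name not in self.nodes:
                raise KeyError(f"unknown node: {name}")
        if a == b:
            raise ValueError("a node cannot link to itself")
        self.links.append((a, b))

    def unconnected(self):
        """Names of nodes that have no link, sorted alphabetically."""
        linked = {name for link in self.links for name in link}
        return sorted(set(self.nodes) - linked)


topology = Topology()
topology.add_node("router1", "router")
topology.add_node("vm1", "vm")
topology.add_node("vm2", "vm")
topology.connect("router1", "vm1")
print(topology.unconnected())
```

Validating the model up front is one way to catch the kind of mistakes that manual, per-instance configuration tends to introduce.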

    Deriving disease modules from the compressed transcriptional space embedded in a deep autoencoder

    Disease modules in molecular interaction maps have been useful for characterizing diseases. Yet the biological networks that commonly define such modules are incomplete and biased toward some well-studied disease genes. Here we ask whether disease-relevant modules of genes can be discovered without prior knowledge of a biological network, by instead training a deep autoencoder on large transcriptional data. We hypothesize that modules could be discovered within the autoencoder representations. We find a statistically significant enrichment of genes relevant to genome-wide association studies (GWAS) in the last layer, and to a successively lesser degree in the middle and first layers. In contrast, we find an opposite gradient, where a modular protein-protein interaction signal is strongest in the first layer but then vanishes smoothly deeper in the network. We conclude that a data-driven discovery approach is sufficient to discover groups of disease-related genes. The study of disease modules facilitates insight into complex diseases, but their identification relies on knowledge of molecular networks. Here, the authors show that disease modules and genes can also be discovered in deep autoencoder representations of large human gene expression datasets. Funding agencies: Swedish Foundation for Strategic Research; Swedish Research Council; Linköping University.
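
One common way to quantify enrichment of this kind is a one-sided hypergeometric test. The sketch below is a generic illustration with made-up counts; the abstract does not say which statistical test the authors used, so the choice of test and all parameter names are assumptions.

```python
from math import comb


def enrichment_p_value(population, annotated, drawn, overlap):
    """P(X >= overlap) for X ~ Hypergeometric(population, annotated, drawn).

    population: total number of genes considered
    annotated:  genes in the population carrying the annotation (e.g. GWAS-relevant)
    drawn:      size of the candidate gene set (e.g. genes picked from one layer)
    overlap:    annotated genes observed in the candidate set
    """
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, drawn - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20 genes, 5 annotated, 4 drawn, 2 of the drawn are annotated.
p = enrichment_p_value(population=20, annotated=5, drawn=4, overlap=2)
print(round(p, 4))
```

A small p-value would indicate that the candidate set contains more annotated genes than expected by chance; comparing such values across layers is one way to describe a gradient of enrichment.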

    A generalized framework for controlling FDR in gene regulatory network inference

    Motivation: Inference of gene regulatory networks (GRNs) from perturbation data can give detailed mechanistic insights into a biological system. Many inference methods exist, but the resulting GRN is generally sensitive to the choice of method-specific parameters. Even though the inferred GRN is optimal given the parameters, many links may be wrong or missing if the data is not informative. To make GRN inference reliable, a method is needed to estimate the support of each predicted link as the method parameters are varied. Results: To achieve this, we have developed a method called nested bootstrapping, which applies a bootstrapping protocol to GRN inference and, through repeated bootstrap runs, assesses the stability of the estimated support values. To translate bootstrap support values into false discovery rates, we run the same pipeline with shuffled data as input. This provides a general method to control the false discovery rate of GRN inference that can be applied to any setting of inference parameters, noise level, or data properties. We evaluated nested bootstrapping on a simulated dataset spanning a range of such properties, using the LASSO, Least Squares, RNI, GENIE3 and CLR inference methods. Improved inference accuracy was observed in almost all situations. Nested bootstrapping was incorporated into the GeneSPIDER package, which was also used for generating the simulated networks and data, as well as for running and analyzing the inferences.
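
The idea of turning support values into a false discovery rate using a shuffled-data null can be sketched as follows. This is a simplified, single-level illustration rather than the nested protocol itself: it assumes that a link's support is the fraction of bootstrap runs in which the link was inferred, and the function name and example values are hypothetical.

```python
def estimate_fdr(measured_support, shuffled_support, threshold):
    """Estimate the FDR among links whose support is at least `threshold`.

    The shuffled-data support values act as a null distribution: the fraction
    of null links passing the threshold estimates how many accepted links
    would pass by chance alone.
    """
    accepted = sum(1 for s in measured_support if s >= threshold)
    if accepted == 0:
        return 0.0  # nothing accepted, so there are no false discoveries
    null_rate = sum(1 for s in shuffled_support if s >= threshold) / len(shuffled_support)
    expected_false = null_rate * len(measured_support)
    return min(1.0, expected_false / accepted)


# Hypothetical support values for eight candidate links.
measured = [0.98, 0.95, 0.91, 0.60, 0.40, 0.35, 0.20, 0.10]
shuffled = [0.92, 0.55, 0.45, 0.30, 0.25, 0.20, 0.15, 0.05]

for threshold in (0.9, 0.5):
    print(threshold, estimate_fdr(measured, shuffled, threshold))
```

Raising the threshold accepts fewer links but lowers the estimated FDR, so a support cutoff can be chosen to meet a target rate.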

    GeneSPIDER - Generation and Simulation Package for Informative Data ExploRation

    A range of tools are available to model, simulate and analyze gene regulatory networks (GRNs). However, these tools provide limited ability to control network topology, system dynamics, design of experiments, data properties, or noise characteristics. Independent control of these properties is the key to drawing conclusions on which inference method to use and what result to expect from it, as well as to obtaining desired approximations of real biological systems. To draw conclusions on the relation between a network or data property and the performance of an inference method in simulations, system approximations with varying properties are needed. We present a Matlab package, GeneSPIDER, for generation and analysis of networks and data in a dynamical systems framework, with a focus on the ability to vary properties. It supplies not only essential components that have been missing, but also wrappers to existing tools in common use. In particular, it contains tools for controlling and analyzing network topology (random, small-world, scale-free), stability of linear time-invariant systems, signal-to-noise ratio (SNR), and interampatteness. It also contains methods for design of perturbation experiments, bootstrapping, analysis of linear dependence, sample selection, scaling of the SNR, and performance evaluation. GeneSPIDER offers control of network and data properties in simulations, together with tools to analyze these properties and draw conclusions on the quality of inferred GRNs. It can be fetched freely from the online git repository https://bitbucket.org/sonnhammergrni/genespider
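
To make the notion of stability for a linear time-invariant system dx/dt = A x concrete, the sketch below applies the Gershgorin circle theorem as a simple sufficient check that every eigenvalue of A has a negative real part. It is written in Python for illustration only and is not GeneSPIDER's Matlab implementation; the function name and matrices are hypothetical.

```python
def gershgorin_proves_stability(matrix):
    """Return True if Gershgorin discs prove all eigenvalues have negative real part.

    Each eigenvalue lies in a disc centred at a diagonal entry with radius equal
    to the sum of absolute off-diagonal entries in that row. If every disc lies
    strictly in the left half-plane, the system dx/dt = A x is stable. A False
    result means "not proven", not "unstable".
    """
    for i, row in enumerate(matrix):
        radius = sum(abs(x) for j, x in enumerate(row) if j != i)
        if row[i] + radius >= 0:
            return False
    return True


# Hypothetical network matrix with strong self-degradation on the diagonal.
dominant = [
    [-1.0, 0.3, 0.0],
    [0.2, -0.8, 0.4],
    [0.0, -0.5, -1.2],
]
# Eigenvalues are both -1 (stable), but the check is too coarse to prove it.
triangular = [
    [-1.0, 2.0],
    [0.0, -1.0],
]

print(gershgorin_proves_stability(dominant))
print(gershgorin_proves_stability(triangular))
```

The second matrix shows the limit of the shortcut: a full eigenvalue computation would be needed to confirm stability when the discs cross the imaginary axis.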