76,427 research outputs found

    Cancer driver gene detection in transcriptional regulatory networks using the structure analysis of weighted regulatory interactions

    Identifying the genes that initiate cell anomalies and cause cancer in humans is an important field of oncology research. Mutations and anomalies that develop in these genes are transferred to other genes in the cell and thereby disrupt its normal function. These genes are known as cancer driver genes (CDGs). Various methods have been proposed for predicting CDGs, most of which are computational and based on genomic data. Therefore, some researchers have developed novel bioinformatics approaches. In this study, we propose an algorithm that calculates the effectiveness and strength of each gene and ranks the genes using gene regulatory networks and stochastic analysis of the regulatory link structure between genes. First, we constructed the regulatory network from gene expression data and a list of regulatory interactions. Then, using biological and topological features of the network, we weighted the regulatory interactions. These weights were then used in the interaction structure analysis, which was carried out using two separate Markov chains on the bipartite graph obtained from the main graph of the gene network, following the stochastic approach for link-structure analysis. The proposed algorithm categorizes higher-ranked genes as driver genes. The efficiency of the proposed algorithm, in terms of F-measure and number of identified driver genes, was compared with 23 other computational and network-based methods.
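
The two-chain analysis described in this abstract can be illustrated with a minimal sketch. It assumes a SALSA-style formulation (one Markov chain over regulators acting as hubs, one over targets acting as authorities) on weighted regulator-to-target edges. The function name `salsa_scores`, the gene labels, and the weights are hypothetical, and this is not the paper's actual weighting scheme or implementation.

```python
from collections import defaultdict


def salsa_scores(edges, iterations=100):
    """SALSA-style hub/authority scores by power iteration (illustrative only).

    edges: list of (regulator, target, weight) tuples with weight > 0.
    Returns (hub_scores, authority_scores), each summing to 1.
    """
    out_w = defaultdict(float)  # total outgoing weight per regulator (hub side)
    in_w = defaultdict(float)   # total incoming weight per target (authority side)
    for u, v, w in edges:
        out_w[u] += w
        in_w[v] += w

    # Start both chains from a uniform distribution.
    hubs = {u: 1.0 / len(out_w) for u in out_w}
    auths = {v: 1.0 / len(in_w) for v in in_w}

    for _ in range(iterations):
        # Authority chain: step backward to a regulator, then forward to a target.
        hub_mass = defaultdict(float)
        for u, v, w in edges:
            hub_mass[u] += auths[v] * w / in_w[v]
        new_auths = defaultdict(float)
        for u, v, w in edges:
            new_auths[v] += hub_mass[u] * w / out_w[u]

        # Hub chain: step forward to a target, then backward to a regulator.
        auth_mass = defaultdict(float)
        for u, v, w in edges:
            auth_mass[v] += hubs[u] * w / out_w[u]
        new_hubs = defaultdict(float)
        for u, v, w in edges:
            new_hubs[u] += auth_mass[v] * w / in_w[v]

        auths, hubs = dict(new_auths), dict(new_hubs)
    return hubs, auths


# Hypothetical weighted regulatory interactions: (regulator, target, weight).
example_edges = [("A", "B", 1.0), ("A", "C", 1.0), ("D", "C", 2.0)]
hub_scores, authority_scores = salsa_scores(example_edges)
ranked_targets = sorted(authority_scores, key=authority_scores.get, reverse=True)
```

In this toy network the target carrying the most incoming weight ranks first. How the hub and authority scores are combined into a final driver-gene ranking is not specified here and would depend on the paper's method.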

    Learn to Generate Time Series Conditioned Graphs with Generative Adversarial Nets

    Deep learning based approaches have recently been used to model and generate graphs following different distributions. However, they are typically unsupervised, unconditioned generative models, or are conditioned only on graph-level contexts, which are not associated with rich semantic node-level contexts. In contrast, in this paper we are interested in a novel problem named Time Series Conditioned Graph Generation: given an input multivariate time series, we aim to infer a target relation graph modeling the underlying interrelationships between the time series, with each node corresponding to one time series. For example, we can study the interrelationships between genes in a gene regulatory network of a certain disease conditioned on their gene expression data recorded as time series. To achieve this, we propose a novel Time Series conditioned Graph Generation-Generative Adversarial Networks (TSGG-GAN) to handle the challenges of conditioning on rich node-level context structures and of measuring similarities directly between graphs and time series. Extensive experiments on synthetic and real-world gene regulatory network datasets demonstrate the effectiveness and generalizability of the proposed TSGG-GAN.

    Generation of a Compendium of Transcription Factor Cascades and Identification of Potential Therapeutic Targets using Graph Machine Learning

    Transcription factors (TFs) play a vital role in the regulation of gene expression thereby making them critical to many cellular processes. In this study, we used graph machine learning methods to create a compendium of TF cascades using data extracted from the STRING database. A TF cascade is a sequence of TFs that regulate each other, forming a directed path in the TF network. We constructed a knowledge graph of 81,488 unique TF cascades, with the longest cascade consisting of 62 TFs. Our results highlight the complex and intricate nature of TF interactions, where multiple TFs work together to regulate gene expression. We also identified 10 TFs with the highest regulatory influence based on centrality measurements, providing valuable information for researchers interested in studying specific TFs. Furthermore, our pathway enrichment analysis revealed significant enrichment of various pathways and functional categories, including those involved in cancer and other diseases, as well as those involved in development, differentiation, and cell signaling. The enriched pathways identified in this study may have potential as targets for therapeutic intervention in diseases associated with dysregulation of transcription factors. We have released the dataset, knowledge graph, and graphML methods for the TF cascades, and created a website to display the results, which can be accessed by researchers interested in using this dataset. Our study provides a valuable resource for understanding the complex network of interactions between TFs and their regulatory roles in cellular processes

    Application of new probabilistic graphical models in the genetic regulatory networks studies

    This paper introduces two new probabilistic graphical models for reconstruction of genetic regulatory networks using DNA microarray data. One is an Independence Graph (IG) model with either a forward or a backward search algorithm and the other is a Gaussian Network (GN) model with a novel greedy search method. The performance of both models was evaluated on four MAPK pathways in yeast and three simulated data sets. Generally, an IG model provides a sparse graph, whereas a GN model produces a dense graph in which more information about gene-gene interactions is preserved. Additionally, we found two key limitations in the prediction of genetic regulatory networks using DNA microarray data: the first is the sufficiency of sample size, and the second is that the complexity of network structures may not be captured without additional data at the protein level. These limitations are present in all prediction methods that use only DNA microarray data. Comment: 38 pages, 3 figures

    On the inconsistency of ℓ1-penalised sparse precision matrix estimation

    Background: Various ℓ1-penalised estimation methods such as graphical lasso and CLIME are widely used for sparse precision matrix estimation and learning of undirected network structure from data. Many of these methods have been shown to be consistent under various quantitative assumptions about the underlying true covariance matrix. Intuitively, these conditions are related to situations where the penalty term will dominate the optimisation. Results: We explore the consistency of ℓ1-based methods for a class of bipartite graphs motivated by the structure of models commonly used for gene regulatory networks. We show that all ℓ1-based methods fail dramatically for models with nearly linear dependencies between the variables. We also study the consistency on models derived from real gene expression data and note that the assumptions needed for consistency never hold even for modest sized gene networks and ℓ1-based methods also become unreliable in practice for larger networks. Conclusions: Our results demonstrate that ℓ1-penalised undirected network structure learning methods are unable to reliably learn many sparse bipartite graph structures, which arise often in gene expression data. Users of such methods should be aware of the consistency criteria of the methods and check if they are likely to be met in their application of interest. Peer reviewed

    Inferring Regulatory Networks by Combining Perturbation Screens and Steady State Gene Expression Profiles

    Reconstructing transcriptional regulatory networks is an important task in functional genomics. Data obtained from experiments that perturb genes by knockouts or RNA interference contain useful information for addressing this reconstruction problem. However, such data can be limited in size and/or are expensive to acquire. On the other hand, observational data of the organism in steady state (e.g. wild-type) are more readily available, but their informational content is inadequate for the task at hand. We develop a computational approach to appropriately utilize both data sources for estimating a regulatory network. The proposed approach is based on a three-step algorithm to estimate the underlying directed but cyclic network, that uses as input both perturbation screens and steady state gene expression data. In the first step, the algorithm determines causal orderings of the genes that are consistent with the perturbation data, by combining an exhaustive search method with a fast heuristic that in turn couples a Monte Carlo technique with a fast search algorithm. In the second step, for each obtained causal ordering, a regulatory network is estimated using a penalized likelihood based method, while in the third step a consensus network is constructed from the highest scored ones. Extensive computational experiments show that the algorithm performs well in reconstructing the underlying network and clearly outperforms competing approaches that rely only on a single data source. Further, it is established that the algorithm produces a consistent estimate of the regulatory network. Comment: 24 pages, 4 figures, 6 tables

    Feedbacks from the metabolic network to the genetic network reveal regulatory modules in E. coli and B. subtilis

    The genetic regulatory network (GRN) plays a key role in controlling the response of the cell to changes in the environment. Although the structure of GRNs has been the subject of many studies, their large scale structure in the light of feedbacks from the metabolic network (MN) has received relatively little attention. Here we study the causal structure of the GRNs, namely the chain of influence of one component on the other, taking into account feedback from the MN. First we consider the GRNs of E. coli and B. subtilis without feedback from MN and illustrate their causal structure. Next we augment the GRNs with feedback from their respective MNs by including (a) links from genes coding for enzymes to metabolites produced or consumed in reactions catalyzed by those enzymes and (b) links from metabolites to genes coding for transcription factors whose transcriptional activity the metabolites alter by binding to them. We find that the inclusion of feedback from MN into GRN significantly affects its causal structure, in particular the number of levels and relative positions of nodes in the hierarchy, and the number and size of the strongly connected components (SCCs). We then study the functional significance of the SCCs. For this we identify condition specific feedbacks from the MN into the GRN by retaining only those enzymes that are essential for growth in specific environmental conditions simulated via the technique of flux balance analysis (FBA). We find that the SCCs of the GRN augmented by these feedbacks can be ascribed specific functional roles in the organism. Our algorithmic approach thus reveals relatively autonomous subsystems with specific functionality, or regulatory modules in the organism. This automated approach could be useful in identifying biologically relevant modules in other organisms for which network data is available, but whose biology is less well studied. Comment: 15 figures
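
The effect of metabolic feedback on strongly connected components, as described in this abstract, can be sketched on a toy network. The example below uses Kosaraju's two-pass algorithm as a stand-in (the abstract does not say which SCC algorithm the authors used), and every node name (`tfA`, `enzB`, `geneC`, `metM`) is hypothetical.

```python
from collections import defaultdict


def strongly_connected_components(edges):
    """Return the SCCs of a directed graph as a list of sets (Kosaraju's algorithm)."""
    graph = defaultdict(list)
    reverse = defaultdict(list)
    nodes = set()
    for u, v in edges:
        graph[u].append(v)
        reverse[v].append(u)
        nodes.update((u, v))

    # Pass 1: iterative depth-first search, recording nodes in finishing order.
    visited = set()
    order = []
    for start in sorted(nodes):
        if start in visited:
            continue
        visited.add(start)
        stack = [(start, iter(graph[start]))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in visited:
                    visited.add(nxt)
                    stack.append((nxt, iter(graph[nxt])))
                    break
            else:
                # All neighbours explored: this node is finished.
                order.append(node)
                stack.pop()

    # Pass 2: search the reversed graph in decreasing finishing order.
    assigned = set()
    components = []
    for start in reversed(order):
        if start in assigned:
            continue
        assigned.add(start)
        component = {start}
        stack = [start]
        while stack:
            node = stack.pop()
            for prev in reverse[node]:
                if prev not in assigned:
                    assigned.add(prev)
                    component.add(prev)
                    stack.append(prev)
        components.append(component)
    return components


# Toy GRN: a transcription factor regulating an enzyme gene and another gene.
grn_edges = [("tfA", "enzB"), ("tfA", "geneC")]
# Feedback links of the two kinds named in the abstract:
# (a) enzyme gene -> metabolite, (b) metabolite -> transcription factor gene.
feedback_edges = [("enzB", "metM"), ("metM", "tfA")]

sccs_without_feedback = strongly_connected_components(grn_edges)
sccs_with_feedback = strongly_connected_components(grn_edges + feedback_edges)
largest_with_feedback = max(sccs_with_feedback, key=len)
```

Without the feedback links every node forms its own component. Adding them closes a cycle through the metabolite, so the transcription factor, the enzyme gene, and the metabolite merge into a single SCC.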