
    Performance Assessment of the Network Reconstruction Approaches on Various Interactomes

    Beyond compiling lists of molecules, it is necessary to consider multiple sets of omic data collectively and to reconstruct the connections between the molecules. In particular, pathway reconstruction is crucial to understanding disease biology because abnormal cellular signaling may be pathological. The main challenge is how to integrate the data accurately. In this study, we comparatively analyze the performance of a set of network reconstruction algorithms on multiple reference interactomes. We first explored several human protein interactomes, including PathwayCommons, OmniPath, HIPPIE, iRefWeb, STRING, and ConsensusPathDB. The comparison is based on each interactome's coverage of cancer driver proteins, the structural information available for its protein interactions, and its bias toward well-studied proteins. We next used these interactomes to evaluate the performance of network reconstruction algorithms, including all-pair shortest path, heat diffusion with flux, personalized PageRank with flux, and prize-collecting Steiner forest (PCSF) approaches. Each approach has its own merits and weaknesses. Among them, PCSF had the most balanced performance in terms of precision and recall scores when 28 pathways from NetPath were reconstructed using the listed algorithms. Additionally, the reference interactome affects the performance of the network reconstruction approaches: the coverage and disease- or tissue-specificity of each interactome may vary, which may result in differences in the reconstructed networks.
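
The precision and recall scores mentioned in this abstract can be illustrated with a minimal sketch. It assumes the comparison is made on sets of protein names; the abstract does not say whether the study scored nodes, edges, or both. The function name and the toy protein sets are hypothetical and are not taken from the study.

```python
def precision_recall(predicted, reference):
    """Score a reconstructed pathway against a reference pathway.

    Both arguments are iterables of protein identifiers (an assumption:
    the comparison here is node-based).
    """
    predicted, reference = set(predicted), set(reference)
    true_positives = len(predicted & reference)
    # Precision: fraction of reconstructed proteins that are in the reference.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: fraction of reference proteins that were recovered.
    recall = true_positives / len(reference) if reference else 0.0
    return precision, recall


# Toy data, invented for illustration only.
reference_pathway = {"EGFR", "GRB2", "SOS1", "KRAS", "RAF1"}
reconstructed = {"EGFR", "GRB2", "KRAS", "TP53"}

precision, recall = precision_recall(reconstructed, reference_pathway)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A method that returns very large networks tends to gain recall at the expense of precision, which is one way to read "balanced performance" in the abstract.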

    Network-Based Interpretation of Diverse High-Throughput Datasets through the Omics Integrator Software Package

    High-throughput, ‘omic’ methods provide sensitive measures of biological responses to perturbations. However, inherent biases in high-throughput assays make it difficult to interpret experiments in which more than one type of data is collected. In this work, we introduce Omics Integrator, a software package that takes a variety of ‘omic’ data as input and identifies putative underlying molecular pathways. The approach applies advanced network optimization algorithms to a network of thousands of molecular interactions to find high-confidence, interpretable subnetworks that best explain the data. These subnetworks connect changes observed in gene expression, protein abundance or other global assays to proteins that may not have been measured in the screens due to inherent bias or noise in measurement. This approach reveals unannotated molecular pathways that would not be detectable by searching pathway databases. Omics Integrator also provides an elegant framework to incorporate not only positive data, but also negative evidence. Incorporating negative evidence allows Omics Integrator to avoid unexpressed genes and to avoid bias toward highly studied hub proteins, except when they are strongly implicated by the data. The software comprises two individual tools, Garnet and Forest, that can be run together or independently to allow a user to perform advanced integration of multiple types of high-throughput data as well as create condition-specific subnetworks of protein interactions that best connect the observed changes in various datasets. It is available at http://fraenkel.mit.edu/omicsintegrator and on GitHub at https://github.com/fraenkel-lab/OmicsIntegrator. Funding: National Institutes of Health (U.S.), grants U54CA112967, U01CA184898, U54NS091046, and R01GM089903.

    Reconstruction of the temporal signaling network in Salmonella-infected human cells

    Salmonella enterica is a bacterial pathogen that usually infects its host through food sources. Translocation of the pathogen proteins into the host cells leads to changes in the signaling mechanism, either by activating or by inhibiting the host proteins. Using high-throughput ‘omic’ technologies, changes in the signaling components can be quantified at different levels; however, experimental hits are usually too incomplete to represent the whole signaling system, as some driver proteins stay hidden within the experimental data. Given that the bacterial infection modifies the response network of the host, a more coherent view of the underlying biological processes and the signaling networks can be obtained by using a network modeling approach based on reverse engineering principles, in which a confident region of the protein interactome is found by inferring hits from the omic experiments. In this work, we have used a published temporal phosphoproteomic dataset of Salmonella-infected human cells and reconstructed the temporal signaling network of the human host by integrating the interactome and the phosphoproteomic datasets. We have combined two well-established network modeling frameworks, the Prize-collecting Steiner Forest (PCSF) approach and the Integer Linear Programming (ILP) based edge inference approach. The resulting network preserves information on temporality and the direction of interactions, while revealing hidden entities in the signaling, such as the SNARE binding, mTOR signaling, immune response, cytoskeleton organization, and apoptosis pathways. Targets of the Salmonella effectors in the host cells, such as CDC42, RHOA, 14-3-3η, the Syntaxin family, and Oxysterol-binding proteins, were included in the reconstructed signaling network although they were not present in the initial phosphoproteomic data. We believe that integrated approaches have a high potential for the identification of clinical targets in infectious diseases, especially in Salmonella infections.
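
As a rough illustration of the prize-collecting Steiner forest idea named in this abstract, the sketch below evaluates the textbook form of the objective for a candidate subnetwork: prizes of nodes left out, plus costs of edges kept, plus a per-tree penalty. The function name, the `beta` and `omega` parameter names, and the toy prizes and costs are assumptions made for illustration. The sketch only scores candidates; it does not search for the optimum and is not the authors' implementation.

```python
def pcsf_objective(prizes, edge_costs, chosen_edges, num_trees=1, beta=1.0, omega=0.0):
    """Score a candidate forest; lower is better.

    prizes       -- dict mapping node -> prize (e.g. derived from omic hits)
    edge_costs   -- dict mapping frozenset({u, v}) -> cost (e.g. low confidence = high cost)
    chosen_edges -- iterable of (u, v) pairs kept in the candidate forest
    beta, omega  -- illustrative trade-off weights (assumed names)
    Simplification: a node counts as included only if it touches a chosen edge.
    """
    chosen_edges = list(chosen_edges)
    chosen_nodes = {node for edge in chosen_edges for node in edge}
    # Penalty for every prized node the forest fails to include.
    missed_prizes = sum(beta * prize for node, prize in prizes.items()
                        if node not in chosen_nodes)
    # Price paid for every interaction the forest uses.
    paid_costs = sum(edge_costs[frozenset(edge)] for edge in chosen_edges)
    return missed_prizes + paid_costs + omega * num_trees


# Toy network, invented for illustration only.
prizes = {"A": 5.0, "B": 1.0, "C": 4.0}
edge_costs = {
    frozenset(("A", "B")): 1.0,
    frozenset(("B", "C")): 1.5,
    frozenset(("A", "C")): 6.0,
}

via_b = pcsf_objective(prizes, edge_costs, [("A", "B"), ("B", "C")])
direct = pcsf_objective(prizes, edge_costs, [("A", "C")])
empty = pcsf_objective(prizes, edge_costs, [], num_trees=0)
print(via_b, direct, empty)
```

In this toy case the path through the low-prize node B scores best, which mirrors how such methods can pull in proteins that were absent from the experimental hits.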

    The node-weighted Steiner tree approach to identify elements of cancer-related signaling pathways

    BACKGROUND: Cancer constitutes a momentous health burden in our society. Critical information on cancer may be hidden in its signaling pathways. However, even though a large amount of money has been spent on cancer research, some critical information on cancer-related signaling pathways remains elusive. Hence, new work toward a complete understanding of cancer-related signaling pathways will greatly benefit the prevention, diagnosis, and treatment of cancer. RESULTS: We propose the node-weighted Steiner tree approach to identify important elements of cancer-related signaling pathways at the level of proteins. This new approach has advantages over previous approaches since it is fast in processing large protein-protein interaction networks. We apply this new approach to identify important elements of two well-known cancer-related signaling pathways: PI3K/Akt and MAPK. First, we generate a node-weighted protein-protein interaction network using protein and signaling pathway data. Second, we modify and use two preprocessing techniques and a state-of-the-art Steiner tree algorithm to identify a subnetwork in the generated network. Third, we propose two new metrics to select important elements from this subnetwork. On a commonly used personal computer, this new approach takes less than 2 s to identify the important elements of PI3K/Akt and MAPK signaling pathways in a large node-weighted protein-protein interaction network with 16,843 vertices and 1,736,922 edges. We further analyze and demonstrate the significance of these identified elements to cancer signal transduction by exploring previously reported experimental evidence. CONCLUSIONS: Our node-weighted Steiner tree approach is shown to be both fast and effective in identifying important elements of cancer-related signaling pathways. Furthermore, it may provide new perspectives into the identification of signaling pathways for other human diseases.

    SAMNetWeb: identifying condition-specific networks linking signaling and transcription

    Motivation: High-throughput datasets such as genetic screens, mRNA expression assays and global phospho-proteomic experiments are often difficult to interpret due to inherent noise in each experimental system. Computational tools have improved interpretation of these datasets by enabling the identification of biological processes and pathways that are most likely to explain the measured results. These tools are primarily designed to analyse data from a single experiment (e.g. drug treatment versus control), creating a need for computational algorithms that can handle heterogeneous datasets across multiple experimental conditions at once. Summary: We introduce SAMNetWeb, a web-based tool that enables functional enrichment analysis and visualization of high-throughput datasets. SAMNetWeb can analyse two distinct data types (e.g. mRNA expression and global proteomics) simultaneously across multiple experimental systems to identify pathways activated in these experiments and then visualize the pathways in a single interaction network. Through the use of a multi-commodity flow based algorithm that requires each experiment to ‘share’ underlying protein interactions, SAMNetWeb can identify distinct and common pathways across experiments. Availability and implementation: SAMNetWeb is freely available at http://fraenkel.mit.edu/samnetweb. Funding: United States National Institutes of Health (U54CA112967, R01GM089903); National Science Foundation (U.S.) (DB1-0821391).

    Multi-label multi-instance transfer learning for simultaneous reconstruction and cross-talk modeling of multiple human signaling pathways

    Text file contains the predicted cross-talk signaling components between human signaling pathways (homolog instance). (ZIP, 36 KB)

    Genome-Scale Networks Link Neurodegenerative Disease Genes to α-Synuclein through Specific Molecular Pathways

    Numerous genes and molecular pathways are implicated in neurodegenerative proteinopathies, but their inter-relationships are poorly understood. We systematically mapped molecular pathways underlying the toxicity of alpha-synuclein (α-syn), a protein central to Parkinson's disease. Genome-wide screens in yeast identified 332 genes that impact α-syn toxicity. To “humanize” this molecular network, we developed a computational method, TransposeNet. This integrates a Steiner prize-collecting approach with homology assignment through sequence, structure, and interaction topology. TransposeNet linked α-syn to multiple parkinsonism genes and druggable targets through perturbed protein trafficking and ER quality control as well as mRNA metabolism and translation. A calcium signaling hub linked these processes to perturbed mitochondrial quality control and function, metal ion transport, transcriptional regulation, and signal transduction. Parkinsonism gene interaction profiles spatially opposed in the network (ATP13A2/PARK9 and VPS35/PARK17) were highly distinct, and network relationships for specific genes (LRRK2/PARK8, ATXN2, and EIF4G1/PARK18) were confirmed in patient induced pluripotent stem cell (iPSC)-derived neurons. This cross-species platform connected diverse neurodegenerative genes to proteinopathy through specific mechanisms and may facilitate patient stratification for targeted therapy. Keywords: alpha-synuclein; iPS cell; Parkinson’s disease; stem cell; mRNA translation; RNA-binding protein; LRRK2; VPS35; vesicle trafficking; yeast

    Identifying niche mediated regulatory factors of stem cell phenotypic state: a systems biology approach

    Understanding how the cellular niche controls the stem cell phenotype is often hampered by the complexity of variegated niche composition, its dynamics, and nonlinear stem cell–niche interactions. Here, we propose a systems biology view that considers stem cell–niche interactions as a many‐body problem amenable to simplification by the concept of mean field approximation. This enables approximation of the niche effect on stem cells as a constant field that induces sustained activation/inhibition of specific stem cell signaling pathways (niche determinants) in all stem cells within heterogeneous populations exhibiting the same phenotype. This view offers a new basis for the development of single cell‐based computational approaches for identifying niche determinants, which has potential applications in regenerative medicine and tissue engineering.

    Methods for Utilizing Co-expression Networks for Biological Insight

    The explosion of high-throughput Omics assays in the past 15 years has led to a revolution in the quantity of data and the number of data types available to biological researchers. This has necessitated a second revolution in the development of analytical tools to handle this wealth and variety of data. It is no longer practical for a researcher to simply examine a list of differentially expressed compounds and draw meaningful insight about the biological processes at hand; these differentially expressed compounds must be put into context with each other and integrated with existing biological knowledge. Co-expression techniques, where the simultaneous expression of two or more compounds is analyzed, have become a powerful tool for biological insight in high-throughput Omics settings. The primary goal of this dissertation is to develop techniques for identifying and characterizing patterns of co-expression. In our first project, we develop a Differentially Weighted Factor Model for estimating covariance matrices related through structured experimental design. Our factor model allows us to estimate common structural elements using all available data, and to estimate unique structural elements in a condition-specific manner. We develop a method for visualizing the resulting estimates, and implement the method in an R package, DWFM. The second project presents a method using the Prize Collecting Steiner Tree (PCST) algorithm to integrate and identify modules in lipid and untargeted metabolomic assays in a data-driven manner. These assays are integrated over a co-expression network specific to the applied setting in question, allowing us to capture modules unique to this setting. Our final project presents a second technique for identifying modules of co-expressed biomolecules. This technique addresses a major limitation of PCST-based approaches, namely that one is required to choose a cutoff to obtain a list of differentially expressed compounds used as input into the algorithm. Additionally, this second method utilizes a meta-analysis-inspired approach to identify patterns of co-expression across multiple data sets, thus reducing the impact of a single noisy assay.
    PhD, Statistics, University of Michigan, Horace H. Rackham School of Graduate Studies. https://deepblue.lib.umich.edu/bitstream/2027.42/143996/1/tealg_1.pd
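
To make the notion of a co-expression network concrete, here is a minimal sketch that links two compounds when the Pearson correlation of their abundance profiles is strong. This is a generic thresholding scheme chosen for illustration, not the Differentially Weighted Factor Model or the PCST-based method described in the dissertation. The function names, the 0.8 cutoff, and the toy profiles are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def coexpression_edges(profiles, threshold=0.8):
    """Return (compound_a, compound_b, r) for pairs with |r| >= threshold."""
    edges = []
    for (name_a, values_a), (name_b, values_b) in combinations(sorted(profiles.items()), 2):
        r = pearson(values_a, values_b)
        if abs(r) >= threshold:
            edges.append((name_a, name_b, r))
    return edges


# Toy abundance profiles across four samples, invented for illustration only.
profiles = {
    "lipid_1": [1.0, 2.0, 3.0, 4.0],
    "lipid_2": [2.0, 4.0, 6.0, 8.5],
    "metab_1": [4.0, 3.0, 2.0, 1.0],
    "metab_2": [1.0, 3.0, 2.0, 3.0],
}

for name_a, name_b, r in coexpression_edges(profiles):
    print(f"{name_a} -- {name_b}: r = {r:.3f}")
```

The need to pick a threshold here is the same kind of cutoff choice that the abstract identifies as a limitation of cutoff-based inputs.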