457 research outputs found

    POINeT: protein interactome with sub-network analysis and hub prioritization

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>Protein-protein interactions (PPIs) are critical to every aspect of biological processes. Expansion of all PPIs from a set of given queries often results in a complex PPI network lacking spatiotemporal consideration. Moreover, the reliability of available PPI resources, which consist of low- and high-throughput data, for network construction remains a significant challenge. Even though a number of software tools are available to facilitate PPI network analysis, an integrated tool is crucial to alleviate the burden on querying across multiple web servers and software tools.</p> <p>Results</p> <p>We have constructed an integrated web service, POINeT, to simplify the process of PPI searching, analysis, and visualization. POINeT merges PPI and tissue-specific expression data from multiple resources. The tissue-specific PPIs and the numbers of research papers supporting the PPIs can be filtered with user-adjustable threshold values and are dynamically updated in the viewer. The network constructed in POINeT can be readily analyzed with, for example, the built-in centrality calculation module and an integrated network viewer. Nodes in global networks can also be ranked and filtered using various network analysis formulas, i.e., centralities. To prioritize the sub-network, we developed a ranking filtered method (S3) to uncover potential novel mediators in the midbody network. Several examples are provided to illustrate the functionality of POINeT. The network constructed from four schizophrenia risk markers suggests that EXOC4 might be a novel marker for this disease. Finally, a liver-specific PPI network has been filtered with adult and fetal liver expression profiles.</p> <p>Conclusion</p> <p>The functionalities provided by POINeT are highly improved compared to previous version of POINT. POINeT enables the identification and ranking of potential novel genes involved in a sub-network. Combining with tissue-specific gene expression profiles, PPIs specific to selected tissues can be revealed. The straightforward interface of POINeT makes PPI search and analysis just a few clicks away. The modular design permits further functional enhancement without hampering the simplicity. POINeT is available at <url>http://poinet.bioinformatics.tw/</url>.</p

    Utility of network integrity methods in therapeutic target identification

    Get PDF
    Analysis of the biological gene networks involved in a disease may lead to the identification of therapeutic targets. Such analysis requires exploring network properties, in particular the importance of individual network nodes (i.e., genes). There are many measures that consider the importance of nodes in a network and some may shed light on the biological significance and potential optimality of a gene or set of genes as therapeutic targets. This has been shown to be the case in cancer therapy. A dilemma exists, however, in finding the best therapeutic targets based on network analysis since the optimal targets should be nodes that are highly influential in, but not toxic to, the functioning of the entire network. In addition, cancer therapeutics targeting a single gene often result in relapse since compensatory, feedback and redundancy loops in the network may offset the activity associated with the targeted gene. Thus, multiple genes reflecting parallel functional cascades in a network should be targeted simultaneously, but require the identification of such targets. We propose a methodology that exploits centrality statistics characterizing the importance of nodes within a gene network that is constructed from the gene expression patterns in that network. We consider centrality measures based on both graph theory and spectral graph theory. We also consider the origins of a network topology, and show how different available representations yield different node importance results. We apply our techniques to tumor gene expression data and suggest that the identification of optimal therapeutic targets involving particular genes, pathways and sub-networks based on an analysis of the nodes in that network is possible and can facilitate individualized cancer treatments. The proposed methods also have the potential to identify candidate cancer therapeutic targets that are not thought to be oncogenes but nonetheless play important roles in the functioning of a cancer-related network or pathway

    On Ranked Approximate Matching Of Large Attributed Graphs

    Get PDF
    Many emerging database applications entail sophisticated graph based query manipulation, predominantly evident in large-scale scientific applications. To access the information embedded in graphs, efficient graph matching tools and algorithms have become of prime importance. Although the prohibitively expensive time complexity associated with exact sub-graph isomorphism techniques has limited its efficacy in the application domain, approximate yet efficient graph matching techniques have received much attention due to their pragmatic applicability. Since public domain databases are noisy and incomplete in nature, inexact graph matching techniques have proven to be more promising in terms of inferring knowledge from numerous structural data repositories. Contemporary algorithms for approximate graph matching incur substantial cost to generate candidates, and then test and rank them for possible match. Leading algorithms balance processing time and overall resource consumption cost by leveraging sophisticated data structures and graph properties to improve overall performance. In this dissertation, we propose novel techniques for approximate graph matching based on two different techniques called TraM or Top-k Graph Matching and Approximate Network Matching or AtoM respectively. While TraM off-loads a significant amount of its processing on to the database making the approach viable for large graphs, AtoM provides improved turn around time by means of graph summarization prior to matching. The summarization process is aided by domain sensitive similarity matrices, which in turn helps improve the matching performance. The vector space embedding of the graphs and efficient filtration of the search space enables computation of approximate graph similarity at a throw-away cost. We combine domain similarity and topological similarity to obtain overall graph similarity and compare them with neighborhood biased segments of the data-graph for proper matches. We show that our approach can naturally support the emerging trend in graph pattern queries and discuss its suitability for large networks as it can be seamlessly transformed to adhere to map-reduce framework. We have conducted thorough experiments on several synthetic and real data sets, and have demonstrated the effectiveness and efficiency of the proposed method

    Biological Networks

    Get PDF
    Networks of coordinated interactions among biological entities govern a myriad of biological functions that span a wide range of both length and time scales—from ecosystems to individual cells and from years to milliseconds. For these networks, the concept “the whole is greater than the sum of its parts” applies as a norm rather than an exception. Meanwhile, continued advances in molecular biology and high-throughput technology have enabled a broad and systematic interrogation of whole-cell networks, allowing the investigation of biological processes and functions at unprecedented breadth and resolution—even down to the single-cell level. The explosion of biological data, especially molecular-level intracellular data, necessitates new paradigms for unraveling the complexity of biological networks and for understanding how biological functions emerge from such networks. These paradigms introduce new challenges related to the analysis of networks in which quantitative approaches such as machine learning and mathematical modeling play an indispensable role. The Special Issue on “Biological Networks” showcases advances in the development and application of in silico network modeling and analysis of biological systems

    A visual and curatorial approach to clinical variant prioritization and disease gene discovery in genome-wide diagnostics

    Get PDF
    Background: Genome-wide data are increasingly important in the clinical evaluation of human disease. However, the large number of variants observed in individual patients challenges the efficiency and accuracy of diagnostic review. Recent work has shown that systematic integration of clinical phenotype data with genotype information can improve diagnostic workflows and prioritization of filtered rare variants. We have developed visually interactive, analytically transparent analysis software that leverages existing disease catalogs, such as the Online Mendelian Inheritance in Man database (OMIM) and the Human Phenotype Ontology (HPO), to integrate patient phenotype and variant data into ranked diagnostic alternatives. Methods: Our tool, “OMIM Explorer” (http://www.omimexplorer.com), extends the biomedical application of semantic similarity methods beyond those reported in previous studies. The tool also provides a simple interface for translating free-text clinical notes into HPO terms, enabling clinical providers and geneticists to contribute phenotypes to the diagnostic process. The visual approach uses semantic similarity with multidimensional scaling to collapse high-dimensional phenotype and genotype data from an individual into a graphical format that contextualizes the patient within a low-dimensional disease map. The map proposes a differential diagnosis and algorithmically suggests potential alternatives for phenotype queries—in essence, generating a computationally assisted differential diagnosis informed by the individual’s personal genome. Visual interactivity allows the user to filter and update variant rankings by interacting with intermediate results. The tool also implements an adaptive approach for disease gene discovery based on patient phenotypes. Results: We retrospectively analyzed pilot cohort data from the Baylor Miraca Genetics Laboratory, demonstrating performance of the tool and workflow in the re-analysis of clinical exomes. Our tool assigned to clinically reported variants a median rank of 2, placing causal variants in the top 1 % of filtered candidates across the 47 cohort cases with reported molecular diagnoses of exome variants in OMIM Morbidmap genes. Our tool outperformed Phen-Gen, eXtasy, PhenIX, PHIVE, and hiPHIVE in the prioritization of these clinically reported variants. Conclusions: Our integrative paradigm can improve efficiency and, potentially, the quality of genomic medicine by more effectively utilizing available phenotype information, catalog data, and genomic knowledge

    Topological analysis of protein co-abundance networks identifies novel host targets important for HCV infection and pathogenesis

    Get PDF
    <p>Abstract</p> <p>Background</p> <p>High-throughput methods for obtaining global measurements of transcript and protein levels in biological samples has provided a large amount of data for identification of 'target' genes and proteins of interest. These targets may be mediators of functional processes involved in disease and therefore represent key points of control for viruses and bacterial pathogens. Genes and proteins that are the most highly differentially regulated are generally considered to be the most important. We present topological analysis of co-abundance networks as an alternative to differential regulation for confident identification of target proteins from two related global proteomics studies of hepatitis C virus (HCV) infection.</p> <p>Results</p> <p>We analyzed global proteomics data sets from a cell culture study of HCV infection and from a clinical study of liver biopsies from HCV-positive patients. Using lists of proteins known to be interaction partners with pathogen proteins we show that the most differentially regulated proteins in both data sets are indeed enriched in pathogen interactors. We then use these data sets to generate co-abundance networks that link proteins based on similar abundance patterns in time or across patients. Analysis of these co-abundance networks using a variety of network topology measures revealed that both degree and betweenness could be used to identify pathogen interactors with better accuracy than differential regulation alone, though betweenness provides the best discrimination. We found that though overall differential regulation was not correlated between the cell culture and liver biopsy data, network topology was conserved to an extent. Finally, we identified a set of proteins that has high betweenness topology in both networks including a protein that we have recently shown to be essential for HCV replication in cell culture.</p> <p>Conclusions</p> <p>The results presented show that the network topology of protein co-abundance networks can be used to identify proteins important for viral replication. These proteins represent targets for further experimental investigation that will provide biological insight and potentially could be exploited for novel therapeutic approaches to combat HCV infection.</p

    Elastic Network Models in Biology: From Protein Mode Spectra to Chromatin Dynamics

    Get PDF
    Biomacromolecules perform their functions by accessing conformations energetically favored by their structure-encoded equilibrium dynamics. Elastic network model (ENM) analysis has been widely used to decompose the equilibrium dynamics of a given molecule into a spectrum of modes of motions, which separates robust, global motions from local fluctuations. The scalability and flexibility of the ENMs permit us to efficiently analyze the spectral dynamics of large systems or perform comparative analysis for large datasets of structures. I showed in this thesis how ENMs can be adapted (1) to analyze protein superfamilies that share similar tertiary structures but may differ in their sequence and functional dynamics, and (2) to analyze chromatin dynamics using contact data from Hi-C experiments, and (3) to perform a comparative analysis of genome topology across different types of cell lines. The first study showed that protein family members share conserved, highly cooperative (global) modes of motion. A low-to-intermediate frequency spectral regime was shown to have a maximal impact on the functional differentiation of families into subfamilies. The second study demonstrated the Gaussian Network Model (GNM) can accurately model chromosomal mobility and couplings between genomic loci at multiple scales: it can quantify the spatial fluctuations in the positions of gene loci, detect large genomic compartments and smaller topologically-associating domains (TADs) that undergo en bloc movements, and identify dynamically coupled distal regions along the chromosomes. The third study revealed close similarities between chromosomal dynamics across different cell lines on a global scale, but notable cell-specific variations in the spatial fluctuations of genomic loci. It also called attention to the role of the intrinsic spatial dynamics of chromatin as a determinant of cell differentiation. Together, these studies provide a comprehensive view of the versatility and utility of the ENMs in analyzing spatial dynamics of biomolecules, from individual proteins to the entire chromatin
    corecore