56 research outputs found

    Gene network interconnectedness and the generalized topological overlap measure

    BACKGROUND: Network methods are increasingly used to represent the interactions of genes and/or proteins. Genes or proteins that are directly linked may have a similar biological function or may be part of the same biological pathway. Since the information on the connection (adjacency) between 2 nodes may be noisy or incomplete, it can be desirable to consider alternative measures of pairwise interconnectedness. Here we study a class of measures that are proportional to the number of neighbors that a pair of nodes share. For example, the topological overlap measure by Ravasz et al. [1] can be interpreted as a measure of agreement between the m = 1 step neighborhoods of 2 nodes. Several studies have shown that two proteins with a higher topological overlap are more likely to belong to the same functional class than proteins with a lower topological overlap. Here we address the question of whether a measure of topological overlap based on higher-order neighborhoods could give rise to a more robust and sensitive measure of interconnectedness. RESULTS: We generalize the topological overlap measure from m = 1 step neighborhoods to m ≥ 2 step neighborhoods. This allows us to define the m-th order generalized topological overlap measure (GTOM) by (i) counting the number of m-step neighbors that a pair of nodes share and (ii) normalizing it to take a value between 0 and 1. Using theoretical arguments, a yeast co-expression network application, and a fly protein network application, we illustrate the usefulness of the proposed measure for module detection and gene neighborhood analysis. CONCLUSION: Topological overlap can serve as an important filter to counter the effects of spurious or missing connections between network nodes. The m-th order topological overlap measure allows one to trade off sensitivity against specificity when defining pairwise interconnectedness and network modules.
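
The two steps named in the abstract above (count shared m-step neighbors, then normalize to the range 0 to 1) can be sketched for an unweighted network as follows. This is a minimal illustration, not the authors' implementation: the function names `m_step_neighbors` and `gtom`, the toy graph, and the exact normalization (the m = 1 topological overlap form extended to m-step neighborhoods) are assumptions, and the formula in the paper may differ in detail.

```python
from collections import deque


def m_step_neighbors(adj, node, m):
    """Return the nodes other than `node` reachable in at most m steps (BFS)."""
    seen = {node}
    frontier = deque([(node, 0)])
    while frontier:
        current, dist = frontier.popleft()
        if dist == m:
            continue  # do not expand beyond m steps
        for nxt in adj[current]:
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, dist + 1))
    seen.discard(node)
    return seen


def gtom(adj, i, j, m=1):
    """Assumed m-th order overlap: shared m-step neighbors, scaled into [0, 1]."""
    n_i = m_step_neighbors(adj, i, m)
    n_j = m_step_neighbors(adj, j, m)
    a_ij = 1 if j in adj[i] else 0          # direct adjacency between i and j
    shared = len((n_i & n_j) - {i, j})      # (i) count shared m-step neighbors
    denom = min(len(n_i), len(n_j)) + 1 - a_ij
    return (shared + a_ij) / denom          # (ii) normalize to [0, 1]


# Hypothetical toy network: a triangle A-B-C with a pendant node D attached to C.
adj = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C"},
}

print(gtom(adj, "A", "B", m=1))  # 1.0: linked, and they share their only other neighbor
print(gtom(adj, "A", "D", m=1))  # 0.5: not linked, one shared neighbor (C)
print(gtom(adj, "A", "D", m=2))  # 0.5: two shared 2-step neighbors (B and C)
```

With m = 1 the function reduces to the familiar form (shared neighbors + a_ij) / (min(k_i, k_j) + 1 − a_ij), where k is the node degree; larger m widens the neighborhoods being compared.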

    Geometric Interpretation of Gene Coexpression Network Analysis

    The merging of network theory and microarray data analysis techniques has spawned a new field: gene coexpression network analysis. While network methods are increasingly used in biology, the network vocabulary of computational biologists tends to be far more limited than that of, say, social network theorists. Here we review and propose several potentially useful network concepts. We take advantage of the relationship between network theory and the field of microarray data analysis to clarify the meaning of and the relationship among network concepts in gene coexpression networks. Network theory offers a wealth of intuitive concepts for describing the pairwise relationships among genes, which are depicted in cluster trees and heat maps. Conversely, microarray data analysis techniques (singular value decomposition, tests of differential expression) can also be used to address difficult problems in network theory. We describe conditions when a close relationship exists between network analysis and microarray data analysis techniques, and provide a rough dictionary for translating between the two fields. Using the angular interpretation of correlations, we provide a geometric interpretation of network theoretic concepts and derive unexpected relationships among them. We use the singular value decomposition of module expression data to characterize approximately factorizable gene coexpression networks, i.e., adjacency matrices that factor into node specific contributions. High and low level views of coexpression networks allow us to study the relationships among modules and among module genes, respectively. We characterize coexpression networks where hub genes are significant with respect to a microarray sample trait and show that the network concept of intramodular connectivity can be interpreted as a fuzzy measure of module membership. We illustrate our results using human, mouse, and yeast microarray gene expression data. The unification of coexpression network methods with traditional data mining methods can inform the application and development of systems biologic methods.

    Understanding network concepts in modules

    BACKGROUND: Network concepts are increasingly used in biology and genetics. For example, the clustering coefficient has been used to understand network architecture; the connectivity (also known as degree) has been used to screen for cancer targets; and the topological overlap matrix has been used to define modules and to annotate genes. Dozens of potentially useful network concepts are known from graph theory. RESULTS: Here we study network concepts in special types of networks, which we refer to as approximately factorizable networks. In these networks, the pairwise connection strength (adjacency) between 2 network nodes can be factored into node specific contributions, named node 'conformity'. The node conformity turns out to be highly related to the connectivity. To provide a formalism for relating network concepts to each other, we define three types of network concepts: fundamental, conformity-based, and approximate conformity-based concepts. Fundamental concepts include the standard definitions of connectivity, density, centralization, heterogeneity, clustering coefficient, and topological overlap. The approximate conformity-based analogs of fundamental network concepts have several theoretical advantages. First, they allow one to derive simple relationships between seemingly disparate network concepts. For example, we derive simple relationships between the clustering coefficient, the heterogeneity, the density, the centralization, and the topological overlap. The second advantage of approximate conformity-based network concepts is that they allow one to show that fundamental network concepts can be approximated by simple functions of the connectivity in module networks. CONCLUSION: Using protein-protein interaction, gene co-expression, and simulated data, we show that a) many networks comprised of module nodes are approximately factorizable and b) in these types of networks, simple relationships exist between seemingly disparate network concepts. Our results are implemented in freely available R software code, which can be downloaded from the following webpage: http://www.genetics.ucla.edu/labs/horvath/ModuleConformity/ModuleNetworks
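
The factorization idea in the abstract above can be made concrete with a small sketch: if the adjacency between two distinct nodes is exactly the product of their conformities, then each node's connectivity is a simple function of its own conformity. The code below is only an illustration under stated assumptions. The names `factorizable_adjacency` and `connectivity`, the example conformity values, and the choice to set the diagonal to zero so that connectivity sums over other nodes only are all hypothetical and are not taken from the authors' R code.

```python
def factorizable_adjacency(conformity):
    """Build an exactly factorizable adjacency matrix: a_ij = CF_i * CF_j for i != j."""
    n = len(conformity)
    return [
        [0.0 if i == j else conformity[i] * conformity[j] for j in range(n)]
        for i in range(n)
    ]


def connectivity(adjacency):
    """Connectivity (degree) of each node: the sum of its adjacencies to other nodes."""
    # The diagonal is zero by construction, so a plain row sum excludes the self term.
    return [sum(row) for row in adjacency]


# Hypothetical conformities for a four-node module.
cf = [0.9, 0.7, 0.5, 0.2]
adjacency = factorizable_adjacency(cf)
k = connectivity(adjacency)

# In an exactly factorizable network, k_i = CF_i * (sum of all CF - CF_i),
# which is why conformity and connectivity are closely related.
total = sum(cf)
for cf_i, k_i in zip(cf, k):
    print(round(k_i, 4), round(cf_i * (total - cf_i), 4))
```

In this toy case the node with the highest conformity also has the highest connectivity. Real module networks are only approximately factorizable, so the relationship would hold approximately rather than exactly.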

    Evaluating the Effects of SARS-CoV-2 Spike Mutation D614G on Transmissibility and Pathogenicity.

    Global dispersal and increasing frequency of the SARS-CoV-2 spike protein variant D614G are suggestive of a selective advantage but may also be due to a random founder effect. We investigate the hypothesis for positive selection of spike D614G in the United Kingdom using more than 25,000 whole genome SARS-CoV-2 sequences. Despite the availability of a large dataset, well represented by both spike 614 variants, not all approaches showed a conclusive signal of positive selection. Population genetic analysis indicates that 614G increases in frequency relative to 614D in a manner consistent with a selective advantage. We do not find any indication that patients infected with the spike 614G variant have higher COVID-19 mortality or clinical severity, but 614G is associated with higher viral load and younger age of patients. Significant differences in growth and size of 614G phylogenetic clusters indicate a need for continued study of this variant

    Geographical and temporal distribution of SARS-CoV-2 clades in the WHO European Region, January to June 2020

    We show the distribution of severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2) genetic clades over time and between countries and outline potential genomic surveillance objectives. We applied three genomic nomenclature systems to all sequence data from the World Health Organization European Region available until 10 July 2020. We highlight the importance of real-time sequencing and data dissemination in a pandemic situation, compare the nomenclatures and lay a foundation for future European genomic surveillance of SARS-CoV-2

    Risk Factors for Graft-versus-Host Disease in Haploidentical Hematopoietic Cell Transplantation Using Post-Transplant Cyclophosphamide

    Post-transplant cyclophosphamide (PTCy) has significantly increased the successful use of haploidentical donors with a relatively low incidence of graft-versus-host disease (GVHD). Given its increasing use, we sought to determine risk factors for GVHD after haploidentical hematopoietic cell transplantation (haplo-HCT) using PTCy. Data from the Center for International Blood and Marrow Transplant Research on adult patients with acute myeloid leukemia, acute lymphoblastic leukemia, myelodysplastic syndrome, or chronic myeloid leukemia who underwent PTCy-based haplo-HCT (2013 to 2016) were analyzed and categorized into 4 groups based on myeloablative (MA) or reduced-intensity conditioning (RIC) and bone marrow (BM) or peripheral blood (PB) graft source. In total, 646 patients were identified (MA-BM = 79, MA-PB = 183, RIC-BM = 192, RIC-PB = 192). The incidence of grade 2 to 4 acute GVHD at 6 months was highest in MA-PB (44%), followed by RIC-PB (36%), MA-BM (36%), and RIC-BM (30%) (P = .002). The incidence of chronic GVHD at 1 year was 40%, 34%, 24%, and 20%, respectively (P < .001). In multivariable analysis, there was no impact of stem cell source or conditioning regimen on grade 2 to 4 acute GVHD; however, older donor age (30 to 49 versus <29 years) was significantly associated with higher rates of grade 2 to 4 acute GVHD (hazard ratio [HR], 1.53; 95% confidence interval [CI], 1.11 to 2.12; P = .01). In contrast, PB compared to BM as a stem cell source was a significant risk factor for the development of chronic GVHD (HR, 1.70; 95% CI, 1.11 to 2.62; P = .01) in the RIC setting. There were no differences in relapse or overall survival between groups. Donor age and graft source are risk factors for acute and chronic GVHD, respectively, after PTCy-based haplo-HCT. Our results indicate that in RIC haplo-HCT, the risk of chronic GVHD is higher with PB stem cells, without any difference in relapse or overall survival

    Global urban environmental change drives adaptation in white clover

    Urbanization transforms environments in ways that alter biological evolution. We examined whether urban environmental change drives parallel evolution by sampling 110,019 white clover plants from 6169 populations in 160 cities globally. Plants were assayed for a Mendelian antiherbivore defense that also affects tolerance to abiotic stressors. Urban-rural gradients were associated with the evolution of clines in defense in 47% of cities throughout the world. Variation in the strength of clines was explained by environmental changes in drought stress and vegetation cover that varied among cities. Sequencing 2074 genomes from 26 cities revealed that the evolution of urban-rural clines was best explained by adaptive evolution, but the degree of parallel adaptation varied among cities. Our results demonstrate that urbanization leads to adaptation at a global scale

    The Somatic Genomic Landscape of Glioblastoma

    We describe the landscape of somatic genomic alterations based on multi-dimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer

    Genomic epidemiology of SARS-CoV-2 in a UK university identifies dynamics of transmission

    Understanding SARS-CoV-2 transmission in higher education settings is important to limit spread between students, and into at-risk populations. In this study, we sequenced 482 SARS-CoV-2 isolates from the University of Cambridge from 5 October to 6 December 2020. We perform a detailed phylogenetic comparison with 972 isolates from the surrounding community, complemented with epidemiological and contact tracing data, to determine transmission dynamics. We observe limited viral introductions into the university; the majority of student cases were linked to a single genetic cluster, likely following social gatherings at a venue outside the university. We identify considerable onward transmission associated with student accommodation and courses; this was effectively contained using local infection control measures and following a national lockdown. Transmission clusters were largely segregated within the university or the community. Our study highlights key determinants of SARS-CoV-2 transmission and effective interventions in a higher education setting that will inform public health policy during pandemics.
