8 research outputs found

    Entropy-scaling search of massive biological data

    Many datasets exhibit a well-defined structure that can be exploited to design faster search tools, but it is not always clear when such acceleration is possible. Here, we introduce a framework for similarity search based on characterizing a dataset's entropy and fractal dimension. We prove that searching scales in time with metric entropy (number of covering hyperspheres), if the fractal dimension of the dataset is low, and scales in space with the sum of metric entropy and information-theoretic entropy (randomness of the data). Using these ideas, we present accelerated versions of standard tools, with no loss in specificity and little loss in sensitivity, for use in three domains: high-throughput drug screening (Ammolite, 150x speedup), metagenomics (MICA, 3.5x speedup of DIAMOND [3,700x BLASTX]), and protein structure search (esFragBag, 10x speedup of FragBag). Our framework can be used to achieve "compressive omics," and the general theory can be readily applied to data science problems outside of biology. Comment: Including supplement: 41 pages, 6 figures, 4 tables, 1 bo

    Superconductors for fusion: A roadmap

    With the first tokamak designed for full nuclear operation now well into final assembly (ITER), and a major new research tokamak starting commissioning (JT60SA), nuclear fusion is becoming a mainstream potential energy source for the future. A critical part of the viability of magnetic confinement for fusion is superconductor technology. The experience gained and lessons learned in the application of this technology to ITER and JT60SA, together with new and improved superconducting materials, are opening multiple routes to commercial fusion reactors. The objective of this roadmap is, through a series of short articles, to outline some of these routes and the materials/technologies that go with them.

    Immune-Array Analysis in Sporadic Inclusion Body Myositis Reveals HLA-DRB1 Amino Acid Heterogeneity Across the Myositis Spectrum

    OBJECTIVE: Inclusion body myositis (IBM) is characterized by a combination of inflammatory and degenerative changes affecting muscle. While the primary cause of IBM is unknown, genetic factors may influence disease susceptibility. To determine genetic factors contributing to the etiology of IBM, we conducted the largest genetic association study of the disease to date, investigating immune-related genes using the Immunochip. METHODS: A total of 252 Caucasian patients with IBM were recruited from 11 countries through the Myositis Genetics Consortium and compared with 1,008 ethnically matched controls. Classic HLA alleles and amino acids were imputed using SNP2HLA. RESULTS: The HLA region was confirmed as the most strongly associated region in IBM (P = 3.58 × 10^-33). HLA imputation identified 3 independent associations (with HLA–DRB1*03:01, DRB1*01:01, and DRB1*13:01), although the strongest association was with amino acid positions 26 and 11 of the HLA–DRB1 molecule. No association with anti–cytosolic 5′-nucleotidase 1A–positive status was found independent of HLA–DRB1*03:01. There was no association of HLA genotypes with age at onset of IBM. Three non-HLA regions reached suggestive significance, including the chromosome 3 p21.31 region, an established risk locus for autoimmune disease, where a frameshift mutation in CCR5 is thought to be the causal variant. CONCLUSION: This is the largest, most comprehensive genetic association study to date in IBM. The data confirm that HLA is the most strongly associated region and identify novel amino acid associations that may explain the risk in this locus. These amino acid associations differentiate IBM from polymyositis and dermatomyositis and may determine properties of the peptide-binding groove, allowing it to preferentially bind autoantigenic peptides. A novel suggestive association within the chromosome 3 p21.31 region suggests a role for CCR5.

    Dense genotyping of immune-related loci in idiopathic inflammatory myopathies confirms HLA alleles as the strongest genetic risk factor and suggests different genetic background for major clinical subgroups

    No full text
    OBJECTIVES: The idiopathic inflammatory myopathies (IIMs) are a heterogeneous group of rare autoimmune diseases characterised by muscle weakness and extramuscular manifestations such as skin rashes and interstitial lung disease. We genotyped 2566 IIM cases of Caucasian descent using the Immunochip, a custom array covering 186 established autoimmune susceptibility loci. The cohort was predominantly comprised of patients with dermatomyositis (DM, n=879), juvenile DM (JDM, n=481), polymyositis (PM, n=931) and inclusion body myositis (n=252) collected from 14 countries through the Myositis Genetics Consortium. RESULTS: The human leucocyte antigen (HLA) and PTPN22 regions reached genome-wide significance (p<5×10^-8). Nine regions were associated at a significance level of p<2.25×10^-5, including UBE2L3, CD28 and TRAF6, with evidence of independent effects within STAT4. Analysis of clinical subgroups revealed distinct differences between PM, and DM and JDM. PTPN22 was associated at genome-wide significance with PM, but not DM and JDM, suggesting this effect is driven by PM. Additional suggestive associations including IL18R1 and RGS1 in PM and GSDMB in DM were identified. HLA imputation confirmed that alleles HLA-DRB1*03:01 and HLA-B*08:01 of the 8.1 ancestral haplotype (8.1AH) are most strongly associated with IIM, and provides evidence that amino acids within the HLA, such as HLA-DQB1 position 57 in DM, may explain part of the risk in this locus. Associations with alleles outside the 8.1AH reveal differences between PM, DM and JDM. CONCLUSIONS: This work represents the largest IIM genetic study to date, reveals new insights into the genetic architecture of these rare diseases and suggests different predominating pathophysiology in different clinical subgroups.

    Systematic protein-protein interaction and pathway analyses in the idiopathic inflammatory myopathies

    No full text
    Background: The idiopathic inflammatory myopathies (IIM) are autoimmune diseases characterised by acquired proximal muscle weakness, inflammatory cell infiltrates in muscle and myositis-specific/associated autoantibodies. It is unclear which pathways are involved in IIM, and the functional relationship between autoantibody targets has not been systematically explored. Protein-protein interaction and pathway analyses were conducted to identify pathways relevant to disease, using autoantibody targets and gene products of IIM-associated single nucleotide polymorphism (SNP) loci. Methods: Protein-protein interactions were analysed using Disease Association Protein-Protein Link Evaluator (DAPPLE). Gene ontology and pathway analyses were conducted using Database for Annotation Visualisation and Integrated Discovery (DAVID) and Gene Relationships Across Implicated Loci (GRAIL). Analyses were undertaken including the targets of published autoantibodies, significant and suggestive SNPs from an IIM association study and autoantibody targets plus SNPs combined. Results: The protein-protein interaction networks formed by autoantibody targets and associated SNPs showed significant direct and/or indirect connectivity (p < 0.05). Autoantibody targets plus associated SNPs combined resulted in more significant indirect and common interactor connectivity, suggesting autoantibody targets and proteins encoded by IIM-associated loci may be involved in common pathways. Tumour necrosis factor receptor-associated factor 6 (TRAF6) was identified as a hub protein, and UBE3B, HSPA1A, HSPA1B and PSMD3 also were identified as genes with significant connectivity. Pathway analysis identified that autoantibody targets and associated SNP regions are significantly interconnected (p < 0.01), and confirmed autoantibody target involvement in translational and post-translational processes. 'Ubiquitin' was the only keyword strongly linking significant genes across regions in all three GRAIL analyses of autoantibody targets and IIM-associated SNPs. Conclusions: Autoantibody targets and IIM-associated loci show significant connectivity and inter-relatedness, and identify several key genes and pathways in IIM pathogenesis, possibly mediated via the ubiquitination pathway.

    The reference site of the European Innovation Partnership on Active and Healthy Ageing, MACVIA-LR (fighting chronic diseases for healthy ageing in Languedoc-Roussillon)

    No full text

    The reference site of the European Innovation Partnership on Active and Healthy Ageing, MACVIA-LR (fighting chronic diseases for healthy ageing in Languedoc-Roussillon)

    No full text

    A global metagenomic map of urban microbiomes and antimicrobial resistance

    No full text
    We present a global atlas of 4,728 metagenomic samples from mass-transit systems in 60 cities over 3 years, representing the first systematic, worldwide catalog of the urban microbial ecosystem. This atlas provides an annotated, geospatial profile of microbial strains, functional characteristics, antimicrobial resistance (AMR) markers, and genetic elements, including 10,928 viruses, 1,302 bacteria, 2 archaea, and 838,532 CRISPR arrays not found in reference databases. We identified 4,246 known species of urban microorganisms and a consistent set of 31 species found in 97% of samples that were distinct from human commensal organisms. Profiles of AMR genes varied widely in type and density across cities. Cities showed distinct microbial taxonomic signatures that were driven by climate and geographic differences. These results constitute a high-resolution global metagenomic atlas that enables discovery of organisms and genes, highlights potential public health and forensic applications, and provides a culture-independent view of AMR burden in cities. Funding: the Tri-I Program in Computational Biology and Medicine (CBM) funded by NIH grant 1T32GM083937; GitHub; Philip Blood and the Extreme Science and Engineering Discovery Environment (XSEDE), supported by NSF grant number ACI-1548562 and NSF award number ACI-1445606; NASA (NNX14AH50G, NNX17AB26G), the NIH (R01AI151059, R25EB020393, R21AI129851, R35GM138152, U01DA053941); STARR Foundation (I13-0052); LLS (MCL7001-18, LLS 9238-16, LLS-MCL7001-18); the NSF (1840275); the Bill and Melinda Gates Foundation (OPP1151054); the Alfred P. Sloan Foundation (G-2015-13964); Swiss National Science Foundation grant number 407540_167331; NIH award number UL1TR000457; the US Department of Energy Joint Genome Institute under contract number DE-AC02-05CH11231; the National Energy Research Scientific Computing Center, supported by the Office of Science of the US Department of Energy; Stockholm Health Authority grant SLL 20160933; the Institut Pasteur Korea; an NRF Korea grant (NRF-2014K1A4A7A01074645, 2017M3A9G6068246); the CONICYT Fondecyt Iniciación grants 11140666 and 11160905; Keio University Funds for Individual Research; funds from the Yamagata prefectural government and the city of Tsuruoka; JSPS KAKENHI grant number 20K10436; the bilateral AT-UA collaboration fund (WTZ:UA 02/2019; Ministry of Education and Science of Ukraine, UA:M/84-2019, M/126-2020); Kyiv Academic University; Ministry of Education and Science of Ukraine project numbers 0118U100290 and 0120U101734; Centro de Excelencia Severo Ochoa 2013–2017; the CERCA Programme / Generalitat de Catalunya; the CRG-Novartis-Africa mobility program 2016; research funds from National Cheng Kung University and the Ministry of Science and Technology, Taiwan (MOST grant number 106-2321-B-006-016); we thank all the volunteers who made sampling NYC possible, Minciencias (project no. 639677758300), CNPq (EDN - 309973/2015-5), the Open Research Fund of Key Laboratory of Advanced Theory and Application in Statistics and Data Science – MOE, ECNU, the Research Grants Council of Hong Kong through project 11215017, National Key RD Project of China (2018YFE0201603), and Shanghai Municipal Science and Technology Major Project (2017SHZDZX01) (L.S.