    MinION Analysis and Reference Consortium: Phase 1 data release and analysis

    The advent of a miniaturized DNA sequencing device with a high-throughput contextual sequencing capability embodies the next generation of large-scale sequencing tools. The MinION™ Access Programme (MAP) was initiated by Oxford Nanopore Technologies™ in April 2014, giving public access to their USB-attached miniature sequencing device. The MinION Analysis and Reference Consortium (MARC) was formed by a subset of MAP participants, with the aim of evaluating and providing standard protocols and reference data to the community. Envisaged as a multi-phased project, this study provides the global community with the Phase 1 data from MARC, where the reproducibility of the performance of the MinION was evaluated at multiple sites. Five laboratories on two continents generated data using a control strain of Escherichia coli K-12, preparing and sequencing samples according to a revised ONT protocol. Here, we provide the details of the protocol used, along with a preliminary analysis of the characteristics of typical runs, including the consistency, rate, volume and quality of data produced. Further analysis of the Phase 1 data presented here and additional Phase 2 experiments on E. coli from MARC are already underway to identify ways to improve and enhance MinION performance.

    Focused HLA analysis in Caucasians with myositis identifies significant associations with autoantibody subgroups

    Objectives: Idiopathic inflammatory myopathies (IIM) are a spectrum of rare autoimmune diseases characterised clinically by muscle weakness and heterogeneous systemic organ involvement. The strongest genetic risk is within the major histocompatibility complex (MHC). Since autoantibody presence defines specific clinical subgroups of IIM, we aimed to correlate serotype and genotype, to identify novel risk variants in the MHC region that co-occur with IIM autoantibodies. Methods: We collected available autoantibody data in our cohort of 2582 Caucasian patients with IIM. High resolution human leucocyte antigen (HLA) alleles and corresponding amino acid sequences were imputed using SNP2HLA from existing genotyping data and tested for association with 12 autoantibody subgroups. Results: We report associations with eight autoantibodies reaching our study-wide significance level of p<2.9×10⁻⁵. Associations with the 8.1 ancestral haplotype were found with anti-Jo-1 (HLA-B*08:01, p=2.28×10⁻⁵³ and HLA-DRB1*03:01, p=3.25×10⁻⁹), anti-PM/Scl (HLA-DQB1*02:01, p=1.47×10⁻²⁶) and anti-cN1A autoantibodies (HLA-DRB1*03:01, p=1.40×10⁻¹¹). Associations independent of this haplotype were found with anti-Mi-2 (HLA-DRB1*07:01, p=4.92×10⁻¹³) and anti-HMGCR autoantibodies (HLA-DRB1*11, p=5.09×10⁻⁶). Amino acid positions may be more strongly associated than classical HLA associations; for example with anti-Jo-1 autoantibodies and position 74 of HLA-DRB1 (p=3.47×10⁻⁶⁴) and position 9 of HLA-B (p=7.03×10⁻¹¹). We report novel genetic associations of HLA-DQB1 with anti-TIF1 autoantibodies and identify haplotypes that may differ between adult-onset and juvenile-onset patients with these autoantibodies. Conclusions: These findings provide new insights regarding the functional consequences of genetic polymorphisms within the MHC. As autoantibodies in IIM correlate with specific clinical features of disease, understanding genetic risk underlying development of autoantibody profiles has implications for future research.

    Genome Sizes and the Benford Distribution

    BACKGROUND: Data on the number of Open Reading Frames (ORFs) coded by genomes from the three domains of life show the presence of some notable general features. These include essential differences between the Prokaryotes and Eukaryotes, with the number of ORFs growing linearly with total genome size for the former, but only logarithmically for the latter. RESULTS: Simply by assuming that the (protein) coding and non-coding fractions of the genome must have different dynamics and that the non-coding fraction must be particularly versatile and therefore be controlled by a variety of (unspecified) probability distribution functions (PDFs), we are able to predict that the number of ORFs for Eukaryotes follows a Benford distribution and must therefore have a specific logarithmic form. Using the data for the 1000+ genomes available to us in early 2010, we find that the Benford distribution provides excellent fits to the data over several orders of magnitude. CONCLUSIONS: In its linear regime the Benford distribution produces excellent fits to the Prokaryote data, while the full non-linear form of the distribution similarly provides an excellent fit to the Eukaryote data. Furthermore, in their region of overlap the salient features are statistically congruent. This allows us to interpret the difference between Prokaryotes and Eukaryotes as the manifestation of the increased demand in the biological functions required for the larger Eukaryotes, to estimate some minimal genome sizes, and to predict a maximal Prokaryote genome size on the order of 8-12 megabasepairs. These results naturally allow a mathematical interpretation in terms of maximal entropy and, therefore, most efficient information transmission.
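
    The abstract above does not spell out the logarithmic form it derives. As background only, the sketch below computes the classic first-digit Benford law, P(d) = log10(1 + 1/d), which is the best-known instance of a Benford distribution. The function name is hypothetical, and the paper's own parametrization in terms of genome size may differ from this textbook form.

```python
import math


def benford_first_digit_probs():
    """Return {d: P(d)} for leading digits d = 1..9 under the classic
    first-digit Benford law, P(d) = log10(1 + 1/d).

    Assumption: this is the textbook form, shown for illustration; it is
    not taken from the paper summarised above.
    """
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


probs = benford_first_digit_probs()

# Print the probability of each leading digit to three decimal places.
for digit, p in probs.items():
    print(f"{digit}: {p:.3f}")
```

    The nine probabilities sum to one because the terms (d+1)/d multiply out to 10 across d = 1..9, so the logarithms add up to log10(10).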

    Identification of Novel Associations and Localization of Signals in Idiopathic Inflammatory Myopathies Using Genome-Wide Imputation

    OBJECTIVES: The idiopathic inflammatory myopathies (IIM) are heterogeneous diseases, thought to be initiated by immune activation in genetically predisposed individuals. In this study we imputed variants from the Immunochip array using a large reference panel to fine-map associations and identify novel associations in IIM. METHODS: We analysed 2,565 Caucasian IIM samples collected through the Myositis Genetics Consortium (MYOGEN) and 10,260 ethnically matched controls. We imputed 1,648,116 variants from the Immunochip array using the Haplotype Reference Consortium panel and conducted association analysis on IIM, and clinical and serological subgroups. RESULTS: The human leukocyte antigen (HLA) locus was consistently the most significantly associated region. Four non-HLA regions reached genome-wide significance: three in the whole IIM cohort (SDK2 and LINC00924, both novel, and STAT4, with evidence of independent variants in STAT4) and one, NAB1, in the polymyositis (PM) subgroup. We also found suggestive evidence of association with loci previously associated with other autoimmune rheumatic diseases (TEC and LTBR). We identified more significant associations than those previously reported in IIM, for STAT4 and DGKQ in the total cohort, for NAB1 and FAM167A-BLK loci in PM, and CCR5 in inclusion body myositis. We found enrichment of variants among DNase I hypersensitivity sites and histone marks associated with active transcription within blood cells. CONCLUSIONS: We report novel and strong associations in IIM and PM, and localise signals to single genes and immune cell types.

    Systematic Single-Cell Analysis of Pichia pastoris Reveals Secretory Capacity Limits Productivity

    Biopharmaceuticals represent the fastest growing sector of the global pharmaceutical industry. Cost-efficient production of these biologic drugs requires a robust host organism for generating high titers of protein during fermentation. Understanding key cellular processes that limit protein production and secretion is, therefore, essential for rational strain engineering. Here, with single-cell resolution, we systematically analysed the productivity of a series of Pichia pastoris strains that produce different proteins both constitutively and inducibly. We characterized each strain by qPCR, RT-qPCR, microengraving, and imaging cytometry. We then developed a simple mathematical model describing the flux of folded protein through the ER. This combination of single-cell measurements and computational modelling shows that protein trafficking through the secretory machinery is often the rate-limiting step in single-cell production, and that strategies to enhance the overall capacity of protein secretion within hosts for the production of heterologous proteins may improve productivity.

    Movements and Population Structure of Humpback Whales in the North Pacific

    Despite the extensive use of photographic identification methods to investigate humpback whales in the North Pacific, few quantitative analyses have been conducted. We report on a comprehensive analysis of interchange in the North Pacific among three wintering regions (Mexico, Hawaii, and Japan), each with two to three subareas, and feeding areas that extended from southern California to the Aleutian Islands. Of the 6,413 identification photographs of humpback whales obtained by 16 independent research groups between 1990 and 1993 and examined for this study, 3,650 photographs were determined to be of suitable quality. A total of 1,241 matches was found by two independent matching teams, identifying 2,712 unique whales in the sample (seen one to five times). Site fidelity was greatest at feeding areas, where there was a high rate of resightings in the same area in different years and a low rate of interchange among different areas. Migrations between winter regions and feeding areas did not follow a simple pattern, although highest match rates were found for whales that moved between Hawaii and southeastern Alaska, and between mainland and Baja Mexico and California. Interchange among subareas of the three primary wintering regions was extensive for Hawaii, variable (depending on subareas) for Mexico, and low for Japan, and reflected the relative distances among subareas. Interchange among these primary wintering regions was rare. This study provides the first quantitative assessment of the migratory structure of humpback whales in the entire North Pacific basin.