432 research outputs found

    Optimizing Illumina next-generation sequencing library preparation for extremely AT-biased genomes.

    Get PDF
    BAckground: Massively parallel sequencing technology is revolutionizing approaches to genomic and genetic research. Since its advent, the scale and efficiency of Next-Generation Sequencing (NGS) has rapidly improved. In spite of this success, sequencing genomes or genomic regions with extremely biased base composition is still a great challenge to the currently available NGS platforms. The genomes of some important pathogenic organisms like Plasmodium falciparum (high AT content) and Mycobacterium tuberculosis (high GC content) display extremes of base composition. The standard library preparation procedures that employ PCR amplification have been shown to cause uneven read coverage particularly across AT and GC rich regions, leading to problems in genome assembly and variation analyses. Alternative library-preparation approaches that omit PCR amplification require large quantities of starting material and hence are not suitable for small amounts of DNA/RNA such as those from clinical isolates. We have developed and optimized library-preparation procedures suitable for low quantity starting material and tolerant to extremely high AT content sequences. Results: We have used our optimized conditions in parallel with standard methods to prepare Illumina sequencing libraries from a non-clinical and a clinical isolate (containing ~53% host contamination). By analyzing and comparing the quality of sequence data generated, we show that our optimized conditions that involve a PCR additive (TMAC), produces amplified libraries with improved coverage of extremely AT-rich regions and reduced bias toward GC neutral templates. Conclusion: We have developed a robust and optimized Next-Generation Sequencing library amplification method suitable for extremely AT-rich genomes. The new amplification conditions significantly reduce bias and retain the complexity of either extremes of base composition. This development will greatly benefit sequencing clinical samples that often require amplification due to low mass of DNA starting material

    Ethical Data Release in Genome-Wide Association Studies in Developing Countries

    Get PDF
    Michael Parker and colleagues discuss the ethical issues associated with data release from genome-wide association studies in developing countries

    Efficient depletion of host DNA contamination in malaria clinical sequencing.

    Get PDF
    The cost of whole-genome sequencing (WGS) is decreasing rapidly as next-generation sequencing technology continues to advance, and the prospect of making WGS available for public health applications is becoming a reality. So far, a number of studies have demonstrated the use of WGS as an epidemiological tool for typing and controlling outbreaks of microbial pathogens. Success of these applications is hugely dependent on efficient generation of clean genetic material that is free from host DNA contamination for rapid preparation of sequencing libraries. The presence of large amounts of host DNA severely affects the efficiency of characterizing pathogens using WGS and is therefore a serious impediment to clinical and epidemiological sequencing for health care and public health applications. We have developed a simple enzymatic treatment method that takes advantage of the methylation of human DNA to selectively deplete host contamination from clinical samples prior to sequencing. Using malaria clinical samples with over 80% human host DNA contamination, we show that the enzymatic treatment enriches Plasmodium falciparum DNA up to ∼9-fold and generates high-quality, nonbiased sequence reads covering >98% of 86,158 catalogued typeable single-nucleotide polymorphism loci

    Extreme mutation bias and high AT content in Plasmodium falciparum.

    Get PDF
    For reasons that remain unknown, the Plasmodium falciparum genome has an exceptionally high AT content compared to other Plasmodium species and eukaryotes in general - nearly 80% in coding regions and approaching 90% in non-coding regions. Here, we examine how this phenomenon relates to genome-wide patterns of de novo mutation. Mutation accumulation experiments were performed by sequential cloning of six P. falciparum isolates growing in human erythrocytes in vitro for 4 years, with 279 clones sampled for whole genome sequencing at different time points. Genome sequence analysis of these samples revealed a significant excess of G:C to A:T transitions compared to other types of nucleotide substitution, which would naturally cause AT content to equilibrate close to the level seen across the P. falciparum reference genome (80.6% AT). These data also uncover an extremely high rate of small indel mutation relative to other species, primarily associated with repetitive AT-rich sequences, in addition to larger-scale structural rearrangements focused in antigen-coding var genes. In conclusion, high AT content in P. falciparum is driven by a systematic mutational bias and ultimately leads to an unusual level of microstructural plasticity, raising the question of whether this contributes to adaptive evolution

    Evolutionary analysis of the most polymorphic gene family in falciparum malaria

    Get PDF
    The var gene family of the human malaria parasite Plasmodium falciparum encode proteins that are crucial determinants of both pathogenesis and immune evasion and are highly polymorphic. Here we have assembled nearly complete var gene repertoires from 2398 field isolates and analysed a normalised set of 714 from across 12 countries. This therefore represents the first large scale attempt to catalogue the worldwide distribution of var gene sequences We confirm the extreme polymorphism of this gene family but also demonstrate an unexpected level of sequence sharing both within and between continents. We show that this is likely due to both the remnants of selective sweeps as well as a worrying degree of recent gene flow across continents with implications for the spread of drug resistance. We also address the evolution of the var repertoire with respect to the ancestral genes within the Laverania and show that diversity generated by recombination is concentrated in a number of hotspots. An analysis of the subdomain structure indicates that some existing definitions may need to be revised From the analysis of this data, we can now understand the way in which the family has evolved and how the diversity is continuously being generated. Finally, we demonstrate that because the genes are distributed across the genome, sequence sharing between genotypes acts as a useful population genetic marker

    Classical sickle beta-globin haplotypes exhibit a high degree of long-range haplotype similarity in African and Afro-Caribbean populations

    Get PDF
    Background: The sickle (βs) mutation in the beta-globin gene (HBB) occurs on five "classical" βs haplotype backgrounds in ethnic groups of African ancestry. Strong selection in favour of the βs allele - a consequence of protection from severe malarial infection afforded by heterozygotes - has been associated with a high degree of extended haplotype similarity. The relationship between classical βs haplotypes and long-range haplotype similarity may have both anthropological and clinical implications, but to date has not been explored. Here we evaluate the haplotype similarity of classical βs haplotypes over 400 kb in population samples from Jamaica, The Gambia, and among the Yoruba of Nigeria (Hapmap YRI). Results: The most common βs sub-haplotype among Jamaicans and the Yoruba was the Benin haplotype, while in The Gambia the Senegal haplotype was observed most commonly. Both subtypes exhibited a high degree of long-range haplotype similarity extending across approximately 400 kb in all three populations. This long-range similarity was significantly greater than that seen for other haplotypes sampled in these populations (P < 0.001), and was independent of marker choice and marker density. Among the Yoruba, Benin haplotypes were highly conserved, with very strong linkage disequilibrium (LD) extending a megabase across the βs mutation. Conclusion: Two different classical βs haplotypes, sampled from different populations, exhibit comparable and extensive long-range haplotype similarity and strong LD. This LD extends across the adjacent recombination hotspot, and is discernable at distances in excess if 400 kb. Although the multicentric geographic distribution of βs haplotypes indicates strong subdivision among early Holocene sub-Saharan populations, we find no evidence that selective pressures imposed by falciparum malaria varied in intensity or timing between these subpopulations. Our observations also suggest that cis-acting loci, which may influence outcomes in sickle cell disease, could lie considerable distances away from β-globin

    Further evidence supporting a role for gs signal transduction in severe malaria pathogenesis.

    Get PDF
    With the functional demonstration of a role in erythrocyte invasion by Plasmodium falciparum parasites, implications in the aetiology of common conditions that prevail in individuals of African origin, and a wealth of pharmacological knowledge, the stimulatory G protein (Gs) signal transduction pathway presents an exciting target for anti-malarial drug intervention. Having previously demonstrated a role for the G-alpha-s gene, GNAS, in severe malaria disease, we sought to identify other important components of the Gs pathway. Using meta-analysis across case-control and family trio (affected child and parental controls) studies of severe malaria from The Gambia and Malawi, we sought evidence of association in six Gs pathway candidate genes: adenosine receptor 2A (ADORA2A) and 2B (ADORA2B), beta-adrenergic receptor kinase 1 (ADRBK1), adenylyl cyclase 9 (ADCY9), G protein beta subunit 3 (GNB3), and regulator of G protein signalling 2 (RGS2). Our study amassed a total of 2278 cases and 2364 controls. Allele-based models of association were investigated in all genes, and genotype and haplotype-based models were investigated where significant allelic associations were identified. Although no significant associations were observed in the other genes, several were identified in ADORA2A. The most significant association was observed at the rs9624472 locus, where the G allele (approximately 20% frequency) appeared to confer enhanced risk to severe malaria [OR = 1.22 (1.09-1.37); P = 0.001]. Further investigation of the ADORA2A gene region is required to validate the associations identified here, and to identify and functionally characterize the responsible causal variant(s). Our results provide further evidence supporting a role of the Gs signal transduction pathway in the regulation of severe malaria, and request further exploration of this pathway in future studies

    The ethics of sustainable genomic research in Africa

    Get PDF
    • …
    corecore