
    Plasmodium knowlesi Genome Sequences from Clinical Isolates Reveal Extensive Genomic Dimorphism.

    Plasmodium knowlesi is a newly described zoonosis that causes malaria in humans, which can be severe and fatal. The study of P. knowlesi parasites from human clinical isolates is relatively new and, in order to obtain maximum information from patient sample collections, we explored the possibility of generating P. knowlesi genome sequences from archived clinical isolates. Our patient sample collection consisted of frozen whole blood samples that contained excessive human DNA contamination and, in that form, were not suitable for parasite genome sequencing. We developed a method to reduce the amount of human DNA in the thawed blood samples in preparation for high throughput parasite genome sequencing using Illumina HiSeq and MiSeq sequencing platforms. Seven of fifteen samples processed had sufficiently pure P. knowlesi DNA for whole genome sequencing. The reads were mapped to the P. knowlesi H strain reference genome and an average mapping of 90% was obtained. Genes with low coverage were removed, leaving 4623 genes for subsequent analyses. Previously we identified a DNA sequence dimorphism on a small fragment of the P. knowlesi normocyte binding protein xa gene on chromosome 14. We used the genome data to assemble full-length Pknbpxa sequences and discovered that the dimorphism extended along the gene. An in-house algorithm was developed to detect SNP sites co-associating with the dimorphism. More than half of the P. knowlesi genome was dimorphic, involving genes on all chromosomes and suggesting that two distinct types of P. knowlesi infect the human population in Sarawak, Malaysian Borneo. We use P. knowlesi clinical samples to demonstrate that Plasmodium DNA from archived patient samples can produce high quality genome data. We show that analyses of even small numbers of difficult clinical malaria isolates can generate comprehensive genomic information that will improve our understanding of malaria parasite diversity and pathobiology.

    Exploring genome wide bisulfite sequencing for DNA methylation analysis in livestock: a technical assessment

    Recent advances made in “omics” technologies are contributing to a revolution in livestock selection and breeding practices. Epigenetic mechanisms, including DNA methylation, are important determinants for the control of gene expression in mammals. DNA methylation research will help our understanding of how environmental factors contribute to phenotypic variation of complex production and health traits. High-throughput sequencing is a vital tool for the comprehensive analysis of DNA methylation, and bisulfite-based strategies coupled with DNA sequencing allow for quantitative, site-specific methylation analysis at the genome level or genome wide. Reduced representation bisulfite sequencing (RRBS) and more recently whole genome bisulfite sequencing (WGBS) have proven to be effective techniques for studying DNA methylation in both humans and mice. Here we report the development of RRBS and WGBS for use in sheep, the first application of this technology in livestock species. Important technical issues associated with these methodologies, including fragment size selection and sequence depth, are examined and discussed. AgResearch AR&C grant for funding and Teagasc for providing a short-term overseas training award.

    Estimating absolute methylation levels at single-CpG resolution from methylation enrichment and restriction enzyme sequencing methods

    Recent advancements in sequencing-based DNA methylation profiling methods provide an unprecedented opportunity to map complete DNA methylomes. These include whole-genome bisulfite sequencing (WGBS, MethylC-seq, or BS-seq), reduced-representation bisulfite sequencing (RRBS), and enrichment-based methods such as MeDIP-seq, MBD-seq, and MRE-seq. These methods yield largely comparable results but differ significantly in extent of genomic CpG coverage, resolution, quantitative accuracy, and cost, at least while using current algorithms to interrogate the data. None of these existing methods provides single-CpG resolution, comprehensive genome-wide coverage, and cost feasibility for a typical laboratory. We introduce methylCRF, a novel conditional random fields–based algorithm that integrates methylated DNA immunoprecipitation (MeDIP-seq) and methylation-sensitive restriction enzyme (MRE-seq) sequencing data to predict DNA methylation levels at single-CpG resolution. Our method is a combined computational and experimental strategy to produce DNA methylomes of all 28 million CpGs in the human genome for a fraction (<10%) of the cost of whole-genome bisulfite sequencing methods. methylCRF was benchmarked for accuracy against Infinium arrays, RRBS, WGBS sequencing, and locus-specific bisulfite sequencing performed on the same human embryonic stem cell line. methylCRF transformation of MeDIP-seq/MRE-seq was equivalent to a biological replicate of WGBS in quantification, coverage, and resolution. We used conventional bisulfite conversion, PCR, cloning, and sequencing to validate loci where our predictions do not agree with whole-genome bisulfite data, and in 11 out of 12 cases, methylCRF predictions of methylation level agree better with validated results than does whole-genome bisulfite sequencing. Therefore, methylCRF transformation of MeDIP-seq/MRE-seq data provides an accurate, inexpensive, and widely accessible strategy to create full DNA methylomes.

    Linear-Time Superbubble Identification Algorithm for Genome Assembly

    DNA sequencing is the process of determining the exact order of the nucleotide bases of an individual's genome in order to catalogue sequence variation and understand its biological implications. Whole-genome sequencing techniques produce masses of data in the form of short sequences known as reads. Assembling these reads into a whole genome constitutes a major algorithmic challenge. Most assembly algorithms utilize de Bruijn graphs constructed from reads for this purpose. A critical step of these algorithms is to detect typical motif structures in the graph caused by sequencing errors and genome repeats, and filter them out; one such complex subgraph class is a so-called superbubble. In this paper, we propose an O(n+m)-time algorithm to detect all superbubbles in a directed acyclic graph with n nodes and m (directed) edges, improving the best-known O(m log m)-time algorithm by Sung et al.

    Having a direct look: analysis of DNA damage and repair mechanisms by next generation sequencing

    Genetic information is under constant attack from endogenous and exogenous sources, and the use of model organisms has provided important frameworks to understand how genome stability is maintained and how various DNA lesions are repaired. The advance of high throughput next generation sequencing (NGS) provides new inroads for investigating mechanisms needed for genome maintenance. These emerging studies, which aim to link genetic toxicology and mechanistic analyses of DNA repair processes in vivo, rely on defining mutational signatures caused by faulty replication, endogenous DNA damaging metabolites, or exogenously applied genotoxins; the analysis of their nature, their frequency and distribution. In contrast to classical studies, where DNA repair deficiency is assessed by reduced cellular survival, the localization of DNA repair factors and their interdependence as well as limited analysis of single locus reporter assays, NGS based approaches reveal the direct, quantal imprint of mutagenesis genome-wide, at the DNA sequence level. As we will show, such investigations require the analysis of DNA derived from single genotoxin treated cells, or DNA from cell populations regularly passaged through single cell bottlenecks when naturally occurring mutation accumulation is investigated. We will argue that the life cycle of the nematode Caenorhabditis elegans, its genetic malleability combined with whole genome sequencing provides an exciting model system to conduct such analysis.

    Analysis of Archived Residual Newborn Screening Blood Spots After Whole Genome Amplification

    Deidentified newborn screening bloodspot samples (NBS) represent a valuable potential resource for genomic research if impediments to whole exome sequencing of NBS deoxyribonucleic acid (DNA), including the small amount of genomic DNA in NBS material, can be overcome. For instance, genomic analysis of NBS could be used to define allele frequencies of disease-associated variants in local populations, or to conduct prospective or retrospective studies relating genomic variation to disease emergence in pediatric populations over time. In this study, we compared the recovery of variant calls from exome sequences of amplified NBS genomic DNA to variant calls from exome sequencing of non-amplified NBS DNA from the same individuals. Results: Using a standard alignment-based Genome Analysis Toolkit (GATK), we find 62,000-76,000 additional variants in amplified samples. After application of a unique kmer enumeration and variant detection method (RUFUS), only 38,000-47,000 additional variants are observed in amplified gDNA. This result suggests that roughly half of the amplification-introduced variants identified using GATK may be the result of mapping errors and read misalignment. Conclusions: Our results show that it is possible to obtain informative, high-quality data from exome analysis of whole genome amplified NBS with the important caveat that different data generation and analysis methods can affect variant detection accuracy, and the concordance of variant calls in whole-genome amplified and non-amplified exomes. National Institute of Health P01HD067244, NS076465, R01ES021006. Nutritional Science

    Library Preparation for Whole Genome Bisulfite Sequencing of Plant Genomes

    Epigenetic mechanisms are a key interface between the environment and the genotype. These mechanisms regulate gene expression in response to plant development and environmental stimuli, which ultimately affects the plant’s phenotype. DNA methylation, in particular cytosine methylation, is probably the best studied epigenetic modification in eukaryotes. It has been associated with the regulation of gene expression in response to cell/tissue differentiation, organism development and adaptation to changing environments. Whole genome bisulfite sequencing (WGBS) is considered the gold standard to study DNA methylation at a genome level. Here we present a protocol for the preparation of whole genome bisulfite sequencing libraries from plant samples (grapevine leaves) which includes detailed instructions for sample collection and DNA extraction, sequencing library preparation and bisulfite treatment.

    Optimizing DNA Extraction Methods for Nanopore Sequencing of Neisseria gonorrhoeae Directly from Urine Samples

    Empirical gonorrhea treatment at initial diagnosis reduces onward transmission. However, increasing resistance to multiple antibiotics may necessitate waiting for culture-based diagnostics to select an effective treatment. There is a need for same-day culture-free diagnostics that identify infection and detect antimicrobial resistance. We investigated if Nanopore sequencing can detect sufficient Neisseria gonorrhoeae DNA to reconstruct whole genomes directly from urine samples. We used N. gonorrhoeae-spiked urine samples and samples from gonorrhea infections to determine optimal DNA extraction methods that maximize the amount of N. gonorrhoeae DNA sequenced while minimizing contaminating host DNA. In simulated infections, the Qiagen UCP pathogen mini kit provided the highest ratio of N. gonorrhoeae to human DNA and the most consistent results. Depletion of human DNA with saponin increased N. gonorrhoeae yields in simulated infections but decreased yields in clinical samples. In 10 urine samples from men with symptomatic urethral gonorrhea, ≥92.8% coverage of an N. gonorrhoeae reference genome was achieved in all samples, with ≥93.8% coverage breadth at ≥10-fold depth in 7 (70%) samples. In simulated infections, if ≥10⁴ CFU/ml of N. gonorrhoeae was present, sequencing of the large majority of the genome was frequently achieved. N. gonorrhoeae could also be detected from urine in cobas PCR medium tubes and from urethral swabs and in the presence of simulated Chlamydia coinfection. Using Nanopore sequencing of urine samples from men with urethral gonorrhea, sufficient data can be obtained to reconstruct whole genomes in the majority of samples without the need for culture.

    Genomic insights into high exopolysaccharide-producing dairy starter bacterium Streptococcus thermophilus ASCC 1275

    Streptococcus thermophilus is an essential dairy starter for the manufacture of yogurt and cheese. Whole-genome sequencing of this organism is expected to provide insights into the genetic basis of metabolic pathways for biotechnological and probiotic applications. Streptococcus thermophilus ASCC 1275, a high EPS-producing dairy starter, has shown texture-enhancing properties for yogurt and cheese. After genomic DNA extraction using the CTAB/NaCl method, whole genome sequencing, including one shotgun sequencing, two extra paired-end sequencing and Sanger sequencing, was performed for strain …