26 research outputs found

    Population-Sequencing as a Biomarker of Burkholderia mallei and Burkholderia pseudomallei Evolution through Microbial Forensic Analysis

    Large-scale genomics projects are identifying biomarkers to detect human disease. B. pseudomallei and B. mallei are two closely related select agents that cause melioidosis and glanders, respectively. Accurate characterization of metagenomic samples depends on accurate measurements of genetic variation between isolates, with resolution down to strain level. Single-biomarker sensitivity is often augmented by using multiple biomarkers or biomarker panels. In parallel with single-biomarker validation, advances in DNA sequencing enable analysis of entire genomes in a single run: population-sequencing. Potentially, direct sequencing could be used to analyze an entire genome to serve as the biomarker for genome identification. However, genome variation and population diversity complicate the use of direct sequencing, as do differences caused by sample preparation protocols, including sequencing artifacts and mistakes. As part of a Department of Homeland Security program in bacterial forensics, we examined how to implement whole genome sequencing (WGS) analysis as a judicially defensible forensic method for attributing microbial sample relatedness, and also sought to determine the strengths and limitations of whole genome sequence analysis in a forensics context. Herein, we demonstrate use of sequencing to provide genetic characterization of populations: direct sequencing of populations.

    Facile whole mitochondrial genome resequencing from nipple aspirate fluid using MitoChip v2.0

    Background: Mutations in the mitochondrial genome (mtgenome) have been associated with many disorders, including breast cancer. Nipple aspirate fluid (NAF) from symptomatic women could potentially serve as a minimally invasive sample for breast cancer screening by detecting somatic mutations in this biofluid. This study is aimed at 1) demonstrating the feasibility of NAF recovery from symptomatic women, 2) examining the feasibility of sequencing the entire mitochondrial genome from NAF samples, 3) cross-validating the Human mitochondrial resequencing array 2.0 (MCv2), and 4) assessing the somatic mtDNA mutation rate in benign breast diseases as a potential tool for monitoring early somatic mutations associated with breast cancer. Methods: NAF and blood were obtained from women with symptomatic benign breast conditions, and we successfully assessed the mutation load in the entire mitochondrial genome of 19 of these women. DNA extracts from NAF were sequenced using the mitochondrial resequencing array MCv2 and by capillary electrophoresis (CE) methods as a quality comparison. Sequencing was performed independently at two institutions and the results compared. The germline mtDNA sequence determined using DNA isolated from the patient's blood (control) was compared to the mutations present in cellular mtDNA recovered from the patient's NAF. Results: From the cohort of 28 women recruited for this study, NAF was successfully recovered from 23 participants (82%). Twenty-two (96%) of the women produced fluids from both breasts. Twenty NAF samples and corresponding blood were chosen for this study. Except for one NAF sample, the whole mtgenome was successfully amplified using a single primer pair, or three pairs of overlapping primers. Comparison of MCv2 data from the two institutions demonstrates 99.200% concordance. Moreover, MCv2 data were 99.999% identical to CE sequencing, indicating that MCv2 is a reliable method to rapidly sequence the entire mtgenome. Four NAF samples contained somatic mutations. Conclusion: We have demonstrated that NAF is a suitable material for mtDNA sequence analysis using the rapid and reliable MCv2. Somatic mtDNA mutations present in NAF of women with benign breast diseases could potentially be used as risk factors for progression to breast cancer, but this will require a much larger study with clinical follow-up.

    Performance of mitochondrial DNA mutations detecting early stage cancer

    Background: Mutations in the mitochondrial genome (mtgenome) have been associated with cancer and many other disorders. These mutations can be point mutations or deletions, or admixtures (heteroplasmy). The detection of mtDNA mutations in body fluids using resequencing microarrays, which are more sensitive than other sequencing methods, could provide a strategy to measure mutation loads in remote anatomical sites. Methods: We determined the mtDNA mutation load in the entire mitochondrial genome of 26 individuals with different early stage cancers (lung, bladder, kidney) and 12 heavy smokers without cancer. MtDNA was sequenced from three matched specimens (blood, tumor and body fluid) from each cancer patient and two matched specimens (blood and sputum) from smokers without cancer. The inherited wildtype sequence in the blood was compared to the sequences present in the tumor and body fluid, detected using the Affymetrix GeneChip® Human Mitochondrial Resequencing Array 1.0 and supplemented by capillary sequencing for the noncoding region. Results: Using this high-throughput method, 75% of the tumors were found to contain mtDNA mutations, higher than in our previous studies, and 36% of the body fluids from these cancer patients contained mtDNA mutations. Most of the mutations detected were heteroplasmic. A statistically significantly higher heteroplasmy rate occurred in tumor specimens when compared to both body fluid of cancer patients and sputum of controls, and in patient blood compared to blood of controls. Only 2 of the 12 sputum specimens from heavy smokers without cancer (17%) contained mtDNA mutations. Although patient mutations were spread throughout the mtDNA genome in the lung, bladder and kidney series, a statistically significant elevation of tRNA and ND complex mutations was detected in tumors. Conclusion: Our findings indicate that comprehensive mtDNA resequencing can be a high-throughput tool for detecting mutations in clinical samples with potential applications for cancer detection, but the biological relevance of these detected mitochondrial mutations is unclear. Whether the detection of tumor-specific mtDNA mutations in body fluids by this method will be useful for diagnosis and monitoring applications requires further investigation.

    ReseqChip: Automated integration of multiple local context probe data from the MitoChip array in mitochondrial DNA sequence assembly

    Background: The Affymetrix MitoChip v2.0 is an oligonucleotide tiling array for the resequencing of the human mitochondrial (mt) genome. For each of the 16,569 nucleotide positions of the mt genome it holds two sets of four 25-mer probes each that match the heavy and the light strand of a reference mt genome and vary only at their central position to interrogate all four possible alleles. In addition, the MitoChip v2.0 carries alternative local context probes to account for known mtDNA variants. These probes have been neglected in most studies due to the lack of software for their automated analysis. Results: We provide ReseqChip, free software that automates the process of resequencing mtDNA using multiple local context probes on the MitoChip v2.0. ReseqChip significantly improves base call rate and sequence accuracy. ReseqChip is available at http://code.open-bio.org/svnweb/index.cgi/bioperl/browse/bioperl-live/trunk/Bio/Microarray/Tools/. Conclusions: ReseqChip allows for the automated consolidation of base calls from alternative local mt genome context probes. It thereby improves the accuracy of resequencing, while reducing the number of non-called bases.

    Long homopurine•homopyrimidine sequences are characteristic of genes expressed in brain and the pseudoautosomal region

    Homo(purine•pyrimidine) sequences (R•Y tracts) with mirror repeat symmetries form stable triplexes that block replication and transcription and promote genetic rearrangements. A systematic search was conducted to map the location of the longest R•Y tracts in the human genome in order to assess their potential function(s). The 814 R•Y tracts with ≥250 uninterrupted base pairs were preferentially clustered in the pseudoautosomal region of the sex chromosomes and located in the introns of 228 annotated genes whose protein products were associated with functions at the cell membrane. These genes were highly expressed in the brain, particularly those associated with susceptibility to mental disorders such as schizophrenia. The set of 1957 genes harboring the 2886 R•Y tracts with ≥100 uninterrupted base pairs was additionally enriched in proteins associated with phosphorylation, signal transduction, development and morphogenesis. Comparisons of the ≥250 bp R•Y tracts in the mouse and chimpanzee genomes indicated that these sequences have mutated faster than the surrounding regions and are longer in humans than in chimpanzees. These results support a role for long R•Y tracts in promoting recombination and genome diversity during evolution through destabilization of chromosomal DNA, thereby inducing repair and mutation.

    Internal Transcribed Spacer 2 (nu ITS2 rRNA) Sequence-Structure Phylogenetics: Towards an Automated Reconstruction of the Green Algal Tree of Life

    Some have advocated the use of the nuclear-encoded internal transcribed spacer two (ITS2) as an alternative to the traditional chloroplast markers. However, the ITS2 is broadly perceived to be insufficiently conserved or to be confounded by introgression or biparental inheritance patterns, precluding its broad use in phylogenetic reconstruction or as a DNA barcode. A growing body of evidence has shown that simultaneous analysis of nucleotide data with secondary structure information can overcome at least some of the limitations of ITS2. The goal of this investigation was to assess the feasibility of an automated, sequence-structure approach for analysis of ITS2 data from a large sampling of phylum Chlorophyta. Sequences and secondary structures from 591 chlorophycean, 741 trebouxiophycean and 938 ulvophycean algae, all obtained from the ITS2 Database, were aligned using a sequence structure-specific scoring matrix. Phylogenetic relationships were reconstructed by Profile Neighbor-Joining coupled with a sequence structure-specific, general time reversible substitution model. Results from analyses of the ITS2 data were robust at multiple nodes and showed considerable congruence with results from published phylogenetic analyses. Our observations on the power of automated, sequence-structure analyses of ITS2 to reconstruct phylum-level phylogenies of the green algae validate this approach to assessing diversity for large sets of chlorophytan taxa. Moreover, our results indicate that objections to the use of ITS2 for DNA barcoding should be weighed against the utility of an automated data analysis approach with demonstrated power to reconstruct evolutionary patterns for highly divergent lineages.

    A multistep mutation mechanism drives the evolution of the CAG repeat at MJD/SCA3 locus

    Despite the intense debate around the repeat instability reported in the large group of neurological disorders caused by trinucleotide repeat expansions, little is known about the mutation process underlying alleles in the normal range that, ultimately, expand to pathological size. In this study, we assessed the mutation mechanisms by which wild-type Machado-Joseph disease (MJD) alleles have been generated throughout human evolution. Haplotypes including the CAG repeat, six intragenic SNPs and four flanking microsatellites were analysed in 431 normal chromosomes of European, Asian and African origin. A bimodal CAG repeat length frequency distribution was found in the four most frequent wild-type lineages (H1-GCGGCA; H2-GTGGCA; H3-TTAGAC and H4-TTACAC). Based on flanking microsatellite haplotypes, the variance calculated by analysis of molecular variance between modal (CAG)n alleles was little or null in lineages H1, H2 and H4, as were the pairwise differences. Moreover, genetic distances among all the alleles from each lineage did not reflect the allele size differences, as would be expected if a stepwise mutation model were the main process of evolution. On the contrary, when displayed in maximum parsimony phylogenetic trees, a large number of mutation steps separated same-size alleles, whereas several microsatellite haplotypes were shared by modal CAGs. In conclusion, our results suggest that the main mutation mechanism occurring in the evolution of the polymorphic CAG region at the MJD/SCA3 locus is a multistep one, either by gene conversion or DNA slippage; repeats with 14, 21, 23 and 27 CAGs are the main alleles involved in this process.

    Breakpoints of gross deletions coincide with non-B DNA conformations

    Genomic rearrangements are a frequent source of instability, but the mechanisms involved are poorly understood. A 2.5-kbp poly(purine·pyrimidine) sequence from the human PKD1 gene, known to form non-B DNA structures, induced long deletions and other instabilities in plasmids that were mediated by mismatch repair and, in some cases, transcription. The breakpoints occurred at predicted non-B DNA structures. Distance measurements also indicated a significant proximity of alternating purine-pyrimidine and oligo(purine·pyrimidine) tracts to breakpoint junctions in 222 gross deletions and translocations, respectively, involved in human diseases. In 11 deletions analyzed, breakpoints were explicable by non-B DNA structure formation. We conclude that alternative DNA conformations trigger genomic rearrangements through recombination-repair activities.