
    From cheek swabs to consensus sequences : an A to Z protocol for high-throughput DNA sequencing of complete human mitochondrial genomes

    Background: Next-generation DNA sequencing (NGS) technologies have had a major impact on many fields of biological research, especially evolutionary biology. One area where NGS has shown potential is high-throughput sequencing of complete mitochondrial (mtDNA) genomes of humans and other animals. Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, significant obstacles remain to the successful implementation of NGS-based projects, especially for new users. Results: Here we present an ‘A to Z’ protocol for obtaining complete human mtDNA genomes, from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small organellar genomes from other species, as well as nuclear loci. The protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing on the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions: All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals, and all steps after DNA extraction can be carried out in 96-well plate format. The protocol has also been assembled so that individual ‘modules’ can be swapped out to suit available resources.
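The last stages of the bioinformatics pipeline named in this abstract (reference-based mapping output, coverage, consensus sequence, SNP calling) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes reads are already mapped without gaps and given as hypothetical `(start, sequence)` pairs, and the function names and the `min_depth` threshold are illustrative assumptions.

```python
from collections import Counter


def pileup(reads, ref_len):
    """Count bases per reference position from gapless, already-mapped reads.

    `reads` is a list of (start, sequence) pairs with 0-based start positions
    (a hypothetical input format chosen for this sketch).
    """
    columns = [Counter() for _ in range(ref_len)]
    for start, seq in reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_len:
                columns[pos][base] += 1
    return columns


def consensus_and_snps(reference, reads, min_depth=2):
    """Return (consensus, coverage, snps) by majority vote per position.

    Positions covered by fewer than `min_depth` reads are reported as 'N'.
    SNPs are (position, reference_base, consensus_base) tuples.
    """
    columns = pileup(reads, len(reference))
    coverage = [sum(col.values()) for col in columns]
    consensus = []
    snps = []
    for pos, (ref_base, col) in enumerate(zip(reference, columns)):
        if coverage[pos] < min_depth:
            consensus.append("N")
            continue
        base, _count = col.most_common(1)[0]
        consensus.append(base)
        if base != ref_base:
            snps.append((pos, ref_base, base))
    return "".join(consensus), coverage, snps


# Toy data, invented for illustration only.
reference = "ACGTACGT"
reads = [(0, "ACGA"), (1, "CGAAC"), (2, "GAACG"), (4, "ACGT")]
consensus, coverage, snps = consensus_and_snps(reference, reads)
print(consensus, coverage, snps)
```

The per-position `coverage` list is the data a coverage plot would draw. A real pipeline would also handle indels, base qualities and a circular reference, all of which this sketch ignores.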

    Improving dbNSFP

    Mingyao Lu, B.S. Advisory Professor: Xiaoming Liu, Ph.D. The analysis and interpretation of DNA variation are very important for whole-exome sequencing (WES) studies. Genome research has focused on single nucleotide variants (SNVs). Indels are as important as SNVs, and indels in coding regions in particular are often candidate disease-causing variants, so it is necessary to expand the focus to include indel mutations. The goal of my project is to provide an automatic annotation pipeline for WES-based disease studies by extending dbNSFP with a tool for automated indel annotation and deleteriousness prediction. Current sequencing results typically include both SNVs and indels. Although many tools are available to integrate functional predictions and annotations for SNV effects, to my knowledge there are no such tools for indels. Therefore, the aim of this thesis was to add deleteriousness prediction scores, including CADD, SIFT, and PROVEAN, to indel annotation based on gene models. All of these scores can be calculated on the fly after installing resources locally. A Docker image implementing the indel annotation and deleteriousness prediction has been developed and is ready to be deployed from the cloud.

    RACS: Rapid Analysis of ChIP-Seq data for contig based genomes

    Background: Chromatin immunoprecipitation coupled to next-generation sequencing (ChIP-Seq) is a widely used technique to investigate the function of chromatin-related proteins in a genome-wide manner. ChIP-Seq generates large quantities of data that can be difficult to process and analyse, particularly for organisms with contig-based genomes. Contig-based genomes often have poor annotations for cis-elements, such as enhancers, that are important for gene expression. Poorly annotated genomes make a comprehensive analysis of ChIP-Seq data difficult, and as a result standardized analysis pipelines are lacking. Methods: We report a computational pipeline that utilizes traditional High-Performance Computing techniques and open source tools for processing and analysing data obtained from ChIP-Seq. We applied our computational pipeline, "Rapid Analysis of ChIP-Seq data" (RACS), to ChIP-Seq data generated in the model organism Tetrahymena thermophila, an example of an organism whose genome is available in contigs. Results: To test the performance and efficiency of RACS, we performed control ChIP-Seq experiments, allowing us to rapidly eliminate false positives when analyzing our previously published data set. Our pipeline segregates the read accumulations it finds into genic and intergenic regions and is highly efficient for rapid downstream analyses. Conclusions: Altogether, the computational pipeline presented in this report is an efficient and highly reliable tool to analyze genome-wide ChIP-Seq data generated in model organisms with contig-based genomes. RACS is an open source computational pipeline available to download from https://bitbucket.org/mjponce/racs or https://gitrepos.scinet.utoronto.ca/public/?a=summary&p=RACS. Comment: Submitted to BMC Bioinformatics. Computational pipeline available at https://bitbucket.org/mjponce/rac
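The step of segregating read accumulations into genic and intergenic regions can be sketched as a simple interval-overlap test per contig. This is an illustrative assumption about how such a split could be done, not the RACS implementation; the function name, the tuple layout and the contig names are hypothetical.

```python
def classify_peaks(peaks, genes):
    """Split peaks into (genic, intergenic) lists.

    `peaks` and `genes` are lists of (contig, start, end) tuples using
    half-open intervals [start, end). A peak is genic if it overlaps at
    least one annotated gene on the same contig.
    """
    # Group gene intervals by contig so each peak is only compared
    # against genes on its own contig.
    genes_by_contig = {}
    for contig, start, end in genes:
        genes_by_contig.setdefault(contig, []).append((start, end))

    genic, intergenic = [], []
    for peak in peaks:
        contig, p_start, p_end = peak
        overlaps = any(
            p_start < g_end and g_start < p_end
            for g_start, g_end in genes_by_contig.get(contig, [])
        )
        (genic if overlaps else intergenic).append(peak)
    return genic, intergenic


# Toy annotation and peaks, invented for illustration only.
genes = [("scf_1", 100, 500), ("scf_2", 0, 300)]
peaks = [
    ("scf_1", 450, 600),  # overlaps the scf_1 gene
    ("scf_1", 500, 700),  # starts exactly where the gene ends: no overlap
    ("scf_2", 350, 400),  # downstream of the scf_2 gene
    ("scf_3", 10, 50),    # contig with no annotated genes
]
genic, intergenic = classify_peaks(peaks, genes)
print(len(genic), "genic;", len(intergenic), "intergenic")
```

The linear scan per peak is adequate for small annotations; for large gene sets, a sorted list with binary search or an interval tree would be the usual replacement.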

    Methods for Scarless, Selection-Free Generation of Human Cells and Allele-Specific Functional Analysis of Disease-Associated SNPs and Variants of Uncertain Significance.

    With the continued emergence of risk loci from genome-wide association studies and of variants of uncertain significance identified from patient sequencing, better methods are required to translate these human genetic findings into improvements in public health. Here we combine CRISPR/Cas9 gene editing with an innovative high-throughput genotyping pipeline utilizing KASP (Kompetitive Allele-Specific PCR) genotyping technology to create scarless isogenic cell models of cancer variants in ~1 month. We successfully modeled two novel variants previously identified by our lab in the PALB2 gene in HEK293 cells, resulting in isogenic cells representing all three genotypes for both variants. We also modeled a known functional risk SNP of colorectal cancer, rs6983267, in HCT-116 cells. Cells with extremely low levels of gene editing could still be identified and isolated using this approach. We also introduce a novel molecular assay, ChIPnQASO (Chromatin Immunoprecipitation and Quantitative Allele-Specific Occupation), which uses the same technology to reveal allele-specific function of these variants at the level of DNA-protein interaction. We demonstrated preferential binding of the transcription factor TCF7L2 to the rs6983267 risk allele over the non-risk allele. Our pipeline provides a platform for functional variant discovery and validation that is accessible and broadly applicable to efforts towards precision medicine.