34,288 research outputs found

    High-resolution genetic mapping with pooled sequencing

    Get PDF
    Background: Modern genetics has been transformed by high-throughput sequencing. New experimental designs in model organisms involve analyzing many individuals, pooled and sequenced in groups for increased efficiency. However, the uncertainty from pooling and the challenge of noisy sequencing data demand advanced computational methods. Results: We present MULTIPOOL, a computational method for genetic mapping in model organism crosses that are analyzed by pooled genotyping. Unlike other methods for the analysis of pooled sequence data, we simultaneously consider information from all linked chromosomal markers when estimating the location of a causal variant. Our use of informative sequencing reads is formulated as a discrete dynamic Bayesian network, which we extend with a continuous approximation that allows for rapid inference without a dependence on the pool size. MULTIPOOL generalizes to include biological replicates and case-only or case-control designs for binary and quantitative traits. Conclusions: Our increased information sharing and principled inclusion of relevant error sources improve resolution and accuracy when compared to existing methods, localizing associations to single genes in several cases. MULTIPOOL is freely available at http://cgs.csail.mit.edu/multipool/ webcite.National Science Foundation (U.S.) (Graduate Research Fellowship Grant 0645960

    Simultaneous mapping of multiple gene loci with pooled segregants

    Get PDF
    The analysis of polygenic, phenotypic characteristics such as quantitative traits or inheritable diseases remains an important challenge. It requires reliable scoring of many genetic markers covering the entire genome. The advent of high-throughput sequencing technologies provides a new way to evaluate large numbers of single nucleotide polymorphisms (SNPs) as genetic markers. Combining the technologies with pooling of segregants, as performed in bulked segregant analysis (BSA), should, in principle, allow the simultaneous mapping of multiple genetic loci present throughout the genome. The gene mapping process, applied here, consists of three steps: First, a controlled crossing of parents with and without a trait. Second, selection based on phenotypic screening of the offspring, followed by the mapping of short offspring sequences against the parental reference. The final step aims at detecting genetic markers such as SNPs, insertions and deletions with next generation sequencing (NGS). Markers in close proximity of genomic loci that are associated to the trait have a higher probability to be inherited together. Hence, these markers are very useful for discovering the loci and the genetic mechanism underlying the characteristic of interest. Within this context, NGS produces binomial counts along the genome, i.e., the number of sequenced reads that matches with the SNP of the parental reference strain, which is a proxy for the number of individuals in the offspring that share the SNP with the parent. Genomic loci associated with the trait can thus be discovered by analyzing trends in the counts along the genome. We exploit the link between smoothing splines and generalized mixed models for estimating the underlying structure present in the SNP scatterplots

    QTL analysis of high thermotolerance with superior and downgraded parental yeast strains reveals new minor QTLs and converges on novel causative alleles involved in RNA processing

    Get PDF
    Revealing QTLs with a minor effect in complex traits remains difficult. Initial strategies had limited success because of interference by major QTLs and epistasis. New strategies focused on eliminating major QTLs in subsequent mapping experiments. Since genetic analysis of superior segregants from natural diploid strains usually also reveals QTLs linked to the inferior parent, we have extended this strategy for minor QTL identification by eliminating QTLs in both parent strains and repeating the QTL mapping with pooled-segregant whole-genome sequence analysis. We first mapped multiple QTLs responsible for high thermotolerance in a natural yeast strain, MUCL28177, compared to the laboratory strain, BY4742. Using single and bulk reciprocal hemizygosity analysis we identified MKT1 and PRP42 as causative genes in QTLs linked to the superior and inferior parent, respectively. We subsequently downgraded both parents by replacing their superior allele with the inferior allele of the other parent. QTL mapping using pooled-segregant whole-genome sequence analysis with the segregants from the cross of the downgraded parents, revealed several new QTLs. We validated the two most-strongly linked new QTLs by identifying NCS2 and SMD2 as causative genes linked to the superior downgraded parent and we found an allele-specific epistatic interaction between PRP42 and SMD2. Interestingly, the related function of PRP42 and SMD2 suggests an important role for RNA processing in high thermotolerance and underscores the relevance of analyzing minor QTLs. Our results show that identification of minor QTLs involved in complex traits can be successfully accomplished by crossing parent strains that have both been downgraded for a single QTL. This novel approach has the advantage of maintaining all relevant genetic diversity as well as enough phenotypic difference between the parent strains for the trait-of-interest and thus maximizes the chances of successfully identifying additional minor QTLs that are relevant for the phenotypic difference between the original parents

    Improved linkage analysis of Quantitative Trait Loci using bulk segregants unveils a novel determinant of high ethanol tolerance in yeast

    Get PDF
    Background: Bulk segregant analysis (BSA) coupled to high throughput sequencing is a powerful method to map genomic regions related with phenotypes of interest. It relies on crossing two parents, one inferior and one superior for a trait of interest. Segregants displaying the trait of the superior parent are pooled, the DNA extracted and sequenced. Genomic regions linked to the trait of interest are identified by searching the pool for overrepresented alleles that normally originate from the superior parent. BSA data analysis is non-trivial due to sequencing, alignment and screening errors. Results: To increase the power of the BSA technology and obtain a better distinction between spuriously and truly linked regions, we developed EXPLoRA (EXtraction of over-rePresented aLleles in BSA), an algorithm for BSA data analysis that explicitly models the dependency between neighboring marker sites by exploiting the properties of linkage disequilibrium through a Hidden Markov Model (HMM). Reanalyzing a BSA dataset for high ethanol tolerance in yeast allowed reliably identifying QTLs linked to this phenotype that could not be identified with statistical significance in the original study. Experimental validation of one of the least pronounced linked regions, by identifying its causative gene VPS70, confirmed the potential of our method. Conclusions: EXPLoRA has a performance at least as good as the state-of-the-art and it is robust even at low signal to noise ratio's i.e. when the true linkage signal is diluted by sampling, screening errors or when few segregants are available

    EXPLoRA-web: linkage analysis of quantitative trait loci using bulk segregant analysis

    Get PDF
    Identification of genomic regions associated with a phenotype of interest is a fundamental step toward solving questions in biology and improving industrial research. Bulk segregant analysis (BSA) combined with high-throughput sequencing is a technique to efficiently identify these genomic regions associated with a trait of interest. However, distinguishing true from spuriously linked genomic regions and accurately delineating the genomic positions of these truly linked regions requires the use of complex statistical models currently implemented in software tools that are generally difficult to operate for non-expert users. To facilitate the exploration and analysis of data generated by bulked segregant analysis, we present EXPLoRA-web, a web service wrapped around our previously published algorithm EXPLoRA, which exploits linkage disequilibrium to increase the power and accuracy of quantitative trait loci identification in BSA analysis. EXPLoRA-web provides a user friendly interface that enables easy data upload and parallel processing of different parameter configurations. Results are provided graphically and as BED file and/or text file and the input is expected in widely used formats, enabling straightforward BSA data analysis. The web server is available at http://bioinformatics.intec.ugent.be/explora-web/

    Whole genome sequencing-based mapping and candidate identification of mutations from fixed zebrafish tissue

    Get PDF
    As forward genetic screens in zebrafish become more common, the number of mutants that cannot be identified by gross morphology or through transgenic approaches, such as many nervous system defects, has also increased. Screening for these difficult-to-visualize phenotypes demands techniques such as whole-mount in situ hybridization (WISH) or antibody staining, which require tissue fixation. To date, fixed tissue has not been amenable for generating libraries for whole genome sequencing (WGS). Here, we describe a method for using genomic DNA from fixed tissue and a bioinformatics suite for WGS-based mapping of zebrafish mutants. We tested our protocol using two known zebrafish mutant alleles, gpr126st49 and egr2bfh227, both of which cause myelin defects. As further proof of concept we mapped a novel mutation, stl64, identified in a zebrafish WISH screen for myelination defects. We linked stl64 to chromosome 1 and identified a candidate nonsense mutation in the F-box and WD repeat domain containing 7 (fbxw7) gene. Importantly, stl64 mutants phenocopy previously described fbxw7vu56 mutants, and knockdown of fbxw7 in wild-type animals produced similar defects, demonstrating that stl64 disrupts fbxw7. Together, these data show that our mapping protocol can map and identify causative lesions in mutant screens that require tissue fixation for phenotypic analysis

    From cheek swabs to consensus sequences : an A to Z protocol for high-throughput DNA sequencing of complete human mitochondrial genomes

    Get PDF
    Background: Next-generation DNA sequencing (NGS) technologies have made huge impacts in many fields of biological research, but especially in evolutionary biology. One area where NGS has shown potential is for high-throughput sequencing of complete mtDNA genomes (of humans and other animals). Despite the increasing use of NGS technologies and a better appreciation of their importance in answering biological questions, there remain significant obstacles to the successful implementation of NGS-based projects, especially for new users. Results: Here we present an ‘A to Z’ protocol for obtaining complete human mitochondrial (mtDNA) genomes – from DNA extraction to consensus sequence. Although designed for use on humans, this protocol could also be used to sequence small, organellar genomes from other species, and also nuclear loci. This protocol includes DNA extraction, PCR amplification, fragmentation of PCR products, barcoding of fragments, sequencing using the 454 GS FLX platform, and a complete bioinformatics pipeline (primer removal, reference-based mapping, output of coverage plots and SNP calling). Conclusions: All steps in this protocol are designed to be straightforward to implement, especially for researchers who are undertaking next-generation sequencing for the first time. The molecular steps are scalable to large numbers (hundreds) of individuals and all steps post-DNA extraction can be carried out in 96-well plate format. Also, the protocol has been assembled so that individual ‘modules’ can be swapped out to suit available resources

    A High-Density Linkage Map of the Ancestral Diploid Strawberry, Fragaria iinumae, Constructed with Single Nucleotide Polymorphism Markers from the IStraw90 Array and Genotyping by Sequencing

    Get PDF
    Fragaria iinumae Makino is recognized as an ancestor of the octoploid strawberry species, which includes the cultivated strawberry, Fragaria ×ananassa Duchesne ex Rozier. Here we report the construction of the first high-density linkage map for F. iinumae. The F. iinumae linkage map (Fii map) is based on two high-throughput techniques of single nucleotide polymorphism (SNP) genotyping: the IStraw90 Array (hereafter “Array”), and genotyping by sequencing (GBS). The F2 generation mapping population was derived by selfing F. iinumae hybrid F1D, the product of a cross between two divergent F. iinumae accessions collected from Hokkaido, Japan. The Fii map consists of seven linkage groups (LGs) and has an overall length of 451.7 cM as defined by 496 loci populated by 4173 markers: 3280 from the Array and 893 from GBS. Comparisons with two versions of the Fragaria vesca ssp. vesca L. ‘Hawaii 4’ pseudo-chromosome (PC) assemblies reveal substantial conservation of synteny and colinearity, yet identified differences that point to possible genomic divergences between F. iinumae and F. vesca, and/or to F. vesca genomic assembly errors. The Fii map provides a basis for anchoring a F. iinumae genome assembly as a prerequisite for constructing a second diploid reference genome for Fragaria
    corecore