8,710 research outputs found

    Genotyping common and rare variation using overlapping pool sequencing

    Background: Recent advances in sequencing technologies set the stage for large, population-based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, a few multiplexing schemes have recently been suggested, in which a small number of DNA pools are sequenced and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants. Results: In this paper we provide a new algorithm for the deconvolution of DNA pool multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications. Conclusions: In particular, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences.

    Compressed Genotyping

    Significant volumes of knowledge have been accumulated in recent years linking subtle genetic variations to a wide variety of medical disorders, from Cystic Fibrosis to mental retardation. Nevertheless, there are still great challenges in applying this knowledge routinely in the clinic, largely due to the relatively tedious and expensive process of DNA sequencing. Since the genetic polymorphisms that underlie these disorders are relatively rare in the human population, the presence or absence of a disease-linked polymorphism can be thought of as a sparse signal. Using methods and ideas from compressed sensing and group testing, we have developed a cost-effective genotyping protocol. In particular, we have adapted our scheme to a recently developed class of high-throughput DNA sequencing technologies, and assembled a mathematical framework that has some important distinctions from 'traditional' compressed sensing ideas in order to address different biological and technical constraints. Comment: Submitted to IEEE Transactions on Information Theory - Special Issue on Molecular Biology and Neuroscience

    Fine-tuning the performance of ddRAD-seq in the peach genome

    The advance of Next Generation Sequencing (NGS) technologies allows high-throughput genotyping at a reasonable cost, although in the case of peach this technology has been scarcely developed. To date, only a standard Genotyping by Sequencing (GBS) approach, based on a single restriction with ApeKI to reduce genome complexity, has been applied in peach. In this work, we assessed the performance of the double-digest RADseq approach (ddRADseq) by comparing 6 double restrictions with the restriction profile generated with ApeKI. The enzyme pair PstI/MboI retained the highest number of loci, in concordance with the in silico analysis. Under this condition, the analysis of a diverse germplasm collection (191 peach genotypes) yielded 200,759,000 paired-end (2 × 250 bp) reads that allowed the identification of 113,411 SNPs, 13,661 InDels and 2,133 SSRs. We take advantage of a wide sample set to describe the technical scope of the platform. The novel platform presented here represents a useful tool for genomic-based breeding for peach.
    EEA San Pedro
    Affiliation: Aballay, Maximiliano Martín. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina.
    Affiliation: Aballay, Maximiliano Martín. Consejo Nacional de Investigaciones Científicas y Técnicas; Argentina.
    Affiliation: Aguirre, Natalia Cristina. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina.
    Affiliation: Aguirre, Natalia Cristina. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Agrobiotecnología y Biología Molecular; Argentina.
    Affiliation: Filippi, Carla Valeria. Instituto Nacional de Tecnología Agropecuaria (INTA). Instituto de Biotecnología; Argentina.
    Affiliation: Filippi, Carla Valeria. Consejo Nacional de Investigaciones Científicas y Técnicas. Instituto de Agrobiotecnología y Biología Molecular; Argentina.
    Affiliation: Valentini, Gabriel Hugo. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina.
    Affiliation: Sánchez, Gerardo. Instituto Nacional de Tecnología Agropecuaria (INTA). Estación Experimental Agropecuaria San Pedro; Argentina.

    Genome-wide analysis of ivermectin response by Onchocerca volvulus reveals that genetic drift and soft selective sweeps contribute to loss of drug sensitivity

    Treatment of onchocerciasis using mass ivermectin administration has reduced morbidity and transmission throughout Africa and Central/South America. Mass drug administration is likely to exert selection pressure on parasites, and phenotypic and genetic changes in several Onchocerca volvulus populations from Cameroon and Ghana (exposed to more than a decade of regular ivermectin treatment) have raised concern that sub-optimal responses to ivermectin's anti-fecundity effect are becoming more frequent and may spread. Pooled next generation sequencing (Pool-seq) was used to characterise genetic diversity within and between 108 adult female worms differing in ivermectin treatment history and response. Genome-wide analyses revealed genetic variation that significantly differentiated good responder (GR) and sub-optimal responder (SOR) parasites. These variants were not randomly distributed but clustered in ~31 quantitative trait loci (QTLs), with little overlap in putative QTL position and gene content between the two countries. Published candidate ivermectin SOR genes were largely absent in these regions; QTLs differentiating GR and SOR worms were enriched for genes in molecular pathways associated with neurotransmission, development, and stress responses. Finally, single worm genotyping demonstrated that geographic isolation and genetic change over time (in the presence of drug exposure) had a significantly greater role in shaping genetic diversity than the evolution of SOR. This study is one of the first genome-wide association analyses in a parasitic nematode, and provides insight into the genomics of ivermectin response and population structure of O. volvulus. We argue that ivermectin response is a polygenically determined quantitative trait (QT) whereby identical or related molecular pathways, but not necessarily individual genes, are likely to determine the extent of ivermectin response in different parasite populations. Furthermore, we propose that genetic drift rather than genetic selection of SOR is the underlying driver of population differentiation, which has significant implications for the emergence and potential spread of SOR within and between these parasite populations.

    Deep Sequencing of the Nicastrin Gene in Pooled DNA, the Identification of Genetic Variants That Affect Risk of Alzheimer's Disease

    Nicastrin is an obligatory component of γ-secretase, the enzyme complex that leads to the production of Aβ fragments central to the pathogenesis of Alzheimer's disease (AD). Analyses of the effects of common variation in this gene on risk for late-onset AD have been inconclusive. We investigated the effect of rare variation in the coding regions of the Nicastrin gene in a cohort of AD patients and matched controls using an innovative pooling approach and next generation sequencing. Five SNPs were identified and validated by individual genotyping from 311 cases and 360 controls. Association analysis identified a non-synonymous rare SNP (N417Y) with a significantly higher frequency in cases compared to controls in the Greek population (OR 3.994, CI 1.105–14.439, p = 0.035). This finding warrants further investigation in a larger cohort and adds weight to the hypothesis that rare variation explains some of the genetic heritability still to be identified in Alzheimer's disease.

    Genome wide SNP discovery, analysis and evaluation in mallard (Anas platyrhynchos)

    Background: Next generation sequencing technologies make it possible to obtain, at low cost, the genomic sequence information that is currently lacking for most economically and ecologically important organisms. For the mallard duck, genomic data are limited. Besides being a species of large agricultural and societal importance, the mallard is also the focal species for long-distance dispersal of Avian Influenza. For large-scale identification of SNPs we performed Illumina sequencing of wild mallard DNA and compared our data with ongoing genome and EST sequencing of domesticated conspecifics. This is the first study of its kind for waterfowl. Results: More than one billion base pairs of sequence information were generated, resulting in 16× coverage of a reduced representation library of the mallard genome. Sequence reads were aligned to a draft domesticated duck reference genome and allowed for the detection of over 122,000 SNPs within our mallard sequence dataset. In addition, almost 62,000 nucleotide positions on the domesticated duck reference showed a different nucleotide compared to wild mallard. Approximately 20,000 SNPs identified within our data were shared with SNPs identified in the sequenced domestic duck or in EST sequencing projects. The shared SNPs were considered to be highly reliable and were used to benchmark non-shared SNPs for quality. Genotyping of a representative sample of 364 SNPs resulted in a SNP conversion rate of 99.7%. The correlation of the minor allele count and observed minor allele frequency in the SNP discovery pool was 0.72. Conclusion: We identified almost 150,000 SNPs in wild mallards that will likely yield good results in genotyping. Of these, ~101,000 SNPs were detected within our wild mallard sequences and ~49,000 were detected between wild and domesticated duck data. In the ~101,000 SNPs we found a subset of ~20,000 SNPs shared between wild mallards and the sequenced domesticated duck, suggesting a low genetic divergence. Comparison of quality metrics between the total SNP set (122,000 + 62,000 = 184,000 SNPs) and the validated subset shows similar characteristics for both sets. This indicates that we have detected a large number (~150,000) of accurately inferred mallard SNPs, which will benefit bird evolutionary studies, ecological studies (e.g. disentangling migratory connectivity) and industrial breeding programs.