
    Performance of Genotype Imputation for Rare Variants Identified in Exons and Flanking Regions of Genes

    Genotype imputation has the potential to assess human genetic variation at a lower cost than assaying the variants using laboratory techniques. The performance of imputation for rare variants has not been comprehensively studied. We utilized 8865 human samples with high-depth resequencing data for the exons and flanking regions of 202 genes and Genome-Wide Association Study (GWAS) data to characterize the performance of genotype imputation for rare variants. We evaluated reference sets ranging from 100 to 3713 subjects for imputing into samples typed for the Affymetrix (500K and 6.0) and Illumina 550K GWAS panels. The proportion of variants that could be well imputed (true r² > 0.7) with a reference panel of 3713 individuals was: 31% (Illumina 550K) or 25% (Affymetrix 500K) with MAF (Minor Allele Frequency) ≤ 0.001, 48% or 35% with 0.001 < MAF ≤ 0.005, 54% or 38% with 0.005 < MAF ≤ 0.01, 78% or 57% with 0.01 < MAF ≤ 0.05, and 97% or 86% with MAF > 0.05. The performance for common SNPs (MAF > 0.05) within exons and flanking regions is comparable to imputation of more uniformly distributed SNPs. The performance for rare SNPs (0.01 < MAF ≤ 0.05) was much more dependent on the GWAS panel and the number of reference samples. These results suggest routine use of genotype imputation for extending the assessment of common variants identified in humans via targeted exon resequencing into additional samples with GWAS data, but imputation of very rare variants (MAF ≤ 0.005) will require reference panels with thousands of subjects.
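The "well imputed" proportion reported per MAF bin can be illustrated with a small sketch. It is not the authors' pipeline: it assumes genotypes coded as 0/1/2 minor-allele counts, imputed values given as expected dosages, and "true r²" taken as the squared Pearson correlation between the two. The function names, the handling of monomorphic variants, and the 0.5 upper bound of the last bin are illustrative assumptions; the bin edges and the 0.7 threshold come from the abstract.

```python
from statistics import mean

# MAF bins as (lower, upper], with edges taken from the abstract.
MAF_BINS = [(0.0, 0.001), (0.001, 0.005), (0.005, 0.01), (0.01, 0.05), (0.05, 0.5)]


def squared_correlation(true_genotypes, imputed_dosages):
    """Squared Pearson correlation between true genotypes and imputed dosages."""
    mx, my = mean(true_genotypes), mean(imputed_dosages)
    sxy = sum((x - mx) * (y - my) for x, y in zip(true_genotypes, imputed_dosages))
    sxx = sum((x - mx) ** 2 for x in true_genotypes)
    syy = sum((y - my) ** 2 for y in imputed_dosages)
    if sxx == 0 or syy == 0:
        return 0.0  # assumption: a constant vector counts as not imputed
    return sxy * sxy / (sxx * syy)


def minor_allele_frequency(genotypes):
    """MAF from 0/1/2 allele counts in diploid samples."""
    freq = sum(genotypes) / (2 * len(genotypes))
    return min(freq, 1 - freq)


def proportion_well_imputed(variants, threshold=0.7):
    """variants: iterable of (true_genotypes, imputed_dosages) pairs.

    Returns {(lower, upper): proportion with r2 > threshold, or None if the bin is empty}.
    """
    counts = {b: [0, 0] for b in MAF_BINS}
    for true_g, dosage in variants:
        maf = minor_allele_frequency(true_g)
        r2 = squared_correlation(true_g, dosage)
        for lower, upper in MAF_BINS:
            if lower < maf <= upper:
                counts[(lower, upper)][0] += r2 > threshold
                counts[(lower, upper)][1] += 1
                break
    return {b: (good / total if total else None) for b, (good, total) in counts.items()}
```

Variants with a MAF of exactly zero in the evaluated samples fall into no bin and are skipped, which is one possible convention among several.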

    Estimation of uncertainty in genetic linkage data for human pedigrees

    Genetic linkage analysis entails estimating the distance between two genes on a chromosome using genotype information from a sample of individuals. For human pedigree data, counting the number of meiotic crossovers or recombination events is impossible due to the lack of complete information. Consequently, maximum likelihood methods are used to estimate the recombination frequency in these cases. Since the advent of high-resolution genetic maps, errors in genetic linkage data have become more of a problem. Errors can introduce spurious recombinations, which increase the map distance and distort linkage maps, reducing the power to locate disease genes. A general method for detecting errors in pedigree genotype data is presented. Its performance is evaluated with power studies using Monte Carlo methods on simulated data with pedigree structures similar to the CEPH pedigrees and a larger disease pedigree used in the study of idiopathic dilated cardiomyopathy. An investigation of the effect that errors have on the power of locating a disease gene in a proposed linkage study is also presented. The study's results can be used to plan linkage studies that account for error, thereby increasing their probability of success. The error detection method and power study results are important tools for performing linkage studies, now and in the future, that require high-resolution maps.
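The idea of estimating the recombination frequency by maximum likelihood when recombinants cannot be counted directly can be sketched with a simplified textbook case. This is not the method of the work above: it assumes independent, phase-unknown families in which each of n informative offspring is either one of k apparent recombinants or not, with the two parental phases equally likely, and it maximizes the LOD score over a grid. All function names and the grid size are hypothetical.

```python
import math


def family_likelihood(theta, k, n):
    """Likelihood of one phase-unknown family: average over the two parental phases."""
    return 0.5 * (theta**k * (1 - theta) ** (n - k) + theta ** (n - k) * (1 - theta) ** k)


def lod_score(theta, families):
    """Sum over families of log10 L(theta) / L(0.5); families is a list of (k, n)."""
    total = 0.0
    for k, n in families:
        likelihood = family_likelihood(theta, k, n)
        if likelihood == 0.0:
            return float("-inf")  # the data are impossible at this theta
        total += math.log10(likelihood / 0.5**n)
    return total


def estimate_theta(families, steps=500):
    """Grid-search maximum likelihood estimate of theta on [0, 0.5]."""
    grid = [0.5 * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda theta: lod_score(theta, families))
```

A spurious recombinant introduced by a genotyping error raises k in some family, which pulls the estimate toward 0.5 and lowers the maximum LOD score; this is the inflation of map distance and loss of power that the abstract describes.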