The hazards of genotype imputation in chromosomal regions under selection: A case study using the Lactase gene region
Although imputation of missing SNP results has been widely used in genetic studies, claims about the quality and usefulness of imputation have outnumbered the few studies that have questioned its limitations. But it is becoming clear that these limitations are real—for example, disease association signals can be missed in regions of LD breakdown. Here, as a case study, using the chromosomal region of the well-known lactase gene, LCT, we address the issue of imputation in the context of variants that have become frequent in a limited number of modern population groups only recently, due to selection. We study SNPs in a 500 bp region covering the enhancer of LCT, and compare imputed genotypes with directly genotyped data. We examine the haplotype pairs of all individuals with discrepant and missing genotypes. We highlight the nonrandom nature of the allelic errors and show that most incorrect imputations and missing data result from long haplotypes that are evolutionarily closely related to those carrying the derived alleles, while some relate to rare and recombinant haplotypes. We conclude that bias of incorrectly imputed and missing genotypes can decrease the accuracy of imputed results substantially
The hazards of genotype imputation when mapping disease susceptibility variants
BACKGROUND: The cost-free increase in statistical power of using imputation to infer missing genotypes is undoubtedly appealing, but is it hazard-free? This case study of three type-2 diabetes (T2D) loci demonstrates that it is not; it sheds light on why this is so and raises concerns as to the shortcomings of imputation at disease loci, where haplotypes differ between cases and reference panel. RESULTS: T2D-associated variants were previously identified using targeted sequencing. We removed these significantly associated SNPs and used neighbouring SNPs to infer them by imputation. We compared imputed with observed genotypes, examined the altered pattern of T2D-SNP association, and investigated the cause of imputation errors by studying haplotype structure. Most T2D variants were incorrectly imputed with a low density of scaffold SNPs, but the majority failed to impute even at high density, despite obtaining high certainty scores. Missing and discordant imputation errors, which were observed disproportionately for the risk alleles, produced monomorphic genotype calls or false-negative associations. We show that haplotypes carrying risk alleles are considerably more common in the T2D cases than the reference panel, for all loci. CONCLUSIONS: Imputation is not a panacea for fine mapping, nor for meta-analysing multiple GWAS based on different arrays and different populations. A total of 80% of the SNPs we have tested are not included in array platforms, explaining why these and other such associated variants may previously have been missed. Regardless of the choice of software and reference haplotypes, imputation drives genotype inference towards the reference panel, introducing errors at disease loci
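The comparison of imputed with observed genotypes described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `imputation_summary`, the coding of genotypes as risk-allele counts (0, 1, 2) and the use of `None` for a missing imputed call are all assumptions made for illustration. Splitting the rates by carrier status is one simple way to see whether errors fall disproportionately on risk-allele carriers.

```python
from typing import Optional, Sequence


def imputation_summary(observed: Sequence[int],
                       imputed: Sequence[Optional[int]]) -> dict:
    """Compare imputed genotypes with directly observed ones.

    Genotypes are coded as risk-allele counts (0, 1 or 2); None marks a
    missing imputed call. This coding is an illustrative assumption.
    """
    if len(observed) != len(imputed):
        raise ValueError("observed and imputed must have the same length")

    def rates(pairs):
        n = len(pairs)
        if n == 0:
            return None
        missing = sum(1 for _, imp in pairs if imp is None)
        discordant = sum(1 for obs, imp in pairs
                         if imp is not None and imp != obs)
        return {
            "n": n,
            "missing": missing / n,
            "discordant": discordant / n,
            "concordant": (n - missing - discordant) / n,
        }

    pairs = list(zip(observed, imputed))
    return {
        "all": rates(pairs),
        # Individuals who truly carry at least one risk allele.
        "carriers": rates([p for p in pairs if p[0] > 0]),
        # Individuals who truly carry no risk allele.
        "non_carriers": rates([p for p in pairs if p[0] == 0]),
    }


if __name__ == "__main__":
    # Made-up toy data, not results from the study.
    observed = [0, 0, 0, 0, 1, 1, 2, 0]
    imputed = [0, 0, 0, 0, 0, None, 1, 0]
    for group, stats in imputation_summary(observed, imputed).items():
        print(group, stats)
```

In the toy data every error falls on a carrier, which is the kind of nonrandom pattern the abstract reports; a single overall concordance figure would hide it.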
Identification and Replication of Three Novel Myopia Common Susceptibility Gene Loci on Chromosome 3q26 using Linkage and Linkage Disequilibrium Mapping
Refractive error is a highly heritable quantitative trait responsible for considerable morbidity. Following an initial genome-wide linkage study using microsatellite markers, we confirmed evidence for linkage to chromosome 3q26 and then conducted fine-scale association mapping using high-resolution linkage disequilibrium unit (LDU) maps. We used a preliminary discovery marker set across the 30-Mb region with an average SNP density of 1 SNP/15 kb (Map 1). Map 1 was divided into 51 LDU windows and additional SNPs were genotyped for six regions (Map 2) that showed preliminary evidence of multi-marker association using composite likelihood. A total of 575 cases and controls selected from the tails of the trait distribution were genotyped for the discovery sample. Malecot model estimates indicate three loci with putative common functional variants centred on MFN1 (180,566 kb; 95% confidence interval 180,505–180,655 kb), approximately 156 kb upstream from alternate-splicing SOX2OT (182,595 kb; 95% CI 182,533–182,688 kb) and PSARL (184,386 kb; 95% CI 184,356–184,411 kb), with the loci showing modest to strong evidence of association for the Map 2 discovery samples (p < 10^-7, p < 10^-10, and p = 0.01, respectively). Using an unselected independent sample of 1,430 individuals, results replicated for the MFN1 (p = 0.006), SOX2OT (p = 0.0002), and PSARL (p = 0.0005) gene regions. MFN1 and PSARL both interact with OPA1 to regulate mitochondrial fusion and the inhibition of mitochondrial-led apoptosis, respectively. That two mitochondrial regulatory processes in the retina are implicated in the aetiology of myopia is surprising and is likely to provide novel insight into the molecular genetic basis of common myopia
World-wide distributions of lactase persistence alleles and the complex effects of recombination and selection
The genetic trait of lactase persistence (LP) is associated with at least five independent functional single nucleotide variants in a regulatory region about 14 kb upstream of the lactase gene [-13910*T (rs4988235), -13907*G (rs41525747), -13915*G (rs41380347), -14009*G (rs869051967) and -14010*C (rs145946881)]. These alleles have been inferred to have spread recently and present-day frequencies have been attributed to positive selection for the ability of adult humans to digest lactose without risk of symptoms of lactose intolerance. One of the inferential approaches used to estimate the level of past selection has been to determine the extent of haplotype homozygosity (EHH) of the sequence surrounding the SNP of interest. We report here new data on the frequencies of the known LP alleles in the 'Old World' and their haplotype lineages. We examine and confirm EHH of each of the LP alleles in relation to their distinct lineages, but also show marked EHH for one of the older haplotypes that does not carry any of the five LP alleles. The region of EHH of this (B) haplotype exactly coincides with a region of suppressed recombination that is detectable in families as well as in population data, and the results show how such suppression may have exaggerated haplotype-based measures of past selection
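As a rough illustration of the EHH measure mentioned in this abstract, the sketch below uses the commonly cited definition: among haplotypes carrying a given core allele, EHH is the probability that two haplotypes chosen at random are identical over the surrounding window. The function name `ehh`, the string encoding of haplotypes and the symmetric window measured in markers are assumptions for illustration; published analyses usually extend from the core in each direction separately and measure distance in physical or genetic units.

```python
from collections import Counter
from math import comb
from typing import Sequence


def ehh(haplotypes: Sequence[str], core_index: int,
        core_allele: str, distance: int) -> float:
    """Haplotype homozygosity around a core allele.

    haplotypes  : equal-length strings, one character per marker
    core_index  : position of the core SNP within each string
    core_allele : allele that defines the carrier haplotypes
    distance    : number of markers to extend on each side of the core
    """
    carriers = [h for h in haplotypes if h[core_index] == core_allele]
    n = len(carriers)
    if n < 2:
        raise ValueError("need at least two haplotypes carrying the core allele")

    lo = max(0, core_index - distance)
    hi = min(len(carriers[0]), core_index + distance + 1)

    # Group carriers by their exact sequence over the window.
    groups = Counter(h[lo:hi] for h in carriers)

    # Identical pairs divided by all possible pairs.
    return sum(comb(count, 2) for count in groups.values()) / comb(n, 2)


if __name__ == "__main__":
    # Made-up toy haplotypes; the core SNP is at index 2.
    haps = ["01100", "01100", "01101", "11100", "00010"]
    for d in range(3):
        print(d, ehh(haps, core_index=2, core_allele="1", distance=d))
```

EHH starts at 1 at the core and decays as the window grows and recombinant or mutated haplotypes split the carriers into smaller groups. A region of suppressed recombination slows that decay, which is how it could inflate haplotype-based signals of selection, as the abstract argues for the B haplotype.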
Comparative analysis of genome-wide association studies signals for lipids, diabetes, and coronary heart disease: Cardiovascular Biomarker Genetics Collaboration
To evaluate the associations of emergent genome-wide-association study-derived coronary heart disease (CHD)-associated single nucleotide polymorphisms (SNPs) with established and emerging risk factors, and the association of genome-wide-association study-derived lipid-associated SNPs with other risk factors and CHD events
Linkage disequilibrium maps and disease-association mapping
Over the last few years, association mapping of disease genes has developed into one of the most dynamic research areas of human genetics. It focuses on identifying functional polymorphisms that predispose to complex diseases. Population-based approaches are concerned with exploiting linkage disequilibrium (LD) between single-nucleotide polymorphisms (SNPs) and disease-predisposing loci. The utility of SNPs in association mapping is now well established and interest in this field has escalated with the discovery of millions of SNPs across the genome. This chapter reviews an association-mapping method that utilizes metric LD maps in LD units and employs a composite likelihood approach to combine information from all single SNP tests. It applies a model that incorporates a parameter for the location of the causal polymorphism. A proof-of-principle application of this method to a small region is given and its potential properties when applied to large-scale datasets are discussed
Allelic association and disease mapping
The application of allelic association to map genes for complex traits, particularly using high-density maps of single nucleotide polymorphisms in candidate regions, is an area of very active research. Here we present some aspects of the methodology and applications to both major gene mapping, which illustrates the effectiveness of the method, and oligogenes, where methods are still in flux and for which there have been relatively few successes to date. Several important considerations emerge, including the selection of the optimal metric for measuring association and the importance of modelling the decline in association with distance given the variability in association in a candidate region. The Malecot model of association with distance is shown to have a resolution of greater than 50 kilobases but the available evidence suggests that considerably higher resolution might be achieved with dense single nucleotide polymorphism (SNP) maps
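The decline in association with distance can be sketched with the form of the Malecot model commonly given in the LD-mapping literature, rho(d) = (1 - L) * M * exp(-epsilon * d) + L. The abstract does not state the equation, so this parameterisation, the function name `malecot_rho` and the example parameter values are assumptions. Here M is the association at zero distance, L is the residual association at large distance and epsilon is the rate of exponential decline.

```python
from math import exp


def malecot_rho(d: float, M: float, epsilon: float, L: float) -> float:
    """Expected association at distance d under the assumed Malecot form.

    d       : distance between marker and causal site (e.g. kb or LDU), >= 0
    M       : association at zero distance
    epsilon : rate of exponential decline per unit distance
    L       : asymptotic (residual) association at large distance
    """
    if d < 0:
        raise ValueError("distance must be non-negative")
    return (1 - L) * M * exp(-epsilon * d) + L


if __name__ == "__main__":
    # Illustrative parameter values only, not estimates from any study.
    for d in (0, 50, 100, 500):
        print(d, round(malecot_rho(d, M=0.9, epsilon=0.01, L=0.1), 4))
```

In association mapping the distance would typically be taken as the absolute difference between each marker's map position and a candidate location for the causal variant, with that location estimated together with the other parameters.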
Challenges to global financial stability: Interconnections, credit risk, business cycle and the role of market participants
The worldwide financial crisis (WFC) of 2007-09 has shown the importance of cross-sectional dependencies of assets, credit exposures and volatility, which can threaten domestic and global financial stability through cascades in financial networks. A correct assessment of company-specific risk has to account for the potential risk spillover effects from other firms (Hautsch et al. 2014). This is because of the intertwined nature of financial markets, which allow the spread of risk throughout the system (Acemoglu et al., 2015). The potential impact of interconnected financial institutions on the entire financial system has been a financial stability concern for central banks and regulators. The need for economic foundations for a systemic risk measure is more than an academic concern since it involves regulators, supervisory authorities and policy-makers (Acharya et al. 2017). This special issue provides a substantial contribution to the systemic risk literatur