54 research outputs found

    Identification of causal genes for complex traits.

    Motivation: Although genome-wide association studies (GWAS) have identified thousands of variants associated with common diseases and complex traits, only a handful of these variants are validated to be causal. We consider 'causal variants' as variants which are responsible for the association signal at a locus. As opposed to association studies that benefit from linkage disequilibrium (LD), the main challenge in identifying causal variants at associated loci lies in distinguishing among the many closely correlated variants due to LD. This is particularly important for model organisms such as inbred mice, where LD extends much further than in human populations, resulting in large stretches of the genome with significantly associated variants. Furthermore, these model organisms are highly structured and require correction for population structure to remove potential spurious associations. Results: In this work, we propose CAVIAR-Gene (CAusal Variants Identification in Associated Regions), a novel method that operates across large LD regions of the genome while also correcting for population structure. A key feature of our approach is that it outputs a minimally sized set of genes that captures the genes harboring causal variants with probability ρ. Through extensive simulations, we demonstrate that our method not only speeds up computation but also has, on average, a 10% higher recall rate than existing approaches. We validate our method using real mouse high-density lipoprotein (HDL) data and show that CAVIAR-Gene is able to identify Apoa2 (a gene known to harbor causal variants for HDL), while reducing the number of genes that need to be tested for functionality by a factor of 2. Availability and implementation: Software is freely available for download at genetics.cs.ucla.edu/caviar
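The idea of a minimally sized gene set that captures the causal genes with probability ρ can be illustrated with a small sketch. This is not the CAVIAR-Gene implementation: it assumes a simplified setting with a single causal gene and per-gene posterior probabilities that are already computed and sum to 1. The function name `credible_gene_set` and every gene score below are hypothetical.

```python
def credible_gene_set(posteriors, rho=0.9):
    """Return the smallest set of genes whose summed posterior reaches rho.

    Assumption (not from the abstract): exactly one gene is causal and
    `posteriors` maps each gene to its posterior probability of being it.
    Under that assumption, adding genes in decreasing order of posterior
    yields a set of minimal size.
    """
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    selected, total = [], 0.0
    for gene, prob in ranked:
        if total >= rho:
            break
        selected.append(gene)
        total += prob
    return selected, total


# Hypothetical posteriors for four genes at one locus.
example = {"Apoa2": 0.60, "GeneB": 0.25, "GeneC": 0.10, "GeneD": 0.05}
genes, mass = credible_gene_set(example, rho=0.8)
print(genes, round(mass, 2))
```

A smaller ρ gives a shorter list of genes to test functionally, at the cost of a higher chance that the causal gene is left out.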

    Integrating Functional Data to Prioritize Causal Variants in Statistical Fine-Mapping Studies

    Standard statistical approaches for prioritization of variants for functional testing in fine-mapping studies either use marginal association statistics or estimate posterior probabilities for variants to be causal under simplifying assumptions. Here, we present a probabilistic framework that integrates association strength with functional genomic annotation data to improve accuracy in selecting plausible causal variants for functional validation. A key feature of our approach is that it empirically estimates the contribution of each functional annotation to the trait of interest directly from summary association statistics while allowing for multiple causal variants at any risk locus. We devise efficient algorithms that estimate the parameters of our model across all risk loci to further increase performance. Using simulations starting from the 1000 Genomes data, we find that our framework consistently outperforms the current state-of-the-art fine-mapping methods, reducing the number of variants that need to be selected to capture 90% of the causal variants from an average of 13.3 to 10.4 SNPs per locus (as compared to the next-best performing strategy). Furthermore, we introduce a cost-to-benefit optimization framework for determining the number of variants to be followed up in functional assays and assess its performance using real and simulated data. We validate our findings using a large-scale meta-analysis of four blood lipid traits and find that the relative probability for causality is increased for variants in exons and transcription start sites and decreased in repressed genomic regions at the risk loci of these traits. Using these highly predictive, trait-specific functional annotations, we estimate causality probabilities across all traits and variants, reducing the size of the 90% confidence set from an average of 17.5 to 13.5 variants per locus in these data.
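How annotation data can shift causal probabilities is easiest to see in a deliberately simplified sketch. It is not the framework described in the abstract: it assumes a single causal variant per locus (the abstract allows several), takes per-variant log Bayes factors as given, and uses fixed annotation weights rather than weights estimated from summary statistics. The function name, the weights, and all input values are hypothetical.

```python
import math


def annotation_posteriors(log_bfs, annotations, weights):
    """Posterior probability that each variant is the single causal one.

    Assumptions (illustrative only): the prior for variant i is
    proportional to exp(weights . annotations[i]), and the posterior is
    proportional to prior times Bayes factor, normalized over the locus.
    """
    scores = [
        log_bf + sum(w * a for w, a in zip(weights, ann))
        for log_bf, ann in zip(log_bfs, annotations)
    ]
    top = max(scores)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(s - top) for s in scores]
    norm = sum(unnormalized)
    return [u / norm for u in unnormalized]


# Two variants with equal association evidence; only the first is exonic.
log_bfs = [2.0, 2.0]
annotations = [[1, 0], [0, 0]]   # columns: exon, repressed region
weights = [1.0, -1.0]            # hypothetical enrichment weights
print(annotation_posteriors(log_bfs, annotations, weights))
```

With all weights set to zero the annotations are ignored and the posterior depends on association strength alone.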

    Fine-mapping of lipid regions in global populations discovers ethnic-specific signals and refines previously identified lipid loci

    Genome-wide association studies have identified over 150 loci associated with lipid traits; however, no large-scale studies exist for Hispanics and other minority populations. Additionally, the genetic architecture of lipid-influencing loci remains largely unknown. We performed one of the most racially/ethnically diverse fine-mapping genetic studies of HDL-C, LDL-C, and triglycerides to date using SNPs on the MetaboChip array on 54,119 individuals: 21,304 African Americans, 19,829 Hispanic Americans, 12,456 Asians, and 530 American Indians. The majority of signals found in these groups generalize to European Americans. While we uncovered signals unique to racial/ethnic populations, we also observed systematically consistent lipid associations across these groups. In African Americans, we identified three novel signals associated with HDL-C (LPL, APOA5, LCAT) and two associated with LDL-C (ABCG8, DHODH). In addition, using this population, we refined the location for 16 out of the 58 known MetaboChip lipid loci. These results can guide tailored screening efforts, reveal population-specific responses to lipid-lowering medications, and aid in the development of new targeted drug therapies.

    Multiethnic Meta-Analysis Identifies Ancestry-Specific and Cross-Ancestry Loci for Pulmonary Function

    Nearly 100 loci have been identified for pulmonary function, almost exclusively in studies of European ancestry populations. We extend previous research by meta-analyzing genome-wide association studies of 1000 Genomes imputed variants in relation to pulmonary function in a multiethnic population of 90,715 individuals of European (N = 60,552), African (N = 8429), Asian (N = 9959), and Hispanic/Latino (N = 11,775) ethnicities. We identify over 50 additional loci at genome-wide significance in ancestry-specific or multiethnic meta-analyses. Using recent fine-mapping methods incorporating functional annotation, gene expression, and differences in linkage disequilibrium between ethnicities, we further shed light on potential causal variants and genes at known and newly identified loci. Several of the novel genes encode proteins with predicted or established drug targets, including KCNK2 and CDK12. Our study highlights the utility of multiethnic and integrative genomics approaches to extend existing knowledge of the genetics of lung function and the clinical relevance of implicated loci.

    Atlas of prostate cancer heritability in European and African-American men pinpoints tissue-specific regulation.

    Although genome-wide association studies have identified over 100 risk loci that explain ∼33% of familial risk for prostate cancer (PrCa), their functional effects on risk remain largely unknown. Here we use genotype data from 59,089 men of European and African American ancestries combined with cell-type-specific epigenetic data to build a genomic atlas of single-nucleotide polymorphism (SNP) heritability in PrCa. We find significant differences in heritability between variants in prostate-relevant epigenetic marks defined in normal versus tumour tissue as well as between tissue and cell lines. The majority of SNP heritability lies in regions marked by H3K27 acetylation in a prostate adenocarcinoma cell line (LNCaP) or by DNaseI hypersensitive sites in cancer cell lines. We find a high degree of similarity between European and African American ancestries, suggesting a similar genetic architecture from common variation underlying PrCa risk. Our findings showcase the power of integrating functional annotation with genetic data to understand the genetic basis of PrCa.
    This work was supported by NIH fellowship F32 GM106584 (A.G.), NIH grants R01 MH101244 (A.G.), R01 CA188392 (B.P.), U01 CA194393 (B.P.), R01 GM107427 (M.L.F.), R01 CA193910 (M.L.F./M.P.) and Prostate Cancer Foundation Challenge Award (M.L.F./M.P.). This study makes use of data generated by the Wellcome Trust Case Control Consortium and the Wellcome Trust Sanger Institute. A full list of the investigators who contributed to the generation of the Wellcome Trust Case Control Consortium data is available on www.wtccc.org.uk. Funding for the Wellcome Trust Case Control Consortium project was provided by the Wellcome Trust under award 076113. This study makes use of data generated by the UK10K Consortium. A full list of the investigators who contributed to the generation of the data is available online (http://www.UK10K.org).
    The PRACTICAL consortium was supported by the following grants: European Commission's Seventh Framework Programme grant agreement n° 223175 (HEALTH-F2-2009-223175), Cancer Research UK Grants C5047/A7357, C1287/A10118, C5047/A3354, C5047/A10692, C16913/A6135 and The National Institute of Health (NIH) Cancer Post-Cancer GWAS initiative Grant: no. 1 U19 CA 148537-01 (the GAME-ON initiative); Cancer Research UK (C1287/A10118, C1287/A 10710, C12292/A11174, C1281/A12014, C5047/A8384, C5047/A15007 and C5047/A10692), the National Institutes of Health (CA128978) and Post-Cancer GWAS initiative (1U19 CA148537, 1U19 CA148065 and 1U19 CA148112, the GAME-ON initiative), the Department of Defense (W81XWH-10-1-0341), A Linneus Centre (Contract ID 70867902), Swedish Research Council (grant no K2010-70X-20430-04-3), the Swedish Cancer Foundation (grant no 09-0677), grants RO1CA056678, RO1CA082664 and RO1CA092579 from the US National Cancer Institute, National Institutes of Health; US National Cancer Institute (R01CA72818); support from The National Health and Medical Research Council, Australia (126402, 209057, 251533, 396414, 450104, 504700, 504702, 504715, 623204, 940394 and 614296); NIH grants CA63464, CA54281 and CA098758; US National Cancer Institute (R01CA128813, PI: J.Y. Park); Bulgarian National Science Fund, Ministry of Education and Science (contract DOO-119/2009; DUNK01/2–2009; DFNI-B01/28/2012); Cancer Research UK grants [C8197/A10123] and [C8197/A10865]; grant code G0500966/75466; NIHR Health Technology Assessment Programme (projects 96/20/06 and 96/20/99); Cancer Research UK grant number C522/A8649, Medical Research Council of England grant number G0500966, ID 75466 and The NCRI, UK; The US Dept of Defense award W81XWH-04-1-0280; Australia Project Grant [390130, 1009458] and Enabling Grant [614296 to APCB]; the Prostate Cancer Foundation of Australia (Project Grant [PG7] and Research infrastructure grant [to APCB]); NIH grant R01 CA092447; Vanderbilt-Ingram Cancer Center (P30 CA68485); Cancer Research UK [C490/A10124] and supported by the UK National Institute for Health Research Biomedical Research Centre at the University of Cambridge; Competitive Research Funding of the Tampere University Hospital (9N069 and X51003); Award Number P30CA042014 from the National Cancer Institute.
    This is the final version of the article. It first appeared from Nature Publishing Group via http://dx.doi.org/0.1038/ncomms1097
