276 research outputs found

    Comprehensive Survey of SNPs in the Affymetrix Exon Array Using the 1000 Genomes Dataset

    Microarray gene expression data has been used in genome-wide association studies to allow researchers to study gene regulation as well as other complex phenotypes including disease risks and drug response. To reach scientifically sound conclusions from these studies, however, it is necessary to get reliable summarization of gene expression intensities. Among various factors that could affect expression profiling using a microarray platform, single nucleotide polymorphisms (SNPs) in target mRNA may lead to reduced signal intensity measurements and result in spurious results. The recently released 1000 Genomes Project dataset provides an opportunity to evaluate the distribution of both known and novel SNPs in the International HapMap Project lymphoblastoid cell lines (LCLs). We mapped the 1000 Genomes Project genotypic data to the Affymetrix GeneChip Human Exon 1.0ST array (exon array), which had been used in our previous studies and for which gene expression data had been made publicly available. We also evaluated the potential impact of these SNPs on the differentially spliced probesets we had identified previously. Though the 1000 Genomes Project data allowed a comprehensive survey of the SNPs in this particular array, the same approach can certainly be applied to other microarray platforms. Furthermore, we present a detailed catalogue of SNP-containing probesets (exon-level) and transcript clusters (gene-level), which can be considered in evaluating findings using the exon array as well as benefit the design of follow-up experiments and data re-analysis.
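    The mapping step described in this abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the coordinate convention (1-based, inclusive), the tuple layouts, and the function name `snp_containing_probesets` are all assumptions made for illustration, and the toy coordinates are invented.

```python
from bisect import bisect_left
from collections import defaultdict


def snp_containing_probesets(probes, snps):
    """Return {probeset_id: sorted [(chrom, pos), ...]} for probesets hit by a SNP.

    probes: iterable of (probeset_id, chrom, start, end), 1-based inclusive
            (hypothetical layout; real array annotation files differ).
    snps:   iterable of (chrom, pos).
    """
    # Index SNP positions per chromosome so each probe lookup is a binary search.
    by_chrom = defaultdict(list)
    for chrom, pos in snps:
        by_chrom[chrom].append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits = defaultdict(set)
    for probeset_id, chrom, start, end in probes:
        positions = by_chrom.get(chrom, [])
        i = bisect_left(positions, start)  # first SNP at or after the probe start
        while i < len(positions) and positions[i] <= end:
            hits[probeset_id].add((chrom, positions[i]))
            i += 1
    return {ps: sorted(found) for ps, found in hits.items()}


# Toy data: probeset PS1 has two probes, each overlapping one SNP.
probes = [
    ("PS1", "chr1", 100, 124),
    ("PS1", "chr1", 200, 224),
    ("PS2", "chr1", 300, 324),
    ("PS3", "chr2", 100, 124),
]
snps = [("chr1", 110), ("chr1", 224), ("chr1", 250), ("chr2", 99)]
result = snp_containing_probesets(probes, snps)
```

    Rolling probe-level hits up to the probeset gives the exon-level catalogue; the same grouping keyed on a transcript cluster identifier would give the gene-level one.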

    ExprTarget: An Integrative Approach to Predicting Human MicroRNA Targets

    Variation in gene expression has been observed in natural populations and associated with complex traits or phenotypes such as disease susceptibility and drug response. Gene expression itself is controlled by various genetic and non-genetic factors. The binding of a class of small RNA molecules, microRNAs (miRNAs), to mRNA transcript targets has recently been demonstrated to be an important mechanism of gene regulation. Because individual miRNAs may regulate the expression of multiple gene targets, a comprehensive and reliable catalogue of miRNA-regulated targets is critical to understanding gene regulatory networks. Though experimental approaches have been used to identify many miRNA targets, due to cost and efficiency, current miRNA target identification still relies largely on computational algorithms that aim to take advantage of different biochemical/thermodynamic properties of the sequences of miRNAs and their gene targets. A novel approach, ExprTarget, therefore, is proposed here to integrate some of the most frequently invoked methods (miRanda, PicTar, TargetScan) as well as the genome-wide HapMap miRNA and mRNA expression datasets generated in our laboratory. To our knowledge, this dataset constitutes the first miRNA expression profiling in the HapMap lymphoblastoid cell lines. We conducted diagnostic tests of the existing computational solutions using the experimentally supported targets in TarBase as gold standard. To gain insight into the biases that arise from such an analysis, we investigated the effect of the choice of gold standard on the evaluation of the various computational tools. We analyzed the performance of ExprTarget using both ROC curve analysis and cross-validation. We show that ExprTarget greatly improves miRNA target prediction relative to the individual prediction algorithms in terms of sensitivity and specificity. 
We also developed an online database, ExprTargetDB, of human miRNA targets predicted by our approach that integrates gene expression profiling into a broader framework involving important features of miRNA target site predictions.
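    As a rough illustration of the ROC curve analysis mentioned in this abstract, the sketch below computes the area under the ROC curve for a set of prediction scores against gold-standard labels. It is a generic AUC calculation, not ExprTarget itself; the function name `roc_auc` and the toy scores are assumptions.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: prediction scores, higher meaning more likely a true target.
    labels: 1 for a gold-standard (experimentally supported) target, 0 otherwise.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # AUC equals the probability that a random positive outscores a random
    # negative, with ties counted as one half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: three of the four positive/negative pairs are ranked correctly.
auc = roc_auc([0.9, 0.8, 0.4, 0.3], [1, 0, 1, 0])
```

    The pairwise form is quadratic in the number of examples; it is used here only because it makes the definition of AUC explicit.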

    Genetic architecture of host proteins involved in SARS-CoV-2 infection

    Understanding the genetic architecture of host proteins interacting with SARS-CoV-2 or mediating the maladaptive host response to COVID-19 can help to identify new or repurpose existing drugs targeting those proteins. We present a genetic discovery study of 179 such host proteins among 10,708 individuals using an aptamer-based technique. We identify 220 host DNA sequence variants acting in cis (MAF 0.01-49.9%) and explaining 0.3-70.9% of the variance of 97 of these proteins, including 45 with no previously known protein quantitative trait loci (pQTL) and 38 encoding current drug targets. Systematic characterization of pQTLs across the phenome identified protein-drug-disease links and evidence that putative viral interaction partners such as MARK3 affect immune response. Our results accelerate the evaluation and prioritization of new drug development programmes and repurposing of trials to prevent, treat or reduce adverse outcomes. Rapid sharing and detailed interrogation of results is facilitated through an interactive webserver (https://omicscience.org/apps/covidpgwas/).
    We further acknowledge support for genomics from the Medical Research Council (MC_PC_13046). Proteomic measurements were supported and governed by a collaboration agreement between the University of Cambridge and Somalogic. JCZ and VPWA are supported by a 4-year Wellcome Trust PhD Studentship and the Cambridge Trust; CL, EW, and NJW are funded by the Medical Research Council (MC_UU_12015/1). NJW and ADH are NIHR Senior Investigators. GK is supported by grants from the National Institute on Aging (NIA): R01 AG057452, RF1 AG058942, RF1 AG059093, U01 AG061359, and U19 AG063744. MR acknowledges funding from the Francis Crick Institute, which receives its core funding from Cancer Research UK (FC001134), the UK Medical Research Council (FC001134), and the Wellcome Trust (FC001134). ERG is supported by the National Human Genome Research Institute of the National Institutes of Health under Award Numbers R35HG010718 and R01HG011138. JR is supported by the German Federal Ministry of Education and Research (BMBF) within the framework of the e:Med research and funding concept (grant no. 01ZX1912D). This work was supported by the UCL British Heart Foundation Research Accelerator Award (AA/18/6/34223), the National Institute for Health Research University College London Hospitals Biomedical Research Centre, and arises from one of the national "Covid-19 Cardiovascular Disease Flagship Projects" designated by the NIHR-BHF Cardiovascular Partnership.

    Mapping the proteo-genomic convergence of human diseases

    Characterization of the genetic regulation of proteins is essential for understanding disease etiology and developing therapies. We identified 10,674 genetic associations for 3892 plasma proteins to create a cis-anchored gene-protein-disease map of 1859 connections that highlights strong cross-disease biological convergence. This proteo-genomic map provides a framework to connect etiologically related diseases, to provide biological context for new or emerging disorders, and to integrate different biological domains to establish mechanisms for known gene-disease links. Our results identify proteo-genomic connections within and between diseases and establish the value of cis-protein variants for annotation of likely causal disease genes at loci identified in genome-wide association studies, thereby addressing a major barrier to experimental validation and clinical translation of genetic discoveries.

    Trait-Associated SNPs Are More Likely to Be eQTLs: Annotation to Enhance Discovery from GWAS

    Although genome-wide association studies (GWAS) of complex traits have yielded more reproducible associations than had been discovered using any other approach, the loci characterized to date do not account for much of the heritability of such traits and, in general, have not led to improved understanding of the biology underlying complex phenotypes. Using a web site we developed to serve results of expression quantitative trait locus (eQTL) studies in lymphoblastoid cell lines from HapMap samples (http://www.scandb.org), we show that single nucleotide polymorphisms (SNPs) associated with complex traits (from http://www.genome.gov/gwastudies/) are significantly more likely to be eQTLs than minor-allele-frequency–matched SNPs chosen from high-throughput GWAS platforms. These findings are robust across a range of thresholds for establishing eQTLs (p-values from 10⁻⁴ to 10⁻⁸) and a broad spectrum of human complex traits. Analyses of GWAS data from the Wellcome Trust studies confirm that annotating SNPs with a score reflecting the strength of the evidence that the SNP is an eQTL can improve the ability to discover true associations and clarify the nature of the mechanism driving the associations. Our results showing that trait-associated SNPs are more likely to be eQTLs and that application of this information can enhance discovery of trait-associated SNPs for complex phenotypes raise the possibility that we can utilize this information both to increase the heritability explained by identifiable genetic factors and to gain a better understanding of the biology underlying complex traits.
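    The comparison against minor-allele-frequency-matched SNPs can be sketched as a resampling test. This is one plausible reading of the design, not the authors' procedure: the MAF bin width, the number of draws, sampling with replacement, and every name below (`eqtl_enrichment`, `maf`, `is_eqtl`) are assumptions, and the sketch requires at least one background SNP in each MAF bin occupied by a trait SNP.

```python
import random
from collections import defaultdict


def eqtl_enrichment(trait_snps, background, is_eqtl, maf,
                    n_draws=1000, bin_width=0.05, seed=0):
    """Empirical test of whether trait SNPs are eQTLs more often than
    MAF-matched background SNPs. Returns (observed eQTL count, empirical p)."""
    rng = random.Random(seed)

    def maf_bin(snp):
        return int(maf[snp] / bin_width)

    # Pool background SNPs by MAF bin, excluding the trait SNPs themselves.
    trait_set = set(trait_snps)
    pool = defaultdict(list)
    for snp in background:
        if snp not in trait_set:
            pool[maf_bin(snp)].append(snp)

    observed = sum(1 for s in trait_snps if is_eqtl[s])
    at_least = 0
    for _ in range(n_draws):
        # One matched background SNP per trait SNP, drawn from the same bin.
        draw = [rng.choice(pool[maf_bin(s)]) for s in trait_snps]
        if sum(1 for s in draw if is_eqtl[s]) >= observed:
            at_least += 1
    # Add-one correction keeps the empirical p-value strictly positive.
    return observed, (at_least + 1) / (n_draws + 1)


# Toy data: every trait SNP is an eQTL and no background SNP is.
maf = {"t1": 0.12, "t2": 0.27, "t3": 0.41,
       "b1": 0.11, "b2": 0.13, "b3": 0.26, "b4": 0.28, "b5": 0.42, "b6": 0.44}
is_eqtl = {snp: snp.startswith("t") for snp in maf}
trait = ["t1", "t2", "t3"]
background = ["b1", "b2", "b3", "b4", "b5", "b6"]
observed, p_value = eqtl_enrichment(trait, background, is_eqtl, maf, n_draws=200)
```

    Matching on MAF matters because eQTL detection power depends on allele frequency, so an unmatched background could show a spurious difference.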

    Allele-specific miRNA-binding analysis identifies candidate target genes for breast cancer risk

    Most breast cancer (BC) risk-associated single-nucleotide polymorphisms (raSNPs) identified in genome-wide association studies (GWAS) are believed to cis-regulate the expression of genes. We hypothesise that cis-regulatory variants contributing to disease risk may be affecting microRNA (miRNA) genes and/or miRNA binding. To test this, we adapted two miRNA-binding prediction algorithms (TargetScan and miRanda) to perform allele-specific queries, and integrated differential allelic expression (DAE) and expression quantitative trait loci (eQTL) data, to query 150 genome-wide significant (P ≤ 5×10⁻⁸) raSNPs, plus proxies. We found that no raSNP mapped to a miRNA gene, suggesting that altered miRNA targeting is an unlikely mechanism involved in BC risk. Also, 11.5% (6 out of 52) raSNPs located in 3'-untranslated regions of putative miRNA target genes were predicted to alter miRNA::mRNA (messenger RNA) pair binding stability in five candidate target genes. Of these, we propose RNF115, at locus 1q21.1, as a strong novel target gene associated with BC risk, and reinforce the role of miRNA-mediated cis-regulation at locus 19p13.11. We believe that integrating allele-specific querying in miRNA-binding prediction, and data supporting cis-regulation of expression, improves the identification of candidate target genes in BC risk, as well as in other common cancers and complex diseases.
    Funding agencies and grant numbers: Portuguese Foundation for Science and Technology; CRESC ALGARVE 2020; European Union (EU) 303745; Maratona da Saude Award; DL 57/2016/CP1361/CT0042; SFRH/BPD/99502/2014; CBMR-UID/BIM/04773/2013; POCI-01-0145-FEDER-022184.

    Genome-wide association study identifies loci associated with liability to alcohol and drug dependence that is associated with variability in reward-related ventral striatum activity in African- and European-Americans.

    Genetic influences on alcohol and drug dependence partially overlap; however, specific loci underlying this overlap remain unclear. We conducted a genome-wide association study (GWAS) of a phenotype representing alcohol or illicit drug dependence (ANYDEP) among 7291 European-Americans (EA; 2927 cases) and 3132 African-Americans (AA; 1315 cases) participating in the family-based Collaborative Study on the Genetics of Alcoholism. ANYDEP was heritable (h² in EA = 0.60, AA = 0.37). The AA GWAS identified three regions with genome-wide significant (GWS; P < 5E-08) single nucleotide polymorphisms (SNPs) on chromosomes 3 (rs34066662, rs58801820) and 13 (rs75168521, rs78886294), and an insertion-deletion on chromosome 5 (chr5:141988181). No polymorphisms reached GWS in the EA. One GWS region (chromosome 1: rs1890881) emerged from a trans-ancestral meta-analysis (EA + AA) of ANYDEP, and was attributable to alcohol dependence in both samples. Four genes (AA: CRKL, DZIP3, SBK3; EA: P2RX6) and four sets of genes were significantly enriched within biological pathways for hemostasis and signal transduction. GWS signals did not replicate in two independent samples but there was weak evidence for association between rs1890881 and alcohol intake in the UK Biobank. Among 118 AA and 481 EA individuals from the Duke Neurogenetics Study, rs75168521 and rs1890881 genotypes were associated with variability in reward-related ventral striatum activation. This study identified novel loci for substance dependence and provides preliminary evidence that these variants are also associated with individual differences in neural reward reactivity. Gene discovery efforts in non-European samples with distinct patterns of substance use may lead to the identification of novel ancestry-specific genetic markers of risk.

    Genome-wide association and meta-analysis in populations from Starr County, Texas, and Mexico City identify type 2 diabetes susceptibility loci and enrichment for expression quantitative trait loci in top signals

    AIMS/HYPOTHESIS: We conducted genome-wide association studies (GWASs) and expression quantitative trait loci (eQTL) analyses to identify and characterise risk loci for type 2 diabetes in Mexican-Americans from Starr County, TX, USA. METHOD: Using 1.8 million directly interrogated and imputed genotypes in 837 unrelated type 2 diabetes cases and 436 normoglycaemic controls, we conducted Armitage trend tests. To improve power in this population with high disease rates, we also performed ordinal regression including an intermediate class with impaired fasting glucose and/or glucose tolerance. These analyses were followed by meta-analysis with a study of 967 type 2 diabetes cases and 343 normoglycaemic controls from Mexico City, Mexico. RESULT: The top signals (unadjusted p value <1×10⁻⁵) included 49 single nucleotide polymorphisms (SNPs) in eight gene regions (PER3, PARD3B, EPHA4, TOMM7, PTPRD, HNT [also known as RREB1], LOC729993 and IL34) and six intergenic regions. Among these was a missense polymorphism (rs10462020; Gly639Val) in the clock gene PER3, a system recently implicated in diabetes. We also report a second signal (minimum p value 1.52×10⁻⁶) within PTPRD, independent of the previously implicated SNP, in a population of Han Chinese. Top meta-analysis signals included known regions HNF1A and KCNQ1. Annotation of top association signals in both studies revealed a marked excess of trans-acting eQTL in both adipose and muscle tissues. CONCLUSIONS/INTERPRETATION: In the largest study of type 2 diabetes in Mexican populations to date, we identified modest associations of novel and previously reported SNPs. In addition, in our top signals we report significant excess of SNPs that predict transcript levels in muscle and adipose tissues.
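    The Armitage trend test named in the METHOD section can be sketched for a single SNP as follows. This is the textbook Cochran-Armitage statistic with additive genotype weights (0, 1, 2), which is an assumption here since the abstract does not state the weights; the function name and the toy counts are illustrative and are not data from the study.

```python
import math


def armitage_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on a 2 x k genotype table.

    cases, controls: counts per genotype class (e.g. 0, 1, 2 copies of an allele).
    Returns (z, two-sided p-value) under the normal approximation.
    """
    R, S = sum(cases), sum(controls)
    N = R + S
    n = [r + s for r, s in zip(cases, controls)]  # column totals
    # Weighted difference between case and control counts across classes.
    T = sum(t * (r * S - s * R) for t, r, s in zip(weights, cases, controls))
    sum_tn = sum(t * ni for t, ni in zip(weights, n))
    sum_t2n = sum(t * t * ni for t, ni in zip(weights, n))
    var = (R * S / N) * (N * sum_t2n - sum_tn ** 2)
    if var <= 0:
        raise ValueError("trend variance is zero (only one genotype class present?)")
    z = T / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return z, p


# Toy counts: cases shift toward the higher-weight genotype classes.
z, p = armitage_trend_test(cases=[10, 20, 30], controls=[30, 20, 10])
```

    With these toy counts the squared statistic works out to exactly 20 on one degree of freedom, which is a convenient hand check of the variance formula.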

    Genome-Wide Association Analysis of Incident Coronary Heart Disease (CHD) in African Americans: A Short Report

    African Americans have the highest rate of mortality due to coronary heart disease (CHD). Although multiple loci have been identified influencing CHD risk in European-Americans using a genome-wide association study (GWAS) approach, no GWAS of incident CHD has been reported for African Americans. We performed a GWAS for incident CHD events collected during 19 years of follow-up in 2,905 African Americans from the Atherosclerosis Risk in Communities (ARIC) study. We identified a genome-wide significant SNP (rs1859023, MAF = 31%) located at 7q21 near the PFTK1 gene (HR = 0.57, 95% CI 0.46 to 0.69, p = 1.86×10⁻⁸), which replicated in an independent sample of over 8,000 African American women from the Women's Health Initiative (WHI) (HR = 0.81, 95% CI 0.70 to 0.93, p = 0.005). PFTK1 encodes a serine/threonine-protein kinase, PFTAIRE-1, that acts as a cyclin-dependent kinase regulating cell cycle progression and cell proliferation. This is the first incident CHD locus identified by GWAS in African Americans.