336 research outputs found

    Genotype-Based Test in Mapping Cis-Regulatory Variants from Allele-Specific Expression Data

    Identifying and understanding the impact of gene regulatory variation is of considerable importance in evolutionary and medical genetics; such variants are thought to be responsible for human-specific adaptation [1] and to have an important role in genetic disease. Regulatory variation in cis is readily detected in individuals showing uneven expression of a transcript from its two allelic copies, an observation referred to as allelic imbalance (AI). Identifying individuals exhibiting AI allows mapping of regulatory DNA regions and the potential to identify the underlying causal genetic variant(s). However, existing mapping methods require knowledge of the haplotypes, which makes them sensitive to phasing errors. In this study, we introduce a genotype-based mapping test that does not require haplotype-phase inference to locate regulatory regions. The test relies on partitioning genotypes of individuals exhibiting AI and those not exhibiting AI in a 2×3 contingency table. The performance of this test in detecting linkage disequilibrium (LD) between a potential regulatory site and a SNP located in this region was examined by analyzing simulated and empirical AI datasets. In simulation experiments, the genotype-based test outperforms the haplotype-based tests as the distance separating the regulatory region from its regulated transcript increases. The genotype-based test performed equally well with the experimental AI datasets, either from genome–wide cDNA hybridization arrays or from RNA sequencing. By avoiding the need for haplotype inference, the genotype-based test will suit AI analyses in population samples of unknown haplotype structure and will additionally facilitate the identification of cis-regulatory variants that are located far away from the regulated transcript.
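The abstract above describes partitioning genotypes of AI and non-AI individuals into a 2×3 contingency table but does not name the test statistic. The following is a minimal sketch that assumes a Pearson chi-square test of independence on such a table; the function name `chi_square_2x3` and the example counts are hypothetical and are not taken from the paper.

```python
import math


def chi_square_2x3(table):
    """Pearson chi-square test of independence for a 2x3 table of counts.

    table[0]: genotype counts (AA, AB, BB) at a SNP among individuals with AI
    table[1]: the same genotype counts among individuals without AI

    Assumption: every row and every genotype column has a non-zero total,
    so the test has (2 - 1) * (3 - 1) = 2 degrees of freedom.
    Returns (statistic, p_value).
    """
    if len(table) != 2 or any(len(row) != 3 for row in table):
        raise ValueError("expected a 2x3 table")
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    if min(row_totals) == 0 or min(col_totals) == 0:
        raise ValueError("all row and column totals must be non-zero")
    n = sum(row_totals)

    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected

    # For 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Hypothetical counts: heterozygotes (AB) over-represented among AI individuals.
stat, p = chi_square_2x3([[5, 30, 5], [20, 15, 25]])
print(f"chi2 = {stat:.2f}, p = {p:.3g}")
```

Because the table is built from unphased genotype counts, no haplotype inference is needed at this step, which is the property the abstract emphasizes.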

    Study of Transcriptional Effects in Cis at the IFIH1 Locus

    Background: The Thr allele at the non-synonymous single-nucleotide polymorphism (nsSNP) Thr946Ala in the IFIH1 gene confers risk for Type 1 diabetes (T1D). The SNP is embedded in a 236 kb linkage disequilibrium (LD) block that includes four genes: IFIH1, GCA, FAP and KCNH7. The absence of common nsSNPs in the other genes makes the IFIH1 SNP the strongest functional candidate, but it could be merely a marker of association, due to LD with a variant regulating expression levels of IFIH1 or neighboring genes. Methodology/Principal Findings: We investigated the effect of the T1D-associated variation on mRNA transcript expression of these genes. Heterozygous mRNA from lymphoblastoid cell lines (LCLs), pancreas and thymus was examined by allelic expression imbalance, to detect effects in cis on mRNA expression. Using single-nucleotide primer extension, we found no difference between mRNA transcripts in 9 LCLs, 6 pancreas and 13 thymus samples, suggesting that GCA and FAP are not involved. On the other hand, KCNH7 was not expressed at a detectable level in any of the tissues examined. Moreover, the association of the Thr946Ala SNP with T1D is not due to modulation of IFIH1 expression in organs involved in the disease, pointing to the IFIH1 nsSNP as the causal variant. Conclusions/Significance: The mechanism of the association of the nsSNP with T1D remains to be determined, but does not involve mRNA modulation. It becomes necessary to study differential function of the IFIH1 protein alleles at Thr946Al

    Computational Analysis of Whole-Genome Differential Allelic Expression Data in Human

    Allelic imbalance (AI) is a phenomenon where the two alleles of a given gene are expressed at different levels in a given cell, either because of epigenetic inactivation of one of the two alleles, or because of genetic variation in regulatory regions. Recently, Bing et al. have described the use of genotyping arrays to assay AI at a high resolution (∼750,000 SNPs across the autosomes). In this paper, we investigate computational approaches to analyze this data and identify genomic regions with AI in an unbiased and robust statistical manner. We propose two families of approaches: (i) a statistical approach based on z-score computations, and (ii) a family of machine learning approaches based on Hidden Markov Models. Each method is evaluated using previously published experimental data sets as well as with permutation testing. When applied to whole genome data from 53 HapMap samples, our approaches reveal that allelic imbalance is widespread (most expressed genes show evidence of AI in at least one of our 53 samples) and that most AI regions in a given individual are also found in at least a few other individuals. While many AI regions identified in the genome correspond to known protein-coding transcripts, others overlap with recently discovered long non-coding RNAs. We also observe that genomic regions with AI not only include complete transcripts with consistent differential expression levels, but also more complex patterns of allelic expression such as alternative promoters and alternative 3′ ends. The approaches developed not only shed light on the incidence and mechanisms of allelic expression, but will also help towards mapping the genetic causes of allelic expression and identify cases where this variation may be linked to diseases.

    Allele-specific miRNA-binding analysis identifies candidate target genes for breast cancer risk

    Most breast cancer (BC) risk-associated single-nucleotide polymorphisms (raSNPs) identified in genome-wide association studies (GWAS) are believed to cis-regulate the expression of genes. We hypothesise that cis-regulatory variants contributing to disease risk may be affecting microRNA (miRNA) genes and/or miRNA binding. To test this, we adapted two miRNA-binding prediction algorithms, TargetScan and miRanda, to perform allele-specific queries, and integrated differential allelic expression (DAE) and expression quantitative trait loci (eQTL) data, to query 150 genome-wide significant (P ≤ 5 × 10^-8) raSNPs, plus proxies. We found that no raSNP mapped to a miRNA gene, suggesting that altered miRNA targeting is an unlikely mechanism involved in BC risk. Also, 11.5% (6 out of 52) raSNPs located in 3'-untranslated regions of putative miRNA target genes were predicted to alter miRNA::mRNA (messenger RNA) pair binding stability in five candidate target genes. Of these, we propose RNF115, at locus 1q21.1, as a strong novel target gene associated with BC risk, and reinforce the role of miRNA-mediated cis-regulation at locus 19p13.11. We believe that integrating allele-specific querying in miRNA-binding prediction, and data supporting cis-regulation of expression, improves the identification of candidate target genes in BC risk, as well as in other common cancers and complex diseases. Funding Agency Portuguese Foundation for Science and Technology CRESC ALGARVE 2020 European Union (EU) 303745 Maratona da Saude Award DL 57/2016/CP1361/CT0042 SFRH/BPD/99502/2014 CBMR-UID/BIM/04773/2013 POCI-01-0145-FEDER-022184

    Differential Allelic Expression in the Human Genome: A Robust Approach To Identify Genetic and Epigenetic Cis-Acting Mechanisms Regulating Gene Expression

    The recent development of whole genome association studies has led to the robust identification of several loci involved in different common human diseases. Interestingly, some of the strongest signals of association observed in these studies arise from non-coding regions located in very large introns or far away from any annotated genes, raising the possibility that these regions are involved in the etiology of the disease through some unidentified regulatory mechanisms. These findings highlight the importance of better understanding the mechanisms leading to inter-individual differences in gene expression in humans. Most of the existing approaches developed to identify common regulatory polymorphisms are based on linkage/association mapping of gene expression to genotypes. However, these methods have some limitations, notably their cost and the requirement of extensive genotyping information from all the individuals studied, which limits their applications to a specific cohort or tissue. Here we describe a robust and high-throughput method to directly measure differences in allelic expression for a large number of genes using the Illumina Allele-Specific Expression BeadArray platform and quantitative sequencing of RT-PCR products. We show that this approach allows reliable identification of differences in the relative expression of the two alleles larger than 1.5-fold (i.e., deviations of the allelic ratio larger than 60:40) and offers several advantages over the mapping of total gene expression, particularly for studying humans or outbred populations. Our analysis of more than 80 individuals for 2,968 SNPs located in 1,380 genes confirms that differential allelic expression is a widespread phenomenon affecting the expression of 20% of human genes and shows that our method successfully captures expression differences resulting from both genetic and epigenetic cis-acting mechanisms.
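The abstract above equates a 1.5-fold difference between alleles with a 60:40 allelic ratio. As a small worked check of that arithmetic, the sketch below converts a fold difference into the fraction contributed by the more highly expressed allele (1.5 / (1 + 1.5) = 0.6). The function names and the example signal values are hypothetical; only the 1.5-fold threshold comes from the abstract.

```python
def fold_to_allelic_fraction(fold):
    """Fraction of total expression from the higher allele for a given fold difference."""
    if fold <= 0:
        raise ValueError("fold difference must be positive")
    return fold / (1.0 + fold)


def exceeds_fold_threshold(allele_a, allele_b, min_fold=1.5):
    """True if the two allelic signals differ by more than min_fold.

    allele_a, allele_b: non-negative expression signals for the two alleles
    (for example, normalised intensities; the units are an assumption here).
    """
    high, low = max(allele_a, allele_b), min(allele_a, allele_b)
    if low < 0:
        raise ValueError("signals must be non-negative")
    if high == 0:
        return False  # no expression from either allele
    if low == 0:
        return True   # monoallelic expression
    return high / low > min_fold


print(fold_to_allelic_fraction(1.5))   # 0.6, i.e. a 60:40 ratio
print(exceeds_fold_threshold(70, 30))  # True
print(exceeds_fold_threshold(55, 45))  # False
```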

    Immunoseq: the identification of functionally relevant variants through targeted capture and sequencing of active regulatory regions in human immune cells

    BACKGROUND: The observation that the genetic variants identified in genome-wide association studies (GWAS) frequently lie in non-coding regions of the genome that contain cis-regulatory elements suggests that altered gene expression underlies the development of many complex traits. In order to efficiently make a comprehensive assessment of the impact of non-coding genetic variation in immune related diseases we emulated the whole-exome sequencing paradigm and developed a custom capture panel for the known DNase I hypersensitive site (DHS) in immune cells - "Immunoseq". RESULTS: We performed Immunoseq in 30 healthy individuals where we had existing transcriptome data from T cells. We identified a large number of novel non-coding variants in these samples. Relying on allele specific expression measurements, we also showed that our selected capture regions are enriched for functional variants that have an impact on differential allelic gene expression. The results from a replication set with 180 samples confirmed our observations. CONCLUSIONS: We show that Immunoseq is a powerful approach to detect novel rare variants in regulatory regions. We also demonstrate that these novel variants have a potential functional role in immune cells. This work was supported by grants from the Canadian Institute of Health Research (CIHR), the UK Medical Research Council (G1100125), the Swedish Research Council (DO283001) and Knut and Alice Wallenberg Foundation (KAW). We also acknowledge the use of subjects from the Cambridge BioResource and the support of the Cambridge NIHR Biomedical Research Centre. AM was supported by the Fond de Recherche Santé Québec Doctoral training award. TP and CL hold a Canada Research Chair.

    Global Analysis of the Impact of Environmental Perturbation on cis-Regulation of Gene Expression

    Genetic variants altering cis-regulation of normal gene expression (cis-eQTLs) have been extensively mapped in human cells and tissues, but the extent to which controlled, environmental perturbation influences cis-eQTLs is unclear. We carried out large-scale induction experiments using primary human bone cells derived from unrelated donors of Swedish origin treated with 18 different stimuli (7 treatments and 2 controls, each assessed at 2 time points). The treatments with the largest impact on the transcriptome, verified on two independent expression arrays, included BMP-2 (t = 2h), dexamethasone (DEX) (t = 24h), and PGE2 (t = 24h). Using these treatments and control, we performed expression profiling for 18,144 RefSeq transcripts on biological replicates of the complete study cohort of 113 individuals (n_total = 782) and combined it with genome-wide SNP-genotyping data in order to map treatment-specific cis-eQTLs (defined as SNPs located within the gene ±250 kb). We found that 93% of cis-eQTLs at 1% FDR were observed in at least one additional treatment, and in fact, on average, only 1.4% of the cis-eQTLs were considered as treatment-specific at high confidence. The relative invariability of cis-regulation following perturbation was reiterated independently by genome-wide allelic expression tests where only a small proportion of variance could be attributed to treatment. Treatment-specific cis-regulatory effects were, however, 2- to 6-fold more abundant among differentially expressed genes upon treatment. We further followed up and validated the DEX–specific cis-regulation of the MYO6 and TNC loci and found top cis-regulatory variants located 180 kb and 250 kb upstream of the transcription start sites, respectively. Our results suggest that, as opposed to tissue-specificity of cis-eQTLs, the interactions between cellular environment and cis-variants are relatively rare (∼1.5%), but that detection of such specific interactions can be achieved by a combination of functional genomic approaches as described here.

    Allele-Specific Gene Expression Is Widespread Across the Genome and Biological Processes

    Allelic specific gene expression (ASGE) appears to be an important factor in human phenotypic variability and, as a consequence, for the development of complex traits and diseases. In order to study ASGE across the human genome, we have performed a study in which genotyping was coupled with an analysis of ASGE by screening 11,500 SNPs using the Mapping 10 K Array to identify differential allelic expression. We found that from the 5,133 SNPs that were suitable for analysis (heterozygous in our sample and expressed in peripheral blood mononuclear cells), 2,934 (57%) SNPs had differential allelic expression. Such SNPs were equally distributed along human chromosomes and biological processes. We validated the presence or absence of ASGE in 18 out of 20 SNPs (90%) randomly selected by real time PCR in 48 human subjects. In addition, we observed that SNPs close to, but not included in, segmental duplications had increased levels of ASGE. Finally, we found that transcripts of unknown function or non-coding RNAs also display ASGE: from a total of 2,308 intronic SNPs, 1,510 (65%) SNPs underwent differential allelic expression. In summary, ASGE is a widespread mechanism in the human genome whose regulation seems to be far more complex than expected.

    Multiplex SNP typing by bioluminometric assay coupled with terminator incorporation (BATI)

    A multiplex single-nucleotide polymorphism (SNP) typing platform using ‘bioluminometric assay coupled with terminator [2′,3′-dideoxynucleoside triphosphates (ddNTPs)] incorporation’ (named ‘BATI’ for short) was developed. All of the reactions are carried out in a single reaction chamber containing target DNAs, DNA polymerase, reagents necessary for converting PPi into ATP and reagents for luciferase reaction. Each of the four ddNTPs is dispensed into the reaction chamber in turn. PPi is released by a nucleotide incorporation reaction and is used to produce ATP when the ddNTP dispensed is complementary to the base in a template. The ATP is used in a luciferase reaction to release visible light. Only 1 nt is incorporated into a template at a time because ddNTPs do not have a 3′ hydroxyl group. This feature greatly simplifies a sequencing spectrum. The luminescence is proportional to the amount of template incorporated. Only one peak appears in the spectrum of a homozygote sample, and two peaks at the same intensity appear for a heterozygote sample. In comparison with pyrosequencing using dNTP, the spectrum obtained by BATI is very simple, and it is very easy to determine SNPs accurately from it. As only one base is extended at a time and the extension signals are quantitative, the observed spectrum pattern is uniquely determined even for a sample containing multiplex SNPs. We have successfully used BATI to type various samples containing plural target sequence areas. The measurements can be carried out with an inexpensive and small luminometer using a photodiode array as the detector. It takes only a few minutes to determine multiplex SNPs. These results indicate that this novel multiplexed approach can significantly decrease the cost of SNP typing and increase the typing throughput with an inexpensive and small luminometer.
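The abstract above states that a homozygote gives one peak and a heterozygote gives two peaks of equal intensity across the four ddNTP dispensations. The sketch below illustrates how a single-SNP genotype could be read from such a four-signal spectrum. It is an illustration only: the function name `call_genotype` and the relative threshold of 0.3 are assumptions made here, not the paper's calling procedure, and the multiplex case is not covered.

```python
def call_genotype(signals, rel_threshold=0.3):
    """Call a single-SNP genotype from four ddNTP luminescence signals.

    signals: dict mapping each dispensed base ('A', 'C', 'G', 'T') to its
             measured luminescence (arbitrary units).
    rel_threshold: hypothetical cut-off; a base counts as a peak if its
             signal is at least this fraction of the strongest signal.

    Returns a two-letter genotype such as 'AA' or 'AG', or None when the
    number of peaks is neither one nor two.
    """
    if set(signals) != {"A", "C", "G", "T"}:
        raise ValueError("expected one signal per ddNTP: A, C, G, T")
    strongest = max(signals.values())
    if strongest <= 0:
        raise ValueError("no positive signal detected")

    peaks = sorted(b for b, s in signals.items() if s >= rel_threshold * strongest)
    if len(peaks) == 1:
        return peaks[0] * 2      # one peak: homozygote
    if len(peaks) == 2:
        return "".join(peaks)    # two peaks: heterozygote
    return None                  # ambiguous spectrum


print(call_genotype({"A": 100, "C": 2, "G": 3, "T": 1}))  # AA
print(call_genotype({"A": 50, "C": 1, "G": 48, "T": 2}))  # AG
```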