
    Principles for the post-GWAS functional characterisation of risk loci

    Several challenges lie ahead in assigning functionality to susceptibility SNPs. For example, most effect sizes are small relative to those seen in monogenic diseases, with per-allele odds ratios usually ranging from 1.15 to 1.3. It is unclear whether current molecular biology methods have enough resolution to differentiate such small effects. Our objective here is therefore to provide a set of recommendations to optimize the allocation of effort and resources in order to maximize the chances of elucidating the functional contribution of specific loci to the disease phenotype. It has been estimated that 88% of currently identified disease-associated SNPs are intronic or intergenic. Thus, in this paper we focus on the analysis of non-coding variants and outline a hierarchical approach for post-GWAS functional studies.

    Assigning function to genome wide association study variants associated with complex gastrointestinal disease

    The genome‐wide association study era has identified numerous loci associated with many common polygenic diseases. The next challenge is to identify the functional consequences of these variants and elucidate how they affect disease risk. Using a combination of protein-based assays, large-scale microarrays and high‐throughput next-generation sequencing platforms, this thesis aims to identify the functional effects of disease loci, with particular focus on Crohn’s disease and coeliac disease, two common complex gastrointestinal diseases. Variants located within the Interleukin 23 receptor are associated with both susceptibility to and protection from Crohn’s disease, a debilitating chronic inflammatory disease of the bowel. A study was undertaken to investigate the effect of these variants, at the mRNA as well as the protein level, on both cytokine and receptor levels. Coeliac disease is a dietary intolerance to the gluten component of wheat, barley and rye and has an estimated prevalence of approximately 1%. Genome‐wide association studies have identified eight different genomic loci as associated with coeliac disease, but none have been functionally characterised. To investigate the effect that genotype has on gene transcript levels, a genetical genomics study was undertaken in patients with coeliac disease, generating results with relevance to a range of autoimmune disorders. Before disease-based effects can be identified, it is first important to fully characterise the normal human transcriptome and methylome. To this end, CD4+ T cells were studied using novel high‐throughput sequencing techniques, with the aim of providing some insight into novel genomic properties that may illuminate current and future disease-associated loci. Given the base-pair resolution of high‐throughput sequencing, a novel method of assaying for SNP effects on gene expression was developed. This allele-specific method, using whole-transcriptome sequencing, is capable of identifying alterations in transcript expression on a genome‐wide scale.

    Identification and removal of low-complexity sites in allele-specific analysis of ChIP-seq data

    Motivation: High-throughput sequencing technologies enable the genome-wide analysis of the impact of genetic variation on molecular phenotypes at unprecedented resolution. However, although powerful, these technologies can also introduce unexpected artifacts. Results: We investigated the impact of library amplification bias on the identification of allele-specific (AS) molecular events from high-throughput sequencing data derived from chromatin immunoprecipitation assays (ChIP-seq). Putative AS DNA binding activity for RNA polymerase II was determined using ChIP-seq data derived from lymphoblastoid cell lines of two parent-daughter trios. We found that, at high sequencing depth, many significant AS binding sites suffered from an amplification bias, as evidenced by a larger number of clonal reads representing one of the two alleles. To alleviate this bias, we devised an amplification bias detection strategy, which filters out sites with low read complexity and sites featuring a significant excess of clonal reads. This method will be useful for AS analyses involving ChIP-seq and other functional sequencing assays. Availability: The R package absfilter for library clonality simulations and detection of amplification-biased sites is available from http://updepla1srv1.epfl.ch/waszaks/absfilter Supplementary information: Supplementary data are available at Bioinformatics online.
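
The filtering idea in the abstract above can be illustrated with a minimal Python sketch. It is not the absfilter package's algorithm: the read representation (allele, start position, strand), the complexity measure (distinct reads divided by total reads) and the 0.5 cutoff are all assumptions chosen for illustration.

```python
from collections import Counter


def read_complexity(reads):
    """Fraction of reads that are distinct by (allele, start, strand).

    Hypothetical measure: 1.0 means no clonal (duplicate) reads.
    """
    if not reads:
        return 0.0
    return len(set(reads)) / len(reads)


def allele_counts(reads, deduplicate=False):
    """Count reads per allele, optionally collapsing clonal reads first."""
    pool = set(reads) if deduplicate else reads
    return Counter(allele for allele, _start, _strand in pool)


def is_low_complexity(reads, min_complexity=0.5):
    """Flag a site whose read complexity falls below an assumed cutoff."""
    return read_complexity(reads) < min_complexity


# Toy heterozygous site: eight clonal copies of one read inflate allele A.
site_reads = [("A", 100, "+")] * 8 + [("A", 105, "+"), ("G", 101, "+"), ("G", 110, "-")]
raw = allele_counts(site_reads)
collapsed = allele_counts(site_reads, deduplicate=True)
flagged = is_low_complexity(site_reads)
```

In this toy site the apparent excess of allele A disappears once clonal reads are collapsed, which is the kind of site such a filter is meant to remove.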

    Genome-Wide Identification and Quantification of cis- and trans-Regulated Genes Responding to Marek’s Disease Virus Infection via Analysis of Allele-Specific Expression

    Marek’s disease (MD) is a commercially important neoplastic disease of chickens caused by Marek’s disease virus (MDV), a naturally occurring oncogenic alphaherpesvirus. Selecting for increased genetic resistance to MD is a control strategy that can augment vaccinal control measures. To identify high-confidence candidate MD resistance genes, we conducted a genome-wide screen for allele-specific expression (ASE) amongst F1 progeny of two inbred chicken lines that differ substantially in MD resistance. High-throughput sequencing was initially used to profile transcriptomes from pools of uninfected and infected individuals at 4 days post-infection to identify any genes showing ASE in response to MDV infection. RNA sequencing identified 22,655 single nucleotide polymorphisms (SNPs), of which 5,360 in 3,773 genes exhibited significant allelic imbalance. Illumina GoldenGate assays were subsequently used to quantify regulatory variation controlled at the gene (cis) and elsewhere in the genome (trans) by examining differences in expression between F1 individuals and artificial F1 RNA pools over six time periods in 1,536 of the most significant SNPs identified by RNA sequencing. Allelic imbalance as a result of cis-regulatory changes was confirmed in 861 of the 1,233 GoldenGate assays successfully examined. Furthermore, we have identified seven genes that display trans-regulation only in infected animals and ∼500 SNPs that show a complex interaction between cis- and trans-regulatory changes. Our results indicate that ASE analyses are a powerful approach to identify regulatory variation responsible for differences in transcript abundance in genes underlying complex traits. The genes with SNPs exhibiting ASE also provide a strong foundation to further investigate the causative polymorphisms and genetic mechanisms for MD resistance. Finally, the methods used here for identifying specific genes and SNPs have practical implications for applying marker-assisted selection to complex traits that are difficult to measure in agricultural species, when expression differences are expected to control a portion of the phenotypic variance.

    RNA-seq analysis of single bovine blastocysts

    Background: Use of RNA-Seq presents unique benefits in terms of gene expression analysis because of its wide dynamic range and ability to identify functional sequence variants. This technology provides the opportunity to assay the developing embryo, but the paucity of biological material available from individual embryos has made this a challenging prospect. Results: We report here the first application of RNA-Seq for the analysis of individual blastocyst gene expression, SNP detection, and characterization of allele-specific expression (ASE). RNA was extracted from single bovine blastocysts (n = 5), amplified, and analyzed using high-throughput sequencing. Approximately 38 million sequencing reads were generated per embryo and 9,489 known bovine genes were found to be expressed, with a high correlation of expression levels between samples (r > 0.97). Transcriptomic data were analyzed to identify SNPs in expressed genes, and individual SNPs were examined to characterize allele-specific expression. Allelic imbalances were observed in 473 expressed biallelic SNPs, where one allele represented between 65% and 95% of a variant’s transcripts. Conclusions: This study represents the first application of RNA-Seq technology in single bovine embryos, allowing a representation of the embryonic transcriptome and the analysis of transcript sequence variation to describe allele-specific expression.
    Affiliations (EEA Balcarce): Chitwood, James L., University of California Davis, Department of Animal Science, United States; Rincon, Gonzalo, University of California Davis, Department of Animal Science, United States; Kaiser, German Gustavo, Instituto Nacional de Tecnología Agropecuaria (INTA), Estación Experimental Agropecuaria Balcarce, Argentina; Medrano, Juan F., University of California Davis, Department of Animal Science, United States; Ross, Pablo J., University of California Davis, Department of Animal Science, United States.
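
As an illustration of how allelic imbalance at one SNP might be quantified from read counts, the sketch below computes the major-allele fraction and a two-sided exact binomial test against equal expression of both alleles. The abstract does not state which test was used, so the test, the function names and the example counts are assumptions.

```python
from math import comb


def major_allele_fraction(ref_reads, alt_reads):
    """Share of reads carried by the more abundant allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("no reads cover this SNP")
    return max(ref_reads, alt_reads) / total


def binomial_two_sided_p(ref_reads, alt_reads):
    """Exact two-sided binomial p-value under a 50:50 allelic expectation.

    With p = 0.5 the distribution is symmetric, so the two-sided p-value
    is twice the upper tail at the major-allele count, capped at 1.
    """
    n = ref_reads + alt_reads
    k = max(ref_reads, alt_reads)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hypothetical SNP with 15 reference and 5 alternate reads.
fraction = major_allele_fraction(15, 5)
p_value = binomial_two_sided_p(15, 5)
in_reported_range = 0.65 <= fraction <= 0.95
```

A fraction of 0.75 falls inside the 65% to 95% window reported in the abstract; with only 20 reads the evidence against balanced expression is modest, which shows why read depth matters in single-embryo data.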

    Identification of SNPs associated with muscle yield and quality traits using allelic-imbalance analyses of pooled RNA-Seq samples in rainbow trout

    Coding/functional SNPs change the biological function of a gene and, therefore, could serve as “large-effect” genetic markers. In this study, we used two bioinformatics pipelines, GATK and SAMtools, for discovering coding/functional SNPs with allelic imbalances associated with total body weight, muscle yield, muscle fat content, shear force, and whiteness. Phenotypic data were collected for approximately 500 fish, representing 98 families (5 fish/family), from a growth-selected line, and the muscle transcriptome was sequenced from 22 families with divergent phenotypes (4 low- versus 4 high-ranked families per trait). Results: GATK detected 59,112 putative SNPs; of these, 4798 showed allelic imbalances (>2.0 as an amplification and <0.5 as loss of heterozygosity). SAMtools detected 87,066 putative SNPs; of these, 4962 had allelic imbalances between the low- and high-ranked families. Only 1829 SNPs with allelic imbalances were common between the two datasets, indicating significant differences in algorithms. The two datasets contained 7930 non-redundant SNPs, of which 4439 mapped to 1498 protein-coding genes (with 6.4% non-synonymous SNPs) and 684 mapped to 295 lncRNAs. Validation of a subset of 92 SNPs revealed 1) an 86.7–93.8% success rate in calling polymorphic SNPs and 2) 95.4% consistent matching between DNA and cDNA genotypes, indicating a high rate of identifying SNPs with allelic imbalances. In addition, 4.64% of SNPs revealed random monoallelic expression. Genome distribution of the SNPs with allelic imbalances exhibited high density for all five traits in several chromosomes, especially chromosomes 9, 20 and 28. Most of the SNP-harboring genes were assigned to important growth-related metabolic pathways. Conclusion: These results demonstrate the utility of RNA-Seq in assessing phenotype-associated allelic imbalances in pooled RNA-Seq samples. The SNPs identified in this study were included in a new SNP-Chip design (available from Affymetrix) for genomic and genetic analyses in rainbow trout.
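
The >2.0 and <0.5 cutoffs quoted in the abstract above can be applied as in the following Python sketch. The abstract does not define the ratio itself, so this sketch assumes it is the alternate-allele frequency in the high-ranked pool divided by that in the low-ranked pool; the function names and read counts are hypothetical.

```python
def allele_frequency(alt_reads, total_reads):
    """Alternate-allele frequency in one pooled RNA-Seq sample."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def imbalance_ratio(high_freq, low_freq):
    """Assumed ratio: allele frequency in high-ranked over low-ranked pool."""
    if low_freq == 0:
        raise ValueError("ratio undefined when the low-ranked frequency is zero")
    return high_freq / low_freq


def classify(ratio, upper=2.0, lower=0.5):
    """Label a ratio using the cutoffs quoted in the abstract."""
    if ratio > upper:
        return "amplification"
    if ratio < lower:
        return "loss of heterozygosity"
    return "balanced"


# Hypothetical SNP: 60/100 alternate reads in the high pool, 20/100 in the low pool.
high = allele_frequency(60, 100)
low = allele_frequency(20, 100)
label = classify(imbalance_ratio(high, low))
```

Keeping the cutoffs as parameters makes it easy to see how sensitive the count of imbalanced SNPs is to the chosen thresholds.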

    Immunoseq: the identification of functionally relevant variants through targeted capture and sequencing of active regulatory regions in human immune cells

    BACKGROUND: The observation that the genetic variants identified in genome-wide association studies (GWAS) frequently lie in non-coding regions of the genome that contain cis-regulatory elements suggests that altered gene expression underlies the development of many complex traits. In order to efficiently make a comprehensive assessment of the impact of non-coding genetic variation in immune-related diseases, we emulated the whole-exome sequencing paradigm and developed a custom capture panel for the known DNase I hypersensitive sites (DHS) in immune cells, "Immunoseq". RESULTS: We performed Immunoseq in 30 healthy individuals for whom we had existing transcriptome data from T cells. We identified a large number of novel non-coding variants in these samples. Relying on allele-specific expression measurements, we also showed that our selected capture regions are enriched for functional variants that have an impact on differential allelic gene expression. The results from a replication set with 180 samples confirmed our observations. CONCLUSIONS: We show that Immunoseq is a powerful approach to detect novel rare variants in regulatory regions. We also demonstrate that these novel variants have a potential functional role in immune cells.
    This work was supported by grants from the Canadian Institute of Health Research (CIHR), the UK Medical Research Council (G1100125), the Swedish Research Council (DO283001) and the Knut and Alice Wallenberg Foundation (KAW). We also acknowledge the use of subjects from the Cambridge BioResource and the support of the Cambridge NIHR Biomedical Research Centre. AM was supported by the Fond de Recherche Santé Québec Doctoral training award. TP and CL hold a Canada Research Chair.