Transcriptome-Wide Association Supplements Genome-Wide Association in Zea mays.
Modern improvement of complex traits in agricultural species relies on successful associations of heritable molecular variation with observable phenotypes. Historically, this pursuit has primarily been based on easily measurable genetic markers. The recent advent of new technologies allows assaying and quantifying biological intermediates (hereafter endophenotypes), which are now readily measurable at a large scale across diverse individuals. The usefulness of endophenotypes for delineating the regulatory landscape of the genome and for the genetic dissection of complex trait variation remains underexplored in plants. The work presented here illustrates the utility of a large-scale (299-genotype and seven-tissue) gene expression resource to dissect traits across multiple levels of biological organization. Using single-tissue- and multi-tissue-based transcriptome-wide association studies (TWAS), we revealed that about half of the functional variation acts through altered transcript abundance for maize kernel traits, including 30 grain carotenoid abundance traits, 20 grain tocochromanol abundance traits, and 22 field-measured agronomic traits. Comparing the efficacy of TWAS with genome-wide association studies (GWAS) and an ensemble approach that combines both GWAS and TWAS, we demonstrated that results of TWAS in combination with GWAS increase the power to detect known genes and aid in prioritizing likely causal genes. Using a variance partitioning approach in the largely independent maize Nested Association Mapping (NAM) population, we also showed that, for a majority of traits, the most strongly associated genes identified by combining GWAS and TWAS explain more heritable variance than is captured by random genes or by the genes identified by GWAS or TWAS alone. This not only improves the ability to link genes to phenotypes, but also highlights the phenotypic consequences of regulatory variation in plants.
The sexually antagonistic genes of Drosophila melanogaster
When selective pressures differ between males and females, the genes experiencing these conflicting evolutionary forces are said to be sexually antagonistic. Although the phenotypic effect of these genes has been documented in both wild and laboratory populations, their identity, number, and location remain unknown. Here, by combining data on sex-specific fitness and genome-wide transcript abundance in a quantitative genetic framework, we identified a group of candidate genes experiencing sexually antagonistic selection in the adult, which correspond to 8% of Drosophila melanogaster genes. As predicted, the X chromosome is enriched for these genes, but surprisingly they represent only a small proportion of the total number of sex-biased transcripts, indicating that the latter is a poor predictor of sexual antagonism. Furthermore, the majority of genes whose expression profiles showed a significant relationship with either male or female adult fitness are also sexually antagonistic. These results provide a first insight into the genetic basis of intralocus sexual conflict and indicate that genetic variation for fitness is dominated and maintained by sexual antagonism, potentially neutralizing any indirect genetic benefits of sexual selection.
Allele-specific expression and eQTL analysis in mouse adipose tissue.
Background: The simplest definition of cis-eQTLs, as opposed to trans-eQTLs, refers to genetic variants that affect expression in an allele-specific manner, with implications for the underlying mechanism. Yet, due to technical limitations of expression microarrays, the vast majority of eQTL studies performed in the last decade used a genomic-distance-based definition as a surrogate for cis, therefore exploring local rather than cis-eQTLs. Results: In this study we use RNAseq to explore allele-specific expression (ASE) in adipose tissue of male and female F1 mice, produced from reciprocal crosses of C57BL/6J and DBA/2J strains. Comparison of the identified cis-eQTLs to local-eQTLs obtained from adipose tissue expression in two previous population-based studies in our laboratory yields poor overlap between the two mapping approaches, while both local-eQTL studies show highly concordant results. Specifically, local-eQTL studies show ~60% overlap between themselves, while only 15-20% of local-eQTLs are identified as cis by ASE, and less than 50% of ASE genes are recovered in local-eQTL studies. Utilizing recently published ENCODE data, we also find that ASE genes show a significant bias in SNP prevalence in DNase I hypersensitive sites that is specific to the direction of ASE. Conclusions: We suggest a new approach to the analysis of allele-specific expression that is more sensitive and accurate than the commonly used Fisher or chi-square statistics. Our analysis indicates that technical differences between the cis and local-eQTL approaches, such as differences in genomic background or sex specificity, account for a relatively small fraction of the discrepancy. Therefore, we suggest that the differences between the two eQTL mapping approaches may facilitate sorting of SNP-eQTL interactions into true cis and trans, and that a considerable portion of local-eQTLs may actually represent trans interactions.
Leveraging large scale beef cattle genomic data to identify the architecture of polygenic selection and local adaptation
Includes vita. Since the invention of the first array-based genotyping assay for cattle in 2008, millions of animals have been genotyped worldwide. Leveraging these genotypes offers exciting opportunities to explore both basic and applied research questions. Commercial genotyping assays are of adequate variant density to perform well in prediction contexts but are not sufficient for mapping studies. Using reference panels made up of individuals genotyped at higher densities, we can statistically infer the missing variation of low-density assays through the process of imputation. Here, we explore the best practices for performing routine imputation in large commercially generated genomic datasets of U.S. beef cattle. We find that using a large multi-breed imputation reference maximizes accuracy, particularly for rare variants. Using three of these large, imputed datasets, we explore two major population genetics questions. First, we map polygenic selection in the bovine genome using Generation Proxy Selection Mapping (GPSM). This identifies hundreds of regions of the genome actively under selection in cattle populations. Using a similar approach, we identify dozens of genomic variants associated with environments across the U.S., likely involved in local adaptation. Understanding the genomic basis of local adaptation in cattle will enable the selection and breeding of cattle better suited to a changing climate. Includes bibliographical references (pages 203-228).
The Expanding Landscape of Alternative Splicing Variation in Human Populations.
Alternative splicing is a tightly regulated biological process by which the number of gene products for any given gene can be greatly expanded. Genomic variants in splicing regulatory sequences can disrupt splicing and cause disease. Recent developments in sequencing technologies and computational biology have allowed researchers to investigate alternative splicing at an unprecedented scale and resolution. Population-scale transcriptome studies have revealed many naturally occurring genetic variants that modulate alternative splicing and consequently influence phenotypic variability and disease susceptibility in human populations. Innovations in experimental and computational tools such as massively parallel reporter assays and deep learning have enabled the rapid screening of genomic variants for their causal impacts on splicing. In this review, we describe technological advances that have greatly increased the speed and scale at which discoveries are made about the genetic variation of alternative splicing. We summarize major findings from population transcriptomic studies of alternative splicing and discuss the implications of these findings for human genetics and medicine.
Discover hidden splicing variations by mapping personal transcriptomes to personal genomes.
RNA-seq has become a popular technology for studying genetic variation of pre-mRNA alternative splicing. Commonly used RNA-seq aligners rely on the consensus splice site dinucleotide motifs to map reads across splice junctions. Consequently, genomic variants that create novel splice site dinucleotides may produce splice junction RNA-seq reads that cannot be mapped to the reference genome. We developed and evaluated an approach to identify 'hidden' splicing variations in personal transcriptomes by mapping personal RNA-seq data to personal genomes. Computational analysis and experimental validation indicate that this approach identifies personal-specific splice junctions at a low false positive rate. Applying this approach to an RNA-seq data set of 75 individuals, we identified 506 personal-specific splice junctions, among which 437 were novel splice junctions not documented in current human transcript annotations. 94 splice junctions had splice site SNPs associated with GWAS signals of human traits and diseases. These involve genes whose splicing variations have been implicated in diseases (such as OAS1), as well as novel associations between alternative splicing and diseases (such as ICA1). Collectively, our work demonstrates that the personal genome approach to RNA-seq read alignment enables the discovery of a large but previously unknown catalog of splicing variations in human populations.
Equine trait mapping
Assigning function to genes is essential for a better understanding of biological
systems. To date, approximately half of the genes in the vertebrate genome have known
function. Domestic animals are a rich source for trait mapping and in this thesis we
have mapped three distinct equine phenotypes. The result provides increased
knowledge regarding gene function and importantly, practical implications for horse
welfare. In paper I and IV, we confirm that Equine Multiple Congenital Ocular
Anomalies (MCOA) syndrome is inherited as an incompletely dominant trait (p =
2.2 × 10^-16). By first identifying a 208 kb identity-by-descent (IBD) region and
subsequently excluding polymorphic sites identified through Illumina sequencing, we
conclude that the gene PMEL causes these defects in horse. Our findings, together with
functional analyses recently published, support that the cause of MCOA syndrome is a
missense mutation (Arg625Cys) near the transmembrane region of PMEL that results
in altered biochemical properties. In paper II we show that variants in the MHC-II
region influence the susceptibility to equine Insect Bite Hypersensitivity with the same
marker risk allele identified in two distinct populations, OR = 4.19 (p = 2.3 × 10^-5) and 1.48
(p = 0.04) for Icelandic horses and Exmoor ponies, respectively. In addition,
homozygosity across the MHC-II region confers a higher risk of developing disease,
OR = 2.67 (p = 1.3 × 10^-3). Finally, in paper III we utilize the EquineSNP50 BeadChip to
identify the first gait locus in the horse. A highly significant SNP (EMP2 = 2.0 × 10^-4) was
found to be consistent with a recessive mode of inheritance for the lateral gait pace
in Icelandic horses, and was confirmed in an independent sample set (p = 2.4 × 10^-14).
Illumina sequencing of an established IBD region identified a nonsense mutation in the
gene DMRT3. A clearly dichotomous distribution in a panel of gaited and non-gaited
breeds revealed that the DMRT3 mutation is permissive for a variety of alternate gaits.
The mutation also has a favorable effect in harness racing horses. Functional
characterization of the truncated protein demonstrated correct localization and an intact
DNA binding profile. mRNA expression in a small population of commissural neurons
from the spinal cord was confirmed in mutant and wild type horses. Further, a DMRT3
null mouse displayed a change in spinal cord circuit signaling and locomotion. These
findings reveal a new molecule involved in the regulation of limb movement.
Ten years of the horse reference genome: insights into equine biology, domestication and population dynamics in the post-genome era.
The horse reference genome from the Thoroughbred mare Twilight has been available for a decade and, together with advances in genomics technologies, has led to unparalleled developments in equine genomics. At the core of this progress is the continuing improvement of the quality, contiguity and completeness of the reference genome, and its functional annotation. Recent achievements include the release of the next version of the reference genome (EquCab3.0) and generation of a reference sequence for the Y chromosome. Horse satellite-free centromeres provide unique models for mammalian centromere research. Despite extremely low genetic diversity of the Y chromosome, it has been possible to trace patrilines of breeds and pedigrees and show that Y variation was lost in the past approximately 2300 years owing to selective breeding. The high-quality reference genome has led to the development of three different SNP arrays and WGSs of almost 2000 modern individual horses. The collection of WGS of hundreds of ancient horses is unique and not available for any other domestic species. These tools and resources have led to global population studies dissecting the natural history of the species and genetic makeup and ancestry of modern breeds. Most importantly, the available tools and resources, together with the discovery of functional elements, are dissecting molecular causes of a growing number of Mendelian and complex traits. The improved understanding of molecular underpinnings of various traits continues to benefit the health and performance of the horse while also serving as a model for complex disease across species.
Identification, Replication, and Functional Fine-Mapping of Expression Quantitative Trait Loci in Primary Human Liver Tissue
The discovery of expression quantitative trait loci ("eQTLs") can
help to unravel genetic contributions to complex traits. We identified genetic
determinants of human liver gene expression variation using two independent
collections of primary tissue profiled with Agilent
(n = 206) and Illumina (n = 60)
expression arrays and Illumina SNP genotyping (550K), and we also incorporated
data from a published study (n = 266). We found that
~30% of SNP-expression correlations in one study failed to replicate
in either of the others, even at thresholds yielding high reproducibility in
simulations, and we quantified numerous factors affecting reproducibility. Our
data suggest that drug exposure, clinical descriptors, and unknown factors
associated with tissue ascertainment and analysis have substantial effects on
gene expression and that controlling for hidden confounding variables
significantly increases replication rate. Furthermore, we found that
reproducible eQTL SNPs were heavily enriched near gene starts and ends, and
subsequently resequenced the promoters and 3′UTRs for 14 genes and tested
the identified haplotypes using luciferase assays. For three genes, significant
haplotype-specific in vitro functional differences correlated
directly with expression levels, suggesting that many bona fide
eQTLs result from functional variants that can be mechanistically isolated in a
high-throughput fashion. Finally, given our study design, we were able to
discover and validate hundreds of liver eQTLs. Many of these relate directly to
complex traits for which liver-specific analyses are likely to be relevant, and
we identified dozens of potential connections with disease-associated loci.
These included previously characterized eQTL contributors to diabetes, drug
response, and lipid levels, and they suggest novel candidates such as a role for
NOD2 expression in leprosy risk and
C2orf43 in prostate cancer. In general, the work presented
here will be valuable for future efforts to precisely identify and functionally
characterize genetic contributions to a variety of complex traits.