78 research outputs found

The Genetic and Mechanistic Basis for Variation in Gene Regulation

Author: Gilad Yoav
Pai Athma A.
Pritchard Jonathan K.
Publication venue: Public Library of Science (PLoS)
Publication date: 01/01/2015

It is now well established that noncoding regulatory variants play a central role in the genetics of common diseases and in evolution. However, until recently, we have known little about the mechanisms by which most regulatory variants act. For instance, what types of functional elements in DNA, RNA, or proteins are most often affected by regulatory variants? Which stages of gene regulation are typically altered? How can we predict which variants are most likely to impact regulation in a given cell type? Recent studies, in many cases using quantitative trait loci (QTL)-mapping approaches in cell lines or tissue samples, have provided us with considerable insight into the properties of genetic loci that have regulatory roles. Such studies have uncovered novel biochemical regulatory interactions and led to the identification of previously unrecognized regulatory mechanisms. We have learned that genetic variation is often directly associated with variation in regulatory activities (namely, we can map regulatory QTLs, not just expression QTLs [eQTLs]), and we have taken the first steps towards understanding the causal order of regulatory events (for example, the role of pioneer transcription factors). Yet, in most cases, we still do not know how to interpret overlapping combinations of regulatory interactions, and we are still far from being able to predict how variation in regulatory mechanisms is propagated through a chain of interactions to eventually result in changes in gene expression profiles.

Funding: National Institutes of Health (U.S.) (grant NIH HG006123); National Institutes of Health (U.S.) (NIH GM007197); National Institutes of Health (U.S.) (grant NIH MH084703); Howard Hughes Medical Institute; Jane Coffin Childs Memorial Fund for Medical Research (postdoctoral fellowship)

DSpace@MIT

Crossref

Directory of Open Access Journals

PubMed Central

RNA-seq: impact of RNA degradation on transcript quantification

Author: Gallego Romero Irene
Gilad Yoav
Pai Athma A.
Tung Jenny
Publication venue: Springer Science and Business Media LLC
Publication date: 01/01/2014

Background: The use of low quality RNA samples in whole-genome gene expression profiling remains controversial. It is unclear if transcript degradation in low quality RNA samples occurs uniformly, in which case the effects of degradation can be corrected via data normalization, or whether different transcripts are degraded at different rates, potentially biasing measurements of expression levels. This concern has rendered the use of low quality RNA samples in whole-genome expression profiling problematic. Yet, low quality samples (for example, samples collected in the course of fieldwork) are at times the sole means of addressing specific questions.

Results: We sought to quantify the impact of variation in RNA quality on estimates of gene expression levels based on RNA-seq data. To do so, we collected expression data from tissue samples that were allowed to decay for varying amounts of time prior to RNA extraction. The RNA samples we collected spanned the entire range of RNA Integrity Number (RIN) values (a metric commonly used to assess RNA quality). We observed widespread effects of RNA quality on measurements of gene expression levels, as well as a slight but significant loss of library complexity in more degraded samples.

Conclusions: While standard normalizations failed to account for the effects of degradation, we found that by explicitly controlling for the effects of RIN using a linear model framework we can correct for the majority of these effects. We conclude that in instances in which RIN and the effect of interest are not associated, this approach can help recover biologically meaningful signals in data from degraded RNA samples.

Funding: American Heart Association (Predoctoral Fellowship)

DSpace@MIT

Crossref

Springer - Publisher Connector

PubMed Central
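
The RNA degradation abstract above describes controlling for RIN with a linear model. A minimal sketch of that idea follows, assuming a genes-by-samples matrix of log-scale expression values and one RIN value per sample. The function name `regress_out_rin`, the mean-centering of RIN, and the choice to subtract the fitted RIN term are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np

def regress_out_rin(expr, rin):
    """Remove the linear effect of RIN from each gene (hypothetical helper).

    expr: genes x samples array of log-scale expression values (assumed input).
    rin:  one RNA Integrity Number per sample.
    Returns the corrected matrix and the fitted per-gene RIN slopes.
    """
    expr = np.asarray(expr, dtype=float)
    rin = np.asarray(rin, dtype=float)
    # Center RIN so that removing its effect leaves each gene's mean level intact.
    rin_centered = rin - rin.mean()
    # Design matrix with an intercept column and the centered RIN covariate.
    design = np.column_stack([np.ones_like(rin_centered), rin_centered])
    # Ordinary least squares for all genes at once: design @ coef ~ expr.T
    coef, *_ = np.linalg.lstsq(design, expr.T, rcond=None)
    rin_slopes = coef[1]
    # Subtract only the RIN component; the intercept (mean expression) is kept.
    corrected = expr - np.outer(rin_slopes, rin_centered)
    return corrected, rin_slopes
```

As the abstract notes, this kind of correction is only appropriate when RIN is not associated with the effect of interest; otherwise the regression would also remove biological signal.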

Numerous recursive sites contribute to accuracy of splicing of long introns in flies [preprint]

Author: Adelman Karen
Burge Christopher B.
Paggi Joseph
Pai Athma A.
Publication venue: eScholarship@UMassChan
Publication date: 27/03/2018

Recursive splicing, a process by which a single intron is removed from pre-mRNA transcripts in multiple distinct segments, has been observed in a small subset of Drosophila melanogaster introns. However, detection of recursive splicing requires observation of splicing intermediates which are inherently unstable, making it difficult to study. Here we developed new computational approaches to identify recursively spliced introns and applied them, in combination with existing methods, to nascent RNA sequencing data from Drosophila S2 cells. These approaches identified hundreds of novel sites of recursive splicing, expanding the catalog of recursively spliced fly introns by 4-fold. Recursive sites occur in most very long (> 40 kb) fly introns, including many genes involved in morphogenesis and development, and tend to occur near the midpoints of introns. Suggesting a possible function for recursive splicing, we observe that fly introns with recursive sites are spliced more accurately than comparably sized non-recursive introns.

eScholarship@UMMS

Noisy Splicing Drives mRNA Isoform Diversity in Human Cells

Author: Gilad Yoav
Pai Athma A.
Pickrell Joseph K.
Pritchard Jonathan K.
Publication date: 18/01/2024

While the majority of multiexonic human genes show some evidence of alternative splicing, it is unclear what fraction of observed splice forms is functionally relevant. In this study, we examine the extent of alternative splicing in human cells using deep RNA sequencing and de novo identification of splice junctions. We demonstrate the existence of a large class of low abundance isoforms, encompassing approximately 150,000 previously unannotated splice junctions in our data. Newly-identified splice sites show little evidence of evolutionary conservation, suggesting that the majority are due to erroneous splice site choice. We show that sequence motifs involved in the recognition of exons are enriched in the vicinity of unconserved splice sites. We estimate that the average intron has a splicing error rate of approximately 0.7% and show that introns in highly expressed genes are spliced more accurately, likely due to their shorter length. These results implicate noisy splicing as an important property of genome evolution.

Knowledge UChicago

A Genome-Wide Study of DNA Methylation Patterns and Gene Expression Levels in Multiple Human and Chimpanzee Tissues

Author: Bell Jordana T.
Gilad Yoav
Marioni John C.
Pai Athma A.
Pritchard Jonathan K.
Publication date: 18/01/2024

The modification of DNA by methylation is an important epigenetic mechanism that affects the spatial and temporal regulation of gene expression. Methylation patterns have been described in many contexts within and across a range of species. However, the extent to which changes in methylation might underlie inter-species differences in gene regulation, in particular between humans and other primates, has not yet been studied. To this end, we studied DNA methylation patterns in livers, hearts, and kidneys from multiple humans and chimpanzees, using tissue samples for which genome-wide gene expression data were also available. Using the multi-species gene expression and methylation data for 7,723 genes, we were able to study the role of promoter DNA methylation in the evolution of gene regulation across tissues and species. We found that inter-tissue methylation patterns are often conserved between humans and chimpanzees. However, we also found a large number of gene expression differences between species that might be explained, at least in part, by corresponding differences in methylation levels. In particular, we estimate that, in the tissues we studied, inter-species differences in promoter methylation might underlie as much as 12%–18% of differences in gene expression levels between humans and chimpanzees.

Knowledge UChicago

Bayesian nonparametric discovery of isoforms and individual specific quantification

Author: Aguiar Derek
Cheng Li-Fang
Dumitrascu Bianca
Engelhardt Barbara E.
Mordelet Fantine
Pai Athma A.
Publication venue: eScholarship@UMassChan
Publication date: 23/03/2017

Most human protein-coding genes can be transcribed into multiple distinct mRNA isoforms. These alternative splicing patterns encourage molecular diversity, and dysregulation of isoform expression plays an important role in disease etiology. However, isoforms are difficult to characterize from short-read RNA-seq data because they share identical subsequences and occur in different frequencies across tissues and samples. Here, we develop BIISQ, a Bayesian nonparametric model for isoform discovery and individual specific quantification from short-read RNA-seq data. BIISQ does not require isoform reference sequences but instead estimates an isoform catalog shared across samples. We use stochastic variational inference for efficient posterior estimates and demonstrate superior precision and recall for simulations compared to state-of-the-art isoform reconstruction methods. BIISQ shows the most gains for low abundance isoforms, with 36% more isoforms correctly inferred at low coverage versus a multi-sample method and 170% more versus single-sample methods. We estimate isoforms in the GEUVADIS RNA-seq data and validate inferred isoforms by associating genetic variants with isoform ratios.

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

Directory of Open Access Journals

eScholarship@UMMS

Controls of Nucleosome Positioning in the Human Genome

Author: Fondufe-Mittendorf Yvonne N.
Gaffney Daniel J.
Gilad Yoav
Lewellen Noah
McVicker Graham
Michelini Katelyn
Pai Athma A.
Pritchard Jonathan K.
Widom Jonathan
Publication date: 18/01/2024

Nucleosomes are important for gene regulation because their arrangement on the genome can control which proteins bind to DNA. Currently, few human nucleosomes are thought to be consistently positioned across cells; however, this has been difficult to assess due to the limited resolution of existing data. We performed paired-end sequencing of micrococcal nuclease-digested chromatin (MNase–seq) from seven lymphoblastoid cell lines and mapped over 3.6 billion MNase–seq fragments to the human genome to create the highest-resolution map of nucleosome occupancy to date in a human cell type. In contrast to previous results, we find that most nucleosomes have more consistent positioning than expected by chance and a substantial fraction (8.7%) of nucleosomes have moderate to strong positioning. In aggregate, nucleosome sequences have 10 bp periodic patterns in dinucleotide frequency and DNase I sensitivity; and, across cells, nucleosomes frequently have translational offsets that are multiples of 10 bp. We estimate that almost half of the genome contains regularly spaced arrays of nucleosomes, which are enriched in active chromatin domains. Single nucleotide polymorphisms that reduce DNase I sensitivity can disrupt the phasing of nucleosome arrays, which indicates that they often result from positioning against a barrier formed by other proteins. However, nucleosome arrays can also be created by DNA sequence alone. The most striking example is an array of over 400 nucleosomes on chromosome 12 that is created by tandem repetition of sequences with strong positioning properties. In summary, a large fraction of nucleosomes are consistently positioned: in some regions because they adopt favored sequence positions, and in other regions because they are forced into specific arrangements by chromatin remodeling or DNA binding proteins.

Knowledge UChicago

Controls of nucleosome positioning in the human genome

Author: Fondufe-Mittendorf Yvonne N.
Gaffney Daniel J.
Gilad Yoav
Lewellen Noah
McVicker Graham
Michelini Katelyn
Pai Athma A.
Pritchard Jonathan K.
Widom Jonathan
Publication venue: UKnowledge
Publication date: 01/01/2012

Nucleosomes are important for gene regulation because their arrangement on the genome can control which proteins bind to DNA. Currently, few human nucleosomes are thought to be consistently positioned across cells; however, this has been difficult to assess due to the limited resolution of existing data. We performed paired-end sequencing of micrococcal nuclease-digested chromatin (MNase-seq) from seven lymphoblastoid cell lines and mapped over 3.6 billion MNase-seq fragments to the human genome to create the highest-resolution map of nucleosome occupancy to date in a human cell type. In contrast to previous results, we find that most nucleosomes have more consistent positioning than expected by chance and a substantial fraction (8.7%) of nucleosomes have moderate to strong positioning. In aggregate, nucleosome sequences have 10 bp periodic patterns in dinucleotide frequency and DNase I sensitivity; and, across cells, nucleosomes frequently have translational offsets that are multiples of 10 bp. We estimate that almost half of the genome contains regularly spaced arrays of nucleosomes, which are enriched in active chromatin domains. Single nucleotide polymorphisms that reduce DNase I sensitivity can disrupt the phasing of nucleosome arrays, which indicates that they often result from positioning against a barrier formed by other proteins. However, nucleosome arrays can also be created by DNA sequence alone. The most striking example is an array of over 400 nucleosomes on chromosome 12 that is created by tandem repetition of sequences with strong positioning properties. In summary, a large fraction of nucleosomes are consistently positioned: in some regions because they adopt favored sequence positions, and in other regions because they are forced into specific arrangements by chromatin remodeling or DNA binding proteins.

Directory of Open Access Journals

University of Kentucky

FigShare

Widespread occurrence of hybrid internal-terminal exons in human transcriptomes

Author: Burge Christopher B.
Calvo-Roitberg Ezequiel
Fiszbein Ana
Kim Gyeung Yun
McGurk Michael
Pai Athma A.
Publication venue: Cold Spring Harbor Laboratory
Publication date: 01/01/2021

Now published in Science Advances, doi: 10.1126/sciadv.abk1752.

Alternative RNA processing is a major mechanism for diversifying the human transcriptome. Messenger RNA isoform differences are predominantly driven by alternative first exons, cassette internal exons and alternative last exons. Despite the importance of classifying exons to understand isoform structure, there is a lack of tools to look at isoform-specific exon usage using RNA-sequencing data. We recently observed that alternative transcription start sites often arise near annotated internal exons, creating “hybrid” exons that can be used as either first or internal exons. To investigate the creation of hybrid exons, we built the HIT (Hybrid-Internal-Terminal) exon pipeline that systematically classifies exons depending on their isoform-specific usage. Using a combination of junction read coverage and probabilistic modeling, the HIT index identified thousands of hybrid first-internal and internal-last exons that were previously misclassified. Hybrid exons are enriched in long genes with at least ten internal exons, have longer flanking introns and strong splice sites. The usage of hybrid exons varies considerably across human tissues, but they are predominantly used in brain, testis and colon cells. Notably, genes involved in RNA splicing have the highest fraction of intra-tissue hybrid exons. Further, we found more than 100,000 inter-tissue hybrid exons that changed from internal to terminal exons across tissues. By developing the first method that can classify exons according to their isoform contexts, our findings demonstrate the existence of hybrid exons, expand the repertoire of tissue-specific terminal exons and uncover unexpected complexities of the human transcriptome.

Accepted manuscript

DSpace@MIT

Boston University Institutional Repository (OpenBU)

PubMed Central