13 research outputs found
MPRAnalyze: statistical framework for massively parallel reporter assays.
Massively parallel reporter assays (MPRAs) can measure the regulatory function of thousands of DNA sequences in a single experiment. Despite growing popularity, MPRA studies are limited by a lack of a unified framework for analyzing the resulting data. Here we present MPRAnalyze: a statistical framework for analyzing MPRA count data. Our model leverages the unique structure of MPRA data to quantify the function of regulatory sequences, compare sequences' activity across different conditions, and provide necessary flexibility in an evolving field. We demonstrate the accuracy and applicability of MPRAnalyze on simulated and published data and compare it with existing methods
Functional interpretation of single cell similarity maps.
We present Vision, a tool for annotating the sources of variation in single-cell RNA-seq data in an automated and scalable manner. Vision operates directly on the manifold of cell-cell similarity and employs a flexible annotation approach that can operate either with or without preconceived stratification of the cells into groups or along a continuum. We demonstrate the utility of Vision in several case studies and show that it can derive important sources of cellular variation and link them to experimental meta-data even with relatively homogeneous sets of cells. Vision produces an interactive, low-latency, feature-rich web-based report that can be easily shared among researchers, thus facilitating data dissemination and collaboration
GeneFishing to reconstruct context specific portraits of biological processes.
Rapid advances in genomic technologies have led to a wealth of diverse data, from which novel discoveries can be gleaned through the application of robust statistical and computational methods. Here, we describe GeneFishing, a semisupervised computational approach to reconstruct context-specific portraits of biological processes by leveraging gene-gene coexpression information. GeneFishing incorporates multiple high-dimensional statistical ideas, including dimensionality reduction, clustering, subsampling, and results aggregation, to produce robust results. To illustrate the power of our method, we applied it using 21 genes involved in cholesterol metabolism as "bait" to "fish out" (or identify) genes not previously identified as being connected to cholesterol metabolism. Using simulation and real datasets, we found that the results obtained through GeneFishing were more interesting for our study than those provided by related gene prioritization methods. In particular, application of GeneFishing to the GTEx liver RNA sequencing (RNAseq) data not only reidentified many known cholesterol-related genes, but also pointed to glyoxalase I (GLO1) as a gene implicated in cholesterol metabolism. In a follow-up experiment, we found that GLO1 knockdown in human hepatoma cell lines increased levels of cellular cholesterol ester, validating a role for GLO1 in cholesterol metabolism. In addition, we performed pantissue analysis by applying GeneFishing on various tissues and identified many potential tissue-specific cholesterol metabolism-related genes. GeneFishing appears to be a powerful tool for identifying related components of complex biological systems and may be used across a wide range of applications
Methodology and Applications for studying the Heterogeneity and Sequence Determinants of cis-Regulatory Elements
cis-regulatory elements (CREs) are non-coding segments of the genome that regulate the transcription of nearby genes. They can be broadly divided into two categories: 1) promoters, positioned directly upstream of their target gene, and 2) enhancers, positioned distally to their target gene. Enhancers are thought to be the main drivers of cell type-specific and state-specific transcription, and regulate gene expression by fine-tuning the rate of transcription, as opposed to the more binary (on or off) regulatory function that promoters typically have. Understanding how enhancers function is therefore crucially important to understanding how cells obtain and maintain certain fates and determine their response to stimuli. Despite their importance, much is still unknown about the roles enhancers play in many biological processes, and how their sequence determines their regulatory function. The first part of this dissertation deals with single-cell chromatin accessibility data (e.g., as produced by single-cell ATAC-seq) as a means for systematically studying the heterogeneity of CREs, and specifically enhancers. In chapter 2 this is demonstrated in the innate immune system's response to vaccination: in a subset of cells, a distinct state of chromatin accessibility maintains long-term epigenetic changes that prime these cells for a different response to stimuli, and provides non-specific viral protection. However promising, the unique properties of this data modality pose significant challenges. These are addressed in chapter 3, which introduces PeakVI, a deep generative model that provides a comprehensive statistical framework for analyzing data generated by scATAC-seq assays. Recent advances in sequencing technologies now enable obtaining these measurements alongside gene expression measurements (i.e., single-cell RNA-seq), providing the ability to directly measure the relationship between the heterogeneity of the chromatin landscape and that of the transcriptional profile.
Chapter 4 introduces MultiVI, a general framework for the joint analysis of multi-modal single-cell data, using single-cell ATAC-seq and single-cell RNA-seq as the main example. These models enable exploration of cis-regulatory programs, identification of putative key enhancers, and generation of hypotheses about their regulatory functions. The second part of this dissertation focuses on analyzing high-throughput functional data produced by massively parallel reporter assays (MPRAs). These assays enable direct functional characterization of thousands of synthetically generated candidate regulatory sequences. However, these assays include both DNA-seq and RNA-seq observations, and require controlling for various technical confounders within both assays, posing substantial computational challenges. Chapter 5 describes MPRAnalyze, a nested generalized linear model that provides a comprehensive statistical framework for analyzing MPRA data. Chapter 6 then uses MPRAnalyze extensively to identify key enhancers and novel transcription factors involved in early neural differentiation. In chapter 7, systematic perturbation of binding sites in the identified enhancers reveals the specific sequence features that determine enhancer function, and elucidates how multiple functional sites interact in a single enhancer sequence to reach the desired functional output
Massively parallel reporter perturbation assays uncover temporal regulatory architecture during neural differentiation
Elucidating the structure of the gene regulation that makes neurons. Kyoto University press release. 2022-03-24. Hundreds of gene regulatory motifs cooperate and conflict to make brain cells. Kyoto University press release. 2022-03-24. Gene regulatory elements play a key role in orchestrating gene expression during cellular differentiation, but what determines their function over time remains largely unknown. Here, we perform perturbation-based massively parallel reporter assays at seven early time points of neural differentiation to systematically characterize how regulatory elements and motifs within them guide cellular differentiation. By perturbing over 2,000 putative DNA binding motifs in active regulatory regions, we delineate four categories of functional elements, and observe that activity direction is mostly determined by the sequence itself, while the magnitude of effect depends on the cellular environment. We also find that fine-tuning transcription rates is often achieved by a combined activity of adjacent activating and repressing elements. Our work provides a blueprint for the sequence components needed to induce different transcriptional patterns in general and specifically during neural differentiation
lentiMPRA and MPRAflow for high-throughput functional characterization of gene regulatory elements.
Massively parallel reporter assays (MPRAs) can simultaneously measure the function of thousands of candidate regulatory sequences (CRSs) in a quantitative manner. In this method, CRSs are cloned upstream of a minimal promoter and reporter gene, alongside a unique barcode, and introduced into cells. If the CRS is a functional regulatory element, it will lead to the transcription of the barcode sequence, which is measured via RNA sequencing and normalized for cellular integration via DNA sequencing of the barcode. This technology has been used to test thousands of sequences and their variants for regulatory activity, to decipher the regulatory code and its evolution, and to develop genetic switches. Lentivirus-based MPRA (lentiMPRA) produces 'in-genome' readouts and enables the use of this technique in hard-to-transfect cells. Here, we provide a detailed protocol for lentiMPRA, along with a user-friendly Nextflow-based computational pipeline, MPRAflow, for quantifying CRS activity from different MPRA designs. The lentiMPRA protocol takes ~2 months, which includes sequencing turnaround time and data processing with MPRAflow
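The normalization described in the abstract above (RNA barcode counts scaled by DNA barcode counts) can be illustrated with a minimal sketch. This is not MPRAflow's or MPRAnalyze's actual computation: the function name `crs_activity`, the input tuple layout, the pseudocount, and the toy counts are all assumptions made for illustration, and library-size normalization and replicate handling are deliberately left out.

```python
import math
from collections import defaultdict


def crs_activity(barcode_counts, pseudocount=1.0):
    """Naive per-CRS activity: log2 of summed RNA over summed DNA counts.

    barcode_counts: iterable of (crs_id, barcode, dna_count, rna_count).
    The tuple layout and the pseudocount are illustrative assumptions.
    """
    dna_total = defaultdict(float)
    rna_total = defaultdict(float)
    # Pool all barcodes that tag the same candidate regulatory sequence.
    for crs_id, _barcode, dna, rna in barcode_counts:
        dna_total[crs_id] += dna
        rna_total[crs_id] += rna
    # The pseudocount keeps the ratio defined when a count is zero.
    return {
        crs_id: math.log2(
            (rna_total[crs_id] + pseudocount) / (dna_total[crs_id] + pseudocount)
        )
        for crs_id in dna_total
    }


# Toy counts, invented for this example only.
counts = [
    ("crs_A", "ACGT", 10, 40),
    ("crs_A", "TTGA", 5, 23),
    ("crs_B", "GGCA", 7, 7),
]
activity = crs_activity(counts)
```

With these toy counts, `crs_A` has more RNA than DNA and gets a positive score, while `crs_B` has equal counts and scores zero. A ratio of pooled counts is only the simplest possible estimator; model-based approaches such as the one described in the MPRAnalyze abstract are designed to handle the count structure more carefully.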