1,147 research outputs found
Bioinformatics challenges and perspectives when studying the effect of epigenetic modifications on alternative splicing
It is widely known that epigenetic modifications are important in regulating transcription, but several have also been implicated in alternative splicing. The regulation of pre-mRNA splicing is important to explain proteomic diversity, and the misregulation of splicing has been implicated in many diseases. Here, we give a brief overview of the role of epigenetics in alternative splicing and disease. We then discuss the bioinformatics methods that can be used to model interactions between epigenetic marks and regulators of splicing. These models can be used to identify alternative splicing and epigenetic changes across different phenotypes.
Decoding the regulatory network of early blood development from single-cell gene expression measurements.
Reconstruction of the molecular pathways controlling organ development has been hampered by a lack of methods to resolve embryonic progenitor cells. Here we describe a strategy to address this problem that combines gene expression profiling of large numbers of single cells with data analysis based on diffusion maps for dimensionality reduction and network synthesis from state transition graphs. Applying the approach to hematopoietic development in the mouse embryo, we map the progression of mesoderm toward blood using single-cell gene expression analysis of 3,934 cells with blood-forming potential captured at four time points between E7.0 and E8.5. Transitions between individual cellular states are then used as input to develop a single-cell network synthesis toolkit to generate a computationally executable transcriptional regulatory network model of blood development. Several model predictions concerning the roles of Sox and Hox factors are validated experimentally. Our results demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the transcriptional programs that underpin organogenesis. We thank J. Downing (St. Jude Children's Research Hospital, Memphis, TN, USA) for the Runx1-ires-GFP mouse. Research in the authors' laboratory is supported by the Medical Research Council, Biotechnology and Biological Sciences Research Council, Leukaemia and Lymphoma Research, the Leukemia and Lymphoma Society, Microsoft Research and core support grants by the Wellcome Trust to the Cambridge Institute for Medical Research and Wellcome Trust - MRC Cambridge Stem Cell Institute. V.M. is supported by a Medical Research Council Studentship and Centenary Award and S.W. by a Microsoft Research PhD Scholarship. This is the accepted manuscript for a paper published in Nature Biotechnology 33, 269–276 (2015) doi:10.1038/nbt.315
Integrative analysis identifies candidate tumor microenvironment and intracellular signaling pathways that define tumor heterogeneity in NF1
Neurofibromatosis type 1 (NF1) is a monogenic syndrome that gives rise to numerous symptoms including cognitive impairment, skeletal abnormalities, and growth of benign nerve sheath tumors. Nearly all NF1 patients develop cutaneous neurofibromas (cNFs), which occur on the skin surface, whereas 40-60% of patients develop plexiform neurofibromas (pNFs), which are deeply embedded in the peripheral nerves. Patients with pNFs have a ~10% lifetime chance of these tumors becoming malignant peripheral nerve sheath tumors (MPNSTs). These tumors have a severe prognosis and few treatment options other than surgery. Given the lack of therapeutic options available to patients with these tumors, identification of druggable pathways or other key molecular features could aid ongoing therapeutic discovery studies. In this work, we used statistical and machine learning methods to analyze 77 NF1 tumors with genomic data to characterize key signaling pathways that distinguish these tumors and identify candidates for drug development. We identified subsets of latent gene expression variables that may be important in the identification and etiology of cNFs, pNFs, other neurofibromas, and MPNSTs. Furthermore, we characterized the association between these latent variables and genetic variants, immune deconvolution predictions, and protein activity predictions.
GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences
GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable to prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays and the substantial overlap observed in results from each of the three statistics packages demonstrate that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.
DNA Chemical Reaction Network Design Synthesis and Compilation
The advantages of biomolecular computing include 1) the ability to interface with, monitor, and intelligently protect and maintain the functionality of living systems, 2) the ability to create computational devices with minimal energy needs and hazardous waste production during manufacture and lifecycle, 3) the ability to store large amounts of information for extremely long time periods, and 4) the ability to create computation analogous to human brain function. To realize these advantages over electronics, biomolecular computing is at a watershed moment in its evolution. Computing with entire molecules presents different challenges and requirements than computing just with electric charge. These challenges have led to ad-hoc design and programming methods with high development costs and limited device performance. At present, device building entails complete immersion in low-level detail. We address these shortcomings by creating a systems engineering process for building and programming DNA-based computing devices. Contributions of this thesis include numeric abstractions for nucleic acid sequence and secondary structure, and a set of algorithms which employ these abstractions. The abstractions and algorithms have been implemented into three artifacts: DNADL, a design description language; Pyxis, a molecular compiler and design toolset; and KCA, a simulation of DNA kinetics using a cellular automaton discretization. Our methods are applicable to other DNA nanotechnology constructions and may serve in the development of a full DNA computing model.
Studies on genetic and epigenetic regulation of gene expression dynamics
The information required to build an organism is contained in its genome and the first
biochemical process that activates the genetic information stored in DNA is transcription.
Cell-type-specific gene expression shapes cellular functional diversity, and dysregulation
of transcription is a central feature of human disease. Therefore, understanding
transcriptional regulation is central to understanding biology in health and disease.
Transcription is a dynamic process, occurring in discrete bursts of activity that can be
characterized by two kinetic parameters: burst frequency, describing how often genes
burst, and burst size, describing how many transcripts are generated in each burst. Genes
are under strict regulatory control by distinct sequences in the genome as well as
epigenetic modifications. To properly study how genetic and epigenetic factors affect
transcription, it needs to be treated as the dynamic cellular process it is. In this thesis, I
present the development of methods that allow identification of newly induced gene
expression over short timescales, as well as inference of kinetic parameters describing
how frequently genes burst and how many transcripts each burst gives rise to. The work is
presented through four papers:
In paper I, I describe the development of a novel method for profiling newly transcribed
RNA molecules. We use this method to show that therapeutic compounds affecting
different epigenetic enzymes elicit distinct, compound-specific responses mediated by
different sets of transcription factors as early as one hour after treatment, and that
these responses can only be detected when measuring newly transcribed RNA.
The goal of paper II is to determine how genetic variation shapes transcriptional bursting.
To this end, we infer transcriptome-wide burst kinetics parameters from genetically
distinct donors and find variation that selectively affects burst sizes and frequencies.
Paper III describes a method for inferring transcriptional kinetics transcriptome-wide
using single-cell RNA-sequencing. We use this method to describe how the regulation of
transcriptional bursting is encoded in the genome. Our findings show that gene-specific
burst sizes are dependent on core promoter architecture and that enhancers affect burst
frequencies. Furthermore, cell-type-specific differential gene expression is regulated by
cell-type-specific burst frequencies.
Lastly, Paper IV shows how transcription shapes cell types. We collect data on cellular
morphologies and electrophysiological characteristics, and measure gene expression in the
same neurons collected from the mouse motor cortex. Our findings show that cells
belonging to the same, distinct transcriptomic families have distinct and non-overlapping
morpho-electric characteristics. Within families, there is continuous and correlated
variation in all modalities, challenging the notion of cell types as discrete entities.
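The two kinetic parameters named in the abstract above, burst frequency and burst size, can be made concrete with a small sketch. This is not the inference method of the thesis. It assumes the commonly used bursty limit of the two-state promoter model, in which per-cell transcript counts follow a negative binomial distribution with mean `frequency * size` and variance `frequency * size * (1 + size)`, with burst frequency expressed in units of the mRNA degradation rate. All function names here are hypothetical, and the method-of-moments estimator is only an illustration.

```python
import math
import random
import statistics


def sample_poisson(rng, rate):
    """Draw one Poisson variate with Knuth's method (adequate for small rates)."""
    threshold = math.exp(-rate)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_counts(burst_frequency, burst_size, n_cells, seed=0):
    """Simulate per-cell transcript counts as a gamma-Poisson mixture.

    Assumption: shape = burst frequency (per mRNA lifetime) and
    scale = mean burst size, which yields a negative binomial.
    """
    rng = random.Random(seed)
    return [
        sample_poisson(rng, rng.gammavariate(burst_frequency, burst_size))
        for _ in range(n_cells)
    ]


def estimate_burst_parameters(counts):
    """Method-of-moments estimates of (burst frequency, burst size)."""
    mean = statistics.fmean(counts)
    variance = statistics.variance(counts)
    if mean <= 0 or variance <= mean:
        raise ValueError("counts are not overdispersed; bursting cannot be estimated")
    burst_size = variance / mean - 1.0   # Fano factor minus one
    burst_frequency = mean / burst_size  # so that frequency * size equals the mean
    return burst_frequency, burst_size


if __name__ == "__main__":
    counts = simulate_counts(burst_frequency=2.0, burst_size=5.0, n_cells=20000, seed=1)
    frequency, size = estimate_burst_parameters(counts)
    print(f"estimated burst frequency: {frequency:.2f}, burst size: {size:.2f}")
```

Under these assumptions, two genes with the same mean expression can differ in how that mean is produced: frequent small bursts give low cell-to-cell variance, while rare large bursts give high variance.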
A High-Resolution Whole-Genome Map of Key Chromatin Modifications in the Adult Drosophila melanogaster
Epigenetic research has been focused on cell-type-specific regulation; less is known about common features of epigenetic programming shared by diverse cell types within an organism. Here, we report a modified method for chromatin immunoprecipitation and deep sequencing (ChIP-Seq) and its use to construct a high-resolution map of key histone marks, heterochromatin protein 1a (HP1a), and RNA polymerase II (polII) in Drosophila melanogaster. These factors are mapped at 50-bp resolution genome-wide and at 5-bp resolution for regulatory sequences of genes, which reveals fundamental features of the chromatin modification landscape shared by major adult Drosophila cell types: the enrichment of both heterochromatic and euchromatic marks in transposons and repetitive sequences, the accumulation of HP1a at transcription start sites with stalled polII, the signatures of histone code and polII level/position around the transcriptional start sites that predict both the mRNA level and functionality of genes, and the enrichment of elongating polII within exons at splicing junctions. These features, likely conserved among diverse epigenomes, reveal general strategies for chromatin modifications.
Regulatory logic of cellular diversity in the nervous system
During nervous system development, thousands of distinct cell types are generated and assembled into complex circuits that control all aspects of animal cognition and behavior. Understanding what these diverse cells are, how they are generated, and what they do in the context of circuits and behavior is the fundamental effort of the field of neuroscience. In this thesis, I investigate how the genomic organization of regulatory elements informs specific patterns of gene expression in the nervous system. In particular, I examine how distinct combinations of transcription factors interpret information encoded in the genome to control global gene expression programs in a cell-type-specific manner.
In Chapter Two, I describe the establishment of a developmentally inspired transcriptional programming system to generate spinal and cranial motor neurons directly from mouse embryonic stem cells. Programmed motor neurons acquire general characteristics that mirror their in vivo counterparts, providing a robust system for studying cell fate specification in the nervous system. Combinatorial expression of cell-type-specific programming factors informs context-dependent enhancer binding and acquisition of appropriate cell-type-specific molecular and functional properties.
In Chapter Three, I take advantage of this robust, experimentally accessible system to probe the chromatin-level organization and regulatory principles controlling specificity of motor neuron gene expression programs. Motor neuron genes are controlled by multiple distantly distributed enhancer constellations stretched across large regulatory domains. Using this motor neuron specification model, I discovered a unique regulatory organization controlling gene expression in the nervous system, whereby neuronal genes are controlled from uniquely complex regulatory domains acting over large distances.
In Chapter Four, I build on the insights gained from studying motor neurons at a single point in time to investigate the dynamics of the regulatory environment during neuronal maturation. We demonstrate that enhancers are highly dynamic even after postmitotic specification. The dynamic nature of enhancers is dependent on combinatorial binding with new transcriptional cofactors.
Overall, my results suggest that neuronal gene expression programs within a single cell type are regulated in a highly dynamic fashion by a complex set of enhancers. I propose that during development the immense cellular complexity of the nervous system is established and maintained by a correspondingly complex repertoire of enhancers.