1,147 research outputs found

    Decoding the regulatory network of early blood development from single-cell gene expression measurements.

    Reconstruction of the molecular pathways controlling organ development has been hampered by a lack of methods to resolve embryonic progenitor cells. Here we describe a strategy to address this problem that combines gene expression profiling of large numbers of single cells with data analysis based on diffusion maps for dimensionality reduction and network synthesis from state transition graphs. Applying the approach to hematopoietic development in the mouse embryo, we map the progression of mesoderm toward blood using single-cell gene expression analysis of 3,934 cells with blood-forming potential captured at four time points between E7.0 and E8.5. Transitions between individual cellular states are then used as input to develop a single-cell network synthesis toolkit to generate a computationally executable transcriptional regulatory network model of blood development. Several model predictions concerning the roles of Sox and Hox factors are validated experimentally. Our results demonstrate that single-cell analysis of a developing organ coupled with computational approaches can reveal the transcriptional programs that underpin organogenesis. We thank J. Downing (St. Jude Children's Research Hospital, Memphis, TN, USA) for the Runx1-ires-GFP mouse. Research in the authors' laboratory is supported by the Medical Research Council, Biotechnology and Biological Sciences Research Council, Leukaemia and Lymphoma Research, the Leukemia and Lymphoma Society, Microsoft Research and core support grants by the Wellcome Trust to the Cambridge Institute for Medical Research and Wellcome Trust - MRC Cambridge Stem Cell Institute. V.M. is supported by a Medical Research Council Studentship and Centenary Award and S.W. by a Microsoft Research PhD Scholarship. This is the accepted manuscript for a paper published in Nature Biotechnology 33, 269–276 (2015) doi:10.1038/nbt.315

    Integrative analysis identifies candidate tumor microenvironment and intracellular signaling pathways that define tumor heterogeneity in NF1

    Neurofibromatosis type 1 (NF1) is a monogenic syndrome that gives rise to numerous symptoms including cognitive impairment, skeletal abnormalities, and growth of benign nerve sheath tumors. Nearly all NF1 patients develop cutaneous neurofibromas (cNFs), which occur on the skin surface, whereas 40-60% of patients develop plexiform neurofibromas (pNFs), which are deeply embedded in the peripheral nerves. Patients with pNFs have a ~10% lifetime chance of these tumors becoming malignant peripheral nerve sheath tumors (MPNSTs). These tumors have a severe prognosis and few treatment options other than surgery. Given the lack of therapeutic options available to patients with these tumors, identification of druggable pathways or other key molecular features could aid ongoing therapeutic discovery studies. In this work, we used statistical and machine learning methods to analyze 77 NF1 tumors with genomic data to characterize key signaling pathways that distinguish these tumors and identify candidates for drug development. We identified subsets of latent gene expression variables that may be important in the identification and etiology of cNFs, pNFs, other neurofibromas, and MPNSTs. Furthermore, we characterized the association between these latent variables and genetic variants, immune deconvolution predictions, and protein activity predictions.

    GENE-Counter: A Computational Pipeline for the Analysis of RNA-Seq Data for Gene Expression Differences

    GENE-counter is a complete Perl-based computational pipeline for analyzing RNA-Sequencing (RNA-Seq) data for differential gene expression. In addition to its use in studying transcriptomes of eukaryotic model organisms, GENE-counter is applicable to prokaryotes and non-model organisms without an available genome reference sequence. For alignments, GENE-counter is configured for CASHX, Bowtie, and BWA, but an end user can use any Sequence Alignment/Map (SAM)-compliant program of preference. To analyze data for differential gene expression, GENE-counter can be run with any one of three statistics packages that are based on variations of the negative binomial distribution. The default method is a new and simple statistical test we developed based on an over-parameterized version of the negative binomial distribution. GENE-counter also includes three different methods for assessing differentially expressed features for enriched gene ontology (GO) terms. Results are transparent and data are systematically stored in a MySQL relational database to facilitate additional analyses as well as quality assessment. We used next generation sequencing to generate a small-scale RNA-Seq dataset derived from the heavily studied defense response of Arabidopsis thaliana and used GENE-counter to process the data. Collectively, the support from analysis of microarrays and the substantial observed overlap in results from each of the three statistics packages demonstrate that GENE-counter is well suited for handling the unique characteristics of small sample sizes and high variability in gene counts.

    DNA Chemical Reaction Network Design Synthesis and Compilation

    The advantages of biomolecular computing include 1) the ability to interface with, monitor, and intelligently protect and maintain the functionality of living systems, 2) the ability to create computational devices with minimal energy needs and hazardous waste production during manufacture and lifecycle, 3) the ability to store large amounts of information for extremely long time periods, and 4) the ability to create computation analogous to human brain function. To realize these advantages over electronics, biomolecular computing is at a watershed moment in its evolution. Computing with entire molecules presents different challenges and requirements than computing just with electric charge. These challenges have led to ad-hoc design and programming methods with high development costs and limited device performance. At present, building a device requires complete immersion in low-level detail. We address these shortcomings by creating a systems engineering process for building and programming DNA-based computing devices. Contributions of this thesis include numeric abstractions for nucleic acid sequence and secondary structure, and a set of algorithms which employ these abstractions. The abstractions and algorithms have been implemented in three artifacts: DNADL, a design description language; Pyxis, a molecular compiler and design toolset; and KCA, a simulation of DNA kinetics using a cellular automaton discretization. Our methods are applicable to other DNA nanotechnology constructions and may serve in the development of a full DNA computing model.

    Studies on genetic and epigenetic regulation of gene expression dynamics

    The information required to build an organism is contained in its genome and the first biochemical process that activates the genetic information stored in DNA is transcription. Cell-type-specific gene expression shapes cellular functional diversity and dysregulation of transcription is a central tenet of human disease. Therefore, understanding transcriptional regulation is central to understanding biology in health and disease. Transcription is a dynamic process, occurring in discrete bursts of activity that can be characterized by two kinetic parameters: burst frequency, which describes how often genes burst, and burst size, which describes how many transcripts are generated in each burst. Genes are under strict regulatory control by distinct sequences in the genome as well as epigenetic modifications. To properly study how genetic and epigenetic factors affect transcription, it needs to be treated as the dynamic cellular process it is. In this thesis, I present the development of methods that allow identification of newly induced gene expression over short timescales, as well as inference of kinetic parameters describing how frequently genes burst and how many transcripts each burst gives rise to. The work is presented through four papers: In paper I, I describe the development of a novel method for profiling newly transcribed RNA molecules. We use this method to show that therapeutic compounds affecting different epigenetic enzymes elicit distinct, compound-specific responses, mediated by different sets of transcription factors, as early as one hour into treatment; these responses can only be detected when measuring newly transcribed RNA. The goal of paper II is to determine how genetic variation shapes transcriptional bursting. To this end, we infer transcriptome-wide burst kinetics parameters from genetically distinct donors and find variation that selectively affects burst sizes and frequencies.
Paper III describes a method for inferring transcriptional kinetics transcriptome-wide using single-cell RNA-sequencing. We use this method to describe how the regulation of transcriptional bursting is encoded in the genome. Our findings show that gene-specific burst sizes are dependent on core promoter architecture and that enhancers affect burst frequencies. Furthermore, cell-type-specific differential gene expression is regulated by cell-type-specific burst frequencies. Lastly, Paper IV shows how transcription shapes cell types. We collect data on cellular morphologies, electrophysiological characteristics, and measure gene expression in the same neurons collected from the mouse motor cortex. Our findings show that cells belonging to the same, distinct transcriptomic families have distinct and non-overlapping morpho-electric characteristics. Within families, there is continuous and correlated variation in all modalities, challenging the notion of cell types as discrete entities.

    A High-Resolution Whole-Genome Map of Key Chromatin Modifications in the Adult Drosophila melanogaster

    Epigenetic research has been focused on cell-type-specific regulation; less is known about common features of epigenetic programming shared by diverse cell types within an organism. Here, we report a modified method for chromatin immunoprecipitation and deep sequencing (ChIP–Seq) and its use to construct a high-resolution map of key histone marks, heterochromatin protein 1a (HP1a), and RNA polymerase II (polII) in Drosophila melanogaster. These factors are mapped at 50-bp resolution genome-wide and at 5-bp resolution for regulatory sequences of genes, which reveals fundamental features of the chromatin modification landscape shared by major adult Drosophila cell types: the enrichment of both heterochromatic and euchromatic marks in transposons and repetitive sequences, the accumulation of HP1a at transcription start sites with stalled polII, the signatures of the histone code and polII level/position around the transcriptional start sites that predict both the mRNA level and functionality of genes, and the enrichment of elongating polII within exons at splicing junctions. These features, likely conserved among diverse epigenomes, reveal general strategies for chromatin modifications.