
232 research outputs found

Automated Problem Decomposition for the Boolean Domain with Genetic Programming

Author: A. Moraglio
D. Jackson
E. Hemberg
J. Walker
M. Keijzer
M. O’Neill
S.C. Roberts
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2013

Researchers have been interested in exploring the regularities and modularity of the problem space in genetic programming (GP) with the aim of decomposing the original problem into several smaller subproblems. The main motivation is to allow GP to deal with more complex problems. Most previous work on modularity in GP emphasises the structure of modules used to encapsulate code and/or promote code reuse rather than the decomposition of the original problem. In this paper we propose a problem decomposition strategy that allows the use of a GP search to find solutions for subproblems and combine the individual solutions into the complete solution to the problem.

CiteSeerX

Crossref

Kent Academic Repository


Integrated Genome Analysis Suggests that Most Conserved Non-Coding Sequences are Regulatory Factor Binding Sites

Author: Cloonan Nicole
Gray Jesse M.
Greenberg Michael Eldon
Grimmond Sean
Hemberg Martin
Kreiman Gabriel
Kuersten Scott
Publication venue: 'Oxford University Press (OUP)'
Publication date: 15/04/2013

More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNase-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10 to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements.

Harvard University - DASH


Souporcell: robust clustering of single-cell RNA-seq data by genotype without reference genotypes.

Author: Durbin Richard
Gaffney Daniel J
Heaton Haynes
Hemberg Martin
Imaz Maria
Knights Andrew
Lawniczak Mara KN
Talman Arthur M
Publication venue: Nat Methods
Publication date: 01/06/2020

Methods to deconvolve single-cell RNA-sequencing (scRNA-seq) data are necessary for samples containing a mixture of genotypes, whether they are natural or experimentally combined. Multiplexing across donors is a popular experimental design that can avoid batch effects, reduce costs and improve doublet detection. By using variants detected in scRNA-seq reads, it is possible to assign cells to their donor of origin and identify cross-genotype doublets that may have highly similar transcriptional profiles, precluding detection by transcriptional profile. More subtle cross-genotype variant contamination can be used to estimate the amount of ambient RNA. Ambient RNA is caused by cell lysis before droplet partitioning and is an important confounder of scRNA-seq analysis. Here we develop souporcell, a method to cluster cells using the genetic variants detected within the scRNA-seq reads. We show that it achieves high accuracy on genotype clustering, doublet detection and ambient RNA estimation, as demonstrated across a range of challenging scenarios.

We acknowledge the Wellcome Sanger Institute’s DNA Pipelines for construction of the 10x sequencing libraries. We thank Allan Muhwezi and Andrew Russell for assistance with parasite culture and 10x Single-cell 3’ RNA-seq respectively. In addition, we would like to thank Matthew Young for useful conversations about ambient RNA, Mirjana Efremova for providing information about the maternal/fetal data, and Katie Gray for assistance in interpreting the previously unannotated cluster. The Wellcome Sanger Institute is funded by the Wellcome Trust (grant 206194/Z/17/Z), which supports MKNL and MH. This work was supported by an MRC Career Development Award (G1100339) to MKNL.
We would like to acknowledge the Wellcome Trust Sanger Institute as the source of the human induced pluripotent cell lines that were generated under the Human Induced Pluripotent Stem Cell Initiative funded by a grant from the Wellcome Trust and Medical Research Council, supported by the Wellcome Trust (WT098051) and the NIHR/Wellcome Trust Clinical Research Facility, and acknowledges Life Science Technologies Corporation as the provider of Cytotune (HipSci.org). The Cardiovascular Epidemiology Unit is supported by core funding from the UK Medical Research Council (MR/L003120/1), the British Heart Foundation (RG/13/13/30194; RG/18/13/33946) and the National Institute for Health Research [Cambridge Biomedical Research Centre at the Cambridge University Hospital’s NHS Foundation Trust]. The views expressed are those of the authors and not necessarily those of the NHS, the NIHR or the Department of Health and Social Care

Apollo (Cambridge)

Genome-Wide Analysis of MEF2 Transcriptional Program Reveals Synaptic Target Genes and Neuronal Activity-Dependent Polyadenylation Site Selection

Author: Bear Daniel M.
Flavell Steven W.
Gray Jesse M.
Greenberg Michael E.
Harmin David A.
Hemberg Martin
Hong Elizabeth J.
Kim Tae-Kyung
Markenscoff-Papadimitriou Eirene
Publication venue: 'Elsevier BV'
Publication date: 26/12/2008

Although many transcription factors are known to control important aspects of neural development, the genome-wide programs that are directly regulated by these factors are not known. We have characterized the genetic program that is activated by MEF2, a key regulator of activity-dependent synapse development. These MEF2 target genes have diverse functions at synapses, revealing a broad role for MEF2 in synapse development. Several of the MEF2 targets are mutated in human neurological disorders including epilepsy and autism spectrum disorders, suggesting that these disorders may be caused by disruption of an activity-dependent gene program that controls synapse development. Our analyses also reveal that neuronal activity promotes alternative polyadenylation site usage at many of the MEF2 target genes, leading to the production of truncated mRNAs that may have different functions than their full-length counterparts. Taken together, these analyses suggest that the ubiquitously expressed transcription factor MEF2 regulates an intricate transcriptional program in neurons that controls synapse development.

Elsevier - Publisher Connector

PubMed Central

Caltech Authors

Computational Stem Cell Biology: Open Questions and Guiding Principles

Author: Cacchiarelli D.
Cahan P.
de Sousa Lopes S. M. C.
del Sol A.
Dunn S. -J.
Hemberg M.
Morris S. A.
Rackham O. J. L.
Wells C. A.
Publication venue: 'Elsevier BV'
Publication date: 01/01/2021

Computational biology is enabling an explosive growth in our understanding of stem cells and our ability to use them for disease modeling, regenerative medicine, and drug discovery. We discuss four topics that exemplify applications of computation to stem cell biology: cell typing, lineage tracing, trajectory inference, and regulatory networks. We use these examples to articulate principles that have guided computational biology broadly and call for renewed attention to these principles as computation becomes increasingly important in stem cell biology. We also discuss important challenges for this field with the hope that it will inspire more to join this exciting area.

Archivio della ricerca - Università degli studi di Napoli Federico II

Temporal Tracking of Microglia Activation in Neurodegeneration at Single-Cell Resolution

Author: Adaikkan Chinnakkaruppan
De Jager Philip L.
Gao Fan
Hemberg Martin
Manet Elodie
Mathys Hansruedi
Ransohoff Richard M.
Regev Aviv
Tsai Li-Huei
Young Jennie Zin-Ney
Publication venue: 'Elsevier BV'
Publication date: 01/10/2017

Microglia, the tissue-resident macrophages in the brain, are damage sensors that react to nearly any perturbation, including neurodegenerative diseases such as Alzheimer's disease (AD). Here, using single-cell RNA sequencing, we determined the transcriptome of more than 1,600 individual microglia cells isolated from the hippocampus of a mouse model of severe neurodegeneration with AD-like phenotypes and of control mice at multiple time points during progression of neurodegeneration. In this neurodegeneration model, we discovered two molecularly distinct reactive microglia phenotypes that are typified by modules of co-regulated type I and type II interferon response genes, respectively. Furthermore, our work identified previously unobserved heterogeneity in the response of microglia to neurodegeneration, discovered disease stage-specific microglia cell states, revealed the trajectory of cellular reprogramming of microglia in response to neurodegeneration, and uncovered the underlying transcriptional programs.

Mathys et al. use single-cell RNA sequencing to determine the phenotypic heterogeneity of microglia during the progression of neurodegeneration. They identify multiple disease stage-specific cell states, including two molecularly distinct reactive microglia phenotypes that are typified by modules of co-regulated type I and type II interferon response genes, respectively.

National Institutes of Health (U.S.) (Grant RF1 AG054321

DSpace@MIT

Crossref

Harvard University - DASH

Directory of Open Access Journals

A Dominated Coupling From The Past algorithm for the stochastic simulation of networks of biochemical reactions

Author: A Goldbeter
A Raj
C Gadgil
C Valeriani
CV Rao
DT Gillespie
E Thönnes
HH McAdams
J Paulsson
JA Fill
JG Propp
JL Doob
JM Raser
JR Norris
M Hemberg
M Mitzenmacher
Martin Hemberg
Mauricio Barahona
MB Elowitz
NG van Kampen
O Häggström
S Muller
S Widder
T Lindvall
TS Gardner
WS Kendall
Publication venue: BioMed Central
Publication date: 01/01/2008

Background: In recent years, stochastic descriptions of biochemical reactions based on the Master Equation (ME) have become widespread. These are especially relevant for models involving gene regulation. Gillespie’s Stochastic Simulation Algorithm (SSA) is the most widely used method for the numerical evaluation of these models. The SSA produces exact samples from the distribution of the ME for finite times. However, if the stationary distribution is of interest, the SSA provides no information about convergence or how long the algorithm needs to be run to sample from the stationary distribution with given accuracy.

Results: We present a proof and numerical characterization of a Perfect Sampling algorithm for the ME of networks of biochemical reactions prevalent in gene regulation and enzymatic catalysis. Our algorithm combines the SSA with Dominated Coupling From The Past (DCFTP) techniques to provide guaranteed sampling from the stationary distribution. The resulting DCFTP-SSA is applicable to networks of reactions with uni-molecular stoichiometries and sub-linear, (anti-) monotone propensity functions. We showcase its applicability studying steady-state properties of stochastic regulatory networks of relevance in synthetic and systems biology.

Conclusion: The DCFTP-SSA provides an extension to Gillespie’s SSA with guaranteed sampling from the stationary solution of the ME for a broad class of stochastic biochemical networks.
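For readers unfamiliar with the SSA mentioned in this abstract, the following is a minimal sketch of the plain Gillespie algorithm on a toy birth-death network (constant production, first-order degradation). It is not the DCFTP-SSA of the paper. The function name `gillespie_birth_death`, the rate parameters `k_prod` and `k_deg`, and the numerical values are all illustrative assumptions, not taken from the source.

```python
import random


def gillespie_birth_death(k_prod, k_deg, x0, t_end, seed=None):
    """Simulate one SSA trajectory of a toy birth-death process.

    Reactions (hypothetical example network):
        0 -> X   with propensity k_prod
        X -> 0   with propensity k_deg * x
    Returns the jump times and the molecule counts after each jump.
    """
    rng = random.Random(seed)
    t, x = 0.0, x0
    times, counts = [t], [x]
    while True:
        a_prod = k_prod        # propensity of the production reaction
        a_deg = k_deg * x      # propensity of the degradation reaction
        a_total = a_prod + a_deg
        if a_total == 0.0:
            break              # no reaction can fire any more
        # Waiting time to the next reaction is exponential with rate a_total.
        t += rng.expovariate(a_total)
        if t > t_end:
            break
        # Pick which reaction fires, with probability proportional to its propensity.
        if rng.random() * a_total < a_prod:
            x += 1
        else:
            x -= 1
        times.append(t)
        counts.append(x)
    return times, counts


if __name__ == "__main__":
    times, counts = gillespie_birth_death(k_prod=10.0, k_deg=1.0, x0=0, t_end=50.0, seed=1)
    print(f"{len(times) - 1} reactions fired; final count = {counts[-1]}")
```

For this toy network the stationary distribution is known to be Poisson with mean `k_prod / k_deg`, so a long trajectory can be checked against it. For general networks no such check is available, and the trajectory alone does not say when stationarity has been reached, which is the gap the abstract says the DCFTP-SSA is designed to close.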

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Spiral - Imperial College Digital Repository

Genomic positional conservation identifies topological anchor point (tap)RNAs linked to developmental loci

Author: Amaral P
Arias-Carrasco R
Buscher M
Enright A
Gascoigne D
Han N
Hemberg M
Kouzarides T
Leonardi T
Maracaja-Coutinho V
Nakaya H
Pluchino S
Shiekhattar R
Vire E
Zhang A
Publication venue
Publication date: 29/04/2016

The mammalian genome is transcribed into large numbers of long noncoding RNAs (lncRNAs), but the definition of functional lncRNA groups has proven difficult, partly due to their low sequence conservation and lack of identified shared properties. Here we consider positional conservation across mammalian genomes as an indicator of functional commonality. We identify 665 conserved lncRNA promoters in mouse and human genomes that are preserved in genomic position relative to orthologous coding genes. The identified positionally conserved lncRNA genes are primarily associated with developmental transcription factor loci with which they are co-expressed in a tissue-specific manner. Strikingly, over half of all positionally conserved RNAs in this set are linked to distinct chromatin organization structures, overlapping the binding sites for the CTCF chromatin organizer and located at chromatin loop anchor points and borders of topologically associating domains (TADs). These topological anchor point (tap)RNAs possess conserved sequence domains that are enriched in potential recognition motifs for Zinc Finger proteins. Characterization of these non-coding RNAs and their associated coding genes shows that they are functionally connected: they regulate each other's expression and influence the metastatic phenotype of cancer cells in vitro in a similar fashion. Thus, interrogation of positionally conserved lncRNAs identifies a new subset of tapRNAs with shared functional properties. These results provide a large dataset of lncRNAs that conform to the "extended gene" model, in which conserved developmental genes are genomically and functionally linked to regulatory lncRNA loci across mammalian evolution.

Crossref

UCL Discovery

University of Miami: Scholarship Miami

The Helicase Aquarius/EMB-4 Is Required to Overcome Intronic Barriers to Allow Nuclear RNAi Pathways to Heritably Silence Transcription

Author: Akay A
Berkyurek
Claycomb JM
Di Domenico T
Engelhardt J
Hemberg M
Lamond AI
Larance M
Ma P
Medhi R
Miska
Nabih A
Parada GE
Rudolph
Suen KM
Wedeles CJ
Zhang X
Publication venue: Developmental Cell
Publication date: 01/08/2017

Small RNAs play a crucial role in genome defense against transposable elements and guide Argonaute proteins to nascent RNA transcripts to induce co-transcriptional gene silencing. However, the molecular basis of this process remains unknown. Here, we identify the conserved RNA helicase Aquarius/EMB-4 as a direct and essential link between small RNA pathways and the transcriptional machinery in Caenorhabditis elegans. Aquarius physically interacts with the germline Argonaute HRDE-1. Aquarius is required to initiate small-RNA-induced heritable gene silencing. HRDE-1 and Aquarius silence overlapping sets of genes and transposable elements. Surprisingly, removal of introns from a target gene abolishes the requirement for Aquarius, but not HRDE-1, for small RNA-dependent gene silencing. We conclude that Aquarius allows small RNA pathways to compete for access to nascent transcripts undergoing co-transcriptional splicing in order to detect and silence transposable elements. Thus, Aquarius and HRDE-1 act as gatekeepers coordinating gene expression and genome defense.

A.C.B. was supported by an HFSP grant to E.A.M. (RPG0014/2015). This work was supported by Cancer Research UK (C13474/A18583, C6946/A14492), the Wellcome Trust (104640/Z/14/Z, 092096/Z/10/Z), and The European Research Council (ERC, grant 260688). The work of P.M. and X.Z. is supported by NIH grant R01GM113242 and NIH grant R01GM122080. R.M. was a Commonwealth Scholar, funded by the UK Government. J.M.C., A.N., and C.J.W. were supported by the CIHR (MOP-274660) and the Canada Research Chairs Program. A.I.L. was supported by a Wellcome Trust Programme Grant (108058/Z/15/Z) and M.L. was supported by 2013/RSE/SCOTGOV/ MARIECURIE

Crossref

Apollo (Cambridge)

University of Dundee Online Publications

White Rose Research Online

University of East Anglia digital repository

The Malaria Cell Atlas: single parasite transcriptomes across the complete Plasmodium life cycle

Author: Andrews Tallulah
Berriman Matthew
Billker Oliver
Butungi Hellen
Heaton Haynes
Hemberg Martin
Herren Jeremy K.
Howick Virginia M.
Lawniczak Mara K.N.
Metcalf Tom
Natarajan Kedar
Rayner Julian C.
Reid Adam J.
Russell Andrew J.C.
Talman Arthur M.
Verzier Lisa H.
Publication venue: 'American Association for the Advancement of Science (AAAS)'
Publication date: 23/08/2019

Malaria parasites adopt a remarkable variety of morphological life stages as they transition through multiple mammalian host and mosquito vector environments. We profiled the single-cell transcriptomes of thousands of individual parasites, deriving the first high-resolution transcriptional atlas of the entire life cycle. We then used our atlas to precisely define developmental stages of single cells from three different human malaria parasite species, including parasites isolated directly from infected individuals. The Malaria Cell Atlas provides both a comprehensive view of gene usage in a eukaryotic parasite and an open-access reference dataset for the study of malaria parasites.

Enlighten