16,656 research outputs found

    Regulatory motif discovery using a population clustering evolutionary algorithm

    This paper describes a novel evolutionary algorithm for regulatory motif discovery in DNA promoter sequences. The algorithm uses data clustering to logically distribute the evolving population across the search space. Mating then takes place within local regions of the population, promoting overall solution diversity and encouraging discovery of multiple solutions. Experiments using synthetic data sets have demonstrated the algorithm's capacity to find position frequency matrix models of known regulatory motifs in relatively long promoter sequences. These experiments have also shown the algorithm's ability to maintain diversity during search and discover multiple motifs within a single population. The utility of the algorithm for discovering motifs in real biological data is demonstrated by its ability to find meaningful motifs within muscle-specific regulatory sequences.
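A position frequency matrix, the motif model named in the abstract above, can be illustrated with a minimal sketch. It assumes a set of already aligned, equal-length binding sites; the function names and the example sites are hypothetical and are not taken from the paper, and the evolutionary search itself is not shown.

```python
def position_frequency_matrix(sites, alphabet="ACGT"):
    """Count how often each base occurs at each position of aligned sites."""
    if not sites:
        raise ValueError("at least one site is required")
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    # One row of counts per base, one column per motif position.
    pfm = {base: [0] * width for base in alphabet}
    for site in sites:
        for position, base in enumerate(site.upper()):
            pfm[base][position] += 1
    return pfm


def consensus(pfm):
    """Return the most frequent base at each position."""
    width = len(next(iter(pfm.values())))
    return "".join(max(pfm, key=lambda base: pfm[base][i]) for i in range(width))


# Hypothetical aligned sites, for illustration only.
example_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]
example_pfm = position_frequency_matrix(example_sites)
example_consensus = consensus(example_pfm)
```

Each column of the matrix sums to the number of sites, so dividing by that number would give per-position base frequencies.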

    Genome-Wide Survey of MicroRNA - Transcription Factor Feed-Forward Regulatory Circuits in Human

    In this work, we describe a computational framework for the genome-wide identification and characterization of mixed transcriptional/post-transcriptional regulatory circuits in humans. We concentrated in particular on feed-forward loops (FFLs), in which a master transcription factor regulates a microRNA and, together with it, a set of joint target protein-coding genes. The circuits were assembled with a two-step procedure. We first constructed separately the transcriptional and post-transcriptional components of the human regulatory network by looking for conserved over-represented motifs in human and mouse promoters and 3'-UTRs. Then, we combined the two subnetworks looking for mixed feed-forward regulatory interactions, finding a total of 638 putative (merged) FFLs. In order to investigate their biological relevance, we filtered these circuits using three selection criteria: (I) GeneOntology enrichment among the joint targets of the FFL, (II) independent computational evidence for the regulatory interactions of the FFL, extracted from external databases, and (III) relevance of the FFL in cancer. Most of the selected FFLs seem to be involved in various aspects of organism development and differentiation. We finally discuss a few of the most interesting cases in detail. Comment: 51 pages, 5 figures, 4 tables. Supporting information included. Accepted for publication in Molecular BioSystem
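The second step described in the abstract above, combining the two subnetworks into mixed feed-forward loops, can be sketched roughly as follows. This is only an illustration under stated assumptions: the networks are represented as plain Python sets and dictionaries, every regulator and gene name is hypothetical, and the motif-based network construction and the three filtering criteria are not reproduced.

```python
def find_feed_forward_loops(tf_mirna_edges, tf_targets, mirna_targets,
                            min_joint_targets=1):
    """List (tf, mirna, joint_targets) circuits.

    tf_mirna_edges: set of (tf, mirna) pairs, meaning the TF regulates the miRNA
    tf_targets:     dict mapping a TF to the set of genes it regulates
    mirna_targets:  dict mapping a miRNA to the set of genes it regulates
    """
    loops = []
    for tf, mirna in sorted(tf_mirna_edges):
        # Genes regulated by both the TF and the miRNA close the loop.
        joint = tf_targets.get(tf, set()) & mirna_targets.get(mirna, set())
        if len(joint) >= min_joint_targets:
            loops.append((tf, mirna, sorted(joint)))
    return loops


# Hypothetical toy networks, for illustration only.
toy_tf_mirna = {("TF_A", "miR_x"), ("TF_B", "miR_y")}
toy_tf_targets = {"TF_A": {"gene1", "gene2", "gene3"}, "TF_B": {"gene4"}}
toy_mirna_targets = {"miR_x": {"gene2", "gene3", "gene5"}, "miR_y": {"gene6"}}

toy_loops = find_feed_forward_loops(toy_tf_mirna, toy_tf_targets, toy_mirna_targets)
```

In this toy input only the TF_A/miR_x pair shares targets, so it is the only circuit reported; TF_B and miR_y have no joint target and are dropped.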

    Motif Discovery through Predictive Modeling of Gene Regulation

    We present MEDUSA, an integrative method for learning motif models of transcription factor binding sites by incorporating promoter sequence and gene expression data. We use a modern large-margin machine learning approach, based on boosting, to enable feature selection from the high-dimensional search space of candidate binding sequences while avoiding overfitting. At each iteration of the algorithm, MEDUSA builds a motif model whose presence in the promoter region of a gene, coupled with activity of a regulator in an experiment, is predictive of differential expression. In this way, we learn motifs that are functional and predictive of regulatory response rather than motifs that are simply overrepresented in promoter sequences. Moreover, MEDUSA produces a model of the transcriptional control logic that can predict the expression of any gene in the organism, given the sequence of the promoter region of the target gene and the expression state of a set of known or putative transcription factors and signaling molecules. Each motif model is either a k-length sequence, a dimer, or a PSSM that is built by agglomerative probabilistic clustering of sequences with similar boosting loss. By applying MEDUSA to a set of environmental stress response expression data in yeast, we learn motifs whose ability to predict differential expression of target genes outperforms motifs from the TRANSFAC dataset and from a previously published candidate set of PSSMs. We also show that MEDUSA retrieves many experimentally confirmed binding sites associated with environmental stress response from the literature. Comment: RECOMB 200

    Transcriptional networks specifying homeostatic and inflammatory programs of gene expression in human aortic endothelial cells.

    Endothelial cells (ECs) are critical determinants of vascular homeostasis and inflammation, but transcriptional mechanisms specifying their identities and functional states remain poorly understood. Here, we report a genome-wide assessment of regulatory landscapes of primary human aortic endothelial cells (HAECs) under basal and activated conditions, enabling inference of transcription factor networks that direct homeostatic and pro-inflammatory programs. We demonstrate that 43% of detected enhancers are EC-specific and contain SNPs associated with cardiovascular disease and hypertension. We provide evidence that AP1, ETS, and GATA transcription factors play key roles in HAEC transcription by co-binding enhancers associated with EC-specific genes. We further demonstrate that exposure of HAECs to oxidized phospholipids or pro-inflammatory cytokines results in signal-specific alterations in enhancer landscapes that are associated with coordinated binding of CEBPD, IRF1, and NFκB. Collectively, these findings identify cis-regulatory elements and corresponding trans-acting factors that contribute to EC identity and their specific responses to pro-inflammatory stimuli.

    The dual transcriptional regulator CysR in Corynebacterium glutamicum ATCC 13032 controls a subset of genes of the McbR regulon in response to the availability of sulphide acceptor molecules

    Background: Regulation of sulphur metabolism in Corynebacterium glutamicum ATCC 13032 has been studied intensively in the last few years, due to its industrial as well as scientific importance. Previously, the gene cg0156 was shown to belong to the regulon of McbR, a global transcriptional repressor of sulphur metabolism in C. glutamicum. This gene encodes a putative ROK-type regulator, a paralogue of the activator of sulphonate utilisation, SsuR. Therefore, it is an interesting candidate for study to further the understanding of the regulation of sulphur metabolism in C. glutamicum. Results: Deletion of cg0156, now designated cysR, results in the inability of the mutant to utilise sulphate and aliphatic sulphonates. DNA microarray hybridisations revealed 49 genes with significantly increased and 48 with decreased transcript levels in the presence of the native CysR compared to a cysR deletion mutant. Among the genes positively controlled by CysR were the gene cluster involved in sulphate reduction, fpr2 cysIXHDNYZ, and ssuR. Gel retardation experiments demonstrated that binding of CysR to DNA depends in vitro on the presence of either O-acetyl-L-serine or O-acetyl-L-homoserine. Mapping of the transcription start points of five transcription units helped to identify a 10 bp inverted repeat as the possible CysR binding site. Subsequent in vivo tests proved this motif to be necessary for CysR-dependent transcriptional regulation. Conclusion: CysR acts as the functional analogue of the unrelated LysR-type regulator CysB from Escherichia coli, controlling sulphide production in response to acceptor availability. In both bacteria, gene duplication events seem to have taken place which resulted in the evolution of dedicated regulators for the control of sulphonate utilisation. The striking convergent evolution of network topology indicates the strong selective pressure to control the metabolism of the essential but often toxic sulphur-containing (bio-)molecules.
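To make the notion of a 10 bp inverted repeat concrete, the sketch below scans a DNA string for windows whose left arm equals the reverse complement of the right arm. It assumes two perfect 5 bp arms with no spacer, which the abstract does not specify; the function names and the test sequence are hypothetical, and this is not the mapping procedure used in the study.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_inverted_repeats(seq, arm=5, spacer=0):
    """Return (start, window) for every perfect inverted repeat in seq.

    A window qualifies when its left arm equals the reverse complement
    of its right arm; arm length and spacer are assumptions of this sketch.
    """
    seq = seq.upper()
    width = 2 * arm + spacer
    hits = []
    for start in range(len(seq) - width + 1):
        left = seq[start:start + arm]
        right = seq[start + arm + spacer:start + width]
        if left == reverse_complement(right):
            hits.append((start, seq[start:start + width]))
    return hits


# Hypothetical sequence with one 10 bp inverted repeat embedded at offset 3.
toy_sequence = "GGG" + "ACGTA" + "TACGT" + "CCC"
toy_hits = find_inverted_repeats(toy_sequence)
```

Real binding sites are often imperfect, so a practical search would likely tolerate a small number of mismatches between the arms.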

    An Experimental Framework to Examine the Influence of Promoter Architecture and Genomic Context on Gene Expression

    Transcription is a fundamental process of gene expression. Information stored in DNA is transcribed into different types of mobile RNA, which play a role in various essential processes of the cell, e.g. translation. However, cells do not need all the information stored in their DNA at the same time. Therefore, the process of transcription is regulated by a plethora of mechanisms. One frequently discussed but poorly understood mechanism of transcription regulation is DNA supercoiling [Travers and Muskhelishvili, 2005]. Here, the process of transcription itself affects the DNA topology upstream and downstream of the transcription machinery, as described in the twin supercoiling domain model [Liu and Wang, 1987]. This phenomenon is called Transcription Coupled DNA Supercoiling (TCDS). It has also been shown that genes react individually to changes in DNA supercoiling and that there is a selection pressure on adapting to the DNA supercoiling levels emitted by neighbouring gene expression [Sobetzko, 2016]. The ways in which promoters react to changes in DNA supercoiling are as diverse as the promoters themselves; notably, some promoters seem not to respond to DNA supercoiling at all. This raises the question of which elements within different promoter types cause them to respond to TCDS so differently. In this thesis, I built a pipeline to investigate the effects of TCDS and DNA supercoiling on promoters. Firstly, I created a plasmid toolbox, which allows modular assembly of transcription units. The central feature of this toolbox is the flexibility to test different arrangements of multiple transcription units. I achieved this by adapting the well-established Modular Cloning (MoClo) standard [Weber et al., 2011] and building my toolbox around it. I thus created a system that both works on its own and is compatible with the existing standard MoClo protocol.
In the second part of this thesis, I established an experimental pipeline using synthetic σ70-promoters to investigate the influence of DNA supercoiling on transcription. The experimental setup allowed precise changes in parts of the promoter and at the same time created a library of these promoters. Using this pipeline to investigate the spacer region of the promoter, I was able to confirm that the spacer influences the promoter strength. Further, I showed that the promoter spacer has only a limited effect on the supercoiling sensitivity of a promoter. I also showed that a 5'-TGTG-3' motif in the spacer region could lower transcription by enhancing RNA-polymerase (RNAP)-binding. Moreover, the experimental setup also showed the constraints of using the DNA-relaxing drug novobiocin on a plasmid-based system. Hence, to further investigate the effects of TCDS on neighbouring transcription, I applied an optogenetically controllable promoter to the previously established pipeline. Finally, I began to explore the possibility of integrating my experimental promoter setup into any genomic position. To this end, a CRISPR/Cas9-based homologous recombination system was developed further to make it modular and compatible with the Modular Cloning protocol. I was able to show that the first features of this system work.

    Predicting gene expression in the human malaria parasite Plasmodium falciparum using histone modification, nucleosome positioning, and 3D localization features.

    Empirical evidence suggests that the malaria parasite Plasmodium falciparum employs a broad range of mechanisms to regulate gene transcription throughout the organism's complex life cycle. To better understand this regulatory machinery, we assembled a rich collection of genomic and epigenomic data sets, including information about transcription factor (TF) binding motifs, patterns of covalent histone modifications, nucleosome occupancy, GC content, and global 3D genome architecture. We used these data to train machine learning models to discriminate between high-expression and low-expression genes, focusing on three distinct stages of the red blood cell phase of the Plasmodium life cycle. Our results highlight the importance of histone modifications and 3D chromatin architecture in Plasmodium transcriptional regulation and suggest that AP2 transcription factors may play a limited regulatory role, perhaps operating in conjunction with epigenetic factors.