9,378 research outputs found

    Transcriptional Regulation of Dual-Specificity Phosphatase 4 (Dusp4) by Muscle RING Finger 1 (MuRF1) and Myogenic Regulatory Factors

    Skeletal muscle atrophy can occur at any age and as a result of numerous physiological conditions; thus, it is necessary to better identify the molecular underpinnings of the atrophy cascade so that new therapeutic targets to treat muscle wasting might be identified. MuRF1 was first identified as a marker of skeletal muscle atrophy over a decade ago; however, recent work suggests that this E3 ubiquitin ligase may participate in muscle wasting by regulating the transcriptional activity of genes differentially expressed in response to muscle atrophy. Dusp4, a dual-specificity phosphatase and member of the MAPK cascade, is induced in response to neurogenic atrophy; however, this induction is significantly blunted in MuRF1-null mice, which are resistant to muscle atrophy. The research presented in this thesis aims to characterize the mechanism by which MuRF1 may transcriptionally regulate Dusp4 and to characterize the function of Dusp4 in skeletal muscle

    Transcription Factor-DNA Binding Via Machine Learning Ensembles

    We present ensemble methods in a machine learning (ML) framework combining predictions from five known motif/binding site exploration algorithms. For a given TF, the ensemble starts with position weight matrices (PWMs) for the motif, collected from the component algorithms. Using dimension reduction, we identify significant PWM-based subspaces for analysis. Within each subspace a machine classifier is built for identifying the TF's gene (promoter) targets (Problem 1). These PWM-based subspaces form an ML-based sequence analysis tool. Problem 2 (finding binding motifs) is solved by agglomerating k-mer (string) feature PWM-based subspaces that stand out in identifying gene targets. We approach Problem 3 (binding sites) with a novel machine learning approach that uses promoter string features and ML importance scores in a classification algorithm locating binding sites across the genome. For target gene identification this method improves performance (measured by the F1 score) by about 10 percentage points over (a) the motif scanning method and (b) the coexpression-based association method. The top motif outperformed the 5 component algorithms as well as two other common algorithms (BEST and DEME). For identifying individual binding sites on a benchmark cross-species database (Tompa et al., 2005) we match the best performer without much human intervention. The method also improved performance on mammalian TFs. The ensemble can integrate orthogonal information from different weak learners (potentially using entirely different types of features) into a machine learner that can perform consistently better for more TFs. The TF gene target identification component (Problem 1 above) is useful in constructing a transcriptional regulatory network from known TF-target associations. The ensemble is easily extendable to include more tools as well as future PWM-based information. Comment: 33 pages
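
    The abstract above takes position weight matrices as its starting point and compares against a motif scanning baseline. As a rough illustration of what PWM-based scanning means in general, here is a minimal sketch. It is not the authors' ensemble or any of its component algorithms. The function names, the pseudocount, and the uniform background frequency of 0.25 are all assumptions made for this example, and the input sequence is assumed to contain only uppercase A, C, G, and T.

```python
import math


def pwm_from_counts(counts, pseudocount=1.0, background=0.25):
    """Turn per-position base counts into a log-odds PWM.

    `counts` is a list of dicts, one per motif position, mapping a base to
    how often it was observed there. The pseudocount and the uniform
    background are illustrative assumptions, not values from the abstract.
    """
    pwm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pwm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def scan(sequence, pwm):
    """Score every window of the sequence; return (offset, score) pairs."""
    width = len(pwm)
    return [
        (i, sum(pwm[j][sequence[i + j]] for j in range(width)))
        for i in range(len(sequence) - width + 1)
    ]


def best_hit(sequence, pwm):
    """Return the (offset, score) pair with the highest log-odds score."""
    return max(scan(sequence, pwm), key=lambda hit: hit[1])


# Hypothetical toy motif "TATA", observed 10 times at every position.
toy_counts = [{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}]
toy_pwm = pwm_from_counts(toy_counts)
offset, score = best_hit("GGTATAGG", toy_pwm)
```

    A scanning baseline of this kind would typically call a window a putative site when its score exceeds some threshold. The ensemble described in the abstract instead builds classifiers on PWM-derived features.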

    cis-Regulatory sequences driving the expression of the Hbox12 homeobox-containing gene in the presumptive aboral ectoderm territory of the Paracentrotus lividus sea urchin embryo.

    Embryonic development is coordinated by networks of evolutionarily conserved regulatory genes encoding transcription factors and components of cell signalling pathways. In the sea urchin embryo, a number of genes encoding transcription factors display territorially restricted expression. Among these, the zygotic Hbox12 homeobox gene is transiently transcribed in a limited number of cells of the animal-lateral half of the early Paracentrotus lividus embryo, whose descendants will constitute part of the ectoderm territory. To obtain insights into the regulation of Hbox12 expression, we have explored the cis-regulatory apparatus of the gene. In this paper, we show that the intergenic region of the tandem Hbox12 repeats drives GFP expression in the presumptive aboral ectoderm and that a 234 bp fragment, defined as the aboral ectoderm (AE) module, accounts for the restricted expression of the transgene. Within this module, a consensus sequence for a Sox factor and the binding of the Otx activator are both required for correct Hbox12 gene expression. Spatial restriction to the aboral ectoderm is achieved by a combination of different repressive sequence elements. Negative sequence elements necessary for repression in the endomesoderm map within the most upstream 60 bp region and near the Sox binding site. Strikingly, a Myb-like consensus is necessary for repression in the oral ectoderm, while down-regulation at the gastrula stage depends on a GA-rich region. These results suggest a role for Hbox12 in aboral ectoderm specification and represent our first attempt at identifying the gene regulatory circuits involved in this process

    The Partial Order Kernel and its Application to Understanding the Regulatory Grammar of Conserved Non-coding Elements

    PhD thesis. Conserved non-coding elements (CNEs) are regions of non-coding DNA which have remained evolutionarily conserved across various species over millions of years and are found to cluster near genes involved in early embryonic development, suggesting that they play an important role as regulatory elements. Indeed, many CNEs have been shown to act as enhancers; however, not all regulatory elements are conserved and in some cases, deletion of CNEs did not result in any notable phenotypes. These opposing findings indicate that the functions of CNEs are still poorly understood and further research on these elements is needed to uncover the reasons for their extreme conservation. The aim of this thesis is to investigate the use and development of algorithms for decoding the regulatory grammar of CNEs. Initially, an assessment of several methods for functional classification of CNEs is provided. The results obtained using these methods are validated by functional assays and their limitations in capturing the grammar of CNEs are discussed. Motivated by these limitations, a partial order graph representation of the sequence of transcription factor binding sites (TFBSs) in a CNE that allows efficient handling of the overlapping sites is introduced. A dynamic programming-based method for aligning two such graphs and identifying regulatory signatures composed of co-occurring TFBSs is proposed and evaluated. The results demonstrate the predictive ability of this method, which can be used to prioritise regions for experimental validation. Building on this method, the partial order kernel (POKer) for comparison of strings containing alternative substrings and represented by partial order graphs is introduced. The POKer is evaluated in different sequence comparison tasks, including visual localisation. An approach using the POKer for functional classification of CNEs is introduced and its effectiveness in capturing the grammar of CNEs is demonstrated. Finally, the implications of the results presented in this work for modelling the evolution of CNEs are discussed

    CLIMP: Clustering Motifs via Maximal Cliques with Parallel Computing Design.

    A set of conserved binding sites recognized by a transcription factor is called a motif, which can be found by many applications of comparative genomics for identifying over-represented segments. Moreover, when numerous putative motifs are predicted from a collection of genome-wide data, their similarity data can be represented as a large graph, where these motifs are connected to one another. However, an efficient clustering algorithm is desired for clustering the motifs that belong to the same groups and separating the motifs that belong to different groups, or even deleting an amount of spurious ones. In this work, a new motif clustering algorithm, CLIMP, is proposed by using maximal cliques and sped up by parallelizing its program. When a synthetic motif dataset from the database JASPAR, a set of putative motifs from a phylogenetic foot-printing dataset, and a set of putative motifs from a ChIP dataset are used to compare the performances of CLIMP and two other high-performance algorithms, the results demonstrate that CLIMP mostly outperforms the two algorithms on the three datasets for motif clustering, so that it can be a useful complement of the clustering procedures in some genome-wide motif prediction pipelines. CLIMP is available at http://sqzhang.cn/climp.html
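
    The abstract above describes representing motif similarities as a graph and clustering with maximal cliques. The following is a minimal sketch of that general idea only, using the textbook Bron-Kerbosch algorithm with pivoting. It is not CLIMP's implementation, it does not include any parallelization, and the function names, the similarity scores, and the threshold are hypothetical values chosen for illustration.

```python
def similarity_graph(similarity, threshold):
    """Build an undirected graph linking motifs whose similarity >= threshold.

    `similarity` maps (motif_a, motif_b) pairs to a score. The scoring
    scheme and the threshold are assumptions made for this sketch.
    """
    nodes = {motif for pair in similarity for motif in pair}
    adjacency = {motif: set() for motif in nodes}
    for (a, b), score in similarity.items():
        if a != b and score >= threshold:
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency


def maximal_cliques(adjacency):
    """Enumerate all maximal cliques with Bron-Kerbosch and pivoting."""
    cliques = []

    def expand(current, candidates, excluded):
        if not candidates and not excluded:
            cliques.append(current)  # nothing can extend it: maximal
            return
        # Pivot on the vertex with the most neighbours among the candidates
        # so that fewer branches need to be explored.
        pivot = max(candidates | excluded,
                    key=lambda v: len(adjacency[v] & candidates))
        for v in list(candidates - adjacency[pivot]):
            expand(current | {v},
                   candidates & adjacency[v],
                   excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return cliques


# Hypothetical pairwise similarities between five putative motifs.
toy_similarity = {
    ("m1", "m2"): 0.90, ("m2", "m3"): 0.80, ("m1", "m3"): 0.85,
    ("m3", "m4"): 0.20, ("m4", "m5"): 0.95,
}
toy_graph = similarity_graph(toy_similarity, threshold=0.7)
toy_clusters = sorted(sorted(clique) for clique in maximal_cliques(toy_graph))
```

    In this toy input the weak m3-m4 link falls below the threshold, so the motifs separate into two groups. A motif left with no edges would come out as a singleton clique, which a pipeline could treat as a candidate for removal as spurious.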

    Modeling the Evolution of Regulatory Elements by Simultaneous Detection and Alignment with Phylogenetic Pair HMMs

    The computational detection of regulatory elements in DNA is a difficult but important problem impacting our progress in understanding the complex nature of eukaryotic gene regulation. Attempts to utilize cross-species conservation for this task have been hampered both by evolutionary changes of functional sites and poor performance of general-purpose alignment programs when applied to non-coding sequence. We describe a new and flexible framework for modeling binding site evolution in multiple related genomes, based on phylogenetic pair hidden Markov models which explicitly model the gain and loss of binding sites along a phylogeny. We demonstrate the value of this framework for both the alignment of regulatory regions and the inference of precise binding-site locations within those regions. As the underlying formalism is a stochastic, generative model, it can also be used to simulate the evolution of regulatory elements. Our implementation is scalable in terms of numbers of species and sequence lengths and can produce alignments and binding-site predictions with accuracy rivaling or exceeding current systems that specialize in only alignment or only binding-site prediction. We demonstrate the validity and power of various model components on extensive simulations of realistic sequence data and apply a specific model to study Drosophila enhancers in as many as ten related genomes and in the presence of gain and loss of binding sites. Different models and modeling assumptions can be easily specified, thus providing an invaluable tool for the exploration of biological hypotheses that can drive improvements in our understanding of the mechanisms and evolution of gene regulation