1,866 research outputs found

    Binding Site Graphs: A New Graph Theoretical Framework for Prediction of Transcription Factor Binding Sites

    Computational prediction of nucleotide binding specificity for transcription factors remains a fundamental and largely unsolved problem. Determination of binding positions is a prerequisite for research in gene regulation, a major mechanism controlling phenotypic diversity. Furthermore, an accurate determination of binding specificities from high-throughput data sources is necessary to realize the full potential of systems biology. Unfortunately, a recently performed independent evaluation showed that more than half of the predictions from the most widely used algorithms are false. We introduce a graph-theoretical framework to describe local sequence similarity as the pair-wise distances between nucleotides in promoter sequences, and hypothesize that densely connected subgraphs are indicative of transcription factor binding sites. Using a well-established sampling algorithm coupled with simple clustering and scoring schemes, we identify sets of closely related nucleotides and test those for known TF binding activity. Using an independent benchmark, we find our algorithm predicts yeast binding motifs considerably better than currently available techniques and without manual curation. Importantly, we reduce the number of false positive predictions in yeast to less than 30%. We also develop a framework to evaluate the statistical significance of our motif predictions. We show that our approach is robust to the choice of input promoters, and thus can be used in the context of predicting binding positions from noisy experimental data. We apply our method to identify binding sites using data from genome scale ChIP–chip experiments. Results from these experiments are publicly available at http://cagt10.bu.edu/BSG. The graphical framework developed here may be useful when combining predictions from numerous computational and experimental measures. Finally, we discuss how our algorithm can be used to improve the sensitivity of computational predictions of transcription factor binding specificities.
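
    The abstract above does not specify how the graph is built, so the following is only an illustrative sketch of the general idea (similar promoter positions become connected nodes, and well-connected nodes are candidate binding sites). It assumes fixed-length windows compared by Hamming distance, and uses a plain degree threshold as a crude stand-in for dense-subgraph detection. The function names and the parameters `k`, `max_mismatch`, and `min_degree` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def hamming(a, b):
    """Number of mismatching characters between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def build_site_graph(seqs, k=6, max_mismatch=1):
    """Build an adjacency map over (sequence index, start position) nodes.

    Assumption: two windows of length k from DIFFERENT sequences are linked
    when they differ in at most max_mismatch positions.
    """
    nodes = [(i, p) for i, s in enumerate(seqs) for p in range(len(s) - k + 1)]
    adj = {n: set() for n in nodes}
    for (i, p), (j, q) in combinations(nodes, 2):
        if i == j:
            continue  # ignore similarity within a single promoter
        if hamming(seqs[i][p:p + k], seqs[j][q:q + k]) <= max_mismatch:
            adj[(i, p)].add((j, q))
            adj[(j, q)].add((i, p))
    return adj


def dense_nodes(adj, min_degree):
    """Return nodes with at least min_degree neighbours (a simple density proxy)."""
    return sorted(n for n, nbrs in adj.items() if len(nbrs) >= min_degree)


# Toy promoters that share one exact 7-mer (TGACTCA) at different offsets.
promoters = ["AAATGACTCAGGG", "CCTGACTCATTTT", "GGGGTGACTCACC"]
graph = build_site_graph(promoters, k=7, max_mismatch=0)
candidates = dense_nodes(graph, min_degree=2)
print(candidates)
```

    On the toy input, only the three windows covering the shared 7-mer are linked to a window in each of the other two sequences, so they are the only nodes that pass the degree threshold.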

    Chromatin environment and cellular context specify compensatory activity of paralogous MEF2 transcription factors

    Compensation among paralogous transcription factors (TFs) confers genetic robustness of cellular processes, but how TFs dynamically respond to paralog depletion on a genome-wide scale in vivo remains incompletely understood. Using single and double conditional knockout of myocyte enhancer factor 2 (MEF2) family TFs in granule neurons of the mouse cerebellum, we find that MEF2A and MEF2D play functionally redundant roles in cerebellar-dependent motor learning. Although both TFs are highly expressed in granule neurons, transcriptomic analyses show that MEF2D is the predominant genomic regulator of gene expression in vivo. Strikingly, genome-wide occupancy analyses reveal that, upon depletion of MEF2D, MEF2A occupancy robustly increases at a subset of sites normally bound by MEF2D. Importantly, sites experiencing compensatory MEF2A occupancy are concentrated within open chromatin and undergo functional compensation for genomic activation and gene expression. Finally, motor activity induces a switch from non-compensatory to compensatory MEF2-dependent gene regulation. These studies uncover genome-wide functional interdependency between paralogous TFs in the brain.

    Dynamic DNA methylation across diverse human cell lines and tissues

    As studies of DNA methylation increase in scope, it has become evident that methylation has a complex relationship with gene expression, plays an important role in defining cell types, and is disrupted in many diseases. We describe large-scale single-base resolution DNA methylation profiling on a diverse collection of 82 human cell lines and tissues using reduced representation bisulfite sequencing (RRBS). Analysis integrating RNA-seq and ChIP-seq data illuminates the functional role of this dynamic mark. Loci that are hypermethylated across cancer types are enriched for sites bound by NANOG in embryonic stem cells, which supports and expands the model of a stem/progenitor cell signature in cancer. CpGs that are hypomethylated across cancer types are concentrated in megabase-scale domains that occur near the telomeres and centromeres of chromosomes, are depleted of genes, and are enriched for cancer-specific EZH2 binding and H3K27me3 (repressive chromatin). In noncancer samples, there are cell-type specific methylation signatures preserved in primary cell lines and tissues as well as methylation differences induced by cell culture. The relationship between methylation and expression is context-dependent, and we find that CpG-rich enhancers bound by EP300 in the bodies of expressed genes are unmethylated despite the dense gene-body methylation surrounding them. Non-CpG cytosine methylation occurs in human somatic tissue, is particularly prevalent in brain tissue, and is reproducible across many individuals. This study provides an atlas of DNA methylation across diverse and well-characterized samples and enables new discoveries about DNA methylation and its role in gene regulation and disease.

    Short-Term Retinoic Acid Treatment Increases In Vivo, but Decreases In Vitro, Epidermal Transglutaminase-K Enzyme Activity and Immunoreactivity

    Epidermal transglutaminase-K is believed to catalyze the covalent linking of loricrin and involucrin to form cross-linked envelopes (CE). In normal skin, transglutaminase-K is expressed as a band immediately below the stratum corneum, whereas in psoriasis and healing skin its expression is considerably expanded throughout the suprabasal layers. We have investigated whether the hyperproliferative state induced by short-term application of topical retinoic acid is similarly characterized by an increase in transglutaminase-K enzyme activity and immunoreactivity. Retinoic acid (0.1% cream) or vehicle was applied to human skin and occluded for 4 d. Skin biopsies were obtained for measurement of transglutaminase-K and transglutaminase-C activity and immunoreactivity. For comparison, cultured normal human keratinocytes were incubated for 4 d in the presence of 1 μM retinoic acid and the subsequent transglutaminase-K activity and immunoreactivity measured. Transglutaminase-K activity was increased 2.8 times in retinoic acid-treated compared to vehicle-treated skin (p < 0.005, n = 12), whereas there was no significant difference in transglutaminase-C activity. However, transglutaminase-K mRNA levels were not significantly different between retinoic acid- and vehicle-treated skin. In vehicle-treated skin, transglutaminase-K immunoreactivity was limited to a narrow, substratum corneal band, but was considerably expanded in a diffuse suprabasal pattern in retinoic acid-treated epidermis. In contrast, transglutaminase-K immunostaining was decreased and its enzymatic activity reduced sixfold in retinoic acid-treated keratinocytes (p < 0.01, n = 4). These results demonstrate that retinoic acid treatment in vivo, in contrast to in vitro, leads to not only increased transglutaminase-K protein expression but also increased enzymatic activity in the absence of detectable increases in mRNA levels. These data, taken with the previously reported lack of in vivo modulation of the differentiation markers keratins 1 and 10 by retinoic acid, indicate that certain aspects of keratinocyte terminal differentiation that are altered in vitro by retinoic acid do not occur in vivo in human skin.

    An accessible proteogenomics informatics resource for cancer researchers

    Proteogenomics has emerged as a valuable approach in cancer research, which integrates genomic and transcriptomic data with mass spectrometry–based proteomics data to directly identify expressed, variant protein sequences that may have functional roles in cancer. This approach is computationally intensive, requiring integration of disparate software tools into sophisticated workflows, challenging its adoption by nonexpert, bench scientists. To address this need, we have developed an extensible, Galaxy-based resource aimed at providing more researchers access to, and training in, proteogenomic informatics. Our resource brings together software from several leading research groups to address two foundational aspects of proteogenomics: (i) generation of customized, annotated protein sequence databases from RNA-Seq data; and (ii) accurate matching of tandem mass spectrometry data to putative variants, followed by filtering to confirm their novelty. Directions for accessing software tools and workflows, along with instructional documentation, can be found at z.umn.edu/canresgithub.

    Impact of the Deepwater Horizon oil spill on a deep-water coral community in the Gulf of Mexico

    To assess the potential impact of the Deepwater Horizon oil spill on offshore ecosystems, 11 sites hosting deep-water coral communities were examined 3 to 4 mo after the well was capped. Healthy coral communities were observed at all sites >20 km from the Macondo well, including seven sites previously visited in September 2009, where the corals and communities appeared unchanged. However, at one site 11 km southwest of the Macondo well, coral colonies presented widespread signs of stress, including varying degrees of tissue loss, sclerite enlargement, excess mucous production, bleached commensal ophiuroids, and covering by brown flocculent material (floc). On the basis of these criteria the level of impact to individual colonies was ranked from 0 (least impact) to 4 (greatest impact). Of the 43 corals imaged at that site, 46% exhibited evidence of impact on more than half of the colony, whereas nearly a quarter of all of the corals showed impact to >90% of the colony. Additionally, 53% of these corals' ophiuroid associates displayed abnormal color and/or attachment posture. Analysis of hopanoid petroleum biomarkers isolated from the floc provides strong evidence that this material contained oil from the Macondo well. The presence of recently damaged and deceased corals beneath the path of a previously documented plume emanating from the Macondo well provides compelling evidence that the oil impacted deep-water ecosystems. Our findings underscore the unprecedented nature of the spill in terms of its magnitude, release at depth, and impact to deep-water ecosystems.

    Positional clustering improves computational binding site detection and identifies novel cis-regulatory sites in mammalian GABA(A) receptor subunit genes

    Understanding transcription factor (TF) mediated control of gene expression remains a major challenge at the interface of computational and experimental biology. Computational techniques predicting TF-binding site specificity are frequently unreliable. On the other hand, comprehensive experimental validation is difficult and time consuming. We introduce a simple strategy that dramatically improves robustness and accuracy of computational binding site prediction. First, we evaluate the rate of recurrence of computational TFBS predictions by commonly used sampling procedures. We find that the vast majority of results are biologically meaningless. However, clustering results based on nucleotide position improves predictive power. Additionally, we find that positional clustering increases robustness to long or imperfectly selected input sequences. Positional clustering can also be used as a mechanism to integrate results from multiple sampling approaches for improvements in accuracy over each one alone. Finally, we predict and validate regulatory sequences partially responsible for transcriptional control of the mammalian type A γ-aminobutyric acid receptor (GABA(A)R) subunit genes. Positional clustering is useful for improving computational binding site predictions, with potential application to improving our understanding of mammalian gene expression. In particular, predicted regulatory mechanisms in the mammalian GABA(A)R subunit gene family may open new avenues of research towards understanding this pharmacologically important neurotransmitter receptor system.
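
    As a rough illustration of the positional clustering idea described above (grouping predictions by nucleotide position and keeping those that recur across sampler runs), the sketch below merges predicted site starts that fall close together on the same sequence and reports the fraction of runs supporting each group. The input layout, the function name, and the `window` and `min_support` parameters are assumptions made for this example; the abstract does not give the actual procedure or thresholds.

```python
from collections import defaultdict


def positional_clusters(runs, window=5, min_support=0.5):
    """Cluster predicted site starts by position across repeated sampler runs.

    runs: list of runs, each a list of (seq_id, start) predictions.
    Returns (seq_id, first_start, last_start, support) for every cluster
    seen in at least min_support of the runs.
    """
    n_runs = len(runs)
    by_seq = defaultdict(list)
    for run_idx, run in enumerate(runs):
        for seq_id, start in run:
            by_seq[seq_id].append((start, run_idx))

    kept = []
    for seq_id, hits in by_seq.items():
        hits.sort()
        groups = [[hits[0]]]
        for hit in hits[1:]:
            # Extend the current group while consecutive starts stay close.
            if hit[0] - groups[-1][-1][0] <= window:
                groups[-1].append(hit)
            else:
                groups.append([hit])
        for group in groups:
            # Support = fraction of distinct runs contributing to the group.
            support = len({run_idx for _, run_idx in group}) / n_runs
            if support >= min_support:
                kept.append((seq_id, group[0][0], group[-1][0], support))
    return sorted(kept)


# Four hypothetical sampler runs over two promoters.
runs = [
    [("geneA", 100), ("geneA", 300)],
    [("geneA", 102)],
    [("geneA", 98), ("geneB", 50)],
    [("geneA", 101)],
]
clusters = positional_clusters(runs, window=5, min_support=0.5)
print(clusters)
```

    In this toy input, the predictions near position 100 on geneA recur in every run and survive, while the isolated predictions at geneA:300 and geneB:50 each appear in a single run and are filtered out. The same grouping could in principle pool predictions from different samplers by treating each tool's output as a separate run.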

    A standardized kinesin nomenclature

    In recent years the kinesin superfamily has become so large that several different naming schemes have emerged, leading to confusion and miscommunication. Here, we set forth a standardized kinesin nomenclature based on 14 family designations. The scheme unifies all previous phylogenies and nomenclature proposals, while allowing individual sequence names to remain the same and allowing for expansion as new sequences are discovered.