88 research outputs found
A GO catalogue of human DNA-binding transcription factors
To control gene transcription, DNA-binding transcription factors recognise specific sequence motifs in gene regulatory regions. A complete and reliable GO annotation of all DNA-binding transcription factors is key to investigating the delicate balance of gene regulation in response to environmental and developmental stimuli. The need for such information is demonstrated by the many lists of transcription factors that have been produced over the past decade. The COST Action Gene Regulation Ensemble Effort for the Knowledge Commons (GREEKC) Consortium brought together experts in the field of transcription with the aim of providing high-quality and interoperable gene regulatory data. The Gene Ontology (GO) Consortium provides strict definitions for gene product function, including factors that regulate transcription. The collaboration between the GREEKC and GO Consortia has enabled the application of those definitions to produce a new curated catalogue of over 1400 human DNA-binding transcription factors, which can be accessed at https://www.ebi.ac.uk/QuickGO/targetset/dbTF. This catalogue has facilitated an improvement in the GO annotation of human DNA-binding transcription factors and led to the GO annotation of almost sixty thousand DNA-binding transcription factors in over a hundred species. Thus, this work will aid researchers investigating the regulation of transcription in both biomedical and basic science.
Vitamin D receptor ChIP-seq in primary CD4+ cells: relationship to serum 25-hydroxyvitamin D levels and autoimmune disease
PMCID: PMC3710212. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
GRISOTTO: A greedy approach to improve combinatorial algorithms for motif discovery with prior knowledge
Background: Position-specific priors (PSPs) have been used with success to boost EM and Gibbs sampler-based motif discovery algorithms. PSP information has been computed from different sources, including orthologous conservation, DNA duplex stability, and nucleosome positioning. Prior information has not yet been used in the context of combinatorial algorithms. Moreover, priors have been used only independently, and the gain of combining priors from different sources has not yet been studied. Results: We extend RISOTTO, a combinatorial algorithm for motif discovery, by post-processing its output with a greedy procedure that uses prior information. PSPs from different sources are combined into a scoring criterion that guides the greedy search procedure. The resulting method, called GRISOTTO, was evaluated over 156 yeast TF ChIP-chip sequence-sets commonly used to benchmark prior-based motif discovery algorithms. Results show that GRISOTTO is at least as accurate as twelve other state-of-the-art approaches for the same task, even without combining priors. Furthermore, by considering combined priors, GRISOTTO is considerably more accurate than the state-of-the-art approaches for the same task. We also show that PSPs improve GRISOTTO's ability to retrieve motifs from mouse ChIP-seq data, indicating that the proposed algorithm can be applied to data from a different technology and for a higher eukaryote. Conclusions: The conclusions of this work are twofold. First, post-processing the output of combinatorial algorithms by incorporating prior information leads to a very efficient and effective motif discovery method. Second, combining priors from different sources is even more beneficial than considering them separately.
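The idea of combining priors and refining a seed motif greedily can be illustrated with a minimal sketch. This is not GRISOTTO's actual scoring criterion or search procedure, which the abstract does not specify: the weighted-average combination, the match-fraction score, and all function names (`combine_priors`, `score`, `greedy_refine`) are assumptions made for illustration only.

```python
from typing import List, Sequence

ALPHABET = "ACGT"

def combine_priors(priors: Sequence[List[List[float]]],
                   weights: Sequence[float]) -> List[List[float]]:
    """Weighted average of several position-specific priors.

    priors[k][s][i] is the prior from source k that a motif starts at
    position i of sequence s. The weighted average is an assumption.
    """
    total = sum(weights)
    combined = []
    for s in range(len(priors[0])):
        length = len(priors[0][s])
        combined.append([
            sum(w * p[s][i] for w, p in zip(weights, priors)) / total
            for i in range(length)
        ])
    return combined

def score(motif: str, seqs: List[str], prior: List[List[float]]) -> float:
    """Sum over sequences of the best prior-weighted match fraction (hypothetical criterion)."""
    k = len(motif)
    total = 0.0
    for seq, pri in zip(seqs, prior):
        best = 0.0
        for i in range(len(seq) - k + 1):
            matches = sum(a == b for a, b in zip(motif, seq[i:i + k]))
            best = max(best, pri[i] * matches / k)
        total += best
    return total

def greedy_refine(seed: str, seqs: List[str], prior: List[List[float]]) -> str:
    """Hill-climb from a seed motif (e.g. the output of a combinatorial search).

    Each round applies the single-letter substitution that most improves
    the score, and stops when no substitution improves it.
    """
    current, current_score = seed, score(seed, seqs, prior)
    while True:
        best, best_score = current, current_score
        for pos in range(len(current)):
            for letter in ALPHABET:
                if letter == current[pos]:
                    continue
                cand = current[:pos] + letter + current[pos + 1:]
                cand_score = score(cand, seqs, prior)
                if cand_score > best_score:
                    best, best_score = cand, cand_score
        if best == current:
            return current
        current, current_score = best, best_score
```

In this sketch, priors from different sources (for example conservation and nucleosome positioning) would first be merged with `combine_priors`, and the merged prior would then guide `greedy_refine` starting from each candidate motif.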
An Integrated Pipeline for the Genome-Wide Analysis of Transcription Factor Binding Sites from ChIP-Seq
ChIP-Seq has become the standard method for genome-wide profiling of the DNA
association of transcription factors. To simplify analyzing and interpreting ChIP-Seq data,
which typically involves using multiple applications, we describe an integrated,
open source, R-based analysis pipeline. The pipeline addresses data input, peak
detection, sequence and motif analysis, visualization, and data export, and can
readily be extended via other R and Bioconductor packages. Using a standard
multicore computer, it can be used with datasets consisting of tens of thousands
of enriched regions. We demonstrate its effectiveness on published human
ChIP-Seq datasets for FOXA1, ER, CTCF and STAT1, where it detected co-occurring
motifs that were consistent with the literature but not detected by other
methods. Our pipeline provides the first complete set of Bioconductor tools for
sequence and motif analysis of ChIP-Seq and ChIP-chip data.
Computational analysis of the evolutionarily conserved Missing In Metastasis/Metastasis Suppressor 1 gene predicts novel interactions, regulatory regions and transcriptional control
Missing in Metastasis (MIM), or Metastasis Suppressor 1 (MTSS1), is a highly conserved protein that links the plasma membrane to the actin cytoskeleton. MIM has been implicated in various cancers; however, its modes of action remain largely enigmatic. Here, we performed an extensive in silico characterisation of MIM to gain a better understanding of its function. We detected previously unappreciated functional motifs, including an adaptor protein (AP) complex interaction site and a C-helix, pointing to a role in endocytosis and regulation of actin dynamics, respectively. We also identified new functional regions, characterised by phosphorylation sites or distinct hydrophilic properties. Strong negative selection during evolution, yielding high conservation of MIM, has been combined with positive selection at key sites. Interestingly, our analysis of intra-molecular co-evolution revealed potential regulatory hotspots that coincided with reduced potentially pathogenic polymorphisms. We explored databases for the mutations and expression levels of MIM in cancer. Experimentally, we focused on chronic lymphocytic leukaemia (CLL), where MIM showed high overall expression but was downregulated in poor-prognosis samples. Finally, we propose strong conservation of MTSS1 also on the transcriptional level and predict novel transcriptional regulators. Our data highlight important targets for future studies on the role of MIM in different tissues and cancers.
APOBEC signature mutation generates an oncogenic enhancer that drives LMO1 expression in T-ALL
Oncogenic driver mutations are those that provide a proliferative or survival advantage to neoplastic cells, resulting in clonal selection. Although most cancer-causing mutations have been detected in the protein-coding regions of the cancer genome, driver mutations have recently also been discovered within noncoding genomic sequences. Thus, a current challenge is to gain a precise understanding of how these unique genomic elements function in cancer pathogenesis, while clarifying mechanisms of gene regulation and identifying new targets for therapeutic intervention. Here we report a C-to-T single nucleotide transition that occurs as a somatic mutation in noncoding sequences 4 kb upstream of the transcriptional start site of the LMO1 oncogene in primary samples from patients with T-cell acute lymphoblastic leukaemia. This single nucleotide alteration conforms to an APOBEC-like cytidine deaminase mutational signature, and generates a new binding site for the MYB transcription factor, leading to the formation of an aberrant transcriptional enhancer complex that drives high levels of expression of the LMO1 oncogene. Since APOBEC-signature mutations are common in a broad spectrum of human cancers, we suggest that noncoding nucleotide transitions such as the one described here may activate potent oncogenic enhancers not only in T-lymphoid cells but in other cell lineages as well.
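How a single C-to-T transition can create a new transcription factor binding site can be sketched by scanning a reference and a mutant sequence for a consensus motif on both strands. The sketch below is illustrative only: `YAACNG` is used as a commonly cited simplification of the MYB consensus (an assumption, not taken from the abstract), the two sequences are hypothetical and are not the LMO1 locus, and the function names are invented for this example.

```python
import re
from typing import Set, Tuple

# IUPAC nucleotide codes mapped to regular-expression character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def find_sites(seq: str, consensus: str) -> Set[Tuple[str, int]]:
    """Return (strand, start) for every consensus match.

    Starts are 0-based coordinates on the forward strand, for both strands.
    """
    k = len(consensus)
    # A lookahead makes overlapping matches visible to finditer.
    pattern = re.compile("(?=" + "".join(IUPAC[c] for c in consensus) + ")")
    sites = {("+", m.start()) for m in pattern.finditer(seq)}
    rc = revcomp(seq)
    sites |= {("-", len(seq) - m.start() - k) for m in pattern.finditer(rc)}
    return sites

def gained_sites(ref: str, alt: str, consensus: str) -> Set[Tuple[str, int]]:
    """Sites present in the mutant sequence but absent from the reference."""
    return find_sites(alt, consensus) - find_sites(ref, consensus)

# Hypothetical sequences differing by one C-to-T transition at index 7.
REF = "GGACCGTCAGGT"
ALT = "GGACCGTTAGGT"
MYB_CONSENSUS = "YAACNG"  # assumed simplification of the MYB motif

if __name__ == "__main__":
    print(gained_sites(REF, ALT, MYB_CONSENSUS))
```

A consensus match of this kind only suggests a candidate site; it says nothing about whether the factor actually binds, which in the study above is a matter of experimental evidence.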
- …