
    Inferring transcription factor complexes from ChIP-seq data

    Chromatin immunoprecipitation followed by high-throughput sequencing (ChIP-seq) allows researchers to determine the genome-wide binding locations of individual transcription factors (TFs) at high resolution. This information can be interrogated to study various aspects of TF behaviour, including the mechanisms that control TF binding. Physical interaction between TFs comprises one important aspect of TF binding in eukaryotes, mediating tissue-specific gene expression. We have developed an algorithm, spaced motif analysis (SpaMo), which is able to infer physical interactions between the given TF and TFs bound at neighbouring sites at the DNA interface. The algorithm predicts TF interactions in half of the ChIP-seq data sets we test, with the majority of these predictions supported by direct evidence from the literature or evidence of homodimerization. High resolution motif spacing information obtained by this method can facilitate an improved understanding of individual TF complex structures. SpaMo can assist researchers in extracting maximum information relating to binding mechanisms from their TF ChIP-seq data. SpaMo is available for download and interactive use as part of the MEME Suite (http://meme.nbcr.net)

    A global role for KLF1 in erythropoiesis revealed by ChIP-seq in primary erythroid cells

    KLF1 regulates a diverse suite of genes to direct erythroid cell differentiation from bipotent progenitors. To determine the local cis-regulatory contexts and transcription factor networks in which KLF1 operates, we performed KLF1 ChIP-seq in the mouse. We found at least 945 sites in the genome of E14.5 fetal liver erythroid cells which are occupied by endogenous KLF1. Many of these recovered sites reside in erythroid gene promoters such as Hbb-b1, but the majority are distant to any known gene. Our data suggest that KLF1 directly regulates most aspects of terminal erythroid differentiation, including production of alpha- and beta-globin protein chains, heme biosynthesis, coordination of proliferation and anti-apoptotic pathways, and construction of the red cell membrane and cytoskeleton, by functioning primarily as a transcriptional activator. Additionally, we suggest new mechanisms for KLF1 cooperation with other transcription factors, in particular the erythroid transcription factor GATA1, to maintain homeostasis in the erythroid compartment.

    High-throughput chromatin information enables accurate tissue-specific prediction of transcription factor binding sites

    In silico prediction of transcription factor binding sites (TFBSs) is central to the task of gene regulatory network elucidation. Genomic DNA sequence information provides a basis for these predictions, due to the sequence specificity of TF-binding events. However, DNA sequence alone is an impoverished source of information for the task of TFBS prediction in eukaryotes, as additional factors, such as chromatin structure, regulate binding events. We show that incorporating high-throughput chromatin modification estimates can greatly improve the accuracy of in silico prediction of in vivo binding for a wide range of TFs in human and mouse. This improvement is superior to the improvement gained by equivalent use of either transcription start site proximity or phylogenetic conservation information. Importantly, predictions made with the use of chromatin structure information are tissue specific. This result supports the biological hypothesis that chromatin modulates TF binding to produce tissue-specific binding profiles in higher eukaryotes, and suggests that the use of chromatin modification information can lead to accurate tissue-specific transcriptional regulatory network elucidation.

    The value of position-specific priors in motif discovery using MEME

    Background: Position-specific priors have been shown to be a flexible and elegant way to extend the power of Gibbs sampler-based motif discovery algorithms. Information of many types, including sequence conservation, nucleosome positioning, and negative examples, can be converted into a prior over the location of motif sites, which then guides the sequence motif discovery algorithm. This approach has been shown to confer many of the benefits of conservation-based and discriminative motif discovery approaches on Gibbs sampler-based motif discovery methods, but has not previously been studied with methods based on expectation maximization (EM). Results: We extend the popular EM-based MEME algorithm to utilize position-specific priors and demonstrate their effectiveness for discovering transcription factor (TF) motifs in yeast and mouse DNA sequences. Utilizing a discriminative, conservation-based prior dramatically improves MEME's ability to discover motifs in 156 yeast TF ChIP-chip datasets, more than doubling the number of datasets where it finds the correct motif. On these datasets, MEME using the prior has a higher success rate than eight other conservation-based motif discovery approaches. We also show that the same type of prior improves the accuracy of motifs discovered by MEME in mouse TF ChIP-seq data, and that the motifs tend to be of slightly higher quality than those found by a Gibbs sampling algorithm using the same prior. Conclusions: We conclude that using position-specific priors can substantially increase the power of EM-based motif discovery algorithms such as MEME.

    TP53 outperforms other androgen receptor biomarkers to predict abiraterone or enzalutamide outcome in metastatic castration-resistant prostate cancer

    Purpose: To infer the prognostic value of simultaneous androgen receptor (AR) and TP53 profiling in liquid biopsies from patients with metastatic castration-resistant prostate cancer (mCRPC) starting a new line of AR signaling inhibitors (ARSi). Experimental Design: Between March 2014 and April 2017, we recruited patients with mCRPC (n = 168) prior to ARSi in a cohort study encompassing 10 European centers. Blood samples were collected for comprehensive profiling of Cell Search-enriched circulating tumor cells (CTC) and circulating tumor DNA (ctDNA). Targeted CTC RNA sequencing (RNA-seq) allowed the detection of eight AR splice variants (ARV). Low-pass whole-genome and targeted gene-body sequencing of AR and TP53 was applied to identify amplifications, loss of heterozygosity, mutations, and structural rearrangements in ctDNA. Clinical or radiologic progression-free survival (PFS) was estimated by Kaplan-Meier analysis, and independent associations were determined using multivariable Cox regression models. Results: Overall, no single AR perturbation remained associated with adverse prognosis after multivariable analysis. Instead, tumor burden estimates (CTC counts, ctDNA fraction, and visceral metastases) were significantly associated with PFS. TP53 inactivation harbored independent prognostic value [HR 1.88; 95% confidence interval (CI), 1.18-3.00; P = 0.008], and outperformed ARV expression and detection of genomic AR alterations. Using Cox coefficient analysis of clinical parameters and TP53 status, we identified three prognostic groups with differing PFS estimates (median, 14.7 vs. 7.51 vs. 2.62 months; P < 0.0001), which was validated in an independent mCRPC cohort (n = 202) starting first-line ARSi (median, 14.3 vs. 6.39 vs. 2.23 months; P < 0.0001). Conclusions: In an all-comer cohort, tumor burden estimates and TP53 outperform any AR perturbation to infer prognosis. See related commentary by Rebello et al., p. 169

    Cell-free DNA profiling of metastatic prostate cancer reveals microsatellite instability, structural rearrangements and clonal hematopoiesis.

    This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. BACKGROUND: There are multiple existing and emerging therapeutic avenues for metastatic prostate cancer, with a common denominator, which is the need for predictive biomarkers. Circulating tumor DNA (ctDNA) has the potential to cost-efficiently accelerate precision medicine trials to improve clinical efficacy and diminish costs and toxicity. However, comprehensive ctDNA profiling in metastatic prostate cancer to date has been limited. METHODS: A combination of targeted and low-pass whole genome sequencing was performed on plasma cell-free DNA and matched white blood cell germline DNA in 364 blood samples from 217 metastatic prostate cancer patients. RESULTS: ctDNA was detected in 85.9% of baseline samples, correlated to line of therapy and was mirrored by circulating tumor cell enumeration of synchronous blood samples. Comprehensive profiling of the androgen receptor (AR) revealed a continuous increase in the fraction of patients with intra-AR structural variation, from 15.4% during first-line metastatic castration-resistant prostate cancer therapy to 45.2% in fourth line, indicating a continuous evolution of AR during the course of the disease. Patients displayed frequent alterations in DNA repair deficiency genes (18.0%). Additionally, the microsatellite instability phenotype was identified in 3.81% of eligible samples (≥ 0.1 ctDNA fraction). Sequencing of non-repetitive intronic and exonic regions of PTEN, RB1, and TP53 detected biallelic inactivation in 47.5%, 20.3%, and 44.1% of samples with ≥ 0.2 ctDNA fraction, respectively. Only one patient carried a clonal high-impact variant without a detectable second hit. Intronic high-impact structural variation was twice as common as exonic mutations in PTEN and RB1. Finally, 14.6% of patients presented false positive variants due to clonal hematopoiesis, commonly ignored in commercially available assays. CONCLUSIONS: ctDNA profiles appear to mirror the genomic landscape of metastatic prostate cancer tissue and may cost-efficiently provide somatic information in clinical trials designed to identify predictive biomarkers. However, intronic sequencing of the interrogated tumor suppressors challenges the ubiquitous focus on coding regions and is vital, together with profiling of synchronous white blood cells, to minimize erroneous assignments which in turn may confound results and impede true associations in clinical trials. The Belgian Foundation Against Cancer (grant number C/2014/227); Kom op tegen Kanker (Stand up to Cancer), the Flemish Cancer Society (grant number 00000000116000000206); Royal College of Surgeons/Cancer Research UK (C19198/A1533); The Cancer Research Funds of Radiumhemmet, through the PCM program at KI (grant number 163012); The Erling-Persson family foundation (grant number 4-2689-2016); the Swedish Research Council (grant number K2010-70X-20430-04-3), and the Swedish Cancer Foundation (grant number 09-0677)

    Visualizing the evaluation of functional programs for debugging

    In this position paper, we present a prototype of a visualizer for functional programs. Such programs, whose evaluation model is the reduction of an expression to a value through repeated application of rewriting rules, and which tend to make little or no use of mutable state, are amenable to visualization in the same fashion as simple mathematical expressions, with which every schoolchild is familiar. We show how such visualizations may be produced for the strict functional language OCaml, by direct interpretation of the abstract syntax tree and appropriate pretty-printing. We describe (and begin to address) the challenges of presenting such program traces in limited space and of identifying their essential elements, so that our methods will one day be practical for more than toy programs. We consider the problems posed by the parts of modern functional programming which are not purely functional such as mutable state, input/output and exceptions. We describe initial work on the use of such visualizations to address the problem of program debugging, which is our ultimate aim

    Improved prediction of transcription binding sites from chromatin modification data

    In this paper we apply machine learning to the task of predicting transcription factor binding sites by combining information on multiple forms of chromatin modification with the DNA site binding strength predicted by a position weight matrix. We additionally explore the effect of incorporating auxiliary features such as the distance of the site to the nearest gene's transcription start site and the degree to which the site is conserved among related species. We approach the task as a classification problem, and show that both Naïve Bayes and Random Forests can provide substantial increases in the accuracy of predicted binding sites. Our results extend previous work which simply filtered candidate sites based on H3K4Me3 chromatin modification scores. In addition, we apply feature selection to explore which forms of chromatin modification and which auxiliary features have predictive value for which transcription factors. © IEEE
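
    As a rough illustration of the position weight matrix score mentioned in this abstract, the sketch below builds a log-odds matrix from base counts and scans a sequence for its best-scoring site. It is not the authors' pipeline: the count matrix, the uniform background, the pseudocount, and the function names (`log_odds_matrix`, `score_site`, `best_site`) are all assumptions made for the example. In a classifier of the kind described, such a score would be one feature alongside chromatin modification measurements.

```python
import math

# Hypothetical count matrix for a 4-bp motif (one dict of base counts per position).
COUNTS = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
    {"A": 4, "C": 0, "G": 0, "T": 4},
    {"A": 0, "C": 8, "G": 0, "T": 0},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2 odds against an assumed uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return matrix


def score_site(pwm, site):
    """Sum the per-position log-odds for a site of the same width as the matrix."""
    if len(site) != len(pwm):
        raise ValueError("site width must match matrix width")
    return sum(column[base] for column, base in zip(pwm, site.upper()))


def best_site(pwm, sequence):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(pwm)
    return max(
        (score_site(pwm, sequence[start:start + width]), start)
        for start in range(len(sequence) - width + 1)
    )


pwm = log_odds_matrix(COUNTS)
score, start = best_site(pwm, "CCAGACTT")
```

    Here `start` is the 0-based offset of the best window; only the forward strand is scanned, and a real analysis would normally score the reverse complement as well.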