A systematic, large-scale comparison of transcription factor binding site models
Background: The modelling of gene regulation is a major challenge in biomedical
research. This process is dominated by transcription factors (TFs), and
mutations in their binding sites (TFBSs) may cause the misregulation of genes,
eventually leading to disease. The consequences of DNA variants on TF binding
are modelled in silico using binding matrices, but it remains unclear whether
these are capable of accurately representing in vivo binding. In this study,
we present a systematic comparison of binding models for 82 human TFs from
three freely available sources: JASPAR matrices, HT-SELEX-generated models and
matrices derived from protein binding microarrays (PBMs). We determined their
ability to detect experimentally verified “real” in vivo TFBSs derived from
ENCODE ChIP-seq data. As negative controls we chose random downstream exonic
sequences, which are unlikely to harbour TFBSs. All models were assessed by
receiver operating characteristic (ROC) analysis. Results: While the area
under the curve (AUC) was low for most of the tested models, with only 47 % reaching a
score of 0.7 or higher, we noticed strong differences between the various
position-specific scoring matrices, with JASPAR and HT-SELEX models showing
higher success rates than PBM-derived models. In addition, we found that while
TFBS sequences showed a higher degree of conservation than randomly chosen
sequences, there was high variability between individual TFBSs. Conclusions:
Our results show that only a few of the matrix-based models used to predict
potential TFBSs are able to reliably detect experimentally confirmed TFBSs. We
compiled our findings in a freely accessible web application called ePOSSUM
(http://mutationtaster.charite.de/ePOSSUM/) which uses a Bayes classifier to
assess the impact of genetic alterations on TF binding in user-defined
sequences. Additionally, ePOSSUM provides information on the reliability of
the prediction using our test set of experimentally confirmed binding sites
- …
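
The evaluation outlined in the abstract (scoring candidate sequences with a position-specific scoring matrix, then summarising how well the scores separate confirmed binding sites from negative controls by ROC AUC) can be illustrated with a minimal sketch. This is not the authors' pipeline: the count matrix, the sequences, the pseudocount, the uniform background and all function names are hypothetical, and only the forward strand is scanned.

```python
import math

# Hypothetical toy count matrix for a 4-bp motif (consensus AGCA).
# It is NOT taken from JASPAR, HT-SELEX or PBM data.
COUNTS = {
    "A": [7, 1, 1, 7],
    "C": [1, 1, 7, 1],
    "G": [1, 7, 1, 1],
    "T": [1, 1, 1, 1],
}


def log_odds_matrix(counts, background=0.25, pseudocount=0.5):
    """Convert per-position base counts into log2-odds scores (assumed uniform background)."""
    width = len(counts["A"])
    pssm = []
    for i in range(width):
        total = sum(counts[b][i] for b in "ACGT") + 4 * pseudocount
        pssm.append({
            b: math.log2(((counts[b][i] + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return pssm


def best_window_score(pssm, seq):
    """Score every forward-strand window of the motif width and return the best score."""
    width = len(pssm)
    return max(
        sum(pssm[j][seq[i + j]] for j in range(width))
        for i in range(len(seq) - width + 1)
    )


def roc_auc(pos_scores, neg_scores):
    """AUC as the probability that a positive outscores a negative (ties count half)."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


# Made-up stand-ins for ChIP-seq-derived sites and exonic negative controls.
positives = ["TTAGCATT", "AGCAGGGG", "CCAGCACC"]
negatives = ["TTTTTTTT", "CCGGCCGG", "GTGTGTGT"]

pssm = log_odds_matrix(COUNTS)
pos_scores = [best_window_score(pssm, s) for s in positives]
neg_scores = [best_window_score(pssm, s) for s in negatives]
auc = roc_auc(pos_scores, neg_scores)
print(f"AUC = {auc:.2f}")
```

The pairwise AUC used here is equivalent to the area under the ROC curve and keeps the sketch free of external dependencies; with real data sets, a rank-based implementation would scale better.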