    Data from: Comparative analysis of models in predicting the effects of SNPs on TF-DNA binding using large-scale in vitro and in vivo data

    Noncoding variants associated with complex traits can alter the motifs of transcription factor (TF)-DNA binding. Although many computational models have been developed to predict the effects of noncoding variants on TF binding, their predictive power has not been systematically evaluated. Here we have evaluated 14 different models built on position weight matrices (PWMs), support vector machines (SVMs), ordinary least squares (OLS) and deep neural networks (DNNs), using large-scale in vitro (i.e., SNP-SELEX) and in vivo (i.e., allele-specific binding, ASB) TF binding data. The SNP-SELEX data used in this study were collected from the GVAT database (http://renlab.sdsc.edu/GVATdb/), and the ASB data were collected from the ADASTRA database (https://adastra.autosome.org/bill-cipher/downloads).

    This dataset contains the following files:

    SNP-SELEX_firstbatch_evaldata_positive_data.txt.gz: SNP-SELEX, first batch, positive set
    SNP-SELEX_firstbatch_evaldata_negative_data.txt.gz: SNP-SELEX, first batch, negative set
    SNP-SELEX_novelbatch_evaldata_positive_data.txt.gz: SNP-SELEX, novel batch, positive set
    SNP-SELEX_novelbatch_evaldata_negative_data.txt.gz: SNP-SELEX, novel batch, negative set
    ASB_evaldata_positive_data.txt.gz: ASB, positive set
    ASB_evaldata_negative_data.txt.gz: ASB, negative set
    SNP-SELEX_AUROC_AUPRC.xlsx: AUROC and AUPRC of 14 models based on SNP-SELEX
    ASB_AUROC_AUPRC.xlsx: AUROC and AUPRC of 14 models based on ASB
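
    For readers unfamiliar with the two metrics, the following is a minimal Python sketch of how AUROC and AUPRC can be computed from the scores a model assigns to a positive set and a negative set. It is an illustration only: the record does not describe the column layout of the .txt.gz files or the exact procedure behind the .xlsx values, so the scores below are made-up toy numbers, the function names are hypothetical, and average precision is used as one common estimate of AUPRC, which may differ from the authors' calculation.

```python
from typing import Sequence


def auroc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Fraction of (positive, negative) pairs in which the positive scores
    higher; tied pairs count as half. Equivalent to the area under the ROC curve."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auprc(pos: Sequence[float], neg: Sequence[float]) -> float:
    """Average precision: the mean of the precision measured at the rank of
    each positive example. Assumes no tied scores between the two sets."""
    ranked = sorted(
        [(s, 1) for s in pos] + [(s, 0) for s in neg],
        key=lambda item: -item[0],
    )
    hits = 0
    total = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / len(pos)


# Toy scores (assumed, not taken from the dataset): higher means the model
# predicts a stronger effect of the SNP on TF binding.
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2, 0.1]

print(f"AUROC = {auroc(pos_scores, neg_scores):.4f}")
print(f"AUPRC = {auprc(pos_scores, neg_scores):.4f}")
```

    The pairwise AUROC loop is quadratic in the number of SNPs and is written for clarity; for sets of realistic size a rank-based implementation from an established library would be the more practical choice.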