A Comparative Performance Analysis of Domain Calling Programs for an Automated Broad-source Factor ChIP-Seq Analysis Pipeline

Abstract

Chromatin Immunoprecipitation coupled with Next-Generation Sequencing (ChIP-Seq) can identify protein-bound regions on DNA with high accuracy. DNA-binding proteins such as transcription factors usually show a concentrated, narrow, peak-like binding pattern, while some histone modifications such as H3K27me3 and H3K36me3 show a broad, domain-like binding pattern. Peak calling for point-source data is well optimized, but domain calling for broad-source data is difficult, and related research is still in progress. Although many domain calling programs for broad-source factors are now available, they differ in mathematical model, significance threshold, domain calling method, and specificity. Comparative performance studies of broad-source domain calling programs are nevertheless rare, and many previous comparisons are by-products of program development. Here, I compared the performance of seven domain calling programs (RSEG, MACS2, SICER, hiddenDomains, BroadPeak, PeakRanger-CCAT, PeakRanger-BCP) to provide a practical guideline for program selection. Computer-simulated H3K36me3 ChIP-Seq data and real experimental ChIP-Seq data (H1 H3K27me3, H1 H3K36me3) were used for performance evaluation. Results from the simulated H3K36me3 data showed that four programs (RSEG, SICER, hiddenDomains, PeakRanger-BCP) performed well in domain calling. RSEG showed the lowest false positive rate, while hiddenDomains showed the highest specificity in domain calling. PeakRanger-BCP showed good performance with a very short running time. Results from the H1 cell H3K27me3 data for the four selected programs (RSEG, SICER, hiddenDomains, PeakRanger-BCP) showed that biologically meaningful CTCF barrier sites lie near the called domains. RSEG showed the best performance on the H1 H3K27me3 data, but hiddenDomains and PeakRanger-BCP also showed high accuracy. Considering running time, resource usage, domain calling specificity, and convenience, I highly recommend PeakRanger-BCP for the domain calling step. Selecting the best domain calling program for an automated broad-source factor ChIP-Seq pipeline will provide uniform, high-quality results in epigenetic research.
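
The abstract does not say how false positive rate and specificity were computed. As a minimal sketch, assuming a base-pair-level comparison between called domains and the known domains of a simulated data set on a single chromosome, these metrics could be derived as follows. The function names, the half-open (start, end) interval convention, and the example coordinates are illustrative assumptions and are not taken from the thesis or from any of the compared programs.

```python
def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def covered_bases(intervals):
    """Number of bases covered by at least one interval."""
    return sum(end - start for start, end in merge_intervals(intervals))


def overlap_bases(a, b):
    """Number of bases covered by both interval sets (two-pointer sweep)."""
    a, b = merge_intervals(a), merge_intervals(b)
    i = j = total = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo < hi:
            total += hi - lo
        # Advance whichever interval ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return total


def domain_call_metrics(called, truth, chrom_length):
    """Base-pair-level metrics (assumed definition, not the thesis's own).

    Assumes truth covers at least one base and less than the whole chromosome,
    so that no denominator is zero.
    """
    tp = overlap_bases(called, truth)      # called and truly in a domain
    fp = covered_bases(called) - tp        # called but outside true domains
    fn = covered_bases(truth) - tp         # true domain bases that were missed
    tn = chrom_length - tp - fp - fn       # correctly left uncalled
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "false_positive_rate": fp / (fp + tn),
    }


# Hypothetical coordinates on a 1,000 bp toy chromosome.
truth = [(100, 200), (500, 800)]
called = [(150, 250), (500, 700)]
metrics = domain_call_metrics(called, truth, chrom_length=1000)
```

Counting bases rather than whole domains keeps the comparison meaningful when one program reports a single long domain and another splits the same region into several fragments, which is a common difference between broad-source callers.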
