Genome Annotation Test with Validation on Transcription Start Site and ChIP-Seq for Pol-II Binding Data

By Justin Bedo and Adam Kowalczyk

Abstract

Motivation: Many ChIP-Seq experiments are aimed at developing gold standards for determining the locations of various genomic features, such as transcription start or transcription factor binding sites, on the whole genome. Many such pioneering experiments lack rigorous testing methods and adequate “gold standard” annotations to compare against, as they themselves are the most reliable source of empirical data available. To overcome this problem, we propose a self-consistency test whereby a dataset is tested against itself. It relies on a supervised machine learning style protocol for in-silico annotation of a genome and accuracy estimation to guarantee, at least, self-consistency.

Results: The main results use a novel performance metric (a calibrated precision) in order to assess and compare the robustness of the proposed supervised learning method across different test sets. As a proof of principle we applied the whole protocol to two recent ChIP-Seq ENCODE datasets of STAT1 and Pol-II binding sites. STAT1 is benchmarked against in-silico detection of binding sites using available position weight matrices. Pol-II, the main focus of this paper, is benchmarked against 17 algorithms for the closely related and well studied problem of in-silico transcription start site (TSS) prediction. Our results also demonstrate the feasibility of in-silico genome annotation extension with encouraging results from a small portion of annotated genome to the remainder.
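The self-consistency idea in the abstract (train on a small annotated portion, then check the predictions against the same dataset's labels on the remainder) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the synthetic GC-rich versus AT-rich windows, the dinucleotide-frequency features, the nearest-centroid scorer, and the helper names. The score reported is plain precision among the top-ranked windows, not the paper's calibrated precision, which the abstract does not define.

```python
import random
from collections import Counter


def kmer_freqs(seq, k=2):
    """Relative frequencies of overlapping k-mers (toy feature extractor)."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts.values())
    return {kmer: c / total for kmer, c in counts.items()}


def train_centroids(windows, labels, k=2):
    """Mean k-mer frequency profile per class (stand-in for a real learner)."""
    sums = {0: Counter(), 1: Counter()}
    sizes = Counter(labels)
    for seq, y in zip(windows, labels):
        for kmer, f in kmer_freqs(seq, k).items():
            sums[y][kmer] += f
    return {y: {kmer: v / sizes[y] for kmer, v in sums[y].items()} for y in sums}


def score(seq, centroids, k=2):
    """Higher means closer to the positive centroid than to the negative one."""
    freqs = kmer_freqs(seq, k)

    def sq_dist(centroid):
        keys = set(freqs) | set(centroid)
        return sum((freqs.get(x, 0.0) - centroid.get(x, 0.0)) ** 2 for x in keys)

    return sq_dist(centroids[0]) - sq_dist(centroids[1])


def precision_at_top(scores, labels, n_top):
    """Fraction of true positives among the n_top highest-scoring items."""
    ranked = sorted(zip(scores, labels), key=lambda p: p[0], reverse=True)
    top = ranked[:n_top]
    return sum(y for _, y in top) / len(top)


def make_window(rng, gc_bias, length=50):
    """Synthetic DNA window; gc_bias sets the relative weight of C and G."""
    weights = [1 - gc_bias, gc_bias, gc_bias, 1 - gc_bias]  # A, C, G, T
    return "".join(rng.choices("ACGT", weights=weights, k=length))


def self_consistency_demo(seed=0, n_per_class=100, n_train_per_class=20):
    """Train on a small labelled portion, score the held-out remainder."""
    rng = random.Random(seed)
    pos = [make_window(rng, 0.8) for _ in range(n_per_class)]  # "bound" windows
    neg = [make_window(rng, 0.2) for _ in range(n_per_class)]  # background
    train_x = pos[:n_train_per_class] + neg[:n_train_per_class]
    train_y = [1] * n_train_per_class + [0] * n_train_per_class
    test_x = pos[n_train_per_class:] + neg[n_train_per_class:]
    n_test = n_per_class - n_train_per_class
    test_y = [1] * n_test + [0] * n_test
    centroids = train_centroids(train_x, train_y)
    scores = [score(seq, centroids) for seq in test_x]
    # Judge the predictions against the dataset's own held-out labels.
    return precision_at_top(scores, test_y, n_top=n_test)


if __name__ == "__main__":
    print(f"held-out precision: {self_consistency_demo():.2f}")
```

On real ChIP-Seq data the windows and labels would come from the experiment's own peak calls, and a low held-out score would suggest the dataset cannot be predicted consistently from sequence by the chosen learner.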

Year: 2016
OAI identifier: oai:CiteSeerX.psu:10.1.1.936.1807
Provided by: CiteSeerX