We present a new algorithm for the identification of bound regions from
ChIP-seq experiments. Our method for identifying statistically significant
peaks from read coverage is inspired by the notion of persistence in
topological data analysis and provides a non-parametric approach that is robust
to noise in experiments. Specifically, our method reduces the peak calling
problem to the study of tree-based statistics derived from the data. We
demonstrate the accuracy of our method on existing datasets, and we show that
it can discover previously missed regions and can more clearly discriminate
between multiple binding events. The software T-PIC (Tree shape Peak
Identification for ChIP-Seq) is available at
http://math.berkeley.edu/~vhower/tpic.html

Comment: 12 pages, 6 figures