    Improved H3K27ac Histone Mark Prediction using K-mer Proximity Feature

    Predicting enhancers, a class of gene regulatory elements, is computationally challenging because the features associated with them are poorly understood. Several histone marks are known to be associated with enhancer locations and have been used successfully to predict the approximate locations of thousands of enhancers. The k-mer (a short contiguous nucleotide sequence of length k) is one of the most commonly engineered features from histone sequences for machine learning tasks. However, a large k-mer feature set (i.e. 5 ≤ k ≤ 7) is usually needed to perform well, and no domain knowledge is used. In this study we propose the k-mer proximity feature, which is domain dependent, to represent H3K27ac histone enrichment in DNA sequences. This feature represents the spatial content of DNA sequences. We compare the performance of the proximity feature and the k-mer feature for H3K27ac mark prediction, and the results indicate that the proposed feature gives higher prediction accuracy. These findings support the view that the proximity feature is a more distinguishing feature of DNA sequences with histone modification enrichment.
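As a rough illustration of the baseline k-mer feature mentioned in the abstract, the following minimal Python sketch counts overlapping k-mers in a DNA sequence and turns them into a fixed-length frequency vector. The function names `kmer_counts` and `kmer_feature_vector` are hypothetical and are not taken from the paper; the proposed proximity feature is not shown because the abstract does not define it.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k):
    """Count every overlapping window of length k in seq (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def kmer_feature_vector(seq, k, alphabet="ACGT"):
    """Return relative k-mer frequencies in a fixed order over alphabet**k.

    Assumption for this sketch: windows containing symbols outside the
    alphabet (for example 'N') are ignored rather than imputed.
    """
    vocab = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = kmer_counts(seq, k)
    total = sum(counts[kmer] for kmer in vocab)
    if total == 0:
        return [0.0] * len(vocab)
    return [counts[kmer] / total for kmer in vocab]


if __name__ == "__main__":
    vec = kmer_feature_vector("ACGTACGT", 2)
    # The vector length is 4**k, so it grows quickly as k increases.
    print(len(vec))
```

The vector has 4^k entries, so k = 5 gives 1,024 features and k = 7 gives 16,384, which is one way to see why the k-mer feature set becomes large in the range the abstract cites.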