Evaluating the Significance of Global and Local Features in Expressed Sequence Tag: A Clustering Quality Perspective

Abstract

Clustering of expressed sequence tags (ESTs) plays an important role in gene analysis. Alignment-based sequence comparison is commonly used to measure the similarity between sequences, and alignment-free comparisons have recently been introduced. In this paper, we evaluate the role of global and local features extracted by two alignment-free approaches, namely the compression-based method and the generalized relative entropy method, in the quality of EST clustering. Our evaluation shows that the local features of ESTs yield much better clustering results than the global features.

Index Terms: sequence clustering, alignment-free, similarity measure, grammar-based distance, generalized relative entropy
