Microarray time-series data clustering via gene expression profile alignment

Abstract

Clustering gene expression data given as time series is a challenging problem that imposes its own constraints: the order of the time points matters, so exchanging two or more time points would produce quite different results and lead to erroneous biological conclusions. In this thesis, clustering methods based on the multiple alignment of natural cubic spline representations of gene expression profiles are presented. The multiple alignment is achieved by minimizing the sum of integrated squared errors, over a time interval, defined on a set of profiles. Combined with flat clustering algorithms such as k-means and EM, the proposed approach is shown to cluster microarray time-series profiles efficiently and to reduce the computational time significantly. The effectiveness of the approaches is evaluated on six data sets. Experiments have also been carried out to determine the number of clusters and the accuracy of the proposed approaches.
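To make the alignment idea concrete, the following is a minimal sketch, not the thesis implementation. It assumes that "alignment" means shifting each profile vertically by a constant, which the abstract does not state. Under that assumption, the sum of pairwise integrated squared errors between the spline profiles is minimized when every profile is shifted to have zero integral over the time interval. The function names `second_derivatives`, `spline_integral`, and `align_profiles` are hypothetical and chosen only for illustration.

```python
def second_derivatives(t, y):
    """Second derivatives at the knots of the natural cubic spline through (t, y)."""
    n = len(t) - 1
    h = [t[i + 1] - t[i] for i in range(n)]
    m = [0.0] * (n + 1)  # natural boundary conditions: m[0] = m[n] = 0
    if n < 2:
        return m
    # Tridiagonal system for the interior knots, solved with the Thomas algorithm.
    diag = [2.0 * (h[k] + h[k + 1]) for k in range(n - 1)]
    rhs = [6.0 * ((y[k + 2] - y[k + 1]) / h[k + 1] - (y[k + 1] - y[k]) / h[k])
           for k in range(n - 1)]
    for k in range(1, n - 1):  # forward elimination
        w = h[k] / diag[k - 1]
        diag[k] -= w * h[k]
        rhs[k] -= w * rhs[k - 1]
    m[n - 1] = rhs[-1] / diag[-1]
    for k in range(n - 3, -1, -1):  # back substitution
        m[k + 1] = (rhs[k] - h[k + 1] * m[k + 2]) / diag[k]
    return m


def spline_integral(t, y):
    """Exact integral of the natural cubic spline over [t[0], t[-1]]."""
    m = second_derivatives(t, y)
    total = 0.0
    for i in range(len(t) - 1):
        h = t[i + 1] - t[i]
        total += h * (y[i] + y[i + 1]) / 2.0 - h ** 3 * (m[i] + m[i + 1]) / 24.0
    return total


def align_profiles(t, profiles):
    """Shift each profile vertically so that its spline integrates to zero.

    Assumption: alignment is a constant vertical shift per profile.
    A constant shift leaves the spline's shape unchanged, so shifting the
    knot values is equivalent to shifting the whole spline.
    """
    span = t[-1] - t[0]
    aligned = []
    for y in profiles:
        shift = spline_integral(t, y) / span  # mean level of the spline
        aligned.append([v - shift for v in y])
    return aligned


if __name__ == "__main__":
    times = [0.0, 1.0, 2.0, 4.0]  # unequal sampling intervals are allowed
    genes = [[1.0, 3.0, 2.0, 4.0], [6.0, 8.0, 7.0, 9.0]]
    for profile in align_profiles(times, genes):
        print(profile)
```

In this sketch, two profiles that differ only by a constant offset coincide after alignment, so a flat clustering algorithm such as k-means applied to the aligned profiles would compare them by shape rather than by absolute expression level.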
