3,111 research outputs found
An exploration of methodologies to improve semi-supervised hierarchical clustering with knowledge-based constraints
Clustering algorithms with constraints (also known as semi-supervised clustering algorithms) have been introduced to the field of machine learning as a significant variant of conventional unsupervised clustering algorithms. They have been shown to achieve better performance by integrating prior knowledge into the clustering process, which helps uncover relevant, useful information in the data being clustered. However, the development of semi-supervised hierarchical clustering techniques is still an open and active area of investigation. The majority of current semi-supervised clustering algorithms are developed as partitional clustering (PC) methods, and only a few research efforts have addressed semi-supervised hierarchical clustering methods. The aim of this research is to enhance hierarchical clustering (HC) algorithms based on prior knowledge, by adopting novel methodologies. [Continues.]
Techniques for clustering gene expression data
Many clustering techniques have been proposed for the analysis of gene expression data obtained from microarray experiments. However, the choice of a suitable method (or methods) for a given experimental dataset is not straightforward. Common approaches do not translate well and fail to take account of the data profile. This review paper surveys state-of-the-art applications that recognise these limitations and implement procedures to overcome them. It provides a framework for the evaluation of clustering in gene expression analyses. The nature of microarray data is discussed briefly. Selected examples are presented for the clustering methods considered.
Feature Trajectory Dynamic Time Warping for Clustering of Speech Segments
Dynamic time warping (DTW) can be used to compute the similarity between two sequences of generally differing length. We propose a modification to DTW that performs individual and independent pairwise alignment of feature trajectories. The modified technique, termed feature trajectory dynamic time warping (FTDTW), is applied as a similarity measure in the agglomerative hierarchical clustering of speech segments. Experiments using MFCC and PLP parametrisations extracted from TIMIT and from the Spoken Arabic Digit Dataset (SADD) show consistent and statistically significant improvements in the quality of the resulting clusters in terms of F-measure and normalised mutual information (NMI).
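The idea of aligning each feature trajectory independently can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `dtw_distance` and `ftdtw_distance`, the absolute-difference local cost, and the plain unnormalised sum over features are all assumptions made for illustration, and the abstract does not specify how the per-feature costs are combined or normalised.

```python
from math import inf


def dtw_distance(a, b):
    """Classic DTW alignment cost between two 1-D sequences.

    Assumption: the local cost is the absolute difference |a[i] - b[j]|.
    """
    n, m = len(a), len(b)
    # cost[i][j] = minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(a[i - 1] - b[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # advance in a only
                cost[i][j - 1],      # advance in b only
                cost[i - 1][j - 1],  # advance in both
            )
    return cost[n][m]


def ftdtw_distance(x, y):
    """Hypothetical per-feature variant: each feature trajectory is
    aligned on its own, and the alignment costs are summed.

    x, y: lists of frames; every frame is a list with the same number
    of features (e.g. one MFCC vector per frame).
    Assumption: costs are summed without length normalisation.
    """
    num_features = len(x[0])
    total = 0.0
    for k in range(num_features):
        traj_x = [frame[k] for frame in x]  # k-th feature over time
        traj_y = [frame[k] for frame in y]
        total += dtw_distance(traj_x, traj_y)
    return total


if __name__ == "__main__":
    seg_a = [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
    seg_b = [[1.0, 0.0], [2.0, 0.0], [2.0, 0.0], [3.0, 0.0]]
    print(ftdtw_distance(seg_a, seg_b))
```

A pairwise matrix of such distances over all segments could then serve as the input to an agglomerative hierarchical clustering routine, which is the role the abstract describes for FTDTW.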