5,590 research outputs found
Cross likelihood ratio based speaker clustering using eigenvoice models
This paper proposes the use of eigenvoice modeling techniques with the Cross Likelihood Ratio (CLR) as a criterion for speaker clustering within a speaker diarization system. The CLR has previously been shown to be a robust decision criterion for speaker clustering using Gaussian Mixture Models. Recently, eigenvoice modeling techniques have become increasingly popular, due to their ability to adequately represent a speaker based on sparse training data, as well as their improved capture of differences in speaker characteristics. This paper hence proposes that it would be beneficial to capitalize on the advantages of eigenvoice modeling in a CLR framework. Results obtained on the 2002 Rich Transcription (RT-02) Evaluation dataset show improved clustering performance, resulting in a 35.1% relative improvement in the overall Diarization Error Rate (DER) compared to the baseline system.
Autonomous clustering using rough set theory
This paper proposes a clustering technique that minimises the need for subjective
human intervention and is based on elements of rough set theory. The proposed algorithm is
unified in its approach to clustering and makes use of both local and global data properties to
obtain clustering solutions. It handles both single-type and mixed-attribute data sets.
Results from three data sets of single and mixed attribute types are used to illustrate the
technique and establish its efficiency.
A survey of clustering methods
In this paper, I describe a large variety of clustering methods within a single framework. This paper unifies work across different fields, from biology (numerical taxonomy) to machine learning (concept formation). An important objective of this paper is to show that one can benefit from knowledge of research across different disciplines. After describing the task from a set of different viewpoints or paradigms, I begin by describing the similarity measures or evaluation functions that form the basis of any clustering technique. Next, I describe a number of different algorithms that use these measures, and I close with a brief discussion of ways to evaluate different approaches to clustering.
Discussion of: Treelets--An adaptive multi-scale basis for sparse unordered data
We would like to congratulate Lee, Nadler and Wasserman on their contribution
to clustering and data reduction methods for high $p$ and low $n$ situations. A
composite of clustering and traditional principal components analysis, treelets
is an innovative method for multi-resolution analysis of unordered data. It is
an improvement over traditional PCA and an important contribution to clustering
methodology. Their paper [arXiv:0707.0481] presents theory and supporting
applications addressing the two main goals of the treelet method: (1) Uncover
the underlying structure of the data and (2) Data reduction prior to
statistical learning methods. We will organize our discussion into two main
parts to address their methodology in terms of each of these two goals. We will
present and discuss treelets in terms of a clustering algorithm and an
improvement over traditional PCA. We will also discuss the applicability of
treelets to more general data, in particular, the application of treelets to
microarray data.
Comment: Published at http://dx.doi.org/10.1214/08-AOAS137F in the Annals of
Applied Statistics (http://www.imstat.org/aoas/) by the Institute of
Mathematical Statistics (http://www.imstat.org).