Relational visual cluster validity
The assessment of cluster validity plays an important role in cluster analysis. The most commonly used cluster validity methods are based on statistical hypothesis testing or on finding the best clustering scheme by computing a number of different cluster validity indices. A number of visual methods of cluster validity have been produced to display the validity of clusters directly by mapping data into two- or three-dimensional space. However, these methods may lose too much information to correctly estimate the results of clustering algorithms. Although the visual cluster validity (VCV) method of Hathaway and Bezdek can successfully solve this problem, it can only be applied to object data, i.e. feature measurements. There are very few validity methods that can be used to analyze the validity of data where only a similarity or dissimilarity relation exists, i.e. relational data. To tackle this problem, this paper presents a relational visual cluster validity (RVCV) method to assess the validity of clustering relational data. This is done by combining the results of the non-Euclidean relational fuzzy c-means (NERFCM) algorithm with a modification of the VCV method to produce a visual representation of cluster validity. RVCV can cluster complete and incomplete relational data and adds to the visual cluster validity theory. Numeric examples using synthetic and real data are presented.
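As a rough illustration of the general idea behind visual validity displays for relational data, the sketch below reorders a dissimilarity matrix so that objects with the same cluster label sit next to each other; well-separated clusters then show up as low-dissimilarity blocks along the diagonal. This is not the RVCV method itself: the function name `reorder_by_cluster`, the toy matrix, and the hard labels are all assumptions made for illustration, and the NERFCM step is replaced by labels supplied by hand.

```python
def reorder_by_cluster(dissim, labels):
    """Permute rows and columns of a square dissimilarity matrix
    so that objects sharing a cluster label become adjacent."""
    # sorted() is stable, so objects keep their original order within a cluster
    order = sorted(range(len(labels)), key=lambda i: labels[i])
    reordered = [[dissim[i][j] for j in order] for i in order]
    return reordered, order


# Toy relational data (hypothetical): objects 0 and 2 are close, 1 and 3 are close
dissim = [
    [0.0, 0.9, 0.1, 0.8],
    [0.9, 0.0, 0.8, 0.2],
    [0.1, 0.8, 0.0, 0.9],
    [0.8, 0.2, 0.9, 0.0],
]
labels = [0, 1, 0, 1]  # assumed hard cluster labels, e.g. from any relational clustering

reordered, order = reorder_by_cluster(dissim, labels)
for row in reordered:
    print(row)
```

Rendering `reordered` as a grayscale image would give the kind of block structure a visual validity method inspects; small values inside the diagonal blocks and large values outside them suggest the labels fit the relation.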
A sparse multinomial probit model for classification
A recent development in penalized probit modelling using a hierarchical Bayesian approach has led to a sparse binomial (two-class) probit classifier that can be trained via an EM algorithm. A key advantage of the formulation is that no tuning of hyperparameters relating to the penalty is needed, thus simplifying the model selection process. The resulting model demonstrates excellent classification performance and a high degree of sparsity when used as a kernel machine. It is, however, restricted to the binary classification problem and can only be used in the multinomial situation via a one-against-all or one-against-many strategy. To overcome this, we apply the idea to the multinomial probit model. This leads to a direct multi-classification approach and is shown to give a sparse solution with accuracy and sparsity comparable with the current state-of-the-art. Comparative numerical benchmark examples are used to demonstrate the method.
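To make the multinomial probit link concrete, the sketch below estimates class probabilities by Monte Carlo: each class gets a latent utility equal to its score plus standard normal noise, and the predicted class is the one with the largest utility. This is only an illustration of the link function, assuming independent unit-variance noise; it does not implement the sparse prior or the EM training described in the abstract, and the function name `probit_class_probs` and the example scores are hypothetical.

```python
import random


def probit_class_probs(scores, n_draws=20000, seed=0):
    """Monte Carlo estimate of multinomial probit class probabilities.

    Assumes latent utilities z_k = scores[k] + eps_k with eps_k ~ N(0, 1)
    drawn independently, and P(class k) = P(z_k is the largest utility).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    counts = [0] * len(scores)
    for _ in range(n_draws):
        latent = [s + rng.gauss(0.0, 1.0) for s in scores]
        winner = max(range(len(latent)), key=latent.__getitem__)
        counts[winner] += 1
    return [c / n_draws for c in counts]


# Hypothetical scores for three classes; a larger score means a more likely class
probs = probit_class_probs([1.0, 0.0, -1.0])
print(probs)
```

Because all classes compete in a single draw, the probabilities come from one model rather than from several binary one-against-all classifiers, which is the distinction the abstract draws.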