36 research outputs found

Minimum Density Hyperplanes

Authors: Hofmeyr David P.; Pavlidis Nicos G.; Tasoulis Sotiris K.
Publication date: 01/01/2016

Associating distinct groups of objects (clusters) with contiguous regions of high probability density (high-density clusters) is central to many statistical and machine learning approaches to the classification of unlabelled data. We propose a novel hyperplane classifier for clustering and semi-supervised classification which is motivated by this objective. The proposed minimum density hyperplane minimises the integral of the empirical probability density function along it, thereby avoiding intersection with high-density clusters. We show that the minimum density and the maximum margin hyperplanes are asymptotically equivalent, thus linking this approach to maximum margin clustering and semi-supervised support vector classifiers. We propose a projection pursuit formulation of the associated optimisation problem which allows us to find minimum density hyperplanes efficiently in practice, and evaluate its performance on a range of benchmark datasets. The proposed approach is found to be very competitive with state-of-the-art methods for clustering and semi-supervised classification.
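As a rough illustration of the criterion in the abstract above: assuming an isotropic Gaussian kernel density estimate, the integral of the estimated density along a hyperplane {x : v.x = b} reduces to a one-dimensional kernel density estimate of the projections v.x evaluated at b. The Python sketch below evaluates that quantity and grid-searches the offset b for a fixed direction v. The function names, the bandwidth h, and the restriction of b to within alpha standard deviations of the projected mean are illustrative assumptions, not the authors' implementation, which also optimises over the direction v.

```python
import math


def project(points, v):
    """Project each point onto the unit vector in the direction of v."""
    norm = math.sqrt(sum(c * c for c in v))
    u = [c / norm for c in v]  # unit normal of the hyperplane
    return [sum(ui * xi for ui, xi in zip(u, x)) for x in points]


def projected_density(points, v, b, h):
    """1-D Gaussian kernel density of the projections, evaluated at offset b."""
    proj = project(points, v)
    coef = 1.0 / (len(proj) * h * math.sqrt(2.0 * math.pi))
    return coef * sum(math.exp(-0.5 * ((b - p) / h) ** 2) for p in proj)


def best_offset(points, v, h, alpha=1.0, n_grid=201):
    """Grid search for the offset b with the lowest projected density.

    Assumption: b is restricted to mean +/- alpha * std of the projections,
    so the hyperplane cannot drift into the empty tails of the data.
    """
    proj = project(points, v)
    mean = sum(proj) / len(proj)
    std = math.sqrt(sum((p - mean) ** 2 for p in proj) / len(proj))
    lo, hi = mean - alpha * std, mean + alpha * std
    grid = [lo + (hi - lo) * k / (n_grid - 1) for k in range(n_grid)]
    return min(grid, key=lambda b: projected_density(points, v, b, h))


if __name__ == "__main__":
    # Two hypothetical groups of points separated along the first coordinate.
    data = [(-3 + 0.1 * i, 0.2 * i) for i in range(-5, 6)]
    data += [(3 + 0.1 * i, 0.2 * i) for i in range(-5, 6)]
    print(best_offset(data, (1.0, 0.0), h=0.5))
```

A grid search over b is used here only for simplicity; it keeps the sketch free of any optimisation library.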

arXiv.org e-Print Archive

Lancaster E-Prints

Stellenbosch University SUNScholar Repository

Minimum density hyperplanes in the feature space

Authors: Pavlidis Nicos Georgios; Yates Katie
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/02/2017

Lancaster E-Prints

Divisive clustering of high dimensional data streams

Authors: Eckley Idris; Hofmeyr David; Pavlidis Nicos
Publication venue: Springer Science and Business Media LLC
Publication date: 01/09/2016

Clustering streaming data is gaining importance as automatic data acquisition technologies are deployed in diverse applications. We propose a fully incremental projected divisive clustering method for high-dimensional data streams that is motivated by high-density clustering. The method is capable of identifying clusters in arbitrary subspaces, estimating the number of clusters, and detecting changes in the data distribution which necessitate a revision of the model. The empirical evaluation of the proposed method on numerous real and simulated datasets shows that it is scalable in dimension and number of clusters, is robust to noisy and irrelevant features, and is capable of handling a variety of types of non-stationarity.

Lancaster E-Prints

PPCI: an R Package for Cluster Identification using Projection Pursuit

Authors: Hofmeyr David; Pavlidis Nicos
Publication venue: The R Foundation
Publication date: 01/12/2019

This paper presents the R package PPCI which implements three recently proposed projection pursuit methods for clustering. The methods are unified by the approach of defining an optimal hyperplane to separate clusters, and deriving a projection index whose optimiser is the vector normal to this separating hyperplane. Divisive hierarchical clustering algorithms that can detect clusters defined in different subspaces are readily obtained by recursively bi-partitioning the data through such hyperplanes. Projecting onto the vector normal to the optimal hyperplane enables visualisations of the data that can be used to validate the partition at each level of the cluster hierarchy. PPCI also provides a simplified framework in which the clustering models can be modified in an interactive manner. Extensions to problems involving clusters which are not linearly separable, and to the problem of finding maximum hard margin hyperplanes for clustering, are also discussed.
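The recursive bi-partitioning idea in the abstract above can be sketched in a few lines of Python. This is not the PPCI API (PPCI is an R package) and it does not use a projection pursuit index: the split rule below is a deliberately simple stand-in that cuts along the highest-variance coordinate axis at the widest gap between consecutive values. All function names are hypothetical.

```python
def split_largest_gap(points):
    """Stand-in split rule (an assumption, not a projection pursuit method).

    Returns (axis, offset, gap) for an axis-parallel hyperplane placed in the
    widest gap along the coordinate axis with the highest variance.
    """
    dim = len(points[0])

    def variance(j):
        vals = [p[j] for p in points]
        m = sum(vals) / len(vals)
        return sum((x - m) ** 2 for x in vals) / len(vals)

    axis = max(range(dim), key=variance)
    vals = sorted(p[axis] for p in points)
    gap, i = max((vals[k + 1] - vals[k], k) for k in range(len(vals) - 1))
    return axis, 0.5 * (vals[i] + vals[i + 1]), gap


def divisive_clustering(points, n_clusters):
    """Recursively bi-partition the data until n_clusters groups remain."""
    clusters = [list(points)]
    while len(clusters) < n_clusters:
        # Candidate split for every cluster that still has at least two points.
        candidates = [(split_largest_gap(c), k)
                      for k, c in enumerate(clusters) if len(c) > 1]
        if not candidates:
            break
        # Split the cluster whose separating hyperplane has the widest gap.
        (axis, offset, gap), k = max(candidates, key=lambda t: t[0][2])
        if gap == 0:
            break  # only duplicate points remain; nothing left to separate
        cluster = clusters.pop(k)
        clusters.append([p for p in cluster if p[axis] <= offset])
        clusters.append([p for p in cluster if p[axis] > offset])
    return clusters


if __name__ == "__main__":
    # Three hypothetical, well-separated groups in two dimensions.
    data = [(0, 0), (0.5, 0.2), (0.2, 0.6),
            (10, 0), (10.4, 0.3), (9.8, 0.5),
            (10, 10), (10.3, 9.7), (9.9, 10.4)]
    for group in divisive_clustering(data, 3):
        print(group)
```

Replacing split_largest_gap with a rule that optimises a projection index over arbitrary directions would bring the sketch closer to the methods the abstract describes; the recursion itself would stay the same.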

Lancaster E-Prints

Stellenbosch University SUNScholar Repository

Subspace Clustering with Active Learning

Authors: Pavlidis Nicos; Peng Hankui
Publication venue: Institute of Electrical and Electronics Engineers (IEEE)
Publication date: 11/11/2019

Subspace clustering is a growing field of unsupervised learning that has gained much popularity in the computer vision community. Applications can be found in areas such as motion segmentation and face clustering. It assumes that data originate from a union of subspaces, and clusters the data depending on the corresponding subspace. In practice, it is reasonable to assume that a limited number of labels can be obtained, potentially at a cost. Therefore, algorithms that can effectively and efficiently incorporate this information to improve the clustering model are desirable. In this paper, we propose an active learning framework for subspace clustering that sequentially queries informative points and updates the subspace model. The query stage of the proposed framework relies on results from the perturbation theory of principal component analysis to identify influential and potentially misclassified points. A constrained subspace clustering algorithm is proposed that monotonically decreases the objective function subject to the constraints imposed by the labelled data. We show that our proposed framework is suitable for subspace clustering algorithms including iterative methods and spectral methods. Experiments on synthetic data sets, motion segmentation data sets, and Yale Faces data sets demonstrate the advantage of our proposed active strategy over state-of-the-art

arXiv.org e-Print Archive

Crossref

Lancaster E-Prints

The algorithm selection problem for solving Sudoku with metaheuristics

Authors: Kheiri Ahmed; Notice Danielle; Pavlidis Nicos
Publication date: 03/07/2023

Lancaster E-Prints

Nonlinear Dimensionality Reduction for Clustering

Authors: Pavlidis Nicos; Roos Teemu; Tasoulis Sotiris
Publication venue: Elsevier BV
Publication date: 01/11/2020

We introduce an approach to divisive hierarchical clustering that is capable of identifying clusters in nonlinear manifolds. This approach uses the isometric mapping (Isomap) to recursively embed (subsets of) the data in one dimension, and then performs a binary partition designed to avoid the splitting of clusters. We provide a theoretical analysis of the conditions under which contiguous and high-density clusters in the original space are guaranteed to be separable in the one-dimensional embedding. To the best of our knowledge, there is little prior work that studies this problem. Extensive experiments on simulated and real data sets show that hierarchical divisive clustering algorithms derived from this approach are effective.

Lancaster E-Prints