Feature Selection in k-Median Clustering
An effective method for selecting features in clustering
unlabeled data is proposed based on changing the objective
function of the standard k-median clustering algorithm. The
change consists of perturbing the objective function by a
term that drives the medians of each of the k clusters toward
the (shifted) global median of zero for the entire dataset.
As the perturbation parameter is increased, more and more
features are driven automatically toward the global zero
median and are eliminated from the problem until one last
feature remains. An error curve for unlabeled data clustering
as a function of the number of features used gives reducedfeature
clustering error relative to the \gold standard" of the
full-feature clustering. This clustering error curve parallels
a classi cation error curve based on real data labels. This
justi es the utility of the former error curve for unlabeled
data as a means of choosing an appropriate number of
reduced features in order to achieve a correctness comparable
to that obtained by the full set of original features. For
example, on the 3-class Wine dataset, clustering with 4
selected input space features comes within 4% of
clustering using the original 13 features of the problem.
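The perturbation described in this abstract acts like an L1 penalty that pulls each cluster median toward the shifted global median of zero; features whose medians are zero in every cluster carry no clustering information and can be dropped. The sketch below is an illustrative reimplementation of that idea, not the paper's code: the alternating-minimization scheme, the candidate-search solver for the penalized 1-D median, and the farthest-point initialization are all assumptions.

```python
import numpy as np

def perturbed_median(values, lam):
    """1-D minimizer of sum_i |v_i - c| + lam*|c|.  The objective is
    piecewise linear, so an optimum lies at a data point or at 0;
    evaluate every candidate and keep the cheapest."""
    candidates = np.append(values, 0.0)
    costs = [np.abs(values - c).sum() + lam * abs(c) for c in candidates]
    return candidates[int(np.argmin(costs))]

def kmedian_feature_selection(X, k, lam, n_iter=50):
    """k-median clustering with a perturbation term that drives each
    cluster median toward the (shifted) global median of zero.
    Features whose medians are zero in all k clusters are eliminated."""
    X = X - np.median(X, axis=0)          # shift the global median to zero
    # deterministic farthest-point initialization (an assumption)
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.abs(X - c).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # assign each point to the nearest center in the 1-norm
        labels = np.abs(X[:, None, :] - centers[None]).sum(-1).argmin(1)
        for j in range(k):
            pts = X[labels == j]
            if len(pts) == 0:
                continue
            centers[j] = [perturbed_median(pts[:, f], lam)
                          for f in range(X.shape[1])]
    surviving = np.where(np.abs(centers).sum(0) > 1e-12)[0]
    return labels, centers, surviving
```

As `lam` grows, more median coordinates snap exactly to zero, so the surviving-feature set shrinks toward a single feature, which is the automatic elimination the abstract describes.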
Optimal Clustering Framework for Hyperspectral Band Selection
Band selection, by choosing a set of representative bands of a hyperspectral
image (HSI), is an effective method to reduce redundant information without
compromising the original contents. Recently, various unsupervised band
selection methods have been proposed, but most of them are based on
approximation algorithms which can only obtain suboptimal solutions toward a
specific objective function. This paper focuses on clustering-based band
selection, and proposes a new framework to solve the above dilemma, claiming
the following contributions: 1) An optimal clustering framework (OCF), which
can obtain the optimal clustering result for a particular form of objective
function under a reasonable constraint. 2) A rank on clusters strategy (RCS),
which provides an effective criterion to select bands on the existing
clustering structure. 3) An automatic method to determine the number of
required bands, which can better evaluate the distinctive information produced
by a certain number of bands. In experiments, the proposed algorithm is compared to
some state-of-the-art competitors. According to the experimental results, the
proposed algorithm is robust and significantly outperforms the other methods
on various data sets.
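One "reasonable constraint" under which a clustering objective becomes exactly solvable is contiguity: if each cluster must be a run of adjacent bands, the globally optimal partition can be found by dynamic programming rather than by approximation. The sketch below illustrates that general idea on a 1-D sequence with a within-segment squared-deviation cost; it is an assumed stand-in, not the objective function or algorithm from the paper.

```python
import numpy as np

def contiguous_optimal_clustering(values, k):
    """Exact partition of an ordered sequence into k contiguous segments
    minimizing total within-segment squared deviation, via dynamic
    programming in O(k * n^2) time."""
    n = len(values)
    prefix = np.concatenate([[0.0], np.cumsum(values)])
    prefix2 = np.concatenate([[0.0], np.cumsum(np.square(values))])

    def seg_cost(i, j):               # cost of one segment values[i..j]
        s = prefix[j + 1] - prefix[i]
        s2 = prefix2[j + 1] - prefix2[i]
        m = j - i + 1
        return s2 - s * s / m         # = sum (v - mean)^2

    INF = float("inf")
    dp = np.full((k + 1, n + 1), INF)   # dp[c][j]: best cost, c segs, j items
    cut = np.zeros((k + 1, n + 1), dtype=int)
    dp[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cost = dp[c - 1][i] + seg_cost(i, j - 1)
                if cost < dp[c][j]:
                    dp[c][j] = cost
                    cut[c][j] = i
    # backtrack the segment boundaries
    bounds, j = [], n
    for c in range(k, 0, -1):
        i = cut[c][j]
        bounds.append((i, j - 1))
        j = i
    return dp[k][n], bounds[::-1]
```

Because every k-segmentation is examined implicitly, the result is optimal for this cost, in contrast to the suboptimal approximations the abstract criticizes.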
Reducing Objective Function Mismatch in Deep Clustering with the Unsupervised Companion Objective
Preservation of local similarity structure is a key challenge in deep clustering. Many recent deep clustering methods therefore use autoencoders to help guide the model's neural network towards an embedding which is more reflective of the input space geometry. However, recent work has shown that autoencoder-based deep clustering models can suffer from objective function mismatch (OFM). In order to improve the preservation of local similarity structure, while simultaneously having a low OFM, we develop a new auxiliary objective function for deep clustering. Our Unsupervised Companion Objective (UCO) encourages a consistent clustering structure at intermediate layers in the network -- helping the network learn an embedding which is more reflective of the similarity structure in the input space. Since a clustering-based auxiliary objective has the same goal as the main clustering objective, it is less prone to introduce objective function mismatch between itself and the main objective. Our experiments show that attaching the UCO to a deep clustering model improves the model's performance and yields a lower OFM, compared to an analogous autoencoder-based model.
Belief Hierarchical Clustering
In the data mining field many clustering methods have been proposed, yet
standard versions do not take into account uncertain databases. This paper
deals with a new approach to cluster uncertain data by using a hierarchical
clustering defined within the belief function framework. The main objective of
the belief hierarchical clustering is to allow an object to belong to one or
several clusters. A degree of belief is associated with each such membership,
and clusters are combined based on the pignistic properties. Experiments with
real uncertain data show that the proposed method can be considered a
promising tool.
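The "degree of belief" attached to each cluster membership can be converted into ordinary probabilities over single clusters using the standard pignistic transform from belief function theory, BetP(w) = sum over subsets A containing w of m(A) / (|A| (1 - m(empty))). A minimal sketch of that transform follows; the function name and the dict-based mass representation are illustrative, not from the paper.

```python
def pignistic(masses, frame):
    """Pignistic transform: convert a belief mass function over subsets
    of the frame of discernment into a probability over singletons.
    `masses` maps frozensets to mass values summing to 1."""
    m_empty = masses.get(frozenset(), 0.0)
    betp = {w: 0.0 for w in frame}
    for A, mass in masses.items():
        if not A:
            continue                      # skip the empty set
        for w in A:                       # spread m(A) evenly over A
            betp[w] += mass / (len(A) * (1.0 - m_empty))
    return betp
```

An object assigned mass 0.5 to cluster {a} and mass 0.5 to the pair {a, b}, for instance, ends up with pignistic probabilities 0.75 for a and 0.25 for b, which is the "belong to one or several clusters" behavior the abstract describes.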
Fuzzy clustering with volume prototypes and adaptive cluster merging
Two extensions to the objective function-based fuzzy
clustering are proposed. First, the (point) prototypes are extended to hypervolumes, whose size can be fixed or can be determined automatically from the data being clustered. It is shown that clustering with hypervolume prototypes can be formulated as the minimization of an objective function. Second, a heuristic cluster merging step is introduced where the similarity among the clusters
is assessed during optimization. Starting with an overestimation of the number of clusters in the data, similar clusters are merged in order to obtain a suitable partitioning. An adaptive threshold for merging is proposed. The extensions proposed are applied to
Gustafson–Kessel and fuzzy c-means algorithms, and the resulting extended algorithm is given. The properties of the new algorithm are illustrated by various examples.
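The adaptive-merging extension described above can be sketched with standard fuzzy c-means plus a simple merge criterion. In the sketch below, the cluster-similarity measure is reduced to inter-center distance, and the 20% threshold and farthest-point initialization are assumptions; the hypervolume prototypes of the first extension are omitted.

```python
import numpy as np

def fcm_with_merging(X, c_init, m=2.0, n_iter=60):
    """Fuzzy c-means started from an overestimate c_init of the cluster
    count.  After every update, clusters whose centers lie closer than
    an adaptive threshold (20% of the mean inter-center distance, an
    assumed simplification of the paper's similarity measure) are
    merged, so the algorithm settles on a suitable partition itself."""
    # deterministic farthest-point initialization (an assumption)
    centers = [X[0].astype(float)]
    for _ in range(1, c_init):
        d = np.min([np.linalg.norm(X - c, axis=1) for c in centers], axis=0)
        centers.append(X[d.argmax()].astype(float))
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.linalg.norm(X[:, None] - centers[None], axis=2) + 1e-12
        u = d ** (-2.0 / (m - 1.0))
        u /= u.sum(axis=1, keepdims=True)       # fuzzy memberships
        um = u ** m
        centers = (um.T @ X) / um.sum(axis=0)[:, None]
        k = len(centers)
        if k > 1:
            pd = np.linalg.norm(centers[:, None] - centers[None], axis=2)
            thresh = 0.2 * pd[np.triu_indices(k, 1)].mean()
            keep = []                           # greedily drop near-duplicates
            for j in range(k):
                if all(pd[j, i] > thresh for i in keep):
                    keep.append(j)
            centers = centers[keep]
    labels = np.linalg.norm(X[:, None] - centers[None], axis=2).argmin(axis=1)
    return centers, labels
```

Started with, say, six prototypes on data containing two well-separated groups, the merging step fuses the redundant prototypes, so the run ends with two clusters without the user fixing the count in advance.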