Deep Clustering: A Comprehensive Survey
Cluster analysis plays an indispensable role in machine learning and data
mining. Learning a good data representation is crucial for clustering
algorithms. Recently, deep clustering, which can learn clustering-friendly
representations using deep neural networks, has been broadly applied in a wide
range of clustering tasks. Existing surveys of deep clustering mainly focus on
single-view settings and network architectures, ignoring the complex
application scenarios of clustering. To address this issue, in this paper we
provide a comprehensive survey of deep clustering from the perspective of data sources.
With different data sources and initial conditions, we systematically
distinguish the clustering methods in terms of methodology, prior knowledge,
and architecture. Concretely, deep clustering methods are introduced according
to four categories, i.e., traditional single-view deep clustering,
semi-supervised deep clustering, deep multi-view clustering, and deep transfer
clustering. Finally, we discuss the open challenges and potential future
opportunities in different fields of deep clustering.
Deep Divergence-Based Approach to Clustering
A promising direction in deep learning research consists in learning
representations and simultaneously discovering cluster structure in unlabeled
data by optimizing a discriminative loss function. As opposed to supervised
deep learning, this line of research is in its infancy, and how to design and
optimize suitable loss functions to train deep neural networks for clustering
is still an open question. Our contribution to this emerging field is a new
deep clustering network that leverages the discriminative power of
information-theoretic divergence measures, which have been shown to be
effective in traditional clustering. We propose a novel loss function that
incorporates geometric regularization constraints, thus avoiding degenerate
structures of the resulting clustering partition. Experiments on synthetic
benchmarks and real datasets show that the proposed network achieves
competitive performance with respect to other state-of-the-art methods, scales
well to large datasets, and does not require pre-training steps.
Text Classification: A Review, Empirical, and Experimental Evaluation
The explosive and widespread growth of data necessitates the use of text
classification to extract crucial information from vast amounts of data.
Consequently, there has been a surge of research in both classical and deep
learning text classification methods. Despite the numerous methods proposed in
the literature, there is still a pressing need for a comprehensive and
up-to-date survey. Existing survey papers categorize algorithms for text
classification into broad classes, which can lead to the misclassification of
unrelated algorithms and incorrect assessments of their qualities and behaviors
using the same metrics. To address these limitations, our paper introduces a
novel methodological taxonomy that classifies algorithms hierarchically into
fine-grained classes and specific techniques. The taxonomy includes methodology
categories, methodology techniques, and methodology sub-techniques. Our study
is the first survey to utilize this methodological taxonomy for classifying
algorithms for text classification. Furthermore, our study conducts
empirical evaluations and experimental comparisons and rankings of different
algorithms that employ the same specific sub-technique, different
sub-techniques within the same technique, different techniques within the same
category, and categorie
- …