5,971 research outputs found
Neither Global Nor Local: A Hierarchical Robust Subspace Clustering For Image Data
In this paper, we consider the problem of subspace clustering in the presence of
contiguous noise, occlusion and disguise. We argue that self-expressive
representation of data in current state-of-the-art approaches is severely
sensitive to occlusions and complex real-world noises. To alleviate this
problem, we propose a hierarchical framework that brings robustness of local
patches-based representations and discriminant property of global
representations together. This approach consists of 1) a top-down stage, in
which the input data is subject to repeated division to smaller patches and 2)
a bottom-up stage, in which the low-rank embeddings of local patches in the field
of view of a corresponding patch at the upper level are merged on a Grassmann
manifold. This summary provides two key pieces of information for the
corresponding patch at the upper level: cannot-links and recommended-links.
This information is employed for computing a self-expressive representation of
each patch at upper levels using a weighted sparse group lasso optimization
problem. Numerical results on several real data sets confirm the efficiency of
our approach
Deep Clustering With Intra-class Distance Constraint for Hyperspectral Images
The high dimensionality of hyperspectral images often results in the
degradation of clustering performance. Due to the powerful ability of deep
feature extraction and non-linear feature representation, the clustering
algorithm based on deep learning has become a hot research topic in the field
of hyperspectral remote sensing. However, most deep clustering algorithms for
hyperspectral images utilize deep neural networks as feature extractor without
considering prior knowledge constraints that are suitable for clustering. To
solve this problem, we propose an intra-class distance constrained deep
clustering algorithm for high-dimensional hyperspectral images. The proposed
algorithm constrains the feature mapping procedure of the auto-encoder network
by intra-class distance so that raw images are transformed from the original
high-dimensional space to the low-dimensional feature space that is more
conducive to clustering. Furthermore, the related learning process is treated
as a joint optimization problem of deep feature extraction and clustering.
Experimental results demonstrate the strong competitiveness of the proposed
algorithm in comparison with state-of-the-art clustering methods for
hyperspectral images.
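To make the idea of an intra-class distance constraint concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes the constraint is the mean squared Euclidean distance between each encoded feature and the centroid of its assigned cluster, and that it is added to the auto-encoder reconstruction error with a hypothetical weight `weight`. The function names are illustrative only.

```python
def centroids(features, labels):
    """Return {label: mean feature vector} for the current cluster assignment."""
    sums, counts = {}, {}
    for vec, lab in zip(features, labels):
        acc = sums.setdefault(lab, [0.0] * len(vec))
        for i, v in enumerate(vec):
            acc[i] += v
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: [s / counts[lab] for s in acc] for lab, acc in sums.items()}


def intra_class_distance(features, labels):
    """Mean squared distance from each feature to its own cluster centroid."""
    cents = centroids(features, labels)
    total = 0.0
    for vec, lab in zip(features, labels):
        total += sum((v - c) ** 2 for v, c in zip(vec, cents[lab]))
    return total / len(features)


def joint_loss(reconstruction_error, features, labels, weight=0.1):
    """Assumed joint objective: reconstruction error plus weighted constraint."""
    return reconstruction_error + weight * intra_class_distance(features, labels)
```

A smaller `intra_class_distance` means the encoded features of each cluster are more compact, which is the property the abstract describes as more conducive to clustering.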
Deep Comprehensive Correlation Mining for Image Clustering
Recently developed deep unsupervised methods allow us to jointly learn
representation and cluster unlabelled data. These deep clustering methods
mainly focus on the correlation among samples, e.g., selecting high precision
pairs to gradually tune the feature representation, which neglects other useful
correlations. In this paper, we propose a novel clustering framework, named
deep comprehensive correlation mining (DCCM), for exploring and taking full
advantage of various kinds of correlations behind the unlabeled data from three
aspects: 1) Instead of only using pair-wise information, pseudo-label
supervision is proposed to investigate category information and learn
discriminative features. 2) The features' robustness to image transformation of
input space is fully explored, which benefits the network learning and
significantly improves the performance. 3) The triplet mutual information among
features is presented for clustering problem to lift the recently discovered
instance-level deep mutual information to a triplet-level formation, which
further helps to learn more discriminative features. Extensive experiments on
several challenging datasets show that our method achieves good performance,
e.g., attaining clustering accuracy on CIFAR-10, which is
higher than the state-of-the-art results. Comment: Accepted to ICCV 201
Local Deep-Feature Alignment for Unsupervised Dimension Reduction
This paper presents an unsupervised deep-learning framework named Local
Deep-Feature Alignment (LDFA) for dimension reduction. We construct a
neighbourhood for each data sample and learn a local Stacked Contractive
Auto-encoder (SCAE) from the neighbourhood to extract the local deep features.
Next, we exploit an affine transformation to align the local deep features of
each neighbourhood with the global features. Moreover, we derive an approach
from LDFA to map explicitly a new data sample into the learned low-dimensional
subspace. The advantage of the LDFA method is that it learns both local and
global characteristics of the data sample set: the local SCAEs capture local
characteristics contained in the data set, while the global alignment
procedures encode the interdependencies between neighbourhoods into the final
low-dimensional feature representations. Experimental results on data
visualization, clustering and classification show that the LDFA method is
competitive with several well-known dimension reduction techniques, and
exploiting locality in deep learning is a research topic worth further
exploring
Moving Object Segmentation in Jittery Videos by Stabilizing Trajectories Modeled in Kendall's Shape Space
Moving Object Segmentation is a challenging task for jittery/wobbly videos.
For jittery videos, the non-smooth camera motion makes discrimination between
foreground objects and background layers hard to solve. While most recent works
for moving video object segmentation fail in this scenario, our method
generates an accurate segmentation of a single moving object. The proposed
method performs a sparse segmentation, where frame-wise labels are assigned
only to trajectory coordinates, followed by the pixel-wise labeling of frames.
The sparse segmentation involves stabilization and clustering of trajectories
in a 3-stage iterative process. At the 1st stage, the trajectories are
clustered using pairwise Procrustes distance as a cue for creating an affinity
matrix. The 2nd stage performs a block-wise Procrustes analysis of the
trajectories and estimates Frechet means (in Kendall's shape space) of the
clusters. The Frechet means represent the average trajectories of the motion
clusters. An optimization function has been formulated to stabilize the Frechet
means, yielding stabilized trajectories at the 3rd stage. The accuracy of the
motion clusters is iteratively refined, producing distinct groups of
stabilized trajectories. Next, the labels obtained from the sparse segmentation
are propagated for pixel-wise labeling of the frames, using a GraphCut based
energy formulation. The use of Procrustes analysis and energy minimization in
Kendall's shape space for moving object segmentation in jittery videos is the
novelty of this work. A second contribution comes from experiments performed on a
dataset of 20 real-world natural jittery videos with manually annotated
ground truth. Experiments are also done with controlled levels of artificial
jitter on videos of the SegTrack2 dataset. Qualitative and quantitative results
indicate the superiority of the proposed method. Comment: 13 pages, 3 figures,
Published in British Machine Vision Conference 2017 (BMVC-2017)
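The first stage described above (pairwise Procrustes distance used as a cue for an affinity matrix) can be sketched as follows. This is an illustration under stated assumptions, not the paper's code: each trajectory is assumed to be a list of 2-D points of equal length, the distance is the closed-form full Procrustes distance for planar configurations written with complex numbers, and the Gaussian kernel with bandwidth `sigma` is a hypothetical choice for turning distances into affinities.

```python
import math


def preshape(points):
    """Center a planar trajectory and scale it to unit norm (complex form)."""
    z = [complex(x, y) for x, y in points]
    mean = sum(z) / len(z)
    z = [p - mean for p in z]
    norm = math.sqrt(sum(abs(p) ** 2 for p in z))
    if norm == 0:
        raise ValueError("trajectory has zero extent")
    return [p / norm for p in z]


def procrustes_distance(traj_a, traj_b):
    """Full Procrustes distance, invariant to translation, scale and rotation."""
    if len(traj_a) != len(traj_b):
        raise ValueError("trajectories must have the same number of points")
    za, zb = preshape(traj_a), preshape(traj_b)
    inner = sum(a.conjugate() * b for a, b in zip(za, zb))
    return math.sqrt(max(0.0, 1.0 - abs(inner) ** 2))


def affinity_matrix(trajectories, sigma=0.5):
    """Assumed Gaussian-kernel affinity built from pairwise distances."""
    n = len(trajectories)
    return [
        [
            math.exp(
                -procrustes_distance(trajectories[i], trajectories[j]) ** 2
                / (2.0 * sigma ** 2)
            )
            for j in range(n)
        ]
        for i in range(n)
    ]
```

Because the distance ignores translation, scale and rotation, two trajectories that differ only by such a similarity transform receive an affinity close to 1, while trajectories with different shapes receive a lower one.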
Towards combinatorial clustering: preliminary research survey
The paper describes clustering problems from the combinatorial viewpoint. A
brief systemic survey is presented including the following: (i) basic
clustering problems (e.g., classification, clustering, sorting, clustering with
an order over clusters), (ii) basic approaches to assessment of objects and
object proximities (i.e., scales, comparison, aggregation issues), (iii) basic
approaches to evaluation of local quality characteristics for clusters and
total quality characteristics for clustering solutions, (iv) clustering as
multicriteria optimization problem, (v) generalized modular clustering
framework, (vi) basic clustering models/methods (e.g., hierarchical clustering,
k-means clustering, minimum spanning tree based clustering, clustering as
assignment, detection of clique/quasi-clique based clustering, correlation
clustering, network communities based clustering). Special attention is
targeted to formulation of clustering as multicriteria optimization models.
Combinatorial optimization models are used as auxiliary problems (e.g.,
assignment, partitioning, knapsack problem, multiple choice problem,
morphological clique problem, searching for consensus/median for structures).
Numerical examples illustrate problem formulations, solving methods, and
applications. The material can be used as follows: (a) a research survey, (b) a
foundation for designing the structure/architecture of composite modular
clustering software, (c) a bibliography reference collection, and (d) a
tutorial. Comment: 102 pages, 66 figures, 67 tables
Learning From Hidden Traits: Joint Factor Analysis and Latent Clustering
Dimensionality reduction techniques play an essential role in data analytics,
signal processing and machine learning. Dimensionality reduction is usually
performed in a preprocessing stage that is separate from subsequent data
analysis, such as clustering or classification. Finding reduced-dimension
representations that are well-suited for the intended task is more appealing.
This paper proposes a joint factor analysis and latent clustering framework,
which aims at learning cluster-aware low-dimensional representations of matrix
and tensor data. The proposed approach leverages matrix and tensor
factorization models that produce essentially unique latent representations of
the data to unravel latent cluster structure -- which is otherwise obscured
because of the freedom to apply an oblique transformation in latent space. At
the same time, latent cluster structure is used as prior information to enhance
the performance of factorization. Specific contributions include several
custom-built problem formulations, corresponding algorithms, and discussion of
associated convergence properties. Besides extensive simulations, real-world
datasets such as Reuters document data and MNIST image data are also employed
to showcase the effectiveness of the proposed approaches
Spectral Clustering via Ensemble Deep Autoencoder Learning (SC-EDAE)
Recently, a number of works have studied clustering strategies that combine
classical clustering algorithms and deep learning methods. These approaches
follow either a sequential way, where a deep representation is learned using a
deep autoencoder before obtaining clusters with k-means, or a simultaneous way,
where deep representation and clusters are learned jointly by optimizing a
single objective function. Both strategies improve clustering performance;
however, the robustness of these approaches is impeded by several deep
autoencoder setting issues, including weight initialization, the width
and number of layers, and the number of epochs. To alleviate the impact of such
hyperparameter settings on the clustering performance, we propose a new model
which combines the spectral clustering and deep autoencoder strengths in an
ensemble learning framework. Extensive experiments on various benchmark
datasets demonstrate the potential and robustness of our approach compared to
state-of-the-art deep clustering methods. Comment: Revised manuscript
Coarse-to-Fine Classification via Parametric and Nonparametric Models for Computer-Aided Diagnosis
Classification is one of the core problems in Computer-Aided Diagnosis (CAD),
targeting early cancer detection using 3D medical imaging interpretation.
High detection sensitivity with desirably low false positive (FP) rate is
critical for a CAD system to be accepted as a valuable or even indispensable
tool in radiologists' workflow. Given various spurious imagery noises which
cause observation uncertainties, this remains a very challenging task. In this
paper, we propose a novel, two-tiered coarse-to-fine (CTF) classification
cascade framework to tackle this problem. We first obtain
classification-critical data samples (e.g., samples on the decision boundary)
extracted from the holistic data distributions using a robust parametric model
(e.g., \cite{Raykar08}); then we build a graph-embedding based nonparametric
classifier on sampled data, which can more accurately preserve or formulate the
complex classification boundary. These two steps can also be considered as
effective "sample pruning" and "feature pursuing + NN/template matching",
respectively. Our approach is validated comprehensively in CAD systems for
colorectal polyp detection and lung nodule detection, which address the two
deadliest cancers, using hospital-scale, multi-site clinical datasets. The results show that our
method achieves overall better classification/detection performance than
existing state-of-the-art algorithms using single-layer classifiers, such as
the support vector machine variants \cite{Wang08}, boosting \cite{Slabaugh10},
logistic regression \cite{Ravesteijn10}, relevance vector machine
\cite{Raykar08}, $k$-nearest neighbor \cite{Murphy09} or spectral projections
on graph \cite{Cai08}.
Multi-View Surveillance Video Summarization via Joint Embedding and Sparse Optimization
Most traditional video summarization methods are designed to generate
effective summaries for single-view videos, and thus they cannot fully exploit
the complicated intra- and inter-view correlations in summarizing multi-view
videos in a camera network. In this paper, with the aim of summarizing
multi-view videos, we introduce a novel unsupervised framework via joint
embedding and sparse representative selection. The objective function is
two-fold. The first is to capture the multi-view correlations via an embedding,
which helps in extracting a diverse set of representatives. The second is to
use an $\ell_{2,1}$-norm to model the sparsity while selecting representative shots for
the summary. We propose to jointly optimize both of the objectives, such that
embedding can not only characterize the correlations, but also indicate the
requirements of sparse representative selection. We present an efficient
alternating algorithm based on half-quadratic minimization to solve the
proposed non-smooth and non-convex objective with convergence analysis. A key
advantage of the proposed approach with respect to the state-of-the-art is that
it can summarize multi-view videos without assuming any prior
correspondences/alignment between them, e.g., uncalibrated camera networks.
Rigorous experiments on several multi-view datasets demonstrate that our
approach clearly outperforms the state-of-the-art methods. Comment: IEEE Trans. on Multimedia, 2017 (In Press)
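For readers unfamiliar with the row-sparsity norm mentioned in the abstract above, the following sketch shows how an ℓ2,1 norm is computed and why it suits representative selection. It assumes a selection matrix `Z` stored as a list of rows, where row i holds the coefficients with which shot i is used to represent the other shots. The helper `select_representatives` and its tolerance `tol` are hypothetical and only illustrate how rows with non-zero energy would be read off as selected shots.

```python
import math


def row_norms(matrix):
    """Euclidean (l2) norm of every row."""
    return [math.sqrt(sum(v * v for v in row)) for row in matrix]


def l21_norm(matrix):
    """l2,1 norm: the sum of the l2 norms of the rows."""
    return sum(row_norms(matrix))


def select_representatives(matrix, tol=1e-8):
    """Indices of rows whose l2 norm is above tol (assumed selection rule)."""
    return [i for i, norm in enumerate(row_norms(matrix)) if norm > tol]
```

Penalizing the ℓ2,1 norm drives entire rows to zero rather than individual entries, so only a few shots keep non-zero rows and those shots form the summary.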