31 research outputs found
A distribution-dependent Mumford-Shah model for unsupervised hyperspectral image segmentation
Hyperspectral images provide a rich representation of the underlying spectrum
for each pixel, allowing for a pixel-wise classification/segmentation into
different classes. As the acquisition of labeled training data is very
time-consuming, unsupervised methods become crucial in hyperspectral image
analysis. The spectral variability and noise in hyperspectral data make this
task very challenging and define special requirements for such methods.
Here, we present a novel unsupervised hyperspectral segmentation framework.
It starts with a denoising and dimensionality reduction step by the
well-established Minimum Noise Fraction (MNF) transform. Then, the Mumford-Shah
(MS) segmentation functional is applied to segment the data. We equipped the MS
functional with a novel robust distribution-dependent indicator function
designed to handle the characteristic challenges of hyperspectral data. To
optimize our objective function with respect to the parameters for which no
closed form solution is available, we propose an efficient fixed point
iteration scheme. Numerical experiments on four public benchmark datasets show
that our method produces competitive results, which outperform two
state-of-the-art methods substantially on three of these datasets.
Simplified Energy Landscape for Modularity Using Total Variation
Networks capture pairwise interactions between entities and are frequently
used in applications such as social networks, food networks, and protein
interaction networks, to name a few. Communities, cohesive groups of nodes,
often form in these applications, and identifying them gives insight into the
overall organization of the network. One common quality function used to
identify community structure is modularity. In Hu et al. [SIAM J. App. Math.,
73(6), 2013], it was shown that modularity optimization is equivalent to
minimizing a particular nonconvex total variation (TV) based functional over a
discrete domain. They solve this problem, assuming the number of communities is
known, using a Merriman, Bence, Osher (MBO) scheme.
We show that modularity optimization is equivalent to minimizing a convex
TV-based functional over a discrete domain, again, assuming the number of
communities is known. Furthermore, we show that modularity has no convex
relaxation satisfying certain natural conditions. We therefore find a
manageable non-convex approximation using a Ginzburg Landau functional, which
provably converges to the correct energy in the limit of a certain parameter.
We then derive an MBO algorithm with fewer hand-tuned parameters than in Hu et
al. and which is 7 times faster at solving the associated diffusion equation
due to the fact that the underlying discretization is unconditionally stable.
Our numerical tests include a hyperspectral video whose associated graph has
2.9x10^7 edges, which is roughly 37 times larger than was handled in the paper
of Hu et al.
Comment: 25 pages, 3 figures, 3 tables, submitted to SIAM J. App. Math.
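As a small illustration of the quality function discussed in this abstract, the sketch below evaluates modularity for a fixed partition of an undirected, unweighted graph without self-loops. It uses the standard formula Q = sum over communities c of (L_c / m - (d_c / 2m)^2), where L_c is the number of edges inside c, d_c is the total degree of c, and m is the number of edges. The function name `modularity` and the toy graph are assumptions made for this example; this is not the TV or MBO formulation of the paper, only the quantity those methods optimize.

```python
def modularity(edges, labels):
    """Modularity of a partition of an undirected, unweighted graph.

    edges  : list of (u, v) pairs, no self-loops, each edge listed once
    labels : dict mapping node -> community label
    """
    m = len(edges)

    # Degree of every node.
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1

    # Number of edges with both endpoints in the same community.
    intra = {}
    for u, v in edges:
        if labels[u] == labels[v]:
            intra[labels[u]] = intra.get(labels[u], 0) + 1

    # Total degree per community.
    deg_sum = {}
    for node, d in degree.items():
        deg_sum[labels[node]] = deg_sum.get(labels[node], 0) + d

    return sum(intra.get(c, 0) / m - (deg_sum[c] / (2 * m)) ** 2
               for c in deg_sum)


# Toy graph: two triangles joined by a single bridge edge (2, 3).
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
two_groups = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
one_group = {node: "a" for node in range(6)}

q_two = modularity(edges, two_groups)
q_one = modularity(edges, one_group)
```

On this toy graph the two-triangle partition scores higher than the trivial single-community partition, which always has modularity zero.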
Hyperspectral Image Clustering with Spatially-Regularized Ultrametrics
We propose a method for the unsupervised clustering of hyperspectral images
based on spatially regularized spectral clustering with ultrametric path
distances. The proposed method efficiently combines data density and geometry
to distinguish between material classes in the data, without the need for
training labels. The proposed method is efficient, with quasilinear scaling in
the number of data points, and enjoys robust theoretical performance
guarantees. Extensive experiments on synthetic and real HSI data demonstrate
its strong performance compared to benchmark and state-of-the-art methods. In
particular, the proposed method not only achieves excellent labeling accuracy
but also efficiently estimates the number of clusters.
Comment: 5 pages, 2 columns, 9 figures
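To make the idea of an ultrametric path distance concrete, here is a minimal sketch under the assumption that it refers to the minimax (longest-leg) path distance: the distance between two points is the smallest achievable value of the largest single hop over all paths connecting them. The function name `minimax_path_distances` and the 1-D toy data are hypothetical. The cubic-time Floyd-Warshall style update below is chosen for brevity only and is not the quasilinear algorithm the abstract refers to.

```python
def minimax_path_distances(points):
    """All-pairs minimax path distances for scalar points.

    Starts from pairwise absolute differences on the complete graph and
    relaxes each pair through every intermediate point k, keeping the
    path whose largest hop is smallest.
    """
    n = len(points)
    d = [[abs(points[i] - points[j]) for j in range(n)] for i in range(n)]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = max(d[i][k], d[k][j])
                if via_k < d[i][j]:
                    d[i][j] = via_k
    return d


# Two tight groups on a line, separated by a gap of 8.
points = [0.0, 1.0, 2.0, 10.0, 11.0]
dist = minimax_path_distances(points)
```

Points inside the same dense group end up at distance 1 from one another, while any pair that straddles the gap is at distance 8, which is why this kind of distance separates clusters by density gaps rather than by raw Euclidean spread.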
Semi-Supervised First-Person Activity Recognition in Body-Worn Video
Body-worn cameras are now commonly used for logging daily life, sports, and
law enforcement activities, creating a large volume of archived footage. This
paper studies the problem of classifying frames of footage according to the
activity of the camera-wearer with an emphasis on application to real-world
police body-worn video. Real-world datasets pose a different set of challenges
from existing egocentric vision datasets: the amount of footage of different
activities is unbalanced, the data contains personally identifiable
information, and in practice it is difficult to provide substantial training
footage for a supervised approach. We address these challenges by extracting
features based exclusively on motion information, then segmenting the video
footage using a semi-supervised classification algorithm. On publicly available
datasets, our method achieves results comparable to, if not better than,
supervised and/or deep learning methods using a fraction of the training data.
It also shows promising results on real-world police body-worn video.
Stochastic Block Models are a Discrete Surface Tension
Networks, which represent agents and interactions between them, arise in
myriad applications throughout the sciences, engineering, and even the
humanities. To understand large-scale structure in a network, a common task is
to cluster a network's nodes into sets called "communities", such that there
are dense connections within communities but sparse connections between them. A
popular and statistically principled method to perform such clustering is to
use a family of generative models known as stochastic block models (SBMs). In
this paper, we show that maximum likelihood estimation in an SBM is a network
analog of a well-known continuum surface-tension problem that arises from an
application in metallurgy. To illustrate the utility of this relationship, we
implement network analogs of three surface-tension algorithms, with which we
successfully recover planted community structure in synthetic networks and
which yield fascinating insights on empirical networks that we construct from
hyperspectral videos.
Comment: to appear in Journal of Nonlinear Science
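The objective behind maximum likelihood estimation in an SBM can be sketched as follows, assuming the simplest Bernoulli variant: every pair of nodes in blocks r and s is connected independently with probability p_rs, and for a fixed partition each p_rs is replaced by its maximum likelihood estimate (observed edges divided by possible pairs). The function name `sbm_profile_log_likelihood` and the toy graph are assumptions for illustration; the paper may work with a different SBM variant, and this sketch does not implement its surface-tension algorithms.

```python
import math
from itertools import combinations


def sbm_profile_log_likelihood(n_nodes, edges, labels):
    """Bernoulli SBM log-likelihood of a partition, with block
    probabilities set to their maximum likelihood estimates.

    n_nodes : number of nodes, labelled 0 .. n_nodes - 1
    edges   : list of undirected (u, v) pairs
    labels  : list where labels[u] is the block of node u
    """
    edge_set = {frozenset(e) for e in edges}
    pairs, hits = {}, {}
    for u, v in combinations(range(n_nodes), 2):
        block = tuple(sorted((labels[u], labels[v])))
        pairs[block] = pairs.get(block, 0) + 1
        if frozenset((u, v)) in edge_set:
            hits[block] = hits.get(block, 0) + 1

    total = 0.0
    for block, n_rs in pairs.items():
        m_rs = hits.get(block, 0)
        # Blocks with p = 0 or p = 1 contribute zero (0 * log 0 := 0).
        if 0 < m_rs < n_rs:
            p = m_rs / n_rs
            total += m_rs * math.log(p) + (n_rs - m_rs) * math.log(1 - p)
    return total


# Two triangles joined by one bridge edge.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
planted = sbm_profile_log_likelihood(6, edges, [0, 0, 0, 1, 1, 1])
mixed = sbm_profile_log_likelihood(6, edges, [0, 0, 1, 0, 1, 1])
```

The planted partition attains a higher log-likelihood than the mixed one; searching over partitions for the maximizer is the estimation problem the abstract relates to surface tension.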
Automated Remote Sensing Image Interpretation with Limited Labeled Training Data
Automated remote sensing image interpretation has been investigated for more than a decade. In the early years, most work was based on the assumption that there are sufficient labeled samples to be used for training. However, ground-truth collection is a very tedious and time-consuming task and sometimes very expensive, especially in remote sensing, which usually relies on field surveys to collect ground truth. In recent years, with the development of advanced machine learning techniques, remote sensing image interpretation with limited ground truth has caught the attention of researchers in the fields of both remote sensing and computer science.
Three approaches that focus on different aspects of the interpretation process, i.e., feature extraction, classification, and segmentation, are proposed to deal with the limited ground truth problem. First, feature extraction techniques, which usually serve as a pre-processing step for remote sensing image classification, are explored. Instead of only focusing on feature extraction, a joint feature extraction and classification framework is proposed based on ensemble local manifold learning. Second, classifiers in the case of limited labeled training data are investigated, and an enhanced ensemble learning method that outperforms state-of-the-art classification methods is proposed. Third, image segmentation techniques are investigated, with the aid of unlabeled samples and spatial information. A semi-supervised self-training method is proposed, which is capable of expanding the number of training samples on its own and hence improving classification performance iteratively. Experiments show that the proposed approaches outperform state-of-the-art techniques in terms of classification accuracy on benchmark remote sensing datasets.
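The self-training idea mentioned above, in which a classifier grows its own training set from unlabeled samples, can be sketched generically as follows. This is an illustration only: it substitutes a nearest-centroid classifier on scalar features and uses distance to the nearest centroid as the confidence score, neither of which is taken from the thesis. The function name `self_train` and its parameters `rounds` and `per_round` are hypothetical.

```python
def self_train(labeled, unlabeled, rounds=10, per_round=1):
    """Generic self-training loop with a nearest-centroid classifier.

    labeled   : list of (x, y) pairs with scalar feature x and label y
    unlabeled : list of scalar features
    Each round fits class centroids on the current labeled set, predicts
    labels for the unlabeled pool, and moves the most confident
    predictions (closest to a centroid) into the labeled set.
    """
    labeled = list(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        if not pool:
            break
        # Fit: one centroid per class.
        sums, counts = {}, {}
        for x, y in labeled:
            sums[y] = sums.get(y, 0.0) + x
            counts[y] = counts.get(y, 0) + 1
        centroids = {y: sums[y] / counts[y] for y in sums}

        # Predict and score every sample in the pool.
        scored = []
        for x in pool:
            label = min(centroids, key=lambda y: abs(x - centroids[y]))
            scored.append((abs(x - centroids[label]), x, label))
        scored.sort()

        # Promote the most confident predictions to training samples.
        for _, x, label in scored[:per_round]:
            labeled.append((x, label))
            pool.remove(x)
    return labeled


seed = [(0.0, "a"), (10.0, "b")]
pool = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
result = dict(self_train(seed, pool))
```

Starting from one labeled sample per class, the loop labels the samples nearest each seed first and works outward, so the training set expands by itself over the rounds.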