Advances in Hyperspectral Image Classification: Earth monitoring with statistical learning methods
Hyperspectral images show similar statistical properties to natural grayscale
or color photographic images. However, the classification of hyperspectral
images is more challenging because of the very high dimensionality of the
pixels and the small number of labeled examples typically available for
learning. These peculiarities lead to particular signal processing problems,
mainly characterized by indetermination and complex manifolds. The framework of
statistical learning has gained popularity in the last decade. New methods have
been presented to account for the spatial homogeneity of images, to include
user's interaction via active learning, to take advantage of the manifold
structure with semisupervised learning, to extract and encode invariances, or
to adapt classifiers and image representations to unseen yet similar scenes.
This tutorial reviews the main advances for hyperspectral remote sensing image
classification through illustrative examples. Comment: IEEE Signal Processing Magazine, 201
CNN Architectures for Large-Scale Audio Classification
Convolutional Neural Networks (CNNs) have proven very effective in image
classification and show promise for audio. We use various CNN architectures to
classify the soundtracks of a dataset of 70M training videos (5.24 million
hours) with 30,871 video-level labels. We examine fully connected Deep Neural
Networks (DNNs), AlexNet [1], VGG [2], Inception [3], and ResNet [4]. We
investigate varying the size of both training set and label vocabulary, finding
that analogs of the CNNs used in image classification do well on our audio
classification task, and larger training and label sets help up to a point. A
model using embeddings from these classifiers does much better than raw
features on the Audio Set [5] Acoustic Event Detection (AED) classification
task. Comment: Accepted for publication at ICASSP 2017. Changes: Added definitions of
mAP, AUC, and d-prime. Updated mAP/AUC/d-prime numbers for Audio Set based on
changes of latest Audio Set revision. Changed wording to fit 4 page limit
with new addition
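The comment above mentions d-prime alongside mAP and AUC but does not give its definition. As a rough illustration only, the sketch below uses the common equal-variance Gaussian relation d' = sqrt(2) * Phi^{-1}(AUC), where Phi^{-1} is the inverse standard normal CDF. Both that choice of relation and the function name `d_prime_from_auc` are assumptions, not something this listing confirms.

```python
from math import sqrt
from statistics import NormalDist


def d_prime_from_auc(auc: float) -> float:
    """Convert an AUC in (0, 1) to d-prime.

    Assumes the equal-variance Gaussian model, d' = sqrt(2) * Phi^{-1}(AUC).
    NormalDist.inv_cdf raises StatisticsError for values outside (0, 1).
    """
    return sqrt(2.0) * NormalDist().inv_cdf(auc)


# A chance-level classifier (AUC = 0.5) maps to d' = 0; higher AUC gives higher d'.
chance = d_prime_from_auc(0.5)
better = d_prime_from_auc(0.9)
```

Under this relation d-prime grows without bound as AUC approaches 1, which spreads out differences between strong classifiers that look nearly identical on the AUC scale.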
ProTeCt: Prompt Tuning for Hierarchical Consistency
Large visual-language models, like CLIP, learn generalized representations
and have shown promising zero-shot performance. Few-shot adaptation methods,
based on prompt tuning, have also been shown to further improve performance on
downstream datasets. However, these models are not hierarchically consistent.
Frequently, they infer incorrect labels at coarser taxonomic class levels, even
when the inference at the leaf level (original class labels) is correct. This
is problematic, given their support for open set classification and, in
particular, open-grained classification, where practitioners define label sets
at various levels of granularity. To address this problem, we propose a prompt
tuning technique to calibrate the hierarchical consistency of model
predictions. A set of metrics of hierarchical consistency, the Hierarchical
Consistent Accuracy (HCA) and the Mean Treecut Accuracy (MTA), are first
proposed to benchmark model performance in the open-granularity setting. A
prompt tuning technique, denoted as Prompt Tuning for Hierarchical Consistency
(ProTeCt), is then proposed to calibrate classification across all possible
label set granularities. Results show that ProTeCt can be combined with
existing prompt tuning methods to significantly improve open-granularity
classification performance without degradation of the original classification
performance at the leaf level
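
The abstract names the Hierarchical Consistent Accuracy (HCA) without defining it. The sketch below takes one plausible reading: a sample counts as correct only if the predicted label matches the ground truth at every level of the taxonomy, from the coarsest class down to the leaf. The function names, the path-of-labels representation, and the toy taxonomy are all hypothetical, and the paper's exact definition may differ.

```python
from typing import Sequence, Tuple

Path = Tuple[str, ...]  # labels ordered from coarsest level to leaf


def leaf_accuracy(preds: Sequence[Path], truths: Sequence[Path]) -> float:
    """Fraction of samples whose leaf (last) label is correct."""
    hits = sum(p[-1] == t[-1] for p, t in zip(preds, truths))
    return hits / len(truths)


def hierarchical_consistent_accuracy(
    preds: Sequence[Path], truths: Sequence[Path]
) -> float:
    """Fraction of samples that are correct at every taxonomy level (assumed reading)."""
    hits = sum(tuple(p) == tuple(t) for p, t in zip(preds, truths))
    return hits / len(truths)


# Toy data: the second sample has the right leaf but the wrong coarse class.
truths = [
    ("animal", "dog", "beagle"),
    ("animal", "cat", "siamese"),
    ("vehicle", "car", "sedan"),
]
preds = [
    ("animal", "dog", "beagle"),
    ("vehicle", "cat", "siamese"),
    ("vehicle", "car", "coupe"),
]

leaf_acc = leaf_accuracy(preds, truths)
hca = hierarchical_consistent_accuracy(preds, truths)
```

In this toy case the leaf accuracy is 2/3 while the consistent accuracy is 1/3, which is the kind of gap the abstract describes: a correct leaf prediction paired with an incorrect coarser-level prediction.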