Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings
Generalizing deep neural networks to new target domains is critical to their
real-world utility. In practice, it may be feasible to get some target data
labeled, but to be cost-effective it is desirable to select a
maximally-informative subset via active learning (AL). We study the problem of
AL under a domain shift, called Active Domain Adaptation (Active DA). We
empirically demonstrate how existing AL approaches based solely on model
uncertainty or diversity sampling are suboptimal for Active DA. Our algorithm,
Active Domain Adaptation via Clustering Uncertainty-weighted Embeddings
(ADA-CLUE), i) identifies target instances for labeling that are both uncertain
under the model and diverse in feature space, and ii) leverages the available
source and target data for adaptation by optimizing a semi-supervised
adversarial entropy loss that is complementary to our active sampling
objective. On standard image classification-based domain adaptation benchmarks,
ADA-CLUE consistently outperforms competing active adaptation, active learning,
and domain adaptation methods across domain shifts of varying severity.
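A minimal sketch of the kind of sampling step the abstract describes: weight each unlabeled target embedding by its predictive entropy, run a weighted k-means, and label the instance nearest each centroid. This is an illustrative reconstruction under stated assumptions, not the authors' implementation; `predictive_entropy` and `weighted_kmeans_select` are hypothetical names.

```python
import numpy as np

def predictive_entropy(probs):
    """Entropy of each row of softmax probabilities, used as an uncertainty weight."""
    return -np.sum(probs * np.log(probs + 1e-12), axis=1)

def weighted_kmeans_select(embeddings, weights, budget, n_iter=20, seed=0):
    """Uncertainty-weighted k-means over embeddings; return one index per cluster.

    Cluster centers are weighted means, so uncertain points pull centers
    toward themselves; the instance nearest each center is selected for labeling.
    """
    rng = np.random.default_rng(seed)
    centers = embeddings[rng.choice(len(embeddings), budget, replace=False)]
    for _ in range(n_iter):
        # assign each point to its nearest center
        dist = ((embeddings[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        assign = dist.argmin(axis=1)
        for k in range(budget):
            mask = assign == k
            if mask.any():
                w = weights[mask][:, None]
                centers[k] = (w * embeddings[mask]).sum(0) / (w.sum() + 1e-12)
    # pick the instance closest to each final center
    dist = ((embeddings[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    return np.unique(dist.argmin(axis=0))
```

The returned indices would then be sent for annotation, combining high model uncertainty with diversity in feature space.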
Semantic Autoencoder for Zero-Shot Learning
Existing zero-shot learning (ZSL) models typically learn a projection
function from a feature space to a semantic embedding space (e.g. attribute
space). However, such a projection function is only concerned with predicting
the semantic representations of the seen training classes (e.g. attribute
prediction) or with classification. When applied to test data, which in the
context of ZSL contains different (unseen) classes without training data, a ZSL
model typically suffers from the projection domain shift problem. In this work,
we present a novel
solution to ZSL based on learning a Semantic AutoEncoder (SAE). Taking the
encoder-decoder paradigm, an encoder aims to project a visual feature vector
into the semantic space as in the existing ZSL models. However, the decoder
exerts an additional constraint, that is, the projection/code must be able to
reconstruct the original visual feature. We show that with this additional
reconstruction constraint, the learned projection function from the seen
classes is able to generalise better to the new unseen classes. Importantly,
the encoder and decoder are linear and symmetric, which enables us to develop an
extremely efficient learning algorithm. Extensive experiments on six benchmark
datasets demonstrate that the proposed SAE significantly outperforms existing
ZSL models, with the additional benefit of lower computational cost.
Furthermore, when the SAE is applied to the supervised clustering problem, it
also beats the state of the art. Comment: accepted to CVPR201
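Because the encoder and decoder are linear and tied, the training problem admits a closed-form solution. A hedged sketch, assuming the objective min_W ||X − WᵀS||² + λ||WX − S||² with visual features X and class semantics S, whose stationarity condition is a Sylvester equation; `sae_encoder` is a hypothetical name, not from the paper:

```python
import numpy as np
from scipy.linalg import solve_sylvester

def sae_encoder(X, S, lam=0.2):
    """Learn a linear encoder W mapping features X (d x n) to semantics S (k x n).

    Setting the gradient of ||X - W.T @ S||^2 + lam * ||W @ X - S||^2 to zero
    gives the Sylvester equation (S S^T) W + W (lam X X^T) = (1 + lam) S X^T,
    solvable in closed form (Bartels-Stewart).
    """
    A = S @ S.T                   # k x k
    B = lam * (X @ X.T)           # d x d
    C = (1.0 + lam) * (S @ X.T)   # k x d
    return solve_sylvester(A, B, C)
```

At test time, W projects unseen-class features into the semantic space, and Wᵀ serves as the decoder back to feature space.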
Graph Signal Processing: Overview, Challenges and Applications
Research in Graph Signal Processing (GSP) aims to develop tools for
processing data defined on irregular graph domains. In this paper we first
provide an overview of core ideas in GSP and their connection to conventional
digital signal processing. We then summarize recent progress in developing
basic GSP tools, including methods for sampling, filtering, and graph learning.
Next, we review progress in several application areas using GSP, including
processing and analysis of sensor network data, biological data, and
applications to image processing and machine learning. We finish by providing a
brief historical perspective to highlight how concepts recently developed in
GSP build on top of prior research in other areas. Comment: To appear, Proceedings of the IEE
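As a concrete illustration of the basic GSP tools mentioned above (a generic textbook construction, not code from the paper): the graph Fourier transform expands a signal in the eigenbasis of the graph Laplacian, and filtering amounts to masking spectral components.

```python
import numpy as np

def graph_fourier(L, x):
    """Graph Fourier transform: expand signal x in the Laplacian eigenbasis."""
    lam, U = np.linalg.eigh(L)  # eigenvalues act as graph frequencies (ascending)
    return lam, U, U.T @ x

def low_pass_filter(L, x, cutoff):
    """Zero out spectral components whose graph frequency exceeds `cutoff`."""
    lam, U, xhat = graph_fourier(L, x)
    xhat = np.where(lam > cutoff, 0.0, xhat)
    return U @ xhat

# Path graph on 4 nodes: Laplacian L = D - A.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
L = np.diag(A.sum(axis=1)) - A
x = np.array([1.0, -1.0, 1.0, -1.0])   # highly oscillatory signal
y = low_pass_filter(L, x, cutoff=1.0)  # smoothed version of x
```

Constant signals lie in the zero-frequency eigenspace and pass through unchanged, mirroring the DC component in classical DSP.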
STV-based Video Feature Processing for Action Recognition
In comparison to still image-based processes, video features can provide rich and intuitive information about dynamic events occurring over a period of time, such as human actions, crowd behaviours, and other subject pattern changes. Although substantial progress has been made on image processing in the last decade, with successful applications in face matching and object recognition, video-based event detection still remains one of the most difficult challenges in computer vision research, owing to complex continuous or discrete input signals, arbitrary definitions of dynamic features, and often ambiguous analytical methods. In this paper, a Spatio-Temporal Volume (STV) and region intersection (RI) based 3D shape-matching method is proposed to facilitate the definition and recognition of human actions recorded in videos. The distinctive characteristics and the performance gain of the devised approach stem from a coefficient factor-boosted 3D region intersection and matching mechanism developed in this research. This paper also reports an investigation into techniques for efficient STV data filtering to reduce the number of voxels (volumetric pixels) that must be processed in each operational cycle of the implemented system. The encouraging features and operational performance improvements registered in the experiments are discussed at the end.
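The region-intersection idea can be illustrated with a simple voxel-overlap score between two spatio-temporal volumes. The paper's coefficient factor-boosted matching is more elaborate, so treat this as a hedged sketch; `region_intersection_score` is a hypothetical name.

```python
import numpy as np

def region_intersection_score(stv_a, stv_b):
    """Overlap (intersection over union) between two boolean spatio-temporal volumes.

    stv_a, stv_b: 3D boolean arrays indexed (x, y, t), True where a voxel
    belongs to the foreground action region.
    """
    inter = np.logical_and(stv_a, stv_b).sum()
    union = np.logical_or(stv_a, stv_b).sum()
    return inter / union if union else 0.0
```

A query STV extracted from a new video could be matched against template STVs by ranking such overlap scores.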
Time frequency analysis in terahertz pulsed imaging
Recent advances in laser and electro-optical technologies have made the previously under-utilized terahertz frequency band of the electromagnetic spectrum
accessible for practical imaging. Applications are emerging, notably in the biomedical domain. In this chapter the technique of terahertz pulsed imaging is
introduced in some detail. The need for special computer vision methods, which arises from the use of pulses of radiation and the acquisition of a time series at
each pixel, is described. The nature of the data is a challenge since we are interested not only in the frequency composition of the pulses, but also in how this composition differs across different parts of the pulse. Conventional and short-time Fourier transforms and wavelets were used in preliminary experiments on the analysis of terahertz
pulsed imaging data. Measurements of refractive index and absorption coefficient were compared, wavelet compression assessed and image classification by multidimensional
clustering techniques demonstrated. It is shown that the time-frequency methods perform as well as conventional analysis for determining material properties. Wavelet compression gave results that remained robust even when only 20% of the wavelet coefficients were retained. It is concluded that the time-frequency methods hold great promise for optimizing the extraction of the spectroscopic information contained in each terahertz pulse, for the analysis of more complex signals comprising multiple pulses, and for data from recently introduced acquisition techniques.
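The reported compression, retaining only 20% of wavelet coefficients, can be mimicked with a one-level orthonormal Haar transform. This is a minimal stand-in: the chapter does not specify the wavelet family or decomposition depth, so both are assumptions here.

```python
import numpy as np

def haar_1d(x):
    """One level of the orthonormal Haar wavelet transform (len(x) must be even)."""
    a = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # approximation (low-pass) coefficients
    d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail (high-pass) coefficients
    return np.concatenate([a, d])

def inv_haar_1d(c):
    """Invert one level of the orthonormal Haar transform."""
    n = len(c) // 2
    a, d = c[:n], c[n:]
    x = np.empty(2 * n)
    x[0::2] = (a + d) / np.sqrt(2.0)
    x[1::2] = (a - d) / np.sqrt(2.0)
    return x

def wavelet_compress(x, keep=0.2):
    """Zero all but the largest-magnitude `keep` fraction of wavelet coefficients."""
    c = haar_1d(x)
    k = max(1, int(keep * len(c)))
    thresh = np.sort(np.abs(c))[-k]
    c[np.abs(c) < thresh] = 0.0
    return inv_haar_1d(c)
```

For a smooth pulse, most of the energy lands in the approximation coefficients, which is why discarding 80% of coefficients can still yield a faithful per-pixel time series.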
Efficient Information Theoretic Clustering on Discrete Lattices
We consider the problem of clustering data that reside on discrete, low
dimensional lattices. Canonical examples for this setting are found in image
segmentation and key point extraction. Our solution is based on a recent
approach to information theoretic clustering where clusters result from an
iterative procedure that minimizes a divergence measure. We replace costly
processing steps in the original algorithm by means of convolutions. These
allow for highly efficient implementations and thus significantly reduce
runtime. This paper therefore bridges a gap between machine learning and signal
processing. Comment: This paper was presented at the workshop LWA 201
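The convolution trick can be illustrated with a Parzen density estimate on a 2D lattice: rather than evaluating all pairwise kernel terms (quadratic in the number of pixels), a single convolution of a membership image with the kernel yields the same per-pixel quantities. This is a hedged sketch of the generic idea, not the paper's exact divergence computation; `lattice_density` is a hypothetical name.

```python
import numpy as np
from scipy.signal import convolve2d

def lattice_density(membership, kernel_size=5, sigma=1.0):
    """Parzen density estimate over a 2D lattice via a single convolution.

    membership: 2D float array, 1.0 where a pixel currently belongs to the
    cluster and 0.0 elsewhere. Convolving it with a normalized Gaussian kernel
    gives the kernel-sum density at every pixel in one pass.
    """
    ax = np.arange(kernel_size) - kernel_size // 2
    g = np.exp(-(ax[:, None] ** 2 + ax[None, :] ** 2) / (2.0 * sigma ** 2))
    g /= g.sum()  # normalize so densities from a single point sum to 1
    return convolve2d(membership, g, mode="same", boundary="fill")
```

Because the lattice is regular, the same convolution serves every iteration of the clustering loop, which is where the runtime savings over per-pair evaluation come from.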