19,222 research outputs found
Deep learning in remote sensing: a review
Standing at the paradigm shift towards data-intensive science, machine
learning techniques are becoming increasingly important. In particular, as a
major breakthrough in the field, deep learning has proven as an extremely
powerful tool in many fields. Shall we embrace deep learning as the key to all?
Or, should we resist a 'black-box' solution? There are controversial opinions
in the remote sensing community. In this article, we analyze the challenges of
using deep learning for remote sensing data analysis, review the recent
advances, and provide resources to make deep learning in remote sensing
ridiculously simple to start with. More importantly, we advocate remote sensing
scientists to bring their expertise into deep learning, and use it as an
implicit general model to tackle unprecedented large-scale influential
challenges, such as climate change and urbanization.Comment: Accepted for publication IEEE Geoscience and Remote Sensing Magazin
Quality-based Multimodal Classification Using Tree-Structured Sparsity
Recent studies have demonstrated advantages of information fusion based on
sparsity models for multimodal classification. Among several sparsity models,
tree-structured sparsity provides a flexible framework for extraction of
cross-correlated information from different sources and for enforcing group
sparsity at multiple granularities. However, the existing algorithm only solves
an approximated version of the cost functional and the resulting solution is
not necessarily sparse at group levels. This paper reformulates the
tree-structured sparse model for multimodal classification task. An accelerated
proximal algorithm is proposed to solve the optimization problem, which is an
efficient tool for feature-level fusion among either homogeneous or
heterogeneous sources of information. In addition, a (fuzzy-set-theoretic)
possibilistic scheme is proposed to weight the available modalities, based on
their respective reliability, in a joint optimization problem for finding the
sparsity codes. This approach provides a general framework for quality-based
fusion that offers added robustness to several sparsity-based multimodal
classification algorithms. To demonstrate their efficacy, the proposed methods
are evaluated on three different applications - multiview face recognition,
multimodal face recognition, and target classification.Comment: To Appear in 2014 IEEE Conference on Computer Vision and Pattern
Recognition (CVPR 2014
Going Deeper into Action Recognition: A Survey
Understanding human actions in visual data is tied to advances in
complementary research areas including object recognition, human dynamics,
domain adaptation and semantic segmentation. Over the last decade, human action
analysis evolved from earlier schemes that are often limited to controlled
environments to nowadays advanced solutions that can learn from millions of
videos and apply to almost all daily activities. Given the broad range of
applications from video surveillance to human-computer interaction, scientific
milestones in action recognition are achieved more rapidly, eventually leading
to the demise of what used to be good in a short time. This motivated us to
provide a comprehensive review of the notable steps taken towards recognizing
human actions. To this end, we start our discussion with the pioneering methods
that use handcrafted representations, and then, navigate into the realm of deep
learning based approaches. We aim to remain objective throughout this survey,
touching upon encouraging improvements as well as inevitable fallbacks, in the
hope of raising fresh questions and motivating new research directions for the
reader
Multi-task Image Classification via Collaborative, Hierarchical Spike-and-Slab Priors
Promising results have been achieved in image classification problems by
exploiting the discriminative power of sparse representations for
classification (SRC). Recently, it has been shown that the use of
\emph{class-specific} spike-and-slab priors in conjunction with the
class-specific dictionaries from SRC is particularly effective in low training
scenarios. As a logical extension, we build on this framework for multitask
scenarios, wherein multiple representations of the same physical phenomena are
available. We experimentally demonstrate the benefits of mining joint
information from different camera views for multi-view face recognition.Comment: Accepted to International Conference in Image Processing (ICIP) 201
- …