Multi-View Region Adaptive Multi-temporal DMM and RGB Action Recognition
Human action recognition remains an important yet challenging task. This work
proposes a novel action recognition system built on a Multiple View Region
Adaptive Multi-resolution-in-time Depth Motion Map (MV-RAMDMM) formulation
combined with appearance information. Multiple-stream 3D Convolutional Neural
Networks (CNNs) are trained on the different views and time resolutions of the
region-adaptive Depth Motion Maps. Multiple views are synthesised to enhance
view invariance, while region-adaptive weights, based on localised motion,
accentuate and differentiate the parts of an action with faster motion.
Dedicated 3D CNN streams for multi-time-resolution appearance (RGB)
information are also included; these help to identify and differentiate small
object interactions. A pre-trained 3D CNN is fine-tuned for each stream,
followed by multi-class Support Vector Machines (SVMs), and average score
fusion is applied to the outputs. The developed approach is capable of
recognising both human actions and human-object interactions. Three public
domain datasets, MSR 3D Action, Northwestern-UCLA multi-view actions, and
MSR 3D Daily Activity, are used to evaluate the proposed solution. The
experimental results demonstrate the robustness of this approach compared
with state-of-the-art algorithms.
Comment: 14 pages, 6 figures, 13 tables. Submitte
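The average score fusion step described above can be sketched as follows. This is a minimal illustration, not the paper's pipeline: the number of streams, the class count, and the score values are all assumptions for the example.

```python
import numpy as np

# Hypothetical per-stream class scores for one clip: each 3D-CNN + SVM stream
# emits a score vector over the action classes (values are illustrative only).
stream_scores = [
    np.array([0.10, 0.70, 0.20]),  # e.g. a depth-motion-map view stream
    np.array([0.20, 0.50, 0.30]),  # e.g. another synthesised view
    np.array([0.15, 0.60, 0.25]),  # e.g. an RGB appearance stream
]

def average_score_fusion(scores):
    """Average the per-stream class scores and predict the argmax class."""
    fused = np.mean(np.stack(scores, axis=0), axis=0)
    return fused, int(np.argmax(fused))

fused, predicted_class = average_score_fusion(stream_scores)
```

Averaging scores (rather than concatenating features) lets each stream be trained independently and makes adding or dropping a view or time resolution straightforward.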
Fusion of Heterogeneous Earth Observation Data for the Classification of Local Climate Zones
This paper proposes a novel framework for fusing multi-temporal,
multispectral satellite images and OpenStreetMap (OSM) data for the
classification of local climate zones (LCZs). Feature stacking is the most
commonly used data-fusion method, but its main drawback is that it does not
account for the heterogeneity of multimodal optical imagery and OSM data. The
proposed framework processes two data sources separately and then combines them
at the model level through two fusion models (the landuse fusion model and
building fusion model), which aim to fuse optical images with landuse and
building layers of OSM data, respectively. In addition, a new approach to
detecting the incompleteness of OSM building data is proposed. The proposed
framework was trained and tested using data from the 2017 IEEE GRSS Data
Fusion Contest, and further validated on an additional test set of manually
labeled samples from Munich and New York. Experimental results indicate that,
compared to the feature-stacking baseline, the proposed framework fuses
optical images with OSM data effectively for LCZ classification and
generalizes well at large scale. Its classification accuracy outperforms the
baseline framework by more than 6% on the 2017 IEEE GRSS Data Fusion Contest
test set and by more than 2% on the additional test set.
In addition, the proposed framework is less sensitive to the spectral
diversity of optical satellite images and thus achieves more stable
classification performance than state-of-the-art frameworks.
Comment: accepted by TGR
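The model-level fusion idea, as opposed to feature stacking, can be sketched as follows. This is a hedged illustration under assumed shapes and names: two classifiers trained separately on heterogeneous inputs have their class probabilities combined, rather than their raw features concatenated; the logits and class count are invented for the example.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax over class logits."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical per-model logits over three LCZ classes for two samples
# (illustrative only; the paper's models and class set differ).
landuse_model_logits = np.array([[2.0, 0.5, 0.1],
                                 [0.2, 1.5, 0.3]])   # optical + OSM land-use model
building_model_logits = np.array([[1.8, 0.7, 0.2],
                                  [0.1, 0.4, 1.9]])  # optical + OSM building model

def fuse_at_model_level(logits_a, logits_b):
    """Average the two models' class probabilities and take the argmax label."""
    probs = 0.5 * (softmax(logits_a) + softmax(logits_b))
    return probs.argmax(axis=-1)

labels = fuse_at_model_level(landuse_model_logits, building_model_logits)
```

Because each model sees only one data source, the heterogeneity of optical imagery and OSM layers never has to be reconciled inside a single stacked feature vector; disagreement between the sources is resolved only at the probability level.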