Dynamic texture recognition using time-causal and time-recursive spatio-temporal receptive fields
This work presents a first evaluation of using spatio-temporal receptive
fields from a recently proposed time-causal spatio-temporal scale-space
framework as primitives for video analysis. We propose a new family of video
descriptors based on regional statistics of spatio-temporal receptive field
responses and evaluate this approach on the problem of dynamic texture
recognition. Our approach generalises a previously used method, based on joint
histograms of receptive field responses, from the spatial to the
spatio-temporal domain and from object recognition to dynamic texture
recognition. The time-recursive formulation enables computationally efficient
time-causal recognition. The experimental evaluation demonstrates competitive
performance compared to the state of the art. In particular, it is shown that binary
versions of our dynamic texture descriptors achieve improved performance
compared to a large range of similar methods using different primitives either
handcrafted or learned from data. Further, our qualitative and quantitative
investigation into parameter choices and the use of different sets of receptive
fields highlights the robustness and flexibility of our approach. Together,
these results support the descriptive power of this family of time-causal
spatio-temporal receptive fields, validate our approach for dynamic texture
recognition and point towards the possibility of designing a range of video
analysis methods based on these new time-causal spatio-temporal primitives.
Comment: 29 pages, 16 figures
UV-GAN: Adversarial Facial UV Map Completion for Pose-invariant Face Recognition
Recently proposed robust 3D face alignment methods establish either dense or
sparse correspondence between a 3D face model and a 2D facial image. The use of
these methods presents new challenges as well as opportunities for facial
texture analysis. In particular, by sampling the image using the fitted model,
a facial UV map can be created. Unfortunately, due to self-occlusion, such a UV map
is always incomplete. In this paper, we propose a framework for training a Deep
Convolutional Neural Network (DCNN) to complete the facial UV map extracted
from in-the-wild images. To this end, we first gather complete UV maps by
fitting a 3D Morphable Model (3DMM) to various multiview image and video
datasets, as well as leveraging a new 3D dataset with over 3,000 identities.
Second, we devise a meticulously designed architecture that combines local and
global adversarial DCNNs to learn an identity-preserving facial UV completion
model. We demonstrate that by attaching the completed UV to the fitted mesh and
generating instances of arbitrary poses, we can increase pose variations for
training deep face recognition/verification models, and minimise pose
discrepancy during testing, which leads to better performance. Experiments on
both controlled and in-the-wild UV datasets prove the effectiveness of our
adversarial UV completion model. We achieve state-of-the-art verification
accuracy, , under the CFP frontal-profile protocol only by combining
pose augmentation during training and pose discrepancy reduction during
testing. We will release the first in-the-wild UV dataset (which we refer to as WildUV)
that comprises complete facial UV maps from 1,892 identities for research
purposes.
Binary Patterns Encoded Convolutional Neural Networks for Texture Recognition and Remote Sensing Scene Classification
Designing discriminative and powerful texture features robust to realistic
imaging conditions is a challenging computer vision problem with many
applications, including material recognition and analysis of satellite or
aerial imagery. In the past, most texture description approaches were based on
dense orderless statistical distribution of local features. However, most
recent approaches to texture recognition and remote sensing scene
classification are based on Convolutional Neural Networks (CNNs). The de facto
practice when learning these CNN models is to use RGB patches as input with
training performed on large amounts of labeled data (ImageNet). In this paper,
we show that Binary Patterns encoded CNN models, codenamed TEX-Nets, trained
using mapped coded images with explicit texture information provide
complementary information to the standard RGB deep models. Additionally, two
deep architectures, namely early and late fusion, are investigated to combine
the texture and color information. To the best of our knowledge, we are the
first to investigate Binary Patterns encoded CNNs and different deep network
fusion architectures for texture recognition and remote sensing scene
classification. We perform comprehensive experiments on four texture
recognition datasets and four remote sensing scene classification benchmarks:
UC-Merced with 21 scene categories, WHU-RS19 with 19 scene classes, RSSCN7 with
7 categories and the recently introduced large scale aerial image dataset (AID)
with 30 aerial scene types. We demonstrate that TEX-Nets provide complementary
information to the standard RGB deep model of the same network architecture. Our
late fusion TEX-Net architecture always improves the overall performance
compared to the standard RGB network on both recognition problems. Our final
combination outperforms the state-of-the-art without employing fine-tuning or
ensemble of RGB network architectures.
Comment: To appear in ISPRS Journal of Photogrammetry and Remote Sensing
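The abstract above refers to "mapped coded images with explicit texture information" without giving the encoding. As a rough illustration only, the sketch below computes the classic 8-neighbour local binary pattern (LBP) code for each interior pixel of a grayscale image. The function names `lbp_image` and `histogram`, the clockwise bit order, and the `>=` comparison are assumptions made for this example; they are not taken from the TEX-Nets paper, whose actual mapping may differ.

```python
def lbp_image(gray):
    """Map a 2D grayscale image (list of equal-length rows) to 8-bit LBP codes.

    Border pixels are skipped, so the output has shape (H-2) x (W-2).
    Illustrative sketch only; not the encoding used by TEX-Nets.
    """
    # Clockwise 8-neighbourhood offsets, starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    height, width = len(gray), len(gray[0])
    codes = []
    for r in range(1, height - 1):
        row = []
        for c in range(1, width - 1):
            centre = gray[r][c]
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Set the bit when the neighbour is at least as bright as the centre.
                if gray[r + dr][c + dc] >= centre:
                    code |= 1 << bit
            row.append(code)
        codes.append(row)
    return codes


def histogram(codes, bins=256):
    """Count how often each LBP code occurs (an orderless texture statistic)."""
    counts = [0] * bins
    for row in codes:
        for code in row:
            counts[code] += 1
    return counts


if __name__ == "__main__":
    image = [[9, 9, 9],
             [0, 5, 0],
             [0, 0, 0]]
    print(lbp_image(image))
```

The resulting code image could be fed to a network in place of, or alongside, the RGB input, while the histogram corresponds to the older orderless statistics the abstract contrasts with CNN-based approaches.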
Overcomplete steerable pyramid filters and rotation invariance
A given (overcomplete) discrete oriented pyramid may be converted into a steerable pyramid by interpolation. We present a technique for deriving the optimal interpolation functions (otherwise called 'steering coefficients'). The proposed scheme is demonstrated on a computationally efficient oriented pyramid, which is a variation on the Burt and Adelson (1983) pyramid. We apply the generated steerable pyramid to orientation-invariant texture analysis in order to demonstrate its excellent rotational isotropy. High classification rates and precise rotation identification are demonstrated.