Recombinator Networks: Learning Coarse-to-Fine Feature Aggregation
Deep neural networks with alternating convolutional, max-pooling and
decimation layers are widely used in state-of-the-art architectures for
computer vision. Max-pooling purposefully discards precise spatial information
in order to create features that are more robust, and typically organized as
lower resolution spatial feature maps. On some tasks, such as whole-image
classification, max-pooling derived features are well suited; however, for
tasks requiring precise localization, such as pixel level prediction and
segmentation, max-pooling destroys exactly the information required to perform
well. Precise localization may be preserved by shallow convnets without pooling
but at the expense of robustness. Can we have our max-pooled multi-layered cake
and eat it too? Several papers have proposed summation and concatenation based
methods for combining upsampled coarse, abstract features with finer features
to produce robust pixel level predictions. Here we introduce another model ---
dubbed Recombinator Networks --- where coarse features inform finer features
early in their formation such that finer features can make use of several
layers of computation in deciding how to use coarse features. The model is
trained once, end-to-end and performs better than summation-based
architectures, reducing the error from the previous state of the art on two
facial keypoint datasets, AFW and AFLW, by 30% and beating the current
state of the art on 300W without using extra data. We improve performance even
further by adding a denoising prediction model based on a novel convnet
formulation.
Comment: accepted in CVPR 201
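As an illustration, the coarse-to-fine aggregation described in the abstract can be sketched as a minimal upsample-and-concatenate step. This is a hypothetical NumPy sketch of the general idea, not the authors' implementation; the actual Recombinator Networks apply learned convolutions to the merged maps so that fine features can use several layers of computation when deciding how to exploit the coarse features.

```python
import numpy as np

def upsample_nearest(x, factor):
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return x.repeat(factor, axis=1).repeat(factor, axis=2)

def recombine(coarse, fine):
    """Upsample the coarse map to the fine map's resolution and
    concatenate along the channel axis, so subsequent layers see both
    abstract and precisely localized features."""
    factor = fine.shape[1] // coarse.shape[1]
    up = upsample_nearest(coarse, factor)
    return np.concatenate([up, fine], axis=0)

coarse = np.random.rand(8, 4, 4)    # low-resolution, abstract features
fine = np.random.rand(4, 16, 16)    # high-resolution, local features
merged = recombine(coarse, fine)    # shape (12, 16, 16)
```

Concatenation (rather than summation) lets later layers weigh the two feature streams freely, which is the distinction the abstract draws against summation-based architectures.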
Deep Neural Network and Data Augmentation Methodology for Off-Axis Iris Segmentation in Wearable Headsets
A data augmentation methodology is presented and applied to generate a large
dataset of off-axis iris regions and train a low-complexity deep neural
network. Although of low complexity, the resulting network achieves a high
level of accuracy in iris region segmentation for challenging off-axis eye patches.
Interestingly, this network is also shown to achieve high levels of performance
for regular, frontal, segmentation of iris regions, comparing favorably with
state-of-the-art techniques of significantly higher complexity. Due to its
lower complexity, this network is well suited for deployment in embedded
applications such as augmented and mixed reality headsets.
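The abstract does not specify the augmentation operations, but a common way to synthesize off-axis views from frontal eye patches is a geometric warp. The sketch below uses a simple row-wise shear as a hypothetical stand-in for a full perspective transform; the function name and parameters are illustrative, not taken from the paper.

```python
import numpy as np

def shear_patch(img, shear):
    """Approximate an off-axis viewpoint by shearing each row of a
    (H, W) eye patch horizontally, with edge pixels replicated.
    A real pipeline would use a full perspective/homography warp."""
    h, w = img.shape
    out = np.zeros_like(img)
    for y in range(h):
        # Shift grows with distance from the vertical centre.
        shift = int(round(shear * (y - h / 2)))
        src = np.clip(np.arange(w) - shift, 0, w - 1)
        out[y] = img[y, src]
    return out

patch = np.arange(25, dtype=float).reshape(5, 5)
warped = shear_patch(patch, 0.4)
```

Applying such warps with randomly sampled parameters to frontal eye images is one plausible way to grow a large off-axis training set from a small frontal one.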
Mask-guided Style Transfer Network for Purifying Real Images
Recently, the progress of learning-by-synthesis has proposed a training model
for synthetic images, which can effectively reduce the cost of human and
material resources. However, due to the different distribution of synthetic
images compared with real images, the desired performance cannot be achieved.
To solve this problem, previous methods learned a model to improve the
realism of the synthetic images. Different from those methods, this
paper tries to purify real images by extracting discriminative and robust
features to convert outdoor real images to indoor synthetic images. In this paper, we
first introduce the segmentation masks to construct RGB-mask pairs as inputs,
then we design a mask-guided style transfer network to learn style features
separately from the attention and background regions, and learn content
features from the full and attention regions. Moreover, we propose a novel
region-level task-guided loss to restrain the features learnt from style and
content. We conduct both qualitative and quantitative experiments to
demonstrate the feasibility of purifying real images in these complex
settings. We evaluate the proposed method on several public datasets,
including LPW, COCO and MPIIGaze. Experimental results show that the
proposed method is effective and achieves state-of-the-art results.
Comment: arXiv admin note: substantial text overlap with arXiv:1903.0582
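The RGB-mask pairs mentioned in the abstract can be illustrated with a minimal region split: a binary segmentation mask partitions the image into an attention region and a background region, from which style features could then be learned separately. This is a hypothetical NumPy sketch of the input construction only, not the paper's style transfer network.

```python
import numpy as np

def split_regions(rgb, mask):
    """Split an (H, W, 3) RGB image into attention and background
    regions using a binary (H, W) mask, mirroring the RGB-mask pair
    inputs described above. The two parts sum back to the original."""
    attention = rgb * mask[..., None]          # pixels inside the mask
    background = rgb * (1.0 - mask)[..., None]  # pixels outside the mask
    return attention, background

rgb = np.random.rand(4, 4, 3)
mask = (np.random.rand(4, 4) > 0.5).astype(float)
attention, background = split_regions(rgb, mask)
```

Feeding the two complementary regions to separate style branches, while content features are taken from the full image and the attention region, matches the division of labour the abstract describes.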