Understanding How Image Quality Affects Deep Neural Networks
Image quality is an important practical challenge that is often overlooked in
the design of machine vision systems. Commonly, machine vision systems are
trained and tested on high quality image datasets, yet in practical
applications the input images cannot be assumed to be of high quality.
Recently, deep neural networks have obtained state-of-the-art performance on
many machine vision tasks. In this paper we provide an evaluation of four
state-of-the-art deep neural network models for image classification under
quality distortions. We consider five types of quality distortions: blur,
noise, contrast, JPEG, and JPEG2000 compression. We show that the existing
networks are susceptible to these quality distortions, particularly to blur and
noise. These results enable future work in developing deep neural networks that
are more invariant to quality distortions.
Comment: Final version will appear in IEEE Xplore in the Proceedings of the
Conference on the Quality of Multimedia Experience (QoMEX), June 6-8, 201
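To make the distortion types above concrete, here is a minimal sketch of three of them (noise, contrast reduction, blur) applied to a grayscale image stored as a list of rows with values in [0, 255]. It is not the authors' evaluation code: the function names, the box blur used as a simple stand-in for a blur kernel, and the mean-blending contrast model are all assumptions made for illustration.

```python
import random


def add_gaussian_noise(image, sigma, seed=0):
    """Add zero-mean Gaussian noise and clamp the result to [0, 255]."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return [
        [min(255.0, max(0.0, px + rng.gauss(0.0, sigma))) for px in row]
        for row in image
    ]


def reduce_contrast(image, factor):
    """Blend every pixel toward the image mean.

    factor=1.0 leaves the image unchanged; factor=0.0 yields a flat image.
    (Assumed contrast model, chosen for simplicity.)
    """
    pixels = [px for row in image for px in row]
    mean = sum(pixels) / len(pixels)
    return [[mean + factor * (px - mean) for px in row] for row in image]


def box_blur(image, radius):
    """Average each pixel over a (2*radius+1)^2 window, cropped at borders."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            window = [image[j][i] for j in ys for i in xs]
            row.append(sum(window) / len(window))
        out.append(row)
    return out
```

A robustness evaluation in this spirit would sweep `sigma`, `factor`, or `radius` over a range and record classifier accuracy at each level.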
On the Effect of Inter-observer Variability for a Reliable Estimation of Uncertainty of Medical Image Segmentation
Uncertainty estimation methods are expected to improve the understanding and
quality of computer-assisted methods used in medical applications (e.g.,
neurosurgical interventions, radiotherapy planning), where automated medical
image segmentation is crucial. In supervised machine learning, a common
practice to generate ground truth label data is to merge observer annotations.
However, as many medical image tasks show a high inter-observer variability
resulting from factors such as image quality, different levels of user
expertise and domain knowledge, little is known as to how inter-observer
variability and commonly used fusion methods affect the estimation of
uncertainty of automated image segmentation. In this paper we analyze the
effect of common image label fusion techniques on uncertainty estimation, and
propose to learn the uncertainty among observers. The results highlight the
negative effect that fusion methods applied in deep learning have on obtaining
reliable estimates of segmentation uncertainty. Additionally, we show that the
learned observers' uncertainty can be combined with current standard Monte Carlo
dropout Bayesian neural networks to characterize the uncertainty of the model's
parameters.
Comment: Appears in Medical Image Computing and Computer Assisted
Interventions (MICCAI), 201
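The contrast between label fusion and inter-observer variability can be illustrated with a small sketch. It assumes binary masks given as lists of rows of 0/1, uses majority voting as one example of a common fusion method, and measures per-pixel disagreement with binary entropy. These choices and all function names are illustrative assumptions, not the method described in the abstract.

```python
import math


def majority_vote(masks):
    """Fuse binary masks by per-pixel majority vote (ties go to background)."""
    n = len(masks)
    h, w = len(masks[0]), len(masks[0][0])
    return [
        [1 if 2 * sum(m[y][x] for m in masks) > n else 0 for x in range(w)]
        for y in range(h)
    ]


def foreground_fraction(masks):
    """Per-pixel fraction of observers who labelled the pixel as foreground."""
    n = len(masks)
    h, w = len(masks[0]), len(masks[0][0])
    return [
        [sum(m[y][x] for m in masks) / n for x in range(w)] for y in range(h)
    ]


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 when observers agree."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def disagreement_map(masks):
    """Per-pixel observer disagreement, which the fused label discards."""
    return [[binary_entropy(p) for p in row] for row in foreground_fraction(masks)]
```

With three observers who disagree on one pixel, `majority_vote` still returns a hard label for that pixel, while `disagreement_map` keeps a nonzero value there. That lost information is what a model would need in order to learn observer uncertainty.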
Semantic Perceptual Image Compression using Deep Convolution Networks
It has long been considered a significant problem to improve the visual
quality of lossy image and video compression. Recent advances in computing
power together with the availability of large training data sets have increased
interest in the application of deep learning CNNs to address image recognition
and image processing tasks. Here, we present a powerful CNN tailored to the
specific task of semantic image understanding to achieve higher visual quality
in lossy compression. A modest increase in complexity is incorporated into the
encoder, which allows a standard, off-the-shelf JPEG decoder to be used. While
JPEG encoding may be optimized for generic images, the process is ultimately
unaware of the specific content of the image to be compressed. Our technique
makes JPEG content-aware by designing and training a model to identify multiple
semantic regions in a given image. Unlike object detection techniques, our
model does not require labeling of object positions and is able to identify
objects in a single pass. We present a new CNN architecture directed
specifically to image compression, which generates a map that highlights
semantically-salient regions so that they can be encoded at higher quality than
background regions. The map is generated by adding a complete set of features
for every class and then taking a threshold over the sum of all feature
activations. Experiments are
presented on the Kodak PhotoCD dataset and the MIT Saliency Benchmark dataset,
in which our algorithm achieves higher visual quality for the same compressed
size.
Comment: Accepted to Data Compression Conference, 11 pages, 5 figures
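The map-generation step (sum the feature activations over all classes, then threshold) can be sketched as follows. This is a simplified illustration under stated assumptions: activation maps are plain lists of rows, the sum is normalized by its peak before thresholding, and the two quality levels (90 and 50) are hypothetical values, not settings taken from the paper.

```python
def saliency_mask(class_maps, threshold):
    """Sum per-class activation maps, scale by the peak, and threshold.

    Returns a binary mask with 1 for semantically salient pixels.
    """
    h, w = len(class_maps[0]), len(class_maps[0][0])
    summed = [
        [sum(m[y][x] for m in class_maps) for x in range(w)] for y in range(h)
    ]
    peak = max(max(row) for row in summed)
    if peak <= 0:
        # No activation anywhere: treat the whole image as background.
        return [[0] * w for _ in range(h)]
    return [[1 if v / peak >= threshold else 0 for v in row] for row in summed]


def quality_map(mask, q_salient=90, q_background=50):
    """Assign a higher (hypothetical) encoder quality to salient pixels."""
    return [[q_salient if s else q_background for s in row] for row in mask]
```

An encoder could then spend more bits where `quality_map` is high while still emitting a standard JPEG stream, which is what lets an unmodified decoder read the result.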
Learn to synthesize and synthesize to learn
Attribute guided face image synthesis aims to manipulate attributes on a face
image. Most existing methods for image-to-image translation can either perform
a fixed translation between any two image domains using a single attribute or
require training data with the attributes of interest for each subject.
Therefore, these methods can only train one specific model for each pair of
image domains, which limits their ability to deal with more than two
domains. Another disadvantage of these methods is that they often suffer from
the common problem of mode collapse that degrades the quality of the generated
images. To overcome these shortcomings, we propose an attribute-guided face
image generation method using a single model, which is capable of synthesizing
multiple photo-realistic face images conditioned on the attributes of interest. In
addition, we adopt the proposed model to increase the realism of the simulated
face images while preserving the face characteristics. Compared to existing
models, synthetic face images generated by our method present a good
photorealistic quality on several face datasets. Finally, we demonstrate that
generated facial images can be used for synthetic data augmentation, and
improve the performance of the classifier used for facial expression
recognition.
Comment: Accepted to Computer Vision and Image Understanding (CVIU)