Brain-Inspired Deep Networks for Image Aesthetics Assessment
Image aesthetics assessment has been challenging due to its subjective
nature. Inspired by scientific advances in human visual perception and
neuroaesthetics, we design Brain-Inspired Deep Networks (BDN) for this task.
BDN first learns attributes through parallel supervised pathways over a
variety of selected feature dimensions. A high-level synthesis network is
trained to associate and transform those attributes into the overall aesthetics
rating. We then extend BDN to predict the distribution of human ratings,
since aesthetics ratings are often subjective. Another highlight is our
first-of-its-kind study of label-preserving transformations in the context of
aesthetics assessment, which leads to an effective data augmentation approach.
Experimental results on the AVA dataset show that our biologically inspired and
task-specific BDN model achieves significant performance improvements compared
to other state-of-the-art models with the same or higher parameter capacity.
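A minimal PyTorch sketch of the parallel-pathway-plus-synthesis idea described in this abstract is given below. It is not the authors' BDN implementation: the module names (AttributePathway, SynthesisNet), the layer sizes, the number of attributes, and the number of rating bins are all illustrative assumptions.

    # Sketch only: parallel attribute pathways feeding a high-level synthesis net.
    import torch
    import torch.nn as nn

    class AttributePathway(nn.Module):
        """One supervised pathway that predicts a single attribute score."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.head = nn.Linear(32, 1)  # scalar attribute score

        def forward(self, x):
            return self.head(self.features(x).flatten(1))

    class SynthesisNet(nn.Module):
        """Maps the attribute scores to a distribution over rating bins."""
        def __init__(self, num_attributes, num_rating_bins=10):
            super().__init__()
            self.pathways = nn.ModuleList(AttributePathway() for _ in range(num_attributes))
            self.synth = nn.Sequential(
                nn.Linear(num_attributes, 64), nn.ReLU(),
                nn.Linear(64, num_rating_bins),
            )

        def forward(self, x):
            attrs = torch.cat([p(x) for p in self.pathways], dim=1)
            return self.synth(attrs).softmax(dim=1)  # predicted rating distribution

    model = SynthesisNet(num_attributes=8)
    ratings = model(torch.randn(2, 3, 224, 224))  # shape: (batch, num_rating_bins)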
Quality Assessment for Tone-Mapped HDR Images Using Multi-Scale and Multi-Layer Information
Tone mapping operators and multi-exposure fusion methods allow us to enjoy
the informative contents of high dynamic range (HDR) images with standard
dynamic range devices, but they also introduce distortions into HDR contents.
Therefore, methods are needed to evaluate tone-mapped image quality. Due to the
complexity of possible distortions in a tone-mapped image, information from
different scales and different levels should be considered when predicting
tone-mapped image quality. We therefore propose a new no-reference method for
tone-mapped image quality assessment (TMIQA) based on multi-scale and multi-layer
features that are extracted from a pre-trained deep convolutional neural
network model. After being aggregated, the extracted features are mapped to
quality predictions by regression. The proposed method is tested on the largest
public database for TMIQA and compared to existing no-reference methods. The
experimental results show that the proposed method achieves better performance.
Comment: This paper has 6 pages, 3 tables and 2 figures in total; corrects a
typo in the accepted version
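A hedged sketch of the pipeline shape described above follows: tap several layers of a pre-trained CNN at several image scales, pool the activations, and regress the pooled features to a quality score. The backbone (VGG16), the tapped layer indices, the scales, and the use of ridge regression are assumptions, not the paper's exact choices.

    import torch
    import torch.nn.functional as F
    from torchvision.models import vgg16, VGG16_Weights

    backbone = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).features.eval()
    tap_layers = {4, 9, 16, 23, 30}  # ends of the five VGG16 conv blocks

    @torch.no_grad()
    def multiscale_multilayer_features(img, scales=(1.0, 0.5, 0.25)):
        """img: (1, 3, H, W) tensor; returns a 1-D feature vector."""
        feats = []
        for s in scales:
            x = F.interpolate(img, scale_factor=s, mode="bilinear", align_corners=False)
            for i, layer in enumerate(backbone):
                x = layer(x)
                if i in tap_layers:
                    feats.append(x.mean(dim=(2, 3)).flatten())  # global average pool
        return torch.cat(feats)

    # Features for a set of tone-mapped images would then be mapped to subjective
    # scores by any regressor, e.g. ridge regression from scikit-learn:
    #   from sklearn.linear_model import Ridge
    #   reg = Ridge().fit(features_train, mos_train)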
Human Attention Estimation for Natural Images: An Automatic Gaze Refinement Approach
Photo collections and their applications today attempt to reflect user
interactions in various forms. Moreover, photo applications aim to capture
users' intentions with minimal effort. Human interest regions in an image
carry powerful information about
the user's behavior and can be used in many photo applications. Research on
human visual attention has been conducted in the form of gaze tracking and
computational saliency models in the computer vision community, and has shown
considerable progress. This paper presents an integration of implicit gaze
estimation and a computational saliency model to effectively estimate human
attention regions in images on the fly. Furthermore, our method estimates human
attention via implicit calibration and incremental model updating, without any
active participation from the user. We also present extensive analysis and
possible applications for personal photo collections.
Saliency detection based on structural dissimilarity induced by image quality assessment model
The distinctiveness of image regions is widely used as a cue for saliency.
Generally, the distinctiveness is computed from the absolute difference of
features. However, according to image quality assessment (IQA) studies, the
human visual system is highly sensitive to structural changes rather than to
absolute differences. Accordingly, we propose the computation of the structural
dissimilarity between image patches as the distinctiveness measure for saliency
detection. Similar to IQA models, the structural dissimilarity is computed
based on the correlation of the structural features. The global structural
dissimilarity of a patch to all the other patches represents saliency of the
patch. We adopt two widely used structural features, namely the local contrast
and gradient magnitude, into the structural dissimilarity computation in the
proposed model. Without any postprocessing, the proposed model based on the
correlation of either of the two structural features outperforms 11
state-of-the-art saliency models on three saliency databases.
Comment: For associated source code, see https://github.com/yangli-xjtu/SD
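The sketch below illustrates the global structural-dissimilarity idea using the gradient-magnitude feature named in the abstract. The exact dissimilarity formula used here (1 minus the Pearson correlation between patch feature vectors) is an assumption in the spirit of IQA structure terms, not necessarily the paper's formulation; see the linked repository for the authors' code.

    import numpy as np

    def patch_saliency(gray, patch=16):
        """gray: 2-D float array; returns a coarse per-patch saliency map."""
        gy, gx = np.gradient(gray)
        grad_mag = np.sqrt(gx**2 + gy**2)

        h, w = grad_mag.shape
        h, w = h - h % patch, w - w % patch
        blocks = (grad_mag[:h, :w]
                  .reshape(h // patch, patch, w // patch, patch)
                  .transpose(0, 2, 1, 3)
                  .reshape(-1, patch * patch))     # one row per patch

        corr = np.corrcoef(blocks)                 # patch-to-patch correlation
        dissim = 1.0 - corr                        # structural dissimilarity
        sal = dissim.mean(axis=1)                  # global dissimilarity as saliency
        return sal.reshape(h // patch, w // patch)

    saliency_map = patch_saliency(np.random.rand(128, 160))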
Segmentation of retinal cysts from Optical Coherence Tomography volumes via selective enhancement
Automated and accurate segmentation of cystoid structures in Optical
Coherence Tomography (OCT) is of interest in the early detection of retinal
diseases. It is, however, a challenging task. We propose a novel method for
localizing cysts in 3D OCT volumes. The proposed work is biologically inspired
and based on selective enhancement of the cysts, achieved by inducing motion in a given
OCT slice. A Convolutional Neural Network (CNN) is designed to learn a mapping
function that combines the result of multiple such motions to produce a
probability map for cyst locations in a given slice. The final segmentation of
cysts is obtained via simple clustering of the detected cyst locations. The
proposed method is evaluated on two public datasets and one private dataset.
The public datasets include the one released for the OPTIMA Cyst segmentation
challenge (OCSC) in MICCAI 2015 and the DME dataset. After training on the OCSC
train set, the method achieves a mean Dice Coefficient (DC) of 0.71 on the OCSC
test set. The robustness of the algorithm was examined by cross-validation on
the DME and AEI (private) datasets, and the mean DC values obtained were 0.69 and
0.79, respectively. Overall, the proposed system outperforms all benchmarks.
These results underscore the strengths of the proposed method in handling
variations in both data acquisition protocols and scanners.
Comment: Under review in the Journal of Biomedical and Health Informatics
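A very rough sketch of the pipeline shape described in this abstract: several "motion-induced" variants of an OCT B-scan are stacked as channels, a small fully convolutional network maps them to a per-pixel cyst probability map, and detections are grouped by connected-component clustering. The motion operator here (simple vertical shifts) and the tiny untrained network are placeholders, not the authors' selective-enhancement method.

    import torch
    import torch.nn as nn
    from scipy import ndimage

    def induce_motions(slice_2d, shifts=(-2, 0, 2)):
        """slice_2d: (H, W) tensor; returns (len(shifts), H, W) shifted copies."""
        return torch.stack([torch.roll(slice_2d, s, dims=0) for s in shifts])

    prob_net = nn.Sequential(              # stacked motions -> probability map
        nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
        nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
    )

    with torch.no_grad():
        b_scan = torch.rand(256, 512)      # one OCT slice
        prob_map = prob_net(induce_motions(b_scan).unsqueeze(0))[0, 0]

    # Simple grouping of detected locations via connected components
    labels, num_cysts = ndimage.label(prob_map.numpy() > 0.5)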
MSR-net: Low-light Image Enhancement Using Deep Convolutional Network
Images captured in low-light conditions usually suffer from very low
contrast, which greatly increases the difficulty of subsequent computer vision
tasks. In this paper, a low-light image enhancement model based on
convolutional neural network and Retinex theory is proposed. Firstly, we show
that multi-scale Retinex is equivalent to a feedforward convolutional neural
network with different Gaussian convolution kernels. Motivated by this fact, we
consider a Convolutional Neural Network (MSR-net) that directly learns an
end-to-end mapping between dark and bright images. Fundamentally different from
existing approaches, low-light image enhancement in this paper is regarded as a
machine learning problem. In this model, most of the parameters are optimized
by back-propagation, whereas the parameters of traditional models depend on
manual settings. Experiments on a number of challenging images reveal the
advantages of our method in comparison with other state-of-the-art methods from
both qualitative and quantitative perspectives.
Comment: 9 pages
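For reference, the classical multi-scale Retinex formulation that the paper relates to a feed-forward CNN with Gaussian kernels is MSR(x) = sum_i w_i (log I(x) - log (G_sigma_i * I)(x)). A small illustration follows; the scales and weights are conventional choices, not the paper's.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def multi_scale_retinex(img, sigmas=(15, 80, 250), weights=None, eps=1e-6):
        """img: 2-D array in [0, 1]; returns the MSR response."""
        weights = weights or [1.0 / len(sigmas)] * len(sigmas)
        log_img = np.log(img + eps)
        out = np.zeros_like(img, dtype=float)
        for w, sigma in zip(weights, sigmas):
            # Each scale is a Gaussian blur, i.e. a fixed convolution kernel --
            # exactly the kind of operation a CNN layer can reproduce or learn.
            out += w * (log_img - np.log(gaussian_filter(img, sigma) + eps))
        return out

    enhanced = multi_scale_retinex(np.random.rand(256, 256))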
Vision-based Human Gender Recognition: A Survey
Gender is an important demographic attribute of people. This paper provides a
survey of human gender recognition in computer vision. A review of approaches
exploiting information from face and whole body (either from a still image or
gait sequence) is presented. We highlight the challenges faced and survey the
representative methods of these approaches. Based on the results, good
performance has been achieved on datasets captured under controlled
environments, but there is still much work that can be done to improve the
robustness of gender recognition in real-life environments.
Comment: 30 pages
Measuring and Understanding Sensory Representations within Deep Networks Using a Numerical Optimization Framework
A central challenge in sensory neuroscience is describing how the activity of
populations of neurons can represent useful features of the external
environment. However, while neurophysiologists have long been able to record
the responses of neurons in awake, behaving animals, it is another matter
entirely to say what a given neuron does. A key problem is that in many sensory
domains, the space of all possible stimuli that one might encounter is
effectively infinite; in vision, for instance, natural scenes are
combinatorially complex, and an organism will only encounter a tiny fraction of
possible stimuli. As a result, even describing the response properties of
sensory neurons is difficult, and investigations of neuronal functions are
almost always critically limited by the number of stimuli that can be
considered. In this paper, we propose a closed-loop, optimization-based
experimental framework for characterizing the response properties of sensory
neurons, building on past efforts in closed-loop experimental methods, and
leveraging recent advances in artificial neural networks to serve as a
proving ground for our techniques. Specifically, using deep convolutional
neural networks, we asked whether modern black-box optimization techniques can
be used to interrogate the "tuning landscape" of an artificial neuron in a
deep, nonlinear system, without imposing significant constraints on the space
of stimuli under consideration. We introduce a series of measures to quantify
the tuning landscapes, and show how these relate to the performance of the
networks in an object recognition task. To the extent that deep convolutional
neural networks increasingly serve as de facto working hypotheses for
biological vision, we argue that developing a unified approach for studying
both artificial and biological systems holds great potential to advance both
fields together.
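A toy illustration of the closed-loop, black-box idea described above: treat a single unit in a (here untrained, tiny) CNN as a black box and search the stimulus space for inputs that drive it strongly. The simple (1+lambda) random-search loop is a stand-in for the modern black-box optimizers the paper refers to, and all network and parameter choices are illustrative.

    import torch
    import torch.nn as nn

    net = nn.Sequential(
        nn.Conv2d(1, 8, 5), nn.ReLU(),
        nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        nn.Linear(8, 10),
    )

    @torch.no_grad()
    def unit_response(stimulus, unit=3):
        """Black-box query: activation of one unit for a (1, 1, 32, 32) stimulus."""
        return net(stimulus)[0, unit].item()

    best = torch.rand(1, 1, 32, 32)
    best_score = unit_response(best)
    for _ in range(200):                                  # (1+lambda) search, lambda=8
        candidates = best + 0.1 * torch.randn(8, 1, 32, 32)
        for cand in candidates:
            cand = cand.unsqueeze(0).clamp(0, 1)
            score = unit_response(cand)
            if score > best_score:                        # keep the best stimulus so far
                best, best_score = cand, score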
DRAW: Deep networks for Recognizing styles of Artists Who illustrate children's books
This paper is motivated from a young boy's capability to recognize an
illustrator's style in a totally different context. In the book "We are All
Born Free" [1], composed of selected rights from the Universal Declaration of
Human Rights interpreted by different illustrators, the boy was surprised to
see a picture similar to the ones in the "Winnie the Witch" series drawn by
Korky Paul (Figure 1). The style was noticeable in other characters of the same
illustrator in different books as well. The capability of a child to easily
spot the style was shown to be valid for other illustrators such as Axel
Scheffler and Debi Gliori. The boy's enthusiasm led us to start the journey to
explore the capabilities of machines to recognize the style of illustrators.
We collected pages from children's books to construct a new illustrations
dataset consisting of about 6500 pages from 24 artists. We exploited deep
networks for categorizing illustrators and, with around 94% classification
performance, our method outperformed traditional methods by more than 10%.
Going beyond categorization we explored transferring style. The classification
performance on the transferred images has shown the ability of our system to
capture the style. Furthermore, we discovered representative illustrations and
discriminative stylistic elements.
Comment: ACM ICMR 201
An Overview of Melanoma Detection in Dermoscopy Images Using Image Processing and Machine Learning
The incidence of malignant melanoma continues to increase worldwide. This
cancer can strike at any age; it is one of the leading causes of loss of life
in young persons. Since this cancer is visible on the skin, it is potentially
detectable at a very early stage when it is curable. New developments have
converged to make fully automatic early melanoma detection a real possibility.
First, the advent of dermoscopy has enabled a dramatic boost in clinical
diagnostic ability to the point that melanoma can be detected in the clinic at
the very earliest stages. The global adoption of this technology has allowed
accumulation of large collections of dermoscopy images of melanomas and benign
lesions validated by histopathology. The development of advanced technologies
in the areas of image processing and machine learning has given us the ability
to distinguish malignant melanoma from the many benign mimics that require no
biopsy. These new technologies should allow not only earlier
detection of melanoma, but also reduction of the large number of needless and
costly biopsy procedures. Although some of the new systems reported for these
technologies have shown promise in preliminary trials, widespread
implementation must await further technical progress in accuracy and
reproducibility. In this paper, we provide an overview of computerized
detection of melanoma in dermoscopy images. First, we discuss the various
aspects of lesion segmentation. Then, we provide a brief overview of clinical
feature segmentation. Finally, we discuss the classification stage where
machine learning algorithms are applied to the attributes generated from the
segmented features to predict the existence of melanoma.
Comment: 15 pages, 3 figures