10,326 research outputs found

    Brain-Inspired Deep Networks for Image Aesthetics Assessment

    Image aesthetics assessment has been challenging due to its subjective nature. Inspired by scientific advances in human visual perception and neuroaesthetics, we design Brain-Inspired Deep Networks (BDN) for this task. BDN first learns attributes through parallel supervised pathways on a variety of selected feature dimensions. A high-level synthesis network is then trained to associate and transform those attributes into an overall aesthetics rating. Since aesthetics ratings are often subjective, we further extend BDN to predict the distribution of human ratings. Another highlight is our first-of-its-kind study of label-preserving transformations in the context of aesthetics assessment, which leads to an effective data augmentation approach. Experimental results on the AVA dataset show that our biologically inspired, task-specific BDN model achieves significant performance improvements over other state-of-the-art models with the same or higher parameter capacity.
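
    A minimal sketch (not the authors' released code) of the BDN idea in PyTorch: several parallel pathways each score one aesthetic attribute, and a synthesis network maps the attribute scores to a rating distribution. The pathway count, layer widths, and the 10-bin distribution are illustrative assumptions.

    import torch
    import torch.nn as nn

    class AttributePathway(nn.Module):
        """One supervised pathway that scores a single aesthetic attribute."""
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 16, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1),
            )
            self.score = nn.Linear(32, 1)  # trained with its own attribute label

        def forward(self, x):
            return self.score(self.features(x).flatten(1))

    class BDNSketch(nn.Module):
        def __init__(self, num_pathways=8, num_bins=10):
            super().__init__()
            self.pathways = nn.ModuleList([AttributePathway() for _ in range(num_pathways)])
            # Synthesis network: attribute scores -> distribution over rating bins.
            self.synthesis = nn.Sequential(
                nn.Linear(num_pathways, 64), nn.ReLU(),
                nn.Linear(64, num_bins), nn.Softmax(dim=1),
            )

        def forward(self, x):
            attrs = torch.cat([p(x) for p in self.pathways], dim=1)
            return self.synthesis(attrs)

    model = BDNSketch()
    print(model(torch.randn(2, 3, 224, 224)).shape)  # torch.Size([2, 10])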

    Quality Assessment for Tone-Mapped HDR Images Using Multi-Scale and Multi-Layer Information

    Tone mapping operators and multi-exposure fusion methods allow us to enjoy the informative content of high dynamic range (HDR) images on standard dynamic range devices, but they also introduce distortions into HDR content. Methods are therefore needed to evaluate tone-mapped image quality. Because of the complexity of the possible distortions in a tone-mapped image, information from different scales and different levels should be considered when predicting its quality. We therefore propose a new no-reference method for tone-mapped image quality assessment (TMIQA) based on multi-scale and multi-layer features extracted from a pre-trained deep convolutional neural network. After aggregation, the extracted features are mapped to quality predictions by regression. The proposed method is tested on the largest public TMIQA database and compared to existing no-reference methods. The experimental results show that the proposed method achieves better performance.
    Comment: This paper has 6 pages, 3 tables and 2 figures in total; corrects a typo in the accepted version.
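
    A hedged sketch of the multi-scale, multi-layer feature idea: globally pool activations from several layers of a pre-trained CNN at several input scales, then regress a quality score. The VGG-16 backbone, tapped layers, scales, and linear regressor are assumptions, not the paper's exact configuration (assumes a recent torchvision).

    import torch
    import torch.nn.functional as F
    from torchvision.models import vgg16, VGG16_Weights

    backbone = vgg16(weights=VGG16_Weights.DEFAULT).features.eval()
    tap_layers = {3, 8, 15, 22, 29}  # last ReLU of each VGG-16 conv block

    def multiscale_features(img, scales=(1.0, 0.5, 0.25)):
        feats = []
        for s in scales:
            x = F.interpolate(img, scale_factor=s, mode="bilinear")
            for i, layer in enumerate(backbone):
                x = layer(x)
                if i in tap_layers:  # aggregate each tapped layer by global pooling
                    feats.append(x.mean(dim=(2, 3)))
        return torch.cat(feats, dim=1)  # one fixed-length descriptor per image

    with torch.no_grad():
        desc = multiscale_features(torch.rand(1, 3, 384, 384))
    regressor = torch.nn.Linear(desc.shape[1], 1)  # would be trained on quality labels
    print(regressor(desc))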

    Human Attention Estimation for Natural Images: An Automatic Gaze Refinement Approach

    Photo collections and their applications today attempt to reflect user interactions in various forms, and they aim to capture the user's intention with minimal effort. Human interest regions in an image carry powerful information about the user's behavior and can be used in many photo applications. Research on human visual attention has been conducted in the form of gaze tracking and computational saliency models in the computer vision community, and has shown considerable progress. This paper presents an integration of implicit gaze estimation and a computational saliency model to effectively estimate human attention regions in images on the fly. Furthermore, our method estimates human attention via implicit calibration and incremental model updating, without any active participation from the user. We also present extensive analysis and possible applications for personal photo collections.
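
    An illustrative sketch (assumptions, not the paper's algorithm): fuse a bottom-up saliency map with a density map built from implicit gaze samples to estimate attention regions. The Gaussian width and mixing weight are hypothetical parameters.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def gaze_density(fixations, shape, sigma=25.0):
        """fixations: (N, 2) array of (row, col) gaze samples."""
        dens = np.zeros(shape)
        for r, c in fixations.astype(int):
            if 0 <= r < shape[0] and 0 <= c < shape[1]:
                dens[r, c] += 1.0
        dens = gaussian_filter(dens, sigma)
        return dens / (dens.max() + 1e-8)

    def fuse_attention(saliency, fixations, alpha=0.6):
        """Weighted fusion; the gaze prior refines the saliency prediction."""
        sal = saliency / (saliency.max() + 1e-8)
        return alpha * gaze_density(fixations, sal.shape) + (1 - alpha) * sal

    saliency = np.random.rand(240, 320)          # stand-in saliency map
    gaze = np.random.rand(50, 2) * [240, 320]    # stand-in gaze samples
    attention = fuse_attention(saliency, gaze)
    print(attention.shape, attention.max())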

    Saliency detection based on structural dissimilarity induced by image quality assessment model

    The distinctiveness of image regions is widely used as a cue for saliency. Generally, distinctiveness is computed from the absolute difference of features. However, image quality assessment (IQA) studies show that the human visual system is highly sensitive to structural changes rather than to absolute differences. Accordingly, we propose computing the structural dissimilarity between image patches as the distinctiveness measure for saliency detection. As in IQA models, the structural dissimilarity is computed from the correlation of structural features. The global structural dissimilarity of a patch to all other patches represents the saliency of that patch. We incorporate two widely used structural features, namely local contrast and gradient magnitude, into the structural dissimilarity computation of the proposed model. Without any postprocessing, the proposed model based on the correlation of either of the two structural features outperforms 11 state-of-the-art saliency models on three saliency databases.
    Comment: For associated source code, see https://github.com/yangli-xjtu/SD
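
    A minimal sketch of the structural-dissimilarity cue (details assumed, not taken from the linked code): describe each patch by its gradient-magnitude values, take one minus the Pearson correlation between patch descriptors as dissimilarity, and score a patch's saliency as its mean dissimilarity to all other patches.

    import numpy as np

    def patch_saliency(img, patch=16):
        gy, gx = np.gradient(img.astype(float))
        gmag = np.hypot(gx, gy)                        # structural feature
        H, W = (img.shape[0] // patch) * patch, (img.shape[1] // patch) * patch
        feats = (gmag[:H, :W]
                 .reshape(H // patch, patch, W // patch, patch)
                 .transpose(0, 2, 1, 3)
                 .reshape(-1, patch * patch))          # one row per patch
        feats = feats - feats.mean(axis=1, keepdims=True)
        norms = np.linalg.norm(feats, axis=1, keepdims=True) + 1e-8
        corr = (feats / norms) @ (feats / norms).T     # pairwise correlation
        dissim = 1.0 - corr                            # structural dissimilarity
        sal = dissim.mean(axis=1)                      # global dissimilarity
        return sal.reshape(H // patch, W // patch)

    img = np.random.rand(128, 160)
    print(patch_saliency(img).shape)                   # (8, 10)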

    Segmentation of retinal cysts from Optical Coherence Tomography volumes via selective enhancement

    Automated and accurate segmentation of cystoid structures in Optical Coherence Tomography (OCT) is of interest for the early detection of retinal diseases. It is, however, a challenging task. We propose a novel method for localizing cysts in 3D OCT volumes. The proposed work is biologically inspired and based on selective enhancement of the cysts by inducing motion to a given OCT slice. A Convolutional Neural Network (CNN) is designed to learn a mapping function that combines the results of multiple such motions to produce a probability map of cyst locations in a given slice. The final segmentation of cysts is obtained via simple clustering of the detected cyst locations. The proposed method is evaluated on two public datasets and one private dataset. The public datasets include the one released for the OPTIMA Cyst Segmentation Challenge (OCSC) at MICCAI 2015 and the DME dataset. After training on the OCSC training set, the method achieves a mean Dice Coefficient (DC) of 0.71 on the OCSC test set. The robustness of the algorithm was examined by cross-validation on the DME and AEI (private) datasets, where the mean DC values obtained were 0.69 and 0.79, respectively. Overall, the proposed system outperforms all benchmarks. These results underscore the strengths of the proposed method in handling variations in both data acquisition protocols and scanners.
    Comment: Under review in the Journal of Biomedical and Health Informatics
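
    A sketch of the selective-enhancement idea under stated assumptions: a slice is translated by several small offsets ("induced motion"), the shifted copies are stacked as input channels, and a small CNN maps them to a per-pixel cyst probability map. The offsets and architecture are guesses, not the paper's design.

    import torch
    import torch.nn as nn

    def induced_motion_stack(slice_2d, shifts=((0, 0), (0, 2), (2, 0), (0, -2), (-2, 0))):
        """Stack translated copies of one OCT slice as input channels."""
        stack = [torch.roll(slice_2d, shifts=s, dims=(0, 1)) for s in shifts]
        return torch.stack(stack).unsqueeze(0)  # (1, n_shifts, H, W)

    class CystProbNet(nn.Module):
        def __init__(self, in_ch=5):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(),
                nn.Conv2d(32, 1, 1), nn.Sigmoid(),  # per-pixel probability map
            )

        def forward(self, x):
            return self.net(x)

    oct_slice = torch.rand(256, 512)                 # stand-in OCT B-scan
    probs = CystProbNet()(induced_motion_stack(oct_slice))
    print(probs.shape)                               # torch.Size([1, 1, 256, 512])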

    MSR-net: Low-light Image Enhancement Using Deep Convolutional Network

    Images captured in low-light conditions usually suffer from very low contrast, which greatly increases the difficulty of subsequent computer vision tasks. In this paper, a low-light image enhancement model based on a convolutional neural network and Retinex theory is proposed. First, we show that multi-scale Retinex is equivalent to a feedforward convolutional neural network with different Gaussian convolution kernels. Motivated by this fact, we design a Convolutional Neural Network (MSR-net) that directly learns an end-to-end mapping between dark and bright images. Fundamentally different from existing approaches, low-light image enhancement is regarded here as a machine learning problem. In this model, most of the parameters are optimized by back-propagation, whereas the parameters of traditional models depend on manual settings. Experiments on a number of challenging images demonstrate the advantages of our method over other state-of-the-art methods from both qualitative and quantitative perspectives.
    Comment: 9 pages
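
    For reference, classical multi-scale Retinex, which the paper observes is itself a feedforward network of Gaussian convolutions. The sigmas and equal scale weights below are common defaults, not necessarily the paper's choices.

    import numpy as np
    from scipy.ndimage import gaussian_filter

    def multi_scale_retinex(img, sigmas=(15, 80, 250)):
        """img: float array in (0, 1]; returns the MSR log-domain output."""
        img = np.maximum(img, 1e-4)
        log_i = np.log(img)
        # Each scale: log(I) - log(G_sigma * I), then average across scales.
        msr = sum(log_i - np.log(np.maximum(gaussian_filter(img, s), 1e-4))
                  for s in sigmas) / len(sigmas)
        # Stretch back to [0, 1] for display.
        return (msr - msr.min()) / (msr.max() - msr.min() + 1e-8)

    dark = np.random.rand(128, 128) * 0.2  # stand-in low-light image
    print(multi_scale_retinex(dark).mean())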

    Vision-based Human Gender Recognition: A Survey

    Gender is an important demographic attribute of people. This paper provides a survey of human gender recognition in computer vision. A review of approaches exploiting information from the face and the whole body (either from a still image or a gait sequence) is presented. We highlight the challenges faced and survey the representative methods of these approaches. Based on the results, good performance has been achieved for datasets captured under controlled environments, but there is still much work to be done to improve the robustness of gender recognition under real-life conditions.
    Comment: 30 pages

    Measuring and Understanding Sensory Representations within Deep Networks Using a Numerical Optimization Framework

    A central challenge in sensory neuroscience is describing how the activity of populations of neurons can represent useful features of the external environment. However, while neurophysiologists have long been able to record the responses of neurons in awake, behaving animals, it is another matter entirely to say what a given neuron does. A key problem is that in many sensory domains, the space of all possible stimuli that one might encounter is effectively infinite; in vision, for instance, natural scenes are combinatorially complex, and an organism will only encounter a tiny fraction of possible stimuli. As a result, even describing the response properties of sensory neurons is difficult, and investigations of neuronal function are almost always critically limited by the number of stimuli that can be considered. In this paper, we propose a closed-loop, optimization-based experimental framework for characterizing the response properties of sensory neurons, building on past efforts in closed-loop experimental methods and leveraging recent advances in artificial neural networks to serve as a proving ground for our techniques. Specifically, using deep convolutional neural networks, we asked whether modern black-box optimization techniques can be used to interrogate the "tuning landscape" of an artificial neuron in a deep, nonlinear system, without imposing significant constraints on the space of stimuli under consideration. We introduce a series of measures to quantify tuning landscapes, and show how these relate to the performance of the networks in an object recognition task. To the extent that deep convolutional neural networks increasingly serve as de facto working hypotheses for biological vision, we argue that developing a unified approach for studying both artificial and biological systems holds great potential to advance both fields together.
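
    A hedged sketch of the closed-loop setup: treat one unit of a pre-trained CNN as a black box and search stimulus space with a simple (1+lambda) evolution strategy, one representative black-box optimizer rather than the paper's specific techniques. The target layer, unit index, and hyperparameters are arbitrary illustrative choices.

    import torch
    from torchvision.models import vgg16, VGG16_Weights

    net = vgg16(weights=VGG16_Weights.DEFAULT).features[:16].eval()  # truncated backbone

    @torch.no_grad()
    def unit_response(stim, unit=42):
        """Black-box objective: mean activation of one channel (no gradients used)."""
        return net(stim).mean(dim=(2, 3))[0, unit].item()

    @torch.no_grad()
    def es_search(steps=50, pop=8, sigma=0.1):
        best = torch.rand(1, 3, 64, 64)              # random starting stimulus
        best_f = unit_response(best)
        for _ in range(steps):
            for _ in range(pop):                     # sample around the incumbent
                cand = (best + sigma * torch.randn_like(best)).clamp(0, 1)
                f = unit_response(cand)
                if f > best_f:
                    best, best_f = cand, f
        return best, best_f

    stim, f = es_search()
    print(f"best response found: {f:.3f}")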

    DRAW: Deep networks for Recognizing styles of Artists Who illustrate children's books

    This paper is motivated by a young boy's ability to recognize an illustrator's style in a totally different context. In the book "We Are All Born Free" [1], composed of selected rights from the Universal Declaration of Human Rights interpreted by different illustrators, the boy was surprised to see a picture similar to the ones in the "Winnie the Witch" series drawn by Korky Paul (Figure 1). The style was noticeable in other characters by the same illustrator in different books as well. The child's ability to easily spot the style held for other illustrators such as Axel Scheffler and Debi Gliori. The boy's enthusiasm led us to start a journey exploring the capability of machines to recognize the style of illustrators. We collected pages from children's books to construct a new illustrations dataset consisting of about 6,500 pages from 24 artists. We exploited deep networks for categorizing illustrators and, with around 94% classification accuracy, our method outperformed traditional methods by more than 10%. Going beyond categorization, we explored style transfer. The classification performance on the transferred images demonstrates the ability of our system to capture the style. Furthermore, we discovered representative illustrations and discriminative stylistic elements.
    Comment: ACM ICMR 201
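
    A sketch of the categorization setup under assumptions: fine-tune a pre-trained CNN with a 24-way head, one class per illustrator. The ResNet-18 backbone and training details are stand-ins, not taken from the paper.

    import torch
    import torch.nn as nn
    from torchvision.models import resnet18, ResNet18_Weights

    model = resnet18(weights=ResNet18_Weights.DEFAULT)
    model.fc = nn.Linear(model.fc.in_features, 24)   # 24 artists

    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    loss_fn = nn.CrossEntropyLoss()

    # One hypothetical training step on a batch of book-page crops.
    pages = torch.rand(16, 3, 224, 224)
    artists = torch.randint(0, 24, (16,))
    opt.zero_grad()
    loss = loss_fn(model(pages), artists)
    loss.backward()
    opt.step()
    print(float(loss))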

    An Overview of Melanoma Detection in Dermoscopy Images Using Image Processing and Machine Learning

    The incidence of malignant melanoma continues to increase worldwide. This cancer can strike at any age; it is one of the leading causes of loss of life in young persons. Since this cancer is visible on the skin, it is potentially detectable at a very early stage, when it is curable. New developments have converged to make fully automatic early melanoma detection a real possibility. First, the advent of dermoscopy has enabled a dramatic boost in clinical diagnostic ability, to the point that melanoma can be detected in the clinic at the very earliest stages. The global adoption of this technology has allowed the accumulation of large collections of dermoscopy images of melanomas and benign lesions validated by histopathology. The development of advanced technologies in the areas of image processing and machine learning has given us the ability to distinguish malignant melanoma from the many benign mimics that require no biopsy. These new technologies should allow not only earlier detection of melanoma but also a reduction in the large number of needless and costly biopsy procedures. Although some of the new systems reported for these technologies have shown promise in preliminary trials, widespread implementation must await further technical progress in accuracy and reproducibility. In this paper, we provide an overview of computerized detection of melanoma in dermoscopy images. First, we discuss the various aspects of lesion segmentation. Then, we provide a brief overview of clinical feature segmentation. Finally, we discuss the classification stage, where machine learning algorithms are applied to the attributes generated from the segmented features to predict the presence of melanoma.
    Comment: 15 pages, 3 figures
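
    An illustrative skeleton of the surveyed pipeline (segmentation, then feature extraction, then classification); the thresholding, the simple lesion attributes, and the random forest are generic stand-ins, not methods endorsed by the paper.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def segment_lesion(gray):
        """Crude stand-in for lesion segmentation: dark-region thresholding."""
        return gray < gray.mean() - 0.5 * gray.std()

    def extract_features(gray, mask):
        """A few simple shape/intensity attributes computed on the segmented lesion."""
        area = mask.mean()
        mean_intensity = gray[mask].mean() if mask.any() else 0.0
        contrast = gray[mask].std() if mask.any() else 0.0
        return np.array([area, mean_intensity, contrast])

    rng = np.random.default_rng(0)
    images = rng.random((40, 64, 64))                 # stand-in dermoscopy crops
    labels = rng.integers(0, 2, 40)                   # 1 = melanoma (synthetic labels)
    X = np.stack([extract_features(im, segment_lesion(im)) for im in images])
    clf = RandomForestClassifier(n_estimators=50).fit(X, labels)
    print(clf.predict(X[:5]))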