
    Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images

    Recovering the radiometric properties of a scene (i.e., the reflectance, illumination, and geometry) is a long-sought ability of computer vision that can provide invaluable information for a wide range of applications. Deciphering the radiometric ingredients from the appearance of a real-world scene, as opposed to a single isolated object, is particularly challenging as it generally consists of various objects with different material compositions exhibiting complex reflectance and light interactions that are also part of the illumination. We introduce the first method for radiometric scene decomposition that handles those intricacies. We use RGB-D images to bootstrap geometry recovery and simultaneously recover the complex reflectance and natural illumination while refining the noisy initial geometry and segmenting the scene into different material regions. Most importantly, we handle real-world scenes consisting of multiple objects of unknown materials, which necessitates the modeling of spatially-varying complex reflectance, natural illumination, texture, interreflection and shadows. We systematically evaluate the effectiveness of our method on synthetic scenes and demonstrate its application to real-world scenes. The results show that rich radiometric information can be recovered from RGB-D images and demonstrate a new role RGB-D sensors can play for general scene understanding tasks. Comment: 16 pages

    Describing Colors, Textures and Shapes for Content Based Image Retrieval - A Survey

    Visual media has always been the most enjoyed way of communication. From the advent of television to modern-day handheld computers, we have witnessed the exponential growth of images around us. Undoubtedly, they carry a lot of information that needs to be utilized in an effective manner. Hence, an intense need has been felt to efficiently index and store large image collections for effective and on-demand retrieval. For this purpose, low-level features extracted from the image contents, such as color, texture and shape, have been used. Content based image retrieval systems employing these features have proven very successful. Image retrieval has promising applications in numerous fields and hence has motivated researchers all over the world. New and improved ways to represent visual content are being developed each day. A tremendous amount of research has been carried out in the last decade. In this paper we present a detailed overview of some of the powerful color, texture and shape descriptors for content based image retrieval. A comparative analysis is also carried out to provide insight into outstanding challenges in this field.

    Unsupervised and semi-supervised learning with Categorical Generative Adversarial Networks assisted by Wasserstein distance for dermoscopy image Classification

    Melanoma is an aggressive skin cancer that is curable if detected early. Typically, the diagnosis involves initial screening with subsequent biopsy and histopathological examination if necessary. Computer aided diagnosis offers an objective score that is independent of clinical experience and the potential to lower the workload of a dermatologist. In the recent past, the success of deep learning algorithms in the field of general computer vision has motivated successful application of supervised deep learning methods in computer aided melanoma recognition. However, large quantities of labeled images are required to make further improvements on the supervised method. A good annotation generally requires clinical and histological confirmation, which requires significant effort. In an attempt to alleviate this constraint, we propose to use a categorical generative adversarial network to automatically learn the feature representation of dermoscopy images in an unsupervised and semi-supervised manner. Thorough experiments on the ISIC 2016 skin lesion challenge demonstrate that the proposed feature learning method achieves an average precision score of 0.424 with only 140 labeled images. Moreover, the proposed method is also capable of generating real-world-like dermoscopy images.

    Land-Cover Classification with High-Resolution Remote Sensing Images Using Transferable Deep Models

    In recent years, a large number of high spatial-resolution remote sensing (HRRS) images have become available for land-cover mapping. However, due to the complex information brought by the increased spatial resolution and the data disturbances caused by different conditions of image acquisition, it is often difficult to find an efficient method for achieving accurate land-cover classification with high-resolution and heterogeneous remote sensing images. In this paper, we propose a scheme to apply a deep model obtained from a labeled land-cover dataset to classify unlabeled HRRS images. The main idea is to rely on deep neural networks for representing the contextual information contained in different types of land cover and to propose a pseudo-labeling and sample selection scheme for improving the transferability of deep models. More precisely, a deep Convolutional Neural Network (CNN) is first pre-trained with a well-annotated land-cover dataset, referred to as the source data. Then, given a target image with no labels, the pre-trained CNN model is utilized to classify the image in a patch-wise manner. The patches with high confidence are assigned pseudo-labels and employed as queries to retrieve related samples from the source data. The pseudo-labels confirmed by the retrieved results are regarded as supervised information for fine-tuning the pre-trained deep model. To obtain a pixel-wise land-cover classification of the target image, we rely on the fine-tuned CNN and develop a hybrid classification by combining patch-wise classification and hierarchical segmentation. In addition, we create a large-scale land-cover dataset containing 150 Gaofen-2 satellite images for CNN pre-training. Experiments on multi-source HRRS images show encouraging results and demonstrate the applicability of the proposed scheme to land-cover classification.
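
    The confidence-based pseudo-labeling step described in this abstract can be illustrated with a minimal sketch. The function name, the 0.9 threshold, and the toy probabilities are assumptions made for illustration and are not taken from the paper; the retrieval-based confirmation against the source data is omitted here.

```python
from typing import List, Tuple


def select_pseudo_labels(
    probs: List[List[float]], threshold: float = 0.9
) -> List[Tuple[int, int]]:
    """Return (patch_index, predicted_class) for confidently classified patches.

    `probs` holds one class-probability vector per image patch, as a
    pre-trained classifier might produce. The threshold is a hypothetical
    value, not one reported by the authors.
    """
    selected = []
    for idx, p in enumerate(probs):
        # Index of the most probable class for this patch.
        best_class = max(range(len(p)), key=p.__getitem__)
        # Keep the patch only if the classifier is confident enough.
        if p[best_class] >= threshold:
            selected.append((idx, best_class))
    return selected


# Toy class probabilities for three patches and three land-cover classes.
patch_probs = [
    [0.95, 0.03, 0.02],  # confident: class 0
    [0.40, 0.35, 0.25],  # ambiguous: discarded
    [0.05, 0.05, 0.90],  # confident: class 2
]
pseudo = select_pseudo_labels(patch_probs, threshold=0.9)
```

    In a full pipeline of the kind the abstract outlines, the selected patches would then serve as queries against the source data, and only pseudo-labels that agree with the retrieved samples would be used for fine-tuning.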

    Generic Feature Learning for Wireless Capsule Endoscopy Analysis

    The interpretation and analysis of wireless capsule endoscopy (WCE) recordings is a complex task which requires sophisticated computer aided decision (CAD) systems in order to help physicians with the video screening and, finally, with the diagnosis. Most CAD systems in capsule endoscopy share a common system design, but use very different image and video representations. As a result, each time a new clinical application of WCE appears, a new CAD system has to be designed from scratch. This makes the design of new CAD systems very time consuming. Therefore, in this paper we introduce a system for small intestine motility characterization, based on Deep Convolutional Neural Networks, which avoids the laborious step of designing specific features for individual motility events. Experimental results show the superiority of the learned features over alternative classifiers constructed by using state of the art hand-crafted features. In particular, it reaches a mean classification accuracy of 96% for six intestinal motility events, outperforming the other classifiers by a large margin (a 14% relative performance increase).

    Video Smoke Detection Based on Deep Saliency Network

    Video smoke detection is a promising fire detection method, especially in open or large spaces and outdoor environments. Traditional video smoke detection methods usually consist of candidate region extraction and classification, but lack a powerful characterization of smoke. In this paper, we propose a novel video smoke detection method based on a deep saliency network. Visual saliency detection aims to highlight the most important object regions in an image. The pixel-level and object-level salient convolutional neural networks are combined to extract the informative smoke saliency map. An end-to-end framework for salient smoke detection and existence prediction of smoke is proposed for application in video smoke detection. The deep feature map is combined with the saliency map to predict the existence of smoke in an image. Initial and augmented datasets are built to measure the performance of frameworks with different design strategies. Qualitative and quantitative analyses at frame level and pixel level demonstrate the excellent performance of the ultimate framework. Comment: 21 pages, 12 figures

    An Enhanced Deep Feature Representation for Person Re-identification

    Feature representation and metric learning are two critical components in person re-identification models. In this paper, we focus on the feature representation and claim that hand-crafted histogram features can be complementary to Convolutional Neural Network (CNN) features. We propose a novel feature extraction model called Feature Fusion Net (FFN) for pedestrian image representation. In FFN, back propagation makes CNN features constrained by the handcrafted features. Utilizing color histogram features (RGB, HSV, YCbCr, Lab and YIQ) and texture features (multi-scale and multi-orientation Gabor features), we get a new deep feature representation that is more discriminative and compact. Experiments on three challenging datasets (VIPeR, CUHK01, PRID450s) validate the effectiveness of our proposal. Comment: Citation for this paper: Shangxuan Wu, Ying-Cong Chen, Xiang Li, An-Cong Wu, Jin-Jie You, and Wei-Shi Zheng. An Enhanced Deep Feature Representation for Person Re-identification. In IEEE WACV, 201

    A Survey on Periocular Biometrics Research

    Periocular refers to the facial region in the vicinity of the eye, including eyelids, lashes and eyebrows. While face and irises have been extensively studied, the periocular region has emerged as a promising trait for unconstrained biometrics, following demands for increased robustness of face or iris systems. With a surprisingly high discrimination ability, this region can be easily obtained with existing setups for face and iris, and the requirement of user cooperation can be relaxed, thus facilitating the interaction with biometric systems. It is also available over a wide range of distances even when the iris texture cannot be reliably obtained (low resolution) or under partial face occlusion (close distances). Here, we review the state of the art in periocular biometrics research. A number of aspects are described, including: i) existing databases, ii) algorithms for periocular detection and/or segmentation, iii) features employed for recognition, iv) identification of the most discriminative regions of the periocular area, v) comparison with iris and face modalities, vi) soft-biometrics (gender/ethnicity classification), and vii) impact of gender transformation and plastic surgery on the recognition accuracy. This work is expected to provide insight into the most relevant issues in periocular biometrics, giving a comprehensive coverage of the existing literature and current state of the art. Comment: Published in Pattern Recognition Letters

    BLNet: A Fast Deep Learning Framework for Low-Light Image Enhancement with Noise Removal and Color Restoration

    Images obtained in real-world low-light conditions are not only low in brightness, but they also suffer from many other types of degradation, such as color bias, unknown noise, detail loss and halo artifacts. In this paper, we propose a very fast deep learning framework called Bringing the Lightness (denoted as BLNet) that consists of two U-Nets with a series of well-designed loss functions to tackle all of the above degradations. Based on Retinex Theory, the decomposition net in our model can decompose low-light images into reflectance and illumination and remove noise in the reflectance during the decomposition phase. We propose a Noise and Color Bias Control module (NCBC Module) that contains a convolutional neural network and two loss functions (noise loss and color loss). This module is only used to calculate the loss functions during the training phase, so our method is very fast during the test phase. This module can smooth the reflectance to achieve the purpose of noise removal while preserving details and edge information and controlling color bias. We propose a network that can be trained to learn the mapping between low-light and normal-light illumination and enhance the brightness of images taken in low-light illumination. We train and evaluate the performance of our proposed model over the real-world Low-Light (LOL) dataset, and we also test our model over several other frequently used datasets (LIME, DICM and MEF datasets). We conduct extensive experiments to demonstrate that our approach achieves a promising effect with good robustness and generalization and outperforms many other state-of-the-art methods qualitatively and quantitatively. Our method achieves high speed because we use loss functions instead of introducing additional denoisers for noise removal and color correction. The code and model are available at https://github.com/weixinxu666/BLNet. Comment: 13 pages, 12 figures, journal
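
    The Retinex model mentioned in this abstract treats an observed image as the element-wise product of reflectance and illumination. The sketch below illustrates that factorisation only. It is not BLNet's learned decomposition net: the illumination estimate (the per-pixel maximum over colour channels) is a simple hand-crafted heuristic chosen here as an assumption, and the function names and toy pixel values are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[float, float, float]


def retinex_decompose(
    pixels: List[Pixel], eps: float = 1e-6
) -> Tuple[List[Pixel], List[float]]:
    """Split each RGB pixel into reflectance and a scalar illumination.

    Assumption: illumination is approximated by the brightest channel of
    each pixel; `eps` guards against division by zero in black pixels.
    """
    illumination = [max(p) for p in pixels]
    reflectance = [
        tuple(c / (l + eps) for c in p) for p, l in zip(pixels, illumination)
    ]
    return reflectance, illumination


def recompose(
    reflectance: List[Pixel], illumination: List[float], eps: float = 1e-6
) -> List[Pixel]:
    """Invert the decomposition: image = reflectance * illumination."""
    return [
        tuple(c * (l + eps) for c in r) for r, l in zip(reflectance, illumination)
    ]


# Two dark RGB pixels with intensities in [0, 1].
dark = [(0.10, 0.05, 0.02), (0.20, 0.20, 0.10)]
refl, illum = retinex_decompose(dark)
restored = recompose(refl, illum)
```

    Under this model, brightening an image amounts to adjusting the illumination component while leaving the reflectance untouched, which is why denoising the reflectance and enhancing the illumination can be handled as separate steps.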

    Hierarchical Gaussian Descriptors with Application to Person Re-Identification

    Describing the color and textural information of a person image is one of the most crucial aspects of person re-identification (re-id). In this paper, we present novel meta-descriptors based on a hierarchical distribution of pixel features. Although hierarchical covariance descriptors have been successfully applied to image classification, the mean information of pixel features, which is absent from the covariance, tends to be the major discriminative information for person re-id. To solve this problem, we describe a local region in an image via a hierarchical Gaussian distribution in which both means and covariances are included in the parameters. More specifically, the region is modeled as a set of multiple Gaussian distributions in which each Gaussian represents the appearance of a local patch. The characteristics of the set of Gaussians are again described by another Gaussian distribution. In both steps, we embed the parameters of the Gaussian into a point on the Symmetric Positive Definite (SPD) matrix manifold. By changing the way mean information is handled in this embedding, we develop two hierarchical Gaussian descriptors. Additionally, we develop feature norm normalization methods with the ability to alleviate the biased trends that exist in the descriptors. The experimental results on five public datasets indicate that the proposed descriptors achieve remarkably high performance on person re-id. Comment: 14 pages, 12 figures, 4 tables
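
    One well-known way to embed a Gaussian with mean mu and covariance Sigma into an SPD matrix is the block matrix [[Sigma + mu mu^T, mu], [mu^T, 1]]. The sketch below builds that matrix for a single patch of pixel features. It is offered as an illustration under assumptions, not as the paper's exact descriptor: any determinant-based scaling, the second (region-level) Gaussian, and the norm normalization are omitted, and the function name, the ridge value, and the toy features are hypothetical.

```python
from typing import List


def gaussian_to_spd(features: List[List[float]], ridge: float = 1e-3) -> List[List[float]]:
    """Embed the Gaussian fitted to `features` into a (d+1) x (d+1) SPD matrix.

    `features` holds one d-dimensional feature vector per pixel (at least
    two vectors). `ridge` is added to the covariance diagonal so that the
    matrix stays positive definite; its value is an assumption.
    """
    n, d = len(features), len(features[0])
    # Sample mean of the pixel features.
    mu = [sum(f[i] for f in features) / n for i in range(d)]
    # Unbiased sample covariance with a small ridge on the diagonal.
    cov = [
        [
            sum((f[i] - mu[i]) * (f[j] - mu[j]) for f in features) / (n - 1)
            + (ridge if i == j else 0.0)
            for j in range(d)
        ]
        for i in range(d)
    ]
    # Top rows: [Sigma + mu mu^T | mu]; bottom row: [mu^T | 1].
    spd = [[cov[i][j] + mu[i] * mu[j] for j in range(d)] + [mu[i]] for i in range(d)]
    spd.append(mu + [1.0])
    return spd


# Three pixels of a patch, each described by a 2-D feature vector.
patch_features = [[1.0, 2.0], [3.0, 4.0], [5.0, 9.0]]
P = gaussian_to_spd(patch_features)
```

    Because the mean appears explicitly in the last row and column, this embedding retains the mean information that a covariance-only descriptor discards, which is the motivation stated in the abstract.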