Radiometric Scene Decomposition: Scene Reflectance, Illumination, and Geometry from RGB-D Images
Recovering the radiometric properties of a scene (i.e., the reflectance,
illumination, and geometry) is a long-sought ability of computer vision that
can provide invaluable information for a wide range of applications.
Deciphering the radiometric ingredients from the appearance of a real-world
scene, as opposed to a single isolated object, is particularly challenging: a
scene generally consists of various objects with different material
compositions, exhibiting complex reflectance and light interactions that are
themselves part of the illumination. We introduce the first method for radiometric scene decomposition
that handles those intricacies. We use RGB-D images to bootstrap geometry
recovery and simultaneously recover the complex reflectance and natural
illumination while refining the noisy initial geometry and segmenting the scene
into different material regions. Most importantly, we handle real-world scenes
consisting of multiple objects of unknown materials, which necessitates the
modeling of spatially-varying complex reflectance, natural illumination,
texture, interreflection and shadows. We systematically evaluate the
effectiveness of our method on synthetic scenes and demonstrate its application
to real-world scenes. The results show that rich radiometric information can be
recovered from RGB-D images and demonstrate a new role RGB-D sensors can play
for general scene understanding tasks.
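For intuition, here is a minimal sketch of the forward model such a decomposition inverts, under a deliberately simplified single-bounce Lambertian assumption; the paper itself handles spatially-varying complex reflectance, interreflection and shadows, and the function names and finite-difference normal estimate below are illustrative, not the authors' implementation:

    import numpy as np

    def normals_from_depth(depth):
        # Approximate per-pixel surface normals from an (H, W) depth map
        # via finite differences; this is the kind of noisy initial
        # geometry an RGB-D sensor bootstraps.
        dz_dy, dz_dx = np.gradient(depth)
        n = np.dstack((-dz_dx, -dz_dy, np.ones_like(depth)))
        return n / np.linalg.norm(n, axis=2, keepdims=True)

    def render_lambertian(albedo, depth, light_dir):
        # Single-bounce Lambertian image formation: I = albedo * max(0, n . l).
        # Real scenes add interreflection and shadows, which this omits.
        n = normals_from_depth(depth)
        l = np.asarray(light_dir, dtype=float)
        l /= np.linalg.norm(l)
        shading = np.clip(n @ l, 0.0, None)   # (H, W) cosine shading
        return albedo * shading               # albedo: (H, W) reflectance map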
Describing Colors, Textures and Shapes for Content Based Image Retrieval - A Survey
Visual media has always been the most engaging form of communication. From the
advent of television to modern hand-held computers, we have witnessed
exponential growth in the number of images around us. They undoubtedly carry a
great deal of information that needs to be utilized effectively, so there is an
intense need to efficiently index and store large image collections for
effective, on-demand retrieval. For this purpose, low-level features extracted
from image content, such as color, texture and shape, have been used, and
content-based image retrieval systems employing these features have proven
very successful. Image retrieval has promising applications in numerous fields
and has therefore motivated researchers all over the world; new and improved
ways to represent visual content are being developed each day, and a
tremendous amount of research has been carried out in the last decade. In this
paper we present a detailed overview of some of the most powerful color,
texture and shape descriptors for content-based image retrieval. A comparative
analysis is also carried out to provide insight into the outstanding
challenges in this field.
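As a concrete example of the low-level color features such systems rely on, here is a minimal sketch of a global joint color histogram with histogram-intersection matching; the bin count and normalization are illustrative choices, not taken from any particular surveyed system:

    import numpy as np

    def color_histogram(image, bins=8):
        # image: H x W x 3 uint8 array. Quantize each channel into `bins`
        # levels and count joint occurrences, yielding a bins**3-dimensional
        # global color descriptor.
        q = (image.astype(int) // (256 // bins)).reshape(-1, 3)
        idx = (q[:, 0] * bins + q[:, 1]) * bins + q[:, 2]
        hist = np.bincount(idx, minlength=bins ** 3).astype(float)
        return hist / hist.sum()  # L1-normalize so image size does not matter

    def intersection_score(h1, h2):
        # Histogram intersection: a standard similarity for CBIR ranking.
        return np.minimum(h1, h2).sum()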
Unsupervised and Semi-Supervised Learning with Categorical Generative Adversarial Networks Assisted by Wasserstein Distance for Dermoscopy Image Classification
Melanoma is an aggressive skin cancer that is curable if detected early.
Typically, the diagnosis involves initial screening with subsequent biopsy and
histopathological examination if necessary. Computer-aided diagnosis offers an
objective score that is independent of clinical experience, as well as the
potential to lower the workload of a dermatologist. In the recent past, the
success of deep learning algorithms in the field of general computer vision
has motivated the successful application of supervised deep learning methods
to computer-aided melanoma recognition. However, large quantities of labeled
images are required to make further improvements to supervised methods, and a
good annotation generally requires clinical and histological confirmation,
which takes significant effort. In an attempt to alleviate this constraint, we
propose to use a categorical generative adversarial network to automatically
learn the feature representation of dermoscopy images in an unsupervised and
semi-supervised manner. Thorough experiments on the ISIC 2016 skin lesion
challenge demonstrate that the proposed feature learning method achieves an
average precision score of 0.424 with only 140 labeled images. Moreover, the
proposed method is also capable of generating realistic dermoscopy images.
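A minimal PyTorch sketch of the categorical GAN discriminator objective the method builds on; the entropy terms are the standard CatGAN formulation, while the Wasserstein-distance assistance and the semi-supervised cross-entropy term from the paper are omitted here:

    import torch

    def entropy(p, eps=1e-8):
        # Shannon entropy of a batch of categorical distributions
        # (each row of p sums to 1).
        return -(p * (p + eps).log()).sum(dim=1)

    def catgan_d_loss(p_real, p_fake):
        # p_real, p_fake: (N, K) softmax outputs of the discriminator.
        # Be certain on real images, uncertain on generated ones, and
        # use all K classes evenly across the batch.
        certain_real = entropy(p_real).mean()                    # minimize
        uncertain_fake = entropy(p_fake).mean()                  # maximize
        balanced = entropy(p_real.mean(dim=0, keepdim=True))[0]  # maximize
        return certain_real - uncertain_fake - balanced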
Land-Cover Classification with High-Resolution Remote Sensing Images Using Transferable Deep Models
In recent years, a large volume of high-spatial-resolution remote sensing
(HRRS) images has become available for land-cover mapping. However, due to the
complex information brought by the increased spatial resolution and the data
disturbances caused by varying conditions of image acquisition, it is often
difficult to find an efficient method for accurate land-cover classification
with high-resolution, heterogeneous remote sensing images. In this paper, we
propose a scheme that applies a deep model learned from a labeled land-cover
dataset to classify unlabeled HRRS images. The main idea is to rely on deep
neural networks to represent the contextual information contained in different
types of land cover, and to propose a pseudo-labeling and sample selection
scheme that improves the transferability of deep models. More precisely, a
deep Convolutional Neural Network (CNN) is first pre-trained with a
well-annotated land-cover dataset, referred to as the source data. Then, given
a target image with no labels, the pre-trained CNN model is used to classify
the image in a patch-wise manner. Patches classified with high confidence are
assigned pseudo-labels and employed as queries to retrieve related samples
from the source data. Pseudo-labels confirmed by the retrieved results are
regarded as supervised information for fine-tuning the pre-trained deep model.
To obtain a pixel-wise land-cover classification of the target image, we rely
on the fine-tuned CNN and develop a hybrid classification that combines
patch-wise classification and hierarchical segmentation. In addition, we
create a large-scale land-cover dataset containing 150 Gaofen-2 satellite
images for CNN pre-training. Experiments on multi-source HRRS images show
encouraging results and demonstrate the applicability of the proposed scheme
to land-cover classification.
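The confidence-based pseudo-labeling step can be sketched as follows; the 0.95 threshold is an illustrative value, and the retrieval-based confirmation against the source data described above is omitted:

    import numpy as np

    def select_pseudo_labels(probs, threshold=0.95):
        # probs: (num_patches, num_classes) softmax scores produced by the
        # pre-trained CNN applied patch-wise to the unlabeled target image.
        confidence = probs.max(axis=1)
        labels = probs.argmax(axis=1)
        keep = confidence >= threshold  # only high-confidence patches
        return np.flatnonzero(keep), labels[keep]

    # The kept patches serve as queries to retrieve related source samples;
    # pseudo-labels confirmed by the retrieval then fine-tune the model.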
Generic Feature Learning for Wireless Capsule Endoscopy Analysis
The interpretation and analysis of the wireless capsule endoscopy recording
is a complex task which requires sophisticated computer aided decision (CAD)
systems in order to help physicians with the video screening and, finally, with
the diagnosis. Most of the CAD systems in the capsule endoscopy share a common
system design, but use very different image and video representations. As a
result, each time a new clinical application of WCE appears, new CAD system has
to be designed from scratch. This characteristic makes the design of new CAD
systems a very time consuming. Therefore, in this paper we introduce a system
for small intestine motility characterization, based on Deep Convolutional
Neural Networks, which avoids the laborious step of designing specific features
for individual motility events. Experimental results show the superiority of
the learned features over alternative classifiers constructed by using state of
the art hand-crafted features. In particular, it reaches a mean classification
accuracy of 96% for six intestinal motility events, outperforming the other
classifiers by a large margin (a 14% relative performance increase)
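In the same spirit, a minimal sketch of replacing hand-crafted features with a learned representation; the abstract does not specify the backbone, so the ResNet-18 choice and six-way head below are assumptions:

    import torch.nn as nn
    from torchvision import models

    def build_motility_classifier(num_events=6):
        # Reuse a generic pre-trained CNN as the feature learner and replace
        # its classification head with one for six intestinal motility
        # events, avoiding hand-designed features for each event type.
        net = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
        net.fc = nn.Linear(net.fc.in_features, num_events)
        return net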
Video Smoke Detection Based on Deep Saliency Network
Video smoke detection is a promising fire detection method, especially in open
or large spaces and outdoor environments. Traditional video smoke detection
methods usually consist of candidate region extraction and classification, but
lack a powerful characterization of smoke. In this paper, we propose a novel
video smoke detection method based on deep saliency network. Visual saliency
detection aims to highlight the most important object regions in an image. The
pixel-level and object-level salient convolutional neural networks are combined
to extract the informative smoke saliency map. An end-to-end framework for
salient smoke detection and existence prediction of smoke is proposed for
application in video smoke detection. The deep feature map is combined with the
saliency map to predict the existence of smoke in an image. Initial and
augmented datasets are built to measure the performance of frameworks with
different design strategies. Qualitative and quantitative analyses at the
frame and pixel levels demonstrate the excellent performance of the final
framework.
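One plausible way to combine the deep feature map with the saliency map for existence prediction, sketched in PyTorch; the abstract does not specify the fusion, so this weighting-and-pooling head is an assumption:

    import torch
    import torch.nn as nn

    class SmokeExistenceHead(nn.Module):
        # Weight the deep feature map by the predicted saliency map, then
        # pool globally and classify whether the frame contains smoke.
        def __init__(self, channels):
            super().__init__()
            self.fc = nn.Linear(channels, 1)

        def forward(self, features, saliency):
            # features: (N, C, H, W); saliency: (N, 1, H, W) in [0, 1].
            weighted = features * saliency
            pooled = weighted.mean(dim=(2, 3))     # global average pooling
            return torch.sigmoid(self.fc(pooled))  # P(smoke in the image)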
An Enhanced Deep Feature Representation for Person Re-identification
Feature representation and metric learning are two critical components in
person re-identification models. In this paper, we focus on the feature
representation and claim that hand-crafted histogram features can be
complementary to Convolutional Neural Network (CNN) features. We propose a
novel feature extraction model called Feature Fusion Net (FFN) for pedestrian
image representation. In FFN, back-propagation constrains the CNN features by
the hand-crafted features. Utilizing color histogram features (RGB, HSV,
YCbCr, Lab and YIQ) and texture features (multi-scale and multi-orientation
Gabor features), we get a new deep feature representation that is more
discriminative and compact. Experiments on three challenging datasets (VIPeR,
CUHK01, PRID450s) validate the effectiveness of our proposal.
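FFN learns the fusion end-to-end, with back-propagation constraining the CNN branch by the hand-crafted one; below is a minimal sketch of just the fusion idea, where per-block L2 normalization followed by concatenation is an illustrative choice, not the paper's exact scheme:

    import numpy as np

    def fused_descriptor(cnn_feat, color_hists, gabor_feats):
        # cnn_feat: CNN feature vector; color_hists: histograms from several
        # color spaces (RGB, HSV, YCbCr, Lab, YIQ); gabor_feats: multi-scale,
        # multi-orientation Gabor responses.
        blocks = [cnn_feat, *color_hists, *gabor_feats]
        # L2-normalize each block so no single modality dominates the fusion.
        blocks = [b / (np.linalg.norm(b) + 1e-12) for b in blocks]
        return np.concatenate(blocks)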
A Survey on Periocular Biometrics Research
Periocular refers to the facial region in the vicinity of the eye, including
eyelids, lashes and eyebrows. While the face and iris have been extensively
studied, the periocular region has emerged as a promising trait for
unconstrained biometrics, following demands for increased robustness of face or
iris systems. With a surprisingly high discrimination ability, this region can
be easily obtained with existing setups for face and iris, and the requirement
of user cooperation can be relaxed, thus facilitating the interaction with
biometric systems. It is also available over a wide range of distances even
when the iris texture cannot be reliably obtained (low resolution) or under
partial face occlusion (close distances). Here, we review the state of the art
in periocular biometrics research. A number of aspects are described,
including: i) existing databases, ii) algorithms for periocular detection
and/or segmentation, iii) features employed for recognition, iv) identification
of the most discriminative regions of the periocular area, v) comparison with
iris and face modalities, vi) soft-biometrics (gender/ethnicity
classification), and vii) impact of gender transformation and plastic surgery
on the recognition accuracy. This work is expected to provide insight into the
most relevant issues in periocular biometrics, giving comprehensive coverage
of the existing literature and the current state of the art.
BLNet: A Fast Deep Learning Framework for Low-Light Image Enhancement with Noise Removal and Color Restoration
Images obtained in real-world low-light conditions are not only low in
brightness, but they also suffer from many other types of degradation, such as
color bias, unknown noise, detail loss and halo artifacts. In this paper, we
propose a very fast deep learning framework called Bringing the Lightness
(denoted as BLNet) that consists of two U-Nets with a series of well-designed
loss functions to tackle all of the above degradations. Based on Retinex
Theory, the decomposition net in our model can decompose low-light images into
reflectance and illumination and remove noise in the reflectance during the
decomposition phase. We propose a Noise and Color Bias Control module (NCBC
Module) that contains a convolutional neural network and two loss functions
(noise loss and color loss). This module is only used to calculate the loss
functions during the training phase, so our method is very fast during the test
phase. This module can smooth the reflectance to achieve the purpose of noise
removal while preserving details and edge information and controlling color
bias. We propose a network that can be trained to learn the mapping between
low-light and normal-light illumination and enhance the brightness of images
taken under low-light illumination. We train and evaluate our proposed model
on the real-world Low-Light (LOL) dataset, and we also test it on several
other frequently used datasets (LIME, DICM and MEF). Extensive experiments
demonstrate that our approach achieves promising results with good robustness
and generalization, and outperforms many other state-of-the-art methods both
qualitatively and quantitatively. Our method is fast because it uses loss
functions instead of additional denoisers for noise removal and color
correction. The code and model are available at
https://github.com/weixinxu666/BLNet.
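A minimal PyTorch sketch of the Retinex decomposition constraint the model builds on: the observed image S is modeled as the element-wise product of reflectance R and illumination I. The L1 reconstruction and total-variation smoothness terms below are generic stand-ins for the paper's full set of losses (including the NCBC noise and color losses):

    import torch

    def reconstruction_loss(low, reflectance, illumination):
        # Retinex assumption: S = R * I (element-wise product).
        return (reflectance * illumination - low).abs().mean()

    def tv_smoothness(x):
        # Total-variation penalty: encourages a smooth, noise-free map.
        dh = (x[..., 1:, :] - x[..., :-1, :]).abs().mean()
        dw = (x[..., :, 1:] - x[..., :, :-1]).abs().mean()
        return dh + dw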
Hierarchical Gaussian Descriptors with Application to Person Re-Identification
Describing the color and textural information of a person image is one of the
most crucial aspects of person re-identification (re-id). In this paper, we
present novel meta-descriptors based on a hierarchical distribution of pixel
features. Although hierarchical covariance descriptors have been successfully
applied to image classification, the mean information of pixel features, which
is absent from the covariance, tends to be the major discriminative information
for person re-id. To solve this problem, we describe a local region in an image
via hierarchical Gaussian distribution in which both means and covariances are
included in their parameters. More specifically, the region is modeled as a set
of multiple Gaussian distributions in which each Gaussian represents the
appearance of a local patch. The characteristics of the set of Gaussians are
again described by another Gaussian distribution. In both steps, we embed the
parameters of the Gaussian into a point of Symmetric Positive Definite (SPD)
matrix manifold. By changing how the mean information is handled in this
embedding, we develop two hierarchical Gaussian descriptors. Additionally, we
develop feature norm normalization methods that alleviate the biased trends
present in the descriptors. Experimental results on five public datasets
indicate that the proposed descriptors achieve remarkably high performance on
person re-id.
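The key embedding step admits a compact sketch: a Gaussian N(mu, Sigma) can be identified with a (d+1) x (d+1) SPD matrix that retains both mean and covariance, which a matrix logarithm then flattens into a Euclidean representation. This is the commonly used embedding; the paper's two descriptors differ in how the mean is handled, which is not reproduced here:

    import numpy as np

    def gaussian_to_spd(mu, sigma):
        # Embed N(mu, sigma) as the SPD matrix
        #   P = [[sigma + mu mu^T, mu],
        #        [     mu^T,        1]]
        # so that, unlike a covariance descriptor, the mean is preserved.
        d = mu.shape[0]
        P = np.empty((d + 1, d + 1))
        P[:d, :d] = sigma + np.outer(mu, mu)
        P[:d, d] = mu
        P[d, :d] = mu
        P[d, d] = 1.0
        return P

    def log_map(P):
        # The matrix logarithm maps the SPD manifold to a vector space
        # where ordinary Euclidean metrics (and norm normalization) apply.
        w, V = np.linalg.eigh(P)
        return V @ np.diag(np.log(w)) @ V.T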