19,165 research outputs found
Ensemble of Different Approaches for a Reliable Person Re-identification System
An ensemble of approaches for reliable person re-identification is proposed in this paper. The proposed ensemble is built combining widely used person re-identification systems using different color spaces and some variants of state-of-the-art approaches that are proposed in this paper. Different descriptors are tested, and both texture and color features are extracted from the images; then the different descriptors are compared using different distance measures (e.g., the Euclidean distance, angle, and the Jeffrey distance). To improve performance, a method based on skeleton detection, extracted from the depth map, is also applied when the depth map is available. The proposed ensemble is validated on three widely used datasets (CAVIAR4REID, IAS, and VIPeR), keeping the same parameter set of each approach constant across all tests to avoid overfitting and to demonstrate that the proposed system can be considered a general-purpose person re-identification system. Our experimental results show that the proposed system offers significant improvements over baseline approaches. The source code used for the approaches tested in this paper will be available at https://www.dei.unipd.it/node/2357 and http://robotics.dei.unipd.it/reid/
Evaluating color texture descriptors under large variations of controlled lighting conditions
The recognition of color texture under varying lighting conditions is still
an open issue. Several features have been proposed for this purpose, ranging
from traditional statistical descriptors to features extracted with neural
networks. Still, it is not completely clear under what circumstances a feature
performs better than the others. In this paper we report an extensive
comparison of old and new texture features, with and without a color
normalization step, with a particular focus on how they are affected by small
and large variation in the lighting conditions. The evaluation is performed on
a new texture database including 68 samples of raw food acquired under 46
conditions that present single and combined variations of light color,
direction and intensity. The database allows to systematically investigate the
robustness of texture descriptors across a large range of variations of imaging
conditions.Comment: Submitted to the Journal of the Optical Society of America
Review of Person Re-identification Techniques
Person re-identification across different surveillance cameras with disjoint
fields of view has become one of the most interesting and challenging subjects
in the area of intelligent video surveillance. Although several methods have
been developed and proposed, certain limitations and unresolved issues remain.
In all of the existing re-identification approaches, feature vectors are
extracted from segmented still images or video frames. Different similarity or
dissimilarity measures have been applied to these vectors. Some methods have
used simple constant metrics, whereas others have utilised models to obtain
optimised metrics. Some have created models based on local colour or texture
information, and others have built models based on the gait of people. In
general, the main objective of all these approaches is to achieve a
higher-accuracy rate and lowercomputational costs. This study summarises
several developments in recent literature and discusses the various available
methods used in person re-identification. Specifically, their advantages and
disadvantages are mentioned and compared.Comment: Published 201
BSUV-Net: a fully-convolutional neural network for background subtraction of unseen videos
Background subtraction is a basic task in computer vision and video processing often applied as a pre-processing step for object tracking, people recognition, etc. Recently, a number of successful background-subtraction algorithms have been proposed, however nearly all of the top-performing ones are supervised. Crucially, their success relies upon the availability of some annotated frames of the test video during training. Consequently, their performance on completely âunseenâ videos is undocumented in the literature. In this work, we propose a new, supervised, background subtraction algorithm for unseen videos (BSUV-Net) based on a fully-convolutional neural network. The input to our network consists of the current frame and two background frames captured at different time scales along with their semantic segmentation maps. In order to reduce the chance of overfitting, we also introduce a new data-augmentation technique which mitigates the impact of illumination difference between the background frames and the current frame. On the CDNet-2014 dataset, BSUV-Net outperforms stateof-the-art algorithms evaluated on unseen videos in terms of several metrics including F-measure, recall and precision.Accepted manuscrip
Deep Thermal Imaging: Proximate Material Type Recognition in the Wild through Deep Learning of Spatial Surface Temperature Patterns
We introduce Deep Thermal Imaging, a new approach for close-range automatic
recognition of materials to enhance the understanding of people and ubiquitous
technologies of their proximal environment. Our approach uses a low-cost mobile
thermal camera integrated into a smartphone to capture thermal textures. A deep
neural network classifies these textures into material types. This approach
works effectively without the need for ambient light sources or direct contact
with materials. Furthermore, the use of a deep learning network removes the
need to handcraft the set of features for different materials. We evaluated the
performance of the system by training it to recognise 32 material types in both
indoor and outdoor environments. Our approach produced recognition accuracies
above 98% in 14,860 images of 15 indoor materials and above 89% in 26,584
images of 17 outdoor materials. We conclude by discussing its potentials for
real-time use in HCI applications and future directions.Comment: Proceedings of the 2018 CHI Conference on Human Factors in Computing
System
A fully-convolutional neural network for background subtraction of unseen videos
Background subtraction is a basic task in computer vision
and video processing often applied as a pre-processing step
for object tracking, people recognition, etc. Recently, a number of successful background-subtraction algorithms have
been proposed, however nearly all of the top-performing
ones are supervised. Crucially, their success relies upon
the availability of some annotated frames of the test video
during training. Consequently, their performance on completely âunseenâ videos is undocumented in the literature.
In this work, we propose a new, supervised, backgroundsubtraction algorithm for unseen videos (BSUV-Net) based
on a fully-convolutional neural network. The input to our
network consists of the current frame and two background
frames captured at different time scales along with their semantic segmentation maps. In order to reduce the chance
of overfitting, we also introduce a new data-augmentation
technique which mitigates the impact of illumination difference between the background frames and the current frame.
On the CDNet-2014 dataset, BSUV-Net outperforms stateof-the-art algorithms evaluated on unseen videos in terms of
several metrics including F-measure, recall and precision.Accepted manuscrip
Articulated Clinician Detection Using 3D Pictorial Structures on RGB-D Data
Reliable human pose estimation (HPE) is essential to many clinical
applications, such as surgical workflow analysis, radiation safety monitoring
and human-robot cooperation. Proposed methods for the operating room (OR) rely
either on foreground estimation using a multi-camera system, which is a
challenge in real ORs due to color similarities and frequent illumination
changes, or on wearable sensors or markers, which are invasive and therefore
difficult to introduce in the room. Instead, we propose a novel approach based
on Pictorial Structures (PS) and on RGB-D data, which can be easily deployed in
real ORs. We extend the PS framework in two ways. First, we build robust and
discriminative part detectors using both color and depth images. We also
present a novel descriptor for depth images, called histogram of depth
differences (HDD). Second, we extend PS to 3D by proposing 3D pairwise
constraints and a new method that makes exact inference tractable. Our approach
is evaluated for pose estimation and clinician detection on a challenging RGB-D
dataset recorded in a busy operating room during live surgeries. We conduct
series of experiments to study the different part detectors in conjunction with
the various 2D or 3D pairwise constraints. Our comparisons demonstrate that 3D
PS with RGB-D part detectors significantly improves the results in a visually
challenging operating environment.Comment: The supplementary video is available at https://youtu.be/iabbGSqRSg
- âŠ