Region-based Skin Color Detection.
Skin color provides a powerful cue for complex computer vision applications. Although skin color detection
has been an active research area for decades, the mainstream technology is based on individual pixels.
This paper presents a new region-based technique for skin color detection which outperforms the current
state-of-the-art pixel-based skin color detection method on the popular Compaq dataset (Jones and Rehg,
2002). A clustering technique based on color and spatial distance is used to extract regions, also known as
superpixels, from the images. In the first step, our technique uses the state-of-the-art non-parametric pixel-based
skin color classifier (Jones and Rehg, 2002) which we call the basic skin color classifier. The pixel-based skin
color evidence is then aggregated to classify the superpixels. Finally, the Conditional Random Field (CRF)
is applied to further improve the results. As CRF operates over superpixels, the computational overhead is
minimal. Our technique achieves a 91.17% true positive rate with a 13.12% false positive rate on the Compaq
dataset, tested over approximately 14,000 web images.
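The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes that per-pixel skin probabilities and a superpixel label map are already available, and it uses a plain per-superpixel mean followed by a threshold; the abstract does not say how the evidence is aggregated, so the mean, the threshold value, and the function name classify_superpixels are illustrative assumptions. The CRF refinement is not shown.

```python
import numpy as np

def classify_superpixels(pixel_skin_prob, superpixel_labels, threshold=0.5):
    """Label each superpixel as skin or non-skin from per-pixel evidence.

    pixel_skin_prob:   2-D float array, skin probability per pixel
                       (e.g. from a pixel-based color classifier).
    superpixel_labels: 2-D int array of the same shape, with labels 0..K-1.
    threshold:         assumed cut-off on the mean probability (hypothetical).
    """
    labels = superpixel_labels.ravel()
    probs = pixel_skin_prob.ravel()
    n_regions = int(labels.max()) + 1

    # Sum of probabilities and pixel count for every superpixel.
    sums = np.bincount(labels, weights=probs, minlength=n_regions)
    counts = np.bincount(labels, minlength=n_regions)

    # Mean skin probability per superpixel (guard against empty labels).
    mean_prob = sums / np.maximum(counts, 1)

    # Broadcast the per-superpixel decision back to a per-pixel mask.
    is_skin = mean_prob >= threshold
    return is_skin[superpixel_labels], mean_prob
```

Because the decision is made once per superpixel rather than once per pixel, any later refinement stage only has to reason over K regions instead of every pixel, which is consistent with the abstract's remark about low overhead.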
Constrained Deep Transfer Feature Learning and its Applications
Feature learning with deep models has achieved impressive results for both
data representation and classification for various vision tasks. Deep feature
learning, however, typically requires a large amount of training data, which
may not be feasible for some application domains. Transfer learning is one
approach to alleviating this problem by transferring data from a data-rich
source domain to a data-scarce target domain. Existing transfer learning methods
typically perform one-shot transfer learning and often ignore the specific
properties that the transferred data must satisfy. To address these issues, we
introduce a constrained deep transfer feature learning method to perform
simultaneous transfer learning and feature learning by performing transfer
learning in a progressively improving feature space iteratively in order to
better narrow the gap between the target domain and the source domain for
effective transfer of the data from the source domain to the target domain.
Furthermore, we propose to exploit the target domain knowledge and incorporate
such prior knowledge as a constraint during transfer learning to ensure that
the transferred data satisfies certain properties of the target domain. To
demonstrate the effectiveness of the proposed constrained deep transfer feature
learning method, we apply it to thermal feature learning for eye detection by
transferring from the visible domain. We also apply the proposed method to
cross-view facial expression recognition as a second application. The
experimental results demonstrate the effectiveness of the proposed method for
both applications.
Comment: International Conference on Computer Vision and Pattern Recognition, 201
A Robust and Artifact Resistant Algorithm of Ultrawideband Imaging System for Breast Cancer Detection.
Goal: Ultrawideband radar imaging is regarded as one of the most promising alternatives for breast cancer detection. A range of algorithms reported in the literature show satisfactory tumor detection capabilities. However, most algorithms deteriorate significantly or even fail when the early-stage artifact, including incident signals and skin-fat interface reflections, cannot be perfectly removed from the received signals. Furthermore, fibro-glandular tissue poses another challenge for tumor detection, due to the small dielectric contrast between glandular and cancerous tissues. Methods: This paper introduces a novel Robust and Artifact Resistant (RAR) algorithm, in which a neighborhood pairwise correlation-based weighting is designed to overcome the adverse effects of both artifact and glandular tissues. In RAR, backscattered signals are time-shifted, summed, and weighted by the maximum combination of the neighboring pairwise correlation coefficients between shifted signals, forming the intensity of each point within an imaging area. Results: The effectiveness was investigated using 3-D anatomically and dielectrically accurate finite-difference-time-domain numerical breast models. The use of neighborhood pairwise correlation provided robustness against artifact and enabled the detection of multiple scatterers. RAR is compared with four well-known algorithms: delay-and-sum, delay-multiply-and-sum, modified-weighted-delay-and-sum, and filtered-delay-and-sum. Conclusion: It is shown that RAR exhibits improved identification capability, robust artifact resistance, and high detectability over its counterparts in most scenarios considered, while maintaining computational efficiency. Simulated tumors in both homogeneous and heterogeneous breast phantoms, from mildly to moderately dense, were successfully identified and localized when RAR was combined with an entropy-based artifact removal algorithm.
Significance: These results show the strong potential of RAR for breast cancer screening.
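The time-shift, sum, and weight structure described in the Methods can be sketched for a single focal point as follows. This is a simplified illustration, not the RAR algorithm itself: the abstract does not define the "maximum combination" of coefficients, so the weight here is simply the largest normalized correlation between adjacent channels, clipped at zero. The function name weighted_delay_and_sum, the integer sample delays, and the fixed window length are all assumptions.

```python
import numpy as np

def weighted_delay_and_sum(signals, delays, window):
    """Intensity of one focal point from multi-channel backscattered signals.

    signals: array of shape (n_channels, n_samples), with n_channels >= 2.
    delays:  integer sample delay per channel for this focal point;
             each delay + window must not exceed n_samples.
    window:  number of samples kept after time-shifting (assumed parameter).
    """
    signals = np.asarray(signals, dtype=float)

    # Time-shift: cut each channel so the focal-point echo starts at index 0.
    aligned = np.stack([signals[ch, d:d + window] for ch, d in enumerate(delays)])

    # Sum the aligned channels (classic delay-and-sum).
    summed = aligned.sum(axis=0)

    # Normalized correlation between each pair of neighboring channels.
    coeffs = []
    for a, b in zip(aligned[:-1], aligned[1:]):
        denom = np.linalg.norm(a) * np.linalg.norm(b)
        coeffs.append(float(a @ b) / denom if denom > 0 else 0.0)

    # Stand-in weight: best neighboring-pair agreement, clipped at zero.
    # (Assumption; the paper's actual combination rule is not given here.)
    weight = max(0.0, max(coeffs))

    # Weighted energy of the summed signal is the point intensity.
    return weight * float(np.sum(summed ** 2))
```

When the delays match a real scatterer, neighboring channels line up and the weight approaches 1; when they do not, the correlation drops and the point is suppressed, which is the intuition behind using correlation as a weight against residual artifact.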
Analysis of Hand Segmentation in the Wild
A large number of works in egocentric vision have concentrated on action and
object recognition. Detection and segmentation of hands in first-person videos,
however, have been less explored. For many applications in this domain, it is
necessary to accurately segment not only the hands of the camera wearer but also
the hands of others with whom the wearer is interacting. Here, we take an in-depth look
at the hand segmentation problem. In the quest for robust hand segmentation
methods, we evaluate the performance of state-of-the-art semantic
segmentation methods, off-the-shelf and fine-tuned, on existing datasets. We
fine-tune RefineNet, a leading semantic segmentation method, for hand
segmentation and find that it does much better than the best contenders.
Existing hand segmentation datasets are collected in laboratory settings.
To overcome this limitation, we contribute two new datasets: a)
EgoYouTubeHands, comprising egocentric videos containing hands in the wild, and
b) HandOverFace, to analyze the performance of our models in the presence of similar
appearance occlusions. We further explore whether conditional random fields can
help refine generated hand segmentations. To demonstrate the benefit of
accurate hand maps, we train a CNN for hand-based activity recognition and
achieve higher accuracy when the CNN is trained using hand maps produced by the
fine-tuned RefineNet. Finally, we annotate a subset of the EgoHands dataset for
fine-grained action recognition and show that an accuracy of 58.6% can be
achieved by just looking at a single hand pose, which is much better than the
chance level (12.5%).
Comment: Accepted at CVPR 201
Data association and occlusion handling for vision-based people tracking by mobile robots
This paper presents an approach for tracking multiple persons on a mobile robot with a combination of colour and thermal vision sensors, using several new techniques. First, an adaptive colour model is incorporated into the measurement model of the tracker. Second, a new approach for detecting occlusions is introduced, using a machine learning classifier for pairwise comparison of persons (classifying which one is in front of the other). Third, explicit occlusion handling is incorporated into the tracker. The paper presents a comprehensive, quantitative evaluation of the whole system and its different components using several real-world data sets.