2,291 research outputs found

    Region-based Skin Color Detection.

    Skin color provides a powerful cue for complex computer vision applications. Although skin color detection has been an active research area for decades, the mainstream technology is based on individual pixels. This paper presents a new region-based technique for skin color detection which outperforms the current state-of-the-art pixel-based skin color detection method on the popular Compaq dataset (Jones and Rehg, 2002). A clustering technique based on color and spatial distance is used to extract regions, also known as superpixels, from the images. In the first step, our technique uses the state-of-the-art non-parametric pixel-based skin color classifier (Jones and Rehg, 2002), which we call the basic skin color classifier. The pixel-based skin color evidence is then aggregated to classify the superpixels. Finally, a Conditional Random Field (CRF) is applied to further improve the results. As the CRF operates over superpixels, the computational overhead is minimal. Our technique achieves a 91.17% true positive rate with a 13.12% false positive rate on the Compaq dataset, tested over approximately 14,000 web images.
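
    The aggregation step described in this abstract can be illustrated with a minimal sketch. It assumes per-pixel skin probabilities and a superpixel label map are already available; the mean aggregation rule, the 0.5 threshold, and the function name `classify_superpixels` are illustrative assumptions rather than the paper's method, and the CRF refinement is omitted.

```python
from collections import defaultdict

def classify_superpixels(pixel_probs, labels, threshold=0.5):
    """Average per-pixel skin probabilities inside each superpixel.

    pixel_probs and labels are same-shaped 2D lists; labels holds the
    superpixel id of each pixel. Mean aggregation and the 0.5 threshold
    are illustrative assumptions, not the rule used in the paper.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for prob_row, label_row in zip(pixel_probs, labels):
        for prob, sp in zip(prob_row, label_row):
            sums[sp] += prob
            counts[sp] += 1
    # A superpixel is labeled skin when its mean probability reaches the threshold.
    return {sp: sums[sp] / counts[sp] >= threshold for sp in sums}

# Hypothetical 2x3 image split into two superpixels (ids 0 and 1).
pixel_probs = [[0.9, 0.8, 0.1],
               [0.7, 0.2, 0.0]]
labels = [[0, 0, 1],
          [0, 1, 1]]
skin_by_superpixel = classify_superpixels(pixel_probs, labels)
```

    Classifying whole regions means a few noisy pixels inside a superpixel do not flip its label, which is one plausible reason a region-based method can improve on per-pixel decisions.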

    Robust pedestrian detection and tracking in crowded scenes

    In this paper, a robust computer vision approach to detecting and tracking pedestrians in unconstrained crowded scenes is presented. Pedestrian detection is performed via a 3D clustering process within a region-growing framework. The clustering process avoids hard thresholds by using biometrically inspired constraints and a number of plan view statistics. Pedestrian tracking is achieved by formulating the track matching process as a weighted bipartite graph and using a Weighted Maximum Cardinality Matching scheme. The approach is evaluated using both indoor and outdoor sequences, captured using a variety of camera placements and orientations, that feature significant challenges in terms of the number of pedestrians present, their interactions, and scene lighting conditions. The evaluation is performed against a manually generated ground truth for all sequences. Results point to the extremely accurate performance of the proposed approach in all cases.
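
    The track matching idea can be sketched as follows: among all matchings between existing tracks and new detections, prefer the one that pairs the most tracks, and break ties by total edge weight. The abstract does not say how the edge weights are computed or which solver is used, so the integer scores, the names, and the exponential brute-force search below are assumptions for illustration only.

```python
def best_matching(edges):
    """Return (size, total_weight, pairs) of a maximum-cardinality matching
    with the largest total weight among matchings of that size.

    edges maps (track, detection) -> weight and lists only admissible pairs.
    Brute-force recursion: fine for a handful of tracks, not for real scenes.
    """
    tracks = sorted({t for t, _ in edges})

    def search(i, used):
        if i == len(tracks):
            return (0, 0, [])
        # Option 1: leave track i unmatched.
        best = search(i + 1, used)
        # Option 2: match track i to any free, admissible detection.
        for (t, d), w in edges.items():
            if t == tracks[i] and d not in used:
                size, total, pairs = search(i + 1, used | {d})
                cand = (size + 1, total + w, [(t, d)] + pairs)
                if cand[:2] > best[:2]:  # compare cardinality first, then weight
                    best = cand
        return best

    return search(0, frozenset())

# Hypothetical similarity scores between two tracks and two detections.
edges = {("T1", "D1"): 9, ("T1", "D2"): 6, ("T2", "D1"): 8}
size, total, pairs = best_matching(edges)
```

    In this toy case a greedy matcher would take the single best edge (T1, D1) and leave T2 unmatched, whereas the cardinality-first rule pairs both tracks.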

    Face detection in profile views using fast discrete curvelet transform (FDCT) and support vector machine (SVM)

    Human face detection is an indispensable component in face processing applications, including automatic face recognition, security surveillance, facial expression recognition, and the like. This paper presents a profile face detection algorithm based on curvelet features, as the curvelet transform offers good directional representation and can capture edge information in the human face from different angles. First, a simple skin color segmentation scheme based on the HSV (Hue - Saturation - Value) and YCgCr (luminance - green chrominance - red chrominance) color models is used to extract skin blocks. The segmentation scheme utilizes only the S and CgCr components, and is therefore luminance independent. Features extracted from three frequency bands of the curvelet decomposition are used to detect faces in each block. A support vector machine (SVM) classifier is trained for the classification task. In the performance test, the results showed that the proposed algorithm can detect profile faces in color images with a good detection rate and a low misdetection rate.

    Skin Colour Detection Based On An Adaptive Multi-Thresholding Technique

    Human region detection in complex scenes has received great attention owing to the wide use of websites and considerable progress in still and video image processing. Skin detection or segmentation is a popular and useful technique for detecting and tracking human body parts, especially faces and hands. It is employed in tasks such as face or hand detection and tracking, filtering of objectionable web images, and people retrieval in databases and on the Internet. This thesis aims to build a skin detection system that discriminates between skin and non-skin pixels in still colour images. This is done by introducing a metric that measures the distance of a pixel's colour to skin tone. The need for a compact skin model representation motivates the development of parametric skin distribution models, which are used in this research. An adaptive skin colour detection model is proposed in this thesis. The model is based on the bivariate normal distribution of the skin chromatic subspace. It uses the 2D single Gaussian model (SGM) and the 2D Gaussian mixture model (GMM) to represent the skin colour distribution. The model is also based on image segmentation using an automatic and adaptive multi-thresholding technique. This thesis shows that neither the Gaussian mixture model alone nor the single Gaussian model alone improves the performance of the skin detection model, because of the number of false detections at high correct classification rates. For this reason, a combination of SGM and GMM in the same model is proposed. The results show that, when processing images of different people taken under different imaging conditions, a single threshold value is not suitable; because the proposed method adaptively adjusts its threshold values and effectively separates skin colour regions from non-skin ones, it is applicable to images taken under various conditions.
    The experiments show that the suggested algorithm achieves a noticeable performance improvement and offers a robust solution for skin detection under varying illumination. The average true positive rate for the test images is 94.064%, while the average false positive rate is 13.166%.
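
    A distance-to-skin-tone metric under a single bivariate Gaussian can be sketched as a squared Mahalanobis distance in a 2D chromatic space. The sample chromaticity values, the fixed threshold of 9.0, and the function names below are hypothetical; the thesis's adaptive multi-thresholding and its SGM/GMM combination are not reproduced here.

```python
def fit_gaussian_2d(samples):
    """Estimate the mean and covariance of 2D chromaticity samples."""
    n = len(samples)
    mx = sum(x for x, _ in samples) / n
    my = sum(y for _, y in samples) / n
    sxx = sum((x - mx) ** 2 for x, _ in samples) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in samples) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in samples) / (n - 1)
    return (mx, my), (sxx, sxy, syy)

def mahalanobis_sq(point, mean, cov):
    """Squared Mahalanobis distance using the closed-form 2x2 inverse."""
    sxx, sxy, syy = cov
    det = sxx * syy - sxy * sxy
    dx, dy = point[0] - mean[0], point[1] - mean[1]
    return (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det

def is_skin(point, mean, cov, threshold=9.0):
    # Fixed threshold for illustration only; the thesis adapts its thresholds.
    return mahalanobis_sq(point, mean, cov) <= threshold

# Hypothetical skin samples in a normalized two-channel chromatic space.
skin_samples = [(0.40, 0.30), (0.42, 0.31), (0.44, 0.30),
                (0.41, 0.33), (0.43, 0.32)]
mean, cov = fit_gaussian_2d(skin_samples)
```

    A call such as `is_skin((0.42, 0.31), mean, cov)` then tests whether a pixel's chromaticity lies within the chosen distance of the fitted skin distribution.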

    Audio‐Visual Speaker Tracking

    Target motion tracking has found application in many interdisciplinary fields, including but not limited to surveillance and security, forensic science, intelligent transportation systems, driving assistance, monitoring of prohibited areas, medical science, robotics, action and expression recognition, and, within computer vision and signal processing, individual speaker discrimination in multi‐speaker environments and video conferencing. Among these applications, speaker tracking in enclosed spaces has been gaining relevance due to the widespread advances in devices and technologies and the need for seamless solutions for real‐time tracking and localization of speakers. However, speaker tracking is a challenging task in real‐life scenarios, as several distinctive issues influence the tracking process, such as occlusions and an unknown number of speakers. One approach to overcoming these issues is to use multi‐modal information, as it conveys complementary information about the state of the speakers compared to single‐modal tracking. To use multi‐modal information, several approaches have been proposed, which can be classified into two categories, namely deterministic and stochastic. This chapter aims to provide multimedia researchers with a state‐of‐the‐art overview of tracking methods that combine multiple modalities to accomplish various multimedia analysis tasks, classifying them into different categories and listing new and future trends in this field.

    Adaptive Real-Time Image Processing for Human-Computer Interaction

    A vision-based approach for human hand tracking and gesture recognition.

    Hand gesture interfaces have become an active topic in human-computer interaction (HCI). Using hand gestures in a human-computer interface enables human operators to interact with computer environments in a natural and intuitive manner. In particular, bare-hand interpretation frees users from the cumbersome but typically required devices for communicating with computers, thus offering ease and naturalness in HCI. Meanwhile, virtual assembly (VA) applies virtual reality (VR) techniques to mechanical assembly. It provides computer tools that help product engineers plan, evaluate, optimize, and verify the assembly of mechanical systems without the need for physical objects. However, traditional devices such as keyboards and mice are no longer adequate because they handle three-dimensional (3D) tasks inefficiently. Special VR devices, such as data gloves, have been mandatory in VA. This thesis proposes a novel gesture-based interface for VA. It develops a hybrid approach that combines an appearance-based hand localization technique with a skin tone filter to support gesture recognition and hand tracking in 3D space. With this interface, bare hands become a convenient substitute for special VR devices. Experimental results demonstrate the flexibility and robustness that the proposed method brings to HCI. Dept. of Computer Science. Paper copy at Leddy Library: Theses & Major Papers - Basement, West Bldg. / Call Number: Thesis2004 .L8. Source: Masters Abstracts International, Volume: 43-03, page: 0883. Adviser: Xiaobu Yuan. Thesis (M.Sc.)--University of Windsor (Canada), 2004.

    Using Local Context To Improve Face Detection

    Most face detection algorithms locate faces by classifying the content of a detection window iterating over all positions and scales of the input image. Recent developments have accelerated this process up to real-time performance at high levels of accuracy. However, even the best of today's computational systems are far from being able to compete with the detection capabilities of the human visual system. Psychophysical experiments have shown the importance of local context in the face detection process. In this paper we investigate the role of local context for face detection algorithms. In experiments on two large data sets we find that using local context can significantly increase the number of correct detections, particularly in low resolution cases, uncommon poses or individual appearances as well as occlusions

    Tracking Skin-Colored Objects in Real-Time

    We present a methodology for tracking multiple skin-colored objects in a monocular image sequence. The proposed approach encompasses a collection of techniques that allow the modeling, detection and temporal association of skin-colored objects across image sequences. A non-parametric model of skin color is employed. Skin-colored objects are detected with a Bayesian classifier that is bootstrapped with a small set of training data and refined through an off-line iterative training procedure. By using on-line adaptation of skin-color probabilities the classifier is able to cope with considerable illumination changes. Tracking over time is achieved by a novel technique that can handle multiple objects simultaneously. Tracked objects may move in complex trajectories, occlude each other in the field of view of a possibly moving camera and vary in number over time. A prototype implementation of the developed system operates on 320x240 live video in real time (28Hz), running on a conventional Pentium IV processor. Representative experimental results from the application of this prototype to image sequences are also presented.
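
    A non-parametric Bayesian skin classifier of the kind mentioned in this abstract can be sketched with color histograms. When the class priors are estimated from the training pixel counts, Bayes' rule for a quantized color c reduces to P(skin | c) = s / (s + n), where s and n are the numbers of skin and non-skin training pixels that fall in the bin of c. The RGB quantization into 32 bins per channel, the training pixels, and the function names are assumptions for illustration; the paper's iterative training and on-line adaptation are not shown.

```python
from collections import Counter

def quantize(rgb, bins=32):
    """Map an 8-bit RGB triple to a coarse histogram bin."""
    step = 256 // bins
    return tuple(channel // step for channel in rgb)

def train(skin_pixels, nonskin_pixels, bins=32):
    """Build one color histogram per class from labeled training pixels."""
    skin_hist = Counter(quantize(p, bins) for p in skin_pixels)
    nonskin_hist = Counter(quantize(p, bins) for p in nonskin_pixels)
    return skin_hist, nonskin_hist

def skin_posterior(rgb, skin_hist, nonskin_hist, bins=32):
    """P(skin | color) with priors taken from the training counts."""
    c = quantize(rgb, bins)
    s, n = skin_hist[c], nonskin_hist[c]
    # Colors never seen in training get probability 0.0 (an arbitrary choice).
    return s / (s + n) if s + n else 0.0

# Hypothetical labeled training pixels.
skin_pixels = [(200, 150, 130), (201, 151, 131), (60, 60, 60)]
nonskin_pixels = [(60, 62, 61), (58, 60, 63), (61, 59, 60), (20, 200, 30)]
skin_hist, nonskin_hist = train(skin_pixels, nonskin_pixels)
```

    Thresholding `skin_posterior` per pixel yields a skin mask; an adaptive variant would update the histograms over time, which this sketch leaves out.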

    Vision-Based Production of Personalized Video

    In this paper we present a novel vision-based system for the automated production of personalized video souvenirs for visitors to leisure and cultural heritage venues. Visitors are visually identified and tracked through a camera network. The system produces a personalized DVD souvenir at the end of a visitor’s stay, allowing visitors to relive their experiences. We analyze how we identify visitors by fusing facial and body features, how we track visitors, how the tracker recovers from failures due to occlusions, as well as how we annotate and compile the final product. Our experiments demonstrate the feasibility of the proposed approach.