
    A field model for human detection and tracking


    3D hand tracking

    The hand is often considered one of the most natural and intuitive interaction modalities for human-to-human interaction. In human-computer interaction (HCI), proper 3D hand tracking is the first step in developing a more intuitive HCI system that can be used in applications such as gesture recognition, virtual object manipulation and gaming. However, accurate 3D hand tracking remains a challenging problem due to the hand's deformation, appearance similarity, high inter-finger occlusion and complex articulated motion. Further, 3D hand tracking is also interesting from a theoretical point of view, as it deals with three major areas of computer vision: segmentation (of the hand), detection (of hand parts), and tracking (of the hand). This thesis proposes a region-based skin color detection technique and model-based and appearance-based 3D hand tracking techniques to bring human-computer interaction applications one step closer. All techniques are briefly described below.

    Skin color provides a powerful cue for complex computer vision applications. Although skin color detection has been an active research area for decades, the mainstream technology is based on individual pixels. This thesis presents a new region-based technique for skin color detection that outperforms the current state-of-the-art pixel-based technique on the popular Compaq dataset (Jones & Rehg 2002). The proposed technique achieves a 91.17% true positive rate with a 13.12% false positive rate on the Compaq dataset, tested over approximately 14,000 web images.

    Hand tracking is not a trivial task, as it requires tracking the 27 degrees of freedom of the hand. Hand deformation, self-occlusion, appearance similarity and irregular motion are major problems that make 3D hand tracking very challenging. This thesis proposes a model-based 3D hand tracking technique, which is improved by the proposed depth-foreground-background feature, a palm deformation module and a context cue. However, the major problem with model-based techniques is that they are computationally expensive. This can be overcome by the discriminative techniques described below.

    Discriminative techniques (for example, random forests) are good for hand part detection, but they fail under sensor noise and high inter-finger occlusion, and they have difficulty modelling kinematic or temporal constraints. Although model-based descriptive (for example, Markov Random Field) or generative (for example, Hidden Markov Model) techniques utilize kinematic and temporal constraints well, they are computationally expensive and hardly recover from tracking failure. This thesis presents a unified framework for 3D hand tracking, using the best of both methodologies, which outperforms the current state-of-the-art 3D hand tracking techniques. The proposed 3D hand tracking techniques in this thesis can be used to extract accurate hand movement features and enable complex human-machine interaction such as gaming and virtual object manipulation.
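
    As an illustration of the region-based idea, the following minimal Python sketch builds a region-level skin classifier on top of pixel-based skin/non-skin color histograms in the spirit of Jones & Rehg (2002): per-pixel posteriors are averaged over precomputed regions (e.g. superpixels) before thresholding. The histogram layout, region labelling and threshold are illustrative assumptions, not the thesis's actual implementation.

        import numpy as np

        def pixel_skin_posterior(image_rgb, skin_hist, nonskin_hist, prior=0.5):
            # Per-pixel skin posterior from 32x32x32 RGB histograms
            # (pixel-based baseline in the spirit of Jones & Rehg 2002).
            idx = (image_rgb // 8).astype(int)            # quantize 0..255 -> 0..31
            p_s = skin_hist[idx[..., 0], idx[..., 1], idx[..., 2]]
            p_n = nonskin_hist[idx[..., 0], idx[..., 1], idx[..., 2]]
            num = p_s * prior
            return num / (num + p_n * (1.0 - prior) + 1e-12)

        def region_skin_mask(prob_map, region_labels, threshold=0.5):
            # Region-based step: average the posterior over each region
            # (e.g. a superpixel) and label the whole region at once.
            mask = np.zeros_like(prob_map, dtype=bool)
            for r in np.unique(region_labels):
                sel = region_labels == r
                if prob_map[sel].mean() > threshold:
                    mask[sel] = True
            return mask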

    Online video streaming for human tracking based on weighted resampling particle filter

    This paper proposes a weighted resampling method for a particle filter applied to human tracking with an active camera. The proposed system consists of three major parts: human detection, human tracking, and camera control. A codebook matching algorithm extracts the human region in the detection stage, and the particle filter estimates the position of the human in every input image. The proposed system selects the particles with highly weighted values during resampling, because they provide more accurate tracking. Moreover, a proportional-integral-derivative (PID) controller steers the active camera by minimizing the difference between the center of the image and the position of the object obtained from the particle filter. The system converts this position difference into a pan-tilt speed to drive the active camera and keep the human in the camera's field of view (FOV). Because the image intensity changes over time during tracking, the proposed system uses a Gaussian mixture model (GMM) to update the human feature model. The temporal occlusion problem is solved by feature similarity and the resampled particles, and because the particle filter estimates the position of the human in every input frame, the active camera is driven smoothly. The robustness and accuracy of the proposed system's tracking can be seen in the experimental results.
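
    As a rough sketch of the two loops described above, the Python fragment below resamples only from the highest-weighted particles and feeds the pixel error between the image center and the tracked position into a PID controller that outputs a pan/tilt speed. The keep_frac parameter and the PID gains are invented for illustration; the paper's exact resampling rule and controller tuning may differ.

        import numpy as np

        def weighted_resampling(particles, weights, keep_frac=0.5):
            # Draw the new particle set only from the strongest fraction of
            # particles, one reading of the paper's weighted resampling idea.
            n = len(particles)
            k = max(1, int(keep_frac * n))
            top = np.argsort(weights)[-k:]                 # highest-weight indices
            p = weights[top] / weights[top].sum()
            picks = np.random.choice(top, size=n, p=p)
            return particles[picks].copy()

        class PID:
            # Minimal PID controller turning pixel error into pan/tilt speed.
            def __init__(self, kp=0.6, ki=0.01, kd=0.1):
                self.kp, self.ki, self.kd = kp, ki, kd
                self.integral = 0.0
                self.prev_error = 0.0

            def step(self, error, dt=1.0 / 30):
                self.integral += error * dt
                derivative = (error - self.prev_error) / dt
                self.prev_error = error
                return self.kp * error + self.ki * self.integral + self.kd * derivative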

    A pupil-positioning method based on the starburst model

    Human eye detection has become an area of interest in the field of computer vision, with an extensive range of applications in human-computer interaction, disease diagnosis, and psychological and physiological studies. Gaze-tracking systems are an important research topic in the human-computer interaction field. As one of the core modules of a head-mounted gaze-tracking system, pupil positioning affects the accuracy and stability of the whole system. To better locate the center of the pupil while tracking eye movements, this paper proposes a pupil-positioning method based on the starburst model. The method uses vertical and horizontal integral projections within the rectangular region of the human eye for accurate positioning and applies a linear interpolation method, based on a circular model, to the reflections in the human eye. We detect feature points on the pupil edge using the starburst model, cluster the feature points, and use the RANdom SAmple Consensus (RANSAC) algorithm to fit an ellipse to the pupil edge and accurately locate the pupil center. Our experimental results show that the algorithm has higher precision, higher efficiency and greater robustness than other algorithms, and retains excellent accuracy even when the image of the pupil is incomplete.

    Funding: Science and Technology Support Plan Project of Hebei Province (grant numbers 17210803D and 19273703D); Science and Technology Spark Project of the Hebei Seismological Bureau (grant number DZ20180402056); Education Department of Hebei Province (grant number QN2018095); Polytechnic College of Hebei University of Science and Technology.
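
    A compact Python sketch of the two core steps, casting rays from a seed point inside the pupil and fitting an ellipse to the resulting edge points with RANSAC, could look as follows. The ray count, gradient threshold, inlier test and iteration budget are illustrative guesses, not the paper's parameters.

        import numpy as np
        import cv2

        def starburst_edge_points(gray, seed, n_rays=36, max_len=80, grad_thresh=20):
            # Cast rays outward from a seed point (x, y) inside the dark pupil
            # and keep the first strong intensity rise on each ray as an edge point.
            pts = []
            for theta in np.linspace(0.0, 2 * np.pi, n_rays, endpoint=False):
                d = np.array([np.cos(theta), np.sin(theta)])
                prev = float(gray[seed[1], seed[0]])
                for r in range(2, max_len):
                    x, y = (seed + r * d).astype(int)
                    if not (0 <= x < gray.shape[1] and 0 <= y < gray.shape[0]):
                        break
                    cur = float(gray[y, x])
                    if cur - prev > grad_thresh:           # pupil -> iris transition
                        pts.append((x, y))
                        break
                    prev = cur
            return np.array(pts, dtype=np.float32)

        def ransac_ellipse(points, iters=200, tol=2.0):
            # Fit an ellipse robustly: sample 5 points, fit, count inliers by an
            # approximate pixel distance to the ellipse, and keep the best fit.
            best, best_score = None, 0
            for _ in range(iters):
                sample = points[np.random.choice(len(points), 5, replace=False)]
                try:
                    (cx, cy), (w, h), ang = cv2.fitEllipse(sample)
                except cv2.error:
                    continue                               # degenerate sample
                a, b = w / 2.0, h / 2.0
                if a < 1.0 or b < 1.0:
                    continue
                c, s = np.cos(np.radians(ang)), np.sin(np.radians(ang))
                dx, dy = points[:, 0] - cx, points[:, 1] - cy
                xp, yp = c * dx + s * dy, -s * dx + c * dy
                rad = np.sqrt((xp / a) ** 2 + (yp / b) ** 2)
                inliers = np.abs(rad - 1.0) * min(a, b) < tol
                if inliers.sum() > best_score and inliers.sum() >= 5:
                    best_score = inliers.sum()
                    best = cv2.fitEllipse(points[inliers]) # refit on all inliers
            return best                                    # ((cx, cy), (2a, 2b), angle)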

    Sensor-Based Safety Performance Assessment of Individual Construction Workers

    Over the last decade, researchers have explored various technologies and methodologies to enhance worker safety at construction sites. The use of advanced sensing technologies has mainly focused on detecting and warning about safety issues by relying directly on the detection capabilities of these technologies. Until now, very little research has explored methods to quantitatively assess individual workers' safety performance. To this end, this study uses a tracking system to collect individuals' location data and applies it within the proposed safety framework. A computational and analytical procedure/model was developed to quantify the safety performance of individual workers beyond detection and warning. The framework defines parameters for zone-based safety risks and establishes a zone-based safety risk model to quantify potential risks to workers. To demonstrate the safety analysis model, the study conducted field tests at different construction sites using various interaction scenarios. Probabilistic evaluation showed slight underestimation and overestimation in certain cases; however, the model represented the overall safety performance of a subject quite well. Test results showed clear evidence of the model's ability to capture the safety conditions of workers in pre-identified hazard zones. The developed approach provides visualized and quantified information in the form of a safety index, which has not previously been available in the industry. In addition, such an automated method may offer a safety monitoring alternative to human deployment, which is expensive, error-prone, and time-consuming.
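
    The zone-based risk idea admits a very small worked example. The Python sketch below accumulates risk-weighted exposure time along a worker's (x, y) location track inside circular hazard zones and maps it to a 0-1 safety index. The zone geometry, risk weights and sampling interval are invented for illustration; the paper's probabilistic risk model is richer than this.

        import numpy as np

        # Hypothetical hazard zones: centre (m), radius (m), risk weight in [0, 1].
        HAZARD_ZONES = [
            {"center": (12.0, 30.5), "radius": 5.0, "risk": 0.9},
            {"center": (40.0, 8.0), "radius": 3.0, "risk": 0.5},
        ]

        def safety_index(track, zones=HAZARD_ZONES, dt=1.0):
            # Accumulate risk-weighted time spent inside hazard zones along a
            # worker's track of (x, y) samples taken every dt seconds, then
            # map exposure to a 0..1 index (1 = no hazardous exposure).
            exposure = 0.0
            for x, y in track:
                for z in zones:
                    cx, cy = z["center"]
                    if np.hypot(x - cx, y - cy) <= z["radius"]:
                        exposure += z["risk"] * dt
            return 1.0 - min(1.0, exposure / (len(track) * dt))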

    Handling Heavy Occlusion in Dense Crowd Tracking by Focusing on the Heads

    With the rapid development of deep learning, object detection and tracking play a vital role in today's society. Identifying and tracking all the pedestrians in a dense crowd with computer vision approaches is a typical challenge in this field, known as the Multiple Object Tracking (MOT) challenge. Modern trackers are required to operate on ever more complicated scenes; according to the MOT20 challenge results, pedestrians are 4 times denser than in the MOT17 challenge. Hence, the aim of this work is to improve the ability to detect and track in extremely crowded scenes. Because the human body is frequently occluded in such scenes, the head is usually easier to identify. In this work, we have designed a joint head and body detector in an anchor-free style to boost the detection recall and precision for pedestrians of both small and medium size. Innovatively, our model does not require a statistical head-body ratio for common pedestrian detection during training; instead, the proposed model learns the ratio dynamically. To verify the effectiveness of the proposed model, we evaluate it with extensive experiments on different datasets, including the MOT20, CrowdHuman, and HT21 datasets. As a result, our proposed method significantly improves both the recall and precision on small- and medium-sized pedestrians and achieves state-of-the-art results on these challenging datasets. Comment: Accepted at AJCAI 202
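
    One way such an anchor-free joint detector can be organized is sketched below in PyTorch: every feature-map location predicts a centre heatmap together with two independent box regressions, one for the body and one for the head, so the head-body size ratio is never hard-coded and is instead learned from data. The channel widths and branch layout are assumptions for illustration, not the paper's architecture.

        import torch
        import torch.nn as nn

        class JointHeadBodyHead(nn.Module):
            # Anchor-free detection head: per-location centre heatmap plus
            # separate (l, t, r, b) box regressions for body and head.
            def __init__(self, in_ch=256):
                super().__init__()
                def branch(out_ch):
                    return nn.Sequential(
                        nn.Conv2d(in_ch, in_ch, 3, padding=1),
                        nn.ReLU(inplace=True),
                        nn.Conv2d(in_ch, out_ch, 1),
                    )
                self.center = branch(1)   # objectness / centre heatmap
                self.body = branch(4)     # distances to body box edges
                self.head = branch(4)     # distances to head box edges

            def forward(self, feat):
                return (
                    torch.sigmoid(self.center(feat)),
                    torch.relu(self.body(feat)),   # non-negative edge distances
                    torch.relu(self.head(feat)),
                )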

    Robust and real-time hand detection and tracking in monocular video

    In recent years, personal computing devices such as laptops, tablets and smartphones have become ubiquitous. Moreover, intelligent sensors are being integrated into many consumer devices such as eyeglasses, wristwatches and smart televisions. With the advent of touchscreen technology, a new human-computer interaction (HCI) paradigm arose that allows users to interface with their devices in an intuitive manner. Using simple gestures, such as swipe or pinch movements, a touchscreen can be used to interact directly with a virtual environment. Nevertheless, touchscreens still form a physical barrier between the virtual interface and the real world. An increasingly popular field of research that tries to overcome this limitation is video-based gesture recognition, hand detection and hand tracking. Gesture-based interaction allows users to interact directly with the computer in a natural manner, exploring a virtual reality using nothing but their own body language.

    In this dissertation, we investigate how robust hand detection and tracking can be accomplished under real-time constraints. In the context of human-computer interaction, real-time is defined as both low latency and low complexity, such that a complete video frame can be processed before the next one becomes available. Furthermore, for practical applications, the algorithms should be robust to illumination changes, camera motion, and cluttered backgrounds in the scene. Finally, the system should be able to initialize automatically, and to detect and recover from tracking failure. We study a wide variety of existing algorithms, and propose significant improvements and novel methods to build a complete detection and tracking system that meets these requirements.

    Hand detection, hand tracking and hand segmentation are related yet technically different challenges. Whereas detection deals with finding an object in a static image, tracking considers temporal information and is used to follow the position of an object over time, throughout a video sequence. Hand segmentation is the task of estimating the hand contour, thereby separating the object from its background. Detection of hands in individual video frames allows us to automatically initialize our tracking algorithm, and to detect and recover from tracking failure. Human hands are highly articulated objects, consisting of finger parts connected by joints. As a result, the appearance of a hand can vary greatly depending on the assumed hand pose. Traditional detection algorithms often assume that the appearance of the object of interest can be described using a rigid model and therefore cannot be used to robustly detect human hands. We therefore developed an algorithm that detects hands by exploiting their articulated nature. Instead of resorting to a template-based approach, we probabilistically model the spatial relations between the different hand parts and the centroid of the hand. Detecting hand parts, such as fingertips, is much easier than detecting a complete hand. Based on our model of the spatial configuration of hand parts, the detected parts can be used to obtain an estimate of the complete hand's position. To comply with the real-time constraints, we developed techniques to speed up the process by efficiently discarding unimportant information in the image. Experimental results show that our method is competitive with the state-of-the-art in object detection while reducing computational complexity by a factor of 1,000. Furthermore, we showed that our algorithm can also be used to detect other articulated objects, such as persons or animals, and is therefore not restricted to the task of hand detection.

    Once a hand has been detected, a tracking algorithm can be used to continuously track its position over time. We developed a probabilistic tracking method that can cope with uncertainty caused by image noise, incorrect detections, changing illumination, and camera motion. Furthermore, our tracking system automatically determines the number of hands in the scene, and can cope with hands entering or leaving the video canvas. We introduced several novel techniques that greatly increase tracking robustness and that can also be applied in domains other than hand tracking. To achieve real-time processing, we investigated several techniques to reduce the search space of the problem, and deliberately employed methods that are easily parallelized on modern hardware. Experimental results indicate that our methods outperform the state-of-the-art in hand tracking, while having a much lower computational complexity.

    One of the methods used by our probabilistic tracking algorithm is optical flow estimation. Optical flow is defined as a 2D vector field describing the apparent velocities of objects in a 3D scene, projected onto the image plane. Optical flow is known to be used by many insects and birds to visually track objects and to estimate their ego-motion. However, most optical flow estimation methods described in the literature are either too slow to be used in real-time applications, or not robust to illumination changes and fast motion. We therefore developed an optical flow algorithm that can cope with large displacements and is illumination independent. Furthermore, we introduce a regularization technique that ensures a smooth flow field. This regularization scheme effectively reduces the number of noisy and incorrect flow-vector estimates, while maintaining the ability to handle motion discontinuities caused by object boundaries in the scene.

    The above methods are combined into a hand tracking framework that can be used for interactive applications in unconstrained environments. To demonstrate the possibilities of gesture-based human-computer interaction, we developed a new type of computer display. This display is completely transparent, allowing multiple users to perform collaborative tasks while maintaining eye contact. Furthermore, our display produces an image that seems to float in thin air, such that users can touch the virtual image with their hands. This floating-image display has been showcased at several national and international events and trade shows.

    The research described in this dissertation has been evaluated thoroughly by comparing detection and tracking results with those obtained by state-of-the-art algorithms. These comparisons show that the proposed methods outperform most algorithms in terms of accuracy, while achieving a much lower computational complexity, resulting in a real-time implementation. Results are discussed in depth at the end of each chapter. This research further resulted in an international journal publication; a second journal paper that has been submitted and is under review at the time of writing this dissertation; nine international conference publications; a national conference publication; a commercial license agreement concerning the research results; two hardware prototypes of a new type of computer display; and a software demonstrator.
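
    To make the part-based detection idea concrete, the Python sketch below lets detected hand parts (e.g. fingertips) cast Hough-style votes for the hand centroid using a learned mean offset per part label, then keeps the vote with the densest support. The offset table, Gaussian support score and bandwidth are illustrative assumptions; the dissertation's probabilistic model of part relations is more elaborate.

        import numpy as np

        def estimate_hand_center(part_detections, offset_model, bandwidth=15.0):
            # part_detections: list of (label, x, y) detected hand parts.
            # offset_model: dict mapping a part label to its expected (dx, dy)
            # offset from the hand centroid (learned from training data).
            votes = np.array([
                (x + offset_model[label][0], y + offset_model[label][1])
                for label, x, y in part_detections
            ])
            # Crude mode seeking: score each vote by the Gaussian-weighted
            # support of all other votes and return the strongest one.
            dists = np.linalg.norm(votes[:, None] - votes[None, :], axis=-1)
            support = np.exp(-((dists / bandwidth) ** 2)).sum(axis=1)
            return votes[support.argmax()]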