176 research outputs found

    Head-Up Display Perspective Correction Using Homography Transformation

    Get PDF
    The purpose of this project is to research the potential of applying mathematical methods in order to aid the development of robust Head-Up Display systems for automotive vehicles. If the vehicle detects an object and the Head-Up Display should overlay some information over that object, then the information has to be displayed at some position which depends on where the driver is looking from. This has to be done keeping in mind the real-time application of driving a car. The solution proposed in this thesis is to calculate a set of homography matrices corresponding to a set of points that represent a specific position of the driver’s head. Then, by tracking the head of the driver the matching homography is continuously applied to the graphical interface so that it matches the outside world

    Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery

    Get PDF
    A robust and fast automatic moving object detection and tracking system is essential to characterize target object and extract spatial and temporal information for different functionalities including video surveillance systems, urban traffic monitoring and navigation, robotic. In this dissertation, I present a collaborative Spatial Pyramid Context-aware moving object detection and Tracking system. The proposed visual tracker is composed of one master tracker that usually relies on visual object features and two auxiliary trackers based on object temporal motion information that will be called dynamically to assist master tracker. SPCT utilizes image spatial context at different level to make the video tracking system resistant to occlusion, background noise and improve target localization accuracy and robustness. We chose a pre-selected seven-channel complementary features including RGB color, intensity and spatial pyramid of HoG to encode object color, shape and spatial layout information. We exploit integral histogram as building block to meet the demands of real-time performance. A novel fast algorithm is presented to accurately evaluate spatially weighted local histograms in constant time complexity using an extension of the integral histogram method. Different techniques are explored to efficiently compute integral histogram on GPU architecture and applied for fast spatio-temporal median computations and 3D face reconstruction texturing. We proposed a multi-component framework based on semantic fusion of motion information with projected building footprint map to significantly reduce the false alarm rate in urban scenes with many tall structures. The experiments on extensive VOTC2016 benchmark dataset and aerial video confirm that combining complementary tracking cues in an intelligent fusion framework enables persistent tracking for Full Motion Video and Wide Aerial Motion Imagery.Comment: PhD Dissertation (162 pages

    Performance Analysis of Tracking on Mobile Devices using Local Binary Descriptors

    Get PDF
    With the growing ubiquity of mobile devices, users are turning to their smartphones and tablets to perform more complex tasks than ever before. Performing computer vision tasks on mobile devices must be done despite the constraints on CPU performance, memory, and power consumption. One such task for mobile devices involves object tracking, an important area of computer vision. The computational complexity of tracking algorithms makes them ideal candidates for optimization on mobile platforms. This thesis presents a mobile implementation for real time object tracking. Currently few tracking approaches take into consideration the resource constraints on mobile devices. Optimizing performance for mobile devices can result in better and more efficient tracking approaches for mobile applications such as augmented reality. These performance benefits aim to increase the frame rate at which an object is tracked and reduce power consumption during tracking. For this thesis, we utilize binary descriptors, such as Binary Robust Independent Elementary Features (BRIEF), Oriented FAST and Rotated BRIEF (ORB), Binary Robust Invariant Scalable Keypoints (BRISK), and Fast Retina Keypoint (FREAK). The tracking performance of these descriptors is benchmarked on mobile devices. We consider an object tracking approach based on a dictionary of templates that involves generating keypoints of a detected object and candidate regions in subsequent frames. Descriptor matching, between candidate regions in a new frame and a dictionary of templates, identifies the location of the tracked object. These comparisons are often computationally intensive and require a great deal of memory and processing time. Google\u27s Android operating system is used to implement the tracking application on a Samsung Galaxy series phone and tablet. Control of the Android camera is largely done through OpenCV\u27s Android SDK. Power consumption is measured using the PowerTutor Android application. Other performance characteristics, such as processing time, are gathered using the Dalvik Debug Monitor Server (DDMS) tool included in the Android SDK. These metrics are used to evaluate the tracker\u27s performance on mobile devices

    Object Tracking Using Local Binary Descriptors

    Get PDF
    Visual tracking has become an increasingly important topic of research in the field of Computer Vision (CV). There are currently many tracking methods based on the Detect-then-Track paradigm. This type of approach may allow for a system to track a random object with just one initialization phase, but may often rely on constructing models to follow the object. Another limitation of these methods is that they are computationally and memory intensive, which hinders their application to resource constrained platforms such as mobile devices. Under these conditions, the implementation of Augmented Reality (AR) or complex multi-part systems is not possible. In this thesis, we explore a variety of interest point descriptors for generic object tracking. The SIFT descriptor is considered a benchmark and will be compared with binary descriptors such as BRIEF, ORB, BRISK, and FREAK. The accuracy of these descriptors is benchmarked against the ground truth of the object\u27s location. We use dictionaries of descriptors to track regions with small error under variations due to occlusions, illumination changes, scaling, and rotation. This is accomplished by using Dense-to-Sparse Search Pattern, Locality Constraints, and Scale Adaptation. A benchmarking system is created to test the descriptors\u27 accuracy, speed, robustness, and distinctness. This data offers a comparison of the tracking system to current state of the art systems such as Multiple Instance Learning Tracker (MILTrack), Tracker Learned Detection (TLD), and Continuously Adaptive MeanShift (CAMSHIFT)

    Accelerated Object Tracking with Local Binary Features

    Get PDF
    Multi-object tracking is a problem with wide application in modern computing. Object tracking is leveraged in areas such as human computer interaction, autonomous vehicle navigation, panorama generation, as well as countless other robotic applications. Several trackers have demonstrated favorable results for tracking of single objects. However, modern object trackers must make significant tradeoffs in order to accommodate multiple objects while maintaining real-time performance. These tradeoffs include sacrifices in robustness and accuracy that adversely affect the results. This thesis details the design and multiple implementations of an object tracker that is focused on computational efficiency. The computational efficiency of the tracker is achieved through use of local binary descriptors in a template matching approach. Candidate templates are matched to a dictionary composed of both static and dynamic templates to allow for variation in the appearance of the object while minimizing the potential for drift in the tracker. Locality constraints have been used to reduce tracking jitter. Due to the significant promise for parallelization, the tracking algorithm was implemented on the Graphics Processing Unit (GPU) using the CUDA API. The tracker\u27s efficiency also led to its implantation on a mobile platform as one of the mobile trackers that can accurately track at faster than realtime speed. Benchmarks were performed to compare the proposed tracker to state of the art trackers on a wide range of standard test videos. The tracker implemented in this work has demonstrated a higher degree of accuracy while operating several orders of magnitude faster

    Robust and real-time hand detection and tracking in monocular video

    Get PDF
    In recent years, personal computing devices such as laptops, tablets and smartphones have become ubiquitous. Moreover, intelligent sensors are being integrated into many consumer devices such as eyeglasses, wristwatches and smart televisions. With the advent of touchscreen technology, a new human-computer interaction (HCI) paradigm arose that allows users to interface with their device in an intuitive manner. Using simple gestures, such as swipe or pinch movements, a touchscreen can be used to directly interact with a virtual environment. Nevertheless, touchscreens still form a physical barrier between the virtual interface and the real world. An increasingly popular field of research that tries to overcome this limitation, is video based gesture recognition, hand detection and hand tracking. Gesture based interaction allows the user to directly interact with the computer in a natural manner by exploring a virtual reality using nothing but his own body language. In this dissertation, we investigate how robust hand detection and tracking can be accomplished under real-time constraints. In the context of human-computer interaction, real-time is defined as both low latency and low complexity, such that a complete video frame can be processed before the next one becomes available. Furthermore, for practical applications, the algorithms should be robust to illumination changes, camera motion, and cluttered backgrounds in the scene. Finally, the system should be able to initialize automatically, and to detect and recover from tracking failure. We study a wide variety of existing algorithms, and propose significant improvements and novel methods to build a complete detection and tracking system that meets these requirements. Hand detection, hand tracking and hand segmentation are related yet technically different challenges. Whereas detection deals with finding an object in a static image, tracking considers temporal information and is used to track the position of an object over time, throughout a video sequence. Hand segmentation is the task of estimating the hand contour, thereby separating the object from its background. Detection of hands in individual video frames allows us to automatically initialize our tracking algorithm, and to detect and recover from tracking failure. Human hands are highly articulated objects, consisting of finger parts that are connected with joints. As a result, the appearance of a hand can vary greatly, depending on the assumed hand pose. Traditional detection algorithms often assume that the appearance of the object of interest can be described using a rigid model and therefore can not be used to robustly detect human hands. Therefore, we developed an algorithm that detects hands by exploiting their articulated nature. Instead of resorting to a template based approach, we probabilistically model the spatial relations between different hand parts, and the centroid of the hand. Detecting hand parts, such as fingertips, is much easier than detecting a complete hand. Based on our model of the spatial configuration of hand parts, the detected parts can be used to obtain an estimate of the complete hand's position. To comply with the real-time constraints, we developed techniques to speed-up the process by efficiently discarding unimportant information in the image. Experimental results show that our method is competitive with the state-of-the-art in object detection while providing a reduction in computational complexity with a factor 1 000. Furthermore, we showed that our algorithm can also be used to detect other articulated objects such as persons or animals and is therefore not restricted to the task of hand detection. Once a hand has been detected, a tracking algorithm can be used to continuously track its position in time. We developed a probabilistic tracking method that can cope with uncertainty caused by image noise, incorrect detections, changing illumination, and camera motion. Furthermore, our tracking system automatically determines the number of hands in the scene, and can cope with hands entering or leaving the video canvas. We introduced several novel techniques that greatly increase tracking robustness, and that can also be applied in other domains than hand tracking. To achieve real-time processing, we investigated several techniques to reduce the search space of the problem, and deliberately employ methods that are easily parallelized on modern hardware. Experimental results indicate that our methods outperform the state-of-the-art in hand tracking, while providing a much lower computational complexity. One of the methods used by our probabilistic tracking algorithm, is optical flow estimation. Optical flow is defined as a 2D vector field describing the apparent velocities of objects in a 3D scene, projected onto the image plane. Optical flow is known to be used by many insects and birds to visually track objects and to estimate their ego-motion. However, most optical flow estimation methods described in literature are either too slow to be used in real-time applications, or are not robust to illumination changes and fast motion. We therefore developed an optical flow algorithm that can cope with large displacements, and that is illumination independent. Furthermore, we introduce a regularization technique that ensures a smooth flow-field. This regularization scheme effectively reduces the number of noisy and incorrect flow-vector estimates, while maintaining the ability to handle motion discontinuities caused by object boundaries in the scene. The above methods are combined into a hand tracking framework which can be used for interactive applications in unconstrained environments. To demonstrate the possibilities of gesture based human-computer interaction, we developed a new type of computer display. This display is completely transparent, allowing multiple users to perform collaborative tasks while maintaining eye contact. Furthermore, our display produces an image that seems to float in thin air, such that users can touch the virtual image with their hands. This floating imaging display has been showcased on several national and international events and tradeshows. The research that is described in this dissertation has been evaluated thoroughly by comparing detection and tracking results with those obtained by state-of-the-art algorithms. These comparisons show that the proposed methods outperform most algorithms in terms of accuracy, while achieving a much lower computational complexity, resulting in a real-time implementation. Results are discussed in depth at the end of each chapter. This research further resulted in an international journal publication; a second journal paper that has been submitted and is under review at the time of writing this dissertation; nine international conference publications; a national conference publication; a commercial license agreement concerning the research results; two hardware prototypes of a new type of computer display; and a software demonstrator

    Real-Time, Multiple Pan/Tilt/Zoom Computer Vision Tracking and 3D Positioning System for Unmanned Aerial System Metrology

    Get PDF
    The study of structural characteristics of Unmanned Aerial Systems (UASs) continues to be an important field of research for developing state of the art nano/micro systems. Development of a metrology system using computer vision (CV) tracking and 3D point extraction would provide an avenue for making these theoretical developments. This work provides a portable, scalable system capable of real-time tracking, zooming, and 3D position estimation of a UAS using multiple cameras. Current state-of-the-art photogrammetry systems use retro-reflective markers or single point lasers to obtain object poses and/or positions over time. Using a CV pan/tilt/zoom (PTZ) system has the potential to circumvent their limitations. The system developed in this paper exploits parallel-processing and the GPU for CV-tracking, using optical flow and known camera motion, in order to capture a moving object using two PTU cameras. The parallel-processing technique developed in this work is versatile, allowing the ability to test other CV methods with a PTZ system using known camera motion. Utilizing known camera poses, the object\u27s 3D position is estimated and focal lengths are estimated for filling the image to a desired amount. This system is tested against truth data obtained using an industrial system

    A Comprehensive Performance Evaluation of Deformable Face Tracking "In-the-Wild"

    Full text link
    Recently, technologies such as face detection, facial landmark localisation and face recognition and verification have matured enough to provide effective and efficient solutions for imagery captured under arbitrary conditions (referred to as "in-the-wild"). This is partially attributed to the fact that comprehensive "in-the-wild" benchmarks have been developed for face detection, landmark localisation and recognition/verification. A very important technology that has not been thoroughly evaluated yet is deformable face tracking "in-the-wild". Until now, the performance has mainly been assessed qualitatively by visually assessing the result of a deformable face tracking technology on short videos. In this paper, we perform the first, to the best of our knowledge, thorough evaluation of state-of-the-art deformable face tracking pipelines using the recently introduced 300VW benchmark. We evaluate many different architectures focusing mainly on the task of on-line deformable face tracking. In particular, we compare the following general strategies: (a) generic face detection plus generic facial landmark localisation, (b) generic model free tracking plus generic facial landmark localisation, as well as (c) hybrid approaches using state-of-the-art face detection, model free tracking and facial landmark localisation technologies. Our evaluation reveals future avenues for further research on the topic.Comment: E. Antonakos and P. Snape contributed equally and have joint second authorshi
    • …
    corecore