1,107 research outputs found

    Spatial Pyramid Context-Aware Moving Object Detection and Tracking for Full Motion Video and Wide Aerial Motion Imagery

    Get PDF
    A robust and fast automatic moving object detection and tracking system is essential to characterize target object and extract spatial and temporal information for different functionalities including video surveillance systems, urban traffic monitoring and navigation, robotic. In this dissertation, I present a collaborative Spatial Pyramid Context-aware moving object detection and Tracking system. The proposed visual tracker is composed of one master tracker that usually relies on visual object features and two auxiliary trackers based on object temporal motion information that will be called dynamically to assist master tracker. SPCT utilizes image spatial context at different level to make the video tracking system resistant to occlusion, background noise and improve target localization accuracy and robustness. We chose a pre-selected seven-channel complementary features including RGB color, intensity and spatial pyramid of HoG to encode object color, shape and spatial layout information. We exploit integral histogram as building block to meet the demands of real-time performance. A novel fast algorithm is presented to accurately evaluate spatially weighted local histograms in constant time complexity using an extension of the integral histogram method. Different techniques are explored to efficiently compute integral histogram on GPU architecture and applied for fast spatio-temporal median computations and 3D face reconstruction texturing. We proposed a multi-component framework based on semantic fusion of motion information with projected building footprint map to significantly reduce the false alarm rate in urban scenes with many tall structures. The experiments on extensive VOTC2016 benchmark dataset and aerial video confirm that combining complementary tracking cues in an intelligent fusion framework enables persistent tracking for Full Motion Video and Wide Aerial Motion Imagery.Comment: PhD Dissertation (162 pages

    3D high definition video coding on a GPU-based heterogeneous system

    Get PDF
    H.264/MVC is a standard for supporting the sensation of 3D, based on coding from 2 (stereo) to N views. H.264/MVC adopts many coding options inherited from single view H.264/AVC, and thus its complexity is even higher, mainly because the number of processing views is higher. In this manuscript, we aim at an efficient parallelization of the most computationally intensive video encoding module for stereo sequences. In particular, inter prediction and its collaborative execution on a heterogeneous platform. The proposal is based on an efficient dynamic load balancing algorithm and on breaking encoding dependencies. Experimental results demonstrate the proposed algorithm's ability to reduce the encoding time for different stereo high definition sequences. Speed-up values of up to 90× were obtained when compared with the reference encoder on the same platform. Moreover, the proposed algorithm also provides a more energy-efficient approach and hence requires less energy than the sequential reference algorith

    Statistical Regression Methods for GPGPU Design Space Exploration

    Get PDF
    General Purpose Graphics Processing Units (GPGPUs) have leveraged the performance and power efficiency of today\u27s heterogeneous systems to usher in a new era of innovation in high-performance scientific computing. These systems can offer significantly high performance for massively parallel applications; however, their resources may be wasted due to inefficient tuning strategies. Previous application tuning studies pre-dominantly employ low-level, architecture specific tuning which can make the performance modeling task difficult and less generic. In this research, we explore the GPGPU design space featuring the memory hierarchy for application tuning using regression-based performance prediction framework and rank the design space based on the runtime performance. The regression-based framework models the GPGPU device computations using algorithm characteristics such as the number of floating-point operations, total number of bytes, and hardware parameters pertaining to the GPGPU memory hierarchy as predictor variables. The computation component regression models are developed using several instrumented executions of the algorithms that include a range of FLOPS-to-Byte requirement. We validate our model with a Synchronous Iterative Algorithm (SIA) set that includes Spiking Neural Networks (SNNs) and Anisotropic Diffusion Filtering (ADF) for massive images. The highly parallel nature of the above mentioned algorithms, in addition to their wide range of communication-to-computation complexities, makes them good candidates for this study. A hierarchy of implementations for the SNNs and ADF is constructed and ranked using the regression-based framework. We further illustrate the Synchronous Iterative GPGPU Execution (SIGE) model on the GPGPU-augmented Palmetto Cluster. The performance prediction framework maps appropriate design space implementation for 4 out of 5 case studies used in this research. The final goal of this research is to establish the efficacy of the regression-based framework to accurately predict the application kernel runtime, allowing developers to correctly rank their design space prior to the large-scale implementation

    Keyframe-based monocular SLAM: design, survey, and future directions

    Get PDF
    Extensive research in the field of monocular SLAM for the past fifteen years has yielded workable systems that found their way into various applications in robotics and augmented reality. Although filter-based monocular SLAM systems were common at some time, the more efficient keyframe-based solutions are becoming the de facto methodology for building a monocular SLAM system. The objective of this paper is threefold: first, the paper serves as a guideline for people seeking to design their own monocular SLAM according to specific environmental constraints. Second, it presents a survey that covers the various keyframe-based monocular SLAM systems in the literature, detailing the components of their implementation, and critically assessing the specific strategies made in each proposed solution. Third, the paper provides insight into the direction of future research in this field, to address the major limitations still facing monocular SLAM; namely, in the issues of illumination changes, initialization, highly dynamic motion, poorly textured scenes, repetitive textures, map maintenance, and failure recovery

    GPU Accelerated Vector Median Filter

    Get PDF
    Noise reduction is an important step for most image processing tasks. For three channel color images, a widely used technique is vector median filter in which color values of pixels are treated as 3-component vectors. Vector median filters are computationally expensive; for a window size of n x n, each of the n(sup 2) vectors has to be compared with other n(sup 2) - 1 vectors in distances. General purpose computation on graphics processing units (GPUs) is the paradigm of utilizing high-performance many-core GPU architectures for computation tasks that are normally handled by CPUs. In this work. NVIDIA's Compute Unified Device Architecture (CUDA) paradigm is used to accelerate vector median filtering. which has to the best of our knowledge never been done before. The performance of GPU accelerated vector median filter is compared to that of the CPU and MPI-based versions for different image and window sizes, Initial findings of the study showed 100x improvement of performance of vector median filter implementation on GPUs over CPU implementations and further speed-up is expected after more extensive optimizations of the GPU algorithm

    Parallel Implementation of a Real-Time High Dynamic Range Video System

    Full text link
    Abstract. This article describes the use of the parallel processing capabilities of a graphics chip to increase the processing speed of a high dynamic range (HDR) video system. The basis is an existing HDR video system that produces each frame from a sequence of regular images taken in quick succession under varying exposure settings. The image sequence is processed in a pipeline consisting of: shutter speeds selection, capturing, color space conversion, image registration, HDR stitching, and tone mapping. This article identifies bottlenecks in the pipeline and describes modifications to the algorithms that are necessary to enable parallel processing. Time-critical steps are processed on a graphics processing unit (GPU). The resulting processing time is evaluated and compared to the original sequential code. The creation of an HDR video frame is sped up by a factor of 15 on the average

    Indexed dependence metadata and its applications in software performance optimisation

    No full text
    To achieve continued performance improvements, modern microprocessor design is tending to concentrate an increasing proportion of hardware on computation units with less automatic management of data movement and extraction of parallelism. As a result, architectures increasingly include multiple computation cores and complicated, software-managed memory hierarchies. Compilers have difficulty characterizing the behaviour of a kernel in a general enough manner to enable automatic generation of efficient code in any but the most straightforward of cases. We propose the concept of indexed dependence metadata to improve application development and mapping onto such architectures. The metadata represent both the iteration space of a kernel and the mapping of that iteration space from a given index to the set of data elements that iteration might use: thus the dependence metadata is indexed by the kernel’s iteration space. This explicit mapping allows the compiler or runtime to optimise the program more efficiently, and improves the program structure for the developer. We argue that this form of explicit interface specification reduces the need for premature, architecture-specific optimisation. It improves program portability, supports intercomponent optimisation and enables generation of efficient data movement code. We offer the following contributions: an introduction to the concept of indexed dependence metadata as a generalisation of stream programming, a demonstration of its advantages in a component programming system, the decoupled access/execute model for C++ programs, and how indexed dependence metadata might be used to improve the programming model for GPU-based designs. Our experimental results with prototype implementations show that indexed dependence metadata supports automatic synthesis of double-buffered data movement for the Cell processor and enables aggressive loop fusion optimisations in image processing, linear algebra and multigrid application case studies
    • …
    corecore