562 research outputs found

    A review of parallel computing for large-scale remote sensing image mosaicking

    Interest in image mosaicking has been spurred by a wide variety of research and management needs. However, for large-scale applications, remote sensing image mosaicking usually requires significant computational capabilities. Several studies have attempted to apply parallel computing to improve image mosaicking algorithms and to speed up the calculation process. The state of the art of this field has not yet been summarized, although such a summary is essential for a better understanding of, and further research on, image mosaicking parallelism at large scale. This paper provides a perspective on the current state of image mosaicking parallelization for large-scale applications. We first introduce the motivation for parallel image mosaicking in large-scale applications and analyze its difficulties, such as scheduling a huge number of dependent tasks, programming a multiple-step procedure, and dealing with frequent I/O operations. We then summarize the existing studies of parallel computing in image mosaicking for large-scale applications with respect to problem decomposition and parallel strategy, parallel architecture, task scheduling strategy, and implementation of image mosaicking parallelization. Finally, the key problems and potential future research directions for image mosaicking are addressed.

    Detection And Classification Of Buried Radioactive Materials

    This dissertation develops new approaches for detection and classification of buried radioactive materials. Different spectral transformation methods are proposed to effectively suppress noise and to better distinguish signal features in the transformed space. The contributions of this dissertation are detailed as follows. 1) Propose an unsupervised method for buried radioactive material detection. In the experiments, the original Reed-Xiaoli (RX) algorithm performs similarly to the gross count (GC) method; however, the constrained energy minimization (CEM) method performs better when it uses feature vectors selected from the RX output. Thus, an unsupervised method is developed by combining the RX and CEM methods, which can efficiently suppress the background noise when applied to the dimensionality-reduced data from principal component analysis (PCA). 2) Propose an approach for buried target detection and classification, which applies spectral transformation followed by noise-adjusted PCA (NAPCA). To meet the requirements of practical survey mapping, we focus on the circumstance in which sensor dwell time is very short. The results show that spectral transformation can alleviate the effects of noisy spectral variation and background clutter, while NAPCA, a better choice than PCA, can extract key features for the subsequent detection and classification. 3) Propose a particle swarm optimization (PSO)-based system to automatically determine the optimal partition for spectral transformation. Two PSOs are incorporated in the system, with the outer one responsible for selecting the optimal number of bins and the inner one for the optimal bin-widths. The experimental results demonstrate that using variable bin-widths is better than using a fixed bin-width, and that PSO can provide better results than the traditional Powell's method. 4) Develop parallel implementation schemes for the PSO-based spectral partition algorithm. Both cluster and graphics processing unit (GPU) implementations are designed. The computational burden of the serial version has been greatly reduced. The experimental results also show that the GPU algorithm achieves a speedup similar to that of the cluster-based algorithm.
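
To make the idea of a spectral partition concrete, here is a minimal Python sketch of the gross count and of rebinning a spectrum with fixed versus variable bin-widths. It assumes that the spectral transformation sums adjacent channels into bins, which the abstract does not state explicitly. The function names `gross_count` and `rebin` and the sample counts are hypothetical, and the PSO search for the best partition is not shown.

```python
from typing import List, Sequence


def gross_count(spectrum: Sequence[float]) -> float:
    """Gross count (GC): total counts summed over all channels."""
    return float(sum(spectrum))


def rebin(spectrum: Sequence[float], widths: Sequence[int]) -> List[float]:
    """Sum adjacent channels into bins whose sizes are given by `widths`.

    Assumption: a partition is a list of positive integer bin-widths
    that together cover every channel exactly once.
    """
    if any(w <= 0 for w in widths):
        raise ValueError("every bin width must be a positive integer")
    if sum(widths) != len(spectrum):
        raise ValueError("bin widths must cover the spectrum exactly")
    binned: List[float] = []
    start = 0
    for w in widths:
        binned.append(float(sum(spectrum[start:start + w])))
        start += w
    return binned


spectrum = [3, 1, 4, 1, 5, 9, 2, 6]        # made-up channel counts
fixed = rebin(spectrum, [2, 2, 2, 2])      # fixed bin-width partition
variable = rebin(spectrum, [1, 3, 2, 2])   # variable bin-width partition
```

Both partitions preserve the gross count; they differ only in how the counts are distributed over bins, which is the degree of freedom an optimizer such as PSO could search over.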

    Distributed and Scalable Video Analysis Architecture for Human Activity Recognition Using Cloud Services

    This thesis proposes an open-source, maintainable system for detecting human activity in large video datasets using scalable hardware architectures. The system is validated by detecting writing and typing activities that were collected as part of the Advancing Out of School Learning in Mathematics and Engineering (AOLME) project. The implementation of the system using Amazon Web Services (AWS) is shown to be both horizontally and vertically scalable. The software associated with the system was designed to be robust so as to facilitate reproducibility and extensibility for future research

    Real-time Blind Separation and Deconvolution of Real-world signals

    We present a realistic and robust implementation of Blind Source Separation and Blind Deconvolution. The algorithm is developed from the ideas of natural gradient learning, wavelet filtering and denoising, and the characteristics of different sound sources. Several hardware pieces are integrated, including a mobile robot, an NT workstation and a DSP chip, to achieve real-time separation of real-world signals. In addition, a method of judging the separation performance without knowing the mixing matrix (mixing filter) is proposed and verified.

    Clock synchronisation for UWB and DECT communication networks

    Synchronisation deals with the distribution of time and/or frequency across a network of nodes dispersed in an area, in order to align their clocks with respect to time and/or frequency. It remains an important requirement in telecommunication networks, especially in Time Division Duplexing (TDD) systems such as Ultra Wideband (UWB) and Digital Enhanced Cordless Telecommunications (DECT) systems. This thesis explores three different research areas related to clock synchronisation in communication networks, namely algorithm development and implementation, managing Packet Delay Variation (PDV), and coping with the failure of a master node. The first area proposes a higher-layer synchronisation algorithm in order to meet the specific requirements of a UWB network that is based on the European Computer Manufacturers Association (ECMA) standard. At up to 480 Mbps data rate, UWB is an attractive technology for multimedia streaming. Higher-layer synchronisation is needed in order to facilitate synchronised playback at the receivers and prevent distortion, but no algorithm is defined in the ECMA-368 standard. In this research area, a higher-layer synchronisation algorithm is developed for an ECMA-368 UWB network. Network simulations and FPGA implementation are used to show that the new algorithm satisfies the requirements of the network. The next research area looks at how PDV can be managed when Precision Time Protocol (PTP) is implemented in an existing Ethernet network. Existing literature indicates that the performance of a PDV filtering algorithm usually depends on the delay profile of the network in which it is applied. In this research area, a new sample-mode PDV filter is proposed which is independent of the shape of the delay profile. Numerical simulations show that the sample-mode filtering algorithm is able to match or out-perform the existing sample minimum, mean, and maximum filters at different levels of network load. Finally, the thesis considers the problem of dealing with master failures in a PTP network for a DECT audio application. It describes the existing master redundancy techniques and shows why they are unsuitable for the specific application. Then a new alternate master cluster technique is proposed along with an alternative BMCA to suit the application under consideration. Network simulations are used to show how this technique leads to a reduction in the total time to recover from a master failure.
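
The four PDV filters named above can be illustrated with a short Python sketch that reduces one window of packet-delay samples to a single delay estimate. The abstract does not describe how the thesis computes the sample mode, so the histogram-based mode here is only an assumption. The function name `pdv_filters`, the bin width, and the sample delays are likewise hypothetical.

```python
from collections import Counter
from statistics import mean
from typing import Dict, Sequence


def pdv_filters(delays_us: Sequence[float], bin_width_us: float = 1.0) -> Dict[str, float]:
    """Reduce one window of packet-delay samples to candidate delay estimates."""
    if not delays_us:
        raise ValueError("window is empty")
    # Histogram the samples; the mode is taken as the centre of the most
    # populated bin (ties resolved in favour of the smaller delay).
    bins = Counter(int(d // bin_width_us) for d in delays_us)
    top_bin, _ = max(bins.items(), key=lambda kv: (kv[1], -kv[0]))
    return {
        "min": min(delays_us),
        "mean": mean(delays_us),
        "max": max(delays_us),
        "mode": (top_bin + 0.5) * bin_width_us,
    }


window = [102.1, 100.4, 100.7, 131.9, 100.2, 115.0]  # made-up delays in microseconds
estimates = pdv_filters(window)
```

In this made-up window the mean is pulled upward by the two delayed packets, while the mode stays in the bin where most samples cluster. That is one intuition for why a mode-based estimate might depend less on the shape of the delay profile.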

    INTELLIGENT SENSOR SYSTEM FOR SELECTED ENVIRONMENTAL SAFETY APPLICATION

    Song, Chen-Lin. M.S., Purdue University, August 2015. Intelligent Sensor System for Selected Environmental Safety Applications. Major Professor: Suranjan Panigrah

    Paralleizing AwSpPCA for robust facial recognition using CUDA

    This thesis report is submitted in partial fulfillment of the requirements for the degree of Bachelor of Science in Computer Science and Engineering, 2014. Cataloged from PDF version of thesis report. This study analyzes the performance benefits of parallelizing the Adaptive Weighted Sub-patterned Principal Component Analysis (Aw SP PCA) algorithm, given that the algorithm is implemented so as to retain the accuracy of its serialized version. The serialized execution of this algorithm is analyzed first and then compared against its parallel implementation, both compiled and run on the same computer. Throughout this paper, the methodology follows a step-by-step procedure that clearly outlines and describes the problems faced when trying to parallelize this algorithm. It also describes where, how and why parallelizing procedures were used. The results of the research show that not all parts of the algorithm can be implemented in parallel in the first place, and that some of the sections that can be parallelized do not necessarily yield considerable benefits. It was also seen that not all sections scale well with problem size, meaning that some portions of the algorithm can be left in their serialized state without much loss in time. The sections which can be parallelized are discussed in detail. Some changes were also made to certain variables to ensure the best accuracy possible. Finally, through analysis and experimentation, a speedup of 2.76 was achieved, with a recognition accuracy of 92.6%. Syed Amer Zawad, Ashfaque Ali. B. Computer Science and Engineerin
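
The observation that some sections cannot be parallelized can be related to the reported speedup through Amdahl's law. The Python sketch below is a generic illustration, not an analysis taken from the thesis: it assumes the simple Amdahl model (a fixed serial part plus a perfectly scalable parallel part), and the function names are hypothetical.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Amdahl's law: overall speedup when only part of the work is parallel."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if workers < 1:
        raise ValueError("workers must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def min_parallel_fraction(observed_speedup: float) -> float:
    """Smallest parallel fraction that could produce the observed speedup,
    even with unlimited workers (the limit of Amdahl's law)."""
    if observed_speedup < 1.0:
        raise ValueError("observed_speedup must be at least 1")
    return 1.0 - 1.0 / observed_speedup


# Under this model, a speedup of 2.76 implies that at least about 64%
# of the serial run time was spent in code that ran in parallel.
bound = min_parallel_fraction(2.76)
```

Under the same assumption, any section left serial caps the achievable speedup at 1 / (1 - p), however many GPU threads are used for the rest.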

    SYSTEM-ON-A-CHIP (SOC)-BASED HARDWARE ACCELERATION FOR HUMAN ACTION RECOGNITION WITH CORE COMPONENTS

    Today, the implementation of machine vision algorithms on embedded platforms or in portable systems is growing rapidly due to the demand for machine vision in daily human life. Among the applications of machine vision, human action and activity recognition has become an active research area, and market demand for integrated smart security systems is growing rapidly. Among the available approaches, embedded vision is in the top tier; however, current embedded platforms may not be able to fully exploit the potential performance of machine vision algorithms, especially in terms of low power consumption. Complex algorithms can impose immense computation and communication demands, especially action recognition algorithms, which require various stages of preprocessing, processing and machine learning blocks that need to operate concurrently. The market demands embedded platforms that operate with a power consumption of only a few watts. Attempts have been made to improve the performance of traditional embedded approaches by adding more powerful processors; this solution may solve the computation problem but increases the power consumption. System-on-a-chip field-programmable gate arrays (SoC-FPGAs) have emerged as a major architecture approach for improving power efficiency while increasing computational performance. In a SoC-FPGA, an embedded processor and an FPGA serving as an accelerator are fabricated on the same die to simultaneously improve power consumption and performance. Still, current SoC-FPGA-based vision implementations either shy away from supporting complex and adaptive vision algorithms or operate at very limited resolutions due to the immense communication and computation demands. The aim of this research is to develop a SoC-based hardware acceleration workflow for the realization of advanced vision algorithms. Hardware acceleration can improve performance for highly complex mathematical calculations or repeated functions. The performance of a SoC system can thus be improved by using a hardware acceleration method to accelerate the element that incurs the highest performance overhead. The outcome of this research could be used for the implementation of various vision algorithms, such as face recognition, object detection or object tracking, on embedded platforms. The contributions of SoC-based hardware acceleration for hardware-software codesign platforms include the following: (1) development of frameworks for complex human action recognition in both 2D and 3D; (2) realization of a framework with four main implemented IPs, namely, foreground and background subtraction (foreground probability), human detection, 2D/3D point-of-interest detection and feature extraction, and OS-ELM as a machine learning algorithm for action identification; (3) use of an FPGA-based hardware acceleration method to resolve system bottlenecks and improve system performance; and (4) measurement and analysis of system specifications, such as the acceleration factor, power consumption, and resource utilization. Experimental results show that the proposed SoC-based hardware acceleration approach provides the best performance in terms of the acceleration factor, resource utilization and power consumption among all recent works. In addition, a comparison of the accuracy of the framework that runs on the proposed embedded platform (SoC-FPGA) with the accuracy of other PC-based frameworks shows that the proposed approach outperforms most other approaches.