1,029 research outputs found

    SYSTEM-ON-A-CHIP (SOC)-BASED HARDWARE ACCELERATION FOR HUMAN ACTION RECOGNITION WITH CORE COMPONENTS

    Get PDF
    Today, the implementation of machine vision algorithms on embedded platforms or in portable systems is growing rapidly due to the demand for machine vision in daily human life. Among the applications of machine vision, human action and activity recognition has become an active research area, and market demand for integrated smart security systems is growing rapidly. Among the available approaches, embedded vision is in the top tier; however, current embedded platforms may not be able to fully exploit the potential performance of machine vision algorithms, especially in terms of low power consumption. Complex algorithms can impose immense computation and communication demands, especially action recognition algorithms, which require various stages of preprocessing, processing and machine learning blocks that need to operate concurrently. The market demands embedded platforms that operate with a power consumption of only a few watts. Attempts have been made to improve the performance of traditional embedded approaches by adding more powerful processors; this solution may solve the computation problem but increases the power consumption. System-on-a-chip field-programmable gate arrays (SoC-FPGAs) have emerged as a major architectural approach for improving power efficiency while increasing computational performance. In a SoC-FPGA, an embedded processor and an FPGA serving as an accelerator are fabricated on the same die to simultaneously improve power consumption and performance. Still, current SoC-FPGA-based vision implementations either shy away from supporting complex and adaptive vision algorithms or operate at very limited resolutions due to the immense communication and computation demands. The aim of this research is to develop a SoC-based hardware acceleration workflow for the realization of advanced vision algorithms. Hardware acceleration can improve performance for highly complex mathematical calculations or repeated functions. The performance of a SoC system can thus be improved by using a hardware acceleration method to accelerate the element that incurs the highest performance overhead. The outcome of this research could be used for the implementation of various vision algorithms, such as face recognition, object detection or object tracking, on embedded platforms. The contributions of SoC-based hardware acceleration for hardware-software codesign platforms include the following: (1) development of frameworks for complex human action recognition in both 2D and 3D; (2) realization of a framework with four main implemented IPs, namely, foreground and background subtraction (foreground probability), human detection, 2D/3D point-of-interest detection and feature extraction, and OS-ELM as a machine learning algorithm for action identification; (3) use of an FPGA-based hardware acceleration method to resolve system bottlenecks and improve system performance; and (4) measurement and analysis of system specifications, such as the acceleration factor, power consumption, and resource utilization. Experimental results show that the proposed SoC-based hardware acceleration approach provides better performance in terms of the acceleration factor, resource utilization and power consumption than all recent works. In addition, a comparison of the accuracy of the framework running on the proposed embedded platform (SoC-FPGA) with the accuracy of other PC-based frameworks shows that the proposed approach outperforms most other approaches
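
    Of the listed IPs, the OS-ELM classifier is compact enough to sketch in software. Below is a minimal NumPy reference for an Online Sequential Extreme Learning Machine, assuming a sigmoid hidden layer and one-hot action labels; the class name, layer size and ridge constant are illustrative assumptions rather than details taken from the thesis.

        import numpy as np

        class OSELM:
            """Minimal OS-ELM sketch: random fixed hidden layer plus a
            recursive least-squares update of the output weights."""

            def __init__(self, n_features, n_hidden, n_classes, seed=0):
                rng = np.random.default_rng(seed)
                self.W = rng.standard_normal((n_features, n_hidden))  # fixed input weights
                self.b = rng.standard_normal(n_hidden)                # fixed biases
                self.beta = np.zeros((n_hidden, n_classes))           # output weights
                self.P = None                                         # inverse correlation matrix

            def _hidden(self, X):
                return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))   # sigmoid activations

            def init_fit(self, X0, T0):
                # Batch initialization on the first chunk (T0 is one-hot).
                H = self._hidden(X0)
                self.P = np.linalg.inv(H.T @ H + 1e-6 * np.eye(H.shape[1]))
                self.beta = self.P @ H.T @ T0

            def partial_fit(self, X, T):
                # Online update: previously seen samples are never revisited.
                H = self._hidden(X)
                K = np.linalg.inv(np.eye(H.shape[0]) + H @ self.P @ H.T)
                self.P -= self.P @ H.T @ K @ H @ self.P
                self.beta += self.P @ H.T @ (T - H @ self.beta)

            def predict(self, X):
                return np.argmax(self._hidden(X) @ self.beta, axis=1)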

    Distributed and Scalable Video Analysis Architecture for Human Activity Recognition Using Cloud Services

    Get PDF
    This thesis proposes an open-source, maintainable system for detecting human activity in large video datasets using scalable hardware architectures. The system is validated by detecting writing and typing activities that were collected as part of the Advancing Out of School Learning in Mathematics and Engineering (AOLME) project. The implementation of the system using Amazon Web Services (AWS) is shown to be both horizontally and vertically scalable. The software associated with the system was designed to be robust so as to facilitate reproducibility and extensibility for future research
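
    The abstract does not specify which AWS services carry the workload; one common way to realize the horizontal scalability it describes is a queue-based fan-out of video chunks to worker instances. The boto3 sketch below illustrates the idea; the queue URL, chunk length and message format are hypothetical.

        import json
        import boto3

        # Hypothetical queue URL -- the abstract does not name the actual
        # AWS resources used by the AOLME system.
        QUEUE_URL = "https://sqs.us-east-1.amazonaws.com/123456789012/video-tasks"

        def enqueue_video_chunks(bucket, key, duration_s, chunk_s=60):
            """Fan a long video out to workers as fixed-length time windows.

            Each message names an S3 object and a time window; any number
            of worker instances can poll the queue concurrently, which is
            what makes the pipeline horizontally scalable.
            """
            sqs = boto3.client("sqs")
            for start in range(0, duration_s, chunk_s):
                task = {"bucket": bucket, "key": key,
                        "start": start, "end": min(start + chunk_s, duration_s)}
                sqs.send_message(QueueUrl=QUEUE_URL, MessageBody=json.dumps(task))

        # A worker would call sqs.receive_message in a loop, run the
        # writing/typing detector on its window, and persist the results.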

    Self-learning Anomaly Detection in Industrial Production

    Get PDF

    Speeding up Adaboost object detection with motion segmentation and Haar feature acceleration

    Get PDF
    A key challenge in a surveillance system is the object detection task. Object detection in general is a non-trivial problem. A sub-problem within the broader context of object detection on which many researchers focus is face detection. Numerous techniques have been proposed for face detection; one of the best-performing algorithms is that proposed by Viola et al. This algorithm is based on Adaboost and uses Haar features to detect objects. The main reasons for its popularity are its very low false positive rate and the fact that the classifier network can be trained for any detection task. The use of Haar basis functions to represent key object features is the key to its success. The basis functions are organized as a network to form a strong classifier. To detect objects, the technique divides each input image into sub-windows and applies the strong classifier to each sub-window to detect the presence of an object. The process is repeated at multiple scales of the input image to detect objects of various sizes. In this thesis we propose an object detection system that uses object segmentation as a preprocessing step. We use the Mixture of Gaussians (MoG) method proposed by Stauffer et al. for object segmentation. One key advantage of using segmentation to extract image regions of interest is that it reduces the number of search windows sent to the detection task, thereby reducing the computational complexity and the execution time. Moreover, owing to the computational complexity of both the segmentation and detection algorithms used in the system, we propose hardware architectures for accelerating the key computationally intensive blocks: one for MoG and one for the Haar feature computation, a key compute-intensive block within the Adaboost algorithm
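
    As a software reference for the computation that the proposed Haar accelerator performs, the NumPy sketch below evaluates a two-rectangle Haar feature via the standard integral-image (summed-area table) formulation, which reduces any rectangle sum to four table lookups; the particular feature geometry and the 24x24 window are illustrative.

        import numpy as np

        def integral_image(img):
            """Summed-area table with a zero row/column prepended so that
            rectangle sums need no boundary checks."""
            ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.int64)
            ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0), axis=1)
            return ii

        def rect_sum(ii, x, y, w, h):
            # Sum of the w x h rectangle with top-left corner (x, y),
            # from four lookups -- the property that makes Haar features
            # cheap in both software and hardware.
            return ii[y + h, x + w] - ii[y, x + w] - ii[y + h, x] + ii[y, x]

        def haar_two_rect_horizontal(ii, x, y, w, h):
            """Two-rectangle Haar feature: left half minus right half."""
            left = rect_sum(ii, x, y, w // 2, h)
            right = rect_sum(ii, x + w // 2, y, w // 2, h)
            return left - right

        # Example: evaluate one feature on a random 24x24 sub-window,
        # the canonical Viola-Jones detector size.
        window = np.random.randint(0, 256, (24, 24))
        ii = integral_image(window)
        print(haar_two_rect_horizontal(ii, 4, 4, 12, 8))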

    Discovering user mobility and activity in smart lighting environments

    Full text link
    "Smart lighting" environments seek to improve energy efficiency, human productivity and health by combining sensors, controls, and Internet-enabled lights with emerging “Internet-of-Things” technology. Interesting and potentially impactful applications involve adaptive lighting that responds to individual occupants' location, mobility and activity. In this dissertation, we focus on the recognition of user mobility and activity using sensing modalities and analytical techniques. This dissertation encompasses prior work using body-worn inertial sensors in one study, followed by smart-lighting inspired infrastructure sensors deployed with lights. The first approach employs wearable inertial sensors and body area networks that monitor human activities with a user's smart devices. Real-time algorithms are developed to (1) estimate angles of excess forward lean to prevent risk of falls, (2) identify functional activities, including postures, locomotion, and transitions, and (3) capture gait parameters. Two human activity datasets are collected from 10 healthy young adults and 297 elder subjects, respectively, for laboratory validation and real-world evaluation. Results show that these algorithms can identify all functional activities accurately with a sensitivity of 98.96% on the 10-subject dataset, and can detect walking activities and gait parameters consistently with high test-retest reliability (p-value < 0.001) on the 297-subject dataset. The second approach leverages pervasive "smart lighting" infrastructure to track human location and predict activities. A use case oriented design methodology is considered to guide the design of sensor operation parameters for localization performance metrics from a system perspective. Integrating a network of low-resolution time-of-flight sensors in ceiling fixtures, a recursive 3D location estimation formulation is established that links a physical indoor space to an analytical simulation framework. Based on indoor location information, a label-free clustering-based method is developed to learn user behaviors and activity patterns. Location datasets are collected when users are performing unconstrained and uninstructed activities in the smart lighting testbed under different layout configurations. Results show that the activity recognition performance measured in terms of CCR ranges from approximately 90% to 100% throughout a wide range of spatio-temporal resolutions on these location datasets, insensitive to the reconfiguration of environment layout and the presence of multiple users.2017-02-17T00:00:00

    Automated camera ranking and selection using video content and scene context

    Get PDF
    When observing a scene with multiple cameras, an important problem to solve is to automatically identify "what camera feed should be shown and when?" The answer to this question is of interest for a number of applications and scenarios ranging from sports to surveillance. In this thesis we present a framework for ranking each video frame across time and each camera across the camera network; this ranking is then used for automated video production. In the first stage, information from each camera view and from the objects in it is extracted and represented in a way that allows for object- and frame-ranking. First, objects are detected and ranked within and across camera views; this ranking takes into account both visible and contextual information related to the object. Then content ranking is performed based on the objects in the view and camera-network-level information. We propose two novel techniques for content ranking, namely Routing Based Ranking (RBR) and Multivariate Gaussian based Ranking (MVG). In RBR we use a rule-based framework in which weighted fusion of object- and frame-level information takes place, while in MVG the rank is estimated as a multivariate Gaussian distribution. Through experimental and subjective validation we demonstrate that the proposed content ranking strategies allow the identification of the best camera at each time instant. The second part of the thesis focuses on the automatic generation of N-to-1 videos based on the ranked content. We demonstrate that in such production settings it is undesirable to have frequent inter-camera switching. This motivates the need for a compromise between selecting the best camera most of the time and minimising frequent inter-camera switching; we demonstrate that state-of-the-art techniques for this task are inadequate and fail in dynamic scenes. We propose three novel methods for automated camera selection. The first method (gof) performs a joint optimization of a cost function that depends on both the view quality and inter-camera switching so that a pleasing best-view video sequence can be composed. The other two methods (dbn and util) include the selection decision in the ranking strategy. In dbn we model best-camera selection as a state sequence via Directed Acyclic Graphs (DAGs) designed as a Dynamic Bayesian Network (DBN), which encodes contextual knowledge about the camera network and employs past information to minimise inter-camera switches. In comparison, util uses both past and future information in a Partially Observable Markov Decision Process (POMDP), where the camera selection at a certain time is influenced by past information and its repercussions in the future. The performance of the proposed approaches is demonstrated on multiple real and synthetic multi-camera setups. We compare the proposed architectures with various baseline methods, with encouraging results. The performance of the proposed approaches is also validated through extensive subjective testing
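
    Of the two rankers, MVG admits a compact illustration: fit a multivariate Gaussian to frame-level features of exemplar frames and score each camera's current frame by its log-density under that model. The sketch below assumes a small hypothetical feature vector per frame; the thesis's actual features and fitting details are not given in this abstract.

        import numpy as np
        from scipy.stats import multivariate_normal

        class MVGRanker:
            """Sketch of multivariate-Gaussian content ranking.

            Frame features (e.g., object count, mean object size,
            distance to view centre) from exemplar "good" frames fit a
            Gaussian; at run time each camera's frame is scored by its
            log-density and the highest-scoring camera wins.
            """

            def fit(self, exemplar_feats):
                mu = exemplar_feats.mean(axis=0)
                cov = np.cov(exemplar_feats, rowvar=False)
                # A small ridge keeps the covariance invertible.
                cov += 1e-6 * np.eye(cov.shape[0])
                self.model = multivariate_normal(mean=mu, cov=cov)
                return self

            def best_camera(self, per_camera_feats):
                # per_camera_feats: (n_cameras, n_features) for one instant.
                scores = self.model.logpdf(per_camera_feats)
                return int(np.argmax(scores)), scores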