
    SYSTEM-ON-A-CHIP (SOC)-BASED HARDWARE ACCELERATION FOR HUMAN ACTION RECOGNITION WITH CORE COMPONENTS

    Today, the implementation of machine vision algorithms on embedded platforms or in portable systems is growing rapidly due to the demand for machine vision in daily human life. Among the applications of machine vision, human action and activity recognition has become an active research area, and market demand for integrated smart security systems is growing rapidly. Among the available approaches, embedded vision is in the top tier; however, current embedded platforms may not be able to fully exploit the potential performance of machine vision algorithms, especially in terms of low power consumption. Complex algorithms can impose immense computation and communication demands, especially action recognition algorithms, which require various stages of preprocessing, processing and machine learning blocks that need to operate concurrently. The market demands embedded platforms that operate with a power consumption of only a few watts. Attempts have been made to improve the performance of traditional embedded approaches by adding more powerful processors; this solution may solve the computation problem but increases the power consumption. System-on-a-chip field-programmable gate arrays (SoC-FPGAs) have emerged as a major architectural approach for improving power efficiency while increasing computational performance. In a SoC-FPGA, an embedded processor and an FPGA serving as an accelerator are fabricated on the same die to simultaneously improve power consumption and performance. Still, current SoC-FPGA-based vision implementations either shy away from supporting complex and adaptive vision algorithms or operate at very limited resolutions due to the immense communication and computation demands. The aim of this research is to develop a SoC-based hardware acceleration workflow for the realization of advanced vision algorithms. Hardware acceleration can improve performance for highly complex mathematical calculations or repeated functions. The performance of a SoC system can thus be improved by using a hardware acceleration method to accelerate the element that incurs the highest performance overhead. The outcome of this research could be used for the implementation of various vision algorithms, such as face recognition, object detection or object tracking, on embedded platforms. The contributions of SoC-based hardware acceleration for hardware-software codesign platforms include the following: (1) development of frameworks for complex human action recognition in both 2D and 3D; (2) realization of a framework with four main implemented IPs, namely, foreground and background subtraction (foreground probability), human detection, 2D/3D point-of-interest detection and feature extraction, and OS-ELM as a machine learning algorithm for action identification; (3) use of an FPGA-based hardware acceleration method to resolve system bottlenecks and improve system performance; and (4) measurement and analysis of system specifications, such as the acceleration factor, power consumption, and resource utilization. Experimental results show that the proposed SoC-based hardware acceleration approach provides better performance in terms of the acceleration factor, resource utilization and power consumption than recent related works. In addition, a comparison of the accuracy of the framework that runs on the proposed embedded platform (SoC-FPGA) with the accuracy of other PC-based frameworks shows that the proposed approach outperforms most other approaches.
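
    The abstract names OS-ELM (the online sequential extreme learning machine) as the classifier stage but does not reproduce its update rule. As a point of reference, a minimal NumPy sketch of the standard OS-ELM recursion might look as follows; the sigmoid activation and all class and variable names are illustrative choices, not taken from the thesis.

```python
import numpy as np

class OSELM:
    """Minimal sketch of an online sequential extreme learning machine."""

    def __init__(self, n_inputs, n_hidden, n_outputs, seed=0):
        rng = np.random.default_rng(seed)
        # Random input weights and biases stay fixed after initialization.
        self.W = rng.standard_normal((n_inputs, n_hidden))
        self.b = rng.standard_normal(n_hidden)
        self.beta = np.zeros((n_hidden, n_outputs))
        self.P = None  # running inverse of H^T H

    def _hidden(self, X):
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))  # sigmoid layer

    def init_fit(self, X0, T0):
        # Initialization phase: batch least squares on the first chunk
        # (needs at least n_hidden samples so H0^T H0 is invertible).
        H0 = self._hidden(X0)
        self.P = np.linalg.inv(H0.T @ H0)
        self.beta = self.P @ H0.T @ T0

    def partial_fit(self, X, T):
        # Sequential phase: recursive least-squares update, so earlier
        # samples never need to be stored or revisited.
        H = self._hidden(X)
        K = np.linalg.inv(np.eye(len(X)) + H @ self.P @ H.T)
        self.P -= self.P @ H.T @ K @ H @ self.P
        self.beta += self.P @ H.T @ (T - H @ self.beta)

    def predict(self, X):
        return self._hidden(X) @ self.beta
```

    The sequential phase is what makes the method attractive for an embedded pipeline: each update touches only the new chunk of feature vectors, so memory stays bounded regardless of how long the video stream runs.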

    Heterogeneous computing systems for vision-based multi-robot tracking

    Irwansyah A. Heterogeneous computing systems for vision-based multi-robot tracking. Bielefeld: Universität Bielefeld; 2017.

    Reconfigurable Vision Processing for Player Tracking in Indoor Sports

    Ibraheem OW. Reconfigurable Vision Processing for Player Tracking in Indoor Sports. Bielefeld: Universität Bielefeld; 2018.

    Over the past decade, there has been increasing growth in the use of vision-based systems for tracking players in sports. The tracking results are used to evaluate and enhance the performance of the players as well as to provide detailed information (e.g., on player and team performance) to viewers. Player tracking using vision systems is a very challenging task due to the nature of sports games, which includes severe and frequent interactions (e.g., occlusions) between the players. Additionally, these vision systems have high computational demands, since they must process huge amounts of video data from multiple cameras with high resolution and high frame rate. As a result, most existing systems based on general-purpose computers are not able to perform online real-time player tracking, but instead track the players offline using pre-recorded video files, limiting, e.g., direct feedback on player performance during the game. In this thesis, a reconfigurable vision-based system for automatically tracking the players in indoor sports is presented. The proposed system targets player tracking for basketball and handball games. It processes the incoming video streams from GigE Vision cameras, achieving online real-time player tracking. The teams are identified and the players are detected based on the colors of their jerseys, using background subtraction, color thresholding, and graph clustering techniques. Moreover, the tracking-by-detection approach is used to realize player tracking. FPGA technology is used to handle the compute-intensive vision processing tasks by implementing the video acquisition, video preprocessing, player segmentation, and team identification & player detection in hardware, while the less compute-intensive player tracking is performed on the CPU of a host PC. Player detection and tracking are evaluated using basketball and handball datasets. The results of this work show that the maximum achieved frame rate for the FPGA implementation is 96.7 fps using a Xilinx Virtex-4 FPGA and 136.4 fps using a Virtex-7 device. The player tracking requires an average processing time of 2.53 ms per frame on a host PC equipped with a 2.93 GHz Intel i7-870 CPU. As a result, the proposed reconfigurable system supports a maximum frame rate of 77.6 fps using two GigE Vision cameras with a resolution of 1392x1040 pixels each. Using the FPGA implementation, a speedup by a factor of 15.5 is achieved compared to an OpenCV-based software implementation on a host PC. Additionally, the results show high accuracy for player tracking. In particular, the achieved average precision and recall for player detection are up to 84.02% and 96.6%, respectively. For player tracking, the achieved average precision and recall are up to 94.85% and 94.72%, respectively. Furthermore, the proposed reconfigurable system achieves a 2.4 times higher performance per watt than a software-based implementation (without FPGA support) for player tracking on a host PC.

    Acknowledgments: I (Omar W. Ibraheem) would like to thank the German Academic Exchange Service (DAAD), the Cognitronics and Sensor Systems research group, and the Cluster of Excellence Cognitive Interaction Technology ‘CITEC’ (EXC 277) (Bielefeld University) not only for funding the work in this thesis, but also for all the help and support they gave to successfully finish my thesis.
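
    The detection chain (background subtraction, jersey-color thresholding, and grouping of segments) can be sketched in software with standard OpenCV calls. The HSV color ranges below are invented placeholders to be calibrated per game, and simple connected-component analysis stands in for the graph clustering used in the thesis.

```python
import cv2
import numpy as np

# Hypothetical jersey colors in HSV; the thesis identifies teams by jersey
# color but the exact thresholds are not given in the abstract.
TEAM_HSV_RANGES = {
    "team_a": ((100, 120, 60), (130, 255, 255)),  # blue-ish jerseys
    "team_b": ((0, 120, 60), (10, 255, 255)),     # red-ish jerseys
}

bg_subtractor = cv2.createBackgroundSubtractorMOG2(history=500,
                                                   detectShadows=False)

def detect_players(frame_bgr, min_area=400):
    """Software analogue of the FPGA pipeline: foreground segmentation,
    per-team color thresholding, then connected-component analysis
    (standing in for the graph clustering of the original system)."""
    fg_mask = bg_subtractor.apply(frame_bgr)
    hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
    detections = []
    for team, (lo, hi) in TEAM_HSV_RANGES.items():
        color_mask = cv2.inRange(hsv, np.array(lo), np.array(hi))
        mask = cv2.bitwise_and(fg_mask, color_mask)
        n, _, stats, centroids = cv2.connectedComponentsWithStats(mask)
        for i in range(1, n):  # label 0 is the background
            if stats[i, cv2.CC_STAT_AREA] >= min_area:
                detections.append((team, tuple(centroids[i])))
    return detections
```

    Each stage here is pixel-local and stream-friendly, which is precisely why the thesis maps these steps onto the FPGA and leaves only the per-detection tracking to the host CPU.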

    FPGA-based smart camera mote for pervasive wireless network

    Smart camera networks raise challenging issues in many fields of research, including vision processing, communication protocols, distributed algorithms and power management. The ever-increasing resolution of image sensors entails huge amounts of data, far exceeding the bandwidth of current networks and thus forcing smart camera nodes to process raw data into useful information. Consequently, on-board processing has become a key issue for the expansion of such networked systems. In this context, FPGA-based platforms, supporting massive, fine-grain data parallelism, offer large opportunities. Besides, the concept of a middleware, providing services for networking, data transfer, dynamic loading or hardware abstraction, has emerged as a means of harnessing the hardware and software complexity of smart camera nodes. In this paper, we prospect the development of a new kind of smart camera, wherein FPGAs provide high-performance processing and general-purpose processors support middleware services. In this approach, FPGA devices can be reconfigured at run time through the network, both on explicit user request and by transparent middleware decision. An embedded real-time operating system is in charge of the communication layer, and can thus autonomously decide to use a part of the FPGA as an available processing resource. The classical programmability issue, a significant obstacle when dealing with FPGAs, is addressed by resorting to a domain-specific high-level programming language (CAPH) for describing the operations to be implemented on FPGAs.
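
    As a rough illustration of the run-time decision described above, the toy sketch below routes a vision task either to a dynamically reconfigurable FPGA region or to a software fallback. Every class and name in it is an invented stand-in; the actual system relies on CAPH-generated hardware blocks and an embedded real-time operating system.

```python
# All classes and names below are illustrative stand-ins, not the
# CAPH/middleware API of the paper.

class FpgaRegion:
    """Models one dynamically reconfigurable region of the FPGA."""

    def __init__(self):
        self.loaded = None
        self.busy = False

    def load_partial(self, bitstream):
        self.loaded = bitstream  # stands in for partial reconfiguration

    def run(self, task):
        return f"{task} executed on FPGA ({self.loaded})"


class Middleware:
    """Routes tasks to hardware when possible, to software otherwise."""

    def __init__(self, region, bitstreams, software_impls):
        self.region = region
        self.bitstreams = bitstreams  # may be fetched over the network
        self.software = software_impls

    def dispatch(self, task):
        bitstream = self.bitstreams.get(task)
        if bitstream and not self.region.busy:
            # Transparent middleware decision: reconfigure at run time.
            self.region.load_partial(bitstream)
            return self.region.run(task)
        # Fall back to the general-purpose processor.
        return self.software[task]()


mw = Middleware(FpgaRegion(),
                bitstreams={"sobel": "sobel.bit"},
                software_impls={"sobel": lambda: "sobel in software"})
print(mw.dispatch("sobel"))
```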

    Event-driven industrial robot control architecture for the Adept V+ platform

    Modern industrial robotic systems are highly interconnected. They operate in a distributed environment and communicate with sensors, computer vision systems, mechatronic devices, and computational components. On the fundamental level, communication and coordination between all parties in such a distributed system are characterized by discrete event behavior. The latter is largely attributed to the specifics of communication over the network, which, in turn, facilitates asynchronous programming and explicit event handling. In addition, on the conceptual level, events are an important building block for realizing reactivity and coordination. Event-driven architecture has demonstrated its effectiveness for building loosely coupled systems based on publish-subscribe middleware, either general-purpose or robotics-oriented. Despite all the advances in middleware, industrial robots remain difficult to program in the context of distributed systems, to a large extent due to the limitations of the native robot platforms. This paper proposes an architecture for flexible event-based control of industrial robots based on the Adept V+ platform. The architecture is based on the robot controller providing a TCP/IP server and a collection of robot skills, and a high-level control module deployed to a dedicated computing device. The control module maintains bidirectional communication with the robot controller and publish/subscribe messaging with external systems. It is programmed in an asynchronous style using pyadept, a Python library based on Python coroutines, the AsyncIO event loop and ZeroMQ middleware. The proposed solution facilitates the integration of Adept robots into distributed environments and the building of more flexible robotic solutions with event-based logic.
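
    A minimal sketch of such a control module, assuming invented endpoint addresses and command strings (the real skill protocol is defined by the V+ server and the pyadept library), could combine an AsyncIO TCP client with a ZeroMQ publisher roughly as follows.

```python
import asyncio
import zmq
import zmq.asyncio

# The host, ports, and command strings below are invented for illustration.
ROBOT_HOST, ROBOT_PORT = "192.168.1.10", 43000
PUB_ENDPOINT = "tcp://*:5556"

async def run_skill(reader, writer, command: str) -> str:
    """Send one skill command to the controller and await its reply."""
    writer.write((command + "\n").encode())
    await writer.drain()
    reply = await reader.readline()
    return reply.decode().strip()

async def main():
    ctx = zmq.asyncio.Context()
    pub = ctx.socket(zmq.PUB)  # events for external subscribers
    pub.bind(PUB_ENDPOINT)
    reader, writer = await asyncio.open_connection(ROBOT_HOST, ROBOT_PORT)
    for cmd in ("move_joint 0 45 90 0 90 0", "close_gripper"):
        result = await run_skill(reader, writer, cmd)
        # Publish each outcome as an event on the pub/sub side.
        await pub.send_multipart([b"robot.events", result.encode()])
    writer.close()
    await writer.wait_closed()

if __name__ == "__main__":
    asyncio.run(main())
```

    Running both the TCP dialogue and the event publishing in one event loop is what lets the module stay reactive: no thread ever blocks on the robot while external messages wait.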

    Tools and methods for evaluating the efficiency of video stream transmission based on GigE Vision technology using a general-purpose processor

    The paper investigates the possibility of an efficient implementation of a GigE Vision compatible video stream source on a computing platform based on a system-on-a-chip with general-purpose ARM processor cores. In particular, to implement the aforementioned video source, a prototype of a GigE Vision compatible camera was developed based on the Raspberry Pi 4 single-board computer. This computing platform was chosen due to its widespread use and wide community support. The software part of the camera is implemented using the Video4Linux2 API and the Aravis library. The former is used for the primary image capture from a video sensor connected to the single-board computer. The latter is used for forming and transmitting video stream frames compatible with GigE Vision technology over the network. To estimate the delays in the transmission of a video stream over an Ethernet channel, a methodology based on the Precision Time Protocol (PTP) was proposed and applied. The experiments showed that the software implementation of a GigE Vision compatible camera on single-board computers with general-purpose processor cores is quite promising. Without additional optimization, such an implementation can be successfully used to transmit small frames (with a resolution of up to 640x480 pixels) with a delay of less than 10 ms. At the same time, some additional optimizations may be required to transmit larger frames. Namely, the MTU (maximum transmission unit) size plays a crucial role in latency formation. Thus, to implement a faster camera, it is necessary to select a platform that supports the largest possible MTU (unfortunately, this is not possible with the Raspberry Pi 4, as it supports a relatively small MTU size of up to 2000 bytes). In addition, the image format conversion procedure can noticeably affect the delay. Therefore, it is highly desirable to avoid any frame processing on the transmitter side and, if possible, to transmit raw images. If conversion of the frame format is necessary, the platform should be chosen so that it has free computing cores, which will allow all necessary frame conversions to be distributed between these cores using parallelization techniques.
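
    The delay-measurement idea can be sketched as follows, assuming both ends share a PTP-synchronized clock. The 8-byte timestamp header is an illustrative stand-in, not the actual GigE Vision streaming (GVSP) packet format, and time.time() stands in for a true PTP-disciplined clock source.

```python
import struct
import time

# One big-endian double carrying the capture timestamp (assumed layout).
HEADER = struct.Struct("!d")

def stamp_frame(frame_bytes: bytes) -> bytes:
    """Sender side: prepend the capture timestamp to the frame."""
    return HEADER.pack(time.time()) + frame_bytes

def measure_latency(packet: bytes) -> tuple[float, bytes]:
    """Receiver side: one-way delay, valid only if both clocks are synced."""
    (t_capture,) = HEADER.unpack_from(packet)
    return time.time() - t_capture, packet[HEADER.size:]

latency, frame = measure_latency(stamp_frame(b"\x00" * (640 * 480)))
print(f"one-way delay: {latency * 1e3:.3f} ms")
```

    The one-way measurement only makes sense because PTP bounds the clock offset between sender and receiver well below the millisecond-scale delays being measured.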

    3D image acquisition and processing with high continuous data throughput for human-machine-interaction and adaptive manufacturing

    Many applications in industrial environments need to detect structures in measurement volumes ranging from the macroscopic to the microscopic scale. One way to process the resulting image data and to compute three-dimensional (3D) images is the use of active stereo vision technology. In this context, one of the main challenges is dealing with the permanently increasing amount of data. This paper describes methods for handling the required data throughput for 3D image acquisition in active stereo vision systems. The main focus is on implementing the steps of the image processing chain on reconfigurable hardware. Among other things, this includes the preprocessing step with the correction of distortion and the rectification of incoming image data. The approach uses offline pre-calculation of the rectification maps; with the aid of these maps, each image is rectified directly during image acquisition. Afterwards, a combined FPGA- and GPU-based approach is selected for optimal performance of stereo matching and 3D point calculation.
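
    A software analogue of this preprocessing step is straightforward with OpenCV: the maps are computed once, offline, from the calibration data, and each incoming frame is then rectified with a single interpolated lookup. The camera matrix and distortion values below are dummy placeholders for real calibration results.

```python
import cv2
import numpy as np

K = np.array([[1200.0, 0, 640], [0, 1200.0, 512], [0, 0, 1]])  # intrinsics
dist = np.array([-0.2, 0.05, 0, 0, 0])  # distortion coefficients
R = np.eye(3)            # rectifying rotation, e.g. from cv2.stereoRectify
size = (1280, 1024)      # image width x height

# Offline: pre-calculate the undistortion/rectification maps once.
map_x, map_y = cv2.initUndistortRectifyMap(K, dist, R, K, size, cv2.CV_32FC1)

def rectify(frame: np.ndarray) -> np.ndarray:
    # Online: per-frame work reduces to one table lookup per pixel,
    # which is what makes the step amenable to streaming hardware.
    return cv2.remap(frame, map_x, map_y, cv2.INTER_LINEAR)

rectified = rectify(np.zeros((1024, 1280), dtype=np.uint8))
```

    Because the online step is a pure per-pixel lookup with fixed coefficients, the same maps can be stored on the reconfigurable hardware and applied at line rate during acquisition.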

    Data Aggregation through Web Service Composition in Smart Camera Networks

    Distributed Smart Camera (DSC) networks are power-constrained, real-time, distributed embedded systems that perform computer vision using multiple cameras. Providing the data aggregation techniques that are critical for running complex image processing algorithms on DSCs is a challenging task due to the complexity of video and image data. Providing highly desirable SQL APIs for sophisticated query processing in DSC networks is also challenging for similar reasons. Research on DSCs to date has not addressed these two problems. In this thesis, we develop a novel SOA-based middleware framework on a DSC network that uses Distributed OSGi to expose DSC network services as web services. We also develop a novel web service composition scheme that aids in data aggregation, and an SQL query interface for DSC networks that allows sophisticated query processing. We validate our service orchestration concept for data aggregation by providing a query primitive for face detection in a smart camera network.
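
    As an illustration of the kind of aggregation such an SQL interface enables, the sketch below uses sqlite3 as a stand-in for the thesis's Distributed-OSGi/web-service layer; the detections schema and the sample rows are invented purely to show the sort of query a face-detection primitive could serve.

```python
import sqlite3

# In-memory table standing in for detections reported by camera nodes.
db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE detections (camera_id INTEGER, ts REAL, faces INTEGER);
    INSERT INTO detections VALUES (1, 0.0, 2), (1, 1.0, 3), (2, 0.5, 1);
""")

# Aggregate across cameras: "how many faces did each camera report?"
for camera_id, total in db.execute(
        "SELECT camera_id, SUM(faces) FROM detections GROUP BY camera_id"):
    print(camera_id, total)
```

    The point of the middleware is that a query like this runs against live services composed across the network, rather than against a local table as in this stand-in.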

    A Delay Tolerant Networking-Based Approach to a High Data Rate Architecture for Spacecraft

    Historically, size, weight, and power (SWaP) constraints on radios were so severe that the links between spacecraft and the ground were relatively slow. This meant that the radio link was normally a significant bottleneck in returning scientific data. Over recent years, however, a combination of more efficient radio designs, intelligent waveforms, and highly directed, high-frequency RF/optical systems has led to a rapid increase in the amount of data that can be pushed through radio and optical links. This has led to some cases where the radio links are capable of moving data much more quickly than the spacecraft and instruments are capable of actually generating it! In some instances, scientific data can therefore be lost not because the downlink is too slow to support the data rate, but because the spacecraft was not designed in a way that would let it fully utilize both the radio and the networking services available to it. The High Data Rate Architecture (HiDRA) project describes a packet-based approach to building modern, distributed spacecraft systems. It presents a means for spacecraft and other assets to participate in both present and future Delay Tolerant Networks (DTN), while simultaneously ensuring that the asset is able to fully utilize the new high-speed links that have seen more widespread development and deployment in recent years. With this in mind, this paper begins with a discussion of HiDRA's evolution. Next, it discusses the capabilities and limitations of NASA's present DTN-enabled networks. Of particular note is the way in which principles of terrestrial network design (e.g., the use of programmable/software-defined networks, the separation between data and control planes, and the infusion of COTS Ethernet switch chips) can be translated into the space environment as well. After this, the paper discusses the design and implementation of a present prototype reference implementation of High-Rate DTN (HDTN), which is intended to demonstrate future high-rate networking concepts as part of a coherent demonstration on the International Space Station (ISS). The goal, of both the research and of this implementation, is to help develop a ready-made toolbox of ideas, approaches, and examples from which mission designers can draw when putting together new missions. Assuming all goes as planned, this should not only reduce the cost of individual mission design, but also improve the rate at which science data can be returned for mission participants to review.
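
    The store-and-forward behavior that makes DTN a good fit for this rate mismatch can be illustrated with a toy model (entirely illustrative; it is not the HDTN implementation): bundles are held in storage until a contact can carry them, decoupling data generation from link availability.

```python
from collections import deque

class DtnNode:
    """Toy DTN node: bundles wait in storage until a contact opens."""

    def __init__(self):
        self.storage = deque()  # stands in for nonvolatile bundle storage

    def receive(self, bundle: bytes):
        # Take custody; never drop data just because no link is up.
        self.storage.append(bundle)

    def contact(self, link_budget_bytes: int) -> list[bytes]:
        """Forward as many stored bundles as the contact can carry."""
        sent, budget = [], link_budget_bytes
        while self.storage and len(self.storage[0]) <= budget:
            bundle = self.storage.popleft()
            budget -= len(bundle)
            sent.append(bundle)
        return sent

node = DtnNode()
node.receive(b"science" * 100)   # data generated while no link is up
node.receive(b"telemetry" * 10)
print(len(node.contact(10_000)), "bundles forwarded during the contact")
```

    In this model the instrument and the downlink never need to be active at the same moment, which is exactly the property that lets a high-rate link be used at full capacity whenever it becomes available.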