134 research outputs found

    FPGA implementation of an embedded face detection system based on LEON3

    This paper presents an FPGA-based embedded face detection system. To accelerate the face detection process, a hardware-software codesign technique is proposed. The paper describes the face detection acceleration mechanism and the implementation of an IP module that enables hardware acceleration. Funding: Comisión Europea MOBY-DIC FP7-IST-248858; Ministerio de Ciencia y Tecnología TEC2011-24319; Junta de Andalucía P08-TIC-0367.

    EfficientNet-Lite and Hybrid CNN-KNN Implementation for Facial Expression Recognition on Raspberry Pi

    Facial expression recognition (FER) is the task of determining a person’s current emotion. It plays an important role in healthcare, marketing, and counselling. With advances in deep learning algorithms such as Convolutional Neural Networks (CNNs), the accuracy of FER systems is improving. A hybrid CNN and k-Nearest Neighbour (KNN) model can improve FER accuracy further. This paper presents a hybrid CNN-KNN model for FER on the Raspberry Pi 4, in which the CNN performs feature extraction and the KNN then performs expression recognition. We use transfer learning to build our system on an EfficientNet-Lite model. The proposed hybrid model replaces the Softmax layer in the EfficientNet with the KNN. We train our model on the FER-2013 dataset and compare its performance with different architectures trained on the same dataset. We optimize the Fully Connected layer, loss function, loss optimizer, optimizer learning rate, class weights, and the KNN distance function together with the k-value. Despite running on Raspberry Pi hardware with very limited processing power, low memory capacity, and small storage capacity, our proposed model achieves an accuracy of 75.26%, similar to (with a slight improvement of 0.06% over) the state-of-the-art ensemble of 8 CNN models.
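
The KNN stage described in this abstract can be illustrated with a minimal sketch. It assumes the CNN has already produced feature vectors; the toy 2-D features, the labels, the function name `knn_predict`, and the choice of Euclidean distance are all hypothetical and are not taken from the paper, which tunes the distance function and k-value.

```python
import math
from collections import Counter

def knn_predict(train_features, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training features."""
    # Euclidean distance is an assumption; the paper treats the distance
    # function and k as tunable hyperparameters.
    distances = [
        (math.dist(feat, query), label)
        for feat, label in zip(train_features, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]

# Toy 2-D "embeddings" standing in for CNN feature vectors (hypothetical data).
features = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (1.0, 0.9), (0.9, 1.1), (1.1, 1.0)]
labels = ["happy", "happy", "happy", "sad", "sad", "sad"]

prediction = knn_predict(features, labels, (0.15, 0.1), k=3)
```

In a real pipeline the feature vectors would come from the layer preceding the removed Softmax layer, and k would be chosen by validation.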

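    Kodizajn arhitekture i algoritama za lokalizaciju mobilnih robota i detekciju prepreka baziranih na modelu (Co-design of architecture and algorithms for mobile robot localization and model-based obstacle detection)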

    This thesis proposes SoPC (System on a Programmable Chip) architectures for efficiently embedding vision-based localization and obstacle detection tasks in the navigational pipeline of autonomous mobile robots. The obtained results are equivalent to or better than the state of the art. For localization, an efficient hardware architecture is developed that supports EKF-SLAM's local map management with seven-dimensional landmarks in real time. For obstacle detection, a novel object recognition method is proposed: a detection-by-identification framework based on a single detection window scale. This framework allows adequate algorithmic precision and execution speed on embedded hardware platforms.


    Today, the implementation of machine vision algorithms on embedded platforms or in portable systems is growing rapidly due to the demand for machine vision in daily human life. Among the applications of machine vision, human action and activity recognition has become an active research area, and market demand for integrated smart security systems is growing rapidly. Among the available approaches, embedded vision is in the top tier; however, current embedded platforms may not be able to fully exploit the potential performance of machine vision algorithms, especially in terms of low power consumption. Complex algorithms can impose immense computation and communication demands, especially action recognition algorithms, which require various stages of preprocessing, processing and machine learning blocks that need to operate concurrently. The market demands embedded platforms that operate with a power consumption of only a few watts. Attempts have been made to improve the performance of traditional embedded approaches by adding more powerful processors; this solution may solve the computation problem but increases the power consumption. System-on-a-chip field-programmable gate arrays (SoC-FPGAs) have emerged as a major architectural approach for improving power efficiency while increasing computational performance. In a SoC-FPGA, an embedded processor and an FPGA serving as an accelerator are fabricated on the same die to simultaneously improve power consumption and performance. Still, current SoC-FPGA-based vision implementations either shy away from supporting complex and adaptive vision algorithms or operate at very limited resolutions due to the immense communication and computation demands. The aim of this research is to develop a SoC-based hardware acceleration workflow for the realization of advanced vision algorithms. Hardware acceleration can improve performance for highly complex mathematical calculations or repeated functions. 
The performance of a SoC system can thus be improved by using a hardware acceleration method to accelerate the element that incurs the highest performance overhead. The outcome of this research could be used for the implementation of various vision algorithms, such as face recognition, object detection or object tracking, on embedded platforms. The contributions of SoC-based hardware acceleration for hardware-software codesign platforms include the following: (1) development of frameworks for complex human action recognition in both 2D and 3D; (2) realization of a framework with four main implemented IPs, namely, foreground and background subtraction (foreground probability), human detection, 2D/3D point-of-interest detection and feature extraction, and OS-ELM as a machine learning algorithm for action identification; (3) use of an FPGA-based hardware acceleration method to resolve system bottlenecks and improve system performance; and (4) measurement and analysis of system specifications, such as the acceleration factor, power consumption, and resource utilization. Experimental results show that the proposed SoC-based hardware acceleration approach outperforms recent works in terms of acceleration factor, resource utilization and power consumption. In addition, a comparison of the accuracy of the framework running on the proposed embedded platform (SoC-FPGA) with the accuracy of other PC-based frameworks shows that the proposed approach outperforms most other approaches.

    Embedded Machine Learning: Emphasis on Hardware Accelerators and Approximate Computing for Tactile Data Processing

    Machine Learning (ML), a subset of Artificial Intelligence (AI), is driving the industrial and technological revolution of the present and future. We envision a world with smart devices that are able to mimic human behavior (sense, process, and act) and perform tasks that at one time we thought could only be carried out by humans. The vision is to achieve such a level of intelligence with affordable, power-efficient, and fast hardware platforms. However, embedding machine learning algorithms in many application domains such as the internet of things (IoT), prostheses, robotics, and wearable devices is an ongoing challenge. This challenge is governed by the computational complexity of ML algorithms, the performance and availability of hardware platforms, and the application's budget (power constraint, real-time operation, etc.). In this dissertation, we focus on the design and implementation of efficient ML algorithms to handle the aforementioned challenges. First, we apply Approximate Computing Techniques (ACTs) to reduce the computational complexity of ML algorithms. Then, we design custom Hardware Accelerators to improve the performance of the implementation within a specified budget. Finally, a tactile data processing application is adopted for the validation of the proposed exact and approximate embedded machine learning accelerators. The dissertation starts with an introduction to the various ML algorithms used for tactile data processing. These algorithms are assessed in terms of their computational complexity and the available hardware platforms that could be used for implementation. Afterward, a survey of existing approximate computing techniques and hardware accelerator design methodologies is presented. Based on the findings of the survey, an approach for applying algorithmic-level ACTs to machine learning algorithms is provided. 
Then, three novel hardware accelerators are proposed: (1) k-Nearest Neighbor (kNN) based on a selection-based sorter, (2) Tensorial Support Vector Machine (TSVM) based on Shallow Neural Networks, and (3) Hybrid Precision Binary Convolution Neural Network (BCNN). The three accelerators offer real-time classification with substantial reductions in hardware resources and power consumption compared to existing implementations targeting the same tactile data processing application on FPGA. Moreover, the approximate accelerators maintain a high classification accuracy with a loss of at most 5%.


    In recent years, environmental awareness has been growing. The automotive industry is often criticized because it is considered one of the largest environmental polluters. For these reasons, energy systems are increasingly turning towards renewable energy sources, and transport systems towards electrification through hybrid and electric vehicles. 
Although most of the components that constitute the hybrid system have long been known and their development is reaching its peak, the best and most efficient control strategy for such systems has not yet been established, especially for those with relatively low-power electric motors (so-called mild hybrid electric vehicles). Here, the term control strategy refers to the high-level control strategy for the hybrid powertrain, which coordinates the hybrid system components with the aim of reducing fuel consumption. For this reason, this paper deals with one possible approach to high-level control strategy development for a mild hybrid electric vehicle in a parallel P2 configuration. A rule-based control strategy is developed, which, depending on the given rules, defines the operating mode and parameters of the hybrid powertrain components. Within the control strategy, the so-called hybrid vehicle functionalities are defined according to this rule base. The functionalities considered are: eCreep (low-speed electric driving), eCoasting and eSailing (passive and active electric cruising), and regenerative braking. The developed control strategy is implemented and verified within the simulation package AVL CRUISE, with simulation testing focused on the analysis of fuel consumption reduction and the driveability of the vehicle.
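
A rule-based supervisory strategy of the kind this abstract describes can be sketched as an ordered list of conditions that selects an operating mode. Everything below is illustrative: the function name `select_mode`, the input signals, the thresholds, the `engine_drive` fallback mode, and the reading of eCoasting as passive and eSailing as active electric cruising are assumptions, not the rules used in the paper.

```python
def select_mode(speed_kmh, accel_pedal, brake_pedal, soc):
    """Return an operating mode for a P2 mild hybrid (illustrative rules only).

    Pedal positions and battery state of charge (soc) are fractions in [0, 1].
    """
    # All thresholds below are hypothetical, not values from the paper.
    SOC_MIN = 0.30          # below this, keep the engine on to protect the battery
    CREEP_SPEED_KMH = 10.0  # upper speed limit for low-speed electric driving
    LIGHT_PEDAL = 0.10      # pedal position treated as "light demand"

    # Rules are checked in priority order; the first match wins.
    if brake_pedal > 0.0 and speed_kmh > 0.0:
        return "regenerative_braking"
    if soc < SOC_MIN:
        return "engine_drive"
    if speed_kmh <= CREEP_SPEED_KMH and accel_pedal <= LIGHT_PEDAL:
        return "eCreep"
    if accel_pedal == 0.0:
        return "eCoasting"   # assumed passive: no propulsion torque requested
    if accel_pedal <= LIGHT_PEDAL:
        return "eSailing"    # assumed active: e-motor holds speed at light demand
    return "engine_drive"
```

For example, `select_mode(80, 0.0, 0.0, 0.8)` falls through the braking, state-of-charge and creep rules and selects `eCoasting`. A production strategy would add hysteresis so that the mode does not chatter around a threshold.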

    Energy efficient enabling technologies for semantic video processing on mobile devices

    Semantic object-based processing will play an increasingly important role in future multimedia systems due to the ubiquity of digital multimedia capture/playback technologies and increasing storage capacity. Although the object-based paradigm has many undeniable benefits, numerous technical challenges remain before such applications become pervasive, particularly on computationally constrained mobile devices. A fundamental issue is the ill-posed problem of semantic object segmentation. Furthermore, on battery-powered mobile computing devices, the additional algorithmic complexity of semantic object-based processing compared to conventional video processing is highly undesirable from both a real-time operation and a battery life perspective. This thesis attempts to tackle these issues by first constraining the solution space and focusing on the human face as a primary semantic concept of use to users of mobile devices. A novel face detection algorithm is proposed, which was designed from the outset to be amenable to offloading from the host microprocessor to dedicated hardware, thereby providing real-time performance and reducing power consumption. The algorithm uses an Artificial Neural Network (ANN), whose topology and weights are evolved via a genetic algorithm (GA). The computational burden of the ANN evaluation is offloaded to a dedicated hardware accelerator, which is capable of processing any evolved network topology. Efficient arithmetic circuitry, which leverages modified Booth recoding, column compressors and carry-save adders, is adopted throughout the design. To tackle the increased computational costs associated with object tracking or object-based shape encoding, a novel energy-efficient binary motion estimation architecture is proposed. Energy is reduced in the proposed motion estimation architecture by minimising the redundant operations inherent in the binary data. Both architectures are shown to compare favourably with the relevant prior art.
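
Modified (radix-4) Booth recoding, mentioned above as part of the arithmetic circuitry, can be illustrated in software. The sketch below shows the standard recoding only and says nothing about the thesis's actual circuit; the function names `booth_radix4_digits` and `booth_multiply` and the 8-bit default width are assumptions chosen for illustration.

```python
def booth_radix4_digits(multiplier, bits):
    """Recode a `bits`-wide two's-complement multiplier into radix-4 digits.

    Each digit lies in {-2, -1, 0, 1, 2}, so a hardware multiplier needs only
    bits/2 partial products instead of one per multiplier bit.
    """
    if bits % 2 != 0:
        raise ValueError("bits must be even for radix-4 recoding")
    y = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0                            # implicit bit y[-1] = 0
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        # Digit value from the overlapping bit triplet (y[i+1], y[i], y[i-1]).
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits  # least-significant digit first

def booth_multiply(multiplicand, multiplier, bits=8):
    """Multiply by summing one shifted partial product per Booth digit."""
    digits = booth_radix4_digits(multiplier, bits)
    return sum(d * multiplicand * (4 ** i) for i, d in enumerate(digits))
```

The multiplier must fit in `bits` as a signed value (for 8 bits, from -128 to 127). In hardware, each digit selects 0, ±1× or ±2× the multiplicand, which takes only a shift and an optional negation.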