Enhancing Performance of Computer Vision Applications on Low-Power Embedded Systems Through Heterogeneous Parallel Programming

Abstract

Enabling computer vision applications on low-power embedded systems gives rise to new challenges for embedded SW developers. Such applications implement different functionalities, like image recognition based on deep learning, simultaneous localization and mapping tasks. They are characterized by stringent performance constraints to guarantee real-time behaviors and, at the same time, energy constraints to save battery on the mobile platform. Even though heterogeneous embedded boards are getting pervasive for their high computational power at low power costs, they need a time consuming customization of the whole application (i.e., mapping of application blocks to CPUGPU processing elements and their synchronization) to efficiently exploit their potentiality. Different languages and environments have been proposed for such an embedded SW customization. Nevertheless, they often find limitations on complex real cases, as their application is mutual exclusive. This paper presents a comprehensive framework that relies on a heterogeneous parallel programming model, which combines OpenMP, PThreads, OpenVX, OpenCV, and CUDA to best exploit different levels of parallelism while guaranteeing a semi-automatic customization. The paper shows how such languages and API platforms have been interfaced, synchronized, and applied to customize an ORBSLAM application for an NVIDIA Jetson TX2 board

    Similar works