71,722 research outputs found
Extending OpenVX for Model-based Design of Embedded Vision Applications
Developing computer vision applications for lowpower heterogeneous systems is increasingly gaining interest in the embedded systems community. Even more interesting is the tuning of such embedded software for the target architecture when this is driven by multiple constraints (e.g., performance, peak power, energy consumption). Indeed, developers frequently run into system-level inefficiencies and bottlenecks that can not be quickly addressed by traditional methods. In this context OpenVX has been proposed as the standard platform to develop portable, optimized and powerefficient applications for vision algorithms targeting embedded systems. Nevertheless, adopting OpenVX for rapid prototyping, early algorithm parametrization and validation of complex embedded applications is a very challenging task. This paper presents a methodology to integrate a model-based design environment to OpenVX. The methodology allows applying Matlab/Simulink for the model-based design, parametrization, and validation of computer vision applications. Then, it allows for the automatic synthesis of the application model into an OpenVX description for the hardware and constraints-aware application tuning. Experimental results have been conducted with an application for digital image stabilization developed through Simulink and, then, automatically synthesized into OpenVX-VisionWorks code for an NVIDIA Jetson TX1 boar
A Survey on Compiler Autotuning using Machine Learning
Since the mid-1990s, researchers have been trying to use machine-learning
based approaches to solve a number of different compiler optimization problems.
These techniques primarily enhance the quality of the obtained results and,
more importantly, make it feasible to tackle two main compiler optimization
problems: optimization selection (choosing which optimizations to apply) and
phase-ordering (choosing the order of applying optimizations). The compiler
optimization space continues to grow due to the advancement of applications,
increasing number of compiler optimizations, and new target architectures.
Generic optimization passes in compilers cannot fully leverage newly introduced
optimizations and, therefore, cannot keep up with the pace of increasing
options. This survey summarizes and classifies the recent advances in using
machine learning for the compiler optimization field, particularly on the two
major problems of (1) selecting the best optimizations and (2) the
phase-ordering of optimizations. The survey highlights the approaches taken so
far, the obtained results, the fine-grain classification among different
approaches and finally, the influential papers of the field.Comment: version 5.0 (updated on September 2018)- Preprint Version For our
Accepted Journal @ ACM CSUR 2018 (42 pages) - This survey will be updated
quarterly here (Send me your new published papers to be added in the
subsequent version) History: Received November 2016; Revised August 2017;
Revised February 2018; Accepted March 2018
Learning to infer: RL-based search for DNN primitive selection on Heterogeneous Embedded Systems
Deep Learning is increasingly being adopted by industry for computer vision
applications running on embedded devices. While Convolutional Neural Networks'
accuracy has achieved a mature and remarkable state, inference latency and
throughput are a major concern especially when targeting low-cost and low-power
embedded platforms. CNNs' inference latency may become a bottleneck for Deep
Learning adoption by industry, as it is a crucial specification for many
real-time processes. Furthermore, deployment of CNNs across heterogeneous
platforms presents major compatibility issues due to vendor-specific technology
and acceleration libraries. In this work, we present QS-DNN, a fully automatic
search based on Reinforcement Learning which, combined with an inference engine
optimizer, efficiently explores through the design space and empirically finds
the optimal combinations of libraries and primitives to speed up the inference
of CNNs on heterogeneous embedded devices. We show that, an optimized
combination can achieve 45x speedup in inference latency on CPU compared to a
dependency-free baseline and 2x on average on GPGPU compared to the best vendor
library. Further, we demonstrate that, the quality of results and time
"to-solution" is much better than with Random Search and achieves up to 15x
better results for a short-time search
Spatial Multiplexing of QPSK Signals with a Single Radio: Antenna Design and Over-the-Air Experiments
The paper describes the implementation and performance analysis of the first
fully-operational beam-space MIMO antenna for the spatial multiplexing of two
QPSK streams. The antenna is composed of a planar three-port radiator with two
varactor diodes terminating the passive ports. Pattern reconfiguration is used
to encode the MIMO information onto orthogonal virtual basis patterns in the
far-field. A measurement campaign was conducted to compare the performance of
the beam-space MIMO system with a conventional 2-by-?2 MIMO system under
realistic propagation conditions. Propagation measurements were conducted for
both systems and the mutual information and symbol error rates were estimated
from Monte-Carlo simulations over the measured channel matrices. The results
show the beam-space MIMO system and the conventional MIMO system exhibit
similar finite-constellation capacity and error performance in NLOS scenarios
when there is sufficient scattering in the channel. In comparison, in LOS
channels, the capacity performance is observed to depend on the relative
polarization of the receiving antennas.Comment: 31 pages, 23 figure
Robustness analysis of evolutionary controller tuning using real systems
A genetic algorithm (GA) presents an excellent method for controller parameter tuning. In our work, we evolved the heading as well as the altitude controller for a small lightweight helicopter. We use the real flying robot to evaluate the GA's individuals rather than an artificially consistent simulator. By doing so we avoid the ldquoreality gaprdquo, taking the controller from the simulator to the real world. In this paper we analyze the evolutionary aspects of this technique and discuss the issues that need to be considered for it to perform well and result in robust controllers
Control design toolbox for large scale variable speed pitch regulated wind turbines
The trend towards large multi-MW wind turbineshas given new impetus to the development of wind turbine controllers.Additional objectives are being placed on the controllermaking the specification of the control system more complex. A new toolbox, which assists with most of the control design cycle,has been developed. Its purpose is to assist and guide the control system designer through the design cycle, thereby enabling faster design. With the choice of control strategy unrestricted,the toolbox is sufficiently flexible to support the design processfor the aforementioned more complex specifications
Data path analysis for dynamic circuit specialisation
Dynamic Circuit Specialisation (DCS) is a method that exploits the reconfigurability of modern FPGAs to allow the specialisation of FPGA circuits at run-time. Currently, it is only explored as part of Register-transfer level design. However, at the Register-transfer level (RTL), a large part of the design is already locked in. Therefore, maximally exploiting the opportunities of DCS could require a costly redesign. It would be interesting to already have insight in the opportunities for DCS from the higher abstraction level. Moreover, the general design trend in FPGA design is to work on higher abstraction levels and let tool(s) translate this higher level description to RTL. This paper presents the first profiler that, based on the high-level description of an application, estimates the benefits of an implementation using DCS. This allows a designer to determine much earlier in the design cycle whether or not DCS would be interesting. The high-level profiling methodology was implemented and tested on a set of PID designs
- …