
    Feasibility study and porting of the damped least square algorithm on FPGA

    Modern embedded computing platforms used within Cyber-Physical Systems (CPS) increasingly rely on heterogeneous computing substrates, such as the newest Field Programmable Gate Array (FPGA) devices. Compared to general-purpose platforms, which have a fixed datapath, FPGAs let designers customize part of the computing infrastructure to better match the execution to the application's needs and features, and they offer high efficiency in terms of timing and power while naturally featuring parallelism. In the context of FPGA-based CPSs, this article has a twofold mission. On the one hand, it presents an analysis of the Damped Least Square (DLS) algorithm for a prospective hardware implementation. On the other hand, it describes the implementation of a robotic arm controller based on DLS that numerically solves Inverse Kinematics problems on a heterogeneous FPGA. Assessments involve a Trossen Robotics WidowX robotic arm controlled by a Digilent ZedBoard, whose Xilinx Zynq FPGA computes the Inverse Kinematics.
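The DLS iteration the abstract refers to can be sketched in a few lines. This is a minimal software illustration for a hypothetical planar 2-link arm, not the paper's FPGA implementation; the link lengths, damping factor `lam`, target point, and function names are all assumptions. Each step applies the standard damped least squares update Δθ = Jᵀ(JJᵀ + λ²I)⁻¹e, where e is the Cartesian position error:

```python
import math

def fk(t1, t2, l1=1.0, l2=1.0):
    # forward kinematics of a planar 2-link arm
    x = l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    y = l1 * math.sin(t1) + l2 * math.sin(t1 + t2)
    return x, y

def dls_step(t1, t2, tx, ty, lam=0.1, l1=1.0, l2=1.0):
    # Jacobian of the forward kinematics
    j11 = -l1 * math.sin(t1) - l2 * math.sin(t1 + t2)
    j12 = -l2 * math.sin(t1 + t2)
    j21 =  l1 * math.cos(t1) + l2 * math.cos(t1 + t2)
    j22 =  l2 * math.cos(t1 + t2)
    x, y = fk(t1, t2, l1, l2)
    ex, ey = tx - x, ty - y            # Cartesian error e
    # A = J J^T + lam^2 I is 2x2, so A z = e is solved analytically
    a11 = j11 * j11 + j12 * j12 + lam * lam
    a12 = j11 * j21 + j12 * j22
    a22 = j21 * j21 + j22 * j22 + lam * lam
    det = a11 * a22 - a12 * a12
    z1 = ( a22 * ex - a12 * ey) / det
    z2 = (-a12 * ex + a11 * ey) / det
    # delta_theta = J^T z
    return t1 + j11 * z1 + j21 * z2, t2 + j12 * z1 + j22 * z2

# iterate toward a reachable target (1.2, 0.8)
t1, t2 = 0.3, 0.6
for _ in range(100):
    t1, t2 = dls_step(t1, t2, 1.2, 0.8)
x, y = fk(t1, t2)
```

The damping term λ²I is what distinguishes DLS from the plain Jacobian pseudoinverse: it keeps the update bounded near kinematic singularities, at the cost of slightly slower convergence away from them.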

    Adaptive Multicore Scheduling for the LTE Uplink

    The next-generation cellular system of 3GPP is named Long Term Evolution (LTE). Every millisecond, an LTE base station receives information from up to one hundred users. Multicore heterogeneous embedded systems with Digital Signal Processors (DSP) and coprocessors are power-efficient solutions for decoding LTE uplink signals in base stations. The LTE uplink is a highly variable workload: its multicore schedule must be adapted every millisecond to the number of connected users and to the data rates they require. To solve the problem of dynamic deployment while maintaining low latency, one approach is to compute efficient schedules on the fly, using techniques such as graph generation and scheduling; this is opposed to static scheduling of predefined cases. We show that the static approach is not suitable for the LTE uplink, and that present DSP cores are powerful enough to recompute an efficient adaptive schedule in real time for the most complex LTE uplink cases.
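The idea of recomputing a schedule every millisecond as the user count changes can be illustrated with a toy greedy list scheduler. This is only a sketch of the general technique, not the paper's scheduler; the per-user decode cost, core count, and function names are assumptions:

```python
def schedule(task_costs, n_cores):
    """Greedy list scheduling: assign each task to the least-loaded core.

    Returns (assignment, makespan); longest-task-first ordering improves
    the greedy fit. Costs are in arbitrary time units.
    """
    finish = [0.0] * n_cores
    assignment = []
    for tid, cost in sorted(enumerate(task_costs), key=lambda t: -t[1]):
        core = min(range(n_cores), key=lambda c: finish[c])
        finish[core] += cost
        assignment.append((tid, core))
    return assignment, max(finish)

# each millisecond the user count changes, so the schedule is rebuilt
for n_users in (10, 100, 40):                 # hypothetical load pattern
    costs = [1.0] * n_users                   # assumed uniform decode cost
    _, makespan = schedule(costs, 8)          # e.g. 8 DSP cores
```

The point the abstract makes is that even this recomputation step, done with a realistic dataflow-graph model rather than a flat task list, fits in the millisecond budget on current DSP cores.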

    nn-X - a hardware accelerator for convolutional neural networks

    Convolutional neural networks (ConvNets) are hierarchical models of the mammalian visual cortex. These models have been increasingly used in computer vision to perform object recognition and full scene understanding. ConvNets consist of multiple layers that contain groups of artificial neurons, which are mathematical approximations of biological neurons. A ConvNet can consist of millions of neurons and require billions of computations to produce one output. Currently, giant server farms are used to process information in real time. These supercomputers require a large amount of power and a constant link to the end user. Low-powered embedded systems are not able to run convolutional neural networks in real time; using them on mobile platforms, or on platforms where a connection to an off-site server is not guaranteed, is therefore infeasible. In this work we present nn-X, a scalable hardware architecture capable of processing ConvNets in real time. We evaluate the performance and power consumption of this architecture and compare it with systems typically used to process convolutional neural networks. Our system is prototyped on the Xilinx Zynq XC7Z045 device, where it achieves a peak performance of 227 GOPs/s and a measured performance of up to 200 GOPs/s while consuming less than 3 W of power. This translates to a performance-per-power improvement of up to 10 times over conventional embedded systems and up to 25 times over performance systems such as desktops and GPUs.
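The efficiency claim can be unpacked with simple arithmetic. The nn-X figure follows directly from the numbers above; the comparison-system efficiencies below are only implied values derived from the stated 10x and 25x ratios, not independently measured numbers:

```python
# nn-X: 200 GOPs/s measured at under 3 W
nnx_eff = 200.0 / 3.0           # ~66.7 GOPs/s per watt
embedded_eff = nnx_eff / 10.0   # implied ~6.7 GOPs/s/W for embedded systems
gpu_eff = nnx_eff / 25.0        # implied ~2.7 GOPs/s/W for desktops and GPUs
```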