2,368 research outputs found

    Benchmarking CPUs and GPUs on embedded platforms for software receiver usage

    Get PDF
    Smartphones containing multi-core central processing units (CPUs) and powerful many-core graphics processing units (GPUs) bring supercomputing technology into your pocket (or into our embedded devices). This can be exploited to produce power-efficient, customized receivers with flexible correlation schemes and more advanced positioning techniques. For example, promising techniques such as the Direct Position Estimation paradigm or usage of tracking solutions based on particle filtering, seem to be very appealing in challenging environments but are likewise computationally quite demanding. This article sheds some light onto recent embedded processor developments, benchmarks Fast Fourier Transform (FFT) and correlation algorithms on representative embedded platforms and relates the results to the use in GNSS software radios. The use of embedded CPUs for signal tracking seems to be straight forward, but more research is required to fully achieve the nominal peak performance of an embedded GPU for FFT computation. Also the electrical power consumption is measured in certain load levels.Peer ReviewedPostprint (published version

    Parallelised max-log-MAP model

    Get PDF
    A paralleliscd max-Log-MAP model (P-max-Log-MAP) that exploits the sub-word parallelism and very long instruction word architccture of a microprocessor or a digital signal processor (DSP) is presented. The proposed model rcduccs considerably thc computational complexity of the max-Log-MAP algorithm; valid therefore facilitates easy implementation

    Low power digital signal processing

    Get PDF

    Characteristics of homogeneous multi-core fibers for SDM transmission

    Get PDF
    We describe optical data transmission systems using homogeneous, single-mode, multi-core fibers (MCFs). We first briefly discuss space-division multiplexing (SDM) fibers, observing that no individual SDM fiber offers overwhelming advantages over bundles of single-mode fiber (SMF) across all transmission regimes. We note that for early adoption of SDM fibers, uncoupled or weakly coupled fibers which are compatible with existing SDM infrastructure have a practical advantage. Yet, to be more attractive than parallel SMF, it is also necessary to demonstrate benefits beyond improved spatial spectral efficiency. It is hoped that the lower spread of propagation delays (skew) between spatial channels in some fibers can be exploited for improved performance and greater efficiency from hardware sharing and joint processing. However, whether these benefits can be practically harnessed and outweigh impairments or effort to mitigate cross talk between spatial channels is not yet clear. Hence, focusing on homogeneous MCFs, we first describe measurements and simulations on the impact of inter-core cross talk in such fibers before reporting experimental investigation into the spatial channel skew variation with a series of the experimental results including a comparison with SMF in varying environmental conditions. Finally, we present some system and transmission experiments using parallel recirculating loops that enable demonstration of both multi-dimensional modulation and joint digital processing techniques across three MCF cores. Both techniques lead to increased transmission reach but highlight the need for further experimental analysis to properly characterize the potential benefits of correlated propagation delays in such fibers

    Efficient Neural Network Implementations on Parallel Embedded Platforms Applied to Real-Time Torque-Vectoring Optimization Using Predictions for Multi-Motor Electric Vehicles

    Get PDF
    The combination of machine learning and heterogeneous embedded platforms enables new potential for developing sophisticated control concepts which are applicable to the field of vehicle dynamics and ADAS. This interdisciplinary work provides enabler solutions -ultimately implementing fast predictions using neural networks (NNs) on field programmable gate arrays (FPGAs) and graphical processing units (GPUs)- while applying them to a challenging application: Torque Vectoring on a multi-electric-motor vehicle for enhanced vehicle dynamics. The foundation motivating this work is provided by discussing multiple domains of the technological context as well as the constraints related to the automotive field, which contrast with the attractiveness of exploiting the capabilities of new embedded platforms to apply advanced control algorithms for complex control problems. In this particular case we target enhanced vehicle dynamics on a multi-motor electric vehicle benefiting from the greater degrees of freedom and controllability offered by such powertrains. Considering the constraints of the application and the implications of the selected multivariable optimization challenge, we propose a NN to provide batch predictions for real-time optimization. This leads to the major contribution of this work: efficient NN implementations on two intrinsically parallel embedded platforms, a GPU and a FPGA, following an analysis of theoretical and practical implications of their different operating paradigms, in order to efficiently harness their computing potential while gaining insight into their peculiarities. The achieved results exceed the expectations and additionally provide a representative illustration of the strengths and weaknesses of each kind of platform. Consequently, having shown the applicability of the proposed solutions, this work contributes valuable enablers also for further developments following similar fundamental principles.Some of the results presented in this work are related to activities within the 3Ccar project, which has received funding from ECSEL Joint Undertaking under grant agreement No. 662192. This Joint Undertaking received support from the European Union’s Horizon 2020 research and innovation programme and Germany, Austria, Czech Republic, Romania, Belgium, United Kingdom, France, Netherlands, Latvia, Finland, Spain, Italy, Lithuania. This work was also partly supported by the project ENABLES3, which received funding from ECSEL Joint Undertaking under grant agreement No. 692455-2

    FPGAs in Industrial Control Applications

    Get PDF
    The aim of this paper is to review the state-of-the-art of Field Programmable Gate Array (FPGA) technologies and their contribution to industrial control applications. Authors start by addressing various research fields which can exploit the advantages of FPGAs. The features of these devices are then presented, followed by their corresponding design tools. To illustrate the benefits of using FPGAs in the case of complex control applications, a sensorless motor controller has been treated. This controller is based on the Extended Kalman Filter. Its development has been made according to a dedicated design methodology, which is also discussed. The use of FPGAs to implement artificial intelligence-based industrial controllers is then briefly reviewed. The final section presents two short case studies of Neural Network control systems designs targeting FPGAs

    Vector support for multicore processors with major emphasis on configurable multiprocessors

    Get PDF
    It recently became increasingly difficult to build higher speed uniprocessor chips because of performance degradation and high power consumption. The quadratically increasing circuit complexity forbade the exploration of more instruction-level parallelism (JLP). To continue raising the performance, processor designers then focused on thread-level parallelism (TLP) to realize a new architecture design paradigm. Multicore processor design is the result of this trend. It has proven quite capable in performance increase and provides new opportunities in power management and system scalability. But current multicore processors do not provide powerful vector architecture support which could yield significant speedups for array operations while maintaining arealpower efficiency. This dissertation proposes and presents the realization of an FPGA-based prototype of a multicore architecture with a shared vector unit (MCwSV). FPGA stands for Filed-Programmable Gate Array. The idea is that rather than improving only scalar or TLP performance, some hardware budget could be used to realize a vector unit to greatly speedup applications abundant in data-level parallelism (DLP). To be realistic, limited by the parallelism in the application itself and by the compiler\u27s vectorizing abilities, most of the general-purpose programs can only be partially vectorized. Thus, for efficient resource usage, one vector unit should be shared by several scalar processors. This approach could also keep the overall budget within acceptable limits. We suggest that this type of vector-unit sharing be established in future multicore chips. The design, implementation and evaluation of an MCwSV system with two scalar processors and a shared vector unit are presented for FPGA prototyping. The MicroBlaze processor, which is a commercial IP (Intellectual Property) core from Xilinx, is used as the scalar processor; in the experiments the vector unit is connected to a pair of MicroBlaze processors through standard bus interfaces. The overall system is organized in a decoupled and multi-banked structure. This organization provides substantial system scalability and better vector performance. For a given area budget, benchmarks from several areas show that the MCwSV system can provide significant performance increase as compared to a multicore system without a vector unit. However, a MCwSV system with two MicroBlazes and a shared vector unit is not always an optimized system configuration for various applications with different percentages of vectorization. On the other hand, the MCwSV framework was designed for easy scalability to potentially incorporate various numbers of scalar/vector units and various function units. Also, the flexibility inherent to FPGAs can aid the task of matching target applications. These benefits can be taken into account to create optimized MCwSV systems for various applications. So the work eventually focused on building an architecture design framework incorporating performance and resource management for application-specific MCwSV (AS-MCwSV) systems. For embedded system design, resource usage, power consumption and execution latency are three metrics to be used in design tradeoffs. The product of these metrics is used here to choose the MCwSV system with the smallest value

    Phase Estimation for Grid Synchronization of DG System Using Cordic Algorithm

    Get PDF
    The proper operation of grid connected inverter system is determined by grid voltage conditions such as phase, amplitude and frequency. In such applications, an accurate and fast detection of the phase angle, amplitude and frequency of the grid voltage is essential for reference current generation. Phase angle plays an important role in control being used to transform the feedback variables to a suitable reference frame in which the control structure is implemented. Hence grid synchronization has a significant role in the control of grid connected inverter system. However, accurate on-line tracking of phase angle of the grid voltages under distorted grid condition is critical especially; during line notching, voltage unbalance, voltage dips, frequency variations etc. This project work involves development of phase estimation technique for grid synchronization using CORDIC algorithm during unbalanced three-phase grid voltage conditions. By proposing CORDIC algorithm, we can largely reduce the computational time while it will be implemented in real time platform using FPGA or DSP. Computer simulations have been carried out using MATLAB-Simulink package for feasibility of the study
    corecore