36 research outputs found

    Fast- and Low-Complexity atan2(a,b) Approximation

    This article presents a new entry to the class of published algorithms for the fast computation of the arctangent of a complex number. Our method uses a look-up table (LUT) to reduce computational errors. We also show how to convert a large LUT addressed by two variables into a smaller LUT, addressed by only one variable, with equivalent performance. In addition, we demonstrate how and why follow-on LUTs applied to other simple arctan algorithms produce unexpected and interesting results.
    This work is funded by the Spanish Ministerio de Economía y Competitividad and FEDER under grant TEC2015-70858-C2-2-R.
    Torres Carot, V.; Valls Coquillat, J.; Lyons, R. (2017). Fast- and Low-Complexity atan2(a,b) Approximation. IEEE Signal Processing Magazine. 34(6):164-169. https://doi.org/10.1109/MSP.2017.2730898
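To make the idea of a LUT addressed by a single variable concrete, here is a minimal sketch. It is a generic textbook-style construction, not the algorithm from the article: it folds the input into the first octant, looks up `atan` of the ratio of the smaller to the larger magnitude, and unfolds the result by quadrant. The names `LUT_SIZE`, `ATAN_LUT` and `atan2_lut`, the table size, and the `(y, x)` argument order (chosen to match `math.atan2`) are all assumptions made for illustration.

```python
import math

LUT_SIZE = 256  # hypothetical table size; larger tables give smaller errors

# One-variable table: entry k holds atan(k / (LUT_SIZE - 1)), i.e. ratios in [0, 1].
ATAN_LUT = [math.atan(k / (LUT_SIZE - 1)) for k in range(LUT_SIZE)]


def atan2_lut(y, x):
    """Approximate math.atan2(y, x) using a single-index look-up table."""
    if x == 0 and y == 0:
        return 0.0  # convention for the undefined case
    ax, ay = abs(x), abs(y)
    if ax >= ay:
        # First octant: angle = atan(|y| / |x|), ratio in [0, 1]
        index = round((ay / ax) * (LUT_SIZE - 1))
        angle = ATAN_LUT[index]
    else:
        # Second octant: use atan(r) = pi/2 - atan(1 / r)
        index = round((ax / ay) * (LUT_SIZE - 1))
        angle = math.pi / 2 - ATAN_LUT[index]
    if x < 0:
        angle = math.pi - angle  # reflect into the left half-plane
    if y < 0:
        angle = -angle  # reflect into the lower half-plane
    return angle
```

With nearest-entry lookup the ratio is quantized to within half a table step, so the angle error is bounded by roughly `0.5 / (LUT_SIZE - 1)` radians. A hardware version would replace the division with a cheaper operation, which this sketch does not attempt.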

    Real-time neural signal processing and low-power hardware co-design for wireless implantable brain machine interfaces

    Intracortical brain-machine interfaces (iBMIs) have advanced significantly over the past two decades, demonstrating their utility in areas including neuroprosthetic control and communication. To increase the information transfer rate and improve device robustness and longevity, iBMI technology aims to increase channel counts to access more neural data while reducing invasiveness through miniaturisation and the avoidance of percutaneous connectors (wired implants). However, as the number of channels increases, the raw data bandwidth required for wireless transmission becomes prohibitive, so efficient on-implant processing is needed to reduce the amount of data through compression or feature extraction. The fundamental aim of this research is to develop methods for high-performance neural spike processing co-designed with low-power hardware that is scalable for real-time wireless BMI applications. The specific original contributions are as follows. Firstly, a new method for hardware-efficient spike detection has been developed, which achieves state-of-the-art detection performance and significantly reduces hardware complexity. Secondly, a novel thresholding mechanism for spike detection has been introduced. By incorporating firing rate information as a key determinant of the spike detection threshold, we have improved the adaptiveness of spike detection. This allows spike detection to overcome the signal degradation caused by scar tissue growth around the recording site, keeping detection results stable over the long term. As a consequence, long-term decoding performance has also improved notably. Thirdly, the relationship between spike detection performance and neural decoding accuracy has been found to be nonlinear, offering new opportunities to reduce transmission bandwidth by at least 30% with only minor degradation in decoding performance.
In summary, this thesis presents a journey toward designing ultra-hardware-efficient spike detection algorithms and applying them to reduce the data bandwidth and improve neural decoding performance. The software-hardware co-design approach is essential for the next generation of wireless brain-machine interfaces with increased channel counts and a highly constrained hardware budget.
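The firing-rate-driven threshold described above can be pictured with a small feedback loop. The sketch below is an illustration under stated assumptions, not the thesis's algorithm: the function name `detect_spikes_adaptive`, the parameters `window`, `target_count` and `step`, and the multiplicative update rule are all hypothetical. The threshold is raised when a window contains more detections than the target and lowered when it contains fewer, so detection can follow a slowly shrinking spike amplitude.

```python
def detect_spikes_adaptive(samples, window, target_count, threshold=1.0, step=0.05):
    """Threshold-crossing spike detector whose threshold tracks the firing rate.

    Returns (spike_indices, final_threshold). All parameters are illustrative.
    """
    spikes = []
    count = 0       # detections in the current window
    above = False   # True while the signal stays above threshold
    for n, x in enumerate(samples):
        crossing = abs(x) > threshold
        if crossing and not above:
            spikes.append(n)  # count only the rising edge of each crossing
            count += 1
        above = crossing
        if (n + 1) % window == 0:
            # End of window: nudge the threshold toward the target firing rate.
            if count > target_count:
                threshold *= 1 + step
            elif count < target_count:
                threshold *= 1 - step
            count = 0
    return spikes, threshold
```

For example, if spikes that used to exceed a threshold of 1.0 shrink to an amplitude of 0.9, the detector misses them for a few windows, lowers the threshold step by step, and then resumes detecting. A practical design would also have to guard against the threshold sinking into the noise floor, which this sketch does not do.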

    Embedded electronic systems driven by run-time reconfigurable hardware

    Abstract This doctoral thesis addresses the design of embedded electronic systems based on run-time reconfigurable hardware technology, available through SRAM-based FPGA/SoC devices, with the aim of improving quality of life. It investigates the system architecture and the reconfiguration engine that gives the FPGA the capability of dynamic partial reconfiguration, so that a given application, partitioned by hardware/software co-design into processing tasks multiplexed in time and space, can be synthesized with an optimized physical implementation (silicon area, processing time, complexity, flexibility, functional density, cost and power consumption) compared with alternatives based on static hardware (MCU, DSP, GPU, ASSP, ASIC, etc.). The design flow of this technology is evaluated by prototyping several engineering applications (control systems, mathematical coprocessors, complex image processors, etc.), showing a level of maturity high enough for industrial exploitation.

    Controlling a contactless planar actuator with manipulator

    An existing magnetically levitated planar actuator with manipulator has been studied and improved from a control point of view. The prototype consists of a magnetically levitated six-degree-of-freedom (6-DOF) planar actuator with moving magnets, with a 2-DOF manipulator on top of it. The system combines three contactless technologies: contactless bearing and propulsion of the planar actuator, wireless powering of the manipulator, and wireless communication and control of the manipulator. The planar actuator (PA) consists of a Halbach magnet array, which is levitated and controlled in all six DOFs above a stationary coil array. The PA is propelled in two horizontal translational DOFs while the other four DOFs are stabilized to achieve a stiff bearing. Each active coil contributes to the forces and torques acting on the magnet array. Since the number of active coils is much larger than the number of DOFs, the desired force production can be distributed over many coils. Therefore, a commutation algorithm has to be used to invert the mapping from the coil currents and the position and orientation of the translator to the forces and torques exerted by the set of active coils. One method for linearization and decoupling of the forces and torques was developed in the past. This method, called direct wrench decoupling, guarantees minimal energy dissipation. However, it cannot impose constraints on the maximum current. This study proposes two novel norm-based commutation methods: ℓ∞-norm and clipped ℓ2-norm based commutation. Both methods can bound the maximum coil currents to prevent saturation of the current amplifiers. The first method minimizes the maximum current, whereas the second limits the peak current while minimizing the power losses. 
Consequently, a higher acceleration of the translator can be achieved, less powerful (cheaper) current amplifiers can be used, and/or fewer commutation errors arise. In the past, only long-stroke translational movement of moving-magnet planar actuators has been considered. This thesis analyzes, from a control point of view, the possibility of fully propelling and controlling rotation about the vertical axis instead of merely stabilizing it for bearing. Enhancing the planar actuator with long-range rotation will increase its utility and open new application areas. Based on this investigation, a novel coil array with a triangular grid of rounded coils has been proposed for better controllability in any orientation of the PA. In addition, other coil and magnet topologies have been studied from a control point of view for their suitability for full rotation. The influence of different error sources on the commutation precision has been studied. This investigation found that the offsets of the measurement system have the greatest influence on the precision of the commutation. The convergence of the procedure for estimating and eliminating these offsets has been investigated. Although it was not proven that the procedure can be applied over the whole workspace of the PA, convergence has been shown for all the investigated points, and convergence is therefore expected for any position in the workspace. It was also found that the procedure can be used with different topologies and different commutations. A novel wireless link has been developed for the real-time control of a fast motion system. The link communicates via infrared-light transceivers, and its delay and packet-loss ratio are almost indistinguishable from those of the wired connection for system bandwidths up to several kilohertz. 
The clipped ℓ2-norm based commutation method has been successfully tested on the experimental setup after improving the measurement system, the contactless energy transfer and the wireless communication. With a new interferometer sensor system, a well-controlled PA with two long-stroke DOFs has become available. The improved contactless energy transfer no longer causes increased electromagnetic interference when switching between the primary coils, and the infrared wireless link provides a reliable communication channel between the manipulator and the fixed world. Several control approaches have been tested on the experimental setup: classical PID control, sliding-mode control and iterative learning control have all been implemented. Each controller performed better than the previous one. A fourth-order trajectory and enhanced feedforward control also helped to improve performance. Finally, compared with the initial situation, the tracking errors were reduced by a factor of 10 (and by more than a factor of 50 with the contactless energy transfer deactivated), while the velocity and acceleration of the system were higher by factors of 4 and 14, respectively.
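The three commutation strategies named in this abstract can be written as optimization problems over the coil currents. The formulation below is a sketch under assumed notation, not taken from the thesis: $w$ is the desired wrench, $i$ the vector of active coil currents, $\Gamma(q)$ the position- and orientation-dependent coupling matrix (assumed to have full row rank), and $i_{\max}$ the amplifier current limit. The minimum-dissipation property attributed to direct wrench decoupling corresponds to the minimum $\ell_2$-norm solution if all coils are assumed to have equal resistance.

```latex
% Assumed notation (not from the abstract):
%   w \in \mathbb{R}^6            desired wrench (three forces, three torques)
%   i \in \mathbb{R}^n, n > 6     currents in the n active coils
%   \Gamma(q) \in \mathbb{R}^{6 \times n}   coupling matrix at pose q, full row rank
\begin{align}
  w &= \Gamma(q)\, i \\
  \text{minimum $\ell_2$-norm:}\quad
  i &= \Gamma^{\mathsf T}\left(\Gamma\,\Gamma^{\mathsf T}\right)^{-1} w
     = \arg\min_{i}\ \|i\|_2^2 \quad \text{s.t. } \Gamma i = w \\
  \text{$\ell_\infty$-norm:}\quad
  i &= \arg\min_{i}\ \|i\|_\infty \quad \text{s.t. } \Gamma i = w \\
  \text{clipped $\ell_2$-norm:}\quad
  i &= \arg\min_{i}\ \|i\|_2^2 \quad \text{s.t. } \Gamma i = w,\ \|i\|_\infty \le i_{\max}
\end{align}
```

The first problem has the closed-form pseudo-inverse solution shown. The other two have no closed form in general and need an iterative or linear/quadratic programming solver, which is one reason real-time implementation is a design concern.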

    FPGA implementation of bluetooth low energy physical layer with OpenCL

    This dissertation presents the design of the digital signal processing (DSP) for transmission in the Bluetooth Low Energy physical layer (BLE PHY) and its implementation on a Field Programmable Gate Array (FPGA) device using Open Computing Language (OpenCL). The DSP design is based on the in-phase/quadrature-phase (IQ) architecture and builds the signal modulation and demodulation processes around Gaussian Frequency-Shift Keying (GFSK), a signal-shaping scheme that offers strong anti-interference performance in short-range communication. OpenCL is one of the High-Level Synthesis (HLS) methods for FPGA design. It offers high productivity and can also achieve high operational efficiency on FPGAs through its parallel programming architecture. In addition, a remote platform, Intel DevCloud, is used to control the FPGA and verify the program, which makes the design process more convenient and economical.
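As a rough illustration of the GFSK IQ modulation chain mentioned in this abstract, the following sketch maps bits to NRZ symbols, shapes them with a Gaussian filter, integrates the result into a phase, and emits I/Q samples. It is a plain-Python model under assumptions, not the dissertation's OpenCL design: the function name `gfsk_modulate`, the oversampling factor `sps`, and the filter `span` are hypothetical, while the defaults `bt=0.5` and `h=0.5` follow the nominal BLE values.

```python
import math


def gfsk_modulate(bits, sps=8, bt=0.5, h=0.5, span=3):
    """Return a list of (I, Q) baseband samples for the given bits.

    sps: samples per symbol, bt: Gaussian bandwidth-time product,
    h: modulation index, span: filter length in symbols (all illustrative).
    """
    # 1. NRZ mapping (0 -> -1, 1 -> +1) with rectangular upsampling.
    nrz = [1.0 if b else -1.0 for b in bits for _ in range(sps)]

    # 2. Gaussian pulse-shaping taps, normalized to unit sum.
    sigma = sps * math.sqrt(math.log(2)) / (2 * math.pi * bt)  # in samples
    half = span * sps // 2
    taps = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-half, half + 1)]
    total = sum(taps)
    taps = [t / total for t in taps]

    # 3. Same-length convolution with zero padding at the edges.
    shaped = []
    for n in range(len(nrz)):
        acc = 0.0
        for j, t in enumerate(taps):
            idx = n + j - half
            if 0 <= idx < len(nrz):
                acc += t * nrz[idx]
        shaped.append(acc)

    # 4. Integrate frequency into phase: a full +1 symbol advances phase by pi*h.
    phase = 0.0
    iq = []
    for f in shaped:
        phase += math.pi * h * f / sps
        iq.append((math.cos(phase), math.sin(phase)))
    return iq
```

Because the information is carried only in the phase, the output has a constant envelope (`I*I + Q*Q` equals 1 for every sample). A simple demodulator can recover each bit from the sign of the phase step near the middle of the symbol.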

    On the Exploration of FPGAs and High-Level Synthesis Capabilities on Multi-Gigabit-per-Second Networks

    Unpublished doctoral thesis defended at the Universidad Autónoma de Madrid, Escuela Politécnica Superior, Departamento de Tecnología Electrónica y de las Comunicaciones. Date of defence: 24-01-2020. Traffic on computer networks has grown exponentially in recent years. Both links and communication equipment have had to adapt in order to provide the minimum quality of service that current needs require. However, in recent years several factors have prevented commercial off-the-shelf hardware from keeping pace with this growth; consequently, some software tools struggle to fulfil their tasks, especially at speeds above 10 Gbit/s. For this reason, Field Programmable Gate Arrays (FPGAs) have emerged as an alternative for the most demanding tasks without the need to design an application-specific integrated circuit, thanks in part to their flexibility and in-field programmability. However, developing for FPGAs is well known to be complex. Therefore, in this thesis we tackle the use of FPGAs and High-Level Synthesis (HLS) languages in the context of computer networks. We focus on the use of FPGAs both in computer network monitoring applications and in reliable data transmission at very high speed. We also intend to shed light on the use of high-level synthesis languages and to boost FPGA applicability in computer networks by reducing development time and design complexity. In the first part of the thesis, devoted to computer network monitoring, we take advantage of FPGA determinism to implement active monitoring probes, which send a train of packets that is later used to obtain network parameters. In this case, determinism is key to reducing the uncertainty of the measurements. The results of our experiments show that the FPGA implementations are much more accurate and precise than their software counterpart. 
At the same time, the FPGA implementation scales with network speed (1, 10 and 100 Gbit/s). In the context of passive monitoring, we leverage the FPGA architecture to implement algorithms that thin encrypted traffic and remove duplicate packets. These two algorithms are straightforward in principle, but very useful in helping traditional network analysis tools cope with their task at higher network speeds: processing encrypted traffic brings little benefit, and processing duplicate traffic degrades the performance of the software tools. In the second part of the thesis, devoted to the TCP/IP stack, we explore the current limitations of reliable data transmission using standard software at very high speed. The network is becoming an important bottleneck, particularly in data centers, and the deployment of 100 Gbit/s network links has started in recent years. Consequently, there has been increased scrutiny of how networking functionality is deployed, and a wide range of approaches is being explored to increase the efficiency of networks and tailor their functionality to the actual needs of the application at hand. FPGAs arise as the perfect alternative to deal with this problem. For this reason, in this thesis we develop Limago, an FPGA-based open-source implementation of a TCP/IP stack operating at 100 Gbit/s for Xilinx FPGAs. Limago not only provides unprecedented throughput but also a latency at least fifteen times lower than that of software implementations. Limago is a key contribution to some of the most active topics at the moment, for instance network-attached FPGAs and in-network data processing.

    A field programmable gate array based motion control platform

    Expectations of motion control systems rise day by day. As systems become more complex, conventional motion control platforms cannot meet all the specifications with optimized results. This creates the need to redesign the control platform whenever the specifications change. Field programmable gate arrays (FPGAs) offer reconfigurable hardware, which helps to overcome this redesign issue: the hardware structure of the system can be reconfigured even after the hardware is deployed. Because the functionality is provided by hardware, performance is enhanced. The dedicated hardware also reduces power consumption. The board size shrinks as well, since discrete components can be implemented in the FPGA, and the smaller board lowers the cost. As a trade-off, FPGA programming is more complicated than software programming. The aim of this thesis is to create a level of abstraction that reduces the need for advanced hardware description language knowledge when implementing motion control algorithms on FPGAs. A hardware library implemented specifically for motion control purposes is introduced. To provide a complete motion control platform, other parts of the system, such as the user interface, kinematics calculations and trajectory generation, have been implemented as a software library. The control algorithms are tested, and the system is verified by experiments on a parallel mechanism.

    Control of a Modular Multilevel Flying Capacitor Based STATCOM for Distribution Systems

    Voltage fluctuation and power losses in the distribution line are problems in distribution networks. One method to mitigate these problems is by injecting reactive power into the network using a Static Synchronous Compensator (STATCOM). This can be used both for regulating the voltage and reducing the losses. A STATCOM is critically dependent on a grid synchronisation scheme that can accurately track the changes occurring in the grid phase and frequency. The Modular Multilevel Converter (MMC) is a promising topology for STATCOM applications because of its simple modular circuit structure that allows for higher voltage ratings, and conventionally uses a stack of sub-modules which are either two-level half or H-bridge converters. As a novel alternative, the thesis investigates the practicality of a STATCOM based on a three-level flying capacitor (FC) converter. Two variants of this topology are presented; the FC Half-bridge and FC H-bridge. A comprehensive study is undertaken to compare these with the Half and H-bridge sub-module under STATCOM operation. Most importantly, an FC H-bridge-based STATCOM is investigated for reactive power compensation. The challenges of multilevel, multi-module PWM control schemes achieving good waveforms at low switching frequency, whilst maintaining module capacitor voltage balance, are thoroughly addressed. Simulation results validate the operation for both line voltage regulation and power factor correction. An experimental power system with an FC-based STATCOM rig is designed and built, and validates the simulation results for power factor correction. It demonstrates correct operation of a control scheme that includes a system for maintaining capacitor voltage balance. Another new contribution is the investigation of a phase locking technique based on the Energy Operator (EO). 
The method, combining two different EO computations, is shown to achieve fast and accurate detection of frequency and phase angle when combined with an appropriate filter, and crucially operates well under unbalanced voltage conditions. The technique is compared with two other well-known phase locked loop (PLL) schemes, showing that it outperforms them in terms of speed and accuracy. A hardware implementation of the EO-PLL validates the principle, showing the simplicity of the method.
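To show how two energy-operator computations can yield a frequency estimate, here is a minimal sketch based on the Teager-Kaiser energy operator and the standard DESA-2 energy separation algorithm. This is a well-known technique from the signal processing literature, swapped in for illustration. It is not the EO-PLL of the thesis, which also involves filtering and phase detection. The function names are hypothetical, and the input is assumed to be a clean sampled sinusoid with digital frequency below a quarter of the sampling rate.

```python
import math


def teager_kaiser(x, n):
    """Discrete Teager-Kaiser energy operator: x[n]^2 - x[n-1] * x[n+1]."""
    return x[n] * x[n] - x[n - 1] * x[n + 1]


def desa2_frequency(x, n):
    """Estimate the digital frequency (rad/sample) at index n with DESA-2.

    Needs samples x[n-2] .. x[n+2]. For x[m] = A*cos(W*m + phi):
      psi_x = A^2 * sin(W)^2   and   psi_y = 4 * A^2 * sin(W)^4,
    where y[m] = x[m+1] - x[m-1], so 1 - psi_y / (2 * psi_x) = cos(2W).
    """
    psi_x = teager_kaiser(x, n)  # assumed non-zero (signal present)
    y_prev = x[n] - x[n - 2]
    y_curr = x[n + 1] - x[n - 1]
    y_next = x[n + 2] - x[n]
    psi_y = y_curr * y_curr - y_prev * y_next
    arg = 1.0 - psi_y / (2.0 * psi_x)
    arg = max(-1.0, min(1.0, arg))  # guard against rounding outside [-1, 1]
    return 0.5 * math.acos(arg)
```

For a signal sampled at rate `fs`, the frequency in hertz is `desa2_frequency(x, n) * fs / (2 * math.pi)`. The estimate uses only five samples and a few multiplications, which helps explain why energy-operator methods are attractive for fast grid synchronisation. On noisy or distorted waveforms the raw estimate would need filtering.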

    Rapid Digital Architecture Design of Computationally Complex Algorithms

    Traditional digital design techniques can hardly keep up with the growing abundance of programmable circuitry found on recent Field-Programmable Gate Arrays. Therefore, the novel Rapid Data Type-Agnostic Digital Design Methodology (RDAM) raises the design perspective of digital design engineers from the register-transfer level to the algorithmic level. It is founded on the capabilities of High-Level Synthesis tools. By consistently working with data type-agnostic source code, the RDAM significantly simplifies the fixed-point conversion of algorithms and the design of complex-valued architectures. Signal processing applications from the field of Compressed Sensing illustrate the efficacy of the RDAM in the context of multi-user wireless communications. For instance, a complex-valued digital architecture of Orthogonal Matching Pursuit with rank-1 updating has been successfully implemented and tested.