13 research outputs found

    Electronic systems for the restoration of the sense of touch in upper limb prosthetics

    In the last few years, research on active upper-limb prosthetics has focused on improving human functionality and control. New methods have been proposed for measuring the user's muscle activity and translating it into prosthesis control commands. Developing the feed-forward interface so that the prosthesis better follows the user's intention is an important step towards improving the quality of life of people with limb amputation. However, prosthesis users can neither feel whether something or someone is touching the prosthesis nor perceive the temperature or roughness of objects. They rely on sight for most of this information: looking at an object helps, but without visual contact they cannot detect anything. Therefore, to foster prosthesis embodiment and utility, a prosthetic system must not only respond to the control signals provided by the user but also transmit information about the current state of the prosthesis back to the user. This thesis presents an electronic skin system that closes the loop in prostheses, towards restoring the sense of touch in prosthesis users. The proposed system includes advanced distributed sensing (the electronic skin), a system for (i) signal conditioning, (ii) data acquisition, and (iii) data processing, and a stimulation system. The idea is to integrate all these components into a myoelectric prosthesis. Embedding the electronic system and the sensing materials is a critical issue in the development of new prostheses. In particular, processing the data originating from the electronic skin into low- or high-level information is the key issue to be addressed by the embedded electronic system. Recently, Machine Learning has been shown to be a promising approach for processing tactile sensor information, and many studies have demonstrated its effectiveness in classifying input touch modalities. More specifically, this thesis focuses on the stimulation system, which communicates a mechanical interaction from the electronic skin to the prosthesis user, and on the dedicated implementation of algorithms for processing tactile data originating from the electronic skin. At the system level, the thesis provides the design of the experimental setup, the experimental protocol, and the algorithms to process tactile data. At the architectural level, the thesis proposes a design flow for implementing digital circuits on both FPGAs and integrated circuits, as well as techniques for the power management of embedded systems for Machine Learning algorithms.

    Characterization and Acceleration of High Performance Compute Workloads

    Integrated Programmable-Array accelerator to design heterogeneous ultra-low power manycore architectures

    There is an ever-increasing demand for energy efficiency (EE) in rapidly evolving Internet-of-Things end nodes. This pushes researchers and engineers to develop solutions that provide both the EE of an Application-Specific Integrated Circuit and the flexibility of a Field-Programmable Gate Array. One such solution is the Coarse Grain Reconfigurable Array (CGRA). Over the past decades, CGRAs have evolved and are competing to become mainstream hardware accelerators, especially for Digital Signal Processing (DSP) applications. Due to the over-specialization of computing architectures, the focus is shifting towards fitting an extensive data representation range into fewer bits; for example, 32 bits cover a wider data range in a floating-point (FP) representation than in an integer representation. Computation using an FP representation requires numerous encodings and leads to complex circuits for the FP operators, decreasing the EE of the entire system. This thesis presents the design of an energy-efficient, ultra-low-power CGRA with native support for FP computation by leveraging transprecision computing, an emerging paradigm of approximate computing. We also present contributions to the compilation toolchain and to the system-level integration of the CGRA in a System-on-Chip, so that the proposed CGRA can serve as an energy-efficient hardware accelerator. Finally, an extensive set of experiments using real-world algorithms employed in near-sensor processing applications is performed, and the results are compared with state-of-the-art (SoA) architectures. It is empirically shown that the proposed CGRA provides better results than SoA architectures in terms of power, performance, and area.

    Optimisations arithmétiques et synthèse de haut niveau

    High-level synthesis (HLS) tools offer increased productivity for FPGA programming. However, because they are relatively young, they still lack many arithmetic optimizations. This thesis proposes safe arithmetic optimizations that should always be applied. Some of these optimizations are simple operator specializations that follow the C semantics. Others require departing from the semantics embedded in high-level input languages, which are inherited from software programming, to obtain a better accuracy/cost/performance trade-off. To demonstrate this claim, the sum of products of floating-point numbers is used as a case study. The sum is performed in a fixed-point format that is tailored to the application, according to the context in which the operator is instantiated. In some cases, there is not enough information about the input data to tailor the fixed-point accumulator. The fall-back strategy used in this thesis is to generate an accumulator covering the entire floating-point range. This thesis explores different strategies for implementing such a large accumulator, including new ones. Using a two's complement representation instead of sign and magnitude is shown to save resources and to reduce the delay of the accumulation loop. Based on a tapered precision scheme and an exact accumulator, the posit number system is claimed to be a candidate to replace the IEEE floating-point format. A thorough analysis of posit operators is performed, using the same level of hardware optimization as state-of-the-art floating-point operators. Their cost remains much higher than that of their floating-point counterparts in terms of resource usage and performance. Finally, this thesis presents a compatibility layer for HLS tools that allows a single code base to be deployed on multiple tools. This library implements a strongly typed, custom-size integer type alongside a set of optimized custom operators.
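
    The idea of a fixed-point accumulator wide enough to cover the entire floating-point range can be illustrated in software. The following is only a minimal sketch, not the hardware design from the thesis: it assumes IEEE 754 double-precision inputs, uses a Python integer as a stand-in for a very wide two's complement register, and the names `FRAC_BITS` and `exact_dot` are hypothetical.

```python
from fractions import Fraction

# Assumption: inputs are finite IEEE 754 doubles. The smallest double is 2**-1074,
# so every product of two doubles is an integer multiple of 2**-2148.
FRAC_BITS = 2 * 1074


def exact_dot(xs, ys):
    """Sum of products accumulated in fixed point, rounded only once at the end."""
    acc = 0  # Python int plays the role of a wide two's complement accumulator
    for x, y in zip(xs, ys):
        nx, dx = x.as_integer_ratio()  # x == nx / dx, dx is a power of two
        ny, dy = y.as_integer_ratio()
        k = (dx * dy).bit_length() - 1  # dx * dy == 2**k
        acc += (nx * ny) << (FRAC_BITS - k)  # align the product to the accumulator LSB
    return float(Fraction(acc, 1 << FRAC_BITS))  # single final rounding


xs = [1e16, 1.0, -1e16]
ys = [1.0, 1.0, 1.0]
naive = sum(x * y for x, y in zip(xs, ys))  # rounds after every addition
exact = exact_dot(xs, ys)
print(naive, exact)
```

    In this example the naive floating-point sum loses the middle term because it rounds after each addition, while the wide accumulator keeps it. Negative products need no special handling here, which loosely mirrors the benefit of a two's complement accumulator over sign and magnitude.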

    Síntesis VLSI de un Multiplicador de Punto Flotante de Precisión Simple

    Multiplication is one of the most important operations for executing instructions in data processing devices. This work presents the design of a floating-point multiplier that follows the IEEE-754 standard. The system is divided into three phases: the first separates the data, the second performs a fixed-point multiplication, and the third computes the new exponent. The second phase is critical and is developed using an algorithm based on unit cells that generates a multiplication array. The system was implemented in VHDL (VHSIC Hardware Description Language) with the Xilinx ISE WebPack 14.4 tool. Part of the logic and physical synthesis process was then carried out using the Alliance EDA (Electronic Design Automation) tools, and a preliminary version of the layout was obtained for fabrication in VLSI technology. The layout consumed a large area; however, the design is scalable, and the capacity of the multiplier could be increased without a redesign. The system behaved satisfactorily in response to different test patterns designed in the Xilinx and Alliance tools.
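
    The three phases described in this abstract (field separation, fixed-point multiplication, exponent computation) can be modelled in a few lines of software. This is a behavioural sketch only, not the VHDL design: it assumes normal single-precision operands (no zeros, subnormals, infinities, or NaNs), uses round-to-nearest-even, which the abstract does not specify, and replaces the cell-based multiplication array with a plain integer multiply. All function names are hypothetical.

```python
import struct


def f32_bits(x: float) -> int:
    """Bit pattern of x rounded to IEEE 754 single precision."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_f32(b: int) -> float:
    """Value of a 32-bit IEEE 754 single-precision bit pattern."""
    return struct.unpack(">f", struct.pack(">I", b))[0]


def fp32_mul(a: int, b: int) -> int:
    # Phase 1: separate sign, exponent, and fraction fields.
    sa, ea, ma = a >> 31, (a >> 23) & 0xFF, a & 0x7FFFFF
    sb, eb, mb = b >> 31, (b >> 23) & 0xFF, b & 0x7FFFFF
    if ea in (0, 255) or eb in (0, 255):
        raise ValueError("sketch handles normal operands only")
    ma |= 1 << 23  # restore the implicit leading 1
    mb |= 1 << 23

    # Phase 2: 24 x 24 -> 48-bit fixed-point multiplication.
    prod = ma * mb

    # Phase 3: new exponent, normalization, and rounding.
    exp = ea + eb - 127
    shift = 23
    if prod >> 47:  # significand product lies in [2, 4)
        shift, exp = 24, exp + 1
    mant = prod >> shift
    rem = prod & ((1 << shift) - 1)
    half = 1 << (shift - 1)
    if rem > half or (rem == half and mant & 1):  # round to nearest, ties to even
        mant += 1
        if mant >> 24:  # rounding carried out of the significand
            mant, exp = mant >> 1, exp + 1
    if not 0 < exp < 255:
        raise ValueError("result outside the normal range")
    return ((sa ^ sb) << 31) | (exp << 23) | (mant & 0x7FFFFF)


print(bits_f32(fp32_mul(f32_bits(1.5), f32_bits(2.5))))
```

    Because the product of two 24-bit significands fits exactly in a double, the result can be checked against Python's own arithmetic by multiplying the two single-precision values as doubles and rounding once with `f32_bits`.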

    Design and implementation of an out of order execution engine of floating point arithmetic operations

    This thesis covers the design, in hardware description languages, and the FPGA implementation of an out-of-order execution engine for floating-point arithmetic operations. The work is part of a project called Lagarto.

    Spéculation temporelle pour accélérateurs matériels

    This thesis focuses on the use of timing speculation to improve the performance and energy efficiency of hardware accelerators. Timing speculation means operating a circuit at a frequency or voltage at which its correct operation is no longer guaranteed. It increases the performance of the circuit (computations per second) as well as its energy efficiency (computations per joule). Because the correct operation of the circuit is no longer guaranteed, timing speculation must be accompanied by an error detection mechanism. This mechanism must have the lowest possible additional cost in terms of resources used, energy, and impact on performance. These overheads must be low enough to make the approach worthwhile, and as low as possible to maximize the gain obtained. We present a new algorithm-level error detection mechanism for the convolutions used in convolutional neural networks that meets these conditions. We show that combining this mechanism with timing speculation can improve the performance and energy efficiency of a convolution hardware accelerator.
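
    The abstract does not describe how its error detection mechanism works. As a generic illustration of what an algorithm-level check for a convolution can look like, the sketch below uses a well-known linearity property: for a full one-dimensional convolution, the sum of the outputs equals the product of the input sum and the kernel sum. This is an assumed stand-in, not the mechanism proposed in the thesis, and the names `conv_full` and `checksum_ok` are hypothetical. Integer data is used so that the check is exact.

```python
def conv_full(x, w):
    """Full 1-D convolution of integer sequences x and w."""
    y = [0] * (len(x) + len(w) - 1)
    for i, xi in enumerate(x):
        for j, wj in enumerate(w):
            y[i + j] += xi * wj
    return y


def checksum_ok(x, w, y):
    """Cheap consistency check: sum(y) must equal sum(x) * sum(w)."""
    return sum(y) == sum(x) * sum(w)


x, w = [3, -1, 4, 1, 5], [2, 7, -1]
y = conv_full(x, w)
print(checksum_ok(x, w, y))  # result as computed

y[3] ^= 1 << 4  # emulate a timing error as a single flipped bit in one output
print(checksum_ok(x, w, y))  # corrupted result
```

    The check costs one extra multiplication and a few additions per convolution, which is the kind of low overhead the abstract asks for. A single checksum like this one cannot detect errors that cancel each other out in the sum.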

    Automated Dynamic Error Analysis Methods for Optimization of Computer Arithmetic Systems

    Computer arithmetic is one of the more important topics within computer science and engineering. The earliest computer systems were designed to perform arithmetic operations, and most if not all digital systems are required to perform some sort of arithmetic as part of their normal operation. This reliance on computer arithmetic means that the accurate representation of real numbers within digital systems is vital, and that an understanding of how these systems are implemented and of their possible drawbacks is essential in order to design and implement modern high-performance systems. At present, the most widely implemented system for computer arithmetic is the IEEE 754 floating-point system. While this system is deemed to be the best available implementation, it has several features that can result in serious computation errors if not handled correctly. Lack of understanding of these errors and their effects has led to real-world disasters on several occasions. Systems for detecting these errors are therefore highly important, and fast, efficient, and easy-to-use implementations of such detection systems are a high priority. Detecting floating-point rounding errors normally requires run-time analysis in order to be effective. Several systems have been proposed for the analysis of floating-point arithmetic, including Interval Arithmetic, Affine Arithmetic, and Monte Carlo Arithmetic. While these systems have been well studied using theoretical and software-based approaches, implementations that can be applied to real-world situations have been limited by issues with implementation, performance, and scalability. The majority of implementations have been software based and have not taken advantage of the performance gains associated with hardware-accelerated computer arithmetic systems. This is especially problematic given that systems requiring high accuracy will often also require high performance. The aim of this thesis and the associated research is to increase understanding of error and error analysis methods through the development of easy-to-use and easy-to-understand implementations of these techniques.
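
    Of the analysis methods named in this abstract, interval arithmetic is the simplest to sketch. The minimal example below is a software illustration under stated assumptions, not the implementation developed in the thesis: it requires Python 3.9 or later for `math.nextafter`, and it widens each result by one unit in the last place in both directions as a conservative substitute for directed rounding modes, which Python does not expose. The `Interval` class is hypothetical.

```python
import math
from fractions import Fraction


class Interval:
    """Closed interval [lo, hi] of doubles that encloses the exact result."""

    def __init__(self, lo, hi=None):
        self.lo = lo
        self.hi = lo if hi is None else hi

    @staticmethod
    def _outward(lo, hi):
        # A round-to-nearest result is within half an ulp of the exact value,
        # so stepping one ulp outward always encloses it.
        return Interval(math.nextafter(lo, -math.inf), math.nextafter(hi, math.inf))

    def __add__(self, other):
        return Interval._outward(self.lo + other.lo, self.hi + other.hi)

    def __mul__(self, other):
        p = [self.lo * other.lo, self.lo * other.hi,
             self.hi * other.lo, self.hi * other.hi]
        return Interval._outward(min(p), max(p))

    def contains(self, exact):
        """True if the exact rational value lies inside the interval."""
        return Fraction(self.lo) <= exact <= Fraction(self.hi)

    def width(self):
        return self.hi - self.lo


s = Interval(0.1) + Interval(0.2)
exact = Fraction(0.1) + Fraction(0.2)  # exact sum of the two stored doubles
print(s.lo, s.hi, s.contains(exact))
```

    The width of the resulting interval bounds the rounding error accumulated so far, at the price of several floating-point operations per arithmetic operation. That overhead is the performance cost that motivates hardware acceleration of such analyses.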