17 research outputs found

    A biophysically accurate floating point somatic neuroprocessor

    Cascaded VLSI neural network architecture for on-line learning

    High-speed, analog, fully parallel and asynchronous building blocks are cascaded for larger sizes and enhanced resolution. A hardware-compatible algorithm permits hardware-in-the-loop learning despite limited weight resolution. A comparison-intensive feature classification application has been demonstrated with this flexible hardware and new algorithm at high speed. This result indicates that these building block chips can be embedded as application-specific coprocessors for solving real-world problems at extremely high data rates.

    Real-Time neural signal decoding on heterogeneous MPSocs based on VLIW ASIPs

    An important research problem underlying the development of embedded systems for neuroprosthetic applications is the development of algorithms and platforms able to extract the patient's motion intention by decoding the information encoded in neural signals. In the current state of the art, no portable and reliable integrated solutions implementing such a decoding task have been identified. To this end, in this paper, we investigate the possibility of using the MPSoC paradigm in this application domain. We perform a design space exploration that compares different custom MPSoC embedded architectures, implementing two versions of an on-line neural signal decoding algorithm, targeting the decoding of single and multiple acquisition channels respectively. Each considered design point features a different application configuration, with a specific partitioning and mapping of parallel software tasks, executed on customized VLIW ASIP processing cores. Experimental results, obtained by means of FPGA-based prototyping and post-floorplanning power evaluation on a 40 nm technology library, assess the performance and hardware-related costs of the considered configurations. The reported power figures demonstrate the usability of the MPSoC paradigm for the processing of bio-electrical signals and show the benefits achievable by exploiting instruction-level parallelism within tasks.

    An investigation into adaptive power reduction techniques for neural hardware

    In light of the growing applicability of Artificial Neural Networks (ANNs) in the signal processing field [1] and the present thrust of the semiconductor industry towards low-power SoCs for mobile devices [2], the power consumption of ANN hardware has become a very important implementation issue. Adaptability is a powerful and useful feature of neural networks. All current low-power techniques for ANN hardware are 'non-adaptive' with respect to the power consumption of the network (i.e. power reduction is not an objective of the adaptation/learning process). The research presented in this thesis investigates possible adaptive power reduction techniques, which attempt to exploit the adaptability of neural networks in order to reduce power consumption. Three separate approaches to such adaptive power reduction are proposed: adaptation of size, adaptation of network weights and adaptation of calculation precision. Initial case studies exhibit promising results with significant power reduction.

    Acquisition systems and decoding algorithms of peripheral neural signals for prosthetic applications

    Over the years, neuroprosthetic applications have received a great deal of attention from the international research community, especially in bioengineering, thanks to substantial investment in projects funded by political institutions that consider the treatment of this condition to be of fundamental importance for the global community. The aim of these projects is to restore the functionality lost by a patient who has undergone an upper-limb amputation by developing, according to physiological considerations, a communication link between the brain, where the significant signals are generated, and a motor prosthesis able to perform the desired action. Moreover, the designed system must be able to return sensory feedback to the brain about the surrounding world, in terms of pressure or temperature acquired by tactile biosensors placed on the surface of the cybernetic hand. This feedback makes it possible to execute involuntary movements when, for example, the arm comes into contact with hot objects. The development of such a closed-loop architecture requires addressing some critical issues that depend on the chosen approach. Several solutions have been proposed by researchers in the field, differing in where the neural signals are acquired, either at the central nervous system or at the peripheral one. Most of them follow the former approach, even though amputees always consider the latter a more natural way to handle the artificial limb. This research work is based on intrafascicular electrodes implanted directly in the residual peripheral nerves of the stump, which represent a good compromise between invasiveness and selectivity and allow the extraction of electroneurographic (ENG) signals from which the significant activity of a fairly limited number of neuronal cells can be identified.
With a view to a hardware implementation that works autonomously, without any intervention by the amputee, adapts to the current characteristics of the processed signal, and runs on batteries to allow portability, it is necessary to satisfy the tight constraints that the application imposes on each phase of the closed-loop system. In the recording phase, the implementation must remove the unwanted interference, mainly due to the electro-stimulation of the muscles near the electrodes, whose magnitude is much greater than that of the signals of interest; amplify the frequency components within the significant bandwidth; and convert them at high resolution in order to obtain good performance in the subsequent processing phases. To this end, a recording module for peripheral neural signals is presented, based on a sigma-delta architecture composed of two main parts: an analog front-end stage for neural signal acquisition, pre-filtering and sigma-delta modulation, and a digital unit for sigma-delta decimation and system configuration. Hardware/software co-simulations using the Xilinx System Generator tool in the Matlab Simulink environment, followed by transistor-level simulations, confirmed that the system can record neural signals on the order of tens of μV while rejecting the large low-frequency noise due to electromyographic interference. The same architecture was then used to implement a prototype of an 8-channel implantable bidirectional electronic interface between the peripheral nervous system and the neuro-controlled hand prosthesis.
The solution includes a custom-designed integrated circuit (0.35 μm CMOS technology), responsible for signal pre-filtering and sigma-delta modulation on each channel and for neural stimulus generation (in the opposite path), based on directives sent by a digital control system mapped on a low-cost Xilinx Spartan-3E 1600 FPGA development board. The digital system also performs the multi-channel sigma-delta decimation, with a high-order band-pass filter as first stage in order to remove the unwanted interference. In this way, the analog chip can be implanted near the electrodes thanks to its limited size, which avoids both adding significant noise to the weak neural signals through long wire connections and causing heat-related infections; the complexity is shifted to the digital part, which can be hosted on a separate device in the stump of the amputee without complex laboratory instrumentation. The system has been successfully tested electrically and in in-vivo experiments, showing good results in terms of output resolution and noise rejection even under critical conditions. The output channels at the Nyquist sampling frequency coming from the acquisition system must then be processed to decode the amputee's movement intentions and to apply the corresponding electro-mechanical stimulation to the cybernetic hand in order to perform the desired motor action. Different decoding approaches have been presented in the past; most were conceived starting from the implementation and performance evaluation of their off-line version. Ultimately, these solutions must be developed on embedded systems that process the peripheral neural signals online. However, this is often possible only with complex hardware platforms clocked at very high operating frequencies, which are not compliant with the low-power requirements needed to make the prosthetic device portable.
At present, in fact, the real-time implementation of sophisticated signal processing algorithms on embedded systems has often been overlooked, notwithstanding the impact that the limited resources of such systems may have on the efficiency and effectiveness of any given algorithm. This research work addresses the optimization of a state-of-the-art algorithm for PNS signal decoding, a step towards its full real-time implementation on a floating-point Digital Signal Processor (DSP). Beyond low-level optimizations, different solutions have been proposed at a high level in order to find the best trade-off between effectiveness and efficiency. A latency model, obtained through cycle-accurate profiling of the different code sections, has been derived in order to perform a fair performance assessment. The proposed optimized real-time algorithm achieves up to 96% correct classification on real PNS signals acquired through tf-LIFE electrodes on animals, and performs as well as the best off-line algorithm for spike clustering on a synthetic cortical dataset characterized by a reasonable dissimilarity between the spike morphologies of different neurons. When real-time requirements are combined with area and power minimization for implantable/portable applications, such as the target neuroprosthetic devices, only custom VLSI implementations can be adopted. In this case, every part of the algorithm should be carefully tuned. To this end, the first preprocessing stage of the decoding algorithm, based on a Wavelet Denoising solution that can also remove in-band noise sources, has been analysed in depth in order to obtain an optimal hardware implementation. In particular, the usually overlooked threshold estimation step has been evaluated in terms of required hardware resources and functionality, using the commercial Xilinx System Generator tool for the architecture design and co-simulation.
The analysis has revealed that the widely used Median Absolute Deviation (MAD) can lead to hardware implementations that are highly inefficient compared to other dispersion estimators, which scale better for the specific application. Finally, two different hardware implementations of the reference decoding algorithm have been presented, highlighting the pros and cons of each. First, a novel approach based on high-level dataflow description and automatic hardware generation is presented and evaluated on the on-line template-matching spike sorting algorithm, which represents the most complex processing stage. It starts by identifying the individual kernels with the greatest computational complexity and uses their dataflow description to generate the HDL implementation of a coarse-grained reconfigurable global kernel that uses minimal resources, in order to reduce area and energy dissipation and so meet the low-power requirements imposed by the application. In the best case, results show a 71% area saving compared to more traditional solutions, without any accuracy penalty. With respect to single-kernel execution, better latency is achievable while still minimizing the number of resources used. Latency can also be improved by tuning the implemented parallelism for a given number of channels and real-time constraints, using more than one reconfigurable global kernel so that the same or different kernels can be executed in parallel, since each global kernel can process only sequentially. For this reason, a second FPGA-based prototype has been proposed, based on a Multi-Processor System-on-Chip (MPSoC) embedded architecture.
This prototype is capable of meeting the real-time constraints posed by the application when clocked at less than 50 MHz, compared to 300 MHz for the previous DSP implementation. Since the application workload is extremely data dependent and unpredictable due to the sparsity of the neural signals, the architecture has to be dimensioned for critical worst-case operating conditions in order to always ensure correct functionality. To compensate for the resulting overprovisioning of the system architecture, software-controllable power management based on clock gating has been integrated in order to minimize the dynamic power consumption of the resulting solution. In summary, this research work can be considered a proof of concept for the proposed techniques, covering the design issues that characterize each stage of the closed-loop system with a view to a portable, low-power, real-time hardware implementation of the neuro-controlled prosthetic device.
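The threshold estimation step of wavelet denoising mentioned in this abstract can be illustrated with a minimal sketch. It assumes the standard universal threshold with a MAD-based noise estimate (sigma = MAD / 0.6745), which the abstract does not spell out. The mean-absolute-deviation alternative is only an illustrative example of a sort-free dispersion estimator, not necessarily one of the estimators evaluated in the thesis. All function names are hypothetical.

```python
import math
import statistics


def mad_sigma(coeffs):
    """Noise estimate from the Median Absolute Deviation (needs two sorts)."""
    med = statistics.median(coeffs)
    mad = statistics.median([abs(c - med) for c in coeffs])
    return mad / 0.6745  # scaling factor for Gaussian noise


def mean_abs_sigma(coeffs):
    """Assumed alternative: mean absolute deviation (no sorting required)."""
    mean = sum(coeffs) / len(coeffs)
    mean_dev = sum(abs(c - mean) for c in coeffs) / len(coeffs)
    return mean_dev * math.sqrt(math.pi / 2.0)  # Gaussian scaling


def universal_threshold(coeffs, sigma_fn=mad_sigma):
    """Universal threshold: sigma * sqrt(2 * ln(N))."""
    return sigma_fn(coeffs) * math.sqrt(2.0 * math.log(len(coeffs)))


def soft_threshold(c, t):
    """Shrink a coefficient towards zero by t, keeping its sign."""
    return math.copysign(max(abs(c) - t, 0.0), c)


if __name__ == "__main__":
    detail = [0.0, 1.0, -1.0, 2.0, -2.0]
    t = universal_threshold(detail)
    print([soft_threshold(c, t) for c in detail])
```

The median requires ordering the whole window of coefficients, which is one plausible reason a MAD block scales poorly in hardware, whereas a mean-based estimator needs only accumulators.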

    Spiking Neural Networks models targeted for implementation on Reconfigurable Hardware

    This thesis focuses on the so-called third generation of artificial neural networks, Spiking Neural Networks (SNN), also called 'spike-based' or 'event-based' networks. This research field became a popular and important topic over the last decade owing to progress in computational neuroscience. Spiking Neural Networks, which have not only spatial but also temporal plasticity, offer a promising alternative to classical artificial neural networks (ANN) and are closer to the real operation of biological neurons, since information is encoded and transmitted using multiple spikes or events in the form of pulse trains. The field has been growing in recent years and has broadened the area of neuromorphic engineering, whose main line of work is the use of analog, digital and mixed analog/digital VLSI and software that implement models of spiking neural systems. This thesis analyses Spiking Neural Networks from a Machine Learning perspective, where biological plausibility is not the main goal; rather, the ability to create SNN-based artificial intelligence algorithms is one of the main goals, together with the feasibility of their hardware implementation. To meet these goals, several neuron models and network topologies are reviewed and compared. Spike coding, that is, the representation of data with spikes, is also discussed in this work. The development of SNN topologies and algorithms capable of providing artificial intelligence capabilities based on spikes input to the system is one of the main topics of this thesis. However, emphasis is also placed on hardware implementation, since there are complex SNN models that in many cases are not viable for real-time systems and require high-performance computing systems to run.
The main research topic of this work is the evaluation of existing algorithms and the development of new algorithms, data structures and coding methods for the hardware implementation of spiking neural networks, especially targeted at FPGAs (Field-Programmable Gate Arrays). FPGA devices are chosen for their excellent massively parallel computing capabilities, low power consumption, low latency and versatility. In recent years, FPGAs have become a popular platform for classical machine learning tasks such as image recognition, automatic control, time series prediction, robotics, etc. The thesis thus investigates all the issues related to deploying a complete spike-based hardware system, from the encoding of external information as inputs to the final output of an SNN-based artificial intelligence system, including the optimization of data transmission, all implemented in hardware architectures that optimize performance and allow spiking networks with a large number of neurons to be implemented. A new simplified architecture for LIF (Leaky Integrate-and-Fire) neurons is proposed. The neuron is evaluated in Perceptron and Restricted Boltzmann Machine (RBM) networks to test its performance. In addition, the learning capabilities of the proposed networks are developed by defining an optimized procedure for STDP (Spike Time Dependent Plasticity) learning. The proposed software optimizations are complemented by new hardware architectures specially designed for FPGA implementation.
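For readers unfamiliar with the LIF model named above, the following is a minimal sketch of a generic discrete-time Leaky Integrate-and-Fire neuron. It is a textbook formulation, not the simplified architecture proposed in the thesis; the function names and the parameters `leak`, `threshold` and `v_reset` are assumptions chosen for illustration.

```python
def lif_step(v, input_current, leak=0.9, threshold=1.0, v_reset=0.0):
    """Advance a generic LIF neuron by one time step.

    The membrane potential decays by the factor `leak`, integrates the
    input, and fires (then resets) when it reaches `threshold`.
    Returns the new potential and whether a spike was emitted.
    """
    v = leak * v + input_current
    if v >= threshold:
        return v_reset, True
    return v, False


def run_lif(inputs, **params):
    """Feed a sequence of input values and collect the spike train."""
    v = 0.0
    spikes = []
    for current in inputs:
        v, fired = lif_step(v, current, **params)
        spikes.append(fired)
    return spikes


if __name__ == "__main__":
    # Constant drive: the potential builds up until it crosses the threshold.
    print(run_lif([0.5] * 5, leak=0.5, threshold=0.9))
```

With a constant input of 0.5, a leak of 0.5 and a threshold of 0.9, the potential takes the values 0.5, 0.75, 0.875 and 0.9375, so the neuron fires on the fourth step and then restarts from the reset value.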
Regarding hardware architectures, this thesis defines the so-called 'automaton neuron', based on a spike representation format that is also novel and defined in this thesis, called 'Variable Timeslot Length Address-Event Representation' (VTSAER). This format is more versatile than previous AER proposals, eliminating the need for timestamps and allowing true synchronism of any arbitrary number of events. The structure of VTSAER allows the information in spiking neurons to be processed as an event-fed finite automaton. This new approach helps to decouple the system state from the input data rate and to reduce the number of input/output channels. Another novelty proposed in this thesis is a vectorized architecture for neural network layers. This architecture allows the state of any arbitrary number of layers to be computed by reusing the same hardware neuron blocks several times. This vector data processing concept can be applied not only to spiking neural networks but also to classical non-spiking ANN-type neural networks and other machine learning algorithms. With the vectorized architecture and the automaton neuron, the limiting factor for network size is only the amount of memory in the FPGA, which is a significant improvement over previous implementations. As for learning algorithms for SNN, this thesis describes a new application of the Spike Timing Dependent Plasticity learning algorithm. STDP remains the most popular learning algorithm for spiking neural networks, derived from observations of biological phenomena. Digital hardware implementations of STDP are rarely found, because the algorithm uses backward timing causality, which requires significant hardware resources.
The new implementation proposed in this thesis solves the causality problem with very small hardware overhead. The improved version of STDP can be used in networks with an arbitrary number of neurons. The weight update process is independent for each neuron and does not affect the global flow of input spikes. The FPGA implementation of visual coding algorithms is also covered in this thesis. The coding of Gabor-type visual receptive fields is described and two hardware implementations are presented. The receptive field coding method is very similar to the convolution operation used in non-spiking neural networks. Orientation-specific Gabor fields are important in image processing, since they are well-studied phenomena observed in the mammalian visual cortex and perform well in image processing and spike coding tasks. The two proposed FPGA implementations are a parallel architecture and a vectorized one. The comparison is carried out using receptive field sizes typically used in practical tasks, showing the application possibilities of each proposed implementation. In addition, implementing algorithms in digital hardware requires adapting the arithmetic, since fixed-point arithmetic is used to avoid the additional complexity of floating-point calculations. An extensive study of fixed-point arithmetic in spike coding and processing hardware is therefore carried out to show that fixed point can provide the required accuracy and precision at lower computational and resource cost. All the proposed algorithms and architectures are tested by solving classical problems with open datasets so that they can be compared with the work of other authors: the SEMEION and Iris datasets are used in this case.
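The trade-off behind fixed-point arithmetic can be made concrete with a small sketch. It assumes a simple signed format with a configurable number of fractional bits; the helper names and the choice of 8 fractional bits are illustrative assumptions, not the word lengths studied in the thesis.

```python
def to_fixed(x, frac_bits):
    """Quantize a real value to an integer with `frac_bits` fractional bits."""
    return int(round(x * (1 << frac_bits)))


def to_float(q, frac_bits):
    """Convert a fixed-point integer back to a real value."""
    return q / (1 << frac_bits)


def fixed_mul(a, b, frac_bits):
    """Multiply two fixed-point values and rescale the product.

    The right shift discards the extra fractional bits; in Python it
    rounds towards negative infinity for negative products.
    """
    return (a * b) >> frac_bits


def fixed_dot(weights, inputs, frac_bits):
    """Weighted sum using only integer multiplies, shifts and adds."""
    return sum(fixed_mul(w, x, frac_bits) for w, x in zip(weights, inputs))


if __name__ == "__main__":
    FRAC = 8  # assumed number of fractional bits
    w = [to_fixed(v, FRAC) for v in (0.5, -0.25)]
    x = [to_fixed(v, FRAC) for v in (0.25, 0.5)]
    print(to_float(fixed_dot(w, x, FRAC), FRAC))
```

Only integer multipliers, shifters and adders are needed, which is why fixed point costs fewer FPGA resources than floating point; the price is a quantization error bounded by the chosen number of fractional bits.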
Regarding hardware results, the proposed digital architectures allow a high clock frequency, close to the maximum allowed by the FPGA device (reaching up to 387 MHz). The proposed algorithms and architectures also allow SNNs of arbitrary size, limited only by the capacity of the device. All of the above forms a complex, novel solution for implementing spiking neural networks on FPGA hardware, with a processing speed several hundred times faster than software simulations and comparable accuracy. The proposed hardware blocks are versatile, able to implement a wide range of modifications of the described algorithms and to accommodate multiple SNN topologies with different numbers of inputs, layers, neurons per layer and outputs, different bit lengths and, in general, those parameters that allow multiple forms of SNN to be implemented. Overall, using the hardware blocks developed in this thesis, it is possible to build a self-sufficient massive neuromorphic system with a complete processing cycle carried out within one chip. In this way, neuromorphic systems could be implemented at lower cost in terms of development and design time, together with simpler hardware boards.
This thesis describes a novel architecture of Spiking Neural Networks implemented in hardware using Field-Programmable Gate Arrays. Starting from state-of-the-art theoretical and practical works, a new approach to the problem is proposed. The presented work deals with both software and hardware topics such as:
• Spiking neural models with focus on their performance and feasibility in hardware. A novel simplified neuron model is created and tested.
• Learning of SNNs in software and hardware. The well-known learning algorithms are implemented and tested with the simplified neuron model.
• Data representation and conversion in spiking neural systems. A new version of the Address-Event Representation protocol is proposed, effectively allowing the finite automata approach to SNN implementation. A novel hardware architecture to encode images is presented.
• Hardware platforms' resources and their usability for SNN implementation. The latest commercial FPGA devices are evaluated as the prospective platform for large-scale SNN implementation.
• Spiking perceptron and spiking Restricted Boltzmann Machine implementation. Two popular network models are implemented and tested, utilizing the proposed neuronal model.
• Neural network learning in hardware. The previously studied algorithms are implemented in hardware.
The aforementioned material was partially published in two journal and five conference papers. The system has been fully developed and tested using public domain datasets.