198 research outputs found
Recommended from our members
Very-Large-Scale-Integration Circuit Techniques in Internet-of-Things Applications
Heading towards the era of Internet-of-things (IoT) means both opportunity and challenge for the circuit-design community. In a system where billions of devices are equipped with the ability to sense, compute, communicate with each other and perform tasks in a coordinated manner, security and power management are among the most critical challenges.
Physically unclonable function (PUF) emerges as an important security primitive in hardware-security applications; it provides an object-specific physical identifier hidden within the intrinsic device variations, which is hard to expose and reproduce by adversaries. Yet, designing a compact PUF robust to noise, temperature and voltage remains a challenge.
This thesis presents a novel PUF design approach based on a pair of ultra-compact analog circuits whose output is proportional to absolute temperature. The proposed approach is demonstrated through two works: (1) an ultra-compact and robust PUF based on voltage-compensated proportional-to-absolute-temperature voltage generators that occupies 8.3× less area than the previous work with the similar robustness and twice the robustness of the previously most compact PUF design and (2) a technique to transform a 6T-SRAM array into a robust analog PUF with minimal overhead. In this work, similar circuit topology is used to transform a preexisting on-chip SRAM into a PUF, which further reduces the area in (1) with no robustness penalty.
In this thesis, we also explore techniques for power management circuit design.
Energy harvesting is an essential functionality in an IoT sensor node, where battery replacement is cost-prohibitive or impractical. Yet, existing energy-harvesting power management units (EH PMU) suffer from efficiency loss in the two-step voltage conversion: harvester-to-battery and battery-to-load. We propose an EH PMU architecture with hybrid energy storage, where a capacitor is introduced in addition to the battery to serve as an intermediate energy buffer to minimize the battery involvement in the system energy flow. Test-case measurements show as much as a 2.2× improvement in the end-to-end energy efficiency.
In contrast, with the drastically reduced power consumption of IoT nodes that operates in the sub-threshold regime, adaptive dynamic voltage scaling (DVS) for supply-voltage margin removal, fully on-chip integration and high power conversion efficiency (PCE) are required in PMU designs. We present a PMU–load co-design based on a fully integrated switched-capacitor DC-DC converter (SC-DC) and hybrid error/replica-based regulation for a fully digital PMU control. The PMU is integrated with a neural spike processor (NSP) that achieves a record-low power consumption of 0.61 µW for 96 channels. A tunable replica circuit is added to assist the error regulation and prevent loss of regulation. With automatic energy-robustness co-optimization, the PMU can set the SC-DC’s optimal conversion ratio and switching frequency. The PMU achieves a PCE of 77.7% (72.2%) at VIN = 0.6 V (1 V) and at the NSP’s margin-free operating point
Automatic Spike sorting and robust power line interference cancellation for neural signal processing
Ph.DDOCTOR OF PHILOSOPH
Resource efficient on-node spike sorting
Current implantable brain-machine interfaces are recording multi-neuron activity by utilising multi-channel, multi-electrode micro-electrodes. With the rapid increase in recording capability has come more stringent constraints on implantable system power consumption and size. This is even more so with the increasing demand for wireless systems to increase the number of channels being monitored whilst overcoming the communication bottleneck (in transmitting raw data) via transcutaneous bio-telemetries. For systems observing unit activity, real-time spike sorting within an implantable device offers a unique solution to this problem.
However, achieving such data compression prior to transmission via an on-node spike sorting system has several challenges. The inherent complexity of the spike sorting problem arising from various factors (such as signal variability, local field potentials, background and multi-unit activity) have required computationally intensive algorithms (e.g. PCA, wavelet transform, superparamagnetic clustering). Hence spike sorting systems have traditionally been implemented off-line, usually run on work-stations. Owing to their complexity and not-so-well scalability, these algorithms cannot be simply transformed into a resource efficient hardware. On the contrary, although there have been several attempts in implantable hardware, an implementation to match comparable accuracy to off-line within the required power and area requirements for future BMIs have yet to be proposed.
Within this context, this research aims to fill in the gaps in the design towards a resource efficient implantable real-time spike sorter which achieves performance comparable to off-line methods. The research covered in this thesis target: 1) Identifying and quantifying the trade-offs on subsequent signal processing performance and hardware resource utilisation of the parameters associated with analogue-front-end. Following the development of a behavioural model of the analogue-front-end and an optimisation tool, the sensitivity of the spike sorting accuracy to different front-end parameters are quantified. 2) Identifying and quantifying the trade-offs associated with a two-stage hybrid solution to realising real-time on-node spike sorting. Initial part of the work focuses from the perspective of template matching only, while the second part of the work considers these parameters from the point of whole system including detection, sorting, and off-line training (template building). A set of minimum requirements are established which ensure robust, accurate and resource efficient operation. 3) Developing new feature extraction and spike sorting algorithms towards highly scalable systems. Based on waveform dynamics of the observed action potentials, a derivative based feature extraction and a spike sorting algorithm are proposed. These are compared with most commonly used methods of spike sorting under varying noise levels using realistic datasets to confirm their merits. The latter is implemented and demonstrated in real-time through an MCU based platform.Open Acces
A Survey of Spiking Neural Network Accelerator on FPGA
Due to the ability to implement customized topology, FPGA is increasingly
used to deploy SNNs in both embedded and high-performance applications. In this
paper, we survey state-of-the-art SNN implementations and their applications on
FPGA. We collect the recent widely-used spiking neuron models, network
structures, and signal encoding formats, followed by the enumeration of related
hardware design schemes for FPGA-based SNN implementations. Compared with the
previous surveys, this manuscript enumerates the application instances that
applied the above-mentioned technical schemes in recent research. Based on
that, we discuss the actual acceleration potential of implementing SNN on FPGA.
According to our above discussion, the upcoming trends are discussed in this
paper and give a guideline for further advancement in related subjects
ZyON: Enabling Spike Sorting on APSoC-Based Signal Processors for High-Density Microelectrode Arrays
Multi-Electrode Arrays and High-Density Multi-Electrode Arrays of sensors are a key instrument in neuroscience research. Such devices are evolving to provide ever-increasing temporal and spatial resolution, paving the way to unprecedented results when it comes to understanding the behaviour of neuronal networks and interacting with them. However, in some experimental cases, in-place low-latency processing of the sensor data acquired by the arrays is required. This poses the need for high-performance embedded computing platforms capable of processing in real-time the stream of samples produced by the acquisition front-end to extract higher-level information. Previous work has demonstrated that Field-Programmable Gate Array and All-Programmable System-On-Chip devices are suitable target technology for the implementation of real-time processors of High-Density Multi-Electrode Arrays data. However, approaches available in literature can process a limited number of channels or are designed to execute only the first steps of the neural signal processing chain. In this work, we propose an All-Programmable System-On-Chip based implementation capable of sorting neural spikes acquired by the sensors, to associate the shape of each spike to a specific firing neuron. Our system, implemented on a Xilinx Z7020 All-Programmable System-On-Chip is capable of executing on-line spike sorting up to 5500 acquisition channels, 43x more than state-of-the-art alternatives, supporting 18KHz acquisition frequency. We present an experimental study on a commonly used reference dataset, using on-line refinement of the sorting clusters to improve accuracy up to 82%, with only 4% degradation with respect to off-line analysis
Acquisition systems and decoding algorithms of peripheral neural signals for prosthetic applications
During the years, neuroprosthetic applications have obtained a great deal of attention by
the international research, especially in the bioengineering field, thanks to the huge investments
on several proposed projects funded by the political institutions which consider the treatment of this particular disease of fundamental importance for the global community.
The aim of these projects is to find a possible solution to restore the functionalities lost by a patient subjected to an upper limb amputation trying to develop, according to physiological considerations, a communication link between the brain in which the significant signals are
generated and a motor prosthesis device able to perform the desired action. Moreover, the designed system must be able to give back to the brain a sensory feedback about the surrounding world in terms of pressure or temperature acquired by tactile biosensors placed at the surface of the cybernetic hand. It in fact allows to execute involuntarymovements when for example the armcomes in contact with hot objects.
The development of such a closed-loop architecture involves the need to address some critical issues which depend on the chosen approach. Several solutions have been proposed
by the researches of the field, each one differing with respect to where the neural signals are acquired, either at the central nervous systemor at the peripheral one,most of themfollowing the former even that the latter is always considered by the amputees amore natural way
to handle the artificial limb. This research work is based on the use of intrafascicular electrodes directly implanted in the residual peripheral nerves of the stump which represents a good compromise choice in terms of invasiveness and selectivity extracting electroneurographic
(ENG) signals from which it is possible to identify the significant activity of a quite limited number of neuronal cells. In the perspective of the hardware implementation of
the resulting solution which can work autonomously without any intervention by the amputee in an adaptive way according to the current characteristics of the processed signal and by using batteries as power source allowing portability, it is necessary to fulfill the tight
constraints imposed by the application under consideration involved in each of the various phases which compose the considered closed-loop system.
Regarding to the recording phase, the implementation must be able to remove the unwanted interferences mainly due to the electro-stimulations of themuscles placed near the
electrodes featured by an order of magnitude much greater in comparison to that of the signals of interest amplifying the frequency components belonging to the significant bandwidth, and to convert them with a high resolution in order to obtain good performance at the next processing phases. To this aim, a recording module for peripheral neural signals will be presented, based on the use of a sigma-delta architecture which is composed by
two main parts: an analog front-end stage for neural signal acquisition, pre-filtering and sigma-delta modulation and a digital unit for sigma-delta decimation and system configuration.
Hardware/software cosimulations exploiting the Xilinx System Generator tool in Matlab Simulink environment and then transistor-level simulations confirmed that the system
is capable of recording neural signals in the order of magnitude of tens of μV rejecting the huge low-frequency noise due to electromyographic interferences.
The same architecture has been then exploited to implement a prototype of an 8-channel implantable electronic bi-directional interface between the peripheral nervous system and the neuro-controlled hand prosthesis. The solution includes a custom designed Integrated Circuit (0.35μm CMOS technology), responsible of the signal pre-filtering and sigma-delta modulation for each channel and the neural stimuli generation (in the opposite path) based
on the directives sent by a digital control systemmapped on a low-cost Xilinx FPGA Spartan-3E 1600 development board which also involves the multi-channel sigma-delta decimation
with a high-order band-pass filter as first stage in order to totally remove the unwanted interferences.
In this way, the analog chip can be implanted near the electrodes thanks to its limited size avoiding to add a huge noise to theweak neural signals due to longwires connections and to cause heat-related infections, shifting the complexity to the digital part which can be hosted on a separated device in the stump of the amputeewithout using complex laboratory instrumentations. The system has been successfully tested from the electrical point
of view and with in-vivo experiments exposing good results in terms of output resolution and noise rejection even in case of critical conditions.
The various output channels at the Nyquist sampling frequency coming from the acquisition system must be processed in order to decode the intentions of movements of the amputee, applying the correspondent electro-mechanical stimulation in input to the cybernetic hand in order to perform the desired motor action. Different decoding approaches have been presented in the past, the majority of them were conceived starting from the relative
implementation and performance evaluation of their off-line version. At the end of the research, it is necessary to develop these solutions on embedded systems performing an online processing of the peripheral neural signals. However, it is often possible only by using
complex hardware platforms clocked at very high operating frequencies which are not be compliant with the low-power requirements needed to allow portability for the prosthetic
device.
At present, in fact, the important aspect of the real-time implementation of sophisticated signal processing algorithms on embedded systems has been often overlooked, notwithstanding the impact that limited resources of the former may have on the efficiency/effectiveness
of any given algorithm. In this research work it has been addressed the optimization of a state-of-the-art algorithmfor PNS signals decoding that is a step forward for its real-time, full implementation onto a floating-point Digital Signal Processor (DSP). Beyond low-level
optimizations, different solutions have been proposed at an high level in order to find the best trade-off in terms of effectiveness/efficiency. A latency model, obtained through cycle accurate profiling of the different code sections, has been drawn in order to perform a fair
performance assessment. The proposed optimized real-time algorithmachieves up to 96% of correct classification on real PNS signals acquired through tf-LIFE electrodes on animals, and performs as the best off-line algorithmfor spike clustering on a synthetic cortical dataset
characterized by a reasonable dissimilarity between the spikemorphologies of different neurons.
When the real-time requirements are joined to the fulfilment of area and power minimization
for implantable/portable applications, such as for the target neuroprosthetic devices, only custom VLSI implementations can be adopted. In this case, every part of the algorithmshould be carefully tuned. To this aim, the first preprocessing stage of the decoding algorithmbased on the use of aWavelet Denoising solution able to remove also the in-band noise sources has been deeply analysed in order to obtain an optimal hardware implementation.
In particular, the usually overlooked part related to threshold estimation has been evaluated in terms of required hardware resources and functionality, exploiting the commercial Xilinx System Generator tool for the design of the architecture and the co-simulation. The
analysis has revealed how the widely used Median Absolute Deviation (MAD) could lead o hardware implementations highly inefficient compared to other dispersion estimators
demonstrating better scalability, relatively to the specific application.
Finally, two different hardware implementations of the reference decoding algorithm have been presented highlighting pros and cons of each one of them. Firstly, a novel approach based on high-level dataflow description and automatic hardware generation is presented
and evaluated on the on-line template-matching spike sorting algorithmwhich represents the most complex processing stage. It starts from the identification of the single kernels with the greater computational complexity and using their dataflow description to generate the HDL implementation of a coarse-grained reconfigurable global kernel characterized by theminimumresources in order to reduce the area and the energy dissipation for
the fulfilment of the low-power requirements imposed by the application. Results in the best case have revealed a 71%of area saving compared tomore traditional solutions,without any accuracy penalty. With respect to single kernels execution, better latency performance are achievable stillminimizing the number of adopted resources.
The performance in terms of latency can also be improved by tuning the implemented parallelismin the light of a defined number of channels and real-time constraints, by using
more than one reconfigurable global kernel in order that they can be exploited to perform the same or different kernels at the same time in a parallel way, due to the fact that each one can execute the relative processing only in a sequential way. For this reason, a second FPGA-based prototype has been proposed based on the use of aMulti-Processor System-on-Chip (MPSoC) embedded architecture. This prototype is capable of respecting the real-time
constraints posed by the application when clocked at less than 50 MHz, in comparison to 300 MHz of the previous DSP implementation. Considering that the application workload
is extremely data dependent and unpredictable due to the sparsity of the neural signals, the architecture has to be dimensioned taking into account critical worst-case operating conditions in order to always ensure the correct functionality. To compensate the resulting overprovisioning
of the system architecture, a software-controllable power management based on the use of clock gating techniques has been integrated in order tominimize the dynamic
power consumption of the resulting solution.
Summarizing, this research work can be considered a sort of proof-of-concept for the proposed techniques considering all the design issues which characterize each stage of the
closed-loop system in the perspective of a portable low-power real-time hardware implementation of the neuro-controlled prosthetic device
Acquisition systems and decoding algorithms of peripheral neural signals for prosthetic applications
During the years, neuroprosthetic applications have obtained a great deal of attention by
the international research, especially in the bioengineering field, thanks to the huge investments
on several proposed projects funded by the political institutions which consider the treatment of this particular disease of fundamental importance for the global community.
The aim of these projects is to find a possible solution to restore the functionalities lost by a patient subjected to an upper limb amputation trying to develop, according to physiological considerations, a communication link between the brain in which the significant signals are
generated and a motor prosthesis device able to perform the desired action. Moreover, the designed system must be able to give back to the brain a sensory feedback about the surrounding world in terms of pressure or temperature acquired by tactile biosensors placed at the surface of the cybernetic hand. It in fact allows to execute involuntarymovements when for example the armcomes in contact with hot objects.
The development of such a closed-loop architecture involves the need to address some critical issues which depend on the chosen approach. Several solutions have been proposed
by the researches of the field, each one differing with respect to where the neural signals are acquired, either at the central nervous systemor at the peripheral one,most of themfollowing the former even that the latter is always considered by the amputees amore natural way
to handle the artificial limb. This research work is based on the use of intrafascicular electrodes directly implanted in the residual peripheral nerves of the stump which represents a good compromise choice in terms of invasiveness and selectivity extracting electroneurographic
(ENG) signals from which it is possible to identify the significant activity of a quite limited number of neuronal cells. In the perspective of the hardware implementation of
the resulting solution which can work autonomously without any intervention by the amputee in an adaptive way according to the current characteristics of the processed signal and by using batteries as power source allowing portability, it is necessary to fulfill the tight
constraints imposed by the application under consideration involved in each of the various phases which compose the considered closed-loop system.
Regarding to the recording phase, the implementation must be able to remove the unwanted interferences mainly due to the electro-stimulations of themuscles placed near the
electrodes featured by an order of magnitude much greater in comparison to that of the signals of interest amplifying the frequency components belonging to the significant bandwidth, and to convert them with a high resolution in order to obtain good performance at the next processing phases. To this aim, a recording module for peripheral neural signals will be presented, based on the use of a sigma-delta architecture which is composed by
two main parts: an analog front-end stage for neural signal acquisition, pre-filtering and sigma-delta modulation and a digital unit for sigma-delta decimation and system configuration.
Hardware/software cosimulations exploiting the Xilinx System Generator tool in Matlab Simulink environment and then transistor-level simulations confirmed that the system
is capable of recording neural signals in the order of magnitude of tens of μV rejecting the huge low-frequency noise due to electromyographic interferences.
The same architecture has been then exploited to implement a prototype of an 8-channel implantable electronic bi-directional interface between the peripheral nervous system and the neuro-controlled hand prosthesis. The solution includes a custom designed Integrated Circuit (0.35μm CMOS technology), responsible of the signal pre-filtering and sigma-delta modulation for each channel and the neural stimuli generation (in the opposite path) based
on the directives sent by a digital control systemmapped on a low-cost Xilinx FPGA Spartan-3E 1600 development board which also involves the multi-channel sigma-delta decimation
with a high-order band-pass filter as first stage in order to totally remove the unwanted interferences.
In this way, the analog chip can be implanted near the electrodes thanks to its limited size avoiding to add a huge noise to theweak neural signals due to longwires connections and to cause heat-related infections, shifting the complexity to the digital part which can be hosted on a separated device in the stump of the amputeewithout using complex laboratory instrumentations. The system has been successfully tested from the electrical point
of view and with in-vivo experiments exposing good results in terms of output resolution and noise rejection even in case of critical conditions.
The various output channels at the Nyquist sampling frequency coming from the acquisition system must be processed in order to decode the intentions of movements of the amputee, applying the correspondent electro-mechanical stimulation in input to the cybernetic hand in order to perform the desired motor action. Different decoding approaches have been presented in the past, the majority of them were conceived starting from the relative
implementation and performance evaluation of their off-line version. At the end of the research, it is necessary to develop these solutions on embedded systems performing an online processing of the peripheral neural signals. However, it is often possible only by using
complex hardware platforms clocked at very high operating frequencies which are not be compliant with the low-power requirements needed to allow portability for the prosthetic
device.
At present, in fact, the important aspect of the real-time implementation of sophisticated signal processing algorithms on embedded systems has been often overlooked, notwithstanding the impact that limited resources of the former may have on the efficiency/effectiveness
of any given algorithm. In this research work it has been addressed the optimization of a state-of-the-art algorithmfor PNS signals decoding that is a step forward for its real-time, full implementation onto a floating-point Digital Signal Processor (DSP). Beyond low-level
optimizations, different solutions have been proposed at an high level in order to find the best trade-off in terms of effectiveness/efficiency. A latency model, obtained through cycle accurate profiling of the different code sections, has been drawn in order to perform a fair
performance assessment. The proposed optimized real-time algorithmachieves up to 96% of correct classification on real PNS signals acquired through tf-LIFE electrodes on animals, and performs as the best off-line algorithmfor spike clustering on a synthetic cortical dataset
characterized by a reasonable dissimilarity between the spikemorphologies of different neurons.
When the real-time requirements are joined to the fulfilment of area and power minimization
for implantable/portable applications, such as for the target neuroprosthetic devices, only custom VLSI implementations can be adopted. In this case, every part of the algorithmshould be carefully tuned. To this aim, the first preprocessing stage of the decoding algorithmbased on the use of aWavelet Denoising solution able to remove also the in-band noise sources has been deeply analysed in order to obtain an optimal hardware implementation.
In particular, the usually overlooked part related to threshold estimation has been evaluated in terms of required hardware resources and functionality, exploiting the commercial Xilinx System Generator tool for the design of the architecture and the co-simulation. The
analysis has revealed how the widely used Median Absolute Deviation (MAD) could lead o hardware implementations highly inefficient compared to other dispersion estimators
demonstrating better scalability, relatively to the specific application.
Finally, two different hardware implementations of the reference decoding algorithm have been presented highlighting pros and cons of each one of them. Firstly, a novel approach based on high-level dataflow description and automatic hardware generation is presented
and evaluated on the on-line template-matching spike sorting algorithmwhich represents the most complex processing stage. It starts from the identification of the single kernels with the greater computational complexity and using their dataflow description to generate the HDL implementation of a coarse-grained reconfigurable global kernel characterized by theminimumresources in order to reduce the area and the energy dissipation for
the fulfilment of the low-power requirements imposed by the application. Results in the best case have revealed a 71%of area saving compared tomore traditional solutions,without any accuracy penalty. With respect to single kernels execution, better latency performance are achievable stillminimizing the number of adopted resources.
The performance in terms of latency can also be improved by tuning the implemented parallelismin the light of a defined number of channels and real-time constraints, by using
more than one reconfigurable global kernel in order that they can be exploited to perform the same or different kernels at the same time in a parallel way, due to the fact that each one can execute the relative processing only in a sequential way. For this reason, a second FPGA-based prototype has been proposed based on the use of aMulti-Processor System-on-Chip (MPSoC) embedded architecture. This prototype is capable of respecting the real-time
constraints posed by the application when clocked at less than 50 MHz, in comparison to 300 MHz of the previous DSP implementation. Considering that the application workload
is extremely data dependent and unpredictable due to the sparsity of the neural signals, the architecture has to be dimensioned taking into account critical worst-case operating conditions in order to always ensure the correct functionality. To compensate the resulting overprovisioning
of the system architecture, a software-controllable power management based on the use of clock gating techniques has been integrated in order tominimize the dynamic
power consumption of the resulting solution.
Summarizing, this research work can be considered a sort of proof-of-concept for the proposed techniques considering all the design issues which characterize each stage of the
closed-loop system in the perspective of a portable low-power real-time hardware implementation of the neuro-controlled prosthetic device
Real-time neural signal processing and low-power hardware co-design for wireless implantable brain machine interfaces
Intracortical Brain-Machine Interfaces (iBMIs) have advanced significantly over the past
two decades, demonstrating their utility in various aspects, including neuroprosthetic control
and communication. To increase the information transfer rate and improve the devices’
robustness and longevity, iBMI technology aims to increase channel counts to access more
neural data while reducing invasiveness through miniaturisation and avoiding percutaneous
connectors (wired implants). However, as the number of channels increases, the raw data
bandwidth required for wireless transmission also increases becoming prohibitive, requiring
efficient on-implant processing to reduce the amount of data through data compression or
feature extraction.
The fundamental aim of this research is to develop methods for high-performance neural spike processing co-designed within low-power hardware that is scaleable for real-time
wireless BMI applications. The specific original contributions include the following:
Firstly, a new method has been developed for hardware-efficient spike detection, which
achieves state-of-the-art spike detection performance and significantly reduces the hardware
complexity. Secondly, a novel thresholding mechanism for spike detection has been introduced. By incorporating firing rate information as a key determinant in establishing the spike
detection threshold, we have improved the adaptiveness of spike detection. This eventually
allows the spike detection to overcome the signal degradation that arises due to scar tissue
growth around the recording site, thereby ensuring enduringly stable spike detection results.
The long-term decoding performance, as a consequence, has also been improved notably.
Thirdly, the relationship between spike detection performance and neural decoding accuracy has been investigated to be nonlinear, offering new opportunities for further reducing
transmission bandwidth by at least 30% with minor decoding performance degradation.
In summary, this thesis presents a journey toward designing ultra-hardware-efficient spike
detection algorithms and applying them to reduce the data bandwidth and improve neural
decoding performance. The software-hardware co-design approach is essential for the next
generation of wireless brain-machine interfaces with increased channel counts and a highly
constrained hardware budget.
The fundamental aim of this research is to develop methods for high-performance neural spike processing co-designed within low-power hardware that is scaleable for real-time wireless BMI applications. The specific original contributions include the following:
Firstly, a new method has been developed for hardware-efficient spike detection, which achieves state-of-the-art spike detection performance and significantly reduces the hardware complexity. Secondly, a novel thresholding mechanism for spike detection has been introduced. By incorporating firing rate information as a key determinant in establishing the spike detection threshold, we have improved the adaptiveness of spike detection. This eventually allows the spike detection to overcome the signal degradation that arises due to scar tissue growth around the recording site, thereby ensuring enduringly stable spike detection results. The long-term decoding performance, as a consequence, has also been improved notably. Thirdly, the relationship between spike detection performance and neural decoding accuracy has been investigated to be nonlinear, offering new opportunities for further reducing transmission bandwidth by at least 30\% with only minor decoding performance degradation.
In summary, this thesis presents a journey toward designing ultra-hardware-efficient spike detection algorithms and applying them to reduce the data bandwidth and improve neural decoding performance. The software-hardware co-design approach is essential for the next generation of wireless brain-machine interfaces with increased channel counts and a highly constrained hardware budget.Open Acces
Resource-efficient algorithms and circuits for highly-scalable BMI channel architectures
The study of the human brain has for long fascinated mankind. This organ that controls all cognitive processes and physical actions remains, to this day, among the least understood biological systems. Several billions of neurons form intricate interconnected networks communicating information through through complex electrochemical activities. Electrode arrays, such as for EEG, ECoG, and MEAs (microelectrode arrays), have enabled the observation of neural activity through recording of these electrical signals for both investigative and clinical applications. Although MEAs are widely considered the most invasive such method for recording, they do however provide highest resolution (both spatially and temporally).
Due to close proximity, each microelectrode can pick up spiking activity from multiple neurons. This thesis focuses on the design and implementation of novel circuits and systems suitable for high channel count implantable neural interfaces. Implantability poses stringent requirements on the design, such as ultra-low power, small silicon footprint, reduced communication bandwidth and high efficiency to avoid information loss. The information extraction chain typically involves signal amplification and conditioning, spike detection, and spike sorting to determine the spatial and time firing pattern of each neuron.
This thesis first provides a background to the origin and basic electrophysiology of these biopotential signals followed by a thorough review of the relevant state-of-the circuits and systems for facilitating the neural interface.
Within this context, novel front-end circuits are presented for achieving resource-constrained biopotential amplification whilst additionally considering the signal dynamics and realistic requirements for effective classification. Specifically, it is shown how a band-limited biopotential amplifier can reduce power requirements without compromising detectability. Furthermore through the development of a novel automatic gain control for neural spike recording, the dynamic range of the signal in subsequent processing blocks can be maintained in multichannel systems. This is particularly effective if now considering systems that no longer requiring independent tuning of amplification gains for each individual channel. This also alleviates the common requirement to over-spec the resolution in data conversion therefore saving power, area and data capacity.
Dealing with basic spike detection and feature extraction, a novel circuit for maxima detection is presented for identifying and signalling the onset of spike peaks and troughs. This is then combined with a novel non-linear energy operator (NEO) preprocessor and applied to spike detection. This again contributes to the general theme of achieving a calibration-free multi-channel system that is signal-driven and adaptive. Another original contribution herein includes a spike rate encoder circuit suitable for applications that are not are not affected by providing multi-unit responses.
Finally, spike sorting (feature extraction and clustering) is examined. A new method for feature extraction is proposed based on utilising the extrema of the first and second derivatives of the signal. It is shown that this provides an extremely resource-efficient metric than can achieve noise immunity than other methods of comparable complexity. Furthermore, a novel unsupervised clustering method is proposed which adaptively determines the number of clusters and assigns incoming spikes to appropriate cluster on-the-fly. In addition to high accuracy achieved by the combination of these methods for spike sorting, a major advantage is their low-computational complexity that renders them readily implementable in low-power hardware.Open Acces
- …