A 16-Channel Fully Configurable Neural SoC With 1.52 μW/Ch Signal Acquisition, 2.79 μW/Ch Real-Time Spike Classifier, and 1.79 TOPS/W Deep Neural Network Accelerator in 22 nm FDSOI

Authors: Bauer Heiner; Dixius Andreas; Ellguth Georg; George Richard; Hänzsche Stefan; Höppner Sebastian; Kelber Florian; Mayr Christian; Scholze Stefan; Schüffny Franz Marcus; Stolba Marco; Walter Dennis; Zeinolabedin Seyed Mohammad Ali
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 20/01/2023
With the advent of high-density micro-electrode arrays, developing neural probes that satisfy real-time and stringent power-efficiency requirements becomes more challenging. A smart neural probe is an essential device in future neuroscientific research and medical applications. To realize such devices, we present a 22 nm FDSOI SoC with complex on-chip real-time data processing and training for neural signal analysis. It consists of a digitally-assisted 16-channel analog front-end with 1.52 μW/Ch, dedicated bio-processing accelerators for spike detection and classification with 2.79 μW/Ch, and a 125 MHz RISC-V CPU, utilizing adaptive body biasing at 0.5 V with a supporting 1.79 TOPS/W MAC array. The proposed SoC serves as a proof of concept for high-level integration of various on-chip accelerators to satisfy the neural probe requirements of modern applications.

Qucosa

HSSS - Hochschulschriftenserver der SLUB

Technische Universität Dresden: Qucosa

Recommended from our members

Algorithm and Hardware Co-Design for Local/Edge Computing

Author: Jiang Zhewei
Publication venue: 'Columbia University Libraries/Information Services'
Publication date: 01/01/2020
Advances in VLSI manufacturing and design technology over the decades have created many computing paradigms for disparate computing needs. Given concerns about the transmission cost, security, and latency of centralized computing, edge/local computing is increasingly prevalent in fast-growing sectors such as the Internet-of-Things (IoT) and in other sectors that require energy- and connectivity-autonomous systems, such as biomedical and industrial applications. Energy and power efficiency are the main design constraints in local and edge computing. While a wide range of low-power design techniques exists, they are often underutilized in custom circuit designs because the algorithms are developed independently of the hardware. Such a compartmentalized design approach fails to take advantage of the many compatible algorithmic and hardware techniques that can improve the efficiency of the entire system. Algorithm-hardware co-design explores the design space with whole-stack awareness. The main goal of the algorithm-hardware co-design methodology is to enable and improve small-form-factor edge and local VLSI systems operating under strict constraints of area and energy efficiency. This thesis presents selected works of application-specific digital and mixed-signal integrated circuit designs. The application space ranges from implantable biomedical devices to edge machine learning acceleration.

Columbia University Academic Commons

Simulation and implementation of novel deep learning hardware architectures for resource constrained devices

Author: Lammie Corey
Publication date: 01/01/2022
Corey Lammie designed mixed-signal memristive-complementary metal–oxide–semiconductor (CMOS) and field-programmable gate array (FPGA) hardware architectures, which were used to reduce the power and resource requirements of Deep Learning (DL) systems, both during inference and training. Disruptive design methodologies, such as those explored in this thesis, can be used to facilitate the design of next-generation DL systems.

ResearchOnline at James Cook University

Resource efficient on-node spike sorting

Author: Barsakcioglu Deren
Publication venue: Electrical and Electronic Engineering, Imperial College London
Publication date: 01/01/2016
Current implantable brain-machine interfaces record multi-neuron activity using multi-channel, multi-electrode micro-electrodes. With the rapid increase in recording capability have come more stringent constraints on implantable system power consumption and size. This is even more so with the increasing demand for wireless systems to increase the number of channels being monitored whilst overcoming the communication bottleneck (in transmitting raw data) via transcutaneous bio-telemetries. For systems observing unit activity, real-time spike sorting within an implantable device offers a unique solution to this problem. However, achieving such data compression prior to transmission via an on-node spike sorting system has several challenges. The inherent complexity of the spike sorting problem, arising from various factors (such as signal variability, local field potentials, background and multi-unit activity), has required computationally intensive algorithms (e.g. PCA, wavelet transform, superparamagnetic clustering). Hence spike sorting systems have traditionally been implemented off-line, usually run on workstations. Owing to their complexity and poor scalability, these algorithms cannot simply be transformed into resource-efficient hardware. Conversely, although there have been several attempts in implantable hardware, an implementation that matches the accuracy of off-line methods within the power and area requirements of future BMIs has yet to be proposed. Within this context, this research aims to fill in the gaps in the design towards a resource-efficient implantable real-time spike sorter which achieves performance comparable to off-line methods. The research covered in this thesis targets: 1) Identifying and quantifying the trade-offs on subsequent signal processing performance and hardware resource utilisation of the parameters associated with the analogue front-end. Following the development of a behavioural model of the analogue front-end and an optimisation tool, the sensitivity of the spike sorting accuracy to different front-end parameters is quantified. 2) Identifying and quantifying the trade-offs associated with a two-stage hybrid solution to realising real-time on-node spike sorting. The initial part of the work considers template matching only, while the second part considers these parameters from the point of view of the whole system, including detection, sorting, and off-line training (template building). A set of minimum requirements is established which ensures robust, accurate and resource-efficient operation. 3) Developing new feature extraction and spike sorting algorithms towards highly scalable systems. Based on the waveform dynamics of the observed action potentials, a derivative-based feature extraction method and a spike sorting algorithm are proposed. These are compared with the most commonly used methods of spike sorting under varying noise levels using realistic datasets to confirm their merits. The latter is implemented and demonstrated in real time on an MCU-based platform.

Spiral - Imperial College Digital Repository