Development and Validation of a Spike Detection and Classification Algorithm Aimed at Implementation on Hardware Devices by Biffi, E. et al.
Hindawi Publishing Corporation
Computational Intelligence and Neuroscience
Volume 2010, Article ID 659050, 15 pages
doi:10.1155/2010/659050
Research Article
Developmentand Validation of
a Spike Detection and Classiﬁcation Algorithm Aimedat
Implementationon Hardware Devices
E.Bifﬁ,1 D. Ghezzi,2 A.Pedrocchi,1 and G. Ferrigno1
1Neuroengineering and Medical Robotics Laboratory, Bioengineering Department, Politecnico di Milano,
Piazza Leonardo da Vinci 32, 20133 Milano, Italy
2Neuroscience and Brain Technologies Department, Italian Institute of Technology, via Morego 30, 16163 Genova, Italy
Correspondence should be addressed to E. Biﬃ, emilia.biﬃ@mail.polimi.it
Received 1 March 2009; Revised 18 September 2009; Accepted 30 November 2009
Academic Editor: Karim Oweiss
Copyright © 2010 E. Biﬃ et al. This is an open access article distributed under the Creative Commons Attribution License, which
permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Neurons cultured in vitro on MicroElectrode Array (MEA) devices connect to each other, forming a network. To study
electrophysiologicalactivityandlongtermplasticityeﬀects,longperiodrecordingandspikesortermethodsareneeded.Therefore,
on-line and real time analysis, optimization of memory use and data transmission rate improvement become necessary. We
developed an algorithm for amplitude-threshold spikes detection, whose performances were veriﬁed with (a) statistical analysis
on both simulated and real signal and (b) Big O Notation. Moreover, we developed a PCA-hierarchical classiﬁer, evaluated on
simulated and real signal. Finally we proposed a spike detection hardware design on FPGA, whose feasibility was veriﬁed in terms
of CLBs number, memory occupation and temporal requirements; once realized, it will be able to execute on-line detection and
real time waveform analysis, reducing data storage problems.
1.Introduction
Neuronal cells communicate by means of electric pulses,
called Action Potentials (APs) or, brieﬂy, spikes [1, 2].
These voltage changes have been traditionally recorded with
conventional electrodes (e.g., glass pipettes), therefore the
number of neurons simultaneously recorded and the time
needed for electrodes placement are well known limits [3].
To overcome these experimental diﬃculties, the use of
MicroElectrodeArraybiochips(MEAs)guaranteesthepossi-
bilitytorecordextracellularactivityofneuronalpreparations
from tens of electrodes at the same time [4]. Because of the
inherent nature of the extracellular recording, each electrode
records the neuronal activity from a region, where generally
tens of neurons are present thus providing the acquisition of
a Multi Unit Activity (MUA). To extract the activity of every
single ﬁring unit inﬂuencing that electrode from the MUA,
we need a process called “spike sorting” which includes AP
detection and classiﬁcation.
There are two diﬀerent ways to acquire and analyze
electrophysiological data: store the raw trace observed on
all electrodes and perform spike detecting and sorting later
(oﬄine sorting) or detect and sort spikes immediately
(during acquisition) and only store the sorted spikes (real-
time online sorting) [5]. A compromise between these
approaches is to detect spikes online and only store the
detected spikes for later oﬄine sorting.
Spike detection and classiﬁcation of neuronal action
potentials can be performed using supervised methods [6,
7] or using unsupervised ones [8, 9]. As the particular
knowledge of the APs is not available before we detect them,
supervised detectors (e.g., artiﬁcial neural networks) have
obvious limits in real applications. Unsupervised methods,
instead, are generally used because they do not require
a priori knowledge about neuronal spike waveforms nor
the user presence: thus, they are very useful for automatic
processing purposes.
The most widely used unsupervised technique to detect
APs is the amplitude threshold crossing. Conventional spike
detection is made by comparing signal amplitude with a
ﬁxed or an adaptive threshold, depending on the Root Mean
Square (RMS) value of data in running windows [10, 11].2 Computational Intelligence and Neuroscience
Problems related to the large number of electrodes make
automatic threshold level adjustment attractive compared
to manually setting the threshold for each channel [9].
Both methods suﬀer from bias induced by high amplitude
and high ﬁring-rate neuronal spikes [8]; because of the
silencing of biological activity is unfeasible, levels must be
estimated from the recorded superposition of signal and
noise. Several methods were recently proposed to overcome
this diﬃculty by adopting an unbiased estimate of the
standard deviation (SD) of background noise. Donoho [12]
claimed that the median absolute deviation divided by
the 75th percentile of the standard normal distribution is
equal to the S.D. of background noises. Wagenaar et al.
in 2005 [11] proposed a tool using diﬀerent algorithms to
estimate noise level. In 2007, Thakur et al. [3] developed
an algorithm to estimate the SD of background noises by
considering only the cap, which is the middle portion of
the amplitude distribution, that is, “cap-ﬁtting” algorithm.
Moreover, researchers realized systems or chip embedded
to detect the presence of neuronal spikes and communicate
only active portions of recorded signals [4]. Even if these
methods suﬀer from some problems [3] they are often
used to optimize detection eﬃciency. Methods such as
principal components [13] and Haar transform [14]h a v e
also been used. Other methods involve discrimination based
on diﬀerent properties of AP waveforms, such as zero
crossing and spike width [15] and discrimination based
on conduction velocity [16]. Discrimination methods are
based on forming a feature space deﬁned by the spikes
and isolating spikes by creating discrimination boundaries
between clusters within that feature space.
The methods typically used in supervised classiﬁcations
for the classiﬁcation of the detected action potentials, are
template matching and the multilayer perceptron detectors
[7, 17]. Other approaches that have been used are artiﬁcial
neural networks [7] in which a training procedure is used
to determine the weights of a feed-forward discrimination
network to perform the classiﬁcation. All these methods
require prior information about the APs; for this reason
unsupervised classiﬁers are preferred. Among the unsu-
pervised spike sorting systems, the Principal Component
Analysis (PCA) [6] and the fuzzy c-means (FCM) clustering
[18] are commonly used. Classiﬁer methods have been
compared by Wheeler and Heetderks [19], evaluating the
performances of nine diﬀerent methods including spike
amplitude, conduction latency, PCA and template matching
using Euclidean distance. The last two methods were found
to perform the best for spike sorting in noisy data. In
this approach, templates that represent a typical waveform
are used as a standard. A dissimilarity measure is then
deﬁnedandtheunknowninputspikesarecomparedwiththe
standard. Several dissimilarity measures have been used [3].
A new clustering technique is the hierarchical classiﬁer. In
this approach, clusters are successively grouped into greater
ones.
Our goal meets the increasing demand for long period
recordings to study the functional and adaptive properties of
complex neuronal networks on which plasticity and learning
are based.
The present study deals with the selection and validation
of software for the adaptive detection and classiﬁcation of
neuronal spikes, able to distinguish the activity of single
cells from the signals recorded by MEA. We have to meet
two important requirements: operate in real-time and work
for long periods. Thus, spike detection must be adaptive,
unsupervised and fast. Hence, this task could clearly beneﬁt
from the acceleration achievable with FPGAs (Field Pro-
grammable Gate Arrays). In 2005, Zviagintsev et al. [20]
proposed 3 VLSI (Very Large Scale Integration) architec-
tures for processing of data coming from an implantable
integrated circuit; their architectures were power-eﬃcient
but must be implemented for each channel. To assure
fast processing for online detection on 60 electrodes, our
algorithm will be integrated on an FPGA. Therefore, it needs
to be simple for the hardware implementation on VHDL
language (VLSI Hardware Description Language). Spike
detection will be nonreversible, since only spike templates
are saved, as to reduce data storage. Moreover, to obtain an
on-line classiﬁcation, waveforms extracted will be sent to a
Digital Signal Processor (DSP) for automatic classiﬁcation.
2.MaterialsandMethods
2.1. Neurons Cultures. Hippocampal neurons from new-
born (P0) mice pups were prepared following [21]; cells
were mechanically dissociated using trituration after a
treatment with trypsin at 37
◦C. Cells were then plated
at 40,000cells/chip on MEAs treated with poly-L-lysine
(0.1mg/ml). MEA60 (Ayanda Biosystems SA, Lausanne,
Switzerland) with platinum (Pt) 40μm round electrodes,
200μm spaced were utilized. Cells were cultured in Neu-
robasal supplemented with B27 (2%), Pen/Strep (1%) and
L-glutamine (2mM). 50% medium was changed every week.
Then, we recorded spontaneous activities of cultured
hippocampal neuronal network on MEA substrate at 21
Days in Vitro (DIV), about the complete maturation of the
network. Extracellular recordings were carried out with a
MEA1060 signal ampliﬁcation and data acquisitions system
(Multi Channel Systems MCS GmbH, Reutlingen, Ger-
many). Sampling rate was 25000Hz and single recordings
lasted 300s.
2.2. Algorithms Design. The algorithm, developed in Matlab
(The Mathworks, Natick, USA) environment, is made up
of three focal elements. First of all, an amplitude-threshold
spikes detector, based on noise level estimation, identiﬁes
peaks over threshold, APs. Then APs are bundled into group
using PCA. Finally, they are classiﬁed with a hierarchical
classiﬁer (Figure 1).
The complete algorithm was tested ﬁrst on simulated
data and then on neuronal spikes, which were obtained
recording spontaneous activities from hippocampal neu-
ronal networks cultured on MEA substrates.
The60secondssimulatedsignal(Figure 2)wasartiﬁcially
madeupofbothpositiveandnegativetriangularwaves(each
wave is composed by 20 samples); the intensity peaks range
from 40 to 100μm. In addition, 2 overlapped and temporallyComputational Intelligence and Neuroscience 3
Amplitude threshold
detector
Principal component
analysis
Hierarchical classiﬁer AP PC
Figure 1: Block diagram of the algorithm design.
−100
−80
−60
−40
−20
0
20
40
60
80
100
0.46 0.48 0.50 .52 0.54 0.56 0.58 0.60 .62 0.64 0.66
Time (s)
A
m
p
l
i
t
u
d
e
(
μ
V
)
Figure 2: Simulated signal (200ms).
shifted triangular waves, representing multi-shaped AP in
neuronal culture activity, were included in the simulation.
The full amount of valid spikes, simple and complex waves,
was 600.
A white Gaussian noise was used to simulate background
noise. First, it was band-pass ﬁltered (2nd order Butter-
worth) between 150Hz and 2,500Hz, then normalized and
overlapped to simulated signal. The Signal to Noise Ratio
(SNR) is equal to 5dB, as real neuronal culture SNR [22].
2.2.1. Spikes Detection. The spikes detection algorithm is
structured as follows. First, the signal is band-pass ﬁltered
with a 2nd order Butterworth ﬁlter (150Hz–2,500Hz); then,
both positive and negative thresholds values are determined
as a multiple of the noise level esteem. Finally, positive APs
over positive threshold and negative APs under negative
threshold are detected and validated to prevent double
detections of unitary events. A detected peak is tested in
±1ms window, checking if it is the highest peak of either
polarity and if the 50% of its amplitude is higher than other
peak of the same polarity, as discussed by Wagenaar et al.
[11]. If both tests are passed, the detected peak is consider an
AP; otherwise it is rejected. A speciﬁc activity was dedicated
to the deﬁnition of thresholds as well as to the selection of
the noise level esteem.
Five noise level estimation algorithms, some suggested
by literature [11, 23], some modiﬁed were developed. Then,
their performances were compared with statistical analysis,
to select the best solution.
(a) BandFlt. Based on the algorithm described in [11,
23], detects spikes if the ﬁltered stream exceeds a
given multiple (supposed equal to 4) of the estimated
RMS noise. Three hundred 10ms windows of data
are read and for each of these windows the RMS
values are computed. The results are sorted, and the
ﬁnal esteem of noise level is the 25th percentile of
the values collected. Positive and negative threshold
values are equal and ﬁxed.
(b) Limada [11, 23]. Like BandFlt, it splits the data
stream into 10ms windows, and determines the 2nd
(V .02) and 30th (V .30) percentiles of the distribution
of voltages found in each window. Then, it performs
two tests: it checks if the ratio between V .02 and V .30
is smaller than 5 (i.e.: no actual spike in the window)
and if the absolute value of V .30 is signiﬁcantly non-
zero (i.e.: data in the window are not blanked out).
If both tests are passed, the window is considered
“clean”. Noise level initialization value is deﬁned after
the collection of 100 clean windows; then the current
noise level estimate (noise est) is updated as follows:
noise est(k) = 0.99 ×noise est(k − 1)+0 .01 ×| V.02|,( 1 )
where k is the current iteration index.
Positive and negative threshold values are computed
by multiplying noise level estimate by 4. Their values are
opposite and adaptive.
(c) Another adaptive algorithm suggested by literature
[11, 23] and tested in this work is AdaFlt.I td i v i d e s
the signal into 10ms windows and 128 windows
of data are read. For each of these windows, the
maximum and minimum values are measured; then
results are sorted and 40th percentile of both col-
lections is computed (M.4 and m.4). The noise level
initialization estimates for positive (noise estp)a n d
negative (noise estn) spikes are based on these data.
Thresholds are computed by multiplying noise level
esteem by 2. While running, AdaFlt keeps collecting
minima and maxima in 10ms windows, although
it uses only one every ten windows. Whenever
128 windows have been collected, positive (2)a n d
negative (3) thresholds are updated as
noise estp(k) = 0.9 ×noise estp(k −1)+0 .1 ×M.4, (2)
noise estn(k) = 0.9 ×noise estn(k − 1)+0 .1 ×m.4. (3)
(d) AdaFlt 128 diﬀers from AdaFlt because AdaFlt 128
adapts thresholds after 128 windows, AdaFlt after
1280. Therefore AdaFlt 128 is faster and more
adaptive than AdaFlt, but it suﬀers more for noise
ﬂuctuations.4 Computational Intelligence and Neuroscience
Ampliﬁer & A/D converter
60
electrodes
FPGA
Spike detection
Time stamps
Waveforms
SRAM
DSP
Validation &
classiﬁcation
Validated
time stamps
Waveforms
grouped
Figure 3: Hardware design.
(e) The last algorithm developed in this work is Ada-
BandFlt, devised as an improvement of BandFlt with
adaptive properties. It splits the data stream into
10ms windows, calculates the RMS values for each
of the initial 100 windows and determines the 25th
percentile of RMS distribution (N.25). Noise level
initialization value is deﬁned. Then the current noise
level esteem is updated as follows:
noise est(k) = 0.8 ×noise est(k −1)+0 .2 ×N.25. (4)
2.2.2. Spikes Classiﬁcation. After detecting temporal occur-
rences of APs, validated waveforms must be extracted from
data stream applying a 2ms symmetric window to the signal.
Then their tallest peaks (i.e., time stamps) are aligned,
making up a matrix N × c,w h e r eN is the number
of “observation” (i.e., neuronal waveforms) and c is the
number of sample, that is, 50 for the 2ms window. The
strong assumption is that on each electrode neurons usually
generate spikes with a characteristic shape; so the aim
of classiﬁcation is correlating source with a characteristic
waveform in the recorded data. This assumption implies that
neurons maintain the shape of the AP and that the whole
system does not move or change.
The classiﬁcation algorithm proposed in this work is
based on PCA and hierarchical classiﬁcation.
The PCA ﬁnds an ordered set of orthogonal basis vectors
that capture the directions of the largest variation in the data
[6]. APs are bounded into groups, projecting data set onto
the principal components.
The hierarchical classiﬁer is a method of cluster analysis;
it splits N observations into a series of M clusters, where M
can range from 1 (all observations grouped into one cluster)
to N (each observation is a cluster). The strength of this
technique is that it provides the possibility of increasing or
decreasing the number of clusters depending on the required
level of aggregation [24].
The hierarchical classiﬁcation algorithm theoretically
starts with N clusters, each containing just one object. Then
the likeness between each couple of observation is evaluated
computing the Euclidean distance between all combinations
of the centroids of the clusters. Data are combined in
neighbours couples, making up a binary tree diagram. Then,
these binary clusters are grouped forming bigger clusters
until a hierarchical tree diagram is obtained.
After the clustering, the number of spikes in each cluster
is determined; 10 was deﬁned the minimum cluster dimen-
sionnottoberejected[3].Moreover,foreachclusterthedata
dispersion (Di) and the centre of mass (Cmi)a r ec o m p u t e d .
The hierarchical level, signiﬁcant for the grouping, was
deﬁned when the centres of mass distances (Cmi−j)b e t w e e n
one cluster (clusteri) and the others (clusterj)w a sg r e a t e r
than Di,a si n
Di < Cmi − Cmj ∀i, j = 1,...,no of clusters. (5)
In literature, generally, clusters must house more than
one element. However, we have decided not to constrain the
number of elements in each cluster because it’s possible that
the best data grouping is characterised by the presence of one
cluster with only one element detected, having one neuron
spiking very rarely. Thus, we prefer to eventually exclude
this cluster at the end of classiﬁcation, avoiding inaccurate
possible re-arrangements which are often mistaken.
2.3. Evaluation of Performance. Performances of spike detec-
tion algorithms were evaluated with statistical screening tests
and, subsequently, with the big O notation.
2.3.1. Algorithms Screening Tests. Algorithms performances
were statistically evaluated using both the simulated neu-
ronal signal described in Section 2.2 and the neuronal
spikes identiﬁed in 5s MEA recordings by experts’ visual
inspection.
They were weighed up performing a screening test
(Algorithms Screening Tests) that determines True and False
Positive (TP-FP) that are, respectively, APs and noise peaks
detected by the algorithm considered; moreover, screening
test determines True and False Negative (TN-FN) values
that are, respectively, noise peaks and APs not identiﬁed.
Using these 4 parameters we computed Sensibility (Se), the
probability that a noise peak could be detected by algorithm,
and Speciﬁcity (Sp), the probability that an AP could not
be detected by algorithm. Moreover, we evaluated Positive
Predicted Value (PPV), the probability that a test-positive
peak should be an AP, and Negative Predicted Value (NPV),
the probability that a test-negative peak should be noise. The
parameters (6) are mathematically described as
Se =
TP
TP+FN
,
Sp =
TN
TN+FP
,
PPV =
TP
TP+FP
,
NPV =
TN
TN+FN
.
(6)Computational Intelligence and Neuroscience 5
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
1068
832
930 892
960
FP- false positive
(a)
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
1
76
16 16
1
FP- false negative
(b)
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
0.99822
0.99861
0.99845
0.99851
0.99840
Sp- speciﬁcity
(c)
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
0.99833
0.87333
0.97333 0.97333
0.99833
Se- sensibility
(d)
Figure 4: Screening test results for all the noise level estimation methods on simulated signals. (a) FP-False Positive, (b) FN-False Negative,
(c) Sp-Speciﬁcity, and (d) Se-Sensibility.
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
0.35933
0.38643 0.38573
0.39566
0.38422
PPV- positive predicted value
Figure 5: PPV-positive predicted values: evaluation between algo-
rithms.
This evaluation guided us to identify 2 algorithms with
good performances; to choose between them, the big O
Notation was used.
2.3.2.BigONotation. BigOnotationallowstoquantitatively
evaluate the performance or complexity of an algorithm as
a function of the number of its input data. It describes
the worst-case scenario, and can be used to illustrate the
executiontimerequiredorthespaceused(e.g.,inmemoryor
on disk) by an algorithm. This allows designers to determine
which of multiple algorithms to use, in a way that is inde-
pendent of computer architecture or clock rate.
The Big O Notation is deﬁned as O(f(N)), with N the
number of input data.
We assessed the two best algorithms identiﬁed with
this complexity evaluation method. To evaluate nested
instructions, we considered the product between the Big O
Notation of each instruction; while to assess sequentially
commands, we considered only the worst and the most
restraining notation.
2.3.3. Classiﬁcation Performances. Hierarchical classiﬁcation
algorithm performances were previously evaluated on sim-
ulated signal, knowing the diﬀerent kinds of waveform
artiﬁcially designed. On the other hand, the assessment of
classiﬁcation algorithm performances on real signal required
visual inspection of the neuronal activity (5-second signal),
formed by more AP waveforms, morphologically diﬀerent.
The capability of distinguishing these dissimilarities was
judged comparing the visual inspection results to algorithm’s
output.6 Computational Intelligence and Neuroscience
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
15
1
8
1
17
FP- false positive
(a)
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
3
38
36
42
1
FN- false negative
(b)
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
0.99988
0.99999
0.99994
0.99999
0.99986
Sp- speciﬁcity
(c)
BandFlt AdaFlt AdaFlt128 Limada AdaBandFlt
0.93182
0.13636
0.18182
0.04545
0.97778
Se- sensibility
(d)
Figure 6: Screening test results for all the noise level estimation methods on neuronal spikes. (a) FP-False Positive, (b) FN-False Negative,
(c) Sp-Speciﬁcity, and (d) Se-Sensibility.
2.4. Hardware Architecture. The hardware design, not yet
implemented, is shown in Figure 3. The input is ﬁltered,
ampliﬁed, digitalized and transferred into an FPGA, where
spikes are detected. The time stamps, not yet validated, and
the waveforms, are memorized in a SRAM; then they enter
the DSP for the spikes validation, waveforms alignment and
classiﬁcation.
FPGAs are digital circuits widely used for manufacturing
complex digital systems due to the advantages they oﬀer,
such as gate count, speed, rapid hardware realization and
low development cost [25]. The chip consists of a regular
symmetrical structure of Conﬁgurable Logic Blocks (CLBs),
interconnected by a programmable network and enhanced
with lots of Input/Output Blocks (IOBs) and block RAM.
Speciﬁcally, CLBs constitute the main logic resources for
implementing synchronous as well as combinatorial circuits.
Each CLB contains four slices and each slice contains
two small Look-Up Tables (LUTs) and two ﬂip-ﬂops. The
LUTs can be used as a 16bits shift register or as a 16×1
distributed memory. For application requiring large, on-
chip memory, block RAM is preferable than distributed
RAM; it is organized in columns and each column contains
a number of blocks depending on the size of FPGA
[26].
The feasibility of spike detection algorithm implementa-
tion on FPGA was veriﬁed evaluating the required hardware
space, in terms of CLBs number, memory occupation and
timing performances. On the other hand, we decided to
develop PCA on DSP, a microprocessor optimized to per-
form recursive instructions, as in literature [27]. Moreover,
Morizet et al. [28] had already veriﬁed that the FPGA
core was too slow for this computation. Finally, our design
planned to develop hierarchical classiﬁer on DSP, as already
done in [29].
3. Results
3.1. Algorithms Screening Tests
3.1.1. Spikes Detection on Simulated Signal. Algorithms
Screening Tests were performed on spikes detected but not
validated in simulated neuronal signal, computing TP, FP
(Figure 4(a)), TN and FN (Figure 4(b))v a l u e s .
BandFlt underestimates noise level; therefore it identiﬁes
more FPs than the others. On the other hand it produces
noise level values more stable than standard deviation,
conﬁrming previous results [11, 23] and identifying only 1
FN.Computational Intelligence and Neuroscience 7
Input
Multiplier
18 ×18
Samples to
compare with
threshold
Finite state
machine
RES CLK
Accumulator Divider
MS
RES
CLK
MEM A
Finite state
machine
RES
CLK
MEM B
Threshold
250
i=1
Bubble sort
Figure 7: Architectural blocks to compute a threshold value.
Threshold
RES
CLK
RAM
Thresk
Finite state
machine

0.2
Shift 2
Thresk
New
threshold
Thresk−1
Thresk−1
X
Figure 8: Hardware blocks to update the threshold.
AdaFlt ﬁnds the smallest number of FPs but a lot of
FNs, because of the excessively slow threshold adaptation.
Further, AdaFlt 128 is faster and more adaptive than
AdaFlt: it identiﬁes a small number of FNs caused by noise
ﬂuctuations.
Limada is quite reliable in order to identify the threshold:
it ﬁnds a small number both of FPs and FNs. It is an
adaptive method, necessary for real time analysis. The main
disadvantage is that it needs approximately 5s to initialize
threshold value losing a large number of spikes.
Finally, AdaBandFlt method is like BandFlt in term of FN
(i.e., 1 FN) but it ﬁnds less FP than the latter and it is an
adaptive method.
Therefore we can already identify two groups of algo-
rithms: a group characterised by ﬁnding most of FP and
really few FN and a group characterised by ﬁnding very few
FP and some FN.
Then, Se (Figure 4(c))a n dS p( Figure 4(d))w e r ec o m -
puted by means of Screening Test results. PPV was evaluated,
too (see Figure 5), whereas NPV was not reported because of
its low meaningfulness.
BandFltandAdaFltwereimmediatelyrejectedbecauseof
their low value, respectively of Sp and Se.
Limada was selected because it has high Se and Sp;
moreover it has the highest PPV. Likewise, AdaBandFlt was
chosen because, even if its SP and PPV are comparable to
AdaFlt 128, its Se is the highest one. As a matter of fact,
we favoured the Se parameter preferring to reject some
spike later (after classiﬁcation) instead of loose true spikes
at the beginning of the analysis, in view of the hardware
implementation, where spike detection is not reversible.
3.1.2. Spikes Detection on Neuronal Spikes. Algorithms
Screening Tests were performed on spikes detected but not
validated in a 5s fragment of a 300s neuronal activity,
recorded by MEA. TP, FP (Figure 6(a)), TN and FN
(Figure 6(b)) values were computed. Then, Sp (Figure 6(c))
and Se (Figure 6(d)) were evaluated for each algorithm.
The performance evaluation of spike detection algo-
rithms on real recorded electrical activity agreed with the
analysis on simulated signal. BandFlt and AdaBandFlt ﬁnd
more FP than other algorithms; on the other hand AdaFlt,
AdaFlt 128 and Limada miss more APs. Furthermore,
diﬀerences in Se value become more evident examining
neuronal activity. The performances assessment on experi-
mental data highlights that AdaFlt, AdaFlt 128 and Limada
has an extremely low sensibility on them. Diﬀerently from
what was observed on simulated data, BandFlt speciﬁcity is
equivalent to AdaBandFlt on experimental data. However,
since the only diﬀerence between these 2 algorithms is in the
adaptability, we still select AdaBandFlt, being adaptation a
crucial requisite for the envisaged long term acquisitions.
3.2. Big O Notation. As an additional estimation, we eval-
uated the two best algorithms’ behaviours (i.e., Limada
and AdaBandFlt), regardless of the hardware, using the Big
O Notation. Especially, we considered only the complexity
evaluationofthethresholdinitialization,thatwasrecognized
as the most time consuming step of our algorithms.
The two algorithms are reported in pseudocode, to have
a compact and informal high-level description; the O(f(N))
notation, with N the number of inputs, is noted in bold near
each statement (see Algorithm 1).
AdaBandFlt and Limada complexity is described, respec-
tively, by (7):
O

1002F

,
O

100F ×N2
,
(7)
considering the worst-case performance scenario.8 Computational Intelligence and Neuroscience
−0.06
−0.04
−0.02
0
0.02
0.04
0.06
−0.08 −0.06 −0.04 −0.02 0 0.02 0.04
PC analysis
2
n
d
c
o
m
p
o
n
e
n
t
1st component
(a)
−0.06
−0.04
−0.02
0
0.02
0.04
0.06
−0.08 −0.06 −0.04 −0.02 0 0.02 0.04
Hierarchical classiﬁcation
2
n
d
c
o
m
p
o
n
e
n
t
1st component
(b)
−90
−80
−70
−60
−50
−40
−30
−20
−10
0
10
2 6 10 14 18 2 6 10 14 18 2 6 10 14 18
−20
0
20
40
60
80
100
120
−100
−50
0
50
100
150
−120
−100
−80
−60
−40
−20
0
20
−20
0
20
40
60
80
100
120
−15
−10
−5
0
5
10
15
2 6 10 14 18 2 6 10 14 18 2 6 10 14 18
A
m
p
l
i
t
u
d
e
(
u
V
)
Sample
1st cluster
Sample
2nd cluster
Sample
3rd cluster
A
m
p
l
i
t
u
d
e
(
u
V
)
Sample
4th cluster
Sample
5th cluster
Sample
6th cluster
(c)
0.46 0.48 0.50 .52 0.54 0.56 0.58 0.60 .62 0.64 0.66
−100
−80
−60
−40
−20
0
20
40
60
80
100
Reconstructed neuronal electrical activity
Time (s)
A
m
p
l
i
t
u
d
e
(
u
V
)
(d)
Figure 9: (a) Simulated data projected along the ﬁrst two principal components. (b) Simulated data grouped and classiﬁed in the ﬁrst two
principal components space. (c) The 6 clusters identiﬁed by the hierarchical classiﬁcation algorithm. (d) Reconstructed simulated signal
(200ms).
The evaluation of these two algorithms showed that, if N
is bigger than 10 samples, Limada had worse performances
than AdaBandFlt, regardless of the hardware. However, 10
samples are extremely few and not feasible for a reliable
threshold identiﬁcation (at the most commonly used fre-
quency of 10kHz, 10 samples mean 1ms that is less than
usual estimation of one AP duration).
3.3. Proposed Architecture and Performance Analysis. To
verify the feasibility of AdaBandFlt implementation on
FPGA, CLBs number, memory occupation and timing were
evaluated. Figure 7 shows the blocks architecture to compute
the threshold value for 1 channel. The input is a 12bit data
stream, coming from an analog to digital converter (ADC).
To compute the threshold, the system has to evaluate the
RMS value of 250 samples (i.e., 10ms windows, assuming a
samplefrequencyequalto25kHz);aftercomputingtheRMS
forawindow,thesystemhastocollect100RMSvaluesandto
sortthem.Thethresholdisthe25thpercentileofthisordered
distribution. Note that we simpliﬁed the algorithm doing the
following assumption.
(a) we used a square threshold values (MS) and compare
it with square samples, avoiding to compute the root
(8):
25th percentile(MS) =

25th percentile(RMS)
2,( 8 )
(b) we skipped out the problem of division using FPGA,
rounding 250 samples to 256; this allowed us to
calculate the RMS only shifting the point of 8
positions.Computational Intelligence and Neuroscience 9
−40
−30
−20
−10
0
10
20
30
40
205 205.5 206 206.5 207 207.5 208 208.5 209 209.5 210
Time (s)
A
m
p
l
i
t
u
d
e
(
u
V
)
Figure 10: Neuronal electrical activity (5s) recorded by MEA60
pre-ampliﬁer.
Thus, the input enters the embedded multiplier 18 ×
18, its output is a 24bit word (we decided to use one
more bit to keep track of sign) and it is recursively added
to other samples; to perform the sum, the adder utilizes
the accumulator, intrinsically controlled by a ﬁnite state
machine, shared with all channels. The division result is, in
the worst case, a 32bit word, 24bits before the point and
8bits after the point. To a ﬁrst approximation, we decided
to deﬁne the MS value only with the ﬁrst 24bits. In terms of
hardware occupation, to compute the MS value we need
(i) 1 diﬀerential IOB,
(ii) 1 embedded multiplexer,
(iii) 32 LUTs for the adder [30] and 32 ﬂip-ﬂops for the
accumulator, matching 4 CLBs,
(iv) 1 32bit shift register, that is, 2 CLBs,
(v) 1 CLB for the ﬁnite state machine.
After computing the MS value, it is memorized. Speciﬁ-
cally, the ﬁrst 100MS values are memorized in MEM A and
thesecond100RMS2 valuesarememorizedinMEMB,while
values contained in MEM A are sorted by the BubbleSort
logic.Thefollowingstepisdual:MEMAisﬁlledwhilevalues
in MEM B are sorted. This is controlled by a ﬁnite state
machine, shared with all channels. The ﬁnite state machine
also manages the computation of the 25th percentile index,
counted as follows:
Ik =

0.5+
n ×k
100

,( 9 )
with n the sorted vector length and k the kth percentile.
Thus, in the instance considered (n =100, k =25), Ik is about
25: computing the 25th percentile means choosing the 25th
sample in the sorted vector. Therefore, the ﬁrst value of
threshold comes out from MEM A, the second from MEM
B and so on. In terms of hardware occupation, to compute
the threshold value we need
(i) 1 CLB for the ﬁnite state machine,
(ii) 2.3kbits of RAM for each memory block (i.e.,
4.6kbits),
(iii) 50 CLBs for BubbleSort (i.e., N/2 pipelined stages,
with N the length of the sorted input vector) [31],
(iv) 24 2:1 multiplexers.
After computing the 24bit threshold value, the system
puts it into a RAM 24bits location, named Thresk−1 in
Figure 8; likewise, it puts the following threshold value in
Thresk. Then, the threshold values are updated as in (4):
Thresk−1 is multiplied by 4 (i.e., shifted left of 2 positions),
added to Thresk and multiplied by 0.2. Finally, Thresk
becomes Thresk−1 and the new threshold is memorized in
Thresk location. All these operation are controlled by a ﬁnite
state machine, shared with all channels.
Intermsofhardwareoccupation,toupdatethethreshold
v a l u ew en e e d
(i) 48bits of RAM,
(ii) 1 CLB for the ﬁnite state machine,
(iii) 1 32bit shift register, that is, 2 CLBs,
(iv) 3 CLBs for the adder,
(v) 5 CLBs for the multiplier.
Pipelined with threshold values counting, the system
detects spikes. First, it compares the ith squared sample with
the threshold to ﬁnd over-threshold values; then, it veriﬁes if
thedetectedvalueisamaximum(oraminimum),asfollows:
 x2
i
  > |Thres|&
 x2
i
  >
 x2
i−1
 &
 x2
i
  >
 x2
i+1
 . (10)
We decided to realize these comparison subtracting each
couple and looking at the sign. Thus (10)c a nb er e w r i t t e na s
follows:
PS
 x2
i
  −|Tresh|

&P S
 x2
i
  −
 x2
i−1
 
&P S
 x2
i
  −
 x2
i+1
 
,
(11)
with PS = positive signed.
This could be realized using 1 LUT for each couple
of compared bits; to compose the comparison of complete
words, additional XOR and AND gates are required [25].
Approximately, we need 3 CLBs for each word comparison
(24bit words), that is 9 CLBs for a detection.
With the design outlined above, we estimated FPGA
occupation. Taking into account all 64 MEA channels,
AdaBandFlt, mapped on FPGA, requires
(i) 64 diﬀerential IOB,
(ii) 64 dedicated multipliers,
(iii) 4803 CLBs,
(iv) 291kbits of RAM,
(v) 1536 dedicated 2:1 multiplexers.10 Computational Intelligence and Neuroscience
AdaBandFlt
number of samples=0;
number of windows=0;
F; // deﬁnition of sample frequency //
while number of windows < 100 do O(100)
while number of samples < F∗0.01 // deﬁne a 10ms window // O(F∗0.01)
take a sample
number of samples ← number of samples + 1;
end
compute the RMS and put it in a vector named “error” O(1)
number of windows ← number of windows + 1;
end
BubbleSort(error) // sort the vector of 100RMS // O(1002)
deﬁne the threshold initial value (25th percentile) O(1)
Limada
number of samples=0;
number of windows=0;
threshold init=0;
F; // deﬁnition of sample frequency //
ok=false;
while ok == false do //ok is false until 100 clean windows are collected // O(N)
while number of samples < F∗0.01 // deﬁne a 10ms window // O(F∗0.01)
take a sample
number of samples ← number of samples + 1;
end
BubbleSort(samples) //sort the voltage values of a window // O(1002)
compute the 2nd (V .02) and the 30th (V .30) percentiles O(1)
if number of windows < 100 O(N)
test= V .02/V .30; O(1)
if test < 5&|V.30| is signiﬁcantly non zero O(1)
threshold init ← 0.99∗ threshold init + 0.01∗|V.30|
number of windows ← number of windows + 1;
end
else ok==true; // 100 clean windows are collected //
end
end
Algorithm 1: Pseudocode for AdaBandFlt and Limada.
Therefore, we identiﬁed two suitable solution for Ada-
BandFlt implementation.
The ﬁrst one envisages the use of a single FPGA; we
identiﬁed the device XC3S5000 of the Xilinx Spartan-3
family, whose architecture satisﬁes our needs:
(i) 300 diﬀerential IOB,
(ii) 104 dedicated multipliers,
(iii) 8320 CLBs,
(iv) 1872kbits of block RAM,
(v) 65k 2:1 dedicated multiplexers.
This gives a utilisation of around 58% of the FPGA,
which seems a comfortable margin [30]. Furthermore, we
analyzed RAM occupation in detail. The block RAM of
XC3S5000 is ordered in 104 18kbit blocks; the memory
organization,suitableforourdata,is512×32bit.Wecounted
to use one block for each couple of RAM A and RAM B used
to compute threshold values (one block for each channel),
addressing RAM A from 1 to 100 and RAM B from 129
to 139 to simplify this task. This allow to sort RAM A and
to ﬁll RAM B almost contemporaneously. Moreover, even if
XC3S5000 is the biggest device within Spartan-3 family, it is
not too expensive (less than $120).
The second solution envisages the use of two smaller
FPGAs (one every 32 channels), the XC3S4000 of the Xilinx
Spartan-3 family, each containing
(i) 300 diﬀerential IOB,
(ii) 96 dedicated multipliers,
(iii) 6912 CLBs,
(iv) 432kbits of distributed RAM, organized in 96 18kbit
blocks,
(v) 54k 2:1 dedicated multiplexers.
This gives a utilisation of around 35% of each FPGAs.
Diﬀerently from the ﬁrst solution, this design allows a mod-
ular approach, useful for future improvement. Furthermore,Computational Intelligence and Neuroscience 11
−0.2
−0.15
−0.1
−0.05
0
0.05
0.1
0.15
−0.04 −0.02 0 0.02 0.04 0.06 0.08
2
n
d
c
o
m
p
o
n
e
n
t
1st component
PC analysis
(a)
−0.2
−0.15
−0.1
−0.05
0
0.05
0.1
0.15
−0.04 −0.02 0 0.02 0.04 0.06 0.08
2
n
d
c
o
m
p
o
n
e
n
t
1st component
Hierarchical classiﬁcation
(b)
Figure 11: (a) Electrical activity projected along the ﬁrst two principal components. (b) Hierarchical classiﬁer output. Real data grouped
and classiﬁed in the ﬁrst two principal components space.
the device is rather cheap: in 2004, the volume pricing was
under $100 for the XC3S4000 in 250 unit quantities [32].
Finally, we identiﬁed the BubbleSort as the bottleneck of
our architecture and we decided to evaluate it in terms of
temporal requirements, verifying if its timing was consistent
with other operations. To sort a n-length vector, bubble sort
makes n − 1 steps through the data. In each step, adjacent
elements are compared and swapped if necessary. Notice that
after the ﬁrst pass through the data, the largest element in
the sequence has bubbled up into the last array position. In
general, after k passes through the data, the last k elements of
the array are correct and need not be considered any longer.
Thus, we hypothesized 3 clock cycles for each comparison,
n/2 clock cycles for each couple of values in the vector and
n−1 clock cycles for the whole sorting [33]. Since our device
sorts a vector after collecting 100 windows of 250 samples
each, there is a 500ms delay between two sorting steps even
considering the maximum sampling frequency (i.e., 50kHz)
of the data acquisition system. On the other hand, the
device needs about 15000 clock cycles to sort a 100 elements
vector; even considering a slow FPGA inner frequency (i.e.,
10MHz), this means that it takes about 1.5ms to sort a
vector, that gives us a secure margin. Moreover, while n-1
passes through the data are required to guarantee that the
list is sorted in the end, it is possible for the list to become
sorted much earlier. When no exchanges at all are made in
a given pass, then the array is sorted and no additional steps
are required. A minor algorithmic modiﬁcation would be to
counttheexchangesmadeinapass,andtoterminatethesort
when none are made, decreasing the space occupation.
3.4. Spikes Classiﬁcation on Simulated Signal. After Ada-
BandFlt was selected as the best noise level estimation
method for our purposes, the whole spike detection algo-
rithm was applied to the simulated neuronal signal described
in Section 2.2; then, APs detected became the input for
the classiﬁcation algorithm. They were projected along the
ﬁrst two principal components (Figure 9(a)) to get APs into
separated groups. It was already demonstrated [34] that 1st
and 2nd component eigenvalues can represent the useful
characteristics of classiﬁcation.
Then,hierarchicalclassiﬁeryieldedanaccurateclassiﬁca-
tion (Figure 9(b)). The deﬁned maximum number of cluster
was settled at 7: this value matches with a likely maximum
number of cellular bodies in the electric ﬁeld registered by
each electrode plus one or two noise sources.
The algorithm identiﬁed 6 clusters plus one single
element cluster (i.e., formed by only one element) that
was rejected after the analysis. In Figure 9(c) all waveforms
for each class are shown: 5 of them match exactly the 5
well-known classes of waveform artiﬁcially made up in the
simulated neuronal signal. On the other hand, the yellow
cluster is formed by two data groups (i.e., data projections
aroundzeroinFigures9(a)and9(b))representingthecluster
of false positive detected spike (i.e., peak-to-peak amplitude
less than 30μV).
Therefore, the classiﬁcation is satisfactory; moreover, we
had chosen a less speciﬁc spike detection algorithm not to
lose spikes and, thanks to this classiﬁer, we were able to reject
FP, classiﬁed all together.
The ﬁnal goal was the signal reconstruction using the
classiﬁedwaveforms.InFigure 9(d)wecansingleoutallkind
of waveform previously made up in the simulated neuronal
signal.
3.5. Spikes Classiﬁcation on Neuronal Spikes. After testing
the spike detection algorithm on 300s real data, hierarchical
classiﬁcation was performed; classiﬁer performances were
evaluated by visual inspection on a portion of signal (5s).12 Computational Intelligence and Neuroscience
−50
−40
−30
−20
−10
0
10
20
30
40
50
5 1 01 52 02 53 03 54 04 55 0
Sample
A
m
p
l
i
t
u
d
e
(
u
V
)
1st cluster
(a)
−50
−40
−30
−20
−10
0
10
20
30
40
50
5 1 01 52 02 53 03 54 04 55 0
Sample
A
m
p
l
i
t
u
d
e
(
u
V
)
2nd cluster
(b)
−50
−40
−30
−20
−10
0
10
20
30
40
50
5 1 01 52 02 53 03 54 04 55 0
Sample
A
m
p
l
i
t
u
d
e
(
u
V
)
3rd cluster
(c)
−50
−40
−30
−20
−10
0
10
20
30
40
50
5 1 01 52 02 53 03 54 04 55 0
Sample
A
m
p
l
i
t
u
d
e
(
u
V
)
4th cluster
(d)
−50
−40
−30
−20
−10
0
10
20
30
40
50
5 1 01 52 02 53 03 54 04 55 0
Sample
A
m
p
l
i
t
u
d
e
(
u
V
)
5th cluster
(e)
Figure 12: Identiﬁed clusters.Computational Intelligence and Neuroscience 13
−40
−30
−20
−10
0
10
20
30
40
205 205.5 206 206.5 207 207.5 208 208.5 209 209.5 210
Time (s)
A
m
p
l
i
t
u
d
e
(
μ
V
)
Figure 13: Reconstructed neuronal activity (5s).
In Figure 10 a ne x a m p l eo f5sn e u r o n a le l e c t r i c a la c t i v i t y
recordings is presented. In Figure 11(a) waveforms projec-
tions on the 1st and the 2nd component eigenvalues are
shown, then, the data are classiﬁed (Figure 11(b)).
Figure 12 shows the clusters identiﬁed; each cluster
amplitude and morphological characteristics are diﬀerent.
Note that the algorithm, also tested on real signal, dis-
tinguishes false positive detected spikes (i.e., Figures 12(c)
and 12(d), clusters centred around zero in eigenvalues
projection).
Finally, the neuronal electrical activity is reconstructed:
in Figure 13 a 5s segment is illustrated, for simplicity. It
is possible to clearly distinguish red and light-blue spikes,
that represent, respectively, positive and negative noise
clusters, a yellow AP, biphasic but with no extreme polar-
ization, few pink APs, still biphasic and prevalently positive
raised, and some black peaks, essentially monophasic 40μV
waveforms.
4. Conclusions
In the context of extracellular electrophysiology, an eﬃcient
and reliable identiﬁcation of spikes has to be reached. Fur-
thermore, it’s important to distinguish between application-
speciﬁc and generalized methods [35]. In this paper we have
proposed an application-speciﬁc algorithm, aimed at future
hardware integration (on FPGA and DSP) for real-time long
term acquisitions.
The threshold-amplitude spikes detection method com-
putes threshold as a multiple of basal noise level; 5 noise esti-
mation methods were developed in Matlab (The Mathworks,
Natick,USA)andtheirperformanceswerestatisticallyevalu-
ated,bothonsimulatedneuronalsignalandonrealelectrical
activity, recorded by MEA60 (Multi Channel Systems MCS,
Gmbh). The characteristics we decided to favour were (a)
the adaptability because, during long-term recordings, noise
levels often drift on a time-scale of hours and (b) the
minimum number of false negative and the sensibility, in
order not to lose spikes because of the nonreversibility of the
spike detection procedure. The statistical analysis identiﬁed
two algorithms, AdaBandFlt and Limada, satisfying these
requirements. Then, we evaluated their behaviours using
the Big O Notation, that determined the minor complexity
of AdaBandFlt, a quick adaptive noise estimation method,
based on the evaluation of the 25th percentile of 1s signal’s
RMS distribution. We decided to develop spike detection
on FPGA to reach fast performances, and we assessed its
hardware requirements, in terms of CLBs number, memory
and timing request. We identiﬁed two feasible solution:
the ﬁrst one envisages the use of one FPGA (XC3S5000,
Xilinx Spartan-3) that satisﬁed hardware requirements;
the second one allows modular development utilising
two smaller FPGAs, each for 32 channels (XC3S4000,
Xilinx Spartan-3).
Realizing the spike classiﬁcation algorithm, we favoured
a shape clustering and classiﬁcation being automatic and
reliable. After spike detection, the software extracts wave-
forms and sorts them around time stamps. Then, it bundles
waveforms into groups with PCA and classiﬁes APs with
a hierarchical classiﬁer, identifying false positive detected
spikes, too. The classiﬁcation algorithm was tested on
simulated signal, comparing its output to the artiﬁcially
designed waveforms; moreover, the classiﬁer performances
were evaluated on real data, comparing visual inspection
results to the morphology of the identiﬁed clusters. The
analysis stated the goodness of clustering procedure. We
decided to develop PCA and hierarchical classiﬁer on DSP,
as already done in literature [27–29].
Therefore, the developed spike detection and classiﬁ-
cation algorithm has the highest performances in order
to achieve long period recordings of neuronal cultures’
electrical activity, avoiding data leakage and allowing its
future hardware implementation.
Having veriﬁed its feasibility, in future works we’d like to
develop an integrated hardware, composed of an high-speed
device (FPGA) in order to guarantee a real-time detection
of spikes and to save only their templates for the following
classiﬁcation,integratedinaDSP.Usinghighspeedhardware
devices, AdaBandFlt and the PCA-hierarchical classiﬁer will
be able to execute on-line detection and real time waveforms
analysis, reducing data storage problems.
The system has many applications, from high-through-
put pharmacological screening, to neuro pharmacology and
monitoring eﬀects of drugs and toxins (neurotoxicity), to
neuroplasticity researches.
Acknowledgments
This work was carried out in collaboration with the
ALEMBIC facility (Advanced Light and Electron Microscopy
Bio Imaging Centre) of the San Raﬀaele Scientiﬁc Insti-
tute. The work was developed within the research line—
Biosensors and artiﬁcial bio-systems—of the convention
between the Italian Institute of Technology and Politecnico
di Milano. Emilia Biﬃ Ph.D. is partly funded by Fond.
Confalonieri.14 Computational Intelligence and Neuroscience
References
[1] F. O. Morin, Y. Takamura, and E. Tamiya, “Investigating neu-
ronalactivitywithplanarmicroelectrodearrays:achievements
and new perspectives,” Journal of Bioscience and Bioengineer-
ing, vol. 100, no. 2, pp. 131–143, 2005.
[2] M. A. Nicolelis and S. Ribeiro, “Multielectrode recordings: the
nextsteps,”CurrentOpinioninNeurobiology,v ol.12,no .5,p p .
602–606, 2002.
[ 3 ]P .H .T h a k u r ,H .L u ,S .S .H s i a o ,a n dK .O .J o h n s o n ,“ A u t o -
mated optimal detection and classiﬁcation of neural action
potentials in extra-cellular recordings,” Journal of Neuroscience
Methods, vol. 162, no. 1-2, pp. 364–376, 2007.
[4] Y. Perelman and R. Ginosar, “An integrated system for
multichannel neuronal recording with spike/LFP separation,
integrated A/D conversion and threshold detection,” IEEE
TransactionsonBiomedicalEngineering,vol.54,no.1,pp.130–
137, 2007.
[5] U. Rutishauser, E. M. Schuman, and A. N. Mamelak, “Online
detection and sorting of extracellularly recorded action poten-
tials in human medial temporal lobe recordings, in vivo,”
Journal of Neuroscience Methods, vol. 154, no. 1-2, pp. 204–
224, 2006.
[6] M. S. Lewicki, “A review of methods for spike sorting:
the detection and classiﬁcation of neural action potentials,”
Network, vol. 9, no. 4, pp. R53–R78, 1998.
[7] R. Chandra and L. M. Optican, “Detection, classiﬁcation,
and superposition resolution of action potentials in multiunit
single-channel recordings by an on-line real-time neural
network,” IEEE Transactions on Biomedical Engineering, vol.
44, no. 5, pp. 403–412, 1997.
[8] P. T. Watkins, G. Santhanam, K. V. Shenoy, and R. R.
Harrison, “Validation of adaptive threshold spike detector for
neural recording,” in Proceedings of the Annual International
Conference of the IEEE Engineering in Medicine and Biology,
vol. 6, pp. 4079–4082, 2004.
[9] H.-L. Chan, M.-A. Lin, T. Wu, S.-T. Lee, Y.-T. Tsai, and P.-
K. Chao, “Detection of neuronal spikes using an adaptive
threshold based on the max-min spread sorting method,”
Journal of Neuroscience Methods, vol. 172, no. 1, pp. 112–121,
2008.
[10] K. S. Guillory and R. A. Normann, “A 100-channel system
for real time detection and storage of extracellular spike
waveforms,” Journal of Neuroscience Methods,v o l .9 1 ,n o .1 - 2 ,
pp. 21–29, 1999.
[11] D. Wagenaar, T. B. Demarse, and S. M. Potter, “MeaBench:
a toolset for multi-electrode data acquisition and on-line
analysis,” in Proceedings of the 2nd International IEEE EMBS
Conference on Neural Engineering, vol. 1, pp. 518–521, 2005.
[12] D. L. Donoho, “De-noising by soft-thresholding,” IEEE Trans-
actions on Information Theory, vol. 41, no. 3, pp. 613–627,
1995.
[13] A. Daﬀe r t s h o f e r ,C .J .C .L a m o t h ,O .G .M e i j e r ,a n dP .J .B e e k ,
“PCA in studying coordination and variability: a tutorial,”
Clinical Biomechanics, vol. 19, no. 4, pp. 415–428, 2004.
[ 1 4 ]X .Y a n ga n dS .A .S h a m m a ,“ At o t a l l ya u t o m a t e ds y s t e m
for the detection and classiﬁcation of neural spikes,” IEEE
Transactions on Biomedical Engineering, vol. 35, no. 10, pp.
806–816, 1988.
[15] D. J. Mishelevich, “On-line real-time digital computer separa-
tion of extracellular neuroelectric signals,” IEEE Transactions
on Biomedical Engineering, vol. 17, no. 2, pp. 147–150, 1970.
[16] M. Brunner, G. Karg, and U. T. Koch, “An improved system
for single unit isolation from multiunit nerve recordings by
velocity analysis,” Journal of Neuroscience Methods, vol. 33, no.
1, pp. 1–9, 1990.
[17] J. H. Choi, H. K. Jung, and T. Kim, “A new action potential
detector using the MTEO and its eﬀects on spike sorting
systems at low signal-to-noise ratios,” IEEE Transactions on
Biomedical Engineering, vol. 53, no. 4, pp. 738–746, 2006.
[18] Z. H. Inan and M. Kuntalp, “A study on fuzzy C-means
clustering-based systems in automatic spike detection,” Com-
puters in Biology and Medicine, vol. 37, no. 8, pp. 1160–1166,
2007.
[19] B. C. Wheeler and W. J. Heetderks, “A comparison of
techniques for classiﬁcation of multiple neural signals,” IEEE
Transactions on Biomedical Engineering, vol. 29, no. 12, pp.
752–759, 1982.
[20] A. Zviagintsev, Y. Perelman, and R. Ginosar, “Algorithms and
architectures for low power spike detection and alignment,”
Journal of Neural Engineering, vol. 3, no. 1, pp. 35–42, 2006.
[21] G.BankerandK.Goslin,CulturingNerveCells,TheMITPress,
Cambridge, Mass, USA, 2nd edition, 1998.
[22] I. N. Bankman, K. O. Johnson, and W. Schneider, “Optimal
detection, classiﬁcation, and superposition resolution in neu-
ral waveform recordings,” IEEE Transactions on Biomedical
Engineering, vol. 40, no. 8, pp. 836–841, 1993.
[23] D. A. Wagenaar, “A versatile all-channel stimulator for
electrode arrays, with real-time control,” Journal of Neural
Engineering, vol. 1, no. 1, pp. 39–45, 2004.
[24] M. Abeles and M. H. Goldstein Jr., “Multispike train analysis,”
Proceedings of the IEEE, vol. 65, no. 5, pp. 762–773, 1977.
[25] T. N. Kumar and C. W. Chong, “An automated approach for
locating multiple faulty LUTs in an FPGA,” Microelectronics
Reliability, vol. 48, no. 11-12, pp. 1900–1906, 2008.
[26] XILINX Inc., “Spartan-3 FPGA family data sheet,” 2008,
http://www.xilinx.com/support/documentation/data sheets/
ds099.pdf.
[27] D. Han, Y. N. Rao, J. C. Principe, and K. Gugel, “Real-
timePCA(PrincipalComponentAnalysis)implementationon
DSP,” in Proceedings of the IEEE International Conference on
Neural Networks, vol. 3, pp. 2159–2162, 2004.
[28] N. Morizet, F. Amiel, I. D. Hamed, and T. Ea, “A compar-
ative implementation of PCA face recognition algorithm,”
in Proceedings of the 14th IEEE International Conference on
Electronics, Circuits, and Systems, pp. 865–868, 2007.
[29] N. Kim, N. Kehtarnavaz, M. B. Yeary, and S. Thornton,
“DSP-based hierarchical neural network modulation signal
classiﬁcation,” IEEE Transactions on Neural Networks, vol. 14,
no. 5, pp. 1065–1071, 2003.
[30] XILINXInc.,“ExtendedSpartan-3A,Spartan-3EandSpartan-
3 FPGA families,” 2009, http://www.xilinx.com/support/doc-
umentation/user guides/ug331.pdf.
[31] K. Benkrid and D. Crookes, “New bit-level algorithm for gen-
eral purpose median ﬁltering,” Journal of Electronic Imaging,
vol. 12, no. 2, pp. 263–269, 2003.
[32] XILINX Inc., “Xilinx Spartan-3 Platform FPGAs,” 2003,
http://www.xilinx.com/company/press//kits/spartan3/
S3PressFAQ.pdf.
[33] B. R. Preiss, Data Structures and Algorithms with Object-
OrientedDesignPatternsinC++,WebBook,2009,http://www
.brpreiss.com/books/opus4/html/page493.html.Computational Intelligence and Neuroscience 15
[34] L. MacNabb, “Application of cluster analysis towards the
development of health region peer groups,” in Proceedings of
the Survey Methods Section, pp. 85–90, 2003.
[35] A. Maccione, M. Gandolfo, P. Massobrio, A. Novellino, S.
Martinoia, and M. Chiappalone, “A novel algorithm for
precise identiﬁcation of spikes in extracellularly recorded
neuronal signals,” Journal of Neuroscience Methods, vol. 177,
no. 1, pp. 241–249, 2009.