Sensor nodes present in a wireless sensor network (WSN) for security surveillance applications should preferably be small, energy efficient and inexpensive with onsensor computational abilities. An appropriate data processing scheme in the sensor node can help in reducing the power dissipation of the transceiver through compression of information to be communicated. In this paper, authors have attempted a simulation-based study of human footstep sound classification in natural surroundings using simple timedomain features. We used a spiking neural network (SNN), a computationally low weight classifier, derived from an artificial neural network (ANN), for classification. A classification accuracy greater than 85% is achieved using an SNN, degradation of ~5% as compared to ANN. The SNN scheme, along with the required feature extraction scheme, can be amenable to low power sub-threshold analog implementation. Results show that all analog implementation of the proposed SNN scheme can achieve significant power savings over the digital implementation of the same computing scheme and also over other conventional digital architectures using frequency domain feature extraction and ANN based classification.
INTRODUCTION
Wireless sensor networks (WSNs) are used in a wide range of applications in the field of health care, environment, agriculture, public safety, military, and industry and transportation systems [1] . The wireless sensor nodes in a WSN are designed according to the target application. Sensor nodes for security surveillance application for monitoring the presence of human beings in restricted zone fall under the category of passive supervision in which smallsized static sensor nodes are to be deployed in large numbers with continuous monitoring. These resource constrained sensor nodes should be inexpensive and power efficient (few μW) for serving the purpose [2] .
One of the constraint that the wireless sensor nodes suffer from, is the fact that they have limited battery life. Often the most power-hungry component in a WSN is the transceiver unit. Hence, the concept of on-sensor computing is applied in the sensor nodes for minimizing the communication power due to the transmission of large chunks of data. However, the intelligent sensor nodes having on-sensor computational abilities should consume less power, preferably much less than the transceiver unit. Therefore, there is a need for crosshierarchy low power system design approach that spans from algorithm and architectural level, going all the way down to circuit level implementation of the application specific computing device..
With the advancement in the field of energy efficient brain-inspired neuromorphic computing algorithms, several low power computing systems have been envisaged and implemented, suitable for on-sensor computing. [4] [5] [6] [7] [8] . Spiking neural network (SNN) are inspired from biological neurons in which communication between neurons take place in the form of spikes (events). The membrane potential (vmem) of a neuron gets affected based on the presynaptic events associated with it. Whenever vmem exceeds the threshold potential (vth) of the neuron, a postsynaptic spike is generated implying that the neuron is active for passing information to the next layer neurons connected to it. The learning rules for SNN algorithms can be supervised or unsupervised, rate based or spike-time based which can be chosen based on the application [9] . These concepts can be applied to real-time applications involving energy efficient neuromorphic hardware aiming for a reduction in the computational complexity with acceptable performance. Generally, the computations during the training of neural network frameworks are much more intensive compared to the feedforward classification process during testing. Hence, performing offline training with online classification is an encouraging approach. A method for training an ANN with the conventional back-propagation algorithm, and then using the trained parameters using a rate based SNN for classification is presented in [6] . This method can help in ensuring a balance between better performance and less computational complexity when compared to conventional ANNs.
In this work, we present an integrated system design for a low power sensor node targeted for a surveillance application. The node is supposed to spot human presence in restricted areas by analysing the generated acoustic audio signals using simple low complexity time domain features followed by analog domain neuromorphic implementation [11] [12] . We compare the algorithm level performance of the proposed scheme with some standard techniques to establish in terms of accuracy. We also provide architecture and circuit level design for the proposed scheme and compare it with some conventional schemes. The major contributions of the work are:  Design and optimization of time domain feature extraction from acoustic data for SNN based processing.
 Design and optimization of SNN parameters for best accuracy  Comparison of proposed scheme with conventional approach like those involving frequency domain feature extraction followed by ANN classifier in terms of accuracy.
 Analog implementation of proposed scheme and it's power comparison with digital implementation of the proposed scheme as well as that of conventional approaches.
The rest of the paper is organized as follows: In section II, a brief description of the application is given along with a review of some conventional approaches addressing similar applications. Section III presents the proposed scheme based on time domain feature extraction and SNN based classification along with its architectural implementation. Algorithm level optimization of the feature extraction scheme is presented in section IV. SNN based classifier and its algorithmic optimization is presented in section V. Section VI describes digital and analog implementation of the proposed scheme. . In section VII simulation results are presented for the proposed solution along with its comparison with conventional approaches. Section VIII concludes the paper.
II. APPLICATION OVERVIEW
We present the design of a low power sensor node with on-sensor computing for surveillance application. The objective is to detect the presence of humans in a restricted zone. This kind of intelligent sensor has importance in military surveillance applications for monitoring human presence in sensitive areas. A block level diagram of the neural network based sensor node is shown in Fig. 1 which consist of units for power management, sensor interface, filtering and analog to digital conversion (ADC), data processing, memory, and wireless communication. The sensor is supposed to capture the surrounding sound signals within its range, which would be processed by other blocks to finally get the result of classification, used by the transceiver to communicate to the base station.
Conventional methods for audio signal classification usually rely on spectral and wavelet based features from the frequency domain in addition to time domain features such as mean, standard deviation, maximum value of the amplitude followed by a classifier [23] [24] [25] [26] [27] .
A detailed review of acoustic signal processing along with the different stages involved (sound rejection, detection, localization, classification and cancellation) is presented in [23] . In [24] FFT features from the acoustic and seismic signals were considered along with baseline classifiers such as kNN (k nearest neighbor), maximum likelihood (ML) and support vector machine (SVM) for vehicle classification. In [25] Mel-Frequency cepstral coefficient (MFCC) features were used to classify different types of vehicle (heavy, medium, light), and it was found that artificial neural network (ANN) classifier achieves a better (>20%) classification accuracy compared to kNN. In [26] authors consider spectral features using linear prediction coding (LPC) along with a neural network for classifying seismic signals. In [27] both time and frequency domain features were considered and assessed using neural networks and genetic algorithms for classifying seismic signals. In order to avoid the computational energy and the difficulty in implementing in low-cost sensor nodes with constrained resources, authors in [12] presented a time domain signal processing for feature extraction followed by neural networks for classifying acoustic/seismic signals of vehicle types.
These conventional methods suffer from computational complexity involved in the feature extraction and classification process and hence, lead to relatively power hungry solutions. Therefore, energy efficient techniques for real-time feature extraction and classification is necessary. We approached the problem considering low weight time domain feature extraction and SNN. This is an approximate computing methodology amenable to analog implementation.
III. PROPOSED SCHEME:TIME DOMAIN APPROXIMATE FEATURE EXTRACTION AND SNN BASED CLASSIFICATION SCHEME
We propose a scheme involving low complexity time domain features followed by a SNN classifier as an alternate option for low power sensor node design for acoustic applications. The simple time domain features will reduce the complex computations involved in calculating the frequency domain features. In addition to that the SNN classifier would be a viable alternative to its counterpart ANN due to the spike (event) based processing of information. In a subsequent section we show the amenability of the proposed scheme to low power analog implementation. 
A. Low complexity time domain feature extraction
Features derived directly from the time domain would be preferable in terms of lesser computation complexity when compared to frequency domain features. Authors in [12] have earlier shown that time encoded signal processing and recognition (TESPAR) has been successful in obtaining essential features for seismic signals and would be feasible for real-time processing.
TESPAR is inspired by infinite clipping coding of a waveform which represents the duration between the zero crossings of the waveform [12] . The segment between two consecutive real zeroes is termed as an epoch. The intelligibility of the waveform can be enhanced by introducing the concept of complex zeroes in addition to the real zeroes. The complex zeroes represent the shape of the signal waveform in the form of local maxima and minima of the signal waveform. Therefore, a band limited signal can be approximated by computing the following parameters in each epoch as shown in The D and S contain information of the fundamental frequency and harmonics of the signal respectively [12] .
The 2-D D and S matrix representing a signal waveform is shown in Fig. 2 (b) . Each location in the 2-D matrix will have the number of occurrences of the (D, S) combination present in the signal. The 2-D matrix is further expanded to a 1-D vector to form the final feature vector (FV) to be fed to the classifier. Optimization of different parameters related to it are discussed in Section IV.
B. SNN based classification scheme
The final feature vector is to be fed to a SNN classifier. The general network structure of a SNN is similar to that of the ANN as shown in Fig. 3 . The difference lies in the functionality of the neurons present in the network.
An illustration of the spike generation from feature vector within a specified time window is shown in Fig. 4(a) . The instances of spike generation are of random nature. The number of spikes generated in the interval is directly proportional to the (FV) magnitude and the maximum input firing rate (r), which is a network parameter. This way the input layer neurons process spikes equivalent to the FV value.
For, the subsequent layers, integrate and fire (I&F) neurons are used to implement the spiking operation as shown in Fig. 4 (b) . The I&F neuron receives input from the previous layer neurons whenever there is a pre-synaptic spike event. The neurons between two consecutive layers are connected via synapses of different strength representing the weights. Hence, the input current magnitude is scaled down according to the synaptic strength. After receiving the input, the membrane potential (vmem) of the neuron increases. Whenever vmem exceeds the threshold potential (vth) of the neuron, (i.e., vmem>vth) the neuron is activated and generates a post synaptic spike, after which vmem falls down to the neurons resting potential (vres). The equation depicting the vmem of a simple I&F neuron is in (1)
Here, Wi and δ(.) represent the weight of the i th incoming synapse and delta function respectively. Si = [ti 0 , ti 1 , …..] stores the spike times of the i th presynaptic neuron.
The optimal choice of network parameters for the ANN and the corresponding SNN network are discussed in Section V. The architectural diagram of the SNN classifier can be constructed using digital blocks as well as in analog fashion (refer Section VI).
IV. OPTIMIZATION OF FEATURE EXTRACTION SCHEME
We considered a window size of 5s to be qualified as a data sample from which essential information is to be identified. This essential information is known as the FV, which represents the data sample. FV can be extracted from the time/frequency domain of the signal. Selecting proper feature aids in better performance on the subsequent stages (classification). The complexity in realizing the FV in real time scenario is a critical factor to be considered. The target is to achieve an acceptable classification accuracy with 
A. Dataset creation
The surface where the sensor nodes are placed is considered to be a gravel surface. Due to unavailability of a significant number of data needed for addressing the problem, we created a suitable dataset using data augmentation methods such as rolling the files circularly and adding random noise to it [14] . The audio data of the footstep sound on gravel and different background noise were taken from different sources 1 sampled at 44.1 KHz. From the frequency spectrum of the audio files, it was understood that the major portion of the power content lies in the frequency range of (0-5)KHz both for the background noise as well as footstep with background noise. Therefore, a cut-off frequency of fc=5KHz is chosen for filtering out the highfrequency signals. The magnitude of power content depends on the intensity of background noise or footstep, which may vary in different instances. However, the frequency content of both footstep and background noise lie in a similar range (0-5KHz). Therefore, the originally recorded audio files were downsampled to fs=11025Hz (fs>2*fm), for further analysis.
The task is to detect human presence on a fixed surface (gravel considered) in the presence of natural environmental sounds, four different background noises were considered, namely, 1) crickets, 2) birds chirping, 3) rain and 4) wind. For the human presence, audio files of a single person walking on gravel were acquired. The file was divided into 5s segments, considering it as a suitable window length for real-time analysis. For emulating multiple people (3 considered) walking on gravel, a different combination of files was mixed from the single person dataset with a random scaling factor between 0 to 1 for incorporating the effect of distance or intensity from the sensor node location. The final dataset created, consisted of 12352 data samples consisting of two classes, 1) background noises (6176 samples), 2) single person and multiple people walking on gravel in the presence of background noises (6176 samples). The total dataset was divided into train set (9024 samples), validation set (1664 samples), and test set (1664 samples) for further analysis. The number of a single person and multiple people and the four different background noises were selected in a balanced manner for incorporating their effect in equal proportion during the training of the network.
B. Choice of sampling frequency
The footstep audio signal is initially filtered with a low pass filter with pass band frequency 5KHz and stop band frequency 6KHz as the range of footstep audio signal lies within the frequency range (0-5)KHz. Then, different sampling frequencies (fs) were chosen and its effect on classification accuracy was observed as shown in Table. I while setting the (D, S) parameters to (20, 5) . It is to be noted that the test data samples for different frequencies are the same files but resampled to different frequencies. It is observed that the change in test accuracy is insignificant even at lower fs. Hence, fs = 11025Hz has been chosen to reduce the computational complexity for further analysis.
C. Choice of D and S matrix dimension
The two-dimensional matrix of the (D, S) combination for (50, 20) and (20, 5) as shown in Fig. 5 indicates that most of the essential features are concentrated in lower values of (D, S) combinations. The test accuracy observed for different (D, S) combinations at fs = 11025Hz is shown in Table. II from which it is evident that an acceptable accuracy can be obtained even at lower (Dmax, Smax) combination. The FV length is equal to D × S which determines the number of neurons in the input layer of the neural network. Lesser number of input neurons is preferable due to the reduction in the architectural complexity of the classifier. Therefore, a (D, S) combination of (10, 5), implying a feature vector of dimension 50 is considered which can provide an accuracy of ≈ 93.32%.
V. OPTIMAL CONVERSION OF ANN TO SNN
After choosing the optimum FV parameters (Section IV), viz., fs = 11025Hz, D = 10 and S = 5 we train an ANN with different network configurations. The following parameters were chosen during training of the ANN model, learning rate, α = 0.001 with a categorical cross entropy loss function suitable for classification problem, Adam optimizer, a dropout (d = 0.5) is considered for avoiding over fitting. Training was done using back propagation algorithm for minimizing the cross entropy cost function as given in (2) For the proper conversion of ANN to its corresponding SNN network, certain conditions, namely, 1) setting the bias to zero, 2) using ReLU activation functions were considered [6] . ReLU provides a better approximation to relate the firing rate of the integrate and fire (IF) neurons in the SNN model whereas a zero bias will eliminate the effect of external influence to the IF neurons, i.e. only the weights and the neuron threshold potential (Vt) will matter. Therefore, the rate based SNN classifier requires the trained weights obtained by the ANN network, which needs to be further optimized with the SNN parameters namely, maximum firing rate (r) and neuron threshold potential (Vt). A higher Vt implies that a neuron is less likely to spike as the neuron membrane potential will need more time to reach Vt. Hence, a proper combination of r and Vt will ensure higher accuracy, which has been determined by considering different combinations, as shown in Table. III. It can be observed that r = 1000Hz and Vt = 1mV acquires an accuracy of 86.75% within 0.1s duration with a step size of 1ms as shown in Fig. 6 (a) . The average number of spike occurrences for the IF neurons in the input, hidden and To reduce the memory footprint due to the storage of weight coefficients, the effect on test accuracy is observed while truncating the floating point weight coefficients. From Fig. 7 it can be seen that the test accuracy falls from 93.26% to 60.87% when the number of fractional bits is reduced from 2 to 1. Apart from the fractional bits, 4 bits are considered for allocating the signed integer part. There is a drop in accuracy ~(3-6) % for the SNN classifier depending on the total number of time steps considered as shown in Fig. 7 . Hence, a significant amount of computation and power consumption involving the memory can be avoided by restricting the number of bits used in the weights coefficients.
VI. DIGITAL AND ANALOG IMPLEMENTATION OF THE PROPOSED SCHEME

A. Digital implementation I) Architectural schemes for DS FV generation
The top-level block diagram of the DS generator with SNN inference unit shown in Fig. 8 consists of sub-blocks, viz., i) DS FV generator, ii) slice storage, iii) spike generator, iv) spike address generator, v) weight memory, vi) IF neuron array.
Two different schemes for generating the DS FV were analyzed. The DS FV is generated using registers, comparators, counters, and basic gates.
In scheme I, sequential implementation of DS FV generator followed by ANN was analyzed. In scheme II, a parallel DS FV followed by ANN/SNN was analyzed. The DS FV generator along with the slice storage scheme is shown in Fig. 11 (a) The spike generator (refer Fig. 8 ) is supposed to convert the input feature vector (FVI) to spike patterns according to the equation, c* FVI >rand() where c=dt*fmaxrate. Here, rand() is a random number between 0 to 1, fmaxrate is directly proportional to the number of spike occurrences and dt=1ms empirically [5] . A pseudo-random number generator is designed using an 11-bit linear feedback shift register (LFSR) for emulating the random number generation. Each feature value in the FV must be compared with a different random number. This is achieved by comparing the outputs of 50 such pseudo-random number generators with FV of dimension 50x1 to get the 50-bit spike vector in one clock cycle as shown in Fig. 12 .
There are two main advantages of the spike based operation, 1) It eliminates the multiplication operation between the input FV and weights as required in ANN, instead the synaptic weights are added if a spike is generated in the pre-synaptic neuron, 2) If the previous layer neuron does not produce spike, then the addition of the synaptic weight between the pre and postsynaptic neurons has to be skipped. Hence, the address location of the weights has to be accessed according to the spike pattern in the 50-bit FV for which a leading zero counter (LZC) [10] is used. The functioning of an LZC with the highest bit reset logic for an 8-bit spike pattern is shown in Fig. 13 (a) which produces the output as the location of a spike in the FV and then resets it and scans the next location. The output of LZC is invalid when all the spikes present in the FV are reset according to the highest bit reset logic. The spike address generator as shown in Fig. 13 (b) uses the LZC for generating the spike address. The structure of an IF and MAC neuron is shown in Fig.  14 (a) and (b) respectively. The weight coefficients between the IF neuron and the neurons in the previous layer producing spikes are added and stored in an accumulator. If the accumulated value is greater than the firing threshold (Vt), the IF neuron generates spike which is used by the address generator of the next layer. The structure of the IF neuron array for a parallel SNN is shown in Fig. 14 (c) .
B. Analog implementation i) Analog implementation of feature extraction ii) Variation analysis for analog design
The neural network can be implemented using neural network in analog fashion. The transistors are used for representing the weights. We did a variation analysis of the trained weights (6 bits considered excluding sign bit) for assessing its feasibility for analog implementation. The variation in the trained weights in the form of random Gaussian distribution is shown in Fig. xx8 (a) which corresponds to a percentage standard deviation of σ=36.68%. We further analyzed the effect on test accuracy versus the percentage standard deviation and find that both the models can tolerate up to σ=30% as shown in Fig. xx8  (b) .
iii) Cross bar scheme for SNN, using multi-bit memristor weights The crossbar SNN circuit can be implemented using multi-bit memristor weights in an efficient manner [23, 24, 27, 28, and 29] . A simple crossbar circuit which used programmable weights in the form of memristor is shown in Fig. xx8(c) . From [26] it can be understood that memristor are compatible with CMOS systems, hence will be a viable option for compact implementation of the SNN classifier using resistive crossbar memory (RCM). For, further energy benefits, nanoscaled spintronic devices as mentioned in [25] can be used for representing the synaptic weights.
VII. RESULTS AND PERFORMANCE COMPARISON
A. Optimization of conventional digital scheme for best results
Conventional methods involves the selection of frequency domain features. In Fig. 15 we have made a comparison between features derived from FFT, DS, and DWT. FV derived from FFT by three different approaches, viz., 1) by computing an N=65536 point FFT over the 5s duration and then averaging the bins for feature dimension reduction (Case I), 2) by considering a window size (200ms) with no overlap and computing N=4096 point FFT over the segment at each window and selecting the maximum value of PSD among all windows followed by FV dimension reduction by bin averaging (Case II), 3) by considering window size (50 ms) throughout the 5s segment on which an N point FFT is performed (N=512, 256, 128, 64, 32, 16) , only the max value of PSD among all windows is considered in the FV (Case III). The FV derived from the FFT (Case III) is convenient compared to the former two cases because N<512 point FFT is computed which in turn eliminates the FV dimension reduction process.
Apart from FV derived from FFT, FV is also derived from DS and discrete wavelet transform (DWT) which can be derived from the time domain. In Fig. 15 the comparison is done between the different FVs while keeping the FV dimension similar. The size of the FV decides the number of nodes at the input layer of the neural network. It is apparent that the computational complexity of FVs derived from frequency domain will be more as compared to the ones derived from the time domain.
B. Comparison between 128 point FFT FV (optimized) and DS FV
The performance of FV DS is more suitable in terms of performance and computational complexity. Comparison of power and resource utilization using D and S and 128-pt FFT feature vector (FV) targeting Zynq AP SoC XC7Z020 CLG484-1 FPGA is shown in Table. IV from which it can be inferred that the FF FV takes ~3x more power compared to DS FV and also consumes more resources. Hence, low complexity time domain DS FV can be considered as a viable alternative to spectral based features derived from the frequency domain. Hence DS FV is considered for further analysis.
C. Digital schemes for SD-ANN and SD-SNN
System-level architectures of the DS FV generator along with ANN/SNN inference unit have been designed while targeting Zedboard with Zynq AP SoC XC7Z020-CLG484-1 FPGA. We discuss some schemes for implementing them in this section. Two different schemes were considered to implement the design: (1) Scheme I: Sequential operation, (2) Scheme II: Parallel operation. Scheme I: The block level diagram of the system consisting of the DS FV generator and ANN is shown in Fig. 16 . It consists of a DS generator, an address generator, memory banks, FIFO, ANN classifier and control circuit modules. The RTL of the ANN inference unit of dimension [50-80-2] (Section VI) is shown in Fig. 17 . The operation frequency of the ANN block was chosen to be 8.32 KHz which will be able to complete 4160 MAC operations in 500ms time. The input FV from the FIFO unit is stored in a register array. The hidden layer MAC unit accesses the FV from the register array sequentially and multiplies it with the corresponding weights stored in memory for NI=50 MAC operations, after which the bias corresponding to the neuron is added. This process is repeated NH=80 times for computing the results of all the NH hidden neurons. The hidden layer output is passed through a ReLU activation block and then stored in a register array which acts as the input to the output layer MAC unit. The similar process is repeated to get the final output result through a comparator. The utilization report and the power consumption are mentioned in Table V and VI. Scheme II:A parallel implementation of DS FV followed by ANN/SNN was done (Section VI (A)) . The format of the trained weight coefficient consists of 10 bits out of which 6 bits are for fractional, 1 bit for the sign and 3 bits for integer. The weights are stored in distributed ROM memory. The weights of the 1 st layer are stored in 50 address locations, where each location corresponds to the neuron number in the input layer, i.e., the 1 st location contains all the weight coefficients between the 1 st input neuron with the 80 hidden neurons, resulting in 800 bits per address location. Similarly, the weight memory between the output layer neuron and the hidden layer consists of 80 locations with 20 bits per address location. The utilization report and power consumption report for scheme II are mentioned in Table VII and VIII respectively. The trained parameters are stored in distributed memory which will infer LUTs and FF. It is to be noted that the for similar network architecture, the ANN occupies 82 DSP blocks due to multiplication operation, whereas in SNN, DSP blocks are not inferred. It can be understood that the major portion of the power is consumed by the Mixed Mode Clock Manager (MMCM) which converts the inbuilt 100MHz clock to the desired clock frequency of operation.
D. Synthesis results of DS-SNN (digital)
The estimate of different modules in 65nm technology node is shown in Fig. 12 ) consisting of a PN generator for converting the input feature vector of size 50 to a corresponding 50 bit spike pattern is consume ~0.78 µW.
The IF neuron consumes 0.26nW with an area of ~297µm2 which is ~4.58x area efficient and ~5x power efficient than the MAC neuron followed by ReLU activation function for the 1st hidden layer. It is to be noted that the MAC neuron for the subsequent layers will consume more area and power due to increase in the size of the input bit width whereas the same IF neuron can be used in subsequent layers as the input bit width depend on the bit width of the trained parameters and the average number of strong connections in the previous layers which is found to be similar for our network configuration (50-80-2). The number of weights required for our network is equal to 4160. An SRAM memory for storing the weights will consume ~2.3 µW with an area of ~1915 µm2. The logic for selecting the address corresponding to the stored weights based on the spike patterns (refer Fig. 13 ) for the hidden and output layers consume 1864 µm2 area and ~150nW power. It can be concluded that there lies savings in terms of area and power for the SNN inference unit compared to the ANN inference unit when operated at the same clock frequency.
E. Estimation of power consumption of the SNN classifier
The SNN architecture used in this work can be mapped to low power neuromorphic processors having spike-based processing facilities. An approximate analysis can be done based on the average number of spike occurrences (Nspk) during the entire duration of the inference operation. Considering that a spike activity consumes β Joules of energy. In section 1V.e (Fig 6) we estimated the average number of spike generation to be Nspk = 700 for 100-time steps. Therefore, approximate energy consumption for the SNN inference based on different spike based processors in literature is listed in Table X .
VIII. CONCLUSION
In this work we have studied the classification of human footsteps on gravel in the presence of four different background noises consisting of wind, rain, birds chirp and crickets, using ANN/SNN model. An appropriate choice of fs and input feature vector, i.e., (D, S) combination was determined, such as to achieve an acceptable accuracy (>90%) with fewer computations involved in the feature extraction process. A time domain based feature (DS FV) showed ~3x power benefits in terms of power compared to frequency domain FFT FV. The training of the network was done using ANN with a proper choice of parameters to get an accuracy of ≈ 93.32 % by the ANN classifier and ≈ 86.41 % by the SNN classifier. Also, the effect on accuracy due to approximating the weight coefficients was observed for both the ANN and SNN classifiers. Hardware-level assessment of the classifiers (ANN/SNN) used along with the DS FV generator were carried out targeting Zedboard with Zynq AP SoC XC7Z020-CLG484-1 FPGA. The later uses fewer resources compared to the former and dissipates ~8% less power.
As a future work the implementation of FV generation and SNN can be explored in analog fashion for better energy benefits. 
