Introduction {#s1}
============

Developing computational tools for simulating the brain networks is of a very special interest, because the models provide powerful means for investigating different characteristics of the neural system. For example, they can be used to find the effect of malfunctioning voltage-gated channels on network level behaviors in specific brain diseases or are employed to track effects of learning on synaptic efficacies and neural behavior. Using the computational tools before undertaking biological experiments can also give some insights into the results of experiments. In addition, computational models can be used as controllers for cognitive robots.

Neurons and neural pools are basis of computational models. Neurons receive sensory signals, process the information, excite/inhibit each other through a complex electrochemical process (Kandel et al., [@B20]). A neural pool is a group of neurons sharing excitatory or inhibitory property. Neural pools can inhibit or excite each other by means of output signals. Therefore, activity of each neuron can affect the behavior of its pool and other pools in the brain, so specific behavior of the neural networks emerge from interaction of neurons and neural pools (Buzsáki, [@B7]).

A neural network can be implemented on software or hardware. Due to the sequential execution of software, the parallel nature of neural networks is affected which leads to reduction of execution speed. Implementing neurons on hardware can provide several benefits; including high-speed modeling of big neural networks and preparing responses in real-time, etc. (Indiveri et al., [@B18]). In the hardware implementation techniques, digital implementations are more preferred vs. analog implementations, based on some of the digital advantages such as noise-robustness, more flexibility, simple real-world interfaces, and easier testability (Muthuramalingam et al., [@B24]). In addition, digital implementations are more cost-effective and less time consuming (Gatet et al., [@B13]). One of the other benefits of the digital implementations is their capability for fast development (Indiveri et al., [@B18]). There are different methods for digital implementation such as ASIC (Application Specific Integrated Circuit), DSP (Digital Signal Processing), and FPGA (Field Programmable Gate Array). DSP-based implementations are not suitable for modeling the parallel behavior of the neurons because of their sequential nature. ASIC implementation is more efficient than FPGA in terms of power and area, but it suffers from lack of re-configurability (Wanhammar, [@B32]). FPGA has some benefits over DSP and ASIC that we are interested in: it is reconfigurable, so it is useful for rapid prototyping of neural networks (Wang et al., [@B31]), and it has parallel processing architecture. Therefore, FPGA could be an appropriate solution for hardware implementation of neural networks (Muthuramalingam et al., [@B24]).

FPGAs have limited usable area and design tool chains, which create difficulty in implementation of large neural networks. Therefore, designs should be optimized in size to be implemented on an FPGA with limited number of resources. In this paper, we design and implement a biologically plausible neural network on an FPGA. There are different biological neuron models; however, we opt for Hodgkin-Huxley (H-H) neural model because of its biological plausibility and inclusion of synaptic details. Due to the FPGA area limitation, we use the reduced order of the previous implementation (Bonabi et al., [@B4]). The implemented neural network in this work is a modified model used in Moldakarimov et al. ([@B23]). The implemented network is the basic component of many neural networks; it has two mini-columns, each has two neural pools: an excitatory pool with 60 neurons and an inhibitory pool with 15 neurons. The reason that we are interested in this model is that this model can be used to investigate the competition between neural pools in the brain (Bakhtiari et al., [@B3]). Using the hardware as an accelerator for reducing the computation time of such neural networks is our main objective. The basis of single pool implementation are stated in Bonabi et al. ([@B5]). We use MATLAB simulations for high-level design of neural network and the results of simulations are used as a golden model to check correctness of our implementation.

In Materials and Methods, we introduce the model at the neural and the network levels. FPGA implementation of the network is introduced in this section, too. In Results, the validation processes and the implementation results for a set of variables are given. In addition, the hardware system and the speed of processing are discussed. Discussions and Conclusions are given in the last section.

Materials and methods {#s2}
=====================

Neuron model
------------

The neuron model used in this implementation is the reduced version of the model introduced in Traub and Miles ([@B30]). Ermentrout and Kopell ([@B12]) and Börgers et al. ([@B6]) also used the reduced model to investigate the dynamical behaviors of neural networks. The structure of the model for both of the inhibitory and excitatory neurons is the same. The membrane potential of the neurons follows the H-H equations (Hodgkin and Huxley, [@B17]). The ionic current that is mainly composed of sodium and potassium ions controls the membrane voltage. Moreover, this voltage regularizes the current flow of the ions by means of voltage-dependent ion channels. There are other ionic currents such as chloride, which their gating variables are independent of the membrane voltage. These ions constitute the leak current. Inside and outside of the membrane do not have equal concentration of ions, which results in an electrical potential. Both concentration gradient and electrical potential controls the current flow. The H-H model is shown in Figure [1](#F1){ref-type="fig"}.

![**Hodgkin and Huxley proposed circuit for squid giant axon**. g~K~ and g~Na~, are voltage-dependent conductance (Hodgkin and Huxley, [@B17]).](fnins-08-00379-g0001){#F1}

According to Börgers et al. ([@B6]); Izhikevich ([@B19]) the relationship between input current, membrane voltage, and the complete set of H-H equations come in (1a--4c): $$\frac{CdV}{dt} = I - I_{Na} - I_{K} - I_{L}$$ $$I_{Na} = \overline{g_{Na}}m^{3}h\left( {V - V_{Na}} \right)$$ $$I_{K} = \overline{g_{K}}n^{4}\left( {V - V_{K}} \right)$$ $$I_{L} = g_{L}\left( {V - V_{L}} \right)$$

Where, $$m = m_{\infty}\left( V \right) = \frac{\alpha_{m}\left( V \right)}{\left\lbrack {\alpha_{m}\left( V \right) + \beta_{m}\left( V \right)} \right\rbrack}$$ $$\alpha_{m}\left( V \right) = \frac{0.32\left( {V + 54} \right)}{1 - exp\left\lbrack {- 0.25\left( {V + 54} \right)} \right\rbrack}$$ $$\beta_{m}\left( V \right) = \frac{0.28\left( {V + 27} \right)}{exp\left\lbrack {0.2\left( {V + 27} \right)} \right\rbrack - 1}$$ $$h = max\left( {1 - 1.25n,\ 0} \right)$$ $$\frac{dn}{dt} = 0.7\left( {\alpha_{n}\left( V \right)\left( {1 - n} \right) - \beta_{n}\left( V \right)n} \right)$$ $$\alpha_{n}\left( V \right) = \frac{0.01\left( {V + 34} \right)}{1 - exp\left\lbrack {- 0.1\left( {V + 34} \right)} \right\rbrack}$$ $$\beta_{n}\left( V \right) = 0.125exp\left\lbrack {- 0.0125\left( {V + 44} \right)} \right\rbrack$$

The typical values and units of the parameters that used in the above equations according to Börgers et al. ([@B6]) are as follows: *C* = 1 μ*F*/*cm*^2^, *g*~*Na*~ = 100 *mS*/*cm*^2^, *V*~*Na*~ = 50 *mV*, *g*~*K*~ = 80 *mS*/*cm*^2^, *V*~*K*~ = −100 *mV*, *g*~*L*~ = 0.1 *mS*/*cm*^2^, *V*~*L*~ = −67 *mV*.

The units of the letters *V*, *t*, and *I* are *mV*, *ms*, and μA/cm^2^ respectively.

Neural network model
--------------------

The implemented neural network is composed of two similar mini-columns. Figure [2](#F2){ref-type="fig"} shows the network model. Each mini-column has an excitatory and an inhibitory pool. All of the neurons in each pool are connected to each other. By means of specialized structures called synapses, the information is transmitted among neurons. According to Börgers et al. ([@B6]) neurons in each pool are related to each other with neurotransmitter, AMPA (E → E and E → I) and GABA~A~ receptors (I → I and I → E). In order to create a relationship between the two mini-columns, an excitatory synaptic current is given to the inhibitory pool in the adjacent mini-column. In Figure [2](#F2){ref-type="fig"}, *J*~*ee*~, *J*~*ie*~, *J*~*ii*~, *J*~*ei*~, and *J*^*external*^~*ie*~ are the weights of neuron connections. These coefficients indicate the strength of synaptic connections of the excitatory neurons to each other, the synaptic strength of the excitatory neurons to the inhibitory neurons in the same mini-column, the strength of synaptic connections of the inhibitory neurons to each other, the synaptic strength of the inhibitory neurons to the excitatory neurons, and the synaptic strength of the excitatory neurons to the inhibitory neurons in the adjacent mini-column, respectively.

![**The structure of neural network, consist of two competitive mini-columns**.](fnins-08-00379-g0002){#F2}

Each excitatory pool receives two synaptic currents: one from its own neurons and the other from the neurons of the inhibitory pool in the same mini-column (see Figure [2](#F2){ref-type="fig"}). Equation (5) shows these currents. In this equation, *V~*ee*~* and *V~*ei*~* are equal to 0 and −80, respectively.

I

s

y

n

e

(

1

)

=

J

e

e

g

e

(

1

)

(

V

e

e

−

V

e

\[

j

\]

)

\+

J

e

i

g

i

(

1

)

(

V

e

i

−

V

e

\[

j

\]

)

Each inhibitory pool receives three synaptic currents as described in Equation (6). One synaptic current is exerted from the excitatory pool in the same mini-column, the second one is from its own neurons, and the last one is from the excitatory pool of the adjacent mini-column.

I

s

y

n

i

(

1

)

=

J

i

e

g

e

(

1

)

(

V

i

e

−

V

i

\[

j

\]

)

\+

J

i

i

g

i

(

1

)

(

V

i

i

−

V

i

\[

j

\]

)

\+

J

i

e

e

x

t

e

r

n

a

l

g

e

(

2

)

(

V

i

e

−

V

i

\[

j

\]

)

In Equation (6) *V~*ie*~* is 0 and *V~*ii*~* is −80. Equation (5) is added to Equation (1a) for excitatory pools and Equation (6) is added to Equation (1a) for inhibitory pools. As *V~*ii*~* and *V~*ei*~* are considered less than the minimum value of the action potential of a neuron, the corresponding synaptic current is always negative, so it is always opposed to increase the voltage value. Thus, this can be an expression of the inhibitory synaptic current.

The total effect of the synaptic variable of the excitatory and inhibitory pools on post-synaptic neurons is shown by *g*~*X*~, *X* ∈ {*e*, *i*} in Equation (7). Based on all-to-all connections in each pool, *g*~*X*~ is equal for all neurons in the same pool and it is obtained by the average of pre-synaptic neurons effects in each pool. In Equation (7), *N~*X*~* is the number of neurons in each pool.

g

X

=

∑

k

=

1

N

X

s

X

\[

k

\]

N

X

According to Börgers et al. ([@B6]), *s~*X*~*\[*k*\] is the gating variable, which is calculated by the following equation. τ~R~ and τ~D~ for AMPA receivers are equal to 0.2, 2 and for GABA~A~ receivers are equal to 0.5, 10, respectively.

d

s

d

t

=

1

\+

tan

h

(

V

10

)

2

1

−

s

τ

R

−

s

τ

D

The data transmission rate of neurotransmitters between the pre-synaptic and post-synaptic neurons is modeled by the mechanism of synapse. The term \[1 + tanh(V/10)\]/2 can be assumed as a normalized neurotransmitter concentration (Börgers et al., [@B6]).

FPGA implementation
-------------------

In this section, the FPGA implementation of the neural network, shown in Figure [2](#F2){ref-type="fig"}, is described. According to Figure [2](#F2){ref-type="fig"}, this network is made of two similar mini-columns, each has two neural pools: excitatory and inhibitory. The neurons of the excitatory and inhibitory pools have the same structure and there is no significant difference in their synaptic mechanisms. The selected single neuron model for implementation is described completely in Bonabi et al. ([@B4]) and we made a few changes in the model in this work. We reduced the order of dynamics of the system to make it simpler for the implementation of big neural networks. As Equations (1a--4c) describe, the dynamics of *m* and *h* are neglected. The most difficult part in the implementation of equations is the implementation of the exponential function. The accuracy and performance in the implementation of the exponential function has an intensive impact on the results. We use hyperbolic Coordinate Rotation Digital Computer (CORDIC) algorithm, which could be implemented using simple shifters and adders, to calculate the function. Equations (9a--c) shows the CORDIC algorithm used in our implementation (Ercegovac and Lang, [@B11]).

x

\[

j

\+

1

\]

=

x

\[

j

\]

\+

σ

j

2

−

j

y

\[

j

\]

y

\[

j

\+

1

\]

=

y

\[

j

\]

\+

σ

j

2

−

j

x

\[

j

\]

z

\[

j

\+

1

\]

=

z

\[

j

\]

−

σ

j

t

a

n

h

−

1

(

2

−

j

)

In order to minimize the required FPGA resources, we use LUT for implementing hyperbolic tangent inverse function in the Equation (9c). Therefore, we achieve less running time, because only a simple search is required instead of a mathematical calculation. CORDIC algorithm can calculate the value of functions with a reasonable error only when the inputs are in a limited range (Ercegovac and Lang, [@B11]). According to Equation (10), in order to have a bigger range of inputs, we separate each input into two parts: integer part and fractional part. We precalculate the exponential of the integer part using MATLAB and save them in a LUT. The exponential of the fractional part is calculated using the implemented CORDIC module. Then, in order to produce the final value of the exponential function, the output of LUT is multiplied by the output of the implemented CORDIC module.

A

=

B

\+

C

⇒

exp

(

A

)

=

exp

(

B

)

·

exp

(

C

)

Another module needed to produce new values of *n*, *s*, and *V* from their dynamics is the integrator. The method used to implement this module is the same as our previous work (Bonabi et al., [@B4]). Equation (11) is used to implement the integrator.

x

(

t

\+

Δ

t

)

=

x

(

t

)

\+

Δ

t

.

x

˙

(

t

)

In this work, we add a multiplexer (Mux) in order to insert the initial values for the dynamics *n*, *s*, *V*. Figure [3](#F3){ref-type="fig"} shows the block diagram of the implemented module for calculating the integral function. In order to calculate the integral of each step, a register is used to save the previous values of the integral.

![**Block diagram of implemented module for calculating the integral function**.](fnins-08-00379-g0003){#F3}

The basis of implementations of the excitatory and the inhibitory pools are the same and it is similar to the method presented in Bonabi et al. ([@B5]). We only change the number of neurons and the number of synaptic currents in each pool. Neurons are connected to each other with neurotransmitters, which make synaptic current. Synaptic current, as shown in Equations (5, 6), depends on the total effects of the synaptic variable and the gating variable. Therefore, for calculating synaptic current, we have to implement a module that could calculate the gating variable. The block diagram in Figure [4](#F4){ref-type="fig"} is used to implement the gating variable, *s*, which is shown in Equation (8).

![**Designed module to implement the gating variable *s***.](fnins-08-00379-g0004){#F4}

In order to calculate the hyperbolic tangent block in Figure [4](#F4){ref-type="fig"}, we use Equation (12) by employing our implemented exponential function.

t

a

n

h

(

x

)

=

e

x

p

(

2

x

)

−

1

exp

(

2

x

)

\+

1

To reduce the number of multipliers, a one-bit left shifter is used to duplicate the input of hyperbolic tangent. These currents are added to the implemented pools in order to make a connection between the neurons in each pool, between excitatory and inhibitory pools in each mini-column, and excitatory and inhibitory pools in the different mini-columns. Because of the limited usable area, the implemented mini-column has five neurons, four of them are in excitatory pool, and the last one is in inhibitory pool. The block diagram of this neural pool is shown in Figure [5](#F5){ref-type="fig"}.

![**Block diagram of implemented mini-column with five neurons**.](fnins-08-00379-g0005){#F5}

Figure [6](#F6){ref-type="fig"} shows the controller that is designed for this mini-column. This mini-column finishes its work when both of the pools finish their processing.

![**Designed controller for the implemented mini-column**.](fnins-08-00379-g0006){#F6}

To implement the neural network with 150 neurons, two mini-columns with 75 neurons are needed which are connected to each other by a synaptic current. Because of the resource limitation of FPGA, we share resources by multiplexing in time to implement this neural network on a single chip. Figure [7](#F7){ref-type="fig"} shows the architecture that is used to increase the number of neurons in each mini-column. In this architecture, we use a ROM to put the initial values of membrane voltages and gating variables for each of the neurons. The RAM is used to write the responses of the neurons and use them for the next executions of the system. Both of the ROM and RAM have 15 rows; each row has data for all of the five neurons in the mini-column. Each row is used in each execution. The Mux is used to choose the data from ROM and RAM. The address generator generates the address to read from the ROM/RAM and write in the RAM.

![**The proposed architecture to increase the number of neurons in the mini-column**.](fnins-08-00379-g0007){#F7}

The proposed architecture has a specific controller, shown in Figure [8](#F8){ref-type="fig"}, which controls all the processes. We have a counter in this controller, which manages the interactions of all components. The proposed architecture and the controller work in these steps: At first, the controller resets the circuit in order to set the values of all components to zero and the controller moves to the next state.In the second state (Read Input) the select input of Mux is set to zero and the inputs of the mini-column is read from the first line of the ROM. For the next 15 executions of this system, the select input of Mux is zero to choose the data from ROM. After first 15 executions, the select input is changed to one and data will be read from RAM. After reading the inputs of mini-column, the controller moves to the next state.The controller will stay in this state (Process) until the responses of neurons (five neurons in the mini-column) are ready. When the ready output of the mini-column is activated, the state of controller changes to the next state (Write Output).In the Write Output state, the answer of five neurons (membrane voltages and gating variables) is written in the same row of the RAM that data was read from it. If the address reaches to 15, the controller changes its state to Output Ready state; otherwise, the next state will be Check Status. Then, after a clock cycle, the state of the controller will be changed to Read Input, address generator points to the next row of the ROM/RAM, and these steps will be repeated.If the value of the address generator reaches 15, the controller changes its state to Output Ready. In this state, the ready output of mini-column with 75 neurons is activated. Then, all of the synaptic currents will be calculated and given to the pools. The controller resets the counter in the address generator, so address generator will point to the first row of the RAM and these steps will be repeated until the desired final time.

![**The designed controller for the implemented architecture to increase number of neuron in the mini-column**.](fnins-08-00379-g0008){#F8}

In the neural network, two mini-columns should be connected to each other by synaptic currents. To implement this connection when *g*~*X*~ is calculated in each mini-column, it will be entered to the inhibitory pool of the adjacent mini-column by coefficient of *J*^*external*^~*ie*~ and the synaptic current will be calculated in the inhibitory pools. The block diagram of the implemented neural network is shown in Figure [9](#F9){ref-type="fig"}.

![**Block diagram of implemented neural network**.](fnins-08-00379-g0009){#F9}

Figure [10](#F10){ref-type="fig"} shows the designed controller for the implemented neural network. At first, the controller resets both of the mini-columns and they start to work from the S~Start~ state. Then, mini-columns are activated together and work as a parallel system until both of them prepare the response of their 75 neurons. Each mini-column that finishes working, activates its ready output (Rdy). According to the ready output of the mini-columns, the controller changes its state. For example, if the ready output of the first mini-column is activated (Rdy~1~ = 1), the controller changes its state to S^B^~*Rdy*1~. In the new state, the clock of this mini-column is deactivated. Then, the controller changes its state to the next state (S~Rdy1~). State of the controller would not change until the responses of all of the neurons in the other mini-column are prepared. Then, the controller moves to the next state (S^A^~*Rdy*1~), the clock pulse of both of the mini-columns are deactivated in this state, and after a clock cycle the controller moves to the final state (S~Finish~). In the final state, the results are written in the memories of the mini-columns (RAM) and the controller changes its state to the first state (S~Start~). The controller repeats this process until the desired time.

![**The designed controller, which is used to control the neural network**.](fnins-08-00379-g0010){#F10}

In order to add the effect of stochastic factors, we add a noise term to the input current of each neuron. This random term creates small differences between neurons, which makes the implementation more biologically plausible. The noises are generated from a zero mean Gaussian distribution for each neuron. Then, they are saved in a ROM that is shown in Figure [9](#F9){ref-type="fig"} (*I~*Noise*~*). Each given address to the ROM by the address generator gives noises to the neurons of the two mini-columns.

Results {#s3}
=======

To validate the implemented neural network on FPGA, the deployed bit level simulations are compared with the numerical implementations of the mathematical models in MATLAB Simulink.

All of the hardware components are designed and implemented using VHDL modeling language. The main objective in the design of the low-level implementation is the accuracy of the output. Due to the hardware constraints, in the implementation of all modules minimum number of bits are used. More details of implementation are given in Table [1](#T1){ref-type="table"} as synthesis results.

###### 

**Abstract of synthesis results**.

  **Criteria**            **Virtex-7**   
  ----------------------- -------------- ----
  Frequency               63.386 MHz     --
  No. of LUTs             86032          33
  No.of LUT-FF pairs      30528          28
  No.of slice registers   50228          9
  No.of DSP blocks        1112           72
  No.of BRAM              14             1

The membrane voltage of the implemented single neuron (blue line) and MATLAB simulation (red line) for *I*~*ext*~ = 0.7 mA are shown in Figure [11](#F11){ref-type="fig"}. According to this figure, there is a small difference between the membrane voltages of the implemented model on FPGA and MATLAB simulation, because of the rounding error in the digital implementation. However, the firing rates of both of the signals are equal.

![**The voltage of single neurons, implemented single neuron (blue line) and MATLAB simulation (red line)**.](fnins-08-00379-g0011){#F11}

In an excitatory pool (when there is no connection between the pools of a mini-column) each neuron receives an excitatory synaptic current and external stimulus. Therefore, a neuron in an excitatory pool receives two exciting current while a free single neuron receives just an external stimulus. Thus, the firing rate of a neuron in an excitatory pool is higher than a free single neuron. Each neuron in the inhibitory pool receives an inhibitory synaptic current that reduces the effect of external stimulus. Therefore, the firing rate of a neuron in the inhibitory pool would be less than the firing rate of a free single neuron. In order to investigate the effects of synaptic currents in the pools, we exerted a certain stimulation (*I*~*ext*~ = 0.7 mA) to a free single neuron, an excitatory pool, and an inhibitory pool. The result shows that in the excitatory pool the firing rate is more than the firing rate of the free single neuron and in the inhibitory pool it is less than the firing rate of the free single neuron. The firing rate in the single neuron, excitatory pool, and inhibitory pool are shown in the Table [2](#T2){ref-type="table"}.

###### 

**Firing rates of a neuron in single neuron and pools**.

                     **Single neuron**   **Excitatory pool**   **Inhibitory pool**
  ------------------ ------------------- --------------------- ---------------------
  Firing rate (HZ)   26                  30                    16

In each mini-column, the excitatory pool receives an inhibitory current from the inhibitory pool that reduces the firing rates of the neurons. Figure [12](#F12){ref-type="fig"} shows the membrane voltages of one neuron in the excitatory and inhibitory pools, given *I*~*E*~ = 0.8 mA, *I*~*I*~ = 0.7 mA, *J*~*ie*~ = 0.7, and *J*~*ei*~ = 0.2. According to the voltages in this figure, when the neurons in the inhibitory pool reach their maximum value, they prevent the neurons in the excitatory pool to complete their firing. Therefore, as you can see in Figure [12A](#F12){ref-type="fig"} small bumps are created in the shape of the voltages of the excitatory neurons. In addition, the firing rate of the neurons in the excitatory pools is reduced.

![**Membrane voltage of one of the neurons, (A) In the excitatory pool, (B) In the inhibitory pool of a mini-column**.](fnins-08-00379-g0012){#F12}

When mini-columns are connected to each other, and a stimulation excites one of the excitatory pools in the network, both of the inhibitory pools will be activated and they try to inhibit their excitatory pools. For example, consider a case when the excitatory pool in the second mini-column (*E*2) receives greater stimulation than the excitatory pool in the first mini-column (*E*1). *E*2 and *E*1 exert excitatory synaptic currents to both of the second and first inhibitory pools (*I*2 and *I*1 respectively) (see Figure [2](#F2){ref-type="fig"}). *I*1 and *I*2 try to suppress the effect of the stimulations to *E*1 and *E*2. Since *E*1 is less stimulated than *E*2, the inhibiting effect of *I*1 might be able to suppress the neural activity in *E*1. To investigate this on the implemented system, we injected the same currents to both of the inhibitory pools and *E*1 is stimulated with less external current than *E*2. As rastergrams in Figure [13](#F13){ref-type="fig"} shows, the neurons in *E*1 are strongly suppressed by *I*1, while *I*2 is not able to suppress *E*2. Rastergrams of *I*1 and *I*2 looks almost the same, because both of them are excited by *E*1 and *E*2 and they receive equal external stimulations. The external currents and weights of neuron connections are as follow: $$\begin{array}{l}
{I_{E}^{(2)} = 0.85mA,I_{I}^{(2)} = 0.7mA,I_{E}^{(1)} = 0.7mA,I_{I}^{(1)} = 0.7mA} \\
{J_{ee} = 0.1,J_{ie} = 0.7,J_{ii} = 0.02,J_{ei} = 0.2,J_{ie}^{external} = 0.45.} \\
\end{array}$$

![**Rastergrams of the implemented neural network, for (A) The excitatory pool in mini-column 2, (B) The inhibitory pool in mini-column 2, (C) The excitatory pool in mini-column 1, (D) The inhibitory pool in mini-column 1**.](fnins-08-00379-g0013){#F13}

Discussion {#s4}
==========

There is a range of neuron models: from abstract to biophysically plausible ones (Bailey et al., [@B2]). The computational efficiency of models is inversely proportional to inclusion of details in them. Therefore, simpler models are more widely used. Abstract neuron models are usually used in the artificial neural network implementations; see Khan et al. ([@B21]); Sahin et al. ([@B27]); Çavuşlu et al. ([@B9]) as examples. Interest in implementation of spiking neuron models is also increased in the recent years (Yang et al., [@B33]; Ambroise et al., [@B1]).

Several implementations of biologically plausible neuron models and neural networks have also been proposed. For example, in Zou et al. ([@B35]), the authors have implemented a neural ınetwork based on the H-H neuron model using analog ASIC and a computer. In another example, in Heo and Song ([@B16]) a VLSI implementation of a biological neuron model is presented. Each specific ıimplementation has its own advantages over any general-purpose implementation, like realization on FPGA. Nevertheless, the inherent parallel processing nature of FPGA devices, in addition to their shorter design time and online re-configurability, make them a good alternative for implementation of biologically plausible neural networks.

There are FPGA implementations of simple neuron models; see Chen and Wang ([@B10]) and Yang et al. ([@B33]) for Integrate-and-Fire and Leaky Integrate-and-Fire models respectively. In Bailey et al. ([@B2]) Cellular Automata is implemented on FPGA. In Cassidy and Andreou ([@B8]); Mokhtar et al. ([@B22]); Rice et al. ([@B26]); Ambroise et al. ([@B1]) the Izhikevich model, which is a compromise between biological and abstract neuron models, has been implemented on FPGA. We are interested to have a more biologically plausible neuron model; so the H-H neuron model is chosen. There are different FPGA implementations of the H-H neuron model (Graas et al., [@B14]; Zhang et al., [@B34]; Pourhaj et al., [@B25]; Saïghi et al., [@B28],[@B29]; Grassia et al., [@B15]). A LUT-based implementation of computing algorithms is used in Graas et al. ([@B14]), that has better time complexity, but more area is needed compared to the direct implementation. In addition, it has lower accuracy than direct algorithmic implementation. Therefore, a large size of memory is used to save the pre-computed results with limited number of bits, which reduces the final accuracy because of the limitations on memory size. Moreover, in Graas et al. ([@B14]) MATLAB System Generator (SG) is used to implement a neuron and it is obvious that SG cannot produce an optimal hardware. Pourhaj et al. ([@B25]) used LUT-based implementation in different parts of the model and some equations with the exponential terms, which has a ıside effect on the final accuracy of implementation. The authors of Zhang et al. ([@B34]) implemented a 32-bit floating point reconfigurable somatic neuroprocessor on an FPGA. In Saïghi et al. ([@B28],[@B29]); Grassia et al. ([@B15]) the H-H based neural network has been implemented on a combination of systems: digital hardware, analog hardware, and software. Due to data transmission between different interfaces, these kinds of implementations are more sensitive to noise and have lower speed compared to the network implemented on a single FPGA. Also, implementation, validation, and evaluation of these types of systems are harder than pure digital design on FPGA devises. In Bonabi et al. ([@B5]), a neural pool is implemented on FPGA. In this paper, we have upgraded this pool to a network composed of four different pools. It is done by improving our implementation method through using computational techniques, such as CORDIC algorithm and step-by-step integration in the implementation of arithmetic circuits, in addition to sharing resources. Our implementation makes increasing the network size possible while keeping the network execution speed close to real time and having high precision.

Conclusions and future work
===========================

In this paper, we introduced a method to implement a biologically plausible neural network on an FPGA. The network consists of four neural pools; two excitatory; and two inhibitory pools. Each pool is implemented based on the H-H neuron model.

We used MATLAB simulation for validating our implementation. After simulating the network using MATLAB, we chose suitable number of bits for each variable in mathematical computations, in order to have less error in the results of the implemented network.

We described each part of the H-H model equations using VHDL as a hardware description language (Bottom-up approach). In order to have a neural network with the maximum number of neurons and accurate responses, we focused on having small error, rapid processing time, and using less hardware resources. In the implementation, both of the full implementation and LUT-based implementation were used to obtain high accuracy and low execution time. In addition, varieties of computational circuits were tested to have an implementation with optimal resource usage, such as CORDIC algorithm and step-by-step integrator, which are beneficial in having accurate response and low hardware resource usage. At every stages of the implementation, range of the inputs and outputs are examined, in order to achieve the appropriate data representation and to have simplified arithmetic functions. For example, in the implementation of different modules, the LUT implementation is used instead of complete functional implementation, where the function has limited range of changes in the values of inputs and outputs. We achieved high-speed performance of running the neural network, based on the parallel processing nature of FPGA, which is practically impossible in sequential platforms. Moreover, our design has high accuracy in its output; i.e., in membrane voltage signals and firing rates.

The presented implementation technique has other benefits as well, such as scalability and extendibility. The number of neurons could be increased by increasing the number of repetitions in the address generator. It results in linear increase in response time of the system. The implemented network could be used as a pipeline system to raise its speed in the next steps of this research. Using our system for developing controllers for cognitive robots is among our research plans.

Conflict of interest statement
------------------------------

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

[^1]: Edited by: Tim Pearce, University of Leicester, UK

[^2]: Reviewed by: Alejandro Linares-Barranco, University of Seville, Spain; Guillaume Garreau, Johns Hopkins University, USA

[^3]: This article was submitted to Neuromorphic Engineering, a section of the journal Frontiers in Neuroscience.
