Introduction: This Letter presents the design and implementation of a biologically inspired auto-associative memory that uses the abundant logic resources of an FPGA to model axonal delay elements (DEs). As illustrated in Fig. 1, the system learns an input pattern through a training process. When a partial input pattern is later applied, the complete training pattern is retrieved and replayed in a sustained manner.
Fig. 1 Auto-associative memory
Architecture: We define a pattern as a sequence of pulses. We refer to the pulses as spikes and each spike has width t_spike. The sequence of pulses is represented by a vector of k elements {x_0, x_1, ..., x_{k−1}}, with each x_i representing the time between a reference signal and the rising edge of each pulse [1]. The spiking neural network (SNN) model consists of k neurons. Each neuron contains a coincidence detector driven by a subset of the other k − 1 neurons. The input spikes to a neuron's coincidence detector are referred to as the context spikes for the given neuron [2]. The training pattern is treated with wrap-around in the time domain, i.e. the last spike precedes the first spike. If each coincidence detector receives c context spikes, then a total of k × c programmable delay lines are required, as is illustrated in Fig. 2 for the case k = 4, c = 2. Using the programmable delay lines, the SNN is configured as a feedback network that drives the recurrent activation of the stored spike pattern. The feedback connection from a neuron to a delay line introduces an additional delay of d_feedback. The total delay imposed on a spike, d_total, is therefore d_line + d_feedback. The coincidence detector of each neuron is implemented using an AND gate followed by a D flip-flop (see Fig. 3). The pulse width of each output spike is set to t_spike using a fixed delay element that resets the flip-flop.
Results: Twelve simple patterns were tested and correct operation was achieved in all instances both in simulation and in hardware. For example, for the pattern {1, 0, 4, 2} shown in Fig. 1, each spike has a 6 ns pulse width and the period of the pattern is 5 unit intervals, or 30 ns. For the N_0 spike in this pattern, the algorithm makes a connection from each of the preceding context spike neurons N_1 and N_2 to the target neuron N_0 through appropriate configurations of the context muxes.
The algorithm then calculates the delay between each context spike and the target spike, i.e. d_total, and determines the delay mux configuration, s_dly, for achieving this delay. The value of d_total against s_dly was determined from simulation, and the plot is shown in Fig. 4. The context spikes N_1 and N_2 were delayed by 6 ns (s_dly = 0) and 12 ns (s_dly = 1), respectively, to achieve coincidence and cause triggering of the N_0 spike. The recall of the pattern can be successfully triggered from a partial representation, e.g. {x_0, x_1} = {1, 0}. Fig. 5 shows the recall of a pattern captured on an oscilloscope. Note that the N_3 spike was triggered first in response to the input of its context spikes x_0 and x_1.
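The delay assignment and recall described above can be illustrated with a short Python sketch. It is not the authors' Nios II training code: the function names, the rule of choosing the c nearest preceding spikes as context, and the idealised recall model (zero-width spikes, exact coincidence, only the most recent spike of each neuron considered) are assumptions chosen to be consistent with the {1, 0, 4, 2} example. The delays are values of d_total in nanoseconds; the mapping from d_total to s_dly depends on Fig. 4 and is not modelled.

```python
def context_delays(pattern, period_units, unit_ns, c):
    """Hypothetical training step: for each target neuron, pick the c nearest
    preceding spikes (with wrap-around) and return the d_total each one needs.
    Assumes all spike times in the pattern are distinct."""
    config = {}
    for target, t_target in enumerate(pattern):
        leads = []
        for source, t_source in enumerate(pattern):
            if source == target:
                continue
            # Units by which the source spike precedes the target, modulo the period
            lead = (t_target - t_source) % period_units
            leads.append((lead, source))
        leads.sort()
        config[target] = {source: lead * unit_ns for lead, source in leads[:c]}
    return config


def recall(config, seed_spikes_ns, n_spikes):
    """Idealised recall: a neuron fires when all of its delayed context spikes
    arrive at the same instant. Returns a list of (time_ns, neuron)."""
    fired = sorted((t, n) for n, t in seed_spikes_ns.items())
    last = dict(seed_spikes_ns)  # most recent firing time of each neuron
    while len(fired) < n_spikes:
        candidates = []
        for target, ctx in config.items():
            if not all(src in last for src in ctx):
                continue  # some context neuron has not fired yet
            arrivals = {last[src] + delay for src, delay in ctx.items()}
            if len(arrivals) == 1:  # all context spikes coincide
                t = arrivals.pop()
                if t > last.get(target, -1):
                    candidates.append((t, target))
        if not candidates:
            break
        t, target = min(candidates)
        last[target] = t
        fired.append((t, target))
    return fired


# Pattern {1, 0, 4, 2}: period of 5 unit intervals, 6 ns per unit, c = 2
config = context_delays([1, 0, 4, 2], period_units=5, unit_ns=6, c=2)
print(config[0])  # {1: 6, 2: 12}: N_1 delayed by 6 ns, N_2 by 12 ns
# Partial input {x_0, x_1} = {1, 0}, i.e. N_0 at 6 ns and N_1 at 0 ns
print(recall(config, {0: 6, 1: 0}, n_spikes=6))
```

Under these assumptions, N_3 is the first neuron triggered by the partial input (at 12 ns), followed by N_2 at 24 ns, after which N_1 and N_0 fire again one period (30 ns) after their seed spikes, which corresponds to the sustained replay.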
The auto-associative memory was implemented and tested on an Altera Cyclone II EP2C35 FPGA, which provides 33 216 logic elements (LEs), with 16 LEs per logic array block (LAB). A counter clocked at 400 MHz by a PLL and a set of registers capture the timing of the input pattern. A Nios II soft processor runs the training algorithm: it reads the spike timings from the registers and applies the appropriate mux configurations for the context-neuron connections and delay settings. The auto-associative memory itself operates asynchronously. The entire system uses 8% (2764/33 216) of the LEs available on the FPGA, while the SNN takes up less than 1% (328/33 216). This implementation was intended to demonstrate the feasibility of the design; capacity can be expanded by replicating the SNN to store multiple patterns. With high-density FPGAs such as the Stratix IV, with 820k LEs, approximately 2500 patterns could be stored.
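The resource figures quoted above can be checked with a few lines of arithmetic. The sketch below assumes that 820k means 820 000 LEs and that every additional stored pattern costs one 328-LE copy of the four-neuron SNN, ignoring the Nios II processor, the capture logic and any routing overhead; the variable names are illustrative only.

```python
TOTAL_LES_CYCLONE_II = 33_216  # LEs on the EP2C35
SYSTEM_LES = 2_764             # entire system, including the Nios II
SNN_LES = 328                  # SNN alone
STRATIX_IV_LES = 820_000       # assumption: "820k" taken as 820 000
COUNTER_CLOCK_MHZ = 400

system_share = SYSTEM_LES / TOTAL_LES_CYCLONE_II  # about 0.083, i.e. 8%
snn_share = SNN_LES / TOTAL_LES_CYCLONE_II        # about 0.0099, i.e. under 1%

# Upper bound on stored patterns if only SNN copies occupy the device
patterns_upper_bound = STRATIX_IV_LES // SNN_LES  # 2500

# Period of the capture counter in nanoseconds
counter_period_ns = 1e3 / COUNTER_CLOCK_MHZ       # 2.5

print(f"{system_share:.1%}, {snn_share:.2%}, "
      f"{patterns_upper_bound} patterns, {counter_period_ns} ns")
```

The 2.5 ns value is simply the period of the 400 MHz counter, which presumably sets the granularity with which the input spike timings are captured.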
Conclusion: An FPGA implementation of a compact auto-associative memory has been presented. The SNN uses only combinational logic and no sequential clocking elements; it therefore has the potential to process patterns at very high speed and with low latency, making it a candidate for ultra-high-performance digital processing designs.
