Abstract. We use neuromorphic chips to perform arbitrary mathematical computations for the first time. Static and dynamic computations are realized with heterogeneous spiking silicon neurons by programming their weighted connections. Using 4K neurons with 16M feed-forward or recurrent synaptic connections, formed by 256K local arbors, we communicate a scalar stimulus, quadratically transform its value, and compute its time integral. Our approach provides a promising alternative for extremely power-constrained embedded controllers, such as fully implantable neuroprosthetic decoders.
Brain-Inspired Analog-Digital Systems
Analog computation promises extreme energy-efficiency by operating close to the shot-noise limit [1] . By exploiting physical laws (e.g., conservation of charge for summation), a handful of analog devices is sufficient to perform computation. In contrast, digital computation relies on abstractions that require many more devices to achieve the same function (e.g., hundreds of transistors to add two 8-bit numbers). Furthermore, these abstractions break when noise exceeds a critical level, requiring enormous noise margins to avoid catastrophic failures. In contrast, analog degrades gracefully, allowing for operation at low noise margins, thereby saving power.
However, robust and programmable computation using noisy analog circuits is challenging. Robust computation requires a distributed approach, but this is difficult because analog communication is susceptible to heterogeneity and noise. Programmable computation requires flexibility, but this is also difficult because analog computation exploits the underlying devices' fixed physical properties.
In this paper, we realize robust and programmable mathematical computations with noisy and heterogeneous components using a framework inspired by the brain [2] . The brain uses graded dendritic potentials (cf. analog computation), all-or-none axonal spikes (cf. digital communication) and probabilistic synapses (cf. weighted connections). Our analog computational units are spiking silicon neurons [5] ; our digital communication fabric is a packet-switched network [3] ; and our weighted connections are packet-delivery probabilities [8] . While neuromorphic systems that combine some or all of these features of the brain have been built previously, they only performed specific computations. In contrast, we realize any desired mathematical computation by programming the connections' weights to exploit the silicon neurons' heterogeneity.
Section 2 reviews a theoretical framework for mapping computations onto heterogeneous populations of spiking neurons. Section 3 presents hardware elements that support this framework. Section 4 describes Neurogrid, a sixteenchip, million-neuron neuromorphic system used as a testbed (proposed in [4] ). Section 5 presents implementations of static and dynamic computations. Section 6 discusses possible applications.
The Neural Engineering Framework
To program the connection weights among heterogeneous spiking neurons (indexed by i), NEF follows three principles [2] (Figure 1 
):
Representation: A multi-dimensional stimulus x(t) is nonlinearly encoded as a spike rate a i (x(t))-represented by the neuron tuning curve-that is linearly decoded to recover an estimate of x(t),x(t) = i a i (x(t))φ x i , where φ x i are the decoding weights. Transformation: Transformations of x(t) into y(t) are mapped directly to transformations of a i (x(t)) into b j (y(t)) using alternate decoding weights. For example, y(t) = Ax(t) is represented by the spike rates b j (Ax(t)), where neuron j's input is computed directly from neuron i's output using Ax(t) = i a i (x(t))Aφ x i , an alternative linear weighting. Dynamics: Dynamics are realized using the synapses' spike response h(t). This principle brings together the first two principles and adds the time dimension. For example, for h(t) = τ −1 e −t/τ ,ẋ = Ax(t) + By(t) is realized as the equivalent, neurally plausible dynamical system: x(t) = h(t) * (A x(t) + B y(t)), where convolution replaces integration, A = τ A + I, and B = τ B.
The first principle is realized by assigning neuron i a randomly chosen preferred direction in the stimulus space,φ i :
Here G is the neurons' nonlinear current-to-spike-rate function. The dot product converts the multi-dimensional stimulus, x(t), to a one-dimensional soma current, J i . α i is a gain or conversion factor and J bias i is a bias current. These two parameters are chosen to uniformly distribute firing thresholds and maximum firing rates within specified ranges (Figure 1, left ) . For a one-dimensional (1D) stimulus space,φ i = ±1 . In contrast, the linear decoding weights, φ x i , are obtained by minimizing the mean square error. This error may be computed relative to the original stimulus x(t) or some nonlinear function thereof, f (x(t)),
Representation
Transformation Dynamics Fig. 1 . NEF's three principles. Representation: Tuning curves map stimuli (x(t)) to spike rates (ai(x(t))). Transformation: Populations ai(t) and bj(t) connected by weights wij transform x(t)'s representation into y(t)'s. Dynamics: Synapses' spike response, h(t), implement dynamics using neurally plausible matrices A and B .
The second and third principles are realized by using the matrix A that describes x(t)'s transformation or dynamics to specify the weight w ij connecting neuron i to neuron j [2] :
where φ
is neuron i's decoding vector andφ j is neuron j's encoding vector ( Figure 1 , middle, right ). Neuron j will fire as if it received an input of y(t) = Af (x(t)); hence decoding its firing rate will yield the desired result. This recipe enables a computation specified in a low-dimensional space (A) to be solved in a high-dimensional space (w ij ) using lower precision elements.
Hardware Elements
To support NEF in hardware, we need spiking silicon neurons, exponentially decaying synapses and programmable interconnection weights. Neurogrid implements these elements as follows.
Silicon neurons are implemented with quadratic integrate-and-fire dynamics:
where τ m is the membrane time-constant, v is the membrane potential (normalized by the threshold voltage), g syn is the total synaptic conductance (normalized by the leak conductance), and e rev is the reversal potential (also normalized by the threshold voltage); v is reset to 0 when it exceeds 10. Integration yields the inter-spike intervals and inversion yields the conductance-to-spike-rate function:
where t ref is the refractory period [6] . This nonlinear function is appropriate as NEF does not favor any particular neuron model. Silicon synapses are implemented by low-pass filtering unit-amplitude digital pulses, p rise (t), of width t rise [7] :
where τ syn is the exponential decay constant, g sat is the saturation conductance value and t i are the spike times. p rise (t) is triggered for every presynaptic spike that the neuron receives. Longer t rise will result in longer exponential rise and thus a higher peak. Minimizing t rise avoids saturation and matches h(t) = τ −1 e −t/τ more closely. These synapses' programmable reversal potential (e rev ) supports positive and negative weights. An input spike causes the synapse to drive the membrane potential to e rev . If e rev is greater than the membrane's resting potential, the membrane depolarizes, realizing a positive weight (excitatory synapse). If e rev is lower than the membrane's resting potential, the membrane hyperpolarizes realizing a negative weight (inhibitory synapse). These two effects are not necessarily balanced because their driving forces change in opposite ways with the membrane potential. This imbalance is minimized by setting e rev much higher or much lower than v, such that their difference is much larger than v's changes.
Programmable interconnection weights are implemented probabilistically. In NEF, w ij specifies the amplitude of the synaptic current (i.e., g sat ) that is fed as input to postsynaptic neuron j when presynaptic neuron i spikes. In our probabilistic implementation, the input strength does not vary. Instead, w ij specifies the fraction of presynaptic neuron i's spikes that reach postsynaptic neuron j [8] . This approach eliminates the need for a digital-to-analog converter (assuming that the weights are stored digitally) and an analog multiplier. It also reduces bandwidth requirements, since the probabilities are usually very low (small weights).
The ability to utilize the fabrication process' imperfections (i.e., transistor mismatch) is one of NEF's attractions. The silicon neuron's properties (threshold voltage and leakage conductance) are determined by its transistors' physical properties (width, length and doping). Variations in these properties cause normalization factors to vary from one neuron to another, resulting in a distribution of bias and gain parameter values [5] . As a result, identically designed neurons have different conductance-to-spike-rate functions, and hence we do not need to set different bias currents and gains for each neuron. However, mismatch in the silicon synapse's properties (g sat , e rev , t rise τ syn ) is detrimental because it imbalances excitation and inhibition, as the same circuit is used for all the synapses (of a given type) a particular neuron receives. We mitigated this by modifying the algorithm that finds the decoding weights (see Section 5).
System Description
We use Neurogrid, a neumorphic system with one million spiking neurons, for analog computation and a daughterboard, with an FPGA and a bank of SRAMs, Fig. 2 . Neurogrid. Spiking neural networks with up to 16 layers, each with up to 256×256 neurons, are modeled with sixteen 12×14 sq-mm Neurocores (left) assembled on a 6.5×7.5 sq-in circuit board (center, main board). Neurocores, connected in a binary tree with 91M spikes/s links (peak rate), relay their spikes to a routing table (center, daughterboard) that supports programmable weighted connections. Spike activity is visualized in real time (right). The entire 1M-neuron system consumes 3.1W.
for digital communication (Figure 2 ). Layers are mapped onto Neurogrid's 180 nm CMOS chips (Neurocores) and connected by relaying packets.
The daughterboard implements our probabilistic connection scheme. The SRAMs (Cypress Semiconductor CY7C1071DV33) store 32 MBytes with a 50 ns access time. The FPGA (Xilinx Spartan-3E, 100MHz clock) parses incoming packets to identify the source neuron's chip, row, and column address. This information is used to calculate a base address from which to begin reading SRAM entries. Each 32-bit entry specifies four weights (6-bit value, a sign bit and a control bit). A Bernoulli trial is performed by comparing a 7-bit random number with the 6-bit value, padded with one zero to yield a maximum probability of 0.5. If the random number is less than the weight, a packet is output that specifies the route to the corresponding target neuron's Neurocore as well as its row and column address. The synapse type is also included-excitatory or inhibitory-determined by the weight's sign. If the random number is more than the weight, the connection is ignored and the next entry is read.
With all-to-all connectivity, the FPGA can support N L = 2175 neurons per layer firing at an peak mean rate of f avg = 16.9 spikes/s, a total of N 2 L f avg = 80M connections per second. Four weights are delivered every 50 ns, and thus each connection is processed in t wgt = 12.5 ns. Hence, N L cannot exceed 1/ f avg t wgt = 2175. Since the average weight is less than 1/16 in practice, no more than 5M packets per second are sent to Neurogrid. Thus, the 200 ns it takes to output a packet-four clock cycles per word-does not limit performance.
Larger networks can be supported by exploiting Neurogrid's local arbor mechanism, which efficiently distributes synaptic input by relaying analog signals between neighboring neurons [9] . A neuron r neurons away from the arbor's center receives a synaptic input that decays as λ r / √ r (λ is programmable for each synapse type). This method enables us to increase connections by a factor of d 2 , where d is the arbor's diameter, defined as 2/ ln λ-twice the distance at which the current decays by a factor of e. It also produces distributions of bias currents 
Implementation Results
We demonstrated NEF's three principles in Neurogrid by communicating a scalar stimulus, quadratically transforming its value, and computing its time integral. We deliver the scalar stimulus (−1 ≤ x ≤ +1) in the form of Poisson spike trains (f ± ) to the excitatory and inhibitory synapses.
To represent the scalar x, encoding vectors,φ i = ±1, are randomly assigned to the N L /d 2 arbor centers where f ± = f max max (±x, 0) are applied. For x = −1, neurons withφ i = +1 receive 0 and f max spikes/s at their excitatory and inhibitory synapses, respectively. For x = +1, the synapses receive f max and 0 spikes/s, respectively. For x = 0, both synapses receive 0 spikes/s. The reverse is true forφ i = −1.
We compute decoders to recreate f ± using convex optimization (CVX package in MATLAB), constraining the decoding weights to be non-negative (the next paragraph explains why) and to not exceed the FPGA's maximum probability. The N L decoding weights,
minimize
where M is a m × N L matrix of measured spike rates, with N L columns corresponding to the neurons and m rows corresponding to the applied spike input rates, which are specified by f ± , a m × 1 vector. We build decoders to obtain f x 2 ± = f max max (±x 2 , 0) using the same approach (Figure 3 , left ). To communicate the scalar (y = x) or transform it quadratically (y = x 2 ), we use the previously computed decoding weights to program the weight that connects source layer neuron i to target layer arbor center j. Specifically, we program the excitatory and inhibitory weights to w
This formulation delivers an estimate of f ± to the target layer, except that it will use the opposite synapse type if φ f± i < 0. For instance, if f ± both have 10 negatively weighted spikes per second, then instead of receiving 40 (= 50 − 10) and 10 (= 20 − 10) spikes/s, the excitatory and inhibitory synapses would receive 60 and 30 spikes/s. If mismatch (see Section 3) makes the excitatory synapse stronger, these additional spikes will have a net excitatory effect. This bias would make the target neuron spike at a higher rate, which would lead to a decoding error. Non-negative decoding weights avoid this error and yields accurate communication and transformation (Figure 3, right ) .
To integrate the scalar (ẋ = u), we use the decoding weights for f ± = f max max (±x, 0) to program the weight w ij that connects neuron i to arbor center j within the same layer (A = I) and apply input spike rates scaled by τ syn (B = τ syn I). At 2 s, the integrator's time constant is over an order of magnitude greater than τ syn = 0.1 s (Figure 4) . Over multiple trials, the mean and variance of the integrator's value change with a time constant of 2.0 s (drift) and a rate of 4.9 × 10 −4 /s (diffusion), respectively. These values can be reduced by increasing f max or τ syn , as each neuron estimates f ± using the τ syn f ± spikes it received in the last τ syn seconds. The signal-to-noise ratio should improve as τ syn f max (assuming Poisson statistics). However, the hardware's minimum t rise (0.2 ms) limited f max to 4000 spikes/s, compared to 80 × 4096 = 320K for NEF.
Conclusion
We have demonstrated NEF's three principles on neuromorphic hardware, successfully performing arbitrary mathematical computations using silicon neurons. Unlike NEF's deterministic weights, we used probabilistic weights and still achieved accurate results. Our results readily extend to multiple dimensions, thus supporting any control-theoretic algorithm. One example is a Kalman filter-based decoder for brain-machine interfaces [10] , and we can now envision a low-power, fully implantable neuromorphic chip to implement it.
