Perceptrons from Memristors by Silva, Francisco et al.
Perceptrons from Memristors
Francisco Silva,1, ∗ Mikel Sanz,2, † João Seixas,3, 4, 5, ‡ Enrique Solano,2, 6, 7, § and Yasser Omar1, 3, ¶
1Instituto de Telecomunicações, Physics of Information and Quantum Technologies Group, Portugal
2Department of Physical Chemistry, University of the Basque Country UPV/EHU, Apartado 644, E-48080 Bilbao, Spain
3Instituto Superior Técnico, Universidade de Lisboa, Portugal
4CeFEMA, Instituto Superior Técnico, Universidade de Lisboa, Portugal
5Laboratório de Instrumentação e Física Experimental de Partículas (LIP), Lisbon, Portugal
6IKERBASQUE, Basque Foundation for Science, Maria Diaz de Haro 3, 48013 Bilbao, Spain
7Department of Physics, Shanghai University, 200444 Shanghai, China
(Dated: December 27, 2018)
Memristors, resistors with memory whose outputs depend on the history of their inputs, have been
used with success in neuromorphic architectures, particularly as synapses and non-volatile memo-
ries. However, to the best of our knowledge, no model for a network in which both the synapses and
the neurons are implemented using memristors has been proposed so far. In the present work we
introduce models for single and multilayer perceptrons based exclusively on memristors. We adapt
the delta rule to the memristor-based single-layer perceptron and the backpropagation algorithm
to the memristor-based multilayer perceptron. Our results show that both perform as expected
for perceptrons, including satisfying the Minsky-Papert theorem. As a consequence of the Universal
Approximation Theorem, they also show that memristors are universal function approximators.
By using memristors for both the neurons and the synapses, our models pave the way for novel
memristor-based neural network architectures and algorithms. A neural network based on mem-
ristors could show advantages in terms of energy conservation and open up possibilities for other
learning systems to be adapted to a memristor-based paradigm, both in the classical and quantum
learning realms.
I. INTRODUCTION
The perceptron, introduced by Rosenblatt in 1958 [1],
was one of the first models for supervised learning. In a
perceptron, the inputs $x_1, \dots, x_n$ are linearly combined with
coefficients given by the weights $w_1, \dots, w_n$, as well as with a
bias $b$, to form the input $v$ to the neuron (see Fig. 1). $v$ is
then fed into a non-linear function whose output is either
0 or 1. The goal of the perceptron is thus to find a set
of weights $\{w_i\}$ that correctly assigns inputs $\{x_i\}$ to one
of two predetermined binary classes. The correct weights
for this task are found by an iterative training process,
for instance the delta rule [2]. However, the perceptron
is only capable of learning linearly separable patterns, as
was shown in 1969 by Minsky and Papert [3]. These limitations
triggered a search for more capable models, which
eventually resulted in the proposal of the multilayer
perceptron. These objects can be seen as several layers of
perceptrons connected to each other by synapses (see Fig.
2). This structure ensures that the multilayer perceptron
does not suffer from the same limitations as Rosenblatt’s
perceptron. In fact, the Universal Approximation Theorem
[4] states that a multilayer perceptron with at least
one hidden layer of neurons and with suitably chosen
activation functions can approximate any continuous
function to an arbitrary accuracy.
Fig. 1. In a single-layer perceptron (SLP) the inputs $x_i$ are
multiplied by their respective weights $w_i$ and added, together
with a bias $b$, to form the net input to the SLP, $v$. The output
$y$ of the SLP is given by some activation function, $\varphi(v)$.
∗ francisco.horta.ferreira.da.silva@tecnico.ulisboa.pt
† mikel.sanz@ehu.eus
‡ joao.seixas@tecnico.ulisboa.pt
§ enr.solano@gmail.com
¶ yasser.omar@lx.it.pt
There are various methods to train a neural network
such as the multilayer perceptron. One of the most
widespread is the backpropagation algorithm, a gener-
alization of the original delta rule [5].
Fig. 2. In a multilayer perceptron (MLP), single-layer per-
ceptrons (SLP) are arranged in layers and connected to each
other, with the outputs of the SLPs in the output layer being
the outputs of the MLP. Here, each SLP is represented by a
disc.
arXiv:1807.04912v2 [cs.ET] 26 Dec 2018
Artificial neural networks such as the multilayer perceptron
have proven extremely useful in solving a wide
variety of problems [6–8], but they have thus far mostly
been implemented in digital computers. This means that
we are not profiting from some of the advantages that
these networks could have over traditional computing
paradigms, such as very low energy consumption and
massive parallelization [9]. Keeping these advantages is,
of course, of utmost interest, and this could be done if a
physical neural network was used instead of a simulation
on a digital computer. In order to construct such a net-
work, a suitable building block must be found, with the
memristor being a good candidate.
Beyond these energy considerations, and because MLPs
are universal function approximators, our
proposal of MLPs based only on memristors implies that
memristive circuits can approximate any smooth function
$f : \mathbb{R}^n \to \mathbb{R}^m$ to arbitrary accuracy.
The memristor was first introduced in 1971 as a two-
terminal device that behaves as a resistor with mem-
ory [10]. The three known elementary circuit elements,
namely the resistor, the capacitor and the inductor, can
be defined by the relation they establish between two
of the four fundamental circuit variables: the current
i, the voltage u, the charge q and the flux-linkage φ.
There are six possible combinations of these four vari-
ables, five of which lead to widely-known relations: three
from the circuit elements mentioned above, and two given
by $q(t) = \int_{-\infty}^{t} i(\tau)\,d\tau$ and $\varphi(t) = \int_{-\infty}^{t} u(\tau)\,d\tau$. This
means that only the relation between φ and q remains
to be defined: the memristor provides this missing rela-
tion. Despite having been predicted in 1971 using this
argument, it was not until 2008 that the existence of
memristors was demonstrated at HP Labs [11], which
led to a new boom in memristor-related research [12]. In
particular, there have been proposals of how memristors
could be used in Hebbian learning systems [13–15], in the
simulation of fluid-like integro-differential equations [16],
in the construction of digital quantum computers [17]
and of how they could be used to implement non-volatile
memories [18].
The pinched current-voltage hysteresis loop inherent
to memristors endows them with intrinsic memory capa-
bilities, leading to the belief that they might be used as a
building block in neural computing architectures [19–21].
Furthermore, the relatively small dimension of memris-
tors, the fact that they can be laid out in a very dense
manner and their non-volatile nature may lead to highly
parallel, energy efficient neuromorphic hardware [22–25].
The possibility of using memristors as synapses in neu-
ral networks has been extensively studied. The wealth
of proposals in this field can be broadly split into two
groups: one related to spike-timing-dependent plastic-
ity (STDP) and spiking neural networks (SNN) [26–30],
and the other to more traditional neural network mod-
els [31–43]. The first group has a more biological focus,
with its main goal being the reproduction of effects oc-
curring in natural neural networks, rather than algorith-
mic improvements. In fact, the convergence of STDP-
based learning is not guaranteed for general inputs [31].
The second group is more oriented towards neuromorphic
computing and is composed of two major architectures,
one based on memristor crossbars and another on mem-
ristor arrays.
Despite all these results, and to the best of our knowledge,
all existing proposals use memristors exclusively as
synapses, with the networks’ neurons being implemented
by some other device. The main goal of this paper is
thus to introduce a memristor-based perceptron, i.e., a
single-layer perceptron (SLP) in which both synapses and
neurons are built from memristors. It will be generalized
to a memristor-based multilayer perceptron (MLP) and
we will also introduce learning rules for both perceptrons,
based on the delta rule for the SLP, and on the backprop-
agation algorithm for the MLP.
Recently the universality of memristors has been stud-
ied for Boolean functions [44] and as a memcomput-
ing equivalent of a Universal Turing Machine (Universal
Memcomputing Machine [19]). However, to the best of
our knowledge, it has not yet been shown that the mem-
ristor is a universal function approximator. This result
will come as a consequence of the introduction of the
above-mentioned memristor-based MLP.
II. THE MEMRISTOR AS A DYNAMICAL
SYSTEM
In general, a current-controlled memristor is a dynam-
ical system whose evolution is described by the following
pair of equations [10]
\[
\begin{cases}
V = R(\vec{\gamma}, I)\, I, & \text{(1a)} \\
\dot{\vec{\gamma}} = \vec{f}(\vec{\gamma}, I). & \text{(1b)}
\end{cases}
\]
The first one is Ohm’s law and relates the voltage output
of the memristor $V$ with the current input $I$ through the
memristance $R(\vec{\gamma}, I)$, which is a scalar function depending
both on $I$ and on the set of the memristor’s internal
variables $\vec{\gamma}$. This dependence of the memristance on the
internal variables induces the memristor’s output dependence
on past inputs, i.e., this is the mechanism that endows
the memristor with memory. The second equation
describes the time-evolution of the memristor’s internal
variables by relating their time derivative, $\dot{\vec{\gamma}}$, to an
n-dimensional vector function $\vec{f}(\vec{\gamma}, I)$, depending on both
previous values of the internal variables and the input of
the memristor.
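As an illustration of Equations (1a) and (1b), the following minimal sketch integrates a generic current-controlled memristor with a forward Euler step. The function name `simulate_memristor`, the Euler scheme, and the example choices R = 1 + γ and f = I are assumptions made only for illustration; they are not taken from the text.

```python
def simulate_memristor(resistance, f, gamma0, current, dt, steps):
    """Forward-Euler integration of V = R(gamma, I) * I, d(gamma)/dt = f(gamma, I).

    resistance(gamma, i) -> float   memristance R (hypothetical callable)
    f(gamma, i) -> list of floats   rates of the internal variables
    current(t) -> float             input current as a function of time
    """
    gamma = list(gamma0)
    voltages = []
    for n in range(steps):
        i_now = current(n * dt)
        # Equation (1a): the output depends on the present internal state.
        voltages.append(resistance(gamma, i_now) * i_now)
        # Equation (1b): the internal state evolves with the input.
        rates = f(gamma, i_now)
        gamma = [g + dt * r for g, r in zip(gamma, rates)]
    return voltages, gamma


# Illustrative device (assumption): one internal variable that accumulates
# the input current, and a memristance that grows linearly with it.
voltages, gamma = simulate_memristor(
    resistance=lambda g, i: 1.0 + g[0],
    f=lambda g, i: [i],
    gamma0=[0.0],
    current=lambda t: 1.0,
    dt=0.01,
    steps=100,
)
```

Under a constant input the output voltage drifts with time even though the current does not change, which is the memory effect described above.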
A. Memristor-based Single-Layer Perceptron
Algorithm 1 Delta rule for Single-layer Perceptron
Initialization
    Set the bias current Ib to 0.
    Initialize the weights w1, w2, wb.
    Set the internal state variables γ1, γ2, γ3 to w1, w2 and wb, respectively.
for d in data do
    Forward Pass
        Compute the net input to the perceptron:
            I = w1x1 + w2x2. (2)
        Compute the perceptron’s output:
            V = g(I, γ1, γ2, γ3). (3)
    Backward Pass
        Compute the difference ∆ between the target output and the actual output:
            ∆ = T − V. (4)
        Compute the derivative of the activation function with respect to the net input, g′.
        for i in internal variables do
            if ∆ ≥ 0 then
                Set the bias Ib = Iγi.
            else
                Set the bias Ib = −Iγi.
            end if
            Update γi by inputting I = ∆ xi g′ + Ib.
        end for
        Update the weights by setting them to the updated values of the internal state variables.
        Set the bias Ib = 0.
end for
Our goal is to implement a perceptron and an adapta-
tion of the delta rule to train it using only a memristor.
To this end, we use the memristor’s internal variables
to store the SLP’s weights and the learning rate. Equa-
tion (1b) allows us to control the evolution of the mem-
ristor’s internal variables and implement a learning rule.
If, for example, we want to implement a SLP with two
inputs we need a memristor with four internal variables,
two of them to store the weights of the connections be-
tween the inputs and the SLP, a third one to store the
SLP’s bias weight and another for the learning rate.
Let us then consider a memristor with four internal
state variables, from now on labeled by $\vec{\gamma} = (\gamma_1, \gamma_2, \gamma_3, \gamma_4)$
and in which $\vec{f} = (f_1, f_2, f_3, f_4)$. It could
be difficult to externally control multiple internal vari-
ables. However, a possible solution is to use several mem-
ristors with the chosen requirements and with an exter-
nally controlled internal variable each.
In order to understand the form of these functions, we
must remember that we expect different behaviours from
the perceptron depending on the stage of the algorithm.
In the forward propagation stage, the weights must re-
main constant to obtain the output for a given input. In
this phase the internal variables must not change. On
the other hand, in the backpropagation stage, we want
to update the perceptron’s weights by changing the inter-
nal variables. However, it may happen that the update
is different for each of the weights, so we need to be able
to change only one of the internal variables without af-
fecting the others.
There are thus three different possible scenarios in the
backpropagation stage: we want to update γ1, while γ2
and γ3 should not change; we want to update γ2, while
γ1 and γ3 should not change, and we want to update γ3,
while $\gamma_1$ and $\gamma_2$ should not change. To reconcile this
with the fact that a memristor takes only one input, we
propose the use of threshold-based functions, as well as a
bias current $I_b$, for the evolution of the internal variables
\[
V(t) = g(I, \gamma_1, \gamma_2, \gamma_3), \tag{5}
\]
\[
\dot{\gamma}_i = (I - I_b)\left[\theta(I - I_{\gamma_i}) - \theta\big(I - (I_{\gamma_i} + a)\big)\right]
+ (I + I_b)\left[\theta(-I - I_{\gamma_i}) - \theta\big(-I - (I_{\gamma_i} + a)\big)\right], \tag{6}
\]
where g is an activation function, θ is the Heaviside func-
tion, Iγi is the threshold for the internal variable γi and
a is a parameter that determines the dimension of the
threshold, i.e., the range of current values for which the
internal variables are updated. The first term of the up-
date function can only be non-zero if the input current is
positive, whereas the second term can only be non-zero if
the input current is negative, allowing us to both increase
and decrease the values of the internal variables. If Iγ1 ,
Iγ2 and Iγ3 are sufficiently different from each other and
from zero, we can reach the correct behaviour by choos-
ing the memristor’s input appropriately. The thresholds
and the a parameter are thus hyperparameters that must
be calibrated for each problem. In the aforementioned
construction in which our memristor with three internal
variables is constructed as an equivalent memristor, we
can also use an external current or voltage control to keep
the internal variable fixed. In fact, this is how it is usu-
ally addressed experimentally [21, 45–47]. Therefore, we
can assume that this construction is possible. It is im-
portant to note that, in an experimental implementation,
this threshold system does not need to be based on the
input currents’ intensities. It can, for instance, be based
on the use of signals of different frequencies for each of
the internal variables, or on encoding the signals
meant for each of the internal variables in AC voltage
signals.
We are now ready to present a learning algorithm for
our SLP based on the delta rule, which is described in
Algorithm 1. In case one wants to generalize this pro-
cedure to an arbitrary number of inputs n, this can be
trivially achieved by using a memristor with n+1 internal
variables and adapting Algorithm 1 accordingly.
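The routing mechanism of Equation (6) and the loop of Algorithm 1 can be sketched in a few lines. This is only a sketch under stated assumptions, not the authors' simulation code: the activation g is taken to be a logistic sigmoid with the bias weight added to the net input, the bias weight sees a constant input of 1, θ(0) = 1, I_b in Equation (6) is read as the magnitude of the bias current (equal to the threshold of the targeted variable), the learning rate multiplies the rate as an effective time step, and the thresholds 10, 20, 30 and window width a = 1 are arbitrary illustrative values.

```python
import math


def heaviside(x):
    # Assumption: theta(0) = 1; the text does not fix this convention.
    return 1.0 if x >= 0 else 0.0


def gamma_rate(current, bias, threshold, width):
    """Right-hand side of Equation (6) for one internal variable.

    bias is the magnitude of the bias current I_b (interpretive assumption).
    """
    pos = heaviside(current - threshold) - heaviside(current - (threshold + width))
    neg = heaviside(-current - threshold) - heaviside(-current - (threshold + width))
    return (current - bias) * pos + (current + bias) * neg


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def train_slp(samples, thresholds=(10.0, 20.0, 30.0), width=1.0, eta=0.1, epochs=2000):
    """Delta-rule training in the spirit of Algorithm 1 (hypothetical helper)."""
    gammas = [0.0, 0.0, 0.0]  # gamma_1, gamma_2 hold w_1, w_2; gamma_3 holds w_b
    for _ in range(epochs):
        for (x1, x2), target in samples:
            inputs = (x1, x2, 1.0)  # constant 1 drives the bias weight (assumption)
            out = sigmoid(sum(g * x for g, x in zip(gammas, inputs)))
            delta = target - out
            g_prime = out * (1.0 - out)
            for i in range(3):
                sign = 1.0 if delta >= 0 else -1.0
                bias = thresholds[i]
                # Single input current: the update plus the signed bias.
                current = delta * inputs[i] * g_prime + sign * bias
                # Every internal variable sees the same current; only the one
                # whose threshold window contains it has a non-zero rate.
                rates = [gamma_rate(current, bias, thresholds[j], width) for j in range(3)]
                gammas = [g + eta * r for g, r in zip(gammas, rates)]
    return gammas


def classify(gammas, x):
    return 1 if sigmoid(gammas[0] * x[0] + gammas[1] * x[1] + gammas[2]) >= 0.5 else 0


OR_GATE = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]
or_gammas = train_slp(OR_GATE)
```

The windows only separate the updates if each update current is smaller in magnitude than the width a and the thresholds are spaced by more than a, which is why these quantities act as hyperparameters.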
B. Memristor-based Multilayer Perceptron
Algorithm 2 Backpropagation for Multilayer Perceptron
Initialization
    Set the bias current Ib to 0.
    Initialize the weights {wij} and {wbk}.
    Set the internal variable γij of each connection memristor ij to the respective connection weight wij.
    Set the internal variable γk of each node memristor k to the respective bias weight wbk.
for d in data do
    Forward Pass
        for l in layers do
            Compute the output of each connection memristor ij in layer l:
                \[ V_{ij}(w_{ij}, I) = w_{ij} I. \tag{7} \]
            Sum the outputs of the connection memristors connected to each node memristor k in layer l:
                \[ in_k = \sum_i I_{ik} \tag{8} \]
            Compute the node memristor’s output:
                \[ V_k = R_{OFF}\left(1 - \frac{\gamma_{b_k}}{D} + \frac{R_{ON}}{R_{OFF}}\frac{\gamma_{b_k}}{D}\right) in_k. \]
        end for
    Backward Pass
        for k in output layer do
            Compute the difference ∆ between the target output and the actual output of the node memristor:
                \[ \Delta_k = T_k - V_k. \tag{9} \]
            Compute the local gradient of the node memristor using Equation (16).
        end for
        for layer in hidden layers do
            for node in layer do
                Compute the local gradient of node memristor l in layer using Equation (17).
            end for
        end for
        for connection in connections do
            Compute the weight update.
            Set the bias current: Ib = Iγij.
            Update the connection memristor’s internal variable by inputting I = ∆wij + Ib to it.
            Update the connection’s weight by setting it to the updated value of the respective internal variable.
        end for
        for node in nodes do
            Compute the bias weight update according to Equation (18).
            Set the bias current: Ib = Iγb.
            Update the node memristor’s internal variable by inputting I = ∆wk + Ib.
            Update the bias weight by setting it to the updated value of the respective internal variable.
        end for
end for
In this model, memristors are used to emulate both
the connections and the nodes of a MLP. In principle,
the nodes could be emulated by non-linear resistors, but
using memristors allows us to take advantage of their
internal variable to implement a bias weight, which in
some cases proves fundamental for a successful network
training.
The equations describing the evolution of the memris-
tor at each node in this model are the same as in the
seminal HP Labs paper [11]. We have chosen the exper-
imentally tested set
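\[
V(t) = \left( R_{ON}\frac{\gamma(t)}{D} + R_{OFF}\left(1 - \frac{\gamma(t)}{D}\right) \right) I(t), \tag{10}
\]
\[
\dot{\gamma} =
\begin{cases}
\mu_V \frac{R_{ON}}{D} I(t) - I_\gamma & \text{if } \mu_V \frac{R_{ON}}{D} I(t) > I_\gamma, \\
0 & \text{otherwise.}
\end{cases}
\tag{11}
\]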
Here, $R_{ON}$ and $R_{OFF}$ are, respectively, the doped and
undoped resistances of the memristor, $D$ and $\mu_V$ are physical
memristor parameters, namely the thickness of its
semiconductor film and its average ion mobility, and $I_\gamma$
is a threshold current playing the same role as the thresholds
$I_{\gamma_i}$ in the model for the memristor-based SLP introduced
above. Equation (10) can be approximated by
\[
V(t) = R_{OFF}\left(1 - \frac{\gamma(t)}{D}\right) I(t), \tag{12}
\]
since we have that $R_{ON}/R_{OFF} \approx 1/100$. If, for instance, we
impose a constant current input $I$ to the memristor for a
time $t$, the output is given by
\[
V(t) \propto -I^2 t. \tag{13}
\]
It is then possible to implement non-linear activation
functions starting from Equation (10), which is an im-
portant condition for the universality of neural networks
[48].
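To make the origin of the quadratic term concrete, the sketch below integrates Equation (11) for a constant input current and evaluates both Equation (10) and its approximation, Equation (12). All parameter values are dimensionless placeholders chosen so that R_ON/R_OFF = 1/100; the threshold is set to zero, the function name is hypothetical, and clipping γ to [0, D] is an added assumption reflecting the stated validity range of Equation (11).

```python
def node_memristor_response(current, duration, dt=1e-3, r_on=1.0, r_off=100.0,
                            d=1.0, mu=1.0, i_threshold=0.0, gamma0=0.0):
    """Integrate Equation (11) under a constant current, then evaluate
    Equation (10) (v_full) and Equation (12) (v_approx)."""
    k = mu * r_on / d
    gamma = gamma0
    for _ in range(int(round(duration / dt))):
        drive = k * current
        rate = drive - i_threshold if drive > i_threshold else 0.0
        # Keep gamma inside [0, D] (assumption: hard clipping at the edges).
        gamma = min(max(gamma + dt * rate, 0.0), d)
    v_full = (r_on * gamma / d + r_off * (1.0 - gamma / d)) * current
    v_approx = r_off * (1.0 - gamma / d) * current
    return gamma, v_full, v_approx


gamma_a, v_full_a, v_approx_a = node_memristor_response(0.1, 1.0)
gamma_b, v_full_b, v_approx_b = node_memristor_response(0.2, 1.0)

# Deviation from the purely ohmic response R_OFF * I.
drop_a = 100.0 * 0.1 - v_approx_a
drop_b = 100.0 * 0.2 - v_approx_b
```

With zero threshold, γ grows linearly in time as (μ_V R_ON / D) I t, so the output of Equation (12) is R_OFF I minus a term proportional to I² t. Doubling the current therefore quadruples the deviation from the ohmic response, which is the non-linearity exploited for the activation function.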
Looking now at synaptic memristors, their evolution is
described by
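\[
V(t) = \gamma(t) I(t), \tag{14}
\]
\[
\dot{\gamma} = \left( \mu_V \frac{R_{ON}}{D} I(t) - I_\gamma \right) \theta\!\left( \mu_V \frac{R_{ON}}{D} I(t) - I_\gamma \right). \tag{15}
\]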
In synaptic memristors, the internal variable γ is used
to store the weight of the respective connection, whereas
in node memristors the internal variable is used to store
the node’s bias weight.
As explained before, the node memristors are chosen
to operate in a non-linear regime, which allows us to im-
plement non-linear activation functions. On the other
hand, we choose a linear regime for synaptic memristors,
which allows us to emulate the multiplication of weights
by signals.
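The read and write regimes of a synaptic memristor can be illustrated with a single update step of Equations (14) and (15). In this sketch the factor μ_V R_ON / D is lumped into one constant `k`, and the values of `k`, the threshold and the time step are placeholders rather than device parameters.

```python
def synapse_step(gamma, current, dt, k=1.0, i_threshold=1.0):
    """One Euler step of a synaptic memristor (hypothetical helper).

    Returns the output voltage and the updated internal variable.
    """
    voltage = gamma * current              # Equation (14): weight times signal
    drive = k * current - i_threshold
    rate = drive if drive > 0 else 0.0     # Equation (15): thresholded update
    return voltage, gamma + dt * rate


# Below threshold: the stored weight is read out and left untouched.
v_read, gamma_after_read = synapse_step(0.5, 0.2, dt=1.0)

# Above threshold: the stored weight is changed by the excess drive.
v_write, gamma_after_write = synapse_step(0.5, 1.25, dt=1.0)
```

Signals below the threshold are simply multiplied by the stored weight, as needed in the forward pass, while larger currents modify it. As written, Equation (15) only yields non-negative rates.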
It must be mentioned that Equation (11) is only valid
for γ ∈ [0, D]. If we were to store the network weights in
the internal variables using only a rescaling constant A,
i.e., w = Aγ, then the weights would all have the same
sign. Although convergence of the standard backpropagation
algorithm is still possible in this case [49], it is
usually slower and more difficult, so it is convenient to
redefine the variable [11] D → D′ so that the interval of the
internal variable in which Equation (11) is valid becomes
[−D′/2, D′/2]. Using a rescaling constant B, the network
weights can then be in the interval [−BD′/2, BD′/2].
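One possible reading of this redefinition is a shift of the internal variable by half the film thickness followed by the rescaling. The sketch below assumes D′ = D and the shift γ′ = γ − D/2; both choices, as well as the function names, are illustrative assumptions, since the text does not spell out the mapping.

```python
def gamma_to_weight(gamma, d, b):
    """Map an internal variable in [0, d] to a signed weight in [-b*d/2, b*d/2]."""
    if not 0.0 <= gamma <= d:
        raise ValueError("gamma must lie in [0, d]")
    return b * (gamma - d / 2.0)


def weight_to_gamma(weight, d, b):
    """Inverse map; weights outside the representable range are saturated."""
    gamma = weight / b + d / 2.0
    return min(max(gamma, 0.0), d)


lowest = gamma_to_weight(0.0, d=1.0, b=4.0)
highest = gamma_to_weight(1.0, d=1.0, b=4.0)
```

With D = 1 and B = 4 the representable weights span [−2, 2], so both excitatory and inhibitory connections are available.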
The new learning algorithm is an adaptation of the
backpropagation algorithm, chosen due to its widespread
use and robustness. In our case, the activation function
of the neurons is the function that relates the output of a
node memristor with its input, as seen in Equation (10).
The local gradients of the output layer and hidden layer
neurons are respectively given by:
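\[
\text{Output:} \quad \delta_k = T_k\, \varphi'\!\left(\sum_i V_{ik}\right), \tag{16}
\]
\[
\text{Hidden:} \quad \delta_k = \varphi'\!\left(\sum_i V_{ik}\right) \sum_l \delta_l w_{kl}. \tag{17}
\]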
In Equation (16), $T_k$ denotes the target output for neuron
$k$ in the output layer. In Equations (16) and (17), $\varphi'$
is the derivative of the neuron’s activation function with
respect to the input to the neuron, $\sum_i V_{ik}$. Finally, in
Equation (17), the sum $\sum_l \delta_l w_{kl}$ is taken over the gradients
of all neurons $l$ in the layer to the right of the neuron
that are connected to it by weights $w_{kl}$. The update to
the bias weight of a node memristor is given by:
\[
\Delta w_k = \eta \delta_k, \tag{18}
\]
where $\eta$ is the learning rate. The connection weight $w_{ij}$
is updated using $\Delta w_{ij} = \eta \delta_j V_i$, where $\delta_j$ is the local
gradient of the neuron to the right of the connection,
and $V_i$ is the output of the neuron to the left of the
connection.
We now have all the elements needed to adapt
the backpropagation algorithm to our memristor-based
MLP, as described in Algorithm 2.
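The backward-pass arithmetic of Equations (17) and (18), together with the connection update Δw_ij = η δ_j V_i, can be sketched as follows. The output-layer local gradients are taken as given, the function names are hypothetical, and the numbers in the example are arbitrary values for a network with two inputs, two hidden neurons and one output.

```python
def hidden_local_gradients(phi_prime, delta_next, weights_to_next):
    """Equation (17): delta_k = phi'(net_k) * sum_l delta_l * w_kl.

    weights_to_next[k][l] is the weight from hidden neuron k to neuron l
    in the layer to its right.
    """
    return [
        phi_prime[k] * sum(delta_next[l] * weights_to_next[k][l]
                           for l in range(len(delta_next)))
        for k in range(len(phi_prime))
    ]


def connection_updates(eta, delta_right, outputs_left):
    """Delta w_ij = eta * delta_j * V_i for every connection i -> j."""
    return [[eta * dj * vi for dj in delta_right] for vi in outputs_left]


def bias_updates(eta, deltas):
    """Equation (18): Delta w_k = eta * delta_k."""
    return [eta * dk for dk in deltas]


# Arbitrary example values.
delta_out = [0.2]                       # local gradient of the output neuron
hidden_deltas = hidden_local_gradients(
    phi_prime=[0.5, 0.25],
    delta_next=delta_out,
    weights_to_next=[[1.0], [-2.0]],
)
input_to_hidden = connection_updates(0.1, hidden_deltas, outputs_left=[1.0, 0.0])
hidden_bias = bias_updates(0.1, hidden_deltas)
```

In Algorithm 2 each of these updates would then be written into the corresponding memristor by adding the appropriate bias current, rather than applied to a stored array.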
III. SIMULATION RESULTS
In order to test the validity of our SLP and MLP,
we tested their performance on three logical gates: OR,
AND and XOR. The first two are simple problems which
should be successfully learnt by SLP and MLP, whereas
only the MLP should be able to learn the XOR gate, due
to Minsky-Papert’s theorem.
The Glorot weight initialization scheme [50] was used
for all simulations, as it has been shown to bring faster
convergence in some problems when compared to other
initialization schemes. In this scheme the weights are
initialized according to $U(-1, 1)$, scaled by $\sqrt{6/(n_{in}+n_{out})}$,
where $n_{in}$ and $n_{out}$ are the number of neurons in the
previous and following layers, respectively. The data sets
used contain 100 randomly generated labeled elements,
which were shuffled for each epoch, and the cost function
is:
\[
E = \frac{1}{2}(T - O)^2, \tag{19}
\]
where T is the target output and O the actual output.
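A minimal sketch of the initialization and cost function described above, using only the Python standard library. The function names and the fixed random seed are illustrative choices, not part of the original simulations.

```python
import math
import random


def glorot_uniform(n_in, n_out, rng):
    """Draw an n_in x n_out weight matrix from U(-1, 1) scaled by sqrt(6 / (n_in + n_out))."""
    limit = math.sqrt(6.0 / (n_in + n_out))
    return [[rng.uniform(-1.0, 1.0) * limit for _ in range(n_out)]
            for _ in range(n_in)]


def sample_error(target, output):
    """Equation (19): squared error for a single training element."""
    return 0.5 * (target - output) ** 2


rng = random.Random(0)                 # fixed seed only for reproducibility
weights = glorot_uniform(2, 2, rng)    # e.g. two inputs feeding two hidden neurons
```

For a layer with two neurons on each side, every weight is drawn from an interval of half-width sqrt(6/4), roughly 1.22.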
A. Single-Layer Perceptron Simulation Results
For the SLP, a learning rate of 0.1 was used for all
tested gates, a value set by trial and error. The metric
we used to evaluate the evolution of the network’s per-
formance on a given problem was its total error over an
epoch, which is given by Equation (20).
\[
E_{total} = \sum_j E_j = \frac{1}{2} \sum_j (T_j - O_j)^2, \tag{20}
\]
where the sum is taken over all elements in the training
set. In Fig. 3, the evolution of the total error over 1000
epochs, averaged over 100 different realizations of the
starting weights, is plotted.
Fig. 3. Evolution of the learning progress of our single-layer
perceptron (SLP), quantified by its total error, given by Equa-
tion (20), for the OR, AND and XOR gates over 1000 epochs.
The total error of our SLP for the OR and AND gates goes
to 0 very quickly, indicating that our SLP successfully learns
these gates. The same is not true for the XOR gate, which
our SLP is incapable of learning, in accordance with
Minsky-Papert’s theorem [3].
We observe that our SLP successfully learns the gates
OR and AND, with the total error falling to 0 within 200
epochs, as expected from a SLP. However, the total error
of our SLP for the XOR gate does not go to zero, which
means that it is not able to learn this gate, in accordance
with Minsky-Papert’s theorem.
B. Multilayer Perceptron Simulation Results
The structure of the network was chosen following [51].
There, a network with one hidden layer of two neurons is
recommended for the case of two inputs and one output.
As noted in [51], networks with only one hidden layer
are capable of approximating any function, although
in some problems, adding extra hidden layers improves
the performance. However, the results obtained by
employing only one hidden layer are satisfactory, thus
there is no need for a more complex network structure.
There is also the matter of how many neurons must
be employed in the hidden layer. In this case, there is
a trade-off between speed of training and accuracy. A
network with more neurons in the hidden layer has
more free parameters, so it will be able to output
a more accurate fit, but at the cost of a longer time
required to train the network. A rule of thumb for
choosing the number of neurons in the hidden layer is
to start with an amount that is between the number of
inputs and the number of outputs and adjust according
to the results obtained. This leads to two neurons
for the hidden layer and, similarly to what happened
with the number of hidden layers, the results obtained
using two neurons in the hidden layer are sufficiently
accurate, so there was no need to try other structures.
The learning rates used, which we have chosen through
trial and error, are 0.1 for the OR and AND gates, and
0.01 for the XOR gate. In Fig. 4, the evolution of the
total error over 1000 epochs, averaged over 100 different
realizations of the starting weights, is plotted.
Fig. 4. Evolution of the learning progress of our multilayer
perceptron (MLP), quantified by its total error, given by
Equation (20) for the OR, AND and XOR gates over 1000
epochs. As can be seen, the total error of our MLP for
these gates approaches 0, indicating that it successfully learns
all three gates.
As was the case for our SLP, our MLP successfully
learns the OR and AND gates. In fact, it is able to learn
them faster than our SLP, which is a consequence of the
larger number of free parameters. Additionally, it is able
to learn the XOR gate, indicating that it behaves as well
as a regular MLP.
In summary, both memristor-based perceptrons be-
have as expected. Our SLP is able to learn the OR and
AND gates, but not the XOR gate, so it is limited to solv-
ing linearly separable problems, just as any other single-
layer neural network. However, our MLP is not subject
to such a limitation and it is able to learn all three gates.
C. Receiver Operating Characteristic Curves
Fig. 5. ROC curves obtained with the SLP for the OR and
XOR gates, and with the MLP for the XOR gate. The thresholds
used were t = 0.3, 0.5 and 0.7. We can see that the SLP
correctly classifies the inputs for the OR gate every time, but
it does not perform better than random guessing for the XOR
gate, as expected. On the other hand, the MLP correctly clas-
sifies the XOR gate inputs every time.
As another measure of the perceptrons’ performance,
we show in Fig. 5 the receiver operating characteristic
(ROC) curves obtained with perceptrons trained for 500
epochs on data sets of size 100. The curves shown were
obtained using a SLP trained for the OR gate, a SLP
trained for the XOR gate and a MLP trained for the XOR
gate, with thresholds of t = 0.3, 0.5 and 0.7 for each.
Again, we see that the SLP is capable of learning the OR
gate but not XOR, since it correctly classifies the inputs
for OR every time, but its performance is equivalent to
random guessing for XOR. We can also see that the MLP
is capable of learning the XOR gate, since it correctly
classifies its inputs every time. The learning rates used
in training were 0.1 for the SLP on both gates and 0.01
for the MLP on the XOR gate, as explained in the previous
subsection.
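For reference, the points of an ROC curve at the three thresholds can be computed as in the sketch below. The scores are made-up illustrative values, not simulation outputs: one set mimics a perceptron that separates the OR gate, the other a perceptron whose output stays at 0.5 for every XOR input.

```python
def roc_points(scores, labels, thresholds):
    """Return (false positive rate, true positive rate) for each threshold.

    An input is classified as 1 when its score is at least the threshold.
    """
    points = []
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        tpr = tp / (tp + fn) if tp + fn else 0.0
        fpr = fp / (fp + tn) if fp + tn else 0.0
        points.append((fpr, tpr))
    return points


THRESHOLDS = (0.3, 0.5, 0.7)

# Hypothetical outputs for inputs (0,0), (0,1), (1,0), (1,1).
or_points = roc_points([0.05, 0.95, 0.90, 0.92], [0, 1, 1, 1], THRESHOLDS)
xor_points = roc_points([0.5, 0.5, 0.5, 0.5], [0, 1, 1, 0], THRESHOLDS)
```

A classifier that separates the classes sits at the corner (0, 1) for every threshold, whereas an uninformative one lies on the diagonal where the two rates are equal, which corresponds to random guessing.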
IV. CONCLUSION
In this paper, we introduced models for single and mul-
tilayer perceptrons based exclusively on memristors. We
provided learning algorithms for both, based on the delta
rule and on the backpropagation algorithm, respectively.
Using a threshold-based system, our models are able to
use the internal variables of memristors to store and up-
date the perceptron’s weights. We also ran simulations
of both models, which revealed that they behaved as expected,
and in accordance with Minsky-Papert’s theorem.
Our memristor-based perceptrons have the same
capabilities as regular perceptrons, thus showing the
feasibility and power of a neural network based exclusively
on memristors.
To the best of our knowledge, our neural networks are
the first ones in which memristors are used as both the
neurons and the synapses. Due to the Universal Approximation
Theorem for multilayer perceptrons, this
implies that memristors are universal function approximators,
i.e., they can approximate any smooth function
$f : \mathbb{R}^n \to \mathbb{R}^m$ to arbitrary accuracy, which is a novel result
in their characterization as devices for computation.
Our models also pave the way for novel neural network
architectures and algorithms based on memristors. As
previously discussed, such networks could show advan-
tages in terms of energy optimization, allow for higher
synaptic densities and open up possibilities for other
learning systems to be adapted to a memristor-based
paradigm, both in the classical and quantum learning
realms. In particular, it would be interesting to try to ex-
tend these models to the quantum computing paradigm,
using a recently proposed quantum memristor [52], and
its implementation in quantum technologies, such as su-
perconducting circuits [53] or quantum photonics [54].
ACKNOWLEDGMENTS
Work by FS was supported in part by a New Tal-
ents in Quantum Technologies scholarship from the
Calouste Gulbenkian Foundation. FS and YO thank
the support from Fundação para a Ciência e a Tecnologia
(Portugal), namely through programme POCH
and projects UID/EEA/50008/2013 and IT/QuNet, as
well as from the project TheBlinQC supported by the
EU H2020 QuantERA ERA-NET Cofund in Quantum
Technologies and by FCT (QuantERA/0001/2017), from
the JTF project NQuN (ID 60478), and from the EU
H2020 Quantum Flagship projects QIA (820445) and
QMiCS (820505). MS and ES are grateful for the fund-
ing of Spanish MINECO/FEDER FIS2015-69983-P and
Basque Government IT986-16. This material is also
based upon work supported by the U.S. Department of
Energy, Office of Science, Office of Advanced Scientific
Computing Research (ASCR), under field work proposal
number ERKJ335.
[1] F. Rosenblatt, “The perceptron: a probabilistic model
for information storage and organization in the brain.”
Psychological review, vol. 65, no. 6, p. 386, 1958.
[2] B. Widrow and M. E. Hoff, “Adaptive switching cir-
cuits,” Stanford Univ Ca Stanford Electronics Labs,
Tech. Rep., 1960.
[3] M. Minsky, S. A. Papert, and L. Bottou, Perceptrons:
An introduction to computational geometry. MIT press,
2017.
[4] G. Cybenko, “Approximation by superpositions of a sig-
moidal function,” Mathematics of control, signals and
systems, vol. 2, no. 4, pp. 303–314, 1989.
[5] D. E. Rumelhart, G. E. Hinton, and R. J. Williams,
“Learning representations by back-propagating errors,”
nature, vol. 323, no. 6088, p. 533, 1986.
[6] H. A. Rowley, S. Baluja, and T. Kanade, “Neural
network-based face detection,” IEEE Transactions on
pattern analysis and machine intelligence, vol. 20, no. 1,
pp. 23–38, 1998.
[7] J. Devlin, R. Zbib, Z. Huang, T. Lamar, R. Schwartz, and
J. Makhoul, “Fast and robust neural network joint mod-
els for statistical machine translation,” in Proceedings of
the 52nd Annual Meeting of the Association for Com-
putational Linguistics (Volume 1: Long Papers), vol. 1,
2014, pp. 1370–1380.
[8] F. Ercal, A. Chawla, W. V. Stoecker, H.-C. Lee, and
R. H. Moss, “Neural network diagnosis of malignant
melanoma from color images,” IEEE Transactions on
biomedical engineering, vol. 41, no. 9, pp. 837–845, 1994.
[9] A. K. Jain, J. Mao, and K. M. Mohiuddin, “Artificial
neural networks: A tutorial,” Computer, vol. 29, no. 3,
pp. 31–44, 1996.
[10] L. Chua, “Memristor-the missing circuit element,” IEEE
Transactions on circuit theory, vol. 18, no. 5, pp. 507–
519, 1971.
[11] D. B. Strukov, G. S. Snider, D. R. Stewart, and R. S.
Williams, “The missing memristor found,” Nature, vol.
453, no. 7191, p. 80, 2008.
[12] T. Prodromakis and C. Toumazou, “A review on mem-
ristive devices and applications,” in Electronics, Circuits,
and Systems (ICECS), 2010 17th IEEE International
Conference on. IEEE, 2010, pp. 934–937.
[13] D. Soudry, D. Di Castro, A. Gal, A. Kolodny, and
S. Kvatinsky, “Hebbian learning rules with memristors,”
Israel Institute of Technology: Haifa, Israel, 2013.
[14] K. D. Cantley, A. Subramaniam, H. J. Stiegler, R. A.
Chapman, E. M. Vogel et al., “Hebbian learning in spik-
ing neural networks with nanocrystalline silicon tfts and
memristive synapses,” IEEE Transactions on Nanotech-
nology, vol. 10, no. 5, pp. 1066–1073, 2011.
[15] W. He, K. Huang, N. Ning, K. Ramanathan, G. Li,
Y. Jiang, J. Sze, L. Shi, R. Zhao, and J. Pei, “Enabling
an integrated rate-temporal learning scheme on memris-
tor,” Scientific reports, vol. 4, p. 4755, 2014.
[16] G. A. Barrios, J. Retamal, E. Solano, and M. Sanz, “Ana-
log simulator of integro-differential equations with classi-
cal memristors,” arXiv preprint arXiv:1803.05945, 2018.
[17] Y. V. Pershin and M. Di Ventra, “Neuromorphic, digi-
tal, and quantum computation with memory circuit el-
ements,” Proceedings of the IEEE, vol. 100, no. 6, pp.
2071–2080, 2012.
[18] Y. Ho, G. M. Huang, and P. Li, “Nonvolatile memristor
memory: device characteristics and design implications,”
in Computer-Aided Design-Digest of Technical Papers,
2009. ICCAD 2009. IEEE/ACM International Confer-
ence on. IEEE, 2009, pp. 485–490.
[19] F. L. Traversa and M. Di Ventra, “Universal memcom-
puting machines,” IEEE transactions on neural networks
and learning systems, vol. 26, no. 11, pp. 2702–2715,
2015.
[20] Y. V. Pershin and M. Di Ventra, “Experimental demon-
stration of associative memory with memristive neural
networks,” Neural Networks, vol. 23, no. 7, pp. 881–886,
2010.
[21] J. J. Yang, D. B. Strukov, and D. R. Stewart, “Mem-
ristive devices for computing,” Nature nanotechnology,
vol. 8, no. 1, p. 13, 2013.
[22] J. P. Strachan, A. C. Torrezan, G. Medeiros-Ribeiro, and
R. S. Williams, “Measuring the switching dynamics and
energy efficiency of tantalum oxide memristors,” Nan-
otechnology, vol. 22, no. 50, p. 505402, 2011.
[23] D. S. Jeong, K. M. Kim, S. Kim, B. J. Choi, and C. S.
Hwang, “Memristors for energy-efficient new computing
paradigms,” Advanced Electronic Materials, vol. 2, no. 9,
p. 1600090, 2016.
[24] T. M. Taha, R. Hasan, C. Yakopcic, and M. R. McLean,
“Exploring the design space of specialized multicore neu-
ral processors,” in Neural Networks (IJCNN), The 2013
International Joint Conference on. IEEE, 2013, pp. 1–8.
[25] G. Indiveri, B. Linares-Barranco, R. Legenstein, G. Deli-
georgis, and T. Prodromakis, “Integration of nanoscale
memristor synapses in neuromorphic computing architec-
tures,” Nanotechnology, vol. 24, no. 38, p. 384010, 2013.
[26] H. Mostafa, A. Khiat, A. Serb, C. G. Mayr, G. Indiveri,
and T. Prodromakis, “Implementation of a spike-based
perceptron learning rule using TiO2-x memristors,” Frontiers
in neuroscience, vol. 9, p. 357, 2015.
[27] A. Thomas, “Memristor-based neural networks,” Journal
of Physics D: Applied Physics, vol. 46, no. 9, p. 093001,
2013.
[28] I. E. Ebong and P. Mazumder, “CMOS and memristor-based
neural network design for position detection,” Proceedings
of the IEEE, vol. 100, no. 6, pp. 2050–2060, 2012.
[29] A. Afifi, A. Ayatollahi, and F. Raissi, “Implementation
of biologically plausible spiking neural network models
on the memristor crossbar-based CMOS/nano circuits,” in
Circuit Theory and Design, 2009. ECCTD 2009. Euro-
pean Conference on. IEEE, 2009, pp. 563–566.
[30] D. Querlioz, O. Bichler, and C. Gamrat, “Simulation of
a memristor-based spiking neural network immune to de-
vice variations,” in Neural Networks (IJCNN), The 2011
International Joint Conference on. IEEE, 2011, pp.
1775–1781.
[31] D. Soudry, D. Di Castro, A. Gal, A. Kolodny, and
S. Kvatinsky, “Memristor-based multilayer neural net-
works with online gradient descent training,” IEEE
transactions on neural networks and learning systems,
vol. 26, no. 10, pp. 2408–2421, 2015.
[32] R. Hasan and T. M. Taha, “Enabling back propaga-
tion training of memristor crossbar neuromorphic proces-
sors,” in Neural Networks (IJCNN), 2014 International
Joint Conference on. IEEE, 2014, pp. 21–28.
[33] F. M. Bayat, M. Prezioso, B. Chakrabarti, I. Kataeva,
and D. Strukov, “Memristor-based perceptron classifier:
Increasing complexity and coping with imperfect hard-
ware,” in Proceedings of the 36th International Confer-
ence on Computer-Aided Design. IEEE Press, 2017, pp.
549–554.
[34] D. Negrov, I. Karandashev, V. Shakirov, Y. Matveyev,
W. Dunin-Barkowski, and A. Zenkevich, “An approxi-
mate backpropagation learning rule for memristor based
neural networks using synaptic plasticity,” Neurocomput-
ing, vol. 237, pp. 193–199, 2017.
[35] A. Emelyanov, D. Lapkin, V. Demin, V. Erokhin, S. Bat-
tistoni, G. Baldi, A. Dimonte, A. Korovin, S. Iannotta,
P. Kashkarov et al., “First steps towards the realization
of a double layer perceptron based on organic memristive
devices,” AIP Advances, vol. 6, no. 11, p. 111301, 2016.
[36] L. Wang, M. Duan, and S. Duan, “Memristive percep-
tron for combinational logic classification,” Mathematical
Problems in Engineering, vol. 2013, 2013.
[37] C. Yakopcic and T. M. Taha, “Energy efficient per-
ceptron pattern recognition using segmented memristor
crossbar arrays,” in Neural Networks (IJCNN), The 2013
International Joint Conference on. IEEE, 2013, pp. 1–8.
[38] V. Demin, V. Erokhin, A. Emelyanov, S. Battistoni,
G. Baldi, S. Iannotta, P. Kashkarov, and M. Kovalchuk,
“Hardware elementary perceptron based on polyaniline
memristive devices,” Organic Electronics, vol. 25, pp. 16–
20, 2015.
[39] S. Duan, X. Hu, Z. Dong, L. Wang, and P. Mazumder,
“Memristor-based cellular nonlinear/neural network: design,
analysis, and applications,” IEEE Trans. Neural
Netw. Learning Syst., vol. 26, no. 6, pp. 1202–1213, 2015.
[40] M. Prezioso, F. Merrikh-Bayat, B. Hoskins, G. Adam,
K. K. Likharev, and D. B. Strukov, “Training and op-
eration of an integrated neuromorphic network based
on metal-oxide memristors,” Nature, vol. 521, no. 7550,
p. 61, 2015.
[41] A. Wu, S. Wen, and Z. Zeng, “Synchronization control
of a class of memristor-based recurrent neural networks,”
Information Sciences, vol. 183, no. 1, pp. 106–116, 2012.
[42] S. Wen, X. Xie, Z. Yan, T. Huang, and Z. Zeng, “Gen-
eral memristor with applications in multilayer neural net-
works,” Neural Networks, vol. 103, pp. 142–149, 2018.
[43] S. P. Adhikari, C. Yang, H. Kim, and L. O. Chua, “Mem-
ristor bridge synapse-based neural network and its learn-
ing,” IEEE Transactions on Neural Networks and Learn-
ing Systems, vol. 23, no. 9, pp. 1426–1435, 2012.
[44] E. Lehtonen, J. Poikonen, and M. Laiho, “Two memristors
suffice to compute all Boolean functions,” Electronics
letters, vol. 46, no. 3, p. 230, 2010.
[45] Q. Xia, W. Robinett, M. W. Cumbie, N. Banerjee, T. J.
Cardinali, J. J. Yang, W. Wu, X. Li, W. M. Tong, D. B.
Strukov et al., “Memristor-CMOS hybrid integrated circuits
for reconfigurable logic,” Nano letters, vol. 9, no. 10,
pp. 3640–3645, 2009.
[46] D. Yu, H. H.-C. Iu, Y. Liang, T. Fernando, and L. O.
Chua, “Dynamic behavior of coupled memristor cir-
cuits,” IEEE Transactions on Circuits and Systems I:
Regular Papers, vol. 62, no. 6, pp. 1607–1616, 2015.
[47] R. K. Budhathoki, M. P. Sah, S. P. Adhikari, H. Kim,
and L. Chua, “Composite behavior of multiple memristor
circuits,” IEEE Transactions on Circuits and Systems I:
Regular Papers, vol. 60, no. 10, pp. 2688–2700, 2013.
[48] K. Hornik, “Approximation capabilities of multilayer
feedforward networks,” Neural networks, vol. 4, no. 2,
pp. 251–257, 1991.
[49] F. Dickey and J. DeLaurentis, “Optical neural networks
with unipolar weights,” Optics communications, vol. 101,
no. 5-6, pp. 303–305, 1993.
[50] X. Glorot and Y. Bengio, “Understanding the difficulty
of training deep feedforward neural networks,” in Pro-
ceedings of the thirteenth international conference on ar-
tificial intelligence and statistics, 2010, pp. 249–256.
[51] S. Walczak and N. Cerpa, “Heuristic principles for the design
of artificial neural networks,” Information and software
technology, vol. 41, no. 2, pp. 107–117, 1999.
[52] P. Pfeiffer, I. Egusquiza, M. Di Ventra, M. Sanz, and
E. Solano, “Quantum memristors,” Scientific reports,
vol. 6, p. 29507, 2016.
[53] J. Salmilehto, F. Deppe, M. Di Ventra, M. Sanz, and
E. Solano, “Quantum memristors with superconducting
circuits,” Scientific reports, vol. 7, p. 42044, 2017.
[54] M. Sanz, L. Lamata, and E. Solano, “Invited article:
Quantum memristors in quantum photonics,” APL Pho-
tonics, vol. 3, no. 8, p. 080801, 2018.
