Nanoscale devices such as carbon nanotube and nanowire based transistors, memristors and molecular devices are expected to play an important role in the development of new computing architectures. While their size represents a decisive advantage in terms of integration density, it also raises the critical question of how to efficiently address large numbers of densely integrated nanodevices without the need for complex multi-layer interconnection topologies similar to those used in CMOS technology. Two-terminal programmable devices in crossbar geometry seem particularly attractive, but suffer from severe addressing difficulties due to cross-talk, which implies complex programming procedures. Three-terminal devices can be easily addressed individually, but with limited gain in terms of interconnect integration. We show how optically gated carbon nanotube devices enable efficient individual addressing when arranged in a crossbar geometry with shared gate electrodes. This topology is particularly well suited for parallel programming or learning in the context of neuromorphic computing architectures.
Introduction
For at least three decades, complementary metal-oxide semiconductor (CMOS) technology has been the dominant technology for integrated circuits (ICs) and computing systems. Nevertheless, miniaturization of CMOS ICs will probably be limited close to the 11 nm node because of physical barriers such as quantum effects. Below this size, Boolean computing will face very severe difficulties, due in particular to high leakage currents, which will dramatically impact the power consumption performance of CMOS circuits [1]. In this context, devices based on nano-objects such as carbon nanotubes [2], nanowires [3, 4], graphene [5] or molecules [6] are viewed as possible means to go beyond Moore's law. In order to make the implementation of nanodevices with CMOS circuits tolerant to the intrinsic high variability and defect density originating from their small size and from fabrication issues (potentially involving self-assembly processes) [7], intense research activity is presently developing to evaluate the use of adaptive neural network principles [8, 9] (figure 1) in which the synapses are built at the nanoscale [10] [11] [12] [13] while the neurons are made with conventional CMOS circuits. Such a strategy, while elegant, raises the critical issue of how to address individual nanodevices within large assemblies so as to allow programming or learning. Two-terminal nanodevices in crossbar arrays combined with CMOS based control circuits are considered as one of the most promising computing architectures [13] [14] [15] [16] [17] [18] [19] [20], but the simplicity of the topology introduces severe difficulties concerning individual addressing, for example cross-talk effects.
We recently demonstrated that optically gated carbon nanotube field effect transistors (OG-CNTFETs) [21] [22] [23] can be operated as two-terminal programmable resistors and that, in such a configuration, they have all the required characteristics of artificial synapses [23]. In particular, their resistivity can be programmed over a wide range and then stored in a non-volatile way so that it can play the role of a synaptic weight (W_ij in figure 1) in neural network topologies. The programming relies on a single illumination step of the whole circuit followed by local electrical stimulations of the input electrode (source or drain) of the different devices.
Figure 1. Generic diagram of a single-layer neural network composed of i inputs, j neurons and (i × j) synapses. A single neuron and its associated synapses form a perceptron. Incoming signals (X_i) are multiplied by the synaptic weights (W_ij) before reaching the neuron that integrates the signals and triggers a response using, for example, a threshold function.
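The single-layer network of figure 1 can be sketched in a few lines. This is only a minimal illustration, assuming a hard threshold function; the function name `neuron_outputs` and the numerical weights and thresholds are hypothetical and are not taken from the experiments.

```python
def neuron_outputs(weights, inputs, thresholds):
    """Single-layer network: weights[j][i] is the synaptic weight W_ij
    linking input i to neuron j (illustrative values only)."""
    outputs = []
    for row, threshold in zip(weights, thresholds):
        # Each neuron integrates its inputs, weighted by its own synapses.
        total = sum(w * x for w, x in zip(row, inputs))
        # Threshold function: respond (1) when the integrated signal
        # reaches the threshold, stay silent (0) otherwise.
        outputs.append(1 if total >= threshold else 0)
    return outputs


# Two neurons, three inputs (hypothetical numbers).
weights = [[0.5, 0.25, 0.0],
           [1.0, 1.0, 1.0]]
inputs = [1, 1, 0]
print(neuron_outputs(weights, inputs, thresholds=[1.0, 1.5]))  # [0, 1]
```

In the hardware discussed here, the weights would be the programmed conductances of the nanodevices, while the summation and the threshold would be carried out by the CMOS neuron.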
Most importantly, we showed that in simple circuits composed of multiple OG-CNTFETs, each device can be programmed to a given resistivity value independently of its initial (post-fabrication) characteristics, so that the programming step fully compensates for the process variability [23]. This early work focused on the device level and no circuit implementation was studied.
In this paper, we present a crossbar topology for arrays of nanotube based synapses, in which gate electrodes are shared by all the synapses connected to the same neuron. Appropriate polarization of these gate electrodes allows the protection of already-trained synapses from further modification and fulfils a key requirement to achieve efficient addressing. Using simulations, we establish the potential of such a topology in terms of the programming/learning efficiency of large arrays.
Optically gated carbon nanotube field effect transistor (OG-CNTFET)

OG-CNTFETs as memory transistors
An OG-CNTFET consists of a p-type carbon nanotube field effect transistor (CNTFET) in back-gate geometry and coated with a thin film of photoconductive polymer, for example P3OT (poly(3-octylthiophene-2,5-diyl)) [21] .
The FET channel can be variously composed of a single semiconducting single wall carbon nanotube (SWNT) (as schematically represented in figure 2(a)); multiple parallel semiconducting SWNTs (figure 2(b)) or an array of SWNTs (figure 2(c)) potentially composed of both metallic and semiconducting SWNTs (but adjusted so as to provide an FET with a good off-state). At constant source-drain bias V_DS and positive gate bias V_GS, such a FET is in its off-state but can be turned on using light at an appropriate wavelength and intensity (see figures 2(a) and (b) in [21] for details on wavelength and light intensity dependence). Indeed, an illumination pulse generates electron-hole pairs in the polymer, a fraction of which gets dissociated. While photo-induced holes are depleted from the device in this bias configuration, electrons get trapped (in the gate dielectric, close to the nanotube-SiO2 interface [22]). The trapped electrons apply a negative potential to the nanotube channel, which becomes more conductive. After the light is turned off, a significant part of these electrons remain trapped so that the change forms a non-volatile memory state (we verified that the light-induced trapped charges show no sign of decay after >40 h). Using either negative electrical pulses on the gate electrode [21, 22] or positive electrical pulses on the source electrode (figure 2(d)) [23], these electrons can be de-trapped in a controlled manner. By adjusting the number of trapped charges, the resistivity of the device channel can be precisely programmed, as shown for example in figure 2(e), which displays eight I(V) curves of the same device at different stages of programming. The experimental details are described in [21] for single nanotube devices and in [23] for resistance programming using devices based on nanotube networks. The robustness of the write-read-erase cycles is illustrated in [24].
OG-CNTFET programming using the gate protection effect
Our initial prototype circuits [23] used OG-CNTFETs as two-terminal devices: we fix a constant gate potential for a series of devices sharing the same gate electrode and program them using the source electrode. We first illuminate the circuit globally to set all the devices in their state of minimal resistivity and then program each nanotube resistor independently by applying electrical pulses on the input (source) electrodes. However, while very tempting from an integration point of view, this strategy is quite limited in terms of programming/learning efficiency. Indeed, in a simple crossbar, programming pulses would act on a full series of devices and not on individual ones. To overcome this strong limitation, we use a specific property of OG-CNTFETs that is illustrated in figure 3. During programming, electrons are detrapped from the gate dielectric only when the electric field is oriented in the appropriate direction, so that only negative V_GS pulses or positive V_DS pulses are efficient. As a consequence, if a strong positive V_GS bias is applied, it shields (or protects) the device from being programmed by positive V_DS pulses. This is shown in figure 3, where the current through an OG-CNTFET submitted to programming pulses is displayed in two configurations of gate bias. Figure 3(b) corresponds to the conventional programming configuration: V_GS = 0 V and positive V_DS pulses modify the final resistivity value. Figure 3(c) corresponds to V_GS = 6 V. The same programming pulses are applied but the final current is unaffected. Note that in this latter configuration the current level is low during the programming cycle because V_GS > 0 sets the FET in its off-state.
OG-CNTFET based crossbar architecture for neuromorphic computing
Architecture
Based on this 'gate protection effect' we developed an original neuromorphic computing architecture. Instead of a single and fully global gate electrode, we propose to implement one gate electrode per row of devices in a crossbar geometry as represented in figure 4. In this configuration, all OG-CNTFETs within the same row share both the same gate electrode and the same source electrode. They correspond to synapses connected to the same neuron, which provides the function Y_i. The drain terminals (pre-synapse terminals) of all OG-CNTFETs within the same column are also connected together and carry the input signals (X_i). The neurons needed here are simple CMOS based comparators. They receive, from the post-synapse terminals, a current (I_si) which is directly the sum of the contributions from the different synapses and compare it to reference values to be targeted (I_refi). The neurons can also apply a feedback signal (V_gi) to the gate electrode of their own row of synapses. The operation of this architecture includes three phases: initializing, learning and computing. At first, light initializes globally all the OG-CNTFETs to their low resistance state. Then electrical pulses are applied to all the inputs (X_i) to update the resistance of the OG-CNTFETs (or the weight of the synapses). At each neuron, the total current from a row (I_si), measured between the programming pulses for given values of the inputs, progressively decreases toward I_refi, the known values to be learnt. Once this value is reached in a given row, the neuron of this row sends a feedback signal to its corresponding shared gate electrode, which consists of a positive V_gi value. This activates the protection mechanism and prevents any further modification of all the resistivity values within this row. The incoming input signals keep modifying devices in the other rows until all the functions have been learnt.
Once all the synaptic weights have been set to appropriate values, the network can be used for ultra-rapid parallel computing of multiple functions. In such a configuration of fixed weights, the circuit takes advantage of the exceptional transport properties of carbon nanotubes. It is indeed envisioned that carbon nanotube electronic devices would be much faster than devices based on conventional semiconductors [25, 26] . Recent experimental results tend to support these projections (see for example [27, 28] and references therein).
The principal advantage of the gate protection effect is to allow the crossbar architecture to learn several functions in a massively parallel way. Indeed, input signals can be sent to the full array of nanodevices at the same time. Most importantly, as each row stops being modified when it has reached the targeted value, the total duration to learn a large number of functions is simply the duration of the learning phase of a single function: the one requiring the largest number of programming pulses. This implies a dramatic increase in the learning speed when compared with conventional serial learning procedures. 
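The learning procedure with gate protection can be illustrated with a toy behavioural simulation. This is only a sketch under stated assumptions and not the Spectre model used in this work: each programming pulse is assumed to multiply the conductance of every unprotected device by a fixed factor `decay`, the initial conductances (`g_init_nS`, `spread`) are hypothetical, and the function name `learn_in_parallel` is ours. The read voltage (0.4 V), the 16 inputs and the four target currents follow the example discussed in the simulation section.

```python
import random


def learn_in_parallel(i_ref_nA, n_inputs=16, v_read=0.4, g_init_nS=10.0,
                      spread=0.2, decay=0.95, seed=0, max_pulses=1000):
    """Toy model of parallel learning with per-row gate protection.

    All device parameters are illustrative assumptions, not measured values.
    """
    rng = random.Random(seed)
    # Initializing phase: global illumination sets every device to a
    # low-resistance state, with device-to-device variability.
    g = [[g_init_nS * (1 + rng.uniform(-spread, spread))
          for _ in range(n_inputs)] for _ in i_ref_nA]
    protected = [False] * len(i_ref_nA)
    pulses_used = [0] * len(i_ref_nA)

    def row_current(row):
        # Sum of the synaptic contributions: nS * V = nA.
        return sum(gi * v_read for gi in row)

    for pulse in range(max_pulses + 1):
        # Read between pulses: each neuron compares its row current
        # with its own reference current.
        for r, row in enumerate(g):
            if not protected[r] and row_current(row) <= i_ref_nA[r]:
                protected[r] = True   # positive gate bias shields the row
                pulses_used[r] = pulse
        if all(protected):
            break
        # One programming pulse sent to all inputs at once:
        # only the unprotected rows are modified.
        for r, row in enumerate(g):
            if not protected[r]:
                g[r] = [gi * decay for gi in row]
    currents = [row_current(row) for row in g]
    return currents, pulses_used, protected


currents, pulses, _ = learn_in_parallel([10.0, 30.0, 20.0, 40.0])
print("pulses needed per row:", pulses)
print("parallel learning duration (pulses):", max(pulses))
print("serial learning duration (pulses):", sum(pulses))
```

In this toy model the parallel duration is set by the row that needs the largest number of pulses, whereas a serial procedure would need the sum over all rows.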
Architecture simulation based on an electrical model of OG-CNTFET
Based on experimental measurements, we developed a SPICE-like electrical model of OG-CNTFET [29] in the Cadence Spectre simulator, which is the main platform tool for integrated circuits development [30] . This model provides the interface to simulate OG-CNTFET based synaptic networks with CMOS based neurons and allows the testing of neuromorphic circuits. Figure 5 shows the simulation of a neural network composed of four neurons, with each neuron associated to 16 synapses (i.e. a total of 64 synapses while only 16 are represented in figure 4 for clarity). As mentioned above, there are three operating phases for this architecture: the first one is the initializing phase which consists in resetting all the synapses to their minimal weight. Three light pulses are used in the simulation to realize this task (in analogy with figure 2(d)) but a single longer pulse would be equivalently efficient. The second one is the learning phase during which the weights of the synapses are updated to target reference currents at different neurons. The last one is the computing phase.
This simulation shows an example in which all the 16 X_i values are set to 0.4 V between pulses and the targeted values for this specific configuration of the inputs are chosen as: I_ref0 to I_ref3 = 10 nA, 30 nA, 20 nA and 40 nA respectively (the reference current levels can be chosen at arbitrary values in the programming range which, in the present state of
Figure 4. Crossbar architecture comprising four neurons (CMOS based) and 16 synapses (OG-CNTFETs). Each neuron is associated with four synapses within a row which share both the same electrical gate (V_gi) and the same source terminal. The drain terminals (vertical pre-synapse terminals) connect synapses from different neurons and serve to apply both the programming pulses during the learning phase and the input signals during the computing phase. All the synapses in the array also share a common optical gate terminal: the P3OT film. The output of the functions (Y_i) and the feedback signals to the gate (V_gi) are produced by simple CMOS based neurons.
This model also includes a random parameter (the minimal resistivity of the device after illumination) which allows one to simulate the effect of variability (mismatch) in OG-CNTFETs. This parameter was chosen so as to reflect the variability in performance obtained from experimental measurements [23]. The good operation of this simulation thus also demonstrates the high reliability of this neuromorphic architecture in the presence of device-to-device variability.
It is important to note that in the developed model, the parameters were adjusted to correspond to experimental data acquired on non-scaled-down devices, such as those used in figure 3. The next generations of such circuits will be programmed with sub-µs pulses using OG-CNTFETs with a scaled-down gate oxide thickness (experimental evidence will be published separately). We should also note that the final currents at the end of the learning step are close to the reference currents, but with a moderate accuracy (∼94%). This is due to the insufficient number of synapses per neuron (16) in this example. When we associate each neuron with 100 synapses, the accuracy becomes >98%. An arbitrarily high precision can be reached by appropriately scaling up the number of synapses per neuron. As the synapses will be built at the nanoscale, while the neurons will be CMOS based, this type of architecture would intrinsically be in a configuration of a large synapses/neurons ratio compatible with high precision. With large numbers of synapses, parallel learning procedures, such as the one developed here, become a mandatory requirement.
Discussion and conclusions
Comparison with previous nanodevice based crossbar architectures
Without the gate protection effect provided by the OG-CNTFET, this architecture would be similar to some nano-neuromorphic computing architectures which have been studied recently [17, 31, 32]. They are all based on crossbar structures and tunable two-terminal nanodevices as synapses, but use different methods to program the functions. Some of them update the relevant synapses and program the functions in series, as there is no efficient solution to protect the nanodevices associated with the programmed functions in the array [33]. This leads to very long learning times for large circuits. We think that a parallel learning capability is one of the essential requirements to build neuromorphic machines for practical applications, and promises higher performance than CMOS based computing systems [34, 35]. For the crossbar architecture based on memristors, a parallel learning process could potentially be carried out, as their conductance would not be updated if the potential difference between their two terminals is lower than a given threshold voltage [17, 18]. Therefore, one of the most efficient solutions would be to use the low threshold voltage (e.g. 0.7 V) of memristors to control the parallel learning [18, 35] or programming [36]. However, this method leads to important cross-talk effects, as the threshold voltages of memristors could be different from each other and difficult to precisely control due to nanofabrication issues. For large crossbar architectures, these difficulties would be severe due to voltage drops along the connecting wires. Another problem for these solutions is that each neuron requires a complex CMOS control circuit. This would greatly enlarge the final die area [18, 37].
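The sensitivity of threshold-based protection to device variability can be illustrated with a small Monte Carlo sketch. Everything here except the nominal 0.7 V threshold is an assumption made for illustration: the thresholds are drawn from a Gaussian distribution with a hypothetical spread `v_th_sigma`, the devices that should remain untouched are assumed to see a residual voltage `v_disturb`, and the function name `disturbed_fraction` is ours.

```python
import random


def disturbed_fraction(v_disturb, v_th_mean=0.7, v_th_sigma=0.05,
                       n_devices=10000, seed=0):
    """Fraction of devices that are unintentionally updated.

    v_disturb: voltage (V) seen by devices that should stay unchanged
               (assumed value, e.g. from partial biasing or wire drops).
    v_th_sigma: assumed standard deviation (V) of the threshold voltage.
    """
    rng = random.Random(seed)
    thresholds = [rng.gauss(v_th_mean, v_th_sigma) for _ in range(n_devices)]
    # A device is disturbed when the voltage across it exceeds
    # its own threshold voltage.
    disturbed = sum(1 for v_th in thresholds if v_disturb > v_th)
    return disturbed / n_devices


for v in (0.5, 0.6, 0.65):
    print(f"residual voltage {v:.2f} V -> disturbed fraction "
          f"{disturbed_fraction(v):.4f}")
```

Under these assumptions, the closer the residual voltage comes to the nominal threshold, the larger the fraction of devices that are modified unintentionally. The gate protection of OG-CNTFETs does not depend on such a device-specific voltage margin.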
Expected performances (die area, reliability, learning speed and power consumption)
Architecture design is one of the most crucial steps to build high performance neural network circuits. It determines the power consumption, the die area and both the learning and the computing speed. In order to reduce the final die area and simplify the circuit initialization, the whole array of OG-CNTFETs shares the same global optical stimulus. In [23], we used a light spot of ∼200 µm². Such a surface could accommodate a large number of scaled-down devices (see section 4.3 below). The proposed crossbar geometry is based only on shared electrodes: for the input, the output and the feedback signals. It thus minimizes the nanodevice interconnections and allows the development of a process with only two levels of metallization. Unlike for memristors in crossbar architectures, cross-talk problems do not occur between adjacent OG-CNTFETs during and after the function learning. Indeed, the protection is not based on a precise voltage value that could vary from device to device. The post-synapse terminal is always grounded and the FET channel is fully closed by the high positive bias applied on the gate electrodes of the protected devices. This allows a reliable parallel learning process even in very large arrays. As the gate electrodes are shared per neuron, the use of the third terminal will not substantially lower the device density. Furthermore, they can be implemented vertically under the two-terminal programmable resistors (we showed recently that silicon wires can be used as shared gate electrodes for OG-CNTFETs, but the description of this process flow and the associated electrical results are beyond the scope of this paper). This architecture therefore also promises a low die area. Another important advantage of this architecture is the low standby power consumption during the learning phase.
As the application of a positive V_g bias reduces the current down to nearly zero when the protection is activated (figure 3(c)), the power consumption saving for such neuromorphic circuits is optimized by the shutdown of the learning process per row as soon as the target level is reached. In addition, as the multi-level states of OG-CNTFETs are non-volatile, the circuit would also work with low standby power consumption during the computing phase.
Scaling, speed and implementation perspectives
The data presented in this paper were obtained on OG-CNTFETs based on nanotube networks (figure 2(c)). The main purpose was to illustrate both the programmability (figure 2(e)) and the 'gate protection' mechanism (figure 3), which are the basis of the studied architecture. For the simulations to be as realistic as possible, the data were also used to calibrate the models. OG-CNTFETs can be built using individual nanotubes and short channel lengths (for example, we used 100 nm in [21]). Several groups also demonstrated conventional nanotube transistors with channel length down to 10 nm [38] and with good performance down to 40 nm [39]. Recently, we used silicon wires as shared back-gate electrodes for OG-CNTFETs [40]. In this geometry, scaled-down devices could be integrated down to a conservative density of 10¹⁰ cm⁻². As a result, it is expected that the CMOS part used for the neurons would be the limiting factor in terms of die area. It is thus important to keep the functionality of this silicon based part very simple, which is the case in the studied architecture (neurons are CMOS comparators).
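As a rough order of magnitude, the density quoted above can be combined with the light spot area of about 200 µm² used in [23]. The short calculation below is only an illustration; the function name `devices_under_spot` is hypothetical, and the result assumes that the whole spot area is filled at the quoted density of 10¹⁰ cm⁻².

```python
UM2_PER_CM2 = 10**8  # 1 cm = 10^4 µm, hence 1 cm² = 10^8 µm²


def devices_under_spot(spot_area_um2, density_per_cm2):
    """Number of devices covered by a light spot of the given area."""
    # Integer arithmetic avoids floating-point rounding on large exponents.
    return density_per_cm2 * spot_area_um2 // UM2_PER_CM2


print(devices_under_spot(200, 10**10))  # 20000
```

Under these assumptions, a single spot of this size would initialize on the order of 2 × 10⁴ devices at once.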
The magnitude of the manipulated signals is also important for future implementations.
With individual nanotubes, we achieved ON-currents above 1 µA (at V_DS = 400 mV) and a programmability range of four orders of magnitude [21]. Optimized single nanotube transistors can drive currents as high as 20 µA with an I_on/I_off ratio above 10⁶. In a configuration such as the one illustrated in figure 2(b) based on nanotubes in parallel, higher current levels can be achieved without significantly increasing the device width. Recent progress in the CVD growth of aligned SWNTs shows the potential of this configuration [41, 42].
Concerning speed, our recent experimental results show that by scaling down the gate dielectric thickness to 2 nm, programming pulses shorter than 1 µs can be used [40] . These results are in preparation for publication. After the programming phase, the circuit would benefit from the intrinsic high speed of carbon nanotube electronics. In 2009, we showed, for example, that CNTFETs with cut-off frequencies as high as 80 GHz can be built [27] .
The physical implementation of an optical gate may be difficult to achieve, even though there is a very intense activity in the field of mixed electrical/optical electronics (note in this context that nanotube transistors can be used as both light detectors [43] and light sources [44] ). We estimate that the studied architecture could be implemented in a fully electrical way. Indeed, its principle relies on devices having two control signals. The only requirement for these two control signals is that one must be common to all the devices while the second one must be shared by devices within the same row. In the present implementation, light constitutes an efficient way to share a global stimulus. But this global signal could be a global back-gate electrode. In that case, the second one would be a shared top-gate electrode. With the recent progress in carbon nanotube based memory devices [45, 46] , such an implementation can be reasonably envisaged.
Conclusions
In conclusion, we studied an original crossbar architecture to integrate carbon nanotube based programmable devices within a neuromorphic architecture. An experimental demonstration of a mechanism that allows protecting already programmed devices from further modification was provided. This mechanism allows very efficient parallel learning of multiple functions within large arrays. This hybrid nano/CMOS architecture promises high reliability, high density and fast learning speed, which are the crucial requirements to building neuromorphic computing machines for practical applications and to providing performances exceeding those of present computing systems.
Using a precise functional model of OG-CNTFETs, we simulated this crossbar architecture and highlighted the principal characteristics of its expected performances.
