A superconducting loop stores persistent current without any ohmic loss, making it an ideal platform for energy-efficient memories. Conventional superconducting memories use an architecture based on Josephson junctions (JJs) and have demonstrated access times less than 10 ps and power dissipation as low as $10^{-19}$ J. However, their scalability has been slow to develop due to the challenges in reducing the dimensions of JJs and minimizing the area of the superconducting loops. In addition to the memory itself, complex readout circuits require additional JJs and inductors for coupling signals, increasing the overall area. Here, we have demonstrated a superconducting memory based solely on lithographic nanowires. The small dimensions of the nanowire ensure that the device can be fabricated in a dense area in multiple layers, while the high kinetic inductance makes the loop essentially independent of geometric inductance, allowing it to be scaled down without sacrificing performance. The memory is operated by a group of nanowire cryotrons patterned alongside the storage loop, enabling us to reduce the entire memory cell to 3 μm × 7 μm in our proof-of-concept device. In this work we present the operation principles of a superconducting nanowire memory (nMem) and characterize its bit error rate, speed, and power dissipation.
low power dissipation, motivating the development of a superconducting computer for supercomputers and big data centers [1]. Basic logic gates, analog-to-digital converters, and small processors have been demonstrated with SFQ circuits. However, the challenge of creating a high-speed, low-power, and scalable memory that operates at cryogenic temperatures for SFQ compatibility remains an obstacle to the development of a practical superconducting computer. Several technologies have been developed in pursuit of this goal. One approach uses a hybrid architecture that combines SFQ circuits and CMOS memories [2]. Scaling up CMOS memory to the level of a superconducting computer is relatively easy, benefiting from technologies developed in the advanced semiconductor industry. However, because the CMOS circuit requires a large input voltage, the amplification interface between the SFQ and CMOS units accounts for the majority of the power dissipation and limits the operating speed. Another hybrid approach uses multiple layers of magnetic materials to create a superconducting-ferromagnetic-superconducting (SFS) junction [3, 4]; however, this technique demands careful tuning of the materials to enable a scalable array at cryogenic temperature.
Compared to these hybrid approaches, a technique relying on memories and readout circuits made entirely of superconductors may be more straightforward, as they share the same signal levels, temperature dependences, and fabrication processes as SFQ circuits. A conventional all-superconductor memory stores bit information in a superconducting loop and uses SFQ circuits to enable addressing, writing, and reading operations [5]. However, the development of a scalable Josephson junction (JJ)-based memory has been slow due to several limitations [6]. First, reducing the area of a JJ below 1 μm² makes it increasingly difficult to fabricate a junction with high current density at a high yield. In addition, the superconducting loop requires an inductor of at least a few pH to ensure the conditions for single flux quantum operations, increasing the overall area required by the device. The total area is further increased by the transformers and SQUID amplifiers used in addressing, writing, and reading operations of the storage elements. Furthermore, since magnetic coupling is typically used in SFQ circuits, adjacent memory cells must be far enough apart to avoid crosstalk, limiting the density of memory arrays.
Here, we demonstrate an alternative superconducting memory made entirely of lithographic nanowires (nMem). We use superconducting nanowire devices, which are patterned together with the nanowire storage loop in a very compact size, to enable operations for addressing, writing, and reading. In comparison to Josephson-based memory elements, the nMem offers multiple advantages. The minimum feature size, defined as the width of a nanowire, is typically ~100 nm, smaller than a Josephson junction by 1 to 2 orders of magnitude. The entire memory cell is patterned from a single thin (~7 nm) film and could be patterned in multiple layers for even higher scalability, making it promising for large arrays. Additionally, the kinetic inductance of a thin nanowire is about two orders of magnitude larger than its geometric inductance, allowing the superconducting loop to be scaled down while maintaining the high inductance required for storage. Furthermore, because magnetic fields penetrate through the thin nanowires, the nMem is not sensitive to perturbation by magnetic fields and thus may be densely packed into an array without crosstalk.
Using the kinetic inductance to shrink the size of a superconducting loop was demonstrated in Ref. [7]. The authors designed a superconducting loop into a nanoSQUID and operated it as a memory by sending current pulses or applying a magnetic field. In this work, the memory is combined with on-chip cryotron devices, which are used for addressing, writing, and reading the storing loop. Therefore, we can fully operate the memory with digital pulses and characterize its bit error rate. We have also previously demonstrated that SFQ pulses can trigger a nanowire cryotron [8], suggesting that nMems can be integrated with RSFQ circuits through an interface circuit made from cryotrons.
Memory operation:
Figure 1a shows the schematic diagram of a single memory cell. A superconducting loop stores the bit information in the form of a persistent current, while a thermally coupled cryotron, which we refer to as a heat-Tron (hTron), enables the write operation and a current-crowding cryotron (yTron) reads the stored persistent current nondestructively. The loop and the cryotrons were patterned together within a 3 μm × 7 μm area, as shown in Fig. 1b. Figure 1c shows experimental waveforms for writing and reading the two different states. The detailed operation principle will be discussed following the descriptions of the cryotron devices.
[Figure caption fragment] ... reading bits '1' and '0'. To read the memory, we used the same input port for sending the bias and reading the output. In the output pulse Vr, the first pulse is the leakage signal from the biasing pulse Ir, while a second pulse appears after the leakage pulse only if the storage state is '0'. The circuit diagram is shown in Fig. 6a.
hTron characterization
A large memory requires a bit-selection scheme to operate either an individual bit or a group of bits, i.e., a word. In the nMem, the superconducting loop stores the bit information while the cryotrons perform the bit selection. We use the heat-Tron (hTron) as a selection line to enable the write operation. Bit information can be written into the superconducting loop only when the hTron is triggered. Since heat is generated during the operation of the hTron, it is important to characterize the power dissipation and switching speed of the hTron, which could limit the overall performance of the memory element.
The hTron is a nanowire cryotron device composed of two isolated nanowires placed close together with a typical spacing of 40 nm. We refer to the narrower nanowire as the gate and the wider nanowire as the channel. As shown in Fig. 1b, an hTron is on the left side of the memory with its channel forming part of the storage loop. When an input pulse switches the hTron gate from the superconducting state to the resistive state, the gate dissipates power and increases the local temperature of the channel through Joule heating, suppressing its critical current. Applying a biasing current to the channel greater than the suppressed critical current will cause the channel nanowire to switch. In this way, the switching of the hTron channel nanowire dictates the opening of the superconducting loop for fluxons to enter (write '1') or exit (write '0'). The electrical isolation between the hTron gate and channel minimizes crosstalk between the port for selecting a memory loop and the ports for writing and reading the loop, which is a promising feature for a multiplexing memory array.
[Figure caption fragment] ... switched by a gate pulse. We observed that the channel switched probabilistically if Esw was too weak.
We characterized an individual hTron device isolated from the storage loop. We found that there was a tradeoff between the power dissipated on the gate and the delay for switching the channel. To observe this effect, we sent fast pulses to the hTron gate and channel to measure the delay between the input pulse to the hTron gate and the switching time of the channel. Delays of the cables and amplifiers were removed after calibration. The width of the pulse sent to the hTron gate was fixed at τp = 8 ns, while the high level of the pulse was swept in order to generate different energy dissipations. We assumed that the current through the gate wire was held constant at a self-heating current of Ihg = 2 μA and that all of the input voltage was applied across the gate. Thus, the energy dissipation per switch on the gate was calculated using Esw = Vgh × Ihg × τp. The data in Fig. 2 show that the switching delay is a function of the biasing current on the hTron channel and the energy dissipation on the gate. It took longer for the channel to switch when less energy was dissipated on the gate and when the channel was biased at a lower current.
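The gate energy estimate Esw = Vgh × Ihg × τp can be checked with a few lines of Python. This is only a minimal sketch: the function name and argument names are illustrative, the defaults are the Ihg = 2 μA and τp = 8 ns stated in the text, and the 0.8 V pulse level is the value quoted later for the bit error rate measurements.

```python
def htron_gate_energy(v_gate_high, i_hold=2e-6, pulse_width=8e-9):
    """Energy dissipated on the hTron gate per switch, in joules.

    Implements E_sw = V_gh * I_hg * tau_p, under the text's assumptions
    that the gate current stays at the self-heating current I_hg and
    that all of the input voltage drops across the gate.
    """
    return v_gate_high * i_hold * pulse_width


# 0.8 V pulse level, 2 uA self-heating current, 8 ns pulse width
e_sw = htron_gate_energy(0.8)
print(f"E_sw = {e_sw * 1e15:.1f} fJ")  # prints "E_sw = 12.8 fJ"
```

The result of about 12.8 fJ is consistent with the 13 fJ quoted for the write-enable pulse in the bit error rate measurements.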
Compared to SFQ circuits, the hTron is more energetically expensive per switch and requires a longer time to complete a write operation. This would limit the application of the hTron in a fast and energy-efficient memory array or a logic circuit. A future multilayer design that stacks the hTron gate on top of the channel with a thin insulating layer would enhance the thermal coupling, making the hTron faster and more energy efficient. In this work, however, the electrical isolation between the hTron gate and channel makes it a valuable tool for characterizing the memory operations.
yTron characterizations
The stored bit information, i.e., the circulating current, can be read either destructively or nondestructively. The destructive readout senses the switching current of the memory loop through the write port; we will discuss it in section 3.3. The nondestructive readout uses the current-crowding cryotron (yTron), which senses the circulating current of the superconducting loop through the read port. Because the detection happens in the read nanowire, the superconducting state of the storage loop is maintained, enabling us to read the stored bit multiple times. Before presenting the memory results, we first discuss the operation principle of the yTron and characterize its sensitivity.
The yTron is a device made from two nanowires joined together with a sharp intersection point. It uses the current-crowding effect to control the switching current of one arm with the bias current through the other [9]. Details of the operation principle of a yTron are described in Ref. [10]. Here, we will discuss the readout approach of a memory with an integrated yTron. The information stored in the superconducting nanowire memory is in terms of the number of fluxons. The trapped fluxons nΦ0 generate a persistent current of nΦ0/LL, where LL is the total loop inductance and Φ0 is the magnetic flux quantum. This persistent current is also a biasing current to one arm (sensing arm) of the yTron device, and thus controls the switching current of the other arm (detecting arm) of the yTron. Therefore, we can read the different numbers of fluxons stored in each state by measuring the difference in the switching current of the yTron detecting arm. One promising feature of using the yTron as a readout tool is that reading the detecting arm has no effect on the superconducting state of the sensing arm attached to the storing loop. Therefore, the read operations are nondestructive. We measured the dependence of the switching current of the detecting arm, $I_{sw}^{darm}$, on the biasing current through the sensing arm, $I_{bias}^{sarm}$, for a separate yTron which had the same geometry as the one used in the memory.
The sensitivity of the yTron is defined as the derivative $dI_{sw}^{darm}/dI_{bias}^{sarm}$, which is the slope of the $I_{sw}^{darm}$ vs. $I_{bias}^{sarm}$ curve shown in Fig. 3. We observed that the yTron responded to changes over a wide range of $I_{bias}^{sarm}$ but with varying sensitivity. The highest and most constant sensitivity (~0.8) was over the range 0 μA < $I_{bias}^{sarm}$ < 20 μA, which was where we operated the persistent currents in the memory.
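To make the readout chain concrete, the sketch below evaluates the persistent current nΦ0/LL and the resulting shift in the detecting-arm switching current, using a linear approximation with the ~0.8 sensitivity. The loop inductance of 2 nH is a hypothetical value chosen only for illustration (the text does not state LL), the function names are illustrative, and the linear model is only meant for the 0 to 20 μA range where the sensitivity was roughly constant.

```python
PHI_0 = 2.067833848e-15  # magnetic flux quantum h/(2e), in webers


def persistent_current(n_fluxons, loop_inductance):
    """Persistent current n * Phi_0 / L_L (amperes) for n trapped fluxons."""
    return n_fluxons * PHI_0 / loop_inductance


def detecting_arm_shift(i_persistent, sensitivity=0.8):
    """Approximate change in the detecting-arm switching current.

    Linear model: shift = sensitivity * I_bias_sarm, where the sensing-arm
    bias is the persistent current of the loop.
    """
    return sensitivity * i_persistent


L_LOOP = 2e-9  # HYPOTHETICAL loop inductance in henries, not from the text
i_p = persistent_current(15, L_LOOP)  # ~15 fluxons, as in the write '1' operation
print(f"persistent current: {i_p * 1e6:.1f} uA")
print(f"switching-current shift: {detecting_arm_shift(i_p) * 1e6:.1f} uA")
```

With these assumed numbers the persistent current falls inside the 0 to 20 μA window, which is why the constant-sensitivity approximation is used here.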
Memory operation diagram
With knowledge of the cryotron devices, we can now discuss the operations of the memory shown in Fig. 1. To write currents into the storage loop, representing bit '1', we sent a pulse through the write port to bias the wires in the memory loop below the level at which the loop switches. Afterwards, another pulse was sent through the gate nanowire of the hTron, representing the write-enable port. The write-enable pulse then switched the hTron channel, allowing about 15 fluxons to enter the loop. To write a lower current into the loop, representing bit '0', we only sent a write-enable pulse without biasing the wires in the memory loop; this switched the hTron gate in order to either erase the '1' state if it had been written by the previous operation or maintain the '0' state. The read operation was performed by reading voltage pulses generated from the yTron detecting arm. As the stored currents for states '1' and '0' determined two different switching currents ($I_{sw1}^{darm} > I_{sw0}^{darm}$), we sent a pulse to bias the yTron's detecting arm to a current level close to $(I_{sw0}^{darm} + I_{sw1}^{darm})/2$. Therefore, if the memory state was '1', we read no pulse from the yTron's detecting arm. Otherwise, if the memory state was '0', the yTron's detecting arm switched and a voltage pulse was observed.
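The read decision described above reduces to a threshold comparison, sketched below. The function names and the two switching currents (50 μA and 60 μA) are hypothetical placeholders rather than measured values, and the model ignores noise and the statistical spread of switching currents.

```python
def read_bias(i_sw0, i_sw1):
    """Read-pulse amplitude midway between the two switching currents."""
    return 0.5 * (i_sw0 + i_sw1)


def read_bit(i_read, i_sw_darm):
    """Idealized yTron readout.

    The detecting arm switches (a voltage pulse is observed) when the
    read current exceeds its present switching current, which is read
    as '0'. No pulse is read as '1'.
    """
    return '0' if i_read > i_sw_darm else '1'


# HYPOTHETICAL switching currents for the two stored states (amperes)
I_SW0, I_SW1 = 50e-6, 60e-6
i_read = read_bias(I_SW0, I_SW1)

print(read_bit(i_read, I_SW1))  # state '1': no pulse
print(read_bit(i_read, I_SW0))  # state '0': voltage pulse
```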
We simulated the memory circuit to understand how the currents in the memory loop changed during a write '1' operation. In particular, we studied how the current pulse from the write port (Iw) split between the left arm (Ileft, through the hTron channel) and the right arm (Iright, through the yTron sensing arm) of the storing loop. As shown in Fig. 4a-c, before the hTron was turned on, Iw split into Ileft and Iright with a ratio α/(1-α) = Lright/Lleft, where Lright and Lleft were the inductances of the right and left sides of the nanowire loop, respectively. When Iw reached its highest level $I_w^{high}$ and the hTron was then turned on, the switching current of the left arm $I_{sw}^{left}$ was suppressed below $\alpha I_w^{high}$. Thus, the left wire switched into the resistive state, expelling the bias current to the right wire. This diversion of bias current reduced Ileft to a level Ires at which the resistive state in the left wire could no longer be maintained, allowing it to return to the superconducting state. After the hTron was turned off and Iw was removed, Ileft and Iright decreased following the same splitting ratio Lright/Lleft. When all of the input pulses were removed, the superconducting loop stored a circulating current $I_{store} = \alpha I_w^{high} - I_{res}$.
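The current bookkeeping in this write '1' sequence can be followed step by step in a short script. This is a simplified lumped model, not the authors' circuit simulation: it assumes the hTron always suppresses the left arm enough to switch it, and the inductances, Ires, and right-arm switching current are hypothetical illustration values. Only the 32 μA write level is taken from the text.

```python
def write_one(i_w_high, l_left, l_right, i_res, i_sw_right):
    """Stored current after an idealized write '1', or None on failure.

    Steps, following the description in the text:
      1. I_w splits inductively: I_left = alpha * I_w, alpha = L_right / (L_left + L_right).
      2. hTron on: left arm switches and its current drops to I_res,
         so the right arm carries I_w - I_res.
      3. If that exceeds the right arm's switching current, both arms
         switch and the write fails (returns None).
      4. I_w removed: both arms decrease by their inductive share,
         leaving I_store = alpha * I_w - I_res circulating.
    """
    alpha = l_right / (l_left + l_right)
    i_right_during = i_w_high - i_res
    if i_right_during > i_sw_right:
        return None
    i_right_after = i_right_during - (1.0 - alpha) * i_w_high
    return i_right_after  # equals alpha * i_w_high - i_res


# HYPOTHETICAL device parameters: equal arm inductances, I_res = 2 uA,
# right-arm switching current = 50 uA. Only 32 uA comes from the text.
i_store = write_one(32e-6, l_left=1e-9, l_right=1e-9, i_res=2e-6, i_sw_right=50e-6)
print(f"I_store = {i_store * 1e6:.1f} uA")
print(write_one(60e-6, 1e-9, 1e-9, 2e-6, 50e-6))  # too much input current
```

In this model the stored current grows linearly with the write level until the right arm's limit is reached, which mirrors the qualitative behavior described for the simulation.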
The simulation indicates that a higher Istore will be written into the storing loop for a higher $I_w^{high}$. However, too much input current will switch both arms when $I_w^{high} - I_{res} > I_{sw}^{right}$, where $I_{sw}^{right}$ is the switching current of the right wire. We observed a sharp transition when $I_{sw}^{right}$ was too high as shown in Fig. 4d. We measured the switching current of the yTron's detecting arm $I_{sw}^{darm}$, which was proportional to Istore, at different levels of $I_w^{high}$. Increasing $I_w^{high}$ increased $I_{sw}^{darm}$ until $I_w^{high}$ = 48 μA, agreeing with our simulation results. As we showed in Fig. 2, for the hTron to switch deterministically, the write-enable pulse had to be powerful enough to switch the superconducting loop. We found that the linear increase shown in Fig. 4d started at a higher $I_w^{high}$ for a weaker write-enable pulse, which agreed with our previous data for an individual hTron shown in Fig. 2.
[Figure caption fragment] ... shows the probability of the switching current at each $I_w^{high}$. The maximum $I_{sw}^{darm}$ occurred when $I_w^{high}$ was ~48 μA, above which both wires of the storing loop switched into the normal state.
Memory Characterizations

Bit Error Rate
To ensure correct write and read operations, the nanowire memory must perform with a very low bit error rate (BER). As we discussed in previous sections, the write operation can be deterministic if we dissipate enough energy on the hTron gate and set a proper value for $I_w^{high}$. The bias margin of the write operation can be much wider than the bias margin of the read operation if energy efficiency is not a primary concern. Here, we focus on errors caused by the read operations. Specifically, we characterize the bias margin of the yTron that ensures memory operation with an acceptable BER. To ensure error-free write operations, the writing current was fixed at $I_w^{high}$ = 32 μA and the energy dissipation of the hTron pulse was set to 13 fJ (pulse width was 8 ns, pulse high level was 0.8 V).
The BER measurements made on our devices are shown in Fig. 5. In every measurement cycle, we first wrote a random bit '1' or '0' to the nMem. Afterwards, we sent a pulse with fixed amplitude to the yTron's detecting arm to read the memory state. If bit '1' was stored, the yTron was expected to remain in the superconducting state and no output voltage pulse would be detected. In the opposite case, when bit '0' was written, we expected to measure a voltage pulse. Because the operation signals for the nMem were pulses, we first generated a pseudorandom binary sequence (PRBS), and then used this sequence to trigger a second arbitrary waveform generator (AWG) to produce pulses of fixed width and amplitude. As the rising-edge triggering mode was used, the output pulse train only indicated the transitions from bit '0' to bit '1'. On average, one quarter of the PRBS bits produced a pulse for writing '1'. The timing diagram of the operation patterns is illustrated in Fig. 5b.
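The one-quarter figure follows from the rising-edge trigger: a pulse is produced only where a '0' is followed by a '1', which happens with probability 1/2 × 1/2 for independent, equally likely bits. The sketch below checks this numerically. It uses Python's `random` module as a stand-in for the PRBS generator, which is an assumption; the actual instrument sequence is not specified in the text.

```python
import random


def rising_edge_fraction(n_bits, seed=0):
    """Fraction of bit slots that contain a '0' -> '1' transition.

    A seeded pseudorandom bit stream stands in for the PRBS source.
    Each rising edge corresponds to one write-'1' trigger pulse.
    """
    rng = random.Random(seed)
    bits = [rng.getrandbits(1) for _ in range(n_bits)]
    edges = sum(1 for prev, cur in zip(bits, bits[1:]) if prev == 0 and cur == 1)
    return edges / (n_bits - 1)


print(f"rising-edge fraction: {rising_edge_fraction(200_000):.3f}")  # close to 0.25
```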
We used a counter to record the total number of operations Ntot, the number of bit '1' writes NW1, and the number of bit '0' reads NR0, from which the BER can be calculated by BER = 1 - NR0/(Ntot - NW1). The maximum Ntot was set to $3 \times 10^{7}$, giving a lowest measurable BER of $4.4 \times 10^{-8}$. As shown in Fig. 5c, when the read pulse level was too far below the switching current of the detecting arm, the yTron did not always switch when bit '0' was written, causing W0R1 errors (write bit '0' but read bit '1'). When the reading pulse was too high, the yTron detecting arm switched even when we wrote bit '1', causing W1R0 errors (write bit '1' but read bit '0'). Only when the reading pulse was in an optimal range could correct operations be obtained. The read operation margin was defined as the biasing range at a fixed BER. For a BER on the order of $10^{-7}$, the biasing margin was from 52.4 μA to 57.0 μA. Fits to the trench of the BER curves suggested that a BER lower than $10^{-10}$ could be achieved but with a narrowed operation margin.
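The BER formula and the quoted measurement floor can be reproduced with the short sketch below. The function names are illustrative, and the floor calculation rests on an assumed interpretation: that the lowest measurable BER corresponds to a single error among the Ntot - NW1 write-'0' operations, with one quarter of all operations writing '1' as stated for the PRBS trigger.

```python
def bit_error_rate(n_tot, n_w1, n_r0):
    """BER = 1 - N_R0 / (N_tot - N_W1), as defined in the text."""
    return 1.0 - n_r0 / (n_tot - n_w1)


def lowest_measurable_ber(n_tot, write_one_fraction=0.25):
    """Smallest nonzero BER resolvable in n_tot operations.

    ASSUMPTION: one missed '0' read among the write-'0' operations,
    which make up (1 - write_one_fraction) of all operations.
    """
    n_w0 = n_tot * (1.0 - write_one_fraction)
    return 1.0 / n_w0


print(f"{lowest_measurable_ber(3e7):.1e}")  # prints "4.4e-08"

# Illustrative counter values: 100 operations, 25 write-'1', 72 '0' reads
print(f"{bit_error_rate(100, 25, 72):.2f}")  # prints "0.04"
```

Under this interpretation, 3 × 10⁷ operations give about 2.25 × 10⁷ write-'0' trials, which matches the 4.4 × 10⁻⁸ floor quoted in the text.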
Non-destructive readout
In the reading operations, although the yTron's detecting arm switches to the normal state and produces a voltage pulse, the superconducting state of the storing loop is not disturbed. In this way, the yTron offers a non-destructive readout of the nMem. To demonstrate non-destructive readout by the yTron, we wrote to the memory once, but read its state multiple times. Unlike the read operations used in measuring the BER, we sent ramped pulses of an amplitude higher than the maximum switching current to the yTron detecting arm to determine when it switched, from which we calculated the switching currents of the yTron's detecting arm at different memory states. As a result, we forced the yTron to switch for reading both bit '1' and bit '0'.
As shown in Fig. 6a, 400 read operations were executed within 200 μs after one write operation. When bit '1' was written, the yTron's detecting arm switched later along the bias current ramp, indicating that it had a higher switching current. In the case of writing bit '0', the yTron's detecting arm switched earlier. The measured switching delays in both cases decayed over time, presumably because the local temperature was increased by the heat generated from the yTron's detecting arm when it switched to the normal state. When fewer reading pulses were sent to the nMem within a longer time frame, the device had more time to cool down between reads, resulting in more stable measured switching currents of the yTron's detecting arm. As shown in Fig. 6b, we performed 180 read operations within 900 μs. The measured switching currents were more stable than the data shown in Fig. 6a.
Bipolar operation without hTron
In the present nMem design, the hTron dominates the overall energy consumption. As mentioned previously, a stacked hTron design with the gate nanowire on top of the channel would likely increase thermal coupling and reduce the power dissipation; however, this tactic requires the development of a multilayer process. An alternative way to reduce energy costs would be to avoid using the hTron by writing to the nMem with bipolar pulses through the write/data-in port. We demonstrated this bipolar operation using the same nMem device while leaving the hTron gate grounded. As shown in Fig. 7a , for each write operation, we always sent a negative pulse of higher amplitude through the write/data-in port to force the memory loop to switch regardless of its previous state, acting as a clear operation. This also generated a level of the persistent current representing bit '0'. To write bit '1', a positive pulse was sent after the negative pulse. The amplitude of the positive pulse was adjusted to a level such that only the left arm of the memory loop switched while the right arm remained superconducting, which was equivalent to the previous operation of writing bit '1' when the hTron was used.
In the bipolar operation, the negative pulse switched both arms of the loop into resistive state. Therefore, 
Conclusion
In this work, we have demonstrated a superconducting memory made entirely from nanowire devices fabricated together on a single plane. We discussed the advantages of the nMem and described its operation principles. The nMem has a compact size, which is promising for scaling up to a large memory array; while our proof-of-concept device was 3 μm × 7 μm, the nMem can be made smaller in future iterations by reducing the nanowire width and loop dimensions while maintaining a high kinetic inductance. Multilayer fabrication may also allow for arrays of even higher density. We measured a minimum BER less than $10^{-7}$, indicating that the memory is reliable. The nMem was operated in the electrothermal regime, where a normal resistance needed to be sustained to enable write and read operations. This operation regime could be analogous to a JJ operated in a latched mode. Due to heating from the normal resistance, the performance metrics of speed and power dissipation were not competitive with the performance of JJs operated in the flux regime, i.e., the JJ memories in SFQ circuits. To speed up the memory operations and reduce the power dissipation of an nMem, it may be possible to operate the nanowire in the flux regime by resistively shunting it to suppress Joule heating during switching [11, 12]. Therefore, we envision that the nMem's performance could eventually match the speed and power dissipation of RSFQ circuits.
