Abstract-We designed and fabricated a 4 × 4 switch in which all interconnections were implemented using passive transmission lines (PTLs). The switch consisted of four identical 2 × 2 switches connected using PTLs. The 2 × 2 switch was designed using gate-to-gate passive interconnections. Using the on-chip testing method, we demonstrated 40-GHz operation of the 4 × 4 switch. To verify the effectiveness of passive interconnection, we compared the 4 × 4 switch with an identical switch that was designed using Josephson transmission lines (JTLs) for interconnections. The comparison showed that the PTL-version 4 × 4 switch had 50% fewer junctions and required 55% less powering current. The latency and the largest jitter in the PTL-version 4 × 4 switch were estimated to be 36% and 61%, respectively, of those of the JTL version.
I. INTRODUCTION

It is an advantage of single-flux-quantum (SFQ) device technology [1] that it can use superconducting passive transmission lines (PTLs), which provide high-speed signal transmission at the speed of light, for circuit interconnections [1]. Since the first experimental demonstration [2] and the first successful application of PTL interconnection to a practical SFQ circuit [3] were reported, various works have been done toward the use of PTLs for the interconnection of SFQ circuits [4]-[11].
Interconnection using PTLs will become an indispensable technology for increasing the operation speed and/or the degree of integration of SFQ circuits [7]. This is mainly because the number of Josephson junctions (JJs) used for interconnection must be reduced in order to lower the timing-error probability, which is caused by thermal noise and process variation and grows as circuit scale increases.
So far, we have implemented a PTL driver and a receiver, and developed a method for designing SFQ circuits with passive interconnections. We applied the developed components and method to actual SFQ circuits, such as a 4 × 4 switch with block-to-block passive interconnections [11] and a 2 × 2 switch with gate-to-gate passive interconnections [9], [10]. In this paper, we report on the design of and an experiment on a 4 × 4 switch in which all interconnections were implemented using PTLs. To verify the effectiveness of the passive interconnection, we compare the 4 × 4 switch with the same circuit designed using Josephson transmission lines (JTLs) for interconnections.
II. CIRCUIT DESIGN
The powering currents are 188.3 mA for the switch and 196.5 mA for the testing circuit, resulting in a total current of 384.8 mA.
The basic concept for interconnection design is to use PTLs for all interconnections and short JTL segments only for adjusting signal timings. The only exception is one-dimensional recursively cascaded circuits, such as shift registers (SRs). For such circuits, PTL interconnection is not applicable because it only adds extra JJs for the drivers and receivers.
The chip was designed hierarchically. First, a 2 × 2 switch was designed using gate-to-gate passive interconnection. Next, the 4 × 4 switch was designed by connecting the four 2 × 2 switches using PTLs. The SRs and the high-frequency clock generator (HFCG) were also designed. Finally, the 4 × 4 switch, the SRs, and the HFCG were connected using PTLs. Such a hierarchical design flow will be naturally used to design large-scale SFQ circuits in the future. The following sections describe the circuit design in detail.
A. 2 × 2 Switch With Gate-to-Gate Passive Interconnection
Fig. 3 shows the circuit diagram of the 2 × 2 switch [12]. The switch consists of 13 logic cells and it has three pipeline stages. The switch has two data inputs, in0 and in1, and two data outputs, out0 and out1. The switch's two routing operations, "cross" and "bar", are controlled by two control inputs, set_cross and reset.
We designed the switch using gate-to-gate passive interconnection. Fig. 4 shows a microphotograph of the 2 × 2 switch. In this design, we improved the previous version of this circuit [9], [10]. In the previous version, we imposed a constraint on the interconnection: PTL interconnection from a logic cell to the next-stage logic cell can only go to the right or up. In general, this constraint reduces the number of JJs needed for timing adjustments. However, this constraint tends to produce dead space in the circuit layout, so it makes the circuit size relatively large. In the present version, we did not impose the interconnection constraint. As a result, the circuit area has been significantly reduced to 72% of that of the previous version, although the number of JJs has been increased to 108% of that of the previous version.
We adjusted the signal timings to make the timing margins for all the cells as wide as possible at 40 GHz. In our design method, signal timings are adjusted by inserting JTL segments within the PTL-connectable logic cells [9], [10]. From our cell library, "CONNECT" [13], two types of JTLs, which involve two JJs or three JJs, are available for the timing adjustment. The delay times are 8.3 ps and 12.8 ps for the 2-JJ JTL and the 3-JJ JTL, respectively. By using these JTLs, we can adjust the signal timings with an accuracy of about 4 ps. For fine timing adjustments much smaller than 4 ps, if needed, PTL lengths are tuned. The PTL delay is 0.3 ps for a length of 40 μm (i.e., the length of a 1-stage JTL).
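The coarse-plus-fine adjustment described above can be illustrated with a minimal sketch. It uses only the delay figures quoted in the text (8.3 ps, 12.8 ps, and 0.3 ps per 40 μm); the function name `plan_delay`, the search limit `max_cells`, and the rule of never exceeding the target delay are hypothetical choices made for illustration and are not taken from the design method of [9], [10].

```python
from itertools import product

# Delay figures quoted in the text.
JTL2_PS = 8.3                 # delay of the 2-JJ JTL
JTL3_PS = 12.8                # delay of the 3-JJ JTL
PTL_PS_PER_UM = 0.3 / 40.0    # PTL delay per micrometer

def plan_delay(target_ps, max_cells=6):
    """Hypothetical helper: choose counts of 2-JJ and 3-JJ JTLs whose total
    delay is the largest value not exceeding target_ps, then express the
    remaining delay as extra PTL length."""
    best = (0, 0, 0.0)
    for n2, n3 in product(range(max_cells + 1), repeat=2):
        delay = n2 * JTL2_PS + n3 * JTL3_PS
        if delay <= target_ps + 1e-9 and delay > best[2]:
            best = (n2, n3, delay)
    n2, n3, delay = best
    residual_ps = target_ps - delay                # left for fine tuning
    extra_ptl_um = residual_ps / PTL_PS_PER_UM     # PTL length covering it
    return n2, n3, residual_ps, extra_ptl_um

n2, n3, residual_ps, extra_ptl_um = plan_delay(30.0)
print(n2, n3, round(residual_ps, 2), round(extra_ptl_um, 1))
```

For a 30-ps target, this sketch selects two 2-JJ JTLs and one 3-JJ JTL (29.4 ps) and leaves about 0.6 ps, which corresponds to roughly 80 μm of additional PTL. Larger residuals would call for impractically long PTLs, which is consistent with reserving PTL tuning for adjustments much smaller than 4 ps.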
The PTL used in our circuits is a 2-Ω, 34-μm-wide microstrip line. We used 34 PTLs in the 2 × 2 switch. The minimum and the maximum PTL lengths used in the switch were 20 μm and 611 μm, respectively. We used 18 PTL crossings. The maximum number of crossings per PTL was four. We did not use shields to isolate the crossing PTLs. According to our preliminary experiments, there was no serious degradation of the driver and receiver bias margins for up to four shield-less PTL crossings.
B. 4 × 4 Switch With Block-to-Block Passive Interconnection
We designed the 4 × 4 switch by connecting four 2 × 2 switch blocks using six PTLs. The signal timings were adjusted by inserting JTLs in series with the PTLs to make the timing margin of the DFFs (D_1 and D_2 in Fig. 3) in the right-hand 2 × 2 switch blocks (2 × 2_UR and 2 × 2_LR in Fig. 2) as wide as possible at 40 GHz. As can be seen in Fig. 1, two PTL lengths are used to connect the 2 × 2 switch blocks. The shorter PTLs are 220 μm long, while the longer PTLs are 920 μm long. The difference in length, which is 700 μm, results in a difference in the interconnection delay of only about 6 ps. Timing adjustment was thus easily accomplished. Note that if we use JTL interconnection, the same 700-μm difference in length results in a difference in delay of about 75 ps. To cancel this difference, a long JTL chain, which involves 18 JJs, has to be added to the shorter interconnections. Doing so will significantly increase the number of JJs, powering current, circuit area, circuit delay, and jitter. This rough estimation explicitly shows one of the advantages of passive interconnection.
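The rough estimation above can be reproduced with a short calculation. The sketch below uses the 40-μm stage length and the 0.3-ps PTL delay per stage quoted earlier. The per-JJ JTL delay (half of the 8.3-ps delay of the 2-JJ JTL) and the one-JJ-per-stage rule are assumptions made here for illustration, and `delay_difference` is a hypothetical helper name.

```python
import math

STAGE_UM = 40.0            # length of one JTL stage (from the text)
PTL_PS_PER_STAGE = 0.3     # PTL delay over one stage length (from the text)
JTL_PS_PER_JJ = 8.3 / 2    # assumption: per-JJ delay inferred from the 2-JJ JTL

def delay_difference(length_um):
    """Estimate the delay of a PTL and of a JTL chain spanning length_um."""
    stages = length_um / STAGE_UM
    ptl_ps = stages * PTL_PS_PER_STAGE
    n_jj = math.ceil(stages)          # assumption: one JJ per 40-um stage
    jtl_ps = n_jj * JTL_PS_PER_JJ
    return ptl_ps, n_jj, jtl_ps

ptl_ps, n_jj, jtl_ps = delay_difference(920.0 - 220.0)
print(round(ptl_ps, 2), n_jj, round(jtl_ps, 1))
```

Under these assumptions the 700-μm difference corresponds to about 5 ps on a PTL, of the same order as the roughly 6 ps quoted, and to 18 JJs and about 75 ps on a JTL chain.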
C. On-Chip Testing Components
We employed the on-chip testing method [14] to verify the design of the 4 × 4 switch. The on-chip testing circuit consists of eight 8-bit input SRs, four 8-bit output SRs, and an HFCG, as shown in Fig. 2. The HFCG consists of splitters and confluence buffers [15], and it generates 14 high-frequency clock pulses when the trigger signal trig is input. The SRs and the HFCG were designed simply by directly connecting library cells.
D. Chip Design
We designed the chip by placing the 4 × 4 switch, the SRs, and the HFCG, and by connecting them using PTLs. The clock output port and four signal output ports of each input SR were connected to corresponding input ports of the switch, using five 2.56-mm-long PTLs. The clock input port and two signal input ports of each output SR were also connected to the corresponding output ports of the switch, using three 2.60-mm-long PTLs. We also used a 0.44-mm-long PTL and two 1.30-mm-long PTLs to supply clock pulses from the HFCG to the input SRs. Since we used the same-length PTLs for paths that required the same delay, timing did not need to be adjusted.
Compared to that of JTL interconnection, the delay time of PTL interconnection is much less sensitive to how it is drawn or to its length. This is because PTL delay per unit length is an order of magnitude shorter than that of JTL. Additionally, since the number of JJs and current are independent of PTL length, less attention needs to be paid to the length of the interconnections. The use of passive interconnection thus makes floor-planning in hierarchical design of large-scale SFQ circuits much easier than in circuit design using JTL interconnection.
III. EXPERIMENTAL RESULTS
The chip was fabricated using NEC's standard Nb process [16]. The critical current density of the junction was 2.5 kA/cm². The 4 × 4 switch has 16 routing operations. By using the on-chip testing method, we confirmed the switch's correct 16 routings at 40 GHz. Fig. 5 shows the waveform of the test. The chip had a bias margin of 4.5% at 40 GHz. Fig. 6 shows the frequency dependence of the bias margin for the 4 × 4 switch. The clock frequency was estimated by simulating the HFCG. This result shows the validity of our design. The bias margins at high frequencies were limited by the timing error of the NOT cell. The timing margin of the NOT cell is 4 ps at 40 GHz, which is by far the smallest in the circuit. Therefore, by improving the timing parameters of the NOT cell, we will be able to increase the switch's bias margin.
IV. ANALYSIS
It is important to verify what kind of and how much advantage the passive interconnection has over conventional JTL interconnection. We therefore compared the 4 × 4 switch with an identical switch that was implemented using JTL interconnection. The JTL-version switch was designed using an automatic placing and routing tool [12]. As shown in Table I, in the PTL-version switch, the number of JJs was reduced to 50% of that in the JTL-version switch. The use of PTLs will thus improve circuit yield. It is also expected that the circuit area will be reduced just by increasing the number of wiring layers for PTL interconnections [9], [10]. The use of PTLs also reduces the powering current to 45% of that of the JTL version. The effect of the magnetic field induced by the powering current, which will become a serious problem as circuit scale increases [17], is thus reduced. The latency in the PTL version is 36% of that of the JTL version, so the use of passive interconnection increases operation speed.
The critical factors that limit the high-speed, low-bit-error-rate operation of SFQ circuits are timing jitter due to thermal noise and deviation of timing due to process variation. The largest jitter or the largest timing deviation occurs at the logic cell for which the sum of the numbers of JJs in the clock path and the data path connected to its input ports is largest in the circuit. Fig. 7 shows one such "critical" logic cell, a DFF (D_2 in 2 × 2_UR), in the PTL-version 4 × 4 switch. Note that D_2 in 2 × 2_LR is also a critical cell due to the circuit symmetry. Here, we define N as the number of JJs involved in the clock and data paths connected to the input ports of the critical cell. As shown in Table I, in the PTL version, N is 37% of that of the JTL version. Both the maximum jitter and the maximum timing deviation in the PTL-version switch are roughly estimated to be 61% of those of the JTL version, since both of them are in proportion to the square root of N. The jitter per JJ is 0.09 ps in our standard process [18]. The jitter at the critical cell is thus roughly estimated to be 1.4 ps for the JTL version. If we assume that the effect of reflection and resonance in the passive interconnection on signal timings is negligible, the jitter at the critical cell in the PTL version is roughly estimated to be 0.9 ps.
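The square-root scaling used in this estimate can be checked numerically. Writing N for the number of JJs in the clock and data paths of the critical cell, the sketch below assumes that the jitter equals the per-JJ jitter times the square root of N, as the text states. The function names are hypothetical, and the JJ count recovered from the 1.4-ps figure is an inference made here; it is not a value reported in Table I.

```python
import math

JITTER_PER_JJ_PS = 0.09    # jitter per JJ (from the text)

def jitter_ps(n_jj):
    """Jitter accumulated over n_jj junctions, assuming sqrt(N) scaling."""
    return JITTER_PER_JJ_PS * math.sqrt(n_jj)

def jj_count_for(jitter):
    """Inverse of jitter_ps: JJ count implied by a given jitter (inferred)."""
    return (jitter / JITTER_PER_JJ_PS) ** 2

ratio = math.sqrt(0.37)        # PTL/JTL jitter ratio when N falls to 37%
n_jtl = jj_count_for(1.4)      # JJ count implied by the 1.4-ps estimate
ptl_jitter = 1.4 * ratio       # PTL-version jitter from the JTL estimate

print(round(ratio, 2), round(n_jtl), round(ptl_jitter, 2))
```

The ratio evaluates to about 0.61, matching the 61% figure, and scaling the 1.4-ps estimate gives about 0.85 ps, which is consistent with the rough 0.9-ps estimate for the PTL version.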
V. CONCLUSION
We designed and experimentally tested a 4 × 4 switch in which all interconnections were implemented using PTLs. In the switch, the number of JJs, current, latency, and jitter were reduced to 50%, 45%, 36%, and 61%, respectively, of those of the JTL version of the switch. These significant reductions clearly show the effectiveness of passive interconnection. The PTL-version switch operated at 40 GHz, which demonstrated that our developed cells and design method for passive interconnection are applicable to practical SFQ logic circuits.
