Abstract-The authors describe a signal distribution network for sequential systems constructed using the Quantum-dot cellular automata (QCA) computing paradigm. This network promises to enable the construction of arbitrarily complex QCA sequential systems in which all wire crossings are performed using nearest neighbor interactions, which will improve the thermal behavior of QCA systems as well as their resistance to stray charge and fabrication imperfections. The new sequential signal distribution network is demonstrated by the complete design and simulation of a two-bit counter, a three-bit counter, and a pattern detection circuit.
Fig. 1. The QCA geometry used in this paper. Each cell is composed of four quantum dots placed at the corners of a square, and the distance from each dot to the center of the square is 20 nm. Tunneling is allowed between each pair of adjacent quantum dots. Two electrons occupy each cell, and the distance between the centers of the adjacent cells is 60 nm. [After reference 7.]
The tunneling among the four dots can be controlled by the height of the tunneling barriers between each pair of dots, and this can in turn be controlled by the voltage applied to a nearby metal lead under the plane of the QCA cells. When the barriers are raised very high, the cell is in a "locked" state, meaning the electrons are not allowed to move among the dots and are effectively isolated on their current dot. When the barriers are lowered significantly, the cell is in a "relaxed" state, meaning that the electrons are largely free to spread out evenly among the four dots. By quasi-adiabatically switching from a relaxed state to a locked state and then back to a relaxed state, the system is able to respond to inputs applied in a specific manner. Allowing different groups of cells to pass through this cycle at different times allows data to propagate through the system in a predetermined manner [6].
As shown in Fig. 2, four different clock signals (all repeating the same pattern, but offset from each other in time) can be applied to four different regions of the system in order to allow data to flow continuously through a QCA system [8], [9]. In addition to controlling the directionality of the data flow, the use of quasi-adiabatic switching and clocking regions also allows each QCA cell to store one bit of data during its locked phase [10]. This is equivalent to the functionality of a D flip-flop. Combined with the complete set of combinational logic described above, this memory functionality further enables QCA systems to implement generalized sequential logic functionality as well.
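As a rough illustration of the four-phase clocking described above, the following Python sketch models a wire as a chain of clocking zones, each lagging its left neighbor by one time step. The phase labels "relaxing" and "relaxed", the function names, and the rule that a zone copies its left neighbor while it is locking are simplifying assumptions made here, not details taken from the references.

```python
# Phase labels: "locking" and "locked" follow the text; "relaxing" and
# "relaxed" are names assumed here for the remaining two phases.
PHASES = ("relaxed", "locking", "locked", "relaxing")


def zone_phase(zone, t):
    """Phase of clocking zone `zone` at time step `t`.

    Every zone repeats the same four-phase pattern; zone k lags zone k-1
    by one time step.
    """
    return PHASES[(t - zone) % 4]


def propagate(bits, n_zones):
    """Push a serial bit stream through a wire of `n_zones` clocking zones.

    Assumed behavior: a zone copies its left neighbor (which is locked at
    that moment) while it is locking, keeps the value while locked and
    relaxing, and forgets it once relaxed. Returns (time, value) pairs
    seen at the last zone whenever it is locked with valid data.
    """
    held = [None] * n_zones
    pending = list(bits)
    seen = []
    t = 0
    while len(seen) < len(bits):
        for z in range(n_zones):
            phase = zone_phase(z, t)
            if phase == "locking":
                if z == 0:
                    held[z] = pending.pop(0) if pending else None
                else:
                    held[z] = held[z - 1]
            elif phase == "relaxed":
                held[z] = None
        last = n_zones - 1
        if zone_phase(last, t) == "locked" and held[last] is not None:
            seen.append((t, held[last]))
        t += 1
    return seen


if __name__ == "__main__":
    for step, value in propagate([1, 0, 1], n_zones=4):
        print(step, value)
```

Under these assumptions, a stream fed into a four-zone wire emerges at the last zone in its original order, one bit every four time steps, which mirrors the continuous data flow that the four offset clock signals are meant to provide.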
One challenge facing the QCA paradigm is that it is particularly difficult to cross two lines of cells without allowing the signals to interact. Because QCA systems are coplanar, and because their interactions are based on physical proximity, it is very difficult to pass signals through each other without interference. One solution to this problem, relying on rotated cells and next-near neighbor interactions, was proposed in [5]. Unfortunately, it has since been shown that this wire crossing negatively impacts the excitation energy between the ground state and the first excited state, which has been shown to degrade the overall system's thermal behavior and its resistance to fabrication imperfections and stray charge [11] [12] [13] [14] [15]. A great deal of research has been performed to investigate methods to minimize wire crossings [16] [17] [18] [19] and to use multiple parallel crossings to strengthen the interaction of the next-near neighbor wire crossing described above [20], [21]. One solution even allows an arbitrary set of wire crossings, although it requires the use of several additional clock signals [22].
Fig. 2. Data propagates from one region (in the "locked" phase) to its neighbor (in the "locking" phase). Once a region has propagated its information, it relaxes in preparation for receiving its next input. A typical region of cells will repetitively cycle through these four phases, allowing data to be transferred in the desired direction as indicated by the arrows. [After reference 7.]
II. IMPROVING THE COMBINATIONAL SIGNAL DISTRIBUTION NETWORK
One method for implementing a generalized set of wire crossings (referred to as a "signal distribution network") was presented in [7] and [23]. This network allows a set of N inputs to be arbitrarily duplicated and distributed to the inputs of a combinational logic system. Such a signal distribution network (SDN) for combinational systems is shown in Fig. 3, where it is used to implement the wire crossings necessary for a one-bit full adder.
While the SDN presented in [7] offers the ability to perform any necessary number of wire crossings while only relying on near-neighbor interactions, it does require a relatively large amount of surface area to perform the signal distribution. As shown in Fig. 3, the three parallel vertical wires and the horizontal wires that connect them take up approximately as much surface area as the combinational logic needed to perform the addition.
It turns out that this may not be strictly necessary. The multiple horizontal lines connecting between the leftmost and rightmost vertical lines are redundant with each other, an artifact of the programmable array of logic technology that was used as an initial model for this network. Furthermore, the multiple vertical wires are only needed to place the correct values into the right-moving data pipeline. As shown in Fig. 4, which is functionally identical to Fig. 3, the same result can be achieved with only one vertical wire that is driven by a short input wire that receives all the results in parallel. Making this small change to the SDN reduces the size of the overall network by almost half while removing dozens of redundant cells. It also illustrates that the key to the operation of the SDN is the serial transmission of the data values to the input of the vertical line.
Fig. 5 shows a schematic representation of a generalized sequential digital system. Such a system is distinguished from a combinational system in that it contains a "current state" that is a function of all previous inputs to the system. This current state is routed back to the next-state decoder (NSD), which determines the new value of the current state. In addition, the current state may need to be distributed to multiple inputs of an output decoder, which can be used to assert one or more outputs when the system is in a particular state. Thus, the sequential system will require one or possibly two combinational SDNs, as well as finding a way to perform the wire crossings between the current state and the outputs of the system. These wire crossings (15 for the case shown here) will occur in a very regular pattern that suggests a common solution.
III. A SDN FOR SEQUENTIAL SYSTEMS
Thus, a sequential system requires one or two signal distribution blocks and one fixed wire crossing block. Fig. 6 shows an alternative implementation of the functionality present in Fig. 5 that addresses all three of these needs simultaneously. In this case, there is no need for an explicit "current state" block, since each QCA cell can provide the functionality of a D flip-flop. Instead, the signals leaving the NSD in parallel pass through a series of delay blocks (composed of four QCA cells each), and they are shifted vertically upward. This serial data stream is then delivered to the inputs of the NSD as well as to the output decoder, if present. If the current state of the system is to serve directly as the output (as in the case of a counter, for example), then the output (current state) can simply be read at the location indicated.
Fig. 6. A schematic representation of a sequential SDN used to perform all the necessary wire crossings for a sequential QCA system. Since each QCA cell acts as a D flip-flop, there is no need for an explicit "current state" block. Data leaving the NSD works its way vertically along a series of delay blocks (implemented by clocking regions). Once they reach the top, they are transmitted to the inputs of the NSD and the output decoder. As before, the first data to arrive must be delayed until the later data has also propagated so that they arrive at the combinational logic gates simultaneously.
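The alignment requirement stated for the sequential SDN (signals that arrive early must wait for those that arrive late) can be sketched as a small calculation. The function name, the signal names, and the arrival times below are hypothetical values chosen only for illustration; they are not read from Fig. 6.

```python
def alignment_delays(arrival_cycle):
    """Extra delay each signal needs so that all signals line up.

    `arrival_cycle` maps a signal name to the clock cycle at which that
    signal would reach the logic gates if it were not slowed down.
    The latest arrival sets the common arrival time.
    """
    latest = max(arrival_cycle.values())
    return {name: latest - cycle for name, cycle in arrival_cycle.items()}


if __name__ == "__main__":
    # Hypothetical arrival times, spaced four cycles apart.
    arrivals = {"Q_A": 0, "Q_B": 4, "Q_C": 8}
    print(alignment_delays(arrivals))
```

In a layout, such delays would be realized by the choice of clocking regions along each horizontal line rather than by any explicit computation.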
This "Sequential Signal Distribution Network" (SSDN) can be used to significantly simplify the distribution of signals in sequential systems. The next three sections of this paper illustrate specific examples of sequential systems constructed using this method.
IV. SEQUENTIAL SDNS TO IMPLEMENT COUNTER CIRCUITS
In previous sections of this paper, the basic block diagram and data flow of sequential systems have been described. In this section, an implementation of the SSDN will be shown and described in detail. In order to exhibit the full functionality of the SSDN, only complete examples will be examined. By using full examples of sequential systems, it can be demonstrated that the SSDN can handle multiple wire crossings, which pass the current state to the output decoder as well as routing it back to the NSD.
A counter circuit, one of the most common and basic sequential systems, has been chosen to exhibit the implementation of the SSDN. Counter circuits can be seen in a wide variety of applications and are also easily scalable for high volume calculation tests. The counter circuit maintains the current state of the system, distributes those signals as necessary to the NSD, and forwards the current state to an optional output decoder. The schematic representation of a two-bit counter circuit can be seen in Fig. 7. The sequential system contains two wire crossings used to distribute the signals to the NSD as well as a wire crossing needed to route the current-state signals back to the inputs of the NSD.
The QCA implementation of the schematic from Fig. 7 can be seen in Fig. 8. Here, it can be observed that the three explicit wire crossings mentioned above are no longer explicitly needed. Instead, the clocked QCA cells are positioned and clocked in such a way that the signal is transmitted serially along the top and left side of the system. This new functionality of the SSDN is made possible by the addition of two new clock phases, shown as orange and purple in the online version of the figure. This addition was required in order to pass the data vertically and horizontally in an alternating pattern. The timing of asserting these new clock phases coincides with the timing of the yellow clock phase. However, only one of the orange or purple phases is ever asserted at a time. For example, in one clock cycle the purple and yellow will be asserted at the same time, then in the next cycle, orange and yellow will be asserted. This ensures that the data is driven by the green cells and is then passed either horizontally or vertically through each intersection. The alignment of the yellow, orange and purple clock phases maintains the flow of data through the system.
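The alternation between the purple and orange phases can be sketched as a clock schedule generator. This is a minimal sketch, not the authors' clocking hardware: the function name is hypothetical, the reset signal is omitted, and the numbering (red on cycle 1, so that yellow falls on every fourth cycle) is an assumption made for the illustration.

```python
from itertools import cycle

# The four conventional pipeline phases, in the order data moves through them.
BASE_PHASES = ("red", "blue", "green", "yellow")


def clock_schedule(n_cycles, crossing_pattern=("purple", "orange")):
    """Return the set of asserted clock signals for cycles 1..n_cycles.

    Assumption: cycle 1 is a red cycle, so yellow is asserted on cycles
    4, 8, 12, ... Each time yellow is asserted, exactly one crossing
    phase (taken in turn from `crossing_pattern`) is asserted with it.
    """
    crossing = cycle(crossing_pattern)
    schedule = []
    for c in range(1, n_cycles + 1):
        base = BASE_PHASES[(c - 1) % 4]
        asserted = {base}
        if base == "yellow":
            asserted.add(next(crossing))
        schedule.append(asserted)
    return schedule


if __name__ == "__main__":
    for number, asserted in enumerate(clock_schedule(8), start=1):
        print(number, sorted(asserted))
```

With the default pattern, purple accompanies yellow on one pass and orange on the next, and the two crossing phases are never asserted together.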
The purpose of a sequential system is to use previous outputs, or current states, as the next inputs into the system. However, realistic sequential systems have user inputs such as a start, stop, or reset. In order to emulate a realistic system, a reset input was implemented. Fig. 8 shows two grey reset cells. At the beginning of the simulation, the reset signal will place the counter into a known state ("00" in this case). It is important to note that this extra clock signal is not strictly necessary, but is only included to allow the system to be initialized. Thus, the true number of clock signals required for this system is four (as with all pipelined QCA systems) plus two (the orange and purple signals described above). Adding an SSDN to a QCA network will increase the number of clock signals required from four to six.
Using the clock cycle chart in Fig. 9, the flow of the data through the sequential system depicted in Fig. 8 can be examined. The counter is first put into a known state of "00" from the assertion of the grey, or reset phase. The data from Q A then flows up and around, through the distribution network using the blue, green, and yellow cells, in that order. At the same time, data from Q B moves vertically through its own blue and green cells. However, instead of entering a yellow cell, the data flows into the purple cells. This is because the purple and yellow clock cycles are both asserted at cycle four. The red phase is then asserted, acquiring the data driven by the purple cells while disregarding the bogus data in the relaxed orange cells. This is the first instance of passing the correct data through the wire crossing.
The data from Q B is then quickly passed along through another cycle of blue, green, and yellow while the Q A data moves in much smaller steps, often only one cell at a time. By phase number eight, both sets of data have passed through the NSD and are ready to be output. Again, the utilization of the added clock phases allows the data to be passed through the wire crossing. In clock cycle number eight both the orange and yellow phases are asserted, locking the Q A data and Q B data into their respectively colored cells, waiting for the red phase to lock. The red cells for output Q A are only driven by the valid data locked in the orange cells, and not the invalid data in the adjacent purple cells. Output Q B is also determined only by the locked adjacent yellow cells. Thus, after nine clock cycles, the first valid output, "01," is seen.
Fig. 10. Simulation results of the system shown in Fig. 8 when the clock signals of Fig. 9 are applied. Notice that the RESET signal is applied to the cells in the gray region in the first clock cycle, which causes Q A and Q B to go to zero. Over the next 34 clock cycles, those two signals progress through the sequence 00, 01, 10, 11, and back to 00. It is important to note that the outputs are only valid at the times labelled. It is also possible to observe how the cells labelled "vertical" and "horizontal" alternate control of the "intersection" cell, allowing data to flow in both directions.
Fig. 11. A schematic representation of the digital logic required to implement a three-bit (modulo-8) counter. Note that six wire crossings are required to distribute the signals to the NSD, and three more are required to feed the current state back to the inputs of the NSD. These nine wire crossings will be implemented using the sequential SDN.
The process described above is repeated, but without the use of the reset. This causes the next output, "10" to appear eight cycles later, on clock cycle 17. The sequential system continues this pattern through 25 total cycles, counting from "00" to "11."
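The timing of the valid outputs described above can be reproduced with a small behavioral model. The next-state equations used here are the textbook equations for a two-bit binary up-counter with Q A as the more significant bit; they are an assumption, since the gate-level contents of the NSD are not spelled out in the text, and the function names are hypothetical.

```python
def next_state(q_a, q_b):
    """Assumed next-state logic of a two-bit up-counter (Q_A is the MSB)."""
    d_a = q_a ^ q_b      # the upper bit toggles when the lower bit is 1
    d_b = 1 - q_b        # the lower bit toggles on every update
    return d_a, d_b


def labelled_outputs(n_labels, period=8, reset_cycle=1):
    """List (clock cycle, "Q_A Q_B") for each valid output.

    The counter is reset to "00" on `reset_cycle`, and a new valid
    output appears every `period` clock cycles afterwards.
    """
    q_a, q_b = 0, 0
    labels = []
    for k in range(n_labels):
        labels.append((reset_cycle + k * period, f"{q_a}{q_b}"))
        q_a, q_b = next_state(q_a, q_b)
    return labels


if __name__ == "__main__":
    for clock_cycle, value in labelled_outputs(4):
        print(clock_cycle, value)
```

With the reset on cycle 1 and an eight-cycle loop, this model places "01" on cycle 9, "10" on cycle 17, and "11" on cycle 25, matching the sequence described in the text.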
Applying the clock signals depicted in Fig. 9 to the two-bit counter of Fig. 8 produces the simulation results shown in Fig. 10. These results were obtained by performing a self-consistent simulation of the array of QCA cells, applying the Intercellular Hartree Approximation. Each cell was modeled using the time-independent Schrödinger equation and second-quantization operators. The ground state of each cell was calculated multiple times until the entire array of cells had reached a self-consistent ground state. The expectation value of the charge density on each site was then calculated, and this yielded the polarization of each cell, as shown in Fig. 10. The same method was used to generate the results in Figs. 14 and 19.
Fig. 10 displays the reset signal, the state of the vertical (purple) cells, the state of the horizontal (orange) cells, the state of the intersection cell, and the counter outputs. Beginning from the top, it can be seen that the reset signal is only asserted at the beginning of the sequence in order to start the counter into a known state.
Fig. 12. An implementation of the three-bit counter from Fig. 11 using QCA cells and a sequential SDN. Again notice the use of two special clock phases to implement the serial wire crossings. It is now more evident that lines carrying Q A must be designed to transmit that value more slowly, whereas Q C moves the data as quickly as possible toward the beginning of the combinational logic.
Also very noticeable is that the vertical and horizontal signals are mutually exclusive. This is due to the fact that there is only valid data passing in one direction every fourth cycle. Although the purple and orange signals are only locked alternately every four cycles, the intersection signal locks every four cycles because it is being driven by either the vertical or horizontal data. Further examination confirms this because the value of the intersection signal is always the previous value of the vertical or horizontal signals, whichever one was asserted last.
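The relationship between the vertical, horizontal, and intersection signals can be expressed as a short behavioral sketch. The representation (one entry per crossing slot, with `None` standing for a relaxed neighbor) and the function name are assumptions made for illustration only.

```python
def intersection_values(vertical, horizontal):
    """Value latched by the intersection cell in each crossing slot.

    `vertical` and `horizontal` list, slot by slot, the value locked in
    the purple (vertical) and orange (horizontal) neighbors, or None
    when that neighbor is relaxed. Exactly one of the two must be
    locked in every slot, reflecting their mutual exclusivity.
    """
    latched = []
    for v, h in zip(vertical, horizontal):
        if (v is None) == (h is None):
            raise ValueError("exactly one neighbor must be locked per slot")
        latched.append(v if v is not None else h)
    return latched


if __name__ == "__main__":
    vertical = [1, None, 0, None]
    horizontal = [None, 0, None, 1]
    print(intersection_values(vertical, horizontal))
```

The intersection cell therefore updates in every slot, even though each of its two drivers is active only in alternate slots.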
Finally, the output of the two-bit counter can be seen in signals Q A and Q B. The Q A output signal matches the intersection signal because they are both located in the same row of red cells. The output is first driven by the reset, which puts the counter into "00." Eight cycles later, the first legitimate output of "01" is displayed. It is important to note that the outputs are only valid once every eight clock cycles. Although they lock every four clock cycles, the locked results for Q A and Q B are invalid for cycles in which no label appears in Fig. 10. The process continues, outputting "10" during cycle 17, "11" during cycle 25 and finally rolling back to "00" on cycle 35. These simulation results confirm that a QCA two-bit counter sequential system can be constructed using the implementation of the proposed SSDN.
Fig. 14. Simulation results of the system shown in Fig. 12 when the clock signals of Fig. 13 are applied. Over the course of 98 clock cycles, the output signals Q A, Q B, and Q C cycle through the sequence 000, 001, 010, 011, 100, 101, 110, 111, and back to 000. It is important to note that the outputs are only valid at the times labelled. The pattern of one horizontal data pulse followed by two vertical data pulses can also be observed in this figure.
Fig. 15. State diagram for detection of the pattern 11010. Labels along each transition arrow show the input signal that will cause the system to move along that arrow, followed by the output signal that will result from the current state and that input being applied. Note that the only time the output will be asserted is when the system is in state S 4 and receives a 0 input.
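The idea of iterating until the whole array is self-consistent, as used for the simulation results in this section, can be illustrated with a much simpler stand-in model. This sketch does not solve the Schrödinger equation for each cell as the paper does; instead it substitutes a saturating response of the form x / sqrt(1 + x^2), a common two-state simplification. The coupling ratio, tolerance, nearest-neighbor-only coupling, and all names are assumptions chosen for illustration.

```python
import math


def cell_response(drive, ratio):
    """Simplified steady-state polarization of one cell.

    `drive` is the sum of the neighboring polarizations and `ratio` is a
    hypothetical coupling-to-tunneling strength. This saturating curve
    replaces the full per-cell quantum calculation used in the paper.
    """
    x = ratio * drive
    return x / math.sqrt(1.0 + x * x)


def relax_wire(n_cells, driver=1.0, ratio=5.0, tol=1e-9, max_sweeps=1000):
    """Iterate a line of cells until their polarizations stop changing.

    Each sweep recomputes every cell from the current polarizations of
    its nearest neighbors, which is the self-consistency loop in
    miniature. Returns the polarizations and the number of sweeps used.
    """
    p = [0.0] * n_cells
    for sweep in range(1, max_sweeps + 1):
        biggest_change = 0.0
        for i in range(n_cells):
            left = driver if i == 0 else p[i - 1]
            right = p[i + 1] if i + 1 < n_cells else 0.0
            new = cell_response(left + right, ratio)
            biggest_change = max(biggest_change, abs(new - p[i]))
            p[i] = new
        if biggest_change < tol:
            return p, sweep
    raise RuntimeError("polarizations did not become self-consistent")


if __name__ == "__main__":
    polarizations, sweeps = relax_wire(5)
    print(sweeps, [round(v, 3) for v in polarizations])
```

In this toy model a fully polarized driver pulls every cell of a short wire to nearly the same polarization within a few sweeps; the full simulation applies the same stopping criterion to the quantum ground state of each cell.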
As previously mentioned, counter circuits work well for examples because of their scalability. In order to examine how the SSDN performs on a larger scale, it was decided to expand the previous example from a two-bit counter to a three-bit counter. Fig. 11 shows the schematic representation of the three-bit counter that will be implemented with the SSDN. It can be seen that the three-bit counter contains six wire crossings used to distribute the signals to the NSD as well as three wire crossings needed to route the current-state signals back to the inputs of the NSD.
Fig. 12 displays the QCA implementation of the three-bit counter circuit depicted in Fig. 11. The QCA implementation of the NSD has become visibly more complex, accounting for the increased number of inputs and outputs to the sequential system. However, the same data flow principles that were utilized in the two-bit counter can be seen at work here.
The current states of Q A, Q B, and Q C are successively transmitted vertically, passing through the three wire crossings with the use of the newly created purple and orange clock phases. These signals are then passed serially along the top and down the left side of the diagram. To the left side of the diagram there are labels depicting where each of the current state signals enter the NSD. As with the two-bit counter, the Q A signal must slowly propagate for four clock cycles in order for the Q B signal to catch up. Then, both the Q A and Q B signals (or the logical combination of the two) must propagate for four more clock cycles, allowing all three signals to align before continuing through the decoding logic. The next-state outputs are then passed horizontally through the wire crossing, labeled "intersection" in Fig. 12, to their respective outputs.
Fig. 13 depicts the sequence of the applied clock signals to the corresponding regions in Fig. 12. The assertion of the reset signal at the beginning of the sequence puts the sequential system into a known state of "000." The data is then propagated through the system, flowing through the red, blue, green, and yellow regions in order. However, exceptions to this data progression occur when a signal must pass through a wire crossing.
As in the two-bit counter example, the addition of the orange and purple signals makes this wire crossing possible. However, instead of the purple and orange regions being asserted alternately every four cycles as described in the two-bit counter, a pattern of purple-purple-orange can be seen instead. This is because Q C must pass through two wire crossings, utilizing the vertical purple regions, before it can reach the NSD. Once it reaches the NSD, it is then quickly propagated through the required logic, after which the outputs are passed horizontally through the orange regions. It is this behavior that results in two purple region assertions for every one orange region assertion. The simulation data depicted in Fig. 14 verifies that there are two vertical assertions to every horizontal assertion as well as the correct outputs of the QCA implemented three-bit counter circuit.
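The purple-purple-orange pattern and its effect on the length of the feedback loop can be checked with a short sketch. It assumes, as in the two-bit example, that one crossing pulse is issued on every fourth clock cycle; the resulting 12-cycle loop for the three-bit counter is an inference from that assumption rather than a figure quoted in the text. Function names are hypothetical.

```python
from collections import Counter


def crossing_pulses(pattern, n_cycles):
    """List (clock cycle, color) for each crossing pulse up to n_cycles.

    Assumption: one crossing pulse accompanies every fourth clock cycle
    (cycles 4, 8, 12, ...), and the colors repeat in the order given by
    `pattern`.
    """
    pulses = []
    for slot, clock_cycle in enumerate(range(4, n_cycles + 1, 4)):
        pulses.append((clock_cycle, pattern[slot % len(pattern)]))
    return pulses


def loop_period(pattern):
    """Clock cycles needed for one full pass through the pattern."""
    return 4 * len(pattern)


if __name__ == "__main__":
    three_bit = ("purple", "purple", "orange")
    pulses = crossing_pulses(three_bit, 24)
    print(pulses)
    print(Counter(color for _, color in pulses))
    print(loop_period(("purple", "orange")), loop_period(three_bit))
```

Over any whole number of loops, the count of purple pulses is twice the count of orange pulses, and the loop lengthens from 8 to 12 cycles, which is consistent with the eight state changes spread over roughly 98 cycles in Fig. 14.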
V. PATTERN DETECTOR USING SEQUENTIAL SDN
The SSDN can also be used to construct a pattern detector, which monitors a serial bit stream for a particular binary sequence. This example is particularly interesting because it illustrates a Mealy machine, in contrast with the Moore machines described in the previous examples. In computing theory, a Moore machine is a sequential system where the output is determined exclusively by the state of the machine. A Mealy machine, on the other hand, is a sequential system where the output is determined by both the state of the machine and the input signal.
The chosen example is a pattern detector designed to detect the pattern "11010." The system has a single input line x and a single output line z. The output value is always a logic "0" except when the last 5 bits of the serial input stream have been "11010."
The logic design for this system results in the state diagram shown in Fig. 15 . As can be seen in the figure, five distinct states [S 0 − S 4 ] are needed for this implementation. A three-bit binary number is required to represent these five states.
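A behavioral sketch of such a detector is given below. The output rule (assert z only in state S4 on a 0 input) comes from the state diagram; the transition table itself is the standard construction for recognizing "11010" in a serial stream and is an assumption, since the individual arrows of Fig. 15 are not reproduced in the text. All names are hypothetical.

```python
import math

# Assumed transition table: state -> (next state if x == 0, next state if x == 1).
# S0: nothing matched, S1: "1", S2: "11", S3: "110", S4: "1101".
TRANSITIONS = {
    "S0": ("S0", "S1"),
    "S1": ("S0", "S2"),
    "S2": ("S3", "S2"),
    "S3": ("S0", "S4"),
    "S4": ("S0", "S2"),
}

# Number of state variables needed to encode the five states.
STATE_BITS = math.ceil(math.log2(len(TRANSITIONS)))


def detect(bits):
    """Return the output z for every input bit x of a serial stream.

    This is a Mealy machine: z depends on both the current state and
    the current input, and is 1 only in state S4 when x is 0.
    """
    state = "S0"
    outputs = []
    for x in bits:
        outputs.append(1 if (state == "S4" and x == 0) else 0)
        state = TRANSITIONS[state][x]
    return outputs


if __name__ == "__main__":
    stream = [1, 1, 0, 1, 1, 0, 1, 0]
    print(STATE_BITS, detect(stream))
```

Because the output is computed from the state and the input together, z is asserted in the same step in which the final 0 of the pattern arrives, which is what distinguishes this Mealy design from the Moore-style counters.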
The three state variables used to store the state information are labeled Q A , Q B , and Q C . Using the single output design methodology [24] , the NSD equations were minimized in the sum of products format as follows:
where D α denotes the future/next-state value of the state variable Q α . The minimized logic equation for the output of the system was similarly found to be:
The logic circuit diagram corresponding to the pattern detector is shown in Fig. 16. The circuit includes three D flip-flops to store the state variables, 15 combinational logic components used to implement the NSD, and two combinational logic components used to implement the output decoder.
An implementation of the pattern detector from Fig. 16 using QCA cells and a sequential SDN is shown in Fig. 17. Initially, the present state value is determined by the three "Reset" cells associated with each of the three state variables Q A, Q B, and Q C. In this particular example, only Q A is used, along with the input signal x, in the output decoder. However, all three state variables are used, along with the input signal x, to determine the next-state value. For this purpose, the values of these variables all propagate through the rightmost vertical line starting in the same clock phase. These values then complete the loop by moving leftward along the top horizontal line, and then copies of the signals are distributed as needed by twelve horizontal lines. Similarly, the input signal x propagates along the same path where five copies of it are distributed to help determine the next-state value.
Fig. 18. The clock signals applied to the regions shown in Fig. 17. The three purple pulses allow the data to move upward along the right edge of the NSD, while one of the orange pulses enables data to move horizontally to the right of the NSD. The second orange pulse is not strictly necessary, because the data is working its way through the NSD. It is included here to ensure that the "intersection" cells are in a well-defined state for the simulation.
Since four signals propagate along the same path, the pattern of clock phases is carefully selected to ensure that all four signals are routed to their destinations at the proper times. The input signal x, controlled by an external source signal, will appear on the far-left vertical line after two clock cycles, and it is never updated internally. The three state variables, on the other hand, complete a feedback loop and periodically update their own values every 20 clock cycles. It is therefore imperative that the values of the three state variables and the input signal reach the end of the loop simultaneously to update the state value. For example, Q A has a shorter time to reach the far-left vertical line than Q B, so its propagation is slowed down along the horizontal lines. Similarly, Q C has the longest path to the far-left vertical line but catches up by having the fastest propagation along the horizontal lines.
The clock signals shown in Fig. 18 illustrate the clocking pattern used to ensure the appropriate routing and propagation of all signals. The gray signal is only activated to reset the system in order to determine a specific initial state. It is then de-activated during the synchronous operation of the system. Once the system is initialized, a four cycle "normal" pattern is set for the signals to propagate to the blue cells in one clock cycle, followed by the green cells, then the yellow cells, and finally the red cells. The only exception to this pattern is the intersection lines where more sophisticated clocking is needed to handle the wire crossing. At these intersections, the yellow cells are replaced by either orange cells for the horizontal lines or purple cells for the vertical lines. The yellow clock signal is de-multiplexed between the purple cells and the orange cells.
The simulation results of the system are shown in Fig. 19. During the first two clock cycles, the system is reset, and an input value is applied. As was shown in Fig. 17, the state variables are updated once every 20 clock cycles. Therefore, it is reasonable to evaluate the output of the system only during the (20 · N + 2)th clock cycles where N is a positive integer. The results show that the only time an output of "1" was detected was at N = 6 in response to the input segment x [2:6] being "11010." This input sequence is highlighted by the dotted box shown in Fig. 19, which is immediately followed by an asserted output.
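The sampling rule above, in which the output is read only on clock cycles of the form 20 · N + 2, can be sketched as follows. The function names and the example trace are hypothetical; only the period of 20 cycles and the offset of 2 cycles come from the text.

```python
def sample_cycles(n_samples, period=20, offset=2):
    """Clock cycles on which the output is considered valid, N = 1..n_samples."""
    return [period * n + offset for n in range(1, n_samples + 1)]


def valid_outputs(trace, n_samples, period=20, offset=2):
    """Keep only the entries of `trace` that fall on valid sampling cycles.

    `trace` maps a clock cycle to the raw value seen on the output line;
    values at other cycles are ignored because the state has not yet
    been updated.
    """
    return [(c, trace[c]) for c in sample_cycles(n_samples, period, offset) if c in trace]


if __name__ == "__main__":
    print(sample_cycles(6))
    # Hypothetical raw trace: the entry at cycle 23 is not a sampling cycle.
    raw = {22: 0, 23: 1, 42: 0, 122: 1}
    print(valid_outputs(raw, 6))
```

For N = 6 the sampling cycle is 20 · 6 + 2 = 122.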
VI. CONCLUSION AND OBSERVATIONS
In this paper, the authors have introduced a specialized Sequential Signal Distribution Network (SSDN) that can be used to route and distribute the signals as needed in a generalized sequential system. This functionality has been demonstrated by the complete design and simulation of three example systems: a two-bit counter, a three-bit counter, and a sequence detector. The SSDN allows designers to apply a generalized strategy for implementing the wire crossings necessary to route signals from the current state to the required inputs of the NSD and the output decoder, if one is required.
The examples presented in this paper have been selected to illustrate the features of the SSDN, but the method is entirely general. There is nothing about the examples presented here that makes them particularly advantageous for the methods presented.
The SSDN is a general tool that can be used to implement any Mealy or Moore finite state machine. Designers who wish to incorporate this tool into their own designs only need to adjust the clock phases of the horizontal lines leading up to the majority logic gates in order to ensure that all input signals arrive at the majority logic gates simultaneously, as illustrated in the three examples presented in this paper.
Although this is a significant step forward in overcoming the wire crossing challenge, this solution still exhibits some undesirable characteristics. In particular, it requires 4·(N-1) clock cycles to route the signals to the appropriate inputs, where N is the number of bits in the current state. Another concern is that it is necessary to be able to control the clocking regions very precisely, with several regions containing only a single cell. In practice, this will be difficult to implement. A third concern is that this network requires six different clock signals (in addition to the Reset signal, which is only used to ensure a well-defined initial state). Since it is possible to implement combinational QCA systems with only four clock signals, this adds complexity to the clocking system required.
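The routing cost quoted above grows linearly with the number of state bits, which the following sketch tabulates. The function name is hypothetical, and the figure covers only the routing of the current state, not the time spent in the decoding logic.

```python
def routing_cycles(n_state_bits):
    """Clock cycles spent routing the current state: 4 * (N - 1)."""
    if n_state_bits < 1:
        raise ValueError("a sequential system needs at least one state bit")
    return 4 * (n_state_bits - 1)


if __name__ == "__main__":
    for n in (2, 3, 8):
        print(n, routing_cycles(n))
```

For the two-bit and three-bit state registers used in this paper, the expression gives 4 and 8 clock cycles, respectively.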
