Abstract: New convolution-based multiple-stream error-control coding and decoding schemes are introduced. The new coding method applies the reversibility property in the convolution-based encoder for multiple-stream error-control encoding and implements the reversibility property in the new reversible Viterbi decoding algorithm for multiple-stream error-correction decoding. The complete design of quantum circuits for the quantum realization of the new quantum Viterbi cell in the quantum domain is also introduced. In quantum mechanics, a closed system is an isolated system that can't exchange energy or matter with its surroundings and doesn't interact with other quantum systems. In contrast to open quantum systems, closed quantum systems obey the unitary evolution and thus they are reversible. Reversibility property in error-control coding can be important for the following main reasons: (1) reversibility is a basic requirement for low-power circuit design in future technologies such as in quantum computing (QC), (2) reversibility leads to super-speedy encoding/decoding operations because of the superposition and entanglement properties that emerge in the quantum computing systems that are naturally reversible and therefore very high performance is obtained, and (3) it is shown in this paper that the reversibility relationship between multiple-streams of data can be used for further correction of errors that are uncorrectable using the implemented decoding algorithm such as in the case of triple-errors that are uncorrectable using the classical irreversible Viterbi algorithm.
Introduction
Due to the anticipated failure of Moore's law around the year 2020, quantum computing (QC) will play an increasingly crucial role in building more compact and less power-consuming computers [4, 66, 67]. Due to this fact, and because all quantum computer gates (i.e., building blocks) should be reversible [4, 11, 48, 67, 72, 77, 92], reversible computing will have an increasingly important presence in the future design of regular, compact, and universal circuits and systems. (k, k) reversible circuits are circuits that have the same number of inputs (k) and outputs (k) and are one-to-one mappings between vectors of inputs and outputs; thus the vector of input states can always be uniquely reconstructed from the vector of output states. A (k, k) conservative circuit has the same number of inputs k and outputs k and has the same number of values (states) in inputs and outputs (e.g., the same number of ones and twos in inputs and outputs for ternary) [4, 67, 72]. The importance of the conservativeness property stems from the fact that this property reflects the physical law of energy preservation: no energy can be created or destroyed, but can be transformed from one form to another. Thus, conservative logic will incorporate the fundamental law of energy preservation into the logic design of circuits and systems.
Other motivations for pursuing the possibility of implementing circuits and systems using reversible logic (RL) and QC would include items such as: (1) power: the fact that, theoretically, the internal computations in RL systems consume no power. It is shown in [48] that the amount of energy (heat) dissipated for every irreversible bit operation is given by $K T \ln 2$, where $K = 1.3806505 \times 10^{-23}$ J·K$^{-1}$ is the Boltzmann constant and T is the operating temperature, and that a necessary (but not sufficient) condition for not dissipating power in any physical circuit is that all system circuits must be built using fully reversible logical components. Thus, reversible logic circuits are information-lossless. For this reason, different technologies have been studied to implement reversible logic in hardware such as in [4, 66, 67, 72, 77, 92]: bioinformatics, nanotechnology-based circuits and systems, adiabatic CMOS VLSI circuit design, optical systems, and quantum circuits. Fully reversible digital systems will greatly reduce the power consumption (theoretically eliminate it) through three conditions: (i) logical reversibility: the vector of input states can always be uniquely reconstructed from the vector of output states, (ii) physical reversibility: the physical switch operates backwards as well as forwards, and (iii) the use of "ideal-like" switches that have no parasitic resistances; (2) size: since the newly emerging quantum computing technology must be reversible [4, 11, 48, 66, 67, 92], the current trends related to more dense hardware implementations are heading towards 1 Angstrom (atomic size), at which quantum mechanical effects have to be accounted for; and (3) speed (performance): if the properties of superposition and entanglement of quantum mechanics can be usefully employed in the design of circuits and systems, significant computational speed enhancements can be expected [4, 67].
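To give a sense of scale for the dissipation bound quoted above from [48], the following minimal sketch evaluates K·T·ln(2) numerically. The function name `landauer_limit` and the choice of 300 K as an operating temperature are illustrative assumptions, not values taken from the paper.

```python
import math

BOLTZMANN_K = 1.3806505e-23  # J/K, the value quoted in the text


def landauer_limit(temperature_kelvin: float) -> float:
    """Heat (in joules) dissipated per irreversible bit operation: K * T * ln(2)."""
    return BOLTZMANN_K * temperature_kelvin * math.log(2)


# Assumed operating temperature of 300 K (roughly room temperature).
energy_300k = landauer_limit(300.0)
print(f"{energy_300k:.3e} J per irreversible bit operation at 300 K")
```

The bound scales linearly with temperature, so halving T halves the minimum dissipation per irreversible bit operation.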
Therefore, while in the classical (irreversible) systems the frequency-to-power ratio ( f /p), or equivalently power-to-frequency ratio (p/ f ), doesn't improve much after certain threshold (level) since the increase in frequency (i.e., more speed; better performance) leads to the increase in power consumption, this doesn't exist in the quantum domain; in the quantum system, speed of processing is very high (due to the properties of quantum superposition and entanglement) and power consumption is inversely very low, i.e., ( f /p) → ∞ or equivalently (p/ f ) → 0.
In general, in data communications between two communicating systems (nodes), noise exists and corrupts the sent data messages, and thus noisy corrupted messages will be received. The corrupting noise is usually sourced from the communication channel. Therefore, error correction of communicated data and reversible error correction of communicated batch of data (i.e., parallel data streams) are highly important tasks in situations where noise occurs. Many solutions have been classically implemented to solve for the classical error detection and correction problems: (1) one solution to solve for error-control is parity checking [30, 31] which is one of the most widely used methods for error detection in digital logic circuits and systems, in which re-sending data is performed in case error is detected in the transmitted data. This error is detected by the parity checker in the receiver side. Various parity-preserving circuits have been implemented in which the parity of the outputs matches that of the inputs, and such circuits can be fault-tolerant since a circuit output can detect a single error; (2) another solution to solve this highly important problem, that is to extract the correct data message from the noisy erroneous counterpart, is by using various coding schemes that work optimally for specific types of statistical distributions of noise [1-3, 5-10, 12-18, 20-47, 49-65, 69-71, 74-76, 78-91, 93-96, 98-101] .
For example, the manufacturers of integrated circuits (ICs) have recently started to produce error-correcting circuits, and one such circuit is the TI 74LS636 [19], which is an 8-bit error detection and correction circuit that corrects any single-bit memory read error and flags any two-bit error, a scheme called single error correction / double error detection (SECDED). This IC is currently found in high-end computer systems because of the cost of implementing a system that uses error correction, and the newest computer systems are now using DDR memory with error-correction code (ECC). When a single error is detected, the 74LS636 goes through an error-correction cycle; the 74LS636 checks the single-error flag (SEF) to determine whether an error has occurred, and if it has, then a correction cycle causes the single-error defect to be corrected, and if a double error occurs, then an interrupt request is generated by the double-error flag (DEF) output. Since the introduction of the Intel Pentium microprocessor, the modern microprocessor design incorporates the logic circuitry to detect/correct errors provided that the memory can store the extra eight bits required for storing the ECC code, in which the ECC memory is 72 bits wide using the eight additional bits to store the ECC code (i.e., memory width is 64 data bits + 8 bits for ECC code), and if an error occurs then the microprocessor runs the correction cycle to correct the error. Recently, some memory devices such as Samsung memory also perform an internal error check in which Samsung ECC uses three bytes to check every 256 bytes of memory.
The main contributions of this paper are the introduction of new convolution-based multiple-stream error-control encoding and decoding schemes that apply the reversibility property in both the convolution-based encoder for multiple-stream error-control encoding and in the new reversible Viterbi decoding algorithm for multiple-stream error-control decoding. Also, the complete design of quantum circuits for the quantum implementation of the new quantum Viterbi cell (i.e., quantum trellis node) in the quantum domain is introduced. It is also shown in this paper that the reversibility relationship between multiple streams of parallel data can be used for further correction of errors that are uncorrectable using the implemented decoding algorithm, such as in the case of triple errors (or more) that are uncorrectable using the irreversible Viterbi algorithm.
Basic background in error-control coding, reversible logic and quantum computing is presented in Section 2. The new reversible error correction method in data communication is introduced in Section 3. The design of quantum circuits for the quantum implementation of the new quantum Viterbi cell is introduced in Section 4. Conclusions and future work are presented in Section 5.
Fundamentals
This Section presents basic background in the topics of error-correction coding, reversible logic, and quantum computing. The fundamentals presented in this section will be utilized in the development of the new results introduced in Sections 3−4.
Error correction
In the data communication context, noise usually exists and is generated from the channel in which transmitted data are communicated. Such noise corrupts sent messages from one end and thus noisy corrupted messages are received on the other end. To solve the problem of extracting a correct message from its corrupted counterpart, noise must be modeled [22, 68, 97]. Figure 1 illustrates the modeling of data communication in the existence of noise, the solution to the noise problem using an encoder / decoder scheme, and the utilization of a new block called the reverser for bijectivity (uniqueness) in multiple-stream (i.e., parallel data) communication.
Each of the two node sides in the system shown in Figure 1 consists of three major parts: (1) encoding (e.g., generating a convolutional code using a convolutional encoder) to generate an encoded transmitted decision (message), (2) channel noise, and (3) decoding (e.g., generating the correct convolutional code using the corresponding decoding algorithm (cf. Viterbi algorithm)) to generate the decoded correct received data message.
In general, in block coding, the encoder receives a k-bit message block and generates an n-bit code word, and therefore code words are generated on a block-by-block basis, and the whole message block must be buffered before the generation of the associated code word. On the other hand, there are cases in which message bits are received serially rather than in blocks and in which it is undesirable to use a buffer. In such cases, one uses convolutional coding, in which a convolutional coder generates redundant bits by using modulo-2 convolutions.
The binary convolutional encoder can be seen as a finite state machine (FSM) consisting of an M-stage shift register with interconnections to n modulo-2 adders and a multiplexer to serialize the outputs of the adders, in which an L-bit message sequence generates a coded output sequence of length n(L + M) bits [1-3, 5, 12, 17, 21, 26, 32, 35, 37, 39, 44, 47, 49, 50, 54, 56, 60, 61, 63, 71, 74, 79, 87, 89, 91, 93, 94, 96, 98, 99].
Definition 1. For an L-bit message sequence, M-stage shift register, n modulo-2 adders, and a generated coded output sequence of length n(L + M) bits, the code rate r is calculated as:
$$r = \frac{L}{n(L + M)} \ \text{bits/symbol}$$
and for the typical case of L ≫ M, the code rate reduces to r ≈ (1/n) bits/symbol.
Definition 2. The constraint length of a convolutional code is the number of shifts over which a single message bit can influence the encoder output. Thus, for an encoder with an M-stage shift register, the number of shifts required for a message bit to enter the shift register and then come out of it is equal to K = M + 1. Thus, the encoder constraint length is equal to K.
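Definitions 1 and 2 can be checked numerically with the short sketch below. It assumes the code rate is the ratio of the L message bits to the n(L + M) coded bits stated in Definition 1; the function names are illustrative only.

```python
from fractions import Fraction


def code_rate(L: int, M: int, n: int) -> Fraction:
    """Code rate r = L / (n * (L + M)) for an L-bit message (Definition 1)."""
    return Fraction(L, n * (L + M))


def constraint_length(M: int) -> int:
    """Constraint length K = M + 1 for an M-stage shift register (Definition 2)."""
    return M + 1


# A 5-bit message through a 2-stage register with 2 adders: 14 coded bits.
print(code_rate(5, 2, 2))             # exact rate for a short message
print(float(code_rate(10000, 2, 2)))  # approaches 1/n when L >> M
print(constraint_length(2))
```

Using exact fractions makes the gap between r and its limit 1/n visible for short messages, where the tail overhead is not negligible.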
A binary convolutional code can be generated with code rate r ≈ (k/n) by using k shift registers, n modulo-2 adders, an input multiplexer, and an output multiplexer. An example of a convolutional encoder with constraint length = 3 and rate = 1 / 2 is the one shown in Figure 2 . The convolutional codes generated by the encoder in Figure 2 are part of what is generally called nonsystematic codes. Each path connecting the output to the input of a convolutional encoder can be characterized in terms of the impulse response which is defined as the response of that path to "1" applied to its input, with each flip-flop of the encoder set initially to "0". Equivalently, we can characterize each path in terms of a generator polynomial defined as the unit-delay transform of the impulse response. More specifically, the generator polynomial is defined as:
$$g(D) = g_0 + g_1 D + g_2 D^2 + \cdots + g_M D^M \qquad (1)$$
where the generator coefficients $g_i \in \{0, 1\}$, the generator sequence $\{g_0, g_1, \ldots, g_M\}$ composed of these coefficients is the impulse response of the corresponding path in the convolutional encoder, and D is the unit-delay variable.
Example 1. For the convolutional encoder in Figure 2, path #1 impulse response is (1, 1, 1), and path #2 impulse response is (1, 0, 1). Thus, according to Equation (1), the following are the corresponding generator polynomials, respectively, where addition is performed in modulo-2 arithmetic:
$$g^{(1)}(D) = 1 + D + D^2, \qquad g^{(2)}(D) = 1 + D^2$$
For a message sequence (10011), the following is the D-domain polynomial representation:
$$m(D) = 1 + D^3 + D^4$$
As convolution in the time domain is transformed into multiplication in the D-domain, path #1 output polynomial and path #2 output polynomial are as follows, respectively:
$$c^{(1)}(D) = g^{(1)}(D)\,m(D) = 1 + D + D^2 + D^3 + D^6$$
$$c^{(2)}(D) = g^{(2)}(D)\,m(D) = 1 + D^2 + D^3 + D^4 + D^5 + D^6$$
Therefore, the output sequences of paths #1 and #2 are as follows, respectively:
Output sequence of path #1: (1111001)
Output sequence of path #2: (1011111)
The resulting encoded sequence from the convolutional encoder in Figure 2 is obtained by multiplexing the two output sequences of paths #1 and #2 as follows: c = (11, 10, 11, 11, 01, 01, 11)
Example 2. For the convolutional encoder in Figure 2, the following are examples of encoded data messages:
In general, a data message sequence of length L bits results in an encoded sequence of length equal to n(L + K − 1) bits. Usually a terminating sequence of (K − 1) zeros, called the tail of the message, is appended to the last input bit of the message sequence in order for the shift register to be restored to its zero initial state.
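The encoding procedure described above (impulse responses, modulo-2 addition, multiplexing, and the zero tail) can be sketched as follows. This is a minimal illustration that assumes the rate-1/2, K = 3 encoder with impulse responses (1, 1, 1) and (1, 0, 1); the function name `conv_encode` is hypothetical.

```python
def conv_encode(message, generators=((1, 1, 1), (1, 0, 1))):
    """Encode bits with a rate-1/n feedforward convolutional encoder.

    Each generator is an impulse response (g_0, ..., g_M). A tail of M zeros
    is appended so that the shift register returns to the all-zero state.
    """
    M = len(generators[0]) - 1
    padded = list(message) + [0] * M
    register = [0] * M                 # register[0] holds the most recent past bit
    out = []
    for bit in padded:
        window = [bit] + register      # current bit followed by the M past bits
        for g in generators:           # one modulo-2 adder per path, multiplexed
            out.append(sum(gi * wi for gi, wi in zip(g, window)) % 2)
        register = window[:M]          # shift: the oldest bit drops out
    return out


encoded = conv_encode([1, 0, 0, 1, 1])
pairs = ["".join(map(str, encoded[i:i + 2])) for i in range(0, len(encoded), 2)]
print(pairs)  # compare with the multiplexed sequence of Example 1
```

For the message (10011) this yields 2 × (5 + 2) = 14 coded bits, in agreement with the length n(L + K − 1).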
The structural properties of the convolutional encoder (cf. Figure 2) can be represented graphically in several equivalent representations (cf. Figure 3) using: (1) code tree, (2) trellis, and (3) state diagram. The trellis contains (L + K) levels, where L is the length of the incoming message sequence and K is the constraint length of the code. The trellis form is preferred over the code tree form because the number of nodes at any level of the trellis does not continue to grow as the number of incoming message bits increases, but rather remains constant at $2^{K-1}$, where K is the constraint length of the code. Figure 3 shows the various graphical representations for the convolutional encoder in Figure 2.
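The fixed number of trellis nodes per level can be seen by enumerating the encoder states directly. The sketch below builds a state-transition table for an assumed K = 3, rate-1/2 encoder with impulse responses (1, 1, 1) and (1, 0, 1); the dictionary layout and the name `trellis_transitions` are illustrative choices, not the paper's notation.

```python
from itertools import product


def trellis_transitions(generators=((1, 1, 1), (1, 0, 1))):
    """Return {(state, input_bit): (next_state, output_bits)} for one trellis level."""
    M = len(generators[0]) - 1
    table = {}
    for state in product((0, 1), repeat=M):      # 2**M = 2**(K - 1) states
        for bit in (0, 1):
            window = (bit,) + state              # current bit, then register contents
            output = tuple(
                sum(g * w for g, w in zip(gen, window)) % 2 for gen in generators
            )
            table[(state, bit)] = (window[:M], output)
    return table


table = trellis_transitions()
states = {state for state, _ in table}
print(len(states))            # number of nodes per trellis level
print(table[((0, 0), 1)])     # branch taken from the all-zero state on input 1
```

Each state has exactly two outgoing branches (one per input bit), which is the structure shared by the trellis and the state diagram.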
Therefore, any encoded output sequence can be generated from the corresponding input message sequence using the following equivalent methods: (1) circuit of the convolutional encoder (cf. Figure 2), (2) polynomial generator (cf. Examples 1 and 2), (3) code tree (cf. Figure 3a), (4) trellis (cf. Figure 3b), and (5) state diagram (cf. Figure 3c).
An important decoder that uses the trellis representation to correct received erroneous messages is the Viterbi decoding algorithm [26, 89-91]. The Viterbi algorithm is a dynamic programming algorithm which is used to find the maximum-likelihood sequence of hidden states, which results in a sequence of observed events, particularly in the context of hidden Markov models (HMMs) [73]. The Viterbi algorithm forms a subset of information theory [1, 22], and has been extensively used in a wide range of applications including speech recognition, keyword spotting, computational linguistics, bioinformatics, and in communications including digital cellular, dial-up modems, satellite, deep-space and wireless local area network (LAN) communications.
The Viterbi algorithm is a maximum-likelihood decoder which is optimum for noise that is statistically characterized as Additive White Gaussian Noise (AWGN). This algorithm operates by computing a metric for every possible path in the trellis representation. The metric for a specific path is computed as the Hamming distance between the coded sequence represented by that path and the received sequence. The Hamming distance d(c_1, c_2) between a pair of code vectors c_1 and c_2 is defined as the number of locations in which their respective elements differ. In the Viterbi algorithm context, the Hamming distance is computed by counting how many bits are different between the received channel symbol pair and the possible channel symbol pairs, in which the results can only be "0", "1" or "2". Therefore, for each node (i.e., state) in the trellis, the Viterbi algorithm compares the two paths entering the node. The path with the lower metric is retained and the other path is discarded. This computation is repeated for every level j of the trellis in the range M ≤ j ≤ L, where M = (K − 1) is the encoder's memory and L is the length of the incoming message sequence. The paths that are retained are called survivor or active paths. In some cases, applying the Viterbi algorithm leads to the following difficulty: when the paths entering a node (state) are compared and their metrics are found to be identical, then a choice is made by making a guess (i.e., flipping a fair coin). The Viterbi algorithm is a maximum-likelihood sequence estimator, and the following procedure and Examples 3-5 illustrate the detailed steps for the implementation of this algorithm.
1. Initialization: Label the all-zero state at level 0 of the trellis with the metric 0.
2. Computation step: Let j = 0, 1, 2, . . ., and assume that at the previous level j the following has been performed:
(a) All survivor paths are identified;
(b) The survivor path and its metric for each state of the trellis are stored.
Then, at level (clock time) (j + 1) and for all the paths entering each state of the trellis, compute the metric by adding the metric of the incoming branches to the metric of the connecting survivor path from level j. Thus, for each state, identify the path with the lowest metric as the survivor of step (j + 1), therefore updating the computation.
3. Final step: Continue the computation until the algorithm completes the forward search through the trellis and thus reaches the terminating node (i.e., all-zero state), at which time it makes a decision on the maximum-likelihood path. Then, the sequence of symbols associated with that path is released to the destination as the decoded version of the received sequence.
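The procedure above can be sketched as a hard-decision decoder that keeps one survivor per state and uses the Hamming distance as branch metric. This is a minimal illustration under stated assumptions: a zero-terminated code, the K = 3, rate-1/2 encoder with impulse responses (1, 1, 1) and (1, 0, 1), and ties resolved by keeping the first path found rather than by a coin flip. The name `viterbi_decode` is hypothetical.

```python
def viterbi_decode(received, generators=((1, 1, 1), (1, 0, 1))):
    """Hard-decision Viterbi decoding of a zero-terminated rate-1/n code.

    Returns (decoded message bits, metric of the surviving path).
    """
    n, M = len(generators), len(generators[0]) - 1
    steps = len(received) // n
    start = (0,) * M
    survivors = {start: (0, [])}       # state -> (path metric, input bits so far)
    for j in range(steps):
        symbol = received[j * n:(j + 1) * n]
        # Only zeros are fed into the encoder during the tail (last M steps).
        inputs = (0,) if j >= steps - M else (0, 1)
        candidates = {}
        for state, (metric, bits) in survivors.items():
            for bit in inputs:
                window = (bit,) + state
                output = [sum(g * w for g, w in zip(gen, window)) % 2
                          for gen in generators]
                branch = sum(o != r for o, r in zip(output, symbol))  # Hamming distance
                nxt = window[:M]
                total = metric + branch
                # Retain the lower metric; on a tie the first path found is kept.
                if nxt not in candidates or total < candidates[nxt][0]:
                    candidates[nxt] = (total, bits + [bit])
        survivors = candidates
    metric, bits = survivors[start]    # the terminating all-zero state
    return bits[:steps - M], metric


# A codeword of the message (101) with two flipped bits (first and fifth).
received = [0, 1, 1, 0, 1, 0, 1, 0, 1, 1]
print(viterbi_decode(received))
```

Storing the full input history with every survivor is wasteful for long sequences, but it keeps the sketch short; a practical decoder would use traceback over a truncated path memory.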
Example 3. Suppose that the resulting encoded sequence from the convolutional encoder in Figure 2 is as follows: c = (0000000000)
Now suppose a noise corrupts this sequence, and the noisy received sequence is as follows:
Using the Viterbi algorithm, Figure 4 shows the resulting step-by-step illustration [39] to produce the survivor path which generates the correct sent message c = (0000000000).
Example 4. For the convolutional encoder in Figure 2, path #1 impulse response is (1, 1, 1), and path #2 impulse response is (1, 0, 1). Thus, the following are the corresponding generator polynomials, respectively:
$$g^{(1)}(D) = 1 + D + D^2, \qquad g^{(2)}(D) = 1 + D^2$$
For a message sequence (101), the following is the D-domain polynomial representation:
$$m(D) = 1 + D^2$$
As convolution in the time domain is transformed into multiplication in the D-domain, the path #1 output polynomial and path #2 output polynomial are as follows, respectively, where addition is performed in modulo-2 arithmetic:
$$c^{(1)}(D) = (1 + D + D^2)(1 + D^2) = 1 + D + D^3 + D^4$$
$$c^{(2)}(D) = (1 + D^2)(1 + D^2) = 1 + D^4$$
Output sequence of path #1: (11011)
Output sequence of path #2: (10001)
The resulting encoded sequence from the convolutional encoder in Figure 2 is obtained by multiplexing the two output sequences of paths #1 and #2 as follows: c = (11, 10, 00, 10, 11)
Now suppose a noise corrupts this sequence, and the noisy received sequence is as follows: c ′ = (01, 10, 10, 10, 11)
Using the Viterbi algorithm, the following is the resulting survivor path which generates the correct sent message c = (11, 10, 00, 10, 11).
A difficulty with the application of the Viterbi algorithm occurs when the received sequence is very long. In this case the Viterbi algorithm is applied to a truncated path memory using a decoding window of length greater than or equal to five times the convolutional code constraint length K, in which the algorithm operates on the received sequence on a frame-by-frame basis, with each frame of length l ≥ 5K. The decoding decisions made in this way are not truly maximum likelihood, but they can be made almost as good provided that the decoding window is long enough. Another difficulty is the number of errors; for example, in the case of three errors, the Viterbi algorithm when applied to a convolutional code of r = 1/2 and K = 3 cannot produce the correct decoded message from the incoming erroneous message. Exceptions are triple-error patterns that spread over a time span > K.
Example 5.
Suppose an all-zero sequence c = (0000000000) is generated by the convolutional encoder in Figure 2. For a received sequence containing three errors c′ = (1100010000), Figure 6 shows the breakdown of the Viterbi algorithm when applied to the code of the convolutional encoder in Figure 2 (K = 3 and r = 1/2), as it fails to correct a triple-error pattern.
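The breakdown in Example 5 can be reproduced with a small hard-decision decoder. The sketch below is an illustration under assumptions (zero-terminated code, impulse responses (1, 1, 1) and (1, 0, 1), ties keep the first path found); the helper names are hypothetical. It decodes the triple-error sequence c′ = (1100010000) and reports which message the decoder settles on.

```python
GENERATORS = ((1, 1, 1), (1, 0, 1))   # assumed K = 3, rate-1/2 encoder


def branch(bit, state):
    """Return (output symbol, next state) for one trellis branch."""
    window = (bit,) + state
    output = tuple(sum(g * w for g, w in zip(gen, window)) % 2 for gen in GENERATORS)
    return output, window[:-1]


def viterbi_decode(received):
    steps = len(received) // 2
    survivors = {(0, 0): (0, [])}      # state -> (metric, input bits)
    for j in range(steps):
        symbol = received[2 * j:2 * j + 2]
        inputs = (0,) if j >= steps - 2 else (0, 1)   # tail bits are zeros
        candidates = {}
        for state, (metric, bits) in survivors.items():
            for bit in inputs:
                output, nxt = branch(bit, state)
                total = metric + sum(o != r for o, r in zip(output, symbol))
                if nxt not in candidates or total < candidates[nxt][0]:
                    candidates[nxt] = (total, bits + [bit])
        survivors = candidates
    metric, bits = survivors[(0, 0)]
    return bits[:steps - 2], metric


sent_message = [0, 0, 0]
received = [1, 1, 0, 0, 0, 1, 0, 0, 0, 0]   # three bit errors on the all-zero codeword
decoded, metric = viterbi_decode(received)
print(decoded, metric, decoded == sent_message)
```

The decoder returns a message whose codeword lies at Hamming distance 2 from the received sequence, which is closer than the all-zero codeword at distance 3, so the maximum-likelihood decision is the wrong one.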
Reversible logic
In quantum mechanical systems, a closed system is an isolated system that doesn't exchange energy or matter with its surroundings (i.e., doesn't dissipate power) and doesn't interact with other quantum systems. Closed quantum systems obey the unitary evolution and therefore they are reversible.
In general, an (n, k) reversible circuit is a circuit that has n number of inputs and k number of outputs and is one-to-one mapping between vectors of inputs and outputs, thus the vector of input states can be always uniquely reconstructed from the vector of output states [4, 11, 48, 66, 67, 72, 77, 92] . Thus, a (k, k) reversible map is a bijective function which is both (1) injective (one-to-one or (1:1)) and (2) surjective (onto). (Such bijective systems are also known as: equipollent, equipotent, and one-to-one correspondence.) The auxiliary outputs that are needed only for the purpose of reversibility are called garbage outputs. These are auxiliary outputs from which a reversible map is constructed (cf. Example 6). Therefore, reversible circuits (systems) are information-lossless.
Geometrically, achieving reversibility leads to value space-partitioning that leads to spatial partitions of unique values. Algebraically and in terms of systems representation, reversibility leads to multi-input multi-output (MIMO) bijective maps (i.e., bijective functions). An algorithm called reversible Boolean function (RevBF) that produces a reversible form from an irreversible Boolean function is as follows [4] . 
Algorithm
For example, using the RevBF algorithm, the construction of the reversible map in Table 1a is obtained as follows: since W is irreversible, assign auxiliary ("garbage") output W 1 and assign the first half of its values the constant "0" and the second half another constant "1". The new XNOR map is now reversible. This gate is also called the inverted Feynman gate or inverted Controlled-NOT (inverted C-NOT) gate in which: Figure 11a .)
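The effect of adding a garbage output can be checked by testing bijectivity of the truth table. The sketch below is not the RevBF algorithm itself; it only assumes that the garbage output W_1 takes the constant 0 on the first half of the rows and 1 on the second half (i.e., it copies the first input), and verifies that the extended XNOR map becomes reversible. The names are illustrative.

```python
from itertools import product


def is_reversible(table):
    """A map is reversible if inputs and outputs have equal width and no output repeats."""
    outputs = list(table.values())
    same_width = all(len(out) == len(inp) for inp, out in table.items())
    return same_width and len(set(outputs)) == len(outputs)


rows = list(product((0, 1), repeat=2))            # (a, b) in the order 00, 01, 10, 11

# Irreversible XNOR: two inputs, a single output W.
xnor_only = {(a, b): (1 - (a ^ b),) for a, b in rows}

# XNOR extended with the garbage output W1 = a (0 for the first half, 1 for the second).
xnor_with_garbage = {(a, b): (a, 1 - (a ^ b)) for a, b in rows}

print(is_reversible(xnor_only))
print(is_reversible(xnor_with_garbage))
```

Because the extended table is a permutation of the four input vectors, the inputs can be reconstructed uniquely from the outputs.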
Quantum computing
Quantum computing (QC) is a method of computation that uses a closed-system dynamic process governed (for a single particle) by the Schrödinger Equation (SE) [4, 67]. The single-particle one-dimensional time-dependent SE (TDSE) takes the following general form:
$$-\frac{\hbar^2}{2m}\frac{\partial^2 |\psi\rangle}{\partial x^2} + V|\psi\rangle = i\hbar\frac{\partial |\psi\rangle}{\partial t} \qquad (2)$$
or
$$H|\psi\rangle = i\hbar\frac{\partial |\psi\rangle}{\partial t} \qquad (3)$$
where h is the Planck constant (6.626 × 10^−34 J·s), ħ = h/(2π) is the reduced Planck constant, V(x, t) is the potential, m is the particle mass, i is the imaginary unit, |ψ(x, t)⟩ is the quantum state, H is the Hamiltonian operator (H = −(ħ²/2m)∇² + V), and ∇² is the Laplacian operator. While the above holds for all physical systems, in the quantum computing (QC) context, the time-independent SE (TISE) is normally used [4, 67]:
$$H|\psi\rangle = E|\psi\rangle \qquad (4)$$
where E is the energy eigenvalue and the solution |ψ⟩ is an expansion over orthogonal basis states |φ_i⟩ defined in Hilbert space H as follows:
$$|\psi\rangle = \sum_i c_i |\phi_i\rangle \qquad (5)$$
where the coefficients c_i are called probability amplitudes, and |c_i|² is the probability that the quantum state |ψ⟩ will collapse into the (eigen) state |φ_i⟩. The probability is equal to the squared magnitude of the inner product, |⟨φ_i|ψ⟩|², with the normalization condition ∑ |c_i|² = 1. In QC, a linear and unitary operator T is used to transform an input vector of quantum bits (qubits) into an output vector of qubits [4, 67]. In two-valued QC, a qubit is a vector of bits defined as follows:
$$|0\rangle = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad |1\rangle = \begin{bmatrix} 0 \\ 1 \end{bmatrix} \qquad (6)$$
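A two-valued quantum state |ψ⟩ is a superposition of quantum basis states |φ_i⟩ such as those defined in Equation (6). Thus, for the orthonormal computational basis states {|0⟩, |1⟩}, one has the following quantum state:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle \qquad (7)$$
where αα* = |α|² = p_0 ≡ the probability of having state |ψ⟩ in state |0⟩, ββ* = |β|² = p_1 ≡ the probability of having state |ψ⟩ in state |1⟩, and |α|² + |β|² = 1. The calculation in QC for multiple systems (e.g., the equivalent of a register) follows the tensor product (⊗) [4]. For example, given two states |ψ_1⟩ and |ψ_2⟩, one has the following QC:
$$|\psi_1\rangle \otimes |\psi_2\rangle = (\alpha_1|0\rangle + \beta_1|1\rangle) \otimes (\alpha_2|0\rangle + \beta_2|1\rangle) = \alpha_1\alpha_2|00\rangle + \alpha_1\beta_2|01\rangle + \beta_1\alpha_2|10\rangle + \beta_1\beta_2|11\rangle \qquad (8)$$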
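The tensor-product rule for combining two qubit states can be illustrated with plain lists of amplitudes. This is a minimal sketch; the helper names `tensor` and `probabilities` and the choice of an equal superposition as test state are assumptions made for illustration.

```python
import math


def tensor(u, v):
    """Kronecker (tensor) product of two state vectors given as lists of amplitudes."""
    return [a * b for a in u for b in v]


def probabilities(state):
    """Measurement probabilities |c_i|^2 for each basis state."""
    return [abs(c) ** 2 for c in state]


KET0, KET1 = [1, 0], [0, 1]                      # computational basis states
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]      # equal superposition of |0> and |1>

print(tensor(KET0, KET1))                        # amplitudes ordered |00>, |01>, |10>, |11>
print(probabilities(tensor(plus, plus)))         # each two-qubit outcome equally likely
```

The amplitude of each two-qubit basis state is the product of the corresponding single-qubit amplitudes, so the probabilities of the combined state still sum to one.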
A physical system, describable by the following equation [4, 67] :
(e.g., the hydrogen atom), can be used to physically implement a two-valued QC. Another common alternative form of Equation (9) is:
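Many-valued QC (MVQC) can also be accomplished [4, 67]. For three-valued QC, the qubit becomes a 3-dimensional vector qudit (quantum discrete digit), and in general, for MVQC the qudit has as many dimensions as there are logic values. For example, one has for 3-state QC (in Hilbert space H) the following qudits:
$$|0\rangle = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \qquad |1\rangle = \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \qquad |2\rangle = \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \qquad (11)$$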
A three-valued quantum state is a superposition of three quantum orthonormal basis states (vectors). Thus, for the orthonormal computational basis states {|0⟩, |1⟩, |2⟩}, one has the following quantum state:
$$|\psi\rangle = \alpha|0\rangle + \beta|1\rangle + \gamma|2\rangle \qquad (12)$$
where αα* = |α|² = p_0 ≡ the probability of having state |ψ⟩ in state |0⟩, ββ* = |β|² = p_1 ≡ the probability of having state |ψ⟩ in state |1⟩, γγ* = |γ|² = p_2 ≡ the probability of having state |ψ⟩ in state |2⟩, and |α|² + |β|² + |γ|² = 1. In general, for an n-valued logic, a quantum state is a superposition of n quantum orthonormal basis states (vectors). Thus, for the orthonormal computational basis states {|0⟩, |1⟩, . . . , |n − 1⟩}, one has the following quantum state:
$$|\psi\rangle = \sum_{k=0}^{n-1} c_k |k\rangle \qquad (13)$$
where:
$$\sum_{k=0}^{n-1} |c_k|^2 = 1$$
The calculation in QC for many-valued multiple systems follows the tensor product in a manner similar to the one demonstrated for two-valued QC in Equation (8).
As stated previously, while an open quantum system interacts with its environment (i.e., its surroundings or bath) and thus dissipates power, resulting in a non-unitary evolution, a closed quantum system is an isolated system that doesn't exchange energy or matter with its surroundings and therefore doesn't dissipate power, resulting in a unitary evolution (i.e., unitary transformation or unitary matrix), and hence it is reversible. A physical system comprising trapped ions under multiple laser excitations can be used to reliably implement MVQC [66]. A physical system in which an atom (particle) is exposed to a specific potential field (function) V(x) can also be used to implement MVQC (two-valued being a special case) [4, 67]. In such an implementation, the (resulting) distinct energy states are used as the orthonormal basis states. The latter is illustrated in Example 7 below, which is an example of implementing MVQC by exposing a particle to a potential field V where the distinct energy states are used as the orthonormal basis states.
Example 7. We assume the following constraints: (1) spring potential V(x) = (1/2)kx², where m is the particle mass, k = mω² is the spring constant, and ω is the angular frequency (ω = 2π · frequency), and (2) boundary conditions. Also, assume that the solution of the TISE in Equation (4) for these constraints is of the following form (i.e., the Gaussian function):
$$\psi(x) = C e^{-\alpha x^2 / 2}$$
where C is a normalization constant and α = mω/ħ. The general solution for the wave function |ψ⟩ (for a spring potential) is:
$$\psi_n(x) = \left(\frac{\alpha}{\pi}\right)^{1/4} \frac{1}{\sqrt{2^n n!}}\, H_n(\sqrt{\alpha}\,x)\, e^{-\alpha x^2 / 2}$$
where H_n are the Hermite polynomials. This solution leads to the sequence of evenly spaced energy levels (eigenvalues) E_n characterized by a quantum number n as follows:
$$E_n = \left(n + \frac{1}{2}\right)\hbar\omega, \qquad n = 0, 1, 2, \ldots$$
The distribution of the energy states (eigenvalues) and their associated probabilities are shown in Figure 7.
Fig. 7. Harmonic oscillator (HO) potential and wavefunctions: (a) wavefunctions for various energy levels (subscripts), (b) spring potential V(x) and the associated energy levels E_n, and (c) probabilities for measuring particle m in each energy state (E_n).
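The evenly spaced energy levels of Example 7 can be tabulated with a few lines of code. The sketch assumes the standard harmonic-oscillator result E_n = (n + 1/2)ħω; the angular frequency used below is an arbitrary illustrative value, not one taken from the paper.

```python
import math

PLANCK_H = 6.626e-34                 # J*s, the value quoted in the text
HBAR = PLANCK_H / (2 * math.pi)      # reduced Planck constant


def energy_level(n: int, omega: float) -> float:
    """Energy eigenvalue E_n = (n + 1/2) * hbar * omega of the harmonic oscillator."""
    return (n + 0.5) * HBAR * omega


omega = 1.0e14                       # assumed angular frequency in rad/s
levels = [energy_level(n, omega) for n in range(4)]
spacings = [b - a for a, b in zip(levels, levels[1:])]
print(levels)
print(spacings)                      # every spacing equals hbar * omega
```

The constant spacing ħω is what allows the distinct energy states to serve as equally separated basis states for many-valued logic.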
A closed-system quantum circuit is a composition of quantum gates with the following properties [4, 67] : (1) must be reversible, (2) must have an equal number of inputs k and outputs k, (3) doesn't allow fan-out, (4) is constrained to be acyclic (i.e., feedback (loop) is not allowed), and (5) the transformation performed is unitary (i.e., a unitary matrix). The quantum Viterbi circuit design in the quantum domain using the corresponding basic quantum primitives will be completely shown in Section 4.
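Property (5) can be checked for a candidate gate by verifying that its matrix U satisfies U†U = I. The sketch below does this for the Controlled-NOT (Feynman) gate and for a deliberately irreversible matrix; the function name `is_unitary` and the tolerance are illustrative assumptions.

```python
def is_unitary(U, tol=1e-12):
    """Check U^dagger U = I for a square matrix given as a list of rows."""
    n = len(U)
    for i in range(n):
        for j in range(n):
            # Entry (i, j) of U^dagger U: sum over k of conj(U[k][i]) * U[k][j].
            entry = sum(complex(U[k][i]).conjugate() * U[k][j] for k in range(n))
            expected = 1 if i == j else 0
            if abs(entry - expected) > tol:
                return False
    return True


CNOT = [
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
]

# Maps both basis states onto |0>, so information is lost (not reversible).
COLLAPSE = [
    [1, 1],
    [0, 0],
]

print(is_unitary(CNOT))
print(is_unitary(COLLAPSE))
```

A permutation matrix such as CNOT is always unitary, which matches the requirement that a reversible gate permutes the basis states.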
Reversible Error Correction via Reversible Viterbi Algorithm
While in subsection 2.1 the error correction of communicated data was done for the case of single-input single-output (SISO) systems, this section introduces reversible error correction of communicated batch (parallel) of data in multiple-input multiple-output (MIMO) systems. Reversibility in parallel-based data communication is directly observed since:
$$O_1 = I_2$$
where O_1 is the unique output (transmitted) data from node #1 and I_2 is the unique input (received) data to node #2. In MIMO systems, the existence of noise will cause an error that may lead to irreversibility in data communication (i.e., irreversibility in data mapping) since O_1 ≠ I_2. As will be introduced in this and the following sections respectively, the implementation of reversible error correction can be performed (1) in software using the new reversible error-correction algorithm and (2) in hardware using quantum error correction hardware. The following algorithm, called the Reversible Viterbi (RV) Algorithm, introduces the implementation of reversible error correction in parallel data communication.
Algorithm RV
1. Use the RevBF Algorithm to reversibly encode the communicated batch of data.
2. Given a specific convolutional encoder circuit, determine the generator polynomials for all paths.
3. For each communicated message within the batch, determine the encoded message sequence.
4. For each received message, use the Viterbi Algorithm to decode the received erroneous message.
5. Generate the total maximum-likelihood trellis resulting from the iterative application of the Viterbi decoding algorithm.
6. Generate the corrected communicated batch of data messages.
End
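Steps 2 to 6 of Algorithm RV can be sketched as a batch pipeline in which every stream is encoded and decoded independently. The sketch leaves out step 1 (the RevBF mapping) and takes the batch of message sequences as given; it also assumes the K = 3, rate-1/2 encoder with impulse responses (1, 1, 1) and (1, 0, 1) and a hard-decision decoder in which ties keep the first path found. All function names are hypothetical.

```python
GENERATORS = ((1, 1, 1), (1, 0, 1))   # step 2: impulse responses of paths #1 and #2


def branch(bit, state):
    """Return (output symbol, next state) for one trellis branch."""
    window = (bit,) + state
    output = tuple(sum(g * w for g, w in zip(gen, window)) % 2 for gen in GENERATORS)
    return output, window[:-1]


def encode(message):
    """Step 3: encode one message, appending a tail of K - 1 = 2 zeros."""
    state, out = (0, 0), []
    for bit in list(message) + [0, 0]:
        symbol, state = branch(bit, state)
        out.extend(symbol)
    return out


def decode(received):
    """Step 4: hard-decision Viterbi decoding of one received stream."""
    steps = len(received) // 2
    survivors = {(0, 0): (0, [])}
    for j in range(steps):
        symbol = received[2 * j:2 * j + 2]
        inputs = (0,) if j >= steps - 2 else (0, 1)
        candidates = {}
        for state, (metric, bits) in survivors.items():
            for bit in inputs:
                output, nxt = branch(bit, state)
                total = metric + sum(o != r for o, r in zip(output, symbol))
                if nxt not in candidates or total < candidates[nxt][0]:
                    candidates[nxt] = (total, bits + [bit])
        survivors = candidates
    return survivors[(0, 0)][1][:steps - 2]


def rv_decode_batch(received_batch):
    """Steps 5-6: apply the decoder to every stream and return the corrected batch."""
    return [decode(stream) for stream in received_batch]


batch = [[1, 0, 1], [0, 0, 1], [0, 1, 1]]      # assumed batch of three messages
sent = [encode(m) for m in batch]
noisy = [list(c) for c in sent]
for stream, position in zip(noisy, (0, 3, 7)):  # one bit error per stream
    stream[position] ^= 1
print(rv_decode_batch(noisy) == batch)
```

The s streams are independent at this stage, so the per-stream decoding could equally run in parallel, mirroring the parallel encoder arrangement.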
The convolutional encoding for the RV algorithm can be performed serially using a single convolutional encoder from Figure 2, or in parallel using the general parallel convolutional encoder circuit shown in Figure 8, in which s convolutional encoders operate in parallel to encode s simultaneously submitted messages (i.e., a data message set of cardinality (size) equal to s) generated from s nodes.
Example 8. Applying reversibility (e.g., the RevBF Algorithm) to the following input bit stream {m_1 = 1, m_2 = 1, m_3 = 1} produces the following reversible set of message sequences:
For the convolutional encoder in Figure 8, the corresponding D-domain polynomial representations are, respectively:
The resulting encoded sequences are generated in parallel as follows, respectively:
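As an illustrative cross-check of this encoding step, the following minimal sketch encodes three message sequences with a rate-1/2, $K = 3$ convolutional encoder. Several things here are assumptions rather than statements from the text: the generator taps $g_1(D) = 1 + D + D^2$ and $g_2(D) = 1 + D^2$, the use of $K - 1$ zero tail bits, the function name `conv_encode`, and the message sequences $m_1 = (101)$ and $m_2 = (001)$, which were inferred by matching the transmitted codewords quoted in this example ($m_3 = (011)$ is the value recovered in Example 9).

```python
from typing import List, Sequence

# Assumed generator taps for a rate-1/2, K = 3 encoder:
# g1(D) = 1 + D + D^2 and g2(D) = 1 + D^2 (not stated in this section).
G = ((1, 1, 1), (1, 0, 1))


def conv_encode(message: Sequence[int], taps=G) -> List[int]:
    """Encode `message`, appending K-1 zero tail bits to flush the register."""
    k = len(taps[0])                      # constraint length K
    register = [0] * k                    # register[0] holds the newest bit
    out: List[int] = []
    for bit in list(message) + [0] * (k - 1):
        register = [bit] + register[:-1]  # shift the new bit in
        for g in taps:
            # modulo-2 sum of the tapped register cells
            out.append(sum(t & r for t, r in zip(g, register)) % 2)
    return out


# Assumed message set (inferred, see the lead-in above).
messages = {"m1": [1, 0, 1], "m2": [0, 0, 1], "m3": [0, 1, 1]}
codewords = {name: "".join(map(str, conv_encode(m))) for name, m in messages.items()}
print(codewords)
```

Under these assumptions the three outputs coincide with the codewords $c_1$, $c_2$, $c_3$ of this example; in the parallel arrangement of Figure 8 each stream would simply be handled by its own encoder instance.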
Now suppose noise sources corrupt these sequences, and the noisy received sequences are as follows:

Figure 9 shows the resulting survivor paths, which generate the correct sent messages: {$c_1 = (1110001011)$, $c_2 = (0000111011)$, $c_3 = (0011010111)$}.

As in the irreversible Viterbi Algorithm, applying the reversible Viterbi (RV) algorithm leads in some cases to the following difficulties: (1) when the paths entering a node (state) are compared and their metrics are found to be identical, a choice is made by guessing (i.e., flipping a fair coin); (2) when the received sequence is very long, the reversible Viterbi algorithm is applied to a truncated path memory using a decoding window of length greater than or equal to five times the convolutional code constraint length $K$, so that the algorithm operates frame by frame on the received sequence, each frame having length $l \geq 5K$; the decoding decisions made in this way are not truly maximum likelihood, but they can be made almost as good provided that the decoding window is long enough; (3) the number of errors: for example, in the case of three errors, the Viterbi algorithm applied to a convolutional code of rate $r = 1/2$ and $K = 3$ cannot in general recover the correct message from the incoming noisy (corrupted) message. (Exceptions are triple-error patterns that are spread over a time span greater than $K$.)
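The decoding step can be illustrated with a minimal hard-decision Viterbi sketch. It is not the paper's reversible implementation: the generator taps ($1 + D + D^2$, $1 + D^2$), the zero-terminated trellis, the names `step` and `viterbi_decode`, and the deterministic tie rule (keep the first path found, in place of a coin flip) are all assumptions made for the sake of a runnable example.

```python
from typing import Dict, List, Sequence, Tuple

G = ((1, 1, 1), (1, 0, 1))   # assumed taps: g1 = 1 + D + D^2, g2 = 1 + D^2
K = 3                        # constraint length; a state is the last K-1 input bits

State = Tuple[int, ...]


def step(state: State, bit: int) -> Tuple[State, Tuple[int, ...]]:
    """Return (next_state, output_pair) for one trellis branch."""
    register = (bit,) + state
    output = tuple(sum(t & r for t, r in zip(g, register)) % 2 for g in G)
    return register[:-1], output


def viterbi_decode(received: Sequence[int]) -> List[int]:
    """Hard-decision Viterbi decoding of a zero-terminated code sequence."""
    n = len(G)
    start: State = (0,) * (K - 1)
    # survivors: state -> (path metric, decoded input bits so far)
    survivors: Dict[State, Tuple[int, List[int]]] = {start: (0, [])}
    for i in range(0, len(received), n):
        symbol = tuple(received[i:i + n])
        candidates: Dict[State, Tuple[int, List[int]]] = {}
        for state, (metric, path) in survivors.items():
            for bit in (0, 1):
                nxt, out = step(state, bit)
                # add the branch Hamming distance to the path metric
                m = metric + sum(a != b for a, b in zip(out, symbol))
                # compare-select; on a tie the first-found path is kept
                if nxt not in candidates or m < candidates[nxt][0]:
                    candidates[nxt] = (m, path + [bit])
        survivors = candidates
    _, path = survivors[start]   # a terminated code ends in the all-zero state
    return path[:-(K - 1)]       # drop the flushing tail bits


# c1 = 1110001011 with bit positions 1 and 7 flipped (a double error)
received = [1, 0, 1, 0, 0, 0, 1, 1, 1, 1]
print(viterbi_decode(received))  # [1, 0, 1]
```

Because the assumed code has free distance 5, any one or two bit errors in a terminated block are corrected, while three errors can already move the received word closer to a different codeword, which is difficulty (3) above.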
Yet, parallelism in multi-stream data transmission allows for extra relationships between the submitted data streams that can be used for (1) detecting the existence of errors and (2) further correction after the RV algorithm in case the RV algorithm fails to correct the occurring errors. Examples of such inter-stream relationships are: (1) a parity (even or odd) relationship between the corresponding bits of the submitted data streams, (2) a reversibility relationship between the parallel submitted data streams, which results from applying a known reversible mapping such as the RevBF algorithm, or (3) a combination of the parity and reversibility properties. The reversibility property in the RV algorithm produces a reversibility relationship between the sent parallel streams of data, and this known reversible mapping can be used to correct errors (e.g., triple errors) which the RV algorithm fails to correct.
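To make relationship (1) concrete, the sketch below shows one hypothetical way an inter-stream even-parity relation could be used after decoding. The helper names (`parity_stream`, `suspect_positions`, `recover_stream`), the choice of even parity, and the assumption that the parity word is known reliably at the receiver are illustrative only and are not taken from the paper.

```python
from functools import reduce
from typing import List, Sequence


def parity_stream(streams: Sequence[Sequence[int]]) -> List[int]:
    """Bitwise XOR of corresponding bits across all streams (even parity)."""
    return [reduce(lambda a, b: a ^ b, column) for column in zip(*streams)]


def suspect_positions(decoded: Sequence[Sequence[int]], parity: Sequence[int]) -> List[int]:
    """Bit positions where the decoded streams disagree with the known parity."""
    return [i for i, (p, q) in enumerate(zip(parity_stream(decoded), parity)) if p != q]


def recover_stream(others: Sequence[Sequence[int]], parity: Sequence[int]) -> List[int]:
    """Rebuild one stream from the parity word and the remaining correct streams."""
    return parity_stream(list(others) + [list(parity)])


sent = [[1, 0, 1], [0, 0, 1], [0, 1, 1]]      # assumed message set
parity = parity_stream(sent)                   # [1, 1, 1]
decoded = [[1, 0, 1], [0, 0, 1], [1, 1, 1]]    # third stream mis-decoded
print(suspect_positions(decoded, parity))      # [0]
print(recover_stream(decoded[:2], parity))     # [0, 1, 1]
```

A single parity word only locates the bit position, not the faulty stream, so this kind of recovery presumes that the other streams are known to be correct.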
Example 9. The following version of the RevBF algorithm produces reversibility:

Algorithm RevBF (Version 1)

Note that the erroneous $m_3$ is ... The cases in Figures 10b-10e and 10g-10h are correctable using the RV algorithm since fewer than three errors exist, but the triple error in Figure 10f is (usually) uncorrectable using the RV algorithm. Yet, the reversibility property introduced by the RevBF algorithm adds information that can be used to correct $m_3$ as follows. Applying the RevBF Algorithm (Version 1) from right to left in Figure 10f, one notes that in the second column from the right two "0" cells are added at the top, in the correctly received $m_1$ and $m_2$ messages, which means that in the rightmost column the last cell must be "1", since otherwise the top two cells in the correctly received $m_1$ and $m_2$ messages would have been "0" and "1", respectively, to achieve value space-partitioning. Now, since the third cell of the rightmost column must be "1", the last cell of the second column from the right must also be "1" because of the uniqueness requirement of the RevBF algorithm (Version 1) for value space-partitioning between the first two messages {$m_1$, $m_2$} and the third message $m_3$. Then, according to the RevBF algorithm (Version 1), the third cell of the last column from the right must have the value "0", which is the one's complement (NOT) of the constant "1" previously assigned to the third cell of the second column from the right. Consequently, the correct message $m_3 = (011)$ is obtained.
Quantum Circuit Design of the New RV Algorithm
The reversible hardware implementation of each trellis node in the (reversible) Viterbi algorithm requires the following reversible components: a reversible modulo-2 adder, a reversible arithmetic adder, and a reversible subtractor (RS) and reversible selector (i.e., reversible multiplexer), the latter two being used together in one possible design of the corresponding reversible comparator (RC). Table 2 shows the truth tables of an irreversible half-adder (HA), irreversible subtractor, and irreversible full-adder (FA). While every quantum circuit is reversible, not every reversible circuit is quantum [4, 67]. Figure 11 shows the various quantum circuits for the quantum realization of each quantum trellis node in the corresponding (reversible) Viterbi algorithm. Figures 11a-11c present fundamental quantum gates [4, 67].
Fig. 11. Quantum reversible circuits for the quantum realization of each trellis node in the corresponding (reversible) Viterbi algorithm: (a) quantum XOR gate (Feynman gate; Controlled-NOT (C-NOT) gate), (b) quantum Toffoli gate (Controlled-Controlled-NOT ($C^2$-NOT) gate), (c) quantum multiplexer (Fredkin gate; Controlled-Swap (C-Swap) gate), (d) quantum subtractor, (e) quantum half-adder (QHA), (f) quantum full-adder (QFA), (g) quantum equality-based comparator that compares two 2-bit numbers, where an isolated XOR symbol means a quantum NOT gate, and (h) basic quantum reversible Viterbi (QV) cell (i.e., quantum reversible trellis node), which is made of two Feynman gates, one QHA, one QFA and one quantum comparator with multiplexing (QCM). The quantum comparator can be synthesized using a quantum subtractor (QS) and a Fredkin gate. The symbol ⊕ is logic XOR (exclusive OR; modulo-2 addition), ∧ is logic AND, ∨ is logic OR, and ′ is logic NOT.
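The reversibility of the three fundamental gates can be checked with a small classical (basis-state) model. The sketch below treats each gate as a map on bit tuples and verifies that it permutes its input vectors; the half-adder built from one Toffoli gate followed by one Feynman gate is a commonly used construction and is only assumed to correspond to the QHA of Figure 11e. All function names are illustrative.

```python
from itertools import product
from typing import Callable, Tuple

Bits = Tuple[int, ...]


def feynman(a: int, b: int) -> Bits:
    """Controlled-NOT: the target b is flipped when the control a is 1."""
    return (a, a ^ b)


def toffoli(a: int, b: int, c: int) -> Bits:
    """Controlled-controlled-NOT: c is flipped when a and b are both 1."""
    return (a, b, c ^ (a & b))


def fredkin(c: int, a: int, b: int) -> Bits:
    """Controlled-swap: a and b are exchanged when the control c is 1."""
    return (c, b, a) if c else (c, a, b)


def half_adder(a: int, b: int, ancilla: int = 0) -> Bits:
    """Toffoli then Feynman; with ancilla = 0 this returns (a, sum, carry)."""
    a, b, carry = toffoli(a, b, ancilla)
    a, s = feynman(a, b)
    return (a, s, carry)


def is_reversible(gate: Callable[..., Bits], width: int) -> bool:
    """A (k, k) gate is reversible iff it permutes the 2^k input vectors."""
    inputs = list(product((0, 1), repeat=width))
    return len({gate(*x) for x in inputs}) == len(inputs)


for name, gate, width in [("Feynman", feynman, 2), ("Toffoli", toffoli, 3),
                          ("Fredkin", fredkin, 3), ("half-adder", half_adder, 3)]:
    print(name, is_reversible(gate, width))
```

The Fredkin gate additionally preserves the number of ones between input and output, which is the conservativeness property; this model covers only computational basis states and says nothing about superposition inputs.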
Figures 11d-11g show basic quantum arithmetic circuits: the quantum subtractor (Figure 11d), quantum half-adder (Figure 11e), quantum full-adder (Figure 11f), and quantum equality-based comparator (Figure 11g) [4, 67]. Figure 11h introduces the basic quantum Viterbi cell (i.e., quantum trellis node), which is made of two Feynman gates, one QHA, one QFA and one quantum comparator with multiplexing (QCM). Figure 12 shows the logic circuit design of an iterative network to compare two 3-digit binary numbers, $X = x_1 x_2 x_3$ and $Y = y_1 y_2 y_3$, and Figure 13 presents the detailed synthesis of a comparator circuit, which is made of a comparator cell (Figure 13a) and a comparator output circuit (Figure 13b). The extension of the circuit in Figure 12 to compare two n-digit binary numbers is straightforward by utilizing n cells and the same output circuit.
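The cell-plus-output-circuit structure of such an iterative comparator can be sketched classically as follows. The cell encoding used here (a pair of "greater" and "less" flags passed from cell to cell, most significant digit first) is an assumption and need not match the cell of Figure 13a; `select_minimum` models only the selecting behaviour of a comparator with multiplexing and resolves ties in favour of $X$.

```python
from itertools import product
from typing import Sequence, Tuple

Flags = Tuple[int, int]   # (greater, less) flags passed between cells


def comparator_cell(state: Flags, x: int, y: int) -> Flags:
    """One cell: update the flags with the next pair of digits."""
    greater, less = state
    undecided = 1 - (greater | less)          # no earlier digit has differed yet
    return (greater | (undecided & x & (1 - y)),
            less | (undecided & (1 - x) & y))


def output_circuit(state: Flags) -> str:
    """Map the final flags to one of the three comparison outcomes."""
    greater, less = state
    return ">" if greater else "<" if less else "="


def compare(xs: Sequence[int], ys: Sequence[int]) -> str:
    """Chain n identical cells, most significant digit first."""
    state: Flags = (0, 0)
    for x, y in zip(xs, ys):
        state = comparator_cell(state, x, y)
    return output_circuit(state)


def select_minimum(xs: Sequence[int], ys: Sequence[int]) -> Tuple[int, ...]:
    """Compare-and-select: forward the smaller number (X on a tie)."""
    return tuple(ys) if compare(xs, ys) == ">" else tuple(xs)


print(compare((0, 1, 1), (1, 0, 0)))         # <
print(select_minimum((1, 0, 1), (0, 1, 1)))  # (0, 1, 1)
```

Extending the comparison to n digits only lengthens the loop in `compare`, mirroring the n-cell extension described above.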
Figure 14 illustrates the quantum circuit synthesis for the comparator cell and the output circuit (which were shown in Figure 13), and Figure 15 shows the design of a quantum comparator with multiplexing (QCM), where Figure 15a shows an iterative quantum network to compare two 3-digit binary numbers and Figure 15c shows the complete design of the QCM. The extension of the quantum circuit in Figure 15a to compare two n-digit binary numbers is straightforward by utilizing n quantum cells (from Figure 14a) and the same output quantum circuit (in Figure 14b). Figure 16 shows the complete design of a quantum trellis node (i.e., quantum Viterbi cell) in the irreversible and reversible Viterbi algorithms that was shown in Figure 11h. The design of the quantum trellis node shown in Figure 16 proceeds as follows: (1) two quantum circuits for the first and second lines entering the trellis node, each made of two Feynman gates (i.e., two quantum XORs) to produce the difference between incoming received bits and trellis bits, followed by a quantum half-adder (QHA) to produce the corresponding sum (which is the Hamming distance), are shown in Figures 16a and 16b, (2) a logic circuit composed of a QHA and a quantum full-adder (QFA) that adds the current Hamming distance to the previous Hamming distance is shown in Figure 16c, (3) two quantum circuits for the first and second lines entering the trellis node, each made of a QHA followed by a QFA to realize the logic circuit of Figure 16c, are shown in Figures 16d and 16e, and (4) the quantum comparator with multiplexing (QCM), which compares the two entering metrics and selects the path with the minimum metric, is shown in Figure 16f.

Fig. 16. The complete design of a quantum trellis node in the irreversible and reversible Viterbi algorithms that was shown in Figure 11h: (a) quantum circuit that is made of two Feynman gates (i.e., two quantum XORs) to produce the difference between incoming received bits ($A_1 A_2$) and trellis bits ($B_1 B_2$), followed by a quantum half-adder (QHA) to produce the corresponding sum ($s_1 c_1$), which is the Hamming distance for the first line entering the trellis node, (b) quantum circuit that is made of two Feynman gates (i.e., two quantum XORs) to produce the difference between incoming received bits ($A^{*}_1 A^{*}_2$) and trellis bits ($B^{*}_1 B^{*}_2$), followed by a quantum half-adder (QHA) to produce the corresponding sum ($s_2 c_2$), which is the Hamming distance for the second line entering the trellis node, (c) logic circuit composed of a QHA and a quantum full-adder (QFA) that adds the current Hamming distance to the previous Hamming distance, (d) quantum circuit in the first line entering the trellis node for the logic circuit in (c) that is made of a QHA followed by a QFA, (e) quantum circuit in the second line entering the trellis node for the logic circuit in (c) that is made of a QHA followed by a QFA, and (f) quantum comparator with multiplexing (QCM) in the trellis node that compares the two entering metric numbers, $X = s_3 s_4 c^{*}$ and $Y = s^{*}_3 s^{*}_4 c^{**}$, and selects, using control line $O_1$, the path that produces the minimum entering metric (i.e., $X < Y$).

... (i.e., unitary matrix) and hence it is reversible. Since power reduction has become the current main concern for digital logic designers after performance (speed), the reversibility property in error-control coding is highly important because reversibility is a main requirement for low-power circuit synthesis in future technologies such as quantum computing, and because the reversibility property results in super-speedy encoding/decoding operations owing to the superposition and entanglement properties that emerge in closed quantum computing systems, which are inherently reversible.
Future work will include items such as the investigation of using the introduced reversibility property in more advanced multi-error coding schemes to correct the corresponding corrupted multi-stream communicated data, and also the investigation of the corresponding optimal quantum circuit design of such new reversible systems.
