Public-key cryptography: a hardware implementation and novel neural network-based approach by Nguyen, Phong
Calhoun: The NPS Institutional Archive
Theses and Dissertations Thesis Collection
1992-09
Public-key cryptography: a hardware implementation
and novel neural network-based approach
Nguyen, Phong








Approved for public release; distribution is unlimited.
PUBLIC-KEY CRYPTOGRAPHY: A HARDWARE IMPLEMENTATION
AND NOVEL NEURAL NETWORK-BASED APPROACH
by
Phong Nguyen
Lieutenant, United States Navy
B.S.E.E., United States Naval Academy, 1985
Submitted in partial fulfillment of the
requirements for the degree of




rY CLASSIFICATION OF THIS PAGE
REPORT DOCUMENTATION PAGE
PORT SECURITY CLASSIFICATION UNCLASSIFIED 1t>. RESTRICTIVE MARKINGS
CURITY CLASSIFICATION AUTHORITY 3. DISTRIBUTION/AVAILABILITY OF REPORT
Approved for public release;
distribution is unlimited
CLASSIFICATION/DOWNGRADING SCHEDULE
(FORMING ORGANIZATION REPORT NUMBER(S) 5. MONITORING ORGANIZATION REPORT NUMBER(S)
ME QF PERFORMING ORGANIZATION





7a. NAME OF MONITORING ORGANIZATION
Naval Postgraduate School
DRESS (City, State, and ZIP Code)
terey, CA 93943-5000






9. PROCUREMENT INSTRUMENT IDENTIFICATION NUMBER









LE (Include Security Classification) PUBLIC-KEY CRYPTOGRAPHY: A HARDWARE IMPLEMENTATION AND









14. DATE OF REPORT (Year, Month, Day)
1992 September
pplementary NOTATiOThe views expressed in this thesis are those of the author and do not reflect the official
•y or position of the Department of Defense or the United States Government.
COSATI CODES 1 8. SUBJECT TERMS (Continue on reverse if necessary and identify by block number)
Cryptography, Public-Key, Secret-Key, Discrete Logarithm, I
LD GROUP SUB-GROUP
Exponentiation, Diffie-Hellman, RSA, Inverse, GCD, Neural Netwo
Back-Propagation, Factorization, Sum of Residues, Modulo Reduction
STRACT (Continue on reverse if necessary and identify by block number)
ie concealment of information passed over a non-secure communication link lies in the complex field of cryp
jhy. Furthermore, when absolutely no secure channel exists for the exchange of a secret key with which data
pted and decrypted, the remedy lies in a branch of cryptography known as public-key cryptosystem (PKS). Th
provides an in-depth study of the public-key cryptosystem. Essential background knowledge is covered leadin
a. VLSI implementation of a fast modulo exponentiator based on the sum of residues (SOR) method. Fast modu)
lentiation is vital in the most popular PKS schemes. Furthermore, since all cryptosystems make use of som
of mapping functions, a neural network - an excellent non-linear mapping technique - provides a viable mediui
which a possible cryptosystem can be based. In examining this possibility, this thesis presents an adaptation <
ick-propagation neural network to a "pseudo" public-key arrangement. Following examinations of the networl
management system is then devised. Finally, a complete top-down block diagram of an entire cryptosystei
on the neural network of this study is proposed.
3TRIBUTION/AVAILABILJTY OF ABSTRACT
INCLASSIFIED/UNLIMITED Q SAME AS RPT. Q OTIC USERS 21. ABSTRACT SECURITY CLASSIFICATIONUNCLASSIFIED
AMEOF RESPONSIBLE INDIVIDUAL
in Yang 22b. TELEPHONE [Include Area Code)(408) 646-2266
22C-OEFJCE SYMBOL
EC/Ya
RM 1473,84 MAR 83 APR edition may be used until exhausted
All other editions are obsolete
SECURITY CLASSIFICATION OF THIS PAGE
UNCLASSIFIED
ABSTRACT
The concealment of information passed over a non-secure communication link
lies in the complex field of cryptography. Furthermore, when absolutely no secure
channel exists for the exchange of a secret key with which data is encrypted and de-
crypted, the remedy lies in a branch of cryptography known as public-key cryptosys-
tem (PKS). This thesis provides an in-depth study of the public-key cryptosystem.
Essential background knowledge is covered leading up to a VLSI implementation of
a fast modulo exponentiator based on the sum of residues (SOR) method. Fast mod-
ulo exponentiation is vital in the most popular PKS schemes. Furthermore, since all
cryptosystems make use of some form of mapping functions, a neural network - an
excellent non-linear mapping technique - provides a viable medium upon which a
possible cryptosystem can be based. In examining this possibility, this thesis presents
an adaptation of the back-propagation neural network to a "pseudo" public-key ar-
rangement. Following examinations of the network, a key management system is
then devised. Finally, a complete top-down block diagram of an entire cryptosystem




The author extends his gratitude to Professor Yang for the insights and en-
couragement throughout the research. In addition, credit must be aknowledged for
LT Daniel Sullivan's assistance in the neural network section and for LT Arthur
Billingsley's general support in this thesis. Finally, my appreciation and admiration
are extended to the faculty and staff of the Electrical and Computer Engineering








II. MATHEMATICAL BASIS FOR THE DEVELOPMENT OF
PUBLIC-KEY CRYPTOSYSTEMS 3
A. MODULO ARITHMETIC 3
B. FAST MODULO EXPONENTIATION 5
C. DISCRETE LOGARITHM 8
D. INVERSES 8
E. ARTIFICIAL NEURAL NETWORK 9
F. THE PUBLIC-KEY CRYPTOSYSTEM . 14
1. The Diffie-Hellman Scheme for Public-Key Cryptosystem ... 15
2. The RSA Cryptosystem 18
G. CRYPTOANALYSIS 20
1. Factorization 21
III. HARDWARE DEVELOPMENT OF THE PUBLIC-KEY CRYP-
TOSYSTEM 25
A. MODULO EXPONENTIATION USING RECURSIVE SUM OF
RESIDUES 25
1. Sum of Residues Reduction 27
B. VLSI LAYOUT DEVELOPMENT 32
1. Master Slave Flip Flop 32
2. Adder 33
3. Multiplexer 35
4. Modulo Reduction Unit 36
IV
IV. A NEURAL NETWORK-BASED PUBLIC-KEY CRYPTOSYS-
TEM 40
A. EXPERIMENTS IMPLEMENTING A NEURAL NETWORK IN
CRYPTOSYSTEMS 40
B. EXPERIMENTAL RESULTS AND OBSERVATIONS 46
C. AN IN-DEPTH EXAMPLE 50
D. KEY MANAGEMENT 53
1. A Proposed Pseudo Public-Key Cryptosystem Using A Neural
Network 54
2. Justification of the "Pseudo" Prefix 57
E. PROBLEMS OF A NEURAL NETWORK AS A CRYPTOSYSTEM
AND PROPOSED SOLUTIONS 58
F. DEVELOPMENT OF A COMPLETE BLOCK-DIAGRAM-LEVEL
HARDWARE SCHEME USING A NEURAL NETWORK IN PKS . 61
V. CONCLUSION 64
APPENDIX A. SUPPLEMENTARY PROGRAMS 67
APPENDIX B. RNL SIMULATION OF MODULO REDUCTION
UNIT 72
APPENDIX C. SAMPLE NEURAL NETWORK FROM NEURAL-
' WARE : 80
REFERENCES 95
INITIAL DISTRIBUTION LIST 98
LIST OF TABLES
2.1 EULER'S TOTIENT FUNCTIONS 5
2.2 EXAMPLE FAST EXPONENTIATION FOR 5 10 6
2.3 EXAMPLE FAST EXPONENTIATION AND MODULO OF 5 10 mod 9 7
2.4 EXHAUSTIVE FACTORIZATION WITH ONE '486 COMPUTER . 23
2.5 FACTORIZATION TIME WITH SANDIA'S BENCHMARK [REF 17] 24
3.1 EXAMPLE SUM OF RESIDUES FOR 18 mod 7 28
3.2 EXAMPLE RECURSIVE SOR FOR 18 mod 7 28
4.1 EXAMPLE TRAINING SET 45
4.2 TRAINING TIME VS ERROR RELATIONSHIP 47
4.3 RELATIONSHIP BETWEEN NETWORK SIZE AND ERROR ... 48




2.1 A Processing Element 11
2.2 A Back-Propagation Network 12
2.3 Block Diagram of Diffie-Hellman Cryptosystem 16
2.4 Block Diagram of RSA Cryptosystem 19
3.1 Block Diagram of over all exponentiation unit 26
3.2 Modulo Reduction Unit 29
3.3 Overhead Vs Input Bit 30
3.4 Block Diagram of 4-Bit SOR with Logic Units 31
3.5 MSFF Circuit Diagram 32
3.6 MSFF Layout 33
3.7 Adder Circuit 34
3.8 Adder Layout 34
3.9 MUX Function Block Circuit Diagram 35
3.10 Layout of MUX 36
3.11 Layout of 4-bit Modulo Reduction Unit 37
3.12 Size of Modulo Reduction Unit 38
3.13 Speed Performance of Modulo Reduction Unit From SPICE 38
4.1 Neural Network As A Cryptosystem Block Diagram 42
4.2 Back-Propagation Network For Encryption and Decryption 43
4.3 Relationship between Network Size, Iterations to Convergence and
Input Size 49
4.4 Neural Network in PKS 62
vn
I. INTRODUCTION
In the recent past, there possibly was a time when protection of vital electronic
information was not considered a necessity and therefore not deemed to be a topic
of common interest. Such a time is forever behind us. In our time, information is
most often passed across a public telecommunication medium. Whether this medium
be a telephone line or satellite link, there exist eavesdropping methods which are so
sophisticated and efficient that no information is physically secure. How then is one
to revert to the inherent privacy of the past? The answer to this question and thus
the solution to concealment of information lie in the complex science of cryptography.
Cryptography is the field involving the preparation of messages intended to be
incomprehensible to all except those who legitimately possess the means to recover the
original information [Ref 1]. At present, the fastest and most popular cryptosystems
employ some convention of mapping a set of numbers representing data to another
set of numbers (encryption). The recovery of data is done by simply reversing the
mapping process so as to obtain the original content (decryption). Often, this type of
mapping is governed by the notion of a key. In order to provide the essential element
of secrecy, system users must provide this key which is normally a privately or semi-
privately known string of characters or bits. For a cryptosystem to be completely
secure, knowledge of both the mapping function and key is required to recover the
original text from encrypted text.
Of the cryptosystems which use the forementioned concept of a key, two distinct
categories are made: secret-key and public-key.
As suggested by the name, a cryptosystem is secret-key if the key must be
secretly agreed upon prior to any parties being able to communicate through the
1
system. In this arrangement, both parties normally have the same key which is used
in both encryption and decryption. Algorithms implementing this scheme are labeled
symmetric. Intuitively, one recognizes a severe restriction in the secret-key system:
an advance agreement on the key over a secure channel. When such a channel is
not readily available, the topic of this thesis, public-key cryptosystem (PKS), is the
remedy.
Most PKS systems use an asymmetric algorithm whereupon separate keys are
required for encryption and decryption. This scheme allows the passing of keys,
most likely encryption keys, over an unsecure channel without any compromise to
the system's safety. In boasting this versatile capability, however, public-key system
must pay a price, namely a reduction in system speed [Ref 2]. Currently, PKS is much
slower than secret-key, too slow for large quantities of data. For this reason, its use
is often limited to the exchange of keys in secret-key systems. In the future, along
with advancements in technology, perhaps this speed barrier will be lifted yielding
more opportunity for the employment of PKS.
It is in the spirit of this future that this thesis is presented. It is an in-depth
study of the public-key cryptosystem. First, the mathematical basis behind PKS is
covered so as to establish an essential background knowledge in a somewhat esoteric
subject. Second, the capability of VLSI implementation of PKS is explored via a
fast modulo exponentiator, a hardware device required in two of the most popular
public-key systems. A vital component of the fast modulo exponentiator, a modulo
reduction unit, is designed with MAGIC tools [Ref 3], validated with RNL simulation
[Ref 4], and examined for possible use. Finally, to conclude the scope of this research,
a completely novel approach to PKS is proposed: a possible implementation of neural
networks in public-key cryptography.
II. MATHEMATICAL BASIS FOR THE
DEVELOPMENT OF PUBLIC-KEY
CRYPTOSYSTEMS
Compared to the complexity of conventional engineering mathematics, the con-
cepts behind the algorithms for public-key cryptosystem are elementary in nature yet
without complete understanding of them, no initial familiarization to the system is
possible. Due to this realization, this chapter concentrates heavily on the mathemat-
ics of asymmetric cryptography. It provides a basic overview of modulo arithmetic,
fast exponentiation, and discrete logarithm. It also outlines a background knowledge
in artificial neural networks, a branch of engineering upon which a completely new
angle in cryptography is based. Furthermore, the fundamentals of public-key cryp-
tosystems are covered using two well-established examples, the Diffie-Hellman and
RSA systems. Finally, the chapter concludes with the problem of cryptoanalysis: the
purpose of all cryptosystems.
A. MODULO ARITHMETIC





21 = 3 + 9 x 2.
This operation is commonly described as 21 divided by 9 equals 2 with remainder
of 3.
When written as x = y(mod z), by convention x is said to be "congruent to y
modulo z." Congruency applies if and only if
x = y + k x z
where k is any integer. Also y is called a residue mod z of x if and only if x =
y(mod z).
Note that -15(mod 6) = -3(mod 6).
Clearly, for any z, y belongs to a complete set of residues {0, 1,2..., z — 1}. From
this complete set of residues, there exists a subset called a reduced set of residues
which has elements relatively prime to the modulus z. For example, a complete set
of residues modulo 12 is {0,1,2,3,4,5,6,7,8,9,10,11}. From this, only {1,5,7,11}
does not have a common factor with 12 (0 excluded); it is therefore a reduced set
[Ref2].
For a modulo prime, clearly the reduced set of residues contains all elements of
the complete set except for 0. Therefore for a prime n, the reduced set of residues
has (n — 1) elements. In addition, generally the reduced set of residues for a product
of two primes m and n has ((m — l)(n — 1)) elements and that for a prime power n r
has (n — l)n(r~ 1 ) elements. Commonly, the number of elements in a reduced set of
residues for modulo n is referred to as the Euler Totient function <f>(n) [Ref 2]. Table
2.1 shows <f>(n) for several n [Ref 2].
Like normal integer arithmetic, addition and multiplication in integer modulo n
abide by the laws of associativity, commutativity and distributivity [Ref 2].
Theorem 1 [Ref 2]:
(a + 6)(mod n) = (a mod n + b mod n) mod n




n r (n prime)
l,2,...,n-l
[l,2,...,n-l,n + l,























TABLE 2.1: EULER'S TOTIENT FUNCTIONS
Theorem 2 [Ref 2]:
a&(mod n) = (a mod n x 6 mod n) mod n
These two theorems form the basis for the development of fast modulo expo-
nentiation.
B. FAST MODULO EXPONENTIATION
Many public-key cryptosystem requires the computation of x k mod n, with n
and k being extremely large numbers (in excess of 256 bits.) A naive solution would
be to multiply by x a repetition of k — 1 times then taking the modulo of the large
result. At best, this is both cumbersome and inefficient for today's computers due
to finite word length limit. Fortunately, there is an algorithm which avoids this



















1 (remains the same)
52
52 (remains the same)
5 10
TABLE 2.2: EXAMPLE FAST EXPONENTIATION FOR 510
straightforward method: fast modular exponentiation [Ref 5].
Taking advantage of Theorem 2, the exponentiation is faster when performed
by repeated squaring operations coupled with conditional multiplication by the par-
tial product according to the binary representation of the exponent. This is best
explained by an example.
Example:
Suppose we are required to find 5 10 mod 9.
let x — 5; k = 10; m = 9
Using ppo = 1 and
{x2'" 1 x ppi-i if k{ = 1
ppi-x if Ja =
k in binary is 1010. In accordance to k, bit by bit from least significant bit
(LSB) first, the squaring of x occurs iteratively for every k bit (0 or 1) but the result
is multiplied by the partial product only when k bit is 1. All the while, modulo
operation is performed in each squaring or multiplication in order to maintain a
manageable intermediate result. The partial product is always initialized to 1 (partial
product at iteration step 0, pp = 1). Let's examine Table 2.2 for clarity. From the
result of Table 2.2, indeed we have accomplished 5 10 . D
If we incorporate the modulo operation into each iteration according to Theorem
2, the modulo problem is also solved. Table 2.3 incorporates modulo reduction to






















Table 2.3 outlines in detail the process until a partial product of 49 is obtained.
Note that the result of the square operation becomes the number to be squared in the
next iteration. Also the previous partial product is the number in the multiplying
operation if the k bit is 1. In this example, since 49 mod 9 = 4, indeed 510 mod 9
(which also equals 4) is performed.
In this example the savings in multiplications is 4 (5 versus 9 using the naive
method). For larger number applications, let a be the number of binary bits of the
exponent k and 6 be log2 a. Using fast exponentiation, the number of multiplications
(call it X) is bounded by 6+1 < X < 26+1 depending on the number of l's and
O's in k. X with fast exponentiation grows linearly in length of k and is considerably
smaller then X obtained by the straightforward method of multiplying by k — 1 times
[Ref 5].
Appendix A contains a C program implementing fast modular exponentiation
using the above algorithm. It should be noted that the program is not suitable for
numbers exceeding the capability of the computer. Most computers have 32 bits res-
olution therefore results which are greater than 32 bits are likely to be too large. This
limitation, however, is resolved by using hardware for fast modular exponentiation
as will be shown in Chapter III.
C. DISCRETE LOGARITHM
Discrete logarithm is the branch of mathematics centered on the solution to the
exponent of a powered number; namely, finding x in ax = b mod n when given a, 6, n.
Example:
a = 3; 6 = 4;n = 11;




so x = 4.
Given a large modulus n and a, 6 (greater than 100 digits magnitude), discrete
logarithm is classified as a non-deterministic polynomials problem; the solution to
which is extremely difficult and impractical to derive [ Ref 6]. Therefore its use is
prevalent throughout many public-key cryptosystems.
D. INVERSES
Unlike integer arithmetic, modulo arithmetic often has inverses. Given a 6
{0,n — 1}, there could be a unique b £ {0,n — 1} such that
a6(modn) = 1 [Ref 2]
A systematic method to compute inverses involves the notion of the greatest
common divisor {gcd). Conventionally, gcd(a,b) is an integer c such that a/c and
8
b/c result in the smallest possible integer value. For example, gcd(8, 12) = 4 but
gcd{$, 16) = 8.
From the mathematics of gcd, we pose:
Lemma 1 [Ref 2]: if gcd(a,n) = 1 then
a, mod n ^ aj mod n; < i, j < n
Fermat's Theorem [Ref 2]: p is a prime and gcd(a,p) = 1 then
a{p
'l)(mod p) = 1




Theorem 4 [Ref 2]: if <7af(a,n) = 1 then
a*
(n) mod n = 1
Recall 4>{n) is the number of elements in a reduced set of residues (Table 2.1).
From the above Theorems, Euclid's algorithm is developed to find gcd(a,n) as
well as inverse a-1 (mod n) of a mod n. It is not within the scope of this study to
detail the foundation of this algorithm. If further information is preferred, reference
2 is suggested for consultation. For the purpose of this thesis, C programs for gcd
and inverse are provided in Appendix A [Ref 2].
E. ARTIFICIAL NEURAL NETWORK
In 1985, Ackley, Hinton and Sejnowski [Ref 7] applied a back-propagation neural
network to encode orthogonal binary vectors of length N using log2N hidden units.
Following this, Cottrell, Munro and Zipser [Ref 8] used the same type of network to
achieve image (data) compression. Both these two application examples involved a
special form of mapping via neural networks and, thus, suggested a possible use in
cryptography. In fact, they are inspirational for the work of Chapter IV in this thesis
which explores in detail the possibility of implementing neural networks in a novel
public-key cryptosystem. In light of this, this section provides a basic understanding
of neural networks, especially the back-propagation neural network.
A formal definition of a neural network is:
"A neural network is a parallel, distributed information processing structure con-
sisting of processing elements (which can possess a local memory and can carry out
localized information processing operations) interconnected via unidirectional signal
channels called connections. Each processing element has a single output connection
that branches into as many collateral connections as desired; each carries the same
signal- the processing element ouput. This ouput signal can be of any mathematical
types. The information of each element can be arbitrary with the restriction that it
must be completely local; it must depend only on the current values of arriving input
signals at and on values in local memory. " [Ref 9]
Having defined a neural network, the basic unit, a processing element, is shown
in Figure 2.1. The processing element has many input connections combined by a
simple summation. The combination is then transformed through a transfer function.
The function of interest here is a hyperbolic tangent. The single ouput of the element
is fanned out to several ouput paths which then become inputs of other elements. The
ouput to input connections each has a corresponding weight. Since the connections
prior to entering the elements are modified by the weights, the summation within
each element is a weighted sum. The actual mathematical process within an element
is thus:
f{)^ WijXi); i = layer; j = number of weights
10
Figure 2.1: A Processing Element
An overall neural network consists of many processing elements joined together
as previously discussed. A typical neural network, a back-propagation network in
this case, is shown in Figure 2.2 [Ref 10]. For organization purpose, processing
elements are grouped into layers. A normal network is composed of two layers with
connections to the outside world: an input buffer where data is entered and an output
buffer where the response of the network to the given input is stored. Layers between
the input and ouput layers are named hidden layers [Ref 10].
There are currently many types of neural networks designed for multitude of
applications. For the purpose of encoding and decoding in a cryptosystem where the
mapping of input to output is almost always non-linear, a most suitable network is
the back-propagation type.
A back-propagation neural network is a 3 to 5 layer network that behaves as an
interpolative-associative mapping scheme. That is it has the ability to learn map-
ping by generalizing input/ouput pairs relationship [Ref 9]. Moreover, the network
employs a supervised, delta-rule learning scheme whereupon the input stimulus and
corresponding output are first presented to the system which in turn reduces the








d d e n 3
i d d e n 2
i d d e n 1
Bil s * Z '<>*
1
1 1 1 l.n
Figure 2.2: A Back-Propagation Network [Ref 10]
configures its weights to achieve the desired input/ouput mapping. After learning is
accomplished, the error is reduced to minimum and the actual outputs of all inputs
of interest will be approximately equaled to the theoretical output [Ref 10].
Having covered the necessary basics, the mathematical background for the back-
propagation network is now provided. In order to establish a common convention,
the notations used for this development is as follows.
• Xj = current output of j th neuron in layer s,
Wj} = connection weights joining i th neuron in layer [s-1] to j th neuron in layer•
• /• = weight summation of inputs to j th neuron in layer s.
The mathematical process for single back-propagation element is:
*•*'
= /E^i'- 11 )] = /(/<'>)
i
12
Given that the network has some global error function £", the critical parameter
that is fed back through the layers is defined as:
ej'1 =
-dE/dlf
where e' is the local error of processing element j in layer s. Furthermore,
using the chain rule twice yields:
^=/'(^)L(4j+,|4r')-
k
The main mechanism in the back-propagation network is to forward the input to
the output, determine the error at the output, then propagate the errors back using
the above equations. Given knowledge of local errors, the final aim is to minimize
the global error by modifying the weights.
This is done by using the gradient rule which dictates that the weights change
in the direction of minimum error.
Au$ = -k(dE/dwW)
where k is a learning coefficient.
Again using the chain rule:
dEldw [°} = {dEldlf){dlfldw [$) = -e^xl5-11
For an in-depth derivation of all forementioned equations, the reader is referred
to references 9 and 10.
Using the above equations in several iterations, an algorithm for the back-
propagation network can be developed to train the network weights in converging to
13
a given set of training data: inputs and corresponding outputs. It is not within the
scope of this research to derive or show the algorithm; however, such an algorithm
can be found in reference 9. In Chapter IV, a specific software package, Neuralware,
will be utilize to set up a back-propagation network. The network will train with
specific mapping functions so as to accomplish an encryption and decryption scheme
in a newly-proposed "pseudo" public-key cryptosystem.
This concludes the necessary background in mathematic. We are now equipped
with enough knowledge to explore the core of the public-key cryptosystem.
F. THE PUBLIC-KEY CRYPTOSYSTEM
The single foundation upon which all asymmetric cryptosystems are built is that
of the one-way function. Such a function is practical to solve in one direction but
within a range it is computationally infeasible for any algorithm to invert the solution
taken over a range of elements [Ref 11]. A formal definition of a one-way function is
beyond the scope of this study. An informal definition is that a one-way function is
one in which for / : x —> y, it is easy to find y = f(x) given x. However, given y, it
is difficult to compute x such that f(x) = y [Ref 12]. For use in cryptography, the
difficulty must be great enough so as to render the solution impractical.
Currently we have a few one-way functions which are utilized exclusively in the
public-key system. A good example of a one-way function is integer multiplication.
Whereas the multiplication of large integers is relatively easy with current technol-
ogy, the factoring of a large integer is time-consuming to the point of infeasibility.
Another important example is modular exponentiation with large exponents. As
previously discussed, fast exponentiation techniques makes the exponentiation prac-
tical. However, even with the best current algorithms and technology, the solution
of a discrete logarithmic problem of such magnitude remains unattainable within a
14
reasonable time [Ref 13]. To see how the two suggested one-way functions are used
in public-key cryptosystems, in-depth studies of two systems are now provided: the
Diffie-Hellman and RSA cryptosystems.
1. The Diffie-Hellman Scheme for Public-Key Cryp-
tosystem
The first system to achieve the notoriety of a true public-key system was
proposed by Dime and Hellman seminal paper in 1976 [Ref 14]. It is in this paper
that the discrete logarithm problem was first proposed as a candidate for a one-way
function. The scheme is best summarized as follows.
Let n be a large integer and g, another integer, such that g £ {l,n — 1}.
Parties A and B establish n and g over insecure channels. A then chooses a large
integer x and computes g
x mod n while B chooses y and computes gy mod n. Next, A
and B exchanges their perspective computations again over insecure channels without
divulging x and y. At this point A has gy and n (possibly compromised over unsecured
channels) and x which was never communicated to anyone. Similarly, B has gx,n
and y. A and B can construct the key as follows.
for A: key = (gy )
x mod n











Clearly A and B now have the same key (gx ) y mod n which can be used
for any cryptography systems. Because the operation of exponentiation with large
exponent is slow, Diffie-Hellman is proposed only to make keys for faster private-key
system such as DES so that the key will not be compromised [Ref 12 ].
Even if a cryptanalyst was able to intercept the exchanges for <7, n, g
x mod n
and gy mod n, he faces the problem of finding x and y from his known data. He must
15
Party A Party B
4Urisecurd Channel
Agree Upon A Modulus N
(N I* Prime) And A
Primitive Element g
N
Select Secret Key K 4









Inverse K* Such That
KK* = I (mod N-l)
M = "L^dnod N)
y. n
Agree Upon A Modulus N
(N Is Prime) And A
Primitive Element g












Inverse K* Such That
KK' = I (mod N-l)
M = MLrrpt<»od N >
Figure 2.3: Block Diagram of Diffie-Hellman Cryptosystem
16
solve a discrete logarithm problem, an NP class problem, which, to date, is accepted
to be infeasible within certain time restraints [Ref 13]. A summarizing block diagram
of the Diffie-Hellman cryptosystem is provided in Figure 2.3. Moreover, an example
of its application is hereby offered.
Example [Ref 13]:
Let g = 7 and n = 2 x 739(7149 - l)/6 + 1.








Now both A and B can compute 7xy and mod it with n to establish secret
key 7xy mod n. Since a party other than A and B does not know either x or y in this
case, it is infeasible to attempt finding. 7xy .
Note: The numbers in this example are obtained from reference 13 where
neither x nor y was divulged. This author has been unable to find their values. In
the original article, a challenge of 100 dollars was offered to anyone who could solve
for x and y and thus 7ry .D
Presently, the Diffie-Hellman scheme remains trustworthy because the dis-
crete logarithm problem is still a difficult one to solve. Nevertheless, no one has
17
proven beyond a doubt that it is impossible to solve. In fact, many algorithms do
exist which can derive the solution. The only setback is that even the best of them
is not fast enough with current technology. For more safety, the integers x and y can
simply be increased in magnitude and for the worst case, an establishment of new
key within an acceptable time interval can render any cryptoanalysis harmless.
2. The RSA Cryptosystem
Invented in 1978, the Rivest, Shamir and Adleman (RSA) public-key cryp-
tosystem incorporates two one-way functions: the discrete logarithm and factoriza-
tion problems. The security guaranteed by this system is so sound that since its in-
ception until present, it has been accepted as the most popular method of public-key
encryption [Ref 15]. The elegance and subtle power of the RSA system is summarized
as follows.
Party A generates 2 random primes of approximately 130 bits each, p and
q. The product pq is then computed and called n. The number of reduced residues
elements is next obtained: 4>{n) = (p — l)(g — 1) (see Table 2.1). In turn, an integer
e is generated such that gcd(e, 4>(n)) — 1. A now has the public key < e,n > which
can be published to B through insecured channels.
Having the public key, party B can encrypt a message by transforming the
message into an integer value m. m is then encrypt by:
Encryp(m) = m e mod n
In order to be able to decipher Encryp(m), A must make a private key from
<f>(n) and e. Such a key, D, is found by using Euclid's algorithm (Appendix A) so
that,




p «a4 4 «f abm!





GmwU • rooh Uui
ft***.*) 1
PuMU Key - <«jb>
1
(1—nto » MMh tk*4
D* - 1 mod*
Private I*y * <D.a>
I
ta4 PaaUa by <«3>
to Party B
Dacxypt with Private Km




maarjftim) (»-*) —4 a
Figure 2.4: Block Diagram of RSA Cryptosystem
Once D is found, the deciphering is simply done by,
Deciph(Encryp(m)) = (Encryp(m)) mod n
Proof [Ref 6]:
Given all parameters above, by Euler's Theorem:
if De = 1 mod (0) —* mDe = m mod n
- mDe mod n = m
Figure 2.4 clarifies the process. In addition, a pedagogical example of RSA
at work is shown below.
19
Example:
(Use actual Appendix A programs )
Let p = 7; q = 13 -> n = 7 x 13 = 91; 0(n) = (7 - 1)(13 - 1) = 72
Pick e = 5 and D = 29 such that De = 2<f>(n) + 1 = 145
Message m = 23
Encryp(m) - 235 mod 91 = 4.
Decryp{m) = 429 mod 91 = 23.
Judging solely on the above example, it might not seem obvious that the
RSA system is safe. The reason is because the example's numbers are small. As
stated earlier, with p and q both being about 130 bits, their product,n, can range
in excess of 160 bits. In turn, e and D are also large numbers. Given this kind of
range, to crack the code, one must face the discrete logarithm as well as factorization.
To date, the factorization of a large product of primes remains unsolvable within a
feasible time [Ref 2]. This fact is further examined in the next section, cryptoanalysis.
G. CRYPTOANALYSIS
The art of breaking cryptographic code is called cryptoanalysis. Since there are
many public-key systems, the cryptoanalysis of only the RSA system is discussed so
as to provide a flavor of how difficult it is and thereby prove its soundness.
The gist behind breaking the RSA system is the ability to solve for both the
discrete logarithm and factorization problems. The latter of the two is the most
difficult so the discrete logarithm problem will be the first to be explored.
Given the public key < e,n > and let's assume we were somehow able to factor
n and therefore know p and q. We can now use Euclid's algorithm the same way as if
the sender would to make his/her private key. Take the example in the RSA section.
20
< e,n >=< 5,91 >
Knowing p and q we can compute <f>(n) = (p — l)(q — I)
Use Euclid's algorithm to find the secret key D such that
De = 1 mod <f>(n)
With D, the sender's encryption can be intercepted and decrypted by
encryp{m)D mod n
We have done the easy part. So far we assumed to know the two prime factors
of the modulo n in the public key < e,n >. The main insurance of the RSA system
is the derivation of the two factors p and q [Ref 15]. Whereas the cryptographer
only has to come up with two primes, a difficult task but not impossible with the
primes being about 130 bits, the cryptoanalyst, in order to recover the two primes to
compute 0(n), must face the grim task of factoring a number in excess of 260 digits
within a finite time limit. This leads to the topic of factorization which will also
be exploited as the safety basis for the later proposed cryptosytem based on neural
network.
1. Factorization
A factorization problem has no current classification but the consensus is
that it is neither a Polynomial (P) nor Nondeterministic Polynomial (NP)- Complete
problem [Ref 16]. It is loosely described as a Nondeterministic Polynomial Indistin-
guishable (NPI) problem [ Ref 16]. An algorithm is said to run in polynomial time
(P) if there are constants A and c such that the running time for all inputs of length
21
k is Akc for all k. All P problems are deterministic and P-time bounded. An al-
gorithm is deterministic if at each step of the computation, the next step is unique.
P-time bounded means that the execution is in polynomial time since its complexity
is bounded by a polynomial in the input length. An algorithm is said to run in
NP time if there are no known deterministic P-time solution. In NP problems, at
each step of computation, decision problems on the next step exist. To systemati-
cally solve an NP problem requires exponential time. A subset of NP problems, an
NP-complete problem surfaces when P=NP. NP-complete problems are considered
as the most difficult class in NP. An NPI problem is basically defined as having the
level of difficulty in between NP and NP-complete. Factorization, an NPI problem,
can not be solved in P-time and is not a member of NP-complete [Ref 2]
.
In order to be convinced that factorization of large numbers is at this time
insurmountable, we examine the most straightforward and therefore easiest method.
Given a number n to be factorized, we compute yjn and round it to the next integer
value, m. We then use m as the final index of a for to loop beginning with 1. In each
iteration of the loop, the operation (n mod index) is performed until the result is
notifying that an integer factor is found. Considering the speed of the computer, this
is not a bad method of factorization if n is within a certain range of digits in length.
However, this limit is what is exploited in public-key system (n is more than 130
digits in length.) The shortcoming of this method is explored using Matlab program
on an IBM '486, 50 MHz, 16 MBytes (Appendix A). The result is shown in Table
2.4.
.
Undisputably, with n being at least 100 decimal bits in the RSA system, the
method above, although possible, is hardly feasible if exhaustive search is required.
Fortunately, the mathematics of factoring have long surpassed the simplicity
of the forementioned method. Currently there are established algorithms as well as
22
Digits factorized Aprroximate time





40 3000 centuries *
* Esitimate
TABLE 2.4: EXHAUSTIVE FACTORIZATION WITH ONE '486 COMPUTER
on-going researches which could reduce the time factor at a phenomenal rate.
As a result of a concerted effort initiated in 1982, the mathematics de-
partment at Sandia National Laboratory established some tangible bounds on the
computational feasibility of factoring large numbers. The outcome, using a Cray
X-MP computer, was within a range of 7.2 minutes to 32 hours for numbers varying
from 55 to 77 digits in length [Ref 17].
In a separate study by Ronald Rivest [Ref 15], it is proven that with the
best algorithm available such as that of a quadratic sieve [Ref 18], a large prime
composite integer can be factored with a running time proportional to:
e
y/ln(n)ln(ln(n))
In the range of interest(approximately 256 bits in length), for k bit number
n, a crude approximation is:
5 x 109+<*/5O >
Using Sandia's benchmark that a 75-digit number can be factored in about
1 day [Ref 17] and the formula of Rivest's article [Ref 15], Table 2.5 is derived [Ref
17]..
Based on the data above, it is safe to surmise that the problem of factor-










9 x 10 12
2 x 1015
3 x 1017










TABLE 2.5: FACTORIZATION TIME WITH SANDIA'S BENCHMARK [REF 17]
knowledge and technology. The exploitation of this problem in the RSA system and
the neural network-based system of Chapter IV is hereby justified.
24
III. HARDWARE DEVELOPMENT OF THE
PUBLIC-KEY CRYPTOSYSTEM
The feasibility of most popular public-key systems is heavily dependent upon
the possibility of hardware implementation. Although the algorithm is theoretically
simple, its software implementation is slow and highly limited to the resolution of
the processor. Such problems are not worth tackling when, with the available VLSI
technology, hardware implementation is faster and more efficient.
The crux of many public-key cryptosystems hardware rests on the ability to
devise a fast exponentiation scheme where the exponent and modulus are extreme in
length (greater than 256 bits). From our two sample cryptosystems, Diffie-Hellman
and RSA, the fast exponentiation problem is essential in putting the theory to prac-
tice. To familiarize the reader with the possibility for hardware implementation of
existing public-key cryptosystems, this chapter will develop in detail a hardware
scheme for fast exponentiation based the recursive sum of residues algorithm.
A. MODULO EXPONENTIATION USING RECURSIVE
SUM OF RESIDUES
Currently the most popular working hardware for the RSA system performs
exponentiation by repeated squaring operations coupled with conditional multipli-
cation. During each square or multiplication stage, modulo reduction is also incor-
porated so as to maintain a small intermediate result [Ref 19]. The combination of
squaring (considered as part of multiplication), multiplication and modulo reduction
operations forms the core of fast exponentiation. Currently, there are two categories














Figure 3.1: Block Diagram of over all exponentiation unit
1. Multiplication and modulo reduction are done in tandem. As the partial prod-
ucts are formed, a decision based on special algorithms is made on whether to
perform a reduction on the product [Ref 19].
2. Multiplication and modulo reduction are done sequentially. The result of the
multiplication is first obtained and then fed serially to the modulo reduction
unit [Ref 19].
For the purpose of this thesis, only the latter case (2) is considered. The under-
lying reason behind this choice is simplicity which leads to a modular structure that
in turn can easily be implemented in VLSI. Moreover, the first part of this hardware
scheme, a serial multiplier, will not be delved into with details due to the abundance
of such units already available. This leads us to focus on the hardware implementa-
tion of the modulo reduction unit to which the result of the serial multiplier is fed
into in accordance to the basic block diagram of Figure 3.1 [Ref 19].
26
1. Sum of Residues Reduction
Our modulo reduction unit is based on the sum-of-residues reduction
method. That is the number, x, reduced by modulus, m, is expressed in the fol-
lowing binary form:
x = ^xt-2*'-1 ; si » [0,1]
t=i
The modulo reduction is
n
x mod m = (^Px^1-1 ) mod m
Since modulo reduction is associative
n
x mod m = (y] x t (2'
_1
mod m)) mod m
t=i
Summarizing, one performs the reduction as a conditional power of 2 re-
duced by mod m (a residue) and a summation of all the resulting residues (hence
sum of residues) [Ref 19].
Example:
modulus mis7, x = 10010 = 18,i initialized to 1.
Residues are at 2 1 and 24 due to positions of 1 in 10010. Respectively the
residues are 2 mod 7 and 16 mod 7 which are 2 and 2. Hence J2 r,; = ri -fr4 = 2 + 2 =
4.
Table 3.1 summarizes the SOR process for the example which resulted in:
(5j r,) mod 7 = 4 mod 7 = 4
Indeed 18 mod 7 = 4
Given a modulus, residues can be obtained by a look-up table; however,
this requires excessive space. Given n as the modulus length, a typical table size is n
27
shift x LSB First X residue 2* * mod 7 = resulting residue
X 2° mod 7 = 1 =
1 X 21 mod 7 = 2 = 2
X 22 mod 7 = 4 =
X 23 mod 7 = 1 =
1 X 24 mod 7 = 2 = 2
• . . residues will repeat £ resulting
. . 124124... residues = 4
• . pattern
TABLE 3.1: EXAMPLE SUM OF RESIDUES FOR 18 mod 7
iteration 2rt_, m




ri = 2r,_! or 2r,-_ 1 — m





TABLE 3.2: EXAMPLE RECURSIVE SOR FOR 18 mod 7
by 2n. With n being greater than 256 bits, this would require extremely large data
paths, undesirable in silicon implementation [Ref 19]. For this reason, it would be
more desirable to calculate the residues as necessary in accordance with the given
modulus. Fortunately, there is a simple recursive formula which allows for easy
hardware calculation of residues:
ith residues = r,-; i = 2...n
. _




- m iff (2rt_! - m > 0)
j*! initialized to 1 [Ref 19]






f*r©ti«ct *f Serial *4«H*p4*«r















Figure 3.2: Modulo Reduction Unit
recursive sum of residues method, the result of which is in Table 3.2, indeed the
residues are the iterative pattern: 1,2,4,1,2,4,1...
A diagram of an architecture using the sum of residues method for modulo
reduction is provided in Figure 3.2 [Ref 19] .
Respectively, M and R are two n-bit registers holding (—m), the two's
complement of the modulus, and r,, the current residue. Initially, the current residue
is set to 1. As the system is clocked, the register is loaded with 2r, or 2r< — m,
depending on the sign bit of the 2r, — m add. The accumulator sums those residues
which are passed by the incoming bits of the serial multiplier's product P. There's
an overhead amount of bits which must be taken into acount for the accumulator's
size. The necessary overhead bits are given in Figure 3.3 [Ref 19).
Having a sound understanding of the theory behind the architecture in
Figure 3.2, the next obstacle that must be cleared is the transformation of the theory
to an actual VLSI layout. With some intuition and basic knowledge of logic circuit, a
block diagram complete with logic units, inputs and outputs is developed and shown
29











) 100 200 300 400 500 6(
Input Bit-Length
Figure 3.3: Overhead Vs Input Bit
in Figure 3.4.
A few details in the transformation between Figures 3.2 and 3.4 are hereby
stated for clarification. Whereas in Figure 3.2 a multiplier was used to obtain the
correct residue for the accumulator, in the final design, a multiplexer is chosen to
perform the multiplication. Also the left shift logical to obtain 2r, is finalized without
a shift register but rather by hardwiring the outputs of the residues directly to the
inputs of the first adder.
From a VLSI perspective of Figure 3.4, one sees that it is beneficial to devise
a modular unit (shaded region) which could easily be assembled together to form a
larger complete reduction unit satisfying the length of the modulus. To realize a
single modular unit, only 2 master-slave flip flop's (MSFF), 2 combinational adders
and 2 2:1 multiplexers are needed. The control for this unit alone and for the rest of
the modular reduction device is a couple of simple two-phase clocks. The simplicity
of this modular scheme is attractive. However, the cost is in silicon area and speed
as we will see.
30
Clock
Z'% Complement of Modulo
Number 7=1001
PS Number To Be Moded
Shifted In One Bit





































8 IIST REUISTEIS (AC :UMUIATOR)\V
out
elk
^T" "T^ To Zp.gegj
Use MUX To Multiply-.
By V«r fc
R sidue X P
2
elk ia oik
l out out ^NfcStV







Figure 3.4: Block Diagram of 4-Bit SOR with Logic Units
31
Input Output
Figure 3.5: MSFF Circuit Diagram
B. VLSI LAYOUT DEVELOPMENT
1. Master Slave Flip Flop
The desire for a simple control method, a two-phase clock, necessitates the
use of a master-slave flip flop instead of a direct latch. In the first stage where
the residues 'are computed, the adder uses the output of the flip flop (slave) while
the output of the hardwired shift left 2r, is transferred to the input end of the flip
flop( master). The same requirements for the flip flop are imposed in the accumulator
unit where the flip flop must act as both the accumulator's adder output register
(master) as well as accumulated input to the adder.
The chosen circuit for our master-slave flip flop is shown in Figure 3.5 [Ref
20].
Analysis of Figure 3.5 shows two cascading 2-phase static latch. This struc-
ture is sound and efficient to implement. A minor problem of clock race is possible
when clock is high and clockbar overlaps it causing a tendency for the input and feed-
back signal to contest with the new value on the flip flop input [Ref 20]. Fortunately,
for our purpose, this problem did not manifest itself as the feedback transistor is




-^S NS v^ Ira *SB3 JB^i 2 C

























Figure 3.6: MSFF Layout
slave flip flop is given in Figure 3.6. It should be preempted that the design will
be slightly alter later on in order to conform to the overall modularity of the entire
modulo reduction unit.
Silicon space for the MSFF is 64 x 135 /xm2 . SPICE analysis [Ref 21] on the
layout determined a delay from input to output to be 10ns. The maximum speed
of operation for the MSFF is lOOMhz. Since the input and output of the MSFF is
inherent only to the single module, no effect from the other modules are of concern.
2. Adder
Due to the modularity of the design, the simplest approach is taken in the
development of the two adders in the module. The chosen unit for both adders is
a combinational adder with approximately equal sum and carry delays. Carries are
allowed to ripple through the necessary modules. This choice is made mainly to
conform to the modular structure. The ripple carry design does cost much in speed.
The circuit diagram for the adder is shown in Figure 3.7 [Ref 20]. The appropriate
33
Figure 3.7: Adder Circuit
Vddi .^
iH^^vcKvv'^^^*^i\v^ii} ~v v !v ;; ; * u'l'^vc'vw^i
s* as f|>»: s§ §§ •::•>: p §§ •:•:•:•: ^ •:•>:•: ^













Figure 3.8: Adder Layout
layout follows in. Figure 3.8.
The adder layout sizes up to 73 x 145 //m 2 . SPICE analysis Ref 21] of a single
adder unit showed that the sum and carry delays are 4.8ns and 4.5ns respectively.
From this result, intuition dictates that when the unit is put together for a larger





Figure 3.9: MUX Function Block Circuit Diagram
3. Multiplexer
The reduction unit calls for the use of two 2:1 mux's per bit of modulus.
The first takes its select input from the sign bit of the sum of the first adder and
output 2ri or 2r,- — m as appropriate. The second simply acts as a multiplier with
its select input as the single bit shifted in from the output of the serial multiplier
and outputs the residues if the select is 1 and if select is 0. In short it acts as a
single bit multiplier. For our multiplexer, a function block design is used [Ref 22].
The circuit is shown in Figure 3.9 [Ref 22].
This is an NMOS device in which only one of the two inputs a, b is passed to
the output depending on whether NMOS-1 or NMOS-2 is turned on. Only one NMOS
gate can turn on at the time because the inputs to their gates are complements.
Intuitively, the select input of the multiplexer is the input to the two gates. The
VLSI layout is shown in Figure 3.10.
Because of the simplicity of the circuit, the only delay is one transistor
gate. Compared to the delay of the adder or flip flop, this is negligible and will not





Figure 3.10: Layout of MUX
4. Modulo Reduction Unit
Having all the necessary components, the entire modulo reduction unit can
now be developed. As previously mentioned, a "modular" design is implemented
in this thesis so that, depending on the size of the modulus, the entire unit can be
constructed by simply cascading the same module together n times (modulus is n-bit
in length.) Bearing this in mind, the layout for the module as well as a 4-bit modulus
modulo reduction unit is shown in Figure 3.11.
The foremost significance of the VLSI scheme for the modulo reduction unit
is that it is simple in implementation and, above all, it works. Using a CFL program
[Ref 3], the module can easily be generated into an n bit unit. Experimentally, RNL
simulations were performed [Ref 3]. The results, which are enclosed in Appendix
B, testify strongly on behalf of the unit's functional capability. However, as to the
efficiency in area and speed, the empirical data is debatable in support of different
individual's needs.
Since the modulo reduction unit is designed mainly for modularity, the size
of the entire structure grows geometrically with the number of bit that the unit is
designed for. Each module per bit is sized at 73 x 672 fim2 . If n is the number of
bits required to be modulo reduced, then n modules are needed. Disregarding the
36
;. i i iniiii MiMii ir i XL^'-l^TTn!'" HWlnnf ii i it''ttiv v r-—v ' ;;;';;; m-p-Trr rrnr in- y
;•"*-"",:;;""""
.7 ;i tittyy- ittt rrrTQflD
wmM'.'.'.v. :.v.v.tmmMM4; 'f.'.-•» \ >»»*»*•. jc ..-.'.-.-.Vaotmmeera)'i*9$Wfl •. ( limMv.'.1.'. $ ::::;M .>w»inuq»<: . m*M jj U 1) 1m d
'
h>•••• ....««»»/»»: >
lii ititf riftf tilt lt • • *• *^»> • • • • A»Sj»i»«»i«V»» » • •wwwww^wwv^- • •'<
••*
,0, BSiL ,°. • H iQ '' ,0 , «tK , .a, -• •• $5 »oii a. «a^ . o .. t v ! w ti-o.v a »o-. n
„4.i..(p.^...^... .^
_.b>..^. .;».... ^.^.i.....ji


















i..*^-*---—•*•*••&—4—4l~"<a-^— i-...s L#£.i..i L .$.;>. ...i. i. *.?.».<. •
















50 100 150 200
Input Bit-Length
250 300
Figure 3.12: Size of Modulo Reduction Unit
8000
50 200 250100 150
Number of Bits
Figure 3.13: Speed Performance of Modulo Reduction Unit From SPICE
minimal effect of overhead bits (Figure 3.3), the size of a modulo reduction unit for
n-bit modulus is n x 49056/im2 . Figure 3.12 is a plot relating the size of the unit to
the number of bits.
In regard to speed consideration, experimental data found the unit's car-
rychain to be the limiting factor. After SPICE simulation [Ref 21], Figure 3.13 was
obtained to gauge the speed performance of the modulo reduction unit.
38
Since the carrychain imposes the speed limit in this design, intuitively, one
can incorporate speed saving techniques such as various carry-look-ahead adders;
however, this will alter the modularity structure. This is beyond the scope of the
thesis but remains a viable avenue for speed improvement at the expense of silicon
space.
In summary, this chapter has provided the basic hardware building blocks
for a fast exponentiation scheme with specific details on a modulo reduction unit.
From this foundation, an RSA hardware implementation can easily be conceived.
Such an implementation is necessary in many applications, one of which is the subject
of the next chapter: a novel approach to PKS using neural networks. As will be
explained in the following chapter, the hardware technology developed here will be a
small integral part of a "pseudo" public-key cryptosystem based on neural networks.
39
IV. A NEURAL NETWORK-BASED
PUBLIC-KEY CRYPTOSYSTEM
Since all cryptosystems make use of some form of mapping functions to trans-
form data to unintelligible code and then recover it, a neural network - inherently
an excellent non-linear mapping technique - provides a viable choice for a medium
from which a possible cryptosystem can be based upon. In examining this possibil-
ity, this chapter presents an adaptation of the back-propagation neural network to
a "pseudo" public-key arrangement. Strictly as an initial research, a simple require-
ment of encrypting and decrypting a number representing any character or data is
fulfilled via the network. Following examinations of the network, a key management
system is then devised. As data are fed to the network in simulation of encrypting
and decrypting, the problems and solutions to the system are discussed. Finally,
a complete top-down block diagram of an entire cryptosystem based on the neural
network of this study is proposed.
A. EXPERIMENTS IMPLEMENTING A NEURAL NET-
WORK IN CRYPTOSYSTEMS
The neural network-based cryptosystem to be designed, a cipher system, re-
quires two basic elements: a key management scheme and an algorithm for two-way
mapping a set of numbers representing data. In this respect, it is fundamentally
not far different than other cryptosystems. The differences surface only in the im-
plementation of mapping. Whereas all existing system such as DES [Ref 23], once
implemented in hardware, maps in a set pattern, a neural network can change its
mapping any time by simply retraining its weights to new data. As it turns out, this
40
deviation from the norm is advantageous since it adds an extra level of protection.
Namely, if the system is compromised, retraining and obtainment of new weights are
neither a difficult nor time-consuming task [Ref 24, 25].
Before the network is presented, some background is in order. The system
of this study is designed to map up to a set of 45 characters for encryption and
decryption. Figure 4.1 is a block diagram of the system. From Figure 4.2 [Ref 26],
the two networks for encryption and decryption are identical systems; they are both
back-propagation networks composed of 4 inputs, 1 output, and three hidden layers
of various sizes.
Prior to proceeding with the explanations of Figure 4.1, it is stressed that this
system is based mainly on the RSA system. As such, it simply takes a number,
encrypts it to another number and decrypts it back. Like RSA, this is all the neural
network is set up to do. For simplicity, this number represents a particular character;
however, the relationship between the number and character is not explored in detail
because this is a subject outside of the focus of this thesis. Furthermore, the input to
the network of this research is only 16 bit in length. Again this is chosen for simplicity
and clarity in an example system. It is not chosen for security. Like RSA in which
system security rests on the key being numbers greater than 256 bit, the security of
this system also depends upon the range of the input being greater than 256 bit. In
fact, with the input being only 16 bit long, the system can be compromised within
nanoseconds. However, successful cryptoanalysis of 256-bit inputs will be shown in
Section 4.D.1 to take trillion of milleniums. So in order to apply this system to real-
world application, it is preempted that the input range should be increased and the
assignment of a number to character be done separately so as to maximize security.
To clarify Figures 4.1 and 4.2, in order to encrypt, a 16-bit number representing
a character is partitioned into 4 segments so as to provide the 4 4-bit inputs to the
41
Example of Encrypt/Decrypt of Character 2
to a 4 Dltft H«
Number X




















Result Xi = [4 7 A D] = m Character 2
Note: Message M can only be within a certain range
of number which A originally used to train the encrypt
network. Hence the range of If must be sent separately
ia a separate P.K.S. (RSA).
Send Range of M
Using RSA etc...
Non Secure Receive Range of M
To Encrypt







f 14 Il5 Il6 Il7




: 18 kl . f1
2
13
/ U;4^LJ H i d d e n2
£fl-'tf




1 1 i in
Figure 4.2: Back-Propagation Network For Encryption and Decryption
encryption network, the output of which is a single 16-bit number different than that
of the original input. These 4 4-bit inputs along with their corresponding 16-bit
output are first fed to the network to train the weights. Once trained, the weights of
the encryption unit would have converged to values such that when these converged
weights are set as constants, the same 4 4-bit inputs used for training will provide
an actual output that can be rounded to the desired output used in training. For
example, if the desired output is 1256 then the actual output must be between 1255.5
and 1256.5 so that rounding to the nearest integer would yield 1256.
Naturally, for a system encrypting up to 45 separate characters, the correspond-
ing training sets will be 45 input/ouput pairs. Basically, this is how the network is
trained and utilized for encryption. It should be noted that whether the input/output
pairs are linearly related or not, the weights should converge and accommodate the
required mapping function.
For decryption, the same type of network, training and mapping scheme will
be used, only this time the recovery of the original data is essential. Intuitively, the
43
input of the decryption unit is the 16-bit output of the encryption network. To keep
the structures of the encryption and decryption networks identical, the encryption
output must be partitioned into 4 4-bit segments before it becomes inputs to be
decrypted. The desired output of the decryption network must then be the original
16 bit input of the encryption network. To clarify the process, the following example
is offered.
Example A:
Given a single processing element with 4 inputs and one output.
The element's function is /(£) = H5
The four input x's= [1 2 A 6]i6 ; output=12599 = 313716
The four converged encryption weights are found to be [77 1056 501 900] such
that
1(77) + 2(1056) + 10(501) + 6(900) = 12599.
The encryption weights are thus : [77 1056 501 900].
Since the encrypted output is 3137i6 , the decryption input is [3 1 3 7]i6
The four converged decryption weights are found to be [290 66 997 121] such
that
3(290) + 1(66) + 3(997) + 7(121) = 4774 = 12A616 .
The decryption weights are thus : 290 66 997 121.
Based on the example, a training set of several encryption and corresponding
decryption numbers can be randomly picked to represent any character. A typical
training set for 28 characters, the upper case alphabet with comma and space, is
shown in Table 4.1.
44
Encryption = «= Decryption
Text Character Hex Rep Dec Rep — «— Encrypted Character Hex Rep Dec Rep
A 12AC 04780 R 321C 12828
B 134E 04942 N 981B 38939
C 214B 08523 P A235 41525
D 2698 09880 S 425A 16986
E 35B7 13751 Q 6533 25907
F 538A 21386 A159 41305
G 6942 26946 L 8731 34609
H 661B 26139 D 2698 09880
I 728D 29325 M 9137 37175
J 7546 30022 H 661B 26139
K 811A 33050 B 134E 04942
L 8731 34609 J 7546 30022
M 9137 37175 C 214B 08523
N 981B 38939 F 538A 21386
A159 41305 A 12AC 04780
P A235 41525 G 6942 26946
Q 6533 25907 K 811A 33050
R 321C 12828 I 728D 29325
S 425A 16986 E 35B7 13751
T B366 45926 Z F553 62803
U B129 45353 Y EA54 59988
V C568 50536 space OBCA 03018
w D346 54086 U B129 45353
X D351 54097 W D346 54086
Y EA54 59988 V C568 50536
Z F553 62803 comma 092D 02445
space OBCA 03018 X D351 54097
comma 098D 02445 T B366 45926
TABLE 4.1: EXAMPLE TRAINING SET
45
Notably, the assignment scheme of Table 4.1 is monoalphabetic. This is chosen
strictly for simplicity, not security. The focus of of the neural network is to map a
number to another then recover it. How the number might represent a character is
entirely another subject in cryptography. In light of this, using training sets similar
to Table 4.1, experiments were next conducted to support the proposed theory of
using neural networks for a cryptosystem.
B. EXPERIMENTAL RESULTS AND OBSERVATIONS
In order to accommodate the mapping scheme for the proposed cryptosystem,
a series of experiments designed to gauge the performance of the back-propagation
network were carried out. The primary goal of the experiments is the development of
an optimal network based on several parameters. Information such as training time,
error tolerance, range of input numbers, network sizes and their interdependence
are of primary interest in building a working example network for the cryptosystem.
In accomplishing the desired goal, the chosen back-propagation network consists of
4 inputs, 1 output and 3 hidden layers of various sizes. The network is built and
simulated using the Neuralware software package [Ref 26] implemented in an IBM
'486, 50MHz, 16 Mbytes.
Table 4.2 provides the first set of results which are intended to show the re-
lationship between convergence error and training time. For the experiment, a set
of 45 training input/output pairs (45 characters of NTP) along with 4 bit per in-
put (16 bit overall since there are 4 inputs) were used. Error is measured in root
mean squared values (RMS), a common statistical method of error estimation which
is employed by Neuralware. Training time is compared by number of iterations, a
method of measurement used in Neuralware. It should be noted that time of iter-
ations varies for different networks. The larger the network, the time per iteration
46




5 2500 0.6 250000 0.5
10 73000 0.0025 300000 0.002
15 70000 0.002 350000 0.00006
20 124500 0.0005 270500 0.0001
25 115570 0.000085 340000 0.000017
TABLE 4.2: TRAINING TIME VS ERROR RELATIONSHIP
increases proportionally.
Conclusions drawn from Table 4.2 concern primarily training time and error.
Comparing the error with iterations to the error, one noted that up to the first set
of iterations, the errors decreased significantly for all networks. After this, the error
goes down significantly less even for a greater increase in iterations. This shows
that after a certain barrier, training of all networks follows the law of diminishing
return wherein the error decreases minimally despite greater increase in training time.
Eventually, when the error has reached its minimum, no amount of training time will
help. This behavior is typical of all neural networks [Ref 24, 25]. After this first
observation, another set of experiments were run and their results are summarized
in Table 4.3. For this experiment, the iterations to convergence were set to 3.5 x 105
iterations where it was determined that the error was at its minimum for all tested
networks (weights have converged to optimal values). The inputs again are 4 bit each
and 45 input/output pairs were used as training sets.
Clearly from Table 4.3, given the same set of input/ouput, the larger network
results in the least error at final convergence. This is due to the larger amount of
processing elements and weights (memory) available to accommodate the necessary
mapping patterns.
The final experiment intends to formulate the interdependence between network
47
Elements/hidden layer RMS error
5 0.2109
10 7.835 x lO"4
15 3.0836 x 10~5
20 2.492 x 10"5
25 1.684 x 10"5
TABLE 4.3: RELATIONSHIP BETWEEN NETWORK SIZE AND ERROR
size, iterations to convergence, and input size. The results are depicted in Figure 4.3.
The conclusions which can be drawn from Figure 4.3 are:
• In regards to the range of inputs, as the number of bits per input increases,
the training time increases. Theoretically, this trend can be attributed to the
weights having to accommodate mappings of larger number to smaller ones as
well as the reverse. Namely, as a set of small and large inputs maps to larger
and smaller outputs respectively, the weights have to be small as well as large
if there are not enough weights. This may lead to non-convergence as they can
not be both. This is seen in the extremely high increase in training time with
the smaller size networks. As the network grows, there are more weights to
map thus there is less strain on the system causing training time to decrease.
• In regards to the number of input/output pairs to be mapped, as the training
pairs increased to 45 (number of characters in NTP set), the iterations to con-
vergence also increased. This is easily explained by an analogy to the human
brain which is the structure emulated by neural networks. When there is more
information to learn, the brain labors to maximum capacity until its cells are de-
pleted. In the case of neural networks, as the size of the network is exceeded by
the information memory" demands, the iterations increase with approximately
no learning. A barrier is reached until more neurons are available.
48








10 20 30 40
Characters Mapped
3 bit per input
10 20 30 40
Characters Mapped
Solid- 20 elements / hidden layer
Dashed- 30 elements / hidden layer
Dash dot(star)-40 elements / hidden layer
Tanh, Delta Rule























10 20 30 40
Characters Mapped
50
Figure 4.3: Relationship between Network Size, Iterations to Convergence and Input
Size
49
• In regards to the size of the network, the relationship to input/output as well
as range of inputs are already described in observations of Table 4.2 and 4.3.
One more observation is added here in that as network size is enlarged for more
training input or input size, the training time increased. Mathematically this
makes sense since there are more weights and neurons (memory) to update.
Each iteration now takes longer to complete.
After thorough exploration of empirical data, the final conclusion is that there
exists a network for the proposed cryptosystem. And it works. After several trials,
the optimal network for this paper's system is found to consist of a 4 bit per input,
4 inputs, 1 output, 3 hidden layers, 25 elements per hidden layer, with 45 sets of
input/output traing pairs. This specific network is used in a conclusive example in
the next section.
C. AN IN-DEPTH EXAMPLE
This example is based on Table 4.1 which in turn is based on the Naval Tacti-
cal Publication coding scheme wherein a character is mapped unto another: A<-+R,
B«-*N... This scheme is chosen for clarity in that an encrypted text will also be a
string of characters. In reality, however, since the characters are coded by a num-
ber, the encrypted text need not be a number representing another character. For
instance, character 'A' encrypts to 5BCF\q where 5BCFi6 in this case does not
represent a character in Table 4.1.
This example employs a monoalphabetic substitution scheme to assign a number
to a character. In this respect, this system is vulnerable to single-letter frequency
analysis and is therefore easy to break [Ref 27] . However, if each character is coded
by multiple numbers utilizing schemes such as homophonic or polyalphabetic sub-
stitution (Beale or Vignere and Beaufort cipher), the safety margin would greatly
50
increase [Ref 27]. Additionally, for real-world application, the input range must be
raised from 16 bit to greater than 256 bit.
As stated in the previous section, this system, based on RSA, is concerned only
with two-way mapping a number to another. Bearing this in mind, this section is
intended only as a pedagogical example of how such a scheme could be implemented
so as to be able to actually encrypt and decrypt a plaintext message. In reality,
for complete security, a separate scheme of assigning numbers to characters must
be- chosen to defeat the frequency of letters in plaintext. If interested, the reader is
referred to reference 27 for the assignment of numbers to characters. Moreover, the
range of the network's input must be greater than 256 bit. Having established the
objective of this example, illustrations of the system is hereby offered. The following
plaintext message is encrypted and decrypted using the system of Figure 4.1.
Plaintext: FIND ME COMPLETE CHAOS AND I WILL SHOW YOU SCI-
ENCE
Decimal coded text and encrypted text:
F I I D ME C
I I I I II I
Plaintext : 21386 1 2932S 1 38939 1 09880 1 03018 1 37175 1 13751 1 03018 1 08523
1
Encrypted text : 41305 1 37175 1 21386 1 16986 1 54097 1 08523 1 25907 1 54097 1 41525
I I I I I
.
I I I IOMFSXCqXP0MPLETE CHAOS
I I I I I I I I I I I I
41305 1 37175 1 41525 I 34609 1 13751 1 45926 1 13751 1 03018 1 08523 1 26139 1 04780 1 41305 I 16986
04780 I 08523 I 26946 I 30022 1 25907 1 62803 1 25907 1 54097 1 41525 1 09880 1 12828 I 04780 113751
I II I I I I I I | | | |ACGJQZqXPORAE
A H D I WILLIII I I I I I
03018 1 04780 1 38939 1 09880 1 030181 29325 1 03818 1 54086 1 29325 1 34609 1 34609 I 03818
1
54097 1 12828 1 21386 1 16986 1 54097 1 37175 I 54097 1 45353 I 37175 1 30022 1 30022 1 54097
I I I I I I I I I I I IXRFSXMXUNJJX
51
SHOW YOU
I I I I III
16986 1 26139 1 4130S 1 54086 1 03818 1 59988 1 41305 1 45353 1 03818
1
13751 1 09880 I 04780 1 45353 1 54097 1 50536 I 04780 1 59988 1 54097
I I I I I I I I IEDAUXVAYX
S C I E I C E
I I I I I I I
16986 1 08523 1 29325 1 13751 1 38939 1 08523 1 13751
13751 1 41525 1 37175 1 25907 1 21386 1 41525 1 25907
I I I I I I I
e p m q f p q
Resulting encrypted text:
OMFSXCQXPACGJQZqXPDRAEXRFSXMXUMJJXEDAUXVAYXEPMQFPQ
Additionally, given the monoalphabetic scheme chosen here, in order to guard
against the problem of frequent repetition in the english vocabulary such as the word
the, double patterns //, nn, tt which can simplify cryptoanalysis, random or strate-
gically placed noise can be added to the encryption via some algorithm. Remember
that since one is using only 28 numbers out of 2 16 here, there are multitudes of num-
bers left to insert into the above patterns as noise bytes. In this specific example,
the noise is inserted by human intuition and is shown as asterisk (*) signifying any
number not used in coding the characters.
An example of encrypted text with noise inserted:
QMFS*XCqX*PACG*jqZ*qXQDR*AEX*RFSXMXU*MJ**JXE*DAUXV*AYXEPM*QFPQ
With the noise option, one must have a scheme to filter the noise out prior
to entering the decryption network. The decryption network simply recover the
plaintext from the encrypted text as previously discussed. Both the encryption and
decryption networks is subjected to the following parameters:
52
• Momentum coefficient = 0.300.
• Learning coefficient = 0.500.
• Function = Tanh.
• Learning rule = Delta-rule.
• Size = 4 inputs, 1 output, 3 hidden layers, 25 elements/layer.
• The time to minimum acceptable error was approximately 8 hours.
The two networks' (encryption and decryption) data employed for this example
are included in Appendix C.
Clearly, the basis of how to encrypt and decrypt via a neural network is es-
tablished. Based on knowledge of cryptography, the concept of a key must now be
incorporated.
D. KEY MANAGEMENT
Up until present, the method of mapping has been discussed without any men-
tioning of a key. In reality, the key evolves from the actual training process. Namely,
once the training is done, both for encryption and decryption, the converged weights
are the keys. Since different training sets are used (inverse sets), a key for encryption
and another for decryption are required. The keys will change when the network
switch mapping function via new training sets.
For our example of only one training input/ouput pair and one processing el-
ement in Section A (Example A), the keys are [77 1056 501 900] for encryption and
[290 66 997 121] for decryption. The fact that two keys must exist is perhaps clearer
now with the example; however, the fact that this is a one-way scheme only remains
murky. Let's clarify this further. For a specific set of encryption/decryption key that
53
party A obtains from training, party B given the encryption key, can encrypt while
A can decrypt using decryption key. Unless B somehow also obtain the decryption
key (the only safe way to do this is through a secured channel) there is no way for A
to encrypt to B unless B had come up with separate encrypt/decrypt keys of his own
and sent A the encryption key. There is no restriction against both parties using the
same encryption/decryption keys that only one has derived, provided the system is
a secret-key type where the keys can be distributed through safe channels. In this
respect, there is little to gain from a neural network as it is nothing more than an-
other mapping method. But there is much more to the versatility of neural network
which should be exploited.
In the key management scheme thus far mentioned, only one party needs to
train the network and then passes the weights as keys for encrypt and decrypt to his
or her counterpart. However, if both parties were to obtain separate training sets
and thus keys, only the encryption keys need to be exchanged. In this respect, there
exists a "pseudo" public-key scheme which can be exploited since the decryption key
requires no exchange. This possibility is hereby explored.
1. A Proposed Pseudo Public-Key Cryptosystem Using
A Neural Network
Irrefutably in cryptography, the possibility of a pseudo-public-key imple-
mentation of a neural network merits this paper further examination. Currently,
the designed networks mentioned that the keys, the encryption/decryption weights,
can be passed through a secured channel. If a cryptoanalyst has the keys and the
same network, he has broken all codes. Now the assumption is lifted. This research
postulates that if both parties develop their own set of keys, the encryption keys can
be exchanged through any public channel( Figure 4.1). A cryptoanalyst having pos-
54
session of the encryption key, a network, and encrypted data will face an enormous
obstacle in breaking the code: time (in terms of centuries.)
From the forementioned implementation, one recalls that only the encryp-
tion key needs to be exchanged if both parties train on separate data and each obtains
his or her own keys. The decryption key is never divulged. Given the encryption
key Eencr and the encrypted message Y a cryptoanalyst must solve an excessively
difficult equation to recover the original input X.
Example D:
Using data from our simple one element one input/output training Example A.







To solve for the original data, he must solve
llxi + 1056x2 + 501x3 + 900x4 = 3137i6
with x,- being 4 bit,
which is one equation and four unknown.
The above example is done on a simple single processing element model with
a simple linear function. Given a multilayer network such as the back-propagation
type with non-linear processing elements, even if the attacker knows the network, the
problem mathematically increases in difficulty since the number of elements grows
and thus the amount of required factorizations grows.
Even with a simple one cell example, for a crude cryptoanalysis method, one
must solve the equation -by trying 2 16 combination of inputs to break one character.
55
Using a crude equation for Table 4.4:
Time in seconds = 2Numberofbit9loops(10~9 sec computer/loop) lOOOcomputers
Number of input bits per x,- Time







1.08 x 1017 centuries
64 3.67 x 1055 centuries
TABLE 4.4: EXHAUSTIVE SEARCH CRYPTOANALYSIS TIME FOR A SINGLE
CELL
On the average it will take less then all combinations as it is probable that the
solution can come anywhere in the search. An exhaustive search of 216 loops for 216
combinations poses little problem with the power of the computer but let's say one
increases the same simple single layer input and output to a 32 -bit, 64-bit , 128-bit,
or 256-bit input. Herein lies the basis behind the security of this system: a large
range for the input of the network. Whereas up until now, only 16-bit inputs were
used in a simple example, when this range is increased to 256 bit, the difficulty of
working with such a large number renders any cryptoanalysis infeasible. Using an
exhaustive search, Table 4.4 shows the amount of total possible time it would take
to break one character given 1000 computers operating at 1 ns per loop operation (a
very generous, fast time).
As with all cryptosystems, the time above can be minimized further if the
system is susceptible to the problem of predictable frequency in the vocabulary.
Namely, when the number representing trends such as 'the', 'a', space, double letters
'11', 'nn' exists, estimation of those characters are made easier. With this system,
there exists a countermeasure in that one could use numbers not mapped to inject
noise into the transmission thus breaking up any patterns. Here, since only 45 num-
bers are needed to represent 45 characters, there are 2 16 — 45 random numbers left
56
to be used by some algorithm which would insert them into common words such as
those mentioned above. This possibility was shown earlier in the in-depth example
of Section C.
With the multi-element structure of the back-propagation network, the
cryptoanalysis problem is exponentially greater with increase in number of network
elements. Undoubtedly, the insurmountable time can be decreased given the luck
factor in the probabilities and in due time further development in mathematics can
solve in feasible time the NP complete problem. Nevertheless, at this date, the
postulate is made that this is a very safe public-key cryptosystem.
2. Justification of the "Pseudo" Prefix
Ironically, the restrictions which necessitate the prefix "pseudo" for the
system arise from the same attributes that make the system safe. Given a range of
bits of input x, one cannot use all the possible combinations to train the network.
For example, if each x was 64 bits long, one faces 24x64 = 2256 possible combinations.
In order to encrypt anything between and 2256
,
all 2256 numbers must be matched
to a unique y and trained to the network. This is comparable to the problem of the
cryptoanalyst; it would take trillions of milleniums - not feasible.
The solution to this problem is avoidance. One needs only to train a certain
range of number corresponding to the number of characters needed to be encrypted.
For the NTP character set in this proposed system, one needs only a range of 45
out of numbers 2 16 possible. However, both the encrypter and decrypter must know
this range. How is this range to be kept a secret and still be passed to both parties?
In order to make this neural network completely public-key, another PKS system is
required to pass this range. It is suggested that the already popular Rivest Shamir
Adleman (RSA) PKS system mentioned in Chapter II and III be used to pass this
57
range.
In summary, key management involves the direct public disclosure of the
encryption weights and the indirect public disclosure of the range of inputs via the
RSA system. This leads to the question of why not use RSA completely and not be
bothered with the neural network. The answer is that RSA is traditionally slower
compared to neural networks (after training) and since the range of numbers used in
encryption/decryption needs to be exchange only once prior to utilizing the system,
one can afford to use RSA whereas for text encryption a drawn-out repetitive real-
time process, a neural network is much more efficient [Ref 12, 24].
E. PROBLEMS OF A NEURAL NETWORK AS A CRYP-
TOSYSTEM AND PROPOSED SOLUTIONS
The two potentially detrimental problems with the neural network scheme are
that of the network weights not converging to an acceptable error for some non-
linear training sets (non-convergence) and the mapping not guaranteed to be one to
one (aliasing). Fortunately, the intrinsic versatility of neural networks is such that
solutions to these problems exist.
The more serious of the two problems, non-convergence, can be easily illustrated
by referring back to the one processing cell, one input/output training set example.
With simply one cell, an addition of a second input/ouput pair - if not linearly
related to the first pair - can cause the cell weights not to converge to acceptable
errors; namely, there are no possible set of weights which will accommodate the
correct outputs for both inputs. For example, the input/ouput pair [2 1 B 6]i6 and
[0 E F 3] i6 is added to example 4. A. Using the old convergence weight for the original
input/output, the actual output of the second pair is:
2(77) + 1(1056) + 11(501) + 6(900) = 12, 121 = 2£5916 .
58
Clearly this is not the desired output for the second input. Hence, if one was to
use the two data set above to train the single cell, the weights would not converge.
One is then left with some restriction as to how to choose training set (mapping func-
tion). This restriction, can be easily exploited by a cryptoanalyst to break the system
as he or she now knows that only certain mapping function is possible given knowl-
edge of the system. Luckily, this restriction can be lifted with the back-propagation
network used in this research.
As previously mentioned in Section A, a back-propagation network is an excel-
lent mapping method of non-linear functions. Relying on this property, the training
sets for encryption and decryption do not need to be linearly related. The more cells
one adds to the network, the more non-linear functions can be mapped. Theoreti-
cally, with enough cells per layers, the weights will converge to acceptable errors given
just any training data [Ref 24]. For the non-convergence example above, indeed the
back-propagation network did prove to be the solution.
Additionally for public-key cryptography, one must bear in mind that the train-
ing data for encryption and decryption are related. For it to work, the weights of both
encryption and decryption networks must converge. A training set that converges for
encryption but its inverse training set does not yield converged weights for the decryp-
tion network is otherwise of no use in cryptography. From experimental data of the
proposed 45 character encryption/decryption scheme, using the back-propagation
system, problems of convergence were sometimes encountered. The reader is referred
back to the experimental Section B where it was shown that when non-convergence
does surface, the solution is to add more cells.
Apart from non-convergence, the second problem, aliasing, proved less serious
but still needed to be dealt with. Aliasing occurs when, given a converged weights,
two or more sets of inputs map to the same output. This nuisance can be attributed
59
to the same problem which necessitated the "pseudo" prefix. Since one trains only
a range of inputs within the vast possibility (> 2256 ), the unused inputs could by
chance map to one of the same chosen outputs.
Example E:
Again reverting back to the one cell, one input/output training set of Example A in
Section A, an input of [1 2 A 6]ie along with encryption weights of [77 1056 501 900]
yielded an encrypted code of 12599 = 3137i6 .
Let's use an input of [7 1 4 A] 16 and the same converged weights. The encrypted
code for this input will be
7(77) + 1(1056) + 4(501) + 10(900) = 12599 = 313716 ,
which is the same output with the original input; hence aliasing has occured.
Clearly aliasing is a theoretical possibility and thus a problem; however, in real-
ity it can be easily be avoided by making sure one uses only the trained input/output
pairs for encryption and decryption. This way, one knows exactly that a given en-
cryption output should map back to the desired encryption input during decryption
and not the aliased value. In fact, the alias problem can be exploited to the system's
advantage. If certain aliasing problems are adapted intentionally, cryptoanalysis be-
comes more difficult. As previously explained in the "pseudo" justification section,
only the desired parties knows the range of inputs to use whereas others do not. It
is essential only to choose exact one-to-one mapping pairs in this range to avoid
aliasing. Outside this range, any other inputs can have the aliasing effect, an actual
benefit in extra safety.
60
F. DEVELOPMENT OF A COMPLETE BLOCK-DIAGRAM
LEVEL HARDWARE SCHEME USING A NEURAL NET-
WORK IN PKS
Up until now, most of the basic building blocks of a PKS using neural network
have been discussed. Gathering all the essential blocks together, a possible block
diagram proposal for an entire cryptosystem is shown in Figure 4.4.
Block by block description of Figure 4.4.
• The only component not yet delved into is the automatic generator of training
input/ouput sets. This function can be fulfilled by a linear feedback shift
register (LFSR). Given an input polynomial, it is a simple circuit capable of
generating a random set of different numbers given. For this study, an LFSR of
order 16 is necessary to generate 2 16 — 1 random numbers for both input/output
pairs of encryption. For further insights on LFSR's, consult reference 28. After
the input/ouput training sets of encryption is established by the LFSR, the
decryption input/ouput training sets must be the inverse; namely ouput and
input of encryption become input and input of decryption, respectively.
• Decrypt/encrypt neural net- Both networks are of the back-propagation type
composed of 4 inputs
,
1 ouput, 3 hidden layers with 25 elements per layer.
• Input Range Exchange- As discussed in Section D.2, the RSA hardware of
Chapter III can be used to send the range thus making this a "pseudo" PKS.
• Network Weights- The weights of the neural networks must be able to undergo
changes during training and then be set to constants once the the converged
weights are obtained via training or received from opposite parties. Simple





Linear Feedback Shift Register
(Generate Training Seta)
































Figure 4.4: Neural Network in PKS
62
A working model of a public-key cryptosystem based on neural networks has
been designed. It is merely a sample model which can be applied in limited usage;




This thesis has presented some novel approaches to public-key cryptosystems.
The focus was centered on a specific hardware implementation and a completely
new angle to PKS using neural networks. In both issues, research produced working
models when simulated by computers.
The hardware implementation for a modulo reduction unit in a fast exponentia-
tor - an essential device in the most popular PKS, RSA cryptosystem - was developed
based on the sum-of-residues method (SOR). The design is based on the concept
of modularity. The modular unit can be conveniently connected to form a fast ex-
ponentiator for numbers of any length. The result is a working VLSI layout when
simulated by RNL (Appendix C). The efficiency in speed and size, though offered in
the study, remains issues to be considered when the unit is to be used in real-world
applications. If the speed and size given hereby are acceptable to a certain applica-
tion then this unit is perhaps a viable alternative to existing technology due to its
advantage in modularity.
The second part of this thesis involves the use of neural networks in PKS. To
the author's knowledge, the attempt to integrate neural networks into cryptography
is a novel idea. Whether it is either original or even revolutionary remains to be seen.
That the goal is at all plausible is an unanticipated surprise when the experimental
results confirmed it so. This is not to say that plausibility means practicality. So far,
all that is proven is that the concept works. Whether the scheme is feasible needs
further research.
64
From data gathered in Tables 2.4 and 4.4, one can conclude that at 256 bit
in length for the key in RSA and input in the neural network-based cryptosystem,
exhaustive cryptoanalysis faces infeasible time limit. For all practical purpose, re-
quiring trillion of milleniums to break, the system of this thesis is as safe as any
current PKS (Table 4.4). Additionally, the most significant advantage in using neu-
ral networks in PKS is that there is no need for fast exponentiation which has proven
to be slow for large exponents and modulus [Ref 2] . The only necessary operations
in a back-propagation network are multiplication, addition and hyperbolic tangent
(or other non-linear functions.) The computational feasibility of the neural network
scheme, however, is not explored here and is left to follow-on research.
At present, the example system only applies for input ranging 16 bit in length.
For the system to be secured, it is suggested that the range be extended to 256 bit.
Intuitively, if one single network is to be used to map numbers with 256 bit range,
it will have to be large and thus will slow down the system. However, if parallel
processing is available and one can afford to design a 256 bit cryptosystem based on
16 16-bit neural networks, the results of this paper will be of value. Furthermore,
only the back-propagation network was used in this research. Given the multitudes
of network types in various applications, there may exist other schemes capable of
using other networks.
This paper is intended to pioneer the idea of neural network in cryptosystem.
As such it claims only the initiative in a novel avenue to cryptography. The proposed
theory of employing neural networks in cryptography now ends with a call for further
research into the efficiency, speed and possibilities of more capable networks. The
key to the knowledge gathered so far is that a new method is postulated and there
seems to be some merit in that it works with some restrictions. These restrictions
may be lifted by further investigation or perhaps there shall come a disproval which
65
may destroy the entire scheme altogether. Be that as it may, time constraint dictates
that this introductory study terminates with many aspirations of fueling follow-on




The following programs axe provided to supplement background knowledge in
public-key cryptography. In order, they are: fast exponentiation, greatest common
divisor, inverse, and factorization. The first three programs are written in C [Ref
2] and run on Unix while factorization is in Matlab code and ran on an IBM '486,
50MHz, 16MB.
/*
This program uses the fast exponential algorithm to compute the operation:
a~z mod n. It is intended as an example of software implementation of the
RSA public key cryptosystem. */
#include <stdio.h>
/* The algorithm is contained in the following function to be called when
necessary. */
int fastexp(a, z, n)






{while (!(z 7. 2))
i
z /= 2;











int a, z, n, t;
printf ("a"z(mod n) . Enter a, z, n ")
;
scanf("7.d 7.d 7.d0" ,4a,4z,to)
;
t» fastexp( a, z, n) ;
printf ("Result = 7.d\n" , t );
}
*********************•+ + ******* + ******** +********************
/*
This program uses Euclid's algorithm to solve for the greatest common
denominator (gcd) of two number. Given two input integers, a and n, this
program provides their mutual gcd. This is intended to be an example for
68




int gClOO]; /* Initialize an array for gcd */
int i«i;
printf ("gcd of a,n. Enter a,n separated by space:");
scanf ("7.d '/.dO", ftg[0], &g[l]);
while (g[i])
{
gCi+1] - gCi-1] % gCi];
i++;
}
printf ("gcd of %d and */.d is 7.d \n" ,g[0] ,g[l] ,g[i-l]) ;
s
>
/* This program compute the inverse, x, of a and n (0<a<n) such that
ax (mod n) = 1 */
include <stdio.h>
mainO
int g[l00], u[100], v[100]; /* Initialize arrays for indexing */
69
int i=l; /* Beginning index # of loop */
int y,n,a; /* Defining input and intermediate var. */
printf ("inverse of a,n. Enter a,n separated by space: ")
;
scanf ("7.d 7,d0", &a, An); /* Read in a and n */
g[0]» n;
g[l]= a;
u[0] = v[l] = 1;
u[l] = v[0] » 0;
while (g[i])
{
gCi]" u[i] * n + v[i] * a;
y- gCi-i]/gCi];
gCi+l] » g[i-l] - y*g[i];
u[i+l] = u[i-l] - y*u[i];
v[i+l] = v[i-l] - y*v[i]
;
i++;
} /* Using extension of Euclid's gcd algo */
if (vCi-1] <= 0)
{











7* This is a Matlab program designed to factorize a product of two
7% primes for the cryptoanalys is of the RSA public-key cryptosystem.
*/. Intended merely to show the futility of factorizing large numbers,
'/, it employs a naive exhaustive search method of dividing and
7* checking the remainder of the division of the product and every
7* possible odd numbers until a factor is found. To use the program,
7* simply type rsafac( 'product of 2 primes').
function[x]»rsafac(z)
;
% Enter the product.
w=round(sqrt(z)) ; */, Factor can not be larger than
7. the square root of the product
.
for n3 l:2:w % No need to test even numbers, and
7, limit of search is w.
v=z/n; 7, Testing by dividing products by
7* odd numbers
.
if (rem(v,l)==0) '/. If v is integer then
x=[n,v]; 7% n and v are factors.





RNL SIMULATION OF MODULO
REDUCTION UNIT
The following examples are indicative of the successful RNL simulation [Ref 3]
of the final modulo reduction unit. The unit simulated here is limited to modulo
numbes of 4-bit length. The RNL control file, stimulation file for one example are
included along with simulation results of 5 modulo operations.
Sample control file for RNL simulation of 5 mod 7 using modulo reduction
layout of .Figure 3.11.
; The name of this control file for rnl is: modl.l
; Simulation for modulo reduction unit of Chapter 3.
; LOAD STANDARD LIBRARY ROUTINES
(load "uvstd.l")
(load "uvsim.l")
; FILE WHICH WILL LOG THE RESULTS
(log-file "modl.rlog")
; READ IN THE BINARY NETWORK FILE
(read-network "modi")
; DEFINE THE TIME SCALE FOR SIMULATION
(setq incr 90)
; DEFINE INPUT VECTOR IF ANY, standard STYLE
(defvec '(bit state s3 s2 si sO ))
72
; DEFINE INPUT VECTOR IF ANY, SINGLE INDEX STYLE
; DEFINE INPUT VECTOR IF ANY, double index STYLE
; STANDARD REPORT FORMAT DEFINITION.
(def-report ' ("response= " ell cl2 in i3 i2 il (vec state)))
; PLOTFILE SPECIFIED
openplot "modi. ben"
; LOGIC ANALYZER STYLE OUTPUT FORMAT SELECTION.
(setq lanalyze t)
(wr-format)
; GLITCH DETECTOR SELECTION.
(setq glitch-detect t)
; NODE TRANSIENTS REPORT DEFINITION.
(chflag '( s3 s2 si sO))
; TRIGGER CONDITION SET-UP
; ADDITIONAL SIMULATION SET-UP COMMAND LINES.
(printf "Commence simulation. . An")
; SPECIFICATION OF A TIME/BASENAME FILE FOR INCLUSION.
(load "modi. time")
; ADDITIONAL WRAP-UP COMMAND LINES.
(printf "...completed simulation !\n")
exit
; GEN-CONTROL COMPLETED.
;The following is the stimulation file for the input to the rnl simulation
; above for 5 mod 7.
Sample < >.stim file for 5 mod 7:
73
time_range 10
in h 1 2 h 4 ; Note 101 is entered for 5
inn 1 h 2 1 4 ; Simply inverse of in
ell 2 1 h 1 ; 2-phase clocks
clln 2 h 1 1
cl2 2 h 1 1
cl2n 2 1 h 1
opt h x 1 ; Initializing MUX select
optn 10x1
mO h ; 2's complement of 7 is 1001
ml 1 ; Modulo number inputs
m2 1
m3 h
s3 1 x 1 ; Initializing summer
s2 1 x 1
si 1 x 1
sO 1 x 1
i3 1 x 1 ; Initializing 1st residue to 1
i2 1 x 1




The following is the RNL simulation result of stimulation file above
5 mod 7 :
118 nodes, transistors: enh=68 intrinsic=0 p-chan=56 dep=0
low-poweraO pullup»0 resistor»0
Report format of logic analyzer style output




9 1 1 1 0000
18 1 1 1 0001 - 1st elk pulse
27 1 1 0001
36 1 1 0001 - 2nd elk pulse
45 1 1 1 0001
54 1 1 1 0101 - 3rd elk pulse ***
63 1 1 1 0101
. . .completed simulation!
* Input is .101= 5 (Note input taken at each rising clock edge.)
** Residues are 1,2,4,1,2,4... for mod 7.
*** 5 mod 7 = 0101» 5.
t**********************************************************************
75
The following is a second RNL simulation result (10 mod 6)
118 nodes, transistors: enh=68 intrinsic=0 p-chan=56
dep=0 lov-power=0 pullup=0 resistor=0
Report format of logic analyzer style output
time ell cl2 in
*
i3 i2 il state (result)
Commence simulation
9 1 1 0000
18 1 1 0000 - 1st elk pulse
27 1 1 1 0000
36 1 1 1 0010 - 2nd elk pulse
45 1 1 0010
54 1 1 0010 - 3rd elk pulse
63 1 1 1 0010
72 1 1 1 0100 - 4th elk pulse ***
. . .completed simulation!
* Input is 1010= 10.
** Residues are 1,2,4... for mod 7.
*** 10 mod 6 = 0100= 6.
*******************************************************************
; Third RNL simulation using 10 mod 7:
; 118 nodes, transistors: enh=68 intrinsie=0 p-chan=56
; dep=0 low-power=0 pullup=0 resistor=0
76
; Report format of logic analyzer style output




9 1 1 0000
18 1 1 0000 - 1st elk pulse
27 1 1 1 0000
36 1 1 1 0010 - 2nd elk pulse
45 1 1 0010
54 1 1 0010 - 3rd elk pulse
63 1 1 1 0010
72 1 1 1 0011 - 4th elk pulse ***
...completed simulation!
* Input is 1010» 10.
** Residues for mod 7 is 1,2,4,1,2,4...
**10 mod 7= 0011 = 3.
; Fourth RNL simulation using 11 mod 6.
; 118 nodes, transistors: enh=68 intrinsic=0 p-chan=56
; dep=0 low-power=0 pullup=0 resistor=0
; Report format of logic analyzer style output




9 1 1 1 0000
18 1 1 1 0001 - 1st elk pulse
27 1 1 1 0001
36 1 1 1 0011 - 2nd elk pulse
45 1 1 0011
54 1 1 0011 - 3rd elk pulse
63 1 1 1 0011
72 1 1 1 0101 - 4th elk pulse***
81 1 1 1 0101
. . .completed simulation!
* input is 1011= 11.
** Residues of mod 6 axe 1,2,4,2,4...
*** 11 mod 6= 0101= 5
************************************************************
Fifth RNL simulation with 17 mod 5
118 nodes, transistors: enh=68 intrinsic=0 p-chan=56
dep=0 low-power=0 pullup=0 resistor=0
Report format of logic analyzer style output










0001 - 1st elk pulse
78
27 1 1 0001
36 1 1 0001 - 2nd elk pulse
45 1 1 0001
54 1 1 0001 - 3rd elk pulse
63 1 1 1 0001
72 1 1 1 0001 - 4th elk pulse
81 1 1 1 0001
90 1 1 1 0010 - 5th elk pulse***
99 1 1 1 0010
...completed simulation!
* Input is 10001* 17.
** Residues of mod 5 are 1,2,4,3,1,2,4,3...
*** 17 mod 5=0010 » 2.
79
APPENDIX C
SAMPLE NEURAL NETWORK FROM
NEURALWARE
The following is data for the encryption and decryption neural network used in
Chapter IV in-depth example. The network data is formatted from Neuralware [Ref
26] "annotated" option once convergence is reached. This option piovides all the
necessary parameters to reconstruct the network trained by data from Table 4.1. Of
the many parameters, those of interest are learning iterations (375642 for encryption
and 333877 for decryption), error function ( standard = hyperbolic tangent), learning
rule (delta-rule), and the processing elements' data. Of the element's data, the error
for each element's output was approximately zero once convergence is reached. The
weight data are not included other than the number of weights going to each element.
The reason for this omission is that it is not pertinent. With the data offered here
and Table 4.1, one can reconstruct the encryption and decryption network using
Neuralware.
Title: Encryption Network for In—Depth Example
r °ifPi?7 *





















































































Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00





*** Weights 0.000 Error






















































Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00

























Gain 1.0000 0.0000 0.0000
Learn Step 10000 30000 70000
Coefficient 1 0.3000 0.1800 0.0648
Coefficient 2 0.3000 0.1800 0.0648
Coefficient 3 0.1000 0.1000 0.1000
PE: 6
1.000 Err Factor 0.000 Desired
0.044 Sum 0.044 Transfer
*** 5 Weights 0.000 Error













































































































1.000 Err Factor 0.000 Desired
1.274 Sum 0.855 Transfer
23
1.000 Err Factor 0.000 Desired
0.031 Sum 0.031 Transfer
24
1.000 Err Factor 0.000 Desired
0.029 Sum 0.029 Transfer
25
1.000 Err Factor 0.000 Desired






































PEs: 25 Wgt Fields
Spacing: 5 F' offset
Shape : Square
Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00












1.000 Err Factor 0.000 Desired
0.221 Sum 0.218 Transfer 0.218 Output
***26 Weights -0.000 Error -0.000 Current Error

















100.0000 0.0000 0.0000 0.0000 0.0000
1.0000 0.0000 0.0000 0.0000 0.0000
1.0000 0.0000 0.0000 0.0000 0.0000
10000 30000 70000 150000 310000
l 0.2500 0.1500 0.0540 0.0070 0.0001
2 0.3000 0.1800 0.0648 0.0084 0.0001















-0.-297 Sum -0.288 Transfer -0. 288 Output
PE: 35
1.000 Err Factor 0.000 Desired
-0.168 Sum -0.167 Transfer -0. 167 Output
PE: 36
1.000 Err Factor 0.000 Desired
0.315 Sum 0.305 Transfer 0. 305 Output
PE: 37
1.000 Err Factor 0.000 Desired
1.152 Sum 0.818 Transfer 0.,818 Output
PE: 38
1.000 Err Factor 0.000 Desired
-0.165 Sum -0.164 Transfer -0,,164 Output
PE: 39
1.000 Err Factor 0.000 Desired
-1.256 Sum -0.850 Transfer -0,.850 Output
PE: 40
1.000 Err Factor 0.000 Desired
-0.520 Sum -0.477 Transfer -0,.477 Output
PE: 41
1.000 Err Factor 0.000 Desired
-1.282 Sum -0.857 Transfer -0 .857 Output
PE: 42
1.000 Err Factor 0.000 Desired





0.082 Sum 0.081 Transfer .081 Output
PE: 44
1.000 Err Factor 0.000 Desired
-2.658 Sum -0.990 Transfer -0 .990 Output
PE: 45
1.000 Err Factor 0.000 Desired
4.263 Sum 1.000 Transfer 1 .000 Output
PE: 46
1.000 Err Factor 0.000 Desired
-0.159 Sum -0.158 Transfer -0 .158 Output
PE: 47
1.000 Err Factor 0.000 Desired
-0.068 Sum -0.068 Transfer -o,.068 Output
PE: 48
1.000 Err Factor 0.000 Desired
-0.707 Sum -0.609 Transfer -0,.609 Output
PE: 49
1.000 Err Factor 0.000 Desired
-0.527 Sum -0.483 Transfer -0,,483 Output
PE: 50
1.000 Err Factor 0.000 Desired
-3.316 Sum -0.997 Transfer -0.,997 Output
PE: 51
1.000 Err Factor 0.000 Desired
-1.019 Sum -0.770 Transfer -0,.770 Output
PE: 52










1.000 Err Factor 0.000 Desired









PES: 25 Wgt Fields: 2
Spacing: 5 F' offset: 0.00
Shape : Square
Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00















Firing Density 00 .0000 0.0000 0.0000 0000 0.0000
Temperature .0000 0.0000 0.0000 .0000 0.0000
Gain 1 .0000 0.0000 0.0000 .0000 0.0000
Gain 1 .0000 0.0000 0.0000 .0000 0.0000
Learn Step 10000 30000 70000 150000 310000
Coefficient 1 .2000 0.1200 0.0432 .0056 0.0001
Coefficient 2 .3000 0.1800 0.0648 .0084 0.0001
Coefficient 3 .1000 0.1000 0.1000 .1000 0.1000
PE: 56
1.000 Err Factor 0.000 Desired
0.421 Sum 0.398 Transfer 0.398 Output
PE: 57
1.000 Err Factor 0.000 Desired
-0.212 Sum -0.209 Transfer -0.209 Output
PE: 58
1.000 Err Factor 0.000 Desired
0.145 Sum 0.144 Transfer 0.144 Output
PE: 59
1.000 Err Factor 0.000 Desired
-0.139 Sum -0.138 Transfer -0.138 Output
PE: 60
1.000 Err Factor 0.000 Desired
-0.209 Sum -0.206 Transfer -0.206 Output
PE: 61
1.000 Err Factor 0.000 Desired
0.137 Sum 0.136 Transfer 0.136 Output
PE: 62
1.000 Err Factor 0.000 Desired
0.151 Sum 0.150 Transfer 0.150 Output
PE: 63
1.000 Err Factor 0.000 Desired
-0.306 Sum -0.297 Transfer -0.297 Output
PE: 64
1.000 Err Factor 0.000 Desired
0.669 Sum 0.584 Transfer 0.584 Output
PE: 65
1.000 Err Factor 0.000 Desired
















































PEs: 1 Wgt Fields
Spacing: 5 F 1 offset
Shape: Square
Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00





























































Firing Density 100.0000 0.0000 0.0000 0.0000 0.0000
Temperature 0.0000 0.0000 0.0000 0.0000 0.0000
Gain 1.0000 0.0000 0.0000 0.0000 0.0000
Gain 1.0000 0.0000 0.0000 0.0000 0.0000
Learn Step 10000 30000 70000 150000 310000
Coefficient 1 0.1500 0.0900 0.0324 0.0042 0.0001
Coefficient 2 0.3000 0.1800 0.0648 0.0084 0.0001
Coefficient 3 0.1000 0.1000 0.1000 0.1000 0.1000
PE: 81
1.000 Err Factor -0 525 Desired
-0.583 Sum -0 525 Transfer -0.525 Output
26 Weights 0.000 Error 0. 000 Current Error
*****************************************************************
Resulting actual output and desired output for encryption after































































Title: Decryption Network for In—Depth Example of Chapter 4
Display Mode: Network Type: Hetero-Associative
Display Style: default
Control Strategy: backprop L/R Schedule: backprop
333877 Learn Recall Layer





















Firing Density 100.0000 0.0000
Gain 1.0000 0.0000
Learn Step 5000
Coefficient 1 0.9000 0.0000
Coefficient 2 0.6000 0.0000


















































PEs: 25 Wgt Fields: 2













































4 Wgt Fields: 1 Sum: Sum
5 F' offset: 0.00 Transfer: Linear
Square Output : Direct
00 Low Limit: -9999.00 Error Func: standard
00 High Limit: 9999.00 Learn : —None
00 Init High: 0.100 L/R Schedule : ( Network
)






















Scale: 1.00 Low Limit: -9999.
Offset: 0.00 High Limit: 9999.00




00 Error Func: standard
Learn: Delta-Rule
L/R Schedule: hiddenl
Winner 2 : None
Recall Step 1
Firing Density 100,,0000 0.0000 0.0000 0,,0000 0.0000
Gain 1.,0000 0.0000 0.0000 0,,0000 0.0000
Gain 1.,0000 0.0000 0.0000 0.,0000 0.0000
Learn Step 10000 30000 70000 150000 310000
Coefficient 1 0.,3000 0.1500 0.0375 0.,0023 0.0000
Coefficient 2 0,,3000 0.1500 0.0375 0,,0023 0.0000
Coefficient 3 0,,1000 0.1000 0.1000 0,,1000 0.1000
PE: 6
1.000 Err Factor 0.000 Desired
1.734 Sum 0.940 Transfer 0.940 Output
*** 5 Weights -0.000 Error -0. 000 Current Error
Repeat for PE's from here on, !5 weights. nearly error.
PE: 7
1.000 Err Factor 0.000 Desired
-2.111 Sum -0.971 Transfer -0.971 Output
PE: 8
1.000 Err Factor 0.000 Desired
-0.297 Sum -0.289 Transfer -0.289 Output
PE: 9
1.000 Err Factor 0.000 Desired
0.912 Sum 0.722 Transfer 0.722 Output
PE: 10
1.000 Err Factor 0.000 Desired
-0.258 Sum -0.252 Transfer -0.252 Output
PE: 11
1.000 Err Factor 0.000 Desired
-0.159 Sum -0.158 Transfer -0.158 Output
PE: 12
1.000 Err Factor 0.000 Desired
0.169 Sum 0.168 Transfer 0.168 Output
PE: 13
1.000 Err Factor 0.000 Desired
-0.342 Sum -0.330 Transfer -0.330 Output
PE: 14
1.000 Err Factor 0.000 Desired
0.677 Sum 0.589 Transfer 0.589 Output
PE: 15
1.000 Err Factor 0.000 Desired
-1.055 Sum -0.784 Transfer -0.784 Output
PE: 16
1.000 Err Factor 0.000 Desired
-0.215 Sum -0.212 Transfer -0.212 Output
PE: 17
1.000 Err Factor 0.000 Desired
1.487 Sum 0.903 Transfer 0.903 Output
PE: 18
1.000 Err Factor 0.000 Desired


















































PES: 25 Wgt Fields: 2
Spacing: 5 F' offset: 0.00
Shape : Square
Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00




































Winner 2 : None
Recall Step 1
Firing Density 100.0000 0.0000 0.0000 0.0000 0.0000
Gain 1.0000 0.0000 0.0000 0.0000 0.0000
Gain 1.0000 0.0000 0.0000 0.0000 0.0000
Learn Step 10000 30000 70000 150000 310000
Coefficient 1 0.2500 0.1250 0.0313 0.0020 0.0000
Coefficient 2 0.3000 0.1500 0.0375 0.0023 0.0000
Coefficient 3 0.1000 0.1000 0.1000 0.1000 0.1000
PE: 31
1.000 Err Factor 000 Desired
-4.909 Sum -1 000 Transfer -1.000 Output
26 Weights -0.000 Error -0. 000 Current Error
90


















































































































1.000 Err Factor 0.000 Desired
0.166 Sum 0.165 Transfer
51
1.000 Err Factor 0.000 Desired

























Scale: 1.00 Low Limit: -9999.00
Offset: 0.00 High Limit: 9999.00



















Firing Density 100.0000 0.0000 0.0000 0.0000 0.0000
Gain 1.0000 0.0000 0.0000 0.0000 0.0000
Gain 1.0000 0.0000 0.0000 0.0000 0.0000
Learn Step 10000 30000 70000 150000 310000
Coefficient 1 0.2000 0.1000 0.0250 0.0016 0.0000
Coefficient 2 0.3000 0.1500 0.0375 0.0023 0.0000
Coefficient 3 0.1000 0.1000 0.1000 0.1000 0.1000
PE: 56
1.000 Err Factor 0. 000 Desired
0.824 Sum 677 Transfer 0.677 Output
*** 26 Weights -0.000 Error -0.000 Current Error



































-0.110 Sum -0.110 Transfer -0. 110 Output
PE: 63
1.000 Err Factor 0.000 Desired
-0.076 Sum -0.076 Transfer -0. 076 Output
PE: 64
1.000 Err Factor 0.000 Desired
0.697 Sum 0.602 Transfer 0. 602 Output
PE: 65
1.000 Err Factor 0.000 Desired
-0.083 Sum -0.083 Transfer -0. 083 Output
PE: 66
1.000 Err Factor 0.000 Desired
-0.117 Sum -0.117 Transfer -0.,117 Output
PE: 67
1.000 Err Factor 0.000 Desired
-2.059 Sum -0.968 Transfer -0.,968 Output
PE: 68
1.000 Err Factor 0.000 Desired
0.513 Sum 0.472 Transfer 0.,472 Output
PE: 69
1.000 Err Factor 0.000 Desired
-0.735 Sum -0.626 Transfer -0,,626 Output
PE: 70
1.000 Err. Factor 0.000 Desired
-0.142 Sum -0.141 Transfer -0,.141 Output
PE: 71
1.000 Err Factor 0.000 Desired
0.405 Sum 0.384 Transfer 0..384 Output
PE: 72
1.000 Err Factor 0.000 Desired
0.007 Sum 0.007 Transfer .007 Output
PE: 73
1.000 Err Factor 0.000 Desired
3.931 Sum 0.999 Transfer .999 Output
PE: 74
1.000 Err Factor 0.000 Desired
0.238 Sum 0.234 Transfer .234 Output
PE: 75
1.000 Err Factor • 0.000 Desired
-0.478 Sum -0.444 Transfer -0 .444 Output
PE: 76
1.000 Err Factor o-.ooo Desired
-0.288 Sum -0.280 Transfer -0 .280 Output
PE: 77
1.000 Err Factor 0.000 Desired
0.474 Sum 0.441 Transfer .441 Output
PE: 78
1.000 Err Factor 0.000 Desired
-8.096 Sum -1.000 Transfer -1 .000 Output
PE: 79
1.000 Err Factor 0.000 Desired
0.169 Sum 0.167 Transfer .167 Output
PE: 80
1.000 Err Factor 0.000 Desired





PES: 1 Wgt Fieldsi: 2 Sum: Sum
Spacing: 5 F' offset.: 0.00 Transfer: TanH
Shape: Square Output: Direct
Scale: 1.00 Low Limit: -9999.00 Error Func: standard
Offset: 0.00 High Limit: 9999.00 Learn: Delta-Rule
Init Low: -0.100 Init High: 0.100 L/R Schedule: out
Winner 1: None Winner 2: None
./R Schedule: out
Recall Step 1
Firing Density 100.0000 0.0000 0,.0000 0.0000 0.0000
Gain 1.0000 0.0000 0,,0000 0.0000 0.0000
Gain 1.0000 0.0000 0..0000 0.0000 0.0000
Learn Step 10000 30000 70000 150000 310000
Coefficient 1 0.1500 0.0750 0..0188 0.0012 0.0000
Coefficient 2 0.3000 0.1500 0.,0375 0.0023 0.0000
Coefficient 3 0.1000 0.1000 0.,1000 0.1000 0.1000
PE: 81
1.000 Err Factor -0. 298 Desired
-0.307 Sum -o. 298 Transfer -0.298 Output
26 Weights 0.000 Error 0. 000 Current Error
•A***************************************************************
Decryption desired and actual output after convergence






























































1. W. F. Ehrsam, S. M. Matyas, C. H. Meyer, and W. L. Tuchman, "A Cryp-
tographic Key Management Scheme for Implementing the Data Encryption
Standard," IBM System Journal, Vol 17 No 2, 1978, pp.106-107.
2. Jennifer Seberry, Josef Pieprzyk, "Cryptography, An introduction to Computer
Security," Advances in Computer Science Series, Australia Pty Ltd, 1989, pp
8-188.
3. W. S. Scott, R. N. Mayo, G. Hamachi, J. K. Ousterhous,"1986 VLSI Tools Still
More Works by Original Artists," Report No. UCB/CSD 86/272, University
of California at Berkeley, Berkely, CA 1986.
4. VLSI Design Tools Reference Manual, Northwest Laboratory for Integrated
Systems, TR 88-09-01 University of Washington, Seattle, Sec 6.4 pp. 20-48.
5. Donald E. Knuth, The Art of Computer Programming, Vol.2: Seminumerical
Algorithm, Addison Wesley, Reading MA, 1981, pp. 399-401, 419-420.
6. Carl Pomerance, "Cryptography and Computational Number Theory-An Intro-
duction," Proceedings of Symposia in Applied Mathematics:American Mathe-
matical Society, Vol 42, Rhode Island, 1990, pp. 1-11.
7. D. H. Ackley, G. H. Hinton, T. J. Sejnowski, "A Learning Algorithm for Boltz-
man Machine," Cognitive Science 9, 1985, pp. 147-169.
8. G. W. Cottrell, P. Munro, D. Zipser, "Image Compression by Back-Propagation:
An Example of Extensional Programming," ICS Report 8702, University of
California at San Diego, Feb, 1987.
9. R. Hecht-Nielsen, Neurocomputing, Prentice Hall, Englewood Cliffs, NJ, 1991,
pp. 2, 124-137.
10. Neuralcomputing, Neuralworks Professional II/Plus and Neural Network Ex-
plorer, Neuralware Inc., Pittsburg, 1991, pp. 3-11, 89-101.
11. S. Goldwasser, S. Micali, A. C. Yao,"On Signature and Authorization," Ad-
vances in Cryptology: Proceedings of Crypto '82, Plenum Press., 1983, pp.
211-215.
12. Giles Brassard, "Modern Cryptography," Lecture Notes in Computre Science,
Vol 325, Springer-Verlag, NewYork, 1988, pp.20-31.
95
13. Kevin S. McCurley, "The Discrete Logarithm Problem", IBM Research Report
RJ6877, Yorktown Heights, NY, 1989, pp20-21.
14. W. Dime, M. G. Hellman,"New Direction in Cryptography," IEEE Transactions
on Information Theory, Vol. IT-22, 1976, pp. 472-492, 644-654.
15. Ronald L. Rivest , "RSA Chips (Past/Present/Future)," Advances in Cryptology-
Eurocrypt '84, Berlin 1985, p. 159.
16. Carl Pomerance, "Factoring," Proceeding of Symposia in Applied Mathematics:
American Mathematical Society Vol 42, RI, 1990, pp.27-48.
17. J. A. Davis, D. B. Holdridge, G. J. Simmons, "Status Report on Factoring,"
Advances in Cryptology- Eurocrypt '84, Berlin, 1985, p. 183.
18. Carl Pomerance, "The Quadratic Sieve Factoring Algorithm," Advances in
Cryptology- Eurocrypt '84, Berlin, 1985, pp.169-181.
19. P.A. Findley, B. A. Johnson, "Modulo Exponentiation Using Recursive Sums of
Residues," Advances in Cryptology- Crypto '89, Springer-Verlag, pp. 371-386.
20. Neil West, Kamran Eshraghian, Principles of CMOS VLSI Design, Addison
Wesley, Reading, MA, pp.212-224, 310-312.
21. Andrei Vladimirescu, Sally Liu, "The Simulation of MOS Integrated Circuits
Using Spice2," Electronics Research Laboratory, UC Berkeley, UCB/ERL M80/7,
October 1980.
22. Carver Mead, Lynn Conway, Introduction to VLSI Systems, Addison Wes-
ley,Reading MA 1980, pp. 150-153.
23. FIPS,"Data Encryption Standard", Federal Information Processing Standard
Publication, 46-1, NBS, Jan, 1988.
24. R. Hecht-Nielsen, Neurocomputing, Prentice Hall, Englewood Cliffs, NJ, 1991,
pp. 3-1, 89-101.
25. S. Luce, D. Martins, S. Nunn, J. Waters, "Exploring the Back- Propagation
Model for Speech Application," Speech Tech', 1988, pp. 199-203.
26. Neuralcomputing, Neuralworks Professional II/Plus and Neural Network Ex-
plorer, Neuralware Inc., Pittsburg, 1991, pp. 3-11, 89-101.
27. Dorothy E. R. Denning, Cryptography and Data Security Addison Wesley,
Reading MA 1982, pp.62-78.
96
28. John Wakerly, Digital Design Principles and Practices, Prentice Hall, Engle-
wood Cliffs, NJ, pp. 484-486.
97
INITIAL DISTRIBUTION LIST
1. Defense Technical Information Center
Cameron Station
Alexandria, VA 22304-6145
2. Library, Code 52
Naval Postgraduate School
Monterey, CA 93943-5002
3. Chairman, Code EC
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5002
4. Dr. Chyan Yang, Code EC/Ya
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5000
5. Dr. Herschel H. Loomis Jr, Code EC/Lo
Department of Electrical and Computer Engineering
Naval Postgraduate School
Monterey, CA 93943-5000
6. Director National Security Agency
Attn: Mr Alan Chedester, V29
2800 Savage Rd
Ft Meade, MD 20755










MONTERf : CA 93943-5101
GAYLORD S

