Abstract. This paper describes a fast software stream cipher called Fish, based on the shrinking principle applied to the lagged Fibonacci generator (Fish: Fibonacci shrinking). It is designed to make full use of the 32-bit word length of popular processors. On an Intel 486 clocked at 33 MHz, a C implementation achieves a data rate of 15 Mbit/s.
Introduction
Coppersmith, Krawczyk, and Mansour [CKM93] presented a promising stream cipher, the shrinking generator, at Crypto '93. It is based on shift registers with linear feedback. The output bits of one shift register decide which output bits of the other shift register are used and which are discarded. The design is well suited to hardware implementation. In software, however, shift registers are not very efficient because each machine instruction operates on a single bit only; the remaining bits in the processor's registers are unused.
In this paper we suggest an algorithm called Fish. We apply the shrinking principle to a stream cipher based on the lagged Fibonacci generator [Knu81] (Fish: Fibonacci shrinking). We use the full 32-bit word length of popular processors in order to achieve a high data rate.
The Principle of Shrinking Generators
In this section we describe a slight generalization of the principle of the originally suggested generator [CKM93]. We consider two pseudorandom generators A and S. A produces a sequence $a_0, a_1, \ldots$ of elements of $GF(2)^{n_A}$. S produces a sequence $s_0, s_1, \ldots$ of elements of $GF(2)^{n_S}$.
We apply a mapping $d : GF(2)^{n_S} \to GF(2)$ to the elements of $s_0, s_1, \ldots$ to decide which elements are accepted and which are discarded. In the original shrinking generator only elements generated by A are accepted or discarded; in our generalization the results of S are treated the same way. Another difference of our scheme is that the accepted elements are not yet the final result: another stage of processing is needed. We define the shrinking procedure as follows: if $d(s_i) = 1$ then $a_i$ and $s_i$ are accepted, otherwise they are discarded. That is, we define a sequence $i_1, i_2, \ldots, i_k, \ldots$ where $i_k$ is the $k$-th position in $s_0, s_1, \ldots$ at which $d(s_{i_k}) = 1$. We consider the shrunk sequences $z_0, z_1, \ldots$, which is $a_{i_1}, a_{i_2}, \ldots$, and $h_0, h_1, \ldots$, which is $s_{i_1}, s_{i_2}, \ldots$. For all elements $h_j$, $d(h_j) = 1$ holds. The principle of the generalized shrinking generator is illustrated in Fig. 1.
In the original shrinking generator, $n_A = 1$ and $n_S = 1$. The mapping $d$ was the identity, and $z_0, z_1, \ldots$ were used as the output bits of the generator.
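The shrinking procedure above can be sketched in a few lines of Python. This is only an illustration: the function name `shrink` and the sample bit sequences are made up for the example and do not come from the paper. The demonstration uses the original parameters ($n_A = n_S = 1$, $d$ the identity).

```python
from typing import Callable, Iterable, Iterator, Tuple


def shrink(a_seq: Iterable[int], s_seq: Iterable[int],
           d: Callable[[int], int]) -> Iterator[Tuple[int, int]]:
    """Yield the pair (a_i, s_i) for every position i with d(s_i) == 1.

    The first components form the shrunk sequence z, the second
    components form the shrunk sequence h.
    """
    for a_i, s_i in zip(a_seq, s_seq):
        if d(s_i) == 1:
            yield a_i, s_i


# Original shrinking generator: single-bit elements, d is the identity.
# The bit values below are arbitrary illustration data.
a_bits = [1, 0, 1, 1, 0, 0, 1, 0]
s_bits = [0, 1, 1, 0, 1, 0, 0, 1]

pairs = list(shrink(a_bits, s_bits, lambda s: s))
z = [a for a, _ in pairs]  # accepted elements of A
h = [s for _, s in pairs]  # accepted elements of S, all satisfy d(h_j) == 1
```

Passing `d` as a parameter keeps the same routine usable for the 32-bit case, where `d` selects one bit of a word.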
Specification of the Fast Software Algorithm Fish
In order to make full use of the 32-bit word length of most popular processors, we choose $n_A = 32$ and $n_S = 32$.
For both A and S we use the fastest software pseudorandom number generator we know, namely the additive generator [Knu81], which is also called the lagged Fibonacci generator. We define $a_i = a_{i-55} + a_{i-24} \bmod 2^{32}$ and $s_i = s_{i-52} + s_{i-19} \bmod 2^{32}$. The values $a_{-55}, a_{-54}, \ldots, a_{-1}$ and $s_{-52}, s_{-51}, \ldots, s_{-1}$ are the initial values of the generators and must be derived from the key. The sequence of the least significant bits of a lagged Fibonacci generator is generated by a linear feedback shift register (LFSR) whose feedback polynomial is a trinomial.
The mapping $d : GF(2)^{32} \to GF(2)$ maps a 32-bit vector to its least significant bit, $d((b_{31}, b_{30}, \ldots, b_0)) = b_0$.
It would be insecure to use the shrunk sequence $z_0, z_1, \ldots$ as the result, as in the original shrinking generator, since the underlying linear structure could be detected. With probability 1/8, a triple of elements $a_i$, $a_{i-55}$, and $a_{i-24}$ is accepted as elements of $z_0, z_1, \ldots$. An attacker could try to identify such triples by adding elements of $z_0, z_1, \ldots$ at a suitable distance and checking whether the sum turns up some elements later. Therefore we have to hide the linear structure of $z_0, z_1, \ldots$.
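As a minimal sketch of the additive generator, the following Python code implements the recurrence $x_i = x_{i-p} + x_{i-q} \bmod 2^{32}$ and checks that its least significant bits follow the XOR recurrence of an LFSR with a trinomial. The lags 55 and 24 are those of generator A. The function name `lagged_fibonacci` and the fill pattern used for the initial values are assumptions made for illustration; the fill is not a key schedule.

```python
from collections import deque
from typing import Iterator, Sequence

MASK32 = 0xFFFFFFFF


def lagged_fibonacci(initial: Sequence[int], long_lag: int,
                     short_lag: int) -> Iterator[int]:
    """Yield x_i = (x_{i-long_lag} + x_{i-short_lag}) mod 2**32."""
    if len(initial) != long_lag:
        raise ValueError("need exactly long_lag initial values")
    # state[0] is x_{i-long_lag}; state[-1] is x_{i-1}.
    state = deque(initial, maxlen=long_lag)
    while True:
        x = (state[0] + state[long_lag - short_lag]) & MASK32
        state.append(x)  # the oldest value drops out automatically
        yield x


# Arbitrary placeholder initial values (NOT derived from a key).
seed = [(i * 2654435761 + 12345) & MASK32 for i in range(55)]
gen = lagged_fibonacci(seed, 55, 24)
xs = seed + [next(gen) for _ in range(200)]

# Addition mod 2**32 acts as XOR on bit 0, so the least significant
# bits satisfy the linear recurrence b_i = b_{i-55} XOR b_{i-24}.
lsb_ok = all((xs[i] & 1) == ((xs[i - 55] ^ xs[i - 24]) & 1)
             for i in range(55, len(xs)))
```

Carries only propagate upward, which is why bit 0 alone behaves linearly while the higher bits do not.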
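We split the sequences $z_0, z_1, \ldots$ and $h_0, h_1, \ldots$ into pairs $(z_{2i}, z_{2i+1})$ and $(h_{2i}, h_{2i+1})$ and derive the two 32-bit output words $r_{2i}$ and $r_{2i+1}$ from these. We define

$$c_{2i} = z_{2i} \oplus (h_{2i} \wedge h_{2i+1})$$
$$d_{2i} = h_{2i+1} \wedge (c_{2i} \oplus z_{2i+1})$$
$$r_{2i} = c_{2i} \oplus d_{2i}$$
$$r_{2i+1} = z_{2i+1} \oplus d_{2i}$$

where $\oplus$ stands for the bitwise logical XOR operation and $\wedge$ for the bitwise logical AND. Here $d_{2i}$ denotes an intermediate word, not the mapping $d$. The last three equations achieve an exchange of those bits of $c_{2i}$ and $z_{2i+1}$ which are 1 in $h_{2i+1}$. The operations are visualized in Fig. 2.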
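The output stage can be sketched as follows. The code assumes the four equations $c_{2i} = z_{2i} \oplus (h_{2i} \wedge h_{2i+1})$, $d_{2i} = h_{2i+1} \wedge (c_{2i} \oplus z_{2i+1})$, $r_{2i} = c_{2i} \oplus d_{2i}$ and $r_{2i+1} = z_{2i+1} \oplus d_{2i}$; the function name `combine` and the sample words are made up for illustration.

```python
from typing import Tuple

MASK32 = 0xFFFFFFFF


def combine(z0: int, z1: int, h0: int, h1: int) -> Tuple[int, int]:
    """Derive the output pair (r_2i, r_2i+1) from (z_2i, z_2i+1, h_2i, h_2i+1)."""
    c = z0 ^ (h0 & h1)   # mask z_2i with the AND of both h words
    d = h1 & (c ^ z1)    # bits in which c and z_2i+1 differ, limited to h_2i+1
    return (c ^ d) & MASK32, (z1 ^ d) & MASK32


# Arbitrary sample words; h0 and h1 are odd, as every accepted h_j must be.
z0, z1 = 0x12345678, 0x9ABCDEF0
h0, h1 = 0x0F0F0F0F, 0xFFFF0001
r0, r1 = combine(z0, z1, h0, h1)

# Reference view of the last three equations: swap the bits of c and z1
# at exactly those positions where h1 has a 1.
c = z0 ^ (h0 & h1)
swapped_r0 = ((c & ~h1) | (z1 & h1)) & MASK32
swapped_r1 = ((z1 & ~h1) | (c & h1)) & MASK32
```

The XOR with a masked difference is the usual branch-free way to exchange selected bits of two words, which suits a word-oriented software cipher.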
The least significant bits of $h_{2i}$ and $h_{2i+1}$ are 1 because of our choice of the function $d$. Therefore it is possible to reconstruct the least significant bits of $z_{2i}$ and $z_{2i+1}$ from $r_{2i}$ and $r_{2i+1}$, and vice versa the least significant bits of $r_{2i}$ and $r_{2i+1}$ follow from $z_{2i}$ and $z_{2i+1}$. This implies that the least significant bits of the output words of Fish are the bits of the underlying LFSR shrinking generator, which has a feedback trinomial.
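The relation between the least significant bits can be worked out explicitly. The derivation below assumes the output equations $c_{2i} = z_{2i} \oplus (h_{2i} \wedge h_{2i+1})$, $d_{2i} = h_{2i+1} \wedge (c_{2i} \oplus z_{2i+1})$, $r_{2i} = c_{2i} \oplus d_{2i}$ and $r_{2i+1} = z_{2i+1} \oplus d_{2i}$, and writes $\mathrm{lsb}(x)$ for bit 0 of a word $x$ (a notation introduced here for convenience). With $\mathrm{lsb}(h_{2i}) = \mathrm{lsb}(h_{2i+1}) = 1$:

$$\mathrm{lsb}(c_{2i}) = \mathrm{lsb}(z_{2i}) \oplus 1$$
$$\mathrm{lsb}(d_{2i}) = \mathrm{lsb}(c_{2i}) \oplus \mathrm{lsb}(z_{2i+1}) = \mathrm{lsb}(z_{2i}) \oplus \mathrm{lsb}(z_{2i+1}) \oplus 1$$
$$\mathrm{lsb}(r_{2i}) = \mathrm{lsb}(c_{2i}) \oplus \mathrm{lsb}(d_{2i}) = \mathrm{lsb}(z_{2i+1})$$
$$\mathrm{lsb}(r_{2i+1}) = \mathrm{lsb}(z_{2i+1}) \oplus \mathrm{lsb}(d_{2i}) = \mathrm{lsb}(z_{2i}) \oplus 1$$

Under these assumptions the least significant bits of each output pair are the least significant bits of the corresponding shrunk pair in exchanged order, with one of them complemented, so each side determines the other.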
Implementation Considerations
For the implementation, a security aspect must be considered. It would be fatal for the security of the implementation if a potential attacker could find out from the timing behaviour whether results of the additive generators were discarded or not. In applications where this could be possible, it can be prevented by buffering.
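One possible reading of "buffering" is to produce output words in blocks and hand them out from a queue, so that the moment a word is delivered no longer reflects how many generator results were discarded to produce it. The sketch below illustrates only this structure. The class name `BufferedKeystream` and the block size are hypothetical, the word source is a stand-in, and the time taken by a refill still depends on the number of discards within the block, so this is not a complete countermeasure.

```python
from collections import deque
from typing import Deque, Iterator


class BufferedKeystream:
    """Serve words from a queue that is refilled one block at a time."""

    def __init__(self, words: Iterator[int], block_size: int = 256) -> None:
        self._words = words            # underlying word source (assumed given)
        self._block_size = block_size  # hypothetical tuning parameter
        self._buffer: Deque[int] = deque()

    def _refill(self) -> None:
        # Pull a whole block at once; per-word timing variations of the
        # source are hidden inside this single bulk step.
        for _ in range(self._block_size):
            self._buffer.append(next(self._words))

    def next_word(self) -> int:
        if not self._buffer:
            self._refill()
        return self._buffer.popleft()


# Stand-in source: plain integers instead of real cipher output.
ks = BufferedKeystream(iter(range(1000)), block_size=4)
first_six = [ks.next_word() for _ in range(6)]
```

In a real deployment the refill would typically run ahead of demand, for example while waiting for input, so that consumers never wait on the generator directly.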
Results for the Suggested Algorithm
On a PC with an Intel 486 clocked at 33 MHz, using the Metaware High C compiler and the Pharlap DOS-Extender, a C implementation of the suggested algorithm Fish achieves a data rate of 15 Mbit/s.
Several 
