We present various experiments in hardware/software design tradeoffs encountered in speeding up long integer multiplications. This work spans more than a year, with more than 12 different hardware designs tested and measured. To implement these designs, we rely on our PAM (Programmable Active Memory, see [BRV]) technology, which provides us with a silicon foundry with a 50-millisecond turnaround time for implementing logic designs of up to 50K gates, fully equipped with fast local RAM and a host bus interface. First, we demonstrate how a simple hardware 512-bit integer multiplier coupled with a low-end workstation host yields performance on long arithmetic superior to that of the fastest computers for which we could obtain actual benchmark figures. Second, we specialize this hardware in order to speed up one specific application of long integer arithmetic, namely Rivest-Shamir-Adleman public-key cryptography [RSA]. We demonstrate how a single host driving 3 differently configured PAM boards delivers RSA encryption and decryption faster than 225 Kbits/sec for 512-bit keys. This beats the best currently working VLSI built specifically for RSA by one order of magnitude.
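
To illustrate why RSA throughput depends so heavily on long integer multiplication, the following is a minimal software sketch of modular exponentiation by square-and-multiply. It is not the PAM design described above; the function name `modexp` and the toy key values are assumptions chosen for illustration only. Each loop iteration performs one or two long multiplications followed by a reduction, so a 512-bit exponent costs several hundred 512-bit modular multiplications per block.

```python
def modexp(base, exponent, modulus):
    """Left-to-right square-and-multiply: returns base**exponent mod modulus.

    Hypothetical helper for illustration; Python's built-in pow(base,
    exponent, modulus) computes the same value.
    """
    if exponent < 0 or modulus <= 0:
        raise ValueError("exponent must be >= 0 and modulus must be > 0")
    result = 1 % modulus
    base %= modulus
    # Scan the exponent bits from most significant to least significant.
    for bit in bin(exponent)[2:]:
        result = (result * result) % modulus    # one long squaring per bit
        if bit == "1":
            result = (result * base) % modulus  # one extra multiply per set bit
    return result


# Toy RSA key, far too small for real use (assumed values: p = 61, q = 53).
n, e, d = 3233, 17, 2753
message = 65
cipher = modexp(message, e, n)    # encryption: m^e mod n
recovered = modexp(cipher, d, n)  # decryption: c^d mod n
```

With real 512-bit keys the structure is the same; only the operand size of each multiplication grows, which is the operation the abstract reports accelerating in hardware.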