This work aims at designing a floating-point exponential function using the table-driven method. The algorithm was first implemented using sequential VHDL and later translated to concurrent Verilog. The main part of the work consisted of creating modules that would handle basic IEEE-754 single precision number manipulation routines such as addition, multiplication, and rounding to the nearest integer. Using these routines, a model was implemented based on the table-driven algorithm. The VHDL design, as well as the Verilog design, were simulated and the results proved to be satisfactory. Synthesis was performed using CMOSIS5 technology on the VHDL code and yielded a fairly large result.
Introduction
The last two decades have brought extraordinary advances in numerical calculations. The improvements in hardware and in execution speed have contributed to the great progress in mathematical calculation speed and accuracy. Along with these advances came the development, in 1985, of a format that would revolutionize the world of science: the IEEE-754 single and double precision formats [2]. The numbers represented under these formats, which are respectively 32 and 64 bits in length, have a greater range than their 2's complement counterparts. In the early days, there was no hardware available to implement floating-point arithmetic. The only way to perform these operations was to write software routines. Unfortunately, the creation of such programs is rather complex and is not a trivial task for most people. Furthermore, the execution speed would be very slow when compared to a hardware implementation.
The interest then shifted to hardware design of such mathematical modules. The objective is to create an integrated circuit that would handle the transcendental mathematical functions in the IEEE-754 single and double precision formats. This paper outlines the work that was put into creating the hardware implementation of an exponential function. The design of the circuit was done using two different hardware description languages, namely Verilog and VHDL. Although the implementation followed the algorithm outlined in [5] and [3], several changes had to be made to accommodate our single-precision implementation, in contrast to the reference, which used double-precision operations.
Floating-Point Exponential Function
The table-driven implementation of the exponential function used by Ping Tak Peter Tang [5] consists of three main parts. The input value is first reduced to a certain working range. A shifted exponential function is then estimated using known polynomial approximations. Finally, the exponential function of the original input is reconstructed using a certain formula. The reduction writes the input $x$ as
$$x = (32m + j)\,\frac{\log 2}{32} + (r_1 + r_2) \tag{1}$$
where $|r_1 + r_2| \le (\log 2)/64$, $m$ and $j$ are integers, and $r_1$ and $r_2$ are real numbers. Note that all logarithmic functions are, in reality, natural logarithmic functions (base $e$). The polynomial approximation required is that of $\exp(r) - 1$, which can be expressed as a Taylor series:
$$p(r) = r + a_1 r^2 + a_2 r^3 + \cdots \tag{2}$$
where $a_1$ and $a_2$ are the coefficients and $r$ is the variable of the Taylor series. The exponential function can then be reconstructed in the following manner, starting from equation (1) and setting $r = r_1 + r_2$:
$$x = m \log 2 + \frac{j \log 2}{32} + r$$
$$\exp(x) = \exp\!\left(m \log 2 + \frac{j}{32} \log 2 + r\right) \tag{5}$$
$$\exp(x) = \exp\!\left(\log 2^{m} + \log 2^{j/32} + r\right)$$
The objective of the algorithm would then be to isolate $m$ and $j$ and find the coefficients for the polynomial ($a_1$ and $a_2$). The left-hand side of this equation can easily be solved and will be assigned the letter $N$. Using the modulo-32 function, the values for $32m$ and $j$, named $N_1$ and $N_2$ respectively, can be calculated with little or no error.
$L_1$ and $L_2$ are constants that can be added together in order to approximate $(\log 2)/32$. The reason for separating the value into two constants is to increase the accuracy to one that is higher than that of single precision. The sum of $r_1$ and $r_2$ represents the reduction of the input $x$ to a value $r$ in the interval $[-(\log 2)/64, (\log 2)/64]$. Using the value of $r$ in the Taylor series (2), the function $\exp(r) - 1$ can be approximated.
For convenience, only the first three terms will be considered.
The following step consists of forming the Taylor series, which can be done in many ways. The implemented algorithm first calculates the second- and third-order terms. The coefficients $a_1$ and $a_2$ are constants that were calculated by Ping Tak Peter Tang using a Remez algorithm [5].
Examination of equation (9) reveals that only two more values need to be calculated in order to obtain the final result: $2^m$ and $2^{j/32}$. The table-driven implementation was so named because the values of $2^{j/32}$, for $j$ ranging from 0 to 31, were calculated beforehand and stored in a table. These numbers are broken down into two parts, namely s-lead and s-trail, to increase the precision by roughly a factor of two. With these values known, the final result can be determined without difficulty using equation (9).
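The three stages described in this section (reduction, polynomial approximation, reconstruction) can be illustrated with a short software sketch. It is not the hardware design: it runs in Python double precision, uses a single table entry per $j$ instead of the s-lead/s-trail pair, and substitutes the plain Taylor coefficients 1/2 and 1/6 for the Remez-derived $a_1$ and $a_2$ of [5], whose values are not given here. It also assumes that the final reconstruction has the form $\exp(x) = 2^m\,(2^{j/32} + 2^{j/32} p(r))$. The names `exp_table_driven`, `TABLE`, `A1`, and `A2` are hypothetical.

```python
import math

LN2 = math.log(2.0)

# Table of 2^(j/32) for j = 0..31, computed ahead of time.
TABLE = [2.0 ** (j / 32.0) for j in range(32)]

# Assumption: plain Taylor coefficients stand in for the Remez-derived
# coefficients a1 and a2 used in [5].
A1 = 1.0 / 2.0
A2 = 1.0 / 6.0


def exp_table_driven(x):
    """Approximate exp(x) by reduction, polynomial, and table lookup."""
    n = round(x * 32.0 / LN2)   # N: nearest integer to x * 32 / log 2
    j = n % 32                  # N2 = j, in 0..31 (Python's % is non-negative here)
    m = (n - j) // 32           # N1 / 32 = m
    r = x - n * LN2 / 32.0      # reduced argument, |r| <= (log 2)/64 up to rounding
    p = r + A1 * r * r + A2 * r * r * r   # p(r) approximates exp(r) - 1
    s = TABLE[j]                # 2^(j/32)
    return math.ldexp(s + s * p, m)       # 2^m * (2^(j/32) + 2^(j/32) * p(r))
```

With $|r| \le (\log 2)/64$, the first neglected Taylor term $r^4/24$ is below $10^{-9}$, so even this three-term polynomial is already accurate beyond single precision before any rounding effects are counted.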
Hardware Implementation
The implementation of the algorithm was done using two different hardware description languages, namely Verilog and VHDL. The design constructed using VHDL made use of the sequential mode, in contrast to the Verilog implementation, which used combinational logic. Both essentially implement the same algorithm outlined in the previous section.
The VHDL and Verilog designs are composed of numerous procedures that perform IEEE 754 operations. These operations include addition, multiplication, division by 32, rounding to the nearest integer, modulo 32, comparison, and powers of 2. These modules were used as building blocks to construct the floating-point exponential function (Figure 3). To ensure that the code is synthesizable, the program was made primitive and the length was much greater than it needed to be. A general description of each procedure follows. Interested readers can refer to [1] for a more detailed description.
Addition
The addition procedure covers both the addition and the subtraction operations. The idea is mainly the same for both, but handling both cases together adds a degree of complexity. The algorithm brings both numbers to the same exponent, adds or subtracts them, and then normalizes the result.
The first part of the addition checks which input is greater. This is important in cases where the inputs are of opposite signs. If the inputs carry the same sign, the output sign will be the same. When the signs are different, the input with the greater magnitude will impose its sign. The next step is to denormalize both inputs and perform the addition. However, before going on to that step, "01" has to be concatenated to both numbers. The reason for this is that the 1 is the implied 1 contained in the IEEE 754 format. The 0 is there to make sure that the carry bit is not lost. Denormalizing is done by right-shifting the smaller input by an amount determined by the difference in exponents. The exponent is unbiased by removing 127 ("01111111") from its biased value. Addition is then performed normally and the last part is normalizing. Normalization is done using a list of IF-THEN-ELSE statements to keep the code simple. It would have been more convenient to use FOR loops, but the code would then be more dense and significantly more complex for later synthesis.
Multiplication
Multiplication is an operation that is quite straightforward. Its algorithm is divided into three main parts corresponding to the three parts of the single precision format. The first part, the sign, is determined by an exclusive-OR function of the two input signs. The exponent of the output, the second part, is calculated by adding the two input exponents. Finally, the significand is determined by multiplying the two input significands, each with a "1" concatenated to it. The result obtained will have about twice as many bits as the significand should normally have, so the result is truncated and normalized, and the implied "1" is removed. The normalization process is fairly simple, knowing that the multiplication of two 24-bit numbers with a one at the most significant bit position yields a result with a one at the most significant bit (bit 47) or at bit 46. Depending on the situation, the result will either be shifted once or twice.
At the beginning of the algorithm, there is an IF statement that checks for exceptional cases where there is a zero in at least one of the inputs. Since inputs such as "zero", "NaN" (Not a Number) and both infinities are determined by a specific bit pattern, they have to be treated separately by the multiplication procedure.
0-7803-5579-2/99/$10.00 © 1999 IEEE
Division by 32
This function is only required to be used on a specific type of numbers: multiples of 32. Knowing this fact, the procedure does not need to support all possible types of inputs. The operations performed can be explained as follows: the algorithm will output zero if the input exponent is less than five and will simply subtract five from the exponent if it is not the case.
Round to Nearest Integer
The "Round to Nearest Integer" algorithm starts by checking if the exponent is of the order of -2 or less. This would result in an output of zero. The second case is to check if the exponent is -1, in which case the output would be equal to 1. These are two special situations that deal with negative exponents, since the main algorithm cannot handle these cases.
The basic idea here is to verify the bit at the 0.5 position. If the bit is set, the fraction positions are filled with zeros and one is added to the resulting integer. If the bit is reset, the bits located to the right of the binary point are simply reset. To accomplish this, the input is first shifted right by a number of positions corresponding to the exponent (so that all the fraction bits are shifted out). The number obtained should be an integer. This number is then incremented by one if the bit at 0.5 is set; otherwise it is left unchanged.
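The rounding procedure can be mimicked in software by operating directly on the single-precision bit fields. The following Python sketch follows the steps described above under a few assumptions: the helper names `fields` and `round_nearest` are hypothetical, the result is returned as a Python integer rather than re-encoded as an IEEE 754 number, and NaN and infinity inputs are not handled.

```python
import struct


def fields(x):
    """Split a value, rounded to single precision, into sign, unbiased exponent, fraction."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127
    fraction = bits & 0x7FFFFF
    return sign, exponent, fraction


def round_nearest(x):
    """Round to the nearest integer by inspecting the bit at the 0.5 position."""
    sign, e, fraction = fields(x)
    if e <= -2:
        magnitude = 0                      # |x| < 0.5 (this also covers zero)
    elif e == -1:
        magnitude = 1                      # 0.5 <= |x| < 1
    else:
        significand = (1 << 23) | fraction  # restore the implied 1
        if e >= 23:
            magnitude = significand << (e - 23)   # no fraction bits remain
        else:
            shift = 23 - e
            integer_part = significand >> shift           # drop the fraction bits
            half_bit = (significand >> (shift - 1)) & 1   # bit at the 0.5 position
            magnitude = integer_part + half_bit
    return -magnitude if sign else magnitude
```

Note that this scheme rounds halfway cases away from zero in magnitude (2.5 becomes 3), which differs from the round-half-to-even rule used by IEEE 754 arithmetic operations.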
Modulo 32
Modulo 32 is an operation that is done by simply taking the first five bits located to the left of the binary point. The result will then be an unsigned 5-bit integer that will have to be converted to single precision format.
The procedure is somewhat similar to that of rounding to the nearest integer. The input is first shifted right by the number of bits corresponding to the exponent. The result is then ANDed with the "11111" bit pattern in order to isolate the five bits. The conversion process checks where the first 1 is located, starting from the most significant position. An exponent is then assigned accordingly and the result is shifted left to comply with the rules of normalization.
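As an illustration, the shift, mask, and renormalize steps can be written as a small Python sketch on the single-precision bit pattern. The names `float_to_bits`, `bits_to_float`, and `mod32` are hypothetical, and the input is assumed to be a non-negative, integer-valued single-precision number; the sign bit is ignored.

```python
import struct


def float_to_bits(x):
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float(bits):
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def mod32(x):
    """Return x mod 32 as a single-precision value, for non-negative integer x."""
    bits = float_to_bits(x)
    e = ((bits >> 23) & 0xFF) - 127              # unbiased exponent
    significand = (1 << 23) | (bits & 0x7FFFFF)  # restore the implied 1
    if e < 0:
        low5 = 0                                 # no bits left of the binary point
    elif e <= 23:
        low5 = (significand >> (23 - e)) & 0b11111
    else:
        low5 = (significand << (e - 23)) & 0b11111
    if low5 == 0:
        return 0.0
    # Conversion back to IEEE 754: locate the leading 1, assign the exponent,
    # and shift left so that the leading 1 becomes the implied bit.
    top = low5.bit_length() - 1
    exponent = top + 127
    fraction = (low5 << (23 - top)) & 0x7FFFFF
    return bits_to_float((exponent << 23) | fraction)
```

For example, `mod32(37.0)` isolates the bit pattern "00101" and re-encodes it as 5.0.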
Comparison
Unlike the other procedures, the comparison does not output a number in the IEEE 754 format. Instead, it generates three bits that give a comparative indication of the size of the first input with respect to the second input. If the first input is greater than the second one, then the most significant bit is set. If the second input is greater than the first, then it is the least significant bit that is set. If the two inputs are equal, then the middle bit is set. Only one bit can be set at any given time.
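A software sketch of this three-bit output is given below. It compares the sign-magnitude bit patterns directly, which is one possible way to realize the comparison and is not necessarily how the VHDL or Verilog procedure is written. The names `compare` and `_ordered_key` are hypothetical, and NaN inputs are not handled.

```python
import struct


def _ordered_key(x):
    """Map a single-precision bit pattern to an integer with the same ordering."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    magnitude = bits & 0x7FFFFFFF
    return -magnitude if bits >> 31 else magnitude   # +0 and -0 both map to 0


def compare(a, b):
    """Return 0b100 if a > b, 0b010 if a == b, 0b001 if a < b."""
    key_a, key_b = _ordered_key(a), _ordered_key(b)
    if key_a > key_b:
        return 0b100
    if key_a < key_b:
        return 0b001
    return 0b010
```

Because exactly one of the three branches is taken, the result is always one-hot, matching the statement that only one bit can be set at any given time.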
Powers of Two
The powers of two function can be implemented by realizing that the value of the input is the value of the output exponent. For example, placing four as an input would result in two to the power of four, yielding four as the unbiased value of the exponent field. The objective of the function would then be to convert the input, being an IEEE 754 number, to a 2's complement number. The bias of 127 would then be added to the result and the sum would be placed in the exponent field. The sign and significand fields will be filled with zeros because the result will always be positive and will always be an exact power of two.
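The construction can be checked with a few lines of Python that assemble the output bit pattern by hand. This is only a sketch: the name `pow2` is hypothetical, Python's `int()` stands in for the IEEE 754 to 2's complement conversion, and the input is assumed to be an integer in the normalized exponent range of -126 to 127.

```python
import struct


def pow2(n):
    """Return 2**n as a single-precision value, built from its bit fields."""
    k = int(n)                        # stand-in for the conversion to 2's complement
    if not -126 <= k <= 127:
        raise ValueError("exponent outside the normalized single-precision range")
    bits = (k + 127) << 23            # biased exponent; sign and significand stay zero
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For an input of four, the exponent field holds 4 + 127 = 131 and the decoded value is 16.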
Get J
The current implementation of the exponential circuit uses the table-driven approach. The table index should ideally be an unsigned integer to make the search easier. The "Get J" procedure takes care of this. It takes a number in the single-precision format and transforms it to an unsigned number. The procedure examines the exponent and extracts the corresponding bits from the significand. Using an unsigned number for the search makes the task of finding a correct value for S easier (refer to the algorithm described in Figure 2).
Modifications and Remarks
The algorithm described by Ping Tak Peter Tang [5] used single precision in combination with double precision in order to achieve better accuracy. The work presented here does not cover double precision calculation and thus several changes have been made.
The beginning of the algorithm contains a multitude of IF statements checking for special case considerations. One of those cases is an upper limit threshold beyond which the output goes to infinity. The second case is a lower limit threshold that checks to see if the input is low enough for the following approximation to hold: OUTPUT = 1 + INPUT. The lower limit can be left the same without any major consequences. However, the algorithm overflows for inputs far smaller than the upper limit and thus the boundary had to be changed. The value for Threshold-1 was modified to 89 from its initial value of about 220.
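One way to see why an upper bound near 89 is natural for single precision is to compute the natural logarithm of the largest finite single-precision number. The short Python check below is only an illustration of that arithmetic and is not taken from the design; the names `MAX_SINGLE` and `OVERFLOW_INPUT` are hypothetical.

```python
import math

# Largest finite IEEE 754 single-precision value: (2 - 2^-23) * 2^127.
MAX_SINGLE = (2.0 - 2.0 ** -23) * 2.0 ** 127

# exp(x) exceeds the single-precision range once x is larger than log(MAX_SINGLE).
OVERFLOW_INPUT = math.log(MAX_SINGLE)

print(round(OVERFLOW_INPUT, 2))
```

The computed bound is slightly below 89, so any input at or above 89 necessarily produces a result that cannot be represented as a finite single-precision number.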
In addition, modifications had to be made to the modulo-32 function, which operates differently with negative numbers. The algorithm needs the output of this function to be positive, so negative results have 32 added to them.
Simulation and Synthesis Results
After implementing the algorithm, the next step is to verify the accuracy of the outputs. The verification is done by comparing the expected results obtained using a normal calculator to the output generated by the implemented algorithm.
The analysis was performed on 20 test vectors covering a widely-used range of inputs. The results obtained are tabulated in Table 1. As can be seen, the outputs differ from the expected results by only a small margin. These errors can be attributed to different factors that include errors in reduction, errors in approximation, and rounding errors. Errors in reduction occur because of the mapping of the input to a range of values $r$ between $-(\log 2)/64$ and $(\log 2)/64$. The approximation errors are present because only the first three terms of the Taylor series are used to calculate the $e^r - 1$ function. The rounding errors are due to the many cases where numbers had to be rounded because single precision did not provide enough accuracy. A more detailed analysis is described in the work of Ping Tak Peter Tang [5]. The numbers given there are, however, not applicable here because this project does not cover double precision arithmetic.
Using the RTL design of the exponential function described in previous sections, synthesis was performed using CMOSIS5 technology on the VHDL code and yielded a fairly large result. This can be attributed to the fact that no size restrictions were set in order to accelerate the process. Table 2 shows the results obtained using Synopsys.
Number of Nets 15944

Table 2: Synthesis Results
In Table 2, cells refer to the number of standard cells that the design uses, whereas nets refer to interconnects (internal input/output wires). All area measures are given in square microns.
Conclusions
This paper describes the functionality of the floating-point exponential function from the inside. It presents a general view of the building blocks that constitute the design. Two RTL models were created using VHDL and Verilog, and both were simulated, showing satisfactory results. The VHDL design was successfully synthesized and this becomes a good working element for future research.
Even though the contribution made by this work is substantial, there is still a lot of room left for improvement in terms of accuracy and compactness of the code. Most modifications will, however, not have a great impact on the performance of the design. As a finished product, this project seems promising in that it can be integrated along with other similar modules to form a transcendental mathematical unit.
Using the synthesized design, verification procedures can be made to formally verify the different levels of abstraction and, eventually, check if the RTL implementation implies the high-level specification. The process of formally verifying the algorithm described in
