Abstract: Arithmetic Logic Units are one of the vital units in general-purpose processors and a major source of power dissipation. In this paper we demonstrate an optimized Arithmetic and Logic Unit built around an optimized carry select adder. Carry select adders are considered the best in their category in terms of power and delay. A full adder optimized for power has been used to synthesize the carry select adder. Combined with the new adder structure, there is a substantial improvement in power and delay. The total device power and hierarchy power have been reduced to 12.5 % and 53.39 % respectively. A 3 % reduction in total completion time has also been observed. The circuit has been synthesized in Verilog HDL on a Kintex FPGA through Xilinx 14.3 using 28 nm technology, and the results have been simulated on ModelSim 10.3c. The design is verified using SystemVerilog on QuestaSim in a UVM environment.
I. INTRODUCTION
In an era of a growing System-on-Chip industry and device scaling into the nanometre regime, the production of any VLSI chip requires attention to the power and area requirements and the propagation delay of the design. Densely packing transistors on a single chip has led to an increase in power dissipation. Power is the key concern because of the noteworthy growth in personal computing devices and wireless communication systems, which demand complex functionality and high-speed computation with low power consumption. The need for low-power devices continues to increase as components become smaller, battery-powered and more functional. The benefit of combining low-power components with low-power design techniques is more important now than ever before.
Addition is the most fundamental arithmetic operation. Subtractors, multipliers and similar units all have adders as their basic functional unit.
The main contribution of this paper is the design of a 64-bit Arithmetic and Logic Unit that computes eight functions. The design makes use of an optimized carry select adder block. The optimization has been carried out by reducing the internal logic of the circuit through the use of a multiplexer-based full adder circuit.
The rest of the paper is organized as follows. Section II discusses previous work on arithmetic logic units and carry select adders. Section III discusses the design of our 64-bit ALU. Section IV presents the results along with the simulation waveform. Section V concludes the paper.
II. PREVIOUS WORK
A multiplexer-based full adder cell that uses 23 % less power and is 64 % faster was proposed by Alhalabi and Al-Sheraidah [1] in 2001. The use of multiplexers not only reduces the transition activity and the charge recycling capability, but also makes all signal gates directly driven by fresh input signals, leading to a noticeable reduction in short-circuit power consumption. A 4-bit Arithmetic and Logic Unit that performed eight functions was proposed in 2011. An optimized full adder circuit was used for the addition and subtraction operations. A reduction of more than 70 % in power and area was observed compared to the conventional design [2]. Carry select adders are known for their speed. A low-power carry select adder can be an asset for any SOI [3], [4], [6]. An efficient full adder design can be used to optimize larger designs. In 2013, another 4-bit Arithmetic and Logic Unit was proposed, based on the gate diffusion technique and performing the same eight operations [5]. We have designed a 64-bit Arithmetic and Logic Unit that performs eight operations and is based on a multiplexer-based optimized full adder circuit. We are also in the process of implementing this design on a reconfigurable system, as implemented in [7]-[11].
III. ARITHMETIC AND LOGIC UNIT
The arithmetic logic unit is the basic building block of any central processing unit and is found in every microprocessor nowadays. In this paper the proposed Arithmetic and Logic Unit performs eight operations, namely addition, subtraction, increment, decrement, XOR, AND, EX-NOR and OR, depending upon the select line. Its RTL schematic is shown in Fig. 1. Table 1 gives the truth table for these operations. For SEL = 10X (where X denotes a don't care, either 0 or 1), the operation is taken as 100, that is, addition. The 64-bit optimized carry select adder is used for addition. For SEL = 101, the complement of the second input is added to the other input using the same carry select adder design. SEL = 110 adds 1 to the first input, and SEL = 111 subtracts 1 from the first input.
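The select-line behaviour described above can be illustrated with a small behavioural model. The following Python sketch is not the Verilog design itself. The encodings 100, 101, 110 and 111 follow the text, while the encodings 000 to 011 for XOR, AND, EX-NOR and OR are an assumption made only for illustration, as is the choice of two's complement for the "complement of the second input". The names `alu` and `MASK64` are hypothetical.

```python
MASK64 = (1 << 64) - 1  # all results are truncated to 64 bits


def alu(sel, a, b):
    """Behavioural sketch of an eight-operation 64-bit ALU."""
    a &= MASK64
    b &= MASK64
    if sel == 0b100:  # addition (encoding taken from the text)
        return (a + b) & MASK64
    if sel == 0b101:  # subtraction: a plus the two's complement of b (assumed)
        return (a + (~b & MASK64) + 1) & MASK64
    if sel == 0b110:  # increment the first input
        return (a + 1) & MASK64
    if sel == 0b111:  # decrement the first input
        return (a - 1) & MASK64
    # The four encodings below are assumed, not taken from the paper.
    if sel == 0b000:  # XOR
        return a ^ b
    if sel == 0b001:  # AND
        return a & b
    if sel == 0b010:  # EX-NOR
        return ~(a ^ b) & MASK64
    if sel == 0b011:  # OR
        return a | b
    raise ValueError("sel must be a 3-bit value")
```

For example, `alu(0b100, 5, 3)` returns 8 and `alu(0b101, 5, 3)` returns 2. Both arithmetic cases reduce to an addition, which is why a single adder can serve addition, subtraction, increment and decrement.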
A. CARRY SELECT ADDER
The carry select adder used here is built from an optimized full adder circuit based on multiplexers. Fig. 2 shows the full adder used in this design. Implementing the full adder with multiplexers reduces the logic used in the device. This methodology has given us better results than previously implemented arithmetic and logic units. In functional form, with the inputs written as $A$, $B$ and $C_{in}$, the sum and carry outputs generated by this full adder circuit are:
$$\mathrm{Sum} = A \oplus B \oplus C_{in}$$
$$\mathrm{Carry} = A \cdot B + C_{in} \cdot (A \oplus B)$$
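One way to realize the sum and carry outputs with multiplexers alone is sketched here in Python. This is a minimal model under an assumed, commonly used formulation in which the signal A XOR B selects between the carry-in and A. It is not claimed to match the gate-level structure of Fig. 2, and the names `mux2` and `full_adder_mux` are hypothetical.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: returns d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0


def full_adder_mux(a, b, cin):
    """One-bit full adder built only from 2:1 multiplexers (assumed structure)."""
    not_b = mux2(b, 1, 0)        # inverter expressed as a multiplexer
    p = mux2(a, b, not_b)        # p = a XOR b
    not_cin = mux2(cin, 1, 0)
    s = mux2(p, cin, not_cin)    # sum = p XOR cin
    cout = mux2(p, a, cin)       # p = 0 means a == b, so carry = a; else carry = cin
    return s, cout
```

The carry needs no AND or OR gate in this form: when the two operand bits agree the carry equals either of them, and when they differ the carry-in is passed through.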
The total power of any device is the sum of its static and dynamic power. Dynamic (switching) power is mainly due to the charging and discharging of the load capacitors driven by the circuit. Frequent toggling of the internal nodes increases the dynamic power. The static power of the proposed arithmetic logic unit is almost the same as that of the devices available in the market. This design reduces the dynamic power of the device by reducing the logic and signals it requires. With the use of multiplexers in the circuit, we are able to reduce the switching activity at the internal nodes and hence the power of the circuit. The propagation delay has also been reduced.
Fig. 3 shows how these full adders are cascaded to form a 4-bit ripple carry adder. In a ripple carry adder, the carry out of each stage is passed on as the carry in of the next stage. Multiple full adder modules can be cascaded one after another to add wider operands. These 4-bit adders are cascaded together in a 32-bit carry select adder. A carry select adder normally comprises two ripple carry adders and a multiplexer. The addition of two n-bit numbers is performed twice, once by each ripple carry adder: one assumes a carry-in of zero and the other assumes a carry-in of one. Fig. 4 shows the organization of the 64-bit carry select adder. This speeds up the addition, because once the two results have been calculated, the true sum and the true carry are chosen by the multiplexer as soon as the correct carry is known. Various kinds of carry select adders are used in the proposed design. Carry select adders offer a good compromise among area, speed, power and delay. The device built from these carry select adders has proven to be efficient in terms of power and delay.
Figs. 5 and 6 compare the power consumption values (clock, logic, signal, IOs, static power, dynamic power and total power) and the hierarchy powers of the conventional and proposed arithmetic and logic units. Fig. 7 compares the logic used in the proposed and conventional designs. The reduction in logic leads to a reduction in the power dissipation of the device. The total on-chip power and the time summary are listed in Tables II and III. Fig. 8 shows the simulation waveform generated through ModelSim for the 64-bit Arithmetic and Logic Unit.
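The carry select organization described in Section III-A, with two ripple carry adders per block and a multiplexer that picks the result once the true carry is known, can be sketched as follows. This Python model assumes uniform 4-bit blocks across the full 64-bit width, which is a simplification and not necessarily the block arrangement of Fig. 4. The function names are hypothetical, and the full adder is written with plain Boolean operators for brevity.

```python
def full_adder(a, b, cin):
    """One-bit full adder returning (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout


def ripple_carry_add(a, b, cin, width=4):
    """Ripple carry adder: each stage's carry out feeds the next stage."""
    total = 0
    carry = cin
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= s << i
    return total, carry


def carry_select_add(a, b, cin=0, width=64, block=4):
    """Carry select adder; width is assumed to be a multiple of block."""
    mask = (1 << block) - 1
    result = 0
    carry = cin
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        # Both candidate results are computed without waiting for the carry.
        sum0, c0 = ripple_carry_add(a_blk, b_blk, 0, block)
        sum1, c1 = ripple_carry_add(a_blk, b_blk, 1, block)
        # Multiplexer: the incoming carry selects the true sum and carry.
        blk_sum, carry = (sum1, c1) if carry else (sum0, c0)
        result |= blk_sum << lo
    return result, carry
```

In hardware the two ripple carry adders of every block operate in parallel, so the carry only has to pass through one multiplexer per block instead of rippling through every full adder. The software loop above reproduces the selection logic but not that timing behaviour.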
The total device power and hierarchy power have been reduced to 12.5 % and 53.39 % respectively. A 3 % reduction in total completion time has also been observed from the synthesis report. The 64-bit Arithmetic and Logic Unit has been successfully designed, simulated and optimized. The design has been synthesized using Verilog HDL. Further, the arithmetic logic unit has been verified successfully using SystemVerilog on QuestaSim. The simulation has been carried out to confirm the correct working of the design. The test bench has been written in a SystemVerilog UVM environment to check the design for bugs. The power of the proposed design can be reduced further when it is implemented at transistor level with the Cadence design tools.
