In order to reuse a register transfer level (RTL)-based IP block, it takes another architectural exploration in which the RTL will be put, and it also takes virtual platforms to develop the driver and applications software. Due to the increasing demands of new technology, the hardware and software complexity of organizing embedded systems is growing rapidly. Accordingly, the traditional design methodology cannot stand up forever to designing complex devices. In this paper, I introduce an electronic system level (ESL)-based approach to designing complex hardware with a derivative of SystemVerilog. I adopted the concept of reuse with higher levels of abstraction of the ESL language than traditional HDLs to design multiplication server farms. Using the concept of ESL, I successfully implemented server farms as well as a test bench in one simulation environment. It would have cost a number of Verilog/C simulations if I had followed the traditional way, which would have required much more time and effort.
I. INTRODUCTION

A. Motivation
Electronic system level (ESL) design includes hardware and software interactions with higher levels of abstraction for system-level transactions. ESL methodologies have evolved from algorithmic modeling, such as architectural explorations, and proving concepts as supplementary techniques, such as design of embedded systems, system-level test bench development, hardware/software co-simulation, and highlevel synthesis of ASIC/FPGA designs. Efficient ESL design includes the ability to proceed from the concept to the optimal implementation of architectural functionality, as well as verification. As one of the most evolved ESL languages, Bluespec SystemVerilog (BSV) has introduced to the system hardware architects a new way to simplify implementing complicated control logic while at the same time not losing control over the architecture and efficiency of the design. According to [1] , a reduction of over 50% in time can be achieved in verifying a design and fewer than 50% of the bugs can be found compared to traditional register transfer level (RTL) design. BSV differs from traditional Verilog or VHDL, or even from SystemVerilog in many aspects. Instead of using the traditional procedural statements, which implement concurrency such as always, BSV uses statements named rules that refer to fully synthesizable behavior. BSV enables better overall system generation than via traditional ways since all instances such as methods, rules, modules, interfaces, and functions are regarded as first-class objects. This means that the objects can be used as arguments in other objects.
B. Related Work
This work is an extended version of a conference proceeding version [2] . In this version I explain the multiplier architecture, points on extracting a higher level of abstraction, and the reorder buffer more in detail. To build a server farm in the traditional way, a hardware finite state machine must be defined in HDL, which costs time and effort [3] . In the high-level abstraction languages such as BSV however, it is much simpler to define the state machine logic. At this point, I experimentally implemented server farms using BSV with a few existing Verilog IP logics such as multipliers and random number generators, to build the multiplication server farms. I first discuss the custom designed modified Booth multiplier, and then I discuss how to utilize the IPs in the BSV test bench program. Next, after considering how I achieved the higher levels of abstraction, I present the random number generator and the architecture of the server farms, and finally discuss the results.
II. MODIFIED BOOTH MULTIPLIER WITH FULL-CUSTOM LAYOUT
A. Modified Booth Multiplier
Fig . 1 presents the custom layout of the modified Booth multiplier. In order to accommodate a fast clock frequency, I used 2-stage pipeline architecture. In the Wallace tree module I used a 4:2 carry save adder (CSA) for a regular layout. The multiplier can perform 2's complement-represented multiplication in 9.5 ns with LG 0.6 μm 3-metal Nwell CMOS technology. The multiplier is composed of 9115 transistors and occupies 1135 × 1545 μm 2 in the die size [4] . The IP was also written in Verilog HDL for reuse in the BSV ESL design. 
B. PP-Generation Block
The partial product (PP) generation block includes the Booth encoder. According to the modified Booth's algorithm, a variation of multiplicand (A) among [+2A, +1A, 0A, -1A, -2A] must be chosen in reference to the last three digits of the multiplier input (B). +{1,2}/-{1,2} operation is implemented by shift or inversion. To reduce the area, I customized the size of the multiplexors using a weak pull-up pMOS transistor. In Fig. 2 , instead of using a complementary MOS transistor system, I used nMOS switches only, in order to reduce the size needed. To prevent the side-effect of using only nMOS, I appended a weak pMOS pull-up transistor between the input and output of the buffer so that the multiplexer output maintains both strong high and low.
C. Wallace Tree
The 9 pieces of partial products generated are handled by 4:2 CSAs. The reason I chose the 4:2 CSA structure was that 4:2 structure shows good regularity compared to the 3:2 structured types. I also used the sign-generation method to avoid unnecessary calculations incurred by the sign bits in the Wallace tree.
D. Carry Chain Adder
The carry vector and the sum vector are first stored in buffers, which were designed with a master/slave structure. In the next clock cycle, the register outputs are transferred into the 33-bit adder blocks. I used carry chain adders as the basic adder cell. The final addition results are transmitted through output buffers to the pads. 
PP-Generation
Guarded C
BSV also adop ed language in ce this langu guage, there is programming easily determ are logic [7] . 
RVER FARM
ection of comp server needs he arbiter alloc mes available implemented le paper-and-p he modified B I-A (Fig. 6) . In s the same ran the correspon n farm are cor complete each and the multip will be availab , the arbiter sh rder the jobs at fits this spe for out-of-orde implemented in zed IP. The con Fig. 7 
Random N
The public Adv phic algorithm ds and Technol erator [9] was nks to Drimer S Verilog HD erator. As sh mutations enab dom number ge 
Number Gen
vanced Encrypt m to achieve a logy-recommen reused and de et al. [10] , I w DL module to own in Fig. ble the cipher eneration. According to the cipher theory, encrypted data with enough high security density behave as random numbers because after the repetitive permutation, data retain the characteristics of whitening. I then devised a wrapper for the AES Verilog IP to fit in the BSV test bench program. The test bench program instantiates 2 multiplication server farms and dispatches the same jobs to each farm. I utilized the technique to use the fully filled first-in-first-outs to evenly distribute the data elements one at a time. The test result is shown in Fig. 9 . As you can see, the multiplication server farm accomplishes consecutive jobs in the order the data were received.
V. CONCLUSIONS
ESL methodologies have evolved from algorithmic modeling such as architectural explorations and proving concepts as supplementary techniques such as design of embedded systems, system level test bench development, hardware/software co-simulation, and high-level synthesis of ASIC/FPGA designs. To rapidly prototype the multiplication server farms, simple pipelined multipliers and modified Booth multipliers are implemented together with input buffers and reorder buffers, each for out-of-order completion. An AES cryptographic function module is reused to produce random number test vectors. By means of the BSV, I was able to utilize a higher system level of abstraction than the traditional HDLs to attain multiplication server farms with two farms. If I parameterize the number of farms and the number of multipliers, I could obtain additional server farms with ease, while it would have taken several times more time and effort to build the same implementation using Verilog or VHDL. 
