Abstract-Finite State Machines (FSMs) are among the more complex structures found in almost all digital systems today. Hardware description languages (HDLs) are used for high-level digital system design. VHDL (VHSIC Hardware Description Language) supports several different coding styles for FSMs. A coding style must therefore be chosen to achieve specific performance goals and to minimize resource utilization when implementing in a reconfigurable computing environment such as an FPGA. This paper studies the tradeoffs that can be made by changing coding styles. A comparative study of three FSM coding styles shows their impact on performance and resource utilization for the most commonly used encoding methods in FPGA designs. The results show that one particular coding style yields savings in resource utilization together with a significant performance improvement over the others, while the others show consistent performance regardless of the resource utilization outcome.
I. INTRODUCTION
As with any programming language, whether it is a hardware description language such as VHDL or a software programming language such as C++, there is always more than one way to write code that accomplishes the same task [1, 2]. Each language offers the designer several viable alternatives for accomplishing a given task. The challenge facing a developer is to use these alternatives efficiently, within a constrained environment, while developing the code. However, the current literature presents no guidelines on how a hardware designer should write code to maximize performance or minimize resource utilization when designing a digital hardware system in VHDL [3]. In this paper, we address this deficiency by evaluating three different methods of coding a finite state machine using two different state-encoding schemes when implemented in an FPGA [4]. A finite state machine is treated as a hardware module that may have multiple inputs and can make decisions based on these inputs over time. There may be multiple paths that the machine can take [5]. In addition, several methods are used to encode the state registers within these FSMs. One-Hot and Gray encoding are the most common choices among HDL designers targeting FPGAs [6]. A thorough discussion of the pros and cons of these coding styles, along with their applications, can be found in [7, 8].
A. Combined Single Process
This method uses a single process to control both the state transitions and the outputs. It is claimed to be the most common method used for FSM design. An observed drawback of this method is that synthesis software has difficulty identifying it as a finite state machine structure [9]. However, this may be overcome by having the outputs inferred into a clocked element, such as a flip-flop. Figure 1 shows a block diagram of this type of implementation.
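As a minimal sketch of the combined single-process style described above, the following VHDL shows one clocked process that updates both the state and a registered output. The entity csp_fsm, its ports, and its two states are hypothetical and chosen only for illustration; they are not the FSM evaluated in this paper, and the synchronous active-high reset is an assumption.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical two-state controller, used only to illustrate the
-- single-process style; it is not the FSM evaluated in this paper.
entity csp_fsm is
  port (
    clk_i  : in  std_logic;
    rst_i  : in  std_logic;  -- synchronous, active-high reset (assumed)
    req_i  : in  std_logic;
    busy_o : out std_logic
  );
end entity csp_fsm;

architecture rtl of csp_fsm is
  type state_t is (ST_IDLE, ST_BUSY);
  signal state : state_t;
begin
  -- One clocked process holds both the state transitions and the outputs.
  -- busy_o is assigned only under rising_edge(clk_i), so it is described
  -- as a registered signal rather than as combinatorial logic.
  fsm : process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        state  <= ST_IDLE;
        busy_o <= '0';
      else
        case state is
          when ST_IDLE =>
            busy_o <= '0';
            if req_i = '1' then
              state  <= ST_BUSY;
              busy_o <= '1';  -- later assignment overrides the default
            end if;
          when ST_BUSY =>
            busy_o <= '1';
            if req_i = '0' then
              state  <= ST_IDLE;
              busy_o <= '0';
            end if;
        end case;
      end if;
    end if;
  end process fsm;
end architecture rtl;
```

Because the state is declared as an enumerated type, the choice between One-Hot and Gray encoding is left to the synthesis tool in this sketch.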
The third coding style also employs two processes, but it provides a different method for registering the outputs. Each output signal is first inferred as combinatorial logic and then registered based upon the next state rather than the current state, as shown in Figure 3 [7].
IV. DESIGN OF THE EXPERIMENT
For the purpose of characterization, a constant environment must be maintained, with changes limited to the parameters that affect only the coding styles. Thus, certain parameters remain unchanged throughout the evaluation process: tool selection, tool configuration, target device for implementation, FSM function, and target device architecture. Coding and implementation are conducted using the Xilinx tool suite, ISE 5.2i [10], targeting the FPGA device XC2S600E [11]. In order to focus only on changes in the FSM, common components are implemented separately and instantiated as needed by the FSM. In addition, a common constraints file that defines the targeted device and other required timing parameters is used. To ensure that the three coding styles operate identically, a common VHDL behavioral testbench was developed. The testbench models all possible combinations of input variations and monitors the outputs at all states. The ModelSim XE 5.6b simulation tool [12] is then used to verify that the function of the finite state machines remains the same for all tested coding styles. Based upon this methodology, conclusions can be drawn about the resource usage and performance of state machines and their dependency on the coding style and state encoding.
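The common-testbench idea can be sketched as follows. This is a minimal, self-contained illustration and not the testbench used in this study: the device under test toy_dut (a registered three-input AND), the testbench name tb_toy_dut, and the 10 ns simulation clock are all hypothetical. The stimulus process steps through every combination of three inputs and checks the registered output half a clock period after each sampling edge.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical device under test: registers the AND of its three inputs.
entity toy_dut is
  port (
    clk_i : in  std_logic;
    in_i  : in  std_logic_vector(2 downto 0);
    out_o : out std_logic
  );
end entity toy_dut;

architecture rtl of toy_dut is
begin
  process (clk_i)
  begin
    if rising_edge(clk_i) then
      out_o <= in_i(2) and in_i(1) and in_i(0);
    end if;
  end process;
end architecture rtl;

library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_toy_dut is
end entity tb_toy_dut;

architecture sim of tb_toy_dut is
  constant CLK_PERIOD : time := 10 ns;  -- assumed simulation clock
  signal clk  : std_logic := '0';
  signal stim : std_logic_vector(2 downto 0) := (others => '0');
  signal resp : std_logic;
  signal done : boolean := false;
begin
  -- Free-running clock that stops once the stimulus process has finished.
  clk <= not clk after CLK_PERIOD / 2 when not done else '0';

  dut : entity work.toy_dut
    port map (clk_i => clk, in_i => stim, out_o => resp);

  stimulus : process
    variable expected : std_logic;
  begin
    for i in 0 to 7 loop             -- all 2**3 input combinations
      stim <= std_logic_vector(to_unsigned(i, 3));
      wait until rising_edge(clk);   -- DUT samples stim on this edge
      wait until falling_edge(clk);  -- registered output has settled
      if i = 7 then
        expected := '1';
      else
        expected := '0';
      end if;
      assert resp = expected
        report "mismatch for input combination " & integer'image(i)
        severity error;
    end loop;
    done <= true;
    wait;
  end process stimulus;
end architecture sim;
```

In a comparison such as the one described here, each coding style would be bound to the same stimulus and checks in turn, so that any behavioral difference appears as a failed assertion.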
V. CASE STUDY
The finite state machine designed for this evaluation controls a unique serial protocol. The protocol consists of five signals, three inputs and two outputs, as shown in Figure 4. For clarification, a signal name ending with "_I" indicates an input and one ending with "_O" indicates an output. It is important to note that during the assertion of both nCTS_I and nRTS_O, multiple data byte sequences are possible. In order to introduce additional complexity into the state machine design, an exception is added to the typical operation. This exception provides a means by which the receiving entity can signal the transmitting entity that data is ready to be sent but the transmitting entity needs to initiate the transfer. Figure 5 illustrates the exception and how it is handled. Management of timers t1, t2, and t3 is required to control movement of the state machine. Additional components necessary for the operation of the digital system, such as programmable timers, a FIFO, and specific counters, are also designed and kept the same across implementations. They are not discussed here since they are beyond the focus of this case study and affect neither the state machine performance nor its implementation. With these challenges in mind, an appropriate state machine is developed. One important item to note is that, because the system clock is significantly faster than the clock the protocol uses, many states will have a significant amount of idle time before transitioning to the next state. This idle time must be managed such that no state machine movement occurs. Figure 6 shows the state diagram of the state machine developed to control the serial protocol described above. Each state manages particular signals in order to accurately depict the protocol management of nRTS_O and STS_O.
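One way to keep a state machine from moving during idle time is to give the next-state logic a default assignment that holds the current state. The sketch below assumes this approach and uses a two-process structure; the entity hold_fsm, the state names, and the timer-expired flag t1_done_i are hypothetical and are not taken from Figure 6.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical fragment: state names and the t1_done_i flag are
-- illustrative and are not taken from the state diagram in Figure 6.
entity hold_fsm is
  port (
    clk_i     : in  std_logic;
    rst_i     : in  std_logic;  -- synchronous, active-high reset (assumed)
    t1_done_i : in  std_logic;  -- high when an external timer has expired
    waiting_o : out std_logic
  );
end entity hold_fsm;

architecture rtl of hold_fsm is
  type state_t is (ST_WAIT_T1, ST_NEXT);
  signal state, next_state : state_t;
begin
  -- State register.
  process (clk_i)
  begin
    if rising_edge(clk_i) then
      if rst_i = '1' then
        state <= ST_WAIT_T1;
      else
        state <= next_state;
      end if;
    end if;
  end process;

  -- Next-state logic. The default assignment keeps the machine in its
  -- current state on every clock edge where no transition condition holds.
  process (state, t1_done_i)
  begin
    next_state <= state;
    case state is
      when ST_WAIT_T1 =>
        if t1_done_i = '1' then
          next_state <= ST_NEXT;
        end if;
      when ST_NEXT =>
        null;  -- further transitions omitted in this sketch
    end case;
  end process;

  -- Unregistered output decoded from the current state.
  waiting_o <= '1' when state = ST_WAIT_T1 else '0';
end architecture rtl;
```

With a system clock much faster than the protocol, the machine in this sketch stays in ST_WAIT_T1 for as many clock cycles as it takes for t1_done_i to be asserted.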
VI. RESULTS
In order to analyze the differences between coding styles, data is collected from various reports generated by the Xilinx tools at several stages of the process, as shown in Figure 7. Synthesis results are a key indicator of how the VHDL code is inferred into the FPGA fabric. As indicated above, the synthesizer assumes the best possible conditions, i.e., unlimited routing, signal fan-out, and logic resources. With this in mind, synthesis results are coarse compared to the actual resources used after further processing. Key resource usage indicators are found in the synthesis report provided by the tools: total slice count, total slice flip-flops used, and total 4-input LUTs used. Table 1 summarizes the data extracted from the synthesis reports.
It is also desired to evaluate performance based on how the logic is inferred into the device. An additional variable, the clock period, is therefore introduced and varied, which allows the performance of the system to be affected. When the system clock, net CLK_I, is set to a particular period as a global constraint, the place-and-route (PAR) process attempts to ensure that the design will work at the associated frequency. In a synchronous design, the global constraint forces the PAR algorithms to try to route signals such that the delay from the output of one flip-flop (FF) to the input of the next is less than or equal to this constraint. A range of timing constraints was chosen such that the PAR process would continually try to improve the routing of the logic. The constraint value ranges from 8.5 ns to 30 ns. Figure 10 is a graphical representation of the data shown in Table 3. All of the coding styles had essentially the same performance. The last constraint value at which all coding styles met the constraint was 9.00 ns.
Figure 10. PAR timing for One-Hot encoding
Table 4 provides the timing results for the three coding styles using Gray encoding.
All values are in nanoseconds and use the same constraints as in the One-Hot encoding method. Unlike the One-Hot encoding, the results varied more among the three coding styles. Figure 11 utilizes the data from Table 3. This method is not a recommended implementation because its outputs are driven by combinatorial logic. Therefore, the Combined Single Process (CSP) style implemented using Gray encoding achieved the best performance with a system clock of 8.72 ns, as indicated in Table 4. This method is the recommended implementation because its outputs are driven by registered logic.
