Modeling the Instantaneous Power Consumption of an FPGA by Gualdarrama, Jon P
Worcester Polytechnic Institute
Digital WPI
Major Qualifying Projects (All Years) Major Qualifying Projects
January 2016
Repository Citation
Gualdarrama, J. P. (2016). Modeling the Instantaneous Power Consumption of an FPGA. Retrieved from
https://digitalcommons.wpi.edu/mqp-all/2712
  
 Modeling the Instantaneous 
Power Consumption of an 
FPGA 
A Major Qualifying Project Report 
submitted to the faculty of 
Worcester Polytechnic Institute 
in partial fulfillment of the requirements of the  
Degree of Bachelor of Science 
Submitted to: 
Professor Thomas Eisenbarth 
Submitted by: 
Jon Paul Gualdarrama 
 
 
Date: December 16, 2015 
jpgualdarrama@wpi.edu 
 
 
Abstract 
The power consumption of an FPGA’s routing network dominates the instantaneous power consumption 
of the device. A model for the routing network would therefore be useful for simulating the instantaneous 
power of the device. The goal of this project is to develop such a model, relating the power consumption 
of the routing network to the length of each net. This model will be integrated with power simulation 
tools in the Power Side-Channel Attack Risk Evaluator (PSCARE) to give a more accurate power 
simulation for FPGAs. Preliminary testing on the model shows that it can correctly simulate a modified 
PSCARE power simulation. 
 
Table of Contents 
Abstract
Table of Contents
Table of Figures
Table of Tables
1 Introduction
2 Background
2.1 FPGA Architecture
2.1.1 CMOS Transistor Architecture
2.1.2 FPGA Dynamic Power Consumption
2.1.3 FPGA Logic Organization
2.2 Xilinx Tools
2.2.1 Synthesis, Implementation, and Simulation Tools
2.2.2 XDL Tool
2.3 PSCARE - A Power Simulation Tool
2.3.1 Power Simulation Flow
2.4 SASEBO-GII
2.4.1 Architecture
2.5 PRESENT Encryption Core
3 Methodology
3.1 Generate XDL and XDLRC Files
3.2 Parse XDL and XDLRC Files
3.2.1 XDL File Parsing
3.2.2 XDLRC File Parsing
3.3 Calculate Net Paths
3.4 Calculate Manhattan Distance
3.5 Testing Manhattan Distance Model
3.6 Integration with PSCARE
3.7 Test Measurement Setup Development
4 Results and Conclusions
4.1 Preliminary Results
4.1.1 Testing Conclusions
4.2 Conclusions
4.2.1 XDL Nets and VCD Nets
4.2.2 Model Generality
5 Future Work
5.1 Power Measurements and Model Testing
5.2 Comprehensiveness of Net Path Calculation
Appendices
Appendix A
A.1 Verilog Post-PAR Simulation Model Code Snippet
A.2 XDL Code Snippet
A.3 VCD Code Snippet
Appendix B
B.1 8-Bit Counter Post-PAR Simulation Model Verilog Code
B.2 8-Bit Counter Verilog Testbench

Table of Figures  
Figure 1 – CMOS Architecture [2]
Figure 2 – Xilinx FPGA Design Flow [8] [9]
Figure 3 – Net Declaration Syntax in XDL File
Figure 4 – XDLRC Tile Declaration Syntax
Figure 5 – Power Simulation Flowchart
Figure 6 – PSCARE Output Example
Figure 7 – SASEBO Configuration [13]
Figure 8 – Algorithmic Description of PRESENT [14]
Figure 9 – SASEBO-GII PRESENT Dataflow
Figure 10 – Net Path Calculation Algorithm Flowchart
Figure 11 – 8-Bit Synchronous Counter
Figure 12 – Modified PSCARE Power Simulation Flowchart
Figure 13 – SASEBO Power Measurement Setup
Figure 14 – Original PSCARE Output
Figure 15 – Modified PSCARE Output
 
Table of Tables 
Table 1 – CMOS Architecture Input and Output Voltage Properties
 
 
 
1 Introduction 
 Field Programmable Gate Arrays (FPGAs) are commonly used over Application Specific Integrated 
Circuits (ASICs) when flexibility and speed of implementation are required over maximum optimization. 
FPGAs are constructed from a large array of logic elements and a complex routing network that can be 
programmed and reprogrammed as needed. This implies that a large variety of digital designs can be 
implemented on an FPGA, but also means that for a particular design, the power consumption of an 
FPGA will be higher than that of an ASIC [1]. The FPGA’s power draw during design execution, known 
as dynamic power consumption, is dominated by the power consumption of the nets in the routing 
network used as part of the design [1]. 
Therefore, an accurate model for the power consumption of an FPGA’s routing network would be highly 
useful for simulating the dynamic power of an FPGA. The MITRE Corporation has previously developed 
a tool known as the Power Side-Channel Attack Risk Evaluator (PSCARE) that simulates dynamic power 
consumption of an ASIC to examine potential vulnerabilities in a digital design. PSCARE has been 
shown to work well for ASICs, but is limited in its modeling of FPGA power because of the significant 
power increase due to additional architecture. It is possible to modify PSCARE to simulate FPGA power 
by accounting for the extra power consumed by each net. 
This project develops a model for an FPGA’s routing power consumption using output files from the
Xilinx Electronic Design Automation (EDA) tools and incorporates the model into PSCARE to simulate
the dynamic power consumption of an FPGA. In preparation for future testing of the model, a setup for
taking real power measurements will also be developed. Because these measurements will not be
performed as part of this project, numerical results on the accuracy of the model will not be discussed 
here. Instead, the modified PSCARE output will be shown to display the behavior of the model, and 
conclusions drawn from the development of the model will be discussed. 
 
2 Background 
Designing a model for a routing network’s power draw required an understanding of a number of topics. 
A familiarity with the fundamental architecture of the Xilinx FPGAs based on Complementary Metal-Oxide
Semiconductor (CMOS) transistor pairs and the power profile of these pairs is essential to
understanding the power consumption of the routing network. Xilinx FPGAs are programmed using the 
Xilinx Integrated Synthesis Environment (ISE), and thus an understanding of the ISE tools used in this 
project is also necessary. Correctly integrating the results of the model with PSCARE also necessitates 
knowledge of how PSCARE operates. In preparation for future testing of the modified PSCARE output, 
the Side-channel Attack Standard Evaluation Board (SASEBO)-GII and the PRESENT encryption core 
were chosen as a testing platform and design, respectively, and are described here.  
2.1 FPGA Architecture 
At the lowest level, modern FPGA logic is built on a collection of Complementary Metal-Oxide 
Semiconductor (CMOS) transistor pairs. The power profile of a CMOS pair is the building block of an 
FPGA’s total power consumption.  
2.1.1 CMOS Transistor Architecture 
The standard CMOS architecture is shown in Figure 1. A PMOS transistor and an NMOS transistor are 
wired with both drains connected together. The source of the PMOS transistor is connected to VDD, the 
voltage that corresponds to a logic “1,” and the source of the NMOS transistor is connected to VSS, the 
voltage that corresponds to a logic “0.” 
 
Figure 1 – CMOS Architecture [2] 
CMOS power estimation is typically split into static and dynamic power consumption. Static power 
consumption derives from parasitics and leakage currents through both transistors. Dynamic power 
consumption is caused by the charging and discharging of load capacitance and by a temporary short-circuit
between VDD and VSS during state changes. Table 1 describes the input and output states that result in
static and dynamic power. Further explanation can be
found in [3]. 
 
Table 1 – CMOS Architecture Input and Output Voltage Properties
Input voltage A   | NMOS        | PMOS        | Output voltage Q     | Output state  | Power
VSS, logic 0      | Off         | On          | VDD, logic 1         | Static        | Static
VDD, logic 1      | On          | Off         | VSS, logic 0         | Static        | Static
logic 0 → logic 1 | Turning on  | Turning off | Transitioning, 1 → 0 | Transitioning | Dynamic
logic 1 → logic 0 | Turning off | Turning on  | Transitioning, 0 → 1 | Transitioning | Dynamic
 
2.1.2 FPGA Dynamic Power Consumption 
The most significant factor in a device’s dynamic power is the switching activity of the CMOS transistor 
pairs [4]. Switching activity is data-dependent [5]; that is, the switching activity of each CMOS pair
depends on the inputs to the pair. The power consumption of the routing network, which is directly
related to switching activity, is therefore also data-dependent. Because routing power is the largest
portion of dynamic power consumption, dynamic power consumption is data-dependent as well, which
makes it highly relevant to the security of the device.
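For reference, the standard first-order approximation for CMOS dynamic power makes the link to switching activity explicit. This is the textbook relation, not a formula taken from this report or from PSCARE, and the symbols $\alpha$, $C_L$, and $f$ are introduced here only for illustration:

$$P_{dyn} \approx \alpha \, C_L \, V_{DD}^{2} \, f$$

Here $\alpha$ is the switching activity factor (the fraction of clock cycles in which a node toggles), $C_L$ is the capacitance being switched, $V_{DD}$ is the supply voltage, and $f$ is the clock frequency. Under this approximation, a longer routed net would plausibly present a larger $C_L$, so each of its toggles would cost more energy.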
2.1.3 FPGA Logic Organization 
CMOS transistor pairs are combined to build basic blocks like Lookup Tables (LUTs). Components like 
these are combined into slices, which are paired off into Configurable Logic Blocks (CLBs), an FPGA’s 
main logical unit [6]. CLBs are organized into rows and columns and assigned coordinate values. 
Additionally, the Xilinx ISE tools organize CLBs and other blocks into tiles. Tiles are also assigned
identifying coordinate values by the Xilinx tools. The coordinate system used and how the Xilinx ISE
tools report the information are discussed in Section 2.2.2.
2.2 Xilinx Tools 
The Xilinx ISE tools used in this project are part of the ISE Webpack. The tools used for design synthesis,
implementation, and simulation are described in the following sections. Another built-in command line 
tool called the Xilinx Description Language (XDL) tool was used to convert the output of implementation 
tools into a human-readable format that was critical for this project.  
2.2.1 Synthesis, Implementation, and Simulation Tools 
In the ISE software, the Xilinx Synthesis Tool (XST) is used to synthesize Verilog designs [7]. 
Implementation is divided into three steps, each of which is accomplished by a different tool: translation, 
mapping, and place-and-route. Figure 2 shows a streamlined FPGA design flow, simplified to emphasize 
the tools used for synthesis, implementation, and simulation. Each of these tools will be discussed below 
as it pertains to the design flow and to this project. 
 
Figure 2 – Xilinx FPGA Design Flow [8] [9] 
XST synthesizes an input HDL design into an NGC file, a Xilinx-proprietary netlist format. The next step 
in the flow, Translation, is run with the NGDBuild command [10]. NGDBuild takes an NGC file and a 
User Constraints File (UCF) as inputs and converts them into a Netlist Generic Database (NGD) file 
containing the logic from the input HDL with Xilinx synthesis primitives and the constraints from the 
UCF. The MAP program converts the NGD file to a Native Circuit Description (NCD) file; this step 
converts the Xilinx primitives in the NGD file into FPGA-specific primitives. The NCD file serves as 
input to the Place-and-Route (PAR) command, which gives the primitives in the NCD file specific tile 
locations on the FPGA, routes all of the connections between tiles and sites, and returns an updated NCD 
file. 
As Figure 2 shows, simulations can be run at each step of the design flow with the ISim simulator. Before 
synthesis, a behavioral simulation can be run on the input HDL. After synthesis and each subsequent step 
in the flow, the NetGen command is used to convert the output of that step to an HDL file called a post-
synthesis, -translation, -map, or -par simulation model. The simulation model then serves as input to 
ISim. A simulation at any step can generate a Value Change Dump (VCD) file. A VCD file lists the time-
ordered bit changes for all nets in a simulation model. 
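To make the structure of a VCD file concrete, the following is a minimal sketch of a reader for its value-change section. It assumes each `$var` declaration sits on a single line and ignores real-valued and string changes; the function name `parse_vcd` and the sample text are illustrative and are not taken from PSCARE.

```python
SAMPLE_VCD = '''$timescale 1ps $end
$scope module counter $end
$var wire 1 ! clk $end
$var wire 8 " count [7:0] $end
$upscope $end
$enddefinitions $end
#0
$dumpvars
0!
b00000000 "
$end
#5
1!
b00000001 "
'''

def parse_vcd(text):
    """Return (names, changes).

    names maps each identifier code to its net name; changes is a
    time-ordered list of (time, identifier code, value string).
    """
    names = {}
    changes = []
    time = 0
    in_header = True
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if in_header:
            if line.startswith("$var"):
                # $var <type> <width> <id code> <reference> [range] $end
                parts = line.split()
                names[parts[3]] = parts[4]
            elif line.startswith("$enddefinitions"):
                in_header = False
            continue
        if line.startswith("#"):
            time = int(line[1:])                  # new timestamp
        elif line[0] in "bB":
            value, code = line[1:].split()        # vector change: b<bits> <id>
            changes.append((time, code, value))
        elif line[0] in "01xXzZ":
            changes.append((time, line[1:], line[0]))  # scalar change: <bit><id>
        # keywords such as $dumpvars and $end are skipped
    return names, changes

if __name__ == "__main__":
    names, changes = parse_vcd(SAMPLE_VCD)
    for time, code, value in changes:
        print(time, names[code], value)
```

Keeping the identifier codes separate from the net names mirrors the file itself, where the short codes are used in the change section and the names appear only in the header.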
2.2.2 XDL Tool 
The post-PAR NCD contains all the information necessary to implement a design. However, the NCD file
is encoded in a Xilinx-specific format that is not human-readable. To make the information available to developers, Xilinx
includes the Xilinx Description Language (XDL) Tcl command in the ISE Webpack to convert an NCD 
to an XDL file, convert an XDL to an NCD, and generate an XDLRC file. It should be noted that 
documentation on the XDL and XDLRC file formats is scarce; the descriptions of both hereafter draw 
heavily from [11]. 
2.2.2.1 XDL File Format 
An XDL text file lists the logical slices used to implement a design and the nets used to route the
connections between them. Each slice instantiation follows the same format as this example: inst
"left" SLICEX, placed CLEXM_X8Y33 SLICE_X11Y33 [11]. The line starts with the “inst” keyword
followed by the name of the instance, “left” in this case, and then the instance type (“SLICEX”). Next,
the keyword “placed” or “unplaced” describes the placement of the slice; in an XDL file generated
post-PAR, every slice will be “placed” by definition. If the slice has been placed, the placement keyword is
followed by the tile the slice is located in (“CLEXM_X8Y33”) and the slice inside that tile that is
being instantiated (“SLICE_X11Y33”). Not shown in the example is the configuration statement that
follows the instantiation. The configuration statement begins with the “cfg” keyword and contains
the logical equations for each LUT in use in the slice.
The nets that connect slices are instantiated as in Figure 3. Each net is specified by the name of the net in 
the post-PAR simulation model and an optional net type, often omitted for the most general type “wire.” 
A net has one starting pin known as the “outpin” and can have multiple endpoints known as “inpins.” 
Each outpin and inpin is specified by the name of the slice where it is located and the pin on that slice it
connects to. Pip declarations list the programmable interconnect points (pips) between the outpin and
inpins. Each pip specifies its location by the name of the tile and the pins at which the net enters and
exits the tile.
 
 
2.2.2.2 XDLRC File Format 
An XDLRC file lists every tile, site, and possible connection on a particular FPGA. As Figure 4 shows, 
tiles are the top of the hierarchy in an XDLRC file. The first line of each tile lists the x and y coordinates 
of the tile based on the global coordinate system of the FPGA, the name of the tile, the type of the tile, 
and the number of primitive sites in the tile. Primitive sites fall under tiles and are identified by the name
of the site, the type of the site, a keyword describing the bondedness of the site, and the number of
pinwires within that site. Pinwires are the inputs and outputs of a site; each pinwire declaration shows the 
name of the pinwire, its type, and the name of the wire the pinwire connects to in its tile. Wire declaration 
lines give the name of the wire and the number of connections the wire has. Each connection, or “conn,” 
of a wire lists the tile and wire at the opposite end of the connection. Pip declarations follow the same 
syntax as in the XDL file. The final line of each tile gives a summary of the tile and includes the name 
and type of the tile and the numbers of pinwires, wires, and pips in that tile. 
 
net <net name> [<net type>] ,
    outpin <slice> <pin> ,
    inpin <slice> <pin> ,
    …
    inpin <slice> <pin> ,
    pip <tile> <pin> -> <pin> ,
    …
    pip <tile> <pin> -> <pin>
    ;
Figure 3 – Net Declaration Syntax in XDL File

(tile <x-coordinate> <y-coordinate> <tile name> <tile type> <number primitive sites>
    (primitive_site <site name> <site type> <bondedness keyword> <number pinwires>
        (pinwire <pin name> <pinwire type> <wire name>)
        …
        (pinwire <pin name> <pinwire type> <wire name>)
    )
    (wire <wire name> <number connections>
        (conn <conn tile name> <conn wire name>)
        …
        (conn <conn tile name> <conn wire name>)
    )
    (pip <tile> <pin> -> <pin>)
    …
    (pip <tile> <pin> -> <pin>)
    (tile_summary <tile name> <tile type> <number pinwires> <number wires> <number pips>)
)
Figure 4 – XDLRC Tile Declaration Syntax
2.3 PSCARE - A Power Simulation Tool 
PSCARE was developed as a tool for evaluating the security vulnerability of a design through simulated
dynamic power consumption [12]. Since this project focuses primarily on modifications to PSCARE’s
power simulation tools, this section describes how these power simulations are performed, as outlined in
Figure 5. A VCD file, generated from an ISim simulation, is parsed to obtain the total switching activity
at each timestamp. A boxcar filter is applied to the list of switching activities to smooth out the
transitions between each timestamp and produce a simulated power consumption plot.
 
Figure 5 – Power Simulation Flowchart 
2.3.1 Power Simulation Flow 
As discussed in Section 2.2.1, VCD files list timestamped data on the bit changes of every net in a design.
The value of each net can therefore be tracked at each timestamp, and the Hamming distance between a
net’s values at consecutive timestamps can be calculated. The sum of all Hamming distances at a single
timestamp is known as the total toggle count for that timestamp. Equation (1) below shows how the total
toggle count $tc$ is calculated. $n_k$ represents the $k$th net in the design, $n_{k,t}$ represents the value
of the $k$th net at time $t$, and $K$ is the number of nets. The Hamming distance is shown as the function
$HD$ of two binary values. This equation is applied for each timestamp from the VCD file to generate a
timestamped toggle count for each point in time in the simulation.
 
$$tc = \sum_{k=0}^{K-1} HD\left(n_{k,t},\, n_{k,t+1}\right) \tag{1}$$

The timestamped array of toggle counts is treated as a series of weighted impulses, as shown in the top
plot in Figure 6. To smooth out the curve and model the response of an FPGA’s dynamic power
consumption to bit flips, a boxcar filter is applied to the toggle count array at a given sampling frequency.
The bottom plot in Figure 6 shows an example of what the filtered toggle count looks like.
 
Figure 6 – PSCARE Output Example 
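The toggle count and boxcar filtering steps described in this section can be sketched as follows. This is a minimal illustration rather than PSCARE's implementation: the function names, the treatment of unknown bits, and the filter parameters (`step`, `width`) are assumptions made for the example.

```python
def toggle_counts(changes):
    """Sum the Hamming distances of all nets at each timestamp.

    changes is a time-ordered list of (time, net, bit string). A toggle is
    attributed to the timestamp at which the new value appears. The first
    value seen for a net only initializes it and adds no toggles.
    """
    last = {}
    counts = {}
    for time, net, bits in changes:
        # Assumption: unknown (x) and high-impedance (z) bits count as 0.
        value = int(bits.lower().replace("x", "0").replace("z", "0"), 2)
        if net in last:
            hamming = bin(last[net] ^ value).count("1")
            counts[time] = counts.get(time, 0) + hamming
        last[net] = value
    return counts

def boxcar(counts, end_time, step, width):
    """Place the toggle counts on a grid of `step` time units and apply a
    causal moving average over `width` samples.

    end_time must be at least the last timestamp in counts.
    """
    n = end_time // step + 1
    impulses = [0] * n
    for time, tc in counts.items():
        impulses[time // step] += tc
    filtered = []
    for i in range(n):
        window = impulses[max(0, i - width + 1): i + 1]
        filtered.append(sum(window) / width)
    return filtered

if __name__ == "__main__":
    example = [(0, "a", "0000"), (0, "b", "0"),
               (10, "a", "0011"), (10, "b", "1"),
               (20, "a", "0001")]
    counts = toggle_counts(example)
    print(counts)
    print(boxcar(counts, end_time=20, step=10, width=2))
```

A plain moving average is used here because it is the simplest boxcar form; a different window length or alignment would change the shape of the smoothed trace but not the toggle counts themselves.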
2.4 SASEBO-GII 
The Side-channel Attack Standard Evaluation Board (SASEBO) GII was developed by the National 
Institute of Advanced Industrial Science and Technology of Japan for the purpose of conducting side-
channel experiments on a cryptographic design. This project utilized the SASEBO-GII as a platform for 
the development of a measurement setup that can be used in the future to evaluate the results of this 
project. 
2.4.1 Architecture 
The main components of the board that will be discussed in this report are the two FPGAs on the device, the
Spartan-3 and the Virtex-5, both manufactured by Xilinx. Additionally, other components such as
external pins will be discussed as they pertain to the project.
Figure 7 shows a block diagram of the layout of the main components on the GII [13]. The Spartan-3, the
control FPGA, communicates with the USB controller to receive control signals and input data and
transmit data back to the PC. The Spartan-3 also exchanges data with the Virtex-5, the target FPGA,
along bidirectional data lines. Each FPGA has an SPI-ROM block that holds the configuration data for
that FPGA. The ROMs are programmed with a design using a Xilinx Platform Cable USB II, Platform
Cable IV, or Parallel Cable IV. Each FPGA can be controlled independently of the other. An on-board
oscillator provides a 24 MHz clock signal to both FPGAs, but each chip can also be clocked externally.
Each chip can also be powered independently using provided header pins. If the board is powered via
USB, voltage regulators supply each FPGA with the correct voltage levels. The SASEBO-GII provides an
external power measurement pin J2 that provides a trace of the board’s power consumption. This is
accomplished by measuring the voltage drop across a 1 Ω resistor connected to the board’s ground. This
ground reference is also provided on pin TP4. The board also has a set of 32 external header pins J6-1
through J6-32 that can be used to monitor signals from the FPGAs.
 
Figure 7 – SASEBO Configuration [13] 
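As a rough sketch of how a trace from pin J2 could be converted to power, assume the resistor acts as a simple series shunt and that the supply voltage $V_{supply}$ (a symbol introduced here, not a value given in this report) stays constant during the measurement:

$$I(t) = \frac{V_{J2}(t)}{R}, \qquad P(t) \approx V_{supply} \cdot I(t), \qquad R = 1\ \Omega$$

With $R = 1\ \Omega$, the measured voltage in volts is numerically equal to the current in amperes.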
2.5 PRESENT Encryption Core 
The PRESENT encryption algorithm was chosen as a test design for the Manhattan Distance model.
Like the Advanced Encryption Standard (AES), PRESENT is a block cipher, but it is a lightweight design
intended for low-security applications such as RFID tags and sensor networks [14]. It supports two key
lengths, 80 bits and 128 bits, and takes a 64-bit plaintext. Thirty-one rounds of encryption are performed
on the plaintext as laid out in Figure 8. A thorough description of the algorithm can be found in [14].
Figure 8 – Algorithmic Description of PRESENT [14] 
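As a sketch of the round structure summarized in Figure 8, the code below implements the three layers of one PRESENT round as defined in the published specification [14]: addRoundKey, sBoxLayer, and pLayer. The key schedule and the final key addition are omitted, and the function names are illustrative; this is not the hardware core used on the SASEBO-GII.

```python
# 4-bit S-box from the PRESENT specification.
SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
        0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]

def add_round_key(state, round_key):
    """XOR the 64-bit state with the 64-bit round key."""
    return state ^ round_key

def sbox_layer(state):
    """Apply the S-box to each of the sixteen 4-bit nibbles of the state."""
    out = 0
    for nibble in range(16):
        shift = 4 * nibble
        out |= SBOX[(state >> shift) & 0xF] << shift
    return out

def p_layer(state):
    """Move bit i of the state to position 16*i mod 63 (bit 63 stays in place)."""
    out = 0
    for i in range(64):
        dest = 63 if i == 63 else (16 * i) % 63
        out |= ((state >> i) & 1) << dest
    return out

def present_round(state, round_key):
    """One round: key addition, substitution, then bit permutation."""
    return p_layer(sbox_layer(add_round_key(state, round_key)))

if __name__ == "__main__":
    # Arbitrary 64-bit values chosen only to exercise the round function.
    print(hex(present_round(0x0123456789ABCDEF, 0x0F0F0F0F0F0F0F0F)))
```

Each layer is a fixed, data-independent transformation of the 64-bit state, which is why the cipher maps to a small amount of logic and a large amount of wiring in hardware.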
To realize the cipher on the SASEBO-GII, a core was chosen based on a pre-existing implementation of
AES provided by the manufacturers of the SASEBO. The flow of data through the modules on the target
FPGA can be seen in Figure 9. In the core, data is read from the control FPGA into a FIFO, “fifo_r.” A
control bus known as “ctrl_lbus” handles the interaction between “fifo_r” and a bus interface, “lbus_if.”
The control bus reads data from fifo_r and sends control signals back to it, telling fifo_r when more data
can be read from the control FPGA. Data read into ctrl_lbus is split into address data and input data
and passed along to lbus_if, which then pieces together the full plaintext and key and
sends the data to the encryption algorithm. When the plaintext has been fully encrypted, the ciphertext
flows back through the bus interface and the control bus to another FIFO, “fifo_w,” from which data is
read and written back to the control FPGA.
Figure 9 – SASEBO-GII PRESENT Dataflow 
 
3 Methodology 
The goal of this project was to develop a model for FPGA routing structures based on Manhattan 
Distance that could be incorporated into PSCARE to achieve an accurate power simulation of a digital 
design on an FPGA. The model takes as inputs an XDL and an XDLRC file from a placed-and-routed 
design and uses the information therein to calculate the correct model for that design. The process for 
calculating coefficients and integrating the model with PSCARE is described in the following sections. 
3.1 Generate XDL and XDLRC Files 
This project utilizes XDL files generated from a fully placed-and-routed design; hence, it is assumed that 
the design has been synthesized and implemented as described in Section 2.2.1. Using ISE’s Tcl
command window, the command below was used to generate the XDL file. The option “ncd2xdl” 
specifies the NCD to XDL conversion mode, and the two arguments tell the XDL command what the 
input NCD file and the output XDL file are. 
xdl -ncd2xdl <NCD file> <XDL file> 
The XDL tool was used with the “report” mode specifier to generate an XDLRC file with the command 
below. The XDLRC file is specific to the FPGA model being used, specified by the first argument. The 
two options, “pips” and “all_conns,” tell the XDL tool to report all pips on every tile and all connections 
between wires, respectively. This information is needed to help determine the paths along nets in the XDL
file, as described below in Section 3.3.
xdl -report -pips -all_conns <chip part number> <XDLRC file>  
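The two invocations above can be wrapped in small helpers when the flow is scripted. The sketch below only assembles the argument lists; it assumes the `xdl` executable from the ISE installation is on the PATH, and the helper names, file names, and part-number placeholder in the usage are illustrative rather than values from this project.

```python
import shlex

def ncd2xdl_command(ncd_file, xdl_file):
    """Arguments for converting a placed-and-routed NCD file to XDL."""
    return ["xdl", "-ncd2xdl", ncd_file, xdl_file]

def xdlrc_report_command(part_number, xdlrc_file):
    """Arguments for generating an XDLRC report with all pips and connections."""
    return ["xdl", "-report", "-pips", "-all_conns", part_number, xdlrc_file]

if __name__ == "__main__":
    # Placeholder names; pass each list to subprocess.run(cmd, check=True) to execute.
    for cmd in (ncd2xdl_command("design.ncd", "design.xdl"),
                xdlrc_report_command("<chip part number>", "device.xdlrc")):
        print(shlex.join(cmd))
```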
 
3.2 Parse XDL and XDLRC Files 
Both files were stored as text and had to be parsed to make use of the data. The parsing code was written 
to extract all data that could be used for developing a routing model. 
3.2.1 XDL File Parsing 
The XDL file format can be understood as three sections: the “design” section, the instance declaration 
section, and the net declaration section. For this project, the “design” section was not used. Instance and 
net declarations were parsed as follows to keep all information relevant for the model. 
For this project, each type of instance mapped to the FPGA was parsed identically; the XDL file includes 
“XDL_DUMMY” instances that are not mapped to the FPGA and are not used anywhere else in the file, 
and so were ignored in parsing. Each mapped instance followed the format described in Section 2.2.2.1. 
To parse the declaration, the first line was split along the spaces and stored, throwing away the “inst” 
keyword. The configuration statement was not needed for the model and thus was ignored. However,
other models could likely use the information in the configuration statements, so code was included to
parse them.
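The instance parsing described above can be sketched as follows. This is an illustration under simplifying assumptions, not the project's parser: it assumes instance names contain no spaces or commas, it strips the quotation marks around names, and the function name `parse_inst_header` is hypothetical.

```python
def parse_inst_header(line):
    """Split the first line of an instance declaration into its fields.

    The "inst" keyword is discarded; the tile and site are recorded only
    when the placement keyword is "placed".
    """
    tokens = line.replace(",", " ").split()
    if not tokens or tokens[0] != "inst":
        raise ValueError("not an instance declaration: " + line)
    fields = [token.strip('"') for token in tokens[1:]]
    inst = {"name": fields[0], "type": fields[1], "placement": fields[2]}
    if inst["placement"] == "placed":
        inst["tile"], inst["site"] = fields[3], fields[4]
    return inst

if __name__ == "__main__":
    # Example line from Section 2.2.2.1.
    print(parse_inst_header('inst "left" SLICEX, placed CLEXM_X8Y33 SLICE_X11Y33'))
```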
Every placed-and-routed net declaration followed the format in Figure 3. Some net declarations are 
mapped to pads and are not routed to any locations on the FPGA; these nets were ignored. For all other 
nets, the first line of the declaration was stored as the net name and net type; if no type was written in the 
XDL, the “wire” type was assumed, as mentioned in Section 2.2.2.1. Outpin and inpins were stored by 
keyword, slice, and pin. Inpins were collected into a list to keep them together and facilitate storage. Pips 
were broken up into tile, outpin, direction, and inpin, and the “pip” keyword was removed. All nets were 
stored in one structure organized by net name. 
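The net parsing described above might look roughly like the following Python sketch. It is not the project code: the function name parse_net and the dictionary layout are assumptions made for illustration, and the sample text is abbreviated from the net declaration in Appendix A.2. The sketch assumes that names contain no spaces or commas.

```python
def parse_net(text):
    """Parse one XDL net declaration into name, type, outpin, inpins, pips."""
    # Statements inside a declaration are separated by commas; the
    # declaration itself ends with a semicolon.
    statements = [s.split() for s in text.strip().rstrip(";").split(",")]
    statements = [s for s in statements if s]
    header = statements[0]
    net = {
        "name": header[1].strip('"'),
        # If no type is written in the XDL, the "wire" type is assumed.
        "type": header[2] if len(header) > 2 else "wire",
        "outpin": None,
        "inpins": [],
        "pips": [],
    }
    for tokens in statements[1:]:
        keyword = tokens[0]
        if keyword == "outpin":
            net["outpin"] = (tokens[1].strip('"'), tokens[2])
        elif keyword == "inpin":
            net["inpins"].append((tokens[1].strip('"'), tokens[2]))
        elif keyword == "pip":
            # pip <tile> <first wire> <direction> <second wire>
            net["pips"].append(tuple(tokens[1:5]))
    return net


sample = '''
net "pres/sub_per_output<36>" ,
    outpin "pres/sub_per_output<36>" A ,
    inpin "blk_dout<36>" B2 ,
    inpin "pres/state<39>" A2 ,
    pip CLBLL_X10Y68 M_A -> SITE_LOGIC_OUTS12 ,
    pip INT_X10Y68 LOGIC_OUTS12 -> SL2BEG_N2 ,
    ;
'''
net = parse_net(sample)
nets = {net["name"]: net}  # all nets kept in one structure keyed by net name
print(net["type"], net["outpin"], len(net["inpins"]), len(net["pips"]))
```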
3.2.2 XDLRC File Parsing 
The XDLRC syntax in Figure 4 is easy to convert to a structure in code. All information is organized 
under tiles, so creating a data structure sorted by tile name made sense. The structure contained a 
collection of wires, a collection of sites, and a collection of pips. Every wire had a name, a number 
representing the number of connections to that wire, and a list of the wires (“conns”) it connected to; each 
“conn” was listed by tile and wire name. Each primitive site listed its name, its type, the number of pinwires in 
the site, and the list of those pinwires. Each pinwire listed the name of the site’s IO pin, the type of 
pinwire, and the name of the wire the IO pin connects to. For each pip, the name of the tile it belonged to, the 
start and end pins it connected, and the direction of the pip were stored. 
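One possible way to express the tile-keyed structure described above is with Python dataclasses, as in the sketch below. The class and field names are hypothetical, the pinwire types given in the comments are an assumption about the XDLRC contents, and the single "conn" entry in the example is invented for illustration; only the tile, wire, and pip names are borrowed from Appendix A.2.

```python
from dataclasses import dataclass, field


@dataclass
class Wire:
    name: str
    conns: list = field(default_factory=list)  # (tile name, wire name) pairs

    @property
    def num_conns(self):
        # The number of connections is derived rather than stored separately.
        return len(self.conns)


@dataclass
class PinWire:
    name: str  # name of the site's IO pin
    kind: str  # type of pinwire (assumed to be e.g. "input" or "output")
    wire: str  # name of the wire the IO pin connects to


@dataclass
class Site:
    name: str
    kind: str
    pinwires: list = field(default_factory=list)


@dataclass
class Pip:
    tile: str
    start: str
    direction: str
    end: str


@dataclass
class Tile:
    name: str
    wires: dict = field(default_factory=dict)  # wire name -> Wire
    sites: dict = field(default_factory=dict)  # site name -> Site
    pips: list = field(default_factory=list)


# Build one tile by hand; the conn below is made up for the example.
tiles = {}
tile = Tile("INT_X10Y64")
tile.wires["SE2MID2"] = Wire("SE2MID2", [("INT_X10Y65", "SE2BEG2")])
tile.pips.append(Pip("INT_X10Y64", "SE2MID2", "->", "IMUX_B28"))
tiles[tile.name] = tile
print(tiles["INT_X10Y64"].wires["SE2MID2"].num_conns)
```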
3.3 Calculate Net Paths 
The pips for each net listed in the XDL file are in no particular order. To correctly calculate the Manhattan 
Distance between each point, the path along a net from outpin to inpin had to be determined. The 
algorithm used for determining the path is shown graphically in Figure 10 and is described in the 
following steps: 
1. The number of inpins defines the number of paths, so initialize one path per inpin, each 
containing only that inpin. During development, special cases were found in the XDLRC 
where the name of a pinwire is formatted differently when it is referenced elsewhere in the 
file. To ensure that the rest of the algorithm worked correctly, those special cases 
were handled by reformatting the name of the pinwire before it was added to a path. 
2. Iterate through the pips. For each pip, determine if the pip outpin or a “conn” to the pip outpin is 
connected to the inpin of another pip that has not been appended to the path. This step is shown in 
Figure 10 and is accomplished by: 
a. First determining if the current pip should be considered at all. If the current pip outpin or 
a “conn” to the pip outpin matches the tail of any path, then the current pip should be 
checked. If there is no match, this pip belongs in a later spot in the path, and operation 
continues to the next pip. 
b. Comparing the current pip outpin with the inpins of remaining unchecked pips. If there is 
a match, the pip outpin is added to the path, the pip is marked as checked, and execution 
continues to the next pip. If there is no match, the function proceeds to the next check. 
c. Because there must be a connection between the current pip outpin and another 
unchecked pip and because step 2.b failed, the connection must be through a “conn” of 
the pip outpin. Recursively check connections N layers away; that is, check “conns” of 
pip outpin (N=1), then conns of those conns (N=2), etc. Once a match is found, it is added 
to the path, the pip is marked as checked, and execution continues to the next pip. 
3. Because pips are not stored in a particular order, repeat the iteration until all pips have been added 
to a path. 
4. Append the net’s outpin to the end of each path. 
Figure 10 – Net Path Calculation Algorithm Flowchart 
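The core idea of ordering unordered pips can be illustrated with a much simpler backward trace than the algorithm in Figure 10. The Python sketch below is not the project's implementation: it walks from one sink wire back toward the driver, at each step picking the unused pip whose destination wire either is the current wire or reaches it through a bounded number of "conn" hops. The names reaches and trace_path, the (tile, source, destination) pip tuples, and the max_depth limit are assumptions; the pips are taken from Appendix A.2, while the "conns" entries are invented for the example.

```python
from collections import deque


def reaches(conns, start, goal, max_depth=3):
    """Return True if goal is start or lies within max_depth conn hops of it."""
    frontier = deque([(start, 0)])
    seen = {start}
    while frontier:
        node, depth = frontier.popleft()
        if node == goal:
            return True
        if depth < max_depth:
            for nxt in conns.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, depth + 1))
    return False


def trace_path(pips, conns, sink):
    """Order the pips that lead to one sink wire, from driver to sink."""
    remaining = list(pips)
    path = []
    current = sink
    while True:
        # Find an unused pip whose destination wire is, or connects to, current.
        match = next(
            (p for p in remaining if reaches(conns, (p[0], p[2]), current)),
            None,
        )
        if match is None:
            break
        remaining.remove(match)
        path.append(match)
        current = (match[0], match[1])  # continue from the pip's source wire
    path.reverse()
    return path


# Pips as (tile, source wire, destination wire), deliberately out of order.
pips = [
    ("INT_X10Y64", "SE2MID2", "IMUX_B28"),
    ("INT_X10Y68", "LOGIC_OUTS12", "SL2BEG_N2"),
    ("INT_X10Y65", "SL2END2", "SE2BEG2"),
]
# Hypothetical conns: (tile, wire) -> set of connected (tile, wire) pairs.
conns = {
    ("INT_X10Y68", "SL2BEG_N2"): {("INT_X10Y65", "SL2END2")},
    ("INT_X10Y65", "SE2BEG2"): {("INT_X10Y64", "SE2MID2")},
}
path = trace_path(pips, conns, ("INT_X10Y64", "IMUX_B28"))
print([p[0] for p in path])
```

In this toy example the trace recovers the order INT_X10Y68, INT_X10Y65, INT_X10Y64. A real XDLRC file may contain wires with many "conns", so a bounded search like this could match the wrong pip; the sketch is meant only to show why the connection data is needed to order the pips.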
3.4 Calculate Manhattan Distance 
Once the paths for each net have been calculated, the Manhattan Distance for each net is calculated as the 
sum of the Manhattan Distances between each point in the path. In the case where a net branched out 
from one outpin to multiple inpins, the distances were averaged over the number of inpins 𝑁 to achieve a 
single value that was still related to the lengths of the branches. Equation (2) below shows the definition of the 
Manhattan Distance, here represented by a function 𝑀𝐷 of two coordinate pairs 𝐴, 𝐵, and Equation (3) shows how 
the Manhattan Distance was used to calculate the coefficient 𝑐𝑘 for the kth net in a design as the average 
of the Manhattan Distances between the outpin 𝑂𝑃𝑘 and each inpin 𝐼𝑃𝑘,𝑛.  
\[ MD(A, B) = |x_A - x_B| + |y_A - y_B| \tag{2} \]
\[ c_k = \frac{1}{N} \sum_{n=1}^{N} MD(OP_k, IP_{k,n}) \tag{3} \]
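A minimal Python sketch of the two equations above follows. The helper names tile_xy, manhattan, and net_coefficient are hypothetical, and reading the coordinates from the X/Y suffix of a tile name is an assumption made for illustration; the report does not state which coordinates the project code used.

```python
import re


def tile_xy(tile_name):
    """Extract (x, y) from a tile name such as 'INT_X10Y64' (assumed format)."""
    m = re.search(r"_X(\d+)Y(\d+)$", tile_name)
    if m is None:
        raise ValueError(f"no coordinates in {tile_name!r}")
    return int(m.group(1)), int(m.group(2))


def manhattan(a, b):
    """Manhattan Distance between two (x, y) coordinate pairs."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def net_coefficient(outpin_xy, inpin_xys):
    """Average Manhattan Distance from the outpin to each of the N inpins."""
    return sum(manhattan(outpin_xy, ip) for ip in inpin_xys) / len(inpin_xys)


# Example loosely based on Appendix A.2: an outpin in tile CLBLL_X10Y68
# driving two inpins that are both reached in tile CLBLL_X10Y64.
op = tile_xy("CLBLL_X10Y68")
ips = [tile_xy("CLBLL_X10Y64"), tile_xy("CLBLL_X10Y64")]
c_k = net_coefficient(op, ips)
print(c_k)  # 4.0
```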
 
3.5 Testing Manhattan Distance Model 
Initial testing of the functionality of the Manhattan Distance code before integration with PSCARE was 
done with two designs. First, an 8-bit counter design was used to test the basic functionality of the code. 
The top-level block diagram in Figure 11 shows the inputs and outputs to the counter design. The counter 
can count up or down (direction input) between a given maximum and minimum. When the counter is 
reset, the count is set to a given initial count value. 
 
Figure 11 – 8-Bit Synchronous Counter 
3.6 Integration with PSCARE 
Equation (3) was used to calculate a coefficient for each net in an XDL file generated for a particular design. Nets 
from the XDL correspond directly to nets in a VCD file, as long as the VCD file was generated from a 
post-PAR simulation of the same design. As a result, the coefficients could be used to modify the toggle 
count model in PSCARE. Modifying Equation (1) with the coefficient 𝑐𝑘 gives Equation (4). This equation was implemented 
in PSCARE to realize the Manhattan Distance model.  
\[ tc = \sum_{k=0}^{K-1} c_k \cdot HD(n_{k,t}, n_{k,t+1}) \tag{4} \]
 
 
The code for PSCARE was modified to take as an argument any script that returns a correctly formatted 
structure of coefficients. Because the code for this project was written in Python, the coefficient structure 
is a dictionary mapping net names to coefficient values, like the dictionary returned by the Manhattan 
Distance script. Multiplication by the coefficients was done before filtering, as Figure 12 
shows. 
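The weighted toggle count for a single time step can be sketched in Python as follows. This is an illustration rather than the PSCARE code: the function names, the net names, and the coefficient values are made up, and giving a weight of 1.0 to nets without a coefficient is an assumption, since the report does not state how such nets are weighted.

```python
def hamming_distance(a, b):
    """Number of bit positions in which two integer net values differ."""
    return bin(a ^ b).count("1")


def weighted_toggle_count(prev, curr, coefficients, default=1.0):
    """Sum of c_k * HD(n_k at t, n_k at t+1) over all nets k.

    prev and curr map net names to integer values at times t and t+1.
    Nets missing from coefficients fall back to `default` (an assumption).
    """
    total = 0.0
    for net, old in prev.items():
        total += coefficients.get(net, default) * hamming_distance(old, curr[net])
    return total


# Hypothetical coefficient dictionary keyed by net name.
coefficients = {"count<0>": 2.0, "count<1>": 4.0}
prev = {"count<0>": 0, "count<1>": 1, "clk": 0}
curr = {"count<0>": 1, "count<1>": 1, "clk": 1}
tc = weighted_toggle_count(prev, curr, coefficients)
print(tc)  # 3.0
```

Here "count<0>" toggles with weight 2.0, "count<1>" does not toggle, and "clk" toggles with the default weight, giving 3.0.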
 
Figure 12 – Modified PSCARE Power Simulation Flowchart 
3.7 Test Measurement Setup Development 
Once the model was integrated with PSCARE, a measurement setup was developed through preliminary 
experimentation that can be used to test the accuracy of the modified PSCARE output. The SASEBO-GII 
discussed in Section 2.4 was used to run the test design. Measurement scripts were written for a Gage 
CS144002U 14-bit USB Oscilloscope, used to digitize the power signal from the SASEBO. It was 
determined experimentally that a 40 dB gain was adequate to utilize the full ±1.1 V input range of the 
oscilloscope; in this case, two inverting 20 dB amplifiers were used in series. Figure 13 shows the 
designed measurement setup. 
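As a worked check of the gain figures above, assuming the stated values are voltage gains and both amplifier stages are ideal:

\[ G = 10^{40/20} = 100, \qquad (-10) \times (-10) = +100, \qquad V_{\text{in,max}} = \frac{1.1\ \text{V}}{100} = 11\ \text{mV} \]

Under these assumptions, a signal of roughly ±11 mV at the amplifier input spans the full ±1.1 V range of the oscilloscope, and the two inversions cancel so that the digitized signal has the same polarity as the input.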
Figure 13 – SASEBO Power Measurement Setup 
4 Results and Conclusions 
4.1 Preliminary Results 
The model successfully computes coefficient values for the tested design, discussed in Section 3.5. The 
output of running PSCARE on the 8-bit counter model is discussed here. The Verilog post-place-and-route 
simulation model and a simple Verilog testbench used to simulate the design can be found in 
Appendix B. Comparing Figure 14 with Figure 15 shows the effect of modifying PSCARE with the 
coefficients calculated by the Manhattan Distance model on the simulated power consumption of the 8-bit 
counter. The same patterns can be seen in the original and modified versions. Figure 15 exhibits the same 
shape as the original output, but with higher peaks and valleys. During the first clock cycle, inputs to the 
counter module are being initialized to the correct values, resulting in the spike in the toggle count for the 
first clock cycle. All activity after that is due to the testbench stimulus and the switching of the output 
“count” and internal nets. This demonstrates the Manhattan Distance model developed in this project 
operating within PSCARE. 
 
Figure 14 – Original PSCARE Output 
Figure 15 – Modified PSCARE Output 
4.1.1 Testing Conclusions 
The results of the preliminary testing show that the PSCARE simulation modified with the Manhattan 
Distance model followed the same shape as the unweighted simulation for the test design used. Because 
of this, it is reasonable to conjecture that the weighted simulation is still data-dependent and will have the 
same response to security tests like Differential Power Analysis (DPA); this claim is not supported here, 
however, and is left for testing in future work. In any case, further testing is required to determine how the 
model behaves with other test designs, such as the PRESENT implementation in Section 2.5. 
4.2 Conclusions 
4.2.1 XDL Nets and VCD Nets 
A comparison of an XDL file for a placed-and-routed digital design and a VCD file generated from a 
simulation of the same design will show that all nets in the XDL file exist in the VCD file. The VCD file 
will also contain a number of additional nets that do not exist in the XDL file. Because the VCD file is 
generated post-place-and-route, the nets listed contain inputs and outputs to each primitive instantiated in 
a design. For example, the code snippets in Appendix A show the instantiation of a module in the 
PRESENT design as a post-PAR Verilog model, its equivalent in the design’s XDL file, and the 
definition of a portion of the same module in the VCD output of a post-PAR simulation of the design. It is 
clear that the VCD file lists all the nets listed in the Verilog in addition to internal nets of the 
“Mrom_data_o_mux000021” primitive, and it appears that these nets are omitted from the XDL file. The 
net declarations in the XDL code in Appendix A.2, however, show that the four nets which set the four used address 
inputs to the Verilog primitive, “ADR0,” “ADR1,” “ADR2,” and “ADR4,” are set as the pins “A1,” “A2,” 
“A3,” and “A5” on the “pres/sub_per_output<36>” slice. The output of the primitive “O” is utilized as 
the “A” pin of the same slice. Therefore, the input and output pins of the primitive wires listed in the 
VCD file are accounted for in the XDL file as outpins and inpins of nets. Other nets exist in the VCD file 
that come directly from the post-PAR simulation model, but do not exist in the XDL file, such as the 
“\NlwBufferSignal_Maddsub_count_share0000_cy<3>/DI<1>” signal in Appendix A; these nets are not 
routed on the FPGA, and their switching activity is encompassed by the nets they are ultimately 
connected to. 
4.2.2 Model Generality 
The model utilizes XDL and XDLRC files generated by the Xilinx software tools as primary inputs to the 
model. These file formats are only available through the Xilinx tools for Xilinx manufactured FPGAs. 
This means that the model is limited to Xilinx FPGAs. Should devices from other manufacturers be 
considered, additional work must be done to implement an equivalent model for those systems. 
The generality of the model for different types of designs is also uncertain. A small number of designs were 
tested for the purposes of verifying the functionality of the code, but additional testing would be required 
to examine how well the model behaves with other designs. Xilinx does not provide documentation for 
the XDL and XDLRC formats, so it is unknown if special cases or additional particulars exist that have 
not been handled in the current code (see Section 3.3). 
  
5 Future Work 
To thoroughly test the Manhattan Distance model, further work is required. Comparing the simulated 
power traces of PSCARE, modified with the Manhattan Distance model, to real power traces would show 
the accuracy of the model. Performing security analysis, such as Differential Power Analysis (DPA), on 
the simulated power traces would also be beneficial to examining the model’s performance. The code of 
the model can also be examined, particularly the net path calculation algorithm. 
5.1 Power Measurements and Model Testing 
The next step in confirming the functionality of this model would be to test its accuracy against real 
power measurements. The measurement setup described in Section 3.7 can be used to measure power 
traces from the SASEBO board. Additional considerations should be made regarding the oscilloscope 
used in collecting data. The Gage CS144002U provides a high resolution, but has been observed to have 
data alignment problems when data is transferred to a PC. 
Because PSCARE was intended for evaluating security vulnerabilities and has been shown to accurately 
predict them, performing DPA tests on the output of the modified PSCARE would help determine if the 
tests are still able to correctly detect the same security risks that can be found with the unmodified 
PSCARE output. 
5.2 Comprehensiveness of Net Path Calculation 
As noted in Section 3.3, the code written for determining the paths along nets from an XDL file has two 
possible shortcomings: nets of certain tile types had to be reformatted to match their names in different 
parts of the XDLRC file, and the algorithm for calculating a path is complex. The special cases 
encountered in calculating paths were found during debugging and are particular to the XDLRC file used for 
the design. Because each XDLRC file is specific to a particular FPGA model and part number, further 
testing is required to determine if all possible cases have been handled. The complexity of the path 
calculation algorithm is due to the number of possibilities of connections between tiles. Further review of 
the XDLRC file format is required to determine if the algorithm used can be simplified. 
 
 
6 References 
[1]  K. K. W. Poon, S. J. E. Wilton and A. Yan, "A Detailed Power Model for Field-Programmable 
Gate Arrays," ACM Transactions on Design Automation of Electronic Systems, vol. 10, no. 2, pp. 
279-302, 2005.  
[2]  E. Bonet, "CMOS inverter (NOT logic gate)," 07 12 2006. [Online]. Available: 
https://en.wikipedia.org/wiki/CMOS#/media/File:CMOS_Inverter.svg. [Accessed 12 11 2015]. 
[3]  B. Sales, "CMOS Inverters: A simple description of the characteristics of CMOS inverters," 
[Online]. Available: 
https://courseware.ee.calpoly.edu/~dbraun/courses/ee307/F02/02_Sales/section02_bruce_sales.html. 
[Accessed 12 10 2015]. 
[4]  L. Shang, A. S. Kaviani and K. Bathala, "Dynamic Power Consumption in Virtex-II FPGA 
Family," FPGA '02, 24-26 February 2002.  
[5]  A. Moradi, A. Barenghi, T. Kasper and C. Paar, "On the Vulnerability of FPGA Bitstream 
Encryption against Power Analysis Attacks," in ACM Conference on Computer and 
Communications Security, Chicago, Illinois, 2011.  
[6]  "Virtex-5 FPGA User Guide," 16 3 2012. [Online]. Available: 
http://www.xilinx.com/support/documentation/user_guides/ug190.pdf. [Accessed 2015]. 
[7]  "XST User Guide," 16 9 2009. [Online]. Available: 
http://www.xilinx.com/support/documentation/sw_manuals/xilinx11/xst.pdf. [Accessed 2015]. 
[8]  "Command Line Tools User Guide," 24 4 2012. [Online]. Available: 
http://www.xilinx.com/support/documentation/sw_manuals/xilinx14_4/devref.pdf. [Accessed 
2015]. 
[9]  "Synthesis and Simulation Design Guide," 2 12 2009. [Online]. Available: 
http://www.xilinx.com/support/documentation/sw_manuals/xilinx11/sim.pdf. [Accessed 2015]. 
[10]  "Implementation Overview for FPGAs," Xilinx, Inc., [Online]. Available: 
http://www.xilinx.com/support/documentation/sw_manuals/xilinx11/ise_c_implement_fpga_design.htm. 
[Accessed 20 11 2015]. 
[11]  C. Beckhoff, D. Koch and J. Torresen, "The Xilinx Design Language (XDL): Tutorial and Use 
Cases," Department of Informatics, University of Oslo, Norway, Oslo, Norway. 
[12]  J. V. Haley and D. Hullihen, "System Security Metrics via Power Simulation for VLSI Designs," 
2013. 
[13]  Side-channel Attack Standard Evaluation Board SASEBO-GII Specification, Research Center for 
Information Security, 2009.  
[14]  A. Bogdanov, L. R. Knudsen, G. Leander, C. Paar, A. Poschmann, M. J. B. Robshaw, Y. Seurin 
and C. Vikkelsoe, "PRESENT: An Ultra-Lightweight Block Cipher," in Workshop on 
Cryptographic Hardware and Embedded Systems, Vienna, Austria, 2007.  
 
 
 
 
 
Appendices 
Appendix A 
A.1 Verilog Post-PAR Simulation Model Code Snippet 
module sbox_INST_12 ( 
    enable, data_i, data_o 
); 
    input enable; 
    input [3 : 0] data_i; 
    output [3 : 0] data_o; 
    wire \NlwBufferSignal_Maddsub_count_share0000_cy<3>/DI<1> ; 
    X_LUT6 #( 
        .LOC ( "SLICE_X14Y69" ), 
        .INIT ( 64'h2D2DD2D22D2DD2D2 )) 
    Mrom_data_o_mux000021 ( 
        .ADR3(1'b1), 
        .ADR5(1'b1), 
        .ADR0(data_i[1]), 
        .ADR2(data_i[0]), 
        .ADR1(data_i[2]), 
        .ADR4(data_i[3]), 
        .O(data_o[2]) 
    ); 
    // Additional instantiations 
endmodule 
 
  
A.2 XDL Code Snippet 
inst "pres/sub_per_output<36>" "SLICEL",placed CLBLL_X10Y68 SLICE_X14Y68  , 
  cfg " A5LUT::#OFF 
A6LUT:pres/present_cipher_sp/sub_per_substitution/substitution_sbox4/Mrom_data_o_mux000021:#LUT:O6=((~A1*((~A3*(A2@~A5))+(A3*(~A2+A5))))+(A1*((~A3*~A2)+(A3*(A2*~A5))))) 
    ACY0::#OFF AFF::#OFF AFFINIT::#OFF AFFMUX::#OFF AFFSR::#OFF AOUTMUX::#OFF 
    AUSED::0 B5LUT::#OFF B6LUT::#OFF BCY0::#OFF BFF::#OFF BFFINIT::#OFF 
    BFFMUX::#OFF BFFSR::#OFF BOUTMUX::#OFF BUSED::#OFF C5LUT::#OFF C6LUT::#OFF 
    CCY0::#OFF CEUSED::#OFF CFF::#OFF CFFINIT::#OFF CFFMUX::#OFF CFFSR::#OFF 
    CLKINV::#OFF COUTMUX::#OFF COUTUSED::#OFF CUSED::#OFF D5LUT::#OFF 
    D6LUT::#OFF DCY0::#OFF DFF::#OFF DFFINIT::#OFF DFFMUX::#OFF DFFSR::#OFF 
    DOUTMUX::#OFF DUSED::#OFF PRECYINIT::#OFF REVUSED::#OFF SRUSED::#OFF 
    SYNC_ATTR::#OFF " 
    ; 
 
net "pres/sub_per_output<36>" ,  
    outpin "pres/sub_per_output<36>" A , 
    inpin "blk_dout<36>" B2 , 
    inpin "pres/state<39>" A2 , 
    pip CLBLL_X10Y64 SITE_IMUX_B28 -> M_A2 ,  
    pip CLBLL_X10Y64 SITE_IMUX_B40 -> L_B2 ,  
    pip CLBLL_X10Y68 M_A -> SITE_LOGIC_OUTS12 ,  
    pip INT_X10Y64 SE2MID2 -> IMUX_B28 ,  
    pip INT_X10Y64 SE2MID2 -> IMUX_B40 ,  
    pip INT_X10Y65 SL2END2 -> SE2BEG2 ,  
    pip INT_X10Y68 LOGIC_OUTS12 -> SL2BEG_N2 ,  
    ; 
  
A.3 VCD Code Snippet 
$scope module substitution_sbox4 $end 
$scope module Mrom_data_o_mux000021 $end 
$var wire 1 ,A O $end 
$var wire 1 @A ADR0 $end 
$var wire 1 .A ADR1 $end 
$var wire 1 BA ADR2 $end 
$var wire 1 CA ADR3 $end 
$var wire 1 1A ADR4 $end 
$var wire 1 EA ADR5 $end 
$var wire 1 3A a0 $end 
$var wire 1 4A a1 $end 
$var wire 1 5A a2 $end 
$var wire 1 6A a3 $end 
$var wire 1 7A a4 $end 
$var wire 1 8A a5 $end 
$var wire 1 9A o_out_tmp $end 
$var reg 1 :A o_out $end 
$var reg 1 ;A tmp $end 
$scope function lut6_mux8 $end 
$var reg 1 <A lut6_mux8 $end 
$var reg 8 =A d [7:0] $end 
$var reg 3 >A s [2:0] $end 
$upscope $end 
$upscope $end 
// Additional scope definitions 
$upscope $end 
 
 
 
 
 
 
  
Appendix B 
B.1 8-Bit Counter Post-PAR Simulation Model Verilog Code 
`timescale 1 ns/1 ps 
module counter8_t ( 
    clk, reset_n, direction, count, maximum, init_count, minimum, increment 
); 
    input clk; 
    input reset_n; 
    input direction; 
    output [7 : 0] count; 
    input [7 : 0] maximum; 
    input [7 : 0] init_count; 
    input [7 : 0] minimum; 
    input [7 : 0] increment; 
    wire N9; 
    wire reset_n_IBUF_433; 
    wire init_count_5_IBUF_434; 
    wire minimum_5_IBUF_435; 
    wire \NlwBufferSignal_Maddsub_count_share0000_cy<3>/DI<1> ; 
    wire \NLW_Maddsub_count_share0000_cy<3>_CO[0]_UNCONNECTED ; 
    wire VCC; 
    wire GND; 
    wire [3 : 3] Maddsub_count_share0000_cy; 
    wire [7 : 0] Maddsub_count_share0000_lut; 
    wire [7 : 0] count_share0000; 
    wire [7 : 0] count_mux0000; 
    // additional wire declarations 
 
    initial $sdf_annotate("netgen/par/counter8_timesim.sdf"); 
    X_BUF \Maddsub_count_share0000_cy<3>/Maddsub_count_share0000_cy<3>_DMUX_Delay ( 
        .I(count_share0000[3]), 
        .O(\count_share0000<3>_0 ) 
    ); 
    X_BUF \Maddsub_count_share0000_cy<3>/Maddsub_count_share0000_cy<3>_CMUX_Delay ( 
        .I(count_share0000[2]), 
        .O(\count_share0000<2>_0 ) 
    ); 
    X_BUF \Maddsub_count_share0000_cy<3>/Maddsub_count_share0000_cy<3>_BMUX_Delay ( 
        .I(count_share0000[1]), 
        .O(\count_share0000<1>_0 ) 
    ); 
    X_BUF \Maddsub_count_share0000_cy<3>/Maddsub_count_share0000_cy<3>_AMUX_Delay ( 
        .I(count_share0000[0]), 
        .O(\count_share0000<0>_0 ) 
    ); 
    X_LUT6 #( 
        .LOC ( "SLICE_X5Y74" ), 
        .INIT ( 64'hC3C3C3C33C3C3C3C )) 
    \Maddsub_count_share0000_lut<3> ( 
        .ADR0(1'b1), 
        .ADR4(1'b1), 
        .ADR3(1'b1), 
        .ADR5(count_mux0001), 
        .ADR1(\increment<3>/IBUF ), 
        .ADR2(count_3_469), 
        .O(Maddsub_count_share0000_lut[3]) 
    ); 
 
    // additional primitive instantiations 
 
endmodule 
 
  
B.2 8-Bit Counter Verilog Testbench 
`timescale 1ns/1ps 
module counter8_testbench (); 
    reg  clk, reset_n, direction; 
    reg [7:0] init_count, increment, minimum, maximum; 
    wire [7:0] count; 
 
    counter8_t UUT(.clk(clk), .reset_n(reset_n), .init_count(init_count), .direction(direction), 
        .increment(increment), .minimum(minimum), .maximum(maximum), .count(count)); 
 
    always #5 clk = ~clk; 
 
    initial begin 
        $dumpfile ("counter8_par.vcd"); 
        $dumpvars(0, UUT); 
        clk = 1'b0; 
        init_count = {$random} % 2**8; 
        direction = 1'b1; 
        increment = 8'b10; 
        minimum = 8'h00; 
        maximum = 8'hFF; 
        reset_n = 1'b1; 
        @(posedge clk); 
        @(posedge clk); 
        reset_n = 1'b0; // reset 
        @(posedge clk); 
        @(posedge clk); 
        reset_n = 1'b1; // enable module 
        $dumpflush; 
    end 
endmodule // COUNTER8_testbench 
